8 changes: 3 additions & 5 deletions .agents/skills/debug-openshell-cluster/SKILL.md
@@ -296,11 +296,9 @@ openshell logs <sandbox-name>
 | Kubernetes gateway pod crash loops | Missing secret, bad DB URL, bad TLS config | `kubectl -n openshell logs deployment/openshell -c openshell-gateway` or `kubectl -n openshell logs statefulset/openshell -c openshell-gateway` |
 | CLI TLS error | Local mTLS bundle does not match server cert/CA | Check `~/.config/openshell/gateways/<name>/mtls/` |
 | Image pull failure | Gateway or sandbox image cannot be pulled | Runtime events and image pull credentials |
-| `K8s namespace not ready` with `envoy-gateway-openshell.yaml: the server could not find the requested resource` | Optional Gateway API manifest was applied without Envoy Gateway CRDs, or k3s Helm controller startup exceeded the namespace wait | Apply `deploy/kube/manifests/envoy-gateway-openshell.yaml` manually only after Envoy Gateway is installed and `grpcRoute` is enabled |
-| HTTPS ingress (`grpcRoute.gateway.listener.protocol=HTTPS`) connection resets or TLS handshake hangs | Envoy terminates TLS but the gateway pod still expects TLS, so the plaintext backend hop fails | Set `server.disableTls=true` so Envoy forwards plaintext to the pod; verify the listener `certificateRefs` Secret exists in the release namespace and `openshell status` over `https://<host>` |
-| HTTPS ingress returns `Unauthenticated` after connecting | TLS terminates at Envoy, so the gateway never sees a client cert; no OIDC issuer is configured for identity | Configure `server.oidc.issuer` and register with `openshell gateway add https://<host> --oidc-issuer <url>`, or set `server.auth.allowUnauthenticatedUsers=true` for a trusted-proxy/dev cluster |
-
-## Reporting
+| `K8s namespace not ready` with `envoy-gateway-openshell.yaml: the server could not find the requested resource` | Optional Gateway API manifest was applied without Envoy Gateway CRDs, or k3s Helm controller startup exceeded the namespace wait | Apply `deploy/kube/manifests/envoy-gateway-openshell.yaml` manually only after Envoy Gateway is installed and a `gatewayApi` resource is enabled |
+| TLSRoute remains unaccepted or port 443 is not exposed | The referenced Gateway has no compatible TLS passthrough listener, or the Gateway API controller does not support TLSRoute | Check `kubectl describe tlsroute -n <namespace>` and `kubectl describe gateway -n <namespace>`; with a chart-created Gateway, verify both `gatewayApi.gateway.create=true` and `gatewayApi.routes.tls.enabled=true` |
+| TLS passthrough connection fails during the handshake | The gateway server is running plaintext or its certificate does not cover the external SNI hostname | Keep `server.disableTls=false`, verify the server certificate SAN, and configure `pkiInitJob.serverDnsNames` or `certManager.serverDnsNames` before issuing the certificate |

 When handing results back to the user, include:

21 changes: 10 additions & 11 deletions deploy/helm/openshell/README.md
@@ -53,7 +53,8 @@ The `dev` tags are intended for testing changes ahead of a release. Production d

 See [`values.yaml`](values.yaml) for source defaults. Selected overlays:

-- [`ci/values-gateway.yaml`](ci/values-gateway.yaml) - gateway-only configuration
+- [`ci/values-gateway.yaml`](ci/values-gateway.yaml) - Gateway API configuration
+- [`ci/values-gateway-tls.yaml`](ci/values-gateway-tls.yaml) - Gateway API TLS passthrough configuration
 - [`ci/values-cert-manager.yaml`](ci/values-cert-manager.yaml) - cert-manager integration
 - [`ci/values-keycloak.yaml`](ci/values-keycloak.yaml) - Keycloak OIDC integration
 - [`ci/values-high-availability.yaml`](ci/values-high-availability.yaml) - CI overlay for multi-replica external PostgreSQL testing
@@ -147,16 +148,14 @@ add `ci/values-spire.yaml` to the OpenShell release values files.
 | certManager.serverDnsNames | list | `["openshell","openshell.openshell.svc","openshell.openshell.svc.cluster.local","localhost","openshell.localhost","*.openshell.localhost","host.docker.internal"]` | DNS SANs on the cert-manager-issued server certificate. |
 | certManager.serverIpAddresses | list | `["127.0.0.1"]` | IP SANs on the cert-manager-issued server certificate. |
 | fullnameOverride | string | `""` | Override the full generated resource name. |
-| grpcRoute.enabled | bool | `false` | Create a Gateway API GRPCRoute for the gateway service. |
-| grpcRoute.gateway.className | string | `"eg"` | GatewayClass to reference. Envoy Gateway installs one named "eg". |
-| grpcRoute.gateway.create | bool | `false` | When true, a Gateway resource is created in the release namespace. Set to false and provide name/namespace to attach to a pre-existing Gateway. |
-| grpcRoute.gateway.listener.allowedRoutes | string | `"Same"` | "Same" restricts attached routes to the release namespace; "All" allows any namespace. |
-| grpcRoute.gateway.listener.port | int | `80` | Listener port for the generated Gateway resource. Use 443 with protocol HTTPS. |
-| grpcRoute.gateway.listener.protocol | string | `"HTTP"` | Listener protocol for the generated Gateway resource: HTTP or HTTPS. HTTPS terminates TLS at the Envoy Gateway listener; pair it with server.disableTls=true so Envoy forwards plaintext to the gateway pod, and use OIDC for client identity (the gateway never sees the client cert). |
-| grpcRoute.gateway.listener.tls.certificateRefs | list | `[]` | certificateRefs for the HTTPS listener. Required when protocol is HTTPS. Each entry needs a `name` pointing at a kubernetes.io/tls Secret in the Gateway's namespace. May reference a cert-manager-issued Secret or the existing openshell-server-tls Secret (its SANs must include the external hostname). |
-| grpcRoute.gateway.name | string | `""` | Name of the Gateway resource. Defaults to the chart fullname. |
-| grpcRoute.gateway.namespace | string | `""` | Namespace of the Gateway referenced by the GRPCRoute parentRef. Defaults to the release namespace. |
-| grpcRoute.hostnames | list | `[]` | Hostnames the GRPCRoute matches on. Leave empty to match all hosts. |
+| gatewayApi.gateway.className | string | `"eg"` | GatewayClass to reference. Envoy Gateway installs one named "eg". |
+| gatewayApi.gateway.create | bool | `false` | When true, a Gateway resource is created. Set to false and provide name/namespace to attach to a pre-existing Gateway. |
+| gatewayApi.gateway.name | string | `""` | Name of the Gateway resource. Defaults to the chart fullname. |
+| gatewayApi.gateway.namespace | string | `""` | Namespace of a pre-existing Gateway referenced by routes. Defaults to the release namespace. Chart-created Gateways always use the release namespace. |
+| gatewayApi.routes.grpc.enabled | bool | `false` | Create a Gateway API GRPCRoute for the gateway service. |
+| gatewayApi.routes.grpc.hostnames | list | `[]` | Hostnames the GRPCRoute matches on. Leave empty to match all hosts. |
+| gatewayApi.routes.tls.enabled | bool | `false` | Create a Gateway API TLSRoute that passes encrypted traffic through to the gateway service. When gatewayApi.gateway.create is true, the generated Gateway also gets a TLS listener on port 443 in Passthrough mode. |
+| gatewayApi.routes.tls.hostnames | list | `[]` | SNI hostnames the TLSRoute matches on. Leave empty to match all hosts. |
 | image.pullPolicy | string | `"IfNotPresent"` | Gateway image pull policy. |
 | image.repository | string | `"ghcr.io/nvidia/openshell/gateway"` | Gateway image repository. |
 | image.tag | string | `""` | Gateway image tag. Defaults to the chart appVersion when empty. |
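The `gatewayApi.gateway.*` and `gatewayApi.routes.grpc.*` keys documented in the table can be combined to attach the chart's GRPCRoute to a Gateway that already exists instead of creating one. The fragment below is a sketch rather than a file in this change; the Gateway name `shared-gateway` and the namespace `gateway-infra` are hypothetical placeholders.

```yaml
gatewayApi:
  gateway:
    # Do not render a Gateway; reference an existing one from the route's parentRefs.
    create: false
    name: shared-gateway        # hypothetical Gateway name
    namespace: gateway-infra    # hypothetical namespace of that Gateway
  routes:
    grpc:
      enabled: true
      hostnames:
        - openshell.example.com
```

Under the Gateway API, a route in a different namespace from its Gateway attaches only when the listener's `allowedRoutes` admits that namespace, so a shared Gateway would need to allow routes from the release namespace.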
3 changes: 2 additions & 1 deletion deploy/helm/openshell/README.md.gotmpl
@@ -53,7 +53,8 @@ The `dev` tags are intended for testing changes ahead of a release. Production d

 See [`values.yaml`](values.yaml) for source defaults. Selected overlays:

-- [`ci/values-gateway.yaml`](ci/values-gateway.yaml) - gateway-only configuration
+- [`ci/values-gateway.yaml`](ci/values-gateway.yaml) - Gateway API configuration
+- [`ci/values-gateway-tls.yaml`](ci/values-gateway-tls.yaml) - Gateway API TLS passthrough configuration
 - [`ci/values-cert-manager.yaml`](ci/values-cert-manager.yaml) - cert-manager integration
 - [`ci/values-keycloak.yaml`](ci/values-keycloak.yaml) - Keycloak OIDC integration
 - [`ci/values-high-availability.yaml`](ci/values-high-availability.yaml) - CI overlay for multi-replica external PostgreSQL testing
37 changes: 12 additions & 25 deletions deploy/helm/openshell/ci/values-gateway-tls.yaml
@@ -1,33 +1,20 @@
 # SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 # SPDX-License-Identifier: Apache-2.0

-# Gateway API overlay with TLS termination at the Envoy Gateway listener.
-#
-# Exercises the HTTPS listener branch of templates/gateway.yaml for
-# `helm template`/lint coverage. Envoy Gateway terminates TLS using the
-# referenced kubernetes.io/tls Secret and forwards plaintext to the gateway pod,
-# so the gateway runs with TLS disabled and uses OIDC for client identity.
-#
-# The certificate Secret (openshell-ingress-tls) and the OIDC issuer below are
-# placeholders for render coverage; a real deployment must provide a valid TLS
-# Secret in the release namespace and a reachable OIDC issuer.
+# Gateway API overlay with TLS passthrough to the OpenShell gateway server.
+# Envoy routes by SNI without terminating TLS, preserving server TLS and mTLS.

-grpcRoute:
-  enabled: true
+gatewayApi:
   gateway:
     create: true
     className: "eg"
-    listener:
-      port: 443
-      protocol: HTTPS
-      tls:
-        certificateRefs:
-          - name: openshell-ingress-tls
-  hostnames: []
+  routes:
+    tls:
+      enabled: true
+      hostnames:
+        - openshell.example.com

-server:
-  # Envoy terminates TLS at the edge; the gateway listens plaintext behind it.
-  disableTls: true
-  oidc:
-    issuer: "https://keycloak.example.com/realms/openshell"
-    audience: "openshell-cli"
+# Include the external hostname in certificates generated for fresh installs.
+pkiInitJob:
+  serverDnsNames:
+    - openshell.example.com
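For clusters where cert-manager issues the server certificate, the troubleshooting entry in SKILL.md points at `certManager.serverDnsNames` instead of `pkiInitJob.serverDnsNames`. A possible cert-manager variant of this overlay is sketched below, on the assumption that cert-manager integration is already switched on elsewhere (for example through `ci/values-cert-manager.yaml`). Helm replaces list values wholesale instead of merging them, so the default SANs from the README table are repeated and the external hostname is appended.

```yaml
gatewayApi:
  gateway:
    create: true
    className: "eg"
  routes:
    tls:
      enabled: true
      hostnames:
        - openshell.example.com

certManager:
  serverDnsNames:
    # Chart defaults, repeated because an overlay list replaces the default list.
    - openshell
    - openshell.openshell.svc
    - openshell.openshell.svc.cluster.local
    - localhost
    - openshell.localhost
    - "*.openshell.localhost"
    - host.docker.internal
    # External SNI hostname; same example value as the TLSRoute hostname above.
    - openshell.example.com
```

`server.disableTls` is left at its default here, since passthrough requires the gateway server to keep serving TLS itself.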
14 changes: 8 additions & 6 deletions deploy/helm/openshell/ci/values-gateway.yaml
@@ -15,12 +15,14 @@
 # kubectl -n openshell port-forward svc/<envoy-service> 8080:80
 # # then: grpcurl -plaintext localhost:8080 ...

-grpcRoute:
-  enabled: true
+gatewayApi:
   gateway:
     create: true
     className: "eg"
-  # Set one or more hostnames to scope the route, e.g.:
-  # hostnames:
-  # - openshell.example.com
-  hostnames: []
+  routes:
+    grpc:
+      enabled: true
+      # Set one or more hostnames to scope the route, e.g.:
+      # hostnames:
+      # - openshell.example.com
+      hostnames: []
10 changes: 6 additions & 4 deletions deploy/helm/openshell/skaffold.yaml
@@ -71,14 +71,14 @@ deploy:
 # crds.enabled: true
 # Envoy Gateway — Kubernetes Gateway API implementation.
 # Installs the Gateway API CRDs and the "eg" GatewayClass.
-# Required when grpcRoute.enabled is true in the openshell release.
+# Required when Gateway API routes are enabled in the openshell release.
 #- name: envoy-gateway
 # remoteChart: oci://docker.io/envoyproxy/gateway-helm
-# version: v1.7.2
+# version: v1.8.1
 # namespace: envoy-gateway-system
 # createNamespace: true
 # # wait ensures Gateway API CRDs are registered before the openshell
-# # release attempts to create Gateway and HTTPRoute resources.
+# # release attempts to create Gateway and route resources.
 # wait: true
 # SPIRE — installs SPIRE Server, Agent, Controller Manager, CSI Driver,
 # and OIDC Discovery Provider using the SPIFFE hardened charts.
@@ -114,8 +114,10 @@ deploy:
 # setup task first, then uncomment the line below:
 # mise run keycloak:k8s:setup
 #- ci/values-keycloak.yaml
-# To enable the Gateway API HTTPRoute (requires Envoy Gateway above):
+# To enable the Gateway API Gateway and GRPCRoute (requires Envoy Gateway above):
 #- ci/values-gateway.yaml
+# To enable TLS passthrough instead (requires Envoy Gateway above):
+#- ci/values-gateway-tls.yaml
 # To enable SPIFFE/SPIRE provider token grants (requires the
 # spire-crds and spire releases above):
 #- ci/values-spire.yaml
26 changes: 26 additions & 0 deletions deploy/helm/openshell/templates/_helpers.tpl
@@ -109,6 +109,26 @@ Name of the Secret holding gateway-minted sandbox JWT signing material.
 {{- .Values.server.sandboxJwt.signingSecretName | default (printf "%s-jwt-keys" (include "openshell.fullname" .)) -}}
 {{- end }}

+{{/*
+Name of the Kubernetes Gateway API Gateway resource created or referenced by
+the chart.
+*/}}
+{{- define "openshell.gatewayApiGatewayName" -}}
+{{- .Values.gatewayApi.gateway.name | default (include "openshell.fullname" .) -}}
+{{- end }}
+
+{{/*
+Namespace of the Kubernetes Gateway API Gateway resource created or referenced
+by the chart.
+*/}}
+{{- define "openshell.gatewayApiGatewayNamespace" -}}
+{{- if .Values.gatewayApi.gateway.create -}}
+{{- .Release.Namespace -}}
+{{- else -}}
+{{- .Values.gatewayApi.gateway.namespace | default .Release.Namespace -}}
+{{- end -}}
+{{- end }}
+
 {{/*
 gRPC endpoint sandbox pods use to call back into the gateway. An explicit
 .Values.server.grpcEndpoint is used verbatim. Otherwise it is derived from
@@ -178,6 +198,12 @@ database requires persistent per-pod storage.
 Validate chart values that Helm would otherwise accept silently.
 */}}
 {{- define "openshell.validateValues" -}}
+{{- if hasKey .Values "grpcRoute" -}}
+{{- fail "grpcRoute values were replaced by gatewayApi; configure Gateway creation under gatewayApi.gateway and GRPCRoute creation under gatewayApi.routes.grpc." -}}
+{{- end -}}
+{{- if and .Values.gatewayApi.routes.tls.enabled .Values.server.disableTls -}}
+{{- fail "gatewayApi.routes.tls.enabled requires server.disableTls=false because TLS passthrough forwards the encrypted connection to the gateway server." -}}
+{{- end -}}
 {{- $workloadKind := include "openshell.workloadKind" . -}}
 {{- $workload := .Values.workload | default dict -}}
 {{- $replicaCount := int (default 1 .Values.replicaCount) -}}
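The two checks added to `openshell.validateValues` reject specific value combinations at render time. The fragments below are illustrative inputs written for this note, not files in the change. The first one trips the `hasKey` check, which looks only for the presence of a top-level `grpcRoute` key and not at what it contains:

```yaml
# Rejected: a leftover key from the old layout, even with nothing enabled.
grpcRoute:
  enabled: false
```

The second one trips the passthrough check, because the gateway server would be listening in plaintext behind a listener that forwards the still-encrypted connection:

```yaml
# Rejected: TLS passthrough combined with a plaintext gateway server.
gatewayApi:
  routes:
    tls:
      enabled: true
server:
  disableTls: true
```

The old `grpcRoute.gateway.listener.*` settings (port, protocol, `certificateRefs`, `allowedRoutes`) have no counterpart in the `gatewayApi` values listed in the README table, so overlays that used them need to be rewritten instead of renamed key by key.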
34 changes: 13 additions & 21 deletions deploy/helm/openshell/templates/gateway.yaml
@@ -1,33 +1,25 @@
 # SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 # SPDX-License-Identifier: Apache-2.0

-{{- if and .Values.grpcRoute.enabled .Values.grpcRoute.gateway.create }}
+{{- if .Values.gatewayApi.gateway.create }}
 apiVersion: gateway.networking.k8s.io/v1
 kind: Gateway
 metadata:
-  name: {{ default (include "openshell.fullname" .) .Values.grpcRoute.gateway.name }}
-  namespace: {{ .Release.Namespace }}
+  name: {{ include "openshell.gatewayApiGatewayName" . }}
+  namespace: {{ include "openshell.gatewayApiGatewayNamespace" . }}
   labels:
     {{- include "openshell.labels" . | nindent 4 }}
 spec:
-  gatewayClassName: {{ .Values.grpcRoute.gateway.className }}
+  gatewayClassName: {{ .Values.gatewayApi.gateway.className }}
   listeners:
-    - name: {{ ternary "https" "http" (eq .Values.grpcRoute.gateway.listener.protocol "HTTPS") }}
-      port: {{ .Values.grpcRoute.gateway.listener.port }}
-      protocol: {{ .Values.grpcRoute.gateway.listener.protocol }}
-      {{- if eq .Values.grpcRoute.gateway.listener.protocol "HTTPS" }}
-      {{- if not .Values.grpcRoute.gateway.listener.tls.certificateRefs }}
-      {{- fail "grpcRoute.gateway.listener.tls.certificateRefs is required when grpcRoute.gateway.listener.protocol is HTTPS" }}
-      {{- end }}
-      {{- if not .Values.server.disableTls }}
-      {{- fail "grpcRoute.gateway.listener.protocol=HTTPS terminates TLS at Envoy Gateway, which forwards plaintext gRPC to the gateway pod; set server.disableTls=true so the pod listens plaintext (this chart does not render a BackendTLSPolicy for re-encryption to a TLS backend)" }}
-      {{- end }}
+    - name: http
+      port: 80
+      protocol: HTTP
+    {{- if .Values.gatewayApi.routes.tls.enabled }}
+    - name: tls
+      port: 443
+      protocol: TLS
       tls:
-        mode: Terminate
-        certificateRefs:
-          {{- toYaml .Values.grpcRoute.gateway.listener.tls.certificateRefs | nindent 10 }}
-      {{- end }}
-      allowedRoutes:
-        namespaces:
-          from: {{ .Values.grpcRoute.gateway.listener.allowedRoutes }}
+        mode: Passthrough
+    {{- end }}
 {{- end }}
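As a rough illustration of the new template, rendering it with `gatewayApi.gateway.create=true` and `gatewayApi.routes.tls.enabled=true` should give a Gateway along the following lines. This is a hand-written sketch, not captured `helm template` output: it assumes a release whose fullname resolves to `openshell` and that is installed in the `openshell` namespace, and it leaves out the labels emitted by `openshell.labels`.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: openshell          # assumed chart fullname
  namespace: openshell     # assumed release namespace
spec:
  gatewayClassName: eg
  listeners:
    # Always rendered when the Gateway is created.
    - name: http
      port: 80
      protocol: HTTP
    # Rendered only when gatewayApi.routes.tls.enabled is true.
    - name: tls
      port: 443
      protocol: TLS
      tls:
        mode: Passthrough
```

Neither listener sets `allowedRoutes`, so the Gateway API default applies and only routes from the Gateway's own namespace can attach.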
10 changes: 5 additions & 5 deletions deploy/helm/openshell/templates/grpcroute.yaml
@@ -1,7 +1,7 @@
 # SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 # SPDX-License-Identifier: Apache-2.0

-{{- if .Values.grpcRoute.enabled }}
+{{- if .Values.gatewayApi.routes.grpc.enabled }}
 apiVersion: gateway.networking.k8s.io/v1
 kind: GRPCRoute
 metadata:
@@ -11,11 +11,11 @@ metadata:
     {{- include "openshell.labels" . | nindent 4 }}
 spec:
   parentRefs:
-    - name: {{ default (include "openshell.fullname" .) .Values.grpcRoute.gateway.name }}
-      namespace: {{ default .Release.Namespace .Values.grpcRoute.gateway.namespace }}
-  {{- if .Values.grpcRoute.hostnames }}
+    - name: {{ include "openshell.gatewayApiGatewayName" . }}
+      namespace: {{ include "openshell.gatewayApiGatewayNamespace" . }}
+  {{- if .Values.gatewayApi.routes.grpc.hostnames }}
   hostnames:
-    {{- toYaml .Values.grpcRoute.hostnames | nindent 4 }}
+    {{- toYaml .Values.gatewayApi.routes.grpc.hostnames | nindent 4 }}
   {{- end }}
   rules:
     - backendRefs:
24 changes: 24 additions & 0 deletions deploy/helm/openshell/templates/tlsroute.yaml
@@ -0,0 +1,24 @@
+# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+
+{{- if .Values.gatewayApi.routes.tls.enabled }}
+apiVersion: gateway.networking.k8s.io/v1
+kind: TLSRoute
+metadata:
+  name: {{ include "openshell.fullname" . }}
+  namespace: {{ .Release.Namespace }}
+  labels:
+    {{- include "openshell.labels" . | nindent 4 }}
+spec:
+  parentRefs:
+    - name: {{ include "openshell.gatewayApiGatewayName" . }}
+      namespace: {{ include "openshell.gatewayApiGatewayNamespace" . }}
+  {{- if .Values.gatewayApi.routes.tls.hostnames }}
+  hostnames:
+    {{- toYaml .Values.gatewayApi.routes.tls.hostnames | nindent 4 }}
+  {{- end }}
+  rules:
+    - backendRefs:
+        - name: {{ include "openshell.fullname" . }}
+          port: {{ .Values.service.port }}
+{{- end }}
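With the `ci/values-gateway-tls.yaml` overlay applied, this template would render a TLSRoute roughly like the one below. It is a hand-written sketch under stated assumptions, not real `helm template` output: the fullname and namespace are both assumed to be `openshell`, the labels are omitted, and the backend port `8080` is a hypothetical stand-in for whatever `.Values.service.port` is set to in the chart.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: TLSRoute
metadata:
  name: openshell            # assumed chart fullname
  namespace: openshell       # assumed release namespace
spec:
  parentRefs:
    # Resolved by the gatewayApiGatewayName / gatewayApiGatewayNamespace helpers.
    - name: openshell
      namespace: openshell
  hostnames:
    # SNI hostname from gatewayApi.routes.tls.hostnames in the overlay.
    - openshell.example.com
  rules:
    - backendRefs:
        - name: openshell
          port: 8080         # hypothetical; taken from .Values.service.port
```

Because the route matches on SNI and the backend terminates TLS itself, the hostname listed here also has to appear in the server certificate's SANs, which is what the `pkiInitJob.serverDnsNames` entry in the overlay is for.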