chart 0.7.3 — slug-suffix per-tenant ClusterIssuer (qsoft2 SSL fix)

cluster-issuer.yaml: rename the ClusterIssuer to
letsencrypt-prod-{{ .Values.tenant.slug }} and hard-pin
apiTokenSecretRef.name to cloudflare-api-token-{{ .Values.tenant.slug }}
(key: api-token) so it matches the ESO-created Secret. The ACME account
key Secret is slug-suffixed too, for tenant isolation. The template now
renders only when both tenant.domain and tenant.slug are set. Before
0.7.3 the chart rendered the unsuffixed letsencrypt-prod, which did not
match the letsencrypt-prod-<slug> name that instance.go:504 stamps into
per-instance Certificates, so cert-manager logged "Referenced
ClusterIssuer not found" and erp2 kept serving the Traefik default cert.
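
For illustration, the failure mode can be pictured with a per-instance Certificate like the one below. This is a minimal sketch, not the overlay renderer's actual output: only the issuerRef name (letsencrypt-prod-qsoft) comes from the diagnosis, while the metadata, secretName, and dnsNames are hypothetical placeholders.

```yaml
# Hypothetical per-instance Certificate. Everything except issuerRef.name
# is a placeholder and is not taken from the repository.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: erp2-tls              # placeholder
  namespace: tenants          # placeholder
spec:
  secretName: erp2-tls        # placeholder
  dnsNames:
    - erp2.example.com        # placeholder
  issuerRef:
    kind: ClusterIssuer
    # Slug-suffixed name. A chart that only renders `letsencrypt-prod`
    # leaves this reference unresolved, so no certificate is issued.
    name: letsencrypt-prod-qsoft
```

With chart 0.7.3 the ClusterIssuer is rendered under the same slug-suffixed name, so a reference of this shape resolves.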

tenants-wildcard-cert.yaml: issuerRef.name → letsencrypt-prod-{{ $.Values.tenant.slug }}
to match the renamed ClusterIssuer.

values.yaml: secrets.cloudflareTokenSecret block deprecated (the chart
no longer reads it; kept for back-compat with external overrides).

Diagnosed in the qsoft2 migrate test on 2026-05-09.
OdooSky v3
2026-05-09 21:30:36 +03:00
parent bdb0d44aee
commit d602063448
4 changed files with 48 additions and 18 deletions

Chart.yaml

@@ -23,8 +23,8 @@ description: |
   Git).
 type: application
-version: 0.7.2
-appVersion: "0.7.2"
+version: 0.7.3
+appVersion: "0.7.3"
 # All 4 subcharts now resolve from registry.odoosky.cloud (mirrored
 # 2026-05-08). Mirror-first discipline + China-region readiness: a

cluster-issuer.yaml

@@ -1,5 +1,15 @@
-{{- if .Values.tenant.domain }}
-# letsencrypt-prod ClusterIssuer — DNS-01 challenge via Cloudflare.
+{{- if and .Values.tenant.domain .Values.tenant.slug }}
+# letsencrypt-prod-<slug> ClusterIssuer — DNS-01 challenge via Cloudflare,
+# scoped to THIS tenant via the per-tenant CF token Secret. The
+# `letsencrypt-prod-<slug>` naming MUST match tenantClusterIssuerName()
+# in backend/cmd/api/tenant_substrate.go — the per-instance overlay
+# renderer in instance.go:504 stamps that exact name into every
+# Certificate's issuerRef. Pre-0.7.3 charts used the unsuffixed name
+# `letsencrypt-prod`, which broke for any instance asking for the
+# slugged form (the qsoft2 migrate test on 2026-05-09 surfaced this:
+# erp2's Certificate referenced letsencrypt-prod-qsoft, the chart only
+# rendered letsencrypt-prod, cert-manager logged "Referenced ClusterIssuer
+# not found", erp2 served the Traefik default cert forever).
 #
 # Multi-zone: the solver has NO `selector.dnsZones` restriction. The
 # tenant's Cloudflare token typically covers many zones (a tenant with
@@ -13,10 +23,10 @@
 # `4th.online`). Dropping the selector unifies single-zone and
 # multi-zone tenants under one issuer.
 #
-# The cloudflare-api-token Secret is NOT in this chart. Tower
-# kubectl-applies it into cert-manager ns at Connect time using the
-# tenant's per-tenant Vault credential (v3/tenants/<id>/cloudflare-token).
-# The chart references it by name only.
+# The cloudflare-api-token-<slug> Secret is now chart-managed via the
+# ESO ExternalSecret in cloudflare-api-token-externalsecret.yaml (which
+# pulls the token from OpenBao at v3/tenants/<id>/cloudflare-token).
+# Naming kept symmetric with that template.
 #
 # Sync wave 1 (Slice 2B.1.1, 2026-05-04). cert-manager itself
 # installs at the default wave 0; Argo waits for ALL wave-0
@@ -32,24 +42,35 @@
 # (in tenants-wildcard-cert.yaml) — Certificate references the
 # ClusterIssuer by name, so the resource graph also reflects the
 # logical dependency.
+#
+# Multi-tenant clusters (visiting tenants on a host tenant's cluster)
+# remain a known gap (Item #9 follow-up): the ESO ExternalSecret loop
+# only iterates the cluster-owner tenant. When a future deploy lands a
+# non-owner tenant on a cluster, that tenant's CF Secret + Issuer must
+# be applied out-of-band until this template grows a `Values.tenants[]`
+# loop and Tower's onboarding code populates it.
 apiVersion: cert-manager.io/v1
 kind: ClusterIssuer
 metadata:
-  name: letsencrypt-prod
+  name: letsencrypt-prod-{{ .Values.tenant.slug }}
   annotations:
     argocd.argoproj.io/sync-wave: "1"
   labels:
     app.kubernetes.io/managed-by: cluster-platform-v3
+    odoosky.io/tenant: {{ .Values.tenant.id | quote }}
 spec:
   acme:
     email: {{ required "acme.email is required" .Values.acme.email | quote }}
     server: {{ .Values.acme.server | quote }}
     privateKeySecretRef:
-      name: letsencrypt-prod-account-key
+      # Slug-suffixed so each tenant has its own ACME account key
+      # cleaner isolation if a tenant rotates / audits, and avoids
+      # implicit shared state if two tenants ever land on one cluster.
+      name: letsencrypt-prod-account-key-{{ .Values.tenant.slug }}
     solvers:
       - dns01:
           cloudflare:
             apiTokenSecretRef:
-              name: {{ .Values.secrets.cloudflareTokenSecret.name | quote }}
-              key: {{ .Values.secrets.cloudflareTokenSecret.key | quote }}
+              name: cloudflare-api-token-{{ .Values.tenant.slug }}
+              key: api-token
 {{- end }}
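
As a rough picture of what the new template side produces, here is a sketch of the rendered ClusterIssuer for a tenant whose slug is qsoft. The tenant.id, acme.email, and acme.server values are placeholders chosen for illustration (the server shown is the public Let's Encrypt production directory), and the indentation is assumed, not copied from the repository.

```yaml
# Sketch of the rendered manifest for tenant.slug=qsoft.
# tenant.id, acme.email, and acme.server are placeholder inputs.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod-qsoft
  annotations:
    argocd.argoproj.io/sync-wave: "1"
  labels:
    app.kubernetes.io/managed-by: cluster-platform-v3
    odoosky.io/tenant: "tenant-0001"          # placeholder tenant id
spec:
  acme:
    email: "ops@example.com"                  # placeholder
    server: "https://acme-v02.api.letsencrypt.org/directory"
    privateKeySecretRef:
      name: letsencrypt-prod-account-key-qsoft
    solvers:
      - dns01:
          cloudflare:
            apiTokenSecretRef:
              name: cloudflare-api-token-qsoft
              key: api-token
```

All three names (issuer, ACME account key, Cloudflare token Secret) carry the same slug suffix, so a mismatch like the pre-0.7.3 one would show up as a differing suffix in this output.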

tenants-wildcard-cert.yaml

@@ -96,7 +96,11 @@ metadata:
 spec:
   secretName: {{ printf "tenants-wildcard%s-tls" $suffix | quote }}
   issuerRef:
-    name: letsencrypt-prod
+    # Slug-suffixed since chart 0.7.3 — matches the ClusterIssuer
+    # name rendered by cluster-issuer.yaml. Pre-0.7.3 this was the
+    # unsuffixed `letsencrypt-prod`. See cluster-issuer.yaml's
+    # docstring for the why.
+    name: letsencrypt-prod-{{ $.Values.tenant.slug }}
     kind: ClusterIssuer
   commonName: {{ $d.wildcardHost | quote }}
   dnsNames:

values.yaml

@@ -143,14 +143,19 @@ traefik:
   port: websecure
   priority: 10
-# secrets — Tower applies these out-of-band via the registered
-# kubeconfig at Connect time (B2). The chart references them by
-# name only; values never enter Git.
+# secrets — DEPRECATED for cloudflareTokenSecret as of chart 0.7.3.
+# The cluster-issuer.yaml template now hard-references
+# `cloudflare-api-token-<tenant.slug>` (matches the ESO-created Secret
+# in cloudflare-api-token-externalsecret.yaml) and ignores this block.
+# Kept here as no-op back-compat for any external chart consumer that
+# overrides these values; chart templates no longer read
+# secrets.cloudflareTokenSecret. s3CredentialsSecret is still consumed
+# by the per-instance backup CronJob path and remains live.
 secrets:
   cloudflareTokenSecret:
     namespace: odoosky-system
-    name: cloudflare-api-token
-    key: api-token
+    name: cloudflare-api-token # unused since 0.7.3; chart computes from tenant.slug
+    key: api-token # unused since 0.7.3
   s3CredentialsSecret:
     namespace: tenants
     name: s3-backup-creds
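
To make the back-compat behaviour concrete, the following values override is a sketch of what an external consumer might pass to chart 0.7.3. The file name, tenant id, domain, email, and the custom Secret name are all hypothetical; only the key paths come from the templates in this commit.

```yaml
# overrides.yaml (hypothetical file name and values)
tenant:
  id: "tenant-0001"        # placeholder; rendered into the odoosky.io/tenant label
  slug: qsoft              # required since 0.7.3: no ClusterIssuer is rendered without it
  domain: example.com      # placeholder
acme:
  email: ops@example.com   # placeholder; the template fails rendering if unset
  server: https://acme-v02.api.letsencrypt.org/directory
secrets:
  cloudflareTokenSecret:
    # Still accepted, but no template reads it any more. The issuer
    # always points at cloudflare-api-token-<tenant.slug>, key api-token.
    name: my-custom-cf-token
    key: token
```

An override like this no longer changes which Secret the solver uses, so consumers that relied on a custom token Secret name would need their Secret to follow the slug-suffixed convention instead.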