cluster-issuer.yaml: name → letsencrypt-prod-{{ tenant.slug }}, hard-pin
apiTokenSecretRef.name to cloudflare-api-token-{{ tenant.slug }} so it
matches the ESO-created Secret. ACME account key also slug-suffixed
for tenant isolation. Pre-0.7.3 the unsuffixed letsencrypt-prod
mismatched what instance.go:504 stamps into per-instance Certificates
(letsencrypt-prod-<slug>), so cert-manager logged 'Referenced
ClusterIssuer not found' and erp2 served Traefik default cert forever.
tenants-wildcard-cert.yaml: issuerRef.name → letsencrypt-prod-{{ $.Values.tenant.slug }}
to match the renamed ClusterIssuer.
values.yaml: secrets.cloudflareTokenSecret block deprecated (the chart
no longer reads it; kept for back-compat with external overrides).
Diagnosed in the qsoft2 migrate test 2026-05-09.
ArgoCD was reporting all 6 ExternalSecrets as OutOfSync because the
live CRs had conversionStrategy/decodingStrategy/metadataPolicy fields
filled in by the CRD defaults that werent in the chart manifests.
Stamping them explicitly so the diff is clean. Tower UI will now show
Provisioning state correctly transition to Ready.
Phase 2 of Item #9. Adds ExternalSecret manifests for:
- docker-mirror-pull (×2 namespaces, dockerconfigjson template)
- cloudflare-api-token-<slug> (per-tenant, gated on tenant.id+slug)
- s3-backup-creds (per-tenant, in tenants ns)
- longhorn-s3-creds (per-tenant, gated on tenant.s3Endpoint)
New helm values: tenant.id, tenant.slug, tenant.s3Endpoint. Tower must
pass these per-cluster (next ship). All manifests gated on
externalSecrets.enabled + mountPath set + tenant.id set, so old apps
without the new params remain on the legacy Tower-stamped path until
the operator opts them in.
Chart 0.6.1's fullnameOverride attempted to give ESO resources stable
names (just 'external-secrets' instead of '<release>-external-secrets')
but ArgoCD couldn't fully drain the prefixed resources from 0.6.0,
leaving sync stuck. Reverting: keep the subchart's default release-
prefixed naming, template the SA reference in ClusterSecretStore via
{{ .Release.Name }}-external-secrets so it resolves correctly per
cluster.
Phase 1 of Item #9 (Tower-stamped Secrets → ESO + OpenBao migration).
Replaces Tower's imperative kubectl-stamp of gitea-archive-pull with
a declarative ExternalSecret synced from OpenBao at v3/platform/gitea-
archive-pull. Other 4 Tower-stamped Secrets (cloudflare, s3-backup,
longhorn-s3, docker-mirror-pull) remain on legacy path.
Tower must pass externalSecrets.openbao.mountPath as a per-cluster
helm parameter (kubernetes-<server-name>) for ESO to activate; chart
guards against unset mountPath via {{ if }} in both new templates.
Add the chart-side machinery that lets Tower bypass the cert-manager
Certificate path on Reconnect by injecting a Vault-stashed wildcard
cert directly as a kubernetes.io/tls Secret.
values.yaml:
certManager.injectedWildcards: []
Each entry: { root, primary, crt, key }. Empty list = legacy ACME-only.
templates/tenants-wildcard-cert.yaml:
Build $injectedRoots index from injectedWildcards[]; per-domain
Certificate is skipped when its root has an injected entry.
templates/tenants-wildcard-secret.yaml (NEW):
Per injected entry, render kubernetes.io/tls Secret using the same
name the cert path would have produced (tenants-wildcard-tls primary,
tenants-wildcard-<root-as-dashes>-tls non-primary). Sync-wave 2 to
match the cert path's timing. Label odoosky.io/wildcard-source=
vault-injected so harvester can skip them.
Verified via helm template + self-signed dummy cert:
- Pure injection: 0 Certificate, 1 Secret (correct name + base64)
- Pure ACME: 1 Certificate, 0 Secret (status quo)
- Mixed (2 domains, 1 injected): 1 Certificate + 1 Secret
Inert without Tower wiring — existing clusters render identically to
0.5.6 because injectedWildcards defaults to []. Pushed first as the
foundation layer for the upcoming Tower restore + harvester slices.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bootstrap.sh-equivalent K8s manifests now ship as part of this
chart instead of being installed inline by the customer's
`curl … | sudo bash`. Result: customer terminal time drops from
~5 min to ~1 min once Tower's SubmitConnect (B2) creates the
per-cluster Argo Application that points here.
What's vendored:
- cert-manager v1.16.1 (helm dep, charts/cert-manager-v1.16.1.tgz)
- traefik 33.2.1 (helm dep, charts/traefik-33.2.1.tgz)
What's parameterized via .Values.tenant.{domain,wildcardHost}:
- letsencrypt-prod ClusterIssuer (DNS-01 + tenant's Cloudflare zone)
- tenants Namespace
- tenants-wildcard Certificate (commonName + dnsNames from helm.values)
What stays out of Git (Tower kubectl-applies via kubeconfig at
Connect time, sourced from the tenant's Vault paths):
- cloudflare-api-token Secret (cert-manager ns)
- s3-backup-creds Secret (tenants ns)
The chart references both Secrets by name only.
Argo health roll-up: a tenant server is "Ready" when this
Application's Health is `Healthy` and the tenants-wildcard
Certificate's Ready condition is True. Tower's Server card UI
will surface this as "Provisioning…" → "Ready" in B4.
Lint + template clean with a real tenant value set; clean with
empty values too (templates skip themselves so a default-rendered
chart doesn't fail without a tenant).