39 Commits

Author SHA1 Message Date
OdooSky v3
912d68010e chore(chart): keda + keda-add-ons-http crds.install=false (CRDs bootstrapped per-cluster) 2026-05-09 22:15:25 +02:00
OdooSky v3
67f26784d2 chore: regen Chart.lock for chart 0.7.4 (keda + keda-add-ons-http deps) 2026-05-09 21:58:25 +02:00
OdooSky v3
ba5408f7f9 chart 0.7.4 — install KEDA core + HTTP add-on for AI Studio scale-to-zero
Adds two subchart deps:
  - keda v2.15.1 (event-driven autoscaler, ScaledObject CRD)
  - keda-add-ons-http v0.8.0 (HTTPScaledObject CRD + interceptor-proxy)

Both are gated by enabled flags (keda.enabled, kedaHttpAddon.enabled),
which default to true so all clusters can host AI Studio (per-instance
OpenCode pods that scale 0↔1 on URL hit). Idle cost is ~300 MB RAM total,
small relative to a typical customer cluster (7+ GB allocatable).

Charts mirrored to registry.odoosky.cloud/odoosky/docker-mirror/charts
following the existing mirror-first pattern used by cert-manager,
traefik, longhorn, external-secrets.

Studio chart (studio-template-v3) created in monorepo as part of the
same feature; chart-side IngressRoute will be updated in 0.1.1 to
point at keda-add-ons-http-interceptor-proxy.keda.svc instead of the
per-instance Service (KEDA HTTP routing pattern). Tower handlers for
deploy/get/write-mode-toggle/delete already shipped in 0.76.47 behind
a studioChartReady=false feature flag.
2026-05-09 22:53:37 +03:00
OdooSky v3
d602063448 chart 0.7.3 — slug-suffix per-tenant ClusterIssuer (qsoft2 SSL fix)
cluster-issuer.yaml: name → letsencrypt-prod-{{ tenant.slug }}, hard-pin
apiTokenSecretRef.name to cloudflare-api-token-{{ tenant.slug }} so it
matches the ESO-created Secret. The ACME account key is also slug-suffixed
for tenant isolation. Before 0.7.3, the unsuffixed letsencrypt-prod did not
match what instance.go:504 stamps into per-instance Certificates
(letsencrypt-prod-<slug>), so cert-manager logged 'Referenced
ClusterIssuer not found' and erp2 served the Traefik default cert indefinitely.

tenants-wildcard-cert.yaml: issuerRef.name → letsencrypt-prod-{{ $.Values.tenant.slug }}
to match the renamed ClusterIssuer.

values.yaml: secrets.cloudflareTokenSecret block deprecated (the chart
no longer reads it; kept for back-compat with external overrides).

Diagnosed in the qsoft2 migrate test 2026-05-09.
2026-05-09 21:30:36 +03:00
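The naming rule from the 0.7.3 message above can be sketched in a few lines of Python. This is only an illustration of the convention, not code from the chart or from Tower: the function names and the example slug "demo" are hypothetical, and only the two name patterns quoted in the commit message are modeled.

```python
def issuer_names(slug: str) -> dict[str, str]:
    """Derive the slug-suffixed names quoted in the 0.7.3 commit message.

    Hypothetical helper: the chart does this in Helm templates, not Python.
    """
    if not slug:
        raise ValueError("tenant slug must be non-empty")
    return {
        "cluster_issuer": f"letsencrypt-prod-{slug}",
        "cloudflare_token_secret": f"cloudflare-api-token-{slug}",
    }


def issuer_ref_matches(issuer_ref: str, slug: str) -> bool:
    """True when a Certificate's issuerRef.name points at the tenant's ClusterIssuer."""
    return issuer_ref == issuer_names(slug)["cluster_issuer"]


if __name__ == "__main__":
    # "demo" is a made-up slug used only for illustration.
    print(issuer_names("demo"))
    # The unsuffixed name no longer matches once the suffix is applied.
    print(issuer_ref_matches("letsencrypt-prod", "demo"))
```

A check like issuer_ref_matches shows the kind of mismatch the commit describes: a reference and a resource name that differ only by the slug suffix.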
OdooSky v3
bdb0d44aee feat: 0.7.2 - mirror 4 subcharts to registry.odoosky.cloud (OCI) 2026-05-08 07:10:48 +03:00
OdooSky v3
ff7eb9fafc fix(eso): chart 0.7.1 — explicit CRD defaults to clear ArgoCD OutOfSync
ArgoCD was reporting all 6 ExternalSecrets as OutOfSync because the
live CRs had conversionStrategy/decodingStrategy/metadataPolicy fields
filled in by the CRD defaults that weren't in the chart manifests.
Stamping them explicitly so the diff is clean. Tower UI will now show
the Provisioning state correctly transitioning to Ready.
2026-05-07 21:47:00 +03:00
OdooSky v3
c26ee5b3c6 feat(eso): chart 0.7.0 — migrate all 4 remaining Tower-stamped Secrets to ExternalSecret
Phase 2 of Item #9. Adds ExternalSecret manifests for:
  - docker-mirror-pull (×2 namespaces, dockerconfigjson template)
  - cloudflare-api-token-<slug> (per-tenant, gated on tenant.id+slug)
  - s3-backup-creds (per-tenant, in tenants ns)
  - longhorn-s3-creds (per-tenant, gated on tenant.s3Endpoint)

New helm values: tenant.id, tenant.slug, tenant.s3Endpoint. Tower must
pass these per-cluster (next ship). All manifests gated on
externalSecrets.enabled + mountPath set + tenant.id set, so old apps
without the new params remain on the legacy Tower-stamped path until
the operator opts them in.
2026-05-07 21:25:41 +03:00
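The gating described in the 0.7.0 message above can be read as a small decision function. The Python sketch below is one interpretation, not the chart's template logic: it assumes mountPath lives at externalSecrets.openbao.mountPath (the key named in the 0.6.0 message), covers only the four Secrets added in 0.7.0, and uses a hypothetical function name.

```python
def rendered_external_secrets(values: dict) -> list[str]:
    """Return the ExternalSecret names that would render for a values dict.

    Assumed reading of the commit message: everything is gated on
    externalSecrets.enabled + mountPath + tenant.id; the Cloudflare token
    additionally needs tenant.slug and the Longhorn creds tenant.s3Endpoint.
    """
    ext = values.get("externalSecrets") or {}
    tenant = values.get("tenant") or {}
    mount_path = (ext.get("openbao") or {}).get("mountPath")

    # Global gate: without all three, the legacy Tower-stamped path stays in use.
    if not (ext.get("enabled") and mount_path and tenant.get("id")):
        return []

    names = ["docker-mirror-pull", "s3-backup-creds"]
    if tenant.get("slug"):
        names.append(f"cloudflare-api-token-{tenant['slug']}")
    if tenant.get("s3Endpoint"):
        names.append("longhorn-s3-creds")
    return names
```

With an old application that passes no tenant.id, the function returns an empty list, which corresponds to the "remain on the legacy Tower-stamped path" behavior the message describes.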
OdooSky v3
52a157f187 fix(eso): chart 0.6.2 - revert fullnameOverride; use templated SA in ClusterSecretStore
Chart 0.6.1's fullnameOverride attempted to give ESO resources stable
names (just 'external-secrets' instead of '<release>-external-secrets')
but ArgoCD couldn't fully drain the prefixed resources from 0.6.0,
leaving sync stuck. Reverting: keep the subchart's default release-
prefixed naming, template the SA reference in ClusterSecretStore via
{{ .Release.Name }}-external-secrets so it resolves correctly per
cluster.
2026-05-07 21:01:38 +03:00
OdooSky v3
ddc01def62 fix(eso): include all subchart tarballs (longhorn, cert-manager, traefik) — repo-server helm dep build fails without them 2026-05-07 20:54:02 +03:00
OdooSky v3
f32ad64c4c fix(eso): chart 0.6.1 - fullnameOverride to keep SA names stable
The ESO subchart was prefixing the ServiceAccount with the parent
release name (qsoft-platform-external-secrets), breaking both
ClusterSecretStore.serviceAccountRef and OpenBao's role binding which
both expect plain 'external-secrets'. Lock the name via
fullnameOverride.
2026-05-07 20:48:44 +03:00
OdooSky v3
536cb72a72 feat(eso): chart 0.6.0 - ESO subchart + ClusterSecretStore + gitea-archive-pull ExternalSecret
Phase 1 of Item #9 (Tower-stamped Secrets → ESO + OpenBao migration).
Replaces Tower's imperative kubectl-stamp of gitea-archive-pull with
a declarative ExternalSecret synced from OpenBao at
v3/platform/gitea-archive-pull. The other 4 Tower-stamped Secrets
(cloudflare, s3-backup, longhorn-s3, docker-mirror-pull) remain on the
legacy path.

Tower must pass externalSecrets.openbao.mountPath as a per-cluster
helm parameter (kubernetes-<server-name>) for ESO to activate; chart
guards against unset mountPath via {{ if }} in both new templates.
2026-05-07 20:46:22 +03:00
OdooSky v3
f50156d99d feat(traefik): tenants-default-retry Middleware (3 attempts, 200ms init) 2026-05-05 12:12:57 +02:00
OdooSky v3
0c39732f63 fix(traefik): platform-level HTTP→HTTPS redirect at the web entrypoint 2026-05-05 12:04:22 +02:00
pro-777
7e3280aa26 feat(slice 2B.3): chart Restore half — injectedWildcards conditional (0.5.7)
Add the chart-side machinery that lets Tower bypass the cert-manager
Certificate path on Reconnect by injecting a Vault-stashed wildcard
cert directly as a kubernetes.io/tls Secret.

values.yaml:
  certManager.injectedWildcards: []
    Each entry: { root, primary, crt, key }. Empty list = legacy ACME-only.

templates/tenants-wildcard-cert.yaml:
  Build $injectedRoots index from injectedWildcards[]; per-domain
  Certificate is skipped when its root has an injected entry.

templates/tenants-wildcard-secret.yaml (NEW):
  Per injected entry, render kubernetes.io/tls Secret using the same
  name the cert path would have produced (tenants-wildcard-tls primary,
  tenants-wildcard-<root-as-dashes>-tls non-primary). Sync-wave 2 to
  match the cert path's timing. Label
  odoosky.io/wildcard-source=vault-injected so harvester can skip them.

Verified via helm template + self-signed dummy cert:
  - Pure injection: 0 Certificate, 1 Secret (correct name + base64)
  - Pure ACME: 1 Certificate, 0 Secret (status quo)
  - Mixed (2 domains, 1 injected): 1 Certificate + 1 Secret

Inert without Tower wiring — existing clusters render identically to
0.5.6 because injectedWildcards defaults to []. Pushed first as the
foundation layer for the upcoming Tower restore + harvester slices.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 14:27:30 +03:00
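The Certificate-versus-Secret decision in the 0.5.7 message above can be modeled as follows. This is a hedged sketch rather than the template itself: it assumes tenant domains are plain root strings, that "root-as-dashes" means replacing dots with dashes, and the function names are hypothetical. The crt and key fields of each entry are ignored because only the rendering plan is modeled.

```python
def secret_name(root: str, primary: bool) -> str:
    """Name of the kubernetes.io/tls Secret for an injected wildcard entry."""
    if primary:
        return "tenants-wildcard-tls"
    # Assumption: "root-as-dashes" replaces dots with dashes.
    return f"tenants-wildcard-{root.replace('.', '-')}-tls"


def render_plan(domains: list[str], injected: list[dict]) -> tuple[list[str], list[str]]:
    """Return (roots that still get a Certificate, Secret names to render)."""
    injected_roots = {entry["root"] for entry in injected}
    # A per-domain Certificate is skipped when its root has an injected entry.
    certificates = [d for d in domains if d not in injected_roots]
    secrets = [secret_name(e["root"], bool(e.get("primary"))) for e in injected]
    return certificates, secrets
```

Run against the three cases listed in the message (pure injection, pure ACME, mixed), the plan yields 0 Certificates and 1 Secret, 1 Certificate and 0 Secrets, and 1 of each, respectively.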
OdooSky v3
d52d335853 feat(slice 2B.1.2): disable startupapicheck PostSync hook (chart 0.5.6) 2026-05-04 13:50:37 +03:00
OdooSky v3
252ac78888 feat(slice 2B.1.1): sync waves — kill the cert-manager-webhook race (chart 0.5.5) 2026-05-04 13:09:47 +03:00
OdooSky v3
46e8309153 feat(slice 2B.1): SkipHealthCheck on tenants-wildcard Cert (chart 0.5.4) 2026-05-04 12:38:18 +03:00
Tower Bot
b6d5b29f3e feat: loop tenant.domains[] for N wildcard certs (#320.C) 2026-05-03 13:58:48 +02:00
OdooSky Bot
0213a0b513 0.5.3: bump to invalidate Argo render cache 2026-05-02 23:28:49 +03:00
OdooSky Bot
e44078f061 0.5.2: api-approved annotation on CSI snapshot CRDs + spec.name on RecurringJobs (validator fixes) 2026-05-02 23:21:08 +03:00
OdooSky Bot
f73bdf6d62 0.5.1: RecurringJob namespace from .Values.namespace (Longhorn was deployed to odoosky-system, not longhorn-system) 2026-05-02 23:15:02 +03:00
OdooSky Bot
3e642dd7a1 0.5.0: Longhorn local snapshots + async S3 backup (#347 phase 5) 2026-05-02 23:14:15 +03:00
OdooSky Bot
8fca9aadfa 0.4.0: csi external-snapshotter v8.1.0 (Phase 3a — VolumeSnapshot CRDs + controller) 2026-05-02 22:01:26 +03:00
OdooSky Bot
cf0fd4c477 0.3.3: longhorn.persistence.defaultClass=false (k3s default-SC stays local-path) 2026-05-02 21:39:00 +03:00
OdooSky Bot
dcf9cf79d8 0.3.2: disable longhorn preUpgradeChecker (Argo+helm-hook ordering bug) 2026-05-02 21:24:49 +03:00
OdooSky Bot
c9aab7117a bump chart 0.3.1 2026-05-02 21:16:44 +03:00
OdooSky Bot
124eff9ee4 0.3.1: drop premature VolumeSnapshotClass — depends on external snapshotter CRDs (deferred to phase 3) 2026-05-02 21:16:41 +03:00
OdooSky Bot
81ec240e03 0.3.0: Longhorn CSI skeleton (#347 phase 1) — additive, default-off 2026-05-02 21:11:41 +03:00
Tower deploy
4a545946ab fix: drop ClusterIssuer dnsZones selector for multi-zone tenants 2026-05-02 11:31:15 +03:00
ops
7ee9856e25 per-cluster differentiator SAN on tenants-wildcard cert (avoid LE Duplicate Cert rate limit) 2026-04-29 22:27:02 +02:00
ops
976c67afd1 cloudflare-token Secret lives in odoosky-system (where cert-manager runs) 2026-04-29 22:03:10 +02:00
ops
4b19af28e3 vendor cert-manager v1.16.1 CRDs (resource-policy:keep stripped); subchart crds.enabled=false 2026-04-29 21:55:56 +02:00
ops
04b989facf cert-manager crds.keep=false (drop resource-policy annotation Argo won't apply) 2026-04-29 21:48:21 +02:00
ops
c8946a8965 cert-manager subchart: use dep-name alias + crds.enabled (v1.16 install fix) 2026-04-29 21:41:40 +02:00
ops
1a301cd3db sync-wave 5 on ClusterIssuer + Certificate (CRD ordering) 2026-04-29 21:36:44 +02:00
pro-777
eccb648276 0.2.0 — vendor cert-manager + traefik; parameterized substrate
bootstrap.sh-equivalent K8s manifests now ship as part of this
chart instead of being installed inline by the customer's
`curl … | sudo bash`. Result: customer terminal time drops from
~5 min to ~1 min once Tower's SubmitConnect (B2) creates the
per-cluster Argo Application that points here.

What's vendored:
  - cert-manager v1.16.1 (helm dep, charts/cert-manager-v1.16.1.tgz)
  - traefik 33.2.1       (helm dep, charts/traefik-33.2.1.tgz)

What's parameterized via .Values.tenant.{domain,wildcardHost}:
  - letsencrypt-prod ClusterIssuer (DNS-01 + tenant's Cloudflare zone)
  - tenants Namespace
  - tenants-wildcard Certificate (commonName + dnsNames from helm.values)

What stays out of Git (Tower kubectl-applies via kubeconfig at
Connect time, sourced from the tenant's Vault paths):
  - cloudflare-api-token Secret (cert-manager ns)
  - s3-backup-creds Secret      (tenants ns)

The chart references both Secrets by name only.

Argo health roll-up: a tenant server is "Ready" when this
Application's Health is `Healthy` and the tenants-wildcard
Certificate's Ready condition is True. Tower's Server card UI
will surface this as "Provisioning…" → "Ready" in B4.

Lint + template clean with a real tenant value set; clean with
empty values too (templates skip themselves so a default-rendered
chart doesn't fail without a tenant).
2026-04-29 15:09:33 +03:00
Tower Deploy
0c17429d4c Registry as NodePort (30500) so kubelet can pull via host loopback while in-cluster pods push via cluster DNS 2026-04-27 00:56:47 +03:00
Tower Deploy
a1dbe14c20 Initial chart: odoosky-system namespace + local container registry (Distribution v2) 2026-04-27 00:47:07 +03:00
049144dc04 Initial commit 2026-04-26 21:46:11 +00:00