24 Commits

Author SHA1 Message Date
OdooSky v3
912d68010e chore(chart): keda + keda-add-ons-http crds.install=false (CRDs bootstrapped per-cluster) 2026-05-09 22:15:25 +02:00
OdooSky v3
ba5408f7f9 chart 0.7.4 — install KEDA core + HTTP add-on for AI Studio scale-to-zero
Adds two subchart deps:
  - keda v2.15.1 (event-driven autoscaler, ScaledObject CRD)
  - keda-add-ons-http v0.8.0 (HTTPScaledObject CRD + interceptor-proxy)

Both are gated by enabled flags (keda.enabled, kedaHttpAddon.enabled),
which default to true so all clusters can host AI Studio (per-instance
OpenCode pods that scale 0↔1 on URL hit). Idle cost is ~300 MB RAM in
total, small relative to a typical customer cluster (7+ GB allocatable).

Charts mirrored to registry.odoosky.cloud/odoosky/docker-mirror/charts
following the existing mirror-first pattern used by cert-manager,
traefik, longhorn, external-secrets.

Studio chart (studio-template-v3) created in monorepo as part of the
same feature; chart-side IngressRoute will be updated in 0.1.1 to
point at keda-add-ons-http-interceptor-proxy.keda.svc instead of the
per-instance Service (KEDA HTTP routing pattern). Tower handlers for
deploy/get/write-mode-toggle/delete already shipped in 0.76.47 behind
a studioChartReady=false feature flag.
2026-05-09 22:53:37 +03:00
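For illustration, the two subchart dependencies and their gating flags from chart 0.7.4 might be declared roughly as below. This is a sketch rather than the chart's actual files: the `oci://` scheme on the mirror path is an assumption (the commit message names only the registry path), and the layout of the two files is assumed. Only the chart names, versions, and flag names come from the commit message.

```yaml
# Chart.yaml (sketch; the oci:// repository scheme is an assumption)
dependencies:
  - name: keda
    version: 2.15.1
    repository: oci://registry.odoosky.cloud/odoosky/docker-mirror/charts
    condition: keda.enabled            # subchart renders only when this flag is true
  - name: keda-add-ons-http
    version: 0.8.0
    repository: oci://registry.odoosky.cloud/odoosky/docker-mirror/charts
    condition: kedaHttpAddon.enabled
---
# values.yaml (defaults as described in the commit message)
keda:
  enabled: true
kedaHttpAddon:
  enabled: true
```

A cluster that should not host AI Studio could then opt out by overriding both flags to false.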
OdooSky v3
d602063448 chart 0.7.3 — slug-suffix per-tenant ClusterIssuer (qsoft2 SSL fix)
cluster-issuer.yaml: name → letsencrypt-prod-{{ tenant.slug }}, hard-pin
apiTokenSecretRef.name to cloudflare-api-token-{{ tenant.slug }} so it
matches the ESO-created Secret. ACME account key also slug-suffixed
for tenant isolation. Before 0.7.3, the unsuffixed letsencrypt-prod did not
match the name that instance.go:504 stamps into per-instance Certificates
(letsencrypt-prod-<slug>), so cert-manager logged 'Referenced
ClusterIssuer not found' and erp2 kept serving the Traefik default cert.

tenants-wildcard-cert.yaml: issuerRef.name → letsencrypt-prod-{{ $.Values.tenant.slug }}
to match the renamed ClusterIssuer.

values.yaml: secrets.cloudflareTokenSecret block deprecated (the chart
no longer reads it; kept for back-compat with external overrides).

Diagnosed in the qsoft2 migrate test 2026-05-09.
2026-05-09 21:30:36 +03:00
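A minimal sketch of what the slug-suffixed ClusterIssuer from chart 0.7.3 might look like. The resource name and the apiTokenSecretRef name follow the commit message. The account-key Secret name and the `api-token` key are assumptions for illustration, not taken from the chart.

```yaml
# templates/cluster-issuer.yaml (sketch, not the chart's actual template)
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  # Matches the issuer name stamped into per-instance Certificates
  name: letsencrypt-prod-{{ .Values.tenant.slug }}
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      # Assumed name; the commit only says the account key is slug-suffixed
      name: letsencrypt-prod-{{ .Values.tenant.slug }}-account-key
    solvers:
      - dns01:
          cloudflare:
            apiTokenSecretRef:
              # Pinned to the ESO-created Secret
              name: cloudflare-api-token-{{ .Values.tenant.slug }}
              key: api-token   # assumed key name
```

Keeping the issuer name, account key, and token Secret all keyed on the same slug means that a mismatch in any one of them surfaces as a missing reference rather than silently crossing tenants.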
OdooSky v3
c26ee5b3c6 feat(eso): chart 0.7.0 — migrate all 4 remaining Tower-stamped Secrets to ExternalSecret
Phase 2 of Item #9. Adds ExternalSecret manifests for:
  - docker-mirror-pull (×2 namespaces, dockerconfigjson template)
  - cloudflare-api-token-<slug> (per-tenant, gated on tenant.id+slug)
  - s3-backup-creds (per-tenant, in tenants ns)
  - longhorn-s3-creds (per-tenant, gated on tenant.s3Endpoint)

New helm values: tenant.id, tenant.slug, tenant.s3Endpoint. Tower must
pass these per-cluster (next ship). All manifests gated on
externalSecrets.enabled + mountPath set + tenant.id set, so old apps
without the new params remain on the legacy Tower-stamped path until
the operator opts them in.
2026-05-07 21:25:41 +03:00
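The gating described in chart 0.7.0 could be sketched as follows for the per-tenant Cloudflare token. The ClusterSecretStore name, the OpenBao path, and the key names are hypothetical placeholders. The sketch also assumes values.yaml defines empty defaults for `externalSecrets.openbao` and `tenant`, so the lookups do not fail on a default render.

```yaml
# templates/cloudflare-externalsecret.yaml (sketch; store name, path, and keys are hypothetical)
{{- if and .Values.externalSecrets.enabled .Values.externalSecrets.openbao.mountPath .Values.tenant.id .Values.tenant.slug }}
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: cloudflare-api-token-{{ .Values.tenant.slug }}
  namespace: odoosky-system
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: openbao                                        # hypothetical store name
  target:
    name: cloudflare-api-token-{{ .Values.tenant.slug }}
  data:
    - secretKey: api-token                               # hypothetical key name
      remoteRef:
        key: tenants/{{ .Values.tenant.id }}/cloudflare  # hypothetical OpenBao path
        property: api-token
{{- end }}
```

With any of the four gate values unset, the template renders nothing, which is what keeps older applications on the legacy Tower-stamped path.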
OdooSky v3
52a157f187 fix(eso): chart 0.6.2 - revert fullnameOverride; use templated SA in ClusterSecretStore
Chart 0.6.1's fullnameOverride attempted to give ESO resources stable
names (just 'external-secrets' instead of '<release>-external-secrets')
but ArgoCD couldn't fully drain the prefixed resources from 0.6.0,
leaving sync stuck. Reverting: keep the subchart's default release-
prefixed naming, template the SA reference in ClusterSecretStore via
{{ .Release.Name }}-external-secrets so it resolves correctly per
cluster.
2026-05-07 21:01:38 +03:00
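A rough sketch of the templated ServiceAccount reference from chart 0.6.2. Only the `{{ .Release.Name }}-external-secrets` reference and the `externalSecrets.openbao.mountPath` parameter come from the commit messages. The store name, the `server` value, the role name, and the KV settings are assumptions.

```yaml
# templates/cluster-secret-store.yaml (sketch; several fields are assumed)
{{- if .Values.externalSecrets.openbao.mountPath }}
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: openbao                  # hypothetical store name
spec:
  provider:
    vault:
      server: {{ .Values.externalSecrets.openbao.server | quote }}  # hypothetical value
      path: v3                   # assumed KV mount
      version: v2                # assumed KV engine version
      auth:
        kubernetes:
          mountPath: {{ .Values.externalSecrets.openbao.mountPath | quote }}
          role: external-secrets # assumed role name
          serviceAccountRef:
            # Follows the subchart's release-prefixed naming instead of overriding it
            name: {{ .Release.Name }}-external-secrets
            namespace: {{ .Release.Namespace }}
{{- end }}
```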
OdooSky v3
f32ad64c4c fix(eso): chart 0.6.1 - fullnameOverride to keep SA names stable
The ESO subchart was prefixing the ServiceAccount with the parent
release name (qsoft-platform-external-secrets), breaking
ClusterSecretStore.serviceAccountRef and OpenBao's role binding, both
of which expect plain 'external-secrets'. Lock the name via
fullnameOverride.
2026-05-07 20:48:44 +03:00
OdooSky v3
536cb72a72 feat(eso): chart 0.6.0 - ESO subchart + ClusterSecretStore + gitea-archive-pull ExternalSecret
Phase 1 of Item #9 (Tower-stamped Secrets → ESO + OpenBao migration).
Replaces Tower's imperative kubectl-stamp of gitea-archive-pull with
a declarative ExternalSecret synced from OpenBao at
v3/platform/gitea-archive-pull. The other 4 Tower-stamped Secrets
(cloudflare, s3-backup, longhorn-s3, docker-mirror-pull) remain on the
legacy path.

Tower must pass externalSecrets.openbao.mountPath as a per-cluster
helm parameter (kubernetes-<server-name>) for ESO to activate; chart
guards against unset mountPath via {{ if }} in both new templates.
2026-05-07 20:46:22 +03:00
OdooSky v3
0c39732f63 fix(traefik): platform-level HTTP→HTTPS redirect at the web entrypoint 2026-05-05 12:04:22 +02:00
pro-777
7e3280aa26 feat(slice 2B.3): chart Restore half — injectedWildcards conditional (0.5.7)
Add the chart-side machinery that lets Tower bypass the cert-manager
Certificate path on Reconnect by injecting a Vault-stashed wildcard
cert directly as a kubernetes.io/tls Secret.

values.yaml:
  certManager.injectedWildcards: []
    Each entry: { root, primary, crt, key }. Empty list = legacy ACME-only.

templates/tenants-wildcard-cert.yaml:
  Build $injectedRoots index from injectedWildcards[]; per-domain
  Certificate is skipped when its root has an injected entry.

templates/tenants-wildcard-secret.yaml (NEW):
  Per injected entry, render kubernetes.io/tls Secret using the same
  name the cert path would have produced (tenants-wildcard-tls primary,
  tenants-wildcard-<root-as-dashes>-tls non-primary). Sync-wave 2 to
  match the cert path's timing. Label
  odoosky.io/wildcard-source=vault-injected so the harvester can skip them.

Verified via helm template + self-signed dummy cert:
  - Pure injection: 0 Certificate, 1 Secret (correct name + base64)
  - Pure ACME: 1 Certificate, 0 Secret (status quo)
  - Mixed (2 domains, 1 injected): 1 Certificate + 1 Secret

Inert without Tower wiring — existing clusters render identically to
0.5.6 because injectedWildcards defaults to []. Pushed first as the
foundation layer for the upcoming Tower restore + harvester slices.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 14:27:30 +03:00
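The naming rule for injected wildcard Secrets in chart 0.5.7 could be sketched as the template below. It assumes that `crt` and `key` arrive as PEM text and are base64-encoded by the template, and that the Secrets live in the `tenants` namespace. Neither detail is confirmed by the commit message.

```yaml
# templates/tenants-wildcard-secret.yaml (sketch; encoding and namespace are assumptions)
{{- range .Values.certManager.injectedWildcards }}
---
apiVersion: v1
kind: Secret
metadata:
  # Same name the Certificate path would have produced for this root
  name: {{ if .primary }}tenants-wildcard-tls{{ else }}tenants-wildcard-{{ .root | replace "." "-" }}-tls{{ end }}
  namespace: tenants
  labels:
    odoosky.io/wildcard-source: vault-injected
  annotations:
    argocd.argoproj.io/sync-wave: "2"
type: kubernetes.io/tls
data:
  tls.crt: {{ .crt | b64enc }}
  tls.key: {{ .key | b64enc }}
{{- end }}
```

With the default empty list the `range` produces no output, which is consistent with the commit's note that existing clusters render identically to 0.5.6.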
OdooSky v3
d52d335853 feat(slice 2B.1.2): disable startupapicheck PostSync hook (chart 0.5.6) 2026-05-04 13:50:37 +03:00
Tower Bot
b6d5b29f3e feat: loop tenant.domains[] for N wildcard certs (#320.C) 2026-05-03 13:58:48 +02:00
OdooSky Bot
3e642dd7a1 0.5.0: Longhorn local snapshots + async S3 backup (#347 phase 5) 2026-05-02 23:14:15 +03:00
OdooSky Bot
8fca9aadfa 0.4.0: csi external-snapshotter v8.1.0 (Phase 3a — VolumeSnapshot CRDs + controller) 2026-05-02 22:01:26 +03:00
OdooSky Bot
cf0fd4c477 0.3.3: longhorn.persistence.defaultClass=false (k3s default-SC stays local-path) 2026-05-02 21:39:00 +03:00
OdooSky Bot
dcf9cf79d8 0.3.2: disable longhorn preUpgradeChecker (Argo+helm-hook ordering bug) 2026-05-02 21:24:49 +03:00
OdooSky Bot
81ec240e03 0.3.0: Longhorn CSI skeleton (#347 phase 1) — additive, default-off 2026-05-02 21:11:41 +03:00
ops
7ee9856e25 per-cluster differentiator SAN on tenants-wildcard cert (avoid LE Duplicate Cert rate limit) 2026-04-29 22:27:02 +02:00
ops
976c67afd1 cloudflare-token Secret lives in odoosky-system (where cert-manager runs) 2026-04-29 22:03:10 +02:00
ops
4b19af28e3 vendor cert-manager v1.16.1 CRDs (resource-policy:keep stripped); subchart crds.enabled=false 2026-04-29 21:55:56 +02:00
ops
04b989facf cert-manager crds.keep=false (drop resource-policy annotation Argo won't apply) 2026-04-29 21:48:21 +02:00
ops
c8946a8965 cert-manager subchart: use dep-name alias + crds.enabled (v1.16 install fix) 2026-04-29 21:41:40 +02:00
pro-777
eccb648276 0.2.0 — vendor cert-manager + traefik; parameterized substrate
bootstrap.sh-equivalent K8s manifests now ship as part of this
chart instead of being installed inline by the customer's
`curl … | sudo bash`. Result: customer terminal time drops from
~5 min to ~1 min once Tower's SubmitConnect (B2) creates the
per-cluster Argo Application that points here.

What's vendored:
  - cert-manager v1.16.1 (helm dep, charts/cert-manager-v1.16.1.tgz)
  - traefik 33.2.1       (helm dep, charts/traefik-33.2.1.tgz)

What's parameterized via .Values.tenant.{domain,wildcardHost}:
  - letsencrypt-prod ClusterIssuer (DNS-01 + tenant's Cloudflare zone)
  - tenants Namespace
  - tenants-wildcard Certificate (commonName + dnsNames from helm.values)

What stays out of Git (Tower kubectl-applies via kubeconfig at
Connect time, sourced from the tenant's Vault paths):
  - cloudflare-api-token Secret (cert-manager ns)
  - s3-backup-creds Secret      (tenants ns)

The chart references both Secrets by name only.

Argo health roll-up: a tenant server is "Ready" when this
Application's Health is `Healthy` and the tenants-wildcard
Certificate's Ready condition is True. Tower's Server card UI
will surface this as "Provisioning…" → "Ready" in B4.

Lint + template clean with a real tenant value set; clean with
empty values too (templates skip themselves so a default-rendered
chart doesn't fail without a tenant).
2026-04-29 15:09:33 +03:00
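As an illustration of the "templates skip themselves" behaviour in 0.2.0, a self-guarding Certificate template might look like the sketch below. The metadata name, namespace, and secretName are assumptions. The sketch also assumes values.yaml ships empty-string defaults for `tenant.domain` and `tenant.wildcardHost`, so a default render does not fail on a missing map.

```yaml
# templates/tenants-wildcard-cert.yaml (sketch as of 0.2.0; names are assumed)
{{- if and .Values.tenant.domain .Values.tenant.wildcardHost }}
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: tenants-wildcard
  namespace: tenants
spec:
  secretName: tenants-wildcard-tls   # assumed Secret name
  commonName: {{ .Values.tenant.wildcardHost | quote }}
  dnsNames:
    - {{ .Values.tenant.wildcardHost | quote }}
  issuerRef:
    kind: ClusterIssuer
    name: letsencrypt-prod
{{- end }}
```

Rendering with empty tenant values yields no Certificate at all, so lint and template runs stay clean without a tenant.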
Tower Deploy
0c17429d4c Registry as NodePort (30500) so kubelet can pull via host loopback while in-cluster pods push via cluster DNS 2026-04-27 00:56:47 +03:00
Tower Deploy
a1dbe14c20 Initial chart: odoosky-system namespace + local container registry (Distribution v2) 2026-04-27 00:47:07 +03:00