Deploy: K3s (production scale)

Pulumi-managed K3s deploy — one cluster runs every Kumiko-built app + the platform. Used in production at kumiko.so.

Layers

The cluster has three Pulumi stacks, layered:

| Stack | Lives in | What it provides |
| --- | --- | --- |
| platform | infra/pulumi/platform/ | Hetzner VMs, K3s install, WireGuard VPN |
| operators | infra/pulumi/operators/ | ingress-nginx, cert-manager, CloudNativePG, Velero |
| sites | infra/pulumi/sites/ | per-app Deployment + Service + Ingress + Certificate |

You bring up platform once, operators once, then add an entry to sites for each app.

Adding a new site

infra/pulumi/sites/index.ts uses a createStaticSite helper that wraps the Deployment + Service + Ingress + Certificate combo:

```ts
createStaticSite({
  name: "my-app",
  domain: "my-app.kumiko.so",
  image: "ghcr.io/your-org/my-app:latest",
  // ghcrPull only if the image is private
  ghcrPull: { username: cfg.requireSecret("ghcrUser"), password: cfg.requireSecret("ghcrToken") },
  k8sProvider,
});
```

Then run pulumi up from infra/pulumi/sites/.

The helper sets:

  • imagePullPolicy: Always — pulls fresh :latest on every pod restart
  • cert-manager.io/cluster-issuer: letsencrypt-prod annotation on the Ingress
  • nginx ingress class
  • DNS record for the domain (if Cloudflare provider is configured)
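As a sketch of what those settings amount to (assumed shapes, not the actual helper source), the Ingress that createStaticSite emits for the example app would look roughly like this plain object before Pulumi wraps it in a k8s.networking.v1.Ingress resource:

```typescript
// Hypothetical rendering of createStaticSite's Ingress output for "my-app".
// Service name and port are assumptions for illustration.
const ingressManifest = {
  metadata: {
    name: "my-app",
    annotations: {
      // Tells cert-manager to issue the TLS certificate via the prod issuer.
      "cert-manager.io/cluster-issuer": "letsencrypt-prod",
    },
  },
  spec: {
    ingressClassName: "nginx",
    rules: [
      {
        host: "my-app.kumiko.so",
        http: {
          paths: [
            {
              path: "/",
              pathType: "Prefix",
              backend: { service: { name: "my-app", port: { number: 80 } } },
            },
          ],
        },
      },
    ],
  },
};
```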

Rolling update on new image push

:latest tag + Always pull is half the picture. K8s won’t restart pods if the image tag string is unchanged. Two options:

  1. Annotation bump — write a unique value (e.g. SHA) to a pod-template annotation in the deployment. K8s sees the spec change → rolling restart.
  2. kubectl rollout restart deployment/my-app — explicit restart from CI after image push.

Both work. Option 1 keeps Pulumi as the source of truth; option 2 is one extra step in your build workflow.
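Option 1 can be sketched as a small helper (hypothetical, not part of createStaticSite) that turns the pushed image digest into a pod-template annotation. Because the annotation value changes on every push, the pod template changes, and Kubernetes performs a rolling restart even though the ":latest" tag string is identical:

```typescript
// Hypothetical helper for the annotation-bump pattern. The key name is
// arbitrary; what matters is that it lives under
// spec.template.metadata.annotations, so it is part of the pod template.
function rolloutAnnotations(imageSha: string): Record<string, string> {
  return {
    "kumiko.so/image-sha": imageSha,
  };
}

// In the Deployment:
//   spec.template.metadata.annotations = rolloutAnnotations(currentSha)
```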

Migrate before pod start

K3s deployments use init-containers for the pre-deploy migrate step:

```yaml
spec:
  template:
    spec:
      initContainers:
        - name: migrate
          image: ghcr.io/your-org/my-app:latest
          command: ["bun", "/app/kumiko.js", "migrate", "apply"]
          env:
            - name: DATABASE_URL
              valueFrom: { secretKeyRef: { name: my-app-db, key: url } }
```

The pod refuses to start if the migration fails. The boot gate inside the app provides a second safety net (SchemaDriftError if the schema still drifts from the journal).

Backups

The operators stack ships Velero (K8s resources + PVC snapshots) and CloudNativePG’s barman-cloud (continuous WAL archiving). Both back up to Hetzner Object Storage (S3-compatible).

Daily K8s backup at 03:00 UTC, 7-day retention. Postgres backups are continuous with 30-day retention.
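The daily K8s backup maps onto a Velero Schedule resource; a minimal sketch (resource name assumed, namespace assumed to be velero's default) of the schedule and retention described above:

```yaml
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: daily-cluster-backup   # assumed name
  namespace: velero
spec:
  schedule: "0 3 * * *"        # 03:00 UTC daily
  template:
    ttl: 168h                  # 7-day retention
```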

Multi-tenant on a single Postgres cluster

CloudNativePG creates one Postgres cluster (pg). Each app gets its own database in that cluster — Tenant-Provisioning creates databases on demand. See Architecture: Tenant-DB-Context for the per-tenant connection routing.
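Assuming Tenant-Provisioning uses CloudNativePG's declarative Database resource (available in recent CloudNativePG releases; names here are illustrative), a per-app database in the shared pg cluster would look roughly like:

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Database
metadata:
  name: my-app-db   # assumed name
spec:
  name: my_app      # database name inside the pg cluster
  owner: my_app     # assumed role
  cluster:
    name: pg
```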