Deploy: K3s (production scale)

Pulumi-managed K3s deploy — one cluster runs every Kumiko-built app + the platform. Used in production at kumiko.so.

Layers

The cluster has three Pulumi stacks, layered:

| Stack | Lives in | What it provides |
| --- | --- | --- |
| platform | infra/pulumi/platform/ | Hetzner VMs, K3s install, WireGuard VPN |
| operators | infra/pulumi/operators/ | ingress-nginx, cert-manager, CloudNativePG, Velero |
| sites | infra/pulumi/sites/ | per-app Deployment + Service + Ingress + Certificate |

You bring up platform once, operators once, then add an entry to sites for each app.

Adding a new site

infra/pulumi/sites/index.ts uses a createStaticSite helper that wraps the Deployment + Service + Ingress + Certificate combo:

```ts
createStaticSite({
  name: "my-app",
  domain: "my-app.kumiko.so",
  image: "ghcr.io/your-org/my-app:latest",
  // ghcrPull only if the image is private
  ghcrPull: { username: cfg.requireSecret("ghcrUser"), password: cfg.requireSecret("ghcrToken") },
  k8sProvider,
});
```

Then run pulumi up from infra/pulumi/sites/.

The helper sets:

  • imagePullPolicy: Always — pulls fresh :latest on every pod restart
  • cert-manager.io/cluster-issuer: letsencrypt-prod annotation on the Ingress
  • nginx ingress class
  • DNS record for the domain (if Cloudflare provider is configured)
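As a sketch of what those settings amount to (assumed shapes, not the actual helper source), the Ingress that createStaticSite emits for the example app would look roughly like this plain object before Pulumi wraps it in a k8s.networking.v1.Ingress resource:

```typescript
// Hypothetical rendering of createStaticSite's Ingress output for "my-app".
// Service name and port are assumptions for illustration.
const ingressManifest = {
  metadata: {
    name: "my-app",
    annotations: {
      // Tells cert-manager to issue the TLS certificate via the prod issuer.
      "cert-manager.io/cluster-issuer": "letsencrypt-prod",
    },
  },
  spec: {
    ingressClassName: "nginx",
    rules: [
      {
        host: "my-app.kumiko.so",
        http: {
          paths: [
            {
              path: "/",
              pathType: "Prefix",
              backend: { service: { name: "my-app", port: { number: 80 } } },
            },
          ],
        },
      },
    ],
  },
};
```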

Rolling update on new image push

:latest tag + Always pull is half the picture. K8s won’t restart pods if the image tag string is unchanged. Two options:

  1. Annotation bump — write a unique value (e.g. SHA) to a pod-template annotation in the deployment. K8s sees the spec change → rolling restart.
  2. kubectl rollout restart deployment/my-app — explicit restart from CI after image push.

Both work. Option 1 keeps Pulumi as the source of truth; option 2 is one extra step in your build workflow.
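Option 1 can be sketched as a small helper (hypothetical, not part of createStaticSite) that turns the pushed image digest into a pod-template annotation. Because the annotation value changes on every push, the pod template changes, and Kubernetes performs a rolling restart even though the ":latest" tag string is identical:

```typescript
// Hypothetical helper for the annotation-bump pattern. The key name is
// arbitrary; what matters is that it lives under
// spec.template.metadata.annotations, so it is part of the pod template.
function rolloutAnnotations(imageSha: string): Record<string, string> {
  return {
    "kumiko.so/image-sha": imageSha,
  };
}

// In the Deployment:
//   spec.template.metadata.annotations = rolloutAnnotations(currentSha)
```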

Migrate before pod start

K3s deployments use init-containers for the pre-deploy migrate step:

```yaml
spec:
  template:
    spec:
      initContainers:
        - name: migrate
          image: ghcr.io/your-org/my-app:latest
          command: ["bun", "/app/kumiko.js", "migrate", "apply"]
          env:
            - name: DATABASE_URL
              valueFrom: { secretKeyRef: { name: my-app-db, key: url } }
```

The pod refuses to start if the migration fails. The boot gate inside the app provides a second safety net (SchemaDriftError if the schema still drifts from the journal).

Backups

The operators stack ships Velero (K8s resources + PVC snapshots) and CloudNativePG’s barman-cloud (continuous WAL archiving). Both back up to Hetzner Object Storage (S3-compatible).

Daily K8s backup at 03:00 UTC, 7-day retention. Postgres backups are continuous with 30-day retention.
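The daily K8s backup maps onto a Velero Schedule resource; a minimal sketch (resource name assumed, namespace assumed to be velero's default) of the schedule and retention described above:

```yaml
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: daily-cluster-backup   # assumed name
  namespace: velero
spec:
  schedule: "0 3 * * *"        # 03:00 UTC daily
  template:
    ttl: 168h                  # 7-day retention
```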

Multi-tenant on a single Postgres cluster

CloudNativePG creates one Postgres cluster (pg). Each app gets its own database in that cluster — Tenant-Provisioning creates databases on demand. See Architecture: Tenant-DB-Context for the per-tenant connection routing.
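Assuming Tenant-Provisioning uses CloudNativePG's declarative Database resource (available in recent CloudNativePG releases; names here are illustrative), a per-app database in the shared pg cluster would look roughly like:

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Database
metadata:
  name: my-app-db   # assumed name
spec:
  name: my_app      # database name inside the pg cluster
  owner: my_app     # assumed role
  cluster:
    name: pg
```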