Note: This post was written with AI assistance (Claude). The opinions and experiences are entirely my own.
The first version of my homelab deployment process was a shell script that SSH'd into each server and ran docker pull && docker restart. When I migrated everything to K3s, I briefly used kubectl rollout restart over SSH triggered from CI. It worked, but it felt wrong — CI pushing credentials around, the cluster's state not matching any declarative source of truth, everything held together by a bash script nobody could interpret six months later.
The current setup has none of that. A push to main triggers a GitLab CI pipeline. The pipeline runs tests, builds a Docker image, and pushes it to the GitLab container registry. Keel — running inside the cluster — notices the new image and rolls it out automatically. I never run kubectl as part of a deploy.
## The Pipeline
GitLab CI is configured in .gitlab-ci.yml at the repo root. Each app owns a build.yml included via local: references, so the root config stays clean and each app manages its own jobs.
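The root file itself stays tiny. A sketch of what it might contain (the app names here are placeholders, not the repo's actual layout):

```yaml
# Sketch of the root .gitlab-ci.yml; app names are placeholders.
stages:
  - test
  - build

include:
  - local: apps/myapp/build.yml
  - local: apps/otherapp/build.yml
```

Adding an app means adding one `include:` line; everything else lives next to the app's code.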
The stages are test → build. A typical app:
```yaml
myapp:test:
  image: node:24-alpine
  stage: test
  before_script:
    - cd apps/myapp/app/
    - npm install
  script:
    - npm test
  rules:
    - changes:
        - apps/myapp/**/*

myapp:docker-main:
  image: docker:latest
  stage: build
  services:
    - docker:dind
  before_script:
    - cd apps/myapp/app/
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" $CI_REGISTRY
  script:
    - docker build --pull --build-arg BUILD_TIME="$CI_PIPELINE_CREATED_AT" -t "${CI_REGISTRY_IMAGE}/myapp:latest" .
    - docker push "${CI_REGISTRY_IMAGE}/myapp:latest"
  rules:
    - if: $CI_COMMIT_BRANCH == 'main'
      changes:
        - apps/myapp/**/*
```
The rules block means a job only runs if something in apps/myapp/**/* changed — a commit that only touches documentation doesn't trigger a Docker build. $CI_REGISTRY_USER, $CI_REGISTRY_PASSWORD, and $CI_REGISTRY are built-in GitLab CI variables; no manual secret setup for the registry.
The docker-dev job is identical except for the branch condition and the image tag:
```yaml
myapp:docker-dev:
  image: docker:latest
  stage: build
  services:
    - docker:dind
  before_script:
    - cd apps/myapp/app/
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" $CI_REGISTRY
  script:
    - docker build --pull --build-arg BUILD_TIME="$CI_PIPELINE_CREATED_AT" -t "${CI_REGISTRY_IMAGE}/myapp:dev-${CI_COMMIT_REF_SLUG}" .
    - docker push "${CI_REGISTRY_IMAGE}/myapp:dev-${CI_COMMIT_REF_SLUG}"
  rules:
    - if: $CI_COMMIT_BRANCH != 'main'
      changes:
        - apps/myapp/**/*
```
The staging branch pushes :dev-staging — $CI_COMMIT_REF_SLUG is the branch name URL-slugified by GitLab. The staging deployment's image tag is pinned to :dev-staging, so Keel picks it up automatically the same way it handles :latest in production.
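Since the two docker jobs differ only in tag and branch condition, they could also be collapsed into a hidden template job with `extends`. A hypothetical refactor, not the repo's actual config:

```yaml
# Hypothetical refactor: shared body for both docker jobs.
.myapp:docker-base:
  image: docker:latest
  stage: build
  services:
    - docker:dind
  before_script:
    - cd apps/myapp/app/
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" $CI_REGISTRY
  script:
    - docker build --pull --build-arg BUILD_TIME="$CI_PIPELINE_CREATED_AT" -t "${CI_REGISTRY_IMAGE}/myapp:${IMAGE_TAG}" .
    - docker push "${CI_REGISTRY_IMAGE}/myapp:${IMAGE_TAG}"

myapp:docker-main:
  extends: .myapp:docker-base
  variables:
    IMAGE_TAG: latest
  rules:
    - if: $CI_COMMIT_BRANCH == 'main'
      changes:
        - apps/myapp/**/*

myapp:docker-dev:
  extends: .myapp:docker-base
  variables:
    IMAGE_TAG: dev-${CI_COMMIT_REF_SLUG}
  rules:
    - if: $CI_COMMIT_BRANCH != 'main'
      changes:
        - apps/myapp/**/*
```

The tradeoff is the usual one with `extends`: less duplication, but each job's full behavior is no longer readable in one place.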
## Keel
Keel is a Kubernetes operator that watches image registries and automatically updates deployments when new images are pushed. It runs as a deployment inside the cluster and polls the GitLab registry every minute.
Four annotations on each deployment enable it:
```yaml
metadata:
  annotations:
    keel.sh/policy: force
    keel.sh/trigger: poll
    keel.sh/match-tag: "true"
    keel.sh/poll-schedule: "@every 1m"
```
policy: force means Keel updates the image even if the tag hasn't changed — :latest is always overwritten on push. match-tag: "true" means it only updates when the tag matches, so the production deployment (:latest) won't accidentally pick up :dev-staging pushes. poll-schedule: "@every 1m" overrides Keel's global default of @every 1h.
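One detail worth noting: the GitLab registry is private, and Keel authenticates its polling by reusing the `imagePullSecrets` already referenced in the pod spec. A sketch, with a hypothetical secret name:

```yaml
# Keel reuses the pod's imagePullSecrets to authenticate registry polling.
# "gitlab-registry" is a hypothetical kubernetes.io/dockerconfigjson secret.
spec:
  template:
    spec:
      imagePullSecrets:
        - name: gitlab-registry
      containers:
        - name: myapp
          image: registry.gitlab.com/cmunroe/ops/myapp:latest
```

No Keel-specific credential config is needed as long as the deployment can already pull the image.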
Here's what the two deployments look like side by side, cut down to the relevant parts:
```yaml
# production.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: myapp
  annotations:
    keel.sh/policy: force
    keel.sh/trigger: poll
    keel.sh/match-tag: "true"
    keel.sh/poll-schedule: "@every 1m"
spec:
  replicas: 2
  template:
    spec:
      containers:
        - name: myapp
          image: registry.gitlab.com/cmunroe/ops/myapp:latest
---
# staging.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-staging
  namespace: myapp
  annotations:
    keel.sh/policy: force
    keel.sh/trigger: poll
    keel.sh/match-tag: "true"
    keel.sh/poll-schedule: "@every 1m"
spec:
  replicas: 1
  template:
    spec:
      containers:
        - name: myapp
          image: registry.gitlab.com/cmunroe/ops/myapp:dev-staging
```
The image tags are the only meaningful difference. match-tag: "true" is what keeps them independent — when CI pushes :latest, Keel updates myapp only; when it pushes :dev-staging, it updates myapp-staging only. Without that annotation, both deployments would race to pull whatever the most recent push was.
End to end, from git push to Keel rolling out the new image, the whole cycle takes about two minutes.
## Branch → Environment Mapping
| Branch | Image tag | Picked up by |
|---|---|---|
| main | :latest | Production deployment |
| staging | :dev-staging | Staging deployment |
| Any other branch | :dev-&lt;slug&gt; | Nothing automatic |
The staging deployment is identical to production except for the image tag and replica count (1 instead of 2). Every non-trivial change goes through staging first — push to staging, verify on dev.myapp.com, merge to main.
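Serving dev.myapp.com implies a staging Service and Ingress alongside that deployment. A minimal sketch; the hostname, port, and selector labels are assumptions:

```yaml
# Hypothetical staging Service + Ingress; port and labels are assumptions.
apiVersion: v1
kind: Service
metadata:
  name: myapp-staging
  namespace: myapp
spec:
  selector:
    app: myapp-staging
  ports:
    - port: 80
      targetPort: 3000
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp-staging
  namespace: myapp
spec:
  rules:
    - host: dev.myapp.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myapp-staging
                port:
                  number: 80
```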
## What Flux Handles vs. What Keel Handles
Flux manages cluster state from the Git repository: it applies Kubernetes manifests, creates namespaces, ensures the right deployments and services exist. Keel handles the runtime image update loop — updating the image in an already-running deployment without requiring a manifest change.
The division is deliberate. If I change resource limits, replica count, or environment variables, that goes through a manifest change reconciled by Flux. If I push new application code, that goes through CI → registry → Keel. Neither system knows about the other.
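Flux's side of that split is itself declarative. A minimal Kustomization sketch; the names, path, and interval are assumptions, not the repo's actual config:

```yaml
# Hypothetical Flux Kustomization: apply everything under ./apps
# from the tracked Git repository, pruning resources removed from Git.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 10m
  path: ./apps
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
```

With `prune: true`, deleting a manifest from Git deletes the resource from the cluster, which is what keeps Git the single source of truth for everything except the image tag.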
Renovate rounds out the picture by opening merge requests when base Docker images or Helm chart versions have updates — keeping the non-application parts of the stack current without manual tracking.
## What's Missing
The main gap is automatic rollback. If a new image starts crash-looping, Keel doesn't revert — I have to do that manually. For a homelab where "broken" means "my blog is down for five minutes", that's an acceptable tradeoff. For something more critical it would matter.
The other gap: liveness and readiness probes. Without a readiness probe, Kubernetes counts a new pod as ready the moment its container starts, not once it's actually serving traffic, so a rollout can briefly route requests to a pod that isn't responding yet. Adding /api/health as a readiness probe to every app is on the list.
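That probe would look roughly like this; the container port and timing values are assumptions:

```yaml
# Hypothetical probes against /api/health; port 3000 is an assumption.
containers:
  - name: myapp
    image: registry.gitlab.com/cmunroe/ops/myapp:latest
    readinessProbe:
      httpGet:
        path: /api/health
        port: 3000
      initialDelaySeconds: 3
      periodSeconds: 10
    livenessProbe:
      httpGet:
        path: /api/health
        port: 3000
      periodSeconds: 30
```

As a side effect, this also narrows the missing-rollback gap: a crash-looping image never becomes ready, so under the default rolling-update settings the old pods keep serving while the new one fails its checks.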