Stop fighting bloat, codex-home drift, and per-project contamination — make each CI job and agent lane a fresh box that's thrown away when it's done.
The VDS runs everything on shared mutable host state — one home, one ~/.codex, lane worktrees that pile up, one AGENTS.md lineage. That shared state is exactly what drifts, bloats, and cross-contaminates. Docker replaces it with per-project images + ephemeral containers: each job runs in a clean, controlled box and is discarded on exit. Nothing accumulates; nothing drifts; a new repo is just another image.
Docker is already installed and running (ali in the docker group, 8 CPU / 31 GB / 70 G free) but has only ever run hello-world. This is adoption, not installation.
This is why Docker is the right tool, not just a disk fix:
~/.codex is shared, mutable, and re-discovers apps/plugins (the 1.64M→16k bloat you hit). In a container, a minimal, version-controlled CODEX_HOME is baked into the image and reset on every run. The minimal codex-home stops being something you maintain and becomes the default: drift is impossible because every container starts clean.

| Recurring problem | Today (shared host) | With Docker |
|---|---|---|
| Worktree / node_modules bloat | 71 G of leftover checkouts | deleted on exit — zero host growth |
| codex-home drift + plugin bloat | shared, re-discovers, grows | minimal CODEX_HOME baked in, reset per run |
| AGENTS.md bleed across projects | one shared lineage | per-image, per-repo, versioned |
| One experiment starving the box | unbounded | per-container CPU/mem limits |
| Adding a new repo/experiment | manual host setup + drift | drop a Dockerfile → isolated |
| "works on my box" | host-dependent | the image is the environment |
Build it once into an image; throw the run away every time.
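The build-once, discard-every-run pattern can be sketched as two Docker command lines. This is a minimal sketch, not the VDS's actual tooling: the image tag `vds-ci`, the path `/srv/repo`, the helper names, and the `pnpm test` job command are all hypothetical. The flags themselves (`docker build -t`, `docker run --rm`, `-v`, `-w`) are standard Docker CLI options.

```python
import shlex


def build_image_cmd(tag: str, context_dir: str) -> list[str]:
    """Command that builds the image once; later runs reuse it."""
    return ["docker", "build", "-t", tag, context_dir]


def ephemeral_run_cmd(tag: str, repo_dir: str, job: list[str]) -> list[str]:
    """Command that runs one job in a fresh container.

    --rm removes the container when the job exits, so anything the job
    writes outside the mounted checkout is discarded with it.
    """
    return [
        "docker", "run", "--rm",
        "-v", f"{repo_dir}:/work",  # mount the checkout into the container
        "-w", "/work",              # run the job from the checkout
        tag,
        *job,
    ]


if __name__ == "__main__":
    # Hypothetical tag, path, and job command, for illustration only.
    print(shlex.join(build_image_cmd("vds-ci", ".")))
    print(shlex.join(ephemeral_run_cmd("vds-ci", "/srv/repo", ["pnpm", "test"])))
```

Note that the bind mount is the one place where a run can still write to the host; copying the checkout into the container instead would avoid even that, at the cost of a slower start.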
Phase 0: Docker is installed and running, and ali is in the docker group. Nothing to do.
Phase 1: build a base image and a ci image; the runner runs each job in a fresh container. Free, ephemeral, and reproducible. ~½ day.
Phase 2: build a codex-lane image with a minimal CODEX_HOME baked in; dispatch lanes via docker run. Kills codex-home/AGENTS.md/worktree bloat at the source. ~1 day.
Phase 3: add a run-lane <repo> helper, resource limits, and a docker system prune cron. ~½ day.
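A run-lane helper with resource limits might look roughly like the following. This is a sketch under stated assumptions: the `codex-lane-<repo>` image naming scheme, the `/opt/codex-home` path, and the default limits of 2 CPUs and 4 GB are hypothetical and not taken from the plan. `--cpus`, `--memory`, and `-e` are standard `docker run` options.

```python
import shlex
import sys


def run_lane_cmd(repo: str, cpus: float = 2.0, memory: str = "4g") -> list[str]:
    """Build the docker run command for one lane of the given repo."""
    image = f"codex-lane-{repo}"  # hypothetical per-repo image naming scheme
    return [
        "docker", "run", "--rm",
        "--cpus", str(cpus),   # cap CPU so one lane cannot starve the box
        "--memory", memory,    # cap RAM for the same reason
        "-e", "CODEX_HOME=/opt/codex-home",  # assumed path baked into the image
        image,
    ]


if __name__ == "__main__":
    # Usage: python run_lane.py <repo>; prints the command instead of running it.
    repo = sys.argv[1] if len(sys.argv) > 1 else "example-repo"
    print(shlex.join(run_lane_cmd(repo)))
```

Printing the command rather than executing it keeps the helper easy to inspect; passing the same list to `subprocess.run` would perform the actual dispatch.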
A docker system prune cron keeps Docker's disk use bounded and predictable, unlike the unbounded worktree growth it replaces.
No pnpm install from scratch, since image layers cache deps.
Phase 1 first: containerized CI is the biggest bloat and reproducibility win, carries the lowest risk, and builds on the runner that is already live. Then Phase 2, to retire codex-home/AGENTS.md/worktree bloat for good. None of it is urgent: free CI, the janitor, and the 85% alert already contain the immediate problem. Docker is the durable, multi-project upgrade.
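For the docker system prune cron mentioned in the plan, a crontab entry could be generated as below. The nightly 03:00 schedule and the 24-hour retention window are assumptions chosen for illustration; `-f` and `--filter "until=..."` are existing options of `docker system prune`.

```python
def prune_cron_line(schedule: str = "0 3 * * *", keep: str = "24h") -> str:
    """Return a crontab entry that prunes unused Docker data.

    -f skips the confirmation prompt; --filter until=<age> limits pruning
    to objects created more than <age> ago.
    """
    prune = f'docker system prune -f --filter "until={keep}"'
    return f"{schedule} {prune}"


if __name__ == "__main__":
    # Prints a line suitable for pasting into `crontab -e`.
    print(prune_cron_line())
```

Without `-a`, the prune only removes stopped containers, unused networks, dangling images, and build cache, so tagged base images survive between runs.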