Docker Auto-Updates
1. What? — Definition and context
The Docker Auto-Updates workflow automates Docker image updates on the VPS. DIUN detects new versions, the N8N hub classifies each image (critical / base / app), and a sub-workflow runs the full cycle: backup, pull/build, health check, rollback if needed, notification.
Refactored architecture (#275)
| Workflow | ID | Nodes | Role |
|---|---|---|---|
| Docker DIUN (parent) | WdSepRkceMzI0QQ5 | 36 | Triggers, per-category routing, classification |
| DIUN Update Executor (SW-1) | VSFJv2DbdkvQ2cl1 | 25 | Shared lifecycle: backup, update, health check, rollback, notify |
| DIUN Queue Processor (SW-2) | Eb3QIk8DnEQIXeSl | 6 | 03h-05h queue processing loop |
| DIUN Approval Handler (SW-3) | JLsR7JSMAQDwzqEX | 20 | Classification, approval, rejection, deferred Telegram report |
Before #275, this workflow was a single monolith of 101 nodes. Splitting it into a hub plus 3 sub-workflows lets each cycle be tested in isolation and lets the DIUN Update Executor be reused from other triggers (manual rebuild, file provider).
Architecture diagram
Image categories
| Category | Behaviour | Examples |
|---|---|---|
| critical | Immediate update + health check | caddy, crowdsec, security-stack |
| base | Queued for the 03h-05h window | postgres, redis, prometheus |
| app | Admin approval required via Telegram | n8n, odoo, grafana, ai-stack |
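As a rough sketch, the per-category routing in the table above could be expressed as a single dispatch function. The action names (`update_now`, `enqueue`, `request_approval`) are hypothetical labels chosen for illustration, not identifiers taken from the N8N workflows.

```shell
# Map an image category to the action the hub would take.
# The action names are hypothetical labels, not workflow identifiers.
route_for_category() {
  case "$1" in
    critical) echo "update_now" ;;        # immediate update + health check
    base)     echo "enqueue" ;;           # wait for the 03h-05h window
    app)      echo "request_approval" ;;  # admin approval via Telegram
    *)        echo "unknown category: $1" >&2; return 1 ;;
  esac
}
```

Rejecting unknown categories explicitly avoids silently updating an image whose policy row is missing or misspelled.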
2. Why? — Stakes and motivations
Problems solved
| Problem | Without this workflow | With this workflow |
|---|---|---|
| Outdated images | Unpatched vulnerabilities | Policy-driven automatic updates |
| User downtime | Updates during business hours | Nightly maintenance window |
| Risky updates | No prior validation | Admin approval for app-category images |
| Manual rollback | Late intervention | Health check + automatic rollback |
| No version visibility | “Something changed” | Before/after version tracking in the notification |
Why three categories?
| Category | Justification |
|---|---|
| critical | Security priority, immediate update (Caddy = entry point) |
| base | Stable infrastructure, nightly update to minimise impact |
| app | Business-critical, human validation required |
Why a shared Update Executor sub-workflow?
The update cycle (backup → pull/build → up → health check → rollback / notify) is identical regardless of trigger. Extracting it into SW-1 enables:
| Benefit | Detail |
|---|---|
| Reuse | Same guarantees for DIUN, file provider, manual rebuild |
| Isolated tests | Mockable independently from the trigger |
| Maintenance | Single place to change the health-check strategy |
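A minimal sketch of that shared lifecycle, assuming each step is a shell function supplied by the caller. The hook names (`do_backup`, `do_update`, `do_health_check`, `do_rollback`, `do_notify`) are hypothetical, and skipping the update when the backup fails is an assumption rather than documented behaviour.

```shell
# Run one update cycle using caller-supplied hooks (all hook names are hypothetical).
# Returns 0 when the update is kept, 1 when it was skipped or rolled back.
run_update_cycle() {
  # Assumption: a failed backup aborts the cycle before anything is changed.
  do_backup || { do_notify "backup failed, update skipped"; return 1; }
  # do_update stands for pull/build followed by up -d.
  if do_update && do_health_check; then
    do_notify "update succeeded"
    return 0
  fi
  do_rollback
  do_notify "update failed, rolled back"
  return 1
}
```

Because the triggers only differ in how they call this function, swapping the health-check strategy means replacing a single hook.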
3. How? — Technical implementation
Data Tables
image_policies — Per-image policy
| Column | Type | Description |
|---|---|---|
| image_key | Text | Unique key: project/service |
| category | Text | critical / base / app |
| backup | Boolean | Backup required before update |
| custom_build | Boolean | Custom image (build vs pull) |
| github_repo | Text | owner/repo for changelog |
| compose_dir | Text | Absolute path of the docker-compose |
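To illustrate how a policy row might translate into the commands listed under "Generated Docker commands", here is a small sketch. The function name is hypothetical, and it assumes the compose file is always named `docker-compose.yaml` inside `compose_dir`, as in this page's examples.

```shell
# Print the compose commands for one policy row.
# $1 = compose_dir (absolute path), $2 = custom_build ("true" or "false").
# Assumes the file is named docker-compose.yaml and the path has no spaces.
compose_commands() (
  file="$1/docker-compose.yaml"
  if [ "$2" = "true" ]; then
    # Custom image: pull what can be pulled, then rebuild from scratch.
    echo "docker compose -f $file pull --ignore-buildable"
    echo "docker compose -f $file build --no-cache"
  else
    echo "docker compose -f $file pull"
  fi
  echo "docker compose -f $file up -d"
)
```

Printing the commands instead of running them keeps the sketch side-effect free and makes the generated sequence easy to log.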
pending_updates — 03h-05h queue
| Column | Type | Description |
|---|---|---|
| image_key | Text | Unique key |
| image | Text | Full image name |
| project | Text | Project name |
| service | Text | Service name |
| custom_build | Boolean | Build instead of pull |
| created_at | Text | ISO timestamp |
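The queue is only drained during the 03h-05h window. A minimal sketch of such a window check follows; treating the window as 03:00 inclusive to 05:00 exclusive, and using the server's local time, are both assumptions.

```shell
# Return 0 when the given hour falls inside the 03h-05h window.
# Assumption: 03:00 inclusive to 05:00 exclusive, so hours 3 and 4 qualify.
in_maintenance_window() {
  case "$1" in
    03|04|3|4) return 0 ;;
    *)         return 1 ;;
  esac
}

# Usage (hypothetical): in_maintenance_window "$(date +%H)" && echo "process queue"
```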
pending_updates_approvals — Pending approvals
App-category updates waiting for an admin's answer on Telegram. Entries have a TTL of 7 days; once it has elapsed, the approval is automatically re-requested at the next scan.
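The 7-day expiry could be checked with simple epoch arithmetic. This is only a sketch: the function name is hypothetical, and storing timestamps as Unix epoch seconds is an assumption, since the column layout of this table is not documented here.

```shell
# Return 0 when an approval is older than the 7-day TTL and should be re-requested.
# $1 = creation time, $2 = current time, both as Unix epoch seconds (assumed format).
approval_expired() (
  ttl_seconds=$((7 * 24 * 60 * 60))
  [ $(( $2 - $1 )) -gt "$ttl_seconds" ]
)

# Usage (hypothetical): approval_expired "$created_epoch" "$(date +%s)" && echo "re-request"
```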
Generated Docker commands

    # Standard image
    docker compose -f /path/to/stack/docker-compose.yaml pull
    docker compose -f /path/to/stack/docker-compose.yaml up -d

    # Custom image (custom_build = true)
    docker compose -f /path/to/stack/docker-compose.yaml pull --ignore-buildable
    docker compose -f /path/to/stack/docker-compose.yaml build --no-cache
    docker compose -f /path/to/stack/docker-compose.yaml up -d

Approval flow (app images)
Post-update health check
After every update, SW-1 gives the containers 2 minutes (120 s) to report healthy:

    docker compose -f /path/to/stack/docker-compose.yaml ps --format json

If a container is still unhealthy after 120 s, a rollback to <previous_digest> is triggered: the previous image is pulled again, then the stack is restarted with up -d. A critical notification is sent to the Hub with the diagnostic details.
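The wait-then-rollback behaviour can be sketched as a polling loop with a deadline. The probe is a hypothetical command that exits 0 once every container reports healthy (for example a wrapper around the `ps --format json` call); the function name and the polling interval are likewise assumptions.

```shell
# Poll a probe until it succeeds or the timeout elapses.
# $1 = timeout in seconds, $2 = interval in seconds, remaining args = probe command.
# The probe is hypothetical: it must exit 0 once all containers are healthy.
wait_until_healthy() {
  _wh_timeout="$1"; _wh_interval="$2"; shift 2
  _wh_elapsed=0
  while [ "$_wh_elapsed" -lt "$_wh_timeout" ]; do
    if "$@"; then return 0; fi
    sleep "$_wh_interval"
    _wh_elapsed=$((_wh_elapsed + _wh_interval))
  done
  "$@"  # final check at the deadline; its exit status is the result
}

# Usage (hypothetical names): wait_until_healthy 120 10 stack_is_healthy || rollback_stack
```

Returning as soon as the probe succeeds keeps healthy updates fast, while the full 120 s is only spent on stacks that are about to be rolled back.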
Self-update Docker (n8n-stack)
Updating N8N itself is tricky: the update workflow runs in the container it updates. The self-restart pattern in use:
| Step | Action |
|---|---|
| 1 | Insert maintenance flag in error_handling_config (suppress notifications) |
| 2 | Launch an external script via nohup that waits 10 s, then runs docker compose up -d --force-recreate |
| 3 | The N8N workflow stops (the container restarts) |
| 4 | On restart, a cleanup workflow removes the maintenance flag and notifies success |
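Step 2 could be sketched as below. The function only prints the command line, so it can be reviewed or logged before being launched; the function name is hypothetical and the path is the placeholder used elsewhere on this page.

```shell
# Print the delayed recreate command for a stack.
# $1 = compose file path (assumed to contain no spaces), $2 = delay in seconds.
self_restart_command() {
  echo "sleep $2 && docker compose -f $1 up -d --force-recreate"
}

# Usage on the host (not executed here): detach the command so it outlives
# the workflow that started it.
#   nohup sh -c "$(self_restart_command /path/to/stack/docker-compose.yaml 10)" >/dev/null 2>&1 &
```

The delay gives the calling workflow time to finish its current step before the container is recreated.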
Version tracking
Every update captures before/after versions for traceability. The format uses SSH markers to extract versions from the Dockerfile or image:tag:

    n8n-custom:latest 2.4.8 → 2.5.0
    caddy-crowdsec:latest 2.8.4 → 2.8.5

Telegram notifications display the version transition, so the nature of the change (patch, minor, major) is visible at a glance.
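The patch / minor / major distinction could also be computed rather than read by eye. A minimal sketch, assuming plain MAJOR.MINOR.PATCH strings with no prefix or suffix; the function name is hypothetical.

```shell
# Classify a version change as major, minor, patch, or none.
# Assumes plain MAJOR.MINOR.PATCH strings (no "v" prefix, no pre-release suffix).
bump_kind() (
  old_major="${1%%.*}"; new_major="${2%%.*}"
  old_rest="${1#*.}";   new_rest="${2#*.}"
  old_minor="${old_rest%%.*}"; new_minor="${new_rest%%.*}"
  if   [ "$old_major" != "$new_major" ]; then echo major
  elif [ "$old_minor" != "$new_minor" ]; then echo minor
  elif [ "$1" != "$2" ]; then echo patch
  else echo none
  fi
)

# Usage: bump_kind 2.4.8 2.5.0   prints "minor"
```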
File Provider Auto-Rebuild
When a base image changes (node:20-slim, caddy:builder…), custom images that depend on it must be rebuilt. DIUN watches those images via its file provider (reading base-images.yml), and the workflow detects the dependency.
| Base image | Custom service | Category |
|---|---|---|
| caddy:builder, caddy:latest | security-stack/caddy | critical (immediate rebuild) |
| n8nio/n8n:latest | n8n-stack/n8n | app (approval required) |
| node:20-slim, ghcr.io/astral-sh/uv | ai-stack/cli-ollama | app (approval required) |
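The dependency lookup can be pictured as a simple mapping from base image to custom service. The sketch below hard-codes the rows of the table above for illustration; the function name is hypothetical, and a real setup would presumably read the mapping from configuration.

```shell
# Print "service category" for the custom service that depends on a base image.
# The mapping mirrors the table above and is hard-coded only for illustration.
dependents_of() {
  case "$1" in
    caddy:builder|caddy:latest)
      echo "security-stack/caddy critical" ;;
    n8nio/n8n:latest)
      echo "n8n-stack/n8n app" ;;
    node:20-slim|ghcr.io/astral-sh/uv|ghcr.io/astral-sh/uv:*)
      echo "ai-stack/cli-ollama app" ;;
    *)
      return 1 ;;  # no custom image depends on this base image
  esac
}
```

The returned category then decides whether the rebuild runs immediately or waits for approval.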
Incident Response (AI-Assisted)
When Prometheus Alertmanager signals a problem (container down, CPU spike, disk full), an Incident Response workflow triggers a Claude diagnostic:
| Severity | Behaviour |
|---|---|
| TRIVIAL | Auto-remediation (restart) + health check + notification |
| MODERATE | Claude proposes and applies the fix, then monitors for 5 min |
| COMPLEX | Claude produces a plan, waits for human approval (30-min timeout) |
For COMPLEX cases, Claude produces a detailed plan sent over Telegram with [Execute] [Edit] [Ignore] buttons. This is the same Plan Engine pattern as in the Conversational system.
Useful commands

    # Force a DIUN check
    docker exec diun diun --test

    # DIUN logs
    docker logs diun --tail 50

    # View configured policies
    # N8N → Data → Tables → image_policies (26 current entries)

    # View pending updates
    # N8N → Data → Tables → pending_updates

4. What if? — Outlook and limits
Current limits
| Limit | Impact | Mitigation |
|---|---|---|
| No automated tests | Risk of undetected regressions | Basic health check only |
| Manual rollback may be needed | If the automatic rollback fails, manual intervention is required | Immediate notification with context |
| DIUN dependency | No detection if DIUN is down | Container monitoring via Health Check |
| No canary | Direct update, no progressive rollout | Acceptable on a single-VPS setup |
Evolution scenarios
If an update causes a regression:
- Health check detects the problem
- Automatic rollback to the previous image
- Notification with details for investigation
If updates are too frequent:
- Tune the DIUN polling interval (currently every 6h)
- Create a “stable” category with monthly updates
- Filter by semantic versioning (major only)
If deeper testing is needed:
- Post-update smoke tests via the monitoring-stack
- Observation delay before success notification
- Dedicated staging environment for prior tests (extra VPS cost)
If a tighter CVE coupling is wanted:
- Reject an update that would introduce a new unpatched CRITICAL CVE
- Use the Security CVE Watch feed as gating
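Such a gate could compare the CRITICAL findings before and after the candidate update. This is only a sketch of the idea: the function names are hypothetical, and the one-CVE-ID-per-line file format is an assumed export, not the actual output of Security CVE Watch.

```shell
# Print CVE IDs present after the update but not before.
# $1 = file with CRITICAL CVE IDs before, $2 = file with IDs after
# (one ID per line; this file format is an assumption).
new_critical_cves() {
  grep -Fxv -f "$1" "$2" || true  # grep exits 1 when nothing is new
}

# Return 0 when the update introduces no new CRITICAL CVE.
update_allowed() {
  [ -z "$(new_critical_cves "$1" "$2")" ]
}
```

Comparing against the current state, rather than rejecting any CRITICAL finding, avoids blocking updates for vulnerabilities the running image already has.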
Related pages
Infrastructure
- VPS Architecture — Big picture
- Notify Stack — DIUN that triggers updates
- Security Stack — Caddy protected against accidental restart
Workflows
- Telegram Orchestrator — Receives approvals
- Notification Hub — Success/failure routing
- Security CVE Watch — CVE scan coupled to pre-update gating
- Error Handler — DIUN error capture
Reference
- Glossary — DIUN, Health Check, Self-restart, File Provider