Docker Auto-Updates

The Docker Auto-Updates workflow automates Docker image updates on the VPS. DIUN detects new versions, the N8N hub classifies each image (critical / base / app), and a sub-workflow runs the full cycle: backup, pull/build, health check, rollback if needed, notification.

| Workflow | ID | Nodes | Role |
| --- | --- | --- | --- |
| Docker DIUN (parent) | WdSepRkceMzI0QQ5 | 36 | Triggers, per-category routing, classification |
| DIUN Update Executor (SW-1) | VSFJv2DbdkvQ2cl1 | 25 | Shared lifecycle: backup, update, health check, rollback, notify |
| DIUN Queue Processor (SW-2) | Eb3QIk8DnEQIXeSl | 6 | 03h-05h queue processing loop |
| DIUN Approval Handler (SW-3) | JLsR7JSMAQDwzqEX | 20 | Classification, approval, rejection, deferred Telegram report |

Before #275, this workflow had 101 monolithic nodes. Splitting into hub + 3 sub-workflows lets each cycle be tested in isolation and lets the DIUN Update Executor be reused from other triggers (manual rebuild, file provider).

[Diagram: workflow architecture]

  • Triggers: DIUN webhook (6h scan) · File Provider (base images) · Cron 03h (queue processor) · Manual rebuild
  • Docker DIUN (36 nodes): Webhook → Lookup image_policies → Classify (critical / base / app)
  • SW-3 Approval Handler (20 nodes): Notification Hub (approval) → Wait for callback (approve / reject / later) → Switch decision
  • SW-1 Update Executor (25 nodes): Backup if needed → Pull or Build → compose up -d → Health check 2 min → Rollback if KO → Notification Hub (success / failure)
  • SW-2 Queue Processor (6 nodes): Loop pending_updates → Call Update Executor
  • Edge labels: critical, base, app, approved, rejected/later, 03h-05h; queued updates go through Insert pending_updates

| Category | Behaviour | Examples |
| --- | --- | --- |
| critical | Immediate update + health check | caddy, crowdsec, security-stack |
| base | Queued for the 03h-05h window | postgres, redis, prometheus |
| app | Admin approval required via Telegram | n8n, odoo, grafana, ai-stack |
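
The routing in the table above can be sketched as a lookup on the policy category. This is a minimal illustration, not the N8N implementation: the function name `route_update`, the action labels, and the fallback to approval for images without a policy are all assumptions.

```python
from enum import Enum


class Category(str, Enum):
    CRITICAL = "critical"
    BASE = "base"
    APP = "app"


# Hypothetical action labels; the real workflow uses N8N nodes, not strings.
ROUTES = {
    Category.CRITICAL: "update_now",      # immediate update + health check
    Category.BASE: "queue_for_window",    # waits for the 03h-05h window
    Category.APP: "request_approval",     # admin approval via Telegram
}


def route_update(image_key: str, policies: dict) -> str:
    """Return the action for an image, given image_key -> category policies.

    Assumption: an image with no (or an unknown) policy falls back to the
    most conservative path, approval. The source does not specify this.
    """
    raw = policies.get(image_key)
    try:
        category = Category(raw)
    except ValueError:
        return ROUTES[Category.APP]
    return ROUTES[category]
```

With a policy map such as `{"security-stack/caddy": "critical"}`, `route_update("security-stack/caddy", policies)` returns `"update_now"`.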

| Problem | Without this workflow | With this workflow |
| --- | --- | --- |
| Outdated images | Unpatched vulnerabilities | Policy-driven automatic updates |
| User downtime | Updates during business hours | Nightly maintenance window |
| Risky updates | No prior validation | Approval for business-critical apps |
| Manual rollback | Late intervention | Health check + automatic rollback |
| No version visibility | "Something changed" | Before/after version tracking in the notification |

| Category | Justification |
| --- | --- |
| critical | Security priority, immediate update (Caddy = entry point) |
| base | Stable infrastructure, nightly update to minimise impact |
| app | Business-critical, human validation required |
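
The nightly window used for `base` images can be checked with a simple time comparison. A minimal sketch, assuming the window is evaluated in the server's local time and that 05h itself is excluded; `in_maintenance_window` is a hypothetical helper name.

```python
from datetime import datetime, time

WINDOW_START = time(3, 0)  # 03h, from the text
WINDOW_END = time(5, 0)    # 05h, from the text (treated as exclusive here)


def in_maintenance_window(now: datetime) -> bool:
    """True when `now` falls inside the nightly 03h-05h window."""
    return WINDOW_START <= now.time() < WINDOW_END
```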

Why a shared Update Executor sub-workflow?

The update cycle (backup → pull/build → up → health check → rollback / notify) is identical regardless of trigger. Extracting it into SW-1 enables:

| Benefit | Detail |
| --- | --- |
| Reuse | Same guarantees for DIUN, file provider, manual rebuild |
| Isolated tests | Mockable independently from the trigger |
| Maintenance | Single place to change the health-check strategy |

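image_policies

| Column | Type | Description |
| --- | --- | --- |
| image_key | Text | Unique key: project/service |
| category | Text | critical / base / app |
| backup | Boolean | Backup required before update |
| custom_build | Boolean | Custom image (build vs pull) |
| github_repo | Text | owner/repo for changelog |
| compose_dir | Text | Absolute path of the directory holding the docker-compose file |

pending_updates

| Column | Type | Description |
| --- | --- | --- |
| image_key | Text | Unique key |
| image | Text | Full image name |
| project | Text | Project name |
| service | Text | Service name |
| custom_build | Boolean | Build instead of pull |
| created_at | Text | ISO timestamp |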

pending_updates_approvals — Pending approvals

This table holds approvals for app images that are still waiting for a Telegram answer. Entries have a 7-day TTL; once an entry expires, approval is requested again at the next scan.
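
The 7-day TTL can be sketched as a timestamp comparison. This assumes the approvals table stores a `created_at` ISO 8601 string with an explicit UTC offset, as `pending_updates` does; neither that column nor the helper name `is_expired` is confirmed by the source.

```python
from datetime import datetime, timedelta, timezone

APPROVAL_TTL = timedelta(days=7)  # "TTL 7d" from the text


def is_expired(created_at: str, now: datetime) -> bool:
    """True when an approval request is older than the TTL.

    Assumes `created_at` looks like "2024-01-01T00:00:00+00:00"
    (ISO 8601 with an offset) and that `now` is timezone-aware.
    """
    created = datetime.fromisoformat(created_at)
    return now - created > APPROVAL_TTL
```

An expired row would then be dropped so that the next DIUN scan triggers a fresh approval request.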

```bash
# Standard image
docker compose -f /path/to/stack/docker-compose.yaml pull
docker compose -f /path/to/stack/docker-compose.yaml up -d

# Custom image (custom_build = true)
docker compose -f /path/to/stack/docker-compose.yaml pull --ignore-buildable
docker compose -f /path/to/stack/docker-compose.yaml build --no-cache
docker compose -f /path/to/stack/docker-compose.yaml up -d
```
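
The choice between the two command sequences depends only on the `custom_build` flag. A minimal sketch of that branching, assuming the commands are assembled as argument lists before being executed; `update_commands` is a hypothetical helper, not part of the workflow.

```python
def update_commands(compose_file: str, custom_build: bool) -> list:
    """Build the docker compose commands for one update, in execution order."""
    base = ["docker", "compose", "-f", compose_file]
    if custom_build:
        return [
            base + ["pull", "--ignore-buildable"],  # refresh non-built services only
            base + ["build", "--no-cache"],         # rebuild the custom image
            base + ["up", "-d"],
        ]
    return [base + ["pull"], base + ["up", "-d"]]
```

Each list can be passed to `subprocess.run(cmd, check=True)` so that a failing step stops the sequence before `up -d`.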

[Diagram: approval flow] App image detected → Notification Hub (approval_request) → Inline buttons (Approve / Reject / Later) → NH Callback Handler → SW-3 Approval Handler → Switch decision

  • approved → SW-1 Update Executor
  • rejected → Notify · Skipped
  • later → pending_updates → Queue 03h-05h

After every update, SW-1 verifies for 2 minutes that containers are healthy:

```bash
docker compose -f /path/to/stack/docker-compose.yaml ps --format json
```

If a container is still unhealthy after 120 s, a rollback is triggered: the previous image is restored from its recorded digest, then the stack is restarted with docker compose up -d. A critical notification with the diagnostic details is sent to the Notification Hub.
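
The health check can be sketched as a polling loop over the `ps --format json` output. This is an illustration under assumptions, not SW-1's actual logic: the 5 s polling interval, the rule that a running container without a health check counts as healthy, and all function names are hypothetical. Depending on the Compose version, the JSON output is either one array or one object per line, so the parser accepts both.

```python
import json
import time
from typing import Callable


def parse_ps_output(raw: str) -> list:
    """Parse `docker compose ps --format json` (JSON array or one object per line)."""
    raw = raw.strip()
    if not raw:
        return []
    if raw.startswith("["):
        return json.loads(raw)
    return [json.loads(line) for line in raw.splitlines() if line.strip()]


def is_healthy(container: dict) -> bool:
    """Running, and either reported healthy or without a health check (assumption)."""
    return (container.get("State") == "running"
            and container.get("Health", "") in ("", "healthy"))


def wait_until_healthy(fetch: Callable[[], str],
                       timeout_s: float = 120,
                       interval_s: float = 5,
                       sleep: Callable[[float], None] = time.sleep,
                       clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll until every container is healthy; False means a rollback is needed."""
    deadline = clock() + timeout_s
    while True:
        containers = parse_ps_output(fetch())
        # An empty list is treated as a failure: nothing is running.
        if containers and all(is_healthy(c) for c in containers):
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

In practice `fetch` would wrap `subprocess.run([...], capture_output=True, text=True).stdout`; injecting it, along with `sleep` and `clock`, keeps the loop testable without Docker.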

Updating N8N itself is tricky: the update workflow runs in the container it updates. The self-restart pattern in use:

| Step | Action |
| --- | --- |
| 1 | Insert a maintenance flag in error_handling_config (suppresses notifications) |
| 2 | Trigger an external script via nohup that waits 10 s, then runs docker compose up -d --force-recreate |
| 3 | The N8N workflow stops (the container restarts) |
| 4 | On restart, a cleanup workflow removes the maintenance flag and notifies success |
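
Step 2 relies on a command that outlives the workflow that launched it. The sketch below only builds such a command string; it assumes the command is executed on the host (for example over SSH) rather than inside the N8N container, and the helper name and output redirection are illustrative choices.

```python
import shlex


def self_restart_command(compose_file: str, delay_s: int = 10) -> str:
    """Build a detached shell command: wait, then recreate the stack.

    nohup plus the trailing `&` let the command survive the end of the
    session that started it, so the restart still happens after the
    workflow (and its container) has stopped.
    """
    inner = (
        f"sleep {int(delay_s)} && "
        f"docker compose -f {shlex.quote(compose_file)} up -d --force-recreate"
    )
    return f"nohup sh -c {shlex.quote(inner)} >/dev/null 2>&1 &"
```

The 10 s delay gives the workflow time to finish its last node before the container is recreated.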

Every update captures the before and after versions for traceability. Versions are read from the Dockerfile or the image:tag, using markers in the SSH command output, and reported in this format:

n8n-custom:latest 2.4.8 → 2.5.0
caddy-crowdsec:latest 2.8.4 → 2.8.5

Telegram notifications display the version transition, so the nature of the change (patch, minor, major) is visible at a glance.
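
Deriving the nature of the change from a version transition can be sketched as follows. The sketch assumes plain numeric MAJOR.MINOR.PATCH versions (an optional leading "v" is stripped; pre-release suffixes are not handled), and the function names are hypothetical.

```python
def parse_version(version: str) -> tuple:
    """Split "2.4.8" into (2, 4, 8); missing parts default to 0."""
    parts = version.strip().lstrip("v").split(".")
    major, minor, patch = (int(p) for p in (parts + ["0", "0"])[:3])
    return major, minor, patch


def change_kind(before: str, after: str) -> str:
    """Classify a version transition as major, minor, patch, or none."""
    b, a = parse_version(before), parse_version(after)
    if a == b:
        return "none"
    if a[0] != b[0]:
        return "major"
    if a[1] != b[1]:
        return "minor"
    return "patch"
```

For the two transitions shown above, `change_kind("2.4.8", "2.5.0")` gives `"minor"` and `change_kind("2.8.4", "2.8.5")` gives `"patch"`.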

When a base image changes (node:20-slim, caddy:builder…), custom images that depend on it must be rebuilt. DIUN watches those images via its file provider (reading base-images.yml), and the workflow detects the dependency.

| Base image | Custom service | Category |
| --- | --- | --- |
| caddy:builder, caddy:latest | security-stack/caddy | critical (immediate rebuild) |
| n8nio/n8n:latest | n8n-stack/n8n | app (approval required) |
| node:20-slim, ghcr.io/astral-sh/uv | ai-stack/cli-ollama | app (approval required) |
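
The dependency detection amounts to a reverse lookup from base image to custom service. A minimal sketch using the rows of the table above; storing the mapping as an in-code dictionary and the name `services_to_rebuild` are assumptions, since the source does not say where the workflow keeps this mapping.

```python
# Base image -> custom services that must be rebuilt when it changes.
# Entries are taken from the table above.
BASE_IMAGE_DEPENDENTS = {
    "caddy:builder": ["security-stack/caddy"],
    "caddy:latest": ["security-stack/caddy"],
    "n8nio/n8n:latest": ["n8n-stack/n8n"],
    "node:20-slim": ["ai-stack/cli-ollama"],
    "ghcr.io/astral-sh/uv": ["ai-stack/cli-ollama"],
}


def services_to_rebuild(changed_base_image: str) -> list:
    """Return the custom services depending on a changed base image."""
    return list(BASE_IMAGE_DEPENDENTS.get(changed_base_image, []))
```

Each returned service key can then be looked up in image_policies to decide between an immediate rebuild and an approval request.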

When Prometheus Alertmanager signals a problem (container down, CPU spike, disk full), an Incident Response workflow triggers a Claude diagnostic:

| Severity | Behaviour |
| --- | --- |
| TRIVIAL | Auto-remediation (restart) + health check + notification |
| MODERATE | Claude proposes + applies the fix, monitoring for 5 min |
| COMPLEX | Claude produces a plan, waits for human approval (30-min timeout) |

For COMPLEX cases, Claude produces a detailed plan sent over Telegram with [Execute] [Edit] [Ignore] buttons. Same Plan Engine pattern as the Conversational system.
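
The three severity levels can be expressed as data rather than branching logic. A minimal sketch, assuming the durations from the table are the only parameters; the `ResponsePolicy` type and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ResponsePolicy:
    auto_apply: bool                    # fix applied without human input
    monitor_s: Optional[int]            # post-fix monitoring duration, if any
    approval_timeout_s: Optional[int]   # how long to wait for a human decision


POLICIES = {
    "TRIVIAL": ResponsePolicy(auto_apply=True, monitor_s=None, approval_timeout_s=None),
    "MODERATE": ResponsePolicy(auto_apply=True, monitor_s=5 * 60, approval_timeout_s=None),
    "COMPLEX": ResponsePolicy(auto_apply=False, monitor_s=None, approval_timeout_s=30 * 60),
}


def requires_human(severity: str) -> bool:
    """True when the incident must wait for an approval before any action."""
    return not POLICIES[severity].auto_apply
```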

```bash
# Force a DIUN check
docker exec diun diun --test

# DIUN logs
docker logs diun --tail 50

# View configured policies
# N8N → Data → Tables → image_policies (26 current entries)

# View pending updates
# N8N → Data → Tables → pending_updates
```

| Limit | Impact | Mitigation |
| --- | --- | --- |
| No automated tests | Risk of undetected regressions | Basic health check only |
| Manual rollback may be needed | If the automatic rollback fails, intervention is required | Immediate notification with context |
| DIUN dependency | No detection if DIUN is down | Container monitoring via Health Check |
| No canary | Direct update, no progressive rollout | Acceptable on a single-VPS setup |

If an update causes a regression:

  • Health check detects the problem
  • Automatic rollback to the previous image
  • Notification with details for investigation

If updates are too frequent:

  • Tune the DIUN polling (currently 6h)
  • Create a “stable” category with monthly updates
  • Filter by semantic versioning (major only)

If deeper testing is needed:

  • Post-update smoke tests via the monitoring-stack
  • Observation delay before success notification
  • Dedicated staging environment for prior tests (extra VPS cost)

If a tighter CVE coupling is wanted:

  • Reject an update that would introduce a new unpatched CRITICAL CVE
  • Use the Security CVE Watch feed as gating
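
Such a gate reduces to a set difference between the CVEs of the running image and those of the candidate. A minimal sketch, assuming each side is available as a set of CRITICAL, unpatched CVE identifiers; how the Security CVE Watch feed exposes them is not described in the source, and the function names are hypothetical.

```python
def new_critical_cves(current: set, candidate: set) -> set:
    """CVE identifiers present in the candidate image but not in the current one."""
    return set(candidate) - set(current)


def update_allowed(current: set, candidate: set) -> bool:
    """Allow the update only if it introduces no new CRITICAL unpatched CVE."""
    return not new_critical_cves(current, candidate)
```

An update that fixes CVEs without adding any passes the gate; one that adds even a single new identifier is rejected.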

  • Glossary — DIUN, Health Check, Self-restart, File Provider