
Docker Compose in Production: What You Need to Know

By Vishnu

:::note[TL;DR]

  • Add restart: unless-stopped to every service — without it, crashed containers stay dead
  • Health checks + depends_on: condition: service_healthy prevent startup race conditions between services
  • Never put secrets in .env files on the server — use Docker secrets, CI/CD injection, or a secrets manager
  • Set max-size and max-file on log drivers — default JSON logging fills disk silently
  • Use resource limits (cpu/memory) to prevent one container from taking down the entire host

:::

Docker Compose works great in development. In production, the same file will get you in trouble if you don’t change a few things. The defaults are built for convenience, not resilience.

This guide covers what to update before you point your domain at a Compose-based deployment.

What’s different in production

In development:

  • Containers restart manually
  • Secrets are in .env files
  • Logs go wherever
  • Containers share all resources freely
  • Health checks don’t matter

In production:

  • Containers must restart automatically
  • Secrets must not be in files on disk
  • Logs need to go to a log driver or external system
  • Resource limits prevent one container from killing the host
  • Health checks gate your load balancer and deployment logic

Use a separate production compose file

Don’t use one file for everything. Keep docker-compose.yml for shared config and docker-compose.prod.yml for production overrides:

docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d
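
As a sketch of how the split works, the override file carries only the keys that differ in production; when both files are passed with -f, Compose deep-merges the later file over the earlier one (the app service shown is illustrative):

```yaml
# docker-compose.prod.yml (sketch): only production-specific keys.
# Everything else (image, ports, volumes) stays in docker-compose.yml.
services:
  app:
    restart: unless-stopped
    environment:
      NODE_ENV: production
    logging:
      options:
        max-size: "10m"
        max-file: "3"
```

Note that list-valued keys such as ports are appended rather than replaced during the merge, so it is easiest to define them in only one of the two files.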

Restart policies

Without this, containers that crash stay dead:

services:
  app:
    restart: unless-stopped    # restart on crash, not on manual stop

Options:

  • no — never restart (dev default)
  • always — always restart, even on manual stop
  • on-failure — only restart on non-zero exit code
  • unless-stopped — restart always except when manually stopped (production default)
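
To confirm which policy actually applied to a running container (the container name here is illustrative), you can query it:

```shell
# Print the effective restart policy of a running container
docker inspect --format '{{ .HostConfig.RestartPolicy.Name }}' my-app
# prints the policy name, e.g. unless-stopped
```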

Health checks

Health checks let Docker know when a container is actually ready, not just running:

services:
  app:
    image: my-app:latest
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s    # grace period on startup

  db:
    image: postgres:16
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5

Use depends_on with condition to wait for a healthy dependency:

services:
  app:
    depends_on:
      db:
        condition: service_healthy

The scenario: You deploy your app and it starts in 3 seconds, but PostgreSQL takes 8 seconds to be ready for connections. Without health checks, your app crashes on startup trying to connect to a database that isn’t ready yet. With service_healthy, the app waits.
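
You can watch this from the host: Docker exposes each container's health state, and docker compose ps shows it per service (container name illustrative):

```shell
# Current health state of one container: starting, healthy, or unhealthy
docker inspect --format '{{ .State.Health.Status }}' my-app

# Health column across all services in the project
docker compose ps
```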

Resource limits

Without limits, one misbehaving container can take down the host:

services:
  app:
    deploy:
      resources:
        limits:
          cpus: '0.5'
          memory: 512M
        reservations:
          cpus: '0.25'
          memory: 256M
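
Once deployed, a quick way to confirm the caps took effect is docker stats, which shows live usage against the configured limit:

```shell
# One-shot snapshot; if the limit above was applied, the app container's
# MEM USAGE / LIMIT column should read .../512MiB
docker stats --no-stream
```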

Secrets management

:::warning
Never commit .env files containing production secrets to your repo — even private repos. If the repo is ever made public, or a team member’s account is compromised, those secrets are exposed. Use .gitignore to exclude .env and inject secrets at deploy time via CI/CD or a secrets manager.
:::

Beyond the repo, production secrets should not sit on the server in plaintext either. A few options, in rough order of preference:

Option 1: Docker secrets (Swarm mode)

secrets:
  db_password:
    external: true

services:
  db:
    secrets:
      - db_password
    environment:
      POSTGRES_PASSWORD_FILE: /run/secrets/db_password
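
Because the secret is declared external: true, it must already exist in Swarm before you deploy. One way to create it from a shell (the value comes from a variable so it never lands in the compose file):

```shell
# Create the secret from stdin so the value stays out of files and history
printf '%s' "$DB_PASSWORD_VALUE" | docker secret create db_password -

# List secrets; Docker never displays the values
docker secret ls
```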

Option 2: Environment variables from a secure source

Use your CI/CD system (GitHub Actions, GitLab CI) to inject secrets at deploy time, or a secrets manager (Vault, AWS Secrets Manager) that your deploy script calls before starting Compose.
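
As a hedged sketch of the CI/CD route, a GitHub Actions step might inject a repository secret into the environment and run Compose over SSH. The step, secret name, and host below are assumptions for illustration, not part of this article's setup:

```yaml
# Fragment of .github/workflows/deploy.yml (names are hypothetical)
- name: Deploy with Compose
  env:
    DATABASE_URL: ${{ secrets.DATABASE_URL }}
  run: |
    ssh deploy@example.com "cd /app && \
      DATABASE_URL='$DATABASE_URL' \
      docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d"
```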

Option 3: .env file with restricted permissions

If you must use a .env file on the server, restrict access:

chmod 600 /app/.env
chown appuser:appuser /app/.env

Never commit it. Add it to .gitignore.

Logging

:::tip
Always configure max-size and max-file on your log driver. The default JSON file driver writes unlimited logs to disk — a verbose service can fill a server disk in hours. Set max-size: "10m" and max-file: "3" as a minimum for every service in production.
:::

Default JSON file logs fill up your disk. Configure a log driver:

services:
  app:
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

For centralized logging, switch to a driver such as loki, fluentd, or awslogs:

    logging:
      driver: awslogs
      options:
        awslogs-region: ap-south-1
        awslogs-group: /app/production
        awslogs-stream: app

Networking

By default, all services in a Compose file share one network. In production, segment your services:

networks:
  frontend:
  backend:

services:
  nginx:
    networks:
      - frontend

  app:
    networks:
      - frontend
      - backend

  db:
    networks:
      - backend    # not reachable from nginx directly
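
With this layout you can verify the isolation from inside a container: app resolves from nginx because they share the frontend network, while db does not (assumes the nginx image ships getent):

```shell
# Resolves: nginx and app share the "frontend" network
docker compose exec nginx getent hosts app

# Fails to resolve: nginx and db share no network
docker compose exec nginx getent hosts db
```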

Production compose example

# docker-compose.prod.yml
services:
  app:
    image: my-app:${IMAGE_TAG:-latest}
    restart: unless-stopped
    environment:
      NODE_ENV: production
      DATABASE_URL: ${DATABASE_URL}
    ports:
      - "3000:3000"
    depends_on:
      db:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 30s
    deploy:
      resources:
        limits:
          cpus: '1.0'
          memory: 1G
    logging:
      driver: json-file
      options:
        max-size: "20m"
        max-file: "5"
    networks:
      - frontend
      - backend

  db:
    image: postgres:16-alpine
    restart: unless-stopped
    environment:
      POSTGRES_USER: ${DB_USER}
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_DB: ${DB_NAME}
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${DB_USER}"]
      interval: 10s
      timeout: 5s
      retries: 5
    deploy:
      resources:
        limits:
          memory: 2G
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"
    networks:
      - backend

volumes:
  pgdata:

networks:
  frontend:
  backend:

Zero-downtime deploys

Compose doesn’t do rolling updates natively. For zero-downtime:

  1. Pull the new image: docker compose pull app
  2. Recreate with minimal downtime: docker compose up -d --no-deps app
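
The two steps above can be wrapped in a small deploy script (the path and the app service name follow the earlier examples):

```shell
#!/usr/bin/env sh
set -e
cd /app

COMPOSE="docker compose -f docker-compose.yml -f docker-compose.prod.yml"

# 1. Pull the new image for the app service only
$COMPOSE pull app

# 2. Recreate just the app container; db and networks stay up
$COMPOSE up -d --no-deps app

# Optional: drop old image layers
docker image prune -f
```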

Or use a proxy like Nginx/Traefik to route traffic while you swap containers.

For anything requiring true zero-downtime at scale, that’s when Docker Swarm or Kubernetes starts making sense.

Related: Docker Cheat Sheet | Write a Node.js Dockerfile


Summary

  • Change restart: no (dev default) to restart: unless-stopped for every service in production
  • Health checks let Compose and load balancers know when a container is actually ready — use depends_on with condition: service_healthy
  • Never put production secrets in .env files on the server — use Docker secrets, CI/CD injection, or a secrets manager
  • Add max-size and max-file to your logging config to prevent filling your disk with JSON logs
  • Use resource limits (cpus, memory) to prevent one misbehaving container from taking down the host

Frequently Asked Questions

Should I use Docker Compose or Kubernetes in production?

Compose is the right choice for small-to-medium deployments on a single server or a few servers. Kubernetes is for orchestrating containers across many nodes at scale. If you’re running 2–10 services on a VPS, Compose is simpler and maintainable. If you need auto-scaling, multi-zone redundancy, and rolling deploys across a cluster, that’s Kubernetes territory.

How do I update a service with zero downtime using Compose?

True zero-downtime with plain Compose requires a reverse proxy (Nginx, Traefik, Caddy) handling traffic. Pull the new image, start a new container on a different port, update the proxy config to point to it, then stop the old container. Traefik automates this with labels. For simpler setups, a brief (~1-2 second) restart with docker compose up -d --no-deps app is usually acceptable.

What’s the difference between depends_on and depends_on with condition: service_healthy?

Plain depends_on only waits for the container to start — not for the service inside to be ready. condition: service_healthy waits until the container’s health check passes. Always use the health condition for databases, caches, and any service that takes a few seconds to initialize.