How We Fixed the “First Web Container is Unhealthy” Error: A DNS Deep Dive


The Error That Nearly Broke Our Deployment

Three hours into our Kamal deployment, we were stuck in a loop:

ERROR Failed to boot web on {ip_address}
INFO First web container is unhealthy on {ip_address}, not booting any other roles

The container would start, but Kamal’s health check kept failing. After 30 seconds, Kamal would kill the container
and retry, creating an endless loop.

We spent hours debugging deployment scripts, PostgreSQL configurations, and Rails settings. The fix turned out to
be much simpler: DNS configuration.

The Root Cause: Broken DNS Resolution

What Was Happening

When Kamal tried to verify container health, it performed this sequence:

  1. Container starts → my_app-web-abc123 boots
  2. Traefik (Kamal proxy) tries to check /up endpoint
  3. DNS lookup → Resolve my_app-web-abc123 to an IP address
  4. Health check fails → DNS resolution times out or fails
  5. Container killed → Kamal marks it as unhealthy
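
The sequence above amounts to polling a check until it passes or the attempts run out. As a rough sketch of that idea (not Kamal's actual implementation), a generic retry helper in shell could look like this; the function name retry and its arguments are hypothetical:

```shell
# Retry a command until it succeeds or the attempts run out.
# Usage: retry <attempts> <delay_seconds> <command> [args...]
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Run the command; a zero exit status means the check passed
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        # Pause before the next attempt, but not after the final one
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

For example, retry 30 1 docker exec kamal-proxy getent hosts my_app-web-abc123 would repeat the DNS lookup from step 3 once per second for up to 30 attempts.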

The DNS Failure

The Traefik container’s /etc/resolv.conf showed:

nameserver 127.0.0.53
search members.linode.com
options edns0 trust-ad ndots:0

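Problem: 127.0.0.53 is the stub listener of the host’s systemd-resolved. Inside a container, 127.0.0.53 refers to the
container’s own loopback interface, where nothing is listening, so it is not accessible from inside Docker containers!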

When Traefik tried to resolve my_app-web-abc123:

  • It queried 127.0.0.53 (systemd-resolved)
  • The query failed with “connection refused”
  • Health check failed
  • Container was killed
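
A quick way to spot this condition is to scan a container's resolv.conf for loopback nameservers. The helper below is a minimal sketch; the function name unreachable_nameservers is hypothetical, and it assumes that any 127.x.x.x entry other than Docker's embedded resolver (127.0.0.11) is unreachable from inside the container:

```shell
# Print nameservers from a resolv.conf-style file that point at a loopback
# address other than Docker's embedded resolver (127.0.0.11).
# Usage: unreachable_nameservers <file>
unreachable_nameservers() {
    awk '$1 == "nameserver" && $2 ~ /^127\./ && $2 != "127.0.0.11" { print $2 }' "$1"
}

# Example usage (container name taken from this post):
#   docker exec kamal-proxy cat /etc/resolv.conf > /tmp/proxy-resolv.conf
#   unreachable_nameservers /tmp/proxy-resolv.conf
```

Empty output means no suspicious loopback entries were found.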

The Solution: Proper Docker DNS Configuration

What We Fixed

We configured Docker’s DNS settings in /etc/docker/daemon.json:

{
  "dns": ["127.0.0.11", "8.8.8.8", "1.1.1.1"]
}
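
If you want to script this step, something like the following could work. It is only a sketch: the function name write_docker_dns_config is hypothetical, it overwrites the target file (so merge by hand if your daemon.json already holds other settings), and it assumes a systemd host where Docker is restarted with systemctl:

```shell
# Write the DNS settings from this post to the given file.
# NOTE: this overwrites the file; it does not merge with existing settings.
# Usage: write_docker_dns_config <target_file>
write_docker_dns_config() {
    cat > "$1" <<'EOF'
{
  "dns": ["127.0.0.11", "8.8.8.8", "1.1.1.1"]
}
EOF
}

# Example usage (assumes systemd; Docker reads daemon.json only at startup):
#   write_docker_dns_config /tmp/daemon.json
#   sudo install -m 644 /tmp/daemon.json /etc/docker/daemon.json
#   sudo systemctl restart docker
```

Writing to a temporary file first keeps the function itself free of sudo and makes it easy to inspect the result before installing it.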

Why This Works

1. 127.0.0.11 (Docker’s Internal DNS) – First Priority

  • Resolves container hostnames automatically
  • Handles inter-container communication
  • Available inside user-defined Docker networks (such as the kamal network)

2. 8.8.8.8 (Google DNS) – Second Priority

  • Resolves external domains (APIs, gems, etc.)
  • Fast and reliable
  • Global infrastructure

3. 1.1.1.1 (Cloudflare DNS) – Third Priority

  • Privacy-focused external DNS
  • Backup if 8.8.8.8 fails
  • No query logging

How Docker Uses This

On a user-defined network, lookups go through Docker’s embedded DNS server:

  1. 127.0.0.11 (internal) → answers container names itself
  2. Anything else is forwarded → 8.8.8.8 (external) → domains
  3. If 8.8.8.8 does not respond → 1.1.1.1 (external) → domains

The IPv4/IPv6 Issue

While debugging, we discovered another subtle problem:

The IPv6 Trap

The server setup script used:

SERVER_IP=$(curl -s ifconfig.me || echo "ip_address_goes_here")

Problem: ifconfig.me returned an IPv6 address:

2600:3c03::...

This IPv6 address was used in PostgreSQL’s pg_hba.conf:

host my_app_production my_app_user 2600:3c03.../32 md5

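PostgreSQL had issues with this entry, causing authentication failures. Note that /32 is the single-host mask for IPv4 addresses; for an IPv6 address the single-host mask is /128.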

The Fix

Force IPv4 detection:

SERVER_IP=$(curl -s -4 ifconfig.me || echo "ip_address_goes_here")

The -4 flag makes curl resolve names to IPv4 addresses only, so ifconfig.me reports the server’s IPv4 address, which matches the /32 mask used in pg_hba.conf.
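
Because the fallback string and an unexpected IPv6 reply would both end up in pg_hba.conf unnoticed, it may be worth validating the detected value before using it. A minimal sketch, with a hypothetical helper name is_ipv4:

```shell
# Return success if the argument has dotted-quad IPv4 shape (e.g. 203.0.113.7).
# This checks the shape only; it does not range-check each octet.
is_ipv4() {
    printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

# Example usage in a setup script:
#   SERVER_IP=$(curl -s -4 ifconfig.me)
#   is_ipv4 "$SERVER_IP" || { echo "Could not detect an IPv4 address" >&2; exit 1; }
```

Failing early here is a design choice: a setup script that stops with a clear message is easier to debug than one that writes a placeholder into the database configuration.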

The PostgreSQL Network Isolation Issue

The Problem

Kamal uses a separate Docker network (172.18.0.0/16) for containers, while PostgreSQL is on the host’s Docker
bridge network (172.17.0.0/16).

The firewall only allowed 172.17.0.0/16:

5432/tcp  ALLOW  172.17.0.0/16

The Fix

Add the Kamal network to both firewall and PostgreSQL config:

Firewall (ufw):

sudo ufw allow from 172.18.0.0/16 to any port 5432

PostgreSQL (pg_hba.conf):

host my_app_production my_app_user 172.18.0.0/16 md5
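
When a setup script adds this rule, re-running the script should not append the same line twice. One possible approach is a small idempotent helper; the name append_once is hypothetical, and the pg_hba.conf path in the usage comment is the one listed later in this post:

```shell
# Append a line to a file only if that exact line is not already present.
# Usage: append_once <file> <line>
append_once() {
    file=$1
    line=$2
    # -x matches whole lines, -F treats the rule as a fixed string, not a regex
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example usage (run as a user who can write the file, then reload PostgreSQL):
#   append_once /etc/postgresql/16/main/pg_hba.conf \
#       "host my_app_production my_app_user 172.18.0.0/16 md5"
```

The 172.18.0.0/16 subnet is the one from this post; Docker assigns subnets itself, so it is worth confirming the value with docker network inspect kamal before hard-coding it.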

Complete Fix in our setup script

IPv4 Fix

SERVER_IP=$(curl -s -4 ifconfig.me || echo "ip_address_goes_here")

Kamal Network Firewall Rule

sudo ufw allow from 172.18.0.0/16 to any port 5432

PostgreSQL Kamal Network Rule

host $DB_NAME $DB_USER 172.18.0.0/16 md5

Docker DNS Configuration

{
  "dns": ["127.0.0.11", "8.8.8.8", "1.1.1.1"]
}

Key Takeaways

  1. DNS is Critical for Container Orchestration
    • Always configure Docker’s DNS properly
    • Include both internal and external DNS servers
    • Test DNS resolution from containers
  2. Network Isolation Matters
    • Docker networks are isolated by default
    • PostgreSQL must allow connections from every Docker network that needs database access
    • Firewall rules must match
  3. IPv4 vs IPv6 Can Break Things
    • pg_hba.conf entries must match the address family (IPv4 or IPv6) that clients actually connect from
    • Force IPv4 when detecting server IPs
    • Test both IPv4 and IPv6 connectivity
  4. Health Checks are Essential
    • The /up endpoint is critical for Kamal
    • DNS must work for health checks to succeed
    • Timeout settings matter (30s default)

Troubleshooting DNS Issues

If you encounter “First web container is unhealthy”:

  1. Check Container Logs
    docker logs my_app-web-abc123
  2. Check Traefik/Kamal Proxy Logs
    docker logs kamal-proxy 2>&1 | grep -i healthcheck
  3. Test DNS Resolution
    # From inside the Traefik container
    docker exec kamal-proxy getent hosts my_app-web-abc123
    docker exec kamal-proxy getent hosts google.com
  4. Verify DNS Configuration
    # Check daemon.json
    cat /etc/docker/daemon.json

    # Check the container's resolv.conf
    docker exec kamal-proxy cat /etc/resolv.conf
  5. Check PostgreSQL Connectivity
    # From the kamal network (-it lets psql prompt for the password)
    docker run --rm -it --network kamal postgres:16 psql \
      -h 172.17.0.1 -U my_app_user -d my_app_production -c "SELECT 1"

Results

After implementing all fixes:

  • ✅ DNS resolution works (internal and external)
  • ✅ Health checks pass (Traefik can reach containers)
  • ✅ PostgreSQL connections work (from both Docker networks)
  • ✅ Deployments succeed (consistent, reliable)
  • ✅ IPv4 detection works (no IPv6 issues)

Final Thoughts

The “First web container is unhealthy” error can be a DNS configuration issue, not a deployment or application
problem.

By understanding how Docker networks work, how DNS resolution functions, and how PostgreSQL authentication works, we can prevent this issue from ever occurring again.

Key files to review:

  • /etc/docker/daemon.json – Docker DNS configuration
  • /etc/postgresql/16/main/pg_hba.conf – PostgreSQL authentication
  • /etc/ufw/user.rules – Firewall rules (managed by ufw)

The fix is now automated in our setup script, ensuring new servers have proper DNS and network configuration from
day one.

Docker MySQL with a Custom SQL Script for Development

The setup is similar to setting up MariaDB.

Start with a standard docker-compose file. If you need a custom SQL mode, pass the necessary options in the command setting:

version: "3.7"
services:
    mysql:
        build:
            context: .
            dockerfile: dev.dockerfile
        restart: always
        command: --sql_mode="STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE"
        environment:
            MYSQL_ROOT_PASSWORD: root_password
            MYSQL_DATABASE: dev
            MYSQL_USER: dev_user
            MYSQL_PASSWORD: dev_password
        ports:
            - 3306:3306

Add dev.dockerfile:

FROM mysql:8.0.17

ADD init.sql /docker-entrypoint-initdb.d/ddl.sql


Finally, add your init.sql file. Let’s give all privileges to our dev_user and switch the default caching_sha2_password to mysql_native_password (don’t do it unless you rely on older packages that require the less secure mysql_native_password authentication method):

GRANT ALL PRIVILEGES ON *.* TO 'dev_user'@'%';
ALTER USER 'dev_user'@'%' IDENTIFIED WITH mysql_native_password BY 'dev_password';

If you want to access the database container from other containers, while running them separately, you can specify host.docker.internal as the host address of your database.

Dockerizing MariaDB with a Custom SQL Script in Development

Start with a standard docker-compose file:

version: "3.7"
services:
    mariadb:
        build:
            context: .
            dockerfile: dev.dockerfile
        restart: always
        environment:
            MYSQL_ROOT_PASSWORD: password
            MYSQL_DATABASE: db_name
            MYSQL_USER: sql_user
            MYSQL_PASSWORD: password
        ports:
            - 3306:3306

Add dev.dockerfile:

FROM mariadb:latest

ADD init.sql /docker-entrypoint-initdb.d/ddl.sql

Finally, add your init.sql file. Let’s give all privileges to our sql_user:

GRANT ALL PRIVILEGES ON *.* TO 'sql_user'@'%';

Now, run docker-compose build, then docker-compose up.

Access from another container

If you want to access the database container from other containers, while running them separately, you can specify host.docker.internal as the address of your database.

If you’re on Linux, you need Docker Engine 20.10 or later, and you need to add the following to your docker-compose file:

  my_app:
    extra_hosts:
      - "host.docker.internal:host-gateway"

If you are on a Mac, the snippet above will break your setup unless you are on at least Docker Desktop for Mac 3.3.0. See Support host.docker.internal DNS name to host · Issue #264 · docker/for-linux (github.com) for details.

docker-compose build and deployment for Angular

In this tutorial we’ll write docker-compose files for Angular and a simple deploy script that builds and deploys the images from your local machine.

Development

Let’s start with the dev environment. First, add a .dockerignore file in the root of your project:

.git
.gitignore
.vscode
docker-compose*.yml
Dockerfile
node_modules

Create a .docker directory in the root of your project and add dev.dockerfile to it:

FROM node:10

RUN mkdir /home/node/app && chown node:node /home/node/app
RUN mkdir /home/node/app/node_modules && chown node:node /home/node/app/node_modules
WORKDIR  /home/node/app
USER node
COPY --chown=node:node package.json package-lock.json ./
RUN npm ci --quiet
COPY --chown=node:node . .

We are using node 10 image and using a less privileged node user. npm ci “is similar to npm install, except it’s meant to be used in automated environments such as test platforms, continuous integration, and deployment — or any situation where you want to make sure you’re doing a clean install of your dependencies.” – npm-ci | npm Docs (npmjs.com)

Create a docker-compose.yml file in the root of your project:

# docker-compose
version: '3.7'
services:
  app:
    container_name: 'your-container-name'
    build:
      context: .
      dockerfile: .docker/dev.dockerfile
    command: sh -c "npm start"
    ports:
      - 4200:4200
    working_dir: /home/node/app
    volumes:
      - ./:/home/node/app
      - node_modules:/home/node/app/node_modules
volumes:
  node_modules:


With this setup, node_modules lives in a named volume mounted over the directory built into the image. Once the volume has been created, it keeps its own copy of node_modules and shadows whatever a rebuilt image contains. This means you have to run docker-compose run app npm install when you need to update your packages; rebuilding the image will not do it for you.
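
One way to notice that the volume has gone stale is to compare package-lock.json against a checksum recorded at the last install. This is only a sketch of that idea: the helper names and the .deps-stamp file are hypothetical, not part of the original setup:

```shell
# Return success if the lockfile differs from the checksum recorded at the
# last install, or if no checksum has been recorded yet.
# Usage: lockfile_changed <lockfile> <stamp_file>
lockfile_changed() {
    current=$(cksum < "$1")
    [ ! -f "$2" ] || [ "$current" != "$(cat "$2")" ]
}

# Record the current lockfile checksum after a successful install.
# Usage: record_lockfile <lockfile> <stamp_file>
record_lockfile() {
    cksum < "$1" > "$2"
}

# Example usage from the project root (.deps-stamp is a hypothetical file name):
#   if lockfile_changed package-lock.json .deps-stamp; then
#       docker-compose run app npm install && record_lockfile package-lock.json .deps-stamp
#   fi
```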

For alternative setups, check out this stackoverflow answer.

In your package.json you should have the definition of the npm start command:

"scripts": {
    "ng": "ng",
    "start": "ng serve --host 0.0.0.0",
    "build": "ng build"
  },

Run docker-compose build and docker-compose up.

Deployment

Docker Setup

Let’s add production.dockerfile to the .docker directory:

# Stage 1
FROM node:10 as node

RUN mkdir /home/node/app && chown node:node /home/node/app
RUN mkdir /home/node/app/node_modules && chown node:node /home/node/app/node_modules
WORKDIR  /home/node/app
USER node
COPY --chown=node:node package.json package-lock.json ./
RUN npm ci --quiet
COPY --chown=node:node . .

# max_old_space_size is optional but can help when you have a lot of modules
RUN node --max_old_space_size=4096 node_modules/.bin/ng build --prod

# Stage 2
# Using a light-weight nginx image
FROM nginx:alpine

COPY --from=node /home/node/app/dist /usr/share/nginx/html
COPY --from=node /home/node/app/.docker/nginx.conf /etc/nginx/conf.d/default.conf

Add a docker-compose.production.yml file:

version: '3.7'
services:
  app:
    build:
      context: .
      dockerfile: .docker/production.dockerfile
    image: production-image
    container_name: production-container
    ports:
      - 80:80

Deploy script

We are going to ssh into our destination server and copy the updated image directly. Using an image registry has a lot of advantages over this approach, but if you need something simple this will work:

#!/bin/sh

# Build the image locally, upload it to your production box, and start a new container from the latest image

# Stop at the first failing command and report the failure on exit.
# (A "{ ...; } || { ...; }" block only sees the exit status of its last command.)
set -e
trap 'status=$?; [ "$status" -eq 0 ] || echo "Something went wrong"' EXIT

echo "Create an image"
docker-compose -f docker-compose.yml -f docker-compose.production.yml build

echo "Upload the latest image"
date +"%T"
docker save production-image:latest | ssh -C user@your_server_ip docker load

echo "Stop and restart containers"
# \$(...) is escaped so that date runs on the server, not locally before ssh starts
ssh -C user@your_server_ip "echo Stopping container at \$(date +'%T'); \
    docker stop production-container || true; \
    docker rm production-container || true; \
    docker container run -d --restart unless-stopped -p 80:80 --name production-container production-image:latest; \
    echo Restarted container at \$(date +'%T'); \
    docker image prune -f || true"

echo "Finished"
date +"%T"

We are starting a new container based on the latest uploaded image on our destination host and mapping the host port 80 to the container port 80.

Helpful resources: