
You check your syslog, your journald output, or your container host's daemon log and you see lines like this repeating at high frequency:

dockerd log output
level=error msg="[resolver] failed to query external DNS server" \
  client-addr="udp:172.18.0.7:49663" \
  dns-server="udp:130.221.128.2:53" \
  error="read udp 172.18.0.7:49663->130.221.128.2:53: i/o timeout" \
  question=";sudowheel.com. IN A"

Or maybe the variant with a loopback address:

dockerd log output (systemd-resolved variant)
level=error msg="[resolver] failed to query external DNS server" \
  client-addr="udp:127.0.0.1:47051" \
  dns-server="udp:127.0.0.1:53" \
  error="read udp 127.0.0.1:47051->127.0.0.1:53: i/o timeout" \
  question=";sudowheel.com. IN A"

The log entry names an upstream DNS server, describes an i/o timeout or operation not permitted, and then identifies the domain being queried. It appears in a loop. Containers may or may not be able to resolve names depending on whether any fallback resolvers succeed. This guide explains exactly what is happening, why it happens in predictable scenarios, and how to fix every root cause.

Why This Error Is Harder to Reason About Than It Looks

DNS failures inside containers are cognitively tricky because the same host can resolve names perfectly at the shell while every container silently fails. This gap exists because the host and its containers do not share a network namespace -- they share a physical interface but have separate loopback interfaces, separate routing tables, and separate views of what an IP address means. A resolver address that is valid and reachable from the host may be entirely meaningless or unreachable from inside a container. Until you internalize that namespace boundary, the behavior looks like a bug in Docker. Once you internalize it, the behavior is completely predictable.

How Docker's DNS Stack Actually Works

Before you can fix this error, you need a clear picture of Docker's DNS architecture. There are two very different DNS configurations depending on which network type a container is attached to.

When a container is attached to Docker's default bridge network (docker0), it does not get Docker's embedded DNS server. Instead, the daemon copies the host's /etc/resolv.conf directly into the container, filtering out any loopback addresses since those would refer to the container itself rather than the host. The container then queries whatever nameservers are left in that copied file.

When a container is attached to a user-defined bridge or overlay network, the behavior is entirely different. Docker injects nameserver 127.0.0.11 into the container's /etc/resolv.conf. That address is Docker's embedded DNS server, which runs inside the dockerd process on the host. The embedded server is responsible for two things: resolving container names to their IP addresses within the network, and forwarding all other queries upstream to real nameservers.

Architecture Note: The Random Port Detail

Docker's embedded DNS server runs in the host's PID namespace but listens inside each container's network namespace via a socket bound to an ephemeral, randomly assigned port -- not port 53. The address 127.0.0.11 is reachable from the container because Docker uses iptables DNAT rules to intercept all UDP and TCP traffic directed at 127.0.0.11:53 from within the container and redirect it to the actual random port the resolver is listening on. This interception happens transparently, so the container always sees port 53. If those DNAT rules are absent or corrupted, DNS queries from the container never reach the embedded resolver at all -- they time out silently before the upstream forwarding stage even begins. The resolver implementation lives in moby/libnetwork/resolver.go and uses the miekg/dns library.

The Namespace Boundary: Where Each Address Lives

In the host network namespace:

  • 127.0.0.53 — the systemd-resolved stub. Reachable from the host's loopback; this is where systemd-resolved listens and proxies to real upstream servers. Completely unreachable from any container's separate loopback namespace — the source of the most common variant of this error.
  • /run/systemd/resolve/resolv.conf — the actual upstream resolver addresses that systemd-resolved is forwarding to. Docker should read this file rather than /etc/resolv.conf. Set "resolv-conf": "/run/systemd/resolve/resolv.conf" in daemon.json to use it directly.
  • dockerd embedded resolver (random port, host PID namespace) — runs inside dockerd on the host but listens in each container's network namespace. Not port 53 on the host: iptables DNAT redirects port 53 traffic inside the container to the actual random port. If those DNAT rules are missing, DNS never reaches the resolver at all.

In the container network namespace:

  • 127.0.0.11 — injected by Docker on any user-defined bridge or overlay network. Docker writes it into the container's /etc/resolv.conf, and all DNS queries go here first. iptables DNAT intercepts traffic to 127.0.0.11:53 and routes it to the embedded resolver's actual port.
  • 127.0.0.53 — broken if inherited here. If the container's /etc/resolv.conf shows this address, DNS will silently fail: inside the container's namespace, 127.0.0.53 means the container's own loopback, where nothing is listening on port 53. This is the loopback namespace collision that causes this error.
  • 172.18.0.x — the container's address on the bridge network. When the embedded resolver logs the error, this address appears in the client-addr field. Match it against docker inspect output to identify which container is generating failed queries.

The core insight is that loopback is NOT shared across namespaces. Each namespace owns its own 127.0.0.0/8 range; an address valid on the host loopback is a dead end inside any container. This one fact explains the majority of variants of this error.

The upstream forwarders that the embedded DNS server uses are determined by reading the host's DNS configuration once, when the daemon starts: normally /etc/resolv.conf, or /run/systemd/resolve/resolv.conf if systemd-resolved is detected. This is the critical link that breaks in the scenarios described below.

The Upstream List Is Read at Daemon Start, Not at Container Start

This is the source of a very common "I applied the fix and nothing changed" experience. Modifying /etc/resolv.conf, updating systemd-resolved, or editing daemon.json while the Docker daemon is already running has no effect on any running or future containers until the daemon itself is restarted. The embedded resolver captures its upstream list once at startup and holds it until the daemon exits. If your host's DNS configuration changes -- because of a VPN connection, a DHCP lease renewal, or a manual edit -- Docker continues using the stale upstream list it loaded at start time. Always restart Docker after any DNS-related configuration change: sudo systemctl restart docker. If you need containers to keep running across the restart, configure "live-restore": true in daemon.json before the restart.
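
As a concrete sketch, the two daemon settings just mentioned live together in /etc/docker/daemon.json (the resolver addresses are illustrative; use whichever upstreams are correct for your network):

```json
{
  "dns": ["8.8.8.8", "1.1.1.1"],
  "live-restore": true
}
```

With live-restore enabled, sudo systemctl restart docker reloads the daemon and its upstream resolver list while leaving running containers in place.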

The log message [resolver] failed to query external DNS server is emitted by Docker's embedded resolver when it attempts to forward a query to one of those upstream addresses and cannot get a response. The embedded resolver is working as designed -- it just cannot reach the server it was told to use.

The Root Causes, Ranked by Frequency

systemd-resolved stub listener address leaking into containers

This is the single most common cause of this error on Ubuntu 18.04 and later, Debian with systemd-resolved enabled, Arch Linux, and any other distribution that points /etc/resolv.conf at systemd-resolved's stub listener at 127.0.0.53.

When dockerd reads /etc/resolv.conf and finds nameserver 127.0.0.53, it recognizes the address as a systemd-resolved stub and switches to reading /run/systemd/resolve/resolv.conf instead -- Docker's own code does this automatically. However, this behavior has edge cases. If the upstream servers in the stub file are themselves unavailable, or if the stub file was not populated correctly, Docker ends up forwarding queries to addresses that time out. In other configurations, the daemon may forward to 127.0.0.53 anyway, which cannot be reached from within a container's isolated network namespace since loopback is per-namespace.

Mental Model: Address Space Collision

The most useful way to think about this failure is not as a DNS misconfiguration but as an address space collision. The loopback range 127.0.0.0/8 is not a shared resource -- it is privately owned by each network namespace. When the host's /etc/resolv.conf says nameserver 127.0.0.53, that address is meaningful only inside the host's namespace. From inside a container, 127.0.0.53 resolves to the container's own loopback, which has nothing listening on port 53. Docker's embedded resolver inherits this address and faithfully tries to use it -- but it is operating across a namespace boundary the address was never designed to cross. The error is not Docker being broken. It is Docker doing exactly what it was told, in a context where those instructions are meaningless.

On distributions that do not use systemd-resolved at all -- Alpine Linux, Gentoo, older Debian without systemd, and any system using openresolv or direct /etc/resolv.conf management -- this exact loopback failure is less common but still possible. openresolv is a framework that manages /etc/resolv.conf on behalf of DHCP clients, VPN clients, and other network managers on non-systemd systems. If openresolv is configured with a local caching resolver like dnsmasq or pdnsd that binds to 127.0.0.1, it will write nameserver 127.0.0.1 into /etc/resolv.conf. Docker inherits that address and fails for the same reason: 127.0.0.1 inside a container namespace refers to the container's own loopback, not the host's. On Alpine and similar systems without systemd, Docker does not attempt any special resolv.conf fallback logic -- it reads the file as-is and uses whatever addresses it finds. The fix is the same as everywhere else: set explicit upstream resolvers in /etc/docker/daemon.json. On Alpine specifically, the daemon.json path is unchanged; only the restart command differs with the init system (service docker restart instead of systemctl restart docker).
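
A minimal daemon.json for this fix; the upstream addresses here are examples, and any routable resolver your hosts can reach works:

```json
{
  "dns": ["1.1.1.1", "9.9.9.9"]
}
```

Alternatively, on systemd-resolved hosts, "resolv-conf": "/run/systemd/resolve/resolv.conf" tells dockerd to read the real upstream list directly. Either way, restart the daemon afterwards (systemctl restart docker, or service docker restart on Alpine/OpenRC).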

IPv6 loopback address in resolv.conf

The same namespace boundary problem that makes 127.0.0.53 unreachable from containers applies equally to IPv6 loopback addresses. On systems where /etc/resolv.conf contains nameserver ::1 -- which appears on hosts running a local DNS resolver like Unbound or dnsmasq bound to the IPv6 loopback -- Docker's embedded resolver inherits that address and tries to forward queries to it. From inside a container's network namespace, ::1 refers to the container's own IPv6 loopback, where nothing is listening on port 53. The result is an i/o timeout identical in structure to the IPv4 variant.

This variant is less common than the 127.0.0.53 case but regularly catches engineers off guard because they have already confirmed that no 127.0.0.x addresses appear in their resolver list without checking for IPv6 loopback entries. Inspect the full resolv.conf output, not just the first line:

terminal
# Check for both IPv4 and IPv6 loopback addresses in the host resolver config
$ grep nameserver /etc/resolv.conf
nameserver ::1    # <-- IPv6 loopback: will fail inside containers
nameserver 127.0.0.1    # <-- also a loopback: same problem

# Check what the embedded resolver would actually use
$ grep nameserver /run/systemd/resolve/resolv.conf

The fix is identical to the IPv4 loopback case: set explicit, routable upstream resolver addresses in /etc/docker/daemon.json. If you want to keep using a local resolver on the host, configure it to bind to the Docker bridge address (172.17.0.1 by default) rather than loopback, and point Docker at that bridge address as described in the systemd-resolved section above. The bind-to-the-bridge pattern that DNSStubListenerExtra enables for systemd-resolved works equally well with dnsmasq and Unbound -- both support binding to additional interfaces (listen-address for dnsmasq, the interface: option for Unbound).
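
If you keep the local resolver, the bridge binding for systemd-resolved is a drop-in file; note that DNSStubListenerExtra requires systemd 247 or newer, and 172.17.0.1 assumes the default docker0 subnet:

```ini
# /etc/systemd/resolved.conf.d/docker-bridge.conf
[Resolve]
DNSStubListenerExtra=172.17.0.1
```

Then set "dns": ["172.17.0.1"] in daemon.json and restart systemd-resolved followed by Docker. The docker0 interface must exist when resolved binds the extra listener; if resolved starts before Docker at boot, restart it once Docker is up or order the units accordingly.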

Upstream DNS server unreachable or blocking UDP port 53

The error message i/o timeout specifically means Docker sent a UDP packet to the named DNS server and never received a reply. This happens when the upstream server address is correct and reachable in principle but queries time out because of network firewall rules, the server being overloaded, or the server being on a network segment not routable from the Docker bridge network.

On cloud instances, this manifests when the host's DNS resolver is a metadata-service address: directly reachable through the host's network stack, but not from inside a container's bridge-isolated namespace unless the right NAT rules are in place.

  • AWS EC2: The VPC resolver lives at 169.254.169.253 (or the base of the VPC CIDR +2, e.g. 10.0.0.2). The link-local 169.254.x.x address is unreachable from inside Docker bridge namespaces by default because the kernel does not route link-local traffic through NAT. The VPC base+2 address is generally reachable as long as Docker's MASQUERADE rules are intact, but some hardened AMIs block it.
  • Azure VMs: The Azure DNS resolver at 168.63.129.16 is a special platform address routed at the hypervisor level. Traffic to this address from inside a container bridge namespace typically drops silently at the virtual switch -- it never reaches the hypervisor. Use Azure's public recursive resolvers or your own DNS forwarder as an explicit daemon.json entry instead.
  • GCP Compute Engine: The metadata server at 169.254.169.254 doubles as the VPC resolver, and its link-local address has the same reachability problem as on AWS: it cannot be reached from inside a bridge namespace. Set a public resolver such as 8.8.8.8 explicitly in daemon.json on GCP, or use your own internal forwarder's routable IP if you run a private zone.
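
The reachability rules above can be encoded in a small triage helper. This is an illustrative sketch, not a Docker tool -- it classifies only the shape of the address and does not probe the network:

```shell
# Triage helper: classify a resolver address by whether it can work from
# inside a Docker bridge namespace. Pure string matching; no network I/O.
classify_resolver() {
  case "$1" in
    169.254.*) echo "link-local: not reachable from Docker bridge namespaces" ;;
    127.*|::1) echo "loopback: per-namespace, dead end inside containers" ;;
    *)         echo "routable candidate: verify with a test query from a container" ;;
  esac
}

classify_resolver 169.254.169.253   # AWS link-local resolver form
classify_resolver 10.0.0.2          # VPC base+2: usually fine if MASQUERADE rules are intact
```

Pair the "routable candidate" verdict with an actual probe, for example docker run --rm busybox nslookup example.com 10.0.0.2, to confirm the path end to end.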

The common thread across cloud providers is that their internal resolver mechanisms rely on hypervisor-level routing tricks that do not extend into guest network namespaces. Understanding this distinction explains why a container host that resolves names perfectly at the shell level can still fail to resolve anything from inside a container -- the two are not using the same network path.

iptables rules flushed or corrupted

Docker relies on iptables for two things that are both relevant to DNS: MASQUERADE rules in the nat table that allow containers to reach external IP addresses, and DNAT rules that redirect container DNS traffic to the embedded resolver. If these rules are flushed -- by a system reboot that did not properly restore them, by a third-party firewall manager like ufw or firewalld overwriting Docker's chains, or by manual iptables -F -- DNS queries from containers fail at the network layer before they ever reach an upstream server.

The symptom here is the variant of the error message that reads write: operation not permitted rather than i/o timeout. That indicates the kernel rejected the packet outright rather than allowing it to be sent.

VPN clients changing the host's DNS configuration

VPN clients frequently update /etc/resolv.conf or the systemd-resolved configuration with internal corporate DNS servers that are only reachable through the VPN tunnel interface. Docker's embedded resolver captures those internal server addresses as its upstreams. When the VPN is disconnected, or when the VPN routes do not extend into Docker's bridge network subnets, queries to those internal servers time out. The embedded resolver then logs the error for every forwarded query.

There are three distinct patterns within this root cause, each with a different fix:

  • VPN disconnects after Docker starts, resolver list becomes stale. Docker reads its upstream resolvers once at daemon start. If a VPN client rewrites /etc/resolv.conf after the daemon is already running, Docker's embedded resolver continues forwarding to the pre-VPN resolvers. When those addresses become unreachable -- because the VPN tunnel is down -- every query times out. The fix here is either the networkd-dispatcher automation described in the Fixes section, which automatically updates daemon.json and restarts Docker on network changes, or manually restarting Docker after each VPN connect or disconnect event.
  • VPN is connected but its DNS server is not routable from Docker's bridge namespace. Some VPN clients push an internal DNS server address that is only reachable via the VPN tunnel's virtual interface. Docker's bridge network does not route through that interface -- it routes through the host's default gateway. Queries reach the tunnel adapter from the host but are sent from a bridge namespace that has no route via the tunnel. The fix is to add the VPN DNS server as an explicit entry in daemon.json but also ensure the routing rules cover Docker's bridge subnet. Alternatively, use a public resolver as the fallback entry so containers still resolve external names when the VPN DNS is unreachable.
  • Split-tunnel VPN where internal names must go to the VPN resolver and external names go elsewhere. This is the most complex case. Docker's daemon.json does not support split DNS -- you cannot configure it to send .corp.internal to one resolver and everything else to another. The architecturally correct solution for this case is to make systemd-resolved listen on the Docker bridge address (as described in the systemd-resolved fix section) and configure systemd-resolved's per-domain routing to send internal names to the VPN resolver and external names to a public resolver. Docker then points at the bridge address as its single upstream and inherits the full split-DNS routing intelligence from systemd-resolved.
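
The per-domain routing for the split-tunnel case is configured on the tunnel interface. A sketch assuming a systemd-networkd-managed tunnel named tun0, an internal domain corp.internal, and a VPN resolver at 10.8.0.1 -- all three are placeholders for your own values:

```ini
# /etc/systemd/network/50-vpn.network (hypothetical file name)
[Match]
Name=tun0

[Network]
DNS=10.8.0.1
# The ~ prefix marks a routing domain: only queries under corp.internal
# are sent to the DNS server listed above.
Domains=~corp.internal
```

If the tunnel is not networkd-managed, the equivalent runtime commands are resolvectl dns tun0 10.8.0.1 and resolvectl domain tun0 '~corp.internal'.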

Docker Swarm mode and overlay network DNS

Docker Swarm introduces a second layer of DNS on top of the per-host embedded resolver. Each Swarm overlay network gets its own virtual IP (VIP) resolver that handles service discovery across nodes. When a container in a Swarm service queries another service name, the request goes to the embedded resolver at 127.0.0.11, which recognizes Swarm service names and returns the VIP. External queries are then forwarded upstream as on any other user-defined network. The failed to query external DNS server error can appear in Swarm with two distinct causes that are worth separating.

The first is the same systemd-resolved stub leak described above -- the embedded resolver on each Swarm node inherits the host's upstream resolvers, and if those resolvers are loopback addresses, external queries fail on every node. The fix is the same: set explicit upstream resolvers in daemon.json on each node in the Swarm. There is no central place to configure DNS for an entire Swarm cluster; each node must be configured individually.

The second is a documented interaction between Swarm's overlay network iptables rules and port publishing. When a service has a published port (via --publish in docker service create or ports: in a Compose stack deployed as a Swarm stack), Swarm adds ingress routing rules to handle load balancing across the routing mesh. On some Docker versions and kernel configurations, these ingress iptables rules conflict with the MASQUERADE rules Docker needs for outbound traffic from those same containers, producing the write: operation not permitted variant of this error. The affected service can resolve internal service names via VIP but cannot reach external DNS servers at all.

To confirm which scenario you are in, test DNS from inside a Swarm task container rather than from a standalone container:

terminal
# Find the task container ID for a running Swarm service
$ docker ps --filter "name=myservice" --format "{{.ID}}"

# Exec into it and test both internal and external resolution
$ docker exec -it <task-container-id> nslookup myservice
$ docker exec -it <task-container-id> nslookup sudowheel.com

# If external fails but internal succeeds, it is the Swarm iptables conflict
# Restart Docker on the affected node to rebuild the ingress chain
$ sudo systemctl restart docker

If restarting Docker resolves the external DNS failure for a Swarm service with published ports but the error returns after containers are rescheduled, the root issue is the interaction between the ingress network rules and the MASQUERADE chain. Upgrading Docker Engine to the latest patch release is the most reliable fix, as this interaction has been corrected in several successive Moby releases. If you cannot upgrade immediately, deploying the service without a published port temporarily can confirm whether the port mapping is triggering the conflict.

nftables migration conflicts on newer kernels

On Linux distributions that have migrated from iptables to nftables as the default packet filter framework -- including Debian 11+, Fedora 30+, and RHEL 8+ -- Docker's iptables rules and the system's nftables ruleset can coexist incorrectly. Docker uses iptables (specifically the iptables-legacy backend) to write its DNAT and MASQUERADE rules. If the system's firewall is managed through nftables directly, those iptables rules may not take effect in the right table priority, causing DNS traffic from containers to be silently dropped before it reaches Docker's embedded resolver or before forwarded queries can exit the host.

The diagnostic symptom is subtle: docker run --rm alpine nslookup sudowheel.com fails, yet sudo iptables -t nat -L POSTROUTING -n | grep MASQUERADE shows Docker's rule present. The rule is present -- but in the wrong iptables backend. Verify with sudo iptables-legacy -t nat -L POSTROUTING -n and compare the output. If Docker's MASQUERADE rule only appears in the legacy backend while your firewall is using the nft backend, you have a split-table conflict. The fix is to ensure Docker's iptables backend matches the system's active backend, or to disable Docker's built-in iptables management and write equivalent nftables rules manually.
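
A quick way to see which backend the iptables binary on PATH actually is. This helper is an illustrative sketch that assumes the Debian/Ubuntu alternatives layout, where iptables is a symlink chain ending in a legacy or nft wrapper:

```shell
# Report which iptables backend a given binary path resolves to. Assumes
# the Debian/Ubuntu alternatives layout, where /usr/sbin/iptables is a
# symlink chain ending in a legacy or nft (xtables-nft-multi) wrapper.
iptables_backend() {
  target="$(readlink -f "$1" 2>/dev/null)" || { echo "unknown"; return; }
  case "$target" in
    *nft*)    echo "nft" ;;
    *legacy*) echo "legacy" ;;
    *)        echo "unknown" ;;
  esac
}

iptables_backend "$(command -v iptables || echo /usr/sbin/iptables)"
```

On Debian-family systems, sudo update-alternatives --config iptables switches the backend; restart Docker afterwards so its rules are rewritten through the newly selected backend. Alternatively, set "iptables": false in daemon.json and manage equivalent nftables rules yourself.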

Internal network containers leaking DNS via host loopback (CVE-2024-29018)

This root cause is distinct from the others because it combines the failed to query external DNS server error with a security implication. When a container is attached only to a Docker internal network (created with the --internal flag or the internal: true attribute in a Compose file), that network is supposed to be fully isolated from external traffic. Prior to Moby versions 26.0.0, 25.0.4, and 23.0.11, Docker's embedded DNS resolver bypassed that isolation entirely for DNS queries.

The mechanism: when the host's /etc/resolv.conf pointed to a loopback address such as 127.0.0.53, the resolver forwarded queries to that address via the host's loopback device -- completely outside the container's network namespace -- rather than routing the query through the container's isolated namespace. This meant a container on an internal network, which should have had no external connectivity, could reach external authoritative DNS servers. The Moby project security advisory noted that an attacker with a foothold in a compromised container could encode exfiltrated data in DNS subdomains and have them answered by attacker-controlled authoritative nameservers, even on supposedly isolated networks.

The vulnerable path ran: container on --internal network → host loopback (127.0.0.53) → external authoritative nameserver. The embedded resolver forwarded DNS queries from --internal containers via the host loopback, completely escaping the container's namespace. A container that should have had zero external network access could reach attacker-controlled authoritative nameservers by encoding data in DNS subdomain queries -- a technique called DNS tunneling. The isolation guarantee of --internal networks was meaningless for DNS traffic.
Security: CVE-2024-29018 -- Update Your Docker Engine

If you are running Docker Engine older than 26.0.0, 25.0.4, or 23.0.11 with any --internal networks, you are exposed to DNS-based data exfiltration from those networks. The vulnerability exists because dockerd forwards DNS queries via the host loopback, bypassing container network namespace isolation entirely. The fix is to update Docker Engine. As a workaround, run containers on internal networks with a custom --dns upstream address, which forces DNS resolution through the container's own namespace where it is correctly isolated. Source: Moby security advisory GHSA-mq39-4gv4-mvpx.

On patched Docker versions, containers on internal networks no longer forward DNS queries to external servers. If you upgraded Docker and noticed that containers on internal networks suddenly cannot resolve external names, that is the patch working as intended -- strictly speaking, they should never have been able to. Check your network design: anything that needs external DNS resolution should not be on an internal network.
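
The advisory's workaround -- forcing resolution through the container's own namespace with a custom upstream -- looks like this in Compose form. The address 10.0.0.2 is a placeholder for a resolver actually reachable on your internal network, if you intend internal resolution to work at all:

```yaml
services:
  app:
    image: myapp:latest
    dns:
      - 10.0.0.2    # placeholder: queries now travel inside the container namespace
    networks:
      - backend

networks:
  backend:
    internal: true  # no external route; DNS is now correctly confined as well
```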

Embedded DNS resolver returning SERVFAIL for HTTPS and SVCB record types

Docker's embedded resolver at 127.0.0.11 handles A and PTR record types reliably but does not understand newer record types including HTTPS (type 65) and SVCB (type 64), which are used by modern browsers and HTTP/3-capable clients. When a browser like Chrome sends a DNS query for a container-served hostname, it automatically issues both an A record query and an HTTPS record query in parallel. Docker's embedded resolver responds correctly to the A query but returns SERVFAIL for the HTTPS query. Some clients interpret that SERVFAIL as a complete DNS failure, refusing to load the page even though the IP address was returned successfully.

The symptom is specific: a service resolves correctly when queried with nslookup or curl from the command line but fails to load in a browser, and inspecting the browser's network tab shows a DNS error rather than a connection error. If this matches your situation, the fix is not in daemon.json -- it requires placing a full-featured resolver like CoreDNS in front of Docker's embedded resolver, configured to forward A queries to 127.0.0.11 and handle HTTPS and SVCB queries with appropriate responses rather than SERVFAIL.

The standard approach is to run CoreDNS as a container on the same Docker network, give it a static IP, and point containers at it via the dns key in your Compose file or in daemon.json. CoreDNS forwards container-name queries and external A queries to 127.0.0.11 as normal, but responds to HTTPS and SVCB queries with NOERROR and an empty answer rather than SERVFAIL, which satisfies browsers without disrupting resolution. A minimal Corefile to accomplish this looks like:

Corefile
# Forward all queries to Docker's embedded resolver
# and suppress SERVFAIL for unsupported record types
.:53 {
    errors
    health
    ready
    # Return NOERROR with empty answer for HTTPS and SVCB instead of SERVFAIL
    template ANY HTTPS {
        rcode NOERROR
    }
    template ANY SVCB {
        rcode NOERROR
    }
    # Forward everything else to Docker's embedded resolver
    forward . 127.0.0.11
    cache 30
    loop
    reload
    loadbalance
}

Run CoreDNS as a container alongside your services. Give it a predictable IP on the Docker network and point your other containers at it:

docker-compose.yml (CoreDNS alongside your services)
services:
  coredns:
    image: coredns/coredns:latest
    volumes:
      - ./Corefile:/Corefile:ro
    networks:
      appnet:
        ipv4_address: 172.28.0.2

  myapp:
    image: myapp:latest
    dns:
      - 172.28.0.2    # CoreDNS container IP
    networks:
      - appnet

networks:
  appnet:
    driver: bridge
    ipam:
      config:
        - subnet: 172.28.0.0/16

The template plugin intercepts HTTPS (type 65) and SVCB (type 64) queries before they reach the forward directive, returning an empty NOERROR response. Browsers treat an empty NOERROR as "this record type does not exist" rather than a DNS failure, which is the correct behavior. The A record lookup proceeds normally through the forward path and the browser connects successfully. The cache 30 directive adds a 30-second TTL cache, which reduces the number of upstream queries to Docker's embedded resolver and can measurably improve container startup performance when many containers resolve the same names at once.

Docker Desktop on macOS and Windows

Docker Desktop runs containers inside a Linux virtual machine managed by Docker itself. The entire iptables and systemd-resolved layer described in this article lives inside that VM, not on your Mac or Windows host. This changes the failure modes significantly.

On macOS, the most common cause of this error is DNS filtering software -- Little Snitch, Pi-hole configured as the system resolver, corporate MDM profiles that intercept DNS -- intercepting or blocking the queries the VM sends out through the host's network stack. The VM cannot reach the upstream resolver because the host-side filter is dropping or redirecting the packet. The fix is to explicitly allow DNS egress for the Docker Desktop process in your filter, or to set a public resolver in Docker Desktop's settings under Resources > Network rather than in a daemon.json on disk.

On Windows with WSL 2, Docker Desktop uses a Hyper-V utility VM and DNS is proxied through the Windows host's resolver. If the Windows host uses a VPN that changes DNS settings, containers may lose resolution when the VPN connects or disconnects. The fix mirrors the Linux approach -- set explicit upstream resolvers -- but the daemon.json location on Windows is %USERPROFILE%\.docker\daemon.json and changes take effect after restarting Docker Desktop from the system tray, not via systemctl.

docker build fails to resolve names while running containers work fine

This is a frequently overlooked gap that trips up engineers who have already fixed DNS for their running Compose services. When docker build executes a layer that installs packages -- apt-get update, pip install, npm install -- and DNS resolution fails only during the build, not during docker run, the cause is almost always the network difference between build and runtime.

By default, docker build runs its build container on the default bridge network (docker0), not on a user-defined network. The default bridge does not use Docker's embedded DNS resolver at 127.0.0.11. Instead, it injects a filtered copy of the host's /etc/resolv.conf directly into the build container, stripping loopback addresses. If the host's resolv.conf contains only 127.0.0.53 and nothing else, the filtering leaves no usable nameservers, and Docker falls back to its hardcoded public defaults (8.8.8.8 and 8.8.4.4) with a warning in the daemon log -- exactly the addresses that corporate and egress-filtered networks are most likely to block, so builds still fail.

Meanwhile, your running Compose services work because Compose always creates a user-defined bridge, which activates the embedded resolver. That resolver forwards to whatever you set in daemon.json. The fix is the same: set a dns array in /etc/docker/daemon.json. That setting applies both to the embedded resolver on user-defined networks and to the resolv.conf injected into default-bridge containers. If you set "dns": ["8.8.8.8", "1.1.1.1"], the build container receives those addresses rather than the filtered host file.

Alternatively, pass --network=host to a specific build if you are on a trusted development machine and want the build to use the host's full resolver stack directly. Do not use --network=host in production CI environments or with untrusted build inputs, as it removes network isolation from the build process entirely.

This behavior extends to multi-stage builds. Each FROM stage in a multi-stage Dockerfile runs as a separate build container, all on the default bridge, all subject to the same DNS rules. If stage 1 downloads a base image layer and stage 2 runs npm install, both stages need working DNS. The daemon.json fix covers all stages uniformly -- once set, every stage in every build on that host picks up the configured upstream resolvers. There is no per-stage DNS override syntax in a Dockerfile; the fix must be at the daemon level or via a --network flag passed to the entire docker build command, not per stage.

CI/CD environments: ephemeral runners and absent systemd

CI/CD runners -- GitHub Actions hosted runners, GitLab CI shared runners, CircleCI machine executors -- introduce a distinct set of DNS failure conditions that differ from both desktop and server deployments. The failure mode depends on the runner type.

Hosted runners (GitHub Actions ubuntu-latest, GitLab shared runners). These are fresh Linux VMs provisioned per job. They have Docker pre-installed but systemd-resolved is either not running or not the primary resolver. The /etc/resolv.conf file on these VMs typically points to a hypervisor-level stub resolver at a cloud-specific address rather than at 127.0.0.53. On most GitHub Actions runners the resolver is at 168.63.129.16 (Azure) or a similar platform address. Docker inherits this address and containers may fail to reach it for the same reason cloud VMs fail -- the hypervisor route is not accessible from inside a bridge namespace. The recommended fix for CI pipelines is to add an explicit daemon.json configuration step at the start of the job, before any Docker commands run:

.github/workflows/build.yml (relevant job step)
steps:
  - name: Configure Docker DNS
    run: |
      # Write daemon.json before any Docker commands
      sudo mkdir -p /etc/docker
      echo '{"dns":["8.8.8.8","1.1.1.1"],"dns-opts":["ndots:1","timeout:2","attempts:2"]}' \
        | sudo tee /etc/docker/daemon.json
      sudo systemctl restart docker

  - name: Build image
    run: docker build -t myimage .

Self-hosted runners on VMs or bare metal. Self-hosted runners persist between jobs and have the same DNS configuration issues as any server Docker deployment. The systemd-resolved stub leak and VPN conflicts described in this article apply in full. Because runners are long-lived, a DNS misconfiguration can affect every job that runs on that machine until the runner is reconfigured. Add the same daemon.json DNS settings and restart Docker as a one-time setup step when provisioning the runner, not as a per-job step.

Runners inside Docker containers (Docker-in-Docker). When the CI job itself runs inside a Docker container and invokes Docker commands via the Docker socket or via DinD, there are two nested DNS layers: the outer container's resolver and the inner daemon's resolver. The outer container gets its DNS from the CI platform's host configuration. The inner Docker daemon reads its own /etc/resolv.conf -- which inside a container is what Docker injected there. If the outer container's resolver is 127.0.0.11 (user-defined network) and that address is what gets inherited by the inner daemon, the inner daemon tries to forward queries to the outer container's embedded resolver. Whether that works depends on the network topology of the DinD setup. The reliable fix is to pass explicit DNS servers to the inner daemon via its own daemon.json, the same as any other deployment.

Testing DNS in CI Before It Breaks a Build

Add a one-line DNS smoke test early in any CI job that does network-dependent work. If it fails, it fails fast with a clear error message instead of burying a DNS failure inside a package install log. Place it immediately after the daemon.json configuration step: docker run --rm alpine nslookup sudowheel.com || (echo "Docker DNS is broken -- check daemon.json on this runner" && exit 1).

Reading the Error Message Precisely

The full error log line has four fields worth parsing individually. Each one tells you something specific about where the failure occurred.

annotated error line
level=error msg="[resolver] failed to query external DNS server" \
  client-addr="udp:172.18.0.7:49663" \
  dns-server="udp:130.221.128.2:53" \
  error="read udp 172.18.0.7:49663->130.221.128.2:53: i/o timeout" \
  question=";sudowheel.com. IN A"
client-addr — which container sent this query

The bridge IP and ephemeral UDP port of the container that originated the lookup. The IP (172.18.0.x) maps to a specific container — cross-reference with docker inspect to find which service is responsible. If the same IP appears across thousands of log lines, that one container is the source of the flood. The port changes every query.

Useful for: tracing a log flood to a specific container or service rather than treating the error as a global daemon problem.
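When the log flood is large, a short pipeline over the daemon log groups errors by client-addr so the noisiest container IP surfaces immediately. The sketch below runs against an inline sample; in practice, pipe in journalctl -u docker output instead (the sample lines and counts here are illustrative):

```shell
# Sample daemon log lines (illustrative); in practice: journalctl -u docker --no-pager
log='level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.18.0.7:49663"
level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.18.0.7:49712"
level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.18.0.5:41002"'

# Count errors per container IP, noisiest first
echo "$log" | grep -o 'client-addr="udp:[0-9.]*' | cut -d: -f2 | sort | uniq -c | sort -rn
```

In this sample, 172.18.0.7 tops the list with two entries; feed that IP to docker inspect to identify which service is responsible.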
dns-server — your primary diagnostic indicator

The upstream resolver Docker's embedded resolver tried to reach. Read this field first:

127.0.0.53 or 127.0.0.1 → systemd-resolved namespace collision. Set explicit DNS in daemon.json.
Link-local IP (169.254.x.x) → cloud metadata or hypervisor resolver unreachable from the bridge namespace.
Corporate or VPN IP → VPN DNS server unreachable when VPN routes do not extend into the bridge.

Useful for: immediately classifying the root cause without reading further. This one field narrows you to one of three diagnostic paths.
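That triage can be expressed as a small shell helper -- a sketch, with the labels and the private-range patterns chosen here purely for illustration:

```shell
# Classify the dns-server field into one of the diagnostic paths above
classify_dns_server() {
  case "$1" in
    127.0.0.53|127.0.0.1)  echo "loopback: systemd-resolved namespace collision" ;;
    169.254.*)             echo "link-local: cloud metadata/hypervisor resolver" ;;
    10.*|192.168.*|172.1[6-9].*|172.2?.*|172.3[01].*) echo "private: corporate or VPN resolver" ;;
    *)                     echo "public resolver: check routing and iptables" ;;
  esac
}

classify_dns_server 127.0.0.53    # loopback case
classify_dns_server 169.254.169.254
classify_dns_server 8.8.8.8
```

Run it against the address from your own log line to pick the fix path before reading further.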
error — kernel-level failure description

Two values matter here and they indicate very different problems:

i/o timeout — packet was sent successfully but no reply arrived. Server is unreachable, the address doesn't exist in this namespace, or the server is silently dropping queries.

write: operation not permitted — kernel rejected the send before the packet left. An iptables DROP rule is blocking it. Docker's MASQUERADE or FORWARD rules are likely missing. Restart Docker first.

Useful for: routing between two separate fix paths. i/o timeout = address/routing problem. operation not permitted = iptables problem. Don't mix up the fixes.
question — the DNS query that failed

Written in zone file notation: ;hostname. IN recordtype. The leading semicolon is how dig-style output marks the question section. IN is the Internet class. The record type reveals the specific failure:

IN A — standard IPv4 address lookup. Most common.
IN HTTPS (type 65) — Docker's resolver returns SERVFAIL for this. Modern browsers send it automatically alongside A queries. If you see this, the fix is CoreDNS in front of Docker's resolver, not daemon.json.

Useful for: distinguishing the HTTPS record type gap (browser-specific failure, CoreDNS fix) from ordinary A record failures (daemon.json fix). Also identifies which service is spamming queries.
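Extracting the question field from a raw log line is a one-pipeline job; here is a sketch against an inline sample line:

```shell
line='level=error msg="[resolver] failed to query external DNS server" question=";sudowheel.com. IN A"'

# Pull out just the query: hostname, class, record type
q=$(echo "$line" | grep -o 'question=";[^"]*"' | sed 's/question=";//; s/"$//')
echo "$q"
```

Pipe the output through sort | uniq -c across many log lines to see which names and record types dominate the flood.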

The dns-server field is your primary diagnostic indicator. If it shows a loopback address (127.0.0.1, 127.0.0.53), the problem is almost certainly the systemd-resolved namespace isolation issue. If it shows a valid external IP that your host can reach but containers cannot, the problem is a routing or iptables issue. If it shows an internal corporate IP, a VPN-related DNS configuration problem is likely.

i/o timeout
  Meaning:     packet was sent -- no reply arrived
  Root cause:  upstream server unreachable, loopback namespace mismatch, cloud metadata address not routable through the container bridge, or VPN DNS servers offline
  First step:  check the daemon.json dns array; verify the upstream is reachable from the host

operation not permitted
  Meaning:     kernel rejected the send outright
  Root cause:  iptables DROP rule blocking the packet, Docker's MASQUERADE or FORWARD rules flushed, or ufw interfering with Docker's chains
  First step:  restart Docker to recreate iptables rules; run iptables -L FORWARD -n

Diagnosing the Problem Systematically

Work through these steps in order. Each one either confirms a hypothesis or rules it out.

Confirm the host's own DNS works

$ nslookup sudowheel.com

If the host itself cannot resolve external names, you have a host-level DNS problem that Docker is inheriting. Fix the host first. Docker cannot compensate for a broken host resolver.

Test DNS from a fresh container

$ docker run --rm alpine nslookup sudowheel.com

This is the fastest single-command test. It spins up a container on the default bridge, attempts an external lookup, and exits. If it succeeds, the problem is specific to a particular container or user-defined network. If it fails, Docker's default DNS path is broken.

Inspect what the container sees as its resolver

$ docker run --rm alpine cat /etc/resolv.conf

Look at the nameserver line. If it shows 127.0.0.53, you have the systemd-resolved stub leak. If it shows a public IP that should be reachable, the issue is routing or iptables. If it shows 127.0.0.11, the container is on a user-defined network and the embedded resolver's upstreams need investigation.

Find what the embedded resolver is actually using as its upstreams

terminal
# On a systemd-resolved host, check what the actual upstream servers are
$ resolvectl status | grep -A5 "DNS Servers"

# Check what Docker would read for upstream resolvers
$ cat /run/systemd/resolve/resolv.conf

# Compare with the stub resolver file
$ cat /etc/resolv.conf

Inspect the network's DNS configuration directly

docker network inspect shows the DNS servers and options that Docker has configured for a specific network -- without needing to exec into a running container. This is the fastest way to confirm what the embedded resolver is actually using as its upstreams, and it works even when no containers are running on the network.

terminal
# List all Docker networks to find the one in question
$ docker network ls

# Inspect a specific network -- look for the DNS configuration block
$ docker network inspect bridge

# Filter to just the DNS-relevant fields
$ docker network inspect bridge --format '{{json .Options}}'

# For a Compose-created network (named by convention: projectname_networkname)
$ docker network inspect myproject_default --format '{{json .Options}}'

In the full inspect output, look for the com.docker.network.bridge.enable_ip_masquerade option -- if it is absent or set to false, containers on this network cannot route traffic out to upstream DNS servers. Also check the IPAM section for the subnet range; if the subnet overlaps with a VPN or corporate network range, traffic from containers may be routed to the wrong destination. The docker network inspect output is the authoritative view of what Docker has configured -- compare it against what you expect before diving into iptables or packet captures.

Check iptables for Docker's forwarding rules

terminal
# Check that Docker's MASQUERADE rule exists in the nat table
$ sudo iptables -t nat -L POSTROUTING -n | grep MASQUERADE

# Verify the FORWARD chain allows Docker traffic
$ sudo iptables -L FORWARD -n | grep DOCKER

# Check if port 53 traffic is explicitly blocked anywhere
$ sudo iptables -L -n | grep "dpt:53"

If the MASQUERADE rule for Docker's subnet is absent, containers cannot reach external IP addresses at all, including DNS servers. Restarting Docker recreates these rules.

Confirm packets are actually leaving the host with tcpdump

When you have confirmed the resolver address and iptables rules look correct but the error persists, the next question is whether the DNS packet is actually reaching the wire. nslookup and the daemon log tell you what Docker tried to do -- they do not tell you whether the packet made it past the kernel. tcpdump answers that directly.

Run this on the host while simultaneously triggering a DNS lookup from inside a container in a second terminal:

terminal -- two separate sessions
# Terminal 1: capture DNS traffic on the host's primary interface
$ sudo tcpdump -i any -n port 53

# Terminal 2: trigger a lookup from a container
$ docker run --rm alpine nslookup sudowheel.com

What you see in the tcpdump output tells you exactly where the failure is:

  • No output at all: the packet never left the Docker bridge -- iptables is dropping it before it exits the host. Check FORWARD and nat POSTROUTING rules.
  • Query packet appears, no reply: the packet reached the wire and the upstream server is not responding. The upstream address is unreachable -- either blocked by a firewall, wrong address, or the server is down. Try a different upstream.
  • Query and reply both appear: the round trip is completing at the network level. The error in the daemon log is happening at a different layer -- possibly a parse error or a SERVFAIL from the upstream. Use dig with the exact server address to inspect the reply directly.
terminal
# Use dig to query the upstream server directly and inspect the full reply
$ dig @8.8.8.8 sudowheel.com A

# Test whether the upstream server is reachable from the host at all
$ dig @8.8.8.8 sudowheel.com A +time=3 +tries=1

# Test from inside a container with a specific upstream (bypasses embedded resolver)
$ docker run --rm alpine nslookup sudowheel.com 8.8.8.8

Validate daemon.json syntax before restarting Docker

A common failure mode when applying the daemon.json fixes in this article is that the file already exists with other settings, and adding the DNS entries produces invalid JSON -- a misplaced comma, mismatched brackets, or a duplicate key. Docker silently falls back to defaults or refuses to start when daemon.json is malformed, which can look like the fix did not work.

terminal
# Check whether daemon.json already exists before editing
$ cat /etc/docker/daemon.json

# Validate the JSON syntax before restarting Docker
$ python3 -m json.tool /etc/docker/daemon.json

# If Docker fails to start after your edit, check the daemon error
$ sudo journalctl -u docker --since "5 min ago" | grep -i "error\|failed\|invalid"

If daemon.json already contains a "dns" key and you add another one, Docker uses whichever it encounters last -- but many JSON parsers reject duplicates entirely. Merge the entire file into a single valid object. A correct merge of DNS settings with an existing log-level setting looks like this:

/etc/docker/daemon.json (merged, not appended)
{
  "log-level": "warn",
  "dns": ["8.8.8.8", "1.1.1.1"],
  "dns-opts": ["ndots:1", "timeout:3", "attempts:3"]
}
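A safe way to rehearse the validation step before touching the live file is to write the candidate config to a scratch path and run it through python3 -m json.tool first (the temporary path below is arbitrary):

```shell
# Write the candidate config to a scratch file, not /etc/docker/daemon.json
CANDIDATE=$(mktemp)
cat > "$CANDIDATE" <<'EOF'
{
  "log-level": "warn",
  "dns": ["8.8.8.8", "1.1.1.1"],
  "dns-opts": ["ndots:1", "timeout:3", "attempts:3"]
}
EOF

# Only promote the file to /etc/docker/daemon.json if it parses as valid JSON
python3 -m json.tool "$CANDIDATE" > /dev/null && echo "valid JSON" || echo "invalid JSON"
```

Only copy the scratch file into place and restart Docker once the validator prints "valid JSON".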

Fixes by Root Cause

Set explicit DNS servers in daemon.json

This is the most reliable fix for the majority of cases. It bypasses the host's resolver configuration entirely and gives Docker's embedded resolver a working set of upstream servers that are reachable from container network namespaces.

/etc/docker/daemon.json
{
  "dns": ["8.8.8.8", "1.1.1.1"],
  "dns-opts": ["ndots:1", "timeout:3", "attempts:3"]
}

After saving this file, restart the Docker daemon:

$ sudo systemctl restart docker

Then verify with a fresh container test:

$ docker run --rm alpine nslookup sudowheel.com
What ndots:1 means and why it matters here

The dns-opts entry ndots:1 controls how the resolver decides whether a hostname should be treated as fully qualified. With a high value such as ndots:5 (the Kubernetes convention, sometimes inherited into containers via the host's resolv.conf), a query for api.example.com -- only two dots -- is first tried against every configured search domain before being sent as-is, and each attempt is a separate upstream query that can fail and generate a separate log line. Setting ndots:1 means any name with at least one dot is sent directly to the upstream resolver without search-domain expansion. On hosts where this error is flooding logs, a high ndots value is frequently a multiplier: one container request generates five or more upstream queries, all failing, all logged. If your containers are not in a Kubernetes environment, ndots:1 is almost always the right setting.

The other two entries in the recommended dns-opts array have their own effect on how failures behave. timeout:3 sets the per-query timeout in seconds before the resolver gives up waiting for a response from one upstream server and either tries a fallback or returns an error. The default is 5 seconds; reducing it to 3 means a broken primary resolver causes a 3-second delay per query rather than 5, which significantly reduces the impact on application startup when the upstream is unreachable. attempts:3 controls how many times the resolver retries each upstream server before declaring it failed. The default is 2; setting it to 3 gives flaky upstreams an extra chance before the error is logged. Together, timeout:3 and attempts:3 mean each upstream can take up to 9 seconds of combined retry time (3 seconds × 3 attempts) before the resolver moves on. If you have two upstream servers listed and both are broken, the maximum delay before a query fails entirely is 18 seconds. For most environments, timeout:2 and attempts:2 is a more aggressive setting that surfaces failures faster while still providing one retry for transient packet loss.
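The worst-case timing arithmetic from the paragraph above, spelled out (the values match the recommended settings; adjust for your own configuration):

```shell
timeout=3    # seconds per query attempt
attempts=3   # attempts per upstream server
servers=2    # entries in the dns array

per_server=$((timeout * attempts))    # max wait per upstream server
worst_case=$((per_server * servers))  # max wait before the query fails entirely
echo "per-upstream: ${per_server}s, total worst case: ${worst_case}s"
```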

These options are written into each container's /etc/resolv.conf on the options line alongside any search domains. If you inspect a container's resolver config after applying these settings, you will see them directly:

inside a container -- cat /etc/resolv.conf
# What a container on a user-defined network sees after applying recommended daemon.json settings
nameserver 127.0.0.11
options ndots:1 timeout:3 attempts:3

# The search line is injected by Docker from the host's resolv.conf search domains.
# It varies by environment.
search localdomain
Corporate and VPN Networks

If you are on a corporate network or connected to a VPN that uses internal DNS for private domains, using only public resolvers like 8.8.8.8 will break resolution of internal hostnames. Include your corporate or VPN DNS server as the first entry in the dns array and fall back to a public resolver. If the VPN DNS changes when you connect, you will need to update daemon.json and restart Docker each time, or use the networkd-dispatcher or NetworkManager approach described in Fix 4.

Use the actual upstream resolvers from systemd-resolved

If you want Docker to use the same DNS servers that the host is using -- but cannot use 127.0.0.53 directly -- point Docker at the actual upstream addresses that systemd-resolved is forwarding to.

terminal
# Find the actual upstream DNS server addresses
$ resolvectl status | grep "DNS Servers"
  DNS Servers: 192.168.1.1 8.8.8.8

# Or read the non-stub resolv.conf directly
$ cat /run/systemd/resolve/resolv.conf
nameserver 192.168.1.1
nameserver 8.8.8.8

Then use those addresses in daemon.json:

/etc/docker/daemon.json
{
  "dns": ["192.168.1.1", "8.8.8.8"]
}

Alternatively, you can tell Docker to read from the non-stub resolv.conf file directly rather than from the default /etc/resolv.conf:

/etc/docker/daemon.json
{
  "resolv-conf": "/run/systemd/resolve/resolv.conf"
}

This approach means Docker reads the upstream resolver list fresh each time the daemon starts, which is useful on systems where the DNS servers change dynamically via DHCP.

Restore or verify iptables rules

If the error variant is operation not permitted rather than i/o timeout, or if containers cannot reach external IP addresses at all, the iptables rules are the likely culprit.

terminal
# Restart Docker to recreate its iptables rules from scratch
$ sudo systemctl restart docker

# Verify Docker's nat rules were recreated
$ sudo iptables -t nat -L -n | grep MASQUERADE
MASQUERADE  all  --  172.17.0.0/16        0.0.0.0/0

# If using ufw, explicitly allow DNS outbound
$ sudo ufw allow out 53/udp
$ sudo ufw allow out 53/tcp
ufw and Docker do not cooperate by default

ufw manages iptables using its own ruleset. Docker also manages iptables. When they coexist on the same host, ufw can interfere with Docker's FORWARD chain, causing container traffic to be dropped. The standard fix is to add Docker's subnets to ufw's allowed forwarding rules, or to set DEFAULT_FORWARD_POLICY="ACCEPT" in /etc/default/ufw. Never set "iptables": false in daemon.json to work around this -- that will also break Docker's embedded DNS DNAT rules and container name resolution on user-defined networks.
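The DEFAULT_FORWARD_POLICY change is a one-line edit to /etc/default/ufw. The sketch below applies the substitution to a temporary copy so you can verify it before running the same sed against the real file with sudo:

```shell
# Work on a scratch copy first; the real file is /etc/default/ufw
UFW_COPY=$(mktemp)
echo 'DEFAULT_FORWARD_POLICY="DROP"' > "$UFW_COPY"

# Flip the forward policy so Docker's bridge traffic is not dropped
sed -i 's/DEFAULT_FORWARD_POLICY="DROP"/DEFAULT_FORWARD_POLICY="ACCEPT"/' "$UFW_COPY"
cat "$UFW_COPY"
```

After applying the same edit to the real file, reload the firewall with sudo ufw reload.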

Per-container DNS override

If you cannot or do not want to change the daemon-wide configuration, you can override DNS on a per-container basis using the --dns flag:

$ docker run --rm --dns 8.8.8.8 --dns 1.1.1.1 alpine nslookup sudowheel.com

In a Compose file, the same setting is expressed under the service definition:

docker-compose.yml
services:
  myapp:
    image: myapp:latest
    dns:
      - 8.8.8.8
      - 1.1.1.1
    dns_opt:
      - ndots:1
      - timeout:3

Make systemd-resolved listen on the Docker bridge interface

This is the architecturally correct solution for hosts where you want containers to use the full systemd-resolved feature set including split DNS for VPN and LLMNR. The problem is that systemd-resolved only listens on the loopback interface by default. The solution is to configure it to also listen on the Docker bridge address.

/etc/systemd/resolved.conf.d/docker.conf
[Resolve]
# Allow systemd-resolved to answer queries from Docker's bridge
# Replace 172.17.0.1 with your actual docker0 bridge address
DNSStubListenerExtra=172.17.0.1

Then find your Docker bridge address and configure Docker to use it:

terminal
# Find the docker0 bridge address
$ ip addr show docker0 | grep "inet "
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0

# Restart systemd-resolved to apply the new listener config
$ sudo systemctl restart systemd-resolved

Then set daemon.json to use the bridge address:

/etc/docker/daemon.json
{
  "dns": ["172.17.0.1"]
}

This approach preserves all of systemd-resolved's routing intelligence -- split DNS domains, per-interface resolvers, VPN-aware forwarding -- while making it accessible to containers. It is the recommended approach for development workstations and laptops that frequently switch networks and VPNs.

Automate daemon.json updates when DNS changes with networkd-dispatcher

For development machines that connect to multiple networks -- corporate VPNs, home routers, coffee shop Wi-Fi -- manually updating daemon.json on every network change is unsustainable. The right solution is to hook into the network event system and regenerate Docker's DNS configuration automatically whenever the upstream resolvers change.

On systemd-based hosts, networkd-dispatcher fires scripts when network interfaces change state. You can place a script in /etc/networkd-dispatcher/routable.d/ that reads the current upstream resolvers and writes a fresh daemon.json before restarting Docker:

/etc/networkd-dispatcher/routable.d/50-docker-dns
#!/bin/bash
# Regenerate Docker DNS config when network comes up
# Reads actual upstream resolvers from systemd-resolved

UPSTREAM=$(resolvectl status 2>/dev/null | awk '/DNS Servers/{for(i=3;i<=NF;i++) printf "%s ", $i; print ""}' | tr ' ' '\n' | grep -v '^$' | grep -v '^127\.' | head -3 | tr '\n' ',' | sed 's/,$//')

if [[ -z "$UPSTREAM" ]]; then
    # Fallback to public resolvers if no non-loopback upstream found
    UPSTREAM="8.8.8.8,1.1.1.1"
fi

# Build the dns array from the comma-separated list
# Build the dns array from the comma-separated list
DNS_JSON=$(echo "$UPSTREAM" | awk -F',' '{for(i=1;i<=NF;i++) printf "\"%s\"%s", $i, (i<NF ? "," : "")}')

cat > /etc/docker/daemon.json <<EOF
{
  "dns": [${DNS_JSON}],
  "dns-opts": ["ndots:1", "timeout:3", "attempts:3"]
}
EOF

systemctl reload-or-restart docker

Make the script executable with chmod +x /etc/networkd-dispatcher/routable.d/50-docker-dns. The same pattern works with NetworkManager using a script in /etc/NetworkManager/dispatcher.d/. This turns DNS management from a manual chore into an automatic consequence of network state changes -- which is how it should work.
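To sanity-check the awk transformation used in the dispatcher script in isolation, run it against a fixed input before installing the hook (the sample addresses are arbitrary):

```shell
UPSTREAM="192.168.1.1,8.8.8.8"

# Convert a comma-separated list into a quoted JSON array body
DNS_JSON=$(echo "$UPSTREAM" | awk -F',' '{for(i=1;i<=NF;i++) printf "\"%s\"%s", $i, (i<NF ? "," : "")}')
echo "{\"dns\": [${DNS_JSON}]}"
```

The output should be a valid JSON object with each address individually quoted and no trailing comma.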

Restarting Docker Has Service Impact

Restarting the Docker daemon stops all running containers unless "live-restore": true is set in daemon.json. With live-restore enabled, the daemon can be restarted without stopping containers -- they continue running and reconnect to dockerd when it comes back up. Enable it by adding the key to your daemon.json before you need it:

/etc/docker/daemon.json
{
  "live-restore": true,
  "dns": ["8.8.8.8", "1.1.1.1"],
  "dns-opts": ["ndots:1", "timeout:3", "attempts:3"]
}

Note that live-restore does not apply to the first restart you use to enable it -- containers are still stopped during that one transition. Once it is in place, subsequent daemon restarts leave containers running. For production servers, configure live-restore before you need to restart Docker for DNS changes, or implement a smarter script that compares the current DNS list against the desired list and only restarts Docker when they differ.
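A minimal sketch of that comparison, run here against a scratch file rather than the real /etc/docker/daemon.json (the grep-based extraction assumes the dns array sits on one line; a JSON-aware comparison is more robust in practice):

```shell
CONF=$(mktemp)   # stand-in for /etc/docker/daemon.json
printf '%s\n' '{ "dns": ["8.8.8.8", "1.1.1.1"] }' > "$CONF"

desired='"dns": ["8.8.8.8", "1.1.1.1"]'
current=$(grep -o '"dns": \[[^]]*\]' "$CONF")

if [ "$current" = "$desired" ]; then
  echo "dns unchanged -- skipping restart"
else
  echo "dns changed -- would run: systemctl reload-or-restart docker"
fi
```

Gating the restart on an actual change keeps a scheduled or network-triggered script from bouncing the daemon needlessly.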

Resolve nftables backend conflicts on modern distributions

On distributions using nftables as the primary packet filter framework, Docker's iptables rules may be written to the wrong backend. First, identify which backend is active:

terminal
# Check which iptables backend is the system default
$ update-alternatives --display iptables 2>/dev/null | grep "best link"

# Check Docker's rules in legacy backend
$ sudo iptables-legacy -t nat -L POSTROUTING -n | grep MASQUERADE

# Check Docker's rules in nft backend
$ sudo iptables-nft -t nat -L POSTROUTING -n | grep MASQUERADE

# Inspect the nftables ruleset directly
$ sudo nft list ruleset | grep -A3 MASQUERADE

If Docker's rules only appear in iptables-legacy while your firewall uses iptables-nft, you can switch Docker to the nft backend by setting the alternatives link:

terminal
# Switch system iptables to nft backend so Docker uses the same one
$ sudo update-alternatives --set iptables /usr/sbin/iptables-nft
$ sudo update-alternatives --set ip6tables /usr/sbin/ip6tables-nft
$ sudo systemctl restart docker

After the restart, verify Docker's MASQUERADE and DNAT rules now appear under nft list ruleset. If your distribution does not use the alternatives system, you may instead disable Docker's built-in iptables management with "iptables": false in daemon.json and write the equivalent MASQUERADE and DNAT rules in your nftables configuration manually -- but this approach requires deep familiarity with both Docker's expected rule structure and nftables syntax, and breaks container name resolution unless the DNAT rules for port 53 are reproduced exactly.

Advanced and Architectural Solutions

The fixes above cover the majority of cases. The approaches below go further -- either by solving structural problems that the daemon.json approach cannot address, or by providing more durable solutions for production environments where DNS correctness is load-bearing.

Replace Docker's embedded resolver with a local forwarding resolver (Unbound or dnsmasq)

Docker's embedded resolver at 127.0.0.11 is intentionally minimal -- it handles A and PTR records, container name resolution, and basic forwarding. It does not support DNSSEC validation, caching beyond a single TTL cycle, conditional forwarding, or response policy zones. For environments where any of those features matter, the correct architectural move is to run a full-featured local resolver on the Docker bridge address and point the daemon at that resolver.

Unbound is well-suited for this. Bind it to the Docker bridge address (172.17.0.1 by default), enable DNSSEC validation, and configure an access-control rule that permits queries from Docker's bridge subnet. Docker containers then get DNSSEC-validated responses, and a failing upstream causes a SERVFAIL rather than a silent timeout, which surfaces DNS failures more cleanly in application logs.

/etc/unbound/unbound.conf.d/docker.conf
server:
  # Listen on the Docker bridge address, not just loopback
  interface: 172.17.0.1
  access-control: 172.17.0.0/16 allow
  access-control: 172.18.0.0/16 allow   # for user-defined networks
  do-ip4: yes
  do-udp: yes
  do-tcp: yes
  hide-identity: yes
  hide-version: yes
  harden-glue: yes
  harden-dnssec-stripped: yes

forward-zone:
  name: "."
  forward-addr: 8.8.8.8
  forward-addr: 1.1.1.1
/etc/docker/daemon.json
{
  "dns": ["172.17.0.1"],
  "dns-opts": ["ndots:1", "timeout:3", "attempts:2"]
}

With this configuration, Docker's embedded resolver forwards to Unbound on the bridge address, and Unbound handles DNSSEC validation and caching before forwarding upstream. The embedded resolver still manages container name resolution (service discovery); Unbound only handles external queries. Note that you need to add an access-control line for each Docker network subnet that Unbound should serve -- run docker network inspect --format '{{range .IPAM.Config}}{{.Subnet}}{{end}}' on each network to find the relevant ranges.

Per-network DNS routing with multiple daemon-level resolvers

Docker does not support split DNS natively -- you cannot tell the daemon to send .corp.internal to one resolver and .example.com to another. The dns array in daemon.json applies uniformly to all containers on all networks. However, you can achieve per-network DNS by combining two mechanisms.

First, set your split-DNS-capable resolver (systemd-resolved or Unbound with conditional zones) as the single upstream in daemon.json. That resolver handles the routing intelligence. Second, configure that resolver with per-domain forwarding rules: queries for .corp.internal go to the corporate DNS server, everything else goes to a public resolver.

With systemd-resolved, per-domain routing is configured through resolvectl domain or the DNS= and Domains= directives in /etc/systemd/resolved.conf. When a VPN comes up and pushes an internal domain, systemd-resolved handles the routing switch automatically -- and because Docker points to the bridge address where systemd-resolved is listening, containers inherit the updated routing without any daemon restart.

DNS response caching at the container level

Docker's embedded resolver does not cache responses across containers. If ten containers all query api.github.com within the same second, ten separate upstream queries are issued. Under load, this amplification is a real source of latency and, if the upstream is rate-limiting, a source of failures that look like DNS errors but are actually rate-limit rejections.

The CoreDNS cache directive (shown in the HTTPS record fix section) solves this, but it requires running CoreDNS as a sidecar. Alternatively, nscd (Name Service Caching Daemon) installed inside a base container image caches DNS responses at the glibc level for that individual container. For production workloads where many containers query the same external hostnames at startup, a shared CoreDNS caching resolver on the Docker network is the more scalable option -- it serves all containers from a single cache rather than per-container caches.

Suppress the search domain amplification problem at the compose level

Rather than relying solely on the ndots:1 daemon setting, suppress inherited search domains explicitly on services that do not need them. This provides defense in depth: if the daemon setting is ever overridden or a container's base image sets its own options line in /etc/resolv.conf, the Compose-level override ensures the container still sends minimal queries.

docker-compose.yml (defensive DNS configuration)
services:
  api:
    image: myapi:latest
    dns:
      - 8.8.8.8
      - 1.1.1.1
    dns_search: ""          # removes all search domain expansion
    dns_opt:
      - ndots:1
      - timeout:2
      - attempts:2

The dns_search: "" entry removes all inherited search domains for that specific service. Combined with ndots:1, this guarantees that any hostname with at least one dot is sent as a fully qualified query, no search expansion occurs, and a single upstream failure produces exactly one log line -- not five or ten. Services that do need to resolve short internal hostnames via search domains should omit dns_search: "" and set the search list explicitly instead.

Rate-limit DNS traffic from containers to protect upstream resolvers

When a container enters a tight retry loop and hammers a broken upstream with DNS queries, it can generate hundreds of upstream requests per second. This can exhaust rate limits on shared public resolvers (Google's 8.8.8.8 enforces per-source rate limits) and flood the embedded resolver's log. An iptables hashlimit rule on the Docker bridge interface provides a kernel-level governor that throttles DNS egress per source IP without requiring any application-level changes.

terminal -- iptables rate limit for DNS from Docker containers
# Limit each container to 30 DNS queries per second leaving the default
# bridge; for user-defined networks, match the br-<network-id> interface instead
# Queries above the limit are dropped silently in the kernel
$ sudo iptables -I DOCKER-USER -i docker0 -p udp --dport 53 \
    -m hashlimit \
    --hashlimit-above 30/second \
    --hashlimit-burst 60 \
    --hashlimit-mode srcip \
    --hashlimit-name docker-dns-limit \
    -j DROP

The DOCKER-USER chain is the correct insertion point -- Docker documents this chain as the place for user-defined rules that persist across daemon restarts. Rules inserted into DOCKER or FORWARD directly are regenerated by Docker on restart and will overwrite your additions. The hashlimit tracks each source IP independently, so a single misbehaving container is throttled without affecting others on the same bridge. This is a mitigation, not a root fix -- apply it while diagnosing which container is generating the flood, then remove the rule once the underlying upstream problem is corrected.

DNS health monitoring with a sidecar healthcheck pattern

Docker's healthcheck mechanism can be used to detect DNS failures proactively rather than discovering them from application errors. A DNS-validating healthcheck in a base service image catches the failure before it cascades into connection errors inside the application. This is particularly useful in production Swarm or Compose environments where a container may be running and healthy at the process level while DNS is silently broken.

Dockerfile (DNS-aware healthcheck)
# In your base image, add a healthcheck that verifies DNS is working
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
  CMD nslookup example.com > /dev/null 2>&1 || exit 1
docker-compose.yml (equivalent)
services:
  myapp:
    image: myapp:latest
    healthcheck:
      test: ["CMD-SHELL", "nslookup example.com > /dev/null 2>&1 || exit 1"]
      interval: 30s
      timeout: 5s
      retries: 3

When the healthcheck fails, Docker marks the container as unhealthy. In Swarm mode, the scheduler will replace the task on a node where DNS is working. In Compose, the condition: service_healthy dependency directive on downstream services prevents them from starting until DNS is confirmed working in the dependency. This turns a DNS failure from a silent background problem into an observable, actionable service state.
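As a sketch, gating a downstream service on that healthcheck looks like this in Compose -- the service and image names are illustrative:

docker-compose.yml (gating on DNS health)
```yaml
services:
  myapp:
    image: myapp:latest
    healthcheck:
      test: ["CMD-SHELL", "nslookup example.com > /dev/null 2>&1 || exit 1"]
      interval: 30s
      timeout: 5s
      retries: 3
  worker:
    image: myworker:latest
    depends_on:
      myapp:
        condition: service_healthy    # worker starts only after DNS is confirmed
```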

Alerting on DNS failure rates via Docker daemon log metrics

If you are forwarding Docker daemon logs to a log aggregation system -- Loki, Elasticsearch, Splunk, or a SIEM -- the failed to query external DNS server message is directly parseable as a metric. The dns-server field gives you the upstream address, the client-addr gives you the container IP, and the timestamp gives you the rate. A query that counts this log pattern per minute and alerts when it exceeds a threshold gives you early warning of upstream DNS degradation before it affects application performance.

For Prometheus environments, the simplest approach is mtail or grok_exporter pointed at the Docker daemon log. Define a rule that increments a counter on the failed to query external DNS server pattern and expose it as a Prometheus metric. A Grafana alert on that counter -- for example, triggering when the rate exceeds 10 per minute -- gives you a DNS health signal independent of any application-level monitoring. This is particularly valuable in Swarm deployments where DNS failures on individual nodes may not produce visible application errors until the affected node accumulates enough failed containers to trigger a threshold.
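A minimal mtail program for this pattern might look like the following -- the counter name is arbitrary, and the sketch assumes mtail is pointed at the Docker daemon's log file:

docker_dns.mtail (failure counter sketch)
```text
counter docker_dns_failures

# Increment on every embedded-resolver failure line in the daemon log
/failed to query external DNS server/ {
  docker_dns_failures++
}
```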

When the Error Floods Your Logs

A specific and common complaint in Docker forums is that this error message floods syslog at a rate that makes the logs unusable. The flooding pattern typically means Docker's embedded resolver is receiving a continuous stream of DNS queries from a container and every single one is failing because the upstream server is unreachable.

Log Flooding Has a Secondary Security Impact

When this error floods syslog at hundreds of lines per minute, it does not just create noise -- it displaces other log entries from the rotation buffer. On systems with fixed-size log rotation, a DNS error flood can push out authentication failures, kernel warnings, and security-relevant events before they are written to persistent storage or forwarded to a SIEM. If you are running any kind of log-based alerting or compliance logging, a flooding DNS error in Docker is not cosmetic. Fix it before it buries something important.

The upstream server being targeted is often the one from the host's /etc/resolv.conf -- and in the flooding case, it is very frequently a loopback address that Docker cannot reach. The fix is the same: set explicit upstream resolvers in daemon.json. But there are a few additional points worth understanding about this scenario.

First, the flooding is not caused by Docker behaving incorrectly. Docker's embedded resolver is faithfully attempting to forward every DNS query the container generates. If the container is running a service that performs aggressive DNS polling, like a service mesh sidecar or a health-check loop, each failed query generates a log line. The resolver logs every failure -- it has no mechanism to suppress repeated failures to the same upstream.

Second, the flooding is sometimes benign. If containers can resolve names via a fallback resolver listed later in the dns array, the first server may be timing out on every query while containers still work. In this case, the correct fix is to remove the unreachable server from the list rather than letting every query wait out its timeout.

While you diagnose the root cause, you can use journalctl filters to isolate the DNS error stream from the rest of the daemon log, or to measure the rate of failures, without tailing the entire log:

terminal -- journalctl filtering for DNS flood investigation
# Stream only the Docker DNS resolver errors in real time
$ sudo journalctl -u docker -f | grep "failed to query external DNS"

# Count how many DNS errors occurred in the last 5 minutes
$ sudo journalctl -u docker --since "5 min ago" | grep -c "failed to query external DNS"

# See which upstream servers are being targeted (extract the dns-server field)
$ sudo journalctl -u docker --since "10 min ago" | grep "failed to query" | grep -oP 'dns-server="\K[^"]+' | sort | uniq -c | sort -rn

# Identify which container IPs are generating the most failures
$ sudo journalctl -u docker --since "10 min ago" | grep "failed to query" | grep -oP 'client-addr="\K[^"]+' | sort | uniq -c | sort -rn

# On systems using /var/log/syslog instead of journald
$ grep "failed to query external DNS" /var/log/syslog | tail -50
$ grep -c "failed to query external DNS" /var/log/syslog

The dns-server frequency output from the third command above tells you immediately whether one broken upstream is responsible for the entire flood, or whether multiple servers are failing. The client-addr frequency output tells you which container is the most aggressive DNS poller -- matching that IP to a container name using docker inspect identifies the specific service generating the volume.

Quick Check: Is DNS Actually Working Despite the Errors?

Run docker run --rm alpine nslookup sudowheel.com from the affected host. If it resolves correctly and quickly, the error messages are noise from a broken primary resolver with a working fallback. Fix the primary resolver in daemon.json and the log flood will stop. If it fails or takes many seconds, you have a real DNS outage.

Default Bridge vs. User-Defined Networks: A Critical Distinction

This distinction matters for understanding which of the fixes above applies to your situation.

On the default bridge network, there is no embedded DNS resolver at all. Containers get a copy of the host's /etc/resolv.conf with loopback addresses stripped out. If the host's resolv.conf only contains 127.0.0.53 and Docker strips it, the container ends up with no nameservers and falls back to 8.8.8.8 and 8.8.4.4, the defaults hardcoded in the Docker source. The failed to query external DNS server error does not appear in this scenario -- instead you get silent resolution failures or the container uses Google's DNS unexpectedly.

On user-defined bridge or overlay networks, the embedded DNS resolver is active. The 127.0.0.11 address is injected into the container's /etc/resolv.conf, the resolver handles container name lookups, and it forwards external queries to whatever upstreams it found. This is where the log error appears -- the embedded resolver is the one that cannot reach the upstream server and logs the failure.

Docker Compose services always run on a user-defined network by default. Any multi-container Compose deployment is using the embedded DNS resolver, which means the configuration in daemon.json directly affects upstream resolver selection for all Compose-based workloads.

How to Fix Docker Failed to Query External DNS Server

Confirm the upstream DNS path

Run docker run --rm alpine nslookup sudowheel.com to test whether Docker's default DNS configuration can reach an external resolver. Then run docker run --rm alpine cat /etc/resolv.conf to see which nameserver address Docker injected. If the nameserver is 127.0.0.53 or another loopback address, you have confirmed the systemd-resolved stub leak. If it is a public IP that should be reachable, investigate iptables and routing.

Set explicit DNS servers in daemon.json

Create or edit /etc/docker/daemon.json to add a dns array with working upstream resolver addresses. Use 8.8.8.8 and 1.1.1.1 for a reliable public fallback, or use your router's IP as the primary resolver. Restart the Docker daemon with sudo systemctl restart docker and verify the fix by running the nslookup test again from a fresh container.
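The resulting file is small. The following sketch writes an example to /tmp so the syntax check can be rehearsed safely -- on a real host the target path is /etc/docker/daemon.json, and the dns values should be resolvers reachable from your network:

terminal -- write and validate an example daemon.json
```shell
# Write the example to /tmp for practice; use /etc/docker/daemon.json on a real host
cat > /tmp/daemon.json <<'EOF'
{
  "dns": ["8.8.8.8", "1.1.1.1"],
  "dns-opts": ["ndots:1", "timeout:2", "attempts:2"]
}
EOF

# Validate the syntax before restarting Docker -- a parse error here means
# the daemon would refuse to start or silently ignore the file
python3 -m json.tool /tmp/daemon.json
```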

Check iptables if DNS is still failing

Verify that outbound UDP and TCP traffic on port 53 is not blocked from the Docker bridge network. Run sudo iptables -t nat -L POSTROUTING -n | grep MASQUERADE to confirm Docker's masquerade rule is intact. If rules are missing, restart Docker to recreate them. If using ufw, run sudo ufw allow out 53/udp and sudo ufw allow out 53/tcp to explicitly permit DNS egress.

Frequently Asked Questions

Is the 'failed to query external DNS server' message always a real problem?

Not necessarily. If containers are resolving names correctly and the error only appears for specific domains or upstream servers that are unreachable from your network, it may be informational noise rather than a blocking failure. The message becomes a real problem when containers cannot resolve external hostnames, or when it floods syslog and masks other log entries. Run docker run --rm alpine nslookup sudowheel.com to determine quickly whether resolution is working despite the log entries.

Why does Docker's embedded DNS server at 127.0.0.11 forward queries to the wrong upstream server?

Docker's embedded DNS server reads its upstream resolvers from the host's /etc/resolv.conf when the daemon starts. On Ubuntu and other systemd-based distributions, that file points to 127.0.0.53 -- the systemd-resolved stub listener. That address is only reachable from the host's loopback interface, not from inside a container's separate network namespace. The fix is to configure Docker to use real upstream resolver addresses in /etc/docker/daemon.json.

What is the fastest single-command way to test whether Docker's DNS path is broken?

Run docker run --rm alpine nslookup sudowheel.com -- this spins up a fresh Alpine container and immediately tries to resolve an external hostname. If it fails, Docker's upstream DNS path is broken. If it succeeds, the problem is specific to a particular container or network configuration rather than the daemon itself. For a deeper test, add --network to specify which network to test, since default bridge and user-defined networks behave differently.

Why do cloud VMs have DNS failures that bare metal hosts do not?

Cloud providers serve DNS through hypervisor-level routing mechanisms -- link-local addresses like 169.254.169.253 on AWS or platform-specific addresses like 168.63.129.16 on Azure -- that are only routable from the primary host network stack. These addresses are not accessible from inside a container's isolated bridge namespace because the hypervisor route that makes them reachable does not extend into guest network namespaces. On bare metal, the upstream resolver is typically a real router IP or a public address that is reachable from any namespace on the host. Always check whether your cloud instance's resolver is a link-local or platform-specific address and replace it with an explicit public or VPC resolver in daemon.json.

What happens when iptables and nftables are both active on the same host?

Docker writes its DNAT and MASQUERADE rules using the iptables API. On hosts that have migrated to nftables, there are two possible backends for the iptables command: iptables-legacy (which writes to the old xtables kernel module) and iptables-nft (which writes to nftables via a compatibility layer). If Docker uses the legacy backend while your system firewall uses nftables, the rules live in separate priority layers and may not interact as expected. DNS traffic from containers can be dropped before reaching Docker's embedded resolver, or forwarded queries can fail to exit the host. Running iptables --version shows which backend the binary uses -- the suffix (nf_tables) or (legacy) in its output names it. Confirm both Docker and your firewall are writing rules to the same backend, then restart Docker to regenerate its rule set in the correct layer.

Does Docker's embedded DNS resolver support all DNS record types?

No -- and this trips up more people than it should. Docker's embedded resolver at 127.0.0.11 handles A and PTR records reliably but returns SERVFAIL for newer record types including HTTPS (type 65) and SVCB (type 64). Modern browsers automatically query for HTTPS records alongside A records. If Docker's resolver returns SERVFAIL for the HTTPS query, some browsers treat the overall DNS resolution as failed even though the A record came back cleanly. The fix for this specific scenario is not in daemon.json; it requires placing a full-featured resolver such as CoreDNS in front of 127.0.0.11 to handle HTTPS and SVCB queries gracefully.

What is CVE-2024-29018 and how does it relate to this error?

CVE-2024-29018 is a Docker security vulnerability where containers attached only to --internal networks could leak DNS queries to external authoritative nameservers, bypassing the isolation that internal networks are supposed to enforce. The mechanism is the same loopback address boundary issue described throughout this article: Docker's embedded resolver forwarded queries via the host's loopback device, outside the container's network namespace entirely, which allowed DNS traffic to escape networks that were configured to be fully air-gapped. An attacker with a foothold in a compromised container could exfiltrate data by encoding it in DNS query subdomains pointed at attacker-controlled authoritative nameservers -- a technique commonly called DNS tunneling. The vulnerability was disclosed in March 2024 and patched in Moby 26.0.0, 25.0.4, and 23.0.11. Source: GitHub Security Advisory GHSA-mq39-4gv4-mvpx.

What does ndots have to do with this error?

The ndots search option in a container's /etc/resolv.conf controls how aggressively the resolver expands short hostnames before forwarding a query upstream. The default value in many environments is ndots:5, which means a hostname with fewer than five dots triggers a search-domain expansion loop -- potentially five or more separate upstream queries -- before the name is tried as fully qualified. In environments where the upstream DNS server is unreachable, every single one of those expanded queries generates its own failed to query external DNS server log line. A container that makes ten requests per second can produce fifty or more log lines per second under these conditions, even though the underlying problem is a single misconfigured upstream. Setting ndots:1 in dns-opts inside daemon.json eliminates the expansion loop for names with at least one dot, which is the correct behavior for containers that are not inside a Kubernetes cluster.
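To make the multiplier concrete, the following sketch enumerates the queries glibc would emit for a short hostname under a search list with three domains -- the domain names are invented for illustration:

terminal -- simulated search expansion for a short name
```shell
# With ndots:5, the bare name "db" (zero dots) is expanded against every
# search domain before being tried as-is -- four query names in total
name="db"
for domain in corp.example.com prod.example.com example.com; do
  echo "query: ${name}.${domain}."
done
echo "query: ${name}."
```

Each of those names can additionally be queried for both A and AAAA records, and each failed query against a broken upstream produces its own log line.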

Does this affect Docker Desktop on Mac and Windows?

Yes, but the causes and fixes are different. Docker Desktop runs containers inside a Linux VM that Docker manages, so there is no direct access to the VM's iptables or systemd-resolved configuration the way there is on a bare Linux host. On macOS, DNS filtering tools and corporate MDM profiles are the most common culprits -- they intercept queries leaving the VM through the host's network stack. On Windows with WSL 2, VPN clients that modify Windows DNS settings can break resolution inside containers when the VPN connects or disconnects. In both cases, the fix is to set explicit upstream resolvers through Docker Desktop's graphical settings (Resources > Network) or by editing daemon.json in the appropriate location: ~/.docker/daemon.json on macOS, %USERPROFILE%\.docker\daemon.json on Windows. Changes take effect after restarting Docker Desktop from the system tray.

What if I am using Podman instead of Docker?

Podman uses a different DNS architecture. For rootless Podman, name resolution inside containers is handled by aardvark-dns (in Podman 4.0 and later) or dnsname (older versions via the CNI plugin), not by a Docker-style embedded resolver at 127.0.0.11. If you see this exact error message with Podman, it is most likely because your system has a docker command that is aliased to Podman, and the error is coming from a different code path than expected. Verify which binary is actually running with which docker and docker version. If the version output shows Podman, the upstream DNS configuration is set through /etc/containers/containers.conf under the [network] section, not through /etc/docker/daemon.json.

Can DNS fail over TCP independently of UDP?

Yes. DNS queries larger than 512 bytes -- typically responses with many records, DNSSEC signatures, or large TXT entries -- trigger automatic fallback from UDP to TCP on port 53. Docker's embedded resolver supports both, but iptables rules and firewall configurations sometimes permit UDP port 53 while blocking TCP port 53 as an oversight. If small queries resolve correctly but queries for specific domains with large record sets fail, test TCP explicitly:

terminal
# Force a TCP DNS query from inside a container -- the BusyBox nslookup in
# the stock alpine image has no TCP option, so install bind-tools and use dig
$ docker run --rm alpine sh -c "apk add --no-cache bind-tools >/dev/null && dig +tcp sudowheel.com @8.8.8.8"

# Check whether TCP port 53 is explicitly blocked
$ sudo iptables -L -n | grep "dpt:53"
$ sudo ufw status verbose | grep 53

If the TCP query fails while UDP succeeds, add an explicit TCP 53 allow rule alongside your UDP rule. The ufw command to permit both is sudo ufw allow out 53 without the protocol suffix, which covers both.

How do I identify which container is generating the failed DNS queries?

The client-addr field in the error log contains the container's IP address on the bridge network. Match that address to a running container using docker inspect or by listing container IPs directly:

terminal
# Extract the IP from the log line (e.g. client-addr="udp:172.18.0.7:49663")
# Then find which container owns that IP
$ docker ps -q | xargs docker inspect --format '{{.Name}} {{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}'

# Or inspect a specific container by name
$ docker inspect mycontainer | grep '"IPAddress"'

Once you have identified the container, check whether it is running a service that performs aggressive DNS polling -- service mesh sidecars, health-check loops with short intervals, or retry logic that hammers a hostname when a downstream service is unreachable. If the container is the source of the flood, fixing the upstream resolver makes the log errors disappear, but it may also be worth increasing the health-check interval or adding a retry backoff to reduce the underlying query rate.

My daemon.json already has settings in it. How do I add DNS without breaking it?

The most common mistake when following online instructions for this fix is treating daemon.json as an append-only file. JSON does not work that way -- the entire file must be a single valid object. If you add a second JSON block after the first, or add a trailing comma after the last key, Docker will either silently ignore the file or refuse to start. Before making any changes, read the existing file with cat /etc/docker/daemon.json, then edit it so the result is a single merged object with no duplicate keys. After editing, validate the syntax with python3 -m json.tool /etc/docker/daemon.json before restarting Docker. If Docker fails to start after your change, run sudo journalctl -u docker --since "5 min ago" to see the exact parse error.

Can an IPv6 nameserver address in resolv.conf cause the same problem as 127.0.0.53?

Yes, and it often goes unnoticed because engineers checking for loopback leaks typically scan for 127.x.x.x addresses and stop there. If /etc/resolv.conf contains nameserver ::1 -- the IPv6 loopback, used by local resolvers like Unbound or dnsmasq when bound to the IPv6 interface -- Docker's embedded resolver inherits it. From inside a container's network namespace, ::1 means the container's own IPv6 loopback, which has nothing listening on port 53. The error is identical to the IPv4 variant: i/o timeout addressed to a loopback that does not cross namespace boundaries. The fix is the same: set explicit routable upstream addresses in daemon.json. Check both address families when inspecting your resolver configuration: grep nameserver /etc/resolv.conf shows all entries regardless of protocol.
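A single grep catches both address families. The sketch below runs it against a sample file written to /tmp for illustration -- on a real host, point it at /etc/resolv.conf:

terminal -- detect loopback resolvers in either address family
```shell
# Sample file for illustration; inspect /etc/resolv.conf on a real host
cat > /tmp/resolv.conf.sample <<'EOF'
nameserver ::1
nameserver 192.168.1.1
EOF

# Matches the IPv4 loopback range (127.x.x.x) and the IPv6 loopback (::1)
grep -E '^nameserver (127\.|::1)' /tmp/resolv.conf.sample
# prints: nameserver ::1
```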

Why does my container's resolv.conf have search domains I never configured?

Docker populates the search line in a container's /etc/resolv.conf from two sources. First, it reads the search and domain lines from the host's /etc/resolv.conf and carries them into the container. If your host is joined to a corporate domain or has been configured by a DHCP client, those search domains are inherited by every container. Second, Docker adds a search entry derived from the container's hostname and the network's name -- so a container named myapp on a Compose-created network called appnet will typically have myapp.appnet in its search list.

This matters for the failed to query external DNS server error because those inherited search domains amplify the ndots multiplier effect. If the host's /etc/resolv.conf has five corporate search domains and ndots:5 is in effect, a single short hostname query from inside a container produces six upstream lookups -- one expansion for each search domain, then the direct lookup -- and double that when the resolver also requests AAAA records alongside A records. All of them fail if the upstream is broken, and all of them generate a separate log line. You can override search domains for specific containers in a Compose file using the dns_search key, or suppress them entirely by setting dns_search: "", which removes all search domain expansion for that container.

docker-compose.yml
services:
  myapp:
    image: myapp:latest
    dns_search: ""          # suppresses all inherited search domains
    dns_opt:
      - ndots:1
      - timeout:2
      - attempts:2

What do timeout and attempts in dns-opts actually control?

timeout:N sets how many seconds the container's resolver waits for a reply from a single upstream server before giving up on that attempt. attempts:N sets how many times it retries the same server before moving on to the next one in the list. With timeout:3 and attempts:3, a single unresponsive upstream server can consume up to 9 seconds (3 seconds × 3 attempts) before the resolver tries the fallback server. If you have two broken upstream servers listed, the maximum wait before a query ultimately fails is 18 seconds. For environments where fast failure detection matters -- containers that poll external APIs on startup, health checks with short deadlines -- consider tightening both values to 2. For environments with occasional packet loss on the path to the upstream server, keeping attempts at 3 provides resilience without significantly increasing the timeout impact, since each attempt only costs the timeout duration if it actually times out.

I changed daemon.json and restarted Docker but the error is still happening. Why?

There are four common reasons the fix appears not to work after a restart.

First, confirm the restart actually completed. sudo systemctl restart docker is synchronous, but if the daemon fails to start -- because of a malformed daemon.json, for example -- it exits immediately and the old process is already gone. Run sudo systemctl status docker and confirm the state is active (running), not failed.

Second, confirm the containers you are testing were created after the restart. Existing containers retain the DNS configuration captured when they were created, and restarting the Docker daemon does not rewrite it. Be precise about how you restart containers: docker restart mycontainer (or docker stop followed by docker start) reuses the existing container, including its recorded DNS settings -- it does not pick up the new daemon defaults. Recreate the container instead, with docker compose up -d --force-recreate or docker rm followed by docker run, so it is created under the current daemon configuration.

Third, confirm the right file was edited. On Docker Desktop (macOS and Windows), the daemon.json location is not /etc/docker/daemon.json -- it is inside the user profile at ~/.docker/daemon.json on macOS and %USERPROFILE%\.docker\daemon.json on Windows, and changes take effect after restarting Docker Desktop from the system tray, not via systemctl.

Fourth, if you are using a hardened base image that mounts /etc/resolv.conf as read-only inside the container, Docker cannot write its DNS settings at container start. Check with docker inspect mycontainer --format '{{json .HostConfig.ReadonlyRootfs}}' and look for explicit volume mounts over /etc/resolv.conf in the image's Dockerfile or entrypoint.

How do I fix Docker DNS failures in GitHub Actions or other CI/CD pipelines?

Hosted CI runners -- GitHub Actions ubuntu-latest, GitLab shared runners, CircleCI machine executors -- run on cloud VMs where Docker's default upstream resolver is often a platform-specific address that is unreachable from inside bridge network namespaces. The fix is to write an explicit daemon.json as an early job step before any Docker commands, then restart the daemon. Because hosted runners are ephemeral fresh VMs per job, there is no persistent configuration to maintain -- the step runs on every job. A two-line step covers it: write /etc/docker/daemon.json with a public DNS array and sudo systemctl restart docker. For self-hosted runners on persistent VMs, apply the daemon.json fix once during runner provisioning rather than per-job. For Docker-in-Docker setups, each nested Docker daemon needs its own explicit DNS configuration -- the outer container's resolver configuration does not automatically propagate to the inner daemon.
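As a sketch, such a step for GitHub Actions might look like the following -- the step name is arbitrary, and the resolver addresses are the public fallbacks discussed above:

workflow step (GitHub Actions)
```yaml
- name: Point Docker at public DNS resolvers
  run: |
    echo '{"dns": ["8.8.8.8", "1.1.1.1"]}' | sudo tee /etc/docker/daemon.json
    sudo systemctl restart docker
```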

Does fixing daemon.json DNS affect every node in a Docker Swarm cluster?

No. There is no centralized DNS configuration mechanism in Docker Swarm. The daemon.json setting on one node only affects containers scheduled on that node. To fix DNS across a Swarm, you need to apply the same daemon.json change and restart Docker on every manager and worker node in the cluster. If you use a configuration management tool like Ansible, Puppet, or Chef to manage your nodes, push the daemon.json change as a managed file and trigger a Docker service restart through it. Manual node-by-node changes work but are error-prone in larger clusters, and an inconsistently configured Swarm produces frustrating intermittent failures where the same service resolves external names on some nodes but not others depending on where the task was scheduled.
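As a sketch, an Ansible playbook for rolling the change across a cluster might look like the following -- the inventory group and handler name are illustrative:

ansible playbook (cluster-wide daemon.json rollout)
```yaml
- hosts: swarm_nodes
  become: true
  tasks:
    - name: Set explicit Docker upstream DNS
      ansible.builtin.copy:
        content: '{"dns": ["8.8.8.8", "1.1.1.1"]}'
        dest: /etc/docker/daemon.json
        mode: "0644"
      notify: restart docker
  handlers:
    - name: restart docker
      ansible.builtin.service:
        name: docker
        state: restarted
```

Adding serial: 1 at the play level restarts one node at a time, so running tasks are rescheduled gradually rather than the whole cluster's daemons bouncing at once.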