Every time you publish a port, spin up a container, or create a Docker network, the Docker daemon quietly inserts firewall rules into your host's netfilter subsystem. These rules handle NAT, packet forwarding, and inter-network isolation. If you have ever written a careful iptables ruleset and then watched Docker blow right past it, you have experienced this firsthand. Understanding exactly how Docker interacts with netfilter -- and how a compatibility layer now sits between Docker's iptables commands and the nftables kernel backend on modern distributions -- is essential for anyone running containers on a production Linux host.

Netfilter: The Kernel Framework Behind Everything

Before looking at what Docker does, it helps to understand what it touches. Netfilter is the packet-processing framework inside the Linux kernel. It provides hooks at defined points in the network stack -- PREROUTING, INPUT, FORWARD, OUTPUT, and POSTROUTING -- where modules can inspect, modify, or drop packets.

For years, iptables was the standard userspace tool for managing netfilter rules. It organized rules into tables (filter, nat, mangle, raw) and chains within those tables. The iptables binary talked directly to the kernel's xtables interface, and every sysadmin learned to think in terms of INPUT chains and FORWARD policies.

Then nftables arrived. Included in the Linux kernel since version 3.13 (released in 2014), nftables provides a cleaner rule syntax, unified IPv4/IPv6 handling, atomic ruleset updates, and more efficient matching through a virtual machine that executes compiled bytecode. Every major distribution has since made nftables the default backend: Debian 10+, Ubuntu 20.04+, RHEL 8+, and Fedora all ship with nftables as their primary packet filtering framework.

The critical detail is that on these modern systems, the iptables binary is no longer the legacy xtables tool. It is typically a symlink to iptables-nft -- a compatibility shim that accepts the traditional iptables command syntax but translates every rule into nftables objects behind the scenes. This is the compatibility layer that sits between Docker and the kernel on any reasonably current Linux installation.

How Docker Uses Netfilter

Docker's default bridge networking model depends on netfilter for four things: creating a Linux bridge (typically docker0), attaching veth pairs that connect containers to that bridge, performing source NAT (MASQUERADE) so containers can reach the outside world, and adding filter rules that handle forwarding and route published ports to the correct container.
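You can see these pieces on a running host. The commands below assume the default docker0 bridge name; adjust if you have renamed it:

```shell
# The Linux bridge Docker created at daemon startup
ip link show docker0

# veth interfaces attached to bridges -- one host-side peer per container
bridge link show

# Addresses on the bridge itself (the default subnet is typically 172.17.0.0/16)
ip addr show docker0
```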

When Docker starts, it enables IP forwarding on the host by setting net.ipv4.ip_forward to 1. When it does so, it also sets the default policy of the iptables FORWARD chain to DROP, ensuring that only traffic explicitly allowed by Docker's own rules can traverse between interfaces. This is a security measure -- it prevents a Docker host from inadvertently acting as a router -- but it also means Docker's chains take precedence over anything you might have added to FORWARD yourself. If IP forwarding was already enabled on the host before Docker started, Docker will not modify the FORWARD policy.
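You can confirm both settings on your own host. The expected value noted in the comment assumes a standard Docker installation:

```shell
# 1 means the kernel will forward packets between interfaces
cat /proc/sys/net/ipv4/ip_forward

# The first line of -S output shows the chain policy;
# on a typical Docker host this prints: -P FORWARD DROP
sudo iptables -S FORWARD | head -n 1
```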

Docker creates several custom chains in the filter and nat tables: DOCKER, DOCKER-USER, DOCKER-ISOLATION-STAGE-1, and DOCKER-ISOLATION-STAGE-2 in the filter table, plus a separate DOCKER chain in the nat table.

In the nat table, Docker inserts a rule in the PREROUTING chain that jumps to its DOCKER chain for any packet with a local destination address. This is how port publishing works: when a packet arrives for a published port, Docker's DNAT rule rewrites the destination to the container's internal IP and port before the packet enters the routing decision. In POSTROUTING, a MASQUERADE rule rewrites the source address of outbound container traffic to the host's address, allowing containers to communicate with external networks.

Note

Because Docker's DNAT rules in the nat table's PREROUTING chain run before the routing decision, traffic destined for a published container port is forwarded through the FORWARD chain rather than delivered locally. Rules you add to INPUT will never see this traffic -- the packet is DNAT'd and forwarded, not consumed by the host. This is the single most common source of confusion when administrators try to restrict access to Docker containers using conventional iptables rules.

Inspecting Docker's Generated Rules

Examining exactly what Docker has inserted is the first step in understanding and troubleshooting container networking. Start by listing the filter table's FORWARD chain, where Docker's chain jumps are installed:

terminal
$ sudo iptables -L FORWARD -n -v --line-numbers
Chain FORWARD (policy DROP)
num   pkts bytes target                  prot opt in     out     source        destination
1      412  31K DOCKER-USER             all  --  *      *       0.0.0.0/0     0.0.0.0/0
2      412  31K DOCKER-ISOLATION-STAGE-1 all  --  *      *       0.0.0.0/0     0.0.0.0/0
3      206  15K ACCEPT                  all  --  *      docker0 0.0.0.0/0     0.0.0.0/0   ctstate RELATED,ESTABLISHED
4        0     0 DOCKER                  all  --  *      docker0 0.0.0.0/0     0.0.0.0/0
5      206  15K ACCEPT                  all  --  docker0 !docker0 0.0.0.0/0    0.0.0.0/0
6        0     0 ACCEPT                  all  --  docker0 docker0 0.0.0.0/0    0.0.0.0/0

Notice the ordering. Every forwarded packet hits DOCKER-USER first (line 1), then DOCKER-ISOLATION-STAGE-1 (line 2), then the established connection tracker (line 3), and finally the DOCKER chain itself (line 4). This ordering is why DOCKER-USER is the correct place for your custom rules -- anything you insert there is evaluated before Docker decides whether to accept or drop the packet.

Next, check the nat table to see Docker's DNAT and MASQUERADE rules:

terminal
$ sudo iptables -t nat -L -n -v

Chain PREROUTING (policy ACCEPT)
 pkts bytes target  prot opt in     out     source        destination
  128  8192 DOCKER  all  --  *      *       0.0.0.0/0     0.0.0.0/0   ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT)
 pkts bytes target      prot opt in     out      source          destination
  206  15K  MASQUERADE  all  --  *      !docker0 172.17.0.0/16   0.0.0.0/0

Chain DOCKER (2 references)
 pkts bytes target  prot opt in       out     source        destination
    0     0 RETURN  all  --  docker0  *       0.0.0.0/0     0.0.0.0/0
   45  2700 DNAT    tcp  --  !docker0 *       0.0.0.0/0     0.0.0.0/0   tcp dpt:8080 to:172.17.0.2:80

The PREROUTING rule sends any packet addressed to a local IP into the DOCKER chain. Inside DOCKER, you can see the per-container DNAT entry: external traffic arriving on port 8080 gets rewritten to the container at 172.17.0.2 on port 80. The MASQUERADE rule in POSTROUTING handles the return path, rewriting the source address of outbound container traffic so external hosts see the host's IP rather than a private 172.17.x.x address.
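You can watch the DNAT rule appear and disappear. The sketch below uses nginx and the 8080:80 mapping purely as an example:

```shell
# Publish a port; Docker adds a DNAT rule for it
docker run -d --name dnat-demo -p 8080:80 nginx

# The per-container DNAT entry is now in the nat table's DOCKER chain
sudo iptables -t nat -L DOCKER -n | grep 8080

# Removing the container removes the rule again
docker rm -f dnat-demo
sudo iptables -t nat -L DOCKER -n | grep 8080 || echo "rule removed"
```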

The iptables-nft Compatibility Layer

Here is where things get interesting on modern systems. When Docker issues all these iptables commands, it is (almost certainly) not talking directly to the legacy xtables kernel interface. On any distribution released in the last several years, the iptables binary is typically iptables-nft, which silently translates every iptables command into nftables rules.

You can verify which backend your system uses:

$ iptables --version

If the output includes (nf_tables), your system is using the nftables backend. If it says (legacy), you are on the older xtables path. On Debian and Ubuntu, the update-alternatives system controls which binary iptables resolves to:

$ sudo update-alternatives --display iptables

When iptables-nft is active, Docker's rules are not stored in the legacy xtables format. Instead, they appear in the nftables ruleset as translated tables. You can inspect them using the native nft tool:

terminal
$ sudo nft list ruleset

# You will see tables like:
table ip filter {
    chain DOCKER-USER {
        counter packets 0 bytes 0 return
    }
    chain DOCKER-ISOLATION-STAGE-1 {
        iifname "docker0" oifname != "docker0" counter jump DOCKER-ISOLATION-STAGE-2
        counter return
    }
    # ... more Docker chains ...
}

table ip nat {
    chain PREROUTING {
        type nat hook prerouting priority -100; policy accept;
        fib daddr type local counter jump DOCKER
    }
    # ... MASQUERADE and DNAT rules ...
}

This is the compatibility layer in action. Docker issues iptables commands. iptables-nft intercepts those commands and writes the rules as nftables objects. The kernel only sees nftables. The whole process is transparent to Docker -- it has no idea it is talking to nftables.
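A quick experiment makes the translation visible. COMPAT-DEMO is a scratch chain name used only for this test:

```shell
# Create a rule through the iptables frontend
sudo iptables -N COMPAT-DEMO
sudo iptables -A COMPAT-DEMO -s 192.0.2.0/24 -j DROP

# Read the same rule back as a native nftables object
# (on an iptables-nft system it lives in table ip filter)
sudo nft list chain ip filter COMPAT-DEMO

# Clean up the scratch chain
sudo iptables -F COMPAT-DEMO
sudo iptables -X COMPAT-DEMO
```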

Warning

The compatibility layer works well for the common case, but it has edge cases. If another tool on the system manages rules using iptables-legacy while Docker uses iptables-nft, you end up with rules in two separate kernel subsystems. The kernel processes both iptables (xtables) and nftables rules for the same packet, which creates unpredictable behavior. Always ensure every tool on the host uses the same backend. Check for stale legacy rules with sudo iptables-legacy -L -n.

The Split-Brain Problem

The single most dangerous scenario with the compatibility layer is running mixed backends. This happens more often than you might expect: a firewall management tool writes rules via iptables-legacy, Docker writes rules via iptables-nft, and an administrator inspects with nft list ruleset and sees only half the picture.

When both backends have active rules, the kernel evaluates iptables (xtables) rules and nftables rules independently for the same packet. A packet that one backend accepts can still be dropped by the other. An administrator who sees an ACCEPT in nftables might not realize a legacy iptables DROP is also in play. The fix is simple in principle: consolidate everything onto one backend, flush the other, and ensure the update-alternatives symlink points where you want it.

consolidating to iptables-nft
# Check for stale legacy rules
$ sudo iptables-legacy -L -n 2>/dev/null
$ sudo iptables-legacy -t nat -L -n 2>/dev/null

# If legacy rules exist, flush them
$ sudo iptables-legacy -F
$ sudo iptables-legacy -t nat -F
$ sudo iptables-legacy -t mangle -F

# Ensure alternatives point to iptables-nft (Debian/Ubuntu)
$ sudo update-alternatives --set iptables /usr/sbin/iptables-nft
$ sudo update-alternatives --set ip6tables /usr/sbin/ip6tables-nft

# Restart Docker to regenerate rules under the correct backend
$ sudo systemctl restart docker

Using the DOCKER-USER Chain Correctly

The DOCKER-USER chain is Docker's designated hook for custom administrator rules. Docker places no rules of its own there beyond a single RETURN and leaves the chain's contents to you. Every forwarded packet passes through DOCKER-USER before reaching Docker's own chains, making it the correct place to implement access controls.

There is an important nuance: by the time packets arrive at DOCKER-USER, they have already been through DNAT in the nat table's PREROUTING chain. This means the destination address has been rewritten to the container's internal IP and port. If you want to match on the original destination port or the external source address, you need to use the conntrack extension.

DOCKER-USER filtering with conntrack
# Always add established/related first, or you will break existing connections
$ sudo iptables -I DOCKER-USER -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT

# Allow a specific subnet to reach container port 8080 (original destination port)
$ sudo iptables -I DOCKER-USER -p tcp -m conntrack --ctorigsrc 10.0.0.0/24 --ctorigdstport 8080 -j ACCEPT

# Drop everything else destined for published container ports
$ sudo iptables -A DOCKER-USER -p tcp -m conntrack --ctorigdstport 8080 -j DROP

# Verify the chain
$ sudo iptables -L DOCKER-USER -n -v --line-numbers

Rule ordering matters. The DOCKER-USER chain should always have the ESTABLISHED,RELATED accept rule at the top, followed by your specific allow rules, with deny rules at the bottom. The default RETURN rule that Docker installs should remain as the very last entry -- it allows any traffic you did not explicitly block to proceed into Docker's own chains for normal processing.

Scoping Rules to a Specific Docker Network

A common mistake is writing DOCKER-USER rules against container IP addresses directly. Container IPs change every time a container is recreated. The correct approach is to match against the bridge interface name, which is stable as long as the Docker network exists. Each Docker network gets its own bridge (e.g., br-a1b2c3d4e5f6), and you can find the mapping with docker network inspect --format '{{.Options}}' <network> or by examining the com.docker.network.bridge.name option. Once you have the interface name, use -o <bridge> to scope rules to traffic entering a specific network:

scoping DOCKER-USER rules to a named network bridge
# Find the bridge interface name for a Docker network
$ docker network inspect my-app-network --format '{{index .Options "com.docker.network.bridge.name"}}'
# If blank, the bridge name is br- followed by the first 12 chars of the network ID
$ docker network inspect my-app-network --format 'br-{{slice .Id 0 12}}'

# Restrict access to a specific bridge network's published ports
$ sudo iptables -I DOCKER-USER -o br-a1b2c3d4e5f6 -m conntrack --ctorigsrc 10.0.0.0/24 -j ACCEPT
$ sudo iptables -A DOCKER-USER -o br-a1b2c3d4e5f6 -j DROP

Rate-Limiting Inbound Traffic per Source IP

The hashlimit module works inside DOCKER-USER and is one of the most underused tools for container traffic control. Unlike the basic limit module, hashlimit maintains per-source-IP rate counters, letting you enforce per-client limits without global throttling. This is particularly useful for published HTTP or API ports where you want to allow burst traffic from legitimate clients while dropping floods from a single source before they saturate the container:

per-source-IP rate limiting in DOCKER-USER
# Allow up to 100 new connections per minute per source IP to container port 8080
# Burst of 20 allows short spikes without triggering the limit immediately
$ sudo iptables -I DOCKER-USER -p tcp \
  -m conntrack --ctorigdstport 8080 --ctstate NEW \
  -m hashlimit \
    --hashlimit-name docker-8080 \
    --hashlimit-mode srcip \
    --hashlimit-upto 100/min \
    --hashlimit-burst 20 \
  -j ACCEPT

# Drop new connections that exceed the per-IP rate limit
$ sudo iptables -A DOCKER-USER -p tcp \
  -m conntrack --ctorigdstport 8080 --ctstate NEW \
  -j DROP

The Chain Recreation Problem

There is a subtle failure mode that catches administrators who have correctly written DOCKER-USER rules and correctly persisted them: Docker recreates the DOCKER-USER chain empty when the daemon restarts. iptables-save and iptables-restore write and reload the full chain contents, but if Docker starts after iptables-restore, Docker's startup sequence flushes DOCKER-USER and reinstalls a bare RETURN rule.

The fix is ordering. Your rule-restoration unit must run after docker.service, not before it. If you are using iptables-persistent on Debian/Ubuntu, override the unit's ordering:

/etc/systemd/system/netfilter-persistent.service.d/after-docker.conf
[Unit]
After=docker.service
Requires=docker.service

With this override, netfilter-persistent waits for Docker to finish its startup sequence (and recreate DOCKER-USER) before restoring your rules into the now-existing chain. Without it, the restore silently succeeds into a chain that Docker then immediately empties.
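After creating the drop-in, reload systemd and confirm the ordering took effect:

```shell
# Pick up the new drop-in file
sudo systemctl daemon-reload

# The unit's After= list should now include docker.service
systemctl show -p After netfilter-persistent.service | grep -o docker.service
```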

Pro Tip

Remember to persist your DOCKER-USER rules. Docker does not manage this chain, so if you add rules manually, they will be lost on reboot unless you save them with iptables-save and restore them at boot. On Debian-based systems, the iptables-persistent package handles this. On RHEL-based systems, use iptables-services or include the rules in a systemd unit that runs after Docker starts.

Conflicts with UFW and firewalld

Two of the most popular firewall frontends -- UFW (Uncomplicated Firewall) on Debian/Ubuntu and firewalld on RHEL/Fedora -- are both fundamentally incompatible with Docker's iptables behavior in ways that surprise administrators.

UFW manages rules in the INPUT and OUTPUT chains. Docker's container traffic flows through PREROUTING (nat table) and FORWARD (filter table), which UFW does not control. When you publish a container port, Docker's DNAT rule in PREROUTING redirects the packet before it ever reaches the chains UFW manages. The result: a UFW "deny" rule on a port has no effect on traffic reaching a container that publishes that port. The container remains accessible regardless of what UFW says.
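You can reproduce the bypass on a disposable test host. nginx and port 8080 are placeholders here:

```shell
# Tell UFW to block the port...
sudo ufw deny 8080/tcp

# ...then publish a container on that same port
docker run -d --name ufw-demo -p 8080:80 nginx

# From another machine, the container still answers despite the deny rule:
#   curl http://<docker-host>:8080/

# Clean up
docker rm -f ufw-demo
sudo ufw delete deny 8080/tcp
```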

There are two viable approaches when you need UFW and Docker to coexist. The first is to set DEFAULT_FORWARD_POLICY to DROP in /etc/default/ufw and then manage forwarding through UFW's before.rules file directly. NAT entries belong in a *nat block above the *filter section; FORWARD entries go into the ufw-before-forward chain inside the *filter block. UFW preserves both across reloads:

/etc/ufw/before.rules (excerpt, *filter section)
# Allow forwarding for Docker containers in the 172.17.0.0/16 range
*filter
:ufw-before-input - [0:0]
:ufw-before-output - [0:0]
:ufw-before-forward - [0:0]

# Permit established forwarded traffic (Docker containers outbound)
-A ufw-before-forward -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A ufw-before-forward -s 172.17.0.0/16 -j ACCEPT
-A ufw-before-forward -d 172.17.0.0/16 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT

COMMIT

The second approach -- and the one that works without touching before.rules -- is to stop publishing container ports with -p entirely: put inter-container traffic on Docker networks created with the --internal flag, and front them with a reverse proxy container (nginx or Traefik) whose published port binds only to 127.0.0.1. UFW controls access to that loopback-bound proxy port normally, and no container port is exposed to the host network directly. This eliminates the PREROUTING bypass problem at its root.
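A minimal sketch of that pattern, with illustrative names (backend, app, my-app-image):

```shell
# Internal network: no NAT, no port publishing, container-to-container only
docker network create --internal backend
docker run -d --name app --network backend my-app-image

# The proxy publishes its port on loopback only, then joins the internal
# network so it can reach the app container
docker run -d --name proxy -p 127.0.0.1:8080:80 nginx
docker network connect backend proxy
```

Because the published port binds to 127.0.0.1, nothing outside the host can reach it directly; host-level access control applies to whatever you place in front of it.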

firewalld presents a different problem. When Docker detects firewalld, it creates a firewalld zone called docker with target ACCEPT and inserts all Docker bridge interfaces into that zone. It also creates a forwarding policy called docker-forwarding that permits forwarding from any zone into the docker zone. If firewalld is using the nftables backend (the default on RHEL 8+), Docker's iptables rules and firewalld's nftables rules coexist in the kernel. A firewalld reload can flush rules in ways that break Docker's networking, particularly if the reload touches the nftables tables where the compatibility layer has placed Docker's translated rules.
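You can inspect what Docker set up under firewalld:

```shell
# Zones currently holding interfaces -- the docker zone should list the bridges
sudo firewall-cmd --get-active-zones
sudo firewall-cmd --zone=docker --list-all

# The forwarding policy Docker created (requires firewalld 1.0+)
sudo firewall-cmd --info-policy=docker-forwarding

# Which backend firewalld itself is using (nftables is the RHEL 8+ default)
grep FirewallBackend /etc/firewalld/firewalld.conf
```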

On firewalld systems, combining --iptables=false with manually written rich rules is viable only if your team is prepared to manage the full ruleset itself. For administrators who want Docker's automatic rule management and still need firewalld, the safest configuration is to keep firewalld managing the host's external interfaces and leave Docker full ownership of the docker zone. Use firewall-cmd --zone=docker --add-rich-rule to add per-port restrictions within Docker's own zone, rather than writing rules in external zones that Docker does not know about:

firewalld rich rules in the docker zone
# Restrict access to published port 8080 to a specific source range
# This works inside the docker zone without fighting Docker's forwarding policy
$ sudo firewall-cmd --zone=docker --add-rich-rule \
  'rule family="ipv4" source address="10.0.0.0/24" port port="8080" protocol="tcp" accept' \
  --permanent

# Block everything else on port 8080 in the docker zone
$ sudo firewall-cmd --zone=docker --add-rich-rule \
  'rule family="ipv4" port port="8080" protocol="tcp" drop' \
  --permanent

$ sudo firewall-cmd --reload

Note that firewalld rich rules in the docker zone are still subject to the PREROUTING DNAT problem -- they apply to the post-NAT packet in the FORWARD chain. Ordering between the rich rule DROP and Docker's own ACCEPT depends on rule priority, which firewalld manages internally. Test every change with nft list ruleset to confirm the translated rule ordering matches your intent before treating the configuration as production-safe.

Caution

Do not assume that adding a deny rule in UFW or firewalld protects your Docker containers. It does not. To restrict access to published container ports, you must use the DOCKER-USER chain (with the iptables backend) or write custom nftables rules with appropriate chain priorities (with the native nftables backend). Any other approach leaves your containers exposed regardless of what the firewall frontend reports.

Docker 29: The Native nftables Backend

Docker Engine 29.0.0, released November 11, 2025, introduced experimental support for a native nftables firewall backend -- the beginning of Docker's transition away from iptables entirely. The work built on Docker 28, which had already restructured the iptables rules for port publishing and network isolation as groundwork for native nftables support. The Docker team cited distribution-level iptables deprecation as the driver, noting in the official release post that it was time for Docker Engine to create its own nftables rules directly (Docker Blog, November 2025). When the new backend is enabled, Docker no longer issues iptables commands through the compatibility layer; it creates nftables rules directly in its own dedicated tables.

To enable the nftables backend, add the following to /etc/docker/daemon.json:

/etc/docker/daemon.json
{
  "firewall-backend": "nftables"
}

Or start the daemon with the command-line flag:

$ dockerd --firewall-backend=nftables

Under the nftables backend, Docker creates two dedicated tables: ip docker-bridges and ip6 docker-bridges. Each table contains base chains with well-known priority values, and additional chains are added for each bridge network. Docker expects full ownership of these tables -- do not modify them directly, as Docker may overwrite your changes at any time.
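You can inspect (but should not edit) these tables with the native tool:

```shell
# Docker's dedicated IPv4 and IPv6 tables under the native backend
sudo nft list table ip docker-bridges
sudo nft list table ip6 docker-bridges

# Just the chain structure, with rule handles, for a quick overview
sudo nft -a list table ip docker-bridges | grep -E 'chain|hook'
```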

Key Differences from the iptables Backend

Several important behaviors change when you switch to the nftables backend. The DOCKER-USER chain no longer exists, so custom filtering moves into your own nftables chains. Docker does not enable IP forwarding or adjust any FORWARD policy for you. And all of Docker's rules live in the dedicated docker-bridges tables, which Docker owns outright and may rewrite at any time.

Warning

The nftables backend in Docker 29 is experimental. Configuration options, behavior, and implementation may change in future releases. In a future release, nftables will become the default backend and iptables support will be deprecated. Test thoroughly in a non-production environment before migrating production workloads.

Custom Rules Under the nftables Backend

Without DOCKER-USER, you need a different approach to custom filtering -- and there is a critical semantic difference from iptables that trips up administrators who assume the same logic applies.

In nftables, an accept verdict is not final across base chains. When a packet is accepted in one base chain, it still traverses every other base chain attached to the same hook point. This means that if you write a custom nftables table with an accept rule for traffic you want to allow, and Docker's chain at the same hook point has a drop rule, the packet can still be dropped. Docker's official documentation makes this explicit: an accept verdict terminates processing within its own base chain, but the packet continues through any other base chains registered at the same hook, any of which may drop it (Docker Docs, 2025).

To actually override Docker's drop rules under the nftables backend, you must use a firewall mark with Docker's --bridge-accept-fwmark daemon option. You choose a mark value not already in use on your host, tell Docker to accept any packet carrying that mark, and then apply the mark in your own chain before Docker's chains run. Here is the pattern:

/etc/docker/daemon.json -- fwmark config
{
  "firewall-backend": "nftables",
  "bridge-accept-fwmark": "1"
}

With the daemon configured to accept mark 1, you write a custom nftables table that sets mark 1 on traffic you want to allow, using a chain priority lower than Docker's (to run before Docker's rules evaluate):

/etc/nftables.conf (excerpt) -- fwmark pattern
table inet my-docker-filter {
    chain forward {
        type filter hook forward priority filter - 1;
        policy accept;

        # Allow established connections
        ct state established,related accept

        # Mark allowed traffic and accept it in this chain; the packet then
        # continues to Docker's base chain carrying mark 1, which Docker accepts
        iifname "eth0" tcp dport 8080 ip saddr 10.0.0.0/24 meta mark set 0x1 accept

        # Drop all other external traffic to port 8080 before Docker evaluates it
        iifname "eth0" tcp dport 8080 drop
    }
}

The chain priority filter - 1 places your rules before Docker's chains at the same forward hook, which is required. If you apply the firewall mark at the same priority level as Docker's chains, the ordering between chains is not guaranteed, and your mark may not be set in time. The --bridge-accept-fwmark option also accepts an optional bitmask: for example, "bridge-accept-fwmark": "1/3" matches any packet where the low two bits of the mark equal 1.

If you only need to block traffic (rather than allow traffic that Docker would otherwise drop), a drop rule in a priority filter - 1 chain does work as expected -- a drop verdict is final across all chains. The fwmark mechanism is only needed when you want to explicitly allow traffic past Docker's own drop rules.
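A minimal drop-only sketch, reusing the eth0 interface and port 8080 from the example above -- no fwmark configuration required:

```
table inet my-docker-blocklist {
    chain forward {
        # Runs before Docker's chains; a drop verdict is final across
        # all base chains, so Docker never sees the packet
        type filter hook forward priority filter - 1;
        policy accept;

        # Block a specific source range from reaching the published port
        iifname "eth0" tcp dport 8080 ip saddr 203.0.113.0/24 drop
    }
}
```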

Migrating from the Compatibility Layer to Native nftables

If you decide to move from Docker's default iptables backend (running through the iptables-nft compatibility layer) to the native nftables backend, there are several steps to handle carefully.

First, understand what will happen: when you restart Docker with firewall-backend=nftables, Docker will delete most of its iptables chains and rules and create nftables rules instead. However, it will not remove the jump from the iptables FORWARD chain to DOCKER-USER. That jump will remain until you manually remove it or reboot the host. Any rules you had in DOCKER-USER will continue to execute via the stale iptables jump, but they will run alongside (not instead of) Docker's new nftables rules.

migration steps
# 1. Save your current ruleset for reference (iptables-save produces a restorable format)
$ sudo iptables-save > /tmp/iptables-backup.txt

# 2. Enable IP forwarding (nftables backend will not do this for you)
$ echo "net.ipv4.ip_forward = 1" | sudo tee /etc/sysctl.d/99-docker-forward.conf
$ sudo sysctl --system

# 3. Stop Docker
$ sudo systemctl stop docker

# 4. Check and reset the iptables FORWARD policy if needed
$ sudo iptables -P FORWARD ACCEPT

# 5. Configure the nftables backend
$ sudo tee /etc/docker/daemon.json <<'EOF'
{
  "firewall-backend": "nftables"
}
EOF

# 6. Translate your DOCKER-USER rules to native nftables syntax
# Use the backup to extract relevant rules -- note that iptables-restore-translate
# outputs nftables syntax where DOCKER-USER becomes a chain name like "DOCKER-USER"
# in the translated ip filter table. Review the full output, not just a grep:
# iptables-restore-translate -f /tmp/iptables-backup.txt > /tmp/ruleset.nft
# grep -A5 "DOCKER-USER" /tmp/ruleset.nft

# 7. Restart Docker
$ sudo systemctl start docker

# 8. Verify the new nftables ruleset
$ sudo nft list ruleset | grep docker

Pro Tip

The iptables-restore-translate utility converts a full ruleset dump into nftables syntax. Pipe your saved ruleset through it: iptables-restore-translate -f /tmp/iptables-backup.txt > ruleset.nft. For individual rules, iptables-translate handles single command-line entries. The translation is not always perfect -- review the output carefully, especially for rules using the conntrack extension -- but it provides a solid starting point.

The Atomic Ruleset Hazard

One of nftables' strengths -- atomic ruleset updates -- is also one of its sharpest operational hazards when Docker is in the picture. Nftables allows you to replace the entire ruleset in a single atomic operation using nft flush ruleset followed by nft -f /etc/nftables.conf. This is clean and deterministic, but it also wipes out every table, including the ones Docker created.

If a firewall management tool (or a systemd service that reloads nftables on boot) performs a full flush-and-replace, Docker's chains disappear. Containers lose network connectivity. Published ports stop working. MASQUERADE rules vanish, so containers can no longer reach the internet. Docker is not aware that its rules are gone and will not recreate them until the next relevant event (like creating a new network or restarting the daemon).

The solution is to never use nft flush ruleset on a host running Docker. Instead, flush only the tables you own:

safe nftables reload
# Flush only YOUR table, leave Docker's tables intact
$ sudo nft flush table inet my-docker-filter

# Reload your rules
$ sudo nft -f /etc/nftables-custom.conf

# Never do this on a Docker host:
# sudo nft flush ruleset  ← destroys Docker's chains

The systemd Reload Race

There is a second form of this hazard that is harder to catch: the systemd service ordering race at boot. On many distributions, nftables.service is configured to load /etc/nftables.conf at startup and that file begins with flush ruleset. If nftables.service runs after docker.service (which it can, depending on your unit ordering and whether Docker was already running before the reload), the boot-time nftables load wipes Docker's tables. Containers that were already running lose network connectivity without any obvious error in either service's log.

Audit your nftables configuration file directly. If the first non-comment line is flush ruleset, that file is incompatible with coexisting Docker on the same host. Replace it with explicit table-level flushes for only the tables you own:

/etc/nftables.conf -- safe pattern for Docker hosts
#!/usr/sbin/nft -f
# Do NOT use "flush ruleset" here -- it will destroy Docker's tables

# Declare your own table (add table is idempotent; won't fail if it exists)
add table inet my-host-filter

# Flush only your own table before reloading rules
flush table inet my-host-filter

table inet my-host-filter {
    chain input {
        type filter hook input priority filter; policy drop;
        # your host INPUT rules here
        ct state established,related accept
        iif lo accept
    }
}

If you do need to use flush ruleset (for example, in a one-shot cleanup script), always follow it immediately with systemctl restart docker so Docker recreates its tables before any container tries to send or receive traffic.
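As a sketch, the one-shot sequence looks like this; run it only when a brief network interruption for all containers is acceptable:

```shell
# A full flush wipes Docker's tables along with everything else...
sudo nft flush ruleset
sudo nft -f /etc/nftables.conf

# ...so Docker must be restarted immediately to recreate them
sudo systemctl restart docker
```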

The --iptables=false Escape Hatch

Docker provides a daemon flag (--iptables=false) that prevents it from creating firewall rules entirely. This option exists for environments where a dedicated network team manages all NAT and filter rules externally. It is tempting for administrators who are frustrated with Docker's rule manipulation, but it is a sharp knife that cuts quietly.

With --iptables=false, Docker will not create MASQUERADE rules, so containers in bridge networks cannot reach external hosts. It will not create DNAT rules, so published ports will not work. It will not create the FORWARD chain rules, so inter-container traffic routing breaks. You are responsible for implementing all of this yourself, and you must do it in a way that survives Docker creating and destroying containers and networks dynamically.

This option is appropriate only when you run fixed container networks with explicitly defined published ports, your organization has a dedicated team managing all firewall rules through nftables or a policy engine, and you fully accept that Docker will not work out of the box for developers who expect -p 8080:80 to simply work.

If you do commit to --iptables=false, you need to manually implement the three things Docker would have handled. Here is the minimum viable nftables ruleset for a single bridge network with one published port:

A manual nftables ruleset providing what Docker would otherwise have set up, for use with --iptables=false:
table ip docker-manual {

    # MASQUERADE: containers reach external networks via the host's IP
    chain postrouting {
        type nat hook postrouting priority srcnat;
        # Adjust 172.17.0.0/16 to match your Docker bridge subnet
        ip saddr 172.17.0.0/16 oifname != "docker0" masquerade
    }

    # DNAT: publish container port 80 on host port 8080
    chain prerouting {
        type nat hook prerouting priority dstnat;
        iifname != "docker0" tcp dport 8080 dnat to 172.17.0.2:80
    }

    # FORWARD: allow traffic to/from containers
    chain forward {
        type filter hook forward priority filter; policy accept;
        # Allow established traffic back to containers
        oifname "docker0" ct state established,related accept
        # Allow forwarded traffic into container from external
        iifname != "docker0" oifname "docker0" tcp dport 80 accept
        # Allow containers to reach external networks
        iifname "docker0" oifname != "docker0" accept
        # Allow inter-container traffic on the same bridge
        iifname "docker0" oifname "docker0" accept
    }
}

The critical operational problem with this approach is that bridge interface names and container IPs change dynamically. When you run docker network create, a new bridge appears. When a container restarts, its IP may change. With --iptables=false, none of this is reflected in your ruleset automatically. You either constrain yourself to a static network layout (fixed network names, fixed container IPs using --ip at run time) or you build external tooling -- a script that monitors Docker events via docker events --filter type=network and updates nftables rules accordingly. Neither is trivial in a dynamic environment where developers spin containers up and down freely.
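The event-monitoring approach can be sketched as a small shell function. This is an illustrative sketch, not a Docker feature: the function name, the RULES_FILE path, and the reload strategy (re-applying one hand-maintained nftables file on every network create or destroy) are all assumptions of this example.

```shell
# Sketch: with --iptables=false, nothing updates your firewall rules when
# Docker networks change. This watcher re-applies a hand-maintained
# nftables file (RULES_FILE is an illustrative path) on network events.
RULES_FILE="${RULES_FILE:-/etc/nftables.d/docker-manual.nft}"

reload_rules() {
    # nft -f applies the whole file in one atomic transaction
    if command -v nft >/dev/null 2>&1 && [ -r "$RULES_FILE" ]; then
        nft -f "$RULES_FILE"
    fi
}

watch_network_events() {
    # Expects lines of the form "<action> <network-name>" on stdin
    while read -r action name; do
        case "$action" in
            create|destroy)
                echo "network $action: $name -- reloading rules"
                reload_rules
                ;;
        esac
    done
}
```

On a live host you would feed it from the events stream: docker events --filter type=network --format '{{.Action}} {{.Actor.Attributes.name}}' | watch_network_events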

If you disable Docker's iptables integration, you are not simplifying your networking. You are taking full ownership of a complex, dynamic rule management problem that Docker was handling for you.

Operational Checklist

Whether you are running Docker with the iptables backend (through the iptables-nft compatibility layer) or the native nftables backend, there is a consistent set of validations to perform after any change to your firewall configuration, after a Docker upgrade, and after a host reboot:

  1. Verify the backend. Run iptables --version and confirm that all the xtables tools on the host (iptables, ip6tables, ebtables, arptables) use the same backend (either all nft or all legacy, never mixed). Also check iptables-legacy -L -n 2>/dev/null for stale rules in the legacy subsystem that may be silently processing packets alongside your nft rules.
  2. Confirm MASQUERADE is present. Without it, containers cannot reach external networks. Check with iptables -t nat -L POSTROUTING -n (iptables backend) or nft list table ip docker-bridges (nftables backend). Also run this check after creating or removing any Docker network, not just at reboot -- Docker recreates MASQUERADE rules per-network, and a race condition during network teardown can occasionally leave a MASQUERADE entry missing until the daemon is restarted.
  3. Verify the FORWARD jump to DOCKER-USER. If using the iptables backend, confirm DOCKER-USER is the first target in the FORWARD chain. If using the nftables backend, confirm your custom table's chain priority runs before Docker's chains. Also check the FORWARD chain policy -- Docker sets this to DROP when it enables IP forwarding itself, but if forwarding was already enabled on the host before Docker started, the policy may remain at ACCEPT. Verify the policy matches your intended posture, and note that some firewall management tools (including UFW's DEFAULT_FORWARD_POLICY) may reset it independently.
  4. Check hairpin NAT for container-to-host access. If your containers need to reach the host's published IP (for example, to call a service bound to the host's external address), verify that hairpin NAT is enabled on the bridge: cat /sys/class/net/docker0/brif/*/hairpin_mode should return 1 for each interface. Without hairpin NAT, a container calling the host's external IP for a published port gets no response because the packet never leaves the bridge.
  5. Test end-to-end connectivity. Start a test container, publish a port, and verify that external clients can reach the published port and that the container can reach external hosts. Also test container-to-container traffic on a separate Docker network to confirm isolation chains are working.
  6. Verify conntrack table headroom. On hosts running large numbers of containers with high connection rates, the netfilter conntrack table can fill up. When it is full, new connections are dropped silently. Check current usage with nf_conntrack_count and the limit with nf_conntrack_max in /proc/sys/net/netfilter/. If usage is consistently above 80% of the maximum, increase net.netfilter.nf_conntrack_max via sysctl, and consider lowering net.netfilter.nf_conntrack_tcp_timeout_established on hosts dominated by short-lived API connections.
  7. Verify rule persistence. Reboot the host and repeat all the above checks. Rules that disappear after a reboot indicate a persistence problem in your iptables-save/restore or nftables configuration. Pay particular attention to whether DOCKER-USER rules survive -- if they do not, check the systemd ordering of your rule-restoration service relative to docker.service, as described in the section above.
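Checks 1 and 6 reduce to text matching and arithmetic, so they are easy to wrap in small helpers. A sketch; the function names are my own, and the sample invocations at the bottom assume a live Linux host:

```shell
# Check 1: classify the backend from `iptables --version` output,
# e.g. "iptables v1.8.7 (nf_tables)" or "iptables v1.8.4 (legacy)"
classify_backend() {
    case "$1" in
        *"(nf_tables)"*) echo nf_tables ;;
        *"(legacy)"*)    echo legacy ;;
        *)               echo unknown ;;
    esac
}

# Check 6: conntrack usage as an integer percentage of the limit
conntrack_usage_pct() {
    # $1 = nf_conntrack_count, $2 = nf_conntrack_max
    echo $(( $1 * 100 / $2 ))
}

# On a live host (illustrative invocations):
#   classify_backend "$(iptables --version)"
#   conntrack_usage_pct "$(cat /proc/sys/net/netfilter/nf_conntrack_count)" \
#                       "$(cat /proc/sys/net/netfilter/nf_conntrack_max)"
```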

Wrapping Up

Docker's relationship with netfilter is the source of more production networking surprises than almost any other aspect of container infrastructure. The core problem is straightforward: Docker needs to manipulate firewall rules to implement bridge networking, and it does so aggressively, inserting chains and rules that take precedence over conventional host-level firewall configurations.

On modern distributions, a compatibility layer (iptables-nft) sits between Docker's iptables commands and the nftables kernel backend, translating everything transparently. This works well as long as every tool on the host uses the same backend -- the moment you have a split between legacy and nft, you are debugging in two firewall universes simultaneously.

Docker 29's experimental native nftables backend (released November 11, 2025) represents the next phase: Docker creating nftables rules directly, without the translation layer. The DOCKER-USER chain goes away, replaced by the nftables pattern of separate tables with priority-ordered base chains. IP forwarding management shifts to the administrator. The default drop policy disappears. Critically, the semantics of accept change: because an accept in one nftables base chain does not prevent other chains at the same hook from processing (and potentially dropping) the packet, allowing traffic past Docker's drop rules now requires the --bridge-accept-fwmark pattern rather than a simple accept verdict. It is a cleaner architecture, but it demands more intentional configuration from sysadmins.

Whichever backend you run, the principles remain the same: know which backend is active, never mix legacy and nft, use the designated hook points for custom rules (DOCKER-USER or a priority-ordered nftables table), persist your rules properly, and validate everything after reboots. Do that, and Docker's iptables compatibility layer stops being a source of confusion and starts being a well-understood component of your network stack.

How to Identify and Manage Docker's iptables Compatibility Layer

Step 1: Identify your iptables backend

Run iptables --version on the host. If the output contains (nf_tables) in parentheses, your system routes iptables commands through the nftables kernel backend via the iptables-nft compatibility layer. If it shows (legacy), you are using the older xtables interface directly. On Debian and Ubuntu, you can also run update-alternatives --display iptables to see which binary is selected.

Step 2: Inspect Docker's generated firewall rules

List Docker's chains in the filter table with iptables -L -n -v and in the nat table with iptables -t nat -L -n -v. If your system uses iptables-nft, you can also view the translated nftables ruleset by running nft list ruleset, where Docker's rules appear under tables named ip filter and ip nat. Look for the DOCKER, DOCKER-USER, DOCKER-ISOLATION-STAGE-1, and DOCKER-ISOLATION-STAGE-2 chains.

Step 3: Add custom filtering rules in the DOCKER-USER chain

Insert your custom rules into the DOCKER-USER chain, which Docker evaluates before its own FORWARD rules. First add an ESTABLISHED,RELATED accept rule using conntrack, then add your specific allow or deny rules. Because packets in this chain have already been through DNAT, use the conntrack extension with --ctorigsrc and --ctorigdstport flags to match on original source addresses and destination ports rather than the post-NAT container addresses.
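As a concrete sketch of this step, the rules might look like the following (the port 8080 and the 203.0.113.0/24 trusted range are illustrative values, and these commands require root on the Docker host):

```
# 1. Let return traffic through first
iptables -I DOCKER-USER 1 -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT

# 2. Allow a trusted range to reach the published port, matching on the
#    pre-NAT (original) destination port rather than the container port
iptables -I DOCKER-USER 2 -p tcp -m conntrack --ctorigdstport 8080 \
    --ctorigsrc 203.0.113.0/24 -j ACCEPT

# 3. Drop everything else aimed at that published port
iptables -I DOCKER-USER 3 -p tcp -m conntrack --ctorigdstport 8080 -j DROP
```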

Frequently Asked Questions

Why do my iptables rules not block traffic to Docker containers?

Docker routes published container traffic through the nat table's PREROUTING chain before the routing decision. After DNAT rewrites the destination to a container IP, the packet is forwarded through the FORWARD chain, not delivered to INPUT. Rules you add to INPUT are never reached by traffic destined for a published container port. Rules added to FORWARD are evaluated after Docker's DOCKER-USER and DOCKER chains have already run, and the packet's destination has already been rewritten. To filter traffic destined for containers, place your rules in the DOCKER-USER chain, which Docker processes before its own FORWARD rules.

What is the difference between iptables-legacy and iptables-nft?

iptables-legacy communicates directly with the kernel's xtables interface, while iptables-nft is a compatibility shim that translates iptables commands into nftables rules behind the scenes. On modern distributions like Debian 10+, Ubuntu 20.04+, and RHEL 8+, the iptables binary typically points to iptables-nft by default. You can check which backend your system uses by running iptables --version and looking for (nf_tables) or (legacy) in the output.

Does Docker 29's native nftables backend replace the iptables-nft compatibility layer?

Yes, when you enable the experimental nftables backend in Docker 29 with the firewall-backend=nftables option, Docker creates nftables rules directly in its own tables (ip docker-bridges and ip6 docker-bridges) instead of issuing iptables commands that get translated. This eliminates the compatibility layer entirely for bridge networks, though overlay network rules still use iptables. The DOCKER-USER chain does not exist under the nftables backend. Custom rules must be placed in separate nftables tables with appropriate base chain priorities. Additionally, because an accept verdict in nftables is not final across base chains, allowing traffic past Docker's own drop rules requires setting a firewall mark via the --bridge-accept-fwmark daemon option rather than relying on a simple accept.
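As a sketch of such a custom table (the table name, the priority offset, and the 0x1 mark are illustrative; the mark and mask must match whatever you pass to --bridge-accept-fwmark):

```
table inet admin-overrides {
    chain forward {
        # Run before Docker's chains (lower priority number = earlier)
        type filter hook forward priority filter - 10;
        # Mark traffic that Docker's own drop rules should let through;
        # a plain accept here would not stop Docker's chains from
        # dropping the packet later at the same hook
        ip saddr 203.0.113.0/24 tcp dport 8080 meta mark set 0x1 accept
    }
}
```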