nftables is the default firewall framework on every major Linux distribution shipped since 2020. It replaces iptables, ip6tables, arptables, and ebtables with a single unified tool -- nft -- backed by the nf_tables kernel subsystem. The syntax is cleaner, the performance is better (set lookups are hash- or tree-based instead of requiring one linearly-evaluated rule per match), and the feature set is richer, with native support for sets, maps, and atomic rule updates.
This article is a reference collection of practical nftables rules. Every example uses the inet address family, which handles both IPv4 and IPv6 in a single table, unless the rule specifically requires the ip family (as with NAT on older kernels). Each section is self-contained -- grab what you need, adapt it to your environment, and move on. If you want the conceptual background before working through examples, the companion article on nftables architecture and the migration from iptables covers the why before the how.
How nftables Is Structured
Before walking through rules, it helps to understand the hierarchy. nftables organizes everything into three layers: tables hold chains, and chains hold rules. Unlike iptables, none of these exist by default. You create every table, chain, and rule yourself, which means a freshly installed nftables system passes all traffic until you define policy.
Tables are scoped by address family. The families you will encounter are ip (IPv4 only), ip6 (IPv6 only), inet (both IPv4 and IPv6), arp, bridge, and netdev. For general-purpose firewalling, inet is almost always the right choice.
Chains come in two varieties. A base chain is attached to a netfilter hook (input, output, forward, prerouting, postrouting) and intercepts traffic at that point in the packet path. A regular chain is not attached to any hook and only runs when another rule jumps or gotos into it -- useful for organizing complex rulesets into logical groups.
Chain priority determines evaluation order when multiple chains hook into the same point. Lower numbers run first. The standard filter priority is 0, NAT prerouting is typically -100, and NAT postrouting is 100. You can use any integer. Since nftables 0.9.6, you can also write these as named keywords: priority filter, priority dstnat, and priority srcnat respectively. The numeric forms remain valid and are used throughout this article for maximum compatibility with older distributions.
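As a minimal sketch of this hierarchy, here is one table containing a base chain (hooked into input at the standard filter priority) and a regular chain reached only by an explicit jump -- the table and chain names are illustrative:

```
table inet example {
    # Regular chain: no hook -- runs only when another chain jumps here
    chain ssh_rules {
        tcp dport 22 ct state new accept
    }

    # Base chain: attached to the input hook; priority 0 == "priority filter"
    chain input {
        type filter hook input priority 0; policy drop;
        ct state established,related accept
        jump ssh_rules
    }
}
```

Jumping returns to the calling chain when the regular chain finishes without a verdict, so regular chains behave like subroutines for organizing rules.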
netdev Ingress and Egress: The Earliest Hook
The netdev address family provides hooks that fire before any other Netfilter processing. The ingress hook runs immediately after the NIC driver delivers a packet to the kernel — before fragment reassembly, before conntrack, before prerouting. This makes it the right place to drop definitively malformed or unwanted traffic with zero overhead from connection tracking. The egress hook (added in kernel 5.16) mirrors this on the outbound path, firing after routing but before the driver transmits the packet. Unlike the inet family, each netdev chain is bound to a single named interface.
One detail that trips people up: because the ingress hook runs before IP defragmentation (which happens at priority -400), a rule at priority -500 may see individual IP fragments. For fragments after the first, the transport header is absent — the fragment carries only the IP header. A rule matching tcp flags or tcp dport will not match those subsequent fragments. Rules matching only on IP fields (ip saddr, ip daddr) work correctly on all fragments. Since kernel 5.10, the inet family also supports an ingress hook, which shares sets and maps with the rest of your inet table but does not impose the single-interface restriction.
#!/usr/sbin/nft -f

flush ruleset

# Replace enp1s0 with your actual WAN interface name
table netdev filter {
    chain ingress {
        type filter hook ingress device enp1s0 priority -500; policy accept;

        # Drop IP fragments before reassembly -- no transport header available
        ip frag-off & 0x1fff != 0 counter drop

        # Drop bogon source addresses (RFC 5735 / Team Cymru bogon list)
        ip saddr {
            0.0.0.0/8, 100.64.0.0/10, 127.0.0.0/8,
            169.254.0.0/16, 192.0.0.0/24, 192.0.2.0/24,
            198.18.0.0/15, 198.51.100.0/24, 203.0.113.0/24,
            224.0.0.0/3
        } counter drop

        # Drop XMAS packets (all TCP flags set)
        tcp flags & (fin|syn|rst|psh|ack|urg) == fin|syn|rst|psh|ack|urg counter drop

        # Drop NULL packets (no TCP flags set)
        tcp flags & (fin|syn|rst|psh|ack|urg) == 0x0 counter drop

        # Drop TCP SYN packets with an anomalously small MSS value
        # Legitimate MSS values start at 536 (RFC 879 minimum)
        tcp flags syn tcp option maxseg size 1-535 counter drop
    }
}
The bogon list above omits RFC 1918 private ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16) intentionally. A publicly routed server should drop those as source addresses on its WAN interface, but a server sitting inside a private network should not. Adjust the list to match your topology. The full Team Cymru bogon reference is maintained at team-cymru.com.
Use netdev ingress when you need per-interface early filtering on a specific WAN interface, or when you need to match any ethertype (ARP, VLAN 802.1q, Q-in-Q). Use inet ingress (kernel 5.10+) when you want to share sets and maps already defined in your inet filter table — for example, referencing your existing blocklist set from an early ingress rule without duplicating it. The two can coexist; they operate independently at the same hook point.
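A sketch of the inet-ingress variant, assuming a blocklist set defined in the same table (the set, chain, and interface names here are illustrative):

```
table inet filter {
    set blocklist {
        type ipv4_addr
        flags interval
    }

    # inet ingress (kernel 5.10+): still bound to a device, but shares
    # this table's sets and maps with the regular input/forward chains
    chain early_drop {
        type filter hook ingress device eth0 priority -500; policy accept;
        ip saddr @blocklist counter drop
    }
}
```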
Basic Workstation Firewall
This is the starting point for a workstation or end-user device. It drops all inbound traffic by default, allows established connections and loopback, permits SSH, and accepts ICMP for basic network diagnostics. Outbound traffic is unrestricted.
#!/usr/sbin/nft -f

flush ruleset

table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;

        # Accept established/related, drop invalid
        ct state established,related accept
        ct state invalid drop

        # Allow loopback
        iifname "lo" accept

        # ICMP and ICMPv6
        ip protocol icmp accept
        ip6 nexthdr icmpv6 accept

        # SSH
        tcp dport 22 accept
    }

    chain forward {
        type filter hook forward priority 0; policy drop;
    }

    chain output {
        type filter hook output priority 0; policy accept;
    }
}
The flush ruleset directive at the top is critical. It clears every existing table, chain, and rule before loading the new configuration, ensuring a clean state every time the file is applied. Without it, repeated loads would stack duplicate rules.
Setting the input policy to drop means any packet that does not match an explicit accept rule will be silently discarded. If you are configuring a remote server over SSH, make sure your SSH accept rule is in place before you load the ruleset. Test with nft -c -f /etc/nftables.conf to validate syntax without applying.
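One defensive pattern for remote hosts is to schedule an automatic rollback before applying -- a sketch, assuming the new ruleset lives at /etc/nftables.conf:

```
# Validate syntax without applying anything
nft -c -f /etc/nftables.conf

# Save the currently running ruleset as a rollback point
nft list ruleset > /tmp/nftables.rollback

# Schedule a rollback, then apply; cancel the rollback only after
# confirming SSH still works from a second session
(sleep 120 && nft -f /tmp/nftables.rollback) &
nft -f /etc/nftables.conf
# ...verify connectivity, then cancel the pending rollback: kill %1
```

If the new ruleset locks you out, the background job restores the previous configuration within two minutes.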
Consider how this ruleset handles an internet scanner probing TCP port 8080. The first rule, ct state established,related accept, does not match, because this is a new connection with no existing tracking entry. The second rule drops ct state invalid packets; a fresh SYN is valid, so evaluation continues. The loopback rule doesn't match. The ICMP rules don't match. The SSH rule on port 22 doesn't match port 8080. No rule accepts it, so the chain's default policy drop fires: the packet is silently discarded with no response to the sender. The scanner sees only a timeout. Compare this to a server using reject as its final rule -- that scanner would get an immediate ICMP port-unreachable, confirming the host is alive but the port is closed.
Server Firewall with Multiple Services
A web server typically needs to expose HTTP, HTTPS, and SSH while blocking everything else. This example uses a named set to group the allowed ports into a single rule, which is both cleaner and faster than writing separate rules for each port.
#!/usr/sbin/nft -f

flush ruleset

table inet filter {
    set allowed_tcp_ports {
        type inet_service
        elements = { 22, 80, 443 }
    }

    chain input {
        type filter hook input priority 0; policy drop;

        ct state established,related accept
        ct state invalid drop
        iifname "lo" accept
        ip protocol icmp accept
        ip6 nexthdr icmpv6 accept

        # Accept traffic to ports in the allowed set
        tcp dport @allowed_tcp_ports accept

        # Reject everything else with a polite ICMP message
        reject with icmpx type port-unreachable
    }

    chain forward {
        type filter hook forward priority 0; policy drop;
    }

    chain output {
        type filter hook output priority 0; policy accept;
    }
}
Named sets give you a central place to manage allowed ports. Adding a new service later is a single nft add element command -- no need to insert rules at the right position in a chain. Sets use hash-based lookups internally, so performance stays constant regardless of how many elements they contain.
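For example, to open an additional service on the fly (assuming the table and set names from the ruleset above):

```
# Add port 8443 to the running set -- takes effect immediately
nft add element inet filter allowed_tcp_ports '{ 8443 }'
```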
nftables 1.0.6 introduced a built-in ruleset optimizer accessible via nft -c -o -f ruleset.nft (the flag was stabilized and fixed for counter statements in 1.0.7 and 1.1.0). It merges adjacent rules that match the same fields into anonymous sets and verdict maps automatically -- for example, two separate tcp dport 80 counter accept and tcp dport 443 counter accept rules get merged into a single tcp dport { 80, 443 } counter accept. Run the optimizer against any existing ruleset before deploying to reduce rule count and improve evaluation throughput. The -c flag runs in check-only mode so nothing is applied.
Rate Limiting and Connection Throttling
Rate limiting is essential for protecting exposed services from brute-force attacks and resource exhaustion. nftables supports rate limiting at both the global level and per-source-IP using dynamic sets.
Global ICMP Rate Limit
This rule accepts a maximum of 10 ICMP echo-request packets per second. Traffic exceeding that threshold is silently dropped by the chain's default policy.
# Inside an input chain with policy drop
ip protocol icmp icmp type echo-request \
    limit rate 10/second accept
When you write limit rate 10/second, nftables sets the burst to the same value as the rate: 10 packets. A source that was idle can therefore send 10 packets instantly before the limiter engages. For ICMP this is usually fine, but for SSH brute-force protection limit rate 3/minute allows 3 attempts in rapid succession before the per-minute replenishment begins -- exactly the burst window an automated tool exploits. To tighten this, add an explicit burst: limit rate 3/minute burst 1 packets allows only 1 attempt before the rate gate closes.
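Side by side, the two forms discussed above look like this (illustrative rules for an input chain):

```
# Default burst: several attempts can land back-to-back after idle time
tcp dport 22 ct state new limit rate 3/minute accept

# Explicit burst: at most one attempt before the rate gate engages
tcp dport 22 ct state new limit rate 3/minute burst 1 packets accept
```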
Per-IP SSH Rate Limit with Dynamic Sets
This is where nftables significantly outperforms iptables. A dynamic set tracks state per source IP address, automatically expiring entries after a timeout. The following example limits each source IP to 3 new SSH connections per minute. For a complete picture of locking down SSH -- including certificate-based authentication, jump hosts, and fail2ban -- see Hardening SSH: Beyond the Basics.
table inet filter {
    set ssh_ratelimit {
        type ipv4_addr
        size 65536
        timeout 60s
        flags dynamic
    }

    chain input {
        type filter hook input priority 0; policy drop;

        ct state established,related accept
        ct state invalid drop
        iifname "lo" accept

        # Rate limit new SSH connections per source IP
        ct state new tcp dport 22 \
            update @ssh_ratelimit { ip saddr limit rate 3/minute } \
            accept
    }
}
When a packet matches this rule, nftables adds the source address to the ssh_ratelimit set and attaches a rate limiter to that entry. If the address already exists, the rate limiter is evaluated and the 60-second timeout is refreshed. After 60 seconds of inactivity, the entry expires and the limiter is released. This replaces the hashlimit and connlimit modules from iptables with a single, more flexible mechanism.
Note the size 65536 parameter on the set declaration: it caps memory usage. When the set is full, new source IPs are no longer added and the rate limit rule stops matching them -- so monitor set occupancy with nft list set inet filter ssh_ratelimit on high-exposure servers.
Connection Count Limiting
You can also limit the number of concurrent connections from a single IP. This example restricts each address to 2 simultaneous SSH sessions. It uses a named dynamic set with update, which is the current recommended approach. The older inline meter syntax still parses in modern nftables builds -- the userspace tool converts it to a named dynamic set internally -- but using an explicit named set with flags dynamic is cleaner, more readable, and easier to inspect with nft list set.
set ssh_connlimit {
    type ipv4_addr
    size 65535
    flags dynamic
}

# Reject if source IP has more than 2 active SSH connections
# Must use add (not update) -- ct count and set timeouts are mutually exclusive
tcp dport 22 add @ssh_connlimit \
    { ip saddr ct count over 2 } \
    reject with icmpx type port-unreachable
ct count tracks concurrent connections by querying the conntrack table directly, not the set element's own timer. This means two things that catch experienced practitioners off-guard. First, you must use add, not update, in the rule that references ct count -- using update causes an immediate "Operation not supported" error because update attempts to refresh a set element timeout that does not exist when ct count is in use. Second, defining a timeout on the set itself also triggers the same error for the same reason. The conntrack table's own per-protocol timeout governs when entries expire -- by default, TCP ESTABLISHED connections time out after 5 days (controlled by net.netfilter.nf_conntrack_tcp_timeout_established). There are no set-level timers to manage here. The size 65535 flag on the set caps memory under heavy scanning.
Sets and Maps
Sets and maps are among the strongest features in nftables. Sets group values of the same type (IP addresses, ports, interface names) and let you reference them in a single rule. Maps go further by associating a key with a value or a verdict.
IP Blocklist with Interval Set
An interval set supports CIDR ranges, making it ideal for blocklists. Lookups use an interval-tree structure in the kernel, so they stay fast even as the list grows to thousands of entries.
set blocklist {
    type ipv4_addr
    flags interval
    elements = { 203.0.113.0/24, 198.51.100.0/24, 192.0.2.50 }
}

# In the input chain, before any accept rules:
ip saddr @blocklist drop
You can add and remove elements dynamically without reloading the entire ruleset.
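For example, against the blocklist set above (assuming it lives in table inet filter):

```
# Add a single address to the running set
nft add element inet filter blocklist '{ 192.0.2.99 }'

# Remove it again -- no ruleset reload required
nft delete element inet filter blocklist '{ 192.0.2.99 }'
```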
When ingesting external IP blocklists from threat feeds, add flags auto-merge to the set declaration alongside flags interval. Without auto-merge, adding overlapping CIDR ranges produces an error. With it, nftables merges adjacent and overlapping prefixes automatically, keeping the set minimal and preventing duplicate-entry failures during bulk imports. Note: a regression introduced in nftables 1.1.0 broke auto-merge on sets that also carry a timeout value -- this was fixed in 1.1.3 (June 2025). If you combine both flags on a build between 1.1.0 and 1.1.2 inclusive, verify the merge behavior before deploying to production.
Trusted Networks Set
The inverse approach works for allow-listing internal networks. This is especially useful on servers that should only accept management traffic from known subnets.
set trusted_nets {
    type ipv4_addr
    flags interval
    elements = { 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16 }
}

# Allow all traffic from trusted networks
ip saddr @trusted_nets accept
Verdict Maps
A verdict map associates a key with an action. Instead of writing a separate rule per port, you define a map that tells nftables what to do for each destination port.
map port_policy {
    type inet_service : verdict
    elements = { 22 : accept, 80 : accept, 443 : accept, 23 : drop }
}

# Apply the verdict map to incoming TCP traffic
tcp dport vmap @port_policy
Consider a packet arriving on a port with no map entry, such as 8080. The vmap expression only matches ports that are explicitly listed in the map. Port 8080 has no entry, so the expression does not match and evaluation continues to the next rule in the chain. If the chain's policy is drop and no subsequent rule accepts the packet, it gets dropped by policy -- but the vmap itself did nothing. Standard maps have no catch-all entry, so unmatched traffic must be handled by a later rule or by the chain policy.
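If you want explicit handling for unlisted ports rather than relying on the chain policy, follow the vmap with a catch-all rule -- a sketch:

```
tcp dport vmap @port_policy
# Only packets with no map entry reach this rule; mapped ports
# already received a terminal verdict (accept or drop) above
tcp dport 1-65535 counter drop
```

The counter here also gives you visibility into how much unlisted-port traffic the host receives.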
Concatenations
Concatenations let you match on combinations of fields in a single set. This example grants specific IP addresses access to specific ports -- like an access control list baked directly into the firewall.
set allowed_access {
    type ipv4_addr . inet_service
    elements = {
        10.0.0.5 . 3306,
        10.0.0.6 . 5432,
        10.0.0.10 . 6379
    }
}

# Match source IP + destination port together
ip saddr . tcp dport @allowed_access accept
In this configuration, 10.0.0.5 can reach MySQL on port 3306, 10.0.0.6 can reach PostgreSQL on 5432, and 10.0.0.10 can reach Redis on 6379. No other combinations are permitted. Without concatenation, you would need a separate rule for each pair.
Suppose 10.0.0.5 attempts to reach Redis on port 6379. The rule ip saddr . tcp dport @allowed_access accept evaluates the concatenation 10.0.0.5 . 6379 against the set. That pair is not present (only 10.0.0.5 . 3306, 10.0.0.6 . 5432, and 10.0.0.10 . 6379 are). The expression does not match, so no accept verdict is issued. Evaluation falls through to subsequent rules -- and if none accept it, the packet is dropped by chain policy. The key insight is that the concatenation creates a compound key: both dimensions must match simultaneously. Being in the set on one dimension (source IP) does not grant access to ports not listed for that specific IP.
NAT and Masquerading
NAT rules require chains of type nat hooked into prerouting (for destination NAT) or postrouting (for source NAT and masquerading). The NAT engine uses connection tracking -- only the first packet of a flow is evaluated against the ruleset, and the binding is applied to all subsequent packets in that connection automatically.
Internet Gateway with Masquerade
The classic NAT gateway: internal hosts on a private subnet share a single public IP for outbound internet access. Masquerading automatically uses the outgoing interface's current address, which makes it the right choice for interfaces with dynamic IPs (DHCP, PPPoE).
#!/usr/sbin/nft -f

flush ruleset

table ip nat {
    chain prerouting {
        type nat hook prerouting priority -100; policy accept;
    }

    chain postrouting {
        type nat hook postrouting priority 100; policy accept;

        # Masquerade all traffic leaving via the WAN interface
        oifname "eth0" masquerade
    }
}

table inet filter {
    chain forward {
        type filter hook forward priority 0; policy drop;

        # Allow LAN to WAN
        iifname "eth1" oifname "eth0" accept

        # Allow established traffic back from WAN to LAN
        iifname "eth0" oifname "eth1" ct state established,related accept
    }
}
IP forwarding must be enabled in the kernel for NAT to work. Add net.ipv4.ip_forward = 1 to /etc/sysctl.d/99-forwarding.conf and apply it with sysctl -p. For IPv6 forwarding, set net.ipv6.conf.all.forwarding = 1 in the same file. For a broader understanding of how the kernel makes routing decisions before packets reach nftables postrouting, see Understanding Linux Routing Tables.
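The resulting drop-in file would contain both settings:

```
# /etc/sysctl.d/99-forwarding.conf
net.ipv4.ip_forward = 1
net.ipv6.conf.all.forwarding = 1
```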
The NAT engine only evaluates the first packet of a flow. Once a NAT mapping is created, all subsequent packets in that connection are translated automatically without re-entering the nat chain. This means a policy drop on a nat chain is not a safety net -- it is a trap. Any first packet that reaches the end of the chain without matching a rule gets dropped, and no mapping is ever created for the reply direction. The connection fails with no log entry and no error, because the drop happens in the nat chain before the filter chain has any visibility. Always set policy accept on nat chains. Your filtering policy belongs in the inet filter forward and input chains, not here.
Source NAT with a Static IP
If your WAN interface has a static address, use SNAT instead of masquerade. SNAT is slightly more efficient because it does not need to look up the interface address for every new connection.
chain postrouting { type nat hook postrouting priority 100; policy accept; oifname "eth0" snat to 203.0.113.1 }
Port Forwarding (DNAT)
Destination NAT rewrites the destination address and/or port of incoming packets, directing them to a host behind the gateway. This is the standard technique for exposing internal services through a public IP.
Forward HTTP to an Internal Web Server
table ip nat { chain prerouting { type nat hook prerouting priority -100; policy accept; # Forward port 80 to internal web server iifname "eth0" tcp dport 80 dnat to 192.168.1.10:80 # Forward port 443 to the same server iifname "eth0" tcp dport 443 dnat to 192.168.1.10:443 # Forward SSH on port 2222 to a different internal host iifname "eth0" tcp dport 2222 dnat to 192.168.1.50:22 } chain postrouting { type nat hook postrouting priority 100; policy accept; # Masquerade traffic headed to forwarded hosts ip daddr 192.168.1.0/24 masquerade } }
The masquerade rule in postrouting covers topologies where the internal server's default route does not point back at the gateway: it rewrites the source address so reply traffic is forced through the gateway rather than taking a direct, asymmetric path the client would drop. If the gateway already is the server's default route, the DNAT alone is sufficient for internet clients -- and omitting the masquerade preserves the real client IP in the server's logs.
Hairpin NAT
When a host on the internal LAN tries to reach an internal server using the gateway's public IP -- the normal case when you have no split-horizon DNS -- the basic port-forwarding setup fails silently. The gateway rewrites the destination to the internal server's IP via DNAT, but the source address remains the LAN client's private IP. The internal server sends its reply directly back to the LAN client rather than through the gateway, so the gateway never rewrites the source back, and the client drops the asymmetric reply. The fix is twofold: the prerouting DNAT must also match LAN-originated traffic addressed to the public IP, and a hairpin masquerade rule in postrouting must rewrite the source address of those forwarded packets to the gateway's LAN IP, forcing all reply traffic back through the gateway where the NAT mapping can be applied correctly.
table ip nat {
    chain prerouting {
        type nat hook prerouting priority -100; policy accept;

        # DNAT inbound port 80 from the internet to the internal web server
        iifname "eth0" tcp dport 80 dnat to 192.168.1.10:80

        # Hairpin DNAT: LAN clients addressing the gateway's public IP
        # (203.0.113.1 here) must be rewritten too
        iifname "eth1" ip daddr 203.0.113.1 tcp dport 80 dnat to 192.168.1.10:80
    }

    chain postrouting {
        type nat hook postrouting priority 100; policy accept;

        # Normal WAN masquerade
        oifname "eth0" masquerade

        # Hairpin: LAN clients reaching the internal web server via the public IP
        # Rewrite source to gateway LAN IP so replies route back through here
        ip saddr 192.168.1.0/24 ip daddr 192.168.1.10 tcp dport 80 masquerade
    }
}
The alternative to hairpin NAT is split-horizon DNS, where an internal DNS resolver returns the internal IP for the relevant hostname rather than the public IP. That approach is cleaner because it avoids the extra masquerade hop and keeps traffic entirely on the LAN. Hairpin NAT is the right solution when you cannot control DNS for all clients on the network.
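As a sketch of the split-horizon approach -- assuming dnsmasq as the LAN resolver and www.example.com as the published hostname:

```
# /etc/dnsmasq.conf (LAN resolver)
# Return the internal address instead of the public IP for this name
address=/www.example.com/192.168.1.10
```

LAN clients then connect to 192.168.1.10 directly and the gateway's NAT rules never see the traffic.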
Local Port Redirect
Redirect is a special case of DNAT that sends packets to the local machine. This is useful for transparent proxying -- intercepting HTTP traffic and sending it to a local proxy on a non-standard port.
chain prerouting {
    type nat hook prerouting priority -100; policy accept;

    # Redirect port 80 to a local proxy on port 3128
    tcp dport 80 redirect to :3128
}
Bypassing Connection Tracking with notrack
Every packet that passes through the kernel normally gets a conntrack entry. On a high-volume service like a public DNS resolver, NTP server, or UDP-based game server, that tracking adds CPU overhead and memory pressure for flows that do not need stateful inspection. The notrack statement exempts specific packets from connection tracking entirely. It must be placed in a chain with a prerouting or output hook at priority lower than -200 — the raw priority (-300) is the standard choice, because conntrack registers at -200 and must not have seen the packet yet. Packets marked notrack arrive in later chains with ct state untracked; your input chain rules must explicitly accept that state for those services.
table inet filter {
    chain raw_pre {
        type filter hook prerouting priority -300; policy accept;

        # Skip conntrack for DNS and NTP -- stateless, high volume
        udp dport { 53, 123 } notrack
    }

    chain input {
        type filter hook input priority 0; policy drop;

        ct state established,related accept
        ct state invalid drop
        iifname "lo" accept

        # Accept notrack'd packets -- ct state untracked for bypassed flows
        ct state untracked udp dport { 53, 123 } accept

        tcp dport 22 ct state new accept
    }
}
The performance gain is real but narrow: notrack pays off on services receiving tens of thousands of packets per second with short-lived flows. For a typical web server with long-lived HTTP/2 connections, the per-connection conntrack cost is negligible and notrack adds complexity with no benefit. Measure with conntrack -L | wc -l and compare against nf_conntrack_max before applying it.
SYN Proxy: SYN Flood Mitigation Without Consuming conntrack
A SYN flood exhausts conntrack resources by sending large numbers of half-open TCP connections. The synproxy statement intercepts incoming SYN packets and completes the three-way handshake using SYN cookies without creating a conntrack entry. Only after the client proves it is a real host by returning a valid ACK does the kernel create a conntrack entry and pass the connection to the server. This keeps the conntrack table empty during a flood. Three kernel settings are required before synproxy will work correctly.
Run these three commands and make them persistent in /etc/sysctl.d/: net.ipv4.tcp_syncookies = 1, net.ipv4.tcp_timestamps = 1, and net.netfilter.nf_conntrack_tcp_loose = 0. The first two enable the SYN cookie mechanism synproxy relies on. The third is critical and non-obvious: with loose tracking on (the default), the final ACK from the client would be marked ESTABLISHED and let through, but synproxy needs it to be marked INVALID so the rule below can catch it and complete the handshake. Without setting loose to 0, legitimate connections will time out silently.
# sysctl prerequisites (make persistent in /etc/sysctl.d/):
#   net.ipv4.tcp_syncookies = 1
#   net.ipv4.tcp_timestamps = 1
#   net.netfilter.nf_conntrack_tcp_loose = 0

table ip synproxy_demo {
    chain raw_pre {
        type filter hook prerouting priority -300; policy accept;

        # Mark incoming SYN packets on port 80 as untracked
        tcp dport 80 tcp flags syn notrack
    }

    chain input {
        type filter hook input priority 0; policy accept;

        # Hand untracked SYN packets and INVALID 3WHS ACK packets to synproxy
        # mss and wscale should match what your actual backend announces
        tcp dport 80 ct state { invalid, untracked } \
            synproxy mss 1460 wscale 9 timestamp sack-perm

        # Drop anything that did not match synproxy (bad cookies, etc.)
        ct state invalid drop
    }
}
To determine the correct mss and wscale values to advertise, capture a SYN-ACK from your backend with tcpdump -pni eth0 -c 1 'tcp[tcpflags] == (tcp-syn|tcp-ack)' port 80 and read the TCP options from the output. The kernel module nft_synproxy must be loaded; verify with lsmod | grep synproxy. On RHEL 8 and earlier, the module is absent from default kernels — RHEL 9 with kernel 5.14.0-84 and later includes it.
Conntrack Helpers for Multi-Connection Protocols
Some application-layer protocols open secondary connections whose ports are negotiated dynamically during the primary session. FTP passive mode, SIP, H.323, and TFTP all work this way. Without a conntrack helper, a stateful firewall cannot know which secondary ports to allow, so those connections are either blocked or you are forced to open wide port ranges. A ct helper named object teaches conntrack how to parse the primary session and create RELATED entries for the secondary connections automatically.
Two critical details that the documentation often glosses over. First, helper assignment must happen at a chain with hook priority greater than -200 — conntrack must have already classified the packet before you assign a helper to it. The filter priority (0) is correct; the raw priority (-300) is too early. Second, when matching on RELATED packets in your accept rule, ct helper "ftp" uses the in-kernel helper name, not your named object name. The object is named ftp-standard in the example below, but the match expression uses the kernel name "ftp". Getting this wrong produces a rule that silently never matches.
table inet filter {
    # Declare named ct helper objects
    ct helper ftp-standard {
        type "ftp" protocol tcp;
        l3proto inet;
    }

    ct helper sip-5060 {
        type "sip" protocol udp;
        l3proto inet;
    }

    ct helper tftp-69 {
        type "tftp" protocol udp;
        l3proto inet;
    }

    chain prerouting {
        type filter hook prerouting priority 0; policy accept;

        # Assign helpers to primary connections
        # Must run AFTER conntrack (priority > -200)
        ct state new tcp dport 21 ct helper set "ftp-standard"

        # Assign multiple helpers in a single rule using a verdict map
        ct helper set udp dport map { 69 : "tftp-69", 5060 : "sip-5060" }
    }

    chain input {
        type filter hook input priority 0; policy drop;

        ct state established,related accept
        ct state invalid drop

        # Allow initial FTP control connection
        tcp dport 21 ct state new accept

        # Allow FTP data channels -- restrict to FTP-related flows
        # ct helper matches the KERNEL name ("ftp"), not the object name
        ct state related ct helper "ftp" \
            tcp dport 1024-65535 accept
    }
}
Automatic helper assignment (echo 1 > /proc/sys/net/netfilter/nf_conntrack_helper) is disabled by default on kernels 4.7 and later for security reasons — it would assign helpers to all traffic on the registered port regardless of source or destination. Explicit assignment in a rule, as shown above, is the secure approach and has been required since kernel 4.12 / nftables 0.8. The helper kernel modules (nf_conntrack_ftp, nf_conntrack_sip, etc.) must be loaded either manually or via /etc/modules-load.d/.
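A drop-in file loads the helper modules at boot -- the filename here is illustrative:

```
# /etc/modules-load.d/conntrack-helpers.conf
nf_conntrack_ftp
nf_conntrack_sip
nf_conntrack_tftp
```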
Logging and Counters
One of the advantages nftables has over iptables is the ability to combine multiple actions in a single rule. In iptables, logging and dropping a packet required two separate rules. In nftables, you combine both into one.
# Log and drop in a single rule
tcp dport 23 log prefix "[nftables telnet] " counter drop

# Log new SSH connections without interfering with traffic
tcp dport 22 ct state new log prefix "[nftables ssh] " counter accept

# Rate-limit log output to avoid flooding syslog
tcp dport 23 limit rate 6/minute log prefix "[nftables telnet] " counter drop
Statement order matters in nftables rules. Expressions are evaluated left to right. If you place log before limit rate, every packet is logged regardless of the rate limit. Place the limit expression first so that only packets within the threshold reach the log statement.
Counters are simple to use. Adding counter to any rule tracks the number of packets and bytes that matched. You can view the counts with nft list ruleset and reset them with nft reset counters.
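Beyond anonymous per-rule counters, nftables also supports named counter objects that several rules can share and that can be listed or reset individually -- a sketch with illustrative names:

```
table inet filter {
    counter ssh_new {}

    chain input {
        type filter hook input priority 0; policy drop;

        # Reference the named counter object from a rule
        tcp dport 22 ct state new counter name "ssh_new" accept
    }
}
```

List all named counters with nft list counters, or reset a single one with nft reset counter inet filter ssh_new.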
Production-Ready Ruleset
This final example combines everything into a complete, production-grade configuration for a server that exposes web and SSH services, rate-limits SSH, maintains a blocklist, logs rejected traffic, and keeps counters for monitoring.
#!/usr/sbin/nft -f

flush ruleset

table inet filter {
    set blocklist {
        type ipv4_addr
        flags interval
        # starts empty -- populate at runtime with nft add element
    }

    set ssh_ratelimit {
        type ipv4_addr
        timeout 120s
        flags dynamic
    }

    set web_ports {
        type inet_service
        elements = { 80, 443 }
    }

    chain input {
        type filter hook input priority 0; policy drop;

        ct state established,related counter accept
        ct state invalid counter drop
        iifname "lo" accept

        ip saddr @blocklist counter drop

        ip protocol icmp icmp type echo-request \
            limit rate 10/second counter accept
        ip6 nexthdr icmpv6 icmpv6 type {
            echo-request, nd-router-solicit,
            nd-router-advert, nd-neighbor-solicit,
            nd-neighbor-advert
        } limit rate 10/second counter accept

        ct state new tcp dport 22 \
            update @ssh_ratelimit { ip saddr limit rate 3/minute } \
            counter accept

        tcp dport @web_ports counter accept

        limit rate 30/minute log prefix "[nftables reject] " \
            counter reject with icmpx type port-unreachable
    }

    chain forward {
        type filter hook forward priority 0; policy drop;
    }

    chain output {
        type filter hook output priority 0; policy accept;
    }
}
This ruleset follows a defensive ordering: connection tracking first (so established traffic takes the fast path), blocklist next (early discard of known-bad sources), then rate-limited services, then open ports, and finally a logged reject for anything else. The reject with ICMP at the end is more network-friendly than a silent drop for legitimate clients that hit the wrong port -- they get an immediate response instead of waiting for a timeout. For tuning the kernel-side network stack that nftables sits on top of, see Linux Kernel Tuning for High-Traffic Servers.
Flowtable Fastpath for Forwarded Traffic
On gateway and router hosts, flowtables let established TCP and UDP flows bypass the classic Netfilter forward chain entirely after the first packet. Once a flow is added to the flowtable via flow add @f, all subsequent packets in that connection are forwarded directly through neigh_xmit(), skipping prerouting, forward, and postrouting hooks. The NAT mapping from the initial packet is cached inside the flowtable entry so SNAT and DNAT are still applied correctly. There are two precise consequences many people do not anticipate: any counter statement on your forward chain stops incrementing for offloaded flows (the counter only fires for the first one or two packets), and fragmented packets are never offloaded because the flowtable lookup requires a complete 7-tuple including transport ports, which fragments do not carry.
table inet router {
    # Flowtable registered on the ingress hook of both interfaces
    flowtable ft {
        hook ingress priority 0;
        devices = { eth0, eth1 };
    }

    chain forward {
        type filter hook forward priority 0; policy drop;

        # Offload established TCP/UDP flows after the first packet.
        # flow add is non-terminal, so evaluation continues to the
        # accept below -- but this rule must come BEFORE the blanket
        # established accept, or it never matches.
        ct state established meta l4proto { tcp, udp } flow add @ft

        ct state established,related accept
        ct state invalid drop

        # Allow new connections from LAN to WAN
        iifname "eth1" oifname "eth0" ct state new accept
    }
}
Hardware offload is also available on supported NICs by adding flags offload to the flowtable declaration (requires kernel 5.13 and a driver that implements the offload interface -- Mellanox/NVIDIA ConnectX adapters are among the few with upstream support as of kernel 6.x). When a flow is hardware-offloaded, the conntrack status bit IPS_HW_OFFLOAD is set and the connection appears as [HW_OFFLOAD] in conntrack -L output. For bandwidth shaping and queuing discipline on top of nftables filtering, see tc and Traffic Shaping: A Practical Guide.
Ruleset Management
Knowing the rules is only half the job. You also need to load, validate, export, and debug them efficiently.
# List the entire active ruleset
# nft list ruleset

# List a specific table
# nft list table inet filter

# List rules with handles (needed for deletion)
# nft -a list chain inet filter input

# Delete a specific rule by handle number
# nft delete rule inet filter input handle 14

# Validate configuration without applying
# nft -c -f /etc/nftables.conf

# Load configuration from file
# nft -f /etc/nftables.conf

# Export ruleset in JSON (useful for scripting)
# nft -j list ruleset

# Flush everything -- use with caution on remote hosts
# nft flush ruleset

# Save current rules to the persistent config file
# nft list ruleset > /etc/nftables.conf
To ensure your rules survive a reboot, enable the nftables service with systemd:

# systemctl enable nftables
The systemd service reads /etc/nftables.conf at boot and applies it atomically. Every rule change goes through a transaction -- either the entire configuration loads successfully, or nothing changes. There is no partial-apply state.
The classic way to lock yourself out of a remote server is to load a ruleset whose base chains default to drop without a rule permitting your current SSH session. Note that a bare nft flush ruleset has the opposite failure mode: with no chains left to hook traffic, the host passes everything, wide open. Always keep a backup session or console access when making changes to a production firewall.
Docker, Podman, and firewalld all write their own tables into the same nftables ruleset. Running nft flush ruleset wipes every table from every source simultaneously, with no confirmation prompt and no error. Container port mappings stop working immediately, managed firewall policies vanish, and nothing in the output tells you why. On systems running any of these tools, use flush table inet filter (or the specific table name you own) rather than flush ruleset. If you must use flush ruleset -- for example, during initial provisioning -- verify that Docker and any other table-writing services are restarted afterward. For a full picture of how Docker manages its own network namespace and iptables/nftables rules, see Docker Networking Without the Guesswork.
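Before flushing anything on such a system, list the tables so you can see what else is sharing the ruleset. The output below is illustrative of what Docker's iptables-nft backend typically creates alongside your own table; your table names will differ:

```
# See every table, from every source, before touching anything
# nft list tables
table inet filter
table ip nat
table ip filter

# Flush only the table you own
# nft flush table inet filter
```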
If your config file does not start with flush ruleset (or a targeted flush table) and you run nft -f on a file that re-declares a set with elements already present in the live ruleset, the load fails with a duplicate-element error. Because nft -f commits the file as a single atomic transaction, nothing from the file is applied -- the old ruleset stays live, which is easy to miss if you assumed the new rules went in. The safest pattern is always to start with a flush, or to use the nft -c -f check pass first and confirm the file applies cleanly against the current kernel state before loading.
The flags owner Table Flag
A table declared with flags owner is owned exclusively by the process that created it. Non-owner processes that attempt to modify or delete an owner table receive EPERM. More importantly, nft flush ruleset skips owner tables entirely -- they do not get wiped. The table is automatically destroyed when the owning process exits or closes its netlink socket, which makes the flag useful for temporary diagnostic rulesets (add tracing, close the shell, tracing disappears). Firewalld uses this mechanism to protect its tables from being inadvertently cleared by an administrator running a manual flush ruleset. The flag was introduced in nftables 0.9.9 / Linux kernel 5.13.
# This table is tied to the interactive nft session.
# Closing the shell removes it automatically.
# nft flush ruleset will NOT touch it.
table ip temp-trace {
    flags owner;

    chain prerouting {
        type filter hook prerouting priority raw - 1;
        ip protocol tcp meta nftrace set 1
    }
}
The flags persist Table Flag
nftables 1.1.0 introduced a complementary table flag: flags persist. Where flags owner ties a table's lifetime to a process and destroys it on exit, flags persist does the opposite -- the table survives process exit and is not wiped by nft flush ruleset, but it also has no owning process and any process can modify it. The intended use case is daemons that need to ensure their table is not accidentally cleared by an operator running a manual flush, but that also need the table to survive if the daemon restarts. Firewalld moved to flags persist in newer releases for exactly this reason. Unlike flags owner, a persist table must be deleted explicitly with nft delete table.
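A minimal sketch of a persist table, assuming nftables 1.1.0 or later (the table and chain names here are illustrative):

```
# Survives the owning daemon's exit AND is skipped by nft flush ruleset.
# Must be removed explicitly: nft delete table inet mydaemon
table inet mydaemon {
    flags persist;

    chain input {
        type filter hook input priority -10; policy accept;
    }
}
```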
flush table Does Not Clear Named Set Contents
This catches people who switch from flush ruleset to flush table inet filter specifically to avoid touching Docker's tables. flush table empties every chain in the named table but leaves the chains themselves and any named sets intact, including the sets' current elements. If you change the elements block of a set in your config file and reload with flush table followed by nft -f, the file re-declares the set -- but adding an element that already exists in the live set fails with a duplicate error, and if auto-merge is not enabled on the set, the entire load aborts. The set ends up with its previous contents, not the new ones from the file.
There are two clean solutions. The first is to explicitly delete and re-create the set in your config file before populating it: a delete set inet filter blocklist command before the new set blocklist { ... } declaration. The second is to use flush ruleset during provisioning when Docker is not running, then rely on idempotent set declarations with auto-merge for subsequent reloads. Using flush ruleset on a live system with Docker running remains the footgun described above -- these two constraints are in genuine tension, and there is no single answer that satisfies both simultaneously.
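A sketch of the first pattern, assuming the set already exists in the live ruleset (if it might not, the delete command fails and aborts the whole transaction, so keep provisioning and reload paths separate); the element values are placeholders:

```
# Reload-safe config: drop the stale set, then re-declare it
flush table inet filter
delete set inet filter blocklist

table inet filter {
    set blocklist {
        type ipv4_addr
        flags interval
        auto-merge
        elements = { 203.0.113.0/24, 198.51.100.17 }
    }

    # ... chains referencing @blocklist follow ...
}
```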
Per-Flow ct timeout Objects
The default TCP ESTABLISHED conntrack timeout is 432000 seconds -- five days. On servers handling thousands of short-lived connections (HTTP APIs, microservices), that default keeps dead connection tracking entries in memory long after the sessions are gone, inflating the conntrack table and consuming kernel memory. A named ct timeout object lets you attach an aggressive timeout policy to specific flows without changing the global sysctl. The object specifies per-state timeouts; any state not listed inherits the kernel default. Support was added in nftables 0.9.1.
table inet filter {
    # Override conntrack timeouts for API traffic on port 8443
    # Default TCP ESTABLISHED timeout is 432000s (5 days)
    ct timeout api-tcp {
        l3proto ip;
        protocol tcp;
        policy = {
            established: 120,   # 2 minutes instead of 5 days
            close_wait: 4,
            close: 4
        }
    }

    chain input {
        type filter hook input priority 0; policy drop;
        ct state established,related accept

        # Attach the aggressive timeout to new API connections
        tcp dport 8443 ct state new \
            ct timeout set "api-tcp" accept
    }
}
To see current conntrack table size and usage, run conntrack -L | wc -l and compare against sysctl net.netfilter.nf_conntrack_max. When the table is full, new connections cannot be tracked and are dropped. Aggressive ct timeout policies on high-connection-rate services are a more surgical fix than raising nf_conntrack_max globally.
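A quicker way to check how close the tracker is to capacity -- conntrack -C reads the entry count directly instead of dumping the whole table:

```
# Current number of tracked connections
# conntrack -C

# Configured ceiling
# sysctl net.netfilter.nf_conntrack_max

# Kernel-side count without the conntrack tool installed
# cat /proc/sys/net/netfilter/nf_conntrack_count
```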
Live Packet Tracing with nftrace and nft monitor
When a packet disappears and you do not know which rule is dropping it, meta nftrace set 1 combined with nft monitor trace shows you every rule the packet touches, in every chain, with the verdict at each step. This is the nftables equivalent of the iptables TRACE target, but more powerful: it follows the packet across all tables and chains, reporting the full packet headers, the rule text, and the final verdict. The trace output travels over the kernel netlink interface to userspace, so it does not require syslog and is visible immediately in the terminal.
The recommended workflow is to add a temporary trace chain hooked at prerouting with a lower priority number than any of your existing chains (so it runs first), narrow its match expression to the specific traffic you want to diagnose, then run nft monitor trace in a second terminal. Using flags owner on the table makes cleanup automatic when you close the shell.
# Add this table in an interactive nft session.
# It is automatically removed when you close the shell (flags owner).
# Start watching in a second terminal: nft monitor trace
table ip debug-trace {
    flags owner;

    chain trace_pre {
        # Run before your existing prerouting chain
        type filter hook prerouting priority -301;

        # Narrow the trace to only the traffic you care about
        # This example: TCP to port 8080 from a specific source
        ip saddr 198.51.100.42 tcp dport 8080 \
            limit rate 6/minute meta nftrace set 1
    }
}
# In a second terminal -- watch packets traverse your ruleset
# nft monitor trace

# Filter output to a specific chain
# nft monitor trace | grep "inet filter input"

# Watch for ruleset changes (rule additions, deletions, table flushes)
# nft monitor

# Watch only for newly created tables
# nft monitor new tables

# Watch only for deleted tables
# nft monitor destroy tables
The rate limit on the trace rule (limit rate 6/minute) is important in production. Without it, a high-traffic flow generates thousands of netlink events per second, which can flood the monitoring terminal and measurably impact throughput. Always rate-limit trace rules before applying them to a live server. Remove the trace table when debugging is complete -- leaving meta nftrace set 1 active permanently adds a measurable per-packet overhead. For packet-level inspection beyond what nftrace provides, Wireshark with a Remote Linux Capture: tcpdump + SSH Piping covers capturing traffic from a remote host without installing a GUI on the server.
The best firewall is the one you understand completely. nftables gives you explicit control over every table, chain, and rule. There is no magic -- just clear declarations and predictable behavior.
Wrapping Up
nftables gives you a single, coherent framework for everything iptables used to spread across four separate tools. The examples in this article cover the patterns you will encounter repeatedly: stateful filtering with connection tracking, named sets for clean port and IP management, per-source rate limiting with dynamic sets, verdict maps for compact policy logic, NAT with masquerading and DNAT for gateways and port forwarding, and combined logging with counters for observability.

One parting caveat: nft list ruleset > /etc/nftables.conf exports the kernel's compiled view of the ruleset. After a reboot the server loads fine, but any define variable names and inline comments from your hand-written configuration are gone, because they exist only in the source file, never in the kernel. Keep the authoritative config under version control rather than round-tripping it through the kernel.
Start with the basic workstation or server firewall, adapt it to your services, and build up from there. Every rule in this article is designed to be copied, modified, and deployed. The nft -c -f validation flag is your safety net -- use it every time before loading a new configuration.
How to Build an nftables Firewall
Step 1: Create a table and base chains
Use the nft add table command with the inet family to create a dual-stack table that handles both IPv4 and IPv6. Then add base chains for input, forward, and output hooks with appropriate priorities and default policies. Set the input and forward chain policies to drop so that only explicitly permitted traffic passes through.
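Sketched as individual nft commands -- the quoting protects the braces and semicolons from the shell, and the table name "filter" is just the conventional choice:

```
# nft add table inet filter
# nft add chain inet filter input '{ type filter hook input priority 0; policy drop; }'
# nft add chain inet filter forward '{ type filter hook forward priority 0; policy drop; }'
# nft add chain inet filter output '{ type filter hook output priority 0; policy accept; }'
```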
Step 2: Add filtering rules with connection tracking
Insert rules that accept established and related connections using ct state, drop invalid packets, and allow traffic on the loopback interface. Then add rules to permit specific services such as SSH, HTTP, and HTTPS by matching on tcp dport. Use sets to group multiple ports into a single rule for cleaner configuration.
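For example, assuming the inet filter table and input chain from step 1 (the port list is illustrative):

```
# nft add rule inet filter input ct state established,related accept
# nft add rule inet filter input ct state invalid drop
# nft add rule inet filter input iifname "lo" accept
# nft add rule inet filter input tcp dport '{ 22, 80, 443 }' accept
```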
Step 3: Save and enable the ruleset at boot
Export the active ruleset to /etc/nftables.conf by running nft list ruleset and redirecting the output. Validate the configuration file without applying it by running nft with the -c and -f flags. Finally, enable the nftables systemd service so the rules are loaded automatically on every boot.
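The three commands in order:

```
# nft list ruleset > /etc/nftables.conf
# nft -c -f /etc/nftables.conf
# systemctl enable nftables
```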
Frequently Asked Questions
What is the difference between the ip and inet families in nftables?
The ip family processes only IPv4 packets, while the inet family processes both IPv4 and IPv6 packets in a single table. Using inet lets you write one set of rules that applies to both protocol versions, eliminating the need for duplicate rulesets. For any new configuration, inet is the recommended family unless you have a specific reason to handle IPv4 and IPv6 separately.
How do I make nftables rules persist across reboots?
Save your ruleset to /etc/nftables.conf using nft list ruleset and redirect the output to the file. Then enable the nftables systemd service with systemctl enable nftables so the configuration is loaded automatically at boot. You can also use the nft -f flag to load a configuration file manually at any time.
Can I use nftables and iptables at the same time?
While technically possible through the iptables-nft compatibility layer, running both simultaneously is discouraged because they share the same underlying kernel infrastructure and can produce unpredictable rule interactions. On modern distributions like Debian 12 and RHEL 9, the iptables command already translates rules into the nftables backend. For new deployments, use nftables exclusively and migrate any existing iptables rules with the iptables-translate utility.
Sources
The technical claims in this article are grounded in the following primary sources. All nftables behavior described reflects the kernel-side netfilter subsystem and the nft userspace tool as documented in their official materials.
- Neira Ayuso, Pablo. nftables wiki: Performing Network Address Translation (NAT). Netfilter Project. wiki.nftables.org. Confirms NAT chain types, prerouting priority -100, and postrouting priority 100; establishes that masquerade requires Linux kernel 3.18 or later.
- Netfilter Project. nftables wiki: Netfilter hooks. wiki.nftables.org. Source for hook ordering, priority keyword names (filter, dstnat, srcnat), and the note that named priority keywords were introduced in nftables 0.9.6.
- Netfilter Project. nftables wiki: Meters. wiki.nftables.org. Establishes that dynamic sets with flags dynamic and stateful expressions (limit rate, ct count) are the recommended replacement for iptables hashlimit and connlimit. Also documents the mutually exclusive relationship between ct count and set timeouts: combining them produces "Operation not supported", and update must be replaced with add when using ct count.
- Netfilter Project. nftables wiki: Sets. wiki.nftables.org. Documents named set types, flags interval, flags dynamic, timeout, the typeof keyword (available since 0.9.4), and the 16-character set name limit.
- Neira Ayuso, Pablo. nftables 1.1.0 release announcement. LWN.net, 2024. lwn.net/Articles/982283/. Source for the nft -c -o ruleset optimizer behavior, including automatic merging of adjacent rules into anonymous sets.
- Red Hat. Security Guide: Configuring NAT using nftables. RHEL 7 Documentation. docs.redhat.com. Confirms postrouting priority 100 and prerouting priority -100 for NAT chain declarations; cross-checks masquerade and SNAT procedures.
- Wikipedia contributors. nftables. Wikipedia, The Free Encyclopedia. en.wikipedia.org/wiki/Nftables. Background on nftables availability since Linux kernel 3.13 (January 2014) and project history including the Netfilter Workshop 2008 presentation by Patrick McHardy.
- Linux Kernel Documentation. Netfilter's flowtable infrastructure. docs.kernel.org. Authoritative source for flowtable fastpath mechanics: offloaded flows bypass all Netfilter hooks after ingress, fragmented traffic falls back to classic forwarding, NAT mappings are cached per flowtable entry, and TTL is decremented before neigh_xmit(). Also documents the hardware offload flag and kernel 5.7 counter support.
- Neira Ayuso, Pablo. nftables 0.9.9 release announcement. LWN.net, 2021. lwn.net/Articles/857369/. Source for the flags owner table flag: owner tables are skipped by flush ruleset, destroyed on process exit, and reject modifications from non-owner processes with EPERM. Also source for the flowtable hardware offload flag introduced in nftables 0.9.9.
- Netfilter Project. nftables wiki: Ct timeout. wiki.nftables.org. Documents the ct timeout object syntax for overriding per-flow conntrack state timeouts, with per-state granularity. Feature added in nftables 0.9.1.
- Netfilter Project. nftables wiki: Netfilter hooks. wiki.nftables.org. Documents the netdev family ingress and egress hooks, the single-interface binding requirement, fragmentation behavior at priority below -400, and the inet family ingress hook added in kernel 5.10.
- Netfilter Project. nftables wiki: Setting packet connection tracking metainformation. wiki.nftables.org. Source for notrack statement requirements (priority below -200, raw priority recommended), the resulting ct state untracked match, and the conntrack zone assignment pattern. Feature added in kernel 4.9 / nftables 0.7.
- Netfilter Project. nftables wiki: Synproxy. wiki.nftables.org. Documents the three required sysctl prerequisites (tcp_syncookies, tcp_timestamps, nf_conntrack_tcp_loose = 0), the two-chain pattern (raw notrack + input synproxy), and the reason the final ACK must arrive as INVALID state. Named synproxy objects added in nftables 0.9.3.
- Netfilter Project. nftables wiki: Conntrack helpers. wiki.nftables.org. Documents the ct helper named object syntax, the rule priority constraint (must be > -200), and the security recommendation to restrict RELATED accepts by destination address. Also clarifies that ct helper "ftp" in match expressions uses the kernel name, not the object name. Requires kernel 4.12 / nftables 0.8.
- Netfilter Project. nftables wiki: Ruleset debug/tracing. wiki.nftables.org. Documents meta nftrace set 1, the recommended trace chain pattern, and nft monitor trace output format showing per-rule verdicts across all chains.
- Neira Ayuso, Pablo. nftables 1.0.6 release announcement. LWN.net, 2022. lwn.net/Articles/909570/. Source for the introduction of the -o/--optimize ruleset optimizer flag in nftables 1.0.6. Fixes and counter-statement support were added in 1.0.7 and 1.1.0 respectively.
- Neira Ayuso, Pablo. nftables 1.1.0 release announcement. LWN.net, 2024. lwn.net/Articles/982283/. Source for the flags persist table flag introduced in 1.1.0: persist tables survive process exit and are skipped by flush ruleset but have no owning process. Also source for -o/--optimize counter-statement support in 1.1.0.
- Sowden, Jeremy. nftables 1.1.3 release. Debian changelog, June 2025. launchpad.net (Ubuntu nftables changelog). Documents the restoration of the auto-merge feature for sets that also carry a timeout value, which was broken in releases 1.1.0 through 1.1.2. Also documents the nft list hooks improvements and netdev egress listing added in 1.1.1.