error: beginning maxstartups throttling -- What It Means and How to Fix It

You are mid-deploy, running a parallel configuration management push across fifty hosts, when half your SSH sessions start failing silently. Or an automated CI pipeline starts logging sporadic connection timeouts against a jump host. Or you are troubleshooting something urgent and every second ssh invocation comes back with nothing -- no error, just a closed connection. You check /var/log/auth.log or journalctl -u sshd and find a line that looks like this:

auth.log / journalctl -u sshd

error: beginning maxstartups throttling
error: drop connection #11
error: drop connection #12
error: drop connection #13

On the client side, the cause is invisible. The connecting ssh process does not receive an explanatory message -- it simply gets the TCP connection closed before authentication begins. Depending on the client and context, you will see something like Connection closed by remote host, ssh_exchange_identification: read: Connection reset by peer (newer clients report the same failure as kex_exchange_identification), or in Ansible output, a generic UNREACHABLE with no useful detail. The mismatch between an unexplained client failure and a very explicit server-side log message is part of why this issue gets misdiagnosed as a network problem or a host availability issue rather than an sshd configuration limit.

This is MaxStartups -- one of sshd's oldest and least-understood defense mechanisms. It is not a bug, not a misconfiguration on its own, and not something to silence blindly. Getting it right requires understanding what it actually does at the protocol level, why the defaults exist, and what the error is telling you about your environment before you change a single line of config.

What MaxStartups Actually Does

MaxStartups controls how many unauthenticated connection attempts sshd will tolerate simultaneously before it starts dropping new ones. The key word is unauthenticated. This directive has nothing to do with established, authenticated SSH sessions. It is a gate that applies only during the handshake phase -- from the moment a TCP connection is accepted until authentication either succeeds or fails.

The OpenSSH source code is unambiguous about this. In sshd.c, the throttling logic evaluates the count of connections that are still in the pre-authentication state. Once a session is authenticated, it exits the pool entirely and no longer counts against MaxStartups. This means a server with a hundred active SSH sessions and zero unauthenticated handshakes in progress will never trigger this error, regardless of how low the limit is set.

sshd_config(5) man page, OpenSSH

The official documentation describes MaxStartups as a ceiling on simultaneous unauthenticated connections to the SSH daemon. Any connections beyond that ceiling are dropped until an existing one either authenticates successfully or its LoginGraceTime window expires.

The directive accepts either a single integer or a colon-separated triple in the form start:rate:full. Understanding this three-value syntax is essential because the default changed in a way that catches many administrators off guard.

MaxStartups is not MaxSessions or MaxAuthTries

These three directives are frequently confused with each other. MaxSessions limits the number of open shell sessions, port forwards, or subsystem requests per multiplexed SSH connection -- it has no bearing on unauthenticated connection counts. MaxAuthTries limits how many authentication attempts a single already-accepted connection can make before sshd disconnects it. Neither of those is what fires when you see beginning maxstartups throttling. That message is produced exclusively by the unauthenticated connection pool logic controlled by MaxStartups.

The Three-Value Syntax: start:rate:full

The simple form -- a plain integer like MaxStartups 10 -- means drop all connections once there are ten unauthenticated handshakes in flight. That is a hard cliff: connections one through ten proceed normally, connection eleven is dropped unconditionally.

The three-value form adds a probabilistic middle ground:

start -- below this count, all connections are accepted normally
rate -- the percentage of new connections that are randomly dropped once the count reaches start
full -- at or above this count, all new connections are dropped

The probability of dropping a connection is rate% when the count reaches start and increases linearly to 100% at full. This is intentional -- it creates a soft ramp rather than a sudden cliff, which makes the behavior more resistant to timing-based exploitation. An attacker who can precisely time connection bursts to stay just under a hard limit gets no such advantage against a probabilistic drop policy.
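As a rough illustration, the ramp can be reproduced with integer arithmetic. The sketch below follows the behavior documented in sshd_config(5), where the refusal probability is rate% at start and climbs linearly to 100% at full. The name drop_pct is a hypothetical helper, not part of OpenSSH.

```shell
#!/bin/sh
# drop_pct START RATE FULL N  (hypothetical helper, not an OpenSSH tool)
# Prints the percent chance that a new connection is refused while N
# unauthenticated connections are already in flight.
drop_pct() (
    start=$1 rate=$2 full=$3 n=$4
    if [ "$n" -lt "$start" ]; then echo 0; exit; fi
    if [ "$n" -ge "$full" ]; then echo 100; exit; fi
    # Linear ramp from RATE at START to 100 at FULL, truncated to an integer.
    echo $(( rate + (100 - rate) * (n - start) / (full - start) ))
)

# Walk a few points along the default 10:30:100 ramp.
for n in 5 10 55 99 100; do
    printf '%3d in flight -> %3d%% drop chance\n' "$n" "$(drop_pct 10 30 100 "$n")"
done
```

With the default 10:30:100 this prints 0% at 5, 30% at 10, 65% at 55, 99% at 99, and 100% at 100. A single-integer setting such as MaxStartups 10 corresponds to drop_pct 10 100 10, which jumps straight from 0% to 100%.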

example MaxStartups values

# Hard limit: drop all connections beyond 10 unauthenticated
MaxStartups 10

# Soft ramp: start dropping at 10, full drop at 100
# At 10 in flight a new connection has a 30% drop chance, rising linearly to 100% at 100
MaxStartups 10:30:100

# Current default (long-standing): 10:30:100
# The original default was the hard integer 10 (no probabilistic ramp)
MaxStartups 10:30:100

Default Value History

The 10:30:100 default has been in place for many years. The original default was simply 10 -- a hard integer with no probabilistic ramp -- and was changed to 10:30:100 to add the soft drop curve. OpenSSH 9.8 (released July 1, 2024) did not change the MaxStartups default; its notable sshd change was the introduction of PerSourcePenalties, a new mechanism that penalizes client addresses that repeatedly fail authentication or cause crashes. If you are running 9.8 or later, you have both MaxStartups and PerSourcePenalties working in parallel -- see the PerSourcePenalties section below. Verify your version with sshd -V (on older builds that do not recognize -V, ssh -V reports the version of the client, which usually ships from the same release) and always confirm the running value with sshd -T | grep maxstartups.

Why the Error Is Firing on Your Server

There are three distinct reasons you can see beginning maxstartups throttling, and they point toward completely different responses. Conflating them is where sysadmins get into trouble.

Scenario 1: Automation that opens many connections simultaneously

This is the overwhelmingly common cause in production environments. Configuration management tools (Ansible, SaltStack, Puppet over SSH transport), deployment pipelines, monitoring agents that SSH-check hosts, and parallel shell tools like pssh or cssh all open multiple SSH connections in rapid succession. When Ansible fans out to fifty hosts concurrently from a single control node, or when a CI system launches a matrix of parallel jobs, every one of those connections spends some time in the unauthenticated state while key exchange and authentication complete.

On a jump host or bastion that aggregates this traffic, ten simultaneous unauthenticated connections is an extremely low ceiling. If the authentication step takes even 300 milliseconds -- which is normal for key-based auth with an SSH agent -- then a burst of fifty simultaneous connections will have many of them colliding in the unauthenticated window.
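A back-of-the-envelope estimate helps here: the average number of connections sitting in the unauthenticated state is roughly the arrival rate multiplied by the time each handshake takes. The helper below is a hypothetical sketch of that arithmetic, and the rates and durations are illustrative assumptions rather than measurements.

```shell
#!/bin/sh
# inflight CONNS_PER_SEC HANDSHAKE_MS  (hypothetical helper)
# Estimates how many connections are unauthenticated at any instant:
# arrival rate multiplied by the time spent in the handshake.
inflight() (
    rate=$1 ms=$2
    echo $(( rate * ms / 1000 ))
)

# 50 connections per second at 300 ms each
echo "fast auth, heavy burst: $(inflight 50 300)"
# 2 connections per second, but 5300 ms each because of a slow lookup
echo "slow auth, light load:  $(inflight 2 5300)"
```

Both cases land at or above the default start value of 10 (15 and 10 respectively), which is why a modest amount of slow authentication can trigger throttling as readily as a large burst.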

Scenario 2: An actual brute-force or scanning attack

This is what MaxStartups was originally designed to defend against. Password brute-force tools and SSH scanners often work by maintaining many simultaneous half-open connections, attempting credentials against multiple sessions at once to maximize throughput. A low MaxStartups value degrades their effectiveness significantly -- it throttles the attack rate without requiring fail2ban, port knocking, or other external tooling.

OpenSSH security rationale — openssh.com

The OpenSSH project describes MaxStartups as a deliberate defense against large-scale brute-force attacks. By limiting how many simultaneous unauthenticated connections the daemon accepts, it constrains an attacker's ability to parallelize credential attempts against the SSH daemon. Source: OpenSSH security considerations.

The diagnostic step before changing anything is to determine which scenario you are in. Check the logs carefully. If the dropped connections are all coming from internal IP ranges -- your automation control node, your CI runner, your monitoring system -- you are in Scenario 1. If you see a diverse mix of external source IPs with rapid reconnect patterns, treat that as a potential attack before adjusting limits.
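One way to make that check less manual is to tally dropped connections per source address. The sketch below is a hypothetical helper, and it assumes the drop lines carry the source as "from [ADDRESS]:PORT", which may not hold on every OpenSSH version or log format; lines without that field are skipped. Only RFC 1918 IPv4 ranges are labeled private, so IPv6 sources always show as public.

```shell
#!/bin/sh
# tally_drops  (hypothetical helper)
# Reads sshd log lines on stdin and prints "COUNT KIND ADDRESS" for each
# source of dropped connections, busiest first. KIND is "private" for
# RFC 1918 IPv4 ranges and "public" for everything else.
# Assumes drop lines contain "from [ADDRESS]:PORT".
tally_drops() {
    awk '
        /drop connection/ {
            i = index($0, "from [")
            if (i == 0) next
            rest = substr($0, i + 6)      # text right after "from ["
            j = index(rest, "]")
            if (j < 2) next
            count[substr(rest, 1, j - 1)]++
        }
        END {
            for (ip in count) {
                kind = "public"
                if (ip ~ /^10\./ || ip ~ /^192\.168\./ ||
                    ip ~ /^172\.(1[6-9]|2[0-9]|3[01])\./)
                    kind = "private"
                printf "%d %s %s\n", count[ip], kind, ip
            }
        }
    ' | sort -rn
}

# Usage: journalctl -u sshd --since "1 hour ago" | tally_drops
```

A list dominated by a handful of private addresses points toward automation, while a long tail of public addresses points toward scanning.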

Scenario 3: Slow authentication extending the unauthenticated window

This is less discussed but frequently encountered. Each unauthenticated connection holds its slot until authentication succeeds or LoginGraceTime expires. Anything that delays authentication artificially inflates how long each connection occupies the pool. Three common culprits are worth checking before you conclude the problem is simply connection volume.

Reverse DNS lookups: When UseDNS yes is set (the default on some older distributions), sshd performs a reverse DNS lookup on the connecting IP address before proceeding. On a network where the DNS resolver is slow, unreachable, or returning NXDOMAIN after a long timeout, every connection spends several extra seconds in the unauthenticated state. With ten connections each taking five extra seconds, you can hit the default MaxStartups ceiling even with moderate traffic. Check this with:

terminal

# Check whether UseDNS is enabled in the running config
$ sudo sshd -T | grep usedns

# If it returns "usedns yes", consider disabling it
# Set in /etc/ssh/sshd_config:
UseDNS no

GSSAPI authentication negotiation: When GSSAPIAuthentication yes is configured but no Kerberos infrastructure is in place, clients still attempt GSSAPI negotiation before falling back to other methods. Depending on the client and the network, this negotiation can take several seconds per connection. Disabling GSSAPI when it is not in use eliminates this delay entirely:

/etc/ssh/sshd_config

# Disable GSSAPI if Kerberos is not in use
GSSAPIAuthentication no

Slow or unavailable LDAP/PAM backends: Systems using PAM for authentication that call out to an external LDAP or Active Directory backend can experience similar delays if that backend is slow or temporarily unreachable. The PAM call happens during the authentication phase, which still keeps the connection in the unauthenticated pool from sshd's perspective. If your LDAP server has latency issues, sshd's unauthenticated slot count will swell even under normal login volume.

In all three of these cases, raising MaxStartups can mask the actual problem. The more correct fix is to eliminate the authentication delay at its source -- disable UseDNS, turn off GSSAPI, or address the LDAP latency -- and then verify that the throttling messages stop without having changed the connection limit at all.
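The first two culprits, plus the grace time, can be checked in one pass over the effective configuration. This is a minimal sketch with a hypothetical helper name; it assumes the lowercase "keyword value" output that sshd -T prints, and the 30-second threshold is this article's suggestion rather than an OpenSSH rule. LDAP or PAM latency does not show up in sshd -T and still has to be measured separately.

```shell
#!/bin/sh
# audit_preauth  (hypothetical helper)
# Reads `sshd -T` output on stdin and prints one line for each setting
# that tends to lengthen the unauthenticated window.
audit_preauth() {
    awk '
        $1 == "usedns" && $2 == "yes" {
            print "usedns yes: reverse DNS lookup on every connection"
        }
        $1 == "gssapiauthentication" && $2 == "yes" {
            print "gssapiauthentication yes: clients may try GSSAPI first"
        }
        $1 == "logingracetime" && $2 + 0 > 30 {
            print "logingracetime " $2 ": stalled connections hold a slot this long"
        }
    '
}

# Usage: sudo sshd -T | audit_preauth
```

No output means none of the three settings stands out.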

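A quick way to tell the scenarios apart is to pull recent drops, rejected usernames, and successful logins into one view:

terminal

$ journalctl -u sshd --since "1 hour ago" | grep -E "(drop connection|Invalid user|Accepted)" | head -50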

Diagnosing the Current State

Before touching sshd_config, establish a baseline. You need to know what your current limit is, how many connections are hitting the unauthenticated window right now, and whether the pattern suggests automation or attack.

Check the current effective MaxStartups value

terminal

# Show all effective sshd config values (including compiled defaults)
$ sudo sshd -T | grep -i maxstartups

# Check the sshd version
$ sshd -V

# Count established TCP connections on port 22 (authenticated and
# unauthenticated combined, so treat the number as an upper bound)
$ ss -Htn state established '( dport = :22 or sport = :22 )' | wc -l

# More granular: show connection states on port 22
$ ss -tn '( dport = :22 or sport = :22 )'

The sshd -T command (capital T) prints the complete effective configuration after parsing sshd_config and applying compiled defaults. This is the single most reliable way to know exactly what value sshd is actually enforcing, regardless of what is or is not written in the config file. It is a command worth running before any SSH configuration change, not just for this issue.

Watch out for /etc/ssh/sshd_config.d/ drop-ins

On Ubuntu 22.04+, Debian 12+, and recent RHEL/AlmaLinux/Rocky builds, sshd_config includes a line like Include /etc/ssh/sshd_config.d/*.conf near the top of the file. Any .conf file placed in that directory is parsed first, and in OpenSSH, the first occurrence of a directive wins. This means a drop-in file in sshd_config.d/ can silently override a MaxStartups value you set in the main sshd_config. Always run sudo sshd -T | grep maxstartups to confirm what is actually in effect, and check ls /etc/ssh/sshd_config.d/ if the value does not match what you expect from the main config file.
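When sshd -T shows an unexpected value, it helps to find which file supplied it. The sketch below mimics first-match-wins by scanning files in the order sshd would read them. The helper name is hypothetical, and it deliberately ignores Match blocks and assumes the Include line sits at the top of sshd_config, so treat it as a complement to sshd -T rather than a replacement.

```shell
#!/bin/sh
# first_directive NAME FILE...  (hypothetical helper)
# Walks FILEs in the order given and prints "FILE: LINE" for the first
# uncommented occurrence of directive NAME, mimicking first-match-wins.
# Simplification: ignores Match blocks and assumes whitespace after NAME.
first_directive() (
    name=$1
    shift
    for f in "$@"; do
        [ -r "$f" ] || continue
        line=$(grep -i -m 1 -E "^[[:space:]]*${name}[[:space:]]" "$f") || continue
        printf '%s: %s\n' "$f" "$line"
        exit 0
    done
    exit 1
)

# Usage, listing the drop-ins first because the Include line sits at the top:
#   first_directive MaxStartups /etc/ssh/sshd_config.d/*.conf /etc/ssh/sshd_config
```

The shell expands *.conf in sorted order, so the drop-ins are scanned in a predictable sequence before the main file.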

Count dropped connections over time

terminal

# Count occurrences of the throttling message today
$ journalctl -u sshd --since today | grep -c "maxstartups throttling"

# See source IPs of dropped connections (requires some log parsing)
$ journalctl -u sshd --since "1 hour ago" | grep "drop connection" 

# Watch in real time
$ journalctl -u sshd -f | grep --line-buffered -E "(maxstartups|drop connection)"

Warning

On systems using traditional syslog rather than journald, the throttling messages go to /var/log/auth.log (Debian/Ubuntu) or /var/log/secure (RHEL/CentOS/Fedora). Use grep "maxstartups" /var/log/auth.log on those systems. If you are on a system that uses both journald and syslog forwarding, check both sources to avoid missing entries. On Debian and Ubuntu the systemd unit is typically named ssh rather than sshd, so try journalctl -u ssh if journalctl -u sshd returns nothing.

Fixing It for Legitimate Automation

Once you have confirmed the source is internal automation and not a scanning attack, you have several options. Increasing MaxStartups is the most direct, but it is worth considering whether the automation can be made more connection-efficient first.

Option 1: Tune MaxStartups upward

Edit /etc/ssh/sshd_config and set a value appropriate for your environment. For a jump host serving Ansible automation across a few dozen hosts, 100:30:200 is a common and reasonable starting point. For a bastion serving a large engineering team, you may need to go higher.

/etc/ssh/sshd_config

# Long-standing default: 10:30:100
# For automation-heavy bastion hosts, tune upward:
MaxStartups 100:30:200

# Also consider tuning LoginGraceTime to reduce the window
# Each unauthenticated connection holds a slot until this expires
LoginGraceTime 30

# After editing, validate the config before reloading
# sshd -t (lowercase t) runs a syntax check

After editing, always validate the configuration file before applying it. A syntax error in sshd_config on a remote server can lock you out permanently if sshd fails to start after reload.

terminal

# Validate config without restarting (safe to run at any time)
$ sudo sshd -t

# If that passes, reload the daemon (does not drop active sessions)
$ sudo systemctl reload sshd

# On non-systemd systems (older Debian/Ubuntu init, where the service is usually named ssh; BSD-style init):
$ sudo service sshd reload

# Or send SIGHUP directly to the sshd parent process:
$ sudo kill -HUP $(cat /var/run/sshd.pid)

# Verify the new value took effect
$ sudo sshd -T | grep maxstartups

Critical: Never restart sshd without testing first

Use systemctl reload sshd, not restart, on live servers. Reload sends SIGHUP, which causes sshd to re-read its config while leaving existing authenticated sessions alone. A restart stops the listening daemon and starts a new one; depending on how the service unit is configured, that can take established sessions down with it, and if the new process fails to start nothing is left listening for new connections. Before any SSH config change on a remote host, keep at least one existing authenticated session open in a separate terminal as a recovery path.

Option 2: Reduce LoginGraceTime to shrink the unauthenticated window

LoginGraceTime is the maximum time sshd will allow a connection to remain unauthenticated before forcibly closing it. The default is 120 seconds -- two full minutes. During those two minutes, that connection holds one slot against the MaxStartups count.

For internal automation using key-based authentication, there is essentially no legitimate reason for authentication to take more than a few seconds. Reducing LoginGraceTime to 30 or even 15 seconds means failed or stalled connections release their slots much faster, reducing the chance that a burst overwhelms the limit. This is also a strong independent security hardening measure, since it limits how long an attacker can hold a connection open attempting credentials.

Option 3: Use SSH connection multiplexing in automation

This is the architecturally cleaner fix for Ansible and similar tools. SSH connection multiplexing (ControlMaster/ControlPath/ControlPersist) allows multiple SSH sessions to share a single underlying TCP connection and authentication. Once one session authenticates, subsequent sessions over the same multiplexed connection bypass the unauthenticated state entirely -- they never enter the pool that MaxStartups counts.

~/.ssh/config or ansible.cfg ssh args

# SSH client config to enable multiplexing
Host bastion.example.com
    ControlMaster     auto
    ControlPath       ~/.ssh/cm-%r@%h:%p
    ControlPersist    10m

# Ansible: enable multiplexing via ssh_args in ansible.cfg
[ssh_connection]
ssh_args = -o ControlMaster=auto -o ControlPersist=30m
pipelining = True

Multiplexing has a side effect worth knowing: if the master connection dies (network blip, server restart), all sessions sharing it are dropped simultaneously. For long-running interactive work this is usually acceptable. For automated pipelines, set ControlPersist to a value that covers the expected run time of your longest job, and build retry logic into your automation for the cases where the master connection does not exist yet.
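The retry logic mentioned above can be as small as a wrapper with exponential backoff. This is a minimal sketch rather than a hardened implementation: retry is a hypothetical helper name, and the attempt count and delays are assumptions to tune for your pipeline.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS BASE_DELAY CMD [ARGS...]  (hypothetical helper)
# Runs CMD until it succeeds or MAX_ATTEMPTS is reached, sleeping
# BASE_DELAY seconds after the first failure and doubling each time.
retry() (
    max=$1 delay=$2
    shift 2
    attempt=1
    while :; do
        "$@" && exit 0
        [ "$attempt" -ge "$max" ] && exit 1
        sleep "$delay"
        delay=$(( delay * 2 ))
        attempt=$(( attempt + 1 ))
    done
)

# Usage: up to 4 attempts, waiting 1, 2, then 4 seconds between them
#   retry 4 1 ssh -o BatchMode=yes bastion.example.com true
```

Backing off between attempts matters here: immediate retries from many jobs would add to the same unauthenticated burst that caused the drop.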

The Security Tradeoff: What You Lose When You Raise the Limit

It is important to be explicit about what MaxStartups protects against before raising it. There are two concrete attack classes it mitigates.

Pre-authentication resource exhaustion

Each unauthenticated connection causes sshd to spawn a child process (or, in privilege-separated mode, a privileged monitor and an unprivileged child process). In older OpenSSH versions, these processes consumed meaningful memory. In current versions with privilege separation the overhead is smaller, but it is not zero. An attacker who can hold ten thousand simultaneous connections in the authentication phase is consuming server resources on purpose. MaxStartups puts a ceiling on this class of resource exhaustion.

Brute-force rate limiting

Password authentication brute-force tools gain throughput by parallelizing attempts. Keeping MaxStartups low -- combined with a LoginGraceTime that limits how long each attempt can run -- creates a natural rate limit on parallel brute-force attempts, even without fail2ban or other external tools in the picture. The OpenSSH project has explicitly cited this as a design goal of the throttling mechanism.

This means the risk calculus is straightforward: on a server that has password authentication enabled and is exposed to the internet, raising MaxStartups significantly does weaken this protection. On a server where password authentication is disabled entirely and only key-based auth is permitted, the brute-force rationale largely disappears -- an attacker cannot meaningfully brute-force public-key authentication at the connection level -- and the resource exhaustion concern becomes the primary remaining consideration.

Recommendation

If you are raising MaxStartups on a server with PasswordAuthentication yes, compensate elsewhere. Ensure fail2ban or sshguard is running, set LoginGraceTime low (30 seconds or less), and consider rate-limiting new SSH connections at the firewall level, for example with the iptables hashlimit or connlimit match modules or the equivalent nftables limit rate and ct count expressions. Raising the ceiling is reasonable; removing all backstops is not. For a full treatment of layered SSH defenses, see hardening SSH beyond the basics.

Jump Hosts and Bastion Servers: Special Considerations

The MaxStartups problem surfaces most often on jump hosts and bastion servers precisely because they are designed to funnel high connection volume. A bastion that serves a team of fifty engineers will see constant SSH activity, and when a deployment or on-call incident causes a burst of simultaneous logins, ten is a near-certain collision point. Before tuning connection limits, it is worth running a full SSH audit and hardening pass to confirm the baseline configuration is solid.

For dedicated bastion infrastructure, a reasonable hardened configuration looks like this:

/etc/ssh/sshd_config (bastion profile)

# Raised limit appropriate for automation-heavy bastion
MaxStartups          100:30:200

# Short grace time -- key auth should be near-instant
LoginGraceTime       30

# No password auth on a bastion -- eliminates brute-force risk
PasswordAuthentication  no
ChallengeResponseAuthentication  no
KbdInteractiveAuthentication     no

# Explicitly disable root login
PermitRootLogin      no

# Probe idle authenticated sessions every 120 s; disconnect after 3 unanswered probes
ClientAliveInterval  120
ClientAliveCountMax  3

# Log level for connection auditing
LogLevel             VERBOSE

With PasswordAuthentication no, the brute-force protection rationale for keeping MaxStartups low largely evaporates. The remaining concern -- resource exhaustion from parallel handshakes -- is manageable at 100:30:200 on any modern server. The probabilistic drop policy (the 30 in the middle) still provides a soft ceiling that prevents runaway conditions.

Kubernetes, CI Pipelines, and Ephemeral Environments

A less obvious trigger for this error appears in Kubernetes clusters or CI environments that use SSH for node access, secret injection, or deployment steps. In these environments, pods or pipeline jobs may all start simultaneously -- especially on a shared runner or after a cluster scale-up event -- and each will attempt an SSH connection within a very short window.

GitHub Actions, GitLab CI, Jenkins, and similar systems running matrix builds or parallel job stages can saturate the default MaxStartups 10:30:100 in seconds. The fix is the same -- tune the target server's config -- but the diagnostic step is slightly different because the source IPs will all be internal cluster addresses and the connections will be very short-lived.

terminal

# Watch real-time inbound connection counts during a CI run
$ watch -n 1 "ss -Htn state syn-recv state established '( sport = :22 )' | wc -l"

# Count the connections dropped during the burst
$ journalctl -u sshd --since "10 minutes ago" | grep -c "drop connection"

If the peak connection count consistently exceeds your MaxStartups full value during CI runs, and the connections are coming from known internal sources with key-based authentication, the correct fix is to raise the limit to accommodate the burst. If you also have control over the CI pipeline, consider staggering SSH-based steps to avoid all-at-once connection storms -- this is better for both the target server and pipeline reliability.
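When the SSH steps are launched from a single orchestrating job, one simple way to stagger them is to cap how many run at once, keeping the number of simultaneous handshakes below the server's start value. The sketch below uses a hypothetical helper name and relies on xargs -P, which GNU and BSD xargs provide but POSIX does not require.

```shell
#!/bin/sh
# run_capped MAX CMD [ARGS...]  (hypothetical helper)
# Reads one item per line on stdin and runs CMD once per item, replacing
# {} in ARGS with the item, with at most MAX commands running at once.
# Relies on xargs -P, which GNU and BSD xargs provide but POSIX does not.
run_capped() (
    max=$1
    shift
    xargs -I {} -P "$max" "$@"
)

# Usage: keep at most 8 handshakes in flight for the hosts in hosts.txt
#   run_capped 8 ssh -o BatchMode=yes {} uptime < hosts.txt
```

Output order is not guaranteed, since the commands finish independently.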

OpenSSH 8.5+ and 9.8+: PerSourceMaxStartups and PerSourcePenalties

Newer OpenSSH releases add two complementary controls that work alongside MaxStartups: PerSourceMaxStartups, available since OpenSSH 8.5, and PerSourcePenalties, introduced in OpenSSH 9.8 (released July 1, 2024). Both are worth understanding if your server runs a modern OpenSSH build.

PerSourceMaxStartups

PerSourceMaxStartups applies a per-source-address limit on unauthenticated connections, in addition to the global MaxStartups ceiling. The default is none (no per-source limit). Setting it to a low value like 3 or 5 means a single IP address can hold at most that many unauthenticated slots simultaneously, regardless of how high the global MaxStartups is set. This is useful on bastions where you want to allow overall high throughput while still preventing any individual IP from monopolizing the unauthenticated connection pool.

The granularity of the per-source grouping is controlled by PerSourceNetBlockSize, which defaults to 32:128 (each individual IPv4 or IPv6 address is considered separately). You can coarsen this to a CIDR prefix to apply limits at the subnet level -- useful when your automation runs from a known IP range.

/etc/ssh/sshd_config (OpenSSH 8.5+ features)

# Global unauthenticated connection ceiling (raised from the 10:30:100 default)
MaxStartups            100:30:200

# Per-source limit: one source can hold at most 10 unauthenticated slots
# Available since OpenSSH 8.5
PerSourceMaxStartups   10

# Group sources by /24 (IPv4) and /64 (IPv6) for the per-source limit
PerSourceNetBlockSize  24:64
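To get a feel for how block size changes the grouping, the sketch below buckets a list of client addresses into blocks and compares each bucket against a limit. It is a hypothetical, IPv4-only helper that illustrates the grouping idea, not sshd's actual implementation.

```shell
#!/bin/sh
# group_by_block PREFIXLEN LIMIT  (hypothetical helper, IPv4 only)
# Reads one IPv4 address per line on stdin, groups addresses into
# /PREFIXLEN blocks, and prints "BLOCK COUNT STATUS" per block, where
# STATUS is "over" when COUNT exceeds LIMIT and "ok" otherwise.
group_by_block() {
    awk -v len="$1" -v limit="$2" '
        {
            split($1, o, ".")
            n = ((o[1] * 256 + o[2]) * 256 + o[3]) * 256 + o[4]
            size = 2 ^ (32 - len)
            net = int(n / size) * size          # clear the host bits
            key = sprintf("%d.%d.%d.%d/%d", int(net / 16777216),
                          int(net / 65536) % 256, int(net / 256) % 256,
                          net % 256, len)
            count[key]++
        }
        END {
            for (key in count)
                printf "%s %d %s\n", key, count[key],
                       (count[key] > limit ? "over" : "ok")
        }
    ' | sort
}

# Usage: group_by_block 24 10 < client_addresses.txt
```

Three automation hosts in 10.0.5.0/24 count as three separate sources at a block size of 32 but as a single source at 24, so a coarser block size needs a proportionally higher PerSourceMaxStartups.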

PerSourcePenalties

PerSourcePenalties, introduced in OpenSSH 9.8, takes a different approach: rather than rate-limiting unauthenticated slots, it tracks problematic connection behavior over time and temporarily refuses connections from offending addresses. Conditions that trigger a penalty include repeated authentication failures (possible password guessing), clients that connect but never complete authentication, and clients whose behavior causes sshd to crash.

When a penalty threshold is reached, sshd refuses new connections from that source address for a configurable duration. Repeated offenses accumulate longer penalties up to a configured maximum. Trusted management addresses can be exempted from penalties using PerSourcePenaltyExemptList.

Operators with NAT or shared proxies: read this

PerSourcePenalties is on by default in OpenSSH 9.8 and later. If multiple users share an IP address -- through NAT, a corporate proxy, or a cloud NAT gateway -- a penalty triggered by one user will block all users from that address. The OpenSSH 9.8 release notes specifically call this out: operators accepting connections from behind NAT should review PerSourcePenalties settings and consider adding their NAT gateway addresses to PerSourcePenaltyExemptList. Source: OpenSSH 9.8 release announcement, openssh-unix-announce mailing list.

The practical interaction between these features and MaxStartups is that they operate at different layers. MaxStartups is a real-time gate on unauthenticated connection count. PerSourceMaxStartups is the same gate applied per-source. PerSourcePenalties is a persistent, time-based block on addresses that have demonstrated bad behavior. On a modern OpenSSH server, all three are in effect simultaneously and each can independently cause a connection to be refused.

Monitoring for the Right Signal

After tuning, it is worth setting up alerting that tells you when the error appears again -- not so you can keep raising the limit endlessly, but so you can distinguish between expected automation bursts and unexpected attack activity. A sudden spike in maxstartups throttling events from external IPs that were not present before is a meaningful signal that deserves investigation.

If you are running a log aggregation stack (ELK, Loki, Graylog), create an alert on the message string. For simpler environments, a fail2ban filter targeting the drop messages can work as a rudimentary alarm. The OpenSSH project recommends treating a persistent flood of this message from diverse external sources as a probable automated attack rather than a configuration problem.

monitoring one-liner (run periodically)

# Count MaxStartups drop events in the last 5 minutes
# Alert if this number exceeds a threshold for your environment
$ journalctl -u sshd --since "5 minutes ago" | grep -c "drop connection"

# Cross-reference with unique source IPs (requires verbose logging)
# LogLevel VERBOSE in sshd_config logs "Connection from ADDRESS port ..." lines
$ journalctl -u sshd --since "5 minutes ago" | grep "Connection from" | awk '{for (i = 1; i < NF; i++) if ($i == "from") print $(i + 1)}' | sort | uniq -c | sort -rn | head -20
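For environments without a log aggregation stack, the count above can be wrapped into a check that a cron job or systemd timer runs. This is a minimal sketch: alert_on_drops is a hypothetical helper, and the threshold of 20 in the usage line is an arbitrary placeholder to replace with a value that fits your normal traffic.

```shell
#!/bin/sh
# alert_on_drops THRESHOLD  (hypothetical helper)
# Reads sshd log lines on stdin, counts "drop connection" events, prints a
# one-line summary, and returns non-zero when the count exceeds THRESHOLD.
alert_on_drops() (
    threshold=$1
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true"
    drops=$(grep -c 'drop connection' || true)
    if [ "$drops" -gt "$threshold" ]; then
        echo "ALERT: $drops dropped SSH connections (threshold $threshold)"
        exit 1
    fi
    echo "OK: $drops dropped SSH connections"
)

# Usage from cron or a systemd timer:
#   journalctl -u sshd --since "5 minutes ago" | alert_on_drops 20
```

The non-zero exit status lets the calling scheduler or monitoring agent treat a breach as a failed check without parsing the message.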

Putting It Together

The beginning maxstartups throttling error is sshd functioning correctly -- it is a deliberate, documented defense mechanism that has been part of OpenSSH since the early 2.x releases. The error becomes a problem only when it blocks legitimate connections that your server should be accepting.

The decision tree is short. First, check whether slow authentication -- reverse DNS lookups with UseDNS yes, GSSAPI negotiation on a non-Kerberos network, or LDAP backend latency -- is artificially lengthening the unauthenticated window. If it is, fix that before touching MaxStartups at all. If the dropped connections are coming from internal automation sources and password authentication is disabled, raise MaxStartups to match your workload and lower LoginGraceTime to tighten the unauthenticated window. If password auth is enabled, raise the limit cautiously and compensate with fail2ban and firewall rate limiting. If the source IPs are external and unfamiliar, do not touch the limit -- the protection is doing its job. On recent OpenSSH releases, also check whether PerSourceMaxStartups (8.5 or later) or PerSourcePenalties (9.8 or later) is involved before concluding the cause is purely MaxStartups.

The one configuration practice that pays dividends across all of this: run sshd -T | grep maxstartups regularly, especially after OpenSSH upgrades and after any package update that may have dropped a file into /etc/ssh/sshd_config.d/. Always verify what your server is actually enforcing rather than relying on memory or assumptions about defaults. The long-standing default of 10:30:100 is a reasonable baseline for servers that see moderate individual user traffic, but it was not designed for automation-heavy environments, bastion hosts, or CI pipelines -- and the only way to know for certain what is in effect is to ask sshd directly.

Operational best practice

Never assume what sshd is enforcing based on memory or what you wrote in the config file. Run sshd -T | grep maxstartups to confirm the active value. Compiled defaults have shifted across OpenSSH releases, and a drop-in file in sshd_config.d/ can silently override whatever is in the main config. The running configuration is what matters.

Sources: sshd_config(5) OpenBSD man page; OpenSSH 9.8 release announcement (openssh-unix-announce mailing list); openssh-portable sshd.c source (GitHub); OpenSSH security considerations; sshd_config(5) Linux man page (man7.org).
