Nginx (pronounced "engine x," officially stylized as NGINX) is a web server, reverse proxy, load balancer, and HTTP cache that has become one of the most widely deployed pieces of internet infrastructure in the world. According to the official NGINX GitHub repository, it is currently the world's most popular web server, with W3Techs reporting a 33.8% share of all websites as of April 2025. Igor Sysoev began development in 2002 while working as a systems administrator at the Russian web portal Rambler, and released it publicly under a BSD license on October 4, 2004. It was built from scratch to solve a specific, well-understood problem: the C10K problem. Understanding what that problem was, and how Nginx's architecture addresses it, is the foundation for understanding everything else about how Nginx works.

The C10K Problem and Why Apache Wasn't Enough

In the late 1990s and early 2000s, Apache was the dominant web server. Apache's architecture is process-based: for each incoming connection, Apache either spawns a new process or assigns a pre-forked worker process to handle that connection. The worker is dedicated to that connection for its entire lifetime -- it reads the request, processes it, sends the response, and either closes or waits for the next request on a keepalive connection.

This model is intuitive and simple but carries a fundamental scalability ceiling. Each process or thread consumes memory -- typically several megabytes per worker with Apache's prefork MPM. At 1,000 concurrent connections, you might be consuming gigabytes of RAM just for worker overhead, before any application logic runs. At 10,000 concurrent connections, the system spends more time context-switching between processes than doing actual work. The operating system's scheduler becomes the bottleneck. This is the C10K problem: handling 10,000 concurrent connections efficiently on a single server.

Igor Sysoev designed Nginx to solve this architecturally rather than by tuning the Apache model. The solution was an event-driven, asynchronous, non-blocking architecture built around the operating system's efficient I/O multiplexing mechanisms. As the official NGINX blog described it, Sysoev envisioned "a novel architecture that would allow high-traffic sites to better handle tens of thousands of concurrent connections."

Nginx's Event-Driven Architecture

At the core of Nginx is an event loop. Rather than assigning a dedicated process or thread to each connection, Nginx uses a small, fixed number of worker processes -- typically one per CPU core -- and each worker handles thousands of connections simultaneously within a single thread.

The mechanism that makes this possible is epoll on Linux (and kqueue on FreeBSD/macOS). These are kernel-level I/O event notification interfaces that allow a single thread to register interest in thousands of file descriptors simultaneously and receive notifications when any of them become ready for I/O. The key distinction from older interfaces like select() and poll() is scalability. Both select() and poll() require the kernel to iterate through every registered file descriptor on each call to determine which are ready, giving them O(n) time complexity. epoll uses an internal red-black tree to track registered descriptors and a separate ready list that the kernel populates as events occur, giving it O(1) event retrieval regardless of how many total descriptors are registered.

Note

In practical terms, an Nginx worker process can have 10,000 open connections registered with epoll and wake up only when there is actual work to do -- data arriving on a socket, a file becoming readable, or a timeout expiring. Between events, the worker sleeps and consumes no CPU. This stands in stark contrast to a threaded model where each thread periodically checks whether its connection has data.

The Nginx worker loop runs roughly as follows: call epoll_wait() to collect ready events, then process each event by advancing its associated connection's state machine. A connection moves through states -- accepting, reading request headers, processing, sending response headers, sending response body -- with each state transition driven by I/O readiness events rather than blocking calls. When an operation would block (waiting for data from a backend, waiting for a file to become readable), the worker registers the file descriptor with epoll and moves on to handle other connections.
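
The loop just described can be sketched as a self-contained C program against the raw epoll API. This is an illustration of the readiness-driven pattern on Linux, not Nginx source; a pipe stands in for a client socket and the function name is invented.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* One simplified pass of an event loop: register a descriptor, make it
 * readable, and collect the ready event -- the same pattern a worker
 * runs across thousands of descriptors at once. */
static int ready_events(void)
{
    int pipefd[2];
    if (pipe(pipefd) < 0)
        return -1;

    int epfd = epoll_create1(0);                    /* kernel event table */
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = pipefd[0] };
    epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev); /* register interest */

    if (write(pipefd[1], "x", 1) != 1)              /* make read end ready */
        return -1;

    struct epoll_event out[16];
    int n = epoll_wait(epfd, out, 16, 1000);        /* only ready fds return */

    close(epfd); close(pipefd[0]); close(pipefd[1]);
    return n;                                       /* number of ready events */
}
```

In real Nginx, epoll_wait() returns a batch of ready events and each one drives the associated connection's state machine forward before the worker sleeps again.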

The Master and Worker Process Model

Nginx runs as a master process and one or more worker processes. The master process reads configuration, binds to ports, and manages the lifecycle of worker processes -- starting, stopping, and restarting them in response to signals or configuration reloads. The master process itself does not handle connections.

Worker processes handle all connection processing. The standard configuration sets the number of workers equal to the number of CPU cores using the directive worker_processes auto. Each worker is single-threaded and runs its own independent event loop. Workers share no state for connection handling -- the operating system distributes incoming connections across workers as they accept from the shared listening socket.

Pro Tip

When you send Nginx a SIGHUP signal (or run nginx -s reload), the master process reads the new configuration, validates it, and starts new worker processes with the updated config. The old workers are sent a graceful shutdown signal: they stop accepting new connections but continue processing existing ones until complete, then exit. This means Nginx can reload its entire configuration -- including TLS certificates -- with zero dropped connections. You should also raise worker_rlimit_nofile to match or exceed worker_connections, since each connection consumes a file descriptor and the OS default limit (often 1024) will silently throttle you under load. For latency-sensitive deployments, worker_cpu_affinity auto pins each worker process to a specific CPU core, reducing cache invalidation from the scheduler moving processes between cores and eliminating cross-NUMA memory access costs on multi-socket systems.
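
Taken together, the tuning directives from the tip land in the main context like this; the numbers are illustrative, not recommendations:

```nginx
# Main context -- one worker per core, pinned, with a file-descriptor
# ceiling above worker_connections so the OS limit never bites first
worker_processes      auto;
worker_cpu_affinity   auto;
worker_rlimit_nofile  65535;

events {
    worker_connections 8192;
}
```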

The Module System

Nginx is not a monolithic application. It is, in the words of its own development guide, a collection of modules. Even the most fundamental capabilities -- serving static files, proxying HTTP, and handling SSL -- are implemented as modules. The core takes care of network and application protocol handling, event loop management, and configuration parsing, then orchestrates a sequence of modules to process each request. This architecture is what makes Nginx extensible without requiring patches to its core.

Every module conforms to the ngx_module_t structure, which is the contract between a module and the Nginx core. The structure carries a type identifier (NGX_HTTP_MODULE, NGX_STREAM_MODULE, NGX_CORE_MODULE, etc.), a context struct holding pointers to lifecycle callbacks, an array of directives the module recognizes, and init/exit handler pointers. Nginx's initialization sequence calls each registered module's handlers in a defined order: configuration directive handlers fire as the config file is parsed, init_module fires in the master process after configuration is validated, and init_process fires in each worker process at startup.

Note

There are two categories of HTTP modules: handler modules and filter modules. Handler modules generate content -- ngx_http_static_module, ngx_http_proxy_module, ngx_http_fastcgi_module. Only one handler wins per location block. Filter modules transform the response after content is generated -- ngx_http_gzip_filter_module, ngx_http_headers_filter_module. All filters run on every response, in a chain. The order of that chain matters: gzip must run before the write filter, and the write filter must be last because it sends data to the client socket. This order is determined at compile time by the order modules appear in the build configuration.

Nginx 1.9.11 introduced dynamic modules -- the ability to load compiled .so shared objects at runtime using the load_module directive without recompiling the Nginx binary. This was a significant operational improvement, particularly for distribution package maintainers who can now ship optional modules as separate packages.

nginx.conf -- loading dynamic modules
# Dynamic modules must be loaded in the main context, before any blocks
load_module modules/ngx_http_brotli_filter_module.so;
load_module modules/ngx_http_brotli_static_module.so;
load_module modules/ngx_http_image_filter_module.so;
load_module modules/ngx_http_geoip2_module.so;

events { ... }
http    { ... }

Static modules are compiled directly into the binary with the --add-module configure flag; dynamic modules use --add-dynamic-module and produce a .so file in the objs/ directory. Nginx 1.27.x raised the maximum number of simultaneously loaded dynamic modules from 128 to 1024, defined by NGX_MAX_DYNAMIC_MODULES in the source -- if you are running an older release, check that constant in your build's source tree (nginx -V prints only the configure arguments, not this limit). Filter module ordering for dynamic modules is specified via the ngx_module_order variable at build time -- without it, filter ordering is non-deterministic and can produce corrupted output if, for example, a compression filter runs after the write filter.

Memory Architecture: ngx_pool_t

Nginx's approach to memory allocation is one of the less-discussed but more consequential aspects of its performance profile. Rather than calling malloc() for each small allocation and tracking individual free operations, Nginx uses a pool-based allocator implemented in ngx_palloc.c. The type is ngx_pool_t, and pools are created with a specific lifetime in mind.

There are three main pool scopes. The cycle pool (cycle->pool) lives for the entire lifetime of a configuration and is freed only on reload. The connection pool (c->pool) is created when a TCP connection is accepted and destroyed when it closes -- it holds state that persists across multiple HTTP requests on a keepalive connection. The request pool (r->pool) is created when an HTTP request begins and destroyed when the response is complete -- it is the pool from which modules allocate headers, parsed URI data, upstream buffers, and temporary state.

pool lifecycle -- conceptual
/* Pools created by the Nginx core at specific lifecycle points */

/* Created once per configuration load; freed on nginx -s reload */
ngx_create_pool(NGX_CYCLE_POOL_SIZE, log)  // → cycle->pool

/* Created per accepted TCP connection; freed on connection close */
ngx_create_pool(ls->pool_size, log)  // → c->pool (size set per listen socket)

/* Created per HTTP request; freed when request finalizes */
ngx_create_pool(cscf->request_pool_size, log)  // → r->pool (request_pool_size directive)

/* Module code allocates from request pool -- no manual free needed */
ngx_palloc(r->pool, sizeof(my_module_ctx_t));

The pool allocator works by maintaining a linked list of fixed-size memory blocks. When you call ngx_palloc(pool, size), it checks whether the current block has enough remaining space. If so, it advances a pointer by size bytes and returns the old pointer address -- this is effectively O(1). If not, a new block is allocated from the system heap and appended to the list. Allocations larger than NGX_MAX_ALLOC_FROM_POOL bypass the pool's internal blocks and go directly to malloc(), with the returned pointer tracked in a separate large list for cleanup at pool destruction time. As the Nginx source header documents: NGX_MAX_ALLOC_FROM_POOL is defined as (ngx_pagesize - 1), which is 4095 bytes on x86 with a standard 4KB page size, but the value is derived from the system's actual page size at runtime and may differ on systems configured with larger pages.

The important property of this design is that there are almost no individual free calls. When a request pool is destroyed -- when ngx_http_finalize_request() runs -- the entire pool is deallocated in a single operation by walking the block list and freeing each block. This eliminates memory fragmentation within a request's lifetime, avoids per-allocation overhead, and makes the allocator extremely cache-friendly. The cost is that memory allocated in a smaller-scoped pool cannot outlive that pool; if you need data to survive beyond the request, you must allocate it from the connection pool or a shared memory zone.
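
A toy bump allocator makes the fast path concrete. This is a deliberately simplified sketch with invented names, not ngx_palloc.c: the real ngx_pool_t chains extra blocks on overflow, aligns allocations, and tracks large allocations separately, where this version simply fails when its single block is exhausted.

```c
#include <assert.h>
#include <stdlib.h>

/* Single-block bump allocator: allocate by advancing a pointer,
 * reclaim everything with one free at destruction time. */
typedef struct {
    char *start, *cur, *end;
} toy_pool_t;

static toy_pool_t *toy_pool_create(size_t size)
{
    toy_pool_t *p = malloc(sizeof(*p));
    p->start = p->cur = malloc(size);
    p->end = p->start + size;
    return p;
}

static void *toy_palloc(toy_pool_t *p, size_t size)
{
    if ((size_t)(p->end - p->cur) < size)
        return NULL;          /* real nginx appends a new block here */
    void *ptr = p->cur;
    p->cur += size;           /* O(1): just a pointer bump */
    return ptr;
}

static void toy_pool_destroy(toy_pool_t *p)
{
    free(p->start);           /* one call frees every allocation */
    free(p);
}
```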

Note

Shared memory zones -- created by directives like limit_req_zone, proxy_cache_path, and ssl_session_cache -- use a different allocator entirely: ngx_slab_pool_t. Slab pools divide the shared zone into pages and allocate fixed-power-of-two size classes within each page, using a bitmask to track free slots. Access is protected by a mutex (ngx_shmtx_t) since all worker processes share the same memory region. This is the mechanism that allows Nginx's rate limiting state, cache metadata, and SSL session cache to be coherent across workers without any inter-process message passing.

The Request Lifecycle: An End-to-End Walk

Understanding what actually happens between a client connecting and a response arriving is the core of understanding Nginx's internals. The journey is more structured than most people realize: Nginx processes every HTTP request through an eleven-phase pipeline whose handler table is assembled at startup from the handlers each module registers.

The phases, in order, are: POST_READ, SERVER_REWRITE, FIND_CONFIG, REWRITE, POST_REWRITE, PREACCESS, ACCESS, POST_ACCESS, TRY_FILES, CONTENT, and LOG. Not all phases are user-configurable -- FIND_CONFIG, POST_REWRITE, POST_ACCESS, and TRY_FILES are internal to the Nginx core and have no registered handlers from third-party modules. The remaining phases are where modules do their work.

request lifecycle -- abridged
/* Phase 0:  POST_READ        -- ngx_http_realip_module reads X-Forwarded-For */
/* Phase 1:  SERVER_REWRITE   -- rewrite directives at server{} level run */
/* Phase 2:  FIND_CONFIG      -- location block matched; config context set */
/* Phase 3:  REWRITE          -- rewrite/return/set directives inside location */
/* Phase 4:  POST_REWRITE     -- if rewrite looped, restart from FIND_CONFIG */
/* Phase 5:  PREACCESS        -- limit_conn and limit_req evaluated here */
/* Phase 6:  ACCESS           -- auth_basic, auth_request, allow/deny */
/* Phase 7:  POST_ACCESS      -- satisfy any/all logic resolved */
/* Phase 8:  TRY_FILES        -- try_files directive checked */
/* Phase 9:  CONTENT          -- handler module generates the response */
/* Phase 10: LOG              -- access log written */

A concrete request walks through it like this. The kernel accepts the TCP connection and hands a file descriptor to the worker process via epoll. The worker creates a connection object, allocates a connection pool, and registers a read event handler for incoming data. When the client sends the HTTP request line and headers, epoll fires, the worker reads the data, and Nginx's HTTP parser advances through the request line and header fields using a finite state machine. The parser recognizes standard header names in place without copying them and allocates only for header values and the URI, writing into the request pool.

Once headers are fully read, ngx_http_process_request() is called, which creates the request pool (distinct from the connection pool) and begins the phase engine by calling ngx_http_core_run_phases(). The phase engine is a flat array of phase handler function pointers built once at startup from all registered module handlers. It iterates through this array, calling each handler in turn. A handler returns NGX_DECLINED to pass to the next handler in the same phase, NGX_OK to advance to the next phase, or NGX_DONE to pause and wait for an event (say, an upstream response arriving). If a handler returns an error code, the phase engine stops and calls the error handler.
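
The dispatch reduces to a walk over a flat array of function pointers. The sketch below uses invented toy names and collapses the per-phase checker functions that the real ngx_http_core_run_phases() routes each return code through:

```c
#include <assert.h>
#include <stddef.h>

/* Return codes loosely mirroring NGX_OK / NGX_ERROR / NGX_DECLINED. */
enum { TOY_OK = 0, TOY_ERROR = -1, TOY_DECLINED = -5 };

typedef int (*toy_handler_t)(int *status);

/* Walk the flat handler array in registration order. */
static int run_phases(toy_handler_t *handlers, size_t n, int *status)
{
    for (size_t i = 0; i < n; i++) {
        int rc = handlers[i](status);
        if (rc == TOY_DECLINED || rc == TOY_OK)
            continue;          /* real nginx distinguishes next-handler
                                  from next-phase via checker functions */
        return rc;             /* an error (or DONE-style pause) stops the walk */
    }
    return TOY_OK;
}

/* Stand-in handlers: one declines, one produces a result. */
static int decline_handler(int *status) { (void)status; return TOY_DECLINED; }
static int content_handler(int *status) { *status = 200; return TOY_OK; }
```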

The content phase runs exactly one handler -- the one that won the location match. For proxy_pass targets, this is ngx_http_proxy_handler(), which initiates an upstream connection and hands control back to the event loop. When the upstream response arrives, the upstream state machine reads it into buffers and calls the output filter chain.

The filter chain is the response side. Where phases are for request processing, filters are for response transformation. Header filters run first, in reverse registration order (last registered, first executed -- because filters are prepended to a linked list). Then body filters run on each output buffer chain. The canonical chain looks like: ngx_http_not_modified_filterngx_http_headers_filterngx_http_gzip_filterngx_http_chunked_filterngx_http_write_filter. One important distinction: ngx_http_not_modified_filter is not a transforming filter in the same sense as the others -- it checks ETag and Last-Modified response headers against the client's conditional request headers and, if they match, short-circuits the chain entirely to send a 304 Not Modified with no body. The write filter is always last; it takes the final buffer chain and calls writev() or sendfile() to push data to the client socket.
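
The reverse execution order falls directly out of how the chain is assembled. A toy model of the ngx_http_top_header_filter registration pattern (names invented): each module saves the current head and prepends itself, so the last module registered runs first.

```c
#include <assert.h>
#include <stddef.h>

typedef struct toy_filter_s toy_filter_t;
struct toy_filter_s {
    int           id;
    toy_filter_t *next;
};

static toy_filter_t *top;  /* chain head, like ngx_http_top_header_filter */
static int order[8];
static int count;

/* Registration prepends: the new filter becomes the head. */
static void register_filter(toy_filter_t *f)
{
    f->next = top;
    top = f;
}

/* Walking from the head visits filters in reverse registration order. */
static void run_chain(void)
{
    for (toy_filter_t *f = top; f != NULL; f = f->next)
        order[count++] = f->id;
}
```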

Pro Tip

The practical consequence of the phase ordering is that execution order is determined by phase membership, not by directive order in the config file. This is a common source of confusion. set $variable value runs in the REWRITE phase. limit_req runs in PREACCESS. auth_request runs in ACCESS. If you write set $skip 1 followed by limit_req in the same location block, the set runs first -- not because it appears first in the file, but because REWRITE precedes PREACCESS. Configuration order only matters within the same phase.

The try_files Directive

try_files runs in the TRY_FILES phase (phase 8) and is one of the directives that appears in nearly every production Nginx configuration, particularly for PHP applications and single-page application deployments. Its behavior is often misunderstood: it tests a sequence of file or directory paths against the filesystem in order, and serves the first one that exists. The final argument is always a fallback and is treated as a URI rewrite (or an error code) rather than a path check.

nginx.conf -- try_files patterns
# Classic WordPress / PHP-FPM pattern
# Try the exact URI as a file, then as a directory, then hand off to index.php
location / {
    try_files $uri $uri/ /index.php?$query_string;
}

# SPA pattern: serve the file if it exists, else serve index.html
location / {
    try_files $uri $uri/ /index.html;
}

# Named location fallback -- useful for more complex routing
location / {
    try_files $uri @backend;
}
location @backend {
    proxy_pass http://app_backend;
}

The important internal mechanic: when try_files reaches its final fallback argument, it performs an internal redirect -- it changes the request URI and restarts the phase engine from FIND_CONFIG to re-match the new URI against location blocks. This means the fallback to /index.php?$query_string is not a direct pass-through; it is a new location match that will then be handled by whatever location block matches \.php$. Understanding this prevents the common mistake of writing catch-all try_files fallbacks that accidentally bypass access controls or caching logic placed in specific location blocks.

The ngx_http_realip_module

The POST_READ phase comment in the pipeline above references ngx_http_realip_module, which deserves explicit treatment. When Nginx sits behind a load balancer, CDN, or other reverse proxy, $remote_addr contains the IP of the upstream proxy rather than the actual client. The realip module corrects this by reading the client IP from a trusted header -- typically X-Forwarded-For or X-Real-IP -- and rewriting $remote_addr in-place before any other module evaluates it.

nginx.conf -- realip module
http {
    # Trust forwarded-IP headers only from these load balancer ranges
    set_real_ip_from 10.0.0.0/8;
    set_real_ip_from 172.16.0.0/12;
    real_ip_header  X-Forwarded-For;

    # X-Forwarded-For may contain a chain of IPs; with recursive search
    # the rightmost address not in a trusted range is taken as the client
    real_ip_recursive on;
}

set_real_ip_from defines the CIDR ranges Nginx trusts as legitimate proxies. Only headers arriving from these addresses will be used to rewrite $remote_addr. This matters for rate limiting: if you configure limit_req_zone against $binary_remote_addr without running the realip module, every client behind the same load balancer will share a single rate limit counter keyed to the load balancer's IP, not the individual client IP. The realip module must be compiled in -- it is not enabled by default in all distributions -- and is controlled by the --with-http_realip_module configure flag.
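
The rate-limiting interaction is worth seeing in a single configuration; the zone name and limits below are illustrative:

```nginx
http {
    set_real_ip_from  10.0.0.0/8;
    real_ip_header    X-Forwarded-For;
    real_ip_recursive on;

    # realip rewrites $remote_addr in POST_READ, so by the time
    # limit_req fires in PREACCESS the key is the real client address
    limit_req_zone $binary_remote_addr zone=perclient:10m rate=10r/s;

    server {
        location /api/ {
            limit_req zone=perclient burst=20 nodelay;
        }
    }
}
```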

Configuration Architecture

Nginx's configuration language is block-based and hierarchical. The top-level contexts are main, events, http, stream, and mail; other blocks, such as server, location, and upstream, nest inside them (upstream lives inside http or stream, not at the top level). Directives defined in an outer context are inherited by inner contexts, and inner contexts can override them.

The events block controls connection processing behavior. The key directive here is worker_connections, which sets the maximum number of simultaneous connections per worker process. Combined with worker_processes, this gives you the theoretical maximum concurrent connections for the server: worker_processes * worker_connections.

nginx.conf -- events block
events {
    worker_connections 1024;
    use epoll;
    multi_accept on;
}

The multi_accept on directive tells each worker to accept as many new connections as possible in a single pass through the accept loop rather than accepting one connection per event loop iteration. This improves throughput under high connection rates at the cost of slightly uneven distribution across workers.

The http block is where virtual hosts and web serving configuration lives. Virtual hosts are defined in server blocks, and URL-specific configuration goes inside location blocks nested within server blocks.

nginx.conf -- server block
http {
    server {
        listen 443 ssl;
        server_name example.com;

        location / {
            root /var/www/html;
            index index.html;
        }

        location ~ \.php$ {
            fastcgi_pass unix:/run/php/php8.2-fpm.sock;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
            include fastcgi_params;
        }
    }
}

Location blocks support four modifiers in addition to the plain prefix match: exact match with =, preferential prefix match with ^~, case-sensitive regex with ~, and case-insensitive regex with ~*. Nginx's evaluation follows a specific two-pass process: first, all prefix strings are tested and the longest match is recorded; then, if that longest prefix is marked ^~, regex evaluation is skipped entirely and that prefix wins. If the longest prefix is not ^~, Nginx tests regex locations in the order they appear in the config file. The first matching regex wins; if no regex matches, the longest prefix from the first pass is used. Exact = matches are checked first and terminate evaluation immediately on a match. Understanding this two-pass process -- not a simple top-to-bottom read of the config file -- resolves most location-matching confusion, which is a common source of subtle bugs when prefixes and regexes interact unexpectedly.
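
A worked example of the two-pass process; paths and responses are illustrative:

```nginx
server {
    location = /status      { return 200 "exact\n"; }  # checked first, terminates
    location ^~ /static/    { root /var/www; }         # prefix; suppresses regex if longest
    location /static/img/   { root /var/www/alt; }     # longer prefix, but no ^~
    location ~ \.png$       { root /var/www/png; }     # regex, tested in file order
}
```

A request for /static/app.css records ^~ /static/ as the longest prefix, so the regex pass is skipped. A request for /static/img/logo.png records /static/img/ as the longest prefix; it carries no ^~, so the regex pass runs and \.png$ wins.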

Static File Serving and sendfile

One of Nginx's performance advantages for static content is its use of sendfile(). Normally, serving a file requires copying data from the filesystem into kernel space, then from kernel space into userspace, then from userspace back into kernel space for the network socket. With sendfile(), the kernel transfers data directly from the file descriptor to the socket descriptor without ever copying it into userspace -- reducing CPU usage and memory bandwidth consumption significantly.

nginx.conf -- performance directives
http {
    sendfile        on;
    tcp_nopush      on;
    tcp_nodelay     on;
    keepalive_timeout 65;
    worker_rlimit_nofile 65535;
}

tcp_nopush works in conjunction with sendfile to batch response headers and the beginning of the file body into a single TCP packet, reducing round trips and improving throughput. tcp_nodelay disables Nagle's algorithm for keepalive connections, ensuring that small amounts of data are sent immediately rather than buffered -- which reduces latency for interactive or API workloads. Note that tcp_nopush and tcp_nodelay are not contradictory; Nginx enables both simultaneously, using tcp_nopush to batch the first packet and then switching to tcp_nodelay for the remainder of the response.

The operating system's page cache also plays a role here. When Nginx serves a frequently-requested file, the kernel's page cache holds that file's contents in RAM. Subsequent sendfile() calls for the same file are satisfied directly from RAM with no disk I/O. For sites where a relatively small set of static assets is requested repeatedly, the effective static file serving cost approaches zero.

Compression: Gzip and Brotli

Compression is a first-order performance concern for any web server. Nginx ships with built-in gzip support via the ngx_http_gzip_module. Brotli support requires the community ngx_brotli module, which is available as a dynamic module in nginx-extras packages or can be compiled in. Brotli, developed by Google and described in RFC 7932, uses a combination of LZ77, Huffman coding, and second-order context modeling. Benchmarks published by the Brotli authors report compression ratios approximately 20–26% better than gzip at comparable speeds for general web content, though the improvement is not uniform across all payload types -- highly compressible content sees larger gains while already-dense payloads see less. In practice, gzip at level 6 typically reduces HTML/CSS/JS payloads by 60–80% of their original size, and Brotli at an equivalent level reaches somewhat further, with the exact delta varying significantly by content.

nginx.conf -- gzip and Brotli compression
# Load Brotli dynamic modules (if using ngx_brotli)
load_module modules/ngx_http_brotli_filter_module.so;
load_module modules/ngx_http_brotli_static_module.so;

http {
    # Gzip (built-in, universal browser support)
    gzip              on;
    gzip_comp_level   6;
    gzip_min_length   256;
    gzip_vary         on;
    gzip_proxied      any;
    gzip_types
        text/plain text/css text/xml
        application/json application/javascript
        application/xml image/svg+xml;

    # Brotli (requires ngx_brotli module)
    brotli            on;
    brotli_static     on;
    brotli_comp_level 6;
    brotli_types
        text/plain text/css text/xml
        application/json application/javascript
        application/xml image/svg+xml;
}

A compression level of 6 is the practical sweet spot for both gzip and Brotli on dynamic content -- it delivers significant size reduction without excessive CPU overhead. gzip_static on and brotli_static on tell Nginx to serve pre-compressed .gz and .br files from disk when they exist, eliminating runtime compression entirely for static assets. Never apply compression to already-compressed binary formats like JPEG, PNG, or MP4 -- they won't shrink and you'll waste CPU cycles. The gzip_vary on directive sends a Vary: Accept-Encoding response header, which is required to prevent CDNs and proxies from serving compressed content to clients that don't support it.

Upstream Module Architecture

When Nginx proxies a request to a backend -- whether via proxy_pass, fastcgi_pass, uwsgi_pass, or grpc_pass -- the work is handled by Nginx's upstream subsystem. Understanding this layer explains both the performance characteristics of proxying and many of the configuration knobs available to operators.

The upstream module operates its own state machine, parallel to the connection state machine on the client side. When a content phase handler (say, ngx_http_proxy_handler) fires, it creates an ngx_http_upstream_t structure on the request pool and calls ngx_http_upstream_init(). This function resolves which upstream server to use (applying the load balancing policy), establishes a connection to it (either creating a new TCP connection or retrieving one from the keepalive pool), and registers event handlers for the upstream socket with epoll. Critically, this is non-blocking: after initiating the connection, control returns to the event loop. The worker is free to process other connections while waiting for the backend to respond.

The upstream state machine has five key callbacks that module authors define: create_request (serialize the upstream request into a buffer), reinit_request (reset state if the request is retried), process_header (parse the upstream response headers), abort_request (clean up if the client disconnects), and finalize_request (final cleanup). The proxy_pass module's create_request reconstructs the HTTP request for the upstream server; the FastCGI module's create_request serializes the request into FastCGI records. Both share the same upstream state machine infrastructure.

nginx.conf -- upstream block with keepalive
upstream app_backend {
    least_conn;

    server 10.0.0.10:8080 weight=3;
    server 10.0.0.11:8080;
    server 10.0.0.12:8080 backup;

    # Maintain persistent connection pool to backends
    keepalive 64;
    keepalive_requests 1000;
    keepalive_timeout 75s;
}

server {
    location / {
        proxy_pass         http://app_backend;
        proxy_http_version 1.1;
        # Required for keepalive to backends
        proxy_set_header   Connection "";
        proxy_set_header   Host $host;
        proxy_set_header   X-Real-IP $remote_addr;
        proxy_set_header   X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_buffers      16 16k;
        proxy_buffer_size  32k;
        proxy_read_timeout 60s;
    }
}

proxy_http_version 1.1 combined with proxy_set_header Connection "" is required to enable HTTP/1.1 keepalives to upstream servers. By default, Nginx uses HTTP/1.0 for upstream connections, which closes the connection after every request. Setting version 1.1 and explicitly clearing the Connection header (which Nginx would otherwise set to close when forwarding) enables connection reuse from the upstream keepalive pool, eliminating TCP handshake overhead for every request.

The buffering directives control how Nginx handles the mismatch between backend response speed and client download speed. With proxy_buffers 16 16k, Nginx allocates up to 16 buffers of 16 KB each (256 KB total) to hold the upstream response in memory. If the response fits, Nginx reads the entire response from the backend quickly, frees the backend connection back to the keepalive pool, and then transmits the buffered data to the client at whatever pace the client can handle. Without buffering, a slow client holds the backend connection open for the entire duration of the transfer -- one slow client on a mobile connection can tie up a backend worker for seconds. At scale, this asymmetry exhausts backend connection pools.

Warning

If a response exceeds the configured buffer space, Nginx writes the overflow to a temporary file on disk. Watch for "upstream response is buffered to a temporary file" in the error log -- it indicates buffer sizes are undersized for your typical response payloads and is a source of latency spikes and elevated disk I/O under load.

Nginx supports four load balancing algorithms out of the box: round-robin (the default, with optional weight), least_conn (fewest active connections, best for workloads with variable response times), ip_hash (which hashes the client address so a given client consistently reaches the same backend, providing session affinity), and hash with an arbitrary key (which can hash on URI, cookie value, or any Nginx variable, and supports a consistent parameter for ketama-style consistent hashing). The backup server flag designates a server as a standby that only receives traffic when all primary servers are unavailable or unhealthy.
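
The generic hash algorithm with consistent hashing deserves a sketch, since it is what cache-sharding deployments usually want; the upstream addresses are illustrative:

```nginx
upstream cache_tier {
    # ketama-style consistent hashing: adding or removing a server
    # remaps only a small fraction of keys instead of reshuffling all
    hash $request_uri consistent;

    server 10.0.1.10:8080;
    server 10.0.1.11:8080;
    server 10.0.1.12:8080;
}
```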

FastCGI Caching

Nginx's fastcgi_cache module provides full-page caching at the web server layer, storing application-generated responses on disk or in memory and serving subsequent identical requests from cache without invoking the backend at all. For dynamic applications with cacheable anonymous content, this can reduce backend load by orders of magnitude.

nginx.conf -- FastCGI cache
http {
    fastcgi_cache_path /var/cache/nginx levels=1:2
        keys_zone=APPCACHE:100m inactive=60m;
    fastcgi_cache_key "$scheme$request_method$host$request_uri";

    server {
        location ~ \.php$ {
            fastcgi_pass          unix:/run/php/php8.2-fpm.sock;
            fastcgi_param         SCRIPT_FILENAME $document_root$fastcgi_script_name;
            include               fastcgi_params;
            fastcgi_cache         APPCACHE;
            fastcgi_cache_valid    200 60m;
            fastcgi_cache_bypass   $skip_cache;
            fastcgi_no_cache       $skip_cache;
            fastcgi_cache_use_stale error timeout updating
                                    http_500 http_503;
        }
    }
}

The keys_zone directive creates a shared memory zone named APPCACHE of 100 MB to store cache keys and metadata. The actual cached response bodies are written to disk at the specified path, using a two-level directory structure (levels=1:2) to avoid placing thousands of files in a single directory. The fastcgi_cache_use_stale directive is worth highlighting: it tells Nginx to serve a stale cached response if the backend returns an error or times out rather than passing a 500 or 503 to the client. This is a critical resilience feature -- it keeps the application serving cached content during backend outages.

Socket Permissions

When using a Unix socket path like unix:/run/php/php8.2-fpm.sock, Nginx's worker processes must have read and write permission on that socket file. Workers run as the user specified by the user directive (commonly www-data or nginx). If that user is not the owner of the PHP-FPM socket and is not in its group, Nginx will return a 502 Bad Gateway with a "connect() to unix:/run/php/php8.2-fpm.sock failed (13: Permission denied)" error in the log. Configure PHP-FPM's listen.owner, listen.group, and listen.mode in the pool file to match the Nginx worker user, or set listen.mode = 0660 with the Nginx worker user in the listen.group.
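In the PHP-FPM pool file (for example /etc/php/8.2/fpm/pool.d/www.conf; the path varies by distribution), the matching settings would look roughly like this, assuming Nginx workers run as www-data:

www.conf -- PHP-FPM socket permissions (sketch)
[www]
listen = /run/php/php8.2-fpm.sock
; Match the Nginx worker user (the user directive in nginx.conf)
listen.owner = www-data
listen.group = www-data
listen.mode = 0660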

Cache bypass logic is controlled by setting $skip_cache based on request properties. Authenticated sessions, POST requests, and requests with query strings should typically bypass the cache to prevent serving stale or personalized responses to the wrong user.

nginx.conf -- cache bypass rules
set $skip_cache 0;

# Skip cache for non-GET/HEAD methods
if ($request_method !~ ^(GET|HEAD)$) { set $skip_cache 1; }

# Skip cache for query strings (vary by URL)
if ($query_string != "") { set $skip_cache 1; }

# Skip cache for authenticated sessions
if ($http_cookie ~* "session_id|auth_token|logged_in") {
    set $skip_cache 1;
}
Warning: if in location blocks

The if directive inside Nginx location blocks is well documented to behave unexpectedly -- the community wiki page titled "If is Evil" exists for a reason. It is safe for simple variable assignments like set $skip_cache 1, but avoid combining if with proxy_pass, root, or other content-producing directives. For more complex bypass logic, prefer the map directive in the http context: map variables are evaluated lazily on first reference, outside the rewrite phase, so they carry none of if's phase-ordering side effects.

nginx.conf -- map-based bypass (preferred)
http {
    # map variables are evaluated lazily on first use -- no rewrite-phase issues
    map $request_method $skip_by_method {
        default 0;
        POST    1;
        PUT     1;
        DELETE  1;
    }

    map $http_cookie $skip_by_cookie {
        default                    0;
        "~session_id|auth_token"   1;
    }

    # Combine into a single variable via a second map
    map $skip_by_method$skip_by_cookie $skip_cache {
        "~1"  1;
        default 0;
    }
}

SSL/TLS Configuration and Security Headers

Nginx's TLS implementation uses OpenSSL and is highly configurable. Modern best practices require TLS 1.2 at minimum, with TLS 1.3 strongly preferred. A hardened TLS configuration looks like this:

nginx.conf -- TLS hardening
ssl_protocols           TLSv1.2 TLSv1.3;
ssl_ciphers             ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305;
ssl_prefer_server_ciphers off;
ssl_session_cache       shared:SSL:10m;
ssl_session_timeout     1d;
ssl_session_tickets     off;
ssl_stapling            on;
ssl_stapling_verify     on;
server_tokens           off;

TLS session caching (ssl_session_cache shared:SSL:10m) creates a shared memory cache accessible to all worker processes, ensuring that a session established through one worker can be resumed through any other. A full TLS handshake requires multiple round trips and significant CPU work; session resumption reduces this to a single round trip. Session tickets are disabled (ssl_session_tickets off) because they can undermine forward secrecy -- a compromised ticket key exposes past sessions. ssl_prefer_server_ciphers off is correct for TLS 1.3, where the client's cipher preference should be honored. server_tokens off removes the Nginx version number from response headers and error pages, reducing information leakage.

OCSP stapling (ssl_stapling on) tells Nginx to fetch and cache the OCSP response for your certificate from the certificate authority and attach it to the TLS handshake, removing the need for the client to make a separate HTTP request to the CA's OCSP server to verify certificate revocation status.

OCSP Stapling Requires a Resolver

A common silent failure: ssl_stapling on requires a resolver directive in your http block or server block so Nginx can resolve the CA's OCSP responder hostname at runtime. Without it, as the Nginx bug tracker documents, stapling silently fails with a warning log entry and is disabled for that server. Additionally, ssl_stapling_verify on requires ssl_trusted_certificate pointing to your CA's certificate chain -- without it, Nginx cannot verify the OCSP response and stapling will not activate. Note also that Let's Encrypt ended OCSP support in a two-stage process: OCSP URLs were dropped from all newly issued certificates on May 7, 2025, and the OCSP responders were shut down entirely on August 6, 2025. Any certificate issued by Let's Encrypt after May 7, 2025 contains no OCSP URL, which means ssl_stapling on will trigger a configuration warning for those certs and should be disabled for all Let's Encrypt deployments.

ssl_stapling            on;
ssl_stapling_verify     on;
ssl_trusted_certificate /etc/ssl/certs/ca-chain.pem;
resolver                1.1.1.1 8.8.8.8 valid=300s;
resolver_timeout        5s;

Beyond TLS configuration, security response headers are an essential and frequently missed layer. The official NGINX blog on HSTS describes the full directive:

nginx.conf -- security response headers
# Force HTTPS for 1 year; include subdomains once all are HTTPS-ready
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;

# Prevent MIME-type sniffing
add_header X-Content-Type-Options nosniff always;

# Deny framing to prevent clickjacking
add_header X-Frame-Options SAMEORIGIN always;

# Restrict resource origins (adjust to your actual sources)
add_header Content-Security-Policy "default-src 'self'; script-src 'self'; object-src 'none';" always;

The always parameter on add_header ensures headers are included even in error responses (4xx, 5xx), which matters for HSTS -- a browser that never sees the HSTS header on a 403 response remains vulnerable. Do not add the preload directive to HSTS unless every subdomain is HTTPS-ready and you have submitted the domain to the HSTS preload list; once set, the browser will refuse HTTP connections for the entire max-age period and the preload list entry is effectively permanent.

Note

HTTP/2 is enabled with the http2 on; directive inside a server block. As of Nginx 1.25.1, the older listen 443 ssl http2; parameter syntax was deprecated in favor of a standalone http2 directive. The correct current syntax is listen 443 ssl; followed by http2 on; on a separate line. The older parameter still works in 1.25.x but triggers a configuration warning and should be updated on any server running 1.25.1 or later. HTTP/2 provides multiplexing -- multiple requests over a single TCP connection simultaneously -- plus HPACK header compression, reducing overhead particularly for asset-heavy applications triggering many parallel requests.
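The migration, sketched side by side:

nginx.conf -- http2 directive migration
# Deprecated as of 1.25.1 (still accepted, logs a warning):
#   listen 443 ssl http2;

# Current syntax:
server {
    listen 443 ssl;
    http2  on;
}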

HTTP/3 and QUIC

HTTP/3 replaces TCP with QUIC as its transport protocol; Nginx shipped support starting with version 1.25.0, released in May 2023. The protocol addresses two fundamental weaknesses of HTTP/2 over TCP: head-of-line blocking and the cost of connection establishment. On TCP, a lost packet stalls all multiplexed HTTP/2 streams until retransmission -- even streams that have no missing data. QUIC runs over UDP and implements per-stream reliability, so a lost packet only blocks the one stream whose data is missing. Combined with TLS 1.3 being an integral component of the protocol (not a layer on top), QUIC reduces connection establishment from two round trips to one, and from one to zero for resumed sessions with 0-RTT.

Nginx's multi-worker architecture created a specific challenge for QUIC. TCP connections are identified by a 4-tuple (source IP, source port, destination IP, destination port), which the kernel uses for socket dispatch -- a connection arrives reliably at the same socket and thus the same worker. QUIC connections are identified by a connection ID embedded in each UDP datagram, not by a stable 4-tuple (QUIC's design explicitly supports connection migration, where the client's IP can change). To solve this, the Nginx team implemented an eBPF program that integrates with the kernel's SO_REUSEPORT socket selection mechanism, mapping QUIC connection IDs to worker process socket file descriptors. This ensures that all packets belonging to a given QUIC connection are routed to the same worker, preserving Nginx's share-nothing worker architecture.

nginx.conf -- HTTP/3 with QUIC
server {
    # HTTP/3 over QUIC (UDP); reuseport required for multi-worker QUIC routing
    listen 443 quic reuseport;
    # HTTP/1.1 and HTTP/2 (TCP fallback)
    listen 443 ssl;

    http2 on;

    ssl_certificate     /etc/ssl/certs/example.com.crt;
    ssl_certificate_key /etc/ssl/private/example.com.key;
    ssl_protocols       TLSv1.2 TLSv1.3;

    # QUIC requires TLS 1.3 -- this is automatically satisfied when quic is enabled

    # Advertise HTTP/3 support via Alt-Svc header
    # Clients use this to upgrade on the next request
    add_header Alt-Svc 'h3=":443"; ma=86400' always;
}

The Alt-Svc header is how HTTP/3 negotiation works. A client makes its first request over TCP (HTTP/1.1 or HTTP/2). The server includes Alt-Svc: h3=":443"; ma=86400 in the response, advertising that it accepts HTTP/3 on UDP port 443 and that this preference should be cached for 86,400 seconds (one day). On the client's next visit, the browser attempts a QUIC connection directly. The reuseport parameter on the QUIC listen line is not optional -- it is required for the eBPF connection ID routing to function correctly across worker processes.

Note

OpenSSL did not historically provide the QUIC TLS API that Nginx requires, so the Nginx team developed an OpenSSL Compatibility Layer that ships by default in official Nginx distributions. As the official Nginx QUIC documentation states: OpenSSL 3.5.1 or higher is recommended to build Nginx with QUIC support -- specifically, OpenSSL 3.5.1 enables 0-RTT early data, while versions prior to 3.5.1 fall back to the compatibility layer which does not support 0-RTT. The Nginx Plus R36 release notes confirm this distinction explicitly: "support for 0-RTT in QUIC when using OpenSSL 3.5.1 or newer." For full QUIC support on distributions where OpenSSL 3.5.x is not yet available, BoringSSL, LibreSSL, or QuicTLS remain the recommended alternatives. Always confirm with nginx -V which SSL library your binary was compiled against before enabling QUIC in production.

Load Balancing

Nginx's upstream module provides load balancing across multiple backend servers. The upstream block declares the backend pool; the proxy_pass or fastcgi_pass directive in a location block references it by name.

nginx.conf -- upstream algorithms
# Least connections -- best for variable-duration requests
upstream backend_lc {
    least_conn;
    server 10.0.0.10:8080;
    server 10.0.0.11:8080;
}

# IP hash -- session affinity by client IP
upstream backend_sticky {
    ip_hash;
    server 10.0.0.10:8080;
    server 10.0.0.11:8080;
}

# Hash on arbitrary variable -- e.g., route by URI for cache locality
upstream backend_hash {
    hash $request_uri consistent;
    server 10.0.0.10:8080;
    server 10.0.0.11:8080;
}

Round-robin is the default when no algorithm directive is specified. least_conn routes to the backend with the fewest active connections, making it the right choice for workloads where request duration varies -- a backend that is slow on a complex request is automatically favored less until it recovers. ip_hash uses the first three octets of the client IPv4 address as the hash key, providing soft session affinity that breaks on IP change. For IPv6 clients, the entire address is used -- IPv6 support in ip_hash was added in Nginx 1.3.2. The hash directive with a custom variable (like $request_uri) enables routing to maximize cache hit rates across a pool of caching backends. The consistent parameter uses consistent hashing (ketama), which minimizes cache invalidation when a backend is added or removed from the pool.

Passive Health Checking

Nginx open source includes passive health checking via the max_fails and fail_timeout parameters on upstream server directives. Active health checks -- where Nginx periodically sends probe requests to backends regardless of traffic -- are an NGINX Plus feature, but the passive mechanism is available to all deployments and is important to configure explicitly.

nginx.conf -- passive health checking
upstream app_backend {
    least_conn;

    # Mark a server unavailable after 3 failures within 30 seconds
    # Keep it unavailable for 30 seconds before retrying
    server 10.0.0.10:8080 max_fails=3 fail_timeout=30s;
    server 10.0.0.11:8080 max_fails=3 fail_timeout=30s;

    # Backup only receives traffic when primary servers are all unavailable
    server 10.0.0.12:8080 backup;
}

max_fails sets the number of failed communication attempts to the server within a fail_timeout window that causes Nginx to mark the server temporarily unavailable. Once marked unavailable, the server is not used for the duration of fail_timeout, after which Nginx probes it with a single live request; if that request succeeds, the server is restored to the pool. A failure in this context means a connection timeout, a read error, or an upstream response matching the error codes specified in proxy_next_upstream. The defaults -- max_fails=1 and fail_timeout=10s -- are aggressive: a single failed attempt removes a server for ten seconds. For APIs with any significant latency variance, raising max_fails to 3 and fail_timeout to 30s prevents one slow response from prematurely removing a healthy server from rotation.
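What counts as a failure is itself configurable through proxy_next_upstream, and retries should be bounded so a failing request cannot cascade across the whole pool. A sketch with illustrative values:

nginx.conf -- defining failure for passive health checks
location /api/ {
    proxy_pass http://app_backend;
    # Count connect errors, timeouts, and these 5xx responses as failures
    # and retry them on the next server in the pool
    proxy_next_upstream         error timeout http_502 http_503 http_504;
    # Bound retry amplification: at most 2 attempts, 10 seconds total
    proxy_next_upstream_tries   2;
    proxy_next_upstream_timeout 10s;
}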

Upstream Keepalive: Connection Pooling

One of the most impactful and frequently omitted upstream settings in production Nginx configurations is keepalive inside the upstream block. By default, as the official F5/Nginx blog post on top configuration mistakes documents, Nginx opens a new TCP connection to an upstream backend for every proxied request. At high traffic volumes, this means constant TCP handshake overhead and accelerated exhaustion of ephemeral port ranges from TIME_WAIT socket accumulation. The keepalive directive creates a per-worker pool of idle persistent connections to backend servers, eliminating this overhead.

nginx.conf -- upstream keepalive connection pooling
upstream app_backend {
    least_conn;
    server 10.0.0.10:8080 max_fails=3 fail_timeout=30s;
    server 10.0.0.11:8080 max_fails=3 fail_timeout=30s;

    # Keep up to 32 idle keepalive connections per worker in the pool
    # This does NOT cap total connections -- it caps idle ones
    keepalive 32;
}

server {
    location /api/ {
        proxy_pass            http://app_backend;
        # Required for keepalive to work: use HTTP/1.1 and clear Connection header
        proxy_http_version    1.1;
        proxy_set_header      Connection "";
    }
}

The official upstream module documentation is explicit about an important distinction: the keepalive value sets the maximum number of idle connections preserved in the cache per worker, not a cap on total upstream connections. Workers can and will open more connections than this number under load; the pool simply recycles the most recently used ones when they become idle. The two companion directives -- proxy_http_version 1.1 and proxy_set_header Connection "" -- are required: without HTTP/1.1, Nginx defaults to HTTP/1.0 which does not support persistent connections, and without clearing the Connection header, downstream HTTP/1.0 clients may inject a Connection: close header that would terminate the backend connection after each request.

Note

For FastCGI upstreams (PHP-FPM), connection pooling requires two directives working together: keepalive N in the upstream block creates the idle connection pool, and fastcgi_keep_conn on in the location block instructs PHP-FPM to hold its end of the connection open. Neither directive alone is sufficient -- both are required for persistent connections to function.
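A sketch of the two halves together, reusing the socket path from the earlier examples (the upstream name php_fpm is a placeholder):

nginx.conf -- FastCGI connection pooling
upstream php_fpm {
    server unix:/run/php/php8.2-fpm.sock;
    keepalive 8;
}

server {
    location ~ \.php$ {
        fastcgi_pass      php_fpm;
        fastcgi_keep_conn on;
        fastcgi_param     SCRIPT_FILENAME $document_root$fastcgi_script_name;
        include           fastcgi_params;
    }
}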

Rate Limiting

Nginx's limit_req module implements request rate limiting using a leaky bucket algorithm -- the Nginx documentation states this explicitly: ngx_http_limit_req_module uses the leaky bucket method. (Its companion, limit_conn, does not rate-limit at all; it simply caps the number of concurrent connections per key.) Unlike the token bucket -- which accumulates tokens during idle periods and allows bursts proportional to that accumulation -- the leaky bucket enforces a constant output rate. Requests that arrive faster than the configured rate are either queued (up to the burst limit) or rejected immediately; no reserve builds up over quiet periods. The nodelay parameter and the burst limit together control which path excess requests take, as shown in the configuration below. Both modules operate against a shared memory zone, making enforcement coherent across all worker processes.

nginx.conf -- rate limiting
http {
    # Zone: track request rate per client IP, 10 MB zone, 10 req/s
    limit_req_zone  $binary_remote_addr zone=api:10m   rate=10r/s;
    # Stricter zone for authentication endpoints: 1 req/s
    limit_req_zone  $binary_remote_addr zone=login:10m rate=1r/s;
    # Zone: track concurrent connections per client IP
    limit_conn_zone $binary_remote_addr zone=addr:10m;
}

server {
    location /api/ {
        # Allow burst of 20 above rate; excess served without delay up to burst
        limit_req  zone=api burst=20 nodelay;
        # Maximum 10 concurrent connections from one IP
        limit_conn addr 10;
        proxy_pass http://app_backend;
    }

    location /login {
        # Strict: 1 req/s, burst of 5, immediate rejection beyond burst
        limit_req  zone=login burst=5 nodelay;
        proxy_pass http://app_backend;
    }
}

The rate=10r/s setting allows an average of ten requests per second from a single IP. The burst=20 parameter permits a short burst of up to 20 additional requests above the rate before limiting engages. With nodelay, bursting requests are processed immediately rather than queued for the delay they would incur at the configured rate, and excess requests beyond the burst are rejected with the default 503 status code. Without nodelay, excess burst requests are queued, which increases latency rather than rejecting requests -- useful for human-facing endpoints, less useful for API rate limiting where a fast rejection is preferable to a slow 200.

Pro Tip: Return 429 instead of 503

By default, Nginx returns HTTP 503 (Service Temporarily Unavailable) when a client exceeds a rate limit. As the official Nginx rate limiting blog post explains, you can override this with the limit_req_status directive. For APIs, HTTP 429 (Too Many Requests) is significantly more appropriate -- RFC 6585 defines 429 specifically for this case, and it correctly signals to the client that the server is functioning normally and the problem is request frequency, not server health. Add limit_req_status 429; to your http or location context alongside limit_req.
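A sketch, including the companion directive for connection limits:

nginx.conf -- 429 for rate-limited requests
location /api/ {
    limit_req         zone=api burst=20 nodelay;
    limit_req_status  429;
    limit_conn        addr 10;
    limit_conn_status 429;
    proxy_pass        http://app_backend;
}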

Note

$binary_remote_addr is preferred over $remote_addr for zone keys because it stores IPv4 as 4 bytes and IPv6 as 16 bytes rather than as variable-length strings, making the zone more memory-efficient at scale. As the official Nginx documentation for ngx_http_limit_req_module notes, a 10 MB zone can track state for approximately 160,000 IPv4 addresses simultaneously when used for request rate limiting -- limit_conn_zone state entries carry slightly more overhead per entry, so the same zone would track somewhat fewer concurrent connection entries. If your application sits behind a load balancer, key the zone on $http_x_forwarded_for or use the ngx_http_realip_module to set $remote_addr correctly from the proxy header before rate limiting evaluates it.
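A realip sketch for deployments behind a known load balancer; the 10.0.0.0/8 range is a placeholder that must be narrowed to your actual proxy addresses:

nginx.conf -- restore the client IP before rate limiting
# Trust only the load balancer tier to supply the client IP
set_real_ip_from  10.0.0.0/8;
real_ip_header    X-Forwarded-For;
# Walk the X-Forwarded-For chain past any trusted intermediate proxies
real_ip_recursive on;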

Monitoring and Observability

Nginx provides a built-in status endpoint through the ngx_http_stub_status_module:

nginx.conf -- stub status
location /nginx_status {
    stub_status;
    allow 127.0.0.1;
    deny  all;
}

The status endpoint exposes active connections, accepted and handled connection counts, total request count, and the instantaneous reading/writing/waiting breakdown. The waiting count is particularly informative: waiting connections are keepalive connections sitting idle between requests, consuming a file descriptor and a small memory allocation but no CPU. A very high waiting count relative to active worker connections often indicates clients with aggressive keepalive settings -- tuning keepalive_timeout down can reclaim file descriptors. For production monitoring, this data feeds into Prometheus via nginx-prometheus-exporter, providing time-series metrics on connection pool saturation, request throughput, and error rates.
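The endpoint returns a small plain-text payload; the figures below are illustrative:

Example stub_status response
Active connections: 291
server accepts handled requests
 16630948 16630948 31070465
Reading: 6 Writing: 179 Waiting: 106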

The access log is Nginx's primary observability tool and is highly configurable. A production log format should include upstream response time, cache status, TLS protocol, and request processing time:

nginx.conf -- extended log format
log_format detailed
    '$remote_addr - $remote_user [$time_local] "$request" '
    '$status $body_bytes_sent "$http_referer" '
    '"$http_user_agent" '
    'rt=$request_time '
    'uct=$upstream_connect_time urt=$upstream_response_time '
    'cs=$upstream_cache_status '
    'tls=$ssl_protocol cipher=$ssl_cipher';

access_log /var/log/nginx/access.log detailed;

$upstream_cache_status will be HIT, MISS, BYPASS, STALE, UPDATING, or EXPIRED (plus REVALIDATED when cache revalidation is enabled) -- tracking the HIT ratio over time is the primary indicator of caching effectiveness. $upstream_response_time measures elapsed time from the moment Nginx opens the upstream connection to the last byte received from the backend, giving a direct view into backend performance. $request_time measures total time from receiving the first byte of the client request to sending the last byte of the response, including client transmission time -- a large delta between $request_time and $upstream_response_time typically indicates slow clients or insufficient proxy buffering.
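A log line in this format would look roughly as follows (all values fabricated for illustration):

Example access log entry (detailed format)
203.0.113.7 - - [12/Mar/2025:10:14:33 +0000] "GET /api/items HTTP/2.0" 200 5321 "-" "Mozilla/5.0" rt=0.412 uct=0.002 urt=0.087 cs=MISS tls=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384

Here the gap between rt=0.412 and urt=0.087 is roughly 0.3 seconds spent outside the backend -- client transfer time, per the paragraph above.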

Note: proxy_buffering and Slow Clients

proxy_buffering is enabled by default and is one of the more consequential settings for backend connection efficiency. When enabled, Nginx drains the backend response into its buffers (and, past the buffer limit, a temporary file) as fast as the backend can produce it, so backend connections are released quickly regardless of how slowly the client consumes data. With proxy_buffering off, Nginx holds the backend connection open and streams data to the client at the client's pace -- on slow or mobile clients, the backend connection stays open for the entire duration of the download, which can rapidly exhaust backend connection pools under load. The delta between $request_time and $upstream_response_time in your logs quantifies this effect: a large delta with buffering disabled is a signal that slow clients are accumulating backend connections.

Nginx Unit: Archived as of October 2025

Previous versions of this article discussed Nginx Unit as a forward-looking option for application serving. That guidance requires an important correction: Nginx Unit was officially archived by its maintainers in October 2025 and is no longer receiving new features, enhancements, or bug fixes.

Nginx Unit was an application server that could run PHP, Python, Go, Node.js, Ruby, Perl, Java, and WebAssembly (WASI) applications directly with a JSON-based dynamic configuration API. While the source code remains available and community contributions are technically accepted, it should not be considered for new production deployments. The F5/Nginx team has not announced a replacement product. If you are evaluating application servers that combine web serving with runtime execution and dynamic configuration, current actively-maintained alternatives include Caddy, FrankenPHP, and language-specific application servers paired with Nginx as a reverse proxy using the well-established patterns described in this article.

Summary

Nginx's architecture is a study in how constraints produce elegance. Sysoev started from the requirement that a single server must handle tens of thousands of simultaneous connections without degrading. Every major design decision -- the event loop on top of epoll, the fixed worker process count, the pool-based memory allocator, the phase-based request pipeline, the filter chain -- follows from that requirement. None of these are arbitrary. Each removes a bottleneck that would otherwise appear at scale.

The module system is what transforms this performance foundation into a general-purpose platform. Every capability -- from static file serving to proxying to rate limiting to gzip compression -- is implemented as a module conforming to ngx_module_t. The core provides the event loop, configuration parsing, and request orchestration; modules provide everything else. This architecture makes Nginx extensible without patching the core, and it makes it possible to build a minimal binary with only the modules a deployment actually needs.

The memory architecture reinforces the event-driven model. Per-request pools mean that modules never individually free their allocations -- they allocate into a pool tied to the request lifetime, and the entire pool is reclaimed in one operation when the request finalizes. Per-connection pools persist across keepalive requests. Shared memory zones with slab allocators allow rate limiting counters, cache metadata, and SSL session state to be coherent across all worker processes without inter-process messaging. The result is a memory model that is both fast and predictable.

The eleven-phase request pipeline and the filter chain give operators precise control over where in the request and response lifecycle each module acts. Understanding that phases run in a defined order regardless of directive position in the config file, and that filter chains run in reverse registration order, resolves most of the configuration confusion that trips up Nginx practitioners.

HTTP/3 with QUIC extends Nginx into the next generation of web protocols while preserving the worker architecture -- through the eBPF-based connection ID routing that maps QUIC's connection IDs back to specific worker sockets, solving the fundamental problem that UDP datagrams carry no built-in process affinity.

The operational details matter as much as the architecture. try_files performs filesystem tests in sequence and triggers an internal redirect to the fallback URI, restarting the phase engine -- understanding this prevents configuration errors where catch-all fallbacks bypass carefully placed access controls. The ngx_http_realip_module must run before rate limiting or logging to ensure $remote_addr reflects the actual client IP rather than a proxy. Passive health checking via max_fails and fail_timeout provides resilience without NGINX Plus.

The keepalive directive in every upstream block, combined with proxy_http_version 1.1 and a cleared Connection header, enables connection pooling to backends -- one of the most impactful and most frequently omitted production optimizations. Cache bypass logic written with map blocks rather than if directives avoids the well-documented edge cases that if introduces inside location contexts. Load balancing algorithm selection -- round-robin, least_conn, ip_hash, or consistent hash -- should be driven by workload characteristics, not convention, with the behavior differences (especially ip_hash's three-octet IPv4 key versus full IPv6 address, the latter supported since Nginx 1.3.2) understood before deployment.

And OCSP stapling requires a resolver directive and ssl_trusted_certificate to function -- without them it fails silently. For Let's Encrypt certificates issued after May 7, 2025, ssl_stapling should be disabled entirely, as those certificates carry no OCSP URL.

The point of understanding Nginx at this depth is not to configure it differently -- it is to configure it correctly, and to know why each decision matters under load.