Introduction

Welcome to the fourth lesson of the Load Balancing and Performance Tuning course! In the previous lessons, we explored distributing traffic across servers, implementing advanced routing strategies, and caching responses to reduce backend load. Now, we'll address another critical performance factor: bandwidth consumption. Even with perfectly balanced servers and efficient caching, transferring large responses over the network can slow down your application and increase costs. In this lesson, we'll implement gzip compression in NGINX to dramatically reduce the size of transmitted data. We'll configure compression levels, specify which content types benefit most from compression, handle proxied requests, and verify the bandwidth savings through practical testing. By the end, you'll understand how to optimize data transfer without sacrificing content quality.

Understanding Bandwidth and Compression

When a client requests content from your server, the response must travel across the network from your server to their device. The larger the response, the longer this transfer takes and the more bandwidth it consumes. This affects both performance and cost: users experience slower load times, especially on mobile networks, and you pay for data transfer based on volume.

Compression addresses this challenge by reducing the size of responses before transmission. Text-based content like HTML, CSS, JavaScript, and JSON often contains repetitive patterns and whitespace that compress extremely well. A response that's 100 KB uncompressed might shrink to 20 KB or less when compressed, resulting in five times faster transfer and five times lower bandwidth costs. The client's browser automatically decompresses the content upon receipt, making the process transparent to users. The key is configuring your server to compress the right content at the right compression level, balancing CPU usage against bandwidth savings.

How Gzip Works in NGINX

Gzip is a widely supported compression algorithm that identifies and eliminates redundancies in data. When enabled, NGINX compresses eligible responses before sending them to clients. The process works as follows: the client sends an Accept-Encoding: gzip header indicating it can handle compressed content, NGINX compresses the response using gzip, adds a Content-Encoding: gzip header to signal the compression, and the client decompresses the content automatically.
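The exchange described above looks roughly like this (paths and sizes are illustrative):

```
Client request:
    GET /styles.css HTTP/1.1
    Accept-Encoding: gzip

Server response:
    HTTP/1.1 200 OK
    Content-Type: text/css
    Content-Encoding: gzip
    Vary: Accept-Encoding

    <gzip-compressed body>
```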

NGINX offers two compression approaches. Dynamic compression processes responses on the fly, which is suitable for content generated at runtime. Static compression serves pre-compressed files that were compressed during deployment, which saves CPU resources during request handling. Both methods have their place: dynamic compression for dynamic content and API responses, static compression for unchanging assets like JavaScript bundles and CSS files. We'll configure both in this lesson.

Enabling Basic Gzip Compression

To start using compression, we first enable it globally and configure fundamental parameters in the http block:
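A minimal sketch of these settings in the http block, using the values discussed in this lesson:

```nginx
http {
    # Enable dynamic gzip compression for eligible responses
    gzip on;

    # Compression level 1-9; 6 balances size reduction against CPU cost
    gzip_comp_level 6;

    # Skip responses of 256 bytes or smaller, where gzip overhead
    # can outweigh the savings
    gzip_min_length 256;
}
```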

The gzip on directive activates compression throughout this configuration context. The gzip_comp_level sets the compression intensity on a scale from 1 to 9, where higher values produce smaller files but consume more CPU. Level 6 provides an excellent balance: it achieves significant size reduction without excessive processing overhead. The gzip_min_length directive prevents compressing tiny responses, where the compression overhead might actually increase the total size; only responses larger than 256 bytes will be compressed. These three settings form the foundation of your compression strategy.

Specifying Content Types for Compression

Not all content benefits equally from compression. Text-based formats compress well because they contain repetitive patterns, while binary formats like images and videos are already compressed and won't shrink further. The gzip_types directive lists which MIME types should be compressed:
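A sketch of the directive covering the formats discussed below:

```nginx
# Inside the http block: compress these text-based MIME types.
# text/html is compressed by default and must not be listed again.
gzip_types
    text/plain
    text/css
    application/javascript
    application/json
    application/xml
    image/svg+xml;
```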

This comprehensive list covers the most common text-based formats: plain text, stylesheets, JavaScript files, JSON API responses, XML documents, and SVG images. Notice that text/html is not included because NGINX compresses it by default. Importantly, this list excludes binary formats like JPEG, PNG, and MP4, which are already optimized and would waste CPU cycles without meaningful size reduction.

Handling Proxied Request Compression

When NGINX acts as a reverse proxy, additional considerations arise for compression. The gzip_proxied directive controls when responses from upstream servers should be compressed:
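The setting used in this lesson:

```nginx
# Inside the http block: compress all responses received from
# upstream servers, regardless of their cache-related headers
gzip_proxied any;
```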

This directive accepts several values that determine compression behavior based on response headers. Setting it to any means all proxied responses will be compressed regardless of their cache headers, which is the most straightforward approach. Other values like expired, no-cache, or auth provide finer control by compressing only responses with specific cache-control directives. For most applications, any works well because we want to compress all eligible content.

Adding the Vary Header

The gzip_vary directive controls whether NGINX includes a Vary: Accept-Encoding header in responses:
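Enabling it is a single line:

```nginx
# Inside the http block: emit "Vary: Accept-Encoding" on
# compressible responses so caches store both variants
gzip_vary on;
```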

This header is crucial for proper caching behavior. It informs intermediate caches and CDNs that the response varies based on the client's compression capabilities. Without this header, a cache might serve a compressed response to a client that doesn't support gzip, or vice versa, leading to broken content. The Vary header ensures that compressed and uncompressed versions are cached separately, so each client receives the appropriate format. This is particularly important when caching proxies sit between NGINX and clients.

Configuring the Backend Upstream

As in previous lessons, we define an upstream group for our backend servers:
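A sketch of the upstream group; the hostnames and port are placeholders for your own backend addresses:

```nginx
# Hypothetical backend pool; round-robin is the default distribution
upstream backend {
    server backend1.example.com:8080;
    server backend2.example.com:8080;
    server backend3.example.com:8080;
}
```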

This familiar configuration creates a pool of three backend servers using round-robin distribution. The compression settings we've configured will apply to responses from these servers, reducing the bandwidth required to send their responses to clients. The backends themselves don't need to be aware of compression; NGINX handles it transparently as part of the proxying process.

Applying Compression to Proxied API Requests

Now, we'll configure a location that proxies requests to the backend while ensuring proper compression:
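A sketch of such a location block; the /api/ path is a placeholder, and "backend" refers to the upstream group defined earlier:

```nginx
location /api/ {
    proxy_pass http://backend;

    # Preserve client information for the upstream servers
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

    # Advertise that NGINX accepts gzip-compressed upstream responses
    proxy_set_header Accept-Encoding gzip;
}
```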

This location forwards requests to our backend upstream group with several important headers. The Host, X-Real-IP, and X-Forwarded-For headers preserve client information, which we've seen in previous lessons. The critical addition here is proxy_set_header Accept-Encoding gzip, which explicitly tells the backend that NGINX can handle compressed responses. If the backend supports compression, it might send pre-compressed content, which NGINX can then serve directly without recompressing. This header creates a compression-aware communication channel between NGINX and the backends.

Serving Pre-compressed Static Files

For static assets that don't change, pre-compressing them during deployment and serving the compressed versions directly is more efficient than compressing on every request:
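A sketch of the location block, mapping the URL path to the on-disk directory described below:

```nginx
# Maps /static/compressed/app.js to /var/www/static/app.js
location /static/compressed/ {
    alias /var/www/static/;

    # Serve a pre-compressed .gz file when one exists beside the original;
    # otherwise fall back to compressing (or serving) the original file
    gzip_static on;
}
```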

The gzip_static directive tells NGINX to look for pre-compressed versions of requested files. When a client requests /static/compressed/app.js, NGINX first checks for /var/www/static/app.js.gz. If that file exists, NGINX serves it directly with the appropriate Content-Encoding header. If it doesn't exist, NGINX falls back to the uncompressed file. This approach eliminates compression CPU overhead for static files, since the compression happens once during deployment rather than on every request.

Important note: If you update the original app.js but forget to regenerate app.js.gz, NGINX could serve an outdated compressed file while the uncompressed version has already been updated. This mismatch can lead to subtle bugs and inconsistent behavior. It's crucial to ensure that compressed and uncompressed files always match during deployment. The best practice is to automate this process in your deployment pipeline, regenerating all .gz files whenever their source files change. You might use build tools or deployment scripts to maintain this synchronization automatically, preventing version mismatches from reaching production.

Selectively Disabling Compression

Sometimes, you might want to disable compression for specific content, perhaps for testing or because certain content doesn't compress well:
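A sketch of such a location; the /uncompressed/ path and header value are placeholders:

```nginx
# Hypothetical location where compression is deliberately turned off
location /uncompressed/ {
    # Overrides the global "gzip on" for this location only
    gzip off;

    # Make the behavior visible in response headers for debugging
    add_header X-Compression "off";
}
```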

The gzip off directive disables compression for this location, overriding the global gzip on setting. Even though compression is enabled elsewhere, files served from this location will always be uncompressed. The custom X-Compression header makes this behavior visible in responses, which is useful for debugging and verifying that compression is indeed disabled where intended.

Creating a Compression Test Endpoint

To verify that compression is working and measure its effectiveness, we can create an endpoint that returns substantial text content:
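One way to sketch such an endpoint directly in the configuration; the path is a placeholder, and the response body shown here is truncated for brevity (in practice it would be ~1024 bytes of repeated text):

```nginx
# Hypothetical endpoint for measuring compression effectiveness
location /compression-test {
    default_type text/plain;

    # Record the uncompressed size for comparison against Content-Length
    add_header X-Original-Size "1024";

    # Repeated filler text compresses well and exceeds gzip_min_length
    return 200 "This response body repeats filler text to exceed the minimum compressible size. This response body repeats filler text...";
}
```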

This endpoint returns approximately 1024 bytes of text content, which is large enough to benefit from compression. The X-Original-Size header records the uncompressed size, making it easy to compare against the actual transferred size. When you request this endpoint, you can examine the response headers to see the Content-Encoding: gzip header and compare the Content-Length (compressed size) against the X-Original-Size (uncompressed size) to calculate the compression ratio.

Verifying Compression in Practice

When compression is working correctly, you'll observe several indicators in the response headers. The most obvious is the Content-Encoding: gzip header, which confirms that the response was compressed. The Content-Length header shows the compressed size in bytes, while the Vary: Accept-Encoding header indicates that the response varies based on compression support.

For our test endpoint returning 1024 bytes of text, the compressed response might be only 400–500 bytes, achieving approximately 50–60% size reduction. This translates directly to bandwidth savings: saving roughly 500 bytes per response across 1 million such responses per day adds up to about half a gigabyte of data transfer daily, and the savings scale with response size and traffic volume. The actual compression ratio varies by content type; highly repetitive content like JSON with many similar fields might compress to 20% of its original size, while already-compressed formats won't shrink at all.
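You can sanity-check these numbers without a running server by using the gzip command-line tool at level 6 (matching gzip_comp_level 6); the repeated sample text here is a stand-in for the test endpoint's body:

```shell
# Create ~1024 bytes of repetitive text, simulating a compressible response
yes "This line repeats to simulate compressible text content." | head -c 1024 > sample.txt

# Compress at level 6, keeping the original for comparison
gzip -6 -c sample.txt > sample.txt.gz

# Compare the sizes; repetitive text shrinks dramatically
original=$(wc -c < sample.txt)
compressed=$(wc -c < sample.txt.gz)
echo "original: ${original} bytes, compressed: ${compressed} bytes"
```

Against a live server, the same check is a matter of sending a request with an `Accept-Encoding: gzip` header and inspecting the `Content-Encoding` and `Content-Length` response headers.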

Conclusion and Next Steps

Fantastic work! We've implemented a comprehensive compression strategy in NGINX that intelligently reduces bandwidth consumption across different types of content. You've configured dynamic compression for API responses and proxied content, set up static compression for unchanging files, and learned how to selectively disable compression when needed. The combination of appropriate compression levels, targeted content types, and proper cache headers ensures optimal performance without wasting CPU resources. Together with load balancing from lessons one and two and caching from lesson three, compression completes your performance optimization toolkit. These four techniques work synergistically: load balancing distributes work, caching reduces work, and compression makes the remaining work faster and cheaper. Now, it's your turn to put these compression concepts into action through hands-on exercises that will solidify your understanding and prepare you for real-world optimization challenges!
