Introduction

Welcome to the Load Balancing and Performance Tuning course! We're excited to have you here as we begin our journey into one of the most critical aspects of modern web architecture. In this first lesson, we'll build the foundation of load balancing by implementing a basic configuration that distributes incoming requests across multiple backend servers. By the end of the lesson, you'll know how to set up NGINX as a load balancer that spreads traffic evenly across several backends, a skill that is essential for building scalable, resilient web applications that stay responsive as traffic grows.

Understanding Load Balancing Concepts

Before we dive into the configuration, let's build some intuition about what load balancing actually does. Imagine running a popular web application that receives thousands of requests per second. If all these requests hit a single server, that server would quickly become overwhelmed, leading to slow response times or even crashes. Load balancing solves this problem by distributing incoming requests across multiple servers, each capable of handling a portion of the total traffic.

The key benefits of load balancing include:

  • Improved performance: No single server bears the full burden of all requests.
  • Higher availability: If one server fails, others can continue serving requests.
  • Easier scaling: Adding more servers to handle increased load becomes straightforward.

In NGINX, we accomplish this by defining an upstream group, which is a collection of backend servers, and then configuring NGINX to proxy incoming requests to this group.

The Upstream Block

NGINX uses a special configuration block called upstream to define groups of backend servers. This block sits within the http context and gives us a way to reference multiple servers as a single entity. Think of an upstream as a logical grouping that NGINX can use to distribute requests.

The basic structure looks like this:
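
    upstream backend {
        server 127.0.0.1:5000;   # backend instance on port 5000
        server 127.0.0.1:5001;   # backend instance on port 5001
        server 127.0.0.1:5002;   # backend instance on port 5002
    }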

Here, we've created an upstream group named backend containing three servers. Each server is identified by its IP address and port. In this case, all three servers are running on the same machine (localhost) but on different ports, which is common for development and testing scenarios. In production environments, these would typically be different physical or virtual machines with their own IP addresses.

Configuring the Load Balancer Server

Now that we've defined our upstream group, we need to configure NGINX to listen for incoming requests and forward them to our backend servers. We do this with a server block:
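
    server {
        listen 3000;             # port the load balancer accepts traffic on
        server_name localhost;   # respond to requests addressed to localhost

        location / {
            # Proxy directives connecting requests to the upstream go here
        }
    }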

This configuration tells NGINX to:

  • Listen on port 3000 for incoming HTTP requests.
  • Respond to requests directed at localhost.
  • Handle all paths (the / location matches everything).

The location block is where we'll add the proxy configuration that connects incoming requests to our upstream backend servers.

Proxying Requests to the Upstream

The magic of load balancing happens with the proxy_pass directive. This tells NGINX to forward incoming requests to our upstream group:
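
    location / {
        proxy_pass http://backend;   # forward the request to the "backend" upstream group
    }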

Notice that we reference our upstream group by name: http://backend. The http:// prefix is required even though we're referencing an upstream group, not a direct URL. When NGINX receives a request, it will select one of the servers from the backend upstream group and forward the request to it. By default, NGINX uses a round-robin algorithm, meaning it cycles through the servers in order: the first request goes to the first server, the second request to the second server, and so on.

Setting Proxy Headers

When NGINX acts as a reverse proxy, it's important to preserve information about the original client request. Without proper headers, the backend servers won't know the true source of the requests. We configure these headers within our location block:
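
    location / {
        proxy_pass http://backend;

        # Pass details about the original client request to the backend
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }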

Each directive serves a specific purpose:

  • Host: Preserves the original host header from the client's request.
  • X-Real-IP: Contains the client's actual IP address.
  • X-Forwarded-For: Maintains a chain of proxy servers the request passed through.
  • X-Forwarded-Proto: Indicates whether the original request used HTTP or HTTPS.

These headers ensure that backend servers have complete information about the original request, which is crucial for logging, security checks, and application logic.

Tracking Backend Responses

To verify that our load balancing is working correctly, we can add a custom response header that shows which backend server handled each request:
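
    # Tag every response with the backend that produced it
    add_header X-Upstream-Addr $upstream_addr always;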

The $upstream_addr variable contains the address and port of the backend server that processed the request. The always parameter ensures this header is added to all responses, regardless of status code. This is particularly useful during testing and debugging, as we can make multiple requests and observe which backend server responded to each one.

How Round-Robin Distribution Works

With our configuration complete, NGINX will distribute requests using the round-robin algorithm. This means that if we send six requests to our load balancer, the distribution would look like this:

  • Request 1: Goes to 127.0.0.1:5000
  • Request 2: Goes to 127.0.0.1:5001
  • Request 3: Goes to 127.0.0.1:5002
  • Request 4: Goes to 127.0.0.1:5000
  • Request 5: Goes to 127.0.0.1:5001
  • Request 6: Goes to 127.0.0.1:5002

Round-robin is the simplest load balancing method and works well when all backend servers have similar capabilities. Each server receives an equal number of requests over time, ensuring that no single server becomes overloaded while others sit idle. The X-Upstream-Addr header in each response will confirm this distribution pattern.

The Complete Configuration

Let's look at how all these pieces fit together. Assuming typical values for the worker settings, the complete configuration looks like this:
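
    worker_processes 1;   # a single worker process is plenty for this local setup

    events {
        worker_connections 1024;   # typical default connection limit per worker
    }

    http {
        # Backend servers that share the incoming traffic
        upstream backend {
            server 127.0.0.1:5000;
            server 127.0.0.1:5001;
            server 127.0.0.1:5002;
        }

        # The load balancer itself
        server {
            listen 3000;
            server_name localhost;

            location / {
                proxy_pass http://backend;

                # Preserve information about the original client request
                proxy_set_header Host $host;
                proxy_set_header X-Real-IP $remote_addr;
                proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
                proxy_set_header X-Forwarded-Proto $scheme;

                # Expose which backend handled the request
                add_header X-Upstream-Addr $upstream_addr always;
            }
        }
    }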

This configuration creates a fully functional load balancer that listens on port 3000 and distributes requests across three backend servers. The worker_processes and events blocks handle NGINX's internal operations, while the http block contains our load balancing logic.

Conclusion and Next Steps

Congratulations! You've just learned how to implement basic load balancing with NGINX. We covered the essential components: defining an upstream group with multiple backend servers, configuring a proxy server to distribute requests, setting proper headers to preserve client information, and adding a response header to track which backend handled each request. This foundation will serve you well as we explore more advanced load balancing techniques in future lessons. Now it's time to roll up your sleeves and put this knowledge into practice with hands-on exercises that will solidify your understanding!
