Introduction

Welcome back to Advanced NGINX Configuration and Monitoring! You've made fantastic progress through this course, mastering URL rewrites, custom error pages, and health checks. In the previous lesson, we configured NGINX to automatically detect backend failures and route traffic to healthy servers, ensuring high availability for our applications.

Now we're moving into a complementary but distinct area: real-time performance monitoring. While health checks tell us whether backends are functioning, they don't provide visibility into NGINX itself. How many active connections are being handled right now? How many total requests have been processed? Are there connections waiting for available workers? These questions are critical for capacity planning, troubleshooting performance issues, and understanding your server's behavior under load.

In this lesson, we'll explore NGINX's built-in stub_status module, which exposes essential performance metrics through a simple HTTP endpoint. We'll configure this endpoint with proper security restrictions, interpret the metrics it provides, and understand how to use this data for effective monitoring. By the end, you'll have a complete monitoring setup that complements your health-checking infrastructure.

Understanding Performance Metrics

Before diving into configuration, let's consider why dedicated performance monitoring matters even when we have health checks in place. Health checks verify that backends respond correctly to requests, but they don't reveal what's happening inside NGINX itself.

Performance metrics give us insight into several critical areas:

  • Connection handling: How many clients are actively connected? Are we approaching our worker connection limits?
  • Request throughput: How many total requests has the server processed since starting? What's the current request rate?
  • Resource utilization: Are connections stuck in reading or writing states, potentially indicating slow clients or network issues?

The stub_status module provides these metrics in a lightweight format that's easy to parse programmatically. Monitoring tools can query this endpoint regularly, track trends over time, and alert you when metrics exceed expected thresholds. This complements our health checks by monitoring NGINX's own health rather than just backend availability.

Creating the Basic Server Structure

Let's start building our configuration with the foundational elements:

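The original configuration is not shown here, so the following is a minimal sketch consistent with the description below; the port number and server_name are assumptions.

```nginx
server {
    listen 8080;
    server_name localhost;

    # Send responses as plain text, which suits status information
    default_type text/plain;

    # Root location returns "OK" so we can verify the server is running
    location / {
        return 200 "OK\n";
    }
}
```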
This establishes a minimal server that will serve as the foundation for our monitoring endpoint. The default_type text/plain directive sets the content type for responses, which works well for status information. The root location simply returns "OK" to verify the server is running, while we'll add our monitoring endpoint in the next section.

Enabling the stub_status Module

Now we'll add the core monitoring functionality with a dedicated location block:

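A sketch of the monitoring location, added inside the server block from the previous section:

```nginx
location /nginx_status {
    # Activates the built-in status module for this location
    # (NGINX 1.7.5+ accepts "stub_status;" with no argument;
    # older versions require "stub_status on;")
    stub_status;
}
```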
The stub_status directive activates NGINX's built-in status module at this location. When you access this endpoint, NGINX generates a real-time snapshot of its current state. This is remarkably lightweight; generating status information has negligible performance impact, making it safe to query frequently.

The location name /nginx_status is a convention, but you can choose any path. Some administrators prefer /status, /metrics, or even obscure paths like /internal_stats_52847 to reduce the chance of accidental discovery. However, as we'll see next, proper access restrictions are far more important than path obscurity.

Understanding the Status Output

Let's examine what the stub_status endpoint actually returns:

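A representative response (the numbers here are illustrative):

```
Active connections: 3
server accepts handled requests
 47 47 129
Reading: 0 Writing: 1 Waiting: 2
```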
Each line provides specific insights into NGINX's current state:

  • Active connections: The total number of currently open client connections, including those reading requests, writing responses, and waiting idle.
  • accepts, handled, requests: Cumulative counters for total accepted connections, successfully handled connections, and total client requests. Normally accepts equals handled; a difference indicates resource limits preventing some connections from being processed.
  • Reading, Writing, Waiting: Current connection states. Reading means NGINX is receiving the request header, Writing means it's sending the response, and Waiting represents idle keepalive connections awaiting new requests.

The waiting count is particularly useful: a high number suggests many idle connections consuming resources, while zero might indicate you're hitting connection limits under heavy load.

Restricting Access for Security

Currently, our status endpoint is publicly accessible, which creates a security risk. We need to restrict access to trusted sources:

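A sketch of the status location with these restrictions applied:

```nginx
location /nginx_status {
    stub_status;
    # Permit requests from localhost only
    allow 127.0.0.1;
    # Reject everything else with 403 Forbidden
    deny all;
}
```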
The access control directives work in order: allow 127.0.0.1 permits requests from localhost, while deny all blocks everything else. This means only processes running on the same server can access the status endpoint, preventing external parties from gathering intelligence about your server's capacity and traffic patterns.

For production systems with dedicated monitoring infrastructure, you might allow specific monitoring server IP addresses: allow 10.0.1.50; would permit your monitoring host to collect metrics remotely. Always prefer whitelisting specific IPs over exposing the endpoint broadly.

Disabling Access Logs

There's one more optimization we should apply to our status endpoint:

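The complete status location, with logging disabled for this endpoint:

```nginx
location /nginx_status {
    stub_status;
    # Skip access-log entries for frequent monitoring queries
    access_log off;
    allow 127.0.0.1;
    deny all;
}
```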
The access_log off directive prevents NGINX from recording each status request in its access logs. Since monitoring tools typically query this endpoint every few seconds, logging each request creates unnecessary disk I/O and log clutter. The status information itself is ephemeral and doesn't require an audit trail.

This setting only affects the /nginx_status location; regular traffic to other locations continues being logged normally. It's a small but meaningful optimization when status checks run continuously.

Integrating with Monitoring Tools

With our complete configuration in place, the status endpoint becomes valuable for integration with monitoring systems. Tools like Prometheus, Grafana, or custom monitoring scripts can periodically query /nginx_status, parse the metrics, and track them over time.

For example, tracking the request counter lets you calculate requests per second by measuring the difference between successive samples. Monitoring active connections reveals traffic patterns and helps with capacity planning. Sudden spikes in the reading or writing counts might indicate network issues or slow clients affecting performance.

The simplicity of the stub_status format makes it easy to parse with basic text processing: split on whitespace, extract the numeric values, and store them in your time-series database. Many monitoring solutions include native NGINX integrations that handle this automatically, but understanding the raw format helps when building custom monitoring solutions.
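As a sketch of that approach, the following Python snippet parses the stub_status format and derives requests per second from two successive samples. The function name and the sample strings are illustrative, not part of the lesson's original configuration.

```python
import re

def parse_stub_status(text):
    """Parse nginx stub_status output into a dict of integer metrics."""
    lines = text.strip().splitlines()
    metrics = {}
    # Line 1: "Active connections: N"
    metrics["active"] = int(lines[0].split(":")[1])
    # Line 3: three cumulative counters under "server accepts handled requests"
    accepts, handled, requests = (int(n) for n in lines[2].split())
    metrics.update(accepts=accepts, handled=handled, requests=requests)
    # Line 4: "Reading: A Writing: B Waiting: C"
    for key, value in re.findall(r"(\w+): (\d+)", lines[3]):
        metrics[key.lower()] = int(value)
    return metrics

# Two illustrative samples, taken 10 seconds apart
sample_1 = ("Active connections: 3\n"
            "server accepts handled requests\n"
            " 47 47 129\n"
            "Reading: 0 Writing: 1 Waiting: 2\n")
sample_2 = ("Active connections: 5\n"
            "server accepts handled requests\n"
            " 52 52 189\n"
            "Reading: 1 Writing: 1 Waiting: 3\n")

m1 = parse_stub_status(sample_1)
m2 = parse_stub_status(sample_2)
# Requests per second = difference in the cumulative counter over the interval
rps = (m2["requests"] - m1["requests"]) / 10
print(rps)  # 6.0
```

In practice a monitoring script would fetch the endpoint over HTTP on each cycle rather than use fixed strings, then push the parsed values into a time-series database.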

Conclusion and Next Steps

In this lesson, we've implemented a complete monitoring solution using NGINX's stub_status module. We learned how to expose real-time performance metrics through a dedicated endpoint, secure that endpoint with IP-based access controls, interpret the metrics that reveal connection handling and request throughput, and optimize the configuration by disabling access logs for status requests.

This monitoring capability complements the health-checking infrastructure we built in the previous lesson. Together, they provide comprehensive visibility: health checks monitor backend availability, while stub_status reveals NGINX's own performance characteristics. This dual approach is essential for maintaining reliable, observable web services in production environments.

Now it's your turn to implement these monitoring patterns; the practice exercises ahead will give you hands-on experience configuring status endpoints and interpreting the metrics they expose, preparing you to monitor real production systems with confidence!
