Welcome to lesson four of "Extending NGINX with Lua and OpenResty"! You've already built a solid foundation: you learned to embed Lua code in NGINX, manipulate HTTP traffic dynamically, and even construct a complete REST API with JSON handling and CRUD operations. Now, we're ready to explore advanced OpenResty features that will take your capabilities to the next level. In this lesson, we'll work with shared memory dictionaries to implement rate limiting and caching. These are the building blocks of production-ready systems that need to manage traffic, optimize performance, and integrate with other APIs.
When we built our API in the previous lesson, we used a lua_shared_dict to store data. That worked well, but it’s worth highlighting why we chose it: regular global Lua variables live inside a single NGINX worker process, so other workers won’t see updates. If worker 1 increments a counter, worker 2 won’t see that change. For features like rate limiting—where we need to track requests across all workers—we need a shared, worker-safe storage layer.
Shared memory dictionaries solve this problem. They’re regions of memory that all worker processes can access simultaneously, making them perfect for:
- Tracking request counts across the entire server
- Storing cached data that all workers can read
- Maintaining application-wide state that survives individual requests
OpenResty provides the lua_shared_dict directive to create these shared spaces. The data persists as long as NGINX is running, and all operations on shared dictionaries are atomic, preventing race conditions when multiple workers access the same keys simultaneously.
Before we can use shared memory, we need to allocate it in our NGINX configuration. This happens at the HTTP level, outside any server block:
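Based on the names and sizes described below, the declarations look like this:

```nginx
http {
    # 10 MB shared zone for rate-limit counters, reachable as ngx.shared.rate_limit
    lua_shared_dict rate_limit 10m;

    # 10 MB shared zone for cached responses, reachable as ngx.shared.cache
    lua_shared_dict cache 10m;

    # ... server blocks follow ...
}
```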
Here's what these declarations accomplish:
- `lua_shared_dict rate_limit 10m`: Creates a shared dictionary named `rate_limit` with 10 megabytes of memory
- `lua_shared_dict cache 10m`: Creates another dictionary named `cache`, also with 10 megabytes
- These names become accessible in our Lua code via `ngx.shared.rate_limit` and `ngx.shared.cache`
- The size limit ensures we don't consume unlimited memory; when full, OpenResty will evict the least recently used entries
Each dictionary is essentially a key-value store where keys are strings and values can be strings, numbers, or booleans. We can also set expiration times on entries, making them perfect for temporary data like rate limit counters or cached responses.
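As a quick illustration of the API (the `/dict-demo` location and the `greeting` key here are hypothetical, purely to show the calls), a handler can store and read back a value like this:

```nginx
location /dict-demo {
    content_by_lua_block {
        local cache = ngx.shared.cache

        -- set(key, value, exptime): keep "hello" under "greeting" for 60 seconds
        local ok, err = cache:set("greeting", "hello", 60)
        if not ok then
            ngx.say("failed to store value: ", err)
            return
        end

        -- get(key) returns nil once the entry expires or is evicted
        ngx.say(cache:get("greeting") or "expired")
    }
}
```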
Rate limiting prevents clients from overwhelming our server by restricting how many requests they can make in a time window. We'll implement this using the access_by_lua_block phase, which runs before NGINX processes the request:
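Here is a sketch of the tracking step in that phase; the error handling details are illustrative, and the limit check itself is added in the next snippet:

```nginx
access_by_lua_block {
    -- one shared counter per client IP, visible to every worker
    local limit = ngx.shared.rate_limit
    local key = ngx.var.remote_addr

    -- incr(key, 1, 0, 5): add 1, initialize missing keys to 0,
    -- and give newly created keys a 5-second expiration
    local count, err = limit:incr(key, 1, 0, 5)
    if not count then
        -- e.g. the shared dictionary ran out of memory
        ngx.log(ngx.ERR, "rate limiting failed: ", err)
        return ngx.exit(500)
    end

    -- the limit check follows in the next snippet
}
```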
The logic here implements a simple fixed-window rate limiter:
- `ngx.shared.rate_limit`: Accesses our shared dictionary
- `ngx.var.remote_addr`: Gets the client's IP address as our tracking key
- `limit:incr(key, 1, 0, 5)`: Atomically increments the counter with these parameters:
  - `1`: Increment by 1
  - `0`: Initialize to 0 if the key doesn't exist
  - `5`: Set a 5-second expiration when creating the key
- The `incr` method handles both initialization and increment in one atomic operation
- If the operation fails (e.g., out of memory), we return a 500 error
- The 5-second expiration means the counter automatically resets after our time window
This approach is more efficient than checking existence first because it combines the get, set, and increment operations into a single atomic call.
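To see the difference, compare a hypothetical two-step version with the single call used above (this fragment reuses the `limit` and `key` variables from the previous snippet):

```lua
-- the long way: read, then write; two dictionary operations, and another
-- worker can update the key in between
local count = limit:get(key)
if not count then
    limit:set(key, 1, 5)
    count = 1
else
    count = limit:incr(key, 1)
end

-- the one-call way: atomic, and it applies the 5-second TTL to new keys
local count, err = limit:incr(key, 1, 0, 5)
```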
Once we've tracked the request count, we need to decide whether to allow or reject the request:
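Putting the whole endpoint together, here is a sketch of the `/rate-limited` location; the limit of 10 matches the walkthrough below, while the success message body is illustrative:

```nginx
location /rate-limited {
    access_by_lua_block {
        local limit = ngx.shared.rate_limit
        local key = ngx.var.remote_addr

        local count, err = limit:incr(key, 1, 0, 5)
        if not count then
            return ngx.exit(500)
        end

        -- advertise the limit on every response, rejected or not
        local max = 10
        ngx.header["X-RateLimit-Limit"] = max
        ngx.header["X-RateLimit-Remaining"] = math.max(0, max - count)

        -- over the limit: stop processing immediately
        if count > max then
            return ngx.exit(429)
        end
    }

    content_by_lua_block {
        -- only reached when the access phase allowed the request
        ngx.header["Content-Type"] = "application/json"
        ngx.say('{"message": "request allowed"}')
    }
}
```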
This code enforces the limit and communicates status to clients:
- We set rate limit headers for all requests, not just rejected ones
- `X-RateLimit-Limit`: Tells clients the maximum allowed requests
- `X-RateLimit-Remaining`: Shows how many requests they have left (never goes below 0)
- `math.max(0, max - count)`: Ensures the remaining count doesn't go negative
- If the count exceeds our maximum, we return 429 Too Many Requests
- `ngx.exit(429)`: Immediately stops processing and returns the response
- For allowed requests, processing continues to the `content_by_lua_block`
- The success response is cleanly separated in its own phase
These standard headers help clients implement proper backoff strategies and understand when they can retry.
Let's see how our rate limiter behaves when we make successive requests to /rate-limited. Notice how the behavior changes as the count climbs:
- The first request gets processed normally with 9 remaining requests
- Each subsequent request decrements the remaining count
- On the 11th request, we hit the limit and get rejected with a 429 status
- After 5 seconds, the counter expires and resets automatically
- Clients can use the `X-RateLimit-Remaining` header to pace their requests
This pattern protects your server from abuse while providing clear feedback to clients about their usage.
Caching is another powerful use of shared memory. Instead of regenerating expensive data for every request, we can store results and reuse them. Let's build an endpoint that caches its responses:
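Here is a sketch of the lookup half of that endpoint; the cache-key scheme and the `X-Cache` header follow the description below, and the rest of the handler appears in the next snippet:

```nginx
location /cached-data {
    content_by_lua_block {
        local cache = ngx.shared.cache

        -- build a cache key from the ?id= query parameter
        local id = ngx.var.arg_id or "default"
        local cache_key = "data:" .. id

        -- cache hit: serve straight from shared memory
        local cached = cache:get(cache_key)
        if cached then
            ngx.header["X-Cache"] = "HIT"
            ngx.header["Content-Type"] = "application/json"
            ngx.say(cached)
            return
        end

        -- the cache miss path is added in the next snippet
    }
}
```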
This cache lookup logic follows a standard pattern:
- `ngx.var.arg_id`: Accesses the `id` query parameter (e.g., `/cached-data?id=123`)
- We construct a cache key by prefixing "data:" to the ID, using "default" if no ID is provided
- `cache:get(cache_key)`: Attempts to retrieve cached data
- If found, we set the `X-Cache: HIT` header and immediately return the cached response
- This early return prevents us from regenerating the data
The cache hit path is extremely fast since we're just reading from shared memory rather than performing expensive operations.
When data isn't in the cache, we need to generate it and store it for future requests:
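Here is the endpoint with the miss path added; using `math.random` and `ngx.time` as the stand-in for expensive work is an assumption for illustration:

```nginx
location /cached-data {
    content_by_lua_block {
        local cjson = require "cjson"
        local cache = ngx.shared.cache

        local id = ngx.var.arg_id or "default"
        local cache_key = "data:" .. id

        -- serve from the cache when possible
        local cached = cache:get(cache_key)
        if cached then
            ngx.header["X-Cache"] = "HIT"
            ngx.header["Content-Type"] = "application/json"
            ngx.say(cached)
            return
        end

        -- cache miss: "generate" the data (a random value stands in
        -- for an expensive computation) and keep it for 30 seconds
        local data = cjson.encode({
            id = id,
            value = math.random(1000),
            timestamp = ngx.time(),
        })
        cache:set(cache_key, data, 30)

        ngx.header["X-Cache"] = "MISS"
        ngx.header["Content-Type"] = "application/json"
        ngx.say(data)
    }
}
```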
The cache miss handling completes our caching strategy:
- `X-Cache: MISS`: Indicates this response was freshly generated
- We create a JSON object with an ID, random value, and timestamp
- `cache:set(cache_key, data, 30)`: Stores the data with a 30-second expiration
- The third parameter (30) means this cached entry will automatically expire after 30 seconds
- We return the same data to the client, whether it was cached or freshly generated
This time-to-live (TTL) mechanism ensures cached data doesn't become stale. After 30 seconds, the next request will regenerate the data.
Let's see how the cache performs over multiple requests to the same resource, such as /cached-data?id=test. The behavior demonstrates several important concepts:
- The first request generates new data (notice the `MISS` header)
- Subsequent requests within 30 seconds return the exact same data from the cache
- The timestamp remains unchanged in cached responses
- After expiration, we get a fresh `MISS` with new random data
- Different IDs maintain separate cache entries
This pattern dramatically improves performance for expensive operations while ensuring data freshness through the TTL mechanism.
We've explored advanced OpenResty features that are essential for production systems! We learned how shared memory dictionaries enable data sharing across worker processes, implemented rate limiting with proper HTTP headers to protect against abuse, and built a caching layer with TTL support to optimize performance. These techniques transform NGINX from a simple reverse proxy into a powerful application platform capable of handling complex traffic management, performance optimization, and service integration.
The skills you've gained here — managing shared state and implementing traffic controls — are the foundation of scalable, resilient systems. Now, you're ready to put these concepts into practice and build your own advanced features!
