Welcome to lesson four of "Extending NGINX with Lua and OpenResty"! You've already built a solid foundation: you learned to embed Lua code in NGINX, manipulate HTTP traffic dynamically, and even construct a complete REST API with JSON handling and CRUD operations. Now, we're ready to explore advanced OpenResty features that will take your capabilities to the next level. In this lesson, we'll work with shared memory dictionaries to implement rate limiting and caching. These are the building blocks of production-ready systems that need to manage traffic, optimize performance, and integrate with other APIs.
When we built our API in the previous lesson, we used a lua_shared_dict to store data. That worked well, but it’s worth highlighting why we chose it: regular global Lua variables live inside a single NGINX worker process, so other workers won’t see updates. If worker 1 increments a counter, worker 2 won’t see that change. For features like rate limiting—where we need to track requests across all workers—we need a shared, worker-safe storage layer.
Shared memory dictionaries solve this problem. They’re regions of memory that all worker processes can access simultaneously, making them perfect for:
- Tracking request counts across the entire server
- Storing cached data that all workers can read
- Maintaining application-wide state that survives individual requests
OpenResty provides the lua_shared_dict directive to create these shared spaces. The data persists as long as NGINX is running, and all operations on shared dictionaries are atomic, preventing race conditions when multiple workers access the same keys simultaneously.
Before we can use shared memory, we need to allocate it in our NGINX configuration. This happens at the HTTP level, outside any server block:
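Based on the names and sizes described below, the declarations look like this:

```nginx
http {
    # 10 MB shared zone for rate-limit counters, reachable as ngx.shared.rate_limit
    lua_shared_dict rate_limit 10m;

    # 10 MB shared zone for cached responses, reachable as ngx.shared.cache
    lua_shared_dict cache 10m;

    # ... server blocks follow ...
}
```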
Here's what these declarations accomplish:
- `lua_shared_dict rate_limit 10m`: Creates a shared dictionary named `rate_limit` with 10 megabytes of memory
- `lua_shared_dict cache 10m`: Creates another dictionary named `cache`, also with 10 megabytes
- These names become accessible in our Lua code via `ngx.shared.rate_limit` and `ngx.shared.cache`
- The size limit ensures we don't consume unlimited memory; when full, OpenResty will evict the least recently used entries
Each dictionary is essentially a key-value store where keys are strings and values can be strings, numbers, or booleans. We can also set expiration times on entries, making them perfect for temporary data like rate limit counters or cached responses.
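As a quick illustration of the API (the `/dict-demo` location and the `greeting` key here are hypothetical, purely to show the calls), a handler can store and read back a value like this:

```nginx
location /dict-demo {
    content_by_lua_block {
        local cache = ngx.shared.cache

        -- set(key, value, exptime): keep "hello" under "greeting" for 60 seconds
        local ok, err = cache:set("greeting", "hello", 60)
        if not ok then
            ngx.say("failed to store value: ", err)
            return
        end

        -- get(key) returns nil once the entry expires or is evicted
        ngx.say(cache:get("greeting") or "expired")
    }
}
```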
Rate limiting prevents clients from overwhelming our server by restricting how many requests they can make in a time window. We'll implement this using the access_by_lua_block phase, which runs before NGINX processes the request:
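Here is a sketch of the tracking step in that phase; the error handling details are illustrative, and the limit check itself is added in the next snippet:

```nginx
access_by_lua_block {
    -- one shared counter per client IP, visible to every worker
    local limit = ngx.shared.rate_limit
    local key = ngx.var.remote_addr

    -- incr(key, 1, 0, 5): add 1, initialize missing keys to 0,
    -- and give newly created keys a 5-second expiration
    local count, err = limit:incr(key, 1, 0, 5)
    if not count then
        -- e.g. the shared dictionary ran out of memory
        ngx.log(ngx.ERR, "rate limiting failed: ", err)
        return ngx.exit(500)
    end

    -- the limit check follows in the next snippet
}
```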
The logic here implements a simple fixed-window rate limiter:
- `ngx.shared.rate_limit`: Accesses our shared dictionary
- `ngx.var.remote_addr`: Gets the client's IP address as our tracking key
- `limit:incr(key, 1, 0, 5)`: Atomically increments the counter with these parameters:
  - `1`: Increment by 1
  - `0`: Initialize to 0 if the key doesn't exist
  - `5`: Set a 5-second expiration when creating the key
- The `incr` method handles both initialization and increment in one atomic operation
- If the operation fails (e.g., out of memory), we return a 500 error
- The 5-second expiration means the counter automatically resets after our time window
This approach is more efficient than checking existence first because it combines the get, set, and increment operations into a single atomic call.
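To see the difference, compare a hypothetical two-step version with the single call used above (this fragment reuses the `limit` and `key` variables from the previous snippet):

```lua
-- the long way: read, then write; two dictionary operations, and another
-- worker can update the key in between
local count = limit:get(key)
if not count then
    limit:set(key, 1, 5)
    count = 1
else
    count = limit:incr(key, 1)
end

-- the one-call way: atomic, and it applies the 5-second TTL to new keys
local count, err = limit:incr(key, 1, 0, 5)
```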
Once we've tracked the request count, we need to decide whether to allow or reject the request:
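Putting the whole endpoint together, here is a sketch of the `/rate-limited` location; the limit of 10 matches the walkthrough below, while the success message body is illustrative:

```nginx
location /rate-limited {
    access_by_lua_block {
        local limit = ngx.shared.rate_limit
        local key = ngx.var.remote_addr

        local count, err = limit:incr(key, 1, 0, 5)
        if not count then
            return ngx.exit(500)
        end

        -- advertise the limit on every response, rejected or not
        local max = 10
        ngx.header["X-RateLimit-Limit"] = max
        ngx.header["X-RateLimit-Remaining"] = math.max(0, max - count)

        -- over the limit: stop processing immediately
        if count > max then
            return ngx.exit(429)
        end
    }

    content_by_lua_block {
        -- only reached when the access phase allowed the request
        ngx.header["Content-Type"] = "application/json"
        ngx.say('{"message": "request allowed"}')
    }
}
```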
This code enforces the limit and communicates status to clients:
- We set rate limit headers for all requests, not just rejected ones
- `X-RateLimit-Limit`: Tells clients the maximum allowed requests
- `X-RateLimit-Remaining`: Shows how many requests they have left (never goes below 0)
- `math.max(0, max - count)`: Ensures the remaining count doesn't go negative
- If the count exceeds our maximum, we return 429 Too Many Requests
- `ngx.exit(429)`: Immediately stops processing and returns the response
- For allowed requests, processing continues to the `content_by_lua_block`
- The success response is cleanly separated in its own phase
These standard headers help clients implement proper backoff strategies and understand when they can retry.
Let's see how our rate limiter behaves when we make successive requests to /rate-limited. Notice how the behavior changes as the count climbs:
- The first request gets processed normally with 9 remaining requests
- Each subsequent request decrements the remaining count
- On the 11th request, we hit the limit and get rejected with a 429 status
- After 5 seconds, the counter expires and resets automatically
- Clients can use the `X-RateLimit-Remaining` header to pace their requests
This pattern protects your server from abuse while providing clear feedback to clients about their usage.
Caching is another powerful use of shared memory. Instead of regenerating expensive data for every request, we can store results and reuse them. Let's build an endpoint that caches its responses:
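Here is a sketch of the lookup half of that endpoint; the cache-key scheme and the `X-Cache` header follow the description below, and the rest of the handler appears in the next snippet:

```nginx
location /cached-data {
    content_by_lua_block {
        local cache = ngx.shared.cache

        -- build a cache key from the ?id= query parameter
        local id = ngx.var.arg_id or "default"
        local cache_key = "data:" .. id

        -- cache hit: serve straight from shared memory
        local cached = cache:get(cache_key)
        if cached then
            ngx.header["X-Cache"] = "HIT"
            ngx.header["Content-Type"] = "application/json"
            ngx.say(cached)
            return
        end

        -- the cache miss path is added in the next snippet
    }
}
```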
This cache lookup logic follows a standard pattern:
- `ngx.var.arg_id`: Accesses the `id` query parameter (e.g., `/cached-data?id=123`)
- We construct a cache key by prefixing "data:" to the ID, using "default" if no ID is provided
- `cache:get(cache_key)`: Attempts to retrieve cached data
- If found, we set the `X-Cache: HIT` header and immediately return the cached response
- This early return prevents us from regenerating the data
The cache hit path is extremely fast since we're just reading from shared memory rather than performing expensive operations.
When data isn't in the cache, we need to generate it and store it for future requests:
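Here is the endpoint with the miss path added; using `math.random` and `ngx.time` as the stand-in for expensive work is an assumption for illustration:

```nginx
location /cached-data {
    content_by_lua_block {
        local cjson = require "cjson"
        local cache = ngx.shared.cache

        local id = ngx.var.arg_id or "default"
        local cache_key = "data:" .. id

        -- serve from the cache when possible
        local cached = cache:get(cache_key)
        if cached then
            ngx.header["X-Cache"] = "HIT"
            ngx.header["Content-Type"] = "application/json"
            ngx.say(cached)
            return
        end

        -- cache miss: "generate" the data (a random value stands in
        -- for an expensive computation) and keep it for 30 seconds
        local data = cjson.encode({
            id = id,
            value = math.random(1000),
            timestamp = ngx.time(),
        })
        cache:set(cache_key, data, 30)

        ngx.header["X-Cache"] = "MISS"
        ngx.header["Content-Type"] = "application/json"
        ngx.say(data)
    }
}
```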
The cache miss handling completes our caching strategy:
- `X-Cache: MISS`: Indicates this response was freshly generated
- We create a JSON object with an ID, random value, and timestamp
- `cache:set(cache_key, data, 30)`: Stores the data with a 30-second expiration
- The third parameter (30) means this cached entry will automatically expire after 30 seconds
- We return the same data to the client, whether it was cached or freshly generated
This time-to-live (TTL) mechanism ensures cached data doesn't become stale. After 30 seconds, the next request will regenerate the data.
Let's see how the cache performs over multiple requests to the same resource, such as /cached-data?id=test. The behavior demonstrates several important concepts:
- The first request generates new data (notice the `MISS` header)
- Subsequent requests within 30 seconds return the exact same data from the cache
- The timestamp remains unchanged in cached responses
- After expiration, we get a fresh `MISS` with new random data
- Different IDs maintain separate cache entries
This pattern dramatically improves performance for expensive operations while ensuring data freshness through the TTL mechanism.
We've explored advanced OpenResty features that are essential for production systems! We learned how shared memory dictionaries enable data sharing across worker processes, implemented rate limiting with proper HTTP headers to protect against abuse, and built a caching layer with TTL support to optimize performance. These techniques transform NGINX from a simple reverse proxy into a powerful application platform capable of handling complex traffic management, performance optimization, and service integration.
The skills you've gained here — managing shared state and implementing traffic controls — are the foundation of scalable, resilient systems. Now, you're ready to put these concepts into practice and build your own advanced features!
