Improving Node.js App Performance with Redis Caching
Most applications rely heavily on data sourced from databases or APIs. Accessing this data requires network requests, which increase response latency and can lead to rate-limiting issues.
Caching addresses these challenges by storing frequently accessed data in a temporary location to allow for faster retrieval.
It minimizes the need for repeated network calls or database queries, which results in improved application performance, reduced latency, and lower API/network costs.
In this article, we'll examine how to implement caching in a Node.js application using Redis, a popular in-memory database often employed as a distributed cache.
Along the way, you'll learn how to choose the right caching strategy, achieve a high cache hit rate, and maintain consistency between your cache and the underlying data sources.
Let's get started!
Prerequisites
- Prior Node.js development experience.
- A recent version of Node.js and npm installed on your computer.
- Docker and SQLite installed.
Setting up a local Redis server
Before integrating Redis with your Node.js application, you need to set up a Redis server. While various installation options are available on the Redis website, read on to learn how to set it up using Docker.
To create a Redis container from the official redis image, execute the command below. This will run the container in "detached" mode and map port 6379 (the default Redis port) on your host machine to the container:
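A typical invocation looks like this (the container name `redis` is an arbitrary choice):

```shell
docker run -d --name redis -p 6379:6379 redis
```

The `-d` flag runs the container in detached mode, and `-p 6379:6379` maps the default Redis port to your host.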
If the redis image isn't already present on your system, Docker will
automatically download it from Docker Hub. Once the container is created, the
command will return the container ID:
To confirm the Redis container is running, use:
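The standard Docker command for listing running containers works here:

```shell
docker ps
```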
The output should display the container's details, including the status and mapped ports:
To interact with the container, open a shell session by running:
Within the shell, you can access the Redis CLI by entering:
To confirm the Redis server is working correctly, use the ping command in the
Redis CLI:
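Putting these steps together (assuming the container is named `redis`), the session looks like this:

```
# Open a shell session inside the container
docker exec -it redis sh

# Inside the container, start the Redis CLI
redis-cli

# At the redis-cli prompt, verify the server responds
127.0.0.1:6379> ping
PONG
```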
You should receive a PONG output, which confirms that Redis is working
normally.
In the next section, I'll demonstrate how to connect your Node.js application to the Redis server.
Connecting to Redis from your Node.js app
With your Redis server running, the next step is to connect it to your Node.js application. To get started, you can clone a pre-configured Express app from this repository:
Navigate to the cloned directory and install the required dependencies, including Express and dotenv:
Rename the .env.example file to .env:
This file contains the following environment variables:
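The exact contents come from the repo's `.env.example`; a plausible version looks like this (the SQLite variable name is illustrative):

```
PORT=3000
REDIS_URI=redis://localhost:6379
SQLITE_DB_LOCATION=./data.db
```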
These specify the server's listening port, the Redis connection URI, and the
SQLite database file. If your Redis server has different credentials or runs on
another host, modify the REDIS_URI using this format:
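The general shape of a Redis connection string is:

```
redis://[username:password@]host[:port][/db-number]
```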
Run the server using the following command:
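The repo's exact script name isn't shown here; assuming it defines a nodemon-based `dev` script, the command would be:

```shell
npm run dev
```

Alternatively, `npx nodemon server.js` achieves the same thing.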
This uses nodemon to automatically restart the server whenever changes are made to any of the imported files. You'll see:
You're now ready to connect your application to a Redis server.
The first step is choosing between the officially maintained node-redis package and the third-party ioredis package.
For this tutorial, we'll stick with the node-redis package which is already
installed.
Go ahead and create a redis.js file to handle the Redis connection:
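A minimal sketch of such a file using the node-redis v4 API (the function and variable names are my assumptions and may differ from the original repo):

```javascript
// redis.js
const { createClient } = require("redis");

let redisClient;

async function initializeRedisClient() {
  // Create the client from the connection string in .env
  redisClient = createClient({ url: process.env.REDIS_URI });

  redisClient.on("error", (err) => {
    console.error("Redis client error:", err);
  });

  await redisClient.connect();

  // Verify the connection with a PING command and log the result
  const result = await redisClient.ping();
  console.log(`Redis connection established: ${result}`); // "PONG"

  return redisClient;
}

// Accessor so other modules can reuse the same client instance
function getRedisClient() {
  return redisClient;
}

module.exports = { initializeRedisClient, getRedisClient };
```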
This script initializes the Redis client using the REDIS_URI specified in the
.env file. The initializeRedisClient() function connects the client,
performs a PING command to verify the connection, and logs the result.
You can now update your server.js file to include Redis initialization:
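One way to wire this in, assuming an Express `server.js` along the lines of the starter repo:

```javascript
// server.js (excerpt)
require("dotenv").config();
const express = require("express");
const { initializeRedisClient } = require("./redis");

const app = express();
app.use(express.json()); // parse JSON request bodies for later routes

async function startServer() {
  // Connect to Redis before accepting traffic
  await initializeRedisClient();

  app.listen(process.env.PORT, () => {
    console.log(`Server listening on port ${process.env.PORT}`);
  });
}

startServer();
```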
When you restart the server, you should see confirmation that the Redis connection was successfully established:
Now that you've successfully connected to Redis, let's look at some common scenarios where caching comes in handy in typical web application development.
Scenario 1 — Caching API responses
Imagine you're building a web application that provides real-time Bitcoin-to-currency conversions. To fetch the latest exchange rates, your application integrates with an external API, such as the CoinGecko API. According to their documentation, exchange rates are updated every five minutes:
Given this update frequency, requesting the exchange rate multiple times within five minutes is redundant. Such requests could slow down your endpoints, increase costs, and potentially lead to rate-limiting by the API provider.
To optimize your usage, let's use Redis as a caching layer. By storing the fetched exchange rates in Redis with a 5-minute expiration time, subsequent requests within this window retrieve data directly from the cache, thus avoiding additional API calls.
Here's how to set it up:
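A sketch of the route, assuming a `getRedisClient()` helper exported from `redis.js` and CoinGecko's public `/exchange_rates` endpoint (the route and key names are illustrative):

```javascript
// server.js (excerpt) — cache-aside with a 5-minute TTL
const { getRedisClient } = require("./redis"); // assumed helper

const CACHE_KEY = "btc:exchange-rates";
const CACHE_TTL_SECONDS = 5 * 60;

app.get("/btc-exchange-rate", async (req, res) => {
  const redis = getRedisClient();

  // 1. Check the cache first
  const cached = await redis.get(CACHE_KEY);
  if (cached) {
    console.log("cache hit");
    return res.json(JSON.parse(cached));
  }

  // 2. Cache miss: fetch fresh data from the external API
  console.log("cache miss");
  const response = await fetch("https://api.coingecko.com/api/v3/exchange_rates");
  const data = await response.json();

  // 3. Store the result with an expiry so stale rates are evicted automatically
  await redis.set(CACHE_KEY, JSON.stringify(data), { EX: CACHE_TTL_SECONDS });

  res.json(data);
});
```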
Before issuing a request to the API, the server checks if the data exists in Redis. If so, it is parsed and returned as the response. Otherwise, it is fetched from the API and cached in Redis with the specified expiry time for reuse in future requests.
After five minutes, Redis will remove the stale data automatically. Future requests will then fetch fresh data and repopulate the cache.
You can try it out by using a tool like Postman, HTTPie, or curl to make a
request. On the first try, the data will come directly from the external API:
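For example, with curl (assuming the server listens on port 3000):

```shell
curl http://localhost:3000/btc-exchange-rate
```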
You'll also see that a "cache miss" is logged in the server console:
This initial request took about 875ms with my internet connection:
Subsequent requests within the five-minute window will be served from the cache instead:
In my testing, I saw a 175x improvement in response time when using the cache:
If your application can tolerate potentially stale data for longer, you only need to update the expiration time to your desired value.
This approach isn't limited to API responses. You can also use it for database queries whose results can be reused by other requests.
Scenario 2 — Caching server responses
Caching server responses is another effective way to improve application performance, especially for routes where the response can be reused across multiple requests without changes. This approach can be implemented as an Express middleware that integrates seamlessly into your application.
Add the following redisCachingMiddleware() function to your server.js file:
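A sketch of such a middleware, matching the behavior described below (the `getRedisClient()` helper and default expiry are assumptions):

```javascript
const { getRedisClient } = require("./redis"); // assumed helper

function redisCachingMiddleware(opts = { EX: 300 }) {
  return async (req, res, next) => {
    const redis = getRedisClient();
    const cacheKey = req.originalUrl;

    // Serve from the cache when possible, skipping the route handler
    const cached = await redis.get(cacheKey);
    if (cached) {
      console.log(`cache hit for ${cacheKey}`);
      return res.send(JSON.parse(cached));
    }

    console.log(`cache miss for ${cacheKey}`);

    // Intercept res.send() so the response body is cached before it goes out
    const originalSend = res.send.bind(res);
    res.send = (body) => {
      // Only cache successful responses
      if (res.statusCode === 200) {
        redis
          .set(cacheKey, JSON.stringify(body), opts)
          .catch((err) => console.error("Failed to cache response:", err));
      }
      res.send = originalSend; // restore the original method
      return originalSend(body);
    };

    next();
  };
}
```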
This redisCachingMiddleware() function takes an optional opts object with a
default expiry time to customize caching behavior. It then returns an Express
middleware that does the same thing you did in the previous section with a few
modifications.
This time, the cacheKey is based on the request URL and if this key exists in
Redis, the cached data is retrieved, parsed, and sent as the response without
calling the route handler.
If the data is not cached, the middleware overrides the res.send() function to
intercept and cache the response before sending it to the client. The original
res.send() is restored afterward.
To use this middleware, apply it to the routes where response caching is
desired. Here's how your /btc-exchange-rate/ endpoint would look now:
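With the middleware in place, the handler only needs to fetch the data (a sketch, reusing the assumed names from above):

```javascript
app.get("/btc-exchange-rate", redisCachingMiddleware(), async (req, res) => {
  // The middleware handles all caching; the handler just fetches
  const response = await fetch("https://api.coingecko.com/api/v3/exchange_rates");
  res.send(await response.json());
});
```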
This setup eliminates the need for caching logic in the route handler itself as the middleware now handles all caching-related tasks.
You can also easily override the default cache expiry time by passing an opts
object:
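For instance, to cache for one hour instead of the default five minutes (`handler` stands in for your route handler):

```javascript
app.get("/btc-exchange-rate", redisCachingMiddleware({ EX: 3600 }), handler);
```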
With this setup in place, you'll observe the same behavior as in the previous section, but with slightly different log messages:
Crafting effective cache keys
When creating cache keys for cached data, you need to design them properly to ensure high cache hit rates and efficient retrieval.
A common practice is to include a prefix in cache keys to group related values into namespaces:
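For example (the key names here are illustrative):

```
users:123          -> profile data for user 123
sessions:a1b2c3    -> session data
btc:exchange-rates -> cached API response
```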
This makes it easy to retrieve or invalidate all the values under a given prefix with one command. It also prevents collisions when a single cache server is shared by multiple applications.
For caching server responses, ensure that the cache key accounts for parameters and headers that affect the response. For example, the following requests should generate the same key, even though the parameters are ordered differently:
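For example, these two requests differ only in parameter order and should map to a single cache entry (the parameter names are illustrative):

```
GET /btc-exchange-rate?base=usd&target=btc
GET /btc-exchange-rate?target=btc&base=usd
```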
This avoids unnecessary duplication and ensures a high cache hit rate.
With this in mind, you can write a function that generates cache keys based on everything that affects the response received from a handler:
This function extracts the query parameters from the request and uses the object-hash package (already installed) to generate an order-insensitive hash. The request path is retained in the generated key to make it easy to identify where the cached value was generated from.
You can incorporate this key generation logic into the
redisCachingMiddleware() from the previous section:
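The change is a one-liner inside the middleware, assuming a `generateCacheKey()` function like the one described above:

```javascript
// Inside redisCachingMiddleware(), replace the URL-based key:
const cacheKey = generateCacheKey(req);
```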
With this setup, you'll generate cache keys like the following:
If the response depends on the request body or specific headers, be sure to include them in the data object for hashing:
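A sketch of the extended function (assuming the object-hash package mentioned above; the header choice is illustrative):

```javascript
const hash = require("object-hash");

function generateCacheKey(req) {
  // Include everything that can change the response body
  const dataToHash = {
    query: req.query,
    body: req.body,
    acceptLanguage: req.headers["accept-language"],
  };
  return `${req.path}@${hash.sha1(dataToHash)}`;
}
```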
This ensures the key reflects all factors influencing the response, preventing incorrect cache hits or misses.
Let's talk about caching strategies next.
Caching strategies explained
A caching strategy defines how cached data is stored, retrieved, and maintained, ensuring efficiency and optimal performance. Strategies can be proactive (pre-populating the cache) or reactive (populating the cache on demand).
The correct choice depends on the nature of the data and its usage patterns. For instance, a real-time stock ticker application requires a different approach than one displaying static historical weather data.
When choosing a caching strategy, you should consider:
- The type of data being cached: Is it static, real-time, or periodically updated?
- Data access patterns: How frequently is the data read, written, or updated?
- Eviction policies: How will outdated or unused data be removed?
In this section, we'll take a look at some of the most common caching patterns to be aware of.
1. Cache-aside
Cache-aside, also known as lazy loading, is a popular caching strategy where data is fetched from the cache if available (cache hit). If not (cache miss), the application retrieves it from the database and stores it in the cache for future use. It's the strategy used in the examples discussed earlier in this guide.
This approach ensures the cache holds only relevant data, making it cost-effective and easy to implement. However, the initial request for uncached data can be slower due to the extra step of fetching from the database.
2. Write-behind (Write-back)
In the write-behind pattern, data modifications are written to the cache first and then asynchronously to the database. This approach speeds up write operations and reduces database load but introduces a risk of data loss if the cache fails before the data is persisted to the database.
It is suitable for write-intensive applications where data consistency isn't absolutely critical, such as logging user activities or collecting analytics.
3. Write-through
The write-through pattern ensures data consistency by writing any changes to both the cache and the database simultaneously.
It eliminates the risk of data loss associated with write-behind but increases the latency of write operations due to the extra step of updating the cache.
The write-through pattern is almost always paired with cache-aside. If a cache miss occurs, the data is loaded from the data store and used to update the cache accordingly.
4. Read-through
In read-through caching, data is always read from the cache. If the requested data isn't present, it's fetched from the database, stored in the cache, and then returned to the application.
This approach simplifies data access by providing a single access point for both cached and uncached data. It's suitable for read-heavy applications with infrequent data updates, where consistent read performance is crucial.
To wrap up this article, let's illustrate how the write-through and cache-aside patterns work together. Combining both approaches with an appropriate expiration for the cached data offers both consistency and efficiency for many applications.
Scenario 3 — Keeping your cache in sync with reality
Let's say your application lets its users customize their profiles and set preferences. Since this data is frequently accessed, storing it in the cache makes sense.
By combining the write-through and cache-aside patterns, updates are immediately written to both the database and the cache to minimize the chance of serving stale data.
Here's how to modify your server.js file to implement this:
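A sketch of the two routes, matching the description below. The `getRedisClient()` helper and the `db.getUserById()`/`db.updateUserBio()` SQLite helpers are assumptions standing in for whatever the repo provides; note that the PUT route requires the `express.json()` body parser:

```javascript
// server.js (excerpt) — cache-aside reads plus write-through updates
const { getRedisClient } = require("./redis"); // assumed helper
const USER_CACHE_TTL = 60 * 60; // 1 hour

// Returns the profile and a flag indicating whether it came from the cache
async function getUserProfile(id) {
  const redis = getRedisClient();
  const cached = await redis.get(`users:${id}`);
  if (cached) {
    return { profile: JSON.parse(cached), cacheHit: true };
  }

  // db.getUserById() stands in for the repo's SQLite helper
  const profile = await db.getUserById(id);
  return { profile, cacheHit: false };
}

app.get("/users/:id", async (req, res) => {
  const { profile, cacheHit } = await getUserProfile(req.params.id);
  console.log(cacheHit ? "cache hit" : "cache miss");

  if (!cacheHit) {
    // Populate the cache for subsequent reads (cache-aside)
    await getRedisClient().set(`users:${req.params.id}`, JSON.stringify(profile), {
      EX: USER_CACHE_TTL,
    });
  }

  res.json(profile);
});

app.put("/users/:id/bio", async (req, res) => {
  const { profile } = await getUserProfile(req.params.id);
  const updated = { ...profile, bio: req.body.bio };

  // Write-through: update the database, then immediately refresh the cache
  await db.updateUserBio(req.params.id, req.body.bio);
  await getRedisClient().set(`users:${req.params.id}`, JSON.stringify(updated), {
    EX: USER_CACHE_TTL,
  });

  res.json(updated);
});
```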
This code defines a getUserProfile() function that implements the cache-aside
pattern once again to retrieve the specified user from the cache or the
database.
In the GET /users/:id route, getUserProfile() gets the user profile and the
cache hit flag. If it's a cache miss, the code stores the fetched profile in the
Redis cache.
The PUT /users/:id/bio route also uses getUserProfile() to fetch the user
profile before updating the bio. After updating the database, the code
immediately updates the cache with the modified user profile (write-through
caching) to maintain consistency between the cache and the database.
This way, the next request to GET /users/:id will return the updated profile,
and not the outdated information that was present in the cache prior to the
update.
You can try it out by retrieving the user with an ID of 1:
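For example, with curl (assuming the server listens on port 3000):

```shell
curl http://localhost:3000/users/1
```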
You'll get the following output:
And you'll see a cache miss in the server console:
The data should now be present in the cache, so if you repeat the request before its expiration time, you'll get the same response and see a "cache hit" message instead:
Let's say the user decides to update their bio with the following request:
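A request along these lines would do it (the bio text is just an example):

```shell
curl -X PUT http://localhost:3000/users/1/bio \
  -H "Content-Type: application/json" \
  -d '{"bio": "Exploring Redis caching patterns"}'
```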
You'll get the following response now:
And when you repeat the GET request, the data you'll get back is the updated one:
It will be accompanied by a cache hit message in the server console to show that the data was indeed loaded from the cache:
Final thoughts
In this guide, we've explored practical examples of caching patterns using Redis in a Node.js environment. These patterns reduce latency and server load and help manage infrastructure costs while delivering a predictable user experience.
When implementing caching, remember to:
- Choose a caching strategy that aligns with your application's data access patterns.
- Design consistent and efficient cache keys.
- Regularly monitor and manage your cache to avoid stale or bloated data.
By understanding and combining these strategies, you can ensure that your caching solution is robust, scalable, and effective for a wide range of scenarios.
To learn more about Redis, see their documentation, and don't forget to check out the final code on GitHub.
Thanks for reading!