Node.js Performance Monitoring: A Beginner's Guide
Looking for an easier way to identify performance bottlenecks in your application? Monitoring tools can enhance observability and quickly help you detect issues.
Node.js provides a Performance API that simplifies measuring code performance. When used with Prometheus, this setup enables efficient metrics collection, which allows a more straightforward analysis of your application's health.
This article serves as a continuation of Part 1, which introduces the Node.js Performance API. It will dig deeper into how you can apply the concepts discussed in real-world applications.
Prerequisites
Before beginning this tutorial, ensure you have installed the latest version of Node.js and have a basic understanding of Node.js application development. Additionally, ensure Docker and Docker Compose are installed on your system.
Step 1 — Setting up the demo project
To showcase performance monitoring in a real-world Node.js application, I have prepared a sample project. The application retrieves data from an SQLite database, processes it, makes a fetch API request, and then displays the data. Throughout this article, you'll use the Node.js Performance API to track the time taken for each operation.
Begin by cloning the project repository to your local machine with the command:
After cloning, navigate to the project directory:
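The repository URL and directory name below are placeholders, not the article's actual repository; substitute the real values:

```shell
# Hypothetical URL -- replace with the article's repository
git clone https://github.com/example/books-performance-demo.git
cd books-performance-demo
```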
Next, install the necessary dependencies:
- Express: a widely-used web framework.
- Embedded JavaScript templates (EJS): a templating engine.
- node-sqlite3: asynchronous SQLite3 bindings for Node.js.
- autocannon: a high-performance load testing tool.
Run the following command to install them:
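A single npm command installs all four packages:

```shell
npm install express ejs sqlite3 autocannon
```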
Once the dependencies are installed, launch the development server, which listens on port 3000:
The server will start, and you should see this output:
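Assuming the project's entry point is index.js, the server can be launched with Node's watch mode; the exact startup message is application-defined and may differ:

```shell
node --watch index.js
# Example output (message text is an assumption):
# Server is running on port 3000
```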
The --watch flag enables the server to restart automatically upon file change.
Note that this feature is currently experimental.
Now, open your browser and go to http://localhost:3000 to view the homepage:
The application displays a subset of the database records to avoid overwhelming the page with data. Specifically, it shows only 200 out of several thousand records.
The application includes a book.db file sourced from the
7k Books dataset,
which has been enhanced with additional data.
Step 2 — Identifying code segments for monitoring
Before starting the performance measurement, it's essential to understand the
code segments that will be monitored. Open the index.js file to review the
code:
When a GET request is made, the code queries the database to retrieve all rows
from the "books" table. Following that, it reaches out to an external API at
https://api.adviceslip.com/advice to fetch random advice. After retrieving the
data, it processes these entries to append a pageSizeCategory, calculated
using the categorizePageSize() function. Finally, it renders a template with
the data obtained from the database and the API.
We will assess and monitor the performance of each of these segments: database querying, API fetching, data processing, and template rendering. Here’s why monitoring these segments is important:
- Database Query Performance: measuring this helps spot inefficiencies in database interactions, such as slow queries, inefficient connections, or missing indexes.
- External API Request Performance: this measurement captures the delay introduced by external services, revealing network latency or connectivity issues.
- Data Processing Performance: monitoring this helps identify bottlenecks in data manipulation that could be optimized with more efficient algorithms.
- Template Rendering Performance: measuring this reveals the impact of rendering on application performance, potentially highlighting issues in the rendering engine or data integration.
Now that you understand the parts you need to monitor and the reasons why, you will begin measuring the database query performance.
Step 3 — Measuring database querying performance
In this step, you'll use the User Timing API to evaluate the performance of
database query operations. This API offers performance.mark() to set markers
around query executions and performance.measure() to compute the durations
between these markers. Additionally, an Observer will be used to track these
measurements in real-time.
Open the index.js file and include the following code:
In this code, you configure the PerformanceObserver to log performance
metrics, showing the duration of each event in milliseconds. You then mark the
start of the database operation with fetchDatabaseStart and the end with
fetchDatabaseEnd. The elapsed time is then calculated using
performance.measure().
After saving these changes, Node.js will restart the development server automatically.
To see the duration, refresh the endpoint at http://localhost:3000/ using the
following command in the second terminal:
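A plain curl request triggers the route:

```shell
curl http://localhost:3000/
```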
Then, return to the first terminal, and you should see output similar to this:
This indicates that the database took approximately 170 milliseconds to process the queries. The duration will vary depending on your system; after making multiple GET requests to the endpoint, I got different results each time:
With this setup, you can now accurately track the time it takes to execute database queries in milliseconds. Next, you will measure the performance of fetching data from external APIs.
Step 4 — Measuring API requests
Next, you'll measure the performance of external API requests. Fortunately, Resource Timing is incorporated into the Performance API, which automatically captures network timing data, such as DNS lookups and request durations.
Instead of using the mark() method, you can directly retrieve these
measurements from the performance timeline. Update your code and adjust the
observer to monitor entries of type resource:
In this setup, the PerformanceObserver tracks measure and resource
entries. If an entry is of type resource, the observer's callback logs the API
fetch duration.
After saving your modifications, refresh the http://localhost:3000/ endpoint
in your second terminal:
You should see output like this:
With that, you can successfully measure the fetch API requests by accessing the duration inside the performance timeline. Now, you'll move on to measuring the performance of data processing tasks.
Step 5 — Measuring data processing duration
After fetching the rows and making an API request, the database data is processed. In this section, you'll add markers to measure the duration of this task.
Open your index.js file and insert the following markers:
This code snippet sets up markers at the start and end of the data processing
segment. You then use performance.measure() to calculate the duration between
these markers.
Save your code, and refresh the homepage endpoint in the second terminal:
Return to the first terminal to see an output similar to this:
On my system, the data processing step takes approximately 21 milliseconds.
With the ability to measure data processing, you can calculate the rendering time.
Step 6 — Measuring rendering time
In this section, you'll measure your application's rendering time using a
similar approach to the previous steps: adding markers around the render()
function.
Update your index.js file with the following code:
This addition places markers at the start and end of the rendering process and measures the duration between these points.
After saving your updates, refresh the homepage in the second terminal:
Return to the first terminal to see output like this:
On my machine, the rendering time is approximately 15 milliseconds.
With this, you can measure the rendering time.
Step 7 — Implementing custom metrics for Prometheus
In this section, you'll create custom metrics from the measurements and expose them to an endpoint that Prometheus can scrape. Prometheus is a powerful open-source monitoring tool that collects and stores metrics data by scraping specified endpoints.
Setting up the Prometheus client
To create a metric endpoint, you'll use the prom-client package, which you can install with the following command:
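Install it from npm:

```shell
npm install prom-client
```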
Next, modify your index.js file to set up a metrics registry:
Here, you instantiate the Registry class. The registry keeps track of all metrics, and it's where all the metrics you create will be registered before they are exposed to an endpoint.
Next, you define a /metrics endpoint, where you invoke the metrics() method
of the register object to retrieve all the metrics data.
After updating your code, access the metrics endpoint by visiting
http://localhost:3000/metrics in your browser (or requesting it with curl).
Initially, this will yield an empty response because no metrics have been defined yet:
Creating metrics for database durations
After measuring database query durations, you'll set up a metric to capture this data effectively. The histogram metric from Prometheus is ideal for representing data distributions.
A histogram organizes data into "buckets" based on set intervals. Each bucket tallies measurements within its range. For example, the durations of 2, 5, and 8 seconds correspond to individual buckets marked by these upper limits.
It's essential to define these buckets carefully to accurately calculate useful percentiles. If most durations exceed the highest bucket or are below the lowest, the histogram will not offer insightful data.
Begin with load testing to gauge the typical duration ranges of your tasks. Exercise caution, particularly if your application interacts with external services, to avoid being rate-limited due to excessive requests.
First, restart the server and redirect the output to a file:
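One way to capture the observer's console output is to redirect stdout to a file (the file name follows the article):

```shell
node --watch index.js > duration_measurements.txt
```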
In another terminal, run a load test:
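The connection count and duration below are illustrative; keep them modest if the route calls an external API:

```shell
npx autocannon -c 10 -d 30 http://localhost:3000
```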
Expect output similar to this:
After the test, stop redirecting the output to the file:
Now, open the duration_measurements.txt file containing various duration
entries. Review this file to assess the range of expected database durations.
Here are some typical durations I recorded on my machine:
From these data points, I've formulated bucket ranges in seconds that cover the observed values:
In your index.js, integrate the following code to create and use a histogram
for tracking these durations:
In this code, you create a Histogram named fetch_database_duration_seconds.
The help parameter describes what the metric represents. The registers
option specifies the registry where this metric will be registered, which is the
register you instantiated earlier. Next, the buckets define the boundaries
of the buckets in seconds, where each bucket represents a range of durations.
In the / endpoint, you retrieve the duration of the performance measurement
named "fetchDatabase" and divide the result by 1000 to convert milliseconds to
seconds. Finally, the fetchDatabaseHistogram.observe() function observes the
duration by recording it in the histogram.
With the metric created, save your file.
Return to your browser and test the http://localhost:3000/metrics endpoint.
Upon running, you will see output that looks similar to the following:
A histogram metric includes the following components:
- Histogram buckets: each bucket within a histogram is identified by the _bucket suffix and functions as a counter. The le label, standing for "less than or equal to," denotes the upper limit of the bucket's range; measurements within or under this value are counted in that particular bucket.
- Sum counter: indicated by the _sum suffix, this counter represents the total sum of all recorded measurement values.
- Count counter: designated by the _count suffix, this counter records the total number of measurements taken, providing an overall count of all data points measured.
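For illustration, the exposition for the database histogram might look like this (the values are hypothetical):

```
fetch_database_duration_seconds_bucket{le="0.1"} 0
fetch_database_duration_seconds_bucket{le="0.5"} 3
fetch_database_duration_seconds_bucket{le="+Inf"} 4
fetch_database_duration_seconds_sum 1.408
fetch_database_duration_seconds_count 4
```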
Refreshing the homepage in your browser and re-checking the metrics endpoint, the histogram will update to reflect the measured durations:
Now, you see some of the histogram values incremented by one.
Creating metrics for fetch, data processing, and rendering
In this section, you will set up histogram metrics for fetch operations, data
processing, and rendering. I analyzed the duration_measurements.txt file to
determine appropriate bucket values based on the durations I observed.
Here's the updated code for your index.js:
The provided code snippet configures several histograms and a counter for monitoring various performance metrics in an application:
- fetchDurationHistogram: measures the duration of fetch API requests, with buckets ranging from 0 to 12 seconds, and includes labels for method, status code, and route.
- fetchRequestsCounter: counts the total number of fetch API requests.
- dataTransferSizeHistogram: monitors the data transfer size of fetch API requests, with buckets set at 100-byte increments up to 500 bytes.
- modifyRowsHistogram: tracks the time taken to modify rows, with fine-grained buckets from 0 to 1 second, emphasizing shorter durations.
- renderHistogram: measures the duration of rendering tasks, with detailed buckets for short durations up to 0.1 seconds.
Now modify the "/" endpoint to record the measurements into the specified
histograms:
In this code snippet, whenever a fetch API request occurs, the
fetchRequestsCounter is incremented using the inc() method to track the
total number of requests made.
To measure the fetch duration, the code retrieves relevant performance entries
associated with resource fetches and computes the duration. This duration is
then recorded using fetchDurationHistogram.observe().
Additionally, the code records the time taken for modifying rows and rendering
operations with the modifyRowsHistogram.observe() and
renderHistogram.observe() methods.
After saving your changes, open your browser and visit the
http://localhost:3000/metrics endpoint. You'll see that additional histograms
and a counter have been included:
The /metrics endpoint now includes histograms tracking durations for fetch API
requests, row modifications, and rendering, alongside a counter for total fetch
requests and a histogram for data transfer sizes.
Next, conduct another load test to observe these metrics after multiple requests:
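Another autocannon run drives traffic through the instrumented route (the connection count and duration are illustrative):

```shell
npx autocannon -c 10 -d 30 http://localhost:3000
```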
Once the load test is complete, revisit the http://localhost:3000/metrics
endpoint in your browser:
With the metrics now established, the next step is to set up Prometheus to scrape and analyze these data points.
Step 7 — Collecting performance metrics with Prometheus
In the previous section, you instrumented your code to generate metrics, but
it's only useful if these metrics end up in a monitoring system. In this step,
you'll set up Prometheus using Docker to monitor the /metrics endpoint.
Start by creating a prometheus.yml configuration file in the root directory of
your project:
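A configuration matching that description might look like this (a sketch; adjust targets and intervals to your setup):

```yaml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "books-app"
    scrape_interval: 5s
    static_configs:
      - targets: ["host.docker.internal:3000"]
```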
The configuration sets a global scrape and evaluation interval of 15 seconds. It
includes a scrape job for Prometheus itself and another job labeled
"books-app" targeting host.docker.internal:3000 at a 5-second interval; from
inside the container, host.docker.internal resolves to your machine's
localhost, where the application is running.
Then, configure Docker with a docker-compose.yml file:
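A minimal compose file for this setup might look like the following (a sketch; the image tag and extra_hosts mapping are assumptions):

```yaml
services:
  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    extra_hosts:
      - "host.docker.internal:host-gateway"
```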
Launch the services in the background with:
You’ll see output indicating the successful launch of the services:
Now, open your browser and navigate to http://localhost:9090/graph. You should
see the Prometheus homepage as shown below:
To check whether Prometheus is actively monitoring the endpoint, click Status followed by Targets in the navigation menu, as shown below:
You will be taken to a page where the application endpoint is listed with the label "UP" next to it, which indicates that Prometheus is successfully monitoring it:
Next, you can proceed to query the metrics within the Prometheus interface.
Step 9 — Querying Metrics in Prometheus
Before analyzing the metrics, you must ensure enough data is available to work with. Do that by running a five-minute load test:
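With autocannon, a duration of 300 seconds gives five minutes of traffic (the connection count is illustrative):

```shell
npx autocannon -c 10 -d 300 http://localhost:3000
```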
After the load test runs for about two minutes, head to http://localhost:9090/
in your browser.
With the necessary data collected, you'll focus on three main areas for analysis:
- The total number of requests
- The average duration of database fetching
- The 99th percentile of database fetch durations
To start querying these metrics, go to the Prometheus homepage and use PromQL.
First, to check the total number of requests, input the query
fetch_requests_total. The result should appear similar to the one shown below:
After the load test concludes, you can calculate the average duration of database requests using the following expression:
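A common PromQL pattern for the average divides the rate of the _sum series by the rate of the _count series over a window:

```promql
rate(fetch_database_duration_seconds_sum[5m])
  / rate(fetch_database_duration_seconds_count[5m])
```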
This calculation will give you insights into the average duration, as illustrated in this screenshot:
Lastly, to compute the 99th percentile (0.99 quantile) of database durations, use this expression:
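histogram_quantile() interpolates the percentile from the cumulative bucket counters, aggregated by the le label:

```promql
histogram_quantile(0.99, sum(rate(fetch_database_duration_seconds_bucket[5m])) by (le))
```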
The result will provide the percentiles, as depicted in the following screenshot:
Final thoughts
This article demonstrated how to monitor a Node.js application's performance using the Performance API. We set up tailored metrics for Prometheus and used them for performance analysis.
With Prometheus at your disposal, you can gain insights into your application's health from the collected metrics.
Thanks for reading, and happy monitoring!