Job Scheduling in Node.js with BullMQ
BullMQ is a Node.js library that allows you to offload tasks from your main application to the background, helping your application run more efficiently. It's ideal for tasks that take time, like image or video processing, API calls, backups, and sending notifications or emails.
The BullMQ library uses Redis to store jobs, ensuring jobs persist even if the application stops working, which is critical for applications deployed in production. BullMQ's capabilities include running multiple jobs concurrently, retrying failed jobs, horizontal scaling, and prioritizing jobs. Its reliability and efficiency have garnered the trust of companies like Microsoft, Vendure, and Datawrapper.
This article will guide you through the setup, features, and best practices of using BullMQ to implement task scheduling effectively in a Node.js application.
Let's get started!
Prerequisites
Before proceeding with the tutorial, ensure you've installed the latest LTS version of Node.js on your system. You can find the installation guide here. Since BullMQ uses Redis for job storage, you'll also need to have Redis set up. Follow the official Redis installation guide for detailed instructions. Additionally, part of this tutorial backs up a MongoDB database, so ensure you have MongoDB installed. You can use a different database instead, but the backup steps will differ.
Once you've installed the prerequisites, verify that Redis is operational by running:
If Redis is running smoothly, you'll see "active (running)" in the status output:
This confirms that Redis is active and ready to accept connections. If the output shows "Active: inactive (dead)," it indicates that Redis isn't running. In this case, start the service with:
Step 1 — Setting up the project directory
With Redis operational, begin by cloning the essential code from the GitHub repository:
Navigate to the newly created directory:
Next, use the npm command to install the necessary dependencies listed in the package.json file. This includes essential libraries like BullMQ for task scheduling and date-fns for manipulating dates:
Once all the dependencies are installed, you can start scheduling tasks with BullMQ in the following steps.
Step 2 — Scheduling tasks with BullMQ
In this section, you will create a basic task scheduler with BullMQ.
To begin, open the index.js file with an editor of your choice. This tutorial assumes you are using VSCode, which can be opened with the code command as follows:
In this code, you create a Queue instance named myQueue. The redisOptions object holds the connection settings that BullMQ uses to reach Redis.
Next, you define an addJob() function that uses the add() method to add a job to the queue. It takes three arguments: the job name, the job data, and the options object. The options object includes the repeat property, which sets how often the job recurs, with the interval given in milliseconds. Note that the second and third arguments are optional.
Next, you define a welcomeMessage() function, which is the actual job. This function would contain the time-intensive code in a more real-world application, but it is just a "hello world" message here. A worker will execute this function, which we will explore soon.
Finally, you call the addJob function to add a job with the welcomeMessage name.
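As a rough illustration of the options object described above, the following sketch builds a repeat setting from an interval in seconds. The helper name `buildRepeatOptions` is hypothetical, and the commented-out `queue.add()` call assumes a `Queue` instance named `queue` like the one in index.js.

```javascript
// Hypothetical helper: converts an interval in seconds into the
// options object passed as the third argument to queue.add().
function buildRepeatOptions(everySeconds) {
  if (!Number.isFinite(everySeconds) || everySeconds <= 0) {
    throw new RangeError("everySeconds must be a positive number");
  }
  // BullMQ's repeat.every expects the interval in milliseconds.
  return { repeat: { every: everySeconds * 1000 } };
}

const options = buildRepeatOptions(5);
console.log(options); // { repeat: { every: 5000 } }

// With a real queue (assumed to exist), the job would be added like this:
// await queue.add("welcomeMessage", {}, options);
```

Keeping the conversion in one place avoids mixing up seconds and milliseconds when several jobs use different intervals.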
You can then run the file with the following command:
BullMQ will create a queue and add a welcomeMessage job, but it won't execute the jobs in Redis. For that task, you need a worker, an instance that goes through the queue and executes each job stored in the queue. If executing the job is successful, the job is moved to the "completed" status. Conversely, if any issues arise during execution, leading to an error, the job is labeled "failed."
Now open the worker.js file in your editor:
First, you import the welcomeMessage function and assign it to a property in the jobHandlers object. This lets you choose the right function by job name as you add more functions to process tasks. Next, you define a processJob function, which looks up the matching function in the jobHandlers object, prints a message, and invokes the function with the job data.
Next, you create the worker instance for myQueue, passing it the function that executes jobs and the connection settings the worker needs to connect to Redis and consume jobs.
Finally, the "completed" event handler logs a success message if a job is finished and marked as completed. If a job fails, it triggers the "failed" handler, and the job ID of the failed job is logged along with the error message.
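The handler lookup described above can be sketched without Redis as follows. This is an illustrative outline, not the tutorial's exact listing: `resolveHandler` is a hypothetical helper, the `welcomeMessage` body is a stand-in, and the commented-out `Worker` line assumes `bullmq` is installed and `redisOptions` is defined.

```javascript
// Stand-in job function; in the tutorial this is imported from index.js.
function welcomeMessage() {
  return "Welcome to our website!";
}

// Map job names to the functions that handle them.
const jobHandlers = { welcomeMessage };

// Look up the handler for a job name, failing loudly if none is registered.
function resolveHandler(jobName) {
  const isOwn = Object.prototype.hasOwnProperty.call(jobHandlers, jobName);
  if (!isOwn || typeof jobHandlers[jobName] !== "function") {
    throw new Error(`No handler registered for job "${jobName}"`);
  }
  return jobHandlers[jobName];
}

// Processor in the shape a BullMQ Worker expects: it receives a job object
// with `name` and `data`. A thrown error causes the job to be marked failed.
async function processJob(job) {
  const handler = resolveHandler(job.name);
  console.log(`Processing job: ${job.name}`);
  return handler(job.data);
}

// With bullmq installed and Redis running (assumed), it would be wired up as:
// const worker = new Worker("myQueue", processJob, { connection: redisOptions });
```

Throwing for an unknown job name means a misnamed job ends up in the "failed" state with a clear message instead of being silently skipped.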
After finishing this setup, open another terminal and start the worker.js script:
When it runs, you will see output similar to the following:
Step 3 — Customizing BullMQ jobs
In this step, you'll explore how to customize BullMQ jobs by passing custom data and setting job priorities.
In the index.js file, add the highlighted code to define another job that takes custom data:
In this example, the exportData() function takes name and path as data needed for exporting and logs a message for demonstration purposes. You then use the addJob() function to add another job, dataExport, with information stored under the jobData object, providing a name for the exported file and its path.
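A minimal sketch of a job function that receives custom data might look like this. The message text and the sample values in `jobData` are placeholders, and the commented-out call assumes an existing `Queue` instance named `queue`.

```javascript
// Hypothetical version of exportData(): receives the job's data object
// and logs a message instead of performing a real export.
function exportData(jobData) {
  const { name, path } = jobData;
  const message = `Exporting ${name} data to ${path}`;
  console.log(message);
  return message;
}

// Example payload; the file name and path are placeholder values.
const jobData = { name: "salesReport", path: "/tmp/exports" };
exportData(jobData);

// Adding it to a real queue (assumed to exist) would look like:
// await queue.add("dataExport", jobData, { repeat: { every: 5000 } });
```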
Save your file and start the index.js file again:
Now, open worker.js in your text editor and update it with the following contents:
In this code, you import the exportData function and reference the function definition with a dataExport property on the jobHandlers object.
When you are finished, start the worker again with the following command:
You will now see output that looks similar to the following:
Now that you can pass custom data to jobs, let's explore how to set up job priorities so that jobs that don't need immediate execution can be deprioritized, allowing workers to focus on jobs that need immediate attention. To add a priority, you use the priority option. When you create a job without this option, it is assigned the highest priority by default.
So, in the index.js file, add the following code to allow setting a priority on a job:
When the addJob() function is called with the second argument, the priority property is added to the options. Priorities in BullMQ range from 1 to 2,097,152. The lowest number represents the highest priority, and the priority decreases as the number goes higher.
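To make the ordering rule concrete, here is a small sketch that validates a priority and models the order in which jobs would be picked up. It is a simplified model for illustration, not BullMQ's internal implementation; `withPriority`, `processingOrder`, and the `sendEmail` job name are hypothetical.

```javascript
// BullMQ accepts priorities from 1 (highest) to 2,097,152 (lowest).
const MAX_PRIORITY = 2097152;

// Hypothetical helper: returns job options with a validated priority.
function withPriority(options, priority) {
  if (!Number.isInteger(priority) || priority < 1 || priority > MAX_PRIORITY) {
    throw new RangeError(`priority must be an integer from 1 to ${MAX_PRIORITY}`);
  }
  return { ...options, priority };
}

// Simplified model of processing order (not BullMQ's internals):
// jobs without a priority go first, then ascending priority number.
function processingOrder(jobs) {
  const rank = (job) => (job.opts.priority === undefined ? 0 : job.opts.priority);
  return [...jobs].sort((a, b) => rank(a) - rank(b)).map((job) => job.name);
}

const jobs = [
  { name: "welcomeMessage", opts: withPriority({}, 10) },
  { name: "dataExport", opts: {} },
  { name: "sendEmail", opts: withPriority({}, 2) },
];
console.log(processingOrder(jobs)); // [ 'dataExport', 'sendEmail', 'welcomeMessage' ]
```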
Run the index.js again:
After that, rerun the worker:
With the new priority settings, BullMQ will prioritize the dataExport job over the welcomeMessage job, executing it first:
The dataExport job runs first because it has no priority option set, so it receives the default, highest priority. For an in-depth understanding of how job priorities function in BullMQ, consult the official BullMQ documentation.
Step 4 — Using cron expressions to schedule jobs
Up to now, you've been scheduling tasks by specifying a repeat interval in milliseconds. However, an alternative and widely used method is cron expressions. Originating from Unix systems, cron expressions offer a standardized way to schedule recurring tasks at specific times, such as a given minute, hour, or day of the week.
A cron expression contains five fields that represent minute, hour, day of the month, month, and day of the week:
Here's a quick reference for the values each field can take:
| Field | Allowed Values |
|---|---|
| Minute | 0-59 |
| Hour | 0-23 |
| Day of the Month | 1-31 |
| Month | 1-12 or JAN-DEC |
| Day of the Week | 0-6 or SUN-SAT |
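To incorporate cron expressions into BullMQ scheduling, you can use the cron property within the repeat option. Current BullMQ releases name this property pattern, so check which name your installed version expects. Here are a couple of examples showing how to use cron expressions in BullMQ: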
The cron property provides a cron expression for each job in these code examples. Cron expressions offer a powerful and flexible way to specify exact times and frequencies for your jobs, accommodating complex scheduling needs.
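As a small aid for reading cron expressions, the following sketch splits a five-field expression into the fields listed in the table above. `describeCron` is a hypothetical helper that only checks the field count and does not validate value ranges. The commented-out call assumes an existing `queue` and `jobData`, and uses the `cron` key to match the text (newer BullMQ releases name this key `pattern`).

```javascript
// Names of the five cron fields, in order.
const CRON_FIELDS = ["minute", "hour", "dayOfMonth", "month", "dayOfWeek"];

// Split a five-field cron expression into labeled parts.
function describeCron(expression) {
  const parts = expression.trim().split(/\s+/);
  if (parts.length !== CRON_FIELDS.length) {
    throw new Error(`Expected 5 fields, got ${parts.length}`);
  }
  return Object.fromEntries(CRON_FIELDS.map((field, i) => [field, parts[i]]));
}

// Every day at 02:30.
console.log(describeCron("30 2 * * *"));
// Every Monday at 09:00.
console.log(describeCron("0 9 * * 1"));

// Passed to a real queue (assumed to exist), the expression goes in the repeat option:
// await queue.add("dataExport", jobData, { repeat: { cron: "30 2 * * *" } });
```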
Step 5 — Scheduling jobs to run once
While you've primarily focused on recurring tasks, there are scenarios where you might need a job to run once or at a specific future time. BullMQ caters to these needs as well.
Running a job immediately
To have a job run immediately and only once, add it to the queue without the repeat option:
Without the repeat option, BullMQ executes the job a single time. Once the job finishes, it's marked as completed and won't run again.
Delaying a job
Sometimes, you may want to postpone a job's execution to a future time. BullMQ allows you to delay a job's start with the delay option, where you provide the delay duration in milliseconds.
For instance, to delay a job by 20 seconds, you'd write:
If you aim to schedule a job for a specific future time, calculate the delay from the current time to the desired execution time:
Here, you calculate the delay as the time difference between your target and current times. The job will then be scheduled to execute at the exact future time you've specified.
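The delay calculation can be sketched as follows. The helper name `delayUntil` and the example timestamps are illustrative; the commented-out call assumes an existing `Queue` instance named `queue`.

```javascript
// Milliseconds from `now` until `target`, never negative.
function delayUntil(target, now = new Date()) {
  return Math.max(0, target.getTime() - now.getTime());
}

// Fixed example times so the result is reproducible.
const now = new Date("2024-01-01T12:00:00Z");
const target = new Date("2024-01-01T12:00:20Z");

const options = { delay: delayUntil(target, now) };
console.log(options); // { delay: 20000 }

// With a real queue (assumed to exist):
// await queue.add("welcomeMessage", {}, options);
```

Clamping to zero means a target time that has already passed produces a job that runs right away instead of a negative delay.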
Step 6 — Managing BullMQ queues
BullMQ queues are highly configurable, allowing you to dictate how jobs are handled, removed, or added. Here's how you can leverage these features for efficient queue management.
Automating removal of finished jobs
Once jobs are completed or fail in BullMQ, they're categorized into "completed" or "failed" sections. While advantageous for review during development, these can accumulate, especially in a production environment. To automate the cleanup of these jobs, use the removeOnComplete and removeOnFail options:
Setting these options to true prompts BullMQ to discard the jobs automatically once they're done. For more detailed insights, refer to the official documentation.
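The cleanup options might be assembled like this. The `keepRecent` variant, which retains a bounded history instead of removing everything, is an addition for illustration with assumed retention values; `withCleanup` is a hypothetical helper.

```javascript
// Remove every finished job as soon as it completes or fails.
const removeAll = { removeOnComplete: true, removeOnFail: true };

// Alternative (assumed retention values): keep a bounded history instead.
// `count` is the maximum number of jobs kept; `age` is in seconds.
const keepRecent = {
  removeOnComplete: { count: 100 },
  removeOnFail: { age: 24 * 3600, count: 1000 },
};

// Hypothetical helper: merge cleanup settings into a job's options.
function withCleanup(options, cleanup = removeAll) {
  return { ...options, ...cleanup };
}

console.log(withCleanup({ delay: 5000 }));
// { delay: 5000, removeOnComplete: true, removeOnFail: true }
```

Keeping some failed jobs around, as in `keepRecent`, can make debugging easier while still capping how much Redis memory finished jobs consume.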
Adding jobs in bulk
For scenarios requiring the addition of multiple jobs simultaneously, use the addBulk() method. This method ensures atomic addition of jobs — all jobs are added at once, and if an error occurs, none are added:
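As a sketch, the array passed to addBulk() can be built from a list of inputs. Each entry has the shape `{ name, data, opts }`. The helper `toBulkJobs`, the file names, and the export path are placeholders, and the commented-out call assumes an existing `queue`.

```javascript
// Hypothetical helper: turn a list of file names into addBulk() entries.
// Each entry has the shape { name, data, opts }.
function toBulkJobs(fileNames) {
  return fileNames.map((fileName) => ({
    name: "dataExport",
    data: { name: fileName, path: "/tmp/exports" },
    opts: { removeOnComplete: true },
  }));
}

const bulkJobs = toBulkJobs(["users", "orders", "invoices"]);
console.log(bulkJobs.length); // 3

// With a real queue (assumed to exist):
// await queue.addBulk(bulkJobs);
```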
Removing jobs
If you need to clear out waiting or delayed jobs, BullMQ provides methods to do so. Note, however, that active, completed, failed, or waiting-children jobs won't be affected:
BullMQ also has other methods that allow you to remove all jobs, which you can find in the documentation.
Pausing and resuming queues
There might be instances where pausing a queue is necessary. When paused, workers won't pick new jobs from the queue. Pause and resume a queue as follows:
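One way to use pausing safely is to wrap a maintenance task so the queue is always resumed afterwards. The sketch below uses a stand-in queue object so it runs without Redis; `withQueuePaused` is a hypothetical helper that works with any object exposing async `pause()` and `resume()` methods, such as a BullMQ `Queue`.

```javascript
// Hypothetical helper: pause a queue, run a maintenance task, then resume,
// even if the task throws.
async function withQueuePaused(queue, task) {
  await queue.pause();
  try {
    return await task();
  } finally {
    await queue.resume();
  }
}

// Stand-in queue that records calls, so the sketch runs without Redis.
const calls = [];
const fakeQueue = {
  pause: async () => calls.push("pause"),
  resume: async () => calls.push("resume"),
};

withQueuePaused(fakeQueue, async () => calls.push("maintenance")).then(() => {
  console.log(calls.join(" -> ")); // pause -> maintenance -> resume
});
```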
Now that you can manage the queues, let's review how to manage workers.
Step 7 — Managing BullMQ workers
Workers are crucial in BullMQ for processing tasks from the queue. They have various properties and methods to tailor their behavior to your needs.
Setting up concurrency
BullMQ allows you to set up concurrency, which determines how many jobs a worker can process simultaneously. You can configure this with the concurrency option:
Alternatively, you can achieve concurrency by having multiple workers, and the documentation recommends this approach over using the concurrency option. This not only allows for distributed processing across machines but also provides a more scalable solution:
Pausing and resuming workers
There may be situations where you need to halt a worker's activity temporarily. The pause() method stops the worker from picking up new jobs, and by default the call waits for the jobs currently being processed to finish:
If you don't want the call to wait for active jobs to complete, pass true:
To unpause the worker, use the resume() method:
Step 8 — Scheduling database backups with BullMQ
This step teaches you how to create a simple script using BullMQ to automate backups of a MongoDB database.
Begin by opening the backup.js file with the following code:
Here, you use the spawn() function from the child_process module to run mongodump, a MongoDB utility for creating backups. The database being backed up is admin, which exists by default in MongoDB. You also enable compression and generate a timestamped filename for each backup.
The backupDatabase() function also handles errors and the termination of the mongodump process, and logs a message when the backup succeeds.
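The command that backupDatabase() hands to spawn() could be assembled roughly as follows. This is a sketch under assumptions: the helper names and the archive naming scheme are hypothetical, and the timestamp is built with plain Date methods instead of date-fns to keep the example dependency-free. The `--db`, `--archive`, and `--gzip` flags are standard mongodump options.

```javascript
// Build a timestamp like 2024-01-01T12-00-00 that is safe in file names.
function timestampFor(date) {
  return date.toISOString().slice(0, 19).replace(/:/g, "-");
}

// Hypothetical helper: assemble the mongodump command and arguments.
function buildBackupCommand(dbName, date = new Date()) {
  const archivePath = `./${dbName}-${timestampFor(date)}.gz`;
  return {
    command: "mongodump",
    args: [`--db=${dbName}`, `--archive=${archivePath}`, "--gzip"],
    archivePath,
  };
}

const { command, args } = buildBackupCommand("admin", new Date("2024-01-01T12:00:00Z"));
console.log(command, args.join(" "));
// mongodump --db=admin --archive=./admin-2024-01-01T12-00-00.gz --gzip

// In backup.js, these would be handed to child_process.spawn(command, args).
```

Separating command construction from process spawning makes the file naming easy to check without running mongodump.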
Before running the script, confirm MongoDB is installed and running:
You will see output similar to this confirming that MongoDB is running:
Run the script with this command:
You will observe output that looks similar to this:
Next, list the directory contents to confirm that the backup file has been created:
The output will contain the created archive like so:
Now, to schedule recurring backups with BullMQ, modify the backup.js file by adding a queue and job scheduling function:
This code is similar to what you have seen earlier in the article. The addJob() function adds a new job backupMongoDB to the backupQueue. You then export the backupDatabase() function and the redisOptions so that they can be used in a file containing a BullMQ worker.
Run the backup.js file to add the job to the queue:
Then, define a worker to process the backup job by creating a backup-worker.js file, and then adding the following code:
In this code, you set up a worker to invoke and execute the backupDatabase() function.
Next, start the worker:
Keep the worker running for a few minutes, and observe as it automatically backs up your database at the specified intervals:
Now, list the directory to see a growing list of backup files in your directory:
With your jobs now scheduled and running smoothly, the next step is implementing Bull Board, a user-friendly interface that allows easy management and monitoring of your queued jobs.
Step 9 — Job monitoring with Bull Board
If you prefer a visual interface for monitoring and managing jobs, Bull Board provides a convenient solution. This tool offers a comprehensive view of your queues, their jobs, and their statuses.
To integrate Bull Board, first install the necessary packages:
Once installed, integrate Bull Board into your backup.js file with the following additions:
In this modification, Bull Board is initialized using createBullBoard() with a backupQueue. A middleware is set up for the /admin route, and a server is created to run on port 3000.
Once you've made these changes, run the backup.js file:
In your browser, navigate to http://localhost:3000/admin. You should see a dashboard similar to the screenshot provided:
The dashboard allows you to view all the jobs in the queue, inspect detailed job information, and observe delayed, prioritized, or paused jobs:
Step 10 — Monitoring scheduled jobs in production
When automating tasks, having a monitoring system is crucial to alert you to any issues. A prime example of the importance of diligent monitoring is GitLab's data loss incident due to a failed backup process, as detailed in their post-mortem. This highlights the need for robust monitoring of scheduled tasks to ensure prompt awareness and resolution of issues.
One tool for this purpose is Better Stack, which actively monitors your jobs and alerts you through various channels like text, Slack, emails, or phone calls if any issues arise. This helps you stay informed about your job status and quickly address any problems.
First, sign up for a free Better Stack account and find the Heartbeats section on the sidebar. Click the Create Heartbeat button:
Next, name your monitor and specify the frequency of your job execution. In the On-call escalation section, select how you would like to be notified. When you are finished, click Create Heartbeat:
You will then be redirected to the Heartbeat page, which provides a new endpoint you can use to monitor your task:
Now copy the URL and return to your text editor. Open the .env file and add the following content. Replace <your_heartbeat_url> with the URL:
Following that, install the dotenv package, which loads environment variables from the .env file:
Next, open the backup-worker.js file and use fetch to send a request to the Heartbeat URL once the job succeeds:
Place the heartbeat request so that it runs only after the monitored task has finished successfully. That way, a failure or error means no request is sent, and the missed heartbeat triggers an alert.
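The ordering matters, so here is a minimal sketch of the pattern: run the task, then ping the heartbeat only if the task did not throw. The helper names `pingHeartbeat` and `runWithHeartbeat` are hypothetical, and `fetchImpl` is an injectable parameter (defaulting to the global `fetch` available in Node.js 18 and later) so the logic can be exercised without a network call.

```javascript
// Hypothetical helper: report a successful run to the heartbeat URL.
async function pingHeartbeat(url, fetchImpl = globalThis.fetch) {
  if (!url) {
    throw new Error("Heartbeat URL is not set");
  }
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error(`Heartbeat request failed with status ${response.status}`);
  }
  return true;
}

// Run the task first and ping only if it did not throw, so a failed
// backup results in a missed heartbeat.
async function runWithHeartbeat(task, url, fetchImpl) {
  const result = await task();
  await pingHeartbeat(url, fetchImpl);
  return result;
}
```

In backup-worker.js, `task` would be the call to backupDatabase() and `url` the heartbeat URL loaded from the .env file.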
Save the file and run the backup-worker.js file again:
The scheduled backups will now process the jobs as before. When the first backup is successful, Better Stack will confirm that the task is "UP":
To simulate the job failing, stop the script and wait for a few minutes. When it stops, requests won't be sent to Better Stack anymore. As a result, Better Stack's status will update to "Down" and send an alert to your preferred channels.
With this setup, you'll know whether your backup tasks are running, and you'll be alerted if a problem occurs. It is also a good idea to log the reason for each failure so that you can review it, fix the issue quickly, and get your jobs running again. See our Node.js logging guide for more details.
Final thoughts
Throughout this tutorial, we employed BullMQ for scheduling tasks in Node.js and explored its diverse capabilities. Furthermore, we established a monitoring system to alert you if a scheduled task goes wrong.
For further learning, consider exploring the documentation for BullMQ and Better Stack.
Thank you for reading, and happy coding!