
Load Testing with Locust: A High-Performance, Scalable Tool for Python

Stanley Ulili
Updated on February 21, 2025

Locust is a powerful, open-source load testing framework for Python that enables developers to simulate high-concurrency scenarios with ease.

Unlike traditional load testing tools that rely on heavy threads or processes, Locust leverages the lightweight gevent library, allowing it to scale efficiently while consuming minimal system resources.

This article will guide you through the basics of load testing with Locust.

Prerequisites

Before diving into Locust, ensure you have a modern version of Python installed on your machine (Python 3.13 or higher is recommended). Locust is a Python-based tool, so familiarity with basic Python programming concepts will be helpful.

Step 1 — Setting up a project for load testing

Before we can begin load testing with Locust, you need an API to test. In this step, you will create a simple API using Flask, a lightweight web framework for Python. The API will have three basic endpoints, and they will serve as your testing targets throughout this guide.

To keep things organized, start by creating a dedicated directory for the project and navigate into it. This will serve as the workspace for our load testing setup:

 
mkdir locust-load-test && cd locust-load-test

Next, set up a virtual environment to maintain clean dependencies and avoid conflicts with other Python projects.

Create one using Python's built-in venv module:

 
python3 -m venv venv

Once the virtual environment is created, activate it:

 
source venv/bin/activate

When the virtual environment is active, you should see its name (e.g., venv) in your terminal prompt. This indicates that any installed packages will be contained within this environment.

With the virtual environment activated, install Flask using pip:

 
pip install flask

To verify that Flask has been installed correctly, run the following command:

 
python -c "import importlib.metadata; print(importlib.metadata.version('flask'))"

If the installation is successful, this command will print the installed Flask version:

Output
3.1.0

Now, you'll create the test API. Create a new file called app.py and open it in your preferred text editor:

app.py
from flask import Flask, jsonify

app = Flask(__name__)


@app.route("/")
def home():
    return jsonify({"message": "Welcome to the Flask API!"})


@app.route("/users")
def users():
    return jsonify({"users": ["Alice", "Bob", "Charlie"]})


@app.route("/status")
def status():
    return jsonify({"status": "healthy"})


if __name__ == "__main__":
    app.run(debug=True)

This script initializes a Flask application and defines three routes:

  • / : Returns a JSON response with a welcome message.
  • /users : Returns a JSON response containing a list of sample users.
  • /status: Returns a JSON response indicating the system's health status.

The application runs in debug mode, which is useful during development since it provides helpful error messages and automatically reloads when changes are made.

Start the server with:

 
python app.py

You should see output similar to:

Output
 * Serving Flask app 'app'
 * Debug mode: on
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on http://127.0.0.1:5000
Press CTRL+C to quit
 * Restarting with stat
 * Debugger is active!
 * Debugger PIN: ....

To verify everything works correctly, use curl to test one of the endpoints in a second terminal:

 
curl -X GET http://127.0.0.1:5000/

You should see the welcome message in JSON format:

Output
{
  "message": "Welcome to the Flask API!"
}

With the API now up and running, you have a solid foundation for exploring Locust's load-testing capabilities.

Step 2 — Installing Locust

Now that your API is set up, the next step is to install Locust, the load-testing tool you will use to simulate multiple users interacting with our endpoints.

To do that, install Locust using pip:

 
pip install locust

After the installation is complete, verify that Locust was installed correctly by checking its version:

 
locust --version

If the installation was successful, you should see the installed Locust version printed in the terminal.

Output

locust 2.32.10 ....

In the next step, you will create a locustfile.py and define a simple user behavior model to simulate load testing on our API.

Step 3 — Creating your first Locust test

With Locust installed, you'll create your first load test script. For this initial test, you'll keep things simple by focusing on the home endpoint.

Create a new file called locustfile.py in your project directory:

locustfile.py
from locust import HttpUser, task, between

class APIUser(HttpUser):
    wait_time = between(1, 3)  # Simulates a wait time between requests

    @task
    def get_home(self):
        self.client.get("/")

Let's understand each component of this simple script. The APIUser class inherits from HttpUser, which provides the basic functionality for making HTTP requests.

The wait_time attribute defines a random delay between requests, ranging from 1 to 3 seconds - this helps create more realistic test scenarios by avoiding synchronized requests.

The @task decorator marks the get_home method as a task that Locust will execute during the test, and the method itself makes a simple GET request to the API's home endpoint.
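
The between(1, 3) helper is just one option. If you'd rather have every simulated user pause for a fixed interval, Locust also provides a constant helper. Here's a minimal variant of the same user class (the SteadyAPIUser name is purely illustrative):

from locust import HttpUser, task, constant

class SteadyAPIUser(HttpUser):
    # Each simulated user waits exactly 2 seconds between tasks
    wait_time = constant(2)

    @task
    def get_home(self):
        self.client.get("/")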

To run the test, ensure your Flask application is still running in one terminal. Then, in a new terminal with your virtual environment activated, start Locust:

 
locust --host=http://127.0.0.1:5000

You will see the following output:

Output
[2025-02-20 13:30:50,556] MACOOKs-MacBook-Pro/INFO/locust.main: Starting Locust 2.32.10
[2025-02-20 13:30:50,559] MACOOKs-MacBook-Pro/INFO/locust.main: Starting web interface at http://0.0.0.0:8089

Open your web browser and navigate to http://localhost:8089. You'll see Locust's web interface where you can configure and start your test.

Screenshot of Locust Dashboard

For this initial run, use conservative settings, such as 5 users and a low spawn rate.

Click "Start Swarming" to begin the test:

Screenshot of the dashboard with the settings modified

You will be redirected to the live dashboard, where Locust will display real-time metrics such as request rates, response times, and failure counts.

After a few seconds, observe how your API handles the load. When you're ready, click "Stop" to end the test.

Screenshot of the Stop button in the Locust dashboard

The Locust web interface provides real-time insights, including response times, request rates, and failure counts—helping you gauge your API’s performance under stress.

Now that you've successfully run a basic load test, let's dive into interpreting the results to understand what the numbers mean and how they can inform optimizations.

Step 4 — Understanding load test results

Now that you've completed your first Locust test, let's break down the results and what they reveal about your API's performance under load.

Locust provides a detailed set of performance metrics. Here's a snapshot of the key metrics returned by our test:

Screenshot of key metrics returned by Locust

Each of these metrics tells an important part of the story:

  • Median response time: 4ms – Most requests are processed very quickly.
  • 95th percentile: 6ms – Even at high loads, nearly all requests remain within an acceptable range.
  • 99th percentile: 58ms – A noticeable jump, indicating occasional slower requests.
  • Max response time: 191ms – Some outliers suggest sporadic delays, which might be worth investigating further.
  • Failure rate: 0% – Great news! No errors occurred during the test.
  • Average response size: 45 bytes – Expected for the simple JSON response from the API.
  • Requests per second (RPS): 2.4 – With five simulated users and a 1-3 second wait time between requests, this reflects a steady load.

Overall, these results suggest that your Flask API is handling this basic load test well. The occasional response time spikes at the 99th percentile and max response times could be caused by factors like background processing or system resource contention.
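
The throughput figure is easy to sanity-check: each user waits between 1 and 3 seconds (roughly 2 seconds on average) plus a few milliseconds for the request itself, so five users should produce about 2.5 requests per second, which lines up with the observed 2.4 RPS. A quick back-of-envelope sketch:

# Rough throughput estimate for this test (not part of the locustfile)
users = 5
avg_wait = (1 + 3) / 2          # between(1, 3) averages roughly 2 seconds
expected_rps = users / avg_wait
print(f"Expected RPS: {expected_rps:.1f}")  # Expected RPS: 2.5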

Beyond the raw numbers, Locust provides charts that visually represent key performance metrics like request rate, response times, and total users:

Screenshot of Locust's charts showing request rate, response times, and number of users

These graphs provide a clearer picture of how your API behaves under load, helping you identify trends, bottlenecks, or unexpected fluctuations.

Another useful tab is the Failures tab (http://0.0.0.0:8089/?tab=failures). In this test, the page is blank, confirming that no errors occurred:

Screenshot showing that no errors occurred

With this understanding of Locust’s test results, you're now equipped to interpret performance metrics, identify potential bottlenecks, and track improvements as you optimize your API.

Step 5 — Testing additional API endpoints and adding conditional checks

So far, you've tested the home (/) endpoint. Now, let's extend the test to include the /users and /status endpoints. This will give a more comprehensive view of how different parts of your API handle concurrent requests.

Modify locustfile.py to add the additional endpoints. Open the file and update it with the following:

locustfile.py
from locust import HttpUser, task, between

class APIUser(HttpUser):
    wait_time = between(1, 3)  # Simulates a wait time between requests  

    @task(3)  # The home endpoint is accessed more frequently  
    def get_home(self):
        self.client.get("/")

    @task(2)  # The users endpoint is accessed less frequently
    def get_users(self):
        self.client.get("/users")

    @task(1)  # The status endpoint is accessed the least
    def get_status(self):
        self.client.get("/status")

In this update:

  • The home (/) endpoint has the highest weight, so it is called most often.
  • The /users endpoint is accessed twice as often as /status.
  • The /status endpoint is tested the least.

This setup mimics real-world traffic distribution, where the homepage is usually hit more frequently than API status checks.
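
If you prefer not to repeat the @task decorator, the same 3:2:1 ratio can also be expressed through Locust's tasks attribute, which accepts a dict mapping callables to weights. A sketch of an equivalent user class (the module-level function names are illustrative):

from locust import HttpUser, between

# Module-level task functions receive the running user instance as their argument
def get_home(user):
    user.client.get("/")

def get_users(user):
    user.client.get("/users")

def get_status(user):
    user.client.get("/status")

class APIUser(HttpUser):
    wait_time = between(1, 3)
    # Same weighting as the decorator version: home 3, users 2, status 1
    tasks = {get_home: 3, get_users: 2, get_status: 1}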

With the Flask API still running, restart Locust to launch the web interface:

 
locust --host=http://127.0.0.1:5000

Navigate to http://localhost:8089/, configure the test with 5 users, and run it for a few seconds.

Now that we’ve included additional endpoints, the results will reflect how each one handles concurrent requests:

Screenshot of the test results for all three endpoints

The test results confirm a 0% failure rate, demonstrating the API's stability under load. Response times remained low, with a 4ms median across all endpoints, indicating efficient performance.

In this run, the /users endpoint handled the most requests (61 total). The /status endpoint had a slightly higher max response time (12 ms) but was still within acceptable limits.

Adding a basic conditional check

To improve test reliability, let's add a basic conditional check to ensure the / endpoint returns a status code of 200.

Modify the tests to include validation for HTTP status codes and response content:

locustfile.py
from locust import HttpUser, task, between

class APIUser(HttpUser):
    wait_time = between(1, 3)  # Simulates a wait time between requests

    @task(3)
    def get_home(self):
        # catch_response=True lets the test mark the request as failed manually
        with self.client.get("/", catch_response=True) as response:
            if response.status_code != 200:
                response.failure(f"Got {response.status_code} instead of 200")

    @task(2)
    def get_users(self):
        with self.client.get("/users", catch_response=True) as response:
            if response.status_code != 200 or "users" not in response.json():
                response.failure("Invalid response")

    @task(1)  # The status endpoint is accessed the least
    def get_status(self):
        with self.client.get("/status", catch_response=True) as response:
            if response.status_code != 200 or "status" not in response.json():
                response.failure("Invalid response")

This update introduces a check to ensure the home endpoint returns a 200 OK response; if it doesn't, the test logs a failure. Passing catch_response=True to each request is what lets the test mark a response as failed manually. The /users and /status endpoints now include an additional check to confirm that the expected JSON fields are present in the response.

If they are missing or the status code is incorrect, Locust will flag the request as a failure. These simple validations improve the reliability of the test and help detect unexpected API behavior early.
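
The same pattern can enforce performance expectations, not just correctness. For instance, you could flag responses that take too long as failures; here is a sketch with an arbitrary 500 ms threshold (adjust it to your own service-level target):

from locust import HttpUser, task, between

class APIUser(HttpUser):
    wait_time = between(1, 3)

    @task
    def get_home(self):
        with self.client.get("/", catch_response=True) as response:
            # response.elapsed comes from the underlying requests library
            if response.elapsed.total_seconds() > 0.5:
                response.failure("Request took longer than 500 ms")
            elif response.status_code != 200:
                response.failure(f"Got {response.status_code} instead of 200")
            else:
                response.success()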

Step 6 — Running Locust without the web UI

While the Locust web UI is great for interactive testing, running tests from the command line is often more efficient for automation, CI/CD pipelines, or large-scale performance testing in cloud environments.

Locust supports headless mode, allowing tests to run entirely from the terminal.

To execute Locust without the web interface, run the following command:

 
locust --host=http://127.0.0.1:5000 --users 10 --spawn-rate 2 --run-time 1m --headless

This command launches a test against the API with 10 users, spawning 2 per second, running for 1 minute. The --headless flag prevents Locust from opening the web UI.

As the test runs, the terminal displays real-time metrics, including request success rates, response times, and failures.

Output
... (output shortened for brevity)

Type     Name      # reqs  # fails  Avg(ms)  Min(ms)  Max(ms)  RPS
--------|--------|--------|--------|--------|--------|--------
GET      /          59       0        3        1        6      1.34  
GET      /status    48       0        3        1       19      1.09  
GET      /users    109       0        3        0       16      2.47  
Aggregated         216       0        3        0       19      4.89  

The report summarizes total requests, failures, and response times. In this example, all requests succeeded, and average response times remained low. Locust also logs percentile breakdowns to highlight response time distribution.

To analyze test results later, use the --csv flag to generate CSV reports:

 
locust --host=http://127.0.0.1:5000 --users 20 --spawn-rate 5 --run-time 2m --headless --csv=locust_results

Locust will create four CSV files in the project directory:

 
locust_results_stats.csv       # Request statistics
locust_results_failures.csv    # Logged failures (if any)
locust_results_exceptions.csv  # Captured exceptions
locust_results_stats_history.csv  # Performance trends over time

These reports help track API performance, detect slow responses, and identify potential bottlenecks.
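
Because the reports are plain CSV, you can post-process them with a few lines of Python, for example to flag endpoints whose average response time exceeds a threshold. A minimal sketch, assuming the column names used by recent Locust releases (check the header row of your own file):

import csv

# Hypothetical post-processing of the stats report generated above
with open("locust_results_stats.csv", newline="") as f:
    for row in csv.DictReader(f):
        name = row["Name"]
        avg_ms = float(row["Average Response Time"])
        if avg_ms > 100:  # arbitrary 100 ms threshold
            print(f"{name} is slow: {avg_ms:.0f} ms on average")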

With headless execution, Locust becomes a powerful tool for automated performance testing to optimize APIs and identify bottlenecks.
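
If you run the same headless setup often, for example in a CI pipeline, you can keep the options out of the command line: Locust reads defaults from a locust.conf file in the working directory. A minimal sketch mirroring the flags above (option names match the CLI flags without the leading dashes):

locust.conf
host = http://127.0.0.1:5000
users = 10
spawn-rate = 2
run-time = 1m
headless = true

With this file in place, simply running locust from the project directory picks up the same settings.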

Step 7 — Custom load shapes in Locust

So far, you’ve been using simple configurations to define how users spawn in Locust. Locust supports linear ramp-up by default using the --users and --spawn-rate options.

However, in real-world scenarios, traffic patterns are often more dynamic. Some applications experience gradual increases, while others have spiky bursts of traffic.

Locust provides custom load shapes to better simulate such behavior, allowing you to control precisely how users are introduced over time.

Custom load shapes in Locust are defined by creating a class that inherits from LoadTestShape. This class defines how many users should be active at a given time and how long each test phase should last.

Instead of a steady increase in users, a custom load shape allows for more complex scenarios, such as gradual ramp-ups, sudden spikes, and periodic dips.

To create a custom load shape, modify your locustfile.py to include a LoadTestShape class.

locustfile.py
from locust import HttpUser, task, between, LoadTestShape

class CustomLoadShape(LoadTestShape):
    """
    A load test shape that ramps up and down in waves.
    """

    stages = [
        {"duration": 30, "users": 10, "spawn_rate": 2},  # Start with 10 users
        {"duration": 60, "users": 50, "spawn_rate": 5},  # Ramp up to 50 users
        {"duration": 30, "users": 20, "spawn_rate": 2},  # Drop to 20 users
        {"duration": 40, "users": 80, "spawn_rate": 10},  # Spike to 80 users
        {"duration": 20, "users": 30, "spawn_rate": 3},  # Drop to 30 users
        {"duration": 30, "users": 0, "spawn_rate": 5},   # Gradual shutdown
    ]

    def tick(self):
        """Determines how many users should be active at a given time."""
        run_time = self.get_run_time()

        for stage in self.stages:
            if run_time < stage["duration"]:
                return stage["users"], stage["spawn_rate"]
            run_time -= stage["duration"]

        return None  # Stop the test once all stages complete

class APIUser(HttpUser):
    wait_time = between(1, 3)

    @task
    def get_home(self):
        self.client.get("/")

The CustomLoadShape class defines a series of stages, each lasting for a specific duration with a target number of users and a spawn rate.

The tick method determines how many users should run at any given moment based on the elapsed time.

Following the defined pattern, Locust will automatically adjust the number of users as the test progresses.

  1. The test starts with 10 users spawning at a rate of 2 users per second for 30 seconds.
  2. It ramps up to 50 users over 60 seconds with a spawn rate of 5 users per second.
  3. After 60 seconds, the number of users drops to 20 for 30 seconds.
  4. A sudden traffic spike increases the users to 80 over 40 seconds.
  5. After the spike, the test reduces the users to 30 over 20 seconds.
  6. Finally, the test enters a gradual shutdown phase, reducing users to zero in 30 seconds.

Once your locustfile.py is updated, start the test in headless mode. The stages add up to 210 seconds (about three and a half minutes), and the shape stops the test automatically once the final stage completes, so you don't need a --run-time flag:

 
locust --host=http://127.0.0.1:5000 --headless

The test will dynamically adjust the number of active users according to the defined load shape.

With this approach, your Locust tests will closely mimic real-world traffic, providing deeper insights into how your system performs under different load conditions.
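
Custom shapes also combine naturally with the options from the previous step; for example, you can capture the wave pattern to CSV for later analysis:

 
locust --host=http://127.0.0.1:5000 --headless --csv=wave_results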

Final thoughts

This guide taught you how to use Locust for load testing, from setting up a test API to simulating various user behaviors, validating responses, running tests in headless mode, and defining custom load shapes.

These skills allow you to measure and improve your API's performance under different load conditions.

To continue learning and exploring more advanced Locust features, check out the official Locust documentation.

It covers topics like distributed testing, authentication handling, and more complex user behaviors that can help refine your performance testing strategies.

Licensed under CC-BY-NC-SA

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
