Gunicorn (short for Green Unicorn) is a high-performance WSGI server designed for running Python web applications. It is widely used in production environments due to its simplicity, compatibility with various frameworks, and efficient process management model.
This article will explore its core features, configuration options, worker models, and best practices for deploying Gunicorn in production.
Let's get started!
Prerequisites
Before getting started with Gunicorn, make sure your system has the following:
- A recent version of Python 3 (this guide uses 3.13.2)
- pip, Python's package manager (included with Python 3.x)
- Experience with a Python web framework like Flask for testing Gunicorn
Why use Gunicorn?
As mentioned, Gunicorn is a WSGI server designed to efficiently and reliably deploy Python web applications. It uses a pre-fork worker model to handle each request separately for better stability and performance.
Gunicorn also enhances request handling by supporting both synchronous and asynchronous workers while managing those workers automatically. This prevents the entire application from crashing and ensures smooth operation, even under heavy load.
The following are some of the other key benefits:
- Handles requests separately for stability and performance
- Supports sync and async workers (Gevent, Eventlet, gthread)
- Manages workers automatically (timeouts, restarts, scaling)
- Works with Django, Flask, FastAPI, and any WSGI-compatible framework
- Highly configurable via Python config files, CLI options, and hooks
- Production-ready and built to handle real-world traffic
- Integrates with Nginx, HAProxy, AWS ALB for scalability
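Gunicorn can serve Django, Flask, and other frameworks because it only needs a WSGI callable. As a minimal sketch of that interface, with no framework at all (the name application and the response body here are illustrative; the two-argument signature is the WSGI convention):

```python
# A minimal WSGI application -- the same callable interface that
# Flask's app object exposes to Gunicorn.
def application(environ, start_response):
    body = b"Hello, WSGI!"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]

# Call it directly, the way a WSGI server would:
captured = {}

def start_response(status, headers):
    captured["status"] = status
    captured["headers"] = headers

result = b"".join(application({"REQUEST_METHOD": "GET"}, start_response))
```

Saved as wsgi_app.py, this could be served with gunicorn wsgi_app:application, exactly like the Flask app below.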
Step 1 — Setting up the demo application (optional)
For this guide, you'll create a simple Flask application to demonstrate Gunicorn's capabilities. If you have your own WSGI application, feel free to use that instead.
Before running Gunicorn, create a new directory and navigate into it:
mkdir gunicorn-demo && cd gunicorn-demo
Create a new virtual environment and activate it:
python3 -m venv venv
source venv/bin/activate
Install Flask:
pip install flask
Now, create a new file called app.py with a basic Flask application:
from flask import Flask
app = Flask(__name__)
@app.route('/')
def hello():
return 'Hello, Gunicorn!'
@app.route('/healthcheck')
def health():
return 'OK', 200
if __name__ == '__main__':
app.run()
This simple application includes two routes:
- / - Returns a greeting message
- /healthcheck - A simple health check endpoint that returns "OK" with a 200 status code
Then, test the application using Flask's development server:
python app.py
You should see output like this:
* Serving Flask app 'app'
* Debug mode: off
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
* Running on http://127.0.0.1:5000
Press CTRL+C to quit
You can see in the output that Flask's built-in development server is running on http://127.0.0.1:5000.
But pay attention to the warning—Flask explicitly tells you this server isn’t meant for production.
This is where Gunicorn steps in as your production-ready solution, as it is built for performance and makes your app fast, stable, and scalable.
Before setting up Gunicorn, visit http://127.0.0.1:5000 in your browser or test it using curl:
curl http://127.0.0.1:5000
You should see:
Hello, Gunicorn!
This basic example is enough to demonstrate the core concepts. In the next section, you'll install Gunicorn and configure it to serve this application.
Step 2 — Getting started with Gunicorn
By now, you've seen that Flask’s built-in development server is not meant for production—it’s single-threaded, lacks performance optimizations, and won’t handle high traffic efficiently. This is where Gunicorn saves the day.
Gunicorn is a production-ready WSGI server that efficiently handles multiple requests using worker processes. Unlike Flask’s built-in server, Gunicorn scales smoothly, keeping your application stable and performant under real-world traffic.
Gunicorn can be installed via pip. Enter the following command to install it:
pip install gunicorn
Verify the installation by checking the version:
gunicorn --version
Expected output:
gunicorn (version 23.0.0)
Now that Gunicorn is installed, you can use it to serve your Flask application the right way—with better performance and concurrency.
Run the following command:
gunicorn -w 3 -b 127.0.0.1:8000 app:app
Here is the breakdown of the command:
- gunicorn: Starts Gunicorn instead of Flask's built-in server
- -w 3: Runs 3 worker processes, allowing the app to handle multiple requests simultaneously
- -b 127.0.0.1:8000: Binds the application to 127.0.0.1 on port 8000
- app:app: Targets the app instance inside app.py
When you run the command, you should see output similar to this:
[2025-02-17 08:58:50 +0200] [9799] [INFO] Starting gunicorn 23.0.0
[2025-02-17 08:58:50 +0200] [9799] [INFO] Listening at: http://127.0.0.1:8000 (9799)
[2025-02-17 08:58:50 +0200] [9799] [INFO] Using worker: sync
[2025-02-17 08:58:50 +0200] [9800] [INFO] Booting worker with pid: 9800
[2025-02-17 08:58:50 +0200] [9801] [INFO] Booting worker with pid: 9801
[2025-02-17 08:58:50 +0200] [9802] [INFO] Booting worker with pid: 9802
At this point, your Flask app is now running on a production-grade server.
Visit http://127.0.0.1:8000
to test the application:
curl http://127.0.0.1:8000
You should see:
Hello, Gunicorn!
This confirms that your application is now being served through Gunicorn instead of Flask's development server.
Step 3 — Using a configuration file for Gunicorn
So far, you've been running Gunicorn using command-line options. While this works for simple setups, using a configuration file makes managing and modifying settings easier, especially as your application scales.
Instead of manually specifying options every time you start Gunicorn, you can define them in a configuration file for a more organized and maintainable deployment.
Gunicorn configuration files are written in Python. The default name is gunicorn.conf.py, but any file passed with the -c flag will work. This guide will use a file named gunicorn_config.py; a Python-based configuration is flexible because settings are ordinary Python variables.
To simplify your setup, create a new file called gunicorn_config.py
in your project directory and add the following settings:
workers = 3 # Number of worker processes
bind = "127.0.0.1:8000" # Bind Gunicorn to localhost on port 8000
These are the same options you previously passed on the command line, now stored in a dedicated configuration file, which brings several advantages.
Your server configuration becomes version-controlled alongside your code, more maintainable through centralized settings, reusable across different deployments, and easier to understand with proper documentation.
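Because the configuration file is ordinary Python, settings can also be computed rather than hard-coded. As a hypothetical sketch (the GUNICORN_BIND and GUNICORN_WORKERS environment variable names are made up, not something Gunicorn reads itself):

```python
import os

# Hypothetical: read overrides from the environment, falling back to
# the same defaults used earlier in this guide.
bind = os.environ.get("GUNICORN_BIND", "127.0.0.1:8000")
workers = int(os.environ.get("GUNICORN_WORKERS", "3"))
```

This pattern lets the same config file serve development and production by changing environment variables instead of editing code.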
Now, instead of specifying options manually in the command line, start Gunicorn using the configuration file:
gunicorn -c gunicorn_config.py app:app
When you run this command, you should see output similar to the following, confirming that Gunicorn is running with the specified configuration:
[2025-02-17 09:17:37 +0200] [10255] [INFO] Starting gunicorn 23.0.0
[2025-02-17 09:17:37 +0200] [10255] [INFO] Listening at: http://127.0.0.1:8000 (10255)
[2025-02-17 09:17:37 +0200] [10255] [INFO] Using worker: sync
[2025-02-17 09:17:37 +0200] [10256] [INFO] Booting worker with pid: 10256
[2025-02-17 09:17:37 +0200] [10257] [INFO] Booting worker with pid: 10257
[2025-02-17 09:17:37 +0200] [10258] [INFO] Booting worker with pid: 10258
At this point, your Gunicorn setup is more structured and easier to maintain.
Step 4 — Optimizing Gunicorn with asynchronous workers
By default, Gunicorn uses synchronous workers (sync), which process one request per worker at a time. While this model is stable, it may struggle under high concurrency, especially when handling I/O-bound operations like database queries, API calls, or file I/O.
To improve performance, Gunicorn supports asynchronous worker classes, including:
- gevent: Uses greenlets (lightweight coroutines) for non-blocking execution
- eventlet: Similar to gevent, providing cooperative multitasking
- gthread: Uses multiple threads per worker (not truly asynchronous)
For this guide, you'll use Eventlet, a concurrency library well suited to I/O-bound, high-concurrency workloads.
To enable Eventlet, you need to install it in your virtual environment:
pip install eventlet
Instead of modifying the command line, update your gunicorn_config.py
file to use Eventlet.
# Worker Settings
workers = 3
worker_class = 'eventlet' # Use eventlet async workers
worker_connections = 1000 # Maximum concurrent connections per worker
# Server Settings
bind = "127.0.0.1:8000"
Once you've updated the configuration file, start Gunicorn as usual:
gunicorn -c gunicorn_config.py app:app
The output will confirm that Gunicorn is using eventlet workers:
[2025-02-17 09:47:45 +0000] [19406] [INFO] Starting gunicorn 23.0.0
[2025-02-17 09:47:45 +0000] [19406] [INFO] Listening at: http://127.0.0.1:8000 (19406)
[2025-02-17 09:47:45 +0000] [19406] [INFO] Using worker: eventlet
...
After switching to Eventlet, your application can serve many more concurrent connections and stay responsive under I/O-heavy load.
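If greenlet-based workers don't suit your application (for example, when a C extension doesn't cooperate with monkey-patching), the gthread worker mentioned earlier is a simpler alternative. A hypothetical configuration sketch; the thread count here is illustrative:

```python
# Hypothetical alternative: threaded workers instead of greenlets.
# Total concurrency is roughly workers * threads.
workers = 3
worker_class = "gthread"
threads = 4
```

With this sketch, up to 12 requests could be in flight at once across the three worker processes.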
Step 5 — Fine-tuning Gunicorn for maximum performance
Now that Gunicorn runs with asynchronous workers, the next step is fine-tuning its configuration for optimal performance. This includes setting the correct number of workers, effectively handling timeouts, keeping connections alive, and managing memory usage to prevent slowdowns.
Optimizing the number of workers
Gunicorn allows multiple workers to handle requests concurrently, which helps distribute the load efficiently.
However, setting the wrong number of workers can underutilize system resources or cause excessive CPU usage.
A widely accepted formula for determining the optimal number of workers is:
workers = 2 * (CPU cores) + 1
To check how many CPU cores are available, run the following command:
python -c "import multiprocessing; print(multiprocessing.cpu_count())"
For my system with 2 CPUs, the output shows:
2
Following the recommended formula for worker processes (2 × CPU cores + 1), we should use:
workers = 2 * 2 + 1 = 5
This means five worker processes would be optimal for this system.
Instead of hard-coding this value, it is more practical to have the configuration file compute the number of workers dynamically.
Modify gunicorn_config.py
as follows:
# Worker Settings
import multiprocessing
workers = 2 * multiprocessing.cpu_count() + 1 # Dynamically determine the optimal workers
worker_class = 'eventlet'
worker_connections = 2000
# Server Settings
bind = "127.0.0.1:8000"
With this change, Gunicorn will automatically adjust the number of workers based on the system's CPU cores, ensuring an efficient balance between performance and resource utilization.
Restart Gunicorn with the updated configuration:
gunicorn -c gunicorn_config.py app:app
We can see the five workers being created:
[2025-02-17 09:50:32 +0000] [19422] [INFO] Starting gunicorn 23.0.0
[2025-02-17 09:50:32 +0000] [19422] [INFO] Listening at: http://127.0.0.1:8000 (19422)
[2025-02-17 09:50:32 +0000] [19422] [INFO] Using worker: eventlet
[2025-02-17 09:50:32 +0000] [19423] [INFO] Booting worker with pid: 19423
[2025-02-17 09:50:32 +0000] [19424] [INFO] Booting worker with pid: 19424
[2025-02-17 09:50:32 +0000] [19425] [INFO] Booting worker with pid: 19425
[2025-02-17 09:50:32 +0000] [19426] [INFO] Booting worker with pid: 19426
[2025-02-17 09:50:32 +0000] [19427] [INFO] Booting worker with pid: 19427
As you can see, Gunicorn automatically created five worker processes based on my system's CPU cores. Your number of workers will differ depending on your system's CPU cores.
This approach ensures that Gunicorn scales appropriately with the hardware, preventing CPU bottlenecks while maximizing throughput.
Preventing stalled requests with timeout settings
Requests that take too long can block workers, slowing down the entire application. Proper timeout settings help mitigate this issue by killing unresponsive workers and ensuring that requests don’t hang indefinitely.
Update gunicorn_config.py
with the following timeout settings:
...
# Timeout Settings
timeout = 30 # Automatically restart workers if they take too long
graceful_timeout = 30 # Graceful shutdown for workers
The timeout
parameter determines how long a worker waits before being forcibly restarted, while graceful_timeout
allows Gunicorn to shut down a worker more gently, giving it time to complete any ongoing tasks.
Setting both values to 30 seconds helps prevent unnecessary terminations while ensuring workers don't hang indefinitely.
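Under the hood, the difference between the two timeouts comes down to Unix signals: the master first asks a worker to exit, and only force-kills it after graceful_timeout expires. A small self-contained sketch of that ask-first mechanism (this is the general signal pattern, not Gunicorn's actual worker code):

```python
import os
import signal

shutting_down = False

def handle_term(signum, frame):
    # A real worker would stop accepting new requests here and
    # drain in-flight ones before exiting.
    global shutting_down
    shutting_down = True

# Ask politely first; a hard kill (SIGKILL) cannot be caught at all.
signal.signal(signal.SIGTERM, handle_term)
os.kill(os.getpid(), signal.SIGTERM)  # simulate the master's signal
```

Because SIGKILL cannot be handled, work is only saved when the graceful path gets a chance to run first.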
Reusing connections with keep-alive
For most applications, keeping connections open for a short period can reduce the overhead of re-establishing them. This is particularly useful when handling multiple rapid requests from the same clients.
Modify gunicorn_config.py
to include keep-alive settings:
...
# Keep-Alive Settings
keepalive = 2 # Keep connections alive for 2s
This small adjustment helps improve performance by allowing clients to reuse existing connections instead of opening new ones repeatedly.
Managing memory & preventing worker restarts
As Gunicorn workers handle requests, memory usage can gradually increase due to inefficiencies in garbage collection or memory leaks in the application. Restarting workers periodically helps keep memory usage under control and ensures long-running applications stay stable.
You can configure Gunicorn to restart workers after processing a certain number of requests, preventing memory buildup over time.
To implement this, update gunicorn_config.py
as follows:
...
# Worker Restart Settings
max_requests = 1000 # Restart workers after processing 1000 requests
max_requests_jitter = 50 # Add randomness to avoid mass restarts
The max_requests
setting ensures that workers don’t run indefinitely, forcing them to restart after handling 1,000 requests. The max_requests_jitter
setting introduces randomness, preventing all workers from restarting simultaneously and causing temporary downtime.
With these changes, Gunicorn can maintain long-term stability, preventing slowdowns and potential memory leaks without disrupting ongoing requests.
Implementing these optimizations allows your Gunicorn server to be more resilient, efficient, and capable of handling high-concurrency workloads.
Step 6 — Managing logs in Gunicorn
Now that Gunicorn is running efficiently with optimized worker settings and performance configurations, the next step is to enable logging for monitoring and debugging. Logs provide crucial insights into application performance, request handling, and potential errors.
By default, Gunicorn outputs logs to stdout (standard output), but configuring logging properly ensures better debugging and system monitoring.
Gunicorn supports two types of logs:
- Access logs: Record all incoming HTTP requests, including client IP, request method, and response time.
- Error logs: Capture critical server errors, including worker crashes and unexpected exceptions.
To enable these logs, update gunicorn_config.py
:
...
# Logging Settings
accesslog = "gunicorn_access.log" # Log HTTP requests to a file
errorlog = "gunicorn_error.log" # Log errors to a file
loglevel = "info" # Set log verbosity (debug, info, warning, error, critical)
This configuration saves logs to separate files, making it easier to analyze request patterns and diagnose server issues.
Restart Gunicorn with the new configuration:
gunicorn -c gunicorn_config.py app:app
Once the server is running, you won’t see logs in the console anymore. Instead, Gunicorn will write them to log files inside the project directory.
To verify that logging is working, send a request to the application using:
curl http://127.0.0.1:8000/
To view the logs, check the access log file:
cat gunicorn_access.log
127.0.0.1 - - [17/Feb/2025:09:59:22 +0000] "GET / HTTP/1.1" 200 16 "-" "curl/8.5.0"
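The access-log line above follows the familiar Apache/Nginx combined log style, so it is easy to post-process. A hypothetical parsing sketch (the regex covers only the common fields):

```python
import re

# Pull the client IP, timestamp, request line, status, and size out of
# a combined-format access-log entry.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) - \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]+" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

line = '127.0.0.1 - - [17/Feb/2025:09:59:22 +0000] "GET / HTTP/1.1" 200 16 "-" "curl/8.5.0"'
m = LOG_PATTERN.match(line)
```

A quick script built on this could, for example, count status codes per path across the whole log file.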
If Gunicorn is managed by systemd in production (something we will set up in Step 7), logs can be redirected to the system journal instead of being written to separate log files. This can be configured as follows:
accesslog = "-" # Log to stdout
errorlog = "-" # Log errors to stdout
Redirecting logs to stdout is particularly useful for centralized log management, especially in cloud environments or containerized deployments, where logs are aggregated and monitored using external tools.
Keep this in mind, as we will revisit this approach when integrating Gunicorn with systemd.
Step 7 — Managing Gunicorn as a systemd service
Now that Gunicorn is optimized and logging is properly configured, it’s time to ensure that it runs reliably in production. Running Gunicorn manually in the terminal is fine for development, but in a real-world deployment, it should start automatically at boot, restart on failure, and run as a background service independent of user sessions. To achieve this, Gunicorn will be configured as a systemd service.
Gunicorn needs a dedicated systemd service file to define how it should start, restart, and integrate with the system logs. This file should be created inside /etc/systemd/system/
to ensure system-wide control. Open a terminal and create the service file using a text editor:
sudo nano /etc/systemd/system/gunicorn.service
Inside the file, define the Gunicorn service using the configuration below. Ensure that you replace the highlighted sections with your actual username and the correct paths to your Gunicorn project and virtual environment. Double-check that the paths align with your system setup to avoid errors:
[Unit]
Description=Gunicorn daemon for Flask app
After=network.target
[Service]
User=youruser
Group=www-data
WorkingDirectory=/home/youruser/gunicorn-demo
ExecStart=/home/youruser/gunicorn-demo/venv/bin/gunicorn --config gunicorn_config.py app:app
Restart=always
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
This configuration ensures that Gunicorn starts only after the network is available, runs under the correct user, and logs output directly to the system journal. Additionally, if the process crashes, systemd will restart it automatically.
Save and exit the file, then reload systemd so it recognizes the new service:
sudo systemctl daemon-reload
Once the service is set up, Gunicorn can be started using:
sudo systemctl start gunicorn
To confirm that it is running, check the status:
sudo systemctl status gunicorn
● gunicorn.service - Gunicorn daemon for Flask app
Loaded: loaded (/etc/systemd/system/gunicorn.service; disabled; preset: enabled)
Active: active (running) since Mon 2025-02-17 10:09:03 UTC; 14ms ago
Main PID: 19677 ((gunicorn))
Tasks: 1 (limit: 4543)
Memory: 636.0K (peak: 636.0K)
CPU: 6ms
CGroup: /system.slice/gunicorn.service
└─19677 "(gunicorn)"
Feb 17 10:09:03 testing systemd[1]: gunicorn.service: Scheduled restart job, restart counter is at 1.
Feb 17 10:09:03 testing systemd[1]: Started gunicorn.service - Gunicorn daemon for Flask app.
If everything is working correctly, Gunicorn will be listed as active. To make sure it starts automatically when the server reboots, enable it as a persistent service:
sudo systemctl enable gunicorn
If any configuration changes are made to Gunicorn, such as modifying worker settings, the service must be restarted to apply them:
sudo systemctl restart gunicorn
Managing logs with systemd
Since Gunicorn logs are now managed by systemd’s journal, there is no need to rely on separate log files. Logs can be viewed directly from the system journal using:
sudo journalctl -u gunicorn --no-pager
For real-time log monitoring, use:
sudo journalctl -u gunicorn -f
If logs become too large, older entries can be removed while keeping recent logs intact:
sudo journalctl --vacuum-time=7d
To ensure Gunicorn correctly integrates with systemd’s logging, update gunicorn_config.py
so that logs are sent to stdout instead of files:
...
# Logging Settings
accesslog = "-" # Send access logs to stdout
errorlog = "-" # Send error logs to stdout
loglevel = "info" # Adjust verbosity level
After applying these changes, restart Gunicorn to load the new configuration:
sudo systemctl restart gunicorn
Next, send four requests to the application to verify that it's running correctly:
for i in {1..4}; do curl -s http://127.0.0.1:8000; echo; done
Now, check the logs to confirm that Gunicorn is processing requests as expected:
sudo journalctl -u gunicorn --no-pager
You should see log entries similar to the following, indicating that Gunicorn has started workers and is handling incoming requests:
Feb 17 10:14:04 testing gunicorn[19879]: [2025-02-17 10:14:04 +0000] [19879] [INFO] Booting worker with pid: 19879
Feb 17 10:14:04 testing gunicorn[19880]: [2025-02-17 10:14:04 +0000] [19880] [INFO] Booting worker with pid: 19880
Feb 17 10:14:04 testing gunicorn[19881]: [2025-02-17 10:14:04 +0000] [19881] [INFO] Booting worker with pid: 19881
Feb 17 10:15:11 testing gunicorn[19879]: 127.0.0.1 - - [17/Feb/2025:10:15:11 +0000] "GET / HTTP/1.1" 200 16 "-" "curl/8.5.0"
Feb 17 10:16:15 testing gunicorn[19879]: 127.0.0.1 - - [17/Feb/2025:10:16:15 +0000] "GET / HTTP/1.1" 200 16 "-" "curl/8.5.0"
Feb 17 10:16:15 testing gunicorn[19879]: 127.0.0.1 - - [17/Feb/2025:10:16:15 +0000] "GET / HTTP/1.1" 200 16 "-" "curl/8.5.0"
Feb 17 10:16:15 testing gunicorn[19878]: 127.0.0.1 - - [17/Feb/2025:10:16:15 +0000] "GET / HTTP/1.1" 200 16 "-" "curl/8.5.0"
These entries confirm that Gunicorn has successfully restarted, booted its worker processes, and logged the requests made using curl
.
Step 8 — Deploying Gunicorn for production
Now that Gunicorn is properly configured and managed as a systemd service, the final step in deploying it for production is to ensure security, stability, and scalability.
Gunicorn should not be exposed directly to the internet in a production environment. Instead, it should be deployed behind a reverse proxy like Nginx, which provides additional performance benefits, security enhancements, and load balancing.
While Gunicorn is an excellent WSGI server, it is not designed to handle direct client traffic, slow clients, or large numbers of concurrent connections efficiently. Nginx serves as an intermediary that:
- Handles static files efficiently instead of passing them to Gunicorn.
- Offloads SSL/TLS encryption for secure HTTPS connections.
- Provides better request buffering and rate limiting to protect against slow client attacks.
- Supports load balancing across multiple Gunicorn instances.
- Improves performance by handling concurrent connections more efficiently.
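As a hypothetical illustration of the load-balancing point, spreading traffic across several Gunicorn instances only takes an upstream block (the second port here is made up; each instance would need its own bind address):

```nginx
upstream gunicorn_pool {
    least_conn;                # route each request to the least-busy instance
    server 127.0.0.1:8000;
    server 127.0.0.1:8001;
}

server {
    listen 80;
    location / {
        proxy_pass http://gunicorn_pool;
    }
}
```

For the single-instance setup in this guide, the simpler configuration below is all you need.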
If Nginx is not already installed on your server, install it using the following command on Ubuntu:
sudo apt update && sudo apt install nginx -y
After installation, start Nginx and enable it to run at boot:
sudo systemctl start nginx
sudo systemctl enable nginx
Check if Nginx is running:
sudo systemctl status nginx
If it is active and running, allow HTTP traffic through the firewall:
sudo ufw allow 'Nginx HTTP'
To confirm that Nginx is installed correctly, visit your server’s IP address in a browser. You should see the default Nginx welcome page.
To forward traffic from Nginx to Gunicorn, create a new Nginx configuration file:
sudo nano /etc/nginx/sites-available/gunicorn
Add the following configuration, ensuring that the static file path matches your project structure:
server {
listen 80;
server_name your_domain_or_server_ip;
location / {
proxy_pass http://127.0.0.1:8000;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}
location /static/ {
alias /home/youruser/gunicorn-demo/static/;
}
error_page 500 502 503 504 /50x.html;
location = /50x.html {
root /usr/share/nginx/html;
}
}
This configuration:
- Forwards all traffic from your_domain_or_server_ip to Gunicorn at 127.0.0.1:8000.
- Ensures static files are served directly by Nginx for better performance.
- Provides an error page handler for better user experience during failures.
Save and exit the file. Then, create a symbolic link to enable the configuration:
sudo ln -s /etc/nginx/sites-available/gunicorn /etc/nginx/sites-enabled/
Test the configuration for syntax errors:
sudo nginx -t
If the test is successful, restart Nginx to apply the changes:
sudo systemctl restart nginx
Gunicorn must be adjusted to ensure it works correctly behind Nginx.
Open gunicorn_config.py and modify the bind setting:
# Gunicorn Configuration for Nginx
bind = "127.0.0.1:8000" # Ensure it matches the Nginx proxy_pass setting
forwarded_allow_ips = "127.0.0.1" # Trust X-Forwarded-* headers from the local Nginx
Leave proxy_protocol at its default (disabled): the plain proxy_pass setup above speaks ordinary HTTP, and enabling the PROXY protocol on only one side would break every request.
Restart Gunicorn so that the new settings take effect:
sudo systemctl restart gunicorn
Now, Gunicorn is running behind Nginx. To test the setup, visit your server’s IP address or domain in a browser:
curl http://your_domain_or_server_ip
You should receive:
Hello, Gunicorn!
This confirms that Nginx is successfully forwarding requests to Gunicorn.
If something is not working, check the logs for errors:
sudo journalctl -u nginx --no-pager
sudo journalctl -u gunicorn --no-pager
At this stage, Gunicorn is fully integrated into a production-ready environment. Nginx now handles incoming requests, forwards them to Gunicorn, and the Flask application processes them efficiently.
With Nginx acting as a reverse proxy, your application benefits from improved security, better performance, and efficient load handling.
Final thoughts
This article explored how to use Gunicorn, optimize it, and deploy it fully with systemd while securing it behind Nginx to make your application production-ready.
From here, the next steps involve further hardening and refining your deployment. Securing your Nginx setup with SSL/TLS certificates using Let’s Encrypt will enable HTTPS, improving security and compliance.
Thanks for reading!