Gunicorn (short for Green Unicorn) is a high-performance WSGI server designed for running Python web applications. It is widely used in production environments due to its simplicity, compatibility with various frameworks, and efficient process management model.
This article will explore its core features, configuration options, worker models, and best practices for deploying Gunicorn in production.
Let's get started!
Prerequisites
Before getting started with Gunicorn, make sure your system has the following:
- A recent version of Python (3.13.2 or higher)
- pip, Python's package manager (included with Python 3.x)
- Experience with a Python web framework like Flask for testing Gunicorn
Why use Gunicorn?
As mentioned, Gunicorn is a WSGI server designed to deploy Python web applications efficiently and reliably. It uses a pre-fork worker model, handling requests in separate worker processes for better stability and performance.
Gunicorn also enhances request handling by supporting both synchronous and asynchronous workers and managing them automatically. If a worker fails or hangs, it is restarted without bringing down the entire application, ensuring smooth operation even under heavy load.
The following are some of the other key benefits:
- Handles requests separately for stability and performance
- Supports sync and async workers (Gevent, Eventlet, gthread)
- Manages workers automatically (timeouts, restarts, scaling)
- Works with Django, Flask, FastAPI, and any WSGI-compatible framework
- Highly configurable via Python config files, CLI options, and hooks
- Production-ready and built to handle real-world traffic
- Integrates with Nginx, HAProxy, and AWS ALB for scalability
Step 1 — Setting up the demo application (optional)
For this guide, you'll create a simple Flask application to demonstrate Gunicorn's capabilities. If you have your own WSGI application, feel free to use that instead.
Before running Gunicorn, create a new directory and navigate into it:
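```bash
# The directory name is only an example; use any name you like
mkdir flask-gunicorn-demo && cd flask-gunicorn-demo
```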
Create a new virtual environment and activate it:
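```bash
# Create and activate a virtual environment named venv
python3 -m venv venv
source venv/bin/activate
```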
Install Flask:
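```bash
pip install flask
```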
Now, create a new file called app.py with a basic Flask application:
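```python
from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    # The greeting text here is just an example
    return "Hello from Flask!"

@app.route("/healthcheck")
def healthcheck():
    # Return "OK" with an explicit 200 status code
    return "OK", 200

if __name__ == "__main__":
    app.run()
```

This is a minimal sketch; the exact greeting text is up to you. What matters is the app instance name (app) and the two routes.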
This simple application includes two routes:
- /: Returns a greeting message
- /healthcheck: A simple health check endpoint that returns "OK" with a 200 status code
Then, test the application using Flask's development server:
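```bash
# Runs Flask's built-in development server (python app.py also works)
flask run
```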
You should see output like this:
You can see in the output that Flask's built-in development server is running on http://127.0.0.1:5000.
But pay attention to the warning—Flask explicitly tells you this server isn’t meant for production.
This is where Gunicorn steps in as your production-ready solution, as it is built for performance and makes your app fast, stable, and scalable.
Before setting up Gunicorn, visit http://127.0.0.1:5000 in your browser or test it using curl:
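```bash
curl http://127.0.0.1:5000
```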
You should see:
While this basic example demonstrates the core concepts, it is still running on Flask's development server. In the next section, you'll install Gunicorn and configure it to serve this application.
Step 2 — Getting started with Gunicorn
By now, you've seen that Flask’s built-in development server is not meant for production—it’s single-threaded, lacks performance optimizations, and won’t handle high traffic efficiently. This is where Gunicorn saves the day.
Gunicorn is a production-ready WSGI server that efficiently handles multiple requests using worker processes. Unlike Flask’s built-in server, Gunicorn scales smoothly, keeping your application stable and performant under real-world traffic.
Gunicorn can be installed via pip. Enter the following command to install Gunicorn:
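```bash
pip install gunicorn
```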
Verify the installation by checking the version:
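```bash
gunicorn --version
```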
Expected output:
Now that Gunicorn is installed, you can use it to serve your Flask application the right way—with better performance and concurrency.
Run the following command:
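```bash
gunicorn -w 3 -b 127.0.0.1:8000 app:app
```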
Here is the breakdown of the command:
- gunicorn: Starts Gunicorn instead of Flask's built-in server
- -w 3: Runs 3 worker processes, allowing the app to handle multiple requests simultaneously
- -b 127.0.0.1:8000: Binds the application to 127.0.0.1 on port 8000
- app:app: Targets the app instance inside app.py
When you run the command, you should see output similar to this:
At this point, your Flask app is now running on a production-grade server.
Visit http://127.0.0.1:8000 to test the application:
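```bash
curl http://127.0.0.1:8000
```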
You should see:
This confirms that your application is now being served through Gunicorn instead of Flask's development server.
Step 3 — Using a configuration file for Gunicorn
So far, you've been running Gunicorn using command-line options. While this works for simple setups, using a configuration file makes managing and modifying settings easier, especially as your application scales.
Instead of manually specifying options every time you start Gunicorn, you can define them in a configuration file for a more organized and maintainable deployment.
Gunicorn is configured through a Python file: by default it looks for gunicorn.conf.py in the working directory, but you can point it at any file with the -c flag. This guide will use a dedicated Python configuration file, as it's flexible and keeps all settings in one place.
To simplify your setup, create a new file called gunicorn_config.py in your project directory and add the following settings:
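```python
# gunicorn_config.py
bind = "127.0.0.1:8000"  # same as -b 127.0.0.1:8000
workers = 3              # same as -w 3
```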
These are the same options you previously passed in the command line, but now they are stored in a dedicated configuration file. Storing them in a dedicated configuration file brings several advantages.
Your server configuration becomes version-controlled alongside your code, more maintainable through centralized settings, reusable across different deployments, and easier to understand with proper documentation.
Now, instead of specifying options manually in the command line, start Gunicorn using the configuration file:
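```bash
gunicorn -c gunicorn_config.py app:app
```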
When you run this command, you should see output similar to the following, confirming that Gunicorn is running with the specified configuration:
At this point, your Gunicorn setup is more structured and easier to maintain.
Step 4 — Optimizing Gunicorn with asynchronous workers
By default, Gunicorn uses synchronous workers (sync), which process one request per worker at a time. While this model is stable, it may struggle under high concurrency, especially when handling I/O-bound operations like database queries, API calls, or file I/O.
To improve performance, Gunicorn supports asynchronous worker classes, including:
- gevent: Uses greenlets (lightweight coroutines) for non-blocking execution
- eventlet: Similar to gevent, providing cooperative multitasking
- gthread: Uses multiple threads per worker (not truly asynchronous)
For this guide, you'll use Eventlet, a library well suited to handling high-concurrency workloads efficiently.
To enable Eventlet, you need to install it in your virtual environment:
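```bash
pip install eventlet
```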
Instead of modifying the command line, update your gunicorn_config.py file to use Eventlet.
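Add (or change) the worker_class setting:

```python
worker_class = "eventlet"  # switch from the default sync workers to eventlet
```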
Once you've updated the configuration file, start Gunicorn as usual:
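```bash
gunicorn -c gunicorn_config.py app:app
```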
The output will confirm that Gunicorn is using eventlet workers:
After switching to Eventlet, your application will have faster response times, better concurrency handling, and more stable performance under load.
Step 5 — Fine-tuning Gunicorn for maximum performance
Now that Gunicorn runs with asynchronous workers, the next step is fine-tuning its configuration for optimal performance. This includes setting the correct number of workers, effectively handling timeouts, keeping connections alive, and managing memory usage to prevent slowdowns.
Optimizing the number of workers
Gunicorn allows multiple workers to handle requests concurrently, which helps distribute the load efficiently.
However, setting the wrong number of workers can underutilize system resources or cause excessive CPU usage.
A widely accepted formula for determining the optimal number of workers is:
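workers = (2 × CPU cores) + 1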
To check how many CPU cores are available, run the following command:
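```bash
# Prints the number of available CPU cores (Linux)
nproc
```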
For my system with 2 CPUs, the output shows:
Following the recommended formula for worker processes (2 × CPU cores + 1), we should use:
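(2 × 2) + 1 = 5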
This means five worker processes would be optimal for this system.
Instead of manually setting this value, updating the configuration file to dynamically determine the number of workers is more practical.
Modify gunicorn_config.py as follows:
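```python
import multiprocessing

workers = multiprocessing.cpu_count() * 2 + 1  # (2 x CPU cores) + 1
```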
With this change, Gunicorn will automatically adjust the number of workers based on the system's CPU cores, ensuring an efficient balance between performance and resource utilization.
Restart Gunicorn with the updated configuration:
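```bash
gunicorn -c gunicorn_config.py app:app
```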
We can see the five workers being created:
As you can see, Gunicorn automatically created five worker processes based on my system's CPU cores. Your number of workers will differ depending on your system's CPU cores.
This approach ensures that Gunicorn scales appropriately with the hardware, preventing CPU bottlenecks while maximizing throughput.
Preventing stalled requests with timeout settings
Requests that take too long can block workers, slowing down the entire application. Proper timeout settings help mitigate this issue by killing unresponsive workers and ensuring that requests don’t hang indefinitely.
Update gunicorn_config.py with the following timeout settings:
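```python
timeout = 30           # forcibly restart workers that are silent for more than 30 seconds
graceful_timeout = 30  # give workers 30 seconds to finish in-flight requests before shutdown
```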
The timeout parameter determines how long a worker waits before being forcibly restarted, while graceful_timeout allows Gunicorn to shut down a worker more gently, giving it time to complete any ongoing tasks.
Setting both values to 30 seconds helps prevent unnecessary terminations while ensuring workers don't hang indefinitely.
Reusing connections with keep-alive
For most applications, keeping connections open for a short period can reduce the overhead of re-establishing them. This is particularly useful when handling multiple rapid requests from the same clients.
Modify gunicorn_config.py to include keep-alive settings:
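```python
keepalive = 5  # seconds to wait for the next request on a keep-alive connection (value is illustrative)
```

The value of 5 seconds is just an example; tune it to your traffic pattern.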
This small adjustment helps improve performance by allowing clients to reuse existing connections instead of opening new ones repeatedly.
Managing memory & preventing worker restarts
As Gunicorn workers handle requests, memory usage can gradually increase due to inefficiencies in garbage collection or memory leaks in the application. Restarting workers periodically helps keep memory usage under control and ensures long-running applications stay stable.
You can configure Gunicorn to restart workers after processing a certain number of requests, preventing memory buildup over time.
To implement this, update gunicorn_config.py as follows:
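```python
max_requests = 1000       # restart a worker after it has handled 1,000 requests
max_requests_jitter = 50  # add randomness so workers don't all restart at once (value is illustrative)
```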
The max_requests setting ensures that workers don’t run indefinitely, forcing them to restart after handling 1,000 requests. The max_requests_jitter setting introduces randomness, preventing all workers from restarting simultaneously and causing temporary downtime.
With these changes, Gunicorn can maintain long-term stability, preventing slowdowns and potential memory leaks without disrupting ongoing requests.
Implementing these optimizations allows your Gunicorn server to be more resilient, efficient, and capable of handling high-concurrency workloads.
Step 6 — Managing logs in Gunicorn
Now that Gunicorn is running efficiently with optimized worker settings and performance configurations, the next step is to enable logging for monitoring and debugging. Logs provide crucial insights into application performance, request handling, and potential errors.
By default, Gunicorn outputs logs to stdout (standard output), but configuring logging properly ensures better debugging and system monitoring.
Gunicorn supports two types of logs:
- Access logs: Record all incoming HTTP requests, including client IP, request method, and response time.
- Error logs: Capture critical server errors, including worker crashes and unexpected exceptions.
To enable these logs, update gunicorn_config.py:
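```python
accesslog = "access.log"  # HTTP request logs (file name is an example)
errorlog = "error.log"    # server and worker error logs (file name is an example)
loglevel = "info"
```

The file names above are only examples; use whatever paths suit your deployment.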
This configuration saves logs to separate files, making it easier to analyze request patterns and diagnose server issues.
Restart Gunicorn with the new configuration:
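```bash
gunicorn -c gunicorn_config.py app:app
```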
Once the server is running, you won’t see logs in the console anymore. Instead, Gunicorn will write them to log files inside the project directory.
To verify that logging is working, send a request to the application using:
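```bash
curl http://127.0.0.1:8000
```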
To view the logs, check the access log file:
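```bash
# access.log is the example file name used in gunicorn_config.py above
cat access.log
```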
If Gunicorn is managed by systemd in production (something we will set up in the next step), logs can be redirected to the system journal instead of being written to separate log files. This can be configured as follows:
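```python
accesslog = "-"  # "-" sends access logs to stdout
errorlog = "-"   # "-" sends error logs to stderr
```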
Redirecting logs to stdout is particularly useful for centralized log management, especially in cloud environments or containerized deployments, where logs are aggregated and monitored using external tools.
Keep this in mind, as we will revisit this approach when integrating Gunicorn with systemd.
Step 7 — Managing Gunicorn as a systemd service
Now that Gunicorn is optimized and logging is properly configured, it’s time to ensure that it runs reliably in production. Running Gunicorn manually in the terminal is fine for development, but in a real-world deployment, it should start automatically at boot, restart on failure, and run as a background service independent of user sessions. To achieve this, Gunicorn will be configured as a systemd service.
Gunicorn needs a dedicated systemd service file to define how it should start, restart, and integrate with the system logs. This file should be created inside /etc/systemd/system/ to ensure system-wide control. Open a terminal and create the service file using a text editor:
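```bash
sudo nano /etc/systemd/system/gunicorn.service
```

The service name gunicorn.service is just an example; the systemctl commands later in this guide assume this name.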
Inside the file, define the Gunicorn service using the configuration below. Ensure that you replace the highlighted sections with your actual username and the correct paths to your Gunicorn project and virtual environment. Double-check that the paths align with your system setup to avoid errors:
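```ini
[Unit]
Description=Gunicorn instance serving the Flask demo app
After=network.target

[Service]
# Replace the user, group, and paths below with the ones for your system
User=your_username
Group=your_username
WorkingDirectory=/home/your_username/flask-gunicorn-demo
ExecStart=/home/your_username/flask-gunicorn-demo/venv/bin/gunicorn -c gunicorn_config.py app:app
Restart=always
StandardOutput=journal
StandardError=journal

[Install]
WantedBy=multi-user.target
```

This is a sketch: the user, group, and directory paths are placeholders that must match your own setup.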
This configuration ensures that Gunicorn starts only after the network is available, runs under the correct user, and logs output directly to the system journal. Additionally, if the process crashes, systemd will restart it automatically.
Save and exit the file, then reload systemd so it recognizes the new service:
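```bash
sudo systemctl daemon-reload
```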
Once the service is set up, Gunicorn can be started using:
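```bash
sudo systemctl start gunicorn
```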
To confirm that it is running, check the status:
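```bash
sudo systemctl status gunicorn
```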
If everything is working correctly, Gunicorn will be listed as active. To make sure it starts automatically when the server reboots, enable it as a persistent service:
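```bash
sudo systemctl enable gunicorn
```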
If any configuration changes are made to Gunicorn, such as modifying worker settings, the service must be restarted to apply them:
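```bash
sudo systemctl restart gunicorn
```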
Managing logs with systemd
Since Gunicorn logs are now managed by systemd’s journal, there is no need to rely on separate log files. Logs can be viewed directly from the system journal using:
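```bash
sudo journalctl -u gunicorn
```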
For real-time log monitoring, use:
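```bash
sudo journalctl -u gunicorn -f
```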
If logs become too large, older entries can be removed while keeping recent logs intact:
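```bash
# Keep only the last 7 days of journal entries (the retention window is just an example)
sudo journalctl --vacuum-time=7d
```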
To ensure Gunicorn correctly integrates with systemd’s logging, update gunicorn_config.py so that logs are sent to stdout instead of files:
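```python
accesslog = "-"  # stdout, captured by the systemd journal
errorlog = "-"   # stderr, captured by the systemd journal
```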
After applying these changes, restart Gunicorn to load the new configuration:
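```bash
sudo systemctl restart gunicorn
```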
Next, send four requests to the application to verify that it's running correctly:
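```bash
for i in 1 2 3 4; do curl http://127.0.0.1:8000; done
```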
Now, check the logs to confirm that Gunicorn is processing requests as expected:
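```bash
# Show the most recent journal entries for the service (the line count is arbitrary)
sudo journalctl -u gunicorn -n 20
```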
You should see log entries similar to the following, indicating that Gunicorn has started workers and is handling incoming requests:
These entries confirm that Gunicorn has successfully restarted, booted its worker processes, and logged the requests made using curl.
Step 8 — Deploying Gunicorn for production
Now that Gunicorn is properly configured and managed as a systemd service, the final step in deploying it for production is to ensure security, stability, and scalability.
Gunicorn should not be exposed directly to the internet in a production environment. Instead, it should be deployed behind a reverse proxy like Nginx, which provides additional performance benefits, security enhancements, and load balancing.
While Gunicorn is an excellent WSGI server, it is not designed to handle direct client traffic, slow clients, or large numbers of concurrent connections efficiently. Nginx serves as an intermediary that:
- Handles static files efficiently instead of passing them to Gunicorn.
- Offloads SSL/TLS encryption for secure HTTPS connections.
- Provides better request buffering and rate limiting to protect against slow client attacks.
- Supports load balancing across multiple Gunicorn instances.
- Improves performance by handling concurrent connections more efficiently.
If Nginx is not already installed on your server, install it using the following command on Ubuntu:
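```bash
sudo apt update
sudo apt install nginx
```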
After installation, start Nginx and enable it to run at boot:
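```bash
sudo systemctl start nginx
sudo systemctl enable nginx
```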
Check if Nginx is running:
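```bash
sudo systemctl status nginx
```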
If it is active and running, allow HTTP traffic through the firewall:
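```bash
# Assumes ufw, Ubuntu's default firewall front end
sudo ufw allow 'Nginx HTTP'
```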
To confirm that Nginx is installed correctly, visit your server’s IP address in a browser. You should see the default Nginx welcome page.
To forward traffic from Nginx to Gunicorn, create a new Nginx configuration file:
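```bash
sudo nano /etc/nginx/sites-available/flask_app
```

The file name flask_app is only an example; use the same name when enabling the site below.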
Add the following configuration, ensuring that the static file path matches your project structure:
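```nginx
server {
    listen 80;
    server_name your_domain_or_server_ip;

    location / {
        # Forward all requests to Gunicorn
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    location /static/ {
        # Serve static files directly from Nginx (adjust the path to your project)
        alias /home/your_username/flask-gunicorn-demo/static/;
    }

    # Basic error page handler
    error_page 500 502 503 504 /50x.html;
    location = /50x.html {
        root /usr/share/nginx/html;
    }
}
```

This is one possible arrangement; the static file path and error page location are placeholders to adjust for your project.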
This configuration:
- Forwards all traffic from your_domain_or_server_ip to Gunicorn at 127.0.0.1:8000.
- Ensures static files are served directly by Nginx for better performance.
- Provides an error page handler for better user experience during failures.
Save and exit the file. Then, create a symbolic link to enable the configuration:
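```bash
sudo ln -s /etc/nginx/sites-available/flask_app /etc/nginx/sites-enabled/
```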
Test the configuration for syntax errors:
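```bash
sudo nginx -t
```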
If the test is successful, restart Nginx to apply the changes:
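```bash
sudo systemctl restart nginx
```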
Gunicorn must be adjusted to ensure it works correctly behind Nginx.
Open gunicorn_config.py and modify the bind setting:
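```python
bind = "127.0.0.1:8000"  # keep Gunicorn on localhost so only Nginx can reach it
```

If you changed the bind address earlier, make sure it matches the proxy_pass address in the Nginx configuration.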
Restart Gunicorn so that the new settings take effect:
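```bash
sudo systemctl restart gunicorn
```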
Now, Gunicorn is running behind Nginx. To test the setup, visit your server’s IP address or domain in a browser:
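```bash
curl http://your_domain_or_server_ip
```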
You should receive:
This confirms that Nginx is successfully forwarding requests to Gunicorn.
If something is not working, check the logs for errors:
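```bash
sudo journalctl -u gunicorn -n 50         # Gunicorn logs from systemd
sudo tail -n 50 /var/log/nginx/error.log  # Nginx error log
```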
At this stage, Gunicorn is fully integrated into a production-ready environment. Nginx now handles incoming requests, forwards them to Gunicorn, and the Flask application processes them efficiently.
With Nginx acting as a reverse proxy, your application benefits from improved security, better performance, and efficient load handling.
Final thoughts
This article explored how to use Gunicorn, optimize it, and fully deploy it with systemd while securing it behind Nginx to make our application production-ready.
From here, the next steps involve further hardening and refining your deployment. Securing your Nginx setup with SSL/TLS certificates using Let’s Encrypt will enable HTTPS, improving security and compliance.
Thanks for reading!