Python provides profiling tools that allow you to identify performance bottlenecks and optimize your code. The standard library offers powerful profiling modules like cProfile and profile, which, combined with visualization tools like snakeviz, provide comprehensive insights into your application's execution flow.

This article will guide you through creating and implementing a profiling strategy for your Python applications.
Prerequisites
Before continuing, make sure you have a recent version of Python (version 3.13 or higher) installed on your local machine. This guide assumes you're already comfortable with basic Python concepts.
Step 1 — Getting started with Python profiling
For the best learning experience, set up a fresh Python project to experiment directly with the concepts introduced in this tutorial.
Begin by creating a new directory and setting up a virtual environment:
mkdir python-profiling && cd python-profiling
python3 -m venv venv
Activate the virtual environment:
source venv/bin/activate
Let's start with a common recursive algorithm you'll use throughout this article to demonstrate different profiling techniques - the Fibonacci sequence.
Create a file named main.py with the following content:
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

if __name__ == "__main__":
    result = fibonacci(30)
    print(f"Fibonacci(30) = {result}")
This function calculates the n-th Fibonacci number recursively. While simple to understand, this implementation has exponential time complexity - it recalculates the same Fibonacci numbers repeatedly, making it inefficient for larger values of n.
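To see that growth concretely, you can count the calls yourself with a simple counter (a throwaway sketch; the count below is exact for this implementation):

```python
call_count = 0

def fibonacci(n):
    global call_count
    call_count += 1  # tally every invocation, including recursive ones
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

fibonacci(20)
print(f"fibonacci(20) made {call_count} calls")  # 21891 calls
```

Even for n=20 the function is invoked 21,891 times, and the count roughly doubles with each increment of n.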
Let's see just how inefficient it is with some manual timing:
import time

def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

if __name__ == "__main__":
    start_time = time.time()
    result = fibonacci(30)
    end_time = time.time()
    print(f"Fibonacci(30) = {result}")
    print(f"Time taken: {end_time - start_time:.4f} seconds")
Here, you use the built-in time module to measure precisely how long the recursive Fibonacci calculation takes.
Run your script with the following command:
python main.py
You'll see output like:
Fibonacci(30) = 832040
Time taken: 0.1044 seconds
While this basic timing approach tells us how long the function takes to run, it doesn't provide insights into what's happening inside it.
For complex applications with many functions, measuring total execution time doesn't help identify bottlenecks. This is where profiling comes in.
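As a quick aside before moving to profilers, the standard library's timeit module makes manual timing less noisy by repeating the measurement for you; a minimal sketch:

```python
import timeit

def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

# Repeat the measurement five times and keep the fastest run to reduce noise
best = min(timeit.repeat("fibonacci(20)", globals=globals(), number=1, repeat=5))
print(f"Best of 5 runs: {best:.4f} seconds")
```

Taking the minimum of several runs filters out interference from other processes, but like any timing approach it still reports only a total, not where the time went.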
Step 2 — Basic profiling with the built-in cProfile module
While manual timing gives us a general idea of overall execution time, it has several limitations:
- It only shows total execution time, not where time is spent within the function
- It provides no call counts, so you can't see how many times each function was called
- It reveals nothing about the relationships between function calls
- You need to add timing code around every function you want to measure
This is where Python's built-in profiling tools become invaluable. The standard library includes a powerful module called cProfile that provides detailed insights without requiring you to modify your code extensively.
Let's use cProfile to analyze the Fibonacci function. Update your main.py file as follows:
import cProfile
import pstats
import io

def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

if __name__ == "__main__":
    # Create and start the profiler
    profiler = cProfile.Profile()
    profiler.enable()

    # Run the code we want to profile
    result = fibonacci(30)
    print(f"Fibonacci(30) = {result}")

    # Disable the profiler and print stats
    profiler.disable()

    # Format and display the results
    s = io.StringIO()
    stats = pstats.Stats(profiler, stream=s).sort_stats('cumulative')
    stats.print_stats(20)  # Print top 20 functions
    print(s.getvalue())
In the highlighted blocks, you first import Python's built-in profiling modules:

- cProfile for gathering detailed performance data
- pstats for organizing and analyzing that data
- io for conveniently formatting the output
Next, you create a profiling instance (profiler) and activate it using profiler.enable(). With profiling active, you run the fibonacci(30) function to capture detailed execution metrics. Once the function finishes executing, you deactivate the profiler with profiler.disable().
Finally, you use pstats.Stats to process and sort the profiling data based on cumulative execution time, outputting the top 20 function calls to quickly identify the most time-consuming parts of your code.
Run this script:
python main.py
You should observe output similar to the following:
Fibonacci(30) = 832040
2692537 function calls (30 primitive calls) in 0.412 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
2692537/1 0.412 0.000 0.412 0.412 /Users/username/python-profiling/main.py:5(fibonacci)
1 0.000 0.000 0.412 0.412 {built-in method builtins.exec}
1 0.000 0.000 0.412 0.412 /Users/username/python-profiling/main.py:1(<module>)
This output reveals something striking - our fibonacci function was called 2,692,537 times to calculate the 30th Fibonacci number!
Let's understand what each column in the output means:
- ncalls: The number of calls to each function. For entries showing two numbers (e.g., 2692537/1), it represents total calls versus primitive (non-recursive) calls. Specifically, 2692537/1 means the function was invoked over 2.6 million times, but only 1 call was direct.
- tottime: Total time spent in the function, excluding time spent in calls to other functions.
- percall: Average time spent per call (tottime/ncalls).
- cumtime: Cumulative time spent in the function, including time spent in calls to other functions.
- filename:lineno(function): The location and name of the function.
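Note that sort_stats() accepts other keys besides 'cumulative'. Sorting by tottime highlights functions that are expensive in their own body, while sorting by call count surfaces the most frequently invoked ones; a small sketch using the pstats.SortKey constants:

```python
import cProfile
import pstats
from pstats import SortKey

def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

profiler = cProfile.Profile()
profiler.enable()
fibonacci(20)
profiler.disable()

stats = pstats.Stats(profiler)
stats.sort_stats(SortKey.TIME).print_stats(5)   # equivalent to 'tottime'
stats.sort_stats(SortKey.CALLS).print_stats(5)  # most frequently called first
```

Different sort keys answer different questions, so it's worth inspecting the same profile from more than one angle.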
With this information, you can immediately identify algorithm inefficiencies, prioritize optimization efforts, and make data-driven decisions about where to focus your performance tuning work.
The profile data clearly shows that our naive recursive implementation is extremely inefficient due to the millions of redundant function calls.
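As an aside, the steps below fix this with an iterative rewrite, but another common remedy for redundant recursive calls is memoization with functools.lru_cache; a minimal sketch:

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # cache every Fibonacci number after its first computation
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

print(fibonacci(30))           # 832040
print(fibonacci.cache_info())  # 31 misses (one per value of n) instead of millions of calls
```

With the cache in place, each Fibonacci number is computed exactly once, turning the exponential blowup into linear work.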
However, working with the profiling data this way can still be cumbersome, especially for larger applications. Let's make profiling easier with a reusable solution.
Step 3 — Creating a reusable profiling decorator
While directly using cProfile works well for one-off profiling, you'll often want to profile multiple functions in real projects.
Instead of repeating the same profiling code, let's create a reusable decorator that can be applied to any function.
Create a new file named profiler.py:
import cProfile
import pstats
import io
from functools import wraps

def profile(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        # Create and start profiler
        pr = cProfile.Profile()
        pr.enable()

        # Call the original function
        result = func(*args, **kwargs)

        # Stop profiling
        pr.disable()

        # Format and print results
        s = io.StringIO()
        ps = pstats.Stats(pr, stream=s).sort_stats('cumulative')
        ps.print_stats(20)
        print(s.getvalue())

        return result
    return wrapper
This decorator wraps any function with profiling code, making it easy to profile any part of your application.
The @wraps(func) decorator from functools preserves the original function's metadata, which is essential for debugging and documentation.
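To see what @wraps preserves, compare a wrapped function's metadata with and without it (toy decorators for illustration):

```python
from functools import wraps

def without_wraps(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

def with_wraps(func):
    @wraps(func)  # copies __name__, __doc__, and other metadata onto wrapper
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@without_wraps
def greet_a():
    """Say hello."""

@with_wraps
def greet_b():
    """Say hello."""

print(greet_a.__name__)  # wrapper  (metadata lost)
print(greet_b.__name__)  # greet_b  (metadata preserved)
```

Without @wraps, every decorated function would show up in profiles and tracebacks as "wrapper", making the output much harder to read.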
For our Fibonacci example, you shouldn't apply the decorator directly to the recursive Fibonacci function itself, as this would cause issues with nested profiling.
Instead, let's create a wrapper function:
from profiler import profile

def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

@profile
def run():
    result = fibonacci(30)
    print(f"Fibonacci(30) = {result}")

if __name__ == "__main__":
    run()
Now you can add the @profile decorator to any non-recursive function you want to profile. Run this script:
python main.py
Fibonacci(30) = 832040
2692540 function calls (4 primitive calls) in 0.464 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.464 0.464 /Users/username/profiling-python/python-profiling/main.py:10(run)
2692537/1 0.463 0.000 0.463 0.463 /Users/username/profiling-python/python-profiling/main.py:4(fibonacci)
1 0.000 0.000 0.000 0.000 {built-in method builtins.print}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
You'll see the same profiling information as before, but now you have a reusable tool that can be applied to any function with a single line of code. This approach is much more maintainable for larger projects where you must profile different parts of your codebase.
Step 4 — Saving profile data to files
To analyze profiling data more thoroughly or share it with team members, we should enhance our profiler to save data to files. Let's modify our profiler.py:
import cProfile
import pstats
import io
from functools import wraps

def profile(func=None, output_file=None):
    def decorator(f):
        @wraps(f)
        def wrapper(*args, **kwargs):
            # Create and start profiler
            pr = cProfile.Profile()
            pr.enable()

            # Call the original function
            result = f(*args, **kwargs)

            # Stop profiling
            pr.disable()

            # Print formatted results to console
            s = io.StringIO()
            ps = pstats.Stats(pr, stream=s).sort_stats('cumulative')
            ps.print_stats(20)
            print(s.getvalue())

            # Save to file if requested
            if output_file:
                ps.dump_stats(output_file)
                print(f"Profile data saved to {output_file}")

            return result
        return wrapper

    # Handle both @profile and @profile(output_file='stats.prof') syntax
    if func is None:
        return decorator
    return decorator(func)
Here, you've updated the function signature of profile to include an optional output_file parameter, enabling you to specify a file location for saving profiling results.

You've also introduced new file-saving functionality: the added if output_file: block writes profiling data to the specified file using pstats.Stats.dump_stats(). Afterward, a confirmation message informs you that the file was created successfully.
The most significant structural change is the introduction of nested decorators. Previously, your decorator was straightforward: it always accepted a function directly. Now, it must handle two scenarios: being used directly (@profile) or with arguments (@profile(output_file='profile.stats')).
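To see the dispatch logic in isolation, here is a stripped-down sketch of the same pattern with the profiling body omitted (toy functions, illustration only):

```python
from functools import wraps

def profile(func=None, output_file=None):
    def decorator(f):
        @wraps(f)
        def wrapper(*args, **kwargs):
            # A real implementation would profile here; this sketch just records the target file
            wrapper.saved_to = output_file
            return f(*args, **kwargs)
        return wrapper
    # Bare @profile passes the function itself; @profile(...) passes func=None
    if func is None:
        return decorator
    return decorator(func)

@profile
def a():
    return 1

@profile(output_file='stats.prof')
def b():
    return 2

print(a(), b())    # 1 2
print(b.saved_to)  # stats.prof
```

When used with parentheses, Python first calls profile(output_file=...) with no function, so it must return the inner decorator; when used bare, the function arrives immediately and decorator(func) is applied on the spot.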
This enhancement allows you to save profiling data in a binary format for later analysis. Now, update your main.py file to take advantage of this new feature:
from profiler import profile

def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

@profile(output_file='fibonacci.prof')
def run():
    result = fibonacci(30)
    print(f"Fibonacci(30) = {result}")

if __name__ == "__main__":
    run()
In the highlighted line, you've used the enhanced @profile decorator with the new output_file argument. This instructs the profiler to save the profiling results directly into a binary file named fibonacci.prof.
Now, run your script to generate and save the profiling data:
python main.py
After the script runs, you'll receive a confirmation message indicating the profile data was successfully saved to this file:
Fibonacci(30) = 832040
2692540 function calls (4 primitive calls) in 0.440 seconds
Ordered by: cumulative time
....
Profile data saved to fibonacci.prof
You can verify the file exists:
ls -l fibonacci.prof
-rw-r--r--@ 1 stanley staff 523 Mar 7 14:23 fibonacci.prof
This .prof file contains all your profiling data in a binary format. The benefit of saving to a file is that you can:
- Analyze profiles offline without re-running the code
- Compare profiles from different runs
- Share profile data with team members
- Use external tools to visualize and analyze the data
Saving profile data becomes especially valuable when profiling production systems or long-running processes where you can't quickly examine the immediate console output.
Step 5 — Visualizing profile data with snakeviz
With the profile data saved to a file, you can visualize it using specialized tools. One of the most popular is snakeviz, which creates interactive graphical representations of Python profiling data.
First, install snakeviz:
pip install snakeviz
Now, visualize the profile data you saved in the previous step:
snakeviz fibonacci.prof
snakeviz web server started on 127.0.0.1:8080; enter Ctrl-C to exit
http://127.0.0.1:8080/snakeviz/<path-to-the-file>/fibonacci.prof
This command opens your default web browser with an interactive visualization:
When SnakeViz loads, you'll see several key components:

- Control buttons at the top, including "Reset Root" and "Reset Zoom" for navigation
- Visualization configuration dropdowns for Style, Depth, and Cutoff
- The main visualization area in the center, showing your profiling data
- A data table at the bottom with detailed function statistics
The default view shows an "Icicle" visualization, which displays your code's execution as nested rectangles. Each rectangle represents a function, with the width indicating the proportion of execution time.
At the top of the page, you'll find the "Style" dropdown to switch between visualization types.
Switch to the Sunburst view to quickly identify functions taking up significant execution time:
The Sunburst visualization creates a circular chart where each function call radiates outward from the center, clearly illustrating the hierarchical relationship and the proportion of time spent in each call.
The Sunburst view is particularly valuable for understanding recursive functions. In this visualization:
- Each ring represents a deeper level of recursion
- The size of each segment shows the proportion of time spent
- The center represents the entry point of your program
- Moving outward shows deeper function calls
The detailed statistics table at the bottom provides the same information you saw in the text-based profile output, but in a sortable, interactive format. Click on any column header to sort by that metric, helping you quickly identify the most expensive functions by different criteria.
Combining intuitive visual representations and detailed statistics, SnakeViz clearly illustrates your application's performance, helping you quickly pinpoint areas for optimization.
Step 6 — Optimizing code based on profiling results
Now that you've identified performance bottlenecks using profiling and visualization, it's time to apply this knowledge to optimize your code.
Based on our profiling data, you can see that the recursive implementation makes millions of redundant function calls.
Let's improve the Fibonacci implementation using an iterative approach:
from profiler import profile
import time

def fibonacci(n):
    if n <= 1:
        return n
    a, b = 0, 1
    for _ in range(2, n + 1):
        a, b = b, a + b
    return b

@profile(output_file='fibonacci_iterative.prof')
def run():
    n = 30
    start = time.time()
    result = fibonacci(n)
    end = time.time()
    print(f"Fibonacci({n}) = {result}")
    print(f"Time taken: {end - start:.6f} seconds")

if __name__ == "__main__":
    run()
The iterative implementation eliminates recursion entirely. Instead of calling itself repeatedly, it uses a simple loop to calculate each Fibonacci number once, storing only the two most recent values (a and b) at any time.
Now run the code:
python main.py
Fibonacci(30) = 832040
Time taken: 0.000003 seconds
7 function calls in 0.000 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.000 0.000 /Users/stanley/python-profiling/main.py:15(run)
2 0.000 0.000 0.000 0.000 {built-in method builtins.print}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
1 0.000 0.000 0.000 0.000 /Users/stanley/python-profiling/main.py:5(fibonacci)
2 0.000 0.000 0.000 0.000 {built-in method time.time}
Profile data saved to fibonacci_iterative.prof
The difference in performance is striking. Not only is the computation nearly instantaneous, but look at the profiling data:

- Only 7 total function calls compared to over 2.6 million in the recursive version
- Negligible execution time (microseconds instead of seconds)
- Linear O(n) time complexity instead of exponential O(2ⁿ)
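When swapping implementations for performance, it's also worth confirming that the optimized version returns identical results; a quick sanity-check sketch:

```python
def fib_recursive(n):
    if n <= 1:
        return n
    return fib_recursive(n - 1) + fib_recursive(n - 2)

def fib_iterative(n):
    if n <= 1:
        return n
    a, b = 0, 1
    for _ in range(2, n + 1):
        a, b = b, a + b
    return b

# Cross-check the optimized version against the original on small inputs
for n in range(20):
    assert fib_iterative(n) == fib_recursive(n), n
print("All results match")
```

A faster function that returns the wrong answers is no optimization at all, so a cheap equivalence check like this is good insurance before deleting the old implementation.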
Visualize the profile with snakeviz:
snakeviz fibonacci_iterative.prof
You'll see a dramatically simpler visualization with far fewer function calls and much shorter execution time:
The visualization is dramatically different from the recursive implementation. Instead of the deep, complex call tree that we saw previously, the visualization shows a simple, flat structure:

- The main run() function at the top level
- A single call to fibonacci() that takes minimal time
- A few built-in function calls like print and time.time
The visualization confirms what your profiling data showed: the optimized version runs in linear time with minimal overhead. The print operations consume more time than the Fibonacci calculation itself!
The statistics table at the bottom of the visualization provides additional confirmation, showing that the actual fibonacci function now consumes just microseconds of processing time - an improvement of over 100,000x compared to the recursive version.
This exercise illustrates the power of profiling-driven optimization: pinpointing the exact performance issue (redundant recursive calls) allowed you to implement a targeted solution that drastically improved performance.
Step 7 — Conclusion
This article has guided you through a comprehensive Python profiling workflow, from simple timing to sophisticated visualization. We explored cProfile, custom decorators, data persistence, and graphical analysis with SnakeViz.
The practical examples demonstrated how profiling can transform inefficient code into highly optimized implementations. This approach applies equally to complex applications - always profile before optimizing to ensure you're targeting actual bottlenecks.
Thanks for reading and happy coding!