When running a complex Python program that takes quite a long time to execute, you might want to improve its execution time. But how?
First of all, you need the tools to detect the bottlenecks of your code, i.e. which parts take longer to execute. This way, you can concentrate in speeding these parts first.
And also, you should also control the memory and CPU usage, as it can point you towards new portions of code that could be improved.
Therefore, in this post I’ll comment on 7 different Python tools that give you some insight about the execution time of your functions and the Memory and CPU usage.
1. Use a decorator to time your functions
The simpler way to time a function is to define a decorator that measures the elapsed time in running the function, and prints the result:
import time
from functools import wraps

def fn_timer(function):
    @wraps(function)
    def function_timer(*args, **kwargs):
        t0 = time.time()
        result = function(*args, **kwargs)
        t1 = time.time()
        print("Total time running %s: %s seconds" %
              (function.__name__, str(t1 - t0)))
        return result
    return function_timer
Then, you have to add this decorator before the function you want to measure, like
@fn_timer
def myfunction(...):
    ...
For example, let’s measure how long it takes to sort an array of 2000000 random numbers:
import random

@fn_timer
def random_sort(n):
    return sorted([random.random() for i in range(n)])

if __name__ == "__main__":
    random_sort(2000000)
If you run your script, you should see something like
Total time running random_sort: 1.41124916077 seconds
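As a side note, on Python 3 you can get more reliable measurements for short-running functions with time.perf_counter(), a monotonic, high-resolution clock. Here is a minimal sketch of the same idea (the square function is just a toy example):

```python
import time
from functools import wraps

def fn_timer(function):
    @wraps(function)
    def function_timer(*args, **kwargs):
        # perf_counter() is monotonic and has the highest resolution
        # available, so it is better suited to timing short runs.
        t0 = time.perf_counter()
        result = function(*args, **kwargs)
        t1 = time.perf_counter()
        print("Total time running %s: %s seconds" %
              (function.__name__, str(t1 - t0)))
        return result
    return function_timer

@fn_timer
def square(n):
    return n * n

print(square(5))
```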
2. Using the timeit module
Another option is to use the timeit module, which gives you an average time measure.
To run it, execute the following command in your terminal:
$ python -m timeit -n 4 -r 5 -s "import timing_functions" "timing_functions.random_sort(2000000)"
where timing_functions is the name of your script.
At the end of the output, you should see something like:
4 loops, best of 5: 2.08 sec per loop
indicating that the statement was executed 4 times per repetition (-n 4), the measurement was repeated 5 times (-r 5), and the best repetition took 2.08 seconds per loop.
If you don’t specify the number of loops or repetitions, timeit picks a suitable number of loops automatically and defaults to 5 repetitions (on Python 3.7+; older versions used 3).
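The same measurement can also be driven from Python code instead of the command line, via timeit.repeat. A minimal sketch, using an inline sort instead of the timing_functions script:

```python
import timeit

# Time sorting a list of 10000 random numbers.
# The setup string runs once per repetition and is not timed.
timings = timeit.repeat(
    stmt="sorted(data)",
    setup="import random; data = [random.random() for _ in range(10000)]",
    number=4,    # equivalent to -n 4
    repeat=5,    # equivalent to -r 5
)

# Each entry in timings is the total time for 4 executions;
# as with the command line, the best (minimum) is the most meaningful.
print("best of 5: %.4f sec per 4 loops" % min(timings))
```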
3. Using the time Unix command
However, both the decorator and the timeit module are implemented in Python. That is why the Unix time utility may be useful: it measures from outside Python.
To run the time utility type:
$ time -p python timing_functions.py
which gives the output:
Total time running random_sort: 1.3931210041 seconds
real 1.49
user 1.40
sys 0.08
The first line comes from the decorator we defined, and the other three from time:
- real indicates the total wall-clock time spent executing the script.
- user indicates the CPU time spent executing the script in user mode.
- sys indicates the CPU time spent in kernel-level functions.
Note: as defined on Wikipedia, the kernel is a computer program that manages input/output requests from software, and translates them into data processing instructions for the central processing unit (CPU) and other electronic components of a computer.
Therefore, the difference between the real time and the sum of user+sys may indicate the time spent waiting for input/output or that the system is busy running other external tasks.
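The user/sys split can also be inspected from within Python itself, using the standard library’s os.times(), which reports the user and system CPU time consumed by the current process. A minimal sketch:

```python
import os

# os.times() returns a struct whose .user and .system fields are the
# user-mode and kernel-mode CPU time of this process, in seconds.
before = os.times()

_ = sum(i * i for i in range(1_000_000))  # some CPU-bound work

after = os.times()
print("user: %.2f sys: %.2f" % (after.user - before.user,
                                after.system - before.system))
```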
4. Using the cProfile module
If you want to know how much time is spent on each function and method, and how many times each of them is called, you can use the cProfile module:
$ python -m cProfile -s cumulative timing_functions.py
Now you’ll see a detailed description of how many times each function in your code is called, and it will be sorted by the cumulative time spent on each one (thanks to the -s cumulative option).
You’ll see that the total amount of time spent on running your script is higher than before. This is the penalty we pay for measuring the time each function takes to execute.
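cProfile can also be driven from inside a script, which is handy when you only want to profile one call. A sketch using the pstats module to sort by cumulative time, as -s cumulative does (build_and_sort is just a toy function for illustration):

```python
import cProfile
import io
import pstats

def build_and_sort(n):
    return sorted([i % 7 for i in range(n)])

# Profile a single call rather than the whole script.
profiler = cProfile.Profile()
profiler.enable()
build_and_sort(100000)
profiler.disable()

# Dump the statistics, sorted by cumulative time, into a string.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(10)  # show only the 10 most expensive entries
print(stream.getvalue())
```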
5. Using the line_profiler module
The line_profiler module gives you information about the CPU time spent on each line in your code.
This module has to be installed first, with
$ pip install line_profiler
Next, you need to specify which functions you want to evaluate using the @profile decorator (you don’t need to import it in your file):
import random

@profile
def random_sort2(n):
    l = [random.random() for i in range(n)]
    l.sort()
    return l

if __name__ == "__main__":
    random_sort2(2000000)
Finally, you can obtain a line by line description of the random_sort2 function by typing:
$ kernprof -l -v timing_functions.py
where the -l flag indicates line-by-line and the -v flag indicates verbose output. With this method, we see that the array construction takes about 44% of the computation time, whereas the sort() method takes the remaining 56%.
You will also see that due to the time measurements, the script might take longer to execute.
6. Use the memory_profiler module
The memory_profiler module is used to measure memory usage in your code, on a line-by-line basis. However, it can make your code run much slower.
Install it with
$ pip install memory_profiler
Also, it is recommended to install the psutil package, so that memory_profiler runs faster:
$ pip install psutil
In a similar way as the line_profiler, use the @profile decorator to mark which functions to track. Next, type:
$ python -m memory_profiler timing_functions.py
Yes, the script now takes much longer than the 1 or 2 seconds it took before. And if you didn’t install the psutil package, maybe you’re still waiting for the results!
Looking at the output, note that the memory usage is expressed in terms of MiB, which stands for mebibyte (1 MiB ≈ 1.049 MB).
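If you only need overall memory figures and can’t install third-party packages, the standard library’s tracemalloc module (Python 3.4+) is a rough stdlib alternative. This is only a sketch: it reports current and peak allocations, not memory_profiler’s line-by-line table:

```python
import random
import tracemalloc

tracemalloc.start()

l = [random.random() for _ in range(100000)]
l.sort()

# current: bytes allocated right now; peak: maximum since start().
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

# Convert bytes to MiB, the same unit memory_profiler reports.
print("current: %.2f MiB, peak: %.2f MiB" % (current / 2**20, peak / 2**20))
```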
7. Using the guppy package
Finally, with this package you’ll be able to track how many objects of each type (str, tuple, dict, etc) are created at each stage in your code.
Install it with
$ pip install guppy
Next, add it in your code as:
import random
from guppy import hpy

def random_sort3(n):
    hp = hpy()
    print("Heap at the beginning of the function\n", hp.heap())
    l = [random.random() for i in range(n)]
    l.sort()
    print("Heap at the end of the function\n", hp.heap())
    return l

if __name__ == "__main__":
    random_sort3(2000000)
And run your code with:
$ python timing_functions.py
The output is a table that lists, for each object type in the heap, the number of objects and the total memory they occupy.
By placing the heap at different places in your code, you can study the object creation and deletion in the script flow.
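A crude stdlib approximation of guppy’s per-type counts can be built with the gc module. This sketch only sees container objects that the garbage collector tracks (lists, dicts, and so on), so it is far less complete than guppy’s heap report, but it illustrates the same idea:

```python
import gc
import random
from collections import Counter

def type_counts():
    # Count live objects tracked by the garbage collector, grouped by
    # type name. Note: gc only tracks container objects, so atomic
    # objects like ints and floats are not included.
    return Counter(type(obj).__name__ for obj in gc.get_objects())

before = type_counts()
data = [[random.random()] for _ in range(1000)]  # create 1000 new lists
after = type_counts()

print("new lists:", after["list"] - before["list"])
```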
If you want to learn more about speeding up your Python code, I recommend the book High Performance Python: Practical Performant Programming for Humans (September 2014).
Hope it was useful! 🙂
Don’t forget to share it with your friends!

Marina Mele has experience in artificial intelligence implementation and has led tech teams for over a decade. On her personal blog (marinamele.com), she writes about personal growth, family values, AI, and other topics she’s passionate about. Marina also publishes a weekly AI newsletter featuring the latest advancements and innovations in the field (marinamele.substack.com).