Profiling
=========
Profiling PyOP2 programs
------------------------
Profiling a PyOP2 program is as simple as profiling any other Python
code. You can profile the jacobi demo in the PyOP2 ``demo`` folder as
follows: ::

    python -m cProfile -o jacobi.dat jacobi.py

This will run the entire program under cProfile_ and write the profiling
data to ``jacobi.dat``. Without ``-o``, cProfile prints a flat
per-function summary to stdout, which is hard to read for anything but
small programs.
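
The dumped data can also be inspected from Python with the standard
library ``pstats`` module. The following is a minimal sketch. So that it
runs on its own, it profiles a stand-in function (``busy_work`` and the
file name ``example.dat`` are hypothetical) instead of the jacobi demo:

```python
import cProfile
import io
import os
import pstats
import tempfile


def busy_work(n=20000):
    """Stand-in workload; replace with your own code."""
    return sum(i * i for i in range(n))


# Write profiling data to a file in the format pstats can load
profile_path = os.path.join(tempfile.mkdtemp(), "example.dat")
profiler = cProfile.Profile()
profiler.runcall(busy_work)
profiler.dump_stats(profile_path)

# Load the data and render the 5 most expensive entries by cumulative time
report = io.StringIO()
stats = pstats.Stats(profile_path, stream=report)
stats.strip_dirs().sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

To look at the jacobi profile instead, pass ``jacobi.dat`` to
``pstats.Stats`` and drop the part that generates the data.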
Creating a graph
................
The profiling data is much easier to read as a call graph, which
gprof2dot_ generates. Install it from PyPI with ::

    sudo pip install gprof2dot

Use as follows to create a PDF: ::

    gprof2dot -f pstats -n 1 jacobi.dat | dot -Tpdf -o jacobi.pdf

``-f pstats`` tells ``gprof2dot`` that it is dealing with Python
cProfile_ data (and not actual *gprof* data). ``-n 1`` omits every node
that accounts for less than 1% of the total runtime (the default
threshold is 0.5%), since such nodes are rarely of interest.
Consolidating profiles from different runs
..........................................
To aggregate profiling data from different runs, save the following as
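``concat.py``: ::

    """Usage: concat.py PATTERN FILE"""

    import sys
    from glob import glob
    from pstats import Stats

    if len(sys.argv) != 3:
        print(__doc__)
        sys.exit(1)

    # Expand the glob pattern; sort so the merge order is reproducible
    files = sorted(glob(sys.argv[1]))
    if not files:
        sys.exit("No files match %r" % sys.argv[1])
    # Load the first profile, then merge the remaining ones into it
    s = Stats(files[0])
    for f in files[1:]:
        s.add(f)
    # Write the aggregated profile to the output file
    s.dump_stats(sys.argv[2])
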
With profiles from different runs named ``.*.part``, use it
as ::

    python concat.py '.*.part' .dat

and then call ``gprof2dot`` as before.
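
``pstats.Stats`` also accepts several file names at once, so the
aggregation can be wrapped in a small function. The following is a
sketch only: ``merge_profiles`` and the ``run.*.part`` file names are
hypothetical, and the two partial profiles are generated on the spot so
that the example is self-contained:

```python
import cProfile
import os
import tempfile
from glob import glob
from pstats import Stats


def merge_profiles(pattern, output):
    """Merge all profiles matching ``pattern`` into the file ``output``."""
    files = sorted(glob(pattern))
    if not files:
        raise ValueError("no profile matches %r" % pattern)
    merged = Stats(*files)  # Stats loads and adds up every file it is given
    merged.dump_stats(output)
    return merged


def work():
    """Stand-in workload for the two profiled runs."""
    return sorted(range(1000), reverse=True)


# Produce two partial profiles, as two separate runs would
tmpdir = tempfile.mkdtemp()
for run in range(2):
    profiler = cProfile.Profile()
    profiler.runcall(work)
    profiler.dump_stats(os.path.join(tmpdir, "run.%d.part" % run))

merged = merge_profiles(os.path.join(tmpdir, "run.*.part"),
                        os.path.join(tmpdir, "run.dat"))
# Number of calls to work() recorded in the merged profile
calls = [v[1] for k, v in merged.stats.items() if k[2] == "work"]
print(calls)
```

The merged file can be passed to ``gprof2dot`` in the same way as a
profile from a single run.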
Using PyOP2's internal timers
-----------------------------
PyOP2 automatically times the execution of certain regions:

* Sparsity building
* Plan construction
* Parallel loop kernel execution
* Halo exchange
* Reductions
* PETSc Krylov solver

To output those timings, call :func:`~pyop2.profiling.summary` in your
PyOP2 program or run with the environment variable
``PYOP2_PRINT_SUMMARY`` set to 1.
To query e.g. the timer for parallel loop execution programmatically,
use the :func:`~pyop2.profiling.timing` helper: ::

    from pyop2 import timing
    timing("ParLoop compute")               # get total time
    timing("ParLoop compute", total=False)  # get average time per call

To add additional timers to your own code, you can use the
:func:`~pyop2.profiling.timed_region` and
:func:`~pyop2.profiling.timed_function` helpers: ::

    from pyop2.profiling import timed_region, timed_function

    with timed_region("my code"):
        pass  # my code

    @timed_function("my function")
    def my_func():
        pass  # my func

Line-by-line profiling
----------------------
To get a line-by-line profile of a given function, install Robert Kern's
`line profiler`_ and:

1. Import the :func:`~pyop2.profiling.profile` decorator: ::

       from pyop2.profiling import profile

2. Decorate the function to profile with ``@profile``.
3. Run your script with ``kernprof.py -l``.
4. Generate an annotated source file with ::

       python -m line_profiler

Note that ``kernprof.py`` injects the ``@profile`` decorator into the
Python builtins namespace. PyOP2 provides a passthrough version of this
decorator which does nothing if ``profile`` is not found in
``__builtins__``. This means you can run your script regularly without
having to remove the decorators again.
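
A passthrough fallback of this kind could look roughly as follows. This
is a Python 3 sketch of the idea, not PyOP2's actual code; the names
``passthrough`` and ``square`` are made up for the example:

```python
import builtins


def passthrough(func):
    """Return ``func`` unchanged."""
    return func


# Use the injected decorator if a profiler provided one, else do nothing
profile = getattr(builtins, "profile", passthrough)


@profile
def square(x):
    return x * x


print(square(4))
```

Run as a plain script, ``profile`` resolves to ``passthrough`` and the
decorated function behaves exactly like the undecorated one.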
The :func:`~pyop2.profiling.profile` decorator is also picked up by the
memory profiler (see below). To profile with the line profiler only, use
the :func:`~pyop2.profiling.lineprof` decorator, which is only enabled
when running with ``kernprof.py``.
A number of PyOP2 internal functions are decorated such that running
your PyOP2 application with ``kernprof.py`` will produce a line-by-line
profile of the parallel loop computation (but not the generated code!).
Memory profiling
----------------
To profile the memory usage of your application, install Fabian
Pedregosa's `memory profiler`_ and:

1. Import the :func:`~pyop2.profiling.profile` decorator: ::

       from pyop2.profiling import profile

2. Decorate the function to profile with ``@profile``.
3. Run your script with ::

       python -m memory_profiler

   to get a line-by-line memory profile of your function.
4. Run your script with ::

       memprof run --python

   to record memory usage of your program over time.
5. Generate a plot of the memory profile with ``memprof plot``.

Note that ``memprof`` and ``python -m memory_profiler`` inject the
``@profile`` decorator into the Python builtins namespace. PyOP2
provides a passthrough version of this decorator which does nothing if
``profile`` is not found in ``__builtins__``. This means you can run
your script regularly without having to remove the decorators again.
The :func:`~pyop2.profiling.profile` decorator is also picked up by the
line profiler (see above). To profile memory usage only, use the
:func:`~pyop2.profiling.memprof` decorator, which is only enabled when
running with ``memprof``.
A number of PyOP2 internal functions are decorated such that running
your PyOP2 application with ``memprof run`` will produce a memory
profile of the parallel loop computation (but not the generated code!).
.. _cProfile: https://docs.python.org/2/library/profile.html#cProfile
.. _gprof2dot: https://code.google.com/p/jrfonseca/wiki/Gprof2Dot
.. _line profiler: https://pythonhosted.org/line_profiler/
.. _memory profiler: https://github.com/fabianp/memory_profiler