PyOP2 Architecture
As described in PyOP2 Concepts, PyOP2 exposes an API that allows users to
declare the topology of unstructured meshes in the form of Sets and Maps and
data in the form of Dats, Mats, Globals and Consts. Computations on this data
are expressed as Kernels (see PyOP2 Kernels) and executed by parallel loops.
The API is the frontend to the PyOP2 runtime compilation architecture, which supports the generation and just-in-time (JIT) compilation of low-level code for a range of backends described in PyOP2 Backends and the efficient scheduling of parallel computations. A schematic overview of the PyOP2 architecture is given below:
From an outside perspective, PyOP2 is a conventional Python library, with
performance-critical library functions implemented in Cython. A user’s
application code makes calls to the PyOP2 API, most of which are conventional
library calls. The exceptions are par_loop() calls, which encapsulate PyOP2’s
runtime core functionality and perform backend-specific code generation.
Executing a parallel loop comprises the following steps:
Compute a parallel execution plan, including information for efficient staging of data and partitioning and colouring of the iteration set for conflict-free parallel execution. This process is described in Parallel Execution Plan and does not apply to the sequential backend.
Generate backend-specific code for executing the computation for a given set of par_loop() arguments as detailed in PyOP2 Backends according to the execution plan computed in the previous step.
Pass the generated code to a backend-specific toolchain for just-in-time compilation, producing a shared library callable as a Python module which is dynamically loaded. This module is cached on disk to save recompilation when the same par_loop() is called again for the same backend.
Build the backend-specific list of arguments to be passed to the generated code, which may initiate host to device data transfer for the CUDA and OpenCL backends.
Call into the generated module to perform the actual computation. For distributed parallel computations this involves separate calls for the regions owned by the current processor and the halo as described in MPI.
Perform any necessary reductions for Globals.
Call the backend-specific matrix assembly procedure on any Mat arguments.
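The on-disk caching mentioned in the compilation step can be pictured with a small sketch. This is not PyOP2's actual cache: the names cache_key, compile_cached and fake_compile are hypothetical, the key is assumed to be a hash of the generated source together with the backend name, and the "toolchain" is a stand-in that merely records when it is invoked.

```python
import hashlib
import tempfile
from pathlib import Path


def cache_key(generated_code, backend):
    """Derive a stable key from the generated source and the backend name."""
    digest = hashlib.sha256()
    digest.update(backend.encode("utf-8"))
    digest.update(b"\0")  # separator between the two parts of the key
    digest.update(generated_code.encode("utf-8"))
    return digest.hexdigest()


def compile_cached(generated_code, backend, cache_dir, compile_fn):
    """Return (artifact_path, was_cached); call compile_fn only on a miss."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    artifact = cache_dir / (cache_key(generated_code, backend) + ".bin")
    if artifact.exists():
        return artifact, True
    artifact.write_bytes(compile_fn(generated_code))
    return artifact, False


# Hypothetical stand-in for a real toolchain: records each invocation.
compile_calls = []


def fake_compile(code):
    compile_calls.append(code)
    return code.encode("utf-8")


demo_dir = tempfile.mkdtemp()
code = "void wrap_kernel(int start, int end) { /* generated loop */ }"
_, first_hit = compile_cached(code, "sequential", demo_dir, fake_compile)
_, second_hit = compile_cached(code, "sequential", demo_dir, fake_compile)
_, other_backend_hit = compile_cached(code, "opencl", demo_dir, fake_compile)
```

In this sketch the second request for the same code and backend is served from disk, while the same code requested for a different backend triggers a fresh compilation, mirroring the "same par_loop(), same backend" condition stated above.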
Multiple Backend Support
The backend is selected by passing the keyword argument backend to the init()
function. If omitted, the sequential backend is selected by default. This
choice can be overridden by exporting the environment variable PYOP2_BACKEND,
which allows switching backends without having to touch the code. Once chosen,
the backend cannot be changed for the duration of the running Python
interpreter session.
PyOP2 provides a single API to the user, regardless of which backend the
computations are running on. All classes and functions that form the public
API defined in pyop2.op2 are interfaces, whose concrete implementations are
initialised according to the chosen backend. A metaclass takes care of
instantiating a backend-specific version of the requested class and setting
the corresponding docstrings such that this process is entirely transparent to
the user. The implementation of the PyOP2 backends is completely orthogonal to
the backend selection process and free to use established practices of
object-oriented design.
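The metaclass mechanism described above can be illustrated with a small, self-contained sketch. It is not PyOP2's implementation: BackendSelector, its registry, and SequentialSet are hypothetical names, and the active backend is assumed to have been fixed earlier. The point is only to show how calling a public interface class can yield an instance of a backend-specific class carrying that class's docstring.

```python
class BackendSelector(type):
    """Metaclass that redirects instantiation to the active backend's class."""

    registry = {}          # backend name -> {class name -> implementation}
    active = "sequential"  # assumed to have been fixed by an earlier init()

    def __new__(mcls, name, bases, namespace):
        namespace = dict(namespace)
        impl = mcls.registry.get(mcls.active, {}).get(name)
        if impl is not None:
            # Expose the implementation's docstring on the public interface.
            namespace["__doc__"] = impl.__doc__
        return super().__new__(mcls, name, bases, namespace)

    def __call__(cls, *args, **kwargs):
        # Look up the implementation by class name and instantiate it instead.
        impl = type(cls).registry[type(cls).active][cls.__name__]
        return impl(*args, **kwargs)


class SequentialSet:
    """Sequential implementation of a set of mesh entities."""

    def __init__(self, size):
        self.size = size


# Backends register ordinary classes; they need no knowledge of the metaclass.
BackendSelector.registry["sequential"] = {"Set": SequentialSet}


class Set(metaclass=BackendSelector):
    pass  # public interface; behaviour and docstring come from the backend


nodes = Set(4)  # transparently builds a SequentialSet
```

Because the backend classes are plain Python classes, they remain free to use inheritance and other object-oriented techniques, which is the orthogonality the paragraph above refers to.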