GPU Support#
Overview#
PyLops-mpi supports computations on GPUs by leveraging the GPU backend of PyLops. Under the hood, CuPy (cupy-cudaXX>=v13.0.0) is used to perform all of the operations. This library must be installed before PyLops-mpi is installed.
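To quickly verify that CuPy is installed correctly and can see a GPU, one can run a small check like the following (a minimal sketch; the reported count depends on your machine):

import cupy as cp

# Report how many GPUs CuPy can access
print(f"CuPy sees {cp.cuda.runtime.getDeviceCount()} GPU(s)")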
Note
Set the environment variable CUPY_PYLOPS=0 to force PyLops to ignore the cupy backend. This can also be used if a previous (or faulty) version of cupy is installed in your system; otherwise you will get an error when importing PyLops.
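Since the check happens when PyLops is imported, the variable must be set beforehand; a minimal sketch of doing this from within a script:

import os

# Force PyLops to ignore the cupy backend; must happen before pylops is imported
os.environ["CUPY_PYLOPS"] = "0"

import pylops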
The pylops_mpi.DistributedArray and pylops_mpi.StackedDistributedArray objects can be generated using both numpy and cupy based local arrays, and all of the operators and solvers in PyLops-mpi can handle both scenarios. Note that, since most operators in PyLops-mpi are thin wrappers around PyLops operators, the PyLops operators that lack a GPU implementation cannot be used in PyLops-mpi either when working with cupy arrays.
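For instance, a pylops_mpi.StackedDistributedArray can be assembled from DistributedArray objects built with either engine; a minimal sketch (assuming, as in the PyLops-mpi API, that the constructor takes a list of DistributedArray objects):

import numpy as np
import pylops_mpi

# Two distributed arrays sharing the same engine (swap "numpy" for "cupy"
# and np for cp to obtain a GPU-backed version)
d1 = pylops_mpi.DistributedArray(global_shape=10, engine="numpy", dtype=np.float32)
d2 = pylops_mpi.DistributedArray(global_shape=20, engine="numpy", dtype=np.float32)
d1[:] = np.ones(d1.local_shape, dtype=np.float32)
d2[:] = np.ones(d2.local_shape, dtype=np.float32)

# Stack the two arrays into a single stacked object
dstack = pylops_mpi.StackedDistributedArray([d1, d2])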
Example#
Finally, let’s briefly look at an example. First we write a code snippet using numpy arrays, which PyLops-mpi will run on your CPU:
from mpi4py import MPI
import numpy as np
import pylops
import pylops_mpi

# MPI helpers
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Create distributed data (broadcast)
nxl, nt = 20, 20
dtype = np.float32
d_dist = pylops_mpi.DistributedArray(global_shape=nxl * nt,
                                     partition=pylops_mpi.Partition.BROADCAST,
                                     engine="numpy", dtype=dtype)
d_dist[:] = np.ones(d_dist.local_shape, dtype=dtype)

# Create and apply VStack operator
Sop = pylops.MatrixMult(np.ones((nxl, nxl)), otherdims=(nt, ))
HOp = pylops_mpi.MPIVStack(ops=[Sop, ])
y_dist = HOp @ d_dist
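To inspect the result, the distributed output can be gathered into a local numpy array on every rank; a minimal sketch continuing the snippet above:

# Collect the distributed result into a local numpy array on all ranks
y = y_dist.asarray()
if rank == 0:
    print(y.shape)

A script like this is typically launched over multiple processes, e.g. mpiexec -n 4 python script.py.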
Now we write a code snippet using cupy arrays, which PyLops-mpi will run on your GPU:
from mpi4py import MPI
import numpy as np
import cupy as cp
import pylops
import pylops_mpi

# MPI helpers
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Define the GPU to use (one device per MPI rank)
cp.cuda.Device(device=rank).use()

# Create distributed data (broadcast)
nxl, nt = 20, 20
dtype = np.float32
d_dist = pylops_mpi.DistributedArray(global_shape=nxl * nt,
                                     partition=pylops_mpi.Partition.BROADCAST,
                                     engine="cupy", dtype=dtype)
d_dist[:] = cp.ones(d_dist.local_shape, dtype=dtype)

# Create and apply VStack operator
Sop = pylops.MatrixMult(cp.ones((nxl, nxl)), otherdims=(nt, ))
HOp = pylops_mpi.MPIVStack(ops=[Sop, ])
y_dist = HOp @ d_dist
The code is almost unchanged apart from the fact that we now use cupy arrays; PyLops-mpi will figure this out!
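The same gathering works on GPU: here asarray() returns a cupy array, which can be moved back to the host with cp.asnumpy if needed (a minimal sketch continuing the snippet above):

# Collect the distributed result (a cupy array) and move it to the host
y = y_dist.asarray()
y_cpu = cp.asnumpy(y)
if rank == 0:
    print(y_cpu.shape)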
Note
The CuPy backend is in active development, and many examples are not yet in the docs. You can find many other examples in the PyLops Notebooks repository.