Benchmark Utility in PyLops-MPI

This tutorial demonstrates how to use the pylops_mpi.utils.benchmark and pylops_mpi.utils.mark utility methods in PyLops-MPI. It covers various function-calling patterns that may come up when benchmarking distributed code.

pylops_mpi.utils.benchmark is a decorator that measures the execution time of the decorated function from start to finish. pylops_mpi.utils.mark is a function used inside a benchmark-decorated function to provide fine-grained time measurements.

import sys
import logging
import numpy as np
from mpi4py import MPI
from pylops_mpi import DistributedArray, Partition

np.random.seed(42)
rank = MPI.COMM_WORLD.Get_rank()

par = {'global_shape': (500, 501),
       'partition': Partition.SCATTER, 'dtype': np.float64,
       'axis': 1}

Let’s start by importing the utility and a simple example

from pylops_mpi.utils.benchmark import benchmark, mark


@benchmark
def inner_func(par):
    dist_arr = DistributedArray(global_shape=par['global_shape'],
                                partition=par['partition'],
                                dtype=par['dtype'], axis=par['axis'])
    # may perform computation here
    dist_arr.dot(dist_arr)

When we call inner_func, the result of the benchmark is printed to standard output. If we want to customize the function name in the printout, we can pass the description parameter to benchmark, i.e., @benchmark(description="printout_name")

inner_func(par)
[decorator]inner_func: total runtime: 0.001975 s
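To build intuition for what the decorator does under the hood, here is a simplified, stdlib-only sketch. This is not the actual PyLops-MPI implementation (which also handles the distributed setting); simple_benchmark is a hypothetical name, built on time.perf_counter, that supports both bare decoration and the description-style customization shown above.

```python
import functools
import time


def simple_benchmark(func=None, *, description=None):
    """Toy timing decorator: usable bare or with a custom printout name."""
    def decorate(f):
        name = description if description is not None else f.__name__

        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = f(*args, **kwargs)
            elapsed = time.perf_counter() - start
            print(f"[decorator]{name}: total runtime: {elapsed:.6f} s")
            return result
        return wrapper

    # supports both @simple_benchmark and @simple_benchmark(description=...)
    return decorate if func is None else decorate(func)


@simple_benchmark(description="printout_name")
def work():
    return sum(i * i for i in range(1000))


work()
```

The `func=None` trick is what lets a single decorator be applied either with or without arguments.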

We may also want fine-grained time measurements by timing the execution of arbitrary lines of code. pylops_mpi.utils.mark provides such a utility.

@benchmark
def inner_func_with_mark(par):
    mark("Begin array constructor")
    dist_arr = DistributedArray(global_shape=par['global_shape'],
                                partition=par['partition'],
                                dtype=par['dtype'], axis=par['axis'])
    mark("Begin dot")
    dist_arr.dot(dist_arr)
    mark("Finish dot")

Now when we run it, we get the detailed time measurements. Note the [decorator] tag next to the function name, which distinguishes the start-to-end time measurement of the top-level function from those that come from pylops_mpi.utils.mark

inner_func_with_mark(par)
[decorator]inner_func_with_mark: total runtime: 0.000433 s
        Begin array constructor-->Begin dot: 0.000025 s
        Begin dot-->Finish dot: 0.000404 s

These benchmarking routines can also be nested. Let’s define an outer function that internally calls the decorated inner_func_with_mark

@benchmark
def outer_func_with_mark(par):
    mark("Outer func start")
    inner_func_with_mark(par)
    dist_arr = DistributedArray(global_shape=par['global_shape'],
                                partition=par['partition'],
                                dtype=par['dtype'], axis=par['axis'])
    dist_arr + dist_arr
    mark("Outer func ends")

If we run outer_func_with_mark, the time measurements are printed with nested indentation to indicate the nested calls.

outer_func_with_mark(par)
[decorator]outer_func_with_mark: total runtime: 0.001168 s
        [decorator]inner_func_with_mark: total runtime: 0.000407 s
                Begin array constructor-->Begin dot: 0.000023 s
                Begin dot-->Finish dot: 0.000381 s
        Outer func start-->Outer func ends: 0.001163 s

In some cases, we may want to write the benchmark output to a text file. pylops_mpi.utils.benchmark also accepts a logging.Logger as an argument. Here we define a simple make_logger. We set logger.propagate = False to isolate the logging of our benchmark from that of the rest of the code.

save_file = True
file_path = "benchmark.log"


def make_logger(save_file=False, file_path=''):
    logger = logging.getLogger(__name__)
    logger.setLevel(logging.INFO)
    # do not bubble records up to the root logger's handlers
    logger.propagate = False
    if save_file:
        handler = logging.FileHandler(file_path, mode='w')
    else:
        handler = logging.StreamHandler(sys.stdout)
    logger.addHandler(handler)
    return logger


logger = make_logger(save_file, file_path)
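The effect of logger.propagate = False can be seen with a stdlib-only example, independent of PyLops-MPI. The logger name "benchmark_demo" and the file name "benchmark_demo.log" are hypothetical, chosen for illustration: records go only to the handlers attached to our logger, not to any handlers configured on the root logger.

```python
import logging

# hypothetical logger and file names, for illustration only
demo_logger = logging.getLogger("benchmark_demo")
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False  # keep records out of the root logger's handlers

file_handler = logging.FileHandler("benchmark_demo.log", mode="w")
demo_logger.addHandler(file_handler)

# this record lands only in benchmark_demo.log
demo_logger.info("inner_func_with_logger: total runtime: 0.001 s")
file_handler.flush()

with open("benchmark_demo.log") as f:
    print(f.read().strip())
```

Without propagate = False, the same record would also be handled by the root logger, which typically means a duplicate line on the console.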

Then we can pass the logger to the pylops_mpi.utils.benchmark

@benchmark(logger=logger)
def inner_func_with_logger(par):
    dist_arr = DistributedArray(global_shape=par['global_shape'],
                                partition=par['partition'],
                                dtype=par['dtype'], axis=par['axis'])
    # may perform computation here
    dist_arr.dot(dist_arr)

Run this function and observe that the file benchmark.log is written.

inner_func_with_logger(par)

Total running time of the script: (0 minutes 0.007 seconds)
