Message Passing Interface (MPI) Parallelism
===========================================

.. sectionauthor:: Shane J. Latham

.. currentmodule:: mango.mpi

.. contents::

Mango takes advantage of multi-core parallelism on both shared-memory and
distributed-memory architectures through
`Message Passing Interface (MPI) `_ algorithm implementations
(see also `Wikipedia Message Passing Interface `_).

:mod:`mango.mpi` module
-----------------------

The :mod:`mango.mpi` module contains functions and variables for
handling various aspects related to MPI. Some useful functions and
attributes include:

:func:`mango.mpi.getLoggers` and :func:`mango.mpi.initialiseLoggers`
   Functions for generating and initialising :obj:`logging.Logger`
   objects which are configured to output MPI rank and message-time
   information when logging messages.
:func:`mango.mpi.getCartShape`
   Returns a tuple indicating the shape of a cartesian grid of MPI processes.
:data:`mango.mpi.rank` and :data:`mango.mpi.size`
   Convenience attributes for *world* communicator rank and
   *world* communicator size, respectively.

Executing Python scripts in parallel
------------------------------------

Typically, MPI-parallel Python is coded in a script file and is executed using
the *mpirun* command. If the following script is written to the
file :samp:`example_mpirun_script.py`:

.. literalinclude:: examples/example_mpirun_script.py

then it can be executed serially on a single process as::

   mpirun -np 1 python example_mpirun_script.py

and the output is:

.. literalinclude:: examples/example_mpirun_script_np1.txt

Running on 2 MPI processes::

   mpirun -np 2 python example_mpirun_script.py

generates output:

.. literalinclude:: examples/example_mpirun_script_np2.txt

and, finally, running on 8 MPI processes::

   mpirun -np 8 python example_mpirun_script.py

generates output:

.. literalinclude:: examples/example_mpirun_script_np8.txt

Image domain decomposition
--------------------------

Mango image processing algorithms typically use
`data parallelism `_ to achieve concurrency.
By default, an image is decomposed into (approximately) equal-sized
disjoint rectangular sub-domains, with each sub-domain residing
in the memory of a single MPI process. To illustrate, consider the
following script (in a file named :samp:`domdecom.py`) which simply
creates a :obj:`mango.Dds` and sets the elements (voxels)
of the array to the rank of the MPI process on which the elements
are allocated:

.. literalinclude:: examples/domdecom.py

If this is executed on a single MPI process then all elements
of the :samp:`(1,768,512)` shaped :obj:`mango.Dds` array
are allocated on the single *rank 0* MPI process. Of course, in this case
the domain decomposition consists of one sub-domain which is identical
to the global-domain.

+-----------------------------------------------------------------------+
+-----------------------------------------------------------------------+
|Decomposition for::                                                    |
|                                                                       |
|   mpirun -np 1 domdecom.py 1,0,0                                      |
|                                                                       |
|.. image:: examples/images/domainDecompImgSz1x768x512MpiDims1x1x1.png  |
|   :align: center                                                      |
+-----------------------------------------------------------------------+

When executing on two MPI processes, disjoint half-domains are allocated
on each of the MPI processes.

+-----------------------------------------------------------------------+-----------------------------------------------------------------------+
+-----------------------------------------------------------------------+-----------------------------------------------------------------------+
|Decomposition for::                                                    |Decomposition for::                                                    |
|                                                                       |                                                                       |
|   mpirun -np 2 domdecom.py 1,0,0                                      |   mpirun -np 2 domdecom.py 1,0,1                                      |
|                                                                       |                                                                       |
|.. image:: examples/images/domainDecompImgSz1x768x512MpiDims1x1x2.png  |.. image:: examples/images/domainDecompImgSz1x768x512MpiDims1x2x1.png  |
|   :align: center                                                      |   :align: center                                                      |
+-----------------------------------------------------------------------+-----------------------------------------------------------------------+

And for 9 MPI processes the decomposition is as follows:

+-----------------------------------------------------------------------+-----------------------------------------------------------------------+
+-----------------------------------------------------------------------+-----------------------------------------------------------------------+
|Decomposition for::                                                    |Decomposition for::                                                    |
|                                                                       |                                                                       |
|   mpirun -np 9 domdecom.py 1,0,0                                      |   mpirun -np 9 domdecom.py 1,0,1                                      |
|                                                                       |                                                                       |
|.. image:: examples/images/domainDecompImgSz1x768x512MpiDims1x3x3.png  |.. image:: examples/images/domainDecompImgSz1x768x512MpiDims1x9x1.png  |
|   :align: center                                                      |   :align: center                                                      |
+-----------------------------------------------------------------------+-----------------------------------------------------------------------+

A :obj:`mango.Dds` object has three attributes associated with
the MPI domain decomposition:

Attribute :data:`mango.Dds.mpi`
   A :obj:`mango.DdsMpiInfo` object which possesses attributes related to
   the MPI layout (:samp:`mpidims`), MPI communicator
   (:obj:`mpi4py.MPI.Cartcomm`) and the index of an MPI-process/sub-domain
   within the sub-domain grid.
Attribute :data:`mango.Dds.subd`
   A :obj:`mango.DdsNonHaloSubDomain` object with attributes related to the
   shape and position of the non-halo-sub-domain residing on an MPI process.
Attribute :data:`mango.Dds.subd_h`
   A :obj:`mango.DdsHaloSubDomain` object with attributes related to the
   shape and position of the halo-sub-domain residing on an MPI process.

In the above images the text labels (black text on rectangular white background)
are generated by converting the :samp:`dds.subd.origin`, :samp:`dds.subd.shape`,
:samp:`dds.subd.mpi.rank` and :samp:`dds.subd.mpi.index` attributes to strings.

.. _tutorial-sub-domains-and-halos:

Sub-domains and halos
---------------------

Often, when processing the image sub-domain on one MPI process, the computation
requires data which has been allocated/assigned on a remote MPI process.
For example, when computing the convolution of an image with
a :samp:`(3,3,3)` shaped kernel, say, the convolution value at each
voxel requires a :samp:`(3,3,3)` shaped neighbourhood of voxels from the
original image. When the sub-domains are disjoint, the border voxels of
the sub-domain do not have complete neighbourhoods because some of the
neighbourhood voxels reside on a different MPI process (or even outside
the global image domain).

Enter the *halo* voxels. The halo is just a glorified term for an expansion
of the MPI sub-domains, so that the sub-domains are no longer disjoint.
When a :obj:`mango.Dds` object is created with a positive-sized
halo, the sub-domain shape is enlarged by two times the halo-size, so each
sub-domain has (halo-sized deep) border-layers of voxels which *overlap* with
neighbouring sub-domains. Having the populated halo-voxels gives the
neighbourhood voxels required to compute all convolution values on
the disjoint (non-halo) sub-domains. The :mod:`mango` image processing routines
are *halo aware* in that they compute image processing values for
disjoint non-halo-sub-domain voxels but use/read the halo region voxels
required to complete calculations.

The :data:`mango.Dds.subd` attribute describes the non-halo-sub-domain
(*exclusive* sub-domain) and the :data:`mango.Dds.subd_h` attribute describes
the halo-sub-domain (*inclusive* sub-domain), so that (using a single
MPI process)::

   >>> import mango
   >>> dds = mango.zeros(shape=(1,16,32), mtype="tomo", halo=(0, 4, 8), origin=(10,20,40))
   >>> # Non-halo (exclusive) sub-domain origin and shape.
   >>> print("(dds.subd.origin, dds.subd.shape) = (%s,%s)" % (dds.subd.origin, dds.subd.shape))
   (dds.subd.origin, dds.subd.shape) = ([10 20 40],[ 1 16 32])
   >>> # Halo (inclusive) sub-domain origin and shape.
   >>> print("(dds.subd_h.origin, dds.subd_h.shape) = (%s,%s)" % (dds.subd_h.origin, dds.subd_h.shape))
   (dds.subd_h.origin, dds.subd_h.shape) = ([10 16 32],[ 1 24 48])
   >>> # The halo origin is offset backwards by the halo size.
   >>> print(dds.subd.origin == (dds.subd_h.origin + dds.subd_h.halo))
   True
   >>> print(dds.subd.shape == (dds.subd_h.shape - 2*dds.subd_h.halo))
   True

The :data:`mango.Dds.subd` and :data:`mango.Dds.subd_h` objects each
have :samp:`asarray` methods (:meth:`mango.DdsNonHaloSubDomain.asarray`
and :meth:`mango.DdsHaloSubDomain.asarray`, respectively).
The :meth:`mango.DdsHaloSubDomain.asarray` method returns a :obj:`numpy.ndarray`
object which references the entire halo-sub-domain array (which is the same
as the :meth:`mango.Dds.asarray` method).
The :meth:`mango.DdsNonHaloSubDomain.asarray` method also returns
a :obj:`numpy.ndarray` object; however, this object is a *view* of the
halo-sub-domain array which is restricted to the non-halo-sub-domain region::

   >>> import mango
   >>> import numpy as np
   >>> dds = mango.zeros(shape=(1,16,32), mtype="tomo", halo=(0, 4, 8), origin=(10,20,40))
   >>> print(
   ...     "(dds.subd.asarray().shape, dds.subd_h.asarray().shape) = (%s,%s)"
   ...     %
   ...     (dds.subd.asarray().shape, dds.subd_h.asarray().shape)
   ... )
   (dds.subd.asarray().shape, dds.subd_h.asarray().shape) = ([ 1 16 32],[ 1 24 48])
   >>> print(dds.subd.asarray().shape == (dds.subd_h.asarray().shape - 2*dds.subd_h.halo))
   True
   >>> print(
   ...     "(dds.subd.asarray()[0,0,0], dds.subd_h.asarray()[tuple(dds.subd_h.halo)]) = (%s,%s)"
   ...     %
   ...     (dds.subd.asarray()[0,0,0], dds.subd_h.asarray()[tuple(dds.subd_h.halo)])
   ... )
   (dds.subd.asarray()[0,0,0], dds.subd_h.asarray()[tuple(dds.subd_h.halo)]) = (0,0)
   >>> # Writing through the non-halo view changes the halo-sub-domain array too.
   >>> dds.subd.asarray()[0,0,0] = 8
   >>> print(
   ...     "(dds.subd.asarray()[0,0,0], dds.subd_h.asarray()[tuple(dds.subd_h.halo)]) = (%s,%s)"
   ...     %
   ...     (dds.subd.asarray()[0,0,0], dds.subd_h.asarray()[tuple(dds.subd_h.halo)])
   ... )
   (dds.subd.asarray()[0,0,0], dds.subd_h.asarray()[tuple(dds.subd_h.halo)]) = (8,8)
   >>> print("np.may_share_memory(dds.subd.asarray(), dds.subd_h.asarray())=%s" % (np.may_share_memory(dds.subd.asarray(), dds.subd_h.asarray()),))
   np.may_share_memory(dds.subd.asarray(), dds.subd_h.asarray())=True

Image domain decomposition with halos
-------------------------------------

Now, returning to the :samp:`domdecom.py` script, we create a modified version
(:samp:`halodomdecom.py`) which allows the specification of a halo value:

.. literalinclude:: examples/halodomdecom.py

A :samp:`halo=(0,halo,halo)` argument has been added to the :func:`mango.zeros`
function call. Each halo-sub-domain array now has an increase in shape
by :samp:`(0,2*halo,2*halo)` over the default :samp:`halo=(0,0,0)`
halo-sub-domains created in the :samp:`domdecom.py` script. The assignment
statement :samp:`dds.subd_h.asarray()[...] = dds.mpi.comm.Get_rank()`
assigns all elements of the sub-domain array (even the halo voxels)
to the rank of the corresponding MPI process.
The call to :meth:`mango.Dds.updateHaloRegions()` corrects the halo
region values by fetching data from the appropriate remote MPI
processes and assigning the correct sub-domain values to the halo voxels.

A side effect of the sub-domain expansion is that the global-domain also
increases its shape by :samp:`(0,2*halo,2*halo)`. Because the
global halo regions lie outside the original domain, there is no information
on how to assign values to these voxels. Typically they are either assigned
a constant value or the image is *mirrored* into these global halo regions.
In the :samp:`halodomdecom.py` script the global halo voxels are assigned
an `invalid` rank (one greater than the maximum valid rank)
using the :meth:`mango.Dds.setBorderToValue` method.
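The fill, update and border-assignment steps described above can be mimicked
without MPI. The following is a minimal sketch in plain NumPy, assuming a
one-dimensional row of equal-sized sub-domains held in a single process. The
helper names ``make_halo_subdomains`` and ``update_halo_regions`` are
hypothetical and are not part of the :mod:`mango` API; they only illustrate
the idea of replacing halo values with neighbouring non-halo data.

```python
import numpy as np


def make_halo_subdomains(global_size, num_ranks, halo):
    """Return one 1-D halo-sub-domain array per simulated rank.

    Every element, halo included, is initialised to the owning rank.
    """
    if global_size % num_ranks != 0:
        raise ValueError("this sketch assumes equal-sized sub-domains")
    sub_size = global_size // num_ranks
    if halo > sub_size:
        raise ValueError("halo must not exceed the non-halo sub-domain size")
    return [
        np.full(sub_size + 2 * halo, rank, dtype=np.int64)
        for rank in range(num_ranks)
    ]


def update_halo_regions(subdomains, halo, border_value):
    """Overwrite halo elements with neighbouring non-halo data (in place)."""
    if halo == 0:
        return
    last = len(subdomains) - 1
    for rank, arr in enumerate(subdomains):
        if rank > 0:
            # Low halo <- the last `halo` non-halo elements of the left neighbour.
            left = subdomains[rank - 1]
            arr[:halo] = left[len(left) - 2 * halo:len(left) - halo]
        else:
            # No neighbour: this halo lies outside the global domain.
            arr[:halo] = border_value
        if rank < last:
            # High halo <- the first `halo` non-halo elements of the right neighbour.
            right = subdomains[rank + 1]
            arr[-halo:] = right[halo:2 * halo]
        else:
            arr[-halo:] = border_value


num_ranks, halo = 3, 2
subdomains = make_halo_subdomains(global_size=12, num_ranks=num_ranks, halo=halo)
# Use an "invalid" rank (one greater than the maximum valid rank) for the border.
update_halo_regions(subdomains, halo, border_value=num_ranks)
for rank, arr in enumerate(subdomains):
    print(rank, arr, "non-halo view:", arr[halo:-halo])
```

Only halo elements are written and only non-halo elements are read, so the
order in which the simulated ranks are updated does not matter. In a real MPI
run the neighbouring data lives in another process and has to be communicated
rather than sliced.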
Executing the :samp:`halodomdecom.py` script with a halo of size :samp:`64`
using a single MPI process gives a :obj:`mango.Dds` array where the
central :samp:`(1,768,512)` shaped region is assigned to :samp:`0` and the
outer :samp:`64`-wide global halo region is assigned to :samp:`1`:

+----------------------------------------------------------------------------------+
+----------------------------------------------------------------------------------+
|Decomposition with halo for::                                                     |
|                                                                                  |
|   mpirun -np 1 halodomdecom.py 1,0,0 64                                          |
|                                                                                  |
|.. image:: examples/images/domainDecompImgSz1x768x512MpiDims1x1x1Halo0x64x64.png  |
|   :align: center                                                                 |
+----------------------------------------------------------------------------------+

Note that the :data:`mango.Dds.shape`, :data:`mango.Dds.origin`,
:data:`mango.DdsNonHaloSubDomain.shape` and :data:`mango.DdsNonHaloSubDomain.origin`
attributes **do not** account for the halo voxels; they refer to the
non-halo region.

Running on 9 MPI processes:

+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------+
+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------+
|Decomposition with halo for::                                                     |Rank 0 halo sub-domain for::                                                                |Rank 4 halo sub-domain for::                                                                |
|                                                                                  |                                                                                            |                                                                                            |
|   mpirun -np 9 halodomdecom.py 1,0,0 64                                          |   mpirun -np 9 halodomdecom.py 1,0,0 64                                                    |   mpirun -np 9 halodomdecom.py 1,0,0 64                                                    |
|                                                                                  |                                                                                            |                                                                                            |
|.. image:: examples/images/domainDecompImgSz1x768x512MpiDims1x3x3Halo0x64x64.png  |.. image:: examples/images/domainDecompImgSz1x768x512MpiDims1x3x3MpiIdx0x0x0Halo0x64x64.png |.. image:: examples/images/domainDecompImgSz1x768x512MpiDims1x3x3MpiIdx0x1x1Halo0x64x64.png |
|   :align: center                                                                 |   :align: center                                                                           |   :align: center                                                                           |
+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------+

Image Generating Script Source
------------------------------

For interest, we include the Python source code which was used
to generate the above domain decomposition images:

.. literalinclude:: examples/gen_domain_decomp_images.py

and the shell script for generating all images:

.. literalinclude:: examples/images/genimages.sh