Multiprocessing

Extensive data studies based on the HyperStudy or ChangepointStudy classes may involve tens or even hundreds of thousands of individual fits (see e.g. here). Since these fits, each with different hyper-parameter values, are independent of each other, the computational workload can be distributed across the cores of a multi-core processor. To keep things simple, bayesloop uses object serialization to create duplicates of the current HyperStudy or ChangepointStudy instance and distributes them across the specified number of cores. In principle, this procedure could be handled by the built-in Python module multiprocessing. However, multiprocessing relies on the built-in module pickle for object serialization, which fails to serialize the classes defined in bayesloop. bayesloop therefore uses a different version of the multiprocessing module that is part of the pathos package.
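To illustrate the kind of serialization failure mentioned above, here is a minimal sketch using only the standard library. The class Model and the helper can_pickle are hypothetical and not part of bayesloop. The sketch assumes one common cause of pickle failures, an instance attribute holding a lambda function; whether this is the actual reason the bayesloop classes cannot be pickled is not stated here.

```python
import pickle


class Model:
    """Hypothetical class, not part of bayesloop."""

    def __init__(self):
        # A lambda stored on the instance: the standard pickle module
        # cannot serialize it, so pickling the whole instance fails.
        self.transform = lambda x: 2 * x


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        # The exact exception type depends on the Python version.
        return False


print(can_pickle([1, 2, 3]))  # plain containers can be pickled
print(can_pickle(Model()))    # the lambda attribute prevents pickling
```

Since the built-in multiprocessing module has to pickle objects before sending them to worker processes, an object like this cannot be passed to workers that way.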

The latest version of pathos can be installed directly via pip, but requires git:

pip install git+https://github.com/uqfoundation/pathos

Note: Windows users need to install a C compiler before installing pathos. One possible solution for 64-bit systems is to install the Microsoft Visual C++ 2008 SP1 Redistributable Package (x64) and the Microsoft Visual C++ Compiler for Python 2.7.
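A quick way to check whether the installation succeeded is to ask Python whether the package can be found. The following is a small sketch using only the standard library; the helper name is_installed is made up for illustration.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be located by the import system."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("pathos"):
    print("pathos is available")
else:
    print("pathos was not found; see the installation command above")
```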

Once pathos is installed correctly, the number of cores to use in a hyper-study or change-point study can be specified with the keyword argument nJobs of the fit method. Example:

S.fit(silent=True, nJobs=4)
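The general idea behind nJobs, running independent fits in several worker processes, can be sketched with the built-in multiprocessing module. This is not how bayesloop implements it: the names mock_fit and fit_all are hypothetical, the "fit" is a stand-in formula, and the sketch only works because a plain module-level function and plain floats can be pickled.

```python
from multiprocessing import Pool


def mock_fit(value):
    """Stand-in for one fit: returns a dummy score for a hyper-parameter value."""
    return -(value - 0.3) ** 2


def fit_all(values, n_jobs=1):
    """Evaluate mock_fit for every value, optionally using n_jobs processes."""
    if n_jobs <= 1:
        # Serial fallback: no worker processes are started.
        return [mock_fit(v) for v in values]
    # Each value is handled independently, so the work can be split freely.
    with Pool(processes=n_jobs) as pool:
        return pool.map(mock_fit, values)
```

In a script, a call such as fit_all(grid, n_jobs=4) should be placed inside an `if __name__ == "__main__":` block, since some platforms start worker processes by re-importing the main module.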