
Emcee does not use spawned multithreaded processes #669

Closed
matthiasfabry opened this issue Aug 27, 2020 · 13 comments
@matthiasfabry

Description

When using minimizer.emcee() and supplying a pool of workers, multiple processes are spawned but do no work. In my case (macOS, 16 threads), sixteen instances of python3.8 appear in Activity Monitor, fifteen of which stay idle at 0.0% CPU usage. In practice, then, using processes=1 or processes=os.cpu_count() makes no difference in execution time.

A Minimal, Complete, and Verifiable example
import os
from multiprocessing.pool import Pool

import lmfit
import scipy.optimize  # 'import scipy' alone does not guarantee scipy.optimize is importable

def cost_fun(params, **kwargs):
    # gradient of the Rosenbrock function, used as a stand-in residual
    return scipy.optimize.rosen_der([params['a'].value, params['b'].value])


if __name__ == '__main__':
    params = lmfit.Parameters()
    params.add('a', 1, min=-5, max=5, vary=True)
    params.add('b', 1, min=-5, max=5, vary=True)

    fitter = lmfit.Minimizer(cost_fun, params)
    with Pool(processes=os.cpu_count()) as pool:
        MC_results = fitter.emcee(workers=pool, steps=10000)
Version information

lmfit: 1.0.1, scipy: 1.5.0, numpy: 1.19.1, asteval: 0.9.16, uncertainties: 3.1.4

@newville
Member

newville commented Aug 27, 2020

Is this related to #666?

For both @matthiasfabry and @odstrcilt: you need to read #601 and understand that I, personally, find emcee to be a wart in lmfit. If you want to use it and want any further changes, you will have to provide and support them.

@matthiasfabry
Author

@newville It is not completely related to #666, since I use the multiprocessing builtin Pool, but the main point here is that the pool subprocesses sit idle while one process (probably the main one) executes the MCMC sampling on its own.

I would say, however, that it is unrelated to #601. I have a clear use case for posterior sampling, which I believe emcee offers correctly and easily. With this issue I merely report that the implementation of the underlying parallelization is somehow incomplete. I'm not at all an expert in these things, so I wouldn't know whether this issue is on the lmfit side or the emcee side (or even deeper, in multiprocessing, though I doubt that), but I would like to use multithreaded capabilities to speed up my MCMC posterior distribution sampling.

@odstrcilt

@matthiasfabry it could be an issue on the emcee side.
Try changing the emcee source code: in the file ensemble.py, in the __getstate__ method, replace
d = self.__dict__
with
d = dict(self.__dict__)
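If I read the patch right, the copy matters because `__getstate__` usually strips unpicklable attributes (such as the pool) before pickling, and doing that on `self.__dict__` directly also strips them from the live sampler. A minimal sketch (an illustrative class, not emcee's actual code):

```python
import pickle

class Sampler:
    # Illustrative stand-in for emcee's sampler: __getstate__ must copy the
    # instance dict before stripping attributes that cannot be pickled.
    def __init__(self, pool):
        self.pool = pool

    def __getstate__(self):
        d = dict(self.__dict__)  # 'd = self.__dict__' would alias the live object
        d['pool'] = None         # strip the (normally unpicklable) pool
        return d

s = Sampler(pool=object())
pickle.dumps(s)                  # the pickled copy carries pool=None
assert s.pool is not None        # the live sampler keeps its pool
```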

@newville
Member

@matthiasfabry "I want it to be faster, so I'll use multiprocessing" while also confusing multiprocessing and multithreading is not at all a reassuring start. Multithreading is almost certainly not worth pursuing. Multiprocessing with lmfit (and possibly with emcee) is definitely complicated by the fact that multiprocessing relies on pickle. You can probably make it work for a silly example like the rosenbrook function, but once you are trying to solve a real problem, dragons will quickly be revealed.

So, sure, maybe multiprocessing will make it be faster, and maybe it will be worth the effort.

@matthiasfabry
Author

@newville I agree multiprocessing is what we're after. For emcee, I don't actually think dragons should appear when doing this... You can perfectly well sample a posterior distribution with workers independent of each other, no matter what function you are minimizing or optimizing. Again, I have no deep understanding of Python or pickle and what that might entail, but at least from a naive mathematical standpoint it should be possible. In the worst case I can imagine writing a wrapper function that spawns subprocesses, each calling emcee(steps=totalsteps / cpus), dividing the total number of steps over the different processes. Combining the results into one uncertainty interval, however, is not trivial with this idea.

@odstrcilt your proposed fix actually slows the execution. The subprocesses do seem to run now, but not at their full speed (i.e. CPU usage is not close to 100% per core). Apparently this fix causes a lot of overhead.

@odstrcilt

@matthiasfabry yes, multiprocessing has a large overhead. All large arguments of your cost function should be passed as global variables to reduce the overhead of creating the pickles. If your cost function executes in less than ~100 ms, multiprocessing can be useless.

I'm using one more trick to reduce the overhead. Instead of executing each function call in a pool worker, I split all tasks among the workers, and each worker then computes its share serially. Here is an example:


import numpy as np
from multiprocessing import Pool


class fast_pool:
    # vectorised pool; reduces multiprocessing overhead by splitting the
    # tasks among the workers so each worker evaluates its share serially
    def __init__(self, pool):
        self.pool = pool

    def map(self, f, arg):
        npool = len(self.pool._pool)

        arg_list = np.array_split(arg, npool)
        vf = np.vectorize(f, signature='(n)->()')

        res_list = self.pool.map(vf, arg_list)

        return np.hstack(res_list)


global large_data

fitter = lmfit.Minimizer(cost_fun, **fitter_kwds)
fitter.emcee(workers=fast_pool(Pool(n_processes)), steps=10000, **emcee_args)

@newville
Member

@matthiasfabry

You can perfectly sample a posterior distribution with workers independently of each other, no matter what function you are minimizing or optimizing for. Again, I have no deep understanding of python or pickle and what that might entail, but at least from a naive mathematical standpoint, it should be possible.

From a "naive mathematical standpoint" one can say almost anything should be possible. We don't deal with naive mathematical standpoints.

Python multiprocessing creates new Python processes, sends data from one process to another to do some work, and then sends data back. If that sounds simple, then you're not thinking very deeply about how to share and send objects between processes.
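One concrete consequence: only objects with importable names survive the pickle round-trip that multiprocessing relies on. A small sketch of this limitation:

```python
import pickle

def model(x):
    # Module-level functions pickle by reference (their importable name)...
    return x ** 2

roundtrip = pickle.loads(pickle.dumps(model))

try:
    # ...while lambdas and nested closures have no importable name and fail
    # to pickle, one of the "dragons" once a real model function is involved.
    pickle.dumps(lambda x: x ** 2)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False
```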

Like @odstrcilt says, it is not at all unusual for a naive use of multiprocessing to slow down a complex calculation.

@matthiasfabry
Author

@newville There is absolutely no need to lecture me. I am only reporting unwanted behavior here, and I never claimed that there is a practical solution for it. As a regular user of your software, I simply noticed that the workers argument of minimizer.emcee() doesn't work as advertised in the documentation. It is up to you and the development team to decide whether to fix this. If not, fine, but then remove the feature you claim to provide and this issue turns into a feature request. If you do, I'm sure the community would greatly appreciate it, as it will speed up not only my research but also other people's work.

@odstrcilt I will check whether your fix reduces the overhead enough to speed up the execution in my case.

@newville
Member

@matthiasfabry @odstrcilt As mentioned earlier and discussed in #601, I lean more toward deprecating Minimizer.emcee than toward trying to make it work better. It simply does not belong with the other methods of Minimizer, which are actually solvers - emcee() is not a solver.

Again, if anyone is expecting Minimizer.emcee() to work well and be generally useful then there is real work to do. That will have to be done (and supported) by someone other than me. Perhaps it would make sense to move these routines out of Minimizer. Whether lmfit's emcee method supports multiprocessing could certainly be part of that effort if someone wants to do that.

@matthiasfabry
Author

@newville That's fair. emcee is indeed not a solver, but I agree with @reneeotten in #601 that it has its place within lmfit (but indeed maybe not as part of the Minimizer class, that's an implementation issue the developers need to decide on). Doing MCMC posterior distribution sampling is a natural continuation of any high-dimensional minimization problem with a regular minimizer (say Nelder-Mead or Levenberg-Marquardt), where brute forcing is simply too expensive. If I'm not mistaken MCMC also takes autocorrelations naturally into account.

Finally then, emcee does work correctly, mind you, albeit on a single thread. I repeat one last time that I noticed the multiprocessed capabilities don't function as advertised.

@odstrcilt Your custom pool class seems to have a bug in it. array_split() throws the exception:

 File "/Users/matthiasf/Anaconda3/envs/spinOS/lib/python3.8/site-packages/numpy/lib/shape_base.py", line 769, in array_split
    Ntotal = len(ary)
TypeError: object of type 'generator' has no len()

Do you have an idea for a fix or workaround?

@odstrcilt

@matthiasfabry
try replacing the line
arg_list = np.array_split(arg, npool)
with
arg_list = np.array_split(list(arg), npool)

@newville
Member

@matthiasfabry is this resolved? I cannot tell...

@matthiasfabry
Author

Not exactly, but it seems this issue is not directly related to lmfit. I will close this; thanks for your input.
