Emcee does not use spawned multithreaded processes #669
Comments
Is this related to #666? For both @matthiasfabry and @odstrcilt: you need to read #601 and understand that I, personally, find |
@newville It is not completely related to #666 since I use the I would say however that it is unrelated to #600. I have a clear use case for posterior sampling, which I believe |
@matthiasfabry it can be an issue on EMCEE side. |
@matthiasfabry "I want it to be faster, so I'll use multiprocessing" while also confusing multiprocessing and multithreading is not at all a reassuring start. Multithreading is almost certainly not worth pursuing. Multiprocessing with lmfit (and possibly with So, sure, maybe multiprocessing will make it be faster, and maybe it will be worth the effort. |
@newville I agree multiprocessing is what we're after. For @odstrcilt your proposed fix actually slows the execution. The subprocesses do seem to run now, but not at their full speed (i.e. CPU usage is not close to 100% per core). Apparently this fix causes a lot of overhead. |
@matthiasfabry yes, multiprocessing has a large overhead. All large arguments of your cost function should be passed as global variables to reduce the overhead of creating the pickles. If one execution of your cost function takes less than 100 ms, multiprocessing can be useless. I'm using one more trick to reduce the overhead. Instead of executing each function call in a pool worker, I split all tasks by the number of workers, and then inside each worker the tasks are calculated serially. Here is an example:
|
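The chunking trick described above can be sketched as follows. This is a minimal illustration of the idea, not @odstrcilt's original code, which is not reproduced here. The helper names split_into_chunks, run_chunk, chunked_map, and toy_cost are hypothetical, and the pool is assumed to be any object with a map method, such as multiprocessing.Pool.

```python
import math
from functools import partial
from multiprocessing import Pool


def split_into_chunks(tasks, n_chunks):
    """Split tasks into at most n_chunks contiguous, nearly equal lists."""
    tasks = list(tasks)
    if not tasks:
        return []
    n_chunks = max(1, min(n_chunks, len(tasks)))
    size = math.ceil(len(tasks) / n_chunks)
    return [tasks[i:i + size] for i in range(0, len(tasks), size)]


def run_chunk(func, chunk):
    """Evaluate func serially over one chunk inside a single worker."""
    return [func(task) for task in chunk]


def chunked_map(pool, func, tasks, n_workers):
    """Map func over tasks, sending one chunk (not one task) per worker."""
    chunks = split_into_chunks(tasks, n_workers)
    nested = pool.map(partial(run_chunk, func), chunks)
    # Flatten while preserving the original task order.
    return [result for chunk_result in nested for result in chunk_result]


def toy_cost(x):
    """Hypothetical stand-in for an expensive cost function."""
    return x * x


if __name__ == "__main__":
    with Pool(processes=4) as pool:
        print(chunked_map(pool, toy_cost, range(10), n_workers=4))
```

Each worker then receives a single pickled message per chunk instead of one per task. Note that Pool.map also accepts a chunksize argument with a similar effect; whether either approach helps depends on how expensive one evaluation of the cost function is.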
From a "naive mathematical standpoint" one can say almost anything should be possible. We don't deal with naive mathematical standpoints. Python multiprocessing creates new Python processes, sends data from one process to another to do some work, and then sends data back. If that sounds simple, then you're not thinking very deeply about how to share and send objects between processes. Like @odstrcilt says, it is not at all unusual for a naive use of multiprocessing to slow down a complex calculation. |
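To get a feel for the cost of sending objects between processes, one can time the pickling step on its own. This is a minimal sketch, assuming a hypothetical large argument (a list of one million floats) and a hypothetical helper pickle_cost. The numbers depend on the machine and on the object, so none are shown.

```python
import pickle
import time


def pickle_cost(obj):
    """Return (size in bytes, seconds spent) for one pickle round trip of obj."""
    start = time.perf_counter()
    blob = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    restored = pickle.loads(blob)
    elapsed = time.perf_counter() - start
    assert restored == obj  # sanity check: the round trip is lossless
    return len(blob), elapsed


if __name__ == "__main__":
    # Hypothetical stand-in for a large argument of a cost function.
    payload = [float(i) for i in range(1_000_000)]
    n_bytes, seconds = pickle_cost(payload)
    print(f"{n_bytes / 1e6:.1f} MB pickled and restored in {seconds * 1e3:.1f} ms")
```

If this round trip takes a noticeable fraction of the time of one cost function evaluation, sending the object with every task is likely to cancel out much of the gain from extra processes.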
@newville There is absolutely no need to lecture me. I am only reporting unwanted behavior here, and I never claimed that there is a practical solution for it. As a regular user of your software, I simply noticed the @odstrcilt I will check whether your fix reduces the overhead enough to speed up the execution in my case. |
@matthiasfabry @odstrcilt As mentioned earlier and discussed in #601, I lean more to deprecating Again, if anyone is expecting |
@newville That's fair. Finally then, @odstrcilt Your custom pool class seems to have a bug in it.
Have you an idea for a fix or workaround? |
@matthiasfabry |
@matthiasfabry is this resolved? I cannot tell... |
Not exactly, but it seems this issue is not directly related to |
Description
When using minimizer.emcee() and supplying a pool of workers, multiple processes are spawned but do not do any work. In my case (macOS with 16 threads), sixteen instances of python3.8 appear in Activity Monitor, 15 of which stay idle at 0.0% CPU usage. In practice, then, using processes=1 or processes=os.cpu_count() makes no difference in execution time.
A Minimal, Complete, and Verifiable example
Version information
lmfit: 1.0.1, scipy: 1.5.0, numpy: 1.19.1, asteval: 0.9.16, uncertainties: 3.1.4