We use s3fs to access s3, which is apparently not safe for multi-processing. For example, I have data-loading code that calls read_eval_log, and using a ProcessPoolExecutor (but not a ThreadPoolExecutor!) hangs when read_eval_log is pointed at an s3 bucket.
I opened an issue in their repo, and they replied that this is a known limitation and suggested using the spawn start method (fresh Python interpreter state) when doing multi-processing.
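The suggested workaround can be sketched as below. This is a sketch, not Inspect's actual code: `load_log` is a hypothetical stand-in for `read_eval_log` (so the example runs without an s3 bucket), and the log paths are made up. The relevant line is passing a spawn context via `mp_context`, which gives each worker a fresh interpreter instead of inheriting the parent's s3fs/fsspec event-loop state via fork.

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor


def load_log(path: str) -> str:
    # Stand-in for the real call, which would be roughly:
    #   from inspect_ai.log import read_eval_log
    #   return read_eval_log(path)
    return f"loaded {path}"


def load_all(paths):
    # With the default "fork" start method on Linux, workers inherit s3fs
    # state from the parent and can deadlock; "spawn" starts clean processes.
    ctx = multiprocessing.get_context("spawn")
    with ProcessPoolExecutor(mp_context=ctx) as pool:
        return list(pool.map(load_log, paths))


if __name__ == "__main__":
    print(load_all(["s3://bucket/log1.eval", "s3://bucket/log2.eval"]))
```

Note that spawn requires worker functions to be importable (defined at module level) and is slower to start than fork, which is part of why a more robust fix would be preferable.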
This works, but we should have a more robust solution for multi-processing.
max-kaufmann changed the title from "The library we use to access s3 isn't thread-safe" to "Multi-processing with s3 is broken, unless you use spawn()" on Nov 13, 2024
max-kaufmann changed the title to "Multi-processing while reading s3 logs hangs, unless you use spawn to launch your subprocesses" on Nov 13, 2024
max-kaufmann changed the title to "Multi-processing w/ s3 logs, unless you use spawn to launch your subprocesses" on Nov 13, 2024
@max-kaufmann We have made some other changes to back off of s3fs concurrency (as it may have been the cause of other deadlocks we saw stack traces for). I think in the name of not ever hanging we probably won't do more inside of Inspect here, but I'd certainly be interested if there is a documented solution for the "right" way to do this with multi-processing (and would add that to our docs for users looking to get better parallelism).
Another thought would be to find some robust way to download s3 logs in parallel (an entirely different package, or an alternate use of s3fs) and then, once they are local, use multi-processing on them (at that point it is just zip reading, which seems like it would be multi-process safe). Again, if we can sort out something robust here we could either include it in Inspect or minimally add it to the docs.
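That "download first, then parallelize" idea could be sketched as follows. This is a hypothetical sketch, not Inspect code: `download` is a placeholder for a real s3 fetch (e.g. boto3's `download_file` or an s3fs `get`), and `parse_log` stands in for reading the local log file. The shape of the idea is that only the parent process (using threads, which the issue reports work fine) touches s3, and the process pool only ever sees local files.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from pathlib import Path


def download(remote: str, dest_dir: Path) -> Path:
    # Placeholder for an actual s3 download; threads are fine here
    # because only the parent process uses the s3 client.
    local = dest_dir / Path(remote).name
    local.write_text(f"contents of {remote}")  # fake "downloaded" payload
    return local


def parse_log(path: Path) -> str:
    # Once the log is local, this is plain file/zip reading,
    # which has no shared s3fs state and is multi-process safe.
    return path.read_text()


def fetch_and_parse(remotes, dest_dir: Path):
    # Stage 1: thread-based parallel download in the parent process.
    with ThreadPoolExecutor() as tp:
        local_paths = list(tp.map(lambda r: download(r, dest_dir), remotes))
    # Stage 2: process-based parallel parsing of local files only.
    with ProcessPoolExecutor() as pp:
        return list(pp.map(parse_log, local_paths))
```

The trade-off is an extra copy on disk, but it sidesteps the fork-safety question entirely since no child process ever holds s3fs state.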
I'm going to close this for now because we don't have plans to address it internally, but again, happy to take a PR to the library or the docs if the community discovers a good solution for this.