s3fs w/ multi-processing hangs #914
Comments
The asyncio/thread use in fsspec async implementations, including s3fs, is not safe to … Remedies:
|
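The list of remedies did not survive in the reply above. One commonly suggested approach for this kind of hang, assuming the problem comes from worker processes being forked after an s3fs instance already exists in the parent, is to start workers with the `spawn` method and create the filesystem inside each worker. The sketch below is illustrative only: `spawn_context`, `make_spawn_executor`, `read_object`, and `read_all` are hypothetical helper names, not part of s3fs or Inspect AI.

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor


def spawn_context():
    # "spawn" starts a fresh interpreter for each worker, so no event loop
    # or background thread state is inherited from the parent process.
    return multiprocessing.get_context("spawn")


def make_spawn_executor(max_workers=2):
    # Hypothetical helper: a process pool that never forks the parent.
    return ProcessPoolExecutor(max_workers=max_workers, mp_context=spawn_context())


def read_object(path):
    # Assumption: the filesystem is created inside the worker rather than
    # being passed in from the parent process.
    import s3fs

    fs = s3fs.S3FileSystem()
    with fs.open(path, "rb") as f:
        return f.read()


def read_all(paths, max_workers=2):
    # Map the hypothetical reader over the paths in spawned workers.
    with make_spawn_executor(max_workers) as pool:
        return list(pool.map(read_object, paths))
```

Because `spawn` re-imports the main module in each worker, a call such as `read_all([...])` should sit under an `if __name__ == "__main__":` guard when run from a script.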
Thank you for the speedy + helpful reply! It might be worth adding to your docs, as I ctrl + f'd "thread-safe", "concurrency" etc. but couldn't find any mention of what you can + can't do w/ s3fs. |
Probably this should be added in https://filesystem-spec.readthedocs.io/en/latest/index.html and referenced from the s3fs docs and others. Would you like to add it? I'm not quite sure where it fits in. |
Hello, we are using s3fs to store data on S3 as part of our AI evaluations library, Inspect AI, and we are seeing that multi-processing (but, interestingly, NOT multi-threading) causes our data loading to hang. For example, with our internal function process_eval_logs, this will hang:
and this will not:
It will hang in the S3 case if we use a ProcessPoolExecutor for our data loading:
but not with a ThreadPoolExecutor:
I'm happy to give more details on how we are using the library, but before I do that I just wanted to check whether this pattern matches a known issue/limitation.