Introduce Multi-Storage Client (MSC) as an optional dependency #754
base: main
/blossom-ci
pyproject.toml
Outdated
@@ -59,6 +59,7 @@ dev = [
"interrogate==1.5.0",
"coverage==6.5.0",
"ruff==0.0.290",
"multi-storage-client>=0.12.2",
Why is this only in the dev optional dependency list? Wouldn't it be better to put it in some storage specific optional dependency list?
Makes sense. I moved it to a storage dependency list.
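With MSC in an optional dependency list, code that uses it would typically guard the import so that a base install still works. The following is a minimal sketch of that pattern, not code from this PR. The import name multistorageclient and the extra name nvidia-modulus[storage] in the error message are assumptions inferred from this thread, and the helper names are hypothetical.

```python
import importlib
import importlib.util


def msc_available(module_name: str = "multistorageclient") -> bool:
    """Return True if the optional module can be found without importing it.

    The default module name is an assumption about MSC's import name.
    """
    return importlib.util.find_spec(module_name) is not None


def require_msc(module_name: str = "multistorageclient"):
    """Import and return the optional module, or fail with an actionable message."""
    if not msc_available(module_name):
        # The extra name below is assumed from the review discussion.
        raise ImportError(
            f"'{module_name}' is not installed. It is an optional dependency; "
            'try: pip install "nvidia-modulus[storage]"'
        )
    return importlib.import_module(module_name)
```

Callers that need MSC can call require_msc() at the point of use, so users who never touch remote storage are not forced to install it.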
pyproject.toml
Outdated
@@ -94,6 +95,7 @@ all = [
"nvidia-modulus[dev]",
"nvidia-modulus[makani]",
"nvidia-modulus[fignet]",
"multi-storage-client[boto3]",
Why only install boto3 and not other backends in the all dep list?
Thanks for pointing that out! I removed boto3 here, since it should be installed in the example folder instead.
type: s3
options:
region_name: us-east-1
endpoint_url: https://pbss.s8k.io
Remove this reference from here. A Modulus example should be runnable by anyone, without requiring access to a specific network domain. Please refactor this example to point to a publicly available Zarr dataset. Here are a couple of dataset suggestions:
- CMIP6 archive on AWS: https://registry.opendata.aws/cmip6/
- ARCO ERA5 dataset on Google Cloud: https://github.com/google-research/arco-era5?tab=readme-ov-file#025-pressure-and-surface-level-data
Thanks, I added examples that use CMIP6.
Modulus Pull Request
Description
This PR introduces Multi-Storage Client (MSC) as an optional dependency for Modulus, with examples.
The Multi-Storage Client (MSC) is a unified, high-performance Python client designed to seamlessly interface with various object and file storage systems. It supports:
MSC provides a generic interface to interact with objects and files across various storage services. This lets you spend less time learning each storage service's unique interface and lets you change where data is stored without having to change how your code accesses it.
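To illustrate the idea of a generic storage interface, here is a minimal sketch in plain Python. It is not MSC's API: the names Storage, LocalStorage, MemoryStorage, and save_checkpoint are hypothetical and exist only to show how calling code can stay unchanged when the storage location changes.

```python
import tempfile
from pathlib import Path
from typing import Dict, Protocol


class Storage(Protocol):
    """Hypothetical minimal interface that every backend implements."""

    def read(self, key: str) -> bytes: ...

    def write(self, key: str, data: bytes) -> None: ...


class LocalStorage:
    """Stores objects as files under a root directory."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)

    def read(self, key: str) -> bytes:
        return (self.root / key).read_bytes()

    def write(self, key: str, data: bytes) -> None:
        path = self.root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)


class MemoryStorage:
    """Stores objects in a dict; stands in for a remote object store here."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def read(self, key: str) -> bytes:
        return self._objects[key]

    def write(self, key: str, data: bytes) -> None:
        self._objects[key] = data


def save_checkpoint(storage: Storage, step: int, payload: bytes) -> str:
    """Application code: written once against the interface, not a backend."""
    key = f"checkpoints/step_{step:06d}.bin"
    storage.write(key, payload)
    return key


if __name__ == "__main__":
    # The same call works against either backend.
    with tempfile.TemporaryDirectory() as tmp:
        for backend in (LocalStorage(tmp), MemoryStorage()):
            key = save_checkpoint(backend, 3, b"weights")
            print(type(backend).__name__, key, backend.read(key))
```

Swapping the backend object is the only change needed to move data; save_checkpoint itself never mentions where the bytes end up.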
We have successfully completed several PoCs (ICON, CorrDiff) on Modulus workloads, using MSC to train from S3-compatible object stores.
Checklist
Dependencies