Skip to content

Latest commit

 

History

History
44 lines (35 loc) · 2.58 KB

20211001-minutes.md

File metadata and controls

44 lines (35 loc) · 2.58 KB

S3 Working Group Minutes

Friday, 2021/10/01

3:00 pm ET

Attendees:

Terrell Russell, Dave Fellinger, Alan King, Kory Draughn, Justin James, Mike Conway (NIEHS), Deep Patel (NIEHS), Illyoung Choi (CyVerse), Tony Edgin (CyVerse), Nirav Merchant (CyVerse)

Minutes

DISCUSSION

  • Illyoung Update
    • Using testing environment for multi-1247 transfers
      • Still testing
    • CSI-Driver was only using fuse-driver-lite
      • But had 'too many connections' to iRODS
      • Discovered when testing against CyVerse-DE
      • Moving to single connection via pool server
    • Will continue to test parallel transfer
      • But S3 work here does not depend on the parallel being functional
      • Can stream / single threaded transfer at first
    • Integrating/Morphing existing minio-gateway.go to new pure-go library
  • Use Cases from CyVerse (Tony)
    • Use Case 1: Uploading with Existing S3 Tools
      • Numerous CyVerse users have enquired about using existing S3 tools like rclone to upload data into iRODS. Once in iRODS, the users would analyze the data using existing CyVerse services like the Discovery Environment.
    • Use Case 2: NEON Project
      • The NEON Project has existing data in S3 buckets stored outside of CyVerse. Their existing cyberinfrastructure accesses the data through a MinIO Gateway. They would like to mirror these buckets in iRODS at CyVerse using a command like mc mirror and be able to access the data from both sites through their cyberinfrastructure transparently. They would like the copy of the data at CyVerse to be accessible through the native iRODS protocol and be able to use S3 compatible tools to access these data (similar to case 1).
    • Use Case 3: Argo Workflows
      • Argo workflows allow users to run custom container-native workflows in Kubernetes. Each step of the workflow runs in a container. The steps are organized in a DAG, meaning the steps can be performed sequentially, in parallel or in some combination of both. When designing containers for an Argo workflow, it is considered good (best?) practice to access data through an S3 interface since it allows for better scale out of the parallel workflow steps.
      • https://blog.argoproj.io/artifacts-management-in-container-native-workflows-part-1-f6efa838666b
  • Continue to design the TicketBooth/BoxOffice (Kory/Terrell)
  • Definitely need to consider ramifications of user vs anonymous re: tickets
  • Next Meeting - Friday, Nov 5