I'm using Toil in server mode as a WES server, configured with the AWS Batch backend. In the AWS Batch compute environment, I have a drive mounted on each of the cluster nodes (currently via mount-s3).
I want to make this local path accessible inside a `CommandLineTool` defined in a CWL workflow. Referencing files from S3 works fine, and they are staged correctly using `InitialWorkDirRequirement`. However, when I reference a file or directory using a local path (e.g., `file:///mnt/service-data/blast/gramene_01_2022.tar.gz`), I get the following error:
[2024-11-18T11:47:36+0000] [MainThread] [E] [root] No available job store implementation can import the URL 'missing:'. Ensure Toil has been installed with the appropriate extras.
Here is the relevant CWL definition (`list_mounted.tool.cwl`) and the parameters (`workflow_params`) for context:
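(The original tool and parameter files did not survive formatting here. A minimal sketch of what such a tool and its inputs might look like, with all names hypothetical except the `file://` path quoted from the error above:)

```yaml
# list_mounted.tool.cwl -- hypothetical sketch, not the original tool
cwlVersion: v1.2
class: CommandLineTool
baseCommand: [ls, -l]
requirements:
  InitialWorkDirRequirement:
    listing:
      - $(inputs.data_archive)   # stage the input into the working directory
inputs:
  data_archive:
    type: File
    inputBinding:
      position: 1
outputs:
  listing:
    type: stdout
stdout: listing.txt
```

```yaml
# workflow_params -- hypothetical; the file:// path is the one from the issue
data_archive:
  class: File
  path: file:///mnt/service-data/blast/gramene_01_2022.tar.gz
```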
What is the proper way to make these local node paths available to the jobs in this scenario? Is there a recommended approach for this? Am I missing something in the Toil configuration?
Thanks for your help!
┆Issue is synchronized with this Jira Story
┆Issue Number: TOIL-1674
The text was updated successfully, but these errors were encountered:
If you want the leader to not see the file as missing, you need to have it available from the leader as well as from the workers. So the first thing to try is probably mounting your filesystem on the node you are issuing the command from.
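For example, assuming the same Mountpoint for Amazon S3 setup as on the workers (the bucket name here is hypothetical), something like this on the leader/WES server node:

```shell
# Mount the same bucket at the same path as on the AWS Batch nodes,
# so file:///mnt/service-data/... resolves on the leader too.
sudo mkdir -p /mnt/service-data
mount-s3 my-service-data-bucket /mnt/service-data
```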
If you are using import symlinking, and a file job store on your shared filesystem, your files should be imported into Toil as symlinks to their actual locations on the shared filesystem, not copied. But I am not sure that mounted S3 provides the consistency guarantees that Toil needs from a shared filesystem in order to use it as the job store, or that it supports symlinks.
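A sketch of what that invocation might look like, assuming the shared mount were usable as a job store (job store path hypothetical; verify the flag spelling against your Toil version):

```shell
# Hypothetical: file job store on the shared mount, with symlinked imports,
# so inputs are linked in place rather than copied.
toil-cwl-runner \
    --jobStore file:/mnt/service-data/toil-jobstore \
    --symlinkImports True \
    workflow.cwl job.yml
```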
If you are using the AWS job store for storage, Toil will want to copy your files into the job store, and from there to each node. You could try turning on toil-cwl-runner's `--bypass-file-store` option to make Toil just assume all paths are accessible from all nodes. But then you might need to set `--tmp-outdir-prefix` or some of the other CWL path settings to get Toil to create job outputs on your shared filesystem instead of in node-local temporary storage, because you are turning off the whole system responsible for moving files between nodes.
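Put together, that approach might look something like this (region, job store name, and output prefix are all placeholders to adapt):

```shell
# Hypothetical: bypass the file store and keep temporary job outputs
# on the shared mount, which is visible from every node.
toil-cwl-runner \
    --jobStore aws:us-east-1:my-toil-jobstore \
    --batchSystem aws_batch \
    --bypass-file-store \
    --tmp-outdir-prefix /mnt/service-data/tmp/ \
    workflow.cwl job.yml
```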