replace URLs with versioned urls where possible since some are 'disappearing' already #15
Comments
a few more besides CORR and BGSP
|
@yarikoptic - i think that would be a good idea (to use versioned URLs).
it would also be great if we knew when objects matched each other across git-annex repos. so if i already have abide from datalad, it would be nice that
how do we make crawlers common? datalad has crawlers, metasearch has crawlers, and it seems we should be able to use datalad crawlers to generate the metasearch csv.
versioned urls: I guess we could help with
matched objects: what would you expect to be done then, e.g. a symlink to be created into another local (e.g. abide) dataset? or key file
common crawlers: I guess it would indeed be nice if there were some "standard" or at least "common" collection of crawlers providing data about availability/versions/etc. so different tools (metasearch, datalad, ...) could use them. Someone should look into bioCADDIE and the others I guess.
@yarikoptic - i don't know how it would work, so here are some thoughts: let's say there is a global filesystem on my computer (could be at annex level or datalad level).
each git repo has its own local store (git annex), as normal. but, git annex would point to special local remote (global store). any file that's in global store will not be copied when i do a get, and this remote is local, it will:
if i modify the file, the:
|
@joeyh what do you think about the above? It seems to go along with our discussion while in Montreal. Such a generic global store could be a "web-like" special remote providing access to keys, and otherwise not being trusted etc. It could be provided by some normal local git-annex remote which could also be registered as any other git remote, so content could be "copied to" it to populate it.
Yaroslav Halchenko wrote:
> @joeyh what do you think about above? seems to go along our discussion while at
> montreal. Such generic global-store could be "web-like" special remote
> providing access to keys, and otherwise not being trusted etc. It could be
> provided by some normal local git-annex remote which could be registered also
> as any other git remote, so content could be "copied to" to populate it.
I don't understand what you're proposing, specifically, in the context
of git-annex.
--
see shy jo
|
FTR: regarding the "global-store": an understanding was reached and it was implemented at the git-annex level, see https://git-annex.branchable.com/tips/local_caching_of_annexed_files/
What would you like to do:
While preparing a datalad dataset we ran into a bunch of URLs 404ing since they were deleted in the bucket. But it seems the bucket was versioned after they were added and before they were removed, so possibly those versions (or some other versions) are still available if a null version id were provided, e.g.
Since many URLs do come from the versioned fcp-indi bucket, I wondered whether it would be good to remove ambiguity and make access more robust (unless the bucket gets removed/recreated, which would invalidate versionIds) by replacing URLs with versioned URLs, like
http://fcp-indi.s3.amazonaws.com/data/Projects/BGSP/orig_bids/sub-1435/ses-01/anat/sub-1435_ses-01_T1w.nii.gz?versionId=ZzwCQ1fzDpWfUZzNvVGqwAONQ_QL.eI9
instead of
http://fcp-indi.s3.amazonaws.com/data/Projects/BGSP/orig_bids/sub-1435/ses-01/anat/sub-1435_ses-01_T1w.nii.gz .
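To make the difference between the two URL forms concrete, here is a minimal sketch of how a plain URL could be turned into a versioned one. The helper name `with_version_id` is hypothetical (it is not part of datalad or metasearch), and the version id has to come from somewhere else, e.g. a bucket listing. The literal version id `null` is what S3 assigns to objects stored before versioning was enabled on a bucket, which is the case suspected above for the 404ing files.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_version_id(url: str, version_id: str) -> str:
    """Return `url` with its S3 `versionId` query parameter set to `version_id`.

    Hypothetical helper for illustration only; it does not contact S3 and
    does not check that the version actually exists.
    """
    parts = urlsplit(url)
    # Drop any versionId already present so the result carries exactly one.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "versionId"
    ]
    query.append(("versionId", version_id))
    return urlunsplit(parts._replace(query=urlencode(query)))


plain = (
    "http://fcp-indi.s3.amazonaws.com/data/Projects/BGSP/orig_bids/"
    "sub-1435/ses-01/anat/sub-1435_ses-01_T1w.nii.gz"
)
# Version id taken from the example URL in this issue.
versioned = with_version_id(plain, "ZzwCQ1fzDpWfUZzNvVGqwAONQ_QL.eI9")
# Candidate URL for an object uploaded before versioning was enabled.
pre_versioning = with_version_id(plain, "null")
```

Whether a given `versionId` still resolves would need to be checked against the bucket itself; the sketch only rewrites the URL string.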
datalad ls could be of help here: