Bug: rucio list-datasets-rse fails for large RSEs #789

Open
haozturk opened this issue Apr 30, 2024 · 6 comments
@haozturk
Contributor

Bug Description

Discussed and described at https://its.cern.ch/jira/browse/CMSTRANSF-897
Known to rucio: rucio/rucio#6014

Reproduction Steps

Run rucio list-datasets-rse T2_US_Florida

Expected Behavior

It should ideally work for all RSEs

Possible Solution

Improve error handling, find the real error and fix it if possible.
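As a starting point for finding the real error, a small wrapper can record the exit code and the full stderr of the failing command instead of whatever summary reaches the terminal. This is only a sketch: `run_and_capture` is a hypothetical helper, not part of Rucio, and it assumes the `rucio` CLI is on the `PATH`.

```python
import subprocess


def run_and_capture(cmd, timeout=600):
    """Run a command and return (exit_code, stdout, stderr) without raising.

    exit_code is None when the command could not be started or timed out.
    """
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except FileNotFoundError:
        # The executable is not installed or not on PATH.
        return None, "", f"command not found: {cmd[0]}"
    except subprocess.TimeoutExpired:
        # Large RSEs may simply take too long; report that separately.
        return None, "", f"timed out after {timeout}s"
    return proc.returncode, proc.stdout, proc.stderr
```

Calling `run_and_capture(["rucio", "list-datasets-rse", "T2_US_Florida"])` and attaching the returned stderr to the ticket would show whether the failure is a timeout, a server-side error, or a client-side exception.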

Related Issues

No response

@haozturk haozturk added the bug label Apr 30, 2024
@maksiks

maksiks commented Jan 21, 2025

I am not sure exactly what the question is, so let me explain what confuses me right now.
First I run "rucio list-datasets-rse T2_US_MIT_Tape" to obtain the full list of datasets.
The returned information falls into two groups. In the first group, the dataset comes with additional fields: the rule ID, when it happened, the number of files, and the state of the replica. In the second group, nothing at all is returned for the dataset. Take, for example, cms:/ADDGravToGG_NegInt-0_LambdaT-4000_M-2000To3000_TuneCUEP8M1_13TeV-pythia8/RunIISummer16MiniAODv3-PUMoriond17_94X_mcRun2_asymptotic_v3-v1/MINIAODSIM

I suspect that the datasets in the second group might be very old and were transferred to tape in its first iteration a couple of years ago.
--Max
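The two groups described above could be separated programmatically along these lines. This is a sketch under assumptions: it presumes each output row has already been parsed into a dict, and the key names (`name`, `rule_id`, `state`) are hypothetical placeholders, not the actual column names of the Rucio output.

```python
def split_by_rule_info(rows, info_keys=("rule_id", "state")):
    """Split rows into (with_info, without_info).

    A row counts as "with info" if any of info_keys holds a non-empty value.
    The key names are illustrative assumptions, not Rucio's real schema.
    """
    with_info, without_info = [], []
    for row in rows:
        has_info = any(row.get(key) not in (None, "") for key in info_keys)
        (with_info if has_info else without_info).append(row)
    return with_info, without_info
```

Listing the names in the second group would make it easier to check whether they are all from the same early transfer period.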

@haozturk
Contributor Author

Hi @maksiks, the question is "Why do you need this command?" rather than what the output should look like.

@maksiks

maksiks commented Jan 22, 2025

Please tell me how else I can get a list of datasets located on a site.
--Max

@haozturk
Contributor Author

I imagine you can query the catalogue of the file system to get what you have on tape, no? Is this because you want to know what CMS thinks is placed at your site so that you can ensure consistency? Or do you need to know the datasets rather than the files for some reason?

@maksiks

maksiks commented Jan 22, 2025

Are you seriously asking me this? We have to be able to know what Rucio has on any site.
--Max

@haozturk
Contributor Author

We have this information in the database, so the operations team has access when needed. We're trying to understand why a site admin would need it. Is it a must-have feature or a nice-to-have feature? If it's a must-have, why?
