Don't share the same DBS object amongst multiple threads #9311

Merged (1 commit) on Jul 30, 2019


amaltaro
Contributor

Fixes #9221
Fixes #9253

Status

not-tested

Description

Instead of using a global cache of DBSReader objects, which ends up being shared between multiple threads, make sure each thread constructs its own DBSReader object (and then reuses it within that thread).
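
For illustration only, here is a minimal sketch of the per-thread idea using `threading.local`. It is not the code in this PR: `FakeReader`, `get_reader` and the example URL are hypothetical stand-ins, and the real DBSReader constructor may take different arguments.

```python
import threading

# Each thread sees its own independent attributes on this object.
_thread_state = threading.local()


class FakeReader(object):
    """Hypothetical stand-in for a DBSReader-like client (assumption)."""

    def __init__(self, url):
        self.url = url


def get_reader(url, factory=FakeReader):
    """Return a reader for `url`, built once per thread and reused afterwards."""
    cache = getattr(_thread_state, "readers", None)
    if cache is None:
        # First call in this thread: create the thread-private cache.
        cache = _thread_state.readers = {}
    if url not in cache:
        cache[url] = factory(url)
    return cache[url]
```

Two calls from the same thread return the same object, while a call from another thread builds a separate one, so no reader instance is ever shared across threads.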

This PR also fixes how we evaluate whether the DBSReader object is pointing to the global instance or not (no need to make any calls to the server).
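
As a rough sketch of what a server-free check could look like (not necessarily how this PR does it), one can inspect the URL alone. The `/dbs/prod/global` path convention and the `looks_like_global_dbs` name are assumptions made for this example.

```python
from urllib.parse import urlparse

# Assumption: the global instance is identified by these leading path segments.
GLOBAL_PATH_SEGMENTS = ["dbs", "prod", "global"]


def looks_like_global_dbs(url):
    """Guess from the URL alone whether it points at the global DBS instance."""
    segments = [part for part in urlparse(url).path.split("/") if part]
    return segments[:3] == GLOBAL_PATH_SEGMENTS
```

Because the decision depends only on the string, it costs no network round trip and cannot fail on a server timeout. It does rely on the URL layout staying stable.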

I wanted to get rid of this __dbses global variable:
https://github.com/dmwm/WMCore/blob/master/src/python/WMCore/WorkQueue/WorkQueueUtils.py#L30

but that would require substantial changes to the start policy algorithms. Since only those algorithms now use this global variable, and they run sequentially, I decided to leave them as they are.

Is it backward compatible (if not, which system does it affect)?

yes

Related PRs

no

External dependencies / deployment changes

Needs to be validated on both levels: global and local workqueue.

@cmsdmwmbot

Jenkins results:

  • Unit tests: failed
    • 1 new failures
    • 3 tests no longer failing
  • Pylint check: failed
    • 1 warnings and errors that must be fixed
    • 11 warnings
    • 55 comments to review
  • Pycodestyle check: succeeded
    • 1 comments to review
  • Python3 compatibility checks: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/9224/artifact/artifacts/PullRequestReport.html

@vlimant
Contributor

vlimant commented Jul 25, 2019

Does this mean that the dbsapi is not really thread-safe, @yuyiguo @vkuznet?

@amaltaro
Contributor Author

Yes, Jean-Roch. See this comment
dmwm/DBS#610 (comment)
where Valentin describes the problem.

I'll let them comment on it further, though.
Don't you see DBS problems on the Unified side? Or does it catch an exception and silently retry?

@amaltaro
Contributor Author

test this please

@cmsdmwmbot

Jenkins results:

  • Unit tests: failed
    • 2 new failures
  • Pylint check: failed
    • 1 warnings and errors that must be fixed
    • 11 warnings
    • 55 comments to review
  • Pycodestyle check: succeeded
    • 1 comments to review
  • Python3 compatibility checks: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/9233/artifact/artifacts/PullRequestReport.html

@amaltaro
Contributor Author

Tests look good in my VM as well. I double-checked the Jenkins job configuration and I still don't get why those Rucio tests aren't reported as unstable...

@amaltaro merged commit 9b3a3f0 into dmwm:master on Jul 30, 2019