[Bug]: Logic Error in Reference Manager #197

Open
3 tasks
kkdavis14 opened this issue Jan 2, 2025 · 1 comment
Assignees
Labels
bug The code does not behave as expected / designed

Comments

@kkdavis14
Contributor

kkdavis14 commented Jan 2, 2025

Priority Level

High

What happened?

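The reference manager should remove bad reconciliations, as per line 248. However, line 250 does not seem to be triggered when it should be. In the example below, it finds an existing YUID and does not remove the bad reconciliations, even though they are not in the new equivs list: https://www.gbif.org/species/2227733 and http://www.wikidata.org/entity/Q4676748 appear in the "Found existing" list but not in cr_equivs.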

Relevant log output

(ENV) kd736@ip-10-5-156-177:~/data-pipeline$ python run-reconcile.py --norefs --ypm --recid taxon/f/c7/fc76e841-e3d2-4ae6-97dc-9d0c97314f52.json
Starting...
Update token is: __20241222a__
 *** ypm ***
.
--- https://images.peabody.yale.edu/data/taxon/f/c7/fc76e841-e3d2-4ae6-97dc-9d0c97314f52.json ---
    (uris)
  ---- WD Found: https://www.gbif.org/species/3032252 --> Q158086
 --- reconciler <pipeline.sources.general.wikidata.reconciler.WdReconciler object at 0x7fb4b61f4c40> / uri found http://www.wikidata.org/entity/Q158086 for https://images.peabody.yale.edu/data/taxon/f/c7/fc76e841-e3d2-4ae6-97dc-9d0c97314f52.json
Adding http://www.wikidata.org/entity/Q158086 to record
 --- reconciler <pipeline.sources.authorities.lc.reconciler.LcshReconciler object at 0x7fb4b6e95eb0> / uri found http://id.loc.gov/authorities/subjects/sh2017003185 for https://images.peabody.yale.edu/data/taxon/f/c7/fc76e841-e3d2-4ae6-97dc-9d0c97314f52.json
  ---- WD Found: https://www.gbif.org/species/3032252 --> Q158086
Adding http://id.loc.gov/authorities/subjects/sh2017003185 to record
  ---- WD Found: https://www.gbif.org/species/3032252 --> Q158086
  ---- WD Found: http://id.loc.gov/authorities/subjects/sh2017003185 --> Q158086
r_equivs: {'http://www.wikidata.org/entity/Q158086', 'https://www.gbif.org/species/3032252', 'https://images.peabody.yale.edu/data/taxon/f/c7/fc76e841-e3d2-4ae6-97dc-9d0c97314f52.json', 'http://id.loc.gov/authorities/subjects/sh2017003185'}
      (collecting)
    Collecting https://images.peabody.yale.edu/data/taxon/f/c7/fc76e841-e3d2-4ae6-97dc-9d0c97314f52.json
      ... collect processing https://www.gbif.org/species/3032252
  https://www.gbif.org/species/3032252 / Actaea into https://images.peabody.yale.edu/data/taxon/f/c7/fc76e841-e3d2-4ae6-97dc-9d0c97314f52.json tested okay
      ... collect processing https://images.peabody.yale.edu/data/taxon/f/c7/fc76e841-e3d2-4ae6-97dc-9d0c97314f52.json
  https://images.peabody.yale.edu/data/taxon/f/c7/fc76e841-e3d2-4ae6-97dc-9d0c97314f52.json / Actaea into https://images.peabody.yale.edu/data/taxon/f/c7/fc76e841-e3d2-4ae6-97dc-9d0c97314f52.json tested okay
     testing https://www.gbif.org/species/3032252 from https://images.peabody.yale.edu/data/taxon/f/c7/fc76e841-e3d2-4ae6-97dc-9d0c97314f52.json
      ... collect processing http://www.wikidata.org/entity/Q158086
  http://www.wikidata.org/entity/Q158086 / Actaea into https://images.peabody.yale.edu/data/taxon/f/c7/fc76e841-e3d2-4ae6-97dc-9d0c97314f52.json tested okay
     testing https://www.gbif.org/species/3032252 from http://www.wikidata.org/entity/Q158086
     testing http://id.loc.gov/authorities/subjects/sh2017003185 from http://www.wikidata.org/entity/Q158086
      ... collect processing http://id.loc.gov/authorities/subjects/sh2017003185
  http://id.loc.gov/authorities/subjects/sh2017003185 / Actaea into https://images.peabody.yale.edu/data/taxon/f/c7/fc76e841-e3d2-4ae6-97dc-9d0c97314f52.json tested okay
     testing https://id.worldcat.org/fast/1982746 from http://id.loc.gov/authorities/subjects/sh2017003185
       --> Adding https://id.worldcat.org/fast/1982746 from http://id.loc.gov/authorities/subjects/sh2017003185
     testing http://www.wikidata.org/entity/Q158086 from http://id.loc.gov/authorities/subjects/sh2017003185
      ... collect processing https://id.worldcat.org/fast/1982746
cr_equivs: {'http://www.wikidata.org/entity/Q158086', 'https://www.gbif.org/species/3032252', 'https://images.peabody.yale.edu/data/taxon/f/c7/fc76e841-e3d2-4ae6-97dc-9d0c97314f52.json', 'http://id.loc.gov/authorities/subjects/sh2017003185'}
Found https://lux.collections.yale.edu/data/concept/f930f183-6983-4737-a979-f1760ef9313a for https://images.peabody.yale.edu/data/taxon/f/c7/fc76e841-e3d2-4ae6-97dc-9d0c97314f52.json##quaType
Found existing: ['http://id.loc.gov/authorities/subjects/sh2017003185##quaType', '__20241222a__', 'https://www.gbif.org/species/3032252##quaType', 'https://www.gbif.org/species/2227733##quaType', 'http://www.wikidata.org/entity/Q4676748##quaType', 'https://images.peabody.yale.edu/data/taxon/f/c7/fc76e841-e3d2-4ae6-97dc-9d0c97314f52.json##quaType', 'http://www.wikidata.org/entity/Q158086##quaType']
Saw https://images.peabody.yale.edu/data/taxon/f/c7/fc76e841-e3d2-4ae6-97dc-9d0c97314f52.json##quaType in existing, not setting
Saw http://id.loc.gov/authorities/subjects/sh2017003185##quaType in existing, not setting
Saw https://www.gbif.org/species/3032252##quaType in existing, not setting
Saw https://www.gbif.org/species/2227733##quaType in existing, not setting
Saw http://www.wikidata.org/entity/Q4676748##quaType in existing, not setting
Saw http://www.wikidata.org/entity/Q158086##quaType in existing, not setting
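
As a quick check of the log above, here is a minimal sketch (not the reference manager's actual code) that strips the `##quaType` suffix from the "Found existing" entries, skips the update token, and takes the difference against `cr_equivs`. The helper name `stale_equivs` and the way the suffix and token are handled are assumptions made for illustration.

```python
def stale_equivs(existing, new_equivs, update_token):
    """Return existing URIs (type suffix stripped) that are absent from new_equivs.

    Hypothetical helper: assumes entries look like '<uri>##<type>' and that
    the update token is stored in the same list as the equivalents.
    """
    stale = set()
    for entry in existing:
        if entry == update_token:
            continue  # the token is bookkeeping, not an equivalent
        uri = entry.split("##", 1)[0]
        if uri not in new_equivs:
            stale.add(uri)
    return stale


YPM = "https://images.peabody.yale.edu/data/taxon/f/c7/fc76e841-e3d2-4ae6-97dc-9d0c97314f52.json"

# Values copied from the "Found existing" and "cr_equivs" log lines.
existing = [
    "http://id.loc.gov/authorities/subjects/sh2017003185##quaType",
    "__20241222a__",
    "https://www.gbif.org/species/3032252##quaType",
    "https://www.gbif.org/species/2227733##quaType",
    "http://www.wikidata.org/entity/Q4676748##quaType",
    YPM + "##quaType",
    "http://www.wikidata.org/entity/Q158086##quaType",
]
cr_equivs = {
    "http://www.wikidata.org/entity/Q158086",
    "https://www.gbif.org/species/3032252",
    YPM,
    "http://id.loc.gov/authorities/subjects/sh2017003185",
}

for uri in sorted(stale_equivs(existing, cr_equivs, "__20241222a__")):
    print("would remove:", uri)
```

For this record the two leftovers are http://www.wikidata.org/entity/Q4676748 and https://www.gbif.org/species/2227733, which are the entries the log reports as "in existing, not setting" rather than removing.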

Tasks

  • Determine where the bug is
  • Refactor the code to remove bad reconciliations
  • Test code to be sure it removes the reconciliations
@kkdavis14 kkdavis14 added the bug The code does not behave as expected / designed label Jan 2, 2025
@kkdavis14 kkdavis14 self-assigned this Jan 2, 2025
@kkdavis14
Contributor Author

This may be a challenge: the reference manager code may already be working as well as can be expected.
This example has two YUIDs:
https://lux.collections.yale.edu/data/concept/f930f183-6983-4737-a979-f1760ef9313a
https://lux.collections.yale.edu/data/concept/7877845e-7cde-4c6e-976e-3f9765167283

Each has only one YPM URI, but the first has both the correct external equivs for its attached YPM URI and the external equivs that should be on the second YUID. The second YUID has no equivs other than its YPM URI.
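
To make that state concrete, here is a rough sketch of how the misplaced equivs could be listed once each YPM record has been re-reconciled. Everything in it is an assumption for illustration: the `plan_moves` helper, the two dict structures, the `<second YPM URI>` placeholder (the second record's URI is not given in this issue), and the guess that the GBIF 2227733 and Wikidata Q4676748 URIs are the ones that belong on the second YUID.

```python
def plan_moves(stored, fresh):
    """List (uri, from_yuid, to_yuid) for equivalents stored on the wrong YUID.

    stored: dict of YUID -> set of URIs currently recorded for it.
    fresh:  dict of YUID -> set of URIs produced by a new reconciliation.
    Both structures are hypothetical, not the reference manager's own.
    """
    # Where each URI belongs according to the fresh reconciliation.
    wanted = {uri: yuid for yuid, uris in fresh.items() for uri in uris}
    moves = []
    for yuid, uris in stored.items():
        for uri in sorted(uris):
            target = wanted.get(uri)
            if target is not None and target != yuid:
                moves.append((uri, yuid, target))
    return moves


YUID_1 = "https://lux.collections.yale.edu/data/concept/f930f183-6983-4737-a979-f1760ef9313a"
YUID_2 = "https://lux.collections.yale.edu/data/concept/7877845e-7cde-4c6e-976e-3f9765167283"
YPM_1 = "https://images.peabody.yale.edu/data/taxon/f/c7/fc76e841-e3d2-4ae6-97dc-9d0c97314f52.json"
YPM_2 = "<second YPM URI>"  # placeholder: not stated in the issue

good = {
    YPM_1,
    "http://id.loc.gov/authorities/subjects/sh2017003185",
    "https://www.gbif.org/species/3032252",
    "http://www.wikidata.org/entity/Q158086",
}
# Assumed to belong with the second record.
misplaced = {
    "https://www.gbif.org/species/2227733",
    "http://www.wikidata.org/entity/Q4676748",
}

stored = {YUID_1: good | misplaced, YUID_2: {YPM_2}}
fresh = {YUID_1: good, YUID_2: {YPM_2} | misplaced}

for uri, src, dst in plan_moves(stored, fresh):
    print(f"move {uri}\n  from {src}\n  to   {dst}")
```

A check like this only works if both records are reconciled in the same pass; processing the first record alone, as in the log, gives no information about where the extra equivs should go.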
