[Dataset]: BNE Source Needs Refactored #187

Open
1 of 3 tasks
kkdavis14 opened this issue Dec 16, 2024 · 0 comments
Assignees: kkdavis14
Labels: enhancement (New feature to add to the code), Medium (Medium priority task), v0.0.2 (Change being developed for v.0.0.2)


kkdavis14 commented Dec 16, 2024

Priority Level

Medium

Dataset Name

BNE

Description

The Biblioteca Nacional de España (BNE) is the national library of Spain. We already have it as a source, but the current setup is built from an old dump file. Better files are available on datos.gob.es: https://datos.gob.es/en/catalogo?publisher_display_name=Biblioteca+Nacional+de+Espa%C3%B1a

Data Access Method

Base URL: https://datos.gob.es/en/catalogo
Example URLs: https://www.bne.es/media/datosgob/catalogo-autoridades/entidad/entidad-JSON.zip
https://datos.gob.es/en/catalogo/ea0019768-catalogo-de-autoridades-geografico-JSON.zip

Data Format

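Format: JSON, distributed as ZIP archives. The sample record below is flat JSON with no @context or @id keys, so it may not need JSON-LD processing.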
JSON example:

{
    "idBNE": "XX492624",
    "otros_codigos_identificacion": "lcsh: http://id.loc.gov/authorities/subjects/sh2006001296 ** rameau: https://data.bnf.fr/ark:/12148/cb12446309b ** ",
    "cdu": "(1-04)  // ",
    "encabezamiento_materia": "Tierras fronterizas ",
    "termino_materia_no_aceptado": "Regiones fronterizas // Regiones limítrofes // Tierras limítrofes // Zonas fronterizas // Zonas limítrofes // ",
    "termino_relacionado": "Contaminación transfronteriza      // Cooperación transfronteriza      // Fronteras      // ",
    "fuente_informacion": "WWW LCSH, 14-2-2020 // WWW RAMEAU, 14-2-2020 // ",
    "informacion_encontrada": "(Borderlands) // (Régions frontalières) // ",
    "termino_aceptado_otro_vocabulario": "lcsh: Borderlands ** ",
    "obras_relacionadas_BNE": "http://catalogo.bne.es/uhtbin/cgisirsi/0/x/0/05?searchdata1=^A492624"
}
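Several fields in the sample pack multiple values into one string, separated by `//` with padding and a trailing separator. A minimal sketch of how a loader might split them is below. The helper name `split_values` is hypothetical, and the pattern has only been checked against the sample record above. It requires whitespace before `//` so that URLs such as `http://...` are left intact.

```python
import re

# Matches "//" only when preceded by whitespace, so "http://" is not split.
_VALUE_SEPARATOR = re.compile(r"\s+//(?:\s+|$)")


def split_values(raw: str) -> list[str]:
    """Split a '//'-delimited BNE field into trimmed, non-empty values."""
    return [part.strip() for part in _VALUE_SEPARATOR.split(raw) if part.strip()]


# Values copied from the sample record.
record = {
    "cdu": "(1-04)  // ",
    "termino_materia_no_aceptado": "Regiones fronterizas // Regiones limítrofes // Tierras limítrofes // Zonas fronterizas // Zonas limítrofes // ",
    "termino_relacionado": "Contaminación transfronteriza      // Cooperación transfronteriza      // Fronteras      // ",
}

variants = split_values(record["termino_materia_no_aceptado"])
related = split_values(record["termino_relacionado"])
cdu = split_values(record["cdu"])
```

Other files in the BNE release may use different padding or separators, so this should be verified per dataset before the mapper relies on it.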

Entity Matching

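Fields such as otros_identificadores and otros_codigos_identificacion store sameAs URIs for equivalent records in other authority files. In the sample above they point to the Library of Congress (lcsh) and the BnF (rameau); each entry is prefixed with the vocabulary name and entries are separated by " ** ".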
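As a rough illustration of the matching step, the sketch below pulls the sameAs URIs out of an otros_codigos_identificacion value and groups them by vocabulary prefix. The function name `parse_same_as` is hypothetical, and the `prefix: uri ** ` layout is assumed from the single sample record; otros_identificadores may be laid out differently.

```python
def parse_same_as(raw: str) -> dict[str, list[str]]:
    """Group sameAs URIs by their vocabulary prefix (e.g. 'lcsh', 'rameau')."""
    links: dict[str, list[str]] = {}
    for entry in raw.split("**"):
        entry = entry.strip()
        if not entry:
            continue
        # Split on the first "colon + space" so the colon inside "http://" is untouched.
        prefix, sep, value = entry.partition(": ")
        value = value.strip()
        # Keep only entries that look like "prefix: <http(s) URI>".
        if not sep or not value.startswith(("http://", "https://")):
            continue
        links.setdefault(prefix.strip(), []).append(value)
    return links


# Value copied from the sample record.
sample = (
    "lcsh: http://id.loc.gov/authorities/subjects/sh2006001296 ** "
    "rameau: https://data.bnf.fr/ark:/12148/cb12446309b ** "
)
same_as = parse_same_as(sample)
```

Entries without a URI, such as the label-only value in termino_aceptado_otro_vocabulario ("lcsh: Borderlands ** "), are skipped by the URI check.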

Technical Requirements

Known Limitations

No response

Example Integration

In the BNE config, add the URLs of the BNE datasets we want to harvest to the remote dump files block. Confirm that the existing download code can fetch these files. Write a loader that loads the files into the BNE datacache. Then refactor the existing BNE mapper to use the new data structure.
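A minimal sketch of what the loader step could look like is below. It is not the project's actual loader API: `iter_records` and `load_into_cache` are hypothetical names, a plain dict stands in for the BNE datacache, and it assumes each ZIP holds one or more `.json` files containing an array of records keyed by `idBNE`. The demo builds a tiny in-memory ZIP instead of downloading a real dump.

```python
import io
import json
import zipfile
from typing import Iterator


def iter_records(zip_source) -> Iterator[dict]:
    """Yield record dicts from every .json member of a ZIP (path or file object)."""
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".json"):
                continue
            with archive.open(name) as handle:
                # Reads the whole member into memory; large dumps may need streaming.
                data = json.load(handle)
            if isinstance(data, dict):
                data = [data]
            for record in data:
                if isinstance(record, dict):
                    yield record


def load_into_cache(zip_source, cache: dict) -> int:
    """Store each record under its idBNE; return how many records were stored."""
    loaded = 0
    for record in iter_records(zip_source):
        ident = record.get("idBNE")
        if not ident:
            continue  # Records without an identifier cannot be keyed.
        cache[ident] = record
        loaded += 1
    return loaded


# Demo with an in-memory ZIP standing in for a downloaded dump file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    payload = [
        {"idBNE": "XX492624", "encabezamiento_materia": "Tierras fronterizas "},
        {"encabezamiento_materia": "record without idBNE"},
    ]
    archive.writestr("sample.json", json.dumps(payload))
buffer.seek(0)

cache: dict = {}
count = load_into_cache(buffer, cache)
```

Keying on `idBNE` keeps lookups from the mapper simple. Whether the real datacache expects the raw record or a wrapped structure would need to be checked against the existing loaders.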

@kkdavis14 kkdavis14 added the enhancement New feature to add to the code label Dec 16, 2024
@kkdavis14 kkdavis14 self-assigned this Dec 16, 2024
@kkdavis14 kkdavis14 added the Medium Medium priority task label Dec 16, 2024
@kkdavis14 kkdavis14 added the v0.0.2 Change being developed for v.0.0.2 label Jan 7, 2025