Bodleian MMM

tobyburrows edited this page Aug 6, 2019 · 1 revision

June 2018

Main contribution by Andrew Morrison.

Background

The TEI XML format has been chosen for detailed cataloguing of manuscripts at the Bodleian (and elsewhere) because of its rich and flexible syntax. A custom schema, including mainly the TEI P5 modules for manuscript descriptions, has been created, with minor variations for Western, Islamic and Oriental manuscripts.

The medieval manuscript collections are catalogued in the most detail. Matthew Holford has done a lot of work to create local authority files for works, people and places, also in TEI. These files are stored in a GitHub repository, and a web site built with open source technologies including XSLT, XQuery, Solr and Blacklight provides the user interface.

Aims

To express the key metadata fields and relationships between entities (manuscripts, parts, works, authors and other people, places) as Linked Data by simplifying the TEI data structure, and mapping that to CIDOC-CRM and FRBRoo ontologies using the 3M tool.

A particular focus was on mapping provenance information to enable the Mapping Manuscript Migrations project to track the ownership history of manuscripts. Some examples of the use of SPARQL to query origin, provenance, and acquisition events by location and time are included in a presentation given in June 2018.

Simplification of TEI manuscript descriptions

The source TEI data can be complex and hierarchical: a manuscript may be divided into several parts, each with its own history and each containing works within works (e.g. a collection of poetry and the individual poems in it).

Provenance information has been catalogued in a variety of styles (e.g. a single XML element describing the entire history of the manuscript, or multiple elements each recounting one event). Dates might be encoded with date tags or with attributes on the provenance element. An XQuery script was therefore developed to extract the relevant information into a more rigid structure, for easier mapping in the 3M tool.
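The two date styles can be illustrated with a short Python sketch. This is not the project's XQuery script; it only shows the idea of collecting dates from either location. The sample XML, the function name provenance_dates, and the list of dating attributes are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# TEI namespace in ElementTree's "{uri}tag" notation.
TEI = "{http://www.tei-c.org/ns/1.0}"

# Assumed set of TEI dating attributes to look for.
DATE_ATTRS = ("when", "notBefore", "notAfter", "from", "to")


def provenance_dates(prov):
    """Collect dating attributes from a provenance element.

    Looks first at attributes on the provenance element itself,
    then at any nested date elements.
    """
    found = []
    own = {a: prov.get(a) for a in DATE_ATTRS if prov.get(a)}
    if own:
        found.append(own)
    for date in prov.iter(TEI + "date"):
        attrs = {a: date.get(a) for a in DATE_ATTRS if date.get(a)}
        if attrs:
            found.append(attrs)
    return found


# Made-up sample showing both cataloguing styles.
SAMPLE = """<history xmlns="http://www.tei-c.org/ns/1.0">
  <provenance notBefore="1500" notAfter="1550">Owned by a priory.</provenance>
  <provenance>Bought in <date when="1756">1756</date>.</provenance>
</history>"""

for prov in ET.fromstring(SAMPLE).iter(TEI + "provenance"):
    print(provenance_dates(prov))
```

Both styles end up in the same list-of-dicts shape, which is the kind of normalisation that makes a later mapping step simpler.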

The schema of the resulting simplified XML format evolved in response to changing requirements, so it is not as logically structured as it could be, but the XSD can be viewed here. TODO: Add link to XSD file

Reconciliation

The names of people (including authors and former owners), works, organizations, and places in manuscript descriptions are controlled by local authority files.

These have, in turn, been manually reconciled with URIs of records in external authorities such as VIAF, Library of Congress, Bibliothèque nationale de France, Système Universitaire de Documentation, Gemeinsame Normdatei, and Wikidata.

Instances in the manuscripts are linked to entries in the local authority files via key attributes. For example, each of the five copies of De quantitate animae by Augustinus, in five different manuscripts, has a key of work_778, corresponding to the entry with that ID in the local works authority file. Augustinus himself has an entry in the local persons authority file with an ID of 'person_66806872', to which all mentions of him in the manuscripts refer, and which has been reconciled with all the aforementioned global authorities.

The script that generates the simplified XML turns those keys into URIs. The authority files themselves, being essentially flat lists, were mapped in the 3M tool directly.
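As a rough illustration of the key-to-URI step, here is a minimal Python sketch. The URI pattern is inferred from the example URIs in the field list on this page; the function names are hypothetical, and the actual script may build URIs differently.

```python
# Base of the catalogue URIs, inferred from the example URIs on this page.
CATALOG_BASE = "https://medieval.bodleian.ox.ac.uk/catalog/"


def key_to_uri(key):
    """Turn an authority-file key such as 'work_778' into a catalogue URI."""
    key = key.strip()
    if not key:
        raise ValueError("empty key")
    return CATALOG_BASE + key


def fragment_uri(manuscript_key, local_id):
    """Build the URI of a part or item inside a manuscript record."""
    return key_to_uri(manuscript_key) + "#" + local_id


print(key_to_uri("work_778"))
print(key_to_uri("person_66806872"))
print(fragment_uri("manuscript_2143", "MS_Canon_Class_Lat_131-part1"))
```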

Contacts

Step by step procedure for export

  1. Clone git@github.com:bodleian/medieval-mss.git to your computer
  2. cd processing/analysis/
  3. Run ./simplified-xml-for-3m.sh
  4. cd results/
  5. Check simplified-xml-for-3m.log in case of broken input files, unreadable dates, or unrecognized languages
  6. The output is written to simplified-xml-for-3m.xml
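After the export, a quick sanity check of the output can be sketched in Python. This is only an illustration: it assumes the simplified XML uses no namespace, that manuscripts is the root element, and that uri and part are child elements of manuscript, as the field list on this page suggests. The function name and the sample data are made up.

```python
import xml.etree.ElementTree as ET


def summarise_export(xml_text):
    """Count manuscripts, composite manuscripts and records without a URI."""
    root = ET.fromstring(xml_text)
    if root.tag != "manuscripts":
        raise ValueError("unexpected root element: " + root.tag)
    manuscripts = root.findall("manuscript")
    return {
        "manuscripts": len(manuscripts),
        "with_parts": sum(1 for ms in manuscripts if ms.find("part") is not None),
        "missing_uri": sum(
            1 for ms in manuscripts if not (ms.findtext("uri") or "").strip()
        ),
    }


# Made-up sample in the assumed shape of the simplified XML.
SAMPLE = """<manuscripts>
  <manuscript>
    <uri>https://medieval.bodleian.ox.ac.uk/catalog/manuscript_4603</uri>
    <classmark>MS. Douce 252</classmark>
  </manuscript>
  <manuscript>
    <uri>https://medieval.bodleian.ox.ac.uk/catalog/manuscript_2143</uri>
    <part>
      <uri>https://medieval.bodleian.ox.ac.uk/catalog/manuscript_2143#MS_Canon_Class_Lat_131-part1</uri>
    </part>
  </manuscript>
</manuscripts>"""

print(summarise_export(SAMPLE))
```

To check a real export, read the text of results/simplified-xml-for-3m.xml and pass it to summarise_export instead of the sample.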

Detailed review of data in individual fields

Field list and mapping

The following table represents the mapping of TEI elements/attributes to the simplified XML and from there to the ontologies configured in the 3M tool. It is presented roughly in order of hierarchy in the TEI:

| TEI element | Field in simplified XML | Ontology mapping in 3M | Notes and example values |
| --- | --- | --- | --- |
| | manuscripts | | Root element container for all the manuscripts in all the TEI files in https://github.com/bodleian/medieval-mss/tree/master/collections |
| TEI | manuscript | crm:E24_Physical_Man-Made_Thing and frbr:F4_Manifestation_Singleton if the manuscript contains no parts | |
| TEI/@xml:id | manuscript/uri | URIorUUID | https://medieval.bodleian.ox.ac.uk/catalog/manuscript_4603 |
| msDesc/msIdentifier/idno | manuscript/classmark | crm:P48_has_preferred_identifier > crm:E42_Identifier and frbr:F13_Identifier; also rdfs:label for manuscript | MS. Douce 252 |
| titleStmt/title[@type='collection'] | manuscript/collection | crm:P46i_forms_part_of > crm:E78_Collection | MSS. Douce |
| msDesc/msIdentifier/repository | manuscript/repository | crm:P46i_forms_part_of > crm:E78_Collection | Bodleian Library |
| msDesc/msIdentifier/institution | manuscript/institution | frbr:F44_Bibliographic_Agency | University of Oxford |
| msPart | part | frbr:F4_Manifestation_Singleton; crm:E24_Physical_Man-Made_Thing | Container for metadata fields about a part in a composite manuscript. About 10% of manuscripts have parts. For the rest, the following fields are child elements of the manuscript |
| msPart/@xml:id | part/uri | URIorUUID | https://medieval.bodleian.ox.ac.uk/catalog/manuscript_2143#MS_Canon_Class_Lat_131-part1 |
| msPart/msIdentifier/altIdentifier/idno | part/label | rdfs:label | MS. Canon. Class. Lat. 131 - Part 1 |
| msItem | item | frbr:F22_Self-Contained_Expression | |
| msItem/@xml:id | item/uri | URIorUUID | This is the URI of the instance of the work in this manuscript. E.g. https://medieval.bodleian.ox.ac.uk/catalog/manuscript_502#MS_Auct_D_inf_2_10-item1 |
| msItem/title | item/title | frbr:R3i_realises > frbr:F1_Work | |
| | item/title/uri | URIorUUID | This is the URI of the work defined in the authority file, of which this item in this manuscript is an instance. E.g. https://medieval.bodleian.ox.ac.uk/catalog/work_3509 |
| | item/title/label | rdfs:label | Commentary on St. Paul's Epistles |
| msItem/textLang/@mainLang, msItem/textLang/@otherLangs | item/language/uri | crm:P72_has_language > crm:E56_Language | This is the URI of the work's languages, each mapped to a Getty URI. The distinction between the "main language" and "other languages", typically of passages within the work or commentary but sometimes 50-50, is not preserved. E.g. http://vocab.getty.edu/aat/300389289 |
| | item/language/label | rdfs:label | Church Slavonic |
| supportDesc/@material | material | crm:P45_consists_of > crm:E57_Material | |
| | material/uri | | http://vocab.getty.edu/aat/300014109 |
| | material/label | rdfs:label | Paper |
| dimensions | dimension | crm:P43_has_dimension > crm:E54_Dimension | |
| dimensions/@unit | dimension/unit | crm:P91_has_unit > crm:E58_Measurement_Unit | mm |
| dimensions/@type | dimension/type | crm:P2_has_type > crm:E55_Type | e.g. if @type is leaf the dimension/type might be: max width leaf |
| width/@min | dimension/value | crm:P90_has_value > Literal | 70 |
| width/@max | | | 100 |
| width/@quantity | | | 163 |
| width | | | Only used if no attributes exist |
| height/@min | | | 175 |
| height/@max | | | 180 |
| height/@quantity | | | 310 |
| height | | | Only used if no attributes exist |
| layout/@columns | layout/columns/min | Not mapped | 1 |
| | layout/columns/max | Not mapped | 2 |
| layout/@ruledLines | layout/linespercolumn[@type='ruled']/min | Not mapped | 40 |
| | layout/linespercolumn[@type='ruled']/max | Not mapped | 44 |
| layout/@writtenLines | layout/linespercolumn[@type='written']/min | Not mapped | 18 |
| | layout/linespercolumn[@type='written']/max | Not mapped | 20 |
| origDate/@when | date[@context='origin']/from | crm:E12_Production > crm:E52_Time-Span > crm:P82a_begin_of_the_begin | These are known, single-year dates, but CIDOC-CRM always expects a date range, so they are converted to the start of the year, e.g. 1475-01-01T00:00:00 |
| | date[@context='origin']/to | crm:E12_Production > crm:E52_Time-Span > crm:P82b_end_of_the_end | End of the year, e.g. 1475-12-31T23:59:59 |
| origDate/@notBefore | date[@context='origin']/from | crm:E12_Production > crm:E52_Time-Span > crm:P82a_begin_of_the_begin | 1390-01-01T00:00:00 |
| origDate/@notAfter | date[@context='origin']/to | crm:E12_Production > crm:E52_Time-Span > crm:P82b_end_of_the_end | 1400-12-31T23:59:59 |
| acquisition//date/@when | date[@context='acquisition']/from | crm:E10_Transfer_of_Custody > crm:E52_Time-Span > crm:P82a_begin_of_the_begin | 1756-01-01T00:00:00 |
| | date[@context='acquisition']/to | crm:E10_Transfer_of_Custody > crm:E52_Time-Span > crm:P82b_end_of_the_end | 1756-12-31T23:59:59 |
| acquisition//date/@notBefore | date[@context='acquisition']/from | crm:E10_Transfer_of_Custody > crm:E52_Time-Span > crm:P82a_begin_of_the_begin | |
| acquisition//date/@notAfter | date[@context='acquisition']/to | crm:E10_Transfer_of_Custody > crm:E52_Time-Span > crm:P82b_end_of_the_end | |
| placeName, country, region, settlement | place[@context='acquisition'] | crm:E10_Transfer_of_Custody > crm:P28_custody_surrendered_by > crm:E74_Group and frbr:F11_Corporate_Body | |
| | place[@context='origin'] | crm:E12_Production > crm:P7_took_place_at > crm:E53_Place and frbr:F9_Place | |
| | place[@context='physdesc'] | | |
| | place[@context='title'] | frbr:F1_Work > crm:P67_refers_to > crm:E53_Place | |
| | place/uri | URIorUUID | https://medieval.bodleian.ox.ac.uk/catalog/place_1000080 |
| | place/label | rdfs:label | Italian |
| orgName | org[@context='acquisition'] | crm:E10_Transfer_of_Custody > crm:P28_custody_surrendered_by > crm:E74_Group and frbr:F11_Corporate_Body | |
| | org[@context='origin'] | crm:E12_Production > crm:P11_had_participant > crm:E74_Group | |
| | org[@context='physdesc'] | | |
| | org[@context='title'] | frbr:F1_Work > crm:P67_refers_to > crm:E74_Group | |
| | org/uri | URIorUUID | https://medieval.bodleian.ox.ac.uk/catalog/org_65 |
| | org/label | rdfs:label | Hickling Priory, Norfolk |
| | org/role | Not mapped, but used to determine mapping of org when in provenance, see below. | formerOwner |
| persName | person[@context='acquisition'] | crm:E10_Transfer_of_Custody > crm:P28_custody_surrendered_by > crm:E21_Person and frbr:F10_Person | |
| | person[@context='title'] | frbr:F1_Work > crm:P67_refers_to > crm:E21_Person | |
| | person[@context='physdesc'] | crm:E12_Production > crm:P01i_is_domain_of > crm:PC14_carried_out_by > crm:P02_has_range > crm:E21_Person | |
| | person[@context='physdesc']/role | crm:E12_Production > crm:P01i_is_domain_of > crm:PC14_carried_out_by > crm:P14.1_in_the_role_of > crm:E55_Type | scribe |
| | person/uri | URIorUUID | https://medieval.bodleian.ox.ac.uk/catalog/person_162949 |
| | person/label | rdfs:label | Bartolomeo Fontius |
| | person/role | Not mapped, except when context is physdesc, see above, but also used to determine mapping of person when in provenance, see below. | scribe |
| provenance | provenance | crm:E5_Event | This assumes each provenance in the TEI is a single event, which may not always be the case. |
| | provenance/@xml:id | URIorUUID | ID generated by the script to be unique for each provenance event, e.g. manuscript_6327_prov3 |
| | provenance/text | crm:P3_has_note > Literal | The entire text describing a stage in the provenance of the manuscript |
| | provenance/date | crm:P4_has_time-span > crm:E52_Time-Span | Each of these child elements of provenance itself has the same child elements listed above for when it is not in provenance. The mapping here relates them to the specific stage in the provenance of the manuscript. |
| | provenance/org | crm:P11_had_participant > crm:E74_Group | |
| | provenance/org[@role='formerOwner'] | crm:P51_has_former_or_current_owner > crm:E74_Group | |
| | provenance/person | crm:P11_had_participant > crm:E21_Person and frbr:F10_Person | |
| | provenance/person[@role='formerOwner'] | crm:P51_has_former_or_current_owner > crm:E21_Person and frbr:F10_Person | |
| | provenance/place | crm:P7_took_place_at > crm:E53_Place and frbr:F9_Place | |
| | inscription | crm:E34_Inscription | Inscriptions are simply provenances whose text starts with a single-quote mark, e.g. 'Dominus Johannes Blathe', incised on the binding. All the same information is extracted from the TEI, but it is mapped differently to the ontology. |
| | inscription/@xml:id | URIorUUID | |
| | inscription/text | crm:P3_has_note > Literal | |
| | inscription/date | crm:P67_refers_to > crm:E52_Time-Span | |
| | inscription/org | crm:P67_refers_to > crm:E74_Group | |
| | inscription/person | crm:P67_refers_to > crm:E21_Person and frbr:F10_Person | |
| | inscription/place | crm:P67_refers_to > crm:E53_Place and frbr:F9_Place | |
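Two of the rules described in the table, the expansion of dates into ranges and the detection of inscriptions, can be sketched in Python as follows. This is an illustration, not the project's XQuery: the function names are hypothetical, the date values are assumed to be plain four-digit years, and treating a curly opening quote as a single-quote mark is an added assumption.

```python
def year_bounds(year):
    """Return the first and last second of a year as dateTime strings."""
    y = int(year)  # assumes a plain year such as "1475"
    return "%04d-01-01T00:00:00" % y, "%04d-12-31T23:59:59" % y


def date_range(when=None, not_before=None, not_after=None):
    """Map @when, or @notBefore/@notAfter, to a (from, to) pair.

    A single known year becomes the whole of that year; a missing
    bound is returned as None.
    """
    if when is not None:
        return year_bounds(when)
    start = year_bounds(not_before)[0] if not_before is not None else None
    end = year_bounds(not_after)[1] if not_after is not None else None
    return start, end


def classify_provenance(text):
    """Treat provenance text that opens with a single quote as an inscription."""
    # The curly opening quote is an assumption; the page only mentions
    # a single-quote mark.
    if text.lstrip().startswith(("'", "\u2018")):
        return "inscription"
    return "provenance"


print(date_range(when="1475"))
print(date_range(not_before="1390", not_after="1400"))
print(classify_provenance("'Dominus Johannes Blathe', incised on the binding"))
```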

Identifiers

TODO: Add Halle's table here.

Endpoint

Upload data to the endpoint:

java -cp /home/blazegraph/blazegraph-2.1.4.jar com.bigdata.rdf.store.DataLoader -defaultGraph https://medieval.bodleian.ox.ac.uk/ -namespace mmm /home/blazegraph/fastload.properties /home/thanasis/mmm/*.ttl

SPARQL queries

  • MSs produced between 1550 and 1600 (includes parts)
PREFIX crm: <http://www.cidoc-crm.org/cidoc-crm/>
PREFIX frbr: <http://www.cidoc-crm.org/frbr/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT DISTINCT ?msl ?tse {
  ?ms crm:P108i_was_produced_by ?p . #MSs with production events
  ?ms rdfs:label ?msl . #get the MSs' labels
  ?p crm:P4_has_time-span ?ts . #productions with period
  ?ts crm:P82a_begin_of_the_begin ?tsb . #beginning of the period
  ?ts crm:P82b_end_of_the_end ?tse . #end of the period
  FILTER (xsd:dateTime(?tsb) > "1550-01-01T00:00:00"^^xsd:dateTime) .
  FILTER (xsd:dateTime(?tse) < "1600-01-01T00:00:00"^^xsd:dateTime) .
} LIMIT 1000
  • MSs produced in a specific continent, based on geocoordinates, nearby places and country/continent relationships (still not working)
PREFIX crm: <http://www.cidoc-crm.org/cidoc-crm/>
PREFIX frbr: <http://www.cidoc-crm.org/frbr/>
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?ms ?msplace ?geocoors ?continent {
  ?ms crm:P108i_was_produced_by ?production .
  ?production crm:P7_took_place_at ?msplace .
  ?production crm:P4_has_time-span ?ts . #productions with period
  ?ts crm:P82a_begin_of_the_begin ?tsb . #beginning of the period
  ?ts crm:P82b_end_of_the_end ?tse . #end of the period
  FILTER (xsd:dateTime(?tsb) > "1550-01-01T00:00:00"^^xsd:dateTime) .
  FILTER (xsd:dateTime(?tse) < "1600-01-01T00:00:00"^^xsd:dateTime) .
  ?msplace crm:P89i_contains ?mscentroidplace .
  ?mscentroidplace crm:P168_place_is_defined_by ?geocoorsraw .
  BIND (STRDT(CONCAT('Point(',STRBEFORE(STRAFTER(?geocoorsraw,","),")")," ",SUBSTR(STRBEFORE(?geocoorsraw,","),3),")"), geo:wktLiteral) AS ?geocoors) .
  {
    SELECT DISTINCT ?geocoors ?place ?placeLabel ?countryLabel ?continent ?continentLabel WHERE {
      
      SERVICE <https://query.wikidata.org/sparql> {
        SERVICE wikibase:around {
          ?place wdt:P625 ?location .
          bd:serviceParam wikibase:center ?geocoors .
          bd:serviceParam wikibase:radius "100" . # within 100 kilometres
        } .
        ?place wdt:P17 ?country.
        ?country wdt:P30 ?continent
        SERVICE wikibase:label {
          bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
        }
      }
    }
  }
} LIMIT 100
  • Same as the above, only with the TGN geoservice (this works)
PREFIX crm: <http://www.cidoc-crm.org/cidoc-crm/>
PREFIX frbr: <http://www.cidoc-crm.org/frbr/>
prefix omgeo: <http://www.ontotext.com/owlim/geo#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
prefix wgs: <http://www.w3.org/2003/01/geo/wgs84_pos#>
prefix gvp: <http://vocab.getty.edu/ontology#>
prefix xl: <http://www.w3.org/2008/05/skos-xl#>
prefix foaf: <http://xmlns.com/foaf/0.1/>

SELECT DISTINCT ?ms ?geocoorsraw ?lat ?long ?continent ?continentTerm {
  ?ms crm:P108i_was_produced_by ?production . #MS with a production
  ?production crm:P4_has_time-span ?ts . #productions with period
  ?ts crm:P82a_begin_of_the_begin ?tsb . #beginning of the period
  ?ts crm:P82b_end_of_the_end ?tse . #end of the period
  FILTER (xsd:dateTime(?tsb) > "1550-01-01T00:00:00"^^xsd:dateTime) .
  FILTER (xsd:dateTime(?tse) < "1600-01-01T00:00:00"^^xsd:dateTime) .
  ?production crm:P7_took_place_at ?msplace . #production at a place
  ?msplace crm:P89i_contains ?mscentroidplace . #place contains centroid
  ?mscentroidplace crm:P168_place_is_defined_by ?geocoorsraw . #geodata needs to be split
  BIND (STRDT(STRAFTER(STRBEFORE(?geocoorsraw,","),"("), xsd:decimal) AS ?lat) .
  BIND (STRDT(STRAFTER(STRBEFORE(?geocoorsraw,")"),","), xsd:decimal) AS ?long) .
  SERVICE <http://vocab.getty.edu/sparql> { #go to the Getty endpoint
    ?place wgs:lat ?lat ; #use the lat and long
           wgs:long ?long .
    ?nearbyplace omgeo:nearby(?lat ?long "4mi" ) . #find places around this lat lon
    ?nearbyplaceconcept foaf:focus ?nearbyplace ; #find the name of the places
                        gvp:broaderPartitive+ ?continent . #find all the parent terms
    ?continent gvp:placeTypePreferred <http://vocab.getty.edu/aat/300128176> . #only continents
    ?continent xl:prefLabel ?continentLabel . #get label
    ?continentLabel gvp:term ?continentTerm . #get value of label
    FILTER ( lang(?continentTerm) = "en" ) #only the English
  }
} LIMIT 100
  • Find the continent that a TGN place belongs to (run solely on the Getty endpoint)
PREFIX gvp: <http://vocab.getty.edu/ontology#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?s ?slt ?o ?olt {
  ?s gvp:broaderPartitive+ ?o . #place ?s has multiple partitive broader parents
  ?s gvp:prefLabelGVP ?sl . #label for place ?s
  ?sl gvp:term ?slt . #value of label of ?s
  ?o gvp:placeTypePreferred <http://vocab.getty.edu/aat/300128176> . #?o needs to be a continent
  ?o gvp:prefLabelGVP ?ol . #label for continent ?o
  ?ol gvp:term ?olt . #value for label of ?o
  ?s skos:inScheme <http://vocab.getty.edu/tgn/> #get ?s from the TGN thesaurus
} LIMIT 100
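  • People (and their places of birth) who were involved in provenance events after 1598
PREFIX crm: <http://www.cidoc-crm.org/cidoc-crm/>
PREFIX frbr: <http://www.cidoc-crm.org/frbr/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>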

SELECT DISTINCT ?plabel ?tsb ?placeLabel WHERE {
  ?e crm:P11_had_participant ?p . #provenance events with participants
  ?p a crm:E21_Person . #participants who are persons as opposed to groups
  { #get a label for the person
    SELECT ?p (MIN(?l) AS ?plabel) { #magic by Graham
      ?p rdfs:label ?l
    } GROUP BY ?p
  } .
  ?p owl:sameAs ?er . #get only the reconciled persons
  ?e crm:P4_has_time-span ?ts . #provenance events at a specific period
  ?ts crm:P82a_begin_of_the_begin ?tsb . #period starting
  FILTER (xsd:dateTime(?tsb) > "1598-12-31T23:59:59"^^xsd:dateTime) . #at the end of 1598
  SERVICE <https://query.wikidata.org/sparql> { #go to wikidata
    SELECT ?er ?erLabel ?place ?placeLabel WHERE { #return the label
      SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". } #find reconciled label
      ?er wdt:P19 ?place. #find the place of birth of the reconciled person
    }
  }
} LIMIT 1000
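  • People (and their places of birth) who were born between 1598 and 1698 and were involved in provenance events
PREFIX crm: <http://www.cidoc-crm.org/cidoc-crm/>
PREFIX frbr: <http://www.cidoc-crm.org/frbr/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
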
SELECT DISTINCT ?plabel ?betsb ?placeLabel WHERE {
  ?e crm:P11_had_participant ?p . #provenance event with participant
  ?p a crm:E21_Person . #participants who are persons not groups
    { #get a label for the person
    SELECT ?p (MIN(?l) AS ?plabel) { #magic by Graham
      ?p rdfs:label ?l
    } GROUP BY ?p
  } .
  ?p crm:P98i_was_born ?be . #participants birth event
  ?be crm:P4_has_time-span ?bets . #period of birth
  ?bets crm:P82a_begin_of_the_begin ?betsb . #begin of the period
  ?bets crm:P82b_end_of_the_end ?betse . #end of the period
  ?p owl:sameAs ?er . #only reconciled persons
  FILTER (xsd:dateTime(?betsb) > "1598-12-31T23:59:59"^^xsd:dateTime) . #persons born after 1598
  FILTER (xsd:dateTime(?betse) < "1698-12-31T23:59:59"^^xsd:dateTime) . #person born before 1698
  SERVICE <https://query.wikidata.org/sparql> { #go to wikidata
    SELECT ?er ?erLabel ?place ?placeLabel WHERE {
      SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". } #fetch label
      ?er wdt:P19 ?place. #places of birth of reconciled persons
    }
  }
} LIMIT 1000
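  • Production events which happened in a place whose name starts with "Eng" and which produced items which were once owned by an organisation.
PREFIX crm: <http://www.cidoc-crm.org/cidoc-crm/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?e ?pl ?org ?orgl WHERE {
  ?e crm:P7_took_place_at ?p . #production event at a place
  ?pmmt crm:P108i_was_produced_by ?e . #item produced by that event
  ?pmmt crm:P51_has_former_or_current_owner ?org . #item once owned by ?org
  ?org a crm:E74_Group . #the owner is an organisation
  { #get one label per organisation (a bare LIMIT 1 here would keep a single organisation overall)
    SELECT ?org (MIN(?l) AS ?orgl) {
      ?org rdfs:label ?l .
    } GROUP BY ?org
  }
  ?p rdfs:label ?pl .
  FILTER (regex(?pl, '^Eng')) . #place name starts with "Eng"
} LIMIT 1000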
  • Provenance events between a group and a person who was not born in London
PREFIX crm: <http://www.cidoc-crm.org/cidoc-crm/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>

SELECT DISTINCT ?e ?grl ?pl ?ool ?grerl WHERE {
  SERVICE <> {
    ?e crm:P11_had_participant ?gr .
    ?gr a crm:E74_Group .
    ?e crm:P11_had_participant ?p .
    ?p a crm:E21_Person .
    ?gr rdfs:label ?grl .
    ?p rdfs:label ?pl .
    ?gr owl:sameAs ?grer .
    ?p owl:sameAs ?per .
  }
  SERVICE <https://query.wikidata.org/sparql> {
    ?per wdt:P19 ?oo .
    ?oo rdfs:label ?ool .
    filter(langMatches(lang(?ool),"EN"))
  }
  SERVICE <https://query.wikidata.org/sparql> {
    ?grer wdt:P31 wd:Q44613 . #instance of monastery (entity URI, not the wiki page URL)
    ?grer rdfs:label ?grerl .
    filter(langMatches(lang(?grerl),"EN"))
  }  
} LIMIT 1000

Organisations are reconciled with Wikidata, so we can look for instances (P31) of, and subclasses (P279) of, monastery (Q44613).

  • Provenance events with both organisation and person
PREFIX crm: <http://www.cidoc-crm.org/cidoc-crm/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?e WHERE {
  SERVICE <> {
    ?e crm:P11_had_participant ?gr .
    ?gr a crm:E74_Group .
    ?e crm:P11_had_participant ?p .
    ?p a crm:E21_Person .
    ?gr rdfs:label ?grl .
    ?p rdfs:label ?pl .
  }
} LIMIT 1000

This query returns 92 results, which matches the number of records found with the XPath //provenance[./org and ./person]. Restricting the results to records reconciled with Wikidata leaves only 6, one of which has a place of birth in Wikidata.
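The queries above can also be run from a script. The following Python sketch uses only the standard library and the SPARQL 1.1 Protocol. The endpoint URL is a placeholder: the host and port are assumptions, the path follows the usual Blazegraph namespace layout, and mmm is the namespace from the DataLoader command. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint: host and port are placeholders, "mmm" is the
# namespace used when loading the data.
ENDPOINT = "http://localhost:9999/blazegraph/namespace/mmm/sparql"


def build_request(query, endpoint=ENDPOINT):
    """Build a SPARQL 1.1 Protocol POST request asking for JSON results."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Accept": "application/sparql-results+json",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


def parse_bindings(results_json):
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    data = json.loads(results_json)
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in data["results"]["bindings"]
    ]


def run_query(query, endpoint=ENDPOINT):
    """Send the query and return the flattened rows (needs a live endpoint)."""
    with urllib.request.urlopen(build_request(query, endpoint)) as response:
        return parse_bindings(response.read().decode("utf-8"))


# Made-up response in the standard SPARQL JSON results format.
SAMPLE_RESPONSE = json.dumps({
    "head": {"vars": ["msl"]},
    "results": {"bindings": [
        {"msl": {"type": "literal", "value": "MS. Douce 252"}},
    ]},
})
print(parse_bindings(SAMPLE_RESPONSE))
```

With a running endpoint, run_query can be called with the text of any of the queries above. Federated queries additionally require the endpoint to be allowed to reach the remote services.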

TODO

  • FRBR Nomen may be a useful class for this mapping. It is worth checking where it could be used.

  • The transformation in its current state assumes that the records are facts. It would be useful to consider a mapping that uses CRMinf to model propositions and deductions instead.