Schema v1 to Aardvark migrator #143
Force-pushed from e0b8659 to c051c55.
This fixes the error about a pending spec without a reason. #143 will un-pend the test when it is merged.
Force-pushed from c051c55 to 46bcf21.
Making this a draft again pending discussion of behavior for some fields; see OpenGeoMetadata/metadata-issues#50
The path in
Force-pushed from 46bcf21 to b3f49d7.
Force-pushed from 687f017 to f30e6d1.
@thatbudakguy any chance we can get
@the-codetrane thx for pointing that out; I added a step to handle
Force-pushed from cd5bf1d to 97fe4b9.
@thatbudakguy found another key that could be migrated -
There's code in this PR to do that: we use a lookup table to map geometry types to resource types. It's only straightforward for a few cases, imo. Does it not work for you?
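For readers following along, here is a minimal sketch of what a geometry-type lookup like the one described above could look like. It is not the code from this PR: the table name and function name are hypothetical, and only the `Polygon` to `Polygon data` row is confirmed by output later in this thread. The other rows are assumptions for illustration.

```python
# Hypothetical lookup table from v1 geometry types to Aardvark resource types.
# Only "Polygon" -> "Polygon data" is confirmed by output in this thread;
# every other row is an assumption made for the example.
GEOM_TYPE_TO_RESOURCE_TYPE = {
    "Point": "Point data",
    "Line": "Line data",
    "Polygon": "Polygon data",
    "Raster": "Raster data",
    "Table": "Table data",
}


def resource_types_for(geom_type):
    """Return a list of resource types, or [] when there is no clear mapping."""
    resource_type = GEOM_TYPE_TO_RESOURCE_TYPE.get(geom_type)
    return [resource_type] if resource_type else []


print(resource_types_for("Polygon"))  # ['Polygon data']
print(resource_types_for("Mixed"))    # []
```

Returning an empty list for unmapped values keeps the field multivalued and avoids guessing in the cases that are not straightforward.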
This is what comes out when I run the migrator on a GBL 1.0 schema record:
{
"dct_description_sm": [
"This polygon shapefile represents the 1964 County Boundaries for China. The layer includes population census data and was primarily based on the \"Historical Administrative Maps of the People's Republic of China,\" published by China Map Press, and some other yearly administrative maps. See the documentation for more information and a list of the layer variables."
],
"dct_format_s": "Shapefile",
"dct_identifier_sm": [
"http://hdl.handle.net/2451/34626"
],
"dct_language_sm": [
"English"
],
"dct_publisher_sm": [
"Beijing Hua tong ren shi chang xin xi you xian ze ren gong si"
],
"dc_relation_sm": [
"http://sws.geonames.org/1814991/about/rdf"
],
"dct_accessRights_s": "Restricted",
"dct_subject_sm": [
"Boundaries",
"Demographic surveys",
"Population"
],
"dct_title_s": "1964 County Boundaries of China with Population Census Data",
"dc_type_s": "Dataset",
"dct_isPartOf_sm": [
"Historical China County Population Census Data"
],
"dct_issued_s": "2005",
"schema_provider_s": "NYU",
"dct_references_s": "{\"http://schema.org/url\":\"http://hdl.handle.net/2451/34626\",\"http://www.opengis.net/def/serviceType/ogc/wfs\":\"https://maps-restricted.geo.nyu.edu/geoserver/sdr/wfs\",\"http://www.opengis.net/def/serviceType/ogc/wms\":\"https://maps-restricted.geo.nyu.edu/geoserver/sdr/wms\",\"http://schema.org/downloadUrl\":\"https://archive.nyu.edu/retrieve/74851/nyu_2451_34626.zip\",\"http://lccn.loc.gov/sh85035852\":\"https://archive.nyu.edu/retrieve/74896/nyu_2451_34626_doc.zip\"}",
"dct_spatial_sm": [
"People's Republic of China, China"
],
"dct_temporal_sm": [
"1964"
],
"gbl_mdVersion_s": "Aardvark",
"layer_geom_type_s": "Polygon", // I'M GUESSING THIS IS SUPPOSED TO BE SOMETHING ELSE?
"gbl_wxsIdentifier_s": "sdr:nyu_2451_34626",
"gbl_mdModified_dt": "2016-11-10T15:51:38Z",
"id": "nyu-2451-34626",
"nyu_addl_dspace_s": "35559",
"locn_geometry": "ENVELOPE(73.557693, 134.773911, 53.56086, 10.175472)",
"gbl_indexYear_im": [
1964
],
"nyu_addl_format_sm": [
"Shapefile"
],
"_version_": 1779481613907787776,
"timestamp": "2023-10-11T17:38:31.500Z"
}
I would expect
@the-codetrane can you share the record that you transformed to get that output?
@thatbudakguy My contract at NYU ended, so I'm outside the walled garden. @mnyrop should be able to help you with this. |
OK, I found the record. I ran it through the migrator myself and got:
{
"dct_creator_sm": [],
"dct_description_sm": [
"This polygon shapefile represents the 1964 County Boundaries for China. The layer includes population census data and was primarily based on the \"Historical Administrative Maps of the People's Republic of China,\" published by China Map Press, and some other yearly administrative maps. See the documentation for more information and a list of the layer variables."
],
"dct_format_s": "Shapefile",
"dct_identifier_sm": ["http://hdl.handle.net/2451/34626"],
"dct_language_sm": ["English"],
"dct_publisher_sm": [
"Beijing Hua tong ren shi chang xin xi you xian ze ren gong si"
],
"dc_relation_sm": ["http://sws.geonames.org/1814991/about/rdf"],
"dct_accessRights_s": "Restricted",
"dct_subject_sm": ["Boundaries", "Demographic surveys", "Population"],
"dct_title_s": "1964 County Boundaries of China with Population Census Data",
"dct_issued_s": "2005",
"schema_provider_s": "NYU",
"dct_references_s": "{\"http://schema.org/url\":\"http://hdl.handle.net/2451/34626\",\"http://www.opengis.net/def/serviceType/ogc/wfs\":\"https://maps-restricted.geo.nyu.edu/geoserver/sdr/wfs\",\"http://www.opengis.net/def/serviceType/ogc/wms\":\"https://maps-restricted.geo.nyu.edu/geoserver/sdr/wms\",\"http://schema.org/downloadUrl\":\"https://archive.nyu.edu/retrieve/74851/nyu_2451_34626.zip\",\"http://lccn.loc.gov/sh85035852\":\"https://archive.nyu.edu/retrieve/74896/nyu_2451_34626_doc.zip\"}",
"dct_spatial_sm": ["People's Republic of China, China"],
"dct_temporal_sm": ["1964"],
"gbl_mdVersion_s": "Aardvark",
"gbl_wxsIdentifier_s": "sdr:nyu_2451_34626",
"gbl_mdModified_dt": "2016-11-10T15:51:38Z",
"id": "nyu-2451-34626",
"nyu_addl_dspace_s": "35559",
"dcat_bbox": "ENVELOPE(73.557693, 134.773911, 53.56086, 10.175472)",
"gbl_indexYear_im": [1964],
"gbl_resourceClass_s": ["Datasets"],
"gbl_resourceType_s": ["Polygon data"]
}
It turned out there was just a typo in the new field's name; I've corrected the mistake.
Resource Class is also multivalued:
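A quick way to catch this kind of mismatch is to compare each field's suffix against the shape of its value. The sketch below assumes the Solr-style suffix convention these schemas use (`_sm` and `_im` for multivalued, `_s` and `_i` for single-valued). The function name is hypothetical and the sample record reuses a few fields from the output above.

```python
def suffix_mismatches(record):
    """Return field names whose value shape disagrees with their suffix."""
    mismatched = []
    for name, value in record.items():
        is_list = isinstance(value, list)
        # Multivalued suffixes should hold lists.
        if name.endswith(("_sm", "_im")) and not is_list:
            mismatched.append(name)
        # Single-valued suffixes should not hold lists.
        elif name.endswith(("_s", "_i")) and is_list:
            mismatched.append(name)
    return mismatched


record = {
    "dct_title_s": "1964 County Boundaries of China with Population Census Data",
    "dct_subject_sm": ["Boundaries", "Demographic surveys", "Population"],
    "gbl_indexYear_im": [1964],
    "gbl_resourceClass_s": ["Datasets"],
    "gbl_resourceType_s": ["Polygon data"],
}
print(suffix_mismatches(record))  # ['gbl_resourceClass_s', 'gbl_resourceType_s']
```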
I tested it out with this record and got the following output:
{"gbl_mdVersion_s":"Aardvark",
"dct_identifier_sm":["930D4EA3-442E-4A28-AEC7-830F1A6CB5F8"],
"dct_title_s":"Land Use Milwaukee County, WI 1963",
"dct_description_sm":["This data layer represents land use for Milwaukee County, Wisconsin in 1963."],
"dct_accessRights_s":"Public",
"schema_provider_s":"UW-Madison Robinson Map Library",
"gbl_wxsIdentifier_s":"",
"id":"930D4EA3-442E-4A28-AEC7-830F1A6CB5F8",
"gbl_mdModified_dt":"2022-01-22T20:12:43Z",
"dct_format_s":"Shapefile",
"dct_language_sm":["English"],
"dct_creator_sm":["Southeastern Wisconsin Regional Planning Commission"],
"dc_publisher_sm":[""],
"dct_subject_sm":["Planning and Cadastral"],
"dct_spatial_sm":[],
"dct_issued_s":"",
"dct_temporal_sm":["1963"],
"gbl_indexYear_im":[1963],
"dct_references_s":
"{\"http://schema.org/downloadUrl\":\"https://gisdata.wisc.edu/public/Milwaukee_LandUse_1963.zip\",\"http://www.isotc211.org/schemas/2005/gmd/\":\"https://gisdata.wisc.edu/public/metadata/Milwaukee_LandUse_1963.xml\"}",
"dcat_bbox":"ENVELOPE(-88.074273, -87.812986, 43.195098, 42.83888)",
"uw_supplemental_s":"For more information: http://www.sewrpc.org/SEWRPC/LandUse.htm",
"uw_notice_s":"",
"gbl_resourceClass_s":["Datasets"],
"gbl_resourceType_sm":["Polygon data"]}
I looked through it pretty carefully and don't see anything unusual. Note the local fields uw_notice_s and uw_supplemental_s both seem to have just come through as-is, which I assume is the default behavior.
Force-pushed from 2eaaa4e to 40bda96.
Force-pushed from 9338a89 to 52db14d.
- Handle elements without crosswalk (via lookup tables)
- Support migrating collections in dct_isPartOf_sm
- Convert single to multivalued fields where appropriate
- Retain custom fields and remove deprecated fields

Closes #121
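The renaming, multivalue conversion, and custom/deprecated field handling listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the migrator's actual implementation: `CROSSWALK` is a small hypothetical subset of field pairs, `DEPRECATED` is a hypothetical drop list, and the input record is made up from values seen in this thread. Collection handling for dct_isPartOf_sm is left out.

```python
# Hypothetical subset of a v1 -> Aardvark crosswalk (assumed for illustration).
CROSSWALK = {
    "dc_title_s": "dct_title_s",
    "dc_description_s": "dct_description_sm",
    "dc_rights_s": "dct_accessRights_s",
    "dct_provenance_s": "schema_provider_s",
    "layer_slug_s": "id",
}
# Hypothetical list of fields dropped during migration.
DEPRECATED = {"dc_type_s", "geoblacklight_version"}


def migrate(v1_record):
    """Return a new record with renamed, wrapped, and filtered fields."""
    migrated = {"gbl_mdVersion_s": "Aardvark"}
    for name, value in v1_record.items():
        if name in DEPRECATED:
            continue  # remove deprecated fields
        # Fields without a crosswalk entry (e.g. local fields) keep their name.
        new_name = CROSSWALK.get(name, name)
        # Wrap single values when the target field is multivalued.
        if new_name.endswith("_sm") and not isinstance(value, list):
            value = [value]
        migrated[new_name] = value
    return migrated


v1 = {
    "dc_title_s": "Land Use Milwaukee County, WI 1963",
    "dc_description_s": "This data layer represents land use for Milwaukee County, Wisconsin in 1963.",
    "dc_rights_s": "Public",
    "dc_type_s": "Dataset",
    "uw_notice_s": "",
}
print(migrate(v1))
```

Building a new dictionary rather than mutating the input makes it easy to compare the record before and after migration.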
Force-pushed from 52db14d to 524dd74.