Direct mapping vs. CSV2RDF, simple case
Consider a simple table, called People:
ID | fname | addr |
---|---|---|
7 | Bob | 18 |
A relational table always has a schema, which the RDF Direct Mapping makes use of. In this example, the schema may define that:
- ID is a primary key in the table, and its cells are integers
- The fname column contains strings
- The addr column contains integers
Using this information, the result of the Direct Mapping is something like:
<http://foo.example/DB/People/ID=7> rdf:type <http://foo.example/DB/People>;
<http://foo.example/DB/People#ID> 7;
<http://foo.example/DB/People#fname> "Bob";
<http://foo.example/DB/People#addr> 18
.
Here, http://foo.example/DB/ is the URL for the database that contains the People table.
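The Direct Mapping step for a single row can be sketched in Python as follows. This is a minimal illustration, not a conforming implementation: the function name `direct_map_row` and its parameters are hypothetical, the row values are assumed to arrive already typed (as a relational schema would guarantee), predicate IRIs are assumed to be formed as the table IRI plus `#` plus the column name, and string escaping is ignored.

```python
def direct_map_row(base, table, primary_key, row):
    """Return Turtle for one row of a relational table.

    `row` maps column names to already-typed Python values (int or str),
    standing in for the types a relational schema would supply.
    """
    table_iri = f"{base}{table}"
    # The primary key gives every row a stable IRI to use as subject.
    subject = f"<{table_iri}/{primary_key}={row[primary_key]}>"
    # The table IRI doubles as the class of the row.
    lines = [f"{subject} rdf:type <{table_iri}>"]
    for column, value in row.items():
        # Integers become bare numeric literals, everything else a quoted
        # string (no escaping is attempted in this sketch).
        literal = str(value) if isinstance(value, int) else f'"{value}"'
        lines.append(f"    <{table_iri}#{column}> {literal}")
    return " ;\n".join(lines) + " .\n"


if __name__ == "__main__":
    print(direct_map_row("http://foo.example/DB/", "People", "ID",
                         {"ID": 7, "fname": "Bob", "addr": 18}))
```

The point of the sketch is that both the subject IRI and the literal types come from the schema, not from the data itself.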
If the only information available to the CSV2RDF processor is the CSV file itself (i.e., the only metadata is the first row, which provides the column names), the output of the conversion is as follows (for the sake of comparison, we take http://foo.example/DB/People to be the URL of the file):
[]
<http://foo.example/DB/People#ID> "7";
<http://foo.example/DB/People#fname> "Bob";
<http://foo.example/DB/People#addr> "18"
.
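The metadata-free conversion can be sketched in the same way. The function name `csv_to_turtle` is hypothetical, and the sketch assumes a comma-separated file whose first row holds the column names; it uses only Python's standard `csv` module and does not escape string values.

```python
import csv
import io


def csv_to_turtle(csv_text, file_url):
    """Convert CSV text to Turtle using only the header row as metadata."""
    reader = csv.DictReader(io.StringIO(csv_text))
    blocks = []
    for row in reader:
        # No primary key is known, so each row hangs off a blank node,
        # and no datatypes are known, so every cell is a plain string.
        pairs = [f'    <{file_url}#{name}> "{value}"'
                 for name, value in row.items()]
        blocks.append("[]\n" + " ;\n".join(pairs) + " .\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    print(csv_to_turtle("ID,fname,addr\n7,Bob,18\n",
                        "http://foo.example/DB/People"))
```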
Comparing the two conversion results:
- CSV2RDF does not know that the first column provides unique identifiers for the rows (i.e., that it is a primary key); consequently, a blank node must be used as the common subject for the row
- CSV2RDF has no information on the datatypes and therefore cannot presume that the first and third columns contain integers
- In general, there is no reason to use the URL of the table as a class for typing the rows; doing so could lead to semantically incorrect statements
To provide the necessary information, the following simple metadata could be made available to the CSV2RDF processor (essentially playing the role of the relational schema for the conversion):
{
  "@context": "http://www.w3.org/ns/csvw",
  "url": "http://foo.example/DB/People",
  "null": "",
  "tableSchema": {
    "aboutUrl": "http://foo.example/DB/People/ID={ID}",
    "columns": [{
      "name": "ID",
      "datatype": "integer"
    }, {
      "name": "fname"
    }, {
      "name": "addr",
      "datatype": "integer"
    }, {
      "name": "type",
      "virtual": true,
      "propertyUrl": "rdf:type",
      "valueUrl": "http://foo.example/DB/People"
    }]
  }
}
Using this metadata, the output of the CSV2RDF processor will be identical to the one produced by the RDF Direct Mapping.
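To make the role of each metadata property concrete, here is a rough Python sketch of how a processor might apply it to one row. It is not the CSV2RDF algorithm itself: the names `METADATA`, `term` and `row_to_turtle` are hypothetical, `url` is assumed to sit on the table description, `str.format` stands in for proper URI Template expansion, only the `integer` datatype is handled, and `null` handling is omitted.

```python
import json

# A trimmed copy of the metadata, with "url" assumed at the table level.
METADATA = json.loads("""
{
  "url": "http://foo.example/DB/People",
  "tableSchema": {
    "aboutUrl": "http://foo.example/DB/People/ID={ID}",
    "columns": [
      {"name": "ID", "datatype": "integer"},
      {"name": "fname"},
      {"name": "addr", "datatype": "integer"},
      {"name": "type", "virtual": true,
       "propertyUrl": "rdf:type",
       "valueUrl": "http://foo.example/DB/People"}
    ]
  }
}
""")


def term(url):
    # Hypothetical helper: full IRIs get angle brackets,
    # prefixed names such as rdf:type pass through unchanged.
    return f"<{url}>" if url.startswith("http") else url


def row_to_turtle(metadata, row):
    """Return Turtle for one CSV row (a dict of column name to cell string)."""
    schema = metadata["tableSchema"]
    # aboutUrl replaces the blank node with an IRI built from the ID cell.
    subject = term(schema["aboutUrl"].format(**row))
    parts = []
    for col in schema["columns"]:
        # Without propertyUrl, the predicate defaults to <table url>#<name>.
        predicate = term(col.get("propertyUrl",
                                 f"{metadata['url']}#{col['name']}"))
        if col.get("virtual"):
            # A virtual column has no cell; its value comes from valueUrl.
            obj = term(col["valueUrl"])
        elif col.get("datatype") == "integer":
            obj = str(int(row[col["name"]]))
        else:
            obj = f'"{row[col["name"]]}"'
        parts.append(f"    {predicate} {obj}")
    return subject + "\n" + " ;\n".join(parts) + " .\n"


if __name__ == "__main__":
    print(row_to_turtle(METADATA, {"ID": "7", "fname": "Bob", "addr": "18"}))
```

In this sketch the `rdf:type` triple is emitted last because the virtual column is listed last; the order of triples does not affect the resulting graph.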