map the annotation syntax to rdfs:states #128

Open
rat10 opened this issue Sep 23, 2024 · 49 comments

Comments

@rat10
Contributor

rat10 commented Sep 23, 2024

tl;dr

Define a property rdfs:states and map the annotation syntax to it,
to differentiate between

  • annotations on statements in the graph
  • annotations on unasserted statements.

E.g. map Turtle-star to N-triples-star as follows:

<< :s :p :o >> :a :b .
:s :p :o {| :c :d |} .

<==>

_:r rdf:reifies <<( :s :p :o )>> .
_:r  :a :b .
_:s rdfs:states <<( :s :p :o )>> .
_:s :c :d .
:s :p :o .

Advise users to only use Turtle-star and Sparql-star ("Turtle-star with holes") when interacting with RDF-star data, unless they are sure they know what they are doing and are prepared to handle "dangling" stated reifications.

Annotating un-asserted statements

RDF standard reification in practice is often used to annotate statements that are not contained in the graph, to e.g. discuss competing viewpoints or document earlier versions. Since the early times of the RDF* CG it has been established that annotating statements without asserting them is an important use case. However, not much detail was provided.

Intuitions

It turned out during discussions that two possible intuitions are at play:

  • a single truth
  • competing viewpoints

A single truth is what the somewhat impoverished semantics of RDF supports: a statement is either in the graph and therefore true, or it's unknown.

Competing truths is what real life suggests, and possible interpretations range from "not yet confirmed" to "not endorsed" to "strongly opposed".

Competing truths

Multiple viewpoints can appear in statements or in annotations.

  • dis-agreement on a statement (but not on annotation), e.g. a claim is described, but disputed:
<< :Foo :madeOf :Bar >> :says :Alice .  # and disagrees
:Foo :madeOf :Bar {| :says :Bob |}.     # and endorses the theory
  • dis-agreement on annotation (but not on statement), e.g. one claim, multiple supporting theories, but only one of them considered valid:
<< :Foo :madeOf :Bar >> :becauseOf :TheoryOne .  # we disagree with the theory
<< :Foo :madeOf :Bar >> :becauseOf :TheoryTwo .  # and with the second theory
:Foo :madeOf :Bar {| :becauseOf :TheoryThree |} . # but we endorse this one

Reification semantics

Reification maintains a disconnect between

  • a statement, which represents a type
  • "its" annotation, which refers to an occurrence of that type.

The semantics of reification is tricky, to put it mildly. A reification:

  • is not asserted,
  • does not entail the statement it refers to,
  • but refers to (the interpretation of) a statement as an occurrence,
  • i.e. not to the statement "itself"
  • and also not to the statement as a type,
  • but just a specific instance,
  • however without being able to point to that specific instance
  • because the graph just contains the statement as a type (if at all!).
  • etc (ask two semanticists for what might sound like five different answers)

Reification is unavoidable in RDF to manage the constraints that the set semantics impose. There will always be a disconnect between the statement which may be contained in the graph as a type, and the annotations which instead refer to occurrences of that type of statement (instead of "occurrence" one might also call them instance, token or subtype, slight variations in meaning notwithstanding). In general, users should be shielded from these shallows as well as possible.

Unlike what reification provides, many use cases ask for a clear and solid connection between a fact and "its" annotation. All use cases of qualification fall into this category, e.g. Wikidata, and many more. In LPG the edge "EXISTS", i.e. there are no unasserted statements in LPG, and the indirection through reification is quite irritating and unwelcome.

There are two ways to bridge that gap

  • from the inside: entail the annotated statement
  • from the outside: describe the statement to be annotated

Entailment is not available in simple RDF, so describing "such a statement" is the way to go. However, it's important to realize that the statement itself
:s :p :o .
and a triple term describing it
<<( :s :p :o )>>
and a reference to such a triple so described
:r rdf:reifies <<( :s :p :o )>>
are three different things and, while all similar, there is no way to connect them beyond describing that similarity.

Syntax

We currently have three syntactic primitives, one of them in N-triples to actually implement RDF-star, the other two in Turtle-star to provide appropriate syntactic sugar for users:

  • in N-triples-star the abstract triple term describes a statement, e.g. the term <<( :s :p :o )>> describes the triple :s :p :o . However, that triple is not asserted and can only be referred to by very specific means. A so-called 'reifier' creates a reference to an occurrence of the abstract triple term, e.g. :r rdf:reifies <<( :s :p :o )>> . The reifier :r can then be annotated, e.g. :r :a :b . None of this actually states :s :p :o . That requires a regular RDF statement to that effect.
  • in Turtle-star unasserted syntax a syntactic shortcut is provided to define the reifier, e.g. << :s :p :o >> :a :b . defines and annotates the reified statement - however, without actually stating the reified triple.
  • in Turtle-star annotation syntax a second syntactic shortcut is provided to define the reifier AND state the reified triple, e.g. :s :p :o {| :a :b |} . defines, states and annotates the reified statement in one go.

Mapping

There are two possible ways to map the syntactic sugar of Turtle-star to bare triples in N-triples-star.
Depending on the interpretation of what it means for a statement to be un-asserted one of them provides partial support of use cases, the other one near complete support. (Full support can't be claimed because of the limitations dictated by the semantics.)

The mapping as currently defined in RDF-star with only rdf:reifies

  • can only support a single truth
  • but fails to round-trip correctly in more involved cases
  • i.e. it lures users into expecting an expressivity that isn't supported in N-triples

The mapping as proposed here with rdf:reifies and rdfs:states

  • can express competing viewpoints
  • but is not fully supported in N-triples-star
  • i.e. it can lead to incomplete representations when mixing syntactic sugar and triple terms

Mapping to N-triples-star with rdf:reifies and dis-agreement on annotation

<< :Foo :madeOf :Bar >> :becauseOf :TheoryOne .
<< :Foo :madeOf :Bar >> :becauseOf :TheoryTwo .
:Foo :madeOf :Bar {| :becauseOf :TheoryThree |} .

=>

_:r1 rdf:reifies <<(:Foo :madeOf :Bar )>> ;
     :becauseOf :TheoryOne .
_:r2 rdf:reifies <<(:Foo :madeOf :Bar )>> ;
     :becauseOf :TheoryTwo .
_:r3 rdf:reifies <<(:Foo :madeOf :Bar )>> ;
     :becauseOf :TheoryThree .
:Foo :madeOf :Bar .

=>

:Foo :madeOf :Bar {| :becauseOf :TheoryOne |} , 
                  {| :becauseOf  :TheoryTwo |} , 
                  {| :becauseOf  :TheoryThree |} .

Mapping to N-triples-star with rdf:reifies and dis-agreement on statement

<< :Foo :madeOf :Bar >> :says :Alice .
:Foo :madeOf :Bar {| :says :Bob |}.

=>

_:r1 rdf:reifies <<(:Foo :madeOf :Bar )>> ;
     :says :Alice .
_:r2 rdf:reifies <<(:Foo :madeOf :Bar )>> ;
     :says :Bob .
:Foo :madeOf :Bar .

=>

:Foo :madeOf :Bar {| :says :Alice |} ,
                  {| :says :Bob |} .

snafu...

This is clearly unsatisfactory. The mapping as currently specified only works for uncontested data. A later addition of a statement first annotated as unasserted may profoundly change the meaning of an annotation. Conflicting viewpoints cannot be represented.
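The loss can be reproduced in a minimal Python sketch. It is an illustration under stated assumptions, not an implementation of any RDF library or of the RDF 1.2 drafts: triples are plain tuples, a triple term is a nested tuple in object position, and the helper names `expand` and `compact` are hypothetical.

```python
from itertools import count

REIFIES = "rdf:reifies"

def expand(doc):
    """Expand (form, triple, annotation) items into a set of plain triples.

    form is "unasserted" for  << s p o >> a b .
    and "asserted" for        s p o {| a b |} .
    """
    ids = count(1)
    graph = set()
    for form, triple, (ap, ao) in doc:
        r = f"_:r{next(ids)}"            # fresh reifier per annotation
        graph.add((r, REIFIES, triple))  # triple term in object position
        graph.add((r, ap, ao))
        if form == "asserted":
            graph.add(triple)            # annotation syntax also asserts the triple
    return graph

def compact(graph):
    """Recover (form, triple, annotation) items using only rdf:reifies."""
    doc = []
    for r, p, term in graph:
        if p != REIFIES:
            continue
        # The only available signal: is the reified triple in the graph?
        form = "asserted" if term in graph else "unasserted"
        for s, ap, ao in graph:
            if s == r and ap != REIFIES:
                doc.append((form, term, (ap, ao)))
    return sorted(doc)

T = (":Foo", ":madeOf", ":Bar")
doc = [
    ("unasserted", T, (":says", ":Alice")),
    ("asserted", T, (":says", ":Bob")),
]
round_tripped = compact(expand(doc))
```

In this sketch `round_tripped` reports both annotations as "asserted": once the triple is in the graph, nothing in the expanded form records that Alice's annotation was written with the unasserted syntax.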

But worst of all, the syntactic sugar conveys the illusion that all use cases are covered:

<< :Foo :madeOf :Bar >> :says :Alice .
:Foo :madeOf :Bar {| :says :Bob |} .

suggests that we can represent and annotate statements in both unasserted and asserted form side by side - which is a prerequisite for describing competing viewpoints.

But the essential detail of unstatedness is lost in translation to N-triples - which is a prerequisite for storing it in an RDF database - as soon as the triple is added from another viewpoint.
I.e. users are lured into believing they expressed a much richer description than what is actually stored in the back end, and transmitted over the wire.

This is the kind of unpleasant surprise that users tend to not forget, nor forgive.

Workarounds

Some workarounds have been proposed:

  • defer a solution until later, i.e. "collect experience first".
    However, the syntactic sugar is defined now and can't be re-mapped later.
  • add another statement to express "statedness"
    However, that means adding a lot of triples (most annotations are on asserted statements), i.e. verbosity
    and having to query for them (needing one more join), i.e. performance issues. Relying on users to add one more triple to express what seems clear in syntax is a brittle approach anyway.
_:r2 rdf:reifies <<(:Foo :madeOf :Bar )>> ;
     :says :Bob ,
     rdf:type :Stated .
  • delegate to user space, because there are so many different reasons why it could be desirable to annotate a statement without asserting it.
    However, while we agree that all those different reasons do indeed belong in domain ontologies, the basic fact of whether a statement is considered asserted or not has to be covered by RDF proper.

Pragmatic solution: rdfs:states

The two syntactic devices to express annotations in Turtle-star

<< :Foo :madeOf :Bar >> :says :Alice .    # un-asserted
:Foo :madeOf :Bar {| :says :Bob |}.       # asserted

capture the intuition behind all use cases. It's the mapping to N-triples that loses an important aspect
and fails to cover non-simplistic use cases where the world is more than a one-dimensional series of facts.

Hence let them map to different properties in N-triples:

_:r1 rdf:reifies <<( :Foo :madeOf :Bar )>> ;
     :says :Alice .
_:r2 rdfs:states <<( :Foo :madeOf :Bar )>> ;
     :says :Bob .
 :Foo :madeOf :Bar . 

This provides

  • safe round tripping between Turtle-star and N-triples-star
  • all use cases are covered
  • no extra triples needed to express the "obvious" (i.e. what the syntax suggests anyway)
  • no mismatch between intuition expressed by syntax and the data actually stored.
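The round-tripping claim can be checked with a small Python sketch of the two-property mapping. As before this is only an illustration: triples are tuples, `rdfs:states` is the property proposed in this issue (not an existing vocabulary term), and `expand` / `compact` are hypothetical helper names.

```python
from itertools import count

REIFIES, STATES = "rdf:reifies", "rdfs:states"  # rdfs:states: proposed, not standardised

def expand(doc):
    """Map the two Turtle-star forms to different linking properties."""
    ids = count(1)
    graph = set()
    for form, triple, (ap, ao) in doc:
        r = f"_:r{next(ids)}"
        link = STATES if form == "asserted" else REIFIES
        graph.add((r, link, triple))
        graph.add((r, ap, ao))
        if form == "asserted":
            graph.add(triple)  # the "macro": annotation syntax also asserts the triple
    return graph

def compact(graph):
    """Recover the surface form from the linking property alone."""
    doc = []
    for r, p, term in graph:
        if p not in (REIFIES, STATES):
            continue
        form = "asserted" if p == STATES else "unasserted"
        for s, ap, ao in graph:
            if s == r and ap not in (REIFIES, STATES):
                doc.append((form, term, (ap, ao)))
    return sorted(doc)

T = (":Foo", ":madeOf", ":Bar")
doc = [
    ("unasserted", T, (":says", ":Alice")),
    ("asserted", T, (":says", ":Bob")),
]
round_tripped = compact(expand(doc))
```

Here `round_tripped` equals the original items (up to ordering), because the distinction between the two forms is carried by the property rather than inferred from the presence of the triple.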

More formally speaking the properties rdf:reifies and rdfs:states could be defined as follows:

rdf:reifies 
    rdfs:domain rdf12:ReifiedTripleTerm ;
    rdfs:range rdf12:TripleTerm ;
    rdfs:comment "
        Reifying a triple term doesn’t make any assumption about
        whether a token of that triple is true in the graph.
        It doesn’t entail the triple but merely states its existence.
    " .

rdfs:states 
    rdfs:subPropertyOf rdf:reifies ;
    rdfs:domain rdf12:StatedTripleTerm ;
    rdfs:range rdf12:TripleTerm ;
    rdfs:comment "
        Stating a triple term does express the expectation that
        a token of the triple is true in the graph.
        It captures the intuition of Turtle-star annotation syntax.
        RDFS entailment interprets this expectation as a 
        semantic assumption and entails the triple so described.
   " .

(see also An update on [Proposal: described vs stated triple terms])

Entailment ???

But isn't it entailment when reifying a triple term with rdfs:states also creates the statement described by the triple term? Strictly speaking the answer might be 'yes', and that would be a problem because we need to specify a solution to statement annotation that works in simple RDF, without RDFS/OWL/etc reasoning.
However, the mapping from annotation syntax to rdfs:states and back again can be understood as a simple "macro", because it is perfectly predictable and it works on the level of syntax mapping, not in the realm of interpretation.
I.e. as long as all interactions are channeled through Turtle-star and Sparql-star everything is safe!

But what to do with the following N-triples-star file containing a "dangling" 'rdfs:states' statement, but missing the actual triple:

_:r2 rdfs:states <<( :Foo :madeOf :Bar )>> ;
     :says :Bob .
# the actual ':Foo :madeOf :Bar.' triple is missing
  • The easiest solution would be to call for entailment. However, that is not available in basic RDF.
  • Alternatively add to the specification that
    "Implementations MAY|SHOULD add the missing triple."
  • Trust users that mess with N-triples-star that they know what they are doing.
    For everybody else put up warning signs saying
    "Always use Turtle-star and Sparql-star, leave N-triples-star to back office machinery."
  • Accept that a waterproof solution to RDF meta modeling without entailment is just not possible.
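The "MAY|SHOULD add the missing triple" option could look roughly like the following Python sketch. It assumes triples are tuples with a nested tuple as triple term, and `dangling_states` and `repair` are hypothetical names; a real implementation would more likely ask the user before changing data.

```python
STATES = "rdfs:states"  # property proposed in this issue

def dangling_states(graph):
    """Return triple terms that are the object of rdfs:states but are not asserted."""
    return {o for (_, p, o) in graph if p == STATES and o not in graph}

def repair(graph):
    """Return a copy of the graph with the missing stated triples added."""
    return set(graph) | dangling_states(graph)

T = (":Foo", ":madeOf", ":Bar")
graph = {
    ("_:r2", STATES, T),
    ("_:r2", ":says", ":Bob"),
    # the actual ':Foo :madeOf :Bar .' triple is missing
}
missing = dangling_states(graph)
repaired = repair(graph)
```

Detection alone (without `repair`) corresponds to the "warning sign" option: the loader reports the dangling `rdfs:states` statements and leaves the decision to the user.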

Alternatively, drop the syntactic sugar that lures users into false expectations. But that still wouldn't meet all use cases. Go back to "Add rdfs:states"...

Querying

Just as Turtle-star should be used for authoring, Sparql-star should be used for querying. Users searching for the application of a triple term in both asserted and unasserted form will have to do so explicitly, by using both syntactic forms ("Turtle-star with holes"). Users that just search for annotations on statements that are true in the graph - probably the most common use case - don't have to think twice but just use the annotation syntax, and vice versa for searches of annotations on unasserted triples. At this level no knowledge of the underlying mapping to N-triples-star with its two different properties is required.
However, searching for all instantiations of a triple term in raw triple term form requires nothing more than searching for rdf:reifies and rdfs:states, which is not hard to do either.
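At the raw-triple level, the three kinds of search can be sketched in Python as follows. This is a simplified stand-in for a SPARQL query over tuples, with no entailment applied; `annotations` is a hypothetical helper and `rdfs:states` is the property proposed here.

```python
REIFIES, STATES = "rdf:reifies", "rdfs:states"

def annotations(graph, triple, links):
    """Return sorted (predicate, object) annotations on reifiers that point
    at `triple` through any of the linking properties in `links`."""
    reifiers = {s for (s, p, o) in graph if p in links and o == triple}
    return sorted(
        (p, o) for (s, p, o) in graph
        if s in reifiers and p not in (REIFIES, STATES)
    )

T = (":Foo", ":madeOf", ":Bar")
graph = {
    ("_:r1", REIFIES, T), ("_:r1", ":says", ":Alice"),
    ("_:r2", STATES, T), ("_:r2", ":says", ":Bob"),
    T,
}
on_asserted = annotations(graph, T, {STATES})
on_unasserted = annotations(graph, T, {REIFIES})
on_either = annotations(graph, T, {REIFIES, STATES})
```

The only difference between the three searches is the set of linking properties passed in, which is the point made above about searching for both rdf:reifies and rdfs:states.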

@TallTed
Member

TallTed commented Sep 24, 2024

Standing ovation

My single quibble is with the word states in rdf:states which makes me think more of conditions than it does of utters ... which I think would serve well, as in rdf:utters ...

... which I therefore suggest. (This might be best accompanied by changes from statement to utterance, but I can live without this.)


About the conclusion, however... I think it would be in keeping with RDF-Classic to allow loading of the "widowed" or "orphaned" line(s). Cleanup could be done dynamically at load time, or later as some kind of batch job; in both cases, probably best with user input to say how to resolve the detected issues (like, ":Foo :madeOf :Bar utterance found without any rdfs:utters and however-many rdf:reifies. Insert the utterance anyway? Add an rdfs:utters statement? In the latter case, with what subject entity?"). SHACL and ShEx and various SPARQL queries should be useful in these batch jobs.

Junk results are to be expected from junk data.

Discovery and repair of junk data before junk results in live deployment is to be worked toward.

Repairing junk data delivers good (or at least, tolerable) data, resulting in good (or at least, tolerable) query/analysis results, approaching best.

@niklasl

niklasl commented Sep 24, 2024

Both "Mapping to N-triples-star" examples are incorrect in their final compact forms.

While this:

<< :Foo :madeOf :Bar >> :says :Alice .
:Foo :madeOf :Bar {| :says :Bob |}.

does indeed expand to:

_:r1 rdf:reifies <<(:Foo :madeOf :Bar )>> ;
     :says :Alice .
_:r2 rdf:reifies <<(:Foo :madeOf :Bar )>> ;
     :says :Bob .
:Foo :madeOf :Bar .

it does not compact to:

:Foo :madeOf :Bar {| :says :Alice ,
                           :Bob |}.

but to:

:Foo :madeOf :Bar {| :says :Alice |}
                  {| :says :Bob |} .

If we were to introduce a relationship to the effect that the reifier implies the truth of the triple(s) it reifies (or some of them), annotation syntax does not work to simply reify the triple so asserted; it is locked to such a meaning. This is a very important thing to recognize about this additional difference.

That means that you cannot use it to reference reifiers orthogonal to assertion using the annotation syntax. As you exemplify, you'd have to write:

:Foo :madeOf :Bar {| a :Belief ; :says :Bob |}.
<< :Foo :madeOf :Bar >> a :Disbelief ; :says :Alice .

Since this would be different to:

:Foo :madeOf :Bar {| a :Belief; :says :Bob |}
                  {| a :Disbelief ; :says :Alice |} .

I understand that this is what you want; I just want to make it clear. This would go for all uses: be it qualification, "neutral" provenance, marginalia, etc.

I am not principally opposed to this, if it is clear and intuitive in common practice. I personally think notions like propositional attitude and justified beliefs are interesting but tricky, and thus I've leaned towards having these aspects in the specific domain modelling, and just have "reifies" in the basics.

But if this distinction is common belief (and conversely that the lack of it would be harmful for comprehension), and given that the various practical uses prove to differentiate correctly (and not just reference the triple reified), then that is a case for introducing it at this basic level.

If the distinction is added, we would have something like "implicators" alongside reifiers (or as a subset thereof). In that case, I'd suggest rdf:implies (rather than rdf:states or rdf:utters) as the relationship.

To properly model something Bob believes that is not considered true/known in a graph, consider:

<< :Moon :madeOf :Cheese >> a :Belief; :says :Bob .

That suggests that the belief doesn't imply the statement in this graph. I think it is OK (in this graph, this reifier is not an "implicator", or truth-maker, if you will). It's just one of these differences we need to take into consideration.

(Aside: I'd use e.g. :heldBy instead of :says if modelling the above; but that's mostly surface stuff.)

(That is part of what I was exploring with the Anne Bonny example, where e.g. #005 is a source of triples both believed in and not by the publisher of that graph. I'd love to see more examples of "disbeliefs" to study the effect in representations.)

@rat10
Contributor Author

rat10 commented Sep 24, 2024

@TallTed Thank you for the flowers ;-)
@niklasl Thank you for the syntax check. I hope it's correct now (trailing commas or not?).

About property naming:

  • RDF terminology is "statement", not "utterance", and rdfs:states takes that up so as not to introduce unnecessary disruption
  • I'd actually favor a pair rdf:mentions and rdfs:states because that seems like more user-friendly language to me
  • OTOH rdf:reifies and rdfs:implies seems technically more correct, and if we consider user-facing demands to be met by the syntactic sugar, then one might argue that it doesn't hurt if the properties are named very technically.

YMMV ;-) but let's not get too deep into a naming discussion right now. I know it's tempting, but like syntax it should be fixed at the end. Right now any name is good enough that helps discuss the topic at hand.

About the Anne Bonny example:
This is a shortened version that hopefully contains all relevant variations:

##
# These are the primary facts asserted by the publisher of this graph.
<Anne_Bonny> a :Person ;
    :name "Anne Bonny" ;
    :parent <William_Cormac> ~ <#005> .
<#001> a :Circumstance ;
    :startDate "1716"^^:EDTF ~ <#005> .

##
# These are various documented sources of claims; asserted or just cited.
# Again, according to the publisher of this graph.
<#005> a :Reference ;
    :source <http://www.encyclopedia.com/doc/1G2-3446400036.html> ~ <#002> ;
    :date "2024-08-14T12:48:07Z"^^:DateTime .

##
# These are spurious claims made in the referenced sources.
# They are not considered true by the publisher of this graph.
<< <Anne_Bonny> :familyName "Brennan" ~ <#005> >> .

Mapping that to an rdfs:states-enhanced version of N-triples-star seems straightforward:

<#005> 
    rdfs:states <<( <Anne_Bonny> :parent <William_Cormac> )>> ,
                <<( <#001> :startDate "1716"^^:EDTF )>> ;
    rdf:reifies <<( <Anne_Bonny> :familyName "Brennan" )>> .
# omitting all the stated triples for brevity

Or am I missing something?

@rat10 rat10 changed the title map the annotation syntax to rdfs:states map the annotation syntax to rdfs:states Sep 24, 2024
@william-vw

Thanks a lot for this @rat10. I understand your point much better now.

Under this proposal, at least how I understood it, I am worried that a proper use of RDF 1.2 can result in "junk data". I find orphaned assertions (@TallTed ) / dangling stated reifications (@rat10) problematic; I don't find the solutions convincing at this point. I have a feeling that people are going to use N-Triples syntax anyway (and why shouldn't they? it's a lot simpler); even if they don't, (poorly coded) applications can make mistakes or (bad) choices in N-Triples, e.g., leave out the assertions to reduce the dataset size.

My main problem - what if your current dataset is (valid use of RDF 1.2):

_:r1 rdf:reifies <<(:Foo :madeOf :Bar )>> ;
     :says :Alice .
_:r2 rdf:reifies <<(:Foo :madeOf :Bar )>> ;
     :says :Bob .

and then you add:

:Foo :madeOf :Bar .

Does it mean that, all of a sudden, there is a mistake (missing rdf:states) in the N-triples?

Or, similarly, when you have vanilla RDF:

:Foo :madeOf :Bar .

But then add:

_:r1 rdf:reifies <<(:Foo :madeOf :Bar )>> ;
     :says :Alice .

Does it similarly mean there is now a mistake (missing rdf:states)?

I remember Pat Hayes saying that each RDF triple should be able to stand on its own; this no longer seems to be the case as an rdf:states must be accompanied by the assertion.

But, IIUC, your point is that there is a similar situation now:

<< :Foo :madeOf :Bar >> :says :Alice .

That adding:

:Foo :madeOf :Bar {| :says :Bob |}.

Now means that Alice is all of a sudden talking about a fact, and we no longer know which of the annotations referred to :Foo :madeOf :Bar as a fact (?)

@pfps
Contributor

pfps commented Sep 24, 2024

In my opinion, RDF is supposed to be a very simple formalism, with only the bare minimum of machinery needed to support minimalist representation of truths. (Yes, this isn't exactly true for RDF but, again in my opinion, this is what any augmentation of RDF should be striving for.) But being simple can require wordy constructions for what can be stated concisely in more-complex languages. To alleviate this problem, some surface syntaxes for RDF provide shorthands for common wordy constructions.

In my opinion, the proposal to have both rdf:reifies and rdf:states in RDF violates both these minimalist ideals. First, the two properties appear to differ only in some notion related to propositional attitudes, i.e., the stance of the constructs vis-a-vis whether the truth of the quoted statement is supported by the construct. Propositional attitudes are a very complex notion, and thus do not belong in RDF. Second, propositional attitudes can just as well be done in some extension of RDF or even done as user vocabulary, and thus again do not belong in RDF.

So I am against having both rdf:reifies and rdf:states in RDF, as this sort of capability goes against the minimalist nature of RDF.

PS: Here is how one could model a reifier supporting a statement:

_:r rdf:reifies <<( :s :p :o )>> .
_:r a ex:Supporter .

More-complex relationships are possible and would be modelled in more-complex ways, perhaps requiring a second "stand-off".

@rat10
Contributor Author

rat10 commented Sep 24, 2024

@william-vw

Does it mean that, all of a sudden, there is a mistake (missing rdf:states) in the N-triples?

No. A statement that is not annotated is just a regular statement (same answer to the following "similar" question).

I remember Pat Hayes saying that each RDF triple should be able to stand on its own;

That is the problem with meta-modelling, and why RDF 1.0 resorted to reification.

this no longer seems to be the case as an rdf:states must be accompanied by the assertion.

Well, in my proposal it's defined as a syntactic macro, and where that macro fails to take hold (e.g. when users mess around with N-triples-star) instead of must I propose MAY or SHOULD (and @TallTed provides some helpful detail to that).

But, IIUC, your point is that there is a similar situation now:
<< :Foo :madeOf :Bar >> :says :Alice .
That adding:
:Foo :madeOf :Bar {| :says :Bob |}.
Now means that Alice is all of a sudden talking about a fact, and we no longer know which of the annotations referred to :Foo :madeOf :Bar as a fact (?)

Indeed similar insofar as adding a statement may change how another statement is interpreted - which of course depends on whether your intuition is sufficiently primed by the RDF semantics, or whether you rather follow laypersons' intuitions and what the syntactic sugar suggests. Yes, that is my issue. In a way one could call this non-monotonic, as assumptions are evoked by the syntactic sugar, but then not backed up in the N-triples based machinery of triple stores and (especially streaming) data exchange.

@rat10
Contributor Author

rat10 commented Sep 24, 2024

@pfps

In my opinion, RDF is supposed to be a very simple formalism, with only the bare minimum of machinery needed to support minimalist representation of truths. (Yes, this isn't exactly true for RDF but, again in my opinion, this is what any augmentation of RDF should be striving for.)

Minimalistic yes, but not to the point of not being useful or even misleading. It is a design discussion what is considered the most minimalistic but useful design, and we're having it here. I argue that RDF-star without rdfs:states is too minimalistic, and too prone to misleading expectations.

But being simple can require wordy constructions for what can be stated concisely in more-complex languages. To alleviate this problem, some surface syntaxes for RDF provide shorthands for common wordy constructions.

But the syntactic sugar should not deviate from what is stored as bare triples. As I laid out above, it currently does. Instead of introducing an rdfs:states property we could also take away the syntactic sugar - however, that would not be my favorite solution.

In my opinion, the proposal to have both rdf:reifies and rdf:states in RDF violates both these minimalist ideals. First, the two properties appear to differ only in some notion related to propositional attitudes, i.e., the stance of the constructs vis-a-vis whether the truth of the quoted statement is supported by the construct. Propositional attitudes are a very complex notion, and thus do not belong in RDF. Second, propositional attitudes can just as well be done in some extension of RDF or even done as user vocabulary, and thus again do not belong in RDF.

Propositional attitudes are indeed a wide field, but the basic distinction between a statement considered true or not is really very - well - basic, and therefore IMO has to be part of a minimalistic design. I take confirmation in the fact that those two options - asserted or not - are precisely what the syntactic sugar provides. I agree with the intuitions behind the syntax. I want that intuition to also be represented - and not lost - in the raw triples.

So I am against having both rdf:reifies and rdf:states in RDF, as this sort of capability goes against the minimalist nature of RDF.

PS: Here is how one could model a reifier supporting a statement:

_:r rdf:reifies <<( :s :p :o )>> . _:r a ex:Supporter .

More-complex relationships are possible and would be modelled in more-complex ways, perhaps requiring a second "stand-off".

IMO modelling is actually not necessarily the worst part, but querying is. If one always has to check for extra annotation w.r.t. the "stated-ness" of the resource an annotation is meant to refer to, then we put a pretty severe burden on query authors and query engines (i.e. one more line in the query, one more join the engine has to manage), and we would still need to provide some vocabulary to this effect. See also the discussion of workarounds above.
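The "one more join" can be made concrete with a rough Python sketch comparing the two modellings. Triples are tuples, the function names are hypothetical, and `ex:Supporter` is taken from the example quoted above; the sketch says nothing about how a real query engine would plan either query.

```python
def stated_via_marker(graph, triple):
    """Marker modelling: match rdf:reifies, then check a second triple
    per candidate reifier (the extra pattern / join)."""
    candidates = {s for (s, p, o) in graph if p == "rdf:reifies" and o == triple}
    return {r for r in candidates if (r, "rdf:type", "ex:Supporter") in graph}

def stated_via_property(graph, triple):
    """Proposed modelling: a single pattern on rdfs:states."""
    return {s for (s, p, o) in graph if p == "rdfs:states" and o == triple}

T = (":Foo", ":madeOf", ":Bar")
marker_graph = {
    ("_:r", "rdf:reifies", T), ("_:r", "rdf:type", "ex:Supporter"),
    ("_:q", "rdf:reifies", T),
    T,
}
property_graph = {
    ("_:r", "rdfs:states", T),
    ("_:q", "rdf:reifies", T),
    T,
}
via_marker = stated_via_marker(marker_graph, T)
via_property = stated_via_property(property_graph, T)
```

Both calls find the same reifier; the difference is that the marker modelling needs a second lookup for every candidate, and some shared vocabulary for the marker would still have to be agreed on.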

@william-vw

william-vw commented Sep 24, 2024

My main problem - what if your current dataset is (valid use of RDF 1.2):

_:r1 rdf:reifies <<(:Foo :madeOf :Bar )>> ;
     :says :Alice .
_:r2 rdf:reifies <<(:Foo :madeOf :Bar )>> ;
     :says :Bob .

and then you add:

:Foo :madeOf :Bar .

Does it mean that, all of a sudden, there is a mistake (missing rdf:states) in the N-triples?

No. A statement that is not annotated is just a regular statement (same answer to the following "similar" question).

That wasn't the issue; it becomes "annotated" once added to the dataset, since there are now 2 annotations about that triple. But now an rdf:states is missing. Similar point for the "similar" question :-)

Well, in my proposal it's defined as a syntactic macro, and where that macro fails to take hold (e.g. when users mess around with N-triples-star) instead of must I propose MAY or SHOULD (and @TallTed provides some helpful detail to that).

Not to be argumentative, but it seems to me that the answer is actually yes, those will be mistakes :-) You say that MAY/SHOULD instead of MUST should be used, so they seem less like mistakes; and you refer to @TallTed re dealing with junk data and cleanup, which you don't need if there aren't mistakes to be cleaned up (not meant to be snarky, just saying).

If you require (may/should/must aside), in an RDF 1.2 dataset, that an rdf:states be accompanied by an asserted triple (and vice-versa, IIUC), and one of those is missing, then that seems like a mistake. There can be potential solutions for them, like you listed.

@william-vw

william-vw commented Sep 24, 2024

If one always has to check for extra annotation w.r.t. the "stated-ness" of the resource an annotation is meant to refer to, then we put a pretty severe burden on query authors and query engines (i.e. one more line in the query, one more join the engine has to manage)

Yes, but:

However, searching for all instantiations of a triple term in raw triple term form requires nothing more than searching for rdf:reifies and rdfs:states, which is not hard to do either.

(also not meant to be snarky, just to point it out)

@pfps
Contributor

pfps commented Sep 24, 2024

rat10

pfps

In my opinion, RDF is supposed to be a very simple formalism, with only the bare minimum of machinery needed to support minimalist representation of truths. (Yes, this isn't exactly true for RDF but, again in my opinion, this is what any augmentation of RDF should be striving for.)

Minimalistic yes, but not to the point of not being useful or even misleading. It is a design discussion what is considered the most minimalistic but useful design, and we're having it here. I argue that RDF-star without rdfs:states [sic] is too minimalistic, and too prone to misleading expectations.

It is no surprise that I completely disagree with both of these claims. I have seen nothing that requires rdf:states or even rdfs:states to be in RDF itself. More below.

But being simple can require wordy constructions for what can be stated concisely in more-complex languages. To alleviate this problem, some surface syntaxes for RDF provide shorthands for common wordy constructions.

But the syntactic sugar should not deviate from what is stored as bare triples. As I laid out above, it currently does. Instead of introducing an rdfs:states property we could also take away the syntactic sugar - however, that would not be my favorite solution.

I dispute this claim. I do not see anything in the "annotation" shorthand that requires using a different property from the other shorthand. If users want to augment the "annotation" shorthand with information about propositional attitudes then they are free to do so. (Of course, RDF and RDFS miss much about propositional attitudes so propositional attitude triples do not expand RDF or RDFS to actually include propositional attitudes.)

In my opinion, the proposal to have both rdf:reifies and rdf:states in RDF violates both these minimalist ideals. First, the two properties appear to differ only in some notion related to propositional attitudes, i.e., the stance of the constructs vis-a-vis whether the truth of the quoted statement is supported by the construct. Propositional attitudes are a very complex notion, and thus do not belong in RDF. Second, propositional attitudes can just as well be done in some extension of RDF or even done as user vocabulary, and thus again do not belong in RDF.

Propositional attitudes are indeed a wide field, but the basic distinction between a statement considered true or not is really very - well - basic, and therefore IMO has to be part of a minimalistic design. I take confirmation in the fact that those two options - asserted or not - are precisely what the syntactic sugar provides. I agree with the intuitions behind the syntax. I want that intuition to also be represented - and not lost - in the raw triples.

A propositional attitude is a relationship between two things, not "a statement considered true or not". RDF graphs already have a perfectly good way of determining this - whether the triple is in the graph or not. I believe that the expansion of both shorthands adequately captures what should be captured in RDF. If more is wanted, then that goes far outside of RDF.

So I am against having both rdf:reifies and rdf:states in RDF, as this sort of capability goes against the minimalist nature of RDF.
PS: Here is how one could model a reifier supporting a statement:
_:r rdf:reifies <<( :s :p :o )>> . _:r a ex:Supporter .
More-complex relationships are possible and would be modelled in more-complex ways, perhaps requiring a second "stand-off".

IMO modelling is actually not necessarily the worst part, but querying is. If one always has to check for extra annotation w.r.t. the "stated-ness" of the resource an annotation is meant to refer to, then we put a pretty severe burden on query authors and query engines (i.e. one more line in the query, one more join the engine has to manage), and we would still need to provide some vocabulary to this effect. See also the discussion of workarounds above.

I would also like to not have to use complex constructs in queries, but that's not how RDF and SPARQL work. The right way to capture any part of any sort of propositional attitude is in a semantic extension of RDF, not in RDF itself.

@rat10
Copy link
Contributor Author

rat10 commented Sep 24, 2024

My main problem - what if your current dataset is (valid use of RDF 1.2):

_:r1 rdf:reifies <<(:Foo :madeOf :Bar )>> ;
     :says :Alice .
_:r2 rdf:reifies <<(:Foo :madeOf :Bar )>> ;
     :says :Bob .

and then you add:

:Foo :madeOf :Bar 

Does it mean that, all of a sudden, there is a mistake (missing rdf:states) in the N-triples?

No. A statement that is not annotated is just a regular statement (same answer to the following "similar" question).

That wasn't the issue; it becomes "annotated" once added to the dataset, since there are now 2 annotations about that triple. But now an rdf:states is missing. Similar point for the "similar" question :-)

The annotations in your example refer to a reifier that rdf:reifies the triple term. So they refer to unasserted statements, not to the statement that is then added. If you wanted to annotate a statement contained in the graph you'd have to use rdfs:states to create a reifier that refers to an asserted statement.

Well, in my proposal it's defined as a syntactic macro, and where that macro fails to take hold (e.g. when users mess around with N-triples-star) instead of must I propose MAY or SHOULD (and @TallTed provides some helpful detail to that).

Not to be argumentative, but it seems to me that the answer is actually yes, those will be mistakes :-) You say that MAY/SHOULD should be used instead of MUST, so they seem less like mistakes; and you refer to @TallTed re dealing with junk data and cleanup, which you wouldn't need if there were no mistakes to clean up (not meant to be snarky, just saying).

If you require (may/should/must aside), in an RDF 1.2 dataset, that an rdf:states be accompanied by an asserted triple (and vice-versa, IIUC), and one of those is missing, then that seems like a mistake. There can be potential solutions for them, like you listed.

Okay, if you insist ;-) I wouldn't rule out that people find creative uses for a three-valued system where it makes a difference if an annotation claims to annotate a statement that it considers true or if the graph actually contains the statement. Also, MUST is a very strong word. But in general the idea is indeed that an "rdfs:stated" reification that is not contained as a standard triple in the graph points to a problem in the data.

@william-vw
Copy link

My main problem - what if your current dataset is (valid use of RDF 1.2):

_:r1 rdf:reifies <<(:Foo :madeOf :Bar )>> ;
     :says :Alice .
_:r2 rdf:reifies <<(:Foo :madeOf :Bar )>> ;
     :says :Bob .

and then you add:

:Foo :madeOf :Bar 

Does it mean that, all of a sudden, there is a mistake (missing rdf:states) in the N-triples?

No. A statement that is not annotated is just a regular statement (same answer to the following "similar" question).

That wasn't the issue; it becomes "annotated" once added to the dataset, since there are now 2 annotations about that triple. But now an rdf:states is missing. Similar point for the "similar" question :-)

The annotations in your example refer to a reifier that rdf:reifies the triple term. So they refer to unasserted statements, not to the statement that is then added.

But now there is no rdf:states to reflect the assertion. Hence my point about a missing rdf:states. Or, is it not a problem that there is an asserted statement without an rdf:states? Check @TallTed's message for that, i.e.:

:Foo :madeOf :Bar utterance found without any rdfs:utters and however-many rdf:reifies. Insert the utterance anyway? Add an rdfs:utters statement?

Not to be argumentative, but it seems to me that the answer is actually yes, those will be mistakes :-) You say that MAY/SHOULD should be used instead of MUST, so they seem less like mistakes; and you refer to @TallTed re dealing with junk data and cleanup, which you wouldn't need if there were no mistakes to clean up (not meant to be snarky, just saying).
If you require (may/should/must aside), in an RDF 1.2 dataset, that an rdf:states be accompanied by an asserted triple (and vice-versa, IIUC), and one of those is missing, then that seems like a mistake. There can be potential solutions for them, like you listed.

Okay, if you insist ;-) I wouldn't rule out that people find creative uses for a three-valued system where it makes a difference if an annotation claims to annotate a statement that it considers true or if the graph actually contains the statement. Also, MUST is a very strong word. But in general the idea is indeed that an "rdfs:stated" reification that is not contained as a standard triple in the graph points to a problem in the data.

Oof, we got there! :-) How about the inverse, i.e., a standard triple that is not accompanied by an rdfs:stated reification?

@rat10
Copy link
Contributor Author

rat10 commented Sep 24, 2024

@william-vw

Oof, we got there! :-)

Yes, but that was described in my proposal above from the start as a "dangling" stated annotation in section "Entailment ???".

How about the inverse, i.e., a standard triple that is not accompanied by an rdfs:stated reification?

That is not a problem. The rdfs:states property is only to be used for annotations of statements that are meant to be true, i.e. to be contained in the graph. Asserting a statement without annotating it is of course still possible - just add it to the graph. I'm not trying to reinvent RDF as a whole.
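The two directions discussed here can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: triples are modelled as plain 3-tuples, a triple term is a nested 3-tuple in object position, `rdfs:states` is the property proposed in this issue (not part of any published vocabulary), and `dangling_stated` is a hypothetical helper name.

```python
# Hypothetical model: a triple is a 3-tuple of strings; a triple term is a
# nested 3-tuple in object position. Not an RDF library, just plain Python.
RDFS_STATES = "rdfs:states"  # proposed property, assumed for illustration


def dangling_stated(graph):
    """Return reifiers whose rdfs:states triple term is NOT asserted in the graph."""
    return sorted(
        s for (s, p, o) in graph
        if p == RDFS_STATES and isinstance(o, tuple) and o not in graph
    )


graph = {
    ("_:r1", "rdf:reifies", (":Foo", ":madeOf", ":Bar")),
    ("_:r1", ":says", ":Alice"),
    ("_:s1", RDFS_STATES, (":Foo", ":madeOf", ":Bar")),
    ("_:s1", ":says", ":Bob"),
}

# Before the triple is asserted, _:s1 is a "dangling" stated reification.
before = dangling_stated(graph)

# Asserting the triple resolves it; _:r1 (rdf:reifies) is never flagged.
graph.add((":Foo", ":madeOf", ":Bar"))
after = dangling_stated(graph)
```

In this sketch only the `rdfs:states` direction is checked. An asserted triple with no `rdfs:states` reifier at all is never reported, matching the statement that an un-annotated asserted triple is not a problem.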

@rat10
Copy link
Contributor Author

rat10 commented Sep 24, 2024

@pfps

rat10

pfps

But the syntactic sugar should not deviate from what is stored as bare triples. As I laid out above, it currently does. Instead of introducing an rdfs:states property we could also take away the syntactic sugar - however, that would not be my favorite solution.

I dispute this claim. I do not see anything in the "annotation" shorthand that requires using a different property from the other shorthand.

You may dispute this claim to your heart's content, but you might consider backing that up with some counter arguments. I illustrated the problems above, see sections "Mapping" ff.

In my opinion, the proposal to have both rdf:reifies and rdf:states in RDF violates both these minimalist ideals. First, the two properties appear to differ only in some notion related to propositional attitudes, i.e., the stance of the constructs vis-a-vis whether the truth of the quoted statement is supported by the construct. Propositional attitudes are a very complex notion, and thus do not belong in RDF. Second, propositional attitudes can just as well be done in some extension of RDF or even done as user vocabulary, and thus again do not belong in RDF.

Propositional attitudes are indeed a wide field, but the basic distinction between a statement considered true or not is really very - well - basic, and therefore IMO has to be part of a minimalistic design. I take confirmation in the fact that those two options - asserted or not - are precisely what the syntactic sugar provides. I agree with the intuitions behind the syntax. I want that intuition to also be represented - and not lost - in the raw triples.

A propositional attitude is a relationship between two things, not "a statement considered true or not".

And what exactly is then the disagreement?

RDF graphs already have a perfectly good way of determining this - whether the triple is in the graph or not. I believe that the expansion of both shorthands adequately captures what should be captured in RDF. If more is wanted, then that goes far outside of RDF.

Oh, the hyperbole again: "far outside". Again: provide arguments and analyses, react to what I described in a lot of detail. Don't just put up straw men like "propositional attitude" and then ascribe them to me.

@rat10
Copy link
Contributor Author

rat10 commented Sep 24, 2024

@william-vw

If one always has to check for extra annotation w.r.t. the "stated-ness" of the resource an annotation is meant to refer to, then we put a pretty severe burden on query authors and query engines (i.e. one more line in the query, one more join the engine has to manage)

Yes, but:

However, searching for all instantiations of a triple term in raw triple term form requires nothing more than searching for rdf:reifies and rdfs:states, which is not hard to do either.

(also not meant to be snarky, just to point it out)

I was trying to argue that a tradeoff is to be made, but that it is well justified. I expect the latter case - where one queries for annotations on asserted AND on unasserted triples of some type - to be the exception. The norm should be to search for annotations on asserted statements, because most of the time RDF is used to describe facts. Such queries would be straightforward: just use the annotation syntax, which internally is mapped to rdf:states. However, having to add to each such search for an annotated fact an extra BGP to check for extra attributions like `_:r a rdf:Stated` indeed puts a much higher burden on users than the need to query for both patterns in the occasional query for annotations on both unasserted and asserted statement types. And it doesn't get much less burdensome if instead of annotating the "stated annotations" only the "unstated" ones are getting attributed, as a query still has to check the intended status of every annotation it finds.
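The tradeoff can be sketched in Python, purely as an illustration. The tuple-based graph model and all function names are hypothetical; `rdfs:states` is the proposed property and `rdf:Stated` is the marker class mentioned as an alternative in this thread. Each set comprehension stands in for one triple pattern of a query.

```python
TERM = (":s", ":p", ":o")

# Variant 1 (the proposal): the property itself marks annotations on asserted statements.
g1 = {
    TERM,
    ("_:r", "rdf:reifies", TERM), ("_:r", ":a", ":b"),
    ("_:s", "rdfs:states", TERM), ("_:s", ":c", ":d"),
}

# Variant 2 (the alternative): one property plus an extra type triple on "stated" reifiers.
g2 = {
    TERM,
    ("_:r", "rdf:reifies", TERM), ("_:r", ":a", ":b"),
    ("_:s", "rdf:reifies", TERM), ("_:s", "rdf:type", "rdf:Stated"), ("_:s", ":c", ":d"),
}


def stated_reifiers_v1(graph, term):
    # One pattern: ?r rdfs:states <<( term )>>
    return {s for (s, p, o) in graph if p == "rdfs:states" and o == term}


def stated_reifiers_v2(graph, term):
    # Two patterns joined on ?r: ?r rdf:reifies <<( term )>> . ?r rdf:type rdf:Stated
    reifiers = {s for (s, p, o) in graph if p == "rdf:reifies" and o == term}
    typed = {s for (s, p, o) in graph if p == "rdf:type" and o == "rdf:Stated"}
    return reifiers & typed


def all_reifiers_v1(graph, term):
    # The occasional "both kinds" query must match two properties in variant 1.
    return {s for (s, p, o) in graph
            if p in ("rdf:reifies", "rdfs:states") and o == term}
```

Under these assumptions, the common query (annotations on asserted statements) needs one pattern in variant 1 and a join in variant 2, while the rarer "both kinds" query is the one that gets slightly longer in variant 1.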

@pfps pfps added the discuss-f2f Proposed for discussion during the next face-to-face meeting label Sep 25, 2024
@pchampin
Copy link
Contributor

pchampin commented Oct 4, 2024

This was (extensively) discussed during today's Semantics TF's meeting

My personal take-away of this conversation is that, as @franconi pointed out, we should probably separate this in two decisions:

  • do we want to change the way the annotation syntax works (and my personal opinion is "no")
  • do we want to introduce a term in RDFS that would allow to automatically entail triple terms (to which my personal opinion is "why not")
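One possible reading of the second decision is an entailment rule of the form "if `?r rdfs:states <<( s p o )>>` is in the graph, then `s p o` is entailed". The Python sketch below illustrates only that reading. The exact shape of the rule, the tuple-based graph model, and the name `entail_stated` are assumptions, not something the thread or any specification defines.

```python
def entail_stated(graph, prop="rdfs:states"):
    """Close `graph` under the assumed rule: ?r prop <<( s p o )>>  =>  s p o.

    Triples are 3-tuples; a triple term is a nested 3-tuple in object position.
    """
    closed = set(graph)
    while True:
        # Every triple term that is the object of `prop` becomes an asserted triple.
        new = {o for (s, p, o) in closed if p == prop and isinstance(o, tuple)} - closed
        if not new:
            return closed
        closed |= new


g = {
    ("_:r", "rdf:reifies", (":Foo", ":madeOf", ":Bar")),
    ("_:s", "rdfs:states", (":Foo", ":madeOf", ":Baz")),
}
closure = entail_stated(g)
```

In this sketch the `rdf:reifies` triple entails nothing, so the annotation syntax itself would not have to change for such a rule to exist.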

@pchampin
Copy link
Contributor

pchampin commented Oct 4, 2024

Multiple viewpoints can appear in statements or in annotations.

  • dis-agreement on a statement (but not on annotation), e.g. a claim is described, but disputed:
<< :Foo :madeOf :Bar >> :says :Alice .  # and disagrees
:Foo :madeOf :Bar {| :says :Bob |}.     # and endorses the theory

As I (and others) explained during today's Semantics TF's meeting, this is an over-interpretation of what's in the graph.
The fact that :Foo :madeOf :Bar is in the graph only means that the author of the graph endorses that statement. It says nothing about Alice's or Bob's endorsement (assuming that the predicate :says does not convey any such implication, which is counter-intuitive IMO, but that's a separate issue).

  • dis-agreement on annotation (but not on statement), e.g. one claim, multiple supporting theories, but only one of them considered valid:
<< :Foo :madeOf :Bar >> :becauseOf :TheoryOne .  # we disagree with the theory
<< :Foo :madeOf :Bar >> :becauseOf :TheoryTwo .  # and with the second theory
:Foo :madeOf :Bar {| :becauseOf :TheoryThree |} .# but we endorse this one

Again, this is an over-interpretation. Nothing in this graph tells me that the author of the graph endorses TheoryThree as a whole! I can only tell that the author endorses :Foo :madeOf :Bar .

@rat10
Copy link
Contributor Author

rat10 commented Oct 7, 2024

@pchampin It obviously is an over-interpretation according to the defined semantics, but that is not my point. However, it is a natural interpretation according to at least my intuition of how a reader not versed in RDF will interpret syntactic sugar for annotated statements and annotated unasserted reifications and Turtle-star. That is the point.

Therefore @franconi's proposal to separate the two issues is also a double-edged sword to me: it would be an important step forward on the N-triples level, and in that respect I'm glad that it finds some support.
However, it doesn't address my core concern: I would still expect users to use the syntactic sugar according to "popular" intuition, i.e. not following the more intricate semantics of RDF, and in consequence to bump into unpleasant surprises.

W.r.t. the critique of the :says property used in the examples above: if I was presenting this argument to logicians only, I wouldn't see the need to come up with any example properties at all, but would just use letters like :s, :p and :o. IMO it couldn't be more obvious to the untrained eye how meaning changes in the example mappings given above on the level of syntactic sugar. I'm claiming that only a well-trained RDF person is able to not see that ;-) We want to attract users from LPG land and in general would like to make RDF easier to use. We should try hard to not introduce the next trap! But nonetheless I'll try to work out a more realistic example.

@pfps
Copy link
Contributor

pfps commented Oct 7, 2024

I completely agree with Pierre-Antoine. I further find no current merit whatsoever in arguments based on over-interpretation of what RDF is. RDF is a very low-level formalism with impoverished syntax, impoverished semantics, and impoverished consequences. I don't like this aspect of RDF but that is where we are.

I deeply sympathize with the argument that the impoverished nature of RDF is a problem. But in my view what would be worse than the current situation is a situation where parts of RDF are impoverished and other parts are not.

@rat10
Copy link
Contributor Author

rat10 commented Oct 7, 2024

@pfps Following that argument it would probably be consistent to drop the syntactic sugar for reified triple terms entirely, because what is worse than syntactic sugar that is not backed by the impoverished representation as bare triples?

@pfps
Copy link
Contributor

pfps commented Oct 7, 2024

@rat10 I don't see how that follows. Syntactic sugar is backed by its expansion because that's all there is to syntactic sugar.

@rat10
Copy link
Contributor Author

rat10 commented Oct 7, 2024

@pfps I discussed different mappings in the issue description and how what the syntactic sugar seems to suggest changes when round-tripping to the database, or N-triples for that matter. An N-triples-star serialization that maps both syntactic variants to rdf:reifies is not able to reconstruct the same syntactic variants. The meaning that probably seems naturally expressed to someone not versed in the workings of RDF (annotations on unasserted statements vs. annotations on asserted ones) gets lost.

We claim to support "unasserted statements", but what we actually support are merely "not yet asserted statements, without any guarantees on how long that state will hold". We claim to support "statements about statements", but because of the impoverished semantics of reification what we actually support are "statements about things that look like the statements one might want to annotate". The proposed rdfs:states addresses both problems, and puts both on more solid grounds.

@pfps
Copy link
Contributor

pfps commented Oct 7, 2024

As far as I can tell there is nothing in the syntactic shorthands that makes any claims in regard to supporting "unasserted statements". There is certainly nothing there that makes any claims to supporting "not yet asserted statements". The evolution of RDF graphs is something completely outside the reach of RDF or RDFS. Both RDF and RDFS are monotonic but that is not about evolution of RDF graphs.

Syntactic shorthands are not designed to be round-trippable. Because shorthands create a different syntactic form that expands to an underlying form that can be written without using the shorthand there is no way to round trip them. It is sometimes possible to define a canonical form that includes shorthands, but some of the shorthands in Turtle 1.1 do not admit a canonical form without an artificial order imposed on nodes.

In sum, I don't see a problem that needs to be solved here.

@william-vw
Copy link

I think the point is that information is lost when Turtle syntactic sugar is translated to N-Triples. IMO the round tripping aspect may be a bit of a red herring.

<< :x :y :z >> :b :c .
:x :y :z {| :k :l |} .

Translates to

:x :y :z .
_:r1 rdf:reifies <<( :x :y :z )>> .
_:r1 :b :c .
_:r2 rdf:reifies <<( :x :y :z )>> .
_:r2 :k :l .

We now no longer know which reifier was associated with the annotation syntax {| |}, and which one with the reification syntax << >>. @pfps (and @rat10) would that be something you can agree with? You are still free not to have a problem with that information loss, of course.

This information may indicate the author's intended meaning for _:r1 (uncertainty), and thus its metadata :b :c vs. _:r2 (certainty) and its metadata :k :l. But, this information (and thus possibly the intended meaning) is now lost. (Note that data on the SW does not exist in isolation; even when the author only uses the reification syntax, someone on another server can assert :x :y :z ..)

When using N-Triples directly, you can use another reification property that aligns with your intended meaning; rdfs:states, my:unsureAbout, my:reallySureAbout, ... The same author - or other people on the SW - can then assert :x :y :z . to their heart's content; the intended meaning of the original triple remains clear. But, when using Turtle syntactic sugar, you can currently only spew out rdf:reifies, which IMO causes the information loss.

For that reason, I currently think it's a good idea to translate the annotation syntax to rdfs:states. However, I believe the macro expansion to an extra triple is problematic for reasons stated before (the dangling rdfs:states issue).
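The information loss described here can be reproduced with a small Python sketch. It is a toy expansion under stated assumptions, not a Turtle parser: the input is a hand-written list of (syntax, triple, annotations) entries, triples are 3-tuples, and `expand` and `reifying_properties` are hypothetical names. `rdfs:states` is the property proposed in this issue.

```python
from itertools import count


def expand(statements, annotation_property="rdf:reifies"):
    """Expand a toy Turtle-star document into a set of triples.

    `statements` holds (syntax, triple, annotations) entries, where syntax is
    "reification" for << s p o >> ... or "annotation" for s p o {| ... |}.
    """
    ids = count(1)
    graph = set()
    for syntax, triple, annotations in statements:
        reifier = f"_:r{next(ids)}"
        if syntax == "annotation":
            graph.add(triple)  # the annotation syntax also asserts the triple
            graph.add((reifier, annotation_property, triple))
        else:
            graph.add((reifier, "rdf:reifies", triple))
        graph.update((reifier, p, o) for (p, o) in annotations)
    return graph


def reifying_properties(graph):
    """Map each reifier to the property linking it to its triple term."""
    return {s: p for (s, p, o) in graph if isinstance(o, tuple)}


doc = [
    ("reification", (":x", ":y", ":z"), [(":b", ":c")]),
    ("annotation", (":x", ":y", ":z"), [(":k", ":l")]),
]

current = reifying_properties(expand(doc))                  # both map to rdf:reifies
proposed = reifying_properties(expand(doc, "rdfs:states"))  # _:r2 maps to rdfs:states
```

With the default expansion, nothing in the resulting triples tells `_:r1` and `_:r2` apart beyond their annotations. With the annotation syntax mapped to `rdfs:states`, the distinction survives in the bare triples.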

@pfps
Copy link
Contributor

pfps commented Oct 8, 2024

Using syntactic shorthands inevitably means information loss of some sort. I don't view the information loss here as in any way a problem.

There is lots of information loss going from Turtle to an RDF graph, not just from syntactic shorthands. Blank node identifiers are lost, for example. In RDF this information is considered irrelevant. I view the information loss using the annotation syntax as similarly irrelevant.

@rat10
Copy link
Contributor Author

rat10 commented Oct 8, 2024

(and @rat10) would that be something you can agree with?

I definitely agree.

Re "red herring": the round-tripping example serves to illustrate the problem - of course, if Turtle-star with syntactic sugar is the interface to RDF-star, then it is not only an illustration of a problem but a real problem.

Re dangling rdfs:states reifications: AFAICT only entailment can provide a sound and complete solution. As that is not available in base RDF we will have to live with some issue. IMO the risk of dangling reifications is the lesser evil compared to what we have now.

@rat10
Copy link
Contributor Author

rat10 commented Oct 15, 2024

@niklasl in the last Semantics TF meeting, discussing rdf:states, mentioned an example on how he would express opposing theories. I provided a counter example there, but would rather keep the discussion here. So let me add the following remarks.

Some argue that adding a property like rdfs:states to explicitly disambiguate between annotations on merely reified and actually stated statements is too fundamental a change to RDF and adding too much specificity to the base mechanism.
However, it is important to realize that there is not really a precedent to this problem: so far RDF data only contained facts, period. In practice graphs could be used to separate confirmed facts from spurious statements, but merging them was to be controlled by out-of-band means. Another approach is to use RDF standard reification to document but not assert statements. While not uncommon this is a hack: it doesn't follow the semantics of reification as specified in RDF and can't provide much guidance either. On those grounds it doesn't seem unreasonable to go for proper disambiguation, requiring a new property, instead of relying on a less involved but also inherently incomplete design as the current one.
Again, following @niklasl 's and @afs 's argument one might require the author to add a respective attribute line rdf:type rdf:Stated, but that brings us back to the need to not only always add such a statement but also - and much worse - to always query for it. Requiring each query to take precautions against "unasserted statements" in contexts where diverging viewpoints are to be expected (and where isn't that the case?) is asking a lot from users.

There is also another aspect to take into account, which rather covers the opposite perspective: in the current design the connection between a statement and "its" annotation is very brittle: the reifier describes the occurrence of a statement of a certain type. It makes no assumption about whether such a statement actually occurs in the graph. And, assuming that such a statement indeed does occur in the graph, there is likewise no guarantee that the annotation refers to it: the statement in the graph represents the type itself, not some occurrence of stating it, and might be the result of another act of stating. There is no guarantee that the reification describes an occurrence that actually happened. The presence of a statement of that type in the graph can never be considered more than incidental. That is how impoverished the semantics of reification is, and anything more definitive lies completely in the eye of the beholder. Consequently, a little bit more definitiveness would definitely be welcome. rdf:states would at least clarify the intent, and - through syntactic mapping and reasoning support in higher levels - put some thrust behind that intention.

IMO the current design is viable only if one doesn't really expect competing viewpoints, not even in the documentation of not endorsed statements - and that to me seems just very much not like the semantic web. The current design is too tedious and/or too un-expressive to be a general solution, and therefore too unreliable to be viable outside of well-controlled environments.
OTOH the addition of rdfs:states, mapped to the Turtle-star annotation syntax, demands very little from non-controversial scenarios where facts are cleanly separated from non-facts, but supports more involved use cases naturally. It also removes some of the irritating wiggle room that reification leaves open even for straightforward use cases. All things considered, to me this seems like a pretty good deal for everybody.

@pfps
Copy link
Contributor

pfps commented Oct 15, 2024

I believe that the following is not correct.

Another approach is to use RDF standard reification to document but not assert statements. While not uncommon this is a hack: it doesn't follow the semantics of reification as specified in RDF and can't provide much guidance either.

From https://www.w3.org/TR/rdf11-mt/#reification

A reification of a triple does not entail the triple, and is not entailed by it. The reification only says that the triple token exists and what it is about, not that it is true, so it does not entail the triple.

@afs
Copy link
Contributor

afs commented Oct 15, 2024

The origin of annotation syntax:
https://lists.w3.org/Archives/Public/public-rdf-star/2020Aug/0041.html

This shorthand form can express simple usage simply:

Example:

:s :p :o {| rdfx:source <URL> |} .

rdfx:source has no opinion about the asserted triple. The reifier predicate object list is still accurate if the asserted triple is later removed. rdfx:source can be used with other vocabularies which may define sub-properties of rdf:reifies or may infer a type to the reifier.

@rat10
Copy link
Contributor Author

rat10 commented Oct 17, 2024

@pfps wrote

I believe that the following is not correct.

Another approach is to use RDF standard reification to document but not assert statements. While not uncommon this is a hack: it doesn't follow the semantics of reification as specified in RDF and can't provide much guidance either.

From https://www.w3.org/TR/rdf11-mt/#reification

A reification of a triple does not entail the triple, and is not entailed by it. The reification only says that the triple token exists and what it is about, not that it is true, so it does not entail the triple.

That paragraph about reification in the RDF 1.1 Semantics document starts with:

The intended meaning of this vocabulary is to allow an RDF graph to act as metadata describing other RDF triples.

The RDF (1.0) Semantics specification from 2004 is even more explicit, saying that:

This particular interpretation of reification was chosen on the basis of use cases where properties such as dates of composition or provenance information have been applied to the reified triple, which are meaningful only when thought of as referring to a particular instance or token of a triple.

That the reification doesn't entail the triple can IMHO be interpreted as merely a side effect, a consequence of entailment not being available at the base level of RDF. What I call a hack is using that restriction to the effect of speaking about statements without asserting them, instead of to describe other RDF triples (where IIUC "triple" refers to a statement asserted in the graph).

@pchampin
Copy link
Contributor

We now no longer know which reifier was associated with the annotation syntax {| |}, and which one with the reification syntax << >>. @pfps (and @rat10) would that be something you can agree with? You are still free not to have a problem with that information loss, of course.

I agree about the statement in the first sentence, and I gladly use my freedom to not have a problem with it.
Consider the following example, which might look very "intuitive":

dbr:Torn_\(Ednaswap_song\) a dbo:Work ;
    eg:recorded_by dbr:Lis_Sørensen ;
    eg:recorded_in "1995"^^xsd:gYear.
    
dbr:Torn_\(Ednaswap_song\)
    eg:recorded_by dbr:Natalie_Imbruglia ;
    eg:recorded_in "1997"^^xsd:gYear.

The grouping of triples does not round-trip, and some people might complain about that, as it mixes the information about who recorded when. This is not a bug in RDF or in Turtle, though, this is bad modelling that people should avoid.

Similarly, people must not rely on the specific kind of bracket they use ({| vs. <<) to convey relevant information, because those brackets are not part of the data model.

PS: people must not either rely on the difference between using square brackets vs. bnode labels (_:b). Nor on the difference between putting predicate-objects inside the square brackets vs. after them ([ :name "Alice" ] :knows ex:bob.). I could go on forever...

@rat10
Copy link
Contributor Author

rat10 commented Oct 17, 2024

The reifier predicate object list is still accurate if the asserted triple is later removed.

It was actually the other direction that I was concerned about in the beginning: an annotation referring to an unasserted statement becomes an annotation referring to a fact if the triple is added to the graph. That may run counter the intention of the initial annotation and may lead to false conclusions.
But on closer inspection I've come to an even more concerning result: both directions are inherently fragile.

Let me give another example:

<< :Alice :buys :Car >>
    :type :Cabriolet ;
    :purpose :FunRiding .
:Alice :buys :Car {|
        :type :Sedan ;
        :purpose :Commuting 
    |} .

Maybe Alice planned but never pulled through with the purchase of a cabriolet. Maybe it was just unconfirmed hearsay. In any case, we can't assume she did, but we know that she did buy a sedan for commuting to work. An author familiar only with Turtle-star but not the more intricate mapping to N-triples-star would be excused to think that s/he modeled this correctly in the Turtle-star above.

Mapping to N-triples-star according to the current design is straightforward (mapping the annotation syntax to rdfs:states would of course be straightforward as well):

:Alice :buys :Car .
_:r1 rdf:reifies <<( :Alice :buys :Car )>> ;
    :type :Cabriolet ;
    :purpose :FunRiding .
_:r2 rdf:reifies <<( :Alice :buys :Car )>> ;     # NOT the proposed rdfs:states
    :type :Sedan ;
    :purpose :Commuting .

We do now have different ways to interpret this (and to map it back to Turtle-star), following different intuitions:

  • a mapping that is based on the intuition that a statement contained in the graph is true, and all annotations on "such a statement" refer to that true statement.

    :Alice :buys :Car {|
            :type :Cabriolet ;
            :purpose :FunRiding |}  
        {|  :type :Sedan ;
            :purpose :Commuting |} .
    

    This loses the intuitive interpretation that Alice never bought a cabriolet.

  • on the other hand an intuition that follows the semantics of reification very closely can come to a different conclusion, namely that both reifiers _:r1 and _:r2 are not guaranteed to refer to a statement that is true in the graph. They may both refer to an unasserted statement, i.e. a hypothetical assertion event that never materialized. That intuition would be captured by the following mapping:

    << :Alice :buys :Car >>
        :type :Cabriolet ;
        :purpose :FunRiding .
    << :Alice :buys :Car >>
        :type :Sedan ;
        :purpose :Commuting .
    :Alice :buys :Car  .
    

    This loses any reference to which car Alice is actually known to have bought.

It should be obvious that

  • both mappings back to Turtle-star reflect very different intuitions
  • both mappings are supported by a seemingly strict reading of RDF semantics
  • both mappings fail to preserve the intuition of the initial Turtle-star document
  • both mappings lose important information, namely which car Alice bought, and which she didn't.
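
To make the information loss concrete, here is a minimal Python sketch. It assumes a toy encoding that is not part of any spec: triples are plain tuples, a triple term is a nested tuple, and `expand` is a hypothetical helper standing in for the Turtle-star to N-triples-star mapping. It also assumes the initial document used `<< >>` for the cabriolet and the annotation syntax for the sedan. Under the current design both reifiers end up linked by the same property; under the proposed `rdfs:states` mapping they stay distinguishable.

```python
REIFIES = "rdf:reifies"
STATES = "rdfs:states"  # proposed in this issue, not part of any spec

TERM = (":Alice", ":buys", ":Car")  # the triple term <<( :Alice :buys :Car )>>


def expand(reifier, prop, annotations, assert_triple):
    """Hypothetical expansion of one Turtle-star construct into tuples."""
    triples = {(reifier, prop, TERM)}
    triples |= {(reifier, p, o) for p, o in annotations}
    if assert_triple:
        triples.add(TERM)  # the annotation syntax also asserts the triple
    return triples


cabrio = [(":type", ":Cabriolet"), (":purpose", ":FunRiding")]
sedan = [(":type", ":Sedan"), (":purpose", ":Commuting")]

# Current design: << >> and {| |} both map to rdf:reifies.
current = expand("_:r1", REIFIES, cabrio, False) | expand("_:r2", REIFIES, sedan, True)

# Proposed design: the annotation syntax maps to rdfs:states instead.
proposed = expand("_:r1", REIFIES, cabrio, False) | expand("_:r2", STATES, sedan, True)


def reifiers_by_property(graph):
    """Group reifiers by the property linking them to the triple term."""
    result = {}
    for s, p, o in graph:
        if o == TERM:
            result.setdefault(p, set()).add(s)
    return result


print(reifiers_by_property(current))
print(reifiers_by_property(proposed))
```

In `current`, both `_:r1` and `_:r2` appear under `rdf:reifies`, so nothing in the graph says which one came from the annotation syntax. In `proposed`, `_:r2` is the only reifier under `rdfs:states`.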

@rat10
Contributor Author

rat10 commented Oct 17, 2024

@pchampin

This is not a bug in RDF or in Turtle, though, this is bad modelling that people should avoid.

So what would you recommend to avoid the issues I illustrate? I made the proposal to map the annotation syntax to rdfs:states. Andy proposed an additional attribution like _:r rdf:type rdf:Stated (whose downside I discussed above). What is your proposal, what would in your opinion constitute good modelling?

@niklasl

niklasl commented Oct 17, 2024

I can't clearly (as in propositional logic) see what you intuit, but I can imagine it being something like:

:Alice :buys :Car
    ~ :AliceCarPlan {|
        a :PurchasePlan ;
        :type :Cabriolet ;
        :purpose :FunRiding
    |}
    ~ :AliceCarBought {|
        a :Purchase ;
        :type :Sedan ;
        :purpose :Commuting
    |} .

And that your reading of this syntax deems it problematic. But, as I've written elsewhere, I don't see this as any more problematic than footnotes in a book not having "written" the text, be it source references or something opposing.

I might also model it differently, but that's beside the point. I do sympathize with at least some of your concerns though, and while I do believe that this is getting close to the philosophical quagmire of justified true beliefs, I have hopes we can address it from a different angle.

If w3c/rdf-semantics#49 is, in some form, part of standard entailment (maybe in RDFS), there is a foundation for defining, using OWL, :Purchase as a "truth-maker" class for simple truths like :Alice :buys :Car. My approach for doing that is described in w3c/rdf-ucr#27.

That would actually entail the triple from the reified triple term of :AliceCarBought, and in the process entailing the more clear:

_:AliceCarBought a :Purchase ;
  :type :Sedan ;
  :purpose :Commuting ;
  :buyer :Alice ;
  :item :Car .

(It might also be prudent to define :PurchasePlan owl:disjointWith :Purchase.)

Now, there is still the possible reading of << :Alice :buys :Car ~ :AliceCarPlan >> as in itself connecting :AliceCarPlan to the triple not being true. (Per the asserted triple, it is true; Alice did in fact buy a Car, just not according to her plans.) I just don't see that this distinction should be defined on the basic level of RDF, and thus not the reading thereof. Here, it is just a reification of that claim, independent of its truth value (simple or not).

What I do see is potential need for adding a note (perhaps in the primer) to make it clear that this is not implied by the syntax.

@pchampin
Contributor

This was discussed during yesterday's WG meeting https://www.w3.org/2024/10/17-rdf-star-minutes.html

@pchampin
Contributor

@niklasl wrote, about @rat10's example about Alice buying a car:

I might also model it differently, but that's beside the point.

I am not sure it is... As I pointed out earlier, some modeling choices are better than others. Insisting that RDF should let you get away with bad modeling is, in my opinion, not a service to RDF or its users.

In my example above about the song "Torn", the modeling mistake is to conflate two distinct recordings of the same song (conflating them with each other, and conflating them with the song itself). Note that I am not claiming that this conflation is inherently bad (there are use cases where it would cause no harm), it is just not fit for purpose given the kind of information that this graph aims to convey.

@rat10's example about Alice buying a car suffers, in my opinion, from the same problem: it uses the same triple to describe plans (or actions) of Alice for buying different cars. A symptom of that problem is the need to attach :type :Cabriolet/:Sedan to the reifier, which is very strange for me: being of type Cabriolet is a property of the car, not of the act (realized or not) of buying it...

All that being said, if I forget about my unease with this modeling, I totally agree with @niklasl's response.

@william-vw

From a different perspective, and starting from a (hopefully) common ground: adding extra triples should not impact meaning of prior ones. In @rat10's examples, I think the following point is being made; using the annotation syntax later on changes the meaning of the first set of annotations (uncertain > certain); and this is being reflected in the roundtripping issue.

Firstly, I think this is a broader issue; asserting the triple later on, whether manually or as an effect of the annotation syntax, should not change the meaning of the first set of annotations. So, not just the roundtripping, but also the annotation syntax may be a red herring here. Personally, as it stands, I don't think the asserted triple changes the meaning of the prior triples; rdf:reifies does not reflect a particular meaning (uncertainty/falseness in this case), but just that the object is a triple term. I think (but I could be wrong) that this is what @rat10 is saying; due to the dichotomy with the annotation syntax, that it reflects uncertainty. I think @niklasl said something similar in his response (font weight added for emphasis):

Now, there is still the possible reading of << :Alice :buys :Car ~ :AliceCarPlan >> as in itself connecting :AliceCarPlan to the triple not being true. (Per the asserted triple, it is true; Alice did in fact buy a Car, just not according to her plans.) I just don't see that this distinction should be defined on the basic level of RDF, and thus not the reading thereof. Here, it is just a reification of that claim, independent of its truth value (simple or not).

The solution is to always clarify your intended meaning using explicit metadata - cfr. @niklasl (and with the same caveat about modeling as @pchampin mentioned):

<< :Alice :buys :Car >>
	a :Planned ; a :Cabriolet ; :purpose :FunRiding .

<< :Alice :buys :Car >>
	a :Purchase ; a :Sedan ; :purpose :Commuting .

:Alice :buys :Car .

Adding the assertion does not change the meaning of the first set of annotations, since we don't assume anything about the rdf:reifies it translates to (e.g., uncertainty).
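
As a rough illustration of that point, here is a minimal Python sketch. It assumes triples encoded as plain tuples and a hypothetical `describe` helper, and `_:b1`/`_:b2` are made-up labels for the two anonymous reifiers; only the `a :Planned` and `a :Purchase` annotations are carried over. What is attached to each reifier is the same before and after the triple is asserted.

```python
REIFIES = "rdf:reifies"
TERM = (":Alice", ":buys", ":Car")

# The two << ... >> blocks above, with made-up blank node labels.
reified_only = {
    ("_:b1", REIFIES, TERM), ("_:b1", "rdf:type", ":Planned"),
    ("_:b2", REIFIES, TERM), ("_:b2", "rdf:type", ":Purchase"),
}

# Same graph plus the asserted triple :Alice :buys :Car .
with_assertion = reified_only | {TERM}


def describe(graph, reifier):
    """All (property, object) pairs attached to a reifier."""
    return {(p, o) for s, p, o in graph if s == reifier}


for r in ("_:b1", "_:b2"):
    print(r, describe(reified_only, r) == describe(with_assertion, r))
```

This only shows that the triples about each reifier are untouched; whether readers attach extra meaning to the presence of the asserted triple is exactly the question under discussion.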

A more elegant solution would indeed be to use a different reification subproperty in the N-triples - such as ex:planned and ex:occurred (or rdf:states :-), so that a single triple would already reflect the intended meaning. One could consider expanding the syntactic sugar to allow for this, but it could make it too complicated.

Possibly related to what souri said during the meeting (based on the notes):

IMO, for keeping RDF simple and concise, the :s :p :o . triple, if present, should not have anything to do with possible presence or absence of :r1 rdf:reifies <<( :s :p :o )>> OR :r2 rdf:reifies <<( :s :p :o )>> (and what I say about :r1 or :r2).

Finally, from @franconi:

I understand the argumentation of tl and I believe this argumentation is sound.

I agree. But I think it could be solved by clarifying what the reification and annotation syntaxes mean, and what they don't mean.

@franconi
Contributor

franconi commented Oct 18, 2024 via email

@pchampin
Contributor

pchampin commented Oct 18, 2024

@william-vw

From a different perspective, and starting from a (hopefully) common ground: adding extra triples should not impact meaning of prior ones.

+1

In @rat10's examples, I think the following point is being made; using the annotation syntax later on changes the meaning of the first set of annotations (uncertain > certain).

Firstly, I think this is a broader issue; asserting the triple later on, whether manually or as an effect of the annotation syntax, should not change the meaning of the first set of annotations.

That's a very good point. Following @rat10's reasoning, the annotation syntax is not even required to trigger the problem. Using a triple :s :p :o between double-pointy brackets, then asserting it, would cause the same round-trip/meaning-changing issue.

The solution is to always clarify your intended meaning using explicit metadata

+1, and to be fair, I think we are all agreeing with that. But we disagree on where/how this metadata should be expressed. Some of us propose that this metadata should take the form of additional triples, while @rat10 proposes that this metadata is carried by

  • the kind of bracket that you use (in Turtle and other concrete syntaxes)
  • the reification property (rdf:reifies vs. rdf:states) (in N-Triples and the abstract syntax)

@rat10's argument against explicit triples is (see above):

I was trying to argue that a tradeoff is to be made, but that it is well justified. I expect the latter case - where one queries for annotations on asserted AND on unasserted triples of some type - to be the exception. The norm should be to search for annotations on asserted statements, because most of the time RDF is used to describe facts.

I share the intuition that people are less likely to query for all arbitrary reifiers of a given triple, and more likely to query for a specific kind of reifiers. However, following this reasoning, I don't think that the binary distinction between rdf:reifies and rdf:states is a sufficient granularity. I would prefer to explicitly type my reifier, and to query on one specific type.

Consider the following variant of the "Alice buying a car" example:

<< :alice :buys :Cabriolet ~ <#r1> >> a :PurchasePlan; :by :alice; :on "2024-09"; :purpose :funRiding.
<< :alice :buys :Cabriolet ~ <#r2> >> a :Belief;  :by :bob ; :on "2024-09".
<< :alice :buys :Cabriolet ~ <#r3> >> a :Doubt;  :by :charlie; :on "2024-09".
:alice :buys :Sedan ~ <#r4> {| a :Purchase; :by :alice; :on "2024-10"; :purpose :commuting |}
                    ~ <#r5> {| a :Doubt; :by :bob; :on "2024-09" |}
                    ~ <#r6> {| a :Belief; :by :charlie; :on "2024-09" |}.
# Alice planned in September to buy a Cabriolet for fun,
# but ended up being pragmatic and bought a Sedan in October.
# Back in September, Bob thought that Alice would buy the Cabriolet,
# and was doubtful about her settling for a Sedan.
# Charlie, on the other hand, thought that Alice would end up making the pragmatic choice.

I expect that people would be more interested in querying all beliefs or all doubts, rather than only segregating un-asserted vs. asserted annotation. In fact, as the example above shows, some types of reifiers are orthogonal to the assertedness of the reified triple (I can talk about people's wrong beliefs or wrong doubts).
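
A minimal Python sketch of such a type-based query, assuming triples encoded as plain tuples, every reifier linked via `rdf:reifies`, and a hypothetical helper `reifiers_of_type`. The data mirrors only the typing triples of the example above.

```python
RDF_TYPE = "rdf:type"
REIFIES = "rdf:reifies"

cabriolet = (":alice", ":buys", ":Cabriolet")
sedan = (":alice", ":buys", ":Sedan")

graph = {
    sedan,  # only the Sedan purchase is asserted
    ("<#r1>", REIFIES, cabriolet), ("<#r1>", RDF_TYPE, ":PurchasePlan"),
    ("<#r2>", REIFIES, cabriolet), ("<#r2>", RDF_TYPE, ":Belief"),
    ("<#r3>", REIFIES, cabriolet), ("<#r3>", RDF_TYPE, ":Doubt"),
    ("<#r4>", REIFIES, sedan), ("<#r4>", RDF_TYPE, ":Purchase"),
    ("<#r5>", REIFIES, sedan), ("<#r5>", RDF_TYPE, ":Doubt"),
    ("<#r6>", REIFIES, sedan), ("<#r6>", RDF_TYPE, ":Belief"),
}


def reifiers_of_type(graph, rtype):
    """Map each reifier of the given type to (triple term, is_asserted)."""
    typed = {s for s, p, o in graph if p == RDF_TYPE and o == rtype}
    return {s: (o, o in graph) for s, p, o in graph if p == REIFIES and s in typed}


doubts = reifiers_of_type(graph, ":Doubt")
for reifier in sorted(doubts):
    term, asserted = doubts[reifier]
    print(reifier, term, "asserted" if asserted else "unasserted")
```

The query for `:Doubt` returns one reifier of an unasserted triple (`<#r3>`) and one of an asserted triple (`<#r5>`), which is the orthogonality referred to above.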

edited: the example was using the same reifier twice. This was unintended and has been fixed.

@franconi
Contributor

franconi commented Oct 18, 2024

Consider the following variant of the "Alice buying a car" example:

<< :alice :buys :Cabriolet ~ <#r1> >> a :PurchasePlan; :by :alice; :on "2024-09"; :purpose :funRiding.
<< :alice :buys :Cabriolet ~ <#r2> >> a :Belief;  :by :bob ; :on "2024-09".
<< :alice :buys :Cabriolet ~ <#r3> >> a :Doubt;  :by :charlie; :on "2024-09".
:alice :buys :Sedan ~ <#r4> {| a :Purchase; :by :alice; :on "2024-10"; :purpose :commuting |}
                    ~ <#r5> {| a :Doubt; :by :bob; :on "2024-09" |}
                    ~ <#r6> {| a :Belief; :by :charlie; :on "2024-09" |}.
# Alice planned in September to buy a Cabriolet for fun,
# but ended up being pragmatic and bought a Sedan in October.
# Back in September, Bob thought that Alice would buy the Cabriolet,
# and was doubtful about her settling for a Sedan.
# Charlie, on the other hand, thought that Alice would end up making the pragmatic choice.

I expect that people would be more interested in querying all beliefs or all doubts, rather than only segregating un-asserted vs. asserted annotation. In fact, as the example above shows, some types of reifiers are orthogonal to the assertedness of the reified triple (I can talk about people's wrong beliefs or wrong doubts).

I fully agree with P-A.

@pfps
Contributor

pfps commented Oct 18, 2024

There is definitely a modelling issue if one wants to have reifications that are somehow supposed to represent truth and other similar ones that do not have this feature. But the solution is to put this information on the reification itself,
as in

  << :Dick :married :Liz >> :location :LasVegas; :year 1975; :truth true .

instead of having several properties that go from a reification to its triple.

If RDF was more powerful, better mechanisms would be possible, but RDF is very weak.

@pfps
Contributor

pfps commented Oct 20, 2024

@pchampin wrote:
I am not sure it is... As I pointed out earlier, some modeling choices are better than others. Insisting that RDF should let you get away with bad modeling is, in my opinion, not a service to RDF or its users.

I don't see how anything about RDF (or indeed any formalism) can prevent bad modelling, ranging from using rdfs:subClassOf for instance-of relationships to conflating multiple things to just stating incorrect facts.

@rat10
Contributor Author

rat10 commented Oct 21, 2024

Sorry for the huge combined comment, but how else am I to respond to all these messages without completely drowning the whole thread in my utterings? I DRY not to repeat myself too often.

@pchampin above

Similarly, people must not rely on the specific kind of bracket they use ({| vs. <<) to convey relevant information, because those brackets are not part of the data model.

This is kind of a circular argument: I am arguing that the information that those brackets seem to convey should be reflected in the data model. That such a meaning is suggested, but not reflected in the data model, is indeed the very problem that this proposal addresses. Then people will be able to rely on the fact that what they see - and mean - is what they get.

@rat10's #128 (comment) about Alice buying a car suffers, in my opinion, from the same problem: it uses the same triple to describe plans (or actions) of Alice for buying different cars.

You're still thinking in terms of types, not occurrences. Some discussions obviously take years to sink in (hinting at your argument from the last WG meeting that this discussion has been dragging on for very long, or too long, already).
But to clarify: occurrences describe different things, although we don't specify by what they differ, e.g. different acts of buying, different acts of referring to the same buying event. One has to deduce from context and attributes what kind of difference is described (that might be a problem with RDF-star, but is irrelevant in the discussion of this proposal). Here, as hopefully is obvious enough, the difference is two different purchases, not two different descriptions of the same purchase.

A symptom of that problem is the need to attach :type :Cabriolet/:Sedan to the reifier, which is very strange for me: being of type Cabriolet is a property of the car, not of the act (realized or not) of buying it...

No, it's not a symptom of that problem. It was my deliberate decision not to overload this example with the issue of annotating individual nodes, and therefore not using any special property to refer to the object. However, I obviously failed as I still did introduce the issue, just not the proper solution. If you find that explanation still unsatisfying, you might just ignore that part of the example and concentrate on the :purpose annotation.

@pfps above

There is definitely a modelling issue if one wants to have reifications that are somehow supposed to represent truth and other similar ones that do not have this feature. But the solution is to put this information on the reification itself, as in

  << :Dick :married :Liz >> :location :LasVegas; :year 1975; :truth true .

instead of having several properties that go from a reification to its triple.

I discussed that above already, since @afs made a similar proposal (_:r a rdf:Stated) early on: it puts an unreasonable burden on authors and consumers to require them to add such a true-making attribute to every annotation so inclined. The task of this WG is to define "statements about statements". Statements in RDF are true. The introduction of "unasserted statements" is pretty strange from the perspective of bare ("weak") RDF. There is no such thing in the specs that I'd be aware of. "Unasserted statements" were established as a practice, re-purposing RDF standard reification. They were introduced as a feature request to the RDF-star CG. They became a by-product of the re-definition of RDF* in the RDF*/star CG. They still feel out of place.
The current RDF-star model however makes them the basis of annotations, and annotating actual statements requires further activity, as per your suggestion, and that of @afs, @niklasl , @pchampin, @franconi, and others. IMO, this turns the world of RDF upside down.
Speaking in terms of weak-ness: if anything, then RDF is too weak to integrate the concept of "unasserted statements" merely by requiring another attribute on an annotation. That doesn't break anything on paper, but it breaks every reasonable expectation of users when presented with the annotation syntax of Turtle-star:
:Alice :buys :Car {| :year 2024 |} .
Who wouldn't expect the annotation :year 2024 to refer to the fact :Alice :buys :Car? To suggest anything else, to even require users to add an extra attribute to that effect is just ... well, very unreasonable.

If RDF was more powerful, better mechanisms would be possible, but RDF is very weak.

You have repeatedly made it clear that you fundamentally oppose the change that this WG is tasked to define. I wonder if that doesn't influence your judgement of this specific proposal here. Also see what RDF 1.0 Semantics (2004) has to say about reification:

Semantic extensions MAY limit the interpretation of these so that a triple of the form aaa rdf:type rdf:Statement . is true in I just when I(aaa) is a token of an RDF triple in some RDF document, and the three properties, when applied to such a denoted triple, have the same values as the respective components of that triple.

IIUC this proposal, while purposefully falling short of a proper semantic extension, points in very much the same direction. I would be open to arguments that this "while ... falling short of a proper..." part is the problem that you'd like to be addressed, but I'm not convinced that you even see the need for such a detailed discussion.

@niklasl above
What you call "more clear" in that comment is typical ER style modeling: it centers around an in itself meaningless identifier of some entity (an event maybe), with all attributes (type, actors, etc) attached. This is not graph modelling, and merely tangential to what this WG is tasked to define. If you prefer ER style modelling, you are free to do so - indeed, a lot of the members of this WG seem to agree (see my mail "statements about statements"). If you need statement annotation only to add orthogonal detail, fine: run with it. However, that's only one aspect of "statements about statements", and not the most promising one. The problems that this proposal addresses stem from the problems and opportunities of graph modelling itself. That's different, and indeed essential to the task of this WG as I interpret it.

Now, there is still the possible reading of << :Alice :buys :Car ~ :AliceCarPlan >> as in itself connecting :AliceCarPlan to the triple not being true. (Per the asserted triple, it is true; Alice did in fact buy a Car, just not according to her plans.) I just don't see that this distinction should be defined on the basic level of RDF, and thus not the reading thereof. Here, it is just a reification of that claim, independent of its truth value (simple or not).

The crucial distinction is that reifiers refer to occurrences, not to the type. While the type of both :Alice :buys :Car reifications is the same, the occurrences are not: in one instance the statement is true (she did buy a sedan), in the other case it isn't (she didn't buy a cabriolet). That distinction is so fundamental that

  • it has to be represented reliably in the data
  • its proper expression can't be delegated to domain ontologies.

Both requirements taken together mean that it has to be represented in the model of RDF.

If w3c/rdf-semantics#49 is, in some form, part of standard entailment (maybe in RDFS), there is a foundation for defining, using OWL, :Purchase as a "truth-maker" class for simple truths like :Alice :buys :Car. My approach for doing that is described in w3c/rdf-ucr#27.

We can't rely on OWL or other upper levels of the RDF stack to solve a problem this fundamental. Apart from that, this seems like a variation of the TEP mechanism: one would need to define an abundance of truth makers, and one would need to define them per statement, right?

@william-vw above

A more elegant solution would indeed be to use a different reification subproperty in the N-triples - such as ex:planned and ex:occurred (or rdf:states :-), so that a single triple would already reflect the intended meaning. One could consider expanding the syntactic sugar to allow for this, but it could make it too complicated.

Too complicated compared to what? Users of Turtle-star annotation syntax will only feel the difference when they query for annotations on asserted and unasserted statements. OTOH, with the current design all users will always feel the difference, when authoring and when querying, because they will always be forced to explicitly mention "assertedness".

Finally, from @franconi:

I understand the argumentation of tl and I believe this argumentation is sound.
I agree. But I think it could be solved by clarifying what the reification and annotation syntaxes mean, and what they don't mean.

What do you mean by clarifying? "Educating users" through examples? Providing extra vocabulary like rdf:Stated to standardize expressivity?

@franconi above

As I said several times, a triple in the graph states a true fact, and it is unique, due to the graph being a set of triples.
It is impossible to associate a triple in the graph with a syntactically equal triple term in the graph, since the triple term may appear several times reified in different ways with different intuitively implied truth values.

The statement in a graph stands in for all its occurrences: it represents the fact, and the fact exists only once, no matter how often it is uttered. The triple term stands in for the mere possibility of such a statement being made, but it doesn't state it, and even a reifier, defined via rdf:reifies, doesn't express that the statement has actually been stated. So there are limitations in both realms. However, it is not impossible to refer to the triple term in different ways depending on whether it refers to a statement actually made, or not. That is a crucial difference, and it can be expressed via the proposed approach, mapping annotation syntax to rdfs:states in the model.

This can be clarified, as I said several times, by saying that reification is a very general (many-to-many) association mechanism between a triple term and a resource.
This mechanism can be used for many very different purposes: generic annotation, provenance, event or state or resource association, n-ary relations, beliefs, modalities, temporal annotations, etc.

I don't think this is capturing the essential aspect. Sure (and you know about that much better than me) reification is a term with many applications. But the issue here is if an annotation refers to "something" that is considered true, or not. It doesn't matter if that something is an n-ary relation, a generic annotation, a temporal annotation, etc. Even for beliefs one might have some that one considers true, but that is just a secondary aspect. You're mixing categories that probably shouldn't be mixed in this context.

We should have a best practices Section explaining/suggesting how to encode such different use cases.

IMO that will not be sufficient (or it will be sufficiently disheartening to drive users away ;-)

@pchampin above

Following @rat10's reasoning, the annotation syntax is not even required to trigger the problem. Using a triple :s :p :o between double-pointy brackets, then asserting it, would cause the same round-trip/meaning-changing issue.

Not if rdf:reifies is properly defined. The initial issue description above makes an attempt at a formal description of both rdf:reifies and rdfs:states, juxtaposing them appropriately to match the intuition conveyed by the two syntaxes in Turtle-star. Maybe the notion that rdf:reifies doesn't assert should be stated a bit more strongly. I haven't looked at those definitions for quite some time.

However, following this reasoning, I don't think that the binary distinction between rdf:reifies and rdf:states is a sufficient granularity. I would prefer to explicitly type my reifier, and to query on one specific type. [...] :PurchasePlan, :Belief, :Doubt

You are free to do so :) I think we agree that such granularity doesn't belong in RDF itself, but it's easy to add as extra attributions via domain ontologies. One may also encode one's whole data as mere reifications, no basic RDF statements (i.e. "facts") whatsoever, and query over those reifications filtered by attributes given in annotations. However, that is very much not the way data is usually encoded and shared on the semantic web. It is good to make this possible, but it can't be the default arrangement.

I expect that people would be more interested in querying all beliefs or all doubts, rather than only segregating un-asserted vs. asserted annotation.

(and @franconi expressed agreement)
I very much doubt that. I do indeed think that most people will be interested in neither, but will just want to annotate facts with further detail. Because until now RDF doesn't define anything other than facts - everything not encoded as fact is simply unknown (hackish use of RDF standard reification notwithstanding). The current design doesn't support that without extra statements, extra query parameters, extra "education". That is just not a sustainable design. This preference for unasserted statements in the model is also not covered by the charter.

In fact, as the example above shows, some types of reifiers are orthogonal to the assertedness of the reified triple (I can talk about people's wrong beliefs or wrong doubts).

That probably is an argument rather in favor of this proposal and against its conflation with more fine grained attributions, or isn't it?

@afs
Andy, from time to time you mention in WG meetings that there are still questions open from you that I haven't answered, but unfortunately you don't provide any detail, even when I explicitly ask you. So, again: I have no idea what questions from you are still unanswered. In this long running discussion I may well have forgotten to answer some, but as is hopefully obvious, I'm not in principle reluctant to do so. So please take the time to provide those questions here or on the mailing list, again. Otherwise I have no chance to resolve them.

@pchampin
Contributor

For everyone's sake, I'll try to keep this brief, and focus on what I think is the most important:

@rat10's #128 (comment) about Alice buying a car suffers, in my opinion, from the same problem: it uses the same triple to describe plans (or actions) of Alice for buying different cars.

You're still thinking in terms of types, not occurrences.

I assure you that I'm not, although I see what makes you think I am: my example about the song "Torn" suffers indeed from two problems:

  • assigning meaning to irrelevant syntactic features (here, the grouping of triples),
  • a type-token conflation (same IRI for two different recordings).

My analogy was not perfect, because obviously, your example does not suffer from the latter. When I wrote "it uses the same triple to describe plans (or actions) of Alice for buying different cars.", I should have written "it uses reifiers of the same triple...", this would have been more accurate. But that is indeed what I meant, and I still consider it to be a problem.

Would you consider modelling all of Liz Taylor's 8 marriages as 8 reifiers of the same triple :liz :married :husband? And then attach the IRI of each husband to one of the reifiers (or two, for Richard Burton, as we know)? I don't remember anyone proposing to model it like that, and I think there are good reasons for that.

@pchampin above

Similarly, people must not rely on the specific kind of bracket they use ({| vs. <<) to convey relevant information, because those brackets are not part of the data model.

This is kind of a circular argument: I am arguing that the information that those brackets seem to convey should be reflected in the data model.

I have well understood that this is your argument, and I still disagree with it. My disagreement is rooted in how Turtle 1.1 currently works (see my previous comments) so I don't think that it is a circular argument.

@rat10
Contributor Author

rat10 commented Oct 23, 2024

@pchampin I'm also trying to be concise.

[...] Would you consider modelling all of Liz Taylor's 8 marriages as 8 reifiers of the same triple :liz :married :husband? And then attach the IRI of each husband to one of the reifiers (or two, for Richard Burton, as we know)?

If the data focuses on marriages of popular persons then indeed yes I would consider this modelling, because it would make it very easy to find all marriages in which the well known Liz Taylor is involved, and provide easy access to further detail. That is precisely what I find attractive about statement annotation.

[...] I have well understood that this is your argument, and I still disagree with it. My disagreement is rooted in how Turtle 1.1 currently works (see my previous comments) so I don't think that it is a circular argument.

Sorry, but I don't understand that argument. What in Turtle 1.1 supports a stance to not map a specific syntactic element to a specific element in the model?
On that occasion: in Thursday's straw poll (at the very end of the meeting minutes) Andy and you added "separate syntax and data model". This strikes me as a very un-intuitive goal, actually quite contrary to what I would expect as an argument for some design. Can you elaborate why you think this is a good thing to have?

@pchampin
Contributor

Sorry, but I don't understand that argument. What in Turtle 1.1 supports a stance to not map a specific syntactic element to a specific element in the model?

My point is that, by design, any concrete syntax has features that are irrelevant in the underlying data model. Even N-Triples introduces an order in triples, which is not relevant in the data model. Arguing that something should be reflected in the data model, just because it seems to convey relevant information for the uninformed reader, is a weak argument in my opinion.

@gkellogg removed the discuss-f2f label (Proposed for discussion during the next face-to-face meeting) on Oct 24, 2024
@rat10
Contributor Author

rat10 commented Jan 10, 2025

Sorry, but I don't understand that argument. What in Turtle 1.1 supports a stance to not map a specific syntactic element to a specific element in the model?

My point is that, by design, any concrete syntax has features that are irrelevant in the underlying data model. Even N-Triples introduces an order in triples, which is not relevant in the data model. Arguing that something should be reflected in the data model, just because it seems to convey relevant information for the uninformed reader, is a weak argument in my opinion.

That really depends on the concrete case. You are right that it is not an argument that always has weight, but you seem to conclude that therefore it never has weight ;-)

@rat10
Contributor Author

rat10 commented Jan 10, 2025

That said, let me add some last words, one final attempt: I still think that the annotation syntax should be mapped to rdfs:states, which itself should be defined as a subproperty of rdf:reifies. It is a relatively simple thing to do, and it is the right thing to do. IMO it is a bad idea to introduce such evocative syntactic sugar as the annotation syntax and then not go the full distance to meet the expectations it evokes.
The arguments for the current constrained mapping require a solid understanding of the semantics of RDF: how statements are types, how a reifier can never be understood as the triple itself, how a reifier and a triple of that type can only ever have a platonic relationship at most. That solid understanding, however, cannot be assumed to be common among users of RDF. The casual user will assume that she gets what she sees, but that is not the case. With the current design we are setting up a trap for users, and quite unnecessarily so.
Also, we are missing the mark w.r.t. interoperability with Labeled Property Graphs if we don’t go this extra step. LPGs don’t know about abstract statements that are not in the graph. A mapping from LPG to RDF would be well served by a firmer connection between a statement (occurrence) and "its" annotation.
Expressivity in general would be improved if it were possible to express whether an annotation refers to a triple in the graph or not - not in the either/or fashion we have now, but with both perspectives being expressible side by side in the same graph.
Last but not least, the current design is inherently brittle w.r.t. the connection between statement and annotation. In fact, it provides plausible deniability against any claim that an annotation annotates a triple in the graph. Quite to the contrary: to be safe, one should always assume that an annotation refers merely to the proposition, not to any triple of that kind in a graph. I understand that, given the set semantics of RDF, this gap can never be closed completely, but it could be quite solidly bridged by rdfs:states.
The current design works well enough for administrative use cases, but IMO that is not good enough. I reject the interpretation that this is a prudent design. IMO it is insufficient and dangerously misleading.
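
A rough sketch of the distinction argued for above, using plain Python tuples instead of an RDF library. Everything here is an assumption for illustration: rdfs:states is only the property proposed in this issue and not part of any specification, the helper names are hypothetical, triple terms are modelled as nested tuples, and the proposed subproperty relation to rdf:reifies is not modelled at all.

```python
REIFIES = "rdf:reifies"
STATES = "rdfs:states"  # proposed in this issue; not defined in any spec

# Toy graph following the mapping from the issue description, plus one
# reifier (_:t) that "states" a triple which is not in the graph.
graph = {
    (":s", ":p", ":o"),                       # asserted triple
    ("_:r", REIFIES, (":s", ":p", ":o")),     # annotation on the proposition
    ("_:r", ":a", ":b"),
    ("_:s", STATES, (":s", ":p", ":o")),      # annotation on the asserted triple
    ("_:s", ":c", ":d"),
    ("_:t", STATES, (":x", ":y", ":z")),      # "dangling": :x :y :z is not asserted
}


def annotates_asserted(g, reifier):
    """True if `reifier` uses rdfs:states on a triple that is present in `g`."""
    return any(s == reifier and p == STATES and o in g for (s, p, o) in g)


def dangling_states(g):
    """Reifiers that use rdfs:states on a triple that is missing from `g`."""
    return {s for (s, p, o) in g if p == STATES and o not in g}
```

Under these assumptions, `_:s` is recognisably tied to a triple in the graph, `_:r` is not, and `_:t` is the kind of dangling stated reification that the issue description warns about.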
