Improve usage of the VoID vocabulary #118

Open
MichaelRoeder opened this issue Sep 12, 2019 · 0 comments

Description

The VoID vocabulary describes datasets, and the crawler should make use of this information. For example, the following triples contain the URI http://dbpedia.org/sparql and also state that this URI should be crawled as a SPARQL endpoint:

:DBpedia a void:Dataset;
    void:sparqlEndpoint <http://dbpedia.org/sparql> .
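
To illustrate what the crawler would need to recognize, here is a minimal Python sketch that picks out SPARQL endpoints from a list of triples. It is only an illustration, not the crawler's code: the triples are plain tuples, `find_sparql_endpoints` is a hypothetical helper, and `http://example.org/DBpedia` stands in for `:DBpedia`, whose namespace is not given above.

```python
VOID = "http://rdfs.org/ns/void#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# The example triples as (subject, predicate, object) tuples.
# The subject URI is an assumption; the issue only shows the prefixed name :DBpedia.
triples = [
    ("http://example.org/DBpedia", RDF_TYPE, VOID + "Dataset"),
    ("http://example.org/DBpedia", VOID + "sparqlEndpoint", "http://dbpedia.org/sparql"),
]


def find_sparql_endpoints(triples):
    """Return all URIs that appear as objects of void:sparqlEndpoint."""
    return {obj for (_subj, pred, obj) in triples if pred == VOID + "sparqlEndpoint"}


endpoints = find_sparql_endpoints(triples)
print(endpoints)
```

A URI found this way would be queued with the type "SPARQL endpoint" instead of being treated as an ordinary dereferenceable URI.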

Solution

  • At the moment, the RDF processing is fairly simple and stores every newly retrieved URI without interpreting it. We would have to enhance the processing, e.g., by using the decorator pattern with decorator classes that handle special cases like this one. However, depending on the number of special cases we will have in the future, the decorator pattern might become too heavy.
  • Additionally, storing newly found URIs might be an issue. In the example above, it would not be sufficient to add the type information, since the URI could already have been found before and the store might reject the update. The same holds for the Frontier's queue, which might refuse to append the new URI because it is already known. In this case, update strategies might be necessary.
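
Both points can be sketched together in a few lines of Python. This is a rough illustration under stated assumptions, not a proposal for the actual classes: `UriStore`, `BasicCollector`, `SparqlEndpointDecorator`, and the type labels `"dereferenceable"` and `"sparql"` are all hypothetical names. The decorator wraps the simple "store everything" behaviour, and `upsert` is one possible update strategy for a URI that is already known.

```python
VOID_SPARQL_ENDPOINT = "http://rdfs.org/ns/void#sparqlEndpoint"


class UriStore:
    """Hypothetical store mapping each known URI to its crawl type."""

    def __init__(self):
        self.types = {}

    def add(self, uri, uri_type="dereferenceable"):
        # Insert-only behaviour: a URI that is already known is rejected.
        if uri in self.types:
            return False
        self.types[uri] = uri_type
        return True

    def upsert(self, uri, uri_type):
        # Update strategy: overwrite the stored type, whether or not the URI is known.
        changed = self.types.get(uri) != uri_type
        self.types[uri] = uri_type
        return changed


class BasicCollector:
    """Stores every URI found in a triple without interpreting it."""

    def __init__(self, store):
        self.store = store

    def handle(self, triple):
        for term in triple:
            if term.startswith("http"):
                self.store.add(term)


class SparqlEndpointDecorator:
    """Wraps another collector and adds special handling for void:sparqlEndpoint."""

    def __init__(self, inner, store):
        self.inner = inner
        self.store = store

    def handle(self, triple):
        self.inner.handle(triple)  # keep the existing behaviour
        _subject, predicate, obj = triple
        if predicate == VOID_SPARQL_ENDPOINT:
            self.store.upsert(obj, "sparql")


store = UriStore()
store.add("http://dbpedia.org/sparql")  # found earlier as an ordinary URI
collector = SparqlEndpointDecorator(BasicCollector(store), store)
collector.handle(
    ("http://example.org/DBpedia", VOID_SPARQL_ENDPOINT, "http://dbpedia.org/sparql")
)
print(store.types["http://dbpedia.org/sparql"])
```

With an insert-only `add`, the type information would be lost because the URI was already stored; `upsert` keeps it. Each further special case would need its own wrapper, which is where the pattern could become heavy.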