Squirrel 0.2
Overall:
This release includes several performance improvements and new features.
First, the project has been split into several modules. As a result, far fewer classes are loaded when some parts of Squirrel are not started.
The modules are:
- squirrel.api :
- Contains the core classes of Squirrel
- squirrel.deduplication :
- The deduplication component computes {@link org.dice_research.squirrel.deduplication.hashing.HashValue}s for the triples found at new URIs, stores those hash values in the {@link KnownUriFilter}, and compares the newly computed hash values with the hash values of all old triples. In this way, duplicate data can be found and eliminated.
- squirrel.frontier :
- The frontier-related classes
- squirrel.worker :
- The worker-related classes
- squirrel.web :
- The web front end
- squirrel.web-api :
- Functionality specific to the front end.
- SquirrelWebService :
- The web service that handles communication between the front end and the frontier.
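The hashing idea behind the squirrel.deduplication module can be illustrated with a minimal sketch. This is not Squirrel's actual implementation: the class TripleHashDeduplicator, its methods, the use of SHA-256, and the in-memory map standing in for the hash storage of the KnownUriFilter are all assumptions made for illustration. Triples are represented as plain strings.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration; not part of the Squirrel code base.
class TripleHashDeduplicator {

    // Assumed stand-in for the hash values kept by the KnownUriFilter:
    // maps a content hash to the first URI that produced it.
    private final Map<String, String> uriByHash = new HashMap<>();

    // Computes one hash over all triples of a URI. The triples are sorted
    // first so that the result does not depend on their order.
    static String hashTriples(Collection<String> triples) {
        List<String> sorted = new ArrayList<>(triples);
        Collections.sort(sorted);
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            for (String triple : sorted) {
                digest.update(triple.getBytes(StandardCharsets.UTF_8));
                digest.update((byte) '\n'); // separator between triples
            }
            return String.format("%064x", new BigInteger(1, digest.digest()));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    // Stores the hash of the new URI's triples and compares it with the
    // hashes seen so far. Returns the earlier URI with identical triples,
    // or an empty Optional if the data is new.
    Optional<String> register(String uri, Collection<String> triples) {
        String hash = hashTriples(triples);
        String earlierUri = uriByHash.putIfAbsent(hash, uri);
        return Optional.ofNullable(earlierUri);
    }

    public static void main(String[] args) {
        TripleHashDeduplicator deduplicator = new TripleHashDeduplicator();
        List<String> first = List.of("<s> <p> <o1> .", "<s> <p> <o2> .");
        List<String> sameDataOtherOrder = List.of("<s> <p> <o2> .", "<s> <p> <o1> .");

        System.out.println(deduplicator.register("http://example.org/a", first));
        System.out.println(deduplicator.register("http://example.org/b", sameDataOtherOrder));
    }
}
```

In this sketch, a non-empty result from register means the new URI carries the same triples as an already crawled one, so its data could be dropped instead of being stored again.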
New Components:
Fetcher:
SparqlBasedFetcher: located under squirrel.worker, it allows you to fetch URIs from a SPARQL endpoint. Please check docker-compose-sparql.yml for the required environment variables.
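To give a rough idea of what querying a SPARQL endpoint involves, the sketch below builds a SPARQL Protocol query request with the JDK HTTP client only. It does not show how SparqlBasedFetcher is implemented: the class SparqlSelectRequest, the example endpoint URL, and the example query are assumptions, and the sketch expects an endpoint URL without an existing query string.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not part of the Squirrel code base.
class SparqlSelectRequest {

    // Builds a GET request that passes the query in the "query" parameter
    // and asks for the results in the SPARQL JSON results format.
    static HttpRequest build(String endpoint, String sparqlQuery) {
        String encoded = URLEncoder.encode(sparqlQuery, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(endpoint + "?query=" + encoded))
                .header("Accept", "application/sparql-results+json")
                .GET()
                .build();
    }

    public static void main(String[] args) throws Exception {
        // The endpoint is taken from the command line; nothing is sent otherwise.
        if (args.length == 0) {
            System.out.println("usage: SparqlSelectRequest <endpoint-url>");
            return;
        }
        HttpRequest request = build(args[0], "SELECT DISTINCT ?s WHERE { ?s ?p ?o } LIMIT 10");
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // The body holds the raw JSON result; parsing it is left out here.
        System.out.println(response.body());
    }
}
```

A real fetcher would additionally parse the result bindings and hand the resulting URIs or triples on to the crawler; that part is omitted.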
Sink:
SparqlBasedSink: located under squirrel.worker, it allows you to use a SPARQL endpoint as a sink. Please check docker-compose-sparql.yml for the required environment variables.
Build Notes:
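To build, run mvn clean install and then run the Makefile.
To run Squirrel: docker-compose -f <docker-compose-file> up, where <docker-compose-file> is the compose file to use (for example docker-compose-sparql.yml).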