diff --git a/README.md b/README.md index eec1ffc..991c17a 100644 --- a/README.md +++ b/README.md @@ -22,7 +22,7 @@ Use GraMi (**Gra**kn**Mi**grator) to take care of your data migration for you. G - supports any tabular data file with your separator of choice (i.e.: csv, tsv, whatever-sv...) - supports gzipped files - ignores unnecessary columns - - [Entity](https://github.com/bayer-science-for-a-better-life/grami#migrating-entities), [Relation](https://github.com/bayer-science-for-a-better-life/grami#migrating-relations), and [Relation-with-Relation](https://github.com/bayer-science-for-a-better-life/grami#migrating-relation-with-relations) Migration: + - [Entity](https://github.com/bayer-science-for-a-better-life/grami/wiki/Migrating-Entities), [Relation](https://github.com/bayer-science-for-a-better-life/grami/wiki/Migrating-Relations), and [Nested Relations](https://github.com/bayer-science-for-a-better-life/grami/wiki/Migrating-Nested-Relations) Migration: - migrate required/optional attributes of any grakn type (string, boolean, long, double, datetime) - migrate required/optional role players (entity & relations) - migrate list-like attribute columns as n attributes (recommended procedure until attribute lists are fully supported by Grakn) @@ -33,32 +33,29 @@ Use GraMi (**Gra**kn**Mi**grator) to take care of your data migration for you. G - parallelized asynchronous writes to Grakn to make the most of your hardware configuration - Stop/Restart: - tracking of your migration status to stop/restart, or restart after failure - - [Schema Updating](https://github.com/bayer-science-for-a-better-life/grami#schema-updating) for non-breaking changes (i.e. add to your schema or modify concepts that do not yet contain any data) - - [Appending Attributes](https://github.com/bayer-science-for-a-better-life/grami#attribute-appending) to existing entities/relations - - [Basic Column Preprocessing using RegEx's](https://github.com/bayer-science-for-a-better-life/grami#preprocessors) + - [Schema Updating](https://github.com/bayer-science-for-a-better-life/grami/wiki/Schema-Updating) for non-breaking changes (i.e. 
add to your schema or modify concepts that do not yet contain any data) + - [Appending Attributes](https://github.com/bayer-science-for-a-better-life/grami/wiki/Append-Attributes) to existing things + - [Basic Column Preprocessing using RegEx's](https://github.com/bayer-science-for-a-better-life/grami/wiki/Preprocessing) -After [creating your processor configuration](https://github.com/bayer-science-for-a-better-life/grami/tree/master/src/test/resources/phone-calls/processorConfig.json) and [data configuration](https://github.com/bayer-science-for-a-better-life/grami/tree/master/src/test/resources/phone-calls/dataConfig.json), you can use GraMi - - as a [Command Line Application](https://github.com/bayer-science-for-a-better-life/grami#using-grami-as-a-command-line-application) - no coding - configuration required - - in [your own Java project](https://github.com/bayer-science-for-a-better-life/grami#using-grami-in-your-java-application) - easy API - configuration required +After creating your processor configuration ([example](https://github.com/bayer-science-for-a-better-life/grami/tree/master/src/test/resources/phone-calls/processorConfig.json)) and data configuration ([example](https://github.com/bayer-science-for-a-better-life/grami/tree/master/src/test/resources/phone-calls/dataConfig.json)), you can use GraMi + - as an [executable CLI](https://github.com/bayer-science-for-a-better-life/grami/wiki/Grami-as-Executable-CLI) - no coding - configuration required + - in [your own Java project](https://github.com/bayer-science-for-a-better-life/grami/wiki/GraMi-as-Dependency) - easy API - configuration required Please note that the recommended way of developing your schema is still to use your favorite code editor/IDE in combination with the grakn console. ## How it works: -To illustrate how to use GraMi, we will use a slightly extended version of the "phone-calls" example [dataset](https://github.com/bayer-science-for-a-better-life/grami/tree/master/src/test/resources/phone-calls) and [schema](https://github.com/bayer-science-for-a-better-life/grami/tree/master/src/test/resources/phone-calls/schema.gql) from Grakn: +To illustrate how to use GraMi, we will use a slightly extended version of the "phone-calls" example [dataset](https://github.com/bayer-science-for-a-better-life/grami/tree/master/src/test/resources/phone-calls) and [schema](https://github.com/bayer-science-for-a-better-life/grami/tree/master/src/test/resources/phone-calls/schema.gql) from the grakn developer documentation: ### Processor Configuration -The processor configuration file describes how you want data to be migrated according to your schema. There are two difference processor types - one for entities and one for relations. +The processor configuration file describes how you want data to be migrated given the constraints of your schema. There are different processor types. 
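To make this concrete before following the links below: each processor entry names a schema type plus a set of attribute (and, for relations, player) generators, and the generator classes later in this diff turn one data row into a Graql pattern built from exactly those pieces. The following sketch is illustrative only; the variable name and literal values come from the phone-calls example, and it assumes the graql-lang 2.0 builder API that the generator code in this diff uses.

```Java
import graql.lang.Graql;
import graql.lang.pattern.variable.ThingVariable.Thing;

public class ProcessorSketch {
    public static void main(String[] args) {
        // A "person" entity processor with a required "phone-number" attribute and an
        // optional "first-name" attribute effectively produces insert patterns like this:
        Thing person = Graql.var("e-0")
                .isa("person")                          // the processor's "schemaType"
                .has("phone-number", "+7 171 898 0853") // required attribute generator
                .has("first-name", "Melli");            // optional attribute generator
        // Prints a pattern along the lines of:
        // $e-0 isa person, has phone-number "+7 171 898 0853", has first-name "Melli"
        System.out.println(person);
    }
}
```

Rows that are missing a required attribute are skipped rather than inserted, which is why at least one attribute per entity processor should be marked as required.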
-To get started, define the "processors" list in your processor configuration file: +Depending on what you would like to migrate, see here: -```JSON -{ - "processors": [ - ] -} -``` + - [Entity Processor Example](https://github.com/bayer-science-for-a-better-life/grami/wiki/Migrating-Entities#processor-config) + - [Relation Processor Example](https://github.com/bayer-science-for-a-better-life/grami/wiki/Migrating-Relations#processor-config) + - [Nested Relation Processor Example](https://github.com/bayer-science-for-a-better-life/grami/wiki/Migrating-Nested-Relations#processor-config) ### Data Configuration @@ -66,635 +63,33 @@ The data configuration file maps each data file and its columns to your processo A good point to start the performance optimization is to set the number of threads equal to the number of cores on your machine and the batchsize to 500 * threads (i.e.: 4 threads => batchsize = 2000). -### Migrating Entities - -For each entity in your schema, define a processor object that specifies for each entity attribute - - its concept name - - its value type - - whether it is required - -Please note that for each entity, at least one attribute should be required to avoid inserting empty entites into grakn. All attributes declared as keys need also be required or you will get many error messages in your logs. - -We will use the "person" entity from the phone-calls example to illustrate: - - ```GraphQL -person sub entity, - plays customer, - plays caller, - plays callee, - has first-name, - has last-name, - has phone-number, - has city, - has age, - has nick-name; - ``` - -Add the following processor object in your processor configuration file: - -``` -{ - "processor": "person", // the ID of your processor - "processorType": "entity", // creates an entity - "schemaType": "person", // of type person - "conceptGenerators": { - "attributes": { // with the following attributes according to schema - "first-name": { // ID of attribute generator - "attributeType": "first-name", // creates "first-name" attributes - "valueType": "string", // of value type string (other possibilities: long, double, boolean, or datetime) - "required": false // which is not required for each data record - }, - < lines omitted > - "phone-number": { // ID of attribute generator - "attributeType": "phone-number", // creates "phone-number" attributes - "valueType": "string", // of value type string - "required": true // which is required for each data record - }, - < lines omitted > - "nick-name": { // ID of attribute generator - "attributeType": "nick-name", // creates "phone-number" attributes - "valueType": "string", // of value type string - "required": false // which is required for each data record - } - } - } -} -``` - -GraMi will ensure that all values in your data files adhere to the value type specified or try to cast them. GraMi will also ensure that no data records enter grakn that are incomplete (missing required attributes). - -Next, you need to add a data configuration entry into your data configuration file. For example, for the [person](https://github.com/bayer-science-for-a-better-life/grami/tree/master/src/test/resources/phone-calls/person.csv) data file, which looks like this: - -```CSV -first_name,last_name,phone_number,city,age,nick_name -Melli,Winchcum,+7 171 898 0853,London,55, -Celinda,Bonick,+370 351 224 5176,London,52, -Chryste,Lilywhite,+81 308 988 7153,London,66, -... 
-``` - -The corresponding data config entry would be: - -``` -"person": { - "dataPath": "path/to/person.csv", // the absolute path to your data file - "separator": ",", // the separation character used in your data file (alternatives: "\t", ";", etc...) - "processor": "person", // processor from processor config file - "batchSize": 2000, // batchSize to be used for this data file - "threads": 4, // # of threads to be used for this data file - "attributes": [ // attribute columns present in the data file - { - "columnName": "first_name", // column name in data file - "generator": "first-name" // attribute generator in processor person to be used for the column - }, - < lines omitted > - { - "columnName": "phone_number", // column name in data file - "generator": "phone-number" // attribute generator in processor person to be used for the column - }, - < lines omitted > - { - "columnName": "nick_name", // column name in data file - "generator": "nick-name", // attribute generator in processor person to be used for the column - "listSeparator": ";" // separator within column separating a list of values per data record - } - ] -} -``` - -### Migrating Relations - -For each relation in your schema, define a processor object that specifies - - each relation attribute, its value type, and whether it is required - - each relation player of type entity, its role, identifying attribute in the data file and value type, as well as whether the player is required +See Example here: -We will use the call relation from the phone-calls example to illustrate. Given the schema: + - [Entity Data Config Example](https://github.com/bayer-science-for-a-better-life/grami/wiki/Migrating-Entities#data-config) + - [Relation Data Config Example](https://github.com/bayer-science-for-a-better-life/grami/wiki/Migrating-Relations#data-config) + - [Nested Relation - Match by Attribute(s) - Data Config Example](https://github.com/bayer-science-for-a-better-life/grami/wiki/Migrating-Nested-Relations#data-config---attribute-matching) + - [Nested Relation - Match by Player(s) - Data Config Example](https://github.com/bayer-science-for-a-better-life/grami/wiki/Migrating-Nested-Relations#data-config---player-matching) - ```GraphQL -call sub relation, - relates caller, - relates callee, - has started-at, - has duration, - plays past-call; - ``` +### Migrate Data -Add the following processor object: +Once your configuration files are complete, you can use GraMi in one of two ways: -``` -{ - "processor": "call", // the ID of your processor - "processorType": "relation", // creates a relation - "schemaType": "call", // of type call - "conceptGenerators": { - "players": { // with the following players according to schema - "caller": { // ID of player generator - "playerType": "person", // matches entity of type person - "uniquePlayerId": "phone-number", // using attribute phone-number as unique identifier for type person - "idValueType": "string", // of value type string - "roleType": "caller", // inserts person as player the role caller - "required": true // which is a required role for each data record - }, - "callee": { // ID of player generator - "playerType": "person", // matches entity of type person - "uniquePlayerId": "phone-number", // using attribute phone-number as unique identifier for type person - "idValueType": "string", // of value type string - "roleType": "calle", // inserts person as player the role callee - "required": true // which is a required role for each data record - } - }, - "attributes": { // with the 
following attributes according to schema - "started-at": { // ID of attribute generator - "attributeType": "started-at", // creates "started-at" attributes - "valueType": "datetime", // of value type datetime - "required": true // which is required for each data record - }, - "duration": { // ID of attribute generator - "attributeType": "duration", // creates "duration" attributes - "valueType": "long", // of value type long - "required": true // which is required for each data record - } - } - } -} -``` - -Just as in the case for entities, GraMi will ensure that all values in your data files adhere to the value type specified or try to cast them. GraMi will also ensure that no data records enter grakn that are incomplete (missing required attributes/players). - -We then create a mapping of the data file [call.csv](https://github.com/bayer-science-for-a-better-life/grami/tree/master/src/test/resources/phone-calls/call.csv) to the processor configuration: - -Here is an excerpt of the data file: - -```CSV -caller_id,callee_id,started_at,duration -+54 398 559 0423,+48 195 624 2025,2018-09-16T22:24:19,122 -+263 498 495 0617,+48 195 624 2025,2018-09-18T01:34:48,514 -+81 308 988 7153,+33 614 339 0298,2018-09-21T20:21:17,120 -+263 498 495 0617,+33 614 339 0298,2018-09-17T22:10:34,144 -+54 398 559 0423,+7 552 196 4096,2018-09-25T20:24:59,556 -+81 308 988 7153,+351 515 605 7915,2018-09-23T22:23:25,336 -... -``` - -The data config entry would be: - -``` -"calls": { - "dataPath": "path/to/call.csv", // the absolute path to your data file - "separator": ",", // the separation character used in your data file (alternatives: "\t", ";", etc...) - "processor": "call", // processor from processor config file - "batchSize": 100, // batchSize to be used for this data file - "threads": 4, // # of threads to be used for this data file - "players": [ // player columns present in the data file - { - "columnName": "caller_id", // column name in data file - "generator": "caller" // player generator in processor call to be used for the column - }, - { - "columnName": "callee_id", // column name in data file - "generator": "callee" // player generator in processor call to be used for the column - } - ], - "attributes": [ // attribute columns present in the data file - { - "columnName": "started_at", // column name in data file - "generator": "started-at" // attribute generator in processor call to be used for the column - }, - { - "columnName": "duration", // column name in data file - "generator" : "duration" // attribute generator in processor call to be used for the column - } - ] -} -``` - -Let's not forget about a great design pattern in Grakn (adding multiple players of the same type to a single relation). To achieve this, you can also add a listSeparator for players that are in a list in a column: - -Your data might look like: - -``` -company_name,person_id -Unity,+62 999 888 7777###+62 999 888 7778 -``` - -``` -"contract": { - "dataPath": "src/test/resources/phone-calls/contract.csv", - "separator": ",", - "processor": "contract", - "players": [ - { - "columnName": "company_name", - "generator": "provider" - }, - { - "columnName": "person_id", - "generator": "customer", - "listSeparator": "###" // like this! - } - ], - "batchSize": 100, - "threads": 4 - } -``` - -### Migrating Relation-with-Relations - -Grakn comes with the powerful feature of using relations as players in other relations. 
Just remember that a relation-with-relation/s must be added AFTER the relations that will act as players in the relation have been migrated. GraMi will migrate all relation-with-relations after having migrated entities and relations - but keep this in mind as you are building your graph - relations are only inserted as expected when all its players are already present. - -There are two ways to add relations into other relations: - -1. Either by an identifying attribute (similar to adding an entity as described above) -2. By providing players that are used to match the relation that will be added to the relation - -Be aware of unintended side-effects! Should your attribute be non-unique or your two players be part of more than one relation, you will add a relation-with-relation to each matching player relation! - -For example, given the following additions to our example schema: - -```GraphQL -person sub entity, - ..., - plays peer; - -call sub relation, - ..., - plays past-call; - -# This is the new relation-with-relation: -communication-channel sub relation, - relates peer, - relates past-call; # this is a call relation playing past-call -``` - -We define the following processor object that will allow for adding a call as a past-call either by an identifying attribute or matching via its players (caller and callee): - -``` -{ - "processor": "communication-channel", - "processorType": "Relation-with-Relation", - "schemaType": "communication-channel", - "conceptGenerators": { - "players": { - "peer": { - "playerType": "person", - "uniquePlayerId": "phone-number", - "idValueType": "string", - "roleType": "peer", - "required": true - } - }, - "relationPlayers": { // this is new! - "past-call": { // past-call will be the relation in the communication-channel relation - "playerType": "call", // it is of relation type "call" - "roleType": "past-call", // and plays the role of past-call in communication-channel - "required": true, // it is required - "matchByAttribute": { // we can identify a past-call via its attribute - "started-at": { // this is the name of the attribute processor - "attributeType": "started-at", // we identify by "started-at" attribute of a call - "valueType": "datetime" // which is of type datetime - } - }, - "matchByPlayer": { // we can also identify a past-call via its players - "caller": { // the name of the player processor - "playerType": "person", // the player is of type person - "uniquePlayerId": "phone-number", // is identified by a phone number - "idValueType": "string", // which is of type string - "roleType": "caller", // the person is a caller in the call - "required": true // and its required - }, - "callee": { // the name of the player processor - "playerType": "person", // the player is of type person - "uniquePlayerId": "phone-number", // is identified by a phone number - "idValueType": "string", // which is of type string - "roleType": "callee", // the person is a callee in the call - "required": true // and its required - } - } - } - } - } - } -``` - -1. 
This is how you can add a relation based on an identifying attribute: - -Given the data file [communication-channel.csv](https://github.com/bayer-science-for-a-better-life/grami/tree/master/src/test/resources/phone-calls/communication-channel.csv): - -```CSV -peer_1,peer_2,call_started_at -+54 398 559 0423,+48 195 624 2025,2018-09-16T22:24:19 -+263 498 495 0617,+33 614 339 0298,2018-09-11T22:10:34### 2018-09-12T22:10:34###2018-09-13T22:10:34 ###2018-09-14T22:10:34###2018-09-15T22:10:34###2018-09-16T22:10:34 -+370 351 224 5176,+62 533 266 3426,2018-09-15T12:12:59 -... -``` - -The data config entry would be: - -``` -"communication-channel": { - "dataPath": "path/to/communication-channel.csv", // the absolute path to your data file - "separator": ",", // the separation character used in your data file (alternatives: "\t", ";", etc...) - "processor": "communication-channel", // processor from processor config file - "batchSize": 100, // batchSize to be used for this data file - "threads": 4, // # of threads to be used for this data file - "players": [ // player columns present in the data file - { - "columnName": "peer_1", // column name in data file - "generator": "peer" // player generator in processor call to be used for the column - }, - { - "columnName": "peer_2", // column name in data file - "generator": "peer" // player generator in processor call to be used for the column - } - ], - "relationPlayers": [ - { - "columnName": "call_started_at", // this is the column in your data file containing the "matchByAttribute" - "generator": "past-call", // it will use the past-call generator in your processor config - "matchByAttribute": "started-at", // it will use the started-at matchByAttribute processor - "listSeparator": "###" // it is a list-like column with "###" as a separator - } - ] -} -``` - -2. This is how you can add a relation based on player matching: - -Given the data file [communication-channel-pm.csv](https://github.com/bayer-science-for-a-better-life/grami/tree/master/src/test/resources/phone-calls/communication-channel-pm.csv): - -```CSV -peer_1,peer_2 -+81 308 988 7153,+351 515 605 7915 -+7 171 898 0853,+57 629 420 5680 -... -``` - -The data config entry would be identical to the one above, except for: - -``` -"communication-channel-pm": { - "dataPath": "path/to/communication-channel-pm.csv", - ... - "relationPlayers": [ // each list entry is a relationPlayer - { - "columnNames": ["peer_1", "peer_2"], // two players will be used to identify the relation - these are the column names containing the attribute specified in the matchByPlayer object in the processor generator "past-call" - "generator": "past-call", // the generator in the processor configuration file to be used is "past-call" - "matchByPlayers": ["caller", "callee"] // the two player generators specified in the matchByPlayer object in the processor generator "past-call" - } - ] -} -``` - -For troubleshooting, it might be worth setting the troublesome data configuration entry to a single thread, as the log messages for error from grakn are more verbose and specific that way... - -See the [full processor configuration file for phone-calls here](https://github.com/bayer-science-for-a-better-life/grami/tree/master/src/test/resources/phone-calls/processorConfig.json). -See the [full data configuration file for phone-calls here](https://github.com/bayer-science-for-a-better-life/grami/tree/master/src/test/resources/phone-calls/dataConfig.json). 
- -### Schema Updating - -When using it in your own application: -```Java -package migration; - -import configuration.MigrationConfig; -import migrator.GraknMigrator; -import java.io.*; - -public class Migration { - - private static final String schema = "/path/to/your/updated-schema.gql"; - private static final String graknURI = "127.0.0.1:48555"; // defines which grakn server to migrate into - private static final String keyspaceName = "yourFavoriteKeyspace"; // defines which keyspace to migrate into - - private static final SchemaUpdateConfig suConfig = new SchemaUpdateConfig(graknURI, keyspaceName, schema); - - public static void main(String[] args) throws IOException { - SchemaUpdater su = new SchemaUpdater(suConfig); - su.updateSchema(); - } -} -``` - -or using the CLI: + 1. As an executable command line interface - no coding required ```Shell -./bin/grami schema-update \ +./bin/grami migrate \ +-dc /path/to/dataConfig.json \ +-pc /path/to/processorConfig.json \ +-ms /path/to/migrationStatus.json \ -s /path/to/schema.gql \ --k yourFavoriteKeyspace \ -``` - -### Attribute Appending - -It is often convenient to be able to append attributes to existing entities/relations that have already been inserted. - -Given [append-twitter.csv](https://github.com/bayer-science-for-a-better-life/grami/tree/master/src/test/resources/phone-calls/append-twitter.csv): - -```CSV -phone_number,twitter -+7 171 898 0853,@jojo -+263 498 495 0617,@hui###@bui -+370 351 224 5176,@lalulix -+81 308 988 7153,@go34 -... -``` - -We can match a person entity on the phone-number attribute, and then insert the twitter-username as an additional attribute. - -Given the following in our schema: - -```GraphQL -twitter-username sub attribute, - value string; - -person sub entity, - ... - has twitter-username; -``` - -We create an append-attribute processor, which should look very familiar: - -``` -{ - "processor": "append-twitter-to-person", - "processorType": "append-attribute", // the append-attribute processor type must be set - "schemaType": "person", // concept to which the attributes will be appended - "conceptGenerators": { - "attributes": { - "phone-number": { // the attribute generator for the attribute used for identifying the entity/relation - "attributeType": "phone-number", - "valueType": "string" - }, - "twitter-username": { // the attribute generator for the attribute to be migrated - "attributeType": "twitter-username", - "valueType": "string", - "required": true - } - } - } -} -``` - -The data config entry would look like: - -``` -"append-twitter": { - "dataPath": "src/test/resources/phone-calls/append-twitter.csv", - "separator": ",", - "processor": "append-twitter-to-person", - "attributes": [ - { - "columnName": "phone_number", - "generator": "phone-number", - "match": true // the identifying attribute must contain the match flag - }, - { - "columnName": "twitter", - "generator": "twitter-username", - "listSeparator": "###" - } - ], - "batchSize": 100, - "threads": 4 - } -``` - -### Preprocessors - -Sometimes your data comes in an almost useful format. GraMi allows you to pre-process attribute columns using a regex preprocessor (any other ideas welcome!). It works like so: - -After slightly extending our example schema by another attribute: -```GraphQL -fakebook-link sub attribute, - value string; - -person sub entity, - ... 
- has fakebook-link; -``` - -``` -"append-pp-fakebook": { - "dataPath": "src/test/resources/phone-calls/append-fb-preprocessed.csv", - "separator": ",", - "processor": "append-pp-fb-to-person", - "attributes": [ - { - "columnName": "phone_number", - "generator": "phone-number", - "match": true - }, - { - "columnName": "fb", - "generator": "fakebook-link", - "listSeparator": "###", - "preprocessor": { // add a preprocessor to an attribute - "type": "regex", // there is currently only this type available - "params": { // the parameters the preprocessor requires - "regexMatch" : "^.*(fakebook\\.com.*)/$", // the regex to match on - "regexReplace": "$1" // the regex to replace with - } - } - } - ], - "batchSize": 100, - "threads": 4 - } -``` - -The processor config does not have to be modified in order to use the preprocessor: - -``` -{ -"processor": "append-pp-fb-to-person", - "processorType": "append-attribute", - "schemaType": "person", - "conceptGenerators": { - "attributes": { - "phone-number": { - "attributeType": "phone-number", - "valueType": "string" - }, - "fakebook-link": { - "attributeType": "fakebook-link", - "valueType": "string", - "required": true - } - } - } -} +-db yourFavoriteDatabase ``` -This will turn the following [append-fb-preprocessed.csv](https://github.com/bayer-science-for-a-better-life/grami/tree/master/src/test/resources/phone-calls/append-fb-preprocessed.csv): +[See details here](https://github.com/bayer-science-for-a-better-life/grami/wiki/Grami-as-Executable-CLI) -``` -phone_number,fb -+36 318 105 5629,https://www.fakebook.com/personOne/ -+63 808 497 1769,https://www.fakebook.com/person-Two/ -+62 533 266 3426,https://www.fakebook.com/person_three/ -``` - -Using the first line as an example: it will insert only as the fakebook-link attribute. - -### Using GraMi in your Java Application: - -#### Add GraMi as dependency - -Maven: - -Add jitpack and grakn as repositories: - -```XML - - - grakn.ai - https://repo.grakn.ai/repository/maven/ - - - jitpack.io - https://jitpack.io - - -``` - -Add GraMi as dependency: -```XML - - io.github.bayer-science-for-a-better-life - grami - 0.0.1 - -``` - -Gradle: - -Add jitpack and grakn as repositories: -``` -repositories { - ... - maven { url 'https://repo.grakn.ai/repository/maven/'} - maven { url 'https://jitpack.io' } -} -``` - -Add GraMi as dependency: -``` -dependencies { - ... - implementation 'com.github.bayer-science-for-a-better-life:grami:0.0.1' -} -``` - -#### Create Migration Class - -In your favorite IDE, create a Class that will handle your migration (here: migration.Migration): + 2. 
As a dependency in your own Java code ```Java -package migration; - -import configuration.MigrationConfig; -import migrator.GraknMigrator; -import java.io.*; - public class Migration { private static final String schema = "/path/to/your/schema.gql"; @@ -702,10 +97,10 @@ public class Migration { private static final String dataConfig = "/path/to/your/dataConfig.json"; private static final String migrationStatus = "/path/to/your/migrationStatus.json"; - private static final String graknURI = "127.0.0.1:48555"; // defines which grakn server to migrate into - private static final String keyspaceName = "yourFavoriteKeyspace"; // defines which keyspace to migrate into + private static final String graknURI = "127.0.0.1:1729"; // defines which grakn server to migrate into + private static final String databaseName = "yourFavoriteDatabase"; // defines which keyspace to migrate into - private static final MigrationConfig migrationConfig = new MigrationConfig(graknURI, keyspaceName, schema, dataConfig, processorConfig); + private static final MigrationConfig migrationConfig = new MigrationConfig(graknURI, databaseName, schema, dataConfig, processorConfig); public static void main(String[] args) throws IOException { GraknMigrator mig = new GraknMigrator(migrationConfig, migrationStatus, true); @@ -714,90 +109,29 @@ public class Migration { } ``` -The boolean flag cleanAndMigrate set to *true* as shown in: -```Java -GraknMigrator mig = new GraknMigrator(migrationConfig, migrationStatus, true); -``` -will, if exists, delete the schema and all data in the given keyspace. -If set to *false*, GraMi will continue migration according to the migrationStatus file - meaning it will continue where it left off previously and leave the schema as it is. - - -As for -```Java -mig.migrate(true, true, true, true); -``` - - setting all to false will only reapply the schema if cleanAndMigration is set to true - otherwise it will do nothing - - setting the first flag to true will migrate entities - - setting the second flag to true will migrate the relations in addition - - setting the third flag to true will migrate the relation-with-relations in addition - - setting the fourth flag to true will migrate the append-attributes in addition - -These flags exist because it is sometimes convenient for debugging during early development of the database model to migrate the three different classes one after the other. - -#### Configure Logging - -For control of GraMi logging, add the following to your log4j2.xml configuration: - -```XML - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -``` +[See details here](https://github.com/bayer-science-for-a-better-life/grami/wiki/GraMi-as-Dependency) -For tracking the progress of your importing, the suggested logging level for GraMi is "INFO". For more detailed output, set the level to DEBUG or TRACE at your convenience: -### Using GraMi as a Command-Line Application: - -Download the .zip/.tar file [here](https://github.com/bayer-science-for-a-better-life/grami/releases). After unpacking, you can run it directly out of the /bin directory: - -```Shell -./bin/grami migrate \ --d /path/to/dataConfig.json \ --p /path/to/processorConfig.json \ --m /path/to/migrationStatus.json \ --s /path/to/schema.gql \ --k yourFavoriteKeyspace \ --cm -``` +## Step-by-Step Tutorial -grami will create two log files (one for the application progress/warnings/errors, one concerned with data validity) in the grami directory for your convenience. 
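Since the dependency-based Java example above only appears here as diff fragments, the following self-contained sketch assembles it into one class. All paths, the server address, and the database name are placeholders, and the four boolean flags passed to migrate() (entities, relations, nested relations, attribute appends) follow the pre-2.0 README text that this diff removes; double-check that signature against the GraMi version you depend on.

```Java
package migration;

import configuration.MigrationConfig;
import migrator.GraknMigrator;
import java.io.IOException;

public class Migration {

    private static final String schema          = "/path/to/your/schema.gql";
    private static final String processorConfig = "/path/to/your/processorConfig.json";
    private static final String dataConfig      = "/path/to/your/dataConfig.json";
    private static final String migrationStatus = "/path/to/your/migrationStatus.json";

    private static final String graknURI     = "127.0.0.1:1729";       // grakn 2.x server to migrate into
    private static final String databaseName = "yourFavoriteDatabase"; // database to migrate into

    private static final MigrationConfig migrationConfig =
            new MigrationConfig(graknURI, databaseName, schema, dataConfig, processorConfig);

    public static void main(String[] args) throws IOException {
        // third argument true = clean migration: drop the existing schema and data and start over;
        // false continues a previous run based on the migrationStatus file
        GraknMigrator mig = new GraknMigrator(migrationConfig, migrationStatus, true);
        // flag order as documented for the pre-2.0 API:
        // entities, relations, nested relations, attribute appends
        mig.migrate(true, true, true, true);
    }
}
```

If you only need to apply a non-breaking schema change, the SchemaUpdateConfig and SchemaUpdater used by the schema-update CLI command later in this diff handle that case without re-running a data migration.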
+A complete tutorial for grakn version >= 2.0 is in work and will be published asap. -## Step-by-Step Tutorial +A complete tutorial for grakn version >= 1.8.2, but < 2.0 can be found [on Medium](https://medium.com/@hkuich/introducing-grami-a-data-migration-tool-for-grakn-d4051582f867). -A complete tutorial can be found [on Medium](https://medium.com/@hkuich/introducing-grami-a-data-migration-tool-for-grakn-d4051582f867). +There is this [example repository](https://github.com/bayer-science-for-a-better-life/grami-example). ## Compatibility -GraknMigrator is tested for +GraMi version >= 0.1.0 is tested for: +- [grakn-core](https://github.com/graknlabs/grakn) >= 2.0-alpha-6 +- [client-java](https://github.com/graknlabs/client-java) >= 2.0.0-alpha-8 + +GraMi version < 0.1.0 is tested for: - [grakn-core](https://github.com/graknlabs/grakn) >= 1.8.2 - [client-java](https://github.com/graknlabs/client-java) >= 1.8.3 +Find the Readme for GraMi for grakn < 2.0 [here](https://github.com/bayer-science-for-a-better-life/grami/blob/b3d6d272c409d6c40254354027b49f90b255e1c3/README.md) + ## Contributions GraknMigrator was built @[Bayer AG](https://www.bayer.com/) in the Semantic and Knowledge Graph Technology Group with the support of the engineers @[Grakn Labs](https://github.com/orgs/graknlabs/people) diff --git a/build.gradle b/build.gradle index 9725443..759f573 100644 --- a/build.gradle +++ b/build.gradle @@ -5,7 +5,7 @@ plugins { } group 'com.github.bayer-science-for-a-better-life' -version '0.0.3' +version '0.1.0-alpha-9' repositories { mavenCentral() @@ -15,7 +15,7 @@ repositories { } dependencies { - compile group: 'io.grakn.client', name: 'grakn-client', version: '1.8.3' + compile group: 'io.grakn.client', name: 'grakn-client', version: '2.0.0-alpha-9' testCompile group: 'junit', name: 'junit', version: '4.12' compile 'com.google.code.gson:gson:2.8.6' compile group: 'org.slf4j', name: 'slf4j-api', version: '1.7.25' @@ -25,7 +25,7 @@ dependencies { compile 'info.picocli:picocli:4.5.1' } -mainClassName = 'cli.Cli' +mainClassName = 'cli.GramiCLI' publishing { publications { diff --git a/src/main/java/cli/GramiCLI.java b/src/main/java/cli/GramiCLI.java index 6316ebe..3f5895b 100644 --- a/src/main/java/cli/GramiCLI.java +++ b/src/main/java/cli/GramiCLI.java @@ -8,7 +8,7 @@ import java.io.IOException; -@CommandLine.Command(description="Welcome to the CLI of GraMi - your grakn data migration tool", name = "grami", version = "0.0.3", mixinStandardHelpOptions = true) +@CommandLine.Command(description="Welcome to the CLI of GraMi - your grakn data migration tool", name = "grami", version = "0.1.0-alpha-9", mixinStandardHelpOptions = true) public class GramiCLI { public static void main(String[] args) { @@ -25,22 +25,22 @@ public static void main(String[] args) { class MigrateCommand implements Runnable { @CommandLine.Spec CommandLine.Model.CommandSpec spec; - @CommandLine.Option(names = {"-d", "--dataConfigFile"}, description = "data config file in JSON format", required = true) + @CommandLine.Option(names = {"-dc", "--dataConfigFile"}, description = "data config file in JSON format", required = true) private String dataConfigFilePath; - @CommandLine.Option(names = {"-p", "--processorConfigFile"}, description = "processor config file in JSON format", required = true) + @CommandLine.Option(names = {"-pc", "--processorConfigFile"}, description = "processor config file in JSON format", required = true) private String processorConfigFilePath; - @CommandLine.Option(names = {"-m", "--migrationStatusFile"}, 
description = "file to track migration status in", required = true) + @CommandLine.Option(names = {"-ms", "--migrationStatusFile"}, description = "file to track migration status in", required = true) private String migrationStatusFilePath; @CommandLine.Option(names = {"-s", "--schemaFile"}, description = "your schema file as .gql", required = true) private String schemaFilePath; - @CommandLine.Option(names = {"-k", "--keyspace"}, description = "target keyspace in your grakn instance", required = true) - private String keyspaceName; + @CommandLine.Option(names = {"-db", "--database"}, description = "target database in your grakn instance", required = true) + private String databaseName; - @CommandLine.Option(names = {"-g", "--grakn"}, description = "optional - grakn DB in format: server:port (default: localhost:48555)", defaultValue = "localhost:48555") + @CommandLine.Option(names = {"-g", "--grakn"}, description = "optional - grakn DB in format: server:port (default: localhost:1729)", defaultValue = "localhost:1729") private String graknURI; @CommandLine.Option(names = {"-cm", "--cleanMigration"}, description = "optional - delete old schema and data and restart migration from scratch - default: continue previous migration, if exists") @@ -57,12 +57,12 @@ public void run() { spec.commandLine().getOut().println("\tprocessor configuration: " + processorConfigFilePath); spec.commandLine().getOut().println("\ttracking migration status in: " + migrationStatusFilePath); spec.commandLine().getOut().println("\tschema: " + schemaFilePath); - spec.commandLine().getOut().println("\tkeyspace: " + keyspaceName); + spec.commandLine().getOut().println("\tdatabase: " + databaseName); spec.commandLine().getOut().println("\tgrakn server: " + graknURI); - spec.commandLine().getOut().println("\tdelete keyspace and all data in it for a clean new migration?: " + cleanMigration); + spec.commandLine().getOut().println("\tdelete database and all data in it for a clean new migration?: " + cleanMigration); spec.commandLine().getOut().println("\tmigration scope: " + scope); - final MigrationConfig migrationConfig = new MigrationConfig(graknURI, keyspaceName, schemaFilePath, dataConfigFilePath, processorConfigFilePath); + final MigrationConfig migrationConfig = new MigrationConfig(graknURI, databaseName, schemaFilePath, dataConfigFilePath, processorConfigFilePath); try { GraknMigrator mig = new GraknMigrator(migrationConfig, migrationStatusFilePath, cleanMigration); @@ -91,10 +91,10 @@ class SchemaUpdateCommand implements Runnable { @CommandLine.Option(names = {"-s", "--schemaFile"}, description = "your schema file as .gql", required = true) private String schemaFilePath; - @CommandLine.Option(names = {"-k", "--keyspace"}, description = "target keyspace in your grakn instance", required = true) - private String keyspaceName; + @CommandLine.Option(names = {"-db", "--database"}, description = "target database in your grakn instance", required = true) + private String databaseName; - @CommandLine.Option(names = {"-g", "--grakn"}, description = "optional - grakn DB in format: server:port (default: localhost:48555)", defaultValue = "localhost:48555") + @CommandLine.Option(names = {"-g", "--grakn"}, description = "optional - grakn DB in format: server:port (default: localhost:1729)", defaultValue = "localhost:1729") private String graknURI; @Override @@ -102,10 +102,10 @@ public void run() { spec.commandLine().getOut().println("############## GraMi schema-update ###############"); 
spec.commandLine().getOut().println("schema-update started with parameters:"); spec.commandLine().getOut().println("\tschema: " + schemaFilePath); - spec.commandLine().getOut().println("\tkeyspace: " + keyspaceName); + spec.commandLine().getOut().println("\tkeyspace: " + databaseName); spec.commandLine().getOut().println("\tgrakn server: " + graknURI); - SchemaUpdateConfig suConfig = new SchemaUpdateConfig(graknURI, keyspaceName, schemaFilePath); + SchemaUpdateConfig suConfig = new SchemaUpdateConfig(graknURI, databaseName, schemaFilePath); SchemaUpdater su = new SchemaUpdater(suConfig); su.updateSchema(); } diff --git a/src/main/java/generator/AppendAttributeGenerator.java b/src/main/java/generator/AppendAttributeGenerator.java index 6f0e8d7..333a257 100644 --- a/src/main/java/generator/AppendAttributeGenerator.java +++ b/src/main/java/generator/AppendAttributeGenerator.java @@ -3,13 +3,16 @@ import configuration.DataConfigEntry; import configuration.ProcessorConfigEntry; import graql.lang.Graql; -import graql.lang.statement.Statement; -import graql.lang.statement.StatementInstance; +import graql.lang.pattern.Pattern; +import graql.lang.pattern.variable.ThingVariable; +import graql.lang.pattern.variable.ThingVariable.Thing; +import graql.lang.pattern.variable.UnboundVariable; import org.apache.logging.log4j.LogManager; import org.apache.logging.log4j.Logger; import java.util.ArrayList; import java.util.Arrays; +import java.util.HashMap; import java.util.Map; import static generator.GeneratorUtil.*; @@ -21,38 +24,42 @@ public class AppendAttributeGenerator extends InsertGenerator { private static final Logger appLogger = LogManager.getLogger("com.bayer.dt.grami"); private static final Logger dataLogger = LogManager.getLogger("com.bayer.dt.grami.data"); - public AppendAttributeGenerator(DataConfigEntry dataConfigEntry, ProcessorConfigEntry processorConfigEntry) { + public AppendAttributeGenerator(DataConfigEntry dataConfigEntry, + ProcessorConfigEntry processorConfigEntry) { super(); this.dce = dataConfigEntry; this.pce = processorConfigEntry; appLogger.debug("Creating AppendAttribute for processor " + processorConfigEntry.getProcessor() + " of type " + processorConfigEntry.getProcessorType()); } - public ArrayList>> graknAppendAttributeInsert(ArrayList rows, String header) throws Exception { - ArrayList>> matchInsertStatements = new ArrayList<>(); + public HashMap>>> graknAppendAttributeInsert(ArrayList rows, + String header) throws Exception { + HashMap>>> matchInsertPatterns = new HashMap<>(); - ArrayList> matchStatements = new ArrayList<>(); - ArrayList> insertStatements = new ArrayList<>(); + ArrayList>> matchPatterns = new ArrayList<>(); + ArrayList>> insertPatterns = new ArrayList<>(); int insertCounter = 0; for (String row : rows) { - ArrayList> tmp = graknAppendAttributeQueryFromRow(row, header, insertCounter); + ArrayList>> tmp = graknAppendAttributeQueryFromRow(row, header, insertCounter); if (tmp != null) { if (tmp.get(0) != null && tmp.get(1) != null) { - matchStatements.add(tmp.get(0)); - insertStatements.add(tmp.get(1)); + matchPatterns.add(tmp.get(0)); + insertPatterns.add(tmp.get(1)); insertCounter++; } } } - matchInsertStatements.add(matchStatements); - matchInsertStatements.add(insertStatements); - return matchInsertStatements; + matchInsertPatterns.put("match", matchPatterns); + matchInsertPatterns.put("insert", insertPatterns); + return matchInsertPatterns; } - public ArrayList> graknAppendAttributeQueryFromRow(String row, String header, int insertCounter) throws 
Exception { + public ArrayList>> graknAppendAttributeQueryFromRow(String row, + String header, + int insertCounter) throws Exception { String fileSeparator = dce.getSeparator(); String[] rowTokens = row.split(fileSeparator); String[] columnNames = header.split(fileSeparator); @@ -63,35 +70,38 @@ public ArrayList> graknAppendAttributeQueryFromRow(String r throw new IllegalArgumentException("data config entry for " + dce.getDataPath() + " is incomplete - it needs at least one attribute used for matching (\"match\": true) and at least one attribute to be appended (\"match\": false or not set at all"); } - ArrayList matchStatements = new ArrayList<>(); - ArrayList insertStatements = new ArrayList<>(); + ArrayList> matchPatterns = new ArrayList<>(); + ArrayList> insertPatterns = new ArrayList<>(); // get all attributes that are isMatch() --> construct match clause - StatementInstance appendAttributeMatchStatement = addEntityToMatchStatement(insertCounter); + Thing appendAttributeMatchPattern = addEntityToMatchPattern(insertCounter); for (DataConfigEntry.DataConfigGeneratorMapping generatorMappingForMatchAttribute : dce.getAttributes()) { - if (generatorMappingForMatchAttribute.isMatch()){ - appendAttributeMatchStatement = addAttribute(rowTokens, appendAttributeMatchStatement, columnNames, generatorMappingForMatchAttribute, pce, generatorMappingForMatchAttribute.getPreprocessor()); + if (generatorMappingForMatchAttribute.isMatch()) { + appendAttributeMatchPattern = addAttribute(rowTokens, appendAttributeMatchPattern, columnNames, generatorMappingForMatchAttribute, pce, generatorMappingForMatchAttribute.getPreprocessor()); } } - matchStatements.add(appendAttributeMatchStatement); + matchPatterns.add(appendAttributeMatchPattern); // get all attributes that are !isMatch() --> construct insert clause - Statement appendAttributeInsertStatement = addEntityToInsertStatement(insertCounter); + UnboundVariable thingVar = addEntityToInsertPattern(insertCounter); + Thing appendAttributeInsertPattern = null; for (DataConfigEntry.DataConfigGeneratorMapping generatorMappingForAppendAttribute : dce.getAttributes()) { - if (!generatorMappingForAppendAttribute.isMatch()){ - appendAttributeInsertStatement = addAttribute(rowTokens, appendAttributeMatchStatement, columnNames, generatorMappingForAppendAttribute, pce, generatorMappingForAppendAttribute.getPreprocessor()); + if (!generatorMappingForAppendAttribute.isMatch()) { + appendAttributeInsertPattern = addAttribute(rowTokens, thingVar, columnNames, generatorMappingForAppendAttribute, pce, generatorMappingForAppendAttribute.getPreprocessor()); } } - insertStatements.add((StatementInstance) appendAttributeInsertStatement); + if (appendAttributeInsertPattern != null) { + insertPatterns.add(appendAttributeInsertPattern); + } - ArrayList> assembledStatements = new ArrayList<>(); - assembledStatements.add(matchStatements); - assembledStatements.add(insertStatements); + ArrayList>> assembledPatterns = new ArrayList<>(); + assembledPatterns.add(matchPatterns); + assembledPatterns.add(insertPatterns); - if (isValid(assembledStatements)) { - appLogger.debug("valid query: <" + assembleQuery(assembledStatements).toString() + ">"); - return assembledStatements; + if (isValid(assembledPatterns)) { + appLogger.debug("valid query: <" + assembleQuery(assembledPatterns).toString() + ">"); + return assembledPatterns; } else { dataLogger.warn("in datapath <" + dce.getDataPath() + ">: skipped row b/c does not contain at least one match attribute and one insert attribute. 
Faulty tokenized row: " + Arrays.toString(rowTokens)); return null; @@ -102,7 +112,7 @@ private boolean validateDataConfigEntry() { boolean containsMatchAttribute = false; boolean containsAppendAttribute = false; for (DataConfigEntry.DataConfigGeneratorMapping attributeMapping : dce.getAttributes()) { - if (attributeMapping.isMatch()){ + if (attributeMapping.isMatch()) { containsMatchAttribute = true; } if (!attributeMapping.isMatch()) { @@ -112,51 +122,58 @@ private boolean validateDataConfigEntry() { return containsMatchAttribute && containsAppendAttribute; } - private StatementInstance addEntityToMatchStatement(int insertCounter) { + private Thing addEntityToMatchPattern(int insertCounter) { if (pce.getSchemaType() != null) { - return Graql.var("e-" + insertCounter).isa(pce.getSchemaType()); +// return Graql.var("e-" + insertCounter).isa(pce.getSchemaType()); + return Graql.var("e").isa(pce.getSchemaType()); } else { throw new IllegalArgumentException("Required field not set in processor " + pce.getProcessor()); } } - private Statement addEntityToInsertStatement(int insertCounter) { + private UnboundVariable addEntityToInsertPattern(int insertCounter) { if (pce.getSchemaType() != null) { - return Graql.var("e-" + insertCounter); +// return Graql.var("e-" + insertCounter); + return Graql.var("e"); } else { throw new IllegalArgumentException("Required field not set in processor " + pce.getProcessor()); } } - private String assembleQuery(ArrayList> queries) { + private String assembleQuery(ArrayList>> queries) { StringBuilder ret = new StringBuilder(); - for (Statement st : queries.get(0)) { + for (ThingVariable st : queries.get(0)) { ret.append(st.toString()); } ret.append(queries.get(1).get(0).toString()); return ret.toString(); } - private boolean isValid(ArrayList> si) { - ArrayList matchStatements = si.get(0); - ArrayList insertStatements = si.get(1); - StringBuilder matchStatement = new StringBuilder(); - for (Statement st:matchStatements) { - matchStatement.append(st.toString()); + private boolean isValid(ArrayList>> si) { + ArrayList> matchPatterns = si.get(0); + ArrayList> insertPatterns = si.get(1); + + if (insertPatterns.size() < 1) { + return false; + } + + StringBuilder matchPattern = new StringBuilder(); + for (Pattern st : matchPatterns) { + matchPattern.append(st.toString()); } - String insertStatement = insertStatements.get(0).toString(); + String insertPattern = insertPatterns.get(0).toString(); // missing match attribute - for (DataConfigEntry.DataConfigGeneratorMapping attributeMapping: dce.getMatchAttributes()) { + for (DataConfigEntry.DataConfigGeneratorMapping attributeMapping : dce.getMatchAttributes()) { String generatorKey = attributeMapping.getGenerator(); ProcessorConfigEntry.ConceptGenerator generatorEntry = pce.getAttributeGenerator(generatorKey); - if (!matchStatement.toString().contains("has " + generatorEntry.getAttributeType())) { + if (!matchPattern.toString().contains("has " + generatorEntry.getAttributeType())) { return false; } } // missing required insert attribute - for (Map.Entry generatorEntry: pce.getRequiredAttributes().entrySet()) { - if (!insertStatement.contains("has " + generatorEntry.getValue().getAttributeType())) { + for (Map.Entry generatorEntry : pce.getRequiredAttributes().entrySet()) { + if (!insertPattern.contains("has " + generatorEntry.getValue().getAttributeType())) { return false; } } diff --git a/src/main/java/generator/EntityInsertGenerator.java b/src/main/java/generator/EntityInsertGenerator.java index b1c2055..55f9dc3 100644 
--- a/src/main/java/generator/EntityInsertGenerator.java +++ b/src/main/java/generator/EntityInsertGenerator.java @@ -6,8 +6,9 @@ import configuration.DataConfigEntry; import configuration.ProcessorConfigEntry; import graql.lang.Graql; -import graql.lang.statement.Statement; -import graql.lang.statement.StatementInstance; +import graql.lang.pattern.Pattern; +import graql.lang.pattern.variable.ThingVariable; +import graql.lang.pattern.variable.ThingVariable.Thing; import org.apache.logging.log4j.LogManager; import org.apache.logging.log4j.Logger; @@ -29,38 +30,41 @@ public EntityInsertGenerator(DataConfigEntry dataConfigEntry, ProcessorConfigEnt appLogger.debug("Creating EntityInsertGenerator for processor " + processorConfigEntry.getProcessor() + " of type " + processorConfigEntry.getProcessorType()); } - public ArrayList graknEntityInsert(ArrayList rows, String header) throws IllegalArgumentException { - ArrayList statements = new ArrayList<>(); + public ArrayList> graknEntityInsert(ArrayList rows, + String header) throws IllegalArgumentException { + ArrayList> patterns = new ArrayList<>(); int insertCounter = 0; for (String row : rows) { try { - Statement temp = graknEntityQueryFromRow(row, header, insertCounter); + ThingVariable temp = graknEntityQueryFromRow(row, header, insertCounter); if (temp != null) { - statements.add(temp); + patterns.add(temp); } insertCounter++; } catch (Exception e) { e.printStackTrace(); } } - return statements; + return patterns; } - public StatementInstance graknEntityQueryFromRow(String row, String header, int insertCounter) throws Exception { + public ThingVariable graknEntityQueryFromRow(String row, + String header, + int insertCounter) throws Exception { String fileSeparator = dce.getSeparator(); String[] rowTokens = row.split(fileSeparator); String[] columnNames = header.split(fileSeparator); appLogger.debug("processing tokenized row: " + Arrays.toString(rowTokens)); malformedRow(row, rowTokens, columnNames.length); - StatementInstance entityInsertStatement = addEntityToStatement(insertCounter); + Thing entityInsertStatement = addEntityToStatement(insertCounter); for (DataConfigEntry.DataConfigGeneratorMapping generatorMappingForAttribute : dce.getAttributes()) { entityInsertStatement = addAttribute(rowTokens, entityInsertStatement, columnNames, generatorMappingForAttribute, pce, generatorMappingForAttribute.getPreprocessor()); } if (isValid(entityInsertStatement)) { - appLogger.debug("valid query: <" + entityInsertStatement.toString() + ">"); + appLogger.debug("valid query: "); return entityInsertStatement; } else { dataLogger.warn("in datapath <" + dce.getDataPath() + ">: skipped row b/c does not have a proper statement or is missing required attributes. 
Faulty tokenized row: " + Arrays.toString(rowTokens)); @@ -68,7 +72,7 @@ public StatementInstance graknEntityQueryFromRow(String row, String header, int } } - private StatementInstance addEntityToStatement(int insertCounter) { + private Thing addEntityToStatement(int insertCounter) { if (pce.getSchemaType() != null) { return Graql.var("e-" + insertCounter).isa(pce.getSchemaType()); } else { @@ -76,13 +80,13 @@ private StatementInstance addEntityToStatement(int insertCounter) { } } - private boolean isValid(StatementInstance si) { - String statement = si.toString(); - if (!statement.contains("isa " + pce.getSchemaType())) { + private boolean isValid(Pattern pa) { + String patternAsString = pa.toString(); + if (!patternAsString.contains("isa " + pce.getSchemaType())) { return false; } for (Map.Entry con : pce.getRequiredAttributes().entrySet()) { - if (!statement.contains("has " + con.getValue().getAttributeType())) { + if (!patternAsString.contains("has " + con.getValue().getAttributeType())) { return false; } } diff --git a/src/main/java/generator/GeneratorUtil.java b/src/main/java/generator/GeneratorUtil.java index d57b5b9..71cdf88 100644 --- a/src/main/java/generator/GeneratorUtil.java +++ b/src/main/java/generator/GeneratorUtil.java @@ -2,7 +2,10 @@ import configuration.DataConfigEntry; import configuration.ProcessorConfigEntry; -import graql.lang.statement.StatementInstance; +import graql.lang.pattern.variable.ThingVariable; +import graql.lang.pattern.variable.UnboundVariable; +import graql.lang.pattern.variable.ThingVariable.Thing; +import graql.lang.pattern.variable.ThingVariable.Relation; import org.apache.logging.log4j.LogManager; import org.apache.logging.log4j.Logger; import preprocessor.RegexPreprocessor; @@ -26,17 +29,21 @@ public static String cleanToken(String token) { return cleaned; } - public static void malformedRow(String row, String[] rowTokens, int numberOfColumns) throws Exception { + public static void malformedRow(String row, + String[] rowTokens, + int numberOfColumns) throws Exception { if (rowTokens.length > numberOfColumns) { throw new Exception("malformed input row (additional separator characters found) not inserted - fix the following and restart migration: " + row); } } - public static int idxOf(String[] headerTokens, String columnName) { + public static int idxOf(String[] headerTokens, + String columnName) { return Arrays.asList(headerTokens).indexOf(columnName); } - public static int[] indicesOf(String[] headerTokens, String[] columnNames) { + public static int[] indicesOf(String[] headerTokens, + String[] columnNames) { int[] indices = new int[columnNames.length]; int i = 0; for (String columnName : columnNames) { @@ -46,7 +53,12 @@ public static int[] indicesOf(String[] headerTokens, String[] columnNames) { return indices; } - public static StatementInstance addAttribute(String[] tokens, StatementInstance statement, String[] columnNames, DataConfigEntry.DataConfigGeneratorMapping generatorMappingForAttribute, ProcessorConfigEntry pce, DataConfigEntry.DataConfigGeneratorMapping.PreprocessorConfig preprocessorConfig) { + public static Thing addAttribute(String[] tokens, + Thing statement, + String[] columnNames, + DataConfigEntry.DataConfigGeneratorMapping generatorMappingForAttribute, + ProcessorConfigEntry pce, + DataConfigEntry.DataConfigGeneratorMapping.PreprocessorConfig preprocessorConfig) { String attributeGeneratorKey = generatorMappingForAttribute.getGenerator(); ProcessorConfigEntry.ConceptGenerator attributeGenerator = 
pce.getAttributeGenerator(attributeGeneratorKey); String columnListSeparator = generatorMappingForAttribute.getListSeparator(); @@ -68,7 +80,87 @@ public static StatementInstance addAttribute(String[] tokens, StatementInstance return statement; } - public static StatementInstance cleanExplodeAdd(StatementInstance statement, String cleanedToken, String conceptType, String valueType, String listSeparator, DataConfigEntry.DataConfigGeneratorMapping.PreprocessorConfig preprocessorConfig) { + public static Relation addAttribute(String[] tokens, + Relation statement, + String[] columnNames, + DataConfigEntry.DataConfigGeneratorMapping generatorMappingForAttribute, + ProcessorConfigEntry pce, + DataConfigEntry.DataConfigGeneratorMapping.PreprocessorConfig preprocessorConfig) { + String attributeGeneratorKey = generatorMappingForAttribute.getGenerator(); + ProcessorConfigEntry.ConceptGenerator attributeGenerator = pce.getAttributeGenerator(attributeGeneratorKey); + String columnListSeparator = generatorMappingForAttribute.getListSeparator(); + String columnName = generatorMappingForAttribute.getColumnName(); + int columnNameIndex = idxOf(columnNames, columnName); + + if (columnNameIndex == -1) { + dataLogger.error("Column name: <" + columnName + "> was not found in file being processed"); + } else { + if ( columnNameIndex < tokens.length && + tokens[columnNameIndex] != null && + !cleanToken(tokens[columnNameIndex]).isEmpty()) { + String attributeType = attributeGenerator.getAttributeType(); + String attributeValueType = attributeGenerator.getValueType(); + String cleanedToken = cleanToken(tokens[columnNameIndex]); + statement = cleanExplodeAdd(statement, cleanedToken, attributeType, attributeValueType, columnListSeparator, preprocessorConfig); + } + } + return statement; + } + + public static Thing addAttribute(String[] tokens, + UnboundVariable statement, + String[] columnNames, + DataConfigEntry.DataConfigGeneratorMapping generatorMappingForAttribute, + ProcessorConfigEntry pce, + DataConfigEntry.DataConfigGeneratorMapping.PreprocessorConfig preprocessorConfig) { + + String attributeGeneratorKey = generatorMappingForAttribute.getGenerator(); + ProcessorConfigEntry.ConceptGenerator attributeGenerator = pce.getAttributeGenerator(attributeGeneratorKey); + String columnListSeparator = generatorMappingForAttribute.getListSeparator(); + String columnName = generatorMappingForAttribute.getColumnName(); + int columnNameIndex = idxOf(columnNames, columnName); + Thing returnThing = null; + + if (columnNameIndex == -1) { + dataLogger.error("Column name: <" + columnName + "> was not found in file being processed"); + } else { + if ( columnNameIndex < tokens.length && + tokens[columnNameIndex] != null && + !cleanToken(tokens[columnNameIndex]).isEmpty()) { + String attributeType = attributeGenerator.getAttributeType(); + String attributeValueType = attributeGenerator.getValueType(); + String cleanedToken = cleanToken(tokens[columnNameIndex]); + returnThing = cleanExplodeAdd(statement, cleanedToken, attributeType, attributeValueType, columnListSeparator, preprocessorConfig); + } + } + return returnThing; + } + + public static Thing cleanExplodeAdd(Thing statement, + String cleanedToken, + String conceptType, + String valueType, + String listSeparator, + DataConfigEntry.DataConfigGeneratorMapping.PreprocessorConfig preprocessorConfig) { + if (listSeparator != null) { + for (String exploded: cleanedToken.split(listSeparator)) { + String cleanedExplodedToken = cleanToken(exploded); + if 
(!cleanedExplodedToken.isEmpty()) { + statement = addAttributeOfColumnType(statement, conceptType, valueType, cleanedExplodedToken, preprocessorConfig); + } + } + return statement; + } else { + return addAttributeOfColumnType(statement, conceptType, valueType, cleanedToken, preprocessorConfig); + } + } + + public static Relation cleanExplodeAdd(Relation statement, + String cleanedToken, + String conceptType, + String valueType, + String listSeparator, + DataConfigEntry.DataConfigGeneratorMapping.PreprocessorConfig preprocessorConfig) { if (listSeparator != null) { for (String exploded: cleanedToken.split(listSeparator)) { String cleanedExplodedToken = cleanToken(exploded); @@ -82,7 +174,37 @@ public static StatementInstance cleanExplodeAdd(StatementInstance statement, Str } } - public static StatementInstance addAttributeOfColumnType(StatementInstance statement, String conceptType, String valueType, String cleanedValue, DataConfigEntry.DataConfigGeneratorMapping.PreprocessorConfig preprocessorConfig) { + public static Thing cleanExplodeAdd(UnboundVariable statement, + String cleanedToken, + String conceptType, + String valueType, + String listSeparator, + DataConfigEntry.DataConfigGeneratorMapping.PreprocessorConfig preprocessorConfig) { + Thing returnThing = null; + if (listSeparator != null) { + int count = 0; + for (String exploded: cleanedToken.split(listSeparator)) { + String cleanedExplodedToken = cleanToken(exploded); + if (!cleanedExplodedToken.isEmpty()) { + if (count == 0) { + returnThing = addAttributeOfColumnType(statement, conceptType, valueType, cleanedExplodedToken, preprocessorConfig); + } else { + returnThing = addAttributeOfColumnType(returnThing, conceptType, valueType, cleanedExplodedToken, preprocessorConfig); + } + count++; + } + } + return returnThing; + } else { + return addAttributeOfColumnType(statement, conceptType, valueType, cleanedToken, preprocessorConfig); + } + } + + public static Thing addAttributeOfColumnType(Thing statement, + String conceptType, + String valueType, + String cleanedValue, + DataConfigEntry.DataConfigGeneratorMapping.PreprocessorConfig preprocessorConfig) { if (preprocessorConfig != null) { cleanedValue = applyPreprocessor(cleanedValue, preprocessorConfig); } @@ -141,7 +263,135 @@ public static StatementInstance addAttributeOfColumnType(StatementInstance state return statement; } - private static String applyPreprocessor(String cleanedValue, DataConfigEntry.DataConfigGeneratorMapping.PreprocessorConfig preprocessorConfig) { + public static Relation addAttributeOfColumnType(Relation statement, + String conceptType, + String valueType, + String cleanedValue, + DataConfigEntry.DataConfigGeneratorMapping.PreprocessorConfig preprocessorConfig) { + if (preprocessorConfig != null) { + cleanedValue = applyPreprocessor(cleanedValue, preprocessorConfig); + } + + switch (valueType) { + case "string": + statement = statement.has(conceptType, cleanedValue); + break; + case "long": + try { + statement = statement.has(conceptType, Integer.parseInt(cleanedValue)); + } catch (NumberFormatException numberFormatException) { + dataLogger.warn("current row has column of type with non- value - skipping column"); + dataLogger.warn(numberFormatException.getMessage()); + } + break; + case "double": + try { + statement = statement.has(conceptType, Double.parseDouble(cleanedValue)); + } catch (NumberFormatException numberFormatException) { + dataLogger.warn("current row has column of type with non- value - skipping column"); + 
dataLogger.warn(numberFormatException.getMessage()); + } + break; + case "boolean": + if (cleanedValue.toLowerCase().equals("true")) { + statement = statement.has(conceptType, true); + } else if (cleanedValue.toLowerCase().equals("false")) { + statement = statement.has(conceptType, false); + } else { + dataLogger.warn("current row has column of type with non- value - skipping column"); + } + break; + case "datetime": + try { + DateTimeFormatter isoDateFormatter = DateTimeFormatter.ISO_DATE; + String[] dt = cleanedValue.split("T"); + LocalDate date = LocalDate.parse(dt[0], isoDateFormatter); + if (dt.length > 1) { + LocalTime time = LocalTime.parse(dt[1], DateTimeFormatter.ISO_TIME); + LocalDateTime dateTime = date.atTime(time); + statement = statement.has(conceptType, dateTime); + } else { + LocalDateTime dateTime = date.atStartOfDay(); + statement = statement.has(conceptType, dateTime); + } + } catch (DateTimeException dateTimeException) { + dataLogger.warn("current row has column of type with non- datetime value: "); + dataLogger.warn(dateTimeException.getMessage()); + } + break; + default: + dataLogger.warn("column type not valid - must be either: string, long, double, boolean, or datetime"); + break; + } + return statement; + } + + public static Thing addAttributeOfColumnType(UnboundVariable statement, + String conceptType, + String valueType, + String cleanedValue, + DataConfigEntry.DataConfigGeneratorMapping.PreprocessorConfig preprocessorConfig) { + if (preprocessorConfig != null) { + cleanedValue = applyPreprocessor(cleanedValue, preprocessorConfig); + } + Thing returnThing = null; + + switch (valueType) { + case "string": + returnThing = statement.has(conceptType, cleanedValue); + break; + case "long": + try { + returnThing = statement.has(conceptType, Integer.parseInt(cleanedValue)); + } catch (NumberFormatException numberFormatException) { + dataLogger.warn("current row has column of type with non- value - skipping column"); + dataLogger.warn(numberFormatException.getMessage()); + } + break; + case "double": + try { + returnThing = statement.has(conceptType, Double.parseDouble(cleanedValue)); + } catch (NumberFormatException numberFormatException) { + dataLogger.warn("current row has column of type with non- value - skipping column"); + dataLogger.warn(numberFormatException.getMessage()); + } + break; + case "boolean": + if (cleanedValue.toLowerCase().equals("true")) { + returnThing = statement.has(conceptType, true); + } else if (cleanedValue.toLowerCase().equals("false")) { + returnThing = statement.has(conceptType, false); + } else { + dataLogger.warn("current row has column of type with non- value - skipping column"); + } + break; + case "datetime": + try { + DateTimeFormatter isoDateFormatter = DateTimeFormatter.ISO_DATE; + String[] dt = cleanedValue.split("T"); + LocalDate date = LocalDate.parse(dt[0], isoDateFormatter); + if (dt.length > 1) { + LocalTime time = LocalTime.parse(dt[1], DateTimeFormatter.ISO_TIME); + LocalDateTime dateTime = date.atTime(time); + returnThing = statement.has(conceptType, dateTime); + } else { + LocalDateTime dateTime = date.atStartOfDay(); + returnThing = statement.has(conceptType, dateTime); + } + } catch (DateTimeException dateTimeException) { + dataLogger.warn("current row has column of type with non- datetime value: "); + dataLogger.warn(dateTimeException.getMessage()); + } + break; + default: + dataLogger.warn("column type not valid - must be either: string, long, double, boolean, or datetime"); + break; + } + return returnThing; + } + + 
private static String applyPreprocessor(String cleanedValue, + DataConfigEntry.DataConfigGeneratorMapping.PreprocessorConfig preprocessorConfig) { DataConfigEntry.DataConfigGeneratorMapping.PreprocessorConfig.PreprocessorParams params = preprocessorConfig.getParams(); String processorType = preprocessorConfig.getType(); switch (processorType) { @@ -152,7 +402,9 @@ private static String applyPreprocessor(String cleanedValue, DataConfigEntry.Dat } } - private static String applyRegexPreprocessor(String stringToProcess, String matchString, String replaceString) { + private static String applyRegexPreprocessor(String stringToProcess, + String matchString, + String replaceString) { RegexPreprocessor rpp = new RegexPreprocessor(matchString, replaceString); return rpp.applyProcessor(stringToProcess); } diff --git a/src/main/java/generator/InsertGenerator.java b/src/main/java/generator/InsertGenerator.java index e2984e3..2d51576 100644 --- a/src/main/java/generator/InsertGenerator.java +++ b/src/main/java/generator/InsertGenerator.java @@ -1,11 +1,12 @@ package generator; -import graql.lang.statement.Statement; +import graql.lang.pattern.variable.ThingVariable; import java.util.ArrayList; +import java.util.HashMap; public abstract class InsertGenerator { - public ArrayList<Statement> graknEntityInsert(ArrayList<String> rows, String header) { return null; }; - public ArrayList<ArrayList<ArrayList<Statement>>> graknRelationInsert(ArrayList<String> rows, String header) throws Exception { return null; }; - public ArrayList<ArrayList<ArrayList<Statement>>> graknAppendAttributeInsert(ArrayList<String> rows, String header) throws Exception { return null; }; + public ArrayList<ThingVariable<?>> graknEntityInsert(ArrayList<String> rows, String header) { return null; }; + public HashMap<String, ArrayList<ArrayList<ThingVariable<?>>>> graknRelationInsert(ArrayList<String> rows, String header) throws Exception { return null; }; + public HashMap<String, ArrayList<ArrayList<ThingVariable<?>>>> graknAppendAttributeInsert(ArrayList<String> rows, String header) throws Exception { return null; }; } diff --git a/src/main/java/generator/RelationInsertGenerator.java b/src/main/java/generator/RelationInsertGenerator.java index 2dc57a9..b795e52 100644 --- a/src/main/java/generator/RelationInsertGenerator.java +++ b/src/main/java/generator/RelationInsertGenerator.java @@ -9,8 +9,11 @@ import configuration.DataConfigEntry; import configuration.ProcessorConfigEntry; import graql.lang.Graql; -import graql.lang.statement.Statement; -import graql.lang.statement.StatementInstance; +import graql.lang.pattern.Pattern; +import graql.lang.pattern.variable.ThingVariable; +import graql.lang.pattern.variable.ThingVariable.Thing; +import graql.lang.pattern.variable.ThingVariable.Relation; +import graql.lang.pattern.variable.UnboundVariable; import org.apache.logging.log4j.LogManager; import org.apache.logging.log4j.Logger; @@ -30,16 +33,16 @@ public RelationInsertGenerator(DataConfigEntry dce, ProcessorConfigEntry process appLogger.debug("Creating RelationInsertGenerator for " + pce.getProcessor() + " of type " + pce.getProcessorType()); } - public ArrayList<ArrayList<ArrayList<Statement>>> graknRelationInsert(ArrayList<String> rows, String header) throws Exception { - ArrayList<ArrayList<ArrayList<Statement>>> matchInsertStatements = new ArrayList<>(); + public HashMap<String, ArrayList<ArrayList<ThingVariable<?>>>> graknRelationInsert(ArrayList<String> rows, String header) throws Exception { + HashMap<String, ArrayList<ArrayList<ThingVariable<?>>>> matchInsertStatements = new HashMap<>(); - ArrayList<ArrayList<Statement>> matchStatements = new ArrayList<>(); - ArrayList<ArrayList<Statement>> insertStatements = new ArrayList<>(); + ArrayList<ArrayList<ThingVariable<?>>> matchStatements = new ArrayList<>(); + ArrayList<ArrayList<ThingVariable<?>>> insertStatements = new ArrayList<>(); int insertCounter = 0; for (String row : rows) { - ArrayList<ArrayList<Statement>> tmp = graknRelationshipQueryFromRow(row, header, insertCounter); + 
ArrayList<ArrayList<ThingVariable<?>>> tmp = graknRelationshipQueryFromRow(row, header, insertCounter); if (tmp != null) { if (tmp.get(0) != null && tmp.get(1) != null) { matchStatements.add(tmp.get(0)); @@ -49,62 +52,72 @@ public ArrayList<ArrayList<ArrayList<Statement>>> graknRelationInsert(ArrayList< } } - matchInsertStatements.add(matchStatements); - matchInsertStatements.add(insertStatements); + matchInsertStatements.put("match", matchStatements); + matchInsertStatements.put("insert", insertStatements); return matchInsertStatements; } - public ArrayList<ArrayList<Statement>> graknRelationshipQueryFromRow(String row, String header, int insertCounter) throws Exception { + public ArrayList<ArrayList<ThingVariable<?>>> graknRelationshipQueryFromRow(String row, String header, int insertCounter) throws Exception { String fileSeparator = dce.getSeparator(); String[] rowTokens = row.split(fileSeparator); String[] columnNames = header.split(fileSeparator); appLogger.debug("processing tokenized row: " + Arrays.toString(rowTokens)); GeneratorUtil.malformedRow(row, rowTokens, columnNames.length); - ArrayList<Statement> miStatements = new ArrayList<>(createPlayerMatchAndInsert(rowTokens, columnNames, insertCounter)); - ArrayList<Statement> matchStatements = new ArrayList<>(miStatements.subList(0, miStatements.size() - 1)); - ArrayList<Statement> insertStatements = new ArrayList<>(); + ArrayList<ThingVariable<?>> miStatements = new ArrayList<>(createPlayerMatchAndInsert(rowTokens, columnNames, insertCounter)); - if (!matchStatements.isEmpty()) { - StatementInstance playersInsertStatement = (StatementInstance) miStatements.subList(miStatements.size() - 1, miStatements.size()).get(0); - StatementInstance assembledInsertStatement = relationInsert(playersInsertStatement); + if (miStatements.size() >= 1) { + ArrayList<ThingVariable<?>> matchStatements = new ArrayList<>(miStatements.subList(0, miStatements.size() - 1)); + ArrayList<ThingVariable<?>> insertStatements = new ArrayList<>(); - if (dce.getAttributes() != null) { - for (DataConfigEntry.DataConfigGeneratorMapping generatorMappingForAttribute : dce.getAttributes()) { - assembledInsertStatement = addAttribute(rowTokens, assembledInsertStatement, columnNames, generatorMappingForAttribute, pce, generatorMappingForAttribute.getPreprocessor()); + if (!matchStatements.isEmpty()) { + ThingVariable playersInsertStatement = miStatements.subList(miStatements.size() - 1, miStatements.size()).get(0); + Relation assembledInsertStatement = relationInsert((Relation) playersInsertStatement); + + if (dce.getAttributes() != null) { + for (DataConfigEntry.DataConfigGeneratorMapping generatorMappingForAttribute : dce.getAttributes()) { + assembledInsertStatement = addAttribute(rowTokens, assembledInsertStatement, columnNames, generatorMappingForAttribute, pce, generatorMappingForAttribute.getPreprocessor()); + } } - } - insertStatements.add(assembledInsertStatement); + insertStatements.add(assembledInsertStatement); - ArrayList<ArrayList<Statement>> assembledStatements = new ArrayList<>(); - assembledStatements.add(matchStatements); - assembledStatements.add(insertStatements); + ArrayList<ArrayList<ThingVariable<?>>> assembledStatements = new ArrayList<>(); + assembledStatements.add(matchStatements); + assembledStatements.add(insertStatements); - if (isValid(assembledStatements)) { - appLogger.debug("valid query: <" + assembleQuery(assembledStatements) + ">"); - return assembledStatements; + if (isValid(assembledStatements)) { +// System.out.println("valid query: <" + assembleQuery(assembledStatements) + ">"); + return assembledStatements; + } else { + dataLogger.warn("in datapath <" + dce.getDataPath() + ">: skipped row b/c does not have a proper statement or is missing required players or 
attributes. Faulty tokenized row: " + Arrays.toString(rowTokens)); + return null; + } } else { - dataLogger.warn("in datapath <" + dce.getDataPath() + ">: skipped row b/c does not have a proper statement or is missing required players or attributes. Faulty tokenized row: " + Arrays.toString(rowTokens)); + dataLogger.warn("in datapath <" + dce.getDataPath() + ">: skipped row b/c has 0 players. Faulty tokenized row: " + Arrays.toString(rowTokens)); return null; } } else { - dataLogger.warn("in datapath <" + dce.getDataPath() + ">: skipped row b/c has 0 players. Faulty tokenized row: " + Arrays.toString(rowTokens)); return null; } } - private String assembleQuery(ArrayList> queries) { + private String assembleQuery(ArrayList>> queries) { StringBuilder ret = new StringBuilder(); - for (Statement st : queries.get(0)) { - ret.append(st.toString()); + ret.append("match "); + for (Pattern st : queries.get(0)) { + ret.append(st.toString()).append("; "); } + ret.append("insert "); ret.append(queries.get(1).get(0).toString()); + ret.append(";"); return ret.toString(); } - private Collection createPlayerMatchAndInsert(String[] rowTokens, String[] columnNames, int insertCounter) { - ArrayList players = new ArrayList<>(); - Statement playersInsertStatement = Graql.var("rel-" + insertCounter); + private Collection> createPlayerMatchAndInsert(String[] rowTokens, String[] columnNames, int insertCounter) { + ArrayList> players = new ArrayList<>(); +// UnboundVariable relVariable = Graql.var("rel-" + insertCounter); + UnboundVariable relVariable = Graql.var("rel"); + ArrayList> relationStrings = new ArrayList<>(); int playerCounter = 0; // add Entity Players: @@ -114,7 +127,7 @@ private Collection createPlayerMatchAndInsert(String[] rowT String columnName = generatorMappingForPlayer.getColumnName(); int columnNameIndex = idxOf(columnNames, columnName); - if(columnNameIndex == -1) { + if (columnNameIndex == -1) { appLogger.error("The column header " + generatorMappingForPlayer.getColumnName() + " specified in your dataconfig cannot be found in the file you specified."); } @@ -122,22 +135,30 @@ private Collection createPlayerMatchAndInsert(String[] rowT !cleanToken(rowTokens[columnNameIndex]).isEmpty()) { // make sure that after cleaning, there is more than an empty string String currentCleanedToken = cleanToken(rowTokens[columnNameIndex]); String columnListSeparator = generatorMappingForPlayer.getListSeparator(); - if(columnListSeparator != null) { - for (String exploded: currentCleanedToken.split(columnListSeparator)) { - if(!cleanToken(exploded).isEmpty()) { + if (columnListSeparator != null) { + for (String exploded : currentCleanedToken.split(columnListSeparator)) { + if (!cleanToken(exploded).isEmpty()) { String currentExplodedCleanedToken = cleanToken(exploded); - String playerVariable = playerGenerator.getPlayerType() + "-" + playerCounter + "-" + insertCounter; +// String playerVariable = playerGenerator.getPlayerType() + "-" + playerCounter + "-" + insertCounter; + String playerVariable = playerGenerator.getPlayerType() + "-" + playerCounter; String playerRole = playerGenerator.getRoleType(); players.add(createPlayerMatchStatement(currentExplodedCleanedToken, playerGenerator, playerVariable, generatorMappingForPlayer.getPreprocessor())); - playersInsertStatement = playersInsertStatement.rel(playerRole, playerVariable); + ArrayList rel = new ArrayList<>(); + rel.add(playerRole); + rel.add(playerVariable); + relationStrings.add(rel); playerCounter++; } } } else { // single player, no 
columnListSeparator - String playerVariable = playerGenerator.getPlayerType() + "-" + playerCounter + "-" + insertCounter; +// String playerVariable = playerGenerator.getPlayerType() + "-" + playerCounter + "-" + insertCounter; + String playerVariable = playerGenerator.getPlayerType() + "-" + playerCounter; String playerRole = playerGenerator.getRoleType(); players.add(createPlayerMatchStatement(currentCleanedToken, playerGenerator, playerVariable, generatorMappingForPlayer.getPreprocessor())); - playersInsertStatement = playersInsertStatement.rel(playerRole, playerVariable); + ArrayList rel = new ArrayList<>(); + rel.add(playerRole); + rel.add(playerVariable); + relationStrings.add(rel); playerCounter++; } } @@ -153,7 +174,7 @@ private Collection createPlayerMatchAndInsert(String[] rowT String columnName = generatorMappingForRelationPlayer.getColumnName(); int columnNameIndex = idxOf(columnNames, columnName); - if(columnNameIndex == -1) { + if (columnNameIndex == -1) { appLogger.error("The column header " + generatorMappingForRelationPlayer.getColumnName() + " specified in your dataconfig cannot be found in the file you specified."); } @@ -161,41 +182,53 @@ private Collection createPlayerMatchAndInsert(String[] rowT !cleanToken(rowTokens[columnNameIndex]).isEmpty()) { String currentCleanedToken = cleanToken(rowTokens[columnNameIndex]); String columnListSeparator = generatorMappingForRelationPlayer.getListSeparator(); - if(columnListSeparator != null) { - for (String exploded: currentCleanedToken.split(columnListSeparator)) { - if(!cleanToken(exploded).isEmpty()) { + if (columnListSeparator != null) { + for (String exploded : currentCleanedToken.split(columnListSeparator)) { + if (!cleanToken(exploded).isEmpty()) { String currentExplodedCleanedToken = cleanToken(exploded); - String playerVariable = playerGenerator.getPlayerType() + "-" + playerCounter + "-" + insertCounter; +// String playerVariable = playerGenerator.getPlayerType() + "-" + playerCounter + "-" + insertCounter; + String playerVariable = playerGenerator.getPlayerType() + "-" + playerCounter; String playerRole = playerGenerator.getRoleType(); players.add(createRelationPlayerMatchStatementByAttribute(currentExplodedCleanedToken, playerGenerator, generatorMappingForRelationPlayer, playerVariable)); - playersInsertStatement = playersInsertStatement.rel(playerRole, playerVariable); + ArrayList rel = new ArrayList<>(); + rel.add(playerRole); + rel.add(playerVariable); + relationStrings.add(rel); playerCounter++; } } } else { // single player, no listSeparator - String playerVariable = playerGenerator.getPlayerType() + "-" + playerCounter + "-" + insertCounter; +// String playerVariable = playerGenerator.getPlayerType() + "-" + playerCounter + "-" + insertCounter; + String playerVariable = playerGenerator.getPlayerType() + "-" + playerCounter; String playerRole = playerGenerator.getRoleType(); players.add(createRelationPlayerMatchStatementByAttribute(currentCleanedToken, playerGenerator, generatorMappingForRelationPlayer, playerVariable)); - playersInsertStatement = playersInsertStatement.rel(playerRole, playerVariable); + ArrayList rel = new ArrayList<>(); + rel.add(playerRole); + rel.add(playerVariable); + relationStrings.add(rel); playerCounter++; } } - // if matching the relation player by players in that relation: + // if matching the relationStrings player by players in that relationStrings: } else if (generatorMappingForRelationPlayer.getMatchByPlayers().length > 0) { int[] columnNameIndices = indicesOf(columnNames, 
generatorMappingForRelationPlayer.getColumnNames()); for (int i : columnNameIndices) { - if(i == -1) { + if (i == -1) { appLogger.error("The column header " + generatorMappingForRelationPlayer.getColumnName() + " specified in your dataconfig cannot be found in the file you specified."); } } int maxColumnIndex = Arrays.stream(columnNameIndices).max().getAsInt(); if (rowTokens.length > maxColumnIndex) { - String playerVariable = playerGenerator.getPlayerType() + "-" + playerCounter + "-" + insertCounter; +// String playerVariable = playerGenerator.getPlayerType() + "-" + playerCounter + "-" + insertCounter; + String playerVariable = playerGenerator.getPlayerType() + "-" + playerCounter; String playerRole = playerGenerator.getRoleType(); players.addAll(createRelationPlayerMatchStatementByPlayers(rowTokens, columnNameIndices, playerGenerator, generatorMappingForRelationPlayer, playerVariable, insertCounter)); - playersInsertStatement = playersInsertStatement.rel(playerRole, playerVariable); + ArrayList rel = new ArrayList<>(); + rel.add(playerRole); + rel.add(playerVariable); + relationStrings.add(rel); playerCounter++; } } else { @@ -204,12 +237,22 @@ private Collection createPlayerMatchAndInsert(String[] rowT } } - players.add(playersInsertStatement); + if (relationStrings.size() >= 1) { + Relation returnRelation = relVariable.rel(relationStrings.get(0).get(0), relationStrings.get(0).get(1)); + for (ArrayList rel : relationStrings.subList(1, relationStrings.size())) { + returnRelation = returnRelation.rel(rel.get(0), rel.get(1)); + } + players.add(returnRelation); + } + return players; } - private StatementInstance createPlayerMatchStatement(String cleanedToken, ProcessorConfigEntry.ConceptGenerator playerGenerator, String playerVariable, DataConfigEntry.DataConfigGeneratorMapping.PreprocessorConfig preprocessorConfig) { - StatementInstance ms = Graql + private ThingVariable createPlayerMatchStatement(String cleanedToken, + ProcessorConfigEntry.ConceptGenerator playerGenerator, + String playerVariable, + DataConfigEntry.DataConfigGeneratorMapping.PreprocessorConfig preprocessorConfig) { + Thing ms = Graql .var(playerVariable) .isa(playerGenerator.getPlayerType()); String attributeType = playerGenerator.getUniquePlayerId(); @@ -218,8 +261,11 @@ private StatementInstance createPlayerMatchStatement(String cleanedToken, Proces return ms; } - private StatementInstance createRelationPlayerMatchStatementByAttribute(String cleanedToken, ProcessorConfigEntry.ConceptGenerator playerGenerator, DataConfigEntry.DataConfigGeneratorMapping dcm, String playerVariable) { - StatementInstance ms = Graql + private ThingVariable createRelationPlayerMatchStatementByAttribute(String cleanedToken, + ProcessorConfigEntry.ConceptGenerator playerGenerator, + DataConfigEntry.DataConfigGeneratorMapping dcm, + String playerVariable) { + Thing ms = Graql .var(playerVariable) .isa(playerGenerator.getPlayerType()); String attributeType = playerGenerator.getMatchByAttribute().get(dcm.getMatchByAttribute()).getAttributeType(); @@ -228,28 +274,33 @@ private StatementInstance createRelationPlayerMatchStatementByAttribute(String c return ms; } - private ArrayList createRelationPlayerMatchStatementByPlayers(String[] rowTokens, int[] columnNameIndices, ProcessorConfigEntry.ConceptGenerator playerGenerator, DataConfigEntry.DataConfigGeneratorMapping dcm, String playerVariable, int insertCounter) { - ArrayList assembledMatchStatements = new ArrayList<>(); - - Statement relationPlayerMatchStatement = Graql.var(playerVariable); + 
private ArrayList> createRelationPlayerMatchStatementByPlayers(String[] rowTokens, int[] columnNameIndices, ProcessorConfigEntry.ConceptGenerator playerGenerator, DataConfigEntry.DataConfigGeneratorMapping dcm, String playerVariable, int insertCounter) { + ArrayList> assembledMatchStatements = new ArrayList<>(); + UnboundVariable relVariable = Graql.var(playerVariable); + ArrayList> relationStrings = new ArrayList<>(); //match the n entites with their attributes int i = 0; for (int columnNameIndex : columnNameIndices) { if (!cleanToken(rowTokens[columnNameIndex]).isEmpty()) { String cleanedToken = cleanToken(rowTokens[columnNameIndex]); - String relationPlayerPlayerVariable = "relplayer-player-" + insertCounter + "-" + i; +// String relationPlayerPlayerVariable = "relplayer-player-" + insertCounter + "-" + i; + String relationPlayerPlayerVariable = "relplayer-player-" + i; String relationPlayerPlayerType = playerGenerator.getMatchByPlayer().get(dcm.getMatchByPlayers()[i]).getPlayerType(); String relationPlayerPlayerAttributeType = playerGenerator.getMatchByPlayer().get(dcm.getMatchByPlayers()[i]).getUniquePlayerId(); String relationPlayerPlayerAttributeValueType = playerGenerator.getMatchByPlayer().get(dcm.getMatchByPlayers()[i]).getIdValueType(); - StatementInstance relationPlayerCurrentPlayerMatchStatement = Graql.var(relationPlayerPlayerVariable).isa(relationPlayerPlayerType); + Thing relationPlayerCurrentPlayerMatchStatement = Graql.var(relationPlayerPlayerVariable).isa(relationPlayerPlayerType); relationPlayerCurrentPlayerMatchStatement = addAttributeOfColumnType(relationPlayerCurrentPlayerMatchStatement, relationPlayerPlayerAttributeType, relationPlayerPlayerAttributeValueType, cleanedToken, dcm.getPreprocessor()); assembledMatchStatements.add(relationPlayerCurrentPlayerMatchStatement); // here add the matched player to the relation statement (i.e.: (role: $variable)): String relationPlayerPlayerRole = playerGenerator.getMatchByPlayer().get(dcm.getMatchByPlayers()[i]).getRoleType(); - relationPlayerMatchStatement = relationPlayerMatchStatement.rel(relationPlayerPlayerRole, relationPlayerPlayerVariable); + ArrayList rel = new ArrayList<>(); + rel.add(relationPlayerPlayerRole); + rel.add(relationPlayerPlayerVariable); + relationStrings.add(rel); +// relationPlayerMatchStatement = relationPlayerMatchStatement.rel(relationPlayerPlayerRole, relationPlayerPlayerVariable); i++; } else { // this ensures that only relations in which all required players are present actually enter the match statement - empty list = skip of insert @@ -260,35 +311,36 @@ private ArrayList createRelationPlayerMatchStatementByPlayers(String[ } } - if (assembledMatchStatements.size() > 0) { - // complete the relation match statement & add to assembly: - relationPlayerMatchStatement = relationPlayerMatchStatement.isa(playerGenerator.getPlayerType()); - assembledMatchStatements.add(relationPlayerMatchStatement); + if (relationStrings.size() >= 1 && assembledMatchStatements.size() > 0) { + Relation returnRelation = relVariable.rel(relationStrings.get(0).get(0), relationStrings.get(0).get(1)); + for (ArrayList rel : relationStrings.subList(1, relationStrings.size())) { + returnRelation = returnRelation.rel(rel.get(0), rel.get(1)); + } + assembledMatchStatements.add(returnRelation.isa(playerGenerator.getPlayerType())); } return assembledMatchStatements; } - private StatementInstance relationInsert(StatementInstance si) { + private Relation relationInsert(Relation si) { if (si != null) { - si = 
si.isa(pce.getSchemaType()); - return si; + return si.isa(pce.getSchemaType()); } else { return null; } } - private boolean isValid(ArrayList> si) { - ArrayList matchStatements = si.get(0); - ArrayList insertStatements = si.get(1); + private boolean isValid(ArrayList>> si) { + ArrayList> matchStatements = si.get(0); + ArrayList> insertStatements = si.get(1); StringBuilder matchStatement = new StringBuilder(); - for (Statement st:matchStatements) { + for (Pattern st : matchStatements) { matchStatement.append(st.toString()); } String insertStatement = insertStatements.get(0).toString(); // missing required players - for (Map.Entry generatorEntry: pce.getRelationRequiredPlayers().entrySet()) { + for (Map.Entry generatorEntry : pce.getRelationRequiredPlayers().entrySet()) { if (!matchStatement.toString().contains("isa " + generatorEntry.getValue().getPlayerType())) { return false; } @@ -297,7 +349,7 @@ private boolean isValid(ArrayList> si) { } } // missing required attribute - for (Map.Entry generatorEntry: pce.getRequiredAttributes().entrySet()) { + for (Map.Entry generatorEntry : pce.getRequiredAttributes().entrySet()) { if (!insertStatement.contains("has " + generatorEntry.getValue().getAttributeType())) { return false; } diff --git a/src/main/java/insert/GraknInserter.java b/src/main/java/insert/GraknInserter.java index 2d33d1f..616bd95 100644 --- a/src/main/java/insert/GraknInserter.java +++ b/src/main/java/insert/GraknInserter.java @@ -1,228 +1,146 @@ package insert; import grakn.client.GraknClient; +import grakn.client.GraknClient.Transaction; +import grakn.client.GraknClient.Session; import graql.lang.Graql; +import graql.lang.pattern.variable.ThingVariable; import graql.lang.query.GraqlDefine; -import graql.lang.statement.Statement; +import graql.lang.query.GraqlInsert; import org.apache.logging.log4j.LogManager; import org.apache.logging.log4j.Logger; -import java.io.BufferedReader; -import java.io.FileInputStream; -import java.io.IOException; -import java.io.InputStreamReader; import java.util.ArrayList; -import java.util.List; -import java.util.concurrent.CompletableFuture; -import java.util.concurrent.ExecutionException; -import java.util.stream.Collectors; +import java.util.HashMap; +import java.util.concurrent.atomic.AtomicInteger; -import static graql.lang.Graql.parse; +import static util.Util.loadSchemaFromFile; public class GraknInserter { private final String schemaPath; - private final String keyspaceName; - private final String uri; + private final String databaseName; + private final String graknURI; private static final Logger appLogger = LogManager.getLogger("com.bayer.dt.grami"); - public GraknInserter(String uri, String port, String schemaPath, String keyspaceName) { + public GraknInserter(String graknURI, String port, String schemaPath, String databaseName) { this.schemaPath = schemaPath; - this.keyspaceName = keyspaceName; - this.uri = String.format("%s:%s", uri, port); + this.databaseName = databaseName; + this.graknURI = String.format("%s:%s", graknURI, port); } // Schema Operations - public GraknClient.Session setKeyspaceToSchema(GraknClient client, GraknClient.Session session) { - if (client.keyspaces().retrieve().contains(keyspaceName)) { - deleteKeyspace(client); - session = getSession(client); - } - String schema = loadSchemaFromFile(); - defineToGrakn(schema, session); - return session; + public void cleanAndDefineSchemaToDatabase(GraknClient client) { + deleteDatabaseIfExists(client); + createDatabase(client); + String schema = 
loadSchemaFromFile(schemaPath); + defineToGrakn(schema, client); } - public GraknClient.Session updateCurrentSchema(GraknClient client, GraknClient.Session session) { - String schema = loadSchemaFromFile(); - defineToGrakn(schema, session); - return session; + private void createDatabase(GraknClient client) { + client.databases().create(databaseName); } - public String loadSchemaFromFile() { - String schema=""; - try { - BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(this.schemaPath))); - StringBuilder sb = new StringBuilder(); - String line = br.readLine(); - while (line != null) { - sb.append(line).append("\n"); - line = br.readLine(); - } - schema = sb.toString(); - } catch (IOException e) { - e.printStackTrace(); - } - return schema; - } + private void defineToGrakn(String schemaAsString, GraknClient client) { + Session schemaSession = getSchemaSession(client); + GraqlDefine q = Graql.parseQuery(schemaAsString); - private void defineToGrakn(String insertString, GraknClient.Session session) { - GraknClient.Transaction writeTransaction = session.transaction().write(); - writeTransaction.execute((GraqlDefine) parse(insertString)); - writeTransaction.commit(); - appLogger.info("Successfully defined schema"); - } - // Entity Operations - public int insertEntityToGrakn(ArrayList statements, GraknClient.Session session) { - GraknClient.Transaction writeTransaction = session.transaction().write(); - int i = 0; - for (Statement st : statements) { - writeTransaction.execute(Graql.insert(st)); - i++; - } + Transaction writeTransaction = schemaSession.transaction(Transaction.Type.WRITE); + writeTransaction.query().define(q); writeTransaction.commit(); - appLogger.trace(String.format("Txn with ID: %s has committed and is closed", writeTransaction.toString())); - return i; - } + writeTransaction.close(); + schemaSession.close(); - public void futuresParallelInsertEntity(ArrayList insertStatements, GraknClient.Session session, int cores) throws ExecutionException, InterruptedException { - ArrayList> batches = createEvenEntityBatches(insertStatements, cores); - - List> futures = new ArrayList<>(); - for (ArrayList batch : batches) { - if (!batch.isEmpty()) { - CompletableFuture inserted = CompletableFuture.supplyAsync(() -> insertEntityToGrakn(batch, session)); - futures.add(inserted); - } - } - for (CompletableFuture f : futures) { - f.get(); - } + appLogger.info("Defined schema to database <" + databaseName + ">"); } - private static ArrayList> createEvenEntityBatches(ArrayList inserts, int cores) { - - ArrayList> batches = new ArrayList<>(); - ArrayList batchStrings = new ArrayList<>(); - - for (int i = 0; i < cores; i++) { - ArrayList batch = new ArrayList<>(); - batches.add(batch); - batchStrings.add(""); + public void matchInsertThreadedInserting(HashMap>>> statements, Session session, int threads, int batchSize) throws InterruptedException { + + AtomicInteger queryIndex = new AtomicInteger(0); + Thread[] ts = new Thread[threads]; + + Runnable matchInsertThread = + () -> { + ArrayList>> matchStatements = statements.get("match"); + ArrayList>> insertStatements = statements.get("insert"); + + while (queryIndex.get() < matchStatements.size()) { + try (Transaction tx = session.transaction(Transaction.Type.WRITE)) { + int q; + for (int i = 0; i < batchSize && (q = queryIndex.getAndIncrement()) < matchStatements.size(); i++) { + ArrayList> rowMatchStatements = matchStatements.get(q); + ArrayList> rowInsertStatements = insertStatements.get(q); + GraqlInsert query = 
Graql.match(rowMatchStatements).insert(rowInsertStatements); + tx.query().insert(query); + } + tx.commit(); + } + } + }; + + for (int i = 0; i < ts.length; i++) { + ts[i] = new Thread(matchInsertThread); } - - for (Statement insert: inserts) { - int shortest = getListIndexOfShortestString(batchStrings); - batches.get(shortest).add(insert); - String curString = batchStrings.get(shortest); - batchStrings.set(shortest, curString + insert.toString()); + for (Thread value : ts) { + value.start(); } - - appLogger.trace("entity batches.size: " + batches.size()); - appLogger.trace("bucket sizes:"); - for (String s : batchStrings) { - appLogger.trace(s.length()); + for (Thread thread : ts) { + thread.join(); } - return batches; } - // Relation Operations - public int insertMatchInsertToGrakn(ArrayList>> statements, GraknClient.Session session) { - - ArrayList> matchStatements = statements.get(0); - ArrayList> insertStatements = statements.get(1); - - GraknClient.Transaction writeTransaction = session.transaction().write(); - int i = 0; - for (int row = 0; row < matchStatements.size(); row++) { - ArrayList rowMatchStatements = matchStatements.get(row); - ArrayList rowInsertStatements = insertStatements.get(row); - writeTransaction.execute(Graql.match(rowMatchStatements).insert(rowInsertStatements)); - i++; + public void insertThreadedInserting(ArrayList> statements, Session session, int threads, int batchSize) throws InterruptedException { + + AtomicInteger queryIndex = new AtomicInteger(0); + Thread[] ts = new Thread[threads]; + + Runnable insertThread = + () -> { + while (queryIndex.get() < statements.size()) { + try (Transaction tx = session.transaction(Transaction.Type.WRITE)) { + int q; + for (int i = 0; i < batchSize && (q = queryIndex.getAndIncrement()) < statements.size(); i++) { + GraqlInsert query = Graql.insert(statements.get(q)); + tx.query().insert(query); + } + tx.commit(); + } + } + }; + for (int i = 0; i < ts.length; i++) { + ts[i] = new Thread(insertThread); } - writeTransaction.commit(); - appLogger.trace(String.format("Txn with ID: %s has committed and is closed", writeTransaction.toString())); - return i; - } - - public void futuresParallelInsertMatchInsert(ArrayList>> statements, GraknClient.Session session, int cores) throws ExecutionException, InterruptedException { - ArrayList>>> batches = createEvenMatchInsertBatches(statements, cores); - - List> futures = new ArrayList<>(); - for (ArrayList>> batch : batches) { - if (!batch.isEmpty()) { - CompletableFuture inserted = CompletableFuture.supplyAsync(() -> insertMatchInsertToGrakn(batch, session)); - futures.add(inserted); - } + for (Thread value : ts) { + value.start(); } - for (CompletableFuture f : futures) { - f.get(); + for (Thread thread : ts) { + thread.join(); } } - private static ArrayList>>> createEvenMatchInsertBatches(ArrayList>> statements, int cores) { - - ArrayList>>> batches = new ArrayList<>(); - ArrayList batchStrings = new ArrayList<>(); - - ArrayList> matchStatements = statements.get(0); - ArrayList> insertStatements = statements.get(1); - - for (int i = 0; i < cores; i++) { - ArrayList> matches = new ArrayList<>(); - ArrayList> inserts = new ArrayList<>(); - - ArrayList>> batch = new ArrayList<>(); - batch.add(matches); - batch.add(inserts); - - batches.add(batch); - batchStrings.add(""); - } - - for (int i = 0; i < matchStatements.size(); i++) { - int shortestBatchIndex = getListIndexOfShortestString(batchStrings); - // add matches - batches.get(shortestBatchIndex).get(0).add(matchStatements.get(i)); - // 
add inserts - batches.get(shortestBatchIndex).get(1).add(insertStatements.get(i)); - // update string for batchIndex - String curString = batchStrings.get(shortestBatchIndex); - String mat = matchStatements.get(i).stream().map(Statement::toString).collect(Collectors.joining(";")); - mat += insertStatements.get(i).get(0).toString(); - batchStrings.set(shortestBatchIndex, curString + mat); - } - - appLogger.trace("relation batches.size: " + batches.size()); - appLogger.trace("bucket sizes:"); - for (String s : batchStrings) { - appLogger.trace(s.length()); - } - - return batches; + // Utility functions + public Session getDataSession(GraknClient client) { + return client.session(databaseName, Session.Type.DATA); } - // Utility functions - public GraknClient.Session getSession(GraknClient client) { - return client.session(this.keyspaceName); + public Session getSchemaSession(GraknClient client) { + return client.session(databaseName, Session.Type.SCHEMA); } public GraknClient getClient() { - return new GraknClient(this.uri); + return GraknClient.core(graknURI); } - private void deleteKeyspace(GraknClient client) { - client.keyspaces().delete(this.keyspaceName); + private void deleteDatabaseIfExists(GraknClient client) { + if (client.databases().contains(databaseName)) { + client.databases().delete(databaseName); + } } - private static int getListIndexOfShortestString(ArrayList batchStrings) { - int shortest = 0; - for (String s : batchStrings) { - if (s.length() < batchStrings.get(shortest).length()) { - shortest = batchStrings.indexOf(s); - } - } - return shortest; + // used by command + public void loadAndDefineSchema(GraknClient client) { + String schema = loadSchemaFromFile(schemaPath); + defineToGrakn(schema, client); } } diff --git a/src/main/java/migrator/GraknMigrator.java b/src/main/java/migrator/GraknMigrator.java index 9bcea8e..7863fa4 100644 --- a/src/main/java/migrator/GraknMigrator.java +++ b/src/main/java/migrator/GraknMigrator.java @@ -4,16 +4,17 @@ import com.google.gson.reflect.TypeToken; import configuration.*; import generator.*; -import loader.DataLoader; import grakn.client.GraknClient; +import grakn.client.GraknClient.Session; +import graql.lang.pattern.variable.ThingVariable; +import loader.DataLoader; import insert.GraknInserter; -import graql.lang.statement.Statement; - import java.io.*; import java.lang.reflect.Type; +import java.math.RoundingMode; +import java.text.DecimalFormat; import java.util.ArrayList; import java.util.HashMap; - import org.apache.logging.log4j.LogManager; import org.apache.logging.log4j.Logger; @@ -24,7 +25,7 @@ public class GraknMigrator { private final String migrationStatePath; private boolean cleanAndMigrate = false; private HashMap migrationStatus; - private final GraknInserter gm; + private final GraknInserter graknInserter; private final MigrationConfig migrationConfig; private static final Logger appLogger = LogManager.getLogger("com.bayer.dt.grami"); @@ -33,7 +34,7 @@ public GraknMigrator(MigrationConfig migrationConfig, this.dataConfig = migrationConfig.getDataConfig(); this.migrationStatePath = migrationStatePath; this.migrationConfig = migrationConfig; - this.gm = new GraknInserter(migrationConfig.getGraknURI().split(":")[0], + this.graknInserter = new GraknInserter(migrationConfig.getGraknURI().split(":")[0], migrationConfig.getGraknURI().split(":")[1], migrationConfig.getSchemaPath(), migrationConfig.getKeyspace() @@ -52,26 +53,25 @@ public GraknMigrator(MigrationConfig migrationConfig, public void migrate(boolean 
migrateEntities, boolean migrateRelations, boolean migrateRelationRelations, boolean migrateAppendAttributes) throws IOException { - GraknClient client = gm.getClient(); - GraknClient.Session session = gm.getSession(client); - - getMigrationStatus(); + initializeMigrationStatus(); + GraknClient client = graknInserter.getClient(); if (cleanAndMigrate) { - session = gm.setKeyspaceToSchema(client, session); - appLogger.info("cleaned and reloaded keyspace"); + graknInserter.cleanAndDefineSchemaToDatabase(client); + appLogger.info("cleaned database and migrate schema..."); } else { - appLogger.info("continuing previous migration..."); + appLogger.info("using existing DB and schema to continue previous migration..."); } - migrateThingsInOrder(session, migrateEntities, migrateRelations, migrateRelationRelations, migrateAppendAttributes); + GraknClient.Session dataSession = graknInserter.getDataSession(client); + migrateThingsInOrder(dataSession, migrateEntities, migrateRelations, migrateRelationRelations, migrateAppendAttributes); - session.close(); + dataSession.close(); client.close(); appLogger.info("GraMi is finished migrating your stuff!"); } - private void migrateThingsInOrder(GraknClient.Session session, boolean migrateEntities, boolean migrateRelations, boolean migrateRelationRelations, boolean migrateAppendAttributes) throws IOException { + private void migrateThingsInOrder(Session session, boolean migrateEntities, boolean migrateRelations, boolean migrateRelationRelations, boolean migrateAppendAttributes) throws IOException { if (migrateEntities) { appLogger.info("migrating entities..."); getStatusAndMigrate(session, "entity"); @@ -94,7 +94,7 @@ private void migrateThingsInOrder(GraknClient.Session session, boolean migrateEn } } - private void getStatusAndMigrate(GraknClient.Session session, String processorType) throws IOException { + private void getStatusAndMigrate(Session session, String processorType) throws IOException { for (String dcEntryKey : dataConfig.keySet()) { DataConfigEntry dce = dataConfig.get(dcEntryKey); String currentProcessor = dce.getProcessor(); @@ -125,7 +125,7 @@ private boolean isOfProcessorType(String key, String conceptType) { return false; } - private void getGeneratorAndInsert(GraknClient.Session session, DataConfigEntry dce, int skipRows) throws IOException { + private void getGeneratorAndInsert(Session session, DataConfigEntry dce, int skipRows) throws IOException { // choose insert generator InsertGenerator gen = getProcessor(dce); @@ -133,13 +133,17 @@ private void getGeneratorAndInsert(GraknClient.Session session, DataConfigEntry updateMigrationStatusIsCompleted(dce); } - private void writeThingToGrakn(DataConfigEntry dce, InsertGenerator gen, GraknClient.Session session, int skipLines) { + private void writeThingToGrakn(DataConfigEntry dce, InsertGenerator gen, Session session, int skipLines) { + + appLogger.info("inserting using " + dce.getThreads() + " threads" + " with thread commit size of " + dce.getBatchSize() + " rows"); + InputStream entityStream = DataLoader.getInputStream(dce.getDataPath()); String header = ""; ArrayList rows = new ArrayList<>(); String line; int batchSizeCounter = 0; int totalRecordCounter = 0; + double timerStart = System.currentTimeMillis(); if (entityStream != null) { try (BufferedReader br = new BufferedReader(new InputStreamReader(entityStream))) { @@ -158,61 +162,56 @@ private void writeThingToGrakn(DataConfigEntry dce, InsertGenerator gen, GraknCl // insert Batch once chunk size is reached rows.add(line); 
batchSizeCounter++; - if (batchSizeCounter == dce.getBatchSize()) { +// if (batchSizeCounter == dce.getBatchSize()) { + if (batchSizeCounter == dce.getBatchSize() * dce.getThreads()) { + System.out.print("+"); + System.out.flush(); writeThing(dce, gen, session, rows, batchSizeCounter, header); batchSizeCounter = 0; rows.clear(); } // logging if (totalRecordCounter % 50000 == 0) { - appLogger.info("progress: # rows processed so far (k): " + totalRecordCounter/1000); + System.out.println(); + appLogger.info("processed " + totalRecordCounter/1000 + "k rows"); } } //insert the rest when loop exits with less than batch size if (!rows.isEmpty()) { writeThing(dce, gen, session, rows, batchSizeCounter, header); - appLogger.info("final # rows processed: " + totalRecordCounter); + if (totalRecordCounter % 50000 != 0) { + System.out.println(); + } } + + appLogger.info("final # rows processed: " + totalRecordCounter); + appLogger.info(logInsertRate(timerStart, totalRecordCounter)); + + } catch (IOException e) { e.printStackTrace(); } } } - private void writeThing(DataConfigEntry dce, InsertGenerator gen, GraknClient.Session session, ArrayList rows, int lineCounter, String header) throws IOException { - int cores = dce.getThreads(); + + + private void writeThing(DataConfigEntry dce, InsertGenerator gen, Session session, ArrayList rows, int lineCounter, String header) throws IOException { + int threads = dce.getThreads(); try { if (isOfProcessorType(dce.getProcessor(), "entity")) { - ArrayList insertStatements = gen.graknEntityInsert(rows, header); + ArrayList> insertStatements = gen.graknEntityInsert(rows, header); appLogger.trace("number of generated insert Statements: " + insertStatements.size()); - if (cores > 1) { - appLogger.debug("inserting using " + cores + " threads"); - gm.futuresParallelInsertEntity(insertStatements, session, cores); - } else { - appLogger.debug("inserting using 1 thread"); - gm.insertEntityToGrakn(insertStatements, session); - } + graknInserter.insertThreadedInserting(insertStatements, session, threads, dce.getBatchSize()); } else if (isOfProcessorType(dce.getProcessor(), "relation") || isOfProcessorType(dce.getProcessor(), "relation-with-relation")) { - ArrayList>> statements = gen.graknRelationInsert(rows, header); - appLogger.trace("number of generated insert Statements: " + statements.get(0).size()); - if (cores > 1) { - appLogger.debug("inserting using " + cores + " threads"); - gm.futuresParallelInsertMatchInsert(statements, session, cores); - } else { - appLogger.debug("inserting using 1 thread"); - gm.insertMatchInsertToGrakn(statements, session); - } + HashMap>>> statements = gen.graknRelationInsert(rows, header); + appLogger.trace("number of generated insert Statements: " + statements.get("match").size()); + graknInserter.matchInsertThreadedInserting(statements, session, threads, dce.getBatchSize()); } else if (isOfProcessorType(dce.getProcessor(), "append-attribute")) { - ArrayList>> statements = gen.graknAppendAttributeInsert(rows, header); - appLogger.trace("number of generated insert Statements: " + statements.get(0).size()); - if (cores > 1) { - appLogger.debug("inserting using " + cores + " threads"); - gm.futuresParallelInsertMatchInsert(statements, session, cores); - } else { - appLogger.debug("inserting using 1 thread"); - gm.insertMatchInsertToGrakn(statements, session); - } + HashMap>>> statements = gen.graknAppendAttributeInsert(rows, header); + appLogger.trace("number of generated insert Statements: " + statements.get("match").size()); + 
graknInserter.matchInsertThreadedInserting(statements, session, threads, dce.getBatchSize()); } else { throw new IllegalArgumentException("the processor <" + dce.getProcessor() + "> is not known"); } @@ -226,7 +225,7 @@ private void clearMigrationStatusFile() throws IOException { new FileWriter(migrationStatePath, false).close(); } - private void getMigrationStatus() { + private void initializeMigrationStatus() { BufferedReader bufferedReader; try { bufferedReader = new BufferedReader(new FileReader(migrationStatePath)); @@ -306,4 +305,17 @@ private ProcessorConfigEntry getGenFromGenConfig(String processor, HashMap 0) { + return "insert rate inserts/second: " + df.format((totalRecordCounter / secs)); + } else { + return "insert rate inserts/second: superfast"; + } + + } + } diff --git a/src/main/java/migrator/SchemaUpdater.java b/src/main/java/migrator/SchemaUpdater.java index 5b90c5b..31c8cc3 100644 --- a/src/main/java/migrator/SchemaUpdater.java +++ b/src/main/java/migrator/SchemaUpdater.java @@ -1,6 +1,5 @@ package migrator; -import configuration.MigrationConfig; import configuration.SchemaUpdateConfig; import grakn.client.GraknClient; import insert.GraknInserter; @@ -22,11 +21,9 @@ public SchemaUpdater(SchemaUpdateConfig suConfig) { public void updateSchema() { GraknClient client = gm.getClient(); - GraknClient.Session session = gm.getSession(client); appLogger.info("applying schema to existing schema"); - gm.updateCurrentSchema(client, session); + gm.loadAndDefineSchema(client); appLogger.info("GraMi is finished applying your schema!"); - session.close(); client.close(); } } diff --git a/src/main/java/util/Util.java b/src/main/java/util/Util.java index 17c78d0..6771d53 100644 --- a/src/main/java/util/Util.java +++ b/src/main/java/util/Util.java @@ -1,10 +1,38 @@ package util; import java.io.*; +import java.util.ArrayList; public class Util { public static String getAbsPath(String p) { File file = new File(p); return file.getAbsolutePath(); } + + public static int getListIndexOfShortestString(ArrayList batchStrings) { + int shortest = 0; + for (String s : batchStrings) { + if (s.length() < batchStrings.get(shortest).length()) { + shortest = batchStrings.indexOf(s); + } + } + return shortest; + } + + public static String loadSchemaFromFile(String schemaPath) { + String schema=""; + try { + BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(schemaPath))); + StringBuilder sb = new StringBuilder(); + String line = br.readLine(); + while (line != null) { + sb.append(line).append("\n"); + line = br.readLine(); + } + schema = sb.toString(); + } catch (IOException e) { + e.printStackTrace(); + } + return schema; + } } diff --git a/src/test/java/cli/GramiCLITest.java b/src/test/java/cli/GramiCLITest.java index 4ce8390..0ce6bb6 100644 --- a/src/test/java/cli/GramiCLITest.java +++ b/src/test/java/cli/GramiCLITest.java @@ -14,12 +14,12 @@ public class GramiCLITest { public void migrateTest() { String[] args = { "migrate", - "-d", "src/test/resources/phone-calls/dataConfig.json", - "-p", "src/test/resources/phone-calls/processorConfig.json", - "-m", "src/test/resources/phone-calls/migrationStatus.json", + "-dc", "src/test/resources/phone-calls/dataConfig.json", + "-pc", "src/test/resources/phone-calls/processorConfig.json", + "-ms", "src/test/resources/phone-calls/migrationStatus.json", "-s", "src/test/resources/phone-calls/schema.gql", - "-k", "grami_cli_test", - "-g", "127.0.0.1:48555", + "-db", "grami_cli_test", + "-g", "127.0.0.1:1729", "-cm" }; @@ -41,8 +41,8 @@ 
public void updateTest() { String[] args = { "schema-update", "-s", "src/test/resources/phone-calls/schema-updated.gql", - "-k", "grami_cli_test", - "-g", "127.0.0.1:48555", + "-db", "grami_cli_test", + "-g", "127.0.0.1:1729", }; GramiCLI grami = new GramiCLI(); @@ -57,6 +57,6 @@ public void updateTest() { assertTrue(sw.toString().contains("GraMi schema-update")); assertTrue(sw.toString().contains("schema-updated.gql")); assertTrue(sw.toString().contains("grami_cli_test")); - assertTrue(sw.toString().contains("48555")); + assertTrue(sw.toString().contains("1729")); } } diff --git a/src/test/java/generator/EntityInsertGeneratorTest.java b/src/test/java/generator/EntityInsertGeneratorTest.java index 58e4825..a264eb1 100644 --- a/src/test/java/generator/EntityInsertGeneratorTest.java +++ b/src/test/java/generator/EntityInsertGeneratorTest.java @@ -2,9 +2,10 @@ import configuration.MigrationConfig; import configuration.ProcessorConfigEntry; -import graql.lang.statement.Statement; +import graql.lang.pattern.variable.ThingVariable; import org.junit.Assert; import org.junit.Test; + import static util.Util.getAbsPath; import java.util.ArrayList; import java.util.HashMap; @@ -20,7 +21,7 @@ public class EntityInsertGeneratorTest { private final String entity1dp = getAbsPath("src/test/resources/genericTests/entity1-test-data.tsv"); private final String entity2dp = getAbsPath("src/test/resources/genericTests/entity2-test-data.tsv"); private final String entity3dp = getAbsPath("src/test/resources/genericTests/entity3-test-data.tsv"); - private final MigrationConfig migrationConfig = new MigrationConfig("localhost:48555",keyspaceName, asp, adcp, gcp); + private final MigrationConfig migrationConfig = new MigrationConfig("localhost:1729",keyspaceName, asp, adcp, gcp); private final HashMap> genConf = migrationConfig.getProcessorConfig(); @Test @@ -32,66 +33,66 @@ public void graknEntityQueryFromRowTest() { String header = rows.get(0); rows = new ArrayList<>(rows.subList(1, rows.size())); - ArrayList < Statement > result = testEntityInsertGenerator.graknEntityInsert(rows, header); + ArrayList > result = testEntityInsertGenerator.graknEntityInsert(rows, header); - String tc0 = "$e-0 isa entity1, has entity1-id \"entity1id0\", has entity1-name \"entity1name0\", has entity1-exp \"entity1id0exp0\";"; + String tc0 = "$e-0 isa entity1, has entity1-id \"entity1id0\", has entity1-name \"entity1name0\", has entity1-exp \"entity1id0exp0\""; Assert.assertEquals(tc0, result.get(0).toString()); - String tc1 = "$e-1 isa entity1, has entity1-id \"entity1id1\", has entity1-name \"entity1name1\", has entity1-exp \"entity1id1exp11\", has entity1-exp \"entity1id1exp12\";"; + String tc1 = "$e-1 isa entity1, has entity1-id \"entity1id1\", has entity1-name \"entity1name1\", has entity1-exp \"entity1id1exp11\", has entity1-exp \"entity1id1exp12\""; Assert.assertEquals(tc1, result.get(1).toString()); - String tc2 = "$e-2 isa entity1, has entity1-id \"entity1id2\", has entity1-name \"entity1name2\", has entity1-exp \"entity1id2exp21\", has entity1-exp \"entity1id2exp22\", has entity1-exp \"entity1id2exp23\";"; + String tc2 = "$e-2 isa entity1, has entity1-id \"entity1id2\", has entity1-name \"entity1name2\", has entity1-exp \"entity1id2exp21\", has entity1-exp \"entity1id2exp22\", has entity1-exp \"entity1id2exp23\""; Assert.assertEquals(tc2, result.get(2).toString()); - String tc3 = "$e-3 isa entity1, has entity1-id \"entity1id3\", has entity1-name \"entity1name3\";"; + String tc3 = "$e-3 isa entity1, has entity1-id 
\"entity1id3\", has entity1-name \"entity1name3\""; Assert.assertEquals(tc3, result.get(3).toString()); - String tc4 = "$e-4 isa entity1, has entity1-id \"entity1id4\", has entity1-name \"entity1name4\";"; + String tc4 = "$e-4 isa entity1, has entity1-id \"entity1id4\", has entity1-name \"entity1name4\""; Assert.assertEquals(tc4, result.get(4).toString()); - String tc5 = "$e-5 isa entity1, has entity1-id \"entity1id5\", has entity1-name \"entity1name5\";"; + String tc5 = "$e-5 isa entity1, has entity1-id \"entity1id5\", has entity1-name \"entity1name5\""; Assert.assertEquals(tc5, result.get(5).toString()); - String tc6 = "$e-6 isa entity1, has entity1-id \"entity1id6\", has entity1-name \"entity1name6\";"; + String tc6 = "$e-6 isa entity1, has entity1-id \"entity1id6\", has entity1-name \"entity1name6\""; Assert.assertEquals(tc6, result.get(6).toString()); - String tc7 = "$e-7 isa entity1, has entity1-id \"entity1id7\", has entity1-name \"entity1name7\";"; + String tc7 = "$e-7 isa entity1, has entity1-id \"entity1id7\", has entity1-name \"entity1name7\""; Assert.assertEquals(tc7, result.get(7).toString()); - String tc8 = "$e-8 isa entity1, has entity1-id \"entity1id8\", has entity1-name \"entity1name8\";"; + String tc8 = "$e-8 isa entity1, has entity1-id \"entity1id8\", has entity1-name \"entity1name8\""; Assert.assertEquals(tc8, result.get(8).toString()); - String tc9 = "$e-9 isa entity1, has entity1-id \"entity1id9\", has entity1-name \"entity1name9\";"; + String tc9 = "$e-9 isa entity1, has entity1-id \"entity1id9\", has entity1-name \"entity1name9\""; Assert.assertEquals(tc9, result.get(9).toString()); - String tc10 = "$e-10 isa entity1, has entity1-id \"entity1id10\", has entity1-name \"entity1name10\";"; + String tc10 = "$e-10 isa entity1, has entity1-id \"entity1id10\", has entity1-name \"entity1name10\""; Assert.assertEquals(tc10, result.get(10).toString()); - String tc11 = "$e-11 isa entity1, has entity1-id \"entity1id11\", has entity1-name \"entity1name11\";"; + String tc11 = "$e-11 isa entity1, has entity1-id \"entity1id11\", has entity1-name \"entity1name11\""; Assert.assertEquals(tc11, result.get(11).toString()); - String tc12 = "$e-12 isa entity1, has entity1-id \"entity1id12\", has entity1-name \"entity1name12\";"; + String tc12 = "$e-12 isa entity1, has entity1-id \"entity1id12\", has entity1-name \"entity1name12\""; Assert.assertEquals(tc12, result.get(12).toString()); - String tc13 = "$e-13 isa entity1, has entity1-id \"entity1id13\", has entity1-name \"entity1name13\";"; + String tc13 = "$e-13 isa entity1, has entity1-id \"entity1id13\", has entity1-name \"entity1name13\""; Assert.assertEquals(tc13, result.get(13).toString()); - String tc14 = "$e-14 isa entity1, has entity1-id \"entity1id14\", has entity1-name \"entity1name14\";"; + String tc14 = "$e-14 isa entity1, has entity1-id \"entity1id14\", has entity1-name \"entity1name14\""; Assert.assertEquals(tc14, result.get(14).toString()); - String tc15 = "$e-15 isa entity1, has entity1-id \"entity1id15\", has entity1-name \"entity1name15\";"; + String tc15 = "$e-15 isa entity1, has entity1-id \"entity1id15\", has entity1-name \"entity1name15\""; Assert.assertEquals(tc15, result.get(15).toString()); - String tc16 = "$e-16 isa entity1, has entity1-id \"entity1id16\", has entity1-name \"entity1name16\", has entity1-name \"entity1name16-2\";"; + String tc16 = "$e-16 isa entity1, has entity1-id \"entity1id16\", has entity1-name \"entity1name16\", has entity1-name \"entity1name16-2\""; Assert.assertEquals(tc16, 
result.get(16).toString()); - String tc17 = "$e-17 isa entity1, has entity1-id \"entity1id17\", has entity1-name \"entity1name17\";"; + String tc17 = "$e-17 isa entity1, has entity1-id \"entity1id17\", has entity1-name \"entity1name17\""; Assert.assertEquals(tc17, result.get(17).toString()); - String tc18 = "$e-18 isa entity1, has entity1-id \"entity1id18\", has entity1-name \"entity1name18\";"; + String tc18 = "$e-18 isa entity1, has entity1-id \"entity1id18\", has entity1-name \"entity1name18\""; Assert.assertEquals(tc18, result.get(18).toString()); - String tc19 = "$e-19 isa entity1, has entity1-id \"entity1id19\", has entity1-name \"entity1name19\";"; + String tc19 = "$e-19 isa entity1, has entity1-id \"entity1id19\", has entity1-name \"entity1name19\""; Assert.assertEquals(tc19, result.get(19).toString()); Assert.assertEquals(20, result.size()); @@ -107,39 +108,39 @@ public void graknEntityQueryFromRowWithBoolAndDoubleTest() { String header = rows.get(0); rows = new ArrayList<>(rows.subList(1, rows.size())); - ArrayList result = testEntityInsertGenerator.graknEntityInsert(rows, header); + ArrayList> result = testEntityInsertGenerator.graknEntityInsert(rows, header); - String tc0 = "$e-0 isa entity2, has entity2-id \"entity2id0\", has entity2-bool true, has entity2-double 0.0;"; + String tc0 = "$e-0 isa entity2, has entity2-id \"entity2id0\", has entity2-bool true, has entity2-double 0.0"; Assert.assertEquals(tc0, result.get(0).toString()); - String tc1 = "$e-1 isa entity2, has entity2-id \"entity2id1\", has entity2-bool false, has entity2-double 1.1, has entity2-double 11.11;"; + String tc1 = "$e-1 isa entity2, has entity2-id \"entity2id1\", has entity2-bool false, has entity2-double 1.1, has entity2-double 11.11"; Assert.assertEquals(tc1, result.get(1).toString()); - String tc2 = "$e-2 isa entity2, has entity2-id \"entity2id2\", has entity2-bool true, has entity2-double 2.2;"; + String tc2 = "$e-2 isa entity2, has entity2-id \"entity2id2\", has entity2-bool true, has entity2-double 2.2"; Assert.assertEquals(tc2, result.get(2).toString()); - String tc3 = "$e-3 isa entity2, has entity2-id \"entity2id3\", has entity2-bool false, has entity2-double -3.3;"; + String tc3 = "$e-3 isa entity2, has entity2-id \"entity2id3\", has entity2-bool false, has entity2-double -3.3"; Assert.assertEquals(tc3, result.get(3).toString()); - String tc4 = "$e-4 isa entity2, has entity2-id \"entity2id4\", has entity2-double 4.0;"; + String tc4 = "$e-4 isa entity2, has entity2-id \"entity2id4\", has entity2-double 4.0"; Assert.assertEquals(tc4, result.get(4).toString()); - String tc5 = "$e-5 isa entity2, has entity2-id \"entity2id5\";"; + String tc5 = "$e-5 isa entity2, has entity2-id \"entity2id5\""; Assert.assertEquals(tc5, result.get(5).toString()); - String tc6 = "$e-6 isa entity2, has entity2-id \"entity2id6\";"; + String tc6 = "$e-6 isa entity2, has entity2-id \"entity2id6\""; Assert.assertEquals(tc6, result.get(6).toString()); - String tc7 = "$e-7 isa entity2, has entity2-id \"entity2id7\";"; + String tc7 = "$e-7 isa entity2, has entity2-id \"entity2id7\""; Assert.assertEquals(tc7, result.get(7).toString()); - String tc8 = "$e-8 isa entity2, has entity2-id \"entity2id8\";"; + String tc8 = "$e-8 isa entity2, has entity2-id \"entity2id8\""; Assert.assertEquals(tc8, result.get(8).toString()); - String tc9 = "$e-9 isa entity2, has entity2-id \"entity2id9\";"; + String tc9 = "$e-9 isa entity2, has entity2-id \"entity2id9\""; Assert.assertEquals(tc9, result.get(9).toString()); - String tc10 = "$e-10 isa 
entity2, has entity2-id \"entity2id10\";"; + String tc10 = "$e-10 isa entity2, has entity2-id \"entity2id10\""; Assert.assertEquals(tc10, result.get(10).toString()); Assert.assertEquals(11, result.size()); @@ -154,39 +155,39 @@ public void graknEntityQueryFromRowWithLongTest() { String header = rows.get(0); rows = new ArrayList<>(rows.subList(1, rows.size())); - ArrayList result = testEntityInsertGenerator.graknEntityInsert(rows, header); + ArrayList> result = testEntityInsertGenerator.graknEntityInsert(rows, header); - String tc0 = "$e-0 isa entity3, has entity3-id \"entity3id0\", has entity3-int 0;"; + String tc0 = "$e-0 isa entity3, has entity3-id \"entity3id0\", has entity3-int 0"; Assert.assertEquals(tc0, result.get(0).toString()); - String tc1 = "$e-1 isa entity3, has entity3-id \"entity3id1\", has entity3-int 1, has entity3-int 11;"; + String tc1 = "$e-1 isa entity3, has entity3-id \"entity3id1\", has entity3-int 1, has entity3-int 11"; Assert.assertEquals(tc1, result.get(1).toString()); - String tc2 = "$e-2 isa entity3, has entity3-id \"entity3id2\", has entity3-int 2;"; + String tc2 = "$e-2 isa entity3, has entity3-id \"entity3id2\", has entity3-int 2"; Assert.assertEquals(tc2, result.get(2).toString()); - String tc3 = "$e-3 isa entity3, has entity3-id \"entity3id3\", has entity3-int -3;"; + String tc3 = "$e-3 isa entity3, has entity3-id \"entity3id3\", has entity3-int -3"; Assert.assertEquals(tc3, result.get(3).toString()); - String tc4 = "$e-4 isa entity3, has entity3-id \"entity3id4\";"; + String tc4 = "$e-4 isa entity3, has entity3-id \"entity3id4\""; Assert.assertEquals(tc4, result.get(4).toString()); - String tc5 = "$e-5 isa entity3, has entity3-id \"entity3id5\";"; + String tc5 = "$e-5 isa entity3, has entity3-id \"entity3id5\""; Assert.assertEquals(tc5, result.get(5).toString()); - String tc6 = "$e-6 isa entity3, has entity3-id \"entity3id6\";"; + String tc6 = "$e-6 isa entity3, has entity3-id \"entity3id6\""; Assert.assertEquals(tc6, result.get(6).toString()); - String tc7 = "$e-7 isa entity3, has entity3-id \"entity3id7\";"; + String tc7 = "$e-7 isa entity3, has entity3-id \"entity3id7\""; Assert.assertEquals(tc7, result.get(7).toString()); - String tc8 = "$e-8 isa entity3, has entity3-id \"entity3id8\";"; + String tc8 = "$e-8 isa entity3, has entity3-id \"entity3id8\""; Assert.assertEquals(tc8, result.get(8).toString()); - String tc9 = "$e-9 isa entity3, has entity3-id \"entity3id9\";"; + String tc9 = "$e-9 isa entity3, has entity3-id \"entity3id9\""; Assert.assertEquals(tc9, result.get(9).toString()); - String tc10 = "$e-10 isa entity3, has entity3-id \"entity3id10\";"; + String tc10 = "$e-10 isa entity3, has entity3-id \"entity3id10\""; Assert.assertEquals(tc10, result.get(10).toString()); Assert.assertEquals(11, result.size()); diff --git a/src/test/java/generator/RelationInsertGeneratorTest.java b/src/test/java/generator/RelationInsertGeneratorTest.java index 6576651..a3102d1 100644 --- a/src/test/java/generator/RelationInsertGeneratorTest.java +++ b/src/test/java/generator/RelationInsertGeneratorTest.java @@ -2,7 +2,7 @@ import configuration.MigrationConfig; import configuration.ProcessorConfigEntry; -import graql.lang.statement.Statement; +import graql.lang.pattern.variable.ThingVariable; import org.junit.Assert; import org.junit.Test; @@ -33,49 +33,49 @@ public void graknRelationQueryFromRowTest() throws Exception { String header = rows.get(0); rows = new ArrayList<>(rows.subList(1, rows.size())); - ArrayList>> result = 
testRelationInsertGenerator.graknRelationInsert(rows, header); + HashMap>>> result = testRelationInsertGenerator.graknRelationInsert(rows, header); // test all there - String tc2m = "$entity1-0-2 isa entity1, has entity1-id \"entity1id1\";$entity2-1-2 isa entity2, has entity2-id \"entity2id1\";$entity3-2-2 isa entity3, has entity3-id \"entity3id1\";"; - Assert.assertEquals(tc2m, concatMatches(result.get(0).get(2))); - String tc2i = "$rel-2 (player-one: $entity1-0-2, player-two: $entity2-1-2, player-optional: $entity3-2-2) isa rel1, has relAt-1 \"att2\", has relAt-2 \"opt2\";"; - Assert.assertEquals(tc2i, result.get(1).get(2).get(0).toString()); + String tc2m = "$entity1-0 isa entity1, has entity1-id \"entity1id1\";$entity2-1 isa entity2, has entity2-id \"entity2id1\";$entity3-2 isa entity3, has entity3-id \"entity3id1\";"; + Assert.assertEquals(tc2m, concatMatches(result.get("match").get(2))); + String tc2i = "$rel (player-one: $entity1-0, player-two: $entity2-1, player-optional: $entity3-2) isa rel1, has relAt-1 \"att2\", has relAt-2 \"opt2\""; + Assert.assertEquals(tc2i, result.get("insert").get(2).get(0).toString()); // test no optional player & no optional attribute - String tc15m = "$entity1-0-15 isa entity1, has entity1-id \"entity1id1\";$entity2-1-15 isa entity2, has entity2-id \"entity2id1\";"; - Assert.assertEquals(tc15m, concatMatches(result.get(0).get(15))); - String tc15i = "$rel-15 (player-one: $entity1-0-15, player-two: $entity2-1-15) isa rel1, has relAt-1 \"att15\";"; - Assert.assertEquals(tc15i, result.get(1).get(15).get(0).toString()); + String tc15m = "$entity1-0 isa entity1, has entity1-id \"entity1id1\";$entity2-1 isa entity2, has entity2-id \"entity2id1\";"; + Assert.assertEquals(tc15m, concatMatches(result.get("match").get(15))); + String tc15i = "$rel (player-one: $entity1-0, player-two: $entity2-1) isa rel1, has relAt-1 \"att15\""; + Assert.assertEquals(tc15i, result.get("insert").get(15).get(0).toString()); // test attribute explosion - String tc0m = "$entity1-0-0 isa entity1, has entity1-id \"entity1id1\";$entity2-1-0 isa entity2, has entity2-id \"entity2id1\";$entity3-2-0 isa entity3, has entity3-id \"entity3id1\";"; - Assert.assertEquals(tc0m, concatMatches(result.get(0).get(0))); - String tc0i = "$rel-0 (player-one: $entity1-0-0, player-two: $entity2-1-0, player-optional: $entity3-2-0) isa rel1, has relAt-1 \"att0\", has relAt-1 \"explosion0\", has relAt-2 \"opt0\";"; - Assert.assertEquals(tc0i, result.get(1).get(0).get(0).toString()); + String tc0m = "$entity1-0 isa entity1, has entity1-id \"entity1id1\";$entity2-1 isa entity2, has entity2-id \"entity2id1\";$entity3-2 isa entity3, has entity3-id \"entity3id1\";"; + Assert.assertEquals(tc0m, concatMatches(result.get("match").get(0))); + String tc0i = "$rel (player-one: $entity1-0, player-two: $entity2-1, player-optional: $entity3-2) isa rel1, has relAt-1 \"att0\", has relAt-1 \"explosion0\", has relAt-2 \"opt0\""; + Assert.assertEquals(tc0i, result.get("insert").get(0).get(0).toString()); // test empty explosion - String tc10m = "$entity1-0-9 isa entity1, has entity1-id \"entity1id1\";$entity2-1-9 isa entity2, has entity2-id \"entity2id1\";$entity3-2-9 isa entity3, has entity3-id \"entity3id1\";"; - Assert.assertEquals(tc10m, concatMatches(result.get(0).get(9))); - String tc10i = "$rel-9 (player-one: $entity1-0-9, player-two: $entity2-1-9, player-optional: $entity3-2-9) isa rel1, has relAt-1 \"att9\", has relAt-2 \"opt9\";"; - Assert.assertEquals(tc10i, result.get(1).get(9).get(0).toString()); + String tc10m = 
"$entity1-0 isa entity1, has entity1-id \"entity1id1\";$entity2-1 isa entity2, has entity2-id \"entity2id1\";$entity3-2 isa entity3, has entity3-id \"entity3id1\";"; + Assert.assertEquals(tc10m, concatMatches(result.get("match").get(9))); + String tc10i = "$rel (player-one: $entity1-0, player-two: $entity2-1, player-optional: $entity3-2) isa rel1, has relAt-1 \"att9\", has relAt-2 \"opt9\""; + Assert.assertEquals(tc10i, result.get("insert").get(9).get(0).toString()); // test exploded player complete - String tc25m = "$entity1-0-25 isa entity1, has entity1-id \"entity1id1\";$entity1-1-25 isa entity1, has entity1-id \"entity1id2\";$entity2-2-25 isa entity2, has entity2-id \"entity2id1\";$entity3-3-25 isa entity3, has entity3-id \"entity3id1\";"; - Assert.assertEquals(tc25m, concatMatches(result.get(0).get(25))); - String tc25i = "$rel-25 (player-one: $entity1-0-25, player-one: $entity1-1-25, player-two: $entity2-2-25, player-optional: $entity3-3-25) isa rel1, has relAt-1 \"att39\", has relAt-2 \"opt39\";"; - Assert.assertEquals(tc25i, result.get(1).get(25).get(0).toString()); + String tc25m = "$entity1-0 isa entity1, has entity1-id \"entity1id1\";$entity1-1 isa entity1, has entity1-id \"entity1id2\";$entity2-2 isa entity2, has entity2-id \"entity2id1\";$entity3-3 isa entity3, has entity3-id \"entity3id1\";"; + Assert.assertEquals(tc25m, concatMatches(result.get("match").get(25))); + String tc25i = "$rel (player-one: $entity1-0, player-one: $entity1-1, player-two: $entity2-2, player-optional: $entity3-3) isa rel1, has relAt-1 \"att39\", has relAt-2 \"opt39\""; + Assert.assertEquals(tc25i, result.get("insert").get(25).get(0).toString()); // test exploded player without optional player and optional attribute - String tc26m = "$entity1-0-26 isa entity1, has entity1-id \"entity1id1\";$entity1-1-26 isa entity1, has entity1-id \"entity1id2\";$entity2-2-26 isa entity2, has entity2-id \"entity2id1\";"; - Assert.assertEquals(tc26m, concatMatches(result.get(0).get(26))); - String tc26i = "$rel-26 (player-one: $entity1-0-26, player-one: $entity1-1-26, player-two: $entity2-2-26) isa rel1, has relAt-1 \"att40\";"; - Assert.assertEquals(tc26i, result.get(1).get(26).get(0).toString()); + String tc26m = "$entity1-0 isa entity1, has entity1-id \"entity1id1\";$entity1-1 isa entity1, has entity1-id \"entity1id2\";$entity2-2 isa entity2, has entity2-id \"entity2id1\";"; + Assert.assertEquals(tc26m, concatMatches(result.get("match").get(26))); + String tc26i = "$rel (player-one: $entity1-0, player-one: $entity1-1, player-two: $entity2-2) isa rel1, has relAt-1 \"att40\""; + Assert.assertEquals(tc26i, result.get("insert").get(26).get(0).toString()); Assert.assertEquals(2, result.size()); - // number of match statements = number of valid statements that would be inserted - Assert.assertEquals(27, result.get(0).size()); - Assert.assertEquals(27, result.get(1).size()); + // number of match ThingVariables = number of valid ThingVariables that would be inserted + Assert.assertEquals(27, result.get("match").size()); + Assert.assertEquals(27, result.get("insert").size()); } diff --git a/src/test/java/insert/GraknInserterTest.java b/src/test/java/insert/GraknInserterTest.java index f3188fe..2575f3c 100644 --- a/src/test/java/insert/GraknInserterTest.java +++ b/src/test/java/insert/GraknInserterTest.java @@ -1,53 +1,55 @@ package insert; import grakn.client.GraknClient; -import grakn.client.answer.ConceptMap; +import grakn.client.GraknClient.Session; +import grakn.client.GraknClient.Transaction; import graql.lang.Graql; 
import graql.lang.query.GraqlDelete; -import graql.lang.query.GraqlGet; import graql.lang.query.GraqlInsert; +import graql.lang.query.GraqlMatch; import org.junit.Assert; import org.junit.Test; import util.Util; -import java.util.ArrayList; -import java.util.Arrays; -import java.util.List; - -import static graql.lang.Graql.parse; import static graql.lang.Graql.var; public class GraknInserterTest { - GraknInserter gm; - String keyspaceName = "grakn_migrator_test"; + GraknInserter gi; + String databaseName = "grakn_migrator_test"; + String schemaPath; public GraknInserterTest() { - String schemaPath = Util.getAbsPath("src/test/resources/genericTests/schema-test.gql"); - this.gm = new GraknInserter("localhost", "48555", schemaPath, keyspaceName); + this.schemaPath = Util.getAbsPath("src/test/resources/genericTests/schema-test.gql"); + this.gi = new GraknInserter("localhost", "1729", schemaPath, databaseName); } @Test public void reloadKeyspaceTest() { - GraknClient client = gm.getClient(); - GraknClient.Session session = client.session(keyspaceName); + GraknClient client = gi.getClient(); - session = gm.setKeyspaceToSchema(client, session); - Assert.assertTrue(client.keyspaces().retrieve().contains(keyspaceName)); + gi.cleanAndDefineSchemaToDatabase(client); + Assert.assertTrue(client.databases().contains(databaseName)); //ensure Keyspace contains schema - GraknClient.Transaction read = session.transaction().read(); - GraqlGet getQuery = Graql.match(var("e").sub("entity")).get().limit(3); - Assert.assertEquals(3, read.stream(getQuery).get().count()); + Session dataSession = gi.getDataSession(client); + Transaction read = dataSession.transaction(Transaction.Type.READ); + GraqlMatch.Limited mq = Graql.match(var("e").sub("entity")).get("e").limit(100); + Assert.assertEquals(4, read.query().match(mq).count()); + read.close(); + + read = dataSession.transaction(Transaction.Type.READ); + mq = Graql.match(var("e").type("entity1")).get("e").limit(100); + Assert.assertEquals(1, read.query().match(mq).count()); read.close(); - session.close(); + dataSession.close(); client.close(); } @Test public void loadSchemaFromFileTest() { - String schema = gm.loadSchemaFromFile(); + String schema = Util.loadSchemaFromFile(schemaPath); Assert.assertTrue("schema test positive", schema.contains("entity1 sub entity")); Assert.assertFalse("schema test negative", schema.contains("entity99 sub entity")); @@ -56,60 +58,48 @@ public void loadSchemaFromFileTest() { @Test public void insertToGraknTest() { - GraknClient client = gm.getClient(); - GraknClient.Session session = client.session(keyspaceName); - - session = gm.setKeyspaceToSchema(client, session); + GraknClient client = gi.getClient(); + gi.cleanAndDefineSchemaToDatabase(client); //perform data entry - GraknClient.Transaction write = session.transaction().write(); + Session dataSession = gi.getDataSession(client); + Transaction write = dataSession.transaction(Transaction.Type.WRITE); GraqlInsert insertQuery = Graql.insert(var("e").isa("entity1").has("entity1-id", "ide1")); - write.execute(insertQuery); + write.query().insert(insertQuery); write.commit(); + write.close(); //ensure graph contains our insert - GraknClient.Transaction read = session.transaction().read(); - GraqlGet getQuery = Graql.match(var("e").isa("entity1").has("entity1-id", "ide1")).get().limit(1); - Assert.assertEquals(1, read.stream(getQuery).get().count()); + Transaction read = dataSession.transaction(Transaction.Type.READ); + GraqlMatch.Limited getQuery = 
Graql.match(var("e").isa("entity1").has("entity1-id", "ide1")).get("e").limit(100); + Assert.assertEquals(1, read.query().match(getQuery).count()); read.close(); - read = session.transaction().read(); - getQuery = Graql.match(var("e").isa("entity1").has("entity1-id", "ide2")).get().limit(1); - Assert.assertEquals(0, read.stream(getQuery).get().count()); + read = dataSession.transaction(Transaction.Type.READ); + getQuery = Graql.match(var("e").isa("entity1").has("entity1-id", "ide2")).limit(100); + Assert.assertEquals(0, read.query().match(getQuery).count()); read.close(); //another test for our insert - List queryList = Arrays.asList( - "match ", - " $e isa entity1, has entity1-id \"ide1\";", - " get $e;" - ); - read = session.transaction().read(); - List res = new ArrayList(); - List answers = read.execute((GraqlGet) parse(String.join("", queryList))).get(); - GraknClient.Session finalSession = session; - answers.forEach(a -> { - finalSession.transaction().read().getConcept(a.get("e").id()).asThing().attributes().forEach(l -> { - System.out.println(l.value()); - res.add(l.value().toString()); + read = dataSession.transaction(Transaction.Type.READ); + getQuery = Graql.match(var("e").isa("entity1").has("entity1-id", "ide1")).get("e").limit(100); + read.query().match(getQuery).forEach( answers -> { + answers.concepts().stream().forEach( entry -> { + Assert.assertTrue(entry.isEntity()); }); }); - Assert.assertEquals("ide1", res.get(0)); read.close(); //clean up: - GraknClient.Transaction delete = session.transaction().write(); + Transaction delete = dataSession.transaction(Transaction.Type.WRITE); GraqlDelete delQuery = Graql.match( var("e").isa("entity1").has("entity1-id", "ide1") ).delete(var("e").isa("entity1")); - delete.execute(delQuery); - delQuery = Graql.match( - var("a").isa("entity1-id").val("ide1") - ).delete(var("a").isa("entity1-id")); - delete.execute(delQuery); + delete.query().delete(delQuery); delete.commit(); + delete.close(); - session.close(); + dataSession.close(); client.close(); } diff --git a/src/test/java/migrator/MigrationTest.java b/src/test/java/migrator/MigrationTest.java index dc90997..bcb078e 100644 --- a/src/test/java/migrator/MigrationTest.java +++ b/src/test/java/migrator/MigrationTest.java @@ -2,11 +2,13 @@ import configuration.MigrationConfig; import grakn.client.GraknClient; -import grakn.client.answer.ConceptMap; +import grakn.client.GraknClient.Session; +import grakn.client.GraknClient.Transaction; import graql.lang.Graql; -import graql.lang.query.GraqlGet; -import graql.lang.statement.Statement; -import graql.lang.statement.StatementInstance; +import graql.lang.pattern.variable.ThingVariable; +import graql.lang.pattern.variable.ThingVariable.Thing; +import graql.lang.pattern.variable.ThingVariable.Relation; +import graql.lang.query.GraqlMatch; import insert.GraknInserter; import org.junit.Assert; import org.junit.Test; @@ -17,20 +19,21 @@ import java.io.IOException; import java.util.ArrayList; -import java.util.stream.Stream; public class MigrationTest { + String graknURI = "localhost:1729"; + @Test public void migrateGenericTestsTest() throws IOException { - String keyspaceName = "grami_generic_test"; + String databaseName = "grami_generic_test"; String asp = getAbsPath("src/test/resources/genericTests/schema-test.gql"); String msp = getAbsPath("src/test/resources/genericTests/migrationStatus-test.json"); String adcp = getAbsPath("src/test/resources/genericTests/dataConfig-test.json"); String gcp = 
getAbsPath("src/test/resources/genericTests/processorConfig-test.json"); - MigrationConfig migrationConfig = new MigrationConfig("localhost:48555",keyspaceName, asp, adcp, gcp); + MigrationConfig migrationConfig = new MigrationConfig(graknURI,databaseName, asp, adcp, gcp); GraknMigrator mig = new GraknMigrator(migrationConfig, msp, true); mig.migrate(true, true, true,true); } @@ -38,234 +41,222 @@ public void migrateGenericTestsTest() throws IOException { @Test public void migratePhoneCallsTest() throws IOException { - String keyspaceName = "grami_phone_call_test"; + String databaseName = "grami_phone_call_test"; String asp = getAbsPath("src/test/resources/phone-calls/schema.gql"); String msp = getAbsPath("src/test/resources/phone-calls/migrationStatus.json"); String adcp = getAbsPath("src/test/resources/phone-calls/dataConfig.json"); String gcp = getAbsPath("src/test/resources/phone-calls/processorConfig.json"); - MigrationConfig migrationConfig = new MigrationConfig("localhost:48555",keyspaceName, asp, adcp, gcp); + MigrationConfig migrationConfig = new MigrationConfig(graknURI,databaseName, asp, adcp, gcp); GraknMigrator mig = new GraknMigrator(migrationConfig, msp, true); mig.migrate(true, true, true,true); - GraknInserter gi = new GraknInserter("localhost", "48555", asp, keyspaceName); - testEntities(gi, keyspaceName); - testRelations(gi, keyspaceName); - testRelationWithRelations(gi, keyspaceName); - testAppendAttribute(gi, keyspaceName); + GraknInserter gi = new GraknInserter(graknURI.split(":")[0], graknURI.split(":")[1], asp, databaseName); + GraknClient client = gi.getClient(); + Session session = gi.getDataSession(client); + testEntities(session); + testRelations(session); + testRelationWithRelations(session); + testAppendAttribute(session); + session.close(); + client.close(); } - public void testEntities(GraknInserter gi, String keyspace) { - GraknClient client = gi.getClient(); - GraknClient.Session session = client.session(keyspace); + public void testEntities(Session session) { // query person by phone-number - GraknClient.Transaction read = session.transaction().read(); - GraqlGet getQuery = Graql.match(var("p").isa("person").has("phone-number", "+261 860 539 4754")).get().limit(1000); - Assert.assertEquals(1, read.stream(getQuery).get().count()); + Transaction read = session.transaction(Transaction.Type.READ); + GraqlMatch getQuery = Graql.match(var("p").isa("person").has("phone-number", "+261 860 539 4754")).get("p").limit(1000); + Assert.assertEquals(1, read.query().match(getQuery).count()); // query person by last name - read = session.transaction().read(); - getQuery = Graql.match(var("p").isa("person").has("last-name", "Smith")).get().limit(1000); - Assert.assertEquals(2, read.stream(getQuery).get().count()); + read = session.transaction(Transaction.Type.READ); + getQuery = Graql.match(var("p").isa("person").has("last-name", "Smith")).get("p").limit(1000); + Assert.assertEquals(2, read.query().match(getQuery).count()); // query all entities of type person - read = session.transaction().read(); - getQuery = Graql.match(var("c").isa("person")).get().limit(1000); - Assert.assertEquals(32, read.stream(getQuery).get().count()); + read = session.transaction(Transaction.Type.READ); + getQuery = Graql.match(var("c").isa("person")).get("c").limit(1000); + Assert.assertEquals(32, read.query().match(getQuery).count()); // query all entites of type company - read = session.transaction().read(); - getQuery = Graql.match(var("e").isa("company")).get().limit(1000); - 
Assert.assertEquals(2, read.stream(getQuery).get().count()); + read = session.transaction(Transaction.Type.READ); + getQuery = Graql.match(var("e").isa("company")).get("e").limit(1000); + Assert.assertEquals(2, read.query().match(getQuery).count()); read.close(); - session.close(); - client.close(); } - public void testRelations(GraknInserter gi, String keyspace) { - GraknClient client = gi.getClient(); - GraknClient.Session session = client.session(keyspace); + public void testRelations(Session session) { // query call by duration - GraknClient.Transaction read = session.transaction().read(); - GraqlGet getQuery = Graql.match(var("c").isa("call").has("duration", 2851)).get().limit(1000); - Assert.assertEquals(1, read.stream(getQuery).get().count()); + Transaction read = session.transaction(Transaction.Type.READ); + GraqlMatch getQuery = Graql.match(var("c").isa("call").has("duration", 2851)).get("c").limit(1000); + Assert.assertEquals(1, read.query().match(getQuery).count()); // query call by date - read = session.transaction().read(); - getQuery = Graql.match(var("c").isa("call").has("started-at", getDT("2018-09-17T18:43:42"))).get().limit(1000); - Assert.assertEquals(1, read.stream(getQuery).get().count()); + read = session.transaction(Transaction.Type.READ); + getQuery = Graql.match(var("c").isa("call").has("started-at", getDT("2018-09-17T18:43:42"))).get("c").limit(1000); + Assert.assertEquals(1, read.query().match(getQuery).count()); // query call by caller - read = session.transaction().read(); - StatementInstance player = Graql.var("p").isa("person").has("phone-number", "+7 171 898 0853"); - StatementInstance relation = Graql.var("c").isa("call").rel("caller", "p"); - ArrayList statements = new ArrayList<>(); + read = session.transaction(Transaction.Type.READ); + Thing player = Graql.var("p").isa("person").has("phone-number", "+7 171 898 0853"); + Relation relation = Graql.var("c").isa("call").toUnbound().rel("caller", "p"); + ArrayList statements = new ArrayList<>(); statements.add(player); statements.add(relation); - getQuery = Graql.match(statements).get().limit(1000); - Assert.assertEquals(14, read.stream(getQuery).get().count()); +// getQuery = Graql.match(statements); + getQuery = Graql.match(statements).get("c").limit(1000); + Assert.assertEquals(14, read.query().match(getQuery).count()); // query call by callee - read = session.transaction().read(); + read = session.transaction(Transaction.Type.READ); player = Graql.var("p").isa("person").has("phone-number", "+7 171 898 0853"); - relation = Graql.var("c").isa("call").rel("callee", "p"); + relation = Graql.var("c").isa("call").toUnbound().rel("callee", "p"); statements = new ArrayList<>(); statements.add(player); statements.add(relation); - getQuery = Graql.match(statements).get().limit(1000); - Assert.assertEquals(4, read.stream(getQuery).get().count()); + getQuery = Graql.match(statements).get("c").limit(1000); + Assert.assertEquals(4, read.query().match(getQuery).count()); // query call by caller & callee - read = session.transaction().read(); - StatementInstance playerOne = Graql.var("p1").isa("person").has("phone-number", "+7 171 898 0853"); - StatementInstance playerTwo = Graql.var("p2").isa("person").has("phone-number", "+57 629 420 5680"); - relation = Graql.var("c").isa("call").rel("caller", "p1").rel("callee", "p2"); + read = session.transaction(Transaction.Type.READ); + Thing playerOne = Graql.var("p1").isa("person").has("phone-number", "+7 171 898 0853"); + Thing playerTwo = 
Graql.var("p2").isa("person").has("phone-number", "+57 629 420 5680"); + relation = Graql.var("c").isa("call").toUnbound().rel("caller", "p1").rel("callee", "p2"); statements = new ArrayList<>(); statements.add(playerOne); statements.add(playerTwo); statements.add(relation); - getQuery = Graql.match(statements).get().limit(1000); - Assert.assertEquals(4, read.stream(getQuery).get().count()); + getQuery = Graql.match(statements).get("c").limit(1000); + Assert.assertEquals(4, read.query().match(getQuery).count()); read.close(); - session.close(); - client.close(); } - public void testRelationWithRelations(GraknInserter gi, String keyspace) { - GraknClient client = gi.getClient(); - GraknClient.Session session = client.session(keyspace); + public void testRelationWithRelations(Session session) { // query specific communication-channel and count the number of past calls (single past-call): - GraknClient.Transaction read = session.transaction().read(); - StatementInstance playerOne = Graql.var("p1").isa("person").has("phone-number", "+54 398 559 0423"); - StatementInstance playerTwo = Graql.var("p2").isa("person").has("phone-number", "+48 195 624 2025"); - StatementInstance relation = Graql.var("c").isa("communication-channel").rel("peer", "p1").rel("peer", "p2").rel("past-call","x"); - ArrayList statements = new ArrayList<>(); + Transaction read = session.transaction(Transaction.Type.READ); + Thing playerOne = Graql.var("p1").isa("person").has("phone-number", "+54 398 559 0423"); + Thing playerTwo = Graql.var("p2").isa("person").has("phone-number", "+48 195 624 2025"); + Relation relation = Graql.var("c").rel("peer", "p1").rel("peer", "p2").rel("past-call","x").isa("communication-channel"); + ArrayList statements = new ArrayList<>(); statements.add(playerOne); statements.add(playerTwo); statements.add(relation); - GraqlGet getQuery = Graql.match(statements).get("c").limit(1000); - Assert.assertEquals(1, read.stream(getQuery).get().count()); + GraqlMatch getQuery = Graql.match(statements).get("c").limit(1000); + Assert.assertEquals(1, read.query().match(getQuery).count()); getQuery = Graql.match(statements).get("x").limit(1000); - Assert.assertEquals(1, read.stream(getQuery).get().count()); + Assert.assertEquals(1, read.query().match(getQuery).count()); // query specific communication-channel and count the number of past calls (listSeparated past-calls: - read = session.transaction().read(); + read = session.transaction(Transaction.Type.READ); playerOne = Graql.var("p1").isa("person").has("phone-number", "+263 498 495 0617"); playerTwo = Graql.var("p2").isa("person").has("phone-number", "+33 614 339 0298"); - relation = Graql.var("c").isa("communication-channel").rel("peer", "p1").rel("peer", "p2").rel("past-call", "x"); + relation = Graql.var("c").rel("peer", "p1").rel("peer", "p2").rel("past-call", "x").isa("communication-channel"); statements = new ArrayList<>(); statements.add(playerOne); statements.add(playerTwo); statements.add(relation); getQuery = Graql.match(statements).get("c").limit(1000); - Assert.assertEquals(1, read.stream(getQuery).get().count()); + Assert.assertEquals(1, read.query().match(getQuery).count()); getQuery = Graql.match(statements).get("x").limit(1000); - Assert.assertEquals(6, read.stream(getQuery).get().count()); + Assert.assertEquals(6, read.query().match(getQuery).count()); // make sure that this doesn't get inserted: - read = session.transaction().read(); + read = session.transaction(Transaction.Type.READ); playerOne = 
Graql.var("p1").isa("person").has("phone-number", "+7 690 597 4443"); playerTwo = Graql.var("p2").isa("person").has("phone-number", "+54 398 559 9999"); - relation = Graql.var("c").isa("communication-channel").rel("peer", "p1").rel("peer", "p2").rel("past-call", "x"); + relation = Graql.var("c").rel("peer", "p1").rel("peer", "p2").rel("past-call", "x").isa("communication-channel"); statements = new ArrayList<>(); statements.add(playerOne); statements.add(playerTwo); statements.add(relation); getQuery = Graql.match(statements).get("c").limit(1000); - Assert.assertEquals(0, read.stream(getQuery).get().count()); + Assert.assertEquals(0, read.query().match(getQuery).count()); getQuery = Graql.match(statements).get("x").limit(1000); - Assert.assertEquals(0, read.stream(getQuery).get().count()); + Assert.assertEquals(0, read.query().match(getQuery).count()); // these are added by doing player matching for past calls: - // make sure that this doesn't get inserted: - read = session.transaction().read(); + read = session.transaction(Transaction.Type.READ); playerOne = Graql.var("p1").isa("person").has("phone-number", "+81 308 988 7153"); playerTwo = Graql.var("p2").isa("person").has("phone-number", "+351 515 605 7915"); - relation = Graql.var("c").isa("communication-channel").rel("peer", "p1").rel("peer", "p2").rel("past-call", "x"); + relation = Graql.var("c").rel("peer", "p1").rel("peer", "p2").rel("past-call", "x").isa("communication-channel"); statements = new ArrayList<>(); statements.add(playerOne); statements.add(playerTwo); statements.add(relation); getQuery = Graql.match(statements).get("c").limit(1000); - Assert.assertEquals(5, read.stream(getQuery).get().count()); + Assert.assertEquals(5, read.query().match(getQuery).count()); getQuery = Graql.match(statements).get("x").limit(1000); - Assert.assertEquals(5, read.stream(getQuery).get().count()); + Assert.assertEquals(5, read.query().match(getQuery).count()); - read = session.transaction().read(); + read = session.transaction(Transaction.Type.READ); playerOne = Graql.var("p1").isa("person").has("phone-number", "+7 171 898 0853"); playerTwo = Graql.var("p2").isa("person").has("phone-number", "+57 629 420 5680"); - relation = Graql.var("c").isa("communication-channel").rel("peer", "p1").rel("peer", "p2").rel("past-call", "x"); + relation = Graql.var("c").rel("peer", "p1").rel("peer", "p2").rel("past-call", "x").isa("communication-channel"); statements = new ArrayList<>(); statements.add(playerOne); statements.add(playerTwo); statements.add(relation); getQuery = Graql.match(statements).get("c").limit(1000); - Assert.assertEquals(4, read.stream(getQuery).get().count()); + Assert.assertEquals(4, read.query().match(getQuery).count()); getQuery = Graql.match(statements).get("x").limit(1000); - Assert.assertEquals(4, read.stream(getQuery).get().count()); + Assert.assertEquals(4, read.query().match(getQuery).count()); // these must not be found (come from player-matched past-call): - read = session.transaction().read(); + read = session.transaction(Transaction.Type.READ); playerOne = Graql.var("p1").isa("person").has("phone-number", "+261 860 539 4754"); - relation = Graql.var("c").isa("communication-channel").rel("peer", "p1").rel("past-call", "x"); + relation = Graql.var("c").rel("peer", "p1").rel("past-call", "x").isa("communication-channel"); statements = new ArrayList<>(); statements.add(playerOne); statements.add(relation); getQuery = Graql.match(statements).get("c").limit(1000); - Assert.assertEquals(0, read.stream(getQuery).get().count()); 
+ Assert.assertEquals(0, read.query().match(getQuery).count()); getQuery = Graql.match(statements).get("x").limit(1000); - Assert.assertEquals(0, read.stream(getQuery).get().count()); + Assert.assertEquals(0, read.query().match(getQuery).count()); read.close(); - session.close(); - client.close(); } - public void testAppendAttribute(GraknInserter gi, String keyspace) { - GraknClient client = gi.getClient(); - GraknClient.Session session = client.session(keyspace); + public void testAppendAttribute(Session session) { // Count number of total inserts - GraknClient.Transaction read = session.transaction().read(); - GraqlGet getQuery = Graql.match(var("p").isa("person").has("twitter-username", var("x"))).get("p").limit(1000); - Assert.assertEquals(6, read.stream(getQuery).get().count()); + Transaction read = session.transaction(Transaction.Type.READ); + GraqlMatch.Limited getQuery = Graql.match(var("p").isa("person").has("twitter-username", var("x"))).get("p").limit(1000); + Assert.assertEquals(6, read.query().match(getQuery).count()); // Count multi-insert using listSeparator getQuery = Graql.match(var("p").isa("person").has("phone-number", "+263 498 495 0617").has("twitter-username", var("x"))).get("x").limit(1000); - Assert.assertEquals(2, read.stream(getQuery).get().count()); + Assert.assertEquals(2, read.query().match(getQuery).count()); //test relation total inserts getQuery = Graql.match(var("c").isa("call").has("call-rating", var("cr"))).get("c").limit(1000); - Assert.assertEquals(5, read.stream(getQuery).get().count()); + Assert.assertEquals(5, read.query().match(getQuery).count()); // specific relation insert getQuery = Graql.match(var("c").isa("call").has("started-at", getDT("2018-09-24T03:16:48")).has("call-rating", var("cr"))).get("cr").limit(1000); - Stream answers = read.stream(getQuery).get(); - answers.forEach(answer -> Assert.assertEquals(5L, answer.get("cr").asAttribute().value())); + read.query().match(getQuery).forEach(answer -> { + Assert.assertEquals(5L, answer.get("cr").asAttribute().getValue()); + }); read.close(); - session.close(); - client.close(); } @Test public void issue10Test() throws IOException { - String keyspaceName = "issue10"; + String databaseName = "issue10"; String asp = getAbsPath("src/test/resources/bugfixing/issue10/schema.gql"); String msp = getAbsPath("src/test/resources/bugfixing/issue10/migrationStatus.json"); String adcp = getAbsPath("src/test/resources/bugfixing/issue10/dataConfig.json"); String gcp = getAbsPath("src/test/resources/bugfixing/issue10/processorConfig.json"); - MigrationConfig migrationConfig = new MigrationConfig("localhost:48555",keyspaceName, asp, adcp, gcp); + MigrationConfig migrationConfig = new MigrationConfig(graknURI,databaseName, asp, adcp, gcp); GraknMigrator mig = new GraknMigrator(migrationConfig, msp, true); mig.migrate(true, true, true,true); - } } diff --git a/src/test/java/migrator/SchemaUpdaterTest.java b/src/test/java/migrator/SchemaUpdaterTest.java index 584891c..cb0aaeb 100644 --- a/src/test/java/migrator/SchemaUpdaterTest.java +++ b/src/test/java/migrator/SchemaUpdaterTest.java @@ -3,20 +3,15 @@ import configuration.MigrationConfig; import configuration.SchemaUpdateConfig; import grakn.client.GraknClient; +import grakn.client.GraknClient.Session; +import grakn.client.GraknClient.Transaction; import graql.lang.Graql; -import graql.lang.query.GraqlGet; -import graql.lang.statement.Statement; -import graql.lang.statement.StatementInstance; +import graql.lang.query.GraqlMatch; import insert.GraknInserter; import 
org.junit.Assert; import org.junit.Test; import java.io.IOException; -import java.time.LocalDate; -import java.time.LocalDateTime; -import java.time.LocalTime; -import java.time.format.DateTimeFormatter; -import java.util.ArrayList; import static graql.lang.Graql.type; import static graql.lang.Graql.var; @@ -24,56 +19,61 @@ public class SchemaUpdaterTest { + String graknURI = "localhost:1729"; + @Test public void updateSchemaTest() throws IOException { - String keyspaceName = "grami_phone_call_test_su"; + String databaseName = "grami_phone_call_test_su"; String asp = getAbsPath("src/test/resources/phone-calls/schema.gql"); String msp = getAbsPath("src/test/resources/phone-calls/migrationStatus.json"); String adcp = getAbsPath("src/test/resources/phone-calls/dataConfig.json"); String gcp = getAbsPath("src/test/resources/phone-calls/processorConfig.json"); - MigrationConfig migrationConfig = new MigrationConfig("localhost:48555",keyspaceName, asp, adcp, gcp); + MigrationConfig migrationConfig = new MigrationConfig(graknURI,databaseName, asp, adcp, gcp); GraknMigrator mig = new GraknMigrator(migrationConfig, msp, true); mig.migrate(true, true, true,true); - GraknInserter gi = new GraknInserter("localhost", "48555", asp, keyspaceName); + GraknInserter gi = new GraknInserter(graknURI.split(":")[0], graknURI.split(":")[1], asp, databaseName); asp = getAbsPath("src/test/resources/phone-calls/schema-updated.gql"); - SchemaUpdateConfig suConfig = new SchemaUpdateConfig("localhost:48555",keyspaceName, asp); + SchemaUpdateConfig suConfig = new SchemaUpdateConfig(graknURI,databaseName, asp); SchemaUpdater su = new SchemaUpdater(suConfig); su.updateSchema(); - postUpdateSchemaTests(gi, keyspaceName); + postUpdateSchemaTests(gi); } - private void postUpdateSchemaTests(GraknInserter gi, String keyspace) { + private void postUpdateSchemaTests(GraknInserter gi) { GraknClient client = gi.getClient(); - GraknClient.Session session = client.session(keyspace); + Session session = gi.getDataSession(client); // query attribute type - GraknClient.Transaction read = session.transaction().read(); - GraqlGet getQuery = Graql.match(var("a").type("added-attribute")).get().limit(1000); - Assert.assertEquals(1, read.stream(getQuery).get().count()); + Transaction read = session.transaction(Transaction.Type.READ); + GraqlMatch.Filtered getQuery = Graql.match(var("a").type("added-attribute")).get("a"); + Assert.assertEquals(1, read.query().match(getQuery).count()); read.close(); // query entity type - read = session.transaction().read(); - getQuery = Graql.match(var("a").type("added-entity")).get().limit(1000); - Assert.assertEquals(1, read.stream(getQuery).get().count()); + read = session.transaction(Transaction.Type.READ); + GraqlMatch.Limited getQuery2 = Graql.match(var("a").type("added-entity")).get("a").limit(1000); + Assert.assertEquals(1, read.query().match(getQuery2).count()); read.close(); // query relation type - read = session.transaction().read(); - getQuery = Graql.match(var("a").type("added-relation")).get().limit(1000); - Assert.assertEquals(1, read.stream(getQuery).get().count()); + read = session.transaction(Transaction.Type.READ); + getQuery2 = Graql.match(var("a").type("added-relation")).get("a").limit(1000); + Assert.assertEquals(1, read.query().match(getQuery2).count()); read.close(); // query role type - read = session.transaction().read(); - getQuery = Graql.match(type("added-relation").relates(var("r"))).get().limit(1000); - Assert.assertEquals(1, read.stream(getQuery).get().count()); + read = 
session.transaction(Transaction.Type.READ); + getQuery2 = Graql.match(type("added-relation").relates(var("r"))).limit(1000); + Assert.assertEquals(1, read.query().match(getQuery2).count()); read.close(); + + session.close(); + client.close(); } } diff --git a/src/test/java/test/TestUtil.java b/src/test/java/test/TestUtil.java index 6b2eee2..24539cb 100644 --- a/src/test/java/test/TestUtil.java +++ b/src/test/java/test/TestUtil.java @@ -1,7 +1,7 @@ package test; +import graql.lang.pattern.variable.ThingVariable; import loader.DataLoader; -import graql.lang.statement.Statement; import java.io.*; import java.time.LocalDate; @@ -40,11 +40,10 @@ public static ArrayList getData(String path) { return rows; } - public static String concatMatches(ArrayList statements) { + public static String concatMatches(ArrayList> statements) { String ret = ""; - for (Statement st : - statements) { - ret = ret + st.toString(); + for (ThingVariable st : statements) { + ret = ret + st.toString() + ";"; } return ret; } diff --git a/src/test/resources/bugfixing/issue10/schema.gql b/src/test/resources/bugfixing/issue10/schema.gql index 354e220..4a54783 100644 --- a/src/test/resources/bugfixing/issue10/schema.gql +++ b/src/test/resources/bugfixing/issue10/schema.gql @@ -1,13 +1,13 @@ define text sub entity, - key uid, - plays tagged; + owns uid @key, + plays tag:tagged; uid sub attribute, value long; label sub entity, - key name, - plays tagger; + owns name @key, + plays tag:tagger; name sub attribute, value string; tag sub relation, diff --git a/src/test/resources/genericTests/schema-test.gql b/src/test/resources/genericTests/schema-test.gql index 9a64888..0b8f4e4 100644 --- a/src/test/resources/genericTests/schema-test.gql +++ b/src/test/resources/genericTests/schema-test.gql @@ -1,27 +1,27 @@ define entity1 sub entity, - has entity1-id, - has entity1-name, - has entity1-exp, - plays player-one; + owns entity1-id, + owns entity1-name, + owns entity1-exp, + plays rel1:player-one; entity1-id sub attribute, value string; entity1-name sub attribute, value string; entity1-exp sub attribute, value string; entity2 sub entity, - has entity2-id, - has entity2-bool, - has entity2-double, - plays player-two; + owns entity2-id, + owns entity2-bool, + owns entity2-double, + plays rel1:player-two; entity2-id sub attribute, value string; entity2-bool sub attribute, value boolean; entity2-double sub attribute, value double; entity3 sub entity, - has entity3-id, - has entity3-int, - plays player-optional; + owns entity3-id, + owns entity3-int, + plays rel1:player-optional; entity3-id sub attribute, value string; entity3-int sub attribute, value long; @@ -29,8 +29,8 @@ rel1 sub relation, relates player-one, relates player-two, relates player-optional, - has relAt-1, - has relAt-2; + owns relAt-1, + owns relAt-2; relAt-1 sub attribute, value string; relAt-2 sub attribute, value string; diff --git a/src/test/resources/phone-calls/dataConfig.json b/src/test/resources/phone-calls/dataConfig.json index 140e0bb..77492cd 100644 --- a/src/test/resources/phone-calls/dataConfig.json +++ b/src/test/resources/phone-calls/dataConfig.json @@ -42,7 +42,7 @@ "generator": "name" }], "batchSize": 100, - "threads": 4 + "threads": 1 }, "contract": { "dataPath": "src/test/resources/phone-calls/contract.csv", diff --git a/src/test/resources/phone-calls/schema-updated.gql b/src/test/resources/phone-calls/schema-updated.gql index 0c3c6db..ecc3a2c 100644 --- a/src/test/resources/phone-calls/schema-updated.gql +++ 
b/src/test/resources/phone-calls/schema-updated.gql @@ -34,36 +34,36 @@ define call sub relation, relates caller, relates callee, - has started-at, - has duration, - has call-rating, - plays past-call; + owns started-at, + owns duration, + owns call-rating, + plays communication-channel:past-call; communication-channel sub relation, relates peer, relates past-call; company sub entity, - plays provider, - has name; + plays contract:provider, + owns name; person sub entity, - plays customer, - plays caller, - plays callee, - has first-name, - has last-name, - has phone-number, - has city, - has age, - has nick-name, - has twitter-username, - has fakebook-link, - plays peer; + plays contract:customer, + plays call:caller, + plays call:callee, + owns first-name, + owns last-name, + owns phone-number, + owns city, + owns age, + owns nick-name, + owns twitter-username, + owns fakebook-link, + plays communication-channel:peer; added-entity sub entity, - has added-attribute, - plays added-role; + owns added-attribute, + plays added-relation:added-role; added-relation sub relation, relates added-role; diff --git a/src/test/resources/phone-calls/schema.gql b/src/test/resources/phone-calls/schema.gql index 0eeeb04..23d2f9d 100644 --- a/src/test/resources/phone-calls/schema.gql +++ b/src/test/resources/phone-calls/schema.gql @@ -32,29 +32,29 @@ define call sub relation, relates caller, relates callee, - has started-at, - has duration, - has call-rating, - plays past-call; + owns started-at, + owns duration, + owns call-rating, + plays communication-channel:past-call; communication-channel sub relation, relates peer, relates past-call; company sub entity, - plays provider, - has name; + plays contract:provider, + owns name; person sub entity, - plays customer, - plays caller, - plays callee, - has first-name, - has last-name, - has phone-number, - has city, - has age, - has nick-name, - has twitter-username, - has fakebook-link, - plays peer; + plays contract:customer, + plays call:caller, + plays call:callee, + owns first-name, + owns last-name, + owns phone-number, + owns city, + owns age, + owns nick-name, + owns twitter-username, + owns fakebook-link, + plays communication-channel:peer;
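A note for readers working through the test changes above: the updated tests all follow the same Grakn 2.0 client pattern — a database instead of a keyspace, port 1729, an explicit `Transaction.Type.READ`, and `GraqlMatch` queries executed via `read.query().match(...)`. The sketch below pulls that pattern out of `MigrationTest` into a standalone snippet. It reuses only calls that appear in the diff (GraMi's `GraknInserter`, `getClient()`, `getDataSession(...)`); the class name `ReadBackSketch`, the database name, and the schema path are placeholders, not part of the repository.

```java
import grakn.client.GraknClient;
import grakn.client.GraknClient.Session;
import grakn.client.GraknClient.Transaction;
import graql.lang.Graql;
import graql.lang.query.GraqlMatch;
import insert.GraknInserter;

import static graql.lang.Graql.var;

// Hypothetical class; database name and schema path are placeholders.
public class ReadBackSketch {

    public static void main(String[] args) {
        String databaseName = "grami_phone_call_test";
        String schemaPath = "src/test/resources/phone-calls/schema.gql";

        // GraMi's helper wraps client/session creation, as used in MigrationTest (Grakn 2.0 default port 1729).
        GraknInserter gi = new GraknInserter("localhost", "1729", schemaPath, databaseName);
        GraknClient client = gi.getClient();
        Session session = gi.getDataSession(client);

        // Open a READ transaction and count persons that own a phone-number, mirroring testEntities().
        Transaction read = session.transaction(Transaction.Type.READ);
        GraqlMatch getQuery = Graql.match(var("p").isa("person").has("phone-number", var("x"))).get("p").limit(1000);
        long persons = read.query().match(getQuery).count();
        System.out.println("persons with a phone-number: " + persons);
        read.close();

        session.close();
        client.close();
    }
}
```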
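The same tests also show how the old `Statement`/`StatementInstance` patterns translate to the new `ThingVariable` hierarchy: `ThingVariable.Thing` for entity patterns, `ThingVariable.Relation` for relation patterns, and `toUnbound().rel(...)` to attach role players to an `isa`-bound variable. Below is a minimal sketch of the caller query from `testRelations`; the class name is hypothetical and the generic `ArrayList<ThingVariable<?>>` is an assumption (the element type is stripped in the diff above).

```java
import java.util.ArrayList;

import graql.lang.Graql;
import graql.lang.pattern.variable.ThingVariable;
import graql.lang.pattern.variable.ThingVariable.Relation;
import graql.lang.pattern.variable.ThingVariable.Thing;
import graql.lang.query.GraqlMatch;

// Hypothetical helper class for illustration only.
public class RelationPatternSketch {

    public static GraqlMatch callerPattern() {
        // Role player: a person constrained by one of its attributes.
        Thing player = Graql.var("p").isa("person").has("phone-number", "+7 171 898 0853");

        // Relation variable: toUnbound() detaches the isa-bound variable so roles can be attached.
        Relation relation = Graql.var("c").isa("call").toUnbound().rel("caller", "p");

        // Assumed element type; the tests pass the same list straight into Graql.match(...).
        ArrayList<ThingVariable<?>> statements = new ArrayList<>();
        statements.add(player);
        statements.add(relation);

        // Match both patterns and return the call variable, as in testRelations().
        return Graql.match(statements).get("c").limit(1000);
    }
}
```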