Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-264: Add support for upsert/delete via periodic merge flushes #286

Closed
wants to merge 1 commit into from

Conversation

C0urante
Copy link
Collaborator

Addresses #264

Implements upsert/delete functionality as proposed in the collaborative design doc, fixes some bugs in the connector, and adds an embedded integration test framework and tests for upsert/delete.

Some notes:

  • The README is updated to include instructions on running the new embedded integration tests. It should be fairly straightforward to run these if you're used to running the existing Docker-based integration tests as the same environment variables are used and, beyond that, all that's required is a Gradle one-liner.
  • The integration testing framework added here can be expanded to include new tests and possibly address Replace Docker-based integration tests with embedded integration tests #285
  • The SchemaRetriever API has been updated to differentiate between key and value schemas when invoking setLastSeen; this is done in a backwards-compatible fashion, so existing implementations of that interface should work with the updated API seamlessly.
  • The MemorySchemaRetriever is updated to use that new API correctly (the existing one didn't differentiate between key and value schemas, which led to some buggy behavior that was unearthed during integration testing)
  • The SchemaManager class is optimized to avoid redundant operations (which is useful in general but also specifically with upsert/delete enabled as create/update requests are much more likely in that case and the likelihood of them being redundant is fairly high)
  • BigQuerySinkTask is tweaked to avoid creating multiple BigQuery and SchemaManager instances; both classes are thread-safe, so this should be fine
  • Existing unit tests are modified to reflect some of these tweaks
  • New unit tests are added for other changes

@C0urante
Copy link
Collaborator Author

cc @whynick1 @mtagle @criccomini

This is a big one. No rush, but it's up now if you want to take a look.

@codecov-commenter
Copy link

codecov-commenter commented Jun 29, 2020

Codecov Report

Merging #286 into master will decrease coverage by 1.18%.
The diff coverage is 65.86%.

@@             Coverage Diff              @@
##             master     #286      +/-   ##
============================================
- Coverage     70.87%   69.68%   -1.19%     
- Complexity      301      379      +78     
============================================
  Files            32       37       +5     
  Lines          1538     2052     +514     
  Branches        164      203      +39     
============================================
+ Hits           1090     1430     +340     
- Misses          390      549     +159     
- Partials         58       73      +15     
Impacted Files Coverage Δ Complexity Δ
...nect/bigquery/convert/BigQuerySchemaConverter.java 97.56% <ø> (ø) 26.00 <0.00> (ø)
...bigquery/exception/ExpectedInterruptException.java 0.00% <0.00%> (ø) 0.00 <0.00> (?)
...ect/bigquery/write/row/AdaptiveBigQueryWriter.java 40.90% <0.00%> (+0.28%) 6.00 <0.00> (ø)
...bigquery/write/row/UpsertDeleteBigQueryWriter.java 26.31% <26.31%> (ø) 1.00 <1.00> (?)
...om/wepay/kafka/connect/bigquery/SchemaManager.java 35.03% <28.33%> (-41.44%) 11.00 <10.00> (+2.00) ⬇️
...ka/connect/bigquery/config/BigQuerySinkConfig.java 75.30% <50.00%> (-6.47%) 29.00 <2.00> (ø)
...wepay/kafka/connect/bigquery/BigQuerySinkTask.java 63.57% <64.06%> (+4.62%) 47.00 <23.00> (+20.00)
...afka/connect/bigquery/write/batch/TableWriter.java 66.66% <72.72%> (ø) 5.00 <1.00> (ø)
...y/kafka/connect/bigquery/utils/TableNameUtils.java 75.00% <75.00%> (ø) 3.00 <3.00> (?)
...fka/connect/bigquery/write/batch/MergeBatches.java 82.53% <82.53%> (ø) 20.00 <20.00> (?)
... and 14 more

@C0urante C0urante closed this Dec 8, 2020
@C0urante C0urante deleted the gh-264 branch December 8, 2020 15:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants