[WIP] Kinesis source: explicitly disallow duplicate record processors for the same shard #86


istreeter
Contributor

No description provided.

The common-streams Kinesis source suffers from a problem where we don't
quite achieve at-least-once processing semantics near the end of a
shard. The problem is in the third-party fs2-kinesis library, and it is
not easy to fix with a small code change to that library.

Sorry I cannot provide a link here to back that up -- it is documented
internally at Snowplow.

This PR re-implements our Kinesis source from scratch, this time without
a dependency on fs2-kinesis. The biggest difference is that we block the
`shardEnded` method of the KCL record processor until all records from
the shard have been written to the destination.
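
The blocking idea can be sketched with the standard library alone. This is only an illustration under stated assumptions, not the code in this PR: `ShardEndGate` and all of its method names are hypothetical, the timeout is an added assumption, and a real KCL record processor would call something like `awaitDrainedThenCheckpoint` from its `shardEnded` callback and then checkpoint through KCL.

```java
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical helper, not part of KCL or common-streams.
 * Tracks records received from a shard but not yet written downstream.
 */
final class ShardEndGate {
    private int inFlight = 0;             // records received but not yet written
    private boolean checkpointed = false; // set once the shard end is acknowledged

    /** Call when a batch of records is received from the shard. */
    synchronized void recordsReceived(int count) {
        inFlight += count;
    }

    /** Call when the destination confirms a batch was written. */
    synchronized void recordsWritten(int count) {
        inFlight -= count;
        notifyAll(); // wake any thread blocked in awaitDrainedThenCheckpoint
    }

    /**
     * Intended to be called from the shard-end callback (assumption).
     * Blocks until nothing is in flight, then marks the shard as checkpointed.
     * Returns false if the timeout elapses or the thread is interrupted.
     */
    synchronized boolean awaitDrainedThenCheckpoint(long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        try {
            while (inFlight > 0) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    return false;
                }
                // wait(0) would block forever, so wait at least 1 ms
                wait(Math.max(1L, TimeUnit.NANOSECONDS.toMillis(remainingNanos)));
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        checkpointed = true; // a real processor would checkpoint via KCL here
        return true;
    }

    synchronized boolean isCheckpointed() {
        return checkpointed;
    }
}
```

The point of the sketch is the ordering: the shard is only marked as finished after every in-flight record has been acknowledged, so a shard end cannot overtake pending writes.
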
istreeter force-pushed the kinesis-deny-duplicate-shard-processor branch from 0e83b8a to 3368625 on September 9, 2024 15:35
istreeter force-pushed the re-implement-kinesis-source branch 2 times, most recently from 3838ac0 to a6b96d0 on September 9, 2024 23:07
istreeter closed this on Sep 9, 2024
istreeter deleted the kinesis-deny-duplicate-shard-processor branch on September 9, 2024 23:08