This repository has been archived by the owner on Jul 16, 2024. It is now read-only.

BatchReplayer should generate part of the dataset during deployment #384

Open
vgkowski opened this issue Jun 13, 2022 · 0 comments
Labels
data generator (Components used for generating data), enhancement (New feature or request), top priority (Top priority features to implement)

Comments

Collaborator

vgkowski commented Jun 13, 2022

BatchReplayer currently replays the dataset from scratch. Sometimes we just need data to already be in the target, and we don't want to wait for each batch or micro-batch to generate it.
We should add a parameter to the construct that writes a percentage of the dataset during the construct's provisioning step, so that part of the data is already in the target when the CDK application is deployed.
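
To make the proposal concrete, here is a minimal sketch of how such a parameter could split the dataset between provisioning and replay. Everything in it is an assumption for illustration: `preloadPercentage`, `PreloadProps`, `PreloadPlan`, and `planPreload` are hypothetical names, not part of the existing BatchReplayer API, and the sketch assumes the dataset can be counted in whole batches.

```typescript
// Hypothetical construct props: `preloadPercentage` is NOT an existing
// BatchReplayer option, only an illustration of the proposed parameter.
interface PreloadProps {
  /** Percentage (0 to 100) of the dataset to write during provisioning. Defaults to 0. */
  readonly preloadPercentage?: number;
}

// Hypothetical result type: how the dataset is split between deploy time and replay.
interface PreloadPlan {
  /** Leading batches written to the target while the construct is provisioned. */
  readonly preloadBatches: number;
  /** Remaining batches left for the regular batch/micro-batch replay. */
  readonly replayBatches: number;
}

// Splits a dataset of `totalBatches` batches according to the requested percentage.
// Rounds down so that provisioning never writes more than was asked for.
function planPreload(totalBatches: number, props: PreloadProps = {}): PreloadPlan {
  const pct = props.preloadPercentage ?? 0;
  if (!Number.isFinite(pct) || pct < 0 || pct > 100) {
    throw new RangeError(`preloadPercentage must be between 0 and 100, got ${pct}`);
  }
  if (!Number.isInteger(totalBatches) || totalBatches < 0) {
    throw new RangeError(`totalBatches must be a non-negative integer, got ${totalBatches}`);
  }
  const preloadBatches = Math.floor((totalBatches * pct) / 100);
  return { preloadBatches, replayBatches: totalBatches - preloadBatches };
}
```

With this sketch, `planPreload(8, { preloadPercentage: 25 })` would write the first 2 batches at deployment and leave 6 for replay. Leaving the parameter unset keeps the current behavior of replaying everything from scratch.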

@vgkowski vgkowski added the enhancement New feature or request label Jun 13, 2022
@vgkowski vgkowski added the data generator Components used for generating data label Jun 13, 2022
@vgkowski vgkowski changed the title BatchReplayer already generate part of the dataset BatchReplayer generates part of the dataset during deployment Jun 13, 2022
@vgkowski vgkowski changed the title BatchReplayer generates part of the dataset during deployment BatchReplayer should generate part of the dataset during deployment Jun 13, 2022
@vgkowski vgkowski added the top priority Top priority features to implement label Jun 15, 2022