Users can already do an incremental update manually, like this:
    from datachain import DataChain, File

    def my_embedding(file: File) -> list[float]:
        return [...]

    dc = DataChain.from_storage("s3://bkt/dir1/*.jpg")

    # Create 1st version
    dc = dc.map(emd=my_embedding).save("image_emb")

    ...

    # Update: process only the files that are not in the saved dataset yet
    new = DataChain.from_storage("s3://bkt/dir1/*.jpg")
    old = DataChain.from_dataset("image_emb")

    diff = new.diff(old).map(emd=my_embedding)

    # Create 2nd version
    res = old.union(diff).save("image_emb")
It would be great if this could be supported out of the box. Users could then update datasets directly from the UI.
    # Proposed API (does not exist yet)
    def my_embedding(file: File) -> list[float]:
        return [...]

    # Create 1st version
    dc = DataChain.incremental_dataset("s3://bkt/dir1/*.jpg", my_embedding, "image_emb")

    ...

    # Update to 2nd version
    dc = dc.update()
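For discussion, here is a rough sketch of the semantics `update()` might have, written in plain Python with no DataChain calls. Everything in it is an assumption for illustration: the `IncrementalDataset` class, the `list_source` callable, and the etag-based change check are hypothetical and not an existing API. It mirrors the manual `old.union(diff)` flow above, so files deleted from the source stay in the dataset; whether that is the desired behavior is an open question.

```python
from typing import Callable

# Hypothetical stand-ins for storage and dataset rows:
# a listing maps file path -> etag (content fingerprint),
# a dataset version maps file path -> (etag, embedding).
Listing = dict[str, str]
Version = dict[str, tuple[str, list[float]]]


class IncrementalDataset:
    """Toy model of the proposed behavior; not part of DataChain."""

    def __init__(
        self,
        list_source: Callable[[], Listing],
        func: Callable[[str], list[float]],
    ) -> None:
        self.list_source = list_source
        self.func = func
        self.versions: list[Version] = []
        self.update()  # the first call creates the 1st version

    def update(self) -> Version:
        """Create the next version, calling func only on new or changed files."""
        old = self.versions[-1] if self.versions else {}
        # "diff": entries that are missing from the old version or whose etag changed
        diff = {
            path: (etag, self.func(path))
            for path, etag in self.list_source().items()
            if path not in old or old[path][0] != etag
        }
        # "union": keep every old row, overwritten by recomputed ones
        new_version = {**old, **diff}
        self.versions.append(new_version)
        return new_version
```

The point of the sketch is that the object has to remember both the source and the function between calls, which is what makes `update()` possible without the user repeating the pipeline.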
Challenges:
my_embedding()
So, Inline project meta #776 might be a prerequisite.