How can I get the "sampled_omnicorpus_cc.json" file? #11

EmberNoWeither · 2025-01-23T07:33:45Z

I found a reference to the file in README.md under your mmlm_llava folder. Could you explain how the file organizes the OmniCorpus-CC dataset into a conversational format? I would like to train on your dataset using the dataset processing code in "train_interleaved.py".

EmberNoWeither · 2025-01-23T07:36:40Z

The URL you provided in #6 has expired.
