training data format #2

weizhou1991 · 2024-03-27T08:27:24Z

Hi,
Your work is really impressive. After reading your paper, I have some questions. It seems you built your own dataset to train the VLM and used a kind of CoT method to do so. Usually, VLM training data looks like this:
{
  "id": "32357477_ijms-21-03049-f006",
  "image": "32357477_ijms-21-03049-f006.jpg",
  "conversations": [
    {
      "from": "human",
      "value": "Illustrate the image through a descriptive explanation\n"
    },
    {
      "from": "gpt",
      "value": "Determination of metaphase II entry in oocytes pre-exposed to ferrocenyl 4-(alkylamino)-1,4-dihydroquinolines. After incubation or not with compounds 9, 6, 10, 7, 11, 8 for 24 h, oocytes were rinsed four times in ND96 for 30 min, before progesterone stimulation. White spot appearance was scored after 15 h. N refers to the number of females and n to the number of oocytes (N = 2 and n = 60)."
    }
  ]
}
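For anyone preparing data in this layout, here is a minimal sketch of how one record could be built and sanity-checked in Python. It assumes the turn list is stored under the key `conversations` and that turns alternate between `human` and `gpt`, as in the sample above. The helper names `make_record` and `validate_record` are hypothetical and not taken from any repository, and the answer text is shortened to the caption's first sentence.

```python
import json


def make_record(sample_id, image_file, prompt, answer):
    """Build one single-turn record in the conversation layout shown above."""
    return {
        "id": sample_id,
        "image": image_file,
        "conversations": [
            {"from": "human", "value": prompt},
            {"from": "gpt", "value": answer},
        ],
    }


def validate_record(record):
    """Return a list of problems; an empty list means the record looks well-formed."""
    problems = []
    for key in ("id", "image", "conversations"):
        if key not in record:
            problems.append(f"missing key: {key}")
    # Assumption: speakers alternate, starting with the human turn.
    for i, turn in enumerate(record.get("conversations", [])):
        expected = "human" if i % 2 == 0 else "gpt"
        if turn.get("from") != expected:
            problems.append(f"turn {i}: expected speaker {expected}")
        if not isinstance(turn.get("value"), str):
            problems.append(f"turn {i}: value must be a string")
    return problems


record = make_record(
    "32357477_ijms-21-03049-f006",
    "32357477_ijms-21-03049-f006.jpg",
    "Illustrate the image through a descriptive explanation\n",
    "Determination of metaphase II entry in oocytes pre-exposed to "
    "ferrocenyl 4-(alkylamino)-1,4-dihydroquinolines.",
)

# A dataset file in this style is typically a JSON list of such records.
print(json.dumps([record], indent=2))
```

Running `validate_record(record)` on each entry before training is a cheap way to catch misspelled keys or out-of-order turns.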

So I am curious what your format looks like. Also, you separate your answers into 3 parts: do you train on them one by one or as a whole?

I would really appreciate it if you could give me some hints. Thank you so much.

Best,

caijianwei1996 · 2024-08-14T02:12:27Z

LLaVA-NeXT may be the solution.
