Fine-tuning on conversations (format of conversations) #15

Eichhof · 2023-01-14T11:19:15Z

Hello

I have a dataset consisting of dialogues between two people which I would like to use for fine-tuning GPT-J. Please see below for two example dialogues. The dialogues vary in length and can be longer than the examples.

Is the format of the conversations OK? For fine-tuning, should I just concatenate all conversations into one big file, or do I need a separator between the conversations (and if so, which separator)?

First Dialogue:

user1:
Hey there. What’s up?

user2:
Not much, just hanging out. What about you?

user1:
Just thinking about what I’m going to do this weekend. You?

user2:
Probably just relaxing. What do you have planned?

user1:
I’m thinking about going to the beach. It’s supposed to be nice this weekend.

user2:
That sounds like a great plan! Have you been to the beach recently?

user1:
Not in a while. It would be nice to get out and enjoy the sun.

user2:
Definitely! I’m sure it’ll be a great time. Do you have any other ideas for the weekend?

Second Dialogue:

user1:
Good morning. What is your profession?

user2:
Good morning. I’m an accountant. What about you?

user1:
I’m a software engineer. How long have you been an accountant?

user2:
I’ve been an accountant for about five years now. What about you? How long have you been a software engineer?

user1:
I’ve been a software engineer for three years. What do you like most about accounting?

user2:
I like how challenging it can be. There’s always something to learn or something new to figure out. What do you like most about software engineering?

user1:
I like how creative it can be. I get to come up with new ideas and new ways of solving problems. It’s a great feeling when you can come up with something that works.
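
To make the separator question concrete, here is a minimal sketch of one possible way to join dialogues into a single training text. It assumes that `<|endoftext|>`, the end-of-text token of the GPT-2 tokenizer that GPT-J reuses, is placed after each conversation; whether that is the right choice depends on the fine-tuning script. The helper names `format_dialogue` and `build_corpus` are hypothetical and not taken from any library.

```python
# Assumption: "<|endoftext|>" marks the end of each conversation. It is the
# end-of-text token of the GPT-2 tokenizer, which GPT-J reuses.
EOS = "<|endoftext|>"


def format_dialogue(turns):
    """Render a list of (speaker, text) pairs in the 'speaker:' + text layout."""
    return "\n\n".join(f"{speaker}:\n{text}" for speaker, text in turns)


def build_corpus(dialogues, sep=EOS):
    """Concatenate all dialogues, appending the separator after each one."""
    return "".join(format_dialogue(d) + "\n" + sep + "\n" for d in dialogues)


dialogues = [
    [("user1", "Hey there. What's up?"),
     ("user2", "Not much, just hanging out. What about you?")],
    [("user1", "Good morning. What is your profession?"),
     ("user2", "Good morning. I'm an accountant. What about you?")],
]

corpus = build_corpus(dialogues)
print(corpus)
```

The resulting string can be written to one file; the separator lets the model see where one conversation stops and the next begins.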

glicerico · 2023-02-03T04:56:45Z

Sorry, why is this related to CodeGen?
