Hallucinations in results #186

Open
travisapple opened this issue Jan 12, 2025 · 2 comments

@travisapple

First off, I love this project. THANK YOU for your time here.

I'm seeing some instances of hallucinations (if that's even the right word for it here), even on simple text like "hi there" with the large data set (rev 9). It gives me 20 seconds of nightmare-fuel audio: slow, whispered, repeating words at low quality. If I run the same string with the mini language set it works, though the pronunciation could be better.

I have a large amount of text that I want to break up into small chunks, generate speech for each chunk, and then stitch the clips back together. My plan is to try each chunk with the large data set and, if it fails, re-generate it with the mini data set (roughly the flow sketched below).
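
Roughly, the loop I have in mind looks like this; synthesize() and looks_hallucinated() are placeholders for project-specific code, and the hallucination check is exactly the part I don't know how to write yet:

    # Sketch only: `synthesize(which_model, text)` returns audio for one chunk and
    # `looks_hallucinated(audio, text)` is the failure test this issue is asking about.
    def generate_with_fallback(chunks, synthesize, looks_hallucinated):
        clips = []
        for chunk in chunks:
            audio = synthesize("large", chunk)    # try the large data set first
            if looks_hallucinated(audio, chunk):
                audio = synthesize("mini", chunk) # fall back to the mini data set
            clips.append(audio)
        return clips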

My question is: how can I programmatically detect a failure? How can I test for hallucinations?

SaKiEQ commented Jan 23, 2025

In my limited experience so far, garbled, unintelligible audio relates to the padding scheme and max_new_tokens being set too low.
If you increase max_new_tokens in your args config, generation works better, but slower.
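
For example, instead of hard-coding it, something like this rough scaling might help; the ratio here is a made-up starting point, not a documented value, and would need tuning for this model:

    # Assumption: the amount of audio to generate scales roughly with the text length.
    # 40 tokens per word is a guess; tune it empirically and cap it to a known-safe value.
    def pick_max_new_tokens(text, tokens_per_word=40, cap=2580):
        return min(tokens_per_word * max(len(text.split()), 1), cap)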

travisapple commented Jan 23, 2025

Tokenizer:

prompt = tokenizer(renderString.strip(), return_tensors="pt", padding=True).to(device)

My generate() looks like this now:

    generation = model.generate(
        input_ids=inputs.input_ids,
        attention_mask=inputs.attention_mask,
        prompt_input_ids=prompt.input_ids,
        min_new_tokens=10,
        max_new_tokens=2580,
        pad_token_id=1024,
        do_sample=True,
        temperature=0.8,  # 1.0 is more diverse, 0.0 more deterministic; smaller values take longer to generate
    )

I've added max_new_tokens, but I have no clue what it should be. I took this number from an example I found, but really, who knows.

I manually cap the length of the prompt text input based on the length of prompt.input_ids[0]. 35 tokens seems to be about where this thing starts to die.
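
Something like this is the kind of splitter I mean; split_by_token_budget is my own helper, not anything from the library, and 35 is just the empirical cutoff mentioned above:

    # Hypothetical helper: greedily pack words into chunks whose tokenized
    # length stays under a budget (~35 tokens, per the observation above).
    def split_by_token_budget(text, tokenizer, budget=35):
        chunks, current = [], []
        for word in text.split():
            candidate = " ".join(current + [word])
            if current and len(tokenizer(candidate).input_ids) > budget:
                chunks.append(" ".join(current))
                current = [word]
            else:
                current.append(word)
        if current:
            chunks.append(" ".join(current))
        return chunks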

After that, I run some audio tests on the output to see whether it died, but it seems like there should be a better way to tell how confident we are in the output.
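
Something like the following is the kind of rough check I mean; the thresholds are guesses and would need tuning, nothing here comes from the model itself:

    import numpy as np

    # Heuristics only: flag output that is far too long for the text (looping /
    # hallucinating) or that is essentially silence. Assumes audio samples in [-1, 1].
    def audio_looks_broken(audio, sample_rate, text,
                           max_secs_per_char=0.12, silence_db=-40.0):
        duration = len(audio) / sample_rate
        if duration > max_secs_per_char * max(len(text), 1):
            return True
        rms = np.sqrt(np.mean(np.square(audio), dtype=np.float64))
        if 20 * np.log10(rms + 1e-9) < silence_db:
            return True
        return False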

Should padding be set to something other than True?

Also, sometimes the last few words are cut off. I'm assuming this has something to do with the truncation setting, but I don't know how.
