[FR] Binoculars as a new metric #644

MayankChaturvedi · 2024-10-20T17:51:09Z

Binoculars is a metric similar to perplexity [read about binocular]. It uses two language models that share a tokenizer to detect LLM-generated text (or data contamination).

Currently, the evaluate library doesn't include Binoculars in its long list of metrics. An interface like the following would make the library more useful:

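from evaluate import load
# Proposed API: "binocular" does not exist in evaluate yet.
predictions = ["Some text to check for LLM authorship."]
binocular = load("binocular", module_type="metric")
results = binocular.compute(predictions=predictions, observer_model='tiiuae/falcon-7b-instruct', performer_model='tiiuae/falcon-7b')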
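
For reference, the score such a metric would return can be sketched in plain Python. Binoculars divides a log-perplexity by a cross-perplexity, and lower scores point toward machine-generated text. This is only a minimal sketch under stated assumptions: `binoculars_score` and `log_softmax` are hypothetical helper names, the logits are assumed to be already aligned with the next-token ids, and the assignment of roles (performer for the perplexity, observer as the reference distribution in the cross-perplexity) is an assumption that a real implementation should check against the paper and its reference code.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def binoculars_score(token_ids, observer_logits, performer_logits):
    """Ratio of log-perplexity to cross-perplexity (hypothetical helper).

    token_ids[i] is the token that actually follows position i, and
    observer_logits[i] / performer_logits[i] are each model's next-token
    logits for that same position (assumed to be aligned already).
    """
    n = len(token_ids)
    log_ppl = 0.0  # negative log-likelihood of the real tokens (performer)
    x_ppl = 0.0    # cross-entropy between observer and performer predictions
    for tok, obs, perf in zip(token_ids, observer_logits, performer_logits):
        perf_logp = log_softmax(perf)
        obs_p = [math.exp(v) for v in log_softmax(obs)]
        log_ppl -= perf_logp[tok]
        x_ppl -= sum(p * lq for p, lq in zip(obs_p, perf_logp))
    # Both terms are averaged over the sequence before taking the ratio.
    return (log_ppl / n) / (x_ppl / n)


# Toy example with a 2-token vocabulary and made-up logits.
toy_score = binoculars_score(
    token_ids=[0, 0],
    observer_logits=[[0.0, 0.0], [0.0, 0.0]],
    performer_logits=[[math.log(3.0), 0.0], [math.log(3.0), 0.0]],
)
```

In an actual metric, the logits would come from running both models on the same tokenized input, which is why the two models need a shared tokenizer.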
