Project Readme: Text Data Labeling and Evaluation with LLM

Overview

Welcome to the Text Data Labeling and Evaluation project! This project uses a Language Model (LLM) to label text from a predefined contract dataset with a set of entities. The entities (document name, party name, governing law, agreement date, effective date, and expiration date) can be customized in a settings file. Labeling works by engineering prompts for the LLM and then parsing its outputs to extract the value of each specified entity. The labeled data is then converted to a specific format so that it can be evaluated extensively against defined Key Performance Indicators (KPIs).

Components

1. Language Model (LLM)

Usage: Labels text data with the predefined entities.
Configuration: Entities are defined in a config file so they can be customized.
Prompt Engineering: Prompts are crafted to instruct the LLM to label each entity.
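
The configuration and prompt steps above could be sketched roughly as follows. This is only an illustration: the `ENTITIES` dictionary, its descriptions, the `build_prompt` function, and the `entity: value` answer format are all assumptions, not the project's actual config layout or prompts.

```python
from __future__ import annotations

# Hypothetical entity configuration. The entity names come from the overview;
# the descriptions and the dictionary layout are assumptions for illustration.
ENTITIES = {
    "document_name": "The title of the contract",
    "party_name": "The parties entering into the agreement",
    "governing_law": "The jurisdiction whose law governs the contract",
    "agreement_date": "The date the agreement was signed",
    "effective_date": "The date the agreement takes effect",
    "expiration_date": "The date the agreement expires",
}


def build_prompt(contract_text: str, entities: dict[str, str] = ENTITIES) -> str:
    """Build a labeling prompt that lists every configured entity."""
    entity_lines = "\n".join(f"- {name}: {desc}" for name, desc in entities.items())
    return (
        "Extract the following entities from the contract below.\n"
        f"{entity_lines}\n"
        "Answer with one line per entity in the form `entity: value`, "
        "and write `entity: N/A` when the entity is absent.\n\n"
        f"Contract:\n{contract_text}"
    )
```

Keeping the entity list in one dictionary means that adding or removing an entity changes the prompt without touching the prompt-building code.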

2. Data Parsing

Role: Extracts labeled information for the defined entities from the LLM output.
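
A parsing step of this kind might look like the sketch below. It assumes the LLM answers with one `entity: value` line per entity and marks missing entities as `N/A`; that output format, the `EXPECTED_ENTITIES` list, and the `parse_llm_output` function are hypothetical and not taken from the project's parser.

```python
import re

# Entity names from the overview; the snake_case spelling is an assumption.
EXPECTED_ENTITIES = [
    "document_name",
    "party_name",
    "governing_law",
    "agreement_date",
    "effective_date",
    "expiration_date",
]

# Matches lines such as "governing_law: Delaware" or "- Governing Law: Delaware".
LINE_PATTERN = re.compile(r"^\s*[-*]?\s*([A-Za-z_ ]+?)\s*:\s*(.*)$")


def parse_llm_output(raw, entities=EXPECTED_ENTITIES):
    """Return a dict with one key per expected entity (None when not found)."""
    parsed = {name: None for name in entities}
    for line in raw.splitlines():
        match = LINE_PATTERN.match(line)
        if not match:
            continue  # skip chatter that is not an "entity: value" line
        key = match.group(1).strip().lower().replace(" ", "_")
        value = match.group(2).strip()
        if key in parsed and value and value.upper() != "N/A":
            parsed[key] = value
    return parsed
```

Returning a key for every expected entity, even when the value is `None`, keeps later steps from having to guess which entities were skipped.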

3. Data Formatting

Purpose: Ensures the labeled data conforms to a specific format for evaluation.
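
The README does not spell out the evaluation format, so the following is only a sketch of what such a formatting step could look like. It assumes one flat row per document and entity, with whitespace and case normalized; the row layout and the `normalize` and `to_evaluation_rows` helpers are hypothetical.

```python
def normalize(value):
    """Lower-case and collapse whitespace; a missing value becomes an empty string."""
    if value is None:
        return ""
    return " ".join(str(value).lower().split())


def to_evaluation_rows(doc_id, predicted, gold):
    """Pair predicted and gold values per entity in a flat, comparable layout."""
    rows = []
    for entity in sorted(set(predicted) | set(gold)):
        rows.append(
            {
                "doc_id": doc_id,
                "entity": entity,
                "predicted": normalize(predicted.get(entity)),
                "gold": normalize(gold.get(entity)),
            }
        )
    return rows
```

For example, `to_evaluation_rows("c1", {"governing_law": " State of  New York "}, {"governing_law": "state of new york"})` yields a single row whose `predicted` and `gold` fields are identical after normalization.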

4. Evaluation

Metrics: Uses an extensive set of Key Performance Indicators (KPIs) for evaluation.
Similarity Metric: Uses TF-IDF vectorization and embeddings to measure similarity.
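
To illustrate the TF-IDF part of the similarity metric, here is a minimal pure-Python sketch that scores a predicted value against a gold value with cosine similarity. It is a stand-in written for this README, not the project's evaluation code: the whitespace tokenizer, the smoothed IDF formula, and the function names are assumptions, and the embedding-based similarity is not covered.

```python
import math
from collections import Counter


def tokenize(text):
    """Very simple tokenizer: lower-case and split on whitespace."""
    return text.lower().split()


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed inverse document frequency, so no term gets a zero weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    return [
        {term: count * idf[term] for term, count in Counter(tokens).items()}
        for tokens in tokenized
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def tfidf_similarity(predicted, gold):
    """Score in [0, 1]: higher means the two strings share more weighted terms."""
    vec_pred, vec_gold = tfidf_vectors([predicted, gold])
    return cosine(vec_pred, vec_gold)
```

Under this sketch, identical strings score close to 1, strings with no shared tokens score 0, and partial overlaps such as "state of new york" versus "new york" fall in between.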

Folders and files

Dataset
6. Evaluation.ipynb
Evaluation.py
ForEvaluation.py
README.md
config.py
llm_generation.py
main.py
outputParser.py

Repository: UmerrAhsan/DataLabeling
