Official repository for the paper "LegalDuet: Learning Effective Representations for Legal Judgment Prediction via a Dual-View Contrastive Learning".
This repository provides resources for our paper LegalDuet, which proposes a new method to enhance the accuracy of Legal Judgment Prediction (LJP). Our model leverages a dual-view legal reasoning mechanism designed to emulate a judge's reasoning process when analyzing legal cases. This approach involves:
- Law Case Clustering: Utilizing past legal decisions to inform current judgments.
- Legal Decision Matching: Extracting specific legal rules and triggers to improve prediction quality.
We used the CAIL benchmark, based on the CAIL2018 dataset, to comprehensively evaluate legal judgment prediction models.
Key tasks include:
- Law Article Prediction: Determining the correct legal articles applicable to a given case.
- Charge Prediction: Predicting the correct charge based on the criminal facts.
- Imprisonment Prediction: Estimating the sentence length based on case specifics.
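As a rough illustration of how the three subtasks fit together, the sketch below treats each one as a separate classification target on the same case and reports accuracy per task. The `Judgment` record, its field names, the label values, and the idea of representing the prison term as a bucket index are all assumptions made for this example; they are not the CAIL schema or this repository's evaluation code.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Judgment:
    """Hypothetical container for the three labels predicted per case."""
    law_article: int   # illustrative article number
    charge: str        # illustrative charge name
    term_bucket: int   # illustrative sentence-length class index


# One entry per subtask; each is scored independently.
TASKS = ("law_article", "charge", "term_bucket")


def per_task_accuracy(gold: List[Judgment], pred: List[Judgment]) -> Dict[str, float]:
    """Return the fraction of cases predicted correctly for each subtask."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    return {
        task: sum(getattr(g, task) == getattr(p, task) for g, p in zip(gold, pred)) / len(gold)
        for task in TASKS
    }


gold = [Judgment(264, "theft", 2), Judgment(234, "intentional injury", 4)]
pred = [Judgment(264, "theft", 3), Judgment(234, "intentional injury", 4)]
print(per_task_accuracy(gold, pred))
# {'law_article': 1.0, 'charge': 1.0, 'term_bucket': 0.5}
```

Scoring the tasks separately makes it visible when a model gets the article and charge right but misjudges the sentence length, as in the first case above.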
LegalDuet employs two key reasoning modules:
- Law Case Clustering: Uses past cases and decisions to inform new judgments, identifying subtle differences between similar cases to refine predictions.
- Legal Decision Matching: Focuses on the specific legal articles and charges related to a case, enabling a more structured legal decision-making process.
The model is pre-trained using these dual mechanisms, creating a more tailored embedding space for legal tasks.
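To give a feel for what a contrastive pretraining objective looks like, here is a minimal sketch of a generic InfoNCE-style loss in plain Python. It is not the LegalDuet implementation: the function names, the use of cosine similarity, and the `temperature` value are assumptions for illustration. Under this reading, each of the two views would supply its own positives and negatives (for example, a related past case versus a confusingly similar one, or the matching legal decision versus other decisions), which is also an assumption about how the views map onto the loss.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


def info_nce(
    anchor: Sequence[float],
    positive: Sequence[float],
    negatives: List[Sequence[float]],
    temperature: float = 0.1,  # hypothetical value, not taken from the paper
) -> float:
    """Generic InfoNCE loss: -log softmax score of the positive among all candidates."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, neg) / temperature for neg in negatives]
    # Numerically stable log-sum-exp over all candidate scores.
    top = max(logits)
    log_sum_exp = top + math.log(sum(math.exp(x - top) for x in logits))
    return log_sum_exp - logits[0]


# The loss is small when the anchor is closest to its positive,
# and large when a negative is closer than the positive.
case = [1.0, 0.0]
print(info_nce(case, [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]]))
print(info_nce(case, [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]]))
```

Minimizing a loss of this shape pulls the anchor toward its positive and pushes it away from the negatives, which is the general mechanism by which contrastive pretraining reshapes an embedding space.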
Create and activate a conda environment:
conda create -n LegalDuet_env python==3.8
conda activate LegalDuet_env
Clone the repository and install the requirements:
git clone https://github.com/NEUIR/LegalDuet.git
cd LegalDuet
pip install -r requirements.txt
To quickly start using our model, you can download our pretrained model from Hugging Face: 🤗 Model
Once downloaded, navigate to the Fine-Tuning directory to begin fine-tuning:
cd Fine-Tuning
For detailed instructions on how to use the pretrained model, refer to Fine-Tuning/README.md.
To reproduce the LegalDuet pretraining process, you will need the pretraining data.
The pretraining data file, rest_data.json, can be downloaded from the following link: 📂 Pretraining Dataset
Once downloaded, navigate to the LegalDuet directory to begin reproducing the pretraining:
cd LegalDuet
For detailed instructions on how to reproduce the pretraining, refer to LegalDuet/README.md.
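Before launching pretraining, it can help to look at a few records of the downloaded file. The layout of rest_data.json is not described here, so the sketch below makes an assumption: the file is either a single JSON array or JSON Lines (one object per line). The helper name `peek_records` is hypothetical and not part of this repository.

```python
import json
from itertools import islice
from typing import Any, List


def peek_records(path: str, limit: int = 3) -> List[Any]:
    """Return the first `limit` records of a JSON-array or JSON Lines file."""
    with open(path, encoding="utf-8") as f:
        # Find the first non-whitespace character to guess the layout.
        first = ""
        while True:
            ch = f.read(1)
            if not ch or not ch.isspace():
                first = ch
                break
        f.seek(0)
        if first == "[":
            # Assumed JSON array: this loads the whole file into memory.
            return json.load(f)[:limit]
        # Assumed JSON Lines: stream and stop after `limit` records.
        records = (json.loads(line) for line in f if line.strip())
        return list(islice(records, limit))


# Example (path is wherever the file was saved):
# for record in peek_records("rest_data.json"):
#     print(record)
```

If the file turns out to use a different layout, the JSON Lines branch raises `json.JSONDecodeError`, which is a signal to consult LegalDuet/README.md for the expected format.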
We conducted a comparative study of embedding spaces to evaluate the discriminative power of LegalDuet embeddings. Using t-SNE, we visualized the embedding spaces of BERT, BERT+LegalDuet, and other ablation models, with the final visualization of SAILER+LegalDuet shown in the bottom-right.
Legal judgment prediction performance on the CAIL-small dataset. The best results are highlighted in bold, and the second-best results across all models are underlined.
Legal judgment prediction performance on the CAIL-big dataset. The best results are highlighted in bold, and the second-best results across all models are underlined.
Please cite the paper and star the repo if you use LegalDuet and find it helpful.
Feel free to contact [email protected] or open an issue if you have any questions.
@article{LegalDuet2024,
title={LegalDuet: Learning Effective Representations for Legal Judgment Prediction via a Dual-View Contrastive Learning},
author={Buqiang Xu and Zhenghao Liu and Sijia Yao and Xinze Li and Yu Gu and Ge Yu},
year={2024},
}