We introduce Nüwa, a comprehensive Traditional Chinese Medicine (TCM) LLM covering the entire training pipeline, from continuous pre-training and supervised instruction fine-tuning to reinforcement learning from AI feedback. Nüwa outperforms other open-source Chinese medical LLMs in the TCM domain, thanks in part to our construction of a large-scale TCM training corpus and a TCM dialogue dataset.
✅ [2024/08/15] Nüwa begins releasing the dataset, code, etc.
✅ [2024/08/01] The Nüwa TCM repo is created.
- `data/pretrain`: contains part of the TCM corpus for continuous pre-training.
- `data/finetune`: contains part of TCM-QR for supervised instruction fine-tuning.
- `data/reward`: contains samples for training the reward model.
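For illustration, a fine-tuning record in `data/finetune` can be expected to pair a TCM question with a reference response. The field names below (`instruction`, `output`) are assumptions for the sketch, not the repo's guaranteed schema; check the released files for the actual format. A minimal round-trip through one JSONL line:

```python
import json

# Hypothetical example record; the actual field names in data/finetune
# may differ -- this only illustrates the question/response pairing.
record = {
    "instruction": "What herbs are commonly used to dispel wind-cold?",
    "output": "Formulas for wind-cold commonly include ephedra (ma huang).",
}

# Serialize and parse one JSONL line, as a training data loader would.
line = json.dumps(record, ensure_ascii=False)
parsed = json.loads(line)
print(parsed["instruction"])
```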
Training Stage:
| Stage | Python script | Shell script |
|---|---|---|
Stage 1: Continue Pre-training | pretraining.py | run_pt.sh |
Stage 2: Supervised Instruction Fine-tuning | supervised_finetuning.py | run_sft.sh |
Stage 3: Reward Modeling | reward_modeling.py | run_rm.sh |
Stage 4: Reinforcement Learning | rl_training.py | run_rl.sh |
- To install the required packages, first create a conda environment:
conda create --name nvwa-tcm python=3.8
- Activate the conda environment:
conda activate nvwa-tcm
- Install the required packages with pip:
pip install -r requirements.txt
- Please download the LLaMA-Ziya-13B model from the Download Link.
- Continuous Pre-training
bash run_pt.sh
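Continuous pre-training is plain next-token language modeling on the TCM corpus; a common data-preparation step behind scripts like `run_pt.sh` is packing tokenized documents into fixed-length blocks. A toy sketch of that packing step (the token IDs and block size are made up for illustration):

```python
def pack_blocks(token_streams, block_size):
    """Concatenate tokenized documents and split the stream into
    equal-length training blocks, dropping the ragged tail."""
    flat = [tok for doc in token_streams for tok in doc]
    usable = (len(flat) // block_size) * block_size
    return [flat[i:i + block_size] for i in range(0, usable, block_size)]

# Three toy "documents" of token IDs, packed into blocks of 4.
docs = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
blocks = pack_blocks(docs, block_size=4)
print(blocks)  # [[1, 2, 3, 4], [5, 6, 7, 8]]
```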
- Supervised Instruction Fine-tuning
bash run_sft.sh
The LoRA method is used here, so the adapter parameters must be merged back into the base model:
python merge_peft_adapter.py
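Merging a LoRA adapter folds the low-rank update back into the frozen base weight, W' = W + (α/r)·B·A, so inference needs no separate adapter. The numpy sketch below is a toy illustration of that identity, not the `merge_peft_adapter.py` script itself (which operates on the PEFT checkpoint):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 8, 2, 4                 # hidden size, LoRA rank, LoRA alpha

W = rng.standard_normal((d, d))       # frozen base weight
A = rng.standard_normal((r, d))       # LoRA down-projection
B = rng.standard_normal((d, r))       # LoRA up-projection
scaling = alpha / r

# Merging bakes the low-rank update into a single weight matrix.
W_merged = W + scaling * (B @ A)

x = rng.standard_normal(d)
y_adapter = x @ W.T + scaling * (x @ A.T) @ B.T   # base + adapter path
y_merged = x @ W_merged.T                          # merged path
print(np.allclose(y_adapter, y_merged))            # the two paths agree
```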
- Reward Modeling
bash run_rm.sh
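Reward models of this kind are typically trained on chosen/rejected response pairs (as in `data/reward`) with a pairwise ranking loss, -log σ(r_chosen − r_rejected). A pure-Python sketch of that objective; the repo's actual training details may differ:

```python
import math

def pairwise_reward_loss(r_chosen: float, r_rejected: float) -> float:
    """-log(sigmoid(r_chosen - r_rejected)): small when the reward
    model scores the chosen response above the rejected one."""
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Correct ordering yields a small loss; an inverted ordering a large one.
good = pairwise_reward_loss(2.0, -1.0)
bad = pairwise_reward_loss(-1.0, 2.0)
print(good < bad)
```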
- Reinforcement Learning
bash run_rl.sh
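RLHF-style training of this kind usually optimizes a PPO objective, clipping the policy ratio so each update stays close to the old policy. A minimal sketch of the standard clipped surrogate for a single token (an illustration of the textbook objective, not the repo's exact code):

```python
def ppo_clipped_objective(ratio: float, advantage: float, eps: float = 0.2) -> float:
    """PPO surrogate: min(ratio * A, clip(ratio, 1-eps, 1+eps) * A)."""
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)

# With positive advantage, gains from ratios beyond 1+eps are clipped away.
print(ppo_clipped_objective(1.5, 1.0))  # capped at 1 + eps = 1.2
print(ppo_clipped_objective(0.9, 1.0))  # inside the clip range: 0.9
```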
- Inference
python inference.py