MathChat

Official code and data repository for MathChat: Benchmarking Mathematical Reasoning and Instruction Following in Multi-Turn Interactions

Welcome to the repository! It contains two parts.

The first part is the four JSONL files of our proposed MathChat benchmark:

File Descriptions

1. follow_up.jsonl

This file contains entries that facilitate follow-up questioning. Each line consists of three keys:

question: Sourced from the GSM8k testing set.
answer: Corresponding answer from the GSM8k testing set.
followup: Includes two rounds of follow-up questions and reference answers, formatted as a conversation between a user (A:) and an assistant (B:).
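Since every benchmark file is JSONL (one JSON object per line), a record can be read with the standard library alone. The following is a minimal sketch, not part of the repository's own scripts; the helper name `read_jsonl` and the local path `follow_up.jsonl` are assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Iterator


def read_jsonl(path: str) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of a JSONL file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines, if any
                yield json.loads(line)


if __name__ == "__main__":
    # Assumed local path; adjust to wherever the benchmark file is stored.
    for record in read_jsonl("follow_up.jsonl"):
        # Per the description above, the keys should be
        # question, answer, and followup.
        print(sorted(record.keys()))
        break
```

The same helper should work unchanged for the other three benchmark files, since they share the JSONL layout.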

2. error_correction.jsonl

This file is designed for error correction tasks. Each line consists of three keys:

question: Sourced from the GSM8k testing set.
answer: Corresponding answer from the GSM8k testing set.
error_correction: Contains a conversation between a user (A:) and an assistant (B:), which includes the original question, an incorrect answer, and the process of correcting the error.
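To feed such a conversation to a chat model, it can help to split it into individual turns. The sketch below assumes the conversation is stored as a single string in which each turn starts on a new line with `A:` or `B:`; this layout is an assumption, so check it against the actual data before relying on it. The function name `split_turns` is hypothetical.

```python
import re

# Assumption: a turn begins where "A:" or "B:" opens a line.
_TURN_START = re.compile(r"^(A|B):[ \t]*", flags=re.MULTILINE)


def split_turns(conversation: str) -> list[tuple[str, str]]:
    """Split a conversation string into (role, text) pairs.

    "A:" is mapped to the user and "B:" to the assistant,
    following the file descriptions in this README.
    """
    matches = list(_TURN_START.finditer(conversation))
    turns = []
    for i, match in enumerate(matches):
        # A turn runs until the next marker, or to the end of the string.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(conversation)
        role = "user" if match.group(1) == "A" else "assistant"
        turns.append((role, conversation[match.end():end].strip()))
    return turns
```

Text that continues over several lines stays attached to the turn it belongs to, because only a marker at the start of a line opens a new turn.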

3. error_analysis.jsonl

This file also focuses on error correction but employs a different prompt strategy. Each line consists of three keys:

question: Sourced from the GSM8k testing set.
answer: Corresponding answer from the GSM8k testing set.
error_analysis: Includes a conversation between a user (A:) and an assistant (B:), where the model is prompted to determine on its own whether the given answer is correct, without being told whether it contains an error.

4. p2p_generation.jsonl

This file contains entries for problem generation tasks. Each line consists of three keys:

question: Sourced from the GSM8k testing set.
answer: Corresponding answer from the GSM8k testing set.
new_problem: A new problem generated by GPT-4 to serve as a reference answer.
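The key names listed for the four files can be collected into a small lookup table and used as a sanity check after download. This is only a sketch: the table `EXPECTED_KEYS` and the function `missing_keys` are hypothetical names, and the key sets are taken from the descriptions above rather than verified against the data.

```python
import json

# Key sets copied from the file descriptions in this README.
EXPECTED_KEYS = {
    "follow_up.jsonl": {"question", "answer", "followup"},
    "error_correction.jsonl": {"question", "answer", "error_correction"},
    "error_analysis.jsonl": {"question", "answer", "error_analysis"},
    "p2p_generation.jsonl": {"question", "answer", "new_problem"},
}


def missing_keys(filename: str, line: str) -> set[str]:
    """Return the documented keys that are absent from one JSONL line."""
    record = json.loads(line)
    return EXPECTED_KEYS[filename] - set(record)
```

An empty result means the line carries every key documented for that file; extra keys are ignored.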

The second part is the MathChat_sync dataset, which can be used to fine-tune your own LLMs.

Because the dataset is large, the file is hosted on Google Drive:

https://drive.google.com/file/d/1nkAXAL9EpmDiceoV_qv6M00Lj0LA7MKX/view?usp=sharing

We hope these files aid in your analysis and development efforts. For any questions or contributions, please feel free to open an issue or submit a pull request. Happy coding!
