Source code for ACL 2024 paper INTERVENOR : Prompt the Coding Ability of Large Language Models with the Interactive Chain of Repairing
INTERVENOR conducts an interactive code-repair process, facilitating the collaboration among agents and the code compiler.
In the initial step (Step A^0, where 0 signifies the initial turn), we prompt the Code Learner to generate code for the given task. Subsequently, the Code Learner executes the generated code using the code compiler to assess its correctness (Step B). Following this, the Code Teacher generates code repair instructions (CoR) based on the bug report and the associated buggy code (Step C). These instructions elucidate the reason of the bug, such as modified to handle the case where the input decimal is 0
, and include code correction planning, for example, we can simply return db0db
. Such instructions are informative and enhance the guidance for the Code Learner. Ultimately, the Code Learner follows the chain-of-repair (CoR) to rectify the code and subsequently resubmits the corrected version to the compiler for execution in the subsequent turn ((Step A^i), where i≥1 denotes the code repair process). The A^i, B, and C steps are iterated sequentially until either the code meets the compiler's estimation or the maximum turn limit is reached.
Install dependencies:
git clone https://github.com/NEUIR/INTERVENOR.git
cd INTERVENOR
pip install -r requirements.txt
pip install -e human-eval
Set your API KEY in this.
First, we need to generate initial code based on natural language descriptions.
python intervenor.py \
--problem_file =human_eval/data/human-eval-v2-20210705.jsonl \
--dataset humaneval \ # choices=["humaneval", "mbpp", "humanevalx", "codeerror"]
--language python \ # choices=["python", "cpp", "java", "js"]
--model gpt-3.5-turbo-instruct-0914 \
--max_tokens 512 \
--temperature 0.2 \
--num_samples_per_task 1 \
--todo code_generation # choices=["code_generation", "cor_generation", "code_reapir"]
The result file is saved in ./results/code_generation/{dataset}/{language}.jsonl
.
After obtaining the initial generated code, it is necessary to test the generated code and retain the test results along with error information.
For more detailed testing steps, please refer to HumanEval Repo
evaluate_functional_correctness {result path} --problem_file={problem_path}
The test result file is saved in the ./results/code_generation/{dataset}/{language}.jsonl_results.jsonl
.
After obtaining the test results, we know which code is correct and which is incorrect. For erroneous code, we employ Code Teacher to provide the Chain-of-Repairing (CoR).
python intervenor.py \
--problem_file =human_eval/data/human-eval-v2-20210705.jsonl \
--dataset humaneval \
--language python \
--model gpt-3.5-turbo-instruct-0914 \
--max_tokens 512 \
--temperature 0.2 \
--num_samples_per_task 1 \
--todo cor_generation # change to cor_generation
The cor file is saved in ./results/cor_generation/{dataset}/{language}_cor.jsonl
.
After obtaining the Chain-of-Repairing (CoR), we proceed to fix the buggy code and save the repaired code to ./results/code_repair/{dataset}/{language}.jsonl
.
python intervenor.py \
--problem_file =human_eval/data/human-eval-v2-20210705.jsonl \
--dataset humaneval \
--language python \
--model gpt-3.5-turbo-instruct-0914 \
--max_tokens 512 \
--temperature 0.2 \
--num_samples_per_task 1 \
--todo code_repair # change to cor_repair
Then, we test the repaired code.
evaluate_functional_correctness {result path} --problem_file={problem_path}
INTERVENOR involves the integration of two agents, Code Learner and Code Teacher, who work interactively to repair the generated codes. The prompts they use are as follows:
Code Teacher:
You are an experienced and insightful programming instructor, and you need to identify the bugs in the given code based on the error messages.
- buggy code:
%%%buggy_code%%%
When testing the above code, errors occurred: %%%error_message%%%, some test cases did not pass!
Please check the implementation of the function and provide a method for modification based on the error message. No need to provide the modified code.
Modification method:
Code Learner:
You are a student assistant with excellent code repair capabilities. You can attempt to fix the bugs in the above code based on the provided error information and the method for modification. Please make sure to carefully check every potentially problematic area and make appropriate adjustments and corrections.
- buggy code:
%%%buggy_code%%%
When testing the above code, errors occurred: %%%error_message%%% , some test cases did not pass! Please check the implementation of the function and fix the code based on the modification method.
- modification method:
%%%repair_method%%%
Correct the code:
For additional prompts, please refer to this directory and our paper.
If you use our code or our data for your research, feel free to cite our publication:
@inproceedings{wang2024intervenor,
title={INTERVENOR: Prompt the Coding Ability of Large Language Models with the Interactive Chain of Repair},
author={Hanbin Wang and Zhenghao Liu and Shuo Wang and Ganqu Cui and Ning Ding and Zhiyuan Liu and Ge Yu},
booktitle={Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics},
year={2024}
}
Please send email to [email protected].