iteration: 2 (number of times ai_coding replaced by a new version generated by itself)
This project uses LLM to generate code from a requirement file. The requirement file is in TOML format. The program reads the requirement file, and generate code in the language specified in the requirement file. The generated code is written to a file, and ready to execute.
This project is an exploration on the possibility of using LLM to write program without direct interact with code. Tools like github copilot and cursor allows you interact with code more efficiently, but user still act as the role of programmer. This project let user act as architect who design the program, write requirements and specifications, and let LLM to write the code. The goal is to maintain requirement file, rather than maintain AI generated code itself.
We all want AI could take one line description from user and generate the whole project, and user would say 'ah, that's what I want'. However this is a long way to go. This project follows the following principle: If one level of abstraction can be used as a 99.99% reliable tool, then next level could be automated by AI. So if Python language can be 99.99% reliable, then code generation of a small Python program can be automated by LLM. But it still needs human to write the next level of abstraction, which is the requirement file, because the automation by LLM to generate code is not 99.99% reliable. When we evolve, we will check whether the generated code is 99.99% reliable, and if it is, then we can automate the next level of abstraction, that is let AI generate requirement file by chat with user or product manager.
This seems in contrast with current practice that either let AI co-programming with human, or let a bunch of AI agent talk to each other to generate a project, or let AI agent divide and conquer to generate a project. This project is more bottom up approach. It takes what AI can do now, and seek to expand its capability by evolve little by little, and ask human to do what human must do to make sure everything is workable.
- 小实验:使用DeepSeek来设计能为需求自动生成代码的代码生成器ai_coder,而且还能自我迭代!https://zhuanlan.zhihu.com/p/8471526424
- Write a requirement .toml file, use
helloworld.toml
as example. - Set the following three environment variables accordingly:
OPENAI_API_KEY
, your api key from deepseekOPENAI_API_URL
, I use https://api.deepseek.com/betaOPENAI_API_MODEL
, I use deepseek-chat
- Run
python ai_coding.py <requirement.toml>
- Result will be written to a file ready to execute.
ai_coding.py
: the main program that generates code from requirement fileexamples
: example requirementstests
: test programs in requirement file formathistory/iter<n>
: history of the project, this contains oldai_coding.py
andai_coder.toml
.
Examples are under examples/
directory.
helloworld.toml
: a simple hello world program in Pythonqsort.toml
: a quick sort program that sorts your inputs from command line in Pythonaskllm.toml
: a program that asks a question and gets an answer from DeepSpeed in bash scriptaskllmpy.toml
: the same program in python and it streams the output and beautify it withrich
packageai_coder.toml
: a program that takes requirement file and generates code in Python, so this project can generate itself
Requirement file is in TOML format. It has the following sections:
- file section: this sections contains general information about the program
- name: name of the program
- language: language of the program
- description: description of the program
- extension(optional): file extension of the program
- project section: this section contains information about the project
- requirements: Your requirements goes to this field, it is a single string not a list
- specs: This fields contains specs that the program would use, for example, how to do certain action following a specification. If you want to tell the program about a function prototype you want it to call, put it here
- requirements: Your requirements goes to this field, it is a single string not a list
- specs: This fields contains specs that the program would use, for example, how to do certain action following a specification. If you want to tell the program about a function prototype you want it to call, etc. put it here.
- hints: This fields contains things that LLM should know if it is smart enough. If you want to tell the program about something it might get confused without, put it here.
The program askllm.py
is generated by ai_coding, without human interact with the code. Check examples/askllmpy.toml
see the requirement file.
- Reflection following Andrew Ng's artical https://www.deeplearning.ai/the-batch/agentic-design-patterns-part-2-reflection/?ref=dl-staging-website.ghost.io
- Multiple component support, so that one requirement file can generate multiple files
- increase stability of code generated
- Generate and launch the generated code in Docker container
This project is evolving. I look it as an experiment to see how far I could go without write new code except the initial bootstrap code, or at least no new hand written code beyond certain iterations. This is the log of the project:
- 2024-11-22 morning: I created this project, and write the initial Python code. It can be used to generate
helloworld
and soon I expand toqsort
andaskllm
. - 2024-11-22 afternoon: I want to know whether the program can generate itself, so I wrote
ai_coder.toml
to describe the program itself. I useai_coding.py
to generateai_coder.py
, and the generated code can generatehelloworld.py
andqsort.py
. - 2024-11-24: I tried to use
ai_coder.py
to generateai_coder.toml
, it met some problem with handling of```
in the string.ai_coder
would stop generate code if met "```
" in code because it is looked as stop sign. However ai_coder itself check for```
as stop condition, so generation of itself would trigger early stop. It is easy to fix when I wroteai_coding.py
, but now I have to tellai_coder
how to get around this inai_coder.toml
. I put a hint inai_coder.toml
to tellai_coder
to use "`
"+"``
" instead of "```
" to avoid early stop. - 2024-11-25: Spent one hour on making a new iteration (
iter1
). Now I want to seperate reqirements field into requirements, specs, and hints. So requirement file will be easier to maintain or possible for auto generate requirement file in the future. I rewrote requirement file forai_coder
to use the new format. Replacedai_coding.py
with generatedai_coder.py
. I didn't write new python code in this iteration - 2024-11-27: Tried to add reflection in ai_coder. However it turns out that code comments given by LLM sometimes make the code generation generate worse code. i.e. exception handling around print statement. Finally I use flake8 as python critic for reflection. I didn't have conclusion for use flake8 as critic. Need to do more test and exploration on this. Another thing I found is that LLM will get confused if there are too many requirements. For example, initially code generator would be able to generate code according to language field in TOML file. But when I add more requirements, sometimes it would always generate with python language. Now I removed some requirements that is not very critical (i.e. compute cost of generation) so the code generator can focus on logic that is really important. Let's call this
iter2
- Consider execute the code as a critic as well to see how it works
- 2024-11-28: For relative simple requirements, this system is very reliable, see this updated
askllm
that beautify the result withrich
package. And the new requirement file here (https://github.com/delock/ai_coding/blob/main/examples/askllm.toml). I'm glad I only need to consider higher level direction and structure and let AI figure out the detail. - 2024-12-3: Remove use flake8 as critic for python code. It does not seem fix the code even when flake8 generate critic, may have a different try later.
- 2924-12-10: I added a setup.py so it can be installed and run as a command line tool. I'm not an expert in writing setup.py so I generate it with ai_coding. Move all .toml related to this project to
self/
directory. I have been usingai_coding
to generate code I need which is simple but need me to lookup a lot on internet,ai_coding
is perfect for this kind of task.
It would eventurally comes to the question that whether we can trust AI generated code without look at them. This is a topic that needs to be considered. In the future, actions may need to take to ensure the generated code is safe to run, this includes:
- Add a security check LLM for generated code
- Run the generated code in a sandbox environment to minimize interaction with the system