This is a basic compiler written in Python that translates a simple, human-readable language into x86-64 assembly, which is then assembled with NASM and executed. The language supports variable declarations, loops, if/else conditions, and print statements.
I was asked to build a simple compiler with minimal functionality for a college course, since we could not cover everything we needed to know about compilers because of the coronavirus pandemic.
While reading about compilers online, I found a great tutorial on Linux assembly. Since I was already familiar with assembly from another course, I decided to work through it and build the compiler so that it generates real machine-level code I could play around with. This is the result.
Before you begin, ensure you have met the following requirements:
- You have installed Python 3.6 or above
- You have installed NASM on your PC
- Fork or download this repository and open a command line in the repository folder
- Open the read.txt file and write your code in it
- Run your code by typing the following in the CLI:
python3 main.py
- Each time you press Enter, the code in read.txt is executed again, so you can edit the code and just press Enter in the CLI to rerun it
- Lexical Analyzer
  - Found in the compiler.LexicalAnalysis.py file
  - Lexer is the main class called from this file
  - It reads the input as text and separates it into tokens
- Parser
  - Found in Compiler.parsing.py
  - Parser is the main class called from this file
  - It reads the array of tokens generated by the Lexer, converts it into a tree using the grammar of the language, and returns the root of this tree, which is an object of the Statement class
  - The Statement class can be found in the utils package, in the TreeNodes.py file
- Intermediate Code Generation
  - Found in compiler.intermidate_codegeneration.py
  - IntermidateCodeGeneration is the main class
  - It reads the tree from the parser and returns intermediate code that is very close to the machine code
- Code Optimization
  - Found in compiler.code_Optimization.py
  - CodeOptimization is the main class
  - It reads the array of codes generated by the intermediate code generation phase
  - It optimizes the code by removing redundant lines, then returns an array of the same type
- Code Generation
  - Found in compiler.code_generation
  - CodeGeneration is the main class
  - It reads the array of codes from the optimizer and converts it into the target assembly code
  - It writes this code to a file and then executes it
- Main
  - This is the main.py file
  - It runs the five phases above over and over until the user exits the shell
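As an illustration of the lexical analysis phase described above, here is a minimal regex-based tokenizer sketch. It is not the repository's Lexer class: the token names, the `KEYWORDS` set, and the `tokenize` function are hypothetical and only show one way text can be separated into tokens.

```python
import re

# Hypothetical token names and patterns; the real Lexer class may differ.
TOKEN_SPEC = [
    ("COMMENT", r"#[^#]*#"),                # text between two hash signs
    ("STRING",  r'"[^"]*"'),                # double-quoted string literal
    ("INT",     r"\d+"),                    # digit(digit)*
    ("ID",      r"[A-Za-z][A-Za-z0-9]*"),   # letter(letter|digit)*
    ("OP",      r">=|<=|==|!=|[-+*/<>=]"),  # two-character operators first
    ("PUNCT",   r"[(){};,]"),
    ("SKIP",    r"\s+"),                    # whitespace
    ("ERROR",   r"."),                      # anything else
]
KEYWORDS = {"int", "if", "else", "while", "print", "prints"}
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (kind, text) pairs, dropping comments and whitespace."""
    tokens = []
    for match in MASTER.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind in ("COMMENT", "SKIP"):
            continue
        if kind == "ERROR":
            raise SyntaxError(f"unexpected character {text!r} at offset {match.start()}")
        if kind == "ID" and text in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, text))
    return tokens


if __name__ == "__main__":
    for token in tokenize('int a; a=12; print(a); # a comment #'):
        print(token)
```

Listing the two-character operators before the single-character ones keeps `>=` from being split into `>` and `=`.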
Statement : (Assignment | IfStatement | WhileStatement |
Declaration | Print)*
Assignment : IDENTIFIER = arithmetic_expression ;
IfStatement : if (Condition) {Statement} (else {Statement})?
WhileStatement : while (Condition) {Statement}
Declaration : int IDENTIFIER (,IDENTIFIER)* ;
Print : print(arithmetic_expression); | prints(String);
String : "[A-Za-z0-9_]*"
Condition : arithmetic_expression Compare_operation arithmetic_expression
Compare_operation : ( > | < | >= | <= | == | != )
arithmetic_expression : term (( + | - ) term)*
term : factor(( * | / )factor)*
factor : (INT|IDENTIFIER| (arithmetic_expression) )
IDENTIFIER : letter(letter|digit)*
letter : A|B|C ... Z|a|b|c ... z
digit : 0|1|2|3|4|5|6|7|8|9
INT : digit(digit)*
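The arithmetic_expression, term, and factor rules above map naturally onto a recursive-descent parser, with one method per rule. The sketch below is only an illustration of that technique, not the repository's Parser class: it evaluates the expression directly instead of building a tree, the names `ExprEvaluator`, `tokenize_expr`, and `evaluate` are hypothetical, and `/` is assumed to be integer division.

```python
import re


def tokenize_expr(text):
    # Integers, identifiers, operators and parentheses.
    # Whitespace (and any other character) is silently skipped in this sketch.
    return re.findall(r"\d+|[A-Za-z][A-Za-z0-9]*|[-+*/()]", text)


class ExprEvaluator:
    """One method per grammar rule; each returns the value of what it parsed."""

    def __init__(self, tokens, variables=None):
        self.tokens = tokens
        self.pos = 0
        self.variables = variables or {}

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def arithmetic_expression(self):
        # arithmetic_expression : term (( + | - ) term)*
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        # term : factor (( * | / ) factor)*
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            rhs = self.factor()
            # Assumption: "/" is integer division.
            value = value * rhs if op == "*" else value // rhs
        return value

    def factor(self):
        # factor : INT | IDENTIFIER | ( arithmetic_expression )
        token = self.advance()
        if token is None:
            raise SyntaxError("unexpected end of expression")
        if token.isdigit():
            return int(token)
        if token == "(":
            value = self.arithmetic_expression()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token in self.variables:
            return self.variables[token]
        raise SyntaxError(f"unexpected token {token!r}")


def evaluate(text, variables=None):
    parser = ExprEvaluator(tokenize_expr(text), variables)
    value = parser.arithmetic_expression()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value


if __name__ == "__main__":
    print(evaluate("20-3*5"))
    print(evaluate("(20-3)*5"))
```

Because term is called from inside arithmetic_expression, multiplication and division bind tighter than addition and subtraction without any explicit precedence table.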
# you can write comments between two hash signs and the compiler will ignore them #
[1] :
prints("hello world"); # for printing strings use prints #
print(5); # for printing integers use print #
print(12+3);
print((42*12)-4);
[2] :
hello world
5
15
500
int a,b,c;
a=12;
b=13;
c=14;
print(a);
print(b);
print(c);
[3] :
12
13
14
int var1,var2,var3,var4;
prints("arithmetic operation");
var1 = 5+5;
var2 = 10-5;
var3 = 5*12;
var4 = 60/10;
print(var1);
print(var2);
print(var3);
print(var4);
prints("priority");
var1 = 20-3*5;
var2 = (20-3)*5;
print(var1);
print(var2);
[4] :
arithmetic operation
10
5
60
6
priority
5
85
int a,b;
a=10;
b=15;
if(a>=b){ # you can use (<,>,==,<=,>=,!=) #
prints("a >= b");
}else{
prints("a < b");
}
[5] :
a < b
int i,end;
i=0;
end=10;
while(i<=end){
print(i);
i=i+1;
}
prints("END");
print(i);
[6] :
0
1
2
3
4
5
6
7
8
9
10
END
11
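To give a feel for how a loop like the one above can be turned into jump-based code, here is a small sketch that lowers a while loop into a label, a comparison, an inverted conditional jump, the body, and a jump back. The instruction format, the `lower_while` function, and the label names are all hypothetical. The output is pseudo intermediate code, not valid NASM and not this compiler's actual intermediate representation.

```python
# Jump taken when the loop condition is FALSE, so each comparison maps to
# the signed x86 jump for its inverse.
INVERSE_JUMP = {
    "<": "jge",
    "<=": "jg",
    ">": "jle",
    ">=": "jl",
    "==": "jne",
    "!=": "je",
}


def lower_while(left, op, right, body, label_id=0):
    """Return pseudo-instructions for: while (left op right) { body }.

    `body` is a list of already-lowered pseudo-instructions (hypothetical format).
    `label_id` keeps the labels of different loops distinct.
    """
    start = f"while_start_{label_id}"
    end = f"while_end_{label_id}"
    return [
        f"{start}:",
        f"cmp {left}, {right}",       # pseudo-code: operands are variable names
        f"{INVERSE_JUMP[op]} {end}",  # leave the loop when the condition fails
        *body,
        f"jmp {start}",               # re-check the condition
        f"{end}:",
    ]


if __name__ == "__main__":
    for line in lower_while("i", "<=", "end", ["print i", "i = i + 1"]):
        print(line)
```

Testing the condition at the top of the loop means the body is skipped entirely when the condition is false on entry, which matches the usual while semantics.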
- Make the code generation more flexible, so that in the future this compiler can target a different assembler with minimal changes
- Create a web application (probably with Flask) that lets the user write code in a text editor in the web browser and execute it on the server
- Add extra features to the language
  - a for loop, in addition to the existing while loop
  - string variables
- Improve the code optimization phase to make better use of the registers
If you want to contribute, follow these steps:
- Fork the repository
- Create your feature branch (git checkout -b feature/fooBar)
- Commit your changes (git commit -am 'Add some fooBar')
- Push to the branch (git push origin feature/fooBar)
- Create a new Pull Request
This project is open-source software licensed under the MIT License.