A full C compiler, written in pure C. No 3rd party dependencies or parser generators

john-h-k/jcc

jcc

JCC is designed to be a C11/C18/C23 compiler written in pure C11, with no dependencies.

Is it sound?

No, it is text based

Support

It currently supports AArch64 almost fully, with partial WIP RISC-V support.

Design

  • Preprocessor
    • Has two modes
      • Self-contained - when invoked with the -E flag, will run the preprocessor and output the result
      • Streaming - in normal compilation, tokens from the preprocessor are consumed and fed to the lexer
    • Code is preproc.h and preproc.c
  • Frontend - Lexer + Parser
    • These work in lockstep (tokens are provided on-demand by the lexer), and build the AST
    • It is a very loose and untyped AST, to try and parse as many programs as possible, with little verification
    • Lexing code is lex.h and lex.c
      • Lexer takes preproc tokens
    • Parsing code is parse.h and parse.c
  • Semantic analysis - Typecheck
    • Builds a typed AST from the parser output
    • Performs most validation (are types correct, do variables exist, etc)
    • Typechecking code is typechk.h and typechk.c
  • Intermediate Representations and passes
    • All code located in the ir folder
    • IR representation structs and helper methods are in ir/ir.h and ir/ir.c
    • Pretty-printing functionality is in ir/prettyprint.h and ir/prettyprint.c
      • This also includes graph-building functionality with graphviz
    • IR building
      • This stage converts the AST into an SSA IR form
      • It assumes the AST is entirely valid and well-typed
      • Code is ir/build.h and ir/build.c
    • Lowering
      • This converts the IR into the platform-native form
      • Firstly, global lowering is performed. This handles the operations that are lowered on all platforms
        • E.g. br.switch ops are converted into a series of if-elses, and loadglb/storeglb operations are transformed to loadaddr/storeaddr
      • Then, per-target lowering occurs
        • For example, AArch64 has no remainder instruction, so x = a % b is converted to c = a / b; x = a - (c * b)
      • The code for lowering is within the appropriate backend folders
    • Register allocation
      • Simple LSRA, done separately across floating-point & general-purpose registers
    • Eliminate phi
      • Splits critical edges and inserts moves to preserve semantics of phi ops
  • Code Generation
    • Converts the IR into a list of 1:1 machine code instructions
    • These are all target specific
    • Currently codegen does too much - in the future I would like to move lots of its responsibilities (e.g. ABI) into IR passes
  • Emitting
    • Actually emits the instructions from code generation into memory
  • Object file building
  • Linking
    • Links using the platform linker
    • Effectively just runs the linker as one would from the command line
    • Code is link.h and link.c
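
The Frontend notes above say the lexer and parser work in lockstep, with tokens provided on demand. A minimal sketch of that pull-based arrangement follows. Every name in it (`struct lexer`, `next_token`, `parse_sum`) is hypothetical and not taken from lex.h or parse.h, and for brevity it lexes raw characters, whereas JCC's lexer consumes preprocessor tokens.

```c
#include <ctype.h>
#include <stddef.h>

/* All names here are hypothetical; this is not jcc's actual API. */
enum tok_kind { TOK_EOF, TOK_INT, TOK_IDENT, TOK_PUNCT };

struct token {
    enum tok_kind kind;
    const char *start; /* points into the source buffer */
    size_t len;
};

struct lexer {
    const char *pos; /* next unread character */
};

/* Produce exactly one token per call; nothing is lexed ahead of demand. */
struct token next_token(struct lexer *lx) {
    while (isspace((unsigned char)*lx->pos)) lx->pos++;

    struct token t = { TOK_EOF, lx->pos, 0 };
    unsigned char c = (unsigned char)*lx->pos;
    if (c == '\0') return t;

    if (isdigit(c)) {
        t.kind = TOK_INT;
        while (isdigit((unsigned char)*lx->pos)) lx->pos++;
    } else if (isalpha(c) || c == '_') {
        t.kind = TOK_IDENT;
        while (isalnum((unsigned char)*lx->pos) || *lx->pos == '_') lx->pos++;
    } else {
        t.kind = TOK_PUNCT; /* any other single character */
        lx->pos++;
    }
    t.len = (size_t)(lx->pos - t.start);
    return t;
}

/* Toy "parser": sums the integer literals in input such as "1 + 20 + 3",
   pulling one token at a time from the lexer. */
long parse_sum(const char *src) {
    struct lexer lx = { src };
    long total = 0;
    for (;;) {
        struct token t = next_token(&lx);
        if (t.kind == TOK_EOF) break;
        if (t.kind == TOK_INT) {
            long v = 0;
            for (size_t i = 0; i < t.len; i++) v = v * 10 + (t.start[i] - '0');
            total += v;
        }
    }
    return total;
}
```

Because the consumer drives the loop, no token array is ever materialised; the same shape would let a preprocessor feed a lexer in streaming mode.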
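
The per-target lowering example above rewrites x = a % b as c = a / b; x = a - (c * b). The sketch below checks that identity in plain C. `lowered_rem` is a hypothetical helper written for illustration, not a function from JCC, and it works on C integers rather than IR ops.

```c
#include <stdint.h>

/* Hypothetical helper (not part of jcc): computes a % b using only the
   operations that appear in the lowered form: divide, multiply, subtract. */
int64_t lowered_rem(int64_t a, int64_t b) {
    int64_t c = a / b;  /* c = a / b, truncating toward zero as C requires */
    return a - (c * b); /* x = a - (c * b) */
}
```

Since C division truncates toward zero, the result carries the sign of the dividend, matching the `%` operator. As with `%` itself, a zero divisor is undefined here.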
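
For readers unfamiliar with LSRA (linear scan register allocation), here is a minimal sketch of the textbook algorithm. It is not JCC's implementation: the `struct interval` layout, the two-register file, and the spill heuristic (evict whichever interval ends last) are all assumptions made for illustration. Running it once per register class would mirror the separate floating-point and general-purpose passes mentioned above.

```c
#include <stddef.h>

#define NUM_REGS 2   /* assumed tiny register file, for illustration */
#define SPILLED (-1)

/* Hypothetical live interval: [start, end] in instruction indices. */
struct interval {
    int start, end;
    int reg; /* output: register index, or SPILLED */
};

/* Linear scan over intervals already sorted by increasing start. */
void linear_scan(struct interval *iv, size_t n) {
    struct interval *active[NUM_REGS]; /* intervals currently holding a register */
    size_t num_active = 0;
    int free_regs[NUM_REGS];
    for (int r = 0; r < NUM_REGS; r++) free_regs[r] = 1;

    for (size_t i = 0; i < n; i++) {
        /* Expire intervals that ended before this one starts. */
        size_t kept = 0;
        for (size_t j = 0; j < num_active; j++) {
            if (active[j]->end < iv[i].start) free_regs[active[j]->reg] = 1;
            else active[kept++] = active[j];
        }
        num_active = kept;

        if (num_active == NUM_REGS) {
            /* No register free: spill whichever candidate ends last. */
            size_t victim = 0;
            for (size_t j = 1; j < num_active; j++)
                if (active[j]->end > active[victim]->end) victim = j;
            if (active[victim]->end > iv[i].end) {
                iv[i].reg = active[victim]->reg; /* steal its register */
                active[victim]->reg = SPILLED;
                active[victim] = &iv[i];
            } else {
                iv[i].reg = SPILLED;
            }
            continue;
        }

        /* Take the lowest-numbered free register. */
        for (int r = 0; r < NUM_REGS; r++) {
            if (free_regs[r]) {
                free_regs[r] = 0;
                iv[i].reg = r;
                active[num_active++] = &iv[i];
                break;
            }
        }
    }
}
```

With intervals [0,10], [1,3], [2,8], [4,6] and two registers, the long-lived first interval is the one spilled, and the register freed by [1,3] is reused for [4,6].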
