Compiler Architecture

The Jda compiler (jda1.jda) is a self-hosted compiler written entirely in Jda. It compiles .jda source files to native x86-64 Linux ELF binaries with no external dependencies.

Pipeline

  Source (.jda)
    |
    v
  Lexer/Tokenizer
    |  Converts source to token stream
    v
  Parser
    |  Builds AST from tokens
    v
  Generic Expansion
    |  Monomorphizes generic functions
    v
  JIR (Jda Intermediate Representation)
    |  SSA-based IR with basic blocks
    v
  Optimization
    |  Constant folding, DCE, TCO, peephole
    v
  Register Allocation
    |  Linear scan with spill support
    v
  x86-64 Code Generation
    |  Direct machine code emission
    v
  ELF Binary Output
    |  Statically linked, no libc
    v
  Native executable

Key Data Structures

Struct        Size     Purpose
Token         40 B     Lexer token (type, offset, length, value)
Instr         96 B     SSA instruction in a basic block
BasicBlock    196 KB   Block of up to 128 instructions
JirFunction   6.3 MB   Full function IR (256 basic blocks)
LowerCtx      100 KB   Code generation context
StructTable   199 KB   Struct metadata (fields, sizes, offsets)

Compilation Phases

1. Tokenization

Scans the source text into a stream of tokens such as identifiers, keywords, integers, strings, and operators. Each token records its type, source offset, length, and value.
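
As an illustration of this phase, here is a minimal tokenizer sketch in Python (not the compiler's own Jda code). It produces tokens carrying the four fields listed for Token above: type, offset, length, and value. The keyword set, operator set, and token type names are assumptions made for the example.

```python
import re
from typing import NamedTuple

class Token(NamedTuple):
    type: str      # token category, e.g. "IDENT" or "INT"
    offset: int    # index of the token's first character in the source
    length: int    # number of characters the token covers
    value: object  # decoded payload: int for INT, text otherwise

# Hypothetical keyword set; the real Jda keyword list is not given here.
KEYWORDS = {"fn", "struct", "const", "trait", "impl", "return"}

TOKEN_RE = re.compile(r"""
    (?P<WS>\s+)
  | (?P<INT>\d+)
  | (?P<IDENT>[A-Za-z_]\w*)
  | (?P<STRING>"[^"\n]*")
  | (?P<OP>->|==|!=|<=|>=|[-+*/%<>=(){}\[\],:;.])
""", re.VERBOSE)

def tokenize(src: str) -> list[Token]:
    tokens = []
    pos = 0
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {src[pos]!r} at offset {pos}")
        kind, text = m.lastgroup, m.group()
        if kind != "WS":                      # whitespace produces no token
            if kind == "IDENT" and text in KEYWORDS:
                kind = "KEYWORD"
            if kind == "INT":
                value = int(text)
            elif kind == "STRING":
                value = text[1:-1]            # strip the surrounding quotes
            else:
                value = text
            tokens.append(Token(kind, pos, len(text), value))
        pos = m.end()
    return tokens
```

Storing the offset and length rather than a copy of the text keeps every token the same size, which fits a fixed-size token record like the one in the table.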

2. Parsing

Recursive descent parser builds an AST. Handles:

  • Function declarations with type annotations
  • Struct definitions with field types and arrays
  • Const declarations
  • Trait and impl blocks
  • Generic function scanning (<T>, <const N>)
  • Derive attribute processing
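
To show what recursive descent means in practice, the following Python sketch parses arithmetic expressions, with one function per grammar rule. The grammar and the tuple-based AST are assumptions chosen for brevity; they do not describe Jda's actual syntax or AST layout.

```python
import re

def lex(src):
    # Minimal stand-in lexer: integers, identifiers, and single-char operators.
    return re.findall(r"\d+|[A-Za-z_]\w*|[-+*/()]", src)

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expect(self, text):
        if self.peek() != text:
            raise SyntaxError(f"expected {text!r}, got {self.peek()!r}")
        return self.advance()

    # expr := term (("+" | "-") term)*
    def parse_expr(self):
        node = self.parse_term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.parse_term())   # left-associative
        return node

    # term := factor (("*" | "/") factor)*
    def parse_term(self):
        node = self.parse_factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.parse_factor())
        return node

    # factor := INT | IDENT | "(" expr ")"
    def parse_factor(self):
        tok = self.advance()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            node = self.parse_expr()
            self.expect(")")
            return node
        if tok.isdigit():
            return ("int", int(tok))
        if tok[0].isalpha() or tok[0] == "_":
            return ("var", tok)
        raise SyntaxError(f"unexpected token {tok!r}")

def parse(src):
    parser = Parser(lex(src))
    ast = parser.parse_expr()
    if parser.peek() is not None:
        raise SyntaxError(f"trailing token {parser.peek()!r}")
    return ast
```

Operator precedence falls out of the call structure: parse_expr calls parse_term, so multiplication binds tighter than addition without a separate precedence table.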

3. Generic Expansion

  • scan_generic_fns() — finds fn name<T>() and fn name<const N>() patterns
  • expand_all_generics() — monomorphizes: add<i64> becomes add_i64
  • copy_generic_toks() — duplicates function tokens with type/const substitution
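
The three steps above can be sketched at the token level in Python. The function names mirror the ones listed, but the bodies are an illustrative reimplementation rather than the compiler's code. The surface syntax (parameter annotations, the call form add<i64>(...)) is assumed, and only the single-parameter <T> form is handled; <const N> is left out.

```python
import re

def lex(src):
    return re.findall(r"->|[A-Za-z_]\w*|\d+|\S", src)

def find_matching(tokens, start, open_tok, close_tok):
    """Return the index of the token closing the bracket opened at `start`."""
    depth = 0
    for i in range(start, len(tokens)):
        if tokens[i] == open_tok:
            depth += 1
        elif tokens[i] == close_tok:
            depth -= 1
            if depth == 0:
                return i
    raise SyntaxError(f"unbalanced {open_tok!r}")

def scan_generic_fns(tokens):
    """Map name -> (type parameter, tokens) for each `fn name<T>(...) {...}`."""
    generics = {}
    i = 0
    while i + 4 < len(tokens):
        if tokens[i] == "fn" and tokens[i + 2] == "<" and tokens[i + 4] == ">":
            name, param = tokens[i + 1], tokens[i + 3]
            body_open = tokens.index("{", i)
            end = find_matching(tokens, body_open, "{", "}")
            generics[name] = (param, tokens[i:end + 1])
            i = end + 1
        else:
            i += 1
    return generics

def copy_generic_toks(fn_tokens, param, concrete):
    """Copy a generic function's tokens, substituting the type parameter."""
    out = ["fn", f"{fn_tokens[1]}_{concrete}"]
    # Skip the five tokens `fn name < T >`, then substitute T everywhere else.
    out += [concrete if t == param else t for t in fn_tokens[5:]]
    return out

def expand_all_generics(tokens):
    generics = scan_generic_fns(tokens)
    out, instances, emitted = [], [], set()
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        # Drop generic definitions; only their instantiations are emitted.
        if tok == "fn" and tokens[i + 1:i + 2] and tokens[i + 1] in generics:
            i += len(generics[tokens[i + 1]][1])
            continue
        # Rewrite a use such as add<i64> to add_i64, instantiating once.
        if tok in generics and tokens[i + 1:i + 2] == ["<"] and tokens[i + 3:i + 4] == [">"]:
            concrete = tokens[i + 2]
            param, fn_tokens = generics[tok]
            if (tok, concrete) not in emitted:
                emitted.add((tok, concrete))
                instances += copy_generic_toks(fn_tokens, param, concrete)
            out.append(f"{tok}_{concrete}")
            i += 4
            continue
        out.append(tok)
        i += 1
    return instances + out
```

Working on tokens rather than on the AST means later phases never see a generic function: by the time JIR is generated, every call refers to an ordinary, fully typed function.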

4. JIR Generation

Converts AST to SSA-based intermediate representation:

  • Each function gets a JirFunction with up to 256 basic blocks
  • Instructions in SSA form (each value defined exactly once)
  • Handles: arithmetic, comparisons, loads, stores, calls, branches, phi nodes
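
A rough Python model of this layout is sketched below. The type names and the two capacity limits come from this document; the field names, opcode names, and builder methods are assumptions made for illustration. The example builds a max(a, b) function to show two branches meeting at a phi node.

```python
from dataclasses import dataclass, field
from typing import Optional

MAX_INSTRS_PER_BLOCK = 128      # per-block limit stated in this document
MAX_BLOCKS_PER_FUNCTION = 256   # per-function limit stated in this document

@dataclass
class Instr:
    op: str                     # hypothetical opcode names: "param", "phi", ...
    dest: Optional[int]         # SSA value defined here, or None
    args: tuple = ()            # operand value ids, constants, or block ids

@dataclass
class BasicBlock:
    id: int
    instrs: list = field(default_factory=list)

@dataclass
class JirFunction:
    name: str
    blocks: list = field(default_factory=list)
    next_value: int = 0

    def new_block(self):
        if len(self.blocks) >= MAX_BLOCKS_PER_FUNCTION:
            raise OverflowError("too many basic blocks")
        block = BasicBlock(len(self.blocks))
        self.blocks.append(block)
        return block

    def emit(self, block, op, *args, has_result=True):
        if len(block.instrs) >= MAX_INSTRS_PER_BLOCK:
            raise OverflowError("basic block is full")
        dest = None
        if has_result:
            dest = self.next_value   # fresh id, so each value is defined once
            self.next_value += 1
        block.instrs.append(Instr(op, dest, args))
        return dest

def build_max():
    """Build IR for max(a, b): two branches that meet at a phi node."""
    fn = JirFunction("max")
    entry, then, other, join = (fn.new_block() for _ in range(4))
    a = fn.emit(entry, "param", 0)
    b = fn.emit(entry, "param", 1)
    lt = fn.emit(entry, "cmp_lt", a, b)
    fn.emit(entry, "br", lt, then.id, other.id, has_result=False)
    fn.emit(then, "jmp", join.id, has_result=False)
    fn.emit(other, "jmp", join.id, has_result=False)
    # The phi yields b when control arrives from `then`, a from `other`.
    result = fn.emit(join, "phi", (then.id, b), (other.id, a))
    fn.emit(join, "ret", result, has_result=False)
    return fn
```

Because value ids are handed out by a counter and never reused, the single-definition property holds by construction; the phi node is what lets the two control-flow paths agree on one result value.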

5. Optimization

  • Constant folding — Evaluate constant expressions at compile time
  • Dead code elimination — Remove unused instructions
  • Tail call optimization — Convert tail-recursive calls to jumps
  • Peephole optimization — Strength-reduce mul/div by powers of 2 to shifts
  • Loop register promotion — Hoist loop variables into callee-saved registers (R13/R14/R15)
  • Single-arg call optimization — Direct MOV to RDI instead of push/pop
  • NOP fallthrough — Eliminate redundant jumps between consecutive blocks
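
Three of these passes (constant folding, strength reduction of multiplication, and dead code elimination) can be sketched on a straight-line instruction list as follows. The (dest, op, args) tuple format and the opcode names are assumptions for the example. The sketch ignores 64-bit wraparound and covers only multiplication, since signed division by a power of two rounds differently from an arithmetic shift and needs extra care.

```python
def fold_constants(instrs):
    """Replace add/mul on known constants with a single `const`."""
    known = {}                       # value id -> constant
    out = []
    for dest, op, args in instrs:
        if op == "const":
            known[dest] = args[0]
        elif op in ("add", "mul") and all(a in known for a in args):
            x, y = known[args[0]], known[args[1]]
            value = x + y if op == "add" else x * y
            known[dest] = value
            op, args = "const", (value,)
        out.append((dest, op, args))
    return out

def strength_reduce(instrs):
    """Rewrite `mul x, 2**k` as `shl_imm x, k`."""
    known = {dest: args[0] for dest, op, args in instrs if op == "const"}
    out = []
    for dest, op, args in instrs:
        if op == "mul":
            x, y = args
            if x in known and y not in known:
                x, y = y, x          # put the constant operand second
            c = known.get(y)
            if c is not None and c > 0 and (c & (c - 1)) == 0:
                op, args = "shl_imm", (x, c.bit_length() - 1)
        out.append((dest, op, args))
    return out

def uses(op, args):
    """Value ids read by an instruction (immediates are not uses)."""
    if op in ("const", "param"):
        return ()
    if op == "shl_imm":
        return args[:1]
    return args

def eliminate_dead_code(instrs):
    """Walk backwards, keeping `ret` and anything a kept instruction reads."""
    live = set()
    out = []
    for dest, op, args in reversed(instrs):
        if op == "ret" or dest in live:
            live.update(uses(op, args))
            out.append((dest, op, args))
    out.reverse()
    return out

def optimize(instrs):
    return eliminate_dead_code(strength_reduce(fold_constants(instrs)))

program = [
    (0, "param", (0,)),
    (1, "const", (3,)),
    (2, "const", (5,)),
    (3, "add", (1, 2)),      # folds to const 8
    (4, "mul", (0, 3)),      # becomes shl_imm v0, 3
    (5, "add", (0, 1)),      # result never used
    (None, "ret", (4,)),
]
```

On this input, optimize(program) leaves three instructions: the parameter load, a left shift by 3, and the return. The pass order matters here, since the shift only becomes possible after folding has turned 3 + 5 into the constant 8.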

6. Register Allocation

Linear scan register allocator with spill support. Maps SSA virtual registers to x86-64 physical registers (RAX, RCX, RDX, RSI, RDI, R8-R15).
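
For readers unfamiliar with the technique, here is a compact Python sketch of textbook linear scan over live intervals. It is the general algorithm, not the compiler's implementation. The interval format and the policy of spilling the interval that ends last are the standard formulation and are assumed here.

```python
def linear_scan(intervals, registers):
    """Allocate registers to live intervals.

    intervals: {vreg: (start, end)} with inclusive instruction positions.
    Returns (assignment, spilled): vreg -> register, and the set of
    vregs that must live in a stack slot instead.
    """
    free = list(registers)
    active = []              # (end, vreg) pairs currently holding a register
    assignment = {}
    spilled = set()
    for vreg, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Expire intervals that ended before this one starts.
        for old_end, old in sorted(active):
            if old_end >= start:
                break
            active.remove((old_end, old))
            free.append(assignment[old])
        if free:
            assignment[vreg] = free.pop(0)
            active.append((end, vreg))
        elif active and max(active)[0] > end:
            # No register left: spill the active interval that ends last
            # and hand its register to the current interval.
            last_end, last = max(active)
            assignment[vreg] = assignment.pop(last)
            spilled.add(last)
            active.remove((last_end, last))
            active.append((end, vreg))
        else:
            spilled.add(vreg)
    return assignment, spilled
```

With two registers and the intervals a=(0, 10), b=(1, 3), c=(2, 8), d=(4, 6), the long-lived a is spilled when c arrives, and d later reuses the register freed by b. The appeal of linear scan is that it needs only one pass over the sorted intervals, which keeps allocation fast.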

7. Code Generation

Emits x86-64 machine code directly — no NASM assembly step. Produces a statically linked ELF binary with no libc dependency. Syscalls go directly to the kernel.
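
To make "direct machine code emission" concrete, the Python sketch below hand-encodes a three-instruction program that calls exit through the raw Linux syscall interface, then wraps it in a minimal ELF64 header and a single loadable segment. The 0x400000 load address and the one-segment layout are assumptions for the example; the real compiler's output layout is not described here.

```python
import struct

BASE = 0x400000  # assumed load address, conventional for non-PIE x86-64

def emit_exit(status):
    """Machine code for exit(status) using the raw Linux syscall."""
    code = b"\xbf" + struct.pack("<I", status)   # mov edi, status
    code += b"\xb8" + struct.pack("<I", 60)      # mov eax, 60 (SYS_exit)
    code += b"\x0f\x05"                          # syscall
    return code

def build_elf(code):
    """Wrap raw code in an ELF64 header plus one PT_LOAD program header."""
    ehdr_size, phdr_size = 64, 56
    entry = BASE + ehdr_size + phdr_size         # code follows the headers
    total = ehdr_size + phdr_size + len(code)
    # Magic, 64-bit class, little-endian, ELF version 1, System V ABI.
    ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
    ehdr = struct.pack(
        "<16sHHIQQQIHHHHHH",
        ident,
        2,            # e_type: ET_EXEC
        0x3E,         # e_machine: EM_X86_64
        1,            # e_version
        entry,        # e_entry
        ehdr_size,    # e_phoff: program header follows the ELF header
        0,            # e_shoff: no section headers
        0,            # e_flags
        ehdr_size,    # e_ehsize
        phdr_size,    # e_phentsize
        1,            # e_phnum
        0, 0, 0,      # e_shentsize, e_shnum, e_shstrndx
    )
    phdr = struct.pack(
        "<IIQQQQQQ",
        1,            # p_type: PT_LOAD
        5,            # p_flags: read + execute
        0,            # p_offset: map the file from its start
        BASE, BASE,   # p_vaddr, p_paddr
        total, total, # p_filesz, p_memsz
        0x1000,       # p_align
    )
    return ehdr + phdr + code
```

Writing build_elf(emit_exit(42)) to a file and marking it executable should, on x86-64 Linux, give a 132-byte program that exits with status 42. No assembler, linker, or libc is involved at any point, which is the property this phase relies on.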

Self-Hosting

The Jda compiler compiles itself. The bootstrap chain:

jda0 (asm) -> jda1 (374 KB) -> jda1_sh2 (2.1 MB) -> jda1_sh3
                                                      ^^^^^^^^
                                                      identical to jda1_sh2

Self-host convergence was achieved on April 2, 2026: when jda1_sh2 compiles the compiler source, the resulting jda1_sh3 is byte-identical to jda1_sh2.
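
A convergence check of this kind amounts to comparing two successive stage binaries byte for byte. A minimal Python sketch, assuming both stages have already been built and written to disk (the helper names are hypothetical):

```python
import hashlib
from pathlib import Path

def sha256_of(path):
    """Hex digest of a file's full contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def has_converged(stage_n, stage_n_plus_1):
    """True when two successive self-compiled binaries are byte-identical."""
    return sha256_of(stage_n) == sha256_of(stage_n_plus_1)
```

For the chain above, the call would be has_converged("jda1_sh2", "jda1_sh3"). Comparing digests rather than sizes matters because two binaries of equal size can still differ in content.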

Compiler Flags

Flag               Description
-o <file>          Output binary path
--include <file>   Include a source file
--emit-asm         Output assembly instead of binary
--emit-jir         Dump JIR for debugging
--no-opt           Disable optimizations
--safe             Enforce unsafe block boundaries