Compiler Architecture

The Jda compiler (jda1.jda) is a self-hosted compiler written entirely in Jda. It compiles .jda source files to native x86-64 Linux ELF binaries with no external dependencies.

Pipeline

Source (.jda)
    |
    v
  Lexer/Tokenizer
    |  Converts source to token stream
    v
  Parser
    |  Builds AST from tokens
    v
  Generic Expansion
    |  Monomorphizes generic functions
    v
  JIR (Jda Intermediate Representation)
    |  SSA-based IR with basic blocks
    v
  Optimization
    |  Constant folding, DCE, TCO, peephole
    v
  Register Allocation
    |  Linear scan with spill support
    v
  x86-64 Code Generation
    |  Direct machine code emission
    v
  ELF Binary Output
    |  Statically linked, no libc
    v
  Native executable

Key Data Structures

Struct	Size	Purpose
`Token`	40B	Lexer token (type, offset, length, value)
`Instr`	96B	SSA instruction in a basic block
`BasicBlock`	196KB	Block of up to 128 instructions
`JirFunction`	6.3MB	Full function IR (up to 256 basic blocks)
`LowerCtx`	100KB	Code generation context
`StructTable`	199KB	Struct metadata (fields, sizes, offsets)

Compilation Phases

1. Tokenization

The lexer scans the source text into a stream of tokens: identifiers, keywords, integers, strings, operators, and other token kinds.
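
A minimal sketch of this kind of scanner, written in Python rather than Jda. The Token fields mirror the (type, offset, length, value) layout listed under Key Data Structures; the keyword set and the operator list are assumptions made for illustration, not the compiler's actual tables.

```python
import re
from dataclasses import dataclass

@dataclass
class Token:
    type: str      # "keyword", "ident", "int", "string" or "op"
    offset: int    # index of the first character in the source string
    length: int    # number of characters in the token text
    value: object  # parsed integer for "int", raw text otherwise

# Assumed keyword set, for illustration only.
KEYWORDS = {"fn", "struct", "const", "trait", "impl"}

TOKEN_RE = re.compile(r"""
    (?P<ws>\s+)
  | (?P<int>\d+)
  | (?P<ident>[A-Za-z_]\w*)
  | (?P<string>"[^"\n]*")
  | (?P<op>->|==|!=|<=|>=|[-+*/<>=(){}\[\],:;.])
""", re.VERBOSE)

def tokenize(src):
    """Scan src left to right and return a list of Token objects."""
    tokens = []
    pos = 0
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {src[pos]!r} at offset {pos}")
        kind, text = m.lastgroup, m.group()
        if kind != "ws":  # whitespace separates tokens but is not emitted
            if kind == "ident" and text in KEYWORDS:
                kind = "keyword"
            value = int(text) if kind == "int" else text
            tokens.append(Token(kind, pos, len(text), value))
        pos = m.end()
    return tokens
```

Storing an offset and a length instead of a copy of the text keeps each token at a fixed size, which is consistent with the fixed 40-byte Token listed above.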

2. Parsing

A recursive descent parser builds the AST. It handles:

Function declarations with type annotations
Struct definitions with field types and arrays
Const declarations
Trait and impl blocks
Generic function scanning (<T>, <const N>)
Derive attribute processing
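
To illustrate the recursive descent technique itself, here is a small Python sketch that parses arithmetic expressions into a tuple-based AST. It is not Jda's grammar: the grammar, the node shapes, and all names are assumptions chosen to keep the example short. Each grammar rule becomes one method, and operator precedence falls out of which method calls which.

```python
import re

def tokenize(src):
    # Integers, the four arithmetic operators, and parentheses.
    return re.findall(r"\d+|[-+*/()]", src)

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def parse_expr(self):
        # expr := term (("+" | "-") term)*
        node = self.parse_term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.parse_term())
        return node

    def parse_term(self):
        # term := factor (("*" | "/") factor)*
        node = self.parse_factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.parse_factor())
        return node

    def parse_factor(self):
        # factor := INT | "(" expr ")"
        tok = self.advance()
        if tok == "(":
            node = self.parse_expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok is not None and tok.isdigit():
            return ("int", int(tok))
        raise SyntaxError(f"unexpected token {tok!r}")

def parse(src):
    parser = Parser(tokenize(src))
    ast = parser.parse_expr()
    if parser.peek() is not None:
        raise SyntaxError(f"trailing token {parser.peek()!r}")
    return ast
```

A parser for declarations (functions, structs, traits) follows the same pattern, with one method per construct that dispatches on the leading keyword.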

3. Generic Expansion

scan_generic_fns() — finds fn name<T>() and fn name<const N>() patterns
expand_all_generics() — monomorphizes: add<i64> becomes add_i64
copy_generic_toks() — duplicates function tokens with type/const substitution
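
The token-level substitution described above can be sketched in Python as follows. The helper names (mangle, instantiate) and the token spelling of the generic function are hypothetical; the only detail taken from this document is the naming scheme, where add<i64> becomes add_i64. The sketch covers a single type parameter only and does not handle the <const N> form.

```python
def mangle(name, type_arg):
    """Build the monomorphized name: add + i64 -> add_i64."""
    return f"{name}_{type_arg}"

def instantiate(tokens, type_arg):
    """Copy a generic function's tokens, replacing the type parameter.

    Assumes the token list starts with: fn NAME < PARAM > ...
    """
    if len(tokens) < 5 or tokens[0] != "fn" or tokens[2] != "<" or tokens[4] != ">":
        raise ValueError("not a generic function of the assumed shape")
    name, param = tokens[1], tokens[3]
    # Everything after the "<T>" header is copied, with T swapped out.
    rest = [type_arg if tok == param else tok for tok in tokens[5:]]
    return ["fn", mangle(name, type_arg)] + rest

# Hypothetical source spelling, split on spaces to stand in for a lexer.
generic_add = "fn add < T > ( a : T , b : T ) -> T { a + b }".split()
print(" ".join(instantiate(generic_add, "i64")))
```

Because the expansion happens on tokens before JIR generation, the later phases only ever see ordinary, non-generic functions.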

4. JIR Generation

Converts AST to SSA-based intermediate representation:

Each function gets a JirFunction with up to 256 basic blocks
Instructions in SSA form (each value defined exactly once)
Handles: arithmetic, comparisons, loads, stores, calls, branches, phi nodes
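
A rough Python model of these structures may help. The two capacity limits come from this document; the field names, the opcode names, and the builder methods are assumptions for illustration. The example builds a max(a, b) function, where a phi node in the join block selects the value that flowed in from whichever branch was taken.

```python
from dataclasses import dataclass, field
from typing import Optional

MAX_INSTRS_PER_BLOCK = 128     # limit stated in the data-structure table
MAX_BLOCKS_PER_FUNCTION = 256  # limit stated in the data-structure table

@dataclass
class Instr:
    dest: Optional[int]  # SSA value number, or None for branches and returns
    op: str              # hypothetical opcode name
    args: tuple

@dataclass
class BasicBlock:
    label: str
    instrs: list = field(default_factory=list)

    def emit(self, instr):
        if len(self.instrs) >= MAX_INSTRS_PER_BLOCK:
            raise OverflowError(f"block {self.label} is full")
        self.instrs.append(instr)

@dataclass
class JirFunction:
    name: str
    blocks: list = field(default_factory=list)
    next_value: int = 0

    def new_block(self, label):
        if len(self.blocks) >= MAX_BLOCKS_PER_FUNCTION:
            raise OverflowError(f"function {self.name} has too many blocks")
        block = BasicBlock(label)
        self.blocks.append(block)
        return block

    def emit(self, block, op, *args, has_result=True):
        # A fresh value number per result keeps every value defined exactly once.
        dest = None
        if has_result:
            dest = self.next_value
            self.next_value += 1
        block.emit(Instr(dest, op, args))
        return dest

def build_max():
    f = JirFunction("max")
    entry, then, els, join = (f.new_block(n) for n in ("entry", "then", "else", "join"))
    a = f.emit(entry, "param", 0)
    b = f.emit(entry, "param", 1)
    c = f.emit(entry, "cmp_gt", a, b)
    f.emit(entry, "br_if", c, "then", "else", has_result=False)
    f.emit(then, "jmp", "join", has_result=False)
    f.emit(els, "jmp", "join", has_result=False)
    r = f.emit(join, "phi", ("then", a), ("else", b))
    f.emit(join, "ret", r, has_result=False)
    return f
```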

5. Optimization

Constant folding — Evaluate constant expressions at compile time
Dead code elimination — Remove unused instructions
Tail call optimization — Convert tail-recursive calls to jumps
Peephole optimization — Strength-reduce mul/div by powers of 2 to shifts
Loop register promotion — Hoist loop variables into callee-saved registers (R13/R14/R15)
Single-arg call optimization — Direct MOV to RDI instead of push/pop
NOP fallthrough — Eliminate redundant jumps between consecutive blocks
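
Three of these passes (constant folding, strength reduction of multiplications, and dead code elimination) can be sketched in Python over straight-line SSA code. The instruction encoding (dest, op, args) and the opcode names are assumptions, the folding ignores 64-bit wraparound, and division is left out because replacing signed division with a shift needs an extra rounding adjustment.

```python
import operator

FOLDABLE = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}
SIDE_EFFECTS = {"ret", "store", "call"}  # never removed, even if unused

def value_uses(op, args):
    """Return the SSA value ids an instruction reads."""
    if op in ("const", "param"):
        return ()        # operands are literals, not value ids
    if op == "shl_imm":
        return args[:1]  # second operand is an immediate shift count
    return args

def fold_constants(instrs):
    consts, out = {}, []
    for dest, op, args in instrs:
        if op == "const":
            consts[dest] = args[0]
        elif op in FOLDABLE and all(a in consts for a in args):
            value = FOLDABLE[op](*(consts[a] for a in args))
            consts[dest] = value
            op, args = "const", (value,)
        out.append((dest, op, args))
    return out

def reduce_strength(instrs):
    consts = {d: a[0] for d, op, a in instrs if op == "const"}
    out = []
    for dest, op, args in instrs:
        if op == "mul":
            x, y = args
            if x in consts and y not in consts:
                x, y = y, x  # put the constant operand second
            c = consts.get(y)
            if c is not None and c > 0 and (c & (c - 1)) == 0:
                # x * 2**k is the same as x << k
                out.append((dest, "shl_imm", (x, c.bit_length() - 1)))
                continue
        out.append((dest, op, args))
    return out

def eliminate_dead_code(instrs):
    live, kept = set(), []
    for dest, op, args in reversed(instrs):  # walk uses before definitions
        if op in SIDE_EFFECTS or dest in live:
            live.update(value_uses(op, args))
            kept.append((dest, op, args))
    kept.reverse()
    return kept

def optimize(instrs):
    return eliminate_dead_code(reduce_strength(fold_constants(instrs)))
```

Running dead code elimination last matters here: folding and strength reduction leave behind constants that nothing reads any more, and the final pass removes them.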

6. Register Allocation

Linear scan register allocator with spill support. Maps SSA virtual registers to x86-64 physical registers (RAX, RCX, RDX, RSI, RDI, R8-R15).
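
For readers unfamiliar with the technique, here is a compact Python sketch of textbook linear scan allocation over live intervals. It is not the Jda allocator: the interval representation, the "spill" marker, and the spill heuristic (evict whichever interval ends last) are assumptions based on the standard algorithm.

```python
def linear_scan(intervals, registers):
    """Assign each virtual register a physical register or "spill".

    intervals: list of (vreg, start, end) live ranges
    registers: list of available physical register names
    """
    free = list(registers)
    active = []    # (end, vreg) for intervals currently holding a register
    location = {}
    for vreg, start, end in sorted(intervals, key=lambda iv: iv[1]):
        # Release registers whose intervals ended before this one starts.
        while active and active[0][0] < start:
            _, expired = active.pop(0)
            free.append(location[expired])
        if free:
            location[vreg] = free.pop(0)
            active.append((end, vreg))
            active.sort()
        elif active and active[-1][0] > end:
            # Out of registers: evict the interval that ends last.
            _, victim = active[-1]
            location[vreg] = location[victim]
            location[victim] = "spill"
            active[-1] = (end, vreg)
            active.sort()
        else:
            location[vreg] = "spill"
    return location

# Four overlapping live ranges competing for two registers.
print(linear_scan(
    [("v0", 0, 10), ("v1", 1, 3), ("v2", 2, 8), ("v3", 4, 6)],
    ["RAX", "RCX"],
))
```

The appeal of linear scan is that it makes a single pass over the intervals in start order, which keeps allocation fast compared with graph-coloring approaches.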

7. Code Generation

Emits x86-64 machine code directly, with no intermediate NASM assembly step. The output is a statically linked ELF binary with no libc dependency; system calls are issued directly to the kernel.
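
To make "direct emission" concrete, the Python sketch below assembles a complete ELF64 executable by hand: a 64-byte ELF header, one PT_LOAD program header, and 12 bytes of machine code that invoke the Linux exit system call. The load address 0x400000, the single-segment layout, and the function name are assumptions for illustration; the Jda compiler's real layout is not described in this document.

```python
import struct

BASE_ADDR = 0x400000  # assumed load address, a common default for static binaries
EHDR_SIZE = 64        # size of the ELF64 file header
PHDR_SIZE = 56        # size of one ELF64 program header

def build_exit_elf(status):
    """Return the bytes of a static x86-64 Linux ELF that calls exit(status)."""
    code = (
        b"\xbf" + struct.pack("<I", status)  # mov edi, status
        + b"\xb8" + struct.pack("<I", 60)    # mov eax, 60 (SYS_exit)
        + b"\x0f\x05"                        # syscall
    )
    total = EHDR_SIZE + PHDR_SIZE + len(code)
    entry = BASE_ADDR + EHDR_SIZE + PHDR_SIZE  # code follows the headers

    ehdr = struct.pack(
        "<16sHHIQQQIHHHHHH",
        b"\x7fELF\x02\x01\x01" + b"\x00" * 9,  # magic, 64-bit, little-endian, v1
        2,          # e_type: ET_EXEC
        62,         # e_machine: EM_X86_64
        1,          # e_version
        entry,      # e_entry
        EHDR_SIZE,  # e_phoff: program headers start right after this header
        0,          # e_shoff: no section headers
        0,          # e_flags
        EHDR_SIZE,  # e_ehsize
        PHDR_SIZE,  # e_phentsize
        1,          # e_phnum
        0, 0, 0,    # e_shentsize, e_shnum, e_shstrndx
    )
    phdr = struct.pack(
        "<IIQQQQQQ",
        1,          # p_type: PT_LOAD
        5,          # p_flags: read + execute
        0,          # p_offset: map the file from its first byte
        BASE_ADDR,  # p_vaddr
        BASE_ADDR,  # p_paddr
        total,      # p_filesz
        total,      # p_memsz
        0x1000,     # p_align
    )
    return ehdr + phdr + code
```

Writing the returned bytes to a file and marking it executable should give a program that exits with the chosen status on x86-64 Linux, without an assembler, a linker, or libc involved.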

Self-Hosting

The Jda compiler compiles itself. The bootstrap chain:

jda0 (asm) -> jda1 (374 KB) -> jda1_sh2 (2.1 MB) -> jda1_sh3
                                                      ^^^^^^^^
                                                      identical to jda1_sh2

Self-host convergence was achieved on April 2, 2026: jda1_sh2 and jda1_sh3 are byte-identical, so the compiler reproduces its own binary when compiling itself.

Compiler Flags

Flag	Description
`-o <file>`	Output binary path
`--include <file>`	Include a source file
`--emit-asm`	Output assembly instead of binary
`--emit-jir`	Dump JIR for debugging
`--no-opt`	Disable optimizations
`--safe`	Enforce unsafe block boundaries