| | Commit message | Age | Files | Lines | |
|---|---|---|---|---|---|
| * | various relocation-related optimizations | 2025-12-14 | 1 | -4/+6 | |
| | With 59ca5a8db, querying if a symbol is defined is cheap. If we're compiling code that calls foo() and we defined foo() in this compilation unit, we already know its offset within the .text section, so use it instead of emitting a relocation for the linker to handle. Also, put small literal data in the .text section instead of .rodata. This seems to improve performance (cache locality?), and as a bonus, it will be good for aarch64's instruction encoding with smallish PC-relative offsets. | | | | |
| * | regalloc: fix bug with phi move of stack <- stack | 2025-12-13 | 2 | -6/+5 | |
| | | |||||
| * | Add -O optimization flag | 2025-12-13 | 1 | -2/+4 | |
| | | |||||
| * | fix position independent loads of function symbols. | 2025-12-13 | 3 | -6/+6 | |
| | For `extern int x[1];`, PCREL32 can be used for `&x`. But for `extern int x(int);`, GOTREL must be used when the function is not being called directly (a direct call uses PLT). Therefore the type of an external symbol (actually just whether it denotes a function) matters when deciding what kind of relocation to emit, so keep that information. | | | | |
| * | rename arraylength macro -> countof | 2025-12-11 | 5 | -22/+22 | |
| | | |||||
| * | ir: bump MAXINSTR | 2025-12-10 | 1 | -1/+1 | |
| | | |||||
| * | parallel move: implement reg<->stack swap | 2025-12-10 | 1 | -3/+18 | |
| | | |||||
| * | regalloc: optimize a little edge case better | 2025-12-10 | 1 | -4/+6 | |
| | With two-address instructions, one needs to make sure the dst doesn't get allocated to the same reg as the right-hand operand: `%r = mul %x, %y` (`%y` cannot be `%r`). Except if the operands are the same: `%r = mul %x, %x` (if `%x` is dead after this instr, it's fine to allocate `%r` to the same reg). | | | | |
| * | misc fixes | 2025-12-10 | 1 | -1/+1 | |
| | | |||||
| * | rega: change assert for spilled callee. it's ok if nspill==1 | 2025-12-09 | 1 | -1/+1 | |
| | | |||||
| * | abi: fix aggregate passed by regs 2nd reg offset | 2025-12-06 | 2 | -24/+28 | |
| | It was broken for, e.g., `struct { i32 a; f64 b; }` (it would try to load/store b from byte offset 4, not 8). Introduce r2off; on x86-64 it is always 8, since even `struct { i32 a; f32 b; }` gets passed in one (integer) register. This is not so in (future) ABIs like RISC-V, where I believe `{i32, f32}` would get passed in 1 integer and 1 float register (r2off = 4). | | | | |
| * | add command-line predefined macros (-D, -U) | 2025-12-06 | 1 | -2/+0 | |
| | | |||||
| * | ir: float fold div/0 | 2025-12-05 | 1 | -4/+3 | |
| | | |||||
| * | regalloc: kill dead defs of physical regs | 2025-12-04 | 1 | -8/+16 | |
| | | |||||
| * | c: make tentative definitions work | 2025-12-02 | 1 | -1/+1 | |
| | | |||||
| * | abi/isel: aggregate args in stack wip | 2025-11-27 | 1 | -9/+31 | |
| | | |||||
| * | regalloc: skip dead phis | 2025-11-26 | 1 | -1/+4 | |
| | | |||||
| * | ir: simplify some occurrences of single-argument phis | 2025-11-24 | 2 | -8/+17 | |
| | | |||||
| * | ir.h: tweak mkintrin() definition to work with tinycc | 2025-11-24 | 1 | -1/+1 | |
| | | |||||
| * | ir: implement cvtu64f. and other bug fixes | 2025-11-23 | 1 | -2/+35 | |
| | The compiler is bootstrapping?! However, stage1 and stage2+ executables aren't bit-identical: there are small differences in the codegen. Need to look into that. | | | | |
| * | implement cvtfXu64 by lowering it in builder | 2025-11-23 | 1 | -9/+46 | |
| | This should probably be in a separate pass? | | | | |
| * | c: check actual reachability for the "non-void function may not return a value" diagnostic | 2025-11-23 | 2 | -0/+22 | |
| | | |||||
| * | implement float varargs, and some other fixes | 2025-11-23 | 3 | -7/+17 | |
| | | |||||
| * | make sure indirect function call pointer does not end up in clobber reg | 2025-11-22 | 1 | -2/+2 | |
| | | |||||
| * | ir: freeblk: clear preds | 2025-11-22 | 1 | -0/+2 | |
| | | |||||
| * | ir/ir.c: fix assert in mkcallarg | 2025-11-22 | 1 | -1/+1 | |
| | | |||||
| * | ir/dump: initialize out buffer statically | 2025-11-22 | 1 | -3/+1 | |
| | | |||||
| * | regalloc: merge overlapping fixed intervals better | 2025-11-22 | 1 | -1/+12 | |
| | | |||||
| * | irdump: print alloca # bytes | 2025-11-21 | 1 | -0/+3 | |
| | | |||||
| * | ir: implement dominator tree computation | 2025-11-21 | 3 | -0/+40 | |
| | | |||||
| * | ir: barebones IR passes checked contracts | 2025-11-21 | 7 | -2/+26 | |
| | | |||||
| * | remove umul | 2025-11-21 | 3 | -3/+1 | |
| | | |||||
| * | change op names to match 285063eba44 | 2025-11-21 | 8 | -142/+142 | |
| | | |||||
| * | rename IR classes to reflect bitsize | 2025-11-21 | 9 | -46/+46 | |
| | | |||||
| * | regalloc: assert nops aren't being used | 2025-11-21 | 1 | -0/+1 | |
| | | |||||
| * | ir/builder: peephole optimize branch with constant conditional | 2025-11-21 | 1 | -4/+14 | |
| | | |||||
| * | mem2reg: implement marker algorithm from Braun et al | 2025-11-21 | 1 | -8/+40 | |
| | | |||||
| * | mem2reg: store pending phis implicitly | 2025-11-21 | 1 | -12/+8 | |
| | | |||||
| * | ir: fix delpred when npred becomes 1 | 2025-11-21 | 1 | -2/+12 | |
| | | |||||
| * | ir/dump: print block predecessors | 2025-11-21 | 1 | -2/+10 | |
| | | |||||
| * | cfg: sortrpo delete unreachable blocks with allocas by hoisting them to the entry block | 2025-11-21 | 1 | -6/+7 | |
| | |||||
| * | isel: lower allocas a different way, such that stk address gets materialized when necessary | 2025-11-20 | 1 | -1/+1 | |
| | |||||
| * | ir: for easier debugging, keep ctype in dats, print as literal when possible | 2025-11-20 | 3 | -21/+53 | |
| | | |||||
| * | mem2reg: fix edge case | 2025-11-19 | 1 | -1/+1 | |
| | | |||||
| * | debug output to stdout | 2025-11-19 | 5 | -75/+79 | |
| | | |||||
| * | factor type stuff into type.h | 2025-11-16 | 1 | -0/+14 | |
| | | |||||
| * | ir: 'trap' jump; c: __builtin_trap; lex: __has_builtin | 2025-11-15 | 4 | -4/+13 | |
| | | |||||
| * | abi0: remove debugging leftover sortpo. but do number blks (free) | 2025-11-14 | 1 | -1/+2 | |
| | | |||||
| * | preliminary va_list support | 2025-11-14 | 6 | -25/+68 | |
| | | |||||
| * | mem2reg: handle uses in branches in cmpuse() | 2025-11-12 | 1 | -0/+2 | |
| | | |||||