path: root/ir
Commit message  Author  Age  Files  Lines
* ir: move cls2load to interface  lemon  2025-12-18  3  -7/+7
| There's plenty of code duplication like this around that I'm looking to reduce.
* regalloc+emit: get rid of xsave/xrestore hack  lemon  2025-12-18  2  -51/+63
| The hack was used in situations where we needed to spill more than one temporary and had to use a register that was already in use. Instead of push/pop, we can just allocate and set aside specific stack slots for this purpose. Also reworked the linearscan() interval sets to separate FPR and GPR intervals.
* rega: implement stack<->stack swap for parallel moves  lemon  2025-12-18  1  -29/+34
|
* x86_64: for vararg calls, write to EAX in isel  lemon  2025-12-18  1  -8/+25
| Also, in regalloc, ensure fixed intervals are sorted.
* x86-64/emit: implement single-exit-point ret with jump threading  lemon  2025-12-16  2  -1/+3
|
* bitset: better implementation of bsiter() and stuff  lemon  2025-12-16  2  -2/+2
| Also changed the type to size_t for portability.
* mem2reg: fix obvious inefficiency  lemon  2025-12-16  1  -16/+10
| deltrivialphis() was iterating over every variable instead of just looking at the variable being examined. And I'd been wondering why mem2reg was such a bottleneck for a test case like the sqlite3 amalgamation... It's easy to miss the forest for the trees.
* create distinct interned string type  lemon  2025-12-15  3  -10/+10
| Interned strings are used pervasively, so it's a good idea to add a layer of type safety that differentiates them from general cstrs and avoids potential bugs from comparing non-interned and interned strings. Not that that has happened so far, as far as I can remember, but it could.
| I'm 90% sure it's legal to alias `struct {char c;}` pointers with `char` pointers. This specific typedef gives type safety while keeping a simple one-way `internstr -> const char *` conversion (with `&istr->c`). Converting the other way around is more intentional: either a straight-up cast `(internstr)cstr`, which sticks out as unchecked and probably wrong, or a call to the intern(cstr) function, which is the right way.
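A minimal sketch of the typedef described above, for illustration only. The struct tag `internchar`, the helper `istr_cstr()`, and the toy fixed-size table are assumptions and not taken from the repository; only `internstr`, `intern()`, and the `&istr->c` conversion come from the commit message.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Distinct pointer type for interned strings: a pointer to a struct whose
 * only member is the first character of the string. Passing a plain
 * `const char *` where an internstr is expected is now a type error. */
typedef const struct internchar { char c; } *internstr;

/* One-way conversion back to a plain C string (hypothetical helper name). */
static const char *istr_cstr(internstr s)
{
    return &s->c;
}

/* Toy intern table: fixed capacity, linear search. A real implementation
 * would presumably use a hash table; this only shows the typing. */
enum { MAXINTERN = 64 };
static char *interned[MAXINTERN];
static size_t ninterned;

/* Return the unique interned copy of s, so that equal strings compare
 * equal by pointer. */
static internstr intern(const char *s)
{
    for (size_t i = 0; i < ninterned; i++)
        if (strcmp(interned[i], s) == 0)
            return (internstr)interned[i];
    assert(ninterned < MAXINTERN);
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    assert(copy != NULL);
    memcpy(copy, s, n);
    interned[ninterned++] = copy;
    return (internstr)copy;
}
```

With this in place, `intern("foo") == intern("foo")` is a valid pointer comparison, while comparing an `internstr` against an arbitrary `const char *` at least draws a compiler diagnostic about mismatched pointer types.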
* move intern() to mem.c  lemon  2025-12-15  1  -1/+0
| Its being in lex.c was vestigial, since it is used all over the frontend and backend.
* regalloc: fix lifetime construction for nested loops  lemon  2025-12-15  1  -17/+34
| Previously, given something like
| ```
|  1  a = ...
|  2  loop { // outer
|  3    b = do something with a
|  4    loop { // inner
|  5      ...
|  6      if (b < 0)
|  7        break 'inner;
|  8      if (b == 0)
|  9        return;
| 10      ...
| 11    }
| 12  }
| ```
| regalloc thought the outer loop goes from 2..6, because 6 is the last place where flow jumps directly back to 2. So `a` would have the lifetime [1,7). However, if neither the break nor the return is taken, the inner loop repeats, and control could then flow back via 7 -> 3. By then the physical location of `a` might have been clobbered between 8..10, which is wrong. This fixes that by making sure the outer loop is considered to span 2..10.
| The way I went about it might not be the best way of doing it. I'm not 100% certain that it's fully correct and will always find the correct loop end, either. It's surprising it took this long to hit this edge case.
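One way to picture the fix, as a hedged sketch rather than the code in this commit: treat each loop as a range of positions and widen an outer loop until it covers every loop whose header it contains. The names `looprange`, `widenloops()`, and `liveend()` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A loop as a closed range [head, end] of positions in linear order. */
struct looprange { int head, end; };

/* Widen every loop so that it covers any loop whose header lies inside it.
 * Repeats until nothing changes, so deeper nesting is handled as well.
 * Ends only ever grow and are bounded by the largest end, so this stops. */
static void widenloops(struct looprange *loops, size_t n)
{
    bool changed;
    do {
        changed = false;
        for (size_t i = 0; i < n; i++) {
            for (size_t j = 0; j < n; j++) {
                bool nested = i != j
                    && loops[j].head > loops[i].head
                    && loops[j].head <= loops[i].end;
                if (nested && loops[j].end > loops[i].end) {
                    loops[i].end = loops[j].end;
                    changed = true;
                }
            }
        }
    } while (changed);
}

/* Exclusive end of the lifetime of a value defined at `def` and last used
 * at `lastuse`: a value defined before a loop and used inside it has to
 * stay live until that loop ends. */
static int liveend(int def, int lastuse,
                   const struct looprange *loops, size_t n)
{
    assert(def <= lastuse);
    int end = lastuse + 1;
    for (size_t i = 0; i < n; i++) {
        bool usedinloop = lastuse >= loops[i].head && lastuse <= loops[i].end;
        if (def < loops[i].head && usedinloop && end < loops[i].end + 1)
            end = loops[i].end + 1;
    }
    return end;
}
```

Fed the example above (outer loop initially 2..6, inner loop 4..10, `a` defined at 1 and last used at 3), the unwidened ranges give `a` the lifetime [1,7); after `widenloops()` the outer loop spans 2..10 and the lifetime becomes [1,11).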
* only put dats can in .text now when emitting it  lemon  2025-12-14  2  -3/+3
|
* various relocation related optimization  lemon  2025-12-14  1  -4/+6
| With 59ca5a8db, querying whether a symbol is defined is cheap. If we're compiling code that calls foo() and foo() is defined in this compilation unit, we already know its offset within the .text section, so use it instead of emitting a relocation for the linker to handle.
| Also, put small literal data in the .text section instead of .rodata. This seems to improve performance (cache locality?), and as a bonus it will suit aarch64's instruction encoding, which has smallish PC-relative offsets.
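The first optimization can be sketched roughly as follows for an x86-64 `call rel32`. The `sym` and `reloc` structures and `emitcalldisp()` are hypothetical stand-ins, not the compiler's actual data structures, and the sketch assumes a little-endian host since it writes the displacement with memcpy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical symbol record: whether the symbol is defined in this
 * compilation unit and, if so, its offset within .text. */
struct sym { bool defined; uint32_t textoff; };

/* Hypothetical pending relocation left for the linker. */
struct reloc { uint32_t siteoff; const struct sym *target; };

/* Fill in the 4-byte displacement field of a `call rel32` that starts at
 * text[siteoff]. If the target is defined locally the displacement is
 * computed right away; otherwise the field is zeroed and a relocation is
 * recorded in *out. Returns true when a relocation was needed. */
static bool emitcalldisp(uint8_t *text, uint32_t siteoff,
                         const struct sym *target, struct reloc *out)
{
    assert(text != NULL && target != NULL && out != NULL);
    int32_t disp = 0;
    bool needreloc = !target->defined;
    if (needreloc) {
        out->siteoff = siteoff;
        out->target = target;
    } else {
        /* rel32 is relative to the end of the displacement field, that is,
         * to the address of the next instruction. */
        int64_t d = (int64_t)target->textoff - ((int64_t)siteoff + 4);
        disp = (int32_t)d;
    }
    memcpy(text + siteoff, &disp, sizeof disp);
    return needreloc;
}
```

A call whose displacement field sits at offset 17 and whose target is at offset 0 gets the displacement 0 - (17 + 4) = -21 with no relocation emitted.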
* regalloc: fix bug with phi move of stack <- stack  lemon  2025-12-13  2  -6/+5
|
* Add -O optimization flag  lemon  2025-12-13  1  -2/+4
|
* fix position independent loads of function symbols.  lemon  2025-12-13  3  -6/+6
| For `extern int x[1];`, we can use PCREL32 for &x. But for `extern int x(int)`, we must use GOTREL when the function is not being called directly (a direct call uses PLT). The type of an external symbol (actually just whether it denotes a function) therefore matters when deciding what kind of relocation to emit, so keep that information.
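The decision described above fits in a small function. This is a sketch under assumptions: the enum and the function name `picreloc()` are hypothetical, and the three kinds are simply named after the commit message rather than after specific ELF relocation types.

```c
#include <assert.h>
#include <stdbool.h>

/* Relocation kinds as named in the commit message (hypothetical enum). */
enum relockind { PCREL32, GOTREL, PLT };

/* Pick the relocation for a position-independent reference to an external
 * symbol. `isfunc` says whether the symbol denotes a function, and
 * `directcall` whether the reference is the target of a direct call. */
static enum relockind picreloc(bool isfunc, bool directcall)
{
    assert(isfunc || !directcall); /* only functions are called directly */
    if (!isfunc)
        return PCREL32;            /* e.g. &x for `extern int x[1];` */
    return directcall ? PLT : GOTREL;
}
```

The point of the commit is that `isfunc` has to be available at the place where the relocation is emitted, which is why the symbol's kind is now kept around.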
* rename arraylength macro -> countof  lemon  2025-12-11  5  -22/+22
|
* ir: bump MAXINSTR  lemon  2025-12-10  1  -1/+1
|
* parallel move; implement reg<->stack swp  lemon  2025-12-10  1  -3/+18
|
* regalloc: optimize a little edge case better  lemon  2025-12-10  1  -4/+6
| With two-address instructions one needs to make sure the dst doesn't get allocated to the same reg as the right-hand operand:
|   %r = mul %x, %y ; %y cannot be %r
| Except if the operands are the same:
|   %r = mul %x, %x ; if %x is dead after this instr, it's fine to allocate %r to the same reg
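To see why the constraint exists, here is a small simulation of the usual two-address lowering (`mov dst, lhs` followed by `mul dst, rhs`) over a toy register file. It is an illustrative sketch: `mul2addr()` and `dstmaysharerhs()` are hypothetical names, and the register allocator's real check may look different.

```c
#include <assert.h>
#include <stdbool.h>

enum { NREGS = 4 };

/* Simulate `dst = mul lhs, rhs` lowered to two-address form:
 *     mov dst, lhs
 *     mul dst, rhs
 * All three arguments are register numbers. */
static long mul2addr(long *regs, int dst, int lhs, int rhs)
{
    assert(dst >= 0 && dst < NREGS);
    assert(lhs >= 0 && lhs < NREGS);
    assert(rhs >= 0 && rhs < NREGS);
    regs[dst] = regs[lhs];   /* clobbers rhs if dst and rhs share a reg */
    regs[dst] *= regs[rhs];
    return regs[dst];
}

/* The rule from the commit message: dst may share a register with the
 * right-hand operand only when both operands are the same value and that
 * value is dead after the instruction. */
static bool dstmaysharerhs(int lhs, int rhs, bool rhsdeadafter)
{
    return lhs == rhs && rhsdeadafter;
}
```

With registers holding 3 and 5, putting dst in the rhs register computes 3 * 3 instead of 3 * 5, whereas squaring a value in place is harmless as long as nothing reads the old value afterwards.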
* misc fixes  lemon  2025-12-10  1  -1/+1
|
* rega: change assert for spilled callee. it's ok if nspill==1  lemon  2025-12-09  1  -1/+1
|
* abi: fix aggregate passed by regs 2nd reg offset  lemon  2025-12-06  2  -24/+28
| It was broken for, e.g., `struct { i32 a; f64 b; }` (it would try to load/store b from byte offset 4, not 8). Introduce r2off; on x86-64 it turns out to always be 8, since even `struct {i32 a; f32 b;}` gets passed in one (integer) register. That is not so in (future) ABIs like RISC-V, where I believe `{i32, f32}` would get passed in one integer and one float register (r2off = 4).
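A rough sketch of the distinction, under stated assumptions: `r2off` is the name from the commit message, but the `abi` enum, `field2off()`, and the "fieldwise" ABI (standing in for the RISC-V behaviour the author only believes to be the case) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of the second field in a two-field struct: the first field's size
 * rounded up to the second field's alignment (which is a power of two). */
static size_t field2off(size_t size1, size_t align2)
{
    assert(align2 != 0 && (align2 & (align2 - 1)) == 0);
    return (size1 + align2 - 1) & ~(align2 - 1);
}

/* Hypothetical ABI selector for this illustration. */
enum abi { ABI_X86_64, ABI_FIELDWISE };

/* Byte offset within the aggregate at which the second register's contents
 * start. x86-64 splits small aggregates into 8-byte chunks, so the answer
 * is always 8; a field-by-field ABI uses the second field's own offset. */
static size_t r2off(enum abi abi, size_t size1, size_t align2)
{
    if (abi == ABI_X86_64)
        return 8;
    return field2off(size1, align2);
}
```

For `{i32, f64}` the second field sits at offset 8 either way, which is the case that used to be read from offset 4; for `{i32, f32}` the x86-64 answer stays 8 (and in practice only one register is used), while the field-by-field answer is 4.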
* add command-line predefined macros (-D, -U)  lemon  2025-12-06  1  -2/+0
|
* ir: float fold div/0  lemon  2025-12-05  1  -4/+3
|
* regalloc: kill dead defs of physical regs  lemon  2025-12-04  1  -8/+16
|
* c: make tentative definitions work  lemon  2025-12-02  1  -1/+1
|
* abi/isel: aggregate args in stack wip  lemon  2025-11-27  1  -9/+31
|
* regalloc: skip dead phis  lemon  2025-11-26  1  -1/+4
|
* ir: simplify some occurrences of single-argument phis  lemon  2025-11-24  2  -8/+17
|
* ir.h: tweak mkintrin() definition to work with tinycc  lemon  2025-11-24  1  -1/+1
|
* ir: implement cvtu64f. and other bug fixes  lemon  2025-11-23  1  -2/+35
| The compiler is bootstrapping?! However, the stage1 and stage2+ executables aren't bit-identical: there are small differences in the codegen. Need to look into that.
* implement cvtfXu64 by lowering it in builder  lemon  2025-11-23  1  -9/+46
| This should probably be in a separate pass?
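The commit does not say which lowering is used. For reference, one common way to express a float-to-unsigned-64 conversion using only a signed conversion (which is what x86-64 offers before AVX-512) looks like the sketch below; the function name `cvtf64u64()` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a double in [0, 2^64) to uint64_t using only double -> int64_t
 * conversions, the way a backend without an unsigned convert has to. */
static uint64_t cvtf64u64(double x)
{
    const double two63 = 9223372036854775808.0;  /* 2^63 */
    assert(x >= 0.0 && x < 18446744073709551616.0);
    if (x < two63)
        return (uint64_t)(int64_t)x;   /* fits the signed range directly */
    /* Shift the value down into the signed range, convert, then put the
     * top bit back. The subtraction is exact for doubles >= 2^63. */
    return (uint64_t)(int64_t)(x - two63) ^ (UINT64_C(1) << 63);
}
```

In IR terms this is a compare, a branch or select, a subtraction, a signed convert, and an xor, which is the kind of expansion that could equally live in a dedicated lowering pass.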
* c: check actual reachability for non-void func may not return value  lemon  2025-11-23  2  -0/+22
|
* implement float varargs, and some other fixes  lemon  2025-11-23  3  -7/+17
|
* make sure indirect function call pointer does not end up in clobber reg  lemon  2025-11-22  1  -2/+2
|
* ir: freeblk: clear preds  lemon  2025-11-22  1  -0/+2
|
* ir/ir.c: fix assert in mkcallarg  lemon  2025-11-22  1  -1/+1
|
* ir/dump: initialize out buffer statically  lemon  2025-11-22  1  -3/+1
|
* regalloc: merge overlapping fixed intervals better  lemon  2025-11-22  1  -1/+12
|
* irdump: print alloca # bytes  lemon  2025-11-21  1  -0/+3
|
* ir: implement dominator tree computation  lemon  2025-11-21  3  -0/+40
|
* ir: barebones IR passes checked contracts  lemon  2025-11-21  7  -2/+26
|
* remove umul  lemon  2025-11-21  3  -3/+1
|
* change op names to match 285063eba44  lemon  2025-11-21  8  -142/+142
|
* rename IR classes to reflect bitsize  lemon  2025-11-21  9  -46/+46
|
* regalloc: assert nops aren't being used  lemon  2025-11-21  1  -0/+1
|
* ir/builder: peephole optimize branch with constant conditional  lemon  2025-11-21  1  -4/+14
|
* mem2reg: implement marker algorithm from Braun et al.  lemon  2025-11-21  1  -8/+40
|
* mem2reg: store pending phis implicitly  lemon  2025-11-21  1  -12/+8
|
* ir: fix delpred when npred becomes 1  lemon  2025-11-21  1  -2/+12
|