I have implemented a reasonable number of instructions now doing x86 -> ir. The differences between this compared to REIL, are the use of explicit instructions for each combination of operand type, eg AddImm8 and Add32. The use of instructions which could be implemented using simpler operations, eg, Loading the 2nd byte of a dword into an 8 bit register (think ah/bh/ch/dh). The use of many conditional branches (Je/Jne/Jz/Jnz/Jl/Jle/Jg/Jge). The use of a Dispatch_Jz/Dispatch/Jnz conditional etc that uses a virtual address instead of a label. Not marking each instruction with an address of its own, but using labels. And marking labels using a Label instruction. The use of a RaiseException instruction. Handling of segmentation and checking segment limits – but still using linear addresses in ReadMem/StoreMem.
Those are the major differences, and I certainly haven’t finished writing all the code yet, so it may change in the future. But the instructions I’ve done so far are working out OK and I’m generating some nice output. I’ve spent some time this morning thinking if I should have Put/Get instructions like Valgrind/VEX which copy to and from a guest register/context area. Will think some more..
Which got me thinking about the next phase of my code, translating the IR back into x86. ( In the final implementation there will be an optimisation phase on the IR, but it’s not so important for now and I will focus on the code generation)
My IR is a linear IR implemented as three address code. The naive approach to code generation is to convert is to translate each individual IR instruction to x86 seperately. The problem with this, is that I suspect it will produce very bad code.
AddImm32 $1,,%eax Add32 %eax,%ebx,%eax # eax = eax + ebx LoadMem32 %eax,,%ecx # ecx = LoadMem32(eax) // annotation: eax no longer live
Translating one instruction at a time to x86 will produce something like this –>
inc %eax add %ebx,%eax mov (%eax),%ecx
But really, what we want is something like this –>
mov 1(%eax,%ebx), %ecx
This seems to be hard to generate from three address code. A tree based IR seems much more able for this type of instruction selection.
Valgrind uses a tree based IR internally (though it alternatives between a linear IR which it calls a flat or flattened IR. QEMU uses a linear IR.
Maybe three address code will work out OK and I’m worrying about nothing, but I’m not entirely convinced. Maybe for the first implementation I will just use three address code.
The tree IR seems interesting though for static analysis also in the sense that the representation inherently is able to support instruction re-ordering and register renaming (to an extent). I’m really exaggerating its ability to handle this in the sense of being able to use it for identifying equivalence in some types of code blocks, but I suspect it can make the basis of a useful approach. Think code normalization and malware detection of variants. Not sure if this is plausible. Something for the future..