First implementation of IR -> Tree IR

The pastes below aren’t functional, nor are they final or even 100% correct. But they are kinda cool.

I haven’t tested them fully so they might be really wrong.

# Label (*0x0, , )
# Load32 (%8, , %temp_memreg)
# Add32 (%temp_memreg, %eax, %temp_memreg)
# LoadMem8 (%temp_memreg, , %temp_op1b)
# Widen8_To_SXLo8In32 (%temp_op1b, , %lazyEFlagsOp1)
# LoadImm8 ($0x7f, , %temp_op2b)
# LoadImm32 ($0x7f, , %lazyEFlagsOp2)
# And8 (%temp_op1b, %temp_op2b, %temp_op1b)
# Widen8_To_SXLo8In32 (%temp_op1b, , %lazyEFlagsResult)
# StoreMem8 (%temp_op1b, , %temp_memreg)
# Load32 (%8, , %temp_memreg)
# Add32 (%temp_memreg, %eax, %temp_memreg)
# AddImm32 (%40, $0x0, %temp_memreg)
# LoadMem8 (%temp_memreg, , %temp_op1b)
# Widen8_To_SXLo8In32 (%temp_op1b, , %lazyEFlagsOp1)
# LoadImm8 ($0x7f, , %temp_op2b)
# LoadImm32 ($0x7f, , %lazyEFlagsOp2)
# And8 (%temp_op1b, %temp_op2b, %temp_op1b)
# Widen8_To_SXLo8In32 (%temp_op1b, , %lazyEFlagsResult)
# StoreMem8 (%temp_op1b, , %temp_memreg)

Now the TreeIR version -->

t2 = AddImm32(%40, $0x0)
t1 = LoadMem8(%t2)
t0 = And8(%t1, $0x7f)
StoreMem8(And8(LoadMem8(Add32(%8, %eax)), $0x7f), Add32(%8, %eax))
StoreMem8(%t0, %t2)
Load8(%t0, %temp_op1b)
Load8($0x7f, %temp_op2b)
Load32(%t2, %temp_memreg)
Load32(Widen8_To_SXLo8In32(%t0), %lazyEFlagsResult)
Load32(Widen8_To_SXLo8In32(%t1), %lazyEFlagsOp1)
Load32($0x7f, %lazyEFlagsOp2)

Another example:
# Label (*0x0, , )
# Load32 (%ebx, , %temp_op1d)
# Load32 (%ebx, , %lazyEFlagsOp1)
# LoadImm32 ($0x4, , %temp_op2d)
# LoadImm32 ($0x4, , %lazyEFlagsOp2)
# Add32 (%temp_op1d, %temp_op2d, %temp_op1d)
# Load32 (%temp_op1d, , %lazyEFlagsResult)
# Load32 (%temp_op1d, , %ebx)

And the TreeIR -->

t0 = Add32(%ebx, $0x4)
Load32(%t0, %ebx)
Load32(%t0, %temp_op1d)
Load32($0x4, %temp_op2d)
Load32(%t0, %lazyEFlagsResult)
Load32(%ebx, %lazyEFlagsOp1)
Load32($0x4, %lazyEFlagsOp2)


The Load32/Load16/Load8 instructions at the end of each basic block are important in the TreeIR. They write back to the real registers the values those registers should hold on exit. This is inconsistent with how the temporaries are assigned. Also, the temporaries don’t have sizes associated with them yet, but this isn’t a major problem.
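
One way to picture where those write-backs come from is to symbolically execute the basic block and record, for every register written, an expression tree over the values at block entry. The Python below is only a rough sketch of that idea, not the actual translator code. The tuple encoding `(opcode, src1, src2, dst)`, the `COPIES` set and the helpers `build_trees` and `show` are all made up for illustration, and memory operations such as LoadMem8/StoreMem8 are not modelled.

```python
# Hypothetical mini-model of the linear IR: (opcode, src1, src2, dst).
# Operands starting with '%' are registers, '$' are immediates.
COPIES = {"Load32", "LoadImm32"}  # plain moves: dst = src1


def build_trees(block):
    """Symbolically execute one basic block of pure register operations.

    Returns a dict mapping each written register to an expression tree.
    Trees are nested tuples (opcode, child, ...); leaves are operand strings
    that refer to the register values at block entry.
    """
    env = {}

    def value(operand):
        # A register not yet written in this block stands for its entry value.
        return env.get(operand, operand)

    for op, a, b, dst in block:
        if op in COPIES:
            env[dst] = value(a)
        else:
            env[dst] = (op,) + tuple(value(x) for x in (a, b) if x is not None)
    return env


def show(tree):
    """Render a tree in the TreeIR-like notation used in the pastes."""
    if isinstance(tree, str):
        return tree
    return f"{tree[0]}({', '.join(show(c) for c in tree[1:])})"


# The second example from above, transcribed by hand.
block = [
    ("Load32", "%ebx", None, "%temp_op1d"),
    ("Load32", "%ebx", None, "%lazyEFlagsOp1"),
    ("LoadImm32", "$0x4", None, "%temp_op2d"),
    ("LoadImm32", "$0x4", None, "%lazyEFlagsOp2"),
    ("Add32", "%temp_op1d", "%temp_op2d", "%temp_op1d"),
    ("Load32", "%temp_op1d", None, "%lazyEFlagsResult"),
    ("Load32", "%temp_op1d", None, "%ebx"),
]

trees = build_trees(block)
for reg, tree in trees.items():
    print(f"Load32({show(tree)}, {reg})")
```

Each tree here is expressed over the register values at block entry, so in this sketch the write-backs should be read as happening simultaneously. A shared subtree such as the Add32 would be the natural candidate for a temporary like t0.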

There is a slight inefficiency in that things like temp_op1, temp_op2 etc. are also written when they are in fact no longer live. This could be eliminated with liveness analysis, but I suspect solving the data flow equations is too expensive to be used in binary translation. I am thinking of introducing a NotLive instruction in my IR to help with this scenario.
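
As a rough illustration of what a NotLive marker could buy, the Python sketch below prunes the write-backs of the second TreeIR example. It assumes that the front end can declare the scratch registers dead at block exit without any data flow analysis. The pair encoding, the `prune_writebacks` helper and the `not_live` set are hypothetical and not part of the IR shown above.

```python
# Write-backs from the second TreeIR example, as (source, destination) pairs.
writebacks = [
    ("%t0", "%ebx"),
    ("%t0", "%temp_op1d"),
    ("$0x4", "%temp_op2d"),
    ("%t0", "%lazyEFlagsResult"),
    ("%ebx", "%lazyEFlagsOp1"),
    ("$0x4", "%lazyEFlagsOp2"),
]


def prune_writebacks(writebacks, not_live):
    """Drop write-backs whose destination is declared dead at block exit."""
    return [(src, dst) for src, dst in writebacks if dst not in not_live]


# Assumption: a NotLive marker was emitted for each scratch register,
# meaning no later block reads it before writing it again.
not_live = {"%temp_op1d", "%temp_op2d"}

pruned = prune_writebacks(writebacks, not_live)
for src, dst in pruned:
    print(f"Load32({src}, {dst})")
```

This only removes the exit write-backs. It is safe only if the marked registers really are rewritten before any later read, which is the property a full liveness analysis would otherwise have to prove.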

This is almost useful for a decompiler 😉
