NotLive IR instruction

Maybe this is too small a thing to blog about. I added a NotLive instruction to my IR. It's essentially an annotation that marks a register as not live: the register will not be read again before it is overwritten. This allows some optimisations during the IR -> TreeIR conversion without the expense of liveness analysis. The NotLive instruction is entirely optional. There is no problem if it’s left out; it simply means those optimisations will not be possible. The instruction can lead to an inconsistency in the code if a register marked as dead (not live) is subsequently read from. Good use of the instruction should avoid this.
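To make the idea concrete, here is a minimal Python sketch of how a NotLive marker can be used to discard register writes. It is not my actual IR -> TreeIR code: the `Stmt` tuple, the `SIDE_EFFECTS` set and `apply_not_live` are made up for illustration, statements are flattened to (op, operands, dst), and the Widen8_To_SXLo8In32 wrapper is left out. It walks a basic block backwards, remembers which registers a NotLive has declared dead, and drops any side-effect-free write to such a register unless a later read has made it live again.

```python
from typing import NamedTuple, Optional, Tuple


class Stmt(NamedTuple):
    op: str
    operands: Tuple[str, ...]   # registers and immediates that are read
    dst: Optional[str] = None   # register written, if any


# Ops that must be kept even if their result is unused (assumed list).
SIDE_EFFECTS = {"StoreMem8"}


def apply_not_live(block):
    """Drop register writes that a later NotLive marker says are never read."""
    dead = set()   # registers known to be dead at the current point
    kept = []
    for stmt in reversed(block):              # walk the block backwards
        if stmt.op == "NotLive":
            dead.update(stmt.operands)        # annotation only, emits nothing
            continue
        if stmt.dst in dead and stmt.op not in SIDE_EFFECTS:
            continue                          # the value would never be read
        dead.difference_update(stmt.operands) # a read makes a register live again
        kept.append(stmt)
    kept.reverse()
    return kept


# Hypothetical block, loosely modelled on the listings in this post.
block = [
    Stmt("And8", ("%t1", "$0x7f"), "%t0"),
    Stmt("StoreMem8", ("%t0", "%t2")),
    Stmt("Load8", ("%t0",), "%temp_op1b"),
    Stmt("Load8", ("$0x7f",), "%temp_op2b"),
    Stmt("Load32", ("%t2",), "%temp_memreg"),
    Stmt("Load32", ("%t0",), "%lazyEFlagsResult"),
    Stmt("NotLive", ("%temp_memreg",)),
    Stmt("NotLive", ("%temp_op1b",)),
    Stmt("NotLive", ("%temp_op2b",)),
]

for stmt in apply_not_live(block):
    print(stmt.op, stmt.operands, stmt.dst)
```

In this sketch only the And8, the StoreMem8 and the write to %lazyEFlagsResult survive. Leaving the NotLive markers out changes nothing except that the three temp writes stay, which is why the instruction can be optional.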

An example from yesterday, without NotLive:

t2 = AddImm32(%40, $0x0)
t1 = LoadMem8(%t2)
t0 = And8(%t1, $0x7f)
StoreMem8(And8(LoadMem8(Add32(%8, %eax)), $0x7f), Add32(%8, %eax))
StoreMem8(%t0, %t2)
Load8(%t0, %temp_op1b)                #can be removed
Load8($0x7f, %temp_op2b)              #can be removed
Load32(%t2, %temp_memreg)           #can be removed
Load32(Widen8_To_SXLo8In32(%t0), %lazyEFlagsResult)
Load32(Widen8_To_SXLo8In32(%t1), %lazyEFlagsOp1)
Load32($0x7f, %lazyEFlagsOp2)

 

You can see that those Loads to temp_op1b, temp_op2b and temp_memreg are not actually necessary. They are intermediaries, artifacts of the IR generation, and are no longer needed once the basic block has been completed.

# ... [skip ]
# Widen8_To_SXLo8In32 (%temp_op1b, , %lazyEFlagsOp1)
# LoadImm8 ($0x7f, , %temp_op2b)
# LoadImm32 ($0x7f, , %lazyEFlagsOp2)
# And8 (%temp_op1b, %temp_op2b, %temp_op1b)
# Widen8_To_SXLo8In32 (%temp_op1b, , %lazyEFlagsResult)
# StoreMem8 (%temp_op1b, , %temp_memreg)
# NotLive (%temp_memreg, , )
# NotLive (%temp_op1b, , )
# NotLive (%temp_op2b, , )

 

Note the use of NotLive to mark those intermediaries as no longer live, which results in the following tree IR.

t2 = AddImm32(%40, $0x0)
t1 = LoadMem8(%t2)
t0 = And8(%t1, $0x7f)
StoreMem8(And8(LoadMem8(Add32(%8, %eax)), $0x7f), Add32(%8, %eax))
StoreMem8(%t0, %t2)
Load32(Widen8_To_SXLo8In32(%t0), %lazyEFlagsResult)
Load32(Widen8_To_SXLo8In32(%t1), %lazyEFlagsOp1)
Load32($0x7f, %lazyEFlagsOp2)

 

That produced quite nice code; I think it is near optimal. Of course you're wondering why there is an AddImm32(%40, $0x0), which is equivalent to a NOP. This is due to the lack of an optimisation pass over the linear IR once it’s been generated. I’ll save that for another day.
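For what it's worth, here is a rough Python sketch of what such a pass over the linear IR might look like. It does not exist in the code yet. It assumes the (op, src1, src2, dst) layout used in the linear IR listing above, assumes AddImm32 has no side effects such as flag updates, and uses a made-up Move32 op and a made-up %tmp destination register.

```python
def fold_add_zero(block):
    """Rewrite AddImm32 with a zero immediate into a plain move, or drop it."""
    out = []
    for op, src1, src2, dst in block:
        if op == "AddImm32" and int(src2.lstrip("$"), 16) == 0:
            if src1 != dst:
                # Adding zero into a different register is just a copy.
                out.append(("Move32", src1, "", dst))  # Move32 is hypothetical
            continue  # adding zero to a register in place does nothing
        out.append((op, src1, src2, dst))
    return out


# Hypothetical linear IR; %tmp stands in for whatever the real destination is.
linear_ir = [
    ("AddImm32", "%40", "$0x0", "%tmp"),
    ("AddImm32", "%eax", "$0x4", "%eax"),
    ("AddImm32", "%ebx", "$0x0", "%ebx"),
]

for instr in fold_add_zero(linear_ir):
    print(instr)
```

The first instruction becomes a move, the second is left alone, and the third disappears entirely.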

Since I’ve been blogging about how great this tree IR is, I _really_ hope there aren’t too many major bugs in it which cause it to be killed off. I have done very little debugging in terms of verifying the output, apart from the casual glance to see that it is approximately correct.
