Title says most of it. I have now integrated into my emulator the IR component of the static analysis I will need to perform the control flow classification in my Master's research. To help me debug, I generated some Graphviz output of the callgraph and control flow graphs. The following is a control flow graph from one of the procedures in the callgraph, generated after automatically unpacking the binary hostname.exe, which was packed with UPX: http://silvio.cesare.googlepages.com/cfg.png
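For readers who have not produced Graphviz output before, here is a minimal sketch of how a control flow graph could be written out as DOT text. It is not the emulator's code: the function name `cfg_to_dot`, the block ids, and the IR strings are all made up for illustration, and it assumes each basic block is just a list of printable IR lines.

```python
def cfg_to_dot(name, blocks, edges):
    """Render a control flow graph as Graphviz DOT text.

    blocks: dict mapping a block id to a list of IR lines (hypothetical format)
    edges:  iterable of (source id, destination id) pairs
    """
    def escape(text):
        # Backslashes and double quotes must be escaped inside a DOT string.
        return text.replace("\\", "\\\\").replace('"', '\\"')

    lines = [f'digraph "{escape(name)}" {{',
             '    node [shape=box, fontname="Courier"];']
    for block_id, instrs in blocks.items():
        # "\l" ends a label line and left-justifies it, which suits code listings.
        label = "".join(escape(instr) + "\\l" for instr in instrs)
        lines.append(f'    "{escape(block_id)}" [label="{label}"];')
    for src, dst in edges:
        lines.append(f'    "{escape(src)}" -> "{escape(dst)}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Two invented basic blocks joined by one edge.
    print(cfg_to_dot("proc_example",
                     {"b0": ["t0 = LoadMem esp", "Jcc b1"], "b1": ["Ret"]},
                     [("b0", "b1")]))
```

Saving the printed text to a file and running it through `dot -Tpng` gives an image along the lines of the one linked above.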
The IR implementation is still somewhat buggy. It doesn’t generate everything required for status flag handling in x86, and I also quickly noticed some inconsistency in how I use the StoreMem instruction. Nonetheless, it should give an idea of what I’m doing.
I generate the callgraph and control flow graph based on the IR. This differs from some implementations (like ERESI, I believe), which build these graphs from the native assembly. There is some redundancy between the disassembler and building the graphs from the IR, but this system allows me to use linear sweep disassembly when I need to. I currently use linear sweep disassembly to disassemble any areas that are not covered by recursive traversal. This technique is referred to as speculative disassembly.
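To make the combination of the two strategies concrete, here is a rough sketch of speculative disassembly over a toy instruction set. Everything in it is an assumption for illustration: the one-byte opcodes, the `decode` helper, and the function names are invented, and a real x86 decoder is far more involved. The idea is simply that recursive traversal finds the code reachable from the entry point, and a linear sweep then decodes whatever bytes are left over.

```python
# Toy, made-up encoding for illustration only (not x86):
# opcode -> (mnemonic, length in bytes); 2-byte forms carry an absolute target.
OPCODES = {0x00: ("nop", 1), 0x01: ("jmp", 2), 0x02: ("jcc", 2),
           0x03: ("ret", 1), 0x04: ("call", 2)}


def decode(code, addr):
    """Return (mnemonic, length, target), or None if the bytes do not decode."""
    if not 0 <= addr < len(code):
        return None
    entry = OPCODES.get(code[addr])
    if entry is None:
        return None
    mnemonic, length = entry
    if addr + length > len(code):
        return None
    target = code[addr + 1] if length == 2 else None
    return mnemonic, length, target


def recursive_traversal(code, entry):
    """Follow control flow from entry; return {addr: decoded instruction}."""
    found, work = {}, [entry]
    while work:
        addr = work.pop()
        while addr not in found:
            insn = decode(code, addr)
            if insn is None:
                break
            found[addr] = insn
            mnemonic, length, target = insn
            if target is not None:
                work.append(target)      # branch or call destination
            if mnemonic in ("jmp", "ret"):
                break                    # no fall-through successor
            addr += length
    return found


def speculative_disassembly(code, entry):
    """Recursive traversal first, then a linear sweep over uncovered bytes."""
    confirmed = recursive_traversal(code, entry)
    covered = set()
    for addr, (_, length, _) in confirmed.items():
        covered.update(range(addr, addr + length))

    speculative, addr = {}, 0
    while addr < len(code):
        insn = None if addr in covered else decode(code, addr)
        # Skip bytes that are covered, undecodable, or would overlap known code.
        if insn is None or any(a in covered for a in range(addr, addr + insn[1])):
            addr += 1
            continue
        speculative[addr] = insn
        addr += insn[1]
    return confirmed, speculative


if __name__ == "__main__":
    image = bytes([0x01, 0x03, 0x00, 0x03, 0x00, 0x03])
    confirmed, speculative = speculative_disassembly(image, 0)
    print("recursive traversal:", sorted(confirmed))
    print("linear sweep only:  ", sorted(speculative))
```

Keeping the two result sets separate is a deliberate choice in this sketch: instructions found only by the sweep might be data rather than code, so later analysis can treat them with less trust.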