I spent part of today working on the emulator again. I decided it was finally time to implement the virtual file system code. I had this working for Linux some time ago, but not for win32. Now it’s implemented for win32, but doesn’t work on Linux. The majority of the code is done though, and it’s only a small amount of work to get it working for Linux alongside win32.
With the file system code done and a few other win32 functions added, telock now unpacks successfully. telock checks the first 0x400 bytes of the memory image (which consists of the PE headers) against the file image. For now, telock only unpacks when I trace the program in parallel, because CreateMutex and OpenProcess are only partially implemented.
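To make the telock header check concrete, here is a minimal sketch in Python. It is not telock's code or the emulator's; the function name and the constant are made up for illustration, and it simply assumes both images are available as byte strings.

```python
# Hypothetical illustration of the check described above: compare the first
# 0x400 bytes of the in-memory image with the same range of the file on disk.
HEADER_CHECK_SIZE = 0x400  # size taken from the post; the name is an assumption


def headers_match(file_image: bytes, memory_image: bytes,
                  size: int = HEADER_CHECK_SIZE) -> bool:
    """Return True if the leading `size` bytes of both images are identical."""
    return file_image[:size] == memory_image[:size]
```

The point for an emulator is that whatever it maps at the image base has to be byte-for-byte what the file contains in that range, otherwise the packer notices.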
I tried unpacking pespin after that. pespin actually checks the ImageBase field in the PE optional header. The win32 kernel PE loader sets this field in the memory image to reflect the load address. I talked about this once before, but deleted the comment from my blog as I wasn’t convinced about what was happening in memory. But yes, it gets patched. pespin also uses the sti instruction, which I hadn’t implemented. It raises a privileged instruction exception.
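The ImageBase patching can be sketched roughly like this in Python. This is only an illustration under assumptions: it handles 32-bit PE (PE32) headers only, the function names are invented, and `minimal_pe32_headers` is a hypothetical helper that builds just enough of a header to try it out. The offsets used (e_lfanew at 0x3C, a 20-byte COFF header after the signature, ImageBase at offset 28 of the PE32 optional header) come from the PE format itself.

```python
import struct

PE32_MAGIC = 0x10B          # optional header magic for 32-bit images
COFF_HEADER_SIZE = 20       # fixed size of the COFF file header
IMAGE_BASE_OFFSET = 28      # offset of ImageBase inside the PE32 optional header


def _optional_header_offset(image: bytes) -> int:
    """Locate the PE32 optional header, validating the signature and magic."""
    e_lfanew = struct.unpack_from("<I", image, 0x3C)[0]
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    opt = e_lfanew + 4 + COFF_HEADER_SIZE
    if struct.unpack_from("<H", image, opt)[0] != PE32_MAGIC:
        raise ValueError("not a PE32 optional header")
    return opt


def read_image_base(image: bytes) -> int:
    """Read the ImageBase field from a PE32 header."""
    opt = _optional_header_offset(image)
    return struct.unpack_from("<I", image, opt + IMAGE_BASE_OFFSET)[0]


def patch_image_base(memory_image: bytearray, load_address: int) -> None:
    """Overwrite ImageBase in the memory image with the actual load address."""
    opt = _optional_header_offset(memory_image)
    struct.pack_into("<I", memory_image, opt + IMAGE_BASE_OFFSET, load_address)


def minimal_pe32_headers(image_base: int = 0x400000,
                         e_lfanew: int = 0x80) -> bytearray:
    """Hypothetical helper: a zeroed 0x400-byte header with only the fields used here."""
    image = bytearray(0x400)
    image[0:2] = b"MZ"
    struct.pack_into("<I", image, 0x3C, e_lfanew)
    image[e_lfanew:e_lfanew + 4] = b"PE\0\0"
    opt = e_lfanew + 4 + COFF_HEADER_SIZE
    struct.pack_into("<H", image, opt, PE32_MAGIC)
    struct.pack_into("<I", image, opt + IMAGE_BASE_OFFSET, image_base)
    return image
```

An emulated loader would call something like `patch_image_base` once after mapping the image, so that a packer reading the field back sees the address it was really loaded at.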
pespin and pelock both fail to be emulated completely, and I’m still working on the fixes.
I guess in the future I’m going to have to implement thread support. Apparently some packers use threads, and I’m not implementing any of that currently. Floating point might also need to be done at some point.
The other big problem I have is that the emulator is very slow. Much too slow to be used in an on-access AV scanner. It takes 20 seconds to unpack an rlpack’d calc.exe (120k). Unpacking a minimal 7.5k program packed with telock executes 4 times as many instructions as the rlpack case. pespin takes longer still. When I finish the telock unpacking in standalone mode I’ll have to run some more tests for speed.
I never designed the emulator with speed in mind. It will take significant effort to get it up to speed. I’m currently still more interested in having it work against more packers, but if the code is ever to be used for a real purpose, then I’ll have to address the speed at some point.
I also have to fix up the MemCheck code to work with a Linux host. I have been putting that off for some time now. Turns out I left some debug code in my release version, so translation blocks only ever contain a single instruction.. duh.. That only makes it slower though; it still performs correctly. At some point I’d like to publicise MemCheck on some mailing lists like dailydave/bugtraq/fd/lkml etc, so I’d better get round to it.