Monthly Archives: November 2007

read() buffer overflows

When I was auditing some code recently, I was trying to find ‘entry points’ into the code I was auditing.  I don’t know if anyone uses this terminology, but I’ll give it a shot.

Basically, I didn’t want to read all the code or understand too many details.  So an entry point is just something where bugs around it are likely to lead to exploitation.  Examples in userland are malloc; examples in the kernel are kmalloc, copy_from_user and copy_to_user.  It’s the same idea as grepping for strcpy.

I had this idea that if I could find an integer overflow in the size parameter of a read system call, I might be able to copy a large amount of data into the destination buffer of the read call.  It’s a genuine source of bugs, and I found some instances of it.  But alas, when I wrote an exploit for the instance I found, read returned an error and did not overflow my buffer.  The size parameter I was using was very large.  I was hoping it would copy the rest of the file being read.

I think perhaps on some other or older operating systems, this could potentially be exploitable.  I gave up and didn’t investigate any further why it didn’t work.  I’ll have to continue this in the future.
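A minimal sketch of the kind of size calculation meant here, modelling 32-bit unsigned arithmetic in Python.  The names read_size, record_len and header_len are hypothetical and are not taken from the audited code.  POSIX leaves the behaviour of read() implementation-defined when the count is larger than SSIZE_MAX, which may be part of the reason the call returned an error instead of copying.

```python
MASK32 = 0xFFFFFFFF        # model a 32-bit size_t
SSIZE_MAX32 = 0x7FFFFFFF   # largest ssize_t on a 32-bit system


def read_size(record_len, header_len):
    """Size a careless parser might pass to read(): record_len - header_len."""
    return (record_len - header_len) & MASK32


def exceeds_ssize_max(count):
    """POSIX: read() with a count above SSIZE_MAX is implementation-defined."""
    return count > SSIZE_MAX32


# Hypothetical input: the record claims to be shorter than its own header.
size = read_size(4, 8)
print(hex(size), exceeds_ssize_max(size))
```

Any subtraction that wraps below zero lands in the upper half of the 32-bit range, so it always exceeds SSIZE_MAX in this model.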

*malloc integer overflows (In Kernel)

I’ve been auditing a class of bugs using the following methodology.

  1. Find instances of malloc.  If the malloc size is fixed, discard it.
  2. Follow the code backwards to determine if the size comes from a user controlled location (but don’t get carried away with following the code unless the next condition is met).
  3. If the size is user controlled, or if the code flow is hard to follow, see if the malloc size uses an arithmetic operation like foo + user_len, or foo*user_len.  This could potentially be an integer overflow.
  4. Verify that there is no input validation that restricts the size, and make sure it’s user controlled.

It’s a bit easier in practice to audit for these types of bugs.  It seems like it could lend itself to automatic bug checking.  I curse myself for not having the fortune of being able to write one (maybe more on this later).  But there are still many bugs to be found using this simple approach to auditing.
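As a rough illustration of how steps 1 and 3 might be automated, here is a crude sketch in Python.  It is only a line-based text filter, not a real checker: the regular expressions and the helper names (first_argument, suspicious_allocs) are assumptions made for this example, and steps 2 and 4 (user control and input validation) still have to be done by hand.

```python
import re

ALLOC_RE = re.compile(r'\b(k?malloc)\s*\(')
# Sizes treated as fixed: literals, ALL_CAPS macros, or a bare sizeof(...).
CONSTANT_RE = re.compile(
    r'^(\d+|0x[0-9a-fA-F]+|[A-Z_][A-Z0-9_]*|sizeof\s*\([^()]*\))$')


def first_argument(text, start):
    """Return the first call argument, starting just after the opening paren."""
    depth, i = 0, start
    while i < len(text):
        c = text[i]
        if c == '(':
            depth += 1
        elif c == ')':
            if depth == 0:
                break
            depth -= 1
        elif c == ',' and depth == 0:
            break
        i += 1
    return text[start:i].strip()


def suspicious_allocs(source):
    """Yield (line_no, function, size_expr) for sizes built with + or *."""
    for line_no, line in enumerate(source.splitlines(), 1):
        for m in ALLOC_RE.finditer(line):
            size = first_argument(line, m.end())
            if CONSTANT_RE.match(size):
                continue                          # step 1: fixed size, discard
            if '+' in size or '*' in size:
                yield line_no, m.group(1), size   # step 3: arithmetic on size


# Hypothetical C fragments, one call per line.
SAMPLE = """\
buf = malloc(64);
hdr = kmalloc(sizeof(struct hdr), GFP_KERNEL);
tbl = kmalloc(count * sizeof(struct ent), GFP_KERNEL);
msg = malloc(HDR_LEN + user_len);
"""
for hit in suspicious_allocs(SAMPLE):
    print(hit)
```

A filter like this will report false positives (a pointer dereference also contains `*`, for example), so it only narrows down where to start reading.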

I started looking for Linux bugs, thinking that there might be kmalloc integer overflows in the drivers that handle IO and ioctl calls.

I did find a couple of bugs fairly quickly, but another problem presented itself.  Each time I saw an instance of this bug, the kmalloc call was quickly followed by a copy_from_user call.  But due to the nature of those integer overflows, the size parameter used in the copy_from_user call is very large.  And Linux verifies that the resulting buffers would reside in the appropriate text or data segment, so the oversized copy is rejected.

There are probably more instances of this type of bug where a copy_from_user call does not immediately block exploitation.  I will have to continue auditing to see if any more bugs appear.
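The mismatch between the allocation size and the copy size can be sketched in Python.  This is a simplified model and not kernel code: header_len and user_len are hypothetical names, and USER_LIMIT assumes the classic 3G/1G address split on 32-bit x86 Linux.

```python
MASK32 = 0xFFFFFFFF
USER_LIMIT = 0xC0000000   # assumed top of user space on 32-bit x86


def alloc_size(header_len, user_len):
    """Size handed to kmalloc: header_len + user_len in 32-bit arithmetic."""
    return (header_len + user_len) & MASK32


def user_range_ok(user_ptr, length):
    """Simplified model of the range check made before copying from user space."""
    return user_ptr + length <= USER_LIMIT


user_len = 0xFFFFFFF8                        # chosen so header_len + user_len wraps
print(hex(alloc_size(16, user_len)))         # the allocation is tiny
print(user_range_ok(0x08048000, user_len))   # but the copy length fails the check
```

In this model, any user_len large enough to wrap the addition is also far larger than the whole user address range, which is why the copy is refused before it can overflow the small allocation.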

I just sent a remote heap overflow to IDefense

I’m curious whether IDefense pays more or less than TippingPoint.

I sent a local OpenBSD / FreeBSD kernel bug to ZDI, and they replied saying they were only interested in remote vulnerabilities, and that for the most part they only bought pre-authentication bugs.

Well, I found a reasonably significant heap overflow in some moderately popular open source Unix software, which is aimed primarily at the server market and partly at the desktop market.  It’s not critical internet infrastructure or anything like OpenSSH or Apache, but it’s a reasonable bug nonetheless.  I suspect it could lead to arbitrary code execution, though I haven’t attempted to write an exploit to prove that.

I sent the analysis and a sample Denial of Service exploit to IDefense last Friday (the 23rd).  I got a mail back today saying it has been assigned to a researcher for determination (whether the bug is valid and whether IDefense will offer to purchase the vulnerability).

 I hope I made the right decision to send to IDefense and not ZDI.  Does anyone have comments or ballpark figures of how much money each one pays?

Disassembling Obfuscated Assembly

I wrote a disassembler a week or two ago.  Actually I used libdasm to do the grunt work while I just played with the higher level code.

I want to write some static analysis tools, so disassembling is the first part of the process. Analysis of control flow soon follows the disassembly. Obfuscating the control flow can cause static analysis tools to fail.

  JMP A      ; unconditional jump over the junk byte
  .byte 0xE8 ; 0xE8 is the opcode for a CALL
A: CALL _myfunc

This is a small obfuscation that works against disassemblers that use the Linear Sweep method. It successfully hides the real call, so static analysis tools that depend on correctly identifying the control flow and call graph fail. The Linear Sweep method of disassembly is used most notably by objdump.

The Recursive Traversal method can successfully disassemble the previous obfuscation because it follows the control flow.  It would stop disassembling after the jump site and then correctly follow the jump to the target.
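The difference between the two methods can be shown with a toy disassembler in Python.  This is a sketch under stated assumptions: the decoder only knows a handful of one-byte opcodes, the byte sequence is a hypothetical encoding of the example (an unconditional short jump over a junk 0xE8 byte, then the real call), and call targets are not followed.

```python
# Tiny decoder table: opcode -> (mnemonic, instruction length in bytes).
OPCODES = {0xE8: ("call", 5), 0xEB: ("jmp", 2), 0x74: ("jz", 2),
           0x90: ("nop", 1), 0xC3: ("ret", 1)}


def decode(code, off):
    """Return (mnemonic, length); unknown or truncated bytes decode as 'db'."""
    name, size = OPCODES.get(code[off], ("db", 1))
    if off + size > len(code):
        return "db", 1
    return name, size


def linear_sweep(code):
    """Decode instructions back to back, ignoring control flow."""
    out, off = {}, 0
    while off < len(code):
        name, size = decode(code, off)
        out[off] = name
        off += size
    return out


def recursive_traversal(code, entry=0):
    """Decode only the offsets reachable by following control flow."""
    out, work = {}, [entry]
    while work:
        off = work.pop()
        if off in out or not 0 <= off < len(code):
            continue
        name, size = decode(code, off)
        out[off] = name
        nxt = off + size
        if name in ("jmp", "jz"):
            rel = code[off + 1]
            rel = rel - 256 if rel > 127 else rel   # rel8 is signed
            work.append(nxt + rel)
        if name not in ("jmp", "ret", "db"):
            work.append(nxt)   # fall through (also after call and jz)
    return out


# jmp A; .byte 0xE8; A: call <rel32 = 0x10>; ret
CODE = bytes([0xEB, 0x01, 0xE8, 0xE8, 0x10, 0x00, 0x00, 0x00, 0xC3])
print(sorted(linear_sweep(CODE).items()))
print(sorted(recursive_traversal(CODE).items()))
```

The linear sweep decodes a bogus call at offset 2 and never sees the real one at offset 3, while the traversal skips the junk byte because it never falls through an unconditional jump.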

It is possible to modify the code again to make it harder to disassemble by recursive traversal.

  XORL %eax,%eax ; sets ZF (a MOV would leave the flags unchanged)
  JZ A           ; so this jump is always taken
  .byte 0xE8
A: CALL _myfunc

The conditional jump now replaces our earlier unconditional one. A recursive traversal disassembler will see the conditional jump and believe control can flow both to the jump target and to the code immediately following the jump site. The code immediately following the jump site is junk, so it should be ignored. But how can this be done automatically?

The first idea I came up with was to modify the disassembler to look for fixed code patterns. In the case above, the disassembler would try to recognize jz+1.

What happens, though, if the following occurs?

  XORL %eax,%eax ; sets ZF, so the JZ is always taken
  JZ A
  NOP
  NOP
  NOP
  .byte 0xE8
A: CALL _myfunc

In this case, it’s no longer a jz+1, but a jz+4. This can be varied indefinitely. The NOPs could be replaced with junk instructions, or any other polymorphic or metamorphic code.

There will however be a conflict using recursive traversal when it reaches A via the jump target, and reaches A-1 via disassembling in a straight line. There will be overlapping disassemblies, and it is unclear which disassembly is correct.

The root problem is that the recursive traversal identifies bytes as code when in fact it should not. This is what is termed a false positive in disassembly.

Perhaps disassembly should keep both sequences of disassembly, and consider them unique parts of the program. Later on, dead code elimination can be performed if there is enough data flow analysis.

Or should the data flow analysis be done during the recursive traversal?  This might enable the disassembler to recognize the conditional jump as really being unconditional.
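Both ideas can be sketched in a few lines of Python: keep every decoded sequence and report the overlaps, or fold a known flag value during the traversal.  This is only a toy under several assumptions: the decoder knows four opcodes, the flag tracking only recognises a jz placed directly after `xor %eax,%eax` (bytes 31 C0, which always sets ZF), and the byte sequence is a hypothetical encoding of the jz+1 example.

```python
OPCODES = {0x31: ("xor", 2), 0x74: ("jz", 2), 0xE8: ("call", 5), 0xC3: ("ret", 1)}


def traverse(code, fold_constant_zf=False):
    """Recursive traversal; returns {offset: (mnemonic, length)}.

    With fold_constant_zf=True, a jz directly after `xor %eax,%eax`
    is treated as an unconditional jump.
    """
    seen, work = {}, [(0, False)]            # (offset, ZF known to be set)
    while work:
        off, zf_set = work.pop()
        if off in seen or not 0 <= off < len(code):
            continue
        name, size = OPCODES.get(code[off], ("db", 1))
        if off + size > len(code):
            name, size = "db", 1
        seen[off] = (name, size)
        nxt = off + size
        if name == "jz":
            work.append((nxt + code[off + 1], False))   # forward rel8 only
            if not (fold_constant_zf and zf_set):
                work.append((nxt, False))               # fall-through path
        elif name == "xor":
            work.append((nxt, code[off + 1] == 0xC0))   # xor eax,eax sets ZF
        elif name == "call":
            work.append((nxt, False))                   # call targets not followed
    return seen


def overlaps(insns):
    """Pairs of decoded instructions whose byte ranges overlap."""
    spans = sorted((off, off + size) for off, (_, size) in insns.items())
    return [(a, b) for i, (a, a_end) in enumerate(spans)
            for b, _ in spans[i + 1:] if b < a_end]


# xor %eax,%eax; jz A; .byte 0xE8; A: call <rel32 = 0x10>; ret
CODE = bytes([0x31, 0xC0, 0x74, 0x01, 0xE8, 0xE8, 0x10, 0x00, 0x00, 0x00, 0xC3])
print(overlaps(traverse(CODE)))
print(overlaps(traverse(CODE, fold_constant_zf=True)))
```

Without the flag tracking, the bogus call at offset 4 overlaps the real one at offset 5.  With it, the fall-through path is never decoded and the conflict disappears.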

Another method of obfuscation is to replace the conditional jump with an indirect jump.  This works, because data flow analysis is required to see where the jump target occurs.  Static disassembly has a hard time dealing with this.

PUSH $B; // set up return address, aka emulate call (B is an illustrative label)
PUSH $_myfunc; // push the jump target
RET; // now to do the jump
B:

Again, there are a number of ways of doing this. We could scan for PUSH/RET pairs and identify them as jump sites. But what if an obfuscator changes that code?

PUSH $B
MOV %eax,%eax
PUSH $_myfunc
MOV %ebx,%ebx
RET
B:

There are many such morphisms that can occur. The solution for disassembling this could be to perform data flow analysis as we perform disassembly. That is, for each basic block found, eliminate all instructions that are equivalent to a NOP and then check for jumps.
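A minimal sketch of that idea in Python, working on an already decoded basic block.  The tuple representation, the helper names and the rule for what counts as a NOP equivalent (a register moved to itself, which holds on 32-bit x86) are assumptions for illustration, and the block assumes the obfuscated sequence ends in a RET as the PUSH/RET discussion implies.

```python
def is_nop_equivalent(insn):
    """True for `nop` and for register-to-itself moves such as mov %eax,%eax."""
    op, *args = insn
    if op == "nop":
        return True
    return (op == "mov" and len(args) == 2
            and args[0] == args[1] and args[0].startswith("%"))


def strip_nops(block):
    """Drop instructions that have no effect on the program state."""
    return [insn for insn in block if not is_nop_equivalent(insn)]


def push_ret_targets(block):
    """Targets of push/ret pairs, which behave like `jmp target`."""
    cleaned = strip_nops(block)
    return [a[1] for a, b in zip(cleaned, cleaned[1:])
            if a[0] == "push" and b[0] == "ret"]


# Hypothetical decoded form of the obfuscated sequence.
BLOCK = [("push", "$B"), ("mov", "%eax", "%eax"),
         ("push", "$_myfunc"), ("mov", "%ebx", "%ebx"), ("ret",)]
print(push_ret_targets(BLOCK))
```

Scanning the raw block for adjacent PUSH/RET finds nothing, because a junk MOV sits between them.  After the NOP equivalents are removed, the pair becomes adjacent and the jump to _myfunc is recovered.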

This idea still wouldn’t be perfect I imagine, but it could improve the situation considerably.

Bugs and Exploits From 2002 (from the Ruxcon archive)

This never really received much attention at the time.  And today it’s only a historical footnote in my life.  But I’ve included a tarball that was part of a Ruxcon presentation I gave in 2003.  The presentation was on Kernel Auditing, and I later spoke on the same topic at Blackhat.  The tarball wasn’t part of the Blackhat package.  The archive consists of vulnerable snippets of code, patches, exploits, and reports or logs of the many kernel vulnerabilities I found in OpenBSD, NetBSD, Linux, and FreeBSD.  Also included are some userland bugs, but there are only a few of those.

If anyone wants to have a look, I’ve included it on my website.


A quine is a program that, when run, produces its own listing as output.
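As a small illustration (not one of the quines from my website), here is a well-known two-line Python quine, held in a string so that it can be run and compared against its own text.  The names QUINE and run_and_capture are just for this example.

```python
import contextlib
import io

# The quine itself is the two lines stored in this string.
QUINE = "s = 's = %r\\nprint(s %% s)'\nprint(s % s)"


def run_and_capture(source):
    """Execute source in a fresh namespace and return what it prints."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(source, {})
    return buf.getvalue()


print(QUINE)                                     # the program
print(run_and_capture(QUINE) == QUINE + "\n")    # its output is its own listing
```

The trick is that the string s contains a template of the whole program, and `%r` inserts the quoted form of s back into that template.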

I was going to write Quine sources for the title of this post, but then realised it was redundant 🙂

 I have a website that has some quines I have written.