Over the past few days, I’ve been testing and debugging my Memcheck code using the Linux Test Project (http://ltp.sourceforge.net/). The LTP is a set of over 3000 tests that exercise various components of the Linux kernel. I haven’t run all of the LTP tests yet, since I’ve also been debugging my code and restarting the tests after each recompile of Memcheck. I also tested Memcheck with a kernel module I wrote that does an off-by-one read on a buffer returned by kmalloc. This was detected and a report was generated. The only problem was that I didn’t get symbolic names for the kernel addresses in the module. I’ll have to fix this, perhaps by inspecting the kernel’s symbol table.
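As a rough idea of what the symbol lookup could look like, here is a small user-space sketch. It assumes the symbols are available as a table sorted by start address (the kind of listing /proc/kallsyms provides); the table contents and the names sym_lookup and sym_format are made up for illustration and are not part of Memcheck or the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct sym {
    unsigned long addr;   /* start address of the symbol */
    const char *name;
};

/* Hypothetical table, sorted by address; names and addresses are invented. */
static const struct sym symtab[] = {
    { 0x1000UL, "demo_init" },
    { 0x1040UL, "demo_read" },
    { 0x10c0UL, "demo_exit" },
};
#define NSYMS (sizeof(symtab) / sizeof(symtab[0]))

/* Binary search for the last symbol whose start address is <= addr. */
const struct sym *sym_lookup(unsigned long addr)
{
    size_t lo = 0, hi = NSYMS;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (symtab[mid].addr <= addr)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo ? &symtab[lo - 1] : NULL;
}

/* Format addr as "name+0xoffset", or as a raw address if nothing matches. */
void sym_format(unsigned long addr, char *buf, size_t len)
{
    const struct sym *s = sym_lookup(addr);

    assert(len > 0);
    if (s)
        snprintf(buf, len, "%s+0x%lx", s->name, addr - s->addr);
    else
        snprintf(buf, len, "0x%lx", addr);
}
```

With this table, an address of 0x1044 would be reported as an offset of 4 into demo_read, and anything below 0x1000 falls back to the raw address. The sketch has no notion of symbol sizes, so an address far past the last symbol would still be attributed to it.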
But still, I haven’t found a single out of bounds heap access in the Linux Kernel. Is the kernel really bug free, or am I missing something?
A potential source of missed bugs is that the slub allocator may be very efficient, with little internal fragmentation. I don’t know slub internals, but perhaps allocated objects are being placed sequentially, one after another, without any unused space in between. An off-by-one may then just result in accessing the next allocated object. No report would be generated, since the memory being accessed is valid.
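To make the suspicion concrete, here is a small user-space model. It is only a sketch under my assumption that objects are packed back to back; it is not how slub actually lays out memory, and packed_alloc and access_ok are hypothetical names. The checker keeps one "valid" flag per byte and only asks whether the accessed byte belongs to some live object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 256

static bool valid[POOL_SIZE];  /* shadow map: true = byte belongs to a live object */
static size_t next_free;       /* bump offset: objects are packed back to back */

/* Hypothetical allocator: returns the pool offset of a new object, no padding. */
size_t packed_alloc(size_t size)
{
    assert(size <= POOL_SIZE - next_free);

    size_t start = next_free;
    for (size_t i = 0; i < size; i++)
        valid[start + i] = true;
    next_free += size;
    return start;
}

/* The checker only asks: is this byte addressable at all? */
bool access_ok(size_t off)
{
    return off < POOL_SIZE && valid[off];
}
```

If two 16-byte objects a and b are allocated in a row, b starts at a + 16, so access_ok(a + 16) returns true. The off-by-one read past the end of a looks exactly like a legitimate read of the first byte of b.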
I can think of two ways to solve this.
1) I can add guard bytes around each object. This requires the fewest changes to the kernel and is probably what I will try in the immediate future. This arrangement will allow detection of off-by-ones and true overflows where memory is accessed linearly past the end of the buffer. It won’t catch out-of-bounds accesses that land beyond the object and its guard, for example inside a neighbouring object. The next solution may solve this problem.
2) A more challenging but superior solution is to keep track of the base pointer used to reference an object’s internal members. By associating each memory access with a base pointer, and so with the specific object in question, all memory accesses that stray into other objects may be caught. The way to implement this is to trace every instruction, keeping track of references to the start of heap allocations/objects. Pointers can be incremented, but the base pointer, or the object it refers to, remains constant. Then it is just a matter of following where those pointers lead. When a memory access occurs and its location is derived from an address we are tracking, we can restrict the access to a single object on the heap, instead of all active objects on the heap.
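The guard byte idea in 1) can be modelled in user space as below. This is a sketch only: the guard width of 8 bytes, the per-byte shadow map, and the names guarded_alloc and access_ok are my assumptions for illustration, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 256
#define GUARD     8    /* assumed guard width on each side of an object */

static bool valid[POOL_SIZE];  /* shadow map: guard bytes simply stay false */
static size_t next_free;

/* Hypothetical allocator: [guard][object][guard], returns the object offset. */
size_t guarded_alloc(size_t size)
{
    assert(next_free <= POOL_SIZE);
    assert(size + 2 * GUARD <= POOL_SIZE - next_free);

    size_t start = next_free + GUARD;
    for (size_t i = 0; i < size; i++)
        valid[start + i] = true;
    next_free = start + size + GUARD;
    return start;
}

/* Same check as before: is the byte part of some live object? */
bool access_ok(size_t off)
{
    return off < POOL_SIZE && valid[off];
}
```

For a 16-byte object a, access_ok(a + 16) and access_ok(a - 1) are now both false, so the off-by-one is reported. But if a second 16-byte object b follows, it starts at a + 16 + 2 * GUARD, and an access at that offset computed from a still passes. That is the case guard bytes cannot see.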
I really think 2) is the solution to implement in the long term. It will run slowly and take much longer to implement, but it is really the only guaranteed way of tracking out-of-bounds accesses without many false negatives. I don’t know if I have time to implement this before Ruxcon. I don’t believe Valgrind implements a solution like this (unless I’m mistaken), so it could be quite unique. Also, the code could be used as a base for memory leak detection, as it’s almost the basis of a garbage collector.
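The base pointer idea in 2) might look roughly like this in miniature. It is a sketch under heavy assumptions: a real implementation would have to propagate this metadata through every traced instruction, whereas here the pointer carries it explicitly. The struct tracked_ptr and the functions tracked_alloc, ptr_add and tracked_access_ok are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 256

static size_t next_free;   /* objects are packed with no gaps or guards */

/* A pointer plus the object it was derived from. */
struct tracked_ptr {
    size_t base;   /* start of the owning object, never changes */
    size_t size;   /* size of the owning object */
    size_t cur;    /* current address, moved by pointer arithmetic */
};

/* Hypothetical allocator: the returned pointer remembers its object. */
struct tracked_ptr tracked_alloc(size_t size)
{
    assert(size <= POOL_SIZE - next_free);

    struct tracked_ptr p = { next_free, size, next_free };
    next_free += size;
    return p;
}

/* Pointer arithmetic moves cur only; base and size are inherited. */
struct tracked_ptr ptr_add(struct tracked_ptr p, ptrdiff_t delta)
{
    p.cur += (size_t)delta;   /* unsigned wraparound handles negative deltas */
    return p;
}

/* An access is allowed only inside the object the pointer came from. */
bool tracked_access_ok(struct tracked_ptr p)
{
    return p.cur >= p.base && p.cur - p.base < p.size;
}
```

Even with two 16-byte objects a and b packed with no gap, ptr_add(a, 16) has the same address as b but fails tracked_access_ok, because it is still tied to a. The check depends on where the pointer came from, not on whether the target byte happens to be valid.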
[ update: slub’s debugging option uses a red zone and can catch some types of corruption, so I probably don’t need to implement guard bytes around each object ]