Memory corruption is a scenario in which a buffer (or memory area) is unintentionally modified by an unknown piece of code. The rightful users of the buffer then receive bad data, which can change their behavior or even crash the program. The root cause is usually very hard to find, because the corruption happens asynchronously to the actual use of the buffer: the crash may occur seconds or minutes after the bad write.
In this post I will present one of the ways that may help you catch such corruptions using memory protections.
In most complicated memory corruption scenarios, a buffer is passed through multiple functions, some of which may reside in external libraries (it gets even more complicated if you don’t have all the sources). Because the effects of memory corruption show up asynchronously, placing a breakpoint where the rightful users access the buffer won’t help: by then it is already too late.
In order to help us catch the code that writes to our buffer, we utilize memory protections. The idea is simple. First, we need to allocate memory for the buffer, then, we need to somehow make it “read-only”, and lastly, let the program run using our read-only version of the buffer. When some code tries to write something to it, we will be notified.
Linux allows us to assign different protection attributes to different pages in memory. Therefore, we can reserve a page of our own, place the buffer in it, initialize it with the correct data and then make it read-only. Once it is locked, we can pass the buffer on and see what happens. If some code tries to write to this region, the program receives a SIGSEGV signal and terminates. This prevents the corruption, but does not yet tell us who did it. To find out, we need a signal handler that handles SIGSEGV and provides more information. If your project does not have a process controller, pcd is a recommended option; otherwise, you will have to write your own signal handler. The information from the signal handler gives us the address of the instruction that tried to write to our buffer, and the addr2line tool translates that address to a source file name and line number. If you don’t have the source code, this is good evidence for your library provider that there is a bug in their library.
As an example, I wrote two helper functions that provide this ability. The first one, rte_malloc, has an interface similar to the standard malloc. However, instead of allocating just the size of your buffer, it reserves room for a complete page and places the buffer at a page boundary inside it. The reason is that memory protection can only be applied to whole pages, so the buffer address must be page aligned. The second function, rte_protect, applies the required protection to the buffer: read, write or both. You can call this function at any time and dynamically change the protection on your buffer during your investigation. Here’s the source code of these functions:
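```c
/* Tracking buffer abuse using read/write protection.
 * Written by Hai Shalom.
 *
 * http://www.rt-embedded.com/
 */
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h> /* for uintptr_t */
#include <sys/mman.h>
#include <errno.h>
#include <limits.h> /* for PAGESIZE */

#ifndef PAGESIZE
#define PAGESIZE 4096
#endif

void *rte_malloc( size_t size )
{
    char *buf;
    size_t padded;

    /* Round the size up to whole pages, so that the protected
     * page never covers memory owned by another allocation */
    padded = ( size + PAGESIZE - 1 ) & ~(size_t)( PAGESIZE - 1 );

    /* Allocate memory, reserve space for the alignment.
     * The original malloc pointer is not kept, so the
     * returned buffer must not be passed to free(). */
    buf = malloc( padded + PAGESIZE - 1 );
    if( !buf )
        return NULL; /* Failed to allocate memory */

    /* Align pointer to a multiple of PAGESIZE. uintptr_t is used
     * because int may be narrower than a pointer. */
    buf = (char *)(((uintptr_t) buf + PAGESIZE - 1 ) & ~(uintptr_t)( PAGESIZE - 1 ));

    return buf;
}

int rte_protect( void *buf, int protect_read, int protect_write )
{
    int prot = PROT_READ | PROT_WRITE; /* Default buffer protection */

    /* Configure protection */
    if( protect_read )
        prot &= ~PROT_READ;

    if( protect_write )
        prot &= ~PROT_WRITE;

    /* Now, buffer is aligned, apply desired protection.
     * Only the first page of the buffer is covered. */
    return mprotect( buf, PAGESIZE, prot );
}
```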
Now let’s write a short program to demonstrate how to use them. It assumes that the above functions are available to it. I also instrumented it with pcd’s exception handlers, which you can either use if pcd is running on your target, or replace with your own implementation:
```c
/* Optional, required only if PCD is used */
#include "pcdapi.h"

#define BUF_SIZE 100

int main( int argc, char *argv[] )
{
    char *buf;
    char c;

    /* Register to the PCD exception handler.
     * Note that this is optional, and added only for
     * getting a detailed signal information.
     */
    PCD_API_REGISTER_EXCEPTION_HANDLERS();

    /* Allocate aligned memory */
    buf = rte_malloc( BUF_SIZE );
    if( !buf )
    {
        perror("Failed to allocate memory");
        return errno;
    }
    else
    {
        printf( "Buffer address = %p\n", buf );
    }

    /* Write OK */
    buf[ 0 ] = 1;

    /* Read OK */
    c = buf[0];

    /* Now apply write protection */
    if( rte_protect( buf, 0, 1 ) )
    {
        perror( "Failed to apply protection" );
        return errno;
    }

    /* Read OK */
    c = buf[ 0 ];

    /* Write protected, expect program to terminate */
    buf[ 0 ] = 1;

    return 0;
}
```
As we can see from this example, we allocated a 100-byte buffer, wrote a byte and read a byte. Then we applied the write protection and tried to repeat these actions. The program receives a SIGSEGV signal when it tries to write the byte the second time. Now let’s review the crash dump from pcd (taken on a MIPS platform):
```
/tmp # ./protect
Buffer address = 0x412000
/tmp #
**************************************************************************
**************************** Exception Caught ****************************
**************************************************************************
Signal information:
Time: Thu Jan 1 10:55:12 1970
Process name: ./protect
PID: 2283
Fault Address: 0x412000
Signal: Segmentation fault
Signal Code: Invalid permissions for mapped object
Last error: Success (0)
Last error (by signal): 0
MIPS registers:
regmask=0x2ab8d510 status=0x0000003b pc=0x00400940
zero=0x00000000 at=0x00000001 v0=0x00000000 v1=0x2ab1d000
a0=0x00412000 a1=0x00001000 a2=0x00000001 a3=0x00000000
t0=0x00000000 t1=0x81730470 t2=0x00000000 t3=0x00000000
t4=0x00000000 t5=0x81f145f0 t6=0x00100071 t7=0x81f145f0
s0=0x00412000 s1=0x00000001 s2=0x7fb930b8 s3=0x7fb93174
s4=0x00000001 s5=0x00400614 s6=0x004008c8 s7=0x00000000
t8=0x00000000 t9=0x2ab2a040 k0=0x00000000 k1=0x00000000
gp=0x00418b60 sp=0x7fb93078 fp=0x0045a3ac ra=0x00400934
Maps file:
00400000-00401000 r-xp 00000000 00:09 170737 /tmp/protect
00410000-00411000 rw-p 00000000 00:09 170737 /tmp/protect
00411000-00412000 rwxp 00000000 00:00 0
00412000-00413000 r--p 00000000 00:00 0
2aaa8000-2aaad000 r-xp 00000000 1f:01 59 /lib/ld-uClibc-0.9.30.so
2aaad000-2aaae000 rw-p 00000000 00:00 0
2aabc000-2aabd000 r--p 00004000 1f:01 59 /lib/ld-uClibc-0.9.30.so
2aabd000-2aabe000 rw-p 00005000 1f:01 59 /lib/ld-uClibc-0.9.30.so
2aabe000-2aac0000 r-xp 00000000 1f:01 77 /lib/libpcd.so
2aac0000-2aacf000 ---p 00000000 00:00 0
2aacf000-2aad0000 rw-p 00001000 1f:01 77 /lib/libpcd.so
2aad0000-2aad2000 r-xp 00000000 1f:01 81 /lib/libipc.so
2aad2000-2aae1000 ---p 00000000 00:00 0
2aae1000-2aae2000 rw-p 00001000 1f:01 81 /lib/libipc.so
2aae2000-2ab0c000 r-xp 00000000 1f:01 83 /lib/libgcc_s.so.1
2ab0c000-2ab1c000 ---p 00000000 00:00 0
2ab1c000-2ab1d000 rw-p 0002a000 1f:01 83 /lib/libgcc_s.so.1
2ab1d000-2ab75000 r-xp 00000000 1f:01 68 /lib/libuClibc-0.9.30.so
2ab75000-2ab84000 ---p 00000000 00:00 0
2ab84000-2ab85000 r--p 00057000 1f:01 68 /lib/libuClibc-0.9.30.so
2ab85000-2ab86000 rw-p 00058000 1f:01 68 /lib/libuClibc-0.9.30.so
2ab86000-2ab8b000 rw-p 00000000 00:00 0
7fb7f000-7fb94000 rwxp 00000000 00:00 0 [stack]
**************************************************************************
```
As we can see from the program printout, the buffer address is 0x412000. Now let’s examine the maps file (which is also accessible while the program is running, with cat /proc/<pid>/maps). We can see a region in the range 0x412000-0x413000 with the “r--p” protection bits. This corresponds to our buffer address and the protection we applied; in fact, this line describes the page our buffer was allocated from. Next, we can see that the reported fault address is also 0x412000, which means that the program crashed while accessing this address. This also corresponds with our program, because we indeed tried to access buf[ 0 ]. The last piece of information is the program counter register (pc), which holds the address of the instruction that caused the memory write. The pc shows that the instruction address is 0x00400940, which falls inside the /tmp/protect mapping, so the instruction is in our application and not in any of the linked libraries. We can now use the addr2line tool to match the address to a file name and line number:
```
mips-linux-uclibc-addr2line -e /home/hai/protect 0x00400940
/home/hai/protect.c:93
```
Line 93 indeed matches the memory write in the program, after the protection was enabled.
In conclusion, although this example is very simple, you can apply the same technique to large programs by dynamically enabling and disabling the memory protection. For example, disable the protection around pieces of code that you have examined and found to be OK, and enable it for the rest of the program.
This memory protection technique can also be used to track down other kinds of problems, such as use after free. Use after free is a scenario where a piece of code accesses memory that was already freed and might read garbage (although sometimes the right data may still be there, which makes it even worse). To catch it, replace the call to free with a call that enables read protection on the buffer. This will catch the code that tries to access this memory area.
Another example is memory overrun detection. In this case, a piece of code writes more bytes than the buffer was designed to hold, and might overwrite some other data. To detect it, allocate two pages of memory, enable write protection on the second page, and place the buffer at the end of the first page. When an overrun occurs, the first excess byte lands on the first byte of the second page, triggering the SIGSEGV signal.
You can download the source code for this example here.