Memory leaks are one of the major bugs in RT embedded systems, as I described in the article about the 5 most destructive bugs. Memory leak bugs usually kill your system slowly and painfully. Depending on the total amount of RAM and the extent of the leak, the system will continue to run and perform well for a certain amount of time. It could be hours, days, or weeks. During this time, the amount of free memory constantly decreases. In Linux-based systems, when memory is required and there is no free memory, the kernel starts paging out programs and clearing the page cache. Further memory requests degrade performance because living tasks are paged out. Eventually, the kernel triggers the Out-Of-Memory killer and starts killing processes. At this stage, the system may already be unusable. In the last stage, the kernel panics and hangs or reboots.
Many consumer electronics products in the field remain active their whole life, so you can’t allow any memory leaks to happen, or else the customer will have to power cycle the unit from time to time (this is annoying, isn’t it?). Memory leaks are also hard to notice during unit tests or lab tests because in these scenarios the unit is rebooted constantly as part of the development or testing work.
This article covers a few techniques that will help you isolate and resolve the memory leak in your user-space application. An article about kernel memory leaks will be provided as well.
Memory leaks occur due to continuous allocation of resources without freeing them, thus reducing the amount of free memory.
First step: Do we have a memory leak?
As I described earlier, it is hard to detect memory leaks during development due to the frequent system reboots in the development process. Furthermore, if the leak is minor, it could take a long time until the system’s performance degrades noticeably. Therefore, the first action to take during development is a code review and flow review. Make sure every memory allocation is coupled with a matching free. Cover all if-else and other conditional flows to make sure that the coupling still holds. If the system starts to slow down after it has been running for a while, and/or you see strange messages which contain the phrase “oom_killer”, then a memory leak probably exists somewhere. Also, during your tests, you may want to check the amount of free memory in the system periodically and log it. If the amount keeps shrinking over time while the workload stays the same, then a memory leak most likely exists.
Let’s see the following code example of a program that leaks memory on purpose:
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#define BLOCK_SIZE ( 128 * 1024 )
int rte_steal_mem( void )
{
char *p;
/* Allocate memory */
p = malloc( BLOCK_SIZE );
if( p )
{
/* Clear it - force real allocation */
memset( p, 0, BLOCK_SIZE );
return 0;
}
return -1;
}
int main( int argc, char *argv[] )
{
while( 1 )
{
/* Call this function */
if( rte_steal_mem() != 0 )
{
return 1;
}
/* Wait 2 seconds */
sleep(2);
}
return 0;
}
This program allocates an additional 128KB of memory every 2 seconds. Now, let’s run it in the background and use the free utility to check the amount of free memory at 10-second intervals:
# memleak &
As we can see in the “free” column, the amount of free memory is reduced in each iteration, and the “used” column shows an increasing amount of memory. This is the expected scenario when there is a memory leak. Again, the extent of the leak depends on the bug itself and how often it occurs.
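The same periodic check can be done from a small C helper instead of the free utility. The following is only a sketch, assuming a Linux system where /proc/meminfo is available; the function names meminfo_field_kb and read_free_kb are made up for this illustration.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return the value (in kB) of a field such as "MemFree" from text in
 * /proc/meminfo format, or -1 if the field is not found.
 */
long meminfo_field_kb( const char *text, const char *field )
{
    size_t len = strlen( field );
    const char *line = text;

    while( line && *line )
    {
        /* A match is the field name followed directly by ':' */
        if( strncmp( line, field, len ) == 0 && line[len] == ':' )
        {
            long kb;

            if( sscanf( line + len + 1, "%ld", &kb ) == 1 )
                return kb;
            return -1;
        }
        /* Advance to the start of the next line */
        line = strchr( line, '\n' );
        if( line )
            line++;
    }
    return -1;
}

/* Read /proc/meminfo and return the current MemFree value in kB */
long read_free_kb( void )
{
    char buf[4096];
    size_t n;
    FILE *fp = fopen( "/proc/meminfo", "r" );

    if( !fp )
        return -1;
    n = fread( buf, 1, sizeof( buf ) - 1, fp );
    fclose( fp );
    buf[n] = '\0';
    return meminfo_field_kb( buf, "MemFree" );
}
```

A test program could call read_free_kb() every 10 seconds and log the result. A value that keeps dropping while the workload is unchanged points to a leak.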
Second step: Isolate the leaking process
The top utility provides a per-process breakdown of memory usage. Let’s take a look at the following snapshots, where the first two were taken immediately after the memleak process was started, and the third one was taken a few minutes later:
##### First top iteration #####
##### Second top iteration #####
Mem: 10068K used, 108752K free, 0K shrd, 436K buff, 848K cached
##### top output after a few minutes #####
Mem: 65212K used, 53608K free, 0K shrd, 436K buff, 848K cached
Take a look at the VSZ and %MEM columns, which show the amount of memory allocated by each process and its percentage of the system memory. In the first iteration, the memleak process looks normal, like all the other processes. In the second iteration, it starts to be noticeable, and in the third output we can see that the memleak process has allocated about 57MB, which is 47% of the system’s memory!
Now we know which process is leaking. In case no process shows growing numbers, then the leak could be in the kernel code. Click here in order to find out how to detect and fix these.
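If top is not convenient (for example, when logging from a watchdog task), the per-process numbers can be read directly from /proc. This is a sketch only, assuming a Linux /proc/<pid>/statm file whose first two fields are the total program size and the resident set size, both in pages; parse_statm and process_rss_kb are names invented for this example.

```c
#include <stdio.h>
#include <unistd.h>

/* Parse a /proc/<pid>/statm line; outputs are in pages. Returns 0 on success. */
int parse_statm( const char *line, long *total_pages, long *resident_pages )
{
    if( sscanf( line, "%ld %ld", total_pages, resident_pages ) != 2 )
        return -1;
    return 0;
}

/* Return the resident set size of a process in kB, or -1 on error */
long process_rss_kb( int pid )
{
    char path[64];
    char line[256];
    long total, resident;
    FILE *fp;

    snprintf( path, sizeof( path ), "/proc/%d/statm", pid );
    fp = fopen( path, "r" );
    if( !fp )
        return -1;
    if( !fgets( line, sizeof( line ), fp ) )
    {
        fclose( fp );
        return -1;
    }
    fclose( fp );
    if( parse_statm( line, &total, &resident ) != 0 )
        return -1;
    /* Convert pages to kB using the system page size */
    return resident * ( sysconf( _SC_PAGESIZE ) / 1024 );
}
```

Sampling process_rss_kb() for each suspect PID every few minutes should show the same picture as the top snapshots: the leaking process keeps growing while the others stay flat.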
Third step: Open the code, find and fix the leak
The traditional approach – Code reviews
As I mentioned earlier, the first approach is to try to avoid memory leaks by performing code reviews and flow reviews. However, sometimes a post-release code review (once we’ve isolated the process) can reveal the leaking location inside the process. Here are some points to consider:
- Make sure all memory allocations are coupled with memory free.
- Make sure all other resource allocations are coupled with resource release (such as fopen->fclose, open->close, etc.).
- Cover all if-else, switch-case and other conditional flows, including erroneous flows, to make sure that the coupling still remains.
- There are cases where a function allocates memory for the caller. Make sure that you free the memory once it is not needed anymore.
- Make sure that a pointer that holds the address of dynamically allocated memory is not erased or overwritten by another value. If this happens, you will not be able to free this resource.
If all of that does not help, you can consider using code inspection tools, or the open source library described later in this article.
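One common way to keep allocations and releases coupled on every path, including the error paths, is to route all exits through a single cleanup label. The function below is a hypothetical example (load_block is not part of any program discussed here) that allocates a buffer and opens a file, and releases both no matter where it fails.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Read the first 'size' bytes of a file into a temporary buffer.
 * Returns 0 on success, -1 on any failure. Every exit path goes
 * through the 'out' label, so nothing is leaked.
 */
int load_block( const char *path, size_t size )
{
    int rc = -1;
    char *buf = NULL;
    FILE *fp = NULL;

    buf = malloc( size );
    if( !buf )
        goto out;

    fp = fopen( path, "rb" );
    if( !fp )
        goto out;            /* buf is still released below */

    if( fread( buf, 1, size, fp ) != size )
        goto out;            /* both buf and fp are released below */

    /* ... use the data here ... */
    rc = 0;

out:
    if( fp )
        fclose( fp );
    free( buf );             /* free( NULL ) is a safe no-op */
    return rc;
}
```

With early return statements instead of the goto, it is easy to forget the free on the fopen or fread failure path, which is exactly the kind of leak a flow review should catch.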
Replacing the malloc function with a debug version
We can apply a nice trick to identify the caller: replacing the standard malloc function with a macro. The macro keeps a reference to the original malloc under another name, and then redefines malloc so that every call first prints useful information about the caller and the requested size, and then performs the real allocation. It is most useful to put this macro in a common header file, or in a specific C file that you suspect. The beauty of this macro is that it does not harm the program: every memory allocation is still served correctly. However, it will slow down the program due to the prints. Note that the ({ ... }) construct used below is a GCC extension. The macro is shown in the following example code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define __malloc malloc
#undef malloc
#define malloc( _size ) \
    ({ printf( "%s: Calling malloc(%zu) from line %d, filename: %s\n", \
               __func__, (size_t)( _size ), __LINE__, __FILE__ ); \
       __malloc( _size ); })
int main( int argc, char *argv[] )
{
void *p;
p = malloc( 1024 );
memset( p, 0, 1024 );
printf( "p=%p\n", p );
free(p);
return 0;
}
And here is the output of this file:
$ ./mem
main: Calling malloc(1024) from line 13, filename: mem.c
p=0x9c06008
Any doubt who the caller was? Note that the same macro technique can be applied to free or any other allocation function. If a specific caller is constantly allocating memory (for example, in a loop), it will be very easy to catch it here. However, if the leak is small and needs more analysis, the malloc macro can be extended to also print the address of the allocated memory, and a matching free macro can print the address being released. Using an offline processing tool (such as Microsoft Excel), we can then analyze the output of the macros and match allocations against releases of the same address, and so track down the caller of an allocation which wasn’t freed. If this sounds too complicated, the next section describes a tool that may automate some of the work.
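As a rough sketch of that extension, the version below logs the address on both malloc and free, and also keeps a small in-memory table so the matching can be done inside the program rather than offline. The names dbg_malloc, dbg_free, dbg_report_unfreed and the table size are assumptions made for this illustration, and the table is not thread safe.

```c
#include <stdio.h>
#include <stdlib.h>

#define DBG_MAX_TRACKED 256   /* Assumed table size for this sketch */

static struct
{
    void *addr;
    const char *file;
    int line;
} dbg_table[DBG_MAX_TRACKED];

/* Allocate with the real malloc and remember who asked for the block */
static void *dbg_malloc( size_t size, const char *file, int line )
{
    void *p = malloc( size );
    int i;

    fprintf( stderr, "malloc(%zu) = %p at %s:%d\n", size, p, file, line );
    for( i = 0; p && i < DBG_MAX_TRACKED; i++ )
    {
        if( dbg_table[i].addr == NULL )
        {
            dbg_table[i].addr = p;
            dbg_table[i].file = file;
            dbg_table[i].line = line;
            break;
        }
    }
    return p;
}

/* Release with the real free and drop the block from the table */
static void dbg_free( void *p, const char *file, int line )
{
    int i;

    fprintf( stderr, "free(%p) at %s:%d\n", p, file, line );
    for( i = 0; p && i < DBG_MAX_TRACKED; i++ )
    {
        if( dbg_table[i].addr == p )
        {
            dbg_table[i].addr = NULL;
            break;
        }
    }
    free( p );
}

/* Print every tracked block that was not freed; return how many there are */
static int dbg_report_unfreed( void )
{
    int i, count = 0;

    for( i = 0; i < DBG_MAX_TRACKED; i++ )
    {
        if( dbg_table[i].addr )
        {
            fprintf( stderr, "unfreed %p from %s:%d\n", dbg_table[i].addr,
                     dbg_table[i].file, dbg_table[i].line );
            count++;
        }
    }
    return count;
}

/* From this point on, malloc and free calls go through the debug versions */
#define malloc( size ) dbg_malloc( ( size ), __FILE__, __LINE__ )
#define free( p )      dbg_free( ( p ), __FILE__, __LINE__ )
```

Calling dbg_report_unfreed() at a point where everything should have been released lists the file and line of each allocation that is still outstanding.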
Using dmalloc to debug your application
dmalloc is a dynamic memory allocation analysis tool that replaces the traditional malloc (including all its derivatives) and free functions with debug-enhanced versions, which allow you to detect:
- Memory leaks.
- Memory overruns.
- Multiple calls to free on the same pointer.
- Freeing a NULL pointer.
- Using a freed resource.
The tool can be configured at run time, and also provides statistics of memory allocation during the program’s life and after it exits. The provided log is detailed and contains file names and line numbers.
Once you’ve downloaded and compiled the dmalloc package, you’ll see that it created an executable and a library. The executable is used to configure dmalloc at run time, and the library contains the debug-enhanced versions of the allocation/release functions.
In order to use the dmalloc library in your program, you need to make some minor modifications to its source files and makefiles. In each source file where you want to enable dmalloc debugging, you need to include the library’s header file. This include directive can be ifdef’ed so that it is enabled only in debug mode. For example:
#ifdef ENABLE_DMALLOC_DEBUG
And in the makefile, you need to link with the dmalloc library as in this example:
ifdef ENABLE_DMALLOC_DEBUG
Note that multithreaded applications must be linked with the thread-aware version of the library. In this example, you need to compile your application with the ENABLE_DMALLOC_DEBUG flag defined in order to enable the dmalloc infrastructure for debug purposes.
Once you’ve compiled and linked your application with dmalloc enabled, boot the system and configure dmalloc with the options you need, such as shell type, monitoring level, check interval, log file and other advanced options. Run the dmalloc application with the --usage flag to see the other flags. If you are using an embedded system with BusyBox, use the -b flag, and then copy-and-paste the output text that dmalloc prints. This procedure is required in order to set up an environment variable which will be visible to the program you want to debug. Here’s an example:
# dmalloc -b high -i 1 -l /dev/console
In this example, dmalloc is told to print an init string for Bourne shell types (-b), to use the high monitoring level, to check the heap at an interval of 1 (-i 1; in dmalloc the interval counts calls into the library, not seconds), and to write the log to the console (-l; this could be replaced by any file name). Don’t forget that you need to copy-and-paste the output to make this configuration effective. Now you are ready to run your program. As an example, let’s modify our memleak program to do 5 iterations and then exit. Let’s see what statistics are displayed when the program exits:
# memleak
Let’s see the useful debug information which the dmalloc provided:
- General information
- Heap statistics
- Total calls for each allocation or free functions
- Allocation statistics
- Top 10 allocations
- And a list of non-freed pointers including file name and line number (This is great for us!).
With this information, we can isolate and fix the memory leak immediately. If your program does not exit (a daemon, for example), you can show these statistics by calling the statistics function dmalloc_log_stats( ), which is actually the same function that is automatically called upon process termination. Your program can call this function periodically or upon an external request (such as an IPC action).
Although it is not directly related to memory leaks, let’s also see what happens when the program overflows a buffer. Let’s change the memset call in our memleak program to write BLOCK_SIZE+1 bytes (1 extra byte). Here’s the output:
66: 2: Dmalloc version '5.5.2' from 'http://dmalloc.com/'
As we can see, the dmalloc caught the overflow and terminated the program. We can see when the buffer was allocated and the data that was overflowing.
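To give a feeling for how an overrun of this kind can be detected at all, here is a conceptual sketch of guard bytes placed after each block. It is not dmalloc’s actual implementation; the names guarded_alloc and guard_intact, the guard size and the fill pattern are all assumptions for illustration.

```c
#include <stdlib.h>
#include <string.h>

#define GUARD_SIZE 8      /* Assumed number of guard bytes after the block */
#define GUARD_BYTE 0xA5   /* Assumed fill pattern */

/* Allocate 'size' bytes followed by a guard area filled with a known pattern */
void *guarded_alloc( size_t size )
{
    unsigned char *p = malloc( size + GUARD_SIZE );

    if( p )
        memset( p + size, GUARD_BYTE, GUARD_SIZE );
    return p;
}

/* Return 1 if the guard area after the block is intact, 0 if it was overwritten */
int guard_intact( const void *block, size_t size )
{
    const unsigned char *guard = (const unsigned char *)block + size;
    size_t i;

    for( i = 0; i < GUARD_SIZE; i++ )
    {
        if( guard[i] != GUARD_BYTE )
            return 0;
    }
    return 1;
}
```

Writing even one byte past the requested size changes the pattern, so a later check of the guard area reveals the overflow. A debug allocator can run such checks on every call or when the block is freed.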
You can download the dmalloc source code and consult its detailed documentation at http://dmalloc.com/.
Fourth step: Validate your fix
Congratulations, you found and fixed the memory leak. The last step is to repeat step 1 and make sure the system does not show memory degradation over time, and you’re done.
Resources:
http://dmalloc.com/
http://linux.die.net/man/1/top