What is an Alignment Trap?
System memory serves two main purposes: storing code and storing data. Regarding code, each 32-bit opcode must be stored at a 32-bit aligned address, meaning that the address of each opcode must be divisible by 4. A 16-bit processor (or ARM in Thumb mode) can access opcodes only at 16-bit aligned addresses, meaning that the address must be divisible by 2. Storing the code differently in memory causes unpredictable behavior, because the CPU reads the opcodes incorrectly.
An alignment trap occurs when the CPU is required to access a 32-bit variable that is not 32-bit aligned (the same applies to a 16-bit variable that is not 16-bit aligned).
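The divisibility rule above can be written as a one-line address check. This is only a minimal sketch: the helper name `is_aligned` is made up for illustration, and it assumes the access size is a power of two (1, 2 or 4 bytes here).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when addr is a multiple of size.
 * size must be a non-zero power of two, so the low bits of the address
 * can be tested with a mask instead of a division.
 */
static bool is_aligned(uintptr_t addr, size_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (addr & (uintptr_t)(size - 1)) == 0;
}
```

For example, the address 0x109e2 passes the check for a 2-byte access but fails it for a 4-byte access, which is exactly the situation that ends in a trap when a 32-bit load is attempted there.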
Some examples of variable alignment
When you define a variable globally, on the stack, or allocate memory for it on the heap, it is placed at a memory location that suits its size. The address of a 32-bit variable must be divisible by 4, the address of a 16-bit variable must be divisible by 2, and an 8-bit variable can be anywhere. The CPU has opcodes to access words, half-words and bytes accordingly. Data structures that contain members of various sizes are padded so that every member is aligned according to its size, unless the packed attribute forces the compiler to skip the padding. In that case, the compiler generates code that reads the nearest aligned address and isolates the required value using register shifts, logical AND operations and multiple byte accesses.
This of course makes the code larger and slower. It is therefore recommended to work with the native-size variables of the CPU (in this case, int or unsigned int) to get optimal performance, even if their full range is not needed, and to avoid packed structures as much as possible. Note that there are cases where packed structures must be used, for example hardware addresses or network packets.
In conclusion, it looks like we are OK, and all kinds of memory access should be covered. Alignment traps occur when the CPU tries to perform a memory access on an unaligned address. But if we are covered, why do we still get alignment traps?
Let’s take a look at the following code snippet, which defines some types of variables, a structure and a packed structure:
Now let’s print each variable’s address, using a simple program:
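A minimal sketch of such definitions and an address-printing helper is given here. Only the type name rte_unpacked_t and the member s1 appear in this article; the packed type name, the other member names, their types and their order are assumptions made for illustration. The packed attribute syntax is the GCC/Clang one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed layout: an 8-bit, a 16-bit and a 32-bit member, in that order. */
typedef struct {
    uint8_t  c1;
    uint16_t s1;
    uint32_t i1;
} rte_unpacked_t;

/* Same members, but the compiler is told not to insert any padding. */
typedef struct __attribute__((packed)) {
    uint8_t  c1;
    uint16_t s1;
    uint32_t i1;
} rte_packed_t;

/* Without padding, the size is simply the sum of the member sizes. */
static_assert(sizeof(rte_packed_t) == 7, "packed struct must have no padding");

/* Print one member's address and the largest alignment (4, 2 or 1) it meets. */
static void print_member(const char *name, const void *base, size_t offset)
{
    uintptr_t addr = (uintptr_t)base + offset;
    unsigned align = (addr % 4 == 0) ? 4u : (addr % 2 == 0) ? 2u : 1u;

    printf("%-12s %p (aligned to %u)\n", name, (void *)addr, align);
}

/*
 * Print the address of every member of both structures. Addresses are
 * computed from the base pointer and offsetof(), so no pointer to a
 * packed member is ever formed.
 */
static void print_addresses(const rte_unpacked_t *u, const rte_packed_t *p)
{
    print_member("unpacked.c1", u, offsetof(rte_unpacked_t, c1));
    print_member("unpacked.s1", u, offsetof(rte_unpacked_t, s1));
    print_member("unpacked.i1", u, offsetof(rte_unpacked_t, i1));
    print_member("packed.c1",   p, offsetof(rte_packed_t, c1));
    print_member("packed.s1",   p, offsetof(rte_packed_t, s1));
    print_member("packed.i1",   p, offsetof(rte_packed_t, i1));
}
```

Calling print_addresses() on one global instance of each type shows the effect. On common 32-bit ARM and x86-64 ABIs, s1 and i1 sit at offsets 2 and 4 in the unpacked structure (8 bytes in total), while the packed structure places them at offsets 1 and 3 (7 bytes in total).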
As expected, the addresses of the unpacked structure and of the integers are 32-bit aligned, the addresses of the short integers are 16-bit aligned, and the addresses of the 8-bit variables can be odd or even. Pay attention to the packed structure: the addresses of its members are not aligned, and access to each one of them is expensive in terms of code size and performance.
An example of bad code which causes an alignment trap
Let’s use the above data types and write a function that returns the address of the s1 variable inside an rte_unpacked_t structure. In main, we will deliberately use an integer variable, which is incompatible with the size of the short integer (int is 32 bits long, and short is 16 bits long). From the output above, we know that the address of s1 is 16-bit aligned (0x109e2). The following code produces a compiler warning about an incompatible pointer type. In many cases such warnings are ignored or, even worse, a cast is added in main to silence them…
As a result, our program crashes immediately, and we get the following error:
Well, this case is easy because it’s a very simple and short program. It could be a major problem when it is hidden inside thousands of lines of code, especially if someone used a cast to avoid the warning. It could be even worse if the function returns a void pointer into a database that contains various data types of different sizes.
We have an alignment trap error. Now what?
The error message provides a lot of useful information that we can use to isolate the bug. We can see the process name, the process ID, the current value of the Program Counter register, the instruction that caused the error, the address that was not aligned and the value of FSR (on ARM platforms).
The process name and ID give us a general location of the problem. In a system with many programs, it’s a good first step towards the solution. As we suspected, the address in the error message matches the address of our packed.s1 variable, but usually we can’t conclude anything from this field because we don’t know the address of each variable. Now, we’ll take a closer look at the PC and Instr. The value of the Program Counter points into the process’s virtual memory, so the physical address is not 0x8680 but something else. We don’t really care about that. This value can help us determine whether the instruction was executed from within the program or from a shared library.
Next, we reboot (or restart the process) and look at its maps file:
Since it’s just an example, there are not many libraries. Let’s take a look at the address ranges in the first column and compare them against the reported value of the Program Counter. We can clearly see that the error comes from within the program’s code (see the first line, with the Read/Execute permissions). If there was a problem in a linked library (such as the libc library), the PC value would be mapped much higher, and for the libc library example, it would be somewhere between 0x400e000 and 0x4049000. In order to calculate the instruction address inside the library, we subtract the corresponding library’s base address from the reported PC value.
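The range comparison and the subtraction described above can be sketched in a few lines of C. The type and function names here are hypothetical, and the sketch assumes the mapping starts at file offset 0 (the third column of the maps line), so that the PC minus the base address is directly the offset inside the binary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* One mapping taken from a line of the process's maps file. */
typedef struct {
    unsigned long start;   /* first address of the mapping */
    unsigned long end;     /* first address after the mapping */
    char perms[5];         /* e.g. "r-xp" */
} map_range_t;

/* Parse the "start-end perms" prefix of a maps line; true on success. */
static bool parse_maps_line(const char *line, map_range_t *out)
{
    assert(line != NULL && out != NULL);
    return sscanf(line, "%lx-%lx %4s",
                  &out->start, &out->end, out->perms) == 3;
}

/*
 * If pc lies inside the mapping, store its distance from the base
 * address in *offset and return true; otherwise return false.
 */
static bool pc_offset_in_range(const map_range_t *r, unsigned long pc,
                               unsigned long *offset)
{
    assert(r != NULL && offset != NULL);
    if (pc < r->start || pc >= r->end)
        return false;
    *offset = pc - r->start;
    return true;
}
```

Fed with a maps line such as `00008000-00009000 r-xp 00000000 1f:07 57 /usr/sbin/alignment` and the reported PC of 0x8680, the check succeeds with an offset of 0x680. A PC in the libc range does not match this line, so the search would continue with the next mapping.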
OK, so now we know where to look. Assuming we compiled our program with debug symbols, we can use the objdump utility and ask it to show the mixed assembly and C code (using the -S option) around this address. Here’s a snippet from the output:
The highlighted numbers in the objdump output show us both the reported address and instruction, right next to the problematic code! This trick works even for the most complicated programs. If you can’t figure out the correct address, you can try to grep the objdump output for the reported instruction. Most likely you’ll get a few matching results, one of which holds the key to your error.
If the address belongs to a shared library, we run objdump on the shared library’s binary and not on the program’s binary.
If your program was not compiled with symbols, you can recompile it with symbols and try to reproduce the problem. It is recommended to always compile the program with symbols. The symbols are stripped in the final target’s filesystem anyway.
In order to find the exact assembly line when the problem comes from a library, we’ll subtract the correct base address from the reported PC value. This will be the offset of the problematic code inside the library.
A few guidelines help to avoid alignment traps in the first place:
- Try to use 32-bit variables. For local variables and function arguments, short and char variables generally do not save memory or run faster, because the CPU loads them into 32-bit registers in any case and has even more work to do to handle them correctly.
- Use casting with caution. Beware of arrays of bytes; don’t try to access such an array through a 32-bit pointer.
- Clean the warnings. Don’t leave warnings in your code.
- Don’t fix warnings that you are not sure how to fix. Consult with others.
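For the byte-array case mentioned in the list above, one common alternative to casting is to copy the bytes into a properly aligned variable, or to assemble the value byte by byte. The following is a minimal sketch with hypothetical helper names; it is one possible approach, not the only one.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Read a 32-bit value from any address, aligned or not, in host byte
 * order. memcpy() makes no alignment assumption, so the compiler emits
 * whatever access sequence is safe for the target.
 */
static uint32_t load_u32(const void *src)
{
    uint32_t value;

    assert(src != NULL);
    memcpy(&value, src, sizeof value);
    return value;
}

/*
 * Read a 32-bit little-endian value one byte at a time, which is also
 * independent of the host byte order (useful for network or file data).
 */
static uint32_t load_u32_le(const uint8_t *p)
{
    assert(p != NULL);
    return  (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

With a buffer holding the bytes 0x78, 0x56, 0x34, 0x12 starting at an odd offset, load_u32_le() returns 0x12345678 without ever performing a 32-bit access on an unaligned address.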
How can PCD help to debug, resolve and prevent alignment traps?
When a program crashes due to an alignment trap, it is terminated once the error message is displayed. This crash does not trigger any recovery action, and your system will probably become unstable or unusable. The PCD can help here in two areas:
Enhanced debugging capabilities and system recovery. Once the program registers with the PCD exception handlers, they provide more information about the crash, including the Program Counter, the Link Register (return address), all the other registers, the last value of errno and the maps file of the process, right before it was terminated. The latter helps you analyze the location of the error just by looking at the PCD’s error report. The PCD also triggers a recovery action once the crash is detected, and returns the system to functional mode. The crash information is also saved on non-volatile storage for later or offline analysis. Let’s configure a simple PCD rule to start and monitor the alignment program:
Here is the output on the console once PCD has started this rule. Note that the selected recovery action here was “Reboot”, that’s why the system is rebooting right after the crash:
00008000-00009000 r-xp 00000000 1f:07 57 /usr/sbin/alignment
The crash details provided by the PCD can help you find and resolve this issue easily and quickly. The fact that they are also saved in the non-volatile storage of the target even makes it possible to fix the issue remotely, or later during post-mortem analysis.