The CPU is an important resource in embedded systems. The CPU type, model and processing power are carefully selected during the project’s first steps. A system whose CPU is busy most of the time is not healthy, and a system whose CPU is idle most of the time is wasteful. Assuming we’re past the design stage and the system is running, you and your managers will probably want to know how busy the CPU is while performing various tasks. Operations that require extensive CPU processing include cryptographic calculations, handling and routing network traffic, graphics calculations and many other product-specific calculations.
This article covers three utilities that help you measure the CPU load in a Linux environment. Their output differs in the level of detail reported.
The VMSTAT utility
The vmstat utility reports summarized information about processes, memory, paging, block I/O, traps, and CPU load. It is distributed as part of the procps package, which contains libraries and other utilities that report information taken from the proc filesystem. The following window provides an example screenshot of the vmstat utility, which was configured to display 12 readings in 1 second intervals. Each line provides a snapshot of the system at the current sampling time, except the first line, which reports averages since the last reboot (use vmstat 1 12):
# vmstat 1 12
The following fields are displayed in the vmstat output:
| Procs |
| r: The number of processes waiting for run time. |
| b: The number of processes in uninterruptible sleep. |
| Memory |
| swpd: The amount of virtual memory used. |
| free: The amount of idle memory. |
| buff: The amount of memory used as buffers. |
| cache: The amount of memory used as cache. |
| Swap |
| si: Amount of memory swapped in from disk (or disks). |
| so: Amount of memory swapped to disk (or disks). |
| IO |
| bi: Blocks received from a block device (blocks/s). |
| bo: Blocks sent to a block device (blocks/s). |
| System |
| in: The number of interrupts per second, including the clock. |
| cs: The number of context switches per second. |
| CPU |
| us: %CPU spent by User space applications. |
| sy: %CPU spent by the System (kernel mode). |
| id: %CPU spent idle. |
| wa: %CPU spent waiting for I/O. |
| st: %CPU stolen from a virtual machine. |
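If you want to see where these CPU columns come from, the following Python sketch computes similar percentages from two readings of /proc/stat. It is only an illustration, not the procps implementation: the function names are made up for this example, and folding nice into us and irq/softirq into sy is my assumption about how the columns are grouped.

```python
"""Estimate vmstat-style CPU percentages from two /proc/stat samples."""
import time

# Column order of the aggregate "cpu" line in /proc/stat (see proc(5)).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(stat_text):
    """Return the counters of the aggregate 'cpu' line as a dict of ints."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            # Older kernels expose fewer columns; treat the missing ones as 0.
            values += [0] * (len(FIELDS) - len(values))
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate 'cpu' line found")


def cpu_percentages(before, after):
    """Convert the counter deltas between two samples into percentages."""
    delta = {name: after[name] - before[name] for name in FIELDS}
    total = sum(delta.values()) or 1  # avoid dividing by zero
    return {
        # Assumption: nice is counted as user time, irq/softirq as system time.
        "us": 100.0 * (delta["user"] + delta["nice"]) / total,
        "sy": 100.0 * (delta["system"] + delta["irq"] + delta["softirq"]) / total,
        "id": 100.0 * delta["idle"] / total,
        "wa": 100.0 * delta["iowait"] / total,
        "st": 100.0 * delta["steal"] / total,
    }


def measure(interval_s=1.0, path="/proc/stat"):
    """Take two samples interval_s seconds apart (Linux only)."""
    with open(path) as stat_file:
        before = parse_cpu_line(stat_file.read())
    time.sleep(interval_s)
    with open(path) as stat_file:
        after = parse_cpu_line(stat_file.read())
    return cpu_percentages(before, after)
```

On a Linux target, print(measure(1.0)) gives one reading that should be roughly comparable to a single vmstat line. The counters in /proc/stat are cumulative since boot, which is why two samples are needed.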
The procps package source code can be downloaded from here.
The TOP utility
The TOP utility is another standard utility. It is distributed as part of the procps package, and a lighter implementation is provided in Busybox. It provides a summary of the CPU and memory usage in the system, and it also provides detailed information about each task in the system. This information includes the process ID, the parent’s process ID, owner, state and memory consumption. It also shows the CPU usage breakdown per process in the column before the process name. This detailed output is mainly used for debugging purposes, to detect tasks that consume too much CPU time. The output refresh rate (in seconds) and the number of iterations can be configured in the command line. Here’s a snapshot of Busybox’s top output on an idle machine, which was configured to display 10 readings in 1 second intervals. The top output refreshes the whole screen in each iteration (use top 2 10):
# top 2 10
The CPU summary line provides the following information:
| Field | Description |
| usr | %CPU spent by User space applications |
| sys | %CPU spent by the System (kernel mode) |
| nic | %CPU spent by Low priority user mode (nice) |
| idle | %CPU which is available (idle) |
| io | %CPU spent by I/O waiting |
| irq | %CPU spent servicing interrupt requests |
| sirq | %CPU spent servicing soft irqs |
Please note that this output may vary between different top versions or implementations. In the Busybox implementation, the CPU measurement feature is disabled by default. To enable it, you must change the Busybox configuration (Process Utilities->top->Show CPU per-process usage percentage and Show CPU global usage percentage).
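To get a feel for how a per-process CPU figure can be derived, here is a small Python sketch that reads the utime and stime counters from /proc/<pid>/stat twice and converts the difference into a percentage of one CPU. The function names are hypothetical, and this is not how Busybox or procps top is implemented; their exact calculation may differ (for example, by reporting the share of all CPUs).

```python
"""Per-process CPU share from two /proc/<pid>/stat samples."""
import os
import time


def process_ticks(stat_line):
    """Return utime + stime (in clock ticks) from one /proc/<pid>/stat line."""
    # The command name (field 2) is wrapped in parentheses and may itself
    # contain spaces or ')', so split only the text after the last ')'.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    utime, stime = int(rest[11]), int(rest[12])
    return utime + stime


def cpu_percent(ticks_before, ticks_after, interval_s, ticks_per_s):
    """Percentage of one CPU used by the process during the interval."""
    return 100.0 * (ticks_after - ticks_before) / (interval_s * ticks_per_s)


def measure_pid(pid, interval_s=1.0):
    """Sample one process twice, interval_s seconds apart (Linux only)."""
    ticks_per_s = os.sysconf("SC_CLK_TCK")
    path = f"/proc/{pid}/stat"
    with open(path) as stat_file:
        before = process_ticks(stat_file.read())
    time.sleep(interval_s)
    with open(path) as stat_file:
        after = process_ticks(stat_file.read())
    return cpu_percent(before, after, interval_s, ticks_per_s)
```

For example, measure_pid(os.getpid()) measures the script itself. A process stuck in a busy-wait loop would show a value close to 100 here.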
The IOSTAT utility
The iostat utility is used for monitoring system input/output devices and can help you estimate the performance of block devices, but it also calculates the CPU load. It can be configured to display the CPU load periodically. Its output is a summary of the CPU load during the sampling time. Here’s the output of iostat, when it was configured to show the CPU usage in 1 second intervals (use iostat -c 1):
# iostat -c 1
Each line provides the following information:
| Field | Description |
| user | %CPU spent by User space applications |
| system | %CPU spent by System (kernel mode) |
| io | %CPU spent by I/O waiting |
| idle | %CPU which is available (idle) |
The iostat source code can be downloaded here.
How to interpret the numbers
At a high level, all three tools provide a summary of the current CPU usage. The four important fields are user, system, i/o and idle; together with the less common fields (such as nice, irq and steal, where they are reported) they add up to 100%. If you are only interested in the overall CPU load, run iostat for a summary. If you want more detail, run vmstat. If you need a per-process breakdown, use top.
A system which shows a high percentage in the user field suggests that the CPU is busy mainly with one or more processes in user space. Usually, a process should not consume too much CPU during its run time, due to the scheduling policy. If a process constantly consumes a lot of CPU processing power, it is either performing a heavy calculation task or something is wrong. If you know that the specific process should consume a lot of CPU time, then it is OK. Otherwise, you need to check the implementation of the process. It might be stuck in a busy-wait state, it might be over-prioritized (read here for more details), or perhaps its code is not optimized (for example, performing many iterations of division/modulus operations, performing unneeded calculations, making too many system calls, etc.).
A system which shows a high percentage in the system field suggests that the CPU is busy mainly running kernel code and drivers. This is typically seen with high-bandwidth networking drivers, where the system processes heavy network traffic, or under a high interrupt load. The Linux kernel provides means to reduce the networking overhead with the NAPI networking API. Interrupt handling overhead can be reduced, if the hardware supports it, by interrupt pacing, which activates the servicing function once per batch of interrupts, so that the work for the whole batch is done in one pass.
A system which shows a high percentage in the i/o field suggests that the CPU is waiting for data to be read from or written to a storage device (disk, flash, CD, etc.). Note that it doesn’t mean that the CPU is busy; it could perform other tasks while the i/o operation completes. However, it might suggest that you need to tune your caches and file systems.
A system which shows a high percentage in the idle field suggests that the CPU is idle most of the time (just sitting and waiting). This is normal when the system itself is waiting (for example, at the shell or the login screen). However, if this is the situation even under the highest possible work load of the product, then the CPU is too powerful for the product, and it is a waste of money and power. You should consider giving this CPU additional work to do, or putting it in a low power mode and reducing its clock rate.
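The four cases above can be summed up in a few lines of Python. This is only a rule-of-thumb sketch: the field names, the hint texts and the example values are mine, and picking the single largest field is a simplification of the reasoning above.

```python
"""Map a CPU usage summary to the rule-of-thumb readings described above."""

# One short hint per field, paraphrasing the four cases in the article.
HINTS = {
    "user": "user-space process busy: check for busy-wait, priority or unoptimized code",
    "system": "kernel busy: check drivers, network traffic and interrupt load",
    "io": "waiting on storage: consider tuning caches and file systems",
    "idle": "mostly idle: fine at rest, oversized CPU if seen under peak load",
}


def dominant_field(usage):
    """Return the name of the field with the highest percentage."""
    missing = set(HINTS) - set(usage)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return max(HINTS, key=lambda name: usage[name])


def interpret(usage):
    """Return (field, hint) for the dominant field of a usage summary."""
    field = dominant_field(usage)
    return field, HINTS[field]
```

For example, interpret({"user": 72.0, "system": 10.0, "io": 3.0, "idle": 15.0}) points at the user field (the numbers are made up for illustration). In practice you would look at several readings over time rather than a single one.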
Please note that the scenarios which were described are just examples. There could be other reasons which affect your system behavior in a different manner.
Resources:
http://procps.sourceforge.net/
http://linux.die.net/man/8/vmstat
http://linux.die.net/man/1/iostat
http://www.busybox.net/

That’s an excellent paper, thank you so much. Keep open source alive!
Hello Hai.
First, this article is a really good one. Well done!
As to my questions:
1. Do you have an article regarding CPU utilization?
I’d like to know whether there is a limit to the %CPU utilization which an embedded system should not cross.
I know the number of 70%, but my colleagues claim it can reach even 85%. Can you tell, or point me to some relevant info?
2. Is there some limit to the TOTAL %Memory (i.e. total used memory) which is recommended for an embedded system (note: embedded system, and not desktop)?
I mean, when we design the system, should we say that crossing this limit is unacceptable, or maybe we have to test the system and see if it crashes?
Thanks a lot.
Hi Aviv.
1. There is no way to do that. There are other means to limit a process’s utilization, mainly by giving it the correct priority. A process controller can also send stop and start signals to a process. There is no point in “reserving” CPU cycles, because then you actually throw them away. If a job needs to be done, it should use the CPU. Again, you need to fine-tune your system and prioritize each process, so important tasks are done correctly.
2. You should read the post. Send another question if it doesn’t provide an answer for you.