
Size optimization - Part 3

Now that you’ve optimized your applications (programs) and archives (static libraries), we’ll discuss how to optimize your shared libraries. Unlike archives, which are used only at link time on the host machine, shared libraries reside on the target’s file system and cannot be reduced using the same techniques. Furthermore, when you create a shared library, you cannot know which of its functions the applications will use and which they will not. It is also not trivial to figure out the dependencies between the library’s own functions (which function requires another one inside the library). Therefore, shared libraries always contain the full set of functions. The question we ask is: how much storage space is wasted on unused code in a shared library?

If we only ever use small, lightweight libraries, this may not be a serious issue, although many small libraries can add up eventually. However, there are cases where large libraries are used, for example OpenSSL. By embedded standards this library is huge and takes up a large part of the system’s storage. In most cases, we don’t need its full set of functionality. Some libraries (OpenSSL included) offer configuration options that let the user disable some features to reduce the final size. We can also work hard to manually remove some objects from the library according to our current needs. But what happens if we add a new program that requires functionality we already removed? We’ll end up with a broken system due to unresolved symbols. How can we automate shared library reduction? Here’s the answer: use mklibs.

What is mklibs?

mklibs is a script that was originally used for creating bootable floppy diskettes. Because these historical storage media were very limited in size, it was necessary to store only the code that was needed, and nothing else. The Internet is full of variants of this script; some are written in shell and others in Python. The resources section provides links to some of them. Modern embedded systems have a similar problem. They also have limited storage, not because of technological limitations but because of the final cost: smaller storage devices are simply cheaper.

How does it work?

mklibs iterates repeatedly over the symbols of the programs and of the shared libraries, as follows (quoting from the script):

  • Gather all unresolved symbols and libraries needed by the programs and reduced libraries.
  • Gather all symbols provided by the already reduced libraries (none on the first pass).
  • If all symbols are provided:
    • we are done.
  • Else:
    • Go through all libraries and remember what symbols they provide.
    • Go through all unresolved/needed symbols and mark them as used.
    • For each library:
      • Find pic file (if not present copy and strip the shared library).
      • Compile in only used symbols.
      • Strip.
    • Back to the top.
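
The loop above is a fixed-point iteration: keeping one symbol can pull in further symbols, so the script repeats until nothing is missing. Here is a minimal sketch of that idea in Python on made-up data. It is not the code of mklibs itself; the function name, the dictionary layout and the toy library are all assumptions for illustration, and the real script works on ELF files through binutils rather than on dictionaries.

```python
# Toy model of the mklibs fixed-point loop. All data here is made up:
# each library maps an exported symbol to the symbols that symbol needs.

def reduce_libraries(program_undefined, libraries):
    """Return {library: set of symbols to keep} for the given programs.

    program_undefined: set of symbols the programs leave unresolved.
    libraries: {lib_name: {symbol: set of symbols it depends on}}.
    """
    kept = {name: set() for name in libraries}
    while True:
        # Unresolved symbols of the programs and of the reduced libraries.
        needed = set(program_undefined)
        for name, symbols in kept.items():
            for sym in symbols:
                needed |= libraries[name][sym]
        # Symbols provided by the already reduced libraries
        # (none on the first pass).
        provided = set().union(*kept.values())
        missing = needed - provided
        if not missing:
            return kept  # all symbols are provided: we are done
        # Mark each missing symbol as used in the library that provides it.
        progress = False
        for name, symbols in libraries.items():
            hits = missing.intersection(symbols)
            if hits:
                kept[name] |= hits
                progress = True
        if not progress:
            raise LookupError(f"unresolved symbols: {sorted(missing)}")


# Hypothetical library: "a" needs "b"; "c" and "d" are never used.
toy = {"libfoo.so": {"a": {"b"}, "b": set(), "c": {"d"}, "d": set()}}
print(sorted(reduce_libraries({"a"}, toy)["libfoo.so"]))  # ['a', 'b']
```

In the toy run the second pass is what discovers "b", which is why a single pass over the program's symbols is not enough.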

How can it be used?

Please note that using this script in your system may not be trivial. There are plenty of potential issues that can prevent the script from completing its task, or prevent its output from actually working on the target. Here are a few pointers that will help you (I spent a lot of time figuring all of this out):

  1. Your shared libraries must be created with the soname flag enabled. This flag embeds the shared library’s name inside the shared library itself. Here’s an example: “-Wl,--soname,$(so_target)”, where the so_target variable contains the library’s full name (such as libmath.so).
  2. For each library you must create a pic file, which is actually a static library version of your shared library, compiled with function and data sections (see part 2 for more details). The archive must have the same name as the shared library, with the suffix _pic and the extension .a instead of .so (for example, libmath_pic.a for libmath.so). To accomplish this, use the linker to create a single object file from the list of library objects, and then call the archiver to create an archive from this linked object. For example, the following command creates a single object out of the library’s objects: “ld -r -o $(so_target)_pic.o $(objs) $(LDFLAGS)”, and the following command creates the archive: “ar rc $(so_target)_pic.a $(so_target)_pic.o”. One last command is required after the archive is created: “ranlib $(so_target)_pic.a”. Note that in these three commands so_target must expand to the library name without the .so extension (libmath rather than libmath.so), otherwise the archive name will not match.
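
To make the naming rule in step 2 concrete, here is a small Python sketch that derives the pic archive name from a shared library name and assembles the three commands as argument lists. The function names are hypothetical, and the regular expression is my assumption about how a version suffix such as .so.1 should be handled; check it against the mklibs variant you use. The $(LDFLAGS) part of the ld command is left out for brevity.

```python
import re

# Assumption: the pic archive name is derived from the shared library name
# by dropping ".so" and any version suffix, e.g. libmath.so.1 -> libmath_pic.a.
_SO_NAME = re.compile(r"^(lib.+?)\.so(\..+)?$")


def pic_archive_name(so_name):
    """Map a shared library file name to its expected pic archive name."""
    match = _SO_NAME.match(so_name)
    if match is None:
        raise ValueError(f"not a shared library name: {so_name!r}")
    return match.group(1) + "_pic.a"


def pic_build_commands(so_name, objects, ld="ld", ar="ar", ranlib="ranlib"):
    """Return the three commands from step 2 as argument lists."""
    archive = pic_archive_name(so_name)
    merged = archive[:-2] + ".o"  # libmath_pic.a -> libmath_pic.o
    return [
        [ld, "-r", "-o", merged, *objects],  # link all objects into one
        [ar, "rc", archive, merged],         # wrap it in an archive
        [ranlib, archive],                   # index the archive
    ]


for command in pic_build_commands("libmath.so", ["add.o", "mul.o"]):
    print(" ".join(command))
```

For a cross build you would pass your own tool names for ld, ar and ranlib; each list can then be handed to subprocess.run.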

Once the script encounters a shared library with its name embedded inside and a matching pic file, it will be able to maximize the reduction of this library. Otherwise, it will only strip the library, which will not reduce its size any further (stripping is usually done anyway).

The following parameters are supported by the script. Note that this list may change between the various variants:

  -d, --dest-dir DIRECTORY          create libraries in DIRECTORY
  -D, --no-default-lib              omit default libpath ( /lib/ : /usr/lib/ : /usr/X11R6/lib/ )
  -L DIRECTORY[:DIRECTORY]...       add DIRECTORY(s) to the library search path
      --ldlib LDLIB                 use LDLIB for the dynamic linker
      --libc-extras-dir DIRECTORY   look for libc extra files in DIRECTORY
      --target TARGET               prepend TARGET to the gcc and binutils calls
      --root ROOT                   search in ROOT for library rpaths
  -v, --verbose                     explain what is being done
  -h, --help                        display this help and exit

For embedded systems, the following flags are recommended:

  1. -D: You must configure the script not to use the host’s libraries.
  2. -L <directory>: Specify where your libraries reside.
  3. --ldlib <loader library>: Specify where the loader library is. For uClibc systems, the loader is called ld-uClibc-0.9.XX.so, where XX stands for the uClibc version (28, 29, 30…).
  4. --libc-extras-dir <libc directory>: Specify where the uClibc libraries reside. In uClibc systems, they reside in the uClibc directory, under lib/.
  5. --target <CROSS>: Specify your cross compiler prefix.
  6. -d <dest directory>: Specify the destination output directory where all the reduced libraries will be stored.
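
As a rough illustration of how these flags combine, the Python sketch below assembles an mklibs command line from them. The function name and every path in the example are hypothetical. It also assumes that the programs to scan are passed as trailing positional arguments, which is how the Debian variant behaves as far as I know; check the usage text of your own variant.

```python
def mklibs_argv(programs, lib_dirs, dest_dir, ldlib=None,
                libc_extras_dir=None, target=None, script="mklibs"):
    """Assemble an mklibs command line from the flags recommended above."""
    argv = [script, "-D"]  # do not search the host's default libpath
    for directory in lib_dirs:
        argv += ["-L", directory]
    if ldlib:
        argv += ["--ldlib", ldlib]
    if libc_extras_dir:
        argv += ["--libc-extras-dir", libc_extras_dir]
    if target:
        argv += ["--target", target]
    argv += ["-d", dest_dir]
    argv += list(programs)  # assumption: programs are positional arguments
    return argv


# Hypothetical paths and cross prefix, for illustration only.
print(" ".join(mklibs_argv(
    programs=["rootfs/bin/app"],
    lib_dirs=["rootfs/lib", "rootfs/usr/lib"],
    dest_dir="reduced_libs",
    target="arm-linux",
)))
```

The returned list can be passed to subprocess.run(argv, check=True), so a failing mklibs run stops the build instead of silently leaving the original libraries in place.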

If you are using uClibc in your system, you should consider using the uClibc version of mklibs (see link in the resources section).

Estimated size reduction

If your system uses large libraries, especially general-purpose open source ones, there is a good chance that you’ll see a substantial reduction in size. On the platform I am working with, I was able to reduce the final size of the OpenSSL library by 25%. That means a few hundred kilobytes of unused code were just sitting on the precious and limited storage media. On the other hand, if most of your libraries are lightweight and locally written, there are usually few unused functions, since most of them are there by design.
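
To check what you actually gained on your own platform, you can compare the total size of the libraries before and after the reduction. The Python helpers below are a minimal sketch; the function names and the "*.so*" file pattern are my own choices, and symbolic links are skipped so that a library is not counted twice.

```python
from pathlib import Path


def total_size(directory, pattern="*.so*"):
    """Sum the sizes in bytes of the regular files matching pattern."""
    return sum(p.stat().st_size for p in Path(directory).glob(pattern)
               if p.is_file() and not p.is_symlink())


def percent_saved(original_bytes, reduced_bytes):
    """Percentage of the original size that was removed."""
    if original_bytes <= 0:
        raise ValueError("original size must be positive")
    return 100.0 * (original_bytes - reduced_bytes) / original_bytes


print(percent_saved(1000, 750))  # 25.0
```

Calling total_size on the original library directory and on the mklibs destination directory gives the two numbers to feed into percent_saved.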

Other pointers

Expect the script to give you a hard time before it produces the required results. However, if your system uses open source libraries and is short on storage, this script will give you the most size reduction. If a shared library breaks because of this optimization process (it is possible), you can disable the optimization for that specific library by removing its pic file. As I mentioned, without a pic file the script will only strip the library and otherwise keep it intact.
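
If you want to switch the reduction off for one library in a reversible way, you can rename its pic file instead of deleting it. The Python sketch below does that; the function name and the ".disabled" suffix are hypothetical, and it assumes the pic file sits in the given directory under the lib<name>_pic.a naming described earlier.

```python
from pathlib import Path


def disable_reduction(lib_dir, so_name, suffix=".disabled"):
    """Rename the library's pic archive so mklibs no longer finds it.

    Returns the new path, or None if there was no pic file to hide.
    """
    base = so_name.split(".so", 1)[0]  # libssl.so.1 -> libssl
    pic = Path(lib_dir) / f"{base}_pic.a"
    if not pic.exists():
        return None
    hidden = pic.with_name(pic.name + suffix)
    pic.rename(hidden)
    return hidden
```

Renaming the file back restores the full reduction for that library on the next mklibs run.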

Resources:
mklibs from debian: http://ftp.debian.org/pool/main/m/mklibs/
mklibs with uClibc fixes: http://cristi.indefero.net/p/buildroot-cristi/source/tree/8a25f1c4e78f02b1e722ccc2f659c025ba662296/toolchain/mklibs/mklibs.py
mklibs from coreboot buildrom: http://tracker.coreboot.org/trac/buildrom/browser/buildrom-devel/bin/mklibs.py


2 comments to Size optimization – Part 3

  • jekell

    Quotes :
    “Please note that it may not be so trivial to use this script in your system”
    “The script will give you hard time until it will produce the required results”

    I confirm these statements. I am trying to get a valid result out of mklibs for an SH4 arch running a Qt application.
    As of now, I was not able to get anything satisfying out of mklibs:
    - The set of “required” libs given by mklibs is insufficient and leads to an application that launches but crashes at runtime
    - It is hard to get valid pic files for all shared objects, and for many of them, mklibs gets stuck because of an undefined “__dso_handle” symbol issue
    - Optimization with a pic file sometimes leads to final shared libs that have problems at runtime because of undefined symbols.

    So for the moment, mklibs only gives me a “clue” about which shared libs I can remove from the final rootfs.
    After searching for help and documentation on mklibs usage and possible issues, I found nothing very useful apart from this page.
    If anyone has links to such pages, please share.

  • Xiaoyu

    1. I solved the “__dso_handle” symbol not found issue as follows:
    when you do “ld -r -o …”, make sure all the system-level .o files are included. For example, on my system:
    ld -r -o libTrace_pic.o /usr/lib/gcc/i686-redhat-linux/4.4.4/../../../crti.o /usr/lib/gcc/i686-redhat-linux/4.4.4/crtbeginS.o cTrace.o cStr.o cValue.o /usr/lib/gcc/i686-redhat-linux/4.4.4/crtendS.o /usr/lib/gcc/i686-redhat-linux/4.4.4/../../../crtn.o

    2. It seems mklibs could not reduce the shared library very much. I wonder whether I have made a mistake?
    To highlight what I have done:
    a) Compiled the above cTrace.o, cStr.o and cValue.o with “-ffunction-sections -fdata-sections”
    b) Revised “mklibs” to enable detailed debug output and removed the “dpkg-architecture -qDEB_HOST_MULTIARCH” related code (I am working on Fedora 13).

    mklibs seems to run properly. The key part of the log is below:
    ——————————————————————————–
    reducing libTrace.so
    using: _ZN4CStrC1Ev _ZN4CStrD1Ev _ZN4CStr6toPStrEPKcPc
    resolving /home/wruser/elf/mklibs_test/Release/lib/libTrace_pic.a
    resolved to /home/wruser/elf/mklibs_test/Release/lib/libTrace_pic.a
    extracting from: /home/wruser/elf/mklibs_test/Release/lib/libTrace_pic.a so_file: /home/wruser/elf/mklibs_test/Release/lib/libTrace.so
    calling mklibs-readelf --print-soname /home/wruser/elf/mklibs_test/Release/lib/libTrace.so
    soname: libTrace.so
    calling mklibs-readelf --print-needed /home/wruser/elf/mklibs_test/Release/lib/libTrace.so
    calling gcc -nostdlib -nostartfiles -shared -Wl,--gc-sections,-soname=libTrace.so -u_ZN4CStrD1Ev -u_ZN4CStrC1Ev -u_ZN4CStr6toPStrEPKcPc -o reduced_libs/libTrace.so-so /home/wruser/elf/mklibs_test/Release/lib/libTrace_pic.a -lgcc -Lreduced_libs -L/home/wruser/elf/mklibs_test/Release/lib -L/lib/ -L/usr/lib/ -L/usr/X11R6/lib/ -lstdc++ -lm -lgcc_s -lc
    calling objcopy --strip-unneeded -R .note -R .comment reduced_libs/libTrace.so-so reduced_libs/libTrace.so-so-stripped
    /home/wruser/elf/mklibs_test/Release/lib/libTrace.so 26484L
    reduced_libs/libTrace.so-so 96594L
    reduced_libs/libTrace.so-so-stripped 26356L
    ———————————————————————————–
    I’ve tried with and without “-Wl,--gc-sections” in the gcc command above, but the reduction is almost negligible: libTrace.so shrank by less than 100 bytes.

    My test program only uses cStr.o, so I expected that at least cTrace.o and cValue.o could be removed from libTrace.so.

    Could someone tell me what is wrong with my test? By the way, I use mklibs_0.1.33.tar.gz, downloaded from http://packages.debian.org/sid/mklibs.
