Erek Göktürk

Crash Course in Shared Libraries and Preloading

This page gives a crash course in building shared libraries from object files, and in preloading shared libraries so that you can play with modified versions of library functions without recompiling the programs that use them.

This page by itself is nowhere near complete, so it is meant to be read along with the material pointed to in the following section.

Have fun...

Some resources

Must reads (along with this document ;) ):

  • David A Wheeler's program library howto
  • The info documents and manual pages for gcc, ld, ld.so, and ld-linux.so (for example info gcc and man ld.so).

The following are optional:

  • If you are new to how linking and loading works, this article from the Linux Journal titled "Linkers and Loaders" by Sandeep Grover might provide the basic understanding.
  • I didn't have the time to check it, but if you need further information on linking and loading, it might be worth looking at the book "Linkers and Loaders" by John Levine. He has the manuscript online.
  • A friend (Zeljko Vrba) also pointed me to a paper presented at UKUUG 2002 by Ulrich Drepper titled "How to write shared libraries" (pdf). I haven't had time to have a look, but I've been told that it's pretty extensive. The slides (pdf) and a script mentioned in the slides are also available on his site.

Building a shared library

Let's make a small library that does nothing, well almost nothing, like the traditional "hello world" stuff. Here is the first object:

/* print1.c */

#include <stdio.h>

void print1(void) {
        printf("1\n");
}

Let's make one more of these useless functions (more play, less work, all the fun!):

/* print2.c */

#include <stdio.h>

void print2(void) {
        printf("2\n");
}

Now we compile them. These two commands create position-independent object files (print1.o and print2.o) from our tiny source files:

gcc -Wall -g -c -fpic print1.c
gcc -Wall -g -c -fpic print2.c

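And now we do our magic trick and create our shared library. The soname is handed to the linker through -Wl, with commas separating the linker arguments:

gcc -shared -Wl,-soname,libprint.so.1 -g \
    -o libprint.so.1.0.1 print1.o print2.o

Lastly, we create two symbolic links: the linker name (libprint.so), which gcc looks for when it sees -lprint, and the soname (libprint.so.1), which the loader looks for at run time:

ln -sf libprint.so.1.0.1 libprint.so
ln -sf libprint.so.1.0.1 libprint.so.1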

We need to write a header file for our library (so that other people can call the ultimately useful functions that we provided):

/* print.h */
#ifndef PRINT_H
#define PRINT_H
void print1(void);
void print2(void);
#endif

And here goes our test program:

/* print.c */
#include "print.h"

int main() {
	print1();
	print2();
	return 0;
}

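Next, we compile our new program. The library is named after the source file on the command line, because some linker configurations only pull in a library if it satisfies symbols already seen earlier on the command line:

gcc -Wall -g -o print print.c -L. -lprint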

To run the program, we need to tell the loader where to find our library. The following line works in bash. For other shells, you might need to write a script or run the loader directly (see the manual page for ld.so).

LD_LIBRARY_PATH=. ./print
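If setting LD_LIBRARY_PATH is inconvenient, a program can also load a library itself, by path, through the dynamic loading API. The following is only a sketch: call_from_library is a hypothetical helper name, not something the library above provides, and it assumes the function being called takes no arguments and returns nothing, as print1 and print2 do.

```c
#include <dlfcn.h>
#include <stdio.h>

/*
 * Hypothetical helper: load the shared object at `path`, look up a
 * void(void) function called `name`, and call it.
 * Returns 0 on success, -1 if the library or the symbol cannot be found.
 */
int call_from_library(const char *path, const char *name)
{
    void *handle = dlopen(path, RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    /* Store the result through a void ** to avoid a direct cast between
       object and function pointer types (the idiom from "man 3 dlsym"). */
    void (*fn)(void);
    *(void **)&fn = dlsym(handle, name);
    if (fn == NULL) {
        const char *err = dlerror();
        fprintf(stderr, "dlsym: %s\n", err ? err : "symbol is NULL");
        dlclose(handle);
        return -1;
    }

    fn();
    dlclose(handle);
    return 0;
}
```

With the library built as above, call_from_library("./libprint.so.1.0.1", "print1") should call print1. Depending on the version of your C library, you may need to add -ldl when linking.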

Overriding functions by preloading

The following is not the only way of overriding functions, nor will it always work. But it gives an idea of what this is all about.

The most important thing (for now) is the order in which the loader searches the libraries for the symbols it encounters in the executable being loaded. If we tell the loader to search some library before the others, we can intercept calls made to functions in shared libraries. This is especially useful for debugging or instrumenting programs for various purposes.
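To get a feeling for that order, a program can ask the loader which objects it has mapped. The sketch below uses dl_iterate_phdr(), a GNU extension declared in link.h. The helper name list_loaded_objects is made up for this example, and the assumption that the listing order reflects the search order of a freshly started program is mine, so treat the output as a hint rather than a specification.

```c
#define _GNU_SOURCE             /* dl_iterate_phdr() is a GNU extension */
#include <link.h>
#include <stdio.h>

/* Callback invoked by dl_iterate_phdr() once for every loaded object. */
static int print_object(struct dl_phdr_info *info, size_t size, void *data)
{
    int *count = data;
    (void)size;                 /* unused */

    /* The main executable is reported with an empty name. */
    const char *name = (info->dlpi_name && info->dlpi_name[0])
                           ? info->dlpi_name
                           : "(main executable)";
    printf("%2d: %s\n", *count, name);
    (*count)++;
    return 0;                   /* zero means "keep iterating" */
}

/* Hypothetical helper: print the loaded objects, return how many there are. */
int list_loaded_objects(void)
{
    int count = 0;
    dl_iterate_phdr(print_object, &count);
    return count;
}
```

Calling list_loaded_objects() from a program started with and without a preloaded library makes it easy to compare where the extra library ends up in the list.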

Let's add a new function to our library. This time our function has a name that should ring a bell for any C programmer: fopen.

/* fopen.c */

#include <stdio.h>

FILE* fopen(const char* fn, const char* mode)
{
        printf("fopen called\n");
        return NULL;
}

A little word of caution: it is important that the declaration of the overriding function matches that of the overridden one. The loader will not complain if it doesn't, but you will most probably mess up the stack. Including the header that declares the original function in the file that defines the override lets the compiler catch a mismatch. Well, of course you can be a bit careful with the stack signature of the functions and do whatever you want.
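The signature check can also be made explicit. In this sketch the expected signature is written out by hand as a function pointer type, and the C library's fopen is returned through it. If the two types were incompatible, the compiler would issue a diagnostic. The names fopen_fn and checked_fopen are invented for the illustration.

```c
#include <stdio.h>

/* The signature we intend to override, written out by hand. */
typedef FILE *(*fopen_fn)(const char *, const char *);

/*
 * Hypothetical helper: hands back the fopen declared in <stdio.h> through
 * our hand-written type. An incompatible declaration in the header would
 * make the compiler complain about this return statement.
 */
fopen_fn checked_fopen(void)
{
    return fopen;
}
```

The same pattern works for any function you plan to override: write the type once and reuse it for the override, and later for any pointer to the original.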

Now let's compile our new function, and put it into our library:

gcc -Wall -g -c -fpic fopen.c
gcc -shared -Wl,-soname,libprint.so.1 -g \
    -o libprint.so.1.0.2 print1.o print2.o fopen.o
ln -sf libprint.so.1.0.2 libprint.so

We need a little program to test, too:

/* print-2.c */
#include <stdio.h>

int main() {
        FILE* f;

        f = fopen("abc", "r");
        if(f == NULL) {
                printf("1");
        }
        else {
                printf("2");
                fclose(f);
        }

        return 0;
}

Now compile the program as you normally would compile it:

gcc -Wall -g -o print print-2.c

You should be able to guess what happens when you run it (try if you can't ;) ). Then, try the following line:

LD_PRELOAD=./libprint.so ./print

If you did everything correctly, you should see "fopen called" in the output.
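If you are ever unsure which library a call ends up in, the program can check for itself. The sketch below looks a symbol up in the default search order and then asks dladdr() which object contains the address it got back. Both RTLD_DEFAULT and dladdr() are GNU extensions, and provider_of is a hypothetical helper name.

```c
#define _GNU_SOURCE             /* for RTLD_DEFAULT and dladdr() */
#include <dlfcn.h>
#include <stdio.h>

/*
 * Hypothetical helper: return the file name of the object that provides
 * `name` in the default search order, or NULL if it cannot be determined.
 */
const char *provider_of(const char *name)
{
    Dl_info info;
    void *addr = dlsym(RTLD_DEFAULT, name);

    if (addr == NULL || dladdr(addr, &info) == 0)
        return NULL;
    return info.dli_fname;
}
```

Printing provider_of("fopen") from the test program, once with and once without the LD_PRELOAD setting, should show the name of the C library in one run and the preloaded library in the other.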

Wrapping functions

Suppose that you don't want to replace a function, but to wrap a function that lives in some library. You can't do it just by writing an fopen function like we did above and calling fopen(fn, mode) at the end of it. Such a call would turn into a recursive call after loading (if not already at compile time due to optimizations), because the wrapper itself is defined under the symbol fopen.

So what we need is a pointer to the original library function. Here is a three-file version that still does not work, although the reason why is much less obvious. First we write a header that will be included by our own function. Let's use the log function declared in math.h for this example, to avoid the complexities introduced by the FILE* in the declaration of fopen. If you don't know about function pointers in C, there is a cute function pointer tutorial by Lars Haendel.

/* fnren.h */
#ifndef FNREN_H
#define FNREN_H
extern double (* c_log)(double);
#endif

The C file that goes with this header defines the storage for this variable, and attempts to initialize it to the original log function:

/* fnren.c */
#include <math.h>

double (* c_log)(double) = &log;

And the third file defines the wrapper (which does nothing sane):

/* log.c */
#include "fnren.h"

double log(double x) {
        return c_log(x);
}

Now if you compile all these into a shared library (don't forget the -lm option, which tells the linker to record the math library as a dependency), preload it, and run a program that calls the log function, all you will get is infinite recursion ending in a nice stack overflow. Why? Because the loader resolves the reference to log inside our library with the same search order it uses for everything else, and in that order the preloaded library comes before the math library. Therefore c_log actually gets initialized to the log() function defined in the library we have prepared.

The answer to wrapping lies in the dlsym() function of the dynamic loading API, together with library initialization. Here is how to do it in one file for the whole library:

/* liblog.c */
#include <dlfcn.h> /* for dlsym(), RTLD_NEXT */

double (* c_log)(double);

/* The init function for the library */
void __attribute__ ((constructor)) liblog_init(void) {
        c_log = (double (*)(double)) dlsym(RTLD_NEXT, "log");
        /* excluded error check for dlsym(), see "man 3 dlsym" */
}

double log(double x) {
        return c_log(x);
}

Of course you should do something useful in the log function, but this is just an example. You should also check for errors after the dlsym() call.
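The library initialization relies on the constructor attribute, a GCC extension that makes a function run when the object containing it is loaded, before main() starts. This tiny sketch shows the mechanism on its own; the names demo_init and demo_is_ready are invented for the example.

```c
/* Set by the constructor below, so it is already 1 when main() starts. */
static int ready = 0;

/* Runs when the object containing it is loaded (GCC extension). */
__attribute__((constructor))
static void demo_init(void)
{
    ready = 1;
}

/* Hypothetical accessor: reports whether the constructor has run. */
int demo_is_ready(void)
{
    return ready;
}
```

In a wrapper library, this is the natural place to look up the original functions, since it runs before the program has had a chance to call any of them.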

To compile this code, you should define _GNU_SOURCE by passing the -D_GNU_SOURCE option to gcc. Otherwise RTLD_NEXT will not be defined. Depending on the version of your C library, you may also need to link with -ldl.
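Both remarks can also be handled in the source file. The sketch below defines _GNU_SOURCE before any header is included, and wraps the lookup in a helper that follows the error checking protocol from "man 3 dlsym": clear the error state, call dlsym(), then ask dlerror() whether something went wrong. The helper name next_symbol is hypothetical.

```c
#define _GNU_SOURCE             /* must precede every #include; exposes RTLD_NEXT */
#include <dlfcn.h>
#include <stdio.h>

/*
 * Hypothetical helper: find the next definition of `name` in the search
 * order after the object this code lives in. Returns NULL and prints a
 * message to stderr if the lookup fails.
 */
void *next_symbol(const char *name)
{
    dlerror();                              /* forget any earlier error */
    void *sym = dlsym(RTLD_NEXT, name);
    const char *err = dlerror();

    if (err != NULL) {
        fprintf(stderr, "dlsym(RTLD_NEXT, \"%s\"): %s\n", name, err);
        return NULL;
    }
    return sym;
}
```

In the init function of the library, the lookup would then read c_log = (double (*)(double)) next_symbol("log"); followed by whatever reaction to a NULL result suits your library, for example calling abort().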

What you can't wrap

The method above is not flawless, and you will soon figure out that you can't wrap every function. Moreover, you will figure out that even when you wrap the proper C library functions, some other C library functions call the originals of the functions you wrapped without your wrapper noticing. This is because of how the shared libraries make use of local default symbols. The internal calls in the C library are made to __<function name>() versions of the externally provided <function name>() functions. So, for example, the fopen() call has a __fopen() counterpart, and while wrapping the fopen function lets you intercept the invocations made by the executing program, you can't intercept the calls made internally in the C library.

You might ask "So what's the big deal?" thinking that you would wrap the __<function name>() versions as well, and intercept the internal invocations along with normal ones. Unfortunately, this is not possible at load time with the dynamic linker (and if you find out that it is, please inform me). For the details of why this looks impossible, have a look at Ulrich Drepper's paper.

Where to?

Well, the rest is up to you. See the resources listed at the head of this page.


Last updated: Aug 18 2005 1355 GMT+1