20 January 2016

lgfang: From http://webpages.charter.net/ppluzhnikov/linker.html, by Paul Pluzhnikov

lgfang: For two advantages of linking with a static library over linking object files directly, refer to Section 7.6.2 of Computer Systems: A Programmer's Perspective, 2nd Edition.

How UNIX linkers work with archive libraries, and why the order of objects and libraries on the link line matters.

I originally wrote this article in 2001 in gnu.gcc.help.

Here is a link to it.

Since then, I have had to refer many people to it, so I decided that a local copy (with a shorter URL) was in order.

Here it is [slightly edited for clarity on 2006-01-10]:

"Robert Kraus" <robkra@gmx.net> wrote in message F%HA7.11606$PV1.303467@news.chello.at

I often have linking problems, when I build a project with multiple static libraries.

You are not alone: I see many people who do

gcc -lfoo main.o

Sometimes these problems can get resolved by changing the order of the linked libraries.

What causes these errors? Why can it be resolved by rearranging - does this mean, that all includes, definitions, declarations etc. are correct ?

I find the following analogy most useful when explaining how the linker works with archive libraries:

  • think of an archive library as a bookshelf, with some books on it (the separate .o files).
  • some books may refer you to other books (via unresolved symbols), which may be on the same, or on a different bookshelf.

A librarian comes to the first bookshelf (the first archive library on your link line) with two lists – a list of things she still needs (unresolved symbols), and a list of things she already has (symbols already defined).

Usually, she will already have crt0.o (added to the link line by the compiler driver), which refers to main. So her "need" list contains main. Often she will have main.o as well, which means she no longer needs main, but probably needs malloc, free, printf, etc.

Now she starts to scan one bookshelf at a time looking for books that define symbols she needs. If a book defines one of the symbols on the "need" list, she takes that book from the shelf, scratches off all symbols defined in that book from the "need" list, adds all defined symbols to the "have" list, and adds all symbols used by that book and not already in the "have" list to the "need" list.

If a book does not define any symbols currently in the "need" list, she does not take it (even though this book may come in handy later).

If a book she just took defines a symbol she already has, there is a "multiply-defined symbol" error.

If she took any books from the current shelf, she re-scans the shelf, looking for more books to take (because the books she just took may need other books on the current shelf).

Once there are no more books she needs on the current shelf, she is done with that shelf, and she does not return to it (unless the library corresponding to it is listed on the command line several times).

If, after searching all the bookshelves listed on the command line (as well as libc, which is added by the compiler driver), she still has entries in the "need" list, there is an "undefined symbol" error.
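The librarian's procedure can be sketched as a toy model in Python. This is only an illustration of the analogy, not how any real linker is implemented: the names Obj, obj, take and link, and the sample files foo.o, helper.o and unused.o, are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Obj:
    """A toy object file (a "book"): symbols it defines and symbols it uses."""
    name: str
    defines: frozenset
    uses: frozenset


def obj(name, defines=(), uses=()):
    """Convenience constructor (hypothetical helper for this sketch)."""
    return Obj(name, frozenset(defines), frozenset(uses))


def take(o, need, have, taken):
    """Take one book and update the "need" and "have" lists in place."""
    clash = o.defines & have
    if clash:
        raise ValueError(f"multiply-defined symbol(s): {sorted(clash)}")
    have.update(o.defines)             # everything it defines is now "had"
    need.difference_update(o.defines)  # ...and no longer needed
    need.update(o.uses - have)         # what it uses and we lack is needed
    taken.append(o.name)


def link(line):
    """line is a list of Obj (always taken) and lists of Obj (archives)."""
    need, have, taken = set(), set(), []
    for item in line:
        if isinstance(item, Obj):
            take(item, need, have, taken)
            continue
        progress = True
        while progress:  # re-scan the shelf while books keep being taken
            progress = False
            for member in item:
                if member.defines & need:
                    take(member, need, have, taken)
                    progress = True
    if need:
        raise ValueError(f"undefined symbol(s): {sorted(need)}")
    return taken


crt0 = obj("crt0.o", defines={"_start"}, uses={"main"})
main_o = obj("main.o", defines={"main"}, uses={"foo"})
libfoo = [obj("helper.o", defines={"helper"}),
          obj("foo.o", defines={"foo"}, uses={"helper"}),
          obj("unused.o", defines={"unused"})]

print(link([crt0, main_o, libfoo]))  # models: gcc main.o -lfoo
try:
    link([crt0, libfoo, main_o])     # models: gcc -lfoo main.o
except ValueError as err:
    print(err)                       # undefined symbol(s): ['foo']
```

In the first call helper.o is picked up only on the re-scan of the shelf, and unused.o is never taken. In the second call nothing on the shelf is needed yet when the librarian visits it, so foo is still on the "need" list at the end.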

Clearly, if many of your biology books refer to chemistry books, and you've got a biology paper to write, you had better start searching the biology bookshelf before the chemistry one.

Sometimes, two archive libraries are inter-dependent: objects from one depend on objects from the other, and vice versa.

This means that there is no correct order for linking such libraries and that both libraries are poorly structured. To link an executable successfully, you may need to list both libraries several times, as in:

gcc main.o -lfoo -lbar -lfoo -lbar -lfoo

Jane Anderson (j.anderson@geac.com) adds that on systems using GNU ld one can use the --start-group and --end-group command-line options (spelled -( and -) in the example below) to work around such inter-library dependencies:

gcc main.o '-Wl,-(' -lfoo -lbar '-Wl,-)'
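To see why repeating libraries or grouping them helps, here is a small Python sketch of the same bookshelf model with a "group" added. It is an assumption-laden toy, not GNU ld's implementation: the tuple layout, the scan and link functions, and the files a.o, b.o and c.o are invented for the example, and the multiply-defined check is left out for brevity. The group is modelled as "keep cycling through its archives until a full round takes nothing".

```python
def scan(archive, need, have, taken):
    """Scan one archive until a pass takes nothing.

    Members are (name, defines, uses) tuples. Returns True if anything
    was taken from this archive.
    """
    took_any = False
    progress = True
    while progress:
        progress = False
        for name, defines, uses in archive:
            if defines & need:
                have.update(defines)
                need.difference_update(defines)
                need.update(uses - have)
                taken.append(name)
                took_any = progress = True
    return took_any


def link(line):
    """line items: ("obj", member), ("lib", archive), ("group", [archives])."""
    need, have, taken = set(), set(), []
    for kind, item in line:
        if kind == "obj":
            name, defines, uses = item
            have.update(defines)
            need.difference_update(defines)
            need.update(uses - have)
            taken.append(name)
        elif kind == "lib":
            scan(item, need, have, taken)
        else:
            # Group: the list (not a generator) makes sure every archive
            # is scanned in each round before deciding whether to go again.
            while any([scan(archive, need, have, taken) for archive in item]):
                pass
    return taken, sorted(need)  # objects linked, symbols still undefined


main_o = ("main.o", {"main"}, {"a"})
libfoo = [("a.o", {"a"}, {"b"}), ("c.o", {"c"}, set())]  # a.o needs libbar
libbar = [("b.o", {"b"}, {"c"})]                         # b.o needs libfoo

once = link([("obj", main_o), ("lib", libfoo), ("lib", libbar)])
repeated = link([("obj", main_o), ("lib", libfoo), ("lib", libbar),
                 ("lib", libfoo)])
grouped = link([("obj", main_o), ("group", [libfoo, libbar])])
print(once)      # (['main.o', 'a.o', 'b.o'], ['c'])
print(repeated)  # (['main.o', 'a.o', 'b.o', 'c.o'], [])
print(grouped)   # (['main.o', 'a.o', 'b.o', 'c.o'], [])
```

Listing libfoo once leaves c undefined, because by the time b.o asks for it the librarian has already left that shelf. Listing libfoo a second time, or putting both shelves in a group, gives her another chance to pick up c.o.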

Finally, there is no reason a linker could not pass over all the objects and libraries twice: first to build the two lists, then to actually pull all the needed .o files from their libraries.

This is certainly user-friendly (you never get the types of errors that could be solved by rearranging library order).
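One way to picture such a linker is the following Python sketch. It is my own reading of the idea, not a description of any particular linker: the function two_pass_link, the tuple layout and the sample files are hypothetical, and the sketch simply lets the first definition of a symbol win instead of reporting duplicates.

```python
def two_pass_link(objects, archives):
    """objects are always linked; archives are lists of members.

    Every object or member is a (name, defines, uses) tuple.
    """
    # Pass 1: index every archive member by each symbol it defines.
    index = {}
    for archive in archives:
        for member in archive:
            for symbol in member[1]:
                index.setdefault(symbol, member)  # first definition wins

    # Pass 2: start from the explicit objects and pull in what they need,
    # wherever in the index it happens to live.
    taken, have, used = [], set(), set()
    for _, defines, _ in objects:
        have |= defines
    pending = list(objects)
    while pending:
        name, _, uses = pending.pop(0)
        taken.append(name)
        used |= uses
        for symbol in sorted(uses):
            if symbol not in have and symbol in index:
                member = index[symbol]
                have |= member[1]  # mark now so it is queued only once
                pending.append(member)
    return taken, sorted(used - have)  # objects linked, undefined symbols


main_o = ("main.o", {"main"}, {"a"})
libfoo = [("a.o", {"a"}, {"b"}), ("c.o", {"c"}, set())]
libbar = [("b.o", {"b"}, {"c"})]

print(two_pass_link([main_o], [libfoo, libbar]))
print(two_pass_link([main_o], [libbar, libfoo]))  # same result, other order
```

Both calls link main.o, a.o, b.o and c.o with nothing left undefined, even though libfoo and libbar depend on each other. The price in this sketch is the index, which has to hold every symbol of every archive before the first object is pulled.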

I believe 2-pass linkers were common some 10-20 years ago. I believe the reasons they are not used much anymore are:

  • they consume twice the amount of I/O and significant additional amounts of memory, and
  • they force users with well-structured libraries (who can and do arrange them properly on the link line) to pay this price so that users with poorly-structured libraries can still link successfully.

The linker on AIX is the only UNIX linker I know that uses a 2-pass approach.

There is a good Linkers and Loaders book by John R. Levine. An early draft of it is available online.


