Shared libraries diamond problem
If you split code up into different libraries you can get a diamond dependency problem.
That is, two parts of your code depend on different, incompatible versions of the same library.
Normally you shouldn't get into this situation. Only someone who hates their users makes a non-backwards-compatible change to a library ABI. You don't hate your users, do you?
(just kidding about hating your users.)
Disclaimer
I thought I'd dive into this problem as a weekend project.
Don't rely on this article as a source of truth, but please correct me
where I'm wrong. I'm not an expert in creating shared libraries, and it's much harder
than it would first appear. The existence of libtool proves that.
Example project described can be found here.
Multiple versions of the same library
The lovely land of modern Unix will allow you to have multiple versions of the same library installed at the same time. That's not the problem. The problem is that when the libraries are loaded, all their symbols are resolved in the same namespace. Two versions of a function with the same name can't both be in use, even if they live in different libraries (well, you can load both, but it won't work the way you want it to). Different parts of the code can't choose which one they see (see the breadth first search description here for details).
The scenario
In this scenario there are three separate libraries:
- libd1: Calls functions in libd2 and libbase version 1. File name is simply libd1.so.
- libd2: Calls functions in libd1 and libbase version 2. File name is simply libd2.so. (Note the cross dependency between libd1 and libd2.)
- libbase: This is the library that made non-backwards-compatible changes between version 1 and 2. Files are libbase.so.1.0 and libbase.so.2.0, with the traditional symlinks from libbase.so.n. The important bit here is that the two versions are to be considered incompatible and must both be usable in the same process at the same time.
There is also an executable, p, which links to libd1 and libd2.
What we want to achieve is to have libd1 and libd2 be able to call
functions in libbase, but to have them
call different versions. Both libbase.so.1 and libbase.so.2
should be loaded and active (have their functions callable).
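For orientation, version 1 of libbase might look roughly like the sketch below. This is a guess reconstructed from the program output shown later in this article, not code taken from the example project: BASE_VERSION, base_format, base_init and base_fini are names made up here, and only base_print appears in the article. Version 2 would presumably be the same file with the version number changed, which is exactly why the two clash: both export base_print.

```c
/* base1.c: hypothetical stand-in for version 1 of libbase.
 * Reconstructed from the output of ./p, not copied from the example project. */
#include <stdio.h>

#define BASE_VERSION 1  /* assumption: version 2 differs only in this number */

/* Hypothetical helper: format the output line into buf so the behaviour
 * can be checked without capturing stdout. */
int base_format(char *buf, size_t size, const char *msg)
{
    return snprintf(buf, size, "base version %d> %s\n", BASE_VERSION, msg);
}

/* The function that both library versions export under the same name. */
void base_print(const char *msg)
{
    char line[128];
    base_format(line, sizeof line, msg);
    fputs(line, stdout);
}

/* GCC/Clang attributes: run when the library is loaded and unloaded. */
__attribute__((constructor)) static void base_init(void) { base_print("init"); }
__attribute__((destructor))  static void base_fini(void) { base_print("fini"); }
```

The constructor and destructor are only there to make it visible that a given version was actually loaded.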
What normally happens
When you link an executable with -lfoo the linker finds libfoo.so and verifies that it (along with the default libraries) contains all the symbols that are currently unresolved. Linking shared objects (DSOs, .so files) is similar, except there's no requirement that all symbols are resolved. If an unresolved symbol exists in more than one library then only one definition is actually used.
You therefore can't link two overlapping ABIs, even indirectly via an intermediate dependency. Well, you can, but are you comfortable with one of them being "hidden"? (Which one is not important for this article.)
Both versions of the library can therefore be loaded, but if they overlap then one will hide the other (in the overlap).
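The hiding can be pictured with a toy model. This is only a sketch of the "first definition found wins" idea, assuming a flat list of libraries in load order; it is not how ld.so is implemented, and the toy_library type and toy_lookup function are invented for illustration.

```c
#include <stddef.h>
#include <string.h>

/* One loaded library in the toy model: its soname and the names it exports. */
struct toy_library {
    const char *soname;
    const char *const *symbols;
    size_t symbol_count;
};

/* Walk the libraries in load order and return the first one that defines
 * `name`, or NULL if none does. Note that the caller's identity is not an
 * input: every caller gets the same answer for the same name. */
const struct toy_library *toy_lookup(const struct toy_library *libs,
                                     size_t lib_count, const char *name)
{
    for (size_t i = 0; i < lib_count; i++) {
        for (size_t j = 0; j < libs[i].symbol_count; j++) {
            if (strcmp(libs[i].symbols[j], name) == 0) {
                return &libs[i];
            }
        }
    }
    return NULL;
}
```

If libbase.so.1 and libbase.so.2 both export base_print, a lookup of that name always lands in whichever library comes first in the list, no matter which library asked.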
How to solve it
The hiding of one of the symbols in a name clash is done by the dynamic linker (ld.so) at load time (run time). Since symbols are referenced by name, the only way to differentiate the two is to force them to have different names.
This can be done manually or automatically. Both approaches append some text to every symbol you want to differentiate.
Automatic (with --default-symver)
$ gcc -fPIC -Wall -pedantic -c -o base1.o base1.c
$ gcc -shared \
    -Wl,--default-symver \
    -Wl,-soname,libbase.so.1 \
    -o libbase.so.1.0 base1.o
$ nm libbase.so.1.0
[...]
00000000000006f0 T base_print    <--- same name as without the special args
[...]
0000000000000000 A libbase.so.1  <--- this is new with --default-symver
[...]
A small change. Certainly doesn't *appear* to refer to a new name. You can also inspect with objdump -T, but the magic happens when you link something to it.
Let's compile and link one of the libraries that uses libbase.so:
$ ldconfig -N -f ld.so.conf
$ ln -fs libbase.so.1 libbase.so    # library to link with
$ gcc -fPIC -Wall -pedantic -c -o d1.o d1.c
$ gcc -Wl,-rpath=. -L. \
    -Wl,--default-symver \
    -Wl,-soname,libd1.so \
    -shared -o libd1.so d1.o -lbase
$ nm d1.o
[...]
                 U base_print
[...]
                 U d2_print
[...]
$ nm libd1.so
[...]
                 U base_print@@libbase.so.1
[...]
                 U d2_print
[...]
$ ldd libd1.so
[...]
        libbase.so.1 => ./libbase.so.1
[...]
Interesting. The unresolved symbol base_print was changed in the linking step, but d2_print was not. This is because the version info was put into libbase.so (symlinked to libbase.so.1.0). libd2.so didn't yet exist, so it can't have version info attached. If you want version info for libd1 and libd2 referencing each other then you'll first have to create versions of the libraries that don't depend on each other but do have version info.
It's probably possible to bootstrap this manually, but I haven't looked into it.
Also note that libd1.so knows that it depends on libbase.so.1 (not plain libbase.so). This is the name given with the -soname option when linking libbase.so. So there's even more magic going on than I let on. And it's a good reason for having that option.
libd2.so is compiled similarly, except against version 2 of libbase.so.
The program p is then linked to both libd1.so and libd2.so, which in
turn pulls in both versions of libbase.so:
$ cc -Wl,-rpath=. -o p p.o -L. -Wl,-rpath=. -ld1 -ld2
$ ldd p
linux-vdso.so.1 => (0x00007ffff41f8000)
libd1.so => ./libd1.so (0x00007fe08b788000)
libd2.so => ./libd2.so (0x00007fe08b586000)
libc.so.6 => /lib/libc.so.6 (0x00007fe08b1e5000)
libbase.so.1 => ./libbase.so.1 (0x00007fe08afe3000)
libbase.so.2 => ./libbase.so.2 (0x00007fe08ade1000)
/lib64/ld-linux-x86-64.so.2 (0x00007fe08b98c000)
$ nm p | grep libd
U d1@@libd1.so
U d2@@libd2.so
$ ./p
base version 2> init
base version 1> init
d1> init
d2> init
d1()
d2_print()
base version 2> d1 <--- libd1 -> libd2 -> libbase version 2
d2()
d1_print()
base version 1> d2 <--- libd2 -> libd1 -> libbase version 1
d2> fini
d1> fini
base version 1> fini
base version 2> fini
Note that both base versions are loaded, and that d1() calls d2_print() and vice versa.
Diamond problem solved. Yay!
Manual with --version-script
Instead of --default-symver one can use --version-script=base1.map
and create the map file by hand:
BASE1 {
    global: base_print;
    local: *;
};
Manual from within the code
Outside any function, add:
asm(".symver base_print_foo,base_print@@BASE1");
to create base_print@@BASE1 from base_print_foo. This may still need a map file,
but the directive will override it.
Notes
- "@@" in a name means "this version and the default". A single "@" means just "this version".
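To make the notation concrete, here is a small sketch that splits a name as printed by nm into its parts. The struct, the function name and the 64-byte buffers are arbitrary choices made for this example, not part of any linker API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct versioned_name {
    char name[64];
    char version[64];  /* empty string when the symbol carries no version */
    bool is_default;   /* true for "@@", false for "@" or no version */
};

/* Split "name@@VER" or "name@VER" into its parts.
 * Returns false if a part does not fit in the fixed-size buffers. */
bool split_versioned_name(const char *symbol, struct versioned_name *out)
{
    const char *at = strchr(symbol, '@');
    size_t name_len = at ? (size_t)(at - symbol) : strlen(symbol);

    if (name_len >= sizeof out->name) {
        return false;
    }
    memcpy(out->name, symbol, name_len);
    out->name[name_len] = '\0';
    out->version[0] = '\0';
    out->is_default = false;

    if (at == NULL) {
        return true;  /* plain, unversioned name */
    }
    out->is_default = (at[1] == '@');
    const char *version = at + (out->is_default ? 2 : 1);
    if (strlen(version) >= sizeof out->version) {
        return false;
    }
    strcpy(out->version, version);
    return true;
}
```

For base_print@@BASE1 this gives the name base_print, the version BASE1 and the default flag set; for base_print@BASE1 the flag is clear.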
Summary
… yet Unix still has much less DLL hell than Windows.
I aimed to provide an accessible view into shared libraries by solving a specific problem. For more in-depth information from people who know more about it than me see the links below. I've barely scratched the surface.
The compile time linker (ld) and the dynamic linker (ld.so) do
a lot of magic. More than you'd expect if you haven't thought about it before.
Links
- Example project
- How To Write Shared Libraries, by Ulrich Drepper
- Program Library HOWTO - 3. Shared Libraries
- ld.so GNU linker/loader
- Gold readiness obstacle #1: Berkeley DB. Also see other linker related posts on that blog.
- It's not all gold that shines — Why underlinking is a bad thing. Nice illustrations of the problem.
- Autotools Mythbuster - Chapter 3. Building All Kinds of Libraries — libtool
- Why there are shared object files and a bunch of symlinks
- Binutils ld manual - 3.9 VERSION Command