"You take a million, billion tonnes of flaming inferno and turn it into 'twinkle, twinkle little star' ..."

Fri, 03 Sep 2010

Building a statically-linked program

I'm currently working on a Fortran program at work: a post-processing tool that takes climate data, in NetCDF format, and outputs in CMOR2 format (a NetCDF variant with climate conventions). So, it links against netcdf and cmor.

Now in HPC and climate in particular, codes are typically linked statically: partially for robustness, but mostly for speed (more on which later). So, I'd like to link this statically, as I have tens of terabytes of data to process. Now, mostly I've been linking using pkg-config:

  gfortran -o nemo-rewriter nemo-rewriter.f90 `pkg-config --libs --cflags netcdf cmor`

pkg-config assembles the list of libraries. In the dynamic case this list is short, because the netcdf and cmor shared libraries record their own dependencies. In the static case every dependency has to appear on the link line, which is more complex. Never mind, it should be possible with:

  gfortran -static -o nemo-rewriter nemo-rewriter.f90 `pkg-config --static --libs --cflags netcdf cmor`

This should work by assembling all the required static libraries via pkg-config dependencies. Unfortunately not every package has a .pc file, and so this fails. As of version 4.1, NetCDF can read from a URL instead of a file, and hence depends on curl to retrieve the data. Curl has no pkg-config .pc file describing its libraries, so the dependency chain breaks there.
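A quick way to see where the chain breaks is to ask pkg-config which modules it knows about. This is only a minimal sketch: the helper name missing_pc is made up for illustration, and the module name libcurl is an assumption about what a curl .pc file would be called.

```shell
# Print every module name for which pkg-config finds no .pc file.
# `pkg-config --exists` is silent and reports only via its exit status.
missing_pc() {
    for mod in "$@"; do
        pkg-config --exists "$mod" 2>/dev/null || echo "$mod"
    done
}

# Example call (module names are assumptions, adjust to your system):
#   missing_pc netcdf cmor libcurl
```

Any module this prints cannot contribute its dependencies to a `pkg-config --static --libs` link line.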

Never mind, let's assemble the static libraries by hand. Debian provides static versions of libraries in the -dev packages. Can I assemble a statically-linked program? For this I need:

  • NetCDF needs libnetcdff.a and libnetcdf.a directly.
  • NetCDF needs HDF5: libhdf5_hl.a and libhdf5.a for version 4 files.
  • CMOR2 needs: libcmor2.a
  • CMOR2 depends on libudunits2.a, to convert between physical units.
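As a rough sketch, the four items above translate into a hand-written link line like the one below. The ordering (each library before the ones it depends on) is my assumption, the function name static_link_flags is hypothetical, and the list is deliberately incomplete: it covers only the libraries named above, not curl or anything curl pulls in.

```shell
# Static libraries from the list above, ordered so that each library
# precedes the ones it depends on (the linker scans archives left to right).
# The ordering is an assumption; curl's dependencies are NOT included.
STATIC_LIBS="cmor2 udunits2 netcdff netcdf hdf5_hl hdf5"

# Turn the library names into -l flags.
static_link_flags() {
    flags=""
    for lib in $STATIC_LIBS; do
        flags="$flags -l$lib"
    done
    echo "${flags# }"
}

# Print (rather than run) the resulting command line.
echo "gfortran -static -o nemo-rewriter nemo-rewriter.f90 $(static_link_flags)"
```

The command is printed rather than executed, since the link is expected to fail until the remaining dependencies are added.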

Now here it gets interesting. To handle secure communications and authentication, curl has some complex dependencies. It has two versions. Pick the gnutls one for example:

I may have missed some out, having stopped because there is no static implementation of Kerberos on Debian. But still, the idea that a simple little Fortran proggie will statically link in four database libraries is silly. It appears that it is no longer possible to simply statically link a program in Debian, and definitely not via pkg-config, because so many dependencies do not yet have configuration files.

Static linking on Linux is a dangerous road anyway, due to NSS and a lot of other under-the-cover dynamically loaded code in Linux, to say nothing of the issues of attempting it with C++. Also, depending on how the static library was built, the performance gain is essentially nothing anyway.

The performance gain of static linking comes mostly from avoiding relocation processing. Each relocation is processed only once, and you can move even that cost to program start-up by setting LD_BIND_NOW in the environment.
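A minimal sketch of the LD_BIND_NOW suggestion: the wrapper below sets the variable for a single command only, so the dynamic linker resolves all symbols at start-up instead of lazily on first call. The wrapper name run_bind_now and the file names in the example are hypothetical.

```shell
# Run a command with LD_BIND_NOW set, leaving the calling shell's
# environment untouched.
run_bind_now() {
    LD_BIND_NOW=1 "$@"
}

# Example call (file names are placeholders):
#   run_bind_now ./nemo-rewriter input.nc output.nc
```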

You can also get a performance gain on x86-32 due to the overhead of PIC code, but that doesn't apply to x86-64.
Adam:

True. From the perspective of HPC (where the few percent performance advantage, if real, matters), the point is that this program will not be accessing those functions. This program, for example, is expected to have a several-week runtime (for a small post-processing tool): it takes in a file and outputs a file, all locally. Is a "partially static" build possible, where the code that will actually be executed is statically linked, and the rest of the calls remain available dynamically?
Perhaps a tool could do this via a 'profile run': determining which functions actually get used, and linking those in statically.

Anonymous:
I'm (when time permits) re-examining the truth (in detail) of static-vs-shared assumptions on modern architectures. For HPC, the cost of processing relocations is irrelevant; the overhead, if any, of PIC is not: indirect accesses, code bloat leading to cache misses, etc.

Err, my whole point is that you don't know that, and you almost certainly cannot know that anymore. In fact, in this specific case, accessing a URL via CURL can cause dlopen() calls, at which point you're dead. If you want your application statically linked, you need to build netcdf without curl support, or build curl without support for anything that could make a dlopen call. This sort of problem is routine now.

And there's still overhead due to PIC on x86_64 (an additional dereference through the GOT), which is smaller than the overhead on x86, but still present.

Static libraries are not usable. And because they are effectively dead, they have not gained the new functionality that dynamic libraries have acquired over time, which makes them even less usable. There is no way around that.

And by the way, getting a static list of library dependencies for a static library does not make much sense, as a well written static library should allow you to do without most of its dependencies if you only use functionality that does not need those dependencies.
(But as that needs proper interface design, so that the code can decide at link time whether it needs some functionality, and because no one uses static libraries anyway, that will hardly be done.)
If speed is your concern, also consider only statically linking some libraries. Having code that is hardly ever used linked in statically does not give you any speed advantages (and as more and more things dlopen stuff, not really robustness either). Nothing stops you from linking some libraries statically and other libraries (including dependencies of your static libraries) dynamically.
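A sketch of such a mixed link with GNU ld, assuming gfortran passes -Wl options through to the linker: -Wl,-Bstatic makes the following -l options pick static archives, and -Wl,-Bdynamic switches back to shared libraries. The function name mixed_link_flags and the choice of which libraries go on which side are assumptions for illustration.

```shell
# Build the library part of a link line.
#   $1: libraries to link statically
#   $2: libraries to leave dynamically linked
mixed_link_flags() {
    printf '%s' '-Wl,-Bstatic'
    for lib in $1; do printf ' -l%s' "$lib"; done
    # Switch back, so that later libraries (and the system libraries the
    # compiler appends) are linked dynamically.
    printf '%s' ' -Wl,-Bdynamic'
    for lib in $2; do printf ' -l%s' "$lib"; done
    printf '\n'
}

# Print (rather than run) an example command line.
echo "gfortran -o nemo-rewriter nemo-rewriter.f90 $(mixed_link_flags "netcdff netcdf hdf5_hl hdf5" "curl")"
```

Note that no -static flag is given here: the program as a whole stays dynamically linked, and only the named archives are pulled in statically.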
Adam, I think I'm not making myself clear: the point of the post is that static linking is effectively broken, even on "simple" cases like the program described. I agree with everything you said.

I'm trying to build an optimized version of a simple program, avoiding the PIC overhead. The 'normal' way of doing this is to statically link it. I'm just showing that this is no longer possible (with netcdf 4.1).

I agree with what you say about building netcdf without curl support, and have in fact done this. At work I've built two versions of netcdf on our supercomputers: the dynamically-linked full-featured version, and the statically-linked curl-free version. But recommending to scientists that they build their own versions of 5 libraries (netcdf, hdf5, udunits, uuid, cmor) for their application isn't really a runner. (See here for example). We need to do better than this.

I agree with what you say about dynamic linking under the hood, and am not particularly worried about building properly statically linked codes (only a subset of programs like /sbin/ldconfig really care in that way). I would be happy if I could 'just' build the program so that all the code that mattered was PIC-free.

I'm optimising the program for a particular use-case (e.g. files locally present, curl not used), which is typical in HPC. What I'm working towards is a build-tool (using pkg-config underneath) that profiles an existing (dynamically linked) program, examines which symbols it uses during a run, and rebuilds an optimized version from PIC-free static libraries for those objects, and dynamically-linked libraries for the rest.
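A rough sketch of the profiling step only, not of the build-tool itself. It assumes glibc's dynamic linker, whose LD_DEBUG=bindings trace logs a line for each symbol it binds; the line format parsed here is my understanding of that output, and the function name used_symbols is hypothetical.

```shell
# Read an LD_DEBUG=bindings trace on stdin and print unique
# "library symbol" pairs. Assumed line format:
#   PID: binding file PROG [0] to LIB [0]: normal symbol `NAME' [VERSION]
used_symbols() {
    awk '/binding file/ {
        lib = ""; sym = ""
        for (i = 1; i <= NF; i++) {
            if ($i == "to") lib = $(i + 1)
            if ($i == "symbol") sym = $(i + 1)
        }
        # Strip the quote characters surrounding the symbol name.
        sym = substr(sym, 2, length(sym) - 2)
        if (lib != "" && sym != "") print lib, sym
    }' | sort -u
}

# Example (file names are placeholders; LD_DEBUG_OUTPUT appends the PID):
#   LD_DEBUG=bindings LD_DEBUG_OUTPUT=/tmp/bind ./nemo-rewriter in.nc out.nc
#   cat /tmp/bind.* | used_symbols
```

With lazy binding, only functions that are actually called during the run should show up, which is the list such a tool would need to decide what to link statically.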
The overhead of indirect lookups through the GOT goes away after the first such access, which snaps the indirection.  And in any case, the GOT quickly ends up in cache.
Kerberos upstream (like glibc) has effectively dropped support for
static linking.  If you want to statically link in curl, you'll need
to recompile it (curl) without krb.

See http://bugs.darcs.net/issue806 and http://bugs.debian.org/495163
