"You take a million, billion tonnes of flaming inferno and turn it into 'twinkle, twinkle little star' ..."

Wed, 22 Apr 2009

pkg-config, -as-needed and multiple compilers

At work I've been promoting the use of pkg-config and modules to avoid hard-coded paths in an environment where we have multiple compilers. In summary: "module load intel-cc" loads the Intel compiler, "module load netcdf-intel" loads the matching netCDF build (which, among other things, appends /ichec/packages/netcdf/4.0-intel/pkgconfig to $PKG_CONFIG_PATH), and then:

NCDF_INCS := `pkg-config netcdf --cflags`
NCDF_LIBS := `pkg-config netcdf --libs`
in the application code. Replace "intel-cc" and "netcdf-intel" with "gcc" and "netcdf-gcc", and it builds with gcc.
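
The lookup that makes this work can be sketched in plain shell. This is only an illustration: find_pc is a hypothetical helper name, and the real pkg-config also searches its built-in default directories, which the sketch leaves out. It walks a colon-separated search path (by default $PKG_CONFIG_PATH) and prints the first NAME.pc it finds, which is why loading a different module changes which netcdf.pc gets used.

```shell
#!/bin/sh
# Sketch of the PKG_CONFIG_PATH search order (find_pc is a hypothetical helper).
# Usage: find_pc NAME [SEARCH_PATH]
# SEARCH_PATH defaults to $PKG_CONFIG_PATH; pkg-config's built-in default
# directories are deliberately not modelled here.
find_pc() {
    name=$1
    old_ifs=$IFS
    IFS=:
    # Unquoted on purpose: split the search path on ':'.
    for dir in ${2:-$PKG_CONFIG_PATH}; do
        [ -n "$dir" ] || continue          # ignore empty path elements
        if [ -f "$dir/$name.pc" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$name.pc"  # first match wins
            return 0
        fi
    done
    IFS=$old_ifs
    return 1                               # no NAME.pc on the search path
}
```

After "module load netcdf-intel", running "find_pc netcdf" would be expected to print the path of the Intel netcdf.pc, assuming no earlier directory on the path also contains one.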

This would work better if upstream supplied .pc files, which means the next stage in world domination is to send patches to do just that. But it's not that easy, apparently.

netcdf (for example) supplies two libraries, libnetcdf.so and libnetcdff.so, with the second including code only needed for Fortran. So, for gcc I have the following netcdf.pc file:

prefix=/ichec/packages/netcdf/4.0-gcc
exec_prefix=${prefix}
libdir=${prefix}/lib
includedir=${prefix}/include

Name: netcdf
Description: netCDF libraries, include files and development tools (gcc version)
Version: 4.0
Libs: -L${libdir} -Wl,--as-needed -lnetcdf   -lnetcdff 
Cflags: -I${includedir}
While for Intel I have:
prefix=/ichec/packages/netcdf/4.0-intel
# Add fortran libs 
forlibs=/ichec/packages/intel/fce/11.0.081/lib/intel64
exec_prefix=${prefix}
libdir=${prefix}/lib
includedir=${prefix}/include

Name: netcdf
Description: netCDF libraries, include files and development tools
Version: 4.0
Libs: -L${libdir} -lnetcdf -L${forlibs}  -lnetcdff -lifcore
Cflags: -I${includedir}

The main problem is that --as-needed is not understood by linkers other than GNU ld, and must be conditionally removed somehow. Any ideas? (There is a second wrinkle: Intel Fortran needs additional libraries here, but I'm sure I can remove that by adding more .pc files.)
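
One possible approach to the conditional removal, offered as a sketch rather than a tested solution, is to filter the pkg-config output in the build script. The helper names filter_as_needed, linker_is_gnu and portable_libs are hypothetical, and the check for "GNU" in the linker's version banner is a heuristic assumption. Note also that the linker a compiler driver actually invokes may not be the "ld" found on $PATH.

```shell
#!/bin/sh
# Sketch: drop -Wl,--as-needed when the linker does not look GNU-compatible.
# All three function names are hypothetical helpers, not part of pkg-config.

# Print the flags given as arguments with any -Wl,--as-needed removed.
filter_as_needed() {
    out=""
    for flag in "$@"; do
        case "$flag" in
            -Wl,--as-needed) ;;             # skip the GNU-specific option
            *) out="${out:+$out }$flag" ;;  # keep everything else, in order
        esac
    done
    printf '%s\n' "$out"
}

# Succeed if the linker's version banner mentions GNU.
# Assumption: this heuristic is good enough to identify linkers that
# accept --as-needed; it is not a guarantee.
linker_is_gnu() {
    "${1:-ld}" --version 2>/dev/null | grep -q GNU
}

# Print the link flags for a pkg-config module, filtered when needed.
portable_libs() {
    libs=$(pkg-config "$1" --libs) || return 1
    if linker_is_gnu "${LD:-ld}"; then
        printf '%s\n' "$libs"
    else
        # Word splitting of $libs is intentional here.
        filter_as_needed $libs
    fi
}
```

A Makefile could then call "portable_libs netcdf" in place of the bare pkg-config invocation, so the same .pc file serves both compilers.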

A second issue is that --as-needed can break otherwise working pkg-config usage. Thanks to galtgendo at PHP Bugs for this example:

alastair@ailm:~$ cc -o testx  -Wl,--as-needed `pkg-config glib-2.0 --cflags` test.c `pkg-config glib-2.0 --libs`
works, but:
alastair@ailm:~$ cc -o testx  -Wl,--as-needed `pkg-config glib-2.0 --cflags`  `pkg-config glib-2.0 --libs` test.c
/tmp/ccsUEwtk.o: In function `main':
test.c:(.text+0x20): undefined reference to `g_print'
collect2: ld returned 1 exit status
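The problem is that the linker processes its arguments from left to right, and with --as-needed it drops any library that does not satisfy an undefined symbol from the objects it has already seen. In the second command the glib libraries come before test.c, so they are discarded before the linker sees the reference to g_print in the test object.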
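
The safe ordering can be captured in a small helper that always puts the libraries after the sources or objects. This is a minimal sketch: build_link_line is a hypothetical name, it only prints the command instead of running it, and the literal flags in the example stand in for real pkg-config output.

```shell
#!/bin/sh
# Sketch: assemble a link line with the libraries last.
# build_link_line is a hypothetical helper; it prints the command only.
# Usage: build_link_line OUTPUT CFLAGS LIBS FILE...
build_link_line() {
    output=$1
    cflags=$2
    libs=$3
    shift 3
    # "$*" now holds the remaining arguments: the source or object files.
    printf 'cc -o %s -Wl,--as-needed %s %s %s\n' "$output" "$cflags" "$*" "$libs"
}

# Example with literal flags standing in for pkg-config output:
build_link_line testx "-I/usr/include/glib-2.0" "-lglib-2.0" test.c
# prints: cc -o testx -Wl,--as-needed -I/usr/include/glib-2.0 test.c -lglib-2.0
```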

Sat, 28 Mar 2009

Pkg-config and Modules

At ICHEC one of our tasks is to help users build their software on our supercomputers. Mostly we have to change paths, etc. in code bases that are used by a handful of scientists, and we don't want to exacerbate the hard-coded path problem. We also have multiple versions of certain libraries and compilers installed, which we select using modules. You select your software environment with e.g.

$ module load intel-fc/11.0.082  netcdf/4.0-intel
This selects the compiler and library versions, and places the appropriate paths in $PATH and $LD_LIBRARY_PATH. While this is useful for selecting, e.g., compilers, it's often not much help for build processes that expect hard-coded library paths. I also dislike adding random environment variables: it loses the Single Point Of Truth that describes the configuration. So I've come up with the following pattern.

First, record the modules used (e.g. library version) in config.log if configure is used. To do this, add the following to the autoconf file or system headers:

# AC_MODULES_LIST
# ---------------
# Check if the system runs 'modules', and if so, record the modules environment in the config.log
AC_DEFUN([AC_MODULES_LIST],
[AC_CACHE_CHECK([modules output if present], [ac_cv_modules_output],
[ac_cv_modules_output=`module list 2>&1`
# If 'module' is missing or fails, record that instead of its error text.
if test $? -ne 0; then
  ac_cv_modules_output="not available"
fi
])
])

Secondly, for all packages, ensure that a pkg-config file exists. For example, we have multiple MPI implementations on our SGI system. For MVAPICH, we have a modules file:

stokes2:~$ module show mvapich2-intel
-------------------------------------------------------------------
/ichec/modulefiles/mvapich2-intel/1.2p1:

module-whatis    MVAPICH2 1.2p1 (Intel 11.0.074) 
conflict         mvapich2-gnu mvapich2-intel 
prepend-path     PATH /ichec/packages/mvapich/1.2p1-intel/bin 
prepend-path     LD_LIBRARY_PATH /ichec/packages/mvapich/1.2p1-intel/lib 
prepend-path     MANPATH /ichec/packages/mvapich/1.2p1-intel/share/man 
prepend-path     INCLUDE /ichec/packages/mvapich/1.2p1-intel/include 
prepend-path     CPATH /ichec/packages/mvapich/1.2p1-intel/include 
prepend-path     FPATH /ichec/packages/mvapich/1.2p1-intel/include 
prepend-path     PKG_CONFIG_PATH /ichec/packages/mvapich/1.2p1-intel/share/pkgconfig 
-------------------------------------------------------------------
We add a file mpi.pc to the PKG_CONFIG_PATH directory of each MPI implementation. For mvapich2-intel, mpi.pc looks like:
# MPI pkgconfig file for mvapich2-intel
# Normally only used to get variables $prefix, etc.
prefix=/ichec/packages/mvapich/1.2p1-intel
exec_prefix=${prefix}
libdir=${prefix}/lib
includedir=${prefix}/include

Name: mpi
Description: MVAPICH2 MPI libraries, include files and development tools
Version: 1.2p1
Libs: -L${libdir}  -lmpichf90 -lmpich -lpthread -lrdmacm -libverbs -libumad -lrt
Cflags: -I${includedir}
This then means we can add the following small patch to our target software (in this case the climate code COSMOS):
 #########################################################################
 #
 #  MPI message passing root directory of the chosen compiler
 
+  # Get default MPI ROOT from the pkgconfig file if possible
+  pkg-config mpi --variable=prefix 2> /dev/null > /dev/null
+  if [ $? -eq 0 ]; then
+    MPIROOT=`pkg-config mpi --variable=prefix`
+  fi
+
   if   [ $compiler = ifort ]; then
     export cc=cc
     if [ $CHAN = MPI2 ]; then
-       export MPIROOT=/sw/sarge-ia32/mpich2-1.0.4p1-pgi
+       export MPIROOT=${MPIROOT:-/sw/sarge-ia32/mpich2-1.0.4p1-pgi}
     elif [ $CHAN = MPI1 ]; then

Such a patch is generic enough to be useful outside our institution, and means the code can build out-of-the-box on many systems. Now to add mpi.pc to Debian ...
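
The same "ask pkg-config, else fall back" pattern can be written as a reusable function. This is a sketch under stated assumptions: pc_prefix_or is a hypothetical helper name, and the only pkg-config feature it relies on is the --variable=prefix query already used in the patch.

```shell
#!/bin/sh
# Sketch: take a package prefix from pkg-config, or fall back to a default.
# pc_prefix_or is a hypothetical helper name.
# Usage: pc_prefix_or MODULE FALLBACK
pc_prefix_or() {
    module=$1
    fallback=$2
    # Use the .pc file's prefix only if the query succeeds and is non-empty.
    if prefix=$(pkg-config "$module" --variable=prefix 2>/dev/null) \
        && [ -n "$prefix" ]; then
        printf '%s\n' "$prefix"
    else
        # pkg-config missing, module unknown, or no prefix variable defined.
        printf '%s\n' "$fallback"
    fi
}
```

In the COSMOS script this would read something like MPIROOT=$(pc_prefix_or mpi /sw/sarge-ia32/mpich2-1.0.4p1-pgi), which keeps the original hard-coded path as the default on systems without an mpi.pc.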