"You take a million, billion tonnes of flaming inferno and turn it into 'twinkle, twinkle little star' ..."

Sun, 21 Jun 2009

Debian Meteorology : Status, Summer Solstice 2009.

Unfortunately I won't be able to make it to DebConf9, so as an aid to those who are going, here's a summary of current work:

I've just uploaded terralib 3.3.1 to Debian, and it's sitting in the NEW queue, as the older version was removed from the archive due to lack of maintenance. (It had an RC bug: it both included its own copy of libtiff and failed to link against it. It now links against external copies of libgeotiff and libtiff.)

In the NEW queue it joins g2clib, hdf-eos4, hdf-eos5 and udunits. These are there as dependencies of other meteorology-related packages I'm working on: magics++ needs terralib and gshhs; zygrib needs gshhs (it has a copy built-in). NCL (NCAR Command Language) has a rake of dependencies including udunits, hdf-eos, g2clib and vis5d+ (ITP'd). I'm also packaging VISIT for visualization.

Then there is the GSHHS issue: I think I'll end up packaging 'gmt-coastline-high', but the format of the coastline maps needs to be decided (netCDF or its own binary format) and updating the sundry packages to read the latest version needs tackling.

I'm packaging these as they are used at ICHEC and I've experience building them. One of the main aims I had in setting up Debian Meteorology (beyond adding the software to Debian) was to help integrate all the Free and Open Source code in the Earth-sciences field, and sort out dependency and build issues. I hadn't expected to encounter quite so many so quickly, though. I don't expect to get more done before vacation-time, but I'll be happy if I get these done this summer.

Thu, 18 Jun 2009

Maps and Coastlines in Debian

As mentioned before, I've started working on Debian Meteorology, adding "standard" meteorology-related packages to Debian. Part of the aim of this is to jump-start an effort of integrating the FLOSS in the field: all the usual libraries that people working in the field use and expect to be on the supercomputers and workstations they use.

So, two packages I've been working on are Magics++ and zyGrib, which are plotting and visualisation tools respectively. They both contain coastline maps of the world. Digging deeper shows they use the same files: a binary database called 'GSHHS', or Global Self-consistent Hierarchical High-resolution Shorelines. Some scope for integration here.

So I started investigating GSHHS in order to create a 'coastline data' package to be shared. It turns out that building GSHHS depends on GMT, the Generic Mapping Tools, already present in Debian. This coastline issue has been explored before, and a package gmt-coast-low created.

"gmt-coast-low" is 5.5 MB in size, and as its name suggests, there was once a "gmt-coast-high", but this has since been dropped for taking up too much space in the Debian archive (in its place, a script which will download this data for you has been created). But the files in gmt-coast-low are in netCDF rather than GSHHS's own binary format; what to do? I posted a mail asking for help, and it turns out that another package is being considered: Basemap, an add-on for Matplotlib, that also includes the GSHHS data.

I've summarized the files, sizes and versions here in the Debian Wiki. Offhand it appears that there is scope for re-adding a gmt-coastline-high package (with perhaps additional small datafiles on state boundaries, etc. as seen in Basemap), though some questions remain:

  • Is 170 MB of arch-independent data too much these days in the Debian archive, especially since it appears at least 4 packages can use it?
  • It seems that some packages would need to be patched to bring them up to date with the latest format version for the database. What format should the data be in: this special binary format (quite simple) or netCDF?

Wed, 22 Apr 2009

pkg-config, -as-needed and multiple compilers

At work I've been promoting the use of pkg-config and modules to avoid hard-coded paths in an environment where we have multiple compilers. In summary: "module load intel-cc" loads the Intel compiler, "module load netcdf-intel" loads the matching netCDF build (which among other things appends /ichec/packages/netcdf/4.0-intel/pkgconfig to $PKG_CONFIG_PATH), and then:

NCDF_INCS := `pkg-config netcdf --cflags`
NCDF_LIBS := `pkg-config netcdf --libs`
in the application code. Replace "intel-cc" and "netcdf-intel" with "gcc" and "netcdf-gcc", and it builds with gcc.

This would work better if upstream supplied .pc files, which means the next stage in world domination is to send patches to do just that. But it's not that easy, apparently.

netcdf (for example) supplies two libraries, libnetcdf.so and libnetcdff.so, with the second including code only needed for Fortran. So, for gcc I have the following netcdf.pc file:

prefix=/ichec/packages/netcdf/4.0-gcc
exec_prefix=${prefix}
libdir=${prefix}/lib
includedir=${prefix}/include

Name: netcdf
Description: netCDF libraries, include files and development tools (gcc version)
Version: 4.0
Libs: -L${libdir} -Wl,--as-needed -lnetcdf   -lnetcdff 
Cflags: -I${includedir}
While for Intel I have:
prefix=/ichec/packages/netcdf/4.0-intel
# Add fortran libs 
forlibs=/ichec/packages/intel/fce/11.0.081/lib/intel64
exec_prefix=${prefix}
libdir=${prefix}/lib
includedir=${prefix}/include

Name: netcdf
Description: netCDF libraries, include files and development tools
Version: 4.0
Libs: -L${libdir} -lnetcdf -L${forlibs}  -lnetcdff -lifcore
Cflags: -I${includedir}

The main problem is that --as-needed is not understood by non-GNU linkers, and must be conditionally removed somehow. Any ideas? (There is a second wrinkle of needing to add additional libraries for Intel Fortran here, but I'm sure I can remove that with the addition of more .pc files.)
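One possible approach is to filter the flag out in the build script when the linker does not look like a GNU one. The following is only a sketch: strip_as_needed and linker_is_gnu are hypothetical helpers, the check assumes that a GNU-compatible linker mentions "GNU" in its --version output, and it inspects whatever "ld" is first in $PATH, which may not be the linker the compiler driver actually invokes.

```shell
# Hypothetical helper: print the flag string $1 with -Wl,--as-needed removed.
strip_as_needed() {
    out=""
    for flag in $1; do
        case "$flag" in
            -Wl,--as-needed) ;;                      # drop this flag
            *) out="${out:+$out }$flag" ;;           # keep everything else
        esac
    done
    printf '%s\n' "$out"
}

# Assumption: a linker that understands --as-needed says "GNU" in --version.
linker_is_gnu() {
    ld --version 2>/dev/null | grep -q 'GNU'
}

# In real use this string would come from `pkg-config netcdf --libs`.
flags='-L/ichec/packages/netcdf/4.0-gcc/lib -Wl,--as-needed -lnetcdf -lnetcdff'

if linker_is_gnu; then
    NCDF_LIBS=$flags
else
    NCDF_LIBS=$(strip_as_needed "$flags")
fi
echo "$NCDF_LIBS"
```

This keeps a single .pc file per library and moves the linker-specific decision into the one place that knows which toolchain is loaded.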

A second issue is that --as-needed can break otherwise working pkg-config usages. Thanks to galtgendo at PHP Bugs for this example:

alastair@ailm:~$ cc -o testx  -Wl,--as-needed `pkg-config glib-2.0 --cflags` test.c `pkg-config glib-2.0 --libs`
works, but:
alastair@ailm:~$ cc -o testx  -Wl,--as-needed `pkg-config glib-2.0 --cflags`  `pkg-config glib-2.0 --libs` test.c
/tmp/ccsUEwtk.o: In function `main':
test.c:(.text+0x20): undefined reference to `g_print'
collect2: ld returned 1 exit status
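The problem is that with --as-needed the linker drops any library that nothing seen so far on the command line needs. In the second form the libraries come before test.c, so they are discarded as unnecessary before the linker sees the test.o code that calls g_print.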
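The safe habit is to put sources or objects first and libraries last whenever a link line is assembled. A minimal sketch of that ordering, where link_line is a hypothetical helper that only prints the command instead of running it, and the flag values are placeholders for real pkg-config output:

```shell
# Hypothetical helper: print a link command with the objects placed before
# the libraries, so --as-needed has seen the undefined symbols by the time
# it reaches each -l option.
link_line() {
    output=$1
    objects=$2
    cflags=$3
    libs=$4
    printf '%s\n' "cc -o $output -Wl,--as-needed $cflags $objects $libs"
}

# Placeholder values; in practice these would come from
# `pkg-config glib-2.0 --cflags` and `pkg-config glib-2.0 --libs`.
link_line testx test.c '-I/usr/include/glib-2.0' '-lglib-2.0'
```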

Sat, 28 Mar 2009

Pkg-config and Modules

At ICHEC one of our tasks is to help users build their software on our supercomputers. Mostly we have to change paths, etc. in code bases that are used by a handful of scientists, and we don't want to exacerbate the hard-coded path problem. We also have multiple versions of certain libraries and compilers installed, which we select using modules: to do this, you select your software environment with e.g.

$ module load intel-fc/11.0.082  netcdf/4.0-intel
This selects the compiler version and library version, and places the appropriate paths in $PATH and $LD_LIBRARY_PATH. While this is useful for selecting e.g. compilers, it's often not great for the many build processes that expect hard-coded library paths, etc. I also dislike adding random environment variables: it loses the Single Point Of Truth that describes the configuration. So I've come up with the following pattern.

First, record the modules used (e.g. library versions) in config.log if configure is used. To do this, add the following to the autoconf file or system headers:

# AC_MODULES_LIST
# ---------------
# Check if the system runs 'modules', and if so, record the modules environment in the config.log
AC_DEFUN([AC_MODULES_LIST],
[AC_MSG_CHECKING([for modules output])
# 'module list' writes to stderr, so capture both streams
ac_modules_output=`module list 2>&1`
if test $? -eq 0; then
 AC_MSG_RESULT([Output from 'module list' was: $ac_modules_output])
else
 AC_MSG_RESULT([none])
fi
])

Secondly, for all packages, ensure that a pkgconfig package file exists. For example, we have multiple MPI implementations on our SGI system. For MVAPICH, we have a modules file:

stokes2:~$ module show mvapich2-intel
-------------------------------------------------------------------
/ichec/modulefiles/mvapich2-intel/1.2p1:

module-whatis    MVAPICH2 1.2p1 (Intel 11.0.074) 
conflict         mvapich2-gnu mvapich2-intel 
prepend-path     PATH /ichec/packages/mvapich/1.2p1-intel/bin 
prepend-path     LD_LIBRARY_PATH /ichec/packages/mvapich/1.2p1-intel/lib 
prepend-path     MANPATH /ichec/packages/mvapich/1.2p1-intel/share/man 
prepend-path     INCLUDE /ichec/packages/mvapich/1.2p1-intel/include 
prepend-path     CPATH /ichec/packages/mvapich/1.2p1-intel/include 
prepend-path     FPATH /ichec/packages/mvapich/1.2p1-intel/include 
prepend-path     PKG_CONFIG_PATH /ichec/packages/mvapich/1.2p1-intel/share/pkgconfig 
-------------------------------------------------------------------
We add a file mpi.pc to the PKG_CONFIG_PATH for each MPI implementation. So, mpi.pc looks like:
# MPI pkgconfig file for mvapich2-intel
# Normally only used to get variables $prefix, etc.
prefix=/ichec/packages/mvapich/1.2p1-intel
exec_prefix=${prefix}
libdir=${prefix}/lib
includedir=${prefix}/include

Name: mpi
Description: MVAPICH2 MPI libraries and include files (Intel version)
Version: 1.2p1
Libs: -L${libdir}  -lmpichf90 -lmpich -lpthread -lrdmacm -libverbs -libumad -lrt
Cflags: -I${includedir}
This then means we can add the following small patch to our target software (in this case a climate code COSMOS):
 #########################################################################
 #
 #  MPI message passing root directory of the chosen compiler
 
+  # Get default MPI ROOT from the pkgconfig file if possible
+  pkg-config mpi --variable=prefix 2> /dev/null > /dev/null
+  if [ $? -eq 0 ]; then
+    MPIROOT=`pkg-config mpi --variable=prefix`
+  fi
+
   if   [ $compiler = ifort ]; then
     export cc=cc
     if [ $CHAN = MPI2 ]; then
-       export MPIROOT=/sw/sarge-ia32/mpich2-1.0.4p1-pgi
+       export MPIROOT=${MPIROOT:-/sw/sarge-ia32/mpich2-1.0.4p1-pgi}
     elif [ $CHAN = MPI1 ]; then

Such a patch is generic enough to be useful outside our institution, and means the code can build out-of-the-box on many systems. Now to add mpi.pc to Debian ...
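The same lookup-with-fallback pattern can be wrapped in a small function for build scripts that need it for several packages. This is a sketch only: prefix_or_default is a hypothetical helper name, and it assumes each .pc file defines a "prefix" variable as the ones above do.

```shell
# Hypothetical helper: print the "prefix" variable from package $1's .pc file,
# or the fallback path $2 if pkg-config is missing, the package is unknown,
# or the .pc file defines no prefix.
prefix_or_default() {
    pkg=$1
    fallback=$2
    found=$(pkg-config --variable=prefix "$pkg" 2>/dev/null) || found=""
    printf '%s\n' "${found:-$fallback}"
}

# The fallback here is the hard-coded path from the original script.
MPIROOT=$(prefix_or_default mpi /sw/sarge-ia32/mpich2-1.0.4p1-pgi)
export MPIROOT
echo "MPIROOT=$MPIROOT"
```

On a system with no mpi.pc installed this simply reproduces the old behaviour, so the script still works at the site it was written for.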

Thu, 20 Nov 2008

One of Those Codes
--3177-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV) - exiting
--3177-- si_code=80;  Faulting address: 0x0;  sp: 0x402A9FD40

valgrind: the 'impossible' happened:
   Killed by fatal signal
   ==3177==    at 0x3801FDEA: unlinkBlock (m_mallocfree.c:190)
   ==3177==    by 0x38020CAE: vgPlain_arena_malloc (m_mallocfree.c:1055)
   ==3177==    by 0x38035516: vgPlain_cli_malloc (replacemalloc_core.c:101)
   ==3177==    by 0x380022F5: vgMemCheck_malloc (mc_malloc_wrappers.c:182)
   ==3177==    by 0x38035BA7: do_client_request (scheduler.c:1158)
   ==3177==    by 0x380372B1: vgPlain_scheduler (scheduler.c:869)
   ==3177==    by 0x38051B59: run_a_thread_NORETURN (syswrap-linux.c:87)

This code has already killed one debugger (Intel DB) and sent strace into an infinite loop of segfaults. <Sigh />.

Tue, 10 Jun 2008

Little Endian problems
#include <stdio.h>
#include <sys/types.h>

int main(int argc, char *argv[])
{
#ifdef LITTLE_ENDIAN
	printf("Little Endian defined\n");
#endif
}

On Linux/x86, this prints "Little Endian defined". Unfortunately it also does so on the IBM BlueGene (Linux, GNU libc), which is, by default, big-endian. (Thanks to my colleague Honore Tapamo for discovering this.)

It turns out that this is due to <sys/types.h> including <endian.h>, which has:

#define __LITTLE_ENDIAN 1234
#define __BIG_ENDIAN    4321
#define __PDP_ENDIAN    3412

/* This file defines `__BYTE_ORDER' for the particular machine.  */
#include <bits/endian.h>

   #ifdef  __USE_BSD
   # define LITTLE_ENDIAN  __LITTLE_ENDIAN
   # define BIG_ENDIAN     __BIG_ENDIAN
   # define PDP_ENDIAN     __PDP_ENDIAN
   # define BYTE_ORDER     __BYTE_ORDER
   #endif
   
   #if __BYTE_ORDER == __LITTLE_ENDIAN
   # define __LONG_LONG_PAIR(HI, LO) LO, HI
   #elif __BYTE_ORDER == __BIG_ENDIAN
   # define __LONG_LONG_PAIR(HI, LO) HI, LO
   #endif

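So LITTLE_ENDIAN is always defined (as the constant 1234) whenever __USE_BSD is set, whatever the machine's actual byte order; only comparing BYTE_ORDER against it says anything about the machine. We're working on a code at the moment that has LITTLE_ENDIAN defined in Makefiles on little-endian architectures. This all needs to be changed to something like IS_LITTLE_ENDIAN to avoid this issue.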

POSIX requires that all symbols beginning with an underscore, followed by a capital letter or another underscore, are reserved for "the system". Unfortunately the reverse does not appear to be true: system headers may also define names outside that reserved space. The result here is a collision between a symbol that the programmers thought was unique to their code and one defined in system headers with a different meaning.

Is there a standard for detecting byte order? There doesn't appear to be one in POSIX. Linux / libc provides __BYTE_ORDER, but what other operating systems does this work on?
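In the meantime, one way to avoid relying on any header macro is to probe the byte order from the build script and pass a project-specific define to the compiler. The sketch below uses only printf, od and tr; host_byte_order, IS_LITTLE_ENDIAN and IS_BIG_ENDIAN are names made up for illustration. Note that it reports the byte order of the machine running the script, so it would give the wrong answer when cross-compiling for a target with a different byte order.

```shell
# Probe the byte order of the machine running this script.
# The two bytes 0x01 0x00, read back as one 16-bit unsigned integer,
# give 1 on a little-endian host and 256 on a big-endian one.
host_byte_order() {
    value=$(printf '\001\000' | od -An -tu2 | tr -d '[:space:]')
    case "$value" in
        1)   echo little ;;
        256) echo big ;;
        *)   echo unknown ;;
    esac
}

# Project-specific macro names, chosen to stay clear of the
# LITTLE_ENDIAN / BIG_ENDIAN names that <endian.h> may define.
case "$(host_byte_order)" in
    little) ENDIAN_FLAG=-DIS_LITTLE_ENDIAN ;;
    big)    ENDIAN_FLAG=-DIS_BIG_ENDIAN ;;
    *)      ENDIAN_FLAG= ;;
esac
echo "$ENDIAN_FLAG"
```

The resulting flag can be appended to the compiler flags in the Makefile in place of the hand-maintained LITTLE_ENDIAN define.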

Thu, 28 Jun 2007

Fun Filenames

Ok, some interesting output from a build script of a nameless code I'm working on:

...
scanning lib[10].a ...
scanning lib[11].a ...
scanning lib[12].a ...
scanning lib[13].a ...
scanning lib[14].a ...
scanning lib[15].a ...

pgf90 -Wl,--start-group ./master.o -L. -l[1] -l[2] -l[3] -l[4] -l[5] -l[6] -l[7] -l[8] -l[9] -l[10] -l[11] -l[12] -l[13] -l[14] -l[15] -Wl,--end-group -llapack -lblas
./lib[2].a(cprep1.o)(.text+0x181d): In function `cprep1':
...

Next time, think of someone trying to find these interesting files. People using regexps, ok ? What do these 'libraries' contain, anyway ?
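To illustrate the trap: in a regular expression (and in a shell glob) "[1]" is a bracket expression matching the single character "1", so a search for one of these names quietly matches something else. A small demonstration, using a made-up two-entry file list:

```shell
# Two candidate file names: one with literal brackets, one without.
names='lib[1].a
lib1.a'

# As a regular expression, [1] matches the character "1" and "." matches
# any character, so this finds lib1.a and misses lib[1].a.
as_regex=$(printf '%s\n' "$names" | grep 'lib[1].a')

# With -F the pattern is a fixed string, so only the literal name matches.
as_fixed=$(printf '%s\n' "$names" | grep -F 'lib[1].a')

echo "regex match: $as_regex"
echo "fixed match: $as_fixed"
```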

Thu, 24 May 2007

Comment-free code

Often looking through old code you find yourself wishing for more comments, or even more correct comments. Rarely, less comments.

But looking at some code I'm debugging at the moment at work I found the following bizarre idiom (details changed to protect the afflicted) ...

#ifdef DOC
C  Some fortran comments.
C  Yes, ordinary comments, nothing fancy like preprocessor directives, but if you want
C  your code without a paragraph explaining what it does, you can just preprocess it
C  and remove it.
C 
C Perhaps they'd expected that their fellow developers would write essays in the
C source code that would make it hard to follow, but then, you'd still need to edit
C the full source code to debug and develop it, right?
C
#endif
Preprocessor abuse : chapter 2 of lots.

Those who program C are used to seeing function / subroutine definitions in header files. Not so in Fortran: there, people use the C pre-processor to put header files in subroutine definitions.

Don't try this at home, folks.

Again, names changed to protect the afflicted ...

subroutine  funmame( param1,
	      param2, param3,
#ifdef FOO32
	     foo, bar,
#endif
	      param7, param8, 
#include "basarg.h"
	     )

Remarkably, basarg.h included:

	C Comment.
	param5, param6,
#ifdef SOMETHING
        param10, param20,
#endif

There were in fact dozens of parameters in basarg.h. And even a higher.h that included basarg.h but had more parameters ...