Sun, 21 Jun 2009
Unfortunately I won't be able to make it to DebConf9, so as an aid to those who are going, here's a summary of current work:
I've just uploaded terralib 3.3.1 to Debian, and it's sitting in the NEW queue, as the older version was removed from the archive due to lack of maintenance. (It had an RC bug: it both included its own copy of libtiff and failed to link against it. It now links against external copies of libgeotiff and libtiff.)
In the NEW queue it joins g2clib, hdf-eos4, hdf-eos5 and udunits. These are there as dependencies of other Meteorology-related packages I'm working on: magics++ needs terralib and gshhs; zygrib needs gshhs (it has a copy built-in). NCL (NCAR Command Language) has a rake of dependencies including udunits, hdf-eos, g2clib and vis5d+ (ITP'd). I'm also packaging VISIT for visualization.
Then there is the GSHHS issue: I think I'll end up packaging 'gmt-coastline-high', but the format of the coastline maps needs to be decided (netCDF or its own binary format) and updating the sundry packages to read the latest version needs tackling.
I'm packaging these as they are used at ICHEC and I've experience building them. One of the main aims I had in setting up Debian Meteorology (beyond adding the software to Debian) was to help integrate all the Free and Open Source code in the Earth-sciences field, and sort out dependency and build issues. I hadn't expected to encounter quite so many so quickly, though. I don't expect to get more done before vacation-time, but I'll be happy if I get these done this summer.
Thu, 18 Jun 2009
As mentioned before, I've started working on Debian Meteorology, adding "standard" meteorology-related packages to Debian. Part of the aim of this is to jump-start an effort of integrating the FLOSS in the field: all the usual libraries that people working in the field use and expect to be on the supercomputers and workstations they use.
So, two packages I've been working on are Magics++ and zyGrib, which are plotting and visualisation tools, respectively. Both contain coastline maps of the world. Digging deeper shows they use the same files: a binary database called 'GSHHS', or Global Self-consistent Hierarchical High-resolution Shorelines. Some scope for integration here.
So, I start investigating GSHHS in order to create a 'coastline data' package to be shared. It turns out that building GSHHS depends on GMT, the Generic Mapping Tools, already present in Debian, and this coastline issue has been explored before, and a package gmt-coast-low created.
"gmt-coast-low" is 5.5 MB in size, and as its name suggests, there was once a "gmt-coast-high", but this has since been dropped for taking up too much space in the Debian archive (in its place, a script which will download this data for you has been created). But the files in gmt-coast-low are in netCDF rather than GSHHS's own binary format; what to do? I posted a mail asking for help, and it turns out that another package is being considered: Basemap, an add-on for Matplotlib, which also includes the GSHHS data.
I've summarized the files, sizes and versions here in the Debian Wiki. Offhand it appears that there is scope for re-adding a gmt-coastline-high package (with perhaps additional small datafiles on states boundaries, etc. seen in Basemap), though some questions remain:
Wed, 22 Apr 2009
At work I've been promoting the use of pkg-config and modules to avoid hard-coded paths in an environment where we have multiple compilers. In summary: "module load intel-cc" loads the Intel compiler, and "module load netcdf-intel" loads netCDF, which among other things appends /ichec/packages/netcdf/4.0-intel/pkgconfig to $PKG_CONFIG_PATH, and then:
in the application code. Replace "intel-cc" and "netcdf-intel" with "gcc" and "netcdf-gcc", and it builds with gcc.
This would work better if upstream supplied .pc files, which means the next stage in world domination is to send patches to do just that. But it's not that easy, apparently.
netcdf (for example) supplies two libraries, libnetcdf.so and libnetcdff.so, with the second including code only needed for Fortran. So, for gcc I have the following pkg-config file for netcdf:
    prefix=/ichec/packages/netcdf/4.0-gcc
    exec_prefix=${prefix}
    libdir=${prefix}/lib
    includedir=${prefix}/include

    Name: netcdf
    Description: netCDF libraries, include files and development tools (gcc version)
    Version: 4.0
    Libs: -L${libdir} -Wl,--as-needed -lnetcdf -lnetcdff
    Cflags: -I${includedir}

While for Intel I have:

    prefix=/ichec/packages/netcdf/4.0-intel
    # Add fortran libs
    forlibs=/ichec/packages/intel/fce/11.0.081/lib/intel64
    exec_prefix=${prefix}
    libdir=${prefix}/lib
    includedir=${prefix}/include

    Name: netcdf
    Description: netCDF libraries, include files and development tools
    Version: 4.0
    Libs: -L${libdir} -lnetcdf -L${forlibs} -lnetcdff -lifcore
    Cflags: -I${includedir}

The main problem is that --as-needed is not understood by linkers other than GNU ld, and must be conditionally removed somehow. Any ideas? (There is a second wrinkle of needing to add additional libraries for Intel Fortran here, but I'm sure I can remove that with the addition of more .pc files.)
A second issue is that --as-needed can break otherwise working pkg-config usages. Thanks to galtgendo at PHP Bugs for this example:
works, but:
The problem is that --as-needed removes libraries as unnecessary before the linker sees the test.o code.
Sat, 28 Mar 2009
At ICHEC one of our tasks is to help users build their software on our supercomputers. Mostly we have to change paths, etc. in code bases that are used by a handful of scientists, and we don't want to exacerbate the hard-coded path problem. We also have multiple versions of certain libraries and compilers installed, which we select using modules: to do this, you select your software environment with e.g.
This selects the compiler version and library version, and places the appropriate paths in $PATH and $LD_LIBRARY_PATH. While this is useful for selecting e.g. compilers, it's often not great for the many build processes that expect library paths, etc. to be hard-coded. I also dislike adding random environment variables: it loses the Single Point Of Truth that describes the configuration. So I've come up with the following pattern.
First, record the modules used (e.g. library version) in config.log if configure is used. To do this, add the following to the autoconf file or system headers:
Secondly, for all packages, ensure that a pkgconfig package file exists. For example, we have multiple MPI implementations on our SGI system. For MVAPICH, we have a modules file:
We add a file mpi.pc to the PKG_CONFIG_PATH for each MPI implementation. So, mpi.pc looks like:

    # MPI pkgconfig file for mvapich2-intel
    # Normally only used to get variables $prefix, etc.
    prefix=/ichec/packages/mvapich/1.2p1-intel
    exec_prefix=${prefix}
    libdir=${prefix}/lib
    includedir=${prefix}/include

    Name: mpi
    Description: netCDF libraries, include files and development tools
    Version: 1.2p1
    Libs: -L${libdir} -lmpichf90 -lmpich -lpthread -lrdmacm -libverbs -libumad -lrt
    Cflags: -I${includedir}

This then means we can add the following small patch to our target software (in this case a climate code, COSMOS):

     #########################################################################
     #
     # MPI message passing root directory of the chosen compiler
    +# Get default MPI ROOT from the pkgconfig file if possible
    +pkg-config mpi --variable=prefix 2> /dev/null > /dev/null
    +if [ $? -eq 0 ]; then
    +    MPIROOT=`pkg-config mpi --variable=prefix`
    +fi
    +
     if [ $compiler = ifort ]; then
       export cc=cc
       if [ $CHAN = MPI2 ]; then
    -    export MPIROOT=/sw/sarge-ia32/mpich2-1.0.4p1-pgi
    +    export MPIROOT=${MPIROOT:-/sw/sarge-ia32/mpich2-1.0.4p1-pgi}
       elif [ $CHAN = MPI1 ]; then

Such a patch is generic enough to be useful outside our institution, and means the code can build out-of-the-box on many systems. Now to add mpi.pc to Debian ...
Thu, 20 Nov 2008
This code has already killed one debugger (Intel DB) and sent strace into an infinite loop of segfaults. <Sigh />.
Tue, 10 Jun 2008
    #include <stdio.h>
    #include <sys/types.h>

    int main(int argc, char *argv[])
    {
    #ifdef LITTLE_ENDIAN
        printf("Little Endian defined\n");
    #endif
        return 0;
    }

On Linux/x86, this prints "Little Endian defined". Unfortunately it also does so on the IBM BlueGene (Linux, GNU Libc), which is, by default, big-endian. (Thanks to my colleague Honore Tapamo for discovering this.)
It turns out that this is due to <sys/types.h> including <endian.h>, which has:
We're working on a code at the moment that has LITTLE_ENDIAN defined in Makefiles on Little-Endian architectures. This all needs to be changed to something like IS_LITTLE_ENDIAN to avoid this issue.
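A minimal sketch of such a rename, assuming a GCC- or Clang-style compiler that predefines `__BYTE_ORDER__`. The macro name `MYCODE_IS_LITTLE_ENDIAN` and the function are hypothetical, made up for illustration. The idea is to use a project-prefixed name and to compare values with `#if` rather than test definedness with `#ifdef`, since glibc's `<endian.h>` defines both `LITTLE_ENDIAN` and `BIG_ENDIAN` as constants regardless of architecture:

```c
#include <sys/types.h>  /* may define LITTLE_ENDIAN on its own, as described above */

/*
 * MYCODE_IS_LITTLE_ENDIAN is a hypothetical project-private name. A Makefile
 * can still force it with -DMYCODE_IS_LITTLE_ENDIAN=1 (or =0); otherwise it is
 * derived from the compiler's predefined macros. Assumption: GCC and Clang
 * provide __BYTE_ORDER__; other compilers may not.
 */
#ifndef MYCODE_IS_LITTLE_ENDIAN
#  if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__)
#    define MYCODE_IS_LITTLE_ENDIAN (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
#  else
#    error "Byte order unknown: pass -DMYCODE_IS_LITTLE_ENDIAN=0 or =1"
#  endif
#endif

/* Select code on the macro's value (#if), not on whether it exists (#ifdef). */
const char *mycode_byte_order_name(void)
{
#if MYCODE_IS_LITTLE_ENDIAN
    return "little-endian";
#else
    return "big-endian";
#endif
}
```

Because the name carries a project prefix, a system header that happens to define `LITTLE_ENDIAN` no longer changes which branch is compiled.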
POSIX requires that all symbols beginning with an underscore, followed by a capital letter or another underscore, are reserved for "the system". Unfortunately the reverse does not appear to hold: system headers may also define names outside that reserved space. Here, a symbol that programmers thought was unique to their code is also defined in a system header, with a different meaning.
Is there a standard for detecting byte order? There doesn't appear to be one in POSIX. Linux / libc provides __BYTE_ORDER, but what other operating systems does this work on?
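One approach that does not depend on any particular header is to ask the machine at run time. A sketch in standard C (C99, for `<stdint.h>`); the function name is made up for illustration:

```c
#include <stdint.h>
#include <string.h>

/*
 * Returns 1 on a little-endian host and 0 otherwise, by storing a known
 * 32-bit value and looking at which byte ends up at the lowest address.
 * host_is_little_endian is an illustrative name, not a standard function.
 */
int host_is_little_endian(void)
{
    const uint32_t probe = 0x01020304u;
    unsigned char bytes[sizeof probe];

    /* Copy the object representation; memcpy avoids pointer-cast aliasing questions. */
    memcpy(bytes, &probe, sizeof probe);

    /* A little-endian machine stores the least significant byte (0x04) first. */
    return bytes[0] == 0x04;
}
```

This cannot be used in an `#if`, so it suits things like byte-swapping I/O routines better than conditional compilation.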
Thu, 28 Jun 2007
Ok, some interesting output from a build script of a nameless code I'm working on:
Next time, think of someone trying to find these interesting files. People using regexps, ok? What do these 'libraries' contain, anyway?
Thu, 24 May 2007
Often, looking through old code, you find yourself wishing for more comments, or even more correct comments. Rarely, fewer comments.
But looking at some code I'm debugging at the moment at work I found the following bizarre idiom (details changed to protect the afflicted) ...
Those who program C are used to seeing function / subroutine definitions in header files. Not so in Fortran: there, people use the C pre-processor to put header files in subroutine definitions.
Don't try this at home, folks.
Again, names changed to protect the afflicted ...
Remarkably, basarg.h included:
    C Comment.
         param5, param6,
    #ifdef SOMETHING
         param10, param20,
    #endif

There were in fact dozens of parameters in basarg.h. And even a higher.h that included basarg.h but had more parameters ...