"You take a million, billion tonnes of flaming inferno and turn it into 'twinkle, twinkle little star' ..."

Tue, 10 Jun 2008

Little Endian problems
#include <stdio.h>
#include <sys/types.h>

int main(int argc, char *argv[])
{
#ifdef LITTLE_ENDIAN
	printf("Little Endian defined\n");
#endif
}

On Linux/x86, this prints "Little Endian defined". Unfortunately it also does so on the IBM BlueGene (Linux, GNU libc), which is big-endian by default. (Thanks to my colleague Honore Tapamo for discovering this.)

It turns out that this is due to <sys/types.h> including <endian.h>, which has:

#define __LITTLE_ENDIAN 1234
#define __BIG_ENDIAN    4321
#define __PDP_ENDIAN    3412

/* This file defines `__BYTE_ORDER' for the particular machine.  */
#include <bits/endian.h>

#ifdef  __USE_BSD
# define LITTLE_ENDIAN  __LITTLE_ENDIAN
# define BIG_ENDIAN     __BIG_ENDIAN
# define PDP_ENDIAN     __PDP_ENDIAN
# define BYTE_ORDER     __BYTE_ORDER
#endif

#if __BYTE_ORDER == __LITTLE_ENDIAN
# define __LONG_LONG_PAIR(HI, LO) LO, HI
#elif __BYTE_ORDER == __BIG_ENDIAN
# define __LONG_LONG_PAIR(HI, LO) HI, LO
#endif

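We're currently working on a code base that defines LITTLE_ENDIAN in its Makefiles on little-endian architectures. Because the header above defines LITTLE_ENDIAN on every architecture, an #ifdef LITTLE_ENDIAN test succeeds on big-endian machines too. This all needs to be renamed to something like IS_LITTLE_ENDIAN to avoid the issue.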
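A minimal sketch of what the rename could look like, assuming the Makefile passes a project-specific -DIS_LITTLE_ENDIAN flag only on little-endian targets. The function name built_for_little_endian is made up for illustration and is not part of the code base mentioned above.

```c
#include <sys/types.h>  /* on glibc this may define LITTLE_ENDIAN on every host */

/* IS_LITTLE_ENDIAN is a project-specific name, assumed to be set by the
 * build system (e.g. -DIS_LITTLE_ENDIAN) and by nothing else. Because no
 * system header is expected to define it, #ifdef is a safe test here. */
int built_for_little_endian(void)
{
#ifdef IS_LITTLE_ENDIAN
	return 1;   /* the Makefile declared this target little-endian */
#else
	return 0;   /* flag absent: big-endian target, or flag forgotten */
#endif
}
```

Note that this still trusts the Makefile to be right; it only removes the name collision with the system headers.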

POSIX requires that all symbols beginning with an underscore, followed by a capital letter or another underscore, are reserved for "the system". Unfortunately the reverse does not appear to be true: system headers may still define names outside that reserved space. The result here is a collision, in which a symbol that programmers thought was unique to their code is also defined in the system headers, but with a different meaning.

Is there a standard way to detect byte order? There doesn't appear to be one in POSIX. Linux/glibc provides __BYTE_ORDER, but which other operating systems does this work on?

As you posted, glibc defines not only __BYTE_ORDER but also BYTE_ORDER (i.e., without the underscores). Further, it is defined conditionally, depending on whether the BSD feature set is selected. This suggests that the feature is derived from BSD and is likely quite portable. To check the byte order using these macros, use:

if (BYTE_ORDER == LITTLE_ENDIAN) { ... } else if (BYTE_ORDER == BIG_ENDIAN) { ... } else { panic ("Unknown byte order."); }

Easiest to just test it yourself:

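/* Inside a function: the first byte in memory is the low-order byte on little-endian. */
union { unsigned char little_endian; unsigned short endian_test; } u;
u.endian_test = 1;
/* u.little_endian now reads 1 on a little-endian machine, 0 on a big-endian one. */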

Or if you don't mind using C99 (with <stdbool.h> for bool):

const bool little_endian = ((union { unsigned char c; unsigned short s; }) { .s = 1 }).c;

With optimization enabled, compilers will usually fold both of these into compile-time constants.
Easy, yes, but not what I'm looking for. It still adds unnecessary

if (little_endian)
  do_a();
else
  do_b();

conditionals all over the code. What I was looking for was a standard method of detecting byte order at compile time. POSIX (1) is silent on this matter; I'm not sure about SUS.

It looks from Neal's comment as though BYTE_ORDER == LITTLE_ENDIAN is the best option. It works for me (tm) on Linux, Mac OS X (BSD) and AIX, at least; I'm looking for other architectures to test, and then maybe to push it into future standards.
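For the compile-time case, the same comparison can be made in the preprocessor, so that only one branch is ever compiled. The following is a sketch under assumptions rather than a portable recipe: it assumes <sys/types.h> exposes the BSD-style macros (on glibc that depends on feature-test macros such as _DEFAULT_SOURCE), and to_big_endian32 is a hypothetical helper invented for the example. The defined() guard matters because the preprocessor treats undefined names as 0, which would make an unguarded BYTE_ORDER == LITTLE_ENDIAN true when neither macro exists.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* assumption: asks glibc to expose BYTE_ORDER etc. */
#endif
#include <stdint.h>
#include <sys/types.h>      /* on glibc, pulls in <endian.h> */

/* Refuse to build rather than silently compare 0 == 0. */
#if !defined(BYTE_ORDER) || !defined(LITTLE_ENDIAN) || !defined(BIG_ENDIAN)
#error "BYTE_ORDER macros not available from <sys/types.h> on this system"
#endif

#if BYTE_ORDER == LITTLE_ENDIAN
/* Little-endian host: swap the four bytes. */
uint32_t to_big_endian32(uint32_t v)
{
	return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
	       ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
}
#elif BYTE_ORDER == BIG_ENDIAN
/* Big-endian host: the value is already in the wanted order. */
uint32_t to_big_endian32(uint32_t v)
{
	return v;
}
#else
#error "Unsupported byte order"
#endif
```

Callers just use to_big_endian32(x); there is no if (little_endian) at the call site.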

(Though your run-time test could be useful on machines like the BlueGene that are 'open endian' and the endianness is defined at boot-time).
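For machines where the byte order is only known once the system is up, a run-time probe along the lines of the union test could look like the sketch below. The function name is_little_endian_at_runtime is an assumption for illustration; memcpy is used instead of a union simply to keep the example short.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if the running machine stores the low-order byte first,
 * 0 otherwise. Evaluated when called, not when compiled. */
int is_little_endian_at_runtime(void)
{
	const uint16_t probe = 1;
	unsigned char first_byte;

	/* Copy the byte at the lowest address of the 16-bit value. */
	memcpy(&first_byte, &probe, 1);
	return first_byte == 1;
}
```

The result could be checked once at start-up and cached, which keeps the cost of the test out of inner loops.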
