comp.lang.c FAQ list · Question 12.42

Q: How can I write code to conform to these old, binary data file formats?

A: It's hard, because of word size and byte order differences, floating-point formats, and structure padding. To get the control you need over these particulars, you may have to read and write things a byte at a time, shuffling and rearranging as you go. (This isn't always as bad as it sounds, and gives you both portability of your code and complete control.)

For example, to read a data structure consisting of a character, a 32-bit integer, and a 16-bit integer, from the stream fp, into the C structure

struct mystruct {
	char c;
	long int i32;
	int i16;
} s;

you might use code like this:

	s.c = getc(fp);

	s.i32 = (long)getc(fp) << 24;
	s.i32 |= (long)getc(fp) << 16;
	s.i32 |= (unsigned)getc(fp) << 8;
	s.i32 |= getc(fp);

	s.i16 = getc(fp) << 8;
	s.i16 |= getc(fp);

This code assumes that getc reads 8-bit characters, and that the data is stored most significant byte first (``big endian''). The casts to (long) ensure that the 16- and 24-bit shifts operate on long values (see question 3.14), and the cast to (unsigned) guards against sign extension. (In general, it's safer to use all unsigned types when writing code like this, but see question 3.19.)
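
The fragment above does not check for end-of-file or read errors. As a minimal sketch of the same big-endian technique using unsigned types throughout, the two helpers below (read_be16 and read_be32 are hypothetical names, not part of any standard library) assemble a value one byte at a time and report failure if getc returns EOF. They make the same assumption as the text: 8-bit bytes, most significant byte first.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read a 16-bit big-endian unsigned value.
   Returns 0 on success, -1 on EOF or read error. */
int read_be16(FILE *fp, unsigned int *out)
{
	int hi, lo;

	assert(fp != NULL && out != NULL);
	if ((hi = getc(fp)) == EOF)
		return -1;
	if ((lo = getc(fp)) == EOF)
		return -1;
	/* Convert to unsigned before shifting, so no signed overflow. */
	*out = ((unsigned int)hi << 8) | (unsigned int)lo;
	return 0;
}

/* Hypothetical helper: read a 32-bit big-endian unsigned value.
   Returns 0 on success, -1 on EOF or read error. */
int read_be32(FILE *fp, unsigned long *out)
{
	unsigned long v = 0;
	int i, c;

	assert(fp != NULL && out != NULL);
	for (i = 0; i < 4; i++) {
		if ((c = getc(fp)) == EOF)
			return -1;
		v = (v << 8) | (unsigned long)c;	/* earlier bytes move up */
	}
	*out = v;
	return 0;
}
```

A caller would read each field in file order and test every return value. If a field is signed in the file format, the caller still has to decide how to interpret values that have the top bit set.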

The corresponding code to write the structure might look like:

	putc(s.c, fp);

	putc((unsigned)((s.i32 >> 24) & 0xff), fp);
	putc((unsigned)((s.i32 >> 16) & 0xff), fp);
	putc((unsigned)((s.i32 >> 8) & 0xff), fp);
	putc((unsigned)(s.i32 & 0xff), fp);

	putc((s.i16 >> 8) & 0xff, fp);
	putc(s.i16 & 0xff, fp);

See also questions 2.12, 12.38, 16.7, and 20.5.
