20. Miscellaneous

comp.lang.c FAQ list · Question 20.1

Q: How can I return multiple values from a function?


A: There are several ways of doing this. (These examples show hypothetical polar-to-rectangular coordinate conversion functions, which must return both an x and a y coordinate.)

  1. Pass pointers to several locations which the function can fill in:
    #include <math.h>
    
    void polar_to_rectangular(double rho, double theta,
    		double *xp, double *yp)
    {
    	*xp = rho * cos(theta);
    	*yp = rho * sin(theta);
    }
    
    ...
    
    	double x, y;
    	polar_to_rectangular(1., 3.14, &x, &y);
    
  2. Have the function return a structure containing the desired values:
    struct xycoord { double x, y; };
    
    struct xycoord
    polar_to_rectangular(double rho, double theta)
    {
    	struct xycoord ret;
    	ret.x = rho * cos(theta);
    	ret.y = rho * sin(theta);
    	return ret;
    }
    
    ...
    
    	struct xycoord c = polar_to_rectangular(1., 3.14);
    
  3. Use a hybrid: have the function accept a pointer to a structure, which it fills in:
    void polar_to_rectangular(double rho, double theta,
    		struct xycoord *cp)
    {
    	cp->x = rho * cos(theta);
    	cp->y = rho * sin(theta);
    }
    
    ...
    
    	struct xycoord c;
    	polar_to_rectangular(1., 3.14, &c);
    
    (Another example of this technique is the Unix system call stat.)
  4. In a pinch, you could theoretically use global variables (though this is rarely a good idea).

See also questions 2.7, 4.8, and 7.5a.




comp.lang.c FAQ list · Question 20.2

Q: What's a good data structure to use for storing lines of text? I started to use fixed-size arrays of arrays of char, but they're just too restrictive.


A: One good way of doing this is with a pointer (simulating an array) to a set of pointers (each simulating an array) of char. This data structure is sometimes called a ``ragged array,'' and looks something like this:

[FIGURE GOES HERE]

You could set up the tiny array in the figure above with these simple declarations:

char *a[4] = {"this", "is", "a", "test"};
char **p = a;
(where p is the pointer-to-pointer-to-char and a is an intermediate array used to allocate the four pointers-to-char).

To really do dynamic allocation, you'd of course have to call malloc:

#include <stdlib.h>
#include <string.h>
char **p = malloc(4 * sizeof(char *));
if(p != NULL) {
	p[0] = malloc(5);
	p[1] = malloc(3);
	p[2] = malloc(2);
	p[3] = malloc(5);

	if(p[0] && p[1] && p[2] && p[3]) {
		strcpy(p[0], "this");
		strcpy(p[1], "is");
		strcpy(p[2], "a");
		strcpy(p[3], "test");
	}
}
(Some libraries have a strdup function which would streamline the inner malloc and strcpy calls. It's not Standard, but it's obviously trivial to implement something like it.)
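
As a sketch of how little is involved, here is one possible strdup-like helper. The name my_strdup is hypothetical (chosen so as not to collide with a library's own strdup), and the caller is assumed to free the result when done with it.

```c
#include <stdlib.h>
#include <string.h>

/* my_strdup: hypothetical helper, not a Standard function.
 * Returns a malloc'ed copy of s, or NULL if allocation fails. */
char *my_strdup(const char *s)
{
	size_t n = strlen(s) + 1;	/* +1 for the terminating '\0' */
	char *copy = malloc(n);

	if(copy != NULL)
		memcpy(copy, s, n);
	return copy;
}
```

With a helper like this, each malloc/strcpy pair in the fragment above collapses to a single call such as p[0] = my_strdup("this"), followed by the same check for NULL.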

Here is a code fragment which reads an entire file into memory, using the same kind of ragged array. This code is written in terms of the agetline function from question 7.30.

#include <stdio.h>
#include <stdlib.h>
extern char *agetline(FILE *);
FILE *ifp;

/* assume ifp is open on input file */

char **lines = NULL;
size_t nalloc = 0;
size_t nlines = 0;
char *p;

while((p = agetline(ifp)) != NULL) {
	if(nlines >= nalloc) {
		nalloc += 50;
#ifdef SAFEREALLOC
		lines = realloc(lines, nalloc * sizeof(char *));
#else
		if(lines == NULL)		/* in case pre-ANSI realloc */
			lines = malloc(nalloc * sizeof(char *));
		else	lines = realloc(lines, nalloc * sizeof(char *));
#endif
		if(lines == NULL) {
			fprintf(stderr, "out of memory\n");
			exit(1);
		}
	}

	lines[nlines++] = p;
}
(See the comments on reallocation strategy in question 7.30.)

See also question 6.16.




comp.lang.c FAQ list · Question 20.3

Q: How can I open files mentioned on the command line, and parse option flags?


A: Here is a skeleton which implements a traditional Unix-style argv parse, handling option flags beginning with -, and optional filenames. (The two flags accepted by this example are -a and -b; -b takes an argument.)

#include <stdio.h>
#include <string.h>
#include <errno.h>

int main(int argc, char *argv[])
{
	int argi;
	int aflag = 0;
	char *bval = NULL;

	for(argi = 1; argi < argc && argv[argi][0] == '-'; argi++) {
		char *p;
		for(p = &argv[argi][1]; *p != '\0'; p++) {
			switch(*p) {
			case 'a':
				aflag = 1;
				printf("-a seen\n");
				break;

			case 'b':
				bval = argv[++argi];
				printf("-b seen (\"%s\")\n", bval);
				break;

			default:
				fprintf(stderr,
					"unknown option -%c\n", *p);
			}
		}
	}

	if(argi >= argc) {
		/* no filename arguments; process stdin */
		printf("processing standard input\n");
	} else {
		/* process filename arguments */

		for(; argi < argc; argi++) {
			FILE *ifp = fopen(argv[argi], "r");
			if(ifp == NULL) {
				fprintf(stderr, "can't open %s: %s\n",
					argv[argi], strerror(errno));
				continue;
			}

			printf("processing %s\n", argv[argi]);

			fclose(ifp);
		}
	}

	return 0;
}
(This code assumes that fopen sets errno when it fails, which is not guaranteed, but usually works, and makes error messages much more useful. See also question 20.4.)

There are several canned functions available for doing command line parsing in a standard way; the most popular one is getopt (see also question 18.16). Here is the above example, rewritten to use getopt:

#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>	/* POSIX; declares getopt, optarg, and optind */

int main(int argc, char *argv[])
{
	int aflag = 0;
	char *bval = NULL;
	int c;

	while((c = getopt(argc, argv, "ab:")) != -1)
		switch(c) {
		case 'a':
			aflag = 1;
			printf("-a seen\n");
			break;

		case 'b':
			bval = optarg;
			printf("-b seen (\"%s\")\n", bval);
			break;
	}

	if(optind >= argc) {
		/* no filename arguments; process stdin */
		printf("processing standard input\n");
	} else {
		/* process filename arguments */

		for(; optind < argc; optind++) {
			FILE *ifp = fopen(argv[optind], "r");
			if(ifp == NULL) {
				fprintf(stderr, "can't open %s: %s\n",
					argv[optind], strerror(errno));
				continue;
			}

			printf("processing %s\n", argv[optind]);

			fclose(ifp);
		}
	}

	return 0;
}

The examples above overlook a number of nuances: a lone ``-'' is often taken to mean ``read standard input''; the marker ``--'' often signifies the end of the options (proper versions of getopt do handle this); it's traditional to print a usage message when a command is invoked with improper or missing arguments.
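
Here is a sketch of how the hand-written loop might be extended to cover those three nuances. The function name parse_options, its parameters, and its return convention (index of the first non-option argument, or -1 after printing a usage message) are all invented for this illustration; it is one possible arrangement, not a canonical one.

```c
#include <stdio.h>
#include <string.h>

/* parse_options: hypothetical helper. Accepts -a and -b value, as in
 * the examples above. Returns the index of the first non-option
 * argument, or -1 after printing a usage message. */
int parse_options(int argc, char *argv[], int *aflag, char **bval)
{
	int argi;

	for(argi = 1; argi < argc && argv[argi][0] == '-'; argi++) {
		char *p;

		if(strcmp(argv[argi], "-") == 0)
			break;		/* lone "-": leave it for the caller (stdin) */
		if(strcmp(argv[argi], "--") == 0) {
			argi++;		/* "--": skip it; the options are over */
			break;
		}

		for(p = &argv[argi][1]; *p != '\0'; p++) {
			switch(*p) {
			case 'a':
				*aflag = 1;
				break;

			case 'b':
				if(argi + 1 >= argc)
					goto usage;	/* -b needs an argument */
				*bval = argv[++argi];
				break;

			default:
				goto usage;
			}
		}
	}

	return argi;

usage:
	fprintf(stderr, "usage: %s [-a] [-b value] [file ...]\n", argv[0]);
	return -1;
}
```

The caller would then treat an argument of "-" among the filenames as a request to read standard input, and exit with a failure status when -1 comes back.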

If you're wondering how argv is laid out in memory, it's actually a ``ragged array''; see the picture in question 20.2. See also questions 8.2, 13.7, and 19.20.

References: K&R1 Sec. 5.11 pp. 110-114
K&R2 Sec. 5.10 pp. 114-118
ISO Sec. 5.1.2.2.1
H&S Sec. 20.1 p. 416
PCS Sec. 5.6 pp. 81-2, Sec. 11 p. 159, pp. 339-40 Appendix F
Schumacher, ed., Software Solutions in C Sec. 4 pp. 75-85




comp.lang.c FAQ list · Question 20.4

Q: What's the right way to use errno?


A: In general, you should detect errors by checking return values, and use errno only to distinguish among the various causes of an error, such as ``File not found'' or ``Permission denied''. (Typically, you use perror or strerror to print these discriminating error messages.) It's only necessary to detect errors with errno when a function does not have a unique, unambiguous, out-of-band error return (i.e. when all of its possible return values are valid; one example is strtol, whose overflow returns, LONG_MAX and LONG_MIN, are also legitimate results). In these cases (and in these cases only; check the documentation to be sure whether a function allows this), you can detect errors by setting errno to 0, calling the function, then testing errno. (Setting errno to 0 first is important, as no library function ever does that for you.)
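
As a sketch of the set-errno-to-0 technique, here is how it might look with strtol, which is documented to return LONG_MAX or LONG_MIN and set errno to ERANGE on overflow. The wrapper name parse_long and its 0/-1 return convention are made up for this illustration.

```c
#include <errno.h>
#include <stdlib.h>

/* parse_long: hypothetical wrapper around strtol.
 * Returns 0 and stores the value in *out on success, -1 on failure. */
int parse_long(const char *s, long *out)
{
	char *end;
	long val;

	errno = 0;		/* no library function does this for us */
	val = strtol(s, &end, 10);

	if(end == s)		/* no digits at all (found via end, not errno) */
		return -1;
	if(errno == ERANGE)	/* result was clamped to LONG_MAX or LONG_MIN */
		return -1;

	*out = val;
	return 0;
}
```

Note that the test of errno comes only after the call, and only because strtol has no return value that could unambiguously mean ``overflow''.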

To make error messages useful, they should include all relevant information. Besides the strerror text derived from errno, it may also be appropriate to print the name of the program, the operation which failed (preferably in terms which will be meaningful to the user), the name of the file for which the operation failed, and, if some input file (script or source file) is being read, the name and current line number of that file.

See also question 12.24.

References: ISO Sec. 7.1.4, Sec. 7.9.10.4, Sec. 7.11.6.2
CT&P Sec. 5.4 p. 73
PCS Sec. 11 p. 168, Sec. 14 p. 254




comp.lang.c FAQ list · Question 20.5

Q: How can I write data files which can be read on other machines with different word size, byte order, or floating point formats?


A: The most portable solution is to use text files (usually ASCII), written with fprintf and read with fscanf or the like. (Similar advice also applies to network protocols.) Be skeptical of arguments which imply that text files are too big, or that reading and writing them is too slow. Not only is their efficiency frequently acceptable in practice, but the advantages of being able to interchange them easily between machines, and manipulate them with standard tools, can be overwhelming.
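
A minimal sketch of the text-file approach follows. The function names write_record and read_record and the one-record-per-line layout are assumptions made for this example; the choice of 17 significant digits assumes IEEE 754 doubles, for which that many digits are enough to read back the same value.

```c
#include <stdio.h>

/* write_record and read_record are hypothetical names for this sketch.
 * Each record is one line of text: an integer, a space, and a
 * floating-point value. Both return 0 on success, -1 on failure. */
int write_record(FILE *fp, long id, double value)
{
	/* 17 significant digits: assumed enough to round-trip a double */
	return fprintf(fp, "%ld %.17g\n", id, value) < 0 ? -1 : 0;
}

int read_record(FILE *fp, long *idp, double *valuep)
{
	/* fscanf returns the number of items converted; we need both */
	return fscanf(fp, "%ld %lf", idp, valuep) == 2 ? 0 : -1;
}
```

Because the file contains only characters, neither word size nor byte order nor floating-point layout on the writing machine matters to the reader.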

If you must use a binary format, you can improve portability, and perhaps take advantage of prewritten I/O libraries, by making use of standardized formats such as Sun's XDR (RFC 1014), OSI's ASN.1 (referenced in CCITT X.409 and ISO 8825 ``Basic Encoding Rules''), CDF, netCDF, or HDF. See also questions 2.12, 12.38, and 12.42.

References: PCS Sec. 6 pp. 86, 88




comp.lang.c FAQ list · Question 20.6

Q: If I have a char * variable pointing to the name of a function, how can I call that function? Code like

	extern int func(int, int);
	char *funcname = "func";
	int r = (*funcname)(1, 2);
or
	r = (*(int (*)(int, int))funcname)(1, 2);
doesn't seem to work.


A: By the time a program is running, information about the names of its functions and variables (the ``symbol table'') is no longer needed, and may therefore not be available. The most straightforward thing to do, therefore, is to maintain that information yourself, with a correspondence table of names and function pointers:

int one_func(), two_func();
int red_func(), blue_func();

struct { char *name; int (*funcptr)(); } symtab[] = {
	"one_func",	one_func,
	"two_func",	two_func,
	"red_func",	red_func,
	"blue_func",	blue_func,
};
Then, search the table for the name, and call via the associated function pointer, with code like this:
#include <stddef.h>
#include <string.h>

int (*findfunc(char *name))()
{
	int i;

	for(i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++) {
		if(strcmp(name, symtab[i].name) == 0)
			return symtab[i].funcptr;
	}

	return NULL;
}

...

	char *funcname = "one_func";
	int (*funcp)() = findfunc(funcname);
	if(funcp != NULL)
		(*funcp)();
The callable functions should all have compatible argument and return types. (Ideally, the function pointers would also specify the argument types.)
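
Here is a sketch of what a table with fully specified argument types might look like. The functions add_func and mul_func, the typedef binfunc, and the names typed_symtab and find_binfunc are all hypothetical, chosen only for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callable functions, all taking (int, int), returning int. */
static int add_func(int a, int b) { return a + b; }
static int mul_func(int a, int b) { return a * b; }

/* The prototype lets the compiler check every call made via the table. */
typedef int (*binfunc)(int, int);

static const struct {
	const char *name;
	binfunc funcptr;
} typed_symtab[] = {
	{"add_func",	add_func},
	{"mul_func",	mul_func},
};

/* Returns the function registered under name, or NULL if there is none. */
binfunc find_binfunc(const char *name)
{
	size_t i;

	for(i = 0; i < sizeof(typed_symtab) / sizeof(typed_symtab[0]); i++) {
		if(strcmp(name, typed_symtab[i].name) == 0)
			return typed_symtab[i].funcptr;
	}

	return NULL;
}
```

A call then looks like (*funcp)(6, 7), and passing the wrong number or type of arguments becomes a compile-time error rather than a run-time surprise.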

It is sometimes possible for a program to read its own symbol table if it is still present, but it must first be able to find its own executable (see question 19.31), and it must know how to interpret the symbol table (some Unix C libraries provide an nlist function for this purpose). See also questions 2.15, 18.14, and 19.36.

References: PCS Sec. 11 p. 168




comp.lang.c FAQ list · Question 20.6b

Q: How can I ensure that integer arithmetic doesn't overflow?


A: The usual approach is to test the operands against the limits in the header file <limits.h> before doing the operation. For example, here is a ``careful'' addition function:

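#include <limits.h>
#include <stdio.h>

int
chkadd(int a, int b)
{
	if(b > 0 && a > INT_MAX - b) {
		fputs("int overflow\n", stderr);
		return INT_MAX;
	}
	if(b < 0 && a < INT_MIN - b) {
		fputs("int underflow\n", stderr);
		return INT_MIN;
	}
	return a + b;
}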
See also question 19.39.
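
The same test-before-operating idea carries over to other operations. Here is a sketch for subtraction; the name chksub and its convention of returning a success flag and storing the difference through a pointer are assumptions made for this example.

```c
#include <limits.h>

/* chksub: hypothetical helper. Stores a - b in *result and returns 1,
 * or returns 0 (leaving *result untouched) if a - b would overflow. */
int chksub(int a, int b, int *result)
{
	if(b < 0 && a > INT_MAX + b)	/* a - b would exceed INT_MAX */
		return 0;
	if(b > 0 && a < INT_MIN + b)	/* a - b would fall below INT_MIN */
		return 0;
	*result = a - b;
	return 1;
}
```

Each comparison is arranged so that the expression being tested cannot itself overflow: INT_MAX + b is evaluated only when b is negative, and INT_MIN + b only when b is positive.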





comp.lang.c FAQ list · Question 20.7

Q: How can I manipulate individual bits?


A: Bit manipulation is straightforward in C, and commonly done. To extract (test) a bit, use the bitwise AND (&) operator, along with a bit mask representing the bit(s) you're interested in:

	value & 0x04
To set a bit, use the bitwise OR (| or |=) operator:
	value |= 0x04
To clear a bit, use the bitwise complement (~) and the AND (& or &=) operators:
	value &= ~0x04
(The preceding three examples all manipulate the third-least significant, or 2**2, bit, expressed as the constant bitmask 0x04.)

To manipulate an arbitrary bit, use the shift-left operator (<<) to generate the mask you need:

	value & (1 << bitnumber)
	value |= (1 << bitnumber)
	value &= ~(1 << bitnumber)
Alternatively, you may wish to precompute an array of masks:
	unsigned int masks[] =
		{0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80};

	value & masks[bitnumber]
	value |= masks[bitnumber]
	value &= ~masks[bitnumber]

To avoid surprises involving the sign bit, it is often a good idea to use unsigned integral types in code which manipulates bits and bytes.
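
Putting the pieces together with unsigned types, the operations might be wrapped up as below. The function names are hypothetical, and bitnumber is assumed to be less than the number of bits in an unsigned int.

```c
/* Hypothetical helpers; bitnumber must be less than the width of
 * unsigned int, or the shift is undefined. */
unsigned int set_bit(unsigned int value, unsigned int bitnumber)
{
	return value | (1u << bitnumber);
}

unsigned int clear_bit(unsigned int value, unsigned int bitnumber)
{
	return value & ~(1u << bitnumber);
}

int test_bit(unsigned int value, unsigned int bitnumber)
{
	return (value & (1u << bitnumber)) != 0;	/* always 0 or 1 */
}
```

Writing the constant as 1u keeps the whole computation unsigned, so that shifting into the top bit poses no problem.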

See also questions 9.2 and 20.8.

References: K&R1 Sec. 2.9 pp. 44-45
K&R2 Sec. 2.9 pp. 48-49
ISO Sec. 6.3.3.3, Sec. 6.3.7, Sec. 6.3.10, Sec. 6.3.12
H&S Sec. 7.5.5 p. 197, Sec. 7.6.3 pp. 205-6, Sec. 7.6.6 p. 210




comp.lang.c FAQ list · Question 20.8

Q: How can I implement sets or arrays of bits?


A: Use arrays of char or int, with a few macros to access the desired bit in the proper cell of the array. Here are some simple macros to use with arrays of char:

#include <limits.h>		/* for CHAR_BIT */

#define BITMASK(b) (1 << ((b) % CHAR_BIT))
#define BITSLOT(b) ((b) / CHAR_BIT)
#define BITSET(a, b) ((a)[BITSLOT(b)] |= BITMASK(b))
#define BITCLEAR(a, b) ((a)[BITSLOT(b)] &= ~BITMASK(b))
#define BITTEST(a, b) ((a)[BITSLOT(b)] & BITMASK(b))
#define BITNSLOTS(nb) (((nb) + CHAR_BIT - 1) / CHAR_BIT)
(If you don't have <limits.h>, try using 8 for CHAR_BIT.)

Here are some usage examples. To declare an ``array'' of 47 bits:

	char bitarray[BITNSLOTS(47)];
To set bit number 23 (bits are numbered from 0):
	BITSET(bitarray, 23);
To test bit number 35:
	if(BITTEST(bitarray, 35)) ...
To compute the union of two bit arrays and place it in a third array (with all three arrays declared as above):
	for(i = 0; i < BITNSLOTS(47); i++)
		array3[i] = array1[i] | array2[i];
To compute the intersection, use & instead of |.

As a more realistic example, here is a quick implementation of the Sieve of Eratosthenes, for computing prime numbers:

#include <stdio.h>
#include <string.h>

#define MAX 10000

int main()
{
	char bitarray[BITNSLOTS(MAX)];
	int i, j;

	memset(bitarray, 0, BITNSLOTS(MAX));

	for(i = 2; i < MAX; i++) {
		if(!BITTEST(bitarray, i)) {
			printf("%d\n", i);
			for(j = i + i; j < MAX; j += i)
				BITSET(bitarray, j);
		}
	}
	return 0;
}

See also question 20.7.


References: H&S Sec. 7.6.7 pp. 211-216




comp.lang.c FAQ list · Question 20.9

Q: How can I determine whether a machine's byte order is big-endian or little-endian?


A: The usual techniques are to use a pointer:

	int x = 1;
	if(*(char *)&x == 1)
		printf("little-endian\n");
	else	printf("big-endian\n");
or a union:
	union {
		int i;
		char c[sizeof(int)];
	} x;
	x.i = 1;
	if(x.c[0] == 1)
		printf("little-endian\n");
	else	printf("big-endian\n");

(Note that there are also byte order possibilities beyond simple big-endian and little-endian.)

See also questions 10.16 and 20.9b.

References: H&S Sec. 6.1.2 pp. 163-4




comp.lang.c FAQ list · Question 20.9b

Q: How do I swap bytes?


A: V7 Unix had a swab function, but it seems to have been forgotten.

A problem with explicit byte-swapping code is that you have to decide whether to call it or not, based on the byte order of the data and the byte order of the machine in use. Question 20.9 shows how, but it's a nuisance.

A better solution is to define functions which convert between the known byte order of the data and the (unknown) byte order of the machine in use, and to arrange for these functions to be no-ops on those machines which already match the desired byte order. A set of such functions, introduced with the BSD networking code but now in wide use, is ntohs, htons, ntohl, and htonl. These are intended to convert between ``network'' and ``host'' byte orders, for ``short'' or ``long'' integers, where ``network'' order is always big-endian, and where ``short'' integers are always 16 bits and ``long'' integers are 32 bits. (This is not the C definition, of course, but it's compatible with the C definition; see question 1.1.) So if you know that the data you want to convert from or to is big-endian, you can use these functions. (The point is that you always call the functions, making your code much cleaner. Each function either swaps bytes if it has to, or does nothing. The decision to swap or not to swap gets made once, when the functions are implemented for a particular machine, rather than being made many times in many different calling programs.)
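
As an illustration, here is a sketch which stores and retrieves a 32-bit value in big-endian order using htonl and ntohl. The helper names put_be32 and get_be32 are invented for this example, and the header <arpa/inet.h> is where POSIX systems declare these functions; other systems may put them elsewhere.

```c
#include <arpa/inet.h>	/* POSIX home of htonl and ntohl (an assumption) */
#include <stdint.h>
#include <string.h>

/* put_be32: hypothetical helper; writes value to buf[0..3], big-endian. */
void put_be32(unsigned char *buf, uint32_t value)
{
	uint32_t net = htonl(value);	/* a no-op on big-endian hosts */
	memcpy(buf, &net, sizeof net);
}

/* get_be32: hypothetical helper; reads a big-endian value from buf[0..3]. */
uint32_t get_be32(const unsigned char *buf)
{
	uint32_t net;

	memcpy(&net, buf, sizeof net);
	return ntohl(net);
}
```

The calling code never asks which byte order the machine has; it simply always calls the conversion functions.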

If you do have to write your own byte-swapping code, the two obvious approaches are again to use pointers or unions, as in question 20.9. Here is an example using pointers:

void byteswap(char *ptr, int nwords)
{
	char *p = ptr;
	while(nwords-- > 0) {
		char tmp = *p;
		*p = *(p + 1);
		*(p + 1) = tmp;
		p += 2;
	}
}

And here is one using unions:

union word
	{
	short int word;
	char halves[2];
	};

void byteswap(char *ptr, int nwords)
{
	register union word *wp = (union word *)ptr;
	while(nwords-- > 0) {
		char tmp = wp->halves[0];
		wp->halves[0] = wp->halves[1];
		wp->halves[1] = tmp;
		wp++;
	}
}

These functions swap two-byte quantities; the extension to four or more bytes should be obvious. The union-using code is imperfect in that it assumes that the passed-in pointer is word-aligned. It would also be possible to write functions accepting separate source and destination pointers, or accepting single words and returning the swapped values.
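
For instance, a function accepting a single four-byte quantity and returning the swapped value might be sketched as follows. The name swap32 is hypothetical, and the argument is assumed to hold a 32-bit value (any higher bits of a wider unsigned long are discarded).

```c
/* swap32: hypothetical helper. Reverses the order of the low four
 * bytes of v. It works on values, not memory, so neither the host's
 * byte order nor pointer alignment matters. */
unsigned long swap32(unsigned long v)
{
	return ((v & 0x000000ffUL) << 24) |
	       ((v & 0x0000ff00UL) <<  8) |
	       ((v & 0x00ff0000UL) >>  8) |
	       ((v & 0xff000000UL) >> 24);
}
```

Since it uses only shifts and masks, this version also sidesteps the alignment assumption of the union-based code.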

References: PCS Sec. 11 p. 179




comp.lang.c FAQ list · Question 20.10

Q: How can I convert integers to binary or hexadecimal?


A: Make sure you really know what you're asking. Integers are stored internally in binary, although for most purposes it is not incorrect to think of them as being in octal, decimal, or hexadecimal, whichever is convenient. The base in which a number is expressed matters only when that number is read in from or written out to the outside world, either in the form of a source code constant or in the form of I/O performed by a program.

In source code, a non-decimal base is indicated by a leading 0 or 0x (for octal or hexadecimal, respectively). During I/O, the base of a formatted number is controlled in the printf and scanf family of functions by the choice of format specifier (%d, %o, %x, etc.) and in the strtol and strtoul functions by the third argument. During binary I/O, however, the base again becomes immaterial: if numbers are being read or written as individual bytes (typically with getc or putc), or as multi-byte words (typically with fread or fwrite), it is meaningless to ask what ``base'' they are in.
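
To make this concrete, here is a small sketch in which one and the same number is written in three bases and read back in several. The helper names show_bases and parse_in_base are made up for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* show_bases: hypothetical helper. Formats n in octal, decimal, and
 * hexadecimal; returns what snprintf returns (the formatted length). */
int show_bases(char *buf, size_t size, unsigned int n)
{
	return snprintf(buf, size, "%o %u %x", n, n, n);
}

/* parse_in_base: hypothetical helper. The third argument of strtol
 * selects the base; 0 means "decide from a leading 0 or 0x". */
long parse_in_base(const char *s, int base)
{
	return strtol(s, NULL, base);
}
```

The value stored in n never changes; only its external representation does.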

If what you need is formatted binary conversion, it's easy enough to do. Here is a little function for formatting a number in a requested base:

char *
baseconv(unsigned int num, int base)
{
	static char retbuf[33];
	char *p;

	if(base < 2 || base > 16)
		return NULL;

	p = &retbuf[sizeof(retbuf)-1];
	*p = '\0';

	do {
		*--p = "0123456789abcdef"[num % base];
		num /= base;
	} while(num != 0);

	return p;
}
(Note that this function, as written, returns a pointer to static data, such that only one of its return values can be used at a time; see question 7.5a. A better size for the retbuf array would be sizeof(int)*CHAR_BIT+1; see question 12.21.)

For more information about ``binary'' I/O, see questions 2.11, 12.37, and 12.42. See also questions 8.6 and 13.1.

Additional links: A long reply I sent to someone who was asking how to write a ``binary to decimal'' conversion function

References: ISO Secs. 7.10.1.5, 7.10.1.6




comp.lang.c FAQ list · Question 20.11

Q: Can I use base-2 constants (something like 0b101010)?
Is there a printf format for binary?


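A: Not in C as standardized through C17, on either count, although there are various preprocessor tricks you can try (see the links below). (C23 adds both 0b constants and a %b conversion for printf, so check what your compiler and library support.) You can convert base-2 string representations to integers with strtol. If you need to print numbers out in base 2, see the example code in question 20.10.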

Additional links:

example by Jack Klein

preprocessor trick by Karl Heuer

prettier preprocessor trick by Bill Finke




comp.lang.c FAQ list · Question 20.12

Q: What is the most efficient way to count the number of bits which are set in an integer?


A: Many ``bit-fiddling'' problems like this one can be sped up and streamlined using lookup tables (but see question 20.13). Here is a little function which counts the set bits in a value, 4 bits at a time:

static int bitcounts[] =
	{0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4};

int bitcount(unsigned int u)
{
	int n = 0;

	for(; u != 0; u >>= 4)
		n += bitcounts[u & 0x0f];

	return n;
}



comp.lang.c FAQ list · Question 20.13

Q: What's the best way of making my program efficient?


A: By picking good algorithms, implementing them carefully, and making sure that your program isn't doing any extra work. For example, the most microoptimized character-copying loop in the world will be beaten by code which avoids having to copy characters at all.

When worrying about efficiency, it's important to keep several things in perspective. First of all, although efficiency is an enormously popular topic, it is not always as important as people tend to think it is. Most of the code in most programs is not time-critical. When code is not time-critical, it is usually more important that it be written clearly and portably than that it be written maximally efficiently. (Remember that computers are very, very fast, and that seemingly ``inefficient'' code may be quite efficiently compilable, and run without apparent delay.)

It is notoriously difficult to predict what the ``hot spots'' in a program will be. When efficiency is a concern, it is important to use profiling software to determine which parts of the program deserve attention. Often, actual computation time is swamped by peripheral tasks such as I/O and memory allocation, which can be sped up by using buffering and caching techniques.

Even for code that is time-critical, one of the least effective optimization techniques is to fuss with the coding details. Many of the ``efficient coding tricks'' which are frequently suggested are performed automatically by even simpleminded compilers. Heavyhanded optimization attempts can make code so bulky that performance is actually degraded, by increasing the number of page faults or by overflowing instruction caches or pipelines. Furthermore, optimization tricks are rarely portable (i.e. they may speed things up on one machine but slow them down on another). In any case, tweaking the coding usually results in at best linear performance improvements; the big payoffs are in better algorithms.

If the performance of your code is so important that you are willing to invest programming time in source-level optimizations, make sure that you are using the best optimizing compiler you can afford. (Compilers, even mediocre ones, can perform optimizations that are impossible at the source level.)

When efficiency is truly important, the best algorithm has been chosen, and even the coding details matter, the following suggestions may be useful. (These are mentioned merely to keep followups down; appearance here does not necessarily constitute endorsement by the author. Note that several of these techniques cut both ways, and may make things worse.)

  1. Sprinkle the code liberally with register declarations for oft-used variables; place them in inner blocks, if applicable. (On the other hand, most modern compilers ignore register declarations, on the assumption that they can perform register analysis and assignment better than the programmer can.)
  2. Check the algorithm carefully. Exploit symmetries where possible to reduce the number of explicit cases.
  3. Examine the control flow: make sure that common cases are checked for first, and handled more easily. If one side of an expression involving && or || will usually determine the outcome, make it the left-hand side, if possible. (See also question 3.6.)
  4. Use memcpy instead of memmove, if appropriate (see question 11.25).
  5. Use machine- and vendor-specific routines and #pragmas.
  6. Manually place common subexpressions in temporary variables. (Good compilers do this for you.)
  7. Move critical, inner-loop code out of functions and into macros or in-line functions (and out of the loop, if invariant). If the termination condition of a loop is a complex but loop-invariant expression, precompute it and place it in a temporary variable. (Good compilers do these for you.)
  8. Change recursion to iteration, if possible.
  9. Unroll small loops.
  10. Discover whether while, for, or do/while loops produce the best code under your compiler, and whether incrementing or decrementing the loop control variable works best.
  11. Remove goto statements--some compilers can't optimize as well in their presence.
  12. Use pointers rather than array subscripts to step through arrays (but see question 20.14).
  13. Reduce precision. (Using float instead of double may result in faster, single-precision arithmetic under an ANSI compiler, though older compilers convert everything to double, so using float can also be slower.) Replace time-consuming trigonometric and logarithmic functions with your own, tailored to the range and precision you need, and perhaps using table lookup. (Be sure to give your versions different names; see question 1.29.)
  14. Cache or precompute tables of frequently-needed values. (See also question 20.12.)
  15. Use standard library functions in preference to your own. (Sometimes the compiler inlines or specially optimizes its own functions.) On the other hand, if your program's calling patterns are particularly regular, your own special-purpose implementation may be able to beat the library's general-purpose version. (Again, if you do write your own version, give it a different name.)
  16. As a last, last resort, hand-code critical routines in assembly language (or hand-tune the compiler's assembly language output). Use asm directives, if possible.

Here are some things not to worry about:

  1. whether i++ is faster than i = i + 1
  2. whether i << 1 (or i >> 1, or i & 1) is faster than i * 2 (respectively i / 2, i % 2).

(These are examples of optimizations which compilers regularly perform for you; see questions 20.14 and 20.15.)

It is not the intent here to suggest that efficiency can be completely ignored. Most of the time, however, by simply paying attention to good algorithm choices, implementing them cleanly, and avoiding obviously inefficient blunders (e.g. ending up with an O(n**3) implementation of an O(n**2) algorithm), perfectly acceptable results can be achieved.

For more discussion of efficiency tradeoffs, as well as good advice on how to improve efficiency when it is important, see chapter 7 of Kernighan and Plauger's The Elements of Programming Style, and Jon Bentley's Writing Efficient Programs.

See also question 17.11.




comp.lang.c FAQ list · Question 20.14

Q: Are pointers really faster than arrays? How much do function calls slow things down? Is ++i faster than i = i + 1?


A: Precise answers to these and many similar questions depend of course on the processor and compiler in use. If you simply must know, you'll have to time test programs carefully. (Often the differences are so slight that hundreds of thousands of iterations are required even to see them. Check the compiler's assembly language output, if available, to see if two purported alternatives aren't compiled identically.)

For conventional machines, it is usually faster to march through large arrays with pointers rather than array subscripts, but for some processors the reverse is true. (Better compilers should generate good code regardless of which notation you use, though it's arguably easier for a compiler to convert array indices to pointers than vice versa.)

Function calls, though obviously incrementally slower than in-line code, contribute so much to modularity and code clarity that there is rarely good reason to avoid them. (Actually, by reducing bulk, functions can improve performance.) Also, some compilers are able to expand small, critical-path functions in-line, either as an optimization or at the programmer's request.

Before rearranging expressions such as i = i + 1, remember that you are dealing with a compiler, not a keystroke-programmable calculator. Any decent compiler will generate identical code for ++i, i += 1, and i = i + 1. The reasons for using ++i or i += 1 over i = i + 1 have to do with style, not efficiency. (See also question 3.12b.)




comp.lang.c FAQ list · Question 20.15

Q: I've been replacing multiplications and divisions with shift operators, because shifting is more efficient.


A: This is an excellent example of a potentially risky and usually unnecessary optimization. Any compiler worthy of the name can replace a constant, power-of-two multiplication with a left shift, or a similar division of an unsigned quantity with a right shift. (Ritchie's original PDP-11 compiler, though it ran in less than 64K of memory and omitted several features now considered mandatory, performed both of these optimizations, without even turning on its optional optimization pass.) Furthermore, a compiler will make these optimizations only when they're correct; many programmers overlook the fact that shifting a negative value to the right is not equivalent to division. (Therefore, when you need to make sure that these optimizations are performed, you may have to declare relevant variables as unsigned.)




comp.lang.c FAQ list · Question 20.15b

Q: People claim that optimizing compilers are good and that we no longer have to write things in assembler for speed, but my compiler can't even replace i/=2 with a shift.


A: Was i signed or unsigned? If it was signed, a shift is not equivalent (hint: think about the result if i is negative and odd), so the compiler was correct not to use it.
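
The difference can be seen in a sketch like the one below, whose function names are hypothetical. Division of a signed value truncates toward zero (guaranteed since C99), so -7 / 2 is -3, whereas the result of -7 >> 1 is implementation-defined, and is -4 on the common machines which shift arithmetically.

```c
/* half_signed: hypothetical helper. For a negative, odd i the result
 * of i / 2 differs from what an arithmetic right shift would give,
 * so the compiler cannot simply emit a shift. */
int half_signed(int i)
{
	return i / 2;
}

/* half_unsigned: hypothetical helper. For unsigned values, u / 2 and
 * u >> 1 always agree, so a shift is a valid replacement. */
unsigned int half_unsigned(unsigned int u)
{
	return u >> 1;
}
```

For the signed case a compiler can still avoid a divide instruction, but it has to add a small correction for negative values rather than use a bare shift.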




comp.lang.c FAQ list · Question 20.15c

Q: How can I swap two values without using a temporary?


A: The standard hoary old assembly language programmer's trick is:

a ^= b;
b ^= a;
a ^= b;
But this sort of code has little place in modern, HLL programming. Temporary variables are essentially free, and the idiomatic code using three assignments, namely
	int t = a;
	a = b;
	b = t;
is not only clearer to the human reader, it is more likely to be recognized by the compiler and turned into the most-efficient code (e.g. perhaps even using an EXCH instruction). The latter code is obviously also amenable to use with pointers and floating-point values, unlike the XOR trick. See also questions 3.3b and 10.3.





comp.lang.c FAQ list · Question 20.16

Q: Which is more efficient, a switch statement or an if/else chain?


A: The differences, if any, are likely to be slight. The switch statement was designed to be efficiently implementable, though the compiler may choose to use the equivalent of an if/else chain (as opposed to a compact jump table) if the case labels are sparsely distributed.

Do use switch when you can: it's certainly cleaner, and perhaps more efficient (and certainly should never be any less efficient).

See also questions 20.17 and 20.18.




comp.lang.c FAQ list · Question 20.17

Q: Is there a way to switch on strings?


A: Not directly. Sometimes, it's appropriate to use a separate function to map strings to integer codes, and then switch on those:

#define CODE_APPLE	1
#define CODE_ORANGE	2
#define CODE_NONE	0

switch(classifyfunc(string)) {
	case CODE_APPLE:
		...

	case CODE_ORANGE:
		...

	case CODE_NONE:
		...
}
where classifyfunc looks something like
#include <string.h>

static struct lookuptab {
	char *string;
	int code;
} tab[] = {
	{"apple",	CODE_APPLE},
	{"orange",	CODE_ORANGE},
};

int classifyfunc(char *string)
{
	int i;
	for(i = 0; i < sizeof(tab) / sizeof(tab[0]); i++)
		if(strcmp(tab[i].string, string) == 0)
			return tab[i].code;

	return CODE_NONE;
}

Otherwise, of course, you can fall back on a conventional if/else chain:

	if(strcmp(string, "apple") == 0) {
		...
	} else if(strcmp(string, "orange") == 0) {
		...
	}
(A macro like Streq() from question 17.3 can make these comparisons a bit more convenient.)

See also questions 10.12, 20.16, 20.18, and 20.29.

References: K&R1 Sec. 3.4 p. 55
K&R2 Sec. 3.4 p. 58
ISO Sec. 6.6.4.2
H&S Sec. 8.7 p. 248




comp.lang.c FAQ list · Question 20.18

Q: Is there a way to have non-constant case labels (i.e. ranges or arbitrary expressions)?


A: No. The switch statement was originally designed to be quite simple for the compiler to translate, so case labels are limited to single, constant, integral expressions. You can attach several case labels to the same statement, which will let you cover a small range if you don't mind listing all cases explicitly.

If you want to select on arbitrary ranges or non-constant expressions, you'll have to use an if/else chain.
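
Both approaches can be combined. In the following sketch (the function classify and its categories are invented for illustration), a small range is covered by listing its case labels, and a larger one by an ordinary if/else test:

```c
/* Hypothetical classifier, for illustration only. */
const char *classify(int n)
{
	switch(n) {
	case 1:
	case 2:
	case 3:
		return "small";		/* several labels, one statement */
	default:
		break;
	}

	if(n >= 4 && n <= 1000)		/* too many values to list */
		return "medium";
	else
		return "other";
}
```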

See also questions 20.16 and 20.17.

References: K&R1 Sec. 3.4 p. 55
K&R2 Sec. 3.4 p. 58
ISO Sec. 6.6.4.2
Rationale Sec. 3.6.4.2
H&S Sec. 8.7 p. 248




comp.lang.c FAQ list · Question 20.19

Q: Are the outer parentheses in return statements really optional?


A: Yes.

Long ago, in the early days of C, they were required, and just enough people learned C then, and wrote code which is still in circulation, that the notion that they might still be required is widespread.

(As it happens, parentheses are optional with the sizeof operator, too, under certain circumstances.)
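
To spell out those circumstances: sizeof requires parentheses only when its operand is a type name; when the operand is an expression, they may be omitted. A small sketch (the function and variable names are arbitrary):

```c
#include <stddef.h>

/* Returns 1 if the parenthesized and unparenthesized forms agree. */
int sizeof_forms_agree(void)
{
	double d = 0;
	int a[10];

	size_t s1 = sizeof d;		/* operand is an expression: no parentheses needed */
	size_t s2 = sizeof(double);	/* operand is a type name: parentheses required */
	size_t n = sizeof a / sizeof a[0];	/* number of elements in a */

	return s1 == s2 && n == 10 && sizeof(d) == sizeof d;
}
```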

References: K&R1 Sec. A18.3 p. 218
ISO Sec. 6.3.3, Sec. 6.6.6
H&S Sec. 8.9 p. 254




comp.lang.c FAQ list · Question 20.20

Q: Why don't C comments nest? How am I supposed to comment out code containing comments? Are comments legal inside quoted strings?


A: C comments don't nest mostly because PL/I's comments, which C's are borrowed from, don't either. Therefore, it is usually better to ``comment out'' large sections of code, which might contain comments, with #ifdef or #if 0 (but see question 11.19).

The character sequences /* and */ are not special within double-quoted strings, and do not therefore introduce comments, because a program (particularly one which is generating C code as output) might want to print them. (It is hard to imagine why anyone would want or need to place a comment inside a quoted string. It is easy to imagine a program needing to print "/*".)

Note also that // comments have only become legal in C as of C99.
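
As a minimal sketch of the #if 0 technique (the function answer is made up for the purpose), here is a disabled section which itself contains a comment:

```c
int answer(void)
{
	int x = 42;
#if 0
	/* This block contains a comment of its own, which is fine:
	   the preprocessor discards everything up to the #endif. */
	x = 0;
#endif
	return x;
}
```

Because the assignment x = 0 is never compiled, the function returns the value x was initialized with.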

References: K&R1 Sec. A2.1 p. 179
K&R2 Sec. A2.2 p. 192
ISO Sec. 6.1.9, Annex F
Rationale Sec. 3.1.9
H&S Sec. 2.2 pp. 18-9
PCS Sec. 10 p. 130




comp.lang.c FAQ list · Question 20.20b

Q: Why isn't there a numbered, multi-level break statement to break out of several loops at once? What am I supposed to use instead, a goto?


A: First, remember why it is that break and continue exist at all--they are, in effect, ``structured gotos'' used in preference to goto (and accepted as alternatives by most of those who shun goto) because they are clean and structured and pretty much restricted to a few common, idiomatic usages. A hypothetical multi-level break, on the other hand, would rapidly lose the inherent cleanliness of the single break--programmers and readers of code would have to carefully count nesting levels to figure out what a given break did, and the insertion of a new intermediately-nested loop could, er, break things badly. (By this analysis, a numbered break statement can be even more confusing and error-prone than a goto/label pair.)

The right way to break out of several loops at once (which C also does not have) involves a syntax which allows the naming of loops, so that a break statement can specify the name of the loop to be broken out of.

If you do have to break out of more than one loop at once (or break out of a loop from inside a switch, where break would merely end the current case), yes, go ahead and use a goto. (But when you find the need for a multi-level break, it's often a sign that the loop should be broken out to its own function, at which point you can achieve roughly the same effect as that multi-level break by using a premature return.)
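
Here is a sketch of the goto approach. The function find, its fixed row width of 3, and its parameter names are all invented for illustration:

```c
/* Hypothetical search routine, for illustration only.
   Looks for target in a rows x 3 matrix; returns 1 and sets
   *rp and *cp if found, 0 otherwise. */
int find(int m[][3], int rows, int target, int *rp, int *cp)
{
	int r, c;
	int found = 0;

	for(r = 0; r < rows; r++) {
		for(c = 0; c < 3; c++) {
			if(m[r][c] == target) {
				found = 1;
				goto done;	/* leave both loops at once */
			}
		}
	}
done:
	if(found) {
		*rp = r;
		*cp = c;
	}
	return found;
}
```

Since the loops already live in their own function here, the goto could just as well be replaced by storing r and c and returning 1 on the spot.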




comp.lang.c FAQ list · Question 20.21

Q: There seem to be a few missing operators, like ^^, &&=, and ->=.


A: A logical exclusive-or operator (hypothetically ``^^'') would be nice, but it couldn't possibly have short-circuiting behavior analogous to && and || (see question 3.6). Similarly, it's not clear how short-circuiting would apply to hypothetical assignment operators &&= and ||=. (It's also not clear how often &&= and ||= would actually be needed.)

Though p = p->next is an extremely common idiom for traversing a linked list, -> is not a binary arithmetic operator. A hypothetical ->= operator therefore wouldn't really fit the pattern of the other assignment operators.

You can write an exclusive-or macro in several ways:

	#define XOR(a, b) ((a) && !(b) || !(a) && (b))	/* 1 */
	#define XOR(a, b) (!!(a) ^ !!(b))		/* 2 */
	#define XOR(a, b) (!!(a) != !!(b))		/* 3 */
	#define XOR(a, b) (!(a) ^ !(b))			/* 4 */
	#define XOR(a, b) (!(a) != !(b))		/* 5 */
	#define XOR(a, b) ((a) ? !(b) : !!(b))		/* 6 */
The first is straight from the definition, but is poor because it may evaluate its arguments multiple times (see question 10.1). The second and third ``normalize'' their operands [footnote] to strict 0/1 by negating them twice--the second then applies bitwise exclusive or (to the single remaining bit); the third one implements exclusive-or as !=. The fourth and fifth are based on an elementary identity in Boolean algebra, namely that
          _     _
a (+) b = a (+) b
(where (+) is exclusive-or and an overbar indicates negation). Finally, the sixth one, suggested by Lawrence Kirby and Dan Pop, uses the ?: operator to guarantee a sequence point between the two operands, as for && and ||. (There is still no ``short circuiting'' behavior, though, nor can there be.)
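
To see why the normalization matters, consider operands which are nonzero but not 1: the bitwise expression 2 ^ 5 is 7, which is ``true,'' although a logical exclusive-or of two true values ought to be false. A short check of the third definition (renamed LXOR here, our name, merely to keep it distinct from the list above):

```c
#define LXOR(a, b) (!!(a) != !!(b))	/* definition 3 above */

/* Returns 1 if LXOR agrees with the truth table for exclusive-or. */
int lxor_ok(void)
{
	return LXOR(0, 0) == 0
	    && LXOR(0, 5) == 1
	    && LXOR(2, 0) == 1
	    && LXOR(2, 5) == 0;	/* both operands true, so the result is false */
}
```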

Additional links: A definitive answer from Dennis Ritchie about ^^




comp.lang.c FAQ list · Question 20.21a

Q: Does C have circular shift operators?


A: No. (Part of the reason why is that the sizes of C's types aren't precisely defined--see question 1.2--but a circular shift makes most sense when applied to a word of a particular known size.)

You can implement a circular shift using two regular shifts and a bitwise OR:

	(x << 13) | (x >> 3)	/* circular shift left 13 in 16 bits */
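
Note that the fragment above behaves as a 16-bit rotate only if x is an unsigned 16-bit quantity and the result is truncated back to 16 bits. A sketch which makes those assumptions explicit, using the (optional) C99 type uint16_t from <stdint.h>; the function name rotl16 is ours:

```c
#include <stdint.h>

/* Rotate a 16-bit value left by n bits (n is reduced modulo 16). */
uint16_t rotl16(uint16_t x, unsigned n)
{
	unsigned long w = x;	/* at least 32 bits, so no bits are lost */
	n %= 16;
	return (uint16_t)((w << n) | (w >> (16 - n)));
}
```

With this definition, rotl16(x, 13) corresponds to the expression shown above.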



comp.lang.c FAQ list · Question 20.21b

Q: Is C a great language, or what? Where else could you write something like a+++++b ?


A: Well, you can't meaningfully write it in C, either. The rule for lexical analysis is that at each point during a straightforward left-to-right scan, the longest possible token is determined, without regard to whether the resulting sequence of tokens makes sense. The fragment in the question is therefore interpreted as

a ++ ++ + b
and cannot be parsed as a valid expression.
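
If the intent was to add a++ and ++b, white space (or parentheses) will separate the tokens as desired. A minimal sketch, with an arbitrary function name:

```c
int sum_with_spaces(void)
{
	int a = 1, b = 2;
	return a++ + ++b;	/* tokens: a ++ + ++ b; adds 1 and 3 */
}
```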

References: K&R1 Sec. A2 p. 179
K&R2 Sec. A2.1 p. 192
ISO Sec. 6.1
H&S Sec. 2.3 pp. 19-20




comp.lang.c FAQ list · Question 20.22

Q: If the assignment operator were :=, wouldn't it then be harder to accidentally write things like if(a = b) ?


A: Yes, but it would also be just a little bit more cumbersome to type all of the assignment statements which a typical program contains.

In any case, it's really too late to be worrying about this sort of thing now. The choices of = for assignment and == for comparison were made, rightly or wrongly, over two decades ago, and are not likely to be changed. (With respect to the question, many compilers and versions of lint will warn about if(a = b) and similar expressions; see also question 17.4.)

As a point of historical interest, the choices were made based on the observation that assignment is more frequent than comparison, and so deserves fewer keystrokes. In fact, using = for assignment in C and its predecessor B represented a change from B's own predecessor BCPL, which did use := as its assignment operator. (See also question 20.38.)




comp.lang.c FAQ list · Question 20.23

Q: Does C have an equivalent to Pascal's with statement?


A: No. The way in C to get quick and easy access to the fields of a structure is to declare a little local structure pointer variable (which, it must be admitted, is not quite as notationally convenient as a with statement and doesn't save quite as many keystrokes, though it is probably safer). That is, if you have something unwieldy like

	structarray[complex_expression].a =
		structarray[complex_expression].b +
			structarray[complex_expression].c;
you can replace it with
	struct whatever *p = &structarray[complex_expression];
	p->a = p->b + p->c;



comp.lang.c FAQ list · Question 20.24

Q: Why doesn't C have nested functions?


A: It's not trivial to implement nested functions such that they have the proper access to local variables in the containing function(s), so they were deliberately left out of C as a simplification. (gcc does allow them, as an extension.) For many potential uses of nested functions (e.g. qsort comparison functions), an adequate if slightly cumbersome solution is to use an adjacent function with static declaration, communicating if necessary via a few static variables. (A cleaner solution, though unsupported by qsort, is to pass around a pointer to a structure containing the necessary context.)
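
The static-variable workaround for qsort might look something like this. (The names descending, cmp_int, and sort_ints are invented for this sketch.)

```c
#include <stdlib.h>

/* File-scope context shared with the comparison function. */
static int descending;	/* nonzero: sort largest first */

static int cmp_int(const void *p1, const void *p2)
{
	int a = *(const int *)p1;
	int b = *(const int *)p2;
	int r = (a > b) - (a < b);	/* -1, 0, or 1, without overflow */
	return descending ? -r : r;
}

/* Hypothetical wrapper: sort n ints in either direction. */
void sort_ints(int *v, size_t n, int desc)
{
	descending = desc;
	qsort(v, n, sizeof v[0], cmp_int);
}
```

The price of this arrangement is that sort_ints is not reentrant: two simultaneous sorts wanting different directions would interfere with each other through the shared variable.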




comp.lang.c FAQ list · Question 20.24b

Q: What is assert() and when would I use it?


A: It is a macro, defined in <assert.h>, for testing ``assertions''. An assertion essentially documents an assumption being made by the programmer, an assumption which, if violated, would indicate a serious programming error. For example, a function which was supposed to be called with a non-null pointer could write

	assert(p != NULL);
A failed assertion terminates the program. Assertions should not be used to catch expected errors, such as malloc or fopen failures.
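
In context, that might look like the following sketch (count_spaces is a made-up function whose contract is that its argument is never null):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function: the caller must pass a non-null string. */
size_t count_spaces(const char *p)
{
	size_t n = 0;

	assert(p != NULL);	/* a null p is a bug in the caller */
	for(; *p != '\0'; p++)
		if(*p == ' ')
			n++;
	return n;
}
```

If the macro NDEBUG is defined before <assert.h> is included, assert expands to nothing, so the asserted expression should never have side effects which the program depends on.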

References: K&R2 Sec. B6 pp. 253-4
ISO Sec. 7.2
H&S Sec. 19.1 p. 406




comp.lang.c FAQ list · Question 20.25

Q: How can I call FORTRAN (C++, BASIC, Pascal, Ada, LISP) functions from C? (And vice versa?)


A: The answer is entirely dependent on the machine and the specific calling sequences of the various compilers in use, and may not be possible at all. Read your compiler documentation very carefully; sometimes there is a ``mixed-language programming guide,'' although the techniques for passing arguments and ensuring correct run-time startup are often arcane. Besides arranging calling sequences correctly, you may also have to conspire between the various languages to get aggregate data structures declared compatibly.

For FORTRAN, more information may be found in FORT.gz by Glenn Geers, available via anonymous ftp from suphys.physics.su.oz.au in the src directory. Burkhard Burow's header file cfortran.h simplifies C/FORTRAN interfacing on many popular machines. It is available via anonymous ftp from zebra.desy.de or at http://www-zeus.desy.de/~burow.

In C++, declaring an external function with the extern "C" linkage specification indicates that the function is to be called using C calling conventions.
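
A header meant to be included from both languages commonly guards the C++-only syntax with the predefined macro __cplusplus. In this sketch, the header name mylib.h and the function mylib_add are hypothetical:

```c
/* A hypothetical header, mylib.h, usable from both C and C++. */
#ifdef __cplusplus
extern "C" {
#endif

int mylib_add(int a, int b);

#ifdef __cplusplus
}
#endif

/* The definition, compiled as C. */
int mylib_add(int a, int b)
{
	return a + b;
}
```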

In Ada, you can use the Export and Convention pragmas, and types from the package Interfaces.C, to arrange for C-compatible calls, parameters, and data structures.

References: H&S Sec. 4.9.8 pp. 106-7




comp.lang.c FAQ list · Question 20.26

Q: Does anyone know of a program for converting Pascal or FORTRAN (or LISP, Ada, awk, ``Old'' C, ...) to C?


A: Several freely distributable programs are available:

The following companies sell various translation tools and services:

See also questions 11.31 and 18.16.




comp.lang.c FAQ list · Question 20.27

Q: Is C++ a superset of C? What are the differences between C and C++? Can I use a C++ compiler to compile C code?


A: C++ was derived from C, and is largely based on it, but there are some legal C constructs which are not legal C++. Conversely, ANSI C inherited several features from C++, including prototypes and const, so neither language is really a subset or superset of the other; the two also define the meaning of some common constructs differently.

The most important feature of C++ not found in C is of course the extended structure known as a class, which, along with operator overloading, makes object-oriented programming convenient. There are several other differences and new features: variables may be declared anywhere in a block; const variables may be true compile-time constants; structure tags are implicitly typedeffed; an & in a parameter declaration requests pass by reference; and the new and delete operators, along with per-object constructors and destructors, simplify dynamic data structure management. There are a host of mechanisms tied up with classes and object-oriented programming: inheritance, friends, virtual functions, templates, etc. (This list of C++ features is not intended to be complete; C++ programmers will notice many omissions.)

Some features of C which keep it from being a strict subset of C++ (that is, which keep C programs from necessarily being acceptable to C++ compilers) are that main may be called recursively, character constants are of type int, prototypes are not required, and void * implicitly converts to other pointer types. Also, every keyword in C++ which is not a keyword in C is available in C as an identifier; C programs which use words like class and friend as ordinary identifiers will be rejected by C++ compilers.
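
A few of these points can be seen in one small function, which is valid C but would be rejected by a C++ compiler. (The function name is arbitrary.)

```c
#include <stdlib.h>

/* Returns 1 when compiled as C; a C++ compiler rejects it. */
int looks_like_c(void)
{
	int class = 1;			/* class is not a keyword in C */
	char *p = malloc(10);		/* void * converts implicitly in C */
	int same = sizeof('a') == sizeof(int);	/* 'a' has type int in C */

	free(p);
	return class && same;
}
```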

In spite of the differences, many C programs will compile correctly in a C++ environment, and many recent compilers offer both C and C++ compilation modes. (But it's usually a bad idea to compile straight C code as if it were C++; the languages are different enough that you'll generally get poor results.)

See also questions 8.9 and 20.20.

Additional links:

Bjarne Stroustrup's answer on the subset/superset question

an article by Richard Stamp listing some differences

an article by ``Noone Really'' listing some more

References: H&S p. xviii, Sec. 1.1.5 p. 6, Sec. 2.8 pp. 36-7, Sec. 4.9 pp. 104-107




comp.lang.c FAQ list · Question 20.28

Q: I need a sort of an ``approximate'' strcmp routine, for comparing two strings for close, but not necessarily exact, equality.


A: Some nice information and algorithms having to do with approximate string matching, as well as a useful bibliography, can be found in Sun Wu and Udi Manber's paper ``AGREP--A Fast Approximate Pattern-Matching Tool.''

Another approach involves the ``soundex'' algorithm, which maps similar-sounding words to the same codes. Soundex was designed for discovering similar-sounding names (for telephone directory assistance, as it happens), but it can be pressed into service for processing arbitrary words.
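
What follows is a simplified sketch of one common variant of the algorithm (often called American Soundex), not a definitive implementation. It assumes that the letters a-z are contiguous in the character set, as they are in ASCII, and the function names are ours. The code keeps the first letter, encodes the following consonants as digits, skips repeated codes, and pads the result with zeros to four characters:

```c
#include <ctype.h>
#include <string.h>

/* Map a letter to its soundex digit; '0' means "not coded". */
static char soundex_digit(int c)
{
	/*                           abcdefghijklmnopqrstuvwxyz */
	static const char codes[] = "01230120022455012623010202";
	c = tolower((unsigned char)c);
	if(c < 'a' || c > 'z')
		return '0';
	return codes[c - 'a'];
}

/* Store the four-character code for word, plus a '\0', in out. */
void soundex(const char *word, char out[5])
{
	char prev;
	int n = 0;

	strcpy(out, "0000");
	while(*word != '\0' && !isalpha((unsigned char)*word))
		word++;			/* skip leading non-letters */
	if(*word == '\0')
		return;
	out[n++] = toupper((unsigned char)*word);
	prev = soundex_digit(*word);
	for(word++; *word != '\0' && n < 4; word++) {
		int c = tolower((unsigned char)*word);
		char d;
		if(!isalpha(c))
			continue;
		d = soundex_digit(c);
		if(d != '0' && d != prev)
			out[n++] = d;	/* new code: emit it */
		if(c != 'h' && c != 'w')
			prev = d;	/* h and w do not separate equal codes */
	}
}
```

With this sketch, ``Robert'' and ``Rupert'' both map to R163, so two names can be compared approximately by comparing their codes with strcmp.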

References: Knuth Sec. 6 pp. 391-2 Volume 3
Wu and Manber, ``AGREP--A Fast Approximate Pattern-Matching Tool''




comp.lang.c FAQ list · Question 20.29

Q: What is hashing?


A: Hashing is the process of mapping strings to integers, usually in a relatively small range. A ``hash function'' maps a string (or some other data structure) to a bounded number (the ``hash bucket'') which can more easily be used as an index in an array, or for performing repeated comparisons. (Obviously, a mapping from a potentially huge set of strings to a small set of integers will not be unique. Any algorithm using hashing therefore has to deal with the possibility of ``collisions.'')

Many hashing functions and related algorithms have been developed; a full treatment is beyond the scope of this list. An extremely simple hash function for strings is simply to add up the values of all the characters:

unsigned hash(char *str)
{
	unsigned int h = 0;
	while(*str != '\0')
		h += *str++;
	return h % NBUCKETS;
}
A somewhat better hash function is
unsigned hash(char *str)
{
	unsigned int h = 0;
	while(*str != '\0')
		h = (256 * h + *str++) % NBUCKETS;
	return h;
}
which actually treats the input string as a large binary number (8 * strlen(str) bits long, assuming characters are 8 bits) and computes that number modulo NBUCKETS, by Horner's rule. (Here it is important that NBUCKETS be prime, among other things. To remove the assumption that characters are 8 bits, use UCHAR_MAX+1 instead of 256; the ``large binary number'' will then be CHAR_BIT * strlen(str) bits long. UCHAR_MAX and CHAR_BIT are defined in <limits.h>.)

When the set of strings is known in advance, it is also possible to devise ``perfect'' hashing functions which guarantee a collisionless, dense mapping.
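
To make the difference between the two functions above concrete, here they are side by side under distinct names (hash_add and hash_horner are our names, and the value 101 for NBUCKETS is an arbitrary prime chosen for this sketch). The casts to unsigned char are an addition, guarding against negative values on machines where plain char is signed:

```c
#define NBUCKETS 101	/* an arbitrary prime, chosen for this example */

/* The additive hash from above. */
unsigned hash_add(const char *str)
{
	unsigned int h = 0;
	while(*str != '\0')
		h += (unsigned char)*str++;
	return h % NBUCKETS;
}

/* The Horner's-rule hash from above. */
unsigned hash_horner(const char *str)
{
	unsigned int h = 0;
	while(*str != '\0')
		h = (256 * h + (unsigned char)*str++) % NBUCKETS;
	return h;
}
```

Since addition is commutative, hash_add necessarily assigns anagrams such as "abc" and "cab" to the same bucket. hash_horner is sensitive to character order, and (in ASCII, with this NBUCKETS) it separates those two strings.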

References: K&R2 Sec. 6.6
Knuth Sec. 6.4 pp. 506-549 Volume 3
Sedgewick Sec. 16 pp. 231-244




comp.lang.c FAQ list · Question 20.30

Q: How can I generate random numbers with a normal or Gaussian distribution?


A: See question 13.20.




comp.lang.c FAQ list · Question 20.31

Q: How can I find the day of the week given the date?


A: Here are three methods:

  1. Use mktime or localtime (see question 13.13). Here is a code fragment which computes the day of the week for February 29, 2000:
    #include <stdio.h>
    #include <time.h>
    
    char *wday[] = {"Sunday", "Monday", "Tuesday", "Wednesday",
    		"Thursday", "Friday", "Saturday"};
    
    struct tm tm;
    
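    tm.tm_mon = 2 - 1;		/* tm_mon counts from 0 */
    tm.tm_mday = 29;
    tm.tm_year = 2000 - 1900;	/* tm_year counts from 1900 */
    tm.tm_hour = tm.tm_min = tm.tm_sec = 0;
    tm.tm_isdst = -1;		/* let mktime determine whether DST is in effect */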
    
    if(mktime(&tm) != -1)
    	printf("%s\n", wday[tm.tm_wday]);
    
    When using mktime like this, it's usually important to set tm_isdst to -1, as shown (especially if tm_hour is 0), otherwise a daylight saving time correction could push the time past midnight into another day.
  2. Use Zeller's congruence, which says that if

    	J is the number of the century [i.e. the year / 100],
    	K the year within the century [i.e. the year % 100],
    	m the month,
    	q the day of the month,
    	h the day of the week [where 1 is Sunday];
    


    and if January and February are taken as months 13 and 14 of the previous year [affecting both J and K]; then h for the Gregorian calendar is the remainder when the sum

    	q + 26(m + 1) / 10 + K + K/4 + J/4 - 2J
    


    is divided by 7, and where all intermediate remainders are discarded. [footnote] The translation into C is straightforward:
    	h = (q + 26 * (m + 1) / 10 + K + K/4 + J/4 + 5*J) % 7;
    
    (where we use +5*J instead of -2*J to make sure that both operands of the modulus operator % are positive; this bias totalling 7*J will obviously not change the final value of h, modulo 7).
  3. Use this elegant code by Tomohiko Sakamoto:
    int dayofweek(int y, int m, int d)	/* 0 = Sunday */
    {
    	static int t[] = {0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4};
    	y -= m < 3;
    	return (y + y/4 - y/100 + y/400 + t[m-1] + d) % 7;
    }
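
For comparison with method 3, method 2 can be packaged as a complete function, including the adjustment for January and February. This is a sketch; the function name zeller and its parameter names are ours:

```c
/* Day of the week by Zeller's congruence, for the Gregorian calendar.
   Returns 0 = Saturday, 1 = Sunday, ..., 6 = Friday.
   month is 1-12 and day is 1-31. */
int zeller(int year, int month, int day)
{
	int J, K;

	if(month < 3) {		/* January and February count as */
		month += 12;	/* months 13 and 14 of the previous year */
		year--;
	}
	J = year / 100;		/* century */
	K = year % 100;		/* year within the century */
	return (day + 26 * (month + 1) / 10 + K + K/4 + J/4 + 5*J) % 7;
}
```

Note the different numbering: Zeller's h counts Saturday as 0, so (zeller(y, m, d) + 6) % 7 gives the 0 = Sunday convention used by tm_wday and by dayofweek above.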
    

See also questions 13.14 and 20.32.

References: ISO Sec. 7.12.2.3
Chr. Zeller, ``Kalender-Formeln''




comp.lang.c FAQ list · Question 20.32

Q: Is (year % 4 == 0) an accurate test for leap years? (Was 2000 a leap year?)


A: No, it's not accurate (and yes, 2000 was a leap year). The actual rules for the present Gregorian calendar are that leap years occur every four years, but not every 100 years, except that they do occur every 400 years, after all. In C, these rules can be expressed as:

	year % 4 == 0 && (year % 100 != 0 || year % 400 == 0)
See a good astronomical almanac or other reference [footnote] for details.

Actually, if the domain of interest is limited (perhaps by the range of a signed 32-bit time_t) such that the only century year it encompasses is 2000, the expression

	year % 4 == 0		/* 1901-2099 only */
is accurate, if less than robust.
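
The claim about the restricted range is easy to check mechanically. In this sketch (both function names are ours), is_leap applies the full rule and short_test_agrees compares it with the short test over a range of years:

```c
/* Full Gregorian rule. */
int is_leap(int year)
{
	return year % 4 == 0 && (year % 100 != 0 || year % 400 == 0);
}

/* Returns 1 if the short test agrees with the full rule
   for every year from first through last. */
int short_test_agrees(int first, int last)
{
	int y;
	for(y = first; y <= last; y++)
		if((y % 4 == 0) != is_leap(y))
			return 0;
	return 1;
}
```

short_test_agrees(1901, 2099) returns 1, but widening the range to include 1900 or 2100 makes it return 0.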

If you trust the implementor of the C library, you can use mktime to determine whether a given year is a leap year; see the code fragments in questions 13.14 or 20.31 for hints.

Note also that the transition from the Julian to the Gregorian calendar involved deleting several days to make up for accumulated errors. (The transition was first made in Catholic countries under Pope Gregory XIII in October, 1582, and involved deleting 10 days. In the British Empire, eleven days were deleted when the Gregorian calendar was adopted in September 1752. A few countries didn't switch until the 20th century.) Calendar code which has to work for historical dates must therefore be especially careful.

See also question 13.14b.




comp.lang.c FAQ list · Question 20.33

Q: Why can tm_sec in the tm structure range from 0 to 61, suggesting that there can be 62 seconds in a minute?


A: That's actually a buglet in the Standard. There can be 61 seconds in a minute during a leap second. It's possible for there to be two leap seconds in a year, but it turns out that it's guaranteed that they'll never both occur in the same day (let alone the same minute).




comp.lang.c FAQ list · Question 20.34

Q: Here's a good puzzle: how do you write a program which produces its own source code as output?


A: It is actually quite difficult to write a self-reproducing program that is truly portable, due particularly to quoting and character set difficulties.

Here is a classic example (which ought to be presented on one line, although it will fix itself the first time it's run):

char*s="char*s=%c%s%c;main(){printf(s,34,s,34);}";
main(){printf(s,34,s,34);}
(This program has a few deficiencies, among other things neglecting to #include <stdio.h>, and assuming that the double-quote character " has the value 34, as it does in ASCII.)

Here is an improved version, posted by James Hu:

#define q(k)main(){return!puts(#k"\nq("#k")");}
q(#define q(k)main(){return!puts(#k"\nq("#k")");})



comp.lang.c FAQ list · Question 20.35

Q: What is ``Duff's Device''?


A: It's a devastatingly devious way of unrolling a loop, devised by Tom Duff while he was at Lucasfilm. In its ``classic'' form, it was used to copy bytes, and looked like this:

	register n = (count + 7) / 8;	/* count > 0 assumed */
	switch (count % 8)
	{
	case 0:	   do { *to = *from++;
	case 7:		*to = *from++;
	case 6:		*to = *from++;
	case 5:		*to = *from++;
	case 4:		*to = *from++;
	case 3:		*to = *from++;
	case 2:		*to = *from++;
	case 1:		*to = *from++;
		      } while (--n > 0);
	}
where count bytes are to be copied from the array pointed to by from to the memory location pointed to by to (which is a memory-mapped device output register, which is why to isn't incremented). It solves the problem of handling the leftover bytes (when count isn't a multiple of 8) by interleaving a switch statement with the loop which copies bytes 8 at a time. (Believe it or not, it is legal to have case labels buried within blocks nested in a switch statement like this. In his announcement of the technique to C's developers and the world, Duff noted that C's switch syntax, in particular its ``fall through'' behavior, had long been controversial, and that ``This code forms some sort of argument in that debate, but I'm not sure whether it's for or against.'')
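
For ordinary memory-to-memory copying the destination pointer must be incremented too. Here is a sketch of that variation; it is not Duff's original code, the function name duff_copy is ours, and an explicit test replaces the assumption that count is greater than 0:

```c
#include <stddef.h>

/* Copy count bytes from from to to, eight per loop iteration. */
void duff_copy(char *to, const char *from, size_t count)
{
	size_t n;

	if(count == 0)
		return;
	n = (count + 7) / 8;	/* number of passes through the loop */
	switch(count % 8) {	/* jump into the middle of the first pass */
	case 0:	do {	*to++ = *from++;
	case 7:		*to++ = *from++;
	case 6:		*to++ = *from++;
	case 5:		*to++ = *from++;
	case 4:		*to++ = *from++;
	case 3:		*to++ = *from++;
	case 2:		*to++ = *from++;
	case 1:		*to++ = *from++;
		} while(--n > 0);
	}
}
```

In practice, memcpy is the right tool for this job; the sketch is meant only to show how the device works.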

Additional links: longer explanation




comp.lang.c FAQ list · Question 20.36

Q: When will the next International Obfuscated C Code Contest (IOCCC) be held? How do I submit contest entries? Who won this year's IOCCC? How can I get a copy of the current and previous winning entries?


A: The contest schedule varies over time; see http://www.ioccc.org/index.html for current details.

Contest winners are usually announced at a Usenix conference, and are posted to the net sometime thereafter. Winning entries from previous years (back to 1984) are archived at ftp.uu.net (see question 18.16) under the directory pub/ioccc/; see also http://www.ioccc.org/index.html .

References: Don Libes, Obfuscated C and Other Mysteries




comp.lang.c FAQ list · Question 20.37

Q: What was the entry keyword mentioned in K&R1?


A: It was reserved to allow the possibility of having functions with multiple, differently-named entry points, à la FORTRAN. It was not, to anyone's knowledge, ever implemented (nor does anyone remember what sort of syntax might have been imagined for it). It has been withdrawn, and is not a keyword in ANSI C. (See also question 1.12.)

References: K&R2 p. 259 Appendix C




comp.lang.c FAQ list · Question 20.38

Q: Where does the name ``C'' come from, anyway?


A: C was derived from Ken Thompson's experimental language B, which was inspired by Martin Richards's BCPL (Basic Combined Programming Language), which was a simplification of CPL (Combined Programming Language, or perhaps Cambridge Programming Language). For a while, there was speculation that C's successor might be named P (the third letter in BCPL) instead of D, but of course the most visible descendant language today is C++.

References: Dennis Ritchie, ``The Development of the C Language''




comp.lang.c FAQ list · Question 20.39

Q: How do you pronounce ``char''? What's that funny name for the ``#'' character?


A: You can pronounce the C keyword ``char'' in at least three ways: like the English words ``char,'' ``care,'' or ``car'' (or maybe even ``character''); the choice is arbitrary. Bell Labs once proposed the (now obsolete) term ``octothorpe'' for the ``#'' character.

Trivia questions like these aren't any more pertinent for comp.lang.c than they are for most of the other groups they frequently come up in. You can find lots of information in the net.announce.newusers frequently-asked questions postings, the ``jargon file'' (also published as The [New] Hacker's Dictionary), and the old Usenet ASCII pronunciation list. (The pronunciation list also appears in the jargon file under ASCII, as well as in the comp.unix frequently-asked questions list.)




comp.lang.c FAQ list · Question 20.39b

Q: What do ``lvalue'' and ``rvalue'' mean?


A: Simply speaking, an lvalue is an expression that could appear on the left-hand side of an assignment; you can also think of it as denoting an object that has a location. (But see question 6.7 concerning arrays.) An rvalue is any expression that has a value (and that can therefore appear on the right-hand side of an assignment).




comp.lang.c FAQ list · Question 20.40

Q: Where can I get extra copies of this list?


A: An up-to-date copy may be obtained from ftp.eskimo.com in directory home/scs/C-faq/. You can also just pull it off the net; it is normally posted to comp.lang.c on the first of each month, with an Expires: line which should keep it around all month. A parallel, abridged version is available (and posted), as is a list of changes accompanying each significantly updated version.

The various versions of this list are also posted to the newsgroups comp.answers and news.answers. Several sites archive news.answers postings and other FAQ lists, including this one; two sites are rtfm.mit.edu (directories pub/usenet/news.answers/C-faq/ and pub/usenet/comp.lang.c/) and ftp.uu.net (directory usenet/news.answers/C-faq/). If you don't have ftp access, a mailserver at rtfm.mit.edu can mail you FAQ lists: send a message containing the single word ``help'' to mail-server@rtfm.mit.edu for more information. See the meta-FAQ list in news.answers for more information.

An extended version of this FAQ list has been published by Addison-Wesley as C Programming FAQs: Frequently Asked Questions (ISBN 0-201-84519-9). An errata list is at http://www.eskimo.com/~scs/C-faq/book/Errata.html and on ftp.eskimo.com in home/scs/ftp/C-faq/book/Errata .




