13. Library Functions

comp.lang.c FAQ list · Question 13.1

Q: How can I convert numbers to strings (the opposite of atoi)? Is there an itoa function?


A: Just use sprintf:

	sprintf(string, "%d", number);
(Don't worry that sprintf may be overkill, potentially wasting run time or code space; it works well in practice.) See also the examples in the answer to question 7.5a, and also questions 8.6 and 12.21.

You can obviously use sprintf to convert long or floating-point numbers to strings as well (using %ld or %f); in other words, sprintf can also be thought of as the opposite of atol and atof. In addition, you have quite a bit of control over the formatting. (It's for these reasons that C supplies sprintf as a general solution, and not itoa.)

If you simply must write an itoa function, here are some things to consider:

See also questions 12.21 and 20.10.

References: K&R1 Sec. 3.6 p. 60
K&R2 Sec. 3.6 p. 64




comp.lang.c FAQ list · Question 13.2

Q: Why does strncpy not always place a '\0' terminator in the destination string?


A: strncpy was first designed to handle a now-obsolete data structure, the fixed-length, not-necessarily-\0-terminated ``string.'' [footnote] strncpy is admittedly a bit cumbersome to use in other contexts, since you must often append a '\0' to the destination string by hand.

You can get around the problem by using strncat instead of strncpy. If the destination string starts out empty (that is, if you do *dest = '\0' first), strncat does what you probably wanted strncpy to do:

	*dest = '\0';
	strncat(dest, source, n);
This code copies up to n characters, and always appends a \0.

Another possibility is

sprintf(dest, "%.*s", n, source)
(though, strictly speaking, this is only guaranteed to work for n <= 509).

When arbitrary bytes (as opposed to strings) are being copied, memcpy is usually a more appropriate function to use than strncpy.




comp.lang.c FAQ list · Question 13.3

Q: Does C have anything like the ``substr'' (extract substring) routine present in other languages?


A: Not as such. (One reason it doesn't is that, as mentioned in question 7.2 and section 8, C has no managed string type.)

To extract a substring of length LEN starting at index POS in a source string, use something like

	char dest[LEN+1];
	strncpy(dest, &source[POS], LEN);
	dest[LEN] = '\0';	/* ensure \0 termination */
or, using the trick from question 13.2,
	char dest[LEN+1] = "";
	strncat(dest, &source[POS], LEN);
or, making use of pointer instead of array notation,
	strncat(dest, source + POS, LEN);
(The expression source + POS is, by definition, identical to &source[POS] --see also section 6.)


comp.lang.c FAQ list · Question 13.4

Q: How do I convert a string to all upper or lower case?


A: Some libraries have routines strupr and strlwr or strupper and strlower, but these are not Standard or portable. It's a straightforward exercise to write upper/lower-case functions in terms of the toupper and tolower macros in <ctype.h>; see also question 13.5. (The only tricky part is that the function will either have to modify the string in-place or deal with the problem of returning a new string; see question 7.5a.)

(Note also that converting characters and strings to upper or lower case is vastly more complicated when multinational character sets are being used.)

References: K&R1 Sec. 2.7 p. 40
K&R2 Sec. 2.7 p. 43




comp.lang.c FAQ list · Question 13.5

Q: Why do some versions of toupper act strangely if given an upper-case letter?
Why does some code call islower before toupper?


A: In earlier times, toupper was a function-like preprocessor macro and was defined to work only on lower-case letters; it misbehaved if applied to digits, punctuation, or letters which were already upper-case. Similarly, tolower worked only on upper-case letters. Therefore, old code (or code written for wide portability) tends to call islower before toupper, and isupper before tolower.

The C Standard, however, says that toupper and tolower must work correctly on all characters, i.e. characters which don't need changing are left alone.

References: ISO Sec. 7.3.2
H&S Sec. 12.9 pp. 320-1
PCS p. 182




comp.lang.c FAQ list · Question 13.6

Q: How can I split up a string into whitespace-separated fields?
How can I duplicate the process by which main() is handed argc and argv?


A: The only Standard function available for this kind of ``tokenizing'' is strtok, although it can be tricky to use [footnote] and it may not do everything you want it to. (For instance, it does not handle quoting.) Here is a usage example, which simply prints each field as it's extracted:

#include <stdio.h>
#include <string.h>
char string[] = "this is a test";	/* not char *; see Q 16.6 */
char *p;
for(p = strtok(string, " \t\n"); p != NULL;
			p = strtok(NULL, " \t\n"))
	printf("\"%s\"\n", p);

As an alternative, here is a routine I use for building an argv all at once:

#include <ctype.h>

int makeargv(char *string, char *argv[], int argvsize)
{
	char *p = string;
	int  i;
	int argc = 0;

	for(i = 0; i < argvsize; i++) {
		/* skip leading whitespace */
		while(isspace(*p))
			p++;

		if(*p != '\0')
			argv[argc++] = p;
		else {
			argv[argc] = 0;
			break;
		}

		/* scan over arg */
		while(*p != '\0' && !isspace(*p))
			p++;
		/* terminate arg: */
		if(*p != '\0' && i < argvsize-1)
			*p++ = '\0';
	}

	return argc;
}

Calling makeargv is straightforward:

	char *av[10];
	int i, ac = makeargv(string, av, 10);
	for(i = 0; i < ac; i++)
		printf("\"%s\"\n", av[i]);

If you want each separator character to be significant, for instance if you want two tabs in a row to indicate an omitted field, it's probably more straightforward to use strchr:

#include <stdio.h>
#include <string.h>

char string[] = "this\thas\t\tmissing\tfield";
char *p = string;

while(1) {		/* break in middle */
	char *p2 = strchr(p, '\t');
	if(p2 != NULL)
		*p2 = '\0';
	printf("\"%s\"\n", p);
	if(p2 == NULL)
		break;
	p = p2 + 1;
}

All the code fragments presented here modify the input string, by inserting \0's to terminate each field (meaning that the string must be writable; see question 1.32). If you'll need the original string later, make a copy before breaking it up.

References: K&R2 Sec. B3 p. 250
ISO Sec. 7.11.5.8
H&S Sec. 13.7 pp. 333-4
PCS p. 178




comp.lang.c FAQ list · Question 13.7

Q: I need some code to do regular expression and wildcard matching.


A: Make sure you recognize the difference between:

There are a number of packages available for matching regular expressions. Most packages use a pair of functions, one for ``compiling'' the regular expression, and one for ``executing'' it (i.e. matching strings against it). Look for header files named <regex.h> or <regexp.h>, and functions called regcmp/regex, regcomp/regexec, or re_comp/re_exec. (These functions may exist in a separate regexp library.) A popular, freely-redistributable regexp package by Henry Spencer is available from ftp.cs.toronto.edu in pub/regexp.shar.Z or in several other archives. The GNU project has a package called rx. [footnote] See also question 18.16.

Filename wildcard matching (sometimes called ``globbing'') is done in a variety of ways on different systems. On Unix, wildcards are automatically expanded by the shell before a process is invoked, so programs rarely have to worry about them explicitly. Under MS-DOS compilers, there is often a special object file which can be linked in to a program to expand wildcards while argv is being built. Several systems (including MS-DOS and VMS) provide system services for listing or opening files specified by wildcards. Check your compiler/library documentation. See also questions 19.20 and 20.3.

Here is a quick little wildcard matcher by Arjan Kenter:

int match(char *pat, char *str)
{
	switch(*pat) {
	case '\0':  return !*str;
	case '*':   return match(pat+1, str) ||
				*str && match(pat, str+1);
	case '?':   return *str && match(pat+1, str+1);
	default:    return *pat == *str && match(pat+1, str+1);
	}
}
(Copyright 1995, Arjan Kenter)

With this definition, the call match("a*b.c", "aplomb.c") would return 1.

References: Schumacher, ed., Software Solutions in C Sec. 3 pp. 35-71




comp.lang.c FAQ list · Question 13.8

Q: I'm trying to sort an array of strings with qsort, using strcmp as the comparison function, but it's not working.


A: By ``array of strings'' you probably mean ``array of pointers to char.'' The arguments to qsort's comparison function are pointers to the objects being sorted, in this case, pointers to pointers to char. strcmp, however, accepts simple pointers to char. Therefore, strcmp can't be used directly. Write an intermediate comparison function like this:

/* compare strings via pointers */
int pstrcmp(const void *p1, const void *p2)
{
	return strcmp(*(char * const *)p1, *(char * const *)p2);
}

The comparison function's arguments are expressed as ``generic pointers,'' const void *. They are converted back to what they ``really are'' (pointers to pointers to char) and dereferenced, yielding char *'s which can be passed to strcmp.

The call to qsort might look like

#include <stdlib.h>
char *strings[NSTRINGS];
int nstrings;
/* nstrings cells of strings[] are to be sorted */
qsort(strings, nstrings, sizeof(char *), pstrcmp);

(Don't be misled by the discussion in K&R2 Sec. 5.11 pp. 119-20, which is not discussing the Standard library's qsort, and makes a quiet, unnecessary assumption about the equivalence of char * and void *).

For more information on qsort comparison functions--how they are called and how they must be declared--see question 13.9.

References: ISO Sec. 7.10.5.2
H&S Sec. 20.5 p. 419




comp.lang.c FAQ list · Question 13.9

Q: Now I'm trying to sort an array of structures with qsort. My comparison function takes pointers to structures, but the compiler complains that the function is of the wrong type for qsort. How can I cast the function pointer to shut off the warning?


A: The conversions must be in the comparison function, which must be declared as accepting ``generic pointers'' (const void *) as discussed in question 13.8 above. For a hypothetical little date structure

struct mystruct {
	int year, month, day;
};
the comparison function might look like [footnote]
int mystructcmp(const void *p1, const void *p2)
{
	const struct mystruct *sp1 = p1;
	const struct mystruct *sp2 = p2;
	if(sp1->year < sp2->year) return -1;
	else if(sp1->year > sp2->year) return 1;
	else if(sp1->month < sp2->month) return -1;
	else if(sp1->month > sp2->month) return 1;
	else if(sp1->day < sp2->day) return -1;
	else if(sp1->day > sp2->day) return 1;
	else return 0;
}
(The conversions from generic pointers to struct mystruct pointers happen in the initializations sp1 = p1 and sp2 = p2; the compiler performs the conversions implicitly since p1 and p2 are void pointers.)

For this version of mystructcmp, the call to qsort might look like

#include <stdlib.h>
struct mystruct dates[NDATES];
int ndates;
/* ndates cells of dates[] are to be sorted */
qsort(dates, ndates, sizeof(struct mystruct), mystructcmp);

If, on the other hand, you're sorting pointers to structures, you'll need indirection, as in question 13.8; the head of the comparison function would look like

int myptrstructcmp(const void *p1, const void *p2)
{
	struct mystruct *sp1 = *(struct mystruct * const *)p1;
	struct mystruct *sp2 = *(struct mystruct * const *)p2;
and the call would look like
struct mystruct *dateptrs[NDATES];
qsort(dateptrs, ndates, sizeof(struct mystruct *), myptrstructcmp);

To understand why the curious pointer conversions in a qsort comparison function are necessary (and why a cast of the function pointer when calling qsort can't help), it's useful to think about how qsort works. qsort doesn't know anything about the type or representation of the data being sorted: it just shuffles around little chunks of memory. (All it knows about the chunks is their size, which you specify in qsort's third argument.) To determine whether two chunks need swapping, qsort calls your comparison function. (To swap them, it uses the equivalent of memcpy.)

Since qsort deals in a generic way with chunks of memory of unknown type, it uses generic pointers (void *) to refer to them. When qsort calls your comparison function, it passes as arguments two generic pointers to the chunks to be compared. Since it passes generic pointers, your comparison function must accept generic pointers, and convert the pointers back to their appropriate type before manipulating them (i.e. before performing the comparison). A void pointer is not the same type as a structure pointer, and on some machines it may have a different size or representation (which is why these casts are required for correctness).

If you were sorting an array of structures, and had a comparison function accepting structure pointers:

	int mywrongstructcmp(struct mystruct *, struct mystruct *);
and if you called qsort as
	qsort(dates, ndates, sizeof(struct mystruct),
		(int (*)(const void *, const void *))mywrongstructcmp);
							/* WRONG */
the cast (int (*)(const void *, const void *)) would do nothing except perhaps silence the message from the compiler telling you that this comparison function may not work with qsort. The implications of any cast you use when calling qsort will have been forgotten by the time qsort gets around to calling your comparison function: it will call them with const void * arguments, so that is what your function must accept. No prototype mechanism exists which could operate down inside qsort to convert the void pointers to struct mystruct pointers just before calling mywrongstructcmp.

In general, it is a bad idea to insert casts just to ``shut the compiler up.'' Compiler warnings are usually trying to tell you something, and unless you really know what you're doing, you ignore or muzzle them at your peril. See also question 4.9.

Additional links

References: ISO Sec. 7.10.5.2
H&S Sec. 20.5 p. 419




comp.lang.c FAQ list · Question 13.10

Q: How can I sort a linked list?


A: Sometimes it's easier to keep the list in order as you build it (or perhaps to use a tree instead). Algorithms like insertion sort and merge sort lend themselves ideally to use with linked lists. If you want to use a standard library function, you can allocate a temporary array of pointers, fill it in with pointers to all your list nodes, call qsort, and finally rebuild the list pointers based on the sorted array.

Additional links: example by Chris Torek

References: Knuth Sec. 5.2.1 pp. 80-102, Sec. 5.2.4 pp. 159-168
Sedgewick Sec. 8 pp. 98-100, Sec. 12 pp. 163-175




comp.lang.c FAQ list · Question 13.11

Q: How can I sort more data than will fit in memory?


A: You want an ``external sort,'' which you can read about in Knuth, Volume 3. The basic idea is to sort the data in chunks (as much as will fit in memory at one time), write each sorted chunk to a temporary file, and then merge the files. Your operating system may provide a general-purpose sort utility, and if so, you can try invoking it from within your program: see questions 19.27 and 19.30, and the example in question 19.28.

References: Knuth Sec. 5.4 pp. 247-378
Sedgewick Sec. 13 pp. 177-187




comp.lang.c FAQ list · Question 13.12

Q: How can I get the current date or time of day in a C program?


A: Just use the time, ctime, localtime and/or strftime functions. Here is a simple example: [footnote]

#include <stdio.h>
#include <time.h>

int main()
{
	time_t now;
	time(&now);
	printf("It's %s", ctime(&now));
	return 0;
}

Calls to localtime and strftime look like this:

	struct tm *tmp = localtime(&now);
	char fmtbuf[30];
	printf("It's %d:%02d:%02d\n",
		tmp->tm_hour, tmp->tm_min, tmp->tm_sec);
	strftime(fmtbuf, sizeof fmtbuf, "%A, %B %d, %Y", tmp);
	printf("on %s\n", fmtbuf);
(Note that these functions take a pointer to the time_t variable, even when they will not be modifying it.[footnote] )

If you need sub-second resolution, see question 19.37.

References: K&R2 Sec. B10 pp. 255-7
ISO Sec. 7.12
H&S Sec. 18




comp.lang.c FAQ list · Question 13.13

Q: I know that the library function localtime will convert a time_t into a broken-down struct tm, and that ctime will convert a time_t to a printable string. How can I perform the inverse operations of converting a struct tm or a string into a time_t?


A: ANSI C specifies a library function, mktime, which converts a struct tm to a time_t.

Converting a string to a time_t is harder, because of the wide variety of date and time formats which might be encountered. Some systems provide a strptime function, which is basically the inverse of strftime. Other popular functions are partime (widely distributed with the RCS package) and getdate (and a few others, from the C news distribution). See question 18.16.

References: K&R2 Sec. B10 p. 256
ISO Sec. 7.12.2.3
H&S Sec. 18.4 pp. 401-2




comp.lang.c FAQ list · Question 13.14

Q: How can I add N days to a date? How can I find the difference between two dates?


A: The ANSI/ISO Standard C mktime and difftime functions provide some (limited) support for both problems. mktime accepts non-normalized dates, so it is straightforward to take a filled-in struct tm, add or subtract from the tm_mday field, and call mktime to normalize the year, month, and day fields (and incidentally convert to a time_t value). difftime computes the difference, in seconds, between two time_t values; mktime can be used to compute time_t values for two dates to be subtracted.

However, these solutions are guaranteed to work correctly only for dates in the range which can be represented as time_t's [footnote] . The tm_mday field is an int, so day offsets of more than 32,736 or so may cause overflow. (See below for an alternative solution without these limitations.) Note also that at daylight saving time changeovers, local days are not 24 hours long, so be careful if you try to divide by 86,400 seconds/day.

Here is a code fragment to compute the date 90 days past October 24, 1994:

#include <stdio.h>
#include <time.h>

tm1.tm_mon = 10 - 1;
tm1.tm_mday = 24;
tm1.tm_year = 1994 - 1900;
tm1.tm_hour = tm1.tm_min = tm1.tm_sec = 0;
tm1.tm_isdst = -1;

tm1.tm_mday += 90;

if(mktime(&tm1) == -1)
	fprintf(stderr, "mktime failed\n");
else	printf("%d/%d/%d\n",
		tm1.tm_mon+1, tm1.tm_mday, tm1.tm_year+1900);
(Setting tm_isdst to -1 helps to guard against daylight saving time anomalies; setting tm_hour to 12 would, too.)

Here is a piece of code to compute the difference in days between February 28 and March 1 in the year 2000:

	struct tm tm1, tm2;
	time_t t1, t2;

	tm1.tm_mon = 2 - 1;
	tm1.tm_mday = 28;
	tm1.tm_year = 2000 - 1900;
	tm1.tm_hour = tm1.tm_min = tm1.tm_sec = 0;
	tm1.tm_isdst = -1;

	tm2.tm_mon = 3 - 1;
	tm2.tm_mday = 1;
	tm2.tm_year = 2000 - 1900;
	tm2.tm_hour = tm2.tm_min = tm2.tm_sec = 0;
	tm2.tm_isdst = -1;

	t1 = mktime(&tm1);
	t2 = mktime(&tm2);
	
	if(t1 == -1 || t2 == -1)
		fprintf(stderr, "mktime failed\n");
	else {
		long d = (difftime(t2, t1) + 86400L/2) / 86400L;
		printf("%ld\n", d);
	}
(The addition of 86400L/2 rounds the difference to the nearest day; see also question 14.6.)

Another approach to both problems, which will work over a much wider range of dates, is to use ``Julian day numbers''. A Julian day number is the number of days since January 1, 4013 BC. [footnote] Given ToJul and FromJul routines declared as

/* returns Julian for month, day, year */
long ToJul(int month, int day, int year);

/* returns month, day, year for jul */
void FromJul(long jul, int *monthp, int *dayp, int *yearp);
adding n days to a date can be implemented as
	int n = 90;
	int month, day, year;
	FromJul(ToJul(10, 24, 1994) + n, &month, &day, &year);
and the number of days between two dates is
	ToJul(3, 1, 2000) - ToJul(2, 28, 2000)
Code for handling Julian day numbers can be found in the Snippets collection (see question 18.15c), the Simtel/Oakland archives (file JULCAL10.ZIP, see question 18.16), and the ``Date conversions'' article mentioned in the References.

See also questions 13.13, 20.31, and 20.32.

Additional links:

further explanation by Mark Brader

more code for date-difference and day-of-week calculation by Branko Radovanovic

References: K&R2 Sec. B10 p. 256
ISO Secs. 7.12.2.2,7.12.2.3
H&S Secs. 18.4,18.5 pp. 401-2
David Burki, ``Date Conversions''




comp.lang.c FAQ list · Question 13.14b

Q: Did C have any Year 2000 problems?


A: No, although poorly-written C programs might have.

The tm_year field of struct tm holds the value of the year minus 1900; this field therefore contains the value 100 for the year 2000. Code that uses tm_year correctly (by adding or subtracting 1900 when converting to or from human-readable 4-digit year representations) has no problems at the turn of the millennium. Any code that used tm_year incorrectly, however, such as by using it directly as a human-readable 2-digit year, or setting it from a 4-digit year with code like

	tm.tm_year = yyyy % 100;	/* WRONG */
or printing it as an allegedly human-readable 4-digit year with code like
	printf("19%d", tm.tm_year);	/* WRONG */
would have had grave y2k problems indeed. See also question 20.32.

(The y2k problem is now mostly old history; all we have left to do is fix all the 32-bit time_t problems by 2038...)

References: K&R2 Sec. B10 p. 255
ISO Sec. 7.12.1
H&S Sec. 18.4 p. 401




comp.lang.c FAQ list · Question 13.15

Q: I need a random number generator.


A: The Standard C library has one: rand. The implementation on your system may not be perfect, but writing a better one isn't necessarily easy, either.

If you do find yourself needing to implement your own random number generator, there is plenty of literature out there; see the References below or the sci.math.num-analysis FAQ list. There are also any number of packages on the net: old standbys are r250, RANLIB, and FSULTRA (see question 18.16), and there is much recent work by Marsaglia, and Matumoto and Nishimura (the ``Mersenne Twister''), and some code collected by Don Knuth on his web pages.

Here is a portable C implementation of the ``minimal standard'' generator proposed by Park and Miller:

#define a 16807
#define m 2147483647
#define q (m / a)
#define r (m % a)

static long int seed = 1;

long int PMrand()
{
	long int hi = seed / q;
	long int lo = seed % q;
	long int test = a * lo - r * hi;
	if(test > 0)
		seed = test;
	else	seed = test + m;
	return seed;
}
(The ``minimal standard'' is adequately good; it is something ``against which all others should be judged''; it is recommended for use ``unless one has access to a random number generator known to be better.'')

This code implements the generator

X  <-  (aX + c) mod m
for a = 16807, m = 2147483647 (which is 2**31-1), and c = 0.[footnote] The multiplication is carried out using a technique described by Schrage, ensuring that the intermediate result aX does not overflow. The implementation above returns long int values in the range [1, 2147483646]; that is, it corresponds to C's rand with a RAND_MAX of 2147483646, except that it never returns 0. To alter it to return floating-point numbers in the range (0, 1) (as in the Park and Miller paper), change the declaration to
	double PMrand()
and the last line to
	return (double)seed / m;
For slightly better statistical properties, Park and Miller now recommend using a = 48271.

References: K&R2 Sec. 2.7 p. 46, Sec. 7.8.7 p. 168
ISO Sec. 7.10.2.1
H&S Sec. 17.7 p. 393
PCS Sec. 11 p. 172
Knuth Vol. 2 Chap. 3 pp. 1-177
Park and Miller, ``Random Number Generators: Good Ones are Hard to Find''




comp.lang.c FAQ list · Question 13.16

Q: How can I get random integers in a certain range?


A: The obvious way,

	rand() % N		/* POOR */
(which tries to return numbers from 0 to N-1) is poor, because the low-order bits of many random number generators are distressingly non-random. (See question 13.18.) A better method is something like
	(int)((double)rand() / ((double)RAND_MAX + 1) * N)

If you'd rather not use floating point, another method is

	rand() / (RAND_MAX / N + 1)
If you just need to do something with probability 1/N, you could use
	if(rand() < (RAND_MAX+1u) / N)
All these methods obviously require knowing RAND_MAX (which ANSI #defines in <stdlib.h>), and assume that N is much less than RAND_MAX.

When N is close to RAND_MAX, and if the range of the random number generator is not a multiple of N (i.e. if (RAND_MAX+1) % N != 0), all of these methods break down: some outputs occur more often than others. (Using floating point does not help; the problem is that rand returns RAND_MAX+1 distinct values, which cannot always be evenly divvied up into N buckets.) If this is a problem, about the only thing you can do is to call rand multiple times, discarding certain values:

	unsigned int x = (RAND_MAX + 1u) / N;
	unsigned int y = x * N;
	unsigned int r;
	do {
		r = rand();
	} while(r >= y);
	return r / x;

For any of these techniques, it's straightforward to shift the range, if necessary; numbers in the range [M, N] could be generated with something like

	M + rand() / (RAND_MAX / (N - M + 1) + 1)

(Note, by the way, that RAND_MAX is a constant telling you what the fixed range of the C library rand function is. You cannot set RAND_MAX to some other value, and there is no way of requesting that rand return numbers in some other range.)

If you're starting with a random number generator which returns floating-point values between 0 and 1 (such as the last version of PMrand alluded to in question 13.15, or drand48 in question 13.21), all you have to do to get integers from 0 to N-1 is multiply the output of that generator by N:

	(int)(drand48() * N)

Additional links

References: K&R2 Sec. 7.8.7 p. 168
PCS Sec. 11 p. 172




comp.lang.c FAQ list · Question 13.17

Q: Each time I run my program, I get the same sequence of numbers back from rand().


A: It's a characteristic of most pseudo-random number generators (and a defined property of the C library rand) that they always start with the same number and go through the same sequence. (Among other things, a bit of predictability can make debugging much easier.) When you don't want this predictability, you can call srand to seed the pseudo-random number generator with a truly random (or at least variable) initial value. Popular seed values are the time of day, or a process ID number, or the elapsed time before the user presses a key, or some combination of these. Here's an example call, using the time of day as a seed:

	#include <stdlib.h>
	#include <time.h>

	srand((unsigned int)time((time_t *)NULL));
(There remain several difficulties: the time_t returned by time might be a floating-point type, hence not portably convertible to unsigned int without the possibility of overflow. Furthermore, if time of day is available with 1-second resolution, using it by itself means that successive runs of the program can easily get the same seed. Subsecond resolution, of time-of-day or keystroke presses, is hard to achieve portably; see question 19.37.)

Note also that it's rarely useful to call srand more than once during a run of a program; in particular, don't try calling srand before each call to rand, in an attempt to get ``really random'' numbers.

References: K&R2 Sec. 7.8.7 p. 168
ISO Sec. 7.10.2.2
H&S Sec. 17.7 p. 393




comp.lang.c FAQ list · Question 13.18

Q: I need a random true/false value, so I'm just taking rand() % 2, but it's alternating 0, 1, 0, 1, 0...


A: Poor pseudorandom number generators (such as the ones unfortunately supplied with some systems) are not very random in the low-order bits. (In fact, for a pure linear congruential random number generator with period 2**e, and this tends to be how random number generators for e-bit machines are written, the low-order n bits repeat with period 2**n.) For this reason, it's preferable to use the higher-order bits: see question 13.16.

References: Knuth Sec. 3.2.1.1 pp. 12-14




comp.lang.c FAQ list · Question 13.19

Q: How can I return a sequence of random numbers which don't repeat at all?


A: What you're looking for is often called a ``random permutation'' or ``shuffle.'' One way is to initialize an array with the values to be shuffled, then randomly interchange each of the cells with another one later in the array:

	int a[10], i, nvalues = 10;

	for(i = 0; i < nvalues; i++)
		a[i] = i + 1;

	for(i = 0; i < nvalues-1; i++) {
		int c = randrange(nvalues-i);
		int t = a[i]; a[i] = a[i+c]; a[i+c] = t;	/* swap */
	}
where randrange(N) is rand() / (RAND_MAX/(N) + 1) or one of the other expressions from question 13.16.

References: Knuth Sec. 3.4.2 pp. 137-8




comp.lang.c FAQ list · Question 13.19b

Q: How can I generate floating-point random numbers?


A: See question 13.21 for some examples.




comp.lang.c FAQ list · Question 13.20

Q: How can I generate random numbers with a normal or Gaussian distribution?


A: There are a number of ways of doing this.

  1. Exploit the Central Limit Theorem (``law of large numbers'') and add up several uniformly-distributed random numbers:
    #include <stdlib.h>
    #include <math.h>
    
    #define NSUM 25
    
    double gaussrand()
    {
    	double x = 0;
    	int i;
    	for(i = 0; i < NSUM; i++)
    		x += (double)rand() / RAND_MAX;
    
    	x -= NSUM / 2.0;
    	x /= sqrt(NSUM / 12.0);
    
    	return x;
    }
    
    (Don't overlook the sqrt(NSUM / 12.) correction, though it's easy to do so accidentally, especially when NSUM is 12.)
  2. Use a method described by Abramowitz and Stegun:
    #include <stdlib.h>
    #include <math.h>
    
    #define PI 3.141592654
    
    double gaussrand()
    {
    	static double U, V;
    	static int phase = 0;
    	double Z;
    
    	if(phase == 0) {
    		U = (rand() + 1.) / (RAND_MAX + 2.);
    		V = rand() / (RAND_MAX + 1.);
    		Z = sqrt(-2 * log(U)) * sin(2 * PI * V);
    	} else
    		Z = sqrt(-2 * log(U)) * cos(2 * PI * V);
    
    	phase = 1 - phase;
    
    	return Z;
    }
    
  3. Use a method discussed in Knuth and due originally to Marsaglia:
    #include <stdlib.h>
    #include <math.h>
    
    double gaussrand()
    {
    	static double V1, V2, S;
    	static int phase = 0;
    	double X;
    
    	if(phase == 0) {
    		do {
    			double U1 = (double)rand() / RAND_MAX;
    			double U2 = (double)rand() / RAND_MAX;
    
    			V1 = 2 * U1 - 1;
    			V2 = 2 * U2 - 1;
    			S = V1 * V1 + V2 * V2;
    			} while(S >= 1 || S == 0);
    
    		X = V1 * sqrt(-2 * log(S) / S);
    	} else
    		X = V2 * sqrt(-2 * log(S) / S);
    
    	phase = 1 - phase;
    
    	return X;
    }
    
These methods all generate numbers with mean 0 and standard deviation 1. (To adjust to some other distribution, multiply by the standard deviation and add the mean.) Method 1 is poor ``in the tails'' (especially if NSUM is small), but methods 2 and 3 perform quite well. See the references for more information.

Additional links

References: Knuth Sec. 3.4.1 p. 117
Box and Muller, ``A Note on the Generation of Random Normal Deviates''
Marsaglia and Bray, ``A Convenient Method for Generating Normal Variables''
Abramowitz and Stegun, Handbook of Mathematical Functions
Press et al., Numerical Recipes in C Sec. 7.2 pp. 288-290




comp.lang.c FAQ list · Question 13.21

Q: I'm porting this program, and it calls a routine drand48, which my library doesn't have. What is it?


A: drand48 is a Unix System V routine which returns floating point random numbers (presumably with 48 bits of precision) in the half-open interval [0, 1)[footnote] . (Its companion seed routine is srand48; neither is in the C Standard.) It's easy to write a low-precision replacement:

#include <stdlib.h>

double drand48()
{
	return rand() / (RAND_MAX + 1.);
}

To more accurately simulate drand48's semantics, you can try to give it closer to 48 bits worth of precision:

#define PRECISION 2.82e14	/* 2**48, rounded up */

double drand48()
{
	double x = 0;
	double denom = RAND_MAX + 1.;
	double need;

	for(need = PRECISION; need > 1;
			need /= (RAND_MAX + 1.)) {
		x += rand() / denom;
		denom *= RAND_MAX + 1.;
	}

	return x;
}

Before using code like this, though, beware that it is numerically suspect, particularly if (as is usually the case) the period of rand is on the order of RAND_MAX. (If you have a longer-period random number generator available, such as BSD random, definitely use it when simulating drand48.)

References: PCS Sec. 11 p. 149




comp.lang.c FAQ list · Question 13.22

Q: Is exit(status) truly equivalent to returning the same status from main?


A: See question 11.16.




comp.lang.c FAQ list · Question 13.23

Q: What's the difference between memcpy and memmove?


A: See question 11.25.




comp.lang.c FAQ list · Question 13.24

Q: I'm trying to port this old program. Why do I get ``undefined external'' errors for some library functions?


A: Some old or semistandard functions have been renamed or replaced over the years;
if you need:/you should instead:

index
use strchr.
rindex
use strrchr.
bcopy
use memmove, after interchanging the first and second arguments (see also question 11.25).
bcmp
use memcmp.
bzero
use memset, with a second argument of 0.

See also questions 12.22 and 13.21.

References: PCS Sec. 11




comp.lang.c FAQ list · Question 13.25

Q: I keep getting errors due to library functions being undefined, but I'm #including all the right header files.


A: In the general case of calling code in an external library, using #include to pull in the right header file(s) is only half of the story; you also have to tell the linker to search the external library itself. The declarations in the header file only tell the compiler how to call the external functions; the header file doesn't supply the definitions of the external functions, or tell the compiler/linker where to find those definitions.

In some cases (especially if the functions are nonstandard) obtaining those definitions may require explicitly asking for the correct libraries to be searched when you link the program. (Some systems may be able to arrange that whenever you #include a header, its associated library, if nonstandard, is automatically requested at link time, but such a facility is not widespread.) See also questions 10.11, 11.30, 13.26, 14.3, and 19.40.




comp.lang.c FAQ list · Question 13.26

Q: I'm still getting errors due to library functions being undefined, even though I'm explicitly requesting the right libraries while linking.


A: Many linkers make one pass over the list of object files and libraries you specify, and extract from libraries only those modules which satisfy references which have so far come up as undefined. Therefore, the order in which libraries are listed with respect to object files (and each other) is significant; usually, you want to search the libraries last.

For example, under Unix, a command line like

	cc -lm myprog.c		# WRONG
usually won't work. Instead, put any -l options at the end of the command line:
	cc myprog.c -lm

If you list a library first, the linker doesn't know that it needs anything out of it yet, and passes it by. See also question 13.28.




comp.lang.c FAQ list · Question 13.27

Q: Why is my simple program, which hardly does more than print ``Hello, world!'' in a window, compiling to such a huge executable (several hundred K)? Should I #include fewer header files?


A: What you're seeing is the current (poor) state of the ``art'' in library design. As run-time libraries accrete more and more features (especially having to do with Graphical User Interfaces), and when one library function calls another library function to do part of its job (which ought to be a Good Thing; that's what library functions are for), it can happen that calling anything in the library (particularly something relatively powerful like printf) eventually pulls in practically everything else, leading to horribly bloated executables.

#including fewer header files probably won't help, because declaring a few functions which you don't call (which is mostly all that happens when you #include a header you don't need) shouldn't result in those functions being placed in your executable, unless they actually do get called. See also question 13.25.

You may be able to track down and derail a chain of unnecessarily-coupled functions which are bloating your executable, or maybe complain to your vendor to clean up the libraries.

References: H&S Sec. 4.8.6 pp. 103-4




comp.lang.c FAQ list · Question 13.28

Q: What does it mean when the linker says that _end is undefined?


A: That message is a quirk of the old Unix linkers. You get an error about _end being undefined only when other symbols are undefined, too--fix the others, and the error about _end will disappear. (See also questions 13.25 and 13.26.)




comp.lang.c FAQ list · Question 13.29

Q: My compiler is complaining that printf is undefined! How can this be? It's the world's most popular C function...


A: Allegedly, there are C compilers for Microsoft Windows which do not support printf, on the argument that printf is for printing to old-fashioned terminals, while under Windows the right way to display text is to call xxx to open a window and then xxx to display text in it. It may be possible to convince such a compiler that what you are writing is a ``console application'' meaning that it will take care of opening a ``console window'' for you automatically, and condescend to let you call printf to print stuff to it.

See also question 19.4b.





Read sequentially: prev next up



about this FAQ list   about eskimo   search   feedback   copyright

Hosted by Eskimo North