[Someone asked me whether it was or wasn't obvious that the way to print a single % sign using printf was with the (correct) "%%" or the (incorrect) "\%". This is part of my reply, edited slightly for this web page.]
From: Steve Summit
Subject: Re: printf("%")
In-Reply-To: your message <199703310107.RAA19240@mx2.eskimo.com>
of Sun, 30 Mar 1997 17:05:57 -0800
Cc: scs@eskimo.com
If you're used to thinking about how a C program really ``works'' -- how the compiler generates an executable program, and how functions get called at run time -- then it really can be obvious that \% can't possibly work, and that the solution has to be something like %%.
Suppose that you write this program:
#include <stdio.h>
int main(void) { char line[80]; fgets(line, 80, stdin); fputs(line, stdout); return 0; }
Obviously, this program reads one line from standard input and prints it right back out again. Suppose that the line you type at it consists of the 8 characters
abc\ndef
What gets printed? The answer is, one line, consisting of the 8 characters
abc\ndef
You might think that it would instead print the two lines
abc
def
(In fact, it occurs to me that ``Why doesn't it print two lines?'' is an FAQ which I've never added to the list.)
[But I was wrong; it's question 8.8.]
Why doesn't it print two lines? Why aren't the two characters \n interpreted as a single newline character, as they are in character and string constants in source code? Because those backslash escapes for various special characters are interpreted only in character and string constants in source code. Furthermore, they're interpreted by the compiler, at compile time, in the process of converting your source code into an executable program. None of the standard I/O functions (fgets, fputs, printf, etc.) ever do any interpretation of backslashes. As far as they're concerned, a backslash is just another character being read or written. By the time these functions are running, any backslashes which had been in the source code (that is, in character or string constants which had been handed to these functions to work on) will already have been processed.
That is, when fgets() reads a \ from the input file, it just places it into the array with all of the other characters. But when you write
printf("Hello, world!\n");
printf receives a string consisting of 14 characters (not including the terminating \0). The 14th character in the string is a single newline character, however that's represented internally. printf does not receive the two characters \ and n, nor does it have to translate them into a newline character. The compiler did that already.
If you understand this, it's pretty obvious that
printf("\%");
can't work. What will the compiler translate \% into? As it happens, \% isn't a legal special-character sequence, but by analogy with \', \", and \\, it's likely that the compiler will convert \% into a single % character. (It may also generate a warning message about the undefined escape sequence.) So printf will probably receive a string consisting of a single %, exactly as if you'd simply typed
printf("%");
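Whenever printf sees a % character in its format string, it always looks at the next character(s) to decide what to print. But here there is no following character, so the conversion specification is incomplete and printf is confused. The C standard leaves the behavior undefined in this case, so what gets printed, if anything, depends on the implementation.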
I understand why you expect \% to work -- in some contexts, at least, the \ ``turns off'' the special interpretation of the following character. (That's what it's doing when you write '\'' or "This string contains a \" and a \\".) So when you want to print a single %, you want to ``turn off'' printf's special interpretation of the % character, and the first thing you think of is the backslash. But wait: the backslash turns off special interpretation of characters by the compiler, while what you want is to turn off special interpretation of a character by printf. The two contexts are completely separate, so the rules might be different, and in fact the rules are different. (The first context is ``string constants being processed by the compiler'', and in that context, the rule for turning off special interpretation of certain characters is to precede them with a backslash. The second context is ``format strings being interpreted by printf'', and in that context, the rule is that if you want a single %, you write %%.)