1. Declarations and Initializations
2. Structures, Unions, and Enumerations
3. Expressions
4. Pointers
5. Null Pointers
6. Arrays and Pointers
7. Memory Allocation
8. Characters and Strings
9. Boolean Expressions and Variables
10. C Preprocessor
11. ANSI/ISO Standard C
12. Stdio
13. Library Functions
14. Floating Point
15. Variable-Length Argument Lists
16. Strange Problems
17. Style
18. Tools and Resources
19. System Dependencies
20. Miscellaneous
Bibliography
Acknowledgements
(The question numbers within each section are not always continuous,
because they are aligned with the aforementioned book-length version,
which contains even more questions.)
Herewith, some frequently-asked questions and their answers:
Section 1. Declarations and Initializations
1.1: How do you decide which integer type to use?
A: If you might need large values (above 32,767 or below -32,767),
use long. Otherwise, if space is very important (i.e. if there
are large arrays or many structures), use short. Otherwise, use
int. If well-defined overflow characteristics are important and
negative values are not, or if you want to steer clear of sign-
extension problems when manipulating bits or bytes, use one of
the corresponding unsigned types. (Beware when mixing signed
and unsigned values in expressions, though.)
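The caveat about mixing signed and unsigned values can be seen in a
small sketch. The function names below are made up for illustration.
In naive_less() the int operand is converted to unsigned int before
the comparison, so a negative value compares as a very large one;
safe_less() checks the sign first.

```c
/* Hypothetical helpers, for illustration only. */

/* The usual arithmetic conversions turn a into an unsigned int,
   so a negative a becomes a huge positive value. */
int naive_less(int a, unsigned int b)
{
    return a < b;
}

/* Handle the negative case explicitly before comparing as unsigned. */
int safe_less(int a, unsigned int b)
{
    return a < 0 || (unsigned int)a < b;
}
```

Many compilers can warn about comparisons like the one in
naive_less() if asked to.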
Although character types (especially unsigned char) can be used
as "tiny" integers, doing so is sometimes more trouble than it's
worth, due to unpredictable sign extension and increased code
size. (Using unsigned char can help; see question 12.1 for a
related problem.)
A similar space/time tradeoff applies when deciding between
float and double. None of the above rules apply if the address
of a variable is taken and must have a particular type.
If for some reason you need to declare something with an *exact*
size (usually the only good reason for doing so is when
attempting to conform to some externally-imposed storage layout,
but see question 20.5), be sure to encapsulate the choice behind
an appropriate typedef.
References: K&R1 Sec. 2.2 p. 34; K&R2 Sec. 2.2 p. 36, Sec. A4.2
pp. 195-6, Sec. B11 p. 257; ISO Sec. 5.2.4.2.1, Sec. 6.1.2.5;
H&S Secs. 5.1,5.2 pp. 110-114.
1.4: What should the 64-bit type be on a machine that can support it?
A: The forthcoming revision to the C Standard (C9X) specifies type
long long as effectively being at least 64 bits, and this type
has been implemented by a number of compilers for some time.
(Others have implemented extensions such as __longlong.)
On the other hand, there's no theoretical reason why a compiler
couldn't implement type short int as 16, int as 32, and long int
as 64 bits, and some compilers do indeed choose this
arrangement.
See also question 18.15d.
References: C9X Sec. 5.2.4.2.1, Sec. 6.1.2.5.
1.7: What's the best way to declare and define global variables
and functions?
A: First, though there can be many "declarations" (and in many
translation units) of a single "global" (strictly speaking,
"external") variable or function, there must be exactly one
"definition". (The definition is the declaration that actually
allocates space, and provides an initialization value, if any.)
The best arrangement is to place each definition in some
relevant .c file, with an external declaration in a header
(".h") file, which is #included wherever the declaration is
needed. The .c file containing the definition should also
#include the same header file, so that the compiler can check
that the definition matches the declarations.
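As a minimal sketch of this arrangement, here are the contents of two
hypothetical files, counter.h and counter.c, shown together in one
listing. The file names, the variable, and the function are all
invented for the example.

```c
/* ---- counter.h (hypothetical header) ---- */
extern int counter;        /* declaration only: allocates no storage */
int next_count(void);      /* prototype for the external function */

/* ---- counter.c (would begin with #include "counter.h") ---- */
int counter = 0;           /* the single definition, with its initializer */

int next_count(void)
{
    return ++counter;
}
```

Any other source file which needs counter or next_count() would
#include "counter.h" rather than repeating the declarations by hand.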
This rule promotes a high degree of portability: it is
consistent with the requirements of the ANSI C Standard, and is
also consistent with most pre-ANSI compilers and linkers. (Unix
compilers and linkers typically use a "common model" which
allows multiple definitions, as long as at most one is
initialized; this behavior is mentioned as a "common extension"
by the ANSI Standard, no pun intended. A few very odd systems
may require an explicit initializer to distinguish a definition
from an external declaration.)
It is possible to use preprocessor tricks to arrange that a line
like
DEFINE(int, i);
need only be entered once in one header file, and turned into a
definition or a declaration depending on the setting of some
macro, but it's not clear if this is worth the trouble.
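One way such a trick might look is sketched here. The macro name
DEFINE_GLOBALS and the header name globals.h are assumptions made for
the example, not an established convention. Exactly one source file
would define DEFINE_GLOBALS before including the header; every other
file would include it without the macro and so see only external
declarations.

```c
/* ---- in exactly one .c file, before the #include ---- */
#define DEFINE_GLOBALS

/* ---- globals.h (hypothetical header) ---- */
#ifdef DEFINE_GLOBALS
#define DEFINE(type, name) type name          /* expands to a definition */
#else
#define DEFINE(type, name) extern type name   /* expands to a declaration */
#endif

DEFINE(int, i);
```

Note that this simple form has no place for an initializer, which is
one reason the technique may be more trouble than it is worth.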
It's especially important to put global declarations in header
files if you want the compiler to catch inconsistent
declarations for you. In particular, never place a prototype
for an external function in a .c file: it wouldn't generally be
checked for consistency with the definition, and an incompatible
prototype is worse than useless.
See also questions 10.6 and 18.8.
References: K&R1 Sec. 4.5 pp. 76-7; K&R2 Sec. 4.4 pp. 80-1; ISO
Sec. 6.1.2.2, Sec. 6.7, Sec. 6.7.2, Sec. G.5.11; Rationale
Sec. 3.1.2.2; H&S Sec. 4.8 pp. 101-104, Sec. 9.2.3 p. 267; CT&P
Sec. 4.2 pp. 54-56.
1.11: What does extern mean in a function declaration?
A: It can be used as a stylistic hint to indicate that the
function's definition is probably in another source file, but
there is no formal difference between
extern int f();
and
int f();
References: ISO Sec. 6.1.2.2, Sec. 6.5.1; Rationale
Sec. 3.1.2.2; H&S Secs. 4.3,4.3.1 pp. 75-6.
1.12: What's the auto keyword good for?
A: Nothing; it's archaic. See also question 20.37.
References: K&R1 Sec. A8.1 p. 193; ISO Sec. 6.1.2.4, Sec. 6.5.1;
H&S Sec. 4.3 p. 75, Sec. 4.3.1 p. 76.
1.14: I can't seem to define a linked list successfully. I tried
typedef struct {
    char *item;
    NODEPTR next;
} *NODEPTR;
but the compiler gave me error messages. Can't a structure in C
contain a pointer to itself?
A: Structures in C can certainly contain pointers to themselves;
the discussion and example in section 6.5 of K&R make this
clear. The problem with the NODEPTR example is that the typedef
has not been defined at the point where the "next" field is
declared. To fix this code, first give the structure a tag
("struct node"). Then, declare the "next" field as a simple
"struct node *", or disentangle the typedef declaration from the
structure definition, or both. One corrected version would be
struct node {
    char *item;
    struct node *next;
};
typedef struct node *NODEPTR;
and there are at least three other equivalently correct ways of
arranging it.
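One of those other arrangements is sketched here: the typedef is
declared first, while struct node is still an incomplete type, so that
the typedef name can then be used inside the structure. The push() and
list_length() helpers are hypothetical, added only to show the list
in use.

```c
#include <stdlib.h>

typedef struct node *NODEPTR;   /* struct node is still incomplete here */

struct node {
    char *item;
    NODEPTR next;               /* the typedef is now in scope */
};

/* Hypothetical helper: put a new node at the front of the list.
   Returns the new head, or NULL if malloc fails. */
NODEPTR push(NODEPTR head, char *item)
{
    NODEPTR n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->item = item;
    n->next = head;
    return n;
}

/* Hypothetical helper: count the nodes by following the next pointers. */
int list_length(NODEPTR head)
{
    int len = 0;
    while (head != NULL) {
        len++;
        head = head->next;
    }
    return len;
}
```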
A similar problem, with a similar solution, can arise when
attempting to declare a pair of typedef'ed mutually referential
structures.
See also question 2.1.
References: K&R1 Sec. 6.5 p. 101; K&R2 Sec. 6.5 p. 139; ISO
Sec. 6.5.2, Sec. 6.5.2.3; H&S Sec. 5.6.1 pp. 132-3.
1.21: How do I declare an array of N pointers to functions returning
pointers to functions returning pointers to characters?
A: The first part of this question can be answered in at least
three ways:
1.  char *(*(*a[N])())();

2.  Build the declaration up incrementally, using typedefs:

        typedef char *pc;      /* pointer to char */
        typedef pc fpc();      /* function returning pointer to char */
        typedef fpc *pfpc;     /* pointer to above */
        typedef pfpc fpfpc();  /* function returning... */
        typedef fpfpc *pfpfpc; /* pointer to... */
        pfpfpc a[N];           /* array of... */

3.  Use the cdecl program, which turns English into C and vice
    versa:

        cdecl> declare a as array of pointer to function returning
                pointer to function returning pointer to char
        char *(*(*a[])())()

    cdecl can also explain complicated declarations, help with
    casts, and indicate which set of parentheses the arguments
    go in (for complicated function definitions, like the one
    above). See question 18.1.
Any good book on C should explain how to read these complicated
C declarations "inside out" to understand them ("declaration
mimics use").
The pointer-to-function declarations in the examples above have
not included parameter type information. When the parameters
have complicated types, declarations can *really* get messy.
(Modern versions of cdecl can help here, too.)
References: K&R2 Sec. 5.12 p. 122; ISO Sec. 6.5ff (esp.
Sec. 6.5.4); H&S Sec. 4.5 pp. 85-92, Sec. 5.10.1 pp. 149-50.
1.22: How can I declare a function that can return a pointer to a
function of the same type? I'm building a state machine with
one function for each state, each of which returns a pointer to
the function for the next state. But I can't find a way to
declare the functions.
A: You can't quite do it directly. Either have the function return
a generic function pointer, with some judicious casts to adjust
the types as the pointers are passed around; or have it return a
structure containing only a pointer to a function returning that
structure.
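The second approach might be sketched as follows. The two states, the
meaning of the input, and all of the names are invented for the
example; only the shape of struct state reflects the technique
described above.

```c
struct state {
    struct state (*fn)(int input);  /* pointer to a function returning struct state */
};

struct state idle(int input);
struct state running(int input);

/* From idle, a nonzero input starts the machine. */
struct state idle(int input)
{
    struct state next;
    next.fn = input ? running : idle;
    return next;
}

/* From running, a nonzero input stops it again. */
struct state running(int input)
{
    struct state next;
    next.fn = input ? idle : running;
    return next;
}

/* Feed n inputs to the machine, starting in idle.
   Returns 1 if it ends up in the running state, else 0. */
int ends_running(const int *inputs, int n)
{
    struct state s;
    int k;
    s.fn = idle;
    for (k = 0; k < n; k++)
        s = s.fn(inputs[k]);
    return s.fn == running;
}
```

No casts are needed with this form, at the cost of wrapping each
function pointer in a structure.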
1.25: My compiler is complaining about an invalid redeclaration of a
function, but I only define it once and call it once.
A: Functions which are called without a declaration in scope
(perhaps because the first call precedes the function's
definition) are assumed to be declared as returning int (and
without any argument type information), leading to discrepancies
if the function is later declared or defined otherwise. Non-int
functions must be declared before they are called.
Another possible source of this problem is that the function has
the same name as another one declared in some header file.
See also questions 11.3 and 15.1.
References: K&R1 Sec. 4.2 p. 70; K&R2 Sec. 4.2 p. 72; ISO
Sec. 6.3.2.2; H&S Sec. 4.7 p. 101.
1.25b: What's the right declaration for main()?
Is void main() correct?
A: See questions 11.12a to 11.15. (But no, it's not correct.)
1.30: What am I allowed to assume about the initial values
of variables which are not explicitly initialized?
If global variables start out as "zero", is that good
enough for null pointers and floating-point zeroes?
A: Uninitialized variables with "static" duration (that is, those
declared outside of functions, and those declared with the
storage class static), are guaranteed to start out as zero, as
if the programmer had typed "= 0". Therefore, such variables
are implicitly initialized to the null pointer (of the correct
type; see also section 5) if they are pointers, and to 0.0 if
they are floating-point.
Variables with "automatic" duration (i.e. local variables
without the static storage class) start out containing garbage,
unless they are explicitly initialized. (Nothing useful can be
predicted about the garbage.)
Dynamically-allocated memory obtained with malloc() and
realloc() is also likely to contain garbage, and must be
initialized by the calling program, as appropriate. Memory
obtained with calloc() is all-bits-0, but this is not
necessarily useful for pointer or floating-point values (see
question 7.31, and section 5).
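A short sketch of these rules follows; the names are hypothetical.
The first function relies on the guaranteed zero initialization of
static-duration objects. The second allocates a table of pointers and
sets each one to a null pointer explicitly, rather than trusting the
all-bits-0 pattern that calloc() would supply.

```c
#include <stdlib.h>

static char *sp;      /* static duration: starts out as a null pointer */
static double sd;     /* static duration: starts out as 0.0 */

/* Returns 1 if the implicitly initialized objects compare equal to zero. */
int statics_start_zero(void)
{
    static int calls;     /* static local: also starts out as 0 */
    return sp == NULL && sd == 0.0 && calls == 0;
}

/* Allocate n pointers and initialize each one to a true null pointer.
   Returns NULL if malloc fails. */
char **alloc_ptr_table(size_t n)
{
    char **t = malloc(n * sizeof *t);
    size_t k;
    if (t == NULL)
        return NULL;
    for (k = 0; k < n; k++)
        t[k] = NULL;
    return t;
}
```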
References: K&R1 Sec. 4.9 pp. 82-4; K&R2 Sec. 4.9 pp. 85-86; ISO
Sec. 6.5.7, Sec. 7.10.3.1, Sec. 7.10.5.3; H&S Sec. 4.2.8 pp. 72-
3, Sec. 4.6 pp. 92-3, Sec. 4.6.2 pp. 94-5, Sec. 4.6.3 p. 96,
Sec. 16.1 p. 386.
1.31: This code, straight out of a book, isn't compiling:
int f()
{
    char a[] = "Hello, world!";
}
A: Perhaps you have a pre-ANSI compiler, which doesn't allow
initialization of "automatic aggregates" (i.e. non-static
local arrays, structures, and unions). (As a workaround, and
depending on how the variable a is used, you may be able to make
it global or static, or replace it with a pointer, or initialize
it by hand with strcpy() when f() is called.) See also
question 11.29.
1.31b: What's wrong with this initialization?
char *p = malloc(10);
My compiler is complaining about an "invalid initializer",
or something.
A: Is the declaration of a static or non-local variable? Function
calls are allowed only in initializers for automatic variables
(that is, for local, non-static variables).
1.32: What is the difference between these initializations?
char a[] = "string literal";
char *p = "string literal";
My program crashes if I try to assign a new value to p[i].
A: A string literal can be used in two slightly different ways. As
an array initializer (as in the declaration of char a[]), it
specifies the initial values of the characters in that array.
Anywhere else, it turns into an unnamed, static array of
characters, which may be stored in read-only memory, which is
why you can't safely modify it. In an expression context, the
array is converted at once to a pointer, as usual (see section
6), so the second declaration initializes p to point to the
unnamed array's first element.
(For compiling old code, some compilers have a switch
controlling whether strings are writable or not.)
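The difference can be sketched as follows (the function name is
hypothetical). Declaring the pointer as const char * is a suggestion
added here, not something the question's code does; it lets the
compiler reject attempts to write through p.

```c
/* Returns the first character of a after modifying the array. */
char modify_copy(void)
{
    char a[] = "string literal";        /* a: writable array with its own copy */
    const char *p = "string literal";   /* p: points at an unnamed array; const
                                           makes the compiler reject p[i] = ... */
    a[0] = 'S';                         /* fine: changes only a */
    return (p[0] == 's') ? a[0] : '\0';
}
```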
See also questions 1.31, 6.1, 6.2, and 6.8.
References: K&R2 Sec. 5.5 p. 104; ISO Sec. 6.1.4, Sec. 6.5.7;
Rationale Sec. 3.1.4; H&S Sec. 2.7.4 pp. 31-2.
1.34: I finally figured out the syntax for declaring pointers to
functions, but now how do I initialize one?
A: Use something like
extern int func();
int (*fp)() = func;
When the name of a function appears in an expression like this,
it "decays" into a pointer (that is, it has its address
implicitly taken), much as an array name does.
An explicit declaration for the function is normally needed,
since implicit external function declaration does not happen in
this case (because the function name in the initialization is
not part of a function call).
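Here is a slightly fuller sketch, using prototypes (which the fragment
above omits) and a made-up function twice(). It also shows that the
pointer can be called either with or without an explicit *.

```c
int twice(int x)            /* hypothetical function to point at */
{
    return 2 * x;
}

int (*fp)(int) = twice;     /* twice decays to a pointer; &twice is equivalent */

/* Call through the pointer using both accepted syntaxes. */
int call_both_ways(int x)
{
    return fp(x) + (*fp)(x);
}
```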
See also questions 1.25 and 4.12.
Section 2. Structures, Unions, and Enumerations
2.1: What's the difference between these two declarations?
struct x1 { ... };
typedef struct { ... } x2;
A: The first form declares a "structure tag"; the second declares a
"typedef". The main difference is that you subsequently refer
to the first type as "struct x1" and the second simply as "x2".
That is, the second declaration is of a slightly more abstract
type -- its users don't necessarily know that it is a structure,
and the keyword struct is not used when declaring instances of it.
2.2: Why doesn't
struct x { ... };
x thestruct;
work?
A: C is not C++. Typedef names are not automatically generated for
structure tags. See also question 2.1 above.
2.3: Can a structure contain a pointer to itself?
A: Most certainly. See question 1.14.
2.4: What's the best way of implementing opaque (abstract) data types
in C?
A: One good way is for clients to use structure pointers (perhaps
additionally hidden behind typedefs) which point to structure
types which are not publicly defined.
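A minimal sketch of the idea, assuming a small fixed-size stack as the
abstract type. The file names and all identifiers are hypothetical.
Clients would see only the header part, in which struct stack is an
incomplete type; the structure's members appear only in the
implementation file.

```c
#include <stdlib.h>

/* ---- stack.h (hypothetical public header) ---- */
typedef struct stack Stack;            /* incomplete type: layout not visible */
Stack *stack_new(void);
int stack_push(Stack *s, int value);   /* 0 on success, -1 if full */
int stack_pop(Stack *s, int *out);     /* 0 on success, -1 if empty */
void stack_free(Stack *s);

/* ---- stack.c (private implementation) ---- */
struct stack {
    int data[16];
    int top;
};

Stack *stack_new(void)
{
    Stack *s = malloc(sizeof *s);
    if (s != NULL)
        s->top = 0;
    return s;
}

int stack_push(Stack *s, int value)
{
    if (s->top >= 16)
        return -1;
    s->data[s->top++] = value;
    return 0;
}

int stack_pop(Stack *s, int *out)
{
    if (s->top == 0)
        return -1;
    *out = s->data[--s->top];
    return 0;
}

void stack_free(Stack *s)
{
    free(s);
}
```

Because clients never see the members, the implementation can later
change (to a growable array, say) without their being recompiled
against a new layout.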
2.6: I came across some code that declared a structure like this:
struct name {
    int namelen;
    char namestr[1];
};
and then did some tricky allocation to make the namestr array
act like it had several elements. Is this legal or portable?
A: This technique is popular, although Dennis Ritchie has called it
"unwarranted chumminess with the C implementation." An official
interpretation has deemed that it is not strictly conforming
with the C Standard, although it does seem to work under all
known implementations. (Compilers which check array bounds
carefully might issue warnings.)
Another possibility is to declare the variable-size element very
large, rather than very small; in the case of the above example:
...
char namestr[MAXSIZE];
where MAXSIZE is larger than any name which will be stored.
However, it looks like this technique is disallowed by a strict
interpretation of the Standard as well. Furthermore, either of
these "chummy" structures must be used with care, since the
programmer knows more about their size than the compiler does.
(In particular, they can generally only be manipulated via
pointers.)
C9X will introduce the concept of a "flexible array member",
which will allow the size of an array to be omitted if it is
the last member in a structure, thus providing a well-defined
solution.
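Assuming a compiler which already accepts the C9X syntax, the
structure from the question might be rewritten along these lines.
The makename() function is a hypothetical helper showing the
allocation.

```c
#include <stdlib.h>
#include <string.h>

struct name {
    int namelen;
    char namestr[];     /* flexible array member: must be the last member */
};

/* Allocate a struct name with room for a copy of s, including its '\0'.
   Returns NULL if malloc fails. */
struct name *makename(const char *s)
{
    size_t len = strlen(s);
    struct name *n = malloc(sizeof *n + len + 1);
    if (n != NULL) {
        n->namelen = (int)len;
        memcpy(n->namestr, s, len + 1);
    }
    return n;
}
```

As with the older tricks, such a structure can generally only be
manipulated via pointers, and a single free() releases it.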
References: Rationale Sec. 3.5.4.2; C9X Sec. 6.5.2.1.
2.7: I heard that structures could be assigned to variables and
passed to and from functions, but K&R1 says not.
A: What K&R1 said (though this was quite some time ago by now) was
that the restrictions on structure operations would be lifted
in a forthcoming version of the compiler, and in fact structure
assignment and passing were fully functional in Ritchie's
compiler even as K&R1 was being published. A few ancient C
compilers may have lacked these operations, but all modern
compilers support them, and they are part of the ANSI C
standard, so there should be no reluctance to use them.
(Note that when a structure is assigned, passed, or returned,
the copying is done monolithically; the data pointed to by any
pointer fields is *not* copied.)
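The point about pointer fields can be seen in this sketch, which uses
a made-up structure. After the assignment, both structures hold the
same pointer value, so a write through one is visible through the
other.

```c
struct label {
    int id;
    char *text;     /* pointer member: only the pointer is copied */
};

/* Returns 1 if a change made through the copy is visible through the original. */
int copy_shares_text(void)
{
    char buf[] = "abc";
    struct label a, b;
    a.id = 1;
    a.text = buf;
    b = a;                 /* copies id and the pointer value */
    b.id = 2;              /* independent of a.id */
    b.text[0] = 'X';       /* writes into the shared buffer */
    return a.id == 1 && a.text[0] == 'X' && a.text == b.text;
}
```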
References: K&R1 Sec. 6.2 p. 121; K&R2 Sec. 6.2 p. 129; ISO
Sec. 6.1.2.5, Sec. 6.2.2.1, Sec. 6.3.16; H&S Sec. 5.6.2 p. 133.
2.8: Is there a way to compare structures automatically?
A: No. There is no single, good way for a compiler to implement
implicit structure comparison (i.e. to support the == operator
for structures) which is consistent with C's low-level flavor.
A simple byte-by-byte comparison could founder on random bits
present in unused "holes" in the structure (such padding is used
to keep the alignment of later fields correct; see question
2.12). A field-by-field comparison might require unacceptable
amounts of repetitive code for large structures.
If you need to compare two structures, you'll have to write your
own function to do so, field by field.
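For a small, hypothetical structure, such a function might look like
this. Note that the character array is compared with strcmp(), so
that any bytes after the terminating '\0' are ignored, just as padding
is.

```c
#include <string.h>

struct employee {          /* hypothetical structure */
    char name[20];
    int id;
    double salary;
};

/* Returns nonzero if every field of *a matches the same field of *b. */
int employee_equal(const struct employee *a, const struct employee *b)
{
    return strcmp(a->name, b->name) == 0
        && a->id == b->id
        && a->salary == b->salary;
}
```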
References: K&R2 Sec. 6.2 p. 129; Rationale Sec. 3.3.9; H&S
Sec. 5.6.2 p. 133.
2.10: How can I pass constant values to functions which accept
structure arguments?
A: As of this writing, C has no way of generating anonymous
structure values. You will have to use a temporary structure
variable or a little structure-building function.
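A little structure-building function might be sketched like this.
The point_sum() function is a hypothetical stand-in for any function
which accepts a structure argument.

```c
struct point {
    int x, y;
};

/* Build a struct point from two constants. */
struct point makepoint(int x, int y)
{
    struct point p;
    p.x = x;
    p.y = y;
    return p;
}

/* Hypothetical function taking a structure argument by value. */
int point_sum(struct point p)
{
    return p.x + p.y;
}
```

A call such as point_sum(makepoint(1, 2)) then passes the constant
pair without a named temporary variable.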
The C9X Standard will introduce "compound literals"; one form of
compound literal will allow structure constants. For example,
to pass a constant coordinate pair to a plotpoint() function
which expects a struct point, you will be able to call
plotpoint((struct point){1, 2});
Combined with "designated initializers" (another C9X feature),
it will also be possible to specify member values by name:
plotpoint((struct point){.x=1, .y=2});
See also question 4.10.
References: C9X Sec. 6.3.2.5, Sec. 6.5.8.
2.11: How can I read/write structures from/to data files?
A: It is relatively straightforward to write a structure out using
fwrite():
fwrite(&somestruct, sizeof somestruct, 1, fp);
and a corresponding fread invocation can read it back in.
However, data files so written will *not* be portable (see
questions 2.12 and 20.5). Note also that if the structure
contains any pointers, only the pointer values will be written,
and they are most unlikely to be valid when read back in.
Finally, note that for widespread portability you must use the
"b" flag when fopening the files; see question 12.38.
A more portable solution, though it's a bit more work initially,
is to write a pair of functions for writing and reading a
structure, field-by-field, in a portable (perhaps even human-
readable) way.
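A minimal sketch of such a pair of functions, for a made-up structure
and a simple text format of one record per line:

```c
#include <stdio.h>

struct rec {               /* hypothetical record */
    int id;
    long count;
};

/* Write one record as text. Returns 0 on success, -1 on error. */
int write_rec(FILE *fp, const struct rec *r)
{
    return fprintf(fp, "%d %ld\n", r->id, r->count) < 0 ? -1 : 0;
}

/* Read one record written by write_rec(). Returns 0 on success, -1 otherwise. */
int read_rec(FILE *fp, struct rec *r)
{
    return fscanf(fp, "%d %ld", &r->id, &r->count) == 2 ? 0 : -1;
}
```

Since the values travel as decimal text, the file does not depend on
the writer's byte order, type sizes, or structure padding.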
References: H&S Sec. 15.13 p. 381.
2.12: My compiler is leaving holes in structures, which is wasting
space and preventing "binary" I/O to external data files. Can I
turn off the padding, or otherwise control the alignment of
structure fields?
A: Your compiler may provide an extension to give you this control
(perhaps a #pragma; see question 11.20), but there is no
standard method.
See also question 20.5.
References: K&R2 Sec. 6.4 p. 138; H&S Sec. 5.6.4 p. 135.
2.13: Why does sizeof report a larger size than I expect for a
structure type, as if there were padding at the end?
A: Structures may have this padding (as well as internal padding),
if necessary, to ensure that alignment properties will be
preserved when an array of contiguous structures is allocated.
Even when the structure is not part of an array, the end padding
remains, so that sizeof can always return a consistent size.
See also question 2.12 above.
References: H&S Sec. 5.6.7 pp. 139-40.
2.14: How can I determine the byte offset of a field within a
structure?
A: ANSI C defines the offsetof() macro, which should be used if
available; see <stddef.h>. If you don't have it, one possible
implementation is
#define offsetof(type, mem) ((size_t) \
((char *)&((type *)0)->mem - (char *)(type *)0))
This implementation is not 100% portable; some compilers may
legitimately refuse to accept it.
See question 2.15 below for a usage hint.
References: ISO Sec. 7.1.6; Rationale Sec. 3.5.4.2; H&S
Sec. 11.1 pp. 292-3.
2.15: How can I access structure fields by name at run time?
A: Build a table of names and offsets, using the offsetof() macro.
The offset of field b in struct a is
offsetb = offsetof(struct a, b)
If structp is a pointer to an instance of this structure, and
field b is an int (with offset as computed above), b's value can
be set indirectly with
*(int *)((char *)structp + offsetb) = value;
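Putting the pieces together, a table-driven version might be sketched
as follows. The fieldinfo structure and the set_int_field() function
are hypothetical, and the sketch assumes that every field in the table
is an int.

```c
#include <stddef.h>
#include <string.h>

struct a {
    int b;
    int c;
};

struct fieldinfo {             /* hypothetical table entry */
    const char *name;
    size_t offset;
};

static const struct fieldinfo fields[] = {
    { "b", offsetof(struct a, b) },
    { "c", offsetof(struct a, c) },
};

/* Set the int field called name. Returns 0 on success, -1 if unknown. */
int set_int_field(struct a *structp, const char *name, int value)
{
    size_t k;
    for (k = 0; k < sizeof fields / sizeof fields[0]; k++) {
        if (strcmp(fields[k].name, name) == 0) {
            *(int *)((char *)structp + fields[k].offset) = value;
            return 0;
        }
    }
    return -1;
}
```

A table handling fields of several types would also need to record
each field's type, so that the right cast could be chosen.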
2.18: This program works correctly, but it dumps core after it
finishes. Why?
struct list {
    char *item;
    struct list *next;
}
/* Here is the main program. */
main(argc, argv)
{ ... }
A: A missing semicolon causes main() to be declared as returning a
structure. (The connection is hard to see because of the
intervening comment.) Since structure-valued functions are
usually implemented by adding a hidden return pointer, the
generated code for main() tries to accept three arguments,
although only two are passed (in this case, by the C start-up
code). See also questions 10.9 and 16.4.
References: CT&P Sec. 2.3 pp. 21-2.
2.20: Can I initialize unions?
A: The current C Standard allows an initializer for the first-named
member of a union. C9X will introduce "designated initializers"
which can be used to initialize any member.
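Both forms are sketched here with a made-up union. The second
declaration assumes a compiler which already accepts C9X designated
initializers.

```c
union number {
    int i;
    double d;
};

union number first = { 42 };         /* initializes i, the first-named member */
union number second = { .d = 1.5 };  /* designated initializer: initializes d */
```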
References: K&R2 Sec. 6.8 pp. 148-9; ISO Sec. 6.5.7; C9X
Sec. 6.5.8; H&S Sec. 4.6.7 p. 100.
2.22: What is the difference between an enumeration and a set of
preprocessor #defines?
A: At the present time, there is little difference. The C Standard
says that enumerations may be freely intermixed with other
integral types, without errors. (If, on the other hand, such
intermixing were disallowed without explicit casts, judicious
use of enumerations could catch certain programming errors.)
Some advantages of enumerations are that the numeric values are
automatically assigned, that a debugger may be able to display
the symbolic values when enumeration variables are examined, and
that they obey block scope. (A compiler may also generate
nonfatal warnings when enumerations and integers are
indiscriminately mixed, since doing so can still be considered
bad style even though it is not strictly illegal.) A
disadvantage is that the programmer has little control over
those nonfatal warnings; some programmers also resent not having
control over the sizes of enumeration variables.
References: K&R2 Sec. 2.3 p. 39, Sec. A4.2 p. 196; ISO
Sec. 6.1.2.5, Sec. 6.5.2, Sec. 6.5.2.2, Annex F; H&S Sec. 5.5
pp. 127-9, Sec. 5.11.2 p. 153.
2.24: Is there an easy way to print enumeration values symbolically?
A: No. You can write a little function to map an enumeration
constant to a string. (For debugging purposes, a good debugger
should automatically print enumeration constants symbolically.)
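Such a function might look like the following sketch, which uses a
hypothetical enumeration. The names have to be kept in step with the
enumeration by hand.

```c
enum color { RED, GREEN, BLUE };   /* hypothetical enumeration */

/* Map an enumeration constant to its name. */
const char *color_name(enum color c)
{
    switch (c) {
    case RED:   return "RED";
    case GREEN: return "GREEN";
    case BLUE:  return "BLUE";
    }
    return "unknown";              /* value outside the enumeration */
}
```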