Saturday, April 5, 2008

C language latest notes and Technical Interview Questions - 11

C Preprocessor

1. How can we write a generic macro to swap two values?
A: There is no good answer to this question. If the values are integers, a well-known trick using exclusive-OR could perhaps be used, but it will not work for floating-point values or pointers, or if the two values are the same variable. If the macro is intended to be used on values of arbitrary type (the usual goal), it cannot use a temporary, since it does not know what type of temporary it needs (and would have a hard time picking a name for it if it did), and standard C does not provide a typeof operator.
The best all-around solution is probably to forget about using a macro, unless you're willing to pass in the type as a third argument.
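
As a rough sketch of the "pass in the type" approach, the macro below declares a temporary of the caller-supplied type. The names SWAP, swap_tmp_, and swap_demo are made up for this example, and the macro assumes the type can be written as a simple prefix (it will not work for array or function-pointer types).

```c
/* Hypothetical SWAP macro: the caller supplies the type as the first
   argument, so the macro can declare a temporary of that type. */
#define SWAP(type, a, b)          \
    do {                          \
        type swap_tmp_ = (a);     \
        (a) = (b);                \
        (b) = swap_tmp_;          \
    } while (0)

/* Example use: swaps two doubles and reports whether it worked. */
int swap_demo(void)
{
    double x = 1.5, y = 2.5;
    SWAP(double, x, y);
    return x == 2.5 && y == 1.5;
}
```

Unlike the exclusive-OR trick, this form also works for floating-point values and pointers, and when both arguments name the same variable.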

2. What's the best way to write a multi-statement macro?
A: The usual goal is to write a macro that can be invoked as if it were a statement consisting of a single function call. This means that the "caller" will be supplying the final semicolon, so the macro body should not. The macro body cannot therefore be a simple brace-enclosed compound statement, because syntax errors would result if it were invoked (apparently as a single statement, but with a resultant extra semicolon) as the if branch of an if/else statement with an explicit else clause.

The traditional solution, therefore, is to use
#define MACRO(arg1, arg2) do { \
/* declarations */ \
stmt1; \
stmt2; \
/* ... */ \
} while(0) /* (no trailing ; ) */

When the caller appends a semicolon, this expansion becomes a single statement regardless of context. (An optimizing compiler will remove any "dead" tests or branches on the constant condition 0, although lint may complain.)
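
To illustrate, here is a small sketch of a do/while(0) macro used as both branches of an if/else. The macro LOG_AND_COUNT and the function classify are invented for the example.

```c
#include <stdio.h>

/* Hypothetical two-statement macro wrapped in do { ... } while(0). */
#define LOG_AND_COUNT(msg, counter) do { \
        puts(msg);                       \
        (counter)++;                     \
    } while(0)

/* With a plain { ... } body, the semicolon after the first macro call
   would end the if statement and leave the else with nothing to attach to. */
int classify(int n, int *negatives, int *others)
{
    if (n < 0)
        LOG_AND_COUNT("negative", *negatives);
    else
        LOG_AND_COUNT("non-negative", *others);
    return n < 0;
}
```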

If all of the statements in the intended macro are simple expressions, with no declarations or loops, another technique is to write a single, parenthesized expression using one or more comma operators. (For an example, see the first DEBUG() macro in question 19.) This technique also allows a value to be "returned."
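
A minimal sketch of the comma-operator form follows. TRACED and traced_sum are hypothetical names; the point is only that the value of the whole expression is the value of its last operand.

```c
#include <stdio.h>

/* Hypothetical macro: prints a trace line, then "returns" the value of x,
   because the comma operator yields its right-hand operand. */
#define TRACED(x) (printf("evaluating %s\n", #x), (x))

int traced_sum(int a, int b)
{
    return TRACED(a + b);
}
```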

3. I'm splitting up a program into multiple source files for the first time, and I'm wondering what to put in .c files and what to put in .h files. (What does ".h" mean, anyway?)
A: As a general rule, you should put these things in header (.h) files:
macro definitions (preprocessor #defines)
structure, union, and enumeration declarations
typedef declarations
external function declarations
global variable declarations

It's especially important to put a declaration or definition in a header file when it will be shared between several other files.

On the other hand, when a definition or declaration should remain private to one .c file, it's fine to leave it there.
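
The split might look roughly like the sketch below, shown as one listing with comments marking where each file would begin. The file names counter.h and counter.c and every identifier in them are assumptions made for illustration.

```c
/* ---- counter.h (hypothetical): declarations shared between files ---- */
#ifndef COUNTER_H
#define COUNTER_H

#define COUNTER_MAX 100            /* macro definition */

typedef struct counter {           /* structure and typedef declarations */
    int value;
} counter;

extern int counter_total;          /* global variable declaration */
int counter_bump(counter *c);      /* external function declaration */

#endif

/* ---- counter.c (hypothetical): would begin with #include "counter.h" ---- */
int counter_total = 0;             /* the single definition of the global */

static int clamp(int v)            /* private helper: stays in the .c file */
{
    return v > COUNTER_MAX ? COUNTER_MAX : v;
}

int counter_bump(counter *c)
{
    c->value = clamp(c->value + 1);
    counter_total++;
    return c->value;
}
```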

4. Is it acceptable for one header file to #include another?
A: It's a question of style, and thus receives considerable debate. Many people believe that "nested #include files" are to be avoided: the prestigious Indian Hill Style Guide disparages them; they can make it harder to find relevant definitions; they can lead to multiple-definition errors if a file is #included twice; and they make manual Makefile maintenance very difficult. On the other hand, they make it possible to use header files in a modular way (a header file can #include what it needs itself, rather than requiring each #includer to do so); a tool like grep (or a tags file) makes it easy to find definitions no matter where they are; a popular trick along the lines of:

#ifndef HFILENAME_USED
#define HFILENAME_USED
...header file contents...
#endif

(where a different bracketing macro name is used for each header file) makes a header file "idempotent" so that it can safely be #included multiple times; and automated Makefile maintenance tools (which are a virtual necessity in large projects anyway) handle dependency generation in the face of nested #include files easily.

5. What's the difference between #include <> and #include "" ?
A: The <> syntax is typically used with Standard or system-supplied headers, while "" is typically used for a program's own header files.

6. What are the complete rules for header file searching?
A: The exact behavior is implementation-defined (which means that it is supposed to be documented).
Typically, headers named with <> syntax are searched for in one or more standard places. Header files named with "" syntax are first searched for in the "current directory," then (if not found) in the same standard places.

Traditionally (especially under Unix compilers), the current directory is taken to be the directory containing the file containing the #include directive. Under other compilers, however, the current directory (if any) is the directory in which the compiler was initially invoked. Check your compiler documentation.

7. I'm getting strange syntax errors on the very first declaration in a file, but it looks fine.
A: Perhaps there's a missing semicolon at the end of the last declaration in the last header file you're #including.

8. I seem to be missing a system header file. Can someone send me a copy?
A: Standard headers exist in part so that definitions appropriate to your compiler, operating system, and processor can be supplied. You cannot just pick up a copy of someone else's header file and expect it to work, unless that person is using exactly the same environment. Ask your compiler vendor why the file was not provided (or to send a replacement copy).

9. How can I construct preprocessor #if expressions which compare strings?
A: You can't do it directly; preprocessor #if arithmetic uses only integers. An alternative is to #define several macros with symbolic names and distinct integer values, and implement conditionals on those.
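
One way this could look is sketched below. The names OS_LINUX, OS_WINDOWS, OS_MACOS, TARGET_OS, and PATH_SEPARATOR are invented for the example; the integer macros stand in for the strings one might have wanted to compare.

```c
/* Hypothetical symbolic names standing in for the strings
   "linux", "windows", and "macos". */
#define OS_LINUX   1
#define OS_WINDOWS 2
#define OS_MACOS   3

/* Normally set on the compiler command line, e.g. -DTARGET_OS=OS_WINDOWS.
   The default below is only for this sketch. */
#ifndef TARGET_OS
#define TARGET_OS OS_LINUX
#endif

#if TARGET_OS == OS_LINUX
#define PATH_SEPARATOR '/'
#elif TARGET_OS == OS_WINDOWS
#define PATH_SEPARATOR '\\'
#elif TARGET_OS == OS_MACOS
#define PATH_SEPARATOR '/'
#else
#error "unknown TARGET_OS"
#endif

char path_separator(void)
{
    return PATH_SEPARATOR;
}
```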

10. Does the sizeof operator work in preprocessor #if directives?
A: No. Preprocessing happens during an earlier phase of compilation, before type names have been parsed. Instead of sizeof, consider using the predefined constants in ANSI's <limits.h>, if applicable, or perhaps a "configure" script. (Better yet, try to write code which is inherently insensitive to type sizes.)
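
For instance, a test on INT_MAX can often take the place of a test on sizeof(int). The sketch below assumes the goal is a type with at least 32 bits; the typedef name int32ish is made up for the example.

```c
#include <limits.h>

/* Pick a type with at least 32 bits without using sizeof in #if. */
#if INT_MAX >= 2147483647
typedef int int32ish;
#else
typedef long int32ish;   /* long is guaranteed to hold at least 32 bits */
#endif

int32ish big_value(void)
{
    return 2000000000;
}
```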

11. Can I use an #ifdef in a #define line, to define something two different ways?
A: No. You can't "run the preprocessor on itself," so to speak. What you can do is use one of two completely separate #define lines, depending on the #ifdef setting.
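
A short sketch of the two-definition approach, assuming a hypothetical VERBOSE setting and a made-up LOG macro:

```c
#include <stdio.h>

/* An #ifdef cannot appear inside a #define line, so choose between
   two complete definitions instead. */
#ifdef VERBOSE
#define LOG(msg) printf("%s:%d: %s\n", __FILE__, __LINE__, msg)
#else
#define LOG(msg) ((void)0)
#endif

int do_work(int x)
{
    LOG("doing work");
    return x * 2;
}
```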

12. Is there anything like an #ifdef for typedefs?
A: Unfortunately, no. You may have to keep sets of preprocessor macros (e.g. MY_TYPE_DEFINED) recording whether certain typedefs have been declared.
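
In practice this might look like the following sketch, where the typedef name my_type is invented and the same guarded block is imagined to appear in two different headers:

```c
/* The preprocessor cannot see typedefs, so a macro records that the
   typedef has been declared. */
#ifndef MY_TYPE_DEFINED
#define MY_TYPE_DEFINED
typedef unsigned long my_type;
#endif

/* A second header wanting the same typedef repeats the guard; this
   copy is skipped because MY_TYPE_DEFINED is already set. */
#ifndef MY_TYPE_DEFINED
#define MY_TYPE_DEFINED
typedef unsigned long my_type;
#endif

my_type my_type_max(void)
{
    return (my_type)-1;
}
```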

13. How can I use a preprocessor #if expression to tell if a machine is big-endian or little-endian?
A: You probably can't. (Preprocessor arithmetic uses only long integers, and there is no concept of addressing.) Are you sure you need to know the machine's endianness explicitly? Usually it's better to write code which doesn't care.
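
As one illustration of code that doesn't care, the sketch below reads and writes a 32-bit quantity in a fixed (big-endian) byte order using shifts, which operate on values rather than on memory layout. The function names read_be32 and write_be32 are assumptions for the example.

```c
/* Reads a 32-bit big-endian value from a byte buffer. The result is the
   same on big-endian and little-endian hosts. */
unsigned long read_be32(const unsigned char *p)
{
    return ((unsigned long)p[0] << 24) |
           ((unsigned long)p[1] << 16) |
           ((unsigned long)p[2] << 8)  |
            (unsigned long)p[3];
}

/* Writes the low 32 bits of v to a buffer in big-endian order. */
void write_be32(unsigned char *p, unsigned long v)
{
    p[0] = (unsigned char)((v >> 24) & 0xFF);
    p[1] = (unsigned char)((v >> 16) & 0xFF);
    p[2] = (unsigned char)((v >> 8) & 0xFF);
    p[3] = (unsigned char)(v & 0xFF);
}
```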

14. I inherited some code which contains far too many #ifdef's for my taste. How can I preprocess the code to leave only one conditional compilation set, without running it through the preprocessor and expanding all of the #include's and #define's as well?
A: There are programs floating around called unifdef, rmifdef, and scpp ("selective C preprocessor") which do exactly this.

15. How can I list all of the predefined identifiers?
A: There's no standard way, although it is a common need. gcc provides a -dM option which works with -E, and other compilers may provide something similar. If the compiler documentation is unhelpful, the most expedient way is probably to extract printable strings from the compiler or preprocessor executable with something like the Unix strings utility. Beware that many traditional system-specific predefined identifiers (e.g. "unix") are non-Standard (because they clash with the user's namespace) and are being removed or renamed.

16. I have some old code that tries to construct identifiers with a macro like
#define Paste(a, b) a/**/b
but it doesn't work any more.
A: It was an undocumented feature of some early preprocessor implementations (notably John Reiser's) that comments disappeared entirely and could therefore be used for token pasting. ANSI affirms (as did K&R1) that comments are replaced with white space. However, since the need for pasting tokens was demonstrated and real, ANSI introduced a well-defined token-pasting operator, ##, which can be used like this:

#define Paste(a, b) a##b
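
A small usage sketch follows; the variable names count1 and count2 and the function paste_sum are invented for the example.

```c
#define Paste(a, b) a##b

/* Paste(count, 1) becomes the single identifier count1. */
int Paste(count, 1) = 10;
int Paste(count, 2) = 20;

int paste_sum(void)
{
    return count1 + count2;
}
```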

17. Why is the macro
#define TRACE(n) printf("TRACE: %d\n", n)
giving me the warning "macro replacement within a string literal"?

It seems to be expanding
TRACE(count);
as
printf("TRACE: %d\count", count);
A: Some pre-ANSI compilers and preprocessors expanded macro parameters even inside string literals and character constants, so the n in "\n" is being replaced by the macro argument. ANSI C does not do this. Under a compiler that still does, rename the macro parameter to something that does not appear in the string.

18. I've got this tricky preprocessing I want to do and I can't figure out a way to do it.
A: C's preprocessor is not intended as a general-purpose tool. (Note also that it is not guaranteed to be available as a separate program.) Rather than forcing it to do something inappropriate, consider writing your own little special-purpose preprocessing tool, instead. You can easily get a utility like make(1) to run it for you automatically.

If you are trying to preprocess something other than C, consider using a general-purpose preprocessor. (One older one available on most Unix systems is m4.)

19. How can I write a macro which takes a variable number of arguments?
A: One popular trick is to define and invoke the macro with a single, parenthesized "argument" which in the macro expansion becomes the entire argument list, parentheses and all, for a function such as printf():

#define DEBUG(args) (printf("DEBUG: "), printf args)
if(n != 0) DEBUG(("n is %d\n", n));

The obvious disadvantage is that the caller must always remember to use the extra parentheses.

gcc has an extension which allows a function-like macro to accept a variable number of arguments, but it's not standard. Other possible solutions are to use different macros (DEBUG1, DEBUG2, etc.) depending on the number of arguments, or to play tricky games with commas:

#define DEBUG(args) (printf("DEBUG: "), printf(args))
#define _ ,
DEBUG("i = %d" _ i)

C99 introduced formal support for function-like macros with variable-length argument lists. The notation ... appears at the end of the macro "prototype" (just as it does for varargs functions), and the pseudomacro __VA_ARGS__ in the macro definition is replaced by the variable arguments during invocation.
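
With a compiler that supports this notation, the DEBUG macro could be sketched as below, with no extra parentheses needed at the call site. The function debug_demo is made up for the example.

```c
#include <stdio.h>

/* Variadic macro: __VA_ARGS__ stands for everything passed to DEBUG,
   here the format string and its arguments. */
#define DEBUG(...) (printf("DEBUG: "), printf(__VA_ARGS__))

/* Returns what the second printf returned, or 0 if nothing was printed. */
int debug_demo(int n)
{
    int written = 0;
    if (n != 0)
        written = DEBUG("n is %d\n", n);
    return written;
}
```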

Finally, you can always use a bona-fide function, which can take a variable number of arguments in a well-defined way. (If you needed a macro replacement, try using a function plus a non-function-like macro, e.g. #define printf myprintf .)
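
A minimal sketch of the function approach, assuming a hypothetical debug_printf built on the standard vprintf, with a non-function-like macro DEBUG as an alias for it:

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical debug function: prints a prefix, then lets vprintf
   format the caller's arguments. Returns vprintf's result. */
int debug_printf(const char *fmt, ...)
{
    va_list ap;
    int n;

    printf("DEBUG: ");
    va_start(ap, fmt);
    n = vprintf(fmt, ap);
    va_end(ap);
    return n;
}

/* Non-function-like macro: DEBUG("i = %d\n", i) simply becomes
   debug_printf("i = %d\n", i). */
#define DEBUG debug_printf
```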
