Saturday, April 5, 2008

C language Library Functions Questions - 14

C Language Library Functions

1. How can I convert numbers to strings (the opposite of atoi)? Is there an itoa() function?
A: Just use sprintf(). (Don't worry that sprintf() may be overkill, potentially wasting run time or code space; it works well in practice.)

You can obviously use sprintf() to convert long or floating-point numbers to strings as well (using %ld or %f).
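As a minimal sketch of the sprintf() approach: the helper name int_to_string is hypothetical, and the sketch assumes a C99 library, since it uses snprintf() (the bounded variant of sprintf()) so that a too-small buffer is not overrun.

```c
#include <stdio.h>

/* Hypothetical helper: writes the decimal form of n into buf, using at
   most size bytes including the terminating '\0'.  Returns the number
   of characters the full result needs, or a negative value on error. */
int int_to_string(long n, char *buf, size_t size)
{
    return snprintf(buf, size, "%ld", n);
}
```

A return value greater than or equal to size means the output was truncated.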

2. Why does strncpy() not always place a '\0' terminator in the destination string?
A: strncpy() was first designed to handle a now-obsolete data structure, the fixed-length, not-necessarily-\0-terminated "string." (A related quirk of strncpy's is that it pads short strings with multiple \0's, out to the specified length.) strncpy() is admittedly a bit cumbersome to use in other contexts, since you must often append a '\0' to the destination string by hand. You can get around the problem by using strncat() instead of strncpy(): if the destination string starts out empty, strncat() does what you probably wanted strncpy() to do. Another possibility is sprintf(dest, "%.*s", n, source).
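The strncat() workaround might be sketched as follows. The name copy_truncated and its parameter order are assumptions made for illustration, and the sketch assumes the destination buffer holds at least one byte.

```c
#include <string.h>

/* Hypothetical helper: copy at most destsize - 1 characters of src into
   dest and always terminate the result.  Assumes destsize > 0. */
void copy_truncated(char *dest, size_t destsize, const char *src)
{
    dest[0] = '\0';                    /* start with an empty string */
    strncat(dest, src, destsize - 1);  /* strncat() always appends '\0' */
}
```

Unlike strncpy(), this never leaves dest unterminated and does not pad the rest of the buffer with \0's.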

When arbitrary bytes (as opposed to strings) are being copied, memcpy() is usually a more appropriate function to use than strncpy().

3. Why do some versions of toupper() act strangely if given an upper-case letter? Why does some code call islower() before toupper()?
A: Older versions of toupper() and tolower() did not always work correctly on arguments which did not need converting (i.e. on digits or punctuation or letters already of the desired case). In ANSI/ISO Standard C, these functions are guaranteed to work appropriately on all character arguments.

4. How can I split up a string into whitespace-separated fields? How can I duplicate the process by which main() is handed argc and argv?
A: The only Standard function available for this kind of "tokenizing" is strtok(), although it can be tricky to use and it may not do everything you want it to. (For instance, it does not handle quoting.)
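A rough sketch of strtok()-based splitting is shown here. The function name split_fields is hypothetical; note that strtok() modifies the string it scans and keeps hidden state between calls, which is part of what makes it tricky.

```c
#include <string.h>

/* Hypothetical helper: split line in place into whitespace-separated
   fields.  Stores up to maxfields pointers in fields[] and returns how
   many were stored.  The line is modified: strtok() overwrites the
   separator after each field with '\0'. */
size_t split_fields(char *line, char *fields[], size_t maxfields)
{
    size_t n = 0;
    char *tok = strtok(line, " \t\n");

    while (tok != NULL && n < maxfields) {
        fields[n++] = tok;
        tok = strtok(NULL, " \t\n");   /* NULL means "continue scanning" */
    }
    return n;
}
```

The returned count and the fields[] array play roughly the roles of argc and argv.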

5. I need some code to do regular expression and wildcard matching.
A: Make sure you recognize the difference between classic regular expressions (variants of which are used in such Unix utilities as ed and grep), and filename wildcards (variants of which are used by most operating systems).

There are a number of packages available for matching regular expressions. Most packages use a pair of functions, one for "compiling" the regular expression, and one for "executing" it (i.e. matching strings against it). Look for header files named <regex.h> or <regexp.h>, and functions called regcmp/regex, regcomp/regexec, or re_comp/re_exec. (These functions may exist in a separate regexp library.) A popular, freely-redistributable regexp package by Henry Spencer is available from ftp.cs.toronto.edu in pub/regexp.shar.Z or in several other archives. The GNU project has a package called rx.

Filename wildcard matching (sometimes called "globbing") is done in a variety of ways on different systems. On Unix, wildcards are automatically expanded by the shell before a process is invoked, so programs rarely have to worry about them explicitly. Under MS-DOS compilers, there is often a special object file which can be linked in to a program to expand wildcards while argv is being built. Several systems (including MS-DOS and VMS) provide system services for listing or opening files specified by wildcards. Check your compiler/library documentation.
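On systems that provide the POSIX <regex.h> interface, the compile/execute pair regcomp/regexec can be used roughly like this. The wrapper name regex_matches is hypothetical, and the sketch assumes POSIX extended regular expression syntax; <regex.h> is not part of Standard C.

```c
#include <regex.h>   /* POSIX, not Standard C */
#include <stddef.h>

/* Hypothetical helper: returns 1 if text matches the extended regular
   expression pattern, 0 if it does not, and -1 if the pattern fails
   to compile. */
int regex_matches(const char *pattern, const char *text)
{
    regex_t re;
    int status;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    status = regexec(&re, text, 0, NULL, 0);   /* 0 means "matched" */
    regfree(&re);
    return status == 0;
}
```

A program that matches many strings against one pattern would compile it once and call regexec() repeatedly, rather than recompiling as this sketch does.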

6. I'm trying to sort an array of strings with qsort(), using strcmp() as the comparison function, but it's not working.
A: By "array of strings" you probably mean "array of pointers to char." The arguments to qsort's comparison function are pointers to the objects being sorted, in this case, pointers to pointers to char. strcmp(), however, accepts simple pointers to char. Therefore, strcmp() can't be used directly. Write an intermediate comparison function like this:
/* compare strings via pointers */
int pstrcmp(const void *p1, const void *p2)
{
    return strcmp(*(char * const *)p1, *(char * const *)p2);
}

The comparison function's arguments are expressed as "generic pointers," const void *. They are converted back to what they "really are" (pointers to pointers to char) and dereferenced, yielding char *'s which can be passed to strcmp().
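For completeness, here is a sketch of how such a comparison function might be passed to qsort(). The wrapper sort_strings is a hypothetical name; pstrcmp is repeated so that the fragment compiles on its own.

```c
#include <stdlib.h>
#include <string.h>

/* compare strings via pointers */
static int pstrcmp(const void *p1, const void *p2)
{
    return strcmp(*(char * const *)p1, *(char * const *)p2);
}

/* Hypothetical wrapper: sort an array of n string pointers in place.
   Only the pointers are rearranged; the strings are not copied. */
void sort_strings(char *strings[], size_t n)
{
    qsort(strings, n, sizeof strings[0], pstrcmp);
}
```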

7. Now I'm trying to sort an array of structures with qsort(). My comparison function takes pointers to structures, but the compiler complains that the function is of the wrong type for qsort(). How can I cast the function pointer to shut off the warning?
A: The conversions must be in the comparison function, which must be declared as accepting "generic pointers" (const void *) as discussed in question 6 above. The comparison function might look like
int mystructcmp(const void *p1, const void *p2)
{
    const struct mystruct *sp1 = p1;
    const struct mystruct *sp2 = p2;
    /* now compare sp1-> ... and sp2-> ... */
}

(The conversions from generic pointers to struct mystruct pointers happen in the initializations sp1 = p1 and sp2 = p2; the compiler performs the conversions implicitly since p1 and p2 are void pointers.)

If, on the other hand, you're sorting pointers to structures, you'll need indirection,
sp1 = *(struct mystruct * const *)p1 .
In general, it is a bad idea to insert casts just to "shut the compiler up." Compiler warnings are usually trying to tell you something, and unless you really know what you're doing, you ignore or muzzle them at your peril.
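A complete version might look like the sketch below. The members of struct mystruct (key and name) are invented purely for illustration, as is the wrapper sort_mystructs.

```c
#include <stdlib.h>

/* Hypothetical structure, for illustration only. */
struct mystruct {
    int key;
    const char *name;
};

int mystructcmp(const void *p1, const void *p2)
{
    const struct mystruct *sp1 = p1;
    const struct mystruct *sp2 = p2;

    /* Compare rather than subtract: sp1->key - sp2->key can overflow. */
    return (sp1->key > sp2->key) - (sp1->key < sp2->key);
}

/* Sort an array of n structures in place, in ascending key order. */
void sort_mystructs(struct mystruct *items, size_t n)
{
    qsort(items, n, sizeof items[0], mystructcmp);
}
```

No cast is needed anywhere: mystructcmp already has exactly the type qsort() expects.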

8. How can I sort a linked list?
A: Sometimes it's easier to keep the list in order as you build it (or perhaps to use a tree instead). Algorithms like insertion sort and merge sort lend themselves ideally to use with linked lists. If you want to use a standard library function, you can allocate a temporary array of pointers, fill it in with pointers to all your list nodes, call qsort(), and finally rebuild the list pointers based on the sorted array.
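As one illustration of the merge sort option, here is a sketch for a singly linked list of integers. The node layout and the names merge and list_sort are assumptions; a real list would compare whatever key its nodes carry.

```c
#include <stddef.h>

/* Hypothetical node type, for illustration only. */
struct node {
    int value;
    struct node *next;
};

/* Merge two sorted lists into one sorted list. */
static struct node *merge(struct node *a, struct node *b)
{
    struct node head;            /* dummy node simplifies appending */
    struct node *tail = &head;

    head.next = NULL;
    while (a != NULL && b != NULL) {
        if (a->value <= b->value) {
            tail->next = a;
            a = a->next;
        } else {
            tail->next = b;
            b = b->next;
        }
        tail = tail->next;
    }
    tail->next = (a != NULL) ? a : b;   /* append whatever is left */
    return head.next;
}

/* Sort a list by relinking its nodes; returns the new head. */
struct node *list_sort(struct node *list)
{
    struct node *slow, *fast, *second;

    if (list == NULL || list->next == NULL)
        return list;             /* 0 or 1 nodes: already sorted */

    /* Find the middle: fast advances two nodes for each one of slow. */
    slow = list;
    fast = list->next;
    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
    }
    second = slow->next;
    slow->next = NULL;           /* cut the list in two */

    return merge(list_sort(list), list_sort(second));
}
```

No nodes are allocated or copied; only the next pointers change, which is why merge sort suits linked lists so well.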

9. How can I sort more data than will fit in memory?
A: You want an "external sort," which you can read about in Knuth, Volume 3. The basic idea is to sort the data in chunks (as much as will fit in memory at one time), write each sorted chunk to a temporary file, and then merge the files. Your operating system may provide a general-purpose sort utility, and if so, you can try invoking it from within your program (for example, with system()).

10. How can I get the current date or time of day in a C program?
A: Just use the time(), ctime(), localtime() and/or strftime() functions. Here is a simple example:
#include <stdio.h>
#include <time.h>

int main(void)
{
    time_t now;
    time(&now);
    /* ctime() returns a 24-character date followed by '\n' */
    printf("It's %.24s.\n", ctime(&now));
    return 0;
}
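When a specific layout is wanted, localtime() and strftime() can be combined, roughly as in this sketch. The helper name format_timestamp and the chosen format string are assumptions.

```c
#include <time.h>

/* Hypothetical helper: format t as local time in the form
   YYYY-MM-DD HH:MM:SS.  Returns the number of characters written
   (not counting the '\0'), or 0 if the buffer is too small or the
   time cannot be converted. */
size_t format_timestamp(time_t t, char *buf, size_t size)
{
    const struct tm *tm = localtime(&t);

    if (tm == NULL)
        return 0;
    return strftime(buf, size, "%Y-%m-%d %H:%M:%S", tm);
}
```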

11. I know that the library function localtime() will convert a time_t into a broken-down struct tm, and that ctime() will convert a time_t to a printable string. How can I perform the inverse operations of converting a struct tm or a string into a time_t?
A: ANSI C specifies a library function, mktime(), which converts a struct tm to a time_t.
Converting a string to a time_t is harder, because of the wide variety of date and time formats which might be encountered. Some systems provide a strptime() function, which is basically the inverse of strftime(). Other popular functions are partime() (widely distributed with the RCS package) and getdate() (and a few others, from the C news distribution).
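The mktime() direction might be sketched like this. The helper make_time and its argument order are hypothetical; the fields are interpreted as local time.

```c
#include <time.h>

/* Hypothetical helper: build a time_t from local calendar fields.
   Returns (time_t)-1 if the time cannot be represented. */
time_t make_time(int year, int month, int day, int hour, int min, int sec)
{
    struct tm tm = {0};

    tm.tm_year = year - 1900;   /* years since 1900 */
    tm.tm_mon = month - 1;      /* months run from 0 to 11 */
    tm.tm_mday = day;
    tm.tm_hour = hour;
    tm.tm_min = min;
    tm.tm_sec = sec;
    tm.tm_isdst = -1;           /* let mktime() work out DST */
    return mktime(&tm);
}
```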

12. How can I add N days to a date? How can I find the difference between two dates?
A: The ANSI/ISO Standard C mktime() and difftime() functions provide some support for both problems. mktime() accepts non-normalized dates, so it is straightforward to take a filled-in struct tm, add or subtract from the tm_mday field, and call mktime() to normalize the year, month, and day fields (and incidentally convert to a time_t value). difftime() computes the difference, in seconds, between two time_t values; mktime() can be used to compute time_t values for two dates to be subtracted.

These solutions are only guaranteed to work correctly for dates in the range which can be represented as time_t's. The tm_mday field is an int, so day offsets of more than 32,736 or so may cause overflow. Note also that at daylight saving time changeovers, local days are not 24 hours long (so don't assume that division by 86400 will be exact).
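Both operations might be sketched as follows. The names add_days and days_between are hypothetical. To sidestep the daylight saving caveat, days_between evaluates both dates at local noon and rounds the quotient, on the assumption that a changeover shifts the clock by much less than half a day.

```c
#include <time.h>

/* Advance *date by ndays (may be negative) and normalize it.
   Returns the corresponding time_t, or (time_t)-1 on failure. */
time_t add_days(struct tm *date, int ndays)
{
    date->tm_mday += ndays;     /* may leave tm_mday out of range */
    date->tm_isdst = -1;        /* let mktime() work out DST */
    return mktime(date);        /* normalizes year, month, and day */
}

/* Whole days from date a to date b (negative if b is earlier).
   Only the year, month, and day fields of the arguments are used. */
long days_between(struct tm a, struct tm b)
{
    double days;

    a.tm_hour = b.tm_hour = 12; /* compare at local noon */
    a.tm_min = b.tm_min = 0;
    a.tm_sec = b.tm_sec = 0;
    a.tm_isdst = b.tm_isdst = -1;
    days = difftime(mktime(&b), mktime(&a)) / 86400.0;
    return (long)(days + (days >= 0 ? 0.5 : -0.5));   /* round */
}
```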

13. Does C have any Year 2000 problems?
A: No, although poorly-written C programs do.

The tm_year field of struct tm holds the value of the year minus 1900; this field will therefore contain the value 100 for the year 2000. Code that uses tm_year correctly (by adding or subtracting 1900 when converting to or from human-readable 4-digit year representations) will have no problems at the turn of the millennium. Any code that uses tm_year incorrectly, however, such as by using it directly as a human-readable 2-digit year, or setting it from a 4-digit year with code like

tm.tm_year = yyyy % 100; /* WRONG */

or printing it as an allegedly human-readable 4-digit year with code like

printf("19%d", tm.tm_year); /* WRONG */
will have grave y2k problems indeed.
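By contrast, correct handling applies the 1900 offset in both directions, as in this small sketch. The helper names set_year and format_year are hypothetical, and snprintf() assumes a C99 library.

```c
#include <stdio.h>
#include <time.h>

/* Store a 4-digit calendar year in a struct tm. */
void set_year(struct tm *tm, int yyyy)
{
    tm->tm_year = yyyy - 1900;
}

/* Write the 4-digit calendar year held in a struct tm into buf. */
int format_year(const struct tm *tm, char *buf, size_t size)
{
    return snprintf(buf, size, "%d", tm->tm_year + 1900);
}
```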

14. How can I generate random numbers?
A: The Standard C library has one: rand(). The implementation on your system may not be perfect, but writing a better one isn't necessarily easy, either.

If you do find yourself needing to implement your own random number generator, there is plenty of literature out there; one well-known paper is Park and Miller, "Random Number Generators: Good Ones are Hard to Find". There are also any number of packages on the net: look for r250, RANLIB, and FSULTRA.

15. How can I get random integers in a certain range?
A: The obvious way,
rand() % N /* POOR */
(which tries to return numbers from 0 to N-1) is poor, because the low-order bits of many random number generators are distressingly *non*-random. A better method is something like

(int)((double)rand() / ((double)RAND_MAX + 1) * N)

If you're worried about using floating point, you could use
rand() / (RAND_MAX / N + 1)

Both methods obviously require knowing RAND_MAX (which ANSI #defines in <stdlib.h>), and assume that N is much less than RAND_MAX.
(Note, by the way, that RAND_MAX is a *constant* telling you what the fixed range of the C library rand() function is. You cannot set RAND_MAX to some other value, and there is no way of requesting that rand() return numbers in some other range.)

If you're starting with a random number generator which returns floating-point values between 0 and 1, all you have to do to get integers from 0 to N-1 is multiply the output of that generator by N.
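Wrapped in a function, the integer-only method looks like the sketch below. The name rand_below is hypothetical. It assumes n is at least 2 (with n equal to 1 the divisor RAND_MAX / n + 1 could overflow where RAND_MAX equals INT_MAX) and, as noted above, much less than RAND_MAX.

```c
#include <stdlib.h>

/* Returns a pseudo-random integer from 0 to n-1, derived from the
   high-order part of rand()'s result.  Assumes 2 <= n and n much
   less than RAND_MAX. */
int rand_below(int n)
{
    return rand() / (RAND_MAX / n + 1);
}
```

For example, rand_below(6) + 1 simulates the roll of a die.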

16. Each time I run my program, I get the same sequence of numbers back from rand(). Why?
A: Unless it is seeded, rand() always starts from the same default state (as if srand(1) had been called), so it produces the same sequence on every run. You can call srand() to seed the pseudo-random number generator with an initial value that differs from run to run. Popular seed values are the time of day, or the elapsed time before the user presses a key (although keypress times are hard to determine portably). (Note also that it's rarely useful to call srand() more than once during a run of a program; in particular, don't try calling srand() before each call to rand(), in an attempt to get "really random" numbers.)
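A sketch of time-of-day seeding follows. Both function names are hypothetical; first_rand_for_seed exists only to show that a given seed always reproduces the same sequence.

```c
#include <stdlib.h>
#include <time.h>

/* Call once, early in the program, so that runs started at different
   times get different sequences. */
void seed_from_clock(void)
{
    srand((unsigned int)time(NULL));
}

/* Returns the first value rand() produces after seeding with seed;
   the same seed always gives the same value. */
int first_rand_for_seed(unsigned int seed)
{
    srand(seed);
    return rand();
}
```

Because time() typically has one-second resolution, two runs started within the same second will still get the same sequence.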

17. I need a random true/false value, so I'm just taking rand() % 2, but it's alternating 0, 1, 0, 1, 0...
A: Poor pseudorandom number generators (such as the ones unfortunately supplied with some systems) are not very random in the low-order bits. Try using the higher-order bits; see question 15.
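One way to take the decision from the high-order end is to apply the division method of question 15 with N equal to 2, as in this sketch (the name rand_bool is hypothetical):

```c
#include <stdlib.h>

/* Returns 0 or 1: 0 when rand() falls in the lower half of its range,
   1 when it falls in the upper half. */
int rand_bool(void)
{
    return rand() / (RAND_MAX / 2 + 1);
}
```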

18. How can I generate random numbers with a normal or Gaussian distribution?
A: Here is one method, recommended by Knuth and due originally to Marsaglia:
#include <stdlib.h>
#include <math.h>

double gaussrand(void)
{
    static double V1, V2, S;
    static int phase = 0;
    double X;

    if(phase == 0) {
        /* pick a point (V1, V2) uniformly inside the unit circle */
        do {
            double U1 = (double)rand() / RAND_MAX;
            double U2 = (double)rand() / RAND_MAX;

            V1 = 2 * U1 - 1;
            V2 = 2 * U2 - 1;
            S = V1 * V1 + V2 * V2;
        } while(S >= 1 || S == 0);

        X = V1 * sqrt(-2 * log(S) / S);
    } else
        /* second value of the pair, from the saved V2 and S */
        X = V2 * sqrt(-2 * log(S) / S);

    phase = 1 - phase;

    return X;
}

See the extended versions of this list for other ideas.

19. I keep getting errors due to library functions being undefined, but I'm #including all the right header files.
A: In general, a header file contains only declarations. In some cases (especially if the functions are nonstandard) obtaining the actual *definitions* may require explicitly asking for the correct libraries to be searched when you link the program. (#including the header doesn't do that.)

20. I'm still getting errors due to library functions being undefined, even though I'm explicitly requesting the right libraries while linking.
A: Many linkers make one pass over the list of object files and libraries you specify, and extract from libraries only those modules which satisfy references which have so far come up as undefined. Therefore, the order in which libraries are listed with respect to object files (and each other) is significant; usually, you want to search the libraries last. (For example, under Unix, put any -l options towards the end of the command line.)

21. What does it mean when the linker says that _end is undefined?
A: That message is a quirk of the old Unix linkers. You get an error about _end being undefined only when other symbols are undefined, too -- fix the others, and the error about _end will disappear.
