Overview and macro building blocks
Generic programming in C is usually seen as painful, crufty, and verbose in comparison with languages such as C++ or Java. Many C programmers opt instead for the non-generic approach of implementing polymorphism with
void *, but this can carry a performance penalty, since the lack of precise type information at compile-time inhibits optimizations, and necessitates calling through function pointers (or worse, use of library calls like
One of the interesting features of the new C11 Standard is type-generic expressions, implemented using the
_Generic keyword (for a nice introduction, see http://www.robertgamble.net/2012/01/c11-generic-selections.html). By itself, this feature may not seem to add all that much convenience to the language, but combining it with some simple macro constructs makes it possible to write generic and fully type-safe data structures and functions, entirely in straight C. Adding a new type becomes as simple as creating a new entry in a few lists – still not quite as convenient as in more modern languages, but close enough for C work.
Ideally, a generic idiom should provide for:
- Type-independent programming and code reuse: data structures or functions are defined once and used across many types,
- Type checking: attempts to mix generic objects or functions of the wrong type should yield compile-time errors,
- No run-time overhead: defining separate implementations for each type allows the compiler to generate optimal code for each use case, and
- Simple, readable syntax.
In this post we’ll explore a simple implementation of generics in C11 that provides all four of the above (with some room for programming language connoisseurs to quibble over the last bullet point). All the source code and profiling output cited below is available on Bitbucket. The implementation rests on the concepts of “typed struct” and “typed function,” defined by the macros:
## (concatenation) preprocessor operator, these macros simply append a type name to a given struct or function, separated by underscores. For example,
TYPED_STRUCT(list, int) expands to
struct list_int, while
TYPED_FN(list, int, add) becomes
list_int_add. These are just convenience definitions for use in header files where we’ll declare the typed variants for the structs or functions we’re creating.
A generic bounds-checked array
Consider the problem of implementing a generic, bounds-checked, fixed-size array of values — call it a
chkd_array. We’ll define this array’s implementation details once, and reuse that definition for any arbitrary type where it may be needed. If clients attempt to read or write outside the array’s bounds, rather than using error codes (or segfaulting or scribbling some memory somewhere), we wish to return a “garbage” error value which is specified at the time the
chkd_array is created. To demonstrate how everything works with different C data types, we’ll implement the
floats, and a custom struct (called
struct strct and consisting of a single
The following macros declare the
chkd_array and some basic operations on it.
With these macros available, the statement
CHKD_ARRAY_DEC(int, int, "chkd_array_int")
would declare a
struct chkd_array_int, along with function signatures like
chkd_array_int_set(struct chkd_array_int *self, unsigned idx, int *val);
DESC_LIT string literal, unused in this particular macro, is included in its arguments so that we can return a descriptive name from the
DESC typed function. We’ll use the
DECLARE macro in a subsequent header file to generate code for a given list of types.
Next, we need the macros to define the implementations for these functions. At the point where this next header file is
#included, the client already should have specified the types for which it needs
chkd_array definitions. For brevity’s sake I’ve included only the
DESC definitions here.
Note that we can’t define
IS_EQUAL in the “master”
CHKD_ARRAY_DEF macro, because the method body must differ depending on whether
T is a simple data type, or a
struct (for which the
== operator is not defined). The definition is split out into a separate macro to at least reuse the code across primitive types.
Using these declaration and definition macros, we could set up our
chkd_arrays by writing out
CHKD_ARRAY_DEF for each type, but there’s a more concise (and less error-prone) way. With XX factory macros, we can create a single list of types and automatically expand the macros for each list entry. The header for the
chkd_array client code defines the
struct strct type and the
XX lists of needed types.
Toward the end of this file, we finally make use of
_Generic. Before that point, with all of the macros we had defined so far, we still couldn’t work with our new array in a truly generic way, because the function names all have a type spliced into them —
chkd_array_float_get, etc. What we want is to call a single
get function and leave it up to the compiler to figure out which function body to use based on the types being passed in. This is what
_Generic gives us. Since the
get function takes a pointer to a
chkd_array of a given type, the
GENERIC_ARRAY_PTR_FN_DEFINE macro can take those pointers along with a function name and select the appropriate typed function to use.
The rest of the
chkd_array_factory_defs.h file uses the
XX lists to define most of the function implementations, and creates a
CHKD_ARRAY_GET macro which provides the type-independent
get. It’s a simple matter to extend this example to the other
The final piece of the puzzle is the definition of
is_equal for the local
struct strct type, which it makes sense to place in the client’s
int_array is a pointer to a
struct chkd_array_int, the statement
int *i_ptr = CHKD_ARRAY_GET(int_array, 3);
expands to a call to
chkd_array_int_get automatically. See the repo for a lot more examples of how this is implemented. Once all these macros have been set up, I find the syntax for using them almost downright pleasant; for instance, here’s how new
chkd_arrays are created:
These macros serve as good building blocks for generic functions in whichever files they are used. (In the repo, have a look at the WRITE_VALS_DEF macro in chkd_array_demo.c for an example.) And as a final boon, they are also type-safe — attempting to get a
float pointer from a
chkd_array_int generates the following error in
Okay. So, why bother with all this effort? Setting up a generic data structure with this sort of preprocessor hackery is admittedly verbose and not a little tedious to write, and we could use
void * (and a bit of added care not to mix up our types) to get the same result from our programs.
As it turns out, avoiding the use of
void * and the associated function pointers (which, incidentally, also require more duplication of code) can reduce the number of instructions executed in our programs significantly. To see this effect in action, I created a
void * implementation which mirrors
chkd_array and each of its functions, and made an identically ordered series of calls, randomized by type, to both implementations. Performance profiles were generated using Valgrind’s
callgrind utility. Here’s how the two variants stacked up:
Here’s the Valgrind annotated output for the
-O3 runs. The generic portion comprised the instructions counted in
make_generic_calls, and half those in
chkd_array_strct_is_equal (which is also used by the
void * version), while the remainder were executed during the
void * run.
-O3 optimizations enabled, the generic functions gave us a healthy reduction in execution time, finishing in 64% of the time required for their
void * counterparts, while executing only 57% as many CPU instructions. The Valgrind output also hints that the compiler is able to inline the generic calls to the
is_equal methods, likely the source of most of the reduction in instruction count. Optimizations like these allow C++’s
std::sort, a generic template implementation, to outperform naive library calls to C’s
While the syntax made possible with these generic macros is quite nice once they’re defined, it’s tedious to create header declarations and definitions for entire data structures or algorithms within preprocessor
#define statements. Every line except the last must be terminated with a backslash, and it’s impossible — at least, as far as I’m aware — to add comments to the source code within the
#define. Editors and IDEs offer much less in the way of syntax highlighting, and the
clang-format.py tool, which I like to use to auto-format my C code in Vim, is also a little shaky when faced with macros using
In addition to those issues, which are mostly unique to C, these generic techniques carry all the same disadvantages of C++ templates. Code is duplicated for each generic function in each translation unit where it’s
#included, which can lead to long compile times and bloated executables. Debugging, at both compile-time and run-time, is made much more difficult by the need to trace a generated line of code back to the macro where it was expanded. This only gets harder as more levels of generic macros are nested within each other.
Nevertheless, if you’re looking for a way to add type safety or squeeze a little performance out of your C program, have access to C11’s
_Generic or equivalent compiler extensions, and are willing to pay the cost in increased compile time and program size, this approach is worth considering.
clang has supported
_Generic since its 3.0 release. GCC 4.9, due (hopefully) to be released in the first half of this year, will support it as well. In the meantime, those committed to compiling with GCC can still emulate the C11 feature using the P99_GENERIC macro.