Overview and macro building blocks
Generic programming in C is usually seen as painful, crufty, and verbose in comparison with languages such as C++ or Java. Many C programmers opt instead for the non-generic approach of implementing polymorphism with void *
, but this can carry a performance penalty, since the lack of precise type information at compile-time inhibits optimizations, and necessitates calling through function pointers (or worse, use of library calls like memcpy()
and memcmp()
).
One of the interesting features of the new C11 Standard is type-generic expressions, implemented using the _Generic
keyword (for a nice introduction, see http://www.robertgamble.net/2012/01/c11-generic-selections.html). By itself, this feature may not seem to add all that much convenience to the language, but combining it with some simple macro constructs makes it possible to write generic and fully type-safe data structures and functions, entirely in straight C. Adding a new type becomes as simple as creating a new entry in a few lists – still not quite as convenient as in more modern languages, but close enough for C work.
Ideally, a generic idiom should provide for:
- Type-independent programming and code reuse: data structures or functions are defined once and used across many types,
- Type checking: attempts to mix generic objects or functions of the wrong type should yield compile-time errors,
- No run-time overhead: defining separate implementations for each type allows the compiler to generate optimal code for each use case, and
- Simple, readable syntax.
In this post we’ll explore a simple implementation of generics in C11 that provides all four of the above (with some room for programming language connoisseurs to quibble over the last bullet point). All the source code and profiling output cited below is available on Bitbucket. The implementation rests on the concepts of “typed struct” and “typed function,” defined by the macros:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
// "generic_defs.h" | |
#define TYPED_STRUCT(NAME, T) struct NAME##_##T | |
#define TYPED_FN(NAME, T, FN_NAME) NAME##_##T##_##FN_NAME |
Using the ##
(concatenation) preprocessor operator, these macros simply append a type name to a given struct or function, separated by underscores. For example, TYPED_STRUCT(list, int)
expands to struct list_int
, while TYPED_FN(list, int, add)
becomes list_int_add
. These are just convenience definitions for use in header files where we’ll declare the typed variants for the structs or functions we’re creating.
A generic bounds-checked array
Consider the problem of implementing a generic, bounds-checked, fixed-size array of values — call it a chkd_array
. We’ll define this array’s implementation details once, and reuse that definition for any arbitrary type where it may be needed. If clients attempt to read or write outside the array’s bounds, rather than using error codes (or segfaulting or scribbling some memory somewhere), we wish to return a “garbage” error value which is specified at the time the chkd_array
is created. To demonstrate how everything works with different C data types, we’ll implement the chkd_array
for int
s, float
s, and a custom struct (called struct strct
and consisting of a single int
and float
).
The following macros declare the chkd_array
and some basic operations on it.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
// "chkd_array_factory_head.h" | |
// Guards to ensure we don't reuse an old list of types | |
#ifdef CHKD_ARRAY_TYPES | |
#error "CHKD_ARRAY_TYPES defined before included header" | |
#endif | |
#ifdef CHKD_ARRAY_PRIMITIVE_TYPES | |
#error "CHKD_ARRAY_PRIMITIVE_TYPES defined before included header" | |
#endif | |
#include "generic_defs.h" | |
// Expansion macro for checked array struct with typename T_NM | |
#define CHKD_ARRAY(T_NM) TYPED_STRUCT(chkd_array, T_NM) | |
// Expansion macros for functions on chkd_arrays with typename T_NM | |
#define CREATE(T_NM) TYPED_FN(chkd_array, T_NM, create) | |
#define IS_EQUAL(T_NM) TYPED_FN(chkd_array, T_NM, is_equal) | |
#define SET(T_NM) TYPED_FN(chkd_array, T_NM, set) | |
#define GET(T_NM) TYPED_FN(chkd_array, T_NM, get) | |
#define DESC(T_NM) TYPED_FN(chkd_array, T_NM, desc) | |
// Declaration of the checked array typed struct and functions. | |
#define CHKD_ARRAY_DEC(T, T_NM, DESC_LIT) \ | |
\ | |
CHKD_ARRAY(T_NM) { \ | |
T *data; \ | |
unsigned length; \ | |
T err_val; \ | |
}; \ | |
static CHKD_ARRAY(T_NM) * CREATE(T_NM)(unsigned length, T *err_val); \ | |
static inline bool IS_EQUAL(T_NM)(T *first, T *second); \ | |
static inline T SET(T_NM)(CHKD_ARRAY(T_NM) * self, unsigned idx, T *val); \ | |
static inline T *GET(T_NM)(CHKD_ARRAY(T_NM) * self, unsigned idx); \ | |
static inline char *DESC(T_NM)(); | |
// Used later to declare the checked array for ALL_TYPES. | |
#define DECLARE(ALL_TYPES) ALL_TYPES(CHKD_ARRAY_DEC) |
With these macros available, the statement
CHKD_ARRAY_DEC(int, int, "chkd_array_int")
would declare a struct chkd_array_int
, along with function signatures like
chkd_array_int_set(struct chkd_array_int *self, unsigned idx, int *val);
A DESC_LIT
string literal, unused in this particular macro, is included in its arguments so that we can return a descriptive name from the DESC
typed function. We’ll use the DECLARE
macro in a subsequent header file to generate code for a given list of types.
Next, we need the macros to define the implementations for these functions. At the point where this next header file is #include
d, the client already should have specified the types for which it needs chkd_array
definitions. For brevity’s sake I’ve included only the GET
and DESC
definitions here.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
// "chkd_array_factory_defs.h" — first part | |
// Guards to ensure lists of types for which to create defs are defined | |
#ifndef CHKD_ARRAY_TYPES | |
#error "CHKD_ARRAY_TYPES not defined for defs" | |
#endif | |
#ifndef CHKD_ARRAY_PRIMITIVE_TYPES | |
#error "CHKD_ARRAY_PRIMITIVE_TYPES not defined for defs" | |
#endif | |
DECLARE(CHKD_ARRAY_TYPES) // See XX factory macro discussion below | |
// Definition of implementations for the generic functions. | |
#define CHKD_ARRAY_DEF(T, T_NM, DESC_LIT) \ | |
\ | |
static CHKD_ARRAY(T_NM) * CREATE(T_NM)(unsigned length, T *err_val) \ | |
{ /* implementation omitted */ } \ | |
\ | |
static inline T SET(T_NM)(CHKD_ARRAY(T_NM) *self, unsigned idx, T *val) \ | |
{ /* implementation omitted */ } \ | |
\ | |
static inline T *GET(T_NM)(CHKD_ARRAY(T_NM) * self, unsigned idx) \ | |
{ \ | |
if (idx >= self->length) { \ | |
return &self->err_val; \ | |
} \ | |
\ | |
return &self->data[idx]; \ | |
} \ | |
\ | |
static inline char *DESC(T_NM)() \ | |
{ \ | |
return (DESC_LIT); \ | |
} | |
// Definition of IS_EQUAL impl for types using == operator | |
#define IS_EQUAL_DEF(T, T_NM, DESC_LIT) \ | |
\ | |
static inline bool IS_EQUAL(T_NM)(T *first, T *second) \ | |
{ \ | |
return *first == *second; \ | |
} |
Note that we can’t define IS_EQUAL
in the “master” CHKD_ARRAY_DEF
macro, because the method body must differ depending on whether T
is a simple data type, or a struct
(for which the ==
operator is not defined). The definition is split out into a separate macro to at least reuse the code across primitive types.
Using these declaration and definition macros, we could set up our chkd_array
s by writing out CHKD_ARRAY_DEC
and CHKD_ARRAY_DEF
for each type, but there’s a more concise (and less error-prone) way. With XX factory macros, we can create a single list of types and automatically expand the macros for each list entry. The header for the chkd_array
client code defines the struct strct
type and the XX
lists of needed types.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
// "chkd_array_client.h" | |
#include "chkd_array_factory_head.h" | |
// Add client's new type named "strct" before defining generic functions | |
struct strct { | |
int a; | |
float b; | |
}; | |
// List primitive types with == operator | |
#define CHKD_ARRAY_PRIMITIVE_TYPES(XX) \ | |
XX(int, int, "chkd_array_int") \ | |
XX(float, float, "chkd_array_float") | |
// List all types for which client needs chkd_array defined | |
#define CHKD_ARRAY_TYPES(XX) \ | |
CHKD_ARRAY_PRIMITIVE_TYPES(XX) \ | |
XX(struct strct, struct_strct, "chkd_array_struct_strct") | |
// Create generic function names for chkd_array * functions | |
#define GENERIC_ARRAY_PTR_FN_DEFINE(CHKD_ARRAY_PTR, FN_NAME) \ | |
_Generic((CHKD_ARRAY_PTR), \ | |
struct chkd_array_int *: (chkd_array_int ## _ ## FN_NAME), \ | |
struct chkd_array_float *: (chkd_array_float ## _ ## FN_NAME), \ | |
struct chkd_array_struct_strct *: (chkd_array_struct_strct ## _ ## FN_NAME)) | |
// Declare and define the generic val ptr and chkd_array ptr functions | |
#include "chkd_array_factory_defs.h" |
Toward the end of this file, we finally make use of _Generic
. Before that point, with all of the macros we had defined so far, we still couldn’t work with our new array in a truly generic way, because the function names all have a type spliced into them — chkd_array_int_get
, chkd_array_float_get
, etc. What we want is to call a single get
function and leave it up to the compiler to figure out which function body to use based on the types being passed in. This is what _Generic
gives us. Since the get
function takes a pointer to a chkd_array
of a given type, the GENERIC_ARRAY_PTR_FN_DEFINE
macro can take those pointers along with a function name and select the appropriate typed function to use.
The rest of the chkd_array_factory_defs.h
file uses the XX
lists to define most of the function implementations, and creates a CHKD_ARRAY_GET
macro which provides the type-independent get
. It’s a simple matter to extend this example to the other chkd_array
functions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
// "chkd_array_factory_defs.h" — second part | |
// Define the chkd_array functions for the passed-in types | |
#define DEFINE(TYPES) TYPES(CHKD_ARRAY_DEF) | |
DEFINE(CHKD_ARRAY_TYPES) | |
// Define the == is_equal function for passed-in primitive types | |
#define DEF_PRIMITIVES(TYPES) TYPES(IS_EQUAL_DEF) | |
DEF_PRIMITIVES(CHKD_ARRAY_PRIMITIVE_TYPES) | |
// An example definition of generic get function | |
#define CHKD_ARRAY_GET(CHKD_ARRAY_PTR, idx) \ | |
GENERIC_ARRAY_PTR_FN_DEFINE((CHKD_ARRAY_PTR), get)((CHKD_ARRAY_PTR), idx) |
The final piece of the puzzle is the definition of is_equal
for the local struct strct
type, which it makes sense to place in the client’s .c
file:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
// "generic_chkd_array_client.c" | |
#include "chkd_array_client.h" | |
// Since == could not be used in the def of is_equal() for struct strct, define it here | |
static inline bool chkd_array_struct_strct_is_equal(struct strct *first, struct strct *second) | |
{ | |
return first->a == second->a && first->b == second->b; | |
} | |
// All chkd_arrays and generic functions now available in this file …. |
Now, if int_array
is a pointer to a struct chkd_array_int
, the statement
int *i_ptr = CHKD_ARRAY_GET(int_array, 3);
expands to a call to chkd_array_int_get
automatically. See the repo for a lot more examples of how this is implemented. Once all these macros have been set up, I find the syntax for using them almost downright pleasant; for instance, here’s how new chkd_array
s are created:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
#include <limits.h> | |
// … | |
int err_int = INT_MIN; | |
float err_float = FLT_MIN; | |
struct strct err_strct = { .a=err_int, .b=err_float}; | |
// Set up int, float, and strct type arrays | |
CHKD_ARRAY(int) *int_array = CHKD_ARRAY_CREATE(LENGTH, &err_int); | |
CHKD_ARRAY(float) *float_array = CHKD_ARRAY_CREATE(LENGTH, &err_float); | |
CHKD_ARRAY(struct_strct) *strct_array = CHKD_ARRAY_CREATE(LENGTH, &err_strct); |
These macros serve as good building blocks for generic functions in whichever files they are used. (In the repo, have a look at the WRITE_VALS_DEF macro in chkd_array_demo.c for an example.) And as a final boon, they are also type-safe — attempting to get a float
pointer from a chkd_array_int
generates the following error in clang
:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
chkd_array_demo.c:268:11: error: initializing 'float' with an expression of | |
incompatible type 'int *' | |
float *f_ptr = CHKD_ARRAY_GET(int_array, 2); | |
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
1 error generated. |
Performance gains
Okay. So, why bother with all this effort? Setting up a generic data structure with this sort of preprocessor hackery is admittedly verbose and not a little tedious to write, and we could use void *
(and a bit of added care not to mix up our types) to get the same result from our programs.
As it turns out, avoiding the use of void *
and the associated function pointers (which, incidentally, also require more duplication of code) can reduce the number of instructions executed in our programs significantly. To see this effect in action, I created a void *
implementation which mirrors chkd_array
and each of its functions, and made an identically ordered series of calls, randomized by type, to both implementations. Performance profiles were generated using Valgrind’s callgrind
utility. Here’s how the two variants stacked up:
Here’s the Valgrind annotated output for the -O3
runs. The generic portion comprised the instructions counted in make_generic_calls
, and half those in chkd_array_strct_is_equal
(which is also used by the void *
version), while the remainder were executed during the void *
run.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
——————————————————————————– | |
Profile data file 'callgrind.out.22884' (creator: callgrind-3.8.1) | |
——————————————————————————– | |
… | |
——————————————————————————– | |
Ir | |
——————————————————————————– | |
7,316,523,785 PROGRAM TOTALS | |
——————————————————————————– | |
Ir file:function | |
——————————————————————————– | |
2,478,169,035 ???:make_generic_calls [/home/abissell/dev/c11-generics/src/a.out] | |
2,051,740,027 ???:make_vpa_calls [/home/abissell/dev/c11-generics/src/a.out] | |
546,000,000 ???:set_strct [/home/abissell/dev/c11-generics/src/a.out] | |
448,000,000 ???:set_float [/home/abissell/dev/c11-generics/src/a.out] | |
406,000,000 ???:set_int [/home/abissell/dev/c11-generics/src/a.out] | |
351,000,000 ???:chkd_array_strct_is_equal [/home/abissell/dev/c11-generics/src/a.out] | |
240,000,000 ???:is_equal_float [/home/abissell/dev/c11-generics/src/a.out] | |
234,000,000 ???:get_strct [/home/abissell/dev/c11-generics/src/a.out] | |
192,000,000 ???:get_float [/home/abissell/dev/c11-generics/src/a.out] | |
174,000,000 ???:get_int [/home/abissell/dev/c11-generics/src/a.out] | |
174,000,000 ???:is_equal_int [/home/abissell/dev/c11-generics/src/a.out] |
With -O3
optimizations enabled, the generic functions gave us a healthy reduction in execution time, finishing in 64% of the time required for their void *
counterparts, while executing only 57% as many CPU instructions. The Valgrind output also hints that the compiler is able to inline the generic calls to the get
, set
, and is_equal
methods, likely the source of most of the reduction in instruction count. Optimizations like these allow C++’s std::sort
, a generic template implementation, to outperform naive library calls to C’s qsort
.
Some drawbacks
While the syntax made possible with these generic macros is quite nice once they’re defined, it’s tedious to create header declarations and definitions for entire data structures or algorithms within preprocessor #define
statements. Every line except the last must be terminated with a backslash, and it’s impossible — at least, as far as I’m aware — to add comments to the source code within the #define
. Editors and IDEs offer much less in the way of syntax highlighting, and the clang-format.py
tool, which I like to use to auto-format my C code in Vim, is also a little shaky when faced with macros using _Generic
.
In addition to those issues, which are mostly unique to C, these generic techniques carry all the same disadvantages of C++ templates. Code is duplicated for each generic function in each translation unit where it’s #include
d, which can lead to long compile times and bloated executables. Debugging, at both compile-time and run-time, is made much more difficult by the need to trace a generated line of code back to the macro where it was expanded. This only gets harder as more levels of generic macros are nested within each other.
Nevertheless, if you’re looking for a way to add type safety or squeeze a little performance out of your C program, have access to C11’s _Generic
or equivalent compiler extensions, and are willing to pay the cost in increased compile time and program size, this approach is worth considering.
Compiling with _Generic
clang has supported _Generic
since its 3.0 release. GCC 4.9, due (hopefully) to be released in the first half of this year, will support it as well. In the meantime, those committed to compiling with GCC can still emulate the C11 feature using the P99_GENERIC macro.
After finishing up this post, I also found this great short intro to many of the same techniques: http://cecilsunkure.blogspot.com/2012/08/generic-programming-in-c.html
this is a great post. the performance implications of using void* casting vs _Generic are explained very well. the first article I have seen to do so.
it would be nice if the standard adopts typeof() or even decltype() to solve these problems. but_Generic is better then nothing.
a note about gcc. a good number the c99 and c11 features come from gnu-c extentions. gnu-c has typeof() and a host of other extensions. if you want to use typeof() in c code you can do so with gcc.
I used a similar technique once to implement a generic vector in C99 but never found how _Generic could improve the implementation. It used yet another layer of macros and indirection, plus the equivalent of a vtable to make the whole thing work. It also used variadic macro dispatch to overload functions on the number of parameters. Care to have a look? From what I can see we could both benefit from that 🙂
http://codereview.stackexchange.com/q/43809/15094
I’m trying to create a _Generic function to read bits in different byte and bit orders, sounds like the perfect application right?
What I’m struggling with tho, is that the deciding variable isn’t the type of object, but an enum variable saying if it should read as big endian or little endian, how the hell do I do this?
I have an enum called ByteOrders that’s defined as
extern enum ByteOrders {
BigEndian = 0, // Big Endian
LittleEndian = 1, // Little Endian
} ByteOrders;
#define ReadBits(ByteOrder, BitB, Bits2Read) \
_Generic( (ByteOrder), BigEndian:ReadBitsBigEndian, LittleEndian:ReadBitsLittleEndian)(BitB, Bits2Read)
The only obvious problem is that I discard the ByteOrder variable from the macro at the end for the actual function definition (which is precisely what I want, that way there’s no branching, and so the code becomes more easily inlinable)
Unfortunately I’m no C expert — this article was really the only deep dive into C macros I’ve ever done. The only idea I have is to make each byte order a separate enum, so that they have different types which _Generic can disambiguate.
You might try asking Jens Gustedt, he wrote the P99_GENERIC macro I linked to in the article.