Tuesday, September 13, 2011

Compile time asserts

C and C++ don't have a built-in compile-time assert. Working across network protocols, embedded systems, and distributed systems, I spend a lot of time counting bytes to make sure our data structures are the correct size. I then have to chase down the problems that arise when somebody adds a field to a data structure, the compiler inserts more padding, and the structure ends up larger than expected.

GCC 4.3 (with -std=c++0x, which is not the default) and VS 10 support static_assert as part of the C++0x changes, but it's still not supported in C.

I poked around and came up with a solution that is gimmicky but works pretty well.


#define ASSERT_LINE_HELP_CONCAT(a, b) a##b
#define ASSERT_LINE_HELP(a, b) ASSERT_LINE_HELP_CONCAT(a, b)
#define CT_ASSERT(expression) \
    enum { \
        ASSERT_LINE_HELP(COMPILE_TIME_ASSERT_ON_LINE_, __LINE__) \
            = 1/((expression)) \
    }


It creates an enum constant called COMPILE_TIME_ASSERT_ON_LINE_###, where ### is the line number. When the expression is true the constant is 1/1, which is 1; when it is false the constant is 1/0, which is not a valid integer constant, so compilation fails (that's what we want!). The extra level of macro indirection is needed so that __LINE__ is expanded to the line number before ## pastes the two tokens together.

As an example:


#include <stdio.h>

#define ASSERT_LINE_(a, b) a##b
#define ASSERT_LINE(a, b) ASSERT_LINE_(a, b)
#define CT_ASSERT(expression) \
    enum { \
        ASSERT_LINE(COMPILE_TIME_ASSERT_ON_LINE_, __LINE__) \
            = 1/((expression)) \
    }


// as defined in the Linux kernel
struct tcphdr {
    unsigned short source;
    unsigned short dest;
    unsigned int seq;
    unsigned int ack_seq;
    unsigned short res1:4;
    unsigned short doff:4;
    unsigned short OOPS:1; // this is the problem
    unsigned short fin:1;
    unsigned short syn:1;
    unsigned short rst:1;
    unsigned short psh:1;
    unsigned short ack:1;
    unsigned short urg:1;
    unsigned short res2:2;
    unsigned short window;
    unsigned short check;
    unsigned short urg_ptr;
};

CT_ASSERT(sizeof(struct tcphdr) == 20);

int main()
{
    return 0;
}





bandken@six6six:~$ gcc temp.c
temp.c:33: warning: division by zero
temp.c:33: error: enumerator value for ‘COMPILE_TIME_ASSERT_ON_LINE_33’ is not an integer constant
bandken@six6six:~$


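This lets us know that our TCP header size isn't 20: the extra one-bit OOPS field brings the bit-fields to 17 bits, which no longer fit in a single 16-bit unsigned short.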

There are a few caveats. Two instances of CT_ASSERT(...) on the same line generate the same enumerator name, so the second one fails to compile as a redefinition. That's pretty simple to remedy, but the bigger problem arises when CT_ASSERT(...) is used in header files. It won't work if there aren't #include guards, because a second inclusion redefines the same enumerators. And because __LINE__ only reflects the line number within the current file, CT_ASSERTs in different headers (or in a header and the source file that includes it) collide whenever they happen to land on the same line number. If that is the case, it might be preferable to collect the asserts in a separate .c file rather than putting them in the header files.

This is a bit kludgy, but it does provide checking that wouldn't normally be available. Where data structure size is vital, it gives you a compile-time check. Unit tests that check the sizes could be used as an alternative, but that requires the unit tests to be a) written and b) run every time. Depending on the environment, that's not guaranteed, whereas CT_ASSERT(...) forces the issue to be dealt with before the code is committed to the source code repo.

Sunday, September 11, 2011

pure virtual method called


bandken@six6six:~$ ./a.out
pure virtual method called
terminate called without an active exception
Aborted
bandken@six6six:~$


Someone asked me to explain what this meant. Simply put, it means that a pure virtual function was called through the virtual function table at a point where no derived class had supplied an implementation. With most compilers (at least all the ones that I've used), a dummy function is inserted into the virtual function table entry, and it remains there until a concrete class is constructed and the entry is replaced with the correct function. GCC won't even let a direct call to a pure virtual function in the abstract class's constructor go unnoticed, as this example shows:


class Base
{
public:
    Base()
    {
        Oopsie(); // ERROR: warning: abstract
                  // virtual ‘virtual void
                  // Base::Oopsie()’
                  // called from constructor
    }

    virtual void Oopsie() = 0;
};

class Derived : public Base
{
public:
    virtual void Oopsie() {};
};

int main()
{
    Derived derived;
    return 0;
}



We're lucky that we actually get an error. There's no reason why the compiler needs to tell us about this. It doesn't take much tweaking to get past this.


class Base
{
public:
    Base()
    {
        init();
    }

    void init()
    {
        Oopsie();
    }

    virtual void Oopsie() = 0;
};

class Derived : public Base
{
public:
    virtual void Oopsie() {};
};

int main()
{
    Derived derived;
    return 0;
}



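The constructor order is Base, then Derived. In the constructor for Base, we call init(), which calls Oopsie() through the virtual function table. Since the Derived part of the object hasn't been constructed yet, the table still holds the dummy entry for the pure virtual method, and our program crashes with the message above.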

We're not limited to having this problem only in the constructor. Here you can see the problem happening in the destructor. The destructor order is Derived first, then Base (the opposite of the constructor order). By the time done() is called, the Derived destructor has already run, and Derived's Oopsie() override is no longer reachable through the virtual function table.


class Base
{
public:
    ~Base()
    {
        done();
    }

    void done()
    {
        Oopsie();
    }

    virtual void Oopsie() = 0;
};

class Derived : public Base
{
public:
    virtual void Oopsie() {};
};

int main()
{
    Derived derived;
    return 0;
}



Typically, a good rule of thumb is to be very careful about calling any virtual functions from constructors and destructors. You might even know what you are doing. Maybe. Sure, calling that virtual function from the abstract base class constructor is fine, because... uh, I can't think of ANY examples of why this is a good idea. If you really want some functionality that happens to live in a virtual function, please move it into a non-virtual function and call that instead, like this:


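class Base
{
public:
    Base()
    {
        init();
    }

    void init()
    {
        // ..
        // ..
        // ..
        // Safe: a non-virtual call behaves the same during construction.
        functionalityThatICannotLiveWithout();
    }

    // Non-virtual helper that holds the shared functionality.
    void functionalityThatICannotLiveWithout()
    {
        // ..
    }

    // Despite the name, this is no longer pure: it has a default body
    // that forwards to the non-virtual helper.
    virtual void pureVirtualMethod()
    {
        functionalityThatICannotLiveWithout();
    }
};

class Derived : public Base
{
public:
    virtual void pureVirtualMethod() {};
};

int main()
{
    Derived derived;
    return 0;
}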