Saturday, October 22, 2011

Template metaprogramming

Template metaprogramming is a fairly useless feature of C++, but we can have some fun with it. I've been trying to improve my functional programming skill set, so a friend proposed a problem:

Write a routine that can count bits in an integer at compile time using templates.

Here's the basic recursive solution:

int count_bits(unsigned int x)
{
if(x == 0)
{
return 0;
}
else
{
return (x & 0x1) + count_bits(x >> 1);
}
}



This isn't what I had in mind though. I want to do this using templates. That gets us there:

#include <stdio.h>

template<unsigned long long input>struct numBits
{
static const unsigned long long value = numBits<(input >> 1)>::value+(input&1);
};

template <> struct numBits<0>
{
static const unsigned long long value=0;
};

#define output(num) printf("0x%llX: %u\n", (unsigned long long)num, numBits<(unsigned long long)num>::value)

int main(int argc, const char **argv)
{
output(0x0);
output(0x1);
output(0x2);
output(0x3);
output(0x4);
output(0x11);
output(0x111);
output(0x1000);
output(0x1010);
output(0x1001001);
output(0xFFFFFFFFFFFFFFFEULL);
output(0xFFFFFFFFFFFFFFFFULL);

return 0;
}




Ha! The biggest problem with this is it only works on constants (it IS compile time). It's an academic exercise, as there's not much good reason to use this over the other options. The output looks like:

0x0: 0
0x1: 1
0x2: 1
0x3: 2
0x4: 1
0x11: 2
0x111: 3
0x1000: 1
0x1010: 2
0x1001001: 3
0xFFFFFFFFFFFFFFFE: 63
0xFFFFFFFFFFFFFFFF: 64

Sunday, October 9, 2011

Near space attempt #1




I have been reading about a bunch of the high altitude weather balloon experiments, and I wanted to give it a try. I started with a bunch of websites sort of explaining what they did, and initially decided to copy them. I got myself a Canon A470 and a Motorola i290 prepaid cell phone from Amazon, a sturdy styrofoam box and some kite string from my parents, a 600g weather balloon off Ebay, a rocket parachute from a sketchy online hobby shop and two 60 cubic foot helium tanks from Taylor Rental. The goal was 100K feet (3 times higher than those really small jets up there).




This was collected over a 2.5 month period. The project was "officially" started when I found a different camera and the Motorola cell phone on sale on Amazon on July 24th. The first camera I got couldn't support larger than a 2GB memory card, so I returned it and bough a used Canon A470 instead. I'll put up more info about each of the parts used in separate posts.

We've had a lot of rainy weekends lately, and the weekend of October 8th was the first nice one in the past 3 or 4 weeks. It turns out the the high winds we favorably blowing from north to south, allowing us to launch the balloon from Nashua and recovering it somewhere in the Massachusetts Metro-West area. I wanted it to land in a civilized area, so it we had a chance of getting to it without a huge overland hike and somewhere where there should be cell phone coverage.

Using habhub.org the predicted flight path put it somewhere south of Framingham. This was the weekend.





Fill 'er up

I played phone tag all week with various FAA offices until I at least got someone to at a least acknowledge that I was going to send a balloon up somewhere near Boire Field. I arrived at 8 and scoped out a place to launch it and decided that behind the tennis court would work. A co-worker helped inflate the balloon. The 600g balloon was supposed to be 6ft in diameter fully inflated (113 cubic ft). I used a piece of tubing duct taped inside the balloon nozzle and stuck over the balloon filler. Emptying a single 60 cubic foot helium tank took about 8 or 9 minutes. This is a picture of the balloon after the first tank:



Switching to the second tank was pretty easy. Packing the payload was pretty straightforward:



The green piece is a mylar balloon crumpled up so it (possibly) would show up on radar. I don't know if that worked, but it didn't add much weight. The black box plugged into the phone is a battery extender that I had. I don't think that did anything either, but it did add weight. The phone would typically run for around 10 hours with the GPS tracking software enabled, so the extra battery pack was overkill. The camera is pointing through a hole in the side of the box. I filled the box up the rest of the was with packing peanuts and taped a note onto the top:



Hopefully that will keep the Massachusetts police organizations from over-react-ing.


The Launch

Liftoff was at 9:13 EDT. We toddled down to Framingham to wait for a signal of the parachuting payload coming back to Earth. While eating lunch, we got our tracking signal:





Awesome! It landed in North Easton!?! We were still 40 miles away in Framingham. Touchdown was at 12:13 EDT. We drove over and found the payload intact:



The balloon was shredded and really twisted up in the parachute. I think it would be better to put the ballon farther from the parachute next time, but I'll detail that later.

The raw result:


Flying a balloon to 100K feet

The pictures weren't outstanding, but I'll call it a success. I got a picture of the blackness of space, so I know it got a long way up there, and we were able to successfully do it once.
Some of the highlights:











Tuesday, September 13, 2011

Compile time asserts

C and C++ don't have a built-in version of a compile time assert. Being in both the network protocol business, the embedded system business and distributed system business, I spend a lot of time counting bytes to make sure our data structures are the correct size. I then have to chase down the problems that arise when somebody adds an additional field to a data structure, which causes the compiler to add more padding and things get larger than expected.

GCC 4.3 (but not by default) and VS 10 support static_assert as part of the C++x0 changes, but it's still not supported in C.

I poked around and came up with a gimmicky solution but it works pretty well.


#define ASSERT_LINE_HELP_CONCAT(a, b) a##b
#define ASSERT_LINE_HELP(a, b) ASSERT_LINE_HELP_CONCAT(a, b)
#define CT_ASSERT(expression) \
enum { \
ASSERT_LINE_HELP(COMPILE_TIME_ASSERT_ON_LINE_, __LINE__) \
= 1/((expression)) \
}


It creates an enum called COMPILE_TIME_ASSERT_ON_LINE_### and sets it to 1 on a valid constant expression and 1/0 on an invalid expression. Setting an enum constant to 1/0 will cause it to fail compilation (that's what we want!). The extra indirection is to allow the __LINE__ macro to be expanded to show the line number.

As an example:


#include <stdio.h>

#define ASSERT_LINE_(a, b) a##b
#define ASSERT_LINE(a, b) ASSERT_LINE_(a, b)
#define CT_ASSERT(expression) \
enum { \
ASSERT_LINE(COMPILE_TIME_ASSERT_ON_LINE_, __LINE__) \
= 1/((expression)) \
}


// as defined in the Liunx kernel
struct tcphdr {
unsigned short source;
unsigned short dest;
unsigned int seq;
unsigned int ack_seq;
unsigned short res1:4;
unsigned short doff:4;
unsigned short OOPS:1; // this is the problem
unsigned short fin:1;
unsigned short syn:1;
unsigned short rst:1;
unsigned short psh:1;
unsigned short ack:1;
unsigned short urg:1;
unsigned short res2:2;
unsigned short window;
unsigned short check;
unsigned short urg_ptr;
};

CT_ASSERT(sizeof(struct tcphdr) == 20);

int main()
{
return 0;
}





bandken@six6six:~$ gcc temp.c
temp.c:33: warning: division by zero
temp.c:33: error: enumerator value for ‘COMPILE_TIME_ASSERT_ON_LINE_33’ is not an integer constant
bandken@six6six:~$


This lets us know that our TCP header size isn't 20.

There's a few caveats. It will be confusing to have multiple instances of the CT_ASSERT(...) on the same line. That's pretty simple to remedy, but the bigger problem exists when CT_ASSERT(...) is used in a header file. It won't work if there aren't #include guards. If multiple CT_ASSERTs are included in a header file, they will all resolve to the same line. If that is the case, it might be preferable to use a separate .c file that includes the various asserts that could be put into the header files.

This is a bit kludgy, but it does provide checking that wouldn't normally be available. In cases where data structure size is vital, it provides a compile time check for these cases. Unit tests that checked the sizes could be used as an alternative to the compile time assert, but that requires the unit tests to be a) written and b) run every time. Depending on the environment, that's not guaranteed, where as CT_ASSERT(...) will force the issue to be dealt with before it's committed to the source code repo.

Sunday, September 11, 2011

pure virtual method called


bandken@six6six:~$ ./a.out
pure virtual method called
terminate called without an active exception
Aborted
bandken@six6six:~$


Someone had asked me to explain what this meant. Simply, it means that you called a pure virtual function without having the method defined. With most compilers (at least all the ones that I've used), a dummy function is inserted into the virtual function table entry, and it remains there until a concrete class is constructed and the pure virtual function replaced with the correct function. GCC won't even let you call a pure virtual function in the abstract class's constructor, as this example shows:


class Base
{
public:
Base()
{
Oopsie(); // ERROR: warning: abstract
// virtual ‘virtual void
// Base::Oopsie()’
// called from constructor
}

virtual void Oopsie() = 0;
};

class Derived : public Base
{
public:
virtual void Oopsie() {};
};

int main()
{
Derived derived;
return 0;
}



We're lucky that we actually get an error. There's no reason why the compiler needs to tell us about this. It doesn't take much tweaking to get past this.


class Base
{
public:
Base()
{
init();
}

void init()
{
Oopsie();
}

virtual void Oopsie() = 0;
};

class Derived : public Base
{
public:
virtual void Oopsie() {};
};

int main()
{
Derived derived;
return 0;
}



The constructor order is Base, then Derived. In the constructor for Base, we call init(), which calls Oopsie(). Since Derived hasn't been instantiated yet, we call the pure virtual method and our program crashes.

We're not limited to having this problem only in the constructor. Here you can see the problem happening in the destructor. The destructor order is Derived first, then Base (opposite order of the constructors). By the time the done() method is called, the Derived destructor has already been called, and the derived Oopsie() call no longer exists.


class Base
{
public:
~Base()
{
done();
}

void done()
{
Oopsie();
}

virtual void Oopsie() = 0;
};

class Derived : public Base
{
public:
virtual void Oopsie() {};
};

int main()
{
Derived derived;
return 0;
}



Typically, a good rule of thumb is to be very careful when putting any virtual functions into constructors and destructors. You might even know what you are doing. Maybe. Sure, calling that virtual function from the abstract base class constructor is fine, because....uh, I can't think of ANY examples why this is a good idea. If you really wanted some functionality the was to be provided in a function that happened to be virtual, please do something like this:


class Base
{
public:
Base()
{
init();
}

void init()
{
// ..
// ..
// ..
functionalityThatICannotLiveWithout();
}

void functionalityThatICannotLiveWithout()
{
// ..
}

virtual void pureVirtualMethod()
{
functionalityThatICannotLiveWithout();
}
};

class Derived : public Base
{
public:
virtual void pureVirtualMethod() {};
};

int main()
{
Derived derived;
return 0;
}

Monday, August 29, 2011

Fixing an electric piano...


...or really, my first attempt at time-lapse photography. My friend gave me a Roland MP 300 that had a few broken keys. It is a synthesizer that has actual hammers instead of springs. Two hammers were broken and their remains blocked others from functioning properly, and I broke two more taking it apart. There's 5 different hammers for each 12 tone octave, and I went to all the trouble of looking up the correct part numbers and sizes, but if you order them directly from Roland, you actually need to just give them the note (A#, D, etc.). They couldn't look them up by the part numbers.

To take the images, I used a Canon A470 with CHDK installed. It was a nightmare trying to put the images together. I tried to use iMovie on my iPhone but it kept crashing. Windows Live Movie Maker for 2 different Windows 7 computers could build the entire movie, but would crash when trying to save it. The only one I could get to work was Movie Maker for Windows XP. Even still, it was a pain in the neck. The only Windows XP computer that I have is one that's running on my headless VirtualBox server. Movie Maker doesn't run over Remote Desktop or without any audio hardware, so I had to enable VRDE in VirtualBox and install a "virtual" sound card into the machine. Don't worry if that doesn't make any sense.

The interval is 5 seconds per picture, and there are 1200 images. That's roughly 1.5 hours over two days.



The hardest part was figuring out how to get the keyboard out because there were many brackets and hidden screw and tabs, plus the frame that held the hammers and keys is a big chunk of steel. The keyboard alone probably weighed close to 40 pounds. I only have one leftover screw!



The sound quality on my dad's Yamaha keyboard is better, but it's much more pleasant to play the Roland with the inertial hammers and keys. I can pipe the audio output of the Roland into the Yamaha to get the best of the two, but it does require having two keyboard set up in the basement. When I get good enough to play something recognizable, I'll record and post.

Friday, August 12, 2011

My interview questions

I've always been surprised by how few people ask any technical questions at an interview. There's plenty of posts that are much better than mine on this subject. My favorite is this one: http://steve-yegge.blogspot.com/2006/03/truth-about-interviewing.html

I'm often tasked with evaluating the basic technical competence of an individual. This is commonly known as the "bad cop" around here. I think that's undeserved, as the technical part of the interview is my favorite part. I'd much rather design another object-oriented elevator than discuss what my greatest weakness is, or how I dealt with a problematic co-worker.

Anyway, I wanted to share my interview questions. If you are going to get interviewed by me, it's likely that you'll end up some variants of these. The good candidates get through then in short order so we can get onto more important things. These were modeled after Steve's recommendations from above. These allow for more in depth probing after the initial answer is obtained.

1. Write a C function that counts the number of bits set in an unsigned int. What about signed?
2. Tell me about polymorphism, and why it is useful. How could this be done under the hood?
3. You have a singly linked list of addresses, sorted by the person's name. There's millions of entries. We need to provide the linked list to some other entity at some point so it at least needs to keep the sorted linked list format. What are some ways to speed up the lookup of an entry?
4. We have a directory (say on an SMB share so anybody can access it) that has an arbitrary number of sub-directories. These directories may or may not have sub-directories, but they do have an arbitrary number of text files. These text files contain IP addresses in the dot-decimal format like so:

10.127.13.3
10.127.14.5
10.127.13.16
10.127.13.241

I generated these files incorrectly, and any IP address that has a zero in one of it's fields is printed incorrectly:
10.127..1
10..1.3
10.127.14.
.1.2.3
127...1
...1

should be really printed as
10.127.0.1
10.0.1.3
10.127.14.0
0.1.2.3
127.0.0.1
0.0.0.1

How can I find these bad IP addresses and can I fix the easily?



Thursday, August 11, 2011

Project Euler

I starting using www.projecteuler.net to teach myself Haskell. At one point, I ended up with a problem that was really suited to a MATLAB, and from that point on I started using whichever language I thought would be the easiest. These have been fun, and I haven't worked on them as much during the summer. I've started another blog (maybe a sub-blog, or a meta-blog?) that contains the source code for each of the problems that I've solved. It's at benprojecteuler.blogspot.com.

Monday, August 8, 2011

Interview question from an OS vendor

This question was given to me by a friend that was interviewing with an OS vendor. I couldn't resist a friendly competition, so I took a shot. This was only one hour of work, and it didn't get the exact answer as I didn't handle the word wraps correctly (waistcoat-pocket was counted as waistcoatpocket). Test file and test output.







/*
* Problem
* =======
*
* Write an _efficient_ program that reads an ascii file in, and prints
* out all the words in the file in frequency order (most frequent
* first). Each output line should be of the following form:
*
* > <word> <count>
*
* You have 1 hour.
*
*
* Comments
* ========
*
* An example input file 'alice-in-wonderland.txt' has been provided
* along with 10 lines of example output ('alice-in-wonderland.output').
*
* The problem is deliberately vague; do not ask for clarifications.
* Instead do something sensible.
*
* This will be marked by reading your code, as well as running it on
* some test cases.
*/



#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <ctype.h>

struct element
{
int count;
char *word;
struct element *next;
};


// modified this from http://lernc.blogspot.com/2008/12/bubble-sort-linked-list.html
/* preform a bubble sort on the linked list */
struct element *bubble_sort(struct element *head) {
struct element *a = NULL;
struct element *b = NULL;
struct element *c = NULL;
struct element *e = NULL;
struct element *tmp = NULL;

/*
// the `c' node precedes the `a' and `e' node
// pointing up the node to which the comparisons
// are being made.
*/

while(e != head->next)
{
c = a = head;
b = a->next;
while(a != e)
{
if(a->count < b->count)
{
if(a == head)
{
tmp = b -> next;
b->next = a;
a->next = tmp;
head = b;
c = b;
} else {
tmp = b->next;
b->next = a;
a->next = tmp;
c->next = b;
c = b;
}
} else
{
c = a;
a = a->next;
}
b = a->next;
if(b == e)
e = a;
}
}

return head;
}

struct element *make_new_node(const char *word)
{
struct element *new_node = (struct element *)malloc(sizeof(struct element));

new_node->word = (char *)malloc(strlen(word)+1);
strcpy(new_node->word, word);

new_node->next = NULL;
new_node->count = 0;

return new_node;
}

struct element *initialize_list()
{
struct element *element;
struct element *head;
char c;

element = make_new_node("a");
if(element == NULL)
{
return NULL;
}

head = element;

for(c = 'b'; c<='z'; ++c)
{
char word[2];
sprintf(word, "%c", c);
element->next = make_new_node(word);
if(element->next == NULL)
{
return NULL;
}

element = element->next;
}

return head;
}

int error()
{
printf("out of memory\n");
return -1;
}

void print_list(struct element *list)
{
while(list)
{
printf("%s %d\n", list->word, list->count);
list = list->next;
}
}

void initialize_index(struct element **index, struct element *list)
{
int i = 0;

while(list)
{
index[i] = list;
++i;
list = list->next;
}

if(i != 26)
{
printf("problems\n");
}
}

void clean_up(char* str)
{
char ch;
int i;
int ii = strlen(str);
int write_index = 0;
char word[ii+1];
for (i=0; i <ii;i++)
{
strncpy(&ch,str+i,1);
if(isalpha(ch))
{
if (ch>='A' && ch<='Z')
{
ch = tolower(ch);
}
memcpy(word+write_index,&ch,1);
write_index++;
word[write_index]='\0';
}
}

strcpy(str, word);
}


struct element *get_indexed_element(struct element **index, const char *word)
{
return index[word[0]-'a'];
}

int main()
{
struct element *index[26];
struct element *head = initialize_list();

if(head == NULL)
{
return error();
}

initialize_index(index, head);

// read book and insert elements
char word[256];
FILE *f;

f=fopen("alice.txt","r");
while (!feof(f))
{
fscanf(f,"%s",word);

// make it all lowercase
// remove whitespace and punctuation
clean_up(word);

// add it to the running total
struct element *elm = get_indexed_element(index, word);

while(elm->next && (strcmp(elm->next->word, word) != 1))
{
elm = elm->next;
}

if(strcmp(elm->word, word) == 0)
{
elm->count++;
}
else
{
struct element *new_node = make_new_node(word);
if(new_node == NULL)
{
return error();
}
new_node->next = elm->next;

new_node->count++;
elm->next = new_node;
}

}
// sort list
head = bubble_sort(head);

print_list(head);

// delete memory

while(head)
{
struct element *elm = head;
head=head->next;

free(elm->word);
free(elm);
}

fclose(f);

return 0;
}




Sunday, August 7, 2011

Audio organization through the years




I first decided to move my collection of music from a binder full of CDs to my hard drive around 1998. I had purchased a portable CD player (think Discman) that could play a data CD full of mp3s. The original CD rippers barely let you name things more than Track-01.mp3, but I had a system where it would at least get a directory structure set up correctly containing the genre, album and artist. Having a directory mp3/Ska/ReelBigFish/TurnTheRadioOff/Track*.mp3 (that's the year, right?) was fine for what I was doing. The entire collection was put to disk. I think I even spent the big bucks to buy an 6GB hard drive for that.

At some point I spent the time to go through and modify all of the track names so they had the name of the song. I remember doing this, and it didn't take all that long if you did a couple CDs per day. I was in college, it just took away from my video game time anyway.

In 2000, I worked for Lydstrom, a company that was working in the music business (trying to sell mp3s online). They produced a set-top box that could play their own higher quality format but being an insider, I knew how to turn on mp3-mode. The part that controlled the mp3 playing didn't use the directory structure or file name to view the track name, it used the ID3 tag attached to the end of the file. If I remember correctly, it wouldn't even play the mp3 unless it had the ID3 tag.

I installed one of these set-top boxes in my car using a 12VDC-120VAC power inverter, a small TV and the remnants of a trunk CD changer. I wrote a script that crawled through my directories and added the ID3 tag to all of the tracks based on the directory structure.

Given that CDDB was coming online at this time, I thought about re-ripping my CD collection. I started down that route, but CDDB was not robust. To uniquely identify the CD, it used the length + the number of tracks. CDs, at least the punk and pop CDs that I had, typically have 10-14 tracks and are around 45-60 minutes long. Combined with the number of CDs in existence, there's lots of collisions, so I spent all my time telling the ripping software what CD it was. Plus it was going to take a long time, and now that I was working, I didn't have as much time to do things like spend 4 days straight ripping CDs.

My new system served me well for many years. It was portable, so it could move with me from computer to computer. CD ripping software became more sophisticated, CDDB became better, FreeDB became even better than that and all devices supported the .mp3 format with ID3 tags.

I stopped buying CDs after a while, as I become a curmudgeon about music. I caught myself saying to friends, "Is it me, or is most of the music coming out now really bad?" Yup, I got old. I got one or two CDs a year probably from 2002-2007, and some of those new CDs possibly came from the library.

Most of my players up until 2005ish were various no-name portables or car stereo replacements. In 2005, I got one the original iPod Shuffles when it first came out. This required me to use iTunes for the first time. I didn't like that it managed your library. I'm a curmudgeon, I know what's best, I can do it myself. I tried everything I could to avoid using iTunes, and I eventually settled on a combination of WinAmp and various plugins to load the Shuffle. This allowed me to keep all my music situated in the directory structure and keep it portable.

That lasted me up until 2008, when I got the iPhone. That took over as my audio device of choice. I had to use iTunes at this point, but it was much better on OS X than on Windows. I ran a Linux fileserver that contained all of my mp3s, and iTunes didn't need write access to the directory (unlike older versions of iTunes).

I saw other iPhones playing music and now only did they play music, they played music while showing the album covers! I was jealous. Everybody else has spent years buying music from the iTunes store and iTunes provided the album artwork with the track. 99% of my tracks didn't have any artwork, and purchasing everything again through iTunes wasn't going to happen.


My friend introduced me to a piece of software called TuneUp. It was a plug into iTunes that would scan the files and load them with the correct album artwork. iTunes is already pretty messy and slow, so I didn't want to add an additional layer. I was already using virtual machines for other unpleasant things, and this was not different. I ran Windows XP with iTunes and TuneUp. It took a lot of baby-sitting to get through my entire library, but now I get to listen to music with beautiful artwork:


TuneUp isn't something that I'd want to have running all the time though. It slows down iTunes even more, but running it once was worth it. Since it was in a VM, I just ditched the VM when it was complete.

Nowadays, I've started buying music again. I only use Amazon. They don't have as large a selection as the iTunes store, but it's in DRM-free mp3 format, and it contains the album artwork. Eventually I will probably switch to FLAC or Ogg Vorbis formats for new music, but the conversion is more work than I want to invest in something that I have that is working fine. I'm not committed to the Apple world either, as my audio files are sitting on my fileserver that can be connected to any other piece of software.