Last week, I wrote a function that runs a statement for each value of an enum.
To ensure this function is updated when a new value is added to the enum,
I wrote a simple for loop over the values of the enum that calls a helper function containing a giant switch statement on the enum values.
I assumed the compiler was smart enough to detect that the loop has a fixed length, unroll it, and then use constant propagation to reduce the switch statement to a single statement in each iteration. It turned out that gcc doesn’t do this, and my patch caused a 15% performance regression on one Dromaeo test (dom-attr).
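
To make the pattern concrete, here is a minimal sketch of a loop over enum values that calls a function with a switch. It is not the actual patch: the enum, the Counters struct, and the function names are all hypothetical. One reason this shape can help keep the function up to date is that a switch without a default case lets gcc and clang warn (with -Wswitch) when a newly added enumerator is not handled.

```cpp
// Hypothetical enum for illustration; kNumAttributes marks the end of the range.
enum Attribute { kWidth, kHeight, kColor, kNumAttributes };

// Hypothetical state that the per-value statements modify.
struct Counters {
    int width = 0;
    int height = 0;
    int color = 0;
};

// One statement per enum value. There is deliberately no default case, so a
// compiler warning such as -Wswitch can flag an unhandled enumerator.
inline void applyOne(Attribute attr, Counters& c) {
    switch (attr) {
    case kWidth:         c.width += 1;  break;
    case kHeight:        c.height += 2; break;
    case kColor:         c.color += 3;  break;
    case kNumAttributes: break;  // sentinel, nothing to do
    }
}

// The loop has a fixed trip count, but the switch operand is a runtime value
// unless the compiler unrolls the loop and propagates the constants.
void applyAll(Counters& c) {
    for (int i = 0; i < kNumAttributes; ++i)
        applyOne(static_cast<Attribute>(i), c);
}
```

Calling applyAll on a fresh Counters runs each case exactly once. Whether the switch survives in the generated code depends on the compiler and its optimization settings.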
My fix: use template metaprogramming to unroll the loop and inline the heck out of the function.
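
A minimal sketch of what such a template-based unroll could look like, assuming C++11. The enum, the Counters struct, and the names applyOne, Unroll, and applyAllUnrolled are hypothetical and not taken from the actual patch. The idea is to pass the enum value as a template argument, so that each instantiation switches on a compile-time constant, and to replace the loop with a recursive template that stops at a specialization.

```cpp
// Hypothetical enum for illustration; kNumAttributes marks the end of the range.
enum Attribute { kWidth, kHeight, kColor, kNumAttributes };

// Hypothetical state that the per-value statements modify.
struct Counters {
    int width = 0;
    int height = 0;
    int color = 0;
};

// The enum value is now a template parameter, so in every instantiation the
// switch operand is a constant known at compile time.
template <Attribute A>
inline void applyOne(Counters& c) {
    switch (A) {
    case kWidth:         c.width += 1;  break;
    case kHeight:        c.height += 2; break;
    case kColor:         c.color += 3;  break;
    case kNumAttributes: break;  // sentinel, nothing to do
    }
}

// Recursive template standing in for the loop: handles value I, then I + 1.
template <int I>
struct Unroll {
    static inline void run(Counters& c) {
        applyOne<static_cast<Attribute>(I)>(c);
        Unroll<I + 1>::run(c);
    }
};

// Specialization that ends the recursion once every value has been visited.
template <>
struct Unroll<kNumAttributes> {
    static inline void run(Counters&) {}
};

// Expands at compile time into one applyOne call per enum value.
void applyAllUnrolled(Counters& c) {
    Unroll<0>::run(c);
}
```

Adding a new enumerator before kNumAttributes extends the recursion automatically, and the switch without a default case can still trigger a warning if the new value is not handled. The inline keyword is only a hint, so whether the calls are actually inlined remains up to the compiler.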