Simple C++ optimizations - Part 1
an example of low-level optimization




The following snippet is quite common:
float f,g,x;
...
if (x>=0.0) f += g;

If the value of x is highly unpredictable, the resulting branch could be quite expensive.

Let's write a function
that returns 1.0 if x is positive and 0.0 if it is negative:
inline float ispositive(float x) {
return (x>=0.0?1.0:0.0);
}
Now we'll write insted
f += g * ispositive(x);
Of course we just shuffled the problem around (actually worsened, efficiency-wise) but we can remove the branch using the common memory-manipulation approach.

inline float ispositive(float x) {
union {
float f;
int i;
} v;
v.f = x;
//return (x>=0.0?1.0:0.0);
}
Now v.i will hold the 32 bits of the float x. If you are unfamiliar with the IEEE floating point representation, this page is an excellent source of information. For our purpose we just need to know that the sign is stored in most significant bit (pun, albeit non-phonetical, intended), and the floating point representations of the two possible output values: 0.0f is 0...00 (all zeroes), while 1.0f is 001111110...00, or 0x3f800000 in hexadecimal. We must somehow transform our sign bit into the correct output.
So what are we going to do with our 32 bits in v.i? Let's start by discarding what is useless:
v.i &= 0x80000000;
This will bitwise-AND the bits with 100...00, effectively retaining only the sign bit. But if the number is negative (sign bit=1) we want to output 0.0, (all zeroes) so we should reverse the sign bit:
v.i = ~v.i;
v.i &= 0x80000000;
Now the first AND is useless and can be discarded. The following step is to extend the sign bit to the whole dword:
v.i = v.i >> 31;
The right shift works as intended because v.i is a signed int, so for the sign to be preserved it is propagated to the less significant bits. You can also see that the second AND is now useless too.
At this point, if x is positive (sign bit 0) we have 1...11 and 0...00 otherwise. A most convenient pattern, since
v.i &= 0x3f800000;
will yeld the desired 1.0 when x is positive leaving the 0s untouched in the other case.

This is the function so far - the right shift can be limited to 8 since the last 23 bits are going to be zeroed anyway:
inline float ispositive(float x) {
union {
float f;
int i;
} v;
v.f = x;
//v.i = ~v.i;
//v.i = v.i >> 8;
//v.i &=
0x3f800000;
v.i = (int(~v.i) >> 8) &
0x3f800000; //compact form
return v.f;
}

Or, getting rid of the union (I suspect this could let the compiler do some marginal optimizations with complex expressions):
inline float ispositive(float x) {
int i = (int(~*(reinterpret_cast<int*>(&x))) >> 8) & 0x3f800000;
return *(reinterpret_cast<float*>(&i));
}
The saga continues in part 2 with an example of high-level, and definitely wiser, optimization.