Maths trick: doing fewer comparisons
Note: this is not an optimisation. It is just one more tool you should have in your toolbox when looking for optimisations. It may be useful.
This is the trick:
![\[\min(x,y) = \dfrac{x + y - |x - y|}{2}\]
\[\max(x,y) = \dfrac{x + y + |x - y|}{2}\]](http://lolengine.net/tracmath/84cce2f2fc6089f719a166eea4d7900c0018fe45.png)
You can check for yourself that it is always true: when x > y, |x - y| is the same as x - y, etc.
What good is it for? There is often an implicit comparison in min
or max
. It might be interesting to replace it with a call to the branchless fabs
.
Example usage
Consider the following code:
float a, b, c, d; /* ... */ return (a > b) && (c > d);
That kind of code is often used eg. in collision checks, where a lot of tests can be done. This code does two comparisons. On some architectures, this means two branches. Not always something you want.
The test condition is equivalent to:
(a - b > 0) && (c - d > 0)
Now when are two given numbers both positive? That is if and only if the smallest is positive:
min(a - b, c - d) > 0
We may now use our trick:
(a - b) + (c - d) - |(a - b) - (c + d)| > 0
And so the code could be rewritten as such:
float a, b, c, d; /* ... */ return (a - b) + (c - d) > fabsf((a - b) - (c - d));
We basically replaced the additional test with a call to fabsf
and some additions/subtractions. It may be possible to reorganise the input data so that this second version performs better.
C++ trick: selectively restrict implicit conversions
TL;DR: given a class Foo
with an implicit constructor from int
, how to allow the implicit conversion in f(42);
but not in g(42);
where both f
and g
take a Foo const &
argument?
Background
So I have this real
class that performs numeric operations that I want use just like any other C++ numeric type. For instance, I can write the following:
float f = 15, g = 3.5; int x = f / g;
If I decide that I need double precision, I can write:
double f = 15, g = 3.5; int x = f / g;
And of course, using my real
class for even higher precision:
real f = 15, g = 3.5; int x = f / g;
I like that. I can just write code as usual, and when I need higher precision, I use real
instead of double
. It's transparent and convenient.
Implementation example
Here is a highly simplified example of a real
class:
struct real { inline real(double d) : m_value(d) {} inline operator int() const { return (int)m_value; } /* ... */ long double m_value; };
It is possible to write real f = 15
because of the implicit constructor. Actually, C++ constructors are always implicit unless specified otherwise.
It is possible to write int x = f / g
because of the conversion operator.
So far, so good.
The problem with implicit promotion
Here is how fabs
could be implemented:
real fabs(real const &r) { return real(r.m_value < 0 ? -r.m_value : r.m_value); }
But now we have a problem. A subtle problem. Consider the following code:
double x = fabs(-5.0);
What does this do? Well, it depends. It depends whether <cmath>
was included or not. Because if <cmath>
wasn’t included, then that code is going to automatically promote -5.0 to a real
and call our custom function instead of the one provided by the math library! With no compile-time warning!
This is confusing. It should not happen. But it is a well known problem and there are several obvious workarounds:
- What most professional C++ programmers will tell you: use namespaces
- Mark the
real(int)
constructorexplicit
The problem with 1. is that I am not a professional C++ programmer. I am a C programmer who uses C++. I use preprocessor macros and printf
and memalign
and goto
. Try and stop me!
The problem with 2. is that I can no longer write real f = 15
, I would need real f(15)
or real f = real(15)
instead. This is not acceptable, I want real
to behave exactly like float
and others, to the fullest extent of what the language allows.
Another solution
Fortunately, the C++ standard has a solution for us: “Implicit conversions will be performed [...] if the parameter type contains no template-parameters that participate in template argument deduction” (ISO/IEC 14882:1998, section 14.8.1.4). You cannot have implicit conversion and template argument deduction at the same time.
It means we just have to make fabs
a template function! Which means making real
a template class, too.
A quick way to fix real
would be:
/* N is unused */ template<int N> struct real_base { inline real_base(double d) : m_value(d) {} inline operator int() const { return (int)m_value; } /* ... */ long double m_value; }; typedef real_base<0> real;
The template argument is useless, unfortunately. It will just have to be here, forever. But who knows, you might find a use for it one day.
And to fix fabs
:
/* A generic template declaration is needed */ template<int N> real_base<N> fabs(real_base<N> const &r); /* Here we just add template<> to the previous version */ template<> real fabs(real const &r) { return real(r.m_value < 0 ? -r.m_value : r.m_value); }
So, what happens with double x = fabs(-5.0);
when we forget to include <cmath>
now? Well, here is what GCC says:
In function ‘int main()’: error: no matching function for call to ‘fabs(double)’ note: candidate is: note: template<int N> real_base<N> fabs(const real_base<N>&)
It seems we’ve successfully managed to avoid the problematic implicit conversion, yet still allow it in places where it was useful!
So what is the rule? It’s simple: where implicit conversion should not be allowed, make the function a specialised template function.