Announce: LolRemez 0.1 released
In my previous article about the Remez exchange algorithm I said I was working on a Remez exchange toolkit for everyone to play with. Though it is far from full-featured, I already use it and I believe it is extremely useful. So I decided to release LolRemez 0.1 to the masses.
You can visit the software homepage to download LolRemez and, more importantly, the comprehensive documentation featuring a step-by-step tutorial.
Better function approximations: Taylor vs. Remez
You may have once come across this particular piece of magic:
\[\sin(a) = a - \frac{a^3}{3!} + \frac{a^5}{5!} - \frac{a^7}{7!} + \frac{a^9}{9!} - \dots\]
The right-hand side is the Taylor series of sin around 0. It converges very quickly to the actual value of sin(a). This allows a computer to compute the sine of a number with arbitrary precision.
And when I say it's magic, it's because it is! Some functions, called entire functions, can be computed everywhere using one single formula! Other functions may require a different formula for different intervals; these are the analytic functions, a superset of the entire functions. In general, Taylor series are an extremely powerful tool to compute the value of a given function with very high accuracy, because for several common functions such as sin, tan or exp the terms of the series are easy to compute and, when implemented on a computer, can even be stored in a table at compile time.
Approximating sin with Taylor series
This is how one would approximate sin using 7 terms of its Taylor series on the [-π/2, π/2] interval. The more terms, the better the precision, but we'll stop at 7 for now:
static double taylorsin(double x)
{
    static const double a0 =  1.0,
                        a1 = -1.666666666666666666666666666666e-1, /* -1/3! */
                        a2 =  8.333333333333333333333333333333e-3, /*  1/5! */
                        a3 = -1.984126984126984126984126984126e-4, /* -1/7! */
                        a4 =  2.755731922398589065255731922398e-6, /*  1/9! */
                        a5 = -2.505210838544171877505210838544e-8, /* -1/11! */
                        a6 =  1.605904383682161459939237717015e-10; /* 1/13! */
    double x2 = x * x;
    return x * (a0 + x2 * (a1 + x2 * (a2 + x2 * (a3
             + x2 * (a4 + x2 * (a5 + x2 * a6))))));
}
And you may think…
“Oh wow that is awesome! So simple for such a difficult function. Also, since I read your masterpiece about polynomial evaluation I know how to improve that function so that it is very fast!”
Well, actually, no.
If you are approximating a function over an interval using its Taylor series then either you or the person who taught you is a fucking idiot because a Taylor series approximates a function near a fucking point, not over a fucking interval, and if you don’t understand why it’s important then please read on because that shit is gonna blow your mind.
Error measurement
Let’s have a look at how much error our approximation introduces. The formula for the absolute error is simple:
\[E(x) = \left|\sin(x) - \mathrm{taylorsin}(x)\right|\]
And this is what it looks like over our interval:
You can see that the error skyrockets near the edges of the [-π/2, π/2] interval.
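If you want to reproduce this curve, a quick way is to sample the error at many points over the interval and keep the worst value. Here is a minimal sketch (the sampling density is arbitrary, and it assumes the taylorsin() function above is in scope):

#include <math.h>
#include <stdio.h>

/* Sample |sin(x) - f(x)| over [-pi/2, pi/2] and return the largest
 * value found. This is only an estimate, not a rigorous bound. */
static double max_error(double (*f)(double))
{
    double const pi = 3.141592653589793;
    double worst = 0.0;
    for (int i = 0; i <= 100000; ++i)
    {
        double x = -pi / 2 + i * (pi / 100000);
        double e = fabs(sin(x) - f(x));
        if (e > worst)
            worst = e;
    }
    return worst;
}

int main(void)
{
    printf("max error: %g\n", max_error(taylorsin));
    return 0;
}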
“Well the usual way to fix this is to split the interval in two or more parts, and use a different Taylor series for each interval.”
Oh, really? Well, let’s see the error on [-π/4, π/4] instead:
I see no difference! The error is indeed smaller, but again, it becomes extremely large at the edges of the interval. And before you start suggesting reducing the interval even more, here is the error on [-π/8, π/8] now:
I hope this makes it clear that:
- the further from the centre of the interval, the larger the error
- the error distribution is very unbalanced
- the maximum error on [-π/2, π/2] is about 6.63e-10
And now I am going to show you why that maximum error value is pathetic.
A better approximation
Consider this new function:
static double minimaxsin(double x)
{
    static const double a0 =  1.0,
                        a1 = -1.666666666640169148537065260055e-1,
                        a2 =  8.333333316490113523036717102793e-3,
                        a3 = -1.984126600659171392655484413285e-4,
                        a4 =  2.755690114917374804474016589137e-6,
                        a5 = -2.502845227292692953118686710787e-8,
                        a6 =  1.538730635926417598443354215485e-10;
    double x2 = x * x;
    return x * (a0 + x2 * (a1 + x2 * (a2 + x2 * (a3
             + x2 * (a4 + x2 * (a5 + x2 * a6))))));
}
It doesn't look very different, right? Right. The values a0 to a6 are slightly different, but the rest of the code is strictly the same.
Yet what a difference it makes! Look at this error curve:
That new function makes it obvious that:
- the error distribution is better spread over the interval
- the maximum error on [-π/2, π/2] is about 4.96e-14
Check that last figure again. The new maximum error isn’t 10% better, or maybe twice as good. It is more than ten thousand times smaller!!
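You can check the figure with the same max_error() sketch as above; only the function under test changes:

printf("max error: %g\n", max_error(minimaxsin)); /* reuse the earlier sampling sketch */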
The minimax polynomial
The above coefficients describe a minimax polynomial: that is, the polynomial that minimises a given error when approximating a given function. I will not go into the mathematical details, but just remember this: if the function is sufficiently well-suited (as sin, tan, exp etc. are), then the minimax polynomial can be found.
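In other words, "minimises a given error" here means minimising the worst-case absolute error over the whole interval: among all polynomials p of degree at most n, the minimax polynomial p* is the one achieving

\[\max_{x \in [a,b]} \left| f(x) - p^*(x) \right| \;=\; \min_{\deg p \le n} \; \max_{x \in [a,b]} \left| f(x) - p(x) \right|\]

This is also why its error curve looks so well balanced: the optimal error reaches its maximum magnitude several times across the interval, with alternating signs, instead of exploding at the edges.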
The problem? It's hard to find. The most popular algorithm to find it is the Remez exchange algorithm, and few people really seem to understand how it works (or there would be a lot fewer Taylor series around). I am not going to explain it right now. Usually you need professional maths tools such as Maple or Mathematica if you want to compute a minimax polynomial. The Boost library is a notable exception, though.
But you saw the results, so stop using Taylor series. Spending some time finding the minimax polynomial is definitely worth it. This is why I am working on a Remez framework that I will make public and free for everyone to use, modify and do what the fuck they want with. In the meantime, if you have functions to numerically approximate, or Taylor-based implementations that you would like to improve, let me know in the comments! These will make great use cases for me.
C/C++ trick: static lookup table generation
There are two major ways of using lookup tables (LUTs) in C/C++ code:
- build them at runtime,
- embed them in the code.
One major advantage of runtime initialisation is the choice between static initialisation (at program startup) and lazy initialisation (on demand) to save memory. Also, the generating code can be complex, or use information that is only available at runtime.
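As an illustration of the runtime approach, here is a minimal sketch of lazy initialisation (the table contents and names are made up for the example, and it is not thread-safe as written):

static unsigned short SquareTable[256];
static int SquareTableReady = 0;

static unsigned short GetSquare(unsigned char i)
{
    if (!SquareTableReady) /* build the table on first use */
    {
        for (int n = 0; n < 256; n++)
            SquareTable[n] = (unsigned short)(n * n);
        SquareTableReady = 1;
    }
    return SquareTable[i];
}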
In the case of an embedded table, the generation cost is only at compile time, which can be very useful. Also, the compiler may take advantage of its early knowledge of the table contents to optimise code. However, quite often the content of embedded tables is abstruse and hardly useful to someone viewing the code. Usually this is due to the use of an external program for generation, sometimes in a completely different language. But the generation can also often be done using the C/C++ preprocessor.
Practical example
Consider the bit interleaving routine at Sean Eron Anderson's Bit Twiddling Hacks page (which, by the way, I recommend you read and bookmark). It uses the following LUT (shortened for brevity):
static const unsigned short MortonTable256[256] =
{
    0x0000, 0x0001, 0x0004, 0x0005, 0x0010, 0x0011, 0x0014, 0x0015,
    0x0040, 0x0041, 0x0044, 0x0045, 0x0050, 0x0051, 0x0054, 0x0055,
    0x0100, 0x0101, 0x0104, 0x0105, 0x0110, 0x0111, 0x0114, 0x0115,
    /* ... 32 lines in total ... */
    0x5540, 0x5541, 0x5544, 0x5545, 0x5550, 0x5551, 0x5554, 0x5555
};
The MortonTable256 table has, as its name suggests, 256 elements. It was pregenerated by some external piece of code which probably looked like this:
for (int i = 0; i < 256; i++)
    MortonTable256[i] = (i & 1) | ((i & 2) << 1)
                      | ((i & 4) << 2) | ((i & 8) << 3);
The problem with that external piece of code is that it is external. You cannot write it in this form and have it fill the table at compile time.
If you only keep the output of that code, the information on how the table was created is lost. This makes it impractical to build another table that has, for instance, all values shifted one bit left. Even if such a table were created using a modified version of the above code, switching between the two tables would be a hassle unless both versions were kept behind preprocessor conditionals.
Preprocessor iterator
Here is one way to get the best of both worlds. First, declare the following iterator macros. They can be declared somewhere in a global .h, maybe with more descriptive names:
#define S4(i)    S1((i)),   S1((i)+1),     S1((i)+2),     S1((i)+3)
#define S16(i)   S4((i)),   S4((i)+4),     S4((i)+8),     S4((i)+12)
#define S64(i)   S16((i)),  S16((i)+16),   S16((i)+32),   S16((i)+48)
#define S256(i)  S64((i)),  S64((i)+64),   S64((i)+128),  S64((i)+192)
#define S1024(i) S256((i)), S256((i)+256), S256((i)+512), S256((i)+768)
Their purpose is simple: calling e.g. S16(i) will expand to S1(i), S1(i+1), …, S1(i+15). Similarly, S256(i) will call S1 with values from i to i + 255.
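To see the mechanism at work, try them with a trivial S1; for instance:

#define S1(i) (i)
static const int Small[16] = { S16(0) }; /* expands to 0, 1, 2, ..., 15 */
#undef S1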
And this is how to use them in our example:
static const unsigned short MortonTable256[256] =
{
#define S1(i) ((i & 1) | ((i & 2) << 1) | ((i & 4) << 2) | ((i & 8) << 3))
    S256(0)
#undef S1
};
That's it! The table will be built at compile time, and you get to keep the logic behind it.
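And since the generating logic now lives next to the table, typical usage and variants stay readable. For instance, the bit interleaving itself, and the "all values shifted one bit left" variant mentioned earlier, could look like this (the variant's name is made up):

/* Interleave the bits of two 8-bit values: x goes to the even bit
 * positions, y to the odd ones, as on the Bit Twiddling Hacks page. */
static unsigned short Interleave8(unsigned char x, unsigned char y)
{
    return MortonTable256[y] << 1 | MortonTable256[x];
}

/* The same table with every value shifted one bit left, obtained by
 * simply tweaking S1: */
static const unsigned short MortonTable256Shifted[256] =
{
#define S1(i) (((i & 1) | ((i & 2) << 1) | ((i & 4) << 2) | ((i & 8) << 3)) << 1)
    S256(0)
#undef S1
};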
A more complex example
Jeroen van der Zijp's fast half float conversions paper describes table-based methods to convert between 16-bit and 32-bit floating point values. The construction of one of the LUTs is as follows:
void generatetables()
{
    for (unsigned int i = 0; i < 256; ++i)
    {
        int e = i - 127;
        if (e < -24) { // Very small numbers map to zero
            basetable[i | 0x000] = 0x0000;
            basetable[i | 0x100] = 0x8000;
        }
        else if (e < -14) { // Small numbers map to denorms
            basetable[i | 0x000] = (0x0400 >> (-e - 14));
            basetable[i | 0x100] = (0x0400 >> (-e - 14)) | 0x8000;
        }
        else if (e <= 15) { // Normal numbers just lose precision
            basetable[i | 0x000] = ((e + 15) << 10);
            basetable[i | 0x100] = ((e + 15) << 10) | 0x8000;
        }
        else if (e < 128) { // Large numbers map to Infinity
            basetable[i | 0x000] = 0x7C00;
            basetable[i | 0x100] = 0xFC00;
        }
        else { // Infinity and NaN's stay Infinity and NaN's
            basetable[i | 0x000] = 0x7C00;
            basetable[i | 0x100] = 0xFC00;
        }
    }
}
And this is the compile-time version:
static uint16_t const basetable[512] =
{
#define S1(i) (((i) < 103) ? 0x0000 : \
               ((i) < 113) ? 0x0400 >> (0x1f & (113 - (i))) : \
               ((i) < 143) ? ((i) - 112) << 10 : 0x7c00)
    S256(0),
#undef S1
#define S1(i) (0x8000 | basetable[i])
    S256(0),
#undef S1
};
In this case the macro code is slightly bigger and was slightly rewritten, but is no more complicated than the original code. Note also the elegant reuse of previous values in the second half of the table.
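For completeness, this is roughly how such a table gets used: the float's sign and exponent bits select an entry. The sketch below only handles that part; the full method in the paper also processes the mantissa using a second, similarly generated table.

#include <stdint.h>
#include <string.h>

static uint16_t FloatToHalfRough(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);            /* reinterpret the float's bits */
    return basetable[(u >> 23) & 0x1ff]; /* index = sign bit + 8 exponent bits */
}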
This trick is certainly not new, but since I have found practical uses for it, I thought you may find it useful, too.
Understanding basic motion calculations in games: Euler vs. Verlet
During the past month, I have found myself in the position of having to explain the contents of this article to six different people, either at work or over the Internet. Though there are a lot of articles on the subject, it seems as if almost everyone still gets it wrong. I was still polishing this article when I had the opportunity to explain it a seventh time.
And two days ago a coworker told me the source code of a certain framework disagreed with me… The kind of framework that probably has three NDAs preventing me from even thinking about it.
Well, that framework got it wrong, too. So now I'm mad at the entire world for no rational reason other than the ever-recurring realisation of the amount of wrong out there, and this article is but a catharsis to deal with my uncontrollable rage.
A simple example
Imagine a particle with position Pos and velocity Vel, affected by acceleration Accel. Let's say for the moment that the acceleration is constant. This is the case when only gravity is present.
A typical game engine loop will update the position with regard to a timestep (often the duration of a frame) using the following method, known as Euler integration:
void Particle::Update(float dt)
{
    Accel = vec3(0, 0, -9.81); /* Constant acceleration: gravity */
    Vel = Vel + Accel * dt;    /* New, timestep-corrected velocity */
    Pos = Pos + Vel * dt;      /* New, timestep-corrected position */
}
This comes directly from the definition of acceleration:
\[a(t) = \frac{\mathrm{d}}{\mathrm{d}t}v(t)\]
\[v(t) = \frac{\mathrm{d}}{\mathrm{d}t}p(t)\]
Putting these two differential equations into Euler integration gives us the above code.
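In discrete form, with a timestep dt, this is what the code above computes:

\[v_{n+1} = v_n + a\,\mathrm{d}t\]
\[p_{n+1} = p_n + v_{n+1}\,\mathrm{d}t\]

Note that the position update uses the freshly updated velocity, because Vel is modified before Pos.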
Measuring accuracy
Typical particle trajectories would look a bit like this:
These are three runs of the above simulation with the same initial values.
- once with maximum accuracy,
- once at 60 frames per second,
- once at 30 frames per second.
You can notice the slight inaccuracy in the trajectories.
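To put rough numbers on it, here is a quick back-of-the-envelope sketch that integrates a simple vertical throw (initial upward velocity 10, gravity -9.81) with the Euler update above at 60 and 30 frames per second, and compares the height after one second with the closed-form value:

#include <stdio.h>

/* Integrate dv/dt = a, dp/dt = v with the Euler update above for one
 * second, starting from p = 0 and v = 10 with a = -9.81, then compare
 * the result with the closed-form value p0 + v0*t + a*t*t/2. */
static double euler_height(int steps)
{
    double p = 0.0, v = 10.0, a = -9.81, dt = 1.0 / steps;
    for (int i = 0; i < steps; ++i)
    {
        v += a * dt; /* same order as the Update() method above */
        p += v * dt;
    }
    return p;
}

int main(void)
{
    double exact = 10.0 - 0.5 * 9.81; /* p0 + v0*t + a*t*t/2 at t = 1 */
    printf("exact: %g  60 fps: %g  30 fps: %g\n",
           exact, euler_height(60), euler_height(30));
    return 0;
}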
You may think…
“Oh, it could be worse; it’s just the expected inaccuracy with different framerate values.”
Well, no.
Just… no.
If you are updating positions this way and you do not have a really good reason for doing so then either you or the person who taught you is a fucking idiot and should not have been allowed to write so-called physics code in the first place and I most certainly hope to humbly bestow enlightenment upon you in the form of a massive cluebat and don’t you dare stop reading this sentence before I’m finished.
Why this is wrong
When doing kinematics, computing position from acceleration is an integration process. First you integrate acceleration with respect to time to get velocity, then you integrate velocity to get position.
\[v(t) = \int_0^t a(t)\,\mathrm{d}t\]
\[p(t) = \int_0^t v(t)\,\mathrm{d}t\]
The integral of a function can be seen as the area below its curve. So, how do you properly get the integral of our velocity between t and t + dt, i.e. the green area below?
It's not by doing new_velocity * dt (left image).
It's not by doing old_velocity * dt either (middle image).
It's obviously by doing (old_velocity + new_velocity) * 0.5 * dt (right image).
And now for the correct code
This is what the update method should look like. It’s called Velocity Verlet integration (not strictly the same as Verlet integration, but with a similar error order) and it always gives the perfect, exact position of the particle in the case of constant acceleration, even with the nastiest framerate you can think of. Even at two frames per second.
void Particle::Update(float dt)
{
    Accel = vec3(0, 0, -9.81);
    vec3 OldVel = Vel;
    Vel = Vel + Accel * dt;
    Pos = Pos + (OldVel + Vel) * 0.5 * dt;
}
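To see why this is exact when the acceleration is constant, expand the position update: with \(v_{n+1} = v_n + a\,\mathrm{d}t\), the average-velocity term becomes

\[\frac{v_n + v_{n+1}}{2}\,\mathrm{d}t = v_n\,\mathrm{d}t + \frac{1}{2}a\,\mathrm{d}t^2,\]

which is exactly the displacement given by the closed-form solution \(p(t + \mathrm{d}t) = p(t) + v(t)\,\mathrm{d}t + \frac{1}{2}a\,\mathrm{d}t^2\).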
And the resulting trajectories at different framerates:
Further reading
“Oh wow thank you. But what if acceleration is not constant, like in real life?”
Well I am glad you asked.
Euler integration and Verlet integration are part of a family of iterative methods known as the Runge-Kutta methods, respectively of first order and second order. There are many more for you to discover and study.
- Richard Lord did this nice and instructive animated presentation about several integration methods.
- Glenn Fiedler also explains in this article why idiots use Euler, and provides a nice introduction to RK4 together with source code.
- Florian Boesch wrote a thorough comparison of various integration methods for the specific application of gravitation (one of the rare cases where Euler actually seems to perform better).
In practice, Verlet will still only give you an approximation of your particle’s position. But it will almost always be a much better approximation than Euler. If you need even more accuracy, look at the fourth-order Runge-Kutta (RK4) method. Your physics will suck a lot less, I guarantee it.
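For reference, here is a minimal sketch of one RK4 step for the system dp/dt = v, dv/dt = a(p, v, t), assuming the vec3 type from the code above (with the usual vector operators) and a user-provided Accel(p, v, t) function:

vec3 Accel(vec3 const &p, vec3 const &v, float t); /* user-provided acceleration */

void RK4Step(vec3 &Pos, vec3 &Vel, float t, float dt)
{
    float h = dt * 0.5f;

    /* Four velocity/acceleration samples across the timestep */
    vec3 v1 = Vel;
    vec3 a1 = Accel(Pos, v1, t);
    vec3 v2 = Vel + a1 * h;
    vec3 a2 = Accel(Pos + v1 * h, v2, t + h);
    vec3 v3 = Vel + a2 * h;
    vec3 a3 = Accel(Pos + v2 * h, v3, t + h);
    vec3 v4 = Vel + a3 * dt;
    vec3 a4 = Accel(Pos + v3 * dt, v4, t + dt);

    /* Weighted average of the four samples */
    Pos = Pos + (v1 + (v2 + v3) * 2.f + v4) * (dt / 6.f);
    Vel = Vel + (a1 + (a2 + a3) * 2.f + a4) * (dt / 6.f);
}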
Acknowledgements
I would like to thank everyone cited in this article, explicitly or implicitly, as well as the commenters below who spotted mistakes and provided corrections or improvements.