Lol Engine - Blog
https://lolengine.net/blog
The Lol Engine Blogen-USTrac 1.2.2Lol Engine/chrome/site/logo.png
https://lolengine.net/blog
Damping with delta-timesamSun, 03 May 2015 00:24:03 GMT
https://lolengine.net/blog/2015/05/03/damping-with-delta-time
https://lolengine.net/blog/2015/05/03/damping-with-delta-time<p>
Just a quick tip on how to convert usual damping code to something framerate-independent.
</p>
<p>
Most of us have probably, at some point, written code resembling this:
</p>
<div class="wiki-code"><div class="code"><pre>// Perform velocity damping
velocity -= velocity * 0.01f;
</pre></div></div><p>
… or probably the more correct:
</p>
<div class="wiki-code"><div class="code"><pre>// Per-second damping coefficient
float const D = 10.0f;
// Damp velocity according to timestep
velocity -= velocity * D * delta_time;
</pre></div></div><p>
Yet this is not fully framerate-independent; results are slightly different at 30fps and 60fps, and more importantly, spikes in the framerate cause lots of weird artifacts, causing developers to attempt to fix the situation by clamping <code>delta_time</code>, which is not ideal.
</p>
<h2 id="Theexponentiationmethod">The exponentiation method</h2>
<p>
Here is one way to fix it: assume that the code works correctly at 60 fps. This means that each frame, <code>velocity</code> is effectively multiplied by <code>1 - D / 60</code>.
</p>
<p>
After one second, <em>i.e.</em> 60 frames, <code>velocity</code> has been multiplied by <code>(1 - D / 60) ^ 60</code>.
</p>
<p>
After two seconds, it has been multiplied by <code>(1 - D / 60) ^ (60 * 2)</code>.
</p>
<p>
After <code>N</code> seconds, it has been multiplied by <code>(1 - D / 60) ^ (60 * N)</code>.
</p>
<p>
So, there, we have a formula that tells us what happens after <code>N</code> seconds, and it’s a continuous function. We can therefore choose <code>N</code> as we like, and especially <code>N = delta_time</code>:
</p>
<div class="wiki-code"><div class="code"><pre>// Per-second damping coefficient
float const D = 10.0f;
// Damp velocity (framerate-independent)
velocity *= pow(1.f - D / 60.f, 60.f * delta_time);
</pre></div></div><p>
Which can be conveniently rewritten as:
</p>
<div class="wiki-code"><div class="code"><pre>// Per-second damping coefficient
float const D = 10.0f;
// Exponentiation base for velocity damping
float const D2 = pow(1.f - D / 60.f, 60.f);
// Damp velocity (framerate-independent)
velocity *= pow(D2, delta_time);
</pre></div></div><h2 id="Usewithlerp">Use with lerp</h2>
<p>
The same method can be adapted to uses of linear interpolation such as this one:
</p>
<div class="wiki-code"><div class="code"><pre>// Perform velocity damping
velocity = lerp(velocity, target_velocity, K * delta_time);
</pre></div></div><p>
Which we replace with:
</p>
<div class="wiki-code"><div class="code"><pre>// Damp velocity (framerate-independent)
velocity = lerp(velocity, target_velocity,
1.f - pow(1.f - K / 60.f, 60.f * delta_time));
</pre></div></div>mathstipcodephysicsQuaternion from two vectors: the final versionsamMon, 24 Feb 2014 14:52:13 GMT
https://lolengine.net/blog/2014/02/24/quaternion-from-two-vectors-final
https://lolengine.net/blog/2014/02/24/quaternion-from-two-vectors-final<p>
I have published <a href="https://lolengine.net/blog/2013/09/18/beautiful-maths-quaternion-from-vectors">a first article</a> about how to build a quaternion from two arbitrary direction vectors that will transform one of the directions into the other. That article was deliberately omitting the special case when the two vectors were facing away from each other, which required special treatment. So I wrote <a href="https://lolengine.net/blog/2013/09/21/picking-orthogonal-vector-combing-coconuts">a second article</a> explaining how to quickly pick an orthogonal vector.
</p>
<p>
Combining the results from the two articles should be straightforward. Here is the resulting, final function:
</p>
<div class="wiki-code"><div class="code"><pre>/* Build a unit quaternion representing the rotation
* from u to v. The input vectors need not be normalised. */
quat quat::fromtwovectors(vec3 u, vec3 v)
{
float norm_u_norm_v = sqrt(dot(u, u) * dot(v, v));
float real_part = norm_u_norm_v + dot(u, v);
vec3 w;
if (real_part < 1.e-6f * norm_u_norm_v)
{
/* If u and v are exactly opposite, rotate 180 degrees
* around an arbitrary orthogonal axis. Axis normalisation
* can happen later, when we normalise the quaternion. */
real_part = 0.0f;
w = abs(u.x) > abs(u.z) ? vec3(-u.y, u.x, 0.f)
: vec3(0.f, -u.z, u.y);
}
else
{
/* Otherwise, build quaternion the standard way. */
w = cross(u, v);
}
return normalize(quat(real_part, w.x, w.y, w.z));
}
</pre></div></div><p>
If you know you are exclusively dealing with unit vectors, you can replace all occurrences of <code>norm_u_norm_v</code> with the value <code>1.0f</code> in order to avoid a useless square root.
</p>
mathsoptimOn picking an orthogonal vector (and combing coconuts)samSat, 21 Sep 2013 09:25:39 GMT
https://lolengine.net/blog/2013/09/21/picking-orthogonal-vector-combing-coconuts
https://lolengine.net/blog/2013/09/21/picking-orthogonal-vector-combing-coconuts<p>
Consider the following problem:
</p>
<blockquote>
<p>
Given a non-zero vector <strong>v</strong> in 3D space, find a non-zero vector <strong>w</strong> orthogonal to <strong>v</strong> (<em>i.e.</em> such that <strong>v.w</strong> = 0).
</p>
</blockquote>
<p>
At first sight, this seems easy. For every possible <strong>v</strong> there are an infinite number of orthogonal vectors lying in a plane:
</p>
<p>
<img src="https://lolengine.net/raw-attachment/blog/2013/09/21/picking-orthogonal-vector-combing-coconuts/math-vector-ortho.png" crossorigin="anonymous" />
</p>
<p>
We just need to pick one. But it’s not as trivial as it may seem.
</p>
<h2 id="Firstattempts">First attempts</h2>
<p>
The following vector satisfies <strong>v.w1</strong> = 0:
</p>
<img src="https://lolengine.net/tracmath/246dbb03a6e565baea0ba0e5ad200de1aa507a0f.png" alt="$$\mathbf{v} = \begin{bmatrix}x \\ y \\ z \end{bmatrix}
\quad
\mathbf{w_1} = \begin{bmatrix}-y \\ x \\ 0 \end{bmatrix}
\quad
\mathbf{v} . \mathbf{w_1} = x * -y + y * x + z * 0 = 0
$$
" /><p>
Unfortunately that <strong>w1</strong> vector can be zero, for instance when <strong>v</strong> = [0 0 1]. The original problem requires a non-zero vector.
</p>
<p>
Let’s try another one:
</p>
<img src="https://lolengine.net/tracmath/056221e759390ad43f8323741762631181138901.png" alt="$$\mathbf{v} = \begin{bmatrix}x \\ y \\ z \end{bmatrix}
\quad
\mathbf{w_2} = \begin{bmatrix}0 \\ -z \\ y \end{bmatrix}
\quad
\mathbf{v} . \mathbf{w_2} = x * 0 + y * -z + z * y = 0
$$
" /><p>
Again, it works most of the time, except in some cases, for instance <strong>v</strong> = [1 0 0].
</p>
<p>
How about adding them together? Certainly <strong>w1</strong> and <strong>w2</strong> cannot be both zero at the same time!
</p>
<img src="https://lolengine.net/tracmath/db3f45a01a1d3e10d9b19aa75049ff40b42e2c3e.png" alt="$$\mathbf{v} = \begin{bmatrix}x \\ y \\ z \end{bmatrix}
\quad
\mathbf{w_3} = \begin{bmatrix}-y \\ x-z \\ y \end{bmatrix}
\quad
\mathbf{v} . \mathbf{w_3} = x * -y + y * (x-z) + z * y = 0
$$
" /><p>
Alas, this still doesn’t work all the time: it fails with <strong>v</strong> = ![1 0 1].
</p>
<p>
Should we try harder? Isn’t there a linear combination of these that will work all the time?
</p>
<p>
Well, no. There isn’t.
</p>
<h2 id="Youcantcombacoconut">You can’t comb a coconut</h2>
<p>
Sadly, the <a class="ext-link" href="http://en.wikipedia.org/wiki/Hairy_ball_theorem"><span class="icon"></span>hairy ball theorem</a>, a consequence of <a class="ext-link" href="http://en.wikipedia.org/wiki/Brouwer_fixed-point_theorem"><span class="icon"></span>Brouwer’s fixed-point theorem</a>, states that any continuous function <em>f</em> that maps the set of unit vectors to orthogonal vectors takes the value zero somewhere.
</p>
<p>
Whichever way you look at the sphere of unit vectors, a continuous tangent field will always have a zero point:
</p>
<p>
<img src="https://lolengine.net/raw-attachment/blog/2013/09/21/picking-orthogonal-vector-combing-coconuts/math-hairy-ball.png" crossorigin="anonymous" />
</p>
<p>
Most of the maths operations we usually do (adding, multiplying, taking square roots, etc.) are continuous. Fortunately, we are not limited to them: we can use a simple <code>if()</code> construct to create a discontinuity that will save us.
</p>
<h2 id="Averygoodimplementation">A very good implementation</h2>
<p>
Here is my personal implementation of the orthogonal vector problem. <strong>It is very good. It has excellent numerical stability. It only does one test. It is short. It is beautiful. I really hope you consider using it.</strong>
</p>
<div class="wiki-code"><div class="code"><pre>/* Always works if the input is non-zero.
* Doesn’t require the input to be normalised.
* Doesn’t normalise the output. */
vec3 orthogonal(vec3 v)
{
return abs(v.x) > abs(v.z) ? vec3(-v.y, v.x, 0.0)
: vec3(0.0, -v.z, v.y);
}
</pre></div></div><p>
Many implementations, such as Ogre3D’s, have additional tests or operations to perform this task, and use epsilons and normalised vectors to avoid singularities.
</p>
<p>
But our only test is actually enough: if |x|>|z|, it means the largest component in absolute value in <strong>v</strong> is either x or y. So we build a vector using x and y. Otherwise, the largest component in absolute value is either y or z. So we build a vector using y and z. That way, we ensure that the length of our returned vector never, ever comes near zero.
</p>
<p>
Whether some given code will cause inefficient branching is often unclear. In our case, it is very likely that the ternary operator will actually be branchless, with some help of the compiler and the hardware.
</p>
<p>
That said, how about we try to get rid of the ternary operator, just for fun?
</p>
<h2 id="Goingbranchlessforfun">Going branchless for fun</h2>
<p>
Let’s see. We had these two candidate vectors, <strong>w1</strong> and <strong>w2</strong>, which worked almost always except when specific values of <strong>v</strong> caused them to be zero. And whatever the constant <em>k</em> we may pick, a vector of the form <strong>w1</strong> + <em>k</em> <strong>w2</strong> will eventually take the value zero for some value of <strong>v</strong>, too, because of the hairy ball theorem.
</p>
<p>
Now here comes the trick. Instead of using a constant <em>k</em>, we use a function <em>f(x,y,z)</em>. This is what our <strong>w</strong> vector looks like:
</p>
<img src="https://lolengine.net/tracmath/46ccc8714d09f4c44f331ff9bf6881dc52f54874.png" alt="$$\mathbf{v} = \begin{bmatrix}x \\ y \\ z \end{bmatrix}
\quad
\mathbf{w} = \begin{bmatrix}-y \\ x- z * f(x,y,z) \\ y * f(x,y,z) \end{bmatrix}
\quad
\mathbf{v} . \mathbf{w} = \cdots = 0
$$
" /><p>
From now I shall assume that <strong>v</strong> is a unit vector.
</p>
<p>
What can cause <strong>w</strong> to be zero, then?
</p>
<p>
One necessary condition is <em>y = 0</em>. When <em>y ≠ 0</em> we can chose anything we want for <em>f(x,y,z)</em>, it’ll always give us a good orthogonal vector. This restricts the problem to the <em>y = 0</em> circle, giving us the useful equality <em>x² + z² = 1</em>.
</p>
<p>
The other condition is <em>x = z*f(x,y,z)</em>. So if we manage to build a function <em>f</em> such that <em>f(x,y,z)</em> never equals <em>x/z</em>, we win.
</p>
<p>
Using <em>x² + z² = 1</em> we can plot all the possible values for <em>x/z</em> as a function of <em>x</em>. It will show us, for a given <em>x</em>, the values that <em>f</em> <strong>cannot</strong> take:
</p>
<p>
<img src="https://lolengine.net/raw-attachment/blog/2013/09/21/picking-orthogonal-vector-combing-coconuts/plot-1.png" crossorigin="anonymous" />
</p>
<p>
The almost vertical slopes you see go to infinity upwards and downwards. As expected, this prevents us from using a continuous function: it would obviously cross one of these walls at some point.
</p>
<p>
Well, let’s try a non-continuous function, then. What are our options?
</p>
<ul><li><em>fmod</em>
</li><li><em>floor</em>, <em>ceil</em>, <em>round</em>…
</li><li><em>fract</em>
</li></ul><p>
Here is one that I came up with and which works reasonably well:
</p>
<img src="https://lolengine.net/tracmath/fe60abf72771e71a780128699683f19bc89f6af3.png" alt="$$f(x,y,z) = \mathrm{fract}(|x| + \frac{1}{2})$$
" /><p>
Look how it nicely avoids <em>x/z</em>:
</p>
<p>
<img src="https://lolengine.net/raw-attachment/blog/2013/09/21/picking-orthogonal-vector-combing-coconuts/plot-2.png" crossorigin="anonymous" />
</p>
<p>
And finally, our resulting branchless code:
</p>
<div class="wiki-code"><div class="code"><pre>/* Requires the input to be normalised.
* Doesn’t normalise the output. */
vec3 orthogonal(vec3 v)
{
float k = fract(abs(v.x) + 0.5);
return vec3(-v.y, v.x - k * v.z, k * v.y);
}
</pre></div></div><p>
I find it highly unlikely that this second version will perform better than the branching one. However, I haven’t put much thought into it and maybe someone will come up with a much better solution using the ideas presented here. Have fun!
</p>
mathsoptimBeautiful maths simplification: quaternion from two vectorssamWed, 18 Sep 2013 16:13:46 GMT
https://lolengine.net/blog/2013/09/18/beautiful-maths-quaternion-from-vectors
https://lolengine.net/blog/2013/09/18/beautiful-maths-quaternion-from-vectors<p>
In this article I would like to explain the thought process behind the derivation of a widely used formula: <strong>finding a quaternion representing the rotation between two 3D vectors</strong>. Nothing really new, but hopefully a few ideas could be reused at other times.
</p>
<p>
<strong>Note</strong>: the routine presented here is incomplete on purpose. For a version that can be used in production code, see the next article instead, <em><a href="https://lolengine.net/blog/2014/02/24/quaternion-from-two-vectors-final">Quaternion from two vectors: the final version</a></em>.
</p>
<h2 id="Naivemethod">Naive method</h2>
<p>
A rotation is best visualised using a rotation axis and an angle. Except in degenerate cases, the rotation axis can be obtained by computing the cross product of the two original vectors:
</p>
<p>
<img src="https://lolengine.net/raw-attachment/blog/2013/09/18/beautiful-maths-quaternion-from-vectors/maths-rotation.png" crossorigin="anonymous" />
</p>
<p>
Then the angle can be obtained using the properties of the cross product and/or the dot product:
</p>
<img src="https://lolengine.net/tracmath/3392cb9cb09b7674a03abe2bd44c9bb61cfaf05d.png" alt="\begin{align}
\boldsymbol{u} . \boldsymbol{v} &= |\boldsymbol{u}| . |\boldsymbol{v}| . \cos\theta \\
||\boldsymbol{u} \times \boldsymbol{v}|| &= |\boldsymbol{u}| . |\boldsymbol{v}| . |\sin\theta|
\end{align}
" /><p>
Since θ is always between 0 and π, we only care about the dot product. This gives us some obvious code to create the quaternion (omitting corner cases such as <code>θ = 0</code> for clarity):
</p>
<div class="wiki-code"><div class="code"><pre>quat quat::fromtwovectors(vec3 u, vec3 v)
{
float cos_theta = dot(normalize(u), normalize(v));
float angle = acos(cos_theta);
vec3 w = normalize(cross(u, v));
return quat::fromaxisangle(angle, w);
}
</pre></div></div><p>
This is naive but it works. Googling for “quaternion from two vectors” shows <a class="ext-link" href="http://www.ogre3d.org/forums/viewtopic.php?f=10&t=45076"><span class="icon"></span>this forum post</a> and <a class="ext-link" href="http://stackoverflow.com/q/10236557/111461"><span class="icon"></span>this SO question</a> where this method or a variation thereof is suggested.
</p>
<h2 id="Lookingunderthehood">Looking under the hood</h2>
<p>
Let’s have a look at what happens when building the quaternion from an axis and an angle:
</p>
<img src="https://lolengine.net/tracmath/972b6a5c6de8778942a85d6551ff7634038de683.png" alt="$$ q = \cos\frac{\theta}{2} + \boldsymbol{i}\sin\frac{\theta}{2}v_x + \boldsymbol{j}\sin\frac{\theta}{2}v_y + \boldsymbol{k}\sin\frac{\theta}{2}v_z $$
" /><p>
This means the code for <code>quat::fromaxisangle</code> would look somewhat like this:
</p>
<div class="wiki-code"><div class="code"><pre>quat quat::fromaxisangle(float angle, vec3 axis)
{
float half_sin = sin(0.5f * angle);
float half_cos = cos(0.5f * angle);
return quat(half_cos,
half_sin * axis.x,
half_sin * axis.y,
half_sin * axis.z);
}
</pre></div></div><h2 id="Avoidingtrigonometry">Avoiding trigonometry</h2>
<p>
If you read Iñigo Quilez’s recent article about <a class="ext-link" href="http://www.iquilezles.org/www/articles/noacos/noacos.htm"><span class="icon"></span>avoiding trigonometry</a> you’ll have probably frowned at the fact that we computed θ from cos(θ), then computed sin(θ/2) and cos(θ/2).
</p>
<p>
Indeed, it happens that there is a <strong>much simpler way</strong> to do it; the <a class="ext-link" href="https://en.wikipedia.org/wiki/Double-angle_formula#Double-angle.2C_triple-angle.2C_and_half-angle_formulae"><span class="icon"></span>half-angle formulas</a> from precalculus tell us the following:
</p>
<img src="https://lolengine.net/tracmath/f4113671369abf13004c271454a8686b5e934515.png" alt="\begin{align}
\sin\frac{\theta}{2} &= \sqrt{\frac{1 - \cos\theta}{2}} \\
\cos\frac{\theta}{2} &= \sqrt{\frac{1 + \cos\theta}{2}}
\end{align}
" /><p>
This allows us to simplify our quaternion creation code:
</p>
<div class="wiki-code"><div class="code"><pre>quat quat::fromtwovectors(vec3 u, vec3 v)
{
float cos_theta = dot(normalize(u), normalize(v));
float half_cos = sqrt(0.5f * (1.f + cos_theta));
float half_sin = sqrt(0.5f * (1.f - cos_theta));
vec3 w = normalize(cross(u, v));
return quat(half_cos,
half_sin * w.x,
half_sin * w.y,
half_sin * w.z);
}
</pre></div></div><p>
This is pretty nice. By using well known trigonometry formulas, we got rid of all trigonometry function calls!
</p>
<h2 id="Avoidingsquareroots">Avoiding square roots</h2>
<p>
It happens that we can do slightly better. Note that we normalize three vectors: <code>u</code>, <code>v</code> and <code>cross(u, v)</code>. That’s three square roots. The thing is, we already know the norm of <code>w</code> through this formula:
</p>
<img src="https://lolengine.net/tracmath/b1e3fb7f76781d8c31a9a40391570d97ab770da0.png" alt="$$||\boldsymbol{u} \times \boldsymbol{v}|| = |\boldsymbol{u}| . |\boldsymbol{v}| . |\sin\theta|$$
" /><p>
And we know <code>sin(θ)</code> from precalculus again:
</p>
<img src="https://lolengine.net/tracmath/fd77406133709a556fc015cf9080ef57c830cc3b.png" alt="$$\sin\theta = 2 \sin\frac{\theta}{2} \cos\frac{\theta}{2}$$
" /><p>
Also, using the fact that <code>sqrt(a)sqrt(b) = sqrt(ab)</code> lets us perform one less square root.
</p>
<p>
We can therefore come up with the following performance improvement:
</p>
<div class="wiki-code"><div class="code"><pre>quat quat::fromtwovectors(vec3 u, vec3 v)
{
float norm_u_norm_v = sqrt(sqlength(u) * sqlength(v));
float cos_theta = dot(u, v) / norm_u_norm_v;
float half_cos = sqrt(0.5f * (1.f + cos_theta));
float half_sin = sqrt(0.5f * (1.f - cos_theta));
vec3 w = cross(u, v) / (norm_u_norm_v * 2.f * half_sin * half_cos);
return quat(half_cos,
half_sin * w.x,
half_sin * w.y,
half_sin * w.z);
}
</pre></div></div><p>
Oh wait! We divide by <code>sin(θ/2)</code> to compute <code>w</code>, then we multiply by <code>sin(θ/2)</code> again. This means we don’t even need that variable, and we can simplify even further:
</p>
<div class="wiki-code"><div class="code"><pre>quat quat::fromtwovectors(vec3 u, vec3 v)
{
float norm_u_norm_v = sqrt(sqlength(u) * sqlength(v));
float cos_theta = dot(u, v) / norm_u_norm_v;
float half_cos = sqrt(0.5f * (1.f + cos_theta));
vec3 w = cross(u, v) / (norm_u_norm_v * 2.f * half_cos);
return quat(half_cos, w.x, w.y, w.z);
}
</pre></div></div><p>
This is more or less the code used by the <a class="ext-link" href="https://bitbucket.org/sinbad/ogre/src/9db75e3ba05c/OgreMain/include/OgreVector3.h#cl-651"><span class="icon"></span>Ogre3D engine in OgreVector3.h</a>, except they perform an additional normalisation step on the final result. This is mathematically useless, but due to numerical stability issues, it is probably safe to do so nonetheless.
</p>
<p>
This final normalisation step is actually an opportunity to simplify the code even further.
</p>
<h2 id="ImprovingonOgre3D">Improving on Ogre3D</h2>
<p>
We are down to two square roots and four divisions, plus quite a few mul/adds. Depending on the platform that we are running on, it is possible to simplify even further and improve performance. For instance, on many SIMD architectures, normalising a quaternion can be very fast.
</p>
<p>
This is the code we get if we multiply every component of the quaternion by <code>2.f * half_cos</code> and let <code>normalize()</code> do the rest of the job:
</p>
<div class="wiki-code"><div class="code"><pre>quat quat::fromtwovectors(vec3 u, vec3 v)
{
float norm_u_norm_v = sqrt(sqlength(u) * sqlength(v));
float cos_theta = dot(u, v) / norm_u_norm_v;
float half_cos = sqrt(0.5f * (1.f + cos_theta));
vec3 w = cross(u, v) / norm_u_norm_v;
return normalize(quat(2.f * half_cos * half_cos, w.x, w.y, w.z));
}
</pre></div></div><p>
Now <code>half_cos</code> only appears in its squared form, and since it comes from a square root, we can simply omit that square root:
</p>
<div class="wiki-code"><div class="code"><pre>quat quat::fromtwovectors(vec3 u, vec3 v)
{
float norm_u_norm_v = sqrt(sqlength(u) * sqlength(v));
float cos_theta = dot(u, v) / norm_u_norm_v;
vec3 w = cross(u, v) / norm_u_norm_v;
return normalize(quat(1.f + cos_theta, w.x, w.y, w.z));
}
</pre></div></div><p>
And using the same reasoning we can multiply every quaternion component by <code>norm_u_norm_v</code>:
</p>
<div class="wiki-code"><div class="code"><pre>quat quat::fromtwovectors(vec3 u, vec3 v)
{
float norm_u_norm_v = sqrt(sqlength(u) * sqlength(v));
vec3 w = cross(u, v);
quat q = quat(norm_u_norm_v + dot(u, v), w.x, w.y, w.z);
return normalize(q);
}
</pre></div></div><p>
We are still doing two square roots and four divisions, some of which are hidden in <code>normalize()</code>, but the code is considerably shorter now.
</p>
<h2 id="Finalform">Final form</h2>
<p>
<img width="300px" crossorigin="anonymous" src="https://lolengine.net/raw-attachment/blog/2013/09/18/beautiful-maths-quaternion-from-vectors/final-form.jpeg" />
</p>
<p>
If <code>u</code> and <code>v</code> can be enforced to be unit vectors, <code>norm_u_norm_v</code> can be omitted and simply replaced with <code>1.0f</code>:
</p>
<div class="wiki-code"><div class="code"><pre>quat quat::fromtwovectors(vec3 u, vec3 v)
{
vec3 w = cross(u, v);
quat q = quat(1.f + dot(u, v), w.x, w.y, w.z);
return normalize(q);
}
</pre></div></div><p>
Isn’t it beautiful, considering the <code>sin()</code>, <code>cos()</code> and <code>acos()</code> ridden mess we started with?
</p>
<p>
This algorithm can be found all over the Internet, but I do not know who first came up with it. Also, a lot of 3D engines (both publicly available and slightly more private) could benefit from it.
</p>
<h2 id="Update06012014">Update (06/01/2014)</h2>
<p>
In the comments below, Michael Norel provides the following improvement to the non-unit version. Since the values <code>d = dot(u, v)</code> and <code>w = cross(u, v)</code> are computed no matter what, the value <code>sqlength(u) * sqlength(v)</code> could be computed in a different way, *i.e.* <code>d * d + sqlength(w)</code>. The following code does at least three multiplications less:
</p>
<div class="wiki-code"><div class="code"><pre>quat quat::fromtwovectors(vec3 u, vec3 v)
{
vec3 w = cross(u, v);
quat q = quat(dot(u, v), w.x, w.y, w.z);
q.w += length(q);
return normalize(q);
}
</pre></div></div><p>
Also, Marc B. Reynolds notes that, in the unit version, the final normalisation factor is <code>sqrt((1 + dot(u, v))² + sqlength(cross(u, v)))</code> which reduces to <code>sqrt(2 + 2 dot(u, v))</code> thanks to the <code>sin² + cos²</code> identity. It leads to the following possibly improved version:
</p>
<div class="wiki-code"><div class="code"><pre>quat quat::fromtwovectors(vec3 u, vec3 v)
{
float m = sqrt(2.f + 2.f * dot(u, v));
vec3 w = (1.f / m) * cross(u, v);
return quat(0.5f * m, w.x, w.y, w.z);
}
</pre></div></div>mathsoptimMaths trick: doing fewer comparisonssamTue, 14 Feb 2012 13:22:31 GMT
https://lolengine.net/blog/2012/02/14/fewer-comparisons
https://lolengine.net/blog/2012/02/14/fewer-comparisons<p>
Note: this is <strong>not</strong> an optimisation. It is just one more tool you should have in your toolbox when looking for optimisations. It may be useful.
</p>
<p>
This is the trick:
</p>
<img src="https://lolengine.net/tracmath/84cce2f2fc6089f719a166eea4d7900c0018fe45.png" alt="\[\min(x,y) = \dfrac{x + y - |x - y|}{2}\]
\[\max(x,y) = \dfrac{x + y + |x - y|}{2}\]
" /><p>
You can check for yourself that it is always true: when <em>x > y</em>, <em>|x - y|</em> is the same as <em>x - y</em>, etc.
</p>
<p>
What good is it for? There is often an implicit comparison in <code>min</code> or <code>max</code>. It might be interesting to replace it with a call to the branchless <code>fabs</code>.
</p>
<h2 id="Exampleusage">Example usage</h2>
<p>
Consider the following code:
</p>
<div class="wiki-code"><div class="code"><pre>float a, b, c, d;
/* ... */
return (a > b) && (c > d);
</pre></div></div><p>
That kind of code is often used <em>eg.</em> in collision checks, where a lot of tests can be done. This code does <strong>two comparisons</strong>. On some architectures, this means two branches. Not always something you want.
</p>
<p>
The test condition is equivalent to:
</p>
<div class="wiki-code"><div class="code"><pre>(a - b > 0) && (c - d > 0)
</pre></div></div><p>
Now when are two given numbers <strong>both</strong> positive? That is if and only if the <strong>smallest</strong> is positive:
</p>
<div class="wiki-code"><div class="code"><pre>min(a - b, c - d) > 0
</pre></div></div><p>
We may now use our trick:
</p>
<div class="wiki-code"><div class="code"><pre>(a - b) + (c - d) - |(a - b) - (c + d)| > 0
</pre></div></div><p>
And so the code could be rewritten as such:
</p>
<div class="wiki-code"><div class="code"><pre>float a, b, c, d;
/* ... */
return (a - b) + (c - d) > fabsf((a - b) - (c - d));
</pre></div></div><p>
We basically replaced the additional test with a call to <code>fabsf</code> and some additions/subtractions. It may be possible to reorganise the input data so that this second version performs better.
</p>
mathsoptimcodetipAnnounce: LolRemez 0.2 releasedsamMon, 09 Jan 2012 23:14:28 GMT
https://lolengine.net/blog/2012/01/10/announce-lolremez-0.2-release
https://lolengine.net/blog/2012/01/10/announce-lolremez-0.2-release<p>
A new version of our high precision polynomial approximation solver, <strong>LolRemez 0.2</strong>, is available.
</p>
<p>
The changes, taking into account all the feedback users provided, are as follows:
</p>
<ul><li>significant performance and accuracy improvements thanks to various bugfixes and a better extrema finder for the error function.
</li><li>user can now define accuracy of the final result.
</li><li><code>exp</code>, <code>sin</code>, <code>cos</code> and <code>tan</code> are now about 20% faster.
</li><li>multiplying a real number by an integer power of two is now a virtually free operation.
</li><li>fixed a rounding bug in the real number printing routine.
</li></ul><p>
You can visit <a class="wiki" href="https://lolengine.net/wiki/oss/lolremez">the software homepage</a> to download LolRemez and, more importantly, the <a class="wiki" href="https://lolengine.net/wiki/doc/maths/remez">comprehensive documentation</a> featuring a step-by-step tutorial.
</p>
releasemathsoptimAnnounce: LolRemez 0.1 releasedsamThu, 29 Dec 2011 18:47:28 GMT
https://lolengine.net/blog/2011/12/29/announce-lolremez-0.1-release
https://lolengine.net/blog/2011/12/29/announce-lolremez-0.1-release<p>
In my previous <a href="https://lolengine.net/blog/2011/12/21/better-function-approximations">article about the Remez exchange algorithm</a> I said I was working on a Remez exchange toolkit for everyone to play with. Though it’s far from being full-featured, I already use it and I believe it is already extremely useful. So I decided to release <strong>LolRemez 0.1</strong> to the masses.
</p>
<p>
You can visit <a class="wiki" href="https://lolengine.net/wiki/oss/lolremez">the software homepage</a> to download LolRemez and, more importantly, the <a class="wiki" href="https://lolengine.net/wiki/doc/maths/remez">comprehensive documentation</a> featuring a step-by-step tutorial.
</p>
releasemathsoptimBetter function approximations: Taylor vs. RemezsamWed, 21 Dec 2011 22:08:09 GMT
https://lolengine.net/blog/2011/12/21/better-function-approximations
https://lolengine.net/blog/2011/12/21/better-function-approximations<p>
You may have once crossed this particular piece of magic:
</p>
<img src="https://lolengine.net/tracmath/0583ac4071f8fb6dfd836422eae88b918f766fb1.png" alt="$\sin(a) = a - \dfrac{a^3}{3!} + \dfrac{a^5}{5!} - \dfrac{a^7}{7!} + \dfrac{a^9}{9!} + \dots$
" /><p>
The right part is the <strong>Taylor series of <code>sin</code> around 0</strong>. It converges very quickly to the actual value of <code>sin(a)</code>. This allows a computer to compute the sine of a number with arbitrary precision.
</p>
<p>
And when I say it’s magic, it’s because it is! Some functions, called the <a class="ext-link" href="http://en.wikipedia.org/wiki/Entire_function"><span class="icon"></span>entire functions</a>, can be computed everywhere using <strong>one single formula! </strong> Other functions may require a different formula for different intervals; they are the <a class="ext-link" href="http://en.wikipedia.org/wiki/Analytic_function"><span class="icon"></span>analytic functions</a>, a superset of the entire functions. In general, Taylor series are an extremely powerful tool to compute the value of a given function with very high accuracy, because for several common functions such as <code>sin</code>, <code>tan</code> or <code>exp</code> the terms of the series are easy to compute and, when implemented on a computer, can even be stored in a table at compile time.
</p>
<h2 id="ApproximatingsinwithTaylorseries">Approximating sin with Taylor series</h2>
<p>
This is how one would approximate <code>sin</code> using 7 terms of its Taylor series on the [-π/2, π/2] interval. The more terms, the better the precision, but we’ll stop at 7 for now:
</p>
<div class="wiki-code"><div class="code"><pre>static double taylorsin(double x)
{
static const
double a0 = 1.0,
a1 = -1.666666666666666666666666666666e-1, /* -1/3! */
a2 = 8.333333333333333333333333333333e-3, /* 1/5! */
a3 = -1.984126984126984126984126984126e-4, /* -1/7! */
a4 = 2.755731922398589065255731922398e-6, /* 1/9! */
a5 = -2.505210838544171877505210838544e-8, /* -1/11! */
a6 = 1.605904383682161459939237717015e-10; /* 1/13! */
double x2 = x * x;
return x * (a0 + x2 * (a1 + x2 * (a2 + x2
* (a3 + x2 * (a4 + x2 * (a5 + x2 * a6))))));
}
</pre></div></div><p>
And you may think…
</p>
<blockquote>
<p>
<img src="https://lolengine.net/raw-attachment/blog/2011/12/14/understanding-motion-in-games/derp.png" alt="/raw-attachment/blog/2011/12/14/understanding-motion-in-games/derp.png" title="/raw-attachment/blog/2011/12/14/understanding-motion-in-games/derp.png" style="float:left" crossorigin="anonymous" /> “Oh wow that is awesome! So simple for such a difficult function. Also, since I read your <a href="https://lolengine.net/blog/2011/9/17/playing-with-the-cpu-pipeline">masterpiece about polynomial evaluation</a> I know how to improve that function so that it is <em>very fast</em>!”
</p>
</blockquote>
<p>
Well, actually, no.
</p>
<p>
<img src="https://lolengine.net/raw-attachment/blog/2011/12/14/understanding-motion-in-games/no-rage.png" alt="/raw-attachment/blog/2011/12/14/understanding-motion-in-games/no-rage.png" title="/raw-attachment/blog/2011/12/14/understanding-motion-in-games/no-rage.png" style="float:right" crossorigin="anonymous" />
</p>
<div style="text-transform: uppercase; font-weight: bold; font-size: 1.1em;" class="wikipage"><p>
If you are approximating a function over an interval using its Taylor series then either you or the person who taught you is a fucking idiot because a Taylor series approximates a function <span style="color: red;">near</span> a fucking <span style="color: red;">point</span>, not <span style="color: red;">over</span> a fucking <span style="color: red;">interval</span>, and if you don’t understand why it’s important then please read on because that shit is gonna blow your mind.
</p>
</div><p>
</p><div style="clear:both;"></div><p>
</p>
<h2 id="Errormeasurement">Error measurement</h2>
<p>
Let’s have a look at how much error our approximation introduces. The formula for the absolute error is simple:
</p>
<img src="https://lolengine.net/tracmath/c8fc451d16e0c58eaa1ab817a799f479109a4193.png" alt="$\text{E}(x) = \left|\sin(x) - \text{taylorsin}(x)\right|$
" /><p>
And this is how it looks like over our interval:
</p>
<blockquote>
<p>
<img src="https://lolengine.net/raw-attachment/blog/2011/12/21/better-function-approximations/error-pi-2.png" crossorigin="anonymous" />
</p>
</blockquote>
<p>
You can see that the error skyrockets near the edges of the [-π/2, π/2] interval.
</p>
<blockquote>
<p>
<img src="https://lolengine.net/raw-attachment/blog/2011/12/14/understanding-motion-in-games/derp.png" alt="/raw-attachment/blog/2011/12/14/understanding-motion-in-games/derp.png" title="/raw-attachment/blog/2011/12/14/understanding-motion-in-games/derp.png" style="float:left" crossorigin="anonymous" /> “Well the usual way to fix this is to split the interval in two or more parts, and use a different Taylor series for each interval.”
</p>
</blockquote>
<p>
Oh, really? Well, let’s see the error on [-π/4, π/4] instead:
</p>
<blockquote>
<p>
<img src="https://lolengine.net/raw-attachment/blog/2011/12/21/better-function-approximations/error-pi-4.png" crossorigin="anonymous" />
</p>
</blockquote>
<p>
<strong>I see no difference</strong>! The error is indeed smaller, but again, it becomes extremely large at the edges of the interval. And before you start suggesting reducing the interval even more, here is the error on [-π/8, π/8] now:
</p>
<blockquote>
<p>
<img src="https://lolengine.net/raw-attachment/blog/2011/12/21/better-function-approximations/error-pi-8.png" crossorigin="anonymous" />
</p>
</blockquote>
<p>
I hope this makes it clear that:
</p>
<ul><li>the further from the centre of the interval, the larger the error
</li><li>the error distribution is very unbalanced
</li><li>the maximum error on [-π/2, π/2] is about <code>6.63e-10</code>
</li></ul><p>
And now I am going to show you why that maximum error value is <strong>pathetic</strong>.
</p>
<h2 id="Abetterapproximation">A better approximation</h2>
<p>
Consider this new function:
</p>
<div class="wiki-code"><div class="code"><pre>static double minimaxsin(double x)
{
static const
double a0 = 1.0,
a1 = -1.666666666640169148537065260055e-1,
a2 = 8.333333316490113523036717102793e-3,
a3 = -1.984126600659171392655484413285e-4,
a4 = 2.755690114917374804474016589137e-6,
a5 = -2.502845227292692953118686710787e-8,
a6 = 1.538730635926417598443354215485e-10;
double x2 = x * x;
return x * (a0 + x2 * (a1 + x2 * (a2 + x2
* (a3 + x2 * (a4 + x2 * (a5 + x2 * a6))))));
}
</pre></div></div><p>
It doesn’t look very different, right? Right. The values <code>a0</code> to <code>a6</code> are slightly different, but the rest of the code is <em>strictly the same</em>.
</p>
<p>
Yet what a difference it makes! Look at this error curve:
</p>
<blockquote>
<p>
<img src="https://lolengine.net/raw-attachment/blog/2011/12/21/better-function-approximations/remez-error.png" crossorigin="anonymous" />
</p>
</blockquote>
<p>
That new function makes it obvious that:
</p>
<ul><li>the error distribution is better spread over the interval
</li><li>the maximum error on [-π/2, π/2] is about <code>4.96e-14</code>
</li></ul><p>
Check that last figure again. The new maximum error isn’t 10% better, or maybe twice as good. It is more than <strong>ten thousand times smaller</strong>!!
</p>
<h2 id="Theminimaxpolynomial">The minimax polynomial</h2>
<p>
The above coefficients describe a <a class="ext-link" href="http://en.wikipedia.org/wiki/Minimax_approximation_algorithm"><span class="icon"></span>minimax polynomial</a>: that is, the polynomial that minimises a given error when approximating a given function. I will not go into the mathematical details, but just remember this: if the function is sufficiently well-suited (as <code>sin</code>, <code>tan</code>, <code>exp</code> etc. are), then the minimax polynomial can be found.
</p>
<p>
The problem? It’s hard to find. The most popular algorithm to find it is the <a class="ext-link" href="http://en.wikipedia.org/wiki/Remez_algorithm"><span class="icon"></span>Remez exchange algorithm</a>, and few people really seem to understand how it works (or there would be a lot less Taylor series). I am not going to explain it right now. Usually you need professional math tools such as Maple or Mathematica if you want to compute a minimax polynomial. The <a class="ext-link" href="http://www.boost.org/doc/libs/1_36_0/libs/math/doc/sf_and_dist/html/math_toolkit/toolkit/internals2/minimax.html"><span class="icon"></span>Boost library</a> is a notable exception, though.
</p>
<p>
But you saw the results, so <strong>stop using Taylor series</strong>. Spending some time finding the minimax polynomial is definitely worth it. This is why I am working on <a class="wiki" href="https://lolengine.net/wiki/oss/lolremez">a Remez framework</a> that I will make public and free for everyone to use, modify and <a class="ext-link" href="http://www.wtfpl.net/"><span class="icon"></span>do what the fuck they want</a>. In the meantime, if you have functions to numerically approximate, or Taylor-based implementations that you would like to improve, let me know in the comments! This will be great use cases for me.
</p>
optimrantcodemaths