Changeset 891


Timestamp: Sep 2, 2011, 1:36:42 PM
Author: sam
Message:

optim: split the Taylor series evaluation into two independent partial sums.

This costs one additional multiply, but performance increases by more
than 11%, because the two shorter chains of operations interleave much
better in the PS3 pipeline.
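
The split can be written out as a short derivation. The notation here is introduced for illustration and does not appear in the commit: $y$ stands for `norm_x * norm_x` and $c_k$ for `SC[k]`.

```latex
\begin{align*}
T(y) &= 1 + \sum_{k=0}^{7} c_k\, y^{k+1} \\
     &= 1 + y \underbrace{\sum_{j=0}^{3} c_{2j}\,(y^2)^j}_{\texttt{sub2}}
          + y^2 \underbrace{\sum_{j=0}^{3} c_{2j+1}\,(y^2)^j}_{\texttt{sub1}} \\
     &= (\texttt{sub1}\cdot y + \texttt{sub2})\cdot y + 1
\end{align*}
```

Counting multiplies, the single Horner chain needs 1 (for $y$) plus 8, so 9 in total. The split form needs 2 (for $y$ and $y^2$), 3 for each partial sum, and 2 to recombine, so 10 in total. That matches the "one additional multiply" in the message. The longest chain of dependent multiplies drops from 9 to 7, and the two partial sums do not depend on each other.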

File: 1 edited

  • trunk/src/trig.cpp

--- trunk/src/trig.cpp (r890)
+++ trunk/src/trig.cpp (r891)
@@ -169,4 +169,8 @@
     double absx = lol_fabs(x * INV_PI);
     double sign = lol_fsel(x, PI, NEG_PI);
+
+    /* To compute sin(x) we build a Taylor series for |x|/pi wrapped to
+     * the range [-1, 1]. We also switch the result sign if the number
+     * of cycles is odd. */
 #if defined __CELLOS_LV2__
     double num_cycles = lol_round(absx);
@@ -185,9 +189,13 @@
 #endif
     double norm_x = absx - num_cycles;
-    double y = norm_x * norm_x;
-    double taylor = (((((((SC[7] * y + SC[6]) * y + SC[5])
-                                 * y + SC[4]) * y + SC[3])
-                                 * y + SC[2]) * y + SC[1])
-                                 * y + SC[0]) * y + ONE;
+
+    /* Computing x^4 costs us one extra multiplication, but it helps
+     * interleave the Taylor series operations a lot better. */
+    double x2 = norm_x * norm_x;
+    double x4 = x2 * x2;
+    double sub1 = ((SC[7] * x4 + SC[5]) * x4 + SC[3]) * x4 + SC[1];
+    double sub2 = ((SC[6] * x4 + SC[4]) * x4 + SC[2]) * x4 + SC[0];
+    double taylor = (sub1 * x2 + sub2) * x2 + ONE;
+
     double result = norm_x * taylor;
     return result * sign;
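
The r890 and r891 forms can be compared in isolation with a small standalone sketch. It is not part of the changeset. The `SC` table is not shown in the diff, so the coefficients below are an assumption: plain Taylor coefficients chosen so that `pi * t * taylor(t)` approximates `sin(pi * t)`. The function names `make_coeffs`, `taylor_horner`, `taylor_split` and `max_abs_difference` are hypothetical.

```cpp
#include <array>
#include <cmath>

// ASSUMPTION: SC[k] = (-1)^(k+1) * pi^(2k+2) / (2k+3)!, so that
// sin(pi*t) ~= pi * t * (1 + SC[0]*t^2 + ... + SC[7]*t^16).
// The real SC table in trig.cpp may hold different (e.g. tuned) values.
std::array<double, 8> make_coeffs()
{
    const double pi = 3.14159265358979323846;
    std::array<double, 8> sc{};
    double term = 1.0;
    for (int k = 0; k < 8; ++k)
    {
        // Each step multiplies by -pi^2 / ((2k+2)(2k+3)).
        term *= -(pi * pi) / ((2.0 * k + 2.0) * (2.0 * k + 3.0));
        sc[k] = term;
    }
    return sc;
}

// r890 form: one Horner chain in y = t^2; every step waits on the previous.
double taylor_horner(const std::array<double, 8>& sc, double t)
{
    double y = t * t;
    return (((((((sc[7] * y + sc[6]) * y + sc[5]) * y + sc[4]) * y + sc[3])
                 * y + sc[2]) * y + sc[1]) * y + sc[0]) * y + 1.0;
}

// r891 form: two independent chains in x4 = t^4, recombined at the end.
double taylor_split(const std::array<double, 8>& sc, double t)
{
    double x2 = t * t;
    double x4 = x2 * x2;
    double sub1 = ((sc[7] * x4 + sc[5]) * x4 + sc[3]) * x4 + sc[1];
    double sub2 = ((sc[6] * x4 + sc[4]) * x4 + sc[2]) * x4 + sc[0];
    return (sub1 * x2 + sub2) * x2 + 1.0;
}

// Largest absolute difference between the two forms for t in [-0.5, 0.5].
double max_abs_difference()
{
    const std::array<double, 8> sc = make_coeffs();
    double worst = 0.0;
    for (int i = -500; i <= 500; ++i)
    {
        double t = i / 1000.0;
        double diff = std::fabs(taylor_horner(sc, t) - taylor_split(sc, t));
        worst = std::fmax(worst, diff);
    }
    return worst;
}
```

Calling `max_abs_difference()` from a test or a `main` should show that the two forms agree up to floating-point rounding. The sketch checks only numerical equivalence. It says nothing about the speedup, which depends on the target pipeline and compiler.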