KVR Audio

Z1202 · Post by **Z1202** » Tue Feb 12, 2019 8:47 am

On modern computers it's recommended to use Estrin's scheme rather than Horner method due to the parallelization options of the former. I was wondering why instead we don't use a factored form of the polynomials, which should be almost freely parallelizable? At least when the polynomials are known at compile time

2DaT · Post by **2DaT** » Tue Feb 12, 2019 9:52 am

Evaluation of factored polynomials is ill-conditioned, esp. for high order polynomials. OTOH, Horner form is well-conditioned (at least for good polynomials), and it has error-cancelling properties. Estrin's scheme also has some error-cancelling properties, and is suitable for accurate evaluation.

Also, Fused-multiply-add operations (FMA) can't be used on a factored form.

vortico · Post by **vortico** » Tue Feb 12, 2019 10:16 am

Not related to your question, but Estrins scheme only makes sense when you aren't already evaluating 4 or 8 floats in parallel (4/8 note polyphony for example). In that case, Horner's method is better since it uses fewer arithmetic operations.

Z1202 · Post by **Z1202** » Tue Feb 12, 2019 10:18 am

2DaT wrote: ↑Tue Feb 12, 2019 9:52 am Evaluation of factored polynomials is ill-conditioned, esp. for high order polynomials.

Hmmm, googling around I found that conversion from polynomial coefficients to the roots (and maybe vice versa) could be ill-conditioned. But we are not interested in the polynomial coefficients or the roots per se, we are interested in the value of the polynomial itself. In that case ill-conditioning of the polynomial value in respect to the coefficient values would mean that Horner or Estrin's scheme would be equally ill-conditioned. As for the factored form, I'm not sure if there are any cases when the polynomial value is ill-conditioned with respect to the roots, at least more than it would be in the Horner scheme.

Would you have any further pointers?

PS. As for FMA, that's a good point, although in practice we could (or often will have to) factor into 2nd order polynomials, which would partially allow some FMAs.

2DaT · Post by **2DaT** » Tue Feb 12, 2019 11:56 am

Z1202 wrote: ↑Tue Feb 12, 2019 10:18 am
2DaT wrote: ↑Tue Feb 12, 2019 9:52 am Evaluation of factored polynomials is ill-conditioned, esp. for high order polynomials.
Hmmm, googling around I found that conversion from polynomial coefficients to the roots (and maybe vice versa) could be ill-conditioned. But we are not interested in the polynomial coefficients or the roots per se, we are interested in the value of the polynomial itself. In that case ill-conditioning of the polynomial value in respect to the coefficient values would mean that Horner or Estrin's scheme would be equally ill-conditioned. As for the factored form, I'm not sure if there are any cases when the polynomial value is ill-conditioned with respect to the roots, at least more than it would be in the Horner scheme.

Would you have any further pointers?

PS. As for FMA, that's a good point, although in practice we could (or often will have to) factor into 2nd order polynomials, which would partially allow some FMAs.

Well, polynomials that we are interested in, usually have decaying coefficients, which does not translate well into factored form.
For example: (2^x-1)/x remez on [-0.5,0.5].

Code: Select all

0.00015403599537x^5 + 0.00133907760028x^4 + 0.00961823761463x^3 + 0.05550357326865x^2 + 0.24022649228573x + 0.69314718246460

"Decaying" coefficients allow accurate evaluation in Horner form.

2DaT · Post by **2DaT** » Tue Feb 12, 2019 12:30 pm

vortico wrote: ↑Tue Feb 12, 2019 10:16 am Not related to your question, but Estrins scheme only makes sense when you aren't already evaluating 4 or 8 floats in parallel (4/8 note polyphony for example). In that case, Horner's method is better since it uses fewer arithmetic operations.

Long dependency chains constrain out-of-order execution. In Horner's scheme every operation is dependent on previous, but in Estrin's scheme there's less dependency. Of course performance depends on the surrounding code, but in practice Estrin's scheme wins for orders >5.

Z1202 · Post by **Z1202** » Tue Feb 12, 2019 1:00 pm

2DaT wrote: ↑Tue Feb 12, 2019 11:56 am Well, polynomials that we are interested in, usually have decaying coefficients, which does not translate well into factored form.
For example: (2^x-1)/x remez on [-0.5,0.5].
Code: Select all
0.00015403599537x^5 + 0.00133907760028x^4 + 0.00961823761463x^3 + 0.05550357326865x^2 + 0.24022649228573x + 0.69314718246460
"Decaying" coefficients allow accurate evaluation in Horner form.

The maximum difference between
((((0.00015403599537f*x + 0.00133907760028f)*x + 0.00961823761463f)*x
+ 0.05550357326865f)*x + 0.24022649228573f)*x + 0.69314718246460f
and 0.00015403599537f*(x+5.14278588728699f)*
( (x-2.522928813467399f)*x+32.9484316700106f )*
( (x+6.073420085609848f)*x+26.55645202416117f)
on [-0.5,0.5] is 2.4*10^-7 (evaluated at 10^6 different points with 32 bit float)

mystran · Post by **mystran** » Tue Feb 12, 2019 2:28 pm

https://www.maa.org/sites/default/files ... kinson.pdf

Z1202 · Post by **Z1202** » Tue Feb 12, 2019 2:46 pm

mystran wrote: ↑Tue Feb 12, 2019 2:28 pm https://www.maa.org/sites/default/files ... kinson.pdf

TLDR, but, as far as I understand, this is exactly about ill-conditioning of the roots in respect to the coefficients, which is not what we really care about (because, IIUC, the very same ill-conditioning will ill-condition the polynomial values evaluated by the Horner scheme in a comparable amount). Or did I miss any important part while glancing through?

Z1202 · Post by **Z1202** » Tue Feb 12, 2019 2:47 pm

posted by mistake, delete

mystran · Post by **mystran** » Tue Feb 12, 2019 3:13 pm

2DaT wrote: ↑Tue Feb 12, 2019 11:56 am Well, polynomials that we are interested in, usually have decaying coefficients, which does not translate well into factored form.
For example: (2^x-1)/x remez on [-0.5,0.5].
Code: Select all
0.00015403599537x^5 + 0.00133907760028x^4 + 0.00961823761463x^3 + 0.05550357326865x^2 + 0.24022649228573x + 0.69314718246460
"Decaying" coefficients allow accurate evaluation in Horner form.

Depends on what polynomials you are interested in. The ones I've been looking at lately are not like this at all.

2DaT · Post by **2DaT** » Tue Feb 12, 2019 6:11 pm

Z1202 wrote: ↑Tue Feb 12, 2019 1:00 pm
2DaT wrote: ↑Tue Feb 12, 2019 11:56 am Well, polynomials that we are interested in, usually have decaying coefficients, which does not translate well into factored form.
For example: (2^x-1)/x remez on [-0.5,0.5].
Code: Select all
0.00015403599537x^5 + 0.00133907760028x^4 + 0.00961823761463x^3 + 0.05550357326865x^2 + 0.24022649228573x + 0.69314718246460
"Decaying" coefficients allow accurate evaluation in Horner form.
The maximum difference between
((((0.00015403599537f*x + 0.00133907760028f)*x + 0.00961823761463f)*x
+ 0.05550357326865f)*x + 0.24022649228573f)*x + 0.69314718246460f
and 0.00015403599537f*(x+5.14278588728699f)*
( (x-2.522928813467399f)*x+32.9484316700106f )*
( (x+6.073420085609848f)*x+26.55645202416117f)
on [-0.5,0.5] is 2.4*10^-7 (evaluated at 10^6 different points with 32 bit float)

I did my own tests.

Code: Select all

Factored: 3.8990848138928413ulp
Horner: 0.76185148768126965ulp

The error is not dramatic, but not neligible either. And i have a feeling that it will be worse for higher orders.

mystran wrote: ↑Tue Feb 12, 2019 3:13 pm
2DaT wrote: ↑Tue Feb 12, 2019 11:56 am Well, polynomials that we are interested in, usually have decaying coefficients, which does not translate well into factored form.
For example: (2^x-1)/x remez on [-0.5,0.5].
Code: Select all
0.00015403599537x^5 + 0.00133907760028x^4 + 0.00961823761463x^3 + 0.05550357326865x^2 + 0.24022649228573x + 0.69314718246460
"Decaying" coefficients allow accurate evaluation in Horner form.
Depends on what polynomials you are interested in. The ones I've been looking at lately are not like this at all.

Worth a look. Author claims excellent numerical stability.
https://people.maths.ox.ac.uk/trefethen/barycentric.pdf
Used it once and it's accurate, but if monomial form is needed in the end, does not help at all.

mystran · Post by **mystran** » Tue Feb 12, 2019 8:36 pm

2DaT wrote: ↑Tue Feb 12, 2019 6:11 pm I did my own tests.
Code: Select all
Factored: 3.8990848138928413ulp
Horner: 0.76185148768126965ulp
The error is not dramatic, but not neligible either. And i have a feeling that it will be worse for higher orders.

I'm curious, did you solve the factors from the monomial, or build the monomial from the factors?

2DaT wrote: ↑Tue Feb 12, 2019 6:11 pm Worth a look. Author claims excellent numerical stability.
https://people.maths.ox.ac.uk/trefethen/barycentric.pdf
Used it once and it's accurate, but if monomial form is needed in the end, does not help at all.

Curiously enough, I actually had that document open in a tab already... but yeah, I'm aware.

juha_p · Post by **juha_p** » Tue Feb 12, 2019 9:02 pm

Is there any advantage to use hexadecimal floating-point constants ... looks like many does use it ?

https://www.exploringbinary.com/hexadec ... constants/

Example (atan)

Edit: g++ (printf(%.14a" ...)) gave these hex values from 2DaT's example :

Code: Select all

0x1.430972002bf370p-13 + // x^5
0x1.5f07fa00024be0p-10 +
0x1.3b2b9fffffc470p-7 +
0x1.c6af6dffffee50p-5 +
0x1.ebfbde00000380p-3 + //x
0x1.62e43000000040p-1

mystran · Post by **mystran** » Tue Feb 12, 2019 9:29 pm

juha_p wrote: ↑Tue Feb 12, 2019 9:02 pm Is there any advantage to use hexadecimal floating-point constants ... looks like many does use it ?

https://www.exploringbinary.com/hexadec ... constants/

Example (atan)

See the same site you linked: https://www.exploringbinary.com/number- ... nversions/

The short version is that as long as your decimal-form literal has enough digits (eg. 9 for floats and 17 for doubles, assuming the above site gets the math right, but the numbers sound about right), you'll get an exact binary to decimal to binary round-trip as long as neither your floating point printer or parser screws up. Whether they can be trusted is another story entirely, but if your original numbers come from a CAS and the parser is part of a mainstream compiler, then I wouldn't worry about it too much.

edit: This reddit article https://www.reddit.com/r/rust/comments/ ... d_correct/ contains some interesting notes on the challenges of actually parsing floats correctly. With a bit of Google you can find more articles on the subject as well.

Estrin's scheme vs. factored polynomials