Estrin's scheme vs. factored polynomials

Z1202
 KVRian
 1059 posts since 12 Apr, 2002
Estrin's scheme vs. factored polynomials
On modern computers it's recommended to use Estrin's scheme rather than Horner method due to the parallelization options of the former. I was wondering why instead we don't use a factored form of the polynomials, which should be almost freely parallelizable? At least when the polynomials are known at compile time

2DaT
 KVRist
 380 posts since 21 Jun, 2013
Re: Estrin's scheme vs. factored polynomials
Evaluation of factored polynomials is illconditioned, esp. for high order polynomials. OTOH, Horner form is wellconditioned (at least for good polynomials), and it has errorcancelling properties. Estrin's scheme also has some errorcancelling properties, and is suitable for accurate evaluation.
Also, Fusedmultiplyadd operations (FMA) can't be used on a factored form.
Also, Fusedmultiplyadd operations (FMA) can't be used on a factored form.

vortico
 KVRist
 283 posts since 19 Jul, 2008
Re: Estrin's scheme vs. factored polynomials
Not related to your question, but Estrins scheme only makes sense when you aren't already evaluating 4 or 8 floats in parallel (4/8 note polyphony for example). In that case, Horner's method is better since it uses fewer arithmetic operations.
VCV Rack opensource virtual modular synthesizer

Z1202
 KVRian
 1059 posts since 12 Apr, 2002
Re: Estrin's scheme vs. factored polynomials
Hmmm, googling around I found that conversion from polynomial coefficients to the roots (and maybe vice versa) could be illconditioned. But we are not interested in the polynomial coefficients or the roots per se, we are interested in the value of the polynomial itself. In that case illconditioning of the polynomial value in respect to the coefficient values would mean that Horner or Estrin's scheme would be equally illconditioned. As for the factored form, I'm not sure if there are any cases when the polynomial value is illconditioned with respect to the roots, at least more than it would be in the Horner scheme.
Would you have any further pointers?
PS. As for FMA, that's a good point, although in practice we could (or often will have to) factor into 2nd order polynomials, which would partially allow some FMAs.

2DaT
 KVRist
 380 posts since 21 Jun, 2013
Re: Estrin's scheme vs. factored polynomials
Well, polynomials that we are interested in, usually have decaying coefficients, which does not translate well into factored form.Z1202 wrote: ↑Tue Feb 12, 2019 2:18 amHmmm, googling around I found that conversion from polynomial coefficients to the roots (and maybe vice versa) could be illconditioned. But we are not interested in the polynomial coefficients or the roots per se, we are interested in the value of the polynomial itself. In that case illconditioning of the polynomial value in respect to the coefficient values would mean that Horner or Estrin's scheme would be equally illconditioned. As for the factored form, I'm not sure if there are any cases when the polynomial value is illconditioned with respect to the roots, at least more than it would be in the Horner scheme.
Would you have any further pointers?
PS. As for FMA, that's a good point, although in practice we could (or often will have to) factor into 2nd order polynomials, which would partially allow some FMAs.
For example: (2^x1)/x remez on [0.5,0.5].
Code: Select all
0.00015403599537x^5 + 0.00133907760028x^4 + 0.00961823761463x^3 + 0.05550357326865x^2 + 0.24022649228573x + 0.69314718246460

2DaT
 KVRist
 380 posts since 21 Jun, 2013
Re: Estrin's scheme vs. factored polynomials
Long dependency chains constrain outoforder execution. In Horner's scheme every operation is dependent on previous, but in Estrin's scheme there's less dependency. Of course performance depends on the surrounding code, but in practice Estrin's scheme wins for orders >5.

Z1202
 KVRian
 1059 posts since 12 Apr, 2002
Re: Estrin's scheme vs. factored polynomials
The maximum difference between2DaT wrote: ↑Tue Feb 12, 2019 3:56 amWell, polynomials that we are interested in, usually have decaying coefficients, which does not translate well into factored form.
For example: (2^x1)/x remez on [0.5,0.5]."Decaying" coefficients allow accurate evaluation in Horner form.Code: Select all
0.00015403599537x^5 + 0.00133907760028x^4 + 0.00961823761463x^3 + 0.05550357326865x^2 + 0.24022649228573x + 0.69314718246460
((((0.00015403599537f*x + 0.00133907760028f)*x + 0.00961823761463f)*x
+ 0.05550357326865f)*x + 0.24022649228573f)*x + 0.69314718246460f
and 0.00015403599537f*(x+5.14278588728699f)*
( (x2.522928813467399f)*x+32.9484316700106f )*
( (x+6.073420085609848f)*x+26.55645202416117f)
on [0.5,0.5] is 2.4*10^7 (evaluated at 10^6 different points with 32 bit float)

mystran
 KVRAF
 5325 posts since 12 Feb, 2006 from Helsinki, Finland
Re: Estrin's scheme vs. factored polynomials
If you'd like Signaldust to return, please ask Katinka Tuisku to resign.

Z1202
 KVRian
 1059 posts since 12 Apr, 2002
Re: Estrin's scheme vs. factored polynomials
TLDR, but, as far as I understand, this is exactly about illconditioning of the roots in respect to the coefficients, which is not what we really care about (because, IIUC, the very same illconditioning will illcondition the polynomial values evaluated by the Horner scheme in a comparable amount). Or did I miss any important part while glancing through?
Last edited by Z1202 on Tue Feb 12, 2019 6:48 am, edited 1 time in total.

Z1202
 KVRian
 1059 posts since 12 Apr, 2002
Re: Estrin's scheme vs. factored polynomials
posted by mistake, delete

mystran
 KVRAF
 5325 posts since 12 Feb, 2006 from Helsinki, Finland
Re: Estrin's scheme vs. factored polynomials
Depends on what polynomials you are interested in. The ones I've been looking at lately are not like this at all.2DaT wrote: ↑Tue Feb 12, 2019 3:56 amWell, polynomials that we are interested in, usually have decaying coefficients, which does not translate well into factored form.
For example: (2^x1)/x remez on [0.5,0.5]."Decaying" coefficients allow accurate evaluation in Horner form.Code: Select all
0.00015403599537x^5 + 0.00133907760028x^4 + 0.00961823761463x^3 + 0.05550357326865x^2 + 0.24022649228573x + 0.69314718246460
If you'd like Signaldust to return, please ask Katinka Tuisku to resign.

2DaT
 KVRist
 380 posts since 21 Jun, 2013
Re: Estrin's scheme vs. factored polynomials
I did my own tests.Z1202 wrote: ↑Tue Feb 12, 2019 5:00 amThe maximum difference between2DaT wrote: ↑Tue Feb 12, 2019 3:56 amWell, polynomials that we are interested in, usually have decaying coefficients, which does not translate well into factored form.
For example: (2^x1)/x remez on [0.5,0.5]."Decaying" coefficients allow accurate evaluation in Horner form.Code: Select all
0.00015403599537x^5 + 0.00133907760028x^4 + 0.00961823761463x^3 + 0.05550357326865x^2 + 0.24022649228573x + 0.69314718246460
((((0.00015403599537f*x + 0.00133907760028f)*x + 0.00961823761463f)*x
+ 0.05550357326865f)*x + 0.24022649228573f)*x + 0.69314718246460f
and 0.00015403599537f*(x+5.14278588728699f)*
( (x2.522928813467399f)*x+32.9484316700106f )*
( (x+6.073420085609848f)*x+26.55645202416117f)
on [0.5,0.5] is 2.4*10^7 (evaluated at 10^6 different points with 32 bit float)
Code: Select all
Factored: 3.8990848138928413ulp
Horner: 0.76185148768126965ulp
Worth a look. Author claims excellent numerical stability.mystran wrote: ↑Tue Feb 12, 2019 7:13 amDepends on what polynomials you are interested in. The ones I've been looking at lately are not like this at all.2DaT wrote: ↑Tue Feb 12, 2019 3:56 amWell, polynomials that we are interested in, usually have decaying coefficients, which does not translate well into factored form.
For example: (2^x1)/x remez on [0.5,0.5]."Decaying" coefficients allow accurate evaluation in Horner form.Code: Select all
0.00015403599537x^5 + 0.00133907760028x^4 + 0.00961823761463x^3 + 0.05550357326865x^2 + 0.24022649228573x + 0.69314718246460
https://people.maths.ox.ac.uk/trefethen/barycentric.pdf
Used it once and it's accurate, but if monomial form is needed in the end, does not help at all.

mystran
 KVRAF
 5325 posts since 12 Feb, 2006 from Helsinki, Finland
Re: Estrin's scheme vs. factored polynomials
I'm curious, did you solve the factors from the monomial, or build the monomial from the factors?2DaT wrote: ↑Tue Feb 12, 2019 10:11 amI did my own tests.The error is not dramatic, but not neligible either. And i have a feeling that it will be worse for higher orders.Code: Select all
Factored: 3.8990848138928413ulp Horner: 0.76185148768126965ulp
Curiously enough, I actually had that document open in a tab already... but yeah, I'm aware.2DaT wrote: ↑Tue Feb 12, 2019 10:11 amWorth a look. Author claims excellent numerical stability.
https://people.maths.ox.ac.uk/trefethen/barycentric.pdf
Used it once and it's accurate, but if monomial form is needed in the end, does not help at all.
If you'd like Signaldust to return, please ask Katinka Tuisku to resign.

juha_p
 KVRian
 550 posts since 21 Feb, 2006 from FI
Re: Estrin's scheme vs. factored polynomials
Is there any advantage to use hexadecimal floatingpoint constants ... looks like many does use it ?
https://www.exploringbinary.com/hexadec ... constants/
Example (atan)
Edit: g++ (printf(%.14a" ...)) gave these hex values from 2DaT's example :
https://www.exploringbinary.com/hexadec ... constants/
Example (atan)
Edit: g++ (printf(%.14a" ...)) gave these hex values from 2DaT's example :
Code: Select all
0x1.430972002bf370p13 + // x^5
0x1.5f07fa00024be0p10 +
0x1.3b2b9fffffc470p7 +
0x1.c6af6dffffee50p5 +
0x1.ebfbde00000380p3 + //x
0x1.62e43000000040p1
Last edited by juha_p on Tue Feb 12, 2019 2:39 pm, edited 2 times in total.

mystran
 KVRAF
 5325 posts since 12 Feb, 2006 from Helsinki, Finland
Re: Estrin's scheme vs. factored polynomials
See the same site you linked: https://www.exploringbinary.com/number ... nversions/juha_p wrote: ↑Tue Feb 12, 2019 1:02 pmIs there any advantage to use hexadecimal floatingpoint constants ... looks like many does use it ?
https://www.exploringbinary.com/hexadec ... constants/
Example (atan)
The short version is that as long as your decimalform literal has enough digits (eg. 9 for floats and 17 for doubles, assuming the above site gets the math right, but the numbers sound about right), you'll get an exact binary to decimal to binary roundtrip as long as neither your floating point printer or parser screws up. Whether they can be trusted is another story entirely, but if your original numbers come from a CAS and the parser is part of a mainstream compiler, then I wouldn't worry about it too much.
edit: This reddit article https://www.reddit.com/r/rust/comments/ ... d_correct/ contains some interesting notes on the challenges of actually parsing floats correctly. With a bit of Google you can find more articles on the subject as well.
If you'd like Signaldust to return, please ask Katinka Tuisku to resign.