Gallup about the BLITs and BLEPs you use

DSP, Plugin and Host development discussion.

Post

I was wondering if there is some consensus here about BLITs and BLEPs:

How long BLITs/BLEPs do you usually use?
How many "frames"(i.e. sub-sample locations) do you pre-generate?
Do you interpolate between these frames? (if you do, which method do you use?)
Do you prefer polyBLEPs/polyBLITs or traditional LUTs?

Post

Kraku wrote: Fri Apr 05, 2019 9:17 am I was wondering if there is some consensus here about BLITs and BLEPs:

How long BLITs/BLEPs do you usually use?
How many "frames"(i.e. sub-sample locations) do you pre-generate?
Do you interpolate between these frames? (if you do, which method do you use?)
Do you prefer polyBLEPs/polyBLITs or traditional LUTs?
There's probably no "consensus" but there are some technical considerations that might vary, depending on what you want to do with the BLEPs.

The choice of polyBLEPs vs. LUTs is largely a matter of the performance vs. quality trade-off you want to make. As the degree of a polynomial BLEP kernel increases, there's a point where it's faster to fetch a branch (or two, for lerp) from a LUT than to compute the values on the fly, and you may also run into numerical difficulties trying to integrate the polynomials analytically (or rather, trying to evaluate the resulting monomials).
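For reference, here's the classic two-sample polyBLEP residual in C++, just to show the general shape of the polynomial approach (this is the textbook formulation, not necessarily what anyone in this thread ships; the names are made up for the example):

```cpp
// The classic two-sample polyBLEP residual. `t` is the oscillator phase in
// [0,1), `dt` the per-sample phase increment; the returned correction is
// added to a naive sawtooth around each step discontinuity.
inline double polyBlepResidual(double t, double dt)
{
    if (t < dt) {                  // first sample after the discontinuity
        const double x = t / dt;
        return x + x - x * x - 1.0;
    }
    if (t > 1.0 - dt) {            // last sample before the discontinuity
        const double x = (t - 1.0) / dt;
        return x * x + x + x + 1.0;
    }
    return 0.0;                    // elsewhere: no correction needed
}
```

Longer and higher-order polynomial kernels follow the same pattern, but the per-transition evaluation cost grows quickly, which is where the LUT starts to win.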

As for length, I find that 32 samples is a pretty good value (at least as a ballpark figure) when the cutoff is between half and a quarter of the sampling rate. When the next module in the signal flow needs oversampling (e.g. a non-linear filter), it's usually faster to run the BLEP oscillators at the oversampled rate and lower the BLEP cutoff (i.e. to base-rate Nyquist), so that you can save an upsampling stage. Since the transition-bandwidth requirements for the BLEPs aren't as tight in this case, you don't necessarily need longer BLEPs even though lowering the cutoff widens the transition.
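To spell out the cutoff relation (illustrative only; the numbers are just the ballpark figures from above):

```cpp
// The oscillator runs at sampleRate * oversample, but the BLEP cutoff stays at
// the base rate's Nyquist. Normalized to the oversampled rate, the cutoff drops
// from 0.5 to 0.5 / oversample, while the kernel can stay around ~32 taps
// because the transition band is allowed to be much wider in this setup.
double blepCutoffHz(double baseSampleRate)     { return 0.5 * baseSampleRate; }
double blepCutoffNormalized(double oversample) { return 0.5 / oversample; }
```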

As for branches (or "frames" as you call them), around 8k typically works reasonably well without interpolation. If you do linear interpolation, then I feel the main limiting factor becomes the numerical integration error, especially for "higher order" BLEPs. For anti-aliasing cubics with BLEPs integrated using the trapezoidal rule, I find that about 1k branches is enough if you integrate half the kernel and handle the other half by symmetry. For just steps/ramps you can probably get away with less.
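A rough sketch of that table-building step, just to make the "integrate half, mirror the rest" idea concrete (all names invented; the Blackman window and Nyquist cutoff are arbitrary choices for the example, the real kernel design is up to you):

```cpp
#include <cmath>
#include <vector>

// Build a (residual) BLEP table by trapezoidal integration of a
// Blackman-windowed sinc BLIT kernel. Only the first half is integrated; the
// second half is filled in via the symmetry step(-t) = 1 - step(t).
std::vector<double> buildBlepResidual(int length /*taps*/, int frames /*branches*/)
{
    const int    n  = length * frames;   // total oversampled points, assumed even
    const double dt = 1.0 / frames;      // integration step, in output samples
    const double pi = 3.14159265358979323846;

    // 1) Sample the BLIT kernel: sinc with cutoff at Nyquist, Blackman-windowed.
    std::vector<double> blit(n + 1);
    for (int i = 0; i <= n; ++i) {
        double t = (i - n / 2) * dt;     // time in output samples, centred on 0
        double s = (t == 0.0) ? 1.0 : std::sin(pi * t) / (pi * t);
        double w = 0.42 - 0.5 * std::cos(2.0 * pi * i / n)
                        + 0.08 * std::cos(4.0 * pi * i / n);
        blit[i] = s * w;
    }

    // 2) Trapezoidal integration, but only over the first half.
    std::vector<double> step(n + 1, 0.0);
    for (int i = 1; i <= n / 2; ++i)
        step[i] = step[i - 1] + 0.5 * dt * (blit[i - 1] + blit[i]);

    // 3) Normalise so the centre lands exactly on 0.5, then mirror the second
    //    half using step(-t) = 1 - step(t).
    const double scale = 0.5 / step[n / 2];
    for (int i = 0; i <= n / 2; ++i) step[i] *= scale;
    for (int i = 1; i <= n / 2; ++i) step[n / 2 + i] = 1.0 - step[n / 2 - i];

    // 4) Subtract the naive (ideal) step to get the residual that actually
    //    gets added on top of the trivial waveform at each discontinuity.
    std::vector<double> residual(n + 1);
    for (int i = 0; i <= n; ++i)
        residual[i] = step[i] - (i < n / 2 ? 0.0 : 1.0);
    return residual;
}
```

Integrating only half and mirroring keeps the accumulated integration error from growing over the full kernel length, and guarantees the two halves meet exactly at the centre.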

That said... there are a number of trade-offs to be made and a lot of it depends on the details. The numbers above are basically such that if you're throwing significantly more CPU at the problem, then it's likely that some other issue (eg. some implementation detail) is limiting your quality.

If you go with the LUT approach, make sure you reorder the BLEP data in the LUT in such a way that for any given transition you only need to fetch the minimum number of cache lines (i.e. store each branch as its own contiguous block; incidentally, this also makes SIMD easier).
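In memory-layout terms that's just a polyphase reordering; something along these lines (illustrative C++, names invented, nearest-branch lookup only to keep it short):

```cpp
#include <cstddef>
#include <vector>

// Branch-major ("polyphase") BLEP LUT: all taps that belong to one sub-sample
// offset are stored contiguously, so applying a transition reads one small
// contiguous strip instead of strided samples scattered across the table.
struct BlepLut {
    int frames;                 // number of sub-sample positions (branches)
    int length;                 // BLEP length in output samples, e.g. 32
    std::vector<float> table;   // table[frame * length + k] = tap for output sample k

    // Accumulate the residual of one step of amplitude `amp` that occurred at
    // fractional offset `frac` (0 <= frac < 1) relative to out[0].
    void addStep(float* out, double frac, float amp) const {
        const int    frame  = (int)(frac * frames);
        const float* branch = &table[(std::size_t)frame * length];
        for (int k = 0; k < length; ++k)
            out[k] += amp * branch[k];
    }
};
```

With the usual interleaved ("time order") layout the same 32 taps sit `frames` floats apart, so one transition touches 32 separate cache lines instead of the 2-3 lines a contiguous branch needs.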

Post

PolyBLEPs don't scale well to long kernels (>2 samples). You need more evaluations and higher-degree polynomials, which leads to quadratic scaling (and more overlapping corrections to handle).
Short-length PolyBLEPs are great for SIMD though.
https://www.youtube.com/watch?v=cn-5k8fm_u0

Post

2DaT wrote: Sat Apr 06, 2019 11:41 am Short-length PolyBLEPs are great for SIMD though.
LUT BLEPs are pretty SIMD friendly as well, as long as you store the data in the right order.

Basically, if you have (logically, in terms of memory order) an array of BLEP branches, then you can fetch the desired branch (and the next one, when doing linear interpolation) directly as-is, with SIMD widths all the way up to the length of an individual branch. You need unaligned access to deal with the output buffer, but that's no different from PolyBLEPs.

If you store an extra branch or two, you can also guarantee that you always have the two branches required for lerp as long as the desired sample offset is within [0, 1+eps] for some small eps. That safely allows for slight floating-point inaccuracy when solving the offset (i.e. you don't need to worry about rounding) without having to check for it.
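Something like this, for a concrete picture (SSE intrinsics, branch-major layout with at least one guard branch stored past the last regular one; all names are invented for the example):

```cpp
#include <immintrin.h>

// Apply one transition from a branch-major BLEP LUT with linear interpolation
// between adjacent branches. Assumes `length` is a multiple of 4 and that the
// table has a guard branch at the end, so `b1` can always be read even when
// `frac` rounds up to (or slightly past) 1.0.
void addStepSSE(float* out, const float* table, int frames, int length,
                double frac, float amp)
{
    const double pos = frac * frames;        // fractional branch index
    const int    idx = (int)pos;             // lower branch
    const float  t   = (float)(pos - idx);   // lerp weight towards the next branch

    const float* b0 = table + idx * length;  // branch idx (contiguous)
    const float* b1 = b0 + length;           // branch idx+1 (guard covers the edge)

    const __m128 va = _mm_set1_ps(amp);
    const __m128 vt = _mm_set1_ps(t);
    for (int k = 0; k < length; k += 4) {
        __m128 x0 = _mm_loadu_ps(b0 + k);
        __m128 x1 = _mm_loadu_ps(b1 + k);
        // lerp between the two branches, scale by amplitude, accumulate
        __m128 x  = _mm_add_ps(x0, _mm_mul_ps(vt, _mm_sub_ps(x1, x0)));
        __m128 y  = _mm_loadu_ps(out + k);
        _mm_storeu_ps(out + k, _mm_add_ps(y, _mm_mul_ps(va, x)));
    }
}
```

The guard branch is what lets `b1` be read unconditionally even when `idx` lands on the last regular branch; the output buffer is handled with unaligned loads and stores, as noted above.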

Post

mystran wrote: Sat Apr 06, 2019 4:54 pm
2DaT wrote: Sat Apr 06, 2019 11:41 am Short-length PolyBLEPs are great for SIMD though.
LUT BLEPs are pretty SIMD friendly as well, as long as you store the data in the right order.
Right, but you need scalar code for everything else. With short PolyBLEPs you can get away with handling a transition every sample, because it's so cheap. Though it can get crazy with additional features such as PWM and/or hardsync, with lots of divisions and redundant calculations, pretty much killing the SIMD advantage, because scalar code has branch prediction and oscillators tend to be somewhat predictable.

Post

2DaT wrote: Sat Apr 06, 2019 8:24 pm Right, but you need scalar code for everything else. With short PolyBLEPs you can get away with handling a transition every sample, because it's so cheap. Though it can get crazy with additional features such as PWM and/or hardsync, with lots of divisions and redundant calculations, pretty much killing the SIMD advantage, because scalar code has branch prediction and oscillators tend to be somewhat predictable.
I would argue that "somewhat predictable" is really an understatement, since most samples don't need any transitions at all until the oscillator frequency starts to approach the sampling rate, unless your waveforms have a lot of small segments. Rather, I would assume that you generally take one entirely expected branch misprediction per transition.
