Yes, as I wrote, "far less efficient methods have to be employed to split the work up between multiple threads." What you're calling a spinlock is a loop that constantly checks whether a flag has been set, which pegs a CPU core at 100% while it waits. The end result is that, while the work will be split between multiple processors, the overall CPU use will be higher - potentially a lot higher, depending on the situation, as you explained.
However, there is another issue. Take a typical configuration - an oscillator into a filter into a VCA into a system output. First the oscillator has to generate a signal, then the filter can filter it, then the VCA can attenuate it, and then the output can transmit the signal to the sound card. You cannot perform these operations in any other order, because each step depends on the previous step. So, consider the multi-processor scenario: one processor handles the oscillator, one handles the filter, and one handles the VCA. The only way to process these three modules simultaneously is to introduce latency. If the VCA processes the previous output from the filter, and the filter processes the previous output from the oscillator, then we've solved the simultaneous processing problem, but we've introduced two samples of latency. Every time a signal crosses a processor boundary, another sample of latency gets introduced. A large, complex patch could theoretically accumulate a lot of latency this way.
That's the price that has to be paid for any form of multiprocessor processing in a virtual modular synthesizer: added latency and higher overall CPU use are inescapable. What do you all think? Would it be worth it?
- Dan @ Cherry Audio