Hyperthreading in DAW - ON or OFF?

Wavetone wrote:
Of course HT off, who needs it?... extra cores, pff, don't need at all... Some dumb useless marketing feature probably. More RAM is what counts, you can never get enough, I get mine at https://downloadmoreram.com Always fast, reliable and my daw has never been so happy.

That reads like a sales pitch
- Banned
- Topic Starter
- 11467 posts since 4 Jan, 2017 from Warsaw, Poland
- KVRAF
- 23102 posts since 7 Jan, 2009 from Croatia
antic604 wrote:
If for example we had only 1 physical core to process everything - DSP processing, GUI, antivirus, I/O, etc. - then it would have 100% of capacity available. If we'd split that into 2 logical cores, each with 50% of power, but one of them would be dedicated exclusively to antivirus & I/O and the other to the DAW's DSP and GUI, then we'd end up with limited DAW performance and a lot of "wasted" performance for system stuff.

This is not how HT works. Halving one physical core into two logical ones doesn't "halve their capacity".
- Banned
- Topic Starter
- 11467 posts since 4 Jan, 2017 from Warsaw, Poland
EvilDragon wrote:
This is not how HT works. Halving one physical core into two logical ones doesn't "halve their capacity".

So, how does it work, then? Are you suggesting that somehow two logical cores combined have more "power" than one physical core on which they're running virtually? Or do you mean the opposite? But that I've already mentioned when referring to scheduling overhead.

Or are you referring to my assumption that Hyperthreading "splits" the core in half, when in fact the split can be arbitrary - say, 1 logical core gets 80% of the clock and the other gets 20% - or that it's maybe even a dynamic process?

If you bother to reply, please don't be vague - we're all learning here, at least that was my goal
-
- KVRAF
- 3508 posts since 12 May, 2011
antic604 wrote:
That reads like a sales pitch

I went to the site and downloaded 2096 more gigabytes of RAM and my machine is now 500% faster!
- KVRAF
- 8828 posts since 6 Jan, 2017 from Outer Space
antic604 wrote:
If you bother to reply, please don't be vague - we're all learning here, at least that was my goal

I found a decent explanation on Wikipedia. Hyperthreading does help in certain circumstances: the core is not doubled, but the registers are. Especially when you have more threads than cores (always true on 2- or 4-core machines), the scheduler has to slip in other threads all the time. I also read that very demanding processes run better without hyperthreading. It's an optimisation technique at the hardware level. It has to be supported by the operating system, which is the case for all current operating systems that run audio software...
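The "more threads than cores" point can be made concrete with a toy throughput model. All numbers below are illustrative assumptions, not measurements: a physical core counts as 1.0 units of throughput, and SMT is assumed to add roughly 30% aggregate uplift per core (a commonly cited ballpark):

```python
# Toy model: N equally busy threads sharing a fixed aggregate throughput
# under a fair scheduler. All numbers are illustrative assumptions.

def wall_time(work_per_thread, n_threads, aggregate_throughput):
    """Time until all threads finish, assuming fair time-slicing."""
    return work_per_thread * n_threads / aggregate_throughput

# 8 runnable threads, 10 units of work each, on 4 physical cores:
ht_off = wall_time(10, 8, 4.0)   # HT off: throughput 4 x 1.0 -> 20.0
ht_on = wall_time(10, 8, 5.2)    # HT on: assumed ~30% uplift, 4 x 1.3 -> ~15.4

print(f"HT off: {ht_off:.1f}  HT on: {ht_on:.1f}")
```

In this model HT never doubles throughput; it only helps when there are more runnable threads than physical cores, which matches the Wikipedia summary above.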
-
- KVRAF
- 4420 posts since 13 Jul, 2004 from Earth
antic604 wrote:
D-Fusion wrote:
I have tested it

This thread is not here to compare Reason to other DAWs, because it's common knowledge that the current VST implementation in it is inferior to anything else in terms of performance. So I'm not interested in whether Cubase is better than Reason, or that you haven't tested it with hyperthreading off because "it works so well". I'm exactly interested in you turning it off and reporting whether it works even better

I did some more testing here now and I get fewer VSTs running in Cubase with SMT off (AMD's equivalent of hyperthreading).
So, after some more testing: Cubase works better with HT/SMT enabled.

With SMT off I can only play 6 instances of Korg Odyssey with preset No.30 at once.

With it turned on I can play 12 of them, or 11 plus many other VSTs at the same time.
-
- KVRian
- 1262 posts since 15 May, 2002 from Finland
So let's necro this: I benchmarked now with Bitwig v3 beta 4, and hyperthreading made a huge positive difference, at least on my test setup. Curiously, turbo boost lowered performance.
- Banned
- Topic Starter
- 11467 posts since 4 Jan, 2017 from Warsaw, Poland
More info would be welcome: what does "huge" mean, what is your setup (at least the CPU and OS), and what happens when turbo boost is enabled (I'm guessing it thermally throttles below the nominal clock)?
-
- KVRian
- 1262 posts since 15 May, 2002 from Finland
i7 6700, but in a mini form factor EliteDesk G2 65W chassis, so not the most effective cooler. I could run about twice the number of device-heavy groups with HT on. Turbo boost does not throttle below the base clock; it's a BIOS setting that cannot be changed elsewhere, and it allows one or more cores to run above the base frequency. How much is very situation-specific and affected by the thermal situation. So it is possible that with turbo boost on, my CPU tried to increase frequency but ended up clocking down when too much heat accumulated, which caused the lowered performance. I'm not sure, but I got around 15% lower average DSP loads with turbo boost off.
The situation with HT might have been that no single core could run two instances of that heavy group on its own, so HT made it possible to use the excess capacity of other cores. In my earlier tests with big projects, which typically have lots of tracks that individually are not so resource-hungry, HT seemed to increase DSP loads a bit, but that was only on one operating system. Now I tried with Win 7, Win 10 and Ubuntu Studio 19.04. My test project was this: https://www.dropbox.com/s/cdgixc1bkhlxe ... oject?dl=0

So there are two heavy groups by default that I tried to fill with lots of different practical configurations of devices and modulators. I did the test in beta 3 and beta 4; beta 4 was also around 25% more effective, so there's stuff happening with the parallel code. I will do the tests again when v3 launches for good.
-
AdvancedFollower
- KVRian
- 1234 posts since 8 May, 2018 from Sweden
antic604 wrote: ↑Wed Mar 14, 2018 8:11 am
So, how does it work, then? Are you suggesting that somehow two logical cores combined have more "power" than one physical core on which they're running virtually? [...] Or are you referring to my assumption that Hyperthreading "splits" the core in half, when in fact it can be arbitrary, like 1 logical core gets 80% of the clock and the other gets 20%; or that it's maybe even a dynamic process?

The idea behind SMT (called Hyperthreading by Intel) is to share unused resources of one physical core among two logical cores. Modern CPUs have multiple execution resources and are in fact over-provisioned most of the time. A single instruction making its way through the CPU core's pipeline usually won't fully utilize the resources at every stage, and may also be bottlenecked at certain stages while other stages of the pipeline sit idle. So by presenting the core as two virtual cores to the OS, the pipeline can be fed more efficiently and fewer execution resources stay idle.
A CPU core already tries to extract instruction-level parallelism to keep multiple operations "in flight" at the same time, but this is much easier and more effective when executing two independent threads, where an operation in one thread doesn't depend on the outcome of a calculation in the other thread (unlike operations within a single thread, which often depend on the results of previous operations).
In some cases, the two logical cores might end up competing for the same execution resource, but it's almost always a net performance gain overall. CPU schedulers are incredibly complex and can usually avoid those situations. The allocation of execution resources constantly changes, it's not a static 50/50 split or anything like that. In modern CPUs, some resources have also been duplicated for the purpose of SMT.
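The dependency-chain point above can be sketched in code. This is an illustrative Python example (pure Python won't show the hardware speedup, since interpreter overhead dominates); the point is the structure of the data dependencies, which is what lets a core overlap independent work:

```python
# serial_sum forms one long dependency chain: every addition needs the
# previous result before it can start. split_sum keeps 4 independent
# chains that hardware (or two SMT threads) can progress concurrently,
# combining them only at the end. Both compute the same value.

def serial_sum(xs):
    total = 0
    for x in xs:
        total += x           # depends on the previous iteration's total
    return total

def split_sum(xs, lanes=4):
    acc = [0] * lanes
    for i, x in enumerate(xs):
        acc[i % lanes] += x  # each lane only depends on itself
    return sum(acc)

data = list(range(1000))
print(serial_sum(data), split_sum(data))  # both print 499500
```

The same trick is why compilers and SIMD DSP code unroll loops into multiple accumulators: independent chains keep the execution units busy.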
- Banned
- Topic Starter
- 11467 posts since 4 Jan, 2017 from Warsaw, Poland
AdvancedFollower wrote: ↑Tue Jun 04, 2019 12:18 pm
The idea behind SMT (called Hyperthreading by Intel) is to share unused resources of one physical core among two logical cores. [...]

Thank you - that was very educational!
Are there any (easy to grasp) documents about pros & cons of HT/SMT, in particular researching the benefits in real-life scenarios? I thought it's typically in 5-20% range, based on what I've seen from asynchronous GPU compute (which is a similar idea, I think?), but some people here claim to have almost twice the performance?
-
- KVRian
- 1262 posts since 15 May, 2002 from Finland
I think it might have to do with the fact that the FX Layers were very heavy: maybe if one core could not quite run two of them on its own, having multiple hardware threads evened out the load so that the excess capacity was better utilised.
The project was this one; I tried to create a mix of various complex modulations using many different devices and modulators: https://www.dropbox.com/s/cdgixc1bkhlxe ... oject?dl=0 (this was made in beta 3 of Bitwig, btw)

By default there are two tracks; by seeing how many you can run before crackling starts, and watching the average load, it's possible to benchmark different operating systems and system setups.
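The "crackling starts" criterion has a simple arithmetic basis: each audio callback must render a full buffer before the hardware drains the previous one, so the budget per callback is buffer_size / sample_rate seconds. A small sketch; the per-track cost of 0.5 ms is a made-up illustrative number:

```python
# Real-time audio budget: a callback must render buffer_size samples
# in at most buffer_size / sample_rate seconds, or the output crackles.

def deadline_ms(buffer_size, sample_rate):
    return 1000.0 * buffer_size / sample_rate

def max_tracks(per_track_ms, buffer_size, sample_rate):
    """Tracks one core fits in a callback (toy model: strictly serial DSP)."""
    return int(deadline_ms(buffer_size, sample_rate) // per_track_ms)

budget = deadline_ms(256, 44100)  # ~5.8 ms per callback
print(f"{budget:.1f} ms budget, {max_tracks(0.5, 256, 44100)} tracks at 0.5 ms each")
```

Real DAWs spread tracks across cores, so HT on/off changes how much of that budget each track effectively gets; the single-core model here is only to show why "tracks before crackling" is a usable proxy for DSP headroom.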
- KVRist
- 282 posts since 24 Aug, 2017
Old thread (I was searching for something), but HT does help. The reason is that cores often spend time waiting for other things to finish (fetches, writes, etc.), and by splitting a core into 2 logical cores the hardware can continue with another thread while one is waiting. Therefore HT should almost always be turned on. I would maybe only turn it off on an appliance handling heavy real-time loads.
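The "continue with another thread while waiting" idea can be imitated at the OS level. This is an analogy, not SMT itself: hardware SMT overlaps stalls at a much finer grain, but the arithmetic is the same. Here two 0.2 s stalls overlap instead of running back to back:

```python
# Two threads that are purely waiting (time.sleep stands in for a stall
# on memory or I/O). Overlapped, both finish in ~0.2 s rather than ~0.4 s,
# just as SMT lets a core progress thread B while thread A is stalled.

import threading
import time

def stalled_task():
    time.sleep(0.2)  # stand-in for a stall; no CPU work happens here

start = time.perf_counter()
workers = [threading.Thread(target=stalled_task) for _ in range(2)]
for t in workers:
    t.start()
for t in workers:
    t.join()
elapsed = time.perf_counter() - start

print(f"overlapped stalls took {elapsed:.2f} s")  # ~0.2 s, not ~0.4 s
```

Note this only wins when threads actually stall; two threads of pure back-to-back computation competing for the same execution units is the case where SMT gains shrink, as discussed above.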
- KVRist
- 282 posts since 24 Aug, 2017
antic604 wrote: ↑Wed Mar 14, 2018 8:11 am
Are you suggesting that somehow two logical cores combined have more "power" than one physical core on which they're running virtually? [...]

This is exactly the case, see my previous post. 1+1=2.5 in this case.