www. O S N E W S .com
News Features Interviews
Blog Contact Editorials
.
AMD Threadripper reviews and benchmarks
By Thom Holwerda on 2017-08-11 19:46:32

In this review we've covered several important topics surrounding CPUs with large numbers of cores: power, frequency, and the need to feed the beast. Running a CPU is like the inverse of a diet - you need to put all the data in to get any data out. The more pie that can be fed in, the better the utilization of what you have under the hood.

AMD and Intel take different approaches to this. We have a multi-die solution compared to a monolithic solution. We have core complexes and Infinity Fabric compared to a MoDe-X based mesh. We have unified memory access compared to non-uniform memory access. Both are going hard against frequency and both are battling against power consumption. AMD supports ECC and more PCIe lanes, while Intel provides a more complete chipset and specialist AVX-512 instructions. Both are competing in the high-end prosumer and workstation markets, promoting high-throughput multi-tasking scenarios as the key to unlocking the potential of their processors.

As always, AnandTech's the only review you'll need, but there's also the Ars review and the Tom's Hardware review.

I really want to build a Threadripper machine, even though I just built a very expensive (custom watercooling is pricey) new machine a few months ago, and honestly, I have no need for a processor like this - but the little kid in me loves the idea of two dies fused together, providing all this power. Let's hope this renewed emphasis on high core and thread counts pushes operating system engineers and application developers to make more and better use of all the threads they're given.

.
Comment by Licaon_Kter
By Licaon_Kter on 2017-08-11 20:39:14
...hopefully you can even compile something without the system melting ( like Ryzen does :( )
Permalink - Score: -4
.
Threads
By kwan_e on 2017-08-12 02:20:49
> make more and better use of all the threads they're given

More use? No. Better use? Yes. Programs should definitely be thread agnostic and thus structured (layered) for usage patterns like work-stealing queues etc.
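
The work-stealing pattern mentioned here can be sketched in a few lines. This is a hypothetical toy (class and names are mine, not from any real scheduler): each worker drains its own deque and, when it runs dry, steals from the opposite end of a random victim's deque.

```python
# Toy work-stealing pool: per-worker deques; a starved worker steals
# from the opposite end of another worker's queue. Illustration only.
import random
import threading
from collections import deque

class WorkStealingPool:
    def __init__(self, n_workers, tasks):
        self.n = n_workers
        self.queues = [deque() for _ in range(n_workers)]
        for i, task in enumerate(tasks):            # round-robin initial split
            self.queues[i % n_workers].append(task)
        self.results = []
        self.lock = threading.Lock()

    def _worker(self, idx):
        while True:
            task = None
            try:
                task = self.queues[idx].popleft()   # own queue first
            except IndexError:
                victims = [i for i in range(self.n) if i != idx]
                random.shuffle(victims)
                for v in victims:
                    try:
                        task = self.queues[v].pop() # steal from the far end
                        break
                    except IndexError:
                        continue
            if task is None:
                return                              # nothing left anywhere
            with self.lock:
                self.results.append(task())

    def run(self):
        threads = [threading.Thread(target=self._worker, args=(i,))
                   for i in range(self.n)]
        for t in threads: t.start()
        for t in threads: t.join()
        return self.results

pool = WorkStealingPool(4, [lambda x=x: x * x for x in range(10)])
print(sorted(pool.run()))  # squares of 0..9
```

The point of the layering: the tasks never know which thread runs them, which is exactly the thread-agnostic structure being argued for.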
Permalink - Score: 3
.
RE: Threads
By dpJudas on 2017-08-12 09:57:05
> More use? No. Better use? Yes. Programs should definitely be thread agnostic and thus structured (layered) for usage patterns like work-stealing queues etc.
Problem is that once NUMA enters the picture it becomes much more difficult to be thread agnostic. A generic threadpool doesn't know what memory accesses each work task is going to do, for example.
Permalink - Score: 3
.
RE[2]: Threads
By Alfman on 2017-08-12 14:55:08
dpJudas,

> Problem is that once NUMA enters the picture it becomes much more difficult to be thread agnostic. A generic threadpool doesn't know what memory accesses each work task is going to do, for example.

I agree, multithreaded code can quickly reach a point of diminishing returns (and even negative returns). NUMA overcomes those bottlenecks by isolating the different cores from each other's work, but then obviously not all threads can be equal, and code that assumes they are will be penalized. These are intrinsic limitations that cannot really be fixed in hardware, so personally I think we should be designing operating systems that treat NUMA nodes as clusters instead of as normal threads. And our software should be programmed to scale in clusters rather than merely with threads.

The benefit of the cluster approach is that software can scale with many more NUMA cores than pure multithreaded software. And without the shared memory constraints across the entire set of threads, we can potentially scale the same software with NUMA or additional computers on a network.
Permalink - Score: 4
.
RE[3]: Threads
By FortranMan on 2017-08-12 17:34:55
This is really why I still use MPI for parallel execution even when running on a single node. This approach also has the added benefit of scaling up to small computer clusters without much extra effort.

I mostly write engineering simulation codes though, so I'm pretty sure this does not make sense for entire classes of program.
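
For the curious, the shape of such a code - ranks split the domain, compute locally, and a root reduces - looks roughly like this. A real simulation would use MPI proper (e.g. MPI_Scatter/MPI_Reduce via mpi4py); this stdlib sketch only mimics the message-passing discipline on one node, and the function names are invented for illustration.

```python
# MPI-like rank decomposition sketched with the standard library:
# each "rank" computes a partial result on its own slice of the
# domain; the "root" reduces the partials. No shared state.
import multiprocessing as mp

def partial_sum(bounds):
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))    # rank-local work

def parallel_sum_squares(n, ranks=4):
    step = n // ranks
    # split the domain across ranks, like MPI_Scatter
    chunks = [(r * step, n if r == ranks - 1 else (r + 1) * step)
              for r in range(ranks)]
    with mp.Pool(ranks) as pool:
        partials = pool.map(partial_sum, chunks)
    return sum(partials)                        # root reduce, like MPI_Reduce

if __name__ == "__main__":
    print(parallel_sum_squares(1000))           # sum of squares below 1000
```

Because nothing is shared, the same decomposition scales from one socket to a small cluster just by swapping the transport - which is the "without much extra effort" benefit described above.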
Permalink - Score: 4
.
RE[4]: Threads
By Alfman on 2017-08-12 20:40:59
FortranMan,

> I mostly write engineering simulation codes though, so I'm pretty sure this does not make sense for entire classes of program.


Oh cool, I'd really like to learn more about that. I've played around with inverse kinematics software and written some code to experiment with fluid dynamics, but nothing sophisticated. I've long wanted to try building physical simulations with a GPGPU, though obviously it requires a different approach!
Permalink - Score: 2
.
RE[3]: Threads
By tylerdurden on 2017-08-13 00:04:50
It seems you want to get the worst of both worlds in order to not get the benefits of NUMA.

You can simply pin threads if you're that concerned with NUMA latencies. Otherwise let the scheduler/mem controller deal with it.
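
Pinning is indeed cheap to do by hand. A hedged sketch (Linux-only, via os.sched_setaffinity; the helper name is mine - other OSes need pthread_setaffinity_np or SetThreadAffinityMask equivalents):

```python
# Pin the calling process/thread to an explicit CPU set (Linux only).
import os

def pin_to_cores(cores):
    """Restrict the caller to the given set of CPU indices."""
    os.sched_setaffinity(0, set(cores))   # 0 = calling process/thread
    return os.sched_getaffinity(0)        # confirm the new mask

# e.g. keep this worker on two cores of "its" NUMA node:
available = os.sched_getaffinity(0)
target = set(list(available)[:2])
print(pin_to_cores(target) == target)     # True
os.sched_setaffinity(0, available)        # restore the original mask
```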

Edited 2017-08-13 00:06 UTC
Permalink - Score: 2
.
RE[4]: Threads
By Alfman on 2017-08-13 00:38:18
tylerdurden,

> It seems you want to get the worst of both worlds in order to not get the benefits of NUMA.

> You can simply pin threads if you're that concerned with NUMA latencies. Otherwise let the scheduler/mem controller deal with it.


That's the problem, the operating system scheduler/mem controller CAN'T deal with it.

If you had 32 cores divided into 8 NUMA clusters, the system would start incurring IPC overhead between NUMA clusters at just 5+ threads. You can keep adding more threads, but the system will be saturated with inter-NUMA IO.

To scale well, software must take the NUMA configuration into account. IMHO using separate processes is quite intuitive and allows the OS to effectively manage the NUMA threads. It also gives us the added benefit of distributing the software across computers on a network if we choose to. But obviously you can do it all manually, pinning threads to specific cores and developing your own custom NUMA-aware memory allocators, or you could allow the OS to distribute them by process; it achieves a similar result. Personally I'd opt for the multiprocess approach, but you can choose whatever way you want.
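
The multiprocess version of "treat NUMA nodes as a cluster" can be sketched like this: one worker process per node, results exchanged over a queue instead of a shared address space. The node layout and names here are invented for illustration; a real NUMA-aware build would additionally pin each process to its node's cores and use node-local allocation.

```python
# One process per (hypothetical) NUMA node; no shared memory, only
# message passing over a queue. Sketch only - node count is made up.
import multiprocessing as mp

def node_worker(node_id, data, out):
    # a real version would pin itself to node_id's cores here
    out.put((node_id, sum(data)))

def scatter_sum(data, nodes=2):
    out = mp.Queue()
    step = (len(data) + nodes - 1) // nodes
    procs = [mp.Process(target=node_worker,
                        args=(n, data[n * step:(n + 1) * step], out))
             for n in range(nodes)]
    for p in procs: p.start()
    total = sum(out.get()[1] for _ in procs)  # drain before joining
    for p in procs: p.join()
    return total

if __name__ == "__main__":
    print(scatter_sum(list(range(100)), nodes=4))  # 4950
```

Swap the queue for a socket and the same code spans machines, which is the network-distribution benefit claimed above.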

Edited 2017-08-13 00:40 UTC
Permalink - Score: 2
.
RE[5]: Threads
By tylerdurden on 2017-08-13 02:28:36
Sure. But remember, the whole point of NUMA is not to incur IPC overhead.

I think, if I'm correct, you're viewing threads as basically being at the process level. There, sure message passing makes sense, since you're not dealing with shared address spaces. But that's not what NUMA is trying to deal with.

You only have issues with NUMA when you have a very shitty memory mapping, when every core is referencing contents in another chip, but those pathological cases are rare.

Edited 2017-08-13 02:33 UTC
Permalink - Score: 2
.
RE[5]: Threads
By kwan_e on 2017-08-13 04:32:15
> It also gives us the added benefit of distributing the software across computers on a network if we choose to.

But trying to be too general in your approach will mean getting the worst of both worlds. If the software doesn't require such a thing, they shouldn't pay the cost of the underlying implementation.

> That's the problem, the operating system scheduler/mem controller CAN'T deal with it.
>
> [...]
>
> But obviously you can do it all manually, pinning threads to specific cores and developing your own custom NUMA-aware memory allocators, or you could allow the OS to distribute them by process; it achieves a similar result. Personally I'd opt for the multiprocess approach, but you can choose whatever way you want.


To me, that just means the OS should open up a way for a process to say "these bunch of threads/tasks/contexts should be clustered together" and the software can say "these work units are of type X" and the OS can schedule them appropriately. Something like Erlang's lightweight processes?

Edited 2017-08-13 04:33 UTC
Permalink - Score: 2


.
© OSNews LLC 1997-2007. All Rights Reserved.
The readers' comments are owned and a responsibility of whoever posted them.