Intel Talks Up Xeon Cascade Lake Performance - Up To 48 Cores, 12 DDR4 Channels
Originally posted by duby229
I never said synthetics aren't useful; in fact, synthetics that -identify- bottlenecks are specifically useful to application developers. In the case of new instruction set extensions, synthetics should be written in two versions, one using the older set of extensions and a second using the new set. It's -only- in this way that bottlenecks between the two can be identified. In this case a good synthetic would be a subset of some actual load.
Let me use another example where you'll hopefully get it. Suppose Joe uses an app that needs a fast population count. The devs of the app profiled it to hell and optimized it as much as they could, but we're in an era where the popcnt instruction doesn't exist yet for x86 CPUs.
Intel shows a new CPU with popcnt benchmarks, blowing away anything that used software implementations to count bits before.
If you use Joe's software right now you won't notice much of a difference since the app doesn't use the new instruction. But the dev of that app now knows to make use of the new instruction and gain a massive speed boost.
So the benchmarks show the dev that YES, IT IS WORTHWHILE to convert to the new instructions and show just how fast the CPU can be (i.e. potential).
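A minimal sketch of the kind of two-version synthetic being argued about here, in Python purely for illustration. It runs the same random 64-bit inputs through a classic software bit-counting routine (the SWAR trick, one of the approaches used before a hardware instruction existed) and through a stand-in "new" path, checks that they agree, and times both. The names `popcount_swar`, `popcount_builtin`, `time_it`, and `compare` are hypothetical, and `popcount_builtin` is only a placeholder: Python does not expose the popcnt instruction directly, so the timings here say nothing about real hardware.

```python
import random
import time

MASK64 = 0xFFFFFFFFFFFFFFFF


def popcount_swar(x):
    """Software population count of a 64-bit value (classic SWAR bit trick)."""
    x &= MASK64
    x = x - ((x >> 1) & 0x5555555555555555)                          # 2-bit sums
    x = (x & 0x3333333333333333) + ((x >> 2) & 0x3333333333333333)   # 4-bit sums
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0F                          # 8-bit sums
    return ((x * 0x0101010101010101) & MASK64) >> 56                 # add the bytes


def popcount_builtin(x):
    """Stand-in for the 'new instruction' path (assumption: not real popcnt)."""
    return bin(x & MASK64).count("1")


def time_it(func, values):
    """Return (sum of results, elapsed seconds) for func over values."""
    start = time.perf_counter()
    total = 0
    for v in values:
        total += func(v)
    return total, time.perf_counter() - start


def compare(n=10000, seed=0):
    """Run both versions on identical inputs and return (old_secs, new_secs)."""
    rng = random.Random(seed)
    values = [rng.getrandbits(64) for _ in range(n)]
    old_total, old_secs = time_it(popcount_swar, values)
    new_total, new_secs = time_it(popcount_builtin, values)
    assert old_total == new_total  # both versions must count the same bits
    return old_secs, new_secs
```

Calling `compare()` gives two timings for the same workload; the ratio between them is the sort of number a vendor benchmark would headline, and the sort a developer would want to reproduce on their own code before converting.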
-
Originally posted by Weasel
Man, I've no idea what you're talking about. This isn't about bottlenecks or profiling. This is about benchmarking the potential of the CPU, not of an app.
You keep talking about the "Potential" of your CPU and that actual programs don't matter at all. And even that it's not about application developers.... What you have been talking about are programs made to give the end user a "feelz gud" moment... And that's in fact all you're getting. This potential you keep speaking of is entirely a figment of your imagination.
Last edited by duby229; 09 November 2018, 05:03 PM.
-
Originally posted by duby229
Dumbass, CPUs -ONLY- run programs.... If the applications that people actually use don't use newer instructions then it doesn't matter. The -ONLY- way for a developer to know whether an instruction set extension can benefit his app is to benchmark it himself. And that's why good synthetics are actual subsets of actual programs.
There's a reason GPUs (and in a sense, CPUs, with SIMD) are measured in FLOPS, not based on some piece of shit "typical workload" of some John Average Doe Jr.
Most (all?) of the stats shown on a CPU or GPU are the absolute max you can get. In effect, this tells you the TRUE POTENTIAL of that CPU or GPU. You can, of course, run shit software on it, or software not optimized for it, but don't expect anywhere near that.
Let's say you care about memory transfer performance. Your memory is rated (evil synthetic benchmark) at 20 GB/s. You have programs that copy at 10 GB/s. Clearly, your fucking "typical workload" is not optimal. And if you were to optimize your code and reach, say, 19.2 GB/s, you'd know it's almost at the limit, so knowing the POTENTIAL tells you WHEN TO STOP.
Otherwise, how the FUCK would you know when the performance is "optimal enough"?!? It doesn't matter how many times you profile, you need to know the POTENTIAL of the CPU so that you know how close you are to it.
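The arithmetic behind that argument can be sketched as follows, again in Python as an illustration only. `fraction_of_peak` turns a measured throughput and a rated peak into a ratio, `close_enough` applies a cutoff, and `measure_copy_gbps` times a plain in-memory copy. All three names and the 0.95 cutoff are assumptions made for this example; the 20, 10, and 19.2 GB/s figures are the ones from the post. A copy timed from Python will not reach a machine's rated bandwidth, so treat the measurement helper as a shape, not a real benchmark.

```python
import time


def fraction_of_peak(measured_gbps, peak_gbps):
    """How much of the rated peak a measured throughput achieves (0.0 to 1.0)."""
    if peak_gbps <= 0:
        raise ValueError("peak_gbps must be positive")
    return measured_gbps / peak_gbps


def close_enough(measured_gbps, peak_gbps, threshold=0.95):
    """True when further optimization has little headroom left.

    The 0.95 threshold is a hypothetical cutoff chosen for this sketch.
    """
    return fraction_of_peak(measured_gbps, peak_gbps) >= threshold


def measure_copy_gbps(size_bytes=64 * 1024 * 1024, repeats=5):
    """Best observed throughput, in GB/s, of copying a buffer of size_bytes."""
    src = bytearray(size_bytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst = bytes(src)  # one full copy of the buffer
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    del dst
    return size_bytes / max(best, 1e-12) / 1e9  # guard against a zero timing


# Figures from the post: rated 20 GB/s, current code 10 GB/s, tuned code 19.2 GB/s.
RATED = 20.0
print(fraction_of_peak(10.0, RATED))   # half of the rated peak
print(close_enough(10.0, RATED))       # still plenty of headroom
print(close_enough(19.2, RATED))       # roughly 96% of peak, past the cutoff
```

Without the rated figure there is nothing to divide by, which is the point being made: profiling alone gives the numerator but not the denominator.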