Intel CR 23.35.27191.9 Released As A Big Update To Their Open-Source GPU Compute Stack
Michael
Sting operation trapped Phoronix-forum exploit hackers.
The hackers who used Firefox 119 exploits to install a Trojan horse on my computer were trapped by a military-intelligence sting operation: an undercover agent paid this group of hackers to target me and use the exploit to install a Trojan horse on my computer.
One of the forum members involved is "Sophisticles"; many more are involved.
All the attackers will go down.
SWAT police raid upcoming.
The attackers used fraudulent benchmarks, manipulated in favor of Intel, and posted them in AMD Threadripper 7000 threads on the phoronix.com forum. The link went to a web server that had been hacked by the attackers, and if a targeted individual visited that website, the exploit was executed. That web server was https://www.pugetsystems.com/
Phantom circuit Sequence Reducer Dyslexia
Originally posted by TemplarGR View Post
Basically, the whole of your reply is a snarky attempt to prove to everyone on this forum that you are completely ignorant about coding....
Pot calling the kettle black.
Originally posted by TemplarGR View Post
You are confusing gpu compute fp64 with cpu fp64.....
Not confusing, but relating.
Originally posted by TemplarGR View Post
Yes, cpus could calculate at fp64 since forever....
In x87, sure. However, Intel made conscious decisions to incorporate it into SSE and AVX. As with GPUs, those implementations don't natively support denormals. So, that aspect of the comparison is apples-to-apples. It's also not the easiest thing to program, not unlike GPUs. Yes, I've done both.
Originally posted by TemplarGR View Post
But we are talking about gpgpu here, remember?
Duh.
Originally posted by TemplarGR View Post
the vast majority of applications that use double precision floating point are not gpgpu apps.
That's not the claim you made that I objected to. You simply said enterprise doesn't need fp64. Now you're trying to move the goalposts.
Originally posted by TemplarGR View Post
And even the apps that do use it do not use it all the time; most of the calculations do not need it.
Sure, there are cases where it's used egregiously, but not in many of the examples I gave.
Originally posted by TemplarGR View Post
Also, while Skylake did have fp64, you are confusing theoretical throughput with realistic performance. Unless software does fp64 calculations all day with no branching, you are not seeing those 220 GFLOPS, not in your dreams.
There's no confusion here. Anyone who understands how computers work knows that a CPU's theoretical numerical performance typically isn't sustained, because programs have to do other things. That's where having a GPU can really help, since it's quite likely doing little else at the time.
Originally posted by TemplarGR View Post
Gpgpu has tons of latency, even on a SoC.
"CUDA launch overhead for null-kernels is typically around 5 to 7 microseconds in sane driver environments."
source: https://forums.developer.nvidia.com/...d-opencl/48792
Who's the n00b, now?
And yes, you're really a n00b if you're dumb enough to make synchronous calls to run stuff on a GPU. The APIs have queues for good reasons. Anyone with any GPU programming experience appreciates the need to overlap as much processing as possible between the CPU and GPU.
Originally posted by TemplarGR View Post
While a cpu core has much less theoretical fp64 throughput, it doesn't stall nearly as much.
Another n00b comment. If your workload has enough concurrency, your GPU shouldn't be stalling.
Originally posted by TemplarGR View Post
Or else we wouldn't be using cpus at all, everything would have been gpu only by now....
The reasons people still use CPUs so much are myriad. The number 1 issue is the lack of OpenCL (or comparable) ubiquity. If software developers can't count on there being a GPU-like accelerator capable of running their code, that greatly diminishes the value proposition.
Other reasons include laziness, ignorance, and CPUs continually adding more cores. As long as people can get more performance by spinning up more CPU threads (which have their own communication latencies, mind you), there's less of a pressing need to use GPUs. Sadly, the efficiency benefits GPUs can provide too often go unutilized.
Originally posted by TemplarGR View Post
i repeat, igpus do not need hardware fp64.
For some definition of "need", they don't. That's not the same thing as saying it's worthless, or should be omitted.
Originally posted by TemplarGR View Post
Applications that for some reason need to run gpgpu fp64 can do it with emulation for a small performance hit.
Not small.
Originally posted by TemplarGR View Post
For igpus where silicon space matters, it is better to not have it
Granted, the former Intel GPUs were rather generous with it. Still, they didn't have to go to zero. They could've kept one scalar fp64 unit per EU, giving them an effective ratio of 8:1. That would still be enough to make it less painful when you need to operate on 64-bit matrices in either graphics or compute-oriented applications.
TL;DR: take your butthurt elsewhere.
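For context on the "emulation for a small performance hit" vs. "Not small." exchange above: software fp64-on-fp32 emulation is usually done with float-float (double-double style) arithmetic, where each value is a pair of fp32 numbers. A minimal sketch, using NumPy on the CPU purely to illustrate the arithmetic (real GPU emulation would run this per-lane in a kernel); the function names here are mine, not from any library:

```python
import numpy as np

def two_sum(a, b):
    # Knuth's error-free transformation: s + err equals a + b exactly.
    s = np.float32(a) + np.float32(b)
    bb = s - np.float32(a)
    err = (np.float32(a) - (s - bb)) + (np.float32(b) - bb)
    return s, err

def ff_add(x, y):
    # Add two float-float numbers (hi, lo) -> (hi, lo).
    # Roughly a dozen fp32 ops for ONE emulated add, vs. 1 native
    # fp64 add -- this is why the performance hit is "not small".
    s, e = two_sum(x[0], y[0])
    e = np.float32(e + (x[1] + y[1]))
    hi = np.float32(s + e)
    lo = np.float32(e - (hi - s))
    return hi, lo

def to_ff(v):
    # Split a Python float into an fp32 (hi, lo) pair.
    hi = np.float32(v)
    lo = np.float32(v - float(hi))
    return hi, lo

a, b = to_ff(1.0), to_ff(1e-9)
hi, lo = ff_add(a, b)

plain = np.float32(1.0) + np.float32(1e-9)
print(plain == np.float32(1.0))      # plain fp32 loses the 1e-9 entirely
print(float(hi) + float(lo) > 1.0)   # the float-float pair retains it
```

The precision recovery is real, but every emulated operation expands into many native ones, and multiplication is worse than addition.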
For example, why are you confusing code branching latency with CUDA kernel launch latency? Who talked about THAT latency? FFS, man.... You are constantly making strawman arguments and attacking them to pretend in your head that you somehow "won the argument".
I obviously meant how much less efficient gpgpu is at "branching code". I said it in the previous post you replied to as well. GPGPU is like a VLIW architecture: it is great at parallelizing calculations but is trash for more serial calculations and branching code. This is like coding 101, which is why I told you even in the first post that you are embarrassing yourself and just displaying to the whole forum that you don't know what you are talking about, man....
Even if there were only 1 language for all programmers in the whole world, for both cpu and gpu, even then people would mostly use cpus. It has nothing to do with the reasons you claimed. CPUs are just more efficient for most general-purpose computing, and this has always been the case and always will be. Unless you have a scientific calculation that benefits from running many parallel calculations (like graphics do), gpus are inefficient.
VLIW architectures are not a new thing; they are a very old thing in processor design. And modern gpus are just glorified VLIW processors with a RAMDAC (ok, obviously more modern designs are more than that, with their hierarchies etc., but it is the same principle). If VLIW were indeed better overall, it would have won in the cpu space a long time ago, but it didn't. There is a reason, and it is not that "people snub OpenCL" (LOL, just LOL).
So, to return to my original comment, AGAIN: igpus do not really need hardware gpgpu fp64 support. Which is why, when 12th gen dropped it YEARS AGO, no one noticed, no one cared, and no one lost anything of value.... No one in their right mind will do large professional parallel fp64 calculations on an igpu. Unless it is a student or amateur practicing, in which case they can use the emulated fp64 and not bat an eye.
Last edited by TemplarGR; 30 November 2023, 12:27 AM.
Originally posted by TemplarGR View PostOK, you are the one clearly butthurt and you are confusing more things in the new reply... It is getting embarrassing for you, really. If you actually knew what you talked about, that is...
Originally posted by TemplarGR View PostFor example, why are you confusing code branching latency with CUDA kernel launch latency?
"Gpgpu has tons of latency, even on a SoC."
I made the most reasonable interpretation that you were talking about kernel launch latency, since that's one of the main things that would be improved with an iGPU vs. dGPU, as the words "even on a SoC" implied.
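To put the quoted 5-7 µs launch-overhead figure in perspective, here is a back-of-envelope calculation. The sustained-throughput number is an illustrative assumption of mine, not a measurement of any particular GPU:

```python
# How much work must a kernel do before a ~7 us launch overhead
# becomes negligible? (Throughput figure is an assumption.)
launch_overhead_s = 7e-6           # upper end of the quoted 5-7 us
gpu_tflops = 1.0                   # assumed sustained fp32 throughput
flops_per_s = gpu_tflops * 1e12

# FLOPs the GPU could have executed during one launch overhead:
wasted_flops = launch_overhead_s * flops_per_s   # ~7 million FLOPs

# To keep launch overhead under 1% of total runtime, one kernel
# launch needs at least ~99x that much useful work queued up:
min_kernel_flops = wasted_flops * 99
print(f"{wasted_flops:.0f} FLOPs per launch, "
      f"need >= {min_kernel_flops:.0f} FLOPs to amortize to <1%")
```

Which is exactly why tiny synchronous kernel launches are a losing proposition, while batched asynchronous submission hides the overhead almost entirely.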
Originally posted by TemplarGR View PostI obviously meant how less efficient gpgpu is at "branching code".
Originally posted by TemplarGR View PostWho talked about THAT latency? FFS man.... You are constantly making strawman arguments and attacking them to pretend in your head that you somehow "won the argument".
In my experience, people who have trouble expressing themselves clearly also tend not to think very clearly. You might be on the verge of outing yourself as a lousy programmer.
Originally posted by TemplarGR View PostGPGPU is like a VLIW architecture, it is great at parallelizing calculations but is trash for more serial calculations and branching code.
In fact, it's a common misconception that GPUs aren't good at branching. They really are! You just need the entire wavefront/warp to follow the same codepath. Where they suffer is in control flow that depends on vector data.
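A toy model of that point: in the SIMT execution style, a wavefront/warp serializes over each distinct code path its lanes take, masking off the lanes that didn't take it. A uniform branch costs one pass; only data-dependent divergence multiplies the cost. This is a hypothetical simulation I wrote to illustrate the principle, not real GPU code:

```python
# Toy SIMT divergence model: count the serialized passes a warp
# needs to retire one branch, given the path each lane takes.
def warp_passes(lane_paths):
    """One pass per distinct code path taken within the warp."""
    return len(set(lane_paths))

uniform = ["A"] * 32            # all 32 lanes agree -> branches are cheap
divergent = ["A", "B"] * 16     # lanes split between two paths

print(warp_passes(uniform))     # 1 pass: full throughput despite the branch
print(warp_passes(divergent))   # 2 passes: effective throughput halved
```

So "GPUs are bad at branching" really means "GPUs are bad at branches whose outcome differs per lane"; a branch all 32 lanes agree on costs essentially nothing extra.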
Originally posted by TemplarGR View Posti told you even in the first post, you are embarrassing yourself
Originally posted by TemplarGR View PostEven if there was only 1 language to use for all programmers in the whole world, for both cpu and gpu, even then people would use cpus mostly.
This shows that the problem we have really isn't one of programming languages. You just need ubiquitous platform & hardware support for compute acceleration + good libraries, APIs, and frameworks.
Obviously, the problem has to be a decent fit. For instance, you don't run data encryption workloads on a video compression engine. Nor would it make sense to port a C compiler to run on a GPU. However, nobody is saying that all code should run on a GPU - just in the cases where it makes sense.
Originally posted by TemplarGR View PostCPUs are just more efficient for most general purpose computing,
Originally posted by TemplarGR View PostVLIW architectures are not a new thing,
I've programmed and subsequently forgotten more about VLIW processors than you'll probably ever know. Same with GPUs, from the sound of it.
Originally posted by TemplarGR View PostIf VLIW was indeed better overall, it would have won in cpu space a long time ago, but it didn't.
Originally posted by TemplarGR View Postigpus do not really need hardware gpgpu fp64 support. Which is why no one cared when 12th gen, YEARS AGO, dropped it, and no one noticed, no one cared, no one lost anything of value....