Rootbeer: A High-Performance GPU Compiler For Java

  • #16
    Originally posted by rohcQaH
    With cars you have a pretty strict metric of speed = distance over time. Which metric are you using to claim that GPUs are faster than CPUs? "Computation over time" is just too hard to define, and you'll find definitions that favor either side.
    Would you understand this comparison without that information? Or do I have to explain how stylistic devices work?

    By the way, the metric is time to response.

    Originally posted by rohcQaH
    Thus, none can be declared the winner.
    Which is why we still have SIMD and MIMD devices.

    Originally posted by rohcQaH
    It can run on either. If it does run on the GPU, it does not run concurrently with other GPU threads. They're run one after the other, with those expensive context switches and CPU involvement in between.

    On the CPU, both could run concurrently on their own cores with virtually no overhead.
    Right, and which one still runs faster?

    Originally posted by rohcQaH
    On a GPU? You don't. The only synchronization primitive is "The CPU task is informed that the current batch of data has been processed and the results are ready."
    Which rules out quite a few parallel algorithms.
    Then tell me what, e.g., local/global memory fences do in OpenCL, or what they're good for.


    • #17
      Originally posted by alexThunder
      Then tell me what, e.g., local/global memory fences do in OpenCL, or what they're good for.
      Memory fences and barrier(CLK_GLOBAL_MEM_FENCE) can only synchronize things within a given WORKGROUP. The GLOBAL is in reference to the global memory space, not synchronizing all GPU threads.

      They are useful for making sure that threads within a workgroup don't get out of sync, but they CANNOT be used to attempt to synchronize all global work items in an OpenCL kernel invocation. Trust me, I've tried this, and I've tried to create global synchronization mechanisms in OpenCL (and found fun ways to lock up my GPU in the process).
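
To illustrate the workgroup-only scope described above, here is a minimal sketch in plain Java rather than OpenCL. It is an analogy under stated assumptions, not how OpenCL or Rootbeer is implemented: each simulated workgroup gets its own `CyclicBarrier` and its own scratch array standing in for local memory. The names `WorkgroupBarrierSketch`, `GROUPS`, `LOCAL_SIZE`, and `run` are made up for this example.

```java
import java.util.Arrays;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

// Hypothetical illustration: Java threads stand in for GPU work items.
class WorkgroupBarrierSketch {
    static final int GROUPS = 2;      // assumed number of workgroups
    static final int LOCAL_SIZE = 4;  // assumed work items per workgroup

    // Runs one simulated "kernel launch" and returns the output buffer.
    static int[] run() throws InterruptedException {
        int[] out = new int[GROUPS * LOCAL_SIZE];
        Thread[] threads = new Thread[GROUPS * LOCAL_SIZE];
        for (int g = 0; g < GROUPS; g++) {
            final int group = g;
            // One barrier and one scratch buffer per workgroup.
            // Nothing here is shared between different workgroups.
            final CyclicBarrier barrier = new CyclicBarrier(LOCAL_SIZE);
            final int[] local = new int[LOCAL_SIZE];
            for (int l = 0; l < LOCAL_SIZE; l++) {
                final int lid = l;
                final int gid = group * LOCAL_SIZE + lid;
                threads[gid] = new Thread(() -> {
                    // Phase 1: each work item writes its own slot.
                    local[lid] = gid * 10;
                    try {
                        // Waits only for the other items of THIS workgroup.
                        barrier.await();
                    } catch (InterruptedException | BrokenBarrierException e) {
                        throw new IllegalStateException(e);
                    }
                    // Phase 2: reading a neighbour's slot is now safe,
                    // but only because the neighbour is in the same group.
                    out[gid] = local[(lid + 1) % LOCAL_SIZE];
                });
            }
        }
        for (Thread t : threads) t.start();
        for (Thread t : threads) t.join();
        return out;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(Arrays.toString(run()));
        // prints [10, 20, 30, 0, 50, 60, 70, 40]
    }
}
```

Each group finishes its two phases independently of the other, and no point in the code lets a work item in group 0 wait for one in group 1. That missing piece is the global synchronization the post says a barrier cannot provide.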


      • #18
        Nice discussion!


        • #19
          Originally posted by alexThunder
          I don't. In general, my point is that they're just faster, although they might not be usable for everything.

          For instance: Motorbikes are usually faster than cars. The fact that they can hardly carry anything compared to a car doesn't make them slower, does it?
          If you were to continue the vehicle analogy, a GPU would be more like a truck and the CPU more like a motorbike. It seems to me that both of you are correct about which is faster, but you are going by slightly different definitions of what "faster" means. A GPU is "faster" in the sense that it can run a crap load of parallel code at the same time; on the other hand, the CPU is "faster" on a per-thread basis.


          • #20
            Nice discussion here! I am learning Java!
