OpenCL Support In GCC?


  • #21
    Originally posted by bridgman View Post
    It's supposed to hurt. That means you're starting to understand. Congratulations.

    Ever since the introduction of programmable shaders, GPU drivers have included an on-the-fly compilation step (going from, say, GLSL to GPU shader instructions), and the GPU hardware has run many copies of those compiled shader programs in parallel to get acceptable performance.

    GPU vendors did a good job of hiding that complexity from the application -- but with OpenCL you get to see all the scary stuff behind the scenes.

    Back in 2002 the R300 (aka 9700) was running 8 copies of the pixel shader program in parallel, each working on a different pixel.
    Why not have one copy of the pixel shader operating on 8 different pixels, like SIMD? What's the rationale behind using so many processors if they are all running the same code?



    • #22
      For those interested in bridgman's posts about the hardware, I highly recommend the articles AnandTech wrote for the GT200/RV770 launches.




      Yes, they support more than 1000 threads.

      It's very interesting to see the different approaches ATI and NVidia took. NVidia went with very simple SPs, each capable of doing all operations, while ATI went with more complex ones that can run multiple instructions at once. The tradeoff is that those instructions have to be a certain mix, which means the compiler has to be very smart about how it uses the resources.
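To make the compiler problem concrete, here is a rough sketch of packing operations into 5-wide bundles. It is a toy, not ATI's actual scheduler: the only constraint modelled is an assumed one, that an operation cannot share a bundle with an operation whose result it reads. The real hardware rules are not described in this thread, and the names `Op`, `pack_bundles` and `BUNDLE_WIDTH` are made up for illustration.

```python
# Toy illustration only: a greedy packer for 5-wide instruction bundles.
# Op, pack_bundles and BUNDLE_WIDTH are hypothetical names, not a real compiler API.
from dataclasses import dataclass, field

BUNDLE_WIDTH = 5  # up to 5 operations issued together, as described in the thread


@dataclass
class Op:
    name: str
    deps: set = field(default_factory=set)  # names of ops whose results this op reads


def pack_bundles(ops):
    """Greedily fill bundles; an op may issue only after all of its
    dependencies have issued in an earlier bundle (assumed constraint)."""
    done = set()
    remaining = list(ops)
    bundles = []
    while remaining:
        # Ops whose inputs are all available, capped at the bundle width.
        bundle = [op for op in remaining if op.deps <= done][:BUNDLE_WIDTH]
        if not bundle:
            raise ValueError("dependency cycle or missing op")
        bundles.append([op.name for op in bundle])
        done.update(op.name for op in bundle)
        remaining = [op for op in remaining if op.name not in done]
    return bundles


# Five independent multiplies fill one bundle completely.
independent = [Op(f"mul{i}") for i in range(5)]
# A chain where each op needs the previous result uses one slot per bundle.
chain = [Op("a"), Op("b", {"a"}), Op("c", {"b"})]

print(pack_bundles(independent))  # [['mul0', 'mul1', 'mul2', 'mul3', 'mul4']]
print(pack_bundles(chain))        # [['a'], ['b'], ['c']]
```

The second case is the one that hurts: with a dependent chain, four of the five slots in every bundle sit idle, which is why the quality of the compiler matters so much for this kind of design.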



      • #23
        Originally posted by monraaf View Post
        Why not have one copy of the pixel shader operating on 8 different pixels, like SIMD? What's the rationale behind using so many processors if they are all running the same code?
        AFAIK we have used SIMDs for pixel shaders right back to the R300. Not sure if the vertex shaders were SIMD or not. An RV770 has 10 SIMD blocks, with a single program counter per SIMD. Each SIMD block runs the same superscalar instruction on 16 threads, and each instruction can perform up to 5 floating point operations per clock.

        Other GPU vendors use SIMDs in a similar fashion, but without the superscalar instructions. We have multiple SIMDs because even for a single task you need some granularity to handle the mix of vertex, geometry and pixel shader processing.
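As a rough illustration of the numbers above, the sketch below models one SIMD block as a single program counter stepping through a program while every lane applies the same instruction to its own data, then multiplies out the quoted figures. It is a toy model, not how the hardware is implemented, and the names (`run_simd`, the constants) are made up for the example.

```python
# Toy model of one SIMD block: a single program counter, 16 lanes in lockstep.
# All names are hypothetical; the figures are the ones quoted in the post.
SIMD_BLOCKS = 10      # SIMD blocks in an RV770
LANES_PER_SIMD = 16   # threads executing the same instruction
OPS_PER_INSTR = 5     # floating point operations per superscalar instruction


def run_simd(program, lanes):
    """Apply each instruction to every lane before advancing the program counter."""
    pc = 0
    while pc < len(program):
        instr = program[pc]
        lanes = [instr(x) for x in lanes]  # same instruction, different data
        pc += 1
    return lanes


# A two-instruction "shader": scale, then bias.
program = [lambda x: x * 2.0, lambda x: x + 1.0]
pixels = [float(i) for i in range(LANES_PER_SIMD)]

result = run_simd(program, pixels)
peak_ops_per_clock = SIMD_BLOCKS * LANES_PER_SIMD * OPS_PER_INSTR

print(result[:4])          # [1.0, 3.0, 5.0, 7.0]
print(peak_ops_per_clock)  # 800
```

So under these figures the chip peaks at 10 x 16 x 5 = 800 floating point operations per clock, and only when every instruction actually fills all five slots.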
        Last edited by bridgman; 01 February 2009, 11:27 PM.


        • #24
          Originally posted by Louise View Post
          So a GPU can have processes running? Does that mean, that there could be made a "ps" and "top" for GPU processes?

          It would be so cool to have a Gnome/KDE GPU load monitor

          [...]
          Sorry for dragging up an old thread, but yes, you can have "top" for the GPU, behold intel_gpu_top:



          I guess someone still needs to write a load monitor for your favourite desktop.
