LLVMpipe Still Is Slow At Running OpenGL On The CPU


  • #11
    Only a 48-core Opteron 6000 system (155 GB/s memory bandwidth) can keep up with a GPU (HD 5870, 160 GB/s).

    A normal PC has 5-15 GB/s, which is very slow compared to the HD 5870's 160 GB/s.

    This benchmark only shows us that gap.
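
To put those figures side by side, here is a quick calculation using only the numbers quoted in this post. They are the poster's round figures, not measured values, and the function name is just illustrative.

```python
# Memory bandwidth figures as quoted in the post, in GB/s (poster's round numbers).
GPU_HD5870 = 160.0
OPTERON_48_CORE = 155.0
DESKTOP_LOW, DESKTOP_HIGH = 5.0, 15.0


def bandwidth_ratio(gpu_gbs: float, cpu_gbs: float) -> float:
    """Return how many times more memory bandwidth the GPU has than the CPU system."""
    return gpu_gbs / cpu_gbs


if __name__ == "__main__":
    print(f"vs 48-core Opteron: {bandwidth_ratio(GPU_HD5870, OPTERON_48_CORE):.2f}x")
    print(f"vs fast desktop:    {bandwidth_ratio(GPU_HD5870, DESKTOP_HIGH):.2f}x")
    print(f"vs slow desktop:    {bandwidth_ratio(GPU_HD5870, DESKTOP_LOW):.2f}x")
```

On these numbers the GPU has roughly 10 to 32 times the memory bandwidth of an ordinary desktop, which is the gap the post is pointing at.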



    • #12
      Are there any plans to use any parts of LLVMpipe alongside the GPU drivers? For example, to move calculations to the CPU if the GPU is overloaded, or if there are things a particular CPU does more efficiently than the GPU?

      I guess more importantly, are there any benefits to such an approach?



      • #13
        I believe parts of LLVMpipe are being used already in cases where the GPU does not have vertex shader hardware.

        Dynamic load balancing (shifting driver work between CPU and GPU depending on the relative performance of the two) is a much bigger issue because getting non-sucky performance requires that the CPU have all of its data in system memory while the GPU needs all of its data to be in video memory.



        • #14
          Originally posted by bridgman View Post
          I believe parts of LLVMpipe are being used already in cases where the GPU does not have vertex shader hardware.

          Dynamic load balancing (shifting driver work between CPU and GPU depending on the relative performance of the two) is a much bigger issue because getting non-sucky performance requires that the CPU have all of its data in system memory while the GPU needs all of its data to be in video memory.
          I don't know if this is stupid, but it seems like a back-and-forth latency problem. Can you keep the data in both system memory and video memory and keep the two in sync all the time? Maybe then you would not need to do the back and forth at a crucial moment?

          *ducks and runs*



          • #15
            Sure, you just turn on the "magic pixie dust" bit in the hardware

            Seriously, dealing with latency between different memory subsystems (with the associated need for change detection and synchronization) is probably the biggest single challenge when implementing highly parallel systems. It doesn't mean that an easy solution does not exist but it hasn't been found yet.

            Read up on "cache snooping", for example.
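
As a rough illustration of the "change detection and synchronization" part, here is a toy sketch of keeping two copies of a buffer in step by tracking dirty pages. Everything in it is an assumption for illustration: `MirroredBuffer`, `PAGE_SIZE`, and the two `bytearray`s merely stand in for system and video memory, and it is not how any real driver or hardware does this.

```python
PAGE_SIZE = 4096  # hypothetical tracking granularity, in bytes


class MirroredBuffer:
    """Keeps a 'system' copy and a 'video' copy of the same data (both simulated)."""

    def __init__(self, size: int) -> None:
        self.system = bytearray(size)  # stands in for system memory
        self.video = bytearray(size)   # stands in for video memory
        self.dirty = set()             # page indices changed since the last sync

    def cpu_write(self, offset: int, data: bytes) -> None:
        """Write on the CPU side and record which pages changed."""
        if offset < 0 or offset + len(data) > len(self.system):
            raise ValueError("write outside buffer")
        if not data:
            return
        self.system[offset:offset + len(data)] = data
        first = offset // PAGE_SIZE
        last = (offset + len(data) - 1) // PAGE_SIZE
        self.dirty.update(range(first, last + 1))

    def sync(self) -> int:
        """Copy only the dirty pages to the video copy; return bytes transferred."""
        copied = 0
        for page in sorted(self.dirty):
            start = page * PAGE_SIZE
            end = min(start + PAGE_SIZE, len(self.system))
            self.video[start:end] = self.system[start:end]
            copied += end - start
        self.dirty.clear()
        return copied


if __name__ == "__main__":
    buf = MirroredBuffer(3 * PAGE_SIZE)
    buf.cpu_write(PAGE_SIZE - 2, b"abcd")  # straddles pages 0 and 1
    print("bytes transferred:", buf.sync())
```

Even this toy only handles writes in one direction. Once the GPU side can also write, both copies need tracking plus a rule for who wins on a conflict, and that is where the difficulty starts.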



            • #16
              Originally posted by bridgman View Post
              Sure, you just turn on the "magic pixie dust" bit in the hardware

              Seriously, dealing with latency between different memory subsystems (with the associated need for change detection and synchronization) is probably the biggest single challenge when implementing highly parallel systems. It doesn't mean that an easy solution does not exist but it hasn't been found yet.

              Read up on "cache snooping", for example.
              Oh yeah... that shit...

              OK, first you have two address maps that do not match, so both sides have to know, via a tag, which address in one map holds a copy of which address in the other.

              Then the CPU and the GPU must never write to the same thing at the same time, and since they cannot coordinate directly with each other (great), one of the two needs to be the dominant decision maker. That would be the CPU, as it can execute a driver for the graphics card.

              So the CPU puts what's about to be modified in a command buffer (sort of) and then checks what the GPU can alter at that time that does not touch the address tags the CPU is changing. Then, when both buffers are empty, the lock the CPU keeps on what the GPU can't touch is lifted and the GPU can continue.

              Either way massive latency hell...
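
The scheme described in this post can be sketched as a single-threaded toy model. All names here (`CpuArbiter`, `cpu_queue_write`, `gpu_write`, `flush`) are hypothetical, memory is collapsed into one dictionary keyed by tag for simplicity, and nothing about it reflects how actual drivers or hardware arbitrate access.

```python
class CpuArbiter:
    """Toy model: the CPU decides which tagged regions the GPU may write."""

    def __init__(self) -> None:
        self.memory = {}        # tag -> value (one logical copy, for simplicity)
        self.cpu_commands = []  # pending CPU writes as (tag, value)
        self.gpu_deferred = []  # GPU writes held back by a lock
        self.locked = set()     # tags the CPU is about to modify

    def cpu_queue_write(self, tag, value) -> None:
        """CPU announces a write: lock the tag and queue the command."""
        self.locked.add(tag)
        self.cpu_commands.append((tag, value))

    def gpu_write(self, tag, value) -> bool:
        """GPU writes immediately unless the tag is locked; then it has to wait."""
        if tag in self.locked:
            self.gpu_deferred.append((tag, value))
            return False
        self.memory[tag] = value
        return True

    def flush(self) -> None:
        """Drain the CPU command buffer, lift the locks, then let the GPU continue."""
        for tag, value in self.cpu_commands:
            self.memory[tag] = value
        self.cpu_commands.clear()
        self.locked.clear()
        deferred, self.gpu_deferred = self.gpu_deferred, []
        for tag, value in deferred:
            self.memory[tag] = value


if __name__ == "__main__":
    arb = CpuArbiter()
    arb.cpu_queue_write("tex0", "cpu data")
    print("GPU write to locked tag went through:", arb.gpu_write("tex0", "gpu data"))
    print("GPU write to free tag went through:", arb.gpu_write("vb1", "gpu data"))
    arb.flush()
    print(arb.memory)
```

Every deferred GPU write in this model is a stall until the next `flush`, which is the latency cost the post complains about.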



              • #17
                Welcome to driver development



                • #18
                  Dynamic load balancing, we already have that, it's called SLI



                  • #19
                    SLI (and Crossfire) might do dynamic load balancing between GPUs, but this is something different - dynamic load balancing between CPU and GPU.



                    • #20
                      Originally posted by bridgman View Post
                      SLI (and Crossfire) might do dynamic load balancing between GPUs, but this is something different - dynamic load balancing between CPU and GPU.
                      Not worth it unless you have a CPU that is -really- good at graphics rendering. A good example of this is the PS3: even the RSX, with its weak fill rate, can deliver stunning visuals with the Cell's SPUs working on parts of the graphics load. This is largely thanks to the combination of XDR and GDDR3: low-latency XDR doesn't have the peak raw bandwidth of GDDR3, but can be randomly accessed by the GPU without taking too many memory cycles away from the CPU, allowing communication and load distribution to work really smoothly (if done right, of course).

                      x86(_64), on the other hand, is not really good at this, and chipset hardware was never designed for ultra-low-latency or high-bandwidth communication between the CPU and GPU. The CPU would be better put to work on physics in most 4-8 core scenarios.

