Even With An Intel Core i9 7980XE, LLVMpipe Is Still Slow


  • Even With An Intel Core i9 7980XE, LLVMpipe Is Still Slow

    Phoronix: Even With An Intel Core i9 7980XE, LLVMpipe Is Still Slow

    During the recent holidays, when I was running light on benchmarks to run, I toyed around with LLVMpipe, not having run this LLVM-accelerated software rasterizer in some time. I also ran some fresh tests of Intel's OpenSWR OpenGL software rasterizer, which has also been living within Mesa...


  • #2
    Considering the resolution, at least ET: Legacy is playable, which is a bit of an achievement. I'm curious to see how this performs against the oldest and slowest OpenGL 3.3 hardware you can get. For example:
    GeForce 8400
    Intel's HD 2000 series (maybe from something like a Celeron B800)
    ATI HD 2400



    • #3
      Does it use AVX2/AVX-512?



      • #4
        Clicking on the OpenBenchmarking.org link shows one test that is done both single- and multi-threaded (GL vs VK, static scene).

        The OpenSWR result is almost the same in both, and the LLVMpipe result is little more than 2x better. This suggests that the software renderers don’t make use of all 18 cores, and I seem to remember that LLVMpipe is limited in the number of threads it is “allowed” to use. So I guess using a 7980XE for this test isn’t really the advantage one might hope for.
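A rough way to put a number on that observation is to invert Amdahl's law. The sketch below is only an illustration, not something from the benchmark itself: it assumes the multi-threaded run used 16 threads and that the speedup was exactly 2x, and the function name is made up for this example.

```python
def implied_parallel_fraction(speedup: float, threads: int) -> float:
    """Invert Amdahl's law, S = 1 / ((1 - p) + p / n), to solve for p.

    speedup: observed multi-threaded vs single-threaded speedup (S)
    threads: number of worker threads assumed to be in use (n)
    """
    if threads < 2:
        raise ValueError("need at least 2 threads to infer a parallel fraction")
    if not 1.0 <= speedup <= threads:
        raise ValueError("speedup must lie between 1 and the thread count")
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / threads)


# Assumed inputs: a 2x speedup on 16 threads (both values are assumptions).
p = implied_parallel_fraction(speedup=2.0, threads=16)
print(f"implied parallel fraction: {p:.2f}")  # roughly 0.53
```

Under those assumptions only about half of the frame time would be running in parallel, which would fit with the extra cores not helping much.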



        • #5
          LLVMpipe is mostly good on modern desktop CPUs as a fallback for composited desktops when no hardware GPU/driver is available.
          If you actually tried that, you'd know that no, llvmpipe is not suitable for the common desktop compositors (GNOME Shell and such), even on modern CPUs. CPU usage is through the roof the whole time, and there is so much lag that it's very frustrating to actually use the machine. Add to that slow-running applications, because the compositor is sucking up all the CPU so that very little is left for the application.

          A software compositor could very well work, including a lot of the visual effects, but it would need to be designed from the ground up to run on the CPU, unlike current compositors, which simply call GPU functions and expect those to be fast (because they are fast on an actual GPU).



          • #6
            I use software rendering on some strange configurations, for example GLES 3.0 under Cygwin. Performance is more than good enough, at least for programming and testing.



            • #7
              Originally posted by indepe
              Clicking on the OpenBenchmarking.org link shows one test that is done both single- and multi-threaded (GL vs VK, static scene).

              The OpenSWR result is almost the same in both, and the LLVMpipe result is little more than 2x better. This suggests that the software renderers don’t make use of all 18 cores, and I seem to remember that LLVMpipe is limited in the number of threads it is “allowed” to use. So I guess using a 7980XE for this test isn’t really the advantage one might hope for.
              Yes, llvmpipe has a static limit on the maximum number of threads; right now it's 16. It can be trivially increased, but scaling is bad, in particular because everything before rasterization isn't multithreaded at all (and the vertex shader / setup can't even run in parallel with the multithreaded rasterization / fragment shader). OpenSWR shouldn't suffer from these issues, though.
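To illustrate why raising the limit would not buy much, here is a minimal Amdahl's-law sketch with a thread cap. It is not llvmpipe code: the 16-thread cap is the figure from the post, while the function name and the 50% parallel fraction are purely hypothetical values chosen for the example.

```python
MAX_THREADS = 16  # thread limit quoted in the post, modeled as a plain constant


def estimated_speedup(cores: int, parallel_fraction: float,
                      max_threads: int = MAX_THREADS) -> float:
    """Amdahl's-law speedup when only part of the frame work is parallel.

    The serial part stands in for the single-threaded vertex/setup stage,
    the parallel part for the multithreaded rasterization/fragment stage.
    """
    if cores < 1 or not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("cores must be >= 1 and parallel_fraction in [0, 1]")
    threads = min(cores, max_threads)  # cores beyond the cap sit idle
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / threads)


# Hypothetical workload: half of the frame time is parallelizable.
for cores in (1, 4, 16, 18):
    print(cores, round(estimated_speedup(cores, 0.5), 2))
```

With a 50% serial share the speedup can never reach 2x no matter how many threads are allowed, so in this toy model the serial front end matters far more than the cap itself.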



              • #8
                It would have been nice to have some hardware-rendered results as a reference.



                • #9
                  Originally posted by mczak

                  Yes, llvmpipe has a static limit on the maximum number of threads; right now it's 16. It can be trivially increased, but scaling is bad, in particular because everything before rasterization isn't multithreaded at all (and the vertex shader / setup can't even run in parallel with the multithreaded rasterization / fragment shader). OpenSWR shouldn't suffer from these issues, though.
                  Makes me wonder if an acceptable frame rate would be achievable in some cases.
