RadeonSI Ups Its Compiler Threads To Let Shader-DB Run Faster On Modern Systems


  • #11
    Originally posted by Veerappan View Post
    Should hopefully translate to faster load times on many core CPUs for games as well
    This. I really hope to see benchmarks with an empty cache that show whether this helped in games where startup was slowed by shader compilation.

    Comment


    • #12
      Originally posted by smitty3268 View Post

      Though in practice, I'm not aware of any computers where it wouldn't be. 12, 16, 24, 32, 36, 48, etc. cores all divide evenly. I guess maybe in a virtual environment there might be an odd # of cores assigned.
      https://en.wikipedia.org/wiki/AMD_Phenom#Phenom_X3

      + of course virtual environments
      Last edited by konserw; 04-29-2018, 04:23 AM.

      Comment


      • #13
        Originally posted by smitty3268 View Post

        Though in practice, I'm not aware of any computers where it wouldn't be. 12, 16, 24, 32, 36, 48, etc. cores all divide evenly. I guess maybe in a virtual environment there might be an odd # of cores assigned.
        Even with numbers that are in theory divisible by 4 you may get a different result because floating point numbers have errors https://en.wikipedia.org/wiki/Floati...uracy_problems

        /edit:
        Removed the wrong example because I'm bad at math in my head

        Comment


        • #14
          Originally posted by droste View Post

          Even with numbers that are in theory divisible by 4 you may get a different result because floating point numbers have errors


          Not if you have integral numbers all the way, which is the case here.

          Comment


          • #15
            Originally posted by LinAGKar View Post
            Not if you have integral numbers all the way, which is the case here.
            Yes, but there's a division there, so the result may not be an integral number anymore. Easy example: 1 / 10 is NOT 0.1 in floating point arithmetic.

            /edit: And the original post I replied to had "0.75" which isn't an integral number
            Last edited by droste; 04-29-2018, 08:48 AM.

            Comment


            • #16
              Originally posted by droste View Post

              Yes, but there's a division there, so the result may not be an integral number anymore. Easy example: 1 / 10 is NOT 0.1 in floating point arithmetic.

              /edit: And the original post I replied to had "0.75" which isn't an integral number
              In this case though, we're dividing by 4, which is a power of two, so the result is exactly representable by a float. 0.75 is also exactly representable in binary (0.11₂). Also, I was talking about thread counts that are divisible by 4, like those listed there.
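For anyone who wants to check the exactness argument, here is a minimal Python sketch. It is not the Mesa code: the helper names (`is_exact`, `three_quarters_float`, `three_quarters_int`) are hypothetical, and "three quarters of the core count" is only assumed here as the calculation under discussion. It compares each float result against the exact rational value using `fractions.Fraction`.

```python
from fractions import Fraction


def is_exact(numerator: int, denominator: int) -> bool:
    """Return True if the float quotient equals the exact rational value."""
    # Fraction(float) converts the stored binary value without rounding,
    # so any representation error shows up in the comparison.
    return Fraction(numerator / denominator) == Fraction(numerator, denominator)


def three_quarters_float(cores: int) -> float:
    """Hypothetical float version: 0.75 is exactly representable in binary."""
    return cores * 0.75


def three_quarters_int(cores: int) -> int:
    """Hypothetical integer-only version: no floating point involved at all."""
    return cores * 3 // 4


if __name__ == "__main__":
    print(is_exact(1, 10))   # 1/10 has no finite binary expansion
    print(is_exact(3, 4))    # 3/4 is 0.11 in binary
    print(all(is_exact(n, 4) for n in range(1, 129)))  # dividing by a power of two
    print(three_quarters_float(16), three_quarters_int(16))
    print(three_quarters_float(3), three_quarters_int(3))  # odd core count, e.g. a 3-core CPU
```

The only place the two versions differ is when the core count is not a multiple of 4: the float version keeps the fractional part and the integer version truncates it, so rounding behaviour matters there, not floating point error.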

              Comment


              • #17
                Originally posted by andrei_me View Post
                marek what were the effects of this change in shaderdb execution time?
                shader-db was bottlenecked by the 3-thread shader compiler queue, so it only used 5 threads or so on a 16-thread CPU. Now it's using ~16.

                Comment


                • #18
                  Any idea what effect this will have on multiple NUMA nodes? Is each thread mostly self-contained or will there be memory swapping across nodes?

                  Comment
