RADV vs. NVIDIA Vulkan/OpenGL Performance For Serious Sam 2017


  • #61
    -O2 is usually the best choice, unless the code specifically benefits from other optimization levels.

    -O3 is safe if your code conforms to the C/C++ standard, but will increase code size by a lot. This means it will occupy more of your CPU cache, which can be detrimental to performance in situations where many processes run in parallel. If your code works with -O2 but not with -O3, it is usually broken in some way.

    -Os should be mentioned here too. There is almost never a reason to use it over -O2.

    -Ofast does not guarantee that the compiled code does what the C/C++ standard says. It implies -ffast-math which can cause FP calculations to give unexpected results.
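One concrete source of those unexpected results: -ffast-math lets the compiler reorder floating-point operations as if addition were associative, which it is not under IEEE 754. A minimal sketch in Python, assuming its float is an IEEE 754 double (true on common platforms). It only demonstrates the arithmetic property, not what any particular compiler emits, and the function names are made up for illustration.

```python
def sum_left_to_right(a, b, c):
    """Evaluate (a + b) + c, the order the source code spells out."""
    return (a + b) + c


def sum_reassociated(a, b, c):
    """Evaluate a + (b + c), a reordering that is only exact for real numbers."""
    return a + (b + c)


if __name__ == "__main__":
    # Small values: the two orders can differ in the last bit.
    print(sum_left_to_right(0.1, 0.2, 0.3), sum_reassociated(0.1, 0.2, 0.3))
    # Mixed magnitudes: one order loses the 1.0 entirely, the other keeps it.
    print(sum_left_to_right(1.0, 1e16, -1e16), sum_reassociated(1.0, 1e16, -1e16))
```

Code that depends on a specific evaluation order (compensated summation, for instance) is the kind that tends to break when such reordering is allowed.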



    • #62
      Originally posted by Kano View Post
      And where are the results that show better Vulkan speed compared to OpenGL you speak about???
      I wasn't speaking to high Vulkan framerates... but I may have missed the intention of your question, so I'll reword my reply.

      Your question was driven by a desire to see what happens at low resolution, when the game becomes CPU-limited (due to high FPS) instead of GPU-limited (due to high pixel counts), right?

      I agree with you: the OpenGL 1080p results are painfully missing from a review which bills itself as a Vulkan vs OpenGL comparison. They would have revealed whether Vulkan was winning under those circumstances, and that's interesting to those of us who care about Vulkan vs OpenGL from an architectural point of view rather than a gaming point of view.

      In my previous reply I should have said that I believe Michael skipped the OpenGL 1080p results because he felt his readers would find them uninteresting. He moved on to generating and posting the results at 2160p because at that resolution the choice of Vulkan vs OpenGL could make a difference in playability, and I think that's what Michael feels his readers mostly care about.

      PS: Michael - frame latency histograms are a better indicator of playability than average FPS... 95th percentile latency of 60ms @ 40fps is more playable than 95th percentile of 100ms @ 60fps. Vulkan is great at delivering low average frame latency but we don't get to see that in these benchmarks.
      Last edited by linuxgeex; 25 March 2017, 05:34 PM.
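To make the percentile point concrete, here is a rough sketch that compares average FPS with 95th-percentile frame time for two traces. The traces are made-up numbers, not measurements, per-frame time is used as a stand-in for frame latency, and the helper names are hypothetical.

```python
def average_fps(frame_times_ms):
    """Average frames per second over a trace of per-frame times in milliseconds."""
    return 1000.0 * len(frame_times_ms) / sum(frame_times_ms)


def percentile_ms(frame_times_ms, percent):
    """Nearest-rank percentile; `percent` is an integer from 1 to 100."""
    ordered = sorted(frame_times_ms)
    rank = max(1, (percent * len(ordered) + 99) // 100)  # integer ceiling
    return ordered[rank - 1]


# Made-up traces, for illustration only (not benchmark data).
steady = [25.0] * 100                # every frame takes 25 ms
spiky = [10.0] * 92 + [100.0] * 8    # mostly fast, with occasional stalls

if __name__ == "__main__":
    for name, trace in (("steady", steady), ("spiky", spiky)):
        print(name, round(average_fps(trace), 1), "fps,",
              percentile_ms(trace, 95), "ms at the 95th percentile")
```

The spiky trace has the higher average FPS, yet its 95th-percentile frame time is four times worse, which is exactly what an average hides.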



      • #63
        Originally posted by chithanh View Post
        -O2 is usually the best choice, unless the code specifically benefits from other optimization levels.
        That's quite frequent, though. My understanding is that Clear Linux (which is distinctly faster than other distributions) makes significant use of -O3.

        Keep in mind that optimal code not only runs faster, it also consumes less energy and increases battery life on a laptop.

        If you write code for yourself, maybe you just use debug flags, but if you distribute code to a larger number of users, isn't it worth the effort to actually measure the effect of compiler flags, instead of just defaulting to something like -O2? And what about PGO?

        Originally posted by chithanh View Post
        -O3 is safe if your code conforms to the C/C++ standard, ...
        The same applies to -O2, for example regarding strict aliasing rules.

        Originally posted by chithanh View Post
        ..., but will increase code size by a lot. This means it will occupy more of your CPU cache, which can be detrimental to performance in situations where many processes run in parallel.
        It doesn't necessarily increase code size "a lot" (and certainly not if you use it on specific parts of your code). As always, you need to measure the effect and compare, unless you just compile for yourself.

        Originally posted by chithanh View Post
        If your code works with -O2 but not with -O3, it is usually broken in some way.
        Yes, and you need to find out how, and fix it, not least because it is an indication of further undetected problems.

        Originally posted by chithanh View Post
        -Os should be mentioned here too. There is almost never a reason to use it over -O2.
        Maybe for code that runs only once, or usually not at all?

        Originally posted by chithanh View Post
        -Ofast does not guarantee that the compiled code does what the C/C++ standard says. It implies -ffast-math which can cause FP calculations to give unexpected results.
        -Ofast is special purpose and in a completely different category (use at your own risk).



        • #64
          Originally posted by chithanh View Post
          -O2 is usually the best choice, unless the code specifically benefits from other optimization levels.

          -O3 is safe if your code conforms to the C/C++ standard, but will increase code size by a lot. This means it will occupy more of your CPU cache, which can be detrimental to performance in situations where many processes run in parallel. If your code works with -O2 but not with -O3, it is usually broken in some way.
          I need to correct you here.
          O3 is better, except for edge cases.
          If your code grows, it's due to simplification of the code to remove branches. Yes, in some cases the code will grow slightly, but the code will be more linear, which will make the prefetcher able to fetch this code with no issue. O3 also includes -finline-functions, which will help reduce code cache misses by a lot if you have a lot of calls to small functions.

          But how can this degrade performance where processes work in parallel? (I suppose you mean threads?)



          • #65
            Nah, O3 and LTO is the way to go, especially with C++ which loves to bloat.

            On a more serious note though, there is no best setting. It really depends on what your code is doing, and the compiled code can differ significantly depending on the compiler and optimization level.



            • #66
              Originally posted by chithanh View Post
              -O2 is usually the best choice, unless the code specifically benefits from other optimization levels.

              -O3 is safe if your code conforms to the C/C++ standard, but will increase code size by a lot. This means it will occupy more of your CPU cache, which can be detrimental to performance in situations where many processes run in parallel. If your code works with -O2 but not with -O3, it is usually broken in some way.

              -Os should be mentioned here too. There is almost never a reason to use it over -O2.

              -Ofast does not guarantee that the compiled code does what the C/C++ standard says. It implies -ffast-math which can cause FP calculations to give unexpected results.
              So as an advanced Gentoo user, what recommendations do you have for us about -how- to use -O3 effectively? It's been my experience that -O3 results in very random behavior that I don't even know how to describe. So I guess my question is, how do I know what packages to build with -O3? Does Gentoo have some mechanism to assist its users in making that determination? And if it is true, as you say, that code which misbehaves when compiled with -O3 is broken, then doesn't that make it directly a Gentoo developer problem? Not just Gentoo's, but every developer's problem?



              • #67
                Originally posted by duby229 View Post
                ...
                So I guess my question is, how do I know what packages to build with -O3? ...
                I'd hope it is possible to get this information, regarding builds of the OS itself, from Clear Linux. And not knowing Gentoo, I'd hope it is not too difficult to set up Gentoo to use this per-package info.

                However, I would think the Gentoo distributors should do this work, and provide some grouping of packages so that you can set compiler options for each group, in a way that puts those packages which have conformance problems into a special group.



                • #68
                  Originally posted by efikkan View Post

                  A driver is in really bad state when it needs to be optimized for every game.
                  Hopefully the latter didn't mean for every game. Airlied has said that there are a few things left to do (or at least that need to be merged) that will result in large performance gains.
                  I'm not sure how much room exists on the driver side of Vulkan to provide per-game optimizations.



                  • #69
                    Originally posted by indepe View Post

                    I'd hope it is possible to get this information, regarding builds of the OS itself, from Clear Linux. And not knowing Gentoo, I'd hope it is not too difficult to set up Gentoo to use this per-package info.

                    However, I would think the Gentoo distributors should do this work, and provide some grouping of packages so that you can set compiler options for each group, in a way that puts those packages which have conformance problems into a special group.
                    Of course Gentoo can do it per package, but how can a user know if -O3 is going to result in undefined behaviour? Trial and error is the only method I can think of, but Gentoo is an end-user distribution that touts itself as a so-called meta-distribution. Gentoo has a freakin awesome package manager that is designed to build everything from source, and it's got some fantastic tools, but I don't think automatic testing for undefined behaviour is included.



                    • #70
                      Originally posted by duby229 View Post
                      Of course Gentoo can do it per package, but how can a user know if -O3 is going to result in undefined behaviour? Trial and error is the only method I can think of, but Gentoo is an end-user distribution that touts itself as a so-called meta-distribution. Gentoo has a freakin awesome package manager that is designed to build everything from source, and it's got some fantastic tools, but I don't think automatic testing for undefined behaviour is included.
                      Not sure if you really read my post. I suggested that Gentoo create different groups of packages, so that you can provide different compiler options for each group. One of the groups would then contain those packages that are problematic and/or don't comply sufficiently with the language specification. For that group you would of course not use -O3, but for the others you could, if you want. Or, the other way around, you define named compiler-option-sets, and then each package specifies which set it uses. I believe that, or something in that direction, is also how Clear Linux handles this.

                      The bottom line would be that Gentoo maintainers determine which packages are problematic (similar to Clear Linux), and there would be two, or more, sets of compiler options. Unless someone fixes those problematic packages, which of course would be even better.
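The named-option-set idea can be sketched as a simple lookup. Everything here is hypothetical: the set names, the package names, and the choice of flags per set are invented for illustration, and this is not the format used by Gentoo's package manager or by Clear Linux.

```python
# Hypothetical option sets and package names, for illustration only.
OPTION_SETS = {
    "default": ["-O2"],
    "aggressive": ["-O3"],
    "conservative": ["-O2", "-fno-strict-aliasing"],
}

# Each package names the set it uses; anything unlisted gets "default".
PACKAGE_TO_SET = {
    "example/number-cruncher": "aggressive",
    "example/legacy-tool": "conservative",
}


def flags_for(package):
    """Return the compiler flags for a package, falling back to the default set."""
    set_name = PACKAGE_TO_SET.get(package, "default")
    return list(OPTION_SETS[set_name])


if __name__ == "__main__":
    for pkg in ("example/number-cruncher", "example/legacy-tool", "example/other"):
        print(pkg, " ".join(flags_for(pkg)))
```

Keeping the mapping separate from the option sets means a maintainer can move a package out of the problematic group once it is fixed without touching the flags themselves.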

