AMD Working On Better Page Fault Handling For Navi / Vega GPUs

  • #11
    Originally posted by skeevy420
    It's a bit older now, but Polaris is pretty rock solid.
    My experience is quite the opposite. Linux kernels up to 4.13 crashed my system like once per two days, and everything newer is unstable like crazy, with both the "linux-stable" and the latest head of amd-staging-drm-next crashing within minutes of a simple test use case (3 fps video replay). I have given up hope amdgpu will get usefully stable for me and just wait for the Intel Xe to hit the shelves.


    • #12
      Originally posted by dwagner
      My experience is quite the opposite. Linux kernels up to 4.13 crashed my system like once per two days, and everything newer is unstable like crazy, with both the "linux-stable" and the latest head of amd-staging-drm-next crashing within minutes of a simple test use case (3 fps video replay). I have given up hope amdgpu will get usefully stable for me and just wait for the Intel Xe to hit the shelves.
      Outside of trying out things like AMD_DEBUG variables, projects from git like llvm & mesa, and other things of that nature, I've had a really pleasant experience with both my old 260x and my current 580 on Arch and Manjaro. I haven't stuck with any other OS family for more than a week in the past 8 or 10 years so I won't speak for how other distributions may or may not work with AMD cards.

      I also haven't needed to use AMD staging kernels since their 4.17 staging kernel. I suppose people with Navis or Vegas might need to for whatever reason, but, at least for my uses, they haven't been needed to fix anything for quite a few kernel releases now. Currently on 5.2.11 and I'm more worried about the zstd initramfs patches I'm trying out than anything else (zstd --fast=4 compresses the kernel image almost as fast as lz4 with better compression ratios than xz...that's zstd's 2nd weakest compression setting, btw).


      • #13
        Gosh darn it, stop blaming the pages!

        It's not their fault


        • #14
          Originally posted by tildearrow
          Of course, Polaris is super stable because it has all the attention. Nobody cares about Vega.
          Have to disagree strongly with that statement. If Polaris is more stable it's because the programming model didn't change much.

          The Polaris to Vega SW/FW-visible changes were much (much much much...) more significant than the Fiji to Polaris changes... maybe 5x or more.


          • #15
            Originally posted by bridgman

            Have to disagree strongly with that statement. If Polaris is more stable it's because the programming model didn't change much.

            The Polaris to Vega SW/FW-visible changes were much (much much much...) more significant than the Fiji to Polaris changes... maybe 5x or more.
            That. As a Polaris owner, as of late I haven't seen nearly as many "Polaris is getting X feature" or "Polaris is getting Y fix" when compared to Vega and Navi when browsing random git repos...kernel, mesa, etc....


            • #16
              Originally posted by bridgman

              Have to disagree strongly with that statement. If Polaris is more stable it's because the programming model didn't change much.

              The Polaris to Vega SW/FW-visible changes were much (much much much...) more significant than the Fiji to Polaris changes... maybe 5x or more.
              Thanks for the feedback. I thought it was because Vega simply was consumed by the "miners" and few people used it as a graphics card.

              Sorry about that.


              • #17
                Originally posted by tildearrow
                Thanks for the feedback. I thought it was because Vega simply was consumed by the "miners" and few people used it as a graphics card.
                Sorry about that.
                No worries... I don't think we ever talked much about how big the changes were outside of events like HotChips. If you think of Vega as having a somewhat modified GFX core along with a totally new uncore (including rebuilding the GPU internal design around Infinity Fabric) that's pretty close.

                https://www.hotchips.org/wp-content/...tor-AMD-f1.pdf

                Assuming you accept "uncore" as a word, of course. I'm still on the fence about that


                • #18
                  Originally posted by tildearrow
                  Great. Can this be used to gracefully kill the X server in case of a hang? (without having to reboot the machine)
                  There is not much good in having a booted machine with all your session gone; boot is faster than browser startup. I'd prefer killing at most one problematic app.


                  • #19
                    Originally posted by ssokolow
                    Stories like that
                    apply to driver developers too - they have to suffer many lockups during new hardware bringup
                    the last time I ever had the nVidia drivers do anything like that was back when I was on a GeForce 7600GS back around 2009
                    you mean novideo is no longer main source of bsods? wow such a great vendor.
                    i take it back, quick search shows https://forums.tomshardware.com/thre...bsods.3344146/
                    Last edited by pal666; 09-05-2019, 08:26 PM.


                    • #20
                      Originally posted by pal666
                      There is not much good in having a booted machine with all your session gone; boot is faster than browser startup.
                      False. I still use an HDD and boot takes like 1 minute.

                      Originally posted by pal666
                      i'd prefer killing at most one problematic app
                      I agree on this one though. I'd like to see this ability too, or at least some way to "unfreeze".
