AMDGPU Driver Sees More Fixes For Linux 5.7 Development

  • #11
    Originally posted by gardotd426

    Polaris and Vega are both incredibly stable. Before my 2600X I had a 3200G with Vega 8 integrated graphics, and then I had an RX 580 which is of course Polaris, and it was amazingly stable. I've heard that it took some time to get there though. I'm really hoping the new hire AMD is going to make for a lead kernel developer for amdgpu will spur some action on Navi's stability.
    Well, my 3200G doesn't like standby; it cannot resume my Plasma session. At least, after a lot of trouble with the 2200G, this is stable now. I also have a mobile 2500U, and it freezes the desktop from time to time. All this happens on updated Arch Linux systems.

    But my Radeon VII is stable as hell; it is almost impossible to find games that can bring it down.



    • #12
      Polaris (20)

      only 2 remaining issues (for me) are:

      >=2 identical monitor (1920x1080, HDMI, here) flickering problem
      (even with amdgpu.dcfeaturemask=2 amdgpu.ppfeaturemask=0xfffd7fff runpm=1)

      too high idle power usage (compared to the other OS) ~32 W, here.

      Sapphire Technology Limited Nitro+ Radeon RX 580, 8 GB
      (Maybe firmware issue. - AMD anyone?)

      01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] (rev e7) (prog-if 00 [VGA controller])
      Subsystem: Sapphire Technology Limited Nitro+ Radeon RX 570/580/590

      600 MHz (PSTATE_SCLK)
      1000 MHz (PSTATE_MCLK)

      Both P-states will NOT go lower?!
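
The `amdgpu.ppfeaturemask` value quoted above is a bitmask, so it can help to see which bits `0xfffd7fff` leaves cleared compared with an all-ones 32-bit mask. This is a minimal sketch in shell: the function name `cleared_bits` is hypothetical, and it only reports bit positions without claiming which driver feature each bit controls.

```sh
#!/bin/sh
# cleared_bits MASK: print the positions (0-31) of the bits that are 0 in MASK.
# Hypothetical helper for illustration only; it does not talk to the driver.
cleared_bits() {
    mask=$(( $1 ))
    bit=0
    out=""
    while [ "$bit" -lt 32 ]; do
        if [ $(( (mask >> bit) & 1 )) -eq 0 ]; then
            out="${out:+$out }$bit"
        fi
        bit=$(( bit + 1 ))
    done
    printf '%s\n' "$out"
}

cleared_bits 0xfffd7fff   # prints: 15 17
```

Which features sit behind bits 15 and 17 depends on the kernel version, so check the amdgpu headers for the kernel in use before drawing conclusions.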
      Last edited by nuetzel; 13 March 2020, 11:59 AM.



      • #13
        Originally posted by nuetzel
        Polaris (20)

        only 2 remaining issues (for me) are:

        >=2 identical monitor (1920x1080, HDMI, here) flickering problem
        (even with amdgpu.dcfeaturemask=2 amdgpu.ppfeaturemask=0xfffd7fff runpm=1)

        too high idle power usage (compared to the other OS) ~32 W, here.

        Sapphire Technology Limited Nitro+ Radeon RX 580, 8 GB
        (Maybe firmware issue. - AMD anyone?)

        01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] (rev e7) (prog-if 00 [VGA controller])
        Subsystem: Sapphire Technology Limited Nitro+ Radeon RX 570/580/590

        600 MHz (PSTATE_SCLK)
        1000 MHz (PSTATE_MCLK)

        Both P-states will NOT go lower?!
        Same dual-monitor issue here, only with two different resolutions, one on HDMI and the other on DVI. The problem is ppfeaturemask.

        Disable it and dual monitors will work as expected (though dcfeaturemask may still be needed).

        If you need to overclock your GPU and use Dual Monitors you have to mod the BIOS of your GPU.



        • #14
          Originally posted by skeevy420

          Same dual monitor issue, only two different resolutions and one is HDMI and the other is DVI. The problem is ppfeaturemask.

          Disable it and dual monitors will work as expected (though dcfeaturemask may still be needed).

          If you need to overclock your GPU and use Dual Monitors you have to mod the BIOS of your GPU.
          Sorry, this is NOT the 'right' solution.
          Without the amdgpu kernel parameters I mentioned, you'll never get low-power multi-monitor modes (with the same frequencies!). - Ask Alex...;-)
          No over-/underclock needed/used, here. - But thanks for your hint!
          Last edited by nuetzel; 13 March 2020, 12:51 PM.



          • #15
            Originally posted by nuetzel

            Sorry, this is NOT the 'right' solution.
            Without the amdgpu kernel parameters I mentioned, you'll never get low-power multi-monitor modes (with the same frequencies!). - Ask Alex...;-)
            No over-/underclock needed/used, here. - But thanks for your hint!
            This is all with Linux 5.5.8. With dcfeaturemask=2 on Fedora 31 with my dual monitor setup, sclk goes all the way down to 300mhz. mclk is what stays up. Without dcfeaturemask=2 the sclk uses 600mhz as a minimum.

            Enabling ppfeaturemask=0xfffd7fff results in the green wavy flickering effect. I assume that's due to the mclk going all the way down to 300mhz because, apparently, on both Windows and Linux it's now forced to 2000mhz (or whatever max is) to prevent issues when multiple monitors are in use, and that forcing doesn't happen when ppfeaturemask is enabled (I assume because they assume the user will fix it...I dunno...).

            I wonder if forcing the mclk to only use 1000mhz and 2000mhz would work when ppfeaturemask is enabled? I'll have to experiment with that later on.
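
The experiment described above could be sketched roughly as follows. As far as I understand the amdgpu sysfs interface, the `pp_dpm_*` files accept a space-separated list of allowed level indices once `power_dpm_force_performance_level` is set to `manual`. The helper name `restrict_levels` is hypothetical, the commands need root on a real system, and whether this actually avoids the flicker is untested here.

```sh
#!/bin/sh
# restrict_levels DEVICE_DIR CLOCK_FILE "LEVELS"
# Hypothetical helper: switch the device to manual performance control, then
# write the allowed level indices (e.g. "1 2") to the given pp_dpm_* file.
restrict_levels() {
    dev=$1
    clock_file=$2
    levels=$3
    # Accept only digits and spaces; reject an empty or malformed list.
    case $(printf '%s' "$levels" | tr -d ' ') in
        ''|*[!0-9]*) echo "invalid level list: $levels" >&2; return 1 ;;
    esac
    printf '%s\n' manual > "$dev/power_dpm_force_performance_level" || return 1
    printf '%s\n' "$levels" > "$dev/$clock_file"
}

# Example (as root), keeping only the 1000 MHz and 2000 MHz memory states
# from the pp_dpm_mclk listing in this post:
# restrict_levels /sys/class/drm/card0/device pp_dpm_mclk "1 2"
```

Writing `auto` back to `power_dpm_force_performance_level` should return the card to automatic switching.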

            Code:
            cat /sys/class/drm/card0/device/pp_dpm_sclk
            0: 300Mhz *
            1: 600Mhz
            2: 918Mhz
            3: 1167Mhz
            4: 1239Mhz
            5: 1282Mhz
            6: 1326Mhz
            7: 1366Mhz
            
            cat /sys/class/drm/card0/device/pp_dpm_mclk
            0: 300Mhz
            1: 1000Mhz
            2: 2000Mhz *
            
            cat /sys/class/drm/card0/device/power_dpm_force_performance_level
            auto
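
In the listings above, the `*` marks the level the card is currently using. A small sketch for pulling out just that value is shown below; the function name `active_level` is hypothetical, and it reads the listing from standard input so it can be tried on pasted output as well as on the live sysfs file.

```sh
#!/bin/sh
# active_level: read a pp_dpm_* listing on stdin and print the clock value of
# the line whose last field is "*", i.e. the currently selected level.
active_level() {
    awk '$NF == "*" { print $2 }'
}

# Sample taken from the pp_dpm_mclk output in this post; on a live system one
# could instead run: active_level < /sys/class/drm/card0/device/pp_dpm_mclk
printf '%s\n' '0: 300Mhz' '1: 1000Mhz' '2: 2000Mhz *' | active_level   # prints: 2000Mhz
```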



            • #16
              Originally posted by middy
              crashes are not normal... to any extent. crashes mean something is wrong and shouldn't be happening. you shouldn't be used to crashes, nor take them as part and parcel of daily use, as crashes should never be normal. just because they can happen doesn't and shouldn't mean they are part and parcel.

              i use nvidia on linux because i don't want any hassles. the drivers just work. and gsync works great with my 1440p 144hz gsync (real gsync) monitor. the control panel is super nice and convenient. i can easily see hardware info, change my fan speed, overclock with coolbits enabled, color calibration, monitor info like connector type and speed, easily acquire edid of my monitor, etc. see gpu load, video engine load, memory usage and pci-express bus usage and speed rate, see the card's power state and current clock rate, AA and AF overrides, etc. enable a game overlay to see fps, what api is in use, if vsync or gsync is enabled, etc. even create application profiles. it might still be the same interface from 2004 but it just works.

              yes nvidia has had problems in the past. i've been using nvidia on linux since i was in high school in 2005, along with a few amd / ati gpu's throughout the years. like i originally had a 5700xt but sent it back for a refund for a 2070 super because amd's windows drivers at the time were so awful i couldn't even go a day without some sort of hard crash / screen lock up, which has been a popular problem on amd with windows. i think they now finally fixed the screen flickering and locks with the latest windows driver release. i never really got to use it on linux sadly as i only had it for three weeks. now i'm finally full time linux again after finishing off my 3900x build and getting it fully tested and up and running.

              sometimes nvidia has been slow to update for newer xorg releases. and a few drivers throughout the years had some bugs, like locking up the screen if you killed the xorg session, usually caused by a new xorg release that nvidia still needed to optimize for. but nvidia always got around to fixing them and those issues are uncommon. nvidia is as good on linux as it is on windows from my experience. performance wise they are close. i know people usually have a sour taste in their mouth due to that laptop thing, optimus / bumblebee or w.e but there have been improvements with it over the years. but from a desktop or single manufacturer gpu perspective, nvidia works well. plus their video decoder and encoder with nvec+cuda works amazingly well, both on linux and windows. way better than amd's from what i've seen. i use vo=gpu and hwdec=nvdec in mpv and it works like a champ.

              overall from what i've seen, older amd releases like vega and polaris are running well on linux. no different than nvidia in stability, and even performance is on par. and i see a lot of people say amd on linux with those generations is way better than on windows but that's because amd on windows has really dropped the ball. navi appears to still have growing pains on linux, though it is more stable than on windows.
              Agreed, poor driver quality is simply unacceptable.



              • #17
                Originally posted by skeevy420

                This is all with Linux 5.5.8. With dcfeaturemask=2 on Fedora 31 with my dual monitor setup, sclk goes all the way down to 300mhz. mclk is what stays up. Without dcfeaturemask=2 the sclk uses 600mhz as a minimum.

                Enabling ppfeaturemask=0xfffd7fff results in the green wavy flickering effect.
                That's the 'flickering issue' (see the bug tracker).
                Without this you do NOT get full low power idle with more than 1 (_same_) monitor. (see amd-staging-drm-next).

                I assume that's due to the mclk going all the way down to 300mhz because,
                No, it stems from the automatic mclk _switching_ (down to 300 mhz and up to 2000 mhz max).

                apparently, on both Windows and Linux it's now forced to 2000mhz (or whatever max is) to prevent issues when multiple monitors are in use, and that forcing doesn't happen when ppfeaturemask is enabled (I assume because they assume the user will fix it...I dunno...).
                'auto switching' is disabled by default because it is currently broken.
                So you need 'amdgpu.ppfeaturemask=0xfffd7fff' to enable it for testing (amd-staging-drm-next).
                Without it you never get full low idle.
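
Before comparing results, it is worth confirming which `amdgpu.*` options the running kernel was actually booted with. A minimal sketch, assuming the options were passed on the kernel command line: the helper name `amdgpu_params` is hypothetical, and the sample command line below is made up for illustration apart from the two amdgpu options taken from this thread.

```sh
#!/bin/sh
# amdgpu_params: print every amdgpu.* option found in a kernel command line.
# Reads the command line from stdin so it can be tried without a live system.
amdgpu_params() {
    tr ' ' '\n' | grep '^amdgpu\.'
}

# On a running system (assumption: /proc/cmdline is readable):
#   amdgpu_params < /proc/cmdline
# Illustrative command line; only the amdgpu options come from this thread.
printf '%s\n' 'root=/dev/sda2 rw amdgpu.dcfeaturemask=2 amdgpu.ppfeaturemask=0xfffd7fff' | amdgpu_params
```

If nothing is printed, the options were not on the command line and may have been set elsewhere, for example in a modprobe configuration file.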

                I wonder if forcing the mclk to only use 1000mhz and 2000mhz would work when ppfeaturemask is enabled? I'll have to experiment with that later on.
                It should, but you do NOT get full low idle.

                Code:
                cat /sys/class/drm/card0/device/pp_dpm_sclk
                0: 300Mhz *
                1: 600Mhz
                2: 918Mhz
                3: 1167Mhz
                4: 1239Mhz
                5: 1282Mhz
                6: 1326Mhz
                7: 1366Mhz
                
                cat /sys/class/drm/card0/device/pp_dpm_mclk
                0: 300Mhz
                1: 1000Mhz
                2: 2000Mhz *
                
                cat /sys/class/drm/card0/device/power_dpm_force_performance_level
                auto
                Let me see
                cat /sys/kernel/debug/dri/0/amdgpu_pm_info
                please.

                GFX Clocks and Power:
                300 MHz (MCLK)
                300 MHz (SCLK)
                600 MHz (PSTATE_SCLK)
                1000 MHz (PSTATE_MCLK)
                750 mV (VDDGFX)
                32.60 W (average GPU)

                GPU Temperature: 28 C
                GPU Load: 0 %
                MEM Load: 3 %

                As I wrote, everything here is fine except the 'green wavy flickering effect' and maybe the firmware bug (PSTATE_SCLK and PSTATE_MCLK are too high and do not go down).
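
For anyone who wants to track the idle power figure over time, the value can be pulled out of the `amdgpu_pm_info` text shown above. This is only a sketch: the function name `average_gpu_watts` is hypothetical, it assumes the line format matches the output in this post, and reading the debugfs file normally requires root.

```sh
#!/bin/sh
# average_gpu_watts: read amdgpu_pm_info text on stdin and print the number
# from the "(average GPU)" line.
average_gpu_watts() {
    awk '/\(average GPU\)/ { print $1 }'
}

# Sample lines copied from the output in this post; on a live system (as root):
#   average_gpu_watts < /sys/kernel/debug/dri/0/amdgpu_pm_info
printf '%s\n' '750 mV (VDDGFX)' '32.60 W (average GPU)' | average_gpu_watts   # prints: 32.60
```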



                • #18
                  Originally posted by nuetzel
                  Polaris (20)

                  only 2 remaining issues (for me) are:

                  >=2 identical monitor (1920x1080, HDMI, here) flickering problem
                  (even with amdgpu.dcfeaturemask=2 amdgpu.ppfeaturemask=0xfffd7fff runpm=1)

                  too high idle power usage (compared to the other OS) ~32 W, here.

                  Sapphire Technology Limited Nitro+ Radeon RX 580, 8 GB
                  (Maybe firmware issue. - AMD anyone?)

                  01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] (rev e7) (prog-if 00 [VGA controller])
                  Subsystem: Sapphire Technology Limited Nitro+ Radeon RX 570/580/590

                  600 MHz (PSTATE_SCLK)
                  1000 MHz (PSTATE_MCLK)

                  Both P-states will NOT go lower?!
                  All you have to do is either:

                  Code:
                  sudo sh -c "echo high > /sys/class/drm/card0/device/power_dpm_force_performance_level"
                  Or

                  Code:
                  sudo sh -c "echo low > /sys/class/drm/card0/device/power_dpm_force_performance_level"
                  or select high or low within radeon-profile, etc.

                  The problem is that you have your performance level set to "auto," which causes the flickering issue when you have amdgpu.ppfeaturemask enabled.

                  If you're on an Arch-based distribution, TK-Glitch has custom kernels with a patch that fixes this, so you don't need to do any of those things. You can find it here:

                  https://github.com/tk-glitch/PKGBUILDS

                  You'll find the patch in community-patches/linux55-tkg or ../linux56-tkg; every Linux kernel he provides has it in the community-patches directory.


                  It's a rather simple workaround. I've used it on Navi and Polaris, and it always works. If you set it to high, your GPU runs at the highest clock states; low goes to the lowest, obviously. With TKG's patch you can have it set to auto or whatever you want; it doesn't matter.
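
The two commands above differ only in the value written, so they can be wrapped in one small function. This is a hedged sketch rather than anything shipped with the driver: `set_perf_level` is a hypothetical name, and it deliberately accepts only the three levels discussed in this thread.

```sh
#!/bin/sh
# set_perf_level DEVICE_DIR LEVEL
# Hypothetical wrapper around the echo commands above: reject anything other
# than the three levels discussed in this thread, then write the value.
set_perf_level() {
    dev=$1
    level=$2
    case $level in
        auto|low|high) ;;
        *) echo "unsupported level: $level" >&2; return 1 ;;
    esac
    printf '%s\n' "$level" > "$dev/power_dpm_force_performance_level"
}

# Example (as root): set_perf_level /sys/class/drm/card0/device low
```

Taking the device directory as an argument keeps the function usable on systems where the card is not `card0`.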



                  • #19
                    Originally posted by gardotd426
                    The problem is that you have your performance level set to "auto," which causes the flickering issue when you have amdgpu.ppfeaturemask enabled.
                    No, no, no.
                    That's not the problem.

                    'auto' is definitely what we want to get lowest idle power usage (with >1 identical monitors).
                    And that is where the bug currently lies. Alex and others are working on it.

                    Your hints only 'overrule' the 'auto' mode by setting max or low as constant mode.

                    I've only summarized what's currently not working (for me) with current Polaris drivers.
                    E.g. with 'amd-staging-drm-next' (Over the last several months/year.)

                    The second point was that some Nitro+ cards (mine included) seem to have a firmware bug that does not allow PSTATE_SCLK and PSTATE_MCLK to decrease to reach the lowest idle power usage. No over-/underclocking.
                    ~32 W min is too much with 2 identical HD monitors for Polaris chips.
                    https://gitlab.freedesktop.org/drm/amd/issues/629

                    RX580 should be better than RX480, so...

                    tomshardware.com
                    Idle and Low-Load Power Consumption
                    The Radeon RX 480’s minimum GPU and memory clock rate is 300MHz, resulting in an idle power measurement of 16W (or 19W if you're using multiple monitors). That's simply too high for a modern graphics card. It was actually hard to take a stable reading at idle since even an empty Windows desktop is subject to load fluctuations. The card reacted quickly whenever these occurred, in spite of its high minimum frequency.

                    And this:
                    Bug 110865 - Rx480 consumes 20w more power in idle than under Windows
                    It is NOT solved. I have 2 identical HDMI monitors, here.
                    Moved, here:
                    https://gitlab.freedesktop.org/drm/amd/issues/817

                    And here:
                    https://www.reddit.com/r/Amd/comment...idle_too_high/
                    Last edited by nuetzel; 14 March 2020, 12:36 AM. Reason: Bug 110865 moved to gitlab.



                    • #20
                      Originally posted by nuetzel

                      No, no, no.
                      That's not the problem.

                      'auto' is definitely what we want to get lowest idle power usage (with >1 identical monitors).
                      And that is where the bug currently lies. Alex and others are working on it.

                      Your hints only 'overrule' the 'auto' mode by setting max or low as constant mode.

                      I've only summarized what's currently not working (for me) with current Polaris drivers.
                      E.g. with 'amd-staging-drm-next' (Over the last several months/year.)

                      The second point was that some Nitro+ cards (mine included) seem to have a firmware bug that does not allow PSTATE_SCLK and PSTATE_MCLK to decrease to reach the lowest idle power usage. No over-/underclocking.
                      ~32 W min is too much with 2 identical HD monitors for Polaris chips.
                      https://gitlab.freedesktop.org/drm/amd/issues/629

                      RX580 should be better than RX480, so...

                      tomshardware.com
                      Idle and Low-Load Power Consumption
                      The Radeon RX 480’s minimum GPU and memory clock rate is 300MHz, resulting in an idle power measurement of 16W (or 19W if you're using multiple monitors). That's simply too high for a modern graphics card. It was actually hard to take a stable reading at idle since even an empty Windows desktop is subject to load fluctuations. The card reacted quickly whenever these occurred, in spite of its high minimum frequency.

                      And this:
                      Bug 110865 - Rx480 consumes 20w more power in idle than under Windows
                      It is NOT solved. I have 2 identical HDMI monitors, here.
                      Moved, here:
                      https://gitlab.freedesktop.org/drm/amd/issues/817

                      And here:
                      https://www.reddit.com/r/Amd/comment...idle_too_high/
                      Doing any of the three things I mentioned eliminates the flickering issue, which is what I was referring to. If you still want to be able to use "auto," the patch works.
