
AMD's Raven Ridge Botchy Linux Support Appears Worse With Some Motherboards/BIOS


  • #51
    Originally posted by agd5f View Post
    The reference platforms are generally not sold directly.
    that's a shame. you should sell your branded motherboards then

    Comment


    • #52
      Originally posted by bridgman View Post
      We had to remove some of that abstraction and replace it with Linux-specific logic, making the code more expensive to maintain and more likely to have bugs (but more readable and better integrated with Linux-specific subsystem code), in order to have it accepted upstream.
A bit off-topic, but by "expensive to maintain" are you referring to the past or the present? As I understand it, Windows does not require the code to be open, so development could happen on Linux first and then get a Windows-specific shim (Windows won't care either way). In that case there's no ongoing maintenance cost, is there?

      Comment


      • #53
        Originally posted by bridgman View Post

        Actually no - the abstraction you are complaining about there was deliberate in order to let the same code run cleanly across a broad range of OSes and platforms. We had to remove some of that abstraction and replace it with Linux-specific logic, making the code more expensive to maintain and more likely to have bugs (but more readable and better integrated with Linux-specific subsystem code), in order to have it accepted upstream.

        There is some auto-generated code as agd5f already said (the bandwidth calculation logic) but if anything that is under-abstracted not over-abstracted.
What's this nonsense about "more likely to have bugs"? The fact of the matter is that -readable- code is -less- likely to have bugs. Code that can be comprehended can be fixed; code that can't be, can't be. Compare how buggy fglrx was to the OSS stack... History has already proven that fglrx was a horribly buggy mess across the entire stack, in every component. Nobody, and I mean not one single person, wants that mess in the OSS stack.

        How can simpler, far more readable, way better commented code be more expensive to maintain? I absolutely guarantee far more people can read the current DC code and understand it than the initial release. That much is absolutely certain. How can it possibly cost more?
        Last edited by duby229; 19 February 2018, 04:47 PM.

        Comment


        • #54
          Originally posted by Hi-Angel View Post
          A bit offtopic, by "expensive to maintain" are you referring to past or present? Per my understanding Windows does not require opening the code, so development probably happens with Linux, and then gets Windows-specific shim (i.e. because Windows won't care anyway). There's no maintainance cost, is there?
          Present and future. Development still happens primarily with Windows and diags, and so we keep needing to add/update Linux-specific code rather than using what was already written for (and, more importantly, tested on) other platforms.
Test signature

          Comment


          • #55
Thanks Michael for explaining the difficulties. I got a Ryzen 2200G and an ASRock AB350M Pro4, but the combination with Arch Linux didn't work well. Thanks to my older Ryzen 1700 I could upgrade the BIOS to the latest version, so at least the system now boots past the UEFI with the 2200G. But even with linux-firmware git, Mesa git, and Linux 4.15, 4.16 git, or the AMD staging DRM-next tree, I cannot get past kernel modesetting; the system freezes as in Michael's Antergos screenshot. Not a good experience so far. Some trouble was expected, but I thought upgrading would make the system at least occasionally usable. What I personally don't like is that these graphics hiccups can freeze Linux, but that is a generic problem in Linux. Anyway, I'm confident the issues will be sorted out soon, and then it will be truly powerful hardware :-)

            Comment


            • #56
              Originally posted by duby229 View Post
What's this nonsense about "more likely to have bugs"? The fact of the matter is that -readable- code is -less- likely to have bugs. Code that can be comprehended can be fixed; code that can't be, can't be. Compare how buggy fglrx was to the OSS stack... History has already proven that fglrx was a horribly buggy mess across the entire stack, in every component. Nobody, and I mean not one single person, wants that mess in the OSS stack.
The new code is *less* readable for the primary display devs, who mostly work on OSes/platforms other than Linux. What we did was add some OS-specific code paths (used for Linux only) to make the Linux view more readable for Linux DRM developers. The downside is that this also replaces common (and heavily tested on other OSes) functions with Linux-specific code that can no longer leverage that testing.

              I don't understand the connection with fglrx other than maybe both of them being something you don't like.

              Originally posted by duby229 View Post
              How can simpler, far more readable, way better commented code...
Boy, that's the biggest pile of superlatives I have seen in a while. I'm glad you think so highly of the new code.

              Originally posted by duby229 View Post
              ... be more expensive to maintain? I absolutely guarantee far more people can read the current DC code and understand it than the initial release. That much is absolutely certain. How can it possibly cost more?
The people doing the maintaining (and the bulk of the testing) are not working with the new Linux-specific code; they are still working with the original cross-platform code, since that is what runs on the other OSes & drivers. The question you should be asking is "how can more readable but only lightly tested code be buggier than less readable but heavily and continuously tested code?", and you probably already know the answer to that.

              Don't get me wrong, there was also some "just plain cleanup" which was an all-around win, but when we replace common and heavily tested code with OS-specific code that doesn't leverage testing done on other platforms you don't automatically benefit from those changes in terms of ongoing code quality.

              The benefits come in other ways, primarily (a) making it easier for community developers to maintain the code if AMD goes away, and (b) making it easier for people changing other parts of the kernel to understand the potential impact of those changes on amdgpu driver code.

              By the way there is a "next step to the plan" which should help to reduce the impact of having to add more Linux-specific code paths... over time I am hoping we can start moving our HW diagnostics to run over the standard Linux driver rather than the separate code paths used today. If so then that should help to make up for some of the test coverage we lost by replacing common code with Linux-specific paths.
              Last edited by bridgman; 19 February 2018, 05:45 PM.

              Comment


              • #57
                Originally posted by bridgman View Post

The new code is *less* readable for the primary display devs, who mostly work on OSes/platforms other than Linux. What we did was add some OS-specific code paths (used for Linux only) to make the Linux view more readable for Linux DRM developers. The downside is that this also replaces common (and heavily tested on other OSes) functions with Linux-specific code that can no longer leverage that testing.

                I don't understand the connection with fglrx other than maybe both of them being something you don't like. The main problems with fglrx were (a) it was not upstream and so distro packagers could not help with integration and testing, and (b) it didn't evolve with new DRM features like kernel modesetting and TTM and so gradually became out of sync with the behaviour expected by a kernel gfx driver.
No. Just no. What it had was abstraction layers that -couldn't- test OS-specific code paths. Where bugs exhibited themselves would almost always be in the abstraction layer, because it couldn't test OS paths.

                Boy, that's the biggest pile of superlatives I have seen in a while. I'm glad you think so highly of the new code
It still has its share of problems, I think, but it is better now. I can follow the code and largely understand what it is trying to do, and it is better commented now in many places. About the superlatives: I try to be as descriptive as I can, sometimes without knowing how to be... Sorry.

The people doing the maintaining (and the bulk of the testing) are not working with the new Linux-specific code; they are still working with the original cross-platform code since that is what works on the other OSes & drivers. The question you want to be asking is "how can more readable but only lightly tested code be buggier than less readable but heavily and continuously tested code?", and you probably already know the answer to that.

                Don't get me wrong, there was also some "just plain cleanup" which was a net gain, but when we replace common and heavily tested code with OS-specific code that doesn't leverage testing done on other platforms you don't automatically win.

                That said, there is a "next step to the plan" which should help to reduce the impact of having to add more Linux-specific code paths... over time I am hoping we can start moving our HW diagnostics to run over the standard Linux driver rather than the separate code paths used today. If so then that should help to make up for some of the test coverage we lost by replacing common code with Linux-specific paths.
                That's my point exactly. Isn't that the root of the problem?

                Comment


                • #58
                  Originally posted by duby229 View Post
                  No. Just no. What it had was abstraction layers that -couldn't- test OS specific code paths. Where bugs would exhibit themselves would almost always be in the abstraction layer because it couldn't test OS paths.
Huh? It "couldn't test OS-specific paths" because they weren't there - most of the code was common (albeit at the price of being abstracted away from the OS code).

                  Originally posted by duby229 View Post
                  That's my point exactly. Isn't that the root of the problem?
You quoted a lot of text, so I'm not sure what "that" refers to... but I'm guessing you mean "isn't the fact that most of the common code development is done on other OSes the root of the problem?".

If so, then the answer is "what did you expect?". The whole idea behind using DC in the Linux driver was to leverage testing done on other OSes with much larger market shares and hence much larger engineering budgets.

                  Comment


                  • #59
I don't think this is a Linux problem. Windows people can't run it either. The low-end motherboards, even the one AMD sent JayzTwoCents with the APU, didn't work on Windows. They have a firmware problem. Wait for another AGESA.

                    Comment


                    • #60
                      bridgman
From what you're saying, you had to let go of reusable and tested code that supposedly worked on multiple other systems, in order to please the Linux developers. It looks like you were forced to create a fork of your code. That must have been really frustrating for your engineers. Do you think that melding process brought any benefits to the code for the other platforms as well?

                      Comment
