PCIe bus errors on Linux with GTX 980


  • #31
    Well, no helpful updates from ASRock at this point. It was suggested that I enable the "above 4G decoding" option, but I'm not sure that will help (any thoughts on this?). If I were running out of address space for memory mapping, I would expect a more explicit error or unrecognised devices, no? I may blindly give it a try anyway...

    In other news, I found another way to work around the problem in Linux. As previously mentioned, the problem can be avoided by disabling MSI with pci=nomsi. The drawback of this approach is that all other MSI-capable devices also fall back to legacy interrupts, which is supposedly less than optimal. For the last 24 hours I've been trying a different kernel setting that also seems to work around the problem: pci=nommconf. From the kernel documentation:

    nommconf [X86] Disable use of MMCONFIG for PCI Configuration

    This allows everything to run with MSI, but I don't know if there are drawbacks of disabling MMCONFIG.
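    For anyone who wants to confirm which pci= options a kernel was actually booted with, here is a minimal sketch. It only parses a command-line string; the sample string and the function name pci_options are made up for illustration, and on a real system the string would come from /proc/cmdline.

```python
def pci_options(cmdline):
    """Collect the comma-separated values of every pci= parameter."""
    options = []
    for token in cmdline.split():
        if token.startswith("pci="):
            # pci= accepts several comma-separated options, e.g. pci=nomsi,noaer
            options.extend(opt for opt in token[len("pci="):].split(",") if opt)
    return options


# Hypothetical command line, not taken from the thread.
sample_cmdline = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet pci=nommconf"

opts = pci_options(sample_cmdline)
print("pci= options:", opts)
print("MSI disabled:", "nomsi" in opts)
print("MMCONFIG disabled:", "nommconf" in opts)
```

    To check the running system instead, pass the contents of /proc/cmdline to pci_options.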



    • #32
      Originally posted by BLinux
      Well, no helpful updates from ASRock at this point. It was suggested that I enable the "above 4G decoding" option, but I'm not sure that will help (any thoughts on this?). If I were running out of address space for memory mapping, I would expect a more explicit error or unrecognised devices, no? I may blindly give it a try anyway...

      In other news, I found another way to work around the problem in Linux. As previously mentioned, the problem can be avoided by disabling MSI with pci=nomsi. The drawback of this approach is that all other MSI-capable devices also fall back to legacy interrupts, which is supposedly less than optimal. For the last 24 hours I've been trying a different kernel setting that also seems to work around the problem: pci=nommconf. From the kernel documentation:

      nommconf [X86] Disable use of MMCONFIG for PCI Configuration

      This allows everything to run with MSI, but I don't know if there are drawbacks of disabling MMCONFIG.
      In any case, let us know how it works out for you. I'll see if I can look into trying that option this weekend.



      • #33
        I tried enabling the "above 4G decoding" option in the BIOS as suggested by ASRock support, but it did not help. AER PCIe bus error messages were still spewing everywhere, even during boot.
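        To compare settings more objectively than "spewing everywhere", the AER messages in a saved kernel log can be counted per severity. This is only a sketch: the thread never quotes the actual log lines, so the "PCIe Bus Error: severity=..." pattern and the sample lines below are assumptions about the message shape, and count_aer_errors is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed message shape; adjust the pattern to match your own kernel log.
AER_LINE = re.compile(r"PCIe Bus Error: severity=([^,]+)")


def count_aer_errors(log_lines):
    """Count AER bus error lines, grouped by the reported severity."""
    counts = Counter()
    for line in log_lines:
        match = AER_LINE.search(line)
        if match:
            counts[match.group(1).strip()] += 1
    return counts


# Illustrative lines only, not copied from anyone's system.
sample_log = [
    "pcieport 0000:00:03.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0018(Receiver ID)",
    "usb 1-1: new high-speed USB device number 2 using xhci_hcd",
    "pcieport 0000:00:03.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=0018(Transmitter ID)",
]

print(count_aer_errors(sample_log))
```

        Feeding it the kernel log from one boot per BIOS or kernel setting gives a rough before/after comparison.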



        • #34
          I had the same error. Uninstalling helped.



          • #35
            Originally posted by marcinw
            I had the same error. Uninstalling helped.
            You mean reinstalling the drivers fixed the error?



            • #36
              I tried uninstalling Nvidia's drivers entirely. I was still getting tons of errors. Unfortunately, I don't have another GPU to test.

              I think this is an issue that has to be fixed with a BIOS update. I'll try bugging Asus at some point today. Hopefully pointing out that some Windows users have seen this with various other X99 boards will help.



              • #37
                Potential Progress.

                I got a response back from Asus' tech support. They recommended that I try the latest BIOS (1103). Unfortunately, I'm already using that BIOS (I installed it the day it came out, which was very recently), and I am still having this issue.

                The support fellow said if that doesn't fix the issue, he'll see what he can find out. At least I haven't been dismissed (yet?) for using Linux! Hopefully this will get this problem addressed. Maybe SteamOS is getting motherboard makers to be more receptive?

                BLinux,

                Any luck yet?



                • #38
                  Originally posted by hiryu
                  I got a response back from Asus' tech support. They recommended that I try the latest BIOS (1103). Unfortunately, I'm already using that BIOS (I installed it the day it came out, which was very recently), and I am still having this issue.

                  The support fellow said if that doesn't fix the issue, he'll see what he can find out. At least I haven't been dismissed (yet?) for using Linux! Hopefully this will get this problem addressed. Maybe SteamOS is getting motherboard makers to be more receptive?

                  BLinux,

                  Any luck yet?
                  Hi hiryu - nothing from my side. ASRock insists I install Windows 7 or 8 and use their drivers before they will take any action. I don't have a license for either, so I asked if I could use the Windows 10 Technical Preview since it's free, and they said they don't support it. So far the pci=nommconf option has been very stable, with no errors whatsoever, while maintaining the use of MSI/MSI-X. So, for now, that is my solution.
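                  One way to confirm that MSI/MSI-X is still in use after booting with pci=nommconf is to look at the interrupt type column in /proc/interrupts. The sketch below splits numbered IRQs into MSI and non-MSI groups from a text dump. The sample text is invented for illustration (the exact column layout varies between kernel versions), and split_irqs_by_type is a hypothetical name.

```python
def split_irqs_by_type(interrupts_text):
    """Return (msi_irqs, other_irqs) from /proc/interrupts-style text."""
    msi, other = [], []
    for line in interrupts_text.splitlines():
        fields = line.split()
        if not fields or not fields[0].rstrip(":").isdigit():
            continue  # skip the CPU header and named rows such as NMI or LOC
        irq = int(fields[0].rstrip(":"))
        # "PCI-MSI" also matches variants such as PCI-MSI-edge or PCI-MSIX
        (msi if "PCI-MSI" in line else other).append(irq)
    return msi, other


# Invented sample, loosely shaped like /proc/interrupts output.
sample_interrupts = """\
           CPU0       CPU1
  0:         36          0   IO-APIC    2-edge      timer
 16:        120          4   IO-APIC   16-fasteoi   ehci_hcd:usb1
 41:      90210       1187   PCI-MSI 524288-edge      nvidia
NMI:          3          2   Non-maskable interrupts
"""

msi_irqs, other_irqs = split_irqs_by_type(sample_interrupts)
print("MSI/MSI-X IRQs:", msi_irqs)
print("Other IRQs:", other_irqs)
```

                  With pci=nomsi the MSI list would be expected to come back empty, whereas with pci=nommconf it should not.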



                  • #39
                    I would just use Windows 10, but if you insist on using Windows 8 you can get a legal trial as well:

                    The Microsoft Evaluation Center brings you full-featured Microsoft product evaluation software available for download or trial on Microsoft Azure.


                    All you need is an MS account (without one, Win 8/10 is very limited anyway).



                    • #40
                      Good news everyone!

                      I've almost entirely fixed the problem with my Asus Rampage V Extreme. I decided to disable the "AI overclock" option (that's probably not the *exact* name) in the BIOS and set it to manual. With overclocking set to manual (and not overclocking at all), my system is completely stable. I left my system up for 8-9 hours last night and saw just 3 of these errors. Furthermore, all the different things that used to crash my system with 100% certainty no longer cause any problems (previously The Witcher 2 crashed my system with 100% reliability while, strangely, other intense 3D games such as Metro Redux were mostly okay apart from very occasional crashes).

                      My understanding is that the AI Overclocking option will overclock your CPU slightly by default (I just learned this last night). But last night, while rebooting my computer trying to find a solution to this problem, I noticed during POST that my CPU speed was reported at a bit over 6000 MHz!!! So it seems AI overclocking is a buggy feature. I had simply left it on the default auto state, as I figured it wouldn't actually attempt to overclock unless I told it to (at least in theory, I think that is how it's supposed to operate). Seeing as my system nearly always reported a 3.0 GHz speed at POST (the native speed of the CPU), I think this feature is probably just generally unstable and buggy (and may also be unintentionally doing more to the system than simply overclocking the CPU).

                      Something else to note: when you change this setting from AI overclocking to manual, save your settings in the BIOS and exit, the system doesn't power off. It simply restarts the boot process, the same as with most BIOS changes, and once you've booted you will still see the usual PCIe errors. I experimented with this a bit and found that you need to fully power off the system after making and saving this change in the BIOS. Once you power back on, you will be okay. I figured this out because when you (re)enable AI overclocking, save your changes and exit, the system fully power cycles itself to apply those changes.

                      But now I can play videos fullscreen without my system randomly choking, The Witcher 2 and BioShock no longer lock the system up, etc., so I feel the problem is 99% solved.

                      Note: I'm currently running BIOS 1302 on RVE, but I suspect this fix will work with (most) earlier BIOS versions and probably 1401 as well (which I found to be too buggy for other reasons so I reverted back to 1302).

                      BLinux,

                      Are you still having issues? Maybe you have a similar option in your BIOS?
