Skylake Lockups, or what is going on?

  • Skylake Lockups, or what is going on?

    Hello. I've finally built a computer, which has the following specs:

    - Processor: Intel Core i7-6700K (stock frequency)
    - Motherboard: MSI Z170A GAMING PRO
    - Memory: Around 16GB
    - Graphics: (didn't install a card yet)

    So, I tried to run Linux (kernel 4.6-rt) on it, because Michael posted that motherboard works great under Linux in an article.

    It works almost perfectly, but I've been having a real problem: it locks up after anywhere from a few minutes to a few hours, and I don't know why.

    So I upgraded to Linux 4.8.15-rt, and I didn't get any lockup until the next day.

    So I disabled rc6, hoping it'd fix the lockup.

    Now it doesn't lock up that often, but it still does after 9 to 12 hours...

    So I modified this PKGBUILD and upgraded to 4.9.6-rt, and didn't get any lockups for the first few days. However, if I leave a USB hard drive connected, then by the time I return from the bathroom (I usually take ~30 minutes), the system is locked up, with the Ethernet activity light blinking forever and the fans running quietly.

    So I upgraded to 4.9.9-rt, but now it is even less stable: on some days it locks up at a random time.
    I even configured the kernel to panic on all lockups, but it doesn't even panic. So I went back to 4.9.6-rt.
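
For readers following along, panicking on lockups is typically switched on through the kernel's sysctl knobs. The post does not say which settings were used, so the sketch below is only an assumption about one common way to do it; `PROC_ROOT` and `enable_lockup_panic` are hypothetical names made up for this example, and knobs that a given kernel build lacks are skipped.

```shell
#!/bin/sh
# Sketch only: PROC_ROOT and enable_lockup_panic are hypothetical names
# for this example. Run as root on a real system.
PROC_ROOT="${PROC_ROOT:-/proc}"

# Ask the kernel to panic on detected lockups instead of only logging them.
enable_lockup_panic() {
    for knob in softlockup_panic hardlockup_panic hung_task_panic; do
        path="$PROC_ROOT/sys/kernel/$knob"
        if [ -w "$path" ]; then
            echo 1 > "$path"
        else
            echo "skipping $knob (not available in this kernel, or not root)" >&2
        fi
    done
    # Reboot 10 seconds after a panic so the machine does not stay dead.
    [ -w "$PROC_ROOT/sys/kernel/panic" ] && echo 10 > "$PROC_ROOT/sys/kernel/panic"
    return 0
}
```

These detectors depend on the CPUs still servicing the watchdog, so a freeze that stops them entirely may never reach the panic path, which would be consistent with seeing no panic at all.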

    All the kernels I used were real-time ones, because I need real-time support. I have not tested a non-realtime one yet.

    I am using Arch Linux (with sysvinit) as my distribution.

    If anyone could help me (especially Michael due to the motherboard), I'd really really appreciate it.

    And if this is in the wrong category, please tell me, and I'll delete this thread and repost it in the right one. Thank you in advance.

  • #2
    Hey tildearrow : have you tried updating the BIOS? Or resetting its BIOS to default settings? Those could be possible scenarios. Also, can you be more specific about your RAM? Thanks.
    Michael Larabel
    https://www.michaellarabel.com/


    • #3
      Originally posted by Michael View Post
      Hey tildearrow : have you tried updating the BIOS? Or resetting its BIOS to default settings? Those could be possible scenarios. Also, can you be more specific about your RAM? Thanks.
      Now it locked up after 2 minutes of uptime.

      I fear bricking my motherboard by updating its firmware. I'll try once I have the money for a new one.

      And I'll try to be more specific regarding memory:
      It's 2 Kingston KHX2400C15/8G, installed in slots 2 and 4.

      Also, has the system crashed for you?


      • #4

        Nice set up! I'm also thinking of setting up one. I guess this gave me an idea.


        • #5
          Hmm, strange problem. Maybe there's some funky interaction going on with frequency scaling and rt kernel. Perhaps try these one by one, checking for stability each time:

          - Install microcode updates (pacman -S intel-ucode)
          - Try disabling turboboost, either in the BIOS or in OS (# echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo)
          - Try disabling the intel_pstate frequency scaling driver by adding intel_pstate=disable to the kernel command line and rebooting. Set a constant CPU frequency with cpupower (pacman -S cpupower). If that works, try one of the other governors, e.g. performance, ondemand etc. If it's still stable at that point, it's probably intel_pstate messing something up. If it's only stable at a constant frequency, it's probably some bug in the rt kernel that can't handle dynamic CPU scaling on your particular CPU or mobo.
          - If all of the above fails, try booting from a live USB and just muck around in there for a while (surf the net, whatever). If it's stable it's probably either the rt kernel (most likely, in which case just install regular arch kernel) or a HDD problem (you'd have to be unlucky, but it happens).
          - Very remote chance it could be faulty ram. If none of the above has helped, try running on one stick. If problems persist try just running on the other.
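
The turbo and governor steps above can be sketched as a small shell helper. This is only an illustration, not a tested fix: `SYSFS_ROOT`, `disable_turbo` and `show_governors` are hypothetical names made up for this example, and the `no_turbo` knob only exists while the intel_pstate driver is active.

```shell
#!/bin/sh
# Sketch only: SYSFS_ROOT, disable_turbo and show_governors are hypothetical
# names for this example. Run as root on a real system.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Disable Turbo Boost through intel_pstate (only present while that driver is active).
disable_turbo() {
    knob="$SYSFS_ROOT/devices/system/cpu/intel_pstate/no_turbo"
    if [ ! -w "$knob" ]; then
        echo "cannot write $knob (intel_pstate inactive, or not root?)" >&2
        return 1
    fi
    echo 1 > "$knob"
}

# Print the active scaling governor of each core, one per line.
show_governors() {
    for dir in "$SYSFS_ROOT"/devices/system/cpu/cpu[0-9]*/cpufreq; do
        [ -r "$dir/scaling_governor" ] || continue
        cpu=${dir%/cpufreq}   # strip the trailing /cpufreq
        cpu=${cpu##*/}        # keep only the cpuN component
        echo "$cpu: $(cat "$dir/scaling_governor")"
    done
}
```

After rebooting with intel_pstate=disable, `cpupower frequency-info` shows which driver is in use and `cpupower frequency-set -g performance` selects a governor; calling `show_governors` afterwards confirms that the change took effect on every core.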

          For more detail on freq scaling see: https://wiki.archlinux.org/index.php...quency_scaling

          EDIT: oh and, as someone mentioned above, decent chance it's due to old buggy BIOS firmware (in which case update when you can). It's possible the bug only manifests with the RT kernel, so while you're saving up in case of a brick (super unlikely in my experience) you might try switching to the regular Arch kernel in the interim.
          Last edited by spangry; 04 March 2017, 06:36 AM.


          • #6
            Well, I did try the performance governor and intel_pstate=disable. It was very stable until today. Then it locked up twice, once just a few minutes after I disconnected from SSH (the processor was running at 800MHz with the powersave governor, by the way).

            Anyway, I updated the firmware. I'll test for a month and report back.


            • #7
              You might check your mobo


              • #8
                Question: does the machine stay stable under Windows, assuming you have tried it (it is a gaming motherboard)? That would help narrow down whether the root of the issue is software or hardware.


                • #9
                  I have never tested on Windows. If it locks up again I'll test on it.


                  • #10
                    Originally posted by tildearrow View Post
                    Anyway, I updated the firmware. I'll test for a month and report back.
                    Sounds good, will be interested to hear how it turns out.
