Loongson Introducing An EDAC Driver For LoongArch + ECC Memory Systems

  • phoronix
    Administrator
    • Jan 2007
    • 67050

    Phoronix: Loongson Introducing An EDAC Driver For LoongArch + ECC Memory Systems

    Loongson's LoongArch processors for the Chinese market have been primarily for desktop systems, but it looks like their workstation/server ambitions may be growing, with the company now contributing an Error Detection And Correction (EDAC) driver for Loongson SoCs with ECC memory...
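
For readers who have not used EDAC before: on Linux, EDAC drivers report corrected and uncorrected memory error counts per memory controller through sysfs, in the `ce_count` and `ue_count` files under `/sys/devices/system/edac/mc/mcN/`. The following is a minimal sketch of reading them. The helper name `read_edac_counts` is hypothetical, and it is an assumption that the Loongson driver exposes the same generic layout as other EDAC drivers.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce_count": int|None, "ue_count": int|None}}.

    ce_count = corrected errors, ue_count = uncorrected errors.
    An empty dict means no EDAC memory controller is registered under root.
    """
    counts = {}
    base = Path(root)
    if not base.is_dir():
        return counts
    for mc in sorted(base.glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            try:
                entry[name] = int((mc / name).read_text().strip())
            except (OSError, ValueError):
                # File missing, unreadable, or not a plain integer.
                entry[name] = None
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    for controller, entry in read_edac_counts().items():
        print(controller, entry)
```

On a machine without ECC memory or without a loaded EDAC driver, the function simply returns an empty dictionary.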

  • ayumu
    Senior Member
    • Oct 2008
    • 613

    #2
    Caring about ECC might have China easily leapfrog the west.

    • Developer12
      Senior Member
      • Dec 2019
      • 1526

      #3
      Somehow these guys can put ECC support in place, but after 30 years Intel still can't get their shit together.

      • JMB9
        Senior Member
        • Mar 2016
        • 226

        #4
        I wouldn't bet on it right now - but Intel did lie for a long time about the need for ECC, and AMD has allowed ECC
        for some time (though more quietly than by really pushing ECC to market with facts and guarantees).
        But the current reality is that more than 64 GB of ECC RAM may not work on a lot of motherboards.
        So for my system (Zen 4: Ryzen 9 7900, RDNA 3: Radeon RX 7800 XT, and ASUS ROG STRIX X670E-E)
        going from 64 GB to 128 GB was no longer recommended ... - and the situation was similar for other
        vendors, too - thus I suspect the basic chipsets are not really reliable - or at least not as reliable as
        they were thought/designed to be.

        Additionally I had a hard time getting my system to boot reliably ... which may also be a problem with the ASUS board,
        compounded by problems with the NVMe drives (2 TB) and with the initial Ubuntu 24.04.1.
        After several updates of Ubuntu 24.04.1, a lot of testing, and no longer installing OS partitions or
        EFI partitions on the NVMe drives, booting seems to work - I had never seen such ugly race conditions before ... it was
        like a Las Vegas feeling whether a partition would boot ...
        Even now EFI still throws error messages - but it boots reliably.
        This is not what I would expect from a workstation - and the old BIOS trash should have been replaced long ago
        by something usable ... and still a CMOS backup battery ... cough!
        No kidding - it had to be replaced after a year ... thanks ASUS!

        In particular, 4 TB SSDs and 2 TB NVMe drives are really tiny - 8 TB SSDs would be OK ... but anything below that is
        just problematic. And guarantees longer than 5 years would be welcome - as data reliability is key!

        But my main problem is the lack of 8k screens (prototypes were shown in 2015, along with the first cheap 4k screens):
        non-glare, DisplayPort 2.1(+) [i.e. 10/2022 and newer] and larger than 46" [necessities for a workstation!]
        (31.5" screens do not make full use of the information in 4k - so 8k would be completely wasted at that size
        except for DTP usage; glare screens are not for working 10+ hours; HDMI is just a source of problems -
        especially for Linux, as reported by Phoronix when AMD was not allowed to support basic HDMI features
        in open source drivers - so no one should target that trash for real work - not to mention the crappy
        plugs of HDMI). Maybe Vulkan 1.4 can help to change industry targets, as '8k rendering with up to
        8 separate render targets is now guaranteed to be supported, along with several other limits increased'.
        For education 64k resolution would be a realistic target ... 4k has slightly fewer pixels (i.e. 8 294 400)
        than a 300 dpi page from a laser printer of 1990 (i.e. 8 699 840 pels) ... and a lot fewer than the pocket slides
        formerly used for education in the 1980s ...
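
The pixel comparison above can be checked with a few lines of arithmetic. This is only a sketch under two assumptions that are not stated in the post: "4k" is taken to mean UHD at 3840x2160, and the printed page is taken to be A4 (210 mm x 297 mm), which is the page size that reproduces the 8 699 840 figure at 300 dpi. The function names are made up for illustration.

```python
MM_PER_INCH = 25.4


def pixel_count(width, height):
    """Total pixels of a screen with the given resolution."""
    return width * height


def page_pixels(width_mm, height_mm, dpi):
    """Total dots on a page, rounding each side to whole dots."""
    dots_wide = round(width_mm / MM_PER_INCH * dpi)
    dots_high = round(height_mm / MM_PER_INCH * dpi)
    return dots_wide * dots_high


uhd_4k = pixel_count(3840, 2160)        # assumed meaning of "4k"
uhd_8k = pixel_count(7680, 4320)        # assumed meaning of "8k"
a4_300dpi = page_pixels(210, 297, 300)  # assumed page size: A4

print(f"4k screen : {uhd_4k:>10,} pixels")
print(f"A4 300dpi : {a4_300dpi:>10,} dots")
print(f"8k screen : {uhd_8k:>10,} pixels")
```

Under these assumptions a 4k screen has about 95% as many pixels as the 300 dpi page has dots, while 8k has four times the pixels of 4k.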

        So this is a general quality problem in the industry ... really strange.

        But if China aims at quality, maybe the US must leave its current comfort zone of selling trash
        for a lot of money ... and thus start creating high quality again.

        This problem can also be seen in Linux (kernel and distros) - as it is much more difficult
        to get compatibility information than in the 1990s ... a list would easily show the problems
        which make a workstation usable only after it is at least 2 years old ... HWE does not do the magic,
        and most Internet info comes from 2010 and earlier - no longer helpful today ... :
        o Is there any workstation motherboard out there that gets automatic firmware updates
        via the Linux LVFS/fwupd service? In a reliable and thus frequent way?
        o Which sound cards are completely supported by current ALSA (used in latest ??.4.1 LTS)?
        o Which printers are really usable out of the box with current CUPS (used in latest LTS ?.4.1)?
        ...

        Just some ideas and thoughts about the current situation for a Linux workstation ...
        from some recent experience ... I never thought that the situation had gotten _so bad_ !!!
        It would be appropriate if the industry really cared ... and the Linux Foundation would be
        the one to start that ... not looking in other directions ...

        • NateHubbard
          Senior Member
          • Mar 2015
          • 575

          #5
          Originally posted by JMB9
          Just some ideas and thoughts about the current situation for a Linux workstation ...
          from some recent experience ... I never thought that the situation had gotten _so bad_ !!!
          If it makes you feel better, I'm not having any issues at all with my hardware. An MSI motherboard with 64GB, and a Ryzen 7700 with two monitors, one HDMI, and one DisplayPort.

          • Developer12
            Senior Member
            • Dec 2019
            • 1526

            #6
            Originally posted by JMB9
            I wouldn't bet on it right now - but Intel did lie for a long time about the need for ECC, and AMD has allowed ECC
            for some time (though more quietly than by really pushing ECC to market with facts and guarantees).
            But the current reality is that more than 64 GB of ECC RAM may not work on a lot of motherboards.
            ECC requires just an extra 8 data lines between the DIMMs and the CPU, and a few minor changes to the memory controller inside the CPU. The size of the sticks doesn't matter.
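
To illustrate why 8 extra lines are enough: with 64 data bits, 7 Hamming check bits plus 1 overall parity bit give a 72-bit word that can correct any single flipped bit and detect any double flip (SECDED). The sketch below is a textbook extended Hamming code written for clarity, not the check matrix any real memory controller uses (those are vendor-specific and an assumption here is only that the bit budget is the same). The names `encode` and `decode` are illustrative.

```python
PARITY_POS = [1, 2, 4, 8, 16, 32, 64]                       # Hamming check bits
DATA_POS = [p for p in range(1, 72) if p not in PARITY_POS]  # 64 data positions


def encode(data):
    """Pack a 64-bit integer into a 72-bit codeword (list of 0/1).

    Index 0 holds the overall parity bit; indices 1..71 are Hamming positions.
    """
    word = [0] * 72
    for i, pos in enumerate(DATA_POS):
        word[pos] = (data >> i) & 1
    for p in PARITY_POS:
        parity = 0
        for pos in range(1, 72):
            if pos & p and pos != p:
                parity ^= word[pos]
        word[p] = parity
    word[0] = sum(word[1:]) % 2  # make the total number of ones even
    return word


def decode(word):
    """Return (data, status); status is 'ok', 'corrected' or 'uncorrectable'."""
    word = list(word)
    syndrome = 0
    for pos in range(1, 72):
        if word[pos]:
            syndrome ^= pos  # XOR of set positions is 0 for a valid word
    odd_parity = sum(word) % 2 == 1
    if syndrome == 0 and not odd_parity:
        status = "ok"
    elif odd_parity and syndrome < 72:
        word[syndrome] ^= 1  # single flip; syndrome 0 means the parity bit itself
        status = "corrected"
    else:
        return None, "uncorrectable"  # e.g. two flipped bits
    data = 0
    for i, pos in enumerate(DATA_POS):
        data |= word[pos] << i
    return data, status
```

For example, flipping any one element of `encode(0xDEADBEEFCAFEBABE)` and passing the result to `decode` returns the original value with status `'corrected'`, while flipping two elements returns `'uncorrectable'`.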

            You _might_ run into an issue, though, where the motherboard vendor didn't really pay a lot of attention when routing the ECC lines and didn't test them thoroughly, if they bothered to connect them at all. Motherboard vendors already regularly violate the JEDEC standards and instead rely on "qualified vendor lists" of sticks they tried out and are "pretty sure" work.

            Part of that is on DDR being a standard that is long overdue to be replaced and is in conflict with the laws of physics. Every other interface in a modern computer, from USB to SATA to PCIe, has moved to independent lanes with differential signalling, which is much easier to get right and can offer much higher bandwidth. It's just too hard to get every wire in a 64-72 bit parallel bus to line up with each other electrically at high speeds.

            • chithanh
              Senior Member
              • Jul 2008
              • 2491

              #7
              I used to be a big proponent of ECC but not any longer.

              I think that memory encryption with integrity protection (à la AMD SEV-SNP) will cover most uses of ECC going forward. Additionally, it is more fine-grained and will work on "normal" RAM without the extra data lines. It does come with some overhead, but you can limit that to VMs which actually require integrity; with a Qubes OS-like approach you could even set this individually per application.

              So far, AMD does not want to enable SEV in consumer products despite the silicon being capable, which is a shame.

              • aviallon
                Senior Member
                • Dec 2022
                • 273

                #8
                Originally posted by chithanh
                I used to be a big proponent of ECC but not any longer.

                I think that memory encryption with integrity protection (à la AMD SEV-SNP) will cover most uses of ECC going forward. Additionally, it is more fine-grained and will work on "normal" RAM without the extra data lines. It does come with some overhead, but you can limit that to VMs which actually require integrity; with a Qubes OS-like approach you could even set this individually per application.

                So far, AMD does not want to enable SEV in consumer products despite the silicon being capable, which is a shame.
                ECC has no real use for security, and is tailored for reliability and stability (of long-running systems, or where computing with accurate results is of prime importance).

                • chithanh
                  Senior Member
                  • Jul 2008
                  • 2491

                  #9
                  Originally posted by aviallon
                  ECC has no real use for security,
                  It does. If you don't have memory encryption, then ECC becomes your last line of defense against Rowhammer.
                  Originally posted by aviallon
                  computing with accurate results
                  ECC ensures not the accuracy but the integrity of computation.
