Linus Torvalds On The Importance Of ECC RAM, Calls Out Intel's "Bad Policies" Over ECC


  • Originally posted by Zan Lynx

    According to Wikipedia (and its sources) SMT was invented by IBM and then first used commercially in the Alpha CPU by DEC. As I remember it, AMD bought most of DEC and their engineers, and DEC technology was a major part of the really great AMD64 design. Although it didn't use SMT.

    So if anyone owes license fees for SMT it sounds like it would be Intel. Although the original IBM patents would have expired decades ago.
    DEC was bought by Compaq and then Compaq was bought by Hewlett-Packard.

    DEC was acquired in June 1998 by Compaq in what was at that time the largest merger in the history of the computer industry. During the purchase, some parts of DEC were sold to other companies; the compiler business and the Hudson, Massachusetts facility, were sold to Intel. At the time, Compaq was focused on the enterprise market and had recently purchased several other large vendors. DEC was a major player overseas where Compaq had less presence. However, Compaq had little idea what to do with its acquisitions,[1][2] and soon found itself in financial difficulty of its own. The company subsequently merged with Hewlett-Packard (HP) in May 2002.

    https://en.wikipedia.org/wiki/Digita...nt_Corporation.
    I do not know about AMD getting DEC engineers, but I have read that many DEC engineers went to Microsoft and helped form the foundation of NTFS. (Please do not ask me where - I could probably find it.) DEC systems were amazing for their time. I loved VMS and the VAX architecture.
    GOD is REAL unless declared as an INTEGER.



    • Originally posted by Zan Lynx
      According to Wikipedia (and its sources) SMT was invented by IBM and then first used commercially in the Alpha CPU by DEC. As I remember it, AMD bought most of DEC and their engineers, and DEC technology was a major part of the really great AMD64 design. Although it didn't use SMT.
      Edit: I guess I remembered that wrong. Compaq bought DEC and HP bought Compaq? And DEC technology went into Intel Itanium? But for some reason I was sure AMD64 used some Alpha tech. Hmm.
      So if anyone owes license fees for SMT it sounds like it would be Intel. Although the original IBM patents would have expired decades ago.
      Well, you have a misunderstanding: AMD does not pay a license fee for the patent, or because Intel invented it. No, AMD pays a license fee because they use the Intel implementation of this technique.
      This means it does not save AMD here. They pay for the Intel implementation...
      Phantom circuit Sequence Reducer Dyslexia



      • Originally posted by Zan Lynx
        Edit: I guess I remembered that wrong. Compaq bought DEC and HP bought Compaq? And DEC technology went into Intel Itanium? But for some reason I was sure AMD64 used some Alpha tech. Hmm.
        According to the Wikipedia article about Dirk Meyer:

        He was a co-architect of the Alpha 21064 and Alpha 21264 microprocessors during his employment at DEC and also worked at Intel in its microprocessor design group.

        Meyer joined AMD in 1996, where he personally led the team that designed and developed the Athlon processor.
        Following the link to the Athlon article:

        The K7 design team was led by Dirk Meyer, who had previously worked as a lead engineer at DEC on multiple Alpha microprocessors. When DEC was sold to Compaq in 1998 and discontinued Alpha processor development, Sanders brought most of the Alpha design team to the K7 project.
        There is a [Citation Needed] note on the text about bringing most of the Alpha team to K7.
        Last edited by bridgman; 08 January 2021, 08:20 PM.



        • Originally posted by coder
          The chips are interleaved, on single-rank DIMMs. I'm not sure how dual-rank DIMMs are mapped, but it still won't be the case that one bad chip = 1/8th or 1/16 of the address range is unusable. It would either break the whole DIMM or maybe half of its address range, I think.


          Yes, one or more bad pages/rows/etc. can be blocked, at boot time. I've never done it, but I know it's possible.
          I had to do it recently to keep a machine running while I got fresh RAM modules ordered in. It is quite easy to do (on Linux, anyway - other OSes YMMV!).
          But it is also only a stop-gap measure: if your RAM is failing in one place now, you are best to assume more failures are likely coming, so replace the RAM ASAP!
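
A minimal sketch of one way to do this, not necessarily the method used above: the Linux kernel accepts a `memmap=nn[KMG]$ss[KMG]` boot parameter that marks a region as reserved so it is never allocated. The helper name, the 4 KiB page size, and the one-page padding below are assumptions for illustration; the faulty address would come from a memory tester or the kernel log.

```python
PAGE_SIZE = 4096  # assumed page size (typical on x86-64)


def memmap_reserve_arg(bad_addr: int, pad_pages: int = 1) -> str:
    """Build a memmap=nn$ss argument covering the page that holds bad_addr.

    Hypothetical helper: reserves the faulty page plus pad_pages pages on
    each side, clamped so the region never starts below address 0.
    """
    page_start = (bad_addr // PAGE_SIZE) * PAGE_SIZE
    start = max(0, page_start - pad_pages * PAGE_SIZE)
    end = page_start + (pad_pages + 1) * PAGE_SIZE
    size_kib = (end - start) // 1024
    return f"memmap={size_kib}K${start:#x}"
```

For a reported bad address of 0x12345678 this yields `memmap=12K$0x12344000`. The `$` usually needs escaping when the argument goes into a bootloader config file.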

          In the end, if the data in your RAM, or the time it would take to re-create it, is important to you... you need ECC. If not*, you don't.

          *If your AAA video game crashes from a bit flip one time out of the thousands of times it crashed because you bought it on release day and it wasn't actually finished yet, you probably won't care too much beyond a mild grumble as you restart the game from the last savepoint (if only because putting up with shit quality from software developers has long since become completely normalised in our society!). Auntie Maude probably won't even notice if your email to her has a typo caused by a single bit flip in the hundreds of text-encoding bytes she is reading. Not caring too much is a legitimate stance. But just be sure about it! .... 99.99% of what is in my RAM over the course of its life, I really wouldn't worry much about, but that other 0.01% is important enough to me to justify ECC in my own desktop!
          Last edited by Viki Ai; 08 January 2021, 08:30 PM.



          • Originally posted by bridgman
            According to the Wikipedia article about Dirk Meyer:
            Does AMD pay a licence fee to Intel for Hyper-Threading?
            Phantom circuit Sequence Reducer Dyslexia



            • Originally posted by Viki Ai
              I had to do it recently to keep a machine running while I got fresh RAM modules ordered in. It is quite easy to do (on Linux, anyway - other OSes YMMV!).
              But it is also only a stop-gap measure: if your RAM is failing in one place now, you are best to assume more failures are likely coming, so replace the RAM ASAP!

              In the end, if the data in your RAM, or the time it would take to re-create it, is important to you... you need ECC. If not*, you don't.

              *If your AAA video game crashes from a bit flip one time out of the thousands of times it crashed because you bought it on release day and it wasn't actually finished yet, you probably won't care too much beyond a mild grumble as you restart the game from the last savepoint (if only because putting up with shit quality from software developers has long since become completely normalised in our society!). Auntie Maude probably won't even notice if your email to her has a typo caused by a single bit flip in the hundreds of text-encoding bytes she is reading. Not caring too much is a legitimate stance. But just be sure about it! .... 99.99% of what is in my RAM over the course of its life, I really wouldn't worry much about, but that other 0.01% is important enough to me to justify ECC in my own desktop!
              There's a simple rule of thumb here:

              If you care about your data enough to back it up, you want ECC.

              If you use your PC to just do stuff and don't back up anything cause you don't care, then sure, don't use ECC.



              • Originally posted by Qaridarium
                Does AMD pay a licence fee to Intel for Hyper-Threading?
                I don't think so - AFAIK it's more of a cross-licensing (aka we don't sue you, you don't sue us) relationship.

                I don't know all the details but I doubt there are any per-unit fees.



                • Originally posted by bridgman

                  I don't think so - AFAIK it's more of a cross-licensing (aka we don't sue you, you don't sue us) relationship.

                  I don't know all the details but I doubt there are any per-unit fees.
                  That is interesting. I wonder (and I have no idea) if part of that is to avoid anti-trust issues since they are both US based. If AMD did not exist, there could be an anti-trust issue with x86-64 for Intel much like the whole issue with IBM in the 70s-80s (although I do not remember all of the details). I do know that Microsoft propped up Apple in order to avoid additional anti-trust regulation. The corporate world (and I work in it now) sure is bizarre...companies scrapping by day and being bedfellows by night. It is enough to give you a headache and make you drink.
                  GOD is REAL unless declared as an INTEGER.



                  • Originally posted by waxhead
                    Ok, but this is not quite how I have understood patrol scrub. Unlike what you correctly describe as a standard RAID consistency check, patrol scrub reads a block, verifies checksums, writes the same block back, and verifies that there is no error, i.e. a READ/WRITE cycle.
                    From what I can find, memory scrubbing only involves writing when a correctable error was found:
                    Originally posted by waxhead
                    This is supposed to exercise all memory and catch single bit errors early so that one can offline a memory module before it goes completely bonkers.
                    It's probably configurable, but what I've read says that corrected data is written back, if a single-bit error is detected.

                    You might offline a DIMM when a double-bit error is detected, but the only way I see that being workable is to simply prevent new allocations from using it. And even that would require that you're not interleaving memory channels (or that you're interleaving at page granularity).
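
A toy sketch of the behaviour described above, under stated assumptions: each stored word is a 4-bit value protected by an extended Hamming (8,4) SECDED code, and `encode`, `check`, and `scrub` are hypothetical names, not any memory controller's interface. Real hardware uses much wider words. The scrub pass reads every word and writes back only when a single-bit error was corrected; a double-bit error is only reported.

```python
def encode(nibble):
    """Encode 4 data bits as an 8-bit SECDED word (a list of bits).

    Index 0 holds the overall parity bit; indices 1..7 are the usual
    Hamming(7,4) positions, with check bits at 1, 2 and 4.
    """
    word = [0] * 8
    word[3], word[5], word[6], word[7] = [(nibble >> i) & 1 for i in range(4)]
    for p in (1, 2, 4):
        # Check bit p covers every position whose index has bit p set.
        word[p] = sum(word[i] for i in range(1, 8) if i & p and i != p) % 2
    word[0] = sum(word[1:]) % 2
    return word


def check(word):
    """Return (status, repaired_word) for one stored word."""
    syndrome = 0
    for i in range(1, 8):
        if word[i]:
            syndrome ^= i  # XOR of set positions is 0 for a valid codeword
    parity_bad = sum(word) % 2 == 1
    if not parity_bad:
        # Even overall parity: either clean, or two bits flipped.
        return ("ok" if syndrome == 0 else "uncorrectable"), word
    fixed = list(word)
    fixed[syndrome] ^= 1  # syndrome 0 means the overall parity bit itself flipped
    return "corrected", fixed


def scrub(memory):
    """One scrub pass: read every word, write back only corrected ones."""
    stats = {"ok": 0, "corrected": 0, "uncorrectable": 0}
    for addr, word in enumerate(memory):
        status, repaired = check(word)
        stats[status] += 1
        if status == "corrected":
            memory[addr] = repaired  # the only write the scrubber performs
    return stats
```

Clean words are never rewritten here, which is the distinction being drawn from a full READ/WRITE cycle.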



                    • Originally posted by mdedetrich
                      Actually, in terms of components, ECC is only marginally more expensive than non-ECC memory. You basically have an extra memory cell that stores parity data, which is in the ballpark of 10-15% of the cost.
                      That might not be a big deal for you, but it's definitely an issue for lower-margin players and market segments.
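
For a rough sense of where a figure in that range can come from, here is a small sketch assuming a textbook SECDED Hamming code (the function names are hypothetical, and real DIMM pricing depends on much more than bit count). A 64-bit data word needs 8 check bits, which matches the common 72-bit-wide ECC DIMM and works out to 12.5% extra DRAM.

```python
def secded_check_bits(data_bits: int) -> int:
    """Check bits for a single-error-correct, double-error-detect Hamming code."""
    r = 0
    # Smallest r whose 2**r syndromes can name every one of the
    # data_bits + r positions, plus a "no error" state.
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r + 1  # one extra overall-parity bit adds double-error detection


def overhead_percent(data_bits: int) -> float:
    """Extra storage needed for the check bits, as a percentage of the data."""
    return 100.0 * secded_check_bits(data_bits) / data_bits
```

The overhead shrinks as the protected word gets wider, which is one reason ECC is applied to 64-bit words rather than to individual bytes.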

                      Originally posted by mdedetrich
                      The main issue (which is what Linus is actually complaining about) is that Intel artificially segments the market so that the motherboards and CPUs that properly support ECC memory (i.e. typically server CPUs) are much more expensive than consumer ones.
                      It's only the motherboards that truly lack it. As far as I can tell, all CPUs that Intel produces seem to have the low-level capability to support it, but they disable that capability on most consumer-oriented CPUs.

                      There's a reasonable argument that if Intel just left the capability enabled on all CPUs, then more motherboards would support it and the price premium (i.e. price difference after accounting for COGS) on such motherboards and UDIMMs would largely disappear.
