Linus Torvalds On The Importance Of ECC RAM, Calls Out Intel's "Bad Policies" Over ECC
-
Originally posted by drhoho
I live at 7500 feet. Cosmic ray flux is 7.2 times higher than at sea level. During solar storms with 32 GB of non-ECC I'd have to reboot 1-3 times a day. Got a new laptop with 64 GB of ECC (Xeon) and it's stable for months.
Linus is correct and the cosmic ray issue is real. Google "Cisco cosmic rays ECC", etc. Sandia did a study that confirmed the issue of cosmic particle flux and non-ECC.
About Introduction We’ve collected a few amusing and interesting things about bit flipping caused by cosmic rays the other day. Please rest assured we are only using ECC memory - Wikipedia for our server machines. DRAM quotes “A bit flipping at random is not a problem solely related to broken memory. Perfectly healthy memory is also subject, with a small probability, to bit flipping because of cosmic rays. […] According to a few sources, including IBM, Intel and Corsair, a computer with...
So, not a rant, but reality.
- Likes 1
Comment
-
Hi everybody:
I signed up just for this post.
About Linus and ECC: bollocks.
What is the problem with ECC?
* It costs more because, instead of 8 memory chips, a module needs a 9th, plus a chip to do the checking.
* It adds latency. Why? Because that tiny chip has to validate every operation.
And what is the advantage of it?
For a start, ECC does not automatically correct memory errors.
Usually, ECC follows these steps:
* The system reads a region of memory, including the parity check.
* If the parity check fails, it tries the read again. That is the "correction", and yes, it is flaky: usually it doesn't solve the problem at all and the system moves on to the next step.
* If the read fails again, the system halts (or enters interactive/maintenance mode). On an expensive server, it may light an indicator and even report which module failed.
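The read/retry/halt flow described in the steps above can be sketched roughly as follows. This is only a toy model, assuming one even-parity bit per byte and a hypothetical `read_fn` callable standing in for the memory access; it is not how any particular memory controller is implemented. ECC DIMMs typically use a SECDED code that can also correct a single flipped bit, which this sketch does not model.

```python
def even_parity(byte: int) -> int:
    """Return the even-parity bit for an 8-bit value."""
    return bin(byte & 0xFF).count("1") % 2


def checked_read(read_fn, retries: int = 1) -> int:
    """Read through read_fn, retrying on a parity mismatch.

    read_fn is a hypothetical callable returning (data_byte, stored_parity).
    After the retries are exhausted, raise an error to model the halt step.
    """
    for _ in range(retries + 1):
        data, stored_parity = read_fn()
        if even_parity(data) == stored_parity:
            return data  # parity matches: accept the data
    raise RuntimeError("uncorrectable memory error: halting")
```

A transient fault (the first read is bad, the retry is good) returns the data normally, while a persistent one ends in the `RuntimeError`, mirroring the last step.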
Also, memory is far less prone to failure than in the past. For example, if you have worked in a datacenter, replacing hard disks is routine and budgeted into the overall costs. By contrast, it is uncommon to replace memory or the CPU, mainly because those components sit well inside the motherboard, protected by several layers of capacitors.
Comment
-
Originally posted by Ivan Dimitrov
Totally support Linus's sentiment, although it is not fully correct. AMD supports (validates) ECC RAM ONLY on their PRO processors. ECC is enabled on non-PRO processors, but validation and implementation are left to the motherboard vendor. So ECC will most likely work on a non-PRO processor, but it is not clear whether it will work in ECC mode and how it will report errors. You can watch Wendel for the current state of ECC on Ryzen - in short, it is messy.
You can argue that Ryzen's "support" of ECC can give a false sense of security for some users....
Anyway, whoever needs ECC is most likely to buy the solution validated by the vendor, which means Ryzen PRO and Xeon CPUs - not a big difference here. So to be precise, instead of "AMD did it", I think the correct statement is more like "AMD raised a valid point, made some noise and scored some marketing points for ECC RAM support".
- Likes 3
Comment
-
Originally posted by zxy_thf
-- "The "modern DRAM is so reliable that it doesn't need ECC"
How dare they lie with a straight face?
DRAM is much much less reliable than even five years ago. MemTest is mandatory nowadays even if you're not doing anything special.
I purchased DRAM modules ~4 years ago and again last year. The older ones run on my semi-server (without ECC because I misread the spec) quite happily, but one of the latter modules can't even pass MemTest for 1 minute.
Another module from my colleague (bought last year) was also broken from the beginning.
A funny fact about modern DDR4: modules are mostly DDR4-2133 under the JEDEC spec (1.2V), and people simply overclock them to 3200+ at 1.35V, under an obscure name called "Intel XMP".
The desktop DRAM market is not trustworthy anymore, when overclocking becomes the "new common".
If the newer RAM modules were *that* defective, people would be ranting and up in arms about bad RAM. And tech reviewers would also be ranting about it.
Comment
-
Originally posted by schmidtbag
This is one of those things where if you don't know whether you need it or even know what it is, you very likely (but not assuredly) don't need it. You don't need ECC on a family PC. You don't need ECC for a gaming PC. You don't need ECC for a home media center. You don't need ECC for an office PC that just runs a web browser and MS Office all day. Bit flipping is a very real and dangerous problem but it's not enough of a threat to the average user. If it were, either all RAM would be ECC or all CPUs would support ECC.
Comment
-
Originally posted by Mel Spektor
Can Intel just die please?
Enjoy the lack of voltage, current and power monitoring. In the future, probably even frequency monitoring won't be supported. Oh, you're trying to figure out why your Ryzen CPU cores aren't boosting to the turbo frequency? Well, it turns out that only one of them is guaranteed to boost to the turbo frequency. Everything else is a coin toss. AMD's turbo frequency is a lie. It's pure marketing bullshit.
- Likes 1
Comment
-
Originally posted by CommunityMember
I believe the issue is not that most people don't care, but they are not aware they should care or need to make a choice.
When it comes to such purchases, some people (a lot of people, actually) want to go to Wally World (or BestBuy) and get a cheap, but serviceable tool, and some people want the best, no matter what the price.
One tends to choose ECC only when one is spending OPM, and not so much when you have to choose to spend your own money on your family's devices (unless your income is like high 6/7 figures like Linus's).
Of course, Linus is speaking only to the choir about the value of things like ECC. One should certainly ask him whether his laptop, and his phone, and all those he has bought for his family have ECC, or whether he has chosen "good enough" in that case, which is, for better or worse, where most people tend to end up (I am only aware of a handful of laptop vendors that offer ECC, and while I suspect there is some specific phone with ECC, that is not the norm unless you have a .gov at the end of your email address).
Only reason we know on desktop is because Intel and AMD hardware usually comes with their stickers plastered on.
Comment
-
Originally posted by Zan Lynx
People need it even if they don't know it.
I've helped out friends and family who have corrupted documents. Was it their drive? I guess they should have been running btrfs or ZFS. They didn't know it, but they needed it. Or it could have been a RAM error while copying or saving the file. However, since their computer didn't have any error correction there is no way to know what error correction would have helped.
Nice Catch-22 there, isn't it? Without ECC you don't know if you needed ECC.
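To make the Catch-22 concrete, here is a minimal sketch using a textbook Hamming(7,4) code. This is an assumption chosen for illustration: real ECC DIMMs use wider SECDED codes over 64-bit words, and the function names are made up for this example. Without check bits, a flipped bit just yields another valid-looking value. With them, the decoder gets a non-zero syndrome that both flags and locates the flip.

```python
def hamming74_encode(nibble: int) -> int:
    """Encode 4 data bits into a 7-bit Hamming codeword (positions 1..7)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]  # covers positions 1, 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]  # covers positions 2, 3, 6, 7
    p4 = d[1] ^ d[2] ^ d[3]  # covers positions 4, 5, 6, 7
    bits = [p1, p2, d[0], p4, d[1], d[2], d[3]]
    return sum(b << i for i, b in enumerate(bits))


def hamming74_decode(word: int):
    """Return (nibble, syndrome); syndrome 0 means no error was seen.

    A non-zero syndrome is the 1-based position of a single flipped bit,
    which is corrected before the data bits are extracted.
    """
    bits = [(word >> i) & 1 for i in range(7)]
    syndrome = 0
    for pos in range(1, 8):
        if bits[pos - 1]:
            syndrome ^= pos  # XOR of the positions of all set bits
    if syndrome:
        bits[syndrome - 1] ^= 1  # flip the offending bit back
    nibble = bits[2] | (bits[4] << 1) | (bits[5] << 2) | (bits[6] << 3)
    return nibble, syndrome
```

Flipping any one bit of `hamming74_encode(n)` and passing the result to `hamming74_decode` returns the original `n` together with the position of the flip, which is exactly the evidence a non-ECC machine never records. Two simultaneous flips would be miscorrected by this small code; SECDED adds an extra parity bit to detect that case.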
- Likes 1
Comment
-
Originally posted by Sonadow
The fact remains that the world has been fine with using non-ECC memory on high-end professional workstations, desktop PCs, gaming PCs and work PCs for decades with no major complications or consequences. People *want* ECC memory but don't *need* it. Gamerfags sure as hell don't *need* ECC memory other than to have dick size comparisons with other gamerfags.
I have a 2990WX workstation with 128GB of standard DDR4 memory for use in compiling software (with the memory being used as a RAMdisk for faster compilations) and have not run into any problems since day one.
Save the limited stocks of ECC memory for the machines that really *need* them, like production-critical servers in gigantic datacentres where even a single corrupted bit results in massive consequences.
But yeah I agree that most AMD fans are simply thumping their chest and comparing their dick/boob sorry, CPU sizes to others'.
Also stop using fags in a derogatory way like that, it's a shitty thing to do.
- Likes 1
Comment