Failing A PCIe 5.0 NVMe SSD In Less Than 3 Minutes Without Extra Cooling
Following the Corsair MP700 PCIe 5.0 NVMe SSD review under Linux, many readers were surprised that file-system errors occurred when no extra cooling was added, such as the motherboard's passive M.2 heatsink. Being curious about the situation myself, here are more tests showing how this drive will reliably hit file-system errors in three minutes or less without added cooling.
After getting the basic Linux testing out of the way for the Corsair MP700 2TB, I did some further testing to look at the file-system errors that occur when running without any added cooling. After all, one would normally assume the NVMe solid-state drive would throttle under excessive heat before outright reaching the point of file-system errors.
As mentioned in the review, I was hitting file-system errors simply when installing tests after a clean Ubuntu 23.04 install, before even actually stressing the NVMe drive with benchmarks. So I was quite curious to see how long the drive would last under disk benchmarking workloads when run without any after-market cooling. Long story short, it was less than three minutes from a clean boot before reliably hitting errors.
Collecting the dmesg logs remotely, file-system errors would typically appear within 180 seconds of a boot after simply starting a MariaDB server and running mysqlslap to exercise the database server. Repeating this several times, it always happened within three minutes, and the drive temperature reported via the NVMe HWMON sysfs interface was always around 87 degrees Celsius. For what it's worth, the MP700 tech specs outline a temperature range of -40C to 85C.
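As a side note for anyone wanting to monitor this on their own system, the kernel's NVMe HWMON support exposes the drive's composite temperature through the usual hwmon sysfs layout. Here's a minimal Python sketch of reading it -- not part of my test harness, and the hwmon paths/names can vary per system:

from pathlib import Path

def nvme_temps():
    # Walk the hwmon devices and pick out the ones registered by the NVMe driver.
    temps = {}
    for hwmon in Path("/sys/class/hwmon").glob("hwmon*"):
        if (hwmon / "name").read_text().strip() != "nvme":
            continue
        # temp1_input is the composite temperature in millidegrees Celsius.
        millic = int((hwmon / "temp1_input").read_text().strip())
        temps[str(hwmon)] = millic / 1000.0
    return temps

if __name__ == "__main__":
    for dev, celsius in nvme_temps().items():
        print(f"{dev}: {celsius:.1f} C")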
From the collected logs, at least under Linux it looks like the NVMe controller goes down, which in turn leads to the EXT4 file-system errors. For example:
[ 177.187278] nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0xffff
[ 177.187283] nvme nvme0: Does your device have a faulty power saving mode enabled?
[ 177.187285] nvme nvme0: Try "nvme_core.default_ps_max_latency_us=0 pcie_aspm=off" and report a bug
[ 177.235051] nvme 0000:19:00.0: Unable to change power state from D3cold to D0, device inaccessible
[ 177.235151] nvme nvme0: Disabling device after reset failure: -19
[ 177.251298] nvme0n1: detected capacity change from 3907029168 to 0
[ 177.251308] EXT4-fs warning (device nvme0n1p2): ext4_end_bio:343: I/O error 10 writing to inode 55194516 starting block 125091905)
[ 177.251310] Buffer I/O error on device nvme0n1p2, logical block 129128830
[ 177.251312] EXT4-fs warning (device nvme0n1p2): ext4_end_bio:343: I/O error 10 writing to inode 55193889 starting block 273991222)
[ 177.251316] Buffer I/O error on device nvme0n1p2, logical block 124816449
[ 177.251317] Buffer I/O error on device nvme0n1p2, logical block 273715766
[ 177.251322] EXT4-fs warning (device nvme0n1p2): ext4_end_bio:343: I/O error 10 writing to inode 29884642 starting block 129404287)
[ 177.251327] EXT4-fs warning (device nvme0n1p2): ext4_end_bio:343: I/O error 10 writing to inode 55194516 starting block 125091902)
[ 177.251327] EXT4-fs warning (device nvme0n1p2): ext4_end_bio:343: I/O error 10 writing to inode 29884642 starting block 129404288)
[ 177.251329] Buffer I/O error on device nvme0n1p2, logical block 124816446
[ 177.251331] EXT4-fs warning (device nvme0n1p2): ext4_end_bio:343: I/O error 10 writing to inode 30164735 starting block 129396193)
[ 177.251331] EXT4-fs warning (device nvme0n1p2): ext4_end_bio:343: I/O error 10 writing to inode 55194516 starting block 125091875)
[ 177.251332] Buffer I/O error on device nvme0n1p2, logical block 129120737
[ 177.251333] Buffer I/O error on device nvme0n1p2, logical block 124816419
[ 177.251334] EXT4-fs warning (device nvme0n1p2): ext4_end_bio:343: I/O error 10 writing to inode 55194516 starting block 125091852)
[ 177.251336] Buffer I/O error on device nvme0n1p2, logical block 124816396
[ 177.251336] EXT4-fs warning (device nvme0n1p2): ext4_end_bio:343: I/O error 10 writing to inode 30164735 starting block 129396194)
[ 177.251338] EXT4-fs warning (device nvme0n1p2): ext4_end_bio:343: I/O error 10 writing to inode 55194516 starting block 125091849)
[ 177.251339] Buffer I/O error on device nvme0n1p2, logical block 124816393
[ 177.251341] Buffer I/O error on device nvme0n1p2, logical block 124816371
[ 177.251343] Buffer I/O error on device nvme0n1p2, logical block 124816353
[ 177.251345] Aborting journal on device nvme0n1p2-8.
[ 177.251353] EXT4-fs error (device nvme0n1p2) in ext4_reserve_inode_write:5906: Journal has aborted
[ 177.251358] EXT4-fs error (device nvme0n1p2): ext4_journal_check_start:83: comm kworker/u64:9: Detected aborted journal
[ 177.251359] Buffer I/O error on dev nvme0n1p2, logical block 220726066, lost async page write
[ 177.251360] Buffer I/O error on dev nvme0n1p2, logical block 243826688, lost sync page write
[ 177.251362] EXT4-fs error (device nvme0n1p2): ext4_dirty_inode:6110: inode #55194516: comm systemd-journal: mark_inode_dirty error
[ 177.251365] Buffer I/O error on dev nvme0n1p2, logical block 121147223, lost async page write
[ 177.251366] JBD2: I/O error when updating journal superblock for nvme0n1p2-8.
[ 177.251367] EXT4-fs error (device nvme0n1p2) in ext4_dirty_inode:6111: Journal has aborted
[ 177.251369] Buffer I/O error on dev nvme0n1p2, logical block 121141967, lost async page write
[ 177.251372] Buffer I/O error on dev nvme0n1p2, logical block 121110852, lost async page write
[ 177.251373] EXT4-fs error (device nvme0n1p2) in ext4_reserve_inode_write:5906: Journal has aborted
[ 177.251375] Buffer I/O error on dev nvme0n1p2, logical block 121110544, lost async page write
[ 177.251376] EXT4-fs error (device nvme0n1p2): ext4_dirty_inode:6110: inode #55193889: comm rs:main Q:Reg: mark_inode_dirty error
[ 177.251376] EXT4-fs error (device nvme0n1p2): ext4_journal_check_start:83: comm mariadbd: Detected aborted journal
[ 177.251377] Buffer I/O error on dev nvme0n1p2, logical block 121110529, lost async page write
[ 177.251378] Buffer I/O error on dev nvme0n1p2, logical block 0, lost sync page write
[ 177.251379] Buffer I/O error on dev nvme0n1p2, logical block 120587356, lost async page write
[ 177.251381] Buffer I/O error on dev nvme0n1p2, logical block 58, lost async page write
[ 177.251384] EXT4-fs error (device nvme0n1p2) in ext4_dirty_inode:6111: Journal has aborted
[ 177.251386] EXT4-fs (nvme0n1p2): previous I/O error to superblock detected
[ 177.251386] EXT4-fs error (device nvme0n1p2): ext4_journal_check_start:83: comm systemd-journal: Detected aborted journal
[ 177.251391] EXT4-fs error (device nvme0n1p2): ext4_journal_check_start:83: comm rs:main Q:Reg: Detected aborted journal
[ 177.251399] EXT4-fs (nvme0n1p2): previous I/O error to superblock detected
[ 177.251406] EXT4-fs (nvme0n1p2): I/O error while writing superblock
[ 177.251407] EXT4-fs (nvme0n1p2): Remounting filesystem read-only
[ 177.251409] EXT4-fs (nvme0n1p2): failed to convert unwritten extents to written extents -- potential data loss! (inode 29884642, error -30)
[ 177.251413] EXT4-fs (nvme0n1p2): failed to convert unwritten extents to written extents -- potential data loss! (inode 30164735, error -30)
[ 177.251415] EXT4-fs (nvme0n1p2): I/O error while writing superblock
[ 177.251415] EXT4-fs (nvme0n1p2): I/O error while writing superblock
[ 177.251417] EXT4-fs (nvme0n1p2): previous I/O error to superblock detected
[ 177.251420] EXT4-fs (nvme0n1p2): I/O error while writing superblock
[ 186.476141] EXT4-fs error (device nvme0n1p2): __ext4_find_entry:1663: inode #63307794: comm apport: reading directory lblock 0
[ 186.476151] buffer_io_error: 4 callbacks suppressed
[ 186.476152] Buffer I/O error on dev nvme0n1p2, logical block 0, lost sync page write
[ 186.476154] EXT4-fs: 2 callbacks suppressed
[ 186.476154] EXT4-fs (nvme0n1p2): I/O error while writing superblock
[ 186.476170] EXT4-fs error (device nvme0n1p2): __ext4_find_entry:1663: inode #63307794: comm apport: reading directory lblock 0
[ 186.476174] Buffer I/O error on dev nvme0n1p2, logical block 0, lost sync page write
[ 186.476174] EXT4-fs (nvme0n1p2): I/O error while writing superblock
[ 186.476184] EXT4-fs error (device nvme0n1p2): __ext4_find_entry:1663: inode #63307794: comm apport: reading directory lblock 0
[ 186.476187] Buffer I/O error on dev nvme0n1p2, logical block 0, lost sync page write
[ 186.476188] EXT4-fs (nvme0n1p2): I/O error while writing superblock
[ 186.476197] EXT4-fs error (device nvme0n1p2): __ext4_find_entry:1663: inode #63307794: comm apport: reading directory lblock 0
[ 186.476199] Buffer I/O error on dev nvme0n1p2, logical block 0, lost sync page write
[ 186.476200] EXT4-fs (nvme0n1p2): I/O error while writing superblock
The NVMe drive was busy with MariaDB tasks, so it shouldn't have been trying to enter a low-power state -- short of thermal throttling... I did also try the "nvme_core.default_ps_max_latency_us=0 pcie_aspm=off" kernel options recommended in the dmesg output, but ultimately neither disabling PCI Express Active State Power Management (ASPM) nor changing the NVMe max latency helped.
[ 164.581235] nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0xffff
[ 164.581240] nvme nvme0: Does your device have a faulty power saving mode enabled?
[ 164.581241] nvme nvme0: Try "nvme_core.default_ps_max_latency_us=0 pcie_aspm=off" and report a bug
[ 164.641482] nvme0n1: I/O Cmd(0x2) @ LBA 203495560, 32 blocks, I/O Error (sct 0x3 / sc 0x71)
[ 164.641490] I/O error, dev nvme0n1, sector 203495560 op 0x0:(READ) flags 0x80700 phys_seg 4 prio class 2
[ 164.669173] nvme 0000:19:00.0: Unable to change power state from D3cold to D0, device inaccessible
[ 164.669278] nvme nvme0: Disabling device after reset failure: -19
[ 164.693186] I/O error, dev nvme0n1, sector 3089104896 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 2
[ 164.693195] EXT4-fs warning (device nvme0n1p2): ext4_end_bio:343: I/O error 10 writing to inode 55194123 starting block 386138112)
[ 164.693198] nvme0n1: detected capacity change from 3907029168 to 0
[ 164.693200] Buffer I/O error on device nvme0n1p2, logical block 385862656
[ 164.693211] EXT4-fs warning (device nvme0n1p2): ext4_end_bio:343: I/O error 10 writing to inode 55194123 starting block 129390649)
[ 164.693213] Buffer I/O error on device nvme0n1p2, logical block 129115193
[ 164.693213] EXT4-fs warning (device nvme0n1p2): ext4_end_bio:343: I/O error 10 writing to inode 29884650 starting block 273996343)
[ 164.693215] Buffer I/O error on device nvme0n1p2, logical block 129115194
[ 164.693217] Buffer I/O error on device nvme0n1p2, logical block 129115195
[ 164.693219] Buffer I/O error on device nvme0n1p2, logical block 129115196
[ 164.693220] EXT4-fs warning (device nvme0n1p2): ext4_end_bio:343: I/O error 10 writing to inode 55194123 starting block 129390653)
[ 164.693224] EXT4-fs warning (device nvme0n1p2): ext4_end_bio:343: I/O error 10 writing to inode 55194123 starting block 129390637)
[ 164.693226] Buffer I/O error on device nvme0n1p2, logical block 129115181
[ 164.693228] EXT4-fs warning (device nvme0n1p2): ext4_end_bio:343: I/O error 10 writing to inode 55194123 starting block 129390598)
[ 164.693229] Buffer I/O error on device nvme0n1p2, logical block 129115142
[ 164.693231] Buffer I/O error on device nvme0n1p2, logical block 129115143
[ 164.693232] EXT4-fs warning (device nvme0n1p2): ext4_end_bio:343: I/O error 10 writing to inode 55194123 starting block 129390595)
[ 164.693234] Buffer I/O error on device nvme0n1p2, logical block 129115139
[ 164.693235] EXT4-fs warning (device nvme0n1p2): ext4_end_bio:343: I/O error 10 writing to inode 55194123 starting block 129390578)
[ 164.693236] Buffer I/O error on device nvme0n1p2, logical block 129115122
[ 164.693238] EXT4-fs warning (device nvme0n1p2): ext4_end_bio:343: I/O error 10 writing to inode 55194123 starting block 129390566)
[ 164.693240] EXT4-fs warning (device nvme0n1p2): ext4_end_bio:343: I/O error 10 writing to inode 55194123 starting block 129390563)
[ 164.693278] Buffer I/O error on dev nvme0n1p2, logical block 120619365, lost async page write
[ 164.693285] Buffer I/O error on dev nvme0n1p2, logical block 127926285, lost async page write
[ 164.693293] Buffer I/O error on dev nvme0n1p2, logical block 62, lost async page write
[ 164.693302] EXT4-fs error (device nvme0n1p2): ext4_check_bdev_write_error:223: comm mariadbd: Error while async write back metadata
[ 164.693323] Aborting journal on device nvme0n1p2-8.
[ 164.693329] EXT4-fs error (device nvme0n1p2) in ext4_dirty_inode:6111: IO failure
[ 164.693329] Buffer I/O error on dev nvme0n1p2, logical block 243826688, lost sync page write
[ 164.693334] JBD2: I/O error when updating journal superblock for nvme0n1p2-8.
[ 164.693334] Buffer I/O error on dev nvme0n1p2, logical block 0, lost sync page write
[ 164.693336] EXT4-fs (nvme0n1p2): I/O error while writing superblock
[ 164.693340] Buffer I/O error on dev nvme0n1p2, logical block 0, lost sync page write
[ 164.693341] EXT4-fs (nvme0n1p2): I/O error while writing superblock
[ 164.693353] EXT4-fs error (device nvme0n1p2): ext4_journal_check_start:83: comm rs:main Q:Reg: Detected aborted journal
[ 164.693364] Buffer I/O error on dev nvme0n1p2, logical block 0, lost sync page write
[ 164.693368] EXT4-fs (nvme0n1p2): I/O error while writing superblock
[ 164.693369] EXT4-fs error (device nvme0n1p2): ext4_journal_check_start:83: comm mariadbd: Detected aborted journal
[ 164.693370] EXT4-fs (nvme0n1p2): Remounting filesystem read-only
[ 164.693378] Buffer I/O error on dev nvme0n1p2, logical block 0, lost sync page write
[ 164.693380] EXT4-fs (nvme0n1p2): I/O error while writing superblock
[ 164.693455] EXT4-fs (nvme0n1p2): ext4_do_writepages: jbd2_start: 13262 pages, ino 55194123; err -30
[ 164.727715] Process 10715(apport) has RLIMIT_CORE set to 1
[ 164.727716] Aborting core
[ 164.915069] Process 10732(apport) has RLIMIT_CORE set to 1
[ 164.915071] Aborting core
[ 164.947259] Process 10736(apport) has RLIMIT_CORE set to 1
[ 164.947261] Aborting core
[ 164.981899] Process 10747(apport) has RLIMIT_CORE set to 1
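As an aside for anyone retrying those kernel options, here's a quick illustrative sketch for confirming they actually took effect on a running system: /proc/cmdline carries the boot parameters, while nvme_core exposes its default_ps_max_latency_us value under /sys/module. This is just a sanity check, not something required to reproduce the issue:

from pathlib import Path

# Show the boot parameters the kernel was actually started with.
cmdline = Path("/proc/cmdline").read_text().strip()
print("Boot parameters:", cmdline)
print("pcie_aspm=off present on cmdline:", "pcie_aspm=off" in cmdline)

# The nvme_core module parameter is also visible at runtime via sysfs.
latency_param = Path("/sys/module/nvme_core/parameters/default_ps_max_latency_us")
if latency_param.exists():
    print("nvme_core.default_ps_max_latency_us =", latency_param.read_text().strip())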
I haven't seen this behavior at all with the Inland TD510 PCIe 5.0 NVMe SSD, but again that drive comes equipped with an active heatsink by default.
After correcting the EXT4 file-system errors and attaching a passive NVMe heatsink to the MP700, I fired off 24 hours of demanding I/O benchmarks with MariaDB, PostgreSQL, FIO, ClickHouse, and other workloads, and it has run without issue... Still quite warm, up into the low-to-mid 80s Celsius, but never any file-system errors or NVMe controller failures reported in the kernel log. It will be interesting to see whether this behavior occurs under Windows as well or if some NVMe driver difference affects the situation.
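For those wanting to track the drive temperature over such long benchmark runs, a simple polling loop along these lines -- again just an illustrative sketch, with an arbitrary one-second interval and output file name -- can record the peak composite temperature reached under load:

import csv, time
from pathlib import Path

def nvme_temp_c():
    # Same hwmon lookup as the earlier sketch: find the hwmon device named "nvme".
    for hwmon in Path("/sys/class/hwmon").glob("hwmon*"):
        if (hwmon / "name").read_text().strip() == "nvme":
            return int((hwmon / "temp1_input").read_text()) / 1000.0
    return None

peak = 0.0
with open("nvme_temp_log.csv", "w", newline="") as log:
    writer = csv.writer(log)
    writer.writerow(["unix_time", "temp_c"])
    for _ in range(24 * 60 * 60):  # sample once per second for up to 24 hours
        temp = nvme_temp_c()
        if temp is not None:
            peak = max(peak, temp)
            writer.writerow([int(time.time()), temp])
            log.flush()
        time.sleep(1)
print("Peak NVMe composite temperature:", peak, "C")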
In any event, I'd recommend waiting to upgrade to PCIe 5.0 NVMe consumer storage until there are more compelling options available that are faster and more reliable, and until prices begin to better compete with PCIe 4.0 NVMe drive pricing. If going for any PCIe 5.0 NVMe SSD right now as an early adopter, make sure you have adequate cooling in place.
Update: Phison has now contacted Phoronix with the following response:
"After carefully reviewing the recent reports from TechPowerUp and Phoronix, Phison would like to acknowledge the issue found in the reviews of products using the new Phison PS5026-E26 controller. We take this matter seriously and are committed to resolving it promptly.So this should be addressed with a firmware update. Now for how well these firmware updates work out for Linux users... Some vendors at least offer bootable NVMe firmware update handling while only a select few go the extra mile with LVFS+Fwupd support to make it an easy process for Linux users.
Our firmware engineering teams have already isolated the problem and made the necessary adjustments to the thermal throttle curve within hours of the report. However, the new firmware must undergo Phison's strict validation process before our partners can release it to customers. Rest assured our partners will notify end-users as soon as the validated update is available.
It is important to note that all E26 SSDs shipped without a heatsink are intended to be used with a heatsink. Most motherboards shipping with PCIe Gen5 enabled also include cooling specifically designed for Gen5 SSDs. We offer the 'bare drive' option to allow customers to use their existing cooling products.
We want to emphasize our commitment to providing high-quality products and solutions to our customers and will continue to work diligently to ensure their satisfaction. Thank you for your patience and understanding during this process."