Ubuntu's Ubiquity Installer Begins Adding ZFS Encryption Support


  • #31
    Originally posted by oiaohm View Post

    This is head-in-the-sand thinking. The highest-density data hard drives are all SMR for the datacentre.
    https://www.seagate.com/au/en/tech-i...smr-master-ti/
    We are not talking about a small difference here. SMR allows at least 25% more storage on the same drive mechanism. Some of the early SMR models let you flash CMR firmware onto them, but they instantly lost 25%+ of their capacity.

    Hard drive vendors have already published their roadmaps for the next ten years, and those roadmaps include the end of CMR production.

    Remember, you need 25% less material to provide the same capacity with SMR. This is why SMR is appearing in desktop drives hidden behind firmware (drive-managed SMR) so that Windows or OS X keeps working. People want cheaper hard drives.

    The fun part is that SMR's higher density can in fact give higher read and write speeds than the prior CMR/PMR (depending on which vendor name you use for the old tech). The big catch is that SMR does not like random writes.

    https://www.toshiba.co.jp/tech/revie...02/pdf/a08.pdf

    Yes, SMR behaviour is all documented.

    For a well-designed copy-on-write file system that is SMR compatible, SMR should be a perfect match, giving more storage and more performance than the prior tech. The problem is that a file system which drives the storage in ways that are incompatible with SMR hurts badly on SMR drives.

    Please note that this issue of firmware being used to paper over SMR-like behaviour does not start with SMR. The flash banks inside SSDs have the same behaviour problem as SMR drives: complete banks, which are far larger than 4k sectors, have to be erased as a whole. Yes, SSDs stalling out with ZFS as they fill up is the same problem of randomly writing to something that does not like random writes.

    Sorry, ZFS was designed for CMR/PMR hard drives and RAM drives, and those devices are happy with random writes. The problem is that SMR hard drives and many modern SSDs are not.

    Technology has passed the current ZFS design by. XFS and Ext4 show it is not impossible for ZFS to be adapted for SMR, but you will end up with on-disc differences from the prior version. Also, you will not have access to the next generation of drive tech while ZFS remains under its current license, so it is going to keep getting hit from left field by these things until that license changes.
    You don't appear to understand why this is.

    No filesystem was designed with SMR in mind. The closest is APFS, which was designed for SSDs.

    In ZFS, when you write a block it updates the pointers on every block in the tree from leaf to root. It does this to maintain consistency and integrity and to provide its atomic transactions.
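    To make that concrete, here is a minimal copy-on-write sketch in C. It only illustrates the leaf-to-root update described above; it is not ZFS's actual code, every name in it is hypothetical, and the checksum is a stand-in for fletcher4/sha256.

```c
/* Toy illustration of a copy-on-write (CoW) Merkle update: changing one
 * leaf forces a fresh copy of every block on the path up to the root,
 * each carrying a new checksum of its child.  This is a sketch only --
 * ZFS's real block pointers, indirect blocks and DMU are far more
 * involved, and every name here is hypothetical. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

struct block {
    struct block *child;     /* single child, to keep the sketch small  */
    uint64_t      child_sum; /* checksum of that child block            */
    char          data[64];  /* payload (only meaningful in the leaf)   */
};

/* Stand-in checksum (FNV-1a); real implementations use fletcher4/sha256. */
static uint64_t checksum(const struct block *b)
{
    uint64_t sum = 14695981039346656037ULL;
    const unsigned char *p = (const unsigned char *)b;
    for (size_t i = 0; i < sizeof(*b); i++)
        sum = (sum ^ p[i]) * 1099511628211ULL;
    return sum;
}

/* Copy-on-write update: never touch the old chain, return a new root. */
static struct block *cow_update(const struct block *root, const char *newdata)
{
    const struct block *path[16];            /* ancestors, root first    */
    int depth = 0;
    for (const struct block *b = root; b && depth < 16; b = b->child)
        path[depth++] = b;

    struct block *below = NULL;              /* newly written child      */
    for (int i = depth - 1; i >= 0; i--) {   /* leaf first, then parents */
        struct block *copy = malloc(sizeof(*copy));
        *copy = *path[i];
        if (i == depth - 1)                  /* the leaf gets new data   */
            snprintf(copy->data, sizeof(copy->data), "%s", newdata);
        copy->child = below;                 /* point at the new child   */
        copy->child_sum = below ? checksum(below) : 0;
        below = copy;
    }
    return below;                            /* new root ("uberblock")   */
}

int main(void)
{
    /* Build a three-level chain: root -> mid -> leaf. */
    struct block leaf = { .child = NULL, .data = "old" };
    struct block mid  = { .child = &leaf };
    struct block root = { .child = &mid };
    mid.child_sum  = checksum(&leaf);
    root.child_sum = checksum(&mid);

    struct block *newroot = cow_update(&root, "new");
    printf("old leaf: %s, new leaf: %s\n",
           leaf.data, newroot->child->child->data);
    return 0;
}
```

    The point to notice is that one logical write becomes several new blocks scattered across the pool, which is exactly the random-write pattern that drive-managed SMR copes with worst.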

    Other file systems use the much older method of block pointer tables with extents. It's a very old design; you can't pass it off as new, sorry. They do not provide atomic transactions and cannot prove their consistency on disk. With SMR you are allowing the firmware to rewrite the data on every track above the track you're writing. By doing that you have introduced another failure condition (one that is blind to the OS), added a lot more writes and IOPS, and you have NO WAY to prove your on-disk consistency. What happens if you get a phantom write? If you have these drives in an array, how does the array know which side of the mirror is correct? It doesn't. Also, when was the last time you tested your UPS? Just wondering..

    The difference between an SSD's bank-levelling behaviour and SMR's is that SSD firmware doesn't do it all the time. When it does, it is much faster at it, and at least in ZFS's case you can tell whether it was done correctly.

    True, they provide about 25% more space, but do the reliability issues now warrant a hot on-premise backup? Because that takes 100% more space.

    So be my guest if you want to use SMR.. as a storage expert I do not recommend it.
    Last edited by k1e0x; 17 June 2020, 03:48 PM.

    Comment


    • #32
      Originally posted by k1e0x View Post
      No filesystem was designed with SMR in mind. The closest is APFS, which was designed for SSDs.
      This is horribly wrong.

      What SMR actually is, is the first thing you need to understand: SMR is a new form of object storage. So when did something like SMR first appear?

      SCSI Object-Based Storage Device Commands is what you need to look up. Yes, something like SMR appeared around the year 2000. XFS, BTRFS and EXT3 all contained prototypes to run on this; ZFS did not. Why? XFS, BTRFS and EXT3/4 were all mainline. The Linux kernel device mapper RAID also included support for SCSI Object-Based Storage Devices. There is a catch here: the only parties who had access to any real-world drives supporting the SCSI Object-Based Storage Device Commands in any volume were the hard drive manufacturers' labs.

      Originally posted by k1e0x View Post
      In ZFS, when you write a block it updates the pointers on every block in the tree from leaf to root. It does this to maintain consistency and integrity and to provide its atomic transactions.
      That has never been object-based storage compatible. That is to be expected, since Sun's ZFS designers did not have access to SCSI Object-Based Storage Devices to play with.

      Originally posted by k1e0x View Post
      Other file systems use the much older method of block pointer tables with extents. It's a very old design; you can't pass it off as new, sorry. They do not provide atomic transactions and cannot prove their consistency on disk. With SMR you are allowing the firmware to rewrite the data on every track above the track you're writing. By doing that you have introduced another failure condition (one that is blind to the OS), added a lot more writes and IOPS, and you have NO WAY to prove your on-disk consistency. What happens if you get a phantom write? If you have these drives in an array, how does the array know which side of the mirror is correct?
      Except SMR firmware operates in three modes:
      1) Device Managed - this is the mode you don't want.
      2) Host Aware - this mode is interesting.
      3) Host Managed - this also has a downside.

      Yes, the WD Reds that are SMR are able to operate in Device Managed and Host Aware mode. When running ZFS on them you end up in broken Device Managed mode.

      In Host Aware mode you can modify a block using the on-drive controller and be informed about it. You can issue instructions like "apply these X modifications to block A, place the result in block B, and inform me when that is done", so you can read block B, validate that it is correct, and only then overwrite block A once you know block B is right. In Host Managed mode you have to send full blocks across for every write and read.

      So, nothing like being totally wrong. Remember that the Linux device mapper layer is able to operate SMR drives in Host Aware mode when RAIDing, so it can reduce the drive-thrash problem. Also, with SMR Host Aware mode, if you are writing a complete block from scratch (like duplicating a drive to rebuild a RAID) you can completely avoid writing into the temporary CMR/PMR staging area that is used for block modification. Heck, SMR firmware even in Device Managed mode, if you write from the start of the disc to the end, will write nothing into the CMR/PMR staging area as long as the drive starts off new and blank. Why? Because writing SMR directly is not slower than writing CMR/PMR; it is fast as long as you are not altering existing data. SMR also supports appending into a partly filled block without any speed hit.
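      If you want to see what a Host Aware or Host Managed drive actually exposes to the OS, here is a minimal sketch (assuming a reasonably recent Linux kernel with zoned block device support; the device path is just a placeholder) that uses the BLKREPORTZONE ioctl to list zones and their write pointers:

```c
/* List the first few zones of a zoned (SMR) block device with the Linux
 * BLKREPORTZONE ioctl.  Sketch only: error handling is minimal, the
 * device path is a placeholder, and it must be run as root, e.g.
 *   ./zones /dev/sdb */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/blkzoned.h>

#define NR_ZONES 8   /* how many zones to ask for in one report */

int main(int argc, char **argv)
{
    const char *dev = argc > 1 ? argv[1] : "/dev/sdb";   /* placeholder */
    int fd = open(dev, O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    /* struct blk_zone_report is followed by an array of struct blk_zone. */
    size_t sz = sizeof(struct blk_zone_report) + NR_ZONES * sizeof(struct blk_zone);
    struct blk_zone_report *rep = calloc(1, sz);
    if (!rep) return 1;
    rep->sector   = 0;          /* start reporting from the first zone */
    rep->nr_zones = NR_ZONES;

    if (ioctl(fd, BLKREPORTZONE, rep) < 0) {
        perror("BLKREPORTZONE (not a zoned device?)");
        return 1;
    }

    for (unsigned i = 0; i < rep->nr_zones; i++) {
        struct blk_zone *z = &rep->zones[i];
        printf("zone %2u: start %10llu  len %9llu  wp %10llu  %s\n",
               i,
               (unsigned long long)z->start,   /* values in 512-byte sectors */
               (unsigned long long)z->len,
               (unsigned long long)z->wp,      /* write pointer position */
               z->type == BLK_ZONE_TYPE_CONVENTIONAL ? "conventional (CMR)"
                                                     : "sequential (SMR)");
    }
    free(rep);
    close(fd);
    return 0;
}
```

      On host-managed drives the sequential zones must be written at the write pointer; host-aware drives merely prefer it, falling back to internal staging when you don't.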

      The "old design" you complained about, block pointer tables with extents, is partly not that old at all: extents come out of the SCSI Object-Based Storage work.

      Originally posted by k1e0x View Post
      The difference between an SSD's bank-levelling behaviour and SMR's is that SSD firmware doesn't do it all the time. When it does, it is much faster at it, and at least in ZFS's case you can tell whether it was done correctly.
      This is totally incorrect for modern SSDs. 4-bit-per-cell SSDs are constantly doing bank levelling behind your back. Why? Because when you write to them they write into a 1- or 2-bit-per-cell area first, because it is faster, and then when they have time they rewrite the data into the higher-density storage and never tell you about it. So yes, with a modern SSD you get extra hidden writes. This is very different from SMR in Host Managed or Host Aware mode, but it is exactly the same as SMR Device Managed mode, which ZFS developers were told at a 2015 conference to avoid like the plague because it would perform badly and possibly be unsafe.

      Originally posted by k1e0x View Post
      True, they provide about 25% more space, but do the reliability issues now warrant a hot on-premise backup? Because that takes 100% more space.
      https://www.youtube.com/watch?v=a2lnMxMUxyc In this video the ZFS developers were given a heads-up about the problem five years ago. The thing is, Ext, XFS and BTRFS had the heads-up far sooner: Ext and XFS around 2000 and BTRFS from 2007, all due to being in the hard drive makers' labs.

      This response shows you are clueless and not fit to hold a storage expert's bootlaces. The guy in the 2015 video is a storage expert; you are not. This is the problem: most of the best storage experts are inside hard drive and SSD companies, and the ZFS license prevents having those people in ZFS development rooms all the time.

      I do not recommend using SMR in Device Managed mode. Host Aware and Host Managed SMR are better behaved than what SSD vendors are serving up these days.

      The reality is that SMR should not give you any more reliability issues than CMR/PMR did, as long as your software is able to drive it in Host Aware or Host Managed mode. In Host Aware mode you should be at worst about 4 percent slower and at best about 8 percent faster than your prior CMR/PMR drive, with 25% extra capacity. Remember, SMR Host Aware reduces the write data sent over the wire to the drive to the same size it would be for a CMR/PMR drive. The cost of SMR Host Aware is not really extra writes; it is extra reads, because a larger area has to be validated - from where a write landed in an SMR zone to the end of that zone, and if you changed a bit in the middle of a zone that means the whole zone. So a correctly run SMR drive is more likely to pick up a defective platter surface than a CMR/PMR drive, because you are checking more area more often.

      Really, it would be good to see SMR host-aware-style features appear on SSDs so that writes stop happening behind your back without your operating system knowing about it.

      k1e0x, you have it backwards: SMR should basically be your most reliable storage, as long as your software/hardware can drive it correctly. Another fun point is that drives operating in SMR Device Managed or Host Aware mode need trim commands just like normal SSD devices - guess what else ZFS is failing to do at the moment. Basically ZFS is not SMR compatible, and its rebuild time shows it. When using a RAID that is SMR compatible, rebuild time on SMR is faster than on CMR/PMR, so technically safer.
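      For context on what that "trim" means for a zoned drive: the SMR counterpart of an SSD discard is a zone reset, which tells the drive the whole zone can be thrown away and rolls its write pointer back to the zone start. A minimal sketch follows, again with a placeholder device path and assuming the Linux zoned block ioctls are available:

```c
/* Reset one zone of a zoned (SMR) block device -- the rough equivalent
 * of TRIM/discard on an SSD.  WARNING: this destroys the data in that
 * zone.  Sketch only: the device path is a placeholder and the chosen
 * zone may be a conventional one on some drives, in which case the
 * reset simply fails. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/blkzoned.h>

int main(void)
{
    const char *dev = "/dev/sdb";              /* placeholder device */
    int fd = open(dev, O_WRONLY);
    if (fd < 0) { perror("open"); return 1; }

    /* Ask the kernel for the zone size (in 512-byte sectors) instead of
     * hard-coding it. */
    unsigned int zone_sectors = 0;
    if (ioctl(fd, BLKGETZONESZ, &zone_sectors) < 0 || zone_sectors == 0) {
        perror("BLKGETZONESZ (not a zoned device?)");
        return 1;
    }

    /* Reset the second zone: its write pointer returns to the zone start
     * and the drive may reuse the shingled tracks underneath. */
    struct blk_zone_range range = {
        .sector     = (unsigned long long)zone_sectors * 1,
        .nr_sectors = zone_sectors,
    };
    if (ioctl(fd, BLKRESETZONE, &range) < 0) {
        perror("BLKRESETZONE");
        return 1;
    }
    printf("reset zone at sector %llu (%u sectors)\n",
           (unsigned long long)range.sector, zone_sectors);
    close(fd);
    return 0;
}
```

      A filesystem or RAID layer that never issues zone resets leaves the drive guessing about which shingled data is still live, which is where much of the device-managed thrash comes from.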

      SMR is not a bad solution. SMR can be a very incompatible solution. But remember that everyone was given at least five years' heads-up that SMR was coming, with direct instructions from the drive vendors five years ago about what needed to be implemented for SMR to work right. The SMR problems are not purely the hard drive vendors' fault; it takes two to tango. The vendors should never have shipped SMR drives unlabelled - that was wrong. But the parties making file systems and RAID solutions had five-plus years' notice this was coming, so they need to explain why they are not ready.

      Yes, the ZFS developers did have a problem: they could not get their hands on SMR drives, and they could not ask those working in the hard drive vendors' labs to run tests for them, because their license is incompatible with what is required to get your software into those labs.

      The Stratis layered idea of having the integrity layer independent of the core file system exists so that the integrity layer design can be changed to match the storage media being used. You then don't end up with the current ZFS problem of a round-peg-square-hole integrity system that the file system attempts to push onto everything. So Red Hat's work on Stratis might be the right long-term plan.

      In some ways I think WD and the other hard drive vendors are getting sick of waiting around for ZFS and other parties to get their stuff in order for SMR and are now reaching the point where they will start brute-forcing it. Their CMR/PMR roadmaps show this. So we are past the point of "we will not support SMR" and getting away with it; it is now "support SMR or in future you will not have drives". The hard drive vendors are not going to back off.

      Every year more hard drive storage volume is wanted. There is no way to fill that demand long term with factories making CMR/PMR hard drives, and it is not power-efficient to run 25%+ more drives than you need to store the data either. Please note I am writing 25%+; at its best SMR gives 40% more storage than a CMR/PMR drive, with read speeds a CMR/PMR drive can only dream of, because everything is packed tighter. SMR at best almost cuts your storage cost in half. So what you were using a striped RAID for with CMR/PMR you could, with a few more advancements, basically do with mirrors on SMR. Sorry k1e0x, even your reduced-reliability argument does not hold up. CMR/PMR is defeated tech, becoming more defeated and less and less cost-effective to use.

      Comment


      • #33
        Originally posted by oiaohm View Post
        This is horribly wrong. [...] Sorry k1e0x, even your reduced-reliability argument does not hold up. CMR/PMR is defeated tech, becoming more defeated and less and less cost-effective to use.
        I see a lot of words here but very little proof or any answers at all. Just claims and statements without any backing.

        Object storage has nothing to do with this; we are talking about block storage. So wtf.
        You haven't answered any of my questions about data validity. wtf
        FreeBSD has had SSD TRIM in ZFS since 2012 (Linux got it in 2018). wtf
        The CDDL does not restrict usage. wtf

        Also, Backblaze is a massive storage provider and publishes excellent testing and reports on hard drive reliability. They use no SMR drives.

        So.. wtf, and suit yourself dude. SMR is a consumer-grade product, not an enterprise product. From what I can see they perform like utter crap in RAID arrays too, so the only use I can see for them is a possible case for cold storage before you archive the data down to tape. There is a real reason hard drive vendors are trying to hide the fact that drives are SMR-based: people know that they suck and don't buy them.

        Personally I think magnetic disk storage is going to have to go the way of the dinosaur and the rotating drum drive entirely, because the physics is past its limits now.. and flash capacity is getting pretty close..
        Last edited by k1e0x; 18 June 2020, 03:48 PM.

        Comment


        • #34
          Originally posted by k1e0x View Post
          Object storage has nothing to do with this; we are talking about block storage. So wtf.
          An SMR drive is at its design core an object storage device, not a block storage device; it is in fact emulating block storage.

          Originally posted by k1e0x View Post
          You haven't answered any of my questions about data validity. wtf
          The reality is that more spread-out writes, which force an SSD/SMR drive to rework things behind your back, reduce data validity.



          https://gitlab.com/cryptsetup/crypts...is/DMIntegrity

          Stratis from Red Hat is looking into the DM Integrity layer. This does the data validation underneath the file system, i.e. not as part of the file system. The direct-connection work between XFS and the block layer under it is there to make this lower-level information accessible at the file system level.

          This makes your data-validity system a removable/changeable module without needing to change the file system's on-disk structures. Yes, the file system still has to checksum its own data structures, but the file data blocks read from the drive into memory are checked by the DM Integrity layer in the Stratis design.

          Basically: do you do data validity as part of the file system (ZFS), or as part of the block layer (the Stratis design)? These are two different ways of skinning the problem. Block-layer validation has some serious advantages, like being able to use the SMR native block sizes for checksums.

          Yes, the common default recordsize for ZFS is 128 KB, but an SMR native zone size can be 4 MB or more. This is where things get interesting: the size that is ideal for the SMR drive is not ideal for compact storage of files.

          So splitting the file system's and the block device's idea of storage size has some serious advantages when you get into object storage drives. Object storage drives started using these massive section sizes because losing the gaps between sectors allows more storage per disc; SMR is really a direct descendant of the early object drives. Remember, normal hard drives changed from 512 bytes per sector to 4k per sector to increase capacity. With SMR we are going 1024x or more bigger again. Just as the change from 512-byte to 4k sectors required file system alterations, this change requires file system alterations.
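          A quick way to see that granularity jump on a real machine is to read the block layer's sysfs attributes; a small sketch is below (the disk name is a placeholder, and queue/chunk_sectors is only non-zero on zoned devices):

```c
/* Compare a drive's logical block size with its zone size via sysfs.
 * Sketch only: "sdb" is a placeholder disk name; queue/chunk_sectors is
 * reported as 0 for non-zoned (plain CMR) devices. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static int read_attr(const char *disk, const char *attr, char *out, size_t len)
{
    char path[256];
    snprintf(path, sizeof(path), "/sys/block/%s/queue/%s", disk, attr);
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(out, (int)len, f)) {
        fclose(f);
        return -1;
    }
    out[strcspn(out, "\n")] = '\0';
    fclose(f);
    return 0;
}

int main(void)
{
    const char *disk = "sdb";                    /* placeholder */
    char model[64], lbs[64], chunk[64];

    if (read_attr(disk, "zoned", model, sizeof(model)) ||
        read_attr(disk, "logical_block_size", lbs, sizeof(lbs)) ||
        read_attr(disk, "chunk_sectors", chunk, sizeof(chunk))) {
        fprintf(stderr, "could not read sysfs attributes for %s\n", disk);
        return 1;
    }

    long block_bytes = atol(lbs);
    long zone_bytes  = atol(chunk) * 512L;       /* chunk_sectors is in 512-byte units */

    printf("%s: zoned model = %s\n", disk, model);   /* none / host-aware / host-managed */
    printf("logical block = %ld bytes, zone = %ld bytes", block_bytes, zone_bytes);
    if (zone_bytes > 0 && block_bytes > 0)
        printf(" (%ldx larger)", zone_bytes / block_bytes);
    printf("\n");
    return 0;
}
```

          Shipping SMR hard drives commonly report zones in the hundreds of MiB (256 MiB is a typical figure), i.e. tens of thousands of times the 4k sector the filesystem thinks in.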

          Originally posted by k1e0x View Post
          FreeBSD had SSD Trim in ZFS since 2012 (Linux got it in 2018). wtf
          Except I am talking about SMR drives in Device Managed or Host Aware mode. These kinds of SMR drives need trim commands, like an SSD, so they can function correctly. Not sending trim can jack-knife an SMR drive the same way operating an SSD without trim does. Yes, this causes extra performance hits. So does FreeBSD send SMR drives the trim commands it should? The answer is no, it does not yet.

          Originally posted by k1e0x View Post
          The CDDL does not restrict usage. wtf
          In fact the CDDL does restrict usage in labs. You really need to go to your copyright lawyers and start asking how using CDDL code in a development lab affects company patents.

          Originally posted by k1e0x View Post
          Also Backblaze is a massive storage provider and provides excellent testing and reports on hard drive reliability. They use no SMR drives.
          https://www.backblaze.com/blog/how-b...s-hard-drives/
          "Backblaze currently uses 117,658 hard drives in our data centers, which means we've learned a thing or two about purchasing petabytes-worth of storage."

          Even Backblaze state they could be 10 to 15 percent cheaper going SMR, but they would need to update their software stack and hardware configuration to support it - that is why they are not using any SMR drives at this point. It pays to check why. Changing your existing configuration can be hard.


          Originally posted by k1e0x View Post
          So.. wtf and suit yourself dude. SMR is a consumer grade product, not an enterprise product. From what I can see they perform like utter crap too in raid arrays
          https://arstechnica.com/gadgets/2020...e-not-garbage/

          MDRAID on Linux runs SMR drives reasonably well even without a properly optimised SMR workload. SMR is in fact not ideal as a consumer product, as noted there: being misdriven by an unoptimised workload can cause SMR stalls.

          Also note that they say for consumer usage SMR could be the worse option. Again, the workloads driving SMR drives need to be properly optimised, and this will require operating system changes.

          Originally posted by k1e0x View Post
          Personally I think magnetic disk storage entirely is going to have to go the way of the dinosaur and rotating drum drives because they are past the point of physics now on them.. and flash capacity is getting pretty close..
          Not yet. 100TB hard drives are planned within five years. Sorry, flash capacity was growing faster than hard drives for a while, but in recent years that has slowed to the point where hard drives could take the lead back. The big issue is that flash has increased capacity by reducing durability, while hard drives are still increasing capacity while also increasing durability.

          Flash is at its physics limit; this is why every new increase in density now comes with a reduction in durability - you pay the physics price for exceeding what physics says is safe. Hard drives are not quite at their physics limit yet and will not be for at least another two decades.

          There is one catch: all of these ultra-huge hard drives are going to be SMR.

          Yes, we could end up in the horrible situation where HDDs are the long-term storage that gets regularly reloaded into flash you cannot trust, every time there is a checksum failure.

          Comment


          • #35
            Originally posted by oiaohm View Post
            An SMR drive is at its design core an object storage device, not a block storage device; it is in fact emulating block storage. [...] Yes, we could end up in the horrible situation where HDDs are the long-term storage that gets regularly reloaded into flash you cannot trust, every time there is a checksum failure.
            This got picked up by Linus Tech Tips, of all places. Some comments on that: it's not mentioned because he's off screen, but Matt Ahrens (ZFS co-creator) asked a very important question about the library HGST mentions in the video you posted. They are talking about the library they created (Manfred Berger calls it an "emulator"), and Matt asks, "Is your vision that filesystems will cease to exist and instead people will use your library?" It's an important question, because it sounds like it's just turning the drive into an object store, and it shows that ZFS (or any other filesystem) is no longer in control of block allocation. Given that this is in the press due to WD's lawsuit, we will probably see some attention paid to ZFS support for it. I imagine a module could be added to ZFS's SPA layer, but how well it can do its job when the blocks are being rewritten by something else is questionable.. it's a good question for the engineers.

            I think probably the best solution would be to have ZFS just do all the writes it needs to manually, so it can verify them, but you're trading performance for integrity.. probably people want that trade-off though.
            Last edited by k1e0x; 22 June 2020, 07:44 PM.

            Comment


            • #36
              Originally posted by k1e0x View Post
              They are talking about the library they created (Manfred Berger calls it an "emulator") And Matt asks "Is your vision that filesystem will cease to exist and instead people will use your library?"
              File systems still need to exist; object stores cannot do everything. But SMR, being object-based, needs different interfaces from what has historically been provided.

              Originally posted by k1e0x View Post
              It's an important question because it sounds like it's just making it an object store and it's showing ZFS (or any other filesystem) is no longer in control of the block allocation.
              The hard reality is that ever since hard drives added SMART (Self-Monitoring, Analysis and Reporting Technology), the file system has not been in control of the blocks. SMART's Reallocated Sectors feature is only possible because what the file system sees about where data sits on the drive is a virtual creation of the controller. A SMART sector relocation may happen, and until your system checks for that information it does not know the data has been copied and needs to be re-checksummed by an integrity system. On PMR/CMR drives, last time I checked, the ZFS pool system was not checking the drive's SMART logs for relocation events. And just because the controller relocates a sector because it detects something wrong does not mean it does it correctly.
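              As a sketch of what "checking the SMART logs" could look like from userspace (this assumes smartmontools' smartctl is installed, uses a placeholder device path, and parses the output naively):

```c
/* Poll a drive's reallocated-sector count so relocations that happened
 * behind the filesystem's back can trigger a re-verify of the affected
 * data.  Sketch only: it shells out to smartmontools' `smartctl -A`
 * (assumed installed), uses a placeholder device path, and the column
 * parsing is deliberately naive. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static long reallocated_sectors(const char *dev)
{
    char cmd[256], line[512];
    long count = -1;

    snprintf(cmd, sizeof(cmd), "smartctl -A %s", dev);
    FILE *p = popen(cmd, "r");
    if (!p)
        return -1;

    while (fgets(line, sizeof(line), p)) {
        /* SMART attribute 5 is printed by smartmontools as
         * "Reallocated_Sector_Ct"; its raw value is the last column. */
        if (strstr(line, "Reallocated_Sector_Ct")) {
            char *last = strrchr(line, ' ');
            if (last)
                count = strtol(last + 1, NULL, 10);
        }
    }
    pclose(p);
    return count;
}

int main(void)
{
    const char *dev = "/dev/sda";                /* placeholder */
    long now = reallocated_sectors(dev);
    if (now < 0) {
        fprintf(stderr, "could not read SMART data for %s\n", dev);
        return 1;
    }
    /* A real integrity layer would persist the previous value and
     * re-checksum the affected data whenever the counter increases. */
    printf("%s: %ld reallocated sectors\n", dev, now);
    return 0;
}
```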

              Originally posted by k1e0x View Post
              Being that this is in the press due to WD's lawsuit we will probably see some attention being made to ZFS support of it, I imagine a module can be added to ZFS's SPA layer but how well it can do it's job when the blocks are being re-written by something else is questionable.. it's a good question for the engineers.
              The reality is that when you use:
              1) Quad-level-cell (QLC), or future penta-level-cell (PLC, 5 bits per cell) SSDs, blocks are rewritten behind your back. Writing QLC/PLC is slow, so a write sent by the file system/block layer in the OS is first written by the controller to SLC (1 bit per cell) or MLC (2 bits per cell) and then migrated to QLC/PLC later, when there is time.
              2) A modern CMR/PMR hard drive with SMART (that is all of them), a relocated sector is a sector being rewritten behind your back, hidden by a virtual layer.
              3) SMR is basically the same as the above in this regard.

              The era of the file system/block layer being absolutely in control of writes has been over for decades; with hard drives the drive controller has been in charge for decades, and file system authors attempting to do integrity have failed to get the memo.

              Originally posted by k1e0x View Post
              I think probably the best solution would be to have ZFS just do all the writes it needs to manually so it can verify them but your trading performance for integrity.. probably people want that trade off tho.
              Wrong plan. The model needs to be trust but verify: trust the controller to do the writes, but read back what it has written and verify that. The reality is that if you cannot read the data back out of the drive correctly, for any reason, the fact that it is "there" does not change the problem. We really needed to change the model when SMART was added to hard drives: read the SMART logs, see which sectors have been relocated, and have that trigger a re-verify of those sectors. Yes, SMR drives that are host-aware keep logs of the rewrites even when running in the device-managed fallback mode.
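              Here is what that trust-but-verify loop looks like at its simplest: a sketch in C that writes a block, flushes it, reads it back and compares checksums. The target path is a placeholder, and a real verifier on a raw device would also need O_DIRECT and/or a deferred scrub so it is not simply re-reading the controller's cache.

```c
/* "Trust but verify": issue the write, flush it, then read the block
 * back and compare checksums instead of assuming the controller got it
 * right.  Sketch only -- the target path is a placeholder; on a raw
 * device a real verifier would also use O_DIRECT and/or a deferred
 * scrub so it is not simply re-reading the controller's cache.
 * Build with: cc verify.c -lz */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <zlib.h>          /* crc32() */

#define BLK 4096

int main(void)
{
    const char *path = "/tmp/verify-demo.bin";   /* placeholder target */
    unsigned char out[BLK], in[BLK];
    for (int i = 0; i < BLK; i++)
        out[i] = (unsigned char)i;

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0) { perror("open"); return 1; }

    /* 1. Trust: hand the block to the device and ask for durability
     *    (fsync also asks the drive to flush its write cache). */
    if (pwrite(fd, out, BLK, 0) != BLK || fsync(fd) != 0) {
        perror("write");
        return 1;
    }

    /* 2. Verify: read the block back and compare checksums. */
    if (pread(fd, in, BLK, 0) != BLK) { perror("read"); return 1; }

    unsigned long want = crc32(0L, out, BLK);
    unsigned long got  = crc32(0L, in, BLK);
    if (want == got)
        printf("block verified (crc %08lx)\n", want);
    else
        printf("MISMATCH: wrote %08lx, read back %08lx\n", want, got);
    close(fd);
    return want == got ? 0 : 1;
}
```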

              Something to remember: the WD SMR Reds that gave trouble are not device-managed drives, they are host-aware drives, meaning the drive will respond to SMR instructions, but if you don't send SMR instructions it falls back to acting like a device-managed drive. It is very important for the performance of particular operations that you do in fact use SMR instructions; the reason ZFS resilvering is so slow is that no SMR instructions are used. SMR instructions let you tell the drive "I am sending you a full zone; don't care about anything this overwrites, just overwrite it". That way you avoid the device-managed controller thrash, which is downright slow.

              A drive that is SMR is a form of object store. And here is something to think about: how can an SMR drive have CMR sections? When you answer that, things get trickier for file systems to manage. With the sector system, the object size across the complete disc is all the same. With an object-based device like an SMR drive it does not have to be. You could have 4K objects in one area, 4 MB objects in another section, and 128 MB objects in yet another on an SMR drive, and that would be valid. How will you know what zones an SMR drive has? You will have to use SMR drive commands; your existing drive commands basically will not work.

              https://www.kernel.org/doc/html/late...ms/zonefs.html

              This gives you the basics of how an SMR drive looks. You have CMR/PMR zones, exposed as "conventional" zone files with 4k blocks; note that an SMR drive may have more than one of these, and the zones don't have to be the same size - welcome to needing JBOD (Just a Bunch of Disks) inside a single drive. Then you have your "sequential" zone files, and if a device maker wanted to be a true ass, none of those zones would have to be the same size either.
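              To see that layout from userspace, the sketch below walks a zonefs mount (the mount point is a placeholder; per the linked kernel doc, conventional zones show up under cnv/ and sequential zones under seq/, and a sequential zone file's size tracks its write pointer):

```c
/* Walk a zonefs mount and print each zone file with its size.  Sketch
 * only: /mnt/zonefs is a placeholder mount point, created with something
 * like mkzonefs + mount as described in the kernel documentation. */
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>

static void list_zones(const char *mnt, const char *group)
{
    char dirpath[512];
    snprintf(dirpath, sizeof(dirpath), "%s/%s", mnt, group);

    DIR *d = opendir(dirpath);
    if (!d) {
        fprintf(stderr, "no %s zones (or not a zonefs mount)\n", group);
        return;
    }
    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        if (e->d_name[0] == '.')
            continue;
        char fpath[1024];
        struct stat st;
        snprintf(fpath, sizeof(fpath), "%s/%s", dirpath, e->d_name);
        if (stat(fpath, &st) == 0)
            /* For seq/ files the size reflects the current write pointer;
             * for cnv/ files it is the fixed zone capacity. */
            printf("%s/%s: %lld bytes\n", group, e->d_name,
                   (long long)st.st_size);
    }
    closedir(d);
}

int main(void)
{
    const char *mnt = "/mnt/zonefs";   /* placeholder mount point */
    list_zones(mnt, "cnv");            /* conventional (CMR) zones */
    list_zones(mnt, "seq");            /* sequential-write (SMR) zones */
    return 0;
}
```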

              The horrible reality is that an SMR hard drive is a pool of drives and objects inside a single device. SMR is a true object device, with all the evils of the historic object drives. Yes, this requires file systems and block layers to be more complex than their historic forms. Some file systems will be able to extend into this more complex world; others will not.

              Comment


              • #37
                Originally posted by oiaohm View Post

                File systems still need to exist; object stores cannot do everything. [...] Yes, this requires file systems and block layers to be more complex than their historic forms. Some file systems will be able to extend into this more complex world; others will not.
                Well Sun has talked about the design goals a lot. This is an interview from 2007. https://queue.acm.org/detail.cfm?id=1317400

                Bill Moore: If you look at the trend of storage devices over the past decade, you’ll see that while disk capacities have been doubling every 12 to 18 months, one thing that’s remaining relatively constant is the bit-error rate on the disk drives, which is about one uncorrectable error every 10 to 20 terabytes.
                Bill Moore: So, one of the design principles we set for ZFS was: never, ever trust the underlying hardware.
                I think that is still more true than ever. There really aren't any other tools one can use when you absolutely have to know that your data is correct, as in scientific or financial workloads. You can have a hot backup on site, but that doesn't address the integrity issue.

                Comment


                • #38
                  Originally posted by k1e0x View Post
                  Well Sun has talked about the design goals a lot. This is an interview from 2007. https://queue.acm.org/detail.cfm?id=1317400
                  Let's address this.
                  Bill Moore: If you look at the trend of storage devices over the past decade, you’ll see that while disk capacities have been doubling every 12 to 18 months, one thing that’s remaining relatively constant is the bit-error rate on the disk drives, which is about one uncorrectable error every 10 to 20 terabytes.
                  One of the reasons to change to SMR is to improve that error rate by an insane margin. With SMR you are not rewriting data next to other data multiple times, and the biggest cause of uncorrectable errors is in fact how CMR/PMR writes. So the change to SMR moves you from one uncorrectable error every 10 to 20 tebibytes (TiB) to one every 10 to 20 pebibytes (PiB) at least - testing stopped at that point, so they did not find what the new error rate of the SMR method actually is. Yes, at least a 1024x change in magnitude.
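                  As a rough back-of-the-envelope comparison of what those two rates mean, take a hypothetical 16 TB drive read end to end (for example during a resilver):

```latex
% Expected unrecoverable read errors for one full read of a 16 TB drive
\[
\text{CMR/PMR (Moore's figure):}\quad
\frac{16\ \mathrm{TB}}{10\text{--}20\ \mathrm{TB\ per\ error}} \approx 0.8\text{--}1.6\ \text{errors}
\]
\[
\text{SMR (rate claimed above):}\quad
\frac{16\ \mathrm{TB}}{10\text{--}20\ \mathrm{PiB\ per\ error}} \approx 0.0007\text{--}0.0014\ \text{errors}
\]
```

                  The ratio between the two is just the tera-to-pebi factor (roughly a thousand), which is where the "1024x change in magnitude" claim comes from; the absolute numbers here are illustrative, not measured.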

                  Bill Moore: So, one of the design principles we set for ZFS was: never, ever trust the underlying hardware.
                  This is kind of right and wrong at the same time. It is right, on one hand, not to trust the hardware, but it is not right, on the other hand, to keep using CMR/PMR when for integrity SMR is on a completely different level. Learning to make things work with SMR is not always going to be friendly, but if we care about our data we need it supported.

                  With SMR we are going to see 100TB+ drives; a one-error-per-10-to-20-TB rate is not acceptable on a 100TB+ drive.

                  Originally posted by k1e0x View Post
                  I think that is still more true than ever. There really isn't any other tools one can use when you absolutely have to know that your data is correct in scientific or financial workloads. You can have a hot backup on site but that doesn't address the integrity issue.
                  That depends on your backup solution. Some backup solutions do their own checksums as well. It is like the DM Integrity module in the Linux kernel: there are many ways to solve the integrity issue, and it does not have to be part of the file system itself, it just has to be solved by some means.

                  In 1998 hard drive makers started researching object storage drives, because they were seeing that writing data on one track could affect the data on the neighbouring track. Object-based storage, where the blocks of data are larger, was a way to fix this. SMR is just the final form of that work. The error rate Bill Moore quoted in 2007 was in fact already wrong for the object storage hard drives of 2007 - those were already at the 10-20 PiB+ level of stability - but Bill Moore did not have access to them, so his design only covers CMR/PMR tech.

                  The reality is that SMR is the better technology for data safety with hard drives. Improve how you write the data on the disc, reducing the errors the disc produces, and your integrity problem shrinks. Why keep fixing a problem when you can get rid of it by changing technology?

                  k1e0x, there is also a problem with Bill Moore's logic: never trusting the underlying hardware meant he did not get in close with the hard drive and SSD makers to understand what the drive was doing. You need the trust-but-verify model. You need to trust the drive to tell you things so that you can then verify what it has done behind your back. If you never trust the underlying hardware at all, you never read the SMART or SMR or SSD logs to see where operations have happened behind your back, so you can never verify that nothing went wrong. Sorry, ZFS starts off with the wrong ideas and a lack of access to hardware; ZFS is designed entirely for the wrong type of hardware. Back in 2007 the hard drive makers were already saying that at some point the future of hard drives would be object stores, and with SMR that future is now here.

                  Comment
