Announcement

**k1e0x** · 22 January 2020, 11:33 PM

Originally posted by oiaohm View Post

Funny you arguement here. This is not bolting on this is using something that in the old design that is meant to be used that way. In fact something you admit latter ZFS is doing.

IMA was the example he used. But he also said it did not have to be IMA. Really there is no reason why a file system cannot a flag in it superblock to say I have X feature so VFS layer added feature need to be enabled. It is possible to say all files in this file system have checksums in xattr.

See ZFS is placing check-sum is xattr for whole file. This does not have to be implement in the ZFS file system. If you implemented where the IMA/fsverify are this could be taken out of ZFS and implement for all file systems that support xattr to have whole file checksumming. Lets really stop design checksum per file system on per file system based system it only created increases validation work and at times unrequired extra processing.

Notice something I enable IMA on ZFS by the current design of ZFS this can be calculating the same checksum twice and consuming twice the number of bytes in the xattr with no real advantage. Basically whole file checksumming need to get out of file systems and moved to the layer above. This is Linux coming into alignment with old IRIX ways.

Block level check sums can be stored in the block layer. Don't have to be stored in the file system. Rework of iomap in Linux is about changing things so checksums on blocks going to storage can be calculated and signed by the VFS/IMA level independent of file systems and in fact be more complete end to end. This is Linux coming into alignment with old IRIX ways again.

That the problem you are too young. You have not consider that most of what you are doing with ZFS was done over a decades before. Problem when XFS was ported to Linux the other bits around it did not come with it being the XVM(logical volume manager) or the differences in VFS bit above XFS for whole file checksumming did not come either. So the XFS we have been looking at is the feature crippled version. Fully implemented version of XFS is a lot stronger competitor and it also moves a lot of other file systems up as competitive on data protection.

So you are working on the younger tech with ZFS and don't get it. In fact the way ZFS does whole files checksums comes from the historic way XFS did it. Technology wise ZFS is the child of XFS in a lot of the file protection stuff. XFS on IRIX checksums on files were done in the layer above the file system in the VFS layer with IMA is. So having full file checksum in the layer above is a long standing technology design choice for whole file check sums.

Lets say in 10 years time someone implements ZFS without checksum then some other file system comes along claims a whole stack of advantages over ZFS because it has checksums are they not idiots. This is exactly the mistake you have made with XFS totally missing how far ahead it was. The old design has some very interesting points it about time you stop claiming bolting stuff on XFS developer currently is just implementing the stuff that should have been implemented to fully port XFS originally.

The original model around XFS is design to be bolt together to reduce down the duplication of effort. So every file system does not have to write a block layer or a validation layer stuff.

Omg block level != file level.

You can drop the agism card to make your point but I've worked on mainframes (PrimeOS a non posix OS you've never heard of for a reason). My first time using Unix was 1988 and my first time using Linux was 1994. I must not get it because I haven't been around long enough even though I've managed large storage arrays and worked for tech companies you talk about and use every day.

Whatever you think man.. I guess checksuming your data makes it harder to check it's integrity, Just let firmware do it what could go wrong (737 MAX?) Opensource is bad unless it's GPL and ZFS is just unstable and loses data all over the place. Since nothing else on Linux is even half baked I guess you recommend everyone use NetApp. Way to promote software freedom, open source and Linux in the enterprise. +1

**oiaohm** · 23 January 2020, 01:42 AM

Originally posted by k1e0x View Post

Omg block level != file level.

That is the first interest question. IRIX did not fully split the block level and file level. The ablity that xfs brings back to put a xfs file system in a file on top of a xfs file system and mount it without using a loop back file system comes from IRIX with the block level being transparent though the file system layer.

block level != file level. IRIX is block level+ file level then it came block level+file level+integrity level. This was something kind of unique to the IRIX systems made direct io on them work insanely well the iomap work in current day Linux kernel is bring this to Linux.

[GIT PULL] iomap: new code for 5.5 - Darrick J. Wong

https://lore.kernel.org/lkml/20191125190907.GN6219@magnolia/

I would guess you would not be watching this work.

Originally posted by k1e0x View Post

You can drop the agism card to make your point but I've worked on mainframes (PrimeOS a non posix OS you've never heard of for a reason).

https://en.wikipedia.org/wiki/PRIMOS I think you have that OS name wrong. Sorry to say I used that highschool. Might have been a backwater highschool but when you have 4 teachers with double doctrines in a school of 200 students the hardware there can be quite odd with lots of access. The first posix OS I played with was Primix. I was also playing with IRIX and few others at the time.

Originally posted by k1e0x View Post

My first time using Unix was 1988 and my first time using Linux was 1994.

Those are close 1988 was fairly much when I came contact with Unix except this was not 1 Unix this was 8 different ones. Linux was 1993 for me.

Originally posted by k1e0x View Post

I must not get it because I haven't been around long enough even though I've managed large storage arrays and worked for tech companies you talk about and use every day.

"Keynote: Drop Your Tools – Does Expertise have a Dark Side?" - Dr Sean Brady (LCA 2020)

https://www.youtube.com/watch?v=Yv4tI6939q0

Dr Sean Bradyhttps://lca2020.linux.org.au/schedule/presentation/218/All professionals possess some measure of expertise, and not only is this expertise usefu...

"Keynote: Drop Your Tools – Does Expertise have a Dark Side?" - Dr Sean Brady

It would pay you to watch this. In fact you Expertise can be the very reason you are not getting what the XFS file system developer is up-to.

Originally posted by k1e0x View Post

I guess checksuming your data makes it harder to check it's integrity,

This is that you did not read question.

If the IMA side is checksum the data blocks does it make any sense for the file system to duplicate?

This was in fact a answer from the video at the conference over teaching XFS old dog new tricks. Lot of it is really teaching the old XFS dog tricks it use to know.

So the question is where should you be performing the checksums. ZFS locations it performs it checksums may be wrong and the XFS lead developer does not agree with where ZFS does things. Is the right place the file system code or should it be system wide generic. Integrity means you must generate checksum.

When you look at the work to make iomap that is basically a library shared between Linux kernel file system drivers in future this give a different place to put in block checksums.

Full file checksums are need by the integrity layers so are per block checksums.yes the block is to locate what section in a full file had in fact been messed with.

ZFS was not really design to work with system wide integrity. When I say system wide you will be wanting to confirm that X file on two different file systems are in fact the same. Like you are coping to backups and the backups are not ZFS for some reason. Current ZFS system lets say I copy a file to a UDF for burning on disc to transfer the protections are gone.

Its about time you ZFS guys pull head out ass and work out your data protection being restricted to ZFS is a bug.

**k1e0x** · 23 January 2020, 03:03 AM

I was being sarcastic and no file level checksums are wrong because if that is how you want to do it then you loses the ability to provision a VM block device on top or put a iscsi SAN on it or other object block layer. ZFS allows you to format other filesystems to block devices any way you want in your pool. It can even emulate storage.. useful for developing things like XFS I'd imaging. heh.. for VM's it makes things like qcow, qcow2 (and whatever else they come up with next) obsolete. BSD's bhyve has no support for things like that.. it doesn't need it, it has ZFS.

If I wanted a checksum on the file.................................. I could just do it today exactly the same way *you* are because ZFS supports xattr's exactly the same way.

Then I can be "layer complete" lol - (sometimes I really wonder what the hell you're talking about.. it's already in there that way if you want to do it)

But in your preferred method if it's file based every time a VM wrote to it's drive you would have to calculate the checksum for the entire virtual disk. ZFS does this only for the blocks modified. Even if you slice them up, ZFS is still doing less work because it'd doing it on the atomic unit. And you get all the other cool features, snapshots, network portability, incrementals, compression, per dataset encryption, tier based cache on nvme's, writable cloning (duplicate a master image 100 times and only consume the delta) ..all for free. No development necessary it's already done, working and in production today on Linux.

And besides.. nobody does full file level checksums, It's only used for secure boot really.. probably because it's dog slow.

And you are right to a point, no file system is perfect.. as I said before many time, I don't care what client systems and home users use. I don't care what file system your phone runs. I care about storage in enterprise.

I'm also calling BS that you used a prime computer at your highschool.. what highschool in the 80's owned a mini-computer? Ya right.. Imagine the budget. What the hell would you use it for? No.. you looked it up on wikipedia and that's great that you're trying to impress me. Wow even using linux a year longer than me.. tell me, cuz I forget.. what was Linux used for back then? You aren't showing yourself to be very trustworthy. You can just say "ok the ageism shit was wrong"

**oiaohm** · 23 January 2020, 05:55 AM

Originally posted by k1e0x View Post

But in your preferred method if it's file based every time a VM wrote to it's drive you would have to calculate the checksum for the entire virtual disk. ZFS does this only for the blocks modified.

Try gain. This time read.
If the IMA side is checksum the data blocks does it make any sense for the file system to duplicate?

XFS developer is not talking about IMA checksum being a file based one but a block one being calculated and checked above the file system. It has to had to be a full file based one because it could not see the to block side though the file system but that will not be the case after the iomap changes are complete in the Linux kernel.

Originally posted by k1e0x View Post

And besides.. nobody does full file level checksums, It's only used for secure boot really.. probably because it's dog slow.

Interesting question are the full file level checksums that are in fact block-based? The answer is yes. Most known block and file checksum is the checksum system used by bittorrent. So this is a different class of checksum to what ZFS has at the moment that could be stored in a xattr.

So first thing to make IMA checksums in fact work and not be dog slow is make them a block based checksum. To be block based checksums they have to be able to see the blocks that make the file up from above the file system.

Omg block level != file level

As soon as you wrote this I knew I was dealing with a person who has let there experience cloud their logic.

**k1e0x** · 23 January 2020, 06:04 AM

Yeah,....

Well good luck with that.. the world waits on baited breath I'm sure for your solution..

The rest of us who actually have work to do already have one.

**oiaohm** · 23 January 2020, 06:17 AM

Originally posted by k1e0x View Post

The rest of us who actually have work to do already have one.

Are you sure. You thought Linux blocking fpu access for ZFS was bad. Now think what you are going todo if you wake up in a world that all Linux file systems have to be iomap based.

XFS developer is up-to something that should have you ZFS users and developers kind of worried.

**k1e0x** · 23 January 2020, 10:10 PM

Originally posted by oiaohm View Post

Are you sure. You thought Linux blocking fpu access for ZFS was bad. Now think what you are going todo if you wake up in a world that all Linux file systems have to be iomap based.

XFS developer is up-to something that should have you ZFS users and developers kind of worried.

There is a slight performance hit on kernels above 5.0 nothing in enterprise uses 5.0 that I'm aware of most recent stuff is like 4.19-LTS. I haven't noticed any real performance hit.. It preforms well 5.0 and above depending on your pool layout.

k? I'm sure he is. It's odd you talk about the "XFS guy" .. where as ZFS is a community of paid developers, is only one person working on improving XFS? That is no good.

I did watch this on the XFS design and plans.
Teaching an old dog new tricks. https://www.youtube.com/watch?v=wG8FUvSGROw

And I have some thoughts on this. In the early part when he's explaining "cow filesystems" He's talking directly about ZFS's design (leaf-to-root) and not about other implementation and he sounds to me like he has a lot of respect for that design. Part of the magic of ZFS is when it walks up the tree to update it's pointers it also adds it's checksums, it's snapshot references and other info all in one go. (I think this is rather elegant personally, Sun used a "design negative" in theory to add a lot of positive features in a seamless way.. almost like "well.. since you're there anyway.." and in the end it ends up being a performance boost as opposed to other methods trying to implement the same thing)

Then he seems sad and lists his problems going into replicating that feature set and really seems to be grasping as straws. Talking about loopback devices cp and tar... and showing scripts that implement this.. it's ugly. He also has a small mistake in that ZFS does not cache the file "bash" from 1000 clones. The ARC only works on blocks, 1000 clones of bash if unmodified share the same blocks. The ARC also isn't a LRU cache like most other filesystems. It's why benchmarking ZFS isn't really fair if you defeat it's cache, because the ARC is integral to it's use and it actually works, unlike LRU.

...You know I'm left with the feeling here.. That it's sad seeing really smart people try to jump over hurdles and work so hard to do this stuff. I really think he should take the old dog out and shoot it.. and if he thinks ZFS is a no go due to license.. alright, fine if you have that opinion ok.. but instead of all this stuff he should start working on bcachefs or HAMMER2. He needs a new design because when you start jumping to the point of a loopback device.. you might as well just start over. You sort of are anyhow and anything you come up with is going to be a hack.

Idk tho.. maybe I should be "worried" for totally vague reasons. Sorry.. just really not "worried" after watching that. just sad for him.

**oiaohm** · 24 January 2020, 12:34 PM

Originally posted by k1e0x View Post

k? I'm sure he is. It's odd you talk about the "XFS guy" .. where as ZFS is a community of paid developers, is only one person working on improving XFS? That is no good.

iomap & xfs support for large pages [LWN.net]

https://lwn.net/Articles/796847/

There is the XFS project lead who you see in that video laying out a plan then multi different personal including Oracle staff bring it into reality.

Originally posted by k1e0x View Post

I did watch this on the XFS design and plans.
Teaching an old dog new tricks. https://www.youtube.com/watch?v=wG8FUvSGROw

So you have watched it.

Originally posted by k1e0x View Post

Talking about loopback devices cp and tar... and showing scripts that implement this.. it's ugly.

That was only the prototype. iomap complete no loopback devices in play

Originally posted by k1e0x View Post

He also has a small mistake in that ZFS does not cache the file "bash" from 1000 clones.

That was not refering ZFS this is your expertise getting in the way its referring the issue btrfs and it interaction with the VFS layer above it. It also something that could happen with namespaces as well with ZFS.

Originally posted by k1e0x View Post

The ARC only works on blocks, 1000 clones of bash if unmodified share the same blocks. The ARC also isn't a LRU cache like most other filesystems. It's why benchmarking ZFS isn't really fair if you defeat it's cache, because the ARC is integral to it's use and it actually works, unlike LRU.

iomap work is to bring ARC like block management to all Linux kernel supported filesystems in time. Also provide a route so this declone at block layer is not undo by something at the VFS layer. Something you forget Linux kernel memory management above ZFS when running on Linux is also a LRU. Some of the work of iomap is to fix the Linux kernel memory system.

I hope ARC is ready to handle 2Meg blocks and other horrible caused by the up coming Linux kernel large pages.

**gilboa** · 26 January 2020, 02:26 PM

Originally posted by k1e0x View Post

I can assure you ZoL *IS* used in enterprise. I've used ZFS in enterprise since ~2006 on Solaris, FreeBSD and Linux. It is pretty much the fix for ransomware.

The CDDL nor the GPL have restrictions on usage, only on distribution. You can and always have been able to do whatever you want on your own systems.

I posted this before. 100 PB - ZoL Implementation. Storage of that size actually isn't that unique for ZFS (typically more Solaris and FreeBSD but ZoL is getting more popular.) Most enterprises don't share information on their storage platforms. Since that one is a quite impressive research system they do have details posted about it.
https://medium.com/codedotgov/oss-sp...x-6596fca6e5f6

"unlike ext4/EXFS, when ZOL breaks (and I've seen it break badly in production environment, including very recently) you get to keep all the pieces." -- seriously? You're going to make the claim ext4 is *better* at data integrity than ZFS?

No.. It's not. Not in any universe. Like Linus you don't seem familiar with ZFS.. that is ok.. but unlike Linus you shouldn't go making claims when you don't have the basic facts.

FYI, ZFS is a copy on write, always consistent file system with a checksum on every data block.
EXT4 is... not. It is a journaled, extent based file system that updates blocks in place making it not always consistent. Do you like fsck? Or data blocks with silent corrupt files? EXT4 is for when you want a small but present possibility of your pictures looking like this and never knowing till you open it.

Either my writing skills have massively deteriorated or your reading skills (intentionally or unintentionally) are lacking.

1. ZFS on Oracle (Be that Solaris or Unbreakable Linux) != ZOL.
One is *supported* by a major enterprise-grade OS supplier the other, is, well, not.
Comparing the two is humorous, at best.
Most of the ZFS on Linux users don't bother to pay the huge sums of money required to get proper support.
And if it breaks (see 3) they will get to keep all the pieces.

2. ext4 *is* support by enterprise-grade OS supplier (Be that RHEL, SUSE or Ubuntu).
Good luck trying to get support for your RHEL data cluster if you use anything ZFS.
Same goes for any other enterprise grade Linux distribution (beyond Oracle, that is, see 1).

3. Copy on write my protect you from single-bit-curroption. It *does* not protect you from bugs.
Having just witnessed a 0.8PB cluster (w/ ZFS) go up the flames, I'm not that impressed.

Again, if you use ZFS on a support platform, good for you.
If you are stupid enough to use ZOL on an unsupported platform thinking that copy-on-write might save your ass, think again.

- Gilboa

**k1e0x** · 28 January 2020, 10:36 PM

Originally posted by gilboa View Post

Either my writing skills have massively deteriorated or your reading skills (intentionally or unintentionally) are lacking.

1. ZFS on Oracle (Be that Solaris or Unbreakable Linux) != ZOL.
One is *supported* by a major enterprise-grade OS supplier the other, is, well, not.
Comparing the two is humorous, at best.
Most of the ZFS on Linux users don't bother to pay the huge sums of money required to get proper support.
And if it breaks (see 3) they will get to keep all the pieces.

2. ext4 *is* support by enterprise-grade OS supplier (Be that RHEL, SUSE or Ubuntu).
Good luck trying to get support for your RHEL data cluster if you use anything ZFS.
Same goes for any other enterprise grade Linux distribution (beyond Oracle, that is, see 1).

3. Copy on write my protect you from single-bit-curroption. It *does* not protect you from bugs.
Having just witnessed a 0.8PB cluster (w/ ZFS) go up the flames, I'm not that impressed.

Again, if you use ZFS on a support platform, good for you.
If you are stupid enough to use ZOL on an unsupported platform thinking that copy-on-write might save your ass, think again.

- Gilboa

1. I have never seen Oracle ZFS storage appliance used in the enterprise personally, I know they exist I've just never seen them. Only Sun Microsystem's before Solaris version 10. (Pool version 28~) Generally it's used on FreeBSD storage clusters that are secondary storage to NetApp, EMC or DDN. (around 32-64 spindles) You can get paid commercial ZFS (and ZoL) support from both FreeBSD 3rd parties and Canonical. Ubuntu 19.10 has ZFS on root in the installer and so will 20.04-LTS

2. All I got to say about ext4 is hope you like fsck.

3. You don't know what you're talking about. COW alone has nothing to do with bit-rot or uncorrectable errors. You're thinking of block checksums, and yes, they are good. COW provides other features such as snapshots, cloning and boot environments. Boot environments are pretty cool.. maybe Linux should get on that... oh wait.. ZFS is the only Linux file system that does it and we can't have *that*.

Check this out.. FreeBSD 12 has a new command for boot environments..

bectl

https://www.freebsd.org/cgi/man.cgi?query=bectl

bectl create zroot@snap
bectl jail zroot@snap

You just cloned your running OS into a writable, bootable, virtual environment and spawned a shell in it and it was instant.

Announcement

Linus Torvalds Doesn't Recommend Using ZFS On Linux

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment