ZFS On Linux 0.7.8 Released To Deal With Possible Data Loss


  • #21
    Originally posted by jrch2k8 View Post
On the other hand, if you manage business-grade services on MDADM and ext4, you are probably going to get fired once the first storage audit hits your offices.
    ^ Spoken like someone who has never managed servers or storage for a business.



    • #22
      Originally posted by pal666 View Post
You have a funny definition of "stability"
Sweet nit-picking line. Of course, let's completely bypass the rest of the post and the following posts, where I did specify that I have used ZoL for years and ZFS since the Solaris 10 era and never lost data, because apparently that is not important. And of course let's bypass the fact that ryao has very specifically said, twice so far, that the data is not lost and that a tool is coming to fix it, because the data is actually written and safe; you just cannot see it due to the bug's effect.

I miss the competent trolls... sigh! At least those had enough tech skill to make it interesting.

As an interesting note, I also have years of usage of ZFS on FreeBSD and Mac OS X, which works very well with my external RAID regardless of the OS (apart from some servers that use large dnodes, which are not supported on every OS yet).



      • #23
        Originally posted by torsionbar28 View Post
        ^ Spoken like someone who has never managed servers or storage for a business.
I do, and in this day and age no one should be handling bit-sensitive data on a device or FS that doesn't support self-healing and metadata checksums, neither of which mdadm or ext4 provides. Sure, if you work in file sharing or hosting or other such areas, you may be fine.
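To make the point concrete, here is a toy sketch (this is not ZFS code, just the idea) of what checksum-driven self-healing means: the filesystem keeps a checksum for every block, so when one mirror copy silently rots it knows which copy is bad and rewrites it from the good one. md RAID1 plus ext4 can at best notice that the two mirrors disagree; with no block checksums it cannot tell which side to trust.

Code:
import hashlib

def read_with_self_heal(copies, expected_sha256):
    """Return a copy whose checksum matches, repairing any replicas that don't."""
    good = None
    for data in copies:
        if hashlib.sha256(data).hexdigest() == expected_sha256:
            good = data
            break
    if good is None:
        raise IOError("all copies corrupt - only a backup can help now")
    for i, data in enumerate(copies):
        if hashlib.sha256(data).hexdigest() != expected_sha256:
            copies[i] = bytearray(good)        # rewrite the rotten replica ("self healing")
    return bytes(good)

block = b"bit sensitive payroll record"
checksum = hashlib.sha256(block).hexdigest()
mirror = [bytearray(block), bytearray(block)]  # two replicas, as on a mirrored vdev
mirror[0][0] ^= 0xFF                           # simulate silent bit rot on disk 0
assert read_with_self_heal(mirror, checksum) == block  # data comes back intact, disk 0 repaired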



        • #24
          Originally posted by jrch2k8 View Post
Sweet nit-picking line. Of course, let's completely bypass the rest of the post and the following posts, where I did specify that I have used ZoL for years and ZFS since the Solaris 10 era
What I'm trying to explain is that all your centuries of using ZFS on Solaris matter exactly zero, because ZFS on Linux is different software (and quite buggy, as can easily be seen).



          • #25
            Originally posted by pal666 View Post
What I'm trying to explain is that all your centuries of using ZFS on Solaris matter exactly zero, because ZFS on Linux is different software (and quite buggy, as can easily be seen).
Actually, ZFS was a lot more buggy on Solaris, especially the first x86_64 implementations, though it was very solid on UltraSPARC Solaris (I'll give you that). The second most buggy implementation was in the BSD world for a while (I remember having a lot of issues with the first FreeBSD attempts, especially with extremely aggressive RAM usage). But in all honesty I can say that ZoL has been very stable through the years, and most of the bugs are like this one: annoying for a few days but nothing serious (again, no data is lost, it is just not visible, and as I mentioned before it took me a lot of effort to reproduce on systems with recent kernels/coreutils, while it was easy on CentOS), and very spaced out in time. If I remember right (ryao can correct me if I'm wrong), the last relatively important bug was in the early 0.6 series a good while ago.

Another thing I would like to clarify here: ZoL has the most active development of them all, and most newer features land in ZoL first and are then picked up by the other OS trees (Mac OS X being the slowest), so yeah, you will usually find more bugs reported against ZoL than against BSD or Mac.

Disclaimer: Solaris 11 probably has even more features, but I don't follow Oracle Solaris anymore; I don't like Oracle as a business, and I stopped recommending their products (outside of databases) a while ago.

Disclaimer: regardless of the OS/version, I have never suffered data loss or permanent corruption while using ZFS, but I always have proper backups; RAID and filesystems are not a replacement for proper backups.



            • #26
              Originally posted by jrch2k8 View Post

I do, and in this day and age no one should be handling bit-sensitive data on a device or FS that doesn't support self-healing and metadata checksums, neither of which mdadm or ext4 provides. Sure, if you work in file sharing or hosting or other such areas, you may be fine.
              "No one"? The big players (think Google, Amazon, etc.) perform checksum and replication of data in software. They don't give a crap about the underlying filesystem or even the underlying block storage because they view it as inherently unreliable. And they use ext4 more often than not, with additional layers on top, e.g. Gluster. Sounds like your experience in this area is limited.

              Heck, Google was using ext2 (yes two!) all the way into 2010, when they upgraded to ext4.
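That pattern is simple enough to sketch. A rough illustration (made-up paths and layout, not Google's or Gluster's actual code): the application stores its own checksum next to every object and writes copies to several backends, so a bad read from any one local filesystem is simply retried against another replica, and the filesystem underneath barely matters.

Code:
import hashlib, os

REPLICAS = ["/srv/replica0", "/srv/replica1", "/srv/replica2"]  # hypothetical backends

def put(key, data):
    """Store the object on every replica, prefixed with its own SHA-256."""
    digest = hashlib.sha256(data).hexdigest().encode()
    for root in REPLICAS:
        os.makedirs(root, exist_ok=True)
        with open(os.path.join(root, key), "wb") as f:
            f.write(digest + b"\n" + data)

def get(key):
    """Return the first replica copy whose checksum verifies, skipping bad ones."""
    for root in REPLICAS:
        try:
            with open(os.path.join(root, key), "rb") as f:
                digest, data = f.read().split(b"\n", 1)
        except OSError:
            continue                 # replica missing or unreadable, try the next one
        if hashlib.sha256(data).hexdigest().encode() == digest:
            return data              # verified copy; which local fs held it is irrelevant
    raise IOError("no intact replica found")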
              Last edited by torsionbar28; 10 April 2018, 04:48 PM.



              • #27
                Originally posted by torsionbar28 View Post
                "No one"? The big players (think Google, Amazon, etc.) perform checksum and replication of data in software. They don't give a crap about the underlying filesystem or even the underlying block storage because they view it as inherently unreliable. And they use ext4 more often than not, with additional layers on top, e.g. Gluster. Sounds like your experience in this area is limited.

                Heck, Google was using ext2 (yes two!) all the way into 2010, when they upgraded to ext4.
You literally went for the most ludicrous example possible trying to look smart. First of all, Google and Amazon are brands, massive on a unique scale: both offer a ridiculous number of services and handle/serve a ludicrous amount and variety of data. No single filesystem/storage solution in existence could cover all of Google or Amazon or any other modern super-conglomerate (including behemoths like Samsung), but all of those solutions are used where they are needed (their engineers know their stuff).

Second, Google does use Btrfs, and Ceph, and Gluster, and ext4, and custom-designed solutions, and EMC storage, and Hitachi storage, and a myriad of other storage systems and services, because unlike you, instead of fanboying, they actually work very hard to use the right tools for the jobs at hand.

Third, you don't have "data", or at least that word is a gross oversimplification. You have:

1.) Bit-sensitive data: ZFS/Btrfs/EMC/others are king here, and they are as trustworthy as it gets.

2.) Security-sensitive data: ZFS/EMC/others do very well here, but you may need custom solutions as well.

3.) Classified data: custom equipment and filesystems; out of the scope of regular filesystems, including ext4, ZFS, Btrfs, etc.

4.) Volatile data, secure and insecure: custom solutions and ramfs. This is data that is generated on the fly from cold data and, while it requires integrity, it can be recomputed at the cost of latency.

5.) Cold data: also known as cold storage. ZFS/Btrfs/EMC/others are king here, and they are as trustworthy as it gets.

6.) Hot, low-scale distributed streaming: Ceph, Gluster, AFS. Usually these systems work on layers of cold storage and buffers, and latency is your biggest enemy, hence all solutions present scalability problems at some point. This data also doesn't require extra safety measures, just an underlying system that seeks as fast as possible.

7.) Hot, high-scale distributed streaming: custom solutions are required, and this scale breaks any general attempt at data checksumming/replication/self-healing, because the latency between chunks and their checksums is so large that regular filesystem write contention is useless: the checksummed chunk could have changed before the write is even permitted, invalidating the whole original chunk. For example, Google Docs has its own custom system here to deal with this, plus parallel versions and many other issues that are extremely unique to that platform.

I could get to a hundred here, but in summary: if there is a filesystem/OS/piece of equipment on earth that handles storage in any sort of way, I guarantee you they are using it right now, WHERE IT MAKES SENSE. You would have to be very bad at your job to use ZFS as a YouTube backend, simply because bit integrity plays no role there (the bit loss in streaming is always huge, so it is moot) and the latency would make it unusable, in the same sense that Google will never use ext4 to store highly bit-sensitive binary data, and Amazon will never use x86 to handle sensitive financial data, etc., etc., etc.

Also, both Google and Amazon are horrible examples because their problem will never be data integrity (they have the top crop of cold-storage systems for that) but latency, genius, and latency requires SPECIALIZED solutions built with intimate knowledge of the upper-layer platform, which ext4 and ZFS were in no way designed for in the first place. In fact, YouTube uses some form of raw distributed storage system (plain low-level disk writes without a filesystem), handled in some way (it is an industrial secret, after all) by artificial intelligence, to buffer the initial sections of videos as efficiently as possible, somehow guessing what most users will request in a session in order to deal with latency, while prefetching other parts of the stream, etc.

In summary, it is not as simple as "Google uses ext4, brah!!!" but an incredibly complex topic, with a myriad of different tools solving a myriad of different problems that can grow massively in complexity depending on the scenario.



                • #28
                  Originally posted by jrch2k8 View Post
                  blah... blah... blah... "it depends"
                  No one is claiming that a one-size-fits-all solution exists. Good grief. Who are you arguing with??
                  Last edited by torsionbar28; 10 April 2018, 06:10 PM.



                  • #29
Okay. We believe that we can recover all of the missing directory entries. What we actually lost were the directory entries, which store the names of files and designate them as being part of a directory. Files that lost directory entries will be getting new ones in a lost+found directory. All of the inaccessible stored data should be recovered once we release the update to fix this in the next week or two; it depends on how fast I work, despite having several other things scheduled for this week.

                    As for the people who are running with the idea that ZFSOnLinux lost people's data because of a bug, we will have technically only lost metadata after everything is fixed. Also, this is much better than what happened in the past on other filesystems that people are mentioning in comparisons.
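For anyone confused about the distinction, a toy model (this is not the actual recovery tool, just an illustration of why the data itself is safe): the file contents live in objects that are all still on disk; what went missing are the name-to-object links inside some directories, so recovery amounts to creating new links for the orphaned objects under lost+found.

Code:
# File data (objects) is all intact on disk; only some directory entries vanished.
objects = {128: b"report.odt contents", 129: b"photo.jpg contents"}
damaged_dir = {}   # used to map names to objects, e.g. {"report.odt": 128, ...}

def relink_orphans(objects, directories):
    """Give every object that no directory points at a new entry in lost+found."""
    linked = {num for d in directories for num in d.values()}
    return {"#%d" % num: num for num in objects if num not in linked}

lost_found = relink_orphans(objects, [damaged_dir])
print(lost_found)  # {'#128': 128, '#129': 129} - contents reachable again, names gone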
                    Last edited by ryao; 10 April 2018, 06:19 PM.



                    • #30
                      Originally posted by torsionbar28 View Post
                      Heck, Google was using ext2 (yes two!) all the way into 2010, when they upgraded to ext4.
Do you really think that everyone has the resources and ability to handle data just like Google does? More specifically, is everybody running custom-designed, custom-developed software with data protection in mind from the beginning? I think it is more realistic to assume that most of us have to use third-party software over which we don't have full control of how it reads/writes data to the filesystem, and that we can't with full confidence rely only on that software without additionally covering data integrity at the filesystem level.

