XFS Developer Takes Shots At Btrfs, EXT4
Chris Mason of Btrfs fame wasn't the only Linux file-system developer talking to the public last week. While the Btrfs talk was going on in Los Angeles at SCALE 10x, Dave Chinner was down under in Australia at LCA2012 talking about XFS. His talk included some controversial shots at EXT4 and Btrfs.
During his Linux.Conf.Au 2012 presentation in Ballarat, Australia, Chinner first talked about a long-standing XFS problem: the file-system's meta-data modification performance was terrible. EXT4 could be 20~50x faster than XFS on certain workloads, such as unpacking a Linux kernel source tarball. However, with one major algorithm change and various performance optimizations, XFS performance now scales much better (Dave recommends the Linux 3.0 stable series or newer for the best XFS support).
The algorithm change was delayed logging, which aggregates transaction commits in memory. It took about five years and four attempts to arrive at a suitable solution, and the feature is modelled on EXT3.
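To illustrate why aggregating transaction commits in memory helps, here is a toy sketch in Python. It is not the XFS implementation: the `DelayedLog` class, its methods, and the object keys are all hypothetical names made up for this example. The only idea it models is that repeated changes to the same meta-data object are merged in memory, so a checkpoint writes each object once rather than once per commit.

```python
class DelayedLog:
    """Toy model of delayed logging (hypothetical, not XFS code)."""

    def __init__(self):
        self.pending = {}   # object key -> latest in-memory state
        self.journal = []   # checkpoints that were "written to disk"
        self.commits = 0    # number of transaction commits seen

    def commit(self, changes):
        """Record a transaction in memory only; later changes to the
        same object overwrite earlier ones."""
        self.commits += 1
        self.pending.update(changes)

    def checkpoint(self):
        """Write all aggregated changes as one journal entry and
        return how many objects were written."""
        if not self.pending:
            return 0
        self.journal.append(dict(self.pending))
        written = len(self.pending)
        self.pending.clear()
        return written


log = DelayedLog()
# 100 transactions that all touch the same two meta-data objects.
for i in range(100):
    log.commit({"inode:7": i, "dir:2": i})
objects_written = log.checkpoint()
```

In this sketch, 100 commits collapse into a single journal entry covering two objects, which is the kind of saving a meta-data-heavy workload such as unpacking a source tarball could benefit from.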
In his 45-minute presentation, Chinner also talked about other improvements recently made to XFS like the lock-less log space reservation fast path, extensive meta-data sorting before the I/O dispatch, batching active log item manipulations, meta-data caching being divorced from the page cache, and lock-less (RCU-based) inode cache look-ups.
Dave Chinner says that XFS meta-data performance and scalability are now good and can be considered a "mostly solved" problem. Further XFS work will focus on getting more performance out of the VFS lock scalability work and on validating performance scalability on high-IOPS storage (namely PCI Express SSDs). Improving reliability and feature resilience will be the next major challenge.
The presentation also covered possible on-disk format changes that would break backwards/forwards compatibility. The reasons given were that CRCs are not sufficient by themselves to provide robust failure detection and recovery, that there is not enough free space in the XFS meta-data at present, and that other planned functionality necessitates on-disk format changes. Dave hopes XFS will gain proactive detection of file-system corruption via online meta-data scrubbing, reverse mapping, and online, application-transparent detection and repair of certain common types of corruption.
This Red Hat developer believes XFS will be well placed to remain the "large and lots" go-to Linux file-system.
Among the shots Dave Chinner took at Btrfs was a remark about "the skeletons in the corner that are getting dusty" that the Btrfs developers don't want the public to know about. In particular, Btrfs does not currently scale well on file-systems holding large amounts of meta-data. Large-scale allocation is also slow, but Chinner acknowledges that Btrfs will scale to arbitrarily large file-systems with further optimizations. Of course, for mobile and desktop Linux users this problem is largely moot.
On Btrfs, the Red Hat developer also says that the next-generation Linux file-system does not scale, is still "under heavy feature development and not fully stable," and has some deficiencies that might take time to overcome. "Btrfs is not ready for production...it will work, but will work slowly."
Dave acknowledges, though, that Btrfs will soon replace EXT4 as the default Linux file-system due to its unique feature set. He then took shots at the EXT4 file-system for not being able to scale to arbitrarily large files and file-systems. Additionally, Chinner says, "EXT4 is not as stable or as well tested as most people think" and "EXT4 has become an aggregation of semi-finished projects that don't play well with each other."
Embedded below is the XFS file-system presentation from Linux.Conf.Au 2012.