Originally posted by liam
Anyway, those are real issues, but they have nothing to do with Valerie's post. Here is where things stand on each:
Shrinking: This requires block pointer rewrite. The only implementation of that was done years ago by Matthew Ahrens at Sun and never saw the light of day because it had performance issues. Oracle has a team of new people working on it in their fork, but it is safe to consider this to be stalled. Since ZFS is meant to deprecate partitioning, this is not a huge problem.
Small random writes: ZFS is already very good at this thanks to the write sequentialization provided by the ZIL, and SLOG devices can help here significantly. Etienne Dechamps wrote a FASTWRITE algorithm in ZFSOnLinux in 2011 that further improves things in the presence of multiple top level vdevs. More recently, Brian Behlendorf wrote patches that further improve this on mirrors. In late July/early August, I wrote a patch for the Linux port that implements a list of drives known to misreport their sector sizes. It adjusts ashift automatically when top level vdevs are created with drives that are in the database. This improves random IO performance on new pools that involve those drives when the system administrator did not manually override the default when making the vdevs/pool.
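To make the ashift part concrete, here is a rough sketch of the idea in Python. It is not the actual patch: the model strings, the table contents, and the function names are all made up for illustration. The only fixed fact it relies on is that ashift is the base-2 logarithm of the sector size (9 for 512 bytes, 12 for 4096 bytes), and I am assuming a vdev should use the largest ashift any of its member drives needs.

```python
# Hypothetical table: model strings and sizes are placeholders, not the real list.
KNOWN_MISREPORTING_DRIVES = {
    "EXAMPLE-VENDOR EX4000": 4096,
    "EXAMPLE-VENDOR EX8000": 8192,
}


def ashift_for_sector_size(sector_size):
    """Return log2(sector_size); sector sizes must be powers of two."""
    if sector_size <= 0 or sector_size & (sector_size - 1):
        raise ValueError("sector size must be a positive power of two")
    return sector_size.bit_length() - 1


def choose_ashift(model, reported_sector_size, override=None):
    """Pick an ashift for one drive.

    An explicit administrator override always wins. Otherwise the drive's
    reported sector size is used, unless the table says the drive is known
    to really use a larger one.
    """
    if override is not None:
        return override
    known_size = KNOWN_MISREPORTING_DRIVES.get(model, reported_sector_size)
    return ashift_for_sector_size(max(reported_sector_size, known_size))


def vdev_ashift(drives, override=None):
    """Pick one ashift for a vdev: the largest any member drive needs.

    `drives` is a list of (model, reported_sector_size) tuples.
    """
    return max(choose_ashift(model, size, override) for model, size in drives)
```

With this sketch, a drive from the table that claims 512-byte sectors ends up with ashift 12 instead of 9, and one such drive in a mirror pulls the whole vdev up to 12. A drive that is not in the table keeps whatever its reported sector size implies.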
Fragmentation as things reach capacity: All mainstream filesystems suffer from fragmentation and reduced performance when they fill to capacity. George Wilson at Delphix recently wrote a set of patches that help things degrade more gracefully.
As for higher raid levels, Jeff Bonwick's original plan for ZFS was to implement N-parity raidz, but Oracle's acquisition of Sun led to his resignation. He is now at a fairly secretive startup and this work has stalled. We have triple parity raidz, which should be sufficient for the foreseeable future.