
TFS File-System Still Aiming To Compete With ZFS, Written In Rust


  • #41
    Originally posted by Zan Lynx

    I'll believe it when I see it.

    In my experience, getting things working is the straightforward part. Proper error handling ends up as two thirds of the code, if you're doing it properly anyway. File systems have to handle data transfers that end half-way through, disks that run out of space, disks that lie about data commit, data that isn't there. Data that is sometimes there. RAM errors, those are fun. RAID recovery with errors on multiple disks...

    Being able to mount in degraded mode without making things worse.

    File systems. So much fun.
    Well, with more than two years of experience writing systems-level software in Rust, I can easily believe that Ticki can pull this off in the timeframe he believes he can: first stable release by the end of the summer and likely production-grade by next year. A small team with Rust can do the same work as a significantly larger C/C++ team, because the Rust compiler acts as the heavyweight champion of code and error-handling discipline.

    We have a lot of tools that are de facto standards in Rust development that just aren't standard at all with C/C++. Everything we are using is bleeding-edge, current-generation tooling compared to the very conservative tool set a typical C box is configured with. We make extensive use of ADTs (algebraic data types) and pattern matching in all of our software, including for error handling. The compiler ensures that references are never touched in a way that could break when you are writing a fully multi-threaded and fully asynchronous filesystem. The lifetimes mechanism also ensures that no value is dropped too soon while references elsewhere in your code still need it. The type system allows some interesting hacks that can turn some forms of logic errors into compile-time errors. The author, Ticki, also wants to extend Rust's error checking capabilities even further. This is the kind of guy that's working on TFS.
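
    To make the ADT and pattern matching point concrete, here is a minimal sketch of what that style of error handling looks like. The names (WriteError, Recovery, write_block, plan) are made up for illustration and are not taken from TFS or Redox; the point is only that the compiler rejects a `match` that forgets one of the error cases.

```rust
// Hypothetical error type for a block write; NOT taken from the TFS source.
#[derive(Debug, PartialEq)]
enum WriteError {
    OutOfSpace { needed: u64, available: u64 },
    ShortWrite { written: usize, expected: usize },
    ChecksumMismatch { block: u64 },
}

// Hypothetical recovery actions a caller might choose between.
#[derive(Debug, PartialEq)]
enum Recovery {
    Retry,
    Abort,
    MarkDegraded,
}

// Pretend to write `data` into a device with `free` bytes left.
// Returns the remaining free space, or a typed error.
fn write_block(free: u64, data: &[u8]) -> Result<u64, WriteError> {
    let needed = data.len() as u64;
    if needed > free {
        return Err(WriteError::OutOfSpace { needed, available: free });
    }
    Ok(free - needed)
}

// The match must be exhaustive: adding a new WriteError variant makes
// this function fail to compile until the new case is handled.
fn plan(err: &WriteError) -> Recovery {
    match err {
        WriteError::OutOfSpace { .. } => Recovery::Abort,
        WriteError::ShortWrite { .. } => Recovery::Retry,
        WriteError::ChecksumMismatch { .. } => Recovery::MarkDegraded,
    }
}

fn main() {
    match write_block(2, b"abcd") {
        Ok(left) => println!("ok, {} bytes left", left),
        Err(e) => println!("{:?} -> {:?}", e, plan(&e)),
    }
}
```

    Because the error is an ordinary return value rather than a global like errno, the caller cannot read the success value without first deciding what to do about the failure case.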

    Basically, the most difficult parts of developing a filesystem are taken care of in an efficient, convenient, and safe manner. As for handling memory, that is why Ticki has created the Ralloc memory allocator for Redox and TFS. It has a few capabilities that the system allocator on Linux lacks, and even has some nice security features, like optionally zeroing out memory that is dropped, along with some nifty error handling and logging capabilities.


    • #42
      Originally posted by AllanJude
      I would be much more interested in TFS if it implemented ZFS in an on-disk compatible way, so it could import my existing zpool with a Rust codebase. As for a different ZFS-like file system, I am not hopeful that any of them will ever compare to ZFS.

      When ZFS went open source, it already contained over 100 engineer years of effort, and its development has continued since then. ZFS is very active and there are major new features in the pipeline sponsored by an interesting mix of open source projects like IllumOS, Linux, FreeBSD, and OpenSFS, companies like Delphix, Nexenta, Datto, and even Intel (who is contracted to build a new supercomputer based on ZFS), plus government agencies like LLNL (US) and FAIR (EU).

      The advantage that ZFS had in the early days was the QA team at Sun. They ensured that the hard part of development, the testing, actually got done, because they got paid to do it.

      With a head start of over 100 engineer years, I just don't see how btrfs can ever catch up. Even if Oracle put 100 engineers on it full time, it would take 3 years to catch up to that 100 engineer year figure. And Oracle is just not putting in that level of effort. btrfs was just an attempt to have an answer for ZFS. That answer was wrong.

      This particular rationale is misguided in the best of cases. The problem is that a massive sunk cost in something doesn't mean that it's the thing you should be using. The Linux kernel, for example, has an absolutely massive number of developer years put into it (on the order of 1500 developers working on it per year), but I'm sure you of all people would be the first to try to sell people on the advantages of FreeBSD.

      That doesn't mean that man-years are a worthless measure; however, the figure must always be taken in context, and for this we can make an analogy to the well-known physics formula: Force = Mass * Acceleration. Just as it takes a certain amount of force to accelerate a mass by a particular amount, it takes a certain amount of man-hours to reach a certain level of "completeness" (features, reliability, etc. encapsulated in this idea) for a project that has a particular level of technical debt, with additional coefficients for things like the programming language. As a consequence, some projects with much smaller teams can move faster than projects with huge teams. That doesn't mean a project with (relatively speaking) no technical debt and a small team can necessarily overtake a project with vast technical debt and a massive team, because you still need enough force to close the gap.

      So what does that mean for something like TFS? Well, I would say being ready by the end of the summer is incredibly optimistic, but the reality is tools matter and anyone who says they don't is a moron.

      Examples like
      • Not having to deal with entire classes of bugs (with some of the most common bugs that are eradicated being the most deadly)
      • Having a proper error mechanism rather than the joke that is errno, which isn't even really reliable for saying that an error happened in THIS particular process. Not only that but error handling being widely implemented.
      • A much better testing infrastructure
      • Strong Typing

      All mean that vastly less time has to be spent on solving programming bugs as opposed to design/logic errors (and none of the above are really particular to Rust; a lot of languages have these and are consequently faster to develop in than C), and things like the Borrow Checker and Immutable By Default mean that adding threading is simply easier, as you're forced to make things thread-safe.

      So a Rust-based filesystem can develop and stabilize much faster than a C one when we hold the design, specification, and amount of manpower constant. So TFS automatically needs vastly fewer man-hours to reach that point than ZFS, which was written in C, required.
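
      As a rough illustration of "forced to make things thread-safe", here is a small sketch using only the standard library. The function name parallel_sum is made up for this example. Sharing the running total between threads only compiles because it is wrapped in Arc<Mutex<..>>; handing each thread a plain `&mut u64`, or an `Rc`, is rejected at compile time.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Sum each chunk on its own thread and add the partial sums to a
// shared total. `parallel_sum` is a hypothetical helper for this sketch.
fn parallel_sum(chunks: Vec<Vec<u64>>) -> u64 {
    let total = Arc::new(Mutex::new(0u64));

    let handles: Vec<_> = chunks
        .into_iter()
        .map(|chunk| {
            // Each thread gets its own reference-counted handle.
            let total = Arc::clone(&total);
            thread::spawn(move || {
                let partial: u64 = chunk.iter().sum();
                // The lock is the only way to reach the shared value.
                *total.lock().unwrap() += partial;
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let result = *total.lock().unwrap();
    result
}

fn main() {
    println!("{}", parallel_sum(vec![vec![1, 2, 3], vec![4, 5], vec![6]]));
}
```

      None of this prevents deadlocks or a wrong locking design, which fall under the design/logic errors mentioned above; it only removes the data-race class of bug.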

      Will it reach a point where TFS replaces btrfs and ZFS as the Next Hot Filesystem? Probably not, definitely not at its current pace, but... and this is important... TFS is MIT licensed whereas ZFS is CDDL, hence TFS can go in the Linux tree and get integrated properly, whereas ZFS has to remain a poorly integrated external module that has to dodge around licensing issues. That difference can very easily make the manpower problem disappear.

      I'm definitely not set on TFS as The Solution(TM), it hasn't made itself worth it for me to invest in as an idea, and I'm perfectly content with ZFS on FreeBSD... but... this is an incredibly arrogant attitude to take that is blinding you to threats to your particular technology of choice.
      Last edited by Luke_Wolf; 22 May 2017, 10:24 PM.


      • #43
        Originally posted by Zan Lynx

        Hah ha. No.

        Most filesystem errors are logic errors, not coding errors. Rust is great, but if you tell it to multiply by 4 when you meant 3 it won't save you.
        I won't dispute that, but it can be a misleading statement. "Logic errors" as a class of errors do include things which Rust's type system is well-suited to turning into compile-time errors.

        For example:

        1. If something can be modelled as a finite state machine, you can use affine types (the compiler catches attempts to use stale references) and any of various mechanisms, such as limiting a method "impl" to specific types on a generic, to make it a compile-time error to request an invalid transition.

        Session types are a special case of that and the only shortcoming is that, since Rust has an affine (variable consumed at most once) rather than linear type system (variable consumed exactly once), guaranteeing that you reach the final state can't be verified at compile time.

        (eg. Rust's Hyper makes it a compile-time error to modify request headers after you've started on the request body, preventing a Rust equivalent to those PHP "headers already sent" errors you sometimes see at the tops of pages.)

        Last I checked, C++ can't do this, because, without affine typing in the compiler, there's no way to catch the use of stale references at compile time and Rust's syntax definitely makes it much more comfortable. (For historical reasons, C++ requires a lot of keywords to opt into what Rust gives by default.)
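
        A minimal sketch of point 1, assuming a made-up Request type (this is not Hyper's actual API, just the same idea in miniature). The state lives in a type parameter, each transition consumes `self`, and `header` only exists in the `impl` for the Headers state:

```rust
use std::marker::PhantomData;

// Zero-sized marker types for the two states; they exist only at compile time.
struct Headers;
struct Body;

// Hypothetical request builder, parameterized by its current state.
struct Request<State> {
    headers: Vec<(String, String)>,
    body: String,
    _state: PhantomData<State>,
}

impl Request<Headers> {
    fn new() -> Self {
        Request { headers: Vec::new(), body: String::new(), _state: PhantomData }
    }

    // Only available while still in the Headers state.
    fn header(mut self, name: &str, value: &str) -> Self {
        self.headers.push((name.to_string(), value.to_string()));
        self
    }

    // Consumes the Headers-state value and returns a Body-state one.
    fn start_body(self) -> Request<Body> {
        Request { headers: self.headers, body: self.body, _state: PhantomData }
    }
}

impl Request<Body> {
    fn write(mut self, chunk: &str) -> Self {
        self.body.push_str(chunk);
        self
    }

    // Final transition: render the request as text.
    fn finish(self) -> String {
        let mut out = String::new();
        for (name, value) in &self.headers {
            out.push_str(&format!("{}: {}\n", name, value));
        }
        out.push('\n');
        out.push_str(&self.body);
        out
    }
}

fn main() {
    let req = Request::new().header("Host", "example.org");
    let body = req.start_body().write("hello");
    // req.header("X", "y");  // compile error: `req` was moved by start_body()
    // body.header("X", "y"); // compile error: no `header` on Request<Body>
    println!("{}", body.finish());
}
```

        As noted above, nothing forces the caller to ever call finish(): a Request<Body> can simply be dropped, which is the affine-versus-linear gap.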

        2. Rust is well-suited to using parameterized "impl" and/or things like single-element structs and Zero-Sized Types (ie. stuff that only the compiler cares about, which vanishes during compilation) to implement units at the level of the type system so dangerous or nonsense operations are caught at compile time.

        Stylo uses that to turn things like conflating device-independent pixels and device-dependent pixels into compile-time errors. In a language like Python, you'll get a runtime error if you try something like "1" + 2. In Rust, you can get similar errors at compile time for things like Temperature<Celsius> + Temperature<Fahrenheit> simply by not "impl"ing the relevant methods for those combinations of types.
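
        Here is roughly what the Temperature example could look like. This is a sketch with invented names, not Stylo's code: the unit is a zero-sized marker type, addition is only implemented for two values with the same unit, and mixing units requires an explicit conversion.

```rust
use std::marker::PhantomData;
use std::ops::Add;

// Zero-sized unit markers; they carry no data at runtime.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Celsius;
#[derive(Debug, Clone, Copy, PartialEq)]
struct Fahrenheit;

// Hypothetical temperature type tagged with its unit.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Temperature<Unit> {
    degrees: f64,
    _unit: PhantomData<Unit>,
}

impl<Unit> Temperature<Unit> {
    fn new(degrees: f64) -> Self {
        Temperature { degrees, _unit: PhantomData }
    }
}

// Addition exists only when both sides share the same Unit.
impl<Unit> Add for Temperature<Unit> {
    type Output = Temperature<Unit>;
    fn add(self, rhs: Self) -> Self::Output {
        Temperature::new(self.degrees + rhs.degrees)
    }
}

// The only way across units is an explicit, named conversion.
impl Temperature<Fahrenheit> {
    fn to_celsius(self) -> Temperature<Celsius> {
        Temperature::new((self.degrees - 32.0) * 5.0 / 9.0)
    }
}

fn main() {
    let c = Temperature::<Celsius>::new(20.0);
    let f = Temperature::<Fahrenheit>::new(212.0);
    // let bad = c + f; // compile error: mismatched types
    let sum = c + f.to_celsius();
    println!("{}", sum.degrees);
}
```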

        3. According to Rust best practices, you're supposed to take advantage of Rust's "enum" support to avoid using booleans when feasible. For example, it's much harder to get the order of your arguments wrong if the two yes/no values are of different types. (eg. One talk I saw gave an example where, instead of there being a boolean with a name like inherit_fill, there was a Fill::{Explicit,Inherit} enum.)

        Last I checked, Rust was much more strict than C and C++ about conflating different types of enums. (As in you have to go into an "unsafe" block and use the scary "mem::transmute" typecast which enforces nothing but that the source and destination type have the same size, rather than the much more preferred "as" typecast which works in safe Rust.)
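
        A small sketch of point 3, built on the Fill example from that talk. The Stroke enum and the describe function are my own additions for illustration. With two booleans the arguments could be swapped silently; with two distinct enums the swap does not compile.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Fill {
    Explicit,
    Inherit,
}

// Hypothetical second flag, added here only to show the argument-order check.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Stroke {
    Explicit,
    Inherit,
}

// Compare with `fn describe(inherit_fill: bool, inherit_stroke: bool)`,
// where `describe(true, false)` and `describe(false, true)` both compile.
fn describe(fill: Fill, stroke: Stroke) -> String {
    let fill = match fill {
        Fill::Explicit => "explicit fill",
        Fill::Inherit => "inherited fill",
    };
    let stroke = match stroke {
        Stroke::Explicit => "explicit stroke",
        Stroke::Inherit => "inherited stroke",
    };
    format!("{}, {}", fill, stroke)
}

fn main() {
    println!("{}", describe(Fill::Inherit, Stroke::Explicit));
    // describe(Stroke::Explicit, Fill::Inherit); // compile error: mismatched types
    // let f = Stroke::Inherit as Fill;           // compile error: non-primitive cast
}
```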

        4. In Rust, variables aren't nullable unless you mark them as such, which is one more opportunity to craft your APIs such that logic violations will be caught by the type system. (And here's something that doesn't get talked about enough: It doesn't take any extra space in most cases. If you wrap a type in an Option<T> and it or any of its members (if it's a struct) are tagged NonZero, then, once it's passed type checking, the compiler's optimizations are smart enough to use the familiar "zero is null" machine code representation you'd have gotten with a less strict type system.)
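
        The space claim in point 4 is easy to check with std::mem::size_of. In this sketch I am assuming a current stable compiler, where the NonZero idea is exposed as std::num::NonZeroU32 and friends; option_is_free is a made-up helper name.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

// True when wrapping T in Option costs no extra bytes.
// (`option_is_free` is a hypothetical helper for this sketch.)
fn option_is_free<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}

fn main() {
    // References and boxes can never be null, so None takes the zero pattern.
    println!("Option<&u8>:         {}", option_is_free::<&u8>());
    println!("Option<Box<u64>>:    {}", option_is_free::<Box<u64>>());
    println!("Option<NonZeroU32>:  {}", option_is_free::<NonZeroU32>());
    // Every bit pattern of a plain u32 is a valid value, so a tag is needed.
    println!("Option<u32>:         {}", option_is_free::<u32>());
}
```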
        Last edited by ssokolow; 23 May 2017, 01:40 AM.


        • #44
          Originally posted by cen1
          I did not imagine how much bullshit I could read about ZFS and programming in general by Rust apologists. Surprises me every time.
          I've seen potential BS (I don't know enough about ZFS and btrfs to be certain), but I haven't seen any evidence of it being tied to Rust specifically.

          Also, given that TFS is part of Redox OS, which is specifically intended to be written in Rust and is no stranger to trying new approaches to various aspects of its design, I don't see how the existence of TFS is a mark against ZFS aside from "we wanted something easier to hack on than our attempted Rust ZFS implementation and decided to take the opportunity to try a different approach here too".


          • #45
            Originally posted by Nille
            I hope they don't make the same decision as the ZFS people, where I can't add more drives to an array (e.g. 3 disks in a raid5: if I want to add more disks I have to create another array or rebuild everything).
            You don't do raid5. You only use mirrors and throw them in a pool. Raid5 is for suckers.


            • #46
              Well, starshipeleven, the idea that btrfs is doing it right after taking its time seems untrue to me. The raid5/6 code proves it: they are perfectly capable of shipping things under pressure before they are production ready.
              Even Facebook, which was the "proof" that it was production ready, jumped in and reported on it to the community, then went silent for years, with no report and no information at all on how it is going for them.
              Also, btrfs seems very bad to use with VM images or databases, whereas ZFS still works okay in those scenarios, which are bad ones for a CoW FS, no?