Blk-mq Is Almost Feature Complete & Fast With Linux 3.16

  • Blk-mq Is Almost Feature Complete & Fast With Linux 3.16

    Phoronix: Blk-mq Is Almost Feature Complete & Fast With Linux 3.16

    Merged for the Linux 3.13 kernel was the multi-queue block layer, which allows for better SSD performance with reduced latency by balancing I/O workload across multiple CPU cores and supporting multiple hardware queues. With the upcoming Linux 3.16 kernel, the "blk-mq" code is expected to be feature complete and deliver great performance...

    http://www.phoronix.com/vr.php?view=MTcwNzc

  • #2
    I've been sticking with the 3.12.x longterm kernel, because of assorted issues and instability since the multi-queue block layer implementation. Maybe 3.16 will be the one that finally lets me move forward again.

    • #3
      Is it used by default, or do I have to set something in /etc/fstab or so?
      Is it used by default for SSDs only, or also for HDDs?
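
      As far as I know, there is no fstab knob for this; whether blk-mq is used is a property of the block driver, not of the filesystem or the medium, so it applies (or not) to SSDs and HDDs alike. In the 3.13-3.16 time frame only a few drivers, most notably virtio-blk, had been converted. A hedged shell sketch for checking a given machine, assuming the "mq" sysfs directory that blk-mq devices expose ("sda" is just an example name):

        # Check whether a block device goes through blk-mq: such devices
        # expose /sys/block/<dev>/mq with one directory per hardware queue.
        dev=sda   # example device name
        if [ -d "/sys/block/$dev/mq" ]; then
            echo "$dev uses blk-mq; hardware queue -> CPU mapping:"
            for q in "/sys/block/$dev/mq"/*/; do
                echo "  $(basename "$q"): CPUs $(cat "$q/cpu_list")"
            done
        else
            echo "$dev uses the legacy (single-queue) block layer"
        fi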

      • #4
        Originally posted by macemoneta:
        I've been sticking with the 3.12.x longterm kernel, because of assorted issues and instability since the multi-queue block layer implementation. Maybe 3.16 will be the one that finally lets me move forward again.
        Could you perhaps describe these kinds of issues? I have been having issues on btrfs filesystems with kernel 3.13 or higher. Reading some files causes the process to stall.

        • #5
          3.13 had a perf regression as a result of the blk-mq patch. I wonder if this is finally going to be fixed in 3.16?

          • #6
            Originally posted by Caleb:
            3.13 had a perf regression as a result of the blk-mq patch. I wonder if this is finally going to be fixed in 3.16?
            What perf regression?

            • #7
              Originally posted by macemoneta:
              I've been sticking with the 3.12.x longterm kernel, because of assorted issues and instability since the multi-queue block layer implementation. Maybe 3.16 will be the one that finally lets me move forward again.
              Sorry, but that's just nonsense. Unless you are running on virtio-blk as your storage driver, blk-mq could not have caused any instability issues for 3.13 or later.
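
              (As a hedged aside for anyone wanting to check that claim on their own machine: the driver behind a disk is visible in sysfs, so one command shows whether virtio-blk, and therefore blk-mq, is involved at all. "sda" is just an example device name.)

                # Print the kernel driver backing a disk; if it is not
                # virtio_blk, blk-mq was not handling its I/O on 3.13-3.15.
                basename "$(readlink /sys/block/sda/device/driver)"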

              • #8
                Originally posted by Rexilion:
                Could you perhaps describe these kinds of issues? I have been having issues on btrfs filesystems with kernel 3.13 or higher. Reading some files causes the process to stall.
                Stalls mostly, as well as reduced responsiveness. I use btrfs, but the btrfs mailing list seems to indicate they're not btrfs related. It may be the result of btrfs switching its worker threads over to kernel workqueues. Whatever the cause, my systems run flawlessly on the 3.12.x kernels. I've checked various 3.13, 3.14, and 3.15rc kernels, and they all exhibit the issue across multiple systems (Intel and AMD).

                • #9
                  Originally posted by macemoneta:
                  Stalls mostly, as well as reduced responsiveness. I use btrfs, but the btrfs mailing list seems to indicate they're not btrfs related. It may be the result of btrfs switching its worker threads over to kernel workqueues. Whatever the cause, my systems run flawlessly on the 3.12.x kernels. I've checked various 3.13, 3.14, and 3.15rc kernels, and they all exhibit the issue across multiple systems (Intel and AMD).
                  What you want to do here is run a bisect between 3.12 and 3.13 and pinpoint exactly where the issue is. That is by far the most effective way to get the developers' attention and get the issue fixed. Let me know if you need any help with doing the bisection. If you have already compiled a custom kernel, it's a trivial exercise. Especially if you can quickly vet or reject a given kernel after booting it.
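
                  For reference, a minimal sketch of that bisection workflow, assuming a kernel git tree and a known-good .config; the version tags come straight from the posts above:

                    git bisect start
                    git bisect bad v3.13      # first release showing the stalls
                    git bisect good v3.12     # last known-good release
                    # Build, install, and boot the candidate kernel git checks
                    # out, run the workload, then report the verdict and repeat:
                    git bisect good           # the stall did not appear
                    git bisect bad            # the stall appeared
                    # Once git names the first bad commit, clean up with:
                    git bisect reset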

                  • #10
                    Originally posted by axboe:
                    What you want to do here is run a bisect between 3.12 and 3.13 and pinpoint exactly where the issue is. That is by far the most effective way to get the developers' attention and get the issue fixed. Let me know if you need any help with doing the bisection. If you have already compiled a custom kernel, it's a trivial exercise. Especially if you can quickly vet or reject a given kernel after booting it.
                    I'm familiar with the process, but it takes days, sometimes over a week for the problem to occur. By the time I completed a bisect, 3.17 would be out. Odds are, someone will have corrected the issue, intentionally or not, by then.

                    • #11
                      Originally posted by macemoneta:
                      I'm familiar with the process, but it takes days, sometimes over a week for the problem to occur. By the time I completed a bisect, 3.17 would be out. Odds are, someone will have corrected the issue, intentionally or not, by then.
                      Or it'll never get fixed, because nobody takes the time to put in the effort to reproduce it. Developer resources are scarce. Betting that "someone else will have fixed it" is a weak and losing proposition. Chances are it'll still be around come 3.17 and you will still be complaining about it.

                      If you (or someone in the know) has an idea where the problem might be, you could drastically reduce the number of cycles needed.
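
                      To put a rough number on those cycles (a hedged estimate): bisection is a binary search, so the number of test boots is about log2 of the commit count in the suspect range, which git can report directly:

                        # Count the commits between the good and bad releases; a
                        # release of that era spans very roughly 10,000+ commits,
                        # so a full bisect needs only about 14 test boots.
                        git rev-list --count v3.12..v3.13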

                      • #12
                        Originally posted by axboe:
                        Or it'll never get fixed, because nobody takes the time to put in the effort to reproduce it. Developer resources are scarce. Betting that "someone else will have fixed it" is a weak and losing proposition. Chances are it'll still be around come 3.17 and you will still be complaining about it.

                        If you (or someone in the know) has an idea where the problem might be, you could drastically reduce the number of cycles needed.
                        Not complaining, noting it (as others are as well). I do contribute my time where it's cost effective; this is an instance where I've determined that it's not. That's a decision I get to make, not you. Time is a finite resource.

                        • #13
                          Originally posted by macemoneta:
                          Not complaining, noting it (as others are as well). I do contribute my time where it's cost effective; this is an instance where I've determined that it's not. That's a decision I get to make, not you. Time is a finite resource.
                          Then stop pontificating that some random issue (that takes a week to reproduce, and that you are not willing to help get fixed) has anything to do with blk-mq, when you really have no idea if that is the case. If you're not running your storage on virtio-blk, then that is provably not the case. Spreading misinformation like that does a lot more harm than good; it'd be a lot more valuable to figure out what is actually causing the issue instead of potentially sending others off on a wild goose chase. Time is indeed a finite resource; please don't waste the time of others. You have already wasted plenty of mine. Time that could have been used pointing you in the right direction, or helping you get closer to a resolution. Unless there's some substantial information posted that could help resolve this issue, I'm done with this thread.

                          • #14
                            Originally posted by axboe:
                            Then stop pontificating that some random issue (that takes a week to reproduce, and that you are not willing to help get fixed) has anything to do with blk-mq, when you really have no idea if that is the case. If you're not running your storage on virtio-blk, then that is provably not the case. Spreading misinformation like that does a lot more harm than good; it'd be a lot more valuable to figure out what is actually causing the issue instead of potentially sending others off on a wild goose chase. Time is indeed a finite resource; please don't waste the time of others. You have already wasted plenty of mine. Time that could have been used pointing you in the right direction, or helping you get closer to a resolution. Unless there's some substantial information posted that could help resolve this issue, I'm done with this thread.
                            I tried to bisect on this old machine. My issue is consistently reproducible.

                            http://www.mail-archive.com/linux-bt.../msg33728.html

                            However, there are two issues:

                            - There are two batches of commit skipping involved: the first to omit non-booting kernels, and the second to omit kernels that oops during boot.
                            - The mailing list has not responded with any clues, pointers, or anything else.

                            I take it I could upload some data about the file, but I have no idea how.
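
                            For what it's worth, git supports exactly that kind of commit skipping during a bisect; a brief hedged sketch (the range below is purely illustrative, not the actual broken span):

                              # Mark the current candidate untestable (does not
                              # boot, or oopses during boot) instead of good/bad:
                              git bisect skip
                              # A whole known-broken range can be skipped at once:
                              git bisect skip v3.13-rc1..v3.13-rc2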

                            • #15
                              Originally posted by axboe:
                              What perf regression?
                              I can't find it right now, but there were I/O tests here for 3.13 that showed considerably lower performance than previous kernels, and the low performance continued with successive kernels (3.14, etc.).
