Major Network Performance Regressions In Linux

  • Major Network Performance Regressions In Linux

    Phoronix: Major Network Performance Regressions In Linux

    Affecting the latest Linux kernel release, Linux 3.7, are "multiple apparently unrelated network performance issues." The major network performance problems were reported by a well-known Linux kernel developer...

    http://www.phoronix.com/vr.php?view=MTI2Nzc

  • #2
    https://lkml.org/lkml/2013/1/6/34
    OK good news here, the performance drop on the myri was caused by a
    problem between the keyboard and the chair. After the reboot series,
    I forgot to reload the firmware so the driver used the less efficient
    firmware from the NIC (it performs just as if LRO is disabled).

    That makes me think that I should try 3.8-rc2 since LRO was removed
    there :-/

    The only remaining issue really is the loopback then.

    https://lkml.org/lkml/2013/1/6/54
    Just for the record, I tested 3.8-rc2, and the myri works as fast with
    GRO there as it used to work with LRO in previous kernels. The softirq
    work has increased from 26 to 48% but there is no performance drop when
    using GRO anymore. Andrew has done a good job !


    • #3
      thank you for the heads up!


      • #4
        It certainly is a performance regression, but I seriously doubt that it affects many Phoronix readers. I don't know how many of us use 10 GigE, but you have to start there, and then question how many of those readers have the Myri cards.

        Yep, it's a driver problem. For that small subset, it's pretty darned serious, but then again, it'll be fixed when 3.8 hits the streets.


        • #5
          I think the problem is much more than a single driver or a simple performance regression. I updated a machine with a gigabit Broadcom network adapter to 3.7.1. I started seeing processes hang when performing network operations: JDBC, memcached, ActiveMQ. The hangs were intermittent but would happen during heavy batch processing every night. Over the next several days, I tried messing with kernel settings, MTU settings, hugepage support settings, JDBC driver updates, JDBC driver reverts, disabling IPv6, enabling/disabling TCP keepalive, etc.

          I eventually reverted the settings changes, went back to a 3.6 kernel release, and haven't seen one hang since. I came across some LKML threads talking about epoll hangs and figured there must be something going on, since the stack traces I was seeing showed strange kernel hangs, as if the client was waiting on the server and the server was waiting on the client.

          I'll be following the progress on this network issue now to see what's uncovered.


          • #6
            Originally posted by mgmartin:
            I think the problem is much more than a single driver or a simple performance regression. I updated a machine with gigabit Broadcom network to 3.7.1. I started seeing processes hang when performing network operations--jdbc/memcached/activemq. The hangs were intermittent but would happen during heavy batch processing every night.
            Have you reported this to the devs?


            • #7
              Initial tests using 3.7.2 look positive. I haven't seen the lockups I mentioned having under 3.7.1.


              • #8
                I spoke too soon. Back to a 3.6 kernel.
