
Netflix Optimized FreeBSD's Network Stack More Than Doubled AMD EPYC Performance


  • #41

    Originally posted by wizard69 View Post
    Why these absolute paranoid posts?
    It's not uncommon for companies to modify the BSDs and not contribute back to the project; that's hardly paranoia.

    Originally posted by wizard69 View Post
    First off you could have investigated the status of the patches.
    Well, I assumed they did not exist and therefore could not be investigated. Instead, I have created discussion; go me.

    Comment


    • #42
      Originally posted by bpetty View Post
      It would be even faster if they weren't wasting so many cycles on encryption.
      Not 100% sure that is the case. First off, I presume the API being used is actually sendfile, not sentfile as the article states.

      My understanding is that this is only efficient when no transformations such as encryption are required (hence you can do a zero-copy transfer), and in that case the data must be pre-encrypted, probably in chunks. The cost of encryption would then be amortized across everyone who accesses the content, although decryption has a constant cost per viewer. Otherwise you might as well use buffered I/O, because sendfile will degrade to that anyway.

      Of course, I could be wrong, it's late. :-)
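
      To illustrate what I mean, here's a rough sketch of that zero-copy path using FreeBSD's sendfile(2). This is my own illustrative code, not anything from Netflix, and the error handling is minimal:

      Code:
      #include <sys/types.h>
      #include <sys/socket.h>
      #include <sys/uio.h>

      #include <fcntl.h>
      #include <stdint.h>
      #include <stdio.h>
      #include <unistd.h>

      /*
       * Send an entire file over a connected TCP socket with FreeBSD's
       * sendfile(2). The kernel maps the file's pages straight into the
       * socket buffer -- no copy through userspace -- which is why a
       * per-viewer transformation done in userspace (e.g. TLS) would
       * normally force you off this path and back to buffered I/O.
       */
      static int
      send_whole_file(const char *path, int sock)
      {
          int fd = open(path, O_RDONLY);
          if (fd == -1)
              return (-1);

          off_t sent = 0;
          /* nbytes == 0 means "send until end of file". */
          if (sendfile(fd, sock, 0, 0, NULL, &sent, 0) == -1) {
              close(fd);
              return (-1);
          }
          printf("sent %jd bytes\n", (intmax_t)sent);
          close(fd);
          return (0);
      }

      That's also presumably why the in-kernel TLS work exists: it lets the encryption happen on the sendfile path instead of bouncing the data through userspace.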

      Comment


      • #43
        Originally posted by GreenReaper View Post

        Not 100% sure that is the case. First off, I presume the API being used is actually sendfile, not sentfile as the article states.

        My understanding is that this is only efficient when no transformations such as encryption are required (hence you can do a zero-copy transfer), and in that case the data must be pre-encrypted, probably in chunks. The cost of encryption would then be amortized across everyone who accesses the content, although decryption has a constant cost per viewer. Otherwise you might as well use buffered I/O, because sendfile will degrade to that anyway.

        Of course, I could be wrong, it's late. :-)
        In any case it's a non-issue, since they do it because they are forced to (by their contracts with the media providers) and not because they want to.

        Comment


        • #44
          Originally posted by drewg123 View Post

          KTLS affinity (trivial): https://reviews.freebsd.org/D21648
          TCP_REUSEPORT_LB_NUMA: https://reviews.freebsd.org/D21636
          Thanks for both replies, I appreciate it.
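
          Since you linked it, here is a rough sketch of how I'd expect a server to use the second one from userspace: one listener per NUMA domain, all in the same SO_REUSEPORT_LB group. The ordering and exact usage are my reading of the review, not tested code:

          Code:
          #include <sys/socket.h>
          #include <netinet/in.h>
          #include <netinet/tcp.h>

          #include <string.h>
          #include <unistd.h>

          /*
           * Sketch: create one listener per NUMA domain, all sharing a
           * port via SO_REUSEPORT_LB, then use TCP_REUSEPORT_LB_NUMA
           * (from D21636) to steer each listener's connections to NICs
           * in its own domain. Requires a FreeBSD where the patch landed.
           */
          static int
          make_numa_listener(int port, int numa_domain)
          {
              int s = socket(AF_INET, SOCK_STREAM, 0);
              if (s == -1)
                  return (-1);

              int one = 1;
              struct sockaddr_in sin;
              memset(&sin, 0, sizeof(sin));
              sin.sin_family = AF_INET;
              sin.sin_port = htons(port);
              sin.sin_addr.s_addr = htonl(INADDR_ANY);

              /* Join the load-balancing listener group for this port. */
              if (setsockopt(s, SOL_SOCKET, SO_REUSEPORT_LB,
                  &one, sizeof(one)) == -1 ||
                  bind(s, (struct sockaddr *)&sin, sizeof(sin)) == -1 ||
                  listen(s, 128) == -1)
                  goto fail;

              /* Bind this listener to one NUMA domain; my understanding
                 from the review is that this is set on the socket after
                 listen(). */
              if (setsockopt(s, IPPROTO_TCP, TCP_REUSEPORT_LB_NUMA,
                  &numa_domain, sizeof(numa_domain)) == -1)
                  goto fail;
              return (s);

          fail:
              close(s);
              return (-1);
          }

          Pair each listener with worker threads pinned to the same domain and connections should stay local to the NIC's NUMA node, which (as I understand it) is the point of the patch.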

          Comment


          • #45
            Originally posted by Space Heater View Post

            Could you explain this a bit more? What AMD performance tools are available for FreeBSD but not Linux?
            None that I know of, and my initial post doesn't make much sense. I'm not always wrong, but when I am, I admit it.

            Comment


            • #46
              Originally posted by NateHubbard View Post

              While that does seem reasonable, it also wouldn't be the first time that a business wasn't willing to hand over their operating efficiencies to any competitor that wanted it.
              Netflix has a very interesting tech blog where they share a lot of their "operating efficiencies": https://medium.com/netflix-techblog. They have also created and shared a lot of their libraries for building big "cloud" apps: https://netflix.github.io/.

              Comment


              • #47
                Is 190-195 Gb/s the max EPYC and Xeons can do, due to octa-channel DDR bandwidth, which also happens to be in the same range?

                Max Bandwidth:
                190.7 GiB/s

                If so, wouldn't it help if CPU manufacturers had HBM2 chips?

                Comment


                • #48
                  Originally posted by _Alex_ View Post
                  Is 190-195 Gb/s the max EPYC and Xeons can do, due to octa-channel DDR bandwidth, which also happens to be in the same range?

                  Max Bandwidth:
                  190.7 GiB/s

                  If so, wouldn't it help if CPU manufacturers had HBM2 chips?
                  Note that a GB (and a GiB) is 8 times larger than a Gb, so there is plenty of spare memory bandwidth available.

                  From the presentation, it sounds like the limiting factor is actually the NICs. The Intel server has 2x100 Gb/s cards, so it's close to maxing out. The AMD server was set up with 4x100 Gb/s cards, but they were limited to 50 Gb/s each, so they are also essentially maxed out.

                  That was apparently because the AMD motherboard only had PCIe 3.0 x8 links to some of the cards; I assume x8 to two of them and x16 to the other two. He mentioned that it was capable of going over 200 Gb/s if he allowed the faster network cards to reach their full speeds.

                  As far as CPU utilization (and memory/NUMA bandwidth) goes, it seems like 300 Gb/s would be very achievable, assuming no other big limitations came up to prevent it.
                  Edit: I think the NVMe drives' bandwidth is closely matched to the network cards', so increasing one would need to be matched by the other. Moving to PCIe Gen 4 is probably what would really need to happen to scale much further.
                  Last edited by smitty3268; 09 November 2019, 05:20 AM.
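
                  For anyone who wants to check the arithmetic on both points (Gb vs. GiB, and the x8 link cap), a quick back-of-the-envelope; the ~15% PCIe protocol overhead figure is my own rough assumption:

                  Code:
                  #include <stdio.h>

                  int
                  main(void)
                  {
                      /* 195 Gb/s of network traffic, expressed in GiB/s. */
                      double net_gib_s = 195.0 / 8.0 * 1e9 / (1024.0 * 1024.0 * 1024.0);
                      printf("195 Gb/s = %.1f GiB/s of payload\n", net_gib_s);
                      /* ...versus the quoted 190.7 GiB/s of memory bandwidth. */

                      /* PCIe 3.0: 8 GT/s per lane, 128b/130b encoding. */
                      double x8_gb_s = 8.0 * (128.0 / 130.0) * 8.0;  /* ~63 Gb/s raw */
                      printf("PCIe 3.0 x8: ~%.0f Gb/s raw, ~%.0f Gb/s after ~15%% overhead\n",
                          x8_gb_s, x8_gb_s * 0.85);
                      return (0);
                  }

                  That works out to roughly 23 GiB/s of payload against 190.7 GiB/s of theoretical memory bandwidth, and an x8 Gen 3 slot landing right around the ~50 Gb/s per-card limit mentioned above.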

                  Comment
