10GbE Linux Networking Performance Between CentOS, Fedora, Clear Linux & Debian
Originally posted by ypnos
For proper utilization of 10GbE you need to do some work yourself:
- Increase the MTU to the hardware limit; 9000 is a good bet
- Use the maximum supported ring parameters (ethtool -g/-G)
- Set the number of channels to the number of CPU cores on the NIC's NUMA node (ethtool -l/-L)
- Pin the channel IRQs to those CPU cores on the NUMA node the NIC is connected to
- Also pin the application transmitting the data to CPU threads on that NUMA node
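A minimal sketch of those steps, assuming a NIC named eth0 whose local NUMA node is node 0 with cores 0-7 (all placeholder values; query the real limits with `ethtool -g`/`ethtool -l` and `ip -d link` first):

```shell
#!/bin/sh
IF=eth0   # placeholder interface name

# 1. Jumbo frames up to the hardware limit (9000 assumed supported)
ip link set dev "$IF" mtu 9000

# 2. Max out the ring buffers -- read the limits with `ethtool -g $IF` first
ethtool -G "$IF" rx 4096 tx 4096

# 3. One RSS channel per core on the NIC's NUMA node (8 cores assumed)
ethtool -L "$IF" combined 8

# 4. Pin each queue IRQ to one local core (stop irqbalance so it sticks)
systemctl stop irqbalance
core=0
for irq in $(awk -v d="$IF" '$NF ~ d {sub(":","",$1); print $1}' /proc/interrupts); do
    echo "$core" > "/proc/irq/$irq/smp_affinity_list"
    core=$((core + 1))
done

# 5. Pin the transmitting application to the same node, e.g. with iperf3:
numactl --cpunodebind=0 --membind=0 iperf3 -c 10.0.0.2
```

The commands need root and a real NIC, so treat them as a template rather than a copy-paste script.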
Originally posted by ypnos
On the RX side, pinning the application and IRQs to the NIC-adjacent NUMA node alone can be the difference between barely reaching 5-6 Gbps and maxing out the machine at ~200 Gbps.
... And yes, a dual-Xeon machine running Fedora Server 27 can passively monitor 16+ x 10 Gbps (or 4 x 40 Gbps) NICs with near-zero packet loss.
- Gilboa
oVirt-HV1: Intel S2600C0, 2xE5-2658V2, 128GB, 8x2TB, 4x480GB SSD, GTX1080 (to-VM), Dell U3219Q, U2415, U2412M.
oVirt-HV2: Intel S2400GP2, 2xE5-2448L, 120GB, 8x2TB, 4x480GB SSD, GTX730 (to-VM).
oVirt-HV3: Gigabyte B85M-HD3, E3-1245V3, 32GB, 4x1TB, 2x480GB SSD, GTX980 (to-VM).
Devel-2: Asus H110M-K, i5-6500, 16GB, 3x1TB + 128GB-SSD, F33.
Originally posted by fuzz
Are there ways to automate those settings? Otherwise it's useless for automated testing.
You can locate the NUMA node of the PCIe slot from lspci, compare that against the NUMA information from lscpu - this gives you the CPU cores closest to the NIC.
Now use ethtool to reduce the number of RSS queues to the number of cores on that NUMA node, and use the IRQ affinity interface (/proc/irq/*/smp_affinity) to assign one IRQ per CPU core.
Add some additional ethtool magic to configure the ring parameters, etc., and you should be done.
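As a sketch (not gilboa's actual script), those lookup steps could be wired together like this; the interface name is a placeholder, and the side-effecting ethtool/IRQ writes only run if the NIC actually exists:

```shell
#!/bin/sh
IF=${1:-enp1s0f0}   # placeholder NIC name

# Expand a kernel cpulist such as "0-3,8" into one core number per line.
expand_cpulist() {
    echo "$1" | tr ',' '\n' | while IFS=- read -r lo hi; do
        seq "$lo" "${hi:-$lo}"
    done
}

tune() {
    # NUMA node of the NIC's PCIe slot (lspci -vv shows the same "NUMA node:")
    node=$(cat "/sys/class/net/$IF/device/numa_node")

    # Cores on that node -- the same data lscpu reports per NUMA node
    cores=$(expand_cpulist "$(cat "/sys/devices/system/node/node$node/cpulist")")
    ncores=$(echo "$cores" | wc -l)

    # Reduce the RSS queues to the local core count ...
    ethtool -L "$IF" combined "$ncores"

    # ... then assign one queue IRQ per local core
    i=1
    for irq in $(awk -v d="$IF" '$NF ~ d {sub(":","",$1); print $1}' /proc/interrupts); do
        echo "$cores" | sed -n "${i}p" > "/proc/irq/$irq/smp_affinity_list"
        i=$((i + 1))
    done

    # Ring parameters etc. follow via further ethtool -G calls.
}

if [ -e "/sys/class/net/$IF" ]; then
    tune
fi
```

Note that irqbalance will undo the affinity writes unless it is stopped or told to ban those IRQs.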
- Gilboa
Originally posted by pegasus
Congrats on expanding into new benchmarking territory, but there are new dragons here. These numbers all seem way too low. I regularly max out 100Gbit on old Ivy Bridge storage nodes running CentOS 6, and that's with less than 30 minutes spent tuning them. 10Gbit today can be maxed out with a single core ...
There are many online. Broadcom have theirs, Intel have theirs, Mellanox have theirs ... They're mostly the same: tuning your TCP stack settings and congestion algorithms for LAN or WAN scenarios, pinning NIC interrupt processing threads to specific cores, enlarging the NIC queue lengths, making sure that offloads are enabled, etc. Mellanox even has a script that does all of that for you.
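For flavor, a few of the knobs such guides typically touch; the values here are illustrative placeholders, not recommendations from any particular vendor guide:

```shell
# TCP stack: raise socket buffer ceilings for high bandwidth-delay paths
sysctl -w net.core.rmem_max=67108864
sysctl -w net.core.wmem_max=67108864
sysctl -w net.ipv4.tcp_rmem="4096 87380 67108864"
sysctl -w net.ipv4.tcp_wmem="4096 65536 67108864"

# Congestion control: guides pick per LAN/WAN scenario (cubic, htcp, bbr, ...)
sysctl -w net.ipv4.tcp_congestion_control=cubic

# Larger NIC transmit queue, and confirm offloads are on (eth0 is a placeholder)
ip link set dev eth0 txqueuelen 10000
ethtool -K eth0 tso on gso on gro on
```

The sysctl changes need root and are lost on reboot unless persisted under /etc/sysctl.d/.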