I had something similar sometime back and it turned out not to be the servers but a faulty switch.
My Intel Linux NICs Have Developed A Nasty Habit Of Becoming Hung
-
I have one I217-V that sometimes starts taking down the whole network by flooding it with packets. The network immediately recovers when the offender is unplugged.
I tracked it down to around kernel version 3.19; the machine will stay up for months without issues with kernels <=3.18, but start taking down the network completely at random with Linux 3.19+.
Comment
-
For "Intel Corporation 82579V Gigabit Network Connection" there also exists an official Intel update:
Without it, the NIC changes its PCI ID across reboots (with some IDs it never gets a connection, with others it is merely annoying). I have this kind of NIC on my Asus P8Z68-V. It basically works, except that Wake-on-LAN does not seem to work with Linux (WOL is somehow disabled at poweroff); it only worked with Windows, which is a bit stupid...
Comment
-
Originally posted by mlau
I have one I217-V that sometimes starts taking down the whole network by flooding it with packets. The network immediately recovers when the offender is unplugged.
I tracked it down to around kernel version 3.19; the machine will stay up for months without issues with kernels <=3.18, but start taking down the network completely at random with Linux 3.19+.
Cf. https://communities.intel.com/thread...art=0&tstart=0 and other threads you can find by searching for "intel wake on lan flood bug". If you're seeing a flood of ICMPv6 packets, it's almost certainly this bug.
Comment
-
As stated above, tso off wasn't enough. In my case:
Intel Corporation 82567LM-3 Gigabit Network Connection (rev 02)
modinfo e1000e
filename: /lib/modules/3.10.0-327.13.1.el7.x86_64/updates/drivers/net/ethernet/intel/e1000e/e1000e.ko
version: 3.3.3-NAPI
CentOS 7
I had to add to rc.local:
#get rid of intel's e1000e 1 GBit ethernet card random resets
ethtool -K enp0s25 gso off gro off tso off
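To check after boot that the rc.local line actually took effect, the feature list printed by `ethtool -k` (lowercase k) can be filtered for the three offloads. This is only a minimal sketch: the helper name offloads_still_on is made up for illustration, and the interface name enp0s25 is the one from this post and will differ on other machines.

```shell
#!/bin/sh
# Hypothetical helper: reads `ethtool -k <iface>` output on stdin and prints
# the name of each of the three offloads that is still reported as "on".
offloads_still_on() {
    grep -E '^(tcp-segmentation-offload|generic-segmentation-offload|generic-receive-offload): on' \
        | cut -d: -f1
}

# Intended use on the affected machine (not executed by this sketch):
#   ethtool -K enp0s25 gso off gro off tso off
#   ethtool -k enp0s25 | offloads_still_on   # no output means all three are off
```

If the helper prints anything, that offload is still enabled and the rc.local line probably ran before the interface was ready or was not run at all.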
Comment
-
Originally posted by chrisb
If you already tracked the regression down to between v3.18 and v3.19 then consider doing a git bisect. It would only take around 13 steps to identify the exact commit that introduced the bug.
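A rough sketch of what that bisect could look like, assuming a clone of the mainline kernel tree with the v3.18 and v3.19 tags. The function names are hypothetical, and the step count is only the usual floor(log2(n)) estimate, so any range of 8192 to 16383 commits comes out at 13.

```shell
#!/bin/sh
# Rough estimate of bisect steps for n candidate commits: floor(log2(n)).
bisect_steps() {
    n=$1
    steps=0
    while [ "$n" -gt 1 ]; do
        n=$((n / 2))
        steps=$((steps + 1))
    done
    echo "$steps"
}

# Hypothetical wrapper, meant to be run inside a kernel git checkout.
start_kernel_bisect() {
    git bisect start
    git bisect bad v3.19     # first release reported to show the flooding
    git bisect good v3.18    # last release reported to be stable
    # Print the estimate for the actual number of commits in the range.
    bisect_steps "$(git rev-list --count v3.18..v3.19)"
}

# After building and testing each candidate kernel, report the verdict with
# `git bisect good` or `git bisect bad`, and finish with `git bisect reset`.
```

Because the flooding shows up at random, each candidate kernel probably has to run for quite a while before it can safely be marked good, so the bisect may take much longer in wall-clock time than the step count suggests.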
Comment