Intel-Wired-Lan Archive on lore.kernel.org
* [Intel-wired-lan] igb firmware 1.63 broken / flapping on switch reboot - update or downgrade possible?
@ 2021-05-19 11:57 Christian Ruppert
  2021-05-19 13:09 ` Christian Ruppert
  0 siblings, 1 reply; 2+ messages in thread
From: Christian Ruppert @ 2021-05-19 11:57 UTC
  To: intel-wired-lan

Hi List,

Problem: If we reboot a switch that is connected to igb interfaces (we 
use bonding), the interface flaps several times during the reboot 
of the switch.
Setup: for example, 2x 1GE I350 (igb) connected to 2x Juniper EX3330.
It's an active/backup bond with MIIMON disabled and ARP monitoring 
configured.

What we have figured out so far: it seems to be a bug in firmware 1.63, 
while a system with 1.61 seems to work just fine.

We have a bunch of systems with:
02:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
	Subsystem: Super Micro Computer Inc Device 1521
	Kernel driver in use: igb
	Kernel modules: igb
02:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
	Subsystem: Super Micro Computer Inc Device 1521
	Kernel driver in use: igb
	Kernel modules: igb

Let's pick two of those systems, first the good one:
# ethtool -i net0
driver: igb
version: 5.6.0-k
firmware-version: 1.61, 0x8000090e
expansion-rom-version:
bus-info: 0000:02:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

# uname -r
3.10.0-1160.25.1.el7.x86_64

CentOS 7.9

# dmesg
[627590.997603] igb 0000:02:00.0 net0: igb: net0 NIC Link is Down
[627598.277441] bond0: link status definitely down for interface net0, disabling it
[627598.278062] bond0: making interface net1 the new active one
[627598.278536] device net0 left promiscuous mode
[627598.279109] device net1 entered promiscuous mode
[627856.894229] igb 0000:02:00.0 net0: igb: net0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[627859.970951] bond0: link status definitely up for interface net0
[627859.971577] bond0: making interface net0 the new active one
[627859.972127] device net1 left promiscuous mode
[627859.972801] device net0 entered promiscuous mode


That's the complete switch reboot, with a single down event and a single 
up event, and that is how it's supposed to be.

Now the broken one (we have multiple broken ones, all the same firmware 
version):
# ethtool -i net0
driver: igb
version: 5.6.0-k
firmware-version: 1.63, 0x80000a05
expansion-rom-version:
bus-info: 0000:01:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

# uname -r
3.10.0-1160.25.1.el7.x86_64

CentOS 7.9

# dmesg
[451689.477836] igb 0000:01:00.0 net0: igb: net0 NIC Link is Down
[451697.112000] bond0: link status definitely down for interface net0, disabling it
[451697.113060] bond0: making interface net1 the new active one
[451697.113906] device net0 left promiscuous mode
[451697.114840] device net1 entered promiscuous mode
[451742.241325] bond0: link status definitely up for interface net0
[451742.242276] bond0: making interface net0 the new active one
[451742.243065] device net1 left promiscuous mode
[451742.243976] device net0 entered promiscuous mode
[451751.265579] bond0: link status definitely down for interface net0, disabling it
[451751.266503] bond0: making interface net1 the new active one
[451751.267300] device net0 left promiscuous mode
[451751.268166] device net1 entered promiscuous mode
[451817.443511] bond0: link status definitely up for interface net0
[451817.444428] bond0: making interface net0 the new active one
[451817.445216] device net1 left promiscuous mode
[451817.446100] device net0 entered promiscuous mode
[451826.467777] bond0: link status definitely down for interface net0, disabling it
[451826.468836] bond0: making interface net1 the new active one
[451826.469702] device net0 left promiscuous mode
[451826.470534] device net1 entered promiscuous mode
[451856.548666] bond0: link status definitely up for interface net0
[451856.549534] bond0: making interface net0 the new active one
[451856.550283] device net1 left promiscuous mode
[451856.551142] device net0 entered promiscuous mode
[451865.572959] bond0: link status definitely down for interface net0, disabling it
[451865.573892] bond0: making interface net1 the new active one
[451865.574671] device net0 left promiscuous mode
[451865.575504] device net1 entered promiscuous mode
[451874.597227] bond0: link status definitely up for interface net0
[451874.598273] bond0: making interface net0 the new active one
[451874.599057] device net1 left promiscuous mode
[451874.599901] device net0 entered promiscuous mode
[451883.621550] bond0: link status definitely down for interface net0, disabling it
[451883.622382] bond0: making interface net1 the new active one
[451883.623136] device net0 left promiscuous mode
[451883.623898] device net1 entered promiscuous mode
[451886.629557] bond0: link status definitely up for interface net0
[451886.630416] bond0: making interface net0 the new active one
[451886.631178] device net1 left promiscuous mode
[451886.632051] device net0 entered promiscuous mode
[451895.653860] bond0: link status definitely down for interface net0, disabling it
[451895.654792] bond0: making interface net1 the new active one
[451895.655548] device net0 left promiscuous mode
[451895.656372] device net1 entered promiscuous mode
[451898.661903] bond0: link status definitely up for interface net0
[451898.662789] bond0: making interface net0 the new active one
[451898.663582] device net1 left promiscuous mode
[451898.664464] device net0 entered promiscuous mode
[451907.686173] bond0: link status definitely down for interface net0, disabling it
[451907.687090] bond0: making interface net1 the new active one
[451907.687864] device net0 left promiscuous mode
[451907.688700] device net1 entered promiscuous mode
[451919.718549] bond0: link status definitely up for interface net0
[451919.719403] bond0: making interface net0 the new active one
[451919.720165] device net1 left promiscuous mode
[451919.721040] device net0 entered promiscuous mode
[451928.742836] bond0: link status definitely down for interface net0, disabling it
[451928.743834] bond0: making interface net1 the new active one
[451928.744601] device net0 left promiscuous mode
[451928.745452] device net1 entered promiscuous mode
[451949.799426] bond0: link status definitely up for interface net0
[451949.800297] bond0: making interface net0 the new active one
[451949.801080] device net1 left promiscuous mode
[451949.801978] device net0 entered promiscuous mode
[451954.463872] igb 0000:01:00.0 net0: igb: net0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX

This is the same switch reboot as on the good system. Both systems are 
connected to the same switch and use the same bonding config, so it 
doesn't seem to be related to the bonding.
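
To compare the two logs without counting lines by hand, the bonding
transitions can be tallied with a short script. This is only a sketch: the
function name count_bond_transitions and the pattern are illustrative
assumptions, written against the message format shown in the dmesg output
above, and the sample is a four-line excerpt of that output.

```python
import re

# Matches the bonding driver's messages as they appear in the dmesg output
# quoted in this mail. The pattern is an assumption based on that output only.
BOND_EVENT = re.compile(
    r"\[\s*\d+\.\d+\]\s+(?P<bond>\S+): link status definitely "
    r"(?P<state>up|down) for interface (?P<iface>\w+)"
)

def count_bond_transitions(dmesg_text):
    """Return {(interface, state): count} for bonding up/down events."""
    counts = {}
    for match in BOND_EVENT.finditer(dmesg_text):
        key = (match.group("iface"), match.group("state"))
        counts[key] = counts.get(key, 0) + 1
    return counts

# Excerpt from the log of the broken system.
sample = """\
[451689.477836] igb 0000:01:00.0 net0: igb: net0 NIC Link is Down
[451697.112000] bond0: link status definitely down for interface net0, disabling it
[451742.241325] bond0: link status definitely up for interface net0
[451751.265579] bond0: link status definitely down for interface net0, disabling it
"""

print(count_bond_transitions(sample))
```

Lines from the igb driver itself are ignored on purpose, so the result
reflects only what the bonding driver decided.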
# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: net0 (primary_reselect always)
Currently Active Slave: net0
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0
ARP Polling Interval (ms): 3000
ARP IP target/s (n.n.n.n form): 192.168.99.105

Slave Interface: net0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 9
Permanent HW addr: 0c:c4:7a:ab:f2:30
Slave queue ID: 0

Slave Interface: net1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: 0c:c4:7a:ab:f2:31
Slave queue ID: 0
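
When checking many hosts, the per-slave "Link Failure Count" can be pulled
out of this output with a small parser. This is a minimal sketch, assuming
the text layout shown above; the function name link_failure_counts is
hypothetical, and the sample is a shortened copy of the output.

```python
def link_failure_counts(bonding_text):
    """Map each slave interface name to its 'Link Failure Count'."""
    counts = {}
    current = None  # slave section we are currently inside, if any
    for line in bonding_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "key: value" form
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = value
        elif key == "Link Failure Count" and current is not None:
            counts[current] = int(value)
    return counts

# Shortened copy of the bond0 status shown above.
sample = """\
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: net0

Slave Interface: net0
MII Status: up
Link Failure Count: 9
Permanent HW addr: 0c:c4:7a:ab:f2:30

Slave Interface: net1
MII Status: up
Link Failure Count: 1
Permanent HW addr: 0c:c4:7a:ab:f2:31
"""

print(link_failure_counts(sample))
# On a live system the input would be the contents of /proc/net/bonding/bond0.
```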


Is it possible to upgrade the firmware? Is there a more recent one at 
all? So far I couldn't find any information about that, nor a changelog. 
We'd even do a downgrade to get this fixed.
The firmware doesn't seem to be included in the driver, so I assume 
there's an external package for it?

-- 
Regards,
Christian Ruppert

* [Intel-wired-lan] igb firmware 1.63 broken / flapping on switch reboot - update or downgrade possible?
  2021-05-19 11:57 [Intel-wired-lan] igb firmware 1.63 broken / flapping on switch reboot - update or downgrade possible? Christian Ruppert
@ 2021-05-19 13:09 ` Christian Ruppert
  0 siblings, 0 replies; 2+ messages in thread
From: Christian Ruppert @ 2021-05-19 13:09 UTC
  To: intel-wired-lan

OK, it's probably not the firmware :(
We also have systems with the same firmware version where some work and 
others don't, so something else must differ.
I just found two systems that are otherwise identical to the ones above: 
both have 1.63, yet one works and the other doesn't.
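
One way to narrow down what differs is to diff the "key: value" output of the
same command from a working and a failing host. The following is a rough
sketch, not a finished tool: parse_kv and diff_fields are hypothetical helper
names, and the sample input is the `ethtool -i` output from the first mail in
this thread (where the firmware versions still differed).

```python
def parse_kv(text):
    """Parse 'key: value' lines into a dict, ignoring lines without a colon."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields

def diff_fields(good_text, bad_text):
    """Return {key: (good_value, bad_value)} for every field that differs."""
    good, bad = parse_kv(good_text), parse_kv(bad_text)
    return {
        key: (good.get(key), bad.get(key))
        for key in sorted(good.keys() | bad.keys())
        if good.get(key) != bad.get(key)
    }

good_host = """\
driver: igb
version: 5.6.0-k
firmware-version: 1.61, 0x8000090e
expansion-rom-version:
bus-info: 0000:02:00.0
"""

bad_host = """\
driver: igb
version: 5.6.0-k
firmware-version: 1.63, 0x80000a05
expansion-rom-version:
bus-info: 0000:01:00.0
"""

for key, (good, bad) in diff_fields(good_host, bad_host).items():
    print(f"{key}: good={good!r} bad={bad!r}")
```

The same helper works for any output in this format, for example the
contents of /proc/net/bonding/bond0 from both hosts.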


-- 
Regards,
Christian Ruppert
