* Re: Yukon2 88E8056 card problem with switch?
[not found] <4A0BC769.9010507@aei.mpg.de>
@ 2009-05-14 15:06 ` Stephen Hemminger
2009-05-14 15:41 ` Carsten Aulbert
2009-05-15 11:08 ` Carsten Aulbert
0 siblings, 2 replies; 3+ messages in thread
From: Stephen Hemminger @ 2009-05-14 15:06 UTC
To: Carsten Aulbert; +Cc: netdev
On Thu, 14 May 2009 09:25:29 +0200
Carsten Aulbert <carsten.aulbert@aei.mpg.de> wrote:
> Hi,
>
> sorry to ask you directly, but I'm running out of options for solving
> this issue:
>
> We install our machines fully automatically via Debian's FAI mechanisms
> and hit a problem right at the end of the installation which can also be
> triggered after a standard system install.
>
> With kernel 2.6.27.21 (vanilla), logging into the box via ssh and
> calling dmesg makes the net watchdog start barking:
>
> May 12 09:04:28 gpu01 kernel: [ 3000.040007] ------------[ cut here
> ]------------
> May 12 09:04:28 gpu01 kernel: [ 3000.040011] WARNING: at
> net/sched/sch_generic.c:219 dev_watchdog+0x121/0x1b8()
> May 12 09:04:28 gpu01 kernel: [ 3000.040013] NETDEV WATCHDOG: eth0
> (sky2): transmit timed out
> May 12 09:04:28 gpu01 kernel: [ 3000.040015] Modules linked in:
> ipmi_devintf ipmi_watchdog ipmi_poweroff ipmi_msghandler i2c_i801
> i2c_core sky2
> May 12 09:04:28 gpu01 kernel: [ 3000.040025] Pid: 0, comm: swapper Not
> tainted 2.6.27.21-atlas-generic-noinitrd #1
> May 12 09:04:28 gpu01 kernel: [ 3000.040027]
> May 12 09:04:28 gpu01 kernel: [ 3000.040028] Call Trace:
> May 12 09:04:28 gpu01 kernel: [ 3000.040030] <IRQ>
> [<ffffffff80237378>] warn_slowpath+0xb4/0xdc
> May 12 09:04:28 gpu01 kernel: [ 3000.040037] [<ffffffff804d2d00>]
> sk_filter+0x10/0x80
> May 12 09:04:28 gpu01 kernel: [ 3000.040040] [<ffffffff804e7b1a>]
> ip_route_input+0x63e/0xedf
> May 12 09:04:28 gpu01 kernel: [ 3000.040044] [<ffffffff803bf7b9>]
> __next_cpu+0x19/0x26
> May 12 09:04:28 gpu01 kernel: [ 3000.040048] [<ffffffff802302e7>]
> find_busiest_group+0x315/0x7c3
> May 12 09:04:28 gpu01 kernel: [ 3000.040051] [<ffffffff80232203>]
> try_to_wake_up+0x165/0x177
> May 12 09:04:28 gpu01 kernel: [ 3000.040054] [<ffffffff8022f0ce>]
> enqueue_task_fair+0xd8/0x130
> May 12 09:04:28 gpu01 kernel: [ 3000.040057] [<ffffffff804df6ed>]
> dev_watchdog+0x121/0x1b8
> May 12 09:04:28 gpu01 kernel: [ 3000.040060] [<ffffffff80232203>]
> try_to_wake_up+0x165/0x177
> May 12 09:04:28 gpu01 kernel: [ 3000.040062] [<ffffffff804df5cc>]
> dev_watchdog+0x0/0x1b8
> May 12 09:04:28 gpu01 kernel: [ 3000.040065] [<ffffffff8023fa06>]
> run_timer_softirq+0x16e/0x1ee
> May 12 09:04:28 gpu01 kernel: [ 3000.040069] [<ffffffff8024c075>]
> ktime_get_ts+0x21/0x49
> May 12 09:04:28 gpu01 kernel: [ 3000.040072] [<ffffffff8023bfad>]
> __do_softirq+0x6a/0xda
> May 12 09:04:28 gpu01 kernel: [ 3000.040075] [<ffffffff8021163c>]
> call_softirq+0x1c/0x28
> May 12 09:04:28 gpu01 kernel: [ 3000.040078] [<ffffffff802130fb>]
> do_softirq+0x3c/0x81
> May 12 09:04:28 gpu01 kernel: [ 3000.040082] [<ffffffff80220326>]
> smp_apic_timer_interrupt+0x8e/0xa7
> May 12 09:04:28 gpu01 kernel: [ 3000.040085] [<ffffffff80210e43>]
> apic_timer_interrupt+0x83/0x90
> May 12 09:04:28 gpu01 kernel: [ 3000.040086] <EOI>
> [<ffffffff802170e2>] mwait_idle+0x3c/0x46
> May 12 09:04:28 gpu01 kernel: [ 3000.040092] [<ffffffff8020ee32>]
> cpu_idle+0x91/0xd1
> May 12 09:04:28 gpu01 kernel: [ 3000.040094]
> May 12 09:04:28 gpu01 kernel: [ 3000.040096] ---[ end trace
> da19323bcd799bc5 ]---
> May 12 09:04:28 gpu01 kernel: [ 3000.040098] sky2 eth0: tx timeout
> May 12 09:04:28 gpu01 kernel: [ 3000.048993] sky2 eth0: transmit ring
> 348 .. 308 report=348 done=348
> May 12 09:04:28 gpu01 kernel: [ 3000.049017] sky2 eth0: disabling interface
> May 12 09:04:28 gpu01 kernel: [ 3000.053439] sky2 eth0: enabling interface
> May 12 09:04:31 gpu01 kernel: [ 3003.153938] sky2 eth0: Link is up at
> 1000 Mbps, full duplex, flow control rx
You are only seeing partial flow control. My recommendation would be to
turn off flow control with:
ethtool -A eth0 autoneg off rx off tx off
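Whether the setting actually took effect can be read back with ethtool -a eth0. The following is only a minimal sketch, assuming the usual three-line output (Autonegotiate / RX / TX); pause_summary is a hypothetical helper name, not part of ethtool or the sky2 driver.

```shell
# Hypothetical helper: condense `ethtool -a <iface>` output (read from stdin)
# into one line. Assumes the usual "Autonegotiate:", "RX:" and "TX:" lines.
pause_summary() {
    awk -F ':[ \t]*' '
        /^Autonegotiate:/ { autoneg = $2 }
        /^RX:/            { rx = $2 }
        /^TX:/            { tx = $2 }
        END { printf "autoneg=%s rx=%s tx=%s\n", autoneg, rx, tx }
    '
}

# Usage on a live system (needs root for -A; not run here):
#   ethtool -A eth0 autoneg off rx off tx off
#   ethtool -a eth0 | pause_summary
```

If the summary still shows rx or tx as on after the change, the request was not applied.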
> Most of the time the device seems to heal itself after a couple of
> minutes, but not always. I suspect this is related to switching since I
> don't see this behavior when running a direct link cable between this
> machine and another one.
>
> On a related note: autosensing also does not seem to work reliably,
> since our switches report no pause frames for either tx or rx, because
> pause frames could potentially cause havoc in our large switching
> network.
It works with other switches, so check cable and try another switch.
> I've tried to make this problem go away via ethtool -A eth0, however so
> far without luck. I've yet to play around with the sky2 module
> parameters, any idea which parameter - if any - could help?
No parameters (by design) in driver.
--
* Re: Yukon2 88E8056 card problem with switch?
From: Carsten Aulbert @ 2009-05-14 15:41 UTC
To: Stephen Hemminger; +Cc: netdev
Hi (thanks for the quick reply)
Stephen Hemminger wrote:
>
> You are only seeing partial flow control. My recommendation would be to
> turn off flow control with:
> ethtool -A eth0 autoneg off rx off tx off
I've tried that, but it did *not* change the overall behavior,
although so far I had omitted the "autoneg off" part. I'll redo the test.
>
> It works with other switches, so check cable and try another switch.
>
I'll check that again; however, we are pretty much tied to this brand,
since we need these switches for their large uplink capacity. If I
cannot reproduce this with another switch, I'll start bugging them :)
Cheers
Carsten
* Re: Yukon2 88E8056 card problem with switch?
From: Carsten Aulbert @ 2009-05-15 11:08 UTC
To: Stephen Hemminger; +Cc: netdev
Hi again
Stephen Hemminger wrote:
>
> You are only seeing partial flow control. My recommendation would be to
> turn off flow control with:
> ethtool -A eth0 autoneg off rx off tx off
>
Tried that, did not help.
>
> It works with other switches, so check cable and try another switch.
>
I think I've overlooked something. Is it possible that said chip (or the
driver) has problems with Jumbo frames? Our usual setup for more than a
year now is that we run everything with MTU 9000 and thus I have not
looked into this before. I just did a full install with MTU 1500 and it
just worked.
Is there a known issue with Jumbo frames?
The same set-up (switch, cable, etc.) works with an e1000 and Jumbo frames.
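One way to narrow down the MTU dependence might be to ping a peer with the don't-fragment bit set and a payload that exactly fills one frame. For IPv4 the ICMP payload is the MTU minus 28 bytes (20-byte IP header without options plus 8-byte ICMP echo header). This is only a sketch: icmp_payload_for_mtu is a hypothetical helper name and PEER is a placeholder for a host on the same switch.

```shell
# Hypothetical helper: largest ICMP echo payload that fits into one
# unfragmented IPv4 packet for the given MTU.
icmp_payload_for_mtu() {
    # 20 bytes IPv4 header (no options) + 8 bytes ICMP echo header
    echo $(( $1 - 28 ))
}

# Usage on a live system (needs root for the MTU change; not run here):
#   ip link set dev eth0 mtu 9000
#   ping -M do -c 3 -s "$(icmp_payload_for_mtu 9000)" PEER
```

Repeating the ping at a few intermediate MTU values would show whether the timeouts only appear above a certain frame size.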
Cheers
Carsten