* sky2 hangs, hw csum errors with 2.6.18
@ 2006-09-22 11:24 Martin Lucina
2006-09-22 16:56 ` Stephen Hemminger
0 siblings, 1 reply; 15+ messages in thread
From: Martin Lucina @ 2006-09-22 11:24 UTC (permalink / raw)
To: netdev; +Cc: Stephen Hemminger
Hello,
I'm having problems with my sky2 NIC hanging under heavy load. This
appears to be an old problem since it happened for me with 2.6.17 as
well. Upgrading the affected systems to 2.6.18 has not solved the
problem. It's easily reproducible for me since I'm running some
application stress testing that easily saturates the link.
I've had a look at the recent traffic on linux-kernel, netdev and the
relevant bugzilla (http://bugzilla.kernel.org/show_bug.cgi?id=6839) but
it's not clear to me which patch I should try against a stock 2.6.18
kernel. If someone could confirm that the "TX pause fix" attached to
the bugzilla is sufficient, that would be great.
The card in question is a:
Sep 22 12:17:27 dezo kernel: sky2 v1.5 addr 0xf3000000 irq 169 Yukon-XL (0xb3) rev 1
it's a SysKonnect SK-9E21 PCI-E Server Adapter and the driver is using
PCI-MSI interrupts on my system.
The chip on the card is a Marvell 88E8061.
The actual errors leading up to the latest hang are:
Sep 21 21:47:06 dezo kernel: NETDEV WATCHDOG: eth1: transmit timed out
Sep 21 21:47:06 dezo kernel: sky2 eth1: tx timeout
Sep 21 21:47:06 dezo kernel: sky2 eth1: transmit ring 220 .. 179 report=220 done=220
Sep 21 21:47:06 dezo kernel: sky2 hardware hung? flushing
Sep 21 21:59:41 dezo kernel: NETDEV WATCHDOG: eth1: transmit timed out
Sep 21 21:59:41 dezo kernel: sky2 eth1: tx timeout
Sep 21 21:59:41 dezo kernel: sky2 eth1: transmit ring 179 .. 138 report=220 done=220
Sep 21 21:59:41 dezo kernel: sky2 status report lost?
Sep 21 22:00:41 dezo kernel: NETDEV WATCHDOG: eth1: transmit timed out
Sep 21 22:00:41 dezo kernel: sky2 eth1: tx timeout
Sep 21 22:00:41 dezo kernel: sky2 eth1: transmit ring 220 .. 179 report=220 done=220
Sep 21 22:00:41 dezo kernel: sky2 hardware hung? flushing
Sep 21 22:13:10 dezo kernel: NETDEV WATCHDOG: eth1: transmit timed out
Sep 21 22:13:10 dezo kernel: sky2 eth1: tx timeout
Sep 21 22:13:10 dezo kernel: sky2 eth1: transmit ring 179 .. 138 report=220 done=220
Sep 21 22:13:10 dezo kernel: sky2 status report lost?
Sep 21 22:14:20 dezo kernel: NETDEV WATCHDOG: eth1: transmit timed out
Sep 21 22:14:20 dezo kernel: sky2 eth1: tx timeout
Sep 21 22:14:20 dezo kernel: sky2 eth1: transmit ring 220 .. 179 report=220 done=220
Sep 21 22:14:20 dezo kernel: sky2 hardware hung? flushing
Sep 21 22:15:09 dezo kernel: sky2 eth1: disabling interface
Sep 21 22:15:09 dezo kernel: sky2 eth1: enabling interface
Sep 21 22:15:12 dezo kernel: sky2 eth1: Link is up at 1000 Mbps, full duplex, flow control both
Sep 21 22:15:20 dezo kernel: eth1: no IPv6 routers present
While the interface does appear to have been reset, it never actually
started working again and the system was hung until I rebooted it this
morning.
I'm also seeing a lot of these under high load:
Sep 21 21:34:24 dezo kernel: eth1: hw csum failure.
Sep 21 21:34:24 dezo kernel:
Sep 21 21:34:24 dezo kernel: Call Trace:
Sep 21 21:34:24 dezo kernel: [dump_stack+16/21] dump_stack+0x10/0x15
Sep 21 21:34:24 dezo kernel: [__skb_checksum_complete+85/121] __skb_checksum_complete+0x55/0x79
Sep 21 21:34:24 dezo kernel: [tcp_v4_rcv+218/2405] tcp_v4_rcv+0xda/0x965
Sep 21 21:34:24 dezo kernel: [ip_local_deliver+433/635] ip_local_deliver+0x1b1/0x27b
Sep 21 21:34:24 dezo kernel: [ip_rcv+1234/1311] ip_rcv+0x4d2/0x51f
Sep 21 21:34:24 dezo kernel: [netif_receive_skb+589/621] netif_receive_skb+0x24d/0x26d
Sep 21 21:34:24 dezo kernel: [__nosave_end+128712870/2129981440] :sky2:sky2_status_intr+0x23b/0x404
Sep 21 21:34:24 dezo kernel: [__nosave_end+128714646/2129981440] :sky2:sky2_poll+0x100/0x1a1
Sep 21 21:34:24 dezo kernel: [net_rx_action+132/268] net_rx_action+0x84/0x10c
Sep 21 21:34:24 dezo kernel: [__do_softirq+107/226] __do_softirq+0x6b/0xe2
Sep 21 21:34:24 dezo kernel: [call_softirq+28/40] call_softirq+0x1c/0x28
Sep 21 21:34:24 dezo kernel: [do_softirq+45/129] do_softirq+0x2d/0x81
Sep 21 21:34:24 dezo kernel: [do_IRQ+112/132] do_IRQ+0x70/0x84
Sep 21 21:34:24 dezo kernel: [ret_from_intr+0/11] ret_from_intr+0x0/0xb
Sep 21 21:34:24 dezo kernel: [mwait_idle+58/82] mwait_idle+0x3a/0x52
Sep 21 21:34:24 dezo kernel: [cpu_idle+105/140] cpu_idle+0x69/0x8c
Sep 21 21:34:24 dezo kernel: [start_kernel+483/488] start_kernel+0x1e3/0x1e8
Sep 21 21:34:24 dezo kernel: [x86_64_start_kernel+459/474] x86_64_start_kernel+0x1cb/0x1d
Am happy to help with tracking this down...
Thanks,
-mato
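The tx timeout entries in this message come in pairs, and the pairing looks consistent: when the first ring index equals both report and done, the following line is "hardware hung? flushing"; when it differs from report, the following line is "status report lost?". Below is a minimal Python sketch of that reading, which may help when sifting a long syslog. The function name, the returned labels, and the interpretation of the first number as the consumer index are assumptions drawn only from the log lines in this thread, not from the driver source.

```python
import re

# Matches e.g. "sky2 eth1: transmit ring 220 .. 179 report=220 done=220"
RING_RE = re.compile(
    r"transmit ring (?P<cons>\d+) \.\. (?P<prod>\d+) "
    r"report=(?P<report>\d+) done=(?P<done>\d+)"
)


def classify_tx_timeout(line):
    """Return a label for a 'transmit ring' log line, or None if it is not one.

    Assumption: the first ring number is the consumer index. The labels are
    illustrative names, not strings emitted by the driver.
    """
    m = RING_RE.search(line)
    if m is None:
        return None
    cons = int(m.group("cons"))
    report = int(m.group("report"))
    done = int(m.group("done"))
    if report != done:
        # Not observed in this thread; included only as a guess.
        return "status pending"
    if report != cons:
        return "status report lost"
    return "hardware hung"


if __name__ == "__main__":
    samples = [
        "sky2 eth1: transmit ring 220 .. 179 report=220 done=220",
        "sky2 eth1: transmit ring 179 .. 138 report=220 done=220",
    ]
    for s in samples:
        print(classify_tx_timeout(s), "<-", s)
```

Feeding it the ring lines from the log above reproduces the same alternation of the two follow-up messages.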
* Re: sky2 hangs, hw csum errors with 2.6.18
2006-09-22 11:24 sky2 hangs, hw csum errors with 2.6.18 Martin Lucina
@ 2006-09-22 16:56 ` Stephen Hemminger
2006-09-22 18:23 ` Martin Lucina
2006-09-22 18:29 ` Martin Lucina
0 siblings, 2 replies; 15+ messages in thread
From: Stephen Hemminger @ 2006-09-22 16:56 UTC (permalink / raw)
To: Martin Lucina; +Cc: netdev
On Fri, 22 Sep 2006 13:24:43 +0200
Martin Lucina <mato@kotelna.sk> wrote:
> Hello,
>
> I'm having problems with my sky2 NIC hanging under heavy load. This
> appears to be an old problem since it happened for me with 2.6.17 as
> well. Upgrading the affected systems to 2.6.18 has not solved the
> problem. It's easily reproducible for me since I'm running some
> application stress testing that easily saturates the link.
>
> I've had a look at the recent traffic on linux-kernel, netdev and the
> relevant bugzilla (http://bugzilla.kernel.org/show_bug.cgi?id=6839) but
> it's not clear to me which patch I should try against a stock 2.6.18
> kernel. If someone could confirm that the "TX pause fix" attached to
> the bugzilla is sufficient, that would be great.
You can turn off TX pause and get the same effect.
> The card in question is a:
>
> Sep 22 12:17:27 dezo kernel: sky2 v1.5 addr 0xf3000000 irq 169 Yukon-XL (0xb3) rev 1
>
> it's a SysKonnect SK-9E21 PCI-E Server Adapter and the driver is using
> PCI-MSI interrupts on my system.
>
> The chip on the card is a Marvell 88E8061.
>
> The actual errors leading up to the latest hang are:
>
> Sep 21 21:47:06 dezo kernel: NETDEV WATCHDOG: eth1: transmit timed out
> Sep 21 21:47:06 dezo kernel: sky2 eth1: tx timeout
> Sep 21 21:47:06 dezo kernel: sky2 eth1: transmit ring 220 .. 179 report=220 done=220
> Sep 21 21:47:06 dezo kernel: sky2 hardware hung? flushing
> Sep 21 21:59:41 dezo kernel: NETDEV WATCHDOG: eth1: transmit timed out
> Sep 21 21:59:41 dezo kernel: sky2 eth1: tx timeout
> Sep 21 21:59:41 dezo kernel: sky2 eth1: transmit ring 179 .. 138 report=220 done=220
> Sep 21 21:59:41 dezo kernel: sky2 status report lost?
> Sep 21 22:00:41 dezo kernel: NETDEV WATCHDOG: eth1: transmit timed out
> Sep 21 22:00:41 dezo kernel: sky2 eth1: tx timeout
> Sep 21 22:00:41 dezo kernel: sky2 eth1: transmit ring 220 .. 179 report=220 done=220
> Sep 21 22:00:41 dezo kernel: sky2 hardware hung? flushing
> Sep 21 22:13:10 dezo kernel: NETDEV WATCHDOG: eth1: transmit timed out
> Sep 21 22:13:10 dezo kernel: sky2 eth1: tx timeout
> Sep 21 22:13:10 dezo kernel: sky2 eth1: transmit ring 179 .. 138 report=220 done=220
> Sep 21 22:13:10 dezo kernel: sky2 status report lost?
> Sep 21 22:14:20 dezo kernel: NETDEV WATCHDOG: eth1: transmit timed out
> Sep 21 22:14:20 dezo kernel: sky2 eth1: tx timeout
> Sep 21 22:14:20 dezo kernel: sky2 eth1: transmit ring 220 .. 179 report=220 done=220
> Sep 21 22:14:20 dezo kernel: sky2 hardware hung? flushing
> Sep 21 22:15:09 dezo kernel: sky2 eth1: disabling interface
> Sep 21 22:15:09 dezo kernel: sky2 eth1: enabling interface
> Sep 21 22:15:12 dezo kernel: sky2 eth1: Link is up at 1000 Mbps, full duplex, flow control both
> Sep 21 22:15:20 dezo kernel: eth1: no IPv6 routers present
>
> While the interface does appear to have been reset, it never actually
> started working again and the system was hung until I rebooted it this
> morning.
>
> I'm also seeing a lot of these under high load:
>
> Sep 21 21:34:24 dezo kernel: eth1: hw csum failure.
> Sep 21 21:34:24 dezo kernel:
> Sep 21 21:34:24 dezo kernel: Call Trace:
> Sep 21 21:34:24 dezo kernel: [dump_stack+16/21] dump_stack+0x10/0x15
> Sep 21 21:34:24 dezo kernel: [__skb_checksum_complete+85/121] __skb_checksum_complete+0x55/0x79
> Sep 21 21:34:24 dezo kernel: [tcp_v4_rcv+218/2405] tcp_v4_rcv+0xda/0x965
> Sep 21 21:34:24 dezo kernel: [ip_local_deliver+433/635] ip_local_deliver+0x1b1/0x27b
> Sep 21 21:34:24 dezo kernel: [ip_rcv+1234/1311] ip_rcv+0x4d2/0x51f
> Sep 21 21:34:24 dezo kernel: [netif_receive_skb+589/621] netif_receive_skb+0x24d/0x26d
> Sep 21 21:34:24 dezo kernel: [__nosave_end+128712870/2129981440] :sky2:sky2_status_intr+0x23b/0x404
> Sep 21 21:34:24 dezo kernel: [__nosave_end+128714646/2129981440] :sky2:sky2_poll+0x100/0x1a1
> Sep 21 21:34:24 dezo kernel: [net_rx_action+132/268] net_rx_action+0x84/0x10c
> Sep 21 21:34:24 dezo kernel: [__do_softirq+107/226] __do_softirq+0x6b/0xe2
> Sep 21 21:34:24 dezo kernel: [call_softirq+28/40] call_softirq+0x1c/0x28
> Sep 21 21:34:24 dezo kernel: [do_softirq+45/129] do_softirq+0x2d/0x81
> Sep 21 21:34:24 dezo kernel: [do_IRQ+112/132] do_IRQ+0x70/0x84
> Sep 21 21:34:24 dezo kernel: [ret_from_intr+0/11] ret_from_intr+0x0/0xb
> Sep 21 21:34:24 dezo kernel: [mwait_idle+58/82] mwait_idle+0x3a/0x52
> Sep 21 21:34:24 dezo kernel: [cpu_idle+105/140] cpu_idle+0x69/0x8c
> Sep 21 21:34:24 dezo kernel: [start_kernel+483/488] start_kernel+0x1e3/0x1e8
> Sep 21 21:34:24 dezo kernel: [x86_64_start_kernel+459/474] x86_64_start_kernel+0x1cb/0x1d
>
> Am happy to help with tracking this down...
>
> Thanks,
>
> -mato
Is this a dual port or single port card?
* Re: sky2 hangs, hw csum errors with 2.6.18
2006-09-22 16:56 ` Stephen Hemminger
@ 2006-09-22 18:23 ` Martin Lucina
2006-09-22 18:29 ` Martin Lucina
1 sibling, 0 replies; 15+ messages in thread
From: Martin Lucina @ 2006-09-22 18:23 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev
Stephen,
shemminger@osdl.org said:
> You can turn off TX pause and get the same effect.
OK, I'll try that and get back to you.
> Is this a dual port or single port card?
Single port, copper media.
-mato
* Re: sky2 hangs, hw csum errors with 2.6.18
2006-09-22 16:56 ` Stephen Hemminger
2006-09-22 18:23 ` Martin Lucina
@ 2006-09-22 18:29 ` Martin Lucina
2006-09-22 18:31 ` Stephen Hemminger
1 sibling, 1 reply; 15+ messages in thread
From: Martin Lucina @ 2006-09-22 18:29 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev
shemminger@osdl.org said:
> You can turn off TX pause and get the same effect.
Not sure if I did the right thing, but:
# ifdown eth1
...
# ethtool -A eth1 tx off
# ethtool -a eth1
Pause parameters for eth1:
Autonegotiate: on
RX: on
TX: off
# ifup eth1
...
sky2 eth1: enabling interface
sky2 eth1: Link is up at 1000 Mbps, full duplex, flow control both
# ethtool -a eth1
Pause parameters for eth1:
Autonegotiate: on
RX: on
TX: on
is this what I should be seeing?
-mato
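To compare the two `ethtool -a` dumps above mechanically, something like the following could be used. It is only a sketch: `parse_pause` and `pause_changes` are hypothetical helper names, and the parser assumes the "Key: on|off" layout shown in this message rather than any documented ethtool output format.

```python
def parse_pause(output):
    """Turn `ethtool -a` text into a dict such as {'RX': True, 'TX': False}.

    Lines without an on/off value (for example the 'Pause parameters for
    eth1:' header) are ignored.
    """
    params = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        if sep and value in ("on", "off"):
            params[key.strip()] = (value == "on")
    return params


def pause_changes(before, after):
    """Return {name: (old, new)} for every setting that differs."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }


# The two dumps from this message: after `ethtool -A eth1 tx off`,
# and again after `ifup eth1`.
BEFORE_IFUP = """Pause parameters for eth1:
Autonegotiate: on
RX: on
TX: off
"""

AFTER_IFUP = """Pause parameters for eth1:
Autonegotiate: on
RX: on
TX: on
"""

if __name__ == "__main__":
    print(pause_changes(parse_pause(BEFORE_IFUP), parse_pause(AFTER_IFUP)))
```

On the dumps above, the only difference reported is TX going from off back to on, which is the reset of the setting being asked about.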
* Re: sky2 hangs, hw csum errors with 2.6.18
2006-09-22 18:29 ` Martin Lucina
@ 2006-09-22 18:31 ` Stephen Hemminger
2006-09-22 18:38 ` Martin Lucina
0 siblings, 1 reply; 15+ messages in thread
From: Stephen Hemminger @ 2006-09-22 18:31 UTC (permalink / raw)
To: Martin Lucina; +Cc: netdev
On Fri, 22 Sep 2006 20:29:25 +0200
Martin Lucina <mato@kotelna.sk> wrote:
> shemminger@osdl.org said:
> > You can turn off TX pause and get the same effect.
>
> Not sure if I did the right thing, but:
>
> # ifdown eth1
> ...
> # ethtool -A eth1 tx off
> # ethtool -a eth1
> Pause parameters for eth1:
> Autonegotiate: on
> RX: on
> TX: off
> # ifup eth1
> ...
> sky2 eth1: enabling interface
> sky2 eth1: Link is up at 1000 Mbps, full duplex, flow control both
> # ethtool -a eth1
> Pause parameters for eth1:
> Autonegotiate: on
> RX: on
> TX: on
>
> is this what I should be seeing?
>
> -mato
To turn tx flow control off, you need a patch (already in netdev-2.6 upstream),
and then you can do:
ethtool -A eth1 autoneg off tx off
--
Stephen Hemminger <shemminger@osdl.org>
* Re: sky2 hangs, hw csum errors with 2.6.18
2006-09-22 18:31 ` Stephen Hemminger
@ 2006-09-22 18:38 ` Martin Lucina
2006-09-22 18:50 ` Stephen Hemminger
0 siblings, 1 reply; 15+ messages in thread
From: Martin Lucina @ 2006-09-22 18:38 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev
shemminger@osdl.org said:
> To turn tx flow control off, you need a patch (already in netdev-2.6 upstream),
> and then you can do:
>
> ethtool -A eth1 autoneg off tx off
Sorry, you've lost me. Which patch? You're saying that turning off tx
flow control will fix the hangs I'm seeing?
-mato
* Re: sky2 hangs, hw csum errors with 2.6.18
2006-09-22 18:38 ` Martin Lucina
@ 2006-09-22 18:50 ` Stephen Hemminger
2006-10-03 18:21 ` Martin Lucina
0 siblings, 1 reply; 15+ messages in thread
From: Stephen Hemminger @ 2006-09-22 18:50 UTC (permalink / raw)
To: Martin Lucina; +Cc: netdev
On Fri, 22 Sep 2006 20:38:13 +0200
Martin Lucina <mato@kotelna.sk> wrote:
> shemminger@osdl.org said:
> > To turn tx flow control off, you need a patch (already in netdev-2.6 upstream),
> > and then you can do:
> >
> > ethtool -A eth1 autoneg off tx off
>
> Sorry, you've lost me. Which patch? You're saying that turning off tx
> flow control will fix the hangs I'm seeing?
>
> -mato
Subject: sky2: handle forced settings
Handle cases where pause parameters are forced correctly.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
--- sky2.orig/drivers/net/sky2.c 2006-09-06 09:45:45.000000000 -0700
+++ sky2/drivers/net/sky2.c 2006-09-06 10:20:50.000000000 -0700
@@ -289,7 +289,7 @@
static void sky2_phy_init(struct sky2_hw *hw, unsigned port)
{
struct sky2_port *sky2 = netdev_priv(hw->dev[port]);
- u16 ctrl, ct1000, adv, pg, ledctrl, ledover;
+ u16 ctrl, ct1000, adv, pg, ledctrl, ledover, reg;
if (sky2->autoneg == AUTONEG_ENABLE &&
!(hw->chip_id == CHIP_ID_YUKON_XL || hw->chip_id == CHIP_ID_YUKON_EC_U)) {
@@ -358,6 +358,7 @@
ctrl = 0;
ct1000 = 0;
adv = PHY_AN_CSMA;
+ reg = 0;
if (sky2->autoneg == AUTONEG_ENABLE) {
if (hw->copper) {
@@ -390,21 +391,46 @@
/* forced speed/duplex settings */
ct1000 = PHY_M_1000C_MSE;
- if (sky2->duplex == DUPLEX_FULL)
- ctrl |= PHY_CT_DUP_MD;
+ /* Disable auto update for duplex flow control and speed */
+ reg |= GM_GPCR_AU_ALL_DIS;
switch (sky2->speed) {
case SPEED_1000:
ctrl |= PHY_CT_SP1000;
+ reg |= GM_GPCR_SPEED_1000;
break;
case SPEED_100:
ctrl |= PHY_CT_SP100;
+ reg |= GM_GPCR_SPEED_100;
break;
}
+ if (sky2->duplex == DUPLEX_FULL) {
+ reg |= GM_GPCR_DUP_FULL;
+ ctrl |= PHY_CT_DUP_MD;
+ } else if (sky2->speed != SPEED_1000 && hw->chip_id != CHIP_ID_YUKON_EC_U) {
+ /* Turn off flow control for 10/100mbps */
+ sky2->rx_pause = 0;
+ sky2->tx_pause = 0;
+ }
+
+ if (!sky2->rx_pause)
+ reg |= GM_GPCR_FC_RX_DIS;
+
+ if (!sky2->tx_pause)
+ reg |= GM_GPCR_FC_TX_DIS;
+
+ /* Forward pause packets to GMAC? */
+ if (sky2->tx_pause || sky2->rx_pause)
+ sky2_write8(hw, SK_REG(port, GMAC_CTRL), GMC_PAUSE_ON);
+ else
+ sky2_write8(hw, SK_REG(port, GMAC_CTRL), GMC_PAUSE_OFF);
+
ctrl |= PHY_CT_RESET;
}
+ gma_write16(hw, port, GM_GP_CTRL, reg);
+
if (hw->chip_id != CHIP_ID_YUKON_FE)
gm_phy_write(hw, port, PHY_MARV_1000T_CTRL, ct1000);
@@ -508,6 +534,7 @@
gm_phy_write(hw, port, PHY_MARV_LED_OVER, ledover);
}
+
/* Enable phy interrupt on auto-negotiation complete (or link up) */
if (sky2->autoneg == AUTONEG_ENABLE)
gm_phy_write(hw, port, PHY_MARV_INT_MASK, PHY_M_IS_AN_COMPL);
@@ -570,49 +597,11 @@
gm_phy_read(hw, 1, PHY_MARV_INT_MASK) != 0);
}
- if (sky2->autoneg == AUTONEG_DISABLE) {
- reg = gma_read16(hw, port, GM_GP_CTRL);
- reg |= GM_GPCR_AU_ALL_DIS;
- gma_write16(hw, port, GM_GP_CTRL, reg);
- gma_read16(hw, port, GM_GP_CTRL);
-
- switch (sky2->speed) {
- case SPEED_1000:
- reg &= ~GM_GPCR_SPEED_100;
- reg |= GM_GPCR_SPEED_1000;
- break;
- case SPEED_100:
- reg &= ~GM_GPCR_SPEED_1000;
- reg |= GM_GPCR_SPEED_100;
- break;
- case SPEED_10:
- reg &= ~(GM_GPCR_SPEED_1000 | GM_GPCR_SPEED_100);
- break;
- }
-
- if (sky2->duplex == DUPLEX_FULL)
- reg |= GM_GPCR_DUP_FULL;
-
- /* turn off pause in 10/100mbps half duplex */
- else if (sky2->speed != SPEED_1000 &&
- hw->chip_id != CHIP_ID_YUKON_EC_U)
- sky2->tx_pause = sky2->rx_pause = 0;
- } else
- reg = GM_GPCR_SPEED_1000 | GM_GPCR_SPEED_100 | GM_GPCR_DUP_FULL;
-
- if (!sky2->tx_pause && !sky2->rx_pause) {
- sky2_write32(hw, SK_REG(port, GMAC_CTRL), GMC_PAUSE_OFF);
- reg |=
- GM_GPCR_FC_TX_DIS | GM_GPCR_FC_RX_DIS | GM_GPCR_AU_FCT_DIS;
- } else if (sky2->tx_pause && !sky2->rx_pause) {
- /* disable Rx flow-control */
- reg |= GM_GPCR_FC_RX_DIS | GM_GPCR_AU_FCT_DIS;
- }
-
- gma_write16(hw, port, GM_GP_CTRL, reg);
-
sky2_read16(hw, SK_REG(port, GMAC_IRQ_SRC));
+ /* Enable Transmit FIFO Underrun */
+ sky2_write8(hw, SK_REG(port, GMAC_IRQ_MSK), GMAC_DEF_MSK);
+
spin_lock_bh(&sky2->phy_lock);
sky2_phy_init(hw, port);
spin_unlock_bh(&sky2->phy_lock);
@@ -1529,40 +1518,10 @@
unsigned port = sky2->port;
u16 reg;
- /* Enable Transmit FIFO Underrun */
- sky2_write8(hw, SK_REG(port, GMAC_IRQ_MSK), GMAC_DEF_MSK);
-
- reg = gma_read16(hw, port, GM_GP_CTRL);
- if (sky2->autoneg == AUTONEG_DISABLE) {
- reg |= GM_GPCR_AU_ALL_DIS;
-
- /* Is write/read necessary? Copied from sky2_mac_init */
- gma_write16(hw, port, GM_GP_CTRL, reg);
- gma_read16(hw, port, GM_GP_CTRL);
-
- switch (sky2->speed) {
- case SPEED_1000:
- reg &= ~GM_GPCR_SPEED_100;
- reg |= GM_GPCR_SPEED_1000;
- break;
- case SPEED_100:
- reg &= ~GM_GPCR_SPEED_1000;
- reg |= GM_GPCR_SPEED_100;
- break;
- case SPEED_10:
- reg &= ~(GM_GPCR_SPEED_1000 | GM_GPCR_SPEED_100);
- break;
- }
- } else
- reg &= ~GM_GPCR_AU_ALL_DIS;
-
- if (sky2->duplex == DUPLEX_FULL || sky2->autoneg == AUTONEG_ENABLE)
- reg |= GM_GPCR_DUP_FULL;
-
/* enable Rx/Tx */
+ reg = gma_read16(hw, port, GM_GP_CTRL);
reg |= GM_GPCR_RX_ENA | GM_GPCR_TX_ENA;
gma_write16(hw, port, GM_GP_CTRL, reg);
- gma_read16(hw, port, GM_GP_CTRL);
gm_phy_write(hw, port, PHY_MARV_INT_MASK, PHY_M_DEF_MSK);
@@ -1616,7 +1575,6 @@
reg = gma_read16(hw, port, GM_GP_CTRL);
reg &= ~(GM_GPCR_RX_ENA | GM_GPCR_TX_ENA);
gma_write16(hw, port, GM_GP_CTRL, reg);
- gma_read16(hw, port, GM_GP_CTRL); /* PCI post */
if (sky2->rx_pause && !sky2->tx_pause) {
/* restore Asymmetric Pause bit */
@@ -1633,6 +1591,7 @@
if (netif_msg_link(sky2))
printk(KERN_INFO PFX "%s: Link is down.\n", sky2->netdev->name);
+
sky2_phy_init(hw, port);
}
@@ -1673,8 +1632,11 @@
sky2->rx_pause = (aux & PHY_M_PS_RX_P_EN) != 0;
sky2->tx_pause = (aux & PHY_M_PS_TX_P_EN) != 0;
- if ((sky2->tx_pause || sky2->rx_pause)
- && !(sky2->speed < SPEED_1000 && sky2->duplex == DUPLEX_HALF))
+ if (sky2->duplex == DUPLEX_HALF && sky2->speed != SPEED_1000
+ && hw->chip_id != CHIP_ID_YUKON_EC_U)
+ sky2->rx_pause = sky2->tx_pause = 0;
+
+ if (sky2->rx_pause || sky2->tx_pause)
sky2_write8(hw, SK_REG(port, GMAC_CTRL), GMC_PAUSE_ON);
else
sky2_write8(hw, SK_REG(port, GMAC_CTRL), GMC_PAUSE_OFF);
@@ -1700,7 +1662,7 @@
printk(KERN_INFO PFX "%s: phy interrupt status 0x%x 0x%x\n",
sky2->netdev->name, istatus, phystat);
- if (istatus & PHY_M_IS_AN_COMPL) {
+ if (sky2->autoneg == AUTONEG_ENABLE && (istatus & PHY_M_IS_AN_COMPL)) {
if (sky2_autoneg_done(sky2, phystat) == 0)
sky2_link_up(sky2);
goto out;
@@ -2890,7 +2852,6 @@
struct ethtool_pauseparam *ecmd)
{
struct sky2_port *sky2 = netdev_priv(dev);
- int err = 0;
sky2->autoneg = ecmd->autoneg;
sky2->tx_pause = ecmd->tx_pause != 0;
@@ -2898,7 +2859,7 @@
sky2_phy_reinit(sky2);
- return err;
+ return 0;
}
static int sky2_get_coalesce(struct net_device *dev,
--
Stephen Hemminger <shemminger@osdl.org>
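For readers who want to follow the forced-settings branch of the patch without reading the diff, here is a rough Python model of how it builds the GM_GP_CTRL value when autonegotiation is disabled. This is a sketch under stated assumptions: the flag names mirror the GM_GPCR_* identifiers in the diff, but the bit positions are placeholders (not the hardware's), and `forced_gp_ctrl` is a hypothetical name, not a driver function.

```python
from enum import IntFlag, auto


class Gpcr(IntFlag):
    """Stand-ins for the GM_GPCR_* bits; the numeric values are placeholders."""
    AU_ALL_DIS = auto()
    SPEED_100 = auto()
    SPEED_1000 = auto()
    DUP_FULL = auto()
    FC_RX_DIS = auto()
    FC_TX_DIS = auto()


def forced_gp_ctrl(speed, full_duplex, rx_pause, tx_pause, is_ec_u=False):
    """Model the forced speed/duplex branch added to sky2_phy_init.

    Returns (reg, rx_pause, tx_pause); the pause flags can be cleared here,
    as in the patch, for half duplex at 10/100 on chips other than EC_U.
    """
    # Disable auto update for duplex, flow control and speed.
    reg = Gpcr.AU_ALL_DIS
    if speed == 1000:
        reg |= Gpcr.SPEED_1000
    elif speed == 100:
        reg |= Gpcr.SPEED_100

    if full_duplex:
        reg |= Gpcr.DUP_FULL
    elif speed != 1000 and not is_ec_u:
        # Turn off flow control for 10/100 Mbps half duplex.
        rx_pause = tx_pause = False

    if not rx_pause:
        reg |= Gpcr.FC_RX_DIS
    if not tx_pause:
        reg |= Gpcr.FC_TX_DIS
    return reg, rx_pause, tx_pause


if __name__ == "__main__":
    # Roughly what `ethtool -A eth1 autoneg off tx off` asks for at 1000/full.
    print(forced_gp_ctrl(1000, True, rx_pause=True, tx_pause=False))
```

The point of the model is that, with the patch, a forced "tx off" ends up as an explicit TX flow-control disable bit in the register value instead of being overwritten later.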
* Re: sky2 hangs, hw csum errors with 2.6.18
2006-09-22 18:50 ` Stephen Hemminger
@ 2006-10-03 18:21 ` Martin Lucina
2006-10-03 18:35 ` Stephen Hemminger
0 siblings, 1 reply; 15+ messages in thread
From: Martin Lucina @ 2006-10-03 18:21 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev
Hi Stephen,
I'm still getting tx timeouts even after applying the patch you sent me
and forcing tx flow control off:
Sep 28 20:35:53 dezo kernel: NETDEV WATCHDOG: eth1: transmit timed out
Sep 28 20:35:53 dezo kernel: sky2 eth1: tx timeout
Sep 28 20:35:53 dezo kernel: sky2 eth1: transmit ring 420 .. 379 report=420 done=420
Sep 28 20:35:53 dezo kernel: sky2 hardware hung? flushing
Sep 28 20:50:28 dezo kernel: NETDEV WATCHDOG: eth1: transmit timed out
Sep 28 20:50:28 dezo kernel: sky2 eth1: tx timeout
Sep 28 20:50:28 dezo kernel: sky2 eth1: transmit ring 379 .. 338 report=420 done=420
Sep 28 20:50:28 dezo kernel: sky2 status report lost?
Sep 28 20:51:53 dezo kernel: NETDEV WATCHDOG: eth1: transmit timed out
Sep 28 20:51:53 dezo kernel: sky2 eth1: tx timeout
Sep 28 20:51:53 dezo kernel: sky2 eth1: transmit ring 420 .. 379 report=420 done=420
Sep 28 20:51:53 dezo kernel: sky2 hardware hung? flushing
Sep 28 21:08:03 dezo kernel: NETDEV WATCHDOG: eth1: transmit timed out
Sep 28 21:08:03 dezo kernel: sky2 eth1: tx timeout
Sep 28 21:08:03 dezo kernel: sky2 eth1: transmit ring 379 .. 338 report=420 done=420
Sep 28 21:08:03 dezo kernel: sky2 status report lost?
Sep 28 21:09:28 dezo kernel: NETDEV WATCHDOG: eth1: transmit timed out
Sep 28 21:09:28 dezo kernel: sky2 eth1: tx timeout
Sep 28 21:09:28 dezo kernel: sky2 eth1: transmit ring 420 .. 379 report=420 done=420
Sep 28 21:09:28 dezo kernel: sky2 hardware hung? flushing
Sep 28 21:25:18 dezo kernel: NETDEV WATCHDOG: eth1: transmit timed out
Sep 28 21:25:18 dezo kernel: sky2 eth1: tx timeout
Sep 28 21:25:18 dezo kernel: sky2 eth1: transmit ring 379 .. 338 report=420 done=420
Sep 28 21:25:18 dezo kernel: sky2 status report lost?
Sep 28 21:26:38 dezo kernel: NETDEV WATCHDOG: eth1: transmit timed out
Sep 28 21:26:38 dezo kernel: sky2 eth1: tx timeout
Sep 28 21:26:38 dezo kernel: sky2 eth1: transmit ring 420 .. 379 report=420 done=420
Sep 28 21:26:38 dezo kernel: sky2 hardware hung? flushing
Sep 28 21:41:42 dezo kernel: NETDEV WATCHDOG: eth1: transmit timed out
Sep 28 21:41:42 dezo kernel: sky2 eth1: tx timeout
Sep 28 21:41:42 dezo kernel: sky2 eth1: transmit ring 379 .. 338 report=420 done=420
Sep 28 21:41:42 dezo kernel: sky2 status report lost?
Sep 28 21:42:57 dezo kernel: NETDEV WATCHDOG: eth1: transmit timed out
Sep 28 21:42:57 dezo kernel: sky2 eth1: tx timeout
Sep 28 21:42:57 dezo kernel: sky2 eth1: transmit ring 420 .. 379 report=420 done=420
Sep 28 21:42:57 dezo kernel: sky2 hardware hung? flushing
Sep 28 21:49:19 dezo kernel: sky2 eth1: disabling interface
Sep 28 21:49:19 dezo kernel: sky2 eth1: enabling interface
Sep 28 21:49:22 dezo kernel: sky2 eth1: Link is up at 1000 Mbps, full duplex, flow control rx
Sep 28 21:49:30 dezo kernel: eth1: no IPv6 routers present
This appears to have resulted in a system hang some time later, since
the machine stopped responding and the last message I see in the syslog
is:
Sep 28 22:09:22 dezo -- MARK --
The box was dead when I arrived at the office the next morning.
Also, I'm seeing a bunch of messages like this (in addition to the hw
csum failures I mentioned in my original email):
Oct 2 16:42:17 dezo kernel: SKB BUG: Invalid truesize (3944) len=12745, sizeof(sk_buff)=232
Oct 2 16:42:29 dezo kernel: SKB BUG: Invalid truesize (3944) len=16384, sizeof(sk_buff)=232
Oct 2 16:42:29 dezo last message repeated 11 times
Oct 3 17:20:54 dezo kernel: SKB BUG: Invalid truesize (3944) len=16384, sizeof(sk_buff)=232
Oct 3 17:21:56 dezo last message repeated 4 times
Oct 3 17:21:56 dezo last message repeated 3 times
Oct 3 17:23:46 dezo last message repeated 2 times
Oct 3 20:03:28 dezo kernel: SKB BUG: Invalid truesize (3944) len=16384, sizeof(sk_buff)=232
Oct 3 20:03:28 dezo last message repeated 16 times
After which some open TCP sockets between dezo and another box (also
with sky2) start running really slowly.
Not sure how to proceed with this - is there a newer version of sky2
than that in 2.6.18 which I can test?
-mato
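The numbers in the "SKB BUG: Invalid truesize" lines can be checked by hand. As far as I understand it, the warning fires when truesize is smaller than sizeof(sk_buff) plus len; treat that condition as an assumption about the 2.6.18 check rather than a quotation of it. A small Python sketch of the arithmetic, with hypothetical helper names:

```python
import re

# Matches e.g. "SKB BUG: Invalid truesize (3944) len=12745, sizeof(sk_buff)=232"
BUG_RE = re.compile(
    r"Invalid truesize \((\d+)\) len=(\d+), sizeof\(sk_buff\)=(\d+)"
)


def truesize_is_invalid(truesize, length, sizeof_sk_buff):
    """Assumed form of the check: truesize must cover the struct plus the data."""
    return truesize < sizeof_sk_buff + length


def max_valid_len(truesize, sizeof_sk_buff):
    """Largest len that would still pass the assumed check."""
    return truesize - sizeof_sk_buff


def parse_bug_line(line):
    """Extract (truesize, len, sizeof_sk_buff) from a log line, or None."""
    m = BUG_RE.search(line)
    return tuple(int(g) for g in m.groups()) if m else None


if __name__ == "__main__":
    line = "SKB BUG: Invalid truesize (3944) len=12745, sizeof(sk_buff)=232"
    truesize, length, size = parse_bug_line(line)
    print(truesize_is_invalid(truesize, length, size))
    print(max_valid_len(truesize, size))
```

With truesize 3944 and a 232-byte sk_buff, only lengths up to 3712 would pass under this assumption, while the logged lengths (12745 and 16384) are far larger, and also far larger than the 1500-byte MTU mentioned later in the thread.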
* Re: sky2 hangs, hw csum errors with 2.6.18
2006-10-03 18:21 ` Martin Lucina
@ 2006-10-03 18:35 ` Stephen Hemminger
2006-10-03 18:39 ` Martin Lucina
0 siblings, 1 reply; 15+ messages in thread
From: Stephen Hemminger @ 2006-10-03 18:35 UTC (permalink / raw)
To: Martin Lucina; +Cc: netdev
On Tue, 3 Oct 2006 20:21:20 +0200
Martin Lucina <mato@kotelna.sk> wrote:
> Hi Stephen,
>
> I'm still getting tx timeouts even after applying the patch you sent me
> and forcing tx flow control off:
What speed and duplex are you using?
>
> Also, I'm seeing a bunch of messages like this (in addition to the hw
> csum failures I mentioned in my original email):
>
> Oct 2 16:42:17 dezo kernel: SKB BUG: Invalid truesize (3944) len=12745, sizeof(sk_buff)=232
> Oct 2 16:42:29 dezo kernel: SKB BUG: Invalid truesize (3944) len=16384, sizeof(sk_buff)=232
> Oct 2 16:42:29 dezo last message repeated 11 times
> Oct 3 17:20:54 dezo kernel: SKB BUG: Invalid truesize (3944) len=16384, sizeof(sk_buff)=232
> Oct 3 17:21:56 dezo last message repeated 4 times
> Oct 3 17:21:56 dezo last message repeated 3 times
> Oct 3 17:23:46 dezo last message repeated 2 times
> Oct 3 20:03:28 dezo kernel: SKB BUG: Invalid truesize (3944) len=16384, sizeof(sk_buff)=232
> Oct 3 20:03:28 dezo last message repeated 16 times
>
> After which some open TCP sockets between dezo and another box (also
> with sky2) start running really slowly.
What MTU are you using?
>
> Not sure how to proceed with this - is there a newer version of sky2
> than that in 2.6.18 which I can test?
>
> -mato
--
Stephen Hemminger <shemminger@osdl.org>
* Re: sky2 hangs, hw csum errors with 2.6.18
2006-10-03 18:35 ` Stephen Hemminger
@ 2006-10-03 18:39 ` Martin Lucina
2006-10-03 19:03 ` Stephen Hemminger
0 siblings, 1 reply; 15+ messages in thread
From: Martin Lucina @ 2006-10-03 18:39 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev
shemminger@osdl.org said:
> On Tue, 3 Oct 2006 20:21:20 +0200
> Martin Lucina <mato@kotelna.sk> wrote:
>
> > Hi Stephen,
> >
> > I'm still getting tx timeouts even after applying the patch you sent me
> > and forcing tx flow control off:
>
> What speed and duplex are you using?
1000 Mbps, full duplex
> What MTU are you using?
1500
-mato
* Re: sky2 hangs, hw csum errors with 2.6.18
2006-10-03 18:39 ` Martin Lucina
@ 2006-10-03 19:03 ` Stephen Hemminger
2006-10-03 19:13 ` Martin Lucina
2006-10-03 19:15 ` Martin Lucina
0 siblings, 2 replies; 15+ messages in thread
From: Stephen Hemminger @ 2006-10-03 19:03 UTC (permalink / raw)
To: Martin Lucina; +Cc: netdev
On Tue, 3 Oct 2006 20:39:49 +0200
Martin Lucina <mato@kotelna.sk> wrote:
> shemminger@osdl.org said:
> > On Tue, 3 Oct 2006 20:21:20 +0200
> > Martin Lucina <mato@kotelna.sk> wrote:
> >
> > > Hi Stephen,
> > >
> > > I'm still getting tx timeouts even after applying the patch you sent me
> > > and forcing tx flow control off:
> >
> > What speed and duplex are you using?
>
> 1000 Mbps, full duplex
>
> > What MTU are you using?
>
> 1500
>
> -mato
Are you sure? I assume you are using the latest driver from Linus's
git repository. That version adds support for fragmented receive, but
that code shouldn't be doing anything unless MTU > page size (4K).
Maybe oversize frames are coming in, and the driver isn't handling them.
I'll check.
--
Stephen Hemminger <shemminger@osdl.org>
* Re: sky2 hangs, hw csum errors with 2.6.18
2006-10-03 19:03 ` Stephen Hemminger
@ 2006-10-03 19:13 ` Martin Lucina
2006-10-03 19:16 ` Stephen Hemminger
2006-10-03 19:15 ` Martin Lucina
1 sibling, 1 reply; 15+ messages in thread
From: Martin Lucina @ 2006-10-03 19:13 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev
shemminger@osdl.org said:
> > > What speed and duplex are you using?
> >
> > 1000 Mbps, full duplex
> >
> > > What MTU are you using?
> >
> > 1500
>
> Are you sure? I assume you are using the latest driver from Linus's
> git repository. That version adds support for fragmented receive, but
> that code shouldn't be doing anything unless MTU > page size (4K).
Absolutely positive on the MTU and speed/duplex.
!!!
As for the driver version, I'm using the one in the stock 2.6.18 kernel,
NOT the one in Linus' git.
!!!
If I should be using a newer version, please send me a patch, or tell me
a way to obtain one from the git repository that doesn't involve syncing
the whole repository (not enough intl bandwidth for that sort of thing
here).
-mato
* Re: sky2 hangs, hw csum errors with 2.6.18
2006-10-03 19:03 ` Stephen Hemminger
2006-10-03 19:13 ` Martin Lucina
@ 2006-10-03 19:15 ` Martin Lucina
1 sibling, 0 replies; 15+ messages in thread
From: Martin Lucina @ 2006-10-03 19:15 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev
Possibly related, could the TX hangs be due to extreme load / no free
memory on the machine? I just realised that my application appears to
be loading the machine to the max... (i.e. all 2GB RAM used + another
couple GB in swap)
-mato
* Re: sky2 hangs, hw csum errors with 2.6.18
2006-10-03 19:13 ` Martin Lucina
@ 2006-10-03 19:16 ` Stephen Hemminger
2006-10-03 19:23 ` Martin Lucina
0 siblings, 1 reply; 15+ messages in thread
From: Stephen Hemminger @ 2006-10-03 19:16 UTC (permalink / raw)
To: Martin Lucina; +Cc: netdev
On Tue, 3 Oct 2006 21:13:51 +0200
Martin Lucina <mato@kotelna.sk> wrote:
> shemminger@osdl.org said:
> > > > What speed and duplex are you using?
> > >
> > > 1000 Mbps, full duplex
> > >
> > > > What MTU are you using?
> > >
> > > 1500
> >
> > Are you sure? I assume you are using the latest driver from Linus's
> > git repository. That version adds support for fragmented receive, but
> > that code shouldn't be doing anything unless MTU > page size (4K).
>
> Absolutely positive on the MTU and speed/duplex.
>
> !!!
> As for the driver version, I'm using the one in the stock 2.6.18 kernel,
> NOT the one in Linus' git.
> !!!
>
> If I should be using a newer version, please send me a patch, or tell me
> a way to obtain one from the git repository that doesn't involve syncing
> the whole repository (not enough intl bandwidth for that sort of thing
> here).
>
> -mato
If you are seeing truesize errors with the stock 2.6.18 kernel, then
some other protocol must be messing with the skbs. Are you using IPv6,
PPPoE, or something like that?
--
Stephen Hemminger <shemminger@osdl.org>
* Re: sky2 hangs, hw csum errors with 2.6.18
2006-10-03 19:16 ` Stephen Hemminger
@ 2006-10-03 19:23 ` Martin Lucina
0 siblings, 0 replies; 15+ messages in thread
From: Martin Lucina @ 2006-10-03 19:23 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev
shemminger@osdl.org said:
> If you are seeing truesize errors with the stock 2.6.18 kernel, then
> some other protocol must be messing with the skbs. Are you using IPv6,
> PPPoE, or something like that?
The only thing like that running here is OpenVPN, which is using a
tun interface. There's hardly any traffic running over that, though.
Could that explain the truesize errors?
Doesn't explain the sky2 tx timeouts though...
-mato