netdev.vger.kernel.org archive mirror
* kernel panic (on DHCP discover?) in sky2 driver of 2.6.17-rc1
@ 2006-04-12 21:42 Guenther Thomsen
  2006-04-12 21:48 ` Stephen Hemminger
  2006-05-16 19:11 ` Stephen Hemminger
  0 siblings, 2 replies; 9+ messages in thread
From: Guenther Thomsen @ 2006-04-12 21:42 UTC (permalink / raw)
  To: shemminger, John W. Linville; +Cc: netdev

I'm happy to report that the version of the sky2 driver in 2.6.17-rc1 
yields line rate at low CPU utilization (as determined using ttcp).

Unfortunately, it's not quite bug-free yet ;-} 

When enabling the second interface (of the same network controller) the 
kernel panics (perhaps during DHCP discovery?):

--8<--
[root@penguin1 ~]# ifup eth1

Determining IP information for eth1...Unable to handle kernel paging 
request at ffffc20000014000 RIP:
<ffffffff811a3329>{sky2_mac_init+522}
PGD 13fc49067 PUD 13fc4a067 PMD 13fc4b067 PTE 0
Oops: 0000 [1] SMP
CPU 3os linked in: autofs4 sr_mod cdrom dm_mod button usb_storage 
uhci_hcd 11BladeRunner_sk98lin #1
RIP: 0010:[<ffffffff811a3329>] <ffffffff811a68 RCX: 000000000000001e
RDX: 0000000000004008 RSI: ffffc20000010000 11 0000000000000fe0 R12: 
0000000000001000
R13: ffff81013fae61a8 R14:
                           S: 010 DS: 0000 ES: 0000 CR0: 
000000008005003b
CR2: ffffc20000014000fe40)
Stack: 0000000000001000 ffff81013fae6000 ffff81013fae6500 
00000f1013fae6000
Call Trace: <ffffffff811a3e0c>{sky2_up+334} <ffffffff81
       <ffffffff81144cf4>{sprintf+144} 
<ffffffff8123b616>{inet_ioctl{s_ioctl+44} 
<ffffffff81085703>{sys_ioctl+107}
       <ffffffff81009a2ac_init+522} RSP <ffff81013487fd28>
CR2: ffffc20000014000
 <0>Kerne
-->8--

or (2nd try): 

--8<--
[root@penguin1 ~]# Unable to handle kernel paging request at 
ffffc20000014000 RIP:
<ffffffff811a3329>{sky2_mac_init+522}
PGD 13fc49067 PUD 13fc4a067 PMD 13fc4b067 PTE 0
Oops: 0000 [1] SMP
CPU 2
Modules linked in: autofs4 sr_mod cdrom dm_mod button usb_storage 
uhci_hcd ehci_hcd e752x_edac edac_mc shpcR: 00:[<ffffffff811a3329>] 
<ffffffff811a3329>{sky2_mac_init+522}
RDX: 0000000000004008 RSI: ffffc20000010000 RDI: 0000000000000000
R1 0000000001000
R13: ffff81013f0511a8 R14: 0000000000000001 R15: 0000S0000 CR0: 
000000008005003b
CR2: ffffc20000014000 CR3: 000000013425b000000000001000 ffff81013f051000 
ffff81013f051500 0000000000000000
   Call Trace: <ffffffff811a3e0c>{sky2_up+334} 
<ffffffff811fea04>{dev_op844cf4>{sprintf+144} 
<ffffffff8123b616>{inet_ioctl+74}
       <ffffffff81085703>{sys_ioctl+107}
       <ffffffff81009ac8>{tracesys+209}
+} RSP <ffff81013534fd28>
CR2: ffffc20000014000
 <0>Kernel panic - n
-->8--

The kernel is vanilla 2.6.17-rc1, with the sky2 driver compiled into the 
kernel. The OS is RedHat Fedora Core 4. The kernel was compiled using 
gcc32.

The system is a blade of a BladeRunner 4130 from Penguincomputing. It 
contains two Xeon CPUs (HT enabled) and an on-board Marvell 8062 network 
controller (88E8062 is stamped on the chip).

The hardware seems to work fine under 2.6.15(.7) with the sk98lin driver, 
version 8.31, from Syskonnect (skd.de).

Please let me know if I can provide further information or assist in 
any other way.

best regards
	Guenther

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: kernel panic (on DHCP discover?) in sky2 driver of 2.6.17-rc1
  2006-04-12 21:42 Guenther Thomsen
@ 2006-04-12 21:48 ` Stephen Hemminger
  2006-04-12 22:26   ` Guenther Thomsen
  2006-05-16 19:11 ` Stephen Hemminger
  1 sibling, 1 reply; 9+ messages in thread
From: Stephen Hemminger @ 2006-04-12 21:48 UTC (permalink / raw)
  To: Guenther Thomsen; +Cc: John W. Linville, netdev

You need this patch, which Jeff hasn't applied yet.
-----
Subject: sky2: crash when bringing up second port

Sky2 driver will oops referencing bad memory if used on
a dual port card.  The problem is accessing past end of
MIB counter space.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>


--- test-2.6.orig/drivers/net/sky2.c
+++ test-2.6/drivers/net/sky2.c
@@ -579,8 +579,8 @@ static void sky2_mac_init(struct sky2_hw
 	reg = gma_read16(hw, port, GM_PHY_ADDR);
 	gma_write16(hw, port, GM_PHY_ADDR, reg | GM_PAR_MIB_CLR);
 
-	for (i = 0; i < GM_MIB_CNT_SIZE; i++)
-		gma_read16(hw, port, GM_MIB_CNT_BASE + 8 * i);
+	for (i = GM_MIB_CNT_BASE; i <= GM_MIB_CNT_END; i += 4)
+		gma_read16(hw, port, i);
 	gma_write16(hw, port, GM_PHY_ADDR, reg);
 
 	/* transmit control */
--- test-2.6.orig/drivers/net/sky2.h
+++ test-2.6/drivers/net/sky2.h
@@ -1375,7 +1375,7 @@ enum {
 	GM_PHY_ADDR	= 0x0088,	/* 16 bit r/w	GPHY Address Register */
 /* MIB Counters */
 	GM_MIB_CNT_BASE	= 0x0100,	/* Base Address of MIB Counters */
-	GM_MIB_CNT_SIZE	= 256,
+	GM_MIB_CNT_END	= 0x025C,	/* Last MIB counter */
 };
 
 


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: kernel panic (on DHCP discover?) in sky2 driver of 2.6.17-rc1
  2006-04-12 21:48 ` Stephen Hemminger
@ 2006-04-12 22:26   ` Guenther Thomsen
  2006-04-17 18:18     ` Stephen Hemminger
  0 siblings, 1 reply; 9+ messages in thread
From: Guenther Thomsen @ 2006-04-12 22:26 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: John W. Linville, netdev

On Wednesday 12 April 2006 14:48, Stephen Hemminger wrote:
> You need this patch, which Jeff hasn't applied yet.
> -----
> Subject: sky2: crash when bringing up second port
>
> Sky2 driver will oops referencing bad memory if used on
> a dual port card.  The problem is accessing past end of
> MIB counter space.
>
> Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
>
>
> --- test-2.6.orig/drivers/net/sky2.c
> +++ test-2.6/drivers/net/sky2.c
> @@ -579,8 +579,8 @@ static void sky2_mac_init(struct sky2_hw
>  	reg = gma_read16(hw, port, GM_PHY_ADDR);
>  	gma_write16(hw, port, GM_PHY_ADDR, reg | GM_PAR_MIB_CLR);
>
> -	for (i = 0; i < GM_MIB_CNT_SIZE; i++)
> -		gma_read16(hw, port, GM_MIB_CNT_BASE + 8 * i);
> +	for (i = GM_MIB_CNT_BASE; i <= GM_MIB_CNT_END; i += 4)
> +		gma_read16(hw, port, i);
>  	gma_write16(hw, port, GM_PHY_ADDR, reg);
>
>  	/* transmit control */
> --- test-2.6.orig/drivers/net/sky2.h
> +++ test-2.6/drivers/net/sky2.h
> @@ -1375,7 +1375,7 @@ enum {
>  	GM_PHY_ADDR	= 0x0088,	/* 16 bit r/w	GPHY Address Register */
>  /* MIB Counters */
>  	GM_MIB_CNT_BASE	= 0x0100,	/* Base Address of MIB Counters */
> -	GM_MIB_CNT_SIZE	= 256,
> +	GM_MIB_CNT_END	= 0x025C,	/* Last MIB counter */
>  };

Thanks for the very quick response. The patch indeed prevents the panic 
when bringing up the second interface, but now the host doesn't receive 
any packets anymore. It still sends packets (ARP requests, naturally). 
If I inject the Ethernet address of a second host into the arp table of 
the test subject, ICMP Echo requests are sent, but then sendmsg's 
buffer space is exhausted (?):
--8<--
[root@penguin1 ~]# arp -s 192.168.65.67 00:A0:D1:E1:F3:2C
[root@penguin1 ~]# ping 192.168.65.67
PING 192.168.65.67 (192.168.65.67) 56(84) bytes of data.
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available

--- 192.168.65.67 ping statistics ---
19 packets transmitted, 0 received, 100% packet loss, time 37012ms
-->8--

There is no hint of a malfunction to be found in the kernel's message 
buffer.

best regards
	Guenther

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: kernel panic (on DHCP discover?) in sky2 driver of 2.6.17-rc1
  2006-04-12 22:26   ` Guenther Thomsen
@ 2006-04-17 18:18     ` Stephen Hemminger
  0 siblings, 0 replies; 9+ messages in thread
From: Stephen Hemminger @ 2006-04-17 18:18 UTC (permalink / raw)
  To: Guenther Thomsen; +Cc: John W. Linville, netdev

I don't know what you are doing differently, but my 2 port SysKonnect card
is working fine.  Running SMP AMD64 and the latest 2.6.17.

Showing full speed on both ports.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: kernel panic (on DHCP discover?) in sky2 driver of 2.6.17-rc1
  2006-04-12 21:42 Guenther Thomsen
  2006-04-12 21:48 ` Stephen Hemminger
@ 2006-05-16 19:11 ` Stephen Hemminger
  1 sibling, 0 replies; 9+ messages in thread
From: Stephen Hemminger @ 2006-05-16 19:11 UTC (permalink / raw)
  To: Guenther Thomsen; +Cc: John W. Linville, netdev

Could you try the 2.6.17-rc4 version with this patch? It turns out the board
seems to give out-of-order status responses.

Ignore the vendor sk98lin driver. When I try the stock version, it spends its
life resetting itself because it sets up the PCI bus wrong. If I fix that, it
spends its time getting confused because it can't handle intermixed status
reports properly (checksum et al. is per port, not per board).


 drivers/net/sky2.c |   28 +++++++++++++++++++++-------
 1 files changed, 21 insertions(+), 7 deletions(-)

792547bc5e8e4f7d5a1070a168056f429635c254
diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
index ffd267f..11e7914 100644
--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -1020,8 +1020,27 @@ static int sky2_up(struct net_device *de
 	struct sky2_hw *hw = sky2->hw;
 	unsigned port = sky2->port;
 	u32 ramsize, rxspace, imask;
-	int err = -ENOMEM;
+	int cap, err;
+	struct net_device *otherdev = hw->dev[sky2->port^1];
 
+	/*
+	 * Reduce split transactions (and turn off) rx checksums to
+	 * prevent problems with dual ports.
+	 */
+	if (otherdev && netif_running(otherdev) &&
+	    (cap = pci_find_capability(hw->pdev, PCI_CAP_ID_PCIX))) {
+		struct sky2_port *osky2 = netdev_priv(otherdev);
+		u16 cmd;
+
+		cmd = sky2_pci_read16(hw, cap + PCI_X_CMD);
+		cmd &= ~PCI_X_CMD_MAX_SPLIT;
+		sky2_pci_write16(hw, cap + PCI_X_CMD, cmd);
+
+		sky2->rx_csum = 0;
+		osky2->rx_csum = 0;
+	}
+
+	err = -ENOMEM;
 	if (netif_msg_ifup(sky2))
 		printk(KERN_INFO PFX "%s: enabling interface\n", dev->name);
 
@@ -3067,12 +3086,7 @@ static __devinit struct net_device *sky2
 	sky2->duplex = -1;
 	sky2->speed = -1;
 	sky2->advertising = sky2_supported_modes(hw);
-
-	/* Receive checksum disabled for Yukon XL
-	 * because of observed problems with incorrect
-	 * values when multiple packets are received in one interrupt
-	 */
-	sky2->rx_csum = (hw->chip_id != CHIP_ID_YUKON_XL);
+	sky2->rx_csum = 1;
 
 	spin_lock_init(&sky2->phy_lock);
 	sky2->tx_pending = TX_DEF_PENDING;
-- 
1.2.4


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* RE: kernel panic (on DHCP discover?) in sky2 driver of 2.6.17-rc1
@ 2006-05-16 20:15 Guenther Thomsen
  0 siblings, 0 replies; 9+ messages in thread
From: Guenther Thomsen @ 2006-05-16 20:15 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: John W. Linville, netdev

Thanks for your continued work on it. I will test the patch as soon as I get access to the hardware again (probably next week).

best regards
	Guenther 


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* RE: kernel panic (on DHCP discover?) in sky2 driver of 2.6.17-rc1
@ 2006-06-04  4:05 Guenther Thomsen
  0 siblings, 0 replies; 9+ messages in thread
From: Guenther Thomsen @ 2006-06-04  4:05 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: John W. Linville, netdev

I received the hardware back and took the opportunity to test with 2.6.17-rc5-git11. So far I have run only a few small tests (ttcp on both interfaces in, out, or mixed, with some 10e6 packets), but it looks good. No errors (well, 16 overruns in 76574513 packets) and line rate (about 111MB/s) on both channels simultaneously. Hurray!

Thanks a lot for your continued efforts.
	Guenther


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* RE: kernel panic (on DHCP discover?) in sky2 driver of 2.6.17-rc1
@ 2006-06-07 19:33 Guenther Thomsen
  2006-06-07 19:44 ` Stephen Hemminger
  0 siblings, 1 reply; 9+ messages in thread
From: Guenther Thomsen @ 2006-06-07 19:33 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: John W. Linville, netdev

I was perhaps a bit quick to declare victory. While the results below stand and the machine survived the last few days (idle), it occurred to me only today to have a look at the kernel's message buffer, where I found the following:
--8<--
sky2 eth0: enabling interface
sky2 eth0: Link is up at 1000 Mbps, full duplex, flow control none
sky2 eth1: enabling interface
sky2 eth1: Link is up at 1000 Mbps, full duplex, flow control none
audit(1149379670.514:3): audit_pid=1915 old=0 by auid=4294967295
<unknown>: hw csum failure.
sky2 eth1: rx error, status 0x7ffc0001 length 444

Call Trace: <ffffffff811de741>{__skb_checksum_complete+76}
       <ffffffff812030cb>{__tcp_checksum_complete_user+33}
       <ffffffff812080d8>{tcp_rcv_established+817} <ffffffff8120f3ee>{tcp_v4_
do_rcv+43}
       <ffffffff811da2ee>{sk_wait_data+203} <ffffffff811fe5a8>{tcp_prequeue_p
rocess+121}
       <ffffffff811ff71d>{tcp_recvmsg+1104} <ffffffff811d9712>{sock_common_re
cvmsg+48}
       <ffffffff811d7d4f>{do_sock_read+209} <ffffffff811d7e7e>{sock_aio_read+
83}
       <ffffffff811e2ca1>{dev_queue_xmit+0} <ffffffff8106dce9>{do_sync_read+1
99}
       <ffffffff8103d699>{remove_wait_queue+18} <ffffffff8103d530>{autoremove
_wake_function+0}
       <ffffffff8106df83>{vfs_read+228} <ffffffff8106ea12>{sys_read+69}
       <ffffffff81009b0d>{tracesys+209}
<unknown>: hw csum failure.
sky2 eth1: rx error, status 0x7ffc0001 length 444


Call Trace: <ffffffff811de741>{__skb_checksum_complete+76}
       <ffffffff812030cb>{__tcp_checksum_complete_user+33}
       <ffffffff812080d8>{tcp_rcv_established+817} <ffffffff8120f3ee>{tcp_v4_
do_rcv+43}
       <ffffffff811da2ee>{sk_wait_data+203} <ffffffff811fe5a8>{tcp_prequeue_p
rocess+121}
       <ffffffff811ff71d>{tcp_recvmsg+1104} <ffffffff811d9712>{sock_common_re
cvmsg+48}
       <ffffffff811d7696>{alloc_sock_iocb+20} <ffffffff811d7d4f>{do_sock_read
+209}
       <ffffffff811d7e7e>{sock_aio_read+83} <ffffffff8106dce9>{do_sync_read+1
99}
       <ffffffff8103d699>{remove_wait_queue+18} <ffffffff8103d530>{autoremove
_wake_function+0}
       <ffffffff8106df83>{vfs_read+228} <ffffffff8106ea12>{sys_read+69}
       <ffffffff81009b0d>{tracesys+209}
<unknown>: hw csum failure.
sky2 eth1: rx error, status 0x7ffc0001 length 444


Call Trace: <ffffffff811de741>{__skb_checksum_complete+76}
       <ffffffff812030cb>{__tcp_checksum_complete_user+33}
       <ffffffff812080d8>{tcp_rcv_established+817} <ffffffff8120f3ee>{tcp_v4_
do_rcv+43}
       <ffffffff811da2ee>{sk_wait_data+203} <ffffffff811fe5a8>{tcp_prequeue_p
rocess+121}
       <ffffffff811ff71d>{tcp_recvmsg+1104} <ffffffff811d9712>{sock_common_re
cvmsg+48}
       <ffffffff811d7d4f>{do_sock_read+209} <ffffffff811d7e7e>{sock_aio_read+
83}
       <ffffffff8106dce9>{do_sync_read+199} <ffffffff8103d699>{remove_wait_qu
eue+18}
       <ffffffff8103d530>{autoremove_wake_function+0} <ffffffff8102f1a0>{curr
ent_kernel_time+13}
       <ffffffff8106df83>{vfs_read+228} <ffffffff8106ea12>{sys_read+69}
       <ffffffff81009b0d>{tracesys+209}

sky2 eth0: rx error, status 0x7ffc0001 length 444
sky2 eth0: rx error, status 0x7ffc0001 length 444
sky2 eth1: rx error, status 0x7ffc0001 length 444
sky2 eth1: rx error, status 0x7ffc0001 length 444
sky2 eth1: rx error, status 0x7ffc0001 length 444
-->8--
Looks like we're almost, but not quite, there yet.

cheers
    Guenther


-----Original Message-----
From: Guenther Thomsen 
Sent: Saturday, June 03, 2006 9:06 PM
To: 'Stephen Hemminger'
Cc: John W. Linville; netdev@vger.kernel.org
Subject: RE: kernel panic (on DHCP discover?) in sky2 driver of
2.6.17-rc1


I received the hardware back and took the opportunity to test with 2.6.17-rc5-git11. So far I have run only a few small tests (ttcp on both interfaces in, out, or mixed, with some 10e6 packets), but it looks good. No errors (well, 16 overruns in 76574513 packets) and line rate (about 111MB/s) on both channels simultaneously. Hurray!

Thanks a lot for your continued efforts.
	Guenther


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: kernel panic (on DHCP discover?) in sky2 driver of 2.6.17-rc1
  2006-06-07 19:33 kernel panic (on DHCP discover?) in sky2 driver of 2.6.17-rc1 Guenther Thomsen
@ 2006-06-07 19:44 ` Stephen Hemminger
  0 siblings, 0 replies; 9+ messages in thread
From: Stephen Hemminger @ 2006-06-07 19:44 UTC (permalink / raw)
  To: Guenther Thomsen; +Cc: John W. Linville, netdev

On Wed, 7 Jun 2006 12:33:21 -0700
"Guenther Thomsen" <GThomsen@bluearc.com> wrote:

> I was perhaps a bit quick to declare victory. While the results below stand and the machine survived the last few days (idle), it occurred to me only today, to have a look at the kernel's message buffer, where I found following:
> --8<--
> sky2 eth0: enabling interface
> sky2 eth0: Link is up at 1000 Mbps, full duplex, flow control none
> sky2 eth1: enabling interface
> sky2 eth1: Link is up at 1000 Mbps, full duplex, flow control none
> audit(1149379670.514:3): audit_pid=1915 old=0 by auid=4294967295
> <unknown>: hw csum failure.
> sky2 eth1: rx error, status 0x7ffc0001 length 444
> 
> Call Trace: <ffffffff811de741>{__skb_checksum_complete+76}
>        <ffffffff812030cb>{__tcp_checksum_complete_user+33}
>        <ffffffff812080d8>{tcp_rcv_established+817} <ffffffff8120f3ee>{tcp_v4_
> do_rcv+43}
>        <ffffffff811da2ee>{sk_wait_data+203} <ffffffff811fe5a8>{tcp_prequeue_p
> rocess+121}
>        <ffffffff811ff71d>{tcp_recvmsg+1104} <ffffffff811d9712>{sock_common_re
> cvmsg+48}
>        <ffffffff811d7d4f>{do_sock_read+209} <ffffffff811d7e7e>{sock_aio_read+
> 83}
>        <ffffffff811e2ca1>{dev_queue_xmit+0} <ffffffff8106dce9>{do_sync_read+1
> 99}
>        <ffffffff8103d699>{remove_wait_queue+18} <ffffffff8103d530>{autoremove
> _wake_function+0}
>        <ffffffff8106df83>{vfs_read+228} <ffffffff8106ea12>{sys_read+69}
>        <ffffffff81009b0d>{tracesys+209}
> <unknown>: hw csum failure.
> sky2 eth1: rx error, status 0x7ffc0001 length 444

Different problem, I have seen it before.  Basically if the receiver gets overloaded, the
packet FIFO gets full. The driver needs to have some kind of recovery logic for this;
probably just shutting down the receiver and restarting.

^ permalink raw reply	[flat|nested] 9+ messages in thread
