Netdev List
 help / color / mirror / Atom feed
* [PATCH 2/5] forcedeth: checksum fix
From: Ayaz Abdulla @ 2008-01-13 21:02 UTC (permalink / raw)
  To: Jeff Garzik, Manfred Spraul, Andrew Morton, nedev

[-- Attachment #1: Type: text/plain, Size: 625 bytes --]

The driver should inform the stack when checksum has been performed by 
the HW when both IP and TCP (or UDP) checksum flags are indicated by HW.

Previously, it would also inform the stack when only IP checksum flag 
was indicated by HW. This can cause data corruption when IP fragments 
are used. The IP Identification field can wrap around and cause data 
from new fragments to fill into older fragment slots with same IP Id. 
The stack would then not perform TCP/UDP checksum (after re-assembly of 
all fragments) since driver falsely stated it was already calculated.

Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>



[-- Attachment #2: patch-forcedeth-checksum --]
[-- Type: text/plain, Size: 1826 bytes --]

--- old/drivers/net/forcedeth.c	2008-01-13 15:01:36.000000000 -0500
+++ new/drivers/net/forcedeth.c	2008-01-13 15:08:31.000000000 -0500
@@ -471,9 +471,9 @@
 #define NV_RX_AVAIL		(1<<31)
 
 #define NV_RX2_CHECKSUMMASK	(0x1C000000)
-#define NV_RX2_CHECKSUMOK1	(0x10000000)
-#define NV_RX2_CHECKSUMOK2	(0x14000000)
-#define NV_RX2_CHECKSUMOK3	(0x18000000)
+#define NV_RX2_CHECKSUM_IP	(0x10000000)
+#define NV_RX2_CHECKSUM_IP_TCP	(0x14000000)
+#define NV_RX2_CHECKSUM_IP_UDP	(0x18000000)
 #define NV_RX2_DESCRIPTORVALID	(1<<29)
 #define NV_RX2_SUBSTRACT1	(1<<25)
 #define NV_RX2_ERROR1		(1<<18)
@@ -2375,14 +2375,9 @@
 						goto next_pkt;
 					}
 				}
-				if ((flags & NV_RX2_CHECKSUMMASK) == NV_RX2_CHECKSUMOK2)/*ip and tcp */ {
+				if (((flags & NV_RX2_CHECKSUMMASK) == NV_RX2_CHECKSUM_IP_TCP) || /*ip and tcp */
+				    ((flags & NV_RX2_CHECKSUMMASK) == NV_RX2_CHECKSUM_IP_UDP))   /*ip and udp */
 					skb->ip_summed = CHECKSUM_UNNECESSARY;
-				} else {
-					if ((flags & NV_RX2_CHECKSUMMASK) == NV_RX2_CHECKSUMOK1 ||
-					    (flags & NV_RX2_CHECKSUMMASK) == NV_RX2_CHECKSUMOK3) {
-						skb->ip_summed = CHECKSUM_UNNECESSARY;
-					}
-				}
 			} else {
 				dev_kfree_skb(skb);
 				goto next_pkt;
@@ -2474,14 +2469,9 @@
 				}
 			}
 
-			if ((flags & NV_RX2_CHECKSUMMASK) == NV_RX2_CHECKSUMOK2)/*ip and tcp */ {
+			if (((flags & NV_RX2_CHECKSUMMASK) == NV_RX2_CHECKSUM_IP_TCP) || /*ip and tcp */
+			    ((flags & NV_RX2_CHECKSUMMASK) == NV_RX2_CHECKSUM_IP_UDP))   /*ip and udp */
 				skb->ip_summed = CHECKSUM_UNNECESSARY;
-			} else {
-				if ((flags & NV_RX2_CHECKSUMMASK) == NV_RX2_CHECKSUMOK1 ||
-				    (flags & NV_RX2_CHECKSUMMASK) == NV_RX2_CHECKSUMOK3) {
-					skb->ip_summed = CHECKSUM_UNNECESSARY;
-				}
-			}
 
 			/* got a valid packet - forward it to the network core */
 			skb_put(skb, len);

^ permalink raw reply

* Re: [BUG] skge 0000:02:05: read data parity error
From: Stephen Hemminger @ 2008-01-14 22:37 UTC (permalink / raw)
  To: Oliver Pinter (Pintér Olivér)
  Cc: Adrian Bunk, linux-kernel, netdev, gregkh
In-Reply-To: <6101e8c40801141358la75d1c6vb100247dbbd651a@mail.gmail.com>

On Mon, 14 Jan 2008 22:58:51 +0100
"Oliver Pinter (Pintér Olivér)" <oliver.pntr@gmail.com> wrote:

> I "tested" other devices resources file, and only with skge freezed
> the system. from this think, that is skge driver bug
> 

Its a property of the hardware, and the current device model has
no way to stop it.  Other devices are worse and can die if you access
pci config space.




-- 
Stephen Hemminger <stephen.hemminger@vyatta.com>

^ permalink raw reply

* [PATCH 1/5] forcedeth: reset register fix
From: Ayaz Abdulla @ 2008-01-13 21:02 UTC (permalink / raw)
  To: Jeff Garzik, Manfred Spraul, Andrew Morton, nedev

[-- Attachment #1: Type: text/plain, Size: 118 bytes --]

This patch fixes the reset register definition from 0x3C to 0x34.

Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>


[-- Attachment #2: patch-forcedeth-reset-register --]
[-- Type: text/plain, Size: 379 bytes --]

--- old/drivers/net/forcedeth.c	2008-01-13 15:00:43.000000000 -0500
+++ new/drivers/net/forcedeth.c	2008-01-13 15:00:48.000000000 -0500
@@ -226,7 +226,7 @@
 #define NVREG_MISC1_HD		0x02
 #define NVREG_MISC1_FORCE	0x3b0f3c
 
-	NvRegMacReset = 0x3c,
+	NvRegMacReset = 0x34,
 #define NVREG_MAC_RESET_ASSERT	0x0F3
 	NvRegTransmitterControl = 0x084,
 #define NVREG_XMITCTL_START	0x01

^ permalink raw reply

* [PATCH 3/5] forcedeth: updated copyright section
From: Ayaz Abdulla @ 2008-01-13 21:02 UTC (permalink / raw)
  To: Jeff Garzik, Manfred Spraul, Andrew Morton, nedev

[-- Attachment #1: Type: text/plain, Size: 119 bytes --]

This patch updates the copyright section to include 2007 and 2008.

Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>


[-- Attachment #2: patch-forcedeth-copyright --]
[-- Type: text/plain, Size: 605 bytes --]

--- old/drivers/net/forcedeth.c	2008-01-13 15:10:18.000000000 -0500
+++ new/drivers/net/forcedeth.c	2008-01-13 15:11:47.000000000 -0500
@@ -13,7 +13,7 @@
  * Copyright (C) 2004 Andrew de Quincey (wol support)
  * Copyright (C) 2004 Carl-Daniel Hailfinger (invalid MAC handling, insane
  *		IRQ rate fixes, bigendian fixes, cleanups, verification)
- * Copyright (c) 2004,5,6 NVIDIA Corporation
+ * Copyright (c) 2004,2005,2006,2007,2008 NVIDIA Corporation
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by

^ permalink raw reply

* [PATCH 4/5] forcedeth: tx pause fix
From: Ayaz Abdulla @ 2008-01-13 21:03 UTC (permalink / raw)
  To: Jeff Garzik, Manfred Spraul, Andrew Morton, netdev

[-- Attachment #1: Type: text/plain, Size: 169 bytes --]

This patch fixes the tx pause enable watermark flags. The new values 
where determined to be optimal during testing.

Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>


[-- Attachment #2: patch-forcedeth-txpause --]
[-- Type: text/plain, Size: 547 bytes --]

--- old/drivers/net/forcedeth.c	2008-01-13 15:12:16.000000000 -0500
+++ new/drivers/net/forcedeth.c	2008-01-13 15:14:55.000000000 -0500
@@ -316,8 +316,8 @@
 	NvRegTxRingPhysAddrHigh = 0x148,
 	NvRegRxRingPhysAddrHigh = 0x14C,
 	NvRegTxPauseFrame = 0x170,
-#define NVREG_TX_PAUSEFRAME_DISABLE	0x1ff0080
-#define NVREG_TX_PAUSEFRAME_ENABLE	0x0c00030
+#define NVREG_TX_PAUSEFRAME_DISABLE	0x01ff0080
+#define NVREG_TX_PAUSEFRAME_ENABLE	0x01800010
 	NvRegMIIStatus = 0x180,
 #define NVREG_MIISTAT_ERROR		0x0001
 #define NVREG_MIISTAT_LINKCHANGE	0x0008

^ permalink raw reply

* [PATCH 5/5] forcedeth: multicast fix
From: Ayaz Abdulla @ 2008-01-13 21:03 UTC (permalink / raw)
  To: Jeff Garzik, Manfred Spraul, Andrew Morton, nedev

[-- Attachment #1: Type: text/plain, Size: 222 bytes --]

This patch fixes the case where no multicast addresses are requested to 
be added to the multicast filter. The multicast mask must be set to all 
1's instead of all 0's.

Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>


[-- Attachment #2: patch-forcedeth-multicast --]
[-- Type: text/plain, Size: 1707 bytes --]

--- old/drivers/net/forcedeth.c	2008-01-13 15:15:22.000000000 -0500
+++ new/drivers/net/forcedeth.c	2008-01-13 15:24:22.000000000 -0500
@@ -277,7 +277,9 @@
 #define NVREG_MCASTADDRA_FORCE	0x01
 	NvRegMulticastAddrB = 0xB4,
 	NvRegMulticastMaskA = 0xB8,
+#define NVREG_MCASTMASKA_NONE		0xffffffff
 	NvRegMulticastMaskB = 0xBC,
+#define NVREG_MCASTMASKB_NONE		0xffff
 
 	NvRegPhyInterface = 0xC0,
 #define PHY_RGMII		0x10000000
@@ -2693,6 +2695,9 @@
 			addr[1] = alwaysOn[1];
 			mask[0] = alwaysOn[0] | alwaysOff[0];
 			mask[1] = alwaysOn[1] | alwaysOff[1];
+		} else {
+			mask[0] = NVREG_MCASTMASKA_NONE;
+			mask[1] = NVREG_MCASTMASKB_NONE;
 		}
 	}
 	addr[0] |= NVREG_MCASTADDRA_FORCE;
@@ -4803,8 +4808,8 @@
 		nv_mac_reset(dev);
 	writel(NVREG_MCASTADDRA_FORCE, base + NvRegMulticastAddrA);
 	writel(0, base + NvRegMulticastAddrB);
-	writel(0, base + NvRegMulticastMaskA);
-	writel(0, base + NvRegMulticastMaskB);
+	writel(NVREG_MCASTMASKA_NONE, base + NvRegMulticastMaskA);
+	writel(NVREG_MCASTMASKB_NONE, base + NvRegMulticastMaskB);
 	writel(0, base + NvRegPacketFilterFlags);
 
 	writel(0, base + NvRegTransmitterControl);
@@ -4898,8 +4903,8 @@
 	spin_lock_irq(&np->lock);
 	writel(NVREG_MCASTADDRA_FORCE, base + NvRegMulticastAddrA);
 	writel(0, base + NvRegMulticastAddrB);
-	writel(0, base + NvRegMulticastMaskA);
-	writel(0, base + NvRegMulticastMaskB);
+	writel(NVREG_MCASTMASKA_NONE, base + NvRegMulticastMaskA);
+	writel(NVREG_MCASTMASKB_NONE, base + NvRegMulticastMaskB);
 	writel(NVREG_PFF_ALWAYS|NVREG_PFF_MYADDR, base + NvRegPacketFilterFlags);
 	/* One manual link speed update: Interrupts are enabled, future link
 	 * speed changes cause interrupts and are handled by nv_link_irq().

^ permalink raw reply

* Re: [patch for 2.6.24? 1/1] bonding: locking fix
From: Andrew Morton @ 2008-01-14 22:47 UTC (permalink / raw)
  To: Krzysztof Oledzki; +Cc: davem, jeff, shemminger, Jay Vosburgh, netdev
In-Reply-To: <Pine.LNX.4.64.0801142313350.26034@bizon.gios.gov.pl>


(cc's added)

On Mon, 14 Jan 2008 23:14:25 +0100 (CET)
Krzysztof Oledzki <olel@ans.pl> wrote:

> 
> 
> On Mon, 14 Jan 2008, akpm@linux-foundation.org wrote:
> 
> > From: Andrew Morton <akpm@linux-foundation.org>
> >
> > Remove stray rtnl_unlock().
> >
> > Addresses http://bugzilla.kernel.org/show_bug.cgi?id=9542
> >
> > Cc: "David S. Miller" <davem@davemloft.net>
> > Cc: Stephen Hemminger <shemminger@linux-foundation.org>
> > Cc: Krzysztof Oledzki <olel@ans.pl>
> > Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> > ---
> >
> > drivers/net/bonding/bond_sysfs.c |    2 --
> > 1 file changed, 2 deletions(-)
> >
> > diff -puN drivers/net/bonding/bond_sysfs.c~bonding-locking-fix drivers/net/bonding/bond_sysfs.c
> > --- a/drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
> > +++ a/drivers/net/bonding/bond_sysfs.c
> > @@ -1111,8 +1111,6 @@ static ssize_t bonding_store_primary(str
> > out:
> > 	write_unlock_bh(&bond->lock);
> >
> > -	rtnl_unlock();
> > -
> > 	return count;
> > }
> > static DEVICE_ATTR(primary, S_IRUGO | S_IWUSR, bonding_show_primary, bonding_store_primary);
> > _
> >
> 
> http://thread.gmane.org/gmane.linux.network/82566/focus=83142
> 

That's bond_lock.

This patch (below) addresses what appears to me to be an obvious
imbalance in rtnl_lock.

I don't care how it's fixed, really.  Someone please fix it?




From: Andrew Morton <akpm@linux-foundation.org>

Remove stray rtnl_unlock().

Addresses http://bugzilla.kernel.org/show_bug.cgi?id=9542

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Stephen Hemminger <shemminger@linux-foundation.org>
Cc: Krzysztof Oledzki <olel@ans.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 drivers/net/bonding/bond_sysfs.c |    2 --
 1 file changed, 2 deletions(-)

diff -puN drivers/net/bonding/bond_sysfs.c~bonding-locking-fix drivers/net/bonding/bond_sysfs.c
--- a/drivers/net/bonding/bond_sysfs.c~bonding-locking-fix
+++ a/drivers/net/bonding/bond_sysfs.c
@@ -1111,8 +1111,6 @@ static ssize_t bonding_store_primary(str
 out:
 	write_unlock_bh(&bond->lock);
 
-	rtnl_unlock();
-
 	return count;
 }
 static DEVICE_ATTR(primary, S_IRUGO | S_IWUSR, bonding_show_primary, bonding_store_primary);
_


^ permalink raw reply

* Re: [patch for 2.6.24? 1/1] bonding: locking fix
From: Jay Vosburgh @ 2008-01-14 23:01 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Krzysztof Oledzki, davem, jeff, shemminger, netdev
In-Reply-To: <20080114144754.29a448ad.akpm@linux-foundation.org>

Andrew Morton <akpm@linux-foundation.org> wrote:
[...]
>That's bond_lock.
>
>This patch (below) addresses what appears to me to be an obvious
>imbalance in rtnl_lock.
>
>I don't care how it's fixed, really.  Someone please fix it?

	I posted a correct patch for this a few days ago:

http://marc.info/?l=linux-netdev&m=119975746803886&w=2

	The correct fix requires more than simply removing the rtnl calls.

	I've got a few other patches in the pipeline, so I'm planning to
repost the set the above patch was a part of plus a few others, most
likely tomorrow.

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply

* Re: 2.6.24-rc6-mm1 - oddness with IPv4/v6 mapped sockets hanging...
From: Valdis.Kletnieks @ 2008-01-14 23:04 UTC (permalink / raw)
  To: Paul Moore; +Cc: Andrew Morton, linux-kernel, netdev
In-Reply-To: <200801141407.46345.paul.moore@hp.com>

[-- Attachment #1: Type: text/plain, Size: 524 bytes --]

On Mon, 14 Jan 2008 14:07:46 EST, Paul Moore said:
>
> http://git.infradead.org/?p=users/pcmoore/lblnet-2.6_testing;a=commitdiff;h=02f1c89d6e36507476f78108a3dcc78538be460b

Initial testing indicates that 2.6.24-rc6-mm1 plus this one commit is
behaving itself correctly - my Tcl test case that reliably demonstrated wedges
during SYN handling is definitively fixed, and the current issue with hangs with
data pending seems to be gone as well (after admittedly light testing).

Thanks for finding the commit that fixed it... 

[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply

* Re: 2.6.24-rc6-mm1 - oddness with IPv4/v6 mapped sockets hanging...
From: Paul Moore @ 2008-01-14 23:19 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: Andrew Morton, linux-kernel, netdev
In-Reply-To: <4382.1200351868@turing-police.cc.vt.edu>

On Monday 14 January 2008 6:04:28 pm Valdis.Kletnieks@vt.edu wrote:
> On Mon, 14 Jan 2008 14:07:46 EST, Paul Moore said:
> > http://git.infradead.org/?p=users/pcmoore/lblnet-2.6_testing;a=commitdiff
> >;h=02f1c89d6e36507476f78108a3dcc78538be460b
>
> Initial testing indicates that 2.6.24-rc6-mm1 plus this one commit is
> behaving itself correctly - my Tcl test case that reliably demonstrated
> wedges during SYN handling is definitively fixed, and the current issue
> with hangs with data pending seems to be gone as well (after admittedly
> light testing).
>
> Thanks for finding the commit that fixed it...

No problem, glad to hear that fixed the problem.  It's already in Linus' tree 
so any future -mm kernels as well as 2.6.24 should be problem-free, at least 
with respect to this ;)

-- 
paul moore
linux security @ hp

^ permalink raw reply

* Re: [PATCH 1/3] drivers/net/ipg.c: Fix skbuff leak
From: Francois Romieu @ 2008-01-14 23:15 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

David Miller <davem@davemloft.net> :
> From: Francois Romieu <romieu@fr.zoreil.com>
> > 
> > I should be able to test your r8169 NAPI changes tomorrow.
> 
> Thank you.

(sorry for the delay)

I think that the r8169 part should be reverted.

. With it, my 8169 oopsed unexpectedly in rtl8169_reinit_task a few
  seconds after I started a 'ping -q -f -l64 -s 36' and before I did
  anything else. I do not have a clear explanation for the oops yet.

. rtl8169_reset_task will not like it as it could allow two racing
  r8169_rx_interrupt.

. The irq will be enabled later when needed: the modified code is
  only relevant for the reset/reinit tasks.

. After reverting the hunk, the 8168 in the same box stood the same
  stream (> 50kpps in both direction). Everything worked as expected
  when I ifconfiged it down, then up again. Same thing when I pluged
  the cable out then in.

I'll give it more testing/thought.

-- 
Ueimor

^ permalink raw reply

* [Patch 2.6.24 1/1]S2io: Fixed synchronization between scheduling of napi with card reset and close
From: Sreenivasa Honnur @ 2008-01-15  1:23 UTC (permalink / raw)
  To: netdev, jeff; +Cc: support

- Fixed synchronization between scheduling of napi with card reset and close 
  by moving the enabling and disabling of napi to card up and card down 
  functions respectively instead of open and close.

Signed-off-by: Surjit Reang <surjit.reang@neterion.com>
Signed-off-by: Ramkrishna Vepa <ram.vepa@neterion.com>
---
diff -Nurp 2-0-26-10/drivers/net/s2io.c 2-0-26-17/drivers/net/s2io.c
--- 2-0-26-10/drivers/net/s2io.c	2008-01-10 01:33:53.000000000 +0530
+++ 2-0-26-17/drivers/net/s2io.c	2008-01-10 04:39:15.000000000 +0530
@@ -84,7 +84,7 @@
 #include "s2io.h"
 #include "s2io-regs.h"
 
-#define DRV_VERSION "2.0.26.10"
+#define DRV_VERSION "2.0.26.17"
 
 /* S2io Driver name & version. */
 static char s2io_driver_name[] = "Neterion";
@@ -3851,8 +3851,6 @@ static int s2io_open(struct net_device *
 	netif_carrier_off(dev);
 	sp->last_link_state = 0;
 
-	napi_enable(&sp->napi);
-
 	if (sp->config.intr_type == MSI_X) {
 		int ret = s2io_enable_msi_x(sp);
 
@@ -3895,7 +3893,6 @@ static int s2io_open(struct net_device *
 	return 0;
 
 hw_init_failed:
-	napi_disable(&sp->napi);
 	if (sp->config.intr_type == MSI_X) {
 		if (sp->entries) {
 			kfree(sp->entries);
@@ -3935,7 +3932,6 @@ static int s2io_close(struct net_device 
 		return 0;
 
 	netif_stop_queue(dev);
-	napi_disable(&sp->napi);
 	/* Reset card, kill tasklet and free Tx and Rx buffers. */
 	s2io_card_down(sp);
 
@@ -6799,6 +6795,8 @@ static void do_s2io_card_down(struct s2i
 	struct XENA_dev_config __iomem *bar0 = sp->bar0;
 	unsigned long flags;
 	register u64 val64 = 0;
+	struct config_param *config;
+	config = &sp->config;
 
 	if (!is_s2io_card_up(sp))
 		return;
@@ -6810,6 +6808,10 @@ static void do_s2io_card_down(struct s2i
 	}
 	clear_bit(__S2IO_STATE_CARD_UP, &sp->state);
 
+	/* Disable napi */
+	if (config->napi)
+		napi_disable(&sp->napi);
+
 	/* disable Tx and Rx traffic on the NIC */
 	if (do_io)
 		stop_nic(sp);
@@ -6903,6 +6905,11 @@ static int s2io_card_up(struct s2io_nic 
 		DBG_PRINT(INFO_DBG, "Buf in ring:%d is %d:\n", i,
 			  atomic_read(&sp->rx_bufs_left[i]));
 	}
+
+	/* Initialise napi */
+	if (config->napi)
+		napi_enable(&sp->napi);
+
 	/* Maintain the state prior to the open */
 	if (sp->promisc_flg)
 		sp->promisc_flg = 0;


^ permalink raw reply

* [PATCH] atl1: fix frame length bug
From: Jay Cliburn @ 2008-01-15  1:56 UTC (permalink / raw)
  To: jeff, davem; +Cc: csnook, david.harris, netdev, Jay Cliburn

The driver sets up the hardware to accept a frame with max length
equal to MTU + Ethernet header + FCS + VLAN tag, but we neglect to
add the VLAN tag size to the ingress buffer.  When a VLAN-tagged
frame arrives, the hardware passes it, but bad things happen
because the buffer is too small.  This patch fixes that.

Thanks to David Harris for reporting the bug and testing the fix.

Tested-by: David Harris <david.harris@cpni-inc.com>
Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net>
---
 drivers/net/atl1/atl1_main.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/atl1/atl1_main.c b/drivers/net/atl1/atl1_main.c
index 35b0a7d..9200ee5 100644
--- a/drivers/net/atl1/atl1_main.c
+++ b/drivers/net/atl1/atl1_main.c
@@ -120,7 +120,7 @@ static int __devinit atl1_sw_init(struct atl1_adapter *adapter)
 	struct atl1_hw *hw = &adapter->hw;
 	struct net_device *netdev = adapter->netdev;
 
-	hw->max_frame_size = netdev->mtu + ETH_HLEN + ETH_FCS_LEN;
+	hw->max_frame_size = netdev->mtu + ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN;
 	hw->min_frame_size = ETH_ZLEN + ETH_FCS_LEN;
 
 	adapter->wol = 0;
@@ -688,7 +688,7 @@ static int atl1_change_mtu(struct net_device *netdev, int new_mtu)
 {
 	struct atl1_adapter *adapter = netdev_priv(netdev);
 	int old_mtu = netdev->mtu;
-	int max_frame = new_mtu + ETH_HLEN + ETH_FCS_LEN;
+	int max_frame = new_mtu + ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN;
 
 	if ((max_frame < ETH_ZLEN + ETH_FCS_LEN) ||
 	    (max_frame > MAX_JUMBO_FRAME_SIZE)) {
@@ -853,8 +853,8 @@ static u32 atl1_configure(struct atl1_adapter *adapter)
 	/* set Interrupt Clear Timer */
 	iowrite16(adapter->ict, hw->hw_addr + REG_CMBDISDMA_TIMER);
 
-	/* set MTU, 4 : VLAN */
-	iowrite32(hw->max_frame_size + 4, hw->hw_addr + REG_MTU);
+	/* set max frame size hw will accept */
+	iowrite32(hw->max_frame_size, hw->hw_addr + REG_MTU);
 
 	/* jumbo size & rrd retirement timer */
 	value = (((u32) hw->rx_jumbo_th & RXQ_JMBOSZ_TH_MASK)
-- 
1.5.3.7


^ permalink raw reply related

* Re: [PATCH] atl1: fix frame length bug
From: Jay Cliburn @ 2008-01-15  2:04 UTC (permalink / raw)
  To: jeff, davem; +Cc: Jay Cliburn, csnook, david.harris, netdev
In-Reply-To: <1200362201-30987-1-git-send-email-jacliburn@bellsouth.net>

On Mon, 14 Jan 2008 19:56:41 -0600
Jay Cliburn <jacliburn@bellsouth.net> wrote:

> The driver sets up the hardware to accept a frame with max length
> equal to MTU + Ethernet header + FCS + VLAN tag, but we neglect to
> add the VLAN tag size to the ingress buffer.  When a VLAN-tagged
> frame arrives, the hardware passes it, but bad things happen
> because the buffer is too small.  This patch fixes that.
> 
> Thanks to David Harris for reporting the bug and testing the fix.
> 
> Tested-by: David Harris <david.harris@cpni-inc.com>
> Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net>

Jeff, Dave,

This bugfix needs to go in for 2.6.24 if possible.

Thanks,
Jay

^ permalink raw reply

* Re: [PATCH 0/3] UCC TDM driver for MPC83xx platforms
From: Kumar Gala @ 2008-01-15  3:31 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Kim Phillips, Poonam.Aggrwal, sfr, rubini, linux-ppcdev, netdev,
	linux-kernel, Michael.Barkowski, ashish.kalra, Rich.Cutler
In-Reply-To: <20080114131530.9d33f50a.akpm@linux-foundation.org>


On Jan 14, 2008, at 3:15 PM, Andrew Morton wrote:

> On Mon, 14 Jan 2008 12:00:51 -0600
> Kim Phillips <kim.phillips@freescale.com> wrote:
>
>> On Thu, 10 Jan 2008 21:41:20 -0700
>> "Aggrwal Poonam" <Poonam.Aggrwal@freescale.com> wrote:
>>
>>> Hello  All
>>>
>>> I am waiting for more feedback on the patches.
>>>
>>> If there are no objections please consider them for 2.6.25.
>>>
>> if this isn't going to go through Alessandro Rubini/misc drivers, can
>> it go through the akpm/mm tree?
>>
>
> That would work.  But it might be more appropriate to go Kumar- 
> >paulus->Linus.

I'm ok w/taking the arch/powerpc bits, but I"m a bit concerned about  
the driver itself.  I'm wondering if we need a TDM framework in the  
kernel.

I guess if Poonam could possibly describe how this driver is actually  
used that would be helpful.  I see we have 8315 with a discrete TDM  
block and I'm guessing 82xx/85xx based CPM parts of some form of TDM  
as well.

- k

^ permalink raw reply

* [PATCH 2/6] [IPV4] fib hash|trie initialization
From: Stephen Hemminger @ 2008-01-15  5:00 UTC (permalink / raw)
  Cc: David Miller, robert.olsson, netdev
In-Reply-To: <20080114125755.6157a3bf@deepthought>

Initialization of the slab cache's should be done when IP is
initialized to make sure of available memory, and that code
can be marked __init.

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>


---
Applies after statistics (#1) patch for net-2.6.25

 include/net/ip_fib.h    |    5 +++--
 net/ipv4/fib_frontend.c |    9 ++++++---
 net/ipv4/fib_hash.c     |   24 +++++++++++-------------
 net/ipv4/fib_trie.c     |   16 ++++++++--------
 4 files changed, 28 insertions(+), 26 deletions(-)

--- a/include/net/ip_fib.h	2008-01-14 14:59:32.000000000 -0800
+++ b/include/net/ip_fib.h	2008-01-14 15:01:12.000000000 -0800
@@ -231,8 +231,9 @@ extern int fib_sync_down(__be32 local, s
 extern int fib_sync_up(struct net_device *dev);
 extern __be32  __fib_res_prefsrc(struct fib_result *res);
 
-/* Exported by fib_hash.c */
-extern struct fib_table *fib_hash_init(u32 id);
+/* Exported by fib_{hash|trie}.c */
+extern void fib_hash_init(void);
+extern struct fib_table *fib_hash_table(u32 id);
 
 static inline void fib_combine_itag(u32 *itag, struct fib_result *res)
 {
--- a/net/ipv4/fib_frontend.c	2008-01-14 14:59:32.000000000 -0800
+++ b/net/ipv4/fib_frontend.c	2008-01-14 15:01:12.000000000 -0800
@@ -53,11 +53,11 @@ static int __net_init fib4_rules_init(st
 {
 	struct fib_table *local_table, *main_table;
 
-	local_table = fib_hash_init(RT_TABLE_LOCAL);
+	local_table = fib_hash_table(RT_TABLE_LOCAL);
 	if (local_table == NULL)
 		return -ENOMEM;
 
-	main_table  = fib_hash_init(RT_TABLE_MAIN);
+	main_table  = fib_hash_table(RT_TABLE_MAIN);
 	if (main_table == NULL)
 		goto fail;
 
@@ -83,7 +83,8 @@ struct fib_table *fib_new_table(struct n
 	tb = fib_get_table(net, id);
 	if (tb)
 		return tb;
-	tb = fib_hash_init(id);
+
+	tb = fib_hash_table(id);
 	if (!tb)
 		return NULL;
 	h = id & (FIB_TABLE_HASHSZ - 1);
@@ -1042,6 +1043,8 @@ void __init ip_fib_init(void)
 	register_pernet_subsys(&fib_net_ops);
 	register_netdevice_notifier(&fib_netdev_notifier);
 	register_inetaddr_notifier(&fib_inetaddr_notifier);
+
+	fib_hash_init();
 }
 
 EXPORT_SYMBOL(inet_addr_type);
--- a/net/ipv4/fib_hash.c	2008-01-14 14:59:32.000000000 -0800
+++ b/net/ipv4/fib_hash.c	2008-01-14 15:01:12.000000000 -0800
@@ -746,21 +746,19 @@ static int fn_hash_dump(struct fib_table
 	return skb->len;
 }
 
-struct fib_table *fib_hash_init(u32 id)
+void __init fib_hash_init(void)
 {
-	struct fib_table *tb;
+	fn_hash_kmem = kmem_cache_create("ip_fib_hash", sizeof(struct fib_node),
+					 0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
+
+	fn_alias_kmem = kmem_cache_create("ip_fib_alias", sizeof(struct fib_alias),
+					  0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
+
+}
 
-	if (fn_hash_kmem == NULL)
-		fn_hash_kmem = kmem_cache_create("ip_fib_hash",
-						 sizeof(struct fib_node),
-						 0, SLAB_HWCACHE_ALIGN,
-						 NULL);
-
-	if (fn_alias_kmem == NULL)
-		fn_alias_kmem = kmem_cache_create("ip_fib_alias",
-						  sizeof(struct fib_alias),
-						  0, SLAB_HWCACHE_ALIGN,
-						  NULL);
+struct fib_table *fib_hash_table(u32 id)
+{
+	struct fib_table *tb;
 
 	tb = kmalloc(sizeof(struct fib_table) + sizeof(struct fn_hash),
 		     GFP_KERNEL);
--- a/net/ipv4/fib_trie.c	2008-01-14 15:01:12.000000000 -0800
+++ b/net/ipv4/fib_trie.c	2008-01-14 20:57:41.000000000 -0800
@@ -1923,19 +1923,19 @@ out:
 	return -1;
 }
 
-/* Fix more generic FIB names for init later */
+void __init fib_hash_init(void)
+{
+	fn_alias_kmem = kmem_cache_create("ip_fib_alias", sizeof(struct fib_alias),
+					  0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
+}
 
-struct fib_table *fib_hash_init(u32 id)
+
+/* Fix more generic FIB names for init later */
+struct fib_table *fib_hash_table(u32 id)
 {
 	struct fib_table *tb;
 	struct trie *t;
 
-	if (fn_alias_kmem == NULL)
-		fn_alias_kmem = kmem_cache_create("ip_fib_alias",
-						  sizeof(struct fib_alias),
-						  0, SLAB_HWCACHE_ALIGN,
-						  NULL);
-
 	tb = kmalloc(sizeof(struct fib_table) + sizeof(struct trie),
 		     GFP_KERNEL);
 	if (tb == NULL)

^ permalink raw reply

* [PATCH 3/6] [IPV4] trie: put leaf nodes in a slab cache
From: Stephen Hemminger @ 2008-01-15  0:46 UTC (permalink / raw)
  To: David Miller; +Cc: robert.olsson, netdev
In-Reply-To: <20080114164450.55f8c9b2@deepthought>

This improves locality for operations that touch all the leaves.
Later patch will grow the size of the leaf so it becomes more
important.

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>


--- a/net/ipv4/fib_trie.c	2008-01-14 12:26:51.000000000 -0800
+++ b/net/ipv4/fib_trie.c	2008-01-14 13:41:00.000000000 -0800
@@ -162,6 +162,7 @@ static struct tnode *halve(struct trie *
 static void tnode_free(struct tnode *tn);
 
 static struct kmem_cache *fn_alias_kmem __read_mostly;
+static struct kmem_cache *trie_leaf_kmem __read_mostly;
 
 static inline struct tnode *node_parent(struct node *node)
 {
@@ -316,7 +317,8 @@ static inline void alias_free_mem_rcu(st
 
 static void __leaf_free_rcu(struct rcu_head *head)
 {
-	kfree(container_of(head, struct leaf, rcu));
+	struct leaf *leaf = container_of(head, struct leaf, rcu);
+	kmem_cache_free(trie_leaf_kmem, leaf);
 }
 
 static void __leaf_info_free_rcu(struct rcu_head *head)
@@ -366,7 +368,7 @@ static inline void tnode_free(struct tno
 
 static struct leaf *leaf_new(void)
 {
-	struct leaf *l = kmalloc(sizeof(struct leaf),  GFP_KERNEL);
+	struct leaf *l = kmem_cache_alloc(trie_leaf_kmem, GFP_KERNEL);
 	if (l) {
 		l->parent = T_LEAF;
 		INIT_HLIST_HEAD(&l->list);
@@ -1927,6 +1929,9 @@ void __init fib_hash_init(void)
 {
 	fn_alias_kmem = kmem_cache_create("ip_fib_alias", sizeof(struct fib_alias),
 					  0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
+
+	trie_leaf_kmem = kmem_cache_create("ip_fib_trie", sizeof(struct leaf),
+					   0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
 }
 
 

^ permalink raw reply

* [PATCH 4/6] [IPV4] fib_trie style cleanup
From: Stephen Hemminger @ 2008-01-15  0:47 UTC (permalink / raw)
  To: David Miller; +Cc: robert.olsson, netdev
In-Reply-To: <20080114164621.2bc5011f@deepthought>

Get rid of #ifdef, and split out embedded function calls in if statements.

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>

--- a/net/ipv4/fib_trie.c	2008-01-14 14:24:58.000000000 -0800
+++ b/net/ipv4/fib_trie.c	2008-01-14 14:26:05.000000000 -0800
@@ -1293,7 +1293,9 @@ static inline int check_leaf(struct trie
 		if (l->key != (key & ntohl(mask)))
 			continue;
 
-		if ((err = fib_semantic_match(&li->falh, flp, res, htonl(l->key), mask, i)) <= 0) {
+		err = fib_semantic_match(&li->falh, flp, res,
+					 htonl(l->key), mask, i);
+		if (err <= 0) {
 			*plen = i;
 #ifdef CONFIG_IP_FIB_TRIE_STATS
 			t->stats.semantic_match_passed++;
@@ -1335,7 +1337,8 @@ fn_trie_lookup(struct fib_table *tb, con
 
 	/* Just a leaf? */
 	if (IS_LEAF(n)) {
-		if ((ret = check_leaf(t, (struct leaf *)n, key, &plen, flp, res)) <= 0)
+		ret = check_leaf(t, (struct leaf *)n, key, &plen, flp, res);
+		if (ret <= 0)
 			goto found;
 		goto failed;
 	}
@@ -1360,14 +1363,13 @@ fn_trie_lookup(struct fib_table *tb, con
 		}
 
 		if (IS_LEAF(n)) {
-			if ((ret = check_leaf(t, (struct leaf *)n, key, &plen, flp, res)) <= 0)
+			ret = check_leaf(t, (struct leaf *)n, key, &plen, flp, res);
+			if (ret <= 0)
 				goto found;
 			else
 				goto backtrace;
 		}
 
-#define HL_OPTIMIZE
-#ifdef HL_OPTIMIZE
 		cn = (struct tnode *)n;
 
 		/*
@@ -1455,7 +1457,7 @@ fn_trie_lookup(struct fib_table *tb, con
 			if (current_prefix_length >= cn->pos)
 				current_prefix_length = mp;
 		}
-#endif
+
 		pn = (struct tnode *)n; /* Descend */
 		chopped_off = 0;
 		continue;

^ permalink raw reply

* [PATCH 5/6] [IPV4] fib_trie: checkleaf calling convention
From: Stephen Hemminger @ 2008-01-15  2:58 UTC (permalink / raw)
  To: David Miller; +Cc: robert.olsson, netdev
In-Reply-To: <20080114164727.210047f6@deepthought>

Another case where just returning a negative value gets rid of
call by reference. In this case the return value for checkleaf is
now:
	-1      no match
	0..32   prefix length

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>


--- a/net/ipv4/fib_trie.c	2008-01-14 18:02:18.000000000 -0800
+++ b/net/ipv4/fib_trie.c	2008-01-14 18:37:51.000000000 -0800
@@ -1277,36 +1277,36 @@ err:
 
 
 /* should be called with rcu_read_lock */
-static inline int check_leaf(struct trie *t, struct leaf *l,
-			     t_key key, int *plen, const struct flowi *flp,
-			     struct fib_result *res)
+static int check_leaf(struct trie *t, struct leaf *l,
+		      t_key key,  const struct flowi *flp,
+		      struct fib_result *res)
 {
-	int err, i;
-	__be32 mask;
 	struct leaf_info *li;
 	struct hlist_head *hhead = &l->list;
 	struct hlist_node *node;
 
 	hlist_for_each_entry_rcu(li, node, hhead, hlist) {
-		i = li->plen;
-		mask = inet_make_mask(i);
+		int err;
+		int plen = li->plen;
+		__be32 mask = inet_make_mask(plen);
+
 		if (l->key != (key & ntohl(mask)))
 			continue;
 
 		err = fib_semantic_match(&li->falh, flp, res,
-					 htonl(l->key), mask, i);
-		if (err <= 0) {
-			*plen = i;
+					 htonl(l->key), mask, plen);
+
 #ifdef CONFIG_IP_FIB_TRIE_STATS
+		if (err <= 0)
 			t->stats.semantic_match_passed++;
+		else
+			t->stats.semantic_match_miss++;
 #endif
-			return err;
-		}
-#ifdef CONFIG_IP_FIB_TRIE_STATS
-		t->stats.semantic_match_miss++;
-#endif
+		if (err <= 0)
+			return plen;
 	}
-	return 1;
+
+	return -1;
 }
 
 static int
@@ -1337,11 +1337,13 @@ fn_trie_lookup(struct fib_table *tb, con
 
 	/* Just a leaf? */
 	if (IS_LEAF(n)) {
-		ret = check_leaf(t, (struct leaf *)n, key, &plen, flp, res);
-		if (ret <= 0)
-			goto found;
-		goto failed;
+		plen = check_leaf(t, (struct leaf *)n, key, flp, res);
+		if (plen < 0)
+			goto failed;
+		ret = 0;
+		goto found;
 	}
+
 	pn = (struct tnode *) n;
 	chopped_off = 0;
 
@@ -1363,11 +1365,12 @@ fn_trie_lookup(struct fib_table *tb, con
 		}
 
 		if (IS_LEAF(n)) {
-			ret = check_leaf(t, (struct leaf *)n, key, &plen, flp, res);
-			if (ret <= 0)
-				goto found;
-			else
+			plen = check_leaf(t, (struct leaf *)n, key, flp, res);
+			if (plen < 0)
 				goto backtrace;
+
+			ret = 0;
+			goto found;
 		}
 
 		cn = (struct tnode *)n;

^ permalink raw reply

* [RFC 6/6] fib_trie: combine leaf and info
From: Stephen Hemminger @ 2008-01-15  5:07 UTC (permalink / raw)
  To: David Miller; +Cc: robert.olsson, netdev
In-Reply-To: <20080114185843.0981f0ff@deepthought>

Combine the prefix information and the leaf together into one
allocation. This is furthur simplified by converting the hlist
into a simple bitfield.

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>

This patch is experimental (ie it boots and loads routes), but
is slower for the 163,000 route dump test.

---
 net/ipv4/fib_trie.c |  165 ++++++++++++++++++++--------------------------------
 1 file changed, 65 insertions(+), 100 deletions(-)

--- a/net/ipv4/fib_trie.c	2008-01-14 18:37:51.000000000 -0800
+++ b/net/ipv4/fib_trie.c	2008-01-14 20:57:12.000000000 -0800
@@ -50,10 +50,9 @@
  *		Patrick McHardy <kaber@trash.net>
  */
 
-#define VERSION "0.408"
+#define VERSION "0.409"
+
 
-#include <asm/uaccess.h>
-#include <asm/system.h>
 #include <linux/bitops.h>
 #include <linux/types.h>
 #include <linux/kernel.h>
@@ -80,6 +79,8 @@
 #include <net/tcp.h>
 #include <net/sock.h>
 #include <net/ip_fib.h>
+#include <asm/bitops.h>
+
 #include "fib_lookup.h"
 
 #define MAX_STAT_DEPTH 32
@@ -101,20 +102,24 @@ struct node {
 	t_key key;
 };
 
+struct leaf_info {
+	struct list_head falh;
+};
+
+#define INFO_SIZE	33
+/* NB: bitmap is in reverse order, so that find_first returns largest prefix */
+#define PLEN2BIT(x)	(INFO_SIZE - (x))
+
 struct leaf {
 	unsigned long parent;
 	t_key key;
-	struct hlist_head list;
 	struct rcu_head rcu;
-};
 
-struct leaf_info {
-	struct hlist_node hlist;
-	struct rcu_head rcu;
-	int plen;
-	struct list_head falh;
+	unsigned long bitmap[INFO_SIZE / BITS_PER_LONG + 1];
+	struct leaf_info info[INFO_SIZE];
 };
 
+
 struct tnode {
 	unsigned long parent;
 	t_key key;
@@ -321,16 +326,6 @@ static void __leaf_free_rcu(struct rcu_h
 	kmem_cache_free(trie_leaf_kmem, leaf);
 }
 
-static void __leaf_info_free_rcu(struct rcu_head *head)
-{
-	kfree(container_of(head, struct leaf_info, rcu));
-}
-
-static inline void free_leaf_info(struct leaf_info *leaf)
-{
-	call_rcu(&leaf->rcu, __leaf_info_free_rcu);
-}
-
 static struct tnode *tnode_alloc(size_t size)
 {
 	struct page *pages;
@@ -371,21 +366,37 @@ static struct leaf *leaf_new(void)
 	struct leaf *l = kmem_cache_alloc(trie_leaf_kmem, GFP_KERNEL);
 	if (l) {
 		l->parent = T_LEAF;
-		INIT_HLIST_HEAD(&l->list);
+		bitmap_zero(l->bitmap, INFO_SIZE);
 	}
 	return l;
 }
 
-static struct leaf_info *leaf_info_new(int plen)
+static inline struct leaf_info *find_leaf_info(struct leaf *l, int plen)
 {
-	struct leaf_info *li = kmalloc(sizeof(struct leaf_info),  GFP_KERNEL);
-	if (li) {
-		li->plen = plen;
-		INIT_LIST_HEAD(&li->falh);
-	}
+	return test_bit(PLEN2BIT(plen), l->bitmap) ? &l->info[plen] : NULL;
+}
+
+static struct leaf_info *set_leaf_info(struct leaf *l, int plen)
+{
+	struct leaf_info *li = &l->info[plen];
+
+	INIT_LIST_HEAD(&li->falh);
+	set_bit(PLEN2BIT(plen), l->bitmap);
+
 	return li;
 }
 
+static inline void clear_leaf_info(struct leaf *l, int plen)
+{
+	smp_mb__before_clear_bit();
+	clear_bit(PLEN2BIT(plen), &l->bitmap);
+}
+
+static inline int leaf_info_empty(const struct leaf *l)
+{
+	return find_first_bit(l->bitmap, INFO_SIZE) >= INFO_SIZE;
+}
+
 static struct tnode* tnode_new(t_key key, int pos, int bits)
 {
 	size_t sz = sizeof(struct tnode) + (sizeof(struct node *) << bits);
@@ -876,20 +887,6 @@ nomem:
 
 /* readside must use rcu_read_lock currently dump routines
  via get_fa_head and dump */
-
-static struct leaf_info *find_leaf_info(struct leaf *l, int plen)
-{
-	struct hlist_head *head = &l->list;
-	struct hlist_node *node;
-	struct leaf_info *li;
-
-	hlist_for_each_entry_rcu(li, node, head, hlist)
-		if (li->plen == plen)
-			return li;
-
-	return NULL;
-}
-
 static inline struct list_head * get_fa_head(struct leaf *l, int plen)
 {
 	struct leaf_info *li = find_leaf_info(l, plen);
@@ -900,26 +897,6 @@ static inline struct list_head * get_fa_
 	return &li->falh;
 }
 
-static void insert_leaf_info(struct hlist_head *head, struct leaf_info *new)
-{
-	struct leaf_info *li = NULL, *last = NULL;
-	struct hlist_node *node;
-
-	if (hlist_empty(head)) {
-		hlist_add_head_rcu(&new->hlist, head);
-	} else {
-		hlist_for_each_entry(li, node, head, hlist) {
-			if (new->plen > li->plen)
-				break;
-
-			last = li;
-		}
-		if (last)
-			hlist_add_after_rcu(&last->hlist, &new->hlist);
-		else
-			hlist_add_before_rcu(&new->hlist, &li->hlist);
-	}
-}
 
 /* rcu_read_lock needs to be hold by caller from readside */
 
@@ -993,6 +970,8 @@ static struct list_head *fib_insert_node
 	pos = 0;
 	n = t->trie;
 
+	pr_debug("fib_insert_node: %x/%d\n", key, plen);
+
 	/* If we point to NULL, stop. Either the tree is empty and we should
 	 * just put a new leaf in if, or we have reached an empty child slot,
 	 * and we should just put our new leaf in that.
@@ -1038,13 +1017,9 @@ static struct list_head *fib_insert_node
 
 	if (n != NULL && IS_LEAF(n) && tkey_equals(key, n->key)) {
 		l = (struct leaf *) n;
-		li = leaf_info_new(plen);
-
-		if (!li)
-			return NULL;
-
+		li = set_leaf_info(l, plen);
 		fa_head = &li->falh;
-		insert_leaf_info(&l->list, li);
+
 		goto done;
 	}
 	l = leaf_new();
@@ -1053,15 +1028,8 @@ static struct list_head *fib_insert_node
 		return NULL;
 
 	l->key = key;
-	li = leaf_info_new(plen);
-
-	if (!li) {
-		tnode_free((struct tnode *) l);
-		return NULL;
-	}
-
+	li = set_leaf_info(l, plen);
 	fa_head = &li->falh;
-	insert_leaf_info(&l->list, li);
 
 	if (t->trie && n == NULL) {
 		/* Case 2: n is NULL, and will just insert a new leaf */
@@ -1091,7 +1059,6 @@ static struct list_head *fib_insert_node
 		}
 
 		if (!tn) {
-			free_leaf_info(li);
 			tnode_free((struct tnode *) l);
 			return NULL;
 		}
@@ -1119,7 +1086,7 @@ static struct list_head *fib_insert_node
 
 	rcu_assign_pointer(t->trie, trie_rebalance(t, tp));
 done:
-	return fa_head;
+	return &li->falh;
 }
 
 /*
@@ -1281,14 +1248,14 @@ static int check_leaf(struct trie *t, st
 		      t_key key,  const struct flowi *flp,
 		      struct fib_result *res)
 {
-	struct leaf_info *li;
-	struct hlist_head *hhead = &l->list;
-	struct hlist_node *node;
+	int bit;
 
-	hlist_for_each_entry_rcu(li, node, hhead, hlist) {
-		int err;
-		int plen = li->plen;
+	/* need find highest prefix */
+	for_each_bit(bit, l->bitmap, INFO_SIZE) {
+		int plen = PLEN2BIT(bit);
+		struct leaf_info *li = &l->info[plen];
 		__be32 mask = inet_make_mask(plen);
+		int err;
 
 		if (l->key != (key & ntohl(mask)))
 			continue;
@@ -1622,12 +1589,10 @@ static int fn_trie_delete(struct fib_tab
 
 	list_del_rcu(&fa->fa_list);
 
-	if (list_empty(fa_head)) {
-		hlist_del_rcu(&li->hlist);
-		free_leaf_info(li);
-	}
+	if (list_empty(fa_head))
+		clear_leaf_info(l, plen);
 
-	if (hlist_empty(&l->list))
+	if (leaf_info_empty(l))
 		trie_leaf_remove(t, key);
 
 	if (fa->fa_state & FA_S_ACCESSED)
@@ -1659,18 +1624,17 @@ static int trie_flush_list(struct trie *
 static int trie_flush_leaf(struct trie *t, struct leaf *l)
 {
 	int found = 0;
-	struct hlist_head *lih = &l->list;
-	struct hlist_node *node, *tmp;
-	struct leaf_info *li = NULL;
+	int bit;
 
-	hlist_for_each_entry_safe(li, node, tmp, lih, hlist) {
+	for_each_bit(bit, l->bitmap, INFO_SIZE) {
+		int plen = PLEN2BIT(bit);
+		struct leaf_info *li = &l->info[plen];
 		found += trie_flush_list(t, &li->falh);
 
-		if (list_empty(&li->falh)) {
-			hlist_del_rcu(&li->hlist);
-			free_leaf_info(li);
-		}
+		if (list_empty(&li->falh))
+			clear_leaf_info(l, plen);
 	}
+
 	return found;
 }
 
@@ -1746,12 +1710,12 @@ static int fn_trie_flush(struct fib_tabl
 	for (h = 0; (l = nextleaf(t, l)) != NULL; h++) {
 		found += trie_flush_leaf(t, l);
 
-		if (ll && hlist_empty(&ll->list))
+		if (ll && leaf_info_empty(ll))
 			trie_leaf_remove(t, ll->key);
 		ll = l;
 	}
 
-	if (ll && hlist_empty(&ll->list))
+	if (ll && leaf_info_empty(ll))
 		trie_leaf_remove(t, ll->key);
 
 	pr_debug("trie_flush found=%d\n", found);
@@ -2428,10 +2392,11 @@ static int fib_route_seq_show(struct seq
 
 	if (iter->trie == iter->trie_local)
 		return 0;
+
 	if (IS_TNODE(l))
 		return 0;
 
-	for (i=32; i>=0; i--) {
+	for (i = 32; i >= 0; i--) {
 		struct leaf_info *li = find_leaf_info(l, i);
 		struct fib_alias *fa;
 		__be32 mask, prefix;
@@ -2439,7 +2404,7 @@ static int fib_route_seq_show(struct seq
 		if (!li)
 			continue;
 
-		mask = inet_make_mask(li->plen);
+		mask = inet_make_mask(i);
 		prefix = htonl(l->key);
 
 		list_for_each_entry_rcu(fa, &li->falh, fa_list) {

^ permalink raw reply

* [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang
From: Frans Pop @ 2008-01-15  5:25 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel

After compiling v2.6.24-rc7-163-g1a1b285 (x86_64) yesterday I suddenly see this error
repeatedly:
kernel: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
kernel:   Tx Queue             <0>
kernel:   TDH                  <a>
kernel:   TDT                  <a>
kernel:   next_to_use          <a>
kernel:   next_to_clean        <ff>
kernel: buffer_info[next_to_clean]
kernel:   time_stamp           <10002738a>
kernel:   next_to_watch        <ff>
kernel:   jiffies              <1000275b4>
kernel:   next_to_watch.status <1>

My previous kernel was v2.6.24-rc7 and with that this error did not occur. I
have also never seen it with earlier kernels.

The values for "TX Queue" and "next_to_watch.status" are constant, the
others vary.

My NIC is:
01:00.0 Ethernet controller [0200]: Intel Corporation 82573E Gigabit Ethernet Controller (Copper) (rev 03)

01:00.0 0200: 8086:108c (rev 03)
        Subsystem: 8086:3096
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 1273
        Region 0: Memory at 90200000 (32-bit, non-prefetchable) [size=128K]
        Region 1: Memory at 90100000 (32-bit, non-prefetchable) [size=1M]
        Region 2: I/O ports at 1000 [size=32]
        Capabilities: [c8] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [d0] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable+
                Address: 00000000fee0300c  Data: 41a9
        Capabilities: [e0] Express Endpoint IRQ 0
                Device: Supported: MaxPayload 256 bytes, PhantFunc 0, ExtTag-
                Device: Latency L0s <512ns, L1 <64us
                Device: AtnBtn- AtnInd- PwrInd-
                Device: Errors: Correctable- Non-Fatal- Fatal- Unsupported-
                Device: RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                Device: MaxPayload 128 bytes, MaxReadReq 512 bytes
                Link: Supported Speed 2.5Gb/s, Width x1, ASPM unknown, Port 0
                Link: Latency L0s <128ns, L1 <64us
                Link: ASPM Disabled RCB 64 bytes CommClk+ ExtSynch-
                Link: Speed 2.5Gb/s, Width x1

The system is an Intel D945GCZ main board with
Intel(R) Pentium(R) D CPU 3.20GHz (dual core) processor.

Cheers,
FJP

^ permalink raw reply

* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang
From: David Miller @ 2008-01-15  5:53 UTC (permalink / raw)
  To: elendil; +Cc: netdev, linux-kernel
In-Reply-To: <200801150625.10823.elendil@planet.nl>

From: Frans Pop <elendil@planet.nl>
Date: Tue, 15 Jan 2008 06:25:10 +0100

> kernel: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang

Does this make the problem go away?

(Note this isn't the final correct patch we should apply.  There
 is no reason why this revert back to the older ->poll() logic
 here should have any effect on the TX hang triggering...)

diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index 13d57b0..cada32c 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -3919,7 +3919,7 @@ e1000_clean(struct napi_struct *napi, int budget)
 {
 	struct e1000_adapter *adapter = container_of(napi, struct e1000_adapter, napi);
 	struct net_device *poll_dev = adapter->netdev;
-	int work_done = 0;
+	int tx_work = 0, work_done = 0;
 
 	/* Must NOT use netdev_priv macro here. */
 	adapter = poll_dev->priv;
@@ -3929,8 +3929,8 @@ e1000_clean(struct napi_struct *napi, int budget)
 	 * simultaneously.  A failure obtaining the lock means
 	 * tx_ring[0] is currently being cleaned anyway. */
 	if (spin_trylock(&adapter->tx_queue_lock)) {
-		e1000_clean_tx_irq(adapter,
-				   &adapter->tx_ring[0]);
+		tx_work = e1000_clean_tx_irq(adapter,
+					     &adapter->tx_ring[0]);
 		spin_unlock(&adapter->tx_queue_lock);
 	}
 
@@ -3938,7 +3938,7 @@ e1000_clean(struct napi_struct *napi, int budget)
 	                  &work_done, budget);
 
 	/* If budget not fully consumed, exit the polling mode */
-	if (work_done < budget) {
+	if (!tx_work && (work_done < budget)) {
 		if (likely(adapter->itr_setting & 3))
 			e1000_set_itr(adapter);
 		netif_rx_complete(poll_dev, napi);

^ permalink raw reply related

* Re: [RFC 6/6] fib_trie: combine leaf and info
From: Eric Dumazet @ 2008-01-15  6:12 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: David Miller, robert.olsson, netdev
In-Reply-To: <20080114210716.4b09c84d@deepthought>

[-- Attachment #1: Type: text/plain, Size: 9629 bytes --]

Stephen Hemminger a écrit :
> Combine the prefix information and the leaf together into one
> allocation. This is furthur simplified by converting the hlist
> into a simple bitfield.
> 
> Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
> 
> This patch is experimental (ie it boots and loads routes), but
> is slower for the 163,000 route dump test.
> 

Its also very memory expensive. That was not Robert suggestion I believe.

Its suggestion is to embedd one info into each leaves.

Please find attached to this mail a preliminary and incomplete patch I wrote 
this morning before coffee :), to get the idea.

Before patching this file, maybe we should first do a cleanup (checkpatch.pl 
and obvious modern style formating :) )
> ---
>  net/ipv4/fib_trie.c |  165 ++++++++++++++++++++--------------------------------
>  1 file changed, 65 insertions(+), 100 deletions(-)
> 
> --- a/net/ipv4/fib_trie.c	2008-01-14 18:37:51.000000000 -0800
> +++ b/net/ipv4/fib_trie.c	2008-01-14 20:57:12.000000000 -0800
> @@ -50,10 +50,9 @@
>   *		Patrick McHardy <kaber@trash.net>
>   */
>  
> -#define VERSION "0.408"
> +#define VERSION "0.409"
> +
>  
> -#include <asm/uaccess.h>
> -#include <asm/system.h>
>  #include <linux/bitops.h>
>  #include <linux/types.h>
>  #include <linux/kernel.h>
> @@ -80,6 +79,8 @@
>  #include <net/tcp.h>
>  #include <net/sock.h>
>  #include <net/ip_fib.h>
> +#include <asm/bitops.h>
> +
>  #include "fib_lookup.h"
>  
>  #define MAX_STAT_DEPTH 32
> @@ -101,20 +102,24 @@ struct node {
>  	t_key key;
>  };
>  
> +struct leaf_info {
> +	struct list_head falh;
> +};
> +
> +#define INFO_SIZE	33
> +/* NB: bitmap is in reverse order, so that find_first returns largest prefix */
> +#define PLEN2BIT(x)	(INFO_SIZE - (x))
> +
>  struct leaf {
>  	unsigned long parent;
>  	t_key key;
> -	struct hlist_head list;
>  	struct rcu_head rcu;
> -};
>  
> -struct leaf_info {
> -	struct hlist_node hlist;
> -	struct rcu_head rcu;
> -	int plen;
> -	struct list_head falh;
> +	unsigned long bitmap[INFO_SIZE / BITS_PER_LONG + 1];
> +	struct leaf_info info[INFO_SIZE];
>  };
>  
> +
>  struct tnode {
>  	unsigned long parent;
>  	t_key key;
> @@ -321,16 +326,6 @@ static void __leaf_free_rcu(struct rcu_h
>  	kmem_cache_free(trie_leaf_kmem, leaf);
>  }
>  
> -static void __leaf_info_free_rcu(struct rcu_head *head)
> -{
> -	kfree(container_of(head, struct leaf_info, rcu));
> -}
> -
> -static inline void free_leaf_info(struct leaf_info *leaf)
> -{
> -	call_rcu(&leaf->rcu, __leaf_info_free_rcu);
> -}
> -
>  static struct tnode *tnode_alloc(size_t size)
>  {
>  	struct page *pages;
> @@ -371,21 +366,37 @@ static struct leaf *leaf_new(void)
>  	struct leaf *l = kmem_cache_alloc(trie_leaf_kmem, GFP_KERNEL);
>  	if (l) {
>  		l->parent = T_LEAF;
> -		INIT_HLIST_HEAD(&l->list);
> +		bitmap_zero(l->bitmap, INFO_SIZE);
>  	}
>  	return l;
>  }
>  
> -static struct leaf_info *leaf_info_new(int plen)
> +static inline struct leaf_info *find_leaf_info(struct leaf *l, int plen)
>  {
> -	struct leaf_info *li = kmalloc(sizeof(struct leaf_info),  GFP_KERNEL);
> -	if (li) {
> -		li->plen = plen;
> -		INIT_LIST_HEAD(&li->falh);
> -	}
> +	return test_bit(PLEN2BIT(plen), l->bitmap) ? &l->info[plen] : NULL;
> +}
> +
> +static struct leaf_info *set_leaf_info(struct leaf *l, int plen)
> +{
> +	struct leaf_info *li = &l->info[plen];
> +
> +	INIT_LIST_HEAD(&li->falh);
> +	set_bit(PLEN2BIT(plen), l->bitmap);
> +
>  	return li;
>  }
>  
> +static inline void clear_leaf_info(struct leaf *l, int plen)
> +{
> +	smp_mb__before_clear_bit();
> +	clear_bit(PLEN2BIT(plen), &l->bitmap);
> +}
> +
> +static inline int leaf_info_empty(const struct leaf *l)
> +{
> +	return find_first_bit(l->bitmap, INFO_SIZE) >= INFO_SIZE;
> +}
> +
>  static struct tnode* tnode_new(t_key key, int pos, int bits)
>  {
>  	size_t sz = sizeof(struct tnode) + (sizeof(struct node *) << bits);
> @@ -876,20 +887,6 @@ nomem:
>  
>  /* readside must use rcu_read_lock currently dump routines
>   via get_fa_head and dump */
> -
> -static struct leaf_info *find_leaf_info(struct leaf *l, int plen)
> -{
> -	struct hlist_head *head = &l->list;
> -	struct hlist_node *node;
> -	struct leaf_info *li;
> -
> -	hlist_for_each_entry_rcu(li, node, head, hlist)
> -		if (li->plen == plen)
> -			return li;
> -
> -	return NULL;
> -}
> -
>  static inline struct list_head * get_fa_head(struct leaf *l, int plen)
>  {
>  	struct leaf_info *li = find_leaf_info(l, plen);
> @@ -900,26 +897,6 @@ static inline struct list_head * get_fa_
>  	return &li->falh;
>  }
>  
> -static void insert_leaf_info(struct hlist_head *head, struct leaf_info *new)
> -{
> -	struct leaf_info *li = NULL, *last = NULL;
> -	struct hlist_node *node;
> -
> -	if (hlist_empty(head)) {
> -		hlist_add_head_rcu(&new->hlist, head);
> -	} else {
> -		hlist_for_each_entry(li, node, head, hlist) {
> -			if (new->plen > li->plen)
> -				break;
> -
> -			last = li;
> -		}
> -		if (last)
> -			hlist_add_after_rcu(&last->hlist, &new->hlist);
> -		else
> -			hlist_add_before_rcu(&new->hlist, &li->hlist);
> -	}
> -}
>  
>  /* rcu_read_lock needs to be hold by caller from readside */
>  
> @@ -993,6 +970,8 @@ static struct list_head *fib_insert_node
>  	pos = 0;
>  	n = t->trie;
>  
> +	pr_debug("fib_insert_node: %x/%d\n", key, plen);
> +
>  	/* If we point to NULL, stop. Either the tree is empty and we should
>  	 * just put a new leaf in if, or we have reached an empty child slot,
>  	 * and we should just put our new leaf in that.
> @@ -1038,13 +1017,9 @@ static struct list_head *fib_insert_node
>  
>  	if (n != NULL && IS_LEAF(n) && tkey_equals(key, n->key)) {
>  		l = (struct leaf *) n;
> -		li = leaf_info_new(plen);
> -
> -		if (!li)
> -			return NULL;
> -
> +		li = set_leaf_info(l, plen);
>  		fa_head = &li->falh;
> -		insert_leaf_info(&l->list, li);
> +
>  		goto done;
>  	}
>  	l = leaf_new();
> @@ -1053,15 +1028,8 @@ static struct list_head *fib_insert_node
>  		return NULL;
>  
>  	l->key = key;
> -	li = leaf_info_new(plen);
> -
> -	if (!li) {
> -		tnode_free((struct tnode *) l);
> -		return NULL;
> -	}
> -
> +	li = set_leaf_info(l, plen);
>  	fa_head = &li->falh;
> -	insert_leaf_info(&l->list, li);
>  
>  	if (t->trie && n == NULL) {
>  		/* Case 2: n is NULL, and will just insert a new leaf */
> @@ -1091,7 +1059,6 @@ static struct list_head *fib_insert_node
>  		}
>  
>  		if (!tn) {
> -			free_leaf_info(li);
>  			tnode_free((struct tnode *) l);
>  			return NULL;
>  		}
> @@ -1119,7 +1086,7 @@ static struct list_head *fib_insert_node
>  
>  	rcu_assign_pointer(t->trie, trie_rebalance(t, tp));
>  done:
> -	return fa_head;
> +	return &li->falh;
>  }
>  
>  /*
> @@ -1281,14 +1248,14 @@ static int check_leaf(struct trie *t, st
>  		      t_key key,  const struct flowi *flp,
>  		      struct fib_result *res)
>  {
> -	struct leaf_info *li;
> -	struct hlist_head *hhead = &l->list;
> -	struct hlist_node *node;
> +	int bit;
>  
> -	hlist_for_each_entry_rcu(li, node, hhead, hlist) {
> -		int err;
> -		int plen = li->plen;
> +	/* need find highest prefix */
> +	for_each_bit(bit, l->bitmap, INFO_SIZE) {
> +		int plen = PLEN2BIT(bit);
> +		struct leaf_info *li = &l->info[plen];
>  		__be32 mask = inet_make_mask(plen);
> +		int err;
>  
>  		if (l->key != (key & ntohl(mask)))
>  			continue;
> @@ -1622,12 +1589,10 @@ static int fn_trie_delete(struct fib_tab
>  
>  	list_del_rcu(&fa->fa_list);
>  
> -	if (list_empty(fa_head)) {
> -		hlist_del_rcu(&li->hlist);
> -		free_leaf_info(li);
> -	}
> +	if (list_empty(fa_head))
> +		clear_leaf_info(l, plen);
>  
> -	if (hlist_empty(&l->list))
> +	if (leaf_info_empty(l))
>  		trie_leaf_remove(t, key);
>  
>  	if (fa->fa_state & FA_S_ACCESSED)
> @@ -1659,18 +1624,17 @@ static int trie_flush_list(struct trie *
>  static int trie_flush_leaf(struct trie *t, struct leaf *l)
>  {
>  	int found = 0;
> -	struct hlist_head *lih = &l->list;
> -	struct hlist_node *node, *tmp;
> -	struct leaf_info *li = NULL;
> +	int bit;
>  
> -	hlist_for_each_entry_safe(li, node, tmp, lih, hlist) {
> +	for_each_bit(bit, l->bitmap, INFO_SIZE) {
> +		int plen = PLEN2BIT(bit);
> +		struct leaf_info *li = &l->info[plen];
>  		found += trie_flush_list(t, &li->falh);
>  
> -		if (list_empty(&li->falh)) {
> -			hlist_del_rcu(&li->hlist);
> -			free_leaf_info(li);
> -		}
> +		if (list_empty(&li->falh))
> +			clear_leaf_info(l, plen);
>  	}
> +
>  	return found;
>  }
>  
> @@ -1746,12 +1710,12 @@ static int fn_trie_flush(struct fib_tabl
>  	for (h = 0; (l = nextleaf(t, l)) != NULL; h++) {
>  		found += trie_flush_leaf(t, l);
>  
> -		if (ll && hlist_empty(&ll->list))
> +		if (ll && leaf_info_empty(ll))
>  			trie_leaf_remove(t, ll->key);
>  		ll = l;
>  	}
>  
> -	if (ll && hlist_empty(&ll->list))
> +	if (ll && leaf_info_empty(ll))
>  		trie_leaf_remove(t, ll->key);
>  
>  	pr_debug("trie_flush found=%d\n", found);
> @@ -2428,10 +2392,11 @@ static int fib_route_seq_show(struct seq
>  
>  	if (iter->trie == iter->trie_local)
>  		return 0;
> +
>  	if (IS_TNODE(l))
>  		return 0;
>  
> -	for (i=32; i>=0; i--) {
> +	for (i = 32; i >= 0; i--) {
>  		struct leaf_info *li = find_leaf_info(l, i);
>  		struct fib_alias *fa;
>  		__be32 mask, prefix;
> @@ -2439,7 +2404,7 @@ static int fib_route_seq_show(struct seq
>  		if (!li)
>  			continue;
>  
> -		mask = inet_make_mask(li->plen);
> +		mask = inet_make_mask(i);
>  		prefix = htonl(l->key);
>  
>  		list_for_each_entry_rcu(fa, &li->falh, fa_list) {
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 


[-- Attachment #2: info_embedded.patch --]
[-- Type: text/plain, Size: 3670 bytes --]

--- net/ipv4/fib_trie.c
+++ net/ipv4/fib_trie.c.rcu
@@ -97,30 +97,22 @@
 #define IS_LEAF(n) (n->parent & T_LEAF)
 
 struct node {
-	unsigned long		parent;
-	t_key			key;
+	unsigned long parent;
+	t_key key;
 };
 
 struct leaf {
-	unsigned long		parent;
-	t_key			key;
-	/*
-	 * Because we have at least one info per leaf, we embedd one here
-	 * to save some space and speedup lookups (sharing cache line)
-	 * Note : not inside a structure so that we dont have holes on 64bit 
-	 */
-	int 			plen_iinfo;
-	struct list_head	falh_iinfo;
-
-	struct hlist_head	list;
-	struct rcu_head		rcu;
+	unsigned long parent;
+	t_key key;
+	struct hlist_head list;
+	struct rcu_head rcu;
 };
 
 struct leaf_info {
-	struct hlist_node	hlist;
-	int			plen;
-	struct list_head	falh;
-	struct rcu_head		rcu;
+	struct hlist_node hlist;
+	struct rcu_head rcu;
+	int plen;
+	struct list_head falh;
 };
 
 struct tnode {
@@ -381,13 +373,11 @@
 		call_rcu(&tn->rcu, __tnode_free_rcu);
 }
 
-static struct leaf *leaf_new(int plen)
+static struct leaf *leaf_new(void)
 {
 	struct leaf *l = kmalloc(sizeof(struct leaf),  GFP_KERNEL);
 	if (l) {
 		l->parent = T_LEAF;
-		l->plen_iinfo = plen;
-		INIT_LIST_HEAD(&l->falh_iinfo);
 		INIT_HLIST_HEAD(&l->list);
 	}
 	return l;
@@ -909,12 +899,7 @@
 
 static inline struct list_head * get_fa_head(struct leaf *l, int plen)
 {
-	struct leaf_info *li;
-
-	if (plen == l->plen_iinfo)
-		return &l->falh_iinfo;
-
-	li = find_leaf_info(l, plen);
+	struct leaf_info *li = find_leaf_info(l, plen);
 
 	if (!li)
 		return NULL;
@@ -1071,22 +1056,21 @@
 		goto done;
 	}
 	t->size++;
-	l = leaf_new(plen);
+	l = leaf_new();
 
 	if (!l)
 		return NULL;
 
 	l->key = key;
-//	li = leaf_info_new(plen);
+	li = leaf_info_new(plen);
+
+	if (!li) {
+		tnode_free((struct tnode *) l);
+		return NULL;
+	}
 
-//	if (!li) {
-//		tnode_free((struct tnode *) l);
-//		return NULL;
-//	}
-
-//	fa_head = &li->falh;
-//	insert_leaf_info(&l->list, li);
-	fa_head = &l->falh_iinfo;
+	fa_head = &li->falh;
+	insert_leaf_info(&l->list, li);
 
 	if (t->trie && n == NULL) {
 		/* Case 2: n is NULL, and will just insert a new leaf */
@@ -1116,7 +1100,7 @@
 		}
 
 		if (!tn) {
-//			free_leaf_info(li);
+			free_leaf_info(li);
 			tnode_free((struct tnode *) l);
 			return NULL;
 		}
@@ -1640,7 +1624,7 @@
 	li = find_leaf_info(l, plen);
 
 	list_del_rcu(&fa->fa_list);
-// FIXME
+
 	if (list_empty(fa_head)) {
 		hlist_del_rcu(&li->hlist);
 		free_leaf_info(li);
@@ -2382,11 +2366,10 @@
 		seq_indent(seq, iter->depth);
 		seq_printf(seq, "  |-- %d.%d.%d.%d\n", NIPQUAD(val));
 		for (i = 32; i >= 0; i--) {
-//			struct leaf_info *li = get_fa_head(l, i); //find_leaf_info(l, i);
-			struct list_head *fa_head = get_fa_head(l, i);
-			if (fa_head) {
+			struct leaf_info *li = find_leaf_info(l, i);
+			if (li) {
 				struct fib_alias *fa;
-				list_for_each_entry_rcu(fa, fa_head, fa_list) {
+				list_for_each_entry_rcu(fa, &li->falh, fa_list) {
 					seq_indent(seq, iter->depth+1);
 					seq_printf(seq, "  /%d %s %s", i,
 						   rtn_scope(fa->fa_scope),
@@ -2465,18 +2448,17 @@
 		return 0;
 
 	for (i=32; i>=0; i--) {
-		//struct leaf_info *li = find_leaf_info(l, i);
-		struct list_head *fa_head = get_fa_head(l, i);
+		struct leaf_info *li = find_leaf_info(l, i);
 		struct fib_alias *fa;
 		__be32 mask, prefix;
 
-		if (!fa_head)
+		if (!li)
 			continue;
 
-		mask = inet_make_mask(i);
+		mask = inet_make_mask(li->plen);
 		prefix = htonl(l->key);
 
-		list_for_each_entry_rcu(fa, fa_head, fa_list) {
+		list_for_each_entry_rcu(fa, &li->falh, fa_list) {
 			const struct fib_info *fi = fa->fa_info;
 			unsigned flags = fib_flag_trans(fa->fa_type, mask, fi);
 

^ permalink raw reply

* Re: [RFC 6/6] fib_trie: combine leaf and info
From: Eric Dumazet @ 2008-01-15  6:16 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: David Miller, robert.olsson, netdev
In-Reply-To: <478C4EB7.6060807@cosmosbay.com>

[-- Attachment #1: Type: text/plain, Size: 770 bytes --]

Eric Dumazet a écrit :
> Stephen Hemminger a écrit :
>> Combine the prefix information and the leaf together into one
>> allocation. This is furthur simplified by converting the hlist
>> into a simple bitfield.
>>
>> Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
>>
>> This patch is experimental (ie it boots and loads routes), but
>> is slower for the 163,000 route dump test.
>>
> 
> Its also very memory expensive. That was not Robert suggestion I believe.
> 
> Its suggestion is to embedd one info into each leaves.
> 
> Please find attached to this mail a preliminary and incomplete patch I 
> wrote this morning before coffee :), to get the idea.

Oops, patch reversed, sorry, and against another work pending in my tree.

Now time for cofee :)


[-- Attachment #2: info_embedded.patch --]
[-- Type: text/plain, Size: 3670 bytes --]

--- net/ipv4/fib_trie.c.old
+++ net/ipv4/fib_trie.c
@@ -97,22 +97,30 @@
 #define IS_LEAF(n) (n->parent & T_LEAF)
 
 struct node {
-	unsigned long parent;
-	t_key key;
+	unsigned long		parent;
+	t_key			key;
 };
 
 struct leaf {
-	unsigned long parent;
-	t_key key;
-	struct hlist_head list;
-	struct rcu_head rcu;
+	unsigned long		parent;
+	t_key			key;
+	/*
+	 * Because we have at least one info per leaf, we embedd one here
+	 * to save some space and speedup lookups (sharing cache line)
+	 * Note : not inside a structure so that we dont have holes on 64bit 
+	 */
+	int 			plen_iinfo;
+	struct list_head	falh_iinfo;
+
+	struct hlist_head	list;
+	struct rcu_head		rcu;
 };
 
 struct leaf_info {
-	struct hlist_node hlist;
-	struct rcu_head rcu;
-	int plen;
-	struct list_head falh;
+	struct hlist_node	hlist;
+	int			plen;
+	struct list_head	falh;
+	struct rcu_head		rcu;
 };
 
 struct tnode {
@@ -373,11 +381,13 @@
 		call_rcu(&tn->rcu, __tnode_free_rcu);
 }
 
-static struct leaf *leaf_new(void)
+static struct leaf *leaf_new(int plen)
 {
 	struct leaf *l = kmalloc(sizeof(struct leaf),  GFP_KERNEL);
 	if (l) {
 		l->parent = T_LEAF;
+		l->plen_iinfo = plen;
+		INIT_LIST_HEAD(&l->falh_iinfo);
 		INIT_HLIST_HEAD(&l->list);
 	}
 	return l;
@@ -899,7 +909,12 @@
 
 static inline struct list_head * get_fa_head(struct leaf *l, int plen)
 {
-	struct leaf_info *li = find_leaf_info(l, plen);
+	struct leaf_info *li;
+
+	if (plen == l->plen_iinfo)
+		return &l->falh_iinfo;
+
+	li = find_leaf_info(l, plen);
 
 	if (!li)
 		return NULL;
@@ -1056,21 +1071,22 @@
 		goto done;
 	}
 	t->size++;
-	l = leaf_new();
+	l = leaf_new(plen);
 
 	if (!l)
 		return NULL;
 
 	l->key = key;
-	li = leaf_info_new(plen);
-
-	if (!li) {
-		tnode_free((struct tnode *) l);
-		return NULL;
-	}
+//	li = leaf_info_new(plen);
 
-	fa_head = &li->falh;
-	insert_leaf_info(&l->list, li);
+//	if (!li) {
+//		tnode_free((struct tnode *) l);
+//		return NULL;
+//	}
+
+//	fa_head = &li->falh;
+//	insert_leaf_info(&l->list, li);
+	fa_head = &l->falh_iinfo;
 
 	if (t->trie && n == NULL) {
 		/* Case 2: n is NULL, and will just insert a new leaf */
@@ -1100,7 +1116,7 @@
 		}
 
 		if (!tn) {
-			free_leaf_info(li);
+//			free_leaf_info(li);
 			tnode_free((struct tnode *) l);
 			return NULL;
 		}
@@ -1624,7 +1640,7 @@
 	li = find_leaf_info(l, plen);
 
 	list_del_rcu(&fa->fa_list);
-
+// FIXME
 	if (list_empty(fa_head)) {
 		hlist_del_rcu(&li->hlist);
 		free_leaf_info(li);
@@ -2366,10 +2382,11 @@
 		seq_indent(seq, iter->depth);
 		seq_printf(seq, "  |-- %d.%d.%d.%d\n", NIPQUAD(val));
 		for (i = 32; i >= 0; i--) {
-			struct leaf_info *li = find_leaf_info(l, i);
-			if (li) {
+//			struct leaf_info *li = get_fa_head(l, i); //find_leaf_info(l, i);
+			struct list_head *fa_head = get_fa_head(l, i);
+			if (fa_head) {
 				struct fib_alias *fa;
-				list_for_each_entry_rcu(fa, &li->falh, fa_list) {
+				list_for_each_entry_rcu(fa, fa_head, fa_list) {
 					seq_indent(seq, iter->depth+1);
 					seq_printf(seq, "  /%d %s %s", i,
 						   rtn_scope(fa->fa_scope),
@@ -2448,17 +2465,18 @@
 		return 0;
 
 	for (i=32; i>=0; i--) {
-		struct leaf_info *li = find_leaf_info(l, i);
+		//struct leaf_info *li = find_leaf_info(l, i);
+		struct list_head *fa_head = get_fa_head(l, i);
 		struct fib_alias *fa;
 		__be32 mask, prefix;
 
-		if (!li)
+		if (!fa_head)
 			continue;
 
-		mask = inet_make_mask(li->plen);
+		mask = inet_make_mask(i);
 		prefix = htonl(l->key);
 
-		list_for_each_entry_rcu(fa, &li->falh, fa_list) {
+		list_for_each_entry_rcu(fa, fa_head, fa_list) {
 			const struct fib_info *fi = fa->fa_info;
 			unsigned flags = fib_flag_trans(fa->fa_type, mask, fi);
 

^ permalink raw reply

* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang
From: Frans Pop @ 2008-01-15  6:17 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel
In-Reply-To: <20080114.215317.38045859.davem@davemloft.net>

Wow. That's fast! :-)

On Tuesday 15 January 2008, David Miller wrote:
> From: Frans Pop <elendil@planet.nl>
>
> > kernel: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
>
> Does this make the problem go away?

I'm compiling a kernel with the patch now. Will let you know the result.
May take a while as I don't know how to trigger the bug, so I'll just have 
to let it run for some time.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox