Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH] virtio-net: move sg off stack
From: David Miller @ 2010-04-02  2:26 UTC (permalink / raw)
  To: mst; +Cc: rusty, jpirko, xma, netdev, linux-kernel
In-Reply-To: <20100331124156.GA4598@redhat.com>

From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Wed, 31 Mar 2010 15:41:58 +0300

> Move sg structure off stack and into virtnet_info structure.
> This helps remove extra sg_init_table calls as well as reduce
> stack usage.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> 
> Compile-tested only for now.
> Shirley, Rusty Russell, does this fix CONFIG_DEBUG_SG=y for you?

I'll apply this once I see some ACKs on testing.

Thanks.

^ permalink raw reply

* Re: [PATCH 0/6] updated cxgb4 submission v6
From: David Miller @ 2010-04-02  2:31 UTC (permalink / raw)
  To: dm; +Cc: netdev
In-Reply-To: <1270171707-26058-1-git-send-email-dm@chelsio.com>

From: Dimitris Michailidis <dm@chelsio.com>
Date: Thu,  1 Apr 2010 17:28:21 -0800

> 
> This is an updated patch series for the cxgb4 driver.  Changes from previous
> version:
> 
> - Replaced __u* types with u*
> - Of the proc files I kept one and moved it to debugfs (it dumps a largish
> table, it's not a good fit for a value-per-file model).  I've removed all
> the other proc files.  I'll look at making some of their information available
> in sysfs as separate work.
> 
> The patches are against net-next as of 6c57990.

Ok, this looks good enough now, thanks.

Since it's a self-contained new driver I've added it to net-2.6

Thanks.

^ permalink raw reply

* Re: [PATCH] r8169: clean up my printk uglyness
From: David Miller @ 2010-04-02  2:33 UTC (permalink / raw)
  To: nhorman; +Cc: netdev, romieu, brandon
In-Reply-To: <20100401173007.GC14069@shamino.rdu.redhat.com>

From: Neil Horman <nhorman@tuxdriver.com>
Date: Thu, 1 Apr 2010 13:30:07 -0400

> On Thu, Apr 01, 2010 at 10:53:56AM -0400, Neil Horman wrote:
>> Fix formatting on r8169 printk
>> 
>> Brandon Philips noted that I had a spacing issue in my printk for the last r8169
>> patch that made it quite ugly.  Fix that up and add the PFX macro to it as well
>> so it looks like the other r8169 printks
>> 
>> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
>> 
>> 
> Grr, yes thanks, V2, fixing my idiocy :)
> 
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>

Applied, thanks Neil.

^ permalink raw reply

* Re: [PATCH 1/1] e1000e: stop cleaning when we reach tx_ring->next_to_use
From: David Miller @ 2010-04-02  2:33 UTC (permalink / raw)
  To: terry.loftin; +Cc: netdev, e1000-devel, jeffrey.t.kirsher, jesse.brandeburg
In-Reply-To: <4BB3AA30.3060906@hp.com>

From: Terry Loftin <terry.loftin@hp.com>
Date: Wed, 31 Mar 2010 14:01:52 -0600

> Tx ring buffers after tx_ring->next_to_use are volatile and could
> change, possibly causing a crash.  Stop cleaning when we hit
> tx_ring->next_to_use.
> 
> Signed-off-by: Terry Loftin <terry.loftin@hp.com>

Intel folks please look carefully at this fix.  I want to
get it into net-2.6 if it's correct.

Thanks.

^ permalink raw reply

* Re: [PATCH net-next-2.6] igb: restrict WoL for 82576 ET2 Quad Port Server Adapter
From: David Miller @ 2010-04-02  2:34 UTC (permalink / raw)
  To: sassmann; +Cc: netdev, alexander.h.duyck, jeffrey.t.kirsher
In-Reply-To: <4BB4D13C.8030303@redhat.com>

From: Stefan Assmann <sassmann@redhat.com>
Date: Thu, 01 Apr 2010 19:00:44 +0200

> From: Stefan Assmann <sassmann@redhat.com>
> 
> Restrict Wake-on-LAN to first port on 82576 ET2 quad port NICs, as it is
> only supported there.
> 
> Signed-off-by: Stefan Assmann <sassmann@redhat.com>

Intel folks I assume you will suck this in and resubmit to
me?

Thanks.

^ permalink raw reply

* Re: [PATCH 1/2] phylib: Support phy module autoloading
From: David Miller @ 2010-04-02  2:38 UTC (permalink / raw)
  To: ben; +Cc: dwmw2, netdev, 553024
In-Reply-To: <20100401180511.GQ16821@decadent.org.uk>

From: Ben Hutchings <ben@decadent.org.uk>
Date: Thu, 1 Apr 2010 19:05:12 +0100

> On Thu, Apr 01, 2010 at 06:03:48PM +0100, David Woodhouse wrote:
>> On Thu, 2010-04-01 at 05:34 +0100, Ben Hutchings wrote:
> [...]
>> > Since you've dealt with (a), and (b) is not really as important, I would
>> > just like to suggest some minor changes to your patch 1 (see below).
>> > Feel free to fold them in.  Your patch 2 would then need the
>> > substitutions s/phy_device_id/mdio_device_id/; s/TABLE(phy/TABLE(mdio/.
>> 
>> I'll tolerate the silly __u32 crap if I must for consistency, but
>> normally I prefer to write in C.
>> 
>> I did think about 'mdio:' for the module alias, but I decided that
>> 'phy:' probably made more sense since these are PHY driver modules and
>> the number is the phy_id.
> [...]
> 
> Many multi-layered communication standards have distinct PHY devices,
> and they presumably have their own ID spaces.  phylib deals only with
> management of Ethernet PHYs over an MDIO bus, identified using MDIO
> ID registers.

Agreed, PHYs exist on so many different kinds of topologies, the ones
here are definitely specific to Ethernet and MDIO and therefore that
is probably the more useful basis for naming.

David can you freshen things up in this area and integrate whatever
you deem useful and immediate from Ben's patch?  Whatever you submit
next I'd like to toss into net-next-2.6 so it can cook for a while
and maybe we'll backport it so that this bug can be fixed for good
upstream and then perhaps even in -stable.

Thanks!

^ permalink raw reply

* Re: [PATCH 0/4] xfrm fixes and flow structurization
From: David Miller @ 2010-04-02  2:42 UTC (permalink / raw)
  To: timo.teras; +Cc: netdev, herbert
In-Reply-To: <1270030626-16687-1-git-send-email-timo.teras@iki.fi>

From: Timo Teras <timo.teras@iki.fi>
Date: Wed, 31 Mar 2010 13:17:01 +0300

> These are fixes and cleanups which should be good for merging in.
> Patches 1 and 2 are new. Patches 3 and 4 were previously discussed
> with Herbert.
> 
> Please review and consider committing these.

All applied to net-next-2.6, and I used Herbert version of the genid
change, patch #1.

Thanks!

^ permalink raw reply

* Re: [PATCH net-next] bnx2x: Added GRO support
From: David Miller @ 2010-04-02  2:46 UTC (permalink / raw)
  To: dmitry; +Cc: netdev, eilong, vladz
In-Reply-To: <1270122573.22693.6.camel@lb-tlvb-dmitry>

From: "Dmitry Kravkov" <dmitry@broadcom.com>
Date: Thu, 1 Apr 2010 14:49:33 +0300

> Adding GRO support on top of the HW LRO (TPA) support ^[$(Q#|^[(B
> there is no measurable performance drawback of adding GRO
> on top of it, and it allows better performance when LRO (TPA)
> is turned off for virtualization or bridging.
> 
> 
> Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
> Signed-off-by: Eilon Greenstein <eilong@broadcom.com>

Applied, thank you.

^ permalink raw reply

* Re: [PATCH RFC] fix problems with NETIF_F_HIGHDMA in networking drivers v2
From: David Miller @ 2010-04-02  2:53 UTC (permalink / raw)
  To: fujita.tomonori; +Cc: hancockrwd, linux-kernel, netdev, linux-usb, bzolnier
In-Reply-To: <20100331173507I.fujita.tomonori@lab.ntt.co.jp>

From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Date: Wed, 31 Mar 2010 17:35:50 +0900

> From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> Subject: [PATCH] net: change illegal_highdma to use dma_mask

This is a good start, applied to net-next-2.6, thanks!

^ permalink raw reply

* Re: [PATCH] netdev/fec.c: add phylib supporting to enable carrier detection (v2)
From: David Miller @ 2010-04-02  2:53 UTC (permalink / raw)
  To: s.hauer
  Cc: bryan.wu, gerg, amit.kucheria, w.sang, kernel-team, netdev,
	linux-arm-kernel, linux-kernel
In-Reply-To: <20100331122604.GV2241@pengutronix.de>

From: Sascha Hauer <s.hauer@pengutronix.de>
Date: Wed, 31 Mar 2010 14:26:04 +0200

> On Wed, Mar 31, 2010 at 08:10:44PM +0800, Bryan Wu wrote:
>> BugLink: http://bugs.launchpad.net/bugs/457878
>> 
>> v2:
>>  - remove duplicated phy_speed caculation
>>  - fix the phy_speed caculation according to the DataSheet
>> 
>> v1:
>>  - removed old MII phy control code
>>  - add phylib supporting
>>  - add ethtool interface to make user space NetworkManager works
>> 
>> Tested on Freescale i.MX51 Babbage board.
>> 
>> This patch is based on a patch from Frederic Rodo <fred.rodo@gmail.com>
>> 
>> Cc: Frederic Rodo <fred.rodo@gmail.com>
>> Signed-off-by: Bryan Wu <bryan.wu@canonical.com>
> 
> Acked-by: Sascha Hauer <s.hauer@pengutronix.de>

Applied to net-next-2.6, thanks.

^ permalink raw reply

* Re: [PATCH] net/pcmcia/3c589_cs: using netdev_info and friends where appropriate
From: David Miller @ 2010-04-02  2:54 UTC (permalink / raw)
  To: linux; +Cc: joe, ken_kawasaki, linux, damm, ben, netdev, linux-kernel
In-Reply-To: <1270039330-5775-1-git-send-email-linux@kbdbabel.org>

From: linux@kbdbabel.org
Date: Wed, 31 Mar 2010 16:42:10 +0400

> From: Alexander Kurz <linux@kbdbabel.org>
> 
> Signed-off-by: Alexander Kurz <linux@kbdbabel.org>

Applied, thanks.

^ permalink raw reply

* Re: [net-next PATCH] be2net: Adding PCI SRIOV support
From: David Miller @ 2010-04-02  2:54 UTC (permalink / raw)
  To: sarveshwarb; +Cc: netdev
In-Reply-To: <20100331125601.GA9725@serverengines.com>

From: Sarveshwar Bandi <sarveshwarb@serverengines.com>
Date: Wed, 31 Mar 2010 18:26:12 +0530

> - Patch adds support to enable PCI SRIOV in the driver and changes to handle initialization of PCI virtual functions.
> - Function handler to change mac addresses for VF from its corresponding PF.
> 
> Signed-off-by: Sarveshwar Bandi <sarveshwarb@serverengines.com>

Applied, thanks.

^ permalink raw reply

* [PATCH] bnx2x: use the dma state API instead of the pci equivalents
From: FUJITA Tomonori @ 2010-04-02  2:56 UTC (permalink / raw)
  To: netdev; +Cc: eilong

The DMA API is preferred.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
---
 drivers/net/bnx2x.h      |    4 ++--
 drivers/net/bnx2x_main.c |   28 ++++++++++++++--------------
 2 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/drivers/net/bnx2x.h b/drivers/net/bnx2x.h
index 3c48a7a..ae9c89e 100644
--- a/drivers/net/bnx2x.h
+++ b/drivers/net/bnx2x.h
@@ -163,7 +163,7 @@ do {								 \
 
 struct sw_rx_bd {
 	struct sk_buff	*skb;
-	DECLARE_PCI_UNMAP_ADDR(mapping)
+	DEFINE_DMA_UNMAP_ADDR(mapping);
 };
 
 struct sw_tx_bd {
@@ -176,7 +176,7 @@ struct sw_tx_bd {
 
 struct sw_rx_page {
 	struct page	*page;
-	DECLARE_PCI_UNMAP_ADDR(mapping)
+	DEFINE_DMA_UNMAP_ADDR(mapping);
 };
 
 union db_prod {
diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c
index 6c042a7..2a77611 100644
--- a/drivers/net/bnx2x_main.c
+++ b/drivers/net/bnx2x_main.c
@@ -1086,7 +1086,7 @@ static inline void bnx2x_free_rx_sge(struct bnx2x *bp,
 	if (!page)
 		return;
 
-	pci_unmap_page(bp->pdev, pci_unmap_addr(sw_buf, mapping),
+	pci_unmap_page(bp->pdev, dma_unmap_addr(sw_buf, mapping),
 		       SGE_PAGE_SIZE*PAGES_PER_SGE, PCI_DMA_FROMDEVICE);
 	__free_pages(page, PAGES_PER_SGE_SHIFT);
 
@@ -1123,7 +1123,7 @@ static inline int bnx2x_alloc_rx_sge(struct bnx2x *bp,
 	}
 
 	sw_buf->page = page;
-	pci_unmap_addr_set(sw_buf, mapping, mapping);
+	dma_unmap_addr_set(sw_buf, mapping, mapping);
 
 	sge->addr_hi = cpu_to_le32(U64_HI(mapping));
 	sge->addr_lo = cpu_to_le32(U64_LO(mapping));
@@ -1151,7 +1151,7 @@ static inline int bnx2x_alloc_rx_skb(struct bnx2x *bp,
 	}
 
 	rx_buf->skb = skb;
-	pci_unmap_addr_set(rx_buf, mapping, mapping);
+	dma_unmap_addr_set(rx_buf, mapping, mapping);
 
 	rx_bd->addr_hi = cpu_to_le32(U64_HI(mapping));
 	rx_bd->addr_lo = cpu_to_le32(U64_LO(mapping));
@@ -1174,12 +1174,12 @@ static void bnx2x_reuse_rx_skb(struct bnx2x_fastpath *fp,
 	struct eth_rx_bd *prod_bd = &fp->rx_desc_ring[prod];
 
 	pci_dma_sync_single_for_device(bp->pdev,
-				       pci_unmap_addr(cons_rx_buf, mapping),
+				       dma_unmap_addr(cons_rx_buf, mapping),
 				       RX_COPY_THRESH, PCI_DMA_FROMDEVICE);
 
 	prod_rx_buf->skb = cons_rx_buf->skb;
-	pci_unmap_addr_set(prod_rx_buf, mapping,
-			   pci_unmap_addr(cons_rx_buf, mapping));
+	dma_unmap_addr_set(prod_rx_buf, mapping,
+			   dma_unmap_addr(cons_rx_buf, mapping));
 	*prod_bd = *cons_bd;
 }
 
@@ -1285,7 +1285,7 @@ static void bnx2x_tpa_start(struct bnx2x_fastpath *fp, u16 queue,
 	prod_rx_buf->skb = fp->tpa_pool[queue].skb;
 	mapping = pci_map_single(bp->pdev, fp->tpa_pool[queue].skb->data,
 				 bp->rx_buf_size, PCI_DMA_FROMDEVICE);
-	pci_unmap_addr_set(prod_rx_buf, mapping, mapping);
+	dma_unmap_addr_set(prod_rx_buf, mapping, mapping);
 
 	/* move partial skb from cons to pool (don't unmap yet) */
 	fp->tpa_pool[queue] = *cons_rx_buf;
@@ -1361,7 +1361,7 @@ static int bnx2x_fill_frag_skb(struct bnx2x *bp, struct bnx2x_fastpath *fp,
 		}
 
 		/* Unmap the page as we r going to pass it to the stack */
-		pci_unmap_page(bp->pdev, pci_unmap_addr(&old_rx_pg, mapping),
+		pci_unmap_page(bp->pdev, dma_unmap_addr(&old_rx_pg, mapping),
 			      SGE_PAGE_SIZE*PAGES_PER_SGE, PCI_DMA_FROMDEVICE);
 
 		/* Add one frag and update the appropriate fields in the skb */
@@ -1389,7 +1389,7 @@ static void bnx2x_tpa_stop(struct bnx2x *bp, struct bnx2x_fastpath *fp,
 	/* Unmap skb in the pool anyway, as we are going to change
 	   pool entry status to BNX2X_TPA_STOP even if new skb allocation
 	   fails. */
-	pci_unmap_single(bp->pdev, pci_unmap_addr(rx_buf, mapping),
+	pci_unmap_single(bp->pdev, dma_unmap_addr(rx_buf, mapping),
 			 bp->rx_buf_size, PCI_DMA_FROMDEVICE);
 
 	if (likely(new_skb)) {
@@ -1621,7 +1621,7 @@ static int bnx2x_rx_int(struct bnx2x_fastpath *fp, int budget)
 			}
 
 			pci_dma_sync_single_for_device(bp->pdev,
-					pci_unmap_addr(rx_buf, mapping),
+					dma_unmap_addr(rx_buf, mapping),
 						       pad + RX_COPY_THRESH,
 						       PCI_DMA_FROMDEVICE);
 			prefetch(skb);
@@ -1666,7 +1666,7 @@ static int bnx2x_rx_int(struct bnx2x_fastpath *fp, int budget)
 			} else
 			if (likely(bnx2x_alloc_rx_skb(bp, fp, bd_prod) == 0)) {
 				pci_unmap_single(bp->pdev,
-					pci_unmap_addr(rx_buf, mapping),
+					dma_unmap_addr(rx_buf, mapping),
 						 bp->rx_buf_size,
 						 PCI_DMA_FROMDEVICE);
 				skb_reserve(skb, pad);
@@ -4941,7 +4941,7 @@ static inline void bnx2x_free_tpa_pool(struct bnx2x *bp,
 
 		if (fp->tpa_state[i] == BNX2X_TPA_START)
 			pci_unmap_single(bp->pdev,
-					 pci_unmap_addr(rx_buf, mapping),
+					 dma_unmap_addr(rx_buf, mapping),
 					 bp->rx_buf_size, PCI_DMA_FROMDEVICE);
 
 		dev_kfree_skb(skb);
@@ -4978,7 +4978,7 @@ static void bnx2x_init_rx_rings(struct bnx2x *bp)
 					fp->disable_tpa = 1;
 					break;
 				}
-				pci_unmap_addr_set((struct sw_rx_bd *)
+				dma_unmap_addr_set((struct sw_rx_bd *)
 							&bp->fp->tpa_pool[i],
 						   mapping, 0);
 				fp->tpa_state[i] = BNX2X_TPA_STOP;
@@ -6907,7 +6907,7 @@ static void bnx2x_free_rx_skbs(struct bnx2x *bp)
 				continue;
 
 			pci_unmap_single(bp->pdev,
-					 pci_unmap_addr(rx_buf, mapping),
+					 dma_unmap_addr(rx_buf, mapping),
 					 bp->rx_buf_size, PCI_DMA_FROMDEVICE);
 
 			rx_buf->skb = NULL;
-- 
1.7.0


^ permalink raw reply related

* [PATCH] bnx2: use the dma state API instead of the pci equivalents
From: FUJITA Tomonori @ 2010-04-02  2:56 UTC (permalink / raw)
  To: netdev; +Cc: mchan

The DMA API is preferred.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
---
 drivers/net/bnx2.c |   40 ++++++++++++++++++++--------------------
 drivers/net/bnx2.h |    6 +++---
 2 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c
index 381887b..c55e46f 100644
--- a/drivers/net/bnx2.c
+++ b/drivers/net/bnx2.c
@@ -2668,7 +2668,7 @@ bnx2_alloc_rx_page(struct bnx2 *bp, struct bnx2_rx_ring_info *rxr, u16 index)
 	}
 
 	rx_pg->page = page;
-	pci_unmap_addr_set(rx_pg, mapping, mapping);
+	dma_unmap_addr_set(rx_pg, mapping, mapping);
 	rxbd->rx_bd_haddr_hi = (u64) mapping >> 32;
 	rxbd->rx_bd_haddr_lo = (u64) mapping & 0xffffffff;
 	return 0;
@@ -2683,7 +2683,7 @@ bnx2_free_rx_page(struct bnx2 *bp, struct bnx2_rx_ring_info *rxr, u16 index)
 	if (!page)
 		return;
 
-	pci_unmap_page(bp->pdev, pci_unmap_addr(rx_pg, mapping), PAGE_SIZE,
+	pci_unmap_page(bp->pdev, dma_unmap_addr(rx_pg, mapping), PAGE_SIZE,
 		       PCI_DMA_FROMDEVICE);
 
 	__free_page(page);
@@ -2715,7 +2715,7 @@ bnx2_alloc_rx_skb(struct bnx2 *bp, struct bnx2_rx_ring_info *rxr, u16 index)
 	}
 
 	rx_buf->skb = skb;
-	pci_unmap_addr_set(rx_buf, mapping, mapping);
+	dma_unmap_addr_set(rx_buf, mapping, mapping);
 
 	rxbd->rx_bd_haddr_hi = (u64) mapping >> 32;
 	rxbd->rx_bd_haddr_lo = (u64) mapping & 0xffffffff;
@@ -2814,7 +2814,7 @@ bnx2_tx_int(struct bnx2 *bp, struct bnx2_napi *bnapi, int budget)
 			}
 		}
 
-		pci_unmap_single(bp->pdev, pci_unmap_addr(tx_buf, mapping),
+		pci_unmap_single(bp->pdev, dma_unmap_addr(tx_buf, mapping),
 			skb_headlen(skb), PCI_DMA_TODEVICE);
 
 		tx_buf->skb = NULL;
@@ -2824,7 +2824,7 @@ bnx2_tx_int(struct bnx2 *bp, struct bnx2_napi *bnapi, int budget)
 			sw_cons = NEXT_TX_BD(sw_cons);
 
 			pci_unmap_page(bp->pdev,
-				pci_unmap_addr(
+				dma_unmap_addr(
 					&txr->tx_buf_ring[TX_RING_IDX(sw_cons)],
 					mapping),
 				skb_shinfo(skb)->frags[i].size,
@@ -2906,8 +2906,8 @@ bnx2_reuse_rx_skb_pages(struct bnx2 *bp, struct bnx2_rx_ring_info *rxr,
 		if (prod != cons) {
 			prod_rx_pg->page = cons_rx_pg->page;
 			cons_rx_pg->page = NULL;
-			pci_unmap_addr_set(prod_rx_pg, mapping,
-				pci_unmap_addr(cons_rx_pg, mapping));
+			dma_unmap_addr_set(prod_rx_pg, mapping,
+				dma_unmap_addr(cons_rx_pg, mapping));
 
 			prod_bd->rx_bd_haddr_hi = cons_bd->rx_bd_haddr_hi;
 			prod_bd->rx_bd_haddr_lo = cons_bd->rx_bd_haddr_lo;
@@ -2931,7 +2931,7 @@ bnx2_reuse_rx_skb(struct bnx2 *bp, struct bnx2_rx_ring_info *rxr,
 	prod_rx_buf = &rxr->rx_buf_ring[prod];
 
 	pci_dma_sync_single_for_device(bp->pdev,
-		pci_unmap_addr(cons_rx_buf, mapping),
+		dma_unmap_addr(cons_rx_buf, mapping),
 		BNX2_RX_OFFSET + BNX2_RX_COPY_THRESH, PCI_DMA_FROMDEVICE);
 
 	rxr->rx_prod_bseq += bp->rx_buf_use_size;
@@ -2941,8 +2941,8 @@ bnx2_reuse_rx_skb(struct bnx2 *bp, struct bnx2_rx_ring_info *rxr,
 	if (cons == prod)
 		return;
 
-	pci_unmap_addr_set(prod_rx_buf, mapping,
-			pci_unmap_addr(cons_rx_buf, mapping));
+	dma_unmap_addr_set(prod_rx_buf, mapping,
+			dma_unmap_addr(cons_rx_buf, mapping));
 
 	cons_bd = &rxr->rx_desc_ring[RX_RING(cons)][RX_IDX(cons)];
 	prod_bd = &rxr->rx_desc_ring[RX_RING(prod)][RX_IDX(prod)];
@@ -3015,7 +3015,7 @@ bnx2_rx_skb(struct bnx2 *bp, struct bnx2_rx_ring_info *rxr, struct sk_buff *skb,
 			/* Don't unmap yet.  If we're unable to allocate a new
 			 * page, we need to recycle the page and the DMA addr.
 			 */
-			mapping_old = pci_unmap_addr(rx_pg, mapping);
+			mapping_old = dma_unmap_addr(rx_pg, mapping);
 			if (i == pages - 1)
 				frag_len -= 4;
 
@@ -3096,7 +3096,7 @@ bnx2_rx_int(struct bnx2 *bp, struct bnx2_napi *bnapi, int budget)
 
 		rx_buf->skb = NULL;
 
-		dma_addr = pci_unmap_addr(rx_buf, mapping);
+		dma_addr = dma_unmap_addr(rx_buf, mapping);
 
 		pci_dma_sync_single_for_cpu(bp->pdev, dma_addr,
 			BNX2_RX_OFFSET + BNX2_RX_COPY_THRESH,
@@ -5310,7 +5310,7 @@ bnx2_free_tx_skbs(struct bnx2 *bp)
 			}
 
 			pci_unmap_single(bp->pdev,
-					 pci_unmap_addr(tx_buf, mapping),
+					 dma_unmap_addr(tx_buf, mapping),
 					 skb_headlen(skb),
 					 PCI_DMA_TODEVICE);
 
@@ -5321,7 +5321,7 @@ bnx2_free_tx_skbs(struct bnx2 *bp)
 			for (k = 0; k < last; k++, j++) {
 				tx_buf = &txr->tx_buf_ring[TX_RING_IDX(j)];
 				pci_unmap_page(bp->pdev,
-					pci_unmap_addr(tx_buf, mapping),
+					dma_unmap_addr(tx_buf, mapping),
 					skb_shinfo(skb)->frags[k].size,
 					PCI_DMA_TODEVICE);
 			}
@@ -5351,7 +5351,7 @@ bnx2_free_rx_skbs(struct bnx2 *bp)
 				continue;
 
 			pci_unmap_single(bp->pdev,
-					 pci_unmap_addr(rx_buf, mapping),
+					 dma_unmap_addr(rx_buf, mapping),
 					 bp->rx_buf_use_size,
 					 PCI_DMA_FROMDEVICE);
 
@@ -5761,7 +5761,7 @@ bnx2_run_loopback(struct bnx2 *bp, int loopback_mode)
 	skb_reserve(rx_skb, BNX2_RX_OFFSET);
 
 	pci_dma_sync_single_for_cpu(bp->pdev,
-		pci_unmap_addr(rx_buf, mapping),
+		dma_unmap_addr(rx_buf, mapping),
 		bp->rx_buf_size, PCI_DMA_FROMDEVICE);
 
 	if (rx_hdr->l2_fhdr_status &
@@ -6420,7 +6420,7 @@ bnx2_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	tx_buf = &txr->tx_buf_ring[ring_prod];
 	tx_buf->skb = skb;
-	pci_unmap_addr_set(tx_buf, mapping, mapping);
+	dma_unmap_addr_set(tx_buf, mapping, mapping);
 
 	txbd = &txr->tx_desc_ring[ring_prod];
 
@@ -6445,7 +6445,7 @@ bnx2_start_xmit(struct sk_buff *skb, struct net_device *dev)
 			len, PCI_DMA_TODEVICE);
 		if (pci_dma_mapping_error(bp->pdev, mapping))
 			goto dma_error;
-		pci_unmap_addr_set(&txr->tx_buf_ring[ring_prod], mapping,
+		dma_unmap_addr_set(&txr->tx_buf_ring[ring_prod], mapping,
 				   mapping);
 
 		txbd->tx_bd_haddr_hi = (u64) mapping >> 32;
@@ -6482,7 +6482,7 @@ dma_error:
 	ring_prod = TX_RING_IDX(prod);
 	tx_buf = &txr->tx_buf_ring[ring_prod];
 	tx_buf->skb = NULL;
-	pci_unmap_single(bp->pdev, pci_unmap_addr(tx_buf, mapping),
+	pci_unmap_single(bp->pdev, dma_unmap_addr(tx_buf, mapping),
 			 skb_headlen(skb), PCI_DMA_TODEVICE);
 
 	/* unmap remaining mapped pages */
@@ -6490,7 +6490,7 @@ dma_error:
 		prod = NEXT_TX_BD(prod);
 		ring_prod = TX_RING_IDX(prod);
 		tx_buf = &txr->tx_buf_ring[ring_prod];
-		pci_unmap_page(bp->pdev, pci_unmap_addr(tx_buf, mapping),
+		pci_unmap_page(bp->pdev, dma_unmap_addr(tx_buf, mapping),
 			       skb_shinfo(skb)->frags[i].size,
 			       PCI_DMA_TODEVICE);
 	}
diff --git a/drivers/net/bnx2.h b/drivers/net/bnx2.h
index cd4b0e4..ab34a5d 100644
--- a/drivers/net/bnx2.h
+++ b/drivers/net/bnx2.h
@@ -6551,17 +6551,17 @@ struct l2_fhdr {
 
 struct sw_bd {
 	struct sk_buff		*skb;
-	DECLARE_PCI_UNMAP_ADDR(mapping)
+	DEFINE_DMA_UNMAP_ADDR(mapping);
 };
 
 struct sw_pg {
 	struct page		*page;
-	DECLARE_PCI_UNMAP_ADDR(mapping)
+	DEFINE_DMA_UNMAP_ADDR(mapping);
 };
 
 struct sw_tx_bd {
 	struct sk_buff		*skb;
-	DECLARE_PCI_UNMAP_ADDR(mapping)
+	DEFINE_DMA_UNMAP_ADDR(mapping);
 	unsigned short		is_gso;
 	unsigned short		nr_frags;
 };
-- 
1.7.0


^ permalink raw reply related

* [PATCH 2/2] benet: fix the misusage of zero dma address
From: FUJITA Tomonori @ 2010-04-02  2:53 UTC (permalink / raw)
  To: sathyap, subbus; +Cc: sarveshwarb, ajitk, netdev, fujita.tomonori
In-Reply-To: <1270176803-8561-1-git-send-email-fujita.tomonori@lab.ntt.co.jp>

benet driver wrongly assumes that zero is an invalid dma address
(calls dma_unmap_page for only non zero dma addresses). Zero is a
valid dma address on some architectures. The dma length can be used
here.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
---
 drivers/net/benet/be_main.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/benet/be_main.c b/drivers/net/benet/be_main.c
index 8d5e27b..8828c7d 100644
--- a/drivers/net/benet/be_main.c
+++ b/drivers/net/benet/be_main.c
@@ -394,13 +394,15 @@ static void unmap_tx_frag(struct pci_dev *pdev, struct be_eth_wrb *wrb,
 	be_dws_le_to_cpu(wrb, sizeof(*wrb));
 
 	dma = (u64)wrb->frag_pa_hi << 32 | (u64)wrb->frag_pa_lo;
-	if (dma != 0) {
+	if (wrb->frag_len) {
 		if (unmap_single)
 			pci_unmap_single(pdev, dma, wrb->frag_len,
 				PCI_DMA_TODEVICE);
 		else
 			pci_unmap_page(pdev, dma, wrb->frag_len,
 				PCI_DMA_TODEVICE);
+
+		wrb->frag_len = 0;
 	}
 }
 
@@ -466,9 +468,9 @@ dma_err:
 	txq->head = map_head;
 	while (copied) {
 		wrb = queue_head_node(txq);
+		copied -= wrb->frag_len;
 		unmap_tx_frag(pdev, wrb, map_single);
 		map_single = false;
-		copied -= wrb->frag_len;
 		queue_head_inc(txq);
 	}
 	return 0;
-- 
1.7.0


^ permalink raw reply related

* [PATCH 1/2] benet: use the dma state API instead of the pci equivalents
From: FUJITA Tomonori @ 2010-04-02  2:53 UTC (permalink / raw)
  To: sathyap, subbus; +Cc: sarveshwarb, ajitk, netdev, fujita.tomonori

The DMA API is preferred.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
---
 drivers/net/benet/be.h      |    2 +-
 drivers/net/benet/be_main.c |    4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/benet/be.h b/drivers/net/benet/be.h
index 8f07525..bddae08 100644
--- a/drivers/net/benet/be.h
+++ b/drivers/net/benet/be.h
@@ -206,7 +206,7 @@ struct be_tx_obj {
 /* Struct to remember the pages posted for rx frags */
 struct be_rx_page_info {
 	struct page *page;
-	dma_addr_t bus;
+	DEFINE_DMA_UNMAP_ADDR(bus);
 	u16 page_offset;
 	bool last_page_user;
 };
diff --git a/drivers/net/benet/be_main.c b/drivers/net/benet/be_main.c
index 17282df..8d5e27b 100644
--- a/drivers/net/benet/be_main.c
+++ b/drivers/net/benet/be_main.c
@@ -682,7 +682,7 @@ get_rx_page_info(struct be_adapter *adapter, u16 frag_idx)
 	BUG_ON(!rx_page_info->page);
 
 	if (rx_page_info->last_page_user) {
-		pci_unmap_page(adapter->pdev, pci_unmap_addr(rx_page_info, bus),
+		pci_unmap_page(adapter->pdev, dma_unmap_addr(rx_page_info, bus),
 			adapter->big_page_size, PCI_DMA_FROMDEVICE);
 		rx_page_info->last_page_user = false;
 	}
@@ -993,7 +993,7 @@ static void be_post_rx_frags(struct be_adapter *adapter)
 		}
 		page_offset = page_info->page_offset;
 		page_info->page = pagep;
-		pci_unmap_addr_set(page_info, bus, page_dmaaddr);
+		dma_unmap_addr_set(page_info, bus, page_dmaaddr);
 		frag_dmaaddr = page_dmaaddr + page_info->page_offset;
 
 		rxd = queue_head_node(rxq);
-- 
1.7.0


^ permalink raw reply related

* Re: [PATCH 0/4] caching bundles, iteration 3
From: David Miller @ 2010-04-02  3:00 UTC (permalink / raw)
  To: timo.teras; +Cc: netdev, herbert
In-Reply-To: <1270126340-30181-1-git-send-email-timo.teras@iki.fi>

From: Timo Teras <timo.teras@iki.fi>
Date: Thu,  1 Apr 2010 15:52:16 +0300

> Applies on top of the previous patches I sent.
> 
> Changes to previous iteration:
>  - "flow: delayed deletion of flow cache entries" refactored to go
>    after the other patches per Herbert's request
>  - fixed hlist searching in the above mentioned patch
>  - uses now ERR_CAST and other similar functions for better readability
>  - added basic gc for per-socket bundles
>  - some other minor clean ups
> 
> I'm now running this on a test box, with my specific setup, and it
> seems to be working pretty well. However, this changes quite a bit
> of things, so detailed review is needed.

I'm kind of burnt out on patch review for today, so I hope Herbert
can take a look at this when he wakes up.

Otherwise I'll study it in detail tomorrow.

Thanks Timo!

^ permalink raw reply

* [PATCH] rfs: Receive Flow Steering
From: Tom Herbert @ 2010-04-02  3:59 UTC (permalink / raw)
  To: davem, netdev

This patch implements receive flow steering (RFS).  RFS steers received packets for layer 3 and 4 processing to the CPU where the application for the corresponding flow is running.  RFS is an extension of Receive Packet Steering (RPS).

The basic idea of RFS is that when an application calls recvmsg (or sendmsg) the application's running CPU is stored in a hash table that is indexed by the connection's rxhash which is stored in the socket structure.  The rxhash is passed in skb's received on the connection from netif_receive_skb.  For each received packet, the associated rxhash is used to look up the CPU in the hash table, if a valid CPU is set then the packet is steered to that CPU using the RPS mechanisms.

The convolution of the simple approach is that it would potentially allow OOO packets.  If threads are thrashing around CPUs or multiple threads are trying to read from the same sockets, a quickly changing CPU value in the hash table could cause rampant OOO packets-- we consider this a non-starter.

To avoid OOO packets, this solution implements two types of hash tables: rps_sock_flow_table and rps_dev_flow_table.

rps_sock_table is a global hash table.  Each entry is just a CPU number and it is populated in recvmsg and sendmsg as described above.  This table contains the "desired" CPUs for flows.

rps_dev_flow_table is specific to each device queue.  Each entry contains a CPU and a tail queue counter.  The CPU is the "current" CPU for a matching flow.  The tail queue counter holds the value of a tail queue counter for the associated CPU's backlog queue at the time of last enqueue for a flow matching the entry.

Each backlog queue has a queue head counter which is incremented on dequeue, and so a queue tail counter is computed as queue head count + queue length.  When a packet is enqueued on a backlog queue, the current value of the queue tail counter is saved in the hash entry of the rps_dev_flow_table.

And now the trick: when selecting the CPU for RPS (get_rps_cpu) the rps_sock_flow table and the rps_dev_flow table for the RX queue are consulted.  When the desired CPU for the flow (found in the rps_sock_flow table) does not match the current CPU (found in the rps_dev_flow table), the current CPU is changed to the desired CPU if one of the following is true:

- The current CPU is unset (equal to NR_CPUS)
- Current CPU is offline
- The current CPU's queue head counter >= queue tail counter in the rps_dev_flow table.  This checks if the queue tail has advanced beyond the last packet that was enqueued using this table entry.  This guarantees that all packets queued using this entry have been dequeued, thus preserving in order delivery.

Making each queue have its own rps_dev_flow table has two advantages: 1) the tail queue counters will be written on each receive, so keeping the table local to interrupting CPU s good for locality.  2) this allows lockless access to the table-- the CPU number and queue tail counter need to be accessed together under mutual exclusion from netif_receive_skb, we assume that this is only called from device napi_poll which is non-reentrant.

This patch implements RFS for TCP, although it should be usable for other flow oriented protocols.  The patch has some minor modifications in TCP and inet: in tcp_v4_rcv the rxhash is saved in the inet_sk, in inet_sendmsg and inet_recvmsg (added function) the current CPU is recored in rps_sock_flow_table as indexed by the saved rxhash.

There are two configuration parameters for RFS.  The "rps_flow_entries" kernel init parameter sets the number of entries in the rps_sock_flow_table, the per rxqueue sysfs entry "rps_flow_cnt" contains the number of entries in the rps_dev_flow table for the rxqueue.  Both are rounded to power of two.

The obvious benefit of RFS (over just RPS) is that it achieves CPU locality between the receive processing for a flow and the applications processing; this can result in increased performance (higher pps, lower latency).

The benefits of RFS are dependent on cache hierarchy, application load, and other factors.  On simple benchmarks, we don't necessarily see improvement and sometimes see degradation.  However, for more complex benchmarks and for applications where cache pressure is much higher this technique seems to perform very well.

Below are some benchmark results which show the potential benfit of this patch.  The netperf test has 500 instances of netperf TCP_RR test with 1 byte req. and resp.  The RPC test is an request/response test similar in structure to netperf RR test ith 100 threads on each host, but does more work in userspace that netperf.

e1000e on 8 core Intel
   No RFS or RPS		104K tps at 30% CPU
   No RFS (best RPS config):    290K tps at 63% CPU
   RFS				303K tps at 61% CPU

RPC test		tps	CPU%	50/90/99% usec latency	StdDev
   No RFS or RPS	103K	48%	757/900/3185		4472.35
   RPS only:		174K	73%	415/993/2468		491.66
   RFS			223K	73%	379/651/1382		315.61
   
Signed-off-by: Tom Herbert <therbert@google.com>
---
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index b5670ab..ea6ebca 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -543,14 +543,72 @@ struct rps_map {
 };
 #define RPS_MAP_SIZE(_num) (sizeof(struct rps_map) + (_num * sizeof(u16)))
 
+/*
+ * The rps_dev_flow structure contains the mapping of a flow to a CPU and the
+ * tail pointer for that CPU's input queue at the time of last enqueue.
+ */
+struct rps_dev_flow {
+	u16 cpu;
+	unsigned int last_qtail;
+};
+
+/*
+ * The rps_dev_flow_table structure contains a table of flow mappings.
+ */
+struct rps_dev_flow_table {
+	unsigned int mask;
+	struct rcu_head rcu;
+	struct work_struct free_work;
+	struct rps_dev_flow flows[0];
+};
+#define RPS_DEV_FLOW_TABLE_SIZE(_num) (sizeof(struct rps_dev_flow_table) + \
+    (_num * sizeof(struct rps_dev_flow)))
+
+/*
+ * The rps_sock_flow_table contains mappings of flows to the last CPU
+ * on which they were processed by the application (set in recvmsg).
+ */
+struct rps_sock_flow_table {
+	unsigned int mask;
+	u16 *ents;
+};
+
+#define RPS_NO_CPU NR_CPUS
+
+static inline void rps_set_sock_flow(struct rps_sock_flow_table *table,
+				     u32 hash, int cpu)
+{
+	if (table->ents && hash) {
+		unsigned int index = hash & table->mask;
+
+		if (table->ents[index] != cpu)
+			table->ents[index] = cpu;
+	}
+}
+
+static inline void rps_record_sock_flow(struct rps_sock_flow_table *table,
+					u32 hash)
+{
+	rps_set_sock_flow(table, hash, smp_processor_id());
+}
+
+static inline void rps_reset_sock_flow(struct rps_sock_flow_table *table,
+				       u32 hash)
+{
+	rps_set_sock_flow(table, hash, RPS_NO_CPU);
+}
+
+extern struct rps_sock_flow_table rps_sock_flow_table;
+
 /* This structure contains an instance of an RX queue. */
 struct netdev_rx_queue {
 	struct rps_map *rps_map;
+	struct rps_dev_flow_table *rps_flow_table;
 	struct kobject kobj;
 	struct netdev_rx_queue *first;
 	atomic_t count;
 } ____cacheline_aligned_in_smp;
-#endif
+#endif /* CONFIG_RPS */
 
 /*
  * This structure defines the management hooks for network devices.
@@ -1346,8 +1404,9 @@ struct softnet_data {
 	struct sk_buff		*completion_queue;
 
 	/* Elements below can be accessed between CPUs for RPS */
-#ifdef CONFIG_SMP
+#ifdef CONFIG_RPS
 	struct call_single_data	csd ____cacheline_aligned_in_smp;
+	unsigned int		input_queue_head;
 #endif
 	struct sk_buff_head	input_pkt_queue;
 	struct napi_struct	backlog;
diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index 83fd344..daa2666 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -21,6 +21,7 @@
 #include <linux/string.h>
 #include <linux/types.h>
 #include <linux/jhash.h>
+#include <linux/netdevice.h>
 
 #include <net/flow.h>
 #include <net/sock.h>
@@ -101,6 +102,7 @@ struct rtable;
  * @uc_ttl - Unicast TTL
  * @inet_sport - Source port
  * @inet_id - ID counter for DF pkts
+ * @rxhash - flow hash received from netif layer
  * @tos - TOS
  * @mc_ttl - Multicasting TTL
  * @is_icsk - is this an inet_connection_sock?
@@ -124,6 +126,9 @@ struct inet_sock {
 	__u16			cmsg_flags;
 	__be16			inet_sport;
 	__u16			inet_id;
+#ifdef CONFIG_RPS
+	__u32			rxhash;
+#endif
 
 	struct ip_options	*opt;
 	__u8			tos;
@@ -219,4 +224,17 @@ static inline __u8 inet_sk_flowi_flags(const struct sock *sk)
 	return inet_sk(sk)->transparent ? FLOWI_FLAG_ANYSRC : 0;
 }
 
+static inline void inet_rps_record_flow(const struct sock *sk)
+{
+#ifdef CONFIG_RPS
+	rps_record_sock_flow(&rps_sock_flow_table, inet_sk(sk)->rxhash);
+#endif
+}
+
+static inline void inet_rps_reset_flow(const struct sock *sk)
+{
+#ifdef CONFIG_RPS
+	rps_reset_sock_flow(&rps_sock_flow_table, inet_sk(sk)->rxhash);
+#endif
+}
 #endif	/* _INET_SOCK_H */
diff --git a/net/core/dev.c b/net/core/dev.c
index 887aa84..94fd5c6 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -129,6 +129,7 @@
 #include <linux/jhash.h>
 #include <linux/random.h>
 #include <trace/events/napi.h>
+#include <linux/bootmem.h>
 
 #include "net-sysfs.h"
 
@@ -2178,21 +2179,29 @@ int weight_p __read_mostly = 64;            /* old backlog weight */
 DEFINE_PER_CPU(struct netif_rx_stats, netdev_rx_stat) = { 0, };
 
 #ifdef CONFIG_RPS
+/* One global table that all flow-based protocols share. */
+struct rps_sock_flow_table rps_sock_flow_table;
+EXPORT_SYMBOL(rps_sock_flow_table);
+
 /*
  * get_rps_cpu is called from netif_receive_skb and returns the target
  * CPU from the RPS map of the receiving queue for a given skb.
+ * rcu_read_lock must be held on entry.
  */
-static int get_rps_cpu(struct net_device *dev, struct sk_buff *skb)
+static int get_rps_cpu(struct net_device *dev, struct sk_buff *skb,
+		       struct rps_dev_flow **rflowp)
 {
 	struct ipv6hdr *ip6;
 	struct iphdr *ip;
 	struct netdev_rx_queue *rxqueue;
 	struct rps_map *map;
+	struct rps_dev_flow_table *flow_table;
 	int cpu = -1;
 	u8 ip_proto;
+	u16 tcpu;
 	u32 addr1, addr2, ports, ihl;
 
-	rcu_read_lock();
+	*rflowp = NULL;
 
 	if (skb_rx_queue_recorded(skb)) {
 		u16 index = skb_get_rx_queue(skb);
@@ -2208,7 +2217,7 @@ static int get_rps_cpu(struct net_device *dev, struct sk_buff *skb)
 	} else
 		rxqueue = dev->_rx;
 
-	if (!rxqueue->rps_map)
+	if (!rxqueue->rps_map && !rxqueue->rps_flow_table)
 		goto done;
 
 	if (skb->rxhash)
@@ -2260,9 +2269,47 @@ static int get_rps_cpu(struct net_device *dev, struct sk_buff *skb)
 		skb->rxhash = 1;
 
 got_hash:
+	flow_table = rcu_dereference(rxqueue->rps_flow_table);
+	if (flow_table && rps_sock_flow_table.ents) {
+		u16 next_cpu;
+		struct rps_dev_flow *rflow;
+
+		rflow = &flow_table->flows[skb->rxhash & flow_table->mask];
+		tcpu = rflow->cpu;
+
+		next_cpu = rps_sock_flow_table.ents[skb->rxhash &
+		    rps_sock_flow_table.mask];
+
+		/*
+		 * If the desired CPU (where last recvmsg was done) is
+		 * different from current CPU (one in the rx-queue flow
+		 * table entry), switch if one of the following holds:
+		 *   - Current CPU is unset (equal to RPS_NO_CPU).
+		 *   - Current CPU is offline.
+		 *   - The current CPU's queue tail has advanced beyond the
+		 *     last packet that was enqueued using this table entry.
+		 *     This guarantees that all previous packets for the flow
+		 *     have been dequeued, thus preserving in order delivery.
+		 */
+		if (unlikely(tcpu != next_cpu) &&
+		    (tcpu == RPS_NO_CPU || !cpu_online(tcpu) ||
+		     ((int)(per_cpu(softnet_data, tcpu).input_queue_head -
+		      rflow->last_qtail)) >= 0)) {
+			tcpu = rflow->cpu = next_cpu;
+			if (tcpu != RPS_NO_CPU)
+				rflow->last_qtail = per_cpu(softnet_data,
+				    tcpu).input_queue_head;
+		}
+		if (tcpu != RPS_NO_CPU && cpu_online(tcpu)) {
+			*rflowp = rflow;
+			cpu = tcpu;
+			goto done;
+		}
+	}
+
 	map = rcu_dereference(rxqueue->rps_map);
 	if (map) {
-		u16 tcpu = map->cpus[((u64) skb->rxhash * map->len) >> 32];
+		tcpu = map->cpus[((u64) skb->rxhash * map->len) >> 32];
 
 		if (cpu_online(tcpu)) {
 			cpu = tcpu;
@@ -2271,7 +2318,6 @@ got_hash:
 	}
 
 done:
-	rcu_read_unlock();
 	return cpu;
 }
 
@@ -2297,13 +2343,14 @@ static void trigger_softirq(void *data)
 	__napi_schedule(&queue->backlog);
 	__get_cpu_var(netdev_rx_stat).received_rps++;
 }
-#endif /* CONFIG_SMP */
+#endif /* CONFIG_RPS */
 
 /*
  * enqueue_to_backlog is called to queue an skb to a per CPU backlog
  * queue (may be a remote CPU queue).
  */
-static int enqueue_to_backlog(struct sk_buff *skb, int cpu)
+static int enqueue_to_backlog(struct sk_buff *skb, int cpu,
+			      unsigned int *qtail)
 {
 	struct softnet_data *queue;
 	unsigned long flags;
@@ -2318,6 +2365,10 @@ static int enqueue_to_backlog(struct sk_buff *skb, int cpu)
 		if (queue->input_pkt_queue.qlen) {
 enqueue:
 			__skb_queue_tail(&queue->input_pkt_queue, skb);
+#ifdef CONFIG_RPS
+			*qtail = queue->input_queue_head +
+			    queue->input_pkt_queue.qlen;
+#endif
 			spin_unlock_irqrestore(&queue->input_pkt_queue.lock,
 			    flags);
 			return NET_RX_SUCCESS;
@@ -2367,7 +2418,11 @@ enqueue:
 
 int netif_rx(struct sk_buff *skb)
 {
-	int cpu;
+	unsigned int qtail;
+#ifdef CONFIG_RPS
+	struct rps_dev_flow *rflow;
+	int cpu, err;
+#endif
 
 	/* if netpoll wants it, pretend we never saw it */
 	if (netpoll_rx(skb))
@@ -2377,14 +2432,24 @@ int netif_rx(struct sk_buff *skb)
 		net_timestamp(skb);
 
 #ifdef CONFIG_RPS
-	cpu = get_rps_cpu(skb->dev, skb);
+	rcu_read_lock();
+
+	cpu = get_rps_cpu(skb->dev, skb, &rflow);
 	if (cpu < 0)
 		cpu = smp_processor_id();
+
+	err = enqueue_to_backlog(skb, cpu, &qtail);
+
+	if (rflow)
+		rflow->last_qtail = qtail;
+
+	rcu_read_unlock();
+
+	return err;
 #else
-	cpu = smp_processor_id();
+	return enqueue_to_backlog(skb, smp_processor_id(), &qtail);
 #endif
 
-	return enqueue_to_backlog(skb, cpu);
 }
 EXPORT_SYMBOL(netif_rx);
 
@@ -2751,14 +2816,26 @@ out:
 int netif_receive_skb(struct sk_buff *skb)
 {
 #ifdef CONFIG_RPS
-	int cpu;
+	struct rps_dev_flow *rflow;
+	int cpu, err;
+
+	rcu_read_lock();
 
-	cpu = get_rps_cpu(skb->dev, skb);
+	cpu = get_rps_cpu(skb->dev, skb, &rflow);
 
 	if (cpu < 0)
-		return __netif_receive_skb(skb);
-	else
-		return enqueue_to_backlog(skb, cpu);
+		err = __netif_receive_skb(skb);
+	else {
+		unsigned int qtail;
+
+		err = enqueue_to_backlog(skb, cpu, &qtail);
+		if (rflow)
+			rflow->last_qtail = qtail;
+	}
+
+	rcu_read_unlock();
+
+	return err;
 #else
 	return __netif_receive_skb(skb);
 #endif
@@ -2777,6 +2854,9 @@ static void flush_backlog(struct net_device *dev, int cpu)
 		if (skb->dev == dev) {
 			__skb_unlink(skb, &queue->input_pkt_queue);
 			kfree_skb(skb);
+#ifdef CONFIG_RPS
+			queue->input_queue_head++;
+#endif
 		}
 	spin_unlock_irqrestore(&queue->input_pkt_queue.lock, flags);
 }
@@ -3098,6 +3178,9 @@ static int process_backlog(struct napi_struct *napi, int quota)
 			spin_unlock_irq(&queue->input_pkt_queue.lock);
 			break;
 		}
+#ifdef CONFIG_RPS
+		queue->input_queue_head++;
+#endif
 		spin_unlock_irq(&queue->input_pkt_queue.lock);
 
 		__netif_receive_skb(skb);
@@ -6021,8 +6104,12 @@ static int dev_cpu_callback(struct notifier_block *nfb,
 	local_irq_enable();
 
 	/* Process offline CPU's input_pkt_queue */
-	while ((skb = __skb_dequeue(&oldsd->input_pkt_queue)))
+	while ((skb = __skb_dequeue(&oldsd->input_pkt_queue))) {
 		netif_rx(skb);
+#ifdef CONFIG_RPS
+		oldsd->input_queue_head++;
+#endif
+	}
 
 	return NOTIFY_OK;
 }
@@ -6203,6 +6290,42 @@ static struct pernet_operations __net_initdata default_device_ops = {
 	.exit_batch = default_device_exit_batch,
 };
 
+
+#ifdef CONFIG_RPS
+static __initdata unsigned long rps_sock_flow_entries;
+
+static int __init set_rps_sock_flow_entries(char *str)
+{
+	if (str)
+		rps_sock_flow_entries = simple_strtoul(str, &str, 0);
+
+	return 0;
+}
+
+__setup("rps_flow_entries=", set_rps_sock_flow_entries);
+
+static int alloc_rps_sock_flow_entries(void)
+{
+	unsigned int i, hash_size;
+
+	if (!rps_sock_flow_entries)
+		return 0;
+
+	rps_sock_flow_table.ents =
+	    alloc_large_system_hash("RPS flow table", sizeof(u16),
+	    rps_sock_flow_entries, 0, 0, &hash_size, NULL, 0);
+	hash_size = 1 << hash_size;
+	rps_sock_flow_table.mask = hash_size - 1;
+	for (i = 0; i < hash_size; i++)
+		rps_sock_flow_table.ents[i] = RPS_NO_CPU;
+
+	printk(KERN_INFO "RPS: flow table configured with %d entries\n",
+	    hash_size);
+
+	return 0;
+}
+#endif
+
 /*
  *	Initialize the DEV module. At boot time this walks the device list and
  *	unhooks any devices that fail to initialise (normally hardware not
@@ -6223,6 +6346,11 @@ static int __init net_dev_init(void)
 	if (dev_proc_init())
 		goto out;
 
+#ifdef CONFIG_RPS
+	if (alloc_rps_sock_flow_entries())
+		goto out;
+#endif
+
 	if (netdev_kobject_init())
 		goto out;
 
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 1e7fdd6..3039e98 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -600,22 +600,106 @@ ssize_t store_rps_map(struct netdev_rx_queue *queue,
 	return len;
 }
 
+
+static ssize_t show_rps_dev_flow_table_cnt(struct netdev_rx_queue *queue,
+					   struct rx_queue_attribute *attr,
+					   char *buf)
+{
+	struct rps_dev_flow_table *flow_table;
+	unsigned int val = 0;
+
+	rcu_read_lock();
+	flow_table = rcu_dereference(queue->rps_flow_table);
+	if (flow_table)
+		val = flow_table->mask + 1;
+	rcu_read_unlock();
+
+	return sprintf(buf, "%u\n", val);
+}
+
+static void rps_dev_flow_table_release_work(struct work_struct *work)
+{
+	struct rps_dev_flow_table *table = container_of(work,
+	    struct rps_dev_flow_table, free_work);
+
+	vfree(table);
+}
+
+static void rps_dev_flow_table_release(struct rcu_head *rcu)
+{
+	struct rps_dev_flow_table *table = container_of(rcu,
+	    struct rps_dev_flow_table, rcu);
+
+	INIT_WORK(&table->free_work, rps_dev_flow_table_release_work);
+	schedule_work(&table->free_work);
+}
+
+ssize_t store_rps_dev_flow_table_cnt(struct netdev_rx_queue *queue,
+				     struct rx_queue_attribute *attr,
+				     const char *buf, size_t len)
+{
+	unsigned int count;
+	char *endp;
+	struct rps_dev_flow_table *table, *old_table;
+	static DEFINE_SPINLOCK(rps_dev_flow_lock);
+
+	if (!capable(CAP_NET_ADMIN))
+		return -EPERM;
+
+	count = simple_strtoul(buf, &endp, 0);
+	if (endp == buf)
+		return -EINVAL;
+
+	if (count) {
+		int i;
+
+		count = roundup_pow_of_two(count);
+		table = vmalloc(RPS_DEV_FLOW_TABLE_SIZE(count));
+		if (!table)
+			return -ENOMEM;
+
+		table->mask = count - 1;
+		for (i = 0; i < count; i++)
+			table->flows[i].cpu = RPS_NO_CPU;
+	} else
+		table = NULL;
+
+	spin_lock(&rps_dev_flow_lock);
+	old_table = queue->rps_flow_table;
+	rcu_assign_pointer(queue->rps_flow_table, table);
+	spin_unlock(&rps_dev_flow_lock);
+
+	if (old_table)
+		call_rcu(&old_table->rcu, rps_dev_flow_table_release);
+
+	return len;
+}
+
 static struct rx_queue_attribute rps_cpus_attribute =
 	__ATTR(rps_cpus, S_IRUGO | S_IWUSR, show_rps_map, store_rps_map);
 
+
+static struct rx_queue_attribute rps_dev_flow_table_cnt_attribute =
+	__ATTR(rps_flow_cnt, S_IRUGO | S_IWUSR,
+	    show_rps_dev_flow_table_cnt, store_rps_dev_flow_table_cnt);
+
 static struct attribute *rx_queue_default_attrs[] = {
 	&rps_cpus_attribute.attr,
+	&rps_dev_flow_table_cnt_attribute.attr,
 	NULL
 };
 
 static void rx_queue_release(struct kobject *kobj)
 {
 	struct netdev_rx_queue *queue = to_rx_queue(kobj);
-	struct rps_map *map = queue->rps_map;
 	struct netdev_rx_queue *first = queue->first;
 
-	if (map)
-		call_rcu(&map->rcu, rps_map_release);
+	if (queue->rps_map)
+		call_rcu(&queue->rps_map->rcu, rps_map_release);
+
+	if (queue->rps_flow_table)
+		call_rcu(&queue->rps_flow_table->rcu,
+		    rps_dev_flow_table_release);
 
 	if (atomic_dec_and_test(&first->count))
 		kfree(first);
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 55e1190..eb6155a 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -418,6 +418,8 @@ int inet_release(struct socket *sock)
 	if (sk) {
 		long timeout;
 
+		inet_rps_reset_flow(sk);
+
 		/* Applications forget to leave groups before exiting */
 		ip_mc_drop_socket(sk);
 
@@ -714,6 +716,8 @@ int inet_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 {
 	struct sock *sk = sock->sk;
 
+	inet_rps_record_flow(sk);
+
 	/* We may need to bind the socket. */
 	if (!inet_sk(sk)->inet_num && inet_autobind(sk))
 		return -EAGAIN;
@@ -722,12 +726,13 @@ int inet_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 }
 EXPORT_SYMBOL(inet_sendmsg);
 
-
 static ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset,
 			     size_t size, int flags)
 {
 	struct sock *sk = sock->sk;
 
+	inet_rps_record_flow(sk);
+
 	/* We may need to bind the socket. */
 	if (!inet_sk(sk)->inet_num && inet_autobind(sk))
 		return -EAGAIN;
@@ -737,6 +742,22 @@ static ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset,
 	return sock_no_sendpage(sock, page, offset, size, flags);
 }
 
+int inet_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
+		 size_t size, int flags)
+{
+	struct sock *sk = sock->sk;
+	int addr_len = 0;
+	int err;
+
+	inet_rps_record_flow(sk);
+
+	err = sk->sk_prot->recvmsg(iocb, sk, msg, size, flags & MSG_DONTWAIT,
+				   flags & ~MSG_DONTWAIT, &addr_len);
+	if (err >= 0)
+		msg->msg_namelen = addr_len;
+	return err;
+}
+EXPORT_SYMBOL(inet_recvmsg);
 
 int inet_shutdown(struct socket *sock, int how)
 {
@@ -866,7 +887,7 @@ const struct proto_ops inet_stream_ops = {
 	.setsockopt	   = sock_common_setsockopt,
 	.getsockopt	   = sock_common_getsockopt,
 	.sendmsg	   = tcp_sendmsg,
-	.recvmsg	   = sock_common_recvmsg,
+	.recvmsg	   = inet_recvmsg,
 	.mmap		   = sock_no_mmap,
 	.sendpage	   = tcp_sendpage,
 	.splice_read	   = tcp_splice_read,
@@ -893,7 +914,7 @@ const struct proto_ops inet_dgram_ops = {
 	.setsockopt	   = sock_common_setsockopt,
 	.getsockopt	   = sock_common_getsockopt,
 	.sendmsg	   = inet_sendmsg,
-	.recvmsg	   = sock_common_recvmsg,
+	.recvmsg	   = inet_recvmsg,
 	.mmap		   = sock_no_mmap,
 	.sendpage	   = inet_sendpage,
 #ifdef CONFIG_COMPAT
@@ -923,7 +944,7 @@ static const struct proto_ops inet_sockraw_ops = {
 	.setsockopt	   = sock_common_setsockopt,
 	.getsockopt	   = sock_common_getsockopt,
 	.sendmsg	   = inet_sendmsg,
-	.recvmsg	   = sock_common_recvmsg,
+	.recvmsg	   = inet_recvmsg,
 	.mmap		   = sock_no_mmap,
 	.sendpage	   = inet_sendpage,
 #ifdef CONFIG_COMPAT
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index f4df5f9..3060c17 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1674,6 +1674,10 @@ process:
 
 	skb->dev = NULL;
 
+#ifdef CONFIG_RPS
+	inet_sk(sk)->rxhash = skb->rxhash;
+#endif
+
 	bh_lock_sock_nested(sk);
 	ret = 0;
 	if (!sock_owned_by_user(sk)) {

^ permalink raw reply related

* Re: [PATCH net-next-2.6] igb: restrict WoL for 82576 ET2 Quad Port Server Adapter
From: Jeff Kirsher @ 2010-04-02  4:12 UTC (permalink / raw)
  To: David Miller
  Cc: sassmann@redhat.com, netdev@vger.kernel.org, Duyck, Alexander H
In-Reply-To: <20100401.193435.223373423.davem@davemloft.net>

On Thu, 2010-04-01 at 19:34 -0700, David Miller wrote:
> From: Stefan Assmann <sassmann@redhat.com>
> Date: Thu, 01 Apr 2010 19:00:44 +0200
> 
> > From: Stefan Assmann <sassmann@redhat.com>
> > 
> > Restrict Wake-on-LAN to first port on 82576 ET2 quad port NICs, as it is
> > only supported there.
> > 
> > Signed-off-by: Stefan Assmann <sassmann@redhat.com>
> 
> Intel folks I assume you will suck this in and resubmit to
> me?
> 
> Thanks.

Correct, I am sucking this in to my tree.
-- 
Cheers,
Jeff


^ permalink raw reply

* Re: [PATCH 1/1] e1000e: stop cleaning when we reach tx_ring->next_to_use
From: Jeff Kirsher @ 2010-04-02  4:13 UTC (permalink / raw)
  To: David Miller
  Cc: terry.loftin@hp.com, netdev@vger.kernel.org,
	e1000-devel@lists.sourceforge.net, Brandeburg, Jesse
In-Reply-To: <20100401.193355.255461107.davem@davemloft.net>

On Thu, 2010-04-01 at 19:33 -0700, David Miller wrote:
> From: Terry Loftin <terry.loftin@hp.com>
> Date: Wed, 31 Mar 2010 14:01:52 -0600
> 
> > Tx ring buffers after tx_ring->next_to_use are volatile and could
> > change, possibly causing a crash.  Stop cleaning when we hit
> > tx_ring->next_to_use.
> > 
> > Signed-off-by: Terry Loftin <terry.loftin@hp.com>
> 
> Intel folks please look carefully at this fix.  I want to
> get it into net-2.6 if it's correct.
> 
> Thanks.

Got this one as well...

-- 
Cheers,
Jeff


^ permalink raw reply

* [PATCHv1 NEXT 2/9] qlcnic: handle queue manager access
From: Amit Kumar Salecha @ 2010-04-02  5:01 UTC (permalink / raw)
  To: davem; +Cc: netdev, ameen.rahman, Dhananjay Phadke
In-Reply-To: <1270184496-18985-1-git-send-email-amit.salecha@qlogic.com>

From: Dhananjay Phadke <dhananjay.phadke@qlogic.com>

Check the access by tools for hardware queue engine and handle it
separately than other block registers, otherwise incorrect data
is returned.

Signed-off-by: Dhananjay Phadke <dhananjay.phadke@qlogic.com>
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
---
 drivers/net/qlcnic/qlcnic.h      |    5 +++++
 drivers/net/qlcnic/qlcnic_hdr.h  |    3 ++-
 drivers/net/qlcnic/qlcnic_hw.c   |   25 ++++++++++++++++++++++---
 drivers/net/qlcnic/qlcnic_main.c |   35 +++++++++++++++++++++++++++--------
 4 files changed, 56 insertions(+), 12 deletions(-)

diff --git a/drivers/net/qlcnic/qlcnic.h b/drivers/net/qlcnic/qlcnic.h
index 0da94b2..8a3446d 100644
--- a/drivers/net/qlcnic/qlcnic.h
+++ b/drivers/net/qlcnic/qlcnic.h
@@ -994,6 +994,11 @@ u32 qlcnic_hw_read_wx_2M(struct qlcnic_adapter *adapter, ulong off);
 int qlcnic_hw_write_wx_2M(struct qlcnic_adapter *, ulong off, u32 data);
 int qlcnic_pci_mem_write_2M(struct qlcnic_adapter *, u64 off, u64 data);
 int qlcnic_pci_mem_read_2M(struct qlcnic_adapter *, u64 off, u64 *data);
+void qlcnic_pci_camqm_read_2M(struct qlcnic_adapter *, u64, u64 *);
+void qlcnic_pci_camqm_write_2M(struct qlcnic_adapter *, u64, u64);
+
+#define ADDR_IN_RANGE(addr, low, high)	\
+	(((addr) < (high)) && ((addr) >= (low)))
 
 #define QLCRD32(adapter, off) \
 	(qlcnic_hw_read_wx_2M(adapter, off))
diff --git a/drivers/net/qlcnic/qlcnic_hdr.h b/drivers/net/qlcnic/qlcnic_hdr.h
index 0469f84..25465a9 100644
--- a/drivers/net/qlcnic/qlcnic_hdr.h
+++ b/drivers/net/qlcnic/qlcnic_hdr.h
@@ -435,9 +435,10 @@ enum {
 #define QLCNIC_PCI_MS_2M	(0x80000)
 #define QLCNIC_PCI_OCM0_2M	(0x000c0000UL)
 #define QLCNIC_PCI_CRBSPACE	(0x06000000UL)
+#define QLCNIC_PCI_CAMQM	(0x04800000UL)
+#define QLCNIC_PCI_CAMQM_END	(0x04800800UL)
 #define QLCNIC_PCI_2MB_SIZE	(0x00200000UL)
 #define QLCNIC_PCI_CAMQM_2M_BASE	(0x000ff800UL)
-#define QLCNIC_PCI_CAMQM_2M_END 	(0x04800800UL)
 
 #define QLCNIC_CRB_CAM	QLCNIC_PCI_CRB_WINDOW(QLCNIC_HW_PX_MAP_CRB_CAM)
 
diff --git a/drivers/net/qlcnic/qlcnic_hw.c b/drivers/net/qlcnic/qlcnic_hw.c
index da00e16..b977874 100644
--- a/drivers/net/qlcnic/qlcnic_hw.c
+++ b/drivers/net/qlcnic/qlcnic_hw.c
@@ -53,9 +53,6 @@ static inline void writeq(u64 val, void __iomem *addr)
 }
 #endif
 
-#define ADDR_IN_RANGE(addr, low, high)	\
-	(((addr) < (high)) && ((addr) >= (low)))
-
 #define PCI_OFFSET_FIRST_RANGE(adapter, off)    \
 	((adapter)->ahw.pci_base0 + (off))
 
@@ -936,6 +933,28 @@ unlock:
 	return ret;
 }
 
+void
+qlcnic_pci_camqm_read_2M(struct qlcnic_adapter *adapter, u64 off, u64 *data)
+{
+	void __iomem *addr = adapter->ahw.pci_base0 +
+		QLCNIC_PCI_CAMQM_2M_BASE + (off - QLCNIC_PCI_CAMQM);
+
+	mutex_lock(&adapter->ahw.mem_lock);
+	*data = readq(addr);
+	mutex_unlock(&adapter->ahw.mem_lock);
+}
+
+void
+qlcnic_pci_camqm_write_2M(struct qlcnic_adapter *adapter, u64 off, u64 data)
+{
+	void __iomem *addr = adapter->ahw.pci_base0 +
+		QLCNIC_PCI_CAMQM_2M_BASE + (off - QLCNIC_PCI_CAMQM);
+
+	mutex_lock(&adapter->ahw.mem_lock);
+	writeq(data, addr);
+	mutex_unlock(&adapter->ahw.mem_lock);
+}
+
 #define MAX_CTL_CHECK   1000
 
 int
diff --git a/drivers/net/qlcnic/qlcnic_main.c b/drivers/net/qlcnic/qlcnic_main.c
index fc72156..a234622 100644
--- a/drivers/net/qlcnic/qlcnic_main.c
+++ b/drivers/net/qlcnic/qlcnic_main.c
@@ -2386,14 +2386,21 @@ static int
 qlcnic_sysfs_validate_crb(struct qlcnic_adapter *adapter,
 		loff_t offset, size_t size)
 {
+	size_t crb_size = 4;
+
 	if (!(adapter->flags & QLCNIC_DIAG_ENABLED))
 		return -EIO;
 
-	if ((size != 4) || (offset & 0x3))
-		return  -EINVAL;
+	if (offset < QLCNIC_PCI_CRBSPACE) {
+		if (ADDR_IN_RANGE(offset, QLCNIC_PCI_CAMQM,
+					QLCNIC_PCI_CAMQM_END))
+			crb_size = 8;
+		else
+			return -EINVAL;
+	}
 
-	if (offset < QLCNIC_PCI_CRBSPACE)
-		return -EINVAL;
+	if ((size != crb_size) || (offset & (crb_size-1)))
+		return  -EINVAL;
 
 	return 0;
 }
@@ -2405,14 +2412,20 @@ qlcnic_sysfs_read_crb(struct kobject *kobj, struct bin_attribute *attr,
 	struct device *dev = container_of(kobj, struct device, kobj);
 	struct qlcnic_adapter *adapter = dev_get_drvdata(dev);
 	u32 data;
+	u64 qmdata;
 	int ret;
 
 	ret = qlcnic_sysfs_validate_crb(adapter, offset, size);
 	if (ret != 0)
 		return ret;
 
-	data = QLCRD32(adapter, offset);
-	memcpy(buf, &data, size);
+	if (ADDR_IN_RANGE(offset, QLCNIC_PCI_CAMQM, QLCNIC_PCI_CAMQM_END)) {
+		qlcnic_pci_camqm_read_2M(adapter, offset, &qmdata);
+		memcpy(buf, &qmdata, size);
+	} else {
+		data = QLCRD32(adapter, offset);
+		memcpy(buf, &data, size);
+	}
 	return size;
 }
 
@@ -2423,14 +2436,20 @@ qlcnic_sysfs_write_crb(struct kobject *kobj, struct bin_attribute *attr,
 	struct device *dev = container_of(kobj, struct device, kobj);
 	struct qlcnic_adapter *adapter = dev_get_drvdata(dev);
 	u32 data;
+	u64 qmdata;
 	int ret;
 
 	ret = qlcnic_sysfs_validate_crb(adapter, offset, size);
 	if (ret != 0)
 		return ret;
 
-	memcpy(&data, buf, size);
-	QLCWR32(adapter, offset, data);
+	if (ADDR_IN_RANGE(offset, QLCNIC_PCI_CAMQM, QLCNIC_PCI_CAMQM_END)) {
+		memcpy(&qmdata, buf, size);
+		qlcnic_pci_camqm_write_2M(adapter, offset, qmdata);
+	} else {
+		memcpy(&data, buf, size);
+		QLCWR32(adapter, offset, data);
+	}
 	return size;
 }
 
-- 
1.6.0.2


^ permalink raw reply related

* [PATCHv1 NEXT 3/9] qlcnic: update oncard memory size check
From: Amit Kumar Salecha @ 2010-04-02  5:01 UTC (permalink / raw)
  To: davem; +Cc: netdev, ameen.rahman, Dhananjay Phadke
In-Reply-To: <1270184496-18985-1-git-send-email-amit.salecha@qlogic.com>

From: Dhananjay Phadke <dhananjay.phadke@qlogic.com>

All QLogic converged NICs have 128-bit 128MB on card memory.
Fix the limit check from 64MB to 128MB and remove unnecessary
64-bit read/write checks.

Signed-off-by: Dhananjay Phadke <dhananjay.phadke@qlogic.com>
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
---
 drivers/net/qlcnic/qlcnic_hdr.h |    2 +-
 drivers/net/qlcnic/qlcnic_hw.c  |   55 +++++++++++++++++----------------------
 2 files changed, 25 insertions(+), 32 deletions(-)

diff --git a/drivers/net/qlcnic/qlcnic_hdr.h b/drivers/net/qlcnic/qlcnic_hdr.h
index 25465a9..e9fb692 100644
--- a/drivers/net/qlcnic/qlcnic_hdr.h
+++ b/drivers/net/qlcnic/qlcnic_hdr.h
@@ -449,7 +449,7 @@ enum {
 #define QLCNIC_ADDR_OCM1	(0x0000000200400000ULL)
 #define QLCNIC_ADDR_OCM1_MAX	(0x00000002004fffffULL)
 #define QLCNIC_ADDR_QDR_NET	(0x0000000300000000ULL)
-#define QLCNIC_ADDR_QDR_NET_MAX_P3 (0x0000000303ffffffULL)
+#define QLCNIC_ADDR_QDR_NET_MAX (0x0000000307ffffffULL)
 
 /*
  *   Register offsets for MN
diff --git a/drivers/net/qlcnic/qlcnic_hw.c b/drivers/net/qlcnic/qlcnic_hw.c
index b977874..419f46e 100644
--- a/drivers/net/qlcnic/qlcnic_hw.c
+++ b/drivers/net/qlcnic/qlcnic_hw.c
@@ -963,7 +963,6 @@ qlcnic_pci_mem_write_2M(struct qlcnic_adapter *adapter,
 {
 	int i, j, ret;
 	u32 temp, off8;
-	u64 stride;
 	void __iomem *mem_crb;
 
 	/* Only 64-bit aligned access */
@@ -972,7 +971,7 @@ qlcnic_pci_mem_write_2M(struct qlcnic_adapter *adapter,
 
 	/* P3 onward, test agent base for MIU and SIU is same */
 	if (ADDR_IN_RANGE(off, QLCNIC_ADDR_QDR_NET,
-				QLCNIC_ADDR_QDR_NET_MAX_P3)) {
+				QLCNIC_ADDR_QDR_NET_MAX)) {
 		mem_crb = qlcnic_get_ioaddr(adapter,
 				QLCNIC_CRB_QDR_NET+MIU_TEST_AGT_BASE);
 		goto correct;
@@ -990,9 +989,7 @@ qlcnic_pci_mem_write_2M(struct qlcnic_adapter *adapter,
 	return -EIO;
 
 correct:
-	stride = QLCNIC_IS_REVISION_P3P(adapter->ahw.revision_id) ? 16 : 8;
-
-	off8 = off & ~(stride-1);
+	off8 = off & ~0xf;
 
 	mutex_lock(&adapter->ahw.mem_lock);
 
@@ -1000,30 +997,28 @@ correct:
 	writel(0, (mem_crb + MIU_TEST_AGT_ADDR_HI));
 
 	i = 0;
-	if (stride == 16) {
-		writel(TA_CTL_ENABLE, (mem_crb + TEST_AGT_CTRL));
-		writel((TA_CTL_START | TA_CTL_ENABLE),
-				(mem_crb + TEST_AGT_CTRL));
-
-		for (j = 0; j < MAX_CTL_CHECK; j++) {
-			temp = readl(mem_crb + TEST_AGT_CTRL);
-			if ((temp & TA_CTL_BUSY) == 0)
-				break;
-		}
+	writel(TA_CTL_ENABLE, (mem_crb + TEST_AGT_CTRL));
+	writel((TA_CTL_START | TA_CTL_ENABLE),
+			(mem_crb + TEST_AGT_CTRL));
 
-		if (j >= MAX_CTL_CHECK) {
-			ret = -EIO;
-			goto done;
-		}
+	for (j = 0; j < MAX_CTL_CHECK; j++) {
+		temp = readl(mem_crb + TEST_AGT_CTRL);
+		if ((temp & TA_CTL_BUSY) == 0)
+			break;
+	}
 
-		i = (off & 0xf) ? 0 : 2;
-		writel(readl(mem_crb + MIU_TEST_AGT_RDDATA(i)),
-				mem_crb + MIU_TEST_AGT_WRDATA(i));
-		writel(readl(mem_crb + MIU_TEST_AGT_RDDATA(i+1)),
-				mem_crb + MIU_TEST_AGT_WRDATA(i+1));
-		i = (off & 0xf) ? 2 : 0;
+	if (j >= MAX_CTL_CHECK) {
+		ret = -EIO;
+		goto done;
 	}
 
+	i = (off & 0xf) ? 0 : 2;
+	writel(readl(mem_crb + MIU_TEST_AGT_RDDATA(i)),
+			mem_crb + MIU_TEST_AGT_WRDATA(i));
+	writel(readl(mem_crb + MIU_TEST_AGT_RDDATA(i+1)),
+			mem_crb + MIU_TEST_AGT_WRDATA(i+1));
+	i = (off & 0xf) ? 2 : 0;
+
 	writel(data & 0xffffffff,
 			mem_crb + MIU_TEST_AGT_WRDATA(i));
 	writel((data >> 32) & 0xffffffff,
@@ -1059,7 +1054,7 @@ qlcnic_pci_mem_read_2M(struct qlcnic_adapter *adapter,
 {
 	int j, ret;
 	u32 temp, off8;
-	u64 val, stride;
+	u64 val;
 	void __iomem *mem_crb;
 
 	/* Only 64-bit aligned access */
@@ -1068,7 +1063,7 @@ qlcnic_pci_mem_read_2M(struct qlcnic_adapter *adapter,
 
 	/* P3 onward, test agent base for MIU and SIU is same */
 	if (ADDR_IN_RANGE(off, QLCNIC_ADDR_QDR_NET,
-				QLCNIC_ADDR_QDR_NET_MAX_P3)) {
+				QLCNIC_ADDR_QDR_NET_MAX)) {
 		mem_crb = qlcnic_get_ioaddr(adapter,
 				QLCNIC_CRB_QDR_NET+MIU_TEST_AGT_BASE);
 		goto correct;
@@ -1088,9 +1083,7 @@ qlcnic_pci_mem_read_2M(struct qlcnic_adapter *adapter,
 	return -EIO;
 
 correct:
-	stride = QLCNIC_IS_REVISION_P3P(adapter->ahw.revision_id) ? 16 : 8;
-
-	off8 = off & ~(stride-1);
+	off8 = off & ~0xf;
 
 	mutex_lock(&adapter->ahw.mem_lock);
 
@@ -1112,7 +1105,7 @@ correct:
 		ret = -EIO;
 	} else {
 		off8 = MIU_TEST_AGT_RDDATA_LO;
-		if ((stride == 16) && (off & 0xf))
+		if (off & 0xf)
 			off8 = MIU_TEST_AGT_RDDATA_UPPER_LO;
 
 		temp = readl(mem_crb + off8 + 4);
-- 
1.6.0.2


^ permalink raw reply related

* [PATCHv1 0/9]qlcnic: fix diagnostic tools access
From: Amit Kumar Salecha @ 2010-04-02  5:01 UTC (permalink / raw)
  To: davem; +Cc: netdev, ameen.rahman

David,
  Series of 9 patches to fix diagnostic tools access and fixing
  ethtool msg level as per your comments.
  This is v1 series and this also include endianness fix.
  Please apply them on net-next branch.

-Amit

^ permalink raw reply

* [PATCHv1 NEXT 4/9] qlcnic: fix onchip memory access
From: Amit Kumar Salecha @ 2010-04-02  5:01 UTC (permalink / raw)
  To: davem; +Cc: netdev, ameen.rahman, Dhananjay Phadke
In-Reply-To: <1270184496-18985-1-git-send-email-amit.salecha@qlogic.com>

From: Dhananjay Phadke <dhananjay.phadke@qlogic.com>

Fix incorrect offset calculation and remove unnecessary remap
of the region in bar 0 to access onchip memory.

This was leading to read incorrect values by debug tools.

Signed-off-by: Dhananjay Phadke <dhananjay.phadke@qlogic.com>
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
---
 drivers/net/qlcnic/qlcnic_hw.c |   39 ++-------------------------------------
 1 files changed, 2 insertions(+), 37 deletions(-)

diff --git a/drivers/net/qlcnic/qlcnic_hw.c b/drivers/net/qlcnic/qlcnic_hw.c
index 419f46e..0933c2d 100644
--- a/drivers/net/qlcnic/qlcnic_hw.c
+++ b/drivers/net/qlcnic/qlcnic_hw.c
@@ -53,18 +53,6 @@ static inline void writeq(u64 val, void __iomem *addr)
 }
 #endif
 
-#define PCI_OFFSET_FIRST_RANGE(adapter, off)    \
-	((adapter)->ahw.pci_base0 + (off))
-
-static void __iomem *pci_base_offset(struct qlcnic_adapter *adapter,
-					    unsigned long off)
-{
-	if (ADDR_IN_RANGE(off, FIRST_PAGE_GROUP_START, FIRST_PAGE_GROUP_END))
-		return PCI_OFFSET_FIRST_RANGE(adapter, off);
-
-	return NULL;
-}
-
 static const struct crb_128M_2M_block_map
 crb_128M_2M_map[64] __cacheline_aligned_in_smp = {
     {{{0, 0,         0,         0} } },		/* 0: PCI */
@@ -871,13 +859,6 @@ qlcnic_pci_set_window_2M(struct qlcnic_adapter *adapter,
 		u64 addr, u32 *start)
 {
 	u32 window;
-	struct pci_dev *pdev = adapter->pdev;
-
-	if ((addr & 0x00ff800) == 0xff800) {
-		if (printk_ratelimit())
-			dev_warn(&pdev->dev, "QM access not handled\n");
-		return -EIO;
-	}
 
 	window = OCM_WIN_P3P(addr);
 
@@ -894,8 +875,7 @@ static int
 qlcnic_pci_mem_access_direct(struct qlcnic_adapter *adapter, u64 off,
 		u64 *data, int op)
 {
-	void __iomem *addr, *mem_ptr = NULL;
-	resource_size_t mem_base;
+	void __iomem *addr;
 	int ret;
 	u32 start;
 
@@ -905,21 +885,8 @@ qlcnic_pci_mem_access_direct(struct qlcnic_adapter *adapter, u64 off,
 	if (ret != 0)
 		goto unlock;
 
-	addr = pci_base_offset(adapter, start);
-	if (addr)
-		goto noremap;
-
-	mem_base = pci_resource_start(adapter->pdev, 0) + (start & PAGE_MASK);
-
-	mem_ptr = ioremap(mem_base, PAGE_SIZE);
-	if (mem_ptr == NULL) {
-		ret = -EIO;
-		goto unlock;
-	}
-
-	addr = mem_ptr + (start & (PAGE_SIZE - 1));
+	addr = adapter->ahw.pci_base0 + start;
 
-noremap:
 	if (op == 0)	/* read */
 		*data = readq(addr);
 	else		/* write */
@@ -928,8 +895,6 @@ noremap:
 unlock:
 	mutex_unlock(&adapter->ahw.mem_lock);
 
-	if (mem_ptr)
-		iounmap(mem_ptr);
 	return ret;
 }
 
-- 
1.6.0.2


^ permalink raw reply related

* [PATCHv1 NEXT 1/9] qlcnic: fix fw load from file
From: Amit Kumar Salecha @ 2010-04-02  5:01 UTC (permalink / raw)
  To: davem; +Cc: netdev, ameen.rahman
In-Reply-To: <1270184496-18985-1-git-send-email-amit.salecha@qlogic.com>

Rarely: Fw file size can be unaligned to 8.

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
---
 drivers/net/qlcnic/qlcnic_init.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/drivers/net/qlcnic/qlcnic_init.c b/drivers/net/qlcnic/qlcnic_init.c
index 7c34e4e..0a424e0 100644
--- a/drivers/net/qlcnic/qlcnic_init.c
+++ b/drivers/net/qlcnic/qlcnic_init.c
@@ -949,6 +949,16 @@ qlcnic_load_firmware(struct qlcnic_adapter *adapter)
 
 			flashaddr += 8;
 		}
+
+		size = (__force u32)qlcnic_get_fw_size(adapter) % 8;
+		if (size) {
+			data = cpu_to_le64(ptr64[i]);
+
+			if (qlcnic_pci_mem_write_2M(adapter,
+						flashaddr, data))
+				return -EIO;
+		}
+
 	} else {
 		u64 data;
 		u32 hi, lo;
-- 
1.6.0.2


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox