Netdev List


* Regression 2.6.36 - driver rtl8169 crashes kernel, triggered by user app
From: Michael Monnerie @ 2010-11-11 18:33 UTC (permalink / raw)
  To: linux-kernel; +Cc: romieu, netdev

[-- Attachment #1: Type: text/plain, Size: 1495 bytes --]

Dear list, I've hunted down a bug which does *NOT* occur in kernel 
2.6.34.7-0.5-desktop from openSUSE 11.3, but crashes stock kernel 2.6.36 
and can be triggered by an unprivileged user!

It's actually very simple. From my desktop (kernel 2.6.36) I "cd" to an 
NFS4 share where an xz-compressed image, Win7-64.iso.xz, is located.

# cd /q/iso-images

and then I try to uncompress it there (as user, not root!):

# xz -kv Win7-64.iso.xz

With kernel 2.6.34.7-0.5-desktop, this runs at ~41MiB/s without 
problems.

With kernel 2.6.36, it runs at ~26MiB/s, and while doing so, dmesg shows 
a lot of noise about r8169 complaining:
http://zmi.at/x/kernel2.6.36-crash86.jpg

Here are 2 pictures of different crashes:
http://zmi.at/x/kernel2.6.36-crash84.jpg
http://zmi.at/x/kernel2.6.36-crash85.jpg

Neither the dmesg messages nor the crash happen with kernel 
2.6.34.7-0.5-desktop as delivered by openSUSE 11.3, but 2.6.36 always 
crashes completely. I've retried about 10 times, and it *never* finished 
uncompressing the ~3GB image. At around 500-1000MB the kernel was gone.

I'm sure someone knows how to fix it. :-)

-- 
Best regards,
Michael Monnerie, Ing. BSc

it-management Internet Services
http://proteger.at [pronounced: Prot-e-schee]
Tel: 0660 / 415 65 31

****** Radio interview on the topic of spam ******
http://www.it-podcast.at/archiv.html#podcast-100716

// We currently have two houses for sale:
// http://zmi.at/langegg/
// http://zmi.at/haus2009/

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply

* Re: [PATCH] neigh: reorder struct neighbour
From: David Miller @ 2010-11-11 18:41 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1289494639.17691.1499.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 11 Nov 2010 17:57:19 +0100

> It is important to move nud_state outside of the often modified cache
> line (because of refcnt), to reduce false sharing in neigh_event_send()
> 
> This is a followup of commit 0ed8ddf4045f (neigh: Protect neigh->ha[]
> with a seqlock)
> 
> This gives a 7% speedup on routing test with IP route cache disabled.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> ---
> David, it appears I forgot to push this patch; I have had it in my tree
> for a month. Thanks!

Applied, thanks.
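
For readers unfamiliar with the false-sharing concern in the quoted
changelog, here is a minimal C11 sketch. The struct below is a
hypothetical stand-in, not the kernel's struct neighbour, and the
64-byte cache-line size is an assumption. It only shows the general
idea of keeping a read-mostly field (nud_state) off the cache line
that holds a frequently written one (refcnt).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64	/* assumed cache-line size in bytes */

/* Hypothetical stand-in for a neighbour-like entry. */
struct entry {
	/* Frequently written fields share the first cache line. */
	int32_t refcnt;
	uint64_t used;

	/* Read-mostly fields start on a cache line of their own. */
	_Alignas(CACHE_LINE) uint8_t nud_state;
	uint8_t ha[32];
};

/* Compile-time check that the read-mostly part is line-aligned. */
static_assert(offsetof(struct entry, nud_state) % CACHE_LINE == 0,
	      "nud_state must start a cache line");

/* Index of the cache line that holds a given byte offset. */
static inline size_t cache_line_of(size_t offset)
{
	return offset / CACHE_LINE;
}
```

With this layout a writer bumping refcnt does not invalidate the line
that a reader of nud_state is using. The real patch reorders existing
fields rather than adding alignment, so treat this purely as an
illustration.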

^ permalink raw reply

* Re: [PATCH net-next-2.6] net: get rid of rtable->idev
From: David Miller @ 2010-11-11 18:41 UTC (permalink / raw)
  To: eric.dumazet; +Cc: herbert, netdev, rolandd, sean.hefty, hal.rosenstock
In-Reply-To: <1289495647.17691.1536.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 11 Nov 2010 18:14:07 +0100

> It seems the idev field in struct rtable has no special purpose, but adds
> extra atomic ops.
> 
> We hold refcounts on the device itself (using percpu data, so pretty
> cheap in current kernel).
> 
> The InfiniBand case is solved by using dst.dev instead of idev->dev.
> 
> Removal of this field means routing without the route cache now uses
> shared data and percpu data, and the only potential contention is a pair of
> atomic ops on struct neighbour per forwarded packet.
> 
> About 5% speedup on routing test.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Cc: Roland Dreier <rolandd@cisco.com>
> Cc: Sean Hefty <sean.hefty@intel.com>
> Cc: Hal Rosenstock <hal.rosenstock@gmail.com>

Yes, let's remove as much unused crap as possible :-)

Applied, thanks!

^ permalink raw reply

* Re: net-next-2.6 [PATCH 0/3] dccp: Ack Vectors in circular buffer instead of array
From: David Miller @ 2010-11-11 18:45 UTC (permalink / raw)
  To: gerrit; +Cc: dccp, netdev
In-Reply-To: <1289455653-5463-1-git-send-email-gerrit@erg.abdn.ac.uk>

From: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Date: Thu, 11 Nov 2010 07:07:30 +0100

>  Patch #1: cleans up the old interface to prepare for the improved one.
>  Patch #2: also tidies up the old interface, by separating the internals
>            of Ack Vectors from the option-parsing code.
>  Patch #3: Completes the implementation of a circular Ack Vector buffer.
> 
> 
> I have also placed this into a fresh (today's) copy of net-next-2.6, on
> 
>     git://eden-feed.erg.abdn.ac.uk/net-next-2.6        [subtree 'dccp']
> 
> The set has been tested for 3 years, and is fully bisectable.

Pulled, thanks!
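
As a generic illustration of the "circular buffer instead of array"
idea named in the subject, here is a small sketch in C. It is not the
DCCP Ack Vector code; the names (struct ring, ring_push, ring_pop) and
the capacity are hypothetical. New entries go in at the head and the
oldest ones are dropped from the tail, with both indices wrapping
around.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_LEN 8	/* hypothetical capacity; one slot always stays free */

struct ring {
	uint8_t cells[RING_LEN];
	size_t head;	/* next slot to write */
	size_t tail;	/* oldest slot still in use */
};

static inline void ring_init(struct ring *r)
{
	r->head = 0;
	r->tail = 0;
}

/* Number of entries currently stored. */
static inline size_t ring_count(const struct ring *r)
{
	return (r->head + RING_LEN - r->tail) % RING_LEN;
}

/* Append at the head; fails instead of overwriting when full. */
static inline bool ring_push(struct ring *r, uint8_t v)
{
	if (ring_count(r) == RING_LEN - 1)
		return false;
	r->cells[r->head] = v;
	r->head = (r->head + 1) % RING_LEN;
	return true;
}

/* Remove the oldest entry from the tail. */
static inline bool ring_pop(struct ring *r, uint8_t *out)
{
	assert(out != NULL);
	if (ring_count(r) == 0)
		return false;
	*out = r->cells[r->tail];
	r->tail = (r->tail + 1) % RING_LEN;
	return true;
}
```

Keeping one slot free lets head == tail mean "empty" without a
separate counter. That is a design choice of this sketch only.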

^ permalink raw reply

* Re: [PATCH] macvlan: lockless tx path
From: Eric Dumazet @ 2010-11-11 18:46 UTC (permalink / raw)
  To: Ben Greear; +Cc: netdev
In-Reply-To: <4CDC324B.40206@candelatech.com>

On Thursday, 11 November 2010 at 10:13 -0800, Ben Greear wrote:

> If you are aware of any drivers that return counters of widths other than
> 32 or 64 bits, please let us know and perhaps we can fix them as well.

I am pretty sure I found some of them, but cannot recall right now.

Some ethtool stats definitely do not have a 32- or 64-bit range.
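
To show why the counter width matters, here is a minimal sketch in C
of computing the difference between two samples of a wrapping counter
of arbitrary width. The helper name counter_delta() and the 36-bit
example are hypothetical and are not taken from any driver. The sketch
assumes the counter wrapped at most once between the two samples.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Difference between two samples of a free-running counter that is
 * `bits` wide (1..64) and wraps to zero.  Assumes at most one wrap
 * happened between `prev` and `cur`.
 */
static inline uint64_t counter_delta(uint64_t prev, uint64_t cur,
				     unsigned int bits)
{
	uint64_t mask;

	assert(bits >= 1 && bits <= 64);
	mask = (bits == 64) ? UINT64_MAX : ((UINT64_C(1) << bits) - 1);

	/* Unsigned subtraction wraps mod 2^64; masking reduces mod 2^bits. */
	return (cur - prev) & mask;
}
```

If a caller assumed 32 bits for a counter that is really 36 bits wide,
the mask would be wrong and deltas across a wrap would be
miscalculated, which is why knowing the real width is useful.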




^ permalink raw reply

* Re: [PATCH net-next-2.6] vlan: lockless transmit path
From: Jesse Gross @ 2010-11-11 18:56 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev, Patrick McHardy
In-Reply-To: <1289497978.17691.1582.camel@edumazet-laptop>

On Thu, Nov 11, 2010 at 9:52 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thursday, 11 November 2010 at 09:40 -0800, Jesse Gross wrote:
>
>> If we're only allocating a single queue then we should also drop
>> vlan_dev_select_queue() and the netdev_ops that call it.  If the
>> underlying device is multiqueue and has its own select_queue function
>> then it can pick a queue number that is larger than what the vlan
>> device has.  The problem will be caught by dev_cap_txqueue() but it's
>> not right and it would also be nice to get rid of half of those
>> netdev_ops.
>
> Hmm, you are referring to old kernels, aren't you?
>
> My patch is for net-next

Well, I was referring to checked-in code.

>
> The plan is that after Tom Herbert's latest patches, dev_pick_tx() won't
> call do_select_queue() on a single-queue device.
>
> http://patchwork.ozlabs.org/patch/70369/

Before Tom's patch, a warning will be generated if a single queue vlan
device is stacked on top of a multiqueue physical device that
implements ndo_select_queue().  After Tom's patch, we avoid the
warning so vlan_dev_select_queue() is merely dead code.  Either way,
what's the benefit in keeping it?

>
>
> Logically, this is a second cleanup patch, I believe.

I'm not arguing against your patch, I just think it should go a step further.
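
For illustration, the out-of-range case described above can be
modelled in a few lines of userspace C. This is a simplified sketch
and not the kernel's dev_cap_txqueue(); struct fake_dev and
cap_txqueue() are hypothetical names, and the warning mentioned in the
thread is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a net device. */
struct fake_dev {
	uint16_t real_num_tx_queues;
};

/*
 * Cap a selected queue index to the device's valid range, falling
 * back to queue 0 when the index is too large.
 */
static inline uint16_t cap_txqueue(const struct fake_dev *dev,
				   uint16_t queue_index)
{
	assert(dev != NULL && dev->real_num_tx_queues > 0);
	if (queue_index >= dev->real_num_tx_queues)
		return 0;
	return queue_index;
}
```

A single-queue vlan device stacked on an 8-queue physical device would
hit the fallback whenever the lower device's select function picks a
non-zero queue, which is the situation being discussed.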

^ permalink raw reply

* [PATCH net-26 1/6] cxgb4vf: don't implement trivial (and incorrect) ndo_select_queue()
From: Casey Leedom @ 2010-11-11 19:06 UTC (permalink / raw)
  To: netdev; +Cc: davem, Casey Leedom
In-Reply-To: <1289502413-9895-1-git-send-email-leedom@chelsio.com>

Don't implement (struct net_device_ops *)->ndo_select_queue() with a simple
call to skb_tx_hash().  This leads to non-persistent TX queue selection in
the Linux dev_pick_tx() routine for TCP connections.

Signed-off-by: Casey Leedom <leedom@chelsio.com>
---
 drivers/net/cxgb4vf/cxgb4vf_main.c |   14 --------------
 1 files changed, 0 insertions(+), 14 deletions(-)

diff --git a/drivers/net/cxgb4vf/cxgb4vf_main.c b/drivers/net/cxgb4vf/cxgb4vf_main.c
index 6de5e2e..24808ac 100644
--- a/drivers/net/cxgb4vf/cxgb4vf_main.c
+++ b/drivers/net/cxgb4vf/cxgb4vf_main.c
@@ -1103,18 +1103,6 @@ static int cxgb4vf_set_mac_addr(struct net_device *dev, void *_addr)
 	return 0;
 }
 
-/*
- * Return a TX Queue on which to send the specified skb.
- */
-static u16 cxgb4vf_select_queue(struct net_device *dev, struct sk_buff *skb)
-{
-	/*
-	 * XXX For now just use the default hash but we probably want to
-	 * XXX look at other possibilities ...
-	 */
-	return skb_tx_hash(dev, skb);
-}
-
 #ifdef CONFIG_NET_POLL_CONTROLLER
 /*
  * Poll all of our receive queues.  This is called outside of normal interrupt
@@ -2417,7 +2405,6 @@ static const struct net_device_ops cxgb4vf_netdev_ops	= {
 	.ndo_get_stats		= cxgb4vf_get_stats,
 	.ndo_set_rx_mode	= cxgb4vf_set_rxmode,
 	.ndo_set_mac_address	= cxgb4vf_set_mac_addr,
-	.ndo_select_queue	= cxgb4vf_select_queue,
 	.ndo_validate_addr	= eth_validate_addr,
 	.ndo_do_ioctl		= cxgb4vf_do_ioctl,
 	.ndo_change_mtu		= cxgb4vf_change_mtu,
@@ -2624,7 +2611,6 @@ static int __devinit cxgb4vf_pci_probe(struct pci_dev *pdev,
 		netdev->do_ioctl = cxgb4vf_do_ioctl;
 		netdev->change_mtu = cxgb4vf_change_mtu;
 		netdev->set_mac_address = cxgb4vf_set_mac_addr;
-		netdev->select_queue = cxgb4vf_select_queue;
 #ifdef CONFIG_NET_POLL_CONTROLLER
 		netdev->poll_controller = cxgb4vf_poll_controller;
 #endif
-- 
1.7.0.4


^ permalink raw reply related

* [PATCH net-26 0/6] cxgb4vf: a number of bug fixes
From: Casey Leedom @ 2010-11-11 19:06 UTC (permalink / raw)
  To: netdev; +Cc: davem

The following patch set includes a number of bug fixes for the cxgb4vf
network driver.  As always, please toss these in the bin if they're not
right.

 drivers/net/cxgb4vf/cxgb4vf_main.c |   42 ++++++++-----
 drivers/net/cxgb4vf/sge.c          |  122 ++++++++++++++++++++++--------------
 drivers/net/cxgb4vf/t4vf_common.h  |    1 +
 drivers/net/cxgb4vf/t4vf_hw.c      |   19 ++++++
 4 files changed, 122 insertions(+), 62 deletions(-)


^ permalink raw reply

* [PATCH net-26 2/6] cxgb4vf: fix bug in Generic Receive Offload
From: Casey Leedom @ 2010-11-11 19:06 UTC (permalink / raw)
  To: netdev; +Cc: davem, Casey Leedom
In-Reply-To: <1289502413-9895-1-git-send-email-leedom@chelsio.com>

Fix botch in Generic Receive Offload (the Packet Gather List Total length
field wasn't being initialized).

Signed-off-by: Casey Leedom <leedom@chelsio.com>
---
 drivers/net/cxgb4vf/sge.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/net/cxgb4vf/sge.c b/drivers/net/cxgb4vf/sge.c
index f10864d..6a6e18b 100644
--- a/drivers/net/cxgb4vf/sge.c
+++ b/drivers/net/cxgb4vf/sge.c
@@ -1679,6 +1679,7 @@ int process_responses(struct sge_rspq *rspq, int budget)
 				}
 				len = RSPD_LEN(len);
 			}
+			gl.tot_len = len;
 
 			/*
 			 * Gather packet fragments.
-- 
1.7.0.4


^ permalink raw reply related

* [PATCH net-26 3/6] cxgb4vf: fix some errors in Gather List to skb conversion
From: Casey Leedom @ 2010-11-11 19:06 UTC (permalink / raw)
  To: netdev; +Cc: davem, Casey Leedom
In-Reply-To: <1289502413-9895-1-git-send-email-leedom@chelsio.com>

There were some errors in the way that internal Gather Lists were being
translated into skb's.  This also makes the VF Driver look more like the PF
Driver to facilitate easier comparison.

Signed-off-by: Casey Leedom <leedom@chelsio.com>
---
 drivers/net/cxgb4vf/sge.c |  121 +++++++++++++++++++++++++++-----------------
 1 files changed, 74 insertions(+), 47 deletions(-)

diff --git a/drivers/net/cxgb4vf/sge.c b/drivers/net/cxgb4vf/sge.c
index 6a6e18b..ecf0770 100644
--- a/drivers/net/cxgb4vf/sge.c
+++ b/drivers/net/cxgb4vf/sge.c
@@ -154,13 +154,14 @@ enum {
 	 */
 	RX_COPY_THRES = 256,
 	RX_PULL_LEN = 128,
-};
 
-/*
- * Can't define this in the above enum because PKTSHIFT isn't a constant in
- * the VF Driver ...
- */
-#define RX_PKT_PULL_LEN (RX_PULL_LEN + PKTSHIFT)
+	/*
+	 * Main body length for sk_buffs used for RX Ethernet packets with
+	 * fragments.  Should be >= RX_PULL_LEN but possibly bigger to give
+	 * pskb_may_pull() some room.
+	 */
+	RX_SKB_LEN = 512,
+};
 
 /*
  * Software state per TX descriptor.
@@ -1355,6 +1356,67 @@ out_free:
 }
 
 /**
+ *	t4vf_pktgl_to_skb - build an sk_buff from a packet gather list
+ *	@gl: the gather list
+ *	@skb_len: size of sk_buff main body if it carries fragments
+ *	@pull_len: amount of data to move to the sk_buff's main body
+ *
+ *	Builds an sk_buff from the given packet gather list.  Returns the
+ *	sk_buff or %NULL if sk_buff allocation failed.
+ */
+struct sk_buff *t4vf_pktgl_to_skb(const struct pkt_gl *gl,
+				  unsigned int skb_len, unsigned int pull_len)
+{
+	struct sk_buff *skb;
+	struct skb_shared_info *ssi;
+
+	/*
+	 * If the ingress packet is small enough, allocate an skb large enough
+	 * for all of the data and copy it inline.  Otherwise, allocate an skb
+	 * with enough room to pull in the header and reference the rest of
+	 * the data via the skb fragment list.
+	 *
+	 * Below we rely on RX_COPY_THRES being less than the smallest Rx
+	 * buff!  size, which is expected since buffers are at least
+	 * PAGE_SIZEd.  In this case packets up to RX_COPY_THRES have only one
+	 * fragment.
+	 */
+	if (gl->tot_len <= RX_COPY_THRES) {
+		/* small packets have only one fragment */
+		skb = alloc_skb(gl->tot_len, GFP_ATOMIC);
+		if (unlikely(!skb))
+			goto out;
+		__skb_put(skb, gl->tot_len);
+		skb_copy_to_linear_data(skb, gl->va, gl->tot_len);
+	} else {
+		skb = alloc_skb(skb_len, GFP_ATOMIC);
+		if (unlikely(!skb))
+			goto out;
+		__skb_put(skb, pull_len);
+		skb_copy_to_linear_data(skb, gl->va, pull_len);
+
+		ssi = skb_shinfo(skb);
+		ssi->frags[0].page = gl->frags[0].page;
+		ssi->frags[0].page_offset = gl->frags[0].page_offset + pull_len;
+		ssi->frags[0].size = gl->frags[0].size - pull_len;
+		if (gl->nfrags > 1)
+			memcpy(&ssi->frags[1], &gl->frags[1],
+			       (gl->nfrags-1) * sizeof(skb_frag_t));
+		ssi->nr_frags = gl->nfrags;
+
+		skb->len = gl->tot_len;
+		skb->data_len = skb->len - pull_len;
+		skb->truesize += skb->data_len;
+
+		/* Get a reference for the last page, we don't own it */
+		get_page(gl->frags[gl->nfrags - 1].page);
+	}
+
+out:
+	return skb;
+}
+
+/**
  *	t4vf_pktgl_free - free a packet gather list
  *	@gl: the gather list
  *
@@ -1463,10 +1525,8 @@ int t4vf_ethrx_handler(struct sge_rspq *rspq, const __be64 *rsp,
 {
 	struct sk_buff *skb;
 	struct port_info *pi;
-	struct skb_shared_info *ssi;
 	const struct cpl_rx_pkt *pkt = (void *)&rsp[1];
 	bool csum_ok = pkt->csum_calc && !pkt->err_vec;
-	unsigned int len = be16_to_cpu(pkt->len);
 	struct sge_eth_rxq *rxq = container_of(rspq, struct sge_eth_rxq, rspq);
 
 	/*
@@ -1481,42 +1541,14 @@ int t4vf_ethrx_handler(struct sge_rspq *rspq, const __be64 *rsp,
 	}
 
 	/*
-	 * If the ingress packet is small enough, allocate an skb large enough
-	 * for all of the data and copy it inline.  Otherwise, allocate an skb
-	 * with enough room to pull in the header and reference the rest of
-	 * the data via the skb fragment list.
+	 * Convert the Packet Gather List into an skb.
 	 */
-	if (len <= RX_COPY_THRES) {
-		/* small packets have only one fragment */
-		skb = alloc_skb(gl->frags[0].size, GFP_ATOMIC);
-		if (!skb)
-			goto nomem;
-		__skb_put(skb, gl->frags[0].size);
-		skb_copy_to_linear_data(skb, gl->va, gl->frags[0].size);
-	} else {
-		skb = alloc_skb(RX_PKT_PULL_LEN, GFP_ATOMIC);
-		if (!skb)
-			goto nomem;
-		__skb_put(skb, RX_PKT_PULL_LEN);
-		skb_copy_to_linear_data(skb, gl->va, RX_PKT_PULL_LEN);
-
-		ssi = skb_shinfo(skb);
-		ssi->frags[0].page = gl->frags[0].page;
-		ssi->frags[0].page_offset = (gl->frags[0].page_offset +
-					     RX_PKT_PULL_LEN);
-		ssi->frags[0].size = gl->frags[0].size - RX_PKT_PULL_LEN;
-		if (gl->nfrags > 1)
-			memcpy(&ssi->frags[1], &gl->frags[1],
-			       (gl->nfrags-1) * sizeof(skb_frag_t));
-		ssi->nr_frags = gl->nfrags;
-		skb->len = len + PKTSHIFT;
-		skb->data_len = skb->len - RX_PKT_PULL_LEN;
-		skb->truesize += skb->data_len;
-
-		/* Get a reference for the last page, we don't own it */
-		get_page(gl->frags[gl->nfrags - 1].page);
+	skb = t4vf_pktgl_to_skb(gl, RX_SKB_LEN, RX_PULL_LEN);
+	if (unlikely(!skb)) {
+		t4vf_pktgl_free(gl);
+		rxq->stats.rx_drops++;
+		return 0;
 	}
-
 	__skb_pull(skb, PKTSHIFT);
 	skb->protocol = eth_type_trans(skb, rspq->netdev);
 	skb_record_rx_queue(skb, rspq->idx);
@@ -1549,11 +1581,6 @@ int t4vf_ethrx_handler(struct sge_rspq *rspq, const __be64 *rsp,
 		netif_receive_skb(skb);
 
 	return 0;
-
-nomem:
-	t4vf_pktgl_free(gl);
-	rxq->stats.rx_drops++;
-	return 0;
 }
 
 /**
-- 
1.7.0.4
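
The copy-versus-reference decision made by t4vf_pktgl_to_skb() in the
patch above can be summarised with a small userspace sketch in C. The
two constants are the values from the patch; struct rx_split and
rx_plan() are hypothetical names used only here, and no real sk_buff
handling is modelled.

```c
#include <assert.h>
#include <stddef.h>

enum {
	RX_COPY_THRES = 256,	/* value from the patch above */
	RX_PULL_LEN   = 128,	/* value from the patch above */
};

static_assert(RX_PULL_LEN <= RX_COPY_THRES,
	      "pull length must not exceed the copy threshold");

/* How a received packet of tot_len bytes is split up. */
struct rx_split {
	size_t linear_len;	/* bytes copied into the main body */
	size_t frag_len;	/* bytes left referenced in page fragments */
};

static inline struct rx_split rx_plan(size_t tot_len)
{
	struct rx_split s;

	if (tot_len <= RX_COPY_THRES) {
		/* Small packet: copy everything inline. */
		s.linear_len = tot_len;
		s.frag_len = 0;
	} else {
		/* Large packet: copy the header part, reference the rest. */
		s.linear_len = RX_PULL_LEN;
		s.frag_len = tot_len - RX_PULL_LEN;
	}
	return s;
}
```

For a 1500-byte packet this gives 128 bytes in the main body and 1372
bytes in fragments, matching skb->data_len = skb->len - pull_len in
the patch.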


^ permalink raw reply related

* [PATCH net-26 4/6] cxgb4vf: flesh out PCI Device ID Table ...
From: Casey Leedom @ 2010-11-11 19:06 UTC (permalink / raw)
  To: netdev; +Cc: davem, Casey Leedom
In-Reply-To: <1289502413-9895-1-git-send-email-leedom@chelsio.com>

Add a bunch of T4 Device IDs for the VF Driver.

Signed-off-by: Casey Leedom <leedom@chelsio.com>
---
 drivers/net/cxgb4vf/cxgb4vf_main.c |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/drivers/net/cxgb4vf/cxgb4vf_main.c b/drivers/net/cxgb4vf/cxgb4vf_main.c
index 24808ac..4487b1a 100644
--- a/drivers/net/cxgb4vf/cxgb4vf_main.c
+++ b/drivers/net/cxgb4vf/cxgb4vf_main.c
@@ -2829,6 +2829,14 @@ static struct pci_device_id cxgb4vf_pci_tbl[] = {
 	CH_DEVICE(0x4800, 0),	/* T440-dbg */
 	CH_DEVICE(0x4801, 0),	/* T420-cr */
 	CH_DEVICE(0x4802, 0),	/* T422-cr */
+	CH_DEVICE(0x4803, 0),	/* T440-cr */
+	CH_DEVICE(0x4804, 0),	/* T420-bch */
+	CH_DEVICE(0x4805, 0),   /* T440-bch */
+	CH_DEVICE(0x4806, 0),	/* T460-ch */
+	CH_DEVICE(0x4807, 0),	/* T420-so */
+	CH_DEVICE(0x4808, 0),	/* T420-cx */
+	CH_DEVICE(0x4809, 0),	/* T420-bt */
+	CH_DEVICE(0x480a, 0),   /* T404-bt */
 	{ 0, }
 };
 
-- 
1.7.0.4


^ permalink raw reply related

* [PATCH net-26 6/6] cxgb4vf: add call to Firmware to reset VF State.
From: Casey Leedom @ 2010-11-11 19:06 UTC (permalink / raw)
  To: netdev; +Cc: davem, Casey Leedom
In-Reply-To: <1289502413-9895-1-git-send-email-leedom@chelsio.com>

Add call to Firmware to reset its VF State when we first attach to the VF.

Signed-off-by: Casey Leedom <leedom@chelsio.com>
---
 drivers/net/cxgb4vf/cxgb4vf_main.c |   16 ++++++++++++++++
 drivers/net/cxgb4vf/t4vf_common.h  |    1 +
 drivers/net/cxgb4vf/t4vf_hw.c      |   19 +++++++++++++++++++
 3 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/drivers/net/cxgb4vf/cxgb4vf_main.c b/drivers/net/cxgb4vf/cxgb4vf_main.c
index 8da3bda..c3449bb 100644
--- a/drivers/net/cxgb4vf/cxgb4vf_main.c
+++ b/drivers/net/cxgb4vf/cxgb4vf_main.c
@@ -2065,6 +2065,22 @@ static int adap_init0(struct adapter *adapter)
 	}
 
 	/*
+	 * Some environments do not properly handle PCIE FLRs -- e.g. in Linux
+	 * 2.6.31 and later we can't call pci_reset_function() in order to
+	 * issue an FLR because of a self- deadlock on the device semaphore.
+	 * Meanwhile, the OS infrastructure doesn't issue FLRs in all the
+	 * cases where they're needed -- for instance, some versions of KVM
+	 * fail to reset "Assigned Devices" when the VM reboots.  Therefore we
+	 * use the firmware based reset in order to reset any per function
+	 * state.
+	 */
+	err = t4vf_fw_reset(adapter);
+	if (err < 0) {
+		dev_err(adapter->pdev_dev, "FW reset failed: err=%d\n", err);
+		return err;
+	}
+
+	/*
 	 * Grab basic operational parameters.  These will predominantly have
 	 * been set up by the Physical Function Driver or will be hard coded
 	 * into the adapter.  We just have to live with them ...  Note that
diff --git a/drivers/net/cxgb4vf/t4vf_common.h b/drivers/net/cxgb4vf/t4vf_common.h
index 873cb7d..a65c80a 100644
--- a/drivers/net/cxgb4vf/t4vf_common.h
+++ b/drivers/net/cxgb4vf/t4vf_common.h
@@ -235,6 +235,7 @@ static inline int t4vf_wr_mbox_ns(struct adapter *adapter, const void *cmd,
 int __devinit t4vf_wait_dev_ready(struct adapter *);
 int __devinit t4vf_port_init(struct adapter *, int);
 
+int t4vf_fw_reset(struct adapter *);
 int t4vf_query_params(struct adapter *, unsigned int, const u32 *, u32 *);
 int t4vf_set_params(struct adapter *, unsigned int, const u32 *, const u32 *);
 
diff --git a/drivers/net/cxgb4vf/t4vf_hw.c b/drivers/net/cxgb4vf/t4vf_hw.c
index ea1c123..e306c20 100644
--- a/drivers/net/cxgb4vf/t4vf_hw.c
+++ b/drivers/net/cxgb4vf/t4vf_hw.c
@@ -326,6 +326,25 @@ int __devinit t4vf_port_init(struct adapter *adapter, int pidx)
 }
 
 /**
+ *      t4vf_fw_reset - issue a reset to FW
+ *      @adapter: the adapter
+ *
+ *	Issues a reset command to FW.  For a Physical Function this would
+ *	result in the Firmware resetting all of its state.  For a Virtual
+ *	Function this just resets the state associated with the VF.
+ */
+int t4vf_fw_reset(struct adapter *adapter)
+{
+	struct fw_reset_cmd cmd;
+
+	memset(&cmd, 0, sizeof(cmd));
+	cmd.op_to_write = cpu_to_be32(FW_CMD_OP(FW_RESET_CMD) |
+				      FW_CMD_WRITE);
+	cmd.retval_len16 = cpu_to_be32(FW_LEN16(cmd));
+	return t4vf_wr_mbox(adapter, &cmd, sizeof(cmd), NULL);
+}
+
+/**
  *	t4vf_query_params - query FW or device parameters
  *	@adapter: the adapter
  *	@nparams: the number of parameters
-- 
1.7.0.4


^ permalink raw reply related

* [PATCH net-26 5/6] cxgb4vf: Fail open if link_start() fails.
From: Casey Leedom @ 2010-11-11 19:06 UTC (permalink / raw)
  To: netdev; +Cc: davem, Casey Leedom
In-Reply-To: <1289502413-9895-1-git-send-email-leedom@chelsio.com>

Fail open if link_start() fails.

Signed-off-by: Casey Leedom <leedom@chelsio.com>
---
 drivers/net/cxgb4vf/cxgb4vf_main.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/net/cxgb4vf/cxgb4vf_main.c b/drivers/net/cxgb4vf/cxgb4vf_main.c
index 4487b1a..8da3bda 100644
--- a/drivers/net/cxgb4vf/cxgb4vf_main.c
+++ b/drivers/net/cxgb4vf/cxgb4vf_main.c
@@ -753,7 +753,9 @@ static int cxgb4vf_open(struct net_device *dev)
 	if (err)
 		return err;
 	set_bit(pi->port_id, &adapter->open_device_map);
-	link_start(dev);
+	err = link_start(dev);
+	if (err)
+		return err;
 	netif_tx_start_all_queues(dev);
 	return 0;
 }
-- 
1.7.0.4


^ permalink raw reply related

* [PATCH net-next-26 0/5] cxgb4vf: minor cleanup
From: Casey Leedom @ 2010-11-11 19:30 UTC (permalink / raw)
  To: netdev; +Cc: davem

Fixes to comments, compiler warnings, etc.

 drivers/net/cxgb4vf/adapter.h      |    2 +-
 drivers/net/cxgb4vf/cxgb4vf_main.c |   32 ++++++++++++++++++++------------
 drivers/net/cxgb4vf/sge.c          |    9 ++++++---
 drivers/net/cxgb4vf/t4vf_common.h  |   28 ++++++++++++++--------------
 drivers/net/cxgb4vf/t4vf_hw.c      |    5 +++--
 5 files changed, 44 insertions(+), 32 deletions(-)


^ permalink raw reply

* [PATCH net-next-26 1/5] cxgb4vf: minor comment/symbolic name cleanup.
From: Casey Leedom @ 2010-11-11 19:30 UTC (permalink / raw)
  To: netdev; +Cc: davem, Casey Leedom
In-Reply-To: <1289503844-18059-1-git-send-email-leedom@chelsio.com>

Minor cleanup of comments and symbolic constant names for clarity.

Signed-off-by: Casey Leedom <leedom@chelsio.com>
---
 drivers/net/cxgb4vf/adapter.h      |    2 +-
 drivers/net/cxgb4vf/cxgb4vf_main.c |   11 ++++-------
 drivers/net/cxgb4vf/sge.c          |    9 ++++++---
 drivers/net/cxgb4vf/t4vf_hw.c      |    5 +++--
 4 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/drivers/net/cxgb4vf/adapter.h b/drivers/net/cxgb4vf/adapter.h
index 8ea0196..4766b41 100644
--- a/drivers/net/cxgb4vf/adapter.h
+++ b/drivers/net/cxgb4vf/adapter.h
@@ -60,7 +60,7 @@ enum {
 	 * MSI-X interrupt index usage.
 	 */
 	MSIX_FW		= 0,		/* MSI-X index for firmware Q */
-	MSIX_NIQFLINT	= 1,		/* MSI-X index base for Ingress Qs */
+	MSIX_IQFLINT	= 1,		/* MSI-X index base for Ingress Qs */
 	MSIX_EXTRAS	= 1,
 	MSIX_ENTRIES	= MAX_ETH_QSETS + MSIX_EXTRAS,
 
diff --git a/drivers/net/cxgb4vf/cxgb4vf_main.c b/drivers/net/cxgb4vf/cxgb4vf_main.c
index c3449bb..6235719 100644
--- a/drivers/net/cxgb4vf/cxgb4vf_main.c
+++ b/drivers/net/cxgb4vf/cxgb4vf_main.c
@@ -280,9 +280,7 @@ static void name_msix_vecs(struct adapter *adapter)
 		const struct port_info *pi = netdev_priv(dev);
 		int qs, msi;
 
-		for (qs = 0, msi = MSIX_NIQFLINT;
-		     qs < pi->nqsets;
-		     qs++, msi++) {
+		for (qs = 0, msi = MSIX_IQFLINT; qs < pi->nqsets; qs++, msi++) {
 			snprintf(adapter->msix_info[msi].desc, namelen,
 				 "%s-%d", dev->name, qs);
 			adapter->msix_info[msi].desc[namelen] = 0;
@@ -309,7 +307,7 @@ static int request_msix_queue_irqs(struct adapter *adapter)
 	/*
 	 * Ethernet queues.
 	 */
-	msi = MSIX_NIQFLINT;
+	msi = MSIX_IQFLINT;
 	for_each_ethrxq(s, rxq) {
 		err = request_irq(adapter->msix_info[msi].vec,
 				  t4vf_sge_intr_msix, 0,
@@ -337,7 +335,7 @@ static void free_msix_queue_irqs(struct adapter *adapter)
 	int rxq, msi;
 
 	free_irq(adapter->msix_info[MSIX_FW].vec, &s->fw_evtq);
-	msi = MSIX_NIQFLINT;
+	msi = MSIX_IQFLINT;
 	for_each_ethrxq(s, rxq)
 		free_irq(adapter->msix_info[msi++].vec,
 			 &s->ethrxq[rxq].rspq);
@@ -527,7 +525,7 @@ static int setup_sge_queues(struct adapter *adapter)
 	 * brought up at which point lots of things get nailed down
 	 * permanently ...
 	 */
-	msix = MSIX_NIQFLINT;
+	msix = MSIX_IQFLINT;
 	for_each_port(adapter, pidx) {
 		struct net_device *dev = adapter->port[pidx];
 		struct port_info *pi = netdev_priv(dev);
@@ -2470,7 +2468,6 @@ static int __devinit cxgb4vf_pci_probe(struct pci_dev *pdev,
 		version_printed = 1;
 	}
 
-
 	/*
 	 * Initialize generic PCI device state.
 	 */
diff --git a/drivers/net/cxgb4vf/sge.c b/drivers/net/cxgb4vf/sge.c
index ecf0770..e0b3d1b 100644
--- a/drivers/net/cxgb4vf/sge.c
+++ b/drivers/net/cxgb4vf/sge.c
@@ -1568,6 +1568,9 @@ int t4vf_ethrx_handler(struct sge_rspq *rspq, const __be64 *rsp,
 	} else
 		skb_checksum_none_assert(skb);
 
+	/*
+	 * Deliver the packet to the stack.
+	 */
 	if (unlikely(pkt->vlan_ex)) {
 		struct vlan_group *grp = pi->vlan_grp;
 
@@ -2143,7 +2146,7 @@ int t4vf_sge_alloc_rxq(struct adapter *adapter, struct sge_rspq *rspq,
 
 		/*
 		 * Calculate the size of the hardware free list ring plus
-		 * status page (which the SGE will place at the end of the
+		 * Status Page (which the SGE will place after the end of the
 		 * free list ring) in Egress Queue Units.
 		 */
 		flsz = (fl->size / FL_PER_EQ_UNIT +
@@ -2240,8 +2243,8 @@ int t4vf_sge_alloc_eth_txq(struct adapter *adapter, struct sge_eth_txq *txq,
 	struct port_info *pi = netdev_priv(dev);
 
 	/*
-	 * Calculate the size of the hardware TX Queue (including the
-	 * status age on the end) in units of TX Descriptors.
+	 * Calculate the size of the hardware TX Queue (including the Status
+	 * Page on the end of the TX Queue) in units of TX Descriptors.
 	 */
 	nentries = txq->q.size + STAT_LEN / sizeof(struct tx_desc);
 
diff --git a/drivers/net/cxgb4vf/t4vf_hw.c b/drivers/net/cxgb4vf/t4vf_hw.c
index e306c20..f7d7f97 100644
--- a/drivers/net/cxgb4vf/t4vf_hw.c
+++ b/drivers/net/cxgb4vf/t4vf_hw.c
@@ -1276,7 +1276,7 @@ int t4vf_eth_eq_free(struct adapter *adapter, unsigned int eqid)
  */
 int t4vf_handle_fw_rpl(struct adapter *adapter, const __be64 *rpl)
 {
-	struct fw_cmd_hdr *cmd_hdr = (struct fw_cmd_hdr *)rpl;
+	const struct fw_cmd_hdr *cmd_hdr = (const struct fw_cmd_hdr *)rpl;
 	u8 opcode = FW_CMD_OP_GET(be32_to_cpu(cmd_hdr->hi));
 
 	switch (opcode) {
@@ -1284,7 +1284,8 @@ int t4vf_handle_fw_rpl(struct adapter *adapter, const __be64 *rpl)
 		/*
 		 * Link/module state change message.
 		 */
-		const struct fw_port_cmd *port_cmd = (void *)rpl;
+		const struct fw_port_cmd *port_cmd =
+			(const struct fw_port_cmd *)rpl;
 		u32 word;
 		int action, port_id, link_ok, speed, fc, pidx;
 
-- 
1.7.0.4


^ permalink raw reply related

* [PATCH net-next-26 2/5] cxgb4vf: add ethtool statistics for GRO.
From: Casey Leedom @ 2010-11-11 19:30 UTC (permalink / raw)
  To: netdev; +Cc: davem, Casey Leedom
In-Reply-To: <1289503844-18059-1-git-send-email-leedom@chelsio.com>

Add ethtool statistics for GRO.

Signed-off-by: Casey Leedom <leedom@chelsio.com>
---
 drivers/net/cxgb4vf/cxgb4vf_main.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/drivers/net/cxgb4vf/cxgb4vf_main.c b/drivers/net/cxgb4vf/cxgb4vf_main.c
index 6235719..47417d4 100644
--- a/drivers/net/cxgb4vf/cxgb4vf_main.c
+++ b/drivers/net/cxgb4vf/cxgb4vf_main.c
@@ -1346,6 +1346,8 @@ struct queue_port_stats {
 	u64 rx_csum;
 	u64 vlan_ex;
 	u64 vlan_ins;
+	u64 lro_pkts;
+	u64 lro_merged;
 };
 
 /*
@@ -1383,6 +1385,8 @@ static const char stats_strings[][ETH_GSTRING_LEN] = {
 	"RxCsumGood        ",
 	"VLANextractions   ",
 	"VLANinsertions    ",
+	"GROPackets        ",
+	"GROMerged         ",
 };
 
 /*
@@ -1432,6 +1436,8 @@ static void collect_sge_port_stats(const struct adapter *adapter,
 		stats->rx_csum += rxq->stats.rx_cso;
 		stats->vlan_ex += rxq->stats.vlan_ex;
 		stats->vlan_ins += txq->vlan_ins;
+		stats->lro_pkts += rxq->stats.lro_pkts;
+		stats->lro_merged += rxq->stats.lro_merged;
 	}
 }
 
-- 
1.7.0.4


^ permalink raw reply related

* [PATCH net-next-26 3/5] cxgb4vf: fix up "Section Mismatch" compiler warning.
From: Casey Leedom @ 2010-11-11 19:30 UTC (permalink / raw)
  To: netdev; +Cc: davem, Casey Leedom
In-Reply-To: <1289503844-18059-1-git-send-email-leedom@chelsio.com>

Fix up "Section Mismatch" compiler warning and mark another routine as
__devinit.

Signed-off-by: Casey Leedom <leedom@chelsio.com>
---
 drivers/net/cxgb4vf/cxgb4vf_main.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/cxgb4vf/cxgb4vf_main.c b/drivers/net/cxgb4vf/cxgb4vf_main.c
index 47417d4..4cf530a 100644
--- a/drivers/net/cxgb4vf/cxgb4vf_main.c
+++ b/drivers/net/cxgb4vf/cxgb4vf_main.c
@@ -2032,7 +2032,7 @@ static int __devinit setup_debugfs(struct adapter *adapter)
  * Tear down the /sys/kernel/debug/cxgb4vf sub-nodes created above.  We leave
  * it to our caller to tear down the directory (debugfs_root).
  */
-static void __devexit cleanup_debugfs(struct adapter *adapter)
+static void cleanup_debugfs(struct adapter *adapter)
 {
 	BUG_ON(adapter->debugfs_root == NULL);
 
@@ -2050,7 +2050,7 @@ static void __devexit cleanup_debugfs(struct adapter *adapter)
  * adapter parameters we're going to be using and initialize basic adapter
  * hardware support.
  */
-static int adap_init0(struct adapter *adapter)
+static int __devinit adap_init0(struct adapter *adapter)
 {
 	struct vf_resources *vfres = &adapter->params.vfres;
 	struct sge_params *sge_params = &adapter->params.sge;
-- 
1.7.0.4


^ permalink raw reply related

* [PATCH net-next-26 4/5] cxgb4vf: Advertise NETIF_F_TSO_ECN.
From: Casey Leedom @ 2010-11-11 19:30 UTC (permalink / raw)
  To: netdev; +Cc: davem, Casey Leedom
In-Reply-To: <1289503844-18059-1-git-send-email-leedom@chelsio.com>

Advertise NETIF_F_TSO_ECN.

Signed-off-by: Casey Leedom <leedom@chelsio.com>
---
 drivers/net/cxgb4vf/cxgb4vf_main.c |   11 ++++++++---
 1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/net/cxgb4vf/cxgb4vf_main.c b/drivers/net/cxgb4vf/cxgb4vf_main.c
index 4cf530a..9246d2f 100644
--- a/drivers/net/cxgb4vf/cxgb4vf_main.c
+++ b/drivers/net/cxgb4vf/cxgb4vf_main.c
@@ -1534,14 +1534,19 @@ static void cxgb4vf_get_wol(struct net_device *dev,
 }
 
 /*
+ * TCP Segmentation Offload flags which we support.
+ */
+#define TSO_FLAGS (NETIF_F_TSO | NETIF_F_TSO6 | NETIF_F_TSO_ECN)
+
+/*
  * Set TCP Segmentation Offloading feature capabilities.
  */
 static int cxgb4vf_set_tso(struct net_device *dev, u32 tso)
 {
 	if (tso)
-		dev->features |= NETIF_F_TSO | NETIF_F_TSO6;
+		dev->features |= TSO_FLAGS;
 	else
-		dev->features &= ~(NETIF_F_TSO | NETIF_F_TSO6);
+		dev->features &= ~TSO_FLAGS;
 	return 0;
 }
 
@@ -2610,7 +2615,7 @@ static int __devinit cxgb4vf_pci_probe(struct pci_dev *pdev,
 		netif_carrier_off(netdev);
 		netdev->irq = pdev->irq;
 
-		netdev->features = (NETIF_F_SG | NETIF_F_TSO | NETIF_F_TSO6 |
+		netdev->features = (NETIF_F_SG | TSO_FLAGS |
 				    NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
 				    NETIF_F_HW_VLAN_TX | NETIF_F_HW_VLAN_RX |
 				    NETIF_F_GRO);
-- 
1.7.0.4
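
The TSO_FLAGS change above groups related feature bits behind one mask
so that the set and clear paths cannot drift apart. A minimal
standalone sketch of the same pattern in C follows. The F_* bit values
and set_tso() are hypothetical; the kernel defines the real NETIF_F_*
flags and their values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values, not the kernel's NETIF_F_* definitions. */
#define F_SG		(UINT32_C(1) << 0)
#define F_TSO		(UINT32_C(1) << 1)
#define F_TSO6		(UINT32_C(1) << 2)
#define F_TSO_ECN	(UINT32_C(1) << 3)

/* One mask for every TSO-related bit. */
#define TSO_MASK	(F_TSO | F_TSO6 | F_TSO_ECN)

/* Enable or disable all TSO bits at once, leaving other bits alone. */
static inline void set_tso(uint32_t *features, int enable)
{
	assert(features != NULL);
	if (enable)
		*features |= TSO_MASK;
	else
		*features &= ~TSO_MASK;
}
```

Adding another TSO-related bit later then only requires touching the
mask definition.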


^ permalink raw reply related

* [PATCH net-next-26 5/5] cxgb4vf: Mark "UDP [RSS Hash] Enable" as a 1-bit field.
From: Casey Leedom @ 2010-11-11 19:30 UTC (permalink / raw)
  To: netdev; +Cc: davem, Casey Leedom
In-Reply-To: <1289503844-18059-1-git-send-email-leedom@chelsio.com>

Mark the UDP RSS Hash Enable field as 1-bit in length.  Also clean up
formatting from previous changeset which changed the RSS 1-bit fields from
"int" to "unsigned int".

Signed-off-by: Casey Leedom <leedom@chelsio.com>
---
 drivers/net/cxgb4vf/t4vf_common.h |   28 ++++++++++++++--------------
 1 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/net/cxgb4vf/t4vf_common.h b/drivers/net/cxgb4vf/t4vf_common.h
index a65c80a..7541a60 100644
--- a/drivers/net/cxgb4vf/t4vf_common.h
+++ b/drivers/net/cxgb4vf/t4vf_common.h
@@ -132,15 +132,15 @@ struct rss_params {
 	unsigned int mode;		/* RSS mode */
 	union {
 	    struct {
-		unsigned int synmapen:1;	/* SYN Map Enable */
-		unsigned int syn4tupenipv6:1;	/* enable hashing 4-tuple IPv6 SYNs */
-		unsigned int syn2tupenipv6:1;	/* enable hashing 2-tuple IPv6 SYNs */
-		unsigned int syn4tupenipv4:1;	/* enable hashing 4-tuple IPv4 SYNs */
-		unsigned int syn2tupenipv4:1;	/* enable hashing 2-tuple IPv4 SYNs */
-		unsigned int ofdmapen:1;	/* Offload Map Enable */
-		unsigned int tnlmapen:1;	/* Tunnel Map Enable */
-		unsigned int tnlalllookup:1;	/* Tunnel All Lookup */
-		unsigned int hashtoeplitz:1;	/* use Toeplitz hash */
+		uint synmapen:1;	/* SYN Map Enable */
+		uint syn4tupenipv6:1;	/* enable hashing 4-tuple IPv6 SYNs */
+		uint syn2tupenipv6:1;	/* enable hashing 2-tuple IPv6 SYNs */
+		uint syn4tupenipv4:1;	/* enable hashing 4-tuple IPv4 SYNs */
+		uint syn2tupenipv4:1;	/* enable hashing 2-tuple IPv4 SYNs */
+		uint ofdmapen:1;	/* Offload Map Enable */
+		uint tnlmapen:1;	/* Tunnel Map Enable */
+		uint tnlalllookup:1;	/* Tunnel All Lookup */
+		uint hashtoeplitz:1;	/* use Toeplitz hash */
 	    } basicvirtual;
 	} u;
 };
@@ -151,11 +151,11 @@ struct rss_params {
 union rss_vi_config {
     struct {
 	u16 defaultq;			/* Ingress Queue ID for !tnlalllookup */
-	unsigned int ip6fourtupen:1;	/* hash 4-tuple IPv6 ingress packets */
-	unsigned int ip6twotupen:1;	/* hash 2-tuple IPv6 ingress packets */
-	unsigned int ip4fourtupen:1;	/* hash 4-tuple IPv4 ingress packets */
-	unsigned int ip4twotupen:1;	/* hash 2-tuple IPv4 ingress packets */
-	int udpen;			/* hash 4-tuple UDP ingress packets */
+	uint ip6fourtupen:1;		/* hash 4-tuple IPv6 ingress packets */
+	uint ip6twotupen:1;		/* hash 2-tuple IPv6 ingress packets */
+	uint ip4fourtupen:1;		/* hash 4-tuple IPv4 ingress packets */
+	uint ip4twotupen:1;		/* hash 2-tuple IPv4 ingress packets */
+	uint udpen:1;			/* hash 4-tuple UDP ingress packets */
     } basicvirtual;
 };
 
-- 
1.7.0.4
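
For readers unfamiliar with 1-bit fields, here is a small userspace sketch of what the change to udpen implies. The struct below is a hypothetical stand-in that only mirrors the layout of the basicvirtual member; it is not the driver's definition. An unsigned 1-bit field keeps only the low bit of whatever is assigned to it, so the field can hold exactly 0 or 1.

```c
/* Hypothetical stand-in mirroring the basicvirtual layout after the patch. */
struct demo_vi_config {
	unsigned short defaultq;	/* plays the role of the u16 queue ID */
	unsigned int ip6fourtupen:1;
	unsigned int ip6twotupen:1;
	unsigned int ip4fourtupen:1;
	unsigned int ip4twotupen:1;
	unsigned int udpen:1;		/* was a full "int" before the patch */
};

/* Store a value in the 1-bit field and read it back.  Conversion to an
 * unsigned bit-field is reduced modulo 2, so only the low bit survives. */
static unsigned int demo_udpen_roundtrip(unsigned int value)
{
	struct demo_vi_config cfg = { 0 };

	cfg.udpen = value;
	return cfg.udpen;
}
```

Code that assigns such a field from a wider register value therefore needs to shift and mask the bit it wants first, rather than relying on "non-zero means enabled".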



* [PATCH net-next-2.6] vlan: remove ndo_select_queue() logic
From: Eric Dumazet @ 2010-11-11 19:42 UTC (permalink / raw)
  To: Jesse Gross; +Cc: David Miller, netdev, Patrick McHardy
In-Reply-To: <AANLkTik-b5jYgN9e2cgYQvW1V3kuHem2x6wDpw4_=GJJ@mail.gmail.com>

On Thursday, 11 November 2010 at 10:56 -0800, Jesse Gross wrote:
> On Thu, Nov 11, 2010 at 9:52 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:

> Before Tom's patch, a warning will be generated if a single queue vlan
> device is stacked on top of a multiqueue physical device that
> implements ndo_select_queue().  After Tom's patch, we avoid the
> warning so vlan_dev_select_queue() is merely dead code.  Either way,
> what's the benefit in keeping it?
> 
> >
> >
> > This is logically a second cleanup patch, I believe.
> 
> I'm not arguing against your patch, I just think it should go a step further.

Sure ! Here is the thing ;)

Thanks

[PATCH] vlan: remove ndo_select_queue() logic

Now that vlan devices are lockless, we don't need special ndo_select_queue()
logic. dev_pick_tx() will do the multiqueue work when the real device transmits.

Suggested-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 net/8021q/vlan_dev.c |   71 +----------------------------------------
 1 files changed, 3 insertions(+), 68 deletions(-)

diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index 3a67483..99d6910 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -390,14 +390,6 @@ static netdev_tx_t vlan_dev_hwaccel_hard_start_xmit(struct sk_buff *skb,
 	return ret;
 }
 
-static u16 vlan_dev_select_queue(struct net_device *dev, struct sk_buff *skb)
-{
-	struct net_device *rdev = vlan_dev_info(dev)->real_dev;
-	const struct net_device_ops *ops = rdev->netdev_ops;
-
-	return ops->ndo_select_queue(rdev, skb);
-}
-
 static int vlan_dev_change_mtu(struct net_device *dev, int new_mtu)
 {
 	/* TODO: gotta make sure the underlying layer can handle it,
@@ -726,8 +718,7 @@ static const struct header_ops vlan_header_ops = {
 	.parse	 = eth_header_parse,
 };
 
-static const struct net_device_ops vlan_netdev_ops, vlan_netdev_accel_ops,
-		    vlan_netdev_ops_sq, vlan_netdev_accel_ops_sq;
+static const struct net_device_ops vlan_netdev_ops, vlan_netdev_accel_ops;
 
 static int vlan_dev_init(struct net_device *dev)
 {
@@ -763,17 +754,11 @@ static int vlan_dev_init(struct net_device *dev)
 	if (real_dev->features & NETIF_F_HW_VLAN_TX) {
 		dev->header_ops      = real_dev->header_ops;
 		dev->hard_header_len = real_dev->hard_header_len;
-		if (real_dev->netdev_ops->ndo_select_queue)
-			dev->netdev_ops = &vlan_netdev_accel_ops_sq;
-		else
-			dev->netdev_ops = &vlan_netdev_accel_ops;
+		dev->netdev_ops = &vlan_netdev_accel_ops;
 	} else {
 		dev->header_ops      = &vlan_header_ops;
 		dev->hard_header_len = real_dev->hard_header_len + VLAN_HLEN;
-		if (real_dev->netdev_ops->ndo_select_queue)
-			dev->netdev_ops = &vlan_netdev_ops_sq;
-		else
-			dev->netdev_ops = &vlan_netdev_ops;
+		dev->netdev_ops = &vlan_netdev_ops;
 	}
 
 	if (is_vlan_dev(real_dev))
@@ -944,56 +929,6 @@ static const struct net_device_ops vlan_netdev_accel_ops = {
 #endif
 };
 
-static const struct net_device_ops vlan_netdev_ops_sq = {
-	.ndo_select_queue	= vlan_dev_select_queue,
-	.ndo_change_mtu		= vlan_dev_change_mtu,
-	.ndo_init		= vlan_dev_init,
-	.ndo_uninit		= vlan_dev_uninit,
-	.ndo_open		= vlan_dev_open,
-	.ndo_stop		= vlan_dev_stop,
-	.ndo_start_xmit =  vlan_dev_hard_start_xmit,
-	.ndo_validate_addr	= eth_validate_addr,
-	.ndo_set_mac_address	= vlan_dev_set_mac_address,
-	.ndo_set_rx_mode	= vlan_dev_set_rx_mode,
-	.ndo_set_multicast_list	= vlan_dev_set_rx_mode,
-	.ndo_change_rx_flags	= vlan_dev_change_rx_flags,
-	.ndo_do_ioctl		= vlan_dev_ioctl,
-	.ndo_neigh_setup	= vlan_dev_neigh_setup,
-	.ndo_get_stats64	= vlan_dev_get_stats64,
-#if defined(CONFIG_FCOE) || defined(CONFIG_FCOE_MODULE)
-	.ndo_fcoe_ddp_setup	= vlan_dev_fcoe_ddp_setup,
-	.ndo_fcoe_ddp_done	= vlan_dev_fcoe_ddp_done,
-	.ndo_fcoe_enable	= vlan_dev_fcoe_enable,
-	.ndo_fcoe_disable	= vlan_dev_fcoe_disable,
-	.ndo_fcoe_get_wwn	= vlan_dev_fcoe_get_wwn,
-#endif
-};
-
-static const struct net_device_ops vlan_netdev_accel_ops_sq = {
-	.ndo_select_queue	= vlan_dev_select_queue,
-	.ndo_change_mtu		= vlan_dev_change_mtu,
-	.ndo_init		= vlan_dev_init,
-	.ndo_uninit		= vlan_dev_uninit,
-	.ndo_open		= vlan_dev_open,
-	.ndo_stop		= vlan_dev_stop,
-	.ndo_start_xmit =  vlan_dev_hwaccel_hard_start_xmit,
-	.ndo_validate_addr	= eth_validate_addr,
-	.ndo_set_mac_address	= vlan_dev_set_mac_address,
-	.ndo_set_rx_mode	= vlan_dev_set_rx_mode,
-	.ndo_set_multicast_list	= vlan_dev_set_rx_mode,
-	.ndo_change_rx_flags	= vlan_dev_change_rx_flags,
-	.ndo_do_ioctl		= vlan_dev_ioctl,
-	.ndo_neigh_setup	= vlan_dev_neigh_setup,
-	.ndo_get_stats64	= vlan_dev_get_stats64,
-#if defined(CONFIG_FCOE) || defined(CONFIG_FCOE_MODULE)
-	.ndo_fcoe_ddp_setup	= vlan_dev_fcoe_ddp_setup,
-	.ndo_fcoe_ddp_done	= vlan_dev_fcoe_ddp_done,
-	.ndo_fcoe_enable	= vlan_dev_fcoe_enable,
-	.ndo_fcoe_disable	= vlan_dev_fcoe_disable,
-	.ndo_fcoe_get_wwn	= vlan_dev_fcoe_get_wwn,
-#endif
-};
-
 void vlan_setup(struct net_device *dev)
 {
 	ether_setup(dev);




* [net-2.6 PATCH] net: zero kobject in rx_queue_release
From: John Fastabend @ 2010-11-11 20:13 UTC (permalink / raw)
  To: davem; +Cc: john.r.fastabend, netdev, eric.dumazet, therbert

netif_set_real_num_rx_queues() can decrement and increment
the number of rx queues. For example ixgbe does this as
features and offloads are toggled. Presumably this could
also happen across down/up on most devices if the available
resources changed (cpu offlined).

The kobject needs to be zeroed in this case so that its state is not
preserved across kobject_put()/kobject_init_and_add().

This resolves the following error report.

ixgbe 0000:03:00.0: eth2: NIC Link is Up 10 Gbps, Flow Control: RX/TX
kobject (ffff880324b83210): tried to init an initialized object, something is seriously wrong.
Pid: 1972, comm: lldpad Not tainted 2.6.37-rc18021qaz+ #169
Call Trace:
 [<ffffffff8121c940>] kobject_init+0x3a/0x83
 [<ffffffff8121cf77>] kobject_init_and_add+0x23/0x57
 [<ffffffff8107b800>] ? mark_lock+0x21/0x267
 [<ffffffff813c6d11>] net_rx_queue_update_kobjects+0x63/0xc6
 [<ffffffff813b5e0e>] netif_set_real_num_rx_queues+0x5f/0x78
 [<ffffffffa0261d49>] ixgbe_set_num_queues+0x1c6/0x1ca [ixgbe]
 [<ffffffffa0262509>] ixgbe_init_interrupt_scheme+0x1e/0x79c [ixgbe]
 [<ffffffffa0274596>] ixgbe_dcbnl_set_state+0x167/0x189 [ixgbe]

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---

 net/core/net-sysfs.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index a5ff5a8..3315033 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -721,6 +721,11 @@ static void rx_queue_release(struct kobject *kobj)

 	if (atomic_dec_and_test(&first->count))
 		kfree(first);
+
+	/* cleanup kobject because we may need to reuse it if the
+	 * number of rx queues is increased again in the future
+	 */
+	memset(kobj, 0, sizeof(*kobj));
 }

 static struct kobj_type rx_queue_ktype = {
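
The failure mode described above can be modelled in plain userspace C. This is a rough sketch, not kernel code: struct demo_kobj and the demo_* helpers are hypothetical stand-ins that only imitate the "already initialized" check behind the warning in the trace.

```c
#include <string.h>

/* Hypothetical stand-in for an embedded kobject that gets reused. */
struct demo_kobj {
	unsigned int state_initialized:1;
	int refcount;
};

/* Returns 0 on success, -1 if the object still looks initialized
 * (the situation in which the kernel prints its warning). */
static int demo_kobj_init(struct demo_kobj *k)
{
	if (k->state_initialized)
		return -1;
	k->state_initialized = 1;
	k->refcount = 1;
	return 0;
}

/* Drop one reference; on the last one, run the release step, which
 * optionally wipes the object as the patch does with memset(). */
static void demo_kobj_put(struct demo_kobj *k, int zero_on_release)
{
	if (--k->refcount == 0 && zero_on_release)
		memset(k, 0, sizeof(*k));
}

/* Initialize, release, then initialize the same memory again. */
static int demo_reuse(int zero_on_release)
{
	struct demo_kobj k;

	memset(&k, 0, sizeof(k));
	demo_kobj_init(&k);
	demo_kobj_put(&k, zero_on_release);
	return demo_kobj_init(&k);
}
```

In this model demo_reuse(0) fails on the second init because the stale flag survives the release, while demo_reuse(1) succeeds.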


* Re: Regression 2.6.36 - driver rtl8169 crashes kernel, triggered by user app
From: Francois Romieu @ 2010-11-11 20:49 UTC (permalink / raw)
  To: Michael Monnerie; +Cc: linux-kernel, netdev
In-Reply-To: <201011111933.04343@zmi.at>

Michael Monnerie <michael.monnerie@is.it-management.at> :
[...]
> With kernel 2.6.34.7-0.5-desktop, this runs with ~41MiB/s without 
> problems.
> 
> With kernel 2.6.36, it runs at ~26MiB/s, and while doing so, dmesg shows 
> a lot of noise about r8169 complaining:
> http://zmi.at/x/kernel2.6.36-crash86.jpg

Can you test again after reverting 801e147cde02f04b5c2f42764cd43a89fc7400a2 ?

Thanks.

-- 
Ueimor


* [RFC PATCH] network: return errors if we know tcp_connect failed
From: Eric Paris @ 2010-11-11 21:03 UTC (permalink / raw)
  To: netdev, linux-kernel; +Cc: davem, kuznet, pekkas, jmorris, yoshfuji, kaber

THIS PATCH IS VERY POSSIBLY WRONG!  But if it is I want some feedback.

Basically what I found was that if I added an iptables rule like so:

iptables -A OUTPUT -p tcp --dport 80 -j DROP

And then ran a web browser like links it would just hang on 'establishing
connection.'  I expected that the application would immediately, or at least
very quickly, get notified that the connect failed.   This waiting for timeout
would be expected if something else dropped the SYN or if we were dropping the
SYN/ACK packet coming back, but I figured if we knew we threw away the SYN we knew
right away that the connection was denied and we should be able to indicate
that to the application.  Yes, I realize this is little different than if the
SYN was dropped in the first network device, but it is different because we
know what happened!  We know that connect() call failed and that there isn't
anything coming back.

What I discovered was that we actually had 2 problems in making it possible
for userspace to quickly realize the connect failed.  The first was a problem
in the netfilter code which wasn't passing errors back up the stack correctly,
due to what I believe to be a mistake in precedence rules.

http://marc.info/?l=netfilter-devel&m=128950262021804&w=2

And the second was that tcp_connect() was just ignoring the return value from
tcp_transmit_skb().  Maybe this was intentional but I really wish we could
find out that connect failed long before the minutes-long timeout.  Once I
fixed both of those issues I find that links gets denied (with EPERM)
immediately when it calls connect().  Is this wrong?  Is this bad to tell
userspace more quickly what happened?  Does passing this error code back up
the stack here break something else?  Why do some functions seem to pay
attention to tcp_transmit_skb() return codes and some functions just ignore
it?  What do others think?

NOT-AT-ALL-SIGNED-OFF-BY
---

 net/ipv4/tcp_output.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index e961522..67b8535 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2592,6 +2592,7 @@ int tcp_connect(struct sock *sk)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct sk_buff *buff;
+	int ret;

 	tcp_connect_init(sk);

@@ -2614,7 +2615,7 @@ int tcp_connect(struct sock *sk)
 	sk->sk_wmem_queued += buff->truesize;
 	sk_mem_charge(sk, buff->truesize);
 	tp->packets_out += tcp_skb_pcount(buff);
-	tcp_transmit_skb(sk, buff, 1, sk->sk_allocation);
+	ret = tcp_transmit_skb(sk, buff, 1, sk->sk_allocation);

 	/* We change tp->snd_nxt after the tcp_transmit_skb() call
 	 * in order to make this packet get counted in tcpOutSegs.
@@ -2626,7 +2627,7 @@ int tcp_connect(struct sock *sk)
 	/* Timer for repeating the SYN until an answer. */
 	inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS,
 				  inet_csk(sk)->icsk_rto, TCP_RTO_MAX);
-	return 0;
+	return ret;
 }
 EXPORT_SYMBOL(tcp_connect);
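
The behavioural difference the diff introduces can be sketched with stub functions in userspace. Everything named demo_* is hypothetical: demo_transmit() merely stands in for tcp_transmit_skb() and returns -EPERM when a local filter is assumed to have dropped the SYN.

```c
#include <errno.h>

/* Hypothetical stand-in for tcp_transmit_skb(): a nonzero "filtered"
 * models a local rule dropping the SYN before it leaves the host. */
static int demo_transmit(int filtered)
{
	return filtered ? -EPERM : 0;
}

/* Before the patch: the transmit result is discarded, so the caller
 * always sees success and has to wait for the retransmit timeout. */
static int demo_connect_old(int filtered)
{
	demo_transmit(filtered);
	return 0;
}

/* After the patch: the transmit result is handed back to the caller. */
static int demo_connect_patched(int filtered)
{
	int ret = demo_transmit(filtered);

	return ret;
}
```

With the SYN filtered, demo_connect_old() still reports 0 while demo_connect_patched() reports -EPERM, which matches the immediate EPERM the author observed from connect().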


* Re: [RFC PATCH] network: return errors if we know tcp_connect failed
From: Eric Dumazet @ 2010-11-11 21:14 UTC (permalink / raw)
  To: Eric Paris
  Cc: netdev, linux-kernel, davem, kuznet, pekkas, jmorris, yoshfuji,
	kaber
In-Reply-To: <20101111210341.31350.86916.stgit@paris.rdu.redhat.com>

On Thursday, 11 November 2010 at 16:03 -0500, Eric Paris wrote:
> THIS PATCH IS VERY POSSIBLY WRONG!  But if it is I want some feedback.
> 
> Basically what I found was that if I added an iptables rule like so:
> 
> iptables -A OUTPUT -p tcp --dport 80 -j DROP
> 
> And then ran a web browser like links it would just hang on 'establishing
> connection.'  I expected that the application would immediately, or at least
> very quickly, get notified that the connect failed.   This waiting for timeout
> would be expected if something else dropped the SYN or if we were dropping the
> SYN/ACK packet coming back, but I figured if we knew we threw away the SYN we knew
> right away that the connection was denied and we should be able to indicate
> that to the application.  Yes, I realize this is little different than if the
> SYN was dropped in the first network device, but it is different because we
> know what happened!  We know that connect() call failed and that there isn't
> anything coming back.
> 
> What I discovered was that we actually had 2 problems in making it possible
> for userspace to quickly realize the connect failed.  The first was a problem
> in the netfilter code which wasn't passing errors back up the stack correctly,
> due to what I believe to be a mistake in precedence rules.
> 
> http://marc.info/?l=netfilter-devel&m=128950262021804&w=2
> 
> And the second was that tcp_connect() was just ignoring the return value from
> tcp_transmit_skb().  Maybe this was intentional but I really wish we could
> find out that connect failed long before the minutes long timeout.  Once I
> fixed both of those issues I find that links gets denied (with EPERM)
> immediately when it calls connect().  Is this wrong?  Is this bad to tell
> userspace more quickly what happened?  Does passing this error code back up
> the stack here break something else?  Why do some functions seem to pay
> attention to tcp_transmit_skb() return codes and some functions just ignore
> it?  What do others think?
> 


I think it's an interesting idea, but a temporary memory shortage would
abort the connect().

We could imagine some special handling of the first packet of a flow
being DROPPED for whatever reason (flow control...)

So it needs some refinement I think.

SYN packets should be allowed to be re-transmitted before saying a TCP
connect() cannot succeed.
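
To make the objection concrete, here is a small userspace sketch. All names are hypothetical, and the specific error codes are assumptions of this model rather than anything stated in the thread. It contrasts blind propagation with one conceivable refinement in which transient errors are swallowed so that the SYN retransmit timer still gets its chance. The refinement is an illustration only; nobody in the thread proposed this exact rule.

```c
#include <errno.h>

/* Hypothetical outcomes of sending the first SYN. */
enum demo_tx_outcome { DEMO_TX_OK, DEMO_TX_FILTERED, DEMO_TX_NO_MEMORY };

/* Hypothetical stand-in for the transmit step; the error codes chosen
 * here are assumptions of this model. */
static int demo_syn_transmit(enum demo_tx_outcome outcome)
{
	switch (outcome) {
	case DEMO_TX_FILTERED:
		return -EPERM;
	case DEMO_TX_NO_MEMORY:
		return -ENOBUFS;
	default:
		return 0;
	}
}

/* Blind propagation: a momentary memory shortage aborts connect(). */
static int demo_connect_blind(enum demo_tx_outcome outcome)
{
	return demo_syn_transmit(outcome);
}

/* Assumed refinement: treat memory pressure as transient and report
 * success, leaving the retry to the SYN retransmit timer. */
static int demo_connect_refined(enum demo_tx_outcome outcome)
{
	int ret = demo_syn_transmit(outcome);

	if (ret == -ENOBUFS || ret == -ENOMEM)
		return 0;
	return ret;
}
```

Under blind propagation both failure kinds surface immediately; under the assumed refinement only the filter drop does.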




