Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH for-2.6.35] virtio_net: fix oom handling on tx
From: Rusty Russell @ 2010-06-21  2:43 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Stephen Hemminger, Sridhar Samudrala, virtualization, Jiri Pirko,
	Shirley Ma, netdev, linux-kernel
In-Reply-To: <20100610190343.GC4044@redhat.com>

On Fri, 11 Jun 2010 04:33:43 am Michael S. Tsirkin wrote:
> > > > @@ -572,12 +571,14 @@ again:
> > > > 
> > > >  	/* This can happen with OOM and indirect buffers. */
> > > >  	if (unlikely(capacity < 0)) {
> > > > -		netif_stop_queue(dev);
> > > > -		dev_warn(&dev->dev, "Unexpected full queue\n");
> > > > -		if (unlikely(!virtqueue_enable_cb(vi->svq))) {
> > > > -			virtqueue_disable_cb(vi->svq);
> > > > -			netif_start_queue(dev);
> > > > -			goto again;
> > > > +		if (net_ratelimit()) {
> > > > +			if (likely(capacity == -ENOMEM))
> > > > +				dev_warn(&dev->dev,
> > > > +					 "TX queue failure: out of memory\n");
> > > > +			else
> > > > +				dev_warn(&dev->dev,
> > > > +					 "Unexpected TX queue failure: %d\n",
> > > > +					 capacity);
...
> 
> Well, I only keep the existing behaviour around.

Actually, it *does* change behavior, as the comment indicates.  So let's
fix the whole thing.  AFAICT wth TX_BUSY we'll get called again RSN, and
that's not really useful for OOM.

This is what I have:

Subject: virtio_net: fix oom handling on tx
Date: Thu, 10 Jun 2010 18:20:41 +0300
From: "Michael S. Tsirkin" <mst@redhat.com>

virtio net will never try to overflow the TX ring, so the only reason
add_buf may fail is out of memory. Thus, we can not stop the
device until some request completes - there's no guarantee anything
at all is outstanding.

Make the error message clearer as well: error here does not
indicate queue full.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (...and avoid TX_BUSY)
Cc: stable@kernel.org
---
 drivers/net/virtio_net.c |   18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -562,7 +562,6 @@ static netdev_tx_t start_xmit(struct sk_
 	struct virtnet_info *vi = netdev_priv(dev);
 	int capacity;
 
-again:
 	/* Free up any pending old buffers before queueing new ones. */
 	free_old_xmit_skbs(vi);
 
@@ -571,14 +570,17 @@ again:
 
 	/* This can happen with OOM and indirect buffers. */
 	if (unlikely(capacity < 0)) {
-		netif_stop_queue(dev);
-		dev_warn(&dev->dev, "Unexpected full queue\n");
-		if (unlikely(!virtqueue_enable_cb(vi->svq))) {
-			virtqueue_disable_cb(vi->svq);
-			netif_start_queue(dev);
-			goto again;
+		if (net_ratelimit()) {
+			if (likely(capacity == -ENOMEM))
+				dev_warn(&dev->dev,
+					 "TX queue failure: out of memory\n");
+			else
+				dev_warn(&dev->dev,
+					 "Unexpected TX queue failure: %d\n",
+					 capacity);
 		}
-		return NETDEV_TX_BUSY;
+		kfree_skb(skb);
+		return NETDEV_TX_OK;
 	}
 	virtqueue_kick(vi->svq);
 

^ permalink raw reply

* Re: [PATCH net-next-2.6 1/4] caif: Bugfix not all services uses flow-ctrl.
From: David Miller @ 2010-06-21  3:38 UTC (permalink / raw)
  To: sjur.brandeland; +Cc: sjurbr, netdev, linus.walleij
In-Reply-To: <1276793741-20278-1-git-send-email-sjur.brandeland@stericsson.com>

From: sjur.brandeland@stericsson.com
Date: Thu, 17 Jun 2010 18:55:38 +0200

> From: Sjur Braendeland <sjur.brandeland@stericsson.com>
> 
> Flow control is not used by all CAIF services.
> The usage of flow control is now part of the gerneal
> initialization function for CAIF Services.
> 
> Signed-off-by: Sjur Braendeland@stericsson.com

Applied.

^ permalink raw reply

* Re: [PATCH net-next-2.6 2/4] caif: Bugfix - RFM must support segmentation.
From: David Miller @ 2010-06-21  3:38 UTC (permalink / raw)
  To: sjur.brandeland; +Cc: sjurbr, netdev, linus.walleij
In-Reply-To: <1276793741-20278-2-git-send-email-sjur.brandeland@stericsson.com>

From: sjur.brandeland@stericsson.com
Date: Thu, 17 Jun 2010 18:55:39 +0200

> From: Sjur Braendeland <sjur.brandeland@stericsson.com>
> 
> CAIF Remote File Manager may send or receive more than 4050 bytes.
> Due to this The CAIF RFM service have to support segmentation.
> 
> Signed-off-by: Sjur Braendeland@stericsson.com

Applied.

^ permalink raw reply

* Re: [PATCH net-next-2.6 3/4] caif: Use link layer MTU instead of fixed MTU
From: David Miller @ 2010-06-21  3:39 UTC (permalink / raw)
  To: sjur.brandeland; +Cc: sjurbr, netdev, linus.walleij
In-Reply-To: <1276793741-20278-3-git-send-email-sjur.brandeland@stericsson.com>

From: sjur.brandeland@stericsson.com
Date: Thu, 17 Jun 2010 18:55:40 +0200

> From: Sjur Braendeland <sjur.brandeland@stericsson.com>
> 
> Previously CAIF supported maximum transfer size of ~4050.
> The transfer size is now calculated dynamically based on the
> link layers mtu size.
> 
> Signed-off-by: Sjur Braendeland@stericsson.com

Applied.

^ permalink raw reply

* Re: [PATCH net-next-2.6 4/4] caif: Add debug connection type for CAIF.
From: David Miller @ 2010-06-21  3:39 UTC (permalink / raw)
  To: sjur.brandeland; +Cc: sjurbr, netdev, linus.walleij
In-Reply-To: <1276793741-20278-4-git-send-email-sjur.brandeland@stericsson.com>

From: sjur.brandeland@stericsson.com
Date: Thu, 17 Jun 2010 18:55:41 +0200

> From: Sjur Braendeland <sjur.brandeland@stericsson.com>
> 
> Added new CAIF protocol type CAIFPROTO_DEBUG for accessing
> CAIF debug on the ST Ericsson modems.
> 
> There are two debug servers on the modem, one for radio related
> debug (CAIF_RADIO_DEBUG_SERVICE) and the other for
> communication/application related debug (CAIF_COM_DEBUG_SERVICE).
> 
> The debug connection can contain trace debug printouts or
> interactive debug used for debugging and test.
> 
> Debug connections can be of type STREAM or SEQPACKET.
> 
> Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>

Applied.

^ permalink raw reply

* Re: [PATCH] caif: Add CAIF-SPI Protocol driver.
From: David Miller @ 2010-06-21  3:39 UTC (permalink / raw)
  To: sjur.brandeland; +Cc: sjurbr, netdev, marcel, daniel.martensson, linus.walleij
In-Reply-To: <1276794164-20455-1-git-send-email-sjur.brandeland@stericsson.com>

From: sjur.brandeland@stericsson.com
Date: Thu, 17 Jun 2010 19:02:44 +0200

> +ifeq ($(CONFIG_CAIF_SPI_SYNC),y)
> +CAIF_SPI_FLAGS := -DCONFIG_CAIF_SPI_SYNC
> +endif
> +

You should never set CONFIG_* defines from the Makefile
commands, always set them in your Kconfig statements.

^ permalink raw reply

* [RFC net-2.6 PATCH] ixgbe: need to purge skb qdisc lists when changing real_num_tx_queues
From: John Fastabend @ 2010-06-21  5:25 UTC (permalink / raw)
  To: netdev; +Cc: john.r.fastabend, davem, jeffrey.t.kirsher

This patch exports dev_deactivate() symbol.

The qdisc needs to be purged when real_num_tx_queues is
changed so that skbs do not hit ndo_start_xmit with
queue_mappings corresponding to tx_rings already freed.

real_num_tx_queues is changed when dynamically enabling or
disabling DCB or FCoE.  To purge the qdisc use dev_deactivate().

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---

 drivers/net/ixgbe/ixgbe_dcb_nl.c |    9 +++++++--
 drivers/net/ixgbe/ixgbe_fcoe.c   |    9 +++++++--
 net/sched/sch_generic.c          |    1 +
 3 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_dcb_nl.c b/drivers/net/ixgbe/ixgbe_dcb_nl.c
index 71da325..0d6854c 100644
--- a/drivers/net/ixgbe/ixgbe_dcb_nl.c
+++ b/drivers/net/ixgbe/ixgbe_dcb_nl.c
@@ -28,6 +28,7 @@
 
 #include "ixgbe.h"
 #include <linux/dcbnl.h>
+#include <net/pkt_sched.h>
 #include "ixgbe_dcb_82598.h"
 #include "ixgbe_dcb_82599.h"
 
@@ -126,8 +127,10 @@ static u8 ixgbe_dcbnl_set_state(struct net_device *netdev, u8 state)
 			goto out;
 		}
 
-		if (netif_running(netdev))
+		if (netif_running(netdev)) {
+			dev_deactivate(netdev);
 			netdev->netdev_ops->ndo_stop(netdev);
+		}
 		ixgbe_clear_interrupt_scheme(adapter);
 
 		if (adapter->hw.mac.type == ixgbe_mac_82598EB) {
@@ -146,8 +149,10 @@ static u8 ixgbe_dcbnl_set_state(struct net_device *netdev, u8 state)
 	} else {
 		/* Turn off DCB */
 		if (adapter->flags & IXGBE_FLAG_DCB_ENABLED) {
-			if (netif_running(netdev))
+			if (netif_running(netdev)) {
+				dev_deactivate(netdev);
 				netdev->netdev_ops->ndo_stop(netdev);
+			}
 			ixgbe_clear_interrupt_scheme(adapter);
 
 			adapter->hw.fc.requested_mode = adapter->last_lfc_mode;
diff --git a/drivers/net/ixgbe/ixgbe_fcoe.c b/drivers/net/ixgbe/ixgbe_fcoe.c
index 45182ab..181f158 100644
--- a/drivers/net/ixgbe/ixgbe_fcoe.c
+++ b/drivers/net/ixgbe/ixgbe_fcoe.c
@@ -33,6 +33,7 @@
 #include <linux/if_ether.h>
 #include <linux/gfp.h>
 #include <linux/if_vlan.h>
+#include <net/sch_generic.h>
 #include <scsi/scsi_cmnd.h>
 #include <scsi/scsi_device.h>
 #include <scsi/fc/fc_fs.h>
@@ -615,8 +616,10 @@ int ixgbe_fcoe_enable(struct net_device *netdev)
 		goto out_enable;
 
 	DPRINTK(DRV, INFO, "Enabling FCoE offload features.\n");
-	if (netif_running(netdev))
+	if (netif_running(netdev)) {
+		dev_deactivate(netdev);
 		netdev->netdev_ops->ndo_stop(netdev);
+	}
 
 	ixgbe_clear_interrupt_scheme(adapter);
 
@@ -661,8 +664,10 @@ int ixgbe_fcoe_disable(struct net_device *netdev)
 		goto out_disable;
 
 	DPRINTK(DRV, INFO, "Disabling FCoE offload features.\n");
-	if (netif_running(netdev))
+	if (netif_running(netdev)) {
+		dev_deactivate(netdev);
 		netdev->netdev_ops->ndo_stop(netdev);
+	}
 
 	ixgbe_clear_interrupt_scheme(adapter);
 
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index a63029e..53ca3ac 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -804,6 +804,7 @@ void dev_deactivate(struct net_device *dev)
 	while (some_qdisc_is_busy(dev))
 		yield();
 }
+EXPORT_SYMBOL(dev_deactivate);
 
 static void dev_init_scheduler_queue(struct net_device *dev,
 				     struct netdev_queue *dev_queue,


^ permalink raw reply related

* linux-next: manual merge of the tip tree with the net tree
From: Stephen Rothwell @ 2010-06-21  6:16 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Peter Zijlstra
  Cc: linux-next, linux-kernel, Paul E. McKenney, Jiri Pirko,
	David Miller, netdev

[-- Attachment #1: Type: text/plain, Size: 726 bytes --]

Hi all,

Today's linux-next merge of the tip tree got a conflict in
net/bridge/br_fdb.c net/bridge/netfilter/ebt_redirect.c
net/bridge/netfilter/ebt_ulog.c net/bridge/netfilter/ebtables.c
net/netfilter/nfnetlink_log.c between commit
f350a0a87374418635689471606454abc7beaa3a ("bridge: use rx_handler_data
pointer to store net_bridge_port pointer") from the net tree and commit
81bdf5bd7349bd4523538cbd7878f334bc2bfe14 ("net: Make accesses to
->br_port safe for sparse RCU") from the tip tree.

The net tree commit looks like a superset of the tip tree commit, so I
effectively reverted the tip tree commit.
-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply

* Re: [PATCH] net: optimize Berkeley Packet Filter (BPF) processing
From: Hagen Paul Pfeifer @ 2010-06-21  7:57 UTC (permalink / raw)
  To: Changli Gao; +Cc: David Miller, shemminger, netdev, Linus Torvalds
In-Reply-To: <AANLkTinNDMrpDUxQGjQTReuN7EyIPYNnpfO77jRFIjM7@mail.gmail.com>


On Mon, 21 Jun 2010 10:15:33 +0800, Changli Gao <xiaosuo@gmail.com> wrote:



> FYI: FreeBSD goes even further, and its BPF implementation exploits

> JIT in two architectures: i386 and amd64.



;-) I know. This patch is architecture neutral and provides performance

gains on all plattforms. If someone is crazy enough to implement a JIT

compiler I am fine with that too. ;-) 



HGN



^ permalink raw reply

* [PATCH] cpmac: do not leak struct net_device on phy_connect errors
From: Florian Fainelli @ 2010-06-21  8:07 UTC (permalink / raw)
  To: netdev; +Cc: David Miller

If the call to phy_connect fails, we will return directly instead of freeing
the previously allocated struct net_device.

Signed-off-by: Florian Fainelli <florian@openwrt.org>
CC: stable@kernel.org
---
diff --git a/drivers/net/cpmac.c b/drivers/net/cpmac.c
index 3c58db5..23786ee 100644
--- a/drivers/net/cpmac.c
+++ b/drivers/net/cpmac.c
@@ -1181,7 +1181,8 @@ static int __devinit cpmac_probe(struct platform_device *pdev)
 		if (netif_msg_drv(priv))
 			printk(KERN_ERR "%s: Could not attach to PHY\n",
 			       dev->name);
-		return PTR_ERR(priv->phy);
+		rc = PTR_ERR(priv->phy);
+		goto fail;
 	}
 
 	if ((rc = register_netdev(dev))) {

^ permalink raw reply related

* Re: inconsistent lock state
From: Peter Zijlstra @ 2010-06-21  6:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Sergey Senozhatsky, Alexander Viro, Sage Weil, linux-fsdevel,
	linux-kernel, Dominik Brodowski, Maciej Rutecki, Eric Dumazet,
	Paul E.McKenney, Lai Jiangshan, David S.Miller, netdev
In-Reply-To: <20100618133004.228c2223.akpm@linux-foundation.org>

On Fri, 2010-06-18 at 13:30 -0700, Andrew Morton wrote:
> From: Andrew Morton <akpm@linux-foundation.org>
> 
> Fix a lockdep-splat-causing regression introduced by
> 
> : commit 989a2979205dd34269382b357e6d4b4b6956b889
> : Author:     Eric Dumazet <eric.dumazet@gmail.com>
> : AuthorDate: Wed Apr 14 09:55:35 2010 +0000
> : Commit:     David S. Miller <davem@davemloft.net>
> : CommitDate: Wed Apr 21 16:19:29 2010 -0700
> : 
> :     fasync: RCU and fine grained locking
> 
> kill_fasync() can be called from both process and hard-irq context, so
> fa_lock must be taken with IRQs disabled.
> 
> Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16230
> 
> Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Reported-by: Dominik Brodowski <linux@dominikbrodowski.net>
> Cc: Maciej Rutecki <maciej.rutecki@gmail.com>
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> 

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>

^ permalink raw reply

* Re: [PATCH for-2.6.35] virtio_net: fix oom handling on tx
From: Michael S. Tsirkin @ 2010-06-21  8:33 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Stephen Hemminger, Sridhar Samudrala, virtualization, Jiri Pirko,
	Shirley Ma, netdev, linux-kernel
In-Reply-To: <201006211213.49600.rusty@rustcorp.com.au>

On Mon, Jun 21, 2010 at 12:13:49PM +0930, Rusty Russell wrote:
> On Fri, 11 Jun 2010 04:33:43 am Michael S. Tsirkin wrote:
> > > > > @@ -572,12 +571,14 @@ again:
> > > > > 
> > > > >  	/* This can happen with OOM and indirect buffers. */
> > > > >  	if (unlikely(capacity < 0)) {
> > > > > -		netif_stop_queue(dev);
> > > > > -		dev_warn(&dev->dev, "Unexpected full queue\n");
> > > > > -		if (unlikely(!virtqueue_enable_cb(vi->svq))) {
> > > > > -			virtqueue_disable_cb(vi->svq);
> > > > > -			netif_start_queue(dev);
> > > > > -			goto again;
> > > > > +		if (net_ratelimit()) {
> > > > > +			if (likely(capacity == -ENOMEM))
> > > > > +				dev_warn(&dev->dev,
> > > > > +					 "TX queue failure: out of memory\n");
> > > > > +			else
> > > > > +				dev_warn(&dev->dev,
> > > > > +					 "Unexpected TX queue failure: %d\n",
> > > > > +					 capacity);
> ...
> > 
> > Well, I only keep the existing behaviour around.
> 
> Actually, it *does* change behavior, as the comment indicates.  So let's
> fix the whole thing.  AFAICT wth TX_BUSY we'll get called again RSN, and
> that's not really useful for OOM.
> 
> This is what I have:
> 
> Subject: virtio_net: fix oom handling on tx
> Date: Thu, 10 Jun 2010 18:20:41 +0300
> From: "Michael S. Tsirkin" <mst@redhat.com>
> 
> virtio net will never try to overflow the TX ring, so the only reason
> add_buf may fail is out of memory. Thus, we can not stop the
> device until some request completes - there's no guarantee anything
> at all is outstanding.
> 
> Make the error message clearer as well: error here does not
> indicate queue full.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (...and avoid TX_BUSY)
> Cc: stable@kernel.org
> ---
>  drivers/net/virtio_net.c |   18 ++++++++++--------
>  1 file changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -562,7 +562,6 @@ static netdev_tx_t start_xmit(struct sk_
>  	struct virtnet_info *vi = netdev_priv(dev);
>  	int capacity;
>  
> -again:
>  	/* Free up any pending old buffers before queueing new ones. */
>  	free_old_xmit_skbs(vi);
>  
> @@ -571,14 +570,17 @@ again:
>  
>  	/* This can happen with OOM and indirect buffers. */
>  	if (unlikely(capacity < 0)) {
> -		netif_stop_queue(dev);
> -		dev_warn(&dev->dev, "Unexpected full queue\n");
> -		if (unlikely(!virtqueue_enable_cb(vi->svq))) {
> -			virtqueue_disable_cb(vi->svq);
> -			netif_start_queue(dev);
> -			goto again;
> +		if (net_ratelimit()) {
> +			if (likely(capacity == -ENOMEM))
> +				dev_warn(&dev->dev,
> +					 "TX queue failure: out of memory\n");
> +			else
> +				dev_warn(&dev->dev,
> +					 "Unexpected TX queue failure: %d\n",
> +					 capacity);
>  		}
> -		return NETDEV_TX_BUSY;
> +		kfree_skb(skb);
> +		return NETDEV_TX_OK;

If we do so, let's increment the dropped counter and/or error counter?

>  	}
>  	virtqueue_kick(vi->svq);
>  

^ permalink raw reply

* Re: [v3 Patch 2/2] mlx4: add dynamic LRO disable support
From: Stanislaw Gruszka @ 2010-06-21  9:02 UTC (permalink / raw)
  To: Cong Wang; +Cc: netdev, nhorman, herbert.xu, bhutchings, Ramkrishna.Vepa, davem
In-Reply-To: <4C1ECD04.20609@redhat.com>

On Mon, Jun 21, 2010 at 10:23:00AM +0800, Cong Wang wrote:
> On 06/18/10 19:09, Stanislaw Gruszka wrote:
>> On Fri, Jun 18, 2010 at 06:55:38AM -0400, Amerigo Wang wrote:
>>> +static int mlx4_ethtool_op_set_flags(struct net_device *dev, u32 data)
>>> +{
>>> +	struct mlx4_en_priv *priv = netdev_priv(dev);
>>> +	struct mlx4_en_dev *mdev = priv->mdev;
>>> +	int rc = 0;
>>> +	int changed = 0;
>>> +
>>> +	if (data&  (ETH_FLAG_NTUPLE | ETH_FLAG_RXHASH))
>>> +		return -EOPNOTSUPP;

As we are here, better would be
	if (data & ~ETH_FLAG_LRO)
		return -EOPNOTSUPP;
code will persist correct when someone add new flags.

>>> +
>>> +	if (data&  ETH_FLAG_LRO) {
>>> +		if (!(dev->features&  NETIF_F_LRO))
>>> +			changed = 1;
>>> +	} else if (dev->features&  NETIF_F_LRO) {
>>> +		changed = 1;
>>> +		mdev->profile.num_lro = 0;
>>
>> Everything fine except that, what for you zero num_lro value?
>>
>> If we set it to zero it will stay zero and we will not create
>> proper number of lro descriptors in mlx4_en_create_rx_ring()
>> (called from mlx4_en_set_ringparam() ->   mlx4_en_alloc_resources())
>> when someone enable LRO again on.
>>
>
> Huh? Isn't ->num_lro which controls LRO of mlx4 driver?

It is, but only in mlx4_en_add() just before register_netdev(),
when we setup default dev->features. Otherwise dev->features
tells if LRO is enabled or disabled.

This realize me, that we should not dev->features |= NETIF_F_LRO
if mdev->profile.num_lro == 0 .

Stanislaw

^ permalink raw reply

* Re: [PATCH] gainfar.c : skb_over_panic
From: Eran Liberty @ 2010-06-21  9:13 UTC (permalink / raw)
  To: David Miller; +Cc: galak, netdev
In-Reply-To: <20100617.122030.112600189.davem@davemloft.net>

David Miller wrote:
> From: Eran Liberty <liberty@extricom.com>
> Date: Thu, 17 Jun 2010 19:32:54 +0300
>
>   
>> I have demonstrated skb_over_panic with linux 2.6.32.15 on a mpc8548
>> based product.
>>     
>
> A fix for a similar bug was necessary for the ucc_geth driver,
> see below.
>
> The real problem is that skb->data assignment, the rest of the
> SKB state has to be reset, and not doing that is what results in
> the skb_over_panic calls.
>
> >From db176edc89abbf22e6db6853f8581f9475fe8ec1 Mon Sep 17 00:00:00 2001
> From: Sergey Matyukevich <geomatsi@gmail.com>
> Date: Mon, 14 Jun 2010 06:35:20 +0000
> Subject: [PATCH] ucc_geth: fix for RX skb buffers recycling
>
> This patch implements a proper modification of RX skb buffers before
> recycling. Adjusting only skb->data is not enough because after that
> skb->tail and skb->len become incorrect.
>
> Signed-off-by: Sergey Matyukevich <geomatsi@gmail.com>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> ---
>  drivers/net/ucc_geth.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/net/ucc_geth.c b/drivers/net/ucc_geth.c
> index 4a34833..807470e 100644
> --- a/drivers/net/ucc_geth.c
> +++ b/drivers/net/ucc_geth.c
> @@ -3215,6 +3215,8 @@ static int ucc_geth_rx(struct ucc_geth_private *ugeth, u8 rxQ, int rx_work_limit
>  					   __func__, __LINE__, (u32) skb);
>  			if (skb) {
>  				skb->data = skb->head + NET_SKB_PAD;
> +				skb->len = 0;
> +				skb_reset_tail_pointer(skb);
>  				__skb_queue_head(&ugeth->rx_recycle, skb);
>  			}
>  
David,

I have compared the suggested patch with what the function skb_recycle_check() does. Both patch and skb_recycle_check()
 have skb_reset_tail_pointer(). While the patch zero only skb->len, skb_recycle_check()
 clears the whole skb (up to tail). On top of that skb_recycle_check() preforms a whole set of other checks and cleanups. 
The question is, which action is MORE correct: the pin-point action of the patch suggested or the broader checks of skb_recycle_check() function?

-- Liberty

 

^ permalink raw reply

* Re: [PATCH for-2.6.35] virtio_net: fix oom handling on tx
From: Rusty Russell @ 2010-06-21 10:23 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Stephen Hemminger, Sridhar Samudrala, virtualization, Jiri Pirko,
	Shirley Ma, netdev, linux-kernel
In-Reply-To: <20100621083315.GA8665@redhat.com>

On Mon, 21 Jun 2010 06:03:16 pm Michael S. Tsirkin wrote:
> On Mon, Jun 21, 2010 at 12:13:49PM +0930, Rusty Russell wrote:
> > -		return NETDEV_TX_BUSY;
> > +		kfree_skb(skb);
> > +		return NETDEV_TX_OK;
> 
> If we do so, let's increment the dropped counter and/or error counter?

Yep, here's the extra change:

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -571,14 +571,16 @@ static netdev_tx_t start_xmit(struct sk_
 	/* This can happen with OOM and indirect buffers. */
 	if (unlikely(capacity < 0)) {
 		if (net_ratelimit()) {
-			if (likely(capacity == -ENOMEM))
+			if (likely(capacity == -ENOMEM)) {
 				dev_warn(&dev->dev,
 					 "TX queue failure: out of memory\n");
-			else
+			} else {
+				dev->stats.tx_fifo_errors++;
 				dev_warn(&dev->dev,
 					 "Unexpected TX queue failure: %d\n",
 					 capacity);
 		}
+		dev->stats.tx_dropped++;
 		kfree_skb(skb);
 		return NETDEV_TX_OK;
 	}

^ permalink raw reply

* Re: [PATCH for-2.6.35] virtio_net: fix oom handling on tx
From: Michael S. Tsirkin @ 2010-06-21 10:53 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Stephen Hemminger, Sridhar Samudrala, virtualization, Jiri Pirko,
	Shirley Ma, netdev, linux-kernel
In-Reply-To: <201006211953.44724.rusty@rustcorp.com.au>

On Mon, Jun 21, 2010 at 07:53:43PM +0930, Rusty Russell wrote:
> On Mon, 21 Jun 2010 06:03:16 pm Michael S. Tsirkin wrote:
> > On Mon, Jun 21, 2010 at 12:13:49PM +0930, Rusty Russell wrote:
> > > -		return NETDEV_TX_BUSY;
> > > +		kfree_skb(skb);
> > > +		return NETDEV_TX_OK;
> > 
> > If we do so, let's increment the dropped counter and/or error counter?
> 
> Yep, here's the extra change:

Looks good to me.

> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -571,14 +571,16 @@ static netdev_tx_t start_xmit(struct sk_
>  	/* This can happen with OOM and indirect buffers. */
>  	if (unlikely(capacity < 0)) {
>  		if (net_ratelimit()) {
> -			if (likely(capacity == -ENOMEM))
> +			if (likely(capacity == -ENOMEM)) {
>  				dev_warn(&dev->dev,
>  					 "TX queue failure: out of memory\n");
> -			else
> +			} else {
> +				dev->stats.tx_fifo_errors++;
>  				dev_warn(&dev->dev,
>  					 "Unexpected TX queue failure: %d\n",
>  					 capacity);
>  		}
> +		dev->stats.tx_dropped++;
>  		kfree_skb(skb);
>  		return NETDEV_TX_OK;
>  	}

^ permalink raw reply

* [PATCH] kernel 2.6.35: ixgbe: skip non IPv4 packets in ATR filter
From: Guillaume Gaudonville @ 2010-06-21 13:05 UTC (permalink / raw)
  To: netdev; +Cc: Kirsher, Jeffrey T, Waskiewicz Jr, Peter P,
	Chilakala, Mallikarjuna

Hello,

In driver ixgbe, ixgbe_atr may cause crashes for non-ipv4 packets. Just 
add a test to check skb->protocol:

 From fcb81aa89b6819f95349a4ed8c30f0629430aa1d Mon Sep 17 00:00:00 2001
From: Guillaume Gaudonville <guillaume.gaudonville@6wind.com>
Date: Thu, 17 Jun 2010 16:02:14 +0200
Subject: [PATCH] ixgbe: skip non IPv4 packets in ATR filter

It may crash on short packets due to ip_hdr() access.

Signed-off-by: Guillaume Gaudonville <guillaume.gaudonville@6wind.com>
---
 drivers/net/ixgbe/ixgbe_main.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index b2af2f6..3581dbe 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -6019,7 +6019,6 @@ static void ixgbe_tx_queue(struct ixgbe_adapter 
*adapter,
 static void ixgbe_atr(struct ixgbe_adapter *adapter, struct sk_buff *skb,
                      int queue, u32 tx_flags)
 {
-       /* Right now, we support IPv4 only */
        struct ixgbe_atr_input atr_input;
        struct tcphdr *th;
        struct iphdr *iph = ip_hdr(skb);
@@ -6028,6 +6027,10 @@ static void ixgbe_atr(struct ixgbe_adapter 
*adapter, struct sk_buff *skb,
        u32 src_ipv4_addr, dst_ipv4_addr;
        u8 l4type = 0;
 
+       /* Right now, we support IPv4 only */
+       if (skb->protocol != htons(ETH_P_IP))
+               return;
+
        /* check if we're UDP or TCP */
        if (iph->protocol == IPPROTO_TCP) {
                th = tcp_hdr(skb);
-- 
1.5.6.5

-- 
Guillaume Gaudonville
guillaume.gaudonville@6wind.com


^ permalink raw reply related

* [PATCH net-next-2.6] sfc: Implement ethtool register dump operation
From: Ben Hutchings @ 2010-06-21 13:06 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-net-drivers

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 drivers/net/sfc/ethtool.c |   16 +++
 drivers/net/sfc/io.h      |    7 +
 drivers/net/sfc/nic.c     |  266 +++++++++++++++++++++++++++++++++++++++++++++
 drivers/net/sfc/nic.h     |    3 +
 4 files changed, 292 insertions(+), 0 deletions(-)

diff --git a/drivers/net/sfc/ethtool.c b/drivers/net/sfc/ethtool.c
index 22026bf..81b7f39 100644
--- a/drivers/net/sfc/ethtool.c
+++ b/drivers/net/sfc/ethtool.c
@@ -242,6 +242,20 @@ static void efx_ethtool_get_drvinfo(struct net_device *net_dev,
 	strlcpy(info->bus_info, pci_name(efx->pci_dev), sizeof(info->bus_info));
 }
 
+static int efx_ethtool_get_regs_len(struct net_device *net_dev)
+{
+	return efx_nic_get_regs_len(netdev_priv(net_dev));
+}
+
+static void efx_ethtool_get_regs(struct net_device *net_dev,
+				 struct ethtool_regs *regs, void *buf)
+{
+	struct efx_nic *efx = netdev_priv(net_dev);
+
+	regs->version = efx->type->revision;
+	efx_nic_get_regs(efx, buf);
+}
+
 /**
  * efx_fill_test - fill in an individual self-test entry
  * @test_index:		Index of the test
@@ -834,6 +848,8 @@ const struct ethtool_ops efx_ethtool_ops = {
 	.get_settings		= efx_ethtool_get_settings,
 	.set_settings		= efx_ethtool_set_settings,
 	.get_drvinfo		= efx_ethtool_get_drvinfo,
+	.get_regs_len		= efx_ethtool_get_regs_len,
+	.get_regs		= efx_ethtool_get_regs,
 	.nway_reset		= efx_ethtool_nway_reset,
 	.get_link		= efx_ethtool_get_link,
 	.get_eeprom_len		= efx_ethtool_get_eeprom_len,
diff --git a/drivers/net/sfc/io.h b/drivers/net/sfc/io.h
index b89177c..4317574 100644
--- a/drivers/net/sfc/io.h
+++ b/drivers/net/sfc/io.h
@@ -211,6 +211,13 @@ static inline void efx_writed_table(struct efx_nic *efx, efx_dword_t *value,
 	efx_writed(efx, value, reg + index * sizeof(efx_oword_t));
 }
 
+/* Read from a dword register forming part of a table */
+static inline void efx_readd_table(struct efx_nic *efx, efx_dword_t *value,
+				   unsigned int reg, unsigned int index)
+{
+	efx_readd(efx, value, reg + index * sizeof(efx_dword_t));
+}
+
 /* Page-mapped register block size */
 #define EFX_PAGE_BLOCK_SIZE 0x2000
 
diff --git a/drivers/net/sfc/nic.c b/drivers/net/sfc/nic.c
index 0ee6fd3..67235f1 100644
--- a/drivers/net/sfc/nic.c
+++ b/drivers/net/sfc/nic.c
@@ -1627,3 +1627,269 @@ void efx_nic_init_common(struct efx_nic *efx)
 		EFX_SET_OWORD_FIELD(temp, FRF_BZ_TX_FLUSH_MIN_LEN_EN, 1);
 	efx_writeo(efx, &temp, FR_AZ_TX_RESERVED);
 }
+
+/* Register dump */
+
+#define REGISTER_REVISION_A	1
+#define REGISTER_REVISION_B	2
+#define REGISTER_REVISION_C	3
+#define REGISTER_REVISION_Z	3	/* latest revision */
+
+struct efx_nic_reg {
+	u32 offset:24;
+	u32 min_revision:2, max_revision:2;
+};
+
+#define REGISTER(name, min_rev, max_rev) {				\
+	FR_ ## min_rev ## max_rev ## _ ## name,				\
+	REGISTER_REVISION_ ## min_rev, REGISTER_REVISION_ ## max_rev	\
+}
+#define REGISTER_AA(name) REGISTER(name, A, A)
+#define REGISTER_AB(name) REGISTER(name, A, B)
+#define REGISTER_AZ(name) REGISTER(name, A, Z)
+#define REGISTER_BB(name) REGISTER(name, B, B)
+#define REGISTER_BZ(name) REGISTER(name, B, Z)
+#define REGISTER_CZ(name) REGISTER(name, C, Z)
+
+static const struct efx_nic_reg efx_nic_regs[] = {
+	REGISTER_AZ(ADR_REGION),
+	REGISTER_AZ(INT_EN_KER),
+	REGISTER_BZ(INT_EN_CHAR),
+	REGISTER_AZ(INT_ADR_KER),
+	REGISTER_BZ(INT_ADR_CHAR),
+	/* INT_ACK_KER is WO */
+	/* INT_ISR0 is RC */
+	REGISTER_AZ(HW_INIT),
+	REGISTER_CZ(USR_EV_CFG),
+	REGISTER_AB(EE_SPI_HCMD),
+	REGISTER_AB(EE_SPI_HADR),
+	REGISTER_AB(EE_SPI_HDATA),
+	REGISTER_AB(EE_BASE_PAGE),
+	REGISTER_AB(EE_VPD_CFG0),
+	/* EE_VPD_SW_CNTL and EE_VPD_SW_DATA are not used */
+	/* PMBX_DBG_IADDR and PBMX_DBG_IDATA are indirect */
+	/* PCIE_CORE_INDIRECT is indirect */
+	REGISTER_AB(NIC_STAT),
+	REGISTER_AB(GPIO_CTL),
+	REGISTER_AB(GLB_CTL),
+	/* FATAL_INTR_KER and FATAL_INTR_CHAR are partly RC */
+	REGISTER_BZ(DP_CTRL),
+	REGISTER_AZ(MEM_STAT),
+	REGISTER_AZ(CS_DEBUG),
+	REGISTER_AZ(ALTERA_BUILD),
+	REGISTER_AZ(CSR_SPARE),
+	REGISTER_AB(PCIE_SD_CTL0123),
+	REGISTER_AB(PCIE_SD_CTL45),
+	REGISTER_AB(PCIE_PCS_CTL_STAT),
+	/* DEBUG_DATA_OUT is not used */
+	/* DRV_EV is WO */
+	REGISTER_AZ(EVQ_CTL),
+	REGISTER_AZ(EVQ_CNT1),
+	REGISTER_AZ(EVQ_CNT2),
+	REGISTER_AZ(BUF_TBL_CFG),
+	REGISTER_AZ(SRM_RX_DC_CFG),
+	REGISTER_AZ(SRM_TX_DC_CFG),
+	REGISTER_AZ(SRM_CFG),
+	/* BUF_TBL_UPD is WO */
+	REGISTER_AZ(SRM_UPD_EVQ),
+	REGISTER_AZ(SRAM_PARITY),
+	REGISTER_AZ(RX_CFG),
+	REGISTER_BZ(RX_FILTER_CTL),
+	/* RX_FLUSH_DESCQ is WO */
+	REGISTER_AZ(RX_DC_CFG),
+	REGISTER_AZ(RX_DC_PF_WM),
+	REGISTER_BZ(RX_RSS_TKEY),
+	/* RX_NODESC_DROP is RC */
+	REGISTER_AA(RX_SELF_RST),
+	/* RX_DEBUG, RX_PUSH_DROP are not used */
+	REGISTER_CZ(RX_RSS_IPV6_REG1),
+	REGISTER_CZ(RX_RSS_IPV6_REG2),
+	REGISTER_CZ(RX_RSS_IPV6_REG3),
+	/* TX_FLUSH_DESCQ is WO */
+	REGISTER_AZ(TX_DC_CFG),
+	REGISTER_AA(TX_CHKSM_CFG),
+	REGISTER_AZ(TX_CFG),
+	/* TX_PUSH_DROP is not used */
+	REGISTER_AZ(TX_RESERVED),
+	REGISTER_BZ(TX_PACE),
+	/* TX_PACE_DROP_QID is RC */
+	REGISTER_BB(TX_VLAN),
+	REGISTER_BZ(TX_IPFIL_PORTEN),
+	REGISTER_AB(MD_TXD),
+	REGISTER_AB(MD_RXD),
+	REGISTER_AB(MD_CS),
+	REGISTER_AB(MD_PHY_ADR),
+	REGISTER_AB(MD_ID),
+	/* MD_STAT is RC */
+	REGISTER_AB(MAC_STAT_DMA),
+	REGISTER_AB(MAC_CTRL),
+	REGISTER_BB(GEN_MODE),
+	REGISTER_AB(MAC_MC_HASH_REG0),
+	REGISTER_AB(MAC_MC_HASH_REG1),
+	REGISTER_AB(GM_CFG1),
+	REGISTER_AB(GM_CFG2),
+	/* GM_IPG and GM_HD are not used */
+	REGISTER_AB(GM_MAX_FLEN),
+	/* GM_TEST is not used */
+	REGISTER_AB(GM_ADR1),
+	REGISTER_AB(GM_ADR2),
+	REGISTER_AB(GMF_CFG0),
+	REGISTER_AB(GMF_CFG1),
+	REGISTER_AB(GMF_CFG2),
+	REGISTER_AB(GMF_CFG3),
+	REGISTER_AB(GMF_CFG4),
+	REGISTER_AB(GMF_CFG5),
+	REGISTER_BB(TX_SRC_MAC_CTL),
+	REGISTER_AB(XM_ADR_LO),
+	REGISTER_AB(XM_ADR_HI),
+	REGISTER_AB(XM_GLB_CFG),
+	REGISTER_AB(XM_TX_CFG),
+	REGISTER_AB(XM_RX_CFG),
+	REGISTER_AB(XM_MGT_INT_MASK),
+	REGISTER_AB(XM_FC),
+	REGISTER_AB(XM_PAUSE_TIME),
+	REGISTER_AB(XM_TX_PARAM),
+	REGISTER_AB(XM_RX_PARAM),
+	/* XM_MGT_INT_MSK (note no 'A') is RC */
+	REGISTER_AB(XX_PWR_RST),
+	REGISTER_AB(XX_SD_CTL),
+	REGISTER_AB(XX_TXDRV_CTL),
+	/* XX_PRBS_CTL, XX_PRBS_CHK and XX_PRBS_ERR are not used */
+	/* XX_CORE_STAT is partly RC */
+};
+
+struct efx_nic_reg_table {
+	u32 offset:24;
+	u32 min_revision:2, max_revision:2;
+	u32 step:6, rows:21;
+};
+
+#define REGISTER_TABLE_DIMENSIONS(_, offset, min_rev, max_rev, step, rows) { \
+	offset,								\
+	REGISTER_REVISION_ ## min_rev, REGISTER_REVISION_ ## max_rev,	\
+	step, rows							\
+}
+#define REGISTER_TABLE(name, min_rev, max_rev) 				\
+	REGISTER_TABLE_DIMENSIONS(					\
+		name, FR_ ## min_rev ## max_rev ## _ ## name,		\
+		min_rev, max_rev,					\
+		FR_ ## min_rev ## max_rev ## _ ## name ## _STEP,	\
+		FR_ ## min_rev ## max_rev ## _ ## name ## _ROWS)
+#define REGISTER_TABLE_AA(name) REGISTER_TABLE(name, A, A)
+#define REGISTER_TABLE_AZ(name) REGISTER_TABLE(name, A, Z)
+#define REGISTER_TABLE_BB(name) REGISTER_TABLE(name, B, B)
+#define REGISTER_TABLE_BZ(name) REGISTER_TABLE(name, B, Z)
+#define REGISTER_TABLE_BB_CZ(name)					\
+	REGISTER_TABLE_DIMENSIONS(name, FR_BZ_ ## name, B, B,		\
+				  FR_BZ_ ## name ## _STEP,		\
+				  FR_BB_ ## name ## _ROWS),		\
+	REGISTER_TABLE_DIMENSIONS(name, FR_BZ_ ## name, C, Z,		\
+				  FR_BZ_ ## name ## _STEP,		\
+				  FR_CZ_ ## name ## _ROWS)
+#define REGISTER_TABLE_CZ(name) REGISTER_TABLE(name, C, Z)
+
+static const struct efx_nic_reg_table efx_nic_reg_tables[] = {
+	/* DRIVER is not used */
+	/* EVQ_RPTR, TIMER_COMMAND, USR_EV and {RX,TX}_DESC_UPD are WO */
+	REGISTER_TABLE_BB(TX_IPFIL_TBL),
+	REGISTER_TABLE_BB(TX_SRC_MAC_TBL),
+	REGISTER_TABLE_AA(RX_DESC_PTR_TBL_KER),
+	REGISTER_TABLE_BB_CZ(RX_DESC_PTR_TBL),
+	REGISTER_TABLE_AA(TX_DESC_PTR_TBL_KER),
+	REGISTER_TABLE_BB_CZ(TX_DESC_PTR_TBL),
+	REGISTER_TABLE_AA(EVQ_PTR_TBL_KER),
+	REGISTER_TABLE_BB_CZ(EVQ_PTR_TBL),
+	/* The register buffer is allocated with slab, so we can't
+	 * reasonably read all of the buffer table (up to 8MB!).
+	 * However this driver will only use a few entries.  Reading
+	 * 1K entries allows for some expansion of queue count and
+	 * size before we need to change the version. */
+	REGISTER_TABLE_DIMENSIONS(BUF_FULL_TBL_KER, FR_AA_BUF_FULL_TBL_KER,
+				  A, A, 8, 1024),
+	REGISTER_TABLE_DIMENSIONS(BUF_FULL_TBL, FR_BZ_BUF_FULL_TBL,
+				  B, Z, 8, 1024),
+	/* RX_FILTER_TBL{0,1} is huge and not used by this driver */
+	REGISTER_TABLE_CZ(RX_MAC_FILTER_TBL0),
+	REGISTER_TABLE_BB_CZ(TIMER_TBL),
+	REGISTER_TABLE_BB_CZ(TX_PACE_TBL),
+	REGISTER_TABLE_BZ(RX_INDIRECTION_TBL),
+	/* TX_FILTER_TBL0 is huge and not used by this driver */
+	REGISTER_TABLE_CZ(TX_MAC_FILTER_TBL0),
+	REGISTER_TABLE_CZ(MC_TREG_SMEM),
+	/* MSIX_PBA_TABLE is not mapped */
+	/* SRM_DBG is not mapped (and is redundant with BUF_FLL_TBL) */
+};
+
+size_t efx_nic_get_regs_len(struct efx_nic *efx)
+{
+	const struct efx_nic_reg *reg;
+	const struct efx_nic_reg_table *table;
+	size_t len = 0;
+
+	for (reg = efx_nic_regs;
+	     reg < efx_nic_regs + ARRAY_SIZE(efx_nic_regs);
+	     reg++)
+		if (efx->type->revision >= reg->min_revision &&
+		    efx->type->revision <= reg->max_revision)
+			len += sizeof(efx_oword_t);
+
+	for (table = efx_nic_reg_tables;
+	     table < efx_nic_reg_tables + ARRAY_SIZE(efx_nic_reg_tables);
+	     table++)
+		if (efx->type->revision >= table->min_revision &&
+		    efx->type->revision <= table->max_revision)
+			len += table->rows * min_t(size_t, table->step, 16);
+
+	return len;
+}
+
+void efx_nic_get_regs(struct efx_nic *efx, void *buf)
+{
+	const struct efx_nic_reg *reg;
+	const struct efx_nic_reg_table *table;
+
+	for (reg = efx_nic_regs;
+	     reg < efx_nic_regs + ARRAY_SIZE(efx_nic_regs);
+	     reg++) {
+		if (efx->type->revision >= reg->min_revision &&
+		    efx->type->revision <= reg->max_revision) {
+			efx_reado(efx, (efx_oword_t *)buf, reg->offset);
+			buf += sizeof(efx_oword_t);
+		}
+	}
+
+	for (table = efx_nic_reg_tables;
+	     table < efx_nic_reg_tables + ARRAY_SIZE(efx_nic_reg_tables);
+	     table++) {
+		size_t size, i;
+
+		if (!(efx->type->revision >= table->min_revision &&
+		      efx->type->revision <= table->max_revision))
+			continue;
+
+		size = min_t(size_t, table->step, 16);
+
+		for (i = 0; i < table->rows; i++) {
+			switch (table->step) {
+			case 4: /* 32-bit register or SRAM */
+				efx_readd_table(efx, buf, table->offset, i);
+				break;
+			case 8: /* 64-bit SRAM */
+				efx_sram_readq(efx,
+					       efx->membase + table->offset,
+					       buf, i);
+				break;
+			case 16: /* 128-bit register */
+				efx_reado_table(efx, buf, table->offset, i);
+				break;
+			case 32: /* 128-bit register, interleaved */
+				efx_reado_table(efx, buf, table->offset, 2 * i);
+				break;
+			default:
+				WARN_ON(1);
+				return;
+			}
+			buf += size;
+		}
+	}
+}
diff --git a/drivers/net/sfc/nic.h b/drivers/net/sfc/nic.h
index 95770e1..534461f 100644
--- a/drivers/net/sfc/nic.h
+++ b/drivers/net/sfc/nic.h
@@ -222,6 +222,9 @@ extern int efx_nic_test_registers(struct efx_nic *efx,
 				  const struct efx_nic_register_test *regs,
 				  size_t n_regs);
 
+extern size_t efx_nic_get_regs_len(struct efx_nic *efx);
+extern void efx_nic_get_regs(struct efx_nic *efx, void *buf);
+
 /**************************************************************************
  *
  * Falcon MAC stats
-- 
1.6.2.5

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply related

* PATCH: uninitialized memory access in tcp_parse_options
From: Mathieu Lacage @ 2010-06-21 13:34 UTC (permalink / raw)
  To: netdev

[-- Attachment #1: Type: text/plain, Size: 1262 bytes --]

valgrind reports the following error:

==15996== Conditional jump or move depends on uninitialised value(s)
==15996==    at 0x6E63E4C: tcp_parse_options (tcp_input.c:3776)
==15996==    by 0x6E856A3: tcp_check_req (tcp_minisocks.c:532)
==15996==    by 0x6E7F0C6: tcp_v4_hnd_req (tcp_ipv4.c:1492)
==15996==    by 0x6E7F55A: tcp_v4_do_rcv (tcp_ipv4.c:1571)
==15996==    by 0x6E808C5: tcp_v4_rcv (tcp_ipv4.c:1690)
==15996==    by 0x6E2DA7B: ip_local_deliver_finish (ip_input.c:231)
==15996==    by 0x6E2DE0C: ip_local_deliver (netfilter.h:206)
==15996==    by 0x6E2E940: ip_rcv_finish (dst.h:255)
==15996==    by 0x6E2F17C: ip_rcv (netfilter.h:206)
==15996==    by 0x6D53D0E: __netif_receive_skb (dev.c:2873)
==15996==    by 0x6D5521F: process_backlog (dev.c:3305)
==15996==    by 0x6D55A20: net_rx_action (dev.c:3435)

The attached patch (generated against net-next-2.6) fixes that error by
making sure that user_mss is correctly initialized at the start of
tcp_parse_options, just like saw_tstamp is initialized at the start of
this function. To try to be coherent, this patch also removes the
redundant initialization of saw_tstamp from the caller, tcp_check_req.

hope this helps,
Mathieu
-- 
Mathieu Lacage <mathieu.lacage@sophia.inria.fr>
Tel: +33 4 9238 5056

[-- Attachment #2: tcp-options.patch --]
[-- Type: text/x-patch, Size: 855 bytes --]

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index ae3ec15..8b713ab 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3751,6 +3751,7 @@ void tcp_parse_options(struct sk_buff *skb, struct tcp_options_received *opt_rx,
 
 	ptr = (unsigned char *)(th + 1);
 	opt_rx->saw_tstamp = 0;
+	opt_rx->user_mss = 0;
 
 	while (length > 0) {
 		int opcode = *ptr++;
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 794c2e1..21e47e7 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -527,7 +527,6 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
 	__be32 flg = tcp_flag_word(th) & (TCP_FLAG_RST|TCP_FLAG_SYN|TCP_FLAG_ACK);
 	int paws_reject = 0;
 
-	tmp_opt.saw_tstamp = 0;
 	if (th->doff > (sizeof(struct tcphdr)>>2)) {
 		tcp_parse_options(skb, &tmp_opt, &hash_location, 0);
 

^ permalink raw reply related

* [PATCH] NET: MIPSsim: Fix modpost warning.
From: Ralf Baechle @ 2010-06-21 13:44 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev

$ make CONFIG_DEBUG_SECTION_MISMATCH=y
[...]
WARNING: drivers/net/built-in.o(.data+0x0): Section mismatch in reference from the variable mipsnet_driver to the function .init.text:mipsnet_probe()
The variable mipsnet_driver references
the function __init mipsnet_probe()
If the reference is valid then annotate the
variable with __init* or __refdata (see linux/init.h) or name the variable:
*_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,
[...]

Fixed by making mipsnet_probe __devinit.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

 drivers/net/mipsnet.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/mipsnet.c b/drivers/net/mipsnet.c
index 8e9704f..869f0ea 100644
--- a/drivers/net/mipsnet.c
+++ b/drivers/net/mipsnet.c
@@ -247,7 +247,7 @@ static const struct net_device_ops mipsnet_netdev_ops = {
 	.ndo_set_mac_address	= eth_mac_addr,
 };
 
-static int __init mipsnet_probe(struct platform_device *dev)
+static int __devinit mipsnet_probe(struct platform_device *dev)
 {
 	struct net_device *netdev;
 	int err;

^ permalink raw reply related

* Re: 2.6.35-rc3: Reported regressions from 2.6.34
From: Nick Bowler @ 2010-06-21 13:58 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Maciej Rutecki, Andrew Morton,
	Linus Torvalds, Kernel Testers List, Network Development,
	Linux ACPI, Linux PM List, Linux SCSI List, Linux Wireless List,
	DRI
In-Reply-To: <lc60d4toGTL.A.QzD._LpHMB@chimera>

On 00:11 Mon 21 Jun     , Rafael J. Wysocki wrote:
> If you know of any other unresolved regressions from 2.6.34, please let us
> know either and we'll add them to the list.  Also, please let us know
> if any of the entries below are invalid.

Didn't see this one in the list:

  Why is kslowd accumulating so much CPU time?
  http://thread.gmane.org/gmane.linux.kernel/996907

-- 
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)

^ permalink raw reply

* [PATCH 2/2 v2] Driver core: reduce duplicated code
From: Uwe Kleine-König @ 2010-06-21 14:11 UTC (permalink / raw)
  To: Greg KH
  Cc: Randy Dunlap, Dmitry Torokhov, Uwe Kleine-König,
	Anisse Astier, Greg Kroah-Hartman, Magnus Damm, Rafael J. Wysocki,
	Paul Mundt, Eric Miao, linux-doc, linux-kernel, netdev
In-Reply-To: <20100618073950.GA12822@pengutronix.de>

This makes the two similar functions platform_device_register_simple
and platform_device_register_data one line inline functions using a new
generic function platform_device_register_resndata.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
---
Hello,

still unsolved is the naming issue, what do you think about
platform_device_register?

I marked the new function as __init_or_module in a separate patch to
make reverting it a bit easier, still I think it should be possible to
fix the caller if a problem occurs.

I changed the semantic slightly to only call
platform_device_add_resources if data != NULL instead of size != 0.  The
idea is to support wrappers like:

	#define add_blablub(id, pdata) \
		platform_device_register_resndata(NULL, "blablub", id, \
			NULL, 0, pdata, sizeof(struct blablub_platform_data))

that don't fail if pdata=NULL.  Ditto for res.

Best regards
Uwe

changed since v1:

 - fix docbook to pick up platform_device_register_simple and
   platform_device_register_data after moving them to
   <linux/platform_device.h>
 - only add_resources and add_data if res and data are non-NULL resp.

 Documentation/DocBook/device-drivers.tmpl |    1 +
 drivers/base/platform.c                   |  104 +++++++---------------------
 include/linux/platform_device.h           |   62 ++++++++++++++++-
 3 files changed, 85 insertions(+), 82 deletions(-)

diff --git a/Documentation/DocBook/device-drivers.tmpl b/Documentation/DocBook/device-drivers.tmpl
index 1b2dd4f..ecd35e9 100644
--- a/Documentation/DocBook/device-drivers.tmpl
+++ b/Documentation/DocBook/device-drivers.tmpl
@@ -111,6 +111,7 @@ X!Edrivers/base/attribute_container.c
 <!--
 X!Edrivers/base/interface.c
 -->
+!Iinclude/linux/platform_device.h
 !Edrivers/base/platform.c
 !Edrivers/base/bus.c
      </sect1>
diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index 26eb69d..ffcfd73 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -344,108 +344,56 @@ void platform_device_unregister(struct platform_device *pdev)
 EXPORT_SYMBOL_GPL(platform_device_unregister);
 
 /**
- * platform_device_register_simple - add a platform-level device and its resources
- * @name: base name of the device we're adding
- * @id: instance id
- * @res: set of resources that needs to be allocated for the device
- * @num: number of resources
- *
- * This function creates a simple platform device that requires minimal
- * resource and memory management. Canned release function freeing memory
- * allocated for the device allows drivers using such devices to be
- * unloaded without waiting for the last reference to the device to be
- * dropped.
+ * platform_device_register_resndata - add a platform-level device with
+ * resources and platform-specific data
  *
- * This interface is primarily intended for use with legacy drivers which
- * probe hardware directly.  Because such drivers create sysfs device nodes
- * themselves, rather than letting system infrastructure handle such device
- * enumeration tasks, they don't fully conform to the Linux driver model.
- * In particular, when such drivers are built as modules, they can't be
- * "hotplugged".
- *
- * Returns &struct platform_device pointer on success, or ERR_PTR() on error.
- */
-struct platform_device *platform_device_register_simple(const char *name,
-							int id,
-							const struct resource *res,
-							unsigned int num)
-{
-	struct platform_device *pdev;
-	int retval;
-
-	pdev = platform_device_alloc(name, id);
-	if (!pdev) {
-		retval = -ENOMEM;
-		goto error;
-	}
-
-	if (num) {
-		retval = platform_device_add_resources(pdev, res, num);
-		if (retval)
-			goto error;
-	}
-
-	retval = platform_device_add(pdev);
-	if (retval)
-		goto error;
-
-	return pdev;
-
-error:
-	platform_device_put(pdev);
-	return ERR_PTR(retval);
-}
-EXPORT_SYMBOL_GPL(platform_device_register_simple);
-
-/**
- * platform_device_register_data - add a platform-level device with platform-specific data
  * @parent: parent device for the device we're adding
  * @name: base name of the device we're adding
  * @id: instance id
+ * @res: set of resources that needs to be allocated for the device
+ * @num: number of resources
  * @data: platform specific data for this platform device
  * @size: size of platform specific data
  *
- * This function creates a simple platform device that requires minimal
- * resource and memory management. Canned release function freeing memory
- * allocated for the device allows drivers using such devices to be
- * unloaded without waiting for the last reference to the device to be
- * dropped.
- *
  * Returns &struct platform_device pointer on success, or ERR_PTR() on error.
  */
-struct platform_device *platform_device_register_data(
+struct platform_device *platform_device_register_resndata(
 		struct device *parent,
 		const char *name, int id,
+		const struct resource *res, unsigned int num,
 		const void *data, size_t size)
 {
+	int ret = -ENOMEM;
 	struct platform_device *pdev;
-	int retval;
 
 	pdev = platform_device_alloc(name, id);
-	if (!pdev) {
-		retval = -ENOMEM;
-		goto error;
-	}
+	if (!pdev)
+		goto err;
 
 	pdev->dev.parent = parent;
 
-	if (size) {
-		retval = platform_device_add_data(pdev, data, size);
-		if (retval)
-			goto error;
+	if (res) {
+		ret = platform_device_add_resources(pdev, res, num);
+		if (ret)
+			goto err;
 	}
 
-	retval = platform_device_add(pdev);
-	if (retval)
-		goto error;
+	if (data) {
+		ret = platform_device_add_data(pdev, data, size);
+		if (ret)
+			goto err;
+	}
 
-	return pdev;
+	ret = platform_device_add(pdev);
+	if (ret) {
+err:
+		platform_device_put(pdev);
+		return ERR_PTR(ret);
+	}
 
-error:
-	platform_device_put(pdev);
-	return ERR_PTR(retval);
+	return pdev;
 }
-EXPORT_SYMBOL_GPL(platform_device_register_data);
+EXPORT_SYMBOL_GPL(platform_device_register_resndata);
 
 static int platform_drv_probe(struct device *_dev)
 {
diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h
index 5417944..d7ecad0 100644
--- a/include/linux/platform_device.h
+++ b/include/linux/platform_device.h
@@ -43,10 +43,64 @@ extern struct resource *platform_get_resource_byname(struct platform_device *, u
 extern int platform_get_irq_byname(struct platform_device *, const char *);
 extern int platform_add_devices(struct platform_device **, int);
 
-extern struct platform_device *platform_device_register_simple(const char *, int id,
-					const struct resource *, unsigned int);
-extern struct platform_device *platform_device_register_data(struct device *,
-		const char *, int, const void *, size_t);
+extern struct platform_device *platform_device_register_resndata(
+		struct device *parent, const char *name, int id,
+		const struct resource *res, unsigned int num,
+		const void *data, size_t size);
+
+/**
+ * platform_device_register_simple - add a platform-level device and its resources
+ * @name: base name of the device we're adding
+ * @id: instance id
+ * @res: set of resources that needs to be allocated for the device
+ * @num: number of resources
+ *
+ * This function creates a simple platform device that requires minimal
+ * resource and memory management. Canned release function freeing memory
+ * allocated for the device allows drivers using such devices to be
+ * unloaded without waiting for the last reference to the device to be
+ * dropped.
+ *
+ * This interface is primarily intended for use with legacy drivers which
+ * probe hardware directly.  Because such drivers create sysfs device nodes
+ * themselves, rather than letting system infrastructure handle such device
+ * enumeration tasks, they don't fully conform to the Linux driver model.
+ * In particular, when such drivers are built as modules, they can't be
+ * "hotplugged".
+ *
+ * Returns &struct platform_device pointer on success, or ERR_PTR() on error.
+ */
+static inline struct platform_device *platform_device_register_simple(
+		const char *name, int id,
+		const struct resource *res, unsigned int num)
+{
+	return platform_device_register_resndata(NULL, name, id,
+			res, num, NULL, 0);
+}
+
+/**
+ * platform_device_register_data - add a platform-level device with platform-specific data
+ * @parent: parent device for the device we're adding
+ * @name: base name of the device we're adding
+ * @id: instance id
+ * @data: platform specific data for this platform device
+ * @size: size of platform specific data
+ *
+ * This function creates a simple platform device that requires minimal
+ * resource and memory management. Canned release function freeing memory
+ * allocated for the device allows drivers using such devices to be
+ * unloaded without waiting for the last reference to the device to be
+ * dropped.
+ *
+ * Returns &struct platform_device pointer on success, or ERR_PTR() on error.
+ */
+static inline struct platform_device *platform_device_register_data(
+		struct device *parent, const char *name, int id,
+		const void *data, size_t size)
+{
+	return platform_device_register_resndata(parent, name, id,
+			NULL, 0, data, size);
+}
 
 extern struct platform_device *platform_device_alloc(const char *name, int id);
 extern int platform_device_add_resources(struct platform_device *pdev,
-- 
1.7.1


^ permalink raw reply related

* Re: 2.6.34 + IPv6: Oops?
From: Andreas Klauer @ 2010-06-21 15:30 UTC (permalink / raw)
  To: netdev
In-Reply-To: <20100619175352.GA8482@EIS>

Hi,

no one replied so far - am I reporting this to the wrong place?
Please advise.

I've done some more testing; I can reproduce the issue now on different 
hardware (my desktop at home), with a clean environment (freshly boot- 
strapped Debian Lenny). Which should make things more interesting. 

The issue seems to be compiler related. It occurs if the kernel is 
compiled with Debian Lenny "gcc (Debian 4.3.2-1.1) 4.3.2". It does 
not occur (or at least I can't reproduce it) if the kernel was 
compiled with Gentoo "gcc (Gentoo 4.4.4 p1.0) 4.4.4".

Screenshot:
http://www.metamorpher.de/kernel/panic2.jpg

Steps to reproduce:
http://www.metamorpher.de/kernel/steps-to-reproduce.txt

kernel config and lspci also available here:
http://www.metamorpher.de/kernel/

Regards
Andreas Klauer

^ permalink raw reply

* Re: 2.6.34 + IPv6: Oops?
From: Stephen Hemminger @ 2010-06-21 16:19 UTC (permalink / raw)
  To: Andreas Klauer; +Cc: netdev
In-Reply-To: <20100621153018.GA2433@EIS>

On Mon, 21 Jun 2010 17:30:18 +0200
Andreas Klauer <Andreas.Klauer@metamorpher.de> wrote:

> Hi,
> 
> no one replied so far - am I reporting this to the wrong place?
> Please advise.
> 
> I've done some more testing; I can reproduce the issue now on different 
> hardware (my desktop at home), with a clean environment (freshly boot- 
> strapped Debian Lenny). Which should make things more interesting. 
> 
> The issue seems to be compiler related. It occurs if the kernel is 
> compiled with Debian Lenny "gcc (Debian 4.3.2-1.1) 4.3.2". It does 
> not occur (or at least I can't reproduce it) if the kernel was 
> compiled with Gentoo "gcc (Gentoo 4.4.4 p1.0) 4.4.4".
> 
> Screenshot:
> http://www.metamorpher.de/kernel/panic2.jpg
> 
> Steps to reproduce:
> http://www.metamorpher.de/kernel/steps-to-reproduce.txt
> 
> kernel config and lspci also available here:
> http://www.metamorpher.de/kernel/
> 
> Regards
> Andreas Klauer
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Unfortunately, Greg seems to be slow in getting 2.6.34.1 out.
These patches are related:


http://marc.info/?l=linux-netdev&m=127472600330413
http://marc.info/?l=linux-netdev&m=127472599530407

-- 

^ permalink raw reply

* Re: 2.6.34 + IPv6: Oops?
From: Hagen Paul Pfeifer @ 2010-06-21 16:25 UTC (permalink / raw)
  To: Andreas Klauer; +Cc: netdev
In-Reply-To: <20100621153018.GA2433@EIS>

* Andreas Klauer | 2010-06-21 17:30:18 [+0200]:

>Hi,
>
>no one replied so far - am I reporting this to the wrong place?
>Please advise.

It is the right place, some guys are in holiday but probably you can git
bisect the problem?

HGN

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox