Netdev List


* [PATCH 09/11] sungem: remove unnecessary setting of skb->dev
From: Jon Mason @ 2012-07-10  0:09 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev
In-Reply-To: <1341878975-10577-1-git-send-email-jdmason@kudzu.us>

skb->dev is being unnecessarily set by the driver's skb alloc routine (which is
called in init and during rx).  It is already being set to the proper value when
eth_type_trans is called on packet receive, and the skb->dev is not referenced
anywhere else in the code.

Signed-off-by: Jon Mason <jdmason@kudzu.us>
---
 drivers/net/ethernet/sun/sungem.c |    1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/sun/sungem.c b/drivers/net/ethernet/sun/sungem.c
index 3cf4ab7..9ae12d0 100644
--- a/drivers/net/ethernet/sun/sungem.c
+++ b/drivers/net/ethernet/sun/sungem.c
@@ -752,7 +752,6 @@ static __inline__ struct sk_buff *gem_alloc_skb(struct net_device *dev, int size
 	if (likely(skb)) {
 		unsigned long offset = ALIGNED_RX_SKB_ADDR(skb->data);
 		skb_reserve(skb, offset);
-		skb->dev = dev;
 	}
 	return skb;
 }
-- 
1.7.9.5

^ permalink raw reply related
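
The reasoning in the commit message above can be illustrated with a small user-space mock. This is only a sketch: the `mock_` types and functions are hypothetical stand-ins rather than kernel code, and the sole behaviour copied from the description is that the eth_type_trans step stores the receiving device in skb->dev, so an earlier assignment at allocation time has no lasting effect.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel types. */
struct mock_net_device {
	const char *name;
};

struct mock_sk_buff {
	struct mock_net_device *dev;
	unsigned short protocol;
};

/*
 * Mock of the one side effect the commit message relies on: the device
 * is stored in skb->dev when the protocol is worked out. The return
 * value is a fixed placeholder, not a parsed Ethernet header.
 */
static unsigned short mock_eth_type_trans(struct mock_sk_buff *skb,
					  struct mock_net_device *dev)
{
	skb->dev = dev;
	return 0x0800;	/* placeholder: pretend the frame is IPv4 */
}

/* Receive path without a "skb->dev = dev" at allocation time. */
static void mock_rx(struct mock_sk_buff *skb, struct mock_net_device *dev)
{
	skb->dev = NULL;	/* state of a buffer nobody has tagged yet */
	skb->protocol = mock_eth_type_trans(skb, dev);
}
```

After mock_rx() returns, the buffer points at the receiving device even though the allocation path never set it, which is the property the patch depends on.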

* [PATCH 10/11] sunhme: remove unnecessary setting of skb->dev
From: Jon Mason @ 2012-07-10  0:09 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev
In-Reply-To: <1341878975-10577-1-git-send-email-jdmason@kudzu.us>

skb->dev is being unnecessarily set during ring init and skb alloc in rx.  It is
already being set to the proper value when eth_type_trans is called on packet
receive, and the skb->dev is not referenced anywhere else in the code.

Signed-off-by: Jon Mason <jdmason@kudzu.us>
---
 drivers/net/ethernet/sun/sunhme.c |    3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/net/ethernet/sun/sunhme.c b/drivers/net/ethernet/sun/sunhme.c
index dfc00c4..73f341b 100644
--- a/drivers/net/ethernet/sun/sunhme.c
+++ b/drivers/net/ethernet/sun/sunhme.c
@@ -1249,7 +1249,6 @@ static void happy_meal_clean_rings(struct happy_meal *hp)
 static void happy_meal_init_rings(struct happy_meal *hp)
 {
 	struct hmeal_init_block *hb = hp->happy_block;
-	struct net_device *dev = hp->dev;
 	int i;
 
 	HMD(("happy_meal_init_rings: counters to zero, "));
@@ -1270,7 +1269,6 @@ static void happy_meal_init_rings(struct happy_meal *hp)
 			continue;
 		}
 		hp->rx_skbs[i] = skb;
-		skb->dev = dev;
 
 		/* Because we reserve afterwards. */
 		skb_put(skb, (ETH_FRAME_LEN + RX_OFFSET + 4));
@@ -2031,7 +2029,6 @@ static void happy_meal_rx(struct happy_meal *hp, struct net_device *dev)
 			}
 			dma_unmap_single(hp->dma_dev, dma_addr, RX_BUF_ALLOC_SIZE, DMA_FROM_DEVICE);
 			hp->rx_skbs[elem] = new_skb;
-			new_skb->dev = dev;
 			skb_put(new_skb, (ETH_FRAME_LEN + RX_OFFSET + 4));
 			hme_write_rxd(hp, this,
 				      (RXFLAG_OWN|((RX_BUF_ALLOC_SIZE-RX_OFFSET)<<16)),
-- 
1.7.9.5

^ permalink raw reply related

* [PATCH 11/11] ll_temac: remove unnecessary setting of skb->dev
From: Jon Mason @ 2012-07-10  0:09 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev
In-Reply-To: <1341878975-10577-1-git-send-email-jdmason@kudzu.us>

skb->dev is being unnecessarily set by the driver on packet receive.
eth_type_trans already sets skb->dev to the proper value and it is not
referenced anywhere else in the driver, thus making its setting unnecessary.

Signed-off-by: Jon Mason <jdmason@kudzu.us>
---
 drivers/net/ethernet/xilinx/ll_temac_main.c |    1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/xilinx/ll_temac_main.c b/drivers/net/ethernet/xilinx/ll_temac_main.c
index 1eaf712..705146d 100644
--- a/drivers/net/ethernet/xilinx/ll_temac_main.c
+++ b/drivers/net/ethernet/xilinx/ll_temac_main.c
@@ -768,7 +768,6 @@ static void ll_temac_recv(struct net_device *ndev)
 				 DMA_FROM_DEVICE);
 
 		skb_put(skb, length);
-		skb->dev = ndev;
 		skb->protocol = eth_type_trans(skb, ndev);
 		skb_checksum_none_assert(skb);
 
-- 
1.7.9.5

^ permalink raw reply related

* [PATCH 0/4] Calxeda xgmac fixes and performance improvements
From: Rob Herring @ 2012-07-10  0:16 UTC (permalink / raw)
  To: netdev, linux-kernel; +Cc: David S. Miller, Rob Herring

From: Rob Herring <rob.herring@calxeda.com>

A few fixes and performance improvements for the Calxeda xgmac driver for
3.6. It would be nice to get the 2 fixes into 3.5, but since it is a bit
late in the cycle they can wait.

Rob

Rob Herring (4):
  net: calxedaxgmac: fix net timeout recovery
  net: calxedaxgmac: fix hang on rx refill
  net: calxedaxgmac: set outstanding AXI bus transactions to 8
  net: calxedaxgmac: enable rx cut-thru mode

 drivers/net/ethernet/calxeda/xgmac.c |   35 +++++++++++++++++-----------------
 1 file changed, 17 insertions(+), 18 deletions(-)

-- 
1.7.9.5

^ permalink raw reply

* [PATCH 1/4] net: calxedaxgmac: fix net timeout recovery
From: Rob Herring @ 2012-07-10  0:16 UTC (permalink / raw)
  To: netdev, linux-kernel; +Cc: David S. Miller, Rob Herring
In-Reply-To: <1341879370-23385-1-git-send-email-robherring2@gmail.com>

From: Rob Herring <rob.herring@calxeda.com>

Fix net tx watchdog timeout recovery. The descriptor ring was reset,
but the DMA engine was not reset to the beginning of the ring.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
---
 drivers/net/ethernet/calxeda/xgmac.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/calxeda/xgmac.c b/drivers/net/ethernet/calxeda/xgmac.c
index 11f667f..c4fd2e3 100644
--- a/drivers/net/ethernet/calxeda/xgmac.c
+++ b/drivers/net/ethernet/calxeda/xgmac.c
@@ -933,6 +933,7 @@ static void xgmac_tx_err(struct xgmac_priv *priv)
 	desc_init_tx_desc(priv->dma_tx, DMA_TX_RING_SZ);
 	priv->tx_tail = 0;
 	priv->tx_head = 0;
+	writel(priv->dma_tx_phy, priv->base + XGMAC_DMA_TX_BASE_ADDR);
 	writel(reg | DMA_CONTROL_ST, priv->base + XGMAC_DMA_CONTROL);

 	writel(DMA_STATUS_TU | DMA_STATUS_TPS | DMA_STATUS_NIS | DMA_STATUS_AIS,
-- 
1.7.9.5

^ permalink raw reply related

* [PATCH 2/4] net: calxedaxgmac: fix hang on rx refill
From: Rob Herring @ 2012-07-10  0:16 UTC (permalink / raw)
  To: netdev, linux-kernel; +Cc: David S. Miller, Rob Herring
In-Reply-To: <1341879370-23385-1-git-send-email-robherring2@gmail.com>

From: Rob Herring <rob.herring@calxeda.com>

Fix intermittent hangs in xgmac_rx_refill. If a ring buffer entry already
had an skb allocated, then xgmac_rx_refill would get stuck in a loop. This
can happen on an rx error when we just leave the skb allocated to the entry.

[ 7884.510000] INFO: rcu_preempt detected stall on CPU 0 (t=727315 jiffies)
[ 7884.510000] [<c0010a59>] (unwind_backtrace+0x1/0x98) from [<c006fd93>] (__rcu_pending+0x11b/0x2c4)
[ 7884.510000] [<c006fd93>] (__rcu_pending+0x11b/0x2c4) from [<c0070b95>] (rcu_check_callbacks+0xed/0x1a8)
[ 7884.510000] [<c0070b95>] (rcu_check_callbacks+0xed/0x1a8) from [<c0036abb>] (update_process_times+0x2b/0x48)
[ 7884.510000] [<c0036abb>] (update_process_times+0x2b/0x48) from [<c004e8fd>] (tick_sched_timer+0x51/0x94)
[ 7884.510000] [<c004e8fd>] (tick_sched_timer+0x51/0x94) from [<c0045527>] (__run_hrtimer+0x4f/0x1e8)
[ 7884.510000] [<c0045527>] (__run_hrtimer+0x4f/0x1e8) from [<c0046003>] (hrtimer_interrupt+0xd7/0x1e4)
[ 7884.510000] [<c0046003>] (hrtimer_interrupt+0xd7/0x1e4) from [<c00101d3>] (twd_handler+0x17/0x24)
[ 7884.510000] [<c00101d3>] (twd_handler+0x17/0x24) from [<c006be39>] (handle_percpu_devid_irq+0x59/0x114)
[ 7884.510000] [<c006be39>] (handle_percpu_devid_irq+0x59/0x114) from [<c0069aab>] (generic_handle_irq+0x17/0x2c)
[ 7884.510000] [<c0069aab>] (generic_handle_irq+0x17/0x2c) from [<c000cc8d>] (handle_IRQ+0x35/0x7c)
[ 7884.510000] [<c000cc8d>] (handle_IRQ+0x35/0x7c) from [<c033b153>] (__irq_svc+0x33/0xb8)
[ 7884.510000] [<c033b153>] (__irq_svc+0x33/0xb8) from [<c0244b06>] (xgmac_rx_refill+0x3a/0x140)
[ 7884.510000] [<c0244b06>] (xgmac_rx_refill+0x3a/0x140) from [<c02458ed>] (xgmac_poll+0x265/0x3bc)
[ 7884.510000] [<c02458ed>] (xgmac_poll+0x265/0x3bc) from [<c029fcbf>] (net_rx_action+0xc3/0x200)
[ 7884.510000] [<c029fcbf>] (net_rx_action+0xc3/0x200) from [<c0030cab>] (__do_softirq+0xa3/0x1bc)

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
---
 drivers/net/ethernet/calxeda/xgmac.c |   27 ++++++++++++---------------
 1 file changed, 12 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/calxeda/xgmac.c b/drivers/net/ethernet/calxeda/xgmac.c
index c4fd2e3..3ca1d79 100644
--- a/drivers/net/ethernet/calxeda/xgmac.c
+++ b/drivers/net/ethernet/calxeda/xgmac.c
@@ -671,26 +671,23 @@ static void xgmac_rx_refill(struct xgmac_priv *priv)
 
 		p = priv->dma_rx + entry;
 
-		if (priv->rx_skbuff[entry] != NULL)
-			continue;
-
-		skb = __skb_dequeue(&priv->rx_recycle);
-		if (skb == NULL)
-			skb = netdev_alloc_skb(priv->dev, priv->dma_buf_sz);
-		if (unlikely(skb == NULL))
-			break;
-
-		priv->rx_skbuff[entry] = skb;
-		paddr = dma_map_single(priv->device, skb->data,
-					 priv->dma_buf_sz, DMA_FROM_DEVICE);
-		desc_set_buf_addr(p, paddr, priv->dma_buf_sz);
+		if (priv->rx_skbuff[entry] == NULL) {
+			skb = __skb_dequeue(&priv->rx_recycle);
+			if (skb == NULL)
+				skb = netdev_alloc_skb(priv->dev, priv->dma_buf_sz);
+			if (unlikely(skb == NULL))
+				break;
+
+			priv->rx_skbuff[entry] = skb;
+			paddr = dma_map_single(priv->device, skb->data,
+					       priv->dma_buf_sz, DMA_FROM_DEVICE);
+			desc_set_buf_addr(p, paddr, priv->dma_buf_sz);
+		}
 
 		netdev_dbg(priv->dev, "rx ring: head %d, tail %d\n",
 			priv->rx_head, priv->rx_tail);
 
 		priv->rx_head = dma_ring_incr(priv->rx_head, DMA_RX_RING_SZ);
-		/* Ensure descriptor is in memory before handing to h/w */
-		wmb();
 		desc_set_rx_owner(p);
 	}
 }
-- 
1.7.9.5

^ permalink raw reply related
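
The hang described above comes from a `continue` that skips the head increment, so the loop keeps revisiting the same occupied entry. Below is a minimal user-space sketch of the fixed control flow. The ring size, the `mock_ring` structure and the helper names are assumptions made for illustration; only the shape of the loop body (allocate if empty, then always advance head) follows the patch.

```c
#include <stdlib.h>

#define RING_SZ 8	/* hypothetical ring size, must be a power of two */

struct mock_ring {
	void *buf[RING_SZ];	/* stand-in for priv->rx_skbuff[] */
	unsigned int head;	/* next entry to hand to the hardware */
	unsigned int tail;	/* next entry the driver will process */
};

/* Free entries between head and tail, keeping one slot unused. */
static unsigned int ring_space(const struct mock_ring *r)
{
	return (r->tail - r->head - 1) & (RING_SZ - 1);
}

/*
 * Fixed refill: an entry that still holds a buffer is not skipped with
 * "continue"; it only bypasses the allocation and still advances head.
 * Returns the number of entries handed back to the hardware.
 */
static unsigned int ring_refill(struct mock_ring *r)
{
	unsigned int refilled = 0;

	while (ring_space(r) > 0) {
		unsigned int entry = r->head;

		if (r->buf[entry] == NULL) {
			r->buf[entry] = malloc(64);
			if (r->buf[entry] == NULL)
				break;	/* out of memory: retry later */
		}

		r->head = (r->head + 1) & (RING_SZ - 1);
		refilled++;
	}
	return refilled;
}
```

With the pre-patch structure, an occupied entry would leave head unchanged, ring_space() would never shrink, and the loop would not terminate.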

* [PATCH 4/4] net: calxedaxgmac: enable rx cut-thru mode
From: Rob Herring @ 2012-07-10  0:16 UTC (permalink / raw)
  To: netdev, linux-kernel; +Cc: David S. Miller, Rob Herring
In-Reply-To: <1341879370-23385-1-git-send-email-robherring2@gmail.com>

From: Rob Herring <rob.herring@calxeda.com>

Enabling RX cut-thru mode yields better performance as received frames
start getting written to memory before a whole frame is received.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
---
 drivers/net/ethernet/calxeda/xgmac.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/calxeda/xgmac.c b/drivers/net/ethernet/calxeda/xgmac.c
index abb8f40..2b4b4f5 100644
--- a/drivers/net/ethernet/calxeda/xgmac.c
+++ b/drivers/net/ethernet/calxeda/xgmac.c
@@ -264,7 +264,7 @@
 #define XGMAC_OMR_FEF		0x00000080	/* Forward Error Frames */
 #define XGMAC_OMR_DT		0x00000040	/* Drop TCP/IP csum Errors */
 #define XGMAC_OMR_RSF		0x00000020	/* RX FIFO Store and Forward */
-#define XGMAC_OMR_RTC		0x00000010	/* RX Threshhold Ctrl */
+#define XGMAC_OMR_RTC_256	0x00000018	/* RX Threshhold Ctrl */
 #define XGMAC_OMR_RTC_MASK	0x00000018	/* RX Threshhold Ctrl MASK */
 
 /* XGMAC HW Features Register */
@@ -982,7 +982,8 @@ static int xgmac_hw_init(struct net_device *dev)
 	writel(value, ioaddr + XGMAC_DMA_CONTROL);
 
 	/* Set the HW DMA mode and the COE */
-	writel(XGMAC_OMR_TSF | XGMAC_OMR_RSF | XGMAC_OMR_RFD | XGMAC_OMR_RFA,
+	writel(XGMAC_OMR_TSF | XGMAC_OMR_RFD | XGMAC_OMR_RFA |
+		XGMAC_OMR_RTC_256,
 		ioaddr + XGMAC_OMR);
 
 	/* Reset the MMC counters */
-- 
1.7.9.5

^ permalink raw reply related
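
As a rough illustration of what the hunk above changes in the operation-mode value: the store-and-forward bit is dropped and the RX threshold field is programmed instead. The three constants below are copied from the patch; the two helper functions are hypothetical and not part of the driver, which writes the whole register value in one go.

```c
#include <stdint.h>

/* Bit definitions copied from the patch. */
#define XGMAC_OMR_RSF		0x00000020u	/* RX FIFO Store and Forward */
#define XGMAC_OMR_RTC_256	0x00000018u	/* RX threshold value used by the patch */
#define XGMAC_OMR_RTC_MASK	0x00000018u	/* RX threshold field mask */

/*
 * Hypothetical helper: switch an operation-mode value from
 * store-and-forward to cut-through receive by clearing RSF and then
 * programming the threshold field. All other bits are preserved.
 */
static uint32_t omr_enable_rx_cut_thru(uint32_t omr)
{
	omr &= ~XGMAC_OMR_RSF;
	omr &= ~XGMAC_OMR_RTC_MASK;
	omr |= XGMAC_OMR_RTC_256;
	return omr;
}

/* Non-zero when the value leaves the RX FIFO in store-and-forward mode. */
static int omr_rx_store_and_forward(uint32_t omr)
{
	return (omr & XGMAC_OMR_RSF) != 0;
}
```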

* [PATCH 3/4] net: calxedaxgmac: set outstanding AXI bus transactions to 8
From: Rob Herring @ 2012-07-10  0:16 UTC (permalink / raw)
  To: netdev, linux-kernel; +Cc: David S. Miller, Rob Herring
In-Reply-To: <1341879370-23385-1-git-send-email-robherring2@gmail.com>

From: Rob Herring <rob.herring@calxeda.com>

Increase the number of outstanding read and write AXI transactions from 1
to 8 for better performance.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
---
 drivers/net/ethernet/calxeda/xgmac.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/calxeda/xgmac.c b/drivers/net/ethernet/calxeda/xgmac.c
index 3ca1d79..abb8f40 100644
--- a/drivers/net/ethernet/calxeda/xgmac.c
+++ b/drivers/net/ethernet/calxeda/xgmac.c
@@ -970,7 +970,7 @@ static int xgmac_hw_init(struct net_device *dev)
 	writel(DMA_INTR_DEFAULT_MASK, ioaddr + XGMAC_DMA_INTR_ENA);

 	/* XGMAC requires AXI bus init. This is a 'magic number' for now */
-	writel(0x000100E, ioaddr + XGMAC_DMA_AXI_BUS);
+	writel(0x0077000E, ioaddr + XGMAC_DMA_AXI_BUS);

 	ctrl |= XGMAC_CONTROL_DDIC | XGMAC_CONTROL_JE | XGMAC_CONTROL_ACS |
 		XGMAC_CONTROL_CAR;
-- 
1.7.9.5

^ permalink raw reply related
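
The patch calls the register value a 'magic number' and does not document its fields. One reading that is consistent with the commit message ("from 1 to 8") is sketched below. The field positions and the "limit minus one" encoding are assumptions made purely for illustration; they are not taken from the patch or from a datasheet, and the sketch does not say which field is the read limit and which is the write limit.

```c
#include <stdint.h>

/*
 * ASSUMED layout (not from the patch): two 4-bit fields at bits 16-19
 * and 20-23, each holding "outstanding transaction limit minus one".
 */
#define AXI_LIMIT_A_SHIFT	16
#define AXI_LIMIT_B_SHIFT	20
#define AXI_LIMIT_MASK		0xfu

/* Decode the first assumed limit field. */
static unsigned int axi_limit_a(uint32_t reg)
{
	return ((reg >> AXI_LIMIT_A_SHIFT) & AXI_LIMIT_MASK) + 1;
}

/* Decode the second assumed limit field. */
static unsigned int axi_limit_b(uint32_t reg)
{
	return ((reg >> AXI_LIMIT_B_SHIFT) & AXI_LIMIT_MASK) + 1;
}
```

Under that assumption the old value 0x000100E decodes to 1 and 1, and the new value 0x0077000E decodes to 8 and 8.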

* Re: [PATCH v2] fail dentry revalidation after namespace change
From: Eric W. Biederman @ 2012-07-10  0:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Glauber Costa, linux-kernel, netdev, Greg Thelen, Serge Hallyn,
	Tejun Heo, Greg Kroah-Hartman
In-Reply-To: <20120709161336.0ec23592.akpm@linux-foundation.org>

Andrew Morton <akpm@linux-foundation.org> writes:

>>  {
>>  	struct sysfs_dirent *sd;
>>  	int is_dir;
>> +	int type;
>>  
>>  	if (nd->flags & LOOKUP_RCU)
>>  		return -ECHILD;
>> @@ -326,6 +327,13 @@ static int sysfs_dentry_revalidate(struct dentry *dentry, struct nameidata *nd)
>>  	if (strcmp(dentry->d_name.name, sd->s_name) != 0)
>>  		goto out_bad;
>>  
>> +	/* The sysfs dirent has been moved to a different namespace */
>> +	type = KOBJ_NS_TYPE_NONE;
>> +	if (sd->s_parent)
>> +		type = sysfs_ns_type(sd->s_parent);
>> +	if (type && (sysfs_info(dentry->d_sb)->ns[type] != sd->s_ns))
>
> eww, the code is assuming that KOBJ_NS_TYPE_NONE has a value of zero. 
> Don't do that; it smells bad.

Gag.  An incomplete change in idiom.

KOBJ_NS_TYPE_NONE is explicitly defined as 0 so that it can be used
this way, and everywhere else in fs/sysfs/dir.c uses this idiom.

Furthermore your change below takes one line of readable code and turns
it into something inappropriate to talk about in polite company.

If you want the code to be perfect type should be defined as
"enum kobj_ns_type type" instead of "int type".

Of course the truly perfect solution is to rework the sysfs
code in a manner similar to proc, with magic internal symlinks
and multiple parallel trees for the different namespaces.
For the users of sysfs semantically there would be no changes
but in the implementation there would be many fewer special cases
for namespaces.   The only special case would be reduced to
the internal sysfs symlink that lookup would have to know
about.

> @@ -329,10 +329,12 @@ static int sysfs_dentry_revalidate(struc
>  
>  	/* The sysfs dirent has been moved to a different namespace */
>  	type = KOBJ_NS_TYPE_NONE;
> -	if (sd->s_parent)
> +	if (sd->s_parent) {
>  		type = sysfs_ns_type(sd->s_parent);
> -	if (type && (sysfs_info(dentry->d_sb)->ns[type] != sd->s_ns))
> -		goto out_bad;
> +		if (type != KOBJ_NS_TYPE_NONE &&
> +				sysfs_info(dentry->d_sb)->ns[type] != sd->s_ns)
> +			goto out_bad;
> +	}

Pray tell in what parallel universe is that monstrosity above more
readable than the line it replaces?

Eric

^ permalink raw reply
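
For readers following the disagreement, the two spellings of the check can be put side by side in a user-space sketch. The enum is modelled on the kernel's, with NONE pinned to 0 as stated in the message above; `mock_sb_info` and both helper functions are hypothetical stand-ins for the sysfs structures. The sketch only shows that the two forms compute the same result while NONE is 0; it takes no side on which reads better.

```c
#include <stddef.h>

/* Modelled on the kernel enum; NONE is pinned to 0 on purpose. */
enum kobj_ns_type {
	KOBJ_NS_TYPE_NONE = 0,
	KOBJ_NS_TYPE_NET,
	KOBJ_NS_TYPES
};

/* Hypothetical stand-in for sysfs_info(sb)->ns[]. */
struct mock_sb_info {
	const void *ns[KOBJ_NS_TYPES];
};

/* Implicit form: relies on KOBJ_NS_TYPE_NONE being zero. */
static int ns_changed_implicit(const struct mock_sb_info *info,
			       enum kobj_ns_type type, const void *s_ns)
{
	return type && info->ns[type] != s_ns;
}

/* Explicit form: compares against the enumerator by name. */
static int ns_changed_explicit(const struct mock_sb_info *info,
			       enum kobj_ns_type type, const void *s_ns)
{
	return type != KOBJ_NS_TYPE_NONE && info->ns[type] != s_ns;
}
```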

* Re: [PATCH v2] fail dentry revalidation after namespace change
From: Andrew Morton @ 2012-07-10  0:47 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Glauber Costa, linux-kernel, netdev, Greg Thelen, Serge Hallyn,
	Tejun Heo, Greg Kroah-Hartman
In-Reply-To: <87txxgxxs7.fsf@xmission.com>

On Mon, 09 Jul 2012 17:30:48 -0700 ebiederm@xmission.com (Eric W. Biederman) wrote:

> Andrew Morton <akpm@linux-foundation.org> writes:
> 
> >>  {
> >>  	struct sysfs_dirent *sd;
> >>  	int is_dir;
> >> +	int type;
> >>  
> >>  	if (nd->flags & LOOKUP_RCU)
> >>  		return -ECHILD;
> >> @@ -326,6 +327,13 @@ static int sysfs_dentry_revalidate(struct dentry *dentry, struct nameidata *nd)
> >>  	if (strcmp(dentry->d_name.name, sd->s_name) != 0)
> >>  		goto out_bad;
> >>  
> >> +	/* The sysfs dirent has been moved to a different namespace */
> >> +	type = KOBJ_NS_TYPE_NONE;
> >> +	if (sd->s_parent)
> >> +		type = sysfs_ns_type(sd->s_parent);
> >> +	if (type && (sysfs_info(dentry->d_sb)->ns[type] != sd->s_ns))
> >
> > eww, the code is assuming that KOBJ_NS_TYPE_NONE has a value of zero. 
> > Don't do that; it smells bad.
> 
> Gag.  An incomplete change in idiom.
> 
> KOBJ_NS_TYPE_NONE is explicitly defined as 0 so that it can be used
> this way, and everywhere else in fs/sysfs/dir.c uses this idiom.

One man's idiom is another man's idiocy.

Seriously.  What sort of idea is that?  Create an enumerated type and
then just ignore it?

> Pray tell in what parallel universe is that monstrosity above more
> readable than the line it replaces?

Don't be silly, it is not a "monstrosity".  The code it is modifying
contains an unneeded test-and-branch.  It's a test and branch which the
compiler might be able to avoid.  If we can demonstrate that the
compiler does indeed optimise it, or if we can find a less monstrous
way of implementing it then fine.  Otherwise, efficiency wins.

^ permalink raw reply

* Re: [PATCH v2] fail dentry revalidation after namespace change
From: Eric W. Biederman @ 2012-07-10  1:51 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Glauber Costa, linux-kernel, netdev, Greg Thelen, Serge Hallyn,
	Tejun Heo, Greg Kroah-Hartman
In-Reply-To: <20120709174705.0e2078c8.akpm@linux-foundation.org>

Andrew Morton <akpm@linux-foundation.org> writes:

> On Mon, 09 Jul 2012 17:30:48 -0700 ebiederm@xmission.com (Eric W. Biederman) wrote:
>
>> Andrew Morton <akpm@linux-foundation.org> writes:
>> 
>> >>  {
>> >>  	struct sysfs_dirent *sd;
>> >>  	int is_dir;
>> >> +	int type;
>> >>  
>> >>  	if (nd->flags & LOOKUP_RCU)
>> >>  		return -ECHILD;
>> >> @@ -326,6 +327,13 @@ static int sysfs_dentry_revalidate(struct dentry *dentry, struct nameidata *nd)
>> >>  	if (strcmp(dentry->d_name.name, sd->s_name) != 0)
>> >>  		goto out_bad;
>> >>  
>> >> +	/* The sysfs dirent has been moved to a different namespace */
>> >> +	type = KOBJ_NS_TYPE_NONE;
>> >> +	if (sd->s_parent)
>> >> +		type = sysfs_ns_type(sd->s_parent);
>> >> +	if (type && (sysfs_info(dentry->d_sb)->ns[type] != sd->s_ns))
>> >
>> > eww, the code is assuming that KOBJ_NS_TYPE_NONE has a value of zero. 
>> > Don't do that; it smells bad.
>> 
>> Gag.  An incomplete change in idiom.
>> 
>> KOBJ_NS_TYPE_NONE is explicitly defined as 0 so that it can be used
>> this way, and everywhere else in fs/sysfs/dir.c uses this idiom.
>
> One man's idiom is another man's idiocy.

And code that uses inconsistent idioms is even harder to read.

A half assed cleanup is worse than no cleanup.

> Seriously.  What sort of idea is that?  Create an enumerated type and
> then just ignore it?

It isn't ignored.  It just has a well defined NULL value. That is hardly
controversial.

>> Pray tell in what parallel universe is that monstrosity above more
>> readable than the line it replaces?
>
> Don't be silly, it is not a "monstrosity".  The code it is modifying
> contains an unneeded test-and-branch.  It's a test and branch which the
> compiler might be able to avoid.  If we can demonstrate that the
> compiler does indeed optimise it, or if we can find a less monstrous
> way of implementing it then fine.  Otherwise, efficiency wins.

Efficiency wins?  In a rarely used function?  Which kernel are you
working on?

Readable maintainable code wins.  Unreadable code causes regressions.

Your addition while it may not be monstrous is most definitely less
readable.

Eric

^ permalink raw reply

* Re: [PATCH v2] fail dentry revalidation after namespace change
From: Andrew Morton @ 2012-07-10  2:15 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Glauber Costa, linux-kernel, netdev, Greg Thelen, Serge Hallyn,
	Tejun Heo, Greg Kroah-Hartman
In-Reply-To: <87629wxu1i.fsf@xmission.com>

On Mon, 09 Jul 2012 18:51:37 -0700 ebiederm@xmission.com (Eric W. Biederman) wrote:

> Andrew Morton <akpm@linux-foundation.org> writes:
> 
> > On Mon, 09 Jul 2012 17:30:48 -0700 ebiederm@xmission.com (Eric W. Biederman) wrote:
> >
> >> Andrew Morton <akpm@linux-foundation.org> writes:
> >> 
> >> >>  {
> >> >>  	struct sysfs_dirent *sd;
> >> >>  	int is_dir;
> >> >> +	int type;
> >> >>  
> >> >>  	if (nd->flags & LOOKUP_RCU)
> >> >>  		return -ECHILD;
> >> >> @@ -326,6 +327,13 @@ static int sysfs_dentry_revalidate(struct dentry *dentry, struct nameidata *nd)
> >> >>  	if (strcmp(dentry->d_name.name, sd->s_name) != 0)
> >> >>  		goto out_bad;
> >> >>  
> >> >> +	/* The sysfs dirent has been moved to a different namespace */
> >> >> +	type = KOBJ_NS_TYPE_NONE;
> >> >> +	if (sd->s_parent)
> >> >> +		type = sysfs_ns_type(sd->s_parent);
> >> >> +	if (type && (sysfs_info(dentry->d_sb)->ns[type] != sd->s_ns))
> >> >
> >> > eww, the code is assuming that KOBJ_NS_TYPE_NONE has a value of zero. 
> >> > Don't do that; it smells bad.
> >> 
> >> Gag.  An incomplete change in idiom.
> >> 
> >> KOBJ_NS_TYPE_NONE is explicitly defined as 0 so that it can be used
> >> this way, and everywhere else in fs/sysfs/dir.c uses this idiom.
> >
> > One man's idiom is another man's idiocy.
> 
> And code that uses inconsistent idioms is even harder to read.

Not true.  That patch is more readable when it is changed to use
correct types.  If only because readers don't need to go in and check
that KOBJ_NS_TYPE_NONE has value zero.

> > Seriously.  What sort of idea is that?  Create an enumerated type and
> > then just ignore it?
> 
> It isn't ignored.  It just has a well defined NULL value. That is hardly
> controversial.

If it's uncontroversial, why are we talking about it?  Why did I, an
experienced C and kernel developer, think that it looked stupid and
possibly buggy?

I'm uncomfortable with propagating this idiotic and unnecessary trick
any further.  It's better to fix it.

> >> Pray tell in what parallel universe is that monstrosity above more
> >> readable than the line it replaces?
> >
> > Don't be silly, it is not a "monstrosity".  The code it is modifying
> > contains an unneeded test-and-branch.  It's a test and branch which the
> > compiler might be able to avoid.  If we can demonstrate that the
> > compiler does indeed optimise it, or if we can find a less monstrous
> > way of implementing it then fine.  Otherwise, efficiency wins.
> 
> Efficiency wins?  In a rarely used function?  Which kernel are you
> working on?

One in which we frequently optimise uncommon code paths.

> Readable maintainable code wins.  Unreadable code causes regressions.

Dude, the whole reason for having enums and enumerated types is for
readability and maintainability.  If we didn't care about that, we'd
use literal constants everywhere.  And here you are arguing against
that readability and maintainability.

If you want to say "yes, the sysfs code is bad but I can't be bothered
fixing it all" then grumble, but OK.  But for heavens sake, don't go
and *defend* what that code is doing.

^ permalink raw reply

* RE: [PATCH 0/4] Add a driver for the ASIX AX88172A
From: ASIX Allan Email [office] @ 2012-07-10  2:20 UTC (permalink / raw)
  To: 'Mark Lord', 'Grant Grundler'
  Cc: 'Christian Riesch', netdev, 'Oliver Neukum',
	'Eric Dumazet', 'Ming Lei',
	'Michael Riesch'
In-Reply-To: <4FFB5AC6.3000506@teksavvy.com>

Dear All,

From ASIX's support viewpoint, it might be hard to support every AX88172A target application with the in-kernel ax88172a.c driver, because some AX88172A parts are embedded in customers' own designs, such as on-board designs that pair the AX88172A (in PHY mode or Dual-PHY mode) with an external MAC controller. For these applications, the AX88172A Linux driver was qualified directly at the customer's site, which means ASIX does not have those devices available for testing.

For other AX88172A target applications, such as USB dongles that pair the AX88172A with an external fiber PHY or with an external National DP83640 PHY (the configuration Christian is testing), it is possible to support them with the in-kernel ax88172a.c driver, but *** the readme for this driver might need to state clearly which kinds of AX88172A devices have been verified with it ***. That would spare users confusion in the future about why their AX88172A devices do not work with the in-kernel driver.

Please let us know if you need more information. Thanks a lot. 

---
Best regards,
Allan Chou
Technical Support Division
ASIX Electronics Corporation
TEL: 886-3-5799500 ext.228
FAX: 886-3-5799558
E-mail: allan@asix.com.tw 
http://www.asix.com.tw/ 

-----Original Message-----
From: Mark Lord [mailto:kernel@teksavvy.com] 
Sent: Tuesday, July 10, 2012 6:27 AM
To: Grant Grundler
Cc: Christian Riesch; netdev@vger.kernel.org; Oliver Neukum; Eric Dumazet; Allan Chou; Ming Lei; Michael Riesch
Subject: Re: [PATCH 0/4] Add a driver for the ASIX AX88172A

On 12-07-09 01:45 PM, Grant Grundler wrote:
> Christian,
> Here's my $0.02 response to your questions.
> 
> On Fri, Jul 6, 2012 at 4:33 AM, Christian Riesch
> <christian.riesch@omicron.at> wrote:
> ...
>> I have a few questions:
>>
>> 1) Is it ok to factor out the common code like I did? Or should
>>    it go into a separate kernel module?
> 
> I think it's ok. I'd rather not see additional kernel modules unless
> the driver is substantially different.

I'll second that.  Ideally, somebody should pick up the pieces
from my aborted efforts last fall, and just get the real ASIX driver
itself tidied and into the kernel.  Then *everything* would work.

But I doubt that would be feasible at this point.

Cheers

^ permalink raw reply

* [PATCH] net: cgroup: fix access the unallocated memory in netprio cgroup
From: Gao feng @ 2012-07-10  2:31 UTC (permalink / raw)
  To: eric.dumazet, nhorman
  Cc: davem, linux-kernel, netdev, lizefan, tj, Gao feng, Eric Dumazet
In-Reply-To: <1341837625.3265.2748.camel@edumazet-glaptop>

There are some out-of-bounds accesses in the netprio cgroup.
When creating a new netprio cgroup, we only set a prioidx for
the new cgroup, without allocating memory for dev->priomap.

Because we don't want additional bounds checks in the fast path,
I think the best way is to allocate the memory when we create a
new netprio cgroup.

And because a netdev can be created or registered after the cgroup
has been created, extend_netdev_table is also needed in write_priomap.

This patch adds a return value to update_netdev_tables and
extend_netdev_table, so that when allocating new_priomap fails,
write_priomap stops accessing the priomap and returns -ENOMEM to
userspace to tell the user what happened.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Eric Dumazet <edumazet@google.com>
---
 net/core/netprio_cgroup.c |   43 +++++++++++++++++++++++++++++--------------
 1 files changed, 29 insertions(+), 14 deletions(-)

diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index aa907ed..3554f28 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -65,7 +65,7 @@ static void put_prioidx(u32 idx)
 	spin_unlock_irqrestore(&prioidx_map_lock, flags);
 }
 
-static void extend_netdev_table(struct net_device *dev, u32 new_len)
+static int extend_netdev_table(struct net_device *dev, u32 new_len)
 {
 	size_t new_size = sizeof(struct netprio_map) +
 			   ((sizeof(u32) * new_len));
@@ -77,7 +77,7 @@ static void extend_netdev_table(struct net_device *dev, u32 new_len)
 
 	if (!new_priomap) {
 		pr_warn("Unable to alloc new priomap!\n");
-		return;
+		return -ENOMEM;
 	}
 
 	for (i = 0;
@@ -90,10 +90,12 @@ static void extend_netdev_table(struct net_device *dev, u32 new_len)
 	rcu_assign_pointer(dev->priomap, new_priomap);
 	if (old_priomap)
 		kfree_rcu(old_priomap, rcu);
+	return 0;
 }
 
-static void update_netdev_tables(void)
+static int update_netdev_tables(void)
 {
+	int ret = 0;
 	struct net_device *dev;
 	u32 max_len = atomic_read(&max_prioidx) + 1;
 	struct netprio_map *map;
@@ -102,34 +104,44 @@ static void update_netdev_tables(void)
 	for_each_netdev(&init_net, dev) {
 		map = rtnl_dereference(dev->priomap);
 		if ((!map) ||
-		    (map->priomap_len < max_len))
-			extend_netdev_table(dev, max_len);
+		    (map->priomap_len < max_len)) {
+			ret = extend_netdev_table(dev, max_len);
+			if (ret < 0)
+				break;
+		}
 	}
 	rtnl_unlock();
+	return ret;
 }
 
 static struct cgroup_subsys_state *cgrp_create(struct cgroup *cgrp)
 {
 	struct cgroup_netprio_state *cs;
-	int ret;
+	int ret = -EINVAL;
 
 	cs = kzalloc(sizeof(*cs), GFP_KERNEL);
 	if (!cs)
 		return ERR_PTR(-ENOMEM);
 
-	if (cgrp->parent && cgrp_netprio_state(cgrp->parent)->prioidx) {
-		kfree(cs);
-		return ERR_PTR(-EINVAL);
-	}
+	if (cgrp->parent && cgrp_netprio_state(cgrp->parent)->prioidx)
+		goto out;
 
 	ret = get_prioidx(&cs->prioidx);
-	if (ret != 0) {
+	if (ret < 0) {
 		pr_warn("No space in priority index array\n");
-		kfree(cs);
-		return ERR_PTR(ret);
+		goto out;
+	}
+
+	ret = update_netdev_tables();
+	if (ret < 0) {
+		put_prioidx(cs->prioidx);
+		goto out;
 	}
 
 	return &cs->css;
+out:
+	kfree(cs);
+	return ERR_PTR(ret);
 }
 
 static void cgrp_destroy(struct cgroup *cgrp)
@@ -221,7 +233,10 @@ static int write_priomap(struct cgroup *cgrp, struct cftype *cft,
 	if (!dev)
 		goto out_free_devname;
 
-	update_netdev_tables();
+	ret = update_netdev_tables();
+	if (ret < 0)
+		goto out_free_devname;
+
 	ret = 0;
 	rcu_read_lock();
 	map = rcu_dereference(dev->priomap);
-- 
1.7.7.6

^ permalink raw reply related
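
The error-propagation pattern the patch introduces (grow the table, report -ENOMEM, and never index the table when growing failed) can be sketched in user space as follows. `mock_priomap`, `mock_extend_table` and `mock_write_prio` are hypothetical names made up for this illustration; the kernel code uses RCU and rtnl locking, which this sketch leaves out entirely.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space stand-in for struct netprio_map. */
struct mock_priomap {
	unsigned int len;
	unsigned int *prio;
};

/*
 * Grow the map to new_len entries, zero-filling the new tail.
 * Returns 0 on success or -ENOMEM; the old map is untouched on failure.
 */
static int mock_extend_table(struct mock_priomap *map, unsigned int new_len)
{
	unsigned int *bigger;

	if (new_len <= map->len)
		return 0;

	bigger = calloc(new_len, sizeof(*bigger));
	if (!bigger)
		return -ENOMEM;

	if (map->len)
		memcpy(bigger, map->prio, map->len * sizeof(*bigger));
	free(map->prio);
	map->prio = bigger;
	map->len = new_len;
	return 0;
}

/* Caller in the style of write_priomap(): extend first, index second. */
static int mock_write_prio(struct mock_priomap *map, unsigned int idx,
			   unsigned int prio)
{
	int ret = mock_extend_table(map, idx + 1);

	if (ret < 0)
		return ret;	/* do not touch map->prio[idx] */
	map->prio[idx] = prio;
	return 0;
}
```

Checking the return value before indexing keeps the bounds check out of the lookup path, which matches the motivation given in the commit message.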

* Re: [PATCH] net: cgroup: fix out of bounds accesses
From: Gao feng @ 2012-07-10  2:33 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, nhorman, linux-kernel, netdev, lizefan, tj
In-Reply-To: <20120709.145125.1903343847210013668.davem@davemloft.net>

On 2012-07-10 05:51, David Miller wrote:
> From: Gao feng <gaofeng@cn.fujitsu.com>
> Date: Mon, 09 Jul 2012 16:15:29 +0800
> 
>> On 2012-07-09 15:45, Eric Dumazet wrote:
>>> From: Eric Dumazet <edumazet@google.com>
>>>
>>> dev->priomap is allocated by extend_netdev_table() called from
>>> update_netdev_tables().
>>> And this is only called if write_priomap() is called.
>>>
>>> But if write_priomap() is not called, it seems we can have out of bounds
>>> accesses in cgrp_destroy(), read_priomap() & skb_update_prio()
>>>
>>> With help from Gao Feng
>>>
>>> Signed-off-by: Eric Dumazet <edumazet@google.com>
>>> Cc: Neil Horman <nhorman@tuxdriver.com>
>>> Cc: Gao feng <gaofeng@cn.fujitsu.com>
>>> ---
>>> net/core/dev.c            |    8 ++++++--
>>> net/core/netprio_cgroup.c |    4 ++--
>>> 2 files changed, 8 insertions(+), 4 deletions(-)
>>
>> Acked-by: Gao feng <gaofeng@cn.fujitsu.com>
> 
> Applied.
> 

Hi David

Please see my patch in this thread, I think it's a better way to fix this bug.

Thanks.

^ permalink raw reply

* Re: [PATCH] net: cgroup: fix out of bounds accesses
From: David Miller @ 2012-07-10  2:37 UTC (permalink / raw)
  To: gaofeng; +Cc: eric.dumazet, nhorman, linux-kernel, netdev, lizefan, tj
In-Reply-To: <4FFB9473.4040203@cn.fujitsu.com>

From: Gao feng <gaofeng@cn.fujitsu.com>
Date: Tue, 10 Jul 2012 10:33:23 +0800

> Please see my patch in this thread, I think it's a better way to fix this bug.

You'll need to work that out with Eric, fwiw I think his patch was
clean and just fine and it's staying in my tree.

^ permalink raw reply

* linux-next: build failure after merge of the net-next tree
From: Stephen Rothwell @ 2012-07-10  3:08 UTC (permalink / raw)
  To: David Miller, netdev; +Cc: linux-next, linux-kernel, "Bjørn Mork"

Hi all,

After merging the net-next tree, today's linux-next build (x86_64
allmodconfig) failed like this:

drivers/net/usb/qmi_wwan.c:381:13: error: 'qmi_wwan_unbind_shared' undeclared here (not in a function)

Caused by a bad automatic merge between commit 6fecd35d4cd7 ("net:
qmi_wwan: add ZTE MF60") from the net tree and commit 230718bda1be ("net:
qmi_wwan: bind to both control and data interface") from the net-next
tree.

I added the following merge fix patch:

From: Stephen Rothwell <sfr@canb.auug.org.au>
Date: Tue, 10 Jul 2012 13:06:01 +1000
Subject: [PATCH] net: fix for qmi_wwan_unbind_shared changes

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
---
 drivers/net/usb/qmi_wwan.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index 06cfcc7..85c983d 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -378,7 +378,7 @@ static const struct driver_info qmi_wwan_force_int2 = {
 	.description	= "Qualcomm WWAN/QMI device",
 	.flags		= FLAG_WWAN,
 	.bind		= qmi_wwan_bind_shared,
-	.unbind		= qmi_wwan_unbind_shared,
+	.unbind		= qmi_wwan_unbind,
 	.manage_power	= qmi_wwan_manage_power,
 	.data		= BIT(2), /* interface whitelist bitmap */
 };
-- 
1.7.10.280.gaa39

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au

^ permalink raw reply related

* [RFC PATCH v2 net-next] Re: [RFC PATCH] ppp: add support for L2 multihop / tunnel switching
From: Benjamin LaHaise @ 2012-07-10  3:27 UTC (permalink / raw)
  To: James Chapman; +Cc: netdev, linux-ppp
In-Reply-To: <20120709141511.GL19462@kvack.org>

Hello all,

Here is v2 of the PPP multihop patch.  This version adds a notifier hook that
drops the multihop reference when the multihop target is unregistered, so
the references are released properly without leaking the devices.

		-ben

Not-yet-signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
 drivers/net/ppp/ppp_generic.c |  119 ++++++++++++++++++++++++++++++++++++++++--
 include/linux/if_ether.h      |    1 
 include/linux/ppp-ioctl.h     |    1 
 3 files changed, 118 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c
index 5c05572..6dc7eff 100644
--- a/drivers/net/ppp/ppp_generic.c
+++ b/drivers/net/ppp/ppp_generic.c
@@ -121,6 +121,7 @@ struct ppp {
 	unsigned long	last_xmit;	/* jiffies when last pkt sent 9c */
 	unsigned long	last_recv;	/* jiffies when last pkt rcvd a0 */
 	struct net_device *dev;		/* network interface device a4 */
+	struct net_device *multihop_if;	/* if to forward incoming frames to */
 	int		closing;	/* is device closing down? a8 */
 #ifdef CONFIG_PPP_MULTILINK
 	int		nxchan;		/* next channel to send something on */
@@ -272,6 +273,7 @@ static void unit_put(struct idr *p, int n);
 static void *unit_find(struct idr *p, int n);
 
 static struct class *ppp_class;
+static const struct net_device_ops ppp_netdev_ops;
 
 /* per net-namespace data */
 static inline struct ppp_net *ppp_pernet(struct net *net)
@@ -380,8 +382,9 @@ static int ppp_release(struct inode *unused, struct file *file)
 		file->private_data = NULL;
 		if (pf->kind == INTERFACE) {
 			ppp = PF_TO_PPP(pf);
-			if (file == ppp->owner)
+			if (file == ppp->owner) {
 				ppp_shutdown_interface(ppp);
+			}
 		}
 		if (atomic_dec_and_test(&pf->refcnt)) {
 			switch (pf->kind) {
@@ -553,6 +556,41 @@ static int get_filter(void __user *arg, struct sock_filter **p)
 }
 #endif /* CONFIG_PPP_FILTER */
 
+static int ppp_multihop_event(struct notifier_block *this, unsigned long event,
+			      void *ptr)
+{
+	struct net_device *event_dev = (struct net_device *)ptr;
+	struct net_device *master = event_dev->master;
+	struct ppp *ppp;
+
+	if (event_dev->netdev_ops != &ppp_netdev_ops)
+		return NOTIFY_DONE;
+	if (!master || (master->netdev_ops != &ppp_netdev_ops))
+		return NOTIFY_DONE;
+
+	ppp = netdev_priv(master);
+
+	switch (event) {
+	case NETDEV_UNREGISTER:
+		ppp_lock(ppp);
+		BUG_ON(ppp->multihop_if != event_dev);
+		ppp->multihop_if = NULL;
+		netdev_set_master(event_dev, NULL);
+		ppp_unlock(ppp);
+		dev_put(event_dev);
+		break;
+
+	default:
+		break;
+	}
+
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block ppp_multihop_notifier = {
+	.notifier_call = ppp_multihop_event,
+};
+
 static long ppp_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 {
 	struct ppp_file *pf = file->private_data;
@@ -738,6 +776,46 @@ static long ppp_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 		err = 0;
 		break;
 
+	case PPPIOCSMULTIHOP_IF:
+	{
+		struct net_device *multihop_if;
+		if (get_user(val, p))
+			break;
+		rtnl_lock();
+		ppp_lock(ppp);
+		err = 0;
+		multihop_if = ppp->multihop_if;
+		if (multihop_if && (val == -1)) {
+			ppp->multihop_if = NULL;
+			BUG_ON(multihop_if->master != ppp->dev);
+			netdev_set_master(multihop_if, NULL);
+			goto out_multihop;
+		}
+		err = -EBUSY;
+		multihop_if = NULL;
+		if (ppp->multihop_if)
+			goto out_multihop;
+		multihop_if = dev_get_by_index(&init_net, val);
+		err = -ENOENT;
+		if (!multihop_if)
+			goto out_multihop;
+		err = -EINVAL;
+		if (multihop_if->netdev_ops != &ppp_netdev_ops)
+			goto out_multihop;
+		err = netdev_set_master(multihop_if, ppp->dev);
+		if (err)
+			goto out_multihop;
+		ppp->multihop_if = multihop_if;
+		multihop_if = NULL;
+		err = 0;
+out_multihop:
+		ppp_unlock(ppp);
+		rtnl_unlock();
+		if (multihop_if)
+			dev_put(multihop_if);
+		break;
+	}
+
 #ifdef CONFIG_PPP_FILTER
 	case PPPIOCSPASS:
 	{
@@ -901,6 +979,7 @@ static int __init ppp_init(void)
 
 	pr_info("PPP generic driver version " PPP_VERSION "\n");
 
+	register_netdevice_notifier(&ppp_multihop_notifier);
 	err = register_pernet_device(&ppp_net_ops);
 	if (err) {
 		pr_err("failed to register PPP pernet device (%d)\n", err);
@@ -942,6 +1021,9 @@ ppp_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	int npi, proto;
 	unsigned char *pp;
 
+	if (skb->protocol == htons(ETH_P_PPP))
+		goto queue;
+
 	npi = ethertype_to_npindex(ntohs(skb->protocol));
 	if (npi < 0)
 		goto outf;
@@ -968,6 +1050,7 @@ ppp_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	proto = npindex_to_proto[npi];
 	put_unaligned_be16(proto, pp);
 
+queue:
 	skb_queue_tail(&ppp->file.xq, skb);
 	ppp_xmit_process(ppp);
 	return NETDEV_TX_OK;
@@ -1131,6 +1214,9 @@ ppp_send_frame(struct ppp *ppp, struct sk_buff *skb)
 	int len;
 	unsigned char *cp;
 
+	if (skb->protocol == htons(ETH_P_PPP))
+		goto xmit;
+
 	if (proto < 0x8000) {
 #ifdef CONFIG_PPP_FILTER
 		/* check if we should pass this packet */
@@ -1228,6 +1314,7 @@ ppp_send_frame(struct ppp *ppp, struct sk_buff *skb)
 		return;
 	}
 
+xmit:
 	ppp->xmit_pending = skb;
 	ppp_push(ppp);
 	return;
@@ -1259,7 +1346,8 @@ ppp_push(struct ppp *ppp)
 		return;
 	}
 
-	if ((ppp->flags & SC_MULTILINK) == 0) {
+	if (((ppp->flags & SC_MULTILINK) == 0) ||
+	    (skb->protocol == htons(ETH_P_PPP))) {
 		/* not doing multilink: send it down the first channel */
 		list = list->next;
 		pch = list_entry(list, struct channel, clist);
@@ -1599,6 +1687,14 @@ ppp_input(struct ppp_channel *chan, struct sk_buff *skb)
 		goto done;
 	}
 
+	if (pch->ppp && pch->ppp->multihop_if) {
+		skb->protocol = htons(ETH_P_PPP);
+		skb->dev = pch->ppp->multihop_if;
+		skb->ip_summed = CHECKSUM_NONE;
+		dev_queue_xmit(skb);
+		goto done;
+	}
+
 	proto = PPP_PROTO(skb);
 	if (!pch->ppp || proto >= 0xc000 || proto == PPP_CCPFRAG) {
 		/* put it on the channel queue */
@@ -2709,18 +2805,28 @@ static void ppp_shutdown_interface(struct ppp *ppp)
 {
 	struct ppp_net *pn;
 
+	rtnl_lock();
 	pn = ppp_pernet(ppp->ppp_net);
 	mutex_lock(&pn->all_ppp_mutex);
 
 	/* This will call dev_close() for us. */
 	ppp_lock(ppp);
 	if (!ppp->closing) {
+		struct net_device *multihop_if = ppp->multihop_if;
 		ppp->closing = 1;
+		ppp->multihop_if = NULL;
 		ppp_unlock(ppp);
+		if (multihop_if)
+			netdev_set_master(multihop_if, NULL);
+		rtnl_unlock();
 		unregister_netdev(ppp->dev);
 		unit_put(&pn->units_idr, ppp->file.index);
-	} else
+		if (multihop_if)
+			dev_put(multihop_if);
+	} else {
 		ppp_unlock(ppp);
+		rtnl_unlock();
+	}
 
 	ppp->file.dead = 1;
 	ppp->owner = NULL;
@@ -2764,6 +2870,12 @@ static void ppp_destroy_interface(struct ppp *ppp)
 #endif /* CONFIG_PPP_FILTER */
 
 	kfree_skb(ppp->xmit_pending);
+	if (ppp->multihop_if) {
+		struct net_device *multihop_if = ppp->multihop_if;
+		ppp->multihop_if = NULL;
+		netdev_set_master(multihop_if, NULL);
+		dev_put(multihop_if);
+	}
 
 	free_netdev(ppp->dev);
 }
@@ -2901,6 +3013,7 @@ static void __exit ppp_cleanup(void)
 	device_destroy(ppp_class, MKDEV(PPP_MAJOR, 0));
 	class_destroy(ppp_class);
 	unregister_pernet_device(&ppp_net_ops);
+	unregister_netdevice_notifier(&ppp_multihop_notifier);
 }
 
 /*
diff --git a/include/linux/if_ether.h b/include/linux/if_ether.h
index 167ce5b..fe47a70 100644
--- a/include/linux/if_ether.h
+++ b/include/linux/if_ether.h
@@ -120,6 +120,7 @@
 #define ETH_P_PHONET	0x00F5		/* Nokia Phonet frames          */
 #define ETH_P_IEEE802154 0x00F6		/* IEEE802.15.4 frame		*/
 #define ETH_P_CAIF	0x00F7		/* ST-Ericsson CAIF protocol	*/
+#define ETH_P_PPP	0x00F8		/* Dummy type for PPP multihop	*/
 
 /*
  *	This is an Ethernet frame header.
diff --git a/include/linux/ppp-ioctl.h b/include/linux/ppp-ioctl.h
index 2d9a885..5571375 100644
--- a/include/linux/ppp-ioctl.h
+++ b/include/linux/ppp-ioctl.h
@@ -81,6 +81,7 @@ struct pppol2tp_ioc_stats {
  * Ioctl definitions.
  */
 
+#define	PPPIOCSMULTIHOP_IF	_IOWR('t', 91, int) /* set multihop if */
 #define	PPPIOCGFLAGS	_IOR('t', 90, int)	/* get configuration flags */
 #define	PPPIOCSFLAGS	_IOW('t', 89, int)	/* set configuration flags */
 #define	PPPIOCGASYNCMAP	_IOR('t', 88, int)	/* get async map */

^ permalink raw reply related

* Re: [PATCH] net: cgroup: fix access the unallocated memory in netprio cgroup
From: Eric Dumazet @ 2012-07-10  4:14 UTC (permalink / raw)
  To: Gao feng; +Cc: nhorman, davem, linux-kernel, netdev, lizefan, tj, Eric Dumazet
In-Reply-To: <1341887508-20302-1-git-send-email-gaofeng@cn.fujitsu.com>

On Tue, 2012-07-10 at 10:31 +0800, Gao feng wrote:
> There are some out-of-bounds accesses in the netprio cgroup.
> When creating a new netprio cgroup, we only set a prioidx for
> the new cgroup, without allocating memory for dev->priomap.
> 
> Because we don't want additional bounds checks in the
> fast path, I think the best way is to allocate the memory when
> creating a new netprio cgroup.
> 
> And because a netdev can be created or registered after the cgroup is
> created, extend_netdev_table is also needed in write_priomap.
> 
> This patch adds a return value to update_netdev_tables & extend_netdev_table,
> so when allocating new_priomap fails, write_priomap stops accessing
> the priomap and returns -ENOMEM to userspace to tell the user
> what happened.
> 
> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
> Cc: Neil Horman <nhorman@tuxdriver.com>
> Cc: Eric Dumazet <edumazet@google.com>
> ---

>  static void cgrp_destroy(struct cgroup *cgrp)
> @@ -221,7 +233,10 @@ static int write_priomap(struct cgroup *cgrp, struct cftype *cft,
>  	if (!dev)
>  		goto out_free_devname;
>  
> -	update_netdev_tables();
> +	ret = update_netdev_tables();
> +	if (ret < 0)
> +		goto out_free_devname;
> +
>  	ret = 0;
>  	rcu_read_lock();
>  	map = rcu_dereference(dev->priomap);

Hi Gao

Is it still necessary to call update_netdev_tables() from write_priomap()?

^ permalink raw reply

* Re: [PATCH net-next 6/6] r8169: support RTL8168G
From: Hayes Wang @ 2012-07-10  5:36 UTC (permalink / raw)
  To: romieu; +Cc: netdev, linux-kernel, wfg, Hayes Wang
In-Reply-To: <c558386b836ee97762e12495101c6e373f20e69d.1341872752.git.romieu@fr.zoreil.com>

Fix incorrect argument in rtl_hw_init_8168g: pass tp, not ioaddr, to
r8168_mac_ocp_read().

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
---
 drivers/net/ethernet/realtek/r8169.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 7ff3423..c29c5fb 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -6753,14 +6753,14 @@ static void __devinit rtl_hw_init_8168g(struct rtl8169_private *tp)
 	msleep(1);
 	RTL_W8(MCU, RTL_R8(MCU) & ~NOW_IS_OOB);
 
-	data = r8168_mac_ocp_read(ioaddr, 0xe8de);
+	data = r8168_mac_ocp_read(tp, 0xe8de);
 	data &= ~(1 << 14);
 	r8168_mac_ocp_write(tp, 0xe8de, data);
 
 	if (!rtl_udelay_loop_wait_high(tp, &rtl_link_list_ready_cond, 100, 42))
 		return;
 
-	data = r8168_mac_ocp_read(ioaddr, 0xe8de);
+	data = r8168_mac_ocp_read(tp, 0xe8de);
 	data |= (1 << 15);
 	r8168_mac_ocp_write(tp, 0xe8de, data);
 
-- 
1.7.10.4

^ permalink raw reply related

* [RFC] skbtrace: A trace infrastructure for networking subsystem
From: Li Yu @ 2012-07-10  6:07 UTC (permalink / raw)
  To: Linux Netdev List

Hi,

  This RFC introduces a tracing infrastructure for the networking
subsystem, along with a workable prototype.

  I have noticed that blktrace helps file system and block subsystem
developers a lot; it can even help them find problems in the mm
subsystem. Networking developers are not so lucky. tcpdump is very
useful, but they often still have to start an investigation from the
limited set of exported statistics counters, then dig into the source
code to guess at possible solutions, then test their ideas. If that
does not work out, another investigate-guess-test loop begins. This is
difficult and time-consuming, and it makes it hard to share experience
or report problems. Many users do not understand protocol stack
internals well enough, and I have seen "detailed reports" that still
carry no information useful for solving the problem.

  Unfortunately, the networking subsystem is rather performance
sensitive, so we cannot add overly detailed counters directly to it.
In fact, some people have already tried to add more statistics counters
for detailed performance measurement, e.g. RFC 4898 and its
implementation, the Web10G project. Web10G is a great project for
researchers and engineers working on the TCP stack; it exports
per-connection details to userland through a procfs or netlink
interface. However, it depends tightly on TCP and its implementation,
so other protocol implementations would need to duplicate the work to
achieve the same goal. It also has measurable overhead (5% - 10% in my
simple netperf TCP_STREAM benchmark). I think such a powerful tracing
or instrumentation feature should be switchable off at runtime, with
zero overhead when it is off.

  So why don't we write a blktrace-like utility for our sweet
networking subsystem? That is what "skbtrace" is. I hope it can:

1. Provide an extensible tracing infrastructure that supports various
protocols instead of one specific protocol.

2. Be enabled or disabled at runtime, with zero overhead when it is
off. I think jump-label-optimized trace points are a good choice for
implementing this.

3. Provide tracing details at the per-connection/per-skb level. Please
note that skbtrace is not only for sk_buff tracing; it can also track
socket events. This also means we need some form of filtering,
otherwise we will get lost in tons of uninteresting trace data. I think
BPF is a good choice, but we need to extend BPF so that it can handle
data structures other than skb.
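
As a rough illustration of point 2, the following userspace sketch shows
a trace point that costs only a predicted-not-taken branch when tracing
is off. It is only a model under stated assumptions: real jump labels
patch the branch out of the instruction stream, which plain C cannot
express. All demo_* names are hypothetical and do not come from the
skbtrace source, and __builtin_expect is a GCC/Clang builtin.

```c
#include <stdbool.h>

/* Hypothetical runtime switch; a real implementation would use a static key. */
static bool demo_trace_enabled;
static unsigned long demo_trace_hits;

/* Turn tracing on or off at runtime. */
void demo_trace_set(bool on)
{
	demo_trace_enabled = on;
}

/* Number of events that reached the slow path. */
unsigned long demo_trace_count(void)
{
	return demo_trace_hits;
}

/* Slow path: here it only counts; a tracer would copy data to a buffer. */
static void demo_trace_slow_path(const void *obj)
{
	(void)obj;
	demo_trace_hits++;
}

/* Trace point: a single branch, hinted as unlikely, when tracing is off. */
static inline void demo_trace_point(const void *obj)
{
	if (__builtin_expect(demo_trace_enabled, 0))
		demo_trace_slow_path(obj);
}
```

Calling demo_trace_point() while the switch is off leaves the counter
untouched; after demo_trace_set(true), each call reaches the slow path.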

  The above is the basic idea; below are the details of the current
prototype implementation.

  Like blktrace, skbtrace is based on the tracepoint infrastructure
and the relay file system. However, I did not implement any tracers as
blktrace does, since I want to keep the kernel side as simple (and, I
hope, as fast) as possible. Basically, the trace points are just
optimized conditional statements; the slow path copies the traced data
to a ring buffer in the relay file system. The parameters of this relay
file system can be tuned through some files exported in the skbtrace
directory.

  There are three trace data files (channels) in the relay file system
for each CPU. They correspond to the ring buffers above and hold
kernel trace data for different contexts:

  (1) trace.hardirq.cpuN, which holds trace data from hardirq context.
  (2) trace.softirq.cpuN, which holds trace data from softirq context.
  (3) trace.syscall.cpuN, which holds trace data from process context.

  Each trace record is written to one of the above channels, depending
on the context in which the trace point is called. Each trace record is
represented by a skbtrace_block struct; extended fields for specific
protocols can be appended at the end of it. To give trace data a global
order, this patch uses a 64-bit atomic variable to generate a sequence
number for each trace record, so a userland utility is able to sort
out-of-order trace data across different channels and/or CPUs.
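
The ordering step can be sketched in userspace as follows. This is a
minimal sketch that assumes records from all channels have already been
read into one array; struct demo_block and the demo_* functions are
hypothetical and do not reflect the real skbtrace_block layout.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record; NOT the real skbtrace_block layout. */
struct demo_block {
	uint64_t seq;	/* global sequence number assigned at trace time */
	int cpu;	/* CPU whose channel the record was read from */
	int channel;	/* 0 = hardirq, 1 = softirq, 2 = syscall */
};

/* qsort comparator: ascending sequence number, no overflow on subtract. */
static int demo_cmp_seq(const void *a, const void *b)
{
	const struct demo_block *x = a;
	const struct demo_block *y = b;

	if (x->seq < y->seq)
		return -1;
	if (x->seq > y->seq)
		return 1;
	return 0;
}

/* Restore global order for records gathered from several channels/CPUs. */
void demo_sort_blocks(struct demo_block *blocks, size_t n)
{
	qsort(blocks, n, sizeof(*blocks), demo_cmp_seq);
}
```

Because the sequence numbers are unique, the sorted order does not
depend on which channel or CPU a record was read from.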

  For the trace filter feature, I selected BPF as the core engine. So far
it can only filter sk_buff-based traces; I plan to extend BPF to
support other data structures. In fact, I once wrote a custom filter
implementation for TCP/IPv4, but that approach requires refactoring each
specific protocol implementation. I did not like it and discarded it.

  So far, I have implemented these skbtrace trace points:

  (1) skb_rps_info.

        I have seen some buggy drivers (or firmware?) that
        always set skb->rx_hash to zero. It also seems that RPS
        hashing does not work well in some corner cases.

  (2) tcp_connection and icsk_connection.

        These track basic TCP state transitions, e.g. TCP_LISTEN.

  (3) tcp_sendlimit.

        Personally, I am interested in the reasons why
        tcp_write_xmit() exits.

  (4) tcp_congestion.

        Oops, this is the cwnd killer, isn't it?

  The userland utilities:

  (1) skbtrace, which records raw trace data to regular disk files.
  (2) skbparse, which parses raw trace data into human-readable strings.
                This still needs a lot of work; for now it is just a
                rough (but workable) demo for TCP/IPv4.

  You can get the source code on GitHub:

	https://github.com/Rover-Yu/skbtrace-userland
	https://github.com/Rover-Yu/skbtrace-kernel

  The source code of skbtrace-kernel is based on the net-next tree.

  Suggestions are welcome.

  Thanks.

Yu

^ permalink raw reply

* [net] net: Fix memory leak - vlan_info struct
From: Jeff Kirsher @ 2012-07-10  6:47 UTC (permalink / raw)
  To: davem; +Cc: Amir Hanania, netdev, gospo, sassmann, Jeff Kirsher

From: Amir Hanania <amir.hanania@intel.com>

There is a memory leak in the driver reload test.
The vlan_info structure was not freed when the driver was removed.
It was not released because nr_vids is still one after the last vlan is removed.
nr_vids is one because vlan zero is added to the interface when the interface
is brought up, but vlan zero is not deleted at unregister.
Fix: delete vlan zero when we unregister the device.
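
The counting problem can be modelled in userspace as follows. This is
only a sketch under simplifying assumptions: the demo_* types and
functions are hypothetical stand-ins, not the kernel's vlan_info or
vlan_vid_add()/vlan_vid_del(), and the model counts ids without
tracking which ones they are.

```c
#include <stdlib.h>

/* Hypothetical stand-ins; NOT the kernel's vlan_info or net_device. */
struct demo_vlan_info {
	unsigned int nr_vids;	/* number of VLAN ids currently registered */
};

struct demo_dev {
	struct demo_vlan_info *info;	/* allocated on first add, freed on last del */
};

/* Register one VLAN id; returns 0 on success, -1 on allocation failure. */
int demo_vid_add(struct demo_dev *dev)
{
	if (!dev->info) {
		dev->info = calloc(1, sizeof(*dev->info));
		if (!dev->info)
			return -1;
	}
	dev->info->nr_vids++;
	return 0;
}

/* Unregister one VLAN id; the info is freed only when the count reaches zero. */
void demo_vid_del(struct demo_dev *dev)
{
	if (!dev->info)
		return;
	if (--dev->info->nr_vids == 0) {
		free(dev->info);
		dev->info = NULL;
	}
}
```

In this model, adding id zero plus one user vlan and then deleting only
the user vlan leaves the count at one, so the info structure stays
allocated until id zero is deleted as well.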

Signed-off-by: Amir Hanania <amir.hanania@intel.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 net/8021q/vlan.c |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index 6089f0c..9096bcb 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -403,6 +403,9 @@ static int vlan_device_event(struct notifier_block *unused, unsigned long event,
 		break;
 
 	case NETDEV_DOWN:
+		if (dev->features & NETIF_F_HW_VLAN_FILTER)
+			vlan_vid_del(dev, 0);
+
 		/* Put all VLANs for this dev in the down state too.  */
 		for (i = 0; i < VLAN_N_VID; i++) {
 			vlandev = vlan_group_get_device(grp, i);
-- 
1.7.10.4

^ permalink raw reply related

* Re: [PATCH net-next 6/6] r8169: support RTL8168G
From: Francois Romieu @ 2012-07-10  6:50 UTC (permalink / raw)
  To: Hayes Wang; +Cc: David S. Miller, netdev, linux-kernel, wfg
In-Reply-To: <1341898590-1253-1-git-send-email-hayeswang@realtek.com>

Hayes Wang <hayeswang@realtek.com> :
> fix incorrect argument in rtl_hw_init_8168g.
> 
> Signed-off-by: Hayes Wang <hayeswang@realtek.com>

Thanks Hayes.

It's available with proper attribution and subject at:

git://violet.fr.zoreil.com/romieu/linux davem-next.r8169

-- 
Ueimor

^ permalink raw reply

* [v2 PATCH] ksz884x: fix Endian
From: roy.qing.li @ 2012-07-10  6:56 UTC (permalink / raw)
  To: netdev; +Cc: Tristram.Ha, bhutchings, joe

From: Li RongQing <roy.qing.li@gmail.com>

The ETH_P_* constants (here ETH_P_IPV6) are in host byte order, while
skb->protocol is big endian, so using htons() on skb->protocol when
comparing them is wrong. Convert the constant instead.

Also fix two code style issues: correct the indentation and remove
unnecessary parentheses.
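
The comparison idiom can be shown in a small userspace sketch. It
assumes a field that holds the protocol in network byte order, as
skb->protocol does; demo_is_ipv6 and DEMO_ETH_P_IPV6 are hypothetical
names, with the constant set to the well-known IPv6 EtherType 0x86DD.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/* Host-order constant with the same value as ETH_P_IPV6. */
#define DEMO_ETH_P_IPV6 0x86DD

/*
 * wire_proto is in network (big endian) byte order, like skb->protocol.
 * Convert the host-order constant to network order and compare, rather
 * than converting the wire value.
 */
bool demo_is_ipv6(uint16_t wire_proto)
{
	return wire_proto == htons(DEMO_ETH_P_IPV6);
}
```

A 16-bit byte swap is its own inverse, so both forms of the comparison
give the same result at runtime. Converting the constant keeps the
types consistent for endianness checkers and lets the compiler fold the
conversion at build time.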

CC: Tristram Ha <Tristram.Ha@micrel.com>
CC: Ben Hutchings <bhutchings@solarflare.com>
CC: Joe Perches <joe@perches.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
---
 drivers/net/ethernet/micrel/ksz884x.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/micrel/ksz884x.c b/drivers/net/ethernet/micrel/ksz884x.c
index eaf9ff0..0fbe2e2 100644
--- a/drivers/net/ethernet/micrel/ksz884x.c
+++ b/drivers/net/ethernet/micrel/ksz884x.c
@@ -4881,8 +4881,8 @@ static netdev_tx_t netdev_tx(struct sk_buff *skb, struct net_device *dev)
 	left = hw_alloc_pkt(hw, skb->len, num);
 	if (left) {
 		if (left < num ||
-				((CHECKSUM_PARTIAL == skb->ip_summed) &&
-				(ETH_P_IPV6 == htons(skb->protocol)))) {
+		    (CHECKSUM_PARTIAL == skb->ip_summed &&
+		     skb->protocol == htons(ETH_P_IPV6))) {
 			struct sk_buff *org_skb = skb;
 
 			skb = netdev_alloc_skb(dev, org_skb->len);
-- 
1.7.1

^ permalink raw reply related

* Re: [RFC PATCH] bridge: netfilter: fix skb->nf_bridge NULL panic in br_nf_forward_finish
From: Massimo Cetra @ 2012-07-10  6:58 UTC (permalink / raw)
  To: Lin Ming
  Cc: Massimo Cetra, Eric Dumazet, netdev, Stephen Hemminger,
	David S. Miller, Julian Anastasov
In-Reply-To: <CAF1ivSZBMWYc5iKxhX5d_ykkMD4LauFP9M10dBwfmqvpYj=pHg@mail.gmail.com>

On 09/07/2012 14:00, Lin Ming wrote:

>> i spent a couple of days trying to figure out how to reproduce but you were
>> quicker and smarter than me.
>
> Could you also test it ? :-)
>

Of course.

I have already installed a 3.5-rc and a 3.2.22 with this patch and,
so far, I see no problems.

I'm only waiting a couple of days before reporting, to be sure the issue 
is gone.

Massimo

^ permalink raw reply
