Netdev List
 help / color / mirror / Atom feed
* Re: irq disable in __netdev_alloc_frag() ?
From: Eric Dumazet @ 2014-10-23  3:56 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: Alexei Starovoitov, Eric Dumazet, Network Development
In-Reply-To: <1414036276.2094.18.camel@edumazet-glaptop2.roam.corp.google.com>

On Wed, 2014-10-22 at 20:51 -0700, Eric Dumazet wrote:
> On Wed, 2014-10-22 at 20:19 -0700, Alexander Duyck wrote:
> 
> > Couldn't __netdev_alloc_frag() be forked into two functions, one that is
> > only called from inside the NAPI context and one that is called for all
> > other contexts?  It would mean having to double the number of pages
> > being held per CPU, but I would think something like that would be doable.
> 
> Possibly, but this looks like code bloat for me.
> 
> On my hosts, this hard irq masking is pure noise.
> 
> What CPU are you using Alexander ?

Sorry, the question was for Alexei ;)

^ permalink raw reply

* Re: irq disable in __netdev_alloc_frag() ?
From: Eric Dumazet @ 2014-10-23  3:51 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: Alexei Starovoitov, Eric Dumazet, Network Development
In-Reply-To: <544873DF.1040403@gmail.com>

On Wed, 2014-10-22 at 20:19 -0700, Alexander Duyck wrote:

> Couldn't __netdev_alloc_frag() be forked into two functions, one that is
> only called from inside the NAPI context and one that is called for all
> other contexts?  It would mean having to double the number of pages
> being held per CPU, but I would think something like that would be doable.

Possibly, but this looks like code bloat for me.

On my hosts, this hard irq masking is pure noise.

What CPU are you using Alexander ?

Same could be done with some kmem_cache_alloc() : SLAB uses hard irq
masking while some caches are never used from hard irq context.

^ permalink raw reply

* Re: irq disable in __netdev_alloc_frag() ?
From: Eric Dumazet @ 2014-10-23  3:48 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: Eric Dumazet, Network Development
In-Reply-To: <CAMEtUuxo7TT+9S1KSHxyTKR+pD6HWJKaTOO=A_VGsZVji3YU2w@mail.gmail.com>

On Wed, 2014-10-22 at 19:22 -0700, Alexei Starovoitov wrote:

> yes. I was thinking, since dev is already passed
> into __netdev_alloc_skb(), we can check whether
> dev registered with napi via dev->napi_list and if so,
> tell inner __netdev_alloc_frag() to skip irq disabling...
> 

This does not matter. The problem is not _this_ device, the problem is
that another device might trigger a hard irq, and this hard irq could
mess your data.

^ permalink raw reply

* Re: [PATCH net] macvlan: fix a race on port dismantle and possible skb leaks
From: Herbert Xu @ 2014-10-23  3:28 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <1414032226.2094.14.camel@edumazet-glaptop2.roam.corp.google.com>

On Wed, Oct 22, 2014 at 07:43:46PM -0700, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> We need to cancel the work queue after rcu grace period,
> otherwise it can be rescheduled by incoming packets.
> 
> We need to purge queue if some skbs are still in it.
> 
> We can use __skb_queue_head_init() variant in
> macvlan_process_broadcast()
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Fixes: 412ca1550cbec ("macvlan: Move broadcasts into a work queue")
> Cc: Herbert Xu <herbert@gondor.apana.org.au>

Good catch! Your fix looks good to me.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [net] i40e: _MASK vs _SHIFT typo in i40e_handle_mdd_event()
From: Joe Perches @ 2014-10-23  3:22 UTC (permalink / raw)
  To: Jeff Kirsher; +Cc: davem, Dan Carpenter, netdev, nhorman, sassmann, jogreene
In-Reply-To: <1414033589-7544-1-git-send-email-jeffrey.t.kirsher@intel.com>

On Wed, 2014-10-22 at 20:06 -0700, Jeff Kirsher wrote:
> We accidentally mask by the _SHIFT variable.  It means that "event" is
> always zero.
[]
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
[]
> @@ -6151,7 +6151,7 @@ static void i40e_handle_mdd_event(struct i40e_pf *pf)
>  				I40E_GL_MDET_TX_PF_NUM_SHIFT;
>  		u8 vf_num = (reg & I40E_GL_MDET_TX_VF_NUM_MASK) >>
>  				I40E_GL_MDET_TX_VF_NUM_SHIFT;
> -		u8 event = (reg & I40E_GL_MDET_TX_EVENT_SHIFT) >>
> +		u8 event = (reg & I40E_GL_MDET_TX_EVENT_MASK) >>
>  				I40E_GL_MDET_TX_EVENT_SHIFT;
>  		u8 queue = (reg & I40E_GL_MDET_TX_QUEUE_MASK) >>
>  				I40E_GL_MDET_TX_QUEUE_SHIFT;
> @@ -6165,7 +6165,7 @@ static void i40e_handle_mdd_event(struct i40e_pf *pf)
>  	if (reg & I40E_GL_MDET_RX_VALID_MASK) {
>  		u8 func = (reg & I40E_GL_MDET_RX_FUNCTION_MASK) >>
>  				I40E_GL_MDET_RX_FUNCTION_SHIFT;
> -		u8 event = (reg & I40E_GL_MDET_RX_EVENT_SHIFT) >>
> +		u8 event = (reg & I40E_GL_MDET_RX_EVENT_MASK) >>
>  				I40E_GL_MDET_RX_EVENT_SHIFT;
>  		u8 queue = (reg & I40E_GL_MDET_RX_QUEUE_MASK) >>
>  				I40E_GL_MDET_RX_QUEUE_SHIFT;

It might be useful to have a macro for that.
Something like:

#define GET_REG_VAL(reg, type)			\
	((reg & type##_MASK) >> type##_SHIFT)

so these could become:

	u8 vf_num = GET_REG_VAL(reg, I40E_GL_MDET_TX_VF_NUM);
	u8 event = GET_REG_VAL(reg, I40E_GL_MDET_TX_EVENT);

etc...

^ permalink raw reply

* Re: irq disable in __netdev_alloc_frag() ?
From: Alexander Duyck @ 2014-10-23  3:19 UTC (permalink / raw)
  To: Eric Dumazet, Alexei Starovoitov; +Cc: Eric Dumazet, Network Development
In-Reply-To: <1414029160.2094.8.camel@edumazet-glaptop2.roam.corp.google.com>

On 10/22/2014 06:52 PM, Eric Dumazet wrote:
> On Wed, 2014-10-22 at 17:15 -0700, Alexei Starovoitov wrote:
>> Hi Eric,
>>
>> in the commit 6f532612cc24 ("net: introduce netdev_alloc_frag()")
>> you mentioned that the reason to disable interrupts
>> in __netdev_alloc_frag() is:
>> "- Must be IRQ safe (non NAPI drivers can use it)"
>>
>> Is there a way to do this conditionally?
>>
>> Without it I see 10% performance gain for my RX tests
>> (from 6.9Mpps to 7.7Mpps) and __netdev_alloc_frag()
>> itself goes from 6.6% to 2.1%
>> (popf seems to be quite costly)
> Well, your driver is probably a NAPI one, so you need to
> mask irqs, or to remove all non NAPI drivers from linux.
>
> __netdev_alloc_frag() (__netdev_alloc_skb()) is used by all.
>
> Problem is __netdev_alloc_frag() is generally deep inside caller
> chain, so using a private pool might have quite an overhead.
>
> Same could be said for skb_queue_head() /skb_queue_tail() /
> sock_queue_rcv_skb() :
> Many callers don't need to block irq.

Couldn't __netdev_alloc_frag() be forked into two functions, one that is
only called from inside the NAPI context and one that is called for all
other contexts?  It would mean having to double the number of pages
being held per CPU, but I would think something like that would be doable.

Thanks,

Alex

^ permalink raw reply

* Possible wireless issue introduced in next-20140930
From: Murilo Opsfelder Araujo @ 2014-10-23  3:17 UTC (permalink / raw)
  To: linux-kernel, netdev, linux-wireless; +Cc: Larry Finger, John W. Linville

Hello, everyone.

With next-20140930 my laptop does not work, i.e. after I enter my login 
and password in KDM, the entire system becomes unresponsive and I need 
to reset it in order to reboot (it does not even show the KDE splash 
screen).

It was working pretty fine with next-20140926.

I've also tested with next-20141022 and v3.18-rc1 and no luck.

git bisect pointed me to the commit below [1].  My wireless card is a 
RTL8191SEvA [2].

I need your help to troubleshoot this.

Thanks in advance.

[1]
commit 38506ecefab911785d5e1aa5889f6eeb462e0954
Author: Larry Finger <Larry.Finger@lwfinger.net>
Date:   Mon Sep 22 09:39:19 2014 -0500

     rtlwifi: rtl_pci: Start modification for new drivers

     Future patches will move the drivers for RTL8192EE and RTL8821AE
     from staging to the regular wireless tree. Here, the necessary features
     are added to the PCI driver. Other files are touched due to changes
     in the various data structs.

     Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
     Signed-off-by: John W. Linville <linville@tuxdriver.com>

[2]
$ lspci -vvv
02:00.0 Network controller: Realtek Semiconductor Co., Ltd. RTL8191SEvA 
Wireless LAN Controller (rev 10)
         Subsystem: Hewlett-Packard Company Device 1467
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- 
ParErr- Stepping- SERR- FastB2B- DisINTx-
         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- 
<TAbort- <MAbort- >SERR- <PERR- INTx-
         Latency: 0, Cache Line Size: 64 bytes
         Interrupt: pin A routed to IRQ 16
         Region 0: I/O ports at 3000 [size=256]
         Region 1: Memory at d3400000 (32-bit, non-prefetchable) [size=16K]
         Capabilities: <access denied>
         Kernel driver in use: rtl8192se

-- 
Murilo

^ permalink raw reply

* [net] i40e: _MASK vs _SHIFT typo in i40e_handle_mdd_event()
From: Jeff Kirsher @ 2014-10-23  3:06 UTC (permalink / raw)
  To: davem; +Cc: Dan Carpenter, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Dan Carpenter <dan.carpenter@oracle.com>

We accidentally mask by the _SHIFT variable.  It means that "event" is
always zero.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index ed5f1c1..c3a7f4a 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -6151,7 +6151,7 @@ static void i40e_handle_mdd_event(struct i40e_pf *pf)
 				I40E_GL_MDET_TX_PF_NUM_SHIFT;
 		u8 vf_num = (reg & I40E_GL_MDET_TX_VF_NUM_MASK) >>
 				I40E_GL_MDET_TX_VF_NUM_SHIFT;
-		u8 event = (reg & I40E_GL_MDET_TX_EVENT_SHIFT) >>
+		u8 event = (reg & I40E_GL_MDET_TX_EVENT_MASK) >>
 				I40E_GL_MDET_TX_EVENT_SHIFT;
 		u8 queue = (reg & I40E_GL_MDET_TX_QUEUE_MASK) >>
 				I40E_GL_MDET_TX_QUEUE_SHIFT;
@@ -6165,7 +6165,7 @@ static void i40e_handle_mdd_event(struct i40e_pf *pf)
 	if (reg & I40E_GL_MDET_RX_VALID_MASK) {
 		u8 func = (reg & I40E_GL_MDET_RX_FUNCTION_MASK) >>
 				I40E_GL_MDET_RX_FUNCTION_SHIFT;
-		u8 event = (reg & I40E_GL_MDET_RX_EVENT_SHIFT) >>
+		u8 event = (reg & I40E_GL_MDET_RX_EVENT_MASK) >>
 				I40E_GL_MDET_RX_EVENT_SHIFT;
 		u8 queue = (reg & I40E_GL_MDET_RX_QUEUE_MASK) >>
 				I40E_GL_MDET_RX_QUEUE_SHIFT;
-- 
1.9.3

^ permalink raw reply related

* [PATCH net] macvlan: fix a race on port dismantle and possible skb leaks
From: Eric Dumazet @ 2014-10-23  2:43 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Herbert Xu

From: Eric Dumazet <edumazet@google.com>

We need to cancel the work queue after rcu grace period,
otherwise it can be rescheduled by incoming packets.

We need to purge queue if some skbs are still in it.

We can use __skb_queue_head_init() variant in
macvlan_process_broadcast()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: 412ca1550cbec ("macvlan: Move broadcasts into a work queue")
Cc: Herbert Xu <herbert@gondor.apana.org.au>
---
 drivers/net/macvlan.c |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 29b3bb410781..98c2732755ea 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -272,7 +272,7 @@ static void macvlan_process_broadcast(struct work_struct *w)
 	struct sk_buff *skb;
 	struct sk_buff_head list;
 
-	skb_queue_head_init(&list);
+	__skb_queue_head_init(&list);
 
 	spin_lock_bh(&port->bc_queue.lock);
 	skb_queue_splice_tail_init(&port->bc_queue, &list);
@@ -1082,9 +1082,15 @@ static void macvlan_port_destroy(struct net_device *dev)
 {
 	struct macvlan_port *port = macvlan_port_get_rtnl(dev);
 
-	cancel_work_sync(&port->bc_work);
 	dev->priv_flags &= ~IFF_MACVLAN_PORT;
 	netdev_rx_handler_unregister(dev);
+
+	/* After this point, no packet can schedule bc_work anymore,
+	 * but we need to cancel it and purge left skbs if any.
+	 */
+	cancel_work_sync(&port->bc_work);
+	__skb_queue_purge(&port->bc_queue);
+
 	kfree_rcu(port, rcu);
 }
 

^ permalink raw reply related

* Re: irq disable in __netdev_alloc_frag() ?
From: Alexei Starovoitov @ 2014-10-23  2:22 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Eric Dumazet, Network Development
In-Reply-To: <1414029160.2094.8.camel@edumazet-glaptop2.roam.corp.google.com>

On Wed, Oct 22, 2014 at 6:52 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> On Wed, 2014-10-22 at 17:15 -0700, Alexei Starovoitov wrote:
>> Hi Eric,
>>
>> in the commit 6f532612cc24 ("net: introduce netdev_alloc_frag()")
>> you mentioned that the reason to disable interrupts
>> in __netdev_alloc_frag() is:
>> "- Must be IRQ safe (non NAPI drivers can use it)"
>>
>> Is there a way to do this conditionally?
>>
>> Without it I see 10% performance gain for my RX tests
>> (from 6.9Mpps to 7.7Mpps) and __netdev_alloc_frag()
>> itself goes from 6.6% to 2.1%
>> (popf seems to be quite costly)
>
> Well, your driver is probably a NAPI one, so you need to
> mask irqs, or to remove all non NAPI drivers from linux.

yeah, the 10G+ nics I care about are all napi :)

> __netdev_alloc_frag() (__netdev_alloc_skb()) is used by all.
>
> Problem is __netdev_alloc_frag() is generally deep inside caller
> chain, so using a private pool might have quite an overhead.

yes. I was thinking, since dev is already passed
into __netdev_alloc_skb(), we can check whether
dev registered with napi via dev->napi_list and if so,
tell inner __netdev_alloc_frag() to skip irq disabling...

don't know about skb_queue_head() and friends.
I'm only looking at pure rx now. One challenge at a time.

^ permalink raw reply

* [PATCH net-next v2 6/6] ethernet: samsung: sxgbe: remove unnecessary check
From: Varka Bhadram @ 2014-10-23  1:58 UTC (permalink / raw)
  To: netdev; +Cc: davem, Varka Bhadram
In-Reply-To: <1414029531-5067-1-git-send-email-varkab@cdac.in>

devm_ioremap_resource checks platform_get_resource() return value.
We can remove the duplicate check here.

Signed-off-by: Varka Bhadram <varkab@cdac.in>
---
 .../net/ethernet/samsung/sxgbe/sxgbe_platform.c    |    3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/net/ethernet/samsung/sxgbe/sxgbe_platform.c b/drivers/net/ethernet/samsung/sxgbe/sxgbe_platform.c
index b147d46..7fd6e27 100644
--- a/drivers/net/ethernet/samsung/sxgbe/sxgbe_platform.c
+++ b/drivers/net/ethernet/samsung/sxgbe/sxgbe_platform.c
@@ -90,9 +90,6 @@ static int sxgbe_platform_probe(struct platform_device *pdev)
 
 	/* Get memory resource */
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-	if (!res)
-		goto err_out;
-
 	addr = devm_ioremap_resource(dev, res);
 	if (IS_ERR(addr))
 		return PTR_ERR(addr);
-- 
1.7.9.5

^ permalink raw reply related

* [PATCH net-next v2 5/6] ethernet: renesas: remove unnecessary check
From: Varka Bhadram @ 2014-10-23  1:58 UTC (permalink / raw)
  To: netdev; +Cc: davem, Varka Bhadram
In-Reply-To: <1414029531-5067-1-git-send-email-varkab@cdac.in>

devm_ioremap_resource checks platform_get_resource() return value.
We can remove the duplicate check here.

Signed-off-by: Varka Bhadram <varkab@cdac.in>
---
 drivers/net/ethernet/renesas/sh_eth.c |    9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
index 60e9c2c..ffb49f3 100644
--- a/drivers/net/ethernet/renesas/sh_eth.c
+++ b/drivers/net/ethernet/renesas/sh_eth.c
@@ -2769,10 +2769,6 @@ static int sh_eth_drv_probe(struct platform_device *pdev)
 
 	/* get base addr */
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-	if (unlikely(res == NULL)) {
-		dev_err(&pdev->dev, "invalid resource\n");
-		return -EINVAL;
-	}
 
 	ndev = alloc_etherdev(sizeof(struct sh_eth_private));
 	if (!ndev)
@@ -2781,8 +2777,6 @@ static int sh_eth_drv_probe(struct platform_device *pdev)
 	pm_runtime_enable(&pdev->dev);
 	pm_runtime_get_sync(&pdev->dev);
 
-	/* The sh Ether-specific entries in the device structure. */
-	ndev->base_addr = res->start;
 	devno = pdev->id;
 	if (devno < 0)
 		devno = 0;
@@ -2806,6 +2800,9 @@ static int sh_eth_drv_probe(struct platform_device *pdev)
 		goto out_release;
 	}
 
+	/* The sh Ether-specific entries in the device structure. */
+	ndev->base_addr = res->start;
+
 	spin_lock_init(&mdp->lock);
 	mdp->pdev = pdev;
 
-- 
1.7.9.5

^ permalink raw reply related

* [PATCH net-next v2 4/6] ethernet: marvell: remove unnecessary check
From: Varka Bhadram @ 2014-10-23  1:58 UTC (permalink / raw)
  To: netdev; +Cc: davem, Varka Bhadram
In-Reply-To: <1414029531-5067-1-git-send-email-varkab@cdac.in>

devm_ioremap_resource checks platform_get_resource() return value.
We can remove the duplicate check here.

Signed-off-by: Varka Bhadram <varkab@cdac.in>
---
 drivers/net/ethernet/marvell/pxa168_eth.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/marvell/pxa168_eth.c b/drivers/net/ethernet/marvell/pxa168_eth.c
index c3b209c..a378c92 100644
--- a/drivers/net/ethernet/marvell/pxa168_eth.c
+++ b/drivers/net/ethernet/marvell/pxa168_eth.c
@@ -1505,16 +1505,14 @@ static int pxa168_eth_probe(struct platform_device *pdev)
 	pep = netdev_priv(dev);
 	pep->dev = dev;
 	pep->clk = clk;
+
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-	if (res == NULL) {
-		err = -ENODEV;
-		goto err_netdev;
-	}
 	pep->base = devm_ioremap_resource(&pdev->dev, res);
 	if (IS_ERR(pep->base)) {
 		err = -ENOMEM;
 		goto err_netdev;
 	}
+
 	res = platform_get_resource(pdev, IORESOURCE_IRQ, 0);
 	BUG_ON(!res);
 	dev->irq = res->start;
-- 
1.7.9.5

^ permalink raw reply related

* [PATCH net-next v2 3/6] ethernet: apm: xgene: remove unnecessary check
From: Varka Bhadram @ 2014-10-23  1:58 UTC (permalink / raw)
  To: netdev; +Cc: davem, Varka Bhadram
In-Reply-To: <1414029531-5067-1-git-send-email-varkab@cdac.in>

devm_ioremap_resource checks platform_get_resource() return value.
We can remove the duplicate check here.

Signed-off-by: Varka Bhadram <varkab@cdac.in>
---
 drivers/net/ethernet/apm/xgene/xgene_enet_main.c |   12 ------------
 1 file changed, 12 deletions(-)

diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
index 3c208cc..f226594 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
@@ -761,10 +761,6 @@ static int xgene_enet_get_resources(struct xgene_enet_pdata *pdata)
 	ndev = pdata->ndev;
 
 	res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "enet_csr");
-	if (!res) {
-		dev_err(dev, "Resource enet_csr not defined\n");
-		return -ENODEV;
-	}
 	pdata->base_addr = devm_ioremap_resource(dev, res);
 	if (IS_ERR(pdata->base_addr)) {
 		dev_err(dev, "Unable to retrieve ENET Port CSR region\n");
@@ -772,10 +768,6 @@ static int xgene_enet_get_resources(struct xgene_enet_pdata *pdata)
 	}
 
 	res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "ring_csr");
-	if (!res) {
-		dev_err(dev, "Resource ring_csr not defined\n");
-		return -ENODEV;
-	}
 	pdata->ring_csr_addr = devm_ioremap_resource(dev, res);
 	if (IS_ERR(pdata->ring_csr_addr)) {
 		dev_err(dev, "Unable to retrieve ENET Ring CSR region\n");
@@ -783,10 +775,6 @@ static int xgene_enet_get_resources(struct xgene_enet_pdata *pdata)
 	}
 
 	res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "ring_cmd");
-	if (!res) {
-		dev_err(dev, "Resource ring_cmd not defined\n");
-		return -ENODEV;
-	}
 	pdata->ring_cmd_addr = devm_ioremap_resource(dev, res);
 	if (IS_ERR(pdata->ring_cmd_addr)) {
 		dev_err(dev, "Unable to retrieve ENET Ring command region\n");
-- 
1.7.9.5

^ permalink raw reply related

* [PATCH net-next v2 2/6] ethernet: wiznet: remove unnecessary check
From: Varka Bhadram @ 2014-10-23  1:58 UTC (permalink / raw)
  To: netdev; +Cc: davem, Varka Bhadram
In-Reply-To: <1414029531-5067-1-git-send-email-varkab@cdac.in>

devm_ioremap_resource checks platform_get_resource() return value.
We can remove the duplicate check here.

Signed-off-by: Varka Bhadram <varkab@cdac.in>
---
 drivers/net/ethernet/wiznet/w5300.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/wiznet/w5300.c b/drivers/net/ethernet/wiznet/w5300.c
index f961f14..7974b7d 100644
--- a/drivers/net/ethernet/wiznet/w5300.c
+++ b/drivers/net/ethernet/wiznet/w5300.c
@@ -558,14 +558,12 @@ static int w5300_hw_probe(struct platform_device *pdev)
 	}
 
 	mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-	if (!mem)
-		return -ENXIO;
-	mem_size = resource_size(mem);
-
 	priv->base = devm_ioremap_resource(&pdev->dev, mem);
 	if (IS_ERR(priv->base))
 		return PTR_ERR(priv->base);
 
+	mem_size = resource_size(mem);
+
 	spin_lock_init(&priv->reg_lock);
 	priv->indirect = mem_size < W5300_BUS_DIRECT_SIZE;
 	if (priv->indirect) {
-- 
1.7.9.5

^ permalink raw reply related

* [PATCH net-next v2 1/6] ethernet: wiznet: remove unnecessary check
From: Varka Bhadram @ 2014-10-23  1:58 UTC (permalink / raw)
  To: netdev; +Cc: davem, Varka Bhadram
In-Reply-To: <1414029531-5067-1-git-send-email-varkab@cdac.in>

devm_ioremap_resource checks platform_get_resource() return value.
We can remove the duplicate check here.

Signed-off-by: Varka Bhadram <varkab@cdac.in>
---
 drivers/net/ethernet/wiznet/w5100.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/wiznet/w5100.c b/drivers/net/ethernet/wiznet/w5100.c
index 0f56b1c..70a930a 100644
--- a/drivers/net/ethernet/wiznet/w5100.c
+++ b/drivers/net/ethernet/wiznet/w5100.c
@@ -638,14 +638,12 @@ static int w5100_hw_probe(struct platform_device *pdev)
 	}
 
 	mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-	if (!mem)
-		return -ENXIO;
-	mem_size = resource_size(mem);
-
 	priv->base = devm_ioremap_resource(&pdev->dev, mem);
 	if (IS_ERR(priv->base))
 		return PTR_ERR(priv->base);
 
+	mem_size = resource_size(mem);
+
 	spin_lock_init(&priv->reg_lock);
 	priv->indirect = mem_size < W5100_BUS_DIRECT_SIZE;
 	if (priv->indirect) {
-- 
1.7.9.5

^ permalink raw reply related

* [PATCH net-next v2 0/6] cleanup on resource check
From: Varka Bhadram @ 2014-10-23  1:58 UTC (permalink / raw)
  To: netdev; +Cc: davem, Varka Bhadram

This series removes the duplication of sanity check for
platform_get_resource() return resource. It will be checked 
with devm_ioremap_resource()

changes since v1:
	- remove NULL dereference on resource_size()

Varka Bhadram (6):
  ethernet: wiznet: remove unnecessary check
  ethernet: wiznet: remove unnecessary check
  ethernet: apm: xgene: remove unnecessary check
  ethernet: marvell: remove unnecessary check
  ethernet: renesas: remove unnecessary check
  ethernet: samsung: sxgbe: remove unnecessary check

 drivers/net/ethernet/apm/xgene/xgene_enet_main.c   |   12 ------------
 drivers/net/ethernet/marvell/pxa168_eth.c          |    6 ++----
 drivers/net/ethernet/renesas/sh_eth.c              |    9 +++------
 .../net/ethernet/samsung/sxgbe/sxgbe_platform.c    |    3 ---
 drivers/net/ethernet/wiznet/w5100.c                |    6 ++----
 drivers/net/ethernet/wiznet/w5300.c                |    6 ++----
 6 files changed, 9 insertions(+), 33 deletions(-)

-- 
1.7.9.5

^ permalink raw reply

* Re: irq disable in __netdev_alloc_frag() ?
From: Eric Dumazet @ 2014-10-23  1:52 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: Eric Dumazet, Network Development
In-Reply-To: <CAMEtUuwsUqd-U8ZSEXCB+a7cvLpbuRjPU2m0Ux84q6cxoWSx+g@mail.gmail.com>


On Wed, 2014-10-22 at 17:15 -0700, Alexei Starovoitov wrote:
> Hi Eric,
> 
> in the commit 6f532612cc24 ("net: introduce netdev_alloc_frag()")
> you mentioned that the reason to disable interrupts
> in __netdev_alloc_frag() is:
> "- Must be IRQ safe (non NAPI drivers can use it)"
> 
> Is there a way to do this conditionally?
> 
> Without it I see 10% performance gain for my RX tests
> (from 6.9Mpps to 7.7Mpps) and __netdev_alloc_frag()
> itself goes from 6.6% to 2.1%
> (popf seems to be quite costly)

Well, your driver is probably a NAPI one, so you need to
mask irqs, or to remove all non NAPI drivers from linux.

__netdev_alloc_frag() (__netdev_alloc_skb()) is used by all.

Problem is __netdev_alloc_frag() is generally deep inside caller
chain, so using a private pool might have quite an overhead.

Same could be said for skb_queue_head() /skb_queue_tail() /
sock_queue_rcv_skb() :
Many callers don't need to block irq.

^ permalink raw reply

* Re: [RFC] tcp md5 use of alloc_percpu
From: Eric Dumazet @ 2014-10-23  1:47 UTC (permalink / raw)
  To: Crestez Dan Leonard; +Cc: Jonathan Toppins, netdev
In-Reply-To: <54485337.5040108@gmail.com>

On Thu, 2014-10-23 at 04:00 +0300, Crestez Dan Leonard wrote:
> On 10/23/2014 02:38 AM, Jonathan Toppins wrote:
> > On 10/22/14, 2:55 PM, Crestez Dan Leonard wrote:
> >> sg_init_one does virt_addr on the pointer which assumes it is directly accessible. But the tcp_md5sig_pool pointer comes from alloc_percpu which can return memory from the vmalloc area after the pcpu_first_chunk is exhausted. This looks wrong to me. I'm am getting crashes on mips and I believe this to be the cause.
> >
> > Thinking about this more if the issue really is sg_init_one assumes a
> > directly accessible memory region, can we just modify the zone
> > allocation to GFP_DMA using alloc_percpu_gfp()? Does this satisfy the
> > assumptions made by sg_init_one?
> I don't think that alloc_percpu_gfp can be used that way. Looking at the 
> code it only checks for GFP_KERNEL and behaves "atomically" if it is not 
> present. This means that it fails rather than vmalloc a new percpu_chunk.
> 
> The problem is not that the memory is not allocated with GFP_DMA but 
> rather that the memory is allocated with vmalloc.

Could you try the following patch ?

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 1bec4e76d88c..d253ad8ced64 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2868,30 +2868,29 @@ EXPORT_SYMBOL(compat_tcp_getsockopt);
 #endif
 
 #ifdef CONFIG_TCP_MD5SIG
-static struct tcp_md5sig_pool __percpu *tcp_md5sig_pool __read_mostly;
+static DEFINE_PER_CPU(struct tcp_md5sig_pool, tcp_md5sig_pool);
 static DEFINE_MUTEX(tcp_md5sig_mutex);
+static bool tcp_md5sig_pool_populated = false;
 
-static void __tcp_free_md5sig_pool(struct tcp_md5sig_pool __percpu *pool)
+static void tcp_free_md5sig_pool(void)
 {
 	int cpu;
 
 	for_each_possible_cpu(cpu) {
-		struct tcp_md5sig_pool *p = per_cpu_ptr(pool, cpu);
+		struct crypto_hash *hash;
+
+		hash = per_cpu(tcp_md5sig_pool, cpu).md5_desc.tfm;
 
-		if (p->md5_desc.tfm)
-			crypto_free_hash(p->md5_desc.tfm);
+		if (hash) {
+			crypto_free_hash(hash);
+			per_cpu(tcp_md5sig_pool, cpu).md5_desc.tfm = NULL;
+		}
 	}
-	free_percpu(pool);
 }
 
 static void __tcp_alloc_md5sig_pool(void)
 {
 	int cpu;
-	struct tcp_md5sig_pool __percpu *pool;
-
-	pool = alloc_percpu(struct tcp_md5sig_pool);
-	if (!pool)
-		return;
 
 	for_each_possible_cpu(cpu) {
 		struct crypto_hash *hash;
@@ -2900,29 +2899,29 @@ static void __tcp_alloc_md5sig_pool(void)
 		if (IS_ERR_OR_NULL(hash))
 			goto out_free;
 
-		per_cpu_ptr(pool, cpu)->md5_desc.tfm = hash;
+		per_cpu(tcp_md5sig_pool, cpu).md5_desc.tfm = hash;
 	}
-	/* before setting tcp_md5sig_pool, we must commit all writes
-	 * to memory. See ACCESS_ONCE() in tcp_get_md5sig_pool()
+	/* before setting tcp_md5sig_pool_populated, we must commit all writes
+	 * to memory. See smp_rmb() in tcp_get_md5sig_pool()
 	 */
 	smp_wmb();
-	tcp_md5sig_pool = pool;
+	tcp_md5sig_pool_populated = true;
 	return;
 out_free:
-	__tcp_free_md5sig_pool(pool);
+	tcp_free_md5sig_pool();
 }
 
 bool tcp_alloc_md5sig_pool(void)
 {
-	if (unlikely(!tcp_md5sig_pool)) {
+	if (unlikely(!tcp_md5sig_pool_populated)) {
 		mutex_lock(&tcp_md5sig_mutex);
 
-		if (!tcp_md5sig_pool)
+		if (!tcp_md5sig_pool_populated)
 			__tcp_alloc_md5sig_pool();
 
 		mutex_unlock(&tcp_md5sig_mutex);
 	}
-	return tcp_md5sig_pool != NULL;
+	return tcp_md5sig_pool_populated;
 }
 EXPORT_SYMBOL(tcp_alloc_md5sig_pool);
 
@@ -2936,13 +2935,13 @@ EXPORT_SYMBOL(tcp_alloc_md5sig_pool);
  */
 struct tcp_md5sig_pool *tcp_get_md5sig_pool(void)
 {
-	struct tcp_md5sig_pool __percpu *p;
-
 	local_bh_disable();
-	p = ACCESS_ONCE(tcp_md5sig_pool);
-	if (p)
-		return raw_cpu_ptr(p);
 
+	if (tcp_md5sig_pool_populated) {
+		/* coupled with smp_wmb() in __tcp_alloc_md5sig_pool */
+		smp_rmb();
+		return this_cpu_ptr(&tcp_md5sig_pool);
+	}
 	local_bh_enable();
 	return NULL;
 }

^ permalink raw reply related

* Re: [RFC] tcp md5 use of alloc_percpu
From: Crestez Dan Leonard @ 2014-10-23  1:00 UTC (permalink / raw)
  To: Jonathan Toppins; +Cc: netdev
In-Reply-To: <54483FF7.4090208@cumulusnetworks.com>

On 10/23/2014 02:38 AM, Jonathan Toppins wrote:
> On 10/22/14, 2:55 PM, Crestez Dan Leonard wrote:
>> sg_init_one does virt_addr on the pointer which assumes it is directly accessible. But the tcp_md5sig_pool pointer comes from alloc_percpu which can return memory from the vmalloc area after the pcpu_first_chunk is exhausted. This looks wrong to me. I'm am getting crashes on mips and I believe this to be the cause.
>
> Thinking about this more if the issue really is sg_init_one assumes a
> directly accessible memory region, can we just modify the zone
> allocation to GFP_DMA using alloc_percpu_gfp()? Does this satisfy the
> assumptions made by sg_init_one?
I don't think that alloc_percpu_gfp can be used that way. Looking at the 
code it only checks for GFP_KERNEL and behaves "atomically" if it is not 
present. This means that it fails rather than vmalloc a new percpu_chunk.

The problem is not that the memory is not allocated with GFP_DMA but 
rather that the memory is allocated with vmalloc.

Regards,
Leonard

^ permalink raw reply

* irq disable in __netdev_alloc_frag() ?
From: Alexei Starovoitov @ 2014-10-23  0:15 UTC (permalink / raw)
  To: Eric Dumazet, Network Development

Hi Eric,

in the commit 6f532612cc24 ("net: introduce netdev_alloc_frag()")
you mentioned that the reason to disable interrupts
in __netdev_alloc_frag() is:
"- Must be IRQ safe (non NAPI drivers can use it)"

Is there a way to do this conditionally?

Without it I see 10% performance gain for my RX tests
(from 6.9Mpps to 7.7Mpps) and __netdev_alloc_frag()
itself goes from 6.6% to 2.1%
(popf seems to be quite costly)

Thanks
Alexei

^ permalink raw reply

* Re: [PATCH 1/2] net: dsa: Error out on tagging protocol mismatches
From: Florian Fainelli @ 2014-10-22 23:46 UTC (permalink / raw)
  To: Andrew Lunn, davem; +Cc: netdev, alexander.h.duyck
In-Reply-To: <1414020918-20903-2-git-send-email-andrew@lunn.ch>

On 10/22/2014 04:35 PM, Andrew Lunn wrote:
> If there is a mismatch between enabled tagging protocols and the
> protocol the switch supports, error out, rather than continue with a
> situation which is unlikely to work.
> 
> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
> cc: alexander.h.duyck@intel.com
> ---
>  net/dsa/dsa.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
> index 22f34cf4cb27..8a31bd81a315 100644
> --- a/net/dsa/dsa.c
> +++ b/net/dsa/dsa.c
> @@ -175,7 +175,8 @@ dsa_switch_setup(struct dsa_switch_tree *dst, int index,
>  			break;
>  #endif
>  		default:
> -			break;
> +			ret = -ENOPROTOOPT;
> +			goto out;
>  		}

This prevents using a switch driver without tagging, which is something
that you might want to do (link setup, ethtool stats, EEE etc...).
--
Florian

^ permalink raw reply

* Re: [RFC] tcp md5 use of alloc_percpu
From: Jonathan Toppins @ 2014-10-22 23:38 UTC (permalink / raw)
  To: Crestez Dan Leonard, netdev
In-Reply-To: <5447FDB2.2010906@gmail.com>

On 10/22/14, 2:55 PM, Crestez Dan Leonard wrote:
> sg_init_one does virt_addr on the pointer which assumes it is directly accessible. But the tcp_md5sig_pool pointer comes from alloc_percpu which can return memory from the vmalloc area after the pcpu_first_chunk is exhausted. This looks wrong to me. I'm am getting crashes on mips and I believe this to be the cause.

Thinking about this more if the issue really is sg_init_one assumes a
directly accessible memory region, can we just modify the zone
allocation to GFP_DMA using alloc_percpu_gfp()? Does this satisfy the
assumptions made by sg_init_one?

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 1bec4e7..6924320 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2889,7 +2889,7 @@ static void __tcp_alloc_md5sig_pool(void)
        int cpu;
        struct tcp_md5sig_pool __percpu *pool;

-       pool = alloc_percpu(struct tcp_md5sig_pool);
+       pool = alloc_percpu_gfp(struct tcp_md5sig_pool, GFP_DMA);
        if (!pool)
                return;

^ permalink raw reply related

* Careers via Adecco UK
From: Adecco UK @ 2014-10-22 23:37 UTC (permalink / raw)


Dear Expat,

Adecco is a recruitment provider that creates the opportunity for you
to live in the UK and work with some of the most exciting companies
that can take your career to the next level.

Our recruitment is vast, so regardless of your level of education and
industry, we will have your Resume forwarded to the appropriate
companies and considered for various openings.

Please send us your Resume in reply to this notice and let us help you
find a better job.

Regards,

Arlo Colston
Snr. Recruitment Specialist
Adecco UK

^ permalink raw reply

* [PATCH 1/2] net: dsa: Error out on tagging protocol mismatches
From: Andrew Lunn @ 2014-10-22 23:35 UTC (permalink / raw)
  To: davem; +Cc: netdev, Andrew Lunn, alexander.h.duyck
In-Reply-To: <1414020918-20903-1-git-send-email-andrew@lunn.ch>

If there is a mismatch between enabled tagging protocols and the
protocol the switch supports, error out, rather than continue with a
situation which is unlikely to work.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
cc: alexander.h.duyck@intel.com
---
 net/dsa/dsa.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index 22f34cf4cb27..8a31bd81a315 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -175,7 +175,8 @@ dsa_switch_setup(struct dsa_switch_tree *dst, int index,
 			break;
 #endif
 		default:
-			break;
+			ret = -ENOPROTOOPT;
+			goto out;
 		}
 
 		dst->tag_protocol = drv->tag_protocol;
-- 
2.1.1

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox