* [PATCH][net-next] gianfar: Simplify MQ polling to avoid soft lockup
@ 2013-10-14 14:05 Claudiu Manoil
2013-10-14 14:34 ` Eric Dumazet
2013-10-18 19:55 ` David Miller
0 siblings, 2 replies; 8+ messages in thread
From: Claudiu Manoil @ 2013-10-14 14:05 UTC
To: netdev; +Cc: David S. Miller
Under certain low traffic conditions, single-core
devices with multiple Rx/Tx queues (MQ mode) may hit a
soft lockup due to gfar_poll() not returning in a timely manner.
The following exception was obtained using iperf on a 100Mbit
half-duplex link, on a p1010 single-core device:
BUG: soft lockup - CPU#0 stuck for 23s! [iperf:2847]
Modules linked in:
CPU: 0 PID: 2847 Comm: iperf Not tainted 3.12.0-rc3 #16
task: e8bf8000 ti: eeb16000 task.ti: ee646000
NIP: c0255b6c LR: c0367ae8 CTR: c0461c18
REGS: eeb17e70 TRAP: 0901 Not tainted (3.12.0-rc3)
MSR: 00029000 <CE,EE,ME> CR: 44228428 XER: 20000000
GPR00: c0367ad4 eeb17f20 e8bf8000 ee01f4b4 00000008 ffffffff ffffffff 00000000
GPR08: 000000c0 00000008 000000ff ffffffc0 000193fe
NIP [c0255b6c] find_next_bit+0xb8/0xc4
LR [c0367ae8] gfar_poll+0xc8/0x1d8
Call Trace:
[eeb17f20] [c0367ad4] gfar_poll+0xb4/0x1d8 (unreliable)
[eeb17f70] [c0422100] net_rx_action+0xa4/0x158
[eeb17fa0] [c003ec6c] __do_softirq+0xcc/0x17c
[eeb17ff0] [c000c28c] call_do_softirq+0x24/0x3c
[ee647cc0] [c0004660] do_softirq+0x6c/0x94
[ee647ce0] [c003eb9c] local_bh_enable+0x9c/0xa0
[ee647cf0] [c0454fe8] tcp_prequeue_process+0xa4/0xdc
[ee647d10] [c0457e44] tcp_recvmsg+0x498/0x96c
[ee647d80] [c047b630] inet_recvmsg+0x40/0x64
[ee647da0] [c040ca8c] sock_recvmsg+0x90/0xc0
[ee647e30] [c040edb8] SyS_recvfrom+0x98/0xfc
To prevent this, the outer while() loop has been removed,
allowing gfar_poll() to return faster even if there's
still budget left. Also, there's no longer any need to recompute
the budget per Rx queue.
Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
---
drivers/net/ethernet/freescale/gianfar.c | 87 ++++++++++++++------------------
1 file changed, 38 insertions(+), 49 deletions(-)
diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c
index 9fbe4dd..d6d810c 100644
--- a/drivers/net/ethernet/freescale/gianfar.c
+++ b/drivers/net/ethernet/freescale/gianfar.c
@@ -2918,7 +2918,7 @@ static int gfar_poll(struct napi_struct *napi, int budget)
struct gfar_priv_rx_q *rx_queue = NULL;
int work_done = 0, work_done_per_q = 0;
int i, budget_per_q = 0;
- int has_tx_work;
+ int has_tx_work = 0;
unsigned long rstat_rxf;
int num_act_queues;
@@ -2933,62 +2933,51 @@ static int gfar_poll(struct napi_struct *napi, int budget)
if (num_act_queues)
budget_per_q = budget/num_act_queues;
- while (1) {
- has_tx_work = 0;
- for_each_set_bit(i, &gfargrp->tx_bit_map, priv->num_tx_queues) {
- tx_queue = priv->tx_queue[i];
- /* run Tx cleanup to completion */
- if (tx_queue->tx_skbuff[tx_queue->skb_dirtytx]) {
- gfar_clean_tx_ring(tx_queue);
- has_tx_work = 1;
- }
+ for_each_set_bit(i, &gfargrp->tx_bit_map, priv->num_tx_queues) {
+ tx_queue = priv->tx_queue[i];
+ /* run Tx cleanup to completion */
+ if (tx_queue->tx_skbuff[tx_queue->skb_dirtytx]) {
+ gfar_clean_tx_ring(tx_queue);
+ has_tx_work = 1;
}
+ }
- for_each_set_bit(i, &gfargrp->rx_bit_map, priv->num_rx_queues) {
- /* skip queue if not active */
- if (!(rstat_rxf & (RSTAT_CLEAR_RXF0 >> i)))
- continue;
-
- rx_queue = priv->rx_queue[i];
- work_done_per_q =
- gfar_clean_rx_ring(rx_queue, budget_per_q);
- work_done += work_done_per_q;
-
- /* finished processing this queue */
- if (work_done_per_q < budget_per_q) {
- /* clear active queue hw indication */
- gfar_write(&regs->rstat,
- RSTAT_CLEAR_RXF0 >> i);
- rstat_rxf &= ~(RSTAT_CLEAR_RXF0 >> i);
- num_act_queues--;
-
- if (!num_act_queues)
- break;
- /* recompute budget per Rx queue */
- budget_per_q =
- (budget - work_done) / num_act_queues;
- }
- }
+ for_each_set_bit(i, &gfargrp->rx_bit_map, priv->num_rx_queues) {
+ /* skip queue if not active */
+ if (!(rstat_rxf & (RSTAT_CLEAR_RXF0 >> i)))
+ continue;
- if (work_done >= budget)
- break;
+ rx_queue = priv->rx_queue[i];
+ work_done_per_q =
+ gfar_clean_rx_ring(rx_queue, budget_per_q);
+ work_done += work_done_per_q;
+
+ /* finished processing this queue */
+ if (work_done_per_q < budget_per_q) {
+ /* clear active queue hw indication */
+ gfar_write(&regs->rstat,
+ RSTAT_CLEAR_RXF0 >> i);
+ num_act_queues--;
+
+ if (!num_act_queues)
+ break;
+ }
+ }
- if (!num_act_queues && !has_tx_work) {
+ if (!num_act_queues && !has_tx_work) {
- napi_complete(napi);
+ napi_complete(napi);
- /* Clear the halt bit in RSTAT */
- gfar_write(&regs->rstat, gfargrp->rstat);
+ /* Clear the halt bit in RSTAT */
+ gfar_write(&regs->rstat, gfargrp->rstat);
- gfar_write(&regs->imask, IMASK_DEFAULT);
+ gfar_write(&regs->imask, IMASK_DEFAULT);
- /* If we are coalescing interrupts, update the timer
- * Otherwise, clear it
- */
- gfar_configure_coalescing(priv, gfargrp->rx_bit_map,
- gfargrp->tx_bit_map);
- break;
- }
+ /* If we are coalescing interrupts, update the timer
+ * Otherwise, clear it
+ */
+ gfar_configure_coalescing(priv, gfargrp->rx_bit_map,
+ gfargrp->tx_bit_map);
}
return work_done;
--
1.7.11.7
* Re: [PATCH][net-next] gianfar: Simplify MQ polling to avoid soft lockup
2013-10-14 14:05 [PATCH][net-next] gianfar: Simplify MQ polling to avoid soft lockup Claudiu Manoil
@ 2013-10-14 14:34 ` Eric Dumazet
2013-10-14 15:11 ` Claudiu Manoil
2013-10-18 19:55 ` David Miller
1 sibling, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2013-10-14 14:34 UTC
To: Claudiu Manoil; +Cc: netdev, David S. Miller
On Mon, 2013-10-14 at 17:05 +0300, Claudiu Manoil wrote:
> Under certain low traffic conditions, single-core
> devices with multiple Rx/Tx queues (MQ mode) may hit a
> soft lockup due to gfar_poll() not returning in a timely manner.
> The following exception was obtained using iperf on a 100Mbit
> half-duplex link, on a p1010 single-core device:
>
> BUG: soft lockup - CPU#0 stuck for 23s! [iperf:2847]
> Modules linked in:
> CPU: 0 PID: 2847 Comm: iperf Not tainted 3.12.0-rc3 #16
> task: e8bf8000 ti: eeb16000 task.ti: ee646000
> NIP: c0255b6c LR: c0367ae8 CTR: c0461c18
> REGS: eeb17e70 TRAP: 0901 Not tainted (3.12.0-rc3)
> MSR: 00029000 <CE,EE,ME> CR: 44228428 XER: 20000000
>
> GPR00: c0367ad4 eeb17f20 e8bf8000 ee01f4b4 00000008 ffffffff ffffffff 00000000
> GPR08: 000000c0 00000008 000000ff ffffffc0 000193fe
> NIP [c0255b6c] find_next_bit+0xb8/0xc4
> LR [c0367ae8] gfar_poll+0xc8/0x1d8
> Call Trace:
> [eeb17f20] [c0367ad4] gfar_poll+0xb4/0x1d8 (unreliable)
> [eeb17f70] [c0422100] net_rx_action+0xa4/0x158
> [eeb17fa0] [c003ec6c] __do_softirq+0xcc/0x17c
> [eeb17ff0] [c000c28c] call_do_softirq+0x24/0x3c
> [ee647cc0] [c0004660] do_softirq+0x6c/0x94
> [ee647ce0] [c003eb9c] local_bh_enable+0x9c/0xa0
> [ee647cf0] [c0454fe8] tcp_prequeue_process+0xa4/0xdc
> [ee647d10] [c0457e44] tcp_recvmsg+0x498/0x96c
> [ee647d80] [c047b630] inet_recvmsg+0x40/0x64
> [ee647da0] [c040ca8c] sock_recvmsg+0x90/0xc0
> [ee647e30] [c040edb8] SyS_recvfrom+0x98/0xfc
>
> To prevent this, the outer while() loop has been removed,
> allowing gfar_poll() to return faster even if there's
> still budget left. Also, there's no longer any need to recompute
> the budget per Rx queue.
It seems there is a race condition, and this patch only makes it happen
less often ?
return faster means what exactly ?
* Re: [PATCH][net-next] gianfar: Simplify MQ polling to avoid soft lockup
2013-10-14 14:34 ` Eric Dumazet
@ 2013-10-14 15:11 ` Claudiu Manoil
2014-03-27 12:53 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 8+ messages in thread
From: Claudiu Manoil @ 2013-10-14 15:11 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev, David S. Miller
On 10/14/2013 5:34 PM, Eric Dumazet wrote:
> On Mon, 2013-10-14 at 17:05 +0300, Claudiu Manoil wrote:
>> Under certain low traffic conditions, single-core
>> devices with multiple Rx/Tx queues (MQ mode) may hit a
>> soft lockup due to gfar_poll() not returning in a timely manner.
>> The following exception was obtained using iperf on a 100Mbit
>> half-duplex link, on a p1010 single-core device:
>>
>> BUG: soft lockup - CPU#0 stuck for 23s! [iperf:2847]
>> Modules linked in:
>> CPU: 0 PID: 2847 Comm: iperf Not tainted 3.12.0-rc3 #16
>> task: e8bf8000 ti: eeb16000 task.ti: ee646000
>> NIP: c0255b6c LR: c0367ae8 CTR: c0461c18
>> REGS: eeb17e70 TRAP: 0901 Not tainted (3.12.0-rc3)
>> MSR: 00029000 <CE,EE,ME> CR: 44228428 XER: 20000000
>>
>> GPR00: c0367ad4 eeb17f20 e8bf8000 ee01f4b4 00000008 ffffffff ffffffff 00000000
>> GPR08: 000000c0 00000008 000000ff ffffffc0 000193fe
>> NIP [c0255b6c] find_next_bit+0xb8/0xc4
>> LR [c0367ae8] gfar_poll+0xc8/0x1d8
>> Call Trace:
>> [eeb17f20] [c0367ad4] gfar_poll+0xb4/0x1d8 (unreliable)
>> [eeb17f70] [c0422100] net_rx_action+0xa4/0x158
>> [eeb17fa0] [c003ec6c] __do_softirq+0xcc/0x17c
>> [eeb17ff0] [c000c28c] call_do_softirq+0x24/0x3c
>> [ee647cc0] [c0004660] do_softirq+0x6c/0x94
>> [ee647ce0] [c003eb9c] local_bh_enable+0x9c/0xa0
>> [ee647cf0] [c0454fe8] tcp_prequeue_process+0xa4/0xdc
>> [ee647d10] [c0457e44] tcp_recvmsg+0x498/0x96c
>> [ee647d80] [c047b630] inet_recvmsg+0x40/0x64
>> [ee647da0] [c040ca8c] sock_recvmsg+0x90/0xc0
>> [ee647e30] [c040edb8] SyS_recvfrom+0x98/0xfc
>>
>> To prevent this, the outer while() loop has been removed,
>> allowing gfar_poll() to return faster even if there's
>> still budget left. Also, there's no longer any need to recompute
>> the budget per Rx queue.
>
> It seems there is a race condition, and this patch only makes it happen
> less often ?
>
> return faster means what exactly ?
>
Hi Eric,
Because of the outer while() loop, gfar_poll() may not return
as long as there is continuous Tx work. The new implementation
of gfar_poll() allows only one iteration over the Tx queues before
returning control to net_rx_action(); that's what I meant by
"returns faster".
I tested this fix with different loads, and the soft lockup
didn't trigger (without the fix it triggers right away).
Besides, isn't this a more appropriate NAPI poll implementation
than the former one with the outer while() loop?
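To illustrate what the single-pass poll looks like, here is a minimal
sketch (not the literal driver code; the helpers marked "hypothetical"
stand in for the rstat/bitmap handling visible in the patch):

/* Single-pass poll: clean each Tx queue once, give each active Rx
 * queue a fixed share of the budget, then return to net_rx_action().
 */
static int gfar_poll_sketch(struct napi_struct *napi, int budget)
{
	unsigned long rstat_rxf = read_rx_status();	/* hypothetical */
	int num_act_queues = hweight_long(rstat_rxf);
	int budget_per_q = 0, work_done = 0, has_tx_work = 0;
	int i;

	if (num_act_queues)
		budget_per_q = budget / num_act_queues;

	/* one iteration over the Tx queues -- no outer while (1) */
	for (i = 0; i < num_tx_queues; i++)		/* hypothetical */
		has_tx_work |= clean_tx_ring(i);	/* hypothetical */

	/* one iteration over the active Rx queues */
	for (i = 0; i < num_rx_queues; i++) {		/* hypothetical */
		int n;

		if (!(rstat_rxf & (1UL << i)))		/* queue inactive */
			continue;
		n = clean_rx_ring(i, budget_per_q);	/* hypothetical */
		work_done += n;
		if (n < budget_per_q)			/* queue drained */
			num_act_queues--;
	}

	if (!num_act_queues && !has_tx_work)
		napi_complete(napi);	/* re-enable device interrupts */

	/* Returning promptly is safe: if work_done == budget,
	 * net_rx_action() keeps us on the poll list and calls us again.
	 */
	return work_done;
}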
Thanks,
Claudiu
* Re: [PATCH][net-next] gianfar: Simplify MQ polling to avoid soft lockup
2013-10-14 15:11 ` Claudiu Manoil
@ 2014-03-27 12:53 ` Sebastian Andrzej Siewior
2014-03-28 8:19 ` Claudiu Manoil
0 siblings, 1 reply; 8+ messages in thread
From: Sebastian Andrzej Siewior @ 2014-03-27 12:53 UTC
To: Claudiu Manoil; +Cc: Eric Dumazet, netdev, David S. Miller
On 2013-10-14 18:11:15 [+0300], Claudiu Manoil wrote:
> >>BUG: soft lockup - CPU#0 stuck for 23s! [iperf:2847]
> >>NIP [c0255b6c] find_next_bit+0xb8/0xc4
> >>LR [c0367ae8] gfar_poll+0xc8/0x1d8
> >It seems there is a race condition, and this patch only makes it happen
> >less often ?
> >
> >return faster means what exactly ?
> >
>
> Hi Eric,
> Because of the outer while() loop, gfar_poll() may not return
> as long as there is continuous Tx work. The new implementation
> of gfar_poll() allows only one iteration over the Tx queues before
> returning control to net_rx_action(); that's what I meant by
> "returns faster".
We are talking here about 23 seconds of cleanup. RX is limited by NAPI
and TX is limited because it can't be refilled on your UP system.
Does your box recover from this condition without this patch? Mine does
not. But I run -RT and stumbled upon something different.
What I observe is that the TX queue is not empty but does not make any
progress. That means tx_queue->tx_skbuff[tx_queue->skb_dirtytx] is true
and gfar_clean_tx_ring() cleans up zero packets because the transfer is
not yet complete.
My problem is that when gfar_start_xmit() is preempted after
tx_queue->tx_skbuff[tx_queue->skb_curtx] is set, but before the DMA is
started, the NAPI poll never completes: it sees a packet that never
completes, because the DMA engine has not started yet and won't.
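Schematically, the window looks like this (a hand sketch, not the
driver source; struct tx_q and kick_dma_engine() are made-up stand-ins):

/* -RT race window: on mainline this section runs under
 * spin_lock_irqsave() and cannot be preempted; on -RT spinlocks are
 * sleeping locks, so preemption inside it is possible.
 */
static netdev_tx_t start_xmit_sketch(struct sk_buff *skb, struct tx_q *q)
{
	q->tx_skbuff[q->skb_curtx] = skb;   /* NAPI now sees a dirty slot */

	/* <-- -RT preemption point: a NAPI poll running here finds
	 * tx_skbuff[skb_dirtytx] set but cleans zero packets, since the
	 * descriptor was never handed to the hardware -- and with the
	 * old while (1) loop it never leaves gfar_poll().
	 */

	kick_dma_engine(q);                 /* DMA only starts here */
	return NETDEV_TX_OK;
}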
On non-RT SMP systems this isn't a big problem: on the first iteration
the DMA engine might be idle, but on the second the other CPU has most
likely started the DMA engine, and by the fifth the packet might be
gone, so you (finally) stop.
What happens on your slow link setup is probably the following:
You enqueue hundreds of packets which need TX cleanup. Since that link
is *that* slow, you spend a bunch of cycles calling gfar_clean_tx_ring()
with zero cleanup, and you can't leave the poll routine because there is
a TX skb not yet cleaned up. What amazes me is that it keeps your CPU
busy for as long as 23 seconds.
*IF* the link goes down in the middle of a cleanup you should see a
similar stall, because that TX packet won't leave the device so you
never clean up. So what happens here? Do you get an error interrupt
which purges that skb, or do you wait for ndo_tx_timeout()? One of
these two will save your ass but it ain't pretty.
To fix this properly, with something that works on both -RT and mainline,
I suggest reverting this patch and adding the following:
- do not set has_tx_work unless gfar_clean_tx_ring() cleaned up at least
one skb.
- take the TX cleanup into the NAPI accounting. I am not sure it is
realistic that one CPU fills the queue while the other cleans up
continuously, assuming a GBit link and small packets. However, this
should put a limit here.
which looks in C like this:
diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c
index 1799ff0..19192c4 100644
--- a/drivers/net/ethernet/freescale/gianfar.c
+++ b/drivers/net/ethernet/freescale/gianfar.c
@@ -132,7 +132,6 @@ static int gfar_poll(struct napi_struct *napi, int budget);
static void gfar_netpoll(struct net_device *dev);
#endif
int gfar_clean_rx_ring(struct gfar_priv_rx_q *rx_queue, int rx_work_limit);
-static void gfar_clean_tx_ring(struct gfar_priv_tx_q *tx_queue);
static void gfar_process_frame(struct net_device *dev, struct sk_buff *skb,
int amount_pull, struct napi_struct *napi);
void gfar_halt(struct net_device *dev);
@@ -2473,7 +2472,7 @@ static void gfar_align_skb(struct sk_buff *skb)
}
/* Interrupt Handler for Transmit complete */
-static void gfar_clean_tx_ring(struct gfar_priv_tx_q *tx_queue)
+static int gfar_clean_tx_ring(struct gfar_priv_tx_q *tx_queue)
{
struct net_device *dev = tx_queue->dev;
struct netdev_queue *txq;
@@ -2854,10 +2853,14 @@ static int gfar_poll(struct napi_struct *napi, int budget)
tx_queue = priv->tx_queue[i];
/* run Tx cleanup to completion */
if (tx_queue->tx_skbuff[tx_queue->skb_dirtytx]) {
- gfar_clean_tx_ring(tx_queue);
- has_tx_work = 1;
+ int ret;
+
+ ret = gfar_clean_tx_ring(tx_queue);
+ if (ret)
+ has_tx_work++;
}
}
+ work_done += has_tx_work;
for_each_set_bit(i, &gfargrp->rx_bit_map, priv->num_rx_queues) {
/* skip queue if not active */
> Thanks,
> Claudiu
Sebastian
* Re: [PATCH][net-next] gianfar: Simplify MQ polling to avoid soft lockup
2014-03-27 12:53 ` Sebastian Andrzej Siewior
@ 2014-03-28 8:19 ` Claudiu Manoil
2014-03-28 8:34 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 8+ messages in thread
From: Claudiu Manoil @ 2014-03-28 8:19 UTC (permalink / raw)
To: Sebastian Andrzej Siewior; +Cc: Eric Dumazet, netdev, David S. Miller
On 3/27/2014 2:53 PM, Sebastian Andrzej Siewior wrote:
> On 2013-10-14 18:11:15 [+0300], Claudiu Manoil wrote:
>>>> BUG: soft lockup - CPU#0 stuck for 23s! [iperf:2847]
>>>> NIP [c0255b6c] find_next_bit+0xb8/0xc4
>>>> LR [c0367ae8] gfar_poll+0xc8/0x1d8
>>> It seems there is a race condition, and this patch only makes it happen
>>> less often ?
>>>
>>> return faster means what exactly ?
>>>
>>
>> Hi Eric,
>> Because of the outer while() loop, gfar_poll() may not return
>> as long as there is continuous Tx work. The new implementation
>> of gfar_poll() allows only one iteration over the Tx queues before
>> returning control to net_rx_action(); that's what I meant by
>> "returns faster".
>
> We are talking here about 23 seconds of cleanup. RX is limited by NAPI
> and TX is limited because it can't be refilled on your UP system.
> Does your box recover from this condition without this patch? Mine does
> not. But I run -RT and stumbled upon something different.
>
> What I observe is that the TX queue is not empty but does not make any
> progress. That means tx_queue->tx_skbuff[tx_queue->skb_dirtytx] is true
> and gfar_clean_tx_ring() cleans up zero packets because the transfer is
> not yet complete.
>
> My problem is that when gfar_start_xmit() is preempted after
> tx_queue->tx_skbuff[tx_queue->skb_curtx] is set, but before the DMA is
> started, the NAPI poll never completes: it sees a packet that never
> completes, because the DMA engine has not started yet and won't.
False, that code section from start_xmit() cannot be preempted, because
it has spin_lock_irqsave()/restore() around it (unless you modified
your code). Will check though if on SMP, for some reason,
clean_tx_ring() enters with 0 skbs to clean.
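For reference, the pattern in question looks like this (sketch only,
not the exact driver code):

	unsigned long flags;

	spin_lock_irqsave(&tx_queue->txlock, flags);
	tx_queue->tx_skbuff[tx_queue->skb_curtx] = skb;
	/* ... set up the buffer descriptor, then start the DMA ... */
	spin_unlock_irqrestore(&tx_queue->txlock, flags);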
[...]
> To fix properly with something that works on -RT and mainline I suggest
> to revert this patch and add the following:
This patch cannot be reverted. (why would you?)
This patch fixes the issue from the description. I'm seeing no issues with
P1010 now (on any kind of traffic), and the openwrt/tp-link guys also
confirmed (on the powerpc list) that this patch addresses the issue on
their end.
If you encounter problems with the latest driver code, please submit a
proper issue description indicating the code base you're using and so
on. Also make sure that the problem you're seeing wasn't already fixed
by one of the latest gianfar fixes from net-next:
http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git
* Re: [PATCH][net-next] gianfar: Simplify MQ polling to avoid soft lockup
2014-03-28 8:19 ` Claudiu Manoil
@ 2014-03-28 8:34 ` Sebastian Andrzej Siewior
2014-03-28 9:46 ` Claudiu Manoil
0 siblings, 1 reply; 8+ messages in thread
From: Sebastian Andrzej Siewior @ 2014-03-28 8:34 UTC
To: Claudiu Manoil; +Cc: Eric Dumazet, netdev, David S. Miller
On 2014-03-28 10:19:07 [+0200], Claudiu Manoil wrote:
> >My problem is that when gfar_start_xmit() is preempted after
> >tx_queue->tx_skbuff[tx_queue->skb_curtx] is set, but before the DMA is
> >started, the NAPI poll never completes: it sees a packet that never
> >completes, because the DMA engine has not started yet and won't.
>
> False, that code section from start_xmit() cannot be preempted, because
> it has spin_lock_irqsave()/restore() around it (unless you modified
> your code). Will check though if on SMP, for some reason,
> clean_tx_ring() enters with 0 skbs to clean.
I said on -RT. On mainline it can't be preempted, as I said. If for
some reason you can't get your packet out (on a slow link, as in your
case) it will return with 0 cleanups.
This has been broken since c233cf4 ("gianfar: Fix tx napi polling"),
since you drop the return value.
> [...]
>
> >To fix this properly, with something that works on both -RT and mainline,
> >I suggest reverting this patch and adding the following:
>
> This patch cannot be reverted. (why would you?)
Because it does not fix a thing; it simply duct-tapes over the issue
that a TX transfer does not clean up a thing while you assume that it
did something. You have a budget reserved for RX cleanup which you do
not use up if possible. You simply do one loop and leave.
> This patch fixes the issue from description. I'm seeing no issues with
> P1010 now (on any kind of traffic), and the openwrt/tp-link guys also
> confirmed (on the powerpc list) that this patch addresses the issue on
> their end.
The fact that the stall is gone doesn't make it good, as you had no
idea why.
> If you encounter problems with the latest driver code, please submit a
> proper issue description indicating the code base you're using and so
> on. Also make sure that the problem you're seeing wasn't already fixed
> by one of the latest gianfar fixes from net-next:
> http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git
I pointed out _why_ you saw the stall, and the fix is to not loop
endlessly on TX cleanup of not-yet-transmitted packets. The removal of
the outer loop was not required. The issue has been present since
c233cf4, which made it into the kernel in v3.10.
Sebastian
* Re: [PATCH][net-next] gianfar: Simplify MQ polling to avoid soft lockup
2014-03-28 8:34 ` Sebastian Andrzej Siewior
@ 2014-03-28 9:46 ` Claudiu Manoil
0 siblings, 0 replies; 8+ messages in thread
From: Claudiu Manoil @ 2014-03-28 9:46 UTC
To: Sebastian Andrzej Siewior; +Cc: Eric Dumazet, netdev, David S. Miller
On 3/28/2014 10:34 AM, Sebastian Andrzej Siewior wrote:
> On 2014-03-28 10:19:07 [+0200], Claudiu Manoil wrote:
>>> My problem is that when gfar_start_xmit() is preempted after
>>> tx_queue->tx_skbuff[tx_queue->skb_curtx] is set, but before the DMA is
>>> started, the NAPI poll never completes: it sees a packet that never
>>> completes, because the DMA engine has not started yet and won't.
>>
>> False, that code section from start_xmit() cannot be preempted, because
>> it has spin_lock_irqsave()/restore() around it (unless you modified
>> your code). Will check though if on SMP, for some reason,
>> clean_tx_ring() enters with 0 skbs to clean.
>
> I said on -RT. On mainline it can't be preempted, as I said. If for
> some reason you can't get your packet out (on a slow link, as in your
> case) it will return with 0 cleanups.
> This has been broken since c233cf4 ("gianfar: Fix tx napi polling"),
> since you drop the return value.
>
>> [...]
>>
>>> To fix this properly, with something that works on both -RT and mainline,
>>> I suggest reverting this patch and adding the following:
>>
>> This patch cannot be reverted. (why would you?)
> Because it does not fix a thing; it simply duct-tapes over the issue
> that a TX transfer does not clean up a thing while you assume that it
> did something. You have a budget reserved for RX cleanup which you do
> not use up if possible. You simply do one loop and leave.
Your proposed fix doesn't fix the root cause either; it's just a
workaround that came late. Do you suggest consuming Rx budget for
Tx processing as a better workaround?
Note that the NAPI processing code has been changed in the meantime:
http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git
to address other issues (see aeb12c5ef7cb08d879af22fc0a56cab9e70689ea,
and 71ff9e3df7e1c5d3293af6b595309124e8c97412).
* Re: [PATCH][net-next] gianfar: Simplify MQ polling to avoid soft lockup
2013-10-14 14:05 [PATCH][net-next] gianfar: Simplify MQ polling to avoid soft lockup Claudiu Manoil
2013-10-14 14:34 ` Eric Dumazet
@ 2013-10-18 19:55 ` David Miller
1 sibling, 0 replies; 8+ messages in thread
From: David Miller @ 2013-10-18 19:55 UTC
To: claudiu.manoil; +Cc: netdev
From: Claudiu Manoil <claudiu.manoil@freescale.com>
Date: Mon, 14 Oct 2013 17:05:09 +0300
> Under certain low traffic conditions, single-core
> devices with multiple Rx/Tx queues (MQ mode) may hit a
> soft lockup due to gfar_poll() not returning in a timely manner.
> The following exception was obtained using iperf on a 100Mbit
> half-duplex link, on a p1010 single-core device:
...
> To prevent this, the outer while() loop has been removed,
> allowing gfar_poll() to return faster even if there's
> still budget left. Also, there's no longer any need to recompute
> the budget per Rx queue.
>
> Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Applied, thanks.