netdev.vger.kernel.org archive mirror
* GRO with non napi driver: BUG in __napi_complete
@ 2009-03-17  8:35 Frank Blaschka
  2009-03-17 10:22 ` Herbert Xu
  2009-03-17 15:49 ` Jean-Pascal Billaud
  0 siblings, 2 replies; 6+ messages in thread
From: Frank Blaschka @ 2009-03-17  8:35 UTC (permalink / raw)
  To: netdev; +Cc: David Miller, Herbert Xu

Hi,

I am trying to enable GRO on a non-NAPI driver (2.6.29-rc8). Running an iperf
test triggers a BUG in __napi_complete.

kernel BUG at net/core/dev.c:2625!
illegal operation: 0001 [#1] PREEMPT SMP
Modules linked in:
CPU: 1 Not tainted 2.6.29-rc8-00124-g5bee17f #8
Process swapper (pid: 0, task: 000000002ff7ccc0, ksp: 000000002ff97d48)
Krnl PSW : 0404d00180000000 00000000002d1f4e (__napi_complete+0x82/0x88)
           R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 EA:3
...
[  261.377396] Call Trace:
[  261.377400] ([<00000000002cb752>] process_backlog+0xba/0x104)
[  261.377410]  [<00000000002cb5ba>] net_rx_action+0x102/0x1e0
[  261.377418]  [<000000000004921e>] __do_softirq+0x92/0x168
[  261.377428]  [<0000000000020936>] do_softirq+0x96/0xb0
[  261.377436]  [<00000000000493c0>] irq_exit+0x70/0x80
[  261.377444]  [<000000000025789c>] do_IRQ+0x174/0x194
[  261.377455]  [<00000000000258da>] io_return+0x0/0x8
[  261.377464]  [<00000000000246fe>] vtime_stop_cpu+0xb2/0xc0
[  261.377473] ([<00000007005b1007>] 0x7005b1007)

Why does process_backlog call __napi_complete() instead of
napi_complete()? This looks suspicious to me. Can anybody help?

Thanks,

Frank



* Re: GRO with non napi driver: BUG in __napi_complete
  2009-03-17  8:35 GRO with non napi driver: BUG in __napi_complete Frank Blaschka
@ 2009-03-17 10:22 ` Herbert Xu
  2009-03-17 20:11   ` David Miller
  2009-03-27 19:05   ` Tom Herbert
  2009-03-17 15:49 ` Jean-Pascal Billaud
  1 sibling, 2 replies; 6+ messages in thread
From: Herbert Xu @ 2009-03-17 10:22 UTC (permalink / raw)
  To: Frank Blaschka; +Cc: netdev, David Miller

On Tue, Mar 17, 2009 at 09:35:21AM +0100, Frank Blaschka wrote:
>
> Why does process_backlog call __napi_complete() instead of
> napi_complete()? This looks suspicious to me. Can anybody help?

You're absolutely right.  Dave, we need this fix for both net
and net-next.

gro: Fix legacy path napi_complete crash

On the legacy netif_rx path, I incorrectly tried to optimise
the napi_complete call by using __napi_complete before we reenable
IRQs.  This simply doesn't work since we need to flush the held
GRO packets first.

This patch fixes it by doing the obvious thing of reenabling
IRQs first and then calling napi_complete.

Reported-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

diff --git a/net/core/dev.c b/net/core/dev.c
index f112970..2565f6d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2588,9 +2588,9 @@ static int process_backlog(struct napi_struct *napi, int quota)
 		local_irq_disable();
 		skb = __skb_dequeue(&queue->input_pkt_queue);
 		if (!skb) {
-			__napi_complete(napi);
 			local_irq_enable();
-			break;
+			napi_complete(napi);
+			goto out;
 		}
 		local_irq_enable();
 
@@ -2599,6 +2599,7 @@ static int process_backlog(struct napi_struct *napi, int quota)
 
 	napi_gro_flush(napi);
 
+out:
 	return work;
 }
 
Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
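
The ordering problem described in the commit message above can be
illustrated with a small user-space model. This is only a sketch, not
kernel code: struct toy_napi and all toy_* functions are hypothetical
stand-ins. It assumes that __napi_complete refuses to run while GRO still
holds packets, which the BUG at net/core/dev.c:2625 suggests but the thread
does not quote.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct napi_struct; field names are illustrative only. */
struct toy_napi {
	bool scheduled;  /* models the "poll is scheduled" state */
	int  gro_held;   /* packets GRO is still holding back    */
	int  delivered;  /* packets already passed up the stack  */
};

/* Models napi_gro_flush(): pass every held packet up the stack. */
static void toy_gro_flush(struct toy_napi *n)
{
	assert(n->gro_held >= 0);  /* sanity check on the model itself */
	n->delivered += n->gro_held;
	n->gro_held = 0;
}

/*
 * Models __napi_complete(). Returns false where, under the assumption
 * stated above, the kernel would hit its BUG check: completing while
 * not scheduled, or while GRO packets are still held.
 */
static bool toy_napi_complete_raw(struct toy_napi *n)
{
	if (!n->scheduled || n->gro_held != 0)
		return false;
	n->scheduled = false;
	return true;
}

/* Models napi_complete(): flush the held GRO packets, then complete. */
static bool toy_napi_complete(struct toy_napi *n)
{
	toy_gro_flush(n);
	return toy_napi_complete_raw(n);
}
```

In this model, calling toy_napi_complete_raw() with gro_held != 0 fails,
while toy_napi_complete() succeeds because the flush runs first. That is
the difference the patch relies on.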


* RE: GRO with non napi driver: BUG in __napi_complete
  2009-03-17  8:35 GRO with non napi driver: BUG in __napi_complete Frank Blaschka
  2009-03-17 10:22 ` Herbert Xu
@ 2009-03-17 15:49 ` Jean-Pascal Billaud
  1 sibling, 0 replies; 6+ messages in thread
From: Jean-Pascal Billaud @ 2009-03-17 15:49 UTC (permalink / raw)
  To: Frank Blaschka, netdev@vger.kernel.org; +Cc: David Miller, Herbert Xu

More generally, is there still any point in supporting non-NAPI drivers? Shouldn't we have a single way to process traffic at the driver level?

--jp

> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-
> owner@vger.kernel.org] On Behalf Of Frank Blaschka
> Sent: Tuesday, March 17, 2009 1:35 AM
> To: netdev@vger.kernel.org
> Cc: David Miller; Herbert Xu
> Subject: GRO with non napi driver: BUG in __napi_complete
> 
> Hi,
> 
> I am trying to enable GRO on a non-NAPI driver (2.6.29-rc8). Running an
> iperf test triggers a BUG in __napi_complete.
> 
> kernel BUG at net/core/dev.c:2625!
> illegal operation: 0001 [#1] PREEMPT SMP
> Modules linked in:
> CPU: 1 Not tainted 2.6.29-rc8-00124-g5bee17f #8
> Process swapper (pid: 0, task: 000000002ff7ccc0, ksp: 000000002ff97d48)
> Krnl PSW : 0404d00180000000 00000000002d1f4e (__napi_complete+0x82/0x88)
>            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 EA:3
> ...
> [  261.377396] Call Trace:
> [  261.377400] ([<00000000002cb752>] process_backlog+0xba/0x104)
> [  261.377410]  [<00000000002cb5ba>] net_rx_action+0x102/0x1e0
> [  261.377418]  [<000000000004921e>] __do_softirq+0x92/0x168
> [  261.377428]  [<0000000000020936>] do_softirq+0x96/0xb0
> [  261.377436]  [<00000000000493c0>] irq_exit+0x70/0x80
> [  261.377444]  [<000000000025789c>] do_IRQ+0x174/0x194
> [  261.377455]  [<00000000000258da>] io_return+0x0/0x8
> [  261.377464]  [<00000000000246fe>] vtime_stop_cpu+0xb2/0xc0
> [  261.377473] ([<00000007005b1007>] 0x7005b1007)
> 
> Why does process_backlog call __napi_complete() instead of
> napi_complete()? This looks suspicious to me. Can anybody help?
> 
> Thanks,
> 
> Frank
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: GRO with non napi driver: BUG in __napi_complete
  2009-03-17 10:22 ` Herbert Xu
@ 2009-03-17 20:11   ` David Miller
  2009-03-27 19:05   ` Tom Herbert
  1 sibling, 0 replies; 6+ messages in thread
From: David Miller @ 2009-03-17 20:11 UTC (permalink / raw)
  To: herbert; +Cc: blaschka, netdev

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Tue, 17 Mar 2009 18:22:44 +0800

> On Tue, Mar 17, 2009 at 09:35:21AM +0100, Frank Blaschka wrote:
> >
> > Why does process_backlog call __napi_complete() instead of
> > napi_complete()? This looks suspicious to me. Can anybody help?
> 
> You're absolutely right.  Dave, we need this fix for both net
> and net-next.
> 
> gro: Fix legacy path napi_complete crash
> 
> On the legacy netif_rx path, I incorrectly tried to optimise
> the napi_complete call by using __napi_complete before we reenable
> IRQs.  This simply doesn't work since we need to flush the held
> GRO packets first.
> 
> This patch fixes it by doing the obvious thing of reenabling
> IRQs first and then calling napi_complete.
> 
> Reported-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Applied, thanks.


* Re: GRO with non napi driver: BUG in __napi_complete
  2009-03-17 10:22 ` Herbert Xu
  2009-03-17 20:11   ` David Miller
@ 2009-03-27 19:05   ` Tom Herbert
  2009-03-27 22:50     ` David Miller
  1 sibling, 1 reply; 6+ messages in thread
From: Tom Herbert @ 2009-03-27 19:05 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Frank Blaschka, netdev, David Miller

> On the legacy netif_rx path, I incorrectly tried to optimise
> the napi_complete call by using __napi_complete before we reenable
> IRQs.  This simply doesn't work since we need to flush the held
> GRO packets first.
>
> This patch fixes it by doing the obvious thing of reenabling
> IRQs first and then calling napi_complete.

Does this fix generate a race condition for a non-NAPI device?  If
netif_rx runs immediately after local_irq_enable it would queue a
packet on the backlog queue and try to schedule napi (the latter has
no effect because napi has not completed).  On return from interrupt,
napi_complete is done leaving a packet in the input queue but napi is
not scheduled to process it.

Thanks,
Tom
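
The interleaving Tom describes above can be replayed deterministically in a
small user-space model. This is a sketch under stated assumptions, not
kernel code: struct toy_backlog and the toy_* functions are hypothetical,
and an interrupt is modelled as a plain function call made at the point
where IRQs become enabled. The second exit variant only shows an ordering
that closes the window in this model; it ignores the GRO flush and is not
claimed to be what later versions of the fix do.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy per-CPU backlog state; field names are illustrative only. */
struct toy_backlog {
	int  queued;     /* packets sitting on the input queue   */
	bool scheduled;  /* the poll function is set to run again */
};

/* Models netif_rx() in interrupt context: enqueue, then try to schedule. */
static void toy_netif_rx(struct toy_backlog *b)
{
	assert(b->queued >= 0);  /* sanity check on the model itself */
	b->queued++;
	if (!b->scheduled)       /* scheduling is a no-op if already set */
		b->scheduled = true;
}

/* Empty-queue exit as in the posted patch: enable IRQs, then complete. */
static void toy_exit_patched(struct toy_backlog *b, bool irq_in_window)
{
	/* local_irq_enable() has run: an interrupt may fire right here */
	if (irq_in_window)
		toy_netif_rx(b);
	b->scheduled = false;    /* napi_complete() clears the state */
}

/* Alternative ordering: complete while IRQs are still disabled. */
static void toy_exit_irqs_off(struct toy_backlog *b, bool irq_pending)
{
	b->scheduled = false;    /* completion under local_irq_disable() */
	/* local_irq_enable(): a pending interrupt is delivered only now */
	if (irq_pending)
		toy_netif_rx(b);
}

/* A packet is stranded when it is queued but nothing will poll for it. */
static bool toy_stranded(const struct toy_backlog *b)
{
	return b->queued > 0 && !b->scheduled;
}
```

With an interrupt in the window, toy_exit_patched() leaves one packet
queued and nothing scheduled, which is the stranded packet from the
question. toy_exit_irqs_off() ends with the packet queued and a poll
scheduled.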

>
> Reported-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index f112970..2565f6d 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2588,9 +2588,9 @@ static int process_backlog(struct napi_struct *napi, int quota)
>                local_irq_disable();
>                skb = __skb_dequeue(&queue->input_pkt_queue);
>                if (!skb) {
> -                       __napi_complete(napi);
>                        local_irq_enable();
> -                       break;
> +                       napi_complete(napi);
> +                       goto out;
>                }
>                local_irq_enable();
>
> @@ -2599,6 +2599,7 @@ static int process_backlog(struct napi_struct *napi, int quota)
>
>        napi_gro_flush(napi);
>
> +out:
>        return work;
>  }
>
> Thanks,
> --
> Visit Openswan at http://www.openswan.org/
> Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: GRO with non napi driver: BUG in __napi_complete
  2009-03-27 19:05   ` Tom Herbert
@ 2009-03-27 22:50     ` David Miller
  0 siblings, 0 replies; 6+ messages in thread
From: David Miller @ 2009-03-27 22:50 UTC (permalink / raw)
  To: therbert; +Cc: herbert, blaschka, netdev

From: Tom Herbert <therbert@google.com>
Date: Fri, 27 Mar 2009 12:05:30 -0700

> > On the legacy netif_rx path, I incorrectly tried to optimise
> > the napi_complete call by using __napi_complete before we reenable
> > IRQs.  This simply doesn't work since we need to flush the held
> > GRO packets first.
> >
> > This patch fixes it by doing the obvious thing of reenabling
> > IRQs first and then calling napi_complete.
> 
> Does this fix generate a race condition for a non-NAPI device?  If
> netif_rx runs immediately after local_irq_enable it would queue a
> packet on the backlog queue and try to schedule napi (the latter has
> no effect because napi has not completed).  On return from interrupt,
> napi_complete is done leaving a packet in the input queue but napi is
> not scheduled to process it.

Yes, we know this version of Herbert's patch has that problem. Read
the rest of the thread and the subsequent versions of the fix.
