netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [ofa-general] NetEffect, iw_nes and kernel warning
       [not found] <497EF9AC.70104@poczta.onet.pl>
@ 2009-01-27 23:53 ` Roland Dreier
  2009-01-28  0:07   ` David Miller
  0 siblings, 1 reply; 16+ messages in thread
From: Roland Dreier @ 2009-01-27 23:53 UTC (permalink / raw)
  To: aluno3@poczta.onet.pl; +Cc: general, netdev

Interesting... looks like an unfortunate interaction with unclear
locking rules.  See below for full explanation.

BTW, what workload are you running to hit this?

I assume you have CONFIG_HIGHMEM set?

 > WARNING: at kernel/softirq.c:136 local_bh_enable+0x9b/0xa0()

I assume this is

	WARN_ON_ONCE(in_irq() || irqs_disabled());

The interesting parts of the stack trace seem to be (reversing the order
so the story makes sense):

 [<e8e3f815>] nes_netdev_start_xmit+0x815/0x8a0 [iw_nes]

nes_netdev_start_xmit() calls skb_linearize() for nonlinear skbs it
can't handle, which calls __pskb_pull_tail():

 [<c048982c>] __pskb_pull_tail+0x5c/0x2e0

__pskb_pull_tail() calls skb_copy_bits():

 [<c0489c05>] skb_copy_bits+0x155/0x290

At least in some cases, skb_copy_bits() calls kmap_skb_frag() and more
to the point kunmap_skb_frag(), which looks like:

	static inline void kunmap_skb_frag(void *vaddr)
	{
		kunmap_atomic(vaddr, KM_SKB_DATA_SOFTIRQ);
	#ifdef CONFIG_HIGHMEM
		local_bh_enable();
	#endif
	}

which leads to:

 [<c012a79b>] local_bh_enable+0x9b/0xa0

which hits the irqs_disabled() warning because iw_nes is using LLTX, and
nes_netdev_start_xmit() does:

	local_irq_save(flags);
	if (!spin_trylock(&nesnic->sq_lock)) {

at the very beginning.

The best solution is probably for iw_nes to stop using LLTX and use the
main netdev lock... but actually I still don't see how it's safe for a
net driver to call skb_linearize() from its transmit routine, since
there's a chance that that will unconditionally enable BHs?

 - R.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [ofa-general] NetEffect, iw_nes and kernel warning
  2009-01-27 23:53 ` [ofa-general] NetEffect, iw_nes and kernel warning Roland Dreier
@ 2009-01-28  0:07   ` David Miller
  2009-01-28  0:17     ` Stephen Hemminger
  2009-01-28 18:05     ` Roland Dreier
  0 siblings, 2 replies; 16+ messages in thread
From: David Miller @ 2009-01-28  0:07 UTC (permalink / raw)
  To: rdreier; +Cc: aluno3, general, netdev

From: Roland Dreier <rdreier@cisco.com>
Date: Tue, 27 Jan 2009 15:53:16 -0800

> but actually I still don't see how it's safe for a net driver to
> call skb_linearize() from its transmit routine, since there's a
> chance that that will unconditionally enable BHs?

It's simply not allowed.

dev_queue_xmit() at a higher level can do __skb_linearize()
because it does so before doing the rcu_read_lock_bh().

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [ofa-general] NetEffect, iw_nes and kernel warning
  2009-01-28  0:07   ` David Miller
@ 2009-01-28  0:17     ` Stephen Hemminger
  2009-01-28 16:36       ` Tung, Chien Tin
  2009-01-28 18:05     ` Roland Dreier
  1 sibling, 1 reply; 16+ messages in thread
From: Stephen Hemminger @ 2009-01-28  0:17 UTC (permalink / raw)
  To: David Miller; +Cc: rdreier, aluno3, general, netdev

On Tue, 27 Jan 2009 16:07:50 -0800 (PST)
David Miller <davem@davemloft.net> wrote:

> From: Roland Dreier <rdreier@cisco.com>
> Date: Tue, 27 Jan 2009 15:53:16 -0800
> 
> > but actually I still don't see how it's safe for a net driver to
> > call skb_linearize() from its transmit routine, since there's a
> > chance that that will unconditionally enable BHs?
> 
> It's simply not allowed.
> 
> dev_queue_xmit() at a higher level can do __skb_linearize()
> because it does so before doing the rcu_read_lock_bh().

If the device driver can't handle non-linear SKB's then it should
not set NETIF_F_SG.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [ofa-general] NetEffect, iw_nes and kernel warning
  2009-01-28  0:17     ` Stephen Hemminger
@ 2009-01-28 16:36       ` Tung, Chien Tin
  0 siblings, 0 replies; 16+ messages in thread
From: Tung, Chien Tin @ 2009-01-28 16:36 UTC (permalink / raw)
  To: Stephen Hemminger, David Miller
  Cc: netdev@vger.kernel.org, rdreier@cisco.com,
	general@lists.openfabrics.org

Thank you for all the feedback.  We are looking into the issue.

Chien

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [ofa-general] NetEffect, iw_nes and kernel warning
  2009-01-28  0:07   ` David Miller
  2009-01-28  0:17     ` Stephen Hemminger
@ 2009-01-28 18:05     ` Roland Dreier
  2009-01-28 19:05       ` Stephen Hemminger
  2009-01-30  6:57       ` Herbert Xu
  1 sibling, 2 replies; 16+ messages in thread
From: Roland Dreier @ 2009-01-28 18:05 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, general

 > > but actually I still don't see how it's safe for a net driver to
 > > call skb_linearize() from its transmit routine, since there's a
 > > chance that that will unconditionally enable BHs?
 > 
 > It's simply not allowed.
 > 
 > dev_queue_xmit() at a higher level can do __skb_linearize()
 > because it does so before doing the rcu_read_lock_bh().

OK, thanks... what confused me is that several other drivers also do
skb_linearize() in their hard_start_xmit method... eg bnx2x,
via-velocity, mv643xx_eth.  So there are several other lurking bugs to
deal with here I guess.

 - R.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [ofa-general] NetEffect, iw_nes and kernel warning
  2009-01-28 18:05     ` Roland Dreier
@ 2009-01-28 19:05       ` Stephen Hemminger
  2009-01-28 21:52         ` Roland Dreier
  2009-01-30  6:57       ` Herbert Xu
  1 sibling, 1 reply; 16+ messages in thread
From: Stephen Hemminger @ 2009-01-28 19:05 UTC (permalink / raw)
  To: Roland Dreier; +Cc: David Miller, aluno3, general, netdev

On Wed, 28 Jan 2009 10:05:29 -0800
Roland Dreier <rdreier@cisco.com> wrote:

>  > > but actually I still don't see how it's safe for a net driver to
>  > > call skb_linearize() from its transmit routine, since there's a
>  > > chance that that will unconditionally enable BHs?
>  > 
>  > It's simply not allowed.
>  > 
>  > dev_queue_xmit() at a higher level can do __skb_linearize()
>  > because it does so before doing the rcu_read_lock_bh().
> 
> OK, thanks... what confused me is that several other drivers also do
> skb_linearize() in their hard_start_xmit method... eg bnx2x,
> via-velocity, mv643xx_eth.  So there are several other lurking bugs to
> deal with here I guess.
> 
>  - R.

They all look like lurking (and untested) bug paths. mv643xx is especially
bad since it can leak skb. But it should be possible to call pull_tail
if bh is disabled (as long as irqs are enabled).

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [ofa-general] NetEffect, iw_nes and kernel warning
  2009-01-28 19:05       ` Stephen Hemminger
@ 2009-01-28 21:52         ` Roland Dreier
  0 siblings, 0 replies; 16+ messages in thread
From: Roland Dreier @ 2009-01-28 21:52 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, David Miller, general

 > > OK, thanks... what confused me is that several other drivers also do
 > > skb_linearize() in their hard_start_xmit method... eg bnx2x,
 > > via-velocity, mv643xx_eth.  So there are several other lurking bugs to
 > > deal with here I guess.

 > They all look like lurking (and untested) bug paths. mv643xx is especially
 > bad since it can leak skb. But it should be possible to call pull_tail
 > if bh is disabled (as long as irqs are enabled).

Yes.  The only obvious problem with __pskb_pull_tail() with BHs disabled
is that with CONFIG_HIGHMEM set, it goes into kmap_skb_frag(), which
then unconditionally does local_bh_disable()/local_bh_enable().  There's
no reason in principle that kmap_skb_frag() couldn't do
local_save_flags()/local_restore_flags() instead.

Just grepping around I see other potential issues related to this, for
example the (unused but exported) function fcoe_fc_crc() does
kmap_atomic(KM_SKB_DATA_SOFTIRQ) without any particular BH disabling,
which might run into trouble if used in the wrong context...

 - R.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [ofa-general] NetEffect, iw_nes and kernel warning
  2009-01-28 18:05     ` Roland Dreier
  2009-01-28 19:05       ` Stephen Hemminger
@ 2009-01-30  6:57       ` Herbert Xu
  2009-01-30  8:22         ` Eilon Greenstein
                           ` (2 more replies)
  1 sibling, 3 replies; 16+ messages in thread
From: Herbert Xu @ 2009-01-30  6:57 UTC (permalink / raw)
  To: Roland Dreier; +Cc: davem, aluno3, general, netdev

Roland Dreier <rdreier@cisco.com> wrote:
> 
> OK, thanks... what confused me is that several other drivers also do
> skb_linearize() in their hard_start_xmit method... eg bnx2x,
> via-velocity, mv643xx_eth.  So there are several other lurking bugs to
> deal with here I guess.

I don't know about the rest but bnx2x is certainly OK since it
only does so with IRQ enabled.  It is legal to call skb_linearize
as long as you're sure that IRQs are enabled, which is always the
case for hard_start_xmit upon entry.

So the only time you can't call it in hard_start_xmit is if you've
just disabled IRQs yourself.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [ofa-general] NetEffect, iw_nes and kernel warning
  2009-01-30  6:57       ` Herbert Xu
@ 2009-01-30  8:22         ` Eilon Greenstein
  2009-01-30 16:25         ` Stephen Hemminger
  2009-01-30 17:35         ` Roland Dreier
  2 siblings, 0 replies; 16+ messages in thread
From: Eilon Greenstein @ 2009-01-30  8:22 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Roland Dreier, davem@davemloft.net, aluno3@poczta.onet.pl,
	general@lists.openfabrics.org, netdev@vger.kernel.org

On Thu, 2009-01-29 at 22:57 -0800, Herbert Xu wrote:
> Roland Dreier <rdreier@cisco.com> wrote:
> > 
> > OK, thanks... what confused me is that several other drivers also do
> > skb_linearize() in their hard_start_xmit method... eg bnx2x,
> > via-velocity, mv643xx_eth.  So there are several other lurking bugs to
> > deal with here I guess.
> 
> I don't know about the rest but bnx2x is certainly OK since it
> only does so with IRQ enabled.  It is legal to call skb_linearize
> as long as you're sure that IRQs are enabled, which is always the
> case for hard_start_xmit upon entry.
> 
> So the only time you can't call it in hard_start_xmit is if you've
> just disabled IRQs yourself.
> 
Thanks Herbert,

That was my conclusion too and even though I was still looking at it
since this report yesterday - it still looks OK to me. The bnx2x is
getting into this flow when using SAMBA and it was tested on few systems
for days under traffic - this does not mean that the code is right and
it is not a prove that there is no bug - but it makes me feel better...

I would appreciate some comments if someone still thinks this is a bug.

Thanks,
Eilon



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [ofa-general] NetEffect, iw_nes and kernel warning
  2009-01-30  6:57       ` Herbert Xu
  2009-01-30  8:22         ` Eilon Greenstein
@ 2009-01-30 16:25         ` Stephen Hemminger
  2009-01-30 17:35         ` Roland Dreier
  2 siblings, 0 replies; 16+ messages in thread
From: Stephen Hemminger @ 2009-01-30 16:25 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Roland Dreier, davem, aluno3, general, netdev

On Fri, 30 Jan 2009 17:57:21 +1100
Herbert Xu <herbert@gondor.apana.org.au> wrote:

> Roland Dreier <rdreier@cisco.com> wrote:
> > 
> > OK, thanks... what confused me is that several other drivers also do
> > skb_linearize() in their hard_start_xmit method... eg bnx2x,
> > via-velocity, mv643xx_eth.  So there are several other lurking bugs to
> > deal with here I guess.
> 
> I don't know about the rest but bnx2x is certainly OK since it
> only does so with IRQ enabled.  It is legal to call skb_linearize
> as long as you're sure that IRQs are enabled, which is always the
> case for hard_start_xmit upon entry.
> 
> So the only time you can't call it in hard_start_xmit is if you've
> just disabled IRQs yourself.

Or netconsole.
netconsole calls start_xmit from IRQ 
but it is safe since netconsole doesn't send fragmented skb's

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [ofa-general] NetEffect, iw_nes and kernel warning
  2009-01-30  6:57       ` Herbert Xu
  2009-01-30  8:22         ` Eilon Greenstein
  2009-01-30 16:25         ` Stephen Hemminger
@ 2009-01-30 17:35         ` Roland Dreier
  2009-01-30 21:51           ` David Miller
  2 siblings, 1 reply; 16+ messages in thread
From: Roland Dreier @ 2009-01-30 17:35 UTC (permalink / raw)
  To: Herbert Xu; +Cc: netdev, aluno3, davem, general

 > > OK, thanks... what confused me is that several other drivers also do
 > > skb_linearize() in their hard_start_xmit method... eg bnx2x,
 > > via-velocity, mv643xx_eth.  So there are several other lurking bugs to
 > > deal with here I guess.
 > 
 > I don't know about the rest but bnx2x is certainly OK since it
 > only does so with IRQ enabled.  It is legal to call skb_linearize
 > as long as you're sure that IRQs are enabled, which is always the
 > case for hard_start_xmit upon entry.

I don't believe this is accurate.  Calling skb_linearize() (on a kernel
with CONFIG_HIGHMEM set) can end up calling local_bh_enable() in
kunmap_skb_frag(), which can obviously cause problems if the initial
context relies on having BHs disabled (as hard_start_xmit does).

 - R.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [ofa-general] NetEffect, iw_nes and kernel warning
  2009-01-30 17:35         ` Roland Dreier
@ 2009-01-30 21:51           ` David Miller
  2009-01-31  3:54             ` Roland Dreier
  0 siblings, 1 reply; 16+ messages in thread
From: David Miller @ 2009-01-30 21:51 UTC (permalink / raw)
  To: rdreier; +Cc: herbert, aluno3, general, netdev

From: Roland Dreier <rdreier@cisco.com>
Date: Fri, 30 Jan 2009 09:35:52 -0800

>  > > OK, thanks... what confused me is that several other drivers also do
>  > > skb_linearize() in their hard_start_xmit method... eg bnx2x,
>  > > via-velocity, mv643xx_eth.  So there are several other lurking bugs to
>  > > deal with here I guess.
>  > 
>  > I don't know about the rest but bnx2x is certainly OK since it
>  > only does so with IRQ enabled.  It is legal to call skb_linearize
>  > as long as you're sure that IRQs are enabled, which is always the
>  > case for hard_start_xmit upon entry.
> 
> I don't believe this is accurate.  Calling skb_linearize() (on a kernel
> with CONFIG_HIGHMEM set) can end up calling local_bh_enable() in
> kunmap_skb_frag(), which can obviously cause problems if the initial
> context relies on having BHs disabled (as hard_start_xmit does).

local_bh_{enable,disable}() nests, so this is not a problem


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [ofa-general] NetEffect, iw_nes and kernel warning
  2009-01-30 21:51           ` David Miller
@ 2009-01-31  3:54             ` Roland Dreier
  2009-04-21  9:09               ` Lennert Buytenhek
  0 siblings, 1 reply; 16+ messages in thread
From: Roland Dreier @ 2009-01-31  3:54 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, aluno3, herbert, general

 > > I don't believe this is accurate.  Calling skb_linearize() (on a kernel
 > > with CONFIG_HIGHMEM set) can end up calling local_bh_enable() in
 > > kunmap_skb_frag(), which can obviously cause problems if the initial
 > > context relies on having BHs disabled (as hard_start_xmit does).

 > local_bh_{enable,disable}() nests, so this is not a problem

Duh.  OK, then the only bugs seem to be that iw_nes does skb_linearize
with irqs off (due to being an LLTX driver), and mv643xx_eth leaks an
skb on its error path if skb_linearize fails.

 - R.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [ofa-general] NetEffect, iw_nes and kernel warning
  2009-01-31  3:54             ` Roland Dreier
@ 2009-04-21  9:09               ` Lennert Buytenhek
  2009-04-21 12:49                 ` Herbert Xu
  0 siblings, 1 reply; 16+ messages in thread
From: Lennert Buytenhek @ 2009-04-21  9:09 UTC (permalink / raw)
  To: Roland Dreier; +Cc: David Miller, herbert, aluno3, general, netdev

On Fri, Jan 30, 2009 at 07:54:12PM -0800, Roland Dreier wrote:

> > > I don't believe this is accurate.  Calling skb_linearize() (on a kernel
> > > with CONFIG_HIGHMEM set) can end up calling local_bh_enable() in
> > > kunmap_skb_frag(), which can obviously cause problems if the initial
> > > context relies on having BHs disabled (as hard_start_xmit does).
> >
> > local_bh_{enable,disable}() nests, so this is not a problem
> 
> Duh.  OK, then the only bugs seem to be that iw_nes does skb_linearize
> with irqs off (due to being an LLTX driver), and mv643xx_eth leaks an
> skb on its error path if skb_linearize fails.

(Found this when deleting old netdev@ mail...)  mv643xx_eth returns
NETDEV_TX_BUSY if skb_linearize fails, so the qdisc will requeue the
skb, and we shouldn't free it.  Am I missing something?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [ofa-general] NetEffect, iw_nes and kernel warning
  2009-04-21  9:09               ` Lennert Buytenhek
@ 2009-04-21 12:49                 ` Herbert Xu
  2009-04-21 12:50                   ` Herbert Xu
  0 siblings, 1 reply; 16+ messages in thread
From: Herbert Xu @ 2009-04-21 12:49 UTC (permalink / raw)
  To: Lennert Buytenhek; +Cc: Roland Dreier, David Miller, aluno3, general, netdev

On Tue, Apr 21, 2009 at 11:09:18AM +0200, Lennert Buytenhek wrote:
> On Fri, Jan 30, 2009 at 07:54:12PM -0800, Roland Dreier wrote:
> 
> > > > I don't believe this is accurate.  Calling skb_linearize() (on a kernel
> > > > with CONFIG_HIGHMEM set) can end up calling local_bh_enable() in
> > > > kunmap_skb_frag(), which can obviously cause problems if the initial
> > > > context relies on having BHs disabled (as hard_start_xmit does).
> > >
> > > local_bh_{enable,disable}() nests, so this is not a problem
> > 
> > Duh.  OK, then the only bugs seem to be that iw_nes does skb_linearize
> > with irqs off (due to being an LLTX driver), and mv643xx_eth leaks an
> > skb on its error path if skb_linearize fails.
> 
> (Found this when deleting old netdev@ mail...)  mv643xx_eth returns
> NETDEV_TX_BUSY if skb_linearize fails, so the qdisc will requeue the
> skb, and we shouldn't free it.  Am I missing something?

I don't think the issue here is the leak.  Calling skb_linearize is
simply illegal if you support netpoll because netpoll will call the
xmit routine with IRQs off.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [ofa-general] NetEffect, iw_nes and kernel warning
  2009-04-21 12:49                 ` Herbert Xu
@ 2009-04-21 12:50                   ` Herbert Xu
  0 siblings, 0 replies; 16+ messages in thread
From: Herbert Xu @ 2009-04-21 12:50 UTC (permalink / raw)
  To: Lennert Buytenhek; +Cc: netdev, Roland Dreier, aluno3, David Miller, general

On Tue, Apr 21, 2009 at 08:49:19PM +0800, Herbert Xu wrote:
> 
> I don't think the issue here is the leak.  Calling skb_linearize is
> simply illegal if you support netpoll because netpoll will call the
> xmit routine with IRQs off.

On the other hand if netpoll never generates a packet that requires
linearisation, maybe it will work :)
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2009-04-21 12:50 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <497EF9AC.70104@poczta.onet.pl>
2009-01-27 23:53 ` [ofa-general] NetEffect, iw_nes and kernel warning Roland Dreier
2009-01-28  0:07   ` David Miller
2009-01-28  0:17     ` Stephen Hemminger
2009-01-28 16:36       ` Tung, Chien Tin
2009-01-28 18:05     ` Roland Dreier
2009-01-28 19:05       ` Stephen Hemminger
2009-01-28 21:52         ` Roland Dreier
2009-01-30  6:57       ` Herbert Xu
2009-01-30  8:22         ` Eilon Greenstein
2009-01-30 16:25         ` Stephen Hemminger
2009-01-30 17:35         ` Roland Dreier
2009-01-30 21:51           ` David Miller
2009-01-31  3:54             ` Roland Dreier
2009-04-21  9:09               ` Lennert Buytenhek
2009-04-21 12:49                 ` Herbert Xu
2009-04-21 12:50                   ` Herbert Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).