From: Jarek Poplawski <jarkao2@o2.pl>
To: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David Stevens <dlstevens@us.ibm.com>,
	greearb@candelatech.com, netdev@vger.kernel.org
Subject: Re: BUG: soft lockup detected on CPU#0!  (2.6.18.2 plus hacks)
Date: Thu, 4 Jan 2007 09:03:51 +0100
Message-ID: <20070104080351.GA1740@ff.dom.local>
In-Reply-To: <E1H2M3X-0001HE-00@gondolin.me.apana.org.au>

On Thu, Jan 04, 2007 at 05:26:27PM +1100, Herbert Xu wrote:
> David Stevens <dlstevens@us.ibm.com> wrote:
> >        You're right, I don't know whether it'll fix the problem Ben saw
> > or not, but it looks like the original code can do a receive before the
> > in_device is fully initialized, and that, of course, is bad.
> >        If the device for ip_rcv() is not the same one we were
> > initializing when the receive interrupted, then the patch should have
> > no effect either way -- I don't think it'll hide other problems.
> >        If it's hard to reproduce (which I guess is true), then you're
> > right, no soft lockup doesn't really tell us if it's fixed or not.
> 
> Actually I missed your point that the multicast locks aren't even
> initialised at that point.  So this does explain the soft lock-up
> and therefore your patch is clearly the correct solution.

I doubt this is the right solution. It could
certainly fix this particular situation, but my
main point was that packets shouldn't get into
the kernel receive queues while skb->dev is not
IFF_UP.

Real devices' drivers don't do that, and virtual
devices should behave the same way. Otherwise
netif_rx or netif_receive_skb would always have
to check this and drop such packets, or else the
check would have to be done later. This patch
merely allows for the possibility that something
is wrong at this one point. But then all the
other receive and routing functions would have to
be checked, and maybe fixed, to allow for the same.
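
The kind of check discussed above can be sketched
in user-space C. This is only an illustration under
stated assumptions, not kernel code: sk_dev, sk_pkt,
SK_IFF_UP and sk_rx_allowed are hypothetical
stand-ins for struct net_device, struct sk_buff,
IFF_UP and a test made on entry to the receive path.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's IFF_UP flag. */
#define SK_IFF_UP 0x1u

/* Hypothetical stand-in for struct net_device. */
struct sk_dev {
    unsigned int flags;        /* interface flags, e.g. SK_IFF_UP */
    unsigned long rx_dropped;  /* packets refused because the device was down */
};

/* Hypothetical stand-in for struct sk_buff. */
struct sk_pkt {
    struct sk_dev *dev;        /* device the packet claims to arrive on */
};

/*
 * Decide whether a packet may be handed to higher protocols.
 * Returns false (and counts a drop) when the receiving device
 * is not marked up, so later code never sees such a packet.
 */
static bool sk_rx_allowed(struct sk_pkt *pkt)
{
    if (pkt == NULL || pkt->dev == NULL)
        return false;

    if (!(pkt->dev->flags & SK_IFF_UP)) {
        pkt->dev->rx_dropped++;
        return false;
    }
    return true;
}
```

With a single test like this at the entry point, the
later receive and routing functions would not each
need their own check.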

I've proposed some measures to check whether this
bug is really caused by the skipped init but,
IMHO, it should be fixed in one of these ways:

- the vlan driver should be reworked to behave like
"real" drivers and ensure that no packet with
skb->dev not IFF_UP is queued or processed by
higher protocols; it could possibly use bridge's
master field and skb_bond and
skb_bond_should_drop (maybe slightly changed),

- the vlan driver should itself open the real
devices only after its own devices are up,

- all dev.c receive functions should be changed to
check IFF_UP; but because vlans are not so
popular, this would of course be a waste of
time.
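
The second option in the list above, the ordering
one, can be sketched as follows. This is a
user-space illustration only, and it assumes that
"its devices" means every vlan device stacked on
the real device. All names (sk2_dev, sk2_vlan_group,
SK2_IFF_UP, sk2_try_open_real) are hypothetical and
do not exist in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's IFF_UP flag. */
#define SK2_IFF_UP 0x1u

/* Hypothetical stand-in for struct net_device. */
struct sk2_dev {
    unsigned int flags;
};

/* Hypothetical grouping of one real device and the vlans on top of it. */
struct sk2_vlan_group {
    struct sk2_dev *real_dev;
    struct sk2_dev *vlans;
    size_t n_vlans;
};

/* True only when every vlan device in the group is marked up. */
static bool sk2_all_vlans_up(const struct sk2_vlan_group *g)
{
    for (size_t i = 0; i < g->n_vlans; i++) {
        if (!(g->vlans[i].flags & SK2_IFF_UP))
            return false;
    }
    return true;
}

/*
 * Mark the real device up only once all its vlan devices are up,
 * so it cannot start receiving on behalf of a vlan that is still
 * down. Returns true if the real device was opened by this call.
 */
static bool sk2_try_open_real(struct sk2_vlan_group *g)
{
    if (!sk2_all_vlans_up(g))
        return false;

    g->real_dev->flags |= SK2_IFF_UP;
    return true;
}
```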

Regards,
Jarek P.

PS: for scientific reasons we could look for the
specific place where it locks or loops now
(I have some suspicions about lockdep, because it
looks like the place after it, with the lock init
checking, isn't reached), and maybe there is
also some other bug, but it's evident that this
possibility of ip_rcv and ip_route_input running
before the dev is IFF_UP is a hole in the design
and should be fixed.
