public inbox for linux-kernel@vger.kernel.org
From: Ingo Molnar <mingo@elte.hu>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>,
	Vegard Nossum <vegard.nossum@gmail.com>,
	Adrian Bunk <bunk@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Jens Axboe <jens.axboe@oracle.com>,
	Greg Kroah-Hartman <gregkh@suse.de>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	Kay Sievers <kay.sievers@vrfy.org>, Neil Brown <neilb@suse.de>,
	Mariusz Kozlowski <m.kozlowski@tuxland.pl>,
	Dave Young <hidave.darkstar@gmail.com>
Subject: Re: [bug, 2.6.26-rc4/rc5] sporadic bootup crashes in blk_lookup_devt()/prepare_namespace()
Date: Mon, 9 Jun 2008 17:38:56 +0200
Message-ID: <20080609153856.GA5149@elte.hu>
In-Reply-To: <alpine.LFD.1.10.0806090819410.3473@woody.linux-foundation.org>


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Mon, 9 Jun 2008, Cornelia Huck wrote:
> > 
> > Does this crash happen with the conversion to the class iterator 
> > functions (should be in linux-next) as well? They take the class 
> > mutex...
> 
> I really don't think it's the locking, although I do agree that the 
> locking looks bogus _too_.
> 
> I suspect that the problem is even simpler than that. On the 
> "block_class.devices" list we can have two types of devices: the ones 
> that have been added by the block/genhd.c code (disks: dev->type 
> "disk_type"), and the ones that are added by the class layer for 
> partitions (partitions: dev->type "part_type").
> 
> And *all* the block/genhd.c loops over that device list look like this:
> 
> 	list_for_each_entry(dev, &block_class.devices, node) {
> 		if (dev->type != &disk_type)
> 			continue;
> 		sgp = dev_to_disk(dev);
> 		...
> 
> because you cannot do that "dev_to_disk()" on a partition entry (it 
> won't have a container of type gendisk, it will be of type hd_struct).
> 
> Well, all except one. Guess which one..
> 
> So I suspect that (a) yes, we need to fix the locking, but (b) the fix for 
> this particular bug is probably the trivial one appended.
> 
> And yes, this bug was introduced by commit 30f2f0eb4b ("block: 
> do_mounts - accept root=<non-existant partition>"), so the alternative 
> is to revert it entirely. Kay?

Ah. I suspect that explains the sporadic nature as well: normally there 
is 'some' object at the list address, just with an invalid type.

The invalid type only becomes visible as a hard crash if, due to 
PAGEALLOC, the structure sizes and kmalloc/slab details cause the 
invalid access to land on a not-yet-allocated page (and then it crashes 
there).

And that in itself is a rather unlikely and fragile condition (it might 
even depend on the timing of various allocations), which is why the bug 
wasn't deterministically reproducible.
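
To make the failure mode concrete, here is a minimal userspace sketch 
(not the kernel code, and not the actual fix). The struct layouts, the 
local container_of() macro, and the helpers dev_to_disk_checked(), 
count_disks() and demo_count_disks() are simplified assumptions for 
illustration only. It shows why the dev->type check has to come before 
the container conversion: the embedded device sits at a different offset 
in each container, so converting a partition entry as if it were a disk 
would yield a pointer to unrelated memory.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumed layouts). */
struct device_type { const char *name; };
struct device { const struct device_type *type; };

/* The embedded struct device lives at different offsets in each container. */
struct gendisk   { int minors; struct device dev; };
struct hd_struct { struct device dev; int partno; };

static const struct device_type disk_type = { "disk" };
static const struct device_type part_type = { "partition" };

/* Userspace approximation of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical helper: convert only after checking the device type. */
static struct gendisk *dev_to_disk_checked(struct device *dev)
{
	if (dev->type != &disk_type)
		return NULL;	/* a partition: its container is hd_struct */
	return container_of(dev, struct gendisk, dev);
}

/* Walk a mixed array of disks and partitions, counting only the disks. */
static int count_disks(struct device **devs, size_t n)
{
	int count = 0;

	for (size_t i = 0; i < n; i++) {
		if (!dev_to_disk_checked(devs[i]))
			continue;	/* skip partition entries */
		count++;
	}
	return count;
}

/* Build one disk and one partition and run both through the walk. */
int demo_count_disks(void)
{
	struct gendisk disk = { .minors = 16, .dev = { .type = &disk_type } };
	struct hd_struct part = { .dev = { .type = &part_type }, .partno = 1 };
	struct device *devs[] = { &disk.dev, &part.dev };

	assert(dev_to_disk_checked(&disk.dev) == &disk);
	assert(dev_to_disk_checked(&part.dev) == NULL);
	return count_disks(devs, 2);
}
```

Without the type check, the conversion of &part.dev would compute an 
address in front of the hd_struct. Whether reading through it faults 
depends on what happens to be mapped there, which fits the sporadic 
behaviour described above.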

	Ingo
