public inbox for linux-kernel@vger.kernel.org
From: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
To: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>,
	tj@kernel.org, LKML <linux-kernel@vger.kernel.org>,
	albcamus@gmail.com, pjones@redhat.com, alex.shi@intel.com
Subject: Re: system fails to boot
Date: Fri, 14 Nov 2008 14:29:56 +0800
Message-ID: <1226644196.2866.83.camel@ymzhang>
In-Reply-To: <20081114061847.GB2227@x200.localdomain>


On Fri, 2008-11-14 at 09:18 +0300, Alexey Dobriyan wrote:
> On Fri, Nov 14, 2008 at 01:16:21PM +0800, Zhang, Yanmin wrote:
> > Jens,
> > 
> > We ran into system boot failures with kernel 2.6.28-rc. We found it on several
> > machines, including a T61 notebook, a Nehalem machine, and an HPC NX6325 notebook.
> > All the machines use Fedora Core 8 or Fedora Core 9. With kernels prior to 2.6.28-rc,
> > system boot doesn't fail.
> > 
> > I debugged it and located the root cause. Please see
> > http://bugzilla.kernel.org/show_bug.cgi?id=11899
> > https://bugzilla.redhat.com/show_bug.cgi?id=471517
> > 
> > As a matter of fact, there are 2 bugs.
> > 
> > 1) With root=/dev/sda1, system boot fails randomly, roughly once in every 5
> > boots. nash has a bug: some of its functions overload the return value 0.
> > Sometimes 0 means a timeout with no uevent available. Sometimes 0 means nash got
> > a uevent, but the uevent isn't block-related (for example, usb). If, by coincidence,
> > the kernel tells nash that uevents are available but also reports a timeout, nash
> > might stop collecting the other uevents in the queue if the current uevent isn't block-related.
> > I worked out a patch for nash to fix it.
> > http://bugzilla.kernel.org/attachment.cgi?id=18858
> > 
> > 2) With root=LABEL=/, the system can never boot. The initrd init reports that
> > switchroot fails. Here is the execution path of nash when booting:
> >     (1) nash reads /sys/block/sda/dev; assume the major is 8 (as on my desktop);
> >     (2) nash looks up the major number in /proc/devices and finds the line "8 sd";
> >     (3) nash uses 'sd' to search its own probe table for the device (DISK) type
> >        and adds the device to its own list;
> >     (4) later on, it probes all devices in its list to get filesystem labels.
> > SCSI always registers "8 sd".
> > When the major is 259, nash fails to find the device (DISK) type. I enabled CONFIG_DEBUG_BLOCK_EXT_DEVT=y
> > when compiling the kernel, so 259 is picked for device /dev/sda1, which causes nash to fail
> > to find the device (DISK) type.
> > To fix issue 2), I created a patch for nash and another patch for the kernel.
> > http://bugzilla.kernel.org/attachment.cgi?id=18859
> > http://bugzilla.kernel.org/attachment.cgi?id=18837
> > 
> > Below is the patch for kernel 2.6.28-rc4. It registers blkext, a new block device, in /proc/devices.
> > 
> It's procfs-specific init, what's up?
nash (FC9 uses nash to interpret the init script in the initrd) reads /proc/devices to check the type of
the root device. When CONFIG_DEBUG_BLOCK_EXT_DEVT=y, the root device's major is 259. The current kernel doesn't
register a block device for 259 in /proc/devices.
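
As a rough illustration of the lookup described above (a sketch only, not nash's
actual code): the major number is resolved to a driver name through the
"Block devices:" section of /proc/devices, and that name is then looked up in a
probe table. SAMPLE_PROC_DEVICES, PROBE_TABLE and both function names are
hypothetical, and mapping "blkext" to DISK is an assumption about what the nash
patch does.

```python
# Sketch: major number -> driver name in /proc/devices -> device type.
# All names here are illustrative assumptions, not nash's real data structures.

SAMPLE_PROC_DEVICES = """\
Character devices:
  1 mem
  4 tty

Block devices:
  8 sd
 65 sd
"""

# Hypothetical probe table: driver name -> device type.
PROBE_TABLE = {"sd": "DISK"}


def block_driver_names(proc_devices_text):
    """Return {major: name} for the 'Block devices:' section only."""
    names = {}
    in_block = False
    for line in proc_devices_text.splitlines():
        stripped = line.strip()
        if stripped.endswith("devices:"):
            # Section header: track whether we are in the block section.
            in_block = stripped.startswith("Block")
            continue
        if in_block and stripped:
            major, name = stripped.split(None, 1)
            names[int(major)] = name
    return names


def device_type(major, proc_devices_text, probe_table=PROBE_TABLE):
    """Return the device type for a major, or None if either lookup fails."""
    name = block_driver_names(proc_devices_text).get(major)
    if name is None:
        return None  # major has no entry in /proc/devices (the 259 case)
    return probe_table.get(name)


# With the kernel patch, /proc/devices would also list "259 blkext".
PATCHED_PROC_DEVICES = SAMPLE_PROC_DEVICES + "259 blkext\n"
PATCHED_PROBE_TABLE = {"sd": "DISK", "blkext": "DISK"}  # assumed nash-side change
```

With the sample text, major 8 resolves to DISK while major 259 resolves to
nothing. Once "259 blkext" is listed and the probe table knows "blkext", the
lookup for 259 succeeds as well.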

It's hard to explain in a short statement. Would you like to read it from
http://bugzilla.kernel.org/show_bug.cgi?id=11899?
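
For bug 1) in the quoted report, the problem with overloading the return value 0
can be sketched as follows. This is a simplified model under stated assumptions,
not nash's code: the uevent queue is a plain list of subsystem names, and the
Poll enum and all function names are hypothetical.

```python
from enum import Enum


class Poll(Enum):
    """Hypothetical explicit result codes, one per distinct outcome."""
    TIMEOUT = 0
    NOT_BLOCK = 1
    BLOCK = 2


def ambiguous_next(queue):
    """Return 1 for a block uevent, 0 for BOTH timeout and non-block uevents."""
    if not queue:
        return 0  # timeout, nothing available
    event = queue.pop(0)
    return 1 if event == "block" else 0  # 0 again for e.g. "usb"


def drain_ambiguous(queue):
    """Count block uevents; stops at the first 0, whatever it meant."""
    found = 0
    while ambiguous_next(queue) != 0:
        found += 1
    return found


def explicit_next(queue):
    """Same as ambiguous_next, but each outcome gets its own result."""
    if not queue:
        return Poll.TIMEOUT
    return Poll.BLOCK if queue.pop(0) == "block" else Poll.NOT_BLOCK


def drain_explicit(queue):
    """Count block uevents; only a real timeout ends the loop."""
    found = 0
    while True:
        result = explicit_next(queue)
        if result is Poll.TIMEOUT:
            break
        if result is Poll.BLOCK:
            found += 1
    return found
```

In this model, a queue holding a usb uevent followed by a block uevent makes
drain_ambiguous stop before it reaches the block uevent, while drain_explicit
skips the usb uevent and still sees the block one.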


