From: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
To: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>,
	tj@kernel.org, LKML <linux-kernel@vger.kernel.org>,
	albcamus@gmail.com, pjones@redhat.com, alex.shi@intel.com
Subject: Re: system fails to boot
Date: Fri, 14 Nov 2008 14:29:56 +0800
Message-ID: <1226644196.2866.83.camel@ymzhang>
In-Reply-To: <20081114061847.GB2227@x200.localdomain>


On Fri, 2008-11-14 at 09:18 +0300, Alexey Dobriyan wrote:
> On Fri, Nov 14, 2008 at 01:16:21PM +0800, Zhang, Yanmin wrote:
> > Jens,
> > 
> > We ran into boot failures with kernel 2.6.28-rc. We saw them on several
> > machines, including a T61 notebook, a Nehalem machine, and an HPC NX6325 notebook.
> > All of the machines run Fedora Core 8 or Fedora Core 9. With kernels prior to
> > 2.6.28-rc, boot doesn't fail.
> > 
> > I debugged it and located the root cause. Please see
> > http://bugzilla.kernel.org/show_bug.cgi?id=11899
> > https://bugzilla.redhat.com/show_bug.cgi?id=471517
> > 
> > As a matter of fact, there are 2 bugs.
> > 
> > 1) With root=/dev/sda1, boot fails randomly, roughly once in every 5 boots.
> > nash has a bug: some of its functions overload return value 0.
> > Sometimes 0 means a timeout with no uevent available. Sometimes 0 means nash got
> > a uevent that isn't block-related (for example, usb). If, by coincidence, the
> > kernel tells nash that uevents are available but also signals a timeout, nash
> > might stop collecting the other uevents in the queue if the current uevent isn't
> > block-related. I worked out a patch for nash to fix it.
> > http://bugzilla.kernel.org/attachment.cgi?id=18858
> > 
> > 2) With root=LABEL=/, the system never boots. The initrd init reports that
> > switchroot fails. Here is the execution path of nash when booting:
> >     (1) nash reads /sys/block/sda/dev; assume the major is 8 (as on my desktop);
> >     (2) nash queries /proc/devices with the major number and finds the line "8 sd";
> >     (3) nash uses 'sd' to search its own probe table for the device (DISK) type
> >        and adds the device to its own list;
> >     (4) later on, it probes all devices in its list to get filesystem labels.
> >        SCSI always registers "8 sd".
> > When the major is 259, nash fails to find the device (DISK) type. I enabled
> > CONFIG_DEBUG_BLOCK_EXT_DEVT=y when compiling the kernel, so 259 is picked for
> > device /dev/sda1, which causes nash to fail to find the device (DISK) type.
> > To fix issue 2), I created a patch for nash and another patch for the kernel.
> > http://bugzilla.kernel.org/attachment.cgi?id=18859
> > http://bugzilla.kernel.org/attachment.cgi?id=18837
> > 
> > Below is the patch for kernel 2.6.28-rc4. It registers blkext, a new block device, in /proc/devices.
> > 
> It's procfs-specific init, what's up?
nash (FC9 uses nash to interpret the init script in the initrd) reads /proc/devices to check the
type of the root device. With CONFIG_DEBUG_BLOCK_EXT_DEVT=y, the root device's major is 259. The
current kernel doesn't register a block device for major 259 in /proc/devices.
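
To make the failure concrete, here is a minimal sketch of the kind of lookup described in
steps (1)-(4) of my report. It is not nash's actual code: the function name block_major_name
and the idea of passing the file contents as a string are assumptions made for illustration.
It scans text laid out like /proc/devices and returns the name registered for a block major.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from nash): find the name registered for a
 * block major in text laid out like /proc/devices.
 * Returns 1 and fills 'name' on success, 0 if the major has no entry.
 */
static int block_major_name(const char *text, unsigned int major,
                            char *name, size_t len)
{
    int in_block = 0;

    assert(text != NULL && name != NULL && len > 0);

    while (*text != '\0') {
        char line[128];
        char drv[64];
        unsigned int m;
        const char *eol = strchr(text, '\n');
        size_t full = eol ? (size_t)(eol - text) : strlen(text);
        size_t n = full < sizeof(line) - 1 ? full : sizeof(line) - 1;

        /* Copy one line out of the buffer and advance past it. */
        memcpy(line, text, n);
        line[n] = '\0';
        text += full;
        if (eol)
            text++;

        /* Only entries under the "Block devices:" header are relevant. */
        if (strncmp(line, "Block devices:", 14) == 0) {
            in_block = 1;
            continue;
        }
        if (strncmp(line, "Character devices:", 18) == 0) {
            in_block = 0;
            continue;
        }
        if (in_block && sscanf(line, "%u %63s", &m, drv) == 2 && m == major) {
            snprintf(name, len, "%s", drv);
            return 1;
        }
    }
    return 0;
}
```

With a listing that contains only "8 sd" under "Block devices:", a lookup for major 8 yields
"sd", while a lookup for major 259 finds nothing, which is the case nash cannot handle. Once
the kernel lists "259 blkext", the same lookup returns "blkext", and the caller still needs a
probe-table entry for that name.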

It's hard to explain in a short statement. Would you like to read it from
http://bugzilla.kernel.org/show_bug.cgi?id=11899?
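
For bug 1) in the quoted report, the ambiguity of return value 0 can be illustrated with a
small sketch. Again, this is not nash's code or my patch: the enum, the uevent_queue struct
and both function names are hypothetical. The point is only that "timeout" and "got a
non-block uevent" need distinct results, so the caller keeps draining the queue until a real
timeout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Distinct outcomes instead of one overloaded 0 (names are illustrative). */
enum uevent_result {
    UEVENT_TIMEOUT, /* nothing left to read */
    UEVENT_OTHER,   /* got a uevent, but not block-related (e.g. usb) */
    UEVENT_BLOCK    /* got a block-related uevent */
};

/* Hypothetical stand-in for the uevent queue: a NULL-terminated list of
 * subsystem names and a cursor into it. */
struct uevent_queue {
    const char *const *subsystems;
    size_t next;
};

static enum uevent_result next_uevent(struct uevent_queue *q)
{
    const char *s;

    assert(q != NULL);

    s = q->subsystems[q->next];
    if (s == NULL)
        return UEVENT_TIMEOUT;
    q->next++;
    return strcmp(s, "block") == 0 ? UEVENT_BLOCK : UEVENT_OTHER;
}

/* Drain the queue: a non-block uevent is skipped and does not end the loop. */
static int count_block_uevents(struct uevent_queue *q)
{
    int blocks = 0;

    for (;;) {
        enum uevent_result r = next_uevent(q);

        if (r == UEVENT_TIMEOUT)
            break;
        if (r == UEVENT_BLOCK)
            blocks++;
    }
    return blocks;
}
```

If both UEVENT_TIMEOUT and UEVENT_OTHER were reported as 0, a loop like this would stop at
the first usb uevent and miss any block uevents queued behind it.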


