linux-raid.vger.kernel.org archive mirror
Subject: raid1 boot regression in 2.6.37 [bisected]
From: Thomas Jarosch @ 2011-03-25 18:55 UTC
  To: linux-raid; +Cc: Tejun Heo

Hello,

I've just updated from kernel 2.6.34.7 to kernel 2.6.37.5, and one
HP ProLiant DL320 G3 box with a raid1 software RAID stopped booting
(as did two other non-HP boxes).

We run this script at boot time via dracut:
----------------------------------
#!/bin/sh
. /lib/dracut-lib.sh

info "Telling kernel to auto-detect RAID arrays"
/sbin/initqueue --settled --name kerneldetectraid /sbin/mdadm --auto-detect
----------------------------------

With the "bad" commit in place, the kernel doesn't output
any md message at all. I've bisected it down to this commit:

e804ac780e2f01cb3b914daca2fd4780d1743db1 is the first bad commit
commit e804ac780e2f01cb3b914daca2fd4780d1743db1
Author: Tejun Heo <tj@kernel.org>
Date:   Fri Oct 15 15:36:08 2010 +0200

    md: fix and update workqueue usage

    Workqueue usage in md has two problems.

    * Flush can be used during or depended upon by memory reclaim, but md
      uses the system workqueue for flush_work which may lead to deadlock.

    * md depends on flush_scheduled_work() to achieve exclusion against
      completion of removal of previous instances.  flush_scheduled_work()
      may incur unexpected amount of delay and is scheduled to be removed.

    This patch adds two workqueues to md - md_wq and md_misc_wq.  The
    former is guaranteed to make forward progress under memory pressure
    and serves flush_work.  The latter serves as the flush domain for
    other works.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Signed-off-by: NeilBrown <neilb@suse.de>

:040000 040000 f6b6a34a71864263ed253866c5f8abe7f766ac6b dc2eff4a91825142b7c88cf54751fc7acdf1a6d2 M      drivers

I manually verified that the commit before it 
(57dab0bdf689d42972975ec646d862b0900a4bf3) works
and the "bad" commit prevents the box from booting.
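
For anyone repeating the bisection, the "no md message at all" symptom
can be checked mechanically on a captured kernel log (for example from
a serial console). The sketch below is only an illustration:
has_md_messages is a hypothetical helper name, and it assumes that md
messages begin with "md:" or "md/" after an optional "[timestamp]"
prefix, as in "md: Autodetecting RAID arrays."; other formats may be
missed.

```shell
#!/bin/sh
# Hypothetical helper (not part of mdadm or dracut).
# Reads a kernel log on stdin and succeeds (exit status 0) if at least
# one line looks like an md message.
# Assumption: md lines start with "md:" or "md/", optionally preceded
# by a dmesg-style "[    1.234567] " timestamp.
has_md_messages() {
    grep -Eq '^(\[ *[0-9]+\.[0-9]+] )?md[:/]'
}
```

Possible usage on a box that still boots far enough to get a shell:
"dmesg | has_md_messages || echo 'no md output seen'".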


Some more info:

# mdadm --version
mdadm - v2.6.9 - 10th March 2009

# mdadm --detail /dev/md0
/dev/md0:
        Version : 0.90
  Creation Time : Wed May 27 17:52:40 2009
     Raid Level : raid1
     Array Size : 2562240 (2.44 GiB 2.62 GB)
  Used Dev Size : 2562240 (2.44 GiB 2.62 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Fri Mar 25 17:11:33 2011
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 0ee8da2c:5803478b:e399b924:6520c535
         Events : 0.160

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1



Any idea what might be going wrong? Maybe building a kernel
with lock debugging on Monday will help. I think I also
tried kernel 2.6.38, though I'll verify that on Monday, too.
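
For reference, "lock debugging" on a kernel of this era usually means
enabling options along the following lines. This is only a suggested
fragment under that assumption; the exact set of options is a guess,
not the configuration actually used for this report.

```
# Assumed .config fragment for lock debugging (illustrative only)
CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_DETECT_HUNG_TASK=y
```

With these set, a lock-ordering problem or a task stuck waiting in the
md code would be more likely to show up as a warning on the console.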


Have a nice weekend,
Thomas

PS: Sorry Tejun for the HTML crap in my first mail.


