linux-raid.vger.kernel.org archive mirror
From: Tejun Heo <tj@kernel.org>
To: Thomas Jarosch <thomas.jarosch@intra2net.com>
Cc: linux-raid@vger.kernel.org, Neil Brown <neilb@suse.de>
Subject: Re: raid1 boot regression in 2.6.37 [bisected]
Date: Tue, 29 Mar 2011 10:25:03 +0200
Message-ID: <20110329082503.GI6736@htj.dyndns.org>
In-Reply-To: <4D90E580.7020406@intra2net.com>

On Mon, Mar 28, 2011 at 09:46:08PM +0200, Thomas Jarosch wrote:
> On 03/28/2011 05:59 PM, Tejun Heo wrote:
> >> Call Trace:
> >>  [<c12dc808>] mutex_unlock+0x8/0x10
> >>  [<c11c4451>] kobj_lookup+0xe1/0x140
> >>  [<c11323b0>] ? exact_match+0x0/0x10
> >>  [<c11331b8>] get_gendisk+0x98/0xb0
> >>  [<c10e85aa>] __blkdev_get+0xca/0x320
> >>  [<c10e8843>] blkdev_get+0x43/0x2c0
> >>  [<c12de75d>] ? _raw_spin_unlock+0x1d/0x20
> >>  [<c10e8b12>] blkdev_open+0x52/0x70
> >>  [<c10bb12d>] __dentry_open+0x9d/0x240
> >>  [<c10bb3c6>] nameidata_to_filp+0x66/0x80
> >>  [<c10e8ac0>] ? blkdev_open+0x0/0x70
> >>  [<c10c781f>] finish_open+0xaf/0x190
> >>  [<c10c8a24>] ? do_path_lookup+0x44/0xe0
> >>  [<c10c9920>] do_filp_open+0x210/0x6d0
> >>  [<c10672e9>] ? lock_release_non_nested+0x59/0x2f0
> >>  [<c12de75d>] ? _raw_spin_unlock+0x1d/0x20
> >>  [<c10d47d8>] ? alloc_fd+0xb8/0xf0
> >>  [<c10baf45>] do_sys_open+0x55/0xf0
> >>  [<c10bb049>] sys_open+0x29/0x40
> >>  [<c1002e9f>] sysenter_do_call+0x12/0x38
> > 
> > Hmmm... Weird.
> > 
> > * blkid seems to be looping in blkdev_open(), repeatedly calling
> >   md_open(), which keeps returning -ERESTARTSYS.
> > 
> > * It triggered a softlockup.  Even with -ERESTARTSYS looping, I can't
> >   see how that would be possible.
> > 
> > Is this a custom boot script?  If so, do you use RT priority in the
> > script?
> 
> It's a normal dracut installation with an additional custom script
> to trigger kernel raid auto detection via mdadm.
> The custom script was part of the initial post.
> 
> I've also noticed another odd thing: On an HP ProLiant ML110 G6 box,
> which is quite fast / SMP, the software RAID comes up
> successfully. But the box is then slow as hell and I can see a constant load
> on a kernel process (could be "kworker", I don't remember exactly).
> I'll check tomorrow whether that is also related to the RAID subsystem
> or something else turning it into a PDP11...

Can you please apply the following patch, see whether it resolves
the problem, and report the boot log?

Thanks.

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 8b66e04..e17098b 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6001,6 +6001,15 @@ static int md_open(struct block_device *bdev, fmode_t mode)
 		 * bd_disk.
 		 */
 		mddev_put(mddev);
+		if (current->policy == SCHED_FIFO || current->policy == SCHED_RR) {
+			static bool once;
+			if (!once) {
+				printk("%s: md_open(): RT prio, pol=%u p=%d rt_p=%u\n",
+				       current->comm, current->policy, current->static_prio, current->rt_priority);
+				once = true;
+			}
+		}
+		msleep(10);
 		/* Wait until bdev->bd_disk is definitely gone */
 		flush_workqueue(md_misc_wq);
 		/* Then retry the open from the top */
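
For illustration only (not part of the patch): one reading of the
msleep(10) above is that it tests whether the retrying opener is
starving whatever has to finish the teardown before md_open() can
succeed.  The userspace sketch below is a deliberately simplified,
single-threaded model of that idea.  The function name, the
"pending work" counter and the rule that the worker only makes
progress while the opener sleeps are all assumptions made up for
this example; none of it is kernel code.

```c
#include <stdbool.h>

/* Kernel-internal restart code; userspace <errno.h> does not define it. */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

/*
 * Hypothetical model of the open retry loop.
 *
 * pending_work   units of teardown work the (modelled) worker must finish
 *                before the open can succeed
 * sleep_between  whether the opener sleeps between retries, which is what
 *                the added msleep(10) does in the patch
 * max_attempts   retry budget, standing in for "loops forever"
 *
 * Assumption of the model: the worker only gets CPU time while the
 * opener sleeps (as if the opener ran at a priority that starves it).
 *
 * Returns the attempt number on which the open succeeds, or -1 if the
 * budget is exhausted while the open still reports -ERESTARTSYS.
 */
int model_open_retry(int pending_work, bool sleep_between, int max_attempts)
{
	for (int attempt = 1; attempt <= max_attempts; attempt++) {
		/* Modelled md_open(): fails while teardown work is pending. */
		int ret = pending_work > 0 ? -ERESTARTSYS : 0;

		if (ret == 0)
			return attempt;

		/* Sleeping lets the worker retire one unit of work. */
		if (sleep_between)
			pending_work--;
	}
	return -1;
}
```

In this model, model_open_retry(3, false, 1000) never succeeds and
returns -1, while model_open_retry(3, true, 1000) succeeds on the
fourth attempt.  Whether the real system behaves this way is exactly
what the boot log with the patch applied should show.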
