From: Tejun Heo <tj@kernel.org>
To: Thomas Jarosch <thomas.jarosch@intra2net.com>
Cc: linux-raid@vger.kernel.org, Neil Brown <neilb@suse.de>
Subject: Re: raid1 boot regression in 2.6.37 [bisected]
Date: Tue, 29 Mar 2011 12:07:44 +0200
Message-ID: <20110329100744.GK6736@htj.dyndns.org>
In-Reply-To: <201103291153.06495.thomas.jarosch@intra2net.com>
On Tue, Mar 29, 2011 at 11:53:06AM +0200, Thomas Jarosch wrote:
> On Tuesday, 29. March 2011 10:25:03 Tejun Heo wrote:
> > Can you please apply the following patch and see whether it resolves
> > the problem and report the boot log?
>
> Ok, I did the following:
> - Check out commit e804ac780e2f01cb3b914daca2fd4780d1743db1
> (md: fix and update workqueue usage)
> - Apply your patch
> - Add small debug output on top of it:
>
> ------------------------------
> # git diff
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 1e6534d..d2ddef4 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -5899,6 +5899,16 @@ static int md_open(struct block_device *bdev, fmode_t mode)
>  				once = true;
>  			}
>  		}
> +		/* DEBUG HACK */
> +		{
> +			static bool tomj_once = false;
> +			if (!tomj_once)
> +			{
> +				printk("TOMJ %s: md_open(): RT prio, pol=%u p=%d rt_p=%u\n",
> +				       current->comm, current->policy, current->static_prio, current->rt_priority);
> +				tomj_once = true;
> +			}
> +		}
>  		msleep(10);
>  		/* Wait until bdev->bd_disk is definitely gone */
>  		flush_workqueue(md_misc_wq);
...
> TOMJ blkid: md_open(): RT prio, pol=0 p=118 rt_p=0
...
> As you can see, your printk() is not triggered. I just
> copied your printk and made it print once unconditionally.
>
> So probably the msleep(10); does the trick. Something
> seems very racy to me as other boxes with software RAID
> can boot the exact same kernel + dracut version just fine.
>
> I'll put the box in a reboot loop over the lunch break.
Hmmm.. interesting, so no RT task there. I don't know why the
softlockup is triggering then. Ah, okay, neither CONFIG_PREEMPT nor
CONFIG_PREEMPT_VOLUNTARY is set, right?
Anyways, the root cause here is that the md_open() -ERESTARTSYS retry
is busy looping without giving the put path a chance to run. When it
was using flush_scheduled_work(), there were some unrelated work items
queued, so it ended up sleeping by accident, which gave the put path a
chance to run. With the conversion, the flush domain is reduced and
there's nothing unrelated to wait for, so it just busy loops.
Neil, we can put a short unconditional sleep there, or somehow ensure
that a work item is queued before the restart loop engages. What do you
think?
Thanks.
--
tejun