From: "Vaughan" <cxt9401@163.com>
To: neilb@suse.com
Cc: linux-raid@vger.kernel.org
Subject: Re: How does md gurantee not miss to free an active stripe_head when md stops?
Date: Wed, 20 Jul 2016 18:53:41 +0800
Message-ID: <004e01d1e274$faa33a20$efe9ae60$@163.com>
Hi Neil,
I'm developing against the v3.10 md code. Recently I hit a problem where a
read IO often returns from the physical disk after md has been stopped.
I reviewed the code and found that when md stops, it unregisters raid5d
unconditionally and calls shrink_stripes() to free only the *inactive*
stripes.
I know that before stopping, mdadm opens the md device with O_EXCL, but that
does not prevent others from opening it and sending IO to it.
So I think it's possible that some active stripes are still in flight.
I also found this commit:
commit 5aa61f427e4979be733e4847b9199ff9cc48a47e
Author: NeilBrown <neilb@suse.de>
Date: Mon Dec 15 12:56:57 2014 +1100
md: split detach operation out from ->stop.
It adds a call to quiesce before raid5d is unregistered in __md_stop, which
was not there before.
Does this fix the hole when md stops?
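To make the ordering I am asking about concrete, here is a minimal userspace
sketch. It is not md code: toy_conf, toy_end_request(), toy_quiesce(),
toy_shrink_stripes() and toy_stop() are hypothetical names, and the counter
only stands in for the active stripe count. It assumes that a quiesce does not
return until every in-flight stripe has completed, which is what I am asking
you to confirm.

```c
#include <assert.h>

/* Toy model, not kernel code: every name here is hypothetical. */
struct toy_conf {
    int active_stripes;   /* stripes with IO still in flight */
    int inactive_stripes; /* stripes sitting idle in the cache */
};

/* Model of an IO completion: the stripe returns to the inactive list. */
static void toy_end_request(struct toy_conf *conf)
{
    assert(conf->active_stripes > 0);
    conf->active_stripes--;
    conf->inactive_stripes++;
}

/* Model of a quiesce: do not return while any stripe is still active. */
static void toy_quiesce(struct toy_conf *conf)
{
    while (conf->active_stripes > 0)
        toy_end_request(conf); /* stands in for waiting on real IO */
}

/* Free only the inactive stripes; return how many active ones remain. */
static int toy_shrink_stripes(struct toy_conf *conf)
{
    conf->inactive_stripes = 0;
    return conf->active_stripes;
}

/* Stop path. quiesce_first selects the ordering being compared. */
static int toy_stop(struct toy_conf *conf, int quiesce_first)
{
    if (quiesce_first)
        toy_quiesce(conf);
    return toy_shrink_stripes(conf);
}
```

With one active stripe, toy_stop(&conf, 0) returns 1, so a stripe outlives
the stop and a late completion would touch freed state. toy_stop(&conf, 1)
returns 0, because the drain happens before anything is freed.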
In my case, an OOPS usually happens as follows:
I keep calling mdadm --stop to stop an md array, but lsof shows it is held
open by systemd-udevd, so it is reported as "still is inuse".
30s later, udev reports a timeout and the worker is killed with SIGKILL.
systemd-udevd: worker [19335]/devices/virtual/block/md41 timeout; kill it.
Then the md stop process is able to continue and gets past free_conf(). But
there is still one active stripe left.
kernel: shrink_stripes:conf(ffff880004affc00)->md(ffff8802d95b4000,md41)active_stripes=1 <== this is my debug print.
kernel: md41: detected capacity change from 3409128980480 to 0
mdadm: stopped /dev/md41
After md is stopped, a read IO returns from the underlying device and the
kernel OOPSes.
[190830.867371] md: unbind<dm-64>
[190830.876345] md: export_rdev(dm-64)
[190831.201619] BUG: unable to handle kernel
[190831.202875] paging request at 0000000000002050
[190831.204101] IP: [<ffffffffa089a349>] raid5_end_read_request+0xf9/0xdc0[raid456]
I found that this returned bio was issued for a user page read, and that it
shows up in the fput -> kill_bdev path, where mdadm waits on the page lock:
PID: 21345 TASK: ffff8803e5a916c0 CPU: 1 COMMAND: "mdadm"
#0 [ffff88016f777b88] __schedule at ffffffff815f513d
#1 [ffff88016f777bf0] io_schedule at ffffffff815f599d
#2 [ffff88016f777c08] sleep_on_page at ffffffff81155f1e
#3 [ffff88016f777c18] __wait_on_bit_lock at ffffffff815f38ab
#4 [ffff88016f777c58] __lock_page at ffffffff81156038
#5 [ffff88016f777cb0] truncate_inode_pages_range at ffffffff8116645e
#6 [ffff88016f777e00] truncate_inode_pages at ffffffff811664b5
#7 [ffff88016f777e10] kill_bdev at ffffffff811ffaef
#8 [ffff88016f777e28] __blkdev_put at ffffffff81201124
#9 [ffff88016f777e68] blkdev_put at ffffffff81201bae
#10 [ffff88016f777e98] blkdev_close at ffffffff81201d55
#11 [ffff88016f777ea8] __fput at ffffffff811c81b9
#12 [ffff88016f777ef0] ____fput at ffffffff811c847e
#13 [ffff88016f777f00] task_work_run at ffffffff81093b37
#14 [ffff88016f777f30] do_notify_resume at ffffffff81013b0c
#15 [ffff88016f777f50] int_signal at ffffffff8160049d