* task mdadm blocked when stopping array, 3.15rc3
@ 2014-05-03 23:16 Chris Murphy
2014-05-05 3:40 ` NeilBrown
0 siblings, 1 reply; 2+ messages in thread
From: Chris Murphy @ 2014-05-03 23:16 UTC (permalink / raw)
To: linux-raid@vger.kernel.org List
When I issue mdadm -S /dev/md0, I get a hang which does not recover after 30+ minutes. This is what appears in dmesg (partial), but I also have issued sysrq-w and included a followup dmesg and journalctl both of which are attached to this kernel bug because it's so wide it just looks ugly in email:
https://bugzilla.kernel.org/show_bug.cgi?id=75451
dmesg partial:
[ 596.109426] rawhide.localdomain kernel: kworker/u6:7 (76) used greatest stack depth: 1872 bytes left
[ 600.363333] rawhide.localdomain kernel: INFO: task mdadm:1209 blocked for more than 120 seconds.
[ 600.363668] rawhide.localdomain kernel: Not tainted 3.15.0-0.rc3.git3.1.fc21.x86_64 #1
[ 600.363668] rawhide.localdomain kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 600.363668] rawhide.localdomain kernel: mdadm D ffff880088abcd40 5464 1209 1055 0x00000080
[ 600.363668] rawhide.localdomain kernel: ffff880088be7c40 0000000000000046 ffff880088abcd40 00000000001d5dc0
[ 600.363668] rawhide.localdomain kernel: ffff880088be7fd8 00000000001d5dc0 ffff880099193380 ffff880000144710
[ 600.363668] rawhide.localdomain kernel: ffff880000144600 ffff880088be7c60 ffff880000144778 0000000000000001
[ 600.363668] rawhide.localdomain kernel: Call Trace:
[ 600.363668] rawhide.localdomain kernel: [<ffffffff817ec479>] schedule+0x29/0x70
[ 600.363668] rawhide.localdomain kernel: [<ffffffffa031cbe5>] raise_barrier+0x135/0x1a0 [raid10]
[ 600.363668] rawhide.localdomain kernel: [<ffffffff810eed10>] ? abort_exclusive_wait+0xb0/0xb0
[ 600.363668] rawhide.localdomain kernel: [<ffffffffa031d58c>] stop+0x1c/0x90 [raid10]
[ 600.363668] rawhide.localdomain kernel: [<ffffffff8160589b>] __md_stop+0x1b/0x60
[ 600.363668] rawhide.localdomain kernel: [<ffffffff81613337>] do_md_stop+0x157/0x570
[ 600.363668] rawhide.localdomain kernel: [<ffffffff81614099>] ? md_ioctl+0x179/0x1d40
[ 600.363668] rawhide.localdomain kernel: [<ffffffff810f8f6d>] ? trace_hardirqs_on_caller+0x15d/0x200
[ 600.363668] rawhide.localdomain kernel: [<ffffffff81614cf5>] md_ioctl+0xdd5/0x1d40
[ 600.363668] rawhide.localdomain kernel: [<ffffffff810dbb26>] ? local_clock+0x16/0x30
[ 600.363668] rawhide.localdomain kernel: [<ffffffff810f5a7f>] ? lock_release_holdtime.part.28+0xf/0x200
[ 600.363668] rawhide.localdomain kernel: [<ffffffff8135e90a>] ? avc_has_perm+0x15a/0x350
[ 600.363668] rawhide.localdomain kernel: [<ffffffff8135e7e4>] ? avc_has_perm+0x34/0x350
[ 600.363668] rawhide.localdomain kernel: [<ffffffff813b0bcc>] blkdev_ioctl+0x1cc/0x840
[ 600.363668] rawhide.localdomain kernel: [<ffffffff8127e763>] block_ioctl+0x43/0x50
[ 600.363668] rawhide.localdomain kernel: [<ffffffff812513e0>] do_vfs_ioctl+0x2f0/0x520
[ 600.363668] rawhide.localdomain kernel: [<ffffffff81251691>] SyS_ioctl+0x81/0xa0
[ 600.363668] rawhide.localdomain kernel: [<ffffffff81155eac>] ? __audit_syscall_entry+0x9c/0xf0
[ 600.363668] rawhide.localdomain kernel: [<ffffffff817fca29>] system_call_fastpath+0x16/0x1b
[ 600.363668] rawhide.localdomain kernel: 2 locks held by mdadm/1209:
[ 600.363668] rawhide.localdomain kernel: #0: (&mddev->reconfig_mutex){+.+.+.}, at: [<ffffffff816132ba>] do_md_stop+0xda/0x570
[ 600.363668] rawhide.localdomain kernel: #1: (&mddev->open_mutex){+.+.+.}, at: [<ffffffff816132cf>] do_md_stop+0xef/0x570
[ 631.441218] rawhide.localdomain systemd[1]: Starting dnf makecache...
[ 632.838884] rawhide.localdomain dnf[1213]: Config time: 0.079
[ 632.845277] rawhide.localdomain dnf[1213]: cachedir: /var/cache/dnf/x86_64/21
[ 632.886277] rawhide.localdomain dnf[1213]: dnf version: 0.4.20
[ 632.903309] rawhide.localdomain dnf[1213]: Making cache files for all metadata files.
[ 632.977230] rawhide.localdomain dnf[1213]: Metadata cache refreshed recently.
[ 633.153699] rawhide.localdomain systemd[1]: Started dnf makecache.
[ 720.363842] rawhide.localdomain kernel: INFO: task mdadm:1209 blocked for more than 120 seconds.
[ 720.364012] rawhide.localdomain kernel: Not tainted 3.15.0-0.rc3.git3.1.fc21.x86_64 #1
[ 720.364948] rawhide.localdomain kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 720.373340] rawhide.localdomain kernel: mdadm D ffff880088abcd40 5464 1209 1055 0x00000080
[ 720.374515] rawhide.localdomain kernel: ffff880088be7c40 0000000000000046 ffff880088abcd40 00000000001d5dc0
[ 720.375523] rawhide.localdomain kernel: ffff880088be7fd8 00000000001d5dc0 ffff880099193380 ffff880000144710
[ 720.376624] rawhide.localdomain kernel: ffff880000144600 ffff880088be7c60 ffff880000144778 0000000000000001
[ 720.377007] rawhide.localdomain kernel: Call Trace:
[ 720.377762] rawhide.localdomain kernel: [<ffffffff817ec479>] schedule+0x29/0x70
[ 720.378217] rawhide.localdomain kernel: [<ffffffffa031cbe5>] raise_barrier+0x135/0x1a0 [raid10]
[ 720.378922] rawhide.localdomain kernel: [<ffffffff810eed10>] ? abort_exclusive_wait+0xb0/0xb0
[ 720.383370] rawhide.localdomain kernel: [<ffffffffa031d58c>] stop+0x1c/0x90 [raid10]
[ 720.385210] rawhide.localdomain kernel: [<ffffffff8160589b>] __md_stop+0x1b/0x60
[ 720.385842] rawhide.localdomain kernel: [<ffffffff81613337>] do_md_stop+0x157/0x570
[ 720.386619] rawhide.localdomain kernel: [<ffffffff81614099>] ? md_ioctl+0x179/0x1d40
[ 720.387351] rawhide.localdomain kernel: [<ffffffff810f8f6d>] ? trace_hardirqs_on_caller+0x15d/0x200
[ 720.387924] rawhide.localdomain kernel: [<ffffffff81614cf5>] md_ioctl+0xdd5/0x1d40
[ 720.388639] rawhide.localdomain kernel: [<ffffffff810dbb26>] ? local_clock+0x16/0x30
[ 720.389361] rawhide.localdomain kernel: [<ffffffff810f5a7f>] ? lock_release_holdtime.part.28+0xf/0x200
[ 720.389917] rawhide.localdomain kernel: [<ffffffff8135e90a>] ? avc_has_perm+0x15a/0x350
[ 720.390633] rawhide.localdomain kernel: [<ffffffff8135e7e4>] ? avc_has_perm+0x34/0x350
[ 720.391295] rawhide.localdomain kernel: [<ffffffff813b0bcc>] blkdev_ioctl+0x1cc/0x840
[ 720.391835] rawhide.localdomain kernel: [<ffffffff8127e763>] block_ioctl+0x43/0x50
[ 720.392501] rawhide.localdomain kernel: [<ffffffff812513e0>] do_vfs_ioctl+0x2f0/0x520
[ 720.393187] rawhide.localdomain kernel: [<ffffffff81251691>] SyS_ioctl+0x81/0xa0
[ 720.393692] rawhide.localdomain kernel: [<ffffffff81155eac>] ? __audit_syscall_entry+0x9c/0xf0
[ 720.393894] rawhide.localdomain kernel: [<ffffffff817fca29>] system_call_fastpath+0x16/0x1b
[ 720.394313] rawhide.localdomain kernel: 2 locks held by mdadm/1209:
[ 720.394798] rawhide.localdomain kernel: #0: (&mddev->reconfig_mutex){+.+.+.}, at: [<ffffffff816132ba>] do_md_stop+0xda/0x570
[ 720.395450] rawhide.localdomain kernel: #1: (&mddev->open_mutex){+.+.+.}, at: [<ffffffff816132cf>] do_md_stop+0xef/0x570
avc_has_perm looks selinux related, but enforcing=0
Also, after rebooting it's possible to stop the array without error. It only seems to happen when issued shortly after umount. So I'm not really sure what's going on here.
Setup:
VirtualBox 4.3.10 2.5G RAM, 3 CPUs, 3x 8GB virtual disks are on SATA
+ Fedora Rawhide
Versions:
kernel-3.15.0-0.rc3.git0.1.fc21.x86_64
kernel-3.15.0-0.rc3.git3.1.fc21.x86_64 (debug kernel)
mdadm-3.3-7.fc21.x86_64
Reproduce steps:
# mdadm -C /dev/md0 -n 3 -l raid10 --assume-clean /dev/sd[bcd]
# mkfs.xfs /dev/md0
# mount /dev/md0 /mnt
# umount /mnt
# mdadm -S /dev/md0
--hang--
Additional info:
# mdadm -D /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Sat May 3 16:47:43 2014
Raid Level : raid10
Array Size : 12570624 (11.99 GiB 12.87 GB)
Used Dev Size : 8380416 (7.99 GiB 8.58 GB)
Raid Devices : 3
Total Devices : 3
Persistence : Superblock is persistent
Update Time : Sat May 3 16:47:43 2014
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0
Layout : near=2
Chunk Size : 512K
Name : rawhide.localdomain:0 (local to host rawhide.localdomain)
UUID : c9aa4d6c:0870ebeb:1fb70b0a:3af4bdf3
Events : 0
Number Major Minor RaidDevice State
0 8 16 0 active sync /dev/sdb
1 8 32 1 active sync /dev/sdc
2 8 48 2 active sync /dev/sdd
# mkfs.xfs /dev/md0
log stripe unit (524288 bytes) is too large (maximum is 256KiB)
log stripe unit adjusted to 32KiB
meta-data=/dev/md0 isize=256 agcount=16, agsize=196480 blks
= sectsz=512 attr=2, projid32bit=1
= crc=0
data = bsize=4096 blocks=3142656, imaxpct=25
= sunit=128 swidth=384 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=0
log =internal log bsize=4096 blocks=12800, version=2
= sectsz=512 sunit=8 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
## Not included in dmesg, but included in journalctl (which is attached to bug report also) is this suspicious udev entry from systemd:
[ 450.100607] rawhide.localdomain systemd-udevd[310]: worker [1210] /devices/virtual/block/md0 timeout; kill it
[ 450.109793] rawhide.localdomain systemd-udevd[310]: seq 1902 '/devices/virtual/block/md0' killed
[ 450.114924] rawhide.localdomain systemd-udevd[310]: worker [1210] terminated by signal 9 (Killed)
^ permalink raw reply [flat|nested] 2+ messages in thread
* Re: task mdadm blocked when stopping array, 3.15rc3
2014-05-03 23:16 task mdadm blocked when stopping array, 3.15rc3 Chris Murphy
@ 2014-05-05 3:40 ` NeilBrown
0 siblings, 0 replies; 2+ messages in thread
From: NeilBrown @ 2014-05-05 3:40 UTC (permalink / raw)
To: Chris Murphy; +Cc: linux-raid@vger.kernel.org List, Kent Overstreet
[-- Attachment #1: Type: text/plain, Size: 2515 bytes --]
On Sat, 3 May 2014 17:16:18 -0600 Chris Murphy <lists@colorremedies.com>
wrote:
> When I issue mdadm -S /dev/md0, I get a hang which does not recover after 30+ minutes. This is what appears in dmesg (partial), but I also have issued sysrq-w and included a followup dmesg and journalctl both of which are attached to this kernel bug because it's so wide it just looks ugly in email:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=75451
Thanks for the report.
Patch below should fix it. I'll send it upstream shortly.
I don't think the systemd-udevd messages are relevant.... I wonder what they
mean though.
NeilBrown
From bbba3bc5932a56fdaeecfda87597c1cac5d84803 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.de>
Date: Mon, 5 May 2014 13:34:37 +1000
Subject: [PATCH] md/raid10: call wait_barrier() for each request submitted.
wait_barrier() includes a counter, so we must call it precisely once
(unless balanced by allow_barrier()) for each request submitted.
Since
commit 20d0189b1012a37d2533a87fb451f7852f2418d1
block: Introduce new bio_split()
in 3.14-rc1, we don't call it for the extra requests generated when
we need to split a bio.
When this happens the counter goes negative, any resync/recovery will
never start, and "mdadm --stop" will hang.
Reported-by: Chris Murphy <lists@colorremedies.com>
Fixes: 20d0189b1012a37d2533a87fb451f7852f2418d1
Cc: stable@vger.kernel.org (3.14+)
Cc: Kent Overstreet <kmo@daterainc.com>
Signed-off-by: NeilBrown <neilb@suse.de>
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 33fc408e5eac..cb882aae9e20 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1172,6 +1172,13 @@ static void __make_request(struct mddev *mddev, struct bio *bio)
int max_sectors;
int sectors;
+ /*
+ * Register the new request and wait if the reconstruction
+ * thread has put up a bar for new requests.
+ * Continue immediately if no resync is active currently.
+ */
+ wait_barrier(conf);
+
sectors = bio_sectors(bio);
while (test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) &&
bio->bi_iter.bi_sector < conf->reshape_progress &&
@@ -1552,12 +1559,6 @@ static void make_request(struct mddev *mddev, struct bio *bio)
md_write_start(mddev, bio);
- /*
- * Register the new request and wait if the reconstruction
- * thread has put up a bar for new requests.
- * Continue immediately if no resync is active currently.
- */
- wait_barrier(conf);
do {
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply related [flat|nested] 2+ messages in thread
end of thread, other threads:[~2014-05-05 3:40 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-05-03 23:16 task mdadm blocked when stopping array, 3.15rc3 Chris Murphy
2014-05-05 3:40 ` NeilBrown
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).