* md deadlock (2.6.31-rc5-git2)
From: Dave Jones @ 2009-08-17 20:17 UTC
To: Linux Kernel
This kernel is a bit old (it's what we froze on for Fedora 12 alpha,
and we haven't started building install images with anything newer yet),
but I don't recall seeing anything similar posted recently..
While creating a series of md arrays, I got the mdadm process to just lock up.
Looking in dmesg showed that it had warned about it too ..
Dave
...
type=1403 audit(1250524416.444:2): policy loaded auid=4294967295 ses=4294967295
md: raid0 personality registered for level 0
md: raid1 personality registered for level 1
xor: automatically using best checksumming function: generic_sse
generic_sse: 4188.000 MB/sec
xor: using function: generic_sse (4188.000 MB/sec)
async_tx: api initialized (async)
raid6: int64x1 1199 MB/s
raid6: int64x2 1363 MB/s
raid6: int64x4 1570 MB/s
raid6: int64x8 1265 MB/s
raid6: sse2x1 1734 MB/s
raid6: sse2x2 2750 MB/s
raid6: sse2x4 2843 MB/s
raid6: using algorithm sse2x4 (2843 MB/s)
md: raid6 personality registered for level 6
md: raid5 personality registered for level 5
md: raid4 personality registered for level 4
md: raid10 personality registered for level 10
md: linear personality registered for level -1
device-mapper: multipath: version 1.1.0 loaded
device-mapper: multipath round-robin: version 1.0.0 loaded
executing set pll
executing set crtc timing
[drm] TV-5: set mode 1280x1024 1d
end_request: I/O error, dev fd0, sector 0
SGI XFS with ACLs, security attributes, large block/inode numbers, no debug enabled
SGI XFS Quota Management subsystem
md: bind<sda11>
md: bind<sdb11>
raid1: md0 is not clean -- starting background reconstruction
raid1: raid set md0 active with 2 out of 2 mirrors
md0: detected capacity change from 0 to 104726528
md: resync of RAID array md0
md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync.
md: using 128k window, over a total of 102272 blocks.
md0: unknown partition table
md: bind<sda1>
md: bind<sdb1>
raid0: looking at sdb1
raid0: comparing sdb1(20479872)
with sdb1(20479872)
raid0: END
raid0: ==> UNIQUE
raid0: 1 zones
raid0: looking at sda1
raid0: comparing sda1(20479872)
with sdb1(20479872)
raid0: EQUAL
raid0: FINAL 1 zones
raid0: done.
raid0 : md_size is 40959744 sectors.
******* md1 configuration *********
zone0=[sda1/sdb1/]
zone offset=0kb device offset=0kb size=20479872kb
**********************************
md1: detected capacity change from 0 to 20971388928
md1: unknown partition table
md: bind<sda2>
md: bind<sdb2>
raid0: looking at sdb2
raid0: comparing sdb2(2047872)
with sdb2(2047872)
raid0: END
raid0: ==> UNIQUE
raid0: 1 zones
raid0: looking at sda2
raid0: comparing sda2(2047872)
with sdb2(2047872)
raid0: EQUAL
raid0: FINAL 1 zones
raid0: done.
raid0 : md_size is 4095744 sectors.
******* md2 configuration *********
zone0=[sda2/sdb2/]
zone offset=0kb device offset=0kb size=2047872kb
**********************************
md2: detected capacity change from 0 to 2097020928
md2: unknown partition table
md: bind<sda3>
md: bind<sdb3>
raid0: looking at sdb3
raid0: comparing sdb3(2047872)
with sdb3(2047872)
raid0: END
raid0: ==> UNIQUE
raid0: 1 zones
raid0: looking at sda3
raid0: comparing sda3(2047872)
with sdb3(2047872)
raid0: EQUAL
raid0: FINAL 1 zones
raid0: done.
raid0 : md_size is 4095744 sectors.
******* md3 configuration *********
zone0=[sda3/sdb3/]
zone offset=0kb device offset=0kb size=2047872kb
**********************************
md: md0: resync done.
RAID1 conf printout:
--- wd:2 rd:2
disk 0, wo:0, o:1, dev:sda11
disk 1, wo:0, o:1, dev:sdb11
INFO: task mdadm:2249 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mdadm D 0000000000000003 4664 2249 333 0x00000080
ffff880023c81938 0000000000000086 0000000000000000 0000000000000001
ffff88002bb124a0 0000000000000007 0000000000000006 ffff88003f417028
ffff88002bb12890 000000000000fa20 ffff88002bb12890 00000000001d5bc0
Call Trace:
[<ffffffff81096546>] ? trace_hardirqs_on_caller+0x139/0x175
[<ffffffff8116ed21>] ? revalidate_disk+0x5e/0x9d
[<ffffffff814fb166>] __mutex_lock_common+0x21e/0x3bf
[<ffffffff8116ed21>] ? revalidate_disk+0x5e/0x9d
[<ffffffff814fb42a>] mutex_lock_nested+0x4f/0x6b
[<ffffffff8116ed21>] revalidate_disk+0x5e/0x9d
[<ffffffff813f35ce>] do_md_run+0x886/0x92f
[<ffffffff814fb356>] ? mutex_lock_interruptible_nested+0x4f/0x6a
[<ffffffff813f5f30>] md_ioctl+0x11b6/0x142b
[<ffffffff81095fab>] ? mark_lock+0x3c/0x253
[<ffffffff81095fab>] ? mark_lock+0x3c/0x253
[<ffffffff81263f12>] __blkdev_driver_ioctl+0x36/0x95
[<ffffffff81264895>] blkdev_ioctl+0x8d6/0x925
[<ffffffff8101aa23>] ? native_sched_clock+0x2d/0x62
[<ffffffff8122bae6>] ? __rcu_read_unlock+0x34/0x4a
[<ffffffff8122ca90>] ? avc_has_perm_noaudit+0x3c9/0x3ef
[<ffffffff8122cb21>] ? avc_has_perm+0x6b/0x91
[<ffffffff81096546>] ? trace_hardirqs_on_caller+0x139/0x175
[<ffffffff8116e5b4>] block_ioctl+0x4a/0x62
[<ffffffff81150e03>] vfs_ioctl+0x31/0xaa
[<ffffffff811513c5>] do_vfs_ioctl+0x4aa/0x506
[<ffffffff81151486>] sys_ioctl+0x65/0x9c
[<ffffffff81012f42>] system_call_fastpath+0x16/0x1b
2 locks held by mdadm/2249:
#0: (&new->reconfig_mutex#2){+.+.+.}, at: [<ffffffff813edcab>] mddev_lock+0x2a/0x40
#1: (&bdev->bd_mutex){+.+.+.}, at: [<ffffffff8116ed21>] revalidate_disk+0x5e/0x9d
INFO: task mdadm:2249 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mdadm D 0000000000000003 4664 2249 333 0x00000080
ffff880023c81938 0000000000000086 0000000000000000 0000000000000001
ffff88002bb124a0 0000000000000007 0000000000000006 ffff88003f417028
ffff88002bb12890 000000000000fa20 ffff88002bb12890 00000000001d5bc0
Call Trace:
[<ffffffff81096546>] ? trace_hardirqs_on_caller+0x139/0x175
[<ffffffff8116ed21>] ? revalidate_disk+0x5e/0x9d
[<ffffffff814fb166>] __mutex_lock_common+0x21e/0x3bf
[<ffffffff8116ed21>] ? revalidate_disk+0x5e/0x9d
[<ffffffff814fb42a>] mutex_lock_nested+0x4f/0x6b
[<ffffffff8116ed21>] revalidate_disk+0x5e/0x9d
[<ffffffff813f35ce>] do_md_run+0x886/0x92f
[<ffffffff814fb356>] ? mutex_lock_interruptible_nested+0x4f/0x6a
[<ffffffff813f5f30>] md_ioctl+0x11b6/0x142b
[<ffffffff81095fab>] ? mark_lock+0x3c/0x253
[<ffffffff81095fab>] ? mark_lock+0x3c/0x253
[<ffffffff81263f12>] __blkdev_driver_ioctl+0x36/0x95
[<ffffffff81264895>] blkdev_ioctl+0x8d6/0x925
[<ffffffff8101aa23>] ? native_sched_clock+0x2d/0x62
[<ffffffff8122bae6>] ? __rcu_read_unlock+0x34/0x4a
[<ffffffff8122ca90>] ? avc_has_perm_noaudit+0x3c9/0x3ef
[<ffffffff8122cb21>] ? avc_has_perm+0x6b/0x91
[<ffffffff81096546>] ? trace_hardirqs_on_caller+0x139/0x175
[<ffffffff8116e5b4>] block_ioctl+0x4a/0x62
[<ffffffff81150e03>] vfs_ioctl+0x31/0xaa
[<ffffffff811513c5>] do_vfs_ioctl+0x4aa/0x506
[<ffffffff81151486>] sys_ioctl+0x65/0x9c
[<ffffffff81012f42>] system_call_fastpath+0x16/0x1b
2 locks held by mdadm/2249:
#0: (&new->reconfig_mutex#2){+.+.+.}, at: [<ffffffff813edcab>] mddev_lock+0x2a/0x40
#1: (&bdev->bd_mutex){+.+.+.}, at: [<ffffffff8116ed21>] revalidate_disk+0x5e/0x9d
* Re: md deadlock (2.6.31-rc5-git2)
From: Mike Snitzer @ 2009-08-17 21:16 UTC
To: Dave Jones, Linux Kernel
On Mon, Aug 17, 2009 at 4:17 PM, Dave Jones <davej@redhat.com> wrote:
>
> This kernel is a bit old (it's what we froze on for Fedora 12 alpha,
> and we haven't started building install images with anything newer yet),
> but I don't recall seeing anything similar posted recently..
>
> While creating a series of md arrays, I got the mdadm process to just lock up.
> Looking in dmesg showed that it had warned about it too ..
...
> INFO: task mdadm:2249 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> mdadm D 0000000000000003 4664 2249 333 0x00000080
> ffff880023c81938 0000000000000086 0000000000000000 0000000000000001
> ffff88002bb124a0 0000000000000007 0000000000000006 ffff88003f417028
> ffff88002bb12890 000000000000fa20 ffff88002bb12890 00000000001d5bc0
> Call Trace:
> [<ffffffff81096546>] ? trace_hardirqs_on_caller+0x139/0x175
> [<ffffffff8116ed21>] ? revalidate_disk+0x5e/0x9d
> [<ffffffff814fb166>] __mutex_lock_common+0x21e/0x3bf
> [<ffffffff8116ed21>] ? revalidate_disk+0x5e/0x9d
> [<ffffffff814fb42a>] mutex_lock_nested+0x4f/0x6b
> [<ffffffff8116ed21>] revalidate_disk+0x5e/0x9d
> [<ffffffff813f35ce>] do_md_run+0x886/0x92f
> [<ffffffff814fb356>] ? mutex_lock_interruptible_nested+0x4f/0x6a
> [<ffffffff813f5f30>] md_ioctl+0x11b6/0x142b
> [<ffffffff81095fab>] ? mark_lock+0x3c/0x253
> [<ffffffff81095fab>] ? mark_lock+0x3c/0x253
> [<ffffffff81263f12>] __blkdev_driver_ioctl+0x36/0x95
> [<ffffffff81264895>] blkdev_ioctl+0x8d6/0x925
> [<ffffffff8101aa23>] ? native_sched_clock+0x2d/0x62
> [<ffffffff8122bae6>] ? __rcu_read_unlock+0x34/0x4a
> [<ffffffff8122ca90>] ? avc_has_perm_noaudit+0x3c9/0x3ef
> [<ffffffff8122cb21>] ? avc_has_perm+0x6b/0x91
> [<ffffffff81096546>] ? trace_hardirqs_on_caller+0x139/0x175
> [<ffffffff8116e5b4>] block_ioctl+0x4a/0x62
> [<ffffffff81150e03>] vfs_ioctl+0x31/0xaa
> [<ffffffff811513c5>] do_vfs_ioctl+0x4aa/0x506
> [<ffffffff81151486>] sys_ioctl+0x65/0x9c
> [<ffffffff81012f42>] system_call_fastpath+0x16/0x1b
> 2 locks held by mdadm/2249:
> #0: (&new->reconfig_mutex#2){+.+.+.}, at: [<ffffffff813edcab>] mddev_lock+0x2a/0x40
> #1: (&bdev->bd_mutex){+.+.+.}, at: [<ffffffff8116ed21>] revalidate_disk+0x5e/0x9d
> INFO: task mdadm:2249 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> mdadm D 0000000000000003 4664 2249 333 0x00000080
> ffff880023c81938 0000000000000086 0000000000000000 0000000000000001
> ffff88002bb124a0 0000000000000007 0000000000000006 ffff88003f417028
> ffff88002bb12890 000000000000fa20 ffff88002bb12890 00000000001d5bc0
> Call Trace:
> [<ffffffff81096546>] ? trace_hardirqs_on_caller+0x139/0x175
> [<ffffffff8116ed21>] ? revalidate_disk+0x5e/0x9d
> [<ffffffff814fb166>] __mutex_lock_common+0x21e/0x3bf
> [<ffffffff8116ed21>] ? revalidate_disk+0x5e/0x9d
> [<ffffffff814fb42a>] mutex_lock_nested+0x4f/0x6b
> [<ffffffff8116ed21>] revalidate_disk+0x5e/0x9d
> [<ffffffff813f35ce>] do_md_run+0x886/0x92f
> [<ffffffff814fb356>] ? mutex_lock_interruptible_nested+0x4f/0x6a
> [<ffffffff813f5f30>] md_ioctl+0x11b6/0x142b
> [<ffffffff81095fab>] ? mark_lock+0x3c/0x253
> [<ffffffff81095fab>] ? mark_lock+0x3c/0x253
> [<ffffffff81263f12>] __blkdev_driver_ioctl+0x36/0x95
> [<ffffffff81264895>] blkdev_ioctl+0x8d6/0x925
> [<ffffffff8101aa23>] ? native_sched_clock+0x2d/0x62
> [<ffffffff8122bae6>] ? __rcu_read_unlock+0x34/0x4a
> [<ffffffff8122ca90>] ? avc_has_perm_noaudit+0x3c9/0x3ef
> [<ffffffff8122cb21>] ? avc_has_perm+0x6b/0x91
> [<ffffffff81096546>] ? trace_hardirqs_on_caller+0x139/0x175
> [<ffffffff8116e5b4>] block_ioctl+0x4a/0x62
> [<ffffffff81150e03>] vfs_ioctl+0x31/0xaa
> [<ffffffff811513c5>] do_vfs_ioctl+0x4aa/0x506
> [<ffffffff81151486>] sys_ioctl+0x65/0x9c
> [<ffffffff81012f42>] system_call_fastpath+0x16/0x1b
> 2 locks held by mdadm/2249:
> #0: (&new->reconfig_mutex#2){+.+.+.}, at: [<ffffffff813edcab>] mddev_lock+0x2a/0x40
> #1: (&bdev->bd_mutex){+.+.+.}, at: [<ffffffff8116ed21>] revalidate_disk+0x5e/0x9d
This was fixed with commit: c8c00a6915a2e3d10416e8bdd3138429beb96210
* Re: md deadlock (2.6.31-rc5-git2)
From: Dave Jones @ 2009-08-17 21:31 UTC
To: Mike Snitzer; +Cc: Linux Kernel
On Mon, Aug 17, 2009 at 05:16:25PM -0400, Mike Snitzer wrote:
> On Mon, Aug 17, 2009 at 4:17 PM, Dave Jones <davej@redhat.com> wrote:
> >
> > This kernel is a bit old (it's what we froze on for Fedora 12 alpha,
> > and we haven't started building install images with anything newer yet),
> > but I don't recall seeing anything similar posted recently..
> >
> > While creating a series of md arrays, I got the mdadm process to just lock up.
> > Looking in dmesg showed that it had warned about it too ..
> ...
> > 2 locks held by mdadm/2249:
> > #0: (&new->reconfig_mutex#2){+.+.+.}, at: [<ffffffff813edcab>] mddev_lock+0x2a/0x40
> > #1: (&bdev->bd_mutex){+.+.+.}, at: [<ffffffff8116ed21>] revalidate_disk+0x5e/0x9d
>
> This was fixed with commit: c8c00a6915a2e3d10416e8bdd3138429beb96210
excellent! I missed that. thanks.
Dave