* md deadlock (2.6.31-rc5-git2)
From: Dave Jones @ 2009-08-17 20:17 UTC
  To: Linux Kernel

This kernel is a bit old (it's what we froze on for Fedora 12 alpha,
and we haven't started building install images with anything newer yet),
but I don't recall seeing anything similar posted recently..

While creating a series of md arrays, I got the mdadm process to just lock up.
Looking in dmesg showed that it had warned about it too ..

	Dave


...
type=1403 audit(1250524416.444:2): policy loaded auid=4294967295 ses=4294967295
md: raid0 personality registered for level 0
md: raid1 personality registered for level 1
xor: automatically using best checksumming function: generic_sse
   generic_sse:  4188.000 MB/sec
xor: using function: generic_sse (4188.000 MB/sec)
async_tx: api initialized (async)
raid6: int64x1   1199 MB/s
raid6: int64x2   1363 MB/s
raid6: int64x4   1570 MB/s
raid6: int64x8   1265 MB/s
raid6: sse2x1    1734 MB/s
raid6: sse2x2    2750 MB/s
raid6: sse2x4    2843 MB/s
raid6: using algorithm sse2x4 (2843 MB/s)
md: raid6 personality registered for level 6
md: raid5 personality registered for level 5
md: raid4 personality registered for level 4
md: raid10 personality registered for level 10
md: linear personality registered for level -1
device-mapper: multipath: version 1.1.0 loaded
device-mapper: multipath round-robin: version 1.0.0 loaded
executing set pll
executing set crtc timing
[drm] TV-5: set mode 1280x1024 1d
end_request: I/O error, dev fd0, sector 0
SGI XFS with ACLs, security attributes, large block/inode numbers, no debug enabled
SGI XFS Quota Management subsystem
md: bind<sda11>
md: bind<sdb11>
raid1: md0 is not clean -- starting background reconstruction
raid1: raid set md0 active with 2 out of 2 mirrors
md0: detected capacity change from 0 to 104726528
md: resync of RAID array md0
md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync.
md: using 128k window, over a total of 102272 blocks.
 md0: unknown partition table
md: bind<sda1>
md: bind<sdb1>
raid0: looking at sdb1
raid0:   comparing sdb1(20479872)
 with sdb1(20479872)
raid0:   END
raid0:   ==> UNIQUE
raid0: 1 zones
raid0: looking at sda1
raid0:   comparing sda1(20479872)
 with sdb1(20479872)
raid0:   EQUAL
raid0: FINAL 1 zones
raid0: done.
raid0 : md_size is 40959744 sectors.
******* md1 configuration *********
zone0=[sda1/sdb1/]
        zone offset=0kb device offset=0kb size=20479872kb
**********************************

md1: detected capacity change from 0 to 20971388928
 md1: unknown partition table
md: bind<sda2>
md: bind<sdb2>
raid0: looking at sdb2
raid0:   comparing sdb2(2047872)
 with sdb2(2047872)
raid0:   END
raid0:   ==> UNIQUE
raid0: 1 zones
raid0: looking at sda2
raid0:   comparing sda2(2047872)
 with sdb2(2047872)
raid0:   EQUAL
raid0: FINAL 1 zones
raid0: done.
raid0 : md_size is 4095744 sectors.
******* md2 configuration *********
zone0=[sda2/sdb2/]
        zone offset=0kb device offset=0kb size=2047872kb
**********************************

md2: detected capacity change from 0 to 2097020928
 md2: unknown partition table
md: bind<sda3>
md: bind<sdb3>
raid0: looking at sdb3
raid0:   comparing sdb3(2047872)
 with sdb3(2047872)
raid0:   END
raid0:   ==> UNIQUE
raid0: 1 zones
raid0: looking at sda3
raid0:   comparing sda3(2047872)
 with sdb3(2047872)
raid0:   EQUAL
raid0: FINAL 1 zones
raid0: done.
raid0 : md_size is 4095744 sectors.
******* md3 configuration *********
zone0=[sda3/sdb3/]
        zone offset=0kb device offset=0kb size=2047872kb
**********************************

md: md0: resync done.
RAID1 conf printout:
 --- wd:2 rd:2
 disk 0, wo:0, o:1, dev:sda11
 disk 1, wo:0, o:1, dev:sdb11
INFO: task mdadm:2249 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mdadm         D 0000000000000003  4664  2249    333 0x00000080
 ffff880023c81938 0000000000000086 0000000000000000 0000000000000001
 ffff88002bb124a0 0000000000000007 0000000000000006 ffff88003f417028
 ffff88002bb12890 000000000000fa20 ffff88002bb12890 00000000001d5bc0
Call Trace:
 [<ffffffff81096546>] ? trace_hardirqs_on_caller+0x139/0x175
 [<ffffffff8116ed21>] ? revalidate_disk+0x5e/0x9d
 [<ffffffff814fb166>] __mutex_lock_common+0x21e/0x3bf
 [<ffffffff8116ed21>] ? revalidate_disk+0x5e/0x9d
 [<ffffffff814fb42a>] mutex_lock_nested+0x4f/0x6b
 [<ffffffff8116ed21>] revalidate_disk+0x5e/0x9d
 [<ffffffff813f35ce>] do_md_run+0x886/0x92f
 [<ffffffff814fb356>] ? mutex_lock_interruptible_nested+0x4f/0x6a
 [<ffffffff813f5f30>] md_ioctl+0x11b6/0x142b
 [<ffffffff81095fab>] ? mark_lock+0x3c/0x253
 [<ffffffff81095fab>] ? mark_lock+0x3c/0x253
 [<ffffffff81263f12>] __blkdev_driver_ioctl+0x36/0x95
 [<ffffffff81264895>] blkdev_ioctl+0x8d6/0x925
 [<ffffffff8101aa23>] ? native_sched_clock+0x2d/0x62
 [<ffffffff8122bae6>] ? __rcu_read_unlock+0x34/0x4a
 [<ffffffff8122ca90>] ? avc_has_perm_noaudit+0x3c9/0x3ef
 [<ffffffff8122cb21>] ? avc_has_perm+0x6b/0x91
 [<ffffffff81096546>] ? trace_hardirqs_on_caller+0x139/0x175
 [<ffffffff8116e5b4>] block_ioctl+0x4a/0x62
 [<ffffffff81150e03>] vfs_ioctl+0x31/0xaa
 [<ffffffff811513c5>] do_vfs_ioctl+0x4aa/0x506
 [<ffffffff81151486>] sys_ioctl+0x65/0x9c
 [<ffffffff81012f42>] system_call_fastpath+0x16/0x1b
2 locks held by mdadm/2249:
 #0:  (&new->reconfig_mutex#2){+.+.+.}, at: [<ffffffff813edcab>] mddev_lock+0x2a/0x40
 #1:  (&bdev->bd_mutex){+.+.+.}, at: [<ffffffff8116ed21>] revalidate_disk+0x5e/0x9d
INFO: task mdadm:2249 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mdadm         D 0000000000000003  4664  2249    333 0x00000080
 ffff880023c81938 0000000000000086 0000000000000000 0000000000000001
 ffff88002bb124a0 0000000000000007 0000000000000006 ffff88003f417028
 ffff88002bb12890 000000000000fa20 ffff88002bb12890 00000000001d5bc0
Call Trace:
 [<ffffffff81096546>] ? trace_hardirqs_on_caller+0x139/0x175
 [<ffffffff8116ed21>] ? revalidate_disk+0x5e/0x9d
 [<ffffffff814fb166>] __mutex_lock_common+0x21e/0x3bf
 [<ffffffff8116ed21>] ? revalidate_disk+0x5e/0x9d
 [<ffffffff814fb42a>] mutex_lock_nested+0x4f/0x6b
 [<ffffffff8116ed21>] revalidate_disk+0x5e/0x9d
 [<ffffffff813f35ce>] do_md_run+0x886/0x92f
 [<ffffffff814fb356>] ? mutex_lock_interruptible_nested+0x4f/0x6a
 [<ffffffff813f5f30>] md_ioctl+0x11b6/0x142b
 [<ffffffff81095fab>] ? mark_lock+0x3c/0x253
 [<ffffffff81095fab>] ? mark_lock+0x3c/0x253
 [<ffffffff81263f12>] __blkdev_driver_ioctl+0x36/0x95
 [<ffffffff81264895>] blkdev_ioctl+0x8d6/0x925
 [<ffffffff8101aa23>] ? native_sched_clock+0x2d/0x62
 [<ffffffff8122bae6>] ? __rcu_read_unlock+0x34/0x4a
 [<ffffffff8122ca90>] ? avc_has_perm_noaudit+0x3c9/0x3ef
 [<ffffffff8122cb21>] ? avc_has_perm+0x6b/0x91
 [<ffffffff81096546>] ? trace_hardirqs_on_caller+0x139/0x175
 [<ffffffff8116e5b4>] block_ioctl+0x4a/0x62
 [<ffffffff81150e03>] vfs_ioctl+0x31/0xaa
 [<ffffffff811513c5>] do_vfs_ioctl+0x4aa/0x506
 [<ffffffff81151486>] sys_ioctl+0x65/0x9c
 [<ffffffff81012f42>] system_call_fastpath+0x16/0x1b
2 locks held by mdadm/2249:
 #0:  (&new->reconfig_mutex#2){+.+.+.}, at: [<ffffffff813edcab>] mddev_lock+0x2a/0x40
 #1:  (&bdev->bd_mutex){+.+.+.}, at: [<ffffffff8116ed21>] revalidate_disk+0x5e/0x9d
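
The lockdep output above is the interesting part: mdadm already holds
the array's reconfig_mutex (taken in mddev_lock) and is blocked inside
revalidate_disk() waiting for bdev->bd_mutex.  If any other task
acquires those two locks in the opposite order, presumably an open()
of the md device that holds bd_mutex and then wants reconfig_mutex,
the two tasks wait on each other forever.  Purely as an illustration
of that ABBA pattern, here is a minimal userspace sketch.  The names
are hypothetical: lock_a stands in for reconfig_mutex and lock_b for
bd_mutex; this is not the kernel code itself.

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER; /* "reconfig_mutex" */
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER; /* "bd_mutex" */

/* Mirrors the do_md_run() side: A is already held, then B is wanted. */
static void *array_run(void *arg)
{
	pthread_mutex_lock(&lock_a);
	sleep(1);                        /* widen the race window */
	pthread_mutex_lock(&lock_b);     /* blocks once the opener holds B */
	pthread_mutex_unlock(&lock_b);
	pthread_mutex_unlock(&lock_a);
	return NULL;
}

/* Mirrors a concurrent opener: B is already held, then A is wanted. */
static void *device_open(void *arg)
{
	pthread_mutex_lock(&lock_b);
	sleep(1);
	pthread_mutex_lock(&lock_a);     /* ABBA: each side waits on the other */
	pthread_mutex_unlock(&lock_a);
	pthread_mutex_unlock(&lock_b);
	return NULL;
}

int main(void)
{
	pthread_t t1, t2;

	pthread_create(&t1, NULL, array_run, NULL);
	pthread_create(&t2, NULL, device_open, NULL);
	pthread_join(t1, NULL);          /* never returns: deadlocked */
	pthread_join(t2, NULL);
	puts("unreached");
	return 0;
}

Build it with "cc -pthread abba.c" (abba.c being an arbitrary file
name) and both threads block forever in their second mutex_lock(),
with nobody left to wake either of them; that is the userspace
analogue of what the hung-task detector is reporting above.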



* Re: md deadlock (2.6.31-rc5-git2)
From: Mike Snitzer @ 2009-08-17 21:16 UTC
  To: Dave Jones, Linux Kernel

On Mon, Aug 17, 2009 at 4:17 PM, Dave Jones <davej@redhat.com> wrote:
>
> This kernel is a bit old (it's what we froze on for Fedora 12 alpha,
> and we haven't started building install images with anything newer yet),
> but I don't recall seeing anything similar posted recently..
>
> While creating a series of md arrays, I got the mdadm process to just lock up.
> Looking in dmesg showed that it had warned about it too ..
...
> INFO: task mdadm:2249 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> mdadm         D 0000000000000003  4664  2249    333 0x00000080
>  ffff880023c81938 0000000000000086 0000000000000000 0000000000000001
>  ffff88002bb124a0 0000000000000007 0000000000000006 ffff88003f417028
>  ffff88002bb12890 000000000000fa20 ffff88002bb12890 00000000001d5bc0
> Call Trace:
>  [<ffffffff81096546>] ? trace_hardirqs_on_caller+0x139/0x175
>  [<ffffffff8116ed21>] ? revalidate_disk+0x5e/0x9d
>  [<ffffffff814fb166>] __mutex_lock_common+0x21e/0x3bf
>  [<ffffffff8116ed21>] ? revalidate_disk+0x5e/0x9d
>  [<ffffffff814fb42a>] mutex_lock_nested+0x4f/0x6b
>  [<ffffffff8116ed21>] revalidate_disk+0x5e/0x9d
>  [<ffffffff813f35ce>] do_md_run+0x886/0x92f
>  [<ffffffff814fb356>] ? mutex_lock_interruptible_nested+0x4f/0x6a
>  [<ffffffff813f5f30>] md_ioctl+0x11b6/0x142b
>  [<ffffffff81095fab>] ? mark_lock+0x3c/0x253
>  [<ffffffff81095fab>] ? mark_lock+0x3c/0x253
>  [<ffffffff81263f12>] __blkdev_driver_ioctl+0x36/0x95
>  [<ffffffff81264895>] blkdev_ioctl+0x8d6/0x925
>  [<ffffffff8101aa23>] ? native_sched_clock+0x2d/0x62
>  [<ffffffff8122bae6>] ? __rcu_read_unlock+0x34/0x4a
>  [<ffffffff8122ca90>] ? avc_has_perm_noaudit+0x3c9/0x3ef
>  [<ffffffff8122cb21>] ? avc_has_perm+0x6b/0x91
>  [<ffffffff81096546>] ? trace_hardirqs_on_caller+0x139/0x175
>  [<ffffffff8116e5b4>] block_ioctl+0x4a/0x62
>  [<ffffffff81150e03>] vfs_ioctl+0x31/0xaa
>  [<ffffffff811513c5>] do_vfs_ioctl+0x4aa/0x506
>  [<ffffffff81151486>] sys_ioctl+0x65/0x9c
>  [<ffffffff81012f42>] system_call_fastpath+0x16/0x1b
> 2 locks held by mdadm/2249:
>  #0:  (&new->reconfig_mutex#2){+.+.+.}, at: [<ffffffff813edcab>] mddev_lock+0x2a/0x40
>  #1:  (&bdev->bd_mutex){+.+.+.}, at: [<ffffffff8116ed21>] revalidate_disk+0x5e/0x9d
...

This was fixed by commit c8c00a6915a2e3d10416e8bdd3138429beb96210.


* Re: md deadlock (2.6.31-rc5-git2)
From: Dave Jones @ 2009-08-17 21:31 UTC
  To: Mike Snitzer; +Cc: Linux Kernel

On Mon, Aug 17, 2009 at 05:16:25PM -0400, Mike Snitzer wrote:
 > On Mon, Aug 17, 2009 at 4:17 PM, Dave Jones <davej@redhat.com> wrote:
 > >
 > > This kernel is a bit old (it's what we froze on for Fedora 12 alpha,
 > > and we haven't started building install images with anything newer yet),
 > > but I don't recall seeing anything similar posted recently..
 > >
 > > While creating a series of md arrays, I got the mdadm process to just lock up.
 > > Looking in dmesg showed that it had warned about it too ..
 > ...
 > > 2 locks held by mdadm/2249:
 > >  #0:  (&new->reconfig_mutex#2){+.+.+.}, at: [<ffffffff813edcab>] mddev_lock+0x2a/0x40
 > >  #1:  (&bdev->bd_mutex){+.+.+.}, at: [<ffffffff8116ed21>] revalidate_disk+0x5e/0x9d
 > 
 > This was fixed by commit c8c00a6915a2e3d10416e8bdd3138429beb96210.

Excellent! I missed that, thanks.

	Dave

