Re: [BUG] kernel 2.6.32.x hangs during boot process

linux-raid.vger.kernel.org archive mirror

* Re: [BUG] kernel 2.6.32.x hangs during boot process
       [not found] <d2deb3241001160158r4baed1e1t7e8f6642de018b4c@mail.gmail.com>
@ 2010-01-23  0:07 ` Andrew Morton
  2010-01-28  2:42   ` Neil Brown
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2010-01-23  0:07 UTC (permalink / raw)
  To: François Figarola; +Cc: linux-kernel, Neil Brown, linux-raid, Al Viro

(cc's added)

On Sat, 16 Jan 2010 10:58:30 +0100
François Figarola  <francois.figarola@i-consult.fr> wrote:

> Dear all,
> 
> First, I apologize for my poor English...
> 
> Since I've tried to boot 2.6.32.x kernel, my system hangs during the
> boot process, and I think it could be related to the problem reported
> earlier by Megastorage (http://lkml.org/lkml/2010/1/10/92).
> 
> The hardware is a Dell PowerEdge 2950 which runs fine with the
> 2.6.31.x kernel series (actually running with the latest 2.6.31.11),
> and the system is debian etch.
> 
> Here is the trace of the bug I've got (using netconsole) with a
> 2.6.32.3 kernel :
> 
> BUG: Dentry ffff880667690000{i=41a46,n=sleep} still in use (8)
> [unmount of ext3 dm-4]
> ------------[ cut here ]------------
> kernel BUG at fs/dcache.c:670!

That's

			if (atomic_read(&dentry->d_count) != 0) {
				printk(KERN_ERR
				       "BUG: Dentry %p{i=%lx,n=%s}"
				       " still in use (%d)"
				       " [unmount of %s %s]\n",
				       dentry,
				       dentry->d_inode ?
				       dentry->d_inode->i_ino : 0UL,
				       dentry->d_name.name,
				       atomic_read(&dentry->d_count),
				       dentry->d_sb->s_type->name,
				       dentry->d_sb->s_id);
				BUG();
			}
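
For anyone reading the report, the message fields map directly onto the
printk arguments above. Below is a small userspace sketch (not kernel code)
that splits such a line back into its fields; `struct dentry_bug` and
`parse_dentry_bug()` are names made up for this illustration. Note that the
inode number is printed with %lx, so "i=41a46" is hexadecimal.

```c
#include <stdio.h>

/* Fields of the "still in use" message, named after the printk arguments. */
struct dentry_bug {
	unsigned long long dentry;  /* %p:  address of the struct dentry */
	unsigned long ino;          /* %lx: inode number, printed in hex */
	char name[64];              /* %s:  dentry->d_name.name */
	int d_count;                /* %d:  leftover reference count */
	char fstype[16];            /* %s:  dentry->d_sb->s_type->name */
	char dev[16];               /* %s:  dentry->d_sb->s_id */
};

/*
 * Hypothetical helper: returns 0 on success, -1 if the line does not
 * match the message format. Whitespace in the format also matches a
 * newline, so a line wrapped by a mail client still parses.
 */
int parse_dentry_bug(const char *line, struct dentry_bug *out)
{
	int n = sscanf(line,
		       "BUG: Dentry %llx{i=%lx,n=%63[^}]} still in use (%d)"
		       " [unmount of %15s %15[^]]]",
		       &out->dentry, &out->ino, out->name, &out->d_count,
		       out->fstype, out->dev);

	return n == 6 ? 0 : -1;
}
```

Applied to the reported line, this gives a dentry named "sleep" with inode
0x41a46 and eight references still held, on the ext3 filesystem on dm-4.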

I'm a bit surprised that the system is doing a dm suspend/resume during
the boot process.

I assume it's a DM bug, dunno.

> invalid opcode: 0000 [#1] SMP
> last sysfs file: /sys/block/dm-2/removable
> CPU 0
> Modules linked in: i5k_amb hwmon button processor thermal fan [last
> unloaded: scsi_wait_scan]
> Pid: 3311, comm: kpartx Not tainted 2.6.32.3 #2 PowerEdge 2950
> RIP: 0010:[<ffffffff810f95f0>]  [<ffffffff810f95f0>]
> shrink_dcache_for_umount_subtree+0x280/0x290
> RSP: 0018:ffff88066670dcf8  EFLAGS: 00010296
> RAX: 000000000000005c RBX: ffff8806677696c0 RCX: 0000000000000096
> RDX: 0000000000006767 RSI: 0000000000000046 RDI: 0000000000000246
> RBP: ffff880667690000 R08: 0000000000000000 R09: ffff8806670d1628
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff880667690060
> R13: 0000000000000007 R14: ffff8806654d1a88 R15: 0000000000dec0b0
> FS:  00007f176e96b770(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00007fff0a2e0080 CR3: 0000000666607000 CR4: 00000000000006f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process kpartx (pid: 3311, threadinfo ffff88066670c000, task ffff8806652997d0)
> Stack:
> ffff880665b8b178 ffff880665b8af18 ffffffff81619600 0000000000000001
> <0> ffff880667408e00 ffffffff810f9629 ffff880665b8af18 ffffffff810e8049
> <0> ffff8806651333f8 ffff880667408e00 ffffffff8185fc00 ffffffff810e8159
> Call Trace:
> [<ffffffff810f9629>] ? shrink_dcache_for_umount+0x29/0x50
> [<ffffffff810e8049>] ? generic_shutdown_super+0x19/0x100
> [<ffffffff810e8159>] ? kill_block_super+0x29/0x50
> [<ffffffff810e8238>] ? deactivate_locked_super+0x58/0x80
> [<ffffffff81112842>] ? thaw_bdev+0xd2/0x110
> [<ffffffff814b0c67>] ? dm_resume+0xf7/0x160
> [<ffffffff814b5f00>] ? dev_suspend+0x0/0x220
> [<ffffffff814b60b1>] ? dev_suspend+0x1b1/0x220
> [<ffffffff814b6c7b>] ? ctl_ioctl+0x1eb/0x260
> [<ffffffff810c0b1b>] ? handle_mm_fault+0x63b/0x990
> [<ffffffff814b6cfe>] ? dm_ctl_ioctl+0xe/0x20
> [<ffffffff8104991a>] ? finish_task_switch+0x3a/0xc0
> [<ffffffff810f4e9f>] ? vfs_ioctl+0x2f/0xb0
> [<ffffffff810f53bb>] ? do_vfs_ioctl+0x3fb/0x580
> [<ffffffff815fb101>] ? thread_return+0x3e/0x64d
> [<ffffffff810f55e1>] ? sys_ioctl+0xa1/0xb0
> [<ffffffff8100bf02>] ? system_call_fastpath+0x16/0x1b
> Code: 4d 38 48 8b 45 10 48 85 c0 74 04 48 8b 50 40 48 8d 86 60 02 00
> 00 48 c7 c7 a8 66 76 81 48 89 04 24 48 89 ee 31 c0 e8 a9 11 50 00 <0f>
> 0b eb fe 0f 0b eb fe 0f 1f 84 00 00 00 00 00 53 48 89 fb 48
> RIP  [<ffffffff810f95f0>] shrink_dcache_for_umount_subtree+0x280/0x290
> RSP <ffff88066670dcf8>
> ---[ end trace 3cc1cb65fcc6a8ca ]---
> 
> Here is another trace showing the same behavior on a newly compiled
> kernel with more debug options, but I can't see any difference:
> 
> BUG: Dentry ffff880667556738{i=41a46,n=sleep} still in use (8)
> [unmount of ext3 dm-4]
> ------------[ cut here ]------------
> kernel BUG at fs/dcache.c:670!
> invalid opcode: 0000 [#1] SMP
> last sysfs file: /sys/block/dm-3/removable
> CPU 1
> Modules linked in: i5k_amb(+) button hwmon processor thermal fan [last
> unloaded: scsi_wait_scan]
> Pid: 3315, comm: kpartx Not tainted 2.6.32.3 #3 PowerEdge 2950
> RIP: 0010:[<ffffffff810f95f0>]  [<ffffffff810f95f0>]
> shrink_dcache_for_umount_subtree+0x280/0x290
> RSP: 0018:ffff880667089cf8  EFLAGS: 00010296
> RAX: 000000000000005c RBX: ffff880667790a60 RCX: 0000000000000096
> RDX: 0000000000006767 RSI: 0000000000000046 RDI: 0000000000000246
> RBP: ffff880667556738 R08: 0000000000000000 R09: ffff88066604b420
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff880667556798
> R13: 0000000000000007 R14: ffff880665842360 R15: 0000000000b3c0b0
> FS:  00007f7b1006c770(0000) GS:ffff880028240000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00007f6e67f1c350 CR3: 0000000664ff1000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process kpartx (pid: 3315, threadinfo ffff880667088000, task ffff880664f55f40)
> Stack:
> ffff880667058af0 ffff880667058890 ffffffff81619600 0000000000000001
> <0> ffff880667408e00 ffffffff810f9629 ffff880667058890 ffffffff810e8049
> <0> ffff88067f83e758 ffff880667408e00 ffffffff8185fc00 ffffffff810e8159
> Call Trace:
> [<ffffffff810f9629>] ? shrink_dcache_for_umount+0x29/0x50
> [<ffffffff810e8049>] ? generic_shutdown_super+0x19/0x100
> [<ffffffff810e8159>] ? kill_block_super+0x29/0x50
> [<ffffffff810e8238>] ? deactivate_locked_super+0x58/0x80
> [<ffffffff81112842>] ? thaw_bdev+0xd2/0x110
> [<ffffffff814b0c67>] ? dm_resume+0xf7/0x160
> [<ffffffff814b5f00>] ? dev_suspend+0x0/0x220
> [<ffffffff814b60b1>] ? dev_suspend+0x1b1/0x220
> [<ffffffff814b6c7b>] ? ctl_ioctl+0x1eb/0x260
> [<ffffffff810c0b1b>] ? handle_mm_fault+0x63b/0x990
> [<ffffffff814b6cfe>] ? dm_ctl_ioctl+0xe/0x20
> [<ffffffff8104991a>] ? finish_task_switch+0x3a/0xc0
> [<ffffffff810f4e9f>] ? vfs_ioctl+0x2f/0xb0
> [<ffffffff810f53bb>] ? do_vfs_ioctl+0x3fb/0x580
> [<ffffffff815fb101>] ? thread_return+0x3e/0x64d
> [<ffffffff810f55e1>] ? sys_ioctl+0xa1/0xb0
> [<ffffffff8100bf02>] ? system_call_fastpath+0x16/0x1b
> Code: 4d 38 48 8b 45 10 48 85 c0 74 04 48 8b 50 40 48 8d 86 60 02 00
> 00 48 c7 c7 a8 66 76 81 48 89 04 24 48 89 ee 31 c0 e8 a9 11 50 00 <0f>
> 0b eb fe 0f 0b eb fe 0f 1f 84 00 00 00 00 00 53 48 89 fb 48
> RIP  [<ffffffff810f95f0>] shrink_dcache_for_umount_subtree+0x280/0x290
> RSP <ffff880667089cf8>
> ---[ end trace a9fb3c2286e56cbd ]---
> 
> 
> I think the problem is related to LVM or device mapper, because
> a 2.6.32.2 kernel boots perfectly on another PowerEdge 2950
> without any kind of LVM or dm configured...
> but I'm really no expert in kernel debugging.
> 
> Here is the fstab of the buggy system :
> 
> # /etc/fstab: static file system information.
> #
> # <file system> <mount point>   <type>  <options>       <dump>  <pass>
> proc            /proc           proc    defaults        0       0
> /dev/dm-4       /               ext3    errors=remount-ro 0       1
> /dev/dm-1       /boot           ext3    defaults        0       2
> /dev/dm-7       /home           ext3    defaults        0       2
> /dev/dm-5       /usr            ext3    defaults        0       2
> /dev/dm-6       /var            ext3    defaults        0       2
> /dev/dm-2       none            swap    sw              0       0
> /dev/hda        /media/cdrom0   udf,iso9660 user,noauto     0       0
> debugfs /sys/kernel/debug debugfs noauto 0 0
> 
> I hope this helps; I will try to provide more information if necessary.
> 
> François.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/


* Re: [BUG] kernel 2.6.32.x hangs during boot process
  2010-01-23  0:07 ` [BUG] kernel 2.6.32.x hangs during boot process Andrew Morton
@ 2010-01-28  2:42   ` Neil Brown
  2010-01-28  6:32     ` [dm-devel] " Jun'ichi Nomura
  0 siblings, 1 reply; 8+ messages in thread
From: Neil Brown @ 2010-01-28  2:42 UTC (permalink / raw)
  To: Andrew Morton
  Cc: François Figarola, linux-kernel, linux-raid, Al Viro,
	dm-devel

On Fri, 22 Jan 2010 16:07:40 -0800
Andrew Morton <akpm@linux-foundation.org> wrote:

> (cc's added)
(another cc added, one that might actually be useful.....)

> 
> On Sat, 16 Jan 2010 10:58:30 +0100
> François Figarola  <francois.figarola@i-consult.fr> wrote:
> 
> > Dear all,
> > 
> > First, I apologize for my poor English...
> > 
> > Since I've tried to boot 2.6.32.x kernel, my system hangs during the
> > boot process, and I think it could be related to the problem reported
> > earlier by Megastorage (http://lkml.org/lkml/2010/1/10/92).
> > 
> > The hardware is a Dell PowerEdge 2950 which runs fine with the
> > 2.6.31.x kernel series (actually running with the latest 2.6.31.11),
> > and the system is debian etch.
> > 
> > Here is the trace of the bug I've got (using netconsole) with a
> > 2.6.32.3 kernel :
> > 
> > BUG: Dentry ffff880667690000{i=41a46,n=sleep} still in use (8)
> > [unmount of ext3 dm-4]
> > ------------[ cut here ]------------
> > kernel BUG at fs/dcache.c:670!
> 
> That's
> 
> 			if (atomic_read(&dentry->d_count) != 0) {
> 				printk(KERN_ERR
> 				       "BUG: Dentry %p{i=%lx,n=%s}"
> 				       " still in use (%d)"
> 				       " [unmount of %s %s]\n",
> 				       dentry,
> 				       dentry->d_inode ?
> 				       dentry->d_inode->i_ino : 0UL,
> 				       dentry->d_name.name,
> 				       atomic_read(&dentry->d_count),
> 				       dentry->d_sb->s_type->name,
> 				       dentry->d_sb->s_id);
> 				BUG();
> 			}
> 
> I'm a bit surprised that the system is doing a dm suspend/resume during
> the boot process.

It could be that a dm_resume is how you activate a dm device once it is
built, but I'm not sure....
Maybe the guys on dm-devel can help.

NeilBrown


* Re: [dm-devel] [BUG] kernel 2.6.32.x hangs during boot process
  2010-01-28  2:42   ` Neil Brown
@ 2010-01-28  6:32     ` Jun'ichi Nomura
  2010-01-28 18:16       ` Thomas Backlund
                         ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Jun'ichi Nomura @ 2010-01-28  6:32 UTC (permalink / raw)
  To: François Figarola, hch
  Cc: device-mapper development, linux-kernel, Neil Brown,
	Andrew Morton, linux-raid, Al Viro

>> On Sat, 16 Jan 2010 10:58:30 +0100
>> François Figarola  <francois.figarola@i-consult.fr> wrote:
>>> Since I've tried to boot 2.6.32.x kernel, my system hangs during the
>>> boot process, and I think it could be related to the problem reported
>>> earlier by Megastorage (http://lkml.org/lkml/2010/1/10/92).
>>>
>>> The hardware is a Dell PowerEdge 2950 which runs fine with the
>>> 2.6.31.x kernel series (actually running with the latest 2.6.31.11),
>>> and the system is debian etch.
>>>
>>> Here is the trace of the bug I've got (using netconsole) with a
>>> 2.6.32.3 kernel :
>>>
>>> BUG: Dentry ffff880667690000{i=41a46,n=sleep} still in use (8)
>>> [unmount of ext3 dm-4]
>>> ------------[ cut here ]------------
>>> kernel BUG at fs/dcache.c:670!

I can reproduce this by suspending and resuming a dm device that is mounted
read-only.

When MS_RDONLY is set, both freeze_bdev and thaw_bdev call
deactivate_locked_super, which seems wrong. The change was introduced by the
commit below:

  commit 4504230a71566785a05d3e6b53fa1ee071b864eb
  Author: Christoph Hellwig <hch@lst.de>
  Date:   Mon Aug 3 23:28:35 2009 +0200

  freeze_bdev: grab active reference to frozen superblocks

With the attached patch, both remount-ro and remount-rw are
rejected with EBUSY on a frozen device, as expected.
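
To make the imbalance concrete, here is a toy userspace model (not kernel
code, and ignoring all locking). It assumes a mounted superblock holds one
active reference, that freeze_bdev takes one more, and that
deactivate_locked_super drops one. The names `toy_sb`, `toy_get`, `toy_put`,
`freeze_buggy`, `thaw_buggy`, `freeze_fixed`, `thaw_fixed` and `toy_remount`
are invented for this sketch.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for struct super_block; only what the discussion needs. */
struct toy_sb {
	int s_active;   /* active references; 0 means the sb gets shut down */
	bool rdonly;    /* models sb->s_flags & MS_RDONLY */
	bool frozen;    /* models sb->s_frozen != SB_UNFROZEN */
};

/* Models taking the active reference in freeze_bdev. */
static void toy_get(struct toy_sb *sb) { sb->s_active++; }

/* Models deactivate_locked_super(): drops one active reference. */
static void toy_put(struct toy_sb *sb) { sb->s_active--; }

/* Unpatched read-only path: the reference is dropped in freeze ... */
void freeze_buggy(struct toy_sb *sb)
{
	toy_get(sb);
	if (sb->rdonly) {
		toy_put(sb);
		return;
	}
	sb->frozen = true;
}

/* ... and dropped a second time in thaw (goto out_deactivate). */
void thaw_buggy(struct toy_sb *sb)
{
	if (!sb->rdonly)
		sb->frozen = false;
	toy_put(sb);
}

/* Patched: keep the reference and mark the sb frozen, read-only or not. */
void freeze_fixed(struct toy_sb *sb)
{
	toy_get(sb);
	sb->frozen = true;
}

/* Patched: clear the frozen state, then drop the single reference. */
void thaw_fixed(struct toy_sb *sb)
{
	sb->frozen = false;
	toy_put(sb);
}

/* Models the remount check: refuse while the sb is frozen. */
int toy_remount(const struct toy_sb *sb)
{
	return sb->frozen ? -EBUSY : 0;
}
```

Starting from s_active == 1 on a read-only sb, one freeze_buggy/thaw_buggy
cycle ends at 0, i.e. the model "unmounts" a filesystem that is still in
use, which matches the shrink_dcache_for_umount call in the trace. The
fixed pair returns to 1 and toy_remount() reports -EBUSY in between.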

Christoph, do you think this is the right fix?

-- 
Jun'ichi Nomura, NEC Corporation


If MS_RDONLY, freeze_bdev should just up_write(s_umount) instead of
deactivate_locked_super().
Also, keep sb->s_frozen consistent so that remount can check the frozen state.

Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 73d6a73..600261f 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -246,7 +246,9 @@ struct super_block *freeze_bdev(struct block_device *bdev)
 	if (!sb)
 		goto out;
 	if (sb->s_flags & MS_RDONLY) {
-		deactivate_locked_super(sb);
+		sb->s_frozen = SB_FREEZE_TRANS;
+		smp_wmb();
+		up_write(&sb->s_umount);
 		mutex_unlock(&bdev->bd_fsfreeze_mutex);
 		return sb;
 	}
@@ -307,7 +309,7 @@ int thaw_bdev(struct block_device *bdev, struct super_block *sb)
 	BUG_ON(sb->s_bdev != bdev);
 	down_write(&sb->s_umount);
 	if (sb->s_flags & MS_RDONLY)
-		goto out_deactivate;
+		goto out_unfrozen;
 
 	if (sb->s_op->unfreeze_fs) {
 		error = sb->s_op->unfreeze_fs(sb);
@@ -321,11 +323,11 @@ int thaw_bdev(struct block_device *bdev, struct super_block *sb)
 		}
 	}
 
+out_unfrozen:
 	sb->s_frozen = SB_UNFROZEN;
 	smp_wmb();
 	wake_up(&sb->s_wait_unfrozen);
 
-out_deactivate:
 	if (sb)
 		deactivate_locked_super(sb);
 out_unlock:


* Re: [dm-devel] [BUG] kernel 2.6.32.x hangs during boot process
  2010-01-28  6:32     ` [dm-devel] " Jun'ichi Nomura
@ 2010-01-28 18:16       ` Thomas Backlund
  2010-01-28 18:25       ` Christoph Hellwig
  2010-01-29  7:06       ` [dm-devel] [BUG] kernel 2.6.32.x hangs during boot process François Figarola
  2 siblings, 0 replies; 8+ messages in thread
From: Thomas Backlund @ 2010-01-28 18:16 UTC (permalink / raw)
  To: Jun'ichi Nomura
  Cc: François Figarola, hch, device-mapper development,
	linux-kernel, Neil Brown, Andrew Morton, linux-raid, Al Viro

28.01.2010 08:32, Jun'ichi Nomura wrote:
>>> On Sat, 16 Jan 2010 10:58:30 +0100
>>> François Figarola <francois.figarola@i-consult.fr> wrote:
>>>> Since I've tried to boot 2.6.32.x kernel, my system hangs during the
>>>> boot process, and I think it could be related to the problem reported
>>>> earlier by Megastorage (http://lkml.org/lkml/2010/1/10/92).
>>>>
>>>> The hardware is a Dell PowerEdge 2950 which runs fine with the
>>>> 2.6.31.x kernel series (actually running with the latest 2.6.31.11),
>>>> and the system is debian etch.
>>>>
>>>> Here is the trace of the bug I've got (using netconsole) with a
>>>> 2.6.32.3 kernel :
>>>>
>>>> BUG: Dentry ffff880667690000{i=41a46,n=sleep} still in use (8)
>>>> [unmount of ext3 dm-4]
>>>> ------------[ cut here ]------------
>>>> kernel BUG at fs/dcache.c:670!
>
> I can reproduce this by suspending and resuming a dm device that is mounted
> read-only.
>
> When MS_RDONLY is set, both freeze_bdev and thaw_bdev call
> deactivate_locked_super, which seems wrong. The change was introduced by the
> commit below:
>
>    commit 4504230a71566785a05d3e6b53fa1ee071b864eb
>    Author: Christoph Hellwig <hch@lst.de>
>    Date:   Mon Aug 3 23:28:35 2009 +0200
>
>    freeze_bdev: grab active reference to frozen superblocks
>
> With the attached patch, both remount-ro and remount-rw are
> rejected with EBUSY on a frozen device, as expected.
>
> Christoph, do you think this is the right fix?
>

I can confirm that either reverting the above commit or applying the fix
below resolves the issue on both 2.6.32 and 2.6.33-rc5.

So if it's considered the correct fix, it needs to be Cc'ed to stable@ for 2.6.32.

(I reported this same issue this morning here:
  http://marc.info/?l=linux-kernel&m=126467195500908&w=2,
  but then I found this thread/fix)

The system I have tested on is a 4-disk dmraid10 connected to an Intel 
ICH10R on an Asus P7P55D Deluxe running x86_64

> Jun'ichi Nomura, NEC Corporation
>
>
> If MS_RDONLY, freeze_bdev should just up_write(s_umount) instead of
> deactivate_locked_super().
> Also, keep sb->s_frozen consistent so that remount can check the frozen state.
>
> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
   Tested-by: Thomas Backlund <tmb@mandriva.org>
>
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index 73d6a73..600261f 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -246,7 +246,9 @@ struct super_block *freeze_bdev(struct block_device *bdev)
>  	if (!sb)
>  		goto out;
>  	if (sb->s_flags & MS_RDONLY) {
> -		deactivate_locked_super(sb);
> +		sb->s_frozen = SB_FREEZE_TRANS;
> +		smp_wmb();
> +		up_write(&sb->s_umount);
>  		mutex_unlock(&bdev->bd_fsfreeze_mutex);
>  		return sb;
>  	}
> @@ -307,7 +309,7 @@ int thaw_bdev(struct block_device *bdev, struct super_block *sb)
>  	BUG_ON(sb->s_bdev != bdev);
>  	down_write(&sb->s_umount);
>  	if (sb->s_flags & MS_RDONLY)
> -		goto out_deactivate;
> +		goto out_unfrozen;
>
>  	if (sb->s_op->unfreeze_fs) {
>  		error = sb->s_op->unfreeze_fs(sb);
> @@ -321,11 +323,11 @@ int thaw_bdev(struct block_device *bdev, struct super_block *sb)
>  		}
>  	}
>
> +out_unfrozen:
>  	sb->s_frozen = SB_UNFROZEN;
>  	smp_wmb();
>  	wake_up(&sb->s_wait_unfrozen);
>
> -out_deactivate:
>  	if (sb)
>  		deactivate_locked_super(sb);
>  out_unlock:


* Re: [dm-devel] [BUG] kernel 2.6.32.x hangs during boot process
  2010-01-28  6:32     ` [dm-devel] " Jun'ichi Nomura
  2010-01-28 18:16       ` Thomas Backlund
@ 2010-01-28 18:25       ` Christoph Hellwig
  2010-01-29  0:56         ` [BUGFIX] [PATCH] freeze_bdev: don't deactivate successfully frozen MS_RDONLY sb Jun'ichi Nomura
  2010-01-29  7:06       ` [dm-devel] [BUG] kernel 2.6.32.x hangs during boot process François Figarola
  2 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2010-01-28 18:25 UTC (permalink / raw)
  To: Jun'ichi Nomura
  Cc: François Figarola, hch, device-mapper development, linux-kernel,
	Neil Brown, Andrew Morton, linux-raid, Al Viro

On Thu, Jan 28, 2010 at 03:32:41PM +0900, Jun'ichi Nomura wrote:
> When MS_RDONLY is set, both freeze_bdev and thaw_bdev call
> deactivate_locked_super, which seems wrong. The change was introduced by the
> commit below:
> 
>   commit 4504230a71566785a05d3e6b53fa1ee071b864eb
>   Author: Christoph Hellwig <hch@lst.de>
>   Date:   Mon Aug 3 23:28:35 2009 +0200
> 
>   freeze_bdev: grab active reference to frozen superblocks
> 
> With the attached patch, both remount-ro and remount-rw are
> rejected with EBUSY on a frozen device, as expected.
> 
> Christoph, do you think this is the right fix?

Indeed, this looks wrong in my original code, and the patch looks like
the correct fix.  Thanks a lot!


Reviewed-by: Christoph Hellwig <hch@lst.de>



* [BUGFIX] [PATCH] freeze_bdev: don't deactivate successfully frozen MS_RDONLY sb
  2010-01-28 18:25       ` Christoph Hellwig
@ 2010-01-29  0:56         ` Jun'ichi Nomura
  2010-01-30 18:44           ` Thomas Backlund
  0 siblings, 1 reply; 8+ messages in thread
From: Jun'ichi Nomura @ 2010-01-29  0:56 UTC (permalink / raw)
  To: Christoph Hellwig, linux-kernel, tmb
  Cc: François Figarola, device-mapper development, Neil Brown,
	Andrew Morton, linux-raid, Al Viro, stable

Thanks Thomas and Christoph for testing and review.
I removed 'smp_wmb()' before up_write from the previous patch,
since up_write() should have necessary ordering constraints.
(I.e. the change of s_frozen is visible to others after up_write)
I'm quite sure the change is harmless but if you are uncomfortable
with Tested-by/Reviewed-by on the modified patch, please remove them.


If MS_RDONLY, freeze_bdev should just up_write(s_umount) instead of
deactivate_locked_super().
Also, keep sb->s_frozen consistent so that remount can check the frozen state.

Otherwise a crash reported here can happen:
http://lkml.org/lkml/2010/1/16/37
http://lkml.org/lkml/2010/1/28/53


This patch should be applied for 2.6.32 stable series, too.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Thomas Backlund <tmb@mandriva.org> 
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> 
Cc: stable@kernel.org

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 73d6a73..d11d028 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -246,7 +246,8 @@ struct super_block *freeze_bdev(struct block_device *bdev)
 	if (!sb)
 		goto out;
 	if (sb->s_flags & MS_RDONLY) {
-		deactivate_locked_super(sb);
+		sb->s_frozen = SB_FREEZE_TRANS;
+		up_write(&sb->s_umount);
 		mutex_unlock(&bdev->bd_fsfreeze_mutex);
 		return sb;
 	}
@@ -307,7 +308,7 @@ int thaw_bdev(struct block_device *bdev, struct super_block *sb)
 	BUG_ON(sb->s_bdev != bdev);
 	down_write(&sb->s_umount);
 	if (sb->s_flags & MS_RDONLY)
-		goto out_deactivate;
+		goto out_unfrozen;
 
 	if (sb->s_op->unfreeze_fs) {
 		error = sb->s_op->unfreeze_fs(sb);
@@ -321,11 +322,11 @@ int thaw_bdev(struct block_device *bdev, struct super_block *sb)
 		}
 	}
 
+out_unfrozen:
 	sb->s_frozen = SB_UNFROZEN;
 	smp_wmb();
 	wake_up(&sb->s_wait_unfrozen);
 
-out_deactivate:
 	if (sb)
 		deactivate_locked_super(sb);
 out_unlock:


* Re: [BUGFIX] [PATCH] freeze_bdev: don't deactivate successfully frozen MS_RDONLY sb
  2010-01-29  0:56         ` [BUGFIX] [PATCH] freeze_bdev: don't deactivate successfully frozen MS_RDONLY sb Jun'ichi Nomura
@ 2010-01-30 18:44           ` Thomas Backlund
  0 siblings, 0 replies; 8+ messages in thread
From: Thomas Backlund @ 2010-01-30 18:44 UTC (permalink / raw)
  To: Jun'ichi Nomura
  Cc: Christoph Hellwig, linux-kernel@vger.kernel.org, tmb@mandriva.org,
	François Figarola, device-mapper development, Neil Brown,
	Andrew Morton, linux-raid@vger.kernel.org, Al Viro,
	stable@kernel.org

29.01.2010 02:56, Jun'ichi Nomura wrote:
> Thanks Thomas and Christoph for testing and review.
> I removed 'smp_wmb()' before up_write from the previous patch,
> since up_write() should have necessary ordering constraints.
> (I.e. the change of s_frozen is visible to others after up_write)
> I'm quite sure the change is harmless but if you are uncomfortable
> with Tested-by/Reviewed-by on the modified patch, please remove them.
>

I've just verified that this patch works as intended on both 2.6.32 and 
2.6.33-rc6, so for me it's still OK.
>
> If MS_RDONLY, freeze_bdev should just up_write(s_umount) instead of
> deactivate_locked_super().
> Also, keep sb->s_frozen consistent so that remount can check the frozen state.
>
> Otherwise a crash reported here can happen:
> http://lkml.org/lkml/2010/1/16/37
> http://lkml.org/lkml/2010/1/28/53
>
>
> This patch should be applied for 2.6.32 stable series, too.
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Tested-by: Thomas Backlund <tmb@mandriva.org>
> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
> Cc: stable@kernel.org
>
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index 73d6a73..d11d028 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -246,7 +246,8 @@ struct super_block *freeze_bdev(struct block_device *bdev)
>   	if (!sb)
>   		goto out;
>   	if (sb->s_flags & MS_RDONLY) {
> -		deactivate_locked_super(sb);
> +		sb->s_frozen = SB_FREEZE_TRANS;
> +		up_write(&sb->s_umount);
>   		mutex_unlock(&bdev->bd_fsfreeze_mutex);
>   		return sb;
>   	}
> @@ -307,7 +308,7 @@ int thaw_bdev(struct block_device *bdev, struct super_block *sb)
>   	BUG_ON(sb->s_bdev != bdev);
>   	down_write(&sb->s_umount);
>   	if (sb->s_flags & MS_RDONLY)
> -		goto out_deactivate;
> +		goto out_unfrozen;
>
>   	if (sb->s_op->unfreeze_fs) {
>   		error = sb->s_op->unfreeze_fs(sb);
> @@ -321,11 +322,11 @@ int thaw_bdev(struct block_device *bdev, struct super_block *sb)
>   		}
>   	}
>
> +out_unfrozen:
>   	sb->s_frozen = SB_UNFROZEN;
>   	smp_wmb();
>   	wake_up(&sb->s_wait_unfrozen);
>
> -out_deactivate:
>   	if (sb)
>   		deactivate_locked_super(sb);
>   out_unlock:


* Re: [dm-devel] [BUG] kernel 2.6.32.x hangs during boot process
  2010-01-28  6:32     ` [dm-devel] " Jun'ichi Nomura
  2010-01-28 18:16       ` Thomas Backlund
  2010-01-28 18:25       ` Christoph Hellwig
@ 2010-01-29  7:06       ` François Figarola
  2 siblings, 0 replies; 8+ messages in thread
From: François Figarola @ 2010-01-29  7:06 UTC (permalink / raw)
  To: Jun'ichi Nomura
  Cc: hch, device-mapper development, linux-kernel, Neil Brown,
	Andrew Morton, linux-raid, Al Viro

Jun'ichi Nomura wrote:
>>> On Sat, 16 Jan 2010 10:58:30 +0100
>>> François Figarola  <francois.figarola@i-consult.fr> wrote:
>>>       
>>>> Since I've tried to boot 2.6.32.x kernel, my system hangs during the
>>>> boot process, and I think it could be related to the problem reported
>>>> earlier by Megastorage (http://lkml.org/lkml/2010/1/10/92).
>>>>
>>>> The hardware is a Dell PowerEdge 2950 which runs fine with the
>>>> 2.6.31.x kernel series (actually running with the latest 2.6.31.11),
>>>> and the system is debian etch.
>>>>
>>>> Here is the trace of the bug I've got (using netconsole) with a
>>>> 2.6.32.3 kernel :
>>>>
>>>> BUG: Dentry ffff880667690000{i=41a46,n=sleep} still in use (8)
>>>> [unmount of ext3 dm-4]
>>>> ------------[ cut here ]------------
>>>> kernel BUG at fs/dcache.c:670!
>>>>         
>
> I can reproduce this by suspending and resuming a dm device that is mounted
> read-only.
>
> When MS_RDONLY is set, both freeze_bdev and thaw_bdev call
> deactivate_locked_super, which seems wrong. The change was introduced by the
> commit below:
>
>   commit 4504230a71566785a05d3e6b53fa1ee071b864eb
>   Author: Christoph Hellwig <hch@lst.de>
>   Date:   Mon Aug 3 23:28:35 2009 +0200
>
>   freeze_bdev: grab active reference to frozen superblocks
>
> With the attached patch, both remount-ro and remount-rw are
> rejected with EBUSY on a frozen device, as expected.
>
> Christoph, do you think this is the right fix?
>
>   
With the fix from Jun'ichi Nomura, a 2.6.32.5 kernel
now boots correctly.

Thanks.
