* Re: [BUG] kernel 2.6.32.x hangs during boot process
  [not found] <d2deb3241001160158r4baed1e1t7e8f6642de018b4c@mail.gmail.com>
From: Andrew Morton @ 2010-01-23 0:07 UTC
To: François Figarola
Cc: linux-kernel, Neil Brown, linux-raid, Al Viro

(cc's added)

On Sat, 16 Jan 2010 10:58:30 +0100
François Figarola <francois.figarola@i-consult.fr> wrote:

> Dear all,
>
> First, I apologize for my poor English...
>
> Since I've tried to boot 2.6.32.x kernel, my system hangs during the
> boot process, and I think it could be related to the problem reported
> earlier by Megastorage (http://lkml.org/lkml/2010/1/10/92).
>
> The hardware is a Dell PowerEdge 2950 which runs fine with the
> 2.6.31.x kernel series (actually running with the latest 2.6.31.11),
> and the system is debian etch.
>
> Here is the trace of the bug I've got (using netconsole) with a
> 2.6.32.3 kernel :
>
> BUG: Dentry ffff880667690000{i=41a46,n=sleep} still in use (8)
> [unmount of ext3 dm-4]
> ------------[ cut here ]------------
> kernel BUG at fs/dcache.c:670!

That's

	if (atomic_read(&dentry->d_count) != 0) {
		printk(KERN_ERR
		       "BUG: Dentry %p{i=%lx,n=%s}"
		       " still in use (%d)"
		       " [unmount of %s %s]\n",
		       dentry,
		       dentry->d_inode ?
		       dentry->d_inode->i_ino : 0UL,
		       dentry->d_name.name,
		       atomic_read(&dentry->d_count),
		       dentry->d_sb->s_type->name,
		       dentry->d_sb->s_id);
		BUG();
	}

I'm a bit surprised that the system is doing a dm suspend/resume during
the boot process.

I assume it's a DM bug, dunno.

> invalid opcode: 0000 [#1] SMP
> last sysfs file: /sys/block/dm-2/removable
> CPU 0
> Modules linked in: i5k_amb hwmon button processor thermal fan [last
> unloaded: scsi_wait_scan]
> Pid: 3311, comm: kpartx Not tainted 2.6.32.3 #2 PowerEdge 2950
> RIP: 0010:[<ffffffff810f95f0>]  [<ffffffff810f95f0>]
> shrink_dcache_for_umount_subtree+0x280/0x290
> RSP: 0018:ffff88066670dcf8  EFLAGS: 00010296
> RAX: 000000000000005c RBX: ffff8806677696c0 RCX: 0000000000000096
> RDX: 0000000000006767 RSI: 0000000000000046 RDI: 0000000000000246
> RBP: ffff880667690000 R08: 0000000000000000 R09: ffff8806670d1628
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff880667690060
> R13: 0000000000000007 R14: ffff8806654d1a88 R15: 0000000000dec0b0
> FS:  00007f176e96b770(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00007fff0a2e0080 CR3: 0000000666607000 CR4: 00000000000006f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process kpartx (pid: 3311, threadinfo ffff88066670c000, task ffff8806652997d0)
> Stack:
> ffff880665b8b178 ffff880665b8af18 ffffffff81619600 0000000000000001
> <0> ffff880667408e00 ffffffff810f9629 ffff880665b8af18 ffffffff810e8049
> <0> ffff8806651333f8 ffff880667408e00 ffffffff8185fc00 ffffffff810e8159
> Call Trace:
> [<ffffffff810f9629>] ? shrink_dcache_for_umount+0x29/0x50
> [<ffffffff810e8049>] ? generic_shutdown_super+0x19/0x100
> [<ffffffff810e8159>] ? kill_block_super+0x29/0x50
> [<ffffffff810e8238>] ? deactivate_locked_super+0x58/0x80
> [<ffffffff81112842>] ? thaw_bdev+0xd2/0x110
> [<ffffffff814b0c67>] ? dm_resume+0xf7/0x160
> [<ffffffff814b5f00>] ? dev_suspend+0x0/0x220
> [<ffffffff814b60b1>] ? dev_suspend+0x1b1/0x220
> [<ffffffff814b6c7b>] ? ctl_ioctl+0x1eb/0x260
> [<ffffffff810c0b1b>] ? handle_mm_fault+0x63b/0x990
> [<ffffffff814b6cfe>] ? dm_ctl_ioctl+0xe/0x20
> [<ffffffff8104991a>] ? finish_task_switch+0x3a/0xc0
> [<ffffffff810f4e9f>] ? vfs_ioctl+0x2f/0xb0
> [<ffffffff810f53bb>] ? do_vfs_ioctl+0x3fb/0x580
> [<ffffffff815fb101>] ? thread_return+0x3e/0x64d
> [<ffffffff810f55e1>] ? sys_ioctl+0xa1/0xb0
> [<ffffffff8100bf02>] ? system_call_fastpath+0x16/0x1b
> Code: 4d 38 48 8b 45 10 48 85 c0 74 04 48 8b 50 40 48 8d 86 60 02 00
> 00 48 c7 c7 a8 66 76 81 48 89 04 24 48 89 ee 31 c0 e8 a9 11 50 00 <0f>
> 0b eb fe 0f 0b eb fe 0f 1f 84 00 00 00 00 00 53 48 89 fb 48
> RIP  [<ffffffff810f95f0>] shrink_dcache_for_umount_subtree+0x280/0x290
> RSP <ffff88066670dcf8>
> ---[ end trace 3cc1cb65fcc6a8ca ]---
>
> Another trace with the same behavior, from a newly compiled kernel with
> more debug options; but I can't see any difference :
>
> BUG: Dentry ffff880667556738{i=41a46,n=sleep} still in use (8)
> [unmount of ext3 dm-4]
> ------------[ cut here ]------------
> kernel BUG at fs/dcache.c:670!
> invalid opcode: 0000 [#1] SMP
> last sysfs file: /sys/block/dm-3/removable
> CPU 1
> Modules linked in: i5k_amb(+) button hwmon processor thermal fan [last
> unloaded: scsi_wait_scan]
> Pid: 3315, comm: kpartx Not tainted 2.6.32.3 #3 PowerEdge 2950
> RIP: 0010:[<ffffffff810f95f0>]  [<ffffffff810f95f0>]
> shrink_dcache_for_umount_subtree+0x280/0x290
> RSP: 0018:ffff880667089cf8  EFLAGS: 00010296
> RAX: 000000000000005c RBX: ffff880667790a60 RCX: 0000000000000096
> RDX: 0000000000006767 RSI: 0000000000000046 RDI: 0000000000000246
> RBP: ffff880667556738 R08: 0000000000000000 R09: ffff88066604b420
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff880667556798
> R13: 0000000000000007 R14: ffff880665842360 R15: 0000000000b3c0b0
> FS:  00007f7b1006c770(0000) GS:ffff880028240000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00007f6e67f1c350 CR3: 0000000664ff1000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process kpartx (pid: 3315, threadinfo ffff880667088000, task ffff880664f55f40)
> Stack:
> ffff880667058af0 ffff880667058890 ffffffff81619600 0000000000000001
> <0> ffff880667408e00 ffffffff810f9629 ffff880667058890 ffffffff810e8049
> <0> ffff88067f83e758 ffff880667408e00 ffffffff8185fc00 ffffffff810e8159
> Call Trace:
> [<ffffffff810f9629>] ? shrink_dcache_for_umount+0x29/0x50
> [<ffffffff810e8049>] ? generic_shutdown_super+0x19/0x100
> [<ffffffff810e8159>] ? kill_block_super+0x29/0x50
> [<ffffffff810e8238>] ? deactivate_locked_super+0x58/0x80
> [<ffffffff81112842>] ? thaw_bdev+0xd2/0x110
> [<ffffffff814b0c67>] ? dm_resume+0xf7/0x160
> [<ffffffff814b5f00>] ? dev_suspend+0x0/0x220
> [<ffffffff814b60b1>] ? dev_suspend+0x1b1/0x220
> [<ffffffff814b6c7b>] ? ctl_ioctl+0x1eb/0x260
> [<ffffffff810c0b1b>] ? handle_mm_fault+0x63b/0x990
> [<ffffffff814b6cfe>] ? dm_ctl_ioctl+0xe/0x20
> [<ffffffff8104991a>] ? finish_task_switch+0x3a/0xc0
> [<ffffffff810f4e9f>] ? vfs_ioctl+0x2f/0xb0
> [<ffffffff810f53bb>] ? do_vfs_ioctl+0x3fb/0x580
> [<ffffffff815fb101>] ? thread_return+0x3e/0x64d
> [<ffffffff810f55e1>] ? sys_ioctl+0xa1/0xb0
> [<ffffffff8100bf02>] ? system_call_fastpath+0x16/0x1b
> Code: 4d 38 48 8b 45 10 48 85 c0 74 04 48 8b 50 40 48 8d 86 60 02 00
> 00 48 c7 c7 a8 66 76 81 48 89 04 24 48 89 ee 31 c0 e8 a9 11 50 00 <0f>
> 0b eb fe 0f 0b eb fe 0f 1f 84 00 00 00 00 00 53 48 89 fb 48
> RIP  [<ffffffff810f95f0>] shrink_dcache_for_umount_subtree+0x280/0x290
> RSP <ffff880667089cf8>
> ---[ end trace a9fb3c2286e56cbd ]---
>
>
> I think the problem is related to LVM or device mapper, because
> a 2.6.32.2 kernel boots fine on another PowerEdge 2950
> without any kind of LVM or dm configured...
> but I'm really no expert in kernel debugging.
>
> Here is the fstab of the buggy system :
>
> # /etc/fstab: static file system information.
> #
> # <file system> <mount point>   <type>       <options>          <dump>  <pass>
> proc            /proc           proc         defaults           0       0
> /dev/dm-4       /               ext3         errors=remount-ro  0       1
> /dev/dm-1       /boot           ext3         defaults           0       2
> /dev/dm-7       /home           ext3         defaults           0       2
> /dev/dm-5       /usr            ext3         defaults           0       2
> /dev/dm-6       /var            ext3         defaults           0       2
> /dev/dm-2       none            swap         sw                 0       0
> /dev/hda        /media/cdrom0   udf,iso9660  user,noauto        0       0
> debugfs /sys/kernel/debug debugfs noauto 0 0
>
> I hope it can help, and try to give us more information if necessary.
>
> François.

* Re: [BUG] kernel 2.6.32.x hangs during boot process
From: Neil Brown @ 2010-01-28 2:42 UTC
To: Andrew Morton
Cc: François Figarola, linux-kernel, linux-raid, Al Viro, dm-devel

On Fri, 22 Jan 2010 16:07:40 -0800
Andrew Morton <akpm@linux-foundation.org> wrote:

> (cc's added)

(another cc added, one that might actually be useful.....)

> On Sat, 16 Jan 2010 10:58:30 +0100
> François Figarola <francois.figarola@i-consult.fr> wrote:
>
> > Here is the trace of the bug I've got (using netconsole) with a
> > 2.6.32.3 kernel :
> >
> > BUG: Dentry ffff880667690000{i=41a46,n=sleep} still in use (8)
> > [unmount of ext3 dm-4]
> > ------------[ cut here ]------------
> > kernel BUG at fs/dcache.c:670!
>
> I'm a bit surprised that the system is doing a dm suspend/resume during
> the boot process.

It could be that a dm_resume is how you activate a dm device once it is
built, but I'm not sure....

Maybe the guys on dm-devel can help.

NeilBrown

> I assume it's a DM bug, dunno.

* Re: [dm-devel] [BUG] kernel 2.6.32.x hangs during boot process
From: Jun'ichi Nomura @ 2010-01-28 6:32 UTC
To: François Figarola, hch
Cc: device-mapper development, linux-kernel, Neil Brown, Andrew Morton, linux-raid, Al Viro

>> On Sat, 16 Jan 2010 10:58:30 +0100
>> François Figarola <francois.figarola@i-consult.fr> wrote:
>>> Since I've tried to boot 2.6.32.x kernel, my system hangs during the
>>> boot process, and I think it could be related to the problem reported
>>> earlier by Megastorage (http://lkml.org/lkml/2010/1/10/92).
>>>
>>> The hardware is a Dell PowerEdge 2950 which runs fine with the
>>> 2.6.31.x kernel series (actually running with the latest 2.6.31.11),
>>> and the system is debian etch.
>>>
>>> Here is the trace of the bug I've got (using netconsole) with a
>>> 2.6.32.3 kernel :
>>>
>>> BUG: Dentry ffff880667690000{i=41a46,n=sleep} still in use (8)
>>> [unmount of ext3 dm-4]
>>> ------------[ cut here ]------------
>>> kernel BUG at fs/dcache.c:670!

I can reproduce this by suspending and resuming a read-only mounted dm device.

When MS_RDONLY is set, both freeze_bdev and thaw_bdev call
deactivate_locked_super, which seems wrong. The change was introduced
with the commit below:

  commit 4504230a71566785a05d3e6b53fa1ee071b864eb
  Author: Christoph Hellwig <hch@lst.de>
  Date:   Mon Aug 3 23:28:35 2009 +0200

      freeze_bdev: grab active reference to frozen superblocks

With the attached patch, both remount-ro and remount-rw are
rejected as EBUSY on a frozen device, as expected.

Christoph, do you think this is the right fix?

-- 
Jun'ichi Nomura, NEC Corporation


If MS_RDONLY, freeze_bdev should just up_write(s_umount) instead of
deactivate_locked_super().
Also, keep sb->s_frozen consistent so that remount can check the frozen state.

Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 73d6a73..600261f 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -246,7 +246,9 @@ struct super_block *freeze_bdev(struct block_device *bdev)
 	if (!sb)
 		goto out;
 	if (sb->s_flags & MS_RDONLY) {
-		deactivate_locked_super(sb);
+		sb->s_frozen = SB_FREEZE_TRANS;
+		smp_wmb();
+		up_write(&sb->s_umount);
 		mutex_unlock(&bdev->bd_fsfreeze_mutex);
 		return sb;
 	}
@@ -307,7 +309,7 @@ int thaw_bdev(struct block_device *bdev, struct super_block *sb)
 	BUG_ON(sb->s_bdev != bdev);
 	down_write(&sb->s_umount);
 	if (sb->s_flags & MS_RDONLY)
-		goto out_deactivate;
+		goto out_unfrozen;
 
 	if (sb->s_op->unfreeze_fs) {
 		error = sb->s_op->unfreeze_fs(sb);
@@ -321,11 +323,11 @@ int thaw_bdev(struct block_device *bdev, struct super_block *sb)
 		}
 	}
 
+out_unfrozen:
 	sb->s_frozen = SB_UNFROZEN;
 	smp_wmb();
 	wake_up(&sb->s_wait_unfrozen);
 
-out_deactivate:
 	if (sb)
 		deactivate_locked_super(sb);
 out_unlock:
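
[Editorial sketch, not part of the original message.] The reference imbalance described above can be illustrated with a small user-space toy model. This is only a sketch under stated assumptions: `struct rc_sb` and the `rc_*` functions are hypothetical names invented for illustration, not kernel code, and the model assumes that a mounted filesystem holds one active reference, that freeze takes one more, and that dropping the last reference shuts the superblock down.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a superblock: only what this sketch needs. */
struct rc_sb {
	int  s_active;   /* active references; reaching 0 means shutdown */
	bool read_only;  /* models sb->s_flags & MS_RDONLY */
	bool shut_down;  /* set when the last active reference is dropped */
};

/* Models dropping one active reference (deactivate_locked_super()). */
void rc_deactivate(struct rc_sb *sb)
{
	assert(sb->s_active > 0);
	if (--sb->s_active == 0)
		sb->shut_down = true;
}

/*
 * Freeze: take an active reference. In the unpatched model the
 * read-only path drops that reference again before returning.
 */
void rc_freeze(struct rc_sb *sb, bool patched)
{
	sb->s_active++;
	if (sb->read_only && !patched)
		rc_deactivate(sb);
}

/* Thaw: always drops one reference, the one freeze is meant to hold. */
void rc_thaw(struct rc_sb *sb)
{
	rc_deactivate(sb);
}

/*
 * One freeze/thaw cycle on a mounted read-only filesystem.
 * Returns true if the superblock was (wrongly) shut down while mounted.
 */
bool rc_cycle_shuts_down(bool patched)
{
	struct rc_sb sb = { .s_active = 1, .read_only = true, .shut_down = false };

	rc_freeze(&sb, patched);
	rc_thaw(&sb);
	return sb.shut_down;
}
```

In this model, `rc_cycle_shuts_down(false)` ends with the mount's own reference dropped, which corresponds to the thaw_bdev -> deactivate_locked_super -> generic_shutdown_super path in the reported call trace, while `rc_cycle_shuts_down(true)` leaves the mount's reference in place.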

* Re: [dm-devel] [BUG] kernel 2.6.32.x hangs during boot process
From: Thomas Backlund @ 2010-01-28 18:16 UTC
To: Jun'ichi Nomura
Cc: François Figarola, hch, device-mapper development, linux-kernel, Neil Brown, Andrew Morton, linux-raid, Al Viro

On 28.01.2010 08:32, Jun'ichi Nomura wrote:
> I can reproduce this when suspend/resume read-only mounted dm device.
>
> When MS_RDONLY, both freeze_bdev and thaw_bdev call deactivate_locked_super,
> which seems wrong. The change was introduced with the commit below:
>
>    commit 4504230a71566785a05d3e6b53fa1ee071b864eb
>    Author: Christoph Hellwig <hch@lst.de>
>    Date:   Mon Aug 3 23:28:35 2009 +0200
>
>      freeze_bdev: grab active reference to frozen superblocks
>
> With the attached patch, both remount-ro and remount-rw are
> rejected as EBUSY on a frozen device, as expected.
>
> Christoph, do you think this is the right fix?
>

I can confirm that either reverting the above commit or applying
Jun'ichi's fix resolves the issue on both 2.6.32 and 2.6.33-rc5.

So if it's considered the correct fix, it needs to be Cc'ed to stable@ for 2.6.32.

(I reported this same issue this morning here:
http://marc.info/?l=linux-kernel&m=126467195500908&w=2,
but then I found this thread/fix)

The system I have tested on is a 4-disk dmraid10 connected to an
Intel ICH10R on an Asus P7P55D Deluxe running x86_64.

> Jun'ichi Nomura, NEC Corporation
>
>
> If MS_RDONLY, freeze_bdev should just up_write(s_umount) instead of
> deactivate_locked_super().
> Also, keep sb->s_frozen consistent so that remount can check the frozen state.
>
> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>

Tested-by: Thomas Backlund <tmb@mandriva.org>

* Re: [dm-devel] [BUG] kernel 2.6.32.x hangs during boot process
From: Christoph Hellwig @ 2010-01-28 18:25 UTC
To: Jun'ichi Nomura
Cc: François Figarola, hch, device-mapper development, linux-kernel, Neil Brown, Andrew Morton, linux-raid, Al Viro

On Thu, Jan 28, 2010 at 03:32:41PM +0900, Jun'ichi Nomura wrote:
> When MS_RDONLY, both freeze_bdev and thaw_bdev call deactivate_locked_super,
> which seems wrong. The change was introduced with the commit below:
>
>   commit 4504230a71566785a05d3e6b53fa1ee071b864eb
>   Author: Christoph Hellwig <hch@lst.de>
>   Date:   Mon Aug 3 23:28:35 2009 +0200
>
>     freeze_bdev: grab active reference to frozen superblocks
>
> With the attached patch, both remount-ro and remount-rw are
> rejected as EBUSY on a frozen device, as expected.
>
> Christoph, do you think this is the right fix?

Indeed, this looks wrong in my original code, and the patch looks like
the correct fix. Thanks a lot!

Reviewed-by: Christoph Hellwig <hch@lst.de>

* [BUGFIX] [PATCH] freeze_bdev: don't deactivate successfully frozen MS_RDONLY sb
From: Jun'ichi Nomura @ 2010-01-29 0:56 UTC
To: Christoph Hellwig, linux-kernel, tmb
Cc: François Figarola, device-mapper development, Neil Brown, Andrew Morton, linux-raid, Al Viro, stable

Thanks Thomas and Christoph for testing and review.
I removed 'smp_wmb()' before up_write from the previous patch,
since up_write() should have necessary ordering constraints.
(I.e. the change of s_frozen is visible to others after up_write)
I'm quite sure the change is harmless but if you are uncomfortable
with Tested-by/Reviewed-by on the modified patch, please remove them.


If MS_RDONLY, freeze_bdev should just up_write(s_umount) instead of
deactivate_locked_super().
Also, keep sb->s_frozen consistent so that remount can check the frozen state.

Otherwise a crash reported here can happen:
http://lkml.org/lkml/2010/1/16/37
http://lkml.org/lkml/2010/1/28/53


This patch should be applied for 2.6.32 stable series, too.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Thomas Backlund <tmb@mandriva.org>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: stable@kernel.org

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 73d6a73..d11d028 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -246,7 +246,8 @@ struct super_block *freeze_bdev(struct block_device *bdev)
 	if (!sb)
 		goto out;
 	if (sb->s_flags & MS_RDONLY) {
-		deactivate_locked_super(sb);
+		sb->s_frozen = SB_FREEZE_TRANS;
+		up_write(&sb->s_umount);
 		mutex_unlock(&bdev->bd_fsfreeze_mutex);
 		return sb;
 	}
@@ -307,7 +308,7 @@ int thaw_bdev(struct block_device *bdev, struct super_block *sb)
 	BUG_ON(sb->s_bdev != bdev);
 	down_write(&sb->s_umount);
 	if (sb->s_flags & MS_RDONLY)
-		goto out_deactivate;
+		goto out_unfrozen;
 
 	if (sb->s_op->unfreeze_fs) {
 		error = sb->s_op->unfreeze_fs(sb);
@@ -321,11 +322,11 @@ int thaw_bdev(struct block_device *bdev, struct super_block *sb)
 		}
 	}
 
+out_unfrozen:
 	sb->s_frozen = SB_UNFROZEN;
 	smp_wmb();
 	wake_up(&sb->s_wait_unfrozen);
 
-out_deactivate:
 	if (sb)
 		deactivate_locked_super(sb);
 out_unlock:
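
[Editorial sketch, not part of the original message.] The second half of the commit message, keeping sb->s_frozen consistent so that remount can check the frozen state, can be illustrated with another user-space toy model. Everything here is hypothetical: `struct fz_sb`, the `fz_*` functions and the two-value enum are invented for illustration, and the remount check is an assumption based on the thread's statement that remount is rejected with EBUSY on a frozen device.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for SB_UNFROZEN and a frozen level such as
 * SB_FREEZE_TRANS; only "unfrozen vs. frozen" matters in this sketch. */
enum fz_state { FZ_UNFROZEN, FZ_FROZEN };

struct fz_sb {
	enum fz_state s_frozen;
	bool read_only;            /* models sb->s_flags & MS_RDONLY */
};

/*
 * Freeze. In the unpatched model the read-only path returns without
 * recording the frozen state; `patched` selects the fixed behaviour.
 */
void fz_freeze(struct fz_sb *sb, bool patched)
{
	assert(sb->s_frozen == FZ_UNFROZEN); /* nested freezes not modelled */
	if (sb->read_only && !patched)
		return;
	sb->s_frozen = FZ_FROZEN;
}

/* Thaw: reset the state, for read-only superblocks as well. */
void fz_thaw(struct fz_sb *sb)
{
	sb->s_frozen = FZ_UNFROZEN;
}

/* Remount: refuse with -EBUSY while frozen (assumed check), else apply. */
int fz_remount(struct fz_sb *sb, bool want_read_only)
{
	if (sb->s_frozen != FZ_UNFROZEN)
		return -EBUSY;
	sb->read_only = want_read_only;
	return 0;
}
```

With `patched` set, a remount attempted between `fz_freeze()` and `fz_thaw()` on a read-only superblock returns -EBUSY; without it, the same remount goes through because the frozen state was never recorded.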

* Re: [BUGFIX] [PATCH] freeze_bdev: don't deactivate successfully frozen MS_RDONLY sb
From: Thomas Backlund @ 2010-01-30 18:44 UTC
To: Jun'ichi Nomura
Cc: Christoph Hellwig, linux-kernel@vger.kernel.org, tmb@mandriva.org, François Figarola, device-mapper development, Neil Brown, Andrew Morton, linux-raid@vger.kernel.org, Al Viro, stable@kernel.org

On 29.01.2010 02:56, Jun'ichi Nomura wrote:
> Thanks Thomas and Christoph for testing and review.
> I removed 'smp_wmb()' before up_write from the previous patch,
> since up_write() should have necessary ordering constraints.
> (I.e. the change of s_frozen is visible to others after up_write)
> I'm quite sure the change is harmless but if you are uncomfortable
> with Tested-by/Reviewed-by on the modified patch, please remove them.
>

I've just verified that this patch works as intended on both 2.6.32 and
2.6.33-rc6, so for me it's still OK.

> This patch should be applied for 2.6.32 stable series, too.
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Tested-by: Thomas Backlund <tmb@mandriva.org>
> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
> Cc: stable@kernel.org

* Re: [dm-devel] [BUG] kernel 2.6.32.x hangs during boot process
From: François Figarola @ 2010-01-29 7:06 UTC
To: Jun'ichi Nomura
Cc: hch, device-mapper development, linux-kernel, Neil Brown, Andrew Morton, linux-raid, Al Viro

Jun'ichi Nomura wrote:
> I can reproduce this when suspend/resume read-only mounted dm device.
>
> When MS_RDONLY, both freeze_bdev and thaw_bdev call deactivate_locked_super,
> which seems wrong. The change was introduced with the commit below:
>
>   commit 4504230a71566785a05d3e6b53fa1ee071b864eb
>   Author: Christoph Hellwig <hch@lst.de>
>   Date:   Mon Aug 3 23:28:35 2009 +0200
>
>     freeze_bdev: grab active reference to frozen superblocks
>
> With the attached patch, both remount-ro and remount-rw are
> rejected as EBUSY on a frozen device, as expected.
>
> Christoph, do you think this is the right fix?
>

With the fix from Jun'ichi Nomura, a 2.6.32.5 kernel now boots correctly.

Thanks.