* Re: [Bug 30882] Automatic process group scheduling causes crashes after a while
[not found] ` <201103152102.p2FL2NN8006070@demeter2.kernel.org>
@ 2011-03-15 21:24 ` Andrew Morton
2011-03-15 23:54 ` Mehmet Giritli
0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2011-03-15 21:24 UTC (permalink / raw)
To: linux-fsdevel, Nick Piggin, Al Viro
Cc: bugzilla-daemon, Mehmet Giritli, Ian Kent
(switched to email. Please respond via emailed reply-to-all, not via the
bugzilla web interface).
Seems that we have a nasty involving autofs, nfs and the VFS.
Mehmet, the kernel should have printed some diagnostics prior to doing
the BUG() call:
	if (dentry->d_count != 0) {
		printk(KERN_ERR
		       "BUG: Dentry %p{i=%lx,n=%s}"
		       " still in use (%d)"
		       " [unmount of %s %s]\n",
		       dentry,
		       dentry->d_inode ?
		       dentry->d_inode->i_ino : 0UL,
		       dentry->d_name.name,
		       dentry->d_count,
		       dentry->d_sb->s_type->name,
		       dentry->d_sb->s_id);
		BUG();
	}
Please find those in the log and email them to us - someone might find
it useful.
On Tue, 15 Mar 2011 21:02:23 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=30882
>
> --- Comment #4 from Mehmet Giritli <mehmet@giritli.eu> 2011-03-15 21:02:22 ---
> Here is that crash happening again; the system was NOT running overclocked or
> anything...
>
> [ 1860.156122] ------------[ cut here ]------------
> [ 1860.156124] kernel BUG at fs/dcache.c:943!
> [ 1860.156126] invalid opcode: 0000 [#1] SMP
> [ 1860.156127] last sysfs file: /sys/devices/platform/it87.552/fan3_input
> [ 1860.156128] CPU 3
> [ 1860.156129] Modules linked in: iptable_mangle iptable_nat nf_nat ipt_LOG xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_state xt_mac iptable_filter xt_multiport xt_mark xt_conntrack xt_connmark nf_conntrack ip_tables x_tables nvidia(P)
> [ 1860.156137]
> [ 1860.156139] Pid: 7388, comm: umount.nfs Tainted: P 2.6.38-rc8 #9 Gigabyte Technology Co., Ltd. GA-790FXTA-UD5/GA-790FXTA-UD5
> [ 1860.156142] RIP: 0010:[<ffffffff810e9648>]  [<ffffffff810e9648>] shrink_dcache_for_umount_subtree+0x268/0x270
> [ 1860.156147] RSP: 0018:ffff8800be82fe08  EFLAGS: 00010296
> [ 1860.156149] RAX: 0000000000000065 RBX: ffff88023f96e600 RCX: 000000000003ffff
> [ 1860.156150] RDX: ffffffff8161f888 RSI: 0000000000000046 RDI: ffffffff8174c9f8
> [ 1860.156151] RBP: ffff88023f96e600 R08: 0000000000012c37 R09: 0000000000000006
> [ 1860.156152] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88023a07f5e0
> [ 1860.156154] R13: ffff88023f96e65c R14: ffff8800be82ff18 R15: ffff880211d38740
> [ 1860.156155] FS:  00007f3428cb2700(0000) GS:ffff8800bfac0000(0000) knlGS:00000000f74186c0
> [ 1860.156156] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 1860.156157] CR2: 00007f7c97da1000 CR3: 00000000bea08000 CR4: 00000000000006e0
> [ 1860.156159] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 1860.156160] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 1860.156161] Process umount.nfs (pid: 7388, threadinfo ffff8800be82e000, task ffff880211fd5640)
> [ 1860.156162] Stack:
> [ 1860.156163]  ffff88020c05ce50 0000000000000000 ffff88023fc07128 ffff88020c05cc00
> [ 1860.156165]  ffff88023f96e6c0 ffff8800be82ff28 ffff88023f96e300 ffffffff810e96a4
> [ 1860.156167]  ffff88023f49f480 ffff88020c05cc00 ffffffff8146d4a0 ffffffff810d5d15
> [ 1860.156169] Call Trace:
> [ 1860.156172]  [<ffffffff810e96a4>] ? shrink_dcache_for_umount+0x54/0x60
> [ 1860.156174]  [<ffffffff810d5d15>] ? generic_shutdown_super+0x25/0x100
> [ 1860.156176]  [<ffffffff810d5e79>] ? kill_anon_super+0x9/0x40
> [ 1860.156179]  [<ffffffff81179aed>] ? nfs_kill_super+0xd/0x20
> [ 1860.156181]  [<ffffffff810d5f13>] ? deactivate_locked_super+0x43/0x70
> [ 1860.156183]  [<ffffffff810ef4d8>] ? release_mounts+0x68/0x90
> [ 1860.156185]  [<ffffffff810efa54>] ? sys_umount+0x314/0x3d0
> [ 1860.156187]  [<ffffffff8100243b>] ? system_call_fastpath+0x16/0x1b
> [ 1860.156188] Code: 8b 0a 31 d2 48 85 f6 74 07 48 8b 96 a8 00 00 00 48 05 50 02 00 00 48 89 de 48 c7 c7 40 3a 52 81 48 89 04 24 31 c0 e8 a1 bc 35 00 <0f> 0b eb fe 0f 0b eb fe 55 53 48 89 fb 48 8d 7f 68 48 83 ec 08
> [ 1860.156201] RIP  [<ffffffff810e9648>] shrink_dcache_for_umount_subtree+0x268/0x270
> [ 1860.156204] RSP <ffff8800be82fe08>
> [ 1860.156205] ---[ end trace ee03486c16c108a7 ]---
>
> --
> Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
> ------- You are receiving this mail because: -------
> You are on the CC list for the bug.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Bug 30882] Automatic process group scheduling causes crashes after a while
From: Mehmet Giritli @ 2011-03-15 23:54 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-fsdevel, Nick Piggin, Al Viro, bugzilla-daemon,
Mehmet Giritli, Ian Kent
The missing piece is as follows:
Mar 15 22:37:38 mordor kernel: [ 1860.156114] BUG: Dentry
ffff88023f96e600{i=25f56f,n=} still in use (1) [unmount of nfs 0:f]
(sorry for the inconvenience, Andrew)
On Tue, 2011-03-15 at 14:24 -0700, Andrew Morton wrote:
> (switched to email. Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
>
> Seems that we have a nasty involving autofs, nfs and the VFS.
>
> Mehmet, the kernel should have printed some diagnostics prior to doing
> the BUG() call:
>
> [...]
* Re: [Bug 30882] Automatic process group scheduling causes crashes after a while
From: Ian Kent @ 2011-03-16 2:32 UTC (permalink / raw)
To: mgiritli
Cc: Andrew Morton, linux-fsdevel, Nick Piggin, Al Viro,
bugzilla-daemon, Mehmet Giritli
On Wed, 2011-03-16 at 01:54 +0200, Mehmet Giritli wrote:
> The missing piece is as follows:
>
> Mar 15 22:37:38 mordor kernel: [ 1860.156114] BUG: Dentry
> ffff88023f96e600{i=25f56f,n=} still in use (1) [unmount of nfs 0:f]
This might be the same problem I saw and described in rc1.
However, for me the fs in the BUG() report was autofs.
Hopefully that just means my autofs setup is different.
At the moment I believe a dentry leak Al Viro spotted is the cause.
Please try this patch.
autofs4 - fix dentry leak in autofs4_expire_direct()
From: Ian Kent <raven@themaw.net>
There is a missing dput() when returning from autofs4_expire_direct()
when we see that the dentry is already a pending mount.
Signed-off-by: Ian Kent <raven@themaw.net>
---
 fs/autofs4/expire.c |    7 +++----
 1 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/autofs4/expire.c b/fs/autofs4/expire.c
index c896dd6..c403abc 100644
--- a/fs/autofs4/expire.c
+++ b/fs/autofs4/expire.c
@@ -290,10 +290,8 @@ struct dentry *autofs4_expire_direct(struct super_block *sb,
 	spin_lock(&sbi->fs_lock);
 	ino = autofs4_dentry_ino(root);
 	/* No point expiring a pending mount */
-	if (ino->flags & AUTOFS_INF_PENDING) {
-		spin_unlock(&sbi->fs_lock);
-		return NULL;
-	}
+	if (ino->flags & AUTOFS_INF_PENDING)
+		goto out;
 	if (!autofs4_direct_busy(mnt, root, timeout, do_now)) {
 		struct autofs_info *ino = autofs4_dentry_ino(root);
 		ino->flags |= AUTOFS_INF_EXPIRING;
@@ -301,6 +299,7 @@ struct dentry *autofs4_expire_direct(struct super_block *sb,
 		spin_unlock(&sbi->fs_lock);
 		return root;
 	}
+out:
 	spin_unlock(&sbi->fs_lock);
 	dput(root);
>
> (sorry for the inconvenience Andrew)
>
> On Tue, 2011-03-15 at 14:24 -0700, Andrew Morton wrote:
> > [...]
* Re: [Bug 30882] Automatic process group scheduling causes crashes after a while
From: Mehmet Giritli @ 2011-03-16 14:27 UTC (permalink / raw)
To: Ian Kent
Cc: Andrew Morton, linux-fsdevel, Nick Piggin, Al Viro,
bugzilla-daemon, Mehmet Giritli
Ian,
I am having much more frequent crashes now. I haven't been able to
cleanly reboot my machine yet, and I have tried three times so far. Init
scripts fail to unmount the file systems and I have to reboot manually.
On Wed, 2011-03-16 at 10:32 +0800, Ian Kent wrote:
> On Wed, 2011-03-16 at 01:54 +0200, Mehmet Giritli wrote:
> > The missing piece is as follows:
> >
> > Mar 15 22:37:38 mordor kernel: [ 1860.156114] BUG: Dentry
> > ffff88023f96e600{i=25f56f,n=} still in use (1) [unmount of nfs 0:f]
>
> This might be the same problem I saw and described in rc1.
> However, for me the fs in the BUG() report was autofs.
> Hopefully that just means my autofs setup is different.
>
> At the moment I believe a dentry leak Al Viro spotted is the cause.
> Please try this patch.
>
> autofs4 - fix dentry leak in autofs4_expire_direct()
>
> [...]
* Re: [Bug 30882] Automatic process group scheduling causes crashes after a while
From: Ian Kent @ 2011-03-16 15:21 UTC (permalink / raw)
To: mgiritli
Cc: Andrew Morton, linux-fsdevel, Nick Piggin, Al Viro,
bugzilla-daemon, Mehmet Giritli
On Wed, 2011-03-16 at 16:27 +0200, Mehmet Giritli wrote:
> Ian,
>
> I am having much more frequent crashes now. I havent been able to
> cleanly reboot my machine yet and I have tried three times so far. Init
> scripts fail to unmount the file systems and I have to reboot manually
What do your autofs maps look like?
>
> [...]
* Re: [Bug 30882] Automatic process group scheduling causes crashes after a while
From: Mehmet Giritli @ 2011-03-16 15:29 UTC (permalink / raw)
To: Ian Kent
Cc: Andrew Morton, linux-fsdevel, Nick Piggin, Al Viro,
bugzilla-daemon, Mehmet Giritli
On Wed, 2011-03-16 at 23:21 +0800, Ian Kent wrote:
> On Wed, 2011-03-16 at 16:27 +0200, Mehmet Giritli wrote:
> > Ian,
> >
> > I am having much more frequent crashes now. I havent been able to
> > cleanly reboot my machine yet and I have tried three times so far. Init
> > scripts fail to unmount the file systems and I have to reboot manually
>
> What do your autofs maps look like?
>
>
Here is the contents of my auto.misc:
gollum-media -rsize=8192,wsize=8192,soft,timeo=10,rw gollum.giritli.eu:/mnt/media
gollum-distfiles -rsize=8192,wsize=8192,soft,timeo=10,rw gollum.giritli.eu:/usr/portage/distfiles
gollum-www -rsize=8192,wsize=8192,soft,timeo=10,rw gollum.giritli.eu:/var/www
gollum-WebDav -rsize=8192,wsize=8192,soft,timeo=10,rw gollum.giritli.eu:/var/dav
> >
> > On Wed, 2011-03-16 at 10:32 +0800, Ian Kent wrote:
> > > On Wed, 2011-03-16 at 01:54 +0200, Mehmet Giritli wrote:
> > > > The missing piece is as follows:
> > > >
> > > > Mar 15 22:37:38 mordor kernel: [ 1860.156114] BUG: Dentry
> > > > ffff88023f96e600{i=25f56f,n=} still in use (1) [unmount of nfs 0:f]
> > >
> > > This might be the same problem I saw and described in rc1.
> > > However, for me the fs in the BUG() report was autofs.
> > > Hopefully that just means my autofs setup is different.
> > >
> > > At the moment I believe a dentry leak Al Viro spotted is the cause.
> > > Please try this patch.
> > >
> > > autofs4 - fix dentry leak in autofs4_expire_direct()
> > >
> > > From: Ian Kent <raven@themaw.net>
> > >
> > > There is a missing dput() when returning from autofs4_expire_direct()
> > > when we see that the dentry is already a pending mount.
> > >
> > > Signed-off-by: Ian Kent <raven@themaw.net>
> > > ---
> > >
> > > fs/autofs4/expire.c | 7 +++----
> > > 1 files changed, 3 insertions(+), 4 deletions(-)
> > >
> > >
> > > diff --git a/fs/autofs4/expire.c b/fs/autofs4/expire.c
> > > index c896dd6..c403abc 100644
> > > --- a/fs/autofs4/expire.c
> > > +++ b/fs/autofs4/expire.c
> > > @@ -290,10 +290,8 @@ struct dentry *autofs4_expire_direct(struct super_block *sb,
> > > spin_lock(&sbi->fs_lock);
> > > ino = autofs4_dentry_ino(root);
> > > /* No point expiring a pending mount */
> > > - if (ino->flags & AUTOFS_INF_PENDING) {
> > > - spin_unlock(&sbi->fs_lock);
> > > - return NULL;
> > > - }
> > > + if (ino->flags & AUTOFS_INF_PENDING)
> > > + goto out;
> > > if (!autofs4_direct_busy(mnt, root, timeout, do_now)) {
> > > struct autofs_info *ino = autofs4_dentry_ino(root);
> > > ino->flags |= AUTOFS_INF_EXPIRING;
> > > @@ -301,6 +299,7 @@ struct dentry *autofs4_expire_direct(struct super_block *sb,
> > > spin_unlock(&sbi->fs_lock);
> > > return root;
> > > }
> > > +out:
> > > spin_unlock(&sbi->fs_lock);
> > > dput(root);
> > >
> > >
> > > >
> > > > (sorry for the inconvenience Andrew)
> > > >
> > > > On Tue, 2011-03-15 at 14:24 -0700, Andrew Morton wrote:
> > > > > (switched to email. Please respond via emailed reply-to-all, not via the
> > > > > bugzilla web interface).
> > > > >
> > > > > Seems that we have a nasty involving autofs, nfs and the VFS.
> > > > >
> > > > > Mehmet, the kernel should have printed some diagnostics prior to doing
> > > > > the BUG() call:
> > > > >
> > > > > if (dentry->d_count != 0) {
> > > > > printk(KERN_ERR
> > > > > "BUG: Dentry %p{i=%lx,n=%s}"
> > > > > " still in use (%d)"
> > > > > " [unmount of %s %s]\n",
> > > > > dentry,
> > > > > dentry->d_inode ?
> > > > > dentry->d_inode->i_ino : 0UL,
> > > > > dentry->d_name.name,
> > > > > dentry->d_count,
> > > > > dentry->d_sb->s_type->name,
> > > > > dentry->d_sb->s_id);
> > > > > BUG();
> > > > > }
> > > > >
> > > > > Please find those in the log and email them to us - someone might find
> > > > > it useful.
> > > > >
> > > > >
> > > > > On Tue, 15 Mar 2011 21:02:23 GMT
> > > > > bugzilla-daemon@bugzilla.kernel.org wrote:
> > > > >
> > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=30882
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --- Comment #4 from Mehmet Giritli <mehmet@giritli.eu> 2011-03-15 21:02:22 ---
> > > > > > Here is that crash happening again, the system was NOT running overclocked or
> > > > > > anything...
> > > > > >
> > > > > > [ 1860.156122] ------------[ cut here ]------------
> > > > > > [ 1860.156124] kernel BUG at fs/dcache.c:943!
> > > > > > [ 1860.156126] invalid opcode: 0000 [#1] SMP
> > > > > > [ 1860.156127] last sysfs file: /sys/devices/platform/it87.552/fan3_input
> > > > > > [ 1860.156128] CPU 3
> > > > > > [ 1860.156129] Modules linked in: iptable_mangle iptable_nat nf_nat ipt_LOG
> > > > > > xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_state xt_mac iptable_filter
> > > > > > xt_multiport xt_mark xt_conntrack xt_connmark nf_conntrack ip_tables x_tables
> > > > > > nvidia(P)
> > > > > > [ 1860.156137]
> > > > > > [ 1860.156139] Pid: 7388, comm: umount.nfs Tainted: P 2.6.38-rc8 #9
> > > > > > Gigabyte Technology Co., Ltd. GA-790FXTA-UD5/GA-790FXTA-UD5
> > > > > > [ 1860.156142] RIP: 0010:[<ffffffff810e9648>] [<ffffffff810e9648>]
> > > > > > shrink_dcache_for_umount_subtree+0x268/0x270
> > > > > > [ 1860.156147] RSP: 0018:ffff8800be82fe08 EFLAGS: 00010296
> > > > > > [ 1860.156149] RAX: 0000000000000065 RBX: ffff88023f96e600 RCX:
> > > > > > 000000000003ffff
> > > > > > [ 1860.156150] RDX: ffffffff8161f888 RSI: 0000000000000046 RDI:
> > > > > > ffffffff8174c9f8
> > > > > > [ 1860.156151] RBP: ffff88023f96e600 R08: 0000000000012c37 R09:
> > > > > > 0000000000000006
> > > > > > [ 1860.156152] R10: 0000000000000000 R11: 0000000000000000 R12:
> > > > > > ffff88023a07f5e0
> > > > > > [ 1860.156154] R13: ffff88023f96e65c R14: ffff8800be82ff18 R15:
> > > > > > ffff880211d38740
> > > > > > [ 1860.156155] FS: 00007f3428cb2700(0000) GS:ffff8800bfac0000(0000)
> > > > > > knlGS:00000000f74186c0
> > > > > > [ 1860.156156] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > > > > > [ 1860.156157] CR2: 00007f7c97da1000 CR3: 00000000bea08000 CR4:
> > > > > > 00000000000006e0
> > > > > > [ 1860.156159] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> > > > > > 0000000000000000
> > > > > > [ 1860.156160] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> > > > > > 0000000000000400
> > > > > > [ 1860.156161] Process umount.nfs (pid: 7388, threadinfo ffff8800be82e000, task
> > > > > > ffff880211fd5640)
> > > > > > [ 1860.156162] Stack:
> > > > > > [ 1860.156163] ffff88020c05ce50 0000000000000000 ffff88023fc07128
> > > > > > ffff88020c05cc00
> > > > > > [ 1860.156165] ffff88023f96e6c0 ffff8800be82ff28 ffff88023f96e300
> > > > > > ffffffff810e96a4
> > > > > > [ 1860.156167] ffff88023f49f480 ffff88020c05cc00 ffffffff8146d4a0
> > > > > > ffffffff810d5d15
> > > > > > [ 1860.156169] Call Trace:
> > > > > > [ 1860.156172] [<ffffffff810e96a4>] ? shrink_dcache_for_umount+0x54/0x60
> > > > > > [ 1860.156174] [<ffffffff810d5d15>] ? generic_shutdown_super+0x25/0x100
> > > > > > [ 1860.156176] [<ffffffff810d5e79>] ? kill_anon_super+0x9/0x40
> > > > > > [ 1860.156179] [<ffffffff81179aed>] ? nfs_kill_super+0xd/0x20
> > > > > > [ 1860.156181] [<ffffffff810d5f13>] ? deactivate_locked_super+0x43/0x70
> > > > > > [ 1860.156183] [<ffffffff810ef4d8>] ? release_mounts+0x68/0x90
> > > > > > [ 1860.156185] [<ffffffff810efa54>] ? sys_umount+0x314/0x3d0
> > > > > > [ 1860.156187] [<ffffffff8100243b>] ? system_call_fastpath+0x16/0x1b
> > > > > > [ 1860.156188] Code: 8b 0a 31 d2 48 85 f6 74 07 48 8b 96 a8 00 00 00 48 05 50
> > > > > > 02 00 00 48 89 de 48 c7 c7 40 3a 52 81 48 89 04 24 31 c0 e8 a1 bc 35 00 <0f> 0b
> > > > > > eb fe 0f 0b eb fe 55 53 48 89 fb 48 8d 7f 68 48 83 ec 08
> > > > > > [ 1860.156201] RIP [<ffffffff810e9648>]
> > > > > > shrink_dcache_for_umount_subtree+0x268/0x270
> > > > > > [ 1860.156204] RSP <ffff8800be82fe08>
> > > > > > [ 1860.156205] ---[ end trace ee03486c16c108a7 ]---
> > > > > >
> > > > > > --
> > > > > > Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
> > > > > > ------- You are receiving this mail because: -------
> > > > > > You are on the CC list for the bug.
> > > > >
> > > >
> > > >
> > >
> > >
> >
> >
>
>
* Re: [Bug 30882] Automatic process group scheduling causes crashes after a while
2011-03-16 15:29 ` Mehmet Giritli
@ 2011-03-16 15:44 ` Ian Kent
2011-03-16 16:48 ` Mehmet Giritli
2011-03-16 17:47 ` Mehmet Giritli
0 siblings, 2 replies; 10+ messages in thread
From: Ian Kent @ 2011-03-16 15:44 UTC (permalink / raw)
To: mgiritli
Cc: Andrew Morton, linux-fsdevel, Nick Piggin, Al Viro,
bugzilla-daemon, Mehmet Giritli
On Wed, 2011-03-16 at 17:29 +0200, Mehmet Giritli wrote:
> On Wed, 2011-03-16 at 23:21 +0800, Ian Kent wrote:
> > On Wed, 2011-03-16 at 16:27 +0200, Mehmet Giritli wrote:
> > > Ian,
> > >
> > > I am having much more frequent crashes now. I haven't been able to
> > > cleanly reboot my machine yet and I have tried three times so far. Init
> > > scripts fail to unmount the file systems and I have to reboot manually
> >
> > What do your autofs maps look like?
> >
> >
>
> Here is the contents of my auto.misc:
>
> gollum-media -rsize=8192,wsize=8192,soft,timeo=10,rw gollum.giritli.eu:/mnt/media
> gollum-distfiles -rsize=8192,wsize=8192,soft,timeo=10,rw gollum.giritli.eu:/usr/portage/distfiles
> gollum-www -rsize=8192,wsize=8192,soft,timeo=10,rw gollum.giritli.eu:/var/www
> gollum-WebDav -rsize=8192,wsize=8192,soft,timeo=10,rw gollum.giritli.eu:/var/dav
What, that's it, and you're only using "/misc /etc/auto.misc" in the
master map and you're having problems?
Are the crashes always the same?
How have you established that the BUG()s are in fact due to automount
umounting mounts and that the BUG()s correspond to NFS mounts previously
mounted by autofs?
Is there any noise at all in the syslog?
Are you sure you're using a kernel with the dentry leak patch?
What sort of automounting load is happening on the machine, ie.
frequency of mounts and umounts, and what timeout are you using?
The dentry leak patch got rid of the BUG()s I was seeing but by that
time I did have a couple of other patches. I still don't think the other
patches made much difference for this particular case.
>
> > >
> > > On Wed, 2011-03-16 at 10:32 +0800, Ian Kent wrote:
> > > > On Wed, 2011-03-16 at 01:54 +0200, Mehmet Giritli wrote:
> > > > > The missing piece is as follows:
> > > > >
> > > > > Mar 15 22:37:38 mordor kernel: [ 1860.156114] BUG: Dentry
> > > > > ffff88023f96e600{i=25f56f,n=} still in use (1) [unmount of nfs 0:f]
> > > >
> > > > This might be the same problem I saw and described in rc1.
> > > > However, for me the fs in the BUG() report was autofs.
> > > > Hopefully that just means my autofs setup is different.
> > > >
> > > > At the moment I believe a dentry leak Al Viro spotted is the cause.
> > > > Please try this patch.
> > > >
> > > > autofs4 - fix dentry leak in autofs4_expire_direct()
> > > >
> > > > From: Ian Kent <raven@themaw.net>
> > > >
> > > > There is a missing dput() when returning from autofs4_expire_direct()
> > > > when we see that the dentry is already a pending mount.
> > > >
> > > > Signed-off-by: Ian Kent <raven@themaw.net>
> > > > ---
> > > >
> > > > fs/autofs4/expire.c | 7 +++----
> > > > 1 files changed, 3 insertions(+), 4 deletions(-)
> > > >
> > > >
> > > > diff --git a/fs/autofs4/expire.c b/fs/autofs4/expire.c
> > > > index c896dd6..c403abc 100644
> > > > --- a/fs/autofs4/expire.c
> > > > +++ b/fs/autofs4/expire.c
> > > > @@ -290,10 +290,8 @@ struct dentry *autofs4_expire_direct(struct super_block *sb,
> > > > spin_lock(&sbi->fs_lock);
> > > > ino = autofs4_dentry_ino(root);
> > > > /* No point expiring a pending mount */
> > > > - if (ino->flags & AUTOFS_INF_PENDING) {
> > > > - spin_unlock(&sbi->fs_lock);
> > > > - return NULL;
> > > > - }
> > > > + if (ino->flags & AUTOFS_INF_PENDING)
> > > > + goto out;
> > > > if (!autofs4_direct_busy(mnt, root, timeout, do_now)) {
> > > > struct autofs_info *ino = autofs4_dentry_ino(root);
> > > > ino->flags |= AUTOFS_INF_EXPIRING;
> > > > @@ -301,6 +299,7 @@ struct dentry *autofs4_expire_direct(struct super_block *sb,
> > > > spin_unlock(&sbi->fs_lock);
> > > > return root;
> > > > }
> > > > +out:
> > > > spin_unlock(&sbi->fs_lock);
> > > > dput(root);
> > > >
> > > >
> > > > >
> > > > > (sorry for the inconvenience Andrew)
> > > > >
> > > > > On Tue, 2011-03-15 at 14:24 -0700, Andrew Morton wrote:
> > > > > > (switched to email. Please respond via emailed reply-to-all, not via the
> > > > > > bugzilla web interface).
> > > > > >
> > > > > > Seems that we have a nasty involving autofs, nfs and the VFS.
> > > > > >
> > > > > > Mehmet, the kernel should have printed some diagnostics prior to doing
> > > > > > the BUG() call:
> > > > > >
> > > > > > if (dentry->d_count != 0) {
> > > > > > printk(KERN_ERR
> > > > > > "BUG: Dentry %p{i=%lx,n=%s}"
> > > > > > " still in use (%d)"
> > > > > > " [unmount of %s %s]\n",
> > > > > > dentry,
> > > > > > dentry->d_inode ?
> > > > > > dentry->d_inode->i_ino : 0UL,
> > > > > > dentry->d_name.name,
> > > > > > dentry->d_count,
> > > > > > dentry->d_sb->s_type->name,
> > > > > > dentry->d_sb->s_id);
> > > > > > BUG();
> > > > > > }
> > > > > >
> > > > > > Please find those in the log and email them to us - someone might find
> > > > > > it useful.
> > > > > >
> > > > > >
> > > > > > On Tue, 15 Mar 2011 21:02:23 GMT
> > > > > > bugzilla-daemon@bugzilla.kernel.org wrote:
> > > > > >
> > > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=30882
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --- Comment #4 from Mehmet Giritli <mehmet@giritli.eu> 2011-03-15 21:02:22 ---
> > > > > > > Here is that crash happening again, the system was NOT running overclocked or
> > > > > > > anything...
> > > > > > >
> > > > > > > [ 1860.156122] ------------[ cut here ]------------
> > > > > > > [ 1860.156124] kernel BUG at fs/dcache.c:943!
> > > > > > > [ 1860.156126] invalid opcode: 0000 [#1] SMP
> > > > > > > [ 1860.156127] last sysfs file: /sys/devices/platform/it87.552/fan3_input
> > > > > > > [ 1860.156128] CPU 3
> > > > > > > [ 1860.156129] Modules linked in: iptable_mangle iptable_nat nf_nat ipt_LOG
> > > > > > > xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_state xt_mac iptable_filter
> > > > > > > xt_multiport xt_mark xt_conntrack xt_connmark nf_conntrack ip_tables x_tables
> > > > > > > nvidia(P)
> > > > > > > [ 1860.156137]
> > > > > > > [ 1860.156139] Pid: 7388, comm: umount.nfs Tainted: P 2.6.38-rc8 #9
> > > > > > > Gigabyte Technology Co., Ltd. GA-790FXTA-UD5/GA-790FXTA-UD5
> > > > > > > [ 1860.156142] RIP: 0010:[<ffffffff810e9648>] [<ffffffff810e9648>]
> > > > > > > shrink_dcache_for_umount_subtree+0x268/0x270
> > > > > > > [ 1860.156147] RSP: 0018:ffff8800be82fe08 EFLAGS: 00010296
> > > > > > > [ 1860.156149] RAX: 0000000000000065 RBX: ffff88023f96e600 RCX:
> > > > > > > 000000000003ffff
> > > > > > > [ 1860.156150] RDX: ffffffff8161f888 RSI: 0000000000000046 RDI:
> > > > > > > ffffffff8174c9f8
> > > > > > > [ 1860.156151] RBP: ffff88023f96e600 R08: 0000000000012c37 R09:
> > > > > > > 0000000000000006
> > > > > > > [ 1860.156152] R10: 0000000000000000 R11: 0000000000000000 R12:
> > > > > > > ffff88023a07f5e0
> > > > > > > [ 1860.156154] R13: ffff88023f96e65c R14: ffff8800be82ff18 R15:
> > > > > > > ffff880211d38740
> > > > > > > [ 1860.156155] FS: 00007f3428cb2700(0000) GS:ffff8800bfac0000(0000)
> > > > > > > knlGS:00000000f74186c0
> > > > > > > [ 1860.156156] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > > > > > > [ 1860.156157] CR2: 00007f7c97da1000 CR3: 00000000bea08000 CR4:
> > > > > > > 00000000000006e0
> > > > > > > [ 1860.156159] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> > > > > > > 0000000000000000
> > > > > > > [ 1860.156160] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> > > > > > > 0000000000000400
> > > > > > > [ 1860.156161] Process umount.nfs (pid: 7388, threadinfo ffff8800be82e000, task
> > > > > > > ffff880211fd5640)
> > > > > > > [ 1860.156162] Stack:
> > > > > > > [ 1860.156163] ffff88020c05ce50 0000000000000000 ffff88023fc07128
> > > > > > > ffff88020c05cc00
> > > > > > > [ 1860.156165] ffff88023f96e6c0 ffff8800be82ff28 ffff88023f96e300
> > > > > > > ffffffff810e96a4
> > > > > > > [ 1860.156167] ffff88023f49f480 ffff88020c05cc00 ffffffff8146d4a0
> > > > > > > ffffffff810d5d15
> > > > > > > [ 1860.156169] Call Trace:
> > > > > > > [ 1860.156172] [<ffffffff810e96a4>] ? shrink_dcache_for_umount+0x54/0x60
> > > > > > > [ 1860.156174] [<ffffffff810d5d15>] ? generic_shutdown_super+0x25/0x100
> > > > > > > [ 1860.156176] [<ffffffff810d5e79>] ? kill_anon_super+0x9/0x40
> > > > > > > [ 1860.156179] [<ffffffff81179aed>] ? nfs_kill_super+0xd/0x20
> > > > > > > [ 1860.156181] [<ffffffff810d5f13>] ? deactivate_locked_super+0x43/0x70
> > > > > > > [ 1860.156183] [<ffffffff810ef4d8>] ? release_mounts+0x68/0x90
> > > > > > > [ 1860.156185] [<ffffffff810efa54>] ? sys_umount+0x314/0x3d0
> > > > > > > [ 1860.156187] [<ffffffff8100243b>] ? system_call_fastpath+0x16/0x1b
> > > > > > > [ 1860.156188] Code: 8b 0a 31 d2 48 85 f6 74 07 48 8b 96 a8 00 00 00 48 05 50
> > > > > > > 02 00 00 48 89 de 48 c7 c7 40 3a 52 81 48 89 04 24 31 c0 e8 a1 bc 35 00 <0f> 0b
> > > > > > > eb fe 0f 0b eb fe 55 53 48 89 fb 48 8d 7f 68 48 83 ec 08
> > > > > > > [ 1860.156201] RIP [<ffffffff810e9648>]
> > > > > > > shrink_dcache_for_umount_subtree+0x268/0x270
> > > > > > > [ 1860.156204] RSP <ffff8800be82fe08>
> > > > > > > [ 1860.156205] ---[ end trace ee03486c16c108a7 ]---
> > > > > > >
> > > > > > > --
> > > > > > > Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
> > > > > > > ------- You are receiving this mail because: -------
> > > > > > > You are on the CC list for the bug.
> > > > > >
> > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> >
> >
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: [Bug 30882] Automatic process group scheduling causes crashes after a while
2011-03-16 15:44 ` Ian Kent
@ 2011-03-16 16:48 ` Mehmet Giritli
2011-03-16 17:47 ` Mehmet Giritli
1 sibling, 0 replies; 10+ messages in thread
From: Mehmet Giritli @ 2011-03-16 16:48 UTC (permalink / raw)
To: Ian Kent
Cc: Andrew Morton, linux-fsdevel, Nick Piggin, Al Viro,
bugzilla-daemon, Mehmet Giritli
On Wed, 2011-03-16 at 23:44 +0800, Ian Kent wrote:
> On Wed, 2011-03-16 at 17:29 +0200, Mehmet Giritli wrote:
> > On Wed, 2011-03-16 at 23:21 +0800, Ian Kent wrote:
> > > On Wed, 2011-03-16 at 16:27 +0200, Mehmet Giritli wrote:
> > > > Ian,
> > > >
> > > > I am having much more frequent crashes now. I haven't been able to
> > > > cleanly reboot my machine yet and I have tried three times so far. Init
> > > > scripts fail to unmount the file systems and I have to reboot manually
> > >
> > > What do your autofs maps look like?
> > >
> > >
> >
> > Here is the contents of my auto.misc:
> >
> > gollum-media -rsize=8192,wsize=8192,soft,timeo=10,rw gollum.giritli.eu:/mnt/media
> > gollum-distfiles -rsize=8192,wsize=8192,soft,timeo=10,rw gollum.giritli.eu:/usr/portage/distfiles
> > gollum-www -rsize=8192,wsize=8192,soft,timeo=10,rw gollum.giritli.eu:/var/www
> > gollum-WebDav -rsize=8192,wsize=8192,soft,timeo=10,rw gollum.giritli.eu:/var/dav
>
> What, that's it, and you're only using "/misc /etc/auto.misc" in the
> master map and you're having problems?
yes
>
> Are the crashes always the same?
identical
> How have you established that the BUG()s are in fact due to automount
> umounting mounts and that the BUG()s correspond to NFS mounts previously
> mounted by autofs?
I haven't established anything. However, that's the only way I mount nfs,
and my file manager hangs and init scripts hang when trying to unmount...
> Is there any noise at all in the syslog?
nothing unusual
> Are you sure you're using a kernel with the dentry leak patch?
yes
> What sort of automounting load is happening on the machine, ie.
> frequency of mounts and umounts, and what timeout are you using?
from auto.master:
/mnt/autofs /etc/auto.misc --timeout=300 --ghost
Not very much. Lets say 2-3 times every hour for each mount point.
> The dentry leak patch got rid of the BUG()s I was seeing but by that
> time I did have a couple of other patches. I still don't think the other
> patches made much difference for this particular case.
>
> >
> > > >
> > > > On Wed, 2011-03-16 at 10:32 +0800, Ian Kent wrote:
> > > > > On Wed, 2011-03-16 at 01:54 +0200, Mehmet Giritli wrote:
> > > > > > The missing piece is as follows:
> > > > > >
> > > > > > Mar 15 22:37:38 mordor kernel: [ 1860.156114] BUG: Dentry
> > > > > > ffff88023f96e600{i=25f56f,n=} still in use (1) [unmount of nfs 0:f]
> > > > >
> > > > > This might be the same problem I saw and described in rc1.
> > > > > However, for me the fs in the BUG() report was autofs.
> > > > > Hopefully that just means my autofs setup is different.
> > > > >
> > > > > At the moment I believe a dentry leak Al Viro spotted is the cause.
> > > > > Please try this patch.
> > > > >
> > > > > autofs4 - fix dentry leak in autofs4_expire_direct()
> > > > >
> > > > > From: Ian Kent <raven@themaw.net>
> > > > >
> > > > > There is a missing dput() when returning from autofs4_expire_direct()
> > > > > when we see that the dentry is already a pending mount.
> > > > >
> > > > > Signed-off-by: Ian Kent <raven@themaw.net>
> > > > > ---
> > > > >
> > > > > fs/autofs4/expire.c | 7 +++----
> > > > > 1 files changed, 3 insertions(+), 4 deletions(-)
> > > > >
> > > > >
> > > > > diff --git a/fs/autofs4/expire.c b/fs/autofs4/expire.c
> > > > > index c896dd6..c403abc 100644
> > > > > --- a/fs/autofs4/expire.c
> > > > > +++ b/fs/autofs4/expire.c
> > > > > @@ -290,10 +290,8 @@ struct dentry *autofs4_expire_direct(struct super_block *sb,
> > > > > spin_lock(&sbi->fs_lock);
> > > > > ino = autofs4_dentry_ino(root);
> > > > > /* No point expiring a pending mount */
> > > > > - if (ino->flags & AUTOFS_INF_PENDING) {
> > > > > - spin_unlock(&sbi->fs_lock);
> > > > > - return NULL;
> > > > > - }
> > > > > + if (ino->flags & AUTOFS_INF_PENDING)
> > > > > + goto out;
> > > > > if (!autofs4_direct_busy(mnt, root, timeout, do_now)) {
> > > > > struct autofs_info *ino = autofs4_dentry_ino(root);
> > > > > ino->flags |= AUTOFS_INF_EXPIRING;
> > > > > @@ -301,6 +299,7 @@ struct dentry *autofs4_expire_direct(struct super_block *sb,
> > > > > spin_unlock(&sbi->fs_lock);
> > > > > return root;
> > > > > }
> > > > > +out:
> > > > > spin_unlock(&sbi->fs_lock);
> > > > > dput(root);
> > > > >
> > > > >
> > > > > >
> > > > > > (sorry for the inconvenience Andrew)
> > > > > >
> > > > > > On Tue, 2011-03-15 at 14:24 -0700, Andrew Morton wrote:
> > > > > > > (switched to email. Please respond via emailed reply-to-all, not via the
> > > > > > > bugzilla web interface).
> > > > > > >
> > > > > > > Seems that we have a nasty involving autofs, nfs and the VFS.
> > > > > > >
> > > > > > > Mehmet, the kernel should have printed some diagnostics prior to doing
> > > > > > > the BUG() call:
> > > > > > >
> > > > > > > if (dentry->d_count != 0) {
> > > > > > > printk(KERN_ERR
> > > > > > > "BUG: Dentry %p{i=%lx,n=%s}"
> > > > > > > " still in use (%d)"
> > > > > > > " [unmount of %s %s]\n",
> > > > > > > dentry,
> > > > > > > dentry->d_inode ?
> > > > > > > dentry->d_inode->i_ino : 0UL,
> > > > > > > dentry->d_name.name,
> > > > > > > dentry->d_count,
> > > > > > > dentry->d_sb->s_type->name,
> > > > > > > dentry->d_sb->s_id);
> > > > > > > BUG();
> > > > > > > }
> > > > > > >
> > > > > > > Please find those in the log and email them to us - someone might find
> > > > > > > it useful.
> > > > > > >
> > > > > > >
> > > > > > > On Tue, 15 Mar 2011 21:02:23 GMT
> > > > > > > bugzilla-daemon@bugzilla.kernel.org wrote:
> > > > > > >
> > > > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=30882
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --- Comment #4 from Mehmet Giritli <mehmet@giritli.eu> 2011-03-15 21:02:22 ---
> > > > > > > > Here is that crash happening again, the system was NOT running overclocked or
> > > > > > > > anything...
> > > > > > > >
> > > > > > > > [ 1860.156122] ------------[ cut here ]------------
> > > > > > > > [ 1860.156124] kernel BUG at fs/dcache.c:943!
> > > > > > > > [ 1860.156126] invalid opcode: 0000 [#1] SMP
> > > > > > > > [ 1860.156127] last sysfs file: /sys/devices/platform/it87.552/fan3_input
> > > > > > > > [ 1860.156128] CPU 3
> > > > > > > > [ 1860.156129] Modules linked in: iptable_mangle iptable_nat nf_nat ipt_LOG
> > > > > > > > xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_state xt_mac iptable_filter
> > > > > > > > xt_multiport xt_mark xt_conntrack xt_connmark nf_conntrack ip_tables x_tables
> > > > > > > > nvidia(P)
> > > > > > > > [ 1860.156137]
> > > > > > > > [ 1860.156139] Pid: 7388, comm: umount.nfs Tainted: P 2.6.38-rc8 #9
> > > > > > > > Gigabyte Technology Co., Ltd. GA-790FXTA-UD5/GA-790FXTA-UD5
> > > > > > > > [ 1860.156142] RIP: 0010:[<ffffffff810e9648>] [<ffffffff810e9648>]
> > > > > > > > shrink_dcache_for_umount_subtree+0x268/0x270
> > > > > > > > [ 1860.156147] RSP: 0018:ffff8800be82fe08 EFLAGS: 00010296
> > > > > > > > [ 1860.156149] RAX: 0000000000000065 RBX: ffff88023f96e600 RCX:
> > > > > > > > 000000000003ffff
> > > > > > > > [ 1860.156150] RDX: ffffffff8161f888 RSI: 0000000000000046 RDI:
> > > > > > > > ffffffff8174c9f8
> > > > > > > > [ 1860.156151] RBP: ffff88023f96e600 R08: 0000000000012c37 R09:
> > > > > > > > 0000000000000006
> > > > > > > > [ 1860.156152] R10: 0000000000000000 R11: 0000000000000000 R12:
> > > > > > > > ffff88023a07f5e0
> > > > > > > > [ 1860.156154] R13: ffff88023f96e65c R14: ffff8800be82ff18 R15:
> > > > > > > > ffff880211d38740
> > > > > > > > [ 1860.156155] FS: 00007f3428cb2700(0000) GS:ffff8800bfac0000(0000)
> > > > > > > > knlGS:00000000f74186c0
> > > > > > > > [ 1860.156156] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > > > > > > > [ 1860.156157] CR2: 00007f7c97da1000 CR3: 00000000bea08000 CR4:
> > > > > > > > 00000000000006e0
> > > > > > > > [ 1860.156159] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> > > > > > > > 0000000000000000
> > > > > > > > [ 1860.156160] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> > > > > > > > 0000000000000400
> > > > > > > > [ 1860.156161] Process umount.nfs (pid: 7388, threadinfo ffff8800be82e000, task
> > > > > > > > ffff880211fd5640)
> > > > > > > > [ 1860.156162] Stack:
> > > > > > > > [ 1860.156163] ffff88020c05ce50 0000000000000000 ffff88023fc07128
> > > > > > > > ffff88020c05cc00
> > > > > > > > [ 1860.156165] ffff88023f96e6c0 ffff8800be82ff28 ffff88023f96e300
> > > > > > > > ffffffff810e96a4
> > > > > > > > [ 1860.156167] ffff88023f49f480 ffff88020c05cc00 ffffffff8146d4a0
> > > > > > > > ffffffff810d5d15
> > > > > > > > [ 1860.156169] Call Trace:
> > > > > > > > [ 1860.156172] [<ffffffff810e96a4>] ? shrink_dcache_for_umount+0x54/0x60
> > > > > > > > [ 1860.156174] [<ffffffff810d5d15>] ? generic_shutdown_super+0x25/0x100
> > > > > > > > [ 1860.156176] [<ffffffff810d5e79>] ? kill_anon_super+0x9/0x40
> > > > > > > > [ 1860.156179] [<ffffffff81179aed>] ? nfs_kill_super+0xd/0x20
> > > > > > > > [ 1860.156181] [<ffffffff810d5f13>] ? deactivate_locked_super+0x43/0x70
> > > > > > > > [ 1860.156183] [<ffffffff810ef4d8>] ? release_mounts+0x68/0x90
> > > > > > > > [ 1860.156185] [<ffffffff810efa54>] ? sys_umount+0x314/0x3d0
> > > > > > > > [ 1860.156187] [<ffffffff8100243b>] ? system_call_fastpath+0x16/0x1b
> > > > > > > > [ 1860.156188] Code: 8b 0a 31 d2 48 85 f6 74 07 48 8b 96 a8 00 00 00 48 05 50
> > > > > > > > 02 00 00 48 89 de 48 c7 c7 40 3a 52 81 48 89 04 24 31 c0 e8 a1 bc 35 00 <0f> 0b
> > > > > > > > eb fe 0f 0b eb fe 55 53 48 89 fb 48 8d 7f 68 48 83 ec 08
> > > > > > > > [ 1860.156201] RIP [<ffffffff810e9648>]
> > > > > > > > shrink_dcache_for_umount_subtree+0x268/0x270
> > > > > > > > [ 1860.156204] RSP <ffff8800be82fe08>
> > > > > > > > [ 1860.156205] ---[ end trace ee03486c16c108a7 ]---
> > > > > > > >
> > > > > > > > --
> > > > > > > > Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
> > > > > > > > ------- You are receiving this mail because: -------
> > > > > > > > You are on the CC list for the bug.
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> >
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
* Re: [Bug 30882] Automatic process group scheduling causes crashes after a while
2011-03-16 15:44 ` Ian Kent
2011-03-16 16:48 ` Mehmet Giritli
@ 2011-03-16 17:47 ` Mehmet Giritli
2011-03-17 2:36 ` Ian Kent
1 sibling, 1 reply; 10+ messages in thread
From: Mehmet Giritli @ 2011-03-16 17:47 UTC (permalink / raw)
To: Ian Kent
Cc: Andrew Morton, linux-fsdevel, Nick Piggin, Al Viro,
bugzilla-daemon
Ian,
Well, here is a copy-paste of the crash that I am seeing *every time at
shutdown* now that I am running with your patch. It is pretty much
identical to the older ones... except that I now see it every time,
instead of just rare crashes...
Now I am going to revert your patch and check for changes in the
frequency of the crashes...
Mar 16 18:49:27 mordor kernel: [ 5696.670114] BUG: Dentry
ffff88015007f300{i=2,n=donkey} still in use (1) [unmount of nfs 0:f]
Mar 16 18:49:27 mordor kernel: [ 5696.670134] ------------[ cut
here ]------------
Mar 16 18:49:27 mordor kernel: [ 5696.670187] kernel BUG at
fs/dcache.c:943!
Mar 16 18:49:27 mordor kernel: [ 5696.670237] invalid opcode: 0000 [#1]
SMP
Mar 16 18:49:27 mordor kernel: [ 5696.670369] last sysfs
file: /sys/devices/platform/it87.552/pwm1_enable
Mar 16 18:49:27 mordor kernel: [ 5696.670421] CPU 2
Mar 16 18:49:27 mordor kernel: [ 5696.670466] Modules linked in: ipt_LOG
xt_tcpudp xt_state xt_mac iptable_filter iptable_nat nf_nat
nf_conntrack_ipv4 nf_defrag_ipv4 iptable_mangle xt_multiport xt_mark
xt_conntrack xt_connmark nf_conntr$
Mar 16 18:49:27 mordor kernel: [ 5696.671003]
Mar 16 18:49:27 mordor kernel: [ 5696.671003] Pid: 21015, comm:
umount.nfs Tainted: P 2.6.38-gentoo #3 Gigabyte Technology
Co., Ltd. GA-790FXTA-UD5/GA-790FXTA-UD5
Mar 16 18:49:27 mordor kernel: [ 5696.671003] RIP:
0010:[<ffffffff810e95c8>] [<ffffffff810e95c8>]
shrink_dcache_for_umount_subtree+0x268/0x270
Mar 16 18:49:27 mordor kernel: [ 5696.671003] RSP: 0018:ffff880219903e08
EFLAGS: 00010296
Mar 16 18:49:27 mordor kernel: [ 5696.671003] RAX: 0000000000000066 RBX:
ffff88015007f300 RCX: 000000000003ffff
Mar 16 18:49:27 mordor kernel: [ 5696.671003] RDX: ffffffff81623888 RSI:
0000000000000046 RDI: ffffffff817509f8
Mar 16 18:49:27 mordor kernel: [ 5696.671003] RBP: ffff88023a1480c0 R08:
000000000001fa7c R09: 0000000000000006
Mar 16 18:49:27 mordor kernel: [ 5696.671003] R10: 0000000000000000 R11:
0000000000000000 R12: ffff88015007f3a0
Mar 16 18:49:27 mordor kernel: [ 5696.671003] R13: ffff88023a14811c R14:
ffff880219903f18 R15: ffff8802107d9640
Mar 16 18:49:27 mordor kernel: [ 5696.671003] FS:
00007ff9ec925700(0000) GS:ffff8800bfa80000(0000) knlGS:0000000000000000
Mar 16 18:49:27 mordor kernel: [ 5696.671003] CS: 0010 DS: 0000 ES:
0000 CR0: 000000008005003b
Mar 16 18:49:27 mordor kernel: [ 5696.671003] CR2: 00007fd326cc2bc8 CR3:
0000000232272000 CR4: 00000000000006e0
Mar 16 18:49:27 mordor kernel: [ 5696.671003] DR0: 0000000000000000 DR1:
0000000000000000 DR2: 0000000000000000
Mar 16 18:49:27 mordor kernel: [ 5696.671003] DR3: 0000000000000000 DR6:
00000000ffff0ff0 DR7: 0000000000000400
Mar 16 18:49:27 mordor kernel: [ 5696.671003] Process umount.nfs (pid:
21015, threadinfo ffff880219902000, task ffff88023ca16780)
Mar 16 18:49:27 mordor kernel: [ 5696.671003] Stack:
Mar 16 18:49:27 mordor kernel: [ 5696.671003] ffff880219af1650
0000000000000000 ffff88023fc07128 ffff880219af1400
Mar 16 18:49:27 mordor kernel: [ 5696.671003] ffff88023a1486c0
ffff880219903f28 ffff88023a03a180 ffffffff810e9624
Mar 16 18:49:27 mordor kernel: [ 5696.671003] ffff88023f4f2480
ffff880219af1400 ffffffff81471460 ffffffff810d5ce5
Mar 16 18:49:27 mordor kernel: [ 5696.671003] Call Trace:
Mar 16 18:49:27 mordor kernel: [ 5696.671003] [<ffffffff810e9624>] ?
shrink_dcache_for_umount+0x54/0x60
Mar 16 18:49:27 mordor kernel: [ 5696.671003] [<ffffffff810d5ce5>] ?
generic_shutdown_super+0x25/0x100
Mar 16 18:49:27 mordor kernel: [ 5696.671003] [<ffffffff810d5e49>] ?
kill_anon_super+0x9/0x40
Mar 16 18:49:27 mordor kernel: [ 5696.671003] [<ffffffff81179acd>] ?
nfs_kill_super+0xd/0x20
Mar 16 18:49:27 mordor kernel: [ 5696.671003] [<ffffffff810d5ee3>] ?
deactivate_locked_super+0x43/0x70
Mar 16 18:49:27 mordor kernel: [ 5696.671003] [<ffffffff810ef4d0>] ?
release_mounts+0x70/0x90
Mar 16 18:49:27 mordor kernel: [ 5696.671003] [<ffffffff810efa44>] ?
sys_umount+0x314/0x3d0
Mar 16 18:49:27 mordor kernel: [ 5696.671003] [<ffffffff8100243b>] ?
system_call_fastpath+0x16/0x1b
Mar 16 18:49:27 mordor kernel: [ 5696.671003] Code: 8b 0a 31 d2 48 85 f6
74 07 48 8b 96 a8 00 00 00 48 05 50 02 00 00 48 89 de 48 c7 c7 e0 7f 52
81 48 89 04 24 31 c0 e8 f1 f0 35 00 <0f> 0b eb fe 0f 0b eb fe 55 53 48
89 fb 48 8d 7f 68 48$
Mar 16 18:49:27 mordor kernel: [ 5696.678048] RIP [<ffffffff810e95c8>]
shrink_dcache_for_umount_subtree+0x268/0x270
Mar 16 18:49:27 mordor kernel: [ 5696.678048] RSP <ffff880219903e08>
Mar 16 18:49:27 mordor kernel: [ 5696.678552] ---[ end trace
24269d237584cd43 ]---
Mar 16 18:49:28 mordor mountd[4362]: Caught signal 15, un-registering
and exiting.
Mar 16 18:49:28 mordor kernel: [ 5697.256066] nfsd: last server has
exited, flushing export cache
Mar 16 18:49:28 mordor kernel: [ 5697.272903] nfsd: last server has
exited, flushing export cache
* Re: [Bug 30882] Automatic process group scheduling causes crashes after a while
2011-03-16 17:47 ` Mehmet Giritli
@ 2011-03-17 2:36 ` Ian Kent
0 siblings, 0 replies; 10+ messages in thread
From: Ian Kent @ 2011-03-17 2:36 UTC (permalink / raw)
To: mgiritli
Cc: Andrew Morton, linux-fsdevel, Nick Piggin, Al Viro,
bugzilla-daemon
On Wed, 2011-03-16 at 19:47 +0200, Mehmet Giritli wrote:
> Well, here is a copy-paste of the crash that I am seeing *every time at
> shutdown* now that I am running with your patch. It is pretty much
> identical to the older ones... except that I see it every time,
> instead of just rare crashes...
That's odd because, with the maps you're using, the patch changes code
that's never executed.
Thread overview: 10+ messages
[not found] <bug-30882-27@https.bugzilla.kernel.org/>
[not found] ` <201103152102.p2FL2NN8006070@demeter2.kernel.org>
2011-03-15 21:24 ` [Bug 30882] Automatic process group scheduling causes crashes after a while Andrew Morton
2011-03-15 23:54 ` Mehmet Giritli
2011-03-16 2:32 ` Ian Kent
2011-03-16 14:27 ` Mehmet Giritli
2011-03-16 15:21 ` Ian Kent
2011-03-16 15:29 ` Mehmet Giritli
2011-03-16 15:44 ` Ian Kent
2011-03-16 16:48 ` Mehmet Giritli
2011-03-16 17:47 ` Mehmet Giritli
2011-03-17 2:36 ` Ian Kent