From: "Mehmet Giritli" <mgiritli@giritli.eu>
To: Ian Kent <raven@themaw.net>
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-fsdevel@vger.kernel.org, Nick Piggin <npiggin@kernel.dk>,
Al Viro <viro@zeniv.linux.org.uk>,
bugzilla-daemon@bugzilla.kernel.org,
Mehmet Giritli <mehmet@giritli.eu>
Subject: Re: [Bug 30882] Automatic process group scheduling causes crashes after a while
Date: Wed, 16 Mar 2011 18:48:58 +0200
Message-ID: <1300294138.16850.10.camel@mordor>
In-Reply-To: <1300290279.2750.153.camel@perseus>
On Wed, 2011-03-16 at 23:44 +0800, Ian Kent wrote:
> On Wed, 2011-03-16 at 17:29 +0200, Mehmet Giritli wrote:
> > On Wed, 2011-03-16 at 23:21 +0800, Ian Kent wrote:
> > > On Wed, 2011-03-16 at 16:27 +0200, Mehmet Giritli wrote:
> > > > Ian,
> > > >
> > > > I am having much more frequent crashes now. I haven't been able to
> > > > cleanly reboot my machine yet and I have tried three times so far. Init
> > > > scripts fail to unmount the file systems and I have to reboot manually.
> > >
> > > What do your autofs maps look like?
> > >
> > >
> >
> > Here is the contents of my auto.misc:
> >
> > gollum-media -rsize=8192,wsize=8192,soft,timeo=10,rw gollum.giritli.eu:/mnt/media
> > gollum-distfiles -rsize=8192,wsize=8192,soft,timeo=10,rw gollum.giritli.eu:/usr/portage/distfiles
> > gollum-www -rsize=8192,wsize=8192,soft,timeo=10,rw gollum.giritli.eu:/var/www
> > gollum-WebDav -rsize=8192,wsize=8192,soft,timeo=10,rw gollum.giritli.eu:/var/dav
>
> What, that's it, and you're only using "/misc /etc/auto.misc" in the
> master map and you're having problems?
yes
>
> Are the crashes always the same?
identical
> How have you established that the BUG()s are in fact due to automount
> umounting mounts and that the BUG()s correspond to NFS mounts previously
> mounted by autofs?
I haven't established anything. However, that's the only way I mount NFS,
and my file manager hangs and init scripts hang when trying to unmount...
> Is there any noise at all in the syslog?
nothing unusual
> Are you sure you're using a kernel with the dentry leak patch?
yes
> What sort of automounting load is happening on the machine, i.e.
> frequency of mounts and umounts, and what timeout are you using?
from auto.master:
/mnt/autofs /etc/auto.misc --timeout=300 --ghost
Not very much. Let's say 2-3 times every hour for each mount point.
> The dentry leak patch got rid of the BUG()s I was seeing but by that
> time I did have a couple of other patches. I still don't think the other
> patches made much difference for this particular case.
>
> >
> > > >
> > > > On Wed, 2011-03-16 at 10:32 +0800, Ian Kent wrote:
> > > > > On Wed, 2011-03-16 at 01:54 +0200, Mehmet Giritli wrote:
> > > > > > The missing piece is as follows:
> > > > > >
> > > > > > Mar 15 22:37:38 mordor kernel: [ 1860.156114] BUG: Dentry
> > > > > > ffff88023f96e600{i=25f56f,n=} still in use (1) [unmount of nfs 0:f]
> > > > >
> > > > > This might be the same problem I saw and described in rc1.
> > > > > However, for me the fs in the BUG() report was autofs.
> > > > > Hopefully that just means my autofs setup is different.
> > > > >
> > > > > At the moment I believe a dentry leak Al Viro spotted is the cause.
> > > > > Please try this patch.
> > > > >
> > > > > autofs4 - fix dentry leak in autofs4_expire_direct()
> > > > >
> > > > > From: Ian Kent <raven@themaw.net>
> > > > >
> > > > > There is a missing dput() when returning from autofs4_expire_direct()
> > > > > when we see that the dentry is already a pending mount.
> > > > >
> > > > > Signed-off-by: Ian Kent <raven@themaw.net>
> > > > > ---
> > > > >
> > > > > fs/autofs4/expire.c | 7 +++----
> > > > > 1 files changed, 3 insertions(+), 4 deletions(-)
> > > > >
> > > > >
> > > > > diff --git a/fs/autofs4/expire.c b/fs/autofs4/expire.c
> > > > > index c896dd6..c403abc 100644
> > > > > --- a/fs/autofs4/expire.c
> > > > > +++ b/fs/autofs4/expire.c
> > > > > @@ -290,10 +290,8 @@ struct dentry *autofs4_expire_direct(struct super_block *sb,
> > > > > >          spin_lock(&sbi->fs_lock);
> > > > > >          ino = autofs4_dentry_ino(root);
> > > > > >          /* No point expiring a pending mount */
> > > > > > -        if (ino->flags & AUTOFS_INF_PENDING) {
> > > > > > -                spin_unlock(&sbi->fs_lock);
> > > > > > -                return NULL;
> > > > > > -        }
> > > > > > +        if (ino->flags & AUTOFS_INF_PENDING)
> > > > > > +                goto out;
> > > > > >          if (!autofs4_direct_busy(mnt, root, timeout, do_now)) {
> > > > > >                  struct autofs_info *ino = autofs4_dentry_ino(root);
> > > > > >                  ino->flags |= AUTOFS_INF_EXPIRING;
> > > > > > @@ -301,6 +299,7 @@ struct dentry *autofs4_expire_direct(struct super_block *sb,
> > > > > >                  spin_unlock(&sbi->fs_lock);
> > > > > >                  return root;
> > > > > >          }
> > > > > > +out:
> > > > > >          spin_unlock(&sbi->fs_lock);
> > > > > >          dput(root);
> > > > >
> > > > >
> > > > > >
> > > > > > (sorry for the inconvenience Andrew)
> > > > > >
> > > > > > On Tue, 2011-03-15 at 14:24 -0700, Andrew Morton wrote:
> > > > > > > (switched to email. Please respond via emailed reply-to-all, not via the
> > > > > > > bugzilla web interface).
> > > > > > >
> > > > > > > Seems that we have a nasty involving autofs, nfs and the VFS.
> > > > > > >
> > > > > > > Mehmet, the kernel should have printed some diagnostics prior to doing
> > > > > > > the BUG() call:
> > > > > > >
> > > > > > > > if (dentry->d_count != 0) {
> > > > > > > >         printk(KERN_ERR
> > > > > > > >                "BUG: Dentry %p{i=%lx,n=%s}"
> > > > > > > >                " still in use (%d)"
> > > > > > > >                " [unmount of %s %s]\n",
> > > > > > > >                dentry,
> > > > > > > >                dentry->d_inode ?
> > > > > > > >                dentry->d_inode->i_ino : 0UL,
> > > > > > > >                dentry->d_name.name,
> > > > > > > >                dentry->d_count,
> > > > > > > >                dentry->d_sb->s_type->name,
> > > > > > > >                dentry->d_sb->s_id);
> > > > > > > >         BUG();
> > > > > > > > }
> > > > > > >
> > > > > > > > Please find those in the log and email them to us - someone might find
> > > > > > > > them useful.
> > > > > > >
> > > > > > >
> > > > > > > On Tue, 15 Mar 2011 21:02:23 GMT
> > > > > > > bugzilla-daemon@bugzilla.kernel.org wrote:
> > > > > > >
> > > > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=30882
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --- Comment #4 from Mehmet Giritli <mehmet@giritli.eu> 2011-03-15 21:02:22 ---
> > > > > > > > Here is that crash happening again, the system was NOT running overclocked or
> > > > > > > > anything...
> > > > > > > >
> > > > > > > > [ 1860.156122] ------------[ cut here ]------------
> > > > > > > > [ 1860.156124] kernel BUG at fs/dcache.c:943!
> > > > > > > > [ 1860.156126] invalid opcode: 0000 [#1] SMP
> > > > > > > > [ 1860.156127] last sysfs file: /sys/devices/platform/it87.552/fan3_input
> > > > > > > > [ 1860.156128] CPU 3
> > > > > > > > [ 1860.156129] Modules linked in: iptable_mangle iptable_nat nf_nat ipt_LOG
> > > > > > > > xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_state xt_mac iptable_filter
> > > > > > > > xt_multiport xt_mark xt_conntrack xt_connmark nf_conntrack ip_tables x_tables
> > > > > > > > nvidia(P)
> > > > > > > > [ 1860.156137]
> > > > > > > > [ 1860.156139] Pid: 7388, comm: umount.nfs Tainted: P 2.6.38-rc8 #9
> > > > > > > > Gigabyte Technology Co., Ltd. GA-790FXTA-UD5/GA-790FXTA-UD5
> > > > > > > > [ 1860.156142] RIP: 0010:[<ffffffff810e9648>] [<ffffffff810e9648>]
> > > > > > > > shrink_dcache_for_umount_subtree+0x268/0x270
> > > > > > > > [ 1860.156147] RSP: 0018:ffff8800be82fe08 EFLAGS: 00010296
> > > > > > > > [ 1860.156149] RAX: 0000000000000065 RBX: ffff88023f96e600 RCX:
> > > > > > > > 000000000003ffff
> > > > > > > > [ 1860.156150] RDX: ffffffff8161f888 RSI: 0000000000000046 RDI:
> > > > > > > > ffffffff8174c9f8
> > > > > > > > [ 1860.156151] RBP: ffff88023f96e600 R08: 0000000000012c37 R09:
> > > > > > > > 0000000000000006
> > > > > > > > [ 1860.156152] R10: 0000000000000000 R11: 0000000000000000 R12:
> > > > > > > > ffff88023a07f5e0
> > > > > > > > [ 1860.156154] R13: ffff88023f96e65c R14: ffff8800be82ff18 R15:
> > > > > > > > ffff880211d38740
> > > > > > > > [ 1860.156155] FS: 00007f3428cb2700(0000) GS:ffff8800bfac0000(0000)
> > > > > > > > knlGS:00000000f74186c0
> > > > > > > > [ 1860.156156] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > > > > > > > [ 1860.156157] CR2: 00007f7c97da1000 CR3: 00000000bea08000 CR4:
> > > > > > > > 00000000000006e0
> > > > > > > > [ 1860.156159] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> > > > > > > > 0000000000000000
> > > > > > > > [ 1860.156160] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> > > > > > > > 0000000000000400
> > > > > > > > [ 1860.156161] Process umount.nfs (pid: 7388, threadinfo ffff8800be82e000, task
> > > > > > > > ffff880211fd5640)
> > > > > > > > [ 1860.156162] Stack:
> > > > > > > > [ 1860.156163] ffff88020c05ce50 0000000000000000 ffff88023fc07128
> > > > > > > > ffff88020c05cc00
> > > > > > > > [ 1860.156165] ffff88023f96e6c0 ffff8800be82ff28 ffff88023f96e300
> > > > > > > > ffffffff810e96a4
> > > > > > > > [ 1860.156167] ffff88023f49f480 ffff88020c05cc00 ffffffff8146d4a0
> > > > > > > > ffffffff810d5d15
> > > > > > > > [ 1860.156169] Call Trace:
> > > > > > > > [ 1860.156172] [<ffffffff810e96a4>] ? shrink_dcache_for_umount+0x54/0x60
> > > > > > > > [ 1860.156174] [<ffffffff810d5d15>] ? generic_shutdown_super+0x25/0x100
> > > > > > > > [ 1860.156176] [<ffffffff810d5e79>] ? kill_anon_super+0x9/0x40
> > > > > > > > [ 1860.156179] [<ffffffff81179aed>] ? nfs_kill_super+0xd/0x20
> > > > > > > > [ 1860.156181] [<ffffffff810d5f13>] ? deactivate_locked_super+0x43/0x70
> > > > > > > > [ 1860.156183] [<ffffffff810ef4d8>] ? release_mounts+0x68/0x90
> > > > > > > > [ 1860.156185] [<ffffffff810efa54>] ? sys_umount+0x314/0x3d0
> > > > > > > > [ 1860.156187] [<ffffffff8100243b>] ? system_call_fastpath+0x16/0x1b
> > > > > > > > [ 1860.156188] Code: 8b 0a 31 d2 48 85 f6 74 07 48 8b 96 a8 00 00 00 48 05 50
> > > > > > > > 02 00 00 48 89 de 48 c7 c7 40 3a 52 81 48 89 04 24 31 c0 e8 a1 bc 35 00 <0f> 0b
> > > > > > > > eb fe 0f 0b eb fe 55 53 48 89 fb 48 8d 7f 68 48 83 ec 08
> > > > > > > > [ 1860.156201] RIP [<ffffffff810e9648>]
> > > > > > > > shrink_dcache_for_umount_subtree+0x268/0x270
> > > > > > > > [ 1860.156204] RSP <ffff8800be82fe08>
> > > > > > > > [ 1860.156205] ---[ end trace ee03486c16c108a7 ]---
> > > > > > > >
> > > > > > > > --
> > > > > > > > Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
> > > > > > > > ------- You are receiving this mail because: -------
> > > > > > > > You are on the CC list for the bug.
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> >
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
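
For readers following the quoted patch, here is a minimal userspace sketch
of the kind of reference leak it describes. Everything in it is hypothetical:
struct obj, obj_get() and obj_put() are stand-ins for a dentry, dget() and
dput(), and the sketch assumes the function takes a reference on entry, which
the trailing dput(root) in the quoted diff suggests but does not show. It is
not the autofs4 code, only the shape of the early-return bug and the
goto-based fix.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted object; NOT the kernel's struct dentry. */
struct obj {
    int count;   /* reference count */
    int pending; /* stand-in for the AUTOFS_INF_PENDING flag */
    int busy;    /* stand-in for the result of autofs4_direct_busy() */
};

/* Take a reference (plays the role of dget() in this sketch). */
static struct obj *obj_get(struct obj *o)
{
    o->count++;
    return o;
}

/* Drop a reference (plays the role of dput() in this sketch). */
static void obj_put(struct obj *o)
{
    assert(o->count > 0);
    o->count--;
}

/* Pre-patch shape: the early return skips the put, so one reference leaks. */
static struct obj *expire_leaky(struct obj *o)
{
    struct obj *root = obj_get(o);

    if (root->pending)
        return NULL;    /* leak: the reference taken above is never dropped */
    if (!root->busy)
        return root;    /* success: the caller now owns the reference */
    obj_put(root);
    return NULL;
}

/* Post-patch shape: every failure path funnels through one cleanup label. */
static struct obj *expire_fixed(struct obj *o)
{
    struct obj *root = obj_get(o);

    if (root->pending)
        goto out;
    if (!root->busy)
        return root;    /* success: the caller now owns the reference */
out:
    obj_put(root);
    return NULL;
}
```

Calling expire_leaky() on a pending object that starts with a count of 1
leaves the count at 2, while expire_fixed() leaves it at 1. A stray extra
reference of that kind is the sort of thing the "still in use (1)" check at
unmount time is designed to catch, although whether this particular leak
explains the NFS BUG() above has not been established in this thread.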