From: john stultz <johnstul@us.ibm.com>
To: Fernando Lopez-Lezcano <nando@ccrma.Stanford.EDU>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	LKML <linux-kernel@vger.kernel.org>,
	rt-users <linux-rt-users@vger.kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Nick Piggin <npiggin@suse.de>
Subject: Re: 2.6.33.5 rt23: machine lockup (nfs/autofs related?)
Date: Mon, 12 Jul 2010 16:53:54 -0700
Message-ID: <1278978834.2404.12.camel@localhost.localdomain>
In-Reply-To: <1278977858.6489.52.camel@localhost.localdomain>

On Mon, 2010-07-12 at 16:37 -0700, Fernando Lopez-Lezcano wrote:
> On Fri, 2010-07-09 at 15:57 -0700, john stultz wrote:
> > So looking over it, I'm not easily seeing what else could be off.
> >
> > So let's see if we can cut some of the guesswork out of this...
> >
> > > [<c04e08e9>] ? d_materialise_unique+0xbf/0x29e
> >
> > I'm curious exactly where that is in d_materialise_unique. To find out,
> > can you find the vmlinux image in the base of the directory you built
> > the kernel you triggered this in?
> >
> > Then run:
> > # gdb ./vmlinux
> >
> > Once gdb loads:
> > (gdb) list *0xc04e08e9
> >
> > That should point to exactly where in the function we are trying to
> > acquire a previously locked lock.
>
> Finally... I did a local build in my desktop machine so I now have
> access to the full patched/compiled source tree. I confirmed that the
> patch you sent is there (moving a spin_lock one line down).
>
> This is from a different kernel (non-PAE) so the exact address is
> different from the previous report:
>
> (gdb) list *0xc04d82dd
> 0xc04d82dd is in d_materialise_unique (fs/dcache.c:2100).
> 2095		spin_lock(&aparent->d_lock);
> 2096		spin_lock(&dparent->d_lock);
> 2097		spin_lock(&dentry->d_lock);
> 2098		spin_lock(&anon->d_lock);
> 2099
> 2100		dentry->d_parent = (aparent == anon) ? dentry : aparent;
> 2101		list_del(&dentry->d_u.d_child);
> 2102		if (!IS_ROOT(dentry))
> 2103			list_add(&dentry->d_u.d_child, &dentry->d_parent->d_subdirs);
> 2104		else
>
> See below for the full dump of the BUG through the serial console in
> this particular occurrence.

Huh. I'm still baffled. Since we're blowing up on line 2098, the anon
pointer is the alias pointer we passed in to __d_materialise_dentry().
So that means the anon dentry is already locked, even though we've moved
the obviously wrong lock operation down so that it shouldn't be held.

Hrm. Ok... I think line 2100 above gives us a hint: (aparent == anon).
If that were the case, we would have already locked anon via aparent on
line 2095, and that would explain the blowup.

How does it do with the following change?

thanks
-john

diff --git a/fs/dcache.c b/fs/dcache.c
index c9d21ae..8d68504 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -2099,7 +2099,8 @@ static void __d_materialise_dentry(struct dentry *dentry, struct dentry *anon)
 	aparent = anon->d_parent;
 	/* XXX: hack */
-	spin_lock(&aparent->d_lock);
+	if (aparent != anon)
+		spin_lock(&aparent->d_lock);
 	spin_lock(&dparent->d_lock);
 	spin_lock(&dentry->d_lock);
 	spin_lock(&anon->d_lock);
@@ -2121,7 +2122,8 @@ static void __d_materialise_dentry(struct dentry *dentry, struct dentry *anon)
 	spin_unlock(&anon->d_lock);
 	spin_unlock(&dentry->d_lock);
 	spin_unlock(&dparent->d_lock);
-	spin_unlock(&aparent->d_lock);
+	if (aparent != anon)
+		spin_unlock(&aparent->d_lock);
 	anon->d_flags &= ~DCACHE_DISCONNECTED;
 }
@@ -2159,8 +2161,8 @@ struct dentry *d_materialise_unique(struct dentry *dentry, struct inode *inode)
 			/* Is this an anonymous mountpoint that we could splice
 			 * into our tree? */
 			if (IS_ROOT(alias)) {
-				spin_lock(&alias->d_lock);
 				__d_materialise_dentry(dentry, alias);
+				spin_lock(&alias->d_lock);
 				__d_drop(alias);
 				goto found;
 			}
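
For readers following along, the self-deadlock that the (aparent != anon)
guard avoids can be reproduced in userspace. This is only a rough sketch
under stated assumptions, not kernel code: struct node, node_init(),
lock_pair_unguarded() and lock_pair_guarded() are hypothetical names, and
POSIX error-checking mutexes stand in for d_lock so that a second lock
attempt by the same thread returns EDEADLK instead of hanging. A root node
points to itself as its own parent, which is the situation suspected for
anon here.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dentry: just a lock and a parent. */
struct node {
	pthread_mutex_t lock;
	struct node *parent;	/* a root node is its own parent */
};

/* Initialise a node; a NULL parent makes the node a root (parent == self). */
static void node_init(struct node *n, struct node *parent)
{
	pthread_mutexattr_t attr;

	pthread_mutexattr_init(&attr);
	/* Error-checking mutexes report a relock by the owner as EDEADLK. */
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	pthread_mutex_init(&n->lock, &attr);
	pthread_mutexattr_destroy(&attr);
	n->parent = parent ? parent : n;
}

/* Lock parent, then node, with no check for parent == node. */
static int lock_pair_unguarded(struct node *n)
{
	struct node *p = n->parent;
	int err;

	err = pthread_mutex_lock(&p->lock);
	if (err)
		return err;
	err = pthread_mutex_lock(&n->lock);	/* same mutex if p == n */
	if (err) {
		pthread_mutex_unlock(&p->lock);
		return err;
	}
	pthread_mutex_unlock(&n->lock);
	pthread_mutex_unlock(&p->lock);
	return 0;
}

/* Same ordering, but skip the parent lock when the node is its own parent. */
static int lock_pair_guarded(struct node *n)
{
	struct node *p = n->parent;
	int err;

	if (p != n) {
		err = pthread_mutex_lock(&p->lock);
		if (err)
			return err;
	}
	err = pthread_mutex_lock(&n->lock);
	if (err) {
		if (p != n)
			pthread_mutex_unlock(&p->lock);
		return err;
	}
	pthread_mutex_unlock(&n->lock);
	if (p != n)
		pthread_mutex_unlock(&p->lock);
	return 0;
}
```

Calling lock_pair_unguarded() on a root node fails with EDEADLK on the
second lock, while lock_pair_guarded() succeeds for both root and non-root
nodes. On older C libraries this may need to be built with -pthread.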