From: Dave Jones <davej@redhat.com>
To: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Linux Kernel <linux-kernel@vger.kernel.org>
Subject: Re: processes hung after sys_renameat, and 'missing' processes
Date: Wed, 6 Jun 2012 15:42:33 -0400
Message-ID: <20120606194233.GA1537@redhat.com>
In-Reply-To: <20120603232820.GQ30000@ZenIV.linux.org.uk>

On Mon, Jun 04, 2012 at 12:28:20AM +0100, Al Viro wrote:
> On Mon, Jun 04, 2012 at 12:17:09AM +0100, Al Viro wrote:
> >
> > > Also, sysrq-w is usually way more interesting than 't' when there are
> > > processes stuck on a mutex.
> > >
> > > Because yes, it looks like you have a boatload of trinity processes
> > > stuck on an inode mutex. Looks like every single one of them is in
> > > 'lock_rename()'. It *shouldn't* be an ABBA deadlock, since lockdep
> > > should have noticed that, but who knows.
> >
> > lock_rename() is a bit of a red herring here - they appear to be all
> > within-directory renames, so it's just a "trying to rename something
> > in a directory that has ->i_mutex held by something else".
> >
> > IOW, something else in there is holding ->i_mutex - something that
> > either hadn't been through lock_rename() at all or has already
> > passed through it and still hadn't got around to unlock_rename().
> > In either case, suspects won't have lock_rename() in the trace...
>
> Everything in lock_rename() appears to be at lock_rename+0x3e. Unless
> there's a really huge amount of filesystems on that box, this has to be
> 	mutex_lock_nested(&p1->d_inode->i_mutex, I_MUTEX_PARENT);
> and everything on that sucker is not holding any locks yet. IOW, that's
> the tail hanging off whatever deadlock is there.
>
> One possibility is that something has left the kernel without releasing
> i_mutex on some directory, which would make atomic_open patches the most
> obvious suspects.

Just hit this again on a different box, though this time the stack traces
of the stuck processes seem to vary between fchmod/fchown/getdents calls.

partial dmesg: http://fpaste.org/jBVM/
sysrq-w: http://fpaste.org/uYtj/
sysrq-d: http://fpaste.org/Xxur/

Does this give any new clues that the previous traces didn't?

	Dave