From: Seth Forshee <seth.forshee@canonical.com>
To: Jeff Layton <jlayton@redhat.com>
Cc: Trond Myklebust <trond.myklebust@primarydata.com>,
    Anna Schumaker <anna.schumaker@netapp.com>,
    linux-fsdevel@vger.kernel.org, linux-nfs@vger.kernel.org,
    linux-kernel@vger.kernel.org,
    Tycho Andersen <tycho.andersen@canonical.com>
Subject: Re: Hang due to nfs letting tasks freeze with locked inodes
Date: Wed, 6 Jul 2016 22:55:34 -0500
Message-ID: <20160707035534.GF45215@ubuntu-hedt>
In-Reply-To: <1467842838.2908.45.camel@redhat.com>

On Wed, Jul 06, 2016 at 06:07:18PM -0400, Jeff Layton wrote:
> On Wed, 2016-07-06 at 12:46 -0500, Seth Forshee wrote:
> > We're seeing a hang when freezing a container with an nfs bind mount while
> > running iozone. Two iozone processes were hung with this stack trace.
> >
> > [] schedule+0x35/0x80
> > [] schedule_preempt_disabled+0xe/0x10
> > [] __mutex_lock_slowpath+0xb9/0x130
> > [] mutex_lock+0x1f/0x30
> > [] do_unlinkat+0x12b/0x2d0
> > [] SyS_unlink+0x16/0x20
> > [] entry_SYSCALL_64_fastpath+0x16/0x71
> >
> > This seems to be due to another iozone thread frozen during unlink with
> > this stack trace:
> >
> > [] __refrigerator+0x7a/0x140
> > [] nfs4_handle_exception+0x118/0x130 [nfsv4]
> > [] nfs4_proc_remove+0x7d/0xf0 [nfsv4]
> > [] nfs_unlink+0x149/0x350 [nfs]
> > [] vfs_unlink+0xf1/0x1a0
> > [] do_unlinkat+0x279/0x2d0
> > [] SyS_unlink+0x16/0x20
> > [] entry_SYSCALL_64_fastpath+0x16/0x71
> >
> > Since nfs is allowing the thread to be frozen with the inode locked it's
> > preventing other threads trying to lock the same inode from freezing. It
> > seems like a bad idea for nfs to be doing this.
> >
>
> Yeah, known problem. Not a simple one to fix though.
>
> > Can nfs do something different here to prevent this? Maybe use a
> > non-freezable sleep and let the operation complete, or else abort the
> > operation and return ERESTARTSYS?
>
> The problem with letting the op complete is that often by the time you
> get to the point of trying to freeze processes, the network interfaces
> are already shut down. So the operation you're waiting on might never
> complete. Stuff like suspend operations on your laptop fail, leading to
> fun bug reports like: "Oh, my laptop burned to crisp inside my bag
> because the suspend never completed."
>
> You could (in principle) return something like -ERESTARTSYS iff the
> call has not yet been transmitted. If it has already been transmitted,
> then you might end up sending the call a second time (but not as an RPC
> retransmission of course). If that call was non-idempotent then you end
> up with all of _those_ sorts of problems.
>
> Also, -ERESTARTSYS is not quite right as it doesn't always cause the
> call to be restarted. It depends on the syscall. I think this would
> probably need some other sort of syscall-restart machinery plumbed in.

I don't really know much at all about how NFS works, so I hope you don't
mind indulging me in some questions.

What happens then if you suspend waiting for an op to complete and then
resume an hour later? Will it actually succeed or end up returning some
sort of "timed out" error?

If it's going to be an error (or even likely to be one) could the op
just be aborted immediately with an error code? It just seems like there
must be something better than potentially deadlocking the kernel.

Thanks,
Seth