From: Jeff Layton <jlayton@redhat.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Seth Forshee <seth.forshee@canonical.com>,
Trond Myklebust <trond.myklebust@primarydata.com>,
Anna Schumaker <anna.schumaker@netapp.com>,
linux-fsdevel@vger.kernel.org, linux-nfs@vger.kernel.org,
linux-kernel@vger.kernel.org,
Tycho Andersen <tycho.andersen@canonical.com>
Subject: Re: Hang due to nfs letting tasks freeze with locked inodes
Date: Fri, 08 Jul 2016 08:51:54 -0400
Message-ID: <1467982314.13822.5.camel@redhat.com>
In-Reply-To: <20160708122224.GA20200@dhcp22.suse.cz>
On Fri, 2016-07-08 at 14:22 +0200, Michal Hocko wrote:
> On Wed 06-07-16 18:07:18, Jeff Layton wrote:
> >
> > On Wed, 2016-07-06 at 12:46 -0500, Seth Forshee wrote:
> > >
> > > We're seeing a hang when freezing a container with an nfs bind mount while
> > > running iozone. Two iozone processes were hung with this stack trace.
> > >
> > > [] schedule+0x35/0x80
> > > [] schedule_preempt_disabled+0xe/0x10
> > > [] __mutex_lock_slowpath+0xb9/0x130
> > > [] mutex_lock+0x1f/0x30
> > > [] do_unlinkat+0x12b/0x2d0
> > > [] SyS_unlink+0x16/0x20
> > > [] entry_SYSCALL_64_fastpath+0x16/0x71
> > >
> > > This seems to be due to another iozone thread frozen during unlink with
> > > this stack trace:
> > >
> > > [] __refrigerator+0x7a/0x140
> > > [] nfs4_handle_exception+0x118/0x130 [nfsv4]
> > > [] nfs4_proc_remove+0x7d/0xf0 [nfsv4]
> > > [] nfs_unlink+0x149/0x350 [nfs]
> > > [] vfs_unlink+0xf1/0x1a0
> > > [] do_unlinkat+0x279/0x2d0
> > > [] SyS_unlink+0x16/0x20
> > > [] entry_SYSCALL_64_fastpath+0x16/0x71
> > >
> > > Since nfs is allowing the thread to be frozen with the inode locked it's
> > > preventing other threads trying to lock the same inode from freezing. It
> > > seems like a bad idea for nfs to be doing this.
> > >
> > Yeah, known problem. Not a simple one to fix though.
> Apart from alternative Dave was mentioning in other email, what is the
> point to use freezable wait from this path in the first place?
>
> nfs4_handle_exception does nfs4_wait_clnt_recover from the same path and
> that does wait_on_bit_action with TASK_KILLABLE so we are waiting in two
> different modes from the same path AFAICS. There do not seem to be other
> callers of nfs4_delay outside of nfs4_handle_exception. Sounds like
> something is not quite right here to me. If the nfs4_delay did regular
> wait then the freezing would fail as well but at least it would be clear
> who is the culprit rather than having an indirect dependency.

The codepaths involved there are a lot more complex than that,
unfortunately.

nfs4_delay is the function that we use to handle the case where the
server returns NFS4ERR_DELAY. That error basically tells us that the
server is too busy right now or has hit some transient error, and that
the client should retry after a small, sliding delay.

That codepath could probably be made more freezer-safe. The typical
case, however, is that we've sent a call and just haven't gotten a
reply. That's the trickier one to handle.
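
For illustration only, the "small, sliding delay" mentioned above can be
sketched in userspace C roughly as follows. This is not the kernel code:
the function name next_delay() and the bounds RETRY_MIN_MS and
RETRY_MAX_MS are hypothetical, and the sketch assumes the delay simply
doubles on each retry up to a cap. It also leaves out the actual wait,
which is where the freezable-vs-killable question in this thread comes in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bounds in milliseconds; assumptions, not kernel values. */
#define RETRY_MIN_MS 100L
#define RETRY_MAX_MS 15000L

/*
 * Return the delay to use for the current retry, then double the stored
 * value for the next retry. The stored value is clamped to
 * [RETRY_MIN_MS, RETRY_MAX_MS] before it is used.
 */
static long next_delay(long *timeout)
{
	long ret;

	assert(timeout != NULL);

	if (*timeout <= 0)
		*timeout = RETRY_MIN_MS;
	if (*timeout > RETRY_MAX_MS)
		*timeout = RETRY_MAX_MS;

	ret = *timeout;
	*timeout *= 2;
	return ret;
}
```

A caller would start with a timeout of 0 and call next_delay() each time
the server asks it to back off, sleeping for the returned interval before
resending the request.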
--
Jeff Layton <jlayton@redhat.com>