linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Jeff Layton <jlayton@redhat.com>
Cc: Seth Forshee <seth.forshee@canonical.com>,
	Trond Myklebust <trond.myklebust@primarydata.com>,
	Anna Schumaker <anna.schumaker@netapp.com>,
	linux-fsdevel@vger.kernel.org, linux-nfs@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	Tycho Andersen <tycho.andersen@canonical.com>
Subject: Re: Hang due to nfs letting tasks freeze with locked inodes
Date: Mon, 11 Jul 2016 13:43:35 +0200	[thread overview]
Message-ID: <20160711114335.GC1811@dhcp22.suse.cz> (raw)
In-Reply-To: <1468235011.2577.3.camel@redhat.com>

On Mon 11-07-16 07:03:31, Jeff Layton wrote:
> On Mon, 2016-07-11 at 09:23 +0200, Michal Hocko wrote:
> > On Fri 08-07-16 10:27:38, Jeff Layton wrote:
> > > On Fri, 2016-07-08 at 16:23 +0200, Michal Hocko wrote:
> > > > On Fri 08-07-16 08:51:54, Jeff Layton wrote:
> > > > > 
> > > > > On Fri, 2016-07-08 at 14:22 +0200, Michal Hocko wrote:
> > > > [...]
> > > > > 
> > > > > > 
> > > > > > Apart from alternative Dave was mentioning in other email, what
> > > > > > is the
> > > > > > point to use freezable wait from this path in the first place?
> > > > > > 
> > > > > > nfs4_handle_exception does nfs4_wait_clnt_recover from the same
> > > > > > path and
> > > > > > that does wait_on_bit_action with TASK_KILLABLE so we are waiting
> > > > > > in two
> > > > > > different modes from the same path AFAICS. There do not seem to
> > > > > > be other
> > > > > > callers of nfs4_delay outside of nfs4_handle_exception. Sounds
> > > > > > like
> > > > > > something is not quite right here to me. If the nfs4_delay did
> > > > > > regular
> > > > > > wait then the freezing would fail as well but at least it would
> > > > > > be clear
> > > > > > who is the culrprit rather than having an indirect dependency.
> > > > > The codepaths involved there are a lot more complex than that
> > > > > unfortunately.
> > > > > 
> > > > > nfs4_delay is the function that we use to handle the case where the
> > > > > server returns NFS4ERR_DELAY. Basically telling us that it's too
> > > > > busy
> > > > > right now or has some transient error and the client should retry
> > > > > after
> > > > > a small, sliding delay.
> > > > > 
> > > > > That codepath could probably be made more freezer-safe. The typical
> > > > > case however, is that we've sent a call and just haven't gotten a
> > > > > reply. That's the trickier one to handle.
> > > > Why using a regular non-freezable wait would be a problem?
> > > 
> > > It has been a while since I looked at that code, but IIRC, that could
> > > block the freezer for up to 15s, which is a significant portion of the
> > > 20s that you get before the freezer gives up.
> > 
> > But how does that differ from the situation when the freezer has to give
> > up on the timeout because another task fails due to lock dependency.
> > 
> > As Trond and Dave have written in other emails. It is really danngerous
> > to freeze a task while it is holding locks and other resources.
> 
> It's not really dangerous if you're freezing every task on the host.
> Sure, you're freezing with locks held, but everything else is freezing
> too, so nothing will be contending for those locks.

But the very same path is used also for cgroup freezer so you can end up
freezing a task while it holds locks which might block tasks from
unrelated cgroups, right?
 
> I'm not at all opposed to changing how all of that works. My only
> stipulation is that we not break the ability to reliably suspend a host
> that is actively using an NFS mount. If you can come up with a way to
> do that that also works for freezing cgroups, then I'm all for it.

My knowledge of NFS is too limited to help you out here but I guess it
would be a good start to stop using unsafe freezer APIs. Or use it only
when you are sure you cannot block any resources.

-- 
Michal Hocko
SUSE Labs

  reply	other threads:[~2016-07-11 11:43 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-06 17:46 Hang due to nfs letting tasks freeze with locked inodes Seth Forshee
2016-07-06 22:07 ` Jeff Layton
2016-07-07  3:55   ` Seth Forshee
2016-07-07 10:29     ` Jeff Layton
2016-07-07 23:53   ` Dave Chinner
2016-07-08 11:33     ` Jeff Layton
2016-07-08 12:48     ` Seth Forshee
2016-07-08 12:55       ` Trond Myklebust
2016-07-08 13:05         ` Trond Myklebust
2016-07-11  1:20           ` Dave Chinner
2016-07-08 12:22   ` Michal Hocko
2016-07-08 12:47     ` Seth Forshee
2016-07-08 12:51     ` Jeff Layton
2016-07-08 14:23       ` Michal Hocko
2016-07-08 14:27         ` Jeff Layton
2016-07-11  7:23           ` Michal Hocko
2016-07-11 11:03             ` Jeff Layton
2016-07-11 11:43               ` Michal Hocko [this message]
2016-07-11 12:50               ` Seth Forshee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160711114335.GC1811@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=anna.schumaker@netapp.com \
    --cc=jlayton@redhat.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=seth.forshee@canonical.com \
    --cc=trond.myklebust@primarydata.com \
    --cc=tycho.andersen@canonical.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).