From: "J. Bruce Fields" <bfields@fieldses.org>
To: Jeff Layton <jlayton@redhat.com>
Cc: neilb@suse.de, linux-nfs@vger.kernel.org
Subject: Re: [PATCH 2/2] NLM: break out of nlmsvc_retry_blocked if lockd is coming down
Date: Mon, 28 Jan 2008 20:23:51 -0500 [thread overview]
Message-ID: <20080129012351.GE16785@fieldses.org> (raw)
In-Reply-To: <20080128201210.376097e6-RtJpwOs3+0O+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
On Mon, Jan 28, 2008 at 08:12:10PM -0500, Jeff Layton wrote:
> On Mon, 28 Jan 2008 19:05:17 -0500
> "J. Bruce Fields" <bfields@fieldses.org> wrote:
>
> > On Mon, Jan 28, 2008 at 02:29:11PM -0500, Jeff Layton wrote:
> > > It's possible to end up continually looping in
> > > nlmsvc_retry_blocked() even when the lockd kthread has been
> > > requested to come down. I've been able to reproduce this fairly
> > > easily by having an RPC grant callback queued up to an RPC client
> > > that's not bound. If the other host is unresponsive, then the block
> > > never gets taken off the list, and lockd continually respawns new
> > > RPC's.
> > >
> > > If lockd is going down anyway, we don't want to keep retrying the
> > > blocks, so allow it to break out of this loop. Additionally, in that
> > > situation kthread_stop will have already woken lockd, so when
> > > nlmsvc_retry_blocked returns, have lockd check for kthread_stop() so
> > > that it doesn't end up calling svc_recv and waiting for a long time.
> >
> > Stupid question: how is this different from the kthread_stop() coming
> > just after this kthread_should_stop() call but before we call
> > svc_recv()? What keeps us from waiting in svc_recv() indefinitely
> > after kthread_stop()?
> >
>
> Nothing at all, unfortunately :-(
>
> Since we've taken signalling out of the lockd_down codepath, this gets
> a lot trickier than it used to be. I'm starting to wonder if we should
> go back to sending a signal on lockd_down.
OK, thanks. So do I keep these patches, or not? This sounds like a
regression (if perhaps a very minor one--I'm not quite clear). Help!
--b.
>
> > > This patch prevents this continual looping and allows lockd to
> > > eventually come down, but this can quite some time since lockd
> > > won't check the condition in the nlmsvc_retry_blocked while loop
> > > until nlmsvc_grant_blocked returns. That won't happen until
> > > all of the portmap requests time out.
> >
> > So, shutdown problems aside, does that mean an unresponsive client can
> > cause lock to stop serving lock requests in normal operation?
> >
>
> Possibly.
>
> I think that in general what happens is that blocks eventually get
> reinserted into the list with an expire time that is well after the
> current jiffies and we eventually hit this condition:
>
> if (time_after(block->b_when,jiffies)) {
> timeout = block->b_when - jiffies;
> break;
> }
>
> ...but I was seeing lockd loop in this loop over several iterations
> without returning back up to the main lockd loop. I think what was
> happening was that we were inserting the block with a later jiffies in
> nlmsvc_grant_blocked, but the rpcbind attempts would end up taking
> longer than that timeout. So when we returned from
> nlmsvc_grant_blocked() that condition would no longer fire, and we'd end up retrying the grant callback again.
>
> As to why we don't see this more often, I'm not sure. I probably need
> to experiment a bit more with it to make sure I'm understanding this
> correctly.
>
> Let me do that and get back to you...
next prev parent reply other threads:[~2008-01-29 1:23 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-01-28 19:29 [PATCH 0/2] NLM: fix use after free in lockd Jeff Layton
2008-01-28 19:29 ` [PATCH 1/2] NLM: tear down RPC clients in nlm_shutdown_hosts Jeff Layton
2008-01-28 19:29 ` [PATCH 2/2] NLM: break out of nlmsvc_retry_blocked if lockd is coming down Jeff Layton
2008-01-29 0:05 ` J. Bruce Fields
2008-01-29 1:12 ` Jeff Layton
[not found] ` <20080128201210.376097e6-RtJpwOs3+0O+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
2008-01-29 1:23 ` J. Bruce Fields [this message]
2008-01-29 2:29 ` Jeff Layton
[not found] ` <20080128212942.77035005-PC62bkCOHzGdMjc06nkz3ljfA9RmPOcC@public.gmane.org>
2008-01-29 3:02 ` J. Bruce Fields
2008-01-29 11:23 ` Jeff Layton
[not found] ` <20080129062312.33ba5a54-RtJpwOs3+0O+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
2008-01-29 12:32 ` Jeff Layton
[not found] ` <20080129073241.7c5c2ef1-RtJpwOs3+0O+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
2008-01-29 14:41 ` J. Bruce Fields
2008-01-28 23:18 ` [PATCH 1/2] NLM: tear down RPC clients in nlm_shutdown_hosts J. Bruce Fields
2008-01-29 0:04 ` Jeff Layton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080129012351.GE16785@fieldses.org \
--to=bfields@fieldses.org \
--cc=jlayton@redhat.com \
--cc=linux-nfs@vger.kernel.org \
--cc=neilb@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox