From: "J. Bruce Fields" <bfields@fieldses.org>
To: Jeff Layton <jlayton@redhat.com>
Cc: neilb@suse.de, linux-nfs@vger.kernel.org
Subject: Re: [PATCH 2/2] NLM: break out of nlmsvc_retry_blocked if lockd is coming down
Date: Mon, 28 Jan 2008 20:23:51 -0500 [thread overview]
Message-ID: <20080129012351.GE16785@fieldses.org> (raw)
In-Reply-To: <20080128201210.376097e6-RtJpwOs3+0O+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
On Mon, Jan 28, 2008 at 08:12:10PM -0500, Jeff Layton wrote:
> On Mon, 28 Jan 2008 19:05:17 -0500
> "J. Bruce Fields" <bfields@fieldses.org> wrote:
>
> > On Mon, Jan 28, 2008 at 02:29:11PM -0500, Jeff Layton wrote:
> > > It's possible to end up continually looping in
> > > nlmsvc_retry_blocked() even when the lockd kthread has been
> > > requested to come down. I've been able to reproduce this fairly
> > > easily by having an RPC grant callback queued up to an RPC client
> > > that's not bound. If the other host is unresponsive, then the block
> > > never gets taken off the list, and lockd continually respawns new
> > > RPC's.
> > >
> > > If lockd is going down anyway, we don't want to keep retrying the
> > > blocks, so allow it to break out of this loop. Additionally, in that
> > > situation kthread_stop will have already woken lockd, so when
> > > nlmsvc_retry_blocked returns, have lockd check for kthread_stop() so
> > > that it doesn't end up calling svc_recv and waiting for a long time.
> >
> > Stupid question: how is this different from the kthread_stop() coming
> > just after this kthread_should_stop() call but before we call
> > svc_recv()? What keeps us from waiting in svc_recv() indefinitely
> > after kthread_stop()?
> >
>
> Nothing at all, unfortunately :-(
>
> Since we've taken signalling out of the lockd_down codepath, this gets
> a lot trickier than it used to be. I'm starting to wonder if we should
> go back to sending a signal on lockd_down.
OK, thanks. So do I keep these patches, or not? This sounds like a
regression (if perhaps a very minor one--I'm not quite clear). Help!
--b.
>
> > > This patch prevents this continual looping and allows lockd to
> > > eventually come down, but this can quite some time since lockd
> > > won't check the condition in the nlmsvc_retry_blocked while loop
> > > until nlmsvc_grant_blocked returns. That won't happen until
> > > all of the portmap requests time out.
> >
> > So, shutdown problems aside, does that mean an unresponsive client can
> > cause lock to stop serving lock requests in normal operation?
> >
>
> Possibly.
>
> I think that in general what happens is that blocks eventually get
> reinserted into the list with an expire time that is well after the
> current jiffies and we eventually hit this condition:
>
> if (time_after(block->b_when,jiffies)) {
> timeout = block->b_when - jiffies;
> break;
> }
>
> ...but I was seeing lockd loop in this loop over several iterations
> without returning back up to the main lockd loop. I think what was
> happening was that we were inserting the block with a later jiffies in
> nlmsvc_grant_blocked, but the rpcbind attempts would end up taking
> longer than that timeout. So when we returned from
> nlmsvc_grant_blocked() that condition would no longer fire, and we'd end up retrying the grant callback again.
>
> As to why we don't see this more often, I'm not sure. I probably need
> to experiment a bit more with it to make sure I'm understanding this
> correctly.
>
> Let me do that and get back to you...
next prev parent reply other threads:[~2008-01-29 1:23 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-01-28 19:29 [PATCH 0/2] NLM: fix use after free in lockd Jeff Layton
2008-01-28 19:29 ` [PATCH 1/2] NLM: tear down RPC clients in nlm_shutdown_hosts Jeff Layton
2008-01-28 19:29 ` [PATCH 2/2] NLM: break out of nlmsvc_retry_blocked if lockd is coming down Jeff Layton
2008-01-29 0:05 ` J. Bruce Fields
2008-01-29 1:12 ` Jeff Layton
[not found] ` <20080128201210.376097e6-RtJpwOs3+0O+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
2008-01-29 1:23 ` J. Bruce Fields [this message]
2008-01-29 2:29 ` Jeff Layton
[not found] ` <20080128212942.77035005-PC62bkCOHzGdMjc06nkz3ljfA9RmPOcC@public.gmane.org>
2008-01-29 3:02 ` J. Bruce Fields
2008-01-29 11:23 ` Jeff Layton
[not found] ` <20080129062312.33ba5a54-RtJpwOs3+0O+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
2008-01-29 12:32 ` Jeff Layton
[not found] ` <20080129073241.7c5c2ef1-RtJpwOs3+0O+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
2008-01-29 14:41 ` J. Bruce Fields
2008-01-28 23:18 ` [PATCH 1/2] NLM: tear down RPC clients in nlm_shutdown_hosts J. Bruce Fields
2008-01-29 0:04 ` Jeff Layton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080129012351.GE16785@fieldses.org \
--to=bfields@fieldses.org \
--cc=jlayton@redhat.com \
--cc=linux-nfs@vger.kernel.org \
--cc=neilb@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.