From: "J. Bruce Fields" <bfields@fieldses.org>
To: Jeff Layton <jlayton@kernel.org>
Cc: Scott Mayhew <smayhew@redhat.com>, linux-nfs@vger.kernel.org
Subject: Re: [PATCH v2 3/3] nfsd: keep a tally of RECLAIM_COMPLETE operations when using nfsdcld
Date: Thu, 20 Dec 2018 14:02:18 -0500
Message-ID: <20181220190218.GF6063@fieldses.org>
In-Reply-To: <f1664df76f40994a848e58f8a4ea86ab8c855ff8.camel@kernel.org>
On Thu, Dec 20, 2018 at 01:26:34PM -0500, Jeff Layton wrote:
> On Thu, 2018-12-20 at 13:05 -0500, J. Bruce Fields wrote:
> > On Thu, Dec 20, 2018 at 12:29:43PM -0500, Jeff Layton wrote:
> > > That wasn't my thinking here.
> > >
> > > Suppose we have a client that holds some locks. Server reboots and we do
> > > EXCHANGE_ID and start reclaiming, and eventually send a
> > > RECLAIM_COMPLETE.
> > >
> > > Now, there is a network partition and we lose contact with the server
> > > for more than a lease period. The client record gets tossed out. Client
> > > eventually reestablishes the connection before the grace period ends and
> > > attempts to reclaim.
> > >
> > > That reclaim should succeed, IMO, as there is no reason that it
> > > shouldn't. Nothing can have claimed competing state since we're still in
> > > the grace period.
> >
> > That scenario requires a grace period longer than the lease period,
> > which isn't impossible but sounds rare? I guess you're thinking in the
> > cluster case about the possibility of a second node failure extending
> > the grace period.
>
> Isn't our grace period twice the lease period by default?
Reminding myself.... Upstream now it will end the grace period after one
grace period, but will extend it up to two grace periods if someone has
reclaimed in the last second.
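
To make the rule above concrete, here is a toy model in Python. It is only a sketch of the behaviour as described in this message, not the nfsd code: the names GraceState and should_end_grace are made up for illustration, the 90-second value in the usage note is an arbitrary example, and the model assumes something polls the function roughly once per second once the first grace period has elapsed.

```python
from dataclasses import dataclass


@dataclass
class GraceState:
    """Hypothetical bookkeeping for one server boot (illustrative only)."""
    boot_time: float                  # when the server (re)started, in seconds
    grace_period: float               # length of one grace period, in seconds
    somebody_reclaimed: bool = False  # set by the caller whenever a reclaim arrives


def should_end_grace(state: GraceState, now: float) -> bool:
    """Return True if the grace period should end at time `now`.

    Assumed to be polled about once per second after the first
    grace period has elapsed.
    """
    elapsed = now - state.boot_time
    if elapsed < state.grace_period:
        # Never end before one full grace period.
        return False
    if elapsed >= 2 * state.grace_period:
        # Hard cap: never extend beyond two grace periods.
        return True
    # Between one and two grace periods: keep waiting only if a
    # reclaim arrived since the previous poll, then reset the flag.
    reclaimed_recently = state.somebody_reclaimed
    state.somebody_reclaimed = False
    return not reclaimed_recently
```

With a 90-second grace period, a server that saw no reclaims ends grace at 90 seconds, one that keeps seeing reclaims holds on until 180 seconds at the latest, and one whose reclaims stop at 120 seconds ends grace on the next poll after that.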
> I think we do
> have to assume that it may take an entire lease period before the
> client notices that the server has rebooted. If grace period == lease
> period then you aren't leaving much time for reclaim to occur.
My assumption is that it's mainly the client's responsibility to allow
enough time, by renewing its lease somewhat more frequently than once
per lease period.

That may be wrong--there's some support for that assumption in
https://tools.ietf.org/html/rfc7530#section-9.5, but that's talking only
about network delays, not about allowing additional time for the
recovery.
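
The timing argument can be put as simple arithmetic. The sketch below is a simplification under stated assumptions: the client is otherwise idle, so it only notices the reboot at its next lease renewal, and network delays are ignored. The function name reclaim_window and the example numbers are hypothetical.

```python
def reclaim_window(grace_period: float, renew_interval: float) -> float:
    """Worst-case seconds left for reclaim once an idle client notices the reboot.

    Assumes the server reboots just after a renewal, so the client
    discovers the reboot one full renew_interval later.
    """
    if renew_interval <= 0:
        raise ValueError("renew_interval must be positive")
    return max(0.0, grace_period - renew_interval)
```

For a 90-second grace period, renewing once per 90-second lease leaves reclaim_window(90, 90) == 0.0 seconds in the worst case, while renewing every 60 seconds leaves 30 seconds. Doubling the grace period instead gives reclaim_window(180, 90) == 90.0.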
> > Still, that's different from the case where the client explicitly
> > destroys its own state. That could happen in less than a lease period
> > and in that case there won't be a reclaim. I think that case could
> > happen if a client rebooted quickly or maybe just unmounted.
> >
> > Hm.
> >
>
> True. You're right that we don't want to delay lifting the grace period
> because we're waiting for clients that have unmounted and aren't coming
> back. Unfortunately, it's difficult to distinguish the two cases. Could
> we just decrement the counter when we're tearing down a clientid
> because of lease expiration and not on DESTROY_CLIENT?
Right, either DESTROY_CLIENTID or (in the 4.0 case) a
SETCLIENTID_CONFIRM. So those two cases wouldn't be difficult to treat
differently. OK, maybe that's the best choice.
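
A rough sketch of that choice, in Python rather than kernel C, and based only on my reading of this thread: the tally goes up on RECLAIM_COMPLETE and comes back down only when a client that had completed reclaim is torn down for lease expiry, since that client may return and reclaim again. ReclaimTally, Teardown and the method names are hypothetical, not taken from the patch.

```python
from enum import Enum, auto


class Teardown(Enum):
    """Why a client record is being torn down (illustrative names)."""
    LEASE_EXPIRED = auto()
    DESTROY_CLIENTID = auto()
    SETCLIENTID_CONFIRM = auto()   # the 4.0 case


class ReclaimTally:
    """Hypothetical tally of clients that have sent RECLAIM_COMPLETE."""

    def __init__(self, expected_clients: int) -> None:
        self.expected = expected_clients  # clients recorded before the reboot
        self.completed = 0                # RECLAIM_COMPLETEs counted so far

    def reclaim_complete(self) -> None:
        self.completed += 1

    def teardown(self, reason: Teardown, had_completed: bool) -> None:
        # Only an expired lease undoes the tally: that client may come
        # back before grace ends. A client that explicitly destroyed or
        # replaced its own state is not going to reclaim.
        if had_completed and reason is Teardown.LEASE_EXPIRED:
            self.completed -= 1

    def can_end_grace_early(self) -> bool:
        return self.completed >= self.expected
```

With two expected clients that have both completed, a DESTROY_CLIENTID from one leaves can_end_grace_early() true, while a lease expiry for one makes it false again until that client reclaims or the grace period runs out.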
--b.