Re: server_scope v4.1 lock reclaim

From: bfields@fieldses.org (J. Bruce Fields)
To: Saso Slavicic <saso.linux@astim.si>
Cc: linux-nfs@vger.kernel.org
Subject: Re: server_scope v4.1 lock reclaim
Date: Mon, 27 Apr 2015 11:19:44 -0400
Message-ID: <20150427151944.GA2735@fieldses.org>
In-Reply-To: <000601d080b0$687a2860$396e7920$@astim.si>

On Mon, Apr 27, 2015 at 08:07:12AM +0200, Saso Slavicic wrote:
> I'm doing a NFS HA setup for KVM and need lock reclaim to work. I've been
> doing a lot of testing and reading in the past week and finally figured out
> that for reclaims to work on a 4.1 mount (4.1 is preferable due to
> RECLAIM_COMPLETE and thus faster failover), the server hostnames need to be
> the same. RFC specifies that reclaim can succeed if server scope is the same
> and in fact, the client will not even attempt a reclaim if the server scope
> does not match.
> 
> But...there doesn't seem to be any way of setting server scope other than
> changing server hostname? RFC states: "The purpose of the server scope is to
> allow a group of servers to indicate to clients that a set of servers
> sharing the same server scope value has arranged to use compatible values of
> otherwise opaque identifiers." The nfsdcltrack directory is properly handed
> over during failover so I'd need some way of configuring server scope on
> this "set of servers"? From the code, the server scope is simply set to
> utsname()->nodename in nfs4xdr.c.
> 
> What am I missing here, how can this work when Heartbeat needs different
> names for nodes?

So in theory we could add some sort of way to configure the server scope
and then you could set the server scope to the same thing on all your
servers.
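
To make that concrete, here is a rough sketch (Python, purely
illustrative, not kernel code) of the idea: the advertised scope
defaults to the nodename, as the code in nfs4xdr.c does today, and the
client only attempts reclaim when the old and new scopes match. The
`configured` override and both function names are hypothetical; no such
knob exists in the code discussed here.

```python
import os


def server_scope(configured=None):
    """Return the scope string a server would advertise.

    `configured` stands in for a hypothetical admin-set override.
    Without it, fall back to the nodename, mirroring the current
    utsname()->nodename behaviour.
    """
    if configured is not None:
        return configured
    return os.uname().nodename


def client_should_reclaim(old_scope, new_scope):
    """A client only attempts lock reclaim if the scope is unchanged."""
    return old_scope == new_scope


# Two failover nodes sharing one configured scope look like the same
# server to the client, so reclaim would be attempted.
shared = "nfs-ha"  # hypothetical value
attempt = client_should_reclaim(server_scope(shared), server_scope(shared))
```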

But that alone is not enough to satisfy
https://tools.ietf.org/html/rfc5661#section-2.10.4, which also requires
stateids and the other opaque identifiers to be compatible between the
servers.

In practice, given current Linux servers and clients, maybe that could
work, because in your situation the only case in which the servers see
each other's stateids is after a restart. In that case the IDs will
include a boot time, and that boot time will result in a STALE error as
long as the server clocks are roughly synchronized.  But that makes some
assumptions about how our servers generate IDs and how the clients use
them, and I don't think those assumptions are guaranteed by the spec.
It seems fragile.
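
A toy model of that argument, in Python and purely illustrative: the
`StateId` layout, the `Server` class and the "OK"/"STALE" return strings
are assumptions made for this sketch, not the actual nfsd data
structures. It shows why a stateid issued by one node is rejected by the
other when their boot times differ, and why the scheme silently breaks
if the embedded boot times happen to coincide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StateId:
    boot_time: int  # start time of the server instance that issued it
    counter: int    # identifier unique within that server instance


class Server:
    def __init__(self, boot_time):
        self.boot_time = boot_time
        self._next = 0

    def issue(self):
        """Hand out a new stateid stamped with this instance's boot time."""
        self._next += 1
        return StateId(self.boot_time, self._next)

    def check(self, sid):
        """Reject any stateid stamped with a different boot time."""
        if sid.boot_time != self.boot_time:
            return "STALE"
        return "OK"


node_a = Server(boot_time=1000)
sid = node_a.issue()

# Normal failover: node_b started later, so node_a's stateid is stale.
node_b = Server(boot_time=1060)
normal_result = node_b.check(sid)

# Fragile case: if the boot times collide, the foreign stateid is
# accepted even though node_c never issued it.
node_c = Server(boot_time=1000)
collision_result = node_c.check(sid)
```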

If it's simple active-to-passive failover, then I suppose you could
arrange for the utsname nodename to be the same on both servers too.

--b.

Thread overview: 4+ messages
2015-04-27  6:07 server_scope v4.1 lock reclaim Saso Slavicic
2015-04-27 15:19 ` J. Bruce Fields [this message]
2015-04-28 16:44   ` Saso Slavicic
2015-04-28 18:23     ` 'J. Bruce Fields'