Re: server_scope v4.1 lock reclaim


From: "'J. Bruce Fields'" <bfields@fieldses.org>
To: Saso Slavicic <saso.linux@astim.si>
Cc: linux-nfs@vger.kernel.org
Subject: Re: server_scope v4.1 lock reclaim
Date: Tue, 28 Apr 2015 14:23:29 -0400
Message-ID: <20150428182329.GA16090@fieldses.org>
In-Reply-To: <000101d081d2$984f1820$c8ed4860$@astim.si>

On Tue, Apr 28, 2015 at 06:44:27PM +0200, Saso Slavicic wrote:
> > From: J. Bruce Fields
> > Sent: Monday, April 27, 2015 5:20 PM
> 
> > So in theory we could add some sort of way to configure the server scope
> > and then you could set the server scope to the same thing on all your
> > servers.
> >
> > But that's not enough to satisfy
> > https://tools.ietf.org/html/rfc5661#section-2.10.4, which also requires
> > stateid's and the rest to be compatible between the servers.
> 
> OK...I have to admit that with the amount of NFS HA tutorials and the
> improvements that NFS v4(.1) brings in the specs, I assumed that HA failover
> was supported. I apologize if that is not the case.

I'm afraid you're in the vanguard--I doubt many people have tried HA
with 4.1 and knfsd yet. (And I hadn't noticed the server scope problem,
thanks for bringing it up.)

> So, such a config option could be added but it's not planned to be added,
> since it could be wrongly used in some situations (i.e. not doing
> active-to-passive failover)?
> Active-active setup is then totally out of the question?

I'm not sure what the right fix is yet.

> > In practice given current Linux servers and clients maybe that could
> > work, because in your situation the only case when they see each other's
> > stateid's is after a restart, in which case the id's will include a boot
> > time that will result in a STALE error as long as the server clocks are
> > roughly synchronized.  But that makes some assumptions about how our
> > servers generate id's and how the clients use them.  And I don't think
> > those assumptions are guaranteed by the spec.  It seems fragile.
> 
> I read (part of) the specs and stateids are supposed to hold over sessions
> but not for different client ids.
> Doing a wireshark dump, the (failover) server sends STALE_CLIENTID after
> reconnect so that should properly invalidate all the ids?

Since this is 4.1, I guess the first rpc the new server sees will have
either a clientid or a sessionid.  So we want to make sure the new
server will handle either of those correctly.
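
The boot-time check mentioned in the quoted text above can be sketched
roughly as follows. This is a toy model, not knfsd code: ToyId, ToyServer,
the id layout, and the boot-time values are all assumptions made for
illustration. It only shows why an id issued by the old server gets rejected
when the boot times differ, and why the scheme is fragile when they do not.

```python
from dataclasses import dataclass

OK = "OK"
STALE = "STALE"  # stands in for the server's stale-id error


@dataclass(frozen=True)
class ToyId:
    boot_time: int  # boot time of the server instance that issued the id
    counter: int    # per-instance sequence number


class ToyServer:
    """Hypothetical server that stamps every id with its own boot time."""

    def __init__(self, boot_time: int):
        self.boot_time = boot_time
        self._counter = 0

    def issue(self) -> ToyId:
        self._counter += 1
        return ToyId(self.boot_time, self._counter)

    def check(self, ident: ToyId) -> str:
        # An id minted under a different boot time cannot be ours.
        return OK if ident.boot_time == self.boot_time else STALE


server_a = ToyServer(boot_time=1430240000)
old_id = server_a.issue()

# Failover: B starts later, so its boot time differs and A's id is rejected.
server_b = ToyServer(boot_time=1430240600)
result_synced = server_b.check(old_id)

# Fragile case: B's clock is off and its boot time happens to match A's,
# so B accepts an id it never issued.
server_b_skewed = ToyServer(boot_time=1430240000)
result_skewed = server_b_skewed.check(old_id)
```

In this model the rejection depends entirely on the two boot times
differing, which is the assumption described above as not guaranteed by the
spec.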

> Would I assume correctly that this is read from the nfsdcltrack? Is there
> even a need for this database to sync between each failover, if the client
> is already known since its last failover (only the timestamp would be
> older)?

So, you're thinking of a case where there's a failover from server A to
server B, then back to server A again, and a single client is
continuously active throughout both failovers?

Here's the sort of case that's a concern:

	- A->B failover happens
	- client gets a file lock from B
	- client loses contact with B (network problem or something)
	- B->A failover happens.

At this point, should A allow the client to reclaim its lock?  B could
have given up on the client, released its lock, and granted conflicting
locks to other clients.  Or it might not have.  Neither the client nor A
knows; B's the only one that knows what happened, so we need to get that
database from B to find out.
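
The scenario above can be walked through with a small toy model. Everything
here is hypothetical: FailoverServer, its known_clients set (a stand-in for
whatever client records a server keeps in stable storage), and the client
and path names are assumptions for illustration, not the nfsdcltrack schema
or knfsd behaviour. The sketch assumes B did give up on the client.

```python
class FailoverServer:
    """Toy server: tracks which clients may reclaim and who holds each lock."""

    def __init__(self):
        self.known_clients = set()  # stand-in for stable-storage client records
        self.locks = {}             # path -> client holding the lock

    def grant_lock(self, client, path):
        if self.locks.get(path, client) != client:
            return False  # conflicting lock already held
        self.locks[path] = client
        self.known_clients.add(client)
        return True

    def expire_client(self, client):
        # Server gives up on the client and releases its locks.
        self.known_clients.discard(client)
        self.locks = {p: c for p, c in self.locks.items() if c != client}

    def take_over(self, records):
        # Start serving after a failover with the given client records.
        self.known_clients = set(records)
        self.locks = {}

    def reclaim_lock(self, client, path):
        if client not in self.known_clients:
            return False  # no record that this client held state
        return self.grant_lock(client, path)


a, b = FailoverServer(), FailoverServer()
a.grant_lock("c1", "/f")              # c1 is known to A before any failover
a_stale = set(a.known_clients)

b.take_over(a_stale)                  # A->B failover
b.grant_lock("c1", "/f")              # client gets a file lock from B
b.expire_client("c1")                 # client loses contact; B gives up on it
b.grant_lock("c2", "/f")              # B grants a conflicting lock
b_records = set(b.known_clients)

a.take_over(a_stale)                  # B->A failover using A's old records
unsafe = a.reclaim_lock("c1", "/f")   # A wrongly lets c1 reclaim

a.take_over(b_records)                # B->A failover using B's records
safe_c1 = a.reclaim_lock("c1", "/f")  # c1 is refused
safe_c2 = a.reclaim_lock("c2", "/f")  # c2, the real holder, may reclaim
```

With A's own stale records, c1 reclaims a lock that B had already handed to
c2; with B's records, only c2 is allowed to reclaim.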

--b.

> > If it's simple active-to-passive failover then I suppose you could
> > arrange for the utsname to be the same too.
> 
> I could, but then I don't know which server is active when I login to ssh :)
> What would happen, if the 'migration' mount option would be modified for
> v4.1 mounts not to check for server scope when doing reclaims (as opposed to
> configuring server scope)? :)
> 
> Thanks,
> Saso Slavicic
