linux-nfs.vger.kernel.org archive mirror
From: Jeff Layton <jlayton@redhat.com>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: linux-nfs@vger.kernel.org, chuck.lever@oracle.com
Subject: Re: [PATCH v4 0/6] nfsd: overhaul the client name tracking code
Date: Wed, 25 Jan 2012 08:38:20 -0500
Message-ID: <20120125083820.637c8362@tlielax.poochiereds.net>
In-Reply-To: <20120125131116.GA17873@fieldses.org>

On Wed, 25 Jan 2012 08:11:17 -0500
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Wed, Jan 25, 2012 at 06:41:58AM -0500, Jeff Layton wrote:
> > On Tue, 24 Jan 2012 18:08:55 -0500
> > "J. Bruce Fields" <bfields@fieldses.org> wrote:
> > 
> > > On Mon, Jan 23, 2012 at 03:01:01PM -0500, Jeff Layton wrote:
> > > > This is the fourth iteration of this patchset. I had originally asked
> > > > Bruce to take the last one for 3.3, but decided at the last minute to
> > > > wait on it a bit. I knew there would be some changes needed in the
> > > > upcall, so by waiting we can avoid needing to deal with those in code
> > > > that has already shipped. I would like to see this patchset considered
> > > > for 3.4 however.
> > > > 
> > > > The previous patchset can be viewed here. That set also contains a
> > > > more comprehensive description of the rationale for this:
> > > > 
> > > >     http://www.spinics.net/lists/linux-nfs/msg26324.html
> > > > 
> > > > There have been a number of significant changes since the last set:
> > > > 
> > > > - the remove/expire upcall is now gone. In a clustered environment, the
> > > > records would need to be refcounted in order to handle that properly. That
> > > > becomes a sticky problem when you could have nodes rebooting. We don't
> > > > really need to remove these records individually however. Cleaning them
> > > > out only when the grace period ends should be sufficient.
> > > 
> > > I don't think so:
> > > 
> > > 	1. Client establishes state with server.
> > > 	2. Network goes down.
> > > 	3. A lease period passes without the client being able to renew.
> > > 	   The server expires the client and grants conflicting locks to
> > > 	   other clients.
> > > 	4. Server reboots.
> > > 	5. Network comes back up.
> > > 
> > > At this point, the client sees that the server has rebooted and is in
> > > its grace period, and reclaims.  Ooops.
> > > 
> > > The server needs to be able to tell the client "nope, you're not allowed
> > > to reclaim any more" at this point.
> > > 
> > > So we need some sort of remove/expire upcall.
> > > 
> > 
> > Doh! I don't know what I was thinking -- you're correct and we do need
> > that.
> > 
> > Ok, I'll see about putting it back and will resend. That does make it
> > rather nasty to handle clients mounting from multiple nodes in the same
> > cluster though. We'll need to come up with a data model that allows for
> > that as well.
> 
> Honestly, in the v4-based migration case if one client can hold state on
> multiple nodes, and could (could it?) after reboot decide to reclaim
> state on a different node from the one it previously held the same state
> on--I'm not even clear what *should* happen, or if the protocol is
> really adequate for that case.
> 
> --b.

That was one of Chuck's concerns, IIUC:

--------------[snip]----------------

What if a server has more than one address?  For example, an IPv4 and
an IPv6 address?  Does it get two separate database files?  If so, how
do you ensure that a client's nfs_client_id4 is recorded in both places
atomically?  I'm not sure tying the server's identity to an IP address
is wise.

--------------[snip]----------------

This is the problem...

We need to tie the record to some property that's invariant for the NFS
server "instance". That can't be a physical node ID or anything like it,
since part of the goal here is to allow cluster services to float freely
between nodes.

I really would like to avoid having to establish some abstract "service
ID" or something since we'd have to track that on stable storage on a
per-nfs-service basis.

The server address seems like a natural fit here. With the design I'm
proposing, a client will need to reestablish its state on another node
if it migrates for any reason.
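
To make the per-address idea concrete, here's a rough sketch in Python.
It is only a toy model: the ClientTracker class, its method names, and
the example addresses are all made up for illustration and are not what
nfsd or nfsdcld actually implement. It also shows the case Chuck raised,
where one server is reachable at two addresses:

```python
from collections import defaultdict


class ClientTracker:
    """Toy model: one set of client records per server address (hypothetical)."""

    def __init__(self):
        # server address -> names of clients allowed to reclaim after a reboot
        self.records = defaultdict(set)

    def create(self, server_addr, client_name):
        # Record that this client established state via this server address.
        self.records[server_addr].add(client_name)

    def may_reclaim(self, server_addr, client_name):
        # Reclaim is allowed only if a record exists under the same address.
        return client_name in self.records.get(server_addr, set())


tracker = ClientTracker()
tracker.create("192.0.2.10", "client-a")  # state established over IPv4

print(tracker.may_reclaim("192.0.2.10", "client-a"))    # True
print(tracker.may_reclaim("2001:db8::10", "client-a"))  # False
```

In this model the IPv6 lookup fails unless the record was also written
under that address, which is exactly the atomicity question quoted above.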

Chuck, what was your specific worry about tracking these on a per-
server address basis? Can you outline a scenario where that would break
something?
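
For reference, the failure Bruce described earlier in the thread can be
modelled in a few lines of Python. This is a toy sketch, not kernel
code: the Server class, the remove_on_expire flag, and the client name
are hypothetical stand-ins for the stable-storage records and the
remove/expire upcall.

```python
class Server:
    """Toy model of a server whose client records survive a reboot."""

    def __init__(self, stable_storage, remove_on_expire):
        self.stable = stable_storage          # persists across reboot
        self.remove_on_expire = remove_on_expire
        self.active = set()                   # in-memory state, lost on reboot
        self.in_grace = False

    def establish(self, client):
        self.active.add(client)
        self.stable.add(client)

    def expire(self, client):
        # Lease ran out; conflicting locks may now go to other clients.
        self.active.discard(client)
        if self.remove_on_expire:
            self.stable.discard(client)       # the remove/expire upcall

    def reboot(self):
        self.active.clear()
        self.in_grace = True

    def reclaim(self, client):
        # Reclaim is allowed only during grace and only with a record.
        return self.in_grace and client in self.stable


def run(remove_on_expire):
    server = Server(set(), remove_on_expire)
    server.establish("client-a")  # 1. client establishes state
    server.expire("client-a")     # 2-3. network down, lease expires
    server.reboot()               # 4. server reboots
    return server.reclaim("client-a")  # 5. network back, client reclaims


print(run(remove_on_expire=False))  # True: stale reclaim wrongly allowed
print(run(remove_on_expire=True))   # False: server can refuse the reclaim
```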

-- 
Jeff Layton <jlayton@redhat.com>
