From: Jeff Layton <jlayton@redhat.com>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: linux-nfs@vger.kernel.org, chuck.lever@oracle.com
Subject: Re: [PATCH v4 0/6] nfsd: overhaul the client name tracking code
Date: Wed, 25 Jan 2012 08:38:20 -0500
Message-ID: <20120125083820.637c8362@tlielax.poochiereds.net>
In-Reply-To: <20120125131116.GA17873@fieldses.org>

On Wed, 25 Jan 2012 08:11:17 -0500
"J. Bruce Fields" <bfields@fieldses.org> wrote:
> On Wed, Jan 25, 2012 at 06:41:58AM -0500, Jeff Layton wrote:
> > On Tue, 24 Jan 2012 18:08:55 -0500
> > "J. Bruce Fields" <bfields@fieldses.org> wrote:
> >
> > > On Mon, Jan 23, 2012 at 03:01:01PM -0500, Jeff Layton wrote:
> > > > This is the fourth iteration of this patchset. I had originally asked
> > > > Bruce to take the last one for 3.3, but decided at the last minute to
> > > > wait on it a bit. I knew there would be some changes needed in the
> > > > upcall, so by waiting we can avoid needing to deal with those in code
> > > > that has already shipped. I would like to see this patchset considered
> > > > for 3.4 however.
> > > >
> > > > The previous patchset can be viewed here. That set also contains a
> > > > more comprehensive description of the rationale for this:
> > > >
> > > > http://www.spinics.net/lists/linux-nfs/msg26324.html
> > > >
> > > > There have been a number of significant changes since the last set:
> > > >
> > > > - the remove/expire upcall is now gone. In a clustered environment, the
> > > > records would need to be refcounted in order to handle that properly. That
> > > > becomes a sticky problem when you could have nodes rebooting. We don't
> > > > really need to remove these records individually however. Cleaning them
> > > > out only when the grace period ends should be sufficient.
> > >
> > > I don't think so:
> > >
> > > 1. Client establishes state with server.
> > > 2. Network goes down.
> > > 3. A lease period passes without the client being able to renew.
> > > The server expires the client and grants conflicting locks to
> > > other clients.
> > > 4. Server reboots.
> > > 5. Network comes back up.
> > >
> > > At this point, the client sees that the server has rebooted and is in
> > > its grace period, and reclaims. Ooops.
> > >
> > > The server needs to be able to tell the client "nope, you're not allowed
> > > to reclaim any more" at this point.
> > >
> > > So we need some sort of remove/expire upcall.
> > >
> >
> > Doh! I don't know what I was thinking -- you're correct and we do need
> > that.
> >
> > Ok, I'll see about putting it back and will resend. That does make it
> > rather nasty to handle clients mounting from multiple nodes in the same
> > cluster though. We'll need to come up with a data model that allows for
> > that as well.
>
> Honestly, in the v4-based migration case if one client can hold state on
> multiple nodes, and could (could it?) after reboot decide to reclaim
> state on a different node from the one it previously held the same state
> on--I'm not even clear what *should* happen, or if the protocol is
> really adequate for that case.
>
> --b.

That was one of Chuck's concerns, IIUC:

--------------[snip]----------------
What if a server has more than one address? For example, an IPv4 and
an IPv6 address? Does it get two separate database files? If so, how
do you ensure that a client's nfs_client_id4 is recorded in both places
atomically? I'm not sure tying the server's identity to an IP address
is wise.
--------------[snip]----------------

This is the core problem: we need to tie the record to some property
that's invariant for the NFS server "instance". That can't be a physical
nodeid or anything like it, since part of the goal here is to allow
cluster services to float freely between nodes.

I'd really like to avoid having to establish some abstract "service
ID" or something, since we'd have to track that on stable storage on a
per-NFS-service basis.

The server address seems like a natural fit here. With the design I'm
proposing, a client will need to reestablish its state on another node
if it migrates for any reason.
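
To make the per-address model a bit more concrete, here is a rough
sketch in Python. It is not the nfsd or nfsdcld code, only a toy model:
the ClientTracker class, its method names, and the example addresses and
client names are all hypothetical. It assumes records live on stable
storage keyed by (server address, client ID), that a record is removed
when the client's lease expires, and that a reclaim is only allowed
during the grace period for a client that still has a record.

```python
from dataclasses import dataclass, field


@dataclass
class ClientTracker:
    """Toy stand-in for the stable-storage client database (hypothetical)."""

    # Clients allowed to reclaim, keyed by (server_address, client_id).
    records: set = field(default_factory=set)
    in_grace: bool = False

    def create(self, server_addr: str, client_id: str) -> None:
        # Client established state against this server address.
        self.records.add((server_addr, client_id))

    def remove(self, server_addr: str, client_id: str) -> None:
        # Lease expired: the client must not be allowed to reclaim later.
        self.records.discard((server_addr, client_id))

    def reboot(self) -> None:
        # Records survive the reboot; the server comes up in grace.
        self.in_grace = True

    def may_reclaim(self, server_addr: str, client_id: str) -> bool:
        # Only clients with a surviving record for this address may reclaim.
        return self.in_grace and (server_addr, client_id) in self.records

    def end_grace(self, reclaimed) -> None:
        # Drop records for clients that did not reclaim during grace.
        self.records &= set(reclaimed)
        self.in_grace = False


# Bruce's scenario: the lease expires while the network is down,
# then the server reboots before the client comes back.
tracker = ClientTracker()
tracker.create("192.0.2.1", "client-a")
tracker.remove("192.0.2.1", "client-a")   # the remove/expire upcall
tracker.reboot()
print(tracker.may_reclaim("192.0.2.1", "client-a"))

# Per-address keying: state held via one address cannot be reclaimed
# through another, so the client has to reestablish it there.
tracker.create("192.0.2.1", "client-b")
print(tracker.may_reclaim("192.0.2.2", "client-b"))
```

Without the remove() call in the first scenario, the record would still
be present after the reboot and the stale client would be allowed to
reclaim. The second lookup also shows Chuck's concern: with this keying,
a server reachable at two addresses ends up with two independent sets
of records.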

Chuck, what was your specific worry about tracking these on a
per-server-address basis? Can you outline a scenario where that would
break something?

--
Jeff Layton <jlayton@redhat.com>