All of lore.kernel.org
 help / color / mirror / Atom feed
From: "J. Bruce Fields" <bfields@fieldses.org>
To: Jeff Layton <jeff.layton@primarydata.com>
Cc: linux-nfs@vger.kernel.org
Subject: Re: [PATCH v2 1/5] lockd: move lockd's grace period handling into its own module
Date: Mon, 15 Sep 2014 20:19:43 -0400	[thread overview]
Message-ID: <20140916001943.GA7712@fieldses.org> (raw)
In-Reply-To: <20140915191919.4097750d@tlielax.poochiereds.net>

On Mon, Sep 15, 2014 at 07:19:19PM -0400, Jeff Layton wrote:
> On Mon, 15 Sep 2014 18:09:39 -0400
> "J. Bruce Fields" <bfields@fieldses.org> wrote:
> 
> > On Mon, Sep 15, 2014 at 06:08:13PM -0400, J. Bruce Fields wrote:
> > > On Thu, Aug 28, 2014 at 04:24:43PM -0400, J. Bruce Fields wrote:
> > > > On Thu, Aug 28, 2014 at 04:01:00PM -0400, J. Bruce Fields wrote:
> > > > > On Tue, Aug 19, 2014 at 02:38:25PM -0400, Jeff Layton wrote:
> > > > > > Currently, all of the grace period handling is part of lockd. Eventually
> > > > > > though we'd like to be able to build v4-only servers, at which point
> > > > > > we'll need to put all of this elsewhere.
> > > > > > 
> > > > > > Move the code itself into fs/nfs_common and have it build a grace.ko
> > > > > > module. Then, rejigger the Kconfig options so that both nfsd and lockd
> > > > > > enable it automatically.
> > > > > 
> > > > > Thanks, applying this one for 3.18 indepedently of the others.
> > > > 
> > > > This code should also be fixed, though.
> > > > 
> > > > Currently nfsd is recording the grace period as done when its own timer
> > > > runs out, but then it continuing to accept reclaims until lockd is also
> > > > done.
> > > 
> > > I sat down to fix this today then decided there's not a real bug:
> > > 
> > > All the grace_done upcall really does is throw away any previous clients
> > > that haven't reclaimed yet.
> > > 
> > > It's legal to do that as soon as the correct amount of time has passed.
> > > It's actually OK to continue to allow the grace period to run past that
> > > point, the only requirement is that all reclaims precede all new opens,
> > > which the code still does enforce.
> > > 
> > > So I think it might just be worth a little extra explanation.
> > 
> > I considered doing something like the following anyway (total
> > off-the-cuff draft, not tested), but I think it's overkill.
> > 
> > --b.
> > 
> 
> Yeah, I tend to agree. I don't see a real need for it.
> 
> In principle, I guess you could end up in a situation where the lockd
> and nfsd grace periods different by a large amount. In that case, you
> could end up in a situation where you started denying reclaims for v4.x
> clients that haven't reclaimed anything yet, but then still didn't
> allow them to establish new locks either.

Oh, right, I forget that this also keeps old clients from doing further
reclaims during this boot.

I think that could actually be fixed purely in userspace--instead of
purging those old clients, just record the new end-of-grace timestamp
and keep them around a while longer.  You still know which ones to deny
on the next boot, but they're still around if you want to keep allowing
them to reclaim.

Anyway:

> So it might be reasonable to add something like this, but I don't see
> it as anything most setups would ever hit -- particularly not in sane
> configurations.

Yeah, no big deal.

--b.

> 
> 
> > commit 675121da48fb4d7d85b58467c7f79dd3a6964879
> > Author: J. Bruce Fields <bfields@redhat.com>
> > Date:   Mon Sep 15 17:42:14 2014 -0400
> > 
> >     nfsd4: end grace only when last grace period ends
> >     
> >     In the case both lockd and nfsd4 are running, or when we have multiple
> >     containers exporting the same filesystem, there may be multiple grace
> >     periods.  In that case the grace period doesn't really end till the last
> >     one does and though it's harmless to do the grace_done upcall before
> >     then, it would probably be better not to.
> >     
> >     Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> > 
> > diff --git a/fs/nfs_common/grace.c b/fs/nfs_common/grace.c
> > index ae6e58e..dc67e5d 100644
> > --- a/fs/nfs_common/grace.c
> > +++ b/fs/nfs_common/grace.c
> > @@ -12,6 +12,11 @@
> >  static int grace_net_id;
> >  static DEFINE_SPINLOCK(grace_lock);
> >  
> > +struct grace_net {
> > +	struct list_head *grace_list;
> > +	bool in_grace;
> > +};
> > +
> >  /**
> >   * locks_start_grace
> >   * @net: net namespace that this lock manager belongs to
> > @@ -27,10 +32,10 @@ static DEFINE_SPINLOCK(grace_lock);
> >  void
> >  locks_start_grace(struct net *net, struct lock_manager *lm)
> >  {
> > -	struct list_head *grace_list = net_generic(net, grace_net_id);
> > +	struct list_head *gn = net_generic(net, grace_net_id);
> >  
> >  	spin_lock(&grace_lock);
> > -	list_add(&lm->list, grace_list);
> > +	list_add(&lm->list, &gn->grace_list);
> >  	spin_unlock(&grace_lock);
> >  }
> >  EXPORT_SYMBOL_GPL(locks_start_grace);
> > @@ -49,9 +54,31 @@ EXPORT_SYMBOL_GPL(locks_start_grace);
> >  void
> >  locks_end_grace(struct lock_manager *lm)
> >  {
> > +	struct list_head *gn = net_generic(net, grace_net_id);
> > +
> > +	if (list_empty(&lm->list))
> > +		return;
> >  	spin_lock(&grace_lock);
> >  	list_del_init(&lm->list);
> >  	spin_unlock(&grace_lock);
> > +	if (list_empty(&gn->grace_list))
> > +		return;
> > +	/*
> > +	 * If the server goes down again right now, an NFSv4
> > +	 * client will still be reclaim after it comes back up, even if
> > +	 * it hasn't yet had a chance to reclaim state this time.
> > +	 */
> > +	lm->finish_grace(lm);
> > +	/*
> > +	 * At this point, NFSv4 clients can still reclaim.  But if the
> > +	 * server crashes, any that have not yet reclaimed will be out
> > +	 * of luck on the next boot.
> > +	 */
> > +	ln->in_grace = false;
> 
> 	gn->in_grace = false;
> 
> > +	/*
> > +	 * At this point, no reclaims are permitted.  Regular locking
> > +	 * can resume.
> > +	 */
> >  }
> >  EXPORT_SYMBOL_GPL(locks_end_grace);
> >  
> > @@ -65,27 +92,28 @@ EXPORT_SYMBOL_GPL(locks_end_grace);
> >  int
> >  locks_in_grace(struct net *net)
> >  {
> > -	struct list_head *grace_list = net_generic(net, grace_net_id);
> > +	struct list_head *gn = net_generic(net, grace_net_id);
> >  
> > -	return !list_empty(grace_list);
> > +	return gn->in_grace;
> >  }
> >  EXPORT_SYMBOL_GPL(locks_in_grace);
> >  
> >  static int __net_init
> >  grace_init_net(struct net *net)
> >  {
> > -	struct list_head *grace_list = net_generic(net, grace_net_id);
> > +	struct list_head *gn = net_generic(net, grace_net_id);
> >  
> > -	INIT_LIST_HEAD(grace_list);
> > +	INIT_LIST_HEAD(&gn->grace_list);
> > +	gn->in_grace = true;
> >  	return 0;
> >  }
> >  
> >  static void __net_exit
> >  grace_exit_net(struct net *net)
> >  {
> > -	struct list_head *grace_list = net_generic(net, grace_net_id);
> > +	struct list_head *gn = net_generic(net, grace_net_id);
> >  
> > -	BUG_ON(!list_empty(grace_list));
> > +	BUG_ON(!list_empty(&gn->grace_list));
> >  }
> >  
> >  static struct pernet_operations grace_net_ops = {
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index fc88b57..968acd0 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> > @@ -4122,7 +4122,6 @@ nfsd4_end_grace(struct nfsd_net *nn)
> >  
> >  	dprintk("NFSD: end of grace period\n");
> >  	nn->grace_ended = true;
> > -	nfsd4_record_grace_done(nn);
> >  	locks_end_grace(&nn->nfsd4_manager);
> >  	/*
> >  	 * Now that every NFSv4 client has had the chance to recover and
> > @@ -6341,6 +6340,13 @@ nfs4_state_destroy_net(struct net *net)
> >  	put_net(net);
> >  }
> >  
> > +void nfsd4_finish_grace(struct lock_manager *lm)
> > +{
> > +	struct nfsd_net *nn = lm->private;
> > +
> > +	nfsd4_record_grace_done(nn);
> > +}
> > +
> >  int
> >  nfs4_state_start_net(struct net *net)
> >  {
> > @@ -6352,6 +6358,8 @@ nfs4_state_start_net(struct net *net)
> >  		return ret;
> >  	nn->boot_time = get_seconds();
> >  	nn->grace_ended = false;
> > +	nn->nfsd4_manager.finish_grace = nfsd4_finish_grace;
> > +	nn->nfsd4_manager.private = nn;
> >  	locks_start_grace(net, &nn->nfsd4_manager);
> >  	nfsd4_client_tracking_init(net);
> >  	printk(KERN_INFO "NFSD: starting %ld-second grace period (net %p)\n",
> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index 9418772..0e33e7c 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -876,6 +876,8 @@ struct lock_manager_operations {
> >  
> >  struct lock_manager {
> >  	struct list_head list;
> > +	void (*finish_grace)(struct lock_manager *lm);
> > +	void *private;
> >  };
> >  
> >  struct net;
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 
> -- 
> Jeff Layton <jlayton@primarydata.com>

  reply	other threads:[~2014-09-16  0:19 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-08-19 18:38 [PATCH v2 0/5] nfsd: support for lifting grace period early Jeff Layton
2014-08-19 18:38 ` [PATCH v2 1/5] lockd: move lockd's grace period handling into its own module Jeff Layton
2014-08-28 20:01   ` J. Bruce Fields
2014-08-28 20:24     ` J. Bruce Fields
2014-08-28 23:53       ` Jeff Layton
2014-09-03 21:54         ` J. Bruce Fields
2014-09-15 22:08       ` J. Bruce Fields
2014-09-15 22:09         ` J. Bruce Fields
2014-09-15 22:10           ` J. Bruce Fields
2014-09-15 23:19           ` Jeff Layton
2014-09-16  0:19             ` J. Bruce Fields [this message]
2014-09-15 23:11         ` Jeff Layton
2014-08-19 18:38 ` [PATCH v2 2/5] lockd: add a /proc/fs/lockd/nlm_end_grace file Jeff Layton
2014-09-04 19:52   ` J. Bruce Fields
2014-08-19 18:38 ` [PATCH v2 3/5] nfsd: add a v4_end_grace file to /proc/fs/nfsd Jeff Layton
2014-09-04 19:54   ` J. Bruce Fields
2014-09-05 11:40     ` Jeff Layton
2014-08-19 18:38 ` [PATCH v2 4/5] nfsd: remove redundant boot_time parm from grace_done client tracking op Jeff Layton
2014-09-04 19:54   ` J. Bruce Fields
2014-08-19 18:38 ` [PATCH v2 5/5] nfsd: pass extra info in env vars to upcalls to allow for early grace period end Jeff Layton
2014-09-04 19:59   ` J. Bruce Fields
2014-09-05 11:43     ` Jeff Layton
2014-09-05 15:58       ` J. Bruce Fields
2014-09-26 18:39 ` [PATCH v2 0/5] nfsd: support for lifting grace period early J. Bruce Fields
2014-09-26 18:54   ` Jeff Layton
2014-09-26 19:46     ` J. Bruce Fields
2014-09-26 20:37       ` Trond Myklebust
2014-09-26 20:45         ` J. Bruce Fields
2014-09-26 20:58           ` Trond Myklebust
2014-09-26 21:47             ` J. Bruce Fields
2014-09-26 22:17               ` Trond Myklebust
2014-09-26 22:35                 ` Trond Myklebust
2014-09-27 13:04       ` Jeff Layton
2014-09-29 16:44         ` J. Bruce Fields
2014-09-29 16:53           ` Trond Myklebust
2014-09-29 17:11           ` Jeff Layton
2014-09-29 17:55             ` J. Bruce Fields

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140916001943.GA7712@fieldses.org \
    --to=bfields@fieldses.org \
    --cc=jeff.layton@primarydata.com \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.