From: "J. Bruce Fields" <bfields@fieldses.org>
To: Neil Brown <neilb@suse.de>
Cc: "J. Bruce Fields" <bfields@redhat.com>,
	linux-nfs@vger.kernel.org,
	Menyhart Zoltan <Zoltan.Menyhart@bull.net>
Subject: Re: [PATCH 1/4] svcrpc: never clear XPT_BUSY on dead xprt
Date: Fri, 12 Nov 2010 14:00:03 -0500
Message-ID: <20101112190002.GB32745@fieldses.org>
In-Reply-To: <20101026160517.GC19445@fieldses.org>

On Tue, Oct 26, 2010 at 12:05:17PM -0400, J. Bruce Fields wrote:
> On Tue, Oct 26, 2010 at 08:59:47AM -0400, J. Bruce Fields wrote:
> > On Tue, Oct 26, 2010 at 12:28:20PM +1100, Neil Brown wrote:
> > > On Mon, 25 Oct 2010 20:30:08 -0400
> > > "J. Bruce Fields" <bfields@fieldses.org> wrote:
> > > > 
> > > >     svcrpc: simplify svc_close_xprt, fix minor race
> > > >     
> > > >     We're assuming that in the XPT_BUSY case, somebody else will take care
> > > >     of the close.  But that's not necessarily the case (e.g. if
> > > >     svc_xprt_enqueue() exits due to a lack of write space).  So the close
> > > >     could in theory be delayed indefinitely.  We should be calling
> > > >     svc_xprt_enqueue() after every set of XPT_CLOSE.
> > > >     
> > > >     Also with svc_close_all() no longer relying on svc_close, it no longer
> > > >     needs to be able to immediately do the svc_delete_xprt() itself.
> > > >     
> > > >     Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> > > > 
> > > > diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> > > > index 8c018df..dfee123 100644
> > > > --- a/net/sunrpc/svc_xprt.c
> > > > +++ b/net/sunrpc/svc_xprt.c
> > > > @@ -928,11 +928,7 @@ void svc_delete_xprt(struct svc_xprt *xprt)
> > > >  void svc_close_xprt(struct svc_xprt *xprt)
> > > >  {
> > > >  	set_bit(XPT_CLOSE, &xprt->xpt_flags);
> > > > -	if (test_and_set_bit(XPT_BUSY, &xprt->xpt_flags))
> > > > -		/* someone else will have to effect the close */
> > > > -		return;
> > > > -
> > > > -	svc_delete_xprt(xprt);
> > > > +	svc_xprt_enqueue(xprt);
> > > >  }
> > > >  EXPORT_SYMBOL_GPL(svc_close_xprt);
> > > >  
> > > 
> > > I'm not quite so sure about this one.
> > > 
> > > svc_close_xprt gets called when user-space writes some magic string to some
> > > magic file.  This could be done when there are no active nfsd threads.
> > > It is a bit far fetched, but you could tell nfsd to open a socket, then tell
> > > it to close it, before telling it to start any threads.
> > > 
> > > In that case the socket would stay open until the first thread was started.
> > 
> > Thanks, that's an odd case, but I think you're right.  OK:
> 
> Actually.... I think the real race here is the fault of the write-space
> check: once we've taken XPT_BUSY, we really do need to guarantee that
> we'll arrange to have svc_xprt_enqueue() called again after clearing
> XPT_BUSY, to avoid missing any events that arrived while we were
> processing ours.  So putting the wspace check after taking XPT_BUSY in
> svc_xprt_enqueue looks to me like a bug that could cause processing to
> stall unnecessarily in some unlikely cases.

I'm going to apply this for 2.6.38 if you can't think of a reason not
to.

(My one last-minute worry was whether there was some reason
xpo_has_wspace() needed to be called with XPT_BUSY held; but I can't
see any.)

--b.

> 
> --b.
> 
> commit cf501cb479873d6e77abba131ea40c11aa924d72
> Author: J. Bruce Fields <bfields@redhat.com>
> Date:   Tue Oct 26 11:32:03 2010 -0400
> 
>     svcrpc: fix wspace-checking race
>     
>     We call svc_xprt_enqueue() after something happens which we think may
>     require handling from a server thread.  To avoid such events being lost,
>     svc_xprt_enqueue() must guarantee that there will be a svc_recv() call
>     from a server thread following any such event.  It does that by either
>     waking up a server thread itself, or checking that XPT_BUSY is set (in
>     which case somebody else is doing it).
>     
>     But the check of XPT_BUSY could occur just as someone finishes
>     processing some other event, and just before they clear XPT_BUSY.
>     
>     Therefore it's important not to clear XPT_BUSY without subsequently
>     doing another svc_xprt_enqueue() to check whether the xprt should be
>     requeued.
>     
>     The xpo_has_wspace() check in svc_xprt_enqueue() breaks this rule, allowing
>     an event to be missed in situations like:
>     
>     	data arrives
>     	call svc_tcp_data_ready():
>     	call svc_xprt_enqueue():
>     	set BUSY
>     	find no write space
>     				svc_reserve():
>     				free up write space
>     				call svc_xprt_enqueue():
>     				test BUSY
>     	clear BUSY
>     
>     So, instead, check wspace in the same places that the state flags are
>     checked: before taking BUSY, and in svc_recv().
>     
>     Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> 
> diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
> index d2261fe..9c690fa 100644
> --- a/include/linux/sunrpc/svc_xprt.h
> +++ b/include/linux/sunrpc/svc_xprt.h
> @@ -92,12 +92,17 @@ static inline void unregister_xpt_user(struct svc_xprt *xpt, struct svc_xpt_user
>  static inline int register_xpt_user(struct svc_xprt *xpt, struct svc_xpt_user *u)
>  {
>  	spin_lock(&xpt->xpt_lock);
> -	list_add(&u->list, &xpt->xpt_users);
> -	spin_unlock(&xpt->xpt_lock);
>  	if (test_bit(XPT_CLOSE, &xpt->xpt_flags)) {
> -		unregister_xpt_user(xpt, u);
> +		/*
> +		 * The connection is about to be deleted soon (or,
> +		 * worse, may already be deleted--in which case we've
> +		 * already notified the xpt_users).
> +		 */
> +		spin_unlock(&xpt->xpt_lock);
>  		return -ENOTCONN;
>  	}
> +	list_add(&u->list, &xpt->xpt_users);
> +	spin_unlock(&xpt->xpt_lock);
>  	return 0;
>  }
>  
> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> index 99996ad..04238a9 100644
> --- a/net/sunrpc/svc_xprt.c
> +++ b/net/sunrpc/svc_xprt.c
> @@ -302,6 +302,15 @@ static void svc_thread_dequeue(struct svc_pool *pool, struct svc_rqst *rqstp)
>  	list_del(&rqstp->rq_list);
>  }
>  
> +static bool svc_xprt_has_something_to_do(struct svc_xprt *xprt)
> +{
> +	if (xprt->xpt_flags & ((1<<XPT_CONN)|(1<<XPT_CLOSE)))
> +		return true;
> +	if (xprt->xpt_flags & ((1<<XPT_DATA)|(1<<XPT_DEFERRED)))
> +		return xprt->xpt_ops->xpo_has_wspace(xprt);
> +	return false;
> +}
> +
>  /*
>   * Queue up a transport with data pending. If there are idle nfsd
>   * processes, wake 'em up.
> @@ -314,8 +323,7 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
>  	struct svc_rqst	*rqstp;
>  	int cpu;
>  
> -	if (!(xprt->xpt_flags &
> -	      ((1<<XPT_CONN)|(1<<XPT_DATA)|(1<<XPT_CLOSE)|(1<<XPT_DEFERRED))))
> +	if (!svc_xprt_has_something_to_do(xprt))
>  		return;
>  
>  	cpu = get_cpu();
> @@ -345,25 +353,6 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
>  	BUG_ON(xprt->xpt_pool != NULL);
>  	xprt->xpt_pool = pool;
>  
> -	/* Handle pending connection */
> -	if (test_bit(XPT_CONN, &xprt->xpt_flags))
> -		goto process;
> -
> -	/* Handle close in-progress */
> -	if (test_bit(XPT_CLOSE, &xprt->xpt_flags))
> -		goto process;
> -
> -	/* Check if we have space to reply to a request */
> -	if (!xprt->xpt_ops->xpo_has_wspace(xprt)) {
> -		/* Don't enqueue while not enough space for reply */
> -		dprintk("svc: no write space, transport %p  not enqueued\n",
> -			xprt);
> -		xprt->xpt_pool = NULL;
> -		clear_bit(XPT_BUSY, &xprt->xpt_flags);
> -		goto out_unlock;
> -	}
> -
> - process:
>  	if (!list_empty(&pool->sp_threads)) {
>  		rqstp = list_entry(pool->sp_threads.next,
>  				   struct svc_rqst,
> @@ -744,7 +733,7 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
>  			spin_unlock_bh(&serv->sv_lock);
>  			svc_xprt_received(newxpt);
>  		}
> -	} else {
> +	} else if (xprt->xpt_ops->xpo_has_wspace(xprt)) {
>  		dprintk("svc: server %p, pool %u, transport %p, inuse=%d\n",
>  			rqstp, pool->sp_id, xprt,
>  			atomic_read(&xprt->xpt_ref.refcount));
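
For readers following the interleaving in the commit message, here is a
minimal single-threaded sketch that replays it step by step. It is a toy
model, not kernel code: struct toy_xprt, toy_has_something_to_do(),
toy_enqueue_new(), toy_replay_old() and toy_replay_new() are all
hypothetical names, the booleans only stand in for XPT_DATA, XPT_BUSY and
the result of xpo_has_wspace(), and the pool locking and thread wakeup are
reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a transport; none of these names are kernel API. */
struct toy_xprt {
	bool data;	/* stands in for XPT_DATA */
	bool busy;	/* stands in for XPT_BUSY */
	bool wspace;	/* stands in for the xpo_has_wspace() result */
	int wakeups;	/* times a server thread would have been woken */
};

/* Models the patched predicate, XPT_DATA case only. */
static bool toy_has_something_to_do(const struct toy_xprt *x)
{
	return x->data && x->wspace;
}

/* Patched ordering: check flags and write space before taking BUSY. */
static void toy_enqueue_new(struct toy_xprt *x)
{
	if (!toy_has_something_to_do(x))
		return;
	if (x->busy)
		return;		/* somebody else is handling it */
	x->busy = true;
	x->wakeups++;
}

/* Replays the commit message's interleaving against the old ordering. */
int toy_replay_old(void)
{
	struct toy_xprt x = { .data = true };

	/* A: data arrived, BUSY is clear, so take BUSY ... */
	x.busy = true;
	/* ... and only then look at write space (none yet). */
	bool a_saw_space = x.wspace;

	/* B: frees write space, enqueues, sees BUSY set, returns. */
	x.wspace = true;
	if (!x.busy)
		x.wakeups++;

	/* A: no space was seen, so clear BUSY without requeueing. */
	if (a_saw_space)
		x.wakeups++;
	else
		x.busy = false;

	return x.wakeups;
}

/* Same sequence of events with the patched ordering. */
int toy_replay_new(void)
{
	struct toy_xprt x = { .data = true };

	toy_enqueue_new(&x);	/* A: no write space, BUSY never taken */
	x.wspace = true;	/* B: frees write space */
	toy_enqueue_new(&x);	/* B: takes BUSY and wakes a thread */

	return x.wakeups;
}
```

In this model toy_replay_old() ends with data pending, write space
available and nobody woken, which is the missed event described above;
toy_replay_new() wakes exactly one thread. The sketch says nothing about
the second wspace check in svc_recv(), which the real patch also adds.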
