From: "J. Bruce Fields" <bfields@fieldses.org>
To: NeilBrown <neilb@suse.de>
Cc: linux-nfs@vger.kernel.org
Subject: Re: [PATCH 5/7] sunrpc/cache: allow thread manager more control of whether threads can wait for upcalls
Date: Wed, 22 Sep 2010 14:36:22 -0400 [thread overview]
Message-ID: <20100922183622.GD26903@fieldses.org> (raw)
In-Reply-To: <20100922025507.31745.30398.stgit@localhost.localdomain>
On Wed, Sep 22, 2010 at 12:55:07PM +1000, NeilBrown wrote:
> Rather than blindly setting a timeout based on a course idea of
> busy-ness, allow the 'cache' to call into the 'rqstp' manager to
> request permission to wait for an upcall, and how long to wait for.
>
> This allows the thread manager to know how many threads are waiting
> and reduce the permitted timeout when there are a large number.
>
> The same code can check if waiting makes any sense (which it doesn't
> on single-threaded services) or if deferral is allowed (which it isn't
> e.g. for NFSv4).
>
> The current heuristic is to allow a long wait (30 sec) if fewer than
> 1/2 the threads are waiting, and to allow no wait at all if there are
> more than 1/2 already waiting.
>
> This provides better guaranties that slow responses to upcalls will
> not block too many threads for too long.
I suppose you're probably looking for comments on the idea rather than
the particular choice of heuristic, but: wasn't one of the motivations
that a cache flush in the midst of heavy write traffic could cause long
delays due to writes being dropped?
That seems like a case where most threads may handling rpc's, but
waiting is still preferable to dropping.
--b.
>
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
> include/linux/sunrpc/cache.h | 8 ++++--
> include/linux/sunrpc/svc.h | 1 +
> net/sunrpc/cache.c | 30 ++++++++++++-----------
> net/sunrpc/svc_xprt.c | 54 +++++++++++++++++++++++++++++++++---------
> 4 files changed, 64 insertions(+), 29 deletions(-)
>
> diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
> index 0349635..e2f5e56 100644
> --- a/include/linux/sunrpc/cache.h
> +++ b/include/linux/sunrpc/cache.h
> @@ -125,9 +125,11 @@ struct cache_detail {
> */
> struct cache_req {
> struct cache_deferred_req *(*defer)(struct cache_req *req);
> - int thread_wait; /* How long (jiffies) we can block the
> - * current thread to wait for updates.
> - */
> + /* See how long (jiffies) we can block the
> + * current thread to wait for updates, and register
> + * (or unregister) that we are waiting.
> + */
> + int (*set_waiter)(struct cache_req *req, int set);
> };
> /* this must be embedded in a deferred_request that is being
> * delayed awaiting cache-fill
> diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
> index 5a3085b..060029a 100644
> --- a/include/linux/sunrpc/svc.h
> +++ b/include/linux/sunrpc/svc.h
> @@ -48,6 +48,7 @@ struct svc_pool {
> struct list_head sp_threads; /* idle server threads */
> struct list_head sp_sockets; /* pending sockets */
> unsigned int sp_nrthreads; /* # of threads in pool */
> + unsigned int sp_nrwaiting; /* # of threads waiting on callbacks */
> struct list_head sp_all_threads; /* all server threads */
> struct svc_pool_stats sp_stats; /* statistics on pool operation */
> } ____cacheline_aligned_in_smp;
> diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
> index 1963e2a..b14d0ec 100644
> --- a/net/sunrpc/cache.c
> +++ b/net/sunrpc/cache.c
> @@ -558,7 +558,7 @@ static void cache_restart_thread(struct cache_deferred_req *dreq, int too_many)
> complete(&dr->completion);
> }
>
> -static int cache_wait_req(struct cache_req *req, struct cache_head *item)
> +static int cache_wait_req(struct cache_req *req, struct cache_head *item, int timeout)
> {
> struct thread_deferred_req sleeper;
> struct cache_deferred_req *dreq = &sleeper.handle;
> @@ -571,7 +571,7 @@ static int cache_wait_req(struct cache_req *req, struct cache_head *item)
>
> if (ret ||
> wait_for_completion_interruptible_timeout(
> - &sleeper.completion, req->thread_wait) <= 0) {
> + &sleeper.completion, timeout) <= 0) {
> /* The completion wasn't completed, so we need
> * to clean up
> */
> @@ -643,20 +643,22 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
> {
> struct cache_deferred_req *dreq;
> int ret;
> + int timeout;
>
> - if (req->thread_wait) {
> - ret = cache_wait_req(req, item);
> - if (ret != -ETIMEDOUT)
> - return ret;
> + timeout = req->set_waiter(req, 1);
> + if (timeout) {
> + ret = cache_wait_req(req, item, timeout);
> + req->set_waiter(req, 0);
> + return ret;
> + } else {
> + dreq = req->defer(req);
> + if (dreq == NULL)
> + return -ENOMEM;
> + if (setup_deferral(dreq, item) < 0)
> + return -EAGAIN;
> + cache_limit_defers(dreq);
> + return 0;
> }
> - dreq = req->defer(req);
> - if (dreq == NULL)
> - return -ENOMEM;
> - if (setup_deferral(dreq, item) < 0)
> - return -EAGAIN;
> -
> - cache_limit_defers(dreq);
> - return 0;
> }
>
> static void cache_revisit_request(struct cache_head *item)
> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> index 95fc3e8..4d4032c 100644
> --- a/net/sunrpc/svc_xprt.c
> +++ b/net/sunrpc/svc_xprt.c
> @@ -20,6 +20,7 @@
> static struct svc_deferred_req *svc_deferred_dequeue(struct svc_xprt *xprt);
> static int svc_deferred_recv(struct svc_rqst *rqstp);
> static struct cache_deferred_req *svc_defer(struct cache_req *req);
> +static int svc_set_waiter(struct cache_req *req, int add);
> static void svc_age_temp_xprts(unsigned long closure);
>
> /* apparently the "standard" is that clients close
> @@ -651,11 +652,6 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
> if (signalled() || kthread_should_stop())
> return -EINTR;
>
> - /* Normally we will wait up to 5 seconds for any required
> - * cache information to be provided.
> - */
> - rqstp->rq_chandle.thread_wait = 5*HZ;
> -
> spin_lock_bh(&pool->sp_lock);
> xprt = svc_xprt_dequeue(pool);
> if (xprt) {
> @@ -663,12 +659,6 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
> svc_xprt_get(xprt);
> rqstp->rq_reserved = serv->sv_max_mesg;
> atomic_add(rqstp->rq_reserved, &xprt->xpt_reserved);
> -
> - /* As there is a shortage of threads and this request
> - * had to be queued, don't allow the thread to wait so
> - * long for cache updates.
> - */
> - rqstp->rq_chandle.thread_wait = 1*HZ;
> } else {
> /* No data pending. Go to sleep */
> svc_thread_enqueue(pool, rqstp);
> @@ -772,6 +762,7 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
>
> rqstp->rq_secure = svc_port_is_privileged(svc_addr(rqstp));
> rqstp->rq_chandle.defer = svc_defer;
> + rqstp->rq_chandle.set_waiter = svc_set_waiter;
>
> if (serv->sv_stats)
> serv->sv_stats->netcnt++;
> @@ -987,7 +978,7 @@ static struct cache_deferred_req *svc_defer(struct cache_req *req)
> struct svc_rqst *rqstp = container_of(req, struct svc_rqst, rq_chandle);
> struct svc_deferred_req *dr;
>
> - if (rqstp->rq_arg.page_len || !rqstp->rq_usedeferral)
> + if (rqstp->rq_arg.page_len)
> return NULL; /* if more than a page, give up FIXME */
> if (rqstp->rq_deferred) {
> dr = rqstp->rq_deferred;
> @@ -1065,6 +1056,45 @@ static struct svc_deferred_req *svc_deferred_dequeue(struct svc_xprt *xprt)
> return dr;
> }
>
> +/*
> + * If 'add', the cache wants to wait for user-space to respond to a request.
> + * We return the number of jiffies to wait. 0 means not to wait at all but
> + * to try deferral instead.
> + * If not 'add', then the wait is over and we can discount this one.
> + */
> +static int svc_set_waiter(struct cache_req *req, int add)
> +{
> + struct svc_rqst *rqstp = container_of(req, struct svc_rqst, rq_chandle);
> + struct svc_pool *pool = rqstp->rq_pool;
> + int rv = 0;
> +
> + spin_lock_bh(&pool->sp_lock);
> + if (add) {
> + if (pool->sp_nrthreads <= 1 && rqstp->rq_usedeferral)
> + /* single threaded - never wait */
> + rv = 0;
> + else if (pool->sp_nrwaiting * 2 > pool->sp_nrthreads)
> + /* > 1/2 are waiting already, something looks wrong,
> + * Don't defer or wait */
> + rv = 1;
> + else
> + /* Wait a reasonably long time */
> + rv = 30 * HZ;
> +
> + if (!rqstp->rq_usedeferral && rv == 0)
> + /* just double-checking - we mustn't allow deferral */
> + rv = 1;
> +
> + if (rv)
> + pool->sp_nrwaiting++;
> + } else {
> + BUG_ON(pool->sp_nrwaiting == 0);
> + pool->sp_nrwaiting--;
> + }
> + spin_unlock_bh(&pool->sp_lock);
> + return rv;
> +}
> +
> /**
> * svc_find_xprt - find an RPC transport instance
> * @serv: pointer to svc_serv to search
>
>
next prev parent reply other threads:[~2010-09-22 18:37 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-09-22 2:55 [PATCH 0/7] Assorted nfsd patches for 2.6.37 NeilBrown
2010-09-22 2:55 ` [PATCH 2/7] sunrpc/cache: fix recent breakage of cache_clean_deferred NeilBrown
[not found] ` <20100922025506.31745.74964.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2010-09-22 18:27 ` J. Bruce Fields
2010-09-22 2:55 ` [PATCH 1/7] sunrpc: fix race in new cache_wait code NeilBrown
2010-09-22 17:50 ` J. Bruce Fields
2010-09-23 3:00 ` Neil Brown
2010-09-23 3:25 ` J. Bruce Fields
2010-09-23 14:46 ` J. Bruce Fields
2010-10-01 23:09 ` J. Bruce Fields
2010-10-02 0:12 ` Neil Brown
2010-09-22 2:55 ` [PATCH 7/7] nfsd: allow deprecated interface to be compiled out NeilBrown
2010-09-22 2:55 ` [PATCH 4/7] sunrpc/cache: centralise handling of size limit on deferred list NeilBrown
[not found] ` <20100922025507.31745.61919.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2010-09-22 18:31 ` J. Bruce Fields
2010-09-23 3:02 ` Neil Brown
2010-09-22 2:55 ` [PATCH 6/7] nfsd: formally deprecate legacy nfsd syscall interface NeilBrown
[not found] ` <20100922025507.31745.57024.stgit-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2010-09-22 3:10 ` J. Bruce Fields
2010-09-22 2:55 ` [PATCH 5/7] sunrpc/cache: allow thread manager more control of whether threads can wait for upcalls NeilBrown
2010-09-22 18:36 ` J. Bruce Fields [this message]
2010-09-23 3:23 ` Neil Brown
2010-09-22 2:55 ` [PATCH 3/7] sunrpc/cache: change deferred-request hash table to use hlist NeilBrown
2010-09-22 2:59 ` J. Bruce Fields
2010-09-22 4:51 ` Neil Brown
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100922183622.GD26903@fieldses.org \
--to=bfields@fieldses.org \
--cc=linux-nfs@vger.kernel.org \
--cc=neilb@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.