[PATCH 00/12] Some improvements to request deferral and related code

public inbox for linux-nfs@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH 00/12] Some improvements to request deferral and related code
@ 2009-08-04  5:22 NeilBrown
       [not found] ` <20090804051145.15929.11356.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
  0 siblings, 1 reply; 26+ messages in thread
From: NeilBrown @ 2009-08-04  5:22 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

Hi Bruce and all.

 When I wrote the authentication cache all those years ago I handled
 cache misses by arranging to simply retry the whole request, either
 by saving a copy for be replayed later, or by waiting for the client
 to resend.

 As we know this isn't ideal.  Large requests (e.g. writes) don't get
 saved, TCP connections don't resend in a suitable time, and NFSv4
 request are sufficiently complex that restarting from the beginning
 isn't really a good idea.

 So I have finally implemented the "wait a while" approach which has
 probably been suggested several times and is even implemented in a
 somewhat hackish way for nfsidmap.

 Yes, I know I have probably argued against this in the past.  I was
 wrong.

 This series fixes a few little bugs and tidies up some code but does
 two main important things.

 1/ 'allow thread to block....' will wait a little while if there is a
 cache miss.  If an answer is available in that time, it continues on
 it's merry way.  If no answer arrives, the old deferral approach is
 used.  It waits 5 seconds if there are spare nfsd threads, and 1
 second if there all threads are busy.  I have almost nothing with
 which to justify these numbers.

 2/ 'close connection when...' provides a better fall back for TCP.
 If the sunrpc layer decides to drop the request without queueing it
 at all, it will close the connection.  This encourages the client to
 retry sooner.

 Please review and comment.

 If you like them all, they can be pulled from

      git://neil.brown.name/linux-2.6/ nfsd

Thanks,
NeilBrown

---

NeilBrown (12):
      sunrpc: close connection when a request is irretrievably lost.
      sunrpc/cache: change deferred-request hash table to use hlist.
      sunrpc: fix memory leak in unix_gid cache.
      nfsd/idmap: drop special request deferal in favour of improved default.
      sunrpc/cache: retry cache lookups that return -ETIMEDOUT
      sunrpc/cache: allow thread to block while waiting for cache update.
      sunrpc/cache: avoid variable over-loading in cache_defer_req
      sunrpc/cache: use list_del_init for the list_head entries in cache_deferred_req
      sunrpc/cache: recheck cache validity after cache_defer_req
      sunrpc/cache: simplify cache_fresh_locked and cache_fresh_unlocked.
      sunrpc/cache: make sure deferred requests eventually get revisited.
      sunrpc/cache: rename queue_loose to cache_dequeue

 fs/nfsd/export.c               |   18 ++++
 fs/nfsd/nfs4idmap.c            |  105 +++---------------------
 include/linux/sunrpc/cache.h   |    5 +
 include/linux/sunrpc/svcauth.h |   10 ++
 net/sunrpc/cache.c             |  173 ++++++++++++++++++++++++++--------------
 net/sunrpc/svc.c               |    3 +
 net/sunrpc/svc_xprt.c          |   11 +++
 net/sunrpc/svcauth_unix.c      |   31 ++++++-
 8 files changed, 192 insertions(+), 164 deletions(-)

-- 

^ permalink raw reply	[flat|nested] 26+ messages in thread

[parent not found: <20090804051145.15929.11356.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>]

* [PATCH 03/12] sunrpc/cache: simplify cache_fresh_locked and cache_fresh_unlocked.
       [not found] ` <20090804051145.15929.11356.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
@ 2009-08-04  5:22   ` NeilBrown
       [not found]     ` <20090804052238.15929.17142.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
  2009-08-04  5:22   ` [PATCH 04/12] sunrpc/cache: recheck cache validity after cache_defer_req NeilBrown
                     ` (11 subsequent siblings)
  12 siblings, 1 reply; 26+ messages in thread
From: NeilBrown @ 2009-08-04  5:22 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, NeilBrown

The extra call to cache_revisit_request in cache_fresh_unlocked is not
needed, as should have been fairly clear at the time of
   commit 4013edea9a0b6cdcb1fdf5d4011e47e068fd6efb

If there are requests to be revisited, then we can be sure that
CACHE_PENDING is set, so the second call is sufficient.

So remove the first call.
Then remove the 'new' parameter,
then remove the return value for cache_fresh_locked which is only used
to provide the value for 'new'.

Signed-off-by: NeilBrown <neilb@suse.de>
---

 net/sunrpc/cache.c |   23 ++++++++++-------------
 1 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 44f4516..c1f897c 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -103,18 +103,16 @@ EXPORT_SYMBOL_GPL(sunrpc_cache_lookup);
 
 static void cache_dequeue(struct cache_detail *detail, struct cache_head *ch);
 
-static int cache_fresh_locked(struct cache_head *head, time_t expiry)
+static void cache_fresh_locked(struct cache_head *head, time_t expiry)
 {
 	head->expiry_time = expiry;
 	head->last_refresh = get_seconds();
-	return !test_and_set_bit(CACHE_VALID, &head->flags);
+	set_bit(CACHE_VALID, &head->flags);
 }
 
 static void cache_fresh_unlocked(struct cache_head *head,
-			struct cache_detail *detail, int new)
+				 struct cache_detail *detail)
 {
-	if (new)
-		cache_revisit_request(head);
 	if (test_and_clear_bit(CACHE_PENDING, &head->flags)) {
 		cache_revisit_request(head);
 		cache_dequeue(detail, head);
@@ -130,7 +128,6 @@ struct cache_head *sunrpc_cache_update(struct cache_detail *detail,
 	 */
 	struct cache_head **head;
 	struct cache_head *tmp;
-	int is_new;
 
 	if (!test_bit(CACHE_VALID, &old->flags)) {
 		write_lock(&detail->hash_lock);
@@ -139,9 +136,9 @@ struct cache_head *sunrpc_cache_update(struct cache_detail *detail,
 				set_bit(CACHE_NEGATIVE, &old->flags);
 			else
 				detail->update(old, new);
-			is_new = cache_fresh_locked(old, new->expiry_time);
+			cache_fresh_locked(old, new->expiry_time);
 			write_unlock(&detail->hash_lock);
-			cache_fresh_unlocked(old, detail, is_new);
+			cache_fresh_unlocked(old, detail);
 			return old;
 		}
 		write_unlock(&detail->hash_lock);
@@ -165,11 +162,11 @@ struct cache_head *sunrpc_cache_update(struct cache_detail *detail,
 	*head = tmp;
 	detail->entries++;
 	cache_get(tmp);
-	is_new = cache_fresh_locked(tmp, new->expiry_time);
+	cache_fresh_locked(tmp, new->expiry_time);
 	cache_fresh_locked(old, 0);
 	write_unlock(&detail->hash_lock);
-	cache_fresh_unlocked(tmp, detail, is_new);
-	cache_fresh_unlocked(old, detail, 0);
+	cache_fresh_unlocked(tmp, detail);
+	cache_fresh_unlocked(old, detail);
 	cache_put(old, detail);
 	return tmp;
 }
@@ -224,8 +221,8 @@ int cache_check(struct cache_detail *detail,
 				cache_revisit_request(h);
 				if (rv == -EAGAIN) {
 					set_bit(CACHE_NEGATIVE, &h->flags);
-					cache_fresh_unlocked(h, detail,
-					     cache_fresh_locked(h, get_seconds()+CACHE_NEW_EXPIRY));
+					cache_fresh_locked(h, get_seconds()+CACHE_NEW_EXPIRY);
+					cache_fresh_unlocked(h, detail);
 					rv = -ENOENT;
 				}
 				break;



^ permalink raw reply related	[flat|nested] 26+ messages in thread

[parent not found: <20090804052238.15929.17142.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>]

* Re: [PATCH 03/12] sunrpc/cache: simplify cache_fresh_locked and cache_fresh_unlocked.
       [not found]     ` <20090804052238.15929.17142.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
@ 2009-08-04 15:45       ` J. Bruce Fields
  0 siblings, 0 replies; 26+ messages in thread
From: J. Bruce Fields @ 2009-08-04 15:45 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-nfs

On Tue, Aug 04, 2009 at 03:22:38PM +1000, NeilBrown wrote:
> The extra call to cache_revisit_request in cache_fresh_unlocked is not
> needed, as should have been fairly clear at the time of
>    commit 4013edea9a0b6cdcb1fdf5d4011e47e068fd6efb
> 
> If there are requests to be revisited, then we can be sure that
> CACHE_PENDING is set, so the second call is sufficient.

Yes: the only cache_make_upcall() caller (cache_check()) sets
CACHE_PENDING first.  And a grep for CACHE_PENDING verifies that
everyone that clears it immediately calls either cache_revisit_request()
or cache_dequeue().  OK!

Thanks, applied.--b.

> 
> So remove the first call.
> Then remove the 'new' parameter,
> then remove the return value for cache_fresh_locked which is only used
> to provide the value for 'new'.
> 
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
> 
>  net/sunrpc/cache.c |   23 ++++++++++-------------
>  1 files changed, 10 insertions(+), 13 deletions(-)
> 
> diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
> index 44f4516..c1f897c 100644
> --- a/net/sunrpc/cache.c
> +++ b/net/sunrpc/cache.c
> @@ -103,18 +103,16 @@ EXPORT_SYMBOL_GPL(sunrpc_cache_lookup);
>  
>  static void cache_dequeue(struct cache_detail *detail, struct cache_head *ch);
>  
> -static int cache_fresh_locked(struct cache_head *head, time_t expiry)
> +static void cache_fresh_locked(struct cache_head *head, time_t expiry)
>  {
>  	head->expiry_time = expiry;
>  	head->last_refresh = get_seconds();
> -	return !test_and_set_bit(CACHE_VALID, &head->flags);
> +	set_bit(CACHE_VALID, &head->flags);
>  }
>  
>  static void cache_fresh_unlocked(struct cache_head *head,
> -			struct cache_detail *detail, int new)
> +				 struct cache_detail *detail)
>  {
> -	if (new)
> -		cache_revisit_request(head);
>  	if (test_and_clear_bit(CACHE_PENDING, &head->flags)) {
>  		cache_revisit_request(head);
>  		cache_dequeue(detail, head);
> @@ -130,7 +128,6 @@ struct cache_head *sunrpc_cache_update(struct cache_detail *detail,
>  	 */
>  	struct cache_head **head;
>  	struct cache_head *tmp;
> -	int is_new;
>  
>  	if (!test_bit(CACHE_VALID, &old->flags)) {
>  		write_lock(&detail->hash_lock);
> @@ -139,9 +136,9 @@ struct cache_head *sunrpc_cache_update(struct cache_detail *detail,
>  				set_bit(CACHE_NEGATIVE, &old->flags);
>  			else
>  				detail->update(old, new);
> -			is_new = cache_fresh_locked(old, new->expiry_time);
> +			cache_fresh_locked(old, new->expiry_time);
>  			write_unlock(&detail->hash_lock);
> -			cache_fresh_unlocked(old, detail, is_new);
> +			cache_fresh_unlocked(old, detail);
>  			return old;
>  		}
>  		write_unlock(&detail->hash_lock);
> @@ -165,11 +162,11 @@ struct cache_head *sunrpc_cache_update(struct cache_detail *detail,
>  	*head = tmp;
>  	detail->entries++;
>  	cache_get(tmp);
> -	is_new = cache_fresh_locked(tmp, new->expiry_time);
> +	cache_fresh_locked(tmp, new->expiry_time);
>  	cache_fresh_locked(old, 0);
>  	write_unlock(&detail->hash_lock);
> -	cache_fresh_unlocked(tmp, detail, is_new);
> -	cache_fresh_unlocked(old, detail, 0);
> +	cache_fresh_unlocked(tmp, detail);
> +	cache_fresh_unlocked(old, detail);
>  	cache_put(old, detail);
>  	return tmp;
>  }
> @@ -224,8 +221,8 @@ int cache_check(struct cache_detail *detail,
>  				cache_revisit_request(h);
>  				if (rv == -EAGAIN) {
>  					set_bit(CACHE_NEGATIVE, &h->flags);
> -					cache_fresh_unlocked(h, detail,
> -					     cache_fresh_locked(h, get_seconds()+CACHE_NEW_EXPIRY));
> +					cache_fresh_locked(h, get_seconds()+CACHE_NEW_EXPIRY);
> +					cache_fresh_unlocked(h, detail);
>  					rv = -ENOENT;
>  				}
>  				break;
> 
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 04/12] sunrpc/cache: recheck cache validity after cache_defer_req
       [not found] ` <20090804051145.15929.11356.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
  2009-08-04  5:22   ` [PATCH 03/12] sunrpc/cache: simplify cache_fresh_locked and cache_fresh_unlocked NeilBrown
@ 2009-08-04  5:22   ` NeilBrown
       [not found]     ` <20090804052238.15929.56800.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
  2009-08-04  5:22   ` [PATCH 01/12] sunrpc/cache: rename queue_loose to cache_dequeue NeilBrown
                     ` (10 subsequent siblings)
  12 siblings, 1 reply; 26+ messages in thread
From: NeilBrown @ 2009-08-04  5:22 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, NeilBrown

If cache_defer_req did not leave the request on a queue, then it could
possibly have waited long enough that the cache became valid.  So check the
status after the call.

Signed-off-by: NeilBrown <neilb@suse.de>
---

 net/sunrpc/cache.c |   53 ++++++++++++++++++++++++++++++++--------------------
 1 files changed, 33 insertions(+), 20 deletions(-)

diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index c1f897c..bff4baa 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -173,6 +173,22 @@ struct cache_head *sunrpc_cache_update(struct cache_detail *detail,
 EXPORT_SYMBOL_GPL(sunrpc_cache_update);
 
 static int cache_make_upcall(struct cache_detail *detail, struct cache_head *h);
+
+static inline int cache_is_valid(struct cache_detail *detail, struct cache_head *h)
+{
+	if (!test_bit(CACHE_VALID, &h->flags) ||
+	    h->expiry_time < get_seconds())
+		return -EAGAIN;
+	else if (detail->flush_time > h->last_refresh)
+		return -EAGAIN;
+	else {
+		/* entry is valid */
+		if (test_bit(CACHE_NEGATIVE, &h->flags))
+			return -ENOENT;
+		else
+			return 0;
+	}
+}
 /*
  * This is the generic cache management routine for all
  * the authentication caches.
@@ -181,8 +197,10 @@ static int cache_make_upcall(struct cache_detail *detail, struct cache_head *h);
  *
  *
  * Returns 0 if the cache_head can be used, or cache_puts it and returns
- * -EAGAIN if upcall is pending,
- * -ETIMEDOUT if upcall failed and should be retried,
+ * -EAGAIN if upcall is pending and request has been queued
+ * -ETIMEDOUT if upcall failed or request could not be queue or
+ *           upcall completed but item is still invalid (implying that
+ *           the cache item has been replaced with a newer one).
  * -ENOENT if cache entry was negative
  */
 int cache_check(struct cache_detail *detail,
@@ -192,17 +210,7 @@ int cache_check(struct cache_detail *detail,
 	long refresh_age, age;
 
 	/* First decide return status as best we can */
-	if (!test_bit(CACHE_VALID, &h->flags) ||
-	    h->expiry_time < get_seconds())
-		rv = -EAGAIN;
-	else if (detail->flush_time > h->last_refresh)
-		rv = -EAGAIN;
-	else {
-		/* entry is valid */
-		if (test_bit(CACHE_NEGATIVE, &h->flags))
-			rv = -ENOENT;
-		else rv = 0;
-	}
+	rv = cache_is_valid(detail, h);
 
 	/* now see if we want to start an upcall */
 	refresh_age = (h->expiry_time - h->last_refresh);
@@ -235,10 +243,14 @@ int cache_check(struct cache_detail *detail,
 		}
 	}
 
-	if (rv == -EAGAIN)
-		if (cache_defer_req(rqstp, h) != 0)
-			rv = -ETIMEDOUT;
-
+	if (rv == -EAGAIN) {
+		if (cache_defer_req(rqstp, h) == 0) {
+			/* Request is not deferred */
+			rv = cache_is_valid(detail, h);
+			if (rv == -EAGAIN)
+				rv = -ETIMEDOUT;
+		}
+	}
 	if (rv)
 		cache_put(h, detail);
 	return rv;
@@ -557,11 +569,11 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
 		 * or continue and drop the oldest below
 		 */
 		if (net_random()&1)
-			return -ETIMEDOUT;
+			return 0;
 	}
 	dreq = req->defer(req);
 	if (dreq == NULL)
-		return -ETIMEDOUT;
+		return 0;
 
 	dreq->item = item;
 
@@ -591,8 +603,9 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
 	if (!test_bit(CACHE_PENDING, &item->flags)) {
 		/* must have just been validated... */
 		cache_revisit_request(item);
+		return 0;
 	}
-	return 0;
+	return 1;
 }
 
 static void cache_revisit_request(struct cache_head *item)



^ permalink raw reply related	[flat|nested] 26+ messages in thread

[parent not found: <20090804052238.15929.56800.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>]

* Re: [PATCH 04/12] sunrpc/cache: recheck cache validity after cache_defer_req
       [not found]     ` <20090804052238.15929.56800.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
@ 2009-08-04 20:42       ` J. Bruce Fields
  2009-08-06  4:57         ` Neil Brown
  0 siblings, 1 reply; 26+ messages in thread
From: J. Bruce Fields @ 2009-08-04 20:42 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-nfs

On Tue, Aug 04, 2009 at 03:22:38PM +1000, NeilBrown wrote:
> If cache_defer_req did not leave the request on a queue, then it could
> possibly have waited long enough that the cache became valid.  So check the
> status after the call.
> 
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
> 
>  net/sunrpc/cache.c |   53 ++++++++++++++++++++++++++++++++--------------------
>  1 files changed, 33 insertions(+), 20 deletions(-)
> 
> diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
> index c1f897c..bff4baa 100644
> --- a/net/sunrpc/cache.c
> +++ b/net/sunrpc/cache.c
> @@ -173,6 +173,22 @@ struct cache_head *sunrpc_cache_update(struct cache_detail *detail,
>  EXPORT_SYMBOL_GPL(sunrpc_cache_update);
>  
>  static int cache_make_upcall(struct cache_detail *detail, struct cache_head *h);
> +
> +static inline int cache_is_valid(struct cache_detail *detail, struct cache_head *h)
> +{
> +	if (!test_bit(CACHE_VALID, &h->flags) ||
> +	    h->expiry_time < get_seconds())
> +		return -EAGAIN;
> +	else if (detail->flush_time > h->last_refresh)
> +		return -EAGAIN;
> +	else {
> +		/* entry is valid */
> +		if (test_bit(CACHE_NEGATIVE, &h->flags))
> +			return -ENOENT;
> +		else
> +			return 0;
> +	}
> +}
>  /*
>   * This is the generic cache management routine for all
>   * the authentication caches.
> @@ -181,8 +197,10 @@ static int cache_make_upcall(struct cache_detail *detail, struct cache_head *h);
>   *
>   *
>   * Returns 0 if the cache_head can be used, or cache_puts it and returns
> - * -EAGAIN if upcall is pending,
> - * -ETIMEDOUT if upcall failed and should be retried,
> + * -EAGAIN if upcall is pending and request has been queued
> + * -ETIMEDOUT if upcall failed or request could not be queue or

s/queue/queued/

> + *           upcall completed but item is still invalid (implying that
> + *           the cache item has been replaced with a newer one).
>   * -ENOENT if cache entry was negative
>   */
>  int cache_check(struct cache_detail *detail,
> @@ -192,17 +210,7 @@ int cache_check(struct cache_detail *detail,
>  	long refresh_age, age;
>  
>  	/* First decide return status as best we can */
> -	if (!test_bit(CACHE_VALID, &h->flags) ||
> -	    h->expiry_time < get_seconds())
> -		rv = -EAGAIN;
> -	else if (detail->flush_time > h->last_refresh)
> -		rv = -EAGAIN;
> -	else {
> -		/* entry is valid */
> -		if (test_bit(CACHE_NEGATIVE, &h->flags))
> -			rv = -ENOENT;
> -		else rv = 0;
> -	}
> +	rv = cache_is_valid(detail, h);
>  
>  	/* now see if we want to start an upcall */
>  	refresh_age = (h->expiry_time - h->last_refresh);
> @@ -235,10 +243,14 @@ int cache_check(struct cache_detail *detail,
>  		}
>  	}
>  
> -	if (rv == -EAGAIN)
> -		if (cache_defer_req(rqstp, h) != 0)
> -			rv = -ETIMEDOUT;
> -
> +	if (rv == -EAGAIN) {
> +		if (cache_defer_req(rqstp, h) == 0) {
> +			/* Request is not deferred */

The code might be more self-explanatory if we wrote:

		if (cache_defer_req(rqstp, h) == -ETIMEDOUT) {

Well, at least it would be obvious we're handling the "failure" case?
(Even if admittedly it's a "failure" that we may be able to handle).

It always takes me a little thought whenever I encounter a
boolean-returning function whose name doesn't have an obvious truth
value (list_empty, cache_is_valid).

No complaints otherwise.

--b.

> +			rv = cache_is_valid(detail, h);
> +			if (rv == -EAGAIN)
> +				rv = -ETIMEDOUT;
> +		}
> +	}
>  	if (rv)
>  		cache_put(h, detail);
>  	return rv;
> @@ -557,11 +569,11 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
>  		 * or continue and drop the oldest below
>  		 */
>  		if (net_random()&1)
> -			return -ETIMEDOUT;
> +			return 0;
>  	}
>  	dreq = req->defer(req);
>  	if (dreq == NULL)
> -		return -ETIMEDOUT;
> +		return 0;
>  
>  	dreq->item = item;
>  
> @@ -591,8 +603,9 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
>  	if (!test_bit(CACHE_PENDING, &item->flags)) {
>  		/* must have just been validated... */
>  		cache_revisit_request(item);
> +		return 0;
>  	}
> -	return 0;
> +	return 1;
>  }
>  
>  static void cache_revisit_request(struct cache_head *item)
> 
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 04/12] sunrpc/cache: recheck cache validity after cache_defer_req
  2009-08-04 20:42       ` J. Bruce Fields
@ 2009-08-06  4:57         ` Neil Brown
       [not found]           ` <19066.25248.283061.383233-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
  0 siblings, 1 reply; 26+ messages in thread
From: Neil Brown @ 2009-08-06  4:57 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

On Tuesday August 4, bfields@fieldses.org wrote:
> On Tue, Aug 04, 2009 at 03:22:38PM +1000, NeilBrown wrote:
> > If cache_defer_req did not leave the request on a queue, then it could
> > possibly have waited long enough that the cache became valid.  So check the
> > status after the call.
> > 
> > Signed-off-by: NeilBrown <neilb@suse.de>
> >   * Returns 0 if the cache_head can be used, or cache_puts it and returns
> > - * -EAGAIN if upcall is pending,
> > - * -ETIMEDOUT if upcall failed and should be retried,
> > + * -EAGAIN if upcall is pending and request has been queued
> > + * -ETIMEDOUT if upcall failed or request could not be queue or
> 
> s/queue/queued/
> 

:-)

> > @@ -235,10 +243,14 @@ int cache_check(struct cache_detail *detail,
> >  		}
> >  	}
> >  
> > -	if (rv == -EAGAIN)
> > -		if (cache_defer_req(rqstp, h) != 0)
> > -			rv = -ETIMEDOUT;
> > -
> > +	if (rv == -EAGAIN) {
> > +		if (cache_defer_req(rqstp, h) == 0) {
> > +			/* Request is not deferred */
> 
> The code might be more self-explanatory if we wrote:
> 
> 		if (cache_defer_req(rqstp, h) == -ETIMEDOUT) {
> 
> Well, at least it would be obvious we're handling the "failure" case?
> (Even if admittedly it's a "failure" that we may be able to handle).
> 
> It always takes me a little thought whenever I encounter a
> boolean-returning function whose name doesn't have an obvious truth
> value (list_empty, cache_is_valid).

I certainly see you point.  For consistency in the kernel, if the
function name doesn't sound like a boolean it should return 0 or
positive on success and negative for error.

But despite that I changed cache_defer_req to return 0 or 1 rather
than -ETIMEDOUT or 0...

There are three possibly results of cache_defer_req:
  a/ the request has been stored for later processing
  b/ there was a failure while trying to store the request
  c/ there was no need to store the request because the cache
     item is no longer waiting for a reply.

While 'a' is success and 'b' is an error, 'c' doesn't exactly fit in
to either.  However 'b' and 'c' are treated the same way by
cache_check.
So returning '-ETIMEDOUT' for both 'b' and 'c' seemed wrong.

The current return value is a true/false value for the assertion "the
request was successfully deferred".  But choosing a name for
cache_defer_req which makes that meaning obvious seems clumsy.

Thinks.....

Maybe 
   a -> 0 (success, we deferred the request)
   b -> -ENOMEM (failed to find somewhere to store the request)
   c -> -EAGAIN (something happened .. check again).

and in cache_check we write

   if (cache_defer_req(rqstp, h) < 0) {
         /* Request is not deferred */

which maybe a bit more self explanatory??

NeilBrown


>From c970b6abce98044de573336b3a867b7ed39642e4 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.de>
Date: Thu, 6 Aug 2009 14:56:13 +1000
Subject: [PATCH] sunrpc/cache: recheck cache validity after cache_defer_req

If cache_defer_req did not leave the request on a queue, then it could
possibly have waited long enough that the cache became valid.  So check the
status after the call.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 net/sunrpc/cache.c |   51 ++++++++++++++++++++++++++++++++-------------------
 1 files changed, 32 insertions(+), 19 deletions(-)

diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index c1f897c..cec2574 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -173,6 +173,22 @@ struct cache_head *sunrpc_cache_update(struct cache_detail *detail,
 EXPORT_SYMBOL_GPL(sunrpc_cache_update);
 
 static int cache_make_upcall(struct cache_detail *detail, struct cache_head *h);
+
+static inline int cache_is_valid(struct cache_detail *detail, struct cache_head *h)
+{
+	if (!test_bit(CACHE_VALID, &h->flags) ||
+	    h->expiry_time < get_seconds())
+		return -EAGAIN;
+	else if (detail->flush_time > h->last_refresh)
+		return -EAGAIN;
+	else {
+		/* entry is valid */
+		if (test_bit(CACHE_NEGATIVE, &h->flags))
+			return -ENOENT;
+		else
+			return 0;
+	}
+}
 /*
  * This is the generic cache management routine for all
  * the authentication caches.
@@ -181,8 +197,10 @@ static int cache_make_upcall(struct cache_detail *detail, struct cache_head *h);
  *
  *
  * Returns 0 if the cache_head can be used, or cache_puts it and returns
- * -EAGAIN if upcall is pending,
- * -ETIMEDOUT if upcall failed and should be retried,
+ * -EAGAIN if upcall is pending and request has been queued
+ * -ETIMEDOUT if upcall failed or request could not be queued or
+ *           upcall completed but item is still invalid (implying that
+ *           the cache item has been replaced with a newer one).
  * -ENOENT if cache entry was negative
  */
 int cache_check(struct cache_detail *detail,
@@ -192,17 +210,7 @@ int cache_check(struct cache_detail *detail,
 	long refresh_age, age;
 
 	/* First decide return status as best we can */
-	if (!test_bit(CACHE_VALID, &h->flags) ||
-	    h->expiry_time < get_seconds())
-		rv = -EAGAIN;
-	else if (detail->flush_time > h->last_refresh)
-		rv = -EAGAIN;
-	else {
-		/* entry is valid */
-		if (test_bit(CACHE_NEGATIVE, &h->flags))
-			rv = -ENOENT;
-		else rv = 0;
-	}
+	rv = cache_is_valid(detail, h);
 
 	/* now see if we want to start an upcall */
 	refresh_age = (h->expiry_time - h->last_refresh);
@@ -235,10 +243,14 @@ int cache_check(struct cache_detail *detail,
 		}
 	}
 
-	if (rv == -EAGAIN)
-		if (cache_defer_req(rqstp, h) != 0)
-			rv = -ETIMEDOUT;
-
+	if (rv == -EAGAIN) {
+		if (cache_defer_req(rqstp, h) < 0) {
+			/* Request is not deferred */
+			rv = cache_is_valid(detail, h);
+			if (rv == -EAGAIN)
+				rv = -ETIMEDOUT;
+		}
+	}
 	if (rv)
 		cache_put(h, detail);
 	return rv;
@@ -557,11 +569,11 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
 		 * or continue and drop the oldest below
 		 */
 		if (net_random()&1)
-			return -ETIMEDOUT;
+			return -ENOMEM;
 	}
 	dreq = req->defer(req);
 	if (dreq == NULL)
-		return -ETIMEDOUT;
+		return -ENOMEM;
 
 	dreq->item = item;
 
@@ -591,6 +603,7 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
 	if (!test_bit(CACHE_PENDING, &item->flags)) {
 		/* must have just been validated... */
 		cache_revisit_request(item);
+		return -EAGAIN;
 	}
 	return 0;
 }
-- 
1.6.3.3


^ permalink raw reply related	[flat|nested] 26+ messages in thread

[parent not found: <19066.25248.283061.383233-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>]

* Re: [PATCH 04/12] sunrpc/cache: recheck cache validity after cache_defer_req
       [not found]           ` <19066.25248.283061.383233-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
@ 2009-08-25 21:50             ` J. Bruce Fields
  2009-08-26  0:42               ` Neil Brown
  0 siblings, 1 reply; 26+ messages in thread
From: J. Bruce Fields @ 2009-08-25 21:50 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-nfs

On Thu, Aug 06, 2009 at 02:57:04PM +1000, Neil Brown wrote:
> On Tuesday August 4, bfields@fieldses.org wrote:
> > On Tue, Aug 04, 2009 at 03:22:38PM +1000, NeilBrown wrote:
> > > If cache_defer_req did not leave the request on a queue, then it could
> > > possibly have waited long enough that the cache became valid.  So check the
> > > status after the call.
> > > 
> > > Signed-off-by: NeilBrown <neilb@suse.de>
> > >   * Returns 0 if the cache_head can be used, or cache_puts it and returns
> > > - * -EAGAIN if upcall is pending,
> > > - * -ETIMEDOUT if upcall failed and should be retried,
> > > + * -EAGAIN if upcall is pending and request has been queued
> > > + * -ETIMEDOUT if upcall failed or request could not be queue or
> > 
> > s/queue/queued/
> > 
> 
> :-)
> 
> > > @@ -235,10 +243,14 @@ int cache_check(struct cache_detail *detail,
> > >  		}
> > >  	}
> > >  
> > > -	if (rv == -EAGAIN)
> > > -		if (cache_defer_req(rqstp, h) != 0)
> > > -			rv = -ETIMEDOUT;
> > > -
> > > +	if (rv == -EAGAIN) {
> > > +		if (cache_defer_req(rqstp, h) == 0) {
> > > +			/* Request is not deferred */
> > 
> > The code might be more self-explanatory if we wrote:
> > 
> > 		if (cache_defer_req(rqstp, h) == -ETIMEDOUT) {
> > 
> > Well, at least it would be obvious we're handling the "failure" case?
> > (Even if admittedly it's a "failure" that we may be able to handle).
> > 
> > It always takes me a little thought whenever I encounter a
> > boolean-returning function whose name doesn't have an obvious truth
> > value (list_empty, cache_is_valid).
> 
> I certainly see you point.  For consistency in the kernel, if the
> function name doesn't sound like a boolean it should return 0 or
> positive on success and negative for error.
> 
> But despite that I changed cache_defer_req to return 0 or 1 rather
> than -ETIMEDOUT or 0...
> 
> There are three possibly results of cache_defer_req:
>   a/ the request has been stored for later processing
>   b/ there was a failure while trying to store the request
>   c/ there was no need to store the request because the cache
>      item is no longer waiting for a reply.
> 
> While 'a' is success and 'b' is an error, 'c' doesn't exactly fit in
> to either.  However 'b' and 'c' are treated the same way by
> cache_check.
> So returning '-ETIMEDOUT' for both 'b' and 'c' seemed wrong.
> 
> The current return value is a true/false value for the assertion "the
> request was successfully deferred".  But choosing a name for
> cache_defer_req which makes that meaning obvious seems clumsy.
> 
> Thinks.....
> 
> Maybe 
>    a -> 0 (success, we deferred the request)
>    b -> -ENOMEM (failed to find somewhere to store the request)
>    c -> -EAGAIN (something happened .. check again).
> 
> and in cache_check we write
> 
>    if (cache_defer_req(rqstp, h) < 0) {
>          /* Request is not deferred */
> 
> which maybe a bit more self explanatory??

Yes, OK, seems reasonable enough.

So, apologies, I've lost track of the rest of your patches, but I'm
still very much interested.  Will you have a chance to rebase and
resend?  My for-2.6.32 branch at

	git://linux-nfs.org/~bfields/linux.git for-2.6.32

has what I've applied so far, plus a merge of Trond's queued patches
(which includes some changes to the cache code).

--b.

> 
> NeilBrown
> 
> 
> >From c970b6abce98044de573336b3a867b7ed39642e4 Mon Sep 17 00:00:00 2001
> From: NeilBrown <neilb@suse.de>
> Date: Thu, 6 Aug 2009 14:56:13 +1000
> Subject: [PATCH] sunrpc/cache: recheck cache validity after cache_defer_req
> 
> If cache_defer_req did not leave the request on a queue, then it could
> possibly have waited long enough that the cache became valid.  So check the
> status after the call.
> 
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
>  net/sunrpc/cache.c |   51 ++++++++++++++++++++++++++++++++-------------------
>  1 files changed, 32 insertions(+), 19 deletions(-)
> 
> diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
> index c1f897c..cec2574 100644
> --- a/net/sunrpc/cache.c
> +++ b/net/sunrpc/cache.c
> @@ -173,6 +173,22 @@ struct cache_head *sunrpc_cache_update(struct cache_detail *detail,
>  EXPORT_SYMBOL_GPL(sunrpc_cache_update);
>  
>  static int cache_make_upcall(struct cache_detail *detail, struct cache_head *h);
> +
> +static inline int cache_is_valid(struct cache_detail *detail, struct cache_head *h)
> +{
> +	if (!test_bit(CACHE_VALID, &h->flags) ||
> +	    h->expiry_time < get_seconds())
> +		return -EAGAIN;
> +	else if (detail->flush_time > h->last_refresh)
> +		return -EAGAIN;
> +	else {
> +		/* entry is valid */
> +		if (test_bit(CACHE_NEGATIVE, &h->flags))
> +			return -ENOENT;
> +		else
> +			return 0;
> +	}
> +}
>  /*
>   * This is the generic cache management routine for all
>   * the authentication caches.
> @@ -181,8 +197,10 @@ static int cache_make_upcall(struct cache_detail *detail, struct cache_head *h);
>   *
>   *
>   * Returns 0 if the cache_head can be used, or cache_puts it and returns
> - * -EAGAIN if upcall is pending,
> - * -ETIMEDOUT if upcall failed and should be retried,
> + * -EAGAIN if upcall is pending and request has been queued
> + * -ETIMEDOUT if upcall failed or request could not be queued or
> + *           upcall completed but item is still invalid (implying that
> + *           the cache item has been replaced with a newer one).
>   * -ENOENT if cache entry was negative
>   */
>  int cache_check(struct cache_detail *detail,
> @@ -192,17 +210,7 @@ int cache_check(struct cache_detail *detail,
>  	long refresh_age, age;
>  
>  	/* First decide return status as best we can */
> -	if (!test_bit(CACHE_VALID, &h->flags) ||
> -	    h->expiry_time < get_seconds())
> -		rv = -EAGAIN;
> -	else if (detail->flush_time > h->last_refresh)
> -		rv = -EAGAIN;
> -	else {
> -		/* entry is valid */
> -		if (test_bit(CACHE_NEGATIVE, &h->flags))
> -			rv = -ENOENT;
> -		else rv = 0;
> -	}
> +	rv = cache_is_valid(detail, h);
>  
>  	/* now see if we want to start an upcall */
>  	refresh_age = (h->expiry_time - h->last_refresh);
> @@ -235,10 +243,14 @@ int cache_check(struct cache_detail *detail,
>  		}
>  	}
>  
> -	if (rv == -EAGAIN)
> -		if (cache_defer_req(rqstp, h) != 0)
> -			rv = -ETIMEDOUT;
> -
> +	if (rv == -EAGAIN) {
> +		if (cache_defer_req(rqstp, h) < 0) {
> +			/* Request is not deferred */
> +			rv = cache_is_valid(detail, h);
> +			if (rv == -EAGAIN)
> +				rv = -ETIMEDOUT;
> +		}
> +	}
>  	if (rv)
>  		cache_put(h, detail);
>  	return rv;
> @@ -557,11 +569,11 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
>  		 * or continue and drop the oldest below
>  		 */
>  		if (net_random()&1)
> -			return -ETIMEDOUT;
> +			return -ENOMEM;
>  	}
>  	dreq = req->defer(req);
>  	if (dreq == NULL)
> -		return -ETIMEDOUT;
> +		return -ENOMEM;
>  
>  	dreq->item = item;
>  
> @@ -591,6 +603,7 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
>  	if (!test_bit(CACHE_PENDING, &item->flags)) {
>  		/* must have just been validated... */
>  		cache_revisit_request(item);
> +		return -EAGAIN;
>  	}
>  	return 0;
>  }
> -- 
> 1.6.3.3
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 04/12] sunrpc/cache: recheck cache validity after cache_defer_req
  2009-08-25 21:50             ` J. Bruce Fields
@ 2009-08-26  0:42               ` Neil Brown
  0 siblings, 0 replies; 26+ messages in thread
From: Neil Brown @ 2009-08-26  0:42 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

On Tuesday August 25, bfields@fieldses.org wrote:
> 
> Yes, OK, seems reasonable enough.
> 
> So, apologies, I've lost track of the rest of your patches, but I'm
> still very much interested.  Will you have a chance to rebase and
> resend?  My for-2.6.32 branch at
> 
> 	git://linux-nfs.org/~bfields/linux.git for-2.6.32
> 
> has what I've applied so far, plus a merge of Trond's queued patches
> (which includes some changes to the cache code).

I'm taking some leave next week so it'll probably be mid September
before I get around to sending anything, but I'll try to keep it on my
todo list.

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 01/12] sunrpc/cache: rename queue_loose to cache_dequeue
       [not found] ` <20090804051145.15929.11356.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
  2009-08-04  5:22   ` [PATCH 03/12] sunrpc/cache: simplify cache_fresh_locked and cache_fresh_unlocked NeilBrown
  2009-08-04  5:22   ` [PATCH 04/12] sunrpc/cache: recheck cache validity after cache_defer_req NeilBrown
@ 2009-08-04  5:22   ` NeilBrown
       [not found]     ` <20090804052238.15929.91015.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
  2009-08-04  5:22   ` [PATCH 02/12] sunrpc/cache: make sure deferred requests eventually get revisited NeilBrown
                     ` (9 subsequent siblings)
  12 siblings, 1 reply; 26+ messages in thread
From: NeilBrown @ 2009-08-04  5:22 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, NeilBrown

'loose' was a mis-spelling of 'lose', and even that wasn't a good
word choice.
So give this function a more useful name.

Signed-off-by: NeilBrown <neilb@suse.de>
---

 net/sunrpc/cache.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index ff0c230..d19c075 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -101,7 +101,7 @@ struct cache_head *sunrpc_cache_lookup(struct cache_detail *detail,
 EXPORT_SYMBOL_GPL(sunrpc_cache_lookup);
 
 
-static void queue_loose(struct cache_detail *detail, struct cache_head *ch);
+static void cache_dequeue(struct cache_detail *detail, struct cache_head *ch);
 
 static int cache_fresh_locked(struct cache_head *head, time_t expiry)
 {
@@ -117,7 +117,7 @@ static void cache_fresh_unlocked(struct cache_head *head,
 		cache_revisit_request(head);
 	if (test_and_clear_bit(CACHE_PENDING, &head->flags)) {
 		cache_revisit_request(head);
-		queue_loose(detail, head);
+		cache_dequeue(detail, head);
 	}
 }
 
@@ -457,7 +457,7 @@ static int cache_clean(void)
 				)
 				continue;
 			if (test_and_clear_bit(CACHE_PENDING, &ch->flags))
-				queue_loose(current_detail, ch);
+				cache_dequeue(current_detail, ch);
 
 			if (atomic_read(&ch->ref.refcount) == 1)
 				break;
@@ -920,7 +920,7 @@ static const struct file_operations cache_file_operations = {
 };
 
 
-static void queue_loose(struct cache_detail *detail, struct cache_head *ch)
+static void cache_dequeue(struct cache_detail *detail, struct cache_head *ch)
 {
 	struct cache_queue *cq;
 	spin_lock(&queue_lock);



^ permalink raw reply related	[flat|nested] 26+ messages in thread

[parent not found: <20090804052238.15929.91015.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>]

* Re: [PATCH 01/12] sunrpc/cache: rename queue_loose to cache_dequeue
       [not found]     ` <20090804052238.15929.91015.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
@ 2009-08-04 14:05       ` J. Bruce Fields
  0 siblings, 0 replies; 26+ messages in thread
From: J. Bruce Fields @ 2009-08-04 14:05 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-nfs

On Tue, Aug 04, 2009 at 03:22:38PM +1000, NeilBrown wrote:
> 'loose' was a mis-spelling of 'lose', and even that wasn't a good
> word choice.
> So give this function a more useful name.

OK, others may take me longer to get to, but this is easy enough to
apply now!  Done.--b.

> 
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
> 
>  net/sunrpc/cache.c |    8 ++++----
>  1 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
> index ff0c230..d19c075 100644
> --- a/net/sunrpc/cache.c
> +++ b/net/sunrpc/cache.c
> @@ -101,7 +101,7 @@ struct cache_head *sunrpc_cache_lookup(struct cache_detail *detail,
>  EXPORT_SYMBOL_GPL(sunrpc_cache_lookup);
>  
>  
> -static void queue_loose(struct cache_detail *detail, struct cache_head *ch);
> +static void cache_dequeue(struct cache_detail *detail, struct cache_head *ch);
>  
>  static int cache_fresh_locked(struct cache_head *head, time_t expiry)
>  {
> @@ -117,7 +117,7 @@ static void cache_fresh_unlocked(struct cache_head *head,
>  		cache_revisit_request(head);
>  	if (test_and_clear_bit(CACHE_PENDING, &head->flags)) {
>  		cache_revisit_request(head);
> -		queue_loose(detail, head);
> +		cache_dequeue(detail, head);
>  	}
>  }
>  
> @@ -457,7 +457,7 @@ static int cache_clean(void)
>  				)
>  				continue;
>  			if (test_and_clear_bit(CACHE_PENDING, &ch->flags))
> -				queue_loose(current_detail, ch);
> +				cache_dequeue(current_detail, ch);
>  
>  			if (atomic_read(&ch->ref.refcount) == 1)
>  				break;
> @@ -920,7 +920,7 @@ static const struct file_operations cache_file_operations = {
>  };
>  
>  
> -static void queue_loose(struct cache_detail *detail, struct cache_head *ch)
> +static void cache_dequeue(struct cache_detail *detail, struct cache_head *ch)
>  {
>  	struct cache_queue *cq;
>  	spin_lock(&queue_lock);
> 
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 02/12] sunrpc/cache: make sure deferred requests eventually get revisited.
       [not found] ` <20090804051145.15929.11356.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
                     ` (2 preceding siblings ...)
  2009-08-04  5:22   ` [PATCH 01/12] sunrpc/cache: rename queue_loose to cache_dequeue NeilBrown
@ 2009-08-04  5:22   ` NeilBrown
       [not found]     ` <20090804052238.15929.74402.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
  2009-08-04  5:22   ` [PATCH 08/12] sunrpc/cache: retry cache lookups that return -ETIMEDOUT NeilBrown
                     ` (8 subsequent siblings)
  12 siblings, 1 reply; 26+ messages in thread
From: NeilBrown @ 2009-08-04  5:22 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, NeilBrown

While deferred requests normally get revisited quite quickly,
it is possible for a request to remain in the deferral queue
when the cache item is discarded.  We can easily make sure that
doesn't happen by calling cache_revisit_request just before
the final 'put'.

Also there is a small chance that a race would cause one thread to
defer a request against a cache item while another thread is failing
to queue and upcall for that item.  So when the upcall fails,
make sure to revisit all deferred requests.

Signed-off-by: NeilBrown <neilb@suse.de>
---

 net/sunrpc/cache.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index d19c075..44f4516 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -221,6 +221,7 @@ int cache_check(struct cache_detail *detail,
 			switch (cache_make_upcall(detail, h)) {
 			case -EINVAL:
 				clear_bit(CACHE_PENDING, &h->flags);
+				cache_revisit_request(h);
 				if (rv == -EAGAIN) {
 					set_bit(CACHE_NEGATIVE, &h->flags);
 					cache_fresh_unlocked(h, detail,
@@ -473,8 +474,10 @@ static int cache_clean(void)
 		if (!ch)
 			current_index ++;
 		spin_unlock(&cache_list_lock);
-		if (ch)
+		if (ch) {
+			cache_revisit_request(ch);
 			cache_put(ch, d);
+		}
 	} else
 		spin_unlock(&cache_list_lock);
 



^ permalink raw reply related	[flat|nested] 26+ messages in thread

[parent not found: <20090804052238.15929.74402.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>]

* Re: [PATCH 02/12] sunrpc/cache: make sure deferred requests eventually get revisited.
       [not found]     ` <20090804052238.15929.74402.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
@ 2009-08-04 15:02       ` J. Bruce Fields
  0 siblings, 0 replies; 26+ messages in thread
From: J. Bruce Fields @ 2009-08-04 15:02 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-nfs

On Tue, Aug 04, 2009 at 03:22:38PM +1000, NeilBrown wrote:
> While deferred requests normally get revisited quite quickly,
> it is possible for a request to remain in the deferral queue
> when the cache item is discarded.  We can easily make sure that
> doesn't happen by calling cache_revisit_request just before
> the final 'put'.
> 
> Also there is a small chance that a race would cause one thread to
> defer a request against a cache item while another thread is failing
> to queue and upcall for that item.  So when the upcall fails,
> make sure to revisit all deferred requests.

OK, thanks--applied.

> 
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
> 
>  net/sunrpc/cache.c |    5 ++++-
>  1 files changed, 4 insertions(+), 1 deletions(-)
> 
> diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
> index d19c075..44f4516 100644
> --- a/net/sunrpc/cache.c
> +++ b/net/sunrpc/cache.c
> @@ -221,6 +221,7 @@ int cache_check(struct cache_detail *detail,
>  			switch (cache_make_upcall(detail, h)) {
>  			case -EINVAL:
>  				clear_bit(CACHE_PENDING, &h->flags);
> +				cache_revisit_request(h);
>  				if (rv == -EAGAIN) {
>  					set_bit(CACHE_NEGATIVE, &h->flags);
>  					cache_fresh_unlocked(h, detail,
> @@ -473,8 +474,10 @@ static int cache_clean(void)
>  		if (!ch)
>  			current_index ++;
>  		spin_unlock(&cache_list_lock);
> -		if (ch)
> +		if (ch) {
> +			cache_revisit_request(ch);

Possibly dumb question: does dreq->item hold a reference?  Assuming not:
should we zero it out on revisit?  It seems like an accident waiting to
happen otherwise.

--b.

>  			cache_put(ch, d);
> +		}
>  	} else
>  		spin_unlock(&cache_list_lock);
>  
> 
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 08/12] sunrpc/cache: retry cache lookups that return -ETIMEDOUT
       [not found] ` <20090804051145.15929.11356.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
                     ` (3 preceding siblings ...)
  2009-08-04  5:22   ` [PATCH 02/12] sunrpc/cache: make sure deferred requests eventually get revisited NeilBrown
@ 2009-08-04  5:22   ` NeilBrown
  2009-08-04  5:22   ` [PATCH 06/12] sunrpc/cache: avoid variable over-loading in cache_defer_req NeilBrown
                     ` (7 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: NeilBrown @ 2009-08-04  5:22 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, NeilBrown

If cache_check returns -ETIMEDOUT, then the cache item is not
up-to-date, but there is no pending upcall.
This could mean the data is not available, or it could mean that the
good data has been stored in a new cache item.

So re-do the lookup and if that returns a new item, proceed using that
item.

Signed-off-by: NeilBrown <neilb@suse.de>
---

 fs/nfsd/export.c          |   18 ++++++++++++++++++
 net/sunrpc/svcauth_unix.c |   22 ++++++++++++++++++++--
 2 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
index b92a276..6fcb0eb 100644
--- a/fs/nfsd/export.c
+++ b/fs/nfsd/export.c
@@ -783,9 +783,18 @@ exp_find_key(svc_client *clp, int fsid_type, u32 *fsidv, struct cache_req *reqp)
 	memcpy(key.ek_fsid, fsidv, key_len(fsid_type));
 
 	ek = svc_expkey_lookup(&key);
+ again:
 	if (ek == NULL)
 		return ERR_PTR(-ENOMEM);
 	err = cache_check(&svc_expkey_cache, &ek->h, reqp);
+	if (err == -ETIMEDOUT) {
+		struct svc_expkey *prev_ek = ek;
+		ek = svc_expkey_lookup(&key);
+		if (ek != prev_ek)
+			goto again;
+		if (ek)
+			cache_put(&ek->h, &svc_expkey_cache);
+	}
 	if (err)
 		return ERR_PTR(err);
 	return ek;
@@ -855,9 +864,18 @@ static svc_export *exp_get_by_name(svc_client *clp, const struct path *path,
 	key.ex_path = *path;
 
 	exp = svc_export_lookup(&key);
+ retry:
 	if (exp == NULL)
 		return ERR_PTR(-ENOMEM);
 	err = cache_check(&svc_export_cache, &exp->h, reqp);
+	if (err == -ETIMEDOUT) {
+		struct svc_export *prev_exp = exp;
+		exp = svc_export_lookup(&key);
+		if (exp != prev_exp)
+			goto retry;
+		if (exp)
+			cache_put(&exp->h, &svc_export_cache);
+	}
 	if (err)
 		return ERR_PTR(err);
 	return exp;
diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c
index 5c865e2..f2de152 100644
--- a/net/sunrpc/svcauth_unix.c
+++ b/net/sunrpc/svcauth_unix.c
@@ -649,8 +649,10 @@ static int unix_gid_find(uid_t uid, struct group_info **gip,
 			 struct svc_rqst *rqstp)
 {
 	struct unix_gid *ug = unix_gid_lookup(uid);
+	struct unix_gid *prevug;
 	if (!ug)
 		return -EAGAIN;
+ retry:
 	switch (cache_check(&unix_gid_cache, &ug->h, &rqstp->rq_chandle)) {
 	case -ENOENT:
 		*gip = NULL;
@@ -659,6 +661,13 @@ static int unix_gid_find(uid_t uid, struct group_info **gip,
 		*gip = ug->gi;
 		get_group_info(*gip);
 		return 0;
+	case -ETIMEDOUT:
+		prevug = ug;
+		ug = unix_gid_lookup(uid);
+		if (ug != prevug)
+			goto retry;
+		if (ug)
+			cache_put(&ug->h, &unix_gid_cache);
 	default:
 		return -EAGAIN;
 	}
@@ -669,7 +678,7 @@ svcauth_unix_set_client(struct svc_rqst *rqstp)
 {
 	struct sockaddr_in *sin;
 	struct sockaddr_in6 *sin6, sin6_storage;
-	struct ip_map *ipm;
+	struct ip_map *ipm, *prev_ipm;
 
 	switch (rqstp->rq_addr.ss_family) {
 	case AF_INET:
@@ -694,14 +703,23 @@ svcauth_unix_set_client(struct svc_rqst *rqstp)
 		ipm = ip_map_lookup(rqstp->rq_server->sv_program->pg_class,
 				    &sin6->sin6_addr);
 
+ retry:
 	if (ipm == NULL)
 		return SVC_DENIED;
 
 	switch (cache_check(&ip_map_cache, &ipm->h, &rqstp->rq_chandle)) {
 		default:
 			BUG();
-		case -EAGAIN:
 		case -ETIMEDOUT:
+			prev_ipm = ipm;
+			ipm = ip_map_lookup(rqstp->rq_server->sv_program->pg_class,
+					    &sin6->sin6_addr);
+			if (ipm != prev_ipm)
+				goto retry;
+			if (ipm)
+				cache_put(&ipm->h, &ip_map_cache);
+
+		case -EAGAIN:
 			return SVC_DROP;
 		case -ENOENT:
 			return SVC_DENIED;



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 06/12] sunrpc/cache: avoid variable over-loading in cache_defer_req
       [not found] ` <20090804051145.15929.11356.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
                     ` (4 preceding siblings ...)
  2009-08-04  5:22   ` [PATCH 08/12] sunrpc/cache: retry cache lookups that return -ETIMEDOUT NeilBrown
@ 2009-08-04  5:22   ` NeilBrown
       [not found]     ` <20090804052239.15929.87201.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
  2009-08-04  5:22   ` [PATCH 09/12] nfsd/idmap: drop special request deferal in favour of improved default NeilBrown
                     ` (6 subsequent siblings)
  12 siblings, 1 reply; 26+ messages in thread
From: NeilBrown @ 2009-08-04  5:22 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, NeilBrown

In cache_defer_req, 'dreq' is used for two significantly different
values that happen to be of the same type.

This is both confusing, and make it hard to extend the range of one of
the values as we will in the next patch.
So introduce 'discard' to take one of the values.

Signed-off-by: NeilBrown <neilb@suse.de>
---

 net/sunrpc/cache.c |   14 +++++++-------
 1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 2f19463..4892c5c 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -561,7 +561,7 @@ static int cache_defer_cnt;
 
 static int cache_defer_req(struct cache_req *req, struct cache_head *item)
 {
-	struct cache_deferred_req *dreq;
+	struct cache_deferred_req *dreq, *discard;
 	int hash = DFR_HASH(item);
 
 	if (cache_defer_cnt >= DFR_MAX) {
@@ -586,20 +586,20 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
 	list_add(&dreq->hash, &cache_defer_hash[hash]);
 
 	/* it is in, now maybe clean up */
-	dreq = NULL;
+	discard = NULL;
 	if (++cache_defer_cnt > DFR_MAX) {
-		dreq = list_entry(cache_defer_list.prev,
-				  struct cache_deferred_req, recent);
+		discard = list_entry(cache_defer_list.prev,
+				     struct cache_deferred_req, recent);
 		list_del_init(&dreq->recent);
 		list_del_init(&dreq->hash);
 		cache_defer_cnt--;
 	}
 	spin_unlock(&cache_defer_lock);
 
-	if (dreq) {
+	if (discard)
 		/* there was one too many */
-		dreq->revisit(dreq, 1);
-	}
+		discard->revisit(discard, 1);
+
 	if (!test_bit(CACHE_PENDING, &item->flags)) {
 		/* must have just been validated... */
 		cache_revisit_request(item);



^ permalink raw reply related	[flat|nested] 26+ messages in thread

[parent not found: <20090804052239.15929.87201.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>]

* Re: [PATCH 06/12] sunrpc/cache: avoid variable over-loading in cache_defer_req
       [not found]     ` <20090804052239.15929.87201.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
@ 2009-08-04 20:47       ` J. Bruce Fields
  2009-08-06  4:35         ` Neil Brown
  0 siblings, 1 reply; 26+ messages in thread
From: J. Bruce Fields @ 2009-08-04 20:47 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-nfs

On Tue, Aug 04, 2009 at 03:22:39PM +1000, NeilBrown wrote:
> In cache_defer_req, 'dreq' is used for two significantly different
> values that happen to be of the same type.
> 
> This is both confusing, and make it hard to extend the range of one of
> the values as we will in the next patch.
> So introduce 'discard' to take one of the values.
> 
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
> 
>  net/sunrpc/cache.c |   14 +++++++-------
>  1 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
> index 2f19463..4892c5c 100644
> --- a/net/sunrpc/cache.c
> +++ b/net/sunrpc/cache.c
> @@ -561,7 +561,7 @@ static int cache_defer_cnt;
>  
>  static int cache_defer_req(struct cache_req *req, struct cache_head *item)
>  {
> -	struct cache_deferred_req *dreq;
> +	struct cache_deferred_req *dreq, *discard;
>  	int hash = DFR_HASH(item);
>  
>  	if (cache_defer_cnt >= DFR_MAX) {
> @@ -586,20 +586,20 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
>  	list_add(&dreq->hash, &cache_defer_hash[hash]);
>  
>  	/* it is in, now maybe clean up */
> -	dreq = NULL;
> +	discard = NULL;
>  	if (++cache_defer_cnt > DFR_MAX) {
> -		dreq = list_entry(cache_defer_list.prev,
> -				  struct cache_deferred_req, recent);
> +		discard = list_entry(cache_defer_list.prev,
> +				     struct cache_deferred_req, recent);
>  		list_del_init(&dreq->recent);
>  		list_del_init(&dreq->hash);

Shouldn't these two "dreq"'s be "discard"'s as well?

--b.

>  		cache_defer_cnt--;
>  	}
>  	spin_unlock(&cache_defer_lock);
>  
> -	if (dreq) {
> +	if (discard)
>  		/* there was one too many */
> -		dreq->revisit(dreq, 1);
> -	}
> +		discard->revisit(discard, 1);
> +
>  	if (!test_bit(CACHE_PENDING, &item->flags)) {
>  		/* must have just been validated... */
>  		cache_revisit_request(item);
> 
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 06/12] sunrpc/cache: avoid variable over-loading in cache_defer_req
  2009-08-04 20:47       ` J. Bruce Fields
@ 2009-08-06  4:35         ` Neil Brown
  0 siblings, 0 replies; 26+ messages in thread
From: Neil Brown @ 2009-08-06  4:35 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

On Tuesday August 4, bfields@fieldses.org wrote:
> On Tue, Aug 04, 2009 at 03:22:39PM +1000, NeilBrown wrote:
> > In cache_defer_req, 'dreq' is used for two significantly different
> > values that happen to be of the same type.
> > 
> > This is both confusing, and make it hard to extend the range of one of
> > the values as we will in the next patch.
> > So introduce 'discard' to take one of the values.
> > 
> > Signed-off-by: NeilBrown <neilb@suse.de>
...
> > @@ -586,20 +586,20 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
> >  	list_add(&dreq->hash, &cache_defer_hash[hash]);
> >  
> >  	/* it is in, now maybe clean up */
> > -	dreq = NULL;
> > +	discard = NULL;
> >  	if (++cache_defer_cnt > DFR_MAX) {
> > -		dreq = list_entry(cache_defer_list.prev,
> > -				  struct cache_deferred_req, recent);
> > +		discard = list_entry(cache_defer_list.prev,
> > +				     struct cache_deferred_req, recent);
> >  		list_del_init(&dreq->recent);
> >  		list_del_init(&dreq->hash);
> 
> Shouldn't these two "dreq"'s be "discard"'s as well?

Yes, definitely.  Thanks for catching that.

NeilBrown

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 09/12] nfsd/idmap: drop special request deferal in favour of improved default.
       [not found] ` <20090804051145.15929.11356.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
                     ` (5 preceding siblings ...)
  2009-08-04  5:22   ` [PATCH 06/12] sunrpc/cache: avoid variable over-loading in cache_defer_req NeilBrown
@ 2009-08-04  5:22   ` NeilBrown
  2009-08-04  5:22   ` [PATCH 07/12] sunrpc/cache: allow thread to block while waiting for cache update NeilBrown
                     ` (5 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: NeilBrown @ 2009-08-04  5:22 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, NeilBrown

The idmap code manages request deferal by waiting for a reply from
userspace rather than putting the NFS request on a queue to be retried
from the start.
Now that the comment deferal code does this there is no need for the
special code in idmap.

Signed-off-by: NeilBrown <neilb@suse.de>
---

 fs/nfsd/nfs4idmap.c |  105 +++++----------------------------------------------
 1 files changed, 11 insertions(+), 94 deletions(-)

diff --git a/fs/nfsd/nfs4idmap.c b/fs/nfsd/nfs4idmap.c
index 5b39842..d6d37f8 100644
--- a/fs/nfsd/nfs4idmap.c
+++ b/fs/nfsd/nfs4idmap.c
@@ -485,109 +485,26 @@ nfsd_idmap_shutdown(void)
 	cache_unregister(&nametoid_cache);
 }
 
-/*
- * Deferred request handling
- */
-
-struct idmap_defer_req {
-       struct cache_req		req;
-       struct cache_deferred_req deferred_req;
-       wait_queue_head_t	waitq;
-       atomic_t			count;
-};
-
-static inline void
-put_mdr(struct idmap_defer_req *mdr)
-{
-	if (atomic_dec_and_test(&mdr->count))
-		kfree(mdr);
-}
-
-static inline void
-get_mdr(struct idmap_defer_req *mdr)
-{
-	atomic_inc(&mdr->count);
-}
-
-static void
-idmap_revisit(struct cache_deferred_req *dreq, int toomany)
-{
-	struct idmap_defer_req *mdr =
-		container_of(dreq, struct idmap_defer_req, deferred_req);
-
-	wake_up(&mdr->waitq);
-	put_mdr(mdr);
-}
-
-static struct cache_deferred_req *
-idmap_defer(struct cache_req *req)
-{
-	struct idmap_defer_req *mdr =
-		container_of(req, struct idmap_defer_req, req);
-
-	mdr->deferred_req.revisit = idmap_revisit;
-	get_mdr(mdr);
-	return (&mdr->deferred_req);
-}
-
-static inline int
-do_idmap_lookup(struct ent *(*lookup_fn)(struct ent *), struct ent *key,
-		struct cache_detail *detail, struct ent **item,
-		struct idmap_defer_req *mdr)
-{
-	*item = lookup_fn(key);
-	if (!*item)
-		return -ENOMEM;
-	return cache_check(detail, &(*item)->h, &mdr->req);
-}
-
-static inline int
-do_idmap_lookup_nowait(struct ent *(*lookup_fn)(struct ent *),
-			struct ent *key, struct cache_detail *detail,
-			struct ent **item)
-{
-	int ret = -ENOMEM;
-
-	*item = lookup_fn(key);
-	if (!*item)
-		goto out_err;
-	ret = -ETIMEDOUT;
-	if (!test_bit(CACHE_VALID, &(*item)->h.flags)
-			|| (*item)->h.expiry_time < get_seconds()
-			|| detail->flush_time > (*item)->h.last_refresh)
-		goto out_put;
-	ret = -ENOENT;
-	if (test_bit(CACHE_NEGATIVE, &(*item)->h.flags))
-		goto out_put;
-	return 0;
-out_put:
-	cache_put(&(*item)->h, detail);
-out_err:
-	*item = NULL;
-	return ret;
-}
-
 static int
 idmap_lookup(struct svc_rqst *rqstp,
 		struct ent *(*lookup_fn)(struct ent *), struct ent *key,
 		struct cache_detail *detail, struct ent **item)
 {
-	struct idmap_defer_req *mdr;
 	int ret;
 
-	mdr = kzalloc(sizeof(*mdr), GFP_KERNEL);
-	if (!mdr)
+	*item = lookup_fn(key);
+	if (!*item)
 		return -ENOMEM;
-	atomic_set(&mdr->count, 1);
-	init_waitqueue_head(&mdr->waitq);
-	mdr->req.defer = idmap_defer;
-	ret = do_idmap_lookup(lookup_fn, key, detail, item, mdr);
-	if (ret == -EAGAIN) {
-		wait_event_interruptible_timeout(mdr->waitq,
-			test_bit(CACHE_VALID, &(*item)->h.flags), 1 * HZ);
-		ret = do_idmap_lookup_nowait(lookup_fn, key, detail, item);
+ retry:
+	ret = cache_check(detail, &(*item)->h, &rqstp->rq_chandle);
+
+	if (ret == -ETIMEDOUT) {
+		struct ent *prev_item = *item;
+		*item = lookup_fn(key);
+		if (*item != prev_item)
+			goto retry;
+		cache_put(&(*item)->h, detail);
 	}
-	put_mdr(mdr);
 	return ret;
 }
 



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 07/12] sunrpc/cache: allow thread to block while waiting for cache update.
       [not found] ` <20090804051145.15929.11356.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
                     ` (6 preceding siblings ...)
  2009-08-04  5:22   ` [PATCH 09/12] nfsd/idmap: drop special request deferal in favour of improved default NeilBrown
@ 2009-08-04  5:22   ` NeilBrown
  2009-08-04  5:22   ` [PATCH 05/12] sunrpc/cache: use list_del_init for the list_head entries in cache_deferred_req NeilBrown
                     ` (4 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: NeilBrown @ 2009-08-04  5:22 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, NeilBrown

The current practice of waiting for cache updates by queueing the
whole request to be retried has (at least) two problems.

1/ We NFSv4, requests can be quite complex and re-trying a whole
  request when a later part fails should only be a list-resort, not a
  normal practice.

2/ Large requests, and in particular any 'write' request, will not be
  queued by the current code and doing so would be undesirable.

In many cases only a very sort wait is needed before the cache gets
valid data.

So, providing the underlying transport permits it by setting
 ->thread_wait,
arrange to wait briefly for an upcall to be completed (as reflected in
the clearing of CACHE_PENDING).
If the short wait was not long enough and CACHE_PENDING is still set,
fall back on the old approach.

The 'thread_wait' value is set to 5 seconds when there are spare
threads, and 1 second when there are no spare threads.

These values are probably much higher than needed, but will ensure
some forward progress.

Signed-off-by: NeilBrown <neilb@suse.de>
---

 include/linux/sunrpc/cache.h |    3 +++
 net/sunrpc/cache.c           |   44 +++++++++++++++++++++++++++++++++++++++++-
 net/sunrpc/svc_xprt.c        |   11 +++++++++++
 3 files changed, 57 insertions(+), 1 deletions(-)

diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
index 2d8b211..ad62e8d 100644
--- a/include/linux/sunrpc/cache.h
+++ b/include/linux/sunrpc/cache.h
@@ -112,6 +112,9 @@ struct cache_detail {
  */
 struct cache_req {
 	struct cache_deferred_req *(*defer)(struct cache_req *req);
+	int thread_wait;  /* How long (jiffies) we can block the
+			   * current thread to wait for updates.
+			   */
 };
 /* this must be embedded in a deferred_request that is being
  * delayed awaiting cache-fill
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 4892c5c..ba2e113 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -559,10 +559,22 @@ static LIST_HEAD(cache_defer_list);
 static struct list_head cache_defer_hash[DFR_HASHSIZE];
 static int cache_defer_cnt;
 
+struct thread_deferred_req {
+	struct cache_deferred_req handle;
+	wait_queue_head_t wait;
+};
+static void cache_restart_thread(struct cache_deferred_req *dreq, int too_many)
+{
+	struct thread_deferred_req *dr =
+		container_of(dreq, struct thread_deferred_req, handle);
+	wake_up(&dr->wait);
+}
+
 static int cache_defer_req(struct cache_req *req, struct cache_head *item)
 {
 	struct cache_deferred_req *dreq, *discard;
 	int hash = DFR_HASH(item);
+	struct thread_deferred_req sleeper;
 
 	if (cache_defer_cnt >= DFR_MAX) {
 		/* too much in the cache, randomly drop this one,
@@ -571,7 +583,14 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
 		if (net_random()&1)
 			return 0;
 	}
-	dreq = req->defer(req);
+	if (req->thread_wait) {
+		dreq = &sleeper.handle;
+		init_waitqueue_head(&sleeper.wait);
+		dreq->revisit = cache_restart_thread;
+	} else
+		dreq = req->defer(req);
+
+ retry:
 	if (dreq == NULL)
 		return 0;
 
@@ -605,6 +624,29 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
 		cache_revisit_request(item);
 		return 0;
 	}
+
+	if (dreq == &sleeper.handle) {
+		wait_event_interruptible_timeout(
+			sleeper.wait,
+			!test_bit(CACHE_PENDING, &item->flags)
+			|| list_empty(&sleeper.handle.hash),
+			req->thread_wait);
+		spin_lock(&cache_defer_lock);
+		if (!list_empty(&sleeper.handle.hash)) {
+			list_del_init(&sleeper.handle.recent);
+			list_del_init(&sleeper.handle.hash);
+			cache_defer_cnt--;
+		}
+		spin_unlock(&cache_defer_lock);
+		if (test_bit(CACHE_PENDING, &item->flags)) {
+			/* item is still pending, try request
+			 * deferral
+			 */
+			dreq = req->defer(req);
+			goto retry;
+		}
+		return 0;
+	}
 	return 1;
 }
 
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 27d4433..465443c 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -655,12 +655,23 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
 		pool->sp_nwaking--;
 		BUG_ON(pool->sp_nwaking < 0);
 	}
+
+	/* Normally we will wait up to 5 seconds for any required
+	 * cache information to be provided.
+	 */
+	rqstp->rq_chandle.thread_wait = 5*HZ;
 	xprt = svc_xprt_dequeue(pool);
 	if (xprt) {
 		rqstp->rq_xprt = xprt;
 		svc_xprt_get(xprt);
 		rqstp->rq_reserved = serv->sv_max_mesg;
 		atomic_add(rqstp->rq_reserved, &xprt->xpt_reserved);
+
+		/* As there is a shortage of threads and this request
+		 * had to be queue, don't allow the thread to wait so
+		 * long for cache updates.
+		 */
+		rqstp->rq_chandle.thread_wait = 1*HZ;
 	} else {
 		/* No data pending. Go to sleep */
 		svc_thread_enqueue(pool, rqstp);



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 05/12] sunrpc/cache: use list_del_init for the list_head entries in cache_deferred_req
       [not found] ` <20090804051145.15929.11356.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
                     ` (7 preceding siblings ...)
  2009-08-04  5:22   ` [PATCH 07/12] sunrpc/cache: allow thread to block while waiting for cache update NeilBrown
@ 2009-08-04  5:22   ` NeilBrown
  2009-08-04  5:22   ` [PATCH 11/12] sunrpc/cache: change deferred-request hash table to use hlist NeilBrown
                     ` (3 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: NeilBrown @ 2009-08-04  5:22 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, NeilBrown

Using list_del_init is generally safer than list_del, and it will
allow us, in the next patch, to see if an entry has already been
processed or not.

Signed-off-by: NeilBrown <neilb@suse.de>
---

 net/sunrpc/cache.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index bff4baa..2f19463 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -590,8 +590,8 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
 	if (++cache_defer_cnt > DFR_MAX) {
 		dreq = list_entry(cache_defer_list.prev,
 				  struct cache_deferred_req, recent);
-		list_del(&dreq->recent);
-		list_del(&dreq->hash);
+		list_del_init(&dreq->recent);
+		list_del_init(&dreq->hash);
 		cache_defer_cnt--;
 	}
 	spin_unlock(&cache_defer_lock);
@@ -625,7 +625,7 @@ static void cache_revisit_request(struct cache_head *item)
 			dreq = list_entry(lp, struct cache_deferred_req, hash);
 			lp = lp->next;
 			if (dreq->item == item) {
-				list_del(&dreq->hash);
+				list_del_init(&dreq->hash);
 				list_move(&dreq->recent, &pending);
 				cache_defer_cnt--;
 			}
@@ -651,7 +651,7 @@ void cache_clean_deferred(void *owner)
 
 	list_for_each_entry_safe(dreq, tmp, &cache_defer_list, recent) {
 		if (dreq->owner == owner) {
-			list_del(&dreq->hash);
+			list_del_init(&dreq->hash);
 			list_move(&dreq->recent, &pending);
 			cache_defer_cnt--;
 		}



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 11/12] sunrpc/cache: change deferred-request hash table to use hlist.
       [not found] ` <20090804051145.15929.11356.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
                     ` (8 preceding siblings ...)
  2009-08-04  5:22   ` [PATCH 05/12] sunrpc/cache: use list_del_init for the list_head entries in cache_deferred_req NeilBrown
@ 2009-08-04  5:22   ` NeilBrown
  2009-08-04  5:22   ` [PATCH 10/12] sunrpc: fix memory leak in unix_gid cache NeilBrown
                     ` (2 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: NeilBrown @ 2009-08-04  5:22 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, NeilBrown

Being a hash table, hlist is the best option.

There is currently some ugliness were we treat "->next == NULL" as
a special case to avoid having to initialise the whole array.
This change nicely gets rid of that case.

Signed-off-by: NeilBrown <neilb@suse.de>
---

 include/linux/sunrpc/cache.h |    2 +-
 net/sunrpc/cache.c           |   30 ++++++++++++------------------
 2 files changed, 13 insertions(+), 19 deletions(-)

diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
index ad62e8d..4749474 100644
--- a/include/linux/sunrpc/cache.h
+++ b/include/linux/sunrpc/cache.h
@@ -120,7 +120,7 @@ struct cache_req {
  * delayed awaiting cache-fill
  */
 struct cache_deferred_req {
-	struct list_head	hash;	/* on hash chain */
+	struct hlist_node	hash;	/* on hash chain */
 	struct list_head	recent; /* on fifo */
 	struct cache_head	*item;  /* cache item we wait on */
 	void			*owner; /* we might need to discard all defered requests
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index ba2e113..0087b75 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -556,7 +556,7 @@ EXPORT_SYMBOL_GPL(cache_purge);
 
 static DEFINE_SPINLOCK(cache_defer_lock);
 static LIST_HEAD(cache_defer_list);
-static struct list_head cache_defer_hash[DFR_HASHSIZE];
+static struct hlist_head cache_defer_hash[DFR_HASHSIZE];
 static int cache_defer_cnt;
 
 struct thread_deferred_req {
@@ -600,9 +600,7 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
 
 	list_add(&dreq->recent, &cache_defer_list);
 
-	if (cache_defer_hash[hash].next == NULL)
-		INIT_LIST_HEAD(&cache_defer_hash[hash]);
-	list_add(&dreq->hash, &cache_defer_hash[hash]);
+	hlist_add_head(&dreq->hash, &cache_defer_hash[hash]);
 
 	/* it is in, now maybe clean up */
 	discard = NULL;
@@ -610,7 +608,7 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
 		discard = list_entry(cache_defer_list.prev,
 				     struct cache_deferred_req, recent);
 		list_del_init(&dreq->recent);
-		list_del_init(&dreq->hash);
+		hlist_del_init(&dreq->hash);
 		cache_defer_cnt--;
 	}
 	spin_unlock(&cache_defer_lock);
@@ -629,12 +627,12 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
 		wait_event_interruptible_timeout(
 			sleeper.wait,
 			!test_bit(CACHE_PENDING, &item->flags)
-			|| list_empty(&sleeper.handle.hash),
+			|| hlist_unhashed(&sleeper.handle.hash),
 			req->thread_wait);
 		spin_lock(&cache_defer_lock);
-		if (!list_empty(&sleeper.handle.hash)) {
+		if (!hlist_unhashed(&sleeper.handle.hash)) {
 			list_del_init(&sleeper.handle.recent);
-			list_del_init(&sleeper.handle.hash);
+			hlist_del_init(&sleeper.handle.hash);
 			cache_defer_cnt--;
 		}
 		spin_unlock(&cache_defer_lock);
@@ -655,24 +653,20 @@ static void cache_revisit_request(struct cache_head *item)
 	struct cache_deferred_req *dreq;
 	struct list_head pending;
 
-	struct list_head *lp;
+	struct hlist_node *lp, *tmp;
 	int hash = DFR_HASH(item);
 
 	INIT_LIST_HEAD(&pending);
 	spin_lock(&cache_defer_lock);
 
-	lp = cache_defer_hash[hash].next;
-	if (lp) {
-		while (lp != &cache_defer_hash[hash]) {
-			dreq = list_entry(lp, struct cache_deferred_req, hash);
-			lp = lp->next;
+	hlist_for_each_entry_safe(dreq, lp, tmp,
+				  &cache_defer_hash[hash], hash)
 			if (dreq->item == item) {
-				list_del_init(&dreq->hash);
+				hlist_del_init(&dreq->hash);
 				list_move(&dreq->recent, &pending);
 				cache_defer_cnt--;
 			}
-		}
-	}
+
 	spin_unlock(&cache_defer_lock);
 
 	while (!list_empty(&pending)) {
@@ -693,7 +687,7 @@ void cache_clean_deferred(void *owner)
 
 	list_for_each_entry_safe(dreq, tmp, &cache_defer_list, recent) {
 		if (dreq->owner == owner) {
-			list_del_init(&dreq->hash);
+			hlist_del_init(&dreq->hash);
 			list_move(&dreq->recent, &pending);
 			cache_defer_cnt--;
 		}



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 10/12] sunrpc: fix memory leak in unix_gid cache.
       [not found] ` <20090804051145.15929.11356.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
                     ` (9 preceding siblings ...)
  2009-08-04  5:22   ` [PATCH 11/12] sunrpc/cache: change deferred-request hash table to use hlist NeilBrown
@ 2009-08-04  5:22   ` NeilBrown
       [not found]     ` <20090804052239.15929.71459.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
  2009-08-04  5:22   ` [PATCH 12/12] sunrpc: close connection when a request is irretrievably lost NeilBrown
  2009-08-04 14:04   ` [PATCH 00/12] Some improvements to request deferral and related code J. Bruce Fields
  12 siblings, 1 reply; 26+ messages in thread
From: NeilBrown @ 2009-08-04  5:22 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, NeilBrown

When we look up an entry in the uid->gidlist cache, we take
a reference to the content but don't drop the reference to the
cache entry.  So it never gets freed.

Signed-off-by: NeilBrown <neilb@suse.de>
---

 net/sunrpc/svcauth_unix.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c
index f2de152..0afeb30 100644
--- a/net/sunrpc/svcauth_unix.c
+++ b/net/sunrpc/svcauth_unix.c
@@ -660,6 +660,7 @@ static int unix_gid_find(uid_t uid, struct group_info **gip,
 	case 0:
 		*gip = ug->gi;
 		get_group_info(*gip);
+		cache_put(&ug->h, &unix_gid_cache);
 		return 0;
 	case -ETIMEDOUT:
 		prevug = ug;



^ permalink raw reply related	[flat|nested] 26+ messages in thread

[parent not found: <20090804052239.15929.71459.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>]

* Re: [PATCH 10/12] sunrpc: fix memory leak in unix_gid cache.
       [not found]     ` <20090804052239.15929.71459.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
@ 2009-08-04 20:55       ` J. Bruce Fields
  0 siblings, 0 replies; 26+ messages in thread
From: J. Bruce Fields @ 2009-08-04 20:55 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-nfs

On Tue, Aug 04, 2009 at 03:22:39PM +1000, NeilBrown wrote:
> When we look up an entry in the uid->gidlist cache, we take
> a reference to the content but don't drop the reference to the
> cache entry.  So it never gets freed.

Applied, thanks.  (It doesn't seem to depend on the others.)

--b.

> 
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
> 
>  net/sunrpc/svcauth_unix.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
> 
> diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c
> index f2de152..0afeb30 100644
> --- a/net/sunrpc/svcauth_unix.c
> +++ b/net/sunrpc/svcauth_unix.c
> @@ -660,6 +660,7 @@ static int unix_gid_find(uid_t uid, struct group_info **gip,
>  	case 0:
>  		*gip = ug->gi;
>  		get_group_info(*gip);
> +		cache_put(&ug->h, &unix_gid_cache);
>  		return 0;
>  	case -ETIMEDOUT:
>  		prevug = ug;
> 
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 12/12] sunrpc: close connection when a request is irretrievably lost.
       [not found] ` <20090804051145.15929.11356.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
                     ` (10 preceding siblings ...)
  2009-08-04  5:22   ` [PATCH 10/12] sunrpc: fix memory leak in unix_gid cache NeilBrown
@ 2009-08-04  5:22   ` NeilBrown
  2009-08-04 14:04   ` [PATCH 00/12] Some improvements to request deferral and related code J. Bruce Fields
  12 siblings, 0 replies; 26+ messages in thread
From: NeilBrown @ 2009-08-04  5:22 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, NeilBrown

If we drop a request in the sunrpc layer due to a cache miss,
and could not queue the request for later replay, then close
the connection to encourage the client to retry sooner.

Note that if the drop happens in the NFS layer, NFSERR_JUKEBOX
is returned to guide the client concerning replay

Signed-off-by: NeilBrown <neilb@suse.de>
---

 include/linux/sunrpc/svcauth.h |   10 +++++++---
 net/sunrpc/svc.c               |    3 +++
 net/sunrpc/svcauth_unix.c      |   10 +++++++---
 3 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/include/linux/sunrpc/svcauth.h b/include/linux/sunrpc/svcauth.h
index d39dbdc..1126693 100644
--- a/include/linux/sunrpc/svcauth.h
+++ b/include/linux/sunrpc/svcauth.h
@@ -108,9 +108,13 @@ struct auth_ops {
 #define	SVC_NEGATIVE	4
 #define	SVC_OK		5
 #define	SVC_DROP	6
-#define	SVC_DENIED	7
-#define	SVC_PENDING	8
-#define	SVC_COMPLETE	9
+#define	SVC_CLOSE	7	/* Like SVC_DROP, but request is definitely
+				 * lost so if there is a tcp connection, it
+				 * should be closed
+				 */
+#define	SVC_DENIED	8
+#define	SVC_PENDING	9
+#define	SVC_COMPLETE	10
 
 
 extern int	svc_authenticate(struct svc_rqst *rqstp, __be32 *authp);
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index 952f206..b47aa21 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -1050,6 +1050,9 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv)
 		goto err_bad;
 	case SVC_DENIED:
 		goto err_bad_auth;
+	case SVC_CLOSE:
+		if (test_bit(XPT_TEMP, &rqstp->rq_xprt->xpt_flags))
+			svc_close_xprt(rqstp->rq_xprt);
 	case SVC_DROP:
 		goto dropit;
 	case SVC_COMPLETE:
diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c
index 0afeb30..e203a0b 100644
--- a/net/sunrpc/svcauth_unix.c
+++ b/net/sunrpc/svcauth_unix.c
@@ -669,6 +669,7 @@ static int unix_gid_find(uid_t uid, struct group_info **gip,
 			goto retry;
 		if (ug)
 			cache_put(&ug->h, &unix_gid_cache);
+		return -ESHUTDOWN;
 	default:
 		return -EAGAIN;
 	}
@@ -719,7 +720,7 @@ svcauth_unix_set_client(struct svc_rqst *rqstp)
 				goto retry;
 			if (ipm)
 				cache_put(&ipm->h, &ip_map_cache);
-
+			return SVC_CLOSE;
 		case -EAGAIN:
 			return SVC_DROP;
 		case -ENOENT:
@@ -826,9 +827,12 @@ svcauth_unix_accept(struct svc_rqst *rqstp, __be32 *authp)
 	slen = svc_getnl(argv);			/* gids length */
 	if (slen > 16 || (len -= (slen + 2)*4) < 0)
 		goto badcred;
-	if (unix_gid_find(cred->cr_uid, &cred->cr_group_info, rqstp)
-	    == -EAGAIN)
+	switch (unix_gid_find(cred->cr_uid, &cred->cr_group_info, rqstp)) {
+	case -EAGAIN:
 		return SVC_DROP;
+	case -ESHUTDOWN:
+		return SVC_CLOSE;
+	}
 	if (cred->cr_group_info == NULL) {
 		cred->cr_group_info = groups_alloc(slen);
 		if (cred->cr_group_info == NULL)



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 00/12] Some improvements to request deferral and related code
       [not found] ` <20090804051145.15929.11356.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
                     ` (11 preceding siblings ...)
  2009-08-04  5:22   ` [PATCH 12/12] sunrpc: close connection when a request is irretrievably lost NeilBrown
@ 2009-08-04 14:04   ` J. Bruce Fields
  2009-08-07  4:13     ` Neil Brown
  12 siblings, 1 reply; 26+ messages in thread
From: J. Bruce Fields @ 2009-08-04 14:04 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-nfs

On Tue, Aug 04, 2009 at 03:22:38PM +1000, NeilBrown wrote:
> Hi Bruce and all.
> 
>  When I wrote the authentication cache all those years ago I handled
>  cache misses by arranging to simply retry the whole request, either
>  by saving a copy for be replayed later, or by waiting for the client
>  to resend.
> 
>  As we know this isn't ideal.  Large requests (e.g. writes) don't get
>  saved, TCP connections don't resend in a suitable time, and NFSv4
>  request are sufficiently complex that restarting from the beginning
>  isn't really a good idea.

Excellent, thanks for working on this!

>  So I have finally implemented the "wait a while" approach which has
>  probably been suggested several times and is even implemented in a
>  somewhat hackish way for nfsidmap.
> 
>  Yes, I know I have probably argued against this in the past.  I was
>  wrong.
> 
>  This series fixes a few little bugs and tidies up some code but does
>  two main important things.
> 
>  1/ 'allow thread to block....' will wait a little while if there is a
>  cache miss.  If an answer is available in that time, it continues on
>  it's merry way.  If no answer arrives, the old deferral approach is
>  used.  It waits 5 seconds if there are spare nfsd threads, and 1
>  second if there all threads are busy.  I have almost nothing with
>  which to justify these numbers.

I think the v4 server at least should return NFS4ERR_DELAY in this case
instead of doing the internal replay.  That avoids possible problems
with non-idempotent compound ops.

>From the protocol point of view I don't know if there's any rule of
thumb about when it'd be best to return DELAY.  Perhaps it's best to
avoid it whenever possible, but when the delay is on the order of
seconds it sounds reasonable to me.

--b.

> 
>  2/ 'close connection when...' provides a better fall back for TCP.
>  If the sunrpc layer decides to drop the request without queueing it
>  at all, it will close the connection.  This encourages the client to
>  retry sooner.
> 
>  Please review and comment.
> 
>  If you like them all, they can be pulled from
> 
>       git://neil.brown.name/linux-2.6/ nfsd
> 
> Thanks,
> NeilBrown
> 
> 
> ---
> 
> NeilBrown (12):
>       sunrpc: close connection when a request is irretrievably lost.
>       sunrpc/cache: change deferred-request hash table to use hlist.
>       sunrpc: fix memory leak in unix_gid cache.
>       nfsd/idmap: drop special request deferal in favour of improved default.
>       sunrpc/cache: retry cache lookups that return -ETIMEDOUT
>       sunrpc/cache: allow thread to block while waiting for cache update.
>       sunrpc/cache: avoid variable over-loading in cache_defer_req
>       sunrpc/cache: use list_del_init for the list_head entries in cache_deferred_req
>       sunrpc/cache: recheck cache validity after cache_defer_req
>       sunrpc/cache: simplify cache_fresh_locked and cache_fresh_unlocked.
>       sunrpc/cache: make sure deferred requests eventually get revisited.
>       sunrpc/cache: rename queue_loose to cache_dequeue
> 
> 
>  fs/nfsd/export.c               |   18 ++++
>  fs/nfsd/nfs4idmap.c            |  105 +++---------------------
>  include/linux/sunrpc/cache.h   |    5 +
>  include/linux/sunrpc/svcauth.h |   10 ++
>  net/sunrpc/cache.c             |  173 ++++++++++++++++++++++++++--------------
>  net/sunrpc/svc.c               |    3 +
>  net/sunrpc/svc_xprt.c          |   11 +++
>  net/sunrpc/svcauth_unix.c      |   31 ++++++-
>  8 files changed, 192 insertions(+), 164 deletions(-)
> 
> -- 
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 00/12] Some improvements to request deferral and related code
  2009-08-04 14:04   ` [PATCH 00/12] Some improvements to request deferral and related code J. Bruce Fields
@ 2009-08-07  4:13     ` Neil Brown
       [not found]       ` <19067.43518.105153.247173-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
  0 siblings, 1 reply; 26+ messages in thread
From: Neil Brown @ 2009-08-07  4:13 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

On Tuesday August 4, bfields@fieldses.org wrote:
> On Tue, Aug 04, 2009 at 03:22:38PM +1000, NeilBrown wrote:
> >  This series fixes a few little bugs and tidies up some code but does
> >  two main important things.
> > 
> >  1/ 'allow thread to block....' will wait a little while if there is a
> >  cache miss.  If an answer is available in that time, it continues on
> >  it's merry way.  If no answer arrives, the old deferral approach is
> >  used.  It waits 5 seconds if there are spare nfsd threads, and 1
> >  second if there all threads are busy.  I have almost nothing with
> >  which to justify these numbers.
> 
> I think the v4 server at least should return NFS4ERR_DELAY in this case
> instead of doing the internal replay.  That avoids possible problems
> with non-idempotent compound ops.

If the request has been handed to nfsd, then yes I agree.  We probably
want some way for nfsd to mark the request as "don't replay" so that
an error will propagate out.  Currently we map that error to EJUKEBOX
for v3 or v4, but you are right, we want ERR_DELAY for v4.

If the request is still in the RPC code (trying to identify the
origin or to decode the crypto) then we cannot return ERR_DELAY, but
as none of the request will have been processed yet, there is no room
for a problem with non-idempotent ops.

It has occurred to me that we could throw away the current request
deferral completely:  if we don't feel comfortable delaying the thread
for as long as it takes, we just return an error or drop the request
(closing any connection).
I'm not sure I'd be comfortable doing that if there were only a few
(8?) threads though.
Maybe if we got dynamic nfsd threads so that new ones could be created
on demand I would feel quite happy to discard the deferral stuff and
just use a delay.

> 
> >From the protocol point of view I don't know if there's any rule of
> thumb about when it'd be best to return DELAY.  Perhaps it's best to
> avoid it whenever possible, but when the delay is on the order of
> seconds it sounds reasonable to me.

Of course you don't know how long the delay will be until it happens:-)

But I agree.  Delay internally if possible, but as soon as that seems
to be awkward (e.g. run out of threads), return DELAY

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 26+ messages in thread

[parent not found: <19067.43518.105153.247173-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>]

* Re: [PATCH 00/12] Some improvements to request deferral and related code
       [not found]       ` <19067.43518.105153.247173-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
@ 2009-08-10 15:05         ` J. Bruce Fields
  0 siblings, 0 replies; 26+ messages in thread
From: J. Bruce Fields @ 2009-08-10 15:05 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-nfs

On Fri, Aug 07, 2009 at 02:13:50PM +1000, Neil Brown wrote:
> On Tuesday August 4, bfields@fieldses.org wrote:
> > On Tue, Aug 04, 2009 at 03:22:38PM +1000, NeilBrown wrote:
> > >  This series fixes a few little bugs and tidies up some code but does
> > >  two main important things.
> > > 
> > >  1/ 'allow thread to block....' will wait a little while if there is a
> > >  cache miss.  If an answer is available in that time, it continues on
> > >  it's merry way.  If no answer arrives, the old deferral approach is
> > >  used.  It waits 5 seconds if there are spare nfsd threads, and 1
> > >  second if there all threads are busy.  I have almost nothing with
> > >  which to justify these numbers.
> > 
> > I think the v4 server at least should return NFS4ERR_DELAY in this case
> > instead of doing the internal replay.  That avoids possible problems
> > with non-idempotent compound ops.
> 
> If the request has been handed to nfsd, then yes I agree.  We probably
> want some way for nfsd to mark the request as "don't replay" so that
> an error will propagate out.  Currently we map that error to EJUKEBOX
> for v3 or v4, but you are right, we want ERR_DELAY for v4.

Note actually DELAY and JUKEBOX are both 10008--the v4 spec just renamed
it.

> If the request is still in the RPC code (trying to identify the
> origin or to decode the crypto) then we cannot return ERR_DELAY, but
> as none of the request will have been processed yet, there is no room
> for a problem with non-idempotent ops.
> 
> It has occurred to me that we could throw away the current request
> deferral completely:  if we don't feel comfortable delaying the thread
> for as long as it takes, we just return an error or drop the request
> (closing any connection).
> I'm not sure I'd be comfortable doing that if there were only a few
> (8?) threads though.
> Maybe if we got dynamic nfsd threads so that new ones could be created
> on demand I would feel quite happy to discard the deferral stuff and
> just use a delay.

How about just increasing the default number of threads for now?

--b.

> 
> > 
> > >From the protocol point of view I don't know if there's any rule of
> > thumb about when it'd be best to return DELAY.  Perhaps it's best to
> > avoid it whenever possible, but when the delay is on the order of
> > seconds it sounds reasonable to me.
> 
> Of course you don't know how long the delay will be until it happens:-)
> 
> But I agree.  Delay internally if possible, but as soon as that seems
> to be awkward (e.g. run out of threads), return DELAY
> 
> Thanks,
> NeilBrown
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2009-08-26  0:42 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-08-04  5:22 [PATCH 00/12] Some improvements to request deferral and related code NeilBrown
     [not found] ` <20090804051145.15929.11356.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2009-08-04  5:22   ` [PATCH 03/12] sunrpc/cache: simplify cache_fresh_locked and cache_fresh_unlocked NeilBrown
     [not found]     ` <20090804052238.15929.17142.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2009-08-04 15:45       ` J. Bruce Fields
2009-08-04  5:22   ` [PATCH 04/12] sunrpc/cache: recheck cache validity after cache_defer_req NeilBrown
     [not found]     ` <20090804052238.15929.56800.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2009-08-04 20:42       ` J. Bruce Fields
2009-08-06  4:57         ` Neil Brown
     [not found]           ` <19066.25248.283061.383233-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2009-08-25 21:50             ` J. Bruce Fields
2009-08-26  0:42               ` Neil Brown
2009-08-04  5:22   ` [PATCH 01/12] sunrpc/cache: rename queue_loose to cache_dequeue NeilBrown
     [not found]     ` <20090804052238.15929.91015.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2009-08-04 14:05       ` J. Bruce Fields
2009-08-04  5:22   ` [PATCH 02/12] sunrpc/cache: make sure deferred requests eventually get revisited NeilBrown
     [not found]     ` <20090804052238.15929.74402.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2009-08-04 15:02       ` J. Bruce Fields
2009-08-04  5:22   ` [PATCH 08/12] sunrpc/cache: retry cache lookups that return -ETIMEDOUT NeilBrown
2009-08-04  5:22   ` [PATCH 06/12] sunrpc/cache: avoid variable over-loading in cache_defer_req NeilBrown
     [not found]     ` <20090804052239.15929.87201.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2009-08-04 20:47       ` J. Bruce Fields
2009-08-06  4:35         ` Neil Brown
2009-08-04  5:22   ` [PATCH 09/12] nfsd/idmap: drop special request deferal in favour of improved default NeilBrown
2009-08-04  5:22   ` [PATCH 07/12] sunrpc/cache: allow thread to block while waiting for cache update NeilBrown
2009-08-04  5:22   ` [PATCH 05/12] sunrpc/cache: use list_del_init for the list_head entries in cache_deferred_req NeilBrown
2009-08-04  5:22   ` [PATCH 11/12] sunrpc/cache: change deferred-request hash table to use hlist NeilBrown
2009-08-04  5:22   ` [PATCH 10/12] sunrpc: fix memory leak in unix_gid cache NeilBrown
     [not found]     ` <20090804052239.15929.71459.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2009-08-04 20:55       ` J. Bruce Fields
2009-08-04  5:22   ` [PATCH 12/12] sunrpc: close connection when a request is irretrievably lost NeilBrown
2009-08-04 14:04   ` [PATCH 00/12] Some improvements to request deferral and related code J. Bruce Fields
2009-08-07  4:13     ` Neil Brown
     [not found]       ` <19067.43518.105153.247173-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2009-08-10 15:05         ` J. Bruce Fields

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox