linux-nfs.vger.kernel.org archive mirror
* hang on server boot
@ 2010-09-20  3:36 J. Bruce Fields
From: J. Bruce Fields @ 2010-09-20  3:36 UTC
  To: linux-nfs

Trying to do some reboot testing, I hit this bug.  I'm planning to queue
up this fix for 2.6.37 absent any objections.

(Arguably it could go to 2.6.36, but we're getting closer to the next
release, the consequences of this bug aren't too horrible, and it's not
a new regression.)

--b.

commit fae1561d4f5ccd315741cf4cb9ca2fb7c3fbe377
Author: J. Bruce Fields <bfields@redhat.com>
Date:   Sun Sep 19 22:55:06 2010 -0400

    nfsd4: fix hang on fast-booting nfs servers
    
    The last_close field of a cache_detail is initialized to zero, so the
    condition
    
    	detail->last_close < seconds_since_boot() - 30
    
    may be false even for a cache that was never opened.
    
    However, we want to immediately fail upcalls to caches that were never
    opened: in the case of the auth_unix_gid cache, especially, which may
    never be opened by mountd (if the --manage-gids option is not set), we
    want to fail the upcall immediately.  Otherwise client requests will be
    dropped unnecessarily on reboot.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>

diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index da872f9..ca7c621 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -1091,6 +1091,23 @@ static void warn_no_listener(struct cache_detail *detail)
 	}
 }
 
+static bool cache_listeners_exist(struct cache_detail *detail)
+{
+	if (atomic_read(&detail->readers))
+		return true;
+	if (detail->last_close == 0)
+		/* This cache was never opened */
+		return false;
+	if (detail->last_close < seconds_since_boot() - 30)
+		/*
+		 * We allow for the possibility that someone might
+		 * restart a userspace daemon without restarting the
+		 * server; but after 30 seconds, we give up.
+		 */
+		return false;
+	return true;
+}
+
 /*
  * register an upcall request to user-space and queue it up for read() by the
  * upcall daemon.
@@ -1109,10 +1126,9 @@ int sunrpc_cache_pipe_upcall(struct cache_detail *detail, struct cache_head *h,
 	char *bp;
 	int len;
 
-	if (atomic_read(&detail->readers) == 0 &&
-	    detail->last_close < seconds_since_boot() - 30) {
-			warn_no_listener(detail);
-			return -EINVAL;
+	if (!cache_listeners_exist(detail)) {
+		warn_no_listener(detail);
+		return -EINVAL;
 	}
 
 	buf = kmalloc(PAGE_SIZE, GFP_KERNEL);

