[PATCH] kNFSd - 10 of 10 - Add a warning when upcalls fail,

Linux NFS development
 help / color / mirror / Atom feed

* [PATCH] kNFSd - 10 of 10 - Add a warning when upcalls fail,
  2004-05-18  1:31 [PATCH] kNFSd - 0 of 10 - Introduction NeilBrown
@ 2004-05-18  1:31 ` NeilBrown
  2004-05-18  1:31 ` [PATCH] kNFSd - 2 of 10 - Make sure CACHE_NEGATIVE is cleared when a cache entry is updates NeilBrown
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: NeilBrown @ 2004-05-18  1:31 UTC (permalink / raw)
  To: Andrew Morton; +Cc: nfs


.. to help the user diagnose problems caused by user-level daemons not running.

From: "J. Bruce Fields" <bfields@fieldses.org>

 ----------- Diffstat output ------------
 ./include/linux/sunrpc/cache.h |    3 ++-
 ./net/sunrpc/cache.c           |   18 ++++++++++++++----
 2 files changed, 16 insertions(+), 5 deletions(-)

diff ./include/linux/sunrpc/cache.h~current~ ./include/linux/sunrpc/cache.h
--- ./include/linux/sunrpc/cache.h~current~	2004-05-18 11:15:40.000000000 +1000
+++ ./include/linux/sunrpc/cache.h	2004-05-18 11:19:07.000000000 +1000
@@ -97,7 +97,8 @@ struct cache_detail {
 	struct proc_dir_entry   *flush_ent, *channel_ent, *content_ent;
 
 	atomic_t		readers;		/* how many time is /chennel open */
-	time_t			last_close;		/* it no readers, when did last close */
+	time_t			last_close;		/* if no readers, when did last close */
+	time_t			last_warn;		/* when we last warned about no readers */
 };
 
 

diff ./net/sunrpc/cache.c~current~ ./net/sunrpc/cache.c
--- ./net/sunrpc/cache.c~current~	2004-05-18 11:16:01.000000000 +1000
+++ ./net/sunrpc/cache.c	2004-05-18 11:21:24.000000000 +1000
@@ -214,6 +214,7 @@ void cache_register(struct cache_detail 
 	cd->entries = 0;
 	atomic_set(&cd->readers, 0);
 	cd->last_close = 0;
+	cd->last_warn = -1;
 	list_add(&cd->others, &cache_list);
 	spin_unlock(&cache_list_lock);
 
@@ -905,7 +906,15 @@ void qword_addhex(char **bpp, int *lp, c
 	*lp = len;
 }
 
-			
+void warn_no_listener(struct cache_detail *detail)
+{
+	if (detail->last_warn != detail->last_close) {
+		detail->last_warn = detail->last_close;
+		printk(KERN_WARNING "nfsd: nobody listening for %s upcall;"
+				" has some daemon %s?\n", detail->name,
+		      		detail->last_close?"died" : "not been started");
+	}
+}
 
 /*
  * register an upcall request to user-space.
@@ -923,9 +932,10 @@ static int cache_make_upcall(struct cach
 		return -EINVAL;
 
 	if (atomic_read(&detail->readers) == 0 &&
-	    detail->last_close < get_seconds() - 60)
-		/* nobody is listening */
-		return -EINVAL;
+	    detail->last_close < get_seconds() - 30) {
+			warn_no_listener(detail);
+			return -EINVAL;
+	}
 
 	buf = kmalloc(PAGE_SIZE, GFP_KERNEL);
 	if (!buf)


-------------------------------------------------------
This SF.Net email is sponsored by: SourceForge.net Broadband
Sign-up now for SourceForge Broadband and get the fastest
6.0/768 connection for only $19.95/mo for the first 3 months!
http://ads.osdn.com/?ad_id=2562&alloc_id=6184&op=click
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] kNFSd - 2 of 10 - Make sure CACHE_NEGATIVE is cleared when a cache entry is updates.
  2004-05-18  1:31 [PATCH] kNFSd - 0 of 10 - Introduction NeilBrown
  2004-05-18  1:31 ` [PATCH] kNFSd - 10 of 10 - Add a warning when upcalls fail, NeilBrown
@ 2004-05-18  1:31 ` NeilBrown
  2004-05-18  1:31 ` [PATCH] kNFSd - 5 of 10 - Protect reference to exp across calls to nfsd_cross_mnt NeilBrown
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: NeilBrown @ 2004-05-18  1:31 UTC (permalink / raw)
  To: Andrew Morton; +Cc: nfs


This is important for update-in-place caches which may change
from being negative to posative.

Thanks to "J. Bruce Fields" <bfields@fieldses.org>
and Olaf Kirch <okir@suse.de>

 ----------- Diffstat output ------------
 ./include/linux/sunrpc/cache.h |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff ./include/linux/sunrpc/cache.h~current~ ./include/linux/sunrpc/cache.h
--- ./include/linux/sunrpc/cache.h~current~	2004-05-18 11:09:16.000000000 +1000
+++ ./include/linux/sunrpc/cache.h	2004-05-18 11:15:40.000000000 +1000
@@ -193,8 +193,11 @@ RTN *FNAME ARGS										\
 					t2 = tmp; tmp = new; new = t2;			\
 				}							\
 				if (test_bit(CACHE_NEGATIVE,  &item->MEMBER.flags))	\
-					 set_bit(CACHE_NEGATIVE, &tmp->MEMBER.flags);	\
-				else {UPDATE;}						\
+					set_bit(CACHE_NEGATIVE, &tmp->MEMBER.flags);	\
+				else {							\
+					UPDATE;						\
+					clear_bit(CACHE_NEGATIVE, &tmp->MEMBER.flags);	\
+				}							\
 			}								\
 			if (set||new) write_unlock(&(DETAIL)->hash_lock);		\
 			else read_unlock(&(DETAIL)->hash_lock);				\


-------------------------------------------------------
This SF.Net email is sponsored by: SourceForge.net Broadband
Sign-up now for SourceForge Broadband and get the fastest
6.0/768 connection for only $19.95/mo for the first 3 months!
http://ads.osdn.com/?ad_id=2562&alloc_id=6184&op=click
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] kNFSd - 6 of 10 - Fix race conditions in idmapper
  2004-05-18  1:31 [PATCH] kNFSd - 0 of 10 - Introduction NeilBrown
                   ` (5 preceding siblings ...)
  2004-05-18  1:31 ` [PATCH] kNFSd - 9 of 10 - Remove check on number of threads waiting on user-space NeilBrown
@ 2004-05-18  1:31 ` NeilBrown
  2004-05-18  1:31 ` [PATCH] kNFSd - 8 of 10 - Reduce timeout when waiting for idmapper userspace daemon NeilBrown
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: NeilBrown @ 2004-05-18  1:31 UTC (permalink / raw)
  To: Andrew Morton; +Cc: nfs


Also fix leaks on error; split up code a bit to make it
easier to verify correctness.

From: "J. Bruce Fields" <bfields@fieldses.org>

 ----------- Diffstat output ------------
 ./fs/nfsd/nfs4idmap.c |  102 +++++++++++++++++++++++++++++++-------------------
 1 files changed, 65 insertions(+), 37 deletions(-)

diff ./fs/nfsd/nfs4idmap.c~current~ ./fs/nfsd/nfs4idmap.c
--- ./fs/nfsd/nfs4idmap.c~current~	2004-05-18 11:09:16.000000000 +1000
+++ ./fs/nfsd/nfs4idmap.c	2004-05-18 11:17:53.000000000 +1000
@@ -409,13 +409,19 @@ struct idmap_defer_req {
        atomic_t			count;
 };
 
-static void
+static inline void
 put_mdr(struct idmap_defer_req *mdr)
 {
 	if (atomic_dec_and_test(&mdr->count))
 		kfree(mdr);
 }
 
+static inline void
+get_mdr(struct idmap_defer_req *mdr)
+{
+	atomic_inc(&mdr->count);
+}
+
 static void
 idmap_revisit(struct cache_deferred_req *dreq, int toomany)
 {
@@ -433,31 +439,68 @@ idmap_defer(struct cache_req *req)
 		container_of(req, struct idmap_defer_req, req);
 
 	mdr->deferred_req.revisit = idmap_revisit;
+	get_mdr(mdr);
 	return (&mdr->deferred_req);
 }
 
+static inline int
+do_idmap_lookup(struct ent *(*lookup_fn)(struct ent *, int), struct ent *key,
+		struct cache_detail *detail, struct ent **item,
+		struct idmap_defer_req *mdr)
+{
+	*item = lookup_fn(key, 0);
+	if (!*item)
+		return -ENOMEM;
+	return cache_check(detail, &(*item)->h, &mdr->req);
+}
+
 static int threads_waiting = 0;
 
 static inline int
-idmap_lookup_wait(struct idmap_defer_req *mdr, wait_queue_t waitq, struct
-		svc_rqst *rqstp) {
-	int ret = -ETIMEDOUT;
+idmap_lookup_wait(struct idmap_defer_req *mdr, struct svc_rqst *rqstp,
+		struct ent *item))
+{
+	DEFINE_WAIT(wait);
 
-	set_task_state(current, TASK_INTERRUPTIBLE);
-	lock_kernel();
+	prepare_to_wait(&mdr->waitq, &wait, TASK_INTERRUPTIBLE);
 	/* XXX: Does it matter that threads_waiting isn't per-server? */
 	/* Note: BKL prevents races with nfsd_svc and other lookups */
-	if (2 * threads_waiting > rqstp->rq_server->sv_nrthreads)
-		goto out;
-	threads_waiting++;
-	schedule_timeout(10 * HZ);
-	threads_waiting--;
-	ret = 0;
+	lock_kernel();
+	if (!test_bit(CACHE_VALID, &item->h.flags)) {
+		if (2 * threads_waiting > rqstp->rq_server->sv_nrthreads)
+			goto out;
+		threads_waiting++;
+		schedule_timeout(10 * HZ);
+		threads_waiting--;
+	}
 out:
 	unlock_kernel();
-	remove_wait_queue(&mdr->waitq, &waitq);
-	set_task_state(current, TASK_RUNNING);
-	put_mdr(mdr);
+	finish_wait(&mdr->waitq, &wait);
+}
+
+static inline int
+do_idmap_lookup_nowait(struct ent *(*lookup_fn)(struct ent *, int),
+			struct ent *key, struct cache_detail *detail,
+			struct ent **item)
+{
+	int ret = -ENOMEM;
+
+	*item = lookup_fn(key, 0);
+	if (!*item)
+		goto out_err;
+	ret = -ETIMEDOUT;
+	if (!test_bit(CACHE_VALID, &(*item)->h.flags)
+			|| (*item)->h.expiry_time < get_seconds()
+			|| detail->flush_time > (*item)->h.last_refresh)
+		goto out_put;
+	ret = -ENOENT;
+	if (test_bit(CACHE_NEGATIVE, &(*item)->h.flags))
+		goto out_put;
+	return 0;
+out_put:
+	ent_put(&(*item)->h, detail);
+out_err:
+	*item = NULL;
 	return ret;
 }
 
@@ -467,36 +510,21 @@ idmap_lookup(struct svc_rqst *rqstp,
 		struct cache_detail *detail, struct ent **item)
 {
 	struct idmap_defer_req *mdr;
-	DECLARE_WAITQUEUE(waitq, current);
 	int ret;
 
-	*item = lookup_fn(key, 0);
-	if (!*item)
-		return -ENOMEM;
 	mdr = kmalloc(sizeof(*mdr), GFP_KERNEL);
+	if (!mdr)
+		return -ENOMEM;
 	memset(mdr, 0, sizeof(*mdr));
+	atomic_set(&mdr->count, 1);
 	init_waitqueue_head(&mdr->waitq);
-	add_wait_queue(&mdr->waitq, &waitq);
-	atomic_set(&mdr->count, 2);
 	mdr->req.defer = idmap_defer;
-	ret = cache_check(detail, &(*item)->h, &mdr->req);
+	ret = do_idmap_lookup(lookup_fn, key, detail, item, mdr);
 	if (ret == -EAGAIN) {
-		ret = idmap_lookup_wait(mdr, waitq, rqstp);
-		if (ret)
-			goto out;
-		/* Try again, but don't wait. */
-		*item = lookup_fn(key, 0);
-		ret = -ENOMEM;
-		if (!*item)
-			goto out;
-		ret = -ETIMEDOUT;
-		if (!test_bit(CACHE_VALID, &(*item)->h.flags)) {
-			ent_put(&(*item)->h, detail);
-			goto out;
-		}
-		ret = cache_check(detail, &(*item)->h, NULL);
+		idmap_wait(mdr, rqstp, *item);
+		ret = do_idmap_lookup_nowait(lookup_fn, key, detail, item);
 	}
-out:
+	put_mdr(mdr);
 	return ret;
 }
 


-------------------------------------------------------
This SF.Net email is sponsored by: SourceForge.net Broadband
Sign-up now for SourceForge Broadband and get the fastest
6.0/768 connection for only $19.95/mo for the first 3 months!
http://ads.osdn.com/?ad_id=2562&alloc_id=6184&op=click
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] kNFSd - 9 of 10 - Remove check on number of threads waiting on user-space.
  2004-05-18  1:31 [PATCH] kNFSd - 0 of 10 - Introduction NeilBrown
                   ` (4 preceding siblings ...)
  2004-05-18  1:31 ` [PATCH] kNFSd - 3 of 10 - Allow larger writes to sunrpc/svc caches NeilBrown
@ 2004-05-18  1:31 ` NeilBrown
  2004-05-18  1:31 ` [PATCH] kNFSd - 6 of 10 - Fix race conditions in idmapper NeilBrown
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: NeilBrown @ 2004-05-18  1:31 UTC (permalink / raw)
  To: Andrew Morton; +Cc: nfs

Currently we are counting the number of threads already asleep and returning an
immediate NFS4ERR_DELAY (==JUKEBOX) error if more than half are already asleep.

This patch removes that logic, so instead we only return NFS4ERR_DELAY if an
upcall times out (if it takes more than a second to return).

With the thread counting there is the risk that even when all the relevant
subsystems are responsive, the client may still see occasional NFS4ERR_DELAY
returns just because, by coincidence, several upcalls were initiated at the
same time.  I expect clients will delay several seconds before retrying after
NFS4ERR_DELAY, so this will be quite noticeable to users.  Sporadic long delays
like this are likely to lead users to suspect a problem somewhere, when in fact
there is none.

The current scheme ensures that we can still process requests not depending on
upcalls, even when all threads would otherwise be tied up waiting on upcalls.
However, this is not something that should happen under normal circumstances;
if a server spends a significant portion of its time with all threads waiting
for upcalls, this a sign that something is seriously wrong.

In such a circumstance (e.g., an ldap server dies), we can, at least, bound the
waiting time to a second without the need for counting threads.

In short, removing the thread-counting will allow us to behave predictably when
things are working, while still allowing some progress when they don't.

It would be a worthwhile project to measure the amount of time threads spend
waiting for upcalls (or for reads, for that matter); if a significant portion
of the time they spend handling requests is spent sleeping, then there's an
opportunity to improve nfsd performance: if we can break the one-to-one mapping
between requests and threads, then we can lower the number of threads required
to keep the nfs server busy.

However, both the currently available options for doing this are problematic:
returning JUKEBOX/DELAY errors at random times will lead to unpredictable
performance, and saving a copy of the request to be processed from scratch
again later is wasteful and makes it difficult to provide correct semantics,
especially in the NFSv4 case.

So for now I believe waits with short timeouts are the best option.

From: "J. Bruce Fields" <bfields@fieldses.org>

 ----------- Diffstat output ------------
 ./fs/nfsd/nfs4idmap.c |   27 ++-------------------------
 1 files changed, 2 insertions(+), 25 deletions(-)

diff ./fs/nfsd/nfs4idmap.c~current~ ./fs/nfsd/nfs4idmap.c
--- ./fs/nfsd/nfs4idmap.c~current~	2004-05-18 11:18:30.000000000 +1000
+++ ./fs/nfsd/nfs4idmap.c	2004-05-18 11:18:55.000000000 +1000
@@ -454,30 +454,6 @@ do_idmap_lookup(struct ent *(*lookup_fn)
 	return cache_check(detail, &(*item)->h, &mdr->req);
 }

-static int threads_waiting = 0;
-
-static inline int
-idmap_lookup_wait(struct idmap_defer_req *mdr, struct svc_rqst *rqstp,
-		struct ent *item))
-{
-	DEFINE_WAIT(wait);
-
-	prepare_to_wait(&mdr->waitq, &wait, TASK_INTERRUPTIBLE);
-	/* XXX: Does it matter that threads_waiting isn't per-server? */
-	/* Note: BKL prevents races with nfsd_svc and other lookups */
-	lock_kernel();
-	if (!test_bit(CACHE_VALID, &item->h.flags)) {
-		if (2 * threads_waiting > rqstp->rq_server->sv_nrthreads)
-			goto out;
-		threads_waiting++;
-		schedule_timeout(1 * HZ);
-		threads_waiting--;
-	}
-out:
-	unlock_kernel();
-	finish_wait(&mdr->waitq, &wait);
-}
-
 static inline int
 do_idmap_lookup_nowait(struct ent *(*lookup_fn)(struct ent *, int),
 			struct ent *key, struct cache_detail *detail,
@@ -521,7 +497,8 @@ idmap_lookup(struct svc_rqst *rqstp,
 	mdr->req.defer = idmap_defer;
 	ret = do_idmap_lookup(lookup_fn, key, detail, item, mdr);
 	if (ret == -EAGAIN) {
-		idmap_wait(mdr, rqstp, *item);
+		wait_event_interruptible_timeout(mdr->waitq,
+			test_bit(CACHE_VALID, &(*item)->h.flags), 1 * HZ);
 		ret = do_idmap_lookup_nowait(lookup_fn, key, detail, item);
 	}
 	put_mdr(mdr);

-------------------------------------------------------
This SF.Net email is sponsored by: SourceForge.net Broadband
Sign-up now for SourceForge Broadband and get the fastest
6.0/768 connection for only $19.95/mo for the first 3 months!
http://ads.osdn.com/?ad_id=2562&alloc_id=6184&op=click
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] kNFSd - 8 of 10 - Reduce timeout when waiting for idmapper userspace daemon.
  2004-05-18  1:31 [PATCH] kNFSd - 0 of 10 - Introduction NeilBrown
                   ` (6 preceding siblings ...)
  2004-05-18  1:31 ` [PATCH] kNFSd - 6 of 10 - Fix race conditions in idmapper NeilBrown
@ 2004-05-18  1:31 ` NeilBrown
  2004-05-27 10:20   ` nfs3 dentry encoding Olaf Kirch
  2004-05-18  1:31 ` [PATCH] kNFSd - 4 of 10 - Change fh_compose to NOT consume a reference to the dentry NeilBrown
  2004-05-18  1:31 ` [PATCH] kNFSd - 1 of 10 - Use correct _bh locking on sv_lock NeilBrown
  9 siblings, 1 reply; 14+ messages in thread
From: NeilBrown @ 2004-05-18  1:31 UTC (permalink / raw)
  To: Andrew Morton; +Cc: nfs


1 second should be plenty of time; if we're going to take longer than that it's
probably better just to return NFS4ERR_DELAY and let the client retry anyway.

From: "J. Bruce Fields" <bfields@fieldses.org>

 ----------- Diffstat output ------------
 ./fs/nfsd/nfs4idmap.c |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

diff ./fs/nfsd/nfs4idmap.c~current~ ./fs/nfsd/nfs4idmap.c
--- ./fs/nfsd/nfs4idmap.c~current~	2004-05-18 11:18:14.000000000 +1000
+++ ./fs/nfsd/nfs4idmap.c	2004-05-18 11:18:30.000000000 +1000
@@ -470,7 +470,7 @@ idmap_lookup_wait(struct idmap_defer_req
 		if (2 * threads_waiting > rqstp->rq_server->sv_nrthreads)
 			goto out;
 		threads_waiting++;
-		schedule_timeout(10 * HZ);
+		schedule_timeout(1 * HZ);
 		threads_waiting--;
 	}
 out:


-------------------------------------------------------
This SF.Net email is sponsored by: SourceForge.net Broadband
Sign-up now for SourceForge Broadband and get the fastest
6.0/768 connection for only $19.95/mo for the first 3 months!
http://ads.osdn.com/?ad_id=2562&alloc_id=6184&op=click
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] kNFSd - 4 of 10 - Change fh_compose to NOT consume a reference to the dentry.
  2004-05-18  1:31 [PATCH] kNFSd - 0 of 10 - Introduction NeilBrown
                   ` (7 preceding siblings ...)
  2004-05-18  1:31 ` [PATCH] kNFSd - 8 of 10 - Reduce timeout when waiting for idmapper userspace daemon NeilBrown
@ 2004-05-18  1:31 ` NeilBrown
  2004-05-18  1:31 ` [PATCH] kNFSd - 1 of 10 - Use correct _bh locking on sv_lock NeilBrown
  9 siblings, 0 replies; 14+ messages in thread
From: NeilBrown @ 2004-05-18  1:31 UTC (permalink / raw)
  To: Andrew Morton; +Cc: nfs


fh_compose currently consumes a reference to the dentry but 
not the export point.  This is both inconsistent and confusing. 

It is better if a routine like this doesn't consume reference points,
so with this patch, it doesn't.
This fixes a couple of very subtle and unusual reference counting errors.

 ----------- Diffstat output ------------
 ./fs/nfsd/export.c  |    3 +--
 ./fs/nfsd/nfs3xdr.c |   12 +++++++-----
 ./fs/nfsd/nfs4xdr.c |    1 +
 ./fs/nfsd/nfsfh.c   |    2 +-
 ./fs/nfsd/nfsproc.c |    1 +
 ./fs/nfsd/vfs.c     |    9 +++++++--
 6 files changed, 18 insertions(+), 10 deletions(-)

diff ./fs/nfsd/export.c~current~ ./fs/nfsd/export.c
--- ./fs/nfsd/export.c~current~	2004-05-18 11:09:16.000000000 +1000
+++ ./fs/nfsd/export.c	2004-05-18 11:16:20.000000000 +1000
@@ -899,7 +899,7 @@ exp_rootfh(svc_client *clp, char *path, 
 	 * fh must be initialized before calling fh_compose
 	 */
 	fh_init(&fh, maxsize);
-	if (fh_compose(&fh, exp, dget(nd.dentry), NULL))
+	if (fh_compose(&fh, exp, nd.dentry, NULL))
 		err = -EINVAL;
 	else
 		err = 0;
@@ -932,7 +932,6 @@ exp_pseudoroot(struct auth_domain *clp, 
 	if (!fsid_key || IS_ERR(fsid_key))
 		return nfserr_perm;
 
-	dget(fsid_key->ek_export->ex_dentry);
 	rv = fh_compose(fhp, fsid_key->ek_export, 
 			  fsid_key->ek_export->ex_dentry, NULL);
 	expkey_put(&fsid_key->h, &svc_expkey_cache);

diff ./fs/nfsd/nfs3xdr.c~current~ ./fs/nfsd/nfs3xdr.c
--- ./fs/nfsd/nfs3xdr.c~current~	2004-05-18 11:09:16.000000000 +1000
+++ ./fs/nfsd/nfs3xdr.c	2004-05-18 11:16:20.000000000 +1000
@@ -799,6 +799,7 @@ compose_entry_fh(struct nfsd3_readdirres
 {
 	struct svc_export	*exp;
 	struct dentry		*dparent, *dchild;
+	int rv = 0;
 
 	dparent = cd->fh.fh_dentry;
 	exp  = cd->fh.fh_export;
@@ -813,11 +814,12 @@ compose_entry_fh(struct nfsd3_readdirres
 		dchild = lookup_one_len(name, dparent, namlen);
 	if (IS_ERR(dchild))
 		return 1;
-	if (d_mountpoint(dchild))
-		return 1;
-	if (fh_compose(fhp, exp, dchild, &cd->fh) != 0 || !dchild->d_inode)
-		return 1;
-	return 0;
+	if (d_mountpoint(dchild) ||
+	    fh_compose(fhp, exp, dchild, &cd->fh) != 0 ||
+	    !dchild->d_inode)
+		rv = 1;
+	dput(dchild);
+	return rv;
 }
 
 /*

diff ./fs/nfsd/nfs4xdr.c~current~ ./fs/nfsd/nfs4xdr.c
--- ./fs/nfsd/nfs4xdr.c~current~	2004-05-18 11:09:16.000000000 +1000
+++ ./fs/nfsd/nfs4xdr.c	2004-05-18 11:16:20.000000000 +1000
@@ -1706,6 +1706,7 @@ nfsd4_encode_dirent(struct readdir_cd *c
 		nfserr = nfsd4_encode_fattr(NULL, exp,
 				dentry, p, &buflen, cd->rd_bmval,
 				cd->rd_rqstp);
+		dput(dentry);
 		if (!nfserr) {
 			p += buflen;
 			goto out;

diff ./fs/nfsd/nfsfh.c~current~ ./fs/nfsd/nfsfh.c
--- ./fs/nfsd/nfsfh.c~current~	2004-05-18 11:09:16.000000000 +1000
+++ ./fs/nfsd/nfsfh.c	2004-05-18 11:16:20.000000000 +1000
@@ -367,7 +367,7 @@ fh_compose(struct svc_fh *fhp, struct sv
 		printk(KERN_ERR "fh_compose: called with maxsize %d! %s/%s\n",
 		       fhp->fh_maxsize, parent->d_name.name, dentry->d_name.name);
 
-	fhp->fh_dentry = dentry; /* our internal copy */
+	fhp->fh_dentry = dget(dentry); /* our internal copy */
 	fhp->fh_export = exp;
 	cache_get(&exp->h);
 

diff ./fs/nfsd/nfsproc.c~current~ ./fs/nfsd/nfsproc.c
--- ./fs/nfsd/nfsproc.c~current~	2004-05-18 11:09:16.000000000 +1000
+++ ./fs/nfsd/nfsproc.c	2004-05-18 11:16:20.000000000 +1000
@@ -212,6 +212,7 @@ nfsd_proc_create(struct svc_rqst *rqstp,
 	nfserr = fh_compose(newfhp, dirfhp->fh_export, dchild, dirfhp);
 	if (!nfserr && !dchild->d_inode)
 		nfserr = nfserr_noent;
+	dput(dchild);
 	if (nfserr) {
 		if (nfserr != nfserr_noent)
 			goto out_unlock;

diff ./fs/nfsd/vfs.c~current~ ./fs/nfsd/vfs.c
--- ./fs/nfsd/vfs.c~current~	2004-05-18 11:09:16.000000000 +1000
+++ ./fs/nfsd/vfs.c	2004-05-18 11:16:20.000000000 +1000
@@ -208,6 +208,7 @@ nfsd_lookup(struct svc_rqst *rqstp, stru
 	err = fh_compose(resfh, exp, dentry, fhp);
 	if (!err && !dentry->d_inode)
 		err = nfserr_noent;
+	dput(dentry);
 out:
 	return err;
 
@@ -859,7 +860,7 @@ nfsd_create(struct svc_rqst *rqstp, stru
 		char *fname, int flen, struct iattr *iap,
 		int type, dev_t rdev, struct svc_fh *resfhp)
 {
-	struct dentry	*dentry, *dchild;
+	struct dentry	*dentry, *dchild = NULL;
 	struct inode	*dirp;
 	int		err;
 
@@ -965,6 +966,8 @@ nfsd_create(struct svc_rqst *rqstp, stru
 	if (!err)
 		err = fh_update(resfhp);
 out:
+	if (dchild && !IS_ERR(dchild))
+		dput(dchild);
 	return err;
 
 out_nfserr:
@@ -982,7 +985,7 @@ nfsd_create_v3(struct svc_rqst *rqstp, s
 		struct svc_fh *resfhp, int createmode, u32 *verifier,
 	        int *truncp)
 {
-	struct dentry	*dentry, *dchild;
+	struct dentry	*dentry, *dchild = NULL;
 	struct inode	*dirp;
 	int		err;
 	__u32		v_mtime=0, v_atime=0;
@@ -1111,6 +1114,8 @@ nfsd_create_v3(struct svc_rqst *rqstp, s
 
  out:
 	fh_unlock(fhp);
+	if (dchild && !IS_ERR(dchild))
+		dput(dchild);
  	return err;
  
  out_nfserr:


-------------------------------------------------------
This SF.Net email is sponsored by: SourceForge.net Broadband
Sign-up now for SourceForge Broadband and get the fastest
6.0/768 connection for only $19.95/mo for the first 3 months!
http://ads.osdn.com/?ad_id=2562&alloc_id=6184&op=click
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] kNFSd - 5 of 10 - Protect reference to exp across calls to nfsd_cross_mnt
  2004-05-18  1:31 [PATCH] kNFSd - 0 of 10 - Introduction NeilBrown
  2004-05-18  1:31 ` [PATCH] kNFSd - 10 of 10 - Add a warning when upcalls fail, NeilBrown
  2004-05-18  1:31 ` [PATCH] kNFSd - 2 of 10 - Make sure CACHE_NEGATIVE is cleared when a cache entry is updates NeilBrown
@ 2004-05-18  1:31 ` NeilBrown
  2004-05-18  1:31 ` [PATCH] kNFSd - 7 of 10 - Improve idmapper behaviour on failure NeilBrown
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: NeilBrown @ 2004-05-18  1:31 UTC (permalink / raw)
  To: Andrew Morton; +Cc: nfs


nfsd_cross_mnt can release the reference to the passed
svc_export structure when it returns a different
svc_export structure.  So we need to make sure we have a 
counted reference before, and drop the reference afterwards.

 ----------- Diffstat output ------------
 ./fs/nfsd/nfs4xdr.c           |    3 +++
 ./fs/nfsd/vfs.c               |    4 +++-
 ./include/linux/nfsd/export.h |    6 +++++-
 3 files changed, 11 insertions(+), 2 deletions(-)

diff ./fs/nfsd/nfs4xdr.c~current~ ./fs/nfsd/nfs4xdr.c
--- ./fs/nfsd/nfs4xdr.c~current~	2004-05-18 11:16:20.000000000 +1000
+++ ./fs/nfsd/nfs4xdr.c	2004-05-18 11:16:40.000000000 +1000
@@ -1686,6 +1686,7 @@ nfsd4_encode_dirent(struct readdir_cd *c
 			goto error;
 		}
 
+		exp_get(exp);
 		if (d_mountpoint(dentry)) {
 			if ((nfserr = nfsd_cross_mnt(cd->rd_rqstp, &dentry, 
 					 &exp))) {	
@@ -1697,6 +1698,7 @@ nfsd4_encode_dirent(struct readdir_cd *c
 			 * this call will be retried.
 			 */
 				dput(dentry);
+				exp_put(exp);
 				nfserr = nfserr_dropit;
 				goto error;
 			}
@@ -1707,6 +1709,7 @@ nfsd4_encode_dirent(struct readdir_cd *c
 				dentry, p, &buflen, cd->rd_bmval,
 				cd->rd_rqstp);
 		dput(dentry);
+		exp_put(exp);
 		if (!nfserr) {
 			p += buflen;
 			goto out;

diff ./fs/nfsd/vfs.c~current~ ./fs/nfsd/vfs.c
--- ./fs/nfsd/vfs.c~current~	2004-05-18 11:16:20.000000000 +1000
+++ ./fs/nfsd/vfs.c	2004-05-18 11:16:40.000000000 +1000
@@ -141,10 +141,11 @@ nfsd_lookup(struct svc_rqst *rqstp, stru
 	/* Obtain dentry and export. */
 	err = fh_verify(rqstp, fhp, S_IFDIR, MAY_EXEC);
 	if (err)
-		goto out;
+		return err;
 
 	dparent = fhp->fh_dentry;
 	exp  = fhp->fh_export;
+	exp_get(exp);
 
 	err = nfserr_acces;
 
@@ -210,6 +211,7 @@ nfsd_lookup(struct svc_rqst *rqstp, stru
 		err = nfserr_noent;
 	dput(dentry);
 out:
+	exp_put(exp);
 	return err;
 
 out_nfserr:

diff ./include/linux/nfsd/export.h~current~ ./include/linux/nfsd/export.h
--- ./include/linux/nfsd/export.h~current~	2004-05-18 11:09:16.000000000 +1000
+++ ./include/linux/nfsd/export.h	2004-05-18 11:16:40.000000000 +1000
@@ -110,6 +110,10 @@ static inline void exp_put(struct svc_ex
 	svc_export_put(&exp->h, &svc_export_cache);
 }
 
+static inline void exp_get(struct svc_export *exp)
+{
+	cache_get(&exp->h);
+}
 static inline struct svc_export *
 exp_find(struct auth_domain *clp, int fsid_type, u32 *fsidv,
 	 struct cache_req *reqp)
@@ -118,7 +122,7 @@ exp_find(struct auth_domain *clp, int fs
 	if (ek && !IS_ERR(ek)) {
 		struct svc_export *exp = ek->ek_export;
 		int err;
-		cache_get(&exp->h);
+		exp_get(exp);
 		expkey_put(&ek->h, &svc_expkey_cache);
 		if (exp &&
 		    (err = cache_check(&svc_export_cache, &exp->h, reqp)))


-------------------------------------------------------
This SF.Net email is sponsored by: SourceForge.net Broadband
Sign-up now for SourceForge Broadband and get the fastest
6.0/768 connection for only $19.95/mo for the first 3 months!
http://ads.osdn.com/?ad_id=2562&alloc_id=6184&op=click
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] kNFSd - 1 of 10 - Use correct _bh locking on sv_lock.
  2004-05-18  1:31 [PATCH] kNFSd - 0 of 10 - Introduction NeilBrown
                   ` (8 preceding siblings ...)
  2004-05-18  1:31 ` [PATCH] kNFSd - 4 of 10 - Change fh_compose to NOT consume a reference to the dentry NeilBrown
@ 2004-05-18  1:31 ` NeilBrown
  9 siblings, 0 replies; 14+ messages in thread
From: NeilBrown @ 2004-05-18  1:31 UTC (permalink / raw)
  To: Andrew Morton; +Cc: nfs


With the _bh, we can dead-lock..

 ----------- Diffstat output ------------
 ./net/sunrpc/svcsock.c |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

diff ./net/sunrpc/svcsock.c~current~ ./net/sunrpc/svcsock.c
--- ./net/sunrpc/svcsock.c~current~	2004-05-18 11:09:16.000000000 +1000
+++ ./net/sunrpc/svcsock.c	2004-05-18 11:15:11.000000000 +1000
@@ -1511,9 +1511,9 @@ static void svc_revisit(struct cache_def
 	dprintk("revisit queued\n");
 	svsk = dr->svsk;
 	dr->svsk = NULL;
-	spin_lock(&serv->sv_lock);
+	spin_lock_bh(&serv->sv_lock);
 	list_add(&dr->handle.recent, &svsk->sk_deferred);
-	spin_unlock(&serv->sv_lock);
+	spin_unlock_bh(&serv->sv_lock);
 	set_bit(SK_DEFERRED, &svsk->sk_flags);
 	svc_sock_enqueue(svsk);
 	svc_sock_put(svsk);
@@ -1544,10 +1544,10 @@ svc_defer(struct cache_req *req)
 		dr->argslen = rqstp->rq_arg.len >> 2;
 		memcpy(dr->args, rqstp->rq_arg.head[0].iov_base-skip, dr->argslen<<2);
 	}
-	spin_lock(&rqstp->rq_server->sv_lock);
+	spin_lock_bh(&rqstp->rq_server->sv_lock);
 	rqstp->rq_sock->sk_inuse++;
 	dr->svsk = rqstp->rq_sock;
-	spin_unlock(&rqstp->rq_server->sv_lock);
+	spin_unlock_bh(&rqstp->rq_server->sv_lock);
 
 	dr->handle.revisit = svc_revisit;
 	return &dr->handle;
@@ -1577,7 +1577,7 @@ static struct svc_deferred_req *svc_defe
 	
 	if (!test_bit(SK_DEFERRED, &svsk->sk_flags))
 		return NULL;
-	spin_lock(&serv->sv_lock);
+	spin_lock_bh(&serv->sv_lock);
 	clear_bit(SK_DEFERRED, &svsk->sk_flags);
 	if (!list_empty(&svsk->sk_deferred)) {
 		dr = list_entry(svsk->sk_deferred.next,
@@ -1586,6 +1586,6 @@ static struct svc_deferred_req *svc_defe
 		list_del_init(&dr->handle.recent);
 		set_bit(SK_DEFERRED, &svsk->sk_flags);
 	}
-	spin_unlock(&serv->sv_lock);
+	spin_unlock_bh(&serv->sv_lock);
 	return dr;
 }


-------------------------------------------------------
This SF.Net email is sponsored by: SourceForge.net Broadband
Sign-up now for SourceForge Broadband and get the fastest
6.0/768 connection for only $19.95/mo for the first 3 months!
http://ads.osdn.com/?ad_id=2562&alloc_id=6184&op=click
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] kNFSd - 7 of 10 - Improve idmapper behaviour on failure.
  2004-05-18  1:31 [PATCH] kNFSd - 0 of 10 - Introduction NeilBrown
                   ` (2 preceding siblings ...)
  2004-05-18  1:31 ` [PATCH] kNFSd - 5 of 10 - Protect reference to exp across calls to nfsd_cross_mnt NeilBrown
@ 2004-05-18  1:31 ` NeilBrown
  2004-05-18  1:31 ` [PATCH] kNFSd - 3 of 10 - Allow larger writes to sunrpc/svc caches NeilBrown
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: NeilBrown @ 2004-05-18  1:31 UTC (permalink / raw)
  To: Andrew Morton; +Cc: nfs


Slightly better behavior on failed mapping (which may happen either because
idmapd is not running, or because there it has told us it doesn't know the
mapping.):
	on name->id (setattr), return BADNAME.  (I used ESRCH to
		communicate BADNAME, just because it was the first error in
		include/asm-generic/errno-base.h that had something to
		do with nonexistance of something, and that we weren't
		already using.)
	id->name (getattr), return a string representation of the numerical
		id.  This is probably useless to the client, especially
		since we're unlikely to accept such a string on a setattr,
		but perhaps some client will find it mildly helpful.


From: "J. Bruce Fields" <bfields@fieldses.org>

 ----------- Diffstat output ------------
 ./fs/nfsd/nfs4idmap.c       |    4 ++++
 ./fs/nfsd/nfsproc.c         |    1 +
 ./include/linux/nfsd/nfsd.h |    1 +
 3 files changed, 6 insertions(+)

diff ./fs/nfsd/nfs4idmap.c~current~ ./fs/nfsd/nfs4idmap.c
--- ./fs/nfsd/nfs4idmap.c~current~	2004-05-18 11:17:53.000000000 +1000
+++ ./fs/nfsd/nfs4idmap.c	2004-05-18 11:18:14.000000000 +1000
@@ -543,6 +543,8 @@ idmap_name_to_id(struct svc_rqst *rqstp,
 	key.name[namelen] = '\0';
 	strlcpy(key.authname, rqstp->rq_client->name, sizeof(key.authname));
 	ret = idmap_lookup(rqstp, nametoid_lookup, &key, &nametoid_cache, &item);
+	if (ret == -ENOENT)
+		ret = -ESRCH; /* nfserr_badname */
 	if (ret)
 		return ret;
 	*id = item->id;
@@ -561,6 +563,8 @@ idmap_id_to_name(struct svc_rqst *rqstp,
 
 	strlcpy(key.authname, rqstp->rq_client->name, sizeof(key.authname));
 	ret = idmap_lookup(rqstp, idtoname_lookup, &key, &idtoname_cache, &item);
+	if (ret == -ENOENT)
+		return sprintf(name, "%u", id);
 	if (ret)
 		return ret;
 	ret = strlen(item->name);

diff ./fs/nfsd/nfsproc.c~current~ ./fs/nfsd/nfsproc.c
--- ./fs/nfsd/nfsproc.c~current~	2004-05-18 11:16:20.000000000 +1000
+++ ./fs/nfsd/nfsproc.c	2004-05-18 11:18:14.000000000 +1000
@@ -589,6 +589,7 @@ nfserrno (int errno)
 		{ nfserr_jukebox, -ETIMEDOUT },
 		{ nfserr_dropit, -EAGAIN },
 		{ nfserr_dropit, -ENOMEM },
+		{ nfserr_badname, -ESRCH },
 		{ -1, -EIO }
 	};
 	int	i;

diff ./include/linux/nfsd/nfsd.h~current~ ./include/linux/nfsd/nfsd.h
--- ./include/linux/nfsd/nfsd.h~current~	2004-05-18 11:09:16.000000000 +1000
+++ ./include/linux/nfsd/nfsd.h	2004-05-18 11:18:14.000000000 +1000
@@ -199,6 +199,7 @@ void		nfsd_lockd_shutdown(void);
 #define	nfserr_grace		__constant_htonl(NFSERR_GRACE)
 #define	nfserr_no_grace		__constant_htonl(NFSERR_NO_GRACE)
 #define	nfserr_reclaim_bad	__constant_htonl(NFSERR_RECLAIM_BAD)
+#define	nfserr_badname		__constant_htonl(NFSERR_BADNAME)
 
 /* error codes for internal use */
 /* if a request fails due to kmalloc failure, it gets dropped.


-------------------------------------------------------
This SF.Net email is sponsored by: SourceForge.net Broadband
Sign-up now for SourceForge Broadband and get the fastest
6.0/768 connection for only $19.95/mo for the first 3 months!
http://ads.osdn.com/?ad_id=2562&alloc_id=6184&op=click
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] kNFSd - 0 of 10 - Introduction
@ 2004-05-18  1:31 NeilBrown
  2004-05-18  1:31 ` [PATCH] kNFSd - 10 of 10 - Add a warning when upcalls fail, NeilBrown
                   ` (9 more replies)
  0 siblings, 10 replies; 14+ messages in thread
From: NeilBrown @ 2004-05-18  1:31 UTC (permalink / raw)
  To: Andrew Morton; +Cc: nfs

Following are 10 patches to nfsd in 2.6.6-mm3

The first two are simple bug fixes.
The next three are minor general improvements and cleanups.
The remaining 4 are specific the the "idmapper" used by nfsdv4 
and resolve issues with the name/uid lookup process.

Thanks,
NeilBrown



-------------------------------------------------------
This SF.Net email is sponsored by: SourceForge.net Broadband
Sign-up now for SourceForge Broadband and get the fastest
6.0/768 connection for only $19.95/mo for the first 3 months!
http://ads.osdn.com/?ad_id=2562&alloc_id=6184&op=click
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] kNFSd - 3 of 10 - Allow larger writes to sunrpc/svc caches.
  2004-05-18  1:31 [PATCH] kNFSd - 0 of 10 - Introduction NeilBrown
                   ` (3 preceding siblings ...)
  2004-05-18  1:31 ` [PATCH] kNFSd - 7 of 10 - Improve idmapper behaviour on failure NeilBrown
@ 2004-05-18  1:31 ` NeilBrown
  2004-05-18  1:31 ` [PATCH] kNFSd - 9 of 10 - Remove check on number of threads waiting on user-space NeilBrown
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: NeilBrown @ 2004-05-18  1:31 UTC (permalink / raw)
  To: Andrew Morton; +Cc: nfs


We currently serialize all writes to these caches with queue_io_sem,
so we only needed one buffer.
There is some need for larger-than-one-page writes, so we can just
statically allocate a buffer.

 ----------- Diffstat output ------------
 ./net/sunrpc/cache.c |   20 ++++++--------------
 1 files changed, 6 insertions(+), 14 deletions(-)

diff ./net/sunrpc/cache.c~current~ ./net/sunrpc/cache.c
--- ./net/sunrpc/cache.c~current~	2004-05-18 11:09:16.000000000 +1000
+++ ./net/sunrpc/cache.c	2004-05-18 11:16:01.000000000 +1000
@@ -652,12 +652,13 @@ cache_read(struct file *filp, char *buf,
 	return err ? err :  count;
 }
 
+static char write_buf[8192]; /* protected by queue_io_sem */
+
 static ssize_t
 cache_write(struct file *filp, const char *buf, size_t count,
 	    loff_t *ppos)
 {
 	int err;
-	char *page;
 	struct cache_detail *cd = PDE(filp->f_dentry->d_inode)->data;
 
 	if (ppos != &filp->f_pos)
@@ -665,31 +666,22 @@ cache_write(struct file *filp, const cha
 
 	if (count == 0)
 		return 0;
-	if (count > PAGE_SIZE)
+	if (count >= sizeof(write_buf))
 		return -EINVAL;
 
 	down(&queue_io_sem);
 
-	page = kmalloc(PAGE_SIZE, GFP_KERNEL);
-	if (page == NULL) {
-		up(&queue_io_sem);
-		return -ENOMEM;
-	}
-
-	if (copy_from_user(page, buf, count)) {
+	if (copy_from_user(write_buf, buf, count)) {
 		up(&queue_io_sem);
-		kfree(page);
 		return -EFAULT;
 	}
-	if (count < PAGE_SIZE)
-		page[count] = '\0';
+	write_buf[count] = '\0';
 	if (cd->cache_parse)
-		err = cd->cache_parse(cd, page, count);
+		err = cd->cache_parse(cd, write_buf, count);
 	else
 		err = -EINVAL;
 
 	up(&queue_io_sem);
-	kfree(page);
 	return err ? err : count;
 }
 


-------------------------------------------------------
This SF.Net email is sponsored by: SourceForge.net Broadband
Sign-up now for SourceForge Broadband and get the fastest
6.0/768 connection for only $19.95/mo for the first 3 months!
http://ads.osdn.com/?ad_id=2562&alloc_id=6184&op=click
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* nfs3 dentry encoding
  2004-05-18  1:31 ` [PATCH] kNFSd - 8 of 10 - Reduce timeout when waiting for idmapper userspace daemon NeilBrown
@ 2004-05-27 10:20   ` Olaf Kirch
  2004-05-27 11:09     ` Neil Brown
  0 siblings, 1 reply; 14+ messages in thread
From: Olaf Kirch @ 2004-05-27 10:20 UTC (permalink / raw)
  To: NeilBrown; +Cc: nfs

Hi Neil,

you recently posted a patch that should fix readdir encoding in
nfsd. You say there

	Note that as the offset and whole response is known to be
	4byte-aligned, the offset pointer will never be split over
	two pages.

This is not true. The dirent offset is a 64bit quantity, so it's quite
possible it will be split across the page boundary. I'm working on a
patch...

Olaf
-- 
Olaf Kirch     |  The Hardware Gods hate me.
okir@suse.de   |
---------------+ 


-------------------------------------------------------
This SF.Net email is sponsored by: Oracle 10g
Get certified on the hottest thing ever to hit the market... Oracle 10g. 
Take an Oracle 10g class now, and we'll give you the exam FREE.
http://ads.osdn.com/?ad_id=3149&alloc_id=8166&op=click
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: nfs3 dentry encoding
  2004-05-27 10:20   ` nfs3 dentry encoding Olaf Kirch
@ 2004-05-27 11:09     ` Neil Brown
  2004-05-27 12:15       ` Olaf Kirch
  0 siblings, 1 reply; 14+ messages in thread
From: Neil Brown @ 2004-05-27 11:09 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: nfs

On Thursday May 27, okir@suse.de wrote:
> Hi Neil,
> 
> you recently posted a patch that should fix readdir encoding in
> nfsd. You say there
> 
> 	Note that as the offset and whole response is known to be
> 	4byte-aligned, the offset pointer will never be split over
> 	two pages.
> 
> This is not true. The dirent offset is a 64bit quantity, so it's quite
> possible it will be split across the page boundary. I'm working on a
> patch...

Hmmm, just as well someone is watching over me!  Thanks.
I must have been thinking NFSv2.

I guess we need an offset1 and an offset2 which may not be
consecutive.

I'll look forward to the patch.

NeilBrown



-------------------------------------------------------
This SF.Net email is sponsored by: Oracle 10g
Get certified on the hottest thing ever to hit the market... Oracle 10g. 
Take an Oracle 10g class now, and we'll give you the exam FREE.
http://ads.osdn.com/?ad_id=3149&alloc_id=8166&op=click
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: nfs3 dentry encoding
  2004-05-27 11:09     ` Neil Brown
@ 2004-05-27 12:15       ` Olaf Kirch
  0 siblings, 0 replies; 14+ messages in thread
From: Olaf Kirch @ 2004-05-27 12:15 UTC (permalink / raw)
  To: Neil Brown; +Cc: nfs

[-- Attachment #1: Type: text/plain, Size: 747 bytes --]

On Thu, May 27, 2004 at 09:09:49PM +1000, Neil Brown wrote:
> Hmmm, just as well someone is watching over me!  Thanks.

I'm currently looking into a failure with a Solaris 9 running connectathon
against a 2.6 Linux server. It fails in a test case doing a readdirplus
of a large directory, because it scribbles the offset value into
random memory locations (usually at page offset 0)

See the attached patch, which I also posted in a follow-up to
your message to to linux-kernel.

The first hunk is rather cosmetic; rq_res is cleared by a memset
so there's not need to zero the offset pointer(s) here. In fact
we don't initialize them in readdirplus anyway.

Olaf
-- 
Olaf Kirch     |  The Hardware Gods hate me.
okir@suse.de   |
---------------+ 

[-- Attachment #2: nfsd-encode-dirent3 --]
[-- Type: text/plain, Size: 2757 bytes --]

Index: linux-2.6.5/fs/nfsd/nfs3proc.c
===================================================================
--- linux-2.6.5.orig/fs/nfsd/nfs3proc.c	2004-05-27 12:22:43.000000000 +0200
+++ linux-2.6.5/fs/nfsd/nfs3proc.c	2004-05-27 12:34:40.000000000 +0200
@@ -437,7 +437,6 @@
 	resp->buflen = count;
 	resp->common.err = nfs_ok;
 	resp->buffer = argp->buffer;
-	resp->offset = NULL;
 	resp->rqstp = rqstp;
 	nfserr = nfsd_readdir(rqstp, &resp->fh, (loff_t*) &argp->cookie, 
 					&resp->common, nfs3svc_encode_entry);
Index: linux-2.6.5/fs/nfsd/nfs3xdr.c
===================================================================
--- linux-2.6.5.orig/fs/nfsd/nfs3xdr.c	2004-05-27 12:22:43.000000000 +0200
+++ linux-2.6.5/fs/nfsd/nfs3xdr.c	2004-05-27 12:39:09.000000000 +0200
@@ -887,8 +887,18 @@
 	int		elen;		/* estimated entry length in words */
 	int		num_entry_words = 0;	/* actual number of words */
 
-	if (cd->offset)
-		xdr_encode_hyper(cd->offset, (u64) offset);
+	if (cd->offset) {
+		u64 offset64 = offset;
+
+		if (unlikely(cd->offset1)) {
+			/* we ended up with offset on a page boundary */
+			*cd->offset = htonl(offset64 >> 32);
+			*cd->offset1 = htonl(offset64 & 0xffffffff);
+			cd->offset1 = NULL;
+		} else {
+			xdr_encode_hyper(cd->offset, (u64) offset);
+		}
+	}
 
 	/*
 	dprintk("encode_entry(%.*s @%ld%s)\n",
@@ -969,17 +979,32 @@
 			/* update offset */
 			cd->offset = cd->buffer + (cd->offset - tmp);
 		} else {
+			unsigned int offset_r = (cd->offset - tmp) << 2;
+
+			/* update pointer to offset location.
+			 * This is a 64bit quantity, so we need to
+			 * deal with 3 cases:
+			 *  -	entirely in first page
+			 *  -	entirely in second page
+			 *  -	4 bytes in each page
+			 */
+			if (offset_r + 8 <= len1) {
+				cd->offset = p + (cd->offset - tmp);
+			} else if (offset_r >= len1) {
+				cd->offset -= len1 >> 2;
+			} else {
+				/* sitting on the fence */
+				BUG_ON(offset_r != len1 - 4);
+				cd->offset = p + (cd->offset - tmp);
+				cd->offset1 = tmp;
+			}
+
 			len2 = (num_entry_words << 2) - len1;
 
 			/* move from temp page to current and next pages */
 			memmove(p, tmp, len1);
 			memmove(tmp, (caddr_t)tmp+len1, len2);
 
-			/* update offset */
-			if (((cd->offset - tmp) << 2) <= len1)
-				cd->offset = p + (cd->offset - tmp);
-			else
-				cd->offset -= len1 >> 2;
 			p = tmp + (len2 >> 2);
 		}
 	}
Index: linux-2.6.5/include/linux/nfsd/xdr3.h
===================================================================
--- linux-2.6.5.orig/include/linux/nfsd/xdr3.h	2004-05-27 12:22:43.000000000 +0200
+++ linux-2.6.5/include/linux/nfsd/xdr3.h	2004-05-27 12:28:20.000000000 +0200
@@ -183,6 +183,7 @@
 	u32 *			buffer;
 	int			buflen;
 	u32 *			offset;
+	u32 *			offset1;
 	struct svc_rqst *	rqstp;
 
 };

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2004-05-31 11:00 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-05-18  1:31 [PATCH] kNFSd - 0 of 10 - Introduction NeilBrown
2004-05-18  1:31 ` [PATCH] kNFSd - 10 of 10 - Add a warning when upcalls fail, NeilBrown
2004-05-18  1:31 ` [PATCH] kNFSd - 2 of 10 - Make sure CACHE_NEGATIVE is cleared when a cache entry is updates NeilBrown
2004-05-18  1:31 ` [PATCH] kNFSd - 5 of 10 - Protect reference to exp across calls to nfsd_cross_mnt NeilBrown
2004-05-18  1:31 ` [PATCH] kNFSd - 7 of 10 - Improve idmapper behaviour on failure NeilBrown
2004-05-18  1:31 ` [PATCH] kNFSd - 3 of 10 - Allow larger writes to sunrpc/svc caches NeilBrown
2004-05-18  1:31 ` [PATCH] kNFSd - 9 of 10 - Remove check on number of threads waiting on user-space NeilBrown
2004-05-18  1:31 ` [PATCH] kNFSd - 6 of 10 - Fix race conditions in idmapper NeilBrown
2004-05-18  1:31 ` [PATCH] kNFSd - 8 of 10 - Reduce timeout when waiting for idmapper userspace daemon NeilBrown
2004-05-27 10:20   ` nfs3 dentry encoding Olaf Kirch
2004-05-27 11:09     ` Neil Brown
2004-05-27 12:15       ` Olaf Kirch
2004-05-18  1:31 ` [PATCH] kNFSd - 4 of 10 - Change fh_compose to NOT consume a reference to the dentry NeilBrown
2004-05-18  1:31 ` [PATCH] kNFSd - 1 of 10 - Use correct _bh locking on sv_lock NeilBrown

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox