From: "J. Bruce Fields" <bfields@fieldses.org>
To: NeilBrown <neilb@suse.de>
Cc: linux-nfs@vger.kernel.org
Subject: Re: [PATCH 1/7] sunrpc: fix race in new cache_wait code.
Date: Wed, 22 Sep 2010 13:50:27 -0400
Message-ID: <20100922175027.GF15560@fieldses.org>
In-Reply-To: <20100922025506.31745.67177.stgit@localhost.localdomain>
On Wed, Sep 22, 2010 at 12:55:06PM +1000, NeilBrown wrote:
> If we set up to wait for a cache item to be filled in, and then find
> that it is no longer pending, it could be that some other thread is
> in 'cache_revisit_request' and has moved our request to its 'pending' list.
> So when our setup_deferral calls cache_revisit_request it will find nothing to
> put on the pending list, and do nothing.
>
> We then return from cache_wait_req, thus leaving the 'sleeper'
> on-stack structure open to being corrupted by subsequent stack usage.
>
> However that 'sleeper' could still be on the 'pending' list that the
> other thread is looking at and so any corruption could cause it to behave badly.
>
> To avoid this race we simply take the same path as if the
> 'wait_for_completion_interruptible_timeout' was interrupted and if the
> sleeper is no longer on the list (which it won't be) we wait on the
> completion - which will ensure that any other cache_revisit_request
> will have let go of the sleeper.
OK, but I don't think we need that first CACHE_PENDING check in
setup_deferral at all. Best just to ignore its result in cache_wait_req?:
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index d789dfc..82804b4 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -567,10 +567,9 @@ static int cache_wait_req(struct cache_req *req, struct cache_head *item, int ti
 	sleeper.completion = COMPLETION_INITIALIZER_ONSTACK(sleeper.completion);
 	dreq->revisit = cache_restart_thread;
 
-	ret = setup_deferral(dreq, item);
+	setup_deferral(dreq, item);
 
-	if (ret ||
-	    wait_for_completion_interruptible_timeout(
+	if (wait_for_completion_interruptible_timeout(
 		    &sleeper.completion, timeout) <= 0) {
 		/* The completion wasn't completed, so we need
 		 * to clean up
Then it's obvious that we always wait for completion, and that we fulfill
the basic requirement to avoid corruption here.
(Which is, in more detail: cache_wait_req must not return while dreq is
still reachable from anywhere else. dreq is reachable from elsewhere
only as long as it is hashed, and anyone else who unhashes it will call
our revisit (which unconditionally calls complete) and then forget about
it. So it suffices for cache_wait_req either to wait for completion or
to unhash dreq *itself* (before someone else does).)
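To make that rule concrete, here is a minimal single-threaded model in C.
It is only a sketch under stated assumptions: struct sketch_dreq, its three
flags, and the sketch_* helpers are hypothetical names invented for
illustration, not the code in net/sunrpc/cache.c, and the locking the real
code needs around the hash is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a deferred request living on the waiter's stack. */
struct sketch_dreq {
	bool hashed;    /* still findable through the deferred-request hash */
	bool claimed;   /* another thread unhashed it, revisit not yet called */
	bool completed; /* revisit has run and called complete() */
};

enum sketch_action { SKETCH_SAFE_TO_RETURN, SKETCH_MUST_WAIT };

/* Another thread removes the request from the hash; it now owes a revisit. */
static bool sketch_other_unhash(struct sketch_dreq *d)
{
	if (!d->hashed)
		return false;
	d->hashed = false;
	d->claimed = true;
	return true;
}

/* The other thread calls revisit: complete, then forget the request. */
static void sketch_other_revisit(struct sketch_dreq *d)
{
	assert(d->claimed);
	d->claimed = false;
	d->completed = true;
}

/* What the waiter has to do once its timed wait has given up. */
static enum sketch_action sketch_waiter_cleanup(struct sketch_dreq *d)
{
	if (d->completed)
		return SKETCH_SAFE_TO_RETURN;   /* nobody else can reach it */
	if (d->hashed) {
		d->hashed = false;              /* we unhashed it ourselves */
		return SKETCH_SAFE_TO_RETURN;
	}
	return SKETCH_MUST_WAIT;                /* someone else still holds it */
}
```

In this model a request that nobody else touched is unhashed by the waiter
and can be dropped at once, while a request that another thread has already
unhashed yields SKETCH_MUST_WAIT until sketch_other_revisit has run.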
Also we don't need the PENDING check to ensure we wait only when
necessary--we only wait while dreq is hashed, and as long as we're
hashed anyone clearing PENDING will also end up doing the complete().
In the deferral case it's maybe a useful optimization if it avoids
an unnecessary drop sometimes. Here it doesn't help.
--b.