From: NeilBrown <neilb@suse.de>
To: Bodo Stroesser <bstroesser@ts.fujitsu.com>
Cc: bfields@fieldses.org, linux-nfs@vger.kernel.org
Subject: Re: [PATCH 1/2] sunrpc/cache: remove races with queuing an upcall.
Date: Wed, 13 Mar 2013 16:58:46 +1100 [thread overview]
Message-ID: <20130313165846.707392cf@notabene.brown> (raw)
In-Reply-To: <61eb00$3g8j4e@dgate20u.abg.fsc.net>
On 05 Mar 2013 16:07:39 +0100 Bodo Stroesser <bstroesser@ts.fujitsu.com>
wrote:
> Hi,
>
> the patch in my previous mail was a bit too optimized,
> in that it obtains a cache_request* pointer from each entry
> on the detail->queue, even if the entry really is a cache_reader.
>
> As the resulting pointer is only dereferenced after checking
> cache_queue->reader, this isn't a bug, but it is unclean.
>
> Thus I changed the patch below once again to keep the clean
> handling as originally implemented.
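The cleanliness issue above is about applying container_of() to a struct cache_queue before knowing which structure it is actually embedded in. A minimal userspace sketch (struct layouts simplified, names mirror the kernel ones but are illustrative only) shows why the discriminator should be checked before the cast is used:

```c
/* Illustrative sketch, not kernel code: a struct cache_queue may be
 * embedded in either a cache_request or a cache_reader, so the
 * ->reader discriminator must be checked before the container_of()
 * result is trusted. */
#include <assert.h>
#include <stddef.h>

/* Simplified version of the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct cache_queue   { int reader; };
struct cache_request { struct cache_queue q; int item; };
struct cache_reader  { struct cache_queue q; int offset; };

static int item_of(struct cache_queue *cq)
{
	/* Check the discriminator first: only then is the cast sound. */
	if (cq->reader)
		return -1;	/* a cache_reader, not a cache_request */
	return container_of(cq, struct cache_request, q)->item;
}
```

Computing the container pointer first and checking ->reader afterwards happens to work (the bogus pointer is never dereferenced), which is exactly the "not a bug, but unclean" situation described above.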
>
> Bodo
>
>
> On 04 Mar 2013 07:10:00 +0100 NeilBrown <neilb@suse.de> wrote:
>
> > We currently queue an upcall after setting CACHE_PENDING,
> > and dequeue after clearing CACHE_PENDING.
> > So a request should only be present when CACHE_PENDING is set.
> >
> > However we don't combine the test and the enqueue/dequeue in
> > a protected region, so it is possible (if unlikely) for a race
> > to result in a request being queued without CACHE_PENDING set,
> > or a request to be absent despite CACHE_PENDING.
> >
> > So: include a test for CACHE_PENDING inside the regions of
> > enqueue and dequeue where queue_lock is held, and abort
> > the operation if the value is not as expected.
> >
> > Also remove the 'return' from cache_dequeue() to ensure it always
> > removes all entries: As there is no locking between setting
> > CACHE_PENDING and calling sunrpc_cache_pipe_upcall it is not
> > inconceivable for some other thread to clear CACHE_PENDING and then
> > someone else to set it and call sunrpc_cache_pipe_upcall, both before
> > the original thread completes its call.
> >
> > With this, it is perfectly safe and correct to:
> > - call cache_dequeue() if and only if we have just
> > cleared CACHE_PENDING
> > - call sunrpc_cache_pipe_upcall() (via cache_make_upcall)
> > if and only if we have just set CACHE_PENDING.
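The invariant above is a double-checked flag: the flag is flipped outside the lock, so both the enqueue and the dequeue re-test it once the queue lock is held and abort if they lost the race. A minimal userspace sketch of that pattern (pthread mutex standing in for queue_lock, plain ints standing in for CACHE_PENDING and the queued request; all names illustrative):

```c
/* Sketch of the re-check-under-lock pattern described above. */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static int pending;	/* stands in for CACHE_PENDING in ch->flags */
static int queued;	/* stands in for the request on detail->queue */

/* Enqueue only while PENDING is still set, mirroring the re-check
 * added to sunrpc_cache_pipe_upcall(). */
static int make_upcall(void)
{
	int ret = 0;

	pthread_mutex_lock(&queue_lock);
	if (pending)
		queued = 1;
	else
		ret = -EAGAIN;	/* lost the race: caller frees its buffers */
	pthread_mutex_unlock(&queue_lock);
	return ret;
}

/* Dequeue only while PENDING is clear, mirroring cache_dequeue(). */
static int dequeue(void)
{
	int ret = 0;

	pthread_mutex_lock(&queue_lock);
	if (pending)
		ret = -EAGAIN;	/* pending again: leave the queue alone */
	else
		queued = 0;
	pthread_mutex_unlock(&queue_lock);
	return ret;
}
```

With both sides re-checking under the same lock, a request can only sit on the queue while PENDING is set, which is the safety property the commit message claims.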
> >
> > Reported-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
> > Signed-off-by: NeilBrown <neilb@suse.de>
> > ---
> > net/sunrpc/cache.c | 17 ++++++++++++++---
> > 1 file changed, 14 insertions(+), 3 deletions(-)
> >
> > diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
> > index 9afa439..0400a92 100644
> > --- a/net/sunrpc/cache.c
> > +++ b/net/sunrpc/cache.c
> > @@ -1022,6 +1022,9 @@ static void cache_dequeue(struct cache_detail *detail, struct cache_head *ch)
> > struct cache_request *cr = container_of(cq, struct cache_request, q);
> > if (cr->item != ch)
> > continue;
> > + if (test_bit(CACHE_PENDING, &ch->flags))
> > + /* Lost a race and it is pending again */
> > + break;
> > if (cr->readers != 0)
> > continue;
> > list_del(&cr->q.list);
> > @@ -1029,7 +1032,6 @@ static void cache_dequeue(struct cache_detail *detail, struct cache_head *ch)
> > cache_put(cr->item, detail);
> > kfree(cr->buf);
> > kfree(cr);
> > - return;
>
> Just removing "return" is not enough. If the loop no longer stops
> after dequeueing the first entry found, some other changes are
> necessary also:
>
> 1) use list_for_each_entry_safe() instead of list_for_each_entry()
>
> 2) don't call spin_unlock() in the loop.
>
> Thus, at the end of this mail I added a revised patch. With this
> patch cache_dequeue() no longer frees the requests in the loop,
> but moves them to a temporary queue.
> After the loop it calls spin_unlock() and does the kfree() and
> cache_put() in a second loop.
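The two-phase shape Bodo describes (unlink matching entries onto a local list while the lock is held, then release the lock and free them) can be sketched in userspace with a plain singly linked list and a pthread mutex standing in for queue_lock; all names here are illustrative, not the kernel API:

```c
/* Sketch of "collect under the lock, free outside it": unlinking uses
 * a pointer-to-pointer walk so removal is safe while iterating, the
 * same role list_for_each_entry_safe()/list_move() play in the patch. */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct request {
	int item;		/* stands in for cr->item */
	struct request *next;
};

static struct request *queue;
static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;

static void enqueue(int item)
{
	struct request *r = malloc(sizeof(*r));

	r->item = item;
	pthread_mutex_lock(&queue_lock);
	r->next = queue;
	queue = r;
	pthread_mutex_unlock(&queue_lock);
}

/* Unlink every entry matching @item while the lock is held, but defer
 * the free() calls until after the unlock, just as the revised
 * cache_dequeue() defers cache_put()/kfree().  Returns the number of
 * entries freed. */
static int dequeue_all(int item)
{
	struct request *dequeued = NULL, **pp, *r;
	int freed = 0;

	pthread_mutex_lock(&queue_lock);
	pp = &queue;
	while ((r = *pp) != NULL) {
		if (r->item == item) {
			*pp = r->next;		/* unlink, keep iterating */
			r->next = dequeued;
			dequeued = r;
		} else {
			pp = &r->next;
		}
	}
	pthread_mutex_unlock(&queue_lock);

	while (dequeued) {			/* free with no lock held */
		r = dequeued;
		dequeued = r->next;
		free(r);
		freed++;
	}
	return freed;
}
```

Freeing outside the lock avoids both problems Bodo lists: the iteration never runs over a just-freed node, and the lock is never dropped mid-walk.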
>
> The patch is not tested on a mainline kernel. Instead, I
> ported it to SLES11 SP1 2.6.32.59-0.7.1. On SLES 11 the test
> is still running fine.
>
> Best Regards,
> Bodo
>
> > }
> > spin_unlock(&queue_lock);
> > }
> > @@ -1151,6 +1153,7 @@ int sunrpc_cache_pipe_upcall(struct cache_detail *detail, struct cache_head *h,
> > struct cache_request *crq;
> > char *bp;
> > int len;
> > + int ret = 0;
> >
> > if (!cache_listeners_exist(detail)) {
> > warn_no_listener(detail);
> > @@ -1182,10 +1185,18 @@ int sunrpc_cache_pipe_upcall(struct cache_detail *detail, struct cache_head *h,
> > crq->len = PAGE_SIZE - len;
> > crq->readers = 0;
> > spin_lock(&queue_lock);
> > - list_add_tail(&crq->q.list, &detail->queue);
> > + if (test_bit(CACHE_PENDING, &h->flags))
> > + list_add_tail(&crq->q.list, &detail->queue);
> > + else
> > + /* Lost a race, no longer PENDING, so don't enqueue */
> > + ret = -EAGAIN;
> > spin_unlock(&queue_lock);
> > wake_up(&queue_wait);
> > - return 0;
> > + if (ret == -EAGAIN) {
> > + kfree(buf);
> > + kfree(crq);
> > + }
> > + return ret;
> > }
> > EXPORT_SYMBOL_GPL(sunrpc_cache_pipe_upcall);
> >
> >
>
>
> .....
>
> Reported-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
> Signed-off-by: NeilBrown <neilb@suse.de>
> Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
> ---
>
> --- a/net/sunrpc/cache.c 2013-02-20 14:05:08.000000000 +0100
> +++ b/net/sunrpc/cache.c 2013-03-05 13:30:13.000000000 +0100
> @@ -1015,23 +1015,33 @@
>
> static void cache_dequeue(struct cache_detail *detail, struct cache_head *ch)
> {
> - struct cache_queue *cq;
> + struct cache_queue *cq, *tmp;
> + struct cache_request *cr;
> + struct list_head dequeued;
> +
> + INIT_LIST_HEAD(&dequeued);
> spin_lock(&queue_lock);
> - list_for_each_entry(cq, &detail->queue, list)
> + list_for_each_entry_safe(cq, tmp, &detail->queue, list)
> if (!cq->reader) {
> - struct cache_request *cr = container_of(cq, struct cache_request, q);
> + cr = container_of(cq, struct cache_request, q);
> if (cr->item != ch)
> continue;
> + if (test_bit(CACHE_PENDING, &ch->flags))
> + /* Lost a race and it is pending again */
> + break;
> if (cr->readers != 0)
> continue;
> - list_del(&cr->q.list);
> - spin_unlock(&queue_lock);
> - cache_put(cr->item, detail);
> - kfree(cr->buf);
> - kfree(cr);
> - return;
> + list_move(&cr->q.list, &dequeued);
> }
> spin_unlock(&queue_lock);
> +
> + while (!list_empty(&dequeued)) {
> + cr = list_entry(dequeued.next, struct cache_request, q.list);
> + list_del(&cr->q.list);
> + cache_put(cr->item, detail);
> + kfree(cr->buf);
> + kfree(cr);
> + }
> }
>
> /*
> @@ -1151,6 +1161,7 @@
> struct cache_request *crq;
> char *bp;
> int len;
> + int ret = 0;
>
> if (!cache_listeners_exist(detail)) {
> warn_no_listener(detail);
> @@ -1182,10 +1193,18 @@
> crq->len = PAGE_SIZE - len;
> crq->readers = 0;
> spin_lock(&queue_lock);
> - list_add_tail(&crq->q.list, &detail->queue);
> + if (test_bit(CACHE_PENDING, &h->flags))
> + list_add_tail(&crq->q.list, &detail->queue);
> + else
> + /* Lost a race, no longer PENDING, so don't enqueue */
> + ret = -EAGAIN;
> spin_unlock(&queue_lock);
> wake_up(&queue_wait);
> - return 0;
> + if (ret == -EAGAIN) {
> + kfree(buf);
> + kfree(crq);
> + }
> + return ret;
> }
> EXPORT_SYMBOL_GPL(sunrpc_cache_pipe_upcall);
>
Thanks for this - that was careless of me. Sorry.
Your patch looks good. I'll forward the two to Bruce again shortly, then
look at submitting a backport for SLES.
Thanks,
NeilBrown