From: Trond Myklebust <trond.myklebust@primarydata.com>
To: Bruce Fields <bfields@fieldses.org>
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH 08/11] SUNRPC: get rid of the request wait queue
Date: Tue, 12 Aug 2014 16:54:53 -0400 [thread overview]
Message-ID: <CAHQdGtS+1osSEZyB-q=kYaH0wokKZ3XHc+7f3WWso03LhE-JbA@mail.gmail.com> (raw)
In-Reply-To: <20140812202316.GE25914@fieldses.org>
On Tue, Aug 12, 2014 at 4:23 PM, Bruce Fields <bfields@fieldses.org> wrote:
> On Tue, Aug 12, 2014 at 04:09:52PM -0400, Trond Myklebust wrote:
>> On Tue, Aug 12, 2014 at 3:53 PM, Bruce Fields <bfields@fieldses.org> wrote:
>> > On Sun, Aug 03, 2014 at 01:03:10PM -0400, Trond Myklebust wrote:
>> >> We're always _only_ waking up tasks from within the sp_threads list, so
>> >> we know that they are enqueued and alive. The rq_wait waitqueue is just
>> >> a distraction with extra atomic semantics.
>> >>
>> >> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
>> >> ---
>> >> include/linux/sunrpc/svc.h | 1 -
>> >> net/sunrpc/svc.c | 2 --
>> >> net/sunrpc/svc_xprt.c | 32 +++++++++++++++++---------------
>> >> 3 files changed, 17 insertions(+), 18 deletions(-)
>> >>
>> >> diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
>> >> index 1bc7cd05b22e..3ec769b65c7f 100644
>> >> --- a/include/linux/sunrpc/svc.h
>> >> +++ b/include/linux/sunrpc/svc.h
>> >> @@ -280,7 +280,6 @@ struct svc_rqst {
>> >> int rq_splice_ok; /* turned off in gss privacy
>> >> * to prevent encrypting page
>> >> * cache pages */
>> >> - wait_queue_head_t rq_wait; /* synchronization */
>> >> struct task_struct *rq_task; /* service thread */
>> >> };
>> >>
>> >> diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
>> >> index 5de6801cd924..dfb78c4f3031 100644
>> >> --- a/net/sunrpc/svc.c
>> >> +++ b/net/sunrpc/svc.c
>> >> @@ -612,8 +612,6 @@ svc_prepare_thread(struct svc_serv *serv, struct svc_pool *pool, int node)
>> >> if (!rqstp)
>> >> goto out_enomem;
>> >>
>> >> - init_waitqueue_head(&rqstp->rq_wait);
>> >> -
>> >> serv->sv_nrthreads++;
>> >> spin_lock_bh(&pool->sp_lock);
>> >> pool->sp_nrthreads++;
>> >> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
>> >> index 2c30193c7a13..438e91c12851 100644
>> >> --- a/net/sunrpc/svc_xprt.c
>> >> +++ b/net/sunrpc/svc_xprt.c
>> >> @@ -348,8 +348,6 @@ static void svc_xprt_do_enqueue(struct svc_xprt *xprt)
>> >>
>> >> cpu = get_cpu();
>> >> pool = svc_pool_for_cpu(xprt->xpt_server, cpu);
>> >> - put_cpu();
>> >> -
>> >> spin_lock_bh(&pool->sp_lock);
>> >>
>> >> if (!list_empty(&pool->sp_threads) &&
>> >> @@ -382,10 +380,15 @@ static void svc_xprt_do_enqueue(struct svc_xprt *xprt)
>> >> printk(KERN_ERR
>> >> "svc_xprt_enqueue: server %p, rq_xprt=%p!\n",
>> >> rqstp, rqstp->rq_xprt);
>> >> - rqstp->rq_xprt = xprt;
>> >> + /* Note the order of the following 3 lines:
>> >> + * We want to assign xprt to rqstp->rq_xprt only _after_
>> >> + * we've woken up the process, so that we don't race with
>> >> + * the lockless check in svc_get_next_xprt().
>> >
>> > Sorry, I'm not following this: what exactly is the race?
>>
>> There are 2 potential races, both due to the lockless check for rqstp->rq_xprt.
>>
>> 1) You need to ensure that the reference count in xprt is bumped
>> before assigning to rqstp->rq_xprt, because svc_get_next_xprt() does a
>> lockless check for rqstp->rq_xprt != NULL, so there is no lock to
>> ensure that the refcount bump occurs before whoever called
>> svc_get_next_xprt() calls svc_xprt_put()...
>>
>> 2) You want to ensure that you don't call wake_up_process() after
>> exiting svc_get_next_xprt() since you would no longer be guaranteed
>> that the task still exists. By calling wake_up_process() before
>> setting rqstp->rq_xprt, you ensure that if the task wakes up before
>> calling wake_up_process(), then the test for rqstp->rq_xprt == NULL
>> forces you into the spin_lock_bh() protected region, where it is safe
>> to test rqstp->rq_xprt again and decide whether or not the task is
>> still queued on &pool->sp_threads.
>
> Got it. So how about making that comment:
>
> /*
> * Once we assign rq_xprt, a concurrent task in
> * svc_get_next_xprt() could put the xprt, or could exit.
> * Therefore, the get and the wake_up need to come first.
> */
>
> Is that close enough?
>
It's close: it's not so much any concurrent task, it's specifically
the one that we're trying to wake up that may find itself waking up
prematurely due to the schedule_timeout(), and then racing due to the
lockless check.
--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@primarydata.com
next prev parent reply other threads:[~2014-08-12 20:54 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-08-03 17:03 [PATCH 00/11] SUNRPC server scalability improvements Trond Myklebust
2014-08-03 17:03 ` [PATCH 01/11] SUNRPC: Reduce contention in svc_xprt_enqueue() Trond Myklebust
2014-08-03 17:03 ` [PATCH 02/11] SUNRPC: svc_tcp_write_space: don't clear SOCK_NOSPACE prematurely Trond Myklebust
2014-08-03 17:03 ` [PATCH 03/11] SUNRPC: Allow svc_reserve() to notify TCP socket that space has been freed Trond Myklebust
2014-08-03 17:03 ` [PATCH 04/11] SUNRPC: Do not override wspace tests in svc_handle_xprt Trond Myklebust
2014-08-03 17:03 ` [PATCH 05/11] lockd: Ensure that lockd_start_svc sets the server rq_task Trond Myklebust
2014-08-03 17:03 ` [PATCH 06/11] nfs: Ensure that nfs_callback_start_svc " Trond Myklebust
2014-08-03 17:03 ` [PATCH 07/11] SUNRPC: Do not grab pool->sp_lock unnecessarily in svc_get_next_xprt Trond Myklebust
2014-08-03 17:03 ` [PATCH 08/11] SUNRPC: get rid of the request wait queue Trond Myklebust
2014-08-03 17:03 ` [PATCH 09/11] SUNRPC: Fix broken kthread_should_stop test in svc_get_next_xprt Trond Myklebust
2014-08-03 17:03 ` [PATCH 10/11] SUNRPC: More optimisations of svc_xprt_enqueue() Trond Myklebust
2014-08-03 17:03 ` [PATCH 11/11] SUNRPC: Optimise away svc_recv_available Trond Myklebust
2014-08-12 19:53 ` [PATCH 08/11] SUNRPC: get rid of the request wait queue Bruce Fields
2014-08-12 20:09 ` Trond Myklebust
2014-08-12 20:23 ` Bruce Fields
2014-08-12 20:54 ` Trond Myklebust [this message]
2014-08-12 21:29 ` Bruce Fields
2014-08-12 21:34 ` Trond Myklebust
2014-08-12 19:16 ` [PATCH 04/11] SUNRPC: Do not override wspace tests in svc_handle_xprt Bruce Fields
2014-08-12 19:31 ` Trond Myklebust
2014-08-12 20:04 ` Bruce Fields
2014-08-04 13:36 ` [PATCH 00/11] SUNRPC server scalability improvements Bruce Fields
2014-08-05 20:42 ` Bruce Fields
2014-08-05 21:24 ` Trond Myklebust
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAHQdGtS+1osSEZyB-q=kYaH0wokKZ3XHc+7f3WWso03LhE-JbA@mail.gmail.com' \
--to=trond.myklebust@primarydata.com \
--cc=bfields@fieldses.org \
--cc=linux-nfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).