Linux NFS development
From: Jeff Layton <jeff.layton@primarydata.com>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Jeff Layton <jeff.layton@primarydata.com>,
	Chris Worley <chris.worley@primarydata.com>,
	linux-nfs@vger.kernel.org
Subject: Re: [PATCH 0/4] sunrpc: reduce pool->sp_lock contention when queueing a xprt for servicing
Date: Tue, 25 Nov 2014 19:38:18 -0500
Message-ID: <20141125193818.3800fd0d@tlielax.poochiereds.net>
In-Reply-To: <20141126000941.GF15033@fieldses.org>

On Tue, 25 Nov 2014 19:09:41 -0500
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Tue, Nov 25, 2014 at 04:25:57PM -0500, Jeff Layton wrote:
> > On Fri, 21 Nov 2014 14:19:27 -0500
> > Jeff Layton <jlayton@primarydata.com> wrote:
> > 
> > > Hi Bruce!
> > > 
> > > Here are the patches that I had mentioned earlier that reduce the
> > > contention for the pool->sp_lock when the server is heavily loaded.
> > > 
> > > The basic problem is that whenever a svc_xprt needs to be queued up for
> > > servicing, we have to take the pool->sp_lock to try to find an idle
> > > thread to service it. On a busy server, that lock becomes highly
> > > contended, which limits the throughput.
> > > 
> > > This patchset fixes this by changing how we search for an idle thread.
> > > First, we convert svc_rqst and the sp_all_threads list to be
> > > RCU-managed. Then we change the search for an idle thread to use the
> > > sp_all_threads list; that search can now be done under the
> > > rcu_read_lock. When there is an available thread, queueing an xprt
> > > to it can now be done without any spinlocking.
> > > 
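> > > To make that concrete, here's a rough sketch of the approach (not
> > > the actual patches; details like locking against thread exit and
> > > memory barriers are elided):
> > >
> > >     /* 1) svc_rqst gains an rcu_head so it can be freed with
> > >      *    kfree_rcu(), which keeps lockless walkers of
> > >      *    sp_all_threads safe against a thread exiting mid-walk */
> > >     struct svc_rqst {
> > >             /* ... existing fields ... */
> > >             struct rcu_head         rq_rcu_head;
> > >     };
> > >
> > >     /* on the freeing side */
> > >     list_del_rcu(&rqstp->rq_all);      /* still under sp_lock */
> > >     kfree_rcu(rqstp, rq_rcu_head);     /* free after a grace period */
> > >
> > >     /* 2) queueing an xprt: walk sp_all_threads under the
> > >      *    rcu_read_lock and claim an idle thread with an atomic
> > >      *    test_and_set_bit() -- no spinlock in the common case */
> > >     static void svc_xprt_do_enqueue(struct svc_xprt *xprt)
> > >     {
> > >             struct svc_pool *pool = svc_pool_for_cpu(xprt->xpt_server,
> > >                                                       smp_processor_id());
> > >             struct svc_rqst *rqstp;
> > >
> > >             rcu_read_lock();
> > >             list_for_each_entry_rcu(rqstp, &pool->sp_all_threads,
> > >                                     rq_all) {
> > >                     if (test_and_set_bit(RQ_BUSY, &rqstp->rq_flags))
> > >                             continue;  /* busy, try the next thread */
> > >                     rqstp->rq_xprt = xprt;
> > >                     svc_xprt_get(xprt);
> > >                     wake_up_process(rqstp->rq_task);
> > >                     rcu_read_unlock();
> > >                     return;
> > >             }
> > >             rcu_read_unlock();
> > >             /* no idle thread: fall back to queueing the xprt on
> > >              * the pool itself (that slow path takes sp_lock) */
> > >     }
> > >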
> > > With this, we see a pretty substantial increase in performance on a
> > > larger-scale server that is heavily loaded. Chris has some preliminary
> > > numbers, but they need to be cleaned up a bit before we can present
> > > them. I'm hoping to have those by early next week.
> > > 
> > > Jeff Layton (4):
> > >   sunrpc: add a rcu_head to svc_rqst and use kfree_rcu to free it
> > >   sunrpc: fix potential races in pool_stats collection
> > >   sunrpc: convert to lockless lookup of queued server threads
> > >   sunrpc: add some tracepoints around enqueue and dequeue of svc_xprt
> > > 
> > >  include/linux/sunrpc/svc.h    |  12 +-
> > >  include/trace/events/sunrpc.h |  98 +++++++++++++++-
> > >  net/sunrpc/svc.c              |  17 +--
> > >  net/sunrpc/svc_xprt.c         | 252 ++++++++++++++++++++++++------------------
> > >  4 files changed, 258 insertions(+), 121 deletions(-)
> > > 
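> > > FWIW, the tracepoints in the last patch are along these lines (a
> > > simplified sketch, not the patch itself):
> > >
> > >     TRACE_EVENT(svc_xprt_do_enqueue,
> > >             TP_PROTO(struct svc_xprt *xprt, struct svc_rqst *rqst),
> > >             TP_ARGS(xprt, rqst),
> > >
> > >             TP_STRUCT__entry(
> > >                     __field(struct svc_xprt *, xprt)
> > >                     __field(int, pid)
> > >             ),
> > >
> > >             TP_fast_assign(
> > >                     __entry->xprt = xprt;
> > >                     __entry->pid = rqst ? rqst->rq_task->pid : 0;
> > >             ),
> > >
> > >             TP_printk("xprt=%p pid=%d", __entry->xprt, __entry->pid)
> > >     );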
> > 
> > Here's what I've got so far.
> > 
> > This is just a chart that shows the % increase in the number of IOPS in
> > a distributed test on an NFSv3 server with this patchset vs. without.
> > 
> > The numbers along the bottom show the total number of job threads
> > running. Chris says:
> > 
> > "There were 64 nfsd threads running on the server.
> > 
> >  There were 7 hypervisors, each running 2 VMs, with 2 or 4 threads per
> >  VM. Thus, 56 and 112 threads total."
> 
> Thanks!
> 

Good questions all around. I'll try to answer them as best I can:

> Results that someone else could reproduce would be much better.
> (Where's the source code for the test?

The test is just fio (which is available in the Fedora repos, fwiw):

    http://git.kernel.dk/?p=fio.git;a=summary

...but we'd have to ask Chris for the job files. Chris, can those be
released?
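
If it helps in the meantime, a generic fio run against an NFS mount
looks something like this (illustrative only -- this is not Chris's
actual job file, and the read/write mix was varied across the runs in
the chart):

    fio --name=nfs-test --directory=/mnt/nfs --rw=randrw --rwmixread=70 \
        --bs=4k --size=1g --numjobs=4 --ioengine=libaio --direct=1 \
        --runtime=60 --time_based --group_reporting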

>  What's the base the patchset was
> applied to?

The base was a v3.14-ish kernel with a pile of patches on top (mostly
the ones that Trond asked you to merge for v3.18). The only difference
between the "baseline" and "patched" kernels is this set, plus a few
patches from upstream that made it apply more cleanly. None of those
should have much effect on the results, though.

> What was the hardware? 

Again, I'll have to defer that question to Chris. I don't know much
about the hardware in use here, other than that it has some pretty fast
storage (high-performance SSDs).

> I understand that's a lot of
> information.)  But it's nice to see some numbers at least.
> 
> (I wonder what the reason is for the odd shape in the 112-thread case:
> descending slightly as the writes decrease, and then shooting up when
> they go to zero. OK, I guess that's what you get if you just assume
> read-write contention is expensive and one write is slightly more
> expensive than one read. But then why doesn't it behave the same way in
> the 56-thread case?)
> 

Yeah, I wondered about that too.

There is some virtualization in use on the clients here (VMware, as it
happens), so I have to wonder if some of the variance in the numbers is
due to odd virtualization behavior.

The good news is that the overall trend pretty clearly shows a
performance increase.

As always, benchmark results point out the need for more benchmarks.

-- 
Jeff Layton <jlayton@primarydata.com>
