Linux NFS development
From: "J. Bruce Fields" <bfields@fieldses.org>
To: Ben Myers <bpm@sgi.com>
Cc: Andrew Dahl <adahl@sgi.com>,
	Jeff Layton <jeff.layton@primarydata.com>,
	Trond Myklebust <trondmy@gmail.com>,
	Chris Worley <chris.worley@primarydata.com>,
	linux-nfs@vger.kernel.org
Subject: Re: [PATCH 3/4] sunrpc: convert to lockless lookup of queued server threads
Date: Tue, 9 Dec 2014 12:04:48 -0500
Message-ID: <20141209170447.GH20526@fieldses.org>
In-Reply-To: <20141202185358.GH11444@sgi.com>

On Tue, Dec 02, 2014 at 12:53:58PM -0600, Ben Myers wrote:
> Hey Bruce,
> 
> On Tue, Dec 02, 2014 at 11:50:24AM -0500, J. Bruce Fields wrote:
> > On Tue, Dec 02, 2014 at 07:14:22AM -0500, Jeff Layton wrote:
> > > On Tue, 2 Dec 2014 06:57:50 -0500
> > > Jeff Layton <jeff.layton@primarydata.com> wrote:
> > > 
> > > > On Mon, 1 Dec 2014 19:38:19 -0500
> > > > Trond Myklebust <trondmy@gmail.com> wrote:
> > > > 
> > > > > On Mon, Dec 1, 2014 at 6:47 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> > > > > > I find it hard to think about how we expect this to affect performance.
> > > > > > So it comes down to the observed results, I guess, but just trying to
> > > > > > get an idea:
> > > > > >
> > > > > >         - this eliminates sp_lock.  I think the original idea here was
> > > > > >           that if interrupts could be routed correctly then there
> > > > > >           shouldn't normally be cross-cpu contention on this lock.  Do
> > > > > >           we understand why that didn't pan out?  Is hardware capable of
> > > > > >           doing this really rare, or is it just too hard to configure it
> > > > > >           correctly?
> > > > > 
> > > > > One problem is that a 1MB incoming write will generate a lot of
> > > > > interrupts. While that is not so noticeable on a 1GigE network, it is
> > > > > on a 40GigE network. The other thing you should note is that this
> > > > > workload was generated with ~100 clients pounding on that server, so
> > > > > > there are a fair number of TCP connections to service in parallel.
> > > > > Playing with the interrupt routing doesn't necessarily help you so
> > > > > much when all those connections are hot.
> > > > > 
> > > 
> > > In principle though, the percpu pool_mode should have alleviated the
> > > contention on the sp_lock. When an interrupt comes in, the xprt gets
> > > queued to its pool. If there is a pool for each cpu then there should
> > > be no sp_lock contention. The pernode pool mode might also have
> > > alleviated the lock contention to a lesser degree in a NUMA
> > > configuration.
> > > 
> > > Do we understand why that didn't help?
> > 
> > Yes, the lots-of-interrupts-per-rpc problem strikes me as a separate if
> > not entirely orthogonal problem.
> > 
> > (And I thought it should be addressable separately; Trond and I talked
> > about this in Westford.  I think it currently wakes a thread to handle
> > each individual tcp segment--but shouldn't it be able to do all the data
> > copying in the interrupt and wait to wake up a thread until it's got the
> > entire rpc?)
> > 
> > > In any case, I think that doing this with RCU is still preferable.
> > > We're walking a very short list, so doing it lockless is still a
> > > good idea to improve performance without needing to use the percpu
> > > pool_mode.
> > 
> > I find that entirely plausible.
> > 
> > Maybe it would help to ask SGI people.  Cc'ing Ben Myers in hopes he
> > could point us to the right person.
> >
> > It'd be interesting to know:
> > 
> > 	- are they using the svc_pool stuff?
> > 	- if not, why not?
> > 	- if so:
> > 		- can they explain how they configure systems to take
> > 		  advantage of it?
> > 		- do they have any recent results showing how it helps?
> > 		- could they test Jeff's patches for performance
> > 		  regressions?
> > 
> > Anyway, I'm off for now, back to work Thursday.
> > 
> > --b.
> 
> Andrew Dahl is the right person.  Cc'd. 

Thanks!

I'm less worried about Jeff's particular changes here, but I would still
really love to see answers to the above questions.

We've had a couple of cases now of people trying to use the pool_modes for
performance tuning without good results, and I'd like to figure out
what's happening.  If this keeps up then we may end up just breaking
them by accident (if we haven't already).

--b.
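
Editorial sketch (not part of the original message): the quoted point
about walking a very short list without a lock can be illustrated in
userspace C. This is only a sketch under stated assumptions, not the
code from Jeff's patches. The names struct worker, struct pool,
pool_claim_idle and pool_release are hypothetical, and the fixed-size
array stands in for the kernel's list of server threads. Because the
array never changes size, no RCU is needed here; in the kernel, RCU
would be what keeps such a walk safe while threads are added or freed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NWORKERS 4

/* Hypothetical stand-in for a server thread; not the kernel's svc_rqst. */
struct worker {
	atomic_bool busy;	/* true while some caller has claimed it */
	int id;
};

/* Hypothetical stand-in for a thread pool; note there is no lock member. */
struct pool {
	struct worker workers[NWORKERS];
};

static void pool_init(struct pool *p)
{
	assert(p != NULL);
	for (int i = 0; i < NWORKERS; i++) {
		atomic_init(&p->workers[i].busy, false);
		p->workers[i].id = i;
	}
}

/*
 * Walk the short list without taking a pool-wide lock.  Each worker is
 * claimed with a single atomic exchange: whoever flips busy from false
 * to true owns that worker, and everyone else moves on to the next one.
 * Returns NULL if every worker is already busy.
 */
static struct worker *pool_claim_idle(struct pool *p)
{
	for (int i = 0; i < NWORKERS; i++) {
		struct worker *w = &p->workers[i];

		if (!atomic_exchange_explicit(&w->busy, true,
					      memory_order_acquire))
			return w;
	}
	return NULL;
}

/* Hand the worker back so a later walk can claim it again. */
static void pool_release(struct worker *w)
{
	atomic_store_explicit(&w->busy, false, memory_order_release);
}
```

A caller would run pool_init() once, call pool_claim_idle() whenever
work arrives, and call pool_release() when the worker is done.
Contention is then limited to the individual busy flags instead of one
lock shared by the whole pool.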
