From: Jeff Layton <jeff.layton@primarydata.com>
To: Tejun Heo <tj@kernel.org>
Cc: Jeff Layton <jeff.layton@primarydata.com>,
NeilBrown <neilb@suse.de>,
linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org,
Al Viro <viro@zeniv.linux.org.uk>
Subject: Re: [RFC PATCH 00/14] nfsd/sunrpc: add support for a workqueue-based nfsd
Date: Wed, 3 Dec 2014 14:02:02 -0500
Message-ID: <20141203140202.7865bedb@tlielax.poochiereds.net>
In-Reply-To: <20141203110405.5ecc85df@tlielax.poochiereds.net>
On Wed, 3 Dec 2014 11:04:05 -0500
Jeff Layton <jlayton@primarydata.com> wrote:
> On Wed, 3 Dec 2014 10:56:49 -0500
> Tejun Heo <tj@kernel.org> wrote:
>
> > Hello, Neil, Jeff.
> >
> > On Tue, Dec 02, 2014 at 08:29:46PM -0500, Jeff Layton wrote:
> > > That's a good point. I had originally thought that max_active on an
> > > unbound workqueue would be the number of concurrent jobs that could run
> > > across all the CPUs, but now that I look I'm not sure that's really
> > > the case.
> >
> > @max_active is a per-pool number. By default, unbound wqs use
> > per-node pools, so @max_active would be per-node. Currently,
> > @max_active is mostly meant as protection against runaway
> > workqueues creating a crazy number of workers, which has been
> > enough for the existing wq users. *Maybe* it makes sense to make it
> > actually mean maximum concurrency, which would probably involve an
> > aggregated per-CPU distribution mechanism so that we don't end up
> > incrementing and decrementing the same counter from all CPUs on
> > each work item execution.
> >
> > However, I do agree with Neil that making it user-configurable is
> > almost always painful. It's usually a question without a good
> > answer, and the same value may behave differently depending on a
> > lot of implementation details. A better approach, probably, is to
> > use @max_active as a last-resort protection mechanism while
> > providing automatic throttling of in-flight work items that is
> > meaningful for the specific use cases.
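To make the @max_active discussion concrete, here is a minimal sketch
of creating an unbound workqueue with an explicit cap. The "nfsd" name
and the value 16 are illustrative only, not from the patches:

#include <linux/workqueue.h>

static struct workqueue_struct *nfsd_wq;

static int nfsd_wq_init(void)
{
        /*
         * With WQ_UNBOUND, worker pools are per-NUMA-node by default,
         * so a max_active of 16 caps in-flight work items per node
         * rather than across the whole system.
         */
        nfsd_wq = alloc_workqueue("nfsd", WQ_UNBOUND, 16);
        if (!nfsd_wq)
                return -ENOMEM;
        return 0;
}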
> >
> > > I've heard random grumblings from various people in the past that
> > > workqueues have significant latency, but this is the first time I've
> > > really hit it in practice. If we can get this fixed, then that may be a
> > > significant perf win for all workqueue users. For instance, rpciod in
> > > the NFS client is all workqueue-based. Getting that latency down could
> > > really help things.
> > >
> > > I'm currently trying to roll up a kernel module for benchmarking the
> > > workqueue dispatching code in the hopes that we can use that to help
> > > nail it down.
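The kind of measurement such a module might make could look like the
sketch below. Every name here is hypothetical; it simply stamps a
timestamp at queue time and reports the delta at work-function entry:

#include <linux/workqueue.h>
#include <linux/ktime.h>
#include <linux/printk.h>

struct wq_bench {
        struct work_struct work;
        ktime_t queued;                 /* stamped at queue time */
};

static void wq_bench_fn(struct work_struct *work)
{
        struct wq_bench *b = container_of(work, struct wq_bench, work);

        /* dispatch latency: queue_work() -> work function entry */
        pr_info("wq dispatch latency: %lld ns\n",
                ktime_to_ns(ktime_sub(ktime_get(), b->queued)));
}

static void wq_bench_kick(struct wq_bench *b)
{
        INIT_WORK(&b->work, wq_bench_fn);
        b->queued = ktime_get();
        queue_work(system_unbound_wq, &b->work);
}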
> >
> > Definitely, there were some reports, but nothing really got tracked
> > down properly. It'd be awesome to actually find out where the latency
> > is coming from.
> >
> > Thanks!
> >
>
> I think I might have figured this out (and before I go any further,
> allow me to say <facepalm>), thanks to the workqueue tracepoints in the
> code. What I noticed is that when things are fairly idle, the work is
> picked up quickly, but once things get busy it takes a lot longer.
>
> I think that the issue is in the design of the workqueue-based nfsd
> code. In particular, I attached a work_struct to the svc_xprt, which
> limits the code to processing only one RPC at a time per xprt, from
> beginning to end.
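In other words, the flow looked roughly like this (an illustrative
sketch; the xprt_do_* helpers are stand-ins, not the actual sunrpc
functions):

static void svc_xprt_work(struct work_struct *work)
{
        struct svc_xprt *xprt = container_of(work, struct svc_xprt,
                                             xpt_work);

        /*
         * One work item covers the entire RPC, so another request on
         * this xprt can't be dispatched until all three phases finish.
         */
        xprt_do_receive(xprt);          /* receive phase     */
        xprt_do_process(xprt);          /* RPC processing    */
        xprt_do_send_reply(xprt);       /* send the reply    */
}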
>
> So, even if we requeue that work after the receive phase is done, the
> workqueue won't pick it up again until the current request has been
> processed and the reply sent.
>
> What I think I need to do is handle the receive phase using the
> work_struct attached to the xprt, and then do the rest of the
> processing from the context of a different work_struct (possibly one
> attached to the svc_rqst), which should free up the xprt's work_struct
> sooner.
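Roughly, the split would look like this (again a sketch with stand-in
names; the rq_work field on the svc_rqst and the xprt_pick_rqst helper
are assumptions for illustration):

static void svc_xprt_receive_work(struct work_struct *work)
{
        struct svc_xprt *xprt = container_of(work, struct svc_xprt,
                                             xpt_work);
        struct svc_rqst *rqstp = xprt_pick_rqst(xprt);  /* hypothetical */

        xprt_do_receive(rqstp);         /* receive phase only */

        /*
         * Hand off to a per-rqst work item; the xprt's work_struct is
         * now free, so the next RPC on this xprt can be received while
         * this one is still being processed.
         */
        queue_work(nfsd_wq, &rqstp->rq_work);
}

static void svc_rqst_process_work(struct work_struct *work)
{
        struct svc_rqst *rqstp = container_of(work, struct svc_rqst,
                                              rq_work);

        rqst_do_process(rqstp);         /* processing + reply */
}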
>
> I'm going to work on changing that today and see if it improves things.
>
> Thanks for the help so far!
Yes! That does help. The new workqueue-based code is a little (a few
percent?) slower than the thread-based code across the board. I suspect
that's because I'm having to queue each RPC to the workqueue twice
(once for the receive and once to do the processing). I think I can
remedy that, but I'll have to think about the best way to do it.
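One idea (just a guess at this point, using the same stand-in names as
above, with a hypothetical per-xprt receive mutex) would be to queue a
per-rqst work item once per RPC and serialize only the short receive
phase on the xprt, so each request pays a single dispatch:

static void svc_rqst_work(struct work_struct *work)
{
        struct svc_rqst *rqstp = container_of(work, struct svc_rqst,
                                              rq_work);
        struct svc_xprt *xprt = rqstp->rq_xprt;

        /* serialize only the receive phase on this xprt */
        mutex_lock(&xprt->xpt_recv_mutex);      /* hypothetical lock */
        xprt_do_receive(rqstp);
        mutex_unlock(&xprt->xpt_recv_mutex);

        /* processing and the reply can run concurrently across rqsts */
        rqst_do_process(rqstp);
}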
Thanks again for the help!
--
Jeff Layton <jlayton@primarydata.com>