From: Jeff Layton <jlayton@primarydata.com>
To: linux-nfs@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Tejun Heo <tj@kernel.org>,
Al Viro <viro@zeniv.linux.org.uk>
Subject: [RFC PATCH 00/14] nfsd/sunrpc: add support for a workqueue-based nfsd
Date: Tue, 2 Dec 2014 13:24:09 -0500
Message-ID: <1417544663-13299-1-git-send-email-jlayton@primarydata.com>
tl;dr: this code works and is much simpler than the dedicated thread
pool, but there are some latencies in the workqueue code that
seem to keep it from being as fast as it could be.
This patchset is a little skunkworks project that I've been poking at
for the last few weeks. Currently nfsd uses a dedicated thread pool to
handle RPCs, but that requires maintaining a rather large swath of
"fiddly" code to handle the threads and transports.
This patchset represents an alternative approach, which makes nfsd use
workqueues to do its bidding rather than a dedicated thread pool. When a
transport needs to do work, we simply queue it to the workqueue in
softirq context and let it service the transport.
The current draft is runtime-switchable via a new sunrpc pool_mode
module parameter setting. When that's set to "workqueue", nfsd will use
a workqueue-based service. One of the goals of this patchset was to
*not* need to change any userland code, so starting it up using rpc.nfsd
still works as expected. The only real difference is that the nfsdfs
"threads" file is reinterpreted as the "max_active" value for the
workqueue.
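In practice, switching modes would look something like the following. sunrpc already has a pool_mode module parameter (with values like "global" and "percpu"); the "workqueue" value is what this patchset adds, and the nfsdfs mount point is conventionally /proc/fs/nfsd:

```shell
# Select the workqueue pool mode before starting nfsd
modprobe sunrpc pool_mode=workqueue
# or, if sunrpc is already loaded:
echo workqueue > /sys/module/sunrpc/parameters/pool_mode

# Start nfsd exactly as before -- no userland changes needed
rpc.nfsd 8

# With the workqueue service, "threads" is reinterpreted as the
# workqueue's max_active limit rather than a count of nfsd threads
cat /proc/fs/nfsd/threads
```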
This code has a lot of potential to simplify nfsd significantly and I
think it may also scale better on larger machines. When testing with an
exported tmpfs on my craptacular test machine, the workqueue based code
seems to be a little faster than a dedicated thread pool.
Currently though, performance takes a nosedive (~40%) when I'm writing
to (relatively slow) SATA disks. With the help of some tracepoints, I
think this is mostly due to some significant latency in the workqueue
code.
When I queue work using the legacy dedicated thread pool, I see ~0.2ms
of latency between the softirq function queueing it to a given thread
and that thread picking the work up. When I queue it to a workqueue,
however, that latency jumps to ~30ms on average.
My current theory is that this latency interferes with the ability to
batch up requests to the disks and that is what accounts for the massive
slowdown.
So, I have several goals here in posting this:
1) To get some early feedback on this code. Does this seem reasonable,
assuming that we can address the workqueue latency problems?
2) To get some insight about the latency from those with a better
understanding of the CMWQ code. Any thoughts as to why we might be
seeing such high latency here, and any ideas of what we can do about it?
3) I'm also cc'ing Al due to some changes in patch #10 that allow nfsd
to manage its fs_structs a little differently. Does that approach seem
reasonable?
Jeff Layton (14):
sunrpc: add a new svc_serv_ops struct and move sv_shutdown into it
sunrpc: move sv_function into sv_ops
sunrpc: move sv_module parm into sv_ops
sunrpc: turn enqueueing a svc_xprt into a svc_serv operation
sunrpc: abstract out svc_set_num_threads to sv_ops
sunrpc: move pool_mode definitions into svc.h
sunrpc: factor svc_rqst allocation and freeing from sv_nrthreads refcounting
sunrpc: set up workqueue function in svc_xprt
sunrpc: add basic support for workqueue-based services
nfsd: keep a reference to the fs_struct in svc_rqst
nfsd: add support for workqueue based service processing
sunrpc: keep a cache of svc_rqsts for each NUMA node
sunrpc: add more tracepoints around svc_xprt handling
sunrpc: add tracepoints around svc_sock handling
fs/fs_struct.c | 60 +++++++--
fs/lockd/svc.c | 7 +-
fs/nfs/callback.c | 6 +-
fs/nfsd/nfssvc.c | 107 ++++++++++++---
include/linux/fs_struct.h | 4 +
include/linux/sunrpc/svc.h | 97 +++++++++++---
include/linux/sunrpc/svc_xprt.h | 3 +
include/linux/sunrpc/svcsock.h | 1 +
include/trace/events/sunrpc.h | 60 ++++++++-
net/sunrpc/Kconfig | 10 ++
net/sunrpc/Makefile | 1 +
net/sunrpc/svc.c | 141 +++++++++++---------
net/sunrpc/svc_wq.c | 281 ++++++++++++++++++++++++++++++++++++++++
net/sunrpc/svc_xprt.c | 66 +++++++++-
net/sunrpc/svcsock.c | 6 +
15 files changed, 737 insertions(+), 113 deletions(-)
create mode 100644 net/sunrpc/svc_wq.c
--
2.1.0