From: Chuck Lever <cel@kernel.org>
To: Mike Snitzer <snitzer@kernel.org>,
Jeff Layton <jlayton@kernel.org>, NeilBrown <neil@brown.name>,
Olga Kornievskaia <okorniev@redhat.com>,
Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>
Cc: linux-nfs@vger.kernel.org, linux-rdma@vger.kernel.org,
Chuck Lever <chuck.lever@oracle.com>
Subject: [PATCH 0/2] svcrdma: Reduce svcrdma_wq contention on the Send completion path
Date: Wed, 06 May 2026 11:26:49 -0400
Message-ID: <20260506-svcrdma-next-v1-0-915fce8c4fbb@oracle.com>

Profiling an 8KB NFSv3 read/write workload over RDMA shows about
4% of total CPU spent on the svcrdma_wq unbound workqueue pool
spinlock. Each Send completion queues work on svcrdma_wq to
release the send_ctxt, and that work item queues another item
for each write_info chunk it owns. Every queue_work step contends
on the same pool lock.

The first patch removes the inner re-queue.
svc_rdma_write_info_free already runs on svcrdma_wq from its
caller, so the extra work item only adds another spinlock
acquisition with no parallelism to gain. Inlining the chunk
release recovers roughly 1% of CPU cycles. Mike, your workload
might see relief from just this patch alone.
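
As a rough illustration of why the inner re-queue costs more than it gains, the following userspace sketch counts enqueue-side pool-lock acquisitions per Send completion. It is a toy model, not kernel code: struct model_wq, model_queue_work(), release_with_requeue() and release_inline() are hypothetical names, and the model assumes exactly one lock acquisition per queue_work call and ignores the worker-side dequeue.

```c
#include <stddef.h>

/* Hypothetical model: counts pool-lock acquisitions; not kernel code. */
struct model_wq {
	unsigned long lock_acquisitions;	/* one per modelled queue_work() */
};

/* Stand-in for queue_work(): assume each call takes the pool lock once. */
static void model_queue_work(struct model_wq *wq)
{
	wq->lock_acquisitions++;
}

/* Before patch 1: the release work item re-queues once per chunk. */
static unsigned long release_with_requeue(struct model_wq *wq, size_t nchunks)
{
	model_queue_work(wq);		/* Send completion queues the release */
	for (size_t i = 0; i < nchunks; i++)
		model_queue_work(wq);	/* release work re-queues each chunk */
	return wq->lock_acquisitions;
}

/* After patch 1: chunks are released inline by the work item already running. */
static unsigned long release_inline(struct model_wq *wq, size_t nchunks)
{
	model_queue_work(wq);		/* Send completion queues the release */
	(void)nchunks;			/* chunk teardown runs inline, nothing queued */
	return wq->lock_acquisitions;
}
```

Under these assumptions a completion that owns three chunks takes the lock four times before the patch and once after it.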

The second patch retires svcrdma_wq. Send completion handlers
append the send_ctxt to a per-transport lock-free list, and the
nfsd thread drains the list in xpo_release_ctxt between RPCs.
DMA unmap and page release move out of the completion context.
That matters when an IOMMU runs in strict mode, where each unmap
synchronously invalidates the IOTLB; the nfsd thread absorbs that
latency where it is harmless and batches teardown across all
completions that accumulated during the prior RPC.
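
The producer/consumer shape described above can be sketched in userspace with C11 atomics. This is a minimal illustration, not the code in the patch: struct lhead, struct lnode, struct send_ctxt_model and the function names are hypothetical. The kernel's llist API (llist_add(), llist_del_all()) has the same shape, but whether the patch uses it is an assumption here.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct lnode {
	struct lnode *next;
};

struct lhead {
	_Atomic(struct lnode *) first;
};

static void list_init(struct lhead *head)
{
	atomic_init(&head->first, (struct lnode *)NULL);
}

/* Producer side (completion handler): push one node without taking a lock.
 * Returns true if the list was empty before the push. */
static bool list_push(struct lhead *head, struct lnode *node)
{
	struct lnode *old = atomic_load_explicit(&head->first,
						 memory_order_relaxed);

	do {
		node->next = old;	/* on CAS failure, old holds the new head */
	} while (!atomic_compare_exchange_weak_explicit(&head->first, &old, node,
							memory_order_release,
							memory_order_relaxed));
	return old == NULL;
}

/* Consumer side: detach the whole list with a single atomic exchange. */
static struct lnode *list_take_all(struct lhead *head)
{
	return atomic_exchange_explicit(&head->first, (struct lnode *)NULL,
					memory_order_acquire);
}

/* Hypothetical send context; node is the first member so the cast below
 * from struct lnode * is valid. */
struct send_ctxt_model {
	struct lnode node;
	int released;
};

/* Batch teardown: release every context that accumulated since the last drain. */
static size_t drain_release_list(struct lhead *head)
{
	struct lnode *n = list_take_all(head);
	size_t count = 0;

	while (n) {
		struct lnode *next = n->next;	/* read before releasing n */
		struct send_ctxt_model *ctxt = (struct send_ctxt_model *)n;

		ctxt->released = 1;	/* stands in for DMA unmap + page release */
		count++;
		n = next;
	}
	return count;
}
```

The drain detaches everything in one step, so the expensive teardown for all accumulated completions runs in the consumer thread rather than in the completion handler.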

A self-enqueue covers the trailing edge of a burst. When a Send
completion finds sc_send_release_list previously empty on an idle
connection, it sets XPT_DATA and enqueues the transport. The nfsd
thread enters svc_rdma_recvfrom, finds nothing to receive, and
returns; svc_xprt_release then runs xpo_release_ctxt and drains
the list. Without that wakeup, a Send completion arriving after
the last xpo_release_ctxt would leave the send_ctxt's DMA mappings
and reply pages pinned until the next RPC, send-context exhaustion,
or transport close.
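
The trailing-edge wakeup can be modelled as "wake a thread only on the empty to non-empty transition". The sketch below is a userspace approximation with hypothetical names (struct xprt_model, send_completion(), thread_pass()); data_flag stands in for XPT_DATA and the enqueues counter for enqueuing the transport. It omits the idle-connection check mentioned above, since the cover letter does not spell out how that test is made.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct node {
	struct node *next;
};

/* Hypothetical transport model; field names are illustrative only. */
struct xprt_model {
	_Atomic(struct node *) release_list;	/* models sc_send_release_list */
	atomic_bool data_flag;			/* models XPT_DATA */
	atomic_uint enqueues;			/* times the transport was enqueued */
};

static void xprt_model_init(struct xprt_model *x)
{
	atomic_init(&x->release_list, (struct node *)NULL);
	atomic_init(&x->data_flag, false);
	atomic_init(&x->enqueues, 0u);
}

/* Send completion: push the context, and request a thread only when the
 * list was empty beforehand. Later completions in the same burst ride
 * along with the wakeup that is already pending. */
static void send_completion(struct xprt_model *x, struct node *n)
{
	struct node *old = atomic_load_explicit(&x->release_list,
						memory_order_relaxed);

	do {
		n->next = old;
	} while (!atomic_compare_exchange_weak_explicit(&x->release_list, &old, n,
							memory_order_release,
							memory_order_relaxed));
	if (old == NULL) {
		atomic_store(&x->data_flag, true);
		atomic_fetch_add(&x->enqueues, 1u);
	}
}

/* Thread pass: nothing to receive, so clear the flag and drain the list.
 * Returns the number of contexts released. */
static size_t thread_pass(struct xprt_model *x)
{
	struct node *n;
	size_t count = 0;

	atomic_store(&x->data_flag, false);
	n = atomic_exchange_explicit(&x->release_list, (struct node *)NULL,
				     memory_order_acquire);
	for (; n; n = n->next)
		count++;
	return count;
}
```

In this model a burst of completions triggers one enqueue, and a completion that arrives after the drain triggers a fresh one, so nothing stays on the list waiting for the next RPC.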

These patches were rebased today but have not been tested recently.
---
Chuck Lever (2):
      svcrdma: Release write chunk resources without re-queuing
      svcrdma: Defer send context release to xpo_release_ctxt

 include/linux/sunrpc/svc_rdma.h          |  6 +--
 net/sunrpc/xprtrdma/svc_rdma.c           | 18 +------
 net/sunrpc/xprtrdma/svc_rdma_recvfrom.c  |  9 ++++
 net/sunrpc/xprtrdma/svc_rdma_rw.c        | 13 +----
 net/sunrpc/xprtrdma/svc_rdma_sendto.c    | 91 +++++++++++++++++++++++---------
 net/sunrpc/xprtrdma/svc_rdma_transport.c |  3 +-
 6 files changed, 84 insertions(+), 56 deletions(-)
---
base-commit: d1c29a34fe35c1eb9331cab0537c7bb583692187
change-id: 20260506-svcrdma-next-2e736249390f
Best regards,
--
Chuck Lever <chuck.lever@oracle.com>