Linux RDMA and InfiniBand development
From: Mike Snitzer <snitzer@kernel.org>
To: Chuck Lever <cel@kernel.org>,
	Jonathan Flynn <jonathan.flynn@hammerspace.com>
Cc: Jeff Layton <jlayton@kernel.org>, NeilBrown <neil@brown.name>,
	Olga Kornievskaia <okorniev@redhat.com>,
	Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>,
	linux-nfs@vger.kernel.org, linux-rdma@vger.kernel.org,
	Chuck Lever <chuck.lever@oracle.com>,
	ben.coddington@hammerspace.com
Subject: Re: [PATCH 1/2] svcrdma: Release write chunk resources without re-queuing
Date: Thu, 7 May 2026 16:46:55 -0400
Message-ID: <afz6PwRlpJFzoIQE@kernel.org>
In-Reply-To: <20260506-svcrdma-next-v1-1-915fce8c4fbb@oracle.com>

On Wed, May 06, 2026 at 11:26:50AM -0400, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> Each RDMA Send completion triggers a cascade of work items on the
> svcrdma_wq unbound workqueue:
> 
>   ib_cq_poll_work (on ib_comp_wq, per-CPU)
>     -> svc_rdma_send_ctxt_put -> queue_work    [work item 1]
>       -> svc_rdma_write_info_free -> queue_work [work item 2]
> 
> Every transition through queue_work contends on the unbound
> pool's spinlock. Profiling an 8KB NFSv3 read/write workload
> over RDMA shows about 4% of total CPU cycles spent on this
> lock, with the cascading re-queue of write_info release
> contributing roughly 1%.
> 
> The initial queue_work in svc_rdma_send_ctxt_put is needed to
> move release work off the CQ completion context (which runs on
> a per-CPU bound workqueue). However, once executing on
> svcrdma_wq, there is no need to re-queue for each write_info
> structure. svc_rdma_reply_chunk_release already calls
> svc_rdma_cc_release inline from the same svcrdma_wq context,
> and svc_rdma_recv_ctxt_put does the same from nfsd thread
> context.
> 
> Release write chunk resources inline in
> svc_rdma_write_info_free, removing the intermediate
> svc_rdma_write_info_free_async work item and the wi_work
> field from struct svc_rdma_write_info.
> 
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

You were correct: this patch alone eliminates the OOM (we tested with
both 16K and then 4K read IO from 121 clients to 8 NFS servers, no
measurable memory growth while testing).

Feel free to add:

Reviewed-by: Mike Snitzer <snitzer@kernel.org>
Tested-by: Jonathan Flynn <jonathan.flynn@hammerspace.com>

Thanks!
Mike

PS:
Just so you are aware, we couldn't test your 2nd patch at the customer
site because the baseline kernel there is based on 6.12-stable, but your
2nd patch builds on your 7.1 svcrdma changes. I think your 2nd patch
is ideal though, and we will be able to pull it in to test in the future,
but we won't have the ability to test at this customer's scale until we
can roll that newer kernel out there... that might take a couple of months.


Thread overview: 5+ messages
2026-05-06 15:26 [PATCH 0/2] svcrdma: Reduce svcrdma_wq contention on the Send completion path Chuck Lever
2026-05-06 15:26 ` [PATCH 1/2] svcrdma: Release write chunk resources without re-queuing Chuck Lever
2026-05-07 20:46   ` Mike Snitzer [this message]
2026-05-08 20:14     ` Chuck Lever
2026-05-06 15:26 ` [PATCH 2/2] svcrdma: Defer send context release to xpo_release_ctxt Chuck Lever
