Linux RDMA and InfiniBand development
From: Chuck Lever <cel@kernel.org>
To: Mike Snitzer <snitzer@kernel.org>,
	Jonathan Flynn <jonathan.flynn@hammerspace.com>
Cc: Jeff Layton <jlayton@kernel.org>, NeilBrown <neil@brown.name>,
	Olga Kornievskaia <okorniev@redhat.com>,
	Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>,
	linux-nfs@vger.kernel.org, linux-rdma@vger.kernel.org,
	Chuck Lever <chuck.lever@oracle.com>,
	ben.coddington@hammerspace.com
Subject: Re: [PATCH 1/2] svcrdma: Release write chunk resources without re-queuing
Date: Fri, 8 May 2026 16:14:42 -0400
Message-ID: <0267f73f-b4e5-4318-85de-808699f77d95@kernel.org>
In-Reply-To: <afz6PwRlpJFzoIQE@kernel.org>

On 5/7/26 4:46 PM, Mike Snitzer wrote:
> On Wed, May 06, 2026 at 11:26:50AM -0400, Chuck Lever wrote:
>> From: Chuck Lever <chuck.lever@oracle.com>
>>
>> Each RDMA Send completion triggers a cascade of work items on the
>> svcrdma_wq unbound workqueue:
>>
>>   ib_cq_poll_work (on ib_comp_wq, per-CPU)
>>     -> svc_rdma_send_ctxt_put -> queue_work    [work item 1]
>>       -> svc_rdma_write_info_free -> queue_work [work item 2]
>>
>> Every transition through queue_work contends on the unbound
>> pool's spinlock. Profiling an 8KB NFSv3 read/write workload
>> over RDMA shows about 4% of total CPU cycles spent on this
>> lock, with the cascading re-queue of write_info release
>> contributing roughly 1%.
>>
>> The initial queue_work in svc_rdma_send_ctxt_put is needed to
>> move release work off the CQ completion context (which runs on
>> a per-CPU bound workqueue). However, once executing on
>> svcrdma_wq, there is no need to re-queue for each write_info
>> structure. svc_rdma_reply_chunk_release already calls
>> svc_rdma_cc_release inline from the same svcrdma_wq context,
>> and svc_rdma_recv_ctxt_put does the same from nfsd thread
>> context.
>>
>> Release write chunk resources inline in
>> svc_rdma_write_info_free, removing the intermediate
>> svc_rdma_write_info_free_async work item and the wi_work
>> field from struct svc_rdma_write_info.
>>
>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> 
> You were correct: this patch alone eliminates the OOM (we tested with
> both 16K and then 4K read IO from 121 clients to 8 NFS servers, no
> measurable memory growth while testing).

Excellent news! Thanks to you both for testing.
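
For readers following the thread, the queue_work cascade described in
the quoted patch can be modeled in userspace C roughly as follows. This
is only an illustrative sketch, not the svcrdma code: struct fake_wq,
fake_queue_work(), release_write_info(), and the two send_ctxt_put_*()
helpers are hypothetical names, and the queue_calls counter merely
stands in for acquisitions of the unbound pool's spinlock.

```c
#include <stddef.h>

/* Hypothetical stand-in for an unbound workqueue. It does not run any
 * work; it only counts events so the two release strategies can be
 * compared. This is NOT the kernel workqueue API. */
struct fake_wq {
	unsigned long queue_calls;	/* queue_work()-like calls (lock acquisitions) */
	unsigned long released;		/* write_info structures released */
};

/* Model one queue_work() call: each call would contend on the pool lock. */
static void fake_queue_work(struct fake_wq *wq)
{
	wq->queue_calls++;
}

/* Model releasing the resources held by one write_info structure. */
static void release_write_info(struct fake_wq *wq)
{
	wq->released++;
}

/* Before the patch (as described in the commit message): the send ctxt
 * release is queued once to leave CQ completion context, and then each
 * write_info release is queued again as its own work item. */
void send_ctxt_put_cascaded(struct fake_wq *wq, unsigned int nr_write_info)
{
	unsigned int i;

	fake_queue_work(wq);		/* work item 1 */
	for (i = 0; i < nr_write_info; i++) {
		fake_queue_work(wq);	/* work item 2, one per write_info */
		release_write_info(wq);
	}
}

/* After the patch: only the initial hand-off is queued; each write_info
 * is released inline from the context that is already running. */
void send_ctxt_put_inline(struct fake_wq *wq, unsigned int nr_write_info)
{
	unsigned int i;

	fake_queue_work(wq);		/* work item 1 */
	for (i = 0; i < nr_write_info; i++)
		release_write_info(wq);	/* no re-queue */
}
```

In this model, a completion carrying N write_info structures costs
N + 1 queue operations in the cascaded variant and one in the inline
variant, while both release the same N structures. The model says
nothing about real lock hold times or the 4% and 1% figures quoted
above, which come from profiling.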


> Feel free to add:
> 
> Reviewed-by: Mike Snitzer <snitzer@kernel.org>
> Tested-by: Jonathan Flynn <jonathan.flynn@hammerspace.com>
> 
> Thanks!
> Mike
> 
> ps.
> So you are aware, couldn't test your 2nd patch at the customer site
> because the baseline kernel there is based on 6.12-stable but your
> 2nd patch builds on your 7.1 svcrdma changes. I think your 2nd patch
> is ideal though, and will be able to pull it in to test in future, but
> won't have the ability to test at this customer's scale until we can
> roll that newer kernel out there... might take a couple months.

Understood.


-- 
Chuck Lever

Thread overview:
2026-05-06 15:26 [PATCH 0/2] svcrdma: Reduce svcrdma_wq contention on the Send completion path Chuck Lever
2026-05-06 15:26 ` [PATCH 1/2] svcrdma: Release write chunk resources without re-queuing Chuck Lever
2026-05-07 20:46   ` Mike Snitzer
2026-05-08 20:14     ` Chuck Lever [this message]
2026-05-06 15:26 ` [PATCH 2/2] svcrdma: Defer send context release to xpo_release_ctxt Chuck Lever
