* [PATCH] xprtrdma: Decrement re_receiving on the early exit paths
From: Eric Badger @ 2026-02-23 18:28 UTC
To: ebadger
Cc: Trond Myklebust, Anna Schumaker, Chuck Lever, Jeff Layton,
NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, open list:NFS, SUNRPC, AND LOCKD CLIENTS,
open list:NETWORKING [GENERAL], open list
In the event that rpcrdma_post_recvs() fails to create a work request
(due to memory allocation failure, say) or otherwise exits early, we
should decrement ep->re_receiving before returning. Otherwise we will
hang in rpcrdma_xprt_drain() as re_receiving will never reach zero and
the completion will never be triggered.
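
To illustrate the counter imbalance, here is a minimal userspace sketch.
It is a model, not the kernel code: struct model_ep and every model_*
name are hypothetical, the kernel atomic_t helpers are approximated with
C11 atomics, and the completion is reduced to a flag. It also assumes,
without showing it in this patch, that rpcrdma_xprt_drain() increments
re_receiving and waits on re_done when the result is greater than 1.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two rpcrdma_ep fields involved. */
struct model_ep {
	atomic_int receiving;	/* models ep->re_receiving */
	bool done;		/* set where complete(&ep->re_done) would run */
};

/*
 * Models rpcrdma_post_recvs() on the path where no work request could
 * be built (wr == NULL). "fixed" selects "goto out_dec" over "goto out".
 */
static void model_post_recvs_no_wr(struct model_ep *ep, bool fixed)
{
	/* Stands in for atomic_inc_return(): enter the critical section. */
	atomic_fetch_add(&ep->receiving, 1);

	/* Pre-patch behavior: jump to "out" and leak the increment. */
	if (!fixed)
		return;

	/* out_dec: a result above zero means a drainer is waiting. */
	if (atomic_fetch_sub(&ep->receiving, 1) - 1 > 0)
		ep->done = true;
}

/*
 * Models the assumed check in rpcrdma_xprt_drain(). Returns true when
 * it would sleep in wait_for_completion() with nobody left to wake it.
 */
static bool model_drain_would_block(struct model_ep *ep)
{
	if (atomic_fetch_add(&ep->receiving, 1) + 1 > 1)
		return !ep->done;
	return false;
}

/* One failed post followed by one drain, starting from a clean endpoint. */
bool drain_hangs_after_failed_post(bool fixed)
{
	struct model_ep ep = { .done = false };

	atomic_init(&ep.receiving, 0);
	model_post_recvs_no_wr(&ep, fixed);

	/* Without the fix, the leaked increment is still visible here. */
	assert(atomic_load(&ep.receiving) == (fixed ? 0 : 1));

	return model_drain_would_block(&ep);
}
```

In this model, drain_hangs_after_failed_post(false) reports a hang and
drain_hangs_after_failed_post(true) does not. The model is
single-threaded, so it does not exercise the case where a drainer is
already waiting when the early exit is taken.
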
On a system with high memory pressure, this can appear as the following
hung task:
INFO: task kworker/u385:17:8393 blocked for more than 122 seconds.
Tainted: G S E 6.19.0 #3
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u385:17 state:D stack:0 pid:8393 tgid:8393 ppid:2 task_flags:0x4248060 flags:0x00080000
Workqueue: xprtiod xprt_autoclose [sunrpc]
Call Trace:
<TASK>
__schedule+0x48b/0x18b0
? ib_post_send_mad+0x247/0xae0 [ib_core]
schedule+0x27/0xf0
schedule_timeout+0x104/0x110
__wait_for_common+0x98/0x180
? __pfx_schedule_timeout+0x10/0x10
wait_for_completion+0x24/0x40
rpcrdma_xprt_disconnect+0x444/0x460 [rpcrdma]
xprt_rdma_close+0x12/0x40 [rpcrdma]
xprt_autoclose+0x5f/0x120 [sunrpc]
process_one_work+0x191/0x3e0
worker_thread+0x2e3/0x420
? __pfx_worker_thread+0x10/0x10
kthread+0x10d/0x230
? __pfx_kthread+0x10/0x10
ret_from_fork+0x273/0x2b0
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
Fixes: 15788d1d1077 ("xprtrdma: Do not refresh Receive Queue while it is draining")
Signed-off-by: Eric Badger <ebadger@purestorage.com>
---
net/sunrpc/xprtrdma/verbs.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index 63262ef0c2e3..8abbd9c4045a 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -1362,7 +1362,7 @@ void rpcrdma_post_recvs(struct rpcrdma_xprt *r_xprt, int needed)
 		needed += RPCRDMA_MAX_RECV_BATCH;
 
 	if (atomic_inc_return(&ep->re_receiving) > 1)
-		goto out;
+		goto out_dec;
 
 	/* fast path: all needed reps can be found on the free list */
 	wr = NULL;
@@ -1385,7 +1385,7 @@ void rpcrdma_post_recvs(struct rpcrdma_xprt *r_xprt, int needed)
 		++count;
 	}
 	if (!wr)
-		goto out;
+		goto out_dec;
 
 	rc = ib_post_recv(ep->re_id->qp, wr,
 			  (const struct ib_recv_wr **)&bad_wr);
@@ -1400,9 +1400,10 @@ void rpcrdma_post_recvs(struct rpcrdma_xprt *r_xprt, int needed)
 			--count;
 		}
 	}
+
+out_dec:
 	if (atomic_dec_return(&ep->re_receiving) > 0)
 		complete(&ep->re_done);
-
 out:
 	trace_xprtrdma_post_recvs(r_xprt, count);
 	ep->re_receive_count += count;
--
2.43.0
* Re: [PATCH] xprtrdma: Decrement re_receiving on the early exit paths
From: Chuck Lever @ 2026-02-23 19:58 UTC
To: Eric Badger
Cc: Trond Myklebust, Anna Schumaker, Jeff Layton, NeilBrown,
Olga Kornievskaia, Dai Ngo, Tom Talpey, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
open list:NFS, SUNRPC, AND LOCKD CLIENTS,
open list:NETWORKING [GENERAL], open list
On 2/23/26 1:28 PM, Eric Badger wrote:
> In the event that rpcrdma_post_recvs() fails to create a work request
> (due to memory allocation failure, say) or otherwise exits early, we
> should decrement ep->re_receiving before returning. Otherwise we will
> hang in rpcrdma_xprt_drain() as re_receiving will never reach zero and
> the completion will never be triggered.
>
> On a system with high memory pressure, this can appear as the following
> hung task:
>
> INFO: task kworker/u385:17:8393 blocked for more than 122 seconds.
> Tainted: G S E 6.19.0 #3
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:kworker/u385:17 state:D stack:0 pid:8393 tgid:8393 ppid:2 task_flags:0x4248060 flags:0x00080000
> Workqueue: xprtiod xprt_autoclose [sunrpc]
> Call Trace:
> <TASK>
> __schedule+0x48b/0x18b0
> ? ib_post_send_mad+0x247/0xae0 [ib_core]
> schedule+0x27/0xf0
> schedule_timeout+0x104/0x110
> __wait_for_common+0x98/0x180
> ? __pfx_schedule_timeout+0x10/0x10
> wait_for_completion+0x24/0x40
> rpcrdma_xprt_disconnect+0x444/0x460 [rpcrdma]
> xprt_rdma_close+0x12/0x40 [rpcrdma]
> xprt_autoclose+0x5f/0x120 [sunrpc]
> process_one_work+0x191/0x3e0
> worker_thread+0x2e3/0x420
> ? __pfx_worker_thread+0x10/0x10
> kthread+0x10d/0x230
> ? __pfx_kthread+0x10/0x10
> ret_from_fork+0x273/0x2b0
> ? __pfx_kthread+0x10/0x10
> ret_from_fork_asm+0x1a/0x30
>
> Fixes: 15788d1d1077 ("xprtrdma: Do not refresh Receive Queue while it is draining")
> Signed-off-by: Eric Badger <ebadger@purestorage.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
--
Chuck Lever