From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CAE1E30F927; Tue, 10 Feb 2026 16:32:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1770741157; cv=none; b=pAjRGQhWnRjAE0wL8+TtFLm5ZWwZENZqmXzAbvj43N/z0gPBlDa/kpj/BddUyUTL0PhKUVgifMjMo3sXryFU9ZnE/FOixqTg9NUfLR1J2BNfgciXNVVmE3SksIZ1BCyaQDTaFqxj3V4egG17eg6t9ikJruOFuRZyE5pyZatee7k= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1770741157; c=relaxed/simple; bh=JZs7gDmh5mV++11AEBxm1sctrg0slj0GRYi/OVScehI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=TZPhKmjxX3g57ZDjVpUy6meEV5SNC+3HsPPZfwP9qlOwul8JmMhXL/U92dvbChfnzXlNUgtPqMhtI/igPl1MYnDCuIBaMNVKN5a0Qh6R/7bGkHFhKXJYxguwFI2YZryUucbfKD8edpIGGacrf0ZrSX3fBfDoRayKNuwEAEYnx+M= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=hNYYCgZb; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="hNYYCgZb" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1385FC19421; Tue, 10 Feb 2026 16:32:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1770741157; bh=JZs7gDmh5mV++11AEBxm1sctrg0slj0GRYi/OVScehI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=hNYYCgZbPdIZiuCcZLJTNUWL4XnGAkjygk/Q/2TWr5Ky3RgNNqIaRG3CW6zB44nn1 qr/ngI0fLsm0Nc6nx6B9qcefQRsJCFurZxlq3wJUiifHB4vhAJLklxPp74V//r1IPF LKMq1aslsN8bbNNA64RllBZvdvtC6tr1Ph5VWggo6qqx/eCPalGbers3ts0m44zmqm D/CL5R6NtvKRTc9gXm5r89ziXPEdGrxROjIgG2e38S3mjBeUyNfRu+38dVV3TxyzYW cAtdUXiNtI036BVs9aHhJ1rMCnAVTCcnl0q0r8nfP64mCt5U43oNE2wSnYxuhqx1i0 KHc/54OoBngWw== From: Chuck Lever To: NeilBrown , Jeff Layton , Olga Kornievskaia , Dai Ngo , Tom Talpey Cc: , , Chuck Lever Subject: [RFC PATCH 13/15] svcrdma: clear XPT_DATA on sc_read_complete_q consumption Date: Tue, 10 Feb 2026 11:32:20 -0500 Message-ID: <20260210163222.2356793-14-cel@kernel.org> X-Mailer: git-send-email 2.52.0 In-Reply-To: <20260210163222.2356793-1-cel@kernel.org> References: <20260210163222.2356793-1-cel@kernel.org> Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit From: Chuck Lever svc_rdma_wc_read_done() sets XPT_DATA when adding a completed RDMA Read context to sc_read_complete_q. The consumer in svc_rdma_recvfrom() takes the context but leaves XPT_DATA set. The subsequent svc_xprt_received() clears XPT_BUSY and re-enqueues the transport; because XPT_DATA remains set, a second thread awakens. That thread finds both queues empty, accomplishes nothing, and releases its slot and reservation. Trace data from a 256KB NFSv3 WRITE workload over RDMA shows approximately 14 enqueue attempts per RPC, with 62% returning immediately due to no pending data. The majority originate from this spurious dispatch path. After clearing XPT_DATA to acknowledge consumption, the XPT_DATA state must be recomputed from both queue states. A concurrent producer may call llist_add and then set_bit(XPT_DATA) between this consumer's llist_del_first and the clear_bit, causing clear_bit to erase the producer's signal. An smp_mb__after_atomic() barrier after clear_bit pairs with the implicit barrier in each producer's llist_add cmpxchg, ensuring llist_empty rechecks observe any add whose set_bit was erased. This barrier requirement applies at both call sites: the new sc_read_complete_q path and the pre-existing sc_rq_dto_q "both queues empty" path. A new helper svc_rdma_update_xpt_data() centralizes this clear/barrier/recheck/set pattern to ensure both locations maintain the required memory ordering. Signed-off-by: Chuck Lever --- net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 33 ++++++++++++++++--------- 1 file changed, 22 insertions(+), 11 deletions(-) diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c index b48ef78c79c2..2ee9819a53d7 100644 --- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c @@ -917,6 +917,25 @@ static noinline void svc_rdma_read_complete(struct svc_rqst *rqstp, trace_svcrdma_read_finished(&ctxt->rc_cid); } +/* + * Recompute XPT_DATA from queue state after consuming a completion. A + * concurrent producer may have called llist_add and then set_bit(XPT_DATA) + * between this consumer's llist_del_first and the clear_bit below, causing + * clear_bit to erase the producer's signal. The barrier pairs with the + * implicit barrier in each producer's llist_add so that the llist_empty + * rechecks observe any add whose set_bit was erased. + */ +static void svc_rdma_update_xpt_data(struct svcxprt_rdma *rdma) +{ + struct svc_xprt *xprt = &rdma->sc_xprt; + + clear_bit(XPT_DATA, &xprt->xpt_flags); + smp_mb__after_atomic(); + if (!llist_empty(&rdma->sc_rq_dto_q) || + !llist_empty(&rdma->sc_read_complete_q)) + set_bit(XPT_DATA, &xprt->xpt_flags); +} + /** * svc_rdma_recvfrom - Receive an RPC call * @rqstp: request structure into which to receive an RPC Call @@ -965,6 +984,8 @@ int svc_rdma_recvfrom(struct svc_rqst *rqstp) node = llist_del_first(&rdma_xprt->sc_read_complete_q); if (node) { ctxt = llist_entry(node, struct svc_rdma_recv_ctxt, rc_node); + + svc_rdma_update_xpt_data(rdma_xprt); svc_xprt_received(xprt); svc_rdma_read_complete(rqstp, ctxt); goto complete; @@ -975,17 +996,7 @@ int svc_rdma_recvfrom(struct svc_rqst *rqstp) } else { ctxt = NULL; /* No new incoming requests, terminate the loop */ - clear_bit(XPT_DATA, &xprt->xpt_flags); - - /* - * If a completion arrived after llist_del_first but - * before clear_bit, the producer's set_bit would be - * cleared above. Recheck both queues to close this - * race window. - */ - if (!llist_empty(&rdma_xprt->sc_rq_dto_q) || - !llist_empty(&rdma_xprt->sc_read_complete_q)) - set_bit(XPT_DATA, &xprt->xpt_flags); + svc_rdma_update_xpt_data(rdma_xprt); } /* Unblock the transport for the next receive */ -- 2.52.0