From: Chuck Lever <cel@kernel.org>
To: Jeff Layton <jlayton@kernel.org>, NeilBrown <neil@brown.name>,
Olga Kornievskaia <okorniev@redhat.com>,
Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>
Cc: linux-rdma@vger.kernel.org, linux-nfs@vger.kernel.org,
Chris Mason <clm@meta.com>, Chuck Lever <chuck.lever@oracle.com>
Subject: [PATCH 2/6] svcrdma: Fix offset arithmetic in read_chunk_range
Date: Tue, 26 May 2026 09:35:56 -0400 [thread overview]
Message-ID: <20260526-rpc-kernel-bugs-v1-2-e251306ccca9@oracle.com> (raw)
In-Reply-To: <20260526-rpc-kernel-bugs-v1-0-e251306ccca9@oracle.com>
From: Chris Mason <clm@meta.com>
svc_rdma_read_chunk_range() walks a Read chunk's segment list to
build a sub-range starting at byte offset and spanning length bytes
for a Position-Zero or Call chunk. Two arithmetic defects in the
per-segment loop produce wrong DMA lengths and a u32 underflow:
pcl_for_each_segment(segment, chunk) {
if (offset > segment->rs_length) {
offset -= segment->rs_length;
continue;
}
dummy.rs_handle = segment->rs_handle;
dummy.rs_length = min_t(u32, length,
segment->rs_length) - offset;
dummy.rs_offset = segment->rs_offset + offset;
First, the skip predicate uses '>' instead of '>='. When offset
equals the segment's full rs_length, the segment is fully consumed
and should be skipped, but the loop falls through into the body.
The resulting dummy.rs_length is min_t(u32, length, rs_length) -
rs_length, which underflows to a near-UINT_MAX u32 when length is
smaller than rs_length, or is zero otherwise.
Second, the length formula subtracts offset from the min_t() result
rather than from segment->rs_length before the cap. For offset > 0
the segment's residual is rs_length - offset, not rs_length, so the
cap must be applied to the residual. With the current bracketing,
whenever length is smaller than rs_length - offset the per-segment
length becomes length - offset instead of length, silently dropping
offset bytes from the rebuilt chunk. Combined with the boundary
case above it also enables the u32 underflow path, which propagates
a huge nr_bvec into svc_rdma_build_read_segment() and a multi-MiB
kmalloc_array_node() in svc_rdma_get_rw_ctxt().
Additionally, svc_rdma_read_call_chunk() can invoke this function
with length == 0 when the last Read chunk ends exactly at the end
of the Call chunk. With the corrected >= predicate, every segment
is skipped and the function returns the initial -EINVAL, rejecting
a valid request. Return success immediately when length is zero.
Also break out of the loop once length is fully consumed to avoid
passing zero-length segments to svc_rdma_build_read_segment().
Fix by using '>=' so a fully-consumed segment is skipped, by
moving '- offset' inside min_t() so the cap is applied to the
segment's residual length, by returning success for zero-length
requests, and by stopping iteration when the requested range has
been consumed.
Fixes: d7cc73972661 ("svcrdma: support multiple Read chunks per RPC")
Assisted-by: kres (claude-opus-4-7)
Signed-off-by: Chris Mason <clm@meta.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
net/sunrpc/xprtrdma/svc_rdma_rw.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/net/sunrpc/xprtrdma/svc_rdma_rw.c b/net/sunrpc/xprtrdma/svc_rdma_rw.c
index eb4bc56ed387..182bd577e0b7 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_rw.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_rw.c
@@ -1227,17 +1227,20 @@ static int svc_rdma_read_chunk_range(struct svc_rqst *rqstp,
const struct svc_rdma_segment *segment;
int ret;
+ if (!length)
+ return 0;
+
ret = -EINVAL;
pcl_for_each_segment(segment, chunk) {
struct svc_rdma_segment dummy;
- if (offset > segment->rs_length) {
+ if (offset >= segment->rs_length) {
offset -= segment->rs_length;
continue;
}
dummy.rs_handle = segment->rs_handle;
- dummy.rs_length = min_t(u32, length, segment->rs_length) - offset;
+ dummy.rs_length = min_t(u32, length, segment->rs_length - offset);
dummy.rs_offset = segment->rs_offset + offset;
ret = svc_rdma_build_read_segment(rqstp, head, &dummy);
@@ -1246,6 +1249,8 @@ static int svc_rdma_read_chunk_range(struct svc_rqst *rqstp,
head->rc_readbytes += dummy.rs_length;
length -= dummy.rs_length;
+ if (!length)
+ break;
offset = 0;
}
return ret;
--
2.54.0
next prev parent reply other threads:[~2026-05-26 13:36 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-26 13:35 [PATCH 0/6] svcrdma: harden parsed chunk list against malformed wire values Chuck Lever
2026-05-26 13:35 ` [PATCH 1/6] svcrdma: validate Read chunk positions before reconstruction Chuck Lever
2026-05-26 13:35 ` Chuck Lever [this message]
2026-05-26 13:35 ` [PATCH 3/6] svcrdma: reject oversized Read segments at decode time Chuck Lever
2026-05-26 13:35 ` [PATCH 4/6] svcrdma: fix pcl_for_each_segment for empty chunks Chuck Lever
2026-05-26 13:35 ` [PATCH 5/6] svcrdma: reject Write/Reply chunks with segcount 0 Chuck Lever
2026-05-26 13:36 ` [PATCH 6/6] svcrdma: Validate Read chunk positions at decode time Chuck Lever
2026-05-27 15:19 ` [PATCH 0/6] svcrdma: harden parsed chunk list against malformed wire values Jeff Layton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260526-rpc-kernel-bugs-v1-2-e251306ccca9@oracle.com \
--to=cel@kernel.org \
--cc=Dai.Ngo@oracle.com \
--cc=chuck.lever@oracle.com \
--cc=clm@meta.com \
--cc=jlayton@kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=linux-rdma@vger.kernel.org \
--cc=neil@brown.name \
--cc=okorniev@redhat.com \
--cc=tom@talpey.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox