All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] RDMA/siw: bound Read Response placement to the RREAD length
@ 2026-06-02 19:47 Michael Bommarito
  2026-06-03  9:44 ` Bernard Metzler
  2026-06-05 17:07 ` Jason Gunthorpe
  0 siblings, 2 replies; 4+ messages in thread
From: Michael Bommarito @ 2026-06-02 19:47 UTC (permalink / raw)
  To: Bernard Metzler, Jason Gunthorpe, Leon Romanovsky
  Cc: linux-rdma, linux-kernel

In drivers/infiniband/sw/siw/siw_qp_rx.c, siw_proc_rresp() places each
inbound Read Response DDP segment at sge->laddr + wqe->processed and then
accumulates wqe->processed, but it never checks the running total against
the sink buffer length on continuation segments. siw_check_sge() resolves
and validates the sink memory only on the first fragment (the if (!*mem)
branch), and siw_rresp_check_ntoh() compares the cumulative length against
wqe->bytes only on the final segment (the !frx->more_ddp_segs guard).

A connected siw peer that answers an outstanding RREAD with Read Response
segments that keep the DDP Last flag clear, carrying more total payload
than the RREAD requested, drives wqe->processed past the validated sink
buffer; the next siw_rx_data() call writes out of bounds at
sge->laddr + wqe->processed. siw runs iWARP over ordinary routable TCP,
so the peer is the remote end of an established RDMA connection and needs
no local privilege.

Bound every segment before placement, exactly as siw_proc_send() and
siw_proc_write() already do for their tagged and untagged paths, and
terminate the connection with a base-or-bounds DDP error when the
Read Response would overrun the sink buffer.

This is the second receive-path length fix for this file. A separate
change rejects an MPA FPDU length that underflows the per-fragment
remainder in the header decode; that guard does not cover this case,
because here each individual segment length is self-consistent and only
the accumulated placement offset overruns the buffer.

Fixes: 8b6a361b8c48 ("rdma/siw: receive path")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
---

Impact: the remote peer of an established Soft-iWARP connection can write
out of bounds past the RREAD sink (global DMA MR sink) or force a
connection-terminating fault (default FRWR sink) by streaming Read
Response segments whose cumulative length exceeds the requested length
with the DDP Last flag clear.

Relationship to the in-flight siw receive-path fix

This is the second receive-path length fix I have for this file. The
first, currently under review on this list, rejects an MPA FPDU length
that underflows the per-fragment remainder during header decode in
siw_get_hdr(). That guard sits in the header-length path and does not
cover the case here: in this report every individual segment length is
self-consistent and non-negative, and only the accumulated placement
offset (wqe->processed across continuation segments) overruns the sink
buffer. The two fixes touch different functions and are independent;
either can be applied without the other. I am sending this as a fresh
top-level thread rather than threading it under the earlier series to
keep the two changes separable for review and for stable selection.

Analysis

Verified by reading siw_qp_rx.c at v7.1-rc4 (a1f173eb51db):

  - siw_proc_rresp() resolves and range-checks the sink memory region
    once, on the first fragment only (the if (!*mem) branch calling
    siw_check_sge() with the full requested length).
  - The cumulative-length consistency check in siw_rresp_check_ntoh()
    is gated on !more_ddp_segs, so it fires only on the segment that
    carries the DDP Last flag, and siw_rresp_check_ntoh() itself runs
    only on the first DDP segment.
  - Each segment is then placed at sge->laddr + wqe->processed and
    wqe->processed is accumulated, with no per-segment bound against
    wqe->bytes.

A peer that keeps the DDP Last flag clear across continuation segments
and streams more total payload than the outstanding RREAD requested
therefore drives wqe->processed past the validated sink region; the
next placement writes peer-supplied bytes out of bounds. For a sink
backed by a global DMA / kernel-virtual MR (mem_obj NULL, the
siw_rx_kva() path) this is a direct kernel-heap out-of-bounds write.
For a user-memory or fast-registration/PBL MR (siw_rx_umem() /
siw_rx_pbl(), the mode kernel ULPs use by default) the over-walk is
caught when the page or PBL index runs out and is converted to a
connection-terminating fault. The missing per-segment bound should be
added regardless of sink type.

siw_proc_send() and siw_proc_write() already bound every segment with
the equivalent wqe->bytes < wqe->processed + srx->fpdu_part_rem check;
this patch brings siw_proc_rresp() in line with them.

Conditions: CONFIG_RDMA_SIW=m or =y, siw loaded and a link configured
(modprobe siw; rdma link add ... type siw), and a local ULP with an
outstanding RDMA READ to the peer. The out-of-bounds write requires the
READ sink to be a global DMA / kernel-virtual MR; ULPs using the default
fast-registration (FRWR) sink instead see a connection-terminating
fault. No special sysctls or local privilege on the victim are required
beyond the configured fabric.

Mitigations: until the patch is applied, do not configure Soft-iWARP
toward untrusted peers, or restrict the RDMA-CM listener to trusted
peers at the network layer. Removing the module (rmmod siw) is
sufficient when no application is using it.

A function-level reproducer that drives siw_proc_rresp() over the real
kernel TCP receive path with a primed ORQ and a two-segment malformed
Read Response reproduces a KASAN slab-out-of-bounds write on a global
DMA MR sink; it can be shared off-list on request.

Fixes: 8b6a361b8c48 ("rdma/siw: receive path"), reachable from v5.10
and all later releases; all stable branches carrying siw are affected.

 drivers/infiniband/sw/siw/siw_qp_rx.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/drivers/infiniband/sw/siw/siw_qp_rx.c b/drivers/infiniband/sw/siw/siw_qp_rx.c
index e8a88b378d51d..ec4aadef3dfe2 100644
--- a/drivers/infiniband/sw/siw/siw_qp_rx.c
+++ b/drivers/infiniband/sw/siw/siw_qp_rx.c
@@ -844,6 +844,25 @@ int siw_proc_rresp(struct siw_qp *qp)
 	}
 	mem_p = *mem;
 
+	/*
+	 * siw_rresp_check_ntoh() validates the cumulative length only on
+	 * the last DDP segment (!more_ddp_segs), and siw_check_sge() above
+	 * resolves the sink memory only on the first fragment. A peer that
+	 * keeps DDP_FLAG_LAST clear and streams more payload than the
+	 * outstanding RREAD requested therefore drives wqe->processed past
+	 * the validated sink buffer, writing out of bounds. Bound every
+	 * segment as siw_proc_send()/siw_proc_write() already do.
+	 */
+	if (unlikely(wqe->processed + srx->fpdu_part_rem > wqe->bytes)) {
+		siw_dbg_qp(qp, "rresp len: %d + %d > %d\n",
+			   wqe->processed, srx->fpdu_part_rem, wqe->bytes);
+		wqe->wc_status = SIW_WC_LOC_LEN_ERR;
+		siw_init_terminate(qp, TERM_ERROR_LAYER_DDP,
+				   DDP_ETYPE_TAGGED_BUF,
+				   DDP_ECODE_T_BASE_BOUNDS, 0);
+		return -EINVAL;
+	}
+
 	bytes = min(srx->fpdu_part_rem, srx->skb_new);
 	rv = siw_rx_data(mem_p, srx, &frx->pbl_idx,
 			 sge->laddr + wqe->processed, bytes);
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2026-06-06 18:42 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-02 19:47 [PATCH] RDMA/siw: bound Read Response placement to the RREAD length Michael Bommarito
2026-06-03  9:44 ` Bernard Metzler
2026-06-05 17:07 ` Jason Gunthorpe
2026-06-06 18:42   ` Michael Bommarito

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.