[PATCH] IB/mad: cap RMPP reassembly window size to bound find_seg

* [PATCH] IB/mad: cap RMPP reassembly window size to bound find_seg_location walk
@ 2026-05-18 21:23 Michael Bommarito
  2026-05-19 14:46 ` Leon Romanovsky
  2026-05-20 15:47 ` [PATCH v2] IB/mad: cap RMPP reassembly window size Michael Bommarito
  0 siblings, 2 replies; 7+ messages in thread
From: Michael Bommarito @ 2026-05-18 21:23 UTC (permalink / raw)
  To: Jason Gunthorpe, Leon Romanovsky
  Cc: linux-rdma, linux-kernel, stable, Vlad Dumitrescu, Or Har-Toov,
	Bob Pearson, Sean Hefty, Kees Cook

A peer on the same InfiniBand subnet or RoCEv2 L2 (or any UDP/4791-
reachable peer for internet-exposed RoCEv2 ports) can pin a target
port's IB MAD kworker for milliseconds per low-bandwidth RMPP burst
by sending an RMPP management transaction with descending segment
numbers. QP1 GMP traffic is unauthenticated by IBTA spec, so no
credentials are required. The bug sits on the IB management path
(QP1 GMP RMPP reassembly), not the RDMA data plane, so RDMA verbs
throughput is unaffected; deployments that raise recv_queue_size to
tune management-plane throughput are quadratically more exposed,
because per-burst cost grows O(F^2) with the configured window.

drivers/infiniband/core/mad_rmpp.c::find_seg_location() walks
rmpp_recv->rmpp_wc->rmpp_list in reverse on every inbound RMPP DATA
segment to locate the insertion point keyed by segment number. The
walk is O(N) per insert under spin_lock_irqsave(&rmpp_recv->lock) in
kworker context, so F adversarially-reordered segments aggregate to
O(F^2). window_size() returns max(recv_queue.max_active >> 3, 1):
the IB MAD core default recv_queue_size of 512 yields window=64
(per-burst cost in the microsecond range), but tuned production
configs with recv_queue_size=8192 push window to 1024 and let a
single low-bandwidth burst pin the per-port MAD kworker for several
milliseconds.
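
The quadratic growth described above can be illustrated with a small user-space model. This is only a sketch, not the kernel code: `struct seg`, `struct seg_list`, `insert_reverse_walk()` and `descending_burst_cost()` are hypothetical names, the list is a simplified stand-in for `list_head`, and duplicate handling, locking and window rejection are all left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for one queued RMPP segment. */
struct seg {
	int seg_num;
	struct seg *prev, *next;
};

/* Circular list with a sentinel head, loosely modelled on list_head. */
struct seg_list {
	struct seg head;
};

static void seg_list_init(struct seg_list *l)
{
	l->head.prev = l->head.next = &l->head;
}

/*
 * Walk from the tail towards the head until an entry with a smaller
 * segment number (or the head) is reached, then link the new segment
 * after it. Returns how many queued entries were stepped past.
 */
static size_t insert_reverse_walk(struct seg_list *l, struct seg *s)
{
	struct seg *pos = l->head.prev;
	size_t visited = 0;

	while (pos != &l->head && pos->seg_num >= s->seg_num) {
		pos = pos->prev;
		visited++;
	}
	s->prev = pos;
	s->next = pos->next;
	pos->next->prev = s;
	pos->next = s;
	return visited;
}

/* Feed f segments numbered f, f-1, ..., 1 and return the total steps. */
static size_t descending_burst_cost(struct seg *pool, size_t f)
{
	struct seg_list l;
	size_t total = 0;
	size_t i;

	assert(f > 0);
	seg_list_init(&l);
	for (i = 0; i < f; i++) {
		pool[i].seg_num = (int)(f - i);
		total += insert_reverse_walk(&l, &pool[i]);
	}
	return total;
}
```

In this model every descending insert steps past the whole list, so a burst of F segments costs F(F-1)/2 steps: 2016 for F=64 and 523776 for F=1024.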

Cap the effective window at IB_MAD_RMPP_MAX_WINDOW = 64 in
window_size() so admins tuning recv_queue_size for higher RX throughput
do not enlarge the walker attack surface. Real RMPP transactions in
the wild (SA queries, perf-counter reads) are well served by a window
of 64, which is also the IB MAD core default. A structural follow-up
would convert rmpp_recv->rmpp_wc->rmpp_list to an rb_tree keyed by
seg_num and lift the cap; that mirrors tcp_data_queue_ofo post-
CVE-2018-5390. For now the cap suffices.

Fixes: fa619a77046b ("[PATCH] IB: Add RMPP implementation")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
---
I reproduced this under x86_64 QEMU/KVM (4 vCPUs) on v7.1-rc2 with
CONFIG_RDMA_RXE + CONFIG_INFINIBAND_USER_MAD, a veth pair carrying
two rdma_rxe links, and raw RoCEv2/UDP/4791 packet injection with
descending seg_num while holding seg #1. Without the cap, F=1024
burst produces 1022 paired continue_rmpp invocations whose per-call
walker duration grows from ~1 us (early, near-empty list) to ~5 us
(late, ~1000-deep list), a 4x per-call amplification as the queue
deepens, with aggregate walker time per burst >= 1.5 ms (lower bound,
ftrace 1 us granularity). With the cap, the same F=1024 burst drops
to ~0.28 ms aggregate (5.4x reduction); F=32 in-window legitimate
RMPP still completes normally (30 walker calls, avg 1.5 us, max 3 us).
tools/testing/selftests/drivers/net/rdma/ carries no RMPP-specific
selftest in v7.1-rc2 (rdma_rxe self-tests do not exercise QP1 GMP
RMPP reassembly), so no in-tree selftest delta to report.

 drivers/infiniband/core/mad_rmpp.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/mad_rmpp.c b/drivers/infiniband/core/mad_rmpp.c
index 17c4c52a19e4c..4d55b133c689c 100644
--- a/drivers/infiniband/core/mad_rmpp.c
+++ b/drivers/infiniband/core/mad_rmpp.c
@@ -391,9 +391,25 @@ static inline struct ib_mad_recv_buf *get_next_seg(struct list_head *rmpp_list,
 	return container_of(seg->list.next, struct ib_mad_recv_buf, list);
 }

+/*
+ * Cap the per-RMPP-transaction in-flight window. find_seg_location()
+ * walks the rmpp_recv list reverse to find each insertion point, so the
+ * aggregate cost across an attacker-paced reordered window is O(N^2)
+ * under spin_lock_irqsave(&rmpp_recv->lock) in kworker context. The
+ * default recv_queue_size of 512 yields window=64, which keeps that
+ * cost in the noise; tuned configurations (recv_queue_size up to 8192)
+ * push window to 1024 and the per-port kworker measurably stalls under
+ * a low-bandwidth burst from any unauthenticated peer on QP1 GMP. Cap
+ * window at IB_MAD_RMPP_MAX_WINDOW so the bug class is structurally
+ * defused regardless of recv_queue_size tuning.
+ */
+#define IB_MAD_RMPP_MAX_WINDOW 64
+
 static inline int window_size(struct ib_mad_agent_private *agent)
 {
-	return max(agent->qp_info->recv_queue.max_active >> 3, 1);
+	int wsize = agent->qp_info->recv_queue.max_active >> 3;
+
+	return clamp(wsize, 1, IB_MAD_RMPP_MAX_WINDOW);
 }

 static struct ib_mad_recv_buf *find_seg_location(struct list_head *rmpp_list,
-- 
2.53.0

* Re: [PATCH] IB/mad: cap RMPP reassembly window size to bound find_seg_location walk
  2026-05-18 21:23 [PATCH] IB/mad: cap RMPP reassembly window size to bound find_seg_location walk Michael Bommarito
@ 2026-05-19 14:46 ` Leon Romanovsky
  2026-05-20 15:47 ` [PATCH v2] IB/mad: cap RMPP reassembly window size Michael Bommarito
  1 sibling, 0 replies; 7+ messages in thread
From: Leon Romanovsky @ 2026-05-19 14:46 UTC (permalink / raw)
  To: Michael Bommarito
  Cc: Jason Gunthorpe, linux-rdma, linux-kernel, stable,
	Vlad Dumitrescu, Or Har-Toov, Bob Pearson, Sean Hefty, Kees Cook

On Mon, May 18, 2026 at 05:23:36PM -0400, Michael Bommarito wrote:
> A peer on the same InfiniBand subnet or RoCEv2 L2 (or any UDP/4791-
> reachable peer for internet-exposed RoCEv2 ports) can pin a target
> port's IB MAD kworker for milliseconds per low-bandwidth RMPP burst
> by sending an RMPP management transaction with descending segment
> numbers. QP1 GMP traffic is unauthenticated by IBTA spec, so no
> credentials are required. The bug sits on the IB management path
> (QP1 GMP RMPP reassembly), not the RDMA data plane, so RDMA verbs
> throughput is unaffected; deployments that raise recv_queue_size to
> tune management-plane throughput are quadratically more exposed,
> because per-burst cost grows O(F^2) with the configured window.
> 
> drivers/infiniband/core/mad_rmpp.c::find_seg_location() walks
> rmpp_recv->rmpp_wc->rmpp_list in reverse on every inbound RMPP DATA
> segment to locate the insertion point keyed by segment number. The
> walk is O(N) per insert under spin_lock_irqsave(&rmpp_recv->lock) in
> kworker context, so F adversarially-reordered segments aggregate to
> O(F^2). window_size() returns max(recv_queue.max_active >> 3, 1):
> the IB MAD core default recv_queue_size of 512 yields window=64
> (per-burst cost in the microsecond range), but tuned production
> configs with recv_queue_size=8192 push window to 1024 and let a
> single low-bandwidth burst pin the per-port MAD kworker for several
> milliseconds.
> 
> Cap the effective window at IB_MAD_RMPP_MAX_WINDOW = 64 in
> window_size() so admins tuning recv_queue_size for higher RX throughput
> do not enlarge the walker attack surface. Real RMPP transactions in
> the wild (SA queries, perf-counter reads) are well served by a window
> of 64, which is also the IB MAD core default. A structural follow-up
> would convert rmpp_recv->rmpp_wc->rmpp_list to an rb_tree keyed by
> seg_num and lift the cap; that mirrors tcp_data_queue_ofo post-
> CVE-2018-5390. For now the cap suffices.
> 
> Fixes: fa619a77046b ("[PATCH] IB: Add RMPP implementation")
> Cc: stable@vger.kernel.org
> Assisted-by: Claude:claude-opus-4-7
> Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
> ---
> I reproduced this under x86_64 QEMU/KVM (4 vCPUs) on v7.1-rc2 with
> CONFIG_RDMA_RXE + CONFIG_INFINIBAND_USER_MAD, a veth pair carrying
> two rdma_rxe links, and raw RoCEv2/UDP/4791 packet injection with
> descending seg_num while holding seg #1. Without the cap, F=1024
> burst produces 1022 paired continue_rmpp invocations whose per-call
> walker duration grows from ~1 us (early, near-empty list) to ~5 us
> (late, ~1000-deep list), a 4x per-call amplification as the queue
> deepens, with aggregate walker time per burst >= 1.5 ms (lower bound,
> ftrace 1 us granularity). With the cap, the same F=1024 burst drops
> to ~0.28 ms aggregate (5.4x reduction); F=32 in-window legitimate
> RMPP still completes normally (30 walker calls, avg 1.5 us, max 3 us).
> tools/testing/selftests/drivers/net/rdma/ carries no RMPP-specific
> selftest in v7.1-rc2 (rdma_rxe self-tests do not exercise QP1 GMP
> RMPP reassembly), so no in-tree selftest delta to report.
> 
>  drivers/infiniband/core/mad_rmpp.c | 18 +++++++++++++++++-
>  1 file changed, 17 insertions(+), 1 deletion(-)

Please rewrite this patch in human language without AI slop.
While working on this, please ensure that your commit message clearly
explains why the change is needed and what issue it actually resolves.

Thanks

* [PATCH v2] IB/mad: cap RMPP reassembly window size
  2026-05-18 21:23 [PATCH] IB/mad: cap RMPP reassembly window size to bound find_seg_location walk Michael Bommarito
  2026-05-19 14:46 ` Leon Romanovsky
@ 2026-05-20 15:47 ` Michael Bommarito
  2026-06-03 17:54   ` Jason Gunthorpe
  2026-06-06 20:01   ` [PATCH v3] IB/mad: drop unmatched RMPP responses before reassembly Michael Bommarito
  1 sibling, 2 replies; 7+ messages in thread
From: Michael Bommarito @ 2026-05-20 15:47 UTC (permalink / raw)
  To: Jason Gunthorpe, Leon Romanovsky
  Cc: linux-rdma, linux-kernel, stable, Vlad Dumitrescu, Or Har-Toov,
	Bob Pearson, Sean Hefty, Kees Cook

find_seg_location() inserts reordered RMPP DATA segments into a
per-transaction list by walking that list in reverse. The walk runs
under rmpp_recv->lock in the MAD receive worker, so a large receive
window makes a reversed RMPP burst expensive.

The receive window comes from recv_queue.max_active. With the default
recv_queue_size of 512, the window is 64. Larger tuned queues can raise
the window to 1024, turning one reordered transaction into repeated
long list walks and keeping the target port's MAD worker busy for
milliseconds.

Cap the RMPP window at 64, matching the current default. This keeps
existing behavior for default configurations and prevents larger receive
queues from increasing the worst-case insertion walk.
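
The window arithmetic can be checked with a short user-space sketch. It assumes, as the text above implies, that `max_active` equals the configured `recv_queue_size`; `window_uncapped()`, `window_capped()` and `MODEL_RMPP_MAX_WINDOW` are illustrative names, not kernel symbols.

```c
#include <assert.h>

/* Mirrors the value of IB_MAD_RMPP_MAX_WINDOW proposed in the patch. */
#define MODEL_RMPP_MAX_WINDOW 64

/* Window before the patch: max(max_active >> 3, 1). */
static int window_uncapped(int max_active)
{
	int w;

	assert(max_active >= 0);
	w = max_active >> 3;
	return w < 1 ? 1 : w;
}

/* Window after the patch: clamp(max_active >> 3, 1, 64). */
static int window_capped(int max_active)
{
	int w = window_uncapped(max_active);

	return w > MODEL_RMPP_MAX_WINDOW ? MODEL_RMPP_MAX_WINDOW : w;
}
```

With these helpers, a queue of 512 gives a window of 64 either way, while a queue of 8192 gives 1024 uncapped and 64 capped.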

Fixes: fa619a77046b ("[PATCH] IB: Add RMPP implementation")
Cc: stable@vger.kernel.org
Assisted-by: Codex:gpt-5-5-xhigh
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
---
Impact: a fabric peer that can send QP1 GMP RMPP DATA segments can keep
the targeted port's MAD worker busy with reordered RMPP bursts, delaying
other MAD processing on that port.

I tested this on v7.1-rc2 under x86_64 QEMU/KVM with rxe and raw RoCEv2
packets carrying descending RMPP segment numbers. With
recv_queue_size=8192, the unpatched kernel spent at least 1.5 ms per
F=1024 burst in the insertion walk; the patched kernel dropped the same
run to about 0.28 ms because segments outside the capped window are
rejected before the list grows. A normal in-window F=32 RMPP exchange
still completed; there are no in-tree selftests for QP1 GMP RMPP
reassembly in tools/testing/selftests/drivers/net/rdma.

Changes in v2:
- Rewrite the commit message in shorter, plain language.
- Trim the code comment to the local reason for the cap.

 drivers/infiniband/core/mad_rmpp.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/mad_rmpp.c b/drivers/infiniband/core/mad_rmpp.c
index 17c4c52a19e4c..0db645eb2e29b 100644
--- a/drivers/infiniband/core/mad_rmpp.c
+++ b/drivers/infiniband/core/mad_rmpp.c
@@ -391,9 +391,18 @@ static inline struct ib_mad_recv_buf *get_next_seg(struct list_head *rmpp_list,
 	return container_of(seg->list.next, struct ib_mad_recv_buf, list);
 }

+/*
+ * find_seg_location() is linear in the number of queued segments.
+ * Keep the RMPP window at the default size so a larger receive queue
+ * does not also enlarge the reordered DATA insertion walk.
+ */
+#define IB_MAD_RMPP_MAX_WINDOW 64
+
 static inline int window_size(struct ib_mad_agent_private *agent)
 {
-	return max(agent->qp_info->recv_queue.max_active >> 3, 1);
+	int wsize = agent->qp_info->recv_queue.max_active >> 3;
+
+	return clamp(wsize, 1, IB_MAD_RMPP_MAX_WINDOW);
 }

 static struct ib_mad_recv_buf *find_seg_location(struct list_head *rmpp_list,
-- 
2.53.0

* Re: [PATCH v2] IB/mad: cap RMPP reassembly window size
  2026-05-20 15:47 ` [PATCH v2] IB/mad: cap RMPP reassembly window size Michael Bommarito
@ 2026-06-03 17:54   ` Jason Gunthorpe
  2026-06-03 18:20     ` Michael Bommarito
  2026-06-06 20:01   ` [PATCH v3] IB/mad: drop unmatched RMPP responses before reassembly Michael Bommarito
  1 sibling, 1 reply; 7+ messages in thread
From: Jason Gunthorpe @ 2026-06-03 17:54 UTC (permalink / raw)
  To: Michael Bommarito
  Cc: Leon Romanovsky, linux-rdma, linux-kernel, stable,
	Vlad Dumitrescu, Or Har-Toov, Bob Pearson, Sean Hefty, Kees Cook

On Wed, May 20, 2026 at 11:47:15AM -0400, Michael Bommarito wrote:
> find_seg_location() inserts reordered RMPP DATA segments into a
> per-transaction list by walking that list in reverse. The walk runs
> under rmpp_recv->lock in the MAD receive worker, so a large receive
> window makes a reversed RMPP burst expensive.
> 
> The receive window comes from recv_queue.max_active. With the default
> recv_queue_size of 512, the window is 64. Larger tuned queues can raise
> the window to 1024, turning one reordered transaction into repeated
> long list walks and keeping the target port's MAD worker busy for
> milliseconds.
> 
> Cap the RMPP window at 64, matching the current default. This keeps
> existing behavior for default configurations and prevents larger receive
> queues from increasing the worst-case insertion walk.
> 
> Fixes: fa619a77046b ("[PATCH] IB: Add RMPP implementation")
> Cc: stable@vger.kernel.org
> Assisted-by: Codex:gpt-5-5-xhigh
> Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
> ---
> Impact: a fabric peer that can send QP1 GMP RMPP DATA segments can keep
> the targeted port's MAD worker busy with reordered RMPP bursts, delaying
> other MAD processing on that port.
> 
> I tested this on v7.1-rc2 under x86_64 QEMU/KVM with rxe and raw RoCEv2
> packets carrying descending RMPP segment numbers. With
> recv_queue_size=8192, the unpatched kernel spent at least 1.5 ms per
> F=1024 burst in the insertion walk; the patched kernel dropped the same
> run to about 0.28 ms because segments outside the capped window are
> rejected before the list grows. A normal in-window F=32 RMPP exchange
> still completed; there are no in-tree selftests for QP1 GMP RMPP
> reassembly in tools/testing/selftests/drivers/net/rdma.

Why do you think it is OK to only search back 64? Where do these
numbers come from?

Is this a real issue?  It looks to me like all this code is gated by
IB_USER_MAD_USER_RMPP and no in-kernel user makes use of RMPP.

Use of RMPP in userspace is extremely rare and requires privilege to
activate. If userspace opts into it then and only then would there be
a performance issue.

So I don't see why we should be changing this and risking regressions
with the window reduction?

Jason

* Re: [PATCH v2] IB/mad: cap RMPP reassembly window size
  2026-06-03 17:54   ` Jason Gunthorpe
@ 2026-06-03 18:20     ` Michael Bommarito
  2026-06-03 18:41       ` Jason Gunthorpe
  0 siblings, 1 reply; 7+ messages in thread
From: Michael Bommarito @ 2026-06-03 18:20 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, linux-rdma, linux-kernel, stable,
	Vlad Dumitrescu, Or Har-Toov, Bob Pearson, Sean Hefty, Kees Cook

On Wed, Jun 3, 2026 at 1:55 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> Why do you think it is OK to only search back 64? Where do these
> numbers come from?

512 >> 3 from IB_MAD_QP_RECV_SIZE in mad_priv.h and max_active.

> Is this a real issue?  It looks to me like all this code is gated by
> IB_USER_MAD_USER_RMPP and no in-kernel user makes use of RMPP.

I originally found these issues looking for reachable quadratic
runtimes with libclang+Claude, and these are in my notes on
reachability.
<CLAUDE>
  - sa_query.c:2436: the in-kernel SA client registers its GSI agent
    with rmpp_version = IB_MGMT_RMPP_VERSION and flags = 0. So
    ib_mad_kernel_rmpp_agent() (mad.c:856) is true for it, and
    ib_process_rmpp_recv_wc() → find_seg_location runs on its receive
    path. ib_sa is always loaded. Not a umad-only path.
</CLAUDE>
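
The reachability condition in the note above can be sketched as a user-space predicate. This is a model under assumptions, not the contents of mad.c: it assumes the kernel handles RMPP for an agent when `rmpp_version` is non-zero and the agent has not set the user-RMPP flag, and `struct model_agent`, `MODEL_RMPP_VERSION` and `MODEL_FLAG_USER_RMPP` are hypothetical stand-ins with illustrative values.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_RMPP_VERSION   1		/* stand-in for IB_MGMT_RMPP_VERSION */
#define MODEL_FLAG_USER_RMPP 0x1	/* stand-in for the user-RMPP flag */

/* Hypothetical subset of the registration state of a MAD agent. */
struct model_agent {
	unsigned char rmpp_version;
	unsigned int flags;
};

/*
 * True when the kernel, rather than the agent's owner, performs RMPP
 * reassembly for this agent (assumed semantics, see the lead-in).
 */
static bool model_kernel_rmpp_agent(const struct model_agent *a)
{
	assert(a);
	return a->rmpp_version && !(a->flags & MODEL_FLAG_USER_RMPP);
}
```

Under that assumption, an agent registered like the SA client (`rmpp_version` set, `flags = 0`) takes the kernel reassembly path, while an agent that sets the user-RMPP flag does not.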

So I think the reachability is wider than you expect.  Perhaps that's
the real fix you'd prefer.

> So I don't see why we should be changing this and risking regressions
> with the window reduction?

It's obviously your choice as maintainers, but I'd encourage you to
test the pathological worst case from an unprivileged peer to see the
impact before totally writing it off.

Thanks,
Mike

* Re: [PATCH v2] IB/mad: cap RMPP reassembly window size
  2026-06-03 18:20     ` Michael Bommarito
@ 2026-06-03 18:41       ` Jason Gunthorpe
  0 siblings, 0 replies; 7+ messages in thread
From: Jason Gunthorpe @ 2026-06-03 18:41 UTC (permalink / raw)
  To: Michael Bommarito
  Cc: Leon Romanovsky, linux-rdma, linux-kernel, stable,
	Vlad Dumitrescu, Or Har-Toov, Bob Pearson, Sean Hefty, Kees Cook

On Wed, Jun 03, 2026 at 02:20:03PM -0400, Michael Bommarito wrote:
> On Wed, Jun 3, 2026 at 1:55 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> > Why do you think it is OK to only search back 64? Where do these
> > numbers come from?
> 
> 512 >> 3 from IB_MAD_QP_RECV_SIZE in mad_priv.h and max_active.

I mean from the real world - the purpose of this window is to deal
with network re-ordering, by changing it like this we are reducing the
kinds of re-ordering the network can perform.

I think reordering is basically something that should never happen on
IB, yet 20 years ago someone decided to have huge reorder windows..

> > Is this a real issue?  It looks to me like all this code is gated by
> > IB_USER_MAD_USER_RMPP and no in-kernel user makes use of RMPP.
> 
> I originally found these issues looking for reachable quadratic
> runtimes with libclang+Claude, and these are in my notes on
> reachability.
> <CLAUDE>
>   - sa_query.c:2436: the in-kernel SA client registers its GSI agent
>     with rmpp_version = IB_MGMT_RMPP_VERSION and flags = 0. So
>     ib_mad_kernel_rmpp_agent() (mad.c:856) is true for it, and
>     ib_process_rmpp_recv_wc() → find_seg_location runs on its receive
>     path. ib_sa is always loaded. Not a umad-only path.
> </CLAUDE>
> 
> So I think the reachability is wider than you expect.  Perhaps that's
> the real fix you'd prefer.

Hmmm, I didn't remember SA left it turned on. AI says it is only used
by SA IB CM service resolution which is so obscure and rarely used in
modern systems. Yet it opens this whole scary bit of code.

> > So I don't see why we should be changing this and risking regressions
> > with the window reduction?
> 
> It's obviously your choice as maintainers, but I'd encourage you to
> test the pathological worst case from an unprivileged peer to see the
> impact before totally writing it off.

I'm sure the pathological case is bad, but I don't know if lowering
the window size will somehow break something someone is using.

If it could be fixed without changing the behavior that would be more
interesting..

Also, the way this works, the peer sending into this isn't
unprivileged.  On IB it is using a restricted qkey, so it is supposed
to be trusted software under the 1990s security model IB uses.

Jason

* [PATCH v3] IB/mad: drop unmatched RMPP responses before reassembly
  2026-05-20 15:47 ` [PATCH v2] IB/mad: cap RMPP reassembly window size Michael Bommarito
  2026-06-03 17:54   ` Jason Gunthorpe
@ 2026-06-06 20:01   ` Michael Bommarito
  1 sibling, 0 replies; 7+ messages in thread
From: Michael Bommarito @ 2026-06-06 20:01 UTC (permalink / raw)
  To: Jason Gunthorpe, Leon Romanovsky
  Cc: linux-rdma, linux-kernel, stable, Vlad Dumitrescu, Or Har-Toov,
	Bob Pearson, Sean Hefty, Kees Cook

Kernel-handled RMPP receive processing starts reassembly for active
DATA responses before the response is matched to an outstanding send.
The normal match happens later, after ib_process_rmpp_recv_wc() has
either assembled a complete message or consumed the segment.

That ordering lets an unsolicited response that routes to a kernel
RMPP agent by the high TID bits allocate or extend RMPP receive state
before the full TID and source address are checked against a real
request. A reordered burst can therefore reach the receive-side
insertion path even though the response would not match any send.

For kernel-handled RMPP DATA responses, require the existing
ib_find_send_mad() match before entering RMPP reassembly. The matcher
already checks the full TID, management class and source address/GID
against the agent wait, backlog and in-flight send lists. If there is
no match, drop the response without creating RMPP state.

This leaves the RMPP window behavior unchanged and only rejects
responses that have no corresponding request.
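
The match-before-reassembly idea can be sketched in user space as follows. This is a simplified model, not `ib_find_send_mad()`: the `model_*` types and functions are hypothetical, a single array stands in for the wait, backlog and in-flight send lists, and a bare LID comparison stands in for the real source address/GID check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one outstanding request. */
struct model_send {
	uint64_t tid;		/* full transaction ID */
	uint8_t mgmt_class;
	uint16_t dest_lid;	/* peer the request was sent to */
};

/* Hypothetical view of one inbound RMPP DATA response. */
struct model_resp {
	uint64_t tid;
	uint8_t mgmt_class;
	uint16_t src_lid;
};

/* Find the request this response answers, or NULL if there is none. */
static const struct model_send *
model_find_send(const struct model_send *sends, size_t n,
		const struct model_resp *r)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (sends[i].tid == r->tid &&
		    sends[i].mgmt_class == r->mgmt_class &&
		    sends[i].dest_lid == r->src_lid)
			return &sends[i];
	}
	return NULL;
}

/* True if the response may enter reassembly; false means drop it. */
static bool model_admit_response(const struct model_send *sends, size_t n,
				 const struct model_resp *r)
{
	assert(r);
	return model_find_send(sends, n, r) != NULL;
}
```

In this model a response that shares only the high TID bits with a real request, or that arrives from a different source, is dropped before any reassembly state would be created.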

Fixes: fa619a77046b ("[PATCH] IB: Add RMPP implementation")
Cc: stable@vger.kernel.org
Assisted-by: Codex:gpt-5-5-xhigh
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
---
Impact: a fabric peer that can send QP1 GMP RMPP DATA responses to a
kernel RMPP agent can create receive-side RMPP reassembly work before
the response is matched to an outstanding send, delaying other MAD
processing on that port.

I tested this on v7.1-rc6 under x86_64 QEMU/KVM with rxe plus
debug-only patches that host the in-kernel SA agent on soft-RoCE. A
descending F=1024 burst to the SA agent hi_tid reached RX/MAD completion
(1023 packets) but, with this patch, did not enter RMPP receive
processing or the insertion walker: walks=0 and no continue_rmpp samples.
There are no in-tree selftests for QP1 GMP RMPP reassembly in
tools/testing/selftests/rdma.

Changes in v3:
- Replace the RMPP window cap with a pre-reassembly response match.
- Leave the accepted RMPP reordering window unchanged.

 drivers/infiniband/core/mad.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 8d19613179e3e..e0b3b36b8b149 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -2031,6 +2031,24 @@ void ib_mark_mad_done(struct ib_mad_send_wr_private *mad_send_wr)
 		change_mad_state(mad_send_wr, IB_MAD_STATE_EARLY_RESP);
 }

+static bool is_kernel_rmpp_data_response(struct ib_mad_agent_private *agent,
+					 struct ib_mad_recv_wc *mad_recv_wc)
+{
+	const struct ib_mad_hdr *mad_hdr = &mad_recv_wc->recv_buf.mad->mad_hdr;
+	struct ib_rmpp_mad *rmpp_mad;
+
+	if (!ib_mad_kernel_rmpp_agent(&agent->agent) ||
+	    !ib_response_mad(mad_hdr) ||
+	    !ib_is_mad_class_rmpp(mad_hdr->mgmt_class))
+		return false;
+
+	rmpp_mad = (struct ib_rmpp_mad *)mad_recv_wc->recv_buf.mad;
+
+	return (ib_get_rmpp_flags(&rmpp_mad->rmpp_hdr) &
+		IB_MGMT_RMPP_FLAG_ACTIVE) &&
+	       rmpp_mad->rmpp_hdr.rmpp_type == IB_MGMT_RMPP_TYPE_DATA;
+}
+
 static void ib_mad_complete_recv(struct ib_mad_agent_private *mad_agent_priv,
 				 struct ib_mad_recv_wc *mad_recv_wc)
 {
@@ -2050,6 +2068,18 @@ static void ib_mad_complete_recv(struct ib_mad_agent_private *mad_agent_priv,
 	}

 	list_add(&mad_recv_wc->recv_buf.list, &mad_recv_wc->rmpp_list);
+	if (is_kernel_rmpp_data_response(mad_agent_priv, mad_recv_wc)) {
+		spin_lock_irqsave(&mad_agent_priv->lock, flags);
+		mad_send_wr = ib_find_send_mad(mad_agent_priv, mad_recv_wc);
+		spin_unlock_irqrestore(&mad_agent_priv->lock, flags);
+
+		if (!mad_send_wr) {
+			ib_free_recv_mad(mad_recv_wc);
+			deref_mad_agent(mad_agent_priv);
+			return;
+		}
+	}
+
 	if (ib_mad_kernel_rmpp_agent(&mad_agent_priv->agent)) {
 		mad_recv_wc = ib_process_rmpp_recv_wc(mad_agent_priv,
 						      mad_recv_wc);
-- 
2.53.0
