* [PATCH 1/2] RDMA/rxe: fix TOCTOU heap overflow in get_srq_wqe
2026-05-18 21:50 [PATCH 0/2] RDMA/rxe: fix shared memory TOCTOU in receive path Tristan Madani
@ 2026-05-18 21:50 ` Tristan Madani
2026-05-18 21:50 ` [PATCH 2/2] RDMA/rxe: copy WQE to local buffer in non-SRQ receive path Tristan Madani
2026-05-19 2:03 ` [PATCH 0/2] RDMA/rxe: fix shared memory TOCTOU in " Zhu Yanjun
2 siblings, 0 replies; 8+ messages in thread
From: Tristan Madani @ 2026-05-18 21:50 UTC (permalink / raw)
To: Zhu Yanjun; +Cc: Jason Gunthorpe, Leon Romanovsky, linux-rdma, Tristan Madani
get_srq_wqe() reads wqe->dma.num_sge from the shared receive queue
buffer, which is mapped into userspace. It validates num_sge against
max_sge, but then re-reads the same field to calculate the memcpy
size. A concurrent userspace thread can modify num_sge between
validation and use, causing a heap buffer overflow when copying the
WQE into qp->resp.srq_wqe.

Read num_sge into a local variable and use it for both the bounds
check and the size calculation.

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Signed-off-by: Tristan Madani <tristan@talencesecurity.com>
---
drivers/infiniband/sw/rxe/rxe_resp.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
index 9cb2f6f..8a0a973 100644
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -264,6 +264,7 @@ static enum resp_states get_srq_wqe(struct rxe_qp *qp)
 	struct rxe_recv_wqe *wqe;
 	struct ib_event ev;
 	unsigned int count;
+	unsigned int num_sge;
 	size_t size;
 	unsigned long flags;

@@ -279,12 +280,13 @@ static enum resp_states get_srq_wqe(struct rxe_qp *qp)
 	}

 	/* don't trust user space data */
-	if (unlikely(wqe->dma.num_sge > srq->rq.max_sge)) {
+	num_sge = wqe->dma.num_sge;
+	if (unlikely(num_sge > srq->rq.max_sge)) {
 		spin_unlock_irqrestore(&srq->rq.consumer_lock, flags);
 		rxe_dbg_qp(qp, "invalid num_sge in SRQ entry\n");
 		return RESPST_ERR_MALFORMED_WQE;
 	}
-	size = sizeof(*wqe) + wqe->dma.num_sge*sizeof(struct rxe_sge);
+	size = sizeof(*wqe) + num_sge * sizeof(struct rxe_sge);
 	memcpy(&qp->resp.srq_wqe, wqe, size);

 	qp->resp.wqe = &qp->resp.srq_wqe.wqe;
--
2.47.3
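
Editor's sketch, not part of the posted patch: the fetch-once pattern
described in the commit message can be modeled in plain userspace C.
The names load_once() and checked_copy_size() and the MODEL_* sizes
are hypothetical stand-ins, not rxe identifiers. load_once() imitates
what the kernel's READ_ONCE() does for a scalar; the posted patch
itself uses a plain assignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_SGE  4u   /* hypothetical stand-in for srq->rq.max_sge */
#define MODEL_HDR_SIZE 32u  /* hypothetical size of the fixed WQE header */
#define MODEL_SGE_SIZE 16u  /* hypothetical size of one SGE entry */

/* Force exactly one load from a location another thread may write. */
static uint32_t load_once(const uint32_t *p)
{
	return *(const volatile uint32_t *)p;
}

/*
 * Return the number of bytes that may be copied for this WQE, or 0 if
 * num_sge is out of range. The bounds check and the size calculation
 * both use the same local value, so a concurrent writer cannot make
 * them disagree.
 */
static size_t checked_copy_size(const uint32_t *shared_num_sge)
{
	uint32_t num_sge;

	assert(shared_num_sge != NULL);

	num_sge = load_once(shared_num_sge);
	if (num_sge > MODEL_MAX_SGE)
		return 0;

	return MODEL_HDR_SIZE + (size_t)num_sge * MODEL_SGE_SIZE;
}
```

With these assumed sizes, the largest size the function can return is
MODEL_HDR_SIZE + MODEL_MAX_SGE * MODEL_SGE_SIZE, no matter what value
is written to the shared field after the load.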
* [PATCH 2/2] RDMA/rxe: copy WQE to local buffer in non-SRQ receive path
2026-05-18 21:50 [PATCH 0/2] RDMA/rxe: fix shared memory TOCTOU in receive path Tristan Madani
2026-05-18 21:50 ` [PATCH 1/2] RDMA/rxe: fix TOCTOU heap overflow in get_srq_wqe Tristan Madani
@ 2026-05-18 21:50 ` Tristan Madani
2026-05-19 2:03 ` [PATCH 0/2] RDMA/rxe: fix shared memory TOCTOU in " Zhu Yanjun
2 siblings, 0 replies; 8+ messages in thread
From: Tristan Madani @ 2026-05-18 21:50 UTC (permalink / raw)
To: Zhu Yanjun; +Cc: Jason Gunthorpe, Leon Romanovsky, linux-rdma, Tristan Madani
For non-SRQ QPs, the responder reads WQE fields directly from the
shared queue buffer mapped into userspace. This allows a malicious
user to modify fields like num_sge or sge entries while the kernel
is processing the WQE, leading to out-of-bounds reads in
rxe_resp_check_length() and copy_data().

Introduce get_recv_wqe() that validates num_sge and copies the WQE
to a kernel-local buffer before processing, matching the approach
already used for SRQ WQEs in get_srq_wqe(). The srq_wqe buffer is
reused since SRQ and non-SRQ paths are mutually exclusive per QP.

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Signed-off-by: Tristan Madani <tristan@talencesecurity.com>
---
drivers/infiniband/sw/rxe/rxe_resp.c | 27 ++++++++++++++++++++++++---
1 file changed, 24 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
index 8a0a973..43e8d86 100644
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -310,6 +310,29 @@ static enum resp_states get_srq_wqe(struct rxe_qp *qp)
 	return RESPST_CHK_LENGTH;
 }

+static enum resp_states get_recv_wqe(struct rxe_qp *qp)
+{
+	struct rxe_queue *q = qp->rq.queue;
+	struct rxe_recv_wqe *wqe;
+	unsigned int num_sge;
+	size_t size;
+
+	wqe = queue_head(q, QUEUE_TYPE_FROM_CLIENT);
+	if (!wqe)
+		return RESPST_ERR_RNR;
+
+	num_sge = wqe->dma.num_sge;
+	if (unlikely(num_sge > qp->rq.max_sge)) {
+		rxe_dbg_qp(qp, "invalid num_sge in recv WQE\n");
+		return RESPST_ERR_MALFORMED_WQE;
+	}
+	size = sizeof(*wqe) + num_sge * sizeof(struct rxe_sge);
+	memcpy(&qp->resp.srq_wqe, wqe, size);
+
+	qp->resp.wqe = &qp->resp.srq_wqe.wqe;
+	return RESPST_CHK_LENGTH;
+}
+
 static enum resp_states check_resource(struct rxe_qp *qp,
 				       struct rxe_pkt_info *pkt)
 {
@@ -330,9 +353,7 @@ static enum resp_states check_resource(struct rxe_qp *qp,
 		if (srq)
 			return get_srq_wqe(qp);

-		qp->resp.wqe = queue_head(qp->rq.queue,
-				QUEUE_TYPE_FROM_CLIENT);
-		return (qp->resp.wqe) ? RESPST_CHK_LENGTH : RESPST_ERR_RNR;
+		return get_recv_wqe(qp);
 	}

 	return RESPST_CHK_LENGTH;
--
2.47.3
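
Editor's sketch, not part of the posted patch: a userspace model of
"validate, then copy to a private buffer, then only use the copy".
struct model_sge, struct model_wqe and snapshot_wqe() are hypothetical
names with an assumed layout, not the rxe definitions. In this model
the bulk copy reads the count field a second time, so the private
copy's num_sge is overwritten with the validated value afterwards.
Whether the real get_recv_wqe() needs an equivalent step is a question
for review and is not settled by this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_MAX_SGE 4u  /* hypothetical stand-in for qp->rq.max_sge */

struct model_sge {
	uint64_t addr;
	uint32_t length;
	uint32_t lkey;
};

/* Assumed layout: a small header followed by the SGE array. */
struct model_wqe {
	uint32_t num_sge;
	uint32_t reserved;
	struct model_sge sge[MODEL_MAX_SGE];
};

/*
 * Copy a WQE from a slot that another thread may be writing into a
 * private buffer. Returns 0 on success, -1 if num_sge is out of range.
 */
static int snapshot_wqe(struct model_wqe *dst, const struct model_wqe *shared)
{
	uint32_t num_sge;
	size_t size;

	assert(dst != NULL && shared != NULL);

	/* One load of the count; all later decisions use this value. */
	num_sge = *(const volatile uint32_t *)&shared->num_sge;
	if (num_sge > MODEL_MAX_SGE)
		return -1;

	size = offsetof(struct model_wqe, sge) +
	       (size_t)num_sge * sizeof(struct model_sge);
	memcpy(dst, shared, size);

	/* memcpy() read the header again; pin the validated count. */
	dst->num_sge = num_sge;
	return 0;
}
```

After a successful snapshot_wqe(), later writes to the shared slot do
not change the private copy, so every subsequent check and the final
data copy see the same num_sge, lkey and addr values.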
* Re: [PATCH 0/2] RDMA/rxe: fix shared memory TOCTOU in receive path
2026-05-18 21:50 [PATCH 0/2] RDMA/rxe: fix shared memory TOCTOU in receive path Tristan Madani
2026-05-18 21:50 ` [PATCH 1/2] RDMA/rxe: fix TOCTOU heap overflow in get_srq_wqe Tristan Madani
2026-05-18 21:50 ` [PATCH 2/2] RDMA/rxe: copy WQE to local buffer in non-SRQ receive path Tristan Madani
@ 2026-05-19 2:03 ` Zhu Yanjun
2026-05-19 14:56 ` Leon Romanovsky
2 siblings, 1 reply; 8+ messages in thread
From: Zhu Yanjun @ 2026-05-19 2:03 UTC (permalink / raw)
To: Tristan Madani, Zhu Yanjun, yanjun.zhu@linux.dev
Cc: Jason Gunthorpe, Leon Romanovsky, linux-rdma, Tristan Madani
On 2026/5/18 14:50, Tristan Madani wrote:
> RXE queue buffers are mapped read-write into userspace. The receive
> path reads WQE fields from these shared buffers, which lets a
> concurrent userspace thread modify them between validation and use.
To be honest, can you implement the above? If yes, please show us the
steps to reproduce this problem.
Thanks a lot.
Zhu Yanjun
>
> Patch 1 fixes a heap overflow in the SRQ path where num_sge is
> validated but then re-read for the memcpy size calculation.
>
> Patch 2 addresses the non-SRQ path by copying the WQE to a
> kernel-local buffer before processing, preventing TOCTOU on
> fields used in check_length and copy_data.
>
> Tristan Madani (2):
> RDMA/rxe: fix TOCTOU heap overflow in get_srq_wqe
> RDMA/rxe: copy WQE to local buffer in non-SRQ receive path
>
> drivers/infiniband/sw/rxe/rxe_resp.c | 33 ++++++++++++++++++++++++---
> 1 file changed, 28 insertions(+), 5 deletions(-)
>
* Re: [PATCH 0/2] RDMA/rxe: fix shared memory TOCTOU in receive path
2026-05-19 2:03 ` [PATCH 0/2] RDMA/rxe: fix shared memory TOCTOU in " Zhu Yanjun
@ 2026-05-19 14:56 ` Leon Romanovsky
2026-05-19 15:00 ` Jason Gunthorpe
0 siblings, 1 reply; 8+ messages in thread
From: Leon Romanovsky @ 2026-05-19 14:56 UTC (permalink / raw)
To: Zhu Yanjun
Cc: Tristan Madani, Zhu Yanjun, Jason Gunthorpe, linux-rdma,
Tristan Madani
On Mon, May 18, 2026 at 07:03:18PM -0700, Zhu Yanjun wrote:
> On 2026/5/18 14:50, Tristan Madani wrote:
> > RXE queue buffers are mapped read-write into userspace. The receive
> > path reads WQE fields from these shared buffers, which lets a
> > concurrent userspace thread modify them between validation and use.
>
> To be honest, can you implement the above? If yes, please show us the steps
> to reproduce this problem.
It is an imaginary problem. One would need to run RXE (development,
virtual RNIC), write a buggy userspace application, and then be
surprised when RXE misbehaves after running it.
Thanks
>
> Thanks a lot.
> Zhu Yanjun
>
> >
> > Patch 1 fixes a heap overflow in the SRQ path where num_sge is
> > validated but then re-read for the memcpy size calculation.
> >
> > Patch 2 addresses the non-SRQ path by copying the WQE to a
> > kernel-local buffer before processing, preventing TOCTOU on
> > fields used in check_length and copy_data.
> >
> > Tristan Madani (2):
> > RDMA/rxe: fix TOCTOU heap overflow in get_srq_wqe
> > RDMA/rxe: copy WQE to local buffer in non-SRQ receive path
> >
> > drivers/infiniband/sw/rxe/rxe_resp.c | 33 ++++++++++++++++++++++++---
> > 1 file changed, 28 insertions(+), 5 deletions(-)
> >
>
* Re: [PATCH 0/2] RDMA/rxe: fix shared memory TOCTOU in receive path
2026-05-19 14:56 ` Leon Romanovsky
@ 2026-05-19 15:00 ` Jason Gunthorpe
2026-05-19 22:30 ` Tristan Madani
0 siblings, 1 reply; 8+ messages in thread
From: Jason Gunthorpe @ 2026-05-19 15:00 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Zhu Yanjun, Tristan Madani, Zhu Yanjun, linux-rdma,
Tristan Madani
On Tue, May 19, 2026 at 05:56:10PM +0300, Leon Romanovsky wrote:
> On Mon, May 18, 2026 at 07:03:18PM -0700, Zhu Yanjun wrote:
> > On 2026/5/18 14:50, Tristan Madani wrote:
> > > RXE queue buffers are mapped read-write into userspace. The receive
> > > path reads WQE fields from these shared buffers, which lets a
> > > concurrent userspace thread modify them between validation and use.
> >
> > To be honest, can you implement the above? If yes, please show us the steps
> > to reproduce this problem.
>
> It is an imaginary problem. One would need to run RXE (development,
> virtual RNIC), write a buggy userspace application, and then be
> surprised when RXE misbehaves after running it.
Simple misbehavior is one thing, but if userspace can hack the kernel
and gain control of it through this shared memory, then we have to fix
it.
Jason
* Re: [PATCH 0/2] RDMA/rxe: fix shared memory TOCTOU in receive path
2026-05-19 15:00 ` Jason Gunthorpe
@ 2026-05-19 22:30 ` Tristan Madani
2026-05-20 0:07 ` Yanjun.Zhu
0 siblings, 1 reply; 8+ messages in thread
From: Tristan Madani @ 2026-05-19 22:30 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Leon Romanovsky, Zhu Yanjun, Zhu Yanjun, linux-rdma
On Tue, 19 May 2026, Jason Gunthorpe wrote:
> Simple misbehavior is one thing, but if userspace can hack the kernel
> and gain control of it through this shared memory, then we have to fix
> it.

The non-SRQ receive path in check_resource() sets qp->resp.wqe
directly into the shared mmap buffer:

    qp->resp.wqe = queue_head(qp->rq.queue, QUEUE_TYPE_FROM_CLIENT);

No copy, no validation of the WQE fields. Every subsequent access
to wqe->dma.num_sge, wqe->dma.sge[].lkey, and wqe->dma.sge[].addr
reads from memory that userspace can modify concurrently.

The concrete problem is in copy_data(), called via send_data_in().
It re-reads dma->num_sge from the shared buffer on every loop
iteration (the dma->cur_sge >= dma->num_sge bound check), and uses
sge->lkey for lookup_mr() and sge->addr to compute the iova for
rxe_mr_copy(). A concurrent thread can:

  1. Increase num_sge: the sge pointer walks past the WQE's
     allocated SGE slots into adjacent queue entries, and the
     kernel acts on whatever lkey/addr/length values it finds
     there -- all attacker-controlled through the same mmap.

  2. Swap sge[].lkey between iterations: redirect the MR lookup
     to a different memory region.

  3. Modify sge[].addr: shift the write target within the
     resolved MR.

The data being written is incoming packet payload
(attacker-controlled in loopback), direction is RXE_TO_MR_OBJ.

The SRQ path already handles this correctly: get_srq_wqe() copies
the WQE to kernel memory with memcpy() and validates num_sge
against max_sge before use. The comment there says "don't trust
user space data". The non-SRQ path has neither the copy nor the
validation.

The race window is not tight -- the shared pointer is set during
RESPST_CHK_RESOURCE and the fields are consumed across CHK_LENGTH
and EXECUTE before copy_data() runs.

I can provide a reproducer if that helps move the patches forward.

Tristan
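
Editor's sketch, not part of the message above and not the promised
reproducer: a deterministic userspace model of a loop whose bound is
re-read from shared memory on every iteration. struct model_dma,
walk_sges() and grow_num_sge() are hypothetical names. The concurrent
writer is simulated by a callback that runs between iterations, and
MODEL_SLOT_SGE is an assumed cap that keeps the model itself in bounds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_SLOT_SGE 8u  /* hypothetical hard cap so the model stays in bounds */

struct model_dma {
	uint32_t num_sge;
	uint32_t cur_sge;
};

/* Simulated concurrent writer: raises num_sge in the shared copy. */
static void grow_num_sge(struct model_dma *shared)
{
	shared->num_sge = MODEL_SLOT_SGE;
}

/*
 * Walk SGEs of 'dma', re-reading dma->num_sge before every iteration.
 * 'writer', if non-NULL, is called on 'shared' after each iteration to
 * stand in for another thread. Returns the number of SGEs visited.
 */
static uint32_t walk_sges(struct model_dma *dma, struct model_dma *shared,
			  void (*writer)(struct model_dma *))
{
	uint32_t visited = 0;

	while (dma->cur_sge < *(volatile uint32_t *)&dma->num_sge &&
	       dma->cur_sge < MODEL_SLOT_SGE) {
		visited++;
		dma->cur_sge++;
		if (writer != NULL)
			writer(shared);
	}

	assert(visited <= MODEL_SLOT_SGE);
	return visited;
}
```

When dma and shared are the same object, a WQE posted with two SGEs is
walked for MODEL_SLOT_SGE entries in this model. When dma is a private
copy taken before the walk, the writer only changes the shared object
and the walk stops after two entries.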
* Re: [PATCH 0/2] RDMA/rxe: fix shared memory TOCTOU in receive path
2026-05-19 22:30 ` Tristan Madani
@ 2026-05-20 0:07 ` Yanjun.Zhu
0 siblings, 0 replies; 8+ messages in thread
From: Yanjun.Zhu @ 2026-05-20 0:07 UTC (permalink / raw)
To: Tristan Madani, Jason Gunthorpe, Zhu Yanjun
Cc: Leon Romanovsky, Zhu Yanjun, linux-rdma
On 5/19/26 3:30 PM, Tristan Madani wrote:
> On Tue, 19 May 2026, Jason Gunthorpe wrote:
>> Simple misbehavior is one thing, but if userspace can hack the kernel
>> and gain control of it through this shared memory, then we have to fix
>> it.
> The non-SRQ receive path in check_resource() sets qp->resp.wqe
> directly into the shared mmap buffer:
>
> qp->resp.wqe = queue_head(qp->rq.queue, QUEUE_TYPE_FROM_CLIENT);
>
> No copy, no validation of the WQE fields. Every subsequent access
> to wqe->dma.num_sge, wqe->dma.sge[].lkey, and wqe->dma.sge[].addr
> reads from memory that userspace can modify concurrently.
>
> The concrete problem is in copy_data(), called via send_data_in().
> It re-reads dma->num_sge from the shared buffer on every loop
> iteration (the dma->cur_sge >= dma->num_sge bound check), and uses
> sge->lkey for lookup_mr() and sge->addr to compute the iova for
> rxe_mr_copy(). A concurrent thread can:
>
> 1. Increase num_sge: the sge pointer walks past the WQE's
> allocated SGE slots into adjacent queue entries, and the
> kernel acts on whatever lkey/addr/length values it finds
> there -- all attacker-controlled through the same mmap.
>
> 2. Swap sge[].lkey between iterations: redirect the MR lookup
> to a different memory region.
>
> 3. Modify sge[].addr: shift the write target within the
> resolved MR.
>
> The data being written is incoming packet payload (attacker-
> controlled in loopback), direction is RXE_TO_MR_OBJ.
>
> The SRQ path already handles this correctly: get_srq_wqe() copies
> the WQE to kernel memory with memcpy() and validates num_sge
> against max_sge before use. The comment there says "don't trust
> user space data". The non-SRQ path has neither the copy nor the
> validation.
>
> The race window is not tight -- the shared pointer is set during
> RESPST_CHK_RESOURCE and the fields are consumed across CHK_LENGTH
> and EXECUTE before copy_data() runs.
>
> I can provide a reproducer if that helps move the patches forward.
Yes. Please provide a reproducer.
Thanks,
Zhu Yanjun
>
> Tristan