Linux RDMA and InfiniBand development
* [PATCH v2 5.10/5.15] RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task"
@ 2026-06-03 12:18 Vladislav Nikolaev
  2026-06-03 15:03 ` [lvc-project] " Fedor Pchelkin
  2026-06-05 19:37 ` Sasha Levin
  0 siblings, 2 replies; 4+ messages in thread
From: Vladislav Nikolaev @ 2026-06-03 12:18 UTC
  To: stable, Greg Kroah-Hartman
  Cc: Vladislav Nikolaev, Zhu Yanjun, Doug Ledford, Jason Gunthorpe,
	Haggai Eran, Kamal Heib, Amir Vadai, Moni Shoua, Yonatan Cohen,
	Leon Romanovsky, linux-rdma, linux-kernel, Zhu Yanjun,
	lvc-project, syzbot+cfcc1a3c85be15a40cba, Zhu Yanjun

From: Zhu Yanjun <yanjun.zhu@linux.dev>

commit b2b1ddc457458fecd1c6f385baa9fbda5f0c63ad upstream.

In rxe_create_qp(), rxe_qp_from_init() is called to initialize the qp,
but internal state such as the tasks set up by rxe_init_task() is not
initialized until rxe_qp_init_req().

If an error occurs before this point, the unwind calls rxe_cleanup(),
which eventually reaches rxe_qp_do_cleanup()/rxe_cleanup_task() and
oopses when trying to access the uninitialized spinlock.

Fix this by not calling rxe_cleanup_task() for a task on which
rxe_init_task() has not been executed.

Reported-by: syzbot+cfcc1a3c85be15a40cba@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?id=fd85757b74b3eb59f904138486f755f71e090df8
Fixes: 8700e3e7c485 ("Soft RoCE driver")
Fixes: 2d4b21e0a291 ("IB/rxe: Prevent from completer to operate on non valid QP")
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Link: https://lore.kernel.org/r/20230413101115.1366068-1-yanjun.zhu@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
[ Vladislav: match upstream cleanup order and add the missing
resp.task.func check. ]
Signed-off-by: Vladislav Nikolaev <vlad102nikolaev@gmail.com>
---
v2: Move rxe_cleanup_task(&qp->resp.task) after RC timer cleanup.
Add missing qp->resp.task.func check before cleaning up the responder task.

Backport fix for CVE-2023-54028.
 drivers/infiniband/sw/rxe/rxe_qp.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
index 4c938d841f76..616efae0c09a 100644
--- a/drivers/infiniband/sw/rxe/rxe_qp.c
+++ b/drivers/infiniband/sw/rxe/rxe_qp.c
@@ -760,15 +760,20 @@ void rxe_qp_destroy(struct rxe_qp *qp)
 {
 	qp->valid = 0;
 	qp->qp_timeout_jiffies = 0;
-	rxe_cleanup_task(&qp->resp.task);
 
 	if (qp_type(qp) == IB_QPT_RC) {
 		del_timer_sync(&qp->retrans_timer);
 		del_timer_sync(&qp->rnr_nak_timer);
 	}
 
-	rxe_cleanup_task(&qp->req.task);
-	rxe_cleanup_task(&qp->comp.task);
+	if (qp->resp.task.func)
+		rxe_cleanup_task(&qp->resp.task);
+	
+	if (qp->req.task.func)
+		rxe_cleanup_task(&qp->req.task);
+
+	if (qp->comp.task.func)
+		rxe_cleanup_task(&qp->comp.task);
 
 	/* flush out any receive wr's or pending requests */
 	if (qp->req.task.func)
-- 
2.39.5
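
For readers following along, the guard added above can be sketched in
plain userspace C.  This is only an illustration under stated
assumptions: struct demo_task, struct demo_qp and the demo_* helpers are
hypothetical stand-ins, not the rxe driver's types or API, and a bool
models the "lock has been initialized" state for which the real code has
no flag.  As in the patch, a NULL func pointer is taken to mean that the
init helper never ran.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the driver's task object. */
struct demo_task {
	void (*func)(void *arg);	/* NULL until demo_init_task() has run */
	bool lock_ready;		/* models "spinlock was initialized" */
	bool destroyed;			/* set once cleanup has run */
};

/* Hypothetical stand-in for a QP that owns three tasks. */
struct demo_qp {
	struct demo_task req;
	struct demo_task comp;
	struct demo_task resp;
};

static void demo_noop(void *arg)
{
	(void)arg;
}

/* Initializes the modeled lock and records the work function. */
static void demo_init_task(struct demo_task *t, void (*func)(void *arg))
{
	t->lock_ready = true;
	t->func = func;
}

/* Must only be called on an initialized task. */
static void demo_cleanup_task(struct demo_task *t)
{
	/* Stands in for the failure seen on an uninitialized lock. */
	assert(t->lock_ready);
	t->destroyed = true;
}

/*
 * Guarded teardown: tasks that were never initialized are skipped.
 * Returns the number of tasks that were actually cleaned up.
 */
static int demo_qp_cleanup(struct demo_qp *qp)
{
	int cleaned = 0;

	if (qp->resp.func) {
		demo_cleanup_task(&qp->resp);
		cleaned++;
	}
	if (qp->req.func) {
		demo_cleanup_task(&qp->req);
		cleaned++;
	}
	if (qp->comp.func) {
		demo_cleanup_task(&qp->comp);
		cleaned++;
	}
	return cleaned;
}
```

With a zero-initialized demo_qp, demo_qp_cleanup() touches nothing; after
demo_init_task(&qp.req, demo_noop), only the req task is cleaned up.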



* Re: [lvc-project] [PATCH v2 5.10/5.15] RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task"
From: Fedor Pchelkin @ 2026-06-03 15:03 UTC
  To: Vladislav Nikolaev
  Cc: stable, Greg Kroah-Hartman, Haggai Eran, lvc-project,
	Leon Romanovsky, linux-rdma, Zhu Yanjun, linux-kernel,
	Jason Gunthorpe, Doug Ledford, Zhu Yanjun,
	syzbot+cfcc1a3c85be15a40cba

On Wed, 03. Jun 15:18, Vladislav Nikolaev wrote:
> From: Zhu Yanjun <yanjun.zhu@linux.dev>
> 
> commit b2b1ddc457458fecd1c6f385baa9fbda5f0c63ad upstream.
> 
> In rxe_create_qp(), rxe_qp_from_init() is called to initialize the qp,
> but internal state such as the tasks set up by rxe_init_task() is not
> initialized until rxe_qp_init_req().
> 
> If an error occurs before this point, the unwind calls rxe_cleanup(),
> which eventually reaches rxe_qp_do_cleanup()/rxe_cleanup_task() and
> oopses when trying to access the uninitialized spinlock.
> 
> Fix this by not calling rxe_cleanup_task() for a task on which
> rxe_init_task() has not been executed.
> 
> Reported-by: syzbot+cfcc1a3c85be15a40cba@syzkaller.appspotmail.com
> Link: https://syzkaller.appspot.com/bug?id=fd85757b74b3eb59f904138486f755f71e090df8
> Fixes: 8700e3e7c485 ("Soft RoCE driver")
> Fixes: 2d4b21e0a291 ("IB/rxe: Prevent from completer to operate on non valid QP")
> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
> Link: https://lore.kernel.org/r/20230413101115.1366068-1-yanjun.zhu@intel.com
> Signed-off-by: Leon Romanovsky <leon@kernel.org>
> [ Vladislav: match upstream cleanup order and add the missing
> resp.task.func check. ]
> Signed-off-by: Vladislav Nikolaev <vlad102nikolaev@gmail.com>
> ---

Thanks for the update.

> v2: Move rxe_cleanup_task(&qp->resp.task) after RC timer cleanup.
> Add missing qp->resp.task.func check before cleaning up the responder task.

I actually suggested adding only the corresponding check for the
rxe_cleanup_task(&qp->resp.task) call, which is what the upstream commit
does.  Moving the call a couple of lines requires some explanation of why
it's okay in 5.10/5.15 kernels.  Note that upstream made that move in a
separate commit, 960ebe97e523 ("RDMA/rxe: Remove __rxe_do_task()").

[ yeah, it should be safe to move the call, but it had better be stated
  explicitly in the backporter's comment ]

Note that checkpatch.pl reports the following for the current patch:

ERROR: trailing whitespace
#52: FILE: drivers/infiniband/sw/rxe/rxe_qp.c:771:
+^I$
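
The "+^I$" in that report denotes an added line that consists of a
single tab.  A rough userspace approximation of that one check is
sketched below.  It is not checkpatch.pl's implementation (which is
Perl), and added_line_has_trailing_ws() is a hypothetical helper that
assumes the caller passes only hunk body lines, not the "+++ b/..." file
header.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Returns true if a diff line is an added line ('+' prefix) whose
 * content ends in a space or a tab.  Assumption: the caller passes
 * only hunk body lines, never the "+++ b/file" header.
 */
static bool added_line_has_trailing_ws(const char *line)
{
	size_t len = strlen(line);

	/* Ignore the line terminator, if present. */
	while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == '\r'))
		len--;

	/* Context and removed lines are not checked. */
	if (len == 0 || line[0] != '+')
		return false;

	/* A bare "+" (empty added line) ends in '+', so it passes. */
	return line[len - 1] == ' ' || line[len - 1] == '\t';
}
```

Called on "+\t\n", the line flagged above, the helper returns true; an
empty added line "+\n" is accepted.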

You might also want to consider porting 1c7eec4d5f3b ("RDMA/rxe: Fix
"trying to register non-static key in rxe_qp_do_cleanup" bug"), which fixes
a similar problem for the del_timer_sync / timer_delete_sync calls in this
code.  All of this could probably go as a series now.



* Re: [lvc-project] [PATCH v2 5.10/5.15] RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task"
From: Vladislav Nikolaev @ 2026-06-05 17:37 UTC
  To: Fedor Pchelkin
  Cc: Vladislav Nikolaev, stable, Greg Kroah-Hartman, Haggai Eran,
	lvc-project, Leon Romanovsky, linux-rdma, Zhu Yanjun,
	linux-kernel, Jason Gunthorpe, Doug Ledford, Zhu Yanjun,
	syzbot+cfcc1a3c85be15a40cba

On Wed, 3 Jun 2026 at 18:03:00 +0300, Fedor Pchelkin wrote:
> Moving the call a couple of lines requires some explanation of why
> it's okay in 5.10/5.15 kernels.  Note that upstream made that move in a
> separate commit, 960ebe97e523 ("RDMA/rxe: Remove __rxe_do_task()").
>
> [ yeah, it should be safe to move the call, but it had better be stated
>   explicitly in the backporter's comment ]
>
> Note that checkpatch.pl reports the following for the current patch:
>
> ERROR: trailing whitespace
> #52: FILE: drivers/infiniband/sw/rxe/rxe_qp.c:771:
> +^I$
>
> You might also want to consider porting 1c7eec4d5f3b ("RDMA/rxe: Fix
> "trying to register non-static key in rxe_qp_do_cleanup" bug"), which fixes
> a similar problem for the del_timer_sync / timer_delete_sync calls in this
> code.  All of this could probably go as a series now.

Thanks for the review.

I have prepared v3 as a 5.10/5.15 series and addressed all three points:

1. extended the backporter's comment to explain why moving
   rxe_cleanup_task(&qp->resp.task) after the RC timer cleanup is safe
   for 5.10/5.15 even though upstream got that order via 960ebe97e523;
2. fixed the trailing whitespace;
3. added the backport of 1c7eec4d5f3b as the second patch in the series.

The updated series is available here:

https://lore.kernel.org/all/20260605171449.1760-1-vlad102nikolaev@gmail.com/


* Re: [PATCH v2 5.10/5.15] RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task"
From: Sasha Levin @ 2026-06-05 19:37 UTC
  To: stable, Greg Kroah-Hartman
  Cc: Sasha Levin, Vladislav Nikolaev, Zhu Yanjun, Doug Ledford,
	Jason Gunthorpe, Haggai Eran, Kamal Heib, Amir Vadai, Moni Shoua,
	Yonatan Cohen, Leon Romanovsky, linux-rdma, linux-kernel,
	Zhu Yanjun, lvc-project, syzbot+cfcc1a3c85be15a40cba, Zhu Yanjun

> [PATCH v2 5.10/5.15] RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task"

I'm dropping this for now; it isn't right for either branch as submitted:

 - 5.15.y: the bug doesn't exist there -- the task locks are already
   spin_lock_init()'d on the QP-create error path.
 - 5.10.y: mis-targeted -- it patches rxe_qp_do_cleanup(), but the 5.10
   error-unwind path doesn't call rxe_cleanup_task() there.

-- 
Thanks,
Sasha
