* [PATCH 6.1 0/3] RDMA/rxe: correct cleanup-task backport and timer cleanup
From: Vladislav Nikolaev @ 2026-06-05 17:03 UTC (permalink / raw)
To: stable, Greg Kroah-Hartman
Cc: Vladislav Nikolaev, Zhu Yanjun, Doug Ledford, Jason Gunthorpe,
Haggai Eran, Kamal Heib, Amir Vadai, Moni Shoua, Yonatan Cohen,
Leon Romanovsky, linux-rdma, linux-kernel, Zhu Yanjun,
lvc-project
The linux-6.1.y tree contains commit 3236221bb8e4 ("RDMA/rxe: Fix the
error "trying to register non-static key in rxe_cleanup_task""), which is
an incomplete backport of upstream commit b2b1ddc45745 ("RDMA/rxe: Fix
the error "trying to register non-static key in rxe_cleanup_task"").

The stable backport added guards for req.task and comp.task, but missed
the resp.task guard and also left rxe_cleanup_task(&qp->resp.task) above
the RC timer cleanup. The upstream fix checks all three tasks and keeps
resp.task cleanup after the timer cleanup.

This series first reverts the incomplete stable backport, then applies the
correct backport, and finally backports commit 1c7eec4d5f3b ("RDMA/rxe:
Fix "trying to register non-static key in rxe_qp_do_cleanup" bug") to
avoid deleting uninitialized RC timers during QP cleanup. The last patch
keeps del_timer_sync(), because linux-6.1.y has not renamed it to
timer_delete_sync() yet.
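
For readers who want to see the intended end state outside the driver, here
is a minimal user-space sketch of the cleanup order and the three guards
described above. It is not the rxe code: struct mock_task, struct mock_qp and
every mock_* helper are hypothetical stand-ins, and the timer step is reduced
to a log entry. The only assumption carried over from the series is that a
task's func pointer is non-NULL exactly when the task has been initialized.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct rxe_task. */
struct mock_task {
	int (*func)(void *arg);	/* non-NULL only after mock_init_task() */
};

/* Hypothetical stand-in for struct rxe_qp. */
struct mock_qp {
	struct mock_task req, comp, resp;
	char log[8];		/* records the order of cleanup steps */
};

static int mock_work(void *arg)
{
	(void)arg;
	return 0;
}

/* Models task initialization: this is what makes func non-NULL. */
void mock_init_task(struct mock_task *t)
{
	t->func = mock_work;
}

/* Append one step marker, always leaving room for the trailing NUL. */
static void mock_log(struct mock_qp *qp, char step)
{
	size_t n = strlen(qp->log);

	if (n + 1 < sizeof(qp->log))
		qp->log[n] = step;
}

/* Models rxe_cleanup_task(): must never see an uninitialized task. */
static void mock_cleanup_task(struct mock_qp *qp, struct mock_task *t,
			      char step)
{
	assert(t->func != NULL);
	mock_log(qp, step);
}

/* Timer step first (RC-only in the driver), then the three guarded tasks. */
void mock_qp_do_cleanup(struct mock_qp *qp)
{
	mock_log(qp, 'T');

	if (qp->resp.func)
		mock_cleanup_task(qp, &qp->resp, 'r');

	if (qp->req.func)
		mock_cleanup_task(qp, &qp->req, 'q');

	if (qp->comp.func)
		mock_cleanup_task(qp, &qp->comp, 'c');
}
```

With a zeroed mock_qp (the failed-init case) only the timer step is logged
and no task cleanup runs. Once all three tasks are initialized, the log reads
"Trqc", that is, resp.task is cleaned up after the timer step.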
Vladislav Nikolaev (1):
Revert "RDMA/rxe: Fix the error "trying to register non-static key in
rxe_cleanup_task""
Zhu Yanjun (2):
RDMA/rxe: Fix the error "trying to register non-static key in
rxe_cleanup_task"
RDMA/rxe: Fix "trying to register non-static key in rxe_qp_do_cleanup"
bug
drivers/infiniband/sw/rxe/rxe_qp.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
--
2.43.0
* [PATCH 6.1 1/3] Revert "RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task""
From: Vladislav Nikolaev @ 2026-06-05 17:03 UTC (permalink / raw)
To: stable, Greg Kroah-Hartman
Cc: Vladislav Nikolaev, Zhu Yanjun, Doug Ledford, Jason Gunthorpe,
Haggai Eran, Kamal Heib, Amir Vadai, Moni Shoua, Yonatan Cohen,
Leon Romanovsky, linux-rdma, linux-kernel, Zhu Yanjun,
lvc-project
This reverts commit 3236221bb8e4de8e3d0c8385f634064fb26b8e38.
The reverted commit is an incomplete backport of upstream
commit b2b1ddc45745. It added guards for req.task and comp.task
cleanup, but missed resp.task cleanup and left it before the RC timer
cleanup, unlike the upstream fix. Revert it first so the correct
backport can be applied cleanly in the following patch.
Signed-off-by: Vladislav Nikolaev <vlad102nikolaev@gmail.com>
---
drivers/infiniband/sw/rxe/rxe_qp.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
index 709c63e9773c..05e4a270084f 100644
--- a/drivers/infiniband/sw/rxe/rxe_qp.c
+++ b/drivers/infiniband/sw/rxe/rxe_qp.c
@@ -788,11 +788,8 @@ static void rxe_qp_do_cleanup(struct work_struct *work)
del_timer_sync(&qp->rnr_nak_timer);
}
- if (qp->req.task.func)
- rxe_cleanup_task(&qp->req.task);
-
- if (qp->comp.task.func)
- rxe_cleanup_task(&qp->comp.task);
+ rxe_cleanup_task(&qp->req.task);
+ rxe_cleanup_task(&qp->comp.task);
/* flush out any receive wr's or pending requests */
if (qp->req.task.func)
--
2.43.0
* [PATCH 6.1 2/3] RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task"
From: Vladislav Nikolaev @ 2026-06-05 17:03 UTC (permalink / raw)
To: stable, Greg Kroah-Hartman
Cc: Vladislav Nikolaev, Zhu Yanjun, Doug Ledford, Jason Gunthorpe,
Haggai Eran, Kamal Heib, Amir Vadai, Moni Shoua, Yonatan Cohen,
Leon Romanovsky, linux-rdma, linux-kernel, Zhu Yanjun,
lvc-project, syzbot+cfcc1a3c85be15a40cba, Zhu Yanjun
From: Zhu Yanjun <yanjun.zhu@linux.dev>
commit b2b1ddc457458fecd1c6f385baa9fbda5f0c63ad upstream.
In the function rxe_create_qp(), rxe_qp_from_init() is called to
initialize the qp. Internally, things like rxe_init_task are not set up
until rxe_qp_init_req().

If an error occurs before this point, the unwind will call
rxe_cleanup() and eventually rxe_qp_do_cleanup()/rxe_cleanup_task(),
which will oops when trying to access the uninitialized spinlock.

With this fix, if rxe_init_task is not executed, rxe_cleanup_task will
not be called.
Reported-by: syzbot+cfcc1a3c85be15a40cba@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?id=fd85757b74b3eb59f904138486f755f71e090df8
Fixes: 8700e3e7c485 ("Soft RoCE driver")
Fixes: 2d4b21e0a291 ("IB/rxe: Prevent from completer to operate on non valid QP")
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Link: https://lore.kernel.org/r/20230413101115.1366068-1-yanjun.zhu@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
[ Vladislav: match upstream cleanup order and add the missing
resp.task.func check. ]
Signed-off-by: Vladislav Nikolaev <vlad102nikolaev@gmail.com>
---
drivers/infiniband/sw/rxe/rxe_qp.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
index 05e4a270084f..171c0f4dcbec 100644
--- a/drivers/infiniband/sw/rxe/rxe_qp.c
+++ b/drivers/infiniband/sw/rxe/rxe_qp.c
@@ -781,15 +781,20 @@ static void rxe_qp_do_cleanup(struct work_struct *work)
qp->valid = 0;
qp->qp_timeout_jiffies = 0;
- rxe_cleanup_task(&qp->resp.task);
if (qp_type(qp) == IB_QPT_RC) {
del_timer_sync(&qp->retrans_timer);
del_timer_sync(&qp->rnr_nak_timer);
}
- rxe_cleanup_task(&qp->req.task);
- rxe_cleanup_task(&qp->comp.task);
+ if (qp->resp.task.func)
+ rxe_cleanup_task(&qp->resp.task);
+
+ if (qp->req.task.func)
+ rxe_cleanup_task(&qp->req.task);
+
+ if (qp->comp.task.func)
+ rxe_cleanup_task(&qp->comp.task);
/* flush out any receive wr's or pending requests */
if (qp->req.task.func)
--
2.43.0
* [PATCH 6.1 3/3] RDMA/rxe: Fix "trying to register non-static key in rxe_qp_do_cleanup" bug
From: Vladislav Nikolaev @ 2026-06-05 17:03 UTC (permalink / raw)
To: stable, Greg Kroah-Hartman
Cc: Vladislav Nikolaev, Zhu Yanjun, Doug Ledford, Jason Gunthorpe,
Haggai Eran, Kamal Heib, Amir Vadai, Moni Shoua, Yonatan Cohen,
Leon Romanovsky, linux-rdma, linux-kernel, Zhu Yanjun,
lvc-project, syzbot+4edb496c3cad6e953a31, Zhu Yanjun
From: Zhu Yanjun <yanjun.zhu@linux.dev>
commit 1c7eec4d5f3b39cdea2153abaebf1b7229a47072 upstream.
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
assign_lock_key kernel/locking/lockdep.c:986 [inline]
register_lock_class+0x4a3/0x4c0 kernel/locking/lockdep.c:1300
__lock_acquire+0x99/0x1ba0 kernel/locking/lockdep.c:5110
lock_acquire kernel/locking/lockdep.c:5866 [inline]
lock_acquire+0x179/0x350 kernel/locking/lockdep.c:5823
__timer_delete_sync+0x152/0x1b0 kernel/time/timer.c:1644
rxe_qp_do_cleanup+0x5c3/0x7e0 drivers/infiniband/sw/rxe/rxe_qp.c:815
execute_in_process_context+0x3a/0x160 kernel/workqueue.c:4596
__rxe_cleanup+0x267/0x3c0 drivers/infiniband/sw/rxe/rxe_pool.c:232
rxe_create_qp+0x3f7/0x5f0 drivers/infiniband/sw/rxe/rxe_verbs.c:604
create_qp+0x62d/0xa80 drivers/infiniband/core/verbs.c:1250
ib_create_qp_kernel+0x9f/0x310 drivers/infiniband/core/verbs.c:1361
ib_create_qp include/rdma/ib_verbs.h:3803 [inline]
rdma_create_qp+0x10c/0x340 drivers/infiniband/core/cma.c:1144
rds_ib_setup_qp+0xc86/0x19a0 net/rds/ib_cm.c:600
rds_ib_cm_initiate_connect+0x1e8/0x3d0 net/rds/ib_cm.c:944
rds_rdma_cm_event_handler_cmn+0x61f/0x8c0 net/rds/rdma_transport.c:109
cma_cm_event_handler+0x94/0x300 drivers/infiniband/core/cma.c:2184
cma_work_handler+0x15b/0x230 drivers/infiniband/core/cma.c:3042
process_one_work+0x9cc/0x1b70 kernel/workqueue.c:3238
process_scheduled_works kernel/workqueue.c:3319 [inline]
worker_thread+0x6c8/0xf10 kernel/workqueue.c:3400
kthread+0x3c2/0x780 kernel/kthread.c:464
ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:153
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
</TASK>
The root cause is as follows:

In the function rxe_create_qp, the function rxe_qp_from_init is called
to create the qp. If rxe_qp_from_init fails, rxe_cleanup is called to
release all the allocated resources, including the timers retrans_timer
and rnr_nak_timer.

The function rxe_qp_from_init calls the function rxe_qp_init_req to
initialize the timers retrans_timer and rnr_nak_timer, but these timers
are initialized only at the end of rxe_qp_init_req. If an error occurs
before the timers are initialized, the cleanup path operates on
uninitialized timers and triggers the trace above.

The solution is to check whether these timers are initialized. If they
are not initialized, skip them.
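
As an illustration only, the check can be modelled in user space as below.
This is a sketch under stated assumptions, not the driver code: struct
mock_timer, struct mock_qp and the mock_* helpers are hypothetical, and the
one property borrowed from the kernel is that setting up a timer stores a
non-NULL callback in its .function field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct timer_list. */
struct mock_timer {
	void (*function)(struct mock_timer *t);	/* set by mock_timer_setup() */
	bool deleted;
};

/* Hypothetical stand-in for the timer-related part of struct rxe_qp. */
struct mock_qp {
	bool is_rc;	/* models qp_type(qp) == IB_QPT_RC */
	struct mock_timer retrans_timer;
	struct mock_timer rnr_nak_timer;
};

static void mock_timer_fn(struct mock_timer *t)
{
	(void)t;
}

/* Models timer_setup(): this is what makes .function non-NULL. */
void mock_timer_setup(struct mock_timer *t)
{
	t->function = mock_timer_fn;
}

/* Models del_timer_sync(): only valid on an initialized timer. */
static void mock_del_timer_sync(struct mock_timer *t)
{
	assert(t->function != NULL);
	t->deleted = true;
}

/* Delete the RC timers only if both were set up; report whether we did. */
bool mock_cleanup_rc_timers(struct mock_qp *qp)
{
	if (qp->is_rc && qp->retrans_timer.function &&
	    qp->rnr_nak_timer.function) {
		mock_del_timer_sync(&qp->retrans_timer);
		mock_del_timer_sync(&qp->rnr_nak_timer);
		return true;
	}

	return false;
}
```

Note that, as in the patch, the two timers are treated as a pair: if only one
of them has been set up, neither is deleted.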
Fixes: 8700e3e7c485 ("Soft RoCE driver")
Reported-by: syzbot+4edb496c3cad6e953a31@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=4edb496c3cad6e953a31
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Link: https://patch.msgid.link/20250419080741.1515231-1-yanjun.zhu@linux.dev
Signed-off-by: Leon Romanovsky <leon@kernel.org>
[ Vladislav: keep del_timer_sync() because linux-6.1.y has not renamed it
to timer_delete_sync() yet. ]
Signed-off-by: Vladislav Nikolaev <vlad102nikolaev@gmail.com>
---
drivers/infiniband/sw/rxe/rxe_qp.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
index 171c0f4dcbec..899fee5f145a 100644
--- a/drivers/infiniband/sw/rxe/rxe_qp.c
+++ b/drivers/infiniband/sw/rxe/rxe_qp.c
@@ -782,7 +782,12 @@ static void rxe_qp_do_cleanup(struct work_struct *work)
qp->valid = 0;
qp->qp_timeout_jiffies = 0;
- if (qp_type(qp) == IB_QPT_RC) {
+ /* In the function timer_setup, .function is initialized. If .function
+ * is NULL, it indicates the function timer_setup is not called, the
+ * timer is not initialized. Or else, the timer is initialized.
+ */
+ if (qp_type(qp) == IB_QPT_RC && qp->retrans_timer.function &&
+ qp->rnr_nak_timer.function) {
del_timer_sync(&qp->retrans_timer);
del_timer_sync(&qp->rnr_nak_timer);
}
--
2.43.0
* Re: [PATCH 6.1 1/3] Revert "RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task""
From: Greg Kroah-Hartman @ 2026-06-16 13:39 UTC (permalink / raw)
To: Vladislav Nikolaev
Cc: stable, Zhu Yanjun, Doug Ledford, Jason Gunthorpe, Haggai Eran,
Kamal Heib, Amir Vadai, Moni Shoua, Yonatan Cohen,
Leon Romanovsky, linux-rdma, linux-kernel, Zhu Yanjun,
lvc-project
On Fri, Jun 05, 2026 at 08:03:27PM +0300, Vladislav Nikolaev wrote:
> This reverts commit 3236221bb8e4de8e3d0c8385f634064fb26b8e38.
>
> The reverted commit is an incomplete backport of upstream
> commit b2b1ddc45745. It added guards for req.task and comp.task
> cleanup, but missed resp.task cleanup and left it before the RC timer
> cleanup, unlike the upstream fix. Revert it first so the correct
> backport can be applied cleanly in the following patch.
>
> Signed-off-by: Vladislav Nikolaev <vlad102nikolaev@gmail.com>
> ---
> drivers/infiniband/sw/rxe/rxe_qp.c | 7 ++-----
> 1 file changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
> index 709c63e9773c..05e4a270084f 100644
> --- a/drivers/infiniband/sw/rxe/rxe_qp.c
> +++ b/drivers/infiniband/sw/rxe/rxe_qp.c
> @@ -788,11 +788,8 @@ static void rxe_qp_do_cleanup(struct work_struct *work)
> del_timer_sync(&qp->rnr_nak_timer);
> }
>
> - if (qp->req.task.func)
> - rxe_cleanup_task(&qp->req.task);
> -
> - if (qp->comp.task.func)
> - rxe_cleanup_task(&qp->comp.task);
> + rxe_cleanup_task(&qp->req.task);
> + rxe_cleanup_task(&qp->comp.task);
>
> /* flush out any receive wr's or pending requests */
> if (qp->req.task.func)
> --
> 2.43.0
>
This series does not apply to the latest tree :(
Are you sure it is still needed?
thanks,
greg k-h