* [syzbot] [rdma?] INFO: trying to register non-static key in rxe_cleanup_task (2) @ 2023-02-28 15:14 syzbot 2023-04-01 2:09 ` Zhu Yanjun 0 siblings, 1 reply; 7+ messages in thread From: syzbot @ 2023-02-28 15:14 UTC (permalink / raw) To: jgg, leon, linux-kernel, linux-rdma, syzkaller-bugs, zyjzyj2000 Hello, syzbot found the following issue on: HEAD commit: 982818426a0f Merge tag 'arm-fixes-6.3-1' of git://git.kern.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=1194800f480000 kernel config: https://syzkaller.appspot.com/x/.config?x=90a4de3f96747e3f dashboard link: https://syzkaller.appspot.com/bug?extid=cfcc1a3c85be15a40cba compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2 Unfortunately, I don't have any reproducer for this issue yet. Downloadable assets: disk image: https://storage.googleapis.com/syzbot-assets/225d8c8e9264/disk-98281842.raw.xz vmlinux: https://storage.googleapis.com/syzbot-assets/87a9e2a89842/vmlinux-98281842.xz kernel image: https://storage.googleapis.com/syzbot-assets/39bdeb741f2e/bzImage-98281842.xz IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+cfcc1a3c85be15a40cba@syzkaller.appspotmail.com Node 1 hugepages_total=2 hugepages_free=2 hugepages_surp=0 hugepages_size=2048kB 14586 total pagecache pages 0 pages in swap cache Free swap = 0kB Total swap = 0kB 2097051 pages RAM 0 pages HighMem/MovableOnly 392145 pages reserved 0 pages cma reserved INFO: trying to register non-static key. The code is fine but needs lockdep annotation, or maybe you didn't initialize this object before use? turning off the locking correctness validator. CPU: 1 PID: 5486 Comm: syz-executor.2 Not tainted 6.2.0-syzkaller-12765-g982818426a0f #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/16/2023 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106 assign_lock_key kernel/locking/lockdep.c:982 [inline] register_lock_class+0xdb6/0x1120 kernel/locking/lockdep.c:1295 __lock_acquire+0x108/0x5d40 kernel/locking/lockdep.c:4935 lock_acquire kernel/locking/lockdep.c:5669 [inline] lock_acquire+0x1e3/0x670 kernel/locking/lockdep.c:5634 __raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline] _raw_spin_lock_bh+0x33/0x40 kernel/locking/spinlock.c:178 spin_lock_bh include/linux/spinlock.h:355 [inline] rxe_cleanup_task+0x73/0xc0 drivers/infiniband/sw/rxe/rxe_task.c:119 rxe_qp_do_cleanup+0x8c/0x7c0 drivers/infiniband/sw/rxe/rxe_qp.c:776 execute_in_process_context+0x3b/0x150 kernel/workqueue.c:3458 __rxe_cleanup+0x21e/0x370 drivers/infiniband/sw/rxe/rxe_pool.c:233 rxe_create_qp+0x2c2/0x340 drivers/infiniband/sw/rxe/rxe_verbs.c:430 create_qp+0x5ac/0x970 drivers/infiniband/core/verbs.c:1233 ib_create_qp_kernel+0xa1/0x310 drivers/infiniband/core/verbs.c:1344 ib_create_qp include/rdma/ib_verbs.h:3743 [inline] create_mad_qp+0x177/0x380 drivers/infiniband/core/mad.c:2905 ib_mad_port_open drivers/infiniband/core/mad.c:2986 [inline] ib_mad_init_device+0xf40/0x1a90 drivers/infiniband/core/mad.c:3077 add_client_context+0x405/0x5e0 drivers/infiniband/core/device.c:721 enable_device_and_get+0x1cd/0x3b0 drivers/infiniband/core/device.c:1332 ib_register_device drivers/infiniband/core/device.c:1420 [inline] ib_register_device+0x8b1/0xbc0 drivers/infiniband/core/device.c:1366 rxe_register_device+0x317/0x3f0 drivers/infiniband/sw/rxe/rxe_verbs.c:1096 rxe_net_add+0x90/0xf0 drivers/infiniband/sw/rxe/rxe_net.c:524 rxe_newlink+0xd5/0x140 drivers/infiniband/sw/rxe/rxe.c:195 nldev_newlink+0x332/0x5e0 drivers/infiniband/core/nldev.c:1731 rdma_nl_rcv_msg+0x371/0x6a0 drivers/infiniband/core/netlink.c:195 rdma_nl_rcv_skb.constprop.0.isra.0+0x2fc/0x440 drivers/infiniband/core/netlink.c:239 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline] netlink_unicast+0x547/0x7f0 net/netlink/af_netlink.c:1365 netlink_sendmsg+0x925/0xe30 net/netlink/af_netlink.c:1942 sock_sendmsg_nosec net/socket.c:722 [inline] sock_sendmsg+0xde/0x190 net/socket.c:745 ____sys_sendmsg+0x71c/0x900 net/socket.c:2504 ___sys_sendmsg+0x110/0x1b0 net/socket.c:2558 __sys_sendmsg+0xf7/0x1c0 net/socket.c:2587 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7fd95868c0f9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fd9594b3168 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007fd9587abf80 RCX: 00007fd95868c0f9 RDX: 0000000000000000 RSI: 0000000020000200 RDI: 0000000000000003 RBP: 00007fd9586e7ae9 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007fff9db7c9ef R14: 00007fd9594b3300 R15: 0000000000022000 </TASK> infiniband syz1: Couldn't create ib_mad QP1 infiniband syz1: Couldn't open port 1 RDS/IB: syz1: added smc: adding ib device syz1 with port count 1 smc: ib device syz1 port 1 has pnetid syz-executor.2 (5486) used greatest stack depth: 22840 bytes left --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkaller@googlegroups.com. syzbot will keep track of this issue. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [syzbot] [rdma?] INFO: trying to register non-static key in rxe_cleanup_task (2) 2023-02-28 15:14 [syzbot] [rdma?] INFO: trying to register non-static key in rxe_cleanup_task (2) syzbot @ 2023-04-01 2:09 ` Zhu Yanjun 2023-04-01 2:44 ` [PATCH 1/1] RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task" Zhu Yanjun 0 siblings, 1 reply; 7+ messages in thread From: Zhu Yanjun @ 2023-04-01 2:09 UTC (permalink / raw) To: syzbot, jgg, leon, linux-kernel, linux-rdma, syzkaller-bugs, zyjzyj2000 在 2023/2/28 23:14, syzbot 写道: > Hello, > > syzbot found the following issue on: > > HEAD commit: 982818426a0f Merge tag 'arm-fixes-6.3-1' of git://git.kern.. > git tree: upstream > console output: https://syzkaller.appspot.com/x/log.txt?x=1194800f480000 > kernel config: https://syzkaller.appspot.com/x/.config?x=90a4de3f96747e3f > dashboard link: https://syzkaller.appspot.com/bug?extid=cfcc1a3c85be15a40cba > compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2 > > Unfortunately, I don't have any reproducer for this issue yet. > > Downloadable assets: > disk image: https://storage.googleapis.com/syzbot-assets/225d8c8e9264/disk-98281842.raw.xz > vmlinux: https://storage.googleapis.com/syzbot-assets/87a9e2a89842/vmlinux-98281842.xz > kernel image: https://storage.googleapis.com/syzbot-assets/39bdeb741f2e/bzImage-98281842.xz > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > Reported-by: syzbot+cfcc1a3c85be15a40cba@syzkaller.appspotmail.com > > Node 1 hugepages_total=2 hugepages_free=2 hugepages_surp=0 hugepages_size=2048kB > 14586 total pagecache pages > 0 pages in swap cache > Free swap = 0kB > Total swap = 0kB > 2097051 pages RAM > 0 pages HighMem/MovableOnly > 392145 pages reserved > 0 pages cma reserved > INFO: trying to register non-static key. > The code is fine but needs lockdep annotation, or maybe > you didn't initialize this object before use? > turning off the locking correctness validator. > CPU: 1 PID: 5486 Comm: syz-executor.2 Not tainted 6.2.0-syzkaller-12765-g982818426a0f #0 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/16/2023 > Call Trace: > <TASK> > __dump_stack lib/dump_stack.c:88 [inline] > dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106 > assign_lock_key kernel/locking/lockdep.c:982 [inline] > register_lock_class+0xdb6/0x1120 kernel/locking/lockdep.c:1295 > __lock_acquire+0x108/0x5d40 kernel/locking/lockdep.c:4935 > lock_acquire kernel/locking/lockdep.c:5669 [inline] > lock_acquire+0x1e3/0x670 kernel/locking/lockdep.c:5634 > __raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline] > _raw_spin_lock_bh+0x33/0x40 kernel/locking/spinlock.c:178 > spin_lock_bh include/linux/spinlock.h:355 [inline] > rxe_cleanup_task+0x73/0xc0 drivers/infiniband/sw/rxe/rxe_task.c:119 > rxe_qp_do_cleanup+0x8c/0x7c0 drivers/infiniband/sw/rxe/rxe_qp.c:776 > execute_in_process_context+0x3b/0x150 kernel/workqueue.c:3458 > __rxe_cleanup+0x21e/0x370 drivers/infiniband/sw/rxe/rxe_pool.c:233 > rxe_create_qp+0x2c2/0x340 drivers/infiniband/sw/rxe/rxe_verbs.c:430 It took me a lot of time to reproduce this problem. And finally I can reproduce this problem. The root cause is: In this function rxe_create_qp, if some problem occur in rxe_qp_init_req, rxe_init_task functions are not executed. But error handler of rxe_create_qp still calls rxe_cleanup_task. This problem will occur. static int rxe_create_qp { ... err = rxe_qp_from_init { ... err = rxe_qp_init_req { ... some errors occur <---This will go to the error handler of rxe_create_qp rxe_init_task(&qp->req.task, qp, rxe_requester); <---It is not executed. rxe_init_task(&qp->comp.task, qp, rxe_completer);<---It is not executed. ... } if (err) goto err1; ... err = rxe_qp_init_resp{ ... rxe_init_task(&qp->resp.task, qp, rxe_responder);<---It is not executed. ... } if (err) goto err2; } if (err) goto qp_init; ... qp_init: rxe_cleanup(qp); ---> rxe_cleanup_task return err; } Zhu Yanjun > create_qp+0x5ac/0x970 drivers/infiniband/core/verbs.c:1233 > ib_create_qp_kernel+0xa1/0x310 drivers/infiniband/core/verbs.c:1344 > ib_create_qp include/rdma/ib_verbs.h:3743 [inline] > create_mad_qp+0x177/0x380 drivers/infiniband/core/mad.c:2905 > ib_mad_port_open drivers/infiniband/core/mad.c:2986 [inline] > ib_mad_init_device+0xf40/0x1a90 drivers/infiniband/core/mad.c:3077 > add_client_context+0x405/0x5e0 drivers/infiniband/core/device.c:721 > enable_device_and_get+0x1cd/0x3b0 drivers/infiniband/core/device.c:1332 > ib_register_device drivers/infiniband/core/device.c:1420 [inline] > ib_register_device+0x8b1/0xbc0 drivers/infiniband/core/device.c:1366 > rxe_register_device+0x317/0x3f0 drivers/infiniband/sw/rxe/rxe_verbs.c:1096 > rxe_net_add+0x90/0xf0 drivers/infiniband/sw/rxe/rxe_net.c:524 > rxe_newlink+0xd5/0x140 drivers/infiniband/sw/rxe/rxe.c:195 > nldev_newlink+0x332/0x5e0 drivers/infiniband/core/nldev.c:1731 > rdma_nl_rcv_msg+0x371/0x6a0 drivers/infiniband/core/netlink.c:195 > rdma_nl_rcv_skb.constprop.0.isra.0+0x2fc/0x440 drivers/infiniband/core/netlink.c:239 > netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline] > netlink_unicast+0x547/0x7f0 net/netlink/af_netlink.c:1365 > netlink_sendmsg+0x925/0xe30 net/netlink/af_netlink.c:1942 > sock_sendmsg_nosec net/socket.c:722 [inline] > sock_sendmsg+0xde/0x190 net/socket.c:745 > ____sys_sendmsg+0x71c/0x900 net/socket.c:2504 > ___sys_sendmsg+0x110/0x1b0 net/socket.c:2558 > __sys_sendmsg+0xf7/0x1c0 net/socket.c:2587 > do_syscall_x64 arch/x86/entry/common.c:50 [inline] > do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 > entry_SYSCALL_64_after_hwframe+0x63/0xcd > RIP: 0033:0x7fd95868c0f9 > Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 > RSP: 002b:00007fd9594b3168 EFLAGS: 00000246 ORIG_RAX: 000000000000002e > RAX: ffffffffffffffda RBX: 00007fd9587abf80 RCX: 00007fd95868c0f9 > RDX: 0000000000000000 RSI: 0000000020000200 RDI: 0000000000000003 > RBP: 00007fd9586e7ae9 R08: 0000000000000000 R09: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 > R13: 00007fff9db7c9ef R14: 00007fd9594b3300 R15: 0000000000022000 > </TASK> > infiniband syz1: Couldn't create ib_mad QP1 > infiniband syz1: Couldn't open port 1 > RDS/IB: syz1: added > smc: adding ib device syz1 with port count 1 > smc: ib device syz1 port 1 has pnetid > syz-executor.2 (5486) used greatest stack depth: 22840 bytes left > > > --- > This report is generated by a bot. It may contain errors. > See https://goo.gl/tpsmEJ for more information about syzbot. > syzbot engineers can be reached at syzkaller@googlegroups.com. > > syzbot will keep track of this issue. See: > https://goo.gl/tpsmEJ#status for how to communicate with syzbot. ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 1/1] RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task" 2023-04-01 2:09 ` Zhu Yanjun @ 2023-04-01 2:44 ` Zhu Yanjun 2023-04-03 18:10 ` Leon Romanovsky 0 siblings, 1 reply; 7+ messages in thread From: Zhu Yanjun @ 2023-04-01 2:44 UTC (permalink / raw) To: zyjzyj2000, jgg, leon, linux-rdma; +Cc: Zhu Yanjun, syzbot+cfcc1a3c85be15a40cba From: Zhu Yanjun <yanjun.zhu@linux.dev> In the function rxe_create_qp(), rxe_qp_from_init() is called to initialize qp, internally things like rxe_init_task are not setup until rxe_qp_init_req(). If an error occures before this point then the unwind will call rxe_cleanup() and eventually to rxe_qp_do_cleanup()/rxe_cleanup_task() which will oops when trying to access the uninitialized spinlock. If rxe_init_task is not executed, rxe_cleanup_task will not be called. Reported-by: syzbot+cfcc1a3c85be15a40cba@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=fd85757b74b3eb59f904138486f755f71e090df8 Fixes: 8700e3e7c485 ("Soft RoCE driver") Fixes: 2d4b21e0a291 ("IB/rxe: Prevent from completer to operate on non valid QP") Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> --- drivers/infiniband/sw/rxe/rxe_qp.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c index ab72db68b58f..7856c02c1b46 100644 --- a/drivers/infiniband/sw/rxe/rxe_qp.c +++ b/drivers/infiniband/sw/rxe/rxe_qp.c @@ -176,6 +176,10 @@ static void rxe_qp_init_misc(struct rxe_dev *rxe, struct rxe_qp *qp, spin_lock_init(&qp->rq.producer_lock); spin_lock_init(&qp->rq.consumer_lock); + memset(&qp->req.task, 0, sizeof(struct rxe_task)); + memset(&qp->comp.task, 0, sizeof(struct rxe_task)); + memset(&qp->resp.task, 0, sizeof(struct rxe_task)); + atomic_set(&qp->ssn, 0); atomic_set(&qp->skb_out, 0); } @@ -773,15 +777,20 @@ static void rxe_qp_do_cleanup(struct work_struct *work) qp->valid = 0; qp->qp_timeout_jiffies = 0; - rxe_cleanup_task(&qp->resp.task); + + if (qp->resp.task.func) + rxe_cleanup_task(&qp->resp.task); if (qp_type(qp) == IB_QPT_RC) { del_timer_sync(&qp->retrans_timer); del_timer_sync(&qp->rnr_nak_timer); } - rxe_cleanup_task(&qp->req.task); - rxe_cleanup_task(&qp->comp.task); + if (qp->req.task.func) + rxe_cleanup_task(&qp->req.task); + + if (qp->comp.task.func) + rxe_cleanup_task(&qp->comp.task); /* flush out any receive wr's or pending requests */ if (qp->req.task.func) -- 2.27.0 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 1/1] RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task" 2023-04-01 2:44 ` [PATCH 1/1] RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task" Zhu Yanjun @ 2023-04-03 18:10 ` Leon Romanovsky 2023-04-04 0:13 ` Zhu Yanjun 0 siblings, 1 reply; 7+ messages in thread From: Leon Romanovsky @ 2023-04-03 18:10 UTC (permalink / raw) To: Zhu Yanjun Cc: zyjzyj2000, jgg, linux-rdma, Zhu Yanjun, syzbot+cfcc1a3c85be15a40cba On Sat, Apr 01, 2023 at 10:44:17AM +0800, Zhu Yanjun wrote: > From: Zhu Yanjun <yanjun.zhu@linux.dev> > > In the function rxe_create_qp(), rxe_qp_from_init() is called to > initialize qp, internally things like rxe_init_task are not setup until > rxe_qp_init_req(). > > If an error occures before this point then the unwind will call > rxe_cleanup() and eventually to rxe_qp_do_cleanup()/rxe_cleanup_task() > which will oops when trying to access the uninitialized spinlock. > > If rxe_init_task is not executed, rxe_cleanup_task will not be called. > > Reported-by: syzbot+cfcc1a3c85be15a40cba@syzkaller.appspotmail.com > Link: https://syzkaller.appspot.com/bug?id=fd85757b74b3eb59f904138486f755f71e090df8 > > Fixes: 8700e3e7c485 ("Soft RoCE driver") > Fixes: 2d4b21e0a291 ("IB/rxe: Prevent from completer to operate on non valid QP") > Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> > --- > drivers/infiniband/sw/rxe/rxe_qp.c | 15 ++++++++++++--- > 1 file changed, 12 insertions(+), 3 deletions(-) > > diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c > index ab72db68b58f..7856c02c1b46 100644 > --- a/drivers/infiniband/sw/rxe/rxe_qp.c > +++ b/drivers/infiniband/sw/rxe/rxe_qp.c > @@ -176,6 +176,10 @@ static void rxe_qp_init_misc(struct rxe_dev *rxe, struct rxe_qp *qp, > spin_lock_init(&qp->rq.producer_lock); > spin_lock_init(&qp->rq.consumer_lock); > > + memset(&qp->req.task, 0, sizeof(struct rxe_task)); > + memset(&qp->comp.task, 0, sizeof(struct rxe_task)); > + memset(&qp->resp.task, 0, sizeof(struct rxe_task)); IMHO QP is already zeroed here. Please don't send patches as reply-to. Thanks > + > atomic_set(&qp->ssn, 0); > atomic_set(&qp->skb_out, 0); > } > @@ -773,15 +777,20 @@ static void rxe_qp_do_cleanup(struct work_struct *work) > > qp->valid = 0; > qp->qp_timeout_jiffies = 0; > - rxe_cleanup_task(&qp->resp.task); > + > + if (qp->resp.task.func) > + rxe_cleanup_task(&qp->resp.task); > > if (qp_type(qp) == IB_QPT_RC) { > del_timer_sync(&qp->retrans_timer); > del_timer_sync(&qp->rnr_nak_timer); > } > > - rxe_cleanup_task(&qp->req.task); > - rxe_cleanup_task(&qp->comp.task); > + if (qp->req.task.func) > + rxe_cleanup_task(&qp->req.task); > + > + if (qp->comp.task.func) > + rxe_cleanup_task(&qp->comp.task); > > /* flush out any receive wr's or pending requests */ > if (qp->req.task.func) > -- > 2.27.0 > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/1] RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task" 2023-04-03 18:10 ` Leon Romanovsky @ 2023-04-04 0:13 ` Zhu Yanjun 2023-04-04 5:58 ` Leon Romanovsky 0 siblings, 1 reply; 7+ messages in thread From: Zhu Yanjun @ 2023-04-04 0:13 UTC (permalink / raw) To: Leon Romanovsky, Zhu Yanjun Cc: zyjzyj2000, jgg, linux-rdma, syzbot+cfcc1a3c85be15a40cba 在 2023/4/4 2:10, Leon Romanovsky 写道: > On Sat, Apr 01, 2023 at 10:44:17AM +0800, Zhu Yanjun wrote: >> From: Zhu Yanjun <yanjun.zhu@linux.dev> >> >> In the function rxe_create_qp(), rxe_qp_from_init() is called to >> initialize qp, internally things like rxe_init_task are not setup until >> rxe_qp_init_req(). >> >> If an error occures before this point then the unwind will call >> rxe_cleanup() and eventually to rxe_qp_do_cleanup()/rxe_cleanup_task() >> which will oops when trying to access the uninitialized spinlock. >> >> If rxe_init_task is not executed, rxe_cleanup_task will not be called. >> >> Reported-by: syzbot+cfcc1a3c85be15a40cba@syzkaller.appspotmail.com >> Link: https://syzkaller.appspot.com/bug?id=fd85757b74b3eb59f904138486f755f71e090df8 >> >> Fixes: 8700e3e7c485 ("Soft RoCE driver") >> Fixes: 2d4b21e0a291 ("IB/rxe: Prevent from completer to operate on non valid QP") >> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> >> --- >> drivers/infiniband/sw/rxe/rxe_qp.c | 15 ++++++++++++--- >> 1 file changed, 12 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c >> index ab72db68b58f..7856c02c1b46 100644 >> --- a/drivers/infiniband/sw/rxe/rxe_qp.c >> +++ b/drivers/infiniband/sw/rxe/rxe_qp.c >> @@ -176,6 +176,10 @@ static void rxe_qp_init_misc(struct rxe_dev *rxe, struct rxe_qp *qp, >> spin_lock_init(&qp->rq.producer_lock); >> spin_lock_init(&qp->rq.consumer_lock); >> >> + memset(&qp->req.task, 0, sizeof(struct rxe_task)); >> + memset(&qp->comp.task, 0, sizeof(struct rxe_task)); >> + memset(&qp->resp.task, 0, sizeof(struct rxe_task)); > IMHO QP is already zeroed here. Sure. Exactly. Here I just confirm that req.task, comp.task and resp.task are zeroed explicitly. If you think it had better remove these memset functions, I will follow your advice. Please let me know your advice. > Please don't send patches as reply-to. Got it. I will follow your advice. Thanks, Zhu Yanjun > > Thanks > >> + >> atomic_set(&qp->ssn, 0); >> atomic_set(&qp->skb_out, 0); >> } >> @@ -773,15 +777,20 @@ static void rxe_qp_do_cleanup(struct work_struct *work) >> >> qp->valid = 0; >> qp->qp_timeout_jiffies = 0; >> - rxe_cleanup_task(&qp->resp.task); >> + >> + if (qp->resp.task.func) >> + rxe_cleanup_task(&qp->resp.task); >> >> if (qp_type(qp) == IB_QPT_RC) { >> del_timer_sync(&qp->retrans_timer); >> del_timer_sync(&qp->rnr_nak_timer); >> } >> >> - rxe_cleanup_task(&qp->req.task); >> - rxe_cleanup_task(&qp->comp.task); >> + if (qp->req.task.func) >> + rxe_cleanup_task(&qp->req.task); >> + >> + if (qp->comp.task.func) >> + rxe_cleanup_task(&qp->comp.task); >> >> /* flush out any receive wr's or pending requests */ >> if (qp->req.task.func) >> -- >> 2.27.0 >> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/1] RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task" 2023-04-04 0:13 ` Zhu Yanjun @ 2023-04-04 5:58 ` Leon Romanovsky 2023-04-04 6:42 ` Zhu Yanjun 0 siblings, 1 reply; 7+ messages in thread From: Leon Romanovsky @ 2023-04-04 5:58 UTC (permalink / raw) To: Zhu Yanjun Cc: Zhu Yanjun, zyjzyj2000, jgg, linux-rdma, syzbot+cfcc1a3c85be15a40cba On Tue, Apr 04, 2023 at 08:13:22AM +0800, Zhu Yanjun wrote: > > 在 2023/4/4 2:10, Leon Romanovsky 写道: > > On Sat, Apr 01, 2023 at 10:44:17AM +0800, Zhu Yanjun wrote: > > > From: Zhu Yanjun <yanjun.zhu@linux.dev> > > > > > > In the function rxe_create_qp(), rxe_qp_from_init() is called to > > > initialize qp, internally things like rxe_init_task are not setup until > > > rxe_qp_init_req(). > > > > > > If an error occures before this point then the unwind will call > > > rxe_cleanup() and eventually to rxe_qp_do_cleanup()/rxe_cleanup_task() > > > which will oops when trying to access the uninitialized spinlock. > > > > > > If rxe_init_task is not executed, rxe_cleanup_task will not be called. > > > > > > Reported-by: syzbot+cfcc1a3c85be15a40cba@syzkaller.appspotmail.com > > > Link: https://syzkaller.appspot.com/bug?id=fd85757b74b3eb59f904138486f755f71e090df8 > > > > > > Fixes: 8700e3e7c485 ("Soft RoCE driver") > > > Fixes: 2d4b21e0a291 ("IB/rxe: Prevent from completer to operate on non valid QP") > > > Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> > > > --- > > > drivers/infiniband/sw/rxe/rxe_qp.c | 15 ++++++++++++--- > > > 1 file changed, 12 insertions(+), 3 deletions(-) > > > > > > diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c > > > index ab72db68b58f..7856c02c1b46 100644 > > > --- a/drivers/infiniband/sw/rxe/rxe_qp.c > > > +++ b/drivers/infiniband/sw/rxe/rxe_qp.c > > > @@ -176,6 +176,10 @@ static void rxe_qp_init_misc(struct rxe_dev *rxe, struct rxe_qp *qp, > > > spin_lock_init(&qp->rq.producer_lock); > > > spin_lock_init(&qp->rq.consumer_lock); > > > + memset(&qp->req.task, 0, sizeof(struct rxe_task)); > > > + memset(&qp->comp.task, 0, sizeof(struct rxe_task)); > > > + memset(&qp->resp.task, 0, sizeof(struct rxe_task)); > > IMHO QP is already zeroed here. > > Sure. Exactly. Here I just confirm that req.task, comp.task and resp.task > are zeroed explicitly. There is no need to do so. It is quite misleading to read the code and see these memset() functions as they give false impression that QP is not zeroed. > > If you think it had better remove these memset functions, I will follow your > advice. Yes, please. Thanks ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/1] RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task" 2023-04-04 5:58 ` Leon Romanovsky @ 2023-04-04 6:42 ` Zhu Yanjun 0 siblings, 0 replies; 7+ messages in thread From: Zhu Yanjun @ 2023-04-04 6:42 UTC (permalink / raw) To: Leon Romanovsky Cc: Zhu Yanjun, zyjzyj2000, jgg, linux-rdma, syzbot+cfcc1a3c85be15a40cba 在 2023/4/4 13:58, Leon Romanovsky 写道: > On Tue, Apr 04, 2023 at 08:13:22AM +0800, Zhu Yanjun wrote: >> 在 2023/4/4 2:10, Leon Romanovsky 写道: >>> On Sat, Apr 01, 2023 at 10:44:17AM +0800, Zhu Yanjun wrote: >>>> From: Zhu Yanjun <yanjun.zhu@linux.dev> >>>> >>>> In the function rxe_create_qp(), rxe_qp_from_init() is called to >>>> initialize qp, internally things like rxe_init_task are not setup until >>>> rxe_qp_init_req(). >>>> >>>> If an error occures before this point then the unwind will call >>>> rxe_cleanup() and eventually to rxe_qp_do_cleanup()/rxe_cleanup_task() >>>> which will oops when trying to access the uninitialized spinlock. >>>> >>>> If rxe_init_task is not executed, rxe_cleanup_task will not be called. >>>> >>>> Reported-by: syzbot+cfcc1a3c85be15a40cba@syzkaller.appspotmail.com >>>> Link: https://syzkaller.appspot.com/bug?id=fd85757b74b3eb59f904138486f755f71e090df8 >>>> >>>> Fixes: 8700e3e7c485 ("Soft RoCE driver") >>>> Fixes: 2d4b21e0a291 ("IB/rxe: Prevent from completer to operate on non valid QP") >>>> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> >>>> --- >>>> drivers/infiniband/sw/rxe/rxe_qp.c | 15 ++++++++++++--- >>>> 1 file changed, 12 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c >>>> index ab72db68b58f..7856c02c1b46 100644 >>>> --- a/drivers/infiniband/sw/rxe/rxe_qp.c >>>> +++ b/drivers/infiniband/sw/rxe/rxe_qp.c >>>> @@ -176,6 +176,10 @@ static void rxe_qp_init_misc(struct rxe_dev *rxe, struct rxe_qp *qp, >>>> spin_lock_init(&qp->rq.producer_lock); >>>> spin_lock_init(&qp->rq.consumer_lock); >>>> + memset(&qp->req.task, 0, sizeof(struct rxe_task)); >>>> + memset(&qp->comp.task, 0, sizeof(struct rxe_task)); >>>> + memset(&qp->resp.task, 0, sizeof(struct rxe_task)); >>> IMHO QP is already zeroed here. >> Sure. Exactly. Here I just confirm that req.task, comp.task and resp.task >> are zeroed explicitly. > There is no need to do so. It is quite misleading to read the code and > see these memset() functions as they give false impression that QP is > not zeroed. I will remove these memset function in the latest commit. Thanks, Zhu Yanjun > >> If you think it had better remove these memset functions, I will follow your >> advice. > Yes, please. > > Thanks ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2023-04-04 6:42 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-02-28 15:14 [syzbot] [rdma?] INFO: trying to register non-static key in rxe_cleanup_task (2) syzbot 2023-04-01 2:09 ` Zhu Yanjun 2023-04-01 2:44 ` [PATCH 1/1] RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task" Zhu Yanjun 2023-04-03 18:10 ` Leon Romanovsky 2023-04-04 0:13 ` Zhu Yanjun 2023-04-04 5:58 ` Leon Romanovsky 2023-04-04 6:42 ` Zhu Yanjun
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.