From: Leon Romanovsky <leon@kernel.org>
To: Lin Ma <linma@zju.edu.cn>
Cc: jgg@ziepe.ca, cmeiohas@nvidia.com, michaelgur@nvidia.com,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [bug report] RDMA/iwpm: reentrant iwpm hello message
Date: Tue, 24 Dec 2024 11:29:38 +0200
Message-ID: <20241224092938.GC171473@unreal>
In-Reply-To: <661ee85f.a4a2.193e4b2f91b.Coremail.linma@zju.edu.cn>
On Fri, Dec 20, 2024 at 11:32:34PM +0800, Lin Ma wrote:
> Hello maintainers,
>
> Our fuzzer identified an interesting reentrancy bug that could cause a hang
> in the kernel. The lockdep report is below:
>
> [ 32.616575][ T2983]
> [ 32.617000][ T2983] ============================================
> [ 32.617879][ T2983] WARNING: possible recursive locking detected
> [ 32.618759][ T2983] 6.1.70 #1 Not tainted
> [ 32.619362][ T2983] --------------------------------------------
> [ 32.620248][ T2983] hello.elf/2983 is trying to acquire lock:
> [ 32.621084][ T2983] ffffffff91978ff8 (&rdma_nl_types[idx].sem){.+.+}-{3:3}, at: rdma_nl_rcv+0x30f/0x990
> [ 32.624234][ T2983]
> [ 32.624234][ T2983] but task is already holding lock:
> [ 32.625237][ T2983] ffffffff91978ff8 (&rdma_nl_types[idx].sem){.+.+}-{3:3}, at: rdma_nl_rcv+0x30f/0x990
> [ 32.626562][ T2983]
> [ 32.626562][ T2983] other info that might help us debug this:
> [ 32.627648][ T2983] Possible unsafe locking scenario:
> [ 32.627648][ T2983]
> [ 32.633708][ T2983] CPU0
> [ 32.634184][ T2983] ----
> [ 32.634646][ T2983] lock(&rdma_nl_types[idx].sem);
> [ 32.635433][ T2983] lock(&rdma_nl_types[idx].sem);
> [ 32.636155][ T2983]
> [ 32.636155][ T2983] *** DEADLOCK ***
> [ 32.636155][ T2983]
> [ 32.637236][ T2983] May be due to missing lock nesting notation
> [ 32.637236][ T2983]
> [ 32.638408][ T2983] 2 locks held by hello.elf/2983:
> [ 32.639135][ T2983] #0: ffffffff91978ff8 (&rdma_nl_types[idx].sem){.+.+}-{3:3}, at: rdma_nl_rcv+0x30f/0x990
> [ 32.640605][ T2983] #1: ffff888103f8f690 (nlk_cb_mutex-RDMA){+.+.}-{3:3}, at: netlink_dump+0xd3/0xc60
> [ 32.641981][ T2983]
> [ 32.641981][ T2983] stack backtrace:
> [ 32.642833][ T2983] CPU: 0 PID: 2983 Comm: hello.elf Not tainted 6.1.70 #1
> [ 32.643830][ T2983] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> [ 32.645243][ T2983] Call Trace:
> [ 32.645735][ T2983] <TASK>
> [ 32.646197][ T2983] dump_stack_lvl+0x177/0x231
> [ 32.646901][ T2983] ? nf_tcp_handle_invalid+0x605/0x605
> [ 32.647705][ T2983] ? panic+0x725/0x725
> [ 32.648350][ T2983] validate_chain+0x4dd0/0x6010
> [ 32.649080][ T2983] ? reacquire_held_locks+0x5a0/0x5a0
> [ 32.649864][ T2983] ? mark_lock+0x94/0x320
> [ 32.650506][ T2983] ? lockdep_hardirqs_on_prepare+0x3fd/0x760
> [ 32.651376][ T2983] ? print_irqtrace_events+0x210/0x210
> [ 32.652182][ T2983] ? mark_lock+0x94/0x320
> [ 32.652825][ T2983] __lock_acquire+0x12ad/0x2010
> [ 32.653541][ T2983] lock_acquire+0x1b4/0x490
> [ 32.654211][ T2983] ? rdma_nl_rcv+0x30f/0x990
> [ 32.654891][ T2983] ? __might_sleep+0xd0/0xd0
> [ 32.655569][ T2983] ? __lock_acquire+0x12ad/0x2010
> [ 32.656316][ T2983] ? read_lock_is_recursive+0x10/0x10
> [ 32.657109][ T2983] down_read+0x42/0x2d0
> [ 32.657723][ T2983] ? rdma_nl_rcv+0x30f/0x990
> [ 32.658400][ T2983] rdma_nl_rcv+0x30f/0x990
> [ 32.659132][ T2983] ? rdma_nl_net_init+0x160/0x160
> [ 32.659847][ T2983] ? netlink_lookup+0x30/0x200
> [ 32.660519][ T2983] ? __netlink_lookup+0x2a/0x6d0
> [ 32.661214][ T2983] ? netlink_lookup+0x30/0x200
> [ 32.661880][ T2983] ? netlink_lookup+0x30/0x200
> [ 32.662545][ T2983] netlink_unicast+0x74b/0x8c0
> [ 32.663215][ T2983] rdma_nl_unicast+0x4b/0x60
> [ 32.663852][ T2983] iwpm_send_hello+0x1d8/0x350
> [ 32.664525][ T2983] ? iwpm_mapinfo_available+0x130/0x130
> [ 32.665295][ T2983] ? iwpm_parse_nlmsg+0x124/0x260
> [ 32.665995][ T2983] iwpm_hello_cb+0x1e1/0x2e0
> [ 32.666638][ T2983] ? netlink_dump+0x236/0xc60
> [ 32.667294][ T2983] ? iwpm_mapping_error_cb+0x3e0/0x3e0
> [ 32.668064][ T2983] netlink_dump+0x592/0xc60
> [ 32.668706][ T2983] ? netlink_lookup+0x200/0x200
> [ 32.669381][ T2983] ? __netlink_lookup+0x2a/0x6d0
> [ 32.670073][ T2983] ? netlink_lookup+0x30/0x200
> [ 32.670731][ T2983] ? netlink_lookup+0x30/0x200
> [ 32.671411][ T2983] __netlink_dump_start+0x54e/0x710
> [ 32.672220][ T2983] rdma_nl_rcv+0x753/0x990
> [ 32.672846][ T2983] ? rdma_nl_net_init+0x160/0x160
> [ 32.673538][ T2983] ? iwpm_mapping_error_cb+0x3e0/0x3e0
> [ 32.674316][ T2983] ? netlink_deliver_tap+0x2e/0x1b0
> [ 32.675106][ T2983] ? net_generic+0x1e/0x240
> [ 32.675778][ T2983] ? netlink_deliver_tap+0x2e/0x1b0
> [ 32.676553][ T2983] netlink_unicast+0x74b/0x8c0
> [ 32.677262][ T2983] netlink_sendmsg+0x882/0xb90
> [ 32.677969][ T2983] ? netlink_getsockopt+0x550/0x550
> [ 32.678732][ T2983] ? aa_sock_msg_perm+0x94/0x150
> [ 32.679465][ T2983] ? bpf_lsm_socket_sendmsg+0x5/0x10
> [ 32.680243][ T2983] ? security_socket_sendmsg+0x7c/0xa0
> [ 32.681048][ T2983] __sys_sendto+0x456/0x5b0
> [ 32.681724][ T2983] ? __ia32_sys_getpeername+0x80/0x80
> [ 32.682510][ T2983] ? __lock_acquire+0x2010/0x2010
> [ 32.683241][ T2983] ? lockdep_hardirqs_on_prepare+0x3fd/0x760
> [ 32.684134][ T2983] ? fd_install+0x5c/0x4f0
> [ 32.684794][ T2983] ? print_irqtrace_events+0x210/0x210
> [ 32.685608][ T2983] __x64_sys_sendto+0xda/0xf0
> [ 32.686298][ T2983] do_syscall_64+0x45/0x90
> [ 32.686955][ T2983] entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [ 32.687810][ T2983] RIP: 0033:0x440624
> [ 32.691944][ T2983] RSP: 002b:00007ffc82f3ee48 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
> [ 32.693136][ T2983] RAX: ffffffffffffffda RBX: 0000000000400400 RCX: 0000000000440624
> [ 32.694264][ T2983] RDX: 0000000000000018 RSI: 00007ffc82f3ee80 RDI: 0000000000000003
> [ 32.695387][ T2983] RBP: 00007ffc82f3fe90 R08: 000000000047df08 R09: 000000000000000c
> [ 32.696621][ T2983] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000403990
> [ 32.697693][ T2983] R13: 0000000000000000 R14: 00000000006a6018 R15: 0000000000000000
> [ 32.698774][ T2983] </TASK>
>
> In a nutshell, the callback function for the command RDMA_NL_IWPM_HELLO,
> iwpm_hello_cb, can in turn call rdma_nl_unicast, which re-enters rdma_nl_rcv
> and takes the same rdma_nl_types[idx].sem read lock again (see the call trace
> above). This nested acquisition may cause a deadlock.
>
> I am not familiar with the internal workings of the callback mechanism or how
> IWPMD utilizes it, so I'm uncertain whether this reentrancy is expected behavior.
> If it is, perhaps a reference counter should be used instead of an rw_semaphore.
> If not, a proper check should be implemented.

I don't fully understand the lockdep report here. We use down_read(), which
is reentry safe.

Thanks
>
> Regards,
> Lin
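
The question of whether a nested down_read() on the same rw_semaphore is safe can be explored with a toy user-space model. The sketch below is hypothetical and is not the kernel's rwsem implementation: the names toy_rwsem, toy_down_read, toy_up_read, toy_writer_arrives and nested_read_would_block are invented for illustration. It assumes a fair semaphore in which a queued writer blocks new readers. Under that assumption, the nested read succeeds when no writer shows up, but would sleep forever if a writer queues between the outer and the inner acquisition, which is one plausible reason for lockdep to treat the nested down_read() as a possible deadlock.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a fair reader/writer semaphore (NOT the kernel's rwsem code). */
struct toy_rwsem {
    int readers;          /* number of current read holders */
    bool writer_waiting;  /* a writer is queued behind the readers */
};

/* Returns true if the read lock is granted, false if the caller would sleep. */
static bool toy_down_read(struct toy_rwsem *sem)
{
    if (sem->writer_waiting)
        return false;     /* assumption: a queued writer blocks new readers */
    sem->readers++;
    return true;
}

static void toy_up_read(struct toy_rwsem *sem)
{
    assert(sem->readers > 0);
    sem->readers--;
}

/* A writer arrives; it cannot proceed while readers hold the lock, so it queues. */
static void toy_writer_arrives(struct toy_rwsem *sem)
{
    if (sem->readers > 0)
        sem->writer_waiting = true;
}

/*
 * Models the call chain in the report: an outer handler takes the read lock,
 * then its callback re-enters the same handler and takes the lock again.
 * Returns true if the nested acquisition would block (a self-deadlock, since
 * the outer read lock is never released while the inner one sleeps).
 */
static bool nested_read_would_block(bool writer_arrives_in_between)
{
    struct toy_rwsem sem = { 0, false };
    bool inner_ok;

    if (!toy_down_read(&sem))        /* outer acquisition */
        return true;
    if (writer_arrives_in_between)
        toy_writer_arrives(&sem);    /* writer queues behind the outer reader */
    inner_ok = toy_down_read(&sem);  /* nested acquisition from the callback */
    if (inner_ok)
        toy_up_read(&sem);
    toy_up_read(&sem);
    return !inner_ok;
}
```

Calling nested_read_would_block(false) models the uncontended case, and nested_read_would_block(true) models a writer queuing between the two read acquisitions. Whether such a writer can actually race with this path in the RDMA netlink code is not established by the sketch.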