* [PATCH net] netrom: fix possible deadlock in nr_rt_device_down
From: Denis Arefev @ 2025-06-05 10:54 UTC
To: David S. Miller
Cc: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
Nikita Marushkin, Ilya Shchipletsov, Hongbo Li, linux-hams,
netdev, linux-kernel, lvc-project, stable,
syzbot+ccdfb85a561b973219c7
Syzkaller detected a possible deadlock in nr_rt_device_down() [1]:
two concurrent threads acquire nr_neigh_list_lock and
nr_node_list_lock in opposite orders and deadlock.

CPU0
----
nr_rt_device_down()
|-> spin_lock_bh(&nr_neigh_list_lock);     /* acquired */
. . .
|-> spin_lock_bh(&nr_node_list_lock);      /* waits for CPU1: deadlock */

CPU1
----
nr_del_node()
|-> spin_lock_bh(&nr_node_list_lock);      /* acquired */
. . .
|-> nr_remove_neigh(nr_neigh);
    |-> spin_lock_bh(&nr_neigh_list_lock); /* waits for CPU0: deadlock */

Make sure nr_neigh_list_lock is always acquired before
nr_node_list_lock.
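For illustration only (not part of the patch): a minimal user-space
model of the same ABBA pattern, with pthread mutexes standing in for
spin_lock_bh(). Every name below is hypothetical.

/*
 * Illustration only, not kernel code.
 * Build: gcc -pthread abba.c
 */
#include <pthread.h>

static pthread_mutex_t neigh_list_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t node_list_lock  = PTHREAD_MUTEX_INITIALIZER;

/* models nr_rt_device_down(): neigh lock first, then node lock */
static void *rt_device_down(void *arg)
{
	pthread_mutex_lock(&neigh_list_lock);
	pthread_mutex_lock(&node_list_lock);  /* blocks if del_node() holds it */
	pthread_mutex_unlock(&node_list_lock);
	pthread_mutex_unlock(&neigh_list_lock);
	return NULL;
}

/* models the pre-fix nr_del_node(): node lock first, the opposite order */
static void *del_node(void *arg)
{
	pthread_mutex_lock(&node_list_lock);
	pthread_mutex_lock(&neigh_list_lock); /* blocks if rt_device_down()
	                                       * holds it: ABBA deadlock */
	pthread_mutex_unlock(&neigh_list_lock);
	pthread_mutex_unlock(&node_list_lock);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, rt_device_down, NULL);
	pthread_create(&b, NULL, del_node, NULL);
	pthread_join(a, NULL);  /* may hang forever, as lockdep predicts */
	pthread_join(b, NULL);
	return 0;
}

Swapping the two lock calls in del_node() so that both threads take
neigh_list_lock first (which is what this patch does for nr_del_node())
removes the cycle.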
[1]
WARNING: possible circular locking dependency detected
6.15.0-rc2-syzkaller-00278-gfc96b232f8e7 #0 Not tainted
------------------------------------------------------
syz-executor107/6105 is trying to acquire lock:
ffffffff902543b8 (nr_node_list_lock){+...}-{3:3}, at: spin_lock_bh include/linux/spinlock.h:356 [inline]
ffffffff902543b8 (nr_node_list_lock){+...}-{3:3}, at: nr_rt_device_down+0xb5/0x7b0 net/netrom/nr_route.c:517
but task is already holding lock:
ffffffff90254358 (nr_neigh_list_lock){+...}-{3:3}, at: spin_lock_bh include/linux/spinlock.h:356 [inline]
ffffffff90254358 (nr_neigh_list_lock){+...}-{3:3}, at: nr_rt_device_down+0x28/0x7b0 net/netrom/nr_route.c:514
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (nr_neigh_list_lock){+...}-{3:3}:
lock_acquire+0x116/0x2f0 kernel/locking/lockdep.c:5866
__raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
_raw_spin_lock_bh+0x35/0x50 kernel/locking/spinlock.c:178
spin_lock_bh include/linux/spinlock.h:356 [inline]
nr_remove_neigh net/netrom/nr_route.c:307 [inline]
nr_dec_obs net/netrom/nr_route.c:472 [inline]
nr_rt_ioctl+0x39a/0xff0 net/netrom/nr_route.c:692
sock_do_ioctl+0x152/0x400 net/socket.c:1190
sock_ioctl+0x644/0x900 net/socket.c:1311
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:906 [inline]
__se_sys_ioctl+0xf1/0x160 fs/ioctl.c:892
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xf3/0x210 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
-> #1 (&nr_node->node_lock){+...}-{3:3}:
lock_acquire+0x116/0x2f0 kernel/locking/lockdep.c:5866
__raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
_raw_spin_lock_bh+0x35/0x50 kernel/locking/spinlock.c:178
spin_lock_bh include/linux/spinlock.h:356 [inline]
nr_node_lock include/net/netrom.h:152 [inline]
nr_dec_obs net/netrom/nr_route.c:459 [inline]
nr_rt_ioctl+0x194/0xff0 net/netrom/nr_route.c:692
sock_do_ioctl+0x152/0x400 net/socket.c:1190
sock_ioctl+0x644/0x900 net/socket.c:1311
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:906 [inline]
__se_sys_ioctl+0xf1/0x160 fs/ioctl.c:892
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xf3/0x210 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
-> #0 (nr_node_list_lock){+...}-{3:3}:
check_prev_add kernel/locking/lockdep.c:3166 [inline]
check_prevs_add kernel/locking/lockdep.c:3285 [inline]
validate_chain+0xa69/0x24e0 kernel/locking/lockdep.c:3909
__lock_acquire+0xad5/0xd80 kernel/locking/lockdep.c:5235
lock_acquire+0x116/0x2f0 kernel/locking/lockdep.c:5866
__raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
_raw_spin_lock_bh+0x35/0x50 kernel/locking/spinlock.c:178
spin_lock_bh include/linux/spinlock.h:356 [inline]
nr_rt_device_down+0xb5/0x7b0 net/netrom/nr_route.c:517
nr_device_event+0x134/0x150 net/netrom/af_netrom.c:126
notifier_call_chain+0x1a5/0x3f0 kernel/notifier.c:85
__dev_notify_flags+0x209/0x410 net/core/dev.c:-1
netif_change_flags+0xf0/0x1a0 net/core/dev.c:9434
dev_change_flags+0x146/0x270 net/core/dev_api.c:68
dev_ioctl+0x80f/0x1260 net/core/dev_ioctl.c:821
sock_do_ioctl+0x22f/0x400 net/socket.c:1204
sock_ioctl+0x644/0x900 net/socket.c:1311
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:906 [inline]
__se_sys_ioctl+0xf1/0x160 fs/ioctl.c:892
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xf3/0x210 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
other info that might help us debug this:
Chain exists of:
nr_node_list_lock --> &nr_node->node_lock --> nr_neigh_list_lock
Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(nr_neigh_list_lock);
                               lock(&nr_node->node_lock);
                               lock(nr_neigh_list_lock);
  lock(nr_node_list_lock);
*** DEADLOCK ***
2 locks held by syz-executor107/6105:
#0: ffffffff900fd788 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_net_lock include/linux/rtnetlink.h:130 [inline]
#0: ffffffff900fd788 (rtnl_mutex){+.+.}-{4:4}, at: dev_ioctl+0x7fd/0x1260 net/core/dev_ioctl.c:820
#1: ffffffff90254358 (nr_neigh_list_lock){+...}-{3:3}, at: spin_lock_bh include/linux/spinlock.h:356 [inline]
#1: ffffffff90254358 (nr_neigh_list_lock){+...}-{3:3}, at: nr_rt_device_down+0x28/0x7b0 net/netrom/nr_route.c:514
stack backtrace:
CPU: 0 UID: 0 PID: 6105 Comm: syz-executor107 Not tainted 6.15.0-rc2-syzkaller-00278-gfc96b232f8e7 #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
print_circular_bug+0x2e1/0x300 kernel/locking/lockdep.c:2079
check_noncircular+0x142/0x160 kernel/locking/lockdep.c:2211
check_prev_add kernel/locking/lockdep.c:3166 [inline]
check_prevs_add kernel/locking/lockdep.c:3285 [inline]
validate_chain+0xa69/0x24e0 kernel/locking/lockdep.c:3909
__lock_acquire+0xad5/0xd80 kernel/locking/lockdep.c:5235
lock_acquire+0x116/0x2f0 kernel/locking/lockdep.c:5866
__raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
_raw_spin_lock_bh+0x35/0x50 kernel/locking/spinlock.c:178
spin_lock_bh include/linux/spinlock.h:356 [inline]
nr_rt_device_down+0xb5/0x7b0 net/netrom/nr_route.c:517
nr_device_event+0x134/0x150 net/netrom/af_netrom.c:126
notifier_call_chain+0x1a5/0x3f0 kernel/notifier.c:85
__dev_notify_flags+0x209/0x410 net/core/dev.c:-1
netif_change_flags+0xf0/0x1a0 net/core/dev.c:9434
dev_change_flags+0x146/0x270 net/core/dev_api.c:68
dev_ioctl+0x80f/0x1260 net/core/dev_ioctl.c:821
sock_do_ioctl+0x22f/0x400 net/socket.c:1204
sock_ioctl+0x644/0x900 net/socket.c:1311
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:906 [inline]
__se_sys_ioctl+0xf1/0x160 fs/ioctl.c:892
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xf3/0x210 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Reported-by: syzbot+ccdfb85a561b973219c7@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=ccdfb85a561b973219c7
Signed-off-by: Denis Arefev <arefev@swemel.ru>
---
net/netrom/nr_route.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/net/netrom/nr_route.c b/net/netrom/nr_route.c
index b94cb2ffbaf8..aae0923dbcf0 100644
--- a/net/netrom/nr_route.c
+++ b/net/netrom/nr_route.c
@@ -331,6 +331,7 @@ static int nr_del_node(ax25_address *callsign, ax25_address *neighbour, struct n
 		return -EINVAL;
 	}
 
+	spin_lock_bh(&nr_neigh_list_lock);
 	spin_lock_bh(&nr_node_list_lock);
 	nr_node_lock(nr_node);
 	for (i = 0; i < nr_node->count; i++) {
@@ -339,7 +340,7 @@ static int nr_del_node(ax25_address *callsign, ax25_address *neighbour, struct n
 			nr_neigh_put(nr_neigh);
 
 			if (nr_neigh->count == 0 && !nr_neigh->locked)
-				nr_remove_neigh(nr_neigh);
+				nr_remove_neigh_locked(nr_neigh);
 			nr_neigh_put(nr_neigh);
 
 			nr_node->count--;
@@ -361,13 +362,14 @@ static int nr_del_node(ax25_address *callsign, ax25_address *neighbour, struct n
 			}
 			nr_node_unlock(nr_node);
 			spin_unlock_bh(&nr_node_list_lock);
-
+			spin_unlock_bh(&nr_neigh_list_lock);
 			return 0;
 		}
 	}
 	nr_neigh_put(nr_neigh);
 	nr_node_unlock(nr_node);
 	spin_unlock_bh(&nr_node_list_lock);
+	spin_unlock_bh(&nr_neigh_list_lock);
 	nr_node_put(nr_node);
 
 	return -EINVAL;
--
2.43.0
* Re: [PATCH net] netrom: fix possible deadlock in nr_rt_device_down
From: Jakub Kicinski @ 2025-06-09 22:57 UTC
To: Denis Arefev
Cc: David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
Nikita Marushkin, Ilya Shchipletsov, Hongbo Li, linux-hams,
netdev, linux-kernel, lvc-project, stable,
syzbot+ccdfb85a561b973219c7
On Thu, 5 Jun 2025 13:54:48 +0300 Denis Arefev wrote:
> Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
I suspect syzbot is the only user of this code.
Can you delete one of the locks to make the maintenance simpler?
At a glance - I don't think your fix covers all the ABBA cases.
--
pw-bot: cr
* Re: [PATCH net] netrom: fix possible deadlock in nr_rt_device_down
From: Jakub Kicinski @ 2025-06-09 23:26 UTC
To: David Ranch
Cc: Denis Arefev, David S. Miller, Eric Dumazet, Paolo Abeni,
Simon Horman, Nikita Marushkin, Ilya Shchipletsov, Hongbo Li,
linux-hams, netdev, linux-kernel, lvc-project, stable,
syzbot+ccdfb85a561b973219c7
On Mon, 9 Jun 2025 16:16:32 -0700 David Ranch wrote:
> I'm not sure what you mean by "the only user of this code". There are
> many people using the Linux AX.25 + NETROM stack, but we unfortunately
> don't have an active kernel maintainer for this code today.
Alright, sorry. Either way - these locks are not performance critical
for you, right?
* Re: [PATCH net] netrom: fix possible deadlock in nr_rt_device_down
From: David Ranch @ 2025-06-09 23:30 UTC
To: Jakub Kicinski
Cc: Denis Arefev, David S. Miller, Eric Dumazet, Paolo Abeni,
Simon Horman, Nikita Marushkin, Ilya Shchipletsov, Hongbo Li,
linux-hams, netdev, linux-kernel, lvc-project, stable,
syzbot+ccdfb85a561b973219c7
That's unclear to me, but maybe someone else who knows the code better
than I do can chime in. I have to assume these locks are present
for a reason.
--David
KI6ZHD
On 06/09/2025 04:26 PM, Jakub Kicinski wrote:
> On Mon, 9 Jun 2025 16:16:32 -0700 David Ranch wrote:
>> I'm not sure what you mean by "the only user of this code". There are
>> many people using the Linux AX.25 + NETROM stack, but we unfortunately
>> don't have an active kernel maintainer for this code today.
>
> Alright, sorry. Either way - these locks are not performance critical
> for you, right?
>
* Re: [PATCH net] netrom: fix possible deadlock in nr_rt_device_down
From: Dan Cross @ 2025-06-10 13:36 UTC
To: David Ranch
Cc: Jakub Kicinski, Denis Arefev, David S. Miller, Eric Dumazet,
Paolo Abeni, Simon Horman, Nikita Marushkin, Ilya Shchipletsov,
Hongbo Li, linux-hams, netdev, linux-kernel, lvc-project, stable,
syzbot+ccdfb85a561b973219c7
On Mon, Jun 9, 2025 at 7:31 PM David Ranch <linux-hams@trinnet.net> wrote:
> That's unclear to me, but maybe someone else who knows the code better
> than I do can chime in. I have to assume these locks are present
> for a reason.
The suggestion was not to remove locking, but rather, to fold multiple
separate locks into one. That is, have a single lock that covers both
the neighbor list and the node list. Naturally, there would be more
contention around a single lock in contrast to multiple, more granular
locks. But given that NETROM has very low performance requirements,
and that the data that these locks protect doesn't change that often,
that's probably fine and would eliminate the possibility of deadlock
due to lock ordering issues.
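To make that concrete: a hypothetical, untested sketch. The name
nr_list_lock is invented, and nothing below is an actual patch:

/* One lock guarding both nr_node_list and nr_neigh_list; it would
 * replace nr_node_list_lock and nr_neigh_list_lock in nr_route.c.
 */
static DEFINE_SPINLOCK(nr_list_lock);

static void nr_rt_device_down_sketch(struct net_device *dev)
{
	spin_lock_bh(&nr_list_lock);
	/*
	 * Walk both nr_node_list and nr_neigh_list here.  nr_del_node(),
	 * nr_dec_obs() and every other writer would take this same lock,
	 * so no two code paths can ever acquire the lists in opposite
	 * orders and the ABBA deadlock becomes structurally impossible.
	 */
	spin_unlock_bh(&nr_list_lock);
}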
- Dan C.
> On 06/09/2025 04:26 PM, Jakub Kicinski wrote:
> > On Mon, 9 Jun 2025 16:16:32 -0700 David Ranch wrote:
> >> I'm not sure what you mean by "the only user of this code". There are
> >> many people using the Linux AX.25 + NETROM stack, but we unfortunately
> >> don't have an active kernel maintainer for this code today.
> >
> > Alright, sorry. Either way - these locks are not performance critical
> > for you, right?
> >
>
* Re: [PATCH net] netrom: fix possible deadlock in nr_rt_device_down
From: David Ranch @ 2025-06-10 17:00 UTC
To: Dan Cross
Cc: Jakub Kicinski, Denis Arefev, David S. Miller, Eric Dumazet,
Paolo Abeni, Simon Horman, Nikita Marushkin, Ilya Shchipletsov,
Hongbo Li, linux-hams, netdev, linux-kernel, lvc-project, stable,
syzbot+ccdfb85a561b973219c7
Yes, this seems like a reasonable approach, though I understand all this
code is old and overly complicated, and that when changes are proposed,
little to no proper testing is done before they are committed, so
problems take a very long time to get properly fixed.

I only bring this up because the Linux AX.25 community has been badly
bitten by similar commits in the last few years. I've tried to help find
a new maintainer and/or find somewhere to create and run CI tests to
catch issues, but I've been unsuccessful so far.

I am happy to help on the testing side once I know what the test
harness is, but I'm out of my league when it comes to the code side.
--David
KI6ZHD
On 06/10/2025 06:36 AM, Dan Cross wrote:
> On Mon, Jun 9, 2025 at 7:31 PM David Ranch <linux-hams@trinnet.net> wrote:
>> That's unclear to me, but maybe someone else who knows the code better
>> than I do can chime in. I have to assume these locks are present
>> for a reason.
>
> The suggestion was not to remove locking, but rather, to fold multiple
> separate locks into one. That is, have a single lock that covers both
> the neighbor list and the node list. Naturally, there would be more
> contention around a single lock in contrast to multiple, more granular
> locks. But given that NETROM has very low performance requirements,
> and that the data that these locks protect doesn't change that often,
> that's probably fine and would eliminate the possibility of deadlock
> due to lock ordering issues.
>
> - Dan C.
>
>> On 06/09/2025 04:26 PM, Jakub Kicinski wrote:
>>> On Mon, 9 Jun 2025 16:16:32 -0700 David Ranch wrote:
>>>> I'm not sure what you mean by "the only user of this code". There are
>>>> many people using the Linux AX.25 + NETROM stack, but we unfortunately
>>>> don't have an active kernel maintainer for this code today.
>>>
>>> Alright, sorry. Either way - these locks are not performance critical
>>> for you, right?
>>>
>>