* [syzbot] [bluetooth?] general protection fault in l2cap_sock_recv_cb
@ 2024-06-07 6:02 syzbot
2024-06-11 14:50 ` [PATCH] bluetooth/l2cap: sync sock recv cb and release Edward Adam Davis
0 siblings, 1 reply; 15+ messages in thread
From: syzbot @ 2024-06-07 6:02 UTC (permalink / raw)
To: johan.hedberg, linux-bluetooth, linux-kernel, luiz.dentz, marcel,
syzkaller-bugs
Hello,
syzbot found the following issue on:
HEAD commit: cc8ed4d0a848 Merge tag 'drm-fixes-2024-06-01' of https://g..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=158b9cf2980000
kernel config: https://syzkaller.appspot.com/x/.config?x=47d282ddffae809f
dashboard link: https://syzkaller.appspot.com/bug?extid=b7f6f8c9303466e16c8a
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=15236f14980000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1200d314980000
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/e4cc470367bc/disk-cc8ed4d0.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/614492e3b597/vmlinux-cc8ed4d0.xz
kernel image: https://storage.googleapis.com/syzbot-assets/fde890f9e46d/bzImage-cc8ed4d0.xz
Bisection is inconclusive: the issue happens on the oldest tested release.
bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=113ad9ba980000
final oops: https://syzkaller.appspot.com/x/report.txt?x=133ad9ba980000
console output: https://syzkaller.appspot.com/x/log.txt?x=153ad9ba980000
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+b7f6f8c9303466e16c8a@syzkaller.appspotmail.com
Oops: general protection fault, probably for non-canonical address 0xdffffc000000002e: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000170-0x0000000000000177]
CPU: 1 PID: 53 Comm: kworker/u9:0 Not tainted 6.10.0-rc1-syzkaller-00267-gcc8ed4d0a848 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
Workqueue: hci0 hci_rx_work
RIP: 0010:l2cap_publish_rx_avail net/bluetooth/l2cap_sock.c:1137 [inline]
RIP: 0010:l2cap_sock_recv_cb+0x1af/0x4f0 net/bluetooth/l2cap_sock.c:1509
Code: 80 3c 07 00 74 08 4c 89 ef e8 dd dd 7b f7 4d 8b 7d 00 49 8d bf 74 01 00 00 48 89 f8 48 c1 e8 03 49 bd 00 00 00 00 00 fc ff df <42> 0f b6 04 28 84 c0 0f 85 b5 02 00 00 41 8b 9f 74 01 00 00 49 81
RSP: 0018:ffffc90000bd7468 EFLAGS: 00010207
RAX: 000000000000002e RBX: ffff88801f6cc000 RCX: ffff888016398000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000174
RBP: ffff88801f6cd000 R08: ffffffff89459100 R09: 1ffff11003ed980c
R10: dffffc0000000000 R11: ffffed1003ed980d R12: 1ffff11003ed9a05
R13: dffffc0000000000 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055557f1dd6c8 CR3: 000000006f77c000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
l2cap_conless_channel net/bluetooth/l2cap_core.c:6780 [inline]
l2cap_recv_frame+0x8b6d/0x10670 net/bluetooth/l2cap_core.c:6833
hci_acldata_packet net/bluetooth/hci_core.c:3842 [inline]
hci_rx_work+0x50f/0xca0 net/bluetooth/hci_core.c:4079
process_one_work kernel/workqueue.c:3231 [inline]
process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3312
worker_thread+0x86d/0xd70 kernel/workqueue.c:3393
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:l2cap_publish_rx_avail net/bluetooth/l2cap_sock.c:1137 [inline]
RIP: 0010:l2cap_sock_recv_cb+0x1af/0x4f0 net/bluetooth/l2cap_sock.c:1509
Code: 80 3c 07 00 74 08 4c 89 ef e8 dd dd 7b f7 4d 8b 7d 00 49 8d bf 74 01 00 00 48 89 f8 48 c1 e8 03 49 bd 00 00 00 00 00 fc ff df <42> 0f b6 04 28 84 c0 0f 85 b5 02 00 00 41 8b 9f 74 01 00 00 49 81
RSP: 0018:ffffc90000bd7468 EFLAGS: 00010207
RAX: 000000000000002e RBX: ffff88801f6cc000 RCX: ffff888016398000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000174
RBP: ffff88801f6cd000 R08: ffffffff89459100 R09: 1ffff11003ed980c
R10: dffffc0000000000 R11: ffffed1003ed980d R12: 1ffff11003ed9a05
R13: dffffc0000000000 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055557f1dd6c8 CR3: 000000000e132000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
----------------
Code disassembly (best guess):
0: 80 3c 07 00 cmpb $0x0,(%rdi,%rax,1)
4: 74 08 je 0xe
6: 4c 89 ef mov %r13,%rdi
9: e8 dd dd 7b f7 call 0xf77bddeb
e: 4d 8b 7d 00 mov 0x0(%r13),%r15
12: 49 8d bf 74 01 00 00 lea 0x174(%r15),%rdi
19: 48 89 f8 mov %rdi,%rax
1c: 48 c1 e8 03 shr $0x3,%rax
20: 49 bd 00 00 00 00 00 movabs $0xdffffc0000000000,%r13
27: fc ff df
* 2a: 42 0f b6 04 28 movzbl (%rax,%r13,1),%eax <-- trapping instruction
2f: 84 c0 test %al,%al
31: 0f 85 b5 02 00 00 jne 0x2ec
37: 41 8b 9f 74 01 00 00 mov 0x174(%r15),%ebx
3e: 49 rex.WB
3f: 81 .byte 0x81
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.
If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)
If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report
If you want to undo deduplication, reply with:
#syz undup
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH] bluetooth/l2cap: sync sock recv cb and release
2024-06-07 6:02 [syzbot] [bluetooth?] general protection fault in l2cap_sock_recv_cb syzbot
@ 2024-06-11 14:50 ` Edward Adam Davis
2024-06-11 19:24 ` Luiz Augusto von Dentz
0 siblings, 1 reply; 15+ messages in thread
From: Edward Adam Davis @ 2024-06-11 14:50 UTC (permalink / raw)
To: syzbot+b7f6f8c9303466e16c8a
Cc: johan.hedberg, linux-bluetooth, linux-kernel, luiz.dentz, marcel,
syzkaller-bugs
The problem occurs between the system call to close the sock and hci_rx_work,
where the former releases the sock and the latter accesses it without lock protection.
CPU0 CPU1
---- ----
sock_close hci_rx_work
l2cap_sock_release hci_acldata_packet
l2cap_sock_kill l2cap_recv_frame
sk_free l2cap_conless_channel
l2cap_sock_recv_cb
If hci_rx_work processes the data that needs to be received before the sock is
closed, then everything is normal; Otherwise, the work thread may access the
released sock when receiving data.
Add a chan mutex in the rx callback of the sock to achieve synchronization between
the sock release and recv cb.
Reported-and-tested-by: syzbot+b7f6f8c9303466e16c8a@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
---
net/bluetooth/l2cap_sock.c | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
index 6db60946c627..f3e9236293e1 100644
--- a/net/bluetooth/l2cap_sock.c
+++ b/net/bluetooth/l2cap_sock.c
@@ -1413,6 +1413,8 @@ static int l2cap_sock_release(struct socket *sock)
l2cap_chan_hold(chan);
l2cap_chan_lock(chan);
+ if (refcount_read(&sk->sk_refcnt) == 1)
+ chan->data = NULL;
sock_orphan(sk);
l2cap_sock_kill(sk);
@@ -1481,12 +1483,22 @@ static struct l2cap_chan *l2cap_sock_new_connection_cb(struct l2cap_chan *chan)
static int l2cap_sock_recv_cb(struct l2cap_chan *chan, struct sk_buff *skb)
{
- struct sock *sk = chan->data;
- struct l2cap_pinfo *pi = l2cap_pi(sk);
+ struct sock *sk;
+ struct l2cap_pinfo *pi;
int err;
- lock_sock(sk);
+ l2cap_chan_hold(chan);
+ l2cap_chan_lock(chan);
+ sk = chan->data;
+
+ if (!sk) {
+ l2cap_chan_unlock(chan);
+ l2cap_chan_put(chan);
+ return -ENXIO;
+ }
+ pi = l2cap_pi(sk);
+ lock_sock(sk);
if (chan->mode == L2CAP_MODE_ERTM && !list_empty(&pi->rx_busy)) {
err = -ENOMEM;
goto done;
@@ -1535,6 +1547,8 @@ static int l2cap_sock_recv_cb(struct l2cap_chan *chan, struct sk_buff *skb)
done:
release_sock(sk);
+ l2cap_chan_unlock(chan);
+ l2cap_chan_put(chan);
return err;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH] bluetooth/l2cap: sync sock recv cb and release
2024-06-11 14:50 ` [PATCH] bluetooth/l2cap: sync sock recv cb and release Edward Adam Davis
@ 2024-06-11 19:24 ` Luiz Augusto von Dentz
2024-06-12 0:33 ` Edward Adam Davis
2024-06-15 1:45 ` [PATCH v2] " Edward Adam Davis
0 siblings, 2 replies; 15+ messages in thread
From: Luiz Augusto von Dentz @ 2024-06-11 19:24 UTC (permalink / raw)
To: Edward Adam Davis
Cc: syzbot+b7f6f8c9303466e16c8a, johan.hedberg, linux-bluetooth,
linux-kernel, marcel, syzkaller-bugs
Hi Edward,
On Tue, Jun 11, 2024 at 10:50 AM Edward Adam Davis <eadavis@qq.com> wrote:
>
> The problem occurs between the system call to close the sock and hci_rx_work,
> where the former releases the sock and the latter accesses it without lock protection.
>
> CPU0 CPU1
> ---- ----
> sock_close hci_rx_work
> l2cap_sock_release hci_acldata_packet
> l2cap_sock_kill l2cap_recv_frame
> sk_free l2cap_conless_channel
> l2cap_sock_recv_cb
>
> If hci_rx_work processes the data that needs to be received before the sock is
> closed, then everything is normal; Otherwise, the work thread may access the
> released sock when receiving data.
>
> Add a chan mutex in the rx callback of the sock to achieve synchronization between
> the sock release and recv cb.
>
> Reported-and-tested-by: syzbot+b7f6f8c9303466e16c8a@syzkaller.appspotmail.com
> Signed-off-by: Edward Adam Davis <eadavis@qq.com>
> ---
> net/bluetooth/l2cap_sock.c | 20 +++++++++++++++++---
> 1 file changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
> index 6db60946c627..f3e9236293e1 100644
> --- a/net/bluetooth/l2cap_sock.c
> +++ b/net/bluetooth/l2cap_sock.c
> @@ -1413,6 +1413,8 @@ static int l2cap_sock_release(struct socket *sock)
> l2cap_chan_hold(chan);
> l2cap_chan_lock(chan);
>
> + if (refcount_read(&sk->sk_refcnt) == 1)
> + chan->data = NULL;
Might be a good idea to add some comment on why checking for refcount
== 1 is the right thing to do here, or perhaps we can always assign
chan->data to NULL, instead of that perhaps we could have it done in
l2cap_sock_kill?
> sock_orphan(sk);
> l2cap_sock_kill(sk);
>
> @@ -1481,12 +1483,22 @@ static struct l2cap_chan *l2cap_sock_new_connection_cb(struct l2cap_chan *chan)
>
> static int l2cap_sock_recv_cb(struct l2cap_chan *chan, struct sk_buff *skb)
> {
> - struct sock *sk = chan->data;
> - struct l2cap_pinfo *pi = l2cap_pi(sk);
> + struct sock *sk;
> + struct l2cap_pinfo *pi;
> int err;
>
> - lock_sock(sk);
> + l2cap_chan_hold(chan);
> + l2cap_chan_lock(chan);
> + sk = chan->data;
> +
> + if (!sk) {
> + l2cap_chan_unlock(chan);
> + l2cap_chan_put(chan);
> + return -ENXIO;
> + }
>
> + pi = l2cap_pi(sk);
> + lock_sock(sk);
> if (chan->mode == L2CAP_MODE_ERTM && !list_empty(&pi->rx_busy)) {
> err = -ENOMEM;
> goto done;
> @@ -1535,6 +1547,8 @@ static int l2cap_sock_recv_cb(struct l2cap_chan *chan, struct sk_buff *skb)
>
> done:
> release_sock(sk);
> + l2cap_chan_unlock(chan);
> + l2cap_chan_put(chan);
>
> return err;
> }
> --
> 2.43.0
>
--
Luiz Augusto von Dentz
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] bluetooth/l2cap: sync sock recv cb and release
2024-06-11 19:24 ` Luiz Augusto von Dentz
@ 2024-06-12 0:33 ` Edward Adam Davis
2024-06-15 1:45 ` [PATCH v2] " Edward Adam Davis
1 sibling, 0 replies; 15+ messages in thread
From: Edward Adam Davis @ 2024-06-12 0:33 UTC (permalink / raw)
To: luiz.dentz
Cc: eadavis, johan.hedberg, linux-bluetooth, linux-kernel, marcel,
syzbot+b7f6f8c9303466e16c8a, syzkaller-bugs
Hi Luiz Augusto von Dentz,
On Tue, 11 Jun 2024 15:24:52 -0400, Luiz Augusto von Dentz wrote:
> > The problem occurs between the system call to close the sock and hci_rx_work,
> > where the former releases the sock and the latter accesses it without lock protection.
> >
> > CPU0 CPU1
> > ---- ----
> > sock_close hci_rx_work
> > l2cap_sock_release hci_acldata_packet
> > l2cap_sock_kill l2cap_recv_frame
> > sk_free l2cap_conless_channel
> > l2cap_sock_recv_cb
> >
> > If hci_rx_work processes the data that needs to be received before the sock is
> > closed, then everything is normal; Otherwise, the work thread may access the
> > released sock when receiving data.
> >
> > Add a chan mutex in the rx callback of the sock to achieve synchronization between
> > the sock release and recv cb.
> >
> > Reported-and-tested-by: syzbot+b7f6f8c9303466e16c8a@syzkaller.appspotmail.com
> > Signed-off-by: Edward Adam Davis <eadavis@qq.com>
> > ---
> > net/bluetooth/l2cap_sock.c | 20 +++++++++++++++++---
> > 1 file changed, 17 insertions(+), 3 deletions(-)
> >
> > diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
> > index 6db60946c627..f3e9236293e1 100644
> > --- a/net/bluetooth/l2cap_sock.c
> > +++ b/net/bluetooth/l2cap_sock.c
> > @@ -1413,6 +1413,8 @@ static int l2cap_sock_release(struct socket *sock)
> > l2cap_chan_hold(chan);
> > l2cap_chan_lock(chan);
> >
> > + if (refcount_read(&sk->sk_refcnt) == 1)
> > + chan->data = NULL;
>
> Might be a good idea to add some comment on why checking for refcount
> == 1 is the right thing to do here, or perhaps we can always assign
> chan->data to NULL, instead of that perhaps we could have it done in
> l2cap_sock_kill?
In this case, it is possible to always set chan->data to NULL, but I think a
better approach would be to release sock in l2cap_sock_kill when the reference
count of the sock is 1. Previously, it was mentioned that setting chan->data to
NULL is more rigorous.
And chan->data is bound one-on-one to the sock, if the sock is not released,
I can't confirm whether setting chan->data to NULL will affect the operation of
the sock on other paths.
>
> > sock_orphan(sk);
> > l2cap_sock_kill(sk);
> >
> > @@ -1481,12 +1483,22 @@ static struct l2cap_chan *l2cap_sock_new_connection_cb(struct l2cap_chan *chan)
> >
> > static int l2cap_sock_recv_cb(struct l2cap_chan *chan, struct sk_buff *skb)
> > {
> > - struct sock *sk = chan->data;
> > - struct l2cap_pinfo *pi = l2cap_pi(sk);
> > + struct sock *sk;
> > + struct l2cap_pinfo *pi;
> > int err;
> >
> > - lock_sock(sk);
> > + l2cap_chan_hold(chan);
> > + l2cap_chan_lock(chan);
> > + sk = chan->data;
> > +
> > + if (!sk) {
> > + l2cap_chan_unlock(chan);
> > + l2cap_chan_put(chan);
> > + return -ENXIO;
> > + }
> >
> > + pi = l2cap_pi(sk);
> > + lock_sock(sk);
> > if (chan->mode == L2CAP_MODE_ERTM && !list_empty(&pi->rx_busy)) {
> > err = -ENOMEM;
> > goto done;
> > @@ -1535,6 +1547,8 @@ static int l2cap_sock_recv_cb(struct l2cap_chan *chan, struct sk_buff *skb)
> >
> > done:
> > release_sock(sk);
> > + l2cap_chan_unlock(chan);
> > + l2cap_chan_put(chan);
> >
> > return err;
> > }
--
Edward
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v2] bluetooth/l2cap: sync sock recv cb and release
2024-06-11 19:24 ` Luiz Augusto von Dentz
2024-06-12 0:33 ` Edward Adam Davis
@ 2024-06-15 1:45 ` Edward Adam Davis
2024-06-17 19:00 ` patchwork-bot+bluetooth
2024-06-20 16:53 ` Luiz Augusto von Dentz
1 sibling, 2 replies; 15+ messages in thread
From: Edward Adam Davis @ 2024-06-15 1:45 UTC (permalink / raw)
To: luiz.dentz
Cc: eadavis, johan.hedberg, linux-bluetooth, linux-kernel, marcel,
syzbot+b7f6f8c9303466e16c8a, syzkaller-bugs
The problem occurs between the system call to close the sock and hci_rx_work,
where the former releases the sock and the latter accesses it without lock protection.
CPU0 CPU1
---- ----
sock_close hci_rx_work
l2cap_sock_release hci_acldata_packet
l2cap_sock_kill l2cap_recv_frame
sk_free l2cap_conless_channel
l2cap_sock_recv_cb
If hci_rx_work processes the data that needs to be received before the sock is
closed, then everything is normal; Otherwise, the work thread may access the
released sock when receiving data.
Add a chan mutex in the rx callback of the sock to achieve synchronization between
the sock release and recv cb.
Sock is dead, so set chan data to NULL, avoid others use invalid sock pointer.
Reported-and-tested-by: syzbot+b7f6f8c9303466e16c8a@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
---
net/bluetooth/l2cap_sock.c | 25 ++++++++++++++++++++++---
1 file changed, 22 insertions(+), 3 deletions(-)
diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
index 6db60946c627..f45cdf9bc985 100644
--- a/net/bluetooth/l2cap_sock.c
+++ b/net/bluetooth/l2cap_sock.c
@@ -1239,6 +1239,10 @@ static void l2cap_sock_kill(struct sock *sk)
BT_DBG("sk %p state %s", sk, state_to_string(sk->sk_state));
+ /* Sock is dead, so set chan data to NULL, avoid other task use invalid
+ * sock pointer.
+ */
+ l2cap_pi(sk)->chan->data = NULL;
/* Kill poor orphan */
l2cap_chan_put(l2cap_pi(sk)->chan);
@@ -1481,12 +1485,25 @@ static struct l2cap_chan *l2cap_sock_new_connection_cb(struct l2cap_chan *chan)
static int l2cap_sock_recv_cb(struct l2cap_chan *chan, struct sk_buff *skb)
{
- struct sock *sk = chan->data;
- struct l2cap_pinfo *pi = l2cap_pi(sk);
+ struct sock *sk;
+ struct l2cap_pinfo *pi;
int err;
- lock_sock(sk);
+ /* To avoid race with sock_release, a chan lock needs to be added here
+ * to synchronize the sock.
+ */
+ l2cap_chan_hold(chan);
+ l2cap_chan_lock(chan);
+ sk = chan->data;
+ if (!sk) {
+ l2cap_chan_unlock(chan);
+ l2cap_chan_put(chan);
+ return -ENXIO;
+ }
+
+ pi = l2cap_pi(sk);
+ lock_sock(sk);
if (chan->mode == L2CAP_MODE_ERTM && !list_empty(&pi->rx_busy)) {
err = -ENOMEM;
goto done;
@@ -1535,6 +1552,8 @@ static int l2cap_sock_recv_cb(struct l2cap_chan *chan, struct sk_buff *skb)
done:
release_sock(sk);
+ l2cap_chan_unlock(chan);
+ l2cap_chan_put(chan);
return err;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v2] bluetooth/l2cap: sync sock recv cb and release
2024-06-15 1:45 ` [PATCH v2] " Edward Adam Davis
@ 2024-06-17 19:00 ` patchwork-bot+bluetooth
2024-06-20 16:53 ` Luiz Augusto von Dentz
1 sibling, 0 replies; 15+ messages in thread
From: patchwork-bot+bluetooth @ 2024-06-17 19:00 UTC (permalink / raw)
To: Edward Adam Davis
Cc: luiz.dentz, johan.hedberg, linux-bluetooth, linux-kernel, marcel,
syzbot+b7f6f8c9303466e16c8a, syzkaller-bugs
Hello:
This patch was applied to bluetooth/bluetooth-next.git (master)
by Luiz Augusto von Dentz <luiz.von.dentz@intel.com>:
On Sat, 15 Jun 2024 09:45:54 +0800 you wrote:
> The problem occurs between the system call to close the sock and hci_rx_work,
> where the former releases the sock and the latter accesses it without lock protection.
>
> CPU0 CPU1
> ---- ----
> sock_close hci_rx_work
> l2cap_sock_release hci_acldata_packet
> l2cap_sock_kill l2cap_recv_frame
> sk_free l2cap_conless_channel
> l2cap_sock_recv_cb
>
> [...]
Here is the summary with links:
- [v2] bluetooth/l2cap: sync sock recv cb and release
https://git.kernel.org/bluetooth/bluetooth-next/c/e3203b177717
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2] bluetooth/l2cap: sync sock recv cb and release
2024-06-15 1:45 ` [PATCH v2] " Edward Adam Davis
2024-06-17 19:00 ` patchwork-bot+bluetooth
@ 2024-06-20 16:53 ` Luiz Augusto von Dentz
2024-06-21 14:45 ` [PATCH] " Edward Adam Davis
2024-06-22 3:46 ` Edward Adam Davis
1 sibling, 2 replies; 15+ messages in thread
From: Luiz Augusto von Dentz @ 2024-06-20 16:53 UTC (permalink / raw)
To: Edward Adam Davis
Cc: johan.hedberg, linux-bluetooth, linux-kernel, marcel,
syzbot+b7f6f8c9303466e16c8a, syzkaller-bugs
Hi Edward,
On Fri, Jun 14, 2024 at 9:46 PM Edward Adam Davis <eadavis@qq.com> wrote:
>
> The problem occurs between the system call to close the sock and hci_rx_work,
> where the former releases the sock and the latter accesses it without lock protection.
>
> CPU0 CPU1
> ---- ----
> sock_close hci_rx_work
> l2cap_sock_release hci_acldata_packet
> l2cap_sock_kill l2cap_recv_frame
> sk_free l2cap_conless_channel
> l2cap_sock_recv_cb
>
> If hci_rx_work processes the data that needs to be received before the sock is
> closed, then everything is normal; Otherwise, the work thread may access the
> released sock when receiving data.
>
> Add a chan mutex in the rx callback of the sock to achieve synchronization between
> the sock release and recv cb.
>
> Sock is dead, so set chan data to NULL, avoid others use invalid sock pointer.
>
> Reported-and-tested-by: syzbot+b7f6f8c9303466e16c8a@syzkaller.appspotmail.com
> Signed-off-by: Edward Adam Davis <eadavis@qq.com>
> ---
> net/bluetooth/l2cap_sock.c | 25 ++++++++++++++++++++++---
> 1 file changed, 22 insertions(+), 3 deletions(-)
>
> diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
> index 6db60946c627..f45cdf9bc985 100644
> --- a/net/bluetooth/l2cap_sock.c
> +++ b/net/bluetooth/l2cap_sock.c
> @@ -1239,6 +1239,10 @@ static void l2cap_sock_kill(struct sock *sk)
>
> BT_DBG("sk %p state %s", sk, state_to_string(sk->sk_state));
>
> + /* Sock is dead, so set chan data to NULL, avoid other task use invalid
> + * sock pointer.
> + */
> + l2cap_pi(sk)->chan->data = NULL;
> /* Kill poor orphan */
>
> l2cap_chan_put(l2cap_pi(sk)->chan);
> @@ -1481,12 +1485,25 @@ static struct l2cap_chan *l2cap_sock_new_connection_cb(struct l2cap_chan *chan)
>
> static int l2cap_sock_recv_cb(struct l2cap_chan *chan, struct sk_buff *skb)
> {
> - struct sock *sk = chan->data;
> - struct l2cap_pinfo *pi = l2cap_pi(sk);
> + struct sock *sk;
> + struct l2cap_pinfo *pi;
> int err;
>
> - lock_sock(sk);
> + /* To avoid race with sock_release, a chan lock needs to be added here
> + * to synchronize the sock.
> + */
> + l2cap_chan_hold(chan);
> + l2cap_chan_lock(chan);
> + sk = chan->data;
>
> + if (!sk) {
> + l2cap_chan_unlock(chan);
> + l2cap_chan_put(chan);
> + return -ENXIO;
> + }
> +
> + pi = l2cap_pi(sk);
> + lock_sock(sk);
> if (chan->mode == L2CAP_MODE_ERTM && !list_empty(&pi->rx_busy)) {
> err = -ENOMEM;
> goto done;
> @@ -1535,6 +1552,8 @@ static int l2cap_sock_recv_cb(struct l2cap_chan *chan, struct sk_buff *skb)
>
> done:
> release_sock(sk);
> + l2cap_chan_unlock(chan);
> + l2cap_chan_put(chan);
>
> return err;
> }
> --
> 2.43.0
Looks like this was never really tested properly:
============================================
WARNING: possible recursive locking detected
6.10.0-rc3-g4029dba6b6f1 #6823 Not tainted
--------------------------------------------
kworker/u5:0/35 is trying to acquire lock:
ffff888002ec2510 (&chan->lock#2/1){+.+.}-{3:3}, at:
l2cap_sock_recv_cb+0x44/0x1e0
but task is already holding lock:
ffff888002ec2510 (&chan->lock#2/1){+.+.}-{3:3}, at:
l2cap_get_chan_by_scid+0xaf/0xd0
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&chan->lock#2/1);
lock(&chan->lock#2/1);
*** DEADLOCK ***
May be due to missing lock nesting notation
3 locks held by kworker/u5:0/35:
#0: ffff888002b8a940 ((wq_completion)hci0#2){+.+.}-{0:0}, at:
process_one_work+0x750/0x930
#1: ffff888002c67dd0 ((work_completion)(&hdev->rx_work)){+.+.}-{0:0},
at: process_one_work+0x44e/0x930
#2: ffff888002ec2510 (&chan->lock#2/1){+.+.}-{3:3}, at:
l2cap_get_chan_by_scid+0xaf/0xd0
l2cap_sock_recv_cb is assumed to be called with the chan_lock held so
perhaps we can just do:
sk = chan->data;
if (!sk)
return -ENXIO;
--
Luiz Augusto von Dentz
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] bluetooth/l2cap: sync sock recv cb and release
2024-06-20 16:53 ` Luiz Augusto von Dentz
@ 2024-06-21 14:45 ` Edward Adam Davis
2024-06-21 15:57 ` Pauli Virtanen
2024-06-22 3:46 ` Edward Adam Davis
1 sibling, 1 reply; 15+ messages in thread
From: Edward Adam Davis @ 2024-06-21 14:45 UTC (permalink / raw)
To: luiz.dentz
Cc: eadavis, johan.hedberg, linux-bluetooth, linux-kernel, marcel,
syzbot+b7f6f8c9303466e16c8a, syzkaller-bugs
Hi Luiz Augusto von Dentz,
On Thu, 20 Jun 2024 12:53:19 -0400, Luiz Augusto von Dentz wrote:
> > release_sock(sk);
> > + l2cap_chan_unlock(chan);
> > + l2cap_chan_put(chan);
> >
> > return err;
> > }
> > --
> > 2.43.0
>
> Looks like this was never really tested properly:
>
> ============================================
> WARNING: possible recursive locking detected
> 6.10.0-rc3-g4029dba6b6f1 #6823 Not tainted
> --------------------------------------------
> kworker/u5:0/35 is trying to acquire lock:
> ffff888002ec2510 (&chan->lock#2/1){+.+.}-{3:3}, at:
> l2cap_sock_recv_cb+0x44/0x1e0
>
> but task is already holding lock:
> ffff888002ec2510 (&chan->lock#2/1){+.+.}-{3:3}, at:
> l2cap_get_chan_by_scid+0xaf/0xd0
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
> CPU0
> ----
> lock(&chan->lock#2/1);
> lock(&chan->lock#2/1);
>
> *** DEADLOCK ***
>
> May be due to missing lock nesting notation
>
> 3 locks held by kworker/u5:0/35:
> #0: ffff888002b8a940 ((wq_completion)hci0#2){+.+.}-{0:0}, at:
> process_one_work+0x750/0x930
> #1: ffff888002c67dd0 ((work_completion)(&hdev->rx_work)){+.+.}-{0:0},
> at: process_one_work+0x44e/0x930
> #2: ffff888002ec2510 (&chan->lock#2/1){+.+.}-{3:3}, at:
> l2cap_get_chan_by_scid+0xaf/0xd0
>
> l2cap_sock_recv_cb is assumed to be called with the chan_lock held so
> perhaps we can just do:
>
> sk = chan->data;
> if (!sk)
> return -ENXIO;
If the release occurs after this judgment, the same problem will still occur.
Recv and release must be synchronized using locks, which can be solved by
adding new lock.
Can you provide a reproduction program for the AA lock mentioned above?
--
Edward
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] bluetooth/l2cap: sync sock recv cb and release
2024-06-21 14:45 ` [PATCH] " Edward Adam Davis
@ 2024-06-21 15:57 ` Pauli Virtanen
0 siblings, 0 replies; 15+ messages in thread
From: Pauli Virtanen @ 2024-06-21 15:57 UTC (permalink / raw)
To: Edward Adam Davis; +Cc: linux-bluetooth
Hi,
pe, 2024-06-21 kello 22:45 +0800, Edward Adam Davis kirjoitti:
> Hi Luiz Augusto von Dentz,
>
> On Thu, 20 Jun 2024 12:53:19 -0400, Luiz Augusto von Dentz wrote:
> > > release_sock(sk);
> > > + l2cap_chan_unlock(chan);
> > > + l2cap_chan_put(chan);
> > >
> > > return err;
> > > }
> > > --
> > > 2.43.0
> >
> > Looks like this was never really tested properly:
> >
> > ============================================
> > WARNING: possible recursive locking detected
> > 6.10.0-rc3-g4029dba6b6f1 #6823 Not tainted
> > --------------------------------------------
> > kworker/u5:0/35 is trying to acquire lock:
> > ffff888002ec2510 (&chan->lock#2/1){+.+.}-{3:3}, at:
> > l2cap_sock_recv_cb+0x44/0x1e0
> >
> > but task is already holding lock:
> > ffff888002ec2510 (&chan->lock#2/1){+.+.}-{3:3}, at:
> > l2cap_get_chan_by_scid+0xaf/0xd0
> >
> > other info that might help us debug this:
> > Possible unsafe locking scenario:
> >
> > CPU0
> > ----
> > lock(&chan->lock#2/1);
> > lock(&chan->lock#2/1);
> >
> > *** DEADLOCK ***
> >
> > May be due to missing lock nesting notation
> >
> > 3 locks held by kworker/u5:0/35:
> > #0: ffff888002b8a940 ((wq_completion)hci0#2){+.+.}-{0:0}, at:
> > process_one_work+0x750/0x930
> > #1: ffff888002c67dd0 ((work_completion)(&hdev->rx_work)){+.+.}-{0:0},
> > at: process_one_work+0x44e/0x930
> > #2: ffff888002ec2510 (&chan->lock#2/1){+.+.}-{3:3}, at:
> > l2cap_get_chan_by_scid+0xaf/0xd0
> >
> > l2cap_sock_recv_cb is assumed to be called with the chan_lock held so
> > perhaps we can just do:
> >
> > sk = chan->data;
> > if (!sk)
> > return -ENXIO;
>
> If the release occurs after this judgment, the same problem will still occur.
> Recv and release must be synchronized using locks, which can be solved by
> adding new lock.
>
> Can you provide a reproduction program for the AA lock mentioned above?
eg. l2cap_data_channel() calls l2cap_sock_recv_cb with chan->lock (and
conn->chan_lock) held.
All callsites eg. l2cap_raw_recv() don't seem to hold chan->lock, so
you'd probably need to check the locking on all of them, if you want to
use chan->lock for this synchronization.
--
Pauli Virtanen
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] bluetooth/l2cap: sync sock recv cb and release
2024-06-20 16:53 ` Luiz Augusto von Dentz
2024-06-21 14:45 ` [PATCH] " Edward Adam Davis
@ 2024-06-22 3:46 ` Edward Adam Davis
2024-06-24 13:36 ` Luiz Augusto von Dentz
1 sibling, 1 reply; 15+ messages in thread
From: Edward Adam Davis @ 2024-06-22 3:46 UTC (permalink / raw)
To: luiz.dentz
Cc: eadavis, johan.hedberg, linux-bluetooth, linux-kernel, marcel,
syzbot+b7f6f8c9303466e16c8a, syzkaller-bugs
Hi Luiz Augusto von Dentz,
On Thu, 20 Jun 2024 12:53:19 -0400, Luiz Augusto von Dentz wrote:
> > release_sock(sk);
> > + l2cap_chan_unlock(chan);
> > + l2cap_chan_put(chan);
> >
> > return err;
> > }
> > --
> > 2.43.0
>
> Looks like this was never really tested properly:
>
> ============================================
> WARNING: possible recursive locking detected
> 6.10.0-rc3-g4029dba6b6f1 #6823 Not tainted
> --------------------------------------------
> kworker/u5:0/35 is trying to acquire lock:
> ffff888002ec2510 (&chan->lock#2/1){+.+.}-{3:3}, at:
> l2cap_sock_recv_cb+0x44/0x1e0
>
> but task is already holding lock:
> ffff888002ec2510 (&chan->lock#2/1){+.+.}-{3:3}, at:
> l2cap_get_chan_by_scid+0xaf/0xd0
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
> CPU0
> ----
> lock(&chan->lock#2/1);
> lock(&chan->lock#2/1);
>
> *** DEADLOCK ***
>
> May be due to missing lock nesting notation
>
> 3 locks held by kworker/u5:0/35:
> #0: ffff888002b8a940 ((wq_completion)hci0#2){+.+.}-{0:0}, at:
> process_one_work+0x750/0x930
> #1: ffff888002c67dd0 ((work_completion)(&hdev->rx_work)){+.+.}-{0:0},
> at: process_one_work+0x44e/0x930
> #2: ffff888002ec2510 (&chan->lock#2/1){+.+.}-{3:3}, at:
> l2cap_get_chan_by_scid+0xaf/0xd0
>
> l2cap_sock_recv_cb is assumed to be called with the chan_lock held so
> perhaps we can just do:
>
> sk = chan->data;
> if (!sk)
> return -ENXIO;
If the release occurs after this judgment, the same problem will still occur.
Recv and release must be synchronized using locks, which can be solved by
adding new lock.
Please use the new patch https://syzkaller.appspot.com/text?tag=Patch&x=15d2c48e980000, I have tested in
https://syzkaller.appspot.com/bug?extid=b7f6f8c9303466e16c8a
--
Edward
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] bluetooth/l2cap: sync sock recv cb and release
2024-06-22 3:46 ` Edward Adam Davis
@ 2024-06-24 13:36 ` Luiz Augusto von Dentz
2024-06-25 0:52 ` Edward Adam Davis
0 siblings, 1 reply; 15+ messages in thread
From: Luiz Augusto von Dentz @ 2024-06-24 13:36 UTC (permalink / raw)
To: Edward Adam Davis
Cc: johan.hedberg, linux-bluetooth, linux-kernel, marcel,
syzbot+b7f6f8c9303466e16c8a, syzkaller-bugs
Hi Edward,
On Fri, Jun 21, 2024 at 11:46 PM Edward Adam Davis <eadavis@qq.com> wrote:
>
> Hi Luiz Augusto von Dentz,
>
> On Thu, 20 Jun 2024 12:53:19 -0400, Luiz Augusto von Dentz wrote:
> > > release_sock(sk);
> > > + l2cap_chan_unlock(chan);
> > > + l2cap_chan_put(chan);
> > >
> > > return err;
> > > }
> > > --
> > > 2.43.0
> >
> > Looks like this was never really tested properly:
> >
> > ============================================
> > WARNING: possible recursive locking detected
> > 6.10.0-rc3-g4029dba6b6f1 #6823 Not tainted
> > --------------------------------------------
> > kworker/u5:0/35 is trying to acquire lock:
> > ffff888002ec2510 (&chan->lock#2/1){+.+.}-{3:3}, at:
> > l2cap_sock_recv_cb+0x44/0x1e0
> >
> > but task is already holding lock:
> > ffff888002ec2510 (&chan->lock#2/1){+.+.}-{3:3}, at:
> > l2cap_get_chan_by_scid+0xaf/0xd0
> >
> > other info that might help us debug this:
> > Possible unsafe locking scenario:
> >
> > CPU0
> > ----
> > lock(&chan->lock#2/1);
> > lock(&chan->lock#2/1);
> >
> > *** DEADLOCK ***
> >
> > May be due to missing lock nesting notation
> >
> > 3 locks held by kworker/u5:0/35:
> > #0: ffff888002b8a940 ((wq_completion)hci0#2){+.+.}-{0:0}, at:
> > process_one_work+0x750/0x930
> > #1: ffff888002c67dd0 ((work_completion)(&hdev->rx_work)){+.+.}-{0:0},
> > at: process_one_work+0x44e/0x930
> > #2: ffff888002ec2510 (&chan->lock#2/1){+.+.}-{3:3}, at:
> > l2cap_get_chan_by_scid+0xaf/0xd0
> >
> > l2cap_sock_recv_cb is assumed to be called with the chan_lock held so
> > perhaps we can just do:
> >
> > sk = chan->data;
> > if (!sk)
> > return -ENXIO;
>
> If the release occurs after this judgment, the same problem will still occur.
> Recv and release must be synchronized using locks, which can be solved by
> adding new lock.
>
> Please use the new patch https://syzkaller.appspot.com/text?tag=Patch&x=15d2c48e980000, I have tested in
> https://syzkaller.appspot.com/bug?extid=b7f6f8c9303466e16c8a
Hmm, why don't we just fix l2cap_conless_channel? Btw,
l2cap_conless_channel is normally not used by any profiles thus why
there isn't any CI covering it, on the other hand l2cap_data_channel
is used by 99% of the profiles.
> --
> Edward
>
--
Luiz Augusto von Dentz
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] bluetooth/l2cap: sync sock recv cb and release
2024-06-24 13:36 ` Luiz Augusto von Dentz
@ 2024-06-25 0:52 ` Edward Adam Davis
2024-06-25 3:08 ` Luiz Augusto von Dentz
0 siblings, 1 reply; 15+ messages in thread
From: Edward Adam Davis @ 2024-06-25 0:52 UTC (permalink / raw)
To: luiz.dentz
Cc: eadavis, johan.hedberg, linux-bluetooth, linux-kernel, marcel,
syzbot+b7f6f8c9303466e16c8a, syzkaller-bugs
Hi Luiz Augusto von Dentz,
On Mon, 24 Jun 2024 09:36:14 -0400, Luiz Augusto von Dentz wrote:
> > > Looks like this was never really tested properly:
> > >
> > > ============================================
> > > WARNING: possible recursive locking detected
> > > 6.10.0-rc3-g4029dba6b6f1 #6823 Not tainted
> > > --------------------------------------------
> > > kworker/u5:0/35 is trying to acquire lock:
> > > ffff888002ec2510 (&chan->lock#2/1){+.+.}-{3:3}, at:
> > > l2cap_sock_recv_cb+0x44/0x1e0
> > >
> > > but task is already holding lock:
> > > ffff888002ec2510 (&chan->lock#2/1){+.+.}-{3:3}, at:
> > > l2cap_get_chan_by_scid+0xaf/0xd0
> > >
> > > other info that might help us debug this:
> > > Possible unsafe locking scenario:
> > >
> > > CPU0
> > > ----
> > > lock(&chan->lock#2/1);
> > > lock(&chan->lock#2/1);
> > >
> > > *** DEADLOCK ***
> > >
> > > May be due to missing lock nesting notation
> > >
> > > 3 locks held by kworker/u5:0/35:
> > > #0: ffff888002b8a940 ((wq_completion)hci0#2){+.+.}-{0:0}, at:
> > > process_one_work+0x750/0x930
> > > #1: ffff888002c67dd0 ((work_completion)(&hdev->rx_work)){+.+.}-{0:0},
> > > at: process_one_work+0x44e/0x930
> > > #2: ffff888002ec2510 (&chan->lock#2/1){+.+.}-{3:3}, at:
> > > l2cap_get_chan_by_scid+0xaf/0xd0
> > >
> > > l2cap_sock_recv_cb is assumed to be called with the chan_lock held so
> > > perhaps we can just do:
> > >
> > > sk = chan->data;
> > > if (!sk)
> > > return -ENXIO;
> >
> > If the release occurs after this judgment, the same problem will still occur.
> > Recv and release must be synchronized using locks, which can be solved by
> > adding new lock.
> >
> > Please use the new patch https://syzkaller.appspot.com/text?tag=Patch&x=15d2c48e980000, I have tested in
> > https://syzkaller.appspot.com/bug?extid=b7f6f8c9303466e16c8a
>
> Hmm, why don't we just fix l2cap_conless_channel? Btw,
> l2cap_conless_channel is normally not used by any profiles thus why
> there isn't any CI covering it, on the other hand l2cap_data_channel
> is used by 99% of the profiles.
Yes, we can fix l2cap_conless_channel, but key point is that "chan->lock"
cannot be used to synchronize l2cap_conless_channel and l2cap_sock_release,
otherwise it will form an AA lock with l2cap_data_channel.
Why not fix it in l2cap_conless_channel but in l2cap_sock_recv_cb, because
l2cap_sock_recv_cb is on the same layer as l2cap_sock_kill, using a new
lock for synchronization is more appropriate.
--
Edward
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] bluetooth/l2cap: sync sock recv cb and release
2024-06-25 0:52 ` Edward Adam Davis
@ 2024-06-25 3:08 ` Luiz Augusto von Dentz
2024-06-25 10:12 ` [PATCH V3] Bluetooth/l2cap: " Edward Adam Davis
0 siblings, 1 reply; 15+ messages in thread
From: Luiz Augusto von Dentz @ 2024-06-25 3:08 UTC (permalink / raw)
To: Edward Adam Davis
Cc: johan.hedberg, linux-bluetooth, linux-kernel, marcel,
syzbot+b7f6f8c9303466e16c8a, syzkaller-bugs
Hi Edward,
On Mon, Jun 24, 2024 at 8:53 PM Edward Adam Davis <eadavis@qq.com> wrote:
>
> Hi Luiz Augusto von Dentz,
>
> On Mon, 24 Jun 2024 09:36:14 -0400, Luiz Augusto von Dentz wrote:
> > > > Looks like this was never really tested properly:
> > > >
> > > > ============================================
> > > > WARNING: possible recursive locking detected
> > > > 6.10.0-rc3-g4029dba6b6f1 #6823 Not tainted
> > > > --------------------------------------------
> > > > kworker/u5:0/35 is trying to acquire lock:
> > > > ffff888002ec2510 (&chan->lock#2/1){+.+.}-{3:3}, at:
> > > > l2cap_sock_recv_cb+0x44/0x1e0
> > > >
> > > > but task is already holding lock:
> > > > ffff888002ec2510 (&chan->lock#2/1){+.+.}-{3:3}, at:
> > > > l2cap_get_chan_by_scid+0xaf/0xd0
> > > >
> > > > other info that might help us debug this:
> > > > Possible unsafe locking scenario:
> > > >
> > > > CPU0
> > > > ----
> > > > lock(&chan->lock#2/1);
> > > > lock(&chan->lock#2/1);
> > > >
> > > > *** DEADLOCK ***
> > > >
> > > > May be due to missing lock nesting notation
> > > >
> > > > 3 locks held by kworker/u5:0/35:
> > > > #0: ffff888002b8a940 ((wq_completion)hci0#2){+.+.}-{0:0}, at:
> > > > process_one_work+0x750/0x930
> > > > #1: ffff888002c67dd0 ((work_completion)(&hdev->rx_work)){+.+.}-{0:0},
> > > > at: process_one_work+0x44e/0x930
> > > > #2: ffff888002ec2510 (&chan->lock#2/1){+.+.}-{3:3}, at:
> > > > l2cap_get_chan_by_scid+0xaf/0xd0
> > > >
> > > > l2cap_sock_recv_cb is assumed to be called with the chan_lock held so
> > > > perhaps we can just do:
> > > >
> > > > sk = chan->data;
> > > > if (!sk)
> > > > return -ENXIO;
> > >
> > > If the release occurs after this judgment, the same problem will still occur.
> > > Recv and release must be synchronized using locks, which can be solved by
> > > adding new lock.
> > >
> > > Please use the new patch https://syzkaller.appspot.com/text?tag=Patch&x=15d2c48e980000, I have tested in
> > > https://syzkaller.appspot.com/bug?extid=b7f6f8c9303466e16c8a
> >
> > Hmm, why don't we just fix l2cap_conless_channel? Btw,
> > l2cap_conless_channel is normally not used by any profiles thus why
> > there isn't any CI covering it, on the other hand l2cap_data_channel
> > is used by 99% of the profiles.
> Yes, we can fix l2cap_conless_channel, but key point is that "chan->lock"
> cannot be used to synchronize l2cap_conless_channel and l2cap_sock_release,
> otherwise it will form an AA lock with l2cap_data_channel.
Don't follow, what l2cap_conless_channel has to do with
l2cap_data_channel? You can't have the same channel calling both, it
is either connectionless or it is not.
> Why not fix it in l2cap_conless_channel but in l2cap_sock_recv_cb, because
> l2cap_sock_recv_cb is on the same layer as l2cap_sock_kill, using a new
> lock for synchronization is more appropriate.
There are dedicated locks for both layers and now that we are doing
chan->lock for before chan->recv the locking sequence should be the
same in all instances.
> --
> Edward
>
--
Luiz Augusto von Dentz
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH V3] Bluetooth/l2cap: sync sock recv cb and release
2024-06-25 3:08 ` Luiz Augusto von Dentz
@ 2024-06-25 10:12 ` Edward Adam Davis
2024-06-25 10:39 ` [V3] " bluez.test.bot
0 siblings, 1 reply; 15+ messages in thread
From: Edward Adam Davis @ 2024-06-25 10:12 UTC (permalink / raw)
To: luiz.dentz
Cc: eadavis, johan.hedberg, linux-bluetooth, linux-kernel, marcel,
syzbot+b7f6f8c9303466e16c8a, syzkaller-bugs
The problem occurs between the system call to close the sock and hci_rx_work,
where the former releases the sock and the latter accesses it without lock protection.
CPU0 CPU1
---- ----
sock_close hci_rx_work
l2cap_sock_release hci_acldata_packet
l2cap_sock_kill l2cap_recv_frame
sk_free l2cap_conless_channel
l2cap_sock_recv_cb
If hci_rx_work processes the data that needs to be received before the sock is
closed, then everything is normal; Otherwise, the work thread may access the
released sock when receiving data.
Add a chan lock in the l2cap_conless_channel of the sock to achieve sync
between the sock release and recv cb.
Sock is dead, so set chan data to NULL, avoid others use invalid sock pointer.
Reported-by: syzbot+b7f6f8c9303466e16c8a@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
---
net/bluetooth/l2cap_core.c | 3 +++
net/bluetooth/l2cap_sock.c | 13 +++++++++++--
2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
index aed025734d04..35a9534eb62d 100644
--- a/net/bluetooth/l2cap_core.c
+++ b/net/bluetooth/l2cap_core.c
@@ -6771,10 +6771,13 @@ static void l2cap_conless_channel(struct l2cap_conn *conn, __le16 psm,
bacpy(&bt_cb(skb)->l2cap.bdaddr, &hcon->dst);
bt_cb(skb)->l2cap.psm = psm;
+ l2cap_chan_lock(chan);
if (!chan->ops->recv(chan, skb)) {
+ l2cap_chan_unlock(chan);
l2cap_chan_put(chan);
return;
}
+ l2cap_chan_unlock(chan);
drop:
l2cap_chan_put(chan);
diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
index 6db60946c627..25091fb992a7 100644
--- a/net/bluetooth/l2cap_sock.c
+++ b/net/bluetooth/l2cap_sock.c
@@ -1239,6 +1239,10 @@ static void l2cap_sock_kill(struct sock *sk)
BT_DBG("sk %p state %s", sk, state_to_string(sk->sk_state));
+ /* Sock is dead, so set chan data to NULL, avoid other task use invalid
+ * sock pointer.
+ */
+ l2cap_pi(sk)->chan->data = NULL;
/* Kill poor orphan */
l2cap_chan_put(l2cap_pi(sk)->chan);
@@ -1481,11 +1485,16 @@ static struct l2cap_chan *l2cap_sock_new_connection_cb(struct l2cap_chan *chan)
static int l2cap_sock_recv_cb(struct l2cap_chan *chan, struct sk_buff *skb)
{
- struct sock *sk = chan->data;
- struct l2cap_pinfo *pi = l2cap_pi(sk);
+ struct sock *sk;
+ struct l2cap_pinfo *pi;
int err;
+ if (!chan->data)
+ return -ENXIO;
+
+ sk = chan->data;
lock_sock(sk);
+ pi = l2cap_pi(sk);
if (chan->mode == L2CAP_MODE_ERTM && !list_empty(&pi->rx_busy)) {
err = -ENOMEM;
--
2.43.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* RE: [V3] Bluetooth/l2cap: sync sock recv cb and release
2024-06-25 10:12 ` [PATCH V3] Bluetooth/l2cap: " Edward Adam Davis
@ 2024-06-25 10:39 ` bluez.test.bot
0 siblings, 0 replies; 15+ messages in thread
From: bluez.test.bot @ 2024-06-25 10:39 UTC (permalink / raw)
To: linux-bluetooth, eadavis
[-- Attachment #1: Type: text/plain, Size: 555 bytes --]
This is an automated email and please do not reply to this email.
Dear Submitter,
Thank you for submitting the patches to the linux bluetooth mailing list.
While preparing the CI tests, the patches you submitted couldn't be applied to the current HEAD of the repository.
----- Output -----
error: patch failed: net/bluetooth/l2cap_sock.c:1239
error: net/bluetooth/l2cap_sock.c: patch does not apply
hint: Use 'git am --show-current-patch' to see the failed patch
Please resolve the issue and submit the patches again.
---
Regards,
Linux Bluetooth
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2024-06-25 10:39 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-07 6:02 [syzbot] [bluetooth?] general protection fault in l2cap_sock_recv_cb syzbot
2024-06-11 14:50 ` [PATCH] bluetooth/l2cap: sync sock recv cb and release Edward Adam Davis
2024-06-11 19:24 ` Luiz Augusto von Dentz
2024-06-12 0:33 ` Edward Adam Davis
2024-06-15 1:45 ` [PATCH v2] " Edward Adam Davis
2024-06-17 19:00 ` patchwork-bot+bluetooth
2024-06-20 16:53 ` Luiz Augusto von Dentz
2024-06-21 14:45 ` [PATCH] " Edward Adam Davis
2024-06-21 15:57 ` Pauli Virtanen
2024-06-22 3:46 ` Edward Adam Davis
2024-06-24 13:36 ` Luiz Augusto von Dentz
2024-06-25 0:52 ` Edward Adam Davis
2024-06-25 3:08 ` Luiz Augusto von Dentz
2024-06-25 10:12 ` [PATCH V3] Bluetooth/l2cap: " Edward Adam Davis
2024-06-25 10:39 ` [V3] " bluez.test.bot
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox