* [PATCH] nbd: don't warn when reclassifying a busy socket lock
@ 2026-06-21 23:52 Deepanshu Kartikey
2026-06-22 1:43 ` Hillf Danton
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Deepanshu Kartikey @ 2026-06-21 23:52 UTC (permalink / raw)
To: josef, axboe, edumazet
Cc: linux-block, nbd, linux-kernel, Deepanshu Kartikey,
syzbot+6b85d1e39a5b8ed9a954
nbd_reclassify_socket() warns via WARN_ON_ONCE() if the socket lock is
held at the point of reclassification. That assertion was copied from
nvme-tcp, where the socket is created internally by the kernel
(sock_create_kern()) and is never visible to user space, so the lock
is guaranteed to be free.

NBD is different: the socket is looked up from a user-supplied fd in
nbd_get_socket(), and user space retains that fd. A concurrent syscall
on the same socket (or softirq processing taking bh_lock_sock() on a
connected TCP socket) can legitimately hold the lock at the instant
NBD reclassifies it. sock_allow_reclassification() then returns false
and the WARN_ON_ONCE() fires, which turns into a crash under
panic_on_warn. This is reachable by simply racing NBD_CMD_CONNECT
against socket activity on the same fd, as reported by syzbot.

Hitting a held lock here is expected for an externally owned socket and
is not a kernel bug, so skip reclassification silently instead of
warning. Reclassification is a lockdep-only annotation, so skipping it
in the rare racing case is harmless.

Reported-by: syzbot+6b85d1e39a5b8ed9a954@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=6b85d1e39a5b8ed9a954
Fixes: d532cddb6c60 ("nbd: Reclassify sockets to avoid lockdep circular dependency")
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
drivers/block/nbd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 3a585a0c882a..8f10762e90ef 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1246,7 +1246,7 @@ static void nbd_reclassify_socket(struct socket *sock)
 {
 	struct sock *sk = sock->sk;
 
-	if (WARN_ON_ONCE(!sock_allow_reclassification(sk)))
+	if (!sock_allow_reclassification(sk))
 		return;
 
 	switch (sk->sk_family) {
--
2.43.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
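The behavior change described in the patch above can be illustrated with a small user-space sketch. It is not kernel code: `struct fake_sock`, `fake_allow_reclassification()`, `fake_reclassify()` and the two class constants are hypothetical names made up for this illustration, and modeling "the lock is held" as two boolean flags (one for a task owning the lock, one for softirq processing) is an assumption based on the two cases named in the commit message. The point is only the control flow: a busy lock makes the function return early without warning, and the lock class is left unchanged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct sock; not the kernel type. */
struct fake_sock {
	bool owned_by_user; /* models a concurrent syscall holding the socket lock */
	bool slock_held;    /* models softirq processing holding the spinlock */
	int lock_class;     /* models the lockdep class attached to the lock */
};

enum { CLASS_DEFAULT = 0, CLASS_NBD = 1 };

/* Reclassification is only allowed while nobody holds the lock. */
static bool fake_allow_reclassification(const struct fake_sock *sk)
{
	return !sk->owned_by_user && !sk->slock_held;
}

/* Returns true if the class was changed, false if the step was skipped. */
static bool fake_reclassify(struct fake_sock *sk)
{
	assert(sk != NULL);

	/* Busy lock: skip quietly instead of warning, as the patch does. */
	if (!fake_allow_reclassification(sk))
		return false;

	sk->lock_class = CLASS_NBD;
	return true;
}
```

Calling `fake_reclassify()` on a socket with both flags clear switches it to `CLASS_NBD`; calling it on a socket with either flag set leaves the class at its previous value and reports the skip through the return value. The real function returns `void`; the boolean here exists only so the sketch can be checked.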
* Re: [PATCH] nbd: don't warn when reclassifying a busy socket lock
2026-06-21 23:52 [PATCH] nbd: don't warn when reclassifying a busy socket lock Deepanshu Kartikey
@ 2026-06-22 1:43 ` Hillf Danton
2026-06-22 8:18 ` Eric Dumazet
2026-06-22 8:31 ` Eric Dumazet
2026-06-22 22:00 ` Jens Axboe
2 siblings, 1 reply; 8+ messages in thread
From: Hillf Danton @ 2026-06-22 1:43 UTC (permalink / raw)
To: Deepanshu Kartikey
Cc: edumazet, linux-block, nbd, linux-kernel,
syzbot+6b85d1e39a5b8ed9a954
On Mon, 22 Jun 2026 05:22:55 +0530 Deepanshu Kartikey wrote:
> nbd_reclassify_socket() warns via WARN_ON_ONCE() if the socket lock is
> held at the point of reclassification. That assertion was copied from
> nvme-tcp, where the socket is created internally by the kernel
> (sock_create_kern()) and is never visible to user space, so the lock
> is guaranteed to be free.
>
> NBD is different: the socket is looked up from a user-supplied fd in
> nbd_get_socket(), and user space retains that fd. A concurrent syscall
> on the same socket (or softirq processing taking bh_lock_sock() on a
> connected TCP socket) can legitimately hold the lock at the instant
> NBD reclassifies it. sock_allow_reclassification() then returns false
> and the WARN_ON_ONCE() fires, which turns into a crash under
> panic_on_warn. This is reachable by simply racing NBD_CMD_CONNECT
> against socket activity on the same fd, as reported by syzbot.
>
Given the syzbot report, if you are right (I suspect) then Eric delivered
another half-baked croissant, and feel free to cut it off instead to make
room for correct fix.
> Hitting a held lock here is expected for an externally owned socket and
> is not a kernel bug, so skip reclassification silently instead of
> warning. Reclassification is a lockdep-only annotation, so skipping it
> in the rare racing case is harmless.
>
> Reported-by: syzbot+6b85d1e39a5b8ed9a954@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=6b85d1e39a5b8ed9a954
> Fixes: d532cddb6c60 ("nbd: Reclassify sockets to avoid lockdep circular dependency")
> Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
> ---
> drivers/block/nbd.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index 3a585a0c882a..8f10762e90ef 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -1246,7 +1246,7 @@ static void nbd_reclassify_socket(struct socket *sock)
>  {
>  	struct sock *sk = sock->sk;
>
> -	if (WARN_ON_ONCE(!sock_allow_reclassification(sk)))
> +	if (!sock_allow_reclassification(sk))
>  		return;
>
>  	switch (sk->sk_family) {
> --
> 2.43.0
* Re: [PATCH] nbd: don't warn when reclassifying a busy socket lock
2026-06-22 1:43 ` Hillf Danton
@ 2026-06-22 8:18 ` Eric Dumazet
2026-06-23 0:07 ` Hillf Danton
0 siblings, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2026-06-22 8:18 UTC (permalink / raw)
To: Hillf Danton
Cc: Deepanshu Kartikey, linux-block, nbd, linux-kernel,
syzbot+6b85d1e39a5b8ed9a954
On Sun, Jun 21, 2026 at 6:43 PM Hillf Danton <hdanton@sina.com> wrote:
>
> On Mon, 22 Jun 2026 05:22:55 +0530 Deepanshu Kartikey wrote:
> > nbd_reclassify_socket() warns via WARN_ON_ONCE() if the socket lock is
> > held at the point of reclassification. That assertion was copied from
> > nvme-tcp, where the socket is created internally by the kernel
> > (sock_create_kern()) and is never visible to user space, so the lock
> > is guaranteed to be free.
> >
> > NBD is different: the socket is looked up from a user-supplied fd in
> > nbd_get_socket(), and user space retains that fd. A concurrent syscall
> > on the same socket (or softirq processing taking bh_lock_sock() on a
> > connected TCP socket) can legitimately hold the lock at the instant
> > NBD reclassifies it. sock_allow_reclassification() then returns false
> > and the WARN_ON_ONCE() fires, which turns into a crash under
> > panic_on_warn. This is reachable by simply racing NBD_CMD_CONNECT
> > against socket activity on the same fd, as reported by syzbot.
> >
> Given the syzbot report, if you are right (I suspect) then Eric delivered
> another half-baked croissant, and feel free to cut it off instead to make
> room for correct fix.

Nobody (including you) caught this difference between nbd and other
sock_allow_reclassification() callers.

What was the "correct fix" you envisioned exactly?
* Re: [PATCH] nbd: don't warn when reclassifying a busy socket lock
2026-06-21 23:52 [PATCH] nbd: don't warn when reclassifying a busy socket lock Deepanshu Kartikey
2026-06-22 1:43 ` Hillf Danton
@ 2026-06-22 8:31 ` Eric Dumazet
2026-06-22 22:00 ` Jens Axboe
2 siblings, 0 replies; 8+ messages in thread
From: Eric Dumazet @ 2026-06-22 8:31 UTC (permalink / raw)
To: Deepanshu Kartikey
Cc: josef, axboe, linux-block, nbd, linux-kernel,
syzbot+6b85d1e39a5b8ed9a954
On Sun, Jun 21, 2026 at 4:53 PM Deepanshu Kartikey
<kartikey406@gmail.com> wrote:
>
>
> nbd_reclassify_socket() warns via WARN_ON_ONCE() if the socket lock is
> held at the point of reclassification. That assertion was copied from
> nvme-tcp, where the socket is created internally by the kernel
> (sock_create_kern()) and is never visible to user space, so the lock
> is guaranteed to be free.
>
> NBD is different: the socket is looked up from a user-supplied fd in
> nbd_get_socket(), and user space retains that fd. A concurrent syscall
> on the same socket (or softirq processing taking bh_lock_sock() on a
> connected TCP socket) can legitimately hold the lock at the instant
> NBD reclassifies it. sock_allow_reclassification() then returns false
> and the WARN_ON_ONCE() fires, which turns into a crash under
> panic_on_warn. This is reachable by simply racing NBD_CMD_CONNECT
> against socket activity on the same fd, as reported by syzbot.
>
> Hitting a held lock here is expected for an externally owned socket and
> is not a kernel bug, so skip reclassification silently instead of
> warning. Reclassification is a lockdep-only annotation, so skipping it
> in the rare racing case is harmless.
Acked-by: Eric Dumazet <edumazet@google.com>
Thanks!
* Re: [PATCH] nbd: don't warn when reclassifying a busy socket lock
2026-06-21 23:52 [PATCH] nbd: don't warn when reclassifying a busy socket lock Deepanshu Kartikey
2026-06-22 1:43 ` Hillf Danton
2026-06-22 8:31 ` Eric Dumazet
@ 2026-06-22 22:00 ` Jens Axboe
2 siblings, 0 replies; 8+ messages in thread
From: Jens Axboe @ 2026-06-22 22:00 UTC (permalink / raw)
To: josef, edumazet, Deepanshu Kartikey
Cc: linux-block, nbd, linux-kernel, syzbot+6b85d1e39a5b8ed9a954
On Mon, 22 Jun 2026 05:22:55 +0530, Deepanshu Kartikey wrote:
> nbd_reclassify_socket() warns via WARN_ON_ONCE() if the socket lock is
> held at the point of reclassification. That assertion was copied from
> nvme-tcp, where the socket is created internally by the kernel
> (sock_create_kern()) and is never visible to user space, so the lock
> is guaranteed to be free.
>
> NBD is different: the socket is looked up from a user-supplied fd in
> nbd_get_socket(), and user space retains that fd. A concurrent syscall
> on the same socket (or softirq processing taking bh_lock_sock() on a
> connected TCP socket) can legitimately hold the lock at the instant
> NBD reclassifies it. sock_allow_reclassification() then returns false
> and the WARN_ON_ONCE() fires, which turns into a crash under
> panic_on_warn. This is reachable by simply racing NBD_CMD_CONNECT
> against socket activity on the same fd, as reported by syzbot.
>
> [...]
Applied, thanks!
[1/1] nbd: don't warn when reclassifying a busy socket lock
commit: 9280e6edf65662b6aafc8b704ad065b54c08b519
Best regards,
--
Jens Axboe
* Re: [PATCH] nbd: don't warn when reclassifying a busy socket lock
2026-06-22 8:18 ` Eric Dumazet
@ 2026-06-23 0:07 ` Hillf Danton
2026-06-23 0:21 ` Eric Dumazet
0 siblings, 1 reply; 8+ messages in thread
From: Hillf Danton @ 2026-06-23 0:07 UTC (permalink / raw)
To: Eric Dumazet
Cc: Deepanshu Kartikey, linux-block, nbd, linux-kernel,
syzbot+6b85d1e39a5b8ed9a954
On Mon, 22 Jun 2026 01:18:10 -0700 Eric Dumazet wrote:
>On Sun, Jun 21, 2026 at 6:43 PM Hillf Danton <hdanton@sina.com> wrote:
>> On Mon, 22 Jun 2026 05:22:55 +0530 Deepanshu Kartikey wrote:
>> > nbd_reclassify_socket() warns via WARN_ON_ONCE() if the socket lock is
>> > held at the point of reclassification. That assertion was copied from
>> > nvme-tcp, where the socket is created internally by the kernel
>> > (sock_create_kern()) and is never visible to user space, so the lock
>> > is guaranteed to be free.
>> >
>> > NBD is different: the socket is looked up from a user-supplied fd in
>> > nbd_get_socket(), and user space retains that fd. A concurrent syscall
>> > on the same socket (or softirq processing taking bh_lock_sock() on a
>> > connected TCP socket) can legitimately hold the lock at the instant
>> > NBD reclassifies it. sock_allow_reclassification() then returns false
>> > and the WARN_ON_ONCE() fires, which turns into a crash under
>> > panic_on_warn. This is reachable by simply racing NBD_CMD_CONNECT
>> > against socket activity on the same fd, as reported by syzbot.
>> >
>> Given the syzbot report, if you are right (I suspect) then Eric delivered
>> another half-baked croissant, and feel free to cut it off instead to make
>> room for correct fix.
>
> Nobody (including you) caught this difference between nbd and other
> sock_allow_reclassification() callers.
>
Nope, actually it raises the question -- does the deadlock still remain
after your fix without the lock key you added applied?
> What was the "correct fix" you envisioned exactly?
>
Frankly I had no evidence against your fix a couple days back, but now I
see your lock key approach fails to take off. And the correct fix is to
erase the incorrect locking order ffa1e7ada456 tries to catch, more
difficult than you thought so far.
* Re: [PATCH] nbd: don't warn when reclassifying a busy socket lock
2026-06-23 0:07 ` Hillf Danton
@ 2026-06-23 0:21 ` Eric Dumazet
2026-06-23 0:44 ` Hillf Danton
0 siblings, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2026-06-23 0:21 UTC (permalink / raw)
To: Hillf Danton
Cc: Deepanshu Kartikey, linux-block, nbd, linux-kernel,
syzbot+6b85d1e39a5b8ed9a954
On Mon, Jun 22, 2026 at 5:07 PM Hillf Danton <hdanton@sina.com> wrote:
>
> On Mon, 22 Jun 2026 01:18:10 -0700 Eric Dumazet wrote:
> >On Sun, Jun 21, 2026 at 6:43 PM Hillf Danton <hdanton@sina.com> wrote:
> >> On Mon, 22 Jun 2026 05:22:55 +0530 Deepanshu Kartikey wrote:
> >> > nbd_reclassify_socket() warns via WARN_ON_ONCE() if the socket lock is
> >> > held at the point of reclassification. That assertion was copied from
> >> > nvme-tcp, where the socket is created internally by the kernel
> >> > (sock_create_kern()) and is never visible to user space, so the lock
> >> > is guaranteed to be free.
> >> >
> >> > NBD is different: the socket is looked up from a user-supplied fd in
> >> > nbd_get_socket(), and user space retains that fd. A concurrent syscall
> >> > on the same socket (or softirq processing taking bh_lock_sock() on a
> >> > connected TCP socket) can legitimately hold the lock at the instant
> >> > NBD reclassifies it. sock_allow_reclassification() then returns false
> >> > and the WARN_ON_ONCE() fires, which turns into a crash under
> >> > panic_on_warn. This is reachable by simply racing NBD_CMD_CONNECT
> >> > against socket activity on the same fd, as reported by syzbot.
> >> >
> >> Given the syzbot report, if you are right (I suspect) then Eric delivered
> >> another half-baked croissant, and feel free to cut it off instead to make
> >> room for correct fix.
> >
> > Nobody (including you) caught this difference between nbd and other
> > sock_allow_reclassification() callers.
> >
> Nope, actually it raises the question -- does the deadlock still remain
> after your fix without the lock key you added applied?
LOCKDEP might have a false positive, but it will be much much harder to trigger.
I had about 50 syzbot duplicates (that I did not release) before d532cddb6c60
("nbd: Reclassify sockets to avoid lockdep circular dependency").
>
> > What was the "correct fix" you envisioned exactly?
> >
> Frankly I had no evidence against your fix a couple days back, but now I
> see your lock key approach fails to take off. And the correct fix is to
> erase the incorrect locking order ffa1e7ada456 tries to catch, more
> difficult than you thought so far.
Which incorrect locking order are you referring to? This is a LOCKDEP
false positive.
I suggest you send a patch so we can discuss it.
* Re: [PATCH] nbd: don't warn when reclassifying a busy socket lock
2026-06-23 0:21 ` Eric Dumazet
@ 2026-06-23 0:44 ` Hillf Danton
0 siblings, 0 replies; 8+ messages in thread
From: Hillf Danton @ 2026-06-23 0:44 UTC (permalink / raw)
To: Eric Dumazet
Cc: Deepanshu Kartikey, linux-block, nbd, linux-kernel,
syzbot+6b85d1e39a5b8ed9a954
On Mon, 22 Jun 2026 17:21:53 -0700 Eric Dumazet wrote:
>On Mon, Jun 22, 2026 at 5:07 PM Hillf Danton <hdanton@sina.com> wrote:
>> On Mon, 22 Jun 2026 01:18:10 -0700 Eric Dumazet wrote:
>> >On Sun, Jun 21, 2026 at 6:43 PM Hillf Danton <hdanton@sina.com> wrote:
>> >> On Mon, 22 Jun 2026 05:22:55 +0530 Deepanshu Kartikey wrote:
>> >> > nbd_reclassify_socket() warns via WARN_ON_ONCE() if the socket lock is
>> >> > held at the point of reclassification. That assertion was copied from
>> >> > nvme-tcp, where the socket is created internally by the kernel
>> >> > (sock_create_kern()) and is never visible to user space, so the lock
>> >> > is guaranteed to be free.
>> >> >
>> >> > NBD is different: the socket is looked up from a user-supplied fd in
>> >> > nbd_get_socket(), and user space retains that fd. A concurrent syscall
>> >> > on the same socket (or softirq processing taking bh_lock_sock() on a
>> >> > connected TCP socket) can legitimately hold the lock at the instant
>> >> > NBD reclassifies it. sock_allow_reclassification() then returns false
>> >> > and the WARN_ON_ONCE() fires, which turns into a crash under
>> >> > panic_on_warn. This is reachable by simply racing NBD_CMD_CONNECT
>> >> > against socket activity on the same fd, as reported by syzbot.
>> >> >
>> >> Given the syzbot report, if you are right (I suspect) then Eric delivered
>> >> another half-baked croissant, and feel free to cut it off instead to make
>> >> room for correct fix.
>> >
>> > Nobody (including you) caught this difference between nbd and other
>> > sock_allow_reclassification() callers.
>> >
>> Nope, actually it raises the question -- does the deadlock still remain
>> after your fix without the lock key you added applied?
>
>LOCKDEP might have a false positive, but it will be much much harder to trigger.
>
>I had about 50 syzbot duplicates (that I did not release) before d532cddb6c60
> ("nbd: Reclassify sockets to avoid lockdep circular dependency").
>
>>
>> > What was the "correct fix" you envisioned exactly?
>> >
>> Frankly I had no evidence against your fix a couple days back, but now I
>> see your lock key approach fails to take off. And the correct fix is to
>> erase the incorrect locking order ffa1e7ada456 tries to catch, more
>> difficult than you thought so far.
>
>Which incorrect locking order are you referring to? This is a LOCKDEP
>false positive.
>
In addition to 50 syzbot reports, your fix has a Fixes tag, no?
>I suggest you send a patch so we can discuss it.
The deadlock existed before ffa1e7ada456, why is a chance left for your fix?
Thread overview: 8+ messages
2026-06-21 23:52 [PATCH] nbd: don't warn when reclassifying a busy socket lock Deepanshu Kartikey
2026-06-22 1:43 ` Hillf Danton
2026-06-22 8:18 ` Eric Dumazet
2026-06-23 0:07 ` Hillf Danton
2026-06-23 0:21 ` Eric Dumazet
2026-06-23 0:44 ` Hillf Danton
2026-06-22 8:31 ` Eric Dumazet
2026-06-22 22:00 ` Jens Axboe