* [PATCH] net: phonet: fix BUG_ON() in pn_socket_autobind()
@ 2026-04-22 2:15 Deepanshu Kartikey
2026-04-22 15:13 ` Rémi Denis-Courmont
2026-04-28 0:54 ` Jakub Kicinski
0 siblings, 2 replies; 4+ messages in thread
From: Deepanshu Kartikey @ 2026-04-22 2:15 UTC (permalink / raw)
To: courmisch, davem, edumazet, kuba, pabeni, horms
Cc: netdev, linux-kernel, Deepanshu Kartikey,
syzbot+706f5eb79044e686c794
pn_socket_autobind() calls pn_socket_bind() and treats
-EINVAL as a signal that the socket was already bound,
then uses BUG_ON() to verify it:
if (err != -EINVAL)
return err;
BUG_ON(!pn_port(pn_sk(sock->sk)->sobject));
However, pn_socket_bind() returns -EINVAL in multiple
cases:
1. address length too short
2. socket not in TCP_CLOSE state
3. socket already bound <- only intended case
When -EINVAL comes from cases 1 or 2, sobject is still
zero (never assigned), causing BUG_ON to fire and crash
the kernel.
Fix this by checking the bound state directly via
pn_port(sobject) BEFORE calling pn_socket_bind(),
eliminating the ambiguous -EINVAL interpretation
entirely.
Reported-by: syzbot+706f5eb79044e686c794@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=706f5eb79044e686c794
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
net/phonet/socket.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/phonet/socket.c b/net/phonet/socket.c
index c4af26357144..5a55e7d14e85 100644
--- a/net/phonet/socket.c
+++ b/net/phonet/socket.c
@@ -204,14 +204,14 @@ static int pn_socket_autobind(struct socket *sock)
struct sockaddr_pn sa;
int err;
+ if (pn_port(pn_sk(sock->sk)->sobject))
+ return 0; /* socket was already bound */
+
memset(&sa, 0, sizeof(sa));
sa.spn_family = AF_PHONET;
err = pn_socket_bind(sock, (struct sockaddr_unsized *)&sa,
sizeof(struct sockaddr_pn));
- if (err != -EINVAL)
- return err;
- BUG_ON(!pn_port(pn_sk(sock->sk)->sobject));
- return 0; /* socket was already bound */
+ return err;
}
static int pn_socket_connect(struct socket *sock, struct sockaddr_unsized *addr,
--
2.43.0
^ permalink raw reply related [flat|nested] 4+ messages in thread* Re: [PATCH] net: phonet: fix BUG_ON() in pn_socket_autobind() 2026-04-22 2:15 [PATCH] net: phonet: fix BUG_ON() in pn_socket_autobind() Deepanshu Kartikey @ 2026-04-22 15:13 ` Rémi Denis-Courmont 2026-04-28 0:54 ` Jakub Kicinski 1 sibling, 0 replies; 4+ messages in thread From: Rémi Denis-Courmont @ 2026-04-22 15:13 UTC (permalink / raw) To: courmisch, davem, edumazet, kuba, pabeni, horms, Deepanshu Kartikey Cc: netdev, linux-kernel, Deepanshu Kartikey, syzbot+706f5eb79044e686c794 Hi, Le keskiviikkona 22. huhtikuuta 2026, 5.15.33 Itä-Euroopan kesäaika Deepanshu Kartikey a écrit : > pn_socket_autobind() calls pn_socket_bind() and treats > -EINVAL as a signal that the socket was already bound, > then uses BUG_ON() to verify it: > > if (err != -EINVAL) > return err; > BUG_ON(!pn_port(pn_sk(sock->sk)->sobject)); > > However, pn_socket_bind() returns -EINVAL in multiple > cases: > > 1. address length too short > 2. socket not in TCP_CLOSE state > 3. socket already bound <- only intended case > > When -EINVAL comes from cases 1 or 2, sobject is still > zero (never assigned), causing BUG_ON to fire and crash > the kernel. > > Fix this by checking the bound state directly via > pn_port(sobject) BEFORE calling pn_socket_bind(), > eliminating the ambiguous -EINVAL interpretation > entirely. > > Reported-by: syzbot+706f5eb79044e686c794@syzkaller.appspotmail.com > Closes: https://syzkaller.appspot.com/bug?extid=706f5eb79044e686c794 > Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com> > --- > net/phonet/socket.c | 8 ++++---- > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/net/phonet/socket.c b/net/phonet/socket.c > index c4af26357144..5a55e7d14e85 100644 > --- a/net/phonet/socket.c > +++ b/net/phonet/socket.c > @@ -204,14 +204,14 @@ static int pn_socket_autobind(struct socket *sock) > struct sockaddr_pn sa; > int err; > > + if (pn_port(pn_sk(sock->sk)->sobject)) > + return 0; /* socket was already bound */ > + This was almost 20 years ago, but IIRC, we did not do it that way back then because it results in a data race on sobject if another task binds the socket in parallel. > memset(&sa, 0, sizeof(sa)); > sa.spn_family = AF_PHONET; > err = pn_socket_bind(sock, (struct sockaddr_unsized *)&sa, > sizeof(struct sockaddr_pn)); > - if (err != -EINVAL) > - return err; > - BUG_ON(!pn_port(pn_sk(sock->sk)->sobject)); > - return 0; /* socket was already bound */ > + return err; > } > > static int pn_socket_connect(struct socket *sock, struct sockaddr_unsized > *addr, -- 德尼-库尔蒙‧雷米 https://www.remlab.net/ ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] net: phonet: fix BUG_ON() in pn_socket_autobind() 2026-04-22 2:15 [PATCH] net: phonet: fix BUG_ON() in pn_socket_autobind() Deepanshu Kartikey 2026-04-22 15:13 ` Rémi Denis-Courmont @ 2026-04-28 0:54 ` Jakub Kicinski 2026-04-28 8:48 ` Deepanshu Kartikey 1 sibling, 1 reply; 4+ messages in thread From: Jakub Kicinski @ 2026-04-28 0:54 UTC (permalink / raw) To: kartikey406 Cc: Jakub Kicinski, pabeni, netdev, syzbot+706f5eb79044e686c794, horms, linux-kernel, edumazet, davem, courmisch This is an AI-generated review of your patch. The human sending this email has considered the AI review valid, or at least plausible. For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html --- net: phonet: fix BUG_ON() in pn_socket_autobind() This commit fixes a kernel crash in pn_socket_autobind() by checking the socket's bound state directly before calling pn_socket_bind(), eliminating ambiguous -EINVAL interpretation. Closes: https://syzkaller.appspot.com/bug?extid=706f5eb79044e686c794 > pn_socket_autobind() calls pn_socket_bind() and treats > -EINVAL as a signal that the socket was already bound, > then uses BUG_ON() to verify it: > > if (err != -EINVAL) > return err; > BUG_ON(!pn_port(pn_sk(sock->sk)->sobject)); > > However, pn_socket_bind() returns -EINVAL in multiple > cases: > > 1. address length too short The commit message lists "address length too short" as a cause of the BUG_ON crash, but is this accurate? Looking at pn_socket_autobind(), it always passes len=sizeof(struct sockaddr_pn) to pn_socket_bind(): err = pn_socket_bind(sock, (struct sockaddr_unsized *)&sa, sizeof(struct sockaddr_pn)); While pn_socket_bind() does have this check: if (len < sizeof(struct sockaddr_pn)) return -EINVAL; This case cannot be triggered through pn_socket_autobind() since it always passes the correct length. The BUG_ON is only reachable through pn_socket_autobind(), so listing this as a cause seems misleading. Should the commit message be updated to remove case 1, or clarify that it's describing pn_socket_bind() in general rather than paths that can trigger the BUG_ON? > 2. socket not in TCP_CLOSE state Can case 2 (socket not in TCP_CLOSE state) actually trigger the BUG_ON crash with sobject still zero? The combined check in pn_socket_bind() is: if (sk->sk_state != TCP_CLOSE || pn_port(pn->sobject)) { err = -EINVAL; goto out; } For the BUG_ON to fire, we need sk_state != TCP_CLOSE while sobject remains zero (unbound). Looking at phonet state transitions: - pn_socket_connect() binds the socket before changing state - pn_socket_listen() binds the socket before changing state - pep_sock_accept() creates sockets with state=TCP_SYN_RECV but also sets sobject - pep_ioctl(SIOCPNENABLEPIPE) checks sobject before changing state Is there a code path that changes sk_state to non-TCP_CLOSE without first binding the socket? The syzbot crash is real, but the mechanism described couldn't be verified. Could this involve a race condition or a different code path? > 3. socket already bound <- only intended case > > When -EINVAL comes from cases 1 or 2, sobject is still > zero (never assigned), causing BUG_ON to fire and crash > the kernel. Given the issues with cases 1 and 2 above, is the explanation of the crash mechanism accurate? Case 1 cannot happen from pn_socket_autobind(), and case 2's ability to leave sobject at zero while triggering -EINVAL could not be verified. The syzbot crash is confirmed (RIP at pn_socket_autobind net/phonet/socket.c:213 via pn_socket_sendmsg), but could the commit message more accurately describe how the crash actually occurs? > diff --git a/net/phonet/socket.c b/net/phonet/socket.c > index c4af26357144..5a55e7d14e85 100644 > --- a/net/phonet/socket.c > +++ b/net/phonet/socket.c > @@ -204,14 +204,14 @@ static int pn_socket_autobind(struct socket *sock) > struct sockaddr_pn sa; > int err; > > + if (pn_port(pn_sk(sock->sk)->sobject)) > + return 0; /* socket was already bound */ > + > memset(&sa, 0, sizeof(sa)); > sa.spn_family = AF_PHONET; > err = pn_socket_bind(sock, (struct sockaddr_unsized *)&sa, > sizeof(struct sockaddr_pn)); > - if (err != -EINVAL) > - return err; > - BUG_ON(!pn_port(pn_sk(sock->sk)->sobject)); > - return 0; /* socket was already bound */ > + return err; > } The fix itself is correct and removes the dangerous BUG_ON(). However, the pn_port(sobject) check is performed without holding any lock. Could a concurrent pn_socket_bind() race with this check? This is likely benign since pn_socket_bind() re-checks under lock_sock(), but is this the intended behavior? -- pw-bot: cr ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] net: phonet: fix BUG_ON() in pn_socket_autobind() 2026-04-28 0:54 ` Jakub Kicinski @ 2026-04-28 8:48 ` Deepanshu Kartikey 0 siblings, 0 replies; 4+ messages in thread From: Deepanshu Kartikey @ 2026-04-28 8:48 UTC (permalink / raw) To: Jakub Kicinski Cc: pabeni, netdev, syzbot+706f5eb79044e686c794, horms, linux-kernel, edumazet, davem, courmisch On Tue, Apr 28, 2026 at 6:24 AM Jakub Kicinski <kuba@kernel.org> wrote: > > This is an AI-generated review of your patch. The human sending this > email has considered the AI review valid, or at least plausible. > > For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html > --- > net: phonet: fix BUG_ON() in pn_socket_autobind() > > This commit fixes a kernel crash in pn_socket_autobind() by checking > the socket's bound state directly before calling pn_socket_bind(), > eliminating ambiguous -EINVAL interpretation. > > Closes: https://syzkaller.appspot.com/bug?extid=706f5eb79044e686c794 > > > pn_socket_autobind() calls pn_socket_bind() and treats > > -EINVAL as a signal that the socket was already bound, > > then uses BUG_ON() to verify it: > > > > if (err != -EINVAL) > > return err; > > BUG_ON(!pn_port(pn_sk(sock->sk)->sobject)); > > > > However, pn_socket_bind() returns -EINVAL in multiple > > cases: > > > > 1. address length too short > > The commit message lists "address length too short" as a cause of the > BUG_ON crash, but is this accurate? > > Looking at pn_socket_autobind(), it always passes > len=sizeof(struct sockaddr_pn) to pn_socket_bind(): > > err = pn_socket_bind(sock, (struct sockaddr_unsized *)&sa, > sizeof(struct sockaddr_pn)); > > While pn_socket_bind() does have this check: > > if (len < sizeof(struct sockaddr_pn)) > return -EINVAL; > > This case cannot be triggered through pn_socket_autobind() since it > always passes the correct length. The BUG_ON is only reachable through > pn_socket_autobind(), so listing this as a cause seems misleading. > > Should the commit message be updated to remove case 1, or clarify that > it's describing pn_socket_bind() in general rather than paths that can > trigger the BUG_ON? > > > 2. socket not in TCP_CLOSE state > > Can case 2 (socket not in TCP_CLOSE state) actually trigger the BUG_ON > crash with sobject still zero? > > The combined check in pn_socket_bind() is: > > if (sk->sk_state != TCP_CLOSE || pn_port(pn->sobject)) { > err = -EINVAL; > goto out; > } > > For the BUG_ON to fire, we need sk_state != TCP_CLOSE while sobject > remains zero (unbound). Looking at phonet state transitions: > > - pn_socket_connect() binds the socket before changing state > - pn_socket_listen() binds the socket before changing state > - pep_sock_accept() creates sockets with state=TCP_SYN_RECV but also > sets sobject > - pep_ioctl(SIOCPNENABLEPIPE) checks sobject before changing state > > Is there a code path that changes sk_state to non-TCP_CLOSE without > first binding the socket? The syzbot crash is real, but the mechanism > described couldn't be verified. Could this involve a race condition or > a different code path? > > > 3. socket already bound <- only intended case > > > > When -EINVAL comes from cases 1 or 2, sobject is still > > zero (never assigned), causing BUG_ON to fire and crash > > the kernel. > > Given the issues with cases 1 and 2 above, is the explanation of the > crash mechanism accurate? Case 1 cannot happen from pn_socket_autobind(), > and case 2's ability to leave sobject at zero while triggering -EINVAL > could not be verified. > > The syzbot crash is confirmed (RIP at pn_socket_autobind > net/phonet/socket.c:213 via pn_socket_sendmsg), but could the commit > message more accurately describe how the crash actually occurs? > > > diff --git a/net/phonet/socket.c b/net/phonet/socket.c > > index c4af26357144..5a55e7d14e85 100644 > > --- a/net/phonet/socket.c > > +++ b/net/phonet/socket.c > > @@ -204,14 +204,14 @@ static int pn_socket_autobind(struct socket *sock) > > struct sockaddr_pn sa; > > int err; > > > > + if (pn_port(pn_sk(sock->sk)->sobject)) > > + return 0; /* socket was already bound */ > > + > > memset(&sa, 0, sizeof(sa)); > > sa.spn_family = AF_PHONET; > > err = pn_socket_bind(sock, (struct sockaddr_unsized *)&sa, > > sizeof(struct sockaddr_pn)); > > - if (err != -EINVAL) > > - return err; > > - BUG_ON(!pn_port(pn_sk(sock->sk)->sobject)); > > - return 0; /* socket was already bound */ > > + return err; > > } > > The fix itself is correct and removes the dangerous BUG_ON(). However, > the pn_port(sobject) check is performed without holding any lock. Could > a concurrent pn_socket_bind() race with this check? This is likely > benign since pn_socket_bind() re-checks under lock_sock(), but is this > the intended behavior? > -- > pw-bot: cr Hi Jakub, Thank you for the detailed review! You are correct on all three points. Let me address each one: 1. Case 1 ("address length too short") is indeed impossible from pn_socket_autobind() since it always passes sizeof(struct sockaddr_pn). I will remove this from the commit message in v2. 2. Regarding case 2 (sk_state != TCP_CLOSE with sobject == 0), I verified this using the syzbot C reproducer. The exact path is: r0 = socket$kcm(AF_KCM, SOCK_SEQPACKET, 0) listen(r0, ...) → sk_state = TCP_LISTEN, sobject = 0 r3 = accept4(r0, ...) → creates NEW socket r3 → sk_state = TCP_SYN_RECV (not TCP_CLOSE!) → sobject = 0 (never bound!) sendmsg(r3) → pn_socket_autobind(r3) → pn_socket_bind() sees sk_state != TCP_CLOSE while sobject is still 0 → returns -EINVAL → BUG_ON fires! sobject is only set in two places: a. successful pn_socket_bind() → get_port() assigns port b. pep_sock_accept() → copied from request Since r3 comes from accept4() on a KCM socket (not a PEP phonet socket), pep_sock_accept() is never called and sobject stays zero. So sk_state != TCP_CLOSE and sobject == 0 can indeed be true simultaneously. 3. Regarding the missing lock: the check: if (pn_port(pn_sk(sock->sk)->sobject)) return 0; is performed without lock_sock(). A concurrent pn_socket_bind() could race with this read. However since pn_socket_bind() re-checks pn_port(sobject) under lock_sock(), any racing bind would be caught there. So the race is benign. I can add a comment clarifying this if desired. I will send a v2 with an updated commit message reflecting the above. Thanks, Deepanshu Kartikey ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-04-28 8:48 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-04-22 2:15 [PATCH] net: phonet: fix BUG_ON() in pn_socket_autobind() Deepanshu Kartikey 2026-04-22 15:13 ` Rémi Denis-Courmont 2026-04-28 0:54 ` Jakub Kicinski 2026-04-28 8:48 ` Deepanshu Kartikey
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox