Netdev List
 help / color / mirror / Atom feed
* [PATCH v2] llc: Fix race between sock_orphan() and timer callback in llc_sk_free()
@ 2026-05-29  2:00 Jiakai Xu
  2026-06-02 20:30 ` Jakub Kicinski
  0 siblings, 1 reply; 4+ messages in thread
From: Jiakai Xu @ 2026-05-29  2:00 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: David S . Miller, Breno Leitao, Eric Dumazet, Ernestas Kulik,
	Jakub Kicinski, Jiakai Xu, Kees Cook, Kuniyuki Iwashima,
	Paolo Abeni, Simon Horman

In llc_ui_release(), sock_orphan() was called before llc_sk_free()
stopped all LLC timers. A pending timer callback
(llc_conn_ack_tmr_cb()->llc_process_tmr_ev()->llc_conn_state_process())
could fire between these two operations and dereference the
NULL sk->sk_socket that sock_orphan() sets, causing a kernel
page fault.

Fix the race by moving sock_orphan() into llc_sk_free(), after
llc_sk_stop_all_timers() has completed. This guarantees that
all timers are stopped before the socket is orphaned, eliminating
the window for the race.

Fixes: aa2b2eb39348 ("llc: call sock_orphan() at release time")
Signed-off-by: Jiakai Xu <xujiakai24@mails.ucas.ac.cn>
---
V1 -> V2:
  - Replaced sk->sk_socket NULL checks with moving sock_orphan()
    after timer stop, as suggested by Paolo Abeni.
Link: https://lore.kernel.org/lkml/20260526013541.796307-1-xujiakai24@mails.ucas.ac.cn/T/#u
---
 net/llc/af_llc.c   | 1 -
 net/llc/llc_conn.c | 5 +++++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c
index 35278c519a30..92f3576b339a 100644
--- a/net/llc/af_llc.c
+++ b/net/llc/af_llc.c
@@ -227,7 +227,6 @@ static int llc_ui_release(struct socket *sock)
 	}
 	netdev_put(llc->dev, &llc->dev_tracker);
 	sock_put(sk);
-	sock_orphan(sk);
 	sock->sk = NULL;
 	llc_sk_free(sk);
 out:
diff --git a/net/llc/llc_conn.c b/net/llc/llc_conn.c
index 5c0ac243b248..c02285441592 100644
--- a/net/llc/llc_conn.c
+++ b/net/llc/llc_conn.c
@@ -977,6 +977,11 @@ void llc_sk_free(struct sock *sk)
 	llc->state = LLC_CONN_OUT_OF_SVC;
 	/* Stop all (possibly) running timers */
 	llc_sk_stop_all_timers(sk, true);
+	/* Orphan the socket after timers are stopped; otherwise a pending
+	 * timer callback could dereference the NULL sk->sk_socket that
+	 * sock_orphan() sets.
+	 */
+	sock_orphan(sk);
 #ifdef DEBUG_LLC_CONN_ALLOC
 	printk(KERN_INFO "%s: unackq=%d, txq=%d\n", __func__,
 		skb_queue_len(&llc->pdu_unack_q),
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] llc: Fix race between sock_orphan() and timer callback in llc_sk_free()
  2026-05-29  2:00 [PATCH v2] llc: Fix race between sock_orphan() and timer callback in llc_sk_free() Jiakai Xu
@ 2026-06-02 20:30 ` Jakub Kicinski
  2026-06-03  1:30   ` Jiakai Xu
  0 siblings, 1 reply; 4+ messages in thread
From: Jakub Kicinski @ 2026-06-02 20:30 UTC (permalink / raw)
  To: Jiakai Xu
  Cc: linux-kernel, netdev, David S . Miller, Breno Leitao,
	Eric Dumazet, Ernestas Kulik, Kees Cook, Kuniyuki Iwashima,
	Paolo Abeni, Simon Horman

On Fri, 29 May 2026 02:00:59 +0000 Jiakai Xu wrote:
> In llc_ui_release(), sock_orphan() was called before llc_sk_free()
> stopped all LLC timers. A pending timer callback
> (llc_conn_ack_tmr_cb()->llc_process_tmr_ev()->llc_conn_state_process())
> could fire between these two operations and dereference the
> NULL sk->sk_socket that sock_orphan() sets, causing a kernel
> page fault.
> 
> Fix the race by moving sock_orphan() into llc_sk_free(), after
> llc_sk_stop_all_timers() has completed. This guarantees that
> all timers are stopped before the socket is orphaned, eliminating
> the window for the race.

Sashiko points out that there's more issues if the timer runs after
llc_ui_release(). Can you reliably reproduce this? Have you checked
that this change is sufficient? Sashiko says that llc->dev may
disappear even tho we don't clear that pointer in _release().
-- 
pw-bot: cr

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] llc: Fix race between sock_orphan() and timer callback in llc_sk_free()
  2026-06-02 20:30 ` Jakub Kicinski
@ 2026-06-03  1:30   ` Jiakai Xu
  2026-06-03  2:11     ` Jakub Kicinski
  0 siblings, 1 reply; 4+ messages in thread
From: Jiakai Xu @ 2026-06-03  1:30 UTC (permalink / raw)
  To: kuba
  Cc: davem, edumazet, ernestas.k, horms, kees, kuniyu, leitao,
	linux-kernel, netdev, pabeni, xujiakai24

Thank you very much for your review and feedback. I really appreciate 
you taking the time to look at this.

> Sashiko points out that there's more issues if the timer runs after
> llc_ui_release(). Can you reliably reproduce this? Have you checked
> that this change is sufficient? Sashiko says that llc->dev may
> disappear even tho we don't clear that pointer in _release().

This crash was discovered by fuzzing. Unfortunately, the fuzzer did 
not generate a reproducer program, so I am unable to reproduce it. 
Our analysis has been based entirely on the crash report.

I'm not an expert in this area, so the quality of my patches may be 
low. I really appreciate your patience and the time you've taken to 
review this. Would this V3 approach (moving both sock_orphan() and 
netdev_put() into llc_sk_free() after the timer stop) be the correct 
way to proceed?

Regards,
Jiakai


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] llc: Fix race between sock_orphan() and timer callback in llc_sk_free()
  2026-06-03  1:30   ` Jiakai Xu
@ 2026-06-03  2:11     ` Jakub Kicinski
  0 siblings, 0 replies; 4+ messages in thread
From: Jakub Kicinski @ 2026-06-03  2:11 UTC (permalink / raw)
  To: Jiakai Xu
  Cc: davem, edumazet, ernestas.k, horms, kees, kuniyu, leitao,
	linux-kernel, netdev, pabeni

On Wed,  3 Jun 2026 01:30:07 +0000 Jiakai Xu wrote:
> > Sashiko points out that there's more issues if the timer runs after
> > llc_ui_release(). Can you reliably reproduce this? Have you checked
> > that this change is sufficient? Sashiko says that llc->dev may
> > disappear even tho we don't clear that pointer in _release().  
> 
> This crash was discovered by fuzzing. Unfortunately, the fuzzer did 
> not generate a reproducer program, so I am unable to reproduce it. 
> Our analysis has been based entirely on the crash report.
> 
> I'm not an expert in this area, so the quality of my patches may be 
> low. I really appreciate your patience and the time you've taken to 
> review this. Would this V3 approach (moving both sock_orphan() and 
> netdev_put() into llc_sk_free() after the timer stop) be the correct 
> way to proceed?

Not sure, feels like we're trying to fix symptoms instead of addressing
the real root cause.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2026-06-03  2:11 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-29  2:00 [PATCH v2] llc: Fix race between sock_orphan() and timer callback in llc_sk_free() Jiakai Xu
2026-06-02 20:30 ` Jakub Kicinski
2026-06-03  1:30   ` Jiakai Xu
2026-06-03  2:11     ` Jakub Kicinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox