public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH net v1] net: mptcp: fix slab-use-after-free in __inet_lookup_established
@ 2026-04-03 13:07 Jiayuan Chen
  2026-04-04  1:55 ` Jiayuan Chen
  2026-04-04 11:15 ` Matthieu Baerts
  0 siblings, 2 replies; 4+ messages in thread
From: Jiayuan Chen @ 2026-04-03 13:07 UTC (permalink / raw)
  To: mptcp, netdev
  Cc: Jiayuan Chen, Matthieu Baerts, Mat Martineau, Geliang Tang,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, linux-kernel

The ehash table lookups are lockless and rely on
SLAB_TYPESAFE_BY_RCU to guarantee socket memory stability
during RCU read-side critical sections. Both tcp_prot and
tcpv6_prot have their slab caches created with this flag
via proto_register().

However, MPTCP's mptcp_subflow_init() copies tcpv6_prot into
tcpv6_prot_override during inet_init() (fs_initcall, level 5),
before inet6_init() (module_init/device_initcall, level 6) has
called proto_register(&tcpv6_prot). At that point,
tcpv6_prot.slab is still NULL, so tcpv6_prot_override.slab
remains NULL permanently.

This causes MPTCP v6 subflow child sockets to be allocated via
kmalloc (falling into kmalloc-4k) instead of the TCPv6 slab
cache. The kmalloc-4k cache lacks SLAB_TYPESAFE_BY_RCU, so
when these sockets are freed without SOCK_RCU_FREE (which is
cleared for child sockets by design), the memory can be
immediately reused. Concurrent ehash lookups under
rcu_read_lock can then access freed memory, triggering a
slab-use-after-free in __inet_lookup_established.

Fix this by splitting the IPv6-specific initialization out of
mptcp_subflow_init() into a new mptcp_subflow_v6_init(), which
is called from mptcpv6_init() after proto_register(&tcpv6_prot)
has completed. This ensures tcpv6_prot_override.slab correctly
inherits the SLAB_TYPESAFE_BY_RCU slab cache.

Fixes: b19bc2945b40 ("mptcp: implement delegated actions")
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
 net/mptcp/ctrl.c     |  6 +++++-
 net/mptcp/protocol.h |  1 +
 net/mptcp/subflow.c  | 15 +++++++++------
 3 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/net/mptcp/ctrl.c b/net/mptcp/ctrl.c
index d96130e49942..5887ddcdb875 100644
--- a/net/mptcp/ctrl.c
+++ b/net/mptcp/ctrl.c
@@ -583,7 +583,11 @@ int __init mptcpv6_init(void)
 	int err;
 
 	err = mptcp_proto_v6_init();
+	if (err)
+		return err;
 
-	return err;
+	mptcp_subflow_v6_init();
+
+	return 0;
 }
 #endif
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 0bd1ee860316..ec15e503da8b 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -875,6 +875,7 @@ static inline void mptcp_subflow_tcp_fallback(struct sock *sk,
 void __init mptcp_proto_init(void);
 #if IS_ENABLED(CONFIG_MPTCP_IPV6)
 int __init mptcp_proto_v6_init(void);
+void __init mptcp_subflow_v6_init(void);
 #endif
 
 struct sock *mptcp_sk_clone_init(const struct sock *sk,
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 6716970693e9..4ff5863aa9fd 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -2165,7 +2165,15 @@ void __init mptcp_subflow_init(void)
 	tcp_prot_override.psock_update_sk_prot = NULL;
 #endif
 
+	mptcp_diag_subflow_init(&subflow_ulp_ops);
+
+	if (tcp_register_ulp(&subflow_ulp_ops) != 0)
+		panic("MPTCP: failed to register subflows to ULP\n");
+}
+
 #if IS_ENABLED(CONFIG_MPTCP_IPV6)
+void __init mptcp_subflow_v6_init(void)
+{
 	/* In struct mptcp_subflow_request_sock, we assume the TCP request sock
 	 * structures for v4 and v6 have the same size. It should not changed in
 	 * the future but better to make sure to be warned if it is no longer
@@ -2204,10 +2212,5 @@ void __init mptcp_subflow_init(void)
 	/* Disable sockmap processing for subflows */
 	tcpv6_prot_override.psock_update_sk_prot = NULL;
 #endif
-#endif
-
-	mptcp_diag_subflow_init(&subflow_ulp_ops);
-
-	if (tcp_register_ulp(&subflow_ulp_ops) != 0)
-		panic("MPTCP: failed to register subflows to ULP\n");
 }
+#endif
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH net v1] net: mptcp: fix slab-use-after-free in __inet_lookup_established
  2026-04-03 13:07 [PATCH net v1] net: mptcp: fix slab-use-after-free in __inet_lookup_established Jiayuan Chen
@ 2026-04-04  1:55 ` Jiayuan Chen
  2026-04-04 11:15 ` Matthieu Baerts
  1 sibling, 0 replies; 4+ messages in thread
From: Jiayuan Chen @ 2026-04-04  1:55 UTC (permalink / raw)
  To: mptcp, netdev
  Cc: Matthieu Baerts, Mat Martineau, Geliang Tang, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
	linux-kernel


On 4/3/26 9:07 PM, Jiayuan Chen wrote:
> The ehash table lookups are lockless and rely on
> SLAB_TYPESAFE_BY_RCU to guarantee socket memory stability
> during RCU read-side critical sections. Both tcp_prot and
> tcpv6_prot have their slab caches created with this flag
> via proto_register().
>
> However, MPTCP's mptcp_subflow_init() copies tcpv6_prot into
> tcpv6_prot_override during inet_init() (fs_initcall, level 5),
> before inet6_init() (module_init/device_initcall, level 6) has
> called proto_register(&tcpv6_prot). At that point,
> tcpv6_prot.slab is still NULL, so tcpv6_prot_override.slab
> remains NULL permanently.
>
> This causes MPTCP v6 subflow child sockets to be allocated via
> kmalloc (falling into kmalloc-4k) instead of the TCPv6 slab
> cache. The kmalloc-4k cache lacks SLAB_TYPESAFE_BY_RCU, so
> when these sockets are freed without SOCK_RCU_FREE (which is
> cleared for child sockets by design), the memory can be
> immediately reused. Concurrent ehash lookups under
> rcu_read_lock can then access freed memory, triggering a
> slab-use-after-free in __inet_lookup_established.
>
> Fix this by splitting the IPv6-specific initialization out of
> mptcp_subflow_init() into a new mptcp_subflow_v6_init(), which
> is called from mptcpv6_init() after proto_register(&tcpv6_prot)
> has completed. This ensures tcpv6_prot_override.slab correctly
> inherits the SLAB_TYPESAFE_BY_RCU slab cache.
>
> Fixes: b19bc2945b40 ("mptcp: implement delegated actions")
> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> ---
>   net/mptcp/ctrl.c     |  6 +++++-
>   net/mptcp/protocol.h |  1 +
>   net/mptcp/subflow.c  | 15 +++++++++------
>   3 files changed, 15 insertions(+), 7 deletions(-)
>
> diff --git a/net/mptcp/ctrl.c b/net/mptcp/ctrl.c
> index d96130e49942..5887ddcdb875 100644
> --- a/net/mptcp/ctrl.c
> +++ b/net/mptcp/ctrl.c
> @@ -583,7 +583,11 @@ int __init mptcpv6_init(void)
>   	int err;
>   
>   	err = mptcp_proto_v6_init();
> +	if (err)
> +		return err;
>   
> -	return err;
> +	mptcp_subflow_v6_init();
> +
> +	return 0;
>   }
>   #endif
> diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
> index 0bd1ee860316..ec15e503da8b 100644
> --- a/net/mptcp/protocol.h
> +++ b/net/mptcp/protocol.h
> @@ -875,6 +875,7 @@ static inline void mptcp_subflow_tcp_fallback(struct sock *sk,
>   void __init mptcp_proto_init(void);
>   #if IS_ENABLED(CONFIG_MPTCP_IPV6)
>   int __init mptcp_proto_v6_init(void);
> +void __init mptcp_subflow_v6_init(void);
>   #endif
>   
>   struct sock *mptcp_sk_clone_init(const struct sock *sk,
> diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
> index 6716970693e9..4ff5863aa9fd 100644
> --- a/net/mptcp/subflow.c
> +++ b/net/mptcp/subflow.c
> @@ -2165,7 +2165,15 @@ void __init mptcp_subflow_init(void)
>   	tcp_prot_override.psock_update_sk_prot = NULL;
>   #endif
>   
> +	mptcp_diag_subflow_init(&subflow_ulp_ops);
> +
> +	if (tcp_register_ulp(&subflow_ulp_ops) != 0)
> +		panic("MPTCP: failed to register subflows to ULP\n");
> +}
> +
>   #if IS_ENABLED(CONFIG_MPTCP_IPV6)
> +void __init mptcp_subflow_v6_init(void)
> +{
>   	/* In struct mptcp_subflow_request_sock, we assume the TCP request sock
>   	 * structures for v4 and v6 have the same size. It should not changed in
>   	 * the future but better to make sure to be warned if it is no longer
> @@ -2204,10 +2212,5 @@ void __init mptcp_subflow_init(void)
>   	/* Disable sockmap processing for subflows */
>   	tcpv6_prot_override.psock_update_sk_prot = NULL;
>   #endif
> -#endif
> -
> -	mptcp_diag_subflow_init(&subflow_ulp_ops);
> -
> -	if (tcp_register_ulp(&subflow_ulp_ops) != 0)
> -		panic("MPTCP: failed to register subflows to ULP\n");
>   }
> +#endif



I think the AI review is not accurate here.
https://sashiko.dev/#/patchset/20260403130734.93981-1-jiayuan.chen%40linux.dev

Userspace programs only start running after all initcalls have 
completed, so there is no race condition.


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH net v1] net: mptcp: fix slab-use-after-free in __inet_lookup_established
  2026-04-03 13:07 [PATCH net v1] net: mptcp: fix slab-use-after-free in __inet_lookup_established Jiayuan Chen
  2026-04-04  1:55 ` Jiayuan Chen
@ 2026-04-04 11:15 ` Matthieu Baerts
  2026-04-04 14:16   ` Jiayuan Chen
  1 sibling, 1 reply; 4+ messages in thread
From: Matthieu Baerts @ 2026-04-04 11:15 UTC (permalink / raw)
  To: Jiayuan Chen, mptcp, netdev
  Cc: Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, linux-kernel

Hi Jiayuan,

On 03/04/2026 15:07, Jiayuan Chen wrote:
> The ehash table lookups are lockless and rely on
> SLAB_TYPESAFE_BY_RCU to guarantee socket memory stability
> during RCU read-side critical sections. Both tcp_prot and
> tcpv6_prot have their slab caches created with this flag
> via proto_register().
> 
> However, MPTCP's mptcp_subflow_init() copies tcpv6_prot into
> tcpv6_prot_override during inet_init() (fs_initcall, level 5),
> before inet6_init() (module_init/device_initcall, level 6) has
> called proto_register(&tcpv6_prot). At that point,
> tcpv6_prot.slab is still NULL, so tcpv6_prot_override.slab
> remains NULL permanently.
> 
> This causes MPTCP v6 subflow child sockets to be allocated via
> kmalloc (falling into kmalloc-4k) instead of the TCPv6 slab
> cache. The kmalloc-4k cache lacks SLAB_TYPESAFE_BY_RCU, so
> when these sockets are freed without SOCK_RCU_FREE (which is
> cleared for child sockets by design), the memory can be
> immediately reused. Concurrent ehash lookups under
> rcu_read_lock can then access freed memory, triggering a
> slab-use-after-free in __inet_lookup_established.

Good catch! Thank you for this patch.

> Fix this by splitting the IPv6-specific initialization out of
> mptcp_subflow_init() into a new mptcp_subflow_v6_init(), which
> is called from mptcpv6_init() after proto_register(&tcpv6_prot)
> has completed. This ensures tcpv6_prot_override.slab correctly
> inherits the SLAB_TYPESAFE_BY_RCU slab cache.

The split makes sense anyway: better to regroup all v6-related init steps.

> Fixes: b19bc2945b40 ("mptcp: implement delegated actions")
> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> ---
>  net/mptcp/ctrl.c     |  6 +++++-
>  net/mptcp/protocol.h |  1 +
>  net/mptcp/subflow.c  | 15 +++++++++------
>  3 files changed, 15 insertions(+), 7 deletions(-)
> 
> diff --git a/net/mptcp/ctrl.c b/net/mptcp/ctrl.c
> index d96130e49942..5887ddcdb875 100644
> --- a/net/mptcp/ctrl.c
> +++ b/net/mptcp/ctrl.c
> @@ -583,7 +583,11 @@ int __init mptcpv6_init(void)
>  	int err;
>  
>  	err = mptcp_proto_v6_init();
> +	if (err)
> +		return err;
>  
> -	return err;
> +	mptcp_subflow_v6_init();

I think it would be better to move this to mptcp_proto_v6_init, similar
to what is done with mptcp_subflow_init, from mptcp_proto_init.

From there, you can even call it before registering the protocol, at the
beginning, so before inet6_register_protosw, which seems more logical
and similar to what is done in v4. WDYT?

If you send a v2, can you please remove the 'net:' prefix please?
'mptcp:' is enough:

  [PATCH net v2] mptcp: fix (...)

Also, can you add a "Cc: stable" tag please?

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH net v1] net: mptcp: fix slab-use-after-free in __inet_lookup_established
  2026-04-04 11:15 ` Matthieu Baerts
@ 2026-04-04 14:16   ` Jiayuan Chen
  0 siblings, 0 replies; 4+ messages in thread
From: Jiayuan Chen @ 2026-04-04 14:16 UTC (permalink / raw)
  To: Matthieu Baerts, mptcp, netdev
  Cc: Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, linux-kernel


On 4/4/26 7:15 PM, Matthieu Baerts wrote:
> Hi Jiayuan,
>
> On 03/04/2026 15:07, Jiayuan Chen wrote:
>> The ehash table lookups are lockless and rely on
>> SLAB_TYPESAFE_BY_RCU to guarantee socket memory stability
>> during RCU read-side critical sections. Both tcp_prot and
>> tcpv6_prot have their slab caches created with this flag
>> via proto_register().
>>
>> However, MPTCP's mptcp_subflow_init() copies tcpv6_prot into
>> tcpv6_prot_override during inet_init() (fs_initcall, level 5),
>> before inet6_init() (module_init/device_initcall, level 6) has
>> called proto_register(&tcpv6_prot). At that point,
>> tcpv6_prot.slab is still NULL, so tcpv6_prot_override.slab
>> remains NULL permanently.
>>
>> This causes MPTCP v6 subflow child sockets to be allocated via
>> kmalloc (falling into kmalloc-4k) instead of the TCPv6 slab
>> cache. The kmalloc-4k cache lacks SLAB_TYPESAFE_BY_RCU, so
>> when these sockets are freed without SOCK_RCU_FREE (which is
>> cleared for child sockets by design), the memory can be
>> immediately reused. Concurrent ehash lookups under
>> rcu_read_lock can then access freed memory, triggering a
>> slab-use-after-free in __inet_lookup_established.
> Good catch! Thank you for this patch.
>
>> Fix this by splitting the IPv6-specific initialization out of
>> mptcp_subflow_init() into a new mptcp_subflow_v6_init(), which
>> is called from mptcpv6_init() after proto_register(&tcpv6_prot)
>> has completed. This ensures tcpv6_prot_override.slab correctly
>> inherits the SLAB_TYPESAFE_BY_RCU slab cache.
> The split makes sense anyway: better to regroup all v6-related init steps.
>
>> Fixes: b19bc2945b40 ("mptcp: implement delegated actions")
>> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
>> ---
>>   net/mptcp/ctrl.c     |  6 +++++-
>>   net/mptcp/protocol.h |  1 +
>>   net/mptcp/subflow.c  | 15 +++++++++------
>>   3 files changed, 15 insertions(+), 7 deletions(-)
>>
>> diff --git a/net/mptcp/ctrl.c b/net/mptcp/ctrl.c
>> index d96130e49942..5887ddcdb875 100644
>> --- a/net/mptcp/ctrl.c
>> +++ b/net/mptcp/ctrl.c
>> @@ -583,7 +583,11 @@ int __init mptcpv6_init(void)
>>   	int err;
>>   
>>   	err = mptcp_proto_v6_init();
>> +	if (err)
>> +		return err;
>>   
>> -	return err;
>> +	mptcp_subflow_v6_init();
> I think it would be better to move this to mptcp_proto_v6_init, similar
> to what is done with mptcp_subflow_init, from mptcp_proto_init.
>
>  From there, you can even call it before registering the protocol, at the
> beginning, so before inet6_register_protosw, which seems more logical
> and similar to what is done in v4. WDYT?
>
> If you send a v2, can you please remove the 'net:' prefix please?
> 'mptcp:' is enough:
>
>    [PATCH net v2] mptcp: fix (...)
>
> Also, can you add a "Cc: stable" tag please?
>
> Cheers,
> Matt


Hi Matt,

Thanks for the feedback! I'll try all your suggestions in v2.

Thanks!

Jiayuan

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2026-04-04 14:17 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-03 13:07 [PATCH net v1] net: mptcp: fix slab-use-after-free in __inet_lookup_established Jiayuan Chen
2026-04-04  1:55 ` Jiayuan Chen
2026-04-04 11:15 ` Matthieu Baerts
2026-04-04 14:16   ` Jiayuan Chen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox