public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH net v3 1/3] net,mptcp: fix proto fallback detection with BPF sockmap
       [not found] <20251023125450.105859-1-jiayuan.chen@linux.dev>
@ 2025-10-23 12:54 ` Jiayuan Chen
  2025-10-23 14:10   ` Matthieu Baerts
  2025-10-28 11:30   ` Paolo Abeni
  2025-10-23 12:54 ` [PATCH net v3 2/3] bpf,sockmap: disallow MPTCP sockets from sockmap Jiayuan Chen
  1 sibling, 2 replies; 10+ messages in thread
From: Jiayuan Chen @ 2025-10-23 12:54 UTC (permalink / raw)
  To: mptcp
  Cc: Jiayuan Chen, stable, Jakub Sitnicki, John Fastabend,
	Eric Dumazet, Kuniyuki Iwashima, Paolo Abeni, Willem de Bruijn,
	David S. Miller, Jakub Kicinski, Simon Horman, Matthieu Baerts,
	Mat Martineau, Geliang Tang, Andrii Nakryiko, Eduard Zingerman,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Shuah Khan, Florian Westphal, linux-kernel, netdev, bpf,
	linux-kselftest

When the server has MPTCP enabled but receives a non-MP-capable request
from a client, it calls mptcp_fallback_tcp_ops().

Since non-MPTCP connections are allowed to use sockmap, which replaces
sk->sk_prot, using sk->sk_prot to determine the IP version in
mptcp_fallback_tcp_ops() becomes unreliable. This can lead to assigning
incorrect ops to sk->sk_socket->ops.

Additionally, when BPF Sockmap modifies the protocol handlers, the
original WARN_ON_ONCE(sk->sk_prot != &tcp_prot) check would falsely
trigger warnings.

Fix this by using the more stable sk_family to distinguish between IPv4
and IPv6 connections, ensuring correct fallback protocol operations are
selected even when BPF Sockmap has modified the socket protocol handlers.

Fixes: 0b4f33def7bb ("mptcp: fix tcp fallback crash")
Cc: <stable@vger.kernel.org>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 net/mptcp/protocol.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 0292162a14ee..2393741bc310 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -61,11 +61,16 @@ static u64 mptcp_wnd_end(const struct mptcp_sock *msk)
 
 static const struct proto_ops *mptcp_fallback_tcp_ops(const struct sock *sk)
 {
+	/* When BPF sockmap is used, it may replace sk->sk_prot.
+	 * Using sk_family is a reliable way to determine the IP version.
+	 */
+	unsigned short family = READ_ONCE(sk->sk_family);
+
 #if IS_ENABLED(CONFIG_MPTCP_IPV6)
-	if (sk->sk_prot == &tcpv6_prot)
+	if (family == AF_INET6)
 		return &inet6_stream_ops;
 #endif
-	WARN_ON_ONCE(sk->sk_prot != &tcp_prot);
+	WARN_ON_ONCE(family != AF_INET);
 	return &inet_stream_ops;
 }
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH net v3 2/3] bpf,sockmap: disallow MPTCP sockets from sockmap
       [not found] <20251023125450.105859-1-jiayuan.chen@linux.dev>
  2025-10-23 12:54 ` [PATCH net v3 1/3] net,mptcp: fix proto fallback detection with BPF sockmap Jiayuan Chen
@ 2025-10-23 12:54 ` Jiayuan Chen
  2025-10-28 12:03   ` Paolo Abeni
  1 sibling, 1 reply; 10+ messages in thread
From: Jiayuan Chen @ 2025-10-23 12:54 UTC (permalink / raw)
  To: mptcp
  Cc: Jiayuan Chen, stable, Jakub Sitnicki, John Fastabend,
	Eric Dumazet, Kuniyuki Iwashima, Paolo Abeni, Willem de Bruijn,
	David S. Miller, Jakub Kicinski, Simon Horman, Matthieu Baerts,
	Mat Martineau, Geliang Tang, Andrii Nakryiko, Eduard Zingerman,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Shuah Khan, Florian Westphal, linux-kernel, netdev, bpf,
	linux-kselftest

MPTCP creates subflows for data transmission, and these sockets should not
be added to sockmap because MPTCP sets specialized data_ready handlers
that would be overridden by sockmap.

Additionally, for the parent socket of MPTCP subflows (plain TCP socket),
MPTCP sk requires specific protocol handling that conflicts with sockmap's
operation(mptcp_prot).

This patch adds proper checks to reject MPTCP subflows and their parent
sockets from being added to sockmap, while preserving compatibility with
reuseport functionality for listening MPTCP sockets.

We cannot add this logic to sock_map_sk_state_allowed() because the sockops
path doesn't execute this function, and the socket state coming from
sockops might be in states like SYN_RECV. So moving
sock_map_sk_state_allowed() to sock_{map,hash}_update_common() is not
appropriate. Instead, we introduce a new function to handle MPTCP checks.

Fixes: 0b4f33def7bb ("mptcp: fix tcp fallback crash")
Cc: <stable@vger.kernel.org>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 net/core/sock_map.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index 5947b38e4f8b..5be38cdfb5cc 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -467,6 +467,27 @@ static int sock_map_get_next_key(struct bpf_map *map, void *key, void *next)
 	return 0;
 }
 
+/* Disallow MPTCP subflows and their parent sockets. However, a TCP_LISTEN
+ * MPTCP socket is permitted because sockmap can also serve for reuseport
+ * socket selection.
+ */
+static inline bool sock_map_sk_type_allowed(const struct sock *sk)
+{
+	/* MPTCP subflows are not intended for data I/O by user */
+	if (sk_is_tcp(sk) && sk_is_mptcp(sk))
+		goto disallow;
+
+	/* MPTCP parents use mptcp_prot - not supported with sockmap yet */
+	if (sk->sk_protocol == IPPROTO_MPTCP && sk->sk_state != TCP_LISTEN)
+		goto disallow;
+
+	return true;
+
+disallow:
+	pr_err_once("sockmap/sockhash: MPTCP sockets are not supported\n");
+	return false;
+}
+
 static int sock_map_update_common(struct bpf_map *map, u32 idx,
 				  struct sock *sk, u64 flags)
 {
@@ -482,6 +503,9 @@ static int sock_map_update_common(struct bpf_map *map, u32 idx,
 	if (unlikely(idx >= map->max_entries))
 		return -E2BIG;
 
+	if (!sock_map_sk_type_allowed(sk))
+		return -EOPNOTSUPP;
+
 	link = sk_psock_init_link();
 	if (!link)
 		return -ENOMEM;
@@ -1003,6 +1027,9 @@ static int sock_hash_update_common(struct bpf_map *map, void *key,
 	if (unlikely(flags > BPF_EXIST))
 		return -EINVAL;
 
+	if (!sock_map_sk_type_allowed(sk))
+		return -EOPNOTSUPP;
+
 	link = sk_psock_init_link();
 	if (!link)
 		return -ENOMEM;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH net v3 1/3] net,mptcp: fix proto fallback detection with BPF sockmap
  2025-10-23 12:54 ` [PATCH net v3 1/3] net,mptcp: fix proto fallback detection with BPF sockmap Jiayuan Chen
@ 2025-10-23 14:10   ` Matthieu Baerts
  2025-10-23 14:38     ` Jiayuan Chen
  2025-10-28 11:30   ` Paolo Abeni
  1 sibling, 1 reply; 10+ messages in thread
From: Matthieu Baerts @ 2025-10-23 14:10 UTC (permalink / raw)
  To: Jiayuan Chen, mptcp
  Cc: stable, Jakub Sitnicki, John Fastabend, Eric Dumazet,
	Kuniyuki Iwashima, Paolo Abeni, Willem de Bruijn, David S. Miller,
	Jakub Kicinski, Simon Horman, Mat Martineau, Geliang Tang,
	Andrii Nakryiko, Eduard Zingerman, Alexei Starovoitov,
	Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Shuah Khan,
	Florian Westphal, linux-kernel, netdev, bpf, linux-kselftest

Hi Jiayuan,

On 23/10/2025 14:54, Jiayuan Chen wrote:
> When the server has MPTCP enabled but receives a non-MP-capable request
> from a client, it calls mptcp_fallback_tcp_ops().
> 
> Since non-MPTCP connections are allowed to use sockmap, which replaces
> sk->sk_prot, using sk->sk_prot to determine the IP version in
> mptcp_fallback_tcp_ops() becomes unreliable. This can lead to assigning
> incorrect ops to sk->sk_socket->ops.
> 
> Additionally, when BPF Sockmap modifies the protocol handlers, the
> original WARN_ON_ONCE(sk->sk_prot != &tcp_prot) check would falsely
> trigger warnings.
> 
> Fix this by using the more stable sk_family to distinguish between IPv4
> and IPv6 connections, ensuring correct fallback protocol operations are
> selected even when BPF Sockmap has modified the socket protocol handlers.
> 
> Fixes: 0b4f33def7bb ("mptcp: fix tcp fallback crash")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---
>  net/mptcp/protocol.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 0292162a14ee..2393741bc310 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -61,11 +61,16 @@ static u64 mptcp_wnd_end(const struct mptcp_sock *msk)
>  
>  static const struct proto_ops *mptcp_fallback_tcp_ops(const struct sock *sk)
>  {
> +	/* When BPF sockmap is used, it may replace sk->sk_prot.
> +	 * Using sk_family is a reliable way to determine the IP version.
> +	 */
> +	unsigned short family = READ_ONCE(sk->sk_family);
> +
>  #if IS_ENABLED(CONFIG_MPTCP_IPV6)
> -	if (sk->sk_prot == &tcpv6_prot)
> +	if (family == AF_INET6)
>  		return &inet6_stream_ops;
>  #endif
> -	WARN_ON_ONCE(sk->sk_prot != &tcp_prot);
> +	WARN_ON_ONCE(family != AF_INET);
>  	return &inet_stream_ops;

Just to be sure: is there anything in BPF modifying sk->sk_socket->ops?
Because that's what mptcp_fallback_tcp_ops() will do somehow.

In other words, is it always fine to set inet(6)_stream_ops? (I guess
yes, but better to be sure while we are looking at that :) )

>  }
>  

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net v3 1/3] net,mptcp: fix proto fallback detection with BPF sockmap
  2025-10-23 14:10   ` Matthieu Baerts
@ 2025-10-23 14:38     ` Jiayuan Chen
  0 siblings, 0 replies; 10+ messages in thread
From: Jiayuan Chen @ 2025-10-23 14:38 UTC (permalink / raw)
  To: Matthieu Baerts, mptcp
  Cc: stable, Jakub Sitnicki, John Fastabend, Eric Dumazet,
	Kuniyuki Iwashima, Paolo Abeni, Willem de Bruijn, David S. Miller,
	Jakub Kicinski, Simon Horman, Mat Martineau, Geliang Tang,
	Andrii Nakryiko, Eduard Zingerman, Alexei Starovoitov,
	Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Shuah Khan,
	Florian Westphal, linux-kernel, netdev, bpf, linux-kselftest

October 23, 2025 at 22:10, "Matthieu Baerts" <matttbe@kernel.org mailto:matttbe@kernel.org?to=%22Matthieu%20Baerts%22%20%3Cmatttbe%40kernel.org%3E > wrote:


> 
> Hi Jiayuan,
> 
> On 23/10/2025 14:54, Jiayuan Chen wrote:
> 
> > 
> > When the server has MPTCP enabled but receives a non-MP-capable request
> >  from a client, it calls mptcp_fallback_tcp_ops().
> >  
> >  Since non-MPTCP connections are allowed to use sockmap, which replaces
> >  sk->sk_prot, using sk->sk_prot to determine the IP version in
> >  mptcp_fallback_tcp_ops() becomes unreliable. This can lead to assigning
> >  incorrect ops to sk->sk_socket->ops.
> >  
> >  Additionally, when BPF Sockmap modifies the protocol handlers, the
> >  original WARN_ON_ONCE(sk->sk_prot != &tcp_prot) check would falsely
> >  trigger warnings.
> >  
> >  Fix this by using the more stable sk_family to distinguish between IPv4
> >  and IPv6 connections, ensuring correct fallback protocol operations are
> >  selected even when BPF Sockmap has modified the socket protocol handlers.
> >  
> >  Fixes: 0b4f33def7bb ("mptcp: fix tcp fallback crash")
> >  Cc: <stable@vger.kernel.org>
> >  Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> >  Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
> >  ---
> >  net/mptcp/protocol.c | 9 +++++++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> >  
> >  diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> >  index 0292162a14ee..2393741bc310 100644
> >  --- a/net/mptcp/protocol.c
> >  +++ b/net/mptcp/protocol.c
> >  @@ -61,11 +61,16 @@ static u64 mptcp_wnd_end(const struct mptcp_sock *msk)
> >  
> >  static const struct proto_ops *mptcp_fallback_tcp_ops(const struct sock *sk)
> >  {
> >  + /* When BPF sockmap is used, it may replace sk->sk_prot.
> >  + * Using sk_family is a reliable way to determine the IP version.
> >  + */
> >  + unsigned short family = READ_ONCE(sk->sk_family);
> >  +
> >  #if IS_ENABLED(CONFIG_MPTCP_IPV6)
> >  - if (sk->sk_prot == &tcpv6_prot)
> >  + if (family == AF_INET6)
> >  return &inet6_stream_ops;
> >  #endif
> >  - WARN_ON_ONCE(sk->sk_prot != &tcp_prot);
> >  + WARN_ON_ONCE(family != AF_INET);
> >  return &inet_stream_ops;
> > 
> Just to be sure: is there anything in BPF modifying sk->sk_socket->ops?
> Because that's what mptcp_fallback_tcp_ops() will do somehow.
> 
> In other words, is it always fine to set inet(6)_stream_ops? (I guess
> yes, but better to be sure while we are looking at that :) )



Hi Matt,

I can confirm that on the BPF side, the only special operations targeting
sockets currently are sockmap/sockhash. Their implementations do not modify
sk->sk_socket->ops. Currently, they only modify sk->prot, because the BPF
side typically operates on 'struct sock' and does not concern itself with
'struct socket'.

Therefore, setting inet(6)_stream_ops is fine.

Thanks,
Jiayuan

> > 
> > }
> > 
> Cheers,
> Matt
> -- 
> Sponsored by the NGI0 Core fund.
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net v3 1/3] net,mptcp: fix proto fallback detection with BPF sockmap
  2025-10-23 12:54 ` [PATCH net v3 1/3] net,mptcp: fix proto fallback detection with BPF sockmap Jiayuan Chen
  2025-10-23 14:10   ` Matthieu Baerts
@ 2025-10-28 11:30   ` Paolo Abeni
  2025-10-28 11:47     ` Paolo Abeni
  2025-11-03 12:44     ` Jiayuan Chen
  1 sibling, 2 replies; 10+ messages in thread
From: Paolo Abeni @ 2025-10-28 11:30 UTC (permalink / raw)
  To: Jiayuan Chen, mptcp
  Cc: stable, Jakub Sitnicki, John Fastabend, Eric Dumazet,
	Kuniyuki Iwashima, Willem de Bruijn, David S. Miller,
	Jakub Kicinski, Simon Horman, Matthieu Baerts, Mat Martineau,
	Geliang Tang, Andrii Nakryiko, Eduard Zingerman,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Shuah Khan, Florian Westphal, linux-kernel, netdev, bpf,
	linux-kselftest

On 10/23/25 2:54 PM, Jiayuan Chen wrote:
> When the server has MPTCP enabled but receives a non-MP-capable request
> from a client, it calls mptcp_fallback_tcp_ops().
> 
> Since non-MPTCP connections are allowed to use sockmap, which replaces
> sk->sk_prot, using sk->sk_prot to determine the IP version in
> mptcp_fallback_tcp_ops() becomes unreliable. This can lead to assigning
> incorrect ops to sk->sk_socket->ops.

I don't see how sockmap could modify the to-be-accepted socket sk_prot
before mptcp_fallback_tcp_ops(), as such call happens before the fd is
installed, and AFAICS sockmap can only fetch sockets via fds.

Is this patch needed?

/P


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net v3 1/3] net,mptcp: fix proto fallback detection with BPF sockmap
  2025-10-28 11:30   ` Paolo Abeni
@ 2025-10-28 11:47     ` Paolo Abeni
  2025-11-03 12:45       ` Jiayuan Chen
  2025-11-03 12:44     ` Jiayuan Chen
  1 sibling, 1 reply; 10+ messages in thread
From: Paolo Abeni @ 2025-10-28 11:47 UTC (permalink / raw)
  To: Jiayuan Chen, mptcp
  Cc: stable, Jakub Sitnicki, John Fastabend, Eric Dumazet,
	Kuniyuki Iwashima, Willem de Bruijn, David S. Miller,
	Jakub Kicinski, Simon Horman, Matthieu Baerts, Mat Martineau,
	Geliang Tang, Andrii Nakryiko, Eduard Zingerman,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Shuah Khan, Florian Westphal, linux-kernel, netdev, bpf,
	linux-kselftest

On 10/28/25 12:30 PM, Paolo Abeni wrote:
> On 10/23/25 2:54 PM, Jiayuan Chen wrote:
>> When the server has MPTCP enabled but receives a non-MP-capable request
>> from a client, it calls mptcp_fallback_tcp_ops().
>>
>> Since non-MPTCP connections are allowed to use sockmap, which replaces
>> sk->sk_prot, using sk->sk_prot to determine the IP version in
>> mptcp_fallback_tcp_ops() becomes unreliable. This can lead to assigning
>> incorrect ops to sk->sk_socket->ops.
> 
> I don't see how sockmap could modify the to-be-accepted socket sk_prot
> before mptcp_fallback_tcp_ops(), as such call happens before the fd is
> installed, and AFAICS sockmap can only fetch sockets via fds.
> 
> Is this patch needed?

Matttbe explained off-list the details of how that could happen. I think
the commit message here must be more verbose to explain clearly the
whys, even to those non proficient in sockmap like me.

Thanks,

Paolo


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net v3 2/3] bpf,sockmap: disallow MPTCP sockets from sockmap
  2025-10-23 12:54 ` [PATCH net v3 2/3] bpf,sockmap: disallow MPTCP sockets from sockmap Jiayuan Chen
@ 2025-10-28 12:03   ` Paolo Abeni
  2025-11-03 12:52     ` Jiayuan Chen
  0 siblings, 1 reply; 10+ messages in thread
From: Paolo Abeni @ 2025-10-28 12:03 UTC (permalink / raw)
  To: Jiayuan Chen, mptcp
  Cc: stable, Jakub Sitnicki, John Fastabend, Eric Dumazet,
	Kuniyuki Iwashima, Willem de Bruijn, David S. Miller,
	Jakub Kicinski, Simon Horman, Matthieu Baerts, Mat Martineau,
	Geliang Tang, Andrii Nakryiko, Eduard Zingerman,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Shuah Khan, Florian Westphal, linux-kernel, netdev, bpf,
	linux-kselftest

On 10/23/25 2:54 PM, Jiayuan Chen wrote:
> MPTCP creates subflows for data transmission, and these sockets should not
> be added to sockmap because MPTCP sets specialized data_ready handlers
> that would be overridden by sockmap.
> 
> Additionally, for the parent socket of MPTCP subflows (plain TCP socket),
> MPTCP sk requires specific protocol handling that conflicts with sockmap's
> operation(mptcp_prot).
> 
> This patch adds proper checks to reject MPTCP subflows and their parent
> sockets from being added to sockmap, while preserving compatibility with
> reuseport functionality for listening MPTCP sockets.

It's unclear to me why that is safe. sockmap is going to change the
listener msk proto ops.

The listener could disconnect and create an egress connection, still
using the wrong ops.

I think sockmap should always be prevented for mptcp socket, or at least
a solid explanation of why such exception is safe should be included in
the commit message.

Note that the first option allows for solving the issue entirely in the
mptcp code, setting dummy/noop psock_update_sk_prot for mptcp sockets
and mptcp subflows.

/P


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net v3 1/3] net,mptcp: fix proto fallback detection with BPF sockmap
  2025-10-28 11:30   ` Paolo Abeni
  2025-10-28 11:47     ` Paolo Abeni
@ 2025-11-03 12:44     ` Jiayuan Chen
  1 sibling, 0 replies; 10+ messages in thread
From: Jiayuan Chen @ 2025-11-03 12:44 UTC (permalink / raw)
  To: Paolo Abeni, mptcp
  Cc: stable, Jakub Sitnicki, John Fastabend, Eric Dumazet,
	Kuniyuki Iwashima, Willem de Bruijn, David S. Miller,
	Jakub Kicinski, Simon Horman, Matthieu Baerts, Mat Martineau,
	Geliang Tang, Andrii Nakryiko, Eduard Zingerman,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Shuah Khan, Florian Westphal, linux-kernel, netdev, bpf,
	linux-kselftest

October 28, 2025 at 19:30, "Paolo Abeni" <pabeni@redhat.com mailto:pabeni@redhat.com?to=%22Paolo%20Abeni%22%20%3Cpabeni%40redhat.com%3E > wrote:


> 
> On 10/23/25 2:54 PM, Jiayuan Chen wrote:
> 
> > 
> > When the server has MPTCP enabled but receives a non-MP-capable request
> >  from a client, it calls mptcp_fallback_tcp_ops().
> >  
> >  Since non-MPTCP connections are allowed to use sockmap, which replaces
> >  sk->sk_prot, using sk->sk_prot to determine the IP version in
> >  mptcp_fallback_tcp_ops() becomes unreliable. This can lead to assigning
> >  incorrect ops to sk->sk_socket->ops.
> > 
> I don't see how sockmap could modify the to-be-accepted socket sk_prot
> before mptcp_fallback_tcp_ops(), as such call happens before the fd is
> installed, and AFAICS sockmap can only fetch sockets via fds.
> 
> Is this patch needed?

"mptcp_fallback_tcp_ops" is only called during the accept process. However,
before that, for an already established TCP socket, its sk_prot is replaced via the following path:
tcp_rcv_state_process()
  tcp_init_transfer(BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB)
    call bpf prog
        bpf_sock_map_update(sk)
           tcp_bpf_update_proto()

However, after discussing with Matthieu, we've concluded that this patch is indeed no
longer necessary, as we have a simpler way to intercept the operation."

Thanks~

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net v3 1/3] net,mptcp: fix proto fallback detection with BPF sockmap
  2025-10-28 11:47     ` Paolo Abeni
@ 2025-11-03 12:45       ` Jiayuan Chen
  0 siblings, 0 replies; 10+ messages in thread
From: Jiayuan Chen @ 2025-11-03 12:45 UTC (permalink / raw)
  To: Paolo Abeni, mptcp
  Cc: stable, Jakub Sitnicki, John Fastabend, Eric Dumazet,
	Kuniyuki Iwashima, Willem de Bruijn, David S. Miller,
	Jakub Kicinski, Simon Horman, Matthieu Baerts, Mat Martineau,
	Geliang Tang, Andrii Nakryiko, Eduard Zingerman,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Shuah Khan, Florian Westphal, linux-kernel, netdev, bpf,
	linux-kselftest

October 28, 2025 at 19:47, "Paolo Abeni" <pabeni@redhat.com mailto:pabeni@redhat.com?to=%22Paolo%20Abeni%22%20%3Cpabeni%40redhat.com%3E > wrote:


> 
> On 10/28/25 12:30 PM, Paolo Abeni wrote:
> 
> > 
> > On 10/23/25 2:54 PM, Jiayuan Chen wrote:
> > 
> > > 
> > > When the server has MPTCP enabled but receives a non-MP-capable request
> > >  from a client, it calls mptcp_fallback_tcp_ops().
> > > 
> > >  Since non-MPTCP connections are allowed to use sockmap, which replaces
> > >  sk->sk_prot, using sk->sk_prot to determine the IP version in
> > >  mptcp_fallback_tcp_ops() becomes unreliable. This can lead to assigning
> > >  incorrect ops to sk->sk_socket->ops.
> > > 
> >  
> >  I don't see how sockmap could modify the to-be-accepted socket sk_prot
> >  before mptcp_fallback_tcp_ops(), as such call happens before the fd is
> >  installed, and AFAICS sockmap can only fetch sockets via fds.
> >  
> >  Is this patch needed?
> > 
> Matttbe explained off-list the details of how that could happen. I think
> the commit message here must be more verbose to explain clearly the
> whys, even to those non proficient in sockmap like me.
> 
> Thanks,
> 
> Paolo
>

Thanks, I will add more details into commit message :).

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net v3 2/3] bpf,sockmap: disallow MPTCP sockets from sockmap
  2025-10-28 12:03   ` Paolo Abeni
@ 2025-11-03 12:52     ` Jiayuan Chen
  0 siblings, 0 replies; 10+ messages in thread
From: Jiayuan Chen @ 2025-11-03 12:52 UTC (permalink / raw)
  To: Paolo Abeni, mptcp
  Cc: stable, Jakub Sitnicki, John Fastabend, Eric Dumazet,
	Kuniyuki Iwashima, Willem de Bruijn, David S. Miller,
	Jakub Kicinski, Simon Horman, Matthieu Baerts, Mat Martineau,
	Geliang Tang, Andrii Nakryiko, Eduard Zingerman,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Shuah Khan, Florian Westphal, linux-kernel, netdev, bpf,
	linux-kselftest

October 28, 2025 at 20:03, "Paolo Abeni" <pabeni@redhat.com mailto:pabeni@redhat.com?to=%22Paolo%20Abeni%22%20%3Cpabeni%40redhat.com%3E > wrote:


> 
> On 10/23/25 2:54 PM, Jiayuan Chen wrote:
> 
> > 
> > MPTCP creates subflows for data transmission, and these sockets should not
> >  be added to sockmap because MPTCP sets specialized data_ready handlers
> >  that would be overridden by sockmap.
> >  
> >  Additionally, for the parent socket of MPTCP subflows (plain TCP socket),
> >  MPTCP sk requires specific protocol handling that conflicts with sockmap's
> >  operation(mptcp_prot).
> >  
> >  This patch adds proper checks to reject MPTCP subflows and their parent
> >  sockets from being added to sockmap, while preserving compatibility with
> >  reuseport functionality for listening MPTCP sockets.
> > 
> It's unclear to me why that is safe. sockmap is going to change the
> listener msk proto ops.
> 
> The listener could disconnect and create an egress connection, still
> using the wrong ops.

sockmap only replaces read/write handler of a sk and keeps another handler.

But I agree with you; I also don't think sockmap should replace the handlers of
the listen socket. Because for a listen socket, sockmap is merely used as a container, 
just like hash map or array map. But in reality, that's exactly what it does...

> I think sockmap should always be prevented for mptcp socket, or at least
> a solid explanation of why such exception is safe should be included in
> the commit message.
> 
> Note that the first option allows for solving the issue entirely in the
> mptcp code, setting dummy/noop psock_update_sk_prot for mptcp sockets
> and mptcp subflows.

I will do it.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2025-11-03 12:52 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20251023125450.105859-1-jiayuan.chen@linux.dev>
2025-10-23 12:54 ` [PATCH net v3 1/3] net,mptcp: fix proto fallback detection with BPF sockmap Jiayuan Chen
2025-10-23 14:10   ` Matthieu Baerts
2025-10-23 14:38     ` Jiayuan Chen
2025-10-28 11:30   ` Paolo Abeni
2025-10-28 11:47     ` Paolo Abeni
2025-11-03 12:45       ` Jiayuan Chen
2025-11-03 12:44     ` Jiayuan Chen
2025-10-23 12:54 ` [PATCH net v3 2/3] bpf,sockmap: disallow MPTCP sockets from sockmap Jiayuan Chen
2025-10-28 12:03   ` Paolo Abeni
2025-11-03 12:52     ` Jiayuan Chen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox