mptcp.lists.linux.dev archive mirror
* [PATCH v3 mptcp-net 0/5] mptcp: fix fallback-related races
@ 2025-07-09 15:12 Paolo Abeni
  2025-07-09 15:12 ` [PATCH v3 mptcp-net 1/5] mptcp: make fallback action and fallback decision atomic Paolo Abeni
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Paolo Abeni @ 2025-07-09 15:12 UTC (permalink / raw)
  To: mptcp

This series contains 3 fixes somewhat related to various races we have
while handling fallback, plus 2 small follow-ups likely more suited for
net-next.

The root cause of the issues addressed here is that the check for
"we can fallback to tcp now" and the related action are not atomic. That
also applies to fallback due to MP_FAIL - where the window race is even
wider.

Address the issue by introducing an additional spinlock to bundle
together all the relevant events, as per patches 1 and 2.

Note that mptcp_disconnect() unconditionally
clears the fallback status (zeroing msk->flags) but doesn't touch the
`allow_infinite_fallback` flag. Such issue is addressed in patch 3.

Patch 4 cleans up the fallback code a bit, introducing a specific MIB
counter for each fallback reason, and patch 5 drops the hopefully now
redundant pr_fallback().
---
v2 -> v3:
 - mptcp_do_fallback -> mptcp_try_fallback
 - refactored patch 3/5
 - changed MIB names; increment the fail counter only when the protocol
   mandates it
 - fixed a W=1 warning in patch 5/5

Paolo Abeni (5):
  mptcp: make fallback action and fallback decision atomic
  mptcp: plug races between subflow fail and subflow creation
  mptcp: reset fallback status gracefully at disconnect() time
  mptcp: track fallbacks accurately via mibs
  mptcp: remove pr_fallback()

 net/mptcp/ctrl.c     |  4 +-
 net/mptcp/mib.c      |  5 +++
 net/mptcp/mib.h      |  7 ++++
 net/mptcp/options.c  |  6 ++-
 net/mptcp/pm.c       |  8 +++-
 net/mptcp/protocol.c | 97 ++++++++++++++++++++++++++++++++++----------
 net/mptcp/protocol.h | 35 ++++++++--------
 net/mptcp/subflow.c  | 40 ++++++++++--------
 8 files changed, 140 insertions(+), 62 deletions(-)

-- 
2.50.0


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v3 mptcp-net 1/5] mptcp: make fallback action and fallback decision atomic
  2025-07-09 15:12 [PATCH v3 mptcp-net 0/5] mptcp: fix fallback-related races Paolo Abeni
@ 2025-07-09 15:12 ` Paolo Abeni
  2025-07-10 17:41   ` Matthieu Baerts
  2025-07-09 15:12 ` [PATCH v3 mptcp-net 2/5] mptcp: plug races between subflow fail and subflow creation Paolo Abeni
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: Paolo Abeni @ 2025-07-09 15:12 UTC (permalink / raw)
  To: mptcp

Syzkaller reported the following splat:

WARNING: CPU: 1 PID: 7704 at net/mptcp/protocol.h:1223 __mptcp_do_fallback net/mptcp/protocol.h:1223 [inline]
WARNING: CPU: 1 PID: 7704 at net/mptcp/protocol.h:1223 mptcp_do_fallback net/mptcp/protocol.h:1244 [inline]
WARNING: CPU: 1 PID: 7704 at net/mptcp/protocol.h:1223 check_fully_established net/mptcp/options.c:982 [inline]
WARNING: CPU: 1 PID: 7704 at net/mptcp/protocol.h:1223 mptcp_incoming_options+0x21a8/0x2510 net/mptcp/options.c:1153
Modules linked in:
CPU: 1 UID: 0 PID: 7704 Comm: syz.3.1419 Not tainted 6.16.0-rc3-gbd5ce2324dba #20 PREEMPT(voluntary)
Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:__mptcp_do_fallback net/mptcp/protocol.h:1223 [inline]
RIP: 0010:mptcp_do_fallback net/mptcp/protocol.h:1244 [inline]
RIP: 0010:check_fully_established net/mptcp/options.c:982 [inline]
RIP: 0010:mptcp_incoming_options+0x21a8/0x2510 net/mptcp/options.c:1153
Code: 24 18 e8 bb 2a 00 fd e9 1b df ff ff e8 b1 21 0f 00 e8 ec 5f c4 fc 44 0f b7 ac 24 b0 00 00 00 e9 54 f1 ff ff e8 d9 5f c4 fc 90 <0f> 0b 90 e9 b8 f4 ff ff e8 8b 2a 00 fd e9 8d e6 ff ff e8 81 2a 00
RSP: 0018:ffff8880a3f08448 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8880180a8000 RCX: ffffffff84afcf45
RDX: ffff888090223700 RSI: ffffffff84afdaa7 RDI: 0000000000000001
RBP: ffff888017955780 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff8880180a8910 R14: ffff8880a3e9d058 R15: 0000000000000000
FS:  00005555791b8500(0000) GS:ffff88811c495000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000110c2800b7 CR3: 0000000058e44000 CR4: 0000000000350ef0
Call Trace:
 <IRQ>
 tcp_reset+0x26f/0x2b0 net/ipv4/tcp_input.c:4432
 tcp_validate_incoming+0x1057/0x1b60 net/ipv4/tcp_input.c:5975
 tcp_rcv_established+0x5b5/0x21f0 net/ipv4/tcp_input.c:6166
 tcp_v4_do_rcv+0x5dc/0xa70 net/ipv4/tcp_ipv4.c:1925
 tcp_v4_rcv+0x3473/0x44a0 net/ipv4/tcp_ipv4.c:2363
 ip_protocol_deliver_rcu+0xba/0x480 net/ipv4/ip_input.c:205
 ip_local_deliver_finish+0x2f1/0x500 net/ipv4/ip_input.c:233
 NF_HOOK include/linux/netfilter.h:317 [inline]
 NF_HOOK include/linux/netfilter.h:311 [inline]
 ip_local_deliver+0x1be/0x560 net/ipv4/ip_input.c:254
 dst_input include/net/dst.h:469 [inline]
 ip_rcv_finish net/ipv4/ip_input.c:447 [inline]
 NF_HOOK include/linux/netfilter.h:317 [inline]
 NF_HOOK include/linux/netfilter.h:311 [inline]
 ip_rcv+0x514/0x810 net/ipv4/ip_input.c:567
 __netif_receive_skb_one_core+0x197/0x1e0 net/core/dev.c:5975
 __netif_receive_skb+0x1f/0x120 net/core/dev.c:6088
 process_backlog+0x301/0x1360 net/core/dev.c:6440
 __napi_poll.constprop.0+0xba/0x550 net/core/dev.c:7453
 napi_poll net/core/dev.c:7517 [inline]
 net_rx_action+0xb44/0x1010 net/core/dev.c:7644
 handle_softirqs+0x1d0/0x770 kernel/softirq.c:579
 do_softirq+0x3f/0x90 kernel/softirq.c:480
 </IRQ>
 <TASK>
 __local_bh_enable_ip+0xed/0x110 kernel/softirq.c:407
 local_bh_enable include/linux/bottom_half.h:33 [inline]
 inet_csk_listen_stop+0x2c5/0x1070 net/ipv4/inet_connection_sock.c:1524
 mptcp_check_listen_stop.part.0+0x1cc/0x220 net/mptcp/protocol.c:2985
 mptcp_check_listen_stop net/mptcp/mib.h:118 [inline]
 __mptcp_close+0x9b9/0xbd0 net/mptcp/protocol.c:3000
 mptcp_close+0x2f/0x140 net/mptcp/protocol.c:3066
 inet_release+0xed/0x200 net/ipv4/af_inet.c:435
 inet6_release+0x4f/0x70 net/ipv6/af_inet6.c:487
 __sock_release+0xb3/0x270 net/socket.c:649
 sock_close+0x1c/0x30 net/socket.c:1439
 __fput+0x402/0xb70 fs/file_table.c:465
 task_work_run+0x150/0x240 kernel/task_work.c:227
 resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
 exit_to_user_mode_loop+0xd4/0xe0 kernel/entry/common.c:114
 exit_to_user_mode_prepare include/linux/entry-common.h:330 [inline]
 syscall_exit_to_user_mode_work include/linux/entry-common.h:414 [inline]
 syscall_exit_to_user_mode include/linux/entry-common.h:449 [inline]
 do_syscall_64+0x245/0x360 arch/x86/entry/syscall_64.c:100
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fc92f8a36ad
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffcf52802d8 EFLAGS: 00000246 ORIG_RAX: 00000000000001b4
RAX: 0000000000000000 RBX: 00007ffcf52803a8 RCX: 00007fc92f8a36ad
RDX: 0000000000000000 RSI: 000000000000001e RDI: 0000000000000003
RBP: 00007fc92fae7ba0 R08: 0000000000000001 R09: 0000002800000000
R10: 00007fc92f700000 R11: 0000000000000246 R12: 00007fc92fae5fac
R13: 00007fc92fae5fa0 R14: 0000000000026d00 R15: 0000000000026c51
 </TASK>
irq event stamp: 4068
hardirqs last  enabled at (4076): [<ffffffff81544816>] __up_console_sem+0x76/0x80 kernel/printk/printk.c:344
hardirqs last disabled at (4085): [<ffffffff815447fb>] __up_console_sem+0x5b/0x80 kernel/printk/printk.c:342
softirqs last  enabled at (3096): [<ffffffff840e1be0>] local_bh_enable include/linux/bottom_half.h:33 [inline]
softirqs last  enabled at (3096): [<ffffffff840e1be0>] inet_csk_listen_stop+0x2c0/0x1070 net/ipv4/inet_connection_sock.c:1524
softirqs last disabled at (3097): [<ffffffff813b6b9f>] do_softirq+0x3f/0x90 kernel/softirq.c:480

Since we need to track the 'fallback is possible' condition and the
fallback status separately, there are a few possible races open between
the check and the actual fallback action.

Add a spinlock to protect the fallback-related information and use
it to close all the possible related races. While at it, also remove the
too-early clearing of allow_infinite_fallback in __mptcp_subflow_connect():
the field will be correctly cleared by subflow_finish_connect() if/
when the connection completes successfully.

If fallback is not possible, as per RFC, reset the current subflow.

Since the fallback operation can now fail and its return value should be
checked, rename the helper accordingly.

Reported-by: Matthieu Baerts <matttbe@kernel.org>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/570
Reported-by: syzbot+5cf807c20386d699b524@syzkaller.appspotmail.com
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/555
Fixes: 0530020a7c8f ("mptcp: track and update contiguous data status")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
v2 -> v3:
  - fix return value check in check_fully_established()
  - add WARN when hitting 'impossible' fallback failures.
  - drop empty line
  - mptcp_do_fallback -> mptcp_try_fallback

v1 -> v2:
  - remove unneeded and racing allow_infinite_fallback in
    __mptcp_subflow_connect()
  - really close race in subflow_check_data_avail()

Sharing early to feed syzkaller.
Hopefully a follow-up on mptcp-net-next will clean up the fallback
handling a bit; I tried to minimize the delta for backport's sake.
---
 net/mptcp/options.c  |  3 ++-
 net/mptcp/protocol.c | 40 +++++++++++++++++++++++++++++++++++-----
 net/mptcp/protocol.h | 23 ++++++++++++++++-------
 net/mptcp/subflow.c  | 11 +++++------
 4 files changed, 58 insertions(+), 19 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 56fd7bcb786c..9288a8d1235e 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -978,8 +978,9 @@ static bool check_fully_established(struct mptcp_sock *msk, struct sock *ssk,
 		if (subflow->mp_join)
 			goto reset;
 		subflow->mp_capable = 0;
+		if (!mptcp_try_fallback(ssk))
+			goto reset;
 		pr_fallback(msk);
-		mptcp_do_fallback(ssk);
 		return false;
 	}
 
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 1de42dc4e8ea..4ddf3a1e0085 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -560,10 +560,9 @@ static bool mptcp_check_data_fin(struct sock *sk)
 
 static void mptcp_dss_corruption(struct mptcp_sock *msk, struct sock *ssk)
 {
-	if (READ_ONCE(msk->allow_infinite_fallback)) {
+	if (mptcp_try_fallback(ssk)) {
 		MPTCP_INC_STATS(sock_net(ssk),
 				MPTCP_MIB_DSSCORRUPTIONFALLBACK);
-		mptcp_do_fallback(ssk);
 	} else {
 		MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_DSSCORRUPTIONRESET);
 		mptcp_subflow_reset(ssk);
@@ -803,6 +802,14 @@ static bool __mptcp_finish_join(struct mptcp_sock *msk, struct sock *ssk)
 	if (sk->sk_state != TCP_ESTABLISHED)
 		return false;
 
+	spin_lock_bh(&msk->fallback_lock);
+	if (__mptcp_check_fallback(msk)) {
+		spin_unlock_bh(&msk->fallback_lock);
+		return false;
+	}
+	mptcp_subflow_joined(msk, ssk);
+	spin_unlock_bh(&msk->fallback_lock);
+
 	/* attach to msk socket only after we are sure we will deal with it
 	 * at close time
 	 */
@@ -811,7 +818,6 @@ static bool __mptcp_finish_join(struct mptcp_sock *msk, struct sock *ssk)
 
 	mptcp_subflow_ctx(ssk)->subflow_id = msk->subflow_id++;
 	mptcp_sockopt_sync_locked(msk, ssk);
-	mptcp_subflow_joined(msk, ssk);
 	mptcp_stop_tout_timer(sk);
 	__mptcp_propagate_sndbuf(sk, ssk);
 	return true;
@@ -1136,10 +1142,14 @@ static void mptcp_update_infinite_map(struct mptcp_sock *msk,
 	mpext->infinite_map = 1;
 	mpext->data_len = 0;
 
+	if (!mptcp_try_fallback(ssk)) {
+		mptcp_subflow_reset(ssk);
+		return;
+	}
+
 	MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_INFINITEMAPTX);
 	mptcp_subflow_ctx(ssk)->send_infinite_map = 0;
 	pr_fallback(msk);
-	mptcp_do_fallback(ssk);
 }
 
 #define MPTCP_MAX_GSO_SIZE (GSO_LEGACY_MAX_SIZE - (MAX_TCP_HEADER + 1))
@@ -2543,9 +2553,9 @@ static void mptcp_check_fastclose(struct mptcp_sock *msk)
 
 static void __mptcp_retrans(struct sock *sk)
 {
+	struct mptcp_sendmsg_info info = { .data_lock_held = true, };
 	struct mptcp_sock *msk = mptcp_sk(sk);
 	struct mptcp_subflow_context *subflow;
-	struct mptcp_sendmsg_info info = {};
 	struct mptcp_data_frag *dfrag;
 	struct sock *ssk;
 	int ret, err;
@@ -2590,6 +2600,18 @@ static void __mptcp_retrans(struct sock *sk)
 			info.sent = 0;
 			info.limit = READ_ONCE(msk->csum_enabled) ? dfrag->data_len :
 								    dfrag->already_sent;
+
+			/*
+			 * make the whole retrans decision, xmit, disallow
+			 * fallback atomic
+			 */
+			spin_lock_bh(&msk->fallback_lock);
+			if (__mptcp_check_fallback(msk)) {
+				spin_unlock_bh(&msk->fallback_lock);
+				release_sock(ssk);
+				return;
+			}
+
 			while (info.sent < info.limit) {
 				ret = mptcp_sendmsg_frag(sk, ssk, dfrag, &info);
 				if (ret <= 0)
@@ -2605,6 +2627,7 @@ static void __mptcp_retrans(struct sock *sk)
 					 info.size_goal);
 				WRITE_ONCE(msk->allow_infinite_fallback, false);
 			}
+			spin_unlock_bh(&msk->fallback_lock);
 
 			release_sock(ssk);
 		}
@@ -2738,6 +2761,7 @@ static void __mptcp_init_sock(struct sock *sk)
 	msk->last_ack_recv = tcp_jiffies32;
 
 	mptcp_pm_data_init(msk);
+	spin_lock_init(&msk->fallback_lock);
 
 	/* re-use the csk retrans timer for MPTCP-level retrans */
 	timer_setup(&msk->sk.icsk_retransmit_timer, mptcp_retransmit_timer, 0);
@@ -3524,7 +3548,13 @@ bool mptcp_finish_join(struct sock *ssk)
 
 	/* active subflow, already present inside the conn_list */
 	if (!list_empty(&subflow->node)) {
+		spin_lock_bh(&msk->fallback_lock);
+		if (__mptcp_check_fallback(msk)) {
+			spin_unlock_bh(&msk->fallback_lock);
+			return false;
+		}
 		mptcp_subflow_joined(msk, ssk);
+		spin_unlock_bh(&msk->fallback_lock);
 		mptcp_propagate_sndbuf(parent, ssk);
 		return true;
 	}
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index e36ae88b10ad..61a9ff467501 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -352,6 +352,7 @@ struct mptcp_sock {
 	u32		subflow_id;
 	u32		setsockopt_seq;
 	char		ca_name[TCP_CA_NAME_MAX];
+	spinlock_t	fallback_lock;	/* protects fallback && allow_infinite_fallback */
 };
 
 #define mptcp_data_lock(sk) spin_lock_bh(&(sk)->sk_lock.slock)
@@ -1214,15 +1215,21 @@ static inline bool mptcp_check_fallback(const struct sock *sk)
 	return __mptcp_check_fallback(msk);
 }
 
-static inline void __mptcp_do_fallback(struct mptcp_sock *msk)
+static inline bool __mptcp_try_fallback(struct mptcp_sock *msk)
 {
 	if (__mptcp_check_fallback(msk)) {
 		pr_debug("TCP fallback already done (msk=%p)\n", msk);
-		return;
+		return true;
 	}
-	if (WARN_ON_ONCE(!READ_ONCE(msk->allow_infinite_fallback)))
-		return;
+	spin_lock_bh(&msk->fallback_lock);
+	if (!msk->allow_infinite_fallback) {
+		spin_unlock_bh(&msk->fallback_lock);
+		return false;
+	}
+
 	set_bit(MPTCP_FALLBACK_DONE, &msk->flags);
+	spin_unlock_bh(&msk->fallback_lock);
+	return true;
 }
 
 static inline bool __mptcp_has_initial_subflow(const struct mptcp_sock *msk)
@@ -1234,14 +1241,15 @@ static inline bool __mptcp_has_initial_subflow(const struct mptcp_sock *msk)
 			TCPF_SYN_RECV | TCPF_LISTEN));
 }
 
-static inline void mptcp_do_fallback(struct sock *ssk)
+static inline bool mptcp_try_fallback(struct sock *ssk)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
 	struct sock *sk = subflow->conn;
 	struct mptcp_sock *msk;
 
 	msk = mptcp_sk(sk);
-	__mptcp_do_fallback(msk);
+	if (!__mptcp_try_fallback(msk))
+		return false;
 	if (READ_ONCE(msk->snd_data_fin_enable) && !(ssk->sk_shutdown & SEND_SHUTDOWN)) {
 		gfp_t saved_allocation = ssk->sk_allocation;
 
@@ -1253,6 +1261,7 @@ static inline void mptcp_do_fallback(struct sock *ssk)
 		tcp_shutdown(ssk, SEND_SHUTDOWN);
 		ssk->sk_allocation = saved_allocation;
 	}
+	return true;
 }
 
 #define pr_fallback(a) pr_debug("%s:fallback to TCP (msk=%p)\n", __func__, a)
@@ -1262,7 +1271,7 @@ static inline void mptcp_subflow_early_fallback(struct mptcp_sock *msk,
 {
 	pr_fallback(msk);
 	subflow->request_mptcp = 0;
-	__mptcp_do_fallback(msk);
+	WARN_ON_ONCE(!__mptcp_try_fallback(msk));
 }
 
 static inline bool mptcp_check_infinite_map(struct sk_buff *skb)
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 15613d691bfe..a6a35985e551 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -544,9 +544,11 @@ static void subflow_finish_connect(struct sock *sk, const struct sk_buff *skb)
 	mptcp_get_options(skb, &mp_opt);
 	if (subflow->request_mptcp) {
 		if (!(mp_opt.suboptions & OPTION_MPTCP_MPC_SYNACK)) {
+			if (!mptcp_try_fallback(sk))
+				goto do_reset;
+
 			MPTCP_INC_STATS(sock_net(sk),
 					MPTCP_MIB_MPCAPABLEACTIVEFALLBACK);
-			mptcp_do_fallback(sk);
 			pr_fallback(msk);
 			goto fallback;
 		}
@@ -1395,7 +1397,7 @@ static bool subflow_check_data_avail(struct sock *ssk)
 			return true;
 		}
 
-		if (!READ_ONCE(msk->allow_infinite_fallback)) {
+		if (!mptcp_try_fallback(ssk)) {
 			/* fatal protocol error, close the socket.
 			 * subflow_error_report() will introduce the appropriate barriers
 			 */
@@ -1413,8 +1415,6 @@ static bool subflow_check_data_avail(struct sock *ssk)
 			WRITE_ONCE(subflow->data_avail, false);
 			return false;
 		}
-
-		mptcp_do_fallback(ssk);
 	}
 
 	skb = skb_peek(&ssk->sk_receive_queue);
@@ -1679,7 +1679,6 @@ int __mptcp_subflow_connect(struct sock *sk, const struct mptcp_pm_local *local,
 	/* discard the subflow socket */
 	mptcp_sock_graft(ssk, sk->sk_socket);
 	iput(SOCK_INODE(sf));
-	WRITE_ONCE(msk->allow_infinite_fallback, false);
 	mptcp_stop_tout_timer(sk);
 	return 0;
 
@@ -1851,7 +1850,7 @@ static void subflow_state_change(struct sock *sk)
 
 	msk = mptcp_sk(parent);
 	if (subflow_simultaneous_connect(sk)) {
-		mptcp_do_fallback(sk);
+		WARN_ON_ONCE(!mptcp_try_fallback(sk));
 		pr_fallback(msk);
 		subflow->conn_finished = 1;
 		mptcp_propagate_state(parent, sk, subflow, NULL);
-- 
2.50.0



* [PATCH v3 mptcp-net 2/5] mptcp: plug races between subflow fail and subflow creation
  2025-07-09 15:12 [PATCH v3 mptcp-net 0/5] mptcp: fix fallback-related races Paolo Abeni
  2025-07-09 15:12 ` [PATCH v3 mptcp-net 1/5] mptcp: make fallback action and fallback decision atomic Paolo Abeni
@ 2025-07-09 15:12 ` Paolo Abeni
  2025-07-09 15:12 ` [PATCH v3 mptcp-net 3/5] mptcp: reset fallback status gracefully at disconnect() time Paolo Abeni
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Paolo Abeni @ 2025-07-09 15:12 UTC (permalink / raw)
  To: mptcp

We have races similar to the one addressed by the previous patch
between subflow failure and additional subflow creation. They are
just harder to trigger.

The solution is similar: use a separate flag to track the condition
'the socket state prevents any additional subflow creation', protected
by the fallback lock.

The socket fallback sets such flag, and so does receiving or sending
an MP_FAIL option.

The field 'allow_infinite_fallback' is now always touched under the
relevant lock, so we can drop the ONCE annotation on write.
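The resulting two-flag state machine can be sketched as a userspace toy
model (assumed names and simplified semantics, not the kernel API): both
flags start true, and the three events clear them as described above:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the two flags guarded by the fallback lock. */
struct msk_flags {
	bool allow_subflows;		/* new subflows may be created */
	bool allow_infinite_fallback;	/* infinite-map fallback allowed */
	bool fallback_done;
};

/* a subflow joined: infinite fallback is no longer possible */
static bool finish_join(struct msk_flags *m)
{
	if (!m->allow_subflows)
		return false;
	m->allow_infinite_fallback = false;
	return true;
}

/* MP_FAIL sent/received: no further subflows, fallback still pending */
static bool mp_fail(struct msk_flags *m)
{
	if (!m->allow_infinite_fallback)
		return false;
	m->allow_subflows = false;
	return true;
}

/* fallback: needs allow_infinite_fallback, forbids later joins */
static bool try_fallback(struct msk_flags *m)
{
	if (m->fallback_done)
		return true;
	if (!m->allow_infinite_fallback)
		return false;
	m->allow_subflows = false;
	m->fallback_done = true;
	return true;
}
```

In this model an MP_FAIL followed by a join attempt fails the join,
while a join followed by an MP_FAIL fails the MP_FAIL - matching the
"harder to trigger" races this patch plugs.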

Fixes: 478d770008b0 ("mptcp: send out MP_FAIL when data checksum fails")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/mptcp/pm.c       |  8 +++++++-
 net/mptcp/protocol.c | 11 ++++++-----
 net/mptcp/protocol.h |  8 +++++++-
 net/mptcp/subflow.c  | 19 ++++++++++++++-----
 4 files changed, 34 insertions(+), 12 deletions(-)

diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index 25ce2d5135bc..687dbb59d084 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -765,8 +765,14 @@ void mptcp_pm_mp_fail_received(struct sock *sk, u64 fail_seq)
 
 	pr_debug("fail_seq=%llu\n", fail_seq);
 
-	if (!READ_ONCE(msk->allow_infinite_fallback))
+	/* After accepting the fail, we can't create any other subflows */
+	spin_lock_bh(&msk->fallback_lock);
+	if (!msk->allow_infinite_fallback) {
+		spin_unlock_bh(&msk->fallback_lock);
 		return;
+	}
+	msk->allow_subflows = false;
+	spin_unlock_bh(&msk->fallback_lock);
 
 	if (!subflow->fail_tout) {
 		pr_debug("send MP_FAIL response and infinite map\n");
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 4ddf3a1e0085..49498cdd17f0 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -791,7 +791,7 @@ void mptcp_data_ready(struct sock *sk, struct sock *ssk)
 static void mptcp_subflow_joined(struct mptcp_sock *msk, struct sock *ssk)
 {
 	mptcp_subflow_ctx(ssk)->map_seq = READ_ONCE(msk->ack_seq);
-	WRITE_ONCE(msk->allow_infinite_fallback, false);
+	msk->allow_infinite_fallback = false;
 	mptcp_event(MPTCP_EVENT_SUB_ESTABLISHED, msk, ssk, GFP_ATOMIC);
 }
 
@@ -803,7 +803,7 @@ static bool __mptcp_finish_join(struct mptcp_sock *msk, struct sock *ssk)
 		return false;
 
 	spin_lock_bh(&msk->fallback_lock);
-	if (__mptcp_check_fallback(msk)) {
+	if (!msk->allow_subflows) {
 		spin_unlock_bh(&msk->fallback_lock);
 		return false;
 	}
@@ -2625,7 +2625,7 @@ static void __mptcp_retrans(struct sock *sk)
 				len = max(copied, len);
 				tcp_push(ssk, 0, info.mss_now, tcp_sk(ssk)->nonagle,
 					 info.size_goal);
-				WRITE_ONCE(msk->allow_infinite_fallback, false);
+				msk->allow_infinite_fallback = false;
 			}
 			spin_unlock_bh(&msk->fallback_lock);
 
@@ -2753,7 +2753,8 @@ static void __mptcp_init_sock(struct sock *sk)
 	WRITE_ONCE(msk->first, NULL);
 	inet_csk(sk)->icsk_sync_mss = mptcp_sync_mss;
 	WRITE_ONCE(msk->csum_enabled, mptcp_is_checksum_enabled(sock_net(sk)));
-	WRITE_ONCE(msk->allow_infinite_fallback, true);
+	msk->allow_infinite_fallback = true;
+	msk->allow_subflows = true;
 	msk->recovery = false;
 	msk->subflow_id = 1;
 	msk->last_data_sent = tcp_jiffies32;
@@ -3549,7 +3550,7 @@ bool mptcp_finish_join(struct sock *ssk)
 	/* active subflow, already present inside the conn_list */
 	if (!list_empty(&subflow->node)) {
 		spin_lock_bh(&msk->fallback_lock);
-		if (__mptcp_check_fallback(msk)) {
+		if (!msk->allow_subflows) {
 			spin_unlock_bh(&msk->fallback_lock);
 			return false;
 		}
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 61a9ff467501..c8b4ebe198dc 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -348,11 +348,16 @@ struct mptcp_sock {
 		u64	rtt_us; /* last maximum rtt of subflows */
 	} rcvq_space;
 	u8		scaling_ratio;
+	bool		allow_subflows;
 
 	u32		subflow_id;
 	u32		setsockopt_seq;
 	char		ca_name[TCP_CA_NAME_MAX];
-	spinlock_t	fallback_lock;	/* protects fallback && allow_infinite_fallback */
+
+	spinlock_t	fallback_lock;	/* protects fallback,
+					 * allow_infinite_fallback and
+					 * allow_join
+					 */
 };
 
 #define mptcp_data_lock(sk) spin_lock_bh(&(sk)->sk_lock.slock)
@@ -1227,6 +1232,7 @@ static inline bool __mptcp_try_fallback(struct mptcp_sock *msk)
 		return false;
 	}
 
+	msk->allow_subflows = false;
 	set_bit(MPTCP_FALLBACK_DONE, &msk->flags);
 	spin_unlock_bh(&msk->fallback_lock);
 	return true;
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index a6a35985e551..1802bc5435a1 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -1302,20 +1302,29 @@ static void subflow_sched_work_if_closed(struct mptcp_sock *msk, struct sock *ss
 		mptcp_schedule_work(sk);
 }
 
-static void mptcp_subflow_fail(struct mptcp_sock *msk, struct sock *ssk)
+static bool mptcp_subflow_fail(struct mptcp_sock *msk, struct sock *ssk)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
 	unsigned long fail_tout;
 
+	/* we are really failing, prevent any later subflow join */
+	spin_lock_bh(&msk->fallback_lock);
+	if (!msk->allow_infinite_fallback) {
+		spin_unlock_bh(&msk->fallback_lock);
+		return false;
+	}
+	msk->allow_subflows = false;
+	spin_unlock_bh(&msk->fallback_lock);
+
 	/* graceful failure can happen only on the MPC subflow */
 	if (WARN_ON_ONCE(ssk != READ_ONCE(msk->first)))
-		return;
+		return false;
 
 	/* since the close timeout take precedence on the fail one,
 	 * no need to start the latter when the first is already set
 	 */
 	if (sock_flag((struct sock *)msk, SOCK_DEAD))
-		return;
+		return true;
 
 	/* we don't need extreme accuracy here, use a zero fail_tout as special
 	 * value meaning no fail timeout at all;
@@ -1327,6 +1336,7 @@ static void mptcp_subflow_fail(struct mptcp_sock *msk, struct sock *ssk)
 	tcp_send_ack(ssk);
 
 	mptcp_reset_tout_timer(msk, subflow->fail_tout);
+	return true;
 }
 
 static bool subflow_check_data_avail(struct sock *ssk)
@@ -1387,12 +1397,11 @@ static bool subflow_check_data_avail(struct sock *ssk)
 		    (subflow->mp_join || subflow->valid_csum_seen)) {
 			subflow->send_mp_fail = 1;
 
-			if (!READ_ONCE(msk->allow_infinite_fallback)) {
+			if (!mptcp_subflow_fail(msk, ssk)) {
 				subflow->reset_transient = 0;
 				subflow->reset_reason = MPTCP_RST_EMIDDLEBOX;
 				goto reset;
 			}
-			mptcp_subflow_fail(msk, ssk);
 			WRITE_ONCE(subflow->data_avail, true);
 			return true;
 		}
-- 
2.50.0



* [PATCH v3 mptcp-net 3/5] mptcp: reset fallback status gracefully at disconnect() time
  2025-07-09 15:12 [PATCH v3 mptcp-net 0/5] mptcp: fix fallback-related races Paolo Abeni
  2025-07-09 15:12 ` [PATCH v3 mptcp-net 1/5] mptcp: make fallback action and fallback decision atomic Paolo Abeni
  2025-07-09 15:12 ` [PATCH v3 mptcp-net 2/5] mptcp: plug races between subflow fail and subflow creation Paolo Abeni
@ 2025-07-09 15:12 ` Paolo Abeni
  2025-07-09 15:12 ` [PATCH v3 mptcp-net 4/5] mptcp: track fallbacks accurately via mibs Paolo Abeni
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Paolo Abeni @ 2025-07-09 15:12 UTC (permalink / raw)
  To: mptcp

mptcp_disconnect() clears the fallback bit unconditionally, without
touching the associated flags.

The bit clear is safe, as no fallback operation can race with that -
all subflows are already in TCP_CLOSE status thanks to the previous
FASTCLOSE - but we need to consistently reset all the fallback-related
status.

Also acquire the relevant lock, to avoid fouling static analyzers.

Fixes: b29fcfb54cd7 ("mptcp: full disconnect implementation")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
v2 -> v3:
  - drop entirely the blocking section
---
 net/mptcp/protocol.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 49498cdd17f0..46bc5cc5a4d7 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -3142,7 +3142,16 @@ static int mptcp_disconnect(struct sock *sk, int flags)
 	 * subflow
 	 */
 	mptcp_destroy_common(msk, MPTCP_CF_FASTCLOSE);
+
+	/* The first subflow is already in TCP_CLOSE status, the following
+	 * can't overlap with a fallback anymore
+	 */
+	spin_lock_bh(&msk->fallback_lock);
+	msk->allow_subflows = true;
+	msk->allow_infinite_fallback = true;
 	WRITE_ONCE(msk->flags, 0);
+	spin_unlock_bh(&msk->fallback_lock);
+
 	msk->cb_flags = 0;
 	msk->recovery = false;
 	WRITE_ONCE(msk->can_ack, false);
-- 
2.50.0



* [PATCH v3 mptcp-net 4/5] mptcp: track fallbacks accurately via mibs
  2025-07-09 15:12 [PATCH v3 mptcp-net 0/5] mptcp: fix fallback-related races Paolo Abeni
                   ` (2 preceding siblings ...)
  2025-07-09 15:12 ` [PATCH v3 mptcp-net 3/5] mptcp: reset fallback status gracefully at disconnect() time Paolo Abeni
@ 2025-07-09 15:12 ` Paolo Abeni
  2025-07-10 17:41   ` Matthieu Baerts
  2025-07-09 15:12 ` [PATCH v3 mptcp-net 5/5] mptcp: remove pr_fallback() Paolo Abeni
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: Paolo Abeni @ 2025-07-09 15:12 UTC (permalink / raw)
  To: mptcp

Add the MIBs required to cover the few possible fallback causes still
lacking such info.

Move the relevant MIB increment into the fallback helper, so that no
future fallback operation will miss its paired MIB increment.

Additionally, track failed fallbacks via their own MIB; such MIB is
incremented only when a fallback mandated by the protocol fails - due
to racing subflow creation.

While at the above, rename an existing helper to avoid overly long
lines.
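The pairing of state change and accounting can be sketched with a toy
counter table (hypothetical names; the real counters live in
net/mptcp/mib.h): bumping the cause-specific counter inside the helper
itself means no caller can fall back without accounting for it, while a
failed mandatory fallback is counted separately by the caller:

```c
#include <assert.h>
#include <stdbool.h>

enum fb_mib { MIB_DSS_FB, MIB_MD5SIG_FB, MIB_FB_FAILED, MIB_MAX };
static unsigned long mibs[MIB_MAX];

/* the helper owns both the state change and the per-cause counter */
static bool try_fallback(bool *allow_fallback, enum fb_mib cause)
{
	if (!*allow_fallback)
		return false;
	*allow_fallback = false;	/* single-shot in this toy model */
	mibs[cause]++;
	return true;
}
```

A caller handling a protocol-mandated fallback then bumps MIB_FB_FAILED
itself when the helper returns false, right before resetting the
subflow.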

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
v2 -> v3:
  - renamed a bunch of MIBs
---
 net/mptcp/ctrl.c     |  4 ++--
 net/mptcp/mib.c      |  5 +++++
 net/mptcp/mib.h      |  7 +++++++
 net/mptcp/options.c  |  4 +++-
 net/mptcp/protocol.c | 44 ++++++++++++++++++++++++++++++--------------
 net/mptcp/protocol.h | 31 ++++++++-----------------------
 net/mptcp/subflow.c  | 12 +++++++-----
 7 files changed, 62 insertions(+), 45 deletions(-)

diff --git a/net/mptcp/ctrl.c b/net/mptcp/ctrl.c
index d9290c5bb6c7..fed40dae5583 100644
--- a/net/mptcp/ctrl.c
+++ b/net/mptcp/ctrl.c
@@ -533,9 +533,9 @@ void mptcp_active_detect_blackhole(struct sock *ssk, bool expired)
 	to_max = mptcp_get_pernet(net)->syn_retrans_before_tcp_fallback;
 
 	if (timeouts == to_max || (timeouts < to_max && expired)) {
-		MPTCP_INC_STATS(net, MPTCP_MIB_MPCAPABLEACTIVEDROP);
 		subflow->mpc_drop = 1;
-		mptcp_subflow_early_fallback(mptcp_sk(subflow->conn), subflow);
+		mptcp_early_fallback(mptcp_sk(subflow->conn), subflow,
+				     MPTCP_MIB_MPCAPABLEACTIVEDROP);
 	}
 }
 
diff --git a/net/mptcp/mib.c b/net/mptcp/mib.c
index 0c24545f0e8d..ee51d14ba510 100644
--- a/net/mptcp/mib.c
+++ b/net/mptcp/mib.c
@@ -80,6 +80,11 @@ static const struct snmp_mib mptcp_snmp_list[] = {
 	SNMP_MIB_ITEM("RcvWndConflict", MPTCP_MIB_RCVWNDCONFLICT),
 	SNMP_MIB_ITEM("MPCurrEstab", MPTCP_MIB_CURRESTAB),
 	SNMP_MIB_ITEM("Blackhole", MPTCP_MIB_BLACKHOLE),
+	SNMP_MIB_ITEM("DataFallback", MPTCP_MIB_MPCAPABLEDATAFALLBACK),
+	SNMP_MIB_ITEM("MD5sumFallback", MPTCP_MIB_MD5SIGFALLBACK),
+	SNMP_MIB_ITEM("DssFallback", MPTCP_MIB_DSSFALLBACK),
+	SNMP_MIB_ITEM("SimultFallback", MPTCP_MIB_SIMULTCONNFALLBACK),
+	SNMP_MIB_ITEM("FallbackFailed", MPTCP_MIB_FALLBACKFAILED),
 	SNMP_MIB_SENTINEL
 };
 
diff --git a/net/mptcp/mib.h b/net/mptcp/mib.h
index 250c6b77977e..309bac6fea32 100644
--- a/net/mptcp/mib.h
+++ b/net/mptcp/mib.h
@@ -81,6 +81,13 @@ enum linux_mptcp_mib_field {
 	MPTCP_MIB_RCVWNDCONFLICT,	/* Conflict with while updating msk rcv wnd */
 	MPTCP_MIB_CURRESTAB,		/* Current established MPTCP connections */
 	MPTCP_MIB_BLACKHOLE,		/* A blackhole has been detected */
+	MPTCP_MIB_MPCAPABLEDATAFALLBACK,	/* Missing DSS/MPC+data on first
+					 * established packet
+					 */
+	MPTCP_MIB_MD5SIGFALLBACK,	/* Conflicting TCP option enabled */
+	MPTCP_MIB_DSSFALLBACK,		/* Bad or missing DSS */
+	MPTCP_MIB_SIMULTCONNFALLBACK,	/* Simultaneous connect */
+	MPTCP_MIB_FALLBACKFAILED,	/* Can't fallback due to msk status */
 	__MPTCP_MIB_MAX
 };
 
diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index 9288a8d1235e..b49a8b3ffe00 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -978,8 +978,10 @@ static bool check_fully_established(struct mptcp_sock *msk, struct sock *ssk,
 		if (subflow->mp_join)
 			goto reset;
 		subflow->mp_capable = 0;
-		if (!mptcp_try_fallback(ssk))
+		if (!mptcp_try_fallback(ssk, MPTCP_MIB_MPCAPABLEDATAFALLBACK)) {
+			MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_FALLBACKFAILED);
 			goto reset;
+		}
 		pr_fallback(msk);
 		return false;
 	}
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 46bc5cc5a4d7..54dc39fd1495 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -67,6 +67,26 @@ static const struct proto_ops *mptcp_fallback_tcp_ops(const struct sock *sk)
 	return &inet_stream_ops;
 }
 
+bool __mptcp_try_fallback(struct mptcp_sock *msk, int fb_mib)
+{
+	struct net *net = sock_net((struct sock *)msk);
+
+	if (__mptcp_check_fallback(msk))
+		return true;
+
+	spin_lock_bh(&msk->fallback_lock);
+	if (!msk->allow_infinite_fallback) {
+		spin_unlock_bh(&msk->fallback_lock);
+		return false;
+	}
+
+	msk->allow_subflows = false;
+	set_bit(MPTCP_FALLBACK_DONE, &msk->flags);
+	__MPTCP_INC_STATS(net, fb_mib);
+	spin_unlock_bh(&msk->fallback_lock);
+	return true;
+}
+
 static int __mptcp_socket_create(struct mptcp_sock *msk)
 {
 	struct mptcp_subflow_context *subflow;
@@ -560,10 +580,7 @@ static bool mptcp_check_data_fin(struct sock *sk)
 
 static void mptcp_dss_corruption(struct mptcp_sock *msk, struct sock *ssk)
 {
-	if (mptcp_try_fallback(ssk)) {
-		MPTCP_INC_STATS(sock_net(ssk),
-				MPTCP_MIB_DSSCORRUPTIONFALLBACK);
-	} else {
+	if (!mptcp_try_fallback(ssk, MPTCP_MIB_DSSCORRUPTIONFALLBACK)) {
 		MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_DSSCORRUPTIONRESET);
 		mptcp_subflow_reset(ssk);
 	}
@@ -1142,12 +1159,12 @@ static void mptcp_update_infinite_map(struct mptcp_sock *msk,
 	mpext->infinite_map = 1;
 	mpext->data_len = 0;
 
-	if (!mptcp_try_fallback(ssk)) {
+	if (!mptcp_try_fallback(ssk, MPTCP_MIB_INFINITEMAPTX)) {
+		MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_FALLBACKFAILED);
 		mptcp_subflow_reset(ssk);
 		return;
 	}
 
-	MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_INFINITEMAPTX);
 	mptcp_subflow_ctx(ssk)->send_infinite_map = 0;
 	pr_fallback(msk);
 }
@@ -3688,16 +3705,15 @@ static int mptcp_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 	 * TCP option space.
 	 */
 	if (rcu_access_pointer(tcp_sk(ssk)->md5sig_info))
-		mptcp_subflow_early_fallback(msk, subflow);
+		mptcp_early_fallback(msk, subflow, MPTCP_MIB_MD5SIGFALLBACK);
 #endif
 	if (subflow->request_mptcp) {
-		if (mptcp_active_should_disable(sk)) {
-			MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_MPCAPABLEACTIVEDISABLED);
-			mptcp_subflow_early_fallback(msk, subflow);
-		} else if (mptcp_token_new_connect(ssk) < 0) {
-			MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_TOKENFALLBACKINIT);
-			mptcp_subflow_early_fallback(msk, subflow);
-		}
+		if (mptcp_active_should_disable(sk))
+			mptcp_early_fallback(msk, subflow,
+					    MPTCP_MIB_MPCAPABLEACTIVEDISABLED);
+		else if (mptcp_token_new_connect(ssk) < 0)
+			mptcp_early_fallback(msk, subflow,
+					     MPTCP_MIB_TOKENFALLBACKINIT);
 	}
 
 	WRITE_ONCE(msk->write_seq, subflow->idsn);
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index c8b4ebe198dc..74579743264f 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -1220,24 +1220,6 @@ static inline bool mptcp_check_fallback(const struct sock *sk)
 	return __mptcp_check_fallback(msk);
 }
 
-static inline bool __mptcp_try_fallback(struct mptcp_sock *msk)
-{
-	if (__mptcp_check_fallback(msk)) {
-		pr_debug("TCP fallback already done (msk=%p)\n", msk);
-		return true;
-	}
-	spin_lock_bh(&msk->fallback_lock);
-	if (!msk->allow_infinite_fallback) {
-		spin_unlock_bh(&msk->fallback_lock);
-		return false;
-	}
-
-	msk->allow_subflows = false;
-	set_bit(MPTCP_FALLBACK_DONE, &msk->flags);
-	spin_unlock_bh(&msk->fallback_lock);
-	return true;
-}
-
 static inline bool __mptcp_has_initial_subflow(const struct mptcp_sock *msk)
 {
 	struct sock *ssk = READ_ONCE(msk->first);
@@ -1247,14 +1229,16 @@ static inline bool __mptcp_has_initial_subflow(const struct mptcp_sock *msk)
 			TCPF_SYN_RECV | TCPF_LISTEN));
 }
 
-static inline bool mptcp_try_fallback(struct sock *ssk)
+bool __mptcp_try_fallback(struct mptcp_sock *msk, int fb_mib);
+
+static inline bool mptcp_try_fallback(struct sock *ssk, int fb_mib)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
 	struct sock *sk = subflow->conn;
 	struct mptcp_sock *msk;
 
 	msk = mptcp_sk(sk);
-	if (!__mptcp_try_fallback(msk))
+	if (!__mptcp_try_fallback(msk, fb_mib))
 		return false;
 	if (READ_ONCE(msk->snd_data_fin_enable) && !(ssk->sk_shutdown & SEND_SHUTDOWN)) {
 		gfp_t saved_allocation = ssk->sk_allocation;
@@ -1272,12 +1256,13 @@ static inline bool mptcp_try_fallback(struct sock *ssk)
 
 #define pr_fallback(a) pr_debug("%s:fallback to TCP (msk=%p)\n", __func__, a)
 
-static inline void mptcp_subflow_early_fallback(struct mptcp_sock *msk,
-						struct mptcp_subflow_context *subflow)
+static inline void mptcp_early_fallback(struct mptcp_sock *msk,
+					struct mptcp_subflow_context *subflow,
+					int fb_mib)
 {
 	pr_fallback(msk);
 	subflow->request_mptcp = 0;
-	WARN_ON_ONCE(!__mptcp_try_fallback(msk));
+	WARN_ON_ONCE(!__mptcp_try_fallback(msk, fb_mib));
 }
 
 static inline bool mptcp_check_infinite_map(struct sk_buff *skb)
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 1802bc5435a1..600e59bba363 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -544,11 +544,13 @@ static void subflow_finish_connect(struct sock *sk, const struct sk_buff *skb)
 	mptcp_get_options(skb, &mp_opt);
 	if (subflow->request_mptcp) {
 		if (!(mp_opt.suboptions & OPTION_MPTCP_MPC_SYNACK)) {
-			if (!mptcp_try_fallback(sk))
+			if (!mptcp_try_fallback(sk,
+						MPTCP_MIB_MPCAPABLEACTIVEFALLBACK)) {
+				MPTCP_INC_STATS(sock_net(sk),
+						MPTCP_MIB_FALLBACKFAILED);
 				goto do_reset;
+			}
 
-			MPTCP_INC_STATS(sock_net(sk),
-					MPTCP_MIB_MPCAPABLEACTIVEFALLBACK);
 			pr_fallback(msk);
 			goto fallback;
 		}
@@ -1406,7 +1408,7 @@ static bool subflow_check_data_avail(struct sock *ssk)
 			return true;
 		}
 
-		if (!mptcp_try_fallback(ssk)) {
+		if (!mptcp_try_fallback(ssk, MPTCP_MIB_DSSFALLBACK)) {
 			/* fatal protocol error, close the socket.
 			 * subflow_error_report() will introduce the appropriate barriers
 			 */
@@ -1859,7 +1861,7 @@ static void subflow_state_change(struct sock *sk)
 
 	msk = mptcp_sk(parent);
 	if (subflow_simultaneous_connect(sk)) {
-		WARN_ON_ONCE(!mptcp_try_fallback(sk));
+		WARN_ON_ONCE(!mptcp_try_fallback(sk, MPTCP_MIB_SIMULTCONNFALLBACK));
 		pr_fallback(msk);
 		subflow->conn_finished = 1;
 		mptcp_propagate_state(parent, sk, subflow, NULL);
-- 
2.50.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 mptcp-net 5/5] mptcp: remove pr_fallback()
  2025-07-09 15:12 [PATCH v3 mptcp-net 0/5] mptcp: fix fallback-related races Paolo Abeni
                   ` (3 preceding siblings ...)
  2025-07-09 15:12 ` [PATCH v3 mptcp-net 4/5] mptcp: track fallbacks accurately via mibs Paolo Abeni
@ 2025-07-09 15:12 ` Paolo Abeni
  2025-07-09 18:03 ` [PATCH v3 mptcp-net 0/5] mptcp: fix fallback-related races MPTCP CI
  2025-07-10 17:41 ` Matthieu Baerts
  6 siblings, 0 replies; 13+ messages in thread
From: Paolo Abeni @ 2025-07-09 15:12 UTC (permalink / raw)
  To: mptcp

We can now fully track the fallback status of a given connection
via the relevant mibs; the mentioned helper is redundant. Remove
it completely.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
v2 -> v3:
  - dropped unused variable in subflow_state_change()
---
 net/mptcp/options.c  | 1 -
 net/mptcp/protocol.c | 1 -
 net/mptcp/protocol.h | 3 ---
 net/mptcp/subflow.c  | 4 ----
 4 files changed, 9 deletions(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index b49a8b3ffe00..3c08beffbfea 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -982,7 +982,6 @@ static bool check_fully_established(struct mptcp_sock *msk, struct sock *ssk,
 			MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_FALLBACKFAILED);
 			goto reset;
 		}
-		pr_fallback(msk);
 		return false;
 	}
 
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 54dc39fd1495..c46993bf9d8e 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1166,7 +1166,6 @@ static void mptcp_update_infinite_map(struct mptcp_sock *msk,
 	}
 
 	mptcp_subflow_ctx(ssk)->send_infinite_map = 0;
-	pr_fallback(msk);
 }
 
 #define MPTCP_MAX_GSO_SIZE (GSO_LEGACY_MAX_SIZE - (MAX_TCP_HEADER + 1))
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 74579743264f..db50b216143a 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -1254,13 +1254,10 @@ static inline bool mptcp_try_fallback(struct sock *ssk, int fb_mib)
 	return true;
 }
 
-#define pr_fallback(a) pr_debug("%s:fallback to TCP (msk=%p)\n", __func__, a)
-
 static inline void mptcp_early_fallback(struct mptcp_sock *msk,
 					struct mptcp_subflow_context *subflow,
 					int fb_mib)
 {
-	pr_fallback(msk);
 	subflow->request_mptcp = 0;
 	WARN_ON_ONCE(!__mptcp_try_fallback(msk, fb_mib));
 }
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 600e59bba363..3f1b62a9fe88 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -551,7 +551,6 @@ static void subflow_finish_connect(struct sock *sk, const struct sk_buff *skb)
 				goto do_reset;
 			}
 
-			pr_fallback(msk);
 			goto fallback;
 		}
 
@@ -1855,14 +1854,11 @@ static void subflow_state_change(struct sock *sk)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
 	struct sock *parent = subflow->conn;
-	struct mptcp_sock *msk;
 
 	__subflow_state_change(sk);
 
-	msk = mptcp_sk(parent);
 	if (subflow_simultaneous_connect(sk)) {
 		WARN_ON_ONCE(!mptcp_try_fallback(sk, MPTCP_MIB_SIMULTCONNFALLBACK));
-		pr_fallback(msk);
 		subflow->conn_finished = 1;
 		mptcp_propagate_state(parent, sk, subflow, NULL);
 	}
-- 
2.50.0



* Re: [PATCH v3 mptcp-net 0/5] mptcp: fix fallback-related races
  2025-07-09 15:12 [PATCH v3 mptcp-net 0/5] mptcp: fix fallback-related races Paolo Abeni
                   ` (4 preceding siblings ...)
  2025-07-09 15:12 ` [PATCH v3 mptcp-net 5/5] mptcp: remove pr_fallback() Paolo Abeni
@ 2025-07-09 18:03 ` MPTCP CI
  2025-07-10 17:41 ` Matthieu Baerts
  6 siblings, 0 replies; 13+ messages in thread
From: MPTCP CI @ 2025-07-09 18:03 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: mptcp

Hi Paolo,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal: Success! ✅
- KVM Validation: debug: Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/16175222973

Initiator: Matthieu Baerts (NGI0)
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/2b0c77e07efb
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=980585


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)


* Re: [PATCH v3 mptcp-net 0/5] mptcp: fix fallback-related races
  2025-07-09 15:12 [PATCH v3 mptcp-net 0/5] mptcp: fix fallback-related races Paolo Abeni
                   ` (5 preceding siblings ...)
  2025-07-09 18:03 ` [PATCH v3 mptcp-net 0/5] mptcp: fix fallback-related races MPTCP CI
@ 2025-07-10 17:41 ` Matthieu Baerts
  6 siblings, 0 replies; 13+ messages in thread
From: Matthieu Baerts @ 2025-07-10 17:41 UTC (permalink / raw)
  To: Paolo Abeni, mptcp

Hi Paolo,

On 09/07/2025 17:12, Paolo Abeni wrote:
> This series contains 3 fixes somewhat related to various races we have
> while handling fallback and 2 small follow-up likely more suited for
> net-next.
> 
> The root cause of the issues addressed here is that the check for
> "we can fallback to tcp now" and the related action are not atomic. That
> also applies to fallback due to MP_FAIL - where the window race is even
> wider.
> 
> Address the issue introducing an additional spinlock to bundle together
> all the relevant events, as per patch 1 and 2.
> 
> Note that mptcp_disconnect() unconditionally clears the fallback
> status (zeroing msk->flags) but doesn't touch the
> `allow_infinite_fallback` flag. This issue is addressed in patch 3.
> 
> Patch 4 cleans up the fallback code a bit, introducing a specific MIB
> for each fallback reason, and patch 5 drops the hopefully now
> redundant pr_fallback().
> ---
> v2 -> v3:
>  - mptcp_do_fallback  -> mptcp_try_fallback
>  - refactored patch 3/5
>  - changed mibs names, increment fail only when the protocol mandates it
>  - fix W=1 warn in patch 5/5

Thank you for the new version!

It looks good to me, I just have a few small questions, not really
blocking the series, and I can eventually do the small modifications
when applying the patches if you prefer.

> 
> Paolo Abeni (5):
>   mptcp: make fallback action and fallback decision atomic
>   mptcp: plug races between subflow fail and subflow creation
>   mptcp: reset fallback status gracefully at disconnect() time
>   mptcp: track fallbacks accurately via mibs
>   mptcp: remove pr_fallback()
> 
>  net/mptcp/ctrl.c     |  4 +-
>  net/mptcp/mib.c      |  5 +++
>  net/mptcp/mib.h      |  7 ++++
>  net/mptcp/options.c  |  6 ++-
>  net/mptcp/pm.c       |  8 +++-
>  net/mptcp/protocol.c | 97 ++++++++++++++++++++++++++++++++++----------
>  net/mptcp/protocol.h | 35 ++++++++--------
>  net/mptcp/subflow.c  | 40 ++++++++++--------
>  8 files changed, 140 insertions(+), 62 deletions(-)
> 

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.



* Re: [PATCH v3 mptcp-net 1/5] mptcp: make fallback action and fallback decision atomic
  2025-07-09 15:12 ` [PATCH v3 mptcp-net 1/5] mptcp: make fallback action and fallback decision atomic Paolo Abeni
@ 2025-07-10 17:41   ` Matthieu Baerts
  2025-07-11  7:26     ` Paolo Abeni
  0 siblings, 1 reply; 13+ messages in thread
From: Matthieu Baerts @ 2025-07-10 17:41 UTC (permalink / raw)
  To: Paolo Abeni, mptcp

Hi Paolo,

On 09/07/2025 17:12, Paolo Abeni wrote:
> Syzkaller reported the following splat:

(...)

> Since we need to track the 'fallback is possible' condition and the
> fallback status separately, there are a few possible races open between
> the check and the actual fallback action.
> 
> Add a spinlock to protect the fallback related information and use
> it to close all the possible related races. While at it also remove the
> too-early clearing of allow_infinite_fallback in __mptcp_subflow_connect():
> the field will be correctly cleared by subflow_finish_connect() if/
> when the connection will complete successfully.
> 
> If fallback is not possible, as per RFC, reset the current subflow.
> 
> Since the fallback operation can now fail and return value should be
> checked, rename the helper accordingly.
> 
> Reported-by: Matthieu Baerts <matttbe@kernel.org>
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/570
> Reported-by: syzbot+5cf807c20386d699b524@syzkaller.appspotmail.com
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/555
> Fixes: 0530020a7c8f ("mptcp: track and update contiguous data status")
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
> v2 -> v3:
>   - fix return value check in check_fully_established()
>   - add WARN when hitting 'impossible' fallback failures.
>   - drop empty line
>   - mptcp_do_fallback -> mptcp_try_fallback

Thank you for the v3!

> 
> v1 -> v2:
>   - remove unneeded and racing allow_infinite_fallback in
>     __mptcp_subflow_connect()
>   - really close race in subflow_check_data_avail()
> 
> Sharing earlier to feed syzkaller.
> Hopefully a follow-up on mptcp-net-next will cleanup a bit the fallback
> handling, I tried to minimize the delta for backport's sake

syzkaller is still OK with this v3!

(...)

> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 1de42dc4e8ea..4ddf3a1e0085 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -560,10 +560,9 @@ static bool mptcp_check_data_fin(struct sock *sk)
>  
>  static void mptcp_dss_corruption(struct mptcp_sock *msk, struct sock *ssk)
>  {
> -	if (READ_ONCE(msk->allow_infinite_fallback)) {
> +	if (mptcp_try_fallback(ssk)) {
>  		MPTCP_INC_STATS(sock_net(ssk),
>  				MPTCP_MIB_DSSCORRUPTIONFALLBACK);
> -		mptcp_do_fallback(ssk);
>  	} else {
>  		MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_DSSCORRUPTIONRESET);
>  		mptcp_subflow_reset(ssk);
> @@ -803,6 +802,14 @@ static bool __mptcp_finish_join(struct mptcp_sock *msk, struct sock *ssk)
>  	if (sk->sk_state != TCP_ESTABLISHED)
>  		return false;
>  
> +	spin_lock_bh(&msk->fallback_lock);
> +	if (__mptcp_check_fallback(msk)) {
> +		spin_unlock_bh(&msk->fallback_lock);
> +		return false;
> +	}
> +	mptcp_subflow_joined(msk, ssk);
> +	spin_unlock_bh(&msk->fallback_lock);

(just to be sure, do you prefer to explicitly have the lock and fallback
check here and in mptcp_finish_join, and not in mptcp_subflow_joined?)
> +
>  	/* attach to msk socket only after we are sure we will deal with it
>  	 * at close time
>  	 */
> @@ -811,7 +818,6 @@ static bool __mptcp_finish_join(struct mptcp_sock *msk, struct sock *ssk)
>  
>  	mptcp_subflow_ctx(ssk)->subflow_id = msk->subflow_id++;
>  	mptcp_sockopt_sync_locked(msk, ssk);
> -	mptcp_subflow_joined(msk, ssk);
>  	mptcp_stop_tout_timer(sk);
>  	__mptcp_propagate_sndbuf(sk, ssk);
>  	return true;
> @@ -1136,10 +1142,14 @@ static void mptcp_update_infinite_map(struct mptcp_sock *msk,
>  	mpext->infinite_map = 1;
>  	mpext->data_len = 0;
>  
> +	if (!mptcp_try_fallback(ssk)) {
> +		mptcp_subflow_reset(ssk);
> +		return;
> +	}
> +
>  	MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_INFINITEMAPTX);
>  	mptcp_subflow_ctx(ssk)->send_infinite_map = 0;
>  	pr_fallback(msk);
> -	mptcp_do_fallback(ssk);
>  }
>  
>  #define MPTCP_MAX_GSO_SIZE (GSO_LEGACY_MAX_SIZE - (MAX_TCP_HEADER + 1))
> @@ -2543,9 +2553,9 @@ static void mptcp_check_fastclose(struct mptcp_sock *msk)
>  
>  static void __mptcp_retrans(struct sock *sk)
>  {
> +	struct mptcp_sendmsg_info info = { .data_lock_held = true, };

(You might have missed my comment in the v2, or I missed your reply)

Is this modification needed? I might have missed something (or used
later?), but it looks like this 'data_lock_held' is not used here or in
mptcp_sendmsg_frag(). Then no need to set it to true, no?

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.



* Re: [PATCH v3 mptcp-net 4/5] mptcp: track fallbacks accurately via mibs
  2025-07-09 15:12 ` [PATCH v3 mptcp-net 4/5] mptcp: track fallbacks accurately via mibs Paolo Abeni
@ 2025-07-10 17:41   ` Matthieu Baerts
  2025-07-11  7:26     ` Paolo Abeni
  0 siblings, 1 reply; 13+ messages in thread
From: Matthieu Baerts @ 2025-07-10 17:41 UTC (permalink / raw)
  To: Paolo Abeni, mptcp

Hi Paolo,

On 09/07/2025 17:12, Paolo Abeni wrote:
> Add the mibs required to cover the few possible fallback causes still
> lacking such info.
> 
> Move the relevant mib increment into the fallback helper, so that no
> eventual future fallback operation will miss a paired mib increment.
> 
> Additionally track failed fallback via its own mib, such mib is
> incremented only when a fallback mandated by the protocol fails - due
> to racing subflow creation.
> 
> While at the above, rename an existing helper to reduce long lines
> problems all along.

(...)

> diff --git a/net/mptcp/mib.c b/net/mptcp/mib.c
> index 0c24545f0e8d..ee51d14ba510 100644
> --- a/net/mptcp/mib.c
> +++ b/net/mptcp/mib.c
> @@ -80,6 +80,11 @@ static const struct snmp_mib mptcp_snmp_list[] = {
>  	SNMP_MIB_ITEM("RcvWndConflict", MPTCP_MIB_RCVWNDCONFLICT),
>  	SNMP_MIB_ITEM("MPCurrEstab", MPTCP_MIB_CURRESTAB),
>  	SNMP_MIB_ITEM("Blackhole", MPTCP_MIB_BLACKHOLE),
> +	SNMP_MIB_ITEM("DataFallback", MPTCP_MIB_MPCAPABLEDATAFALLBACK),

I see that you modified the enum, but not the exposed name. Is it
intentional?

  MPCapableFallbackData

And placed above with the other MPCapableFallback ones? Same in mib.h?

> +	SNMP_MIB_ITEM("MD5sumFallback", MPTCP_MIB_MD5SIGFALLBACK),

Same here? (s/sum/Sig)

  MD5SigFallback

> +	SNMP_MIB_ITEM("DssFallback", MPTCP_MIB_DSSFALLBACK),
> +	SNMP_MIB_ITEM("SimultFallback", MPTCP_MIB_SIMULTCONNFALLBACK),

Same here?

  SimultConnectFallback

> +	SNMP_MIB_ITEM("FallbackFailed", MPTCP_MIB_FALLBACKFAILED),
>  	SNMP_MIB_SENTINEL
>  };
>  
Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.



* Re: [PATCH v3 mptcp-net 1/5] mptcp: make fallback action and fallback decision atomic
  2025-07-10 17:41   ` Matthieu Baerts
@ 2025-07-11  7:26     ` Paolo Abeni
  2025-07-11  7:38       ` Matthieu Baerts
  0 siblings, 1 reply; 13+ messages in thread
From: Paolo Abeni @ 2025-07-11  7:26 UTC (permalink / raw)
  To: Matthieu Baerts, mptcp

On 7/10/25 7:41 PM, Matthieu Baerts wrote:
> On 09/07/2025 17:12, Paolo Abeni wrote:
>> @@ -803,6 +802,14 @@ static bool __mptcp_finish_join(struct mptcp_sock *msk, struct sock *ssk)
>>  	if (sk->sk_state != TCP_ESTABLISHED)
>>  		return false;
>>  
>> +	spin_lock_bh(&msk->fallback_lock);
>> +	if (__mptcp_check_fallback(msk)) {
>> +		spin_unlock_bh(&msk->fallback_lock);
>> +		return false;
>> +	}
>> +	mptcp_subflow_joined(msk, ssk);
>> +	spin_unlock_bh(&msk->fallback_lock);
> 
> (just to be sure, do you prefer to explicitly have the lock and fallback
> check here and in mptcp_finish_join, and not in mptcp_subflow_joined?)

Moving the check into mptcp_subflow_joined would require changing that
helper's return value (void -> bool), and likely its name (as the join
could still fail there). Overall more code delta, with IMHO unclear
gain.

>> @@ -2543,9 +2553,9 @@ static void mptcp_check_fastclose(struct mptcp_sock *msk)
>>  
>>  static void __mptcp_retrans(struct sock *sk)
>>  {
>> +	struct mptcp_sendmsg_info info = { .data_lock_held = true, };
> 
> (You might have missed my comment in the v2, or I missed your reply)
> 
> Is this modification needed? I might have missed something (or used
> later?), but it looks like this 'data_lock_held' is not used here or in
> mptcp_sendmsg_frag(). Then no need to set it to true, no?

Sorry, I forgot to reply. This code chunk is needed to make
mptcp_alloc_tx_skb() - in mptcp_sendmsg_frag() - atomic. After this
patch the mptcp_sendmsg_frag() call in __mptcp_retrans indeed happens in
atomic scope and the memory allocation there can't be blocking.

/P



* Re: [PATCH v3 mptcp-net 4/5] mptcp: track fallbacks accurately via mibs
  2025-07-10 17:41   ` Matthieu Baerts
@ 2025-07-11  7:26     ` Paolo Abeni
  0 siblings, 0 replies; 13+ messages in thread
From: Paolo Abeni @ 2025-07-11  7:26 UTC (permalink / raw)
  To: Matthieu Baerts, mptcp

On 7/10/25 7:41 PM, Matthieu Baerts wrote:
> On 09/07/2025 17:12, Paolo Abeni wrote:
>> Add the mibs required to cover the few possible fallback causes still
>> lacking such info.
>>
>> Move the relevant mib increment into the fallback helper, so that no
>> eventual future fallback operation will miss a paired mib increment.
>>
>> Additionally track failed fallback via its own mib, such mib is
>> incremented only when a fallback mandated by the protocol fails - due
>> to racing subflow creation.
>>
>> While at the above, rename an existing helper to reduce long lines
>> problems all along.
> 
> (...)
> 
>> diff --git a/net/mptcp/mib.c b/net/mptcp/mib.c
>> index 0c24545f0e8d..ee51d14ba510 100644
>> --- a/net/mptcp/mib.c
>> +++ b/net/mptcp/mib.c
>> @@ -80,6 +80,11 @@ static const struct snmp_mib mptcp_snmp_list[] = {
>>  	SNMP_MIB_ITEM("RcvWndConflict", MPTCP_MIB_RCVWNDCONFLICT),
>>  	SNMP_MIB_ITEM("MPCurrEstab", MPTCP_MIB_CURRESTAB),
>>  	SNMP_MIB_ITEM("Blackhole", MPTCP_MIB_BLACKHOLE),
>> +	SNMP_MIB_ITEM("DataFallback", MPTCP_MIB_MPCAPABLEDATAFALLBACK),
> 
> I see that you modified the enum, but not the exposed name. Is it
> intentional?

Oops, sorry. I rushed this a bit too much. I'll send a v4.

Thanks,

Paolo



* Re: [PATCH v3 mptcp-net 1/5] mptcp: make fallback action and fallback decision atomic
  2025-07-11  7:26     ` Paolo Abeni
@ 2025-07-11  7:38       ` Matthieu Baerts
  0 siblings, 0 replies; 13+ messages in thread
From: Matthieu Baerts @ 2025-07-11  7:38 UTC (permalink / raw)
  To: Paolo Abeni, mptcp

Hi Paolo,

Thank you for your reply!

On 11/07/2025 09:26, Paolo Abeni wrote:
> On 7/10/25 7:41 PM, Matthieu Baerts wrote:
>> On 09/07/2025 17:12, Paolo Abeni wrote:
>>> @@ -803,6 +802,14 @@ static bool __mptcp_finish_join(struct mptcp_sock *msk, struct sock *ssk)
>>>  	if (sk->sk_state != TCP_ESTABLISHED)
>>>  		return false;
>>>  
>>> +	spin_lock_bh(&msk->fallback_lock);
>>> +	if (__mptcp_check_fallback(msk)) {
>>> +		spin_unlock_bh(&msk->fallback_lock);
>>> +		return false;
>>> +	}
>>> +	mptcp_subflow_joined(msk, ssk);
>>> +	spin_unlock_bh(&msk->fallback_lock);
>>
>> (just to be sure, do you prefer to explicitly have the lock and fallback
>> check here and in mptcp_finish_join, and not in mptcp_subflow_joined?)
> 
> Moving the check into mptcp_subflow_joined would require changing that
> helper's return value (void -> bool), and likely its name (as the join
> could still fail there). Overall more code delta, with IMHO unclear
> gain.

It makes sense, fine by me.

>>> @@ -2543,9 +2553,9 @@ static void mptcp_check_fastclose(struct mptcp_sock *msk)
>>>  
>>>  static void __mptcp_retrans(struct sock *sk)
>>>  {
>>> +	struct mptcp_sendmsg_info info = { .data_lock_held = true, };
>>
>> (You might have missed my comment in the v2, or I missed your reply)
>>
>> Is this modification needed? I might have missed something (or used
>> later?), but it looks like this 'data_lock_held' is not used here or in
>> mptcp_sendmsg_frag(). Then no need to set it to true, no?
> 
> Sorry, I forgot to reply. This code chunk is needed to make
> mptcp_alloc_tx_skb() - in mptcp_sendmsg_frag() - atomic. After this
> patch the mptcp_sendmsg_frag() call in __mptcp_retrans indeed happens in
> atomic scope and the memory allocation there can't be blocking.

Oh, my bad, I missed the call to mptcp_alloc_tx_skb() in
mptcp_sendmsg_frag() (twice...)!

All good then!

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.


