public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev,
	syzbot+c53d4d3ddb327e80bc51@syzkaller.appspotmail.com,
	Paolo Abeni <pabeni@redhat.com>,
	Mat Martineau <martineau@kernel.org>,
	Matthieu Baerts <matttbe@kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 6.6 30/49] mptcp: fix inconsistent state on fastopen race
Date: Wed,  3 Jan 2024 17:55:50 +0100	[thread overview]
Message-ID: <20240103164839.629732451@linuxfoundation.org> (raw)
In-Reply-To: <20240103164834.970234661@linuxfoundation.org>

6.6-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Paolo Abeni <pabeni@redhat.com>

[ Upstream commit 4fd19a30701659af5839b7bd19d1f05f05933ebe ]

The netlink PM can race with fastopen self-connect attempts, shutting
down the first subflow via:

MPTCP_PM_CMD_DEL_ADDR -> mptcp_nl_remove_id_zero_address ->
  mptcp_pm_nl_rm_subflow_received -> mptcp_close_ssk

and transitioning such subflow to FIN_WAIT1 status before the syn-ack
packet is processed. The MPTCP code does not react to such state change,
leaving the connection in not-fallback status and the subflow handshake
uncompleted, triggering the following splat:

  WARNING: CPU: 0 PID: 10630 at net/mptcp/subflow.c:1405 subflow_data_ready+0x39f/0x690 net/mptcp/subflow.c:1405
  Modules linked in:
  CPU: 0 PID: 10630 Comm: kworker/u4:11 Not tainted 6.6.0-syzkaller-14500-g1c41041124bd #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023
  Workqueue: bat_events batadv_nc_worker
  RIP: 0010:subflow_data_ready+0x39f/0x690 net/mptcp/subflow.c:1405
  Code: 18 89 ee e8 e3 d2 21 f7 40 84 ed 75 1f e8 a9 d7 21 f7 44 89 fe bf 07 00 00 00 e8 0c d3 21 f7 41 83 ff 07 74 07 e8 91 d7 21 f7 <0f> 0b e8 8a d7 21 f7 48 89 df e8 d2 b2 ff ff 31 ff 89 c5 89 c6 e8
  RSP: 0018:ffffc90000007448 EFLAGS: 00010246
  RAX: 0000000000000000 RBX: ffff888031efc700 RCX: ffffffff8a65baf4
  RDX: ffff888043222140 RSI: ffffffff8a65baff RDI: 0000000000000005
  RBP: 0000000000000000 R08: 0000000000000005 R09: 0000000000000007
  R10: 000000000000000b R11: 0000000000000000 R12: 1ffff92000000e89
  R13: ffff88807a534d80 R14: ffff888021c11a00 R15: 000000000000000b
  FS:  0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007fa19a0ffc81 CR3: 000000007a2db000 CR4: 00000000003506f0
  DR0: 000000000000d8dd DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
  Call Trace:
   <IRQ>
   tcp_data_ready+0x14c/0x5b0 net/ipv4/tcp_input.c:5128
   tcp_data_queue+0x19c3/0x5190 net/ipv4/tcp_input.c:5208
   tcp_rcv_state_process+0x11ef/0x4e10 net/ipv4/tcp_input.c:6844
   tcp_v4_do_rcv+0x369/0xa10 net/ipv4/tcp_ipv4.c:1929
   tcp_v4_rcv+0x3888/0x3b30 net/ipv4/tcp_ipv4.c:2329
   ip_protocol_deliver_rcu+0x9f/0x480 net/ipv4/ip_input.c:205
   ip_local_deliver_finish+0x2e4/0x510 net/ipv4/ip_input.c:233
   NF_HOOK include/linux/netfilter.h:314 [inline]
   NF_HOOK include/linux/netfilter.h:308 [inline]
   ip_local_deliver+0x1b6/0x550 net/ipv4/ip_input.c:254
   dst_input include/net/dst.h:461 [inline]
   ip_rcv_finish+0x1c4/0x2e0 net/ipv4/ip_input.c:449
   NF_HOOK include/linux/netfilter.h:314 [inline]
   NF_HOOK include/linux/netfilter.h:308 [inline]
   ip_rcv+0xce/0x440 net/ipv4/ip_input.c:569
   __netif_receive_skb_one_core+0x115/0x180 net/core/dev.c:5527
   __netif_receive_skb+0x1f/0x1b0 net/core/dev.c:5641
   process_backlog+0x101/0x6b0 net/core/dev.c:5969
   __napi_poll.constprop.0+0xb4/0x540 net/core/dev.c:6531
   napi_poll net/core/dev.c:6600 [inline]
   net_rx_action+0x956/0xe90 net/core/dev.c:6733
   __do_softirq+0x21a/0x968 kernel/softirq.c:553
   do_softirq kernel/softirq.c:454 [inline]
   do_softirq+0xaa/0xe0 kernel/softirq.c:441
   </IRQ>
   <TASK>
   __local_bh_enable_ip+0xf8/0x120 kernel/softirq.c:381
   spin_unlock_bh include/linux/spinlock.h:396 [inline]
   batadv_nc_purge_paths+0x1ce/0x3c0 net/batman-adv/network-coding.c:471
   batadv_nc_worker+0x9b1/0x10e0 net/batman-adv/network-coding.c:722
   process_one_work+0x884/0x15c0 kernel/workqueue.c:2630
   process_scheduled_works kernel/workqueue.c:2703 [inline]
   worker_thread+0x8b9/0x1290 kernel/workqueue.c:2784
   kthread+0x33c/0x440 kernel/kthread.c:388
   ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
   ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242
   </TASK>

To address the issue, catch the racing subflow state change and
use it to cause the MPTCP fallback. Such fallback is also used to
cause the first subflow state propagation to the msk socket via
mptcp_set_connected(). After this change, the first subflow can
additionally propagate the TCP_FIN_WAIT1 state, so rename the
helper accordingly.

Finally, if the state propagation is delayed to the msk release
callback, the first subflow can change to a different state in between.
Cache the relevant target state in a new msk-level field and use
such value to update the msk state at release time.

Fixes: 1e777f39b4d7 ("mptcp: add MSG_FASTOPEN sendmsg flag support")
Cc: stable@vger.kernel.org
Reported-by: <syzbot+c53d4d3ddb327e80bc51@syzkaller.appspotmail.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/458
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/mptcp/protocol.c |  6 +++---
 net/mptcp/protocol.h |  9 ++++++---
 net/mptcp/subflow.c  | 28 +++++++++++++++++-----------
 3 files changed, 26 insertions(+), 17 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index dc030551cac13..5c003a0f0fe5b 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -3397,12 +3397,12 @@ static void mptcp_release_cb(struct sock *sk)
 	if (__test_and_clear_bit(MPTCP_CLEAN_UNA, &msk->cb_flags))
 		__mptcp_clean_una_wakeup(sk);
 	if (unlikely(msk->cb_flags)) {
-		/* be sure to set the current sk state before taking actions
+		/* be sure to sync the msk state before taking actions
 		 * depending on sk_state (MPTCP_ERROR_REPORT)
 		 * On sk release avoid actions depending on the first subflow
 		 */
-		if (__test_and_clear_bit(MPTCP_CONNECTED, &msk->cb_flags) && msk->first)
-			__mptcp_set_connected(sk);
+		if (__test_and_clear_bit(MPTCP_SYNC_STATE, &msk->cb_flags) && msk->first)
+			__mptcp_sync_state(sk, msk->pending_state);
 		if (__test_and_clear_bit(MPTCP_ERROR_REPORT, &msk->cb_flags))
 			__mptcp_error_report(sk);
 		if (__test_and_clear_bit(MPTCP_SYNC_SNDBUF, &msk->cb_flags))
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 40866acd91ad5..07c5ac37d092b 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -122,7 +122,7 @@
 #define MPTCP_ERROR_REPORT	3
 #define MPTCP_RETRANSMIT	4
 #define MPTCP_FLUSH_JOIN_LIST	5
-#define MPTCP_CONNECTED		6
+#define MPTCP_SYNC_STATE	6
 #define MPTCP_SYNC_SNDBUF	7
 
 struct mptcp_skb_cb {
@@ -293,6 +293,9 @@ struct mptcp_sock {
 	bool		use_64bit_ack; /* Set when we received a 64-bit DSN */
 	bool		csum_enabled;
 	bool		allow_infinite_fallback;
+	u8		pending_state; /* A subflow asked to set this sk_state,
+					* protected by the msk data lock
+					*/
 	u8		mpc_endpoint_id;
 	u8		recvmsg_inq:1,
 			cork:1,
@@ -711,7 +714,7 @@ void mptcp_get_options(const struct sk_buff *skb,
 		       struct mptcp_options_received *mp_opt);
 
 void mptcp_finish_connect(struct sock *sk);
-void __mptcp_set_connected(struct sock *sk);
+void __mptcp_sync_state(struct sock *sk, int state);
 void mptcp_reset_tout_timer(struct mptcp_sock *msk, unsigned long fail_tout);
 
 static inline void mptcp_stop_tout_timer(struct sock *sk)
@@ -1101,7 +1104,7 @@ static inline bool subflow_simultaneous_connect(struct sock *sk)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
 
-	return sk->sk_state == TCP_ESTABLISHED &&
+	return (1 << sk->sk_state) & (TCPF_ESTABLISHED | TCPF_FIN_WAIT1) &&
 	       is_active_ssk(subflow) &&
 	       !subflow->conn_finished;
 }
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index d8827427ffc84..f3a1e4aa0e5eb 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -419,22 +419,28 @@ static bool subflow_use_different_dport(struct mptcp_sock *msk, const struct soc
 	return inet_sk(sk)->inet_dport != inet_sk((struct sock *)msk)->inet_dport;
 }
 
-void __mptcp_set_connected(struct sock *sk)
+void __mptcp_sync_state(struct sock *sk, int state)
 {
-	__mptcp_propagate_sndbuf(sk, mptcp_sk(sk)->first);
+	struct mptcp_sock *msk = mptcp_sk(sk);
+
+	__mptcp_propagate_sndbuf(sk, msk->first);
 	if (sk->sk_state == TCP_SYN_SENT) {
-		inet_sk_state_store(sk, TCP_ESTABLISHED);
+		inet_sk_state_store(sk, state);
 		sk->sk_state_change(sk);
 	}
 }
 
-static void mptcp_set_connected(struct sock *sk)
+static void mptcp_propagate_state(struct sock *sk, struct sock *ssk)
 {
+	struct mptcp_sock *msk = mptcp_sk(sk);
+
 	mptcp_data_lock(sk);
-	if (!sock_owned_by_user(sk))
-		__mptcp_set_connected(sk);
-	else
-		__set_bit(MPTCP_CONNECTED, &mptcp_sk(sk)->cb_flags);
+	if (!sock_owned_by_user(sk)) {
+		__mptcp_sync_state(sk, ssk->sk_state);
+	} else {
+		msk->pending_state = ssk->sk_state;
+		__set_bit(MPTCP_SYNC_STATE, &msk->cb_flags);
+	}
 	mptcp_data_unlock(sk);
 }
 
@@ -496,7 +502,7 @@ static void subflow_finish_connect(struct sock *sk, const struct sk_buff *skb)
 		subflow_set_remote_key(msk, subflow, &mp_opt);
 		MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_MPCAPABLEACTIVEACK);
 		mptcp_finish_connect(sk);
-		mptcp_set_connected(parent);
+		mptcp_propagate_state(parent, sk);
 	} else if (subflow->request_join) {
 		u8 hmac[SHA256_DIGEST_SIZE];
 
@@ -540,7 +546,7 @@ static void subflow_finish_connect(struct sock *sk, const struct sk_buff *skb)
 	} else if (mptcp_check_fallback(sk)) {
 fallback:
 		mptcp_rcv_space_init(msk, sk);
-		mptcp_set_connected(parent);
+		mptcp_propagate_state(parent, sk);
 	}
 	return;
 
@@ -1732,7 +1738,7 @@ static void subflow_state_change(struct sock *sk)
 		mptcp_rcv_space_init(msk, sk);
 		pr_fallback(msk);
 		subflow->conn_finished = 1;
-		mptcp_set_connected(parent);
+		mptcp_propagate_state(parent, sk);
 	}
 
 	/* as recvmsg() does not acquire the subflow socket for ssk selection
-- 
2.43.0




  parent reply	other threads:[~2024-01-03 17:14 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-03 16:55 [PATCH 6.6 00/49] 6.6.10-rc1 review Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 01/49] ksmbd: Remove unused field in ksmbd_user struct Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 02/49] ksmbd: reorganize ksmbd_iov_pin_rsp() Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 03/49] ksmbd: fix kernel-doc comment of ksmbd_vfs_setxattr() Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 04/49] ksmbd: fix missing RDMA-capable flag for IPoIB device in ksmbd_rdma_capable_netdev() Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 05/49] ksmbd: add support for surrogate pair conversion Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 06/49] ksmbd: no need to wait for binded connection termination at logoff Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 07/49] ksmbd: fix kernel-doc comment of ksmbd_vfs_kern_path_locked() Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 08/49] ksmbd: prevent memory leak on error return Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 09/49] ksmbd: separately allocate ci per dentry Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 10/49] ksmbd: move oplock handling after unlock parent dir Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 11/49] ksmbd: release interim response after sending status pending response Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 12/49] ksmbd: move setting SMB2_FLAGS_ASYNC_COMMAND and AsyncId Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 13/49] ksmbd: dont update ->op_state as OPLOCK_STATE_NONE on error Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 14/49] ksmbd: set epoch in create context v2 lease Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 15/49] ksmbd: set v2 lease capability Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 16/49] ksmbd: downgrade RWH lease caching state to RH for directory Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 17/49] ksmbd: send v2 lease break notification " Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 18/49] ksmbd: lazy v2 lease break on smb2_write() Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 19/49] ksmbd: avoid duplicate opinfo_put() call on error of smb21_lease_break_ack() Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 20/49] fs: new accessor methods for atime and mtime Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 21/49] client: convert to new timestamp accessors Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 22/49] fs: cifs: Fix atime update check Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 23/49] virtio_ring: fix syncs DMA memory with different direction Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 24/49] kexec: fix KEXEC_FILE dependencies Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 25/49] kexec: select CRYPTO from KEXEC_FILE instead of depending on it Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 26/49] linux/export: Fix alignment for 64-bit ksymtab entries Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 27/49] linux/export: Ensure natural alignment of kcrctab array Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 28/49] mptcp: refactor sndbuf auto-tuning Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 29/49] mptcp: fix possible NULL pointer dereference on close Greg Kroah-Hartman
2024-01-03 16:55 ` Greg Kroah-Hartman [this message]
2024-01-03 16:55 ` [PATCH 6.6 31/49] block: renumber QUEUE_FLAG_HW_WC Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 32/49] platform/x86/intel/pmc: Add suspend callback Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 33/49] platform/x86/intel/pmc: Allow reenabling LTRs Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 34/49] platform/x86/intel/pmc: Move GBE LTR ignore to suspend callback Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 35/49] ksmbd: fix slab-out-of-bounds in smb_strndup_from_utf16() Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 36/49] platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe Greg Kroah-Hartman
2024-01-04  9:01   ` Shinichiro Kawasaki
2024-01-03 16:55 ` [PATCH 6.6 37/49] maple_tree: do not preallocate nodes for slot stores Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 38/49] selftests: secretmem: floor the memory size to the multiple of page_size Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 6.6 39/49] mm/filemap: avoid buffered read/write race to read inconsistent data Greg Kroah-Hartman
2024-01-03 16:56 ` [PATCH 6.6 40/49] mm: migrate high-order folios in swap cache correctly Greg Kroah-Hartman
2024-01-03 16:56 ` [PATCH 6.6 41/49] mm/memory-failure: cast index to loff_t before shifting it Greg Kroah-Hartman
2024-01-03 16:56 ` [PATCH 6.6 42/49] mm/memory-failure: check the mapcount of the precise page Greg Kroah-Hartman
2024-01-03 16:56 ` [PATCH 6.6 43/49] Revert "nvme-fc: fix race between error recovery and creating association" Greg Kroah-Hartman
2024-01-03 16:56 ` [PATCH 6.6 44/49] ring-buffer: Fix wake ups when buffer_percent is set to 100 Greg Kroah-Hartman
2024-01-03 16:56 ` [PATCH 6.6 45/49] ftrace: Fix modification of direct_function hash while in use Greg Kroah-Hartman
2024-01-03 16:56 ` [PATCH 6.6 46/49] tracing: Fix blocked reader of snapshot buffer Greg Kroah-Hartman
2024-01-03 16:56 ` [PATCH 6.6 47/49] wifi: cfg80211: fix CQM for non-range use Greg Kroah-Hartman
2024-01-03 16:56 ` [PATCH 6.6 48/49] wifi: nl80211: fix deadlock in nl80211_set_cqm_rssi (6.6.x) Greg Kroah-Hartman
2024-01-03 16:56 ` [PATCH 6.6 49/49] netfilter: nf_tables: skip set commit for deleted/destroyed sets Greg Kroah-Hartman
2024-01-03 17:44 ` [PATCH 6.6 00/49] 6.6.10-rc1 review Nam Cao
2024-01-03 18:57 ` SeongJae Park
2024-01-03 22:04 ` Florian Fainelli
2024-01-03 23:35 ` Kelsey Steele
2024-01-04  0:18 ` Shuah Khan
2024-01-04  2:24 ` Takeshi Ogasawara
2024-01-04  4:10 ` Daniel Díaz
2024-01-04  7:15   ` Daniel Díaz
2024-01-04  7:58     ` Greg Kroah-Hartman
2024-01-04  8:21       ` Johannes Berg
2024-01-04 12:39       ` Naresh Kamboju
2024-01-04 12:58         ` Greg Kroah-Hartman
2024-01-04  5:20 ` Bagas Sanjaya
2024-01-04  7:55 ` Luna Jernberg
2024-01-04  7:57   ` Greg Kroah-Hartman
2024-01-04 10:26 ` Ron Economos
2024-01-04 11:53 ` Harshit Mogalapalli
2024-01-04 16:52 ` Jon Hunter
2024-01-04 17:12 ` Allen
2024-01-05  1:04 ` Guenter Roeck
2024-01-05  2:43 ` Namjae Jeon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240103164839.629732451@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=davem@davemloft.net \
    --cc=martineau@kernel.org \
    --cc=matttbe@kernel.org \
    --cc=pabeni@redhat.com \
    --cc=patches@lists.linux.dev \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=syzbot+c53d4d3ddb327e80bc51@syzkaller.appspotmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox