netdev.vger.kernel.org archive mirror
* [PATCH net 0/7] mptcp: fixes for 6.3
@ 2023-02-27 17:29 Matthieu Baerts
  2023-02-27 17:29 ` [PATCH net 1/7] mptcp: fix possible deadlock in subflow_error_report Matthieu Baerts
                   ` (7 more replies)
  0 siblings, 8 replies; 10+ messages in thread
From: Matthieu Baerts @ 2023-02-27 17:29 UTC
  To: mptcp, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Menglong Dong, Mengen Sun, Shuah Khan, Florian Westphal,
	Jiang Biao
  Cc: netdev, linux-kernel, linux-kselftest, Matthieu Baerts, stable,
	Christoph Paasch, Geliang Tang

Patch 1 fixes a possible deadlock in subflow_error_report() reported by
lockdep. The report was in fact a false positive but the modification
makes sense and silences lockdep to allow syzkaller to find real issues.
The regression was introduced in v5.12.

Patch 2 is a refactoring needed to be able to fix the next two issues.
It improves the situation and can be backported as far back as v6.0.

Patches 3 and 4 fix UaFs reported by KASAN. They fix issues potentially
visible since v5.7 and v5.19 but not reproducible until recently
(v6.0). These two patches depend on patch 2/7.

Patch 5 fixes the order of the printed values: expected vs seen values.
The regression was introduced recently: it is present in Linus' tree
but not yet in a tagged version.

Patch 6 adds missing ro_after_init flags. A previous patch added them
for other structures but these two were missed. This previous patch
has been backported to stable versions (up to v5.12) so it is probably
better to do the same here.

Patch 7 fixes tcp_set_state() being called twice in a row since v5.10.

Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
---
Geliang Tang (1):
      mptcp: add ro_after_init for tcp{,v6}_prot_override

Matthieu Baerts (2):
      selftests: mptcp: userspace pm: fix printed values
      mptcp: avoid setting TCP_CLOSE state twice

Paolo Abeni (4):
      mptcp: fix possible deadlock in subflow_error_report
      mptcp: refactor passive socket initialization
      mptcp: use the workqueue to destroy unaccepted sockets
      mptcp: fix UaF in listener shutdown

 net/mptcp/protocol.c                              |  44 +++-----
 net/mptcp/protocol.h                              |   4 +-
 net/mptcp/subflow.c                               | 122 +++++++---------------
 tools/testing/selftests/net/mptcp/userspace_pm.sh |   2 +-
 4 files changed, 59 insertions(+), 113 deletions(-)
---
base-commit: aaa3c08ee0653beaa649d4adfb27ad562641cfd8
change-id: 20230227-upstream-net-20230227-mptcp-fixes-cc78f3a2f5b2

Best regards,
-- 
Matthieu Baerts <matthieu.baerts@tessares.net>



* [PATCH net 1/7] mptcp: fix possible deadlock in subflow_error_report
  2023-02-27 17:29 [PATCH net 0/7] mptcp: fixes for 6.3 Matthieu Baerts
@ 2023-02-27 17:29 ` Matthieu Baerts
  2023-02-27 17:29 ` [PATCH net 2/7] mptcp: refactor passive socket initialization Matthieu Baerts
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Matthieu Baerts @ 2023-02-27 17:29 UTC
  To: mptcp, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Menglong Dong, Mengen Sun, Shuah Khan, Florian Westphal,
	Jiang Biao
  Cc: netdev, linux-kernel, linux-kselftest, Matthieu Baerts, stable,
	Christoph Paasch

From: Paolo Abeni <pabeni@redhat.com>

Christoph reported a possible deadlock while the TCP stack
destroys an unaccepted subflow due to an incoming reset: the
MPTCP socket error path tries to acquire the msk-level socket
lock while TCP still owns the listener socket accept queue
spinlock, and the reverse dependency already exists in the
TCP stack.

Note that the above is actually a lockdep false positive, as
the chain involves two separate sockets. A different per-socket
lockdep key would address the issue, but such a change would be
quite invasive.

Instead, we can simply stop the socket error handling earlier
for orphaned or unaccepted subflows, breaking the critical
lockdep chain. Error handling in such a scenario is a no-op.

Fixes: 15cc10453398 ("mptcp: deliver ssk errors to msk")
Cc: stable@vger.kernel.org
Reported-and-tested-by: Christoph Paasch <cpaasch@apple.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/355
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
---
 net/mptcp/subflow.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 4ae1a7304cf0..5070dc33675d 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -1432,6 +1432,13 @@ static void subflow_error_report(struct sock *ssk)
 {
 	struct sock *sk = mptcp_subflow_ctx(ssk)->conn;
 
+	/* bail early if this is a no-op, so that we avoid introducing a
+	 * problematic lockdep dependency between TCP accept queue lock
+	 * and msk socket spinlock
+	 */
+	if (!sk->sk_socket)
+		return;
+
 	mptcp_data_lock(sk);
 	if (!sock_owned_by_user(sk))
 		__mptcp_error_report(sk);

-- 
2.38.1



* [PATCH net 2/7] mptcp: refactor passive socket initialization
  2023-02-27 17:29 [PATCH net 0/7] mptcp: fixes for 6.3 Matthieu Baerts
  2023-02-27 17:29 ` [PATCH net 1/7] mptcp: fix possible deadlock in subflow_error_report Matthieu Baerts
@ 2023-02-27 17:29 ` Matthieu Baerts
  2023-02-27 17:29 ` [PATCH net 3/7] mptcp: use the workqueue to destroy unaccepted sockets Matthieu Baerts
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Matthieu Baerts @ 2023-02-27 17:29 UTC
  To: mptcp, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Menglong Dong, Mengen Sun, Shuah Khan, Florian Westphal,
	Jiang Biao
  Cc: netdev, linux-kernel, linux-kselftest, Matthieu Baerts, stable,
	Christoph Paasch

From: Paolo Abeni <pabeni@redhat.com>

After commit 30e51b923e43 ("mptcp: fix unreleased socket in accept queue")
unaccepted msk sockets go through complete shutdown, so we no longer need
to delay inserting the first subflow into the subflow lists.

The reference counting deserves some extra care, as __mptcp_close() is
unaware of the request socket linkage to the first subflow.

Please note that this is more of a refactoring than a fix, but this
modification is needed by other corrections (see the following
commits), so a Fixes tag has been added here to help the stable team.

Fixes: 30e51b923e43 ("mptcp: fix unreleased socket in accept queue")
Cc: stable@vger.kernel.org # v6.0+
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Tested-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
---
 net/mptcp/protocol.c | 17 -----------------
 net/mptcp/subflow.c  | 27 +++++++++++++++++++++------
 2 files changed, 21 insertions(+), 23 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 3ad9c46202fc..447641d34c2c 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -825,7 +825,6 @@ static bool __mptcp_finish_join(struct mptcp_sock *msk, struct sock *ssk)
 	if (sk->sk_socket && !ssk->sk_socket)
 		mptcp_sock_graft(ssk, sk->sk_socket);
 
-	mptcp_propagate_sndbuf((struct sock *)msk, ssk);
 	mptcp_sockopt_sync_locked(msk, ssk);
 	return true;
 }
@@ -3708,22 +3707,6 @@ static int mptcp_stream_accept(struct socket *sock, struct socket *newsock,
 
 		lock_sock(newsk);
 
-		/* PM/worker can now acquire the first subflow socket
-		 * lock without racing with listener queue cleanup,
-		 * we can notify it, if needed.
-		 *
-		 * Even if remote has reset the initial subflow by now
-		 * the refcnt is still at least one.
-		 */
-		subflow = mptcp_subflow_ctx(msk->first);
-		list_add(&subflow->node, &msk->conn_list);
-		sock_hold(msk->first);
-		if (mptcp_is_fully_established(newsk))
-			mptcp_pm_fully_established(msk, msk->first, GFP_KERNEL);
-
-		mptcp_rcv_space_init(msk, msk->first);
-		mptcp_propagate_sndbuf(newsk, msk->first);
-
 		/* set ssk->sk_socket of accept()ed flows to mptcp socket.
 		 * This is needed so NOSPACE flag can be set from tcp stack.
 		 */
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 5070dc33675d..a631a5e6fc7b 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -397,6 +397,12 @@ void mptcp_subflow_reset(struct sock *ssk)
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
 	struct sock *sk = subflow->conn;
 
+	/* mptcp_mp_fail_no_response() can reach here on an already closed
+	 * socket
+	 */
+	if (ssk->sk_state == TCP_CLOSE)
+		return;
+
 	/* must hold: tcp_done() could drop last reference on parent */
 	sock_hold(sk);
 
@@ -750,6 +756,7 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
 	struct mptcp_options_received mp_opt;
 	bool fallback, fallback_is_fatal;
 	struct sock *new_msk = NULL;
+	struct mptcp_sock *owner;
 	struct sock *child;
 
 	pr_debug("listener=%p, req=%p, conn=%p", listener, req, listener->conn);
@@ -824,6 +831,8 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
 		ctx->setsockopt_seq = listener->setsockopt_seq;
 
 		if (ctx->mp_capable) {
+			owner = mptcp_sk(new_msk);
+
 			/* this can't race with mptcp_close(), as the msk is
 			 * not yet exposted to user-space
 			 */
@@ -832,14 +841,14 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
 			/* record the newly created socket as the first msk
 			 * subflow, but don't link it yet into conn_list
 			 */
-			WRITE_ONCE(mptcp_sk(new_msk)->first, child);
+			WRITE_ONCE(owner->first, child);
 
 			/* new mpc subflow takes ownership of the newly
 			 * created mptcp socket
 			 */
 			mptcp_sk(new_msk)->setsockopt_seq = ctx->setsockopt_seq;
-			mptcp_pm_new_connection(mptcp_sk(new_msk), child, 1);
-			mptcp_token_accept(subflow_req, mptcp_sk(new_msk));
+			mptcp_pm_new_connection(owner, child, 1);
+			mptcp_token_accept(subflow_req, owner);
 			ctx->conn = new_msk;
 			new_msk = NULL;
 
@@ -847,15 +856,21 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
 			 * uses the correct data
 			 */
 			mptcp_copy_inaddrs(ctx->conn, child);
+			mptcp_propagate_sndbuf(ctx->conn, child);
+
+			mptcp_rcv_space_init(owner, child);
+			list_add(&ctx->node, &owner->conn_list);
+			sock_hold(child);
 
 			/* with OoO packets we can reach here without ingress
 			 * mpc option
 			 */
-			if (mp_opt.suboptions & OPTION_MPTCP_MPC_ACK)
+			if (mp_opt.suboptions & OPTION_MPTCP_MPC_ACK) {
 				mptcp_subflow_fully_established(ctx, &mp_opt);
+				mptcp_pm_fully_established(owner, child, GFP_ATOMIC);
+				ctx->pm_notified = 1;
+			}
 		} else if (ctx->mp_join) {
-			struct mptcp_sock *owner;
-
 			owner = subflow_req->msk;
 			if (!owner) {
 				subflow_add_reset_reason(skb, MPTCP_RST_EPROHIBIT);

-- 
2.38.1



* [PATCH net 3/7] mptcp: use the workqueue to destroy unaccepted sockets
  2023-02-27 17:29 [PATCH net 0/7] mptcp: fixes for 6.3 Matthieu Baerts
  2023-02-27 17:29 ` [PATCH net 1/7] mptcp: fix possible deadlock in subflow_error_report Matthieu Baerts
  2023-02-27 17:29 ` [PATCH net 2/7] mptcp: refactor passive socket initialization Matthieu Baerts
@ 2023-02-27 17:29 ` Matthieu Baerts
  2023-02-27 17:29 ` [PATCH net 4/7] mptcp: fix UaF in listener shutdown Matthieu Baerts
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Matthieu Baerts @ 2023-02-27 17:29 UTC
  To: mptcp, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Menglong Dong, Mengen Sun, Shuah Khan, Florian Westphal,
	Jiang Biao
  Cc: netdev, linux-kernel, linux-kselftest, Matthieu Baerts, stable,
	Christoph Paasch

From: Paolo Abeni <pabeni@redhat.com>

After the passive socket initialization part was refactored,
Christoph reported a UaF at token lookup time:

  BUG: KASAN: use-after-free in __token_bucket_busy+0x253/0x260
  Read of size 4 at addr ffff88810698d5b0 by task syz-executor653/3198

  CPU: 1 PID: 3198 Comm: syz-executor653 Not tainted 6.2.0-rc59af4eaa31c1f6c00c8f1e448ed99a45c66340dd5 #6
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  Call Trace:
   <TASK>
   dump_stack_lvl+0x6e/0x91
   print_report+0x16a/0x46f
   kasan_report+0xad/0x130
   __token_bucket_busy+0x253/0x260
   mptcp_token_new_connect+0x13d/0x490
   mptcp_connect+0x4ed/0x860
   __inet_stream_connect+0x80e/0xd90
   tcp_sendmsg_fastopen+0x3ce/0x710
   mptcp_sendmsg+0xff1/0x1a20
   inet_sendmsg+0x11d/0x140
   __sys_sendto+0x405/0x490
   __x64_sys_sendto+0xdc/0x1b0
   do_syscall_64+0x3b/0x90
   entry_SYSCALL_64_after_hwframe+0x72/0xdc

We need to properly clean-up all the paired MPTCP-level
resources and be sure to release the msk last, even when
the unaccepted subflow is destroyed by the TCP internals
via inet_child_forget().

We can re-use the existing MPTCP_WORK_CLOSE_SUBFLOW infra,
explicitly checking for the critical scenario: the closed
subflow is the MPC one, the msk is not accepted and is
eventually going through full cleanup.
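
The "set a flag, schedule the worker once, let the worker clear the flag"
pattern that this infra relies on can be sketched in userspace C. This is
only an illustration under stated assumptions: struct fake_msk, the
WORK_CLOSE_SUBFLOW constant and the helper names are hypothetical
stand-ins, and C11 atomics are used in place of the kernel's
test_and_set_bit() and test_and_clear_bit().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the MPTCP_WORK_CLOSE_SUBFLOW bit. */
#define WORK_CLOSE_SUBFLOW (1u << 0)

/* Hypothetical miniature of an msk: pending-work bits plus counters. */
struct fake_msk {
	atomic_uint flags;	/* pending-work bits */
	int scheduled;		/* how many times the worker was queued */
	int closed_runs;	/* how many times the close handler ran */
};

/* Set the bit; return true if it was already set. */
static bool test_and_set_flag(struct fake_msk *msk, unsigned int bit)
{
	return (atomic_fetch_or(&msk->flags, bit) & bit) != 0;
}

/* Clear the bit; return true if it was set before the call. */
static bool test_and_clear_flag(struct fake_msk *msk, unsigned int bit)
{
	return (atomic_fetch_and(&msk->flags, ~bit) & bit) != 0;
}

/* Producer side: request a close, queueing the worker only once. */
static void request_close(struct fake_msk *msk)
{
	if (!test_and_set_flag(msk, WORK_CLOSE_SUBFLOW))
		msk->scheduled++;
}

/* Consumer side: the worker handles the request only if one is pending. */
static void worker(struct fake_msk *msk)
{
	if (test_and_clear_flag(msk, WORK_CLOSE_SUBFLOW))
		msk->closed_runs++;
}
```

With a zero-initialized struct fake_msk, calling request_close() twice in
a row queues the worker a single time, and a second worker() run without
a new request does nothing.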

With such a change, __mptcp_destroy_sock() is always called
on msk sockets, even on accepted ones. We no longer need
to transiently drop one sk reference at msk clone time.
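
The reference accounting described above can be sketched with a toy
counter. This is a minimal userspace sketch, not kernel code: struct
fake_sock and all fake_* helpers are hypothetical, and the second put in
fake_force_close() stands in for the extra sock_put() that the diff adds
to mptcp_force_close(), assuming the regular release path drops the
other reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of a reference-counted socket. */
struct fake_sock {
	int refcnt;
	bool freed;
};

static void fake_sock_hold(struct fake_sock *sk)
{
	assert(!sk->freed);
	sk->refcnt++;
}

static void fake_sock_put(struct fake_sock *sk)
{
	assert(!sk->freed && sk->refcnt > 0);
	if (--sk->refcnt == 0)
		sk->freed = true;	/* real code would free the memory here */
}

/* Clone: one reference for the creator, one for the future owner. */
static void fake_clone(struct fake_sock *nsk)
{
	nsk->refcnt = 1;
	nsk->freed = false;
	fake_sock_hold(nsk);	/* the refcount is 2 from now on */
}

/* Closing a never-accepted socket must therefore drop both references. */
static void fake_force_close(struct fake_sock *nsk)
{
	fake_sock_put(nsk);	/* stands in for the regular release path */
	fake_sock_put(nsk);	/* the extra put needed once the count starts at 2 */
}
```

After fake_clone() the counter is 2, and fake_force_close() brings it to
0 and marks the object freed. Dropping only one reference would leak it.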

Please note this commit depends on the parent one:

  mptcp: refactor passive socket initialization

Fixes: 58b09919626b ("mptcp: create msk early")
Cc: stable@vger.kernel.org # v6.0+
Reported-and-tested-by: Christoph Paasch <cpaasch@apple.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/347
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
---
 net/mptcp/protocol.c | 26 +++++++++++++++++---------
 net/mptcp/protocol.h |  3 ++-
 net/mptcp/subflow.c  | 11 +++++++++--
 3 files changed, 28 insertions(+), 12 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 447641d34c2c..b7014f939236 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2398,9 +2398,10 @@ static unsigned int mptcp_sync_mss(struct sock *sk, u32 pmtu)
 	return 0;
 }
 
-static void __mptcp_close_subflow(struct mptcp_sock *msk)
+static void __mptcp_close_subflow(struct sock *sk)
 {
 	struct mptcp_subflow_context *subflow, *tmp;
+	struct mptcp_sock *msk = mptcp_sk(sk);
 
 	might_sleep();
 
@@ -2414,7 +2415,15 @@ static void __mptcp_close_subflow(struct mptcp_sock *msk)
 		if (!skb_queue_empty_lockless(&ssk->sk_receive_queue))
 			continue;
 
-		mptcp_close_ssk((struct sock *)msk, ssk, subflow);
+		mptcp_close_ssk(sk, ssk, subflow);
+	}
+
+	/* if the MPC subflow has been closed before the msk is accepted,
+	 * msk will never be accept-ed, close it now
+	 */
+	if (!msk->first && msk->in_accept_queue) {
+		sock_set_flag(sk, SOCK_DEAD);
+		inet_sk_state_store(sk, TCP_CLOSE);
 	}
 }
 
@@ -2623,6 +2632,9 @@ static void mptcp_worker(struct work_struct *work)
 	__mptcp_check_send_data_fin(sk);
 	mptcp_check_data_fin(sk);
 
+	if (test_and_clear_bit(MPTCP_WORK_CLOSE_SUBFLOW, &msk->flags))
+		__mptcp_close_subflow(sk);
+
 	/* There is no point in keeping around an orphaned sk timedout or
 	 * closed, but we need the msk around to reply to incoming DATA_FIN,
 	 * even if it is orphaned and in FIN_WAIT2 state
@@ -2638,9 +2650,6 @@ static void mptcp_worker(struct work_struct *work)
 		}
 	}
 
-	if (test_and_clear_bit(MPTCP_WORK_CLOSE_SUBFLOW, &msk->flags))
-		__mptcp_close_subflow(msk);
-
 	if (test_and_clear_bit(MPTCP_WORK_RTX, &msk->flags))
 		__mptcp_retrans(sk);
 
@@ -3078,6 +3087,7 @@ struct sock *mptcp_sk_clone(const struct sock *sk,
 	msk->local_key = subflow_req->local_key;
 	msk->token = subflow_req->token;
 	msk->subflow = NULL;
+	msk->in_accept_queue = 1;
 	WRITE_ONCE(msk->fully_established, false);
 	if (mp_opt->suboptions & OPTION_MPTCP_CSUMREQD)
 		WRITE_ONCE(msk->csum_enabled, true);
@@ -3095,8 +3105,7 @@ struct sock *mptcp_sk_clone(const struct sock *sk,
 	security_inet_csk_clone(nsk, req);
 	bh_unlock_sock(nsk);
 
-	/* keep a single reference */
-	__sock_put(nsk);
+	/* note: the newly allocated socket refcount is 2 now */
 	return nsk;
 }
 
@@ -3152,8 +3161,6 @@ static struct sock *mptcp_accept(struct sock *sk, int flags, int *err,
 			goto out;
 		}
 
-		/* acquire the 2nd reference for the owning socket */
-		sock_hold(new_mptcp_sock);
 		newsk = new_mptcp_sock;
 		MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_MPCAPABLEPASSIVEACK);
 	} else {
@@ -3704,6 +3711,7 @@ static int mptcp_stream_accept(struct socket *sock, struct socket *newsock,
 		struct sock *newsk = newsock->sk;
 
 		set_bit(SOCK_CUSTOM_SOCKOPT, &newsock->flags);
+		msk->in_accept_queue = 0;
 
 		lock_sock(newsk);
 
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 61fd8eabfca2..901c9da8fe66 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -295,7 +295,8 @@ struct mptcp_sock {
 	u8		recvmsg_inq:1,
 			cork:1,
 			nodelay:1,
-			fastopening:1;
+			fastopening:1,
+			in_accept_queue:1;
 	int		connect_flags;
 	struct work_struct work;
 	struct sk_buff  *ooo_last_skb;
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index a631a5e6fc7b..9d5bf2a020ef 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -699,9 +699,10 @@ static bool subflow_hmac_valid(const struct request_sock *req,
 
 static void mptcp_force_close(struct sock *sk)
 {
-	/* the msk is not yet exposed to user-space */
+	/* the msk is not yet exposed to user-space, and refcount is 2 */
 	inet_sk_state_store(sk, TCP_CLOSE);
 	sk_common_release(sk);
+	sock_put(sk);
 }
 
 static void subflow_ulp_fallback(struct sock *sk,
@@ -1866,7 +1867,6 @@ void mptcp_subflow_queue_clean(struct sock *listener_sk, struct sock *listener_s
 		struct sock *sk = (struct sock *)msk;
 		bool do_cancel_work;
 
-		sock_hold(sk);
 		lock_sock_nested(sk, SINGLE_DEPTH_NESTING);
 		next = msk->dl_next;
 		msk->first = NULL;
@@ -1954,6 +1954,13 @@ static void subflow_ulp_release(struct sock *ssk)
 		 * when the subflow is still unaccepted
 		 */
 		release = ctx->disposable || list_empty(&ctx->node);
+
+		/* inet_child_forget() does not call sk_state_change(),
+		 * explicitly trigger the socket close machinery
+		 */
+		if (!release && !test_and_set_bit(MPTCP_WORK_CLOSE_SUBFLOW,
+						  &mptcp_sk(sk)->flags))
+			mptcp_schedule_work(sk);
 		sock_put(sk);
 	}
 

-- 
2.38.1



* [PATCH net 4/7] mptcp: fix UaF in listener shutdown
  2023-02-27 17:29 [PATCH net 0/7] mptcp: fixes for 6.3 Matthieu Baerts
                   ` (2 preceding siblings ...)
  2023-02-27 17:29 ` [PATCH net 3/7] mptcp: use the workqueue to destroy unaccepted sockets Matthieu Baerts
@ 2023-02-27 17:29 ` Matthieu Baerts
  2023-02-27 17:29 ` [PATCH net 5/7] selftests: mptcp: userspace pm: fix printed values Matthieu Baerts
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Matthieu Baerts @ 2023-02-27 17:29 UTC
  To: mptcp, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Menglong Dong, Mengen Sun, Shuah Khan, Florian Westphal,
	Jiang Biao
  Cc: netdev, linux-kernel, linux-kselftest, Matthieu Baerts, stable,
	Christoph Paasch

From: Paolo Abeni <pabeni@redhat.com>

As reported by Christoph after the passive socket initialization
was refactored, the mptcp listener shutdown path is prone to a
UaF issue.

  BUG: KASAN: use-after-free in _raw_spin_lock_bh+0x73/0xe0
  Write of size 4 at addr ffff88810cb23098 by task syz-executor731/1266

  CPU: 1 PID: 1266 Comm: syz-executor731 Not tainted 6.2.0-rc59af4eaa31c1f6c00c8f1e448ed99a45c66340dd5 #6
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  Call Trace:
   <TASK>
   dump_stack_lvl+0x6e/0x91
   print_report+0x16a/0x46f
   kasan_report+0xad/0x130
   kasan_check_range+0x14a/0x1a0
   _raw_spin_lock_bh+0x73/0xe0
   subflow_error_report+0x6d/0x110
   sk_error_report+0x3b/0x190
   tcp_disconnect+0x138c/0x1aa0
   inet_child_forget+0x6f/0x2e0
   inet_csk_listen_stop+0x209/0x1060
   __mptcp_close_ssk+0x52d/0x610
   mptcp_destroy_common+0x165/0x640
   mptcp_destroy+0x13/0x80
   __mptcp_destroy_sock+0xe7/0x270
   __mptcp_close+0x70e/0x9b0
   mptcp_close+0x2b/0x150
   inet_release+0xe9/0x1f0
   __sock_release+0xd2/0x280
   sock_close+0x15/0x20
   __fput+0x252/0xa20
   task_work_run+0x169/0x250
   exit_to_user_mode_prepare+0x113/0x120
   syscall_exit_to_user_mode+0x1d/0x40
   do_syscall_64+0x48/0x90
   entry_SYSCALL_64_after_hwframe+0x72/0xdc

The msk grace period can legitimately expire between the last
reference count drop in mptcp_subflow_queue_clean() and
the later access in inet_csk_listen_stop().

After the previous patch we no longer need to special-case the
msk listener socket cleanup: the mptcp worker will process each
of the unaccepted msk sockets.

Just drop the now unnecessary code.

Please note this commit depends on the two parent ones:

  mptcp: refactor passive socket initialization
  mptcp: use the workqueue to destroy unaccepted sockets

Fixes: 6aeed9045071 ("mptcp: fix race on unaccepted mptcp sockets")
Cc: stable@vger.kernel.org # v6.0+
Reported-and-tested-by: Christoph Paasch <cpaasch@apple.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/346
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
---
 net/mptcp/protocol.c |  1 -
 net/mptcp/protocol.h |  1 -
 net/mptcp/subflow.c  | 72 ----------------------------------------------------
 3 files changed, 74 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index b7014f939236..420d6616da7d 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2355,7 +2355,6 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk,
 		/* otherwise tcp will dispose of the ssk and subflow ctx */
 		if (ssk->sk_state == TCP_LISTEN) {
 			tcp_set_state(ssk, TCP_CLOSE);
-			mptcp_subflow_queue_clean(sk, ssk);
 			inet_csk_listen_stop(ssk);
 			mptcp_event_pm_listener(ssk, MPTCP_EVENT_LISTENER_CLOSED);
 		}
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 901c9da8fe66..bda5ad723d38 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -629,7 +629,6 @@ void mptcp_close_ssk(struct sock *sk, struct sock *ssk,
 		     struct mptcp_subflow_context *subflow);
 void __mptcp_subflow_send_ack(struct sock *ssk);
 void mptcp_subflow_reset(struct sock *ssk);
-void mptcp_subflow_queue_clean(struct sock *sk, struct sock *ssk);
 void mptcp_sock_graft(struct sock *sk, struct socket *parent);
 struct socket *__mptcp_nmpc_socket(const struct mptcp_sock *msk);
 bool __mptcp_close(struct sock *sk, long timeout);
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 9d5bf2a020ef..5a3b17811b6b 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -1826,78 +1826,6 @@ static void subflow_state_change(struct sock *sk)
 	}
 }
 
-void mptcp_subflow_queue_clean(struct sock *listener_sk, struct sock *listener_ssk)
-{
-	struct request_sock_queue *queue = &inet_csk(listener_ssk)->icsk_accept_queue;
-	struct mptcp_sock *msk, *next, *head = NULL;
-	struct request_sock *req;
-
-	/* build a list of all unaccepted mptcp sockets */
-	spin_lock_bh(&queue->rskq_lock);
-	for (req = queue->rskq_accept_head; req; req = req->dl_next) {
-		struct mptcp_subflow_context *subflow;
-		struct sock *ssk = req->sk;
-		struct mptcp_sock *msk;
-
-		if (!sk_is_mptcp(ssk))
-			continue;
-
-		subflow = mptcp_subflow_ctx(ssk);
-		if (!subflow || !subflow->conn)
-			continue;
-
-		/* skip if already in list */
-		msk = mptcp_sk(subflow->conn);
-		if (msk->dl_next || msk == head)
-			continue;
-
-		msk->dl_next = head;
-		head = msk;
-	}
-	spin_unlock_bh(&queue->rskq_lock);
-	if (!head)
-		return;
-
-	/* can't acquire the msk socket lock under the subflow one,
-	 * or will cause ABBA deadlock
-	 */
-	release_sock(listener_ssk);
-
-	for (msk = head; msk; msk = next) {
-		struct sock *sk = (struct sock *)msk;
-		bool do_cancel_work;
-
-		lock_sock_nested(sk, SINGLE_DEPTH_NESTING);
-		next = msk->dl_next;
-		msk->first = NULL;
-		msk->dl_next = NULL;
-
-		do_cancel_work = __mptcp_close(sk, 0);
-		release_sock(sk);
-		if (do_cancel_work) {
-			/* lockdep will report a false positive ABBA deadlock
-			 * between cancel_work_sync and the listener socket.
-			 * The involved locks belong to different sockets WRT
-			 * the existing AB chain.
-			 * Using a per socket key is problematic as key
-			 * deregistration requires process context and must be
-			 * performed at socket disposal time, in atomic
-			 * context.
-			 * Just tell lockdep to consider the listener socket
-			 * released here.
-			 */
-			mutex_release(&listener_sk->sk_lock.dep_map, _RET_IP_);
-			mptcp_cancel_work(sk);
-			mutex_acquire(&listener_sk->sk_lock.dep_map,
-				      SINGLE_DEPTH_NESTING, 0, _RET_IP_);
-		}
-		sock_put(sk);
-	}
-
-	/* we are still under the listener msk socket lock */
-	lock_sock_nested(listener_ssk, SINGLE_DEPTH_NESTING);
-}
-
 static int subflow_ulp_init(struct sock *sk)
 {
 	struct inet_connection_sock *icsk = inet_csk(sk);

-- 
2.38.1



* [PATCH net 5/7] selftests: mptcp: userspace pm: fix printed values
  2023-02-27 17:29 [PATCH net 0/7] mptcp: fixes for 6.3 Matthieu Baerts
                   ` (3 preceding siblings ...)
  2023-02-27 17:29 ` [PATCH net 4/7] mptcp: fix UaF in listener shutdown Matthieu Baerts
@ 2023-02-27 17:29 ` Matthieu Baerts
  2023-02-27 17:29 ` [PATCH net 6/7] mptcp: add ro_after_init for tcp{,v6}_prot_override Matthieu Baerts
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Matthieu Baerts @ 2023-02-27 17:29 UTC
  To: mptcp, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Menglong Dong, Mengen Sun, Shuah Khan, Florian Westphal,
	Jiang Biao
  Cc: netdev, linux-kernel, linux-kselftest, Matthieu Baerts, stable,
	Geliang Tang

In case of errors, the printed message had the expected and the seen
value inverted.

This patch simply corrects the order: first the expected value, then the
one that has been seen.

Fixes: 10d4273411be ("selftests: mptcp: userspace: print error details if any")
Cc: stable@vger.kernel.org
Acked-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
---
 tools/testing/selftests/net/mptcp/userspace_pm.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/net/mptcp/userspace_pm.sh b/tools/testing/selftests/net/mptcp/userspace_pm.sh
index 66c5be25c13d..48e52f995a98 100755
--- a/tools/testing/selftests/net/mptcp/userspace_pm.sh
+++ b/tools/testing/selftests/net/mptcp/userspace_pm.sh
@@ -240,7 +240,7 @@ check_expected_one()
 	fi
 
 	stdbuf -o0 -e0 printf "\tExpected value for '%s': '%s', got '%s'.\n" \
-		"${var}" "${!var}" "${!exp}"
+		"${var}" "${!exp}" "${!var}"
 	return 1
 }
 

-- 
2.38.1



* [PATCH net 6/7] mptcp: add ro_after_init for tcp{,v6}_prot_override
  2023-02-27 17:29 [PATCH net 0/7] mptcp: fixes for 6.3 Matthieu Baerts
                   ` (4 preceding siblings ...)
  2023-02-27 17:29 ` [PATCH net 5/7] selftests: mptcp: userspace pm: fix printed values Matthieu Baerts
@ 2023-02-27 17:29 ` Matthieu Baerts
  2023-02-27 17:29 ` [PATCH net 7/7] mptcp: avoid setting TCP_CLOSE state twice Matthieu Baerts
  2023-02-28 11:28 ` [PATCH net 0/7] mptcp: fixes for 6.3 Matthieu Baerts
  7 siblings, 0 replies; 10+ messages in thread
From: Matthieu Baerts @ 2023-02-27 17:29 UTC
  To: mptcp, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Menglong Dong, Mengen Sun, Shuah Khan, Florian Westphal,
	Jiang Biao
  Cc: netdev, linux-kernel, linux-kselftest, Matthieu Baerts,
	Geliang Tang, stable

From: Geliang Tang <geliang.tang@suse.com>

Add __ro_after_init labels for the variables tcp_prot_override and
tcpv6_prot_override, just like other variables adjacent to them, to
indicate that they are initialised from the init hooks and no writes
occur afterwards.

Fixes: b19bc2945b40 ("mptcp: implement delegated actions")
Cc: stable@vger.kernel.org
Fixes: 51fa7f8ebf0e ("mptcp: mark ops structures as ro_after_init")
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
---
 net/mptcp/subflow.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 5a3b17811b6b..f6b4511b09b0 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -628,7 +628,7 @@ static struct request_sock_ops mptcp_subflow_v6_request_sock_ops __ro_after_init
 static struct tcp_request_sock_ops subflow_request_sock_ipv6_ops __ro_after_init;
 static struct inet_connection_sock_af_ops subflow_v6_specific __ro_after_init;
 static struct inet_connection_sock_af_ops subflow_v6m_specific __ro_after_init;
-static struct proto tcpv6_prot_override;
+static struct proto tcpv6_prot_override __ro_after_init;
 
 static int subflow_v6_conn_request(struct sock *sk, struct sk_buff *skb)
 {
@@ -926,7 +926,7 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
 }
 
 static struct inet_connection_sock_af_ops subflow_specific __ro_after_init;
-static struct proto tcp_prot_override;
+static struct proto tcp_prot_override __ro_after_init;
 
 enum mapping_status {
 	MAPPING_OK,

-- 
2.38.1



* [PATCH net 7/7] mptcp: avoid setting TCP_CLOSE state twice
  2023-02-27 17:29 [PATCH net 0/7] mptcp: fixes for 6.3 Matthieu Baerts
                   ` (5 preceding siblings ...)
  2023-02-27 17:29 ` [PATCH net 6/7] mptcp: add ro_after_init for tcp{,v6}_prot_override Matthieu Baerts
@ 2023-02-27 17:29 ` Matthieu Baerts
  2023-02-28 11:28 ` [PATCH net 0/7] mptcp: fixes for 6.3 Matthieu Baerts
  7 siblings, 0 replies; 10+ messages in thread
From: Matthieu Baerts @ 2023-02-27 17:29 UTC
  To: mptcp, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Menglong Dong, Mengen Sun, Shuah Khan, Florian Westphal,
	Jiang Biao
  Cc: netdev, linux-kernel, linux-kselftest, Matthieu Baerts, stable

tcp_set_state() is called from tcp_done() already.

There is then no need to first set the state to TCP_CLOSE, then call
tcp_done().
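
The redundancy can be illustrated with a toy model that counts
state-setting calls. It is only a sketch: struct fake_ssk and the
fake_* and reset_* helpers are hypothetical, and fake_done() merely
assumes, as stated above, that the "done" helper already moves the
socket to the closed state.

```c
#include <assert.h>

enum fake_state { FAKE_ESTABLISHED, FAKE_CLOSE };

/* Hypothetical miniature of a subflow socket. */
struct fake_ssk {
	enum fake_state state;
	int set_state_calls;	/* counts every state-setting call */
};

static void fake_set_state(struct fake_ssk *ssk, enum fake_state state)
{
	ssk->set_state_calls++;
	ssk->state = state;
}

/* Stand-in for the "done" helper: it sets the closed state itself. */
static void fake_done(struct fake_ssk *ssk)
{
	fake_set_state(ssk, FAKE_CLOSE);
}

/* Old flow: explicit state change followed by fake_done(). */
static void reset_old(struct fake_ssk *ssk)
{
	fake_set_state(ssk, FAKE_CLOSE);
	fake_done(ssk);
}

/* New flow: rely on fake_done() alone. */
static void reset_new(struct fake_ssk *ssk)
{
	fake_done(ssk);
}
```

Both flows leave the socket in FAKE_CLOSE, but reset_old() sets the
state twice while reset_new() sets it once.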

Fixes: d582484726c4 ("mptcp: fix fallback for MP_JOIN subflows")
Cc: stable@vger.kernel.org
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/362
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
---
 net/mptcp/subflow.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index f6b4511b09b0..b865ba911bc4 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -406,7 +406,6 @@ void mptcp_subflow_reset(struct sock *ssk)
 	/* must hold: tcp_done() could drop last reference on parent */
 	sock_hold(sk);
 
-	tcp_set_state(ssk, TCP_CLOSE);
 	tcp_send_active_reset(ssk, GFP_ATOMIC);
 	tcp_done(ssk);
 	if (!test_and_set_bit(MPTCP_WORK_CLOSE_SUBFLOW, &mptcp_sk(sk)->flags) &&

-- 
2.38.1



* Re: [PATCH net 0/7] mptcp: fixes for 6.3
  2023-02-27 17:29 [PATCH net 0/7] mptcp: fixes for 6.3 Matthieu Baerts
                   ` (6 preceding siblings ...)
  2023-02-27 17:29 ` [PATCH net 7/7] mptcp: avoid setting TCP_CLOSE state twice Matthieu Baerts
@ 2023-02-28 11:28 ` Matthieu Baerts
  2023-02-28 11:33   ` Paolo Abeni
  7 siblings, 1 reply; 10+ messages in thread
From: Matthieu Baerts @ 2023-02-28 11:28 UTC (permalink / raw)
  To: mptcp, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Menglong Dong, Mengen Sun, Shuah Khan, Florian Westphal,
	Jiang Biao
  Cc: netdev, linux-kernel, linux-kselftest, stable, Christoph Paasch,
	Geliang Tang

Hello,

On 27/02/2023 18:29, Matthieu Baerts wrote:
> Patch 1 fixes a possible deadlock in subflow_error_report() reported by
> lockdep. The report was in fact a false positive but the modification
> makes sense and silences lockdep to allow syzkaller to find real issues.
> The regression has been introduced in v5.12.
> 
> Patch 2 is a refactoring needed to be able to fix the two next issues.
> It improves the situation and can be backported up to v6.0.
> 
> Patches 3 and 4 fix UaF reported by KASAN. It fixes issues potentially
> visible since v5.7 and v5.19 but only reproducible until recently
> (v6.0). These two patches depend on patch 2/7.
> 
> Patch 5 fixes the order of the printed values: expected vs seen values.
> The regression has been introduced recently: present in Linus' tree but
> not in a tagged version yet.
> 
> Patch 6 adds missing ro_after_init flags. A previous patch added them
> for other functions but these two have been missed. This previous patch
> has been backported to stable versions (up to v5.12) so probably better
> to do the same here.
> 
> Patch 7 fixes tcp_set_state() being called twice in a row since v5.10.

I'm sorry to ask for that but is it possible not to apply these patches?

> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
> ---
> Geliang Tang (1):
>       mptcp: add ro_after_init for tcp{,v6}_prot_override
> 
> Matthieu Baerts (2):
>       selftests: mptcp: userspace pm: fix printed values
>       mptcp: avoid setting TCP_CLOSE state twice
> 
> Paolo Abeni (4):
>       mptcp: fix possible deadlock in subflow_error_report
>       mptcp: refactor passive socket initialization
>       mptcp: use the workqueue to destroy unaccepted sockets

After 3 weeks of validation, syzkaller found an issue with this patch:

  https://github.com/multipath-tcp/mptcp_net-next/issues/366

We therefore need to NAK this series. We will send a v2 with a fix for that.

>       mptcp: fix UaF in listener shutdown

The other patches of the series are either not very important or are
linked to the "faulty" one: they can all wait as well.

Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net


* Re: [PATCH net 0/7] mptcp: fixes for 6.3
  2023-02-28 11:28 ` [PATCH net 0/7] mptcp: fixes for 6.3 Matthieu Baerts
@ 2023-02-28 11:33   ` Paolo Abeni
  0 siblings, 0 replies; 10+ messages in thread
From: Paolo Abeni @ 2023-02-28 11:33 UTC (permalink / raw)
  To: Matthieu Baerts, mptcp, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Menglong Dong, Mengen Sun, Shuah Khan,
	Florian Westphal, Jiang Biao
  Cc: netdev, linux-kernel, linux-kselftest, stable, Christoph Paasch,
	Geliang Tang

On Tue, 2023-02-28 at 12:28 +0100, Matthieu Baerts wrote:
> Hello,
> 
> On 27/02/2023 18:29, Matthieu Baerts wrote:
> > Patch 1 fixes a possible deadlock in subflow_error_report() reported by
> > lockdep. The report was in fact a false positive but the modification
> > makes sense and silences lockdep to allow syzkaller to find real issues.
> > The regression has been introduced in v5.12.
> > 
> > Patch 2 is a refactoring needed to be able to fix the two next issues.
> > It improves the situation and can be backported up to v6.0.
> > 
> > Patches 3 and 4 fix UaF reported by KASAN. It fixes issues potentially
> > visible since v5.7 and v5.19 but only reproducible until recently
> > (v6.0). These two patches depend on patch 2/7.
> > 
> > Patch 5 fixes the order of the printed values: expected vs seen values.
> > The regression has been introduced recently: present in Linus' tree but
> > not in a tagged version yet.
> > 
> > Patch 6 adds missing ro_after_init flags. A previous patch added them
> > for other functions but these two have been missed. This previous patch
> > has been backported to stable versions (up to v5.12) so probably better
> > to do the same here.
> > 
> > Patch 7 fixes tcp_set_state() being called twice in a row since v5.10.
> 
> I'm sorry to ask for that but is it possible not to apply these patches?

Done, thanks!

Paolo



Thread overview: 10+ messages
2023-02-27 17:29 [PATCH net 0/7] mptcp: fixes for 6.3 Matthieu Baerts
2023-02-27 17:29 ` [PATCH net 1/7] mptcp: fix possible deadlock in subflow_error_report Matthieu Baerts
2023-02-27 17:29 ` [PATCH net 2/7] mptcp: refactor passive socket initialization Matthieu Baerts
2023-02-27 17:29 ` [PATCH net 3/7] mptcp: use the workqueue to destroy unaccepted sockets Matthieu Baerts
2023-02-27 17:29 ` [PATCH net 4/7] mptcp: fix UaF in listener shutdown Matthieu Baerts
2023-02-27 17:29 ` [PATCH net 5/7] selftests: mptcp: userspace pm: fix printed values Matthieu Baerts
2023-02-27 17:29 ` [PATCH net 6/7] mptcp: add ro_after_init for tcp{,v6}_prot_override Matthieu Baerts
2023-02-27 17:29 ` [PATCH net 7/7] mptcp: avoid setting TCP_CLOSE state twice Matthieu Baerts
2023-02-28 11:28 ` [PATCH net 0/7] mptcp: fixes for 6.3 Matthieu Baerts
2023-02-28 11:33   ` Paolo Abeni
