netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net 00/10] mptcp: more misc. fixes for v6.8
@ 2024-02-23 16:14 Matthieu Baerts (NGI0)
  2024-02-23 16:14 ` [PATCH net 01/10] mptcp: map v4 address to v6 when destroying subflow Matthieu Baerts (NGI0)
                   ` (10 more replies)
  0 siblings, 11 replies; 18+ messages in thread
From: Matthieu Baerts (NGI0) @ 2024-02-23 16:14 UTC (permalink / raw)
  To: mptcp, Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Florian Westphal, Kishen Maloor,
	Shuah Khan, Peter Krystad, Christoph Paasch
  Cc: netdev, linux-kernel, linux-kselftest, Matthieu Baerts (NGI0),
	Geliang Tang, stable, Davide Caratti

This series includes 6 types of fixes:

- Patch 1 fixes v4 mapped in v6 addresses support for the userspace PM,
  when asking to delete a subflow. It was done everywhere else, but not
  there. Patch 2 validates the modification, thanks to a subtest in
  mptcp_join.sh. These patches can be backported up to v5.19.

- Patch 3 is a small fix for a recent bug-fix patch, just to avoid
  printing an irrelevant warning (pr_warn()) once. It can be backported
  up to v5.6, alongside the bug-fix that has been introduced in the
  v6.8-rc5.

- Patches 4 to 6 are fixes for bugs found by Paolo while working on
  TCP_NOTSENT_LOWAT support for MPTCP. These fixes can improve the
  performances in some cases. Patches can be backported up to v5.6,
  v5.11 and v6.7 respectively.

- Patch 7 makes sure 'ss -M' is available when starting MPTCP Join
  selftest as it is required for some subtests since v5.18.

- Patch 8 fixes a possible double-free on socket dismantle. The issue
  always existed, but was unnoticed because it was not causing any
  problem so far. This fix can be backported up to v5.6.

- Patch 9 is a fix for a very recent patch causing lockdep warnings in
  subflow diag. The patch causing the regression -- which fixes another
  issue present since v5.7 -- should be part of the future v6.8-rc6.
  Patch 10 validates the modification, thanks to a new subtest in
  diag.sh.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
Davide Caratti (1):
      mptcp: fix double-free on socket dismantle

Geliang Tang (3):
      mptcp: map v4 address to v6 when destroying subflow
      selftests: mptcp: rm subflow with v4/v4mapped addr
      selftests: mptcp: join: add ss mptcp support check

Matthieu Baerts (NGI0) (1):
      mptcp: avoid printing warning once on client side

Paolo Abeni (5):
      mptcp: push at DSS boundaries
      mptcp: fix snd_wnd initialization for passive socket
      mptcp: fix potential wake-up event loss
      mptcp: fix possible deadlock in subflow diag
      selftests: mptcp: explicitly trigger the listener diag code-path

 net/mptcp/diag.c                                |  3 ++
 net/mptcp/options.c                             |  2 +-
 net/mptcp/pm_userspace.c                        | 10 +++++
 net/mptcp/protocol.c                            | 52 ++++++++++++++++++++++++-
 net/mptcp/protocol.h                            | 21 +++++-----
 tools/testing/selftests/net/mptcp/diag.sh       | 30 +++++++++++++-
 tools/testing/selftests/net/mptcp/mptcp_join.sh | 33 ++++++++++------
 tools/testing/selftests/net/mptcp/mptcp_lib.sh  |  4 +-
 8 files changed, 128 insertions(+), 27 deletions(-)
---
base-commit: b0b1210bc150fbd741b4b9fce8a24541306b40fc
change-id: 20240223-upstream-net-20240223-misc-fixes-1630cd6b3b0a

Best regards,
-- 
Matthieu Baerts (NGI0) <matttbe@kernel.org>


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH net 01/10] mptcp: map v4 address to v6 when destroying subflow
  2024-02-23 16:14 [PATCH net 00/10] mptcp: more misc. fixes for v6.8 Matthieu Baerts (NGI0)
@ 2024-02-23 16:14 ` Matthieu Baerts (NGI0)
  2024-02-23 16:14 ` [PATCH net 02/10] selftests: mptcp: rm subflow with v4/v4mapped addr Matthieu Baerts (NGI0)
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Matthieu Baerts (NGI0) @ 2024-02-23 16:14 UTC (permalink / raw)
  To: mptcp, Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Florian Westphal, Kishen Maloor,
	Shuah Khan, Peter Krystad, Christoph Paasch
  Cc: netdev, linux-kernel, linux-kselftest, Matthieu Baerts (NGI0),
	Geliang Tang, stable

From: Geliang Tang <tanggeliang@kylinos.cn>

Address family of server side mismatches with that of client side, like
in "userspace pm add & remove address" test:

    userspace_pm_add_addr $ns1 10.0.2.1 10
    userspace_pm_rm_sf $ns1 "::ffff:10.0.2.1" $SUB_ESTABLISHED

That's because on the server side, the family is set to AF_INET6 and the
v4 address is mapped in a v6 one.

This patch fixes this issue. In mptcp_pm_nl_subflow_destroy_doit(), before
checking local address family with remote address family, map an IPv4
address to an IPv6 address if the pair is a v4-mapped address.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/387
Fixes: 702c2f646d42 ("mptcp: netlink: allow userspace-driven subflow establishment")
Cc: stable@vger.kernel.org
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 net/mptcp/pm_userspace.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/net/mptcp/pm_userspace.c b/net/mptcp/pm_userspace.c
index d396a5973429..bc97cc30f013 100644
--- a/net/mptcp/pm_userspace.c
+++ b/net/mptcp/pm_userspace.c
@@ -495,6 +495,16 @@ int mptcp_pm_nl_subflow_destroy_doit(struct sk_buff *skb, struct genl_info *info
 		goto destroy_err;
 	}
 
+#if IS_ENABLED(CONFIG_MPTCP_IPV6)
+	if (addr_l.family == AF_INET && ipv6_addr_v4mapped(&addr_r.addr6)) {
+		ipv6_addr_set_v4mapped(addr_l.addr.s_addr, &addr_l.addr6);
+		addr_l.family = AF_INET6;
+	}
+	if (addr_r.family == AF_INET && ipv6_addr_v4mapped(&addr_l.addr6)) {
+		ipv6_addr_set_v4mapped(addr_r.addr.s_addr, &addr_r.addr6);
+		addr_r.family = AF_INET6;
+	}
+#endif
 	if (addr_l.family != addr_r.family) {
 		GENL_SET_ERR_MSG(info, "address families do not match");
 		err = -EINVAL;

-- 
2.43.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net 02/10] selftests: mptcp: rm subflow with v4/v4mapped addr
  2024-02-23 16:14 [PATCH net 00/10] mptcp: more misc. fixes for v6.8 Matthieu Baerts (NGI0)
  2024-02-23 16:14 ` [PATCH net 01/10] mptcp: map v4 address to v6 when destroying subflow Matthieu Baerts (NGI0)
@ 2024-02-23 16:14 ` Matthieu Baerts (NGI0)
  2024-02-23 16:14 ` [PATCH net 03/10] mptcp: avoid printing warning once on client side Matthieu Baerts (NGI0)
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Matthieu Baerts (NGI0) @ 2024-02-23 16:14 UTC (permalink / raw)
  To: mptcp, Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Florian Westphal, Kishen Maloor,
	Shuah Khan, Peter Krystad, Christoph Paasch
  Cc: netdev, linux-kernel, linux-kselftest, Matthieu Baerts (NGI0),
	Geliang Tang, stable

From: Geliang Tang <tanggeliang@kylinos.cn>

Now both a v4 address and a v4-mapped address are supported when
destroying a userspace pm subflow, this patch adds a second subflow
to "userspace pm add & remove address" test, and two subflows could
be removed two different ways, one with the v4mapped and one with v4.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/387
Fixes: 48d73f609dcc ("selftests: mptcp: update userspace pm addr tests")
Cc: stable@vger.kernel.org
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 tools/testing/selftests/net/mptcp/mptcp_join.sh | 28 ++++++++++++++-----------
 tools/testing/selftests/net/mptcp/mptcp_lib.sh  |  4 ++--
 2 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
index c07386e21e0a..e68b1bc2c2e4 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -3333,16 +3333,17 @@ userspace_pm_rm_sf()
 {
 	local evts=$evts_ns1
 	local t=${3:-1}
-	local ip=4
+	local ip
 	local tk da dp sp
 	local cnt
 
 	[ "$1" == "$ns2" ] && evts=$evts_ns2
-	if mptcp_lib_is_v6 $2; then ip=6; fi
+	[ -n "$(mptcp_lib_evts_get_info "saddr4" "$evts" $t)" ] && ip=4
+	[ -n "$(mptcp_lib_evts_get_info "saddr6" "$evts" $t)" ] && ip=6
 	tk=$(mptcp_lib_evts_get_info token "$evts")
-	da=$(mptcp_lib_evts_get_info "daddr$ip" "$evts" $t)
-	dp=$(mptcp_lib_evts_get_info dport "$evts" $t)
-	sp=$(mptcp_lib_evts_get_info sport "$evts" $t)
+	da=$(mptcp_lib_evts_get_info "daddr$ip" "$evts" $t $2)
+	dp=$(mptcp_lib_evts_get_info dport "$evts" $t $2)
+	sp=$(mptcp_lib_evts_get_info sport "$evts" $t $2)
 
 	cnt=$(rm_sf_count ${1})
 	ip netns exec $1 ./pm_nl_ctl dsf lip $2 lport $sp \
@@ -3429,20 +3430,23 @@ userspace_tests()
 	if reset_with_events "userspace pm add & remove address" &&
 	   continue_if mptcp_lib_has_file '/proc/sys/net/mptcp/pm_type'; then
 		set_userspace_pm $ns1
-		pm_nl_set_limits $ns2 1 1
+		pm_nl_set_limits $ns2 2 2
 		speed=5 \
 			run_tests $ns1 $ns2 10.0.1.1 &
 		local tests_pid=$!
 		wait_mpj $ns1
 		userspace_pm_add_addr $ns1 10.0.2.1 10
-		chk_join_nr 1 1 1
-		chk_add_nr 1 1
-		chk_mptcp_info subflows 1 subflows 1
-		chk_subflows_total 2 2
-		chk_mptcp_info add_addr_signal 1 add_addr_accepted 1
+		userspace_pm_add_addr $ns1 10.0.3.1 20
+		chk_join_nr 2 2 2
+		chk_add_nr 2 2
+		chk_mptcp_info subflows 2 subflows 2
+		chk_subflows_total 3 3
+		chk_mptcp_info add_addr_signal 2 add_addr_accepted 2
 		userspace_pm_rm_addr $ns1 10
 		userspace_pm_rm_sf $ns1 "::ffff:10.0.2.1" $SUB_ESTABLISHED
-		chk_rm_nr 1 1 invert
+		userspace_pm_rm_addr $ns1 20
+		userspace_pm_rm_sf $ns1 10.0.3.1 $SUB_ESTABLISHED
+		chk_rm_nr 2 2 invert
 		chk_mptcp_info subflows 0 subflows 0
 		chk_subflows_total 1 1
 		kill_events_pids
diff --git a/tools/testing/selftests/net/mptcp/mptcp_lib.sh b/tools/testing/selftests/net/mptcp/mptcp_lib.sh
index 3a2abae5993e..3777d66fc56d 100644
--- a/tools/testing/selftests/net/mptcp/mptcp_lib.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_lib.sh
@@ -213,9 +213,9 @@ mptcp_lib_get_info_value() {
 	grep "${2}" | sed -n 's/.*\('"${1}"':\)\([0-9a-f:.]*\).*$/\2/p;q'
 }
 
-# $1: info name ; $2: evts_ns ; $3: event type
+# $1: info name ; $2: evts_ns ; [$3: event type; [$4: addr]]
 mptcp_lib_evts_get_info() {
-	mptcp_lib_get_info_value "${1}" "^type:${3:-1}," < "${2}"
+	grep "${4:-}" "${2}" | mptcp_lib_get_info_value "${1}" "^type:${3:-1},"
 }
 
 # $1: PID

-- 
2.43.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net 03/10] mptcp: avoid printing warning once on client side
  2024-02-23 16:14 [PATCH net 00/10] mptcp: more misc. fixes for v6.8 Matthieu Baerts (NGI0)
  2024-02-23 16:14 ` [PATCH net 01/10] mptcp: map v4 address to v6 when destroying subflow Matthieu Baerts (NGI0)
  2024-02-23 16:14 ` [PATCH net 02/10] selftests: mptcp: rm subflow with v4/v4mapped addr Matthieu Baerts (NGI0)
@ 2024-02-23 16:14 ` Matthieu Baerts (NGI0)
  2024-02-23 16:14 ` [PATCH net 04/10] mptcp: push at DSS boundaries Matthieu Baerts (NGI0)
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Matthieu Baerts (NGI0) @ 2024-02-23 16:14 UTC (permalink / raw)
  To: mptcp, Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Florian Westphal, Kishen Maloor,
	Shuah Khan, Peter Krystad, Christoph Paasch
  Cc: netdev, linux-kernel, linux-kselftest, Matthieu Baerts (NGI0),
	stable

After the 'Fixes' commit mentioned below, the client side might print
the following warning once when a subflow is fully established at the
reception of any valid additional ack:

  MPTCP: bogus mpc option on established client sk

That's a normal situation, and no warning should be printed for that. We
can then skip the check when the label is used.

Fixes: e4a0fa47e816 ("mptcp: corner case locking for rx path fields initialization")
Cc: stable@vger.kernel.org
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 net/mptcp/options.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index e3e96a49f922..63fc0758c22d 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -981,10 +981,10 @@ static bool check_fully_established(struct mptcp_sock *msk, struct sock *ssk,
 	if (mp_opt->deny_join_id0)
 		WRITE_ONCE(msk->pm.remote_deny_join_id0, true);
 
-set_fully_established:
 	if (unlikely(!READ_ONCE(msk->pm.server_side)))
 		pr_warn_once("bogus mpc option on established client sk");
 
+set_fully_established:
 	mptcp_data_lock((struct sock *)msk);
 	__mptcp_subflow_fully_established(msk, subflow, mp_opt);
 	mptcp_data_unlock((struct sock *)msk);

-- 
2.43.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net 04/10] mptcp: push at DSS boundaries
  2024-02-23 16:14 [PATCH net 00/10] mptcp: more misc. fixes for v6.8 Matthieu Baerts (NGI0)
                   ` (2 preceding siblings ...)
  2024-02-23 16:14 ` [PATCH net 03/10] mptcp: avoid printing warning once on client side Matthieu Baerts (NGI0)
@ 2024-02-23 16:14 ` Matthieu Baerts (NGI0)
  2024-02-23 16:14 ` [PATCH net 05/10] mptcp: fix snd_wnd initialization for passive socket Matthieu Baerts (NGI0)
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Matthieu Baerts (NGI0) @ 2024-02-23 16:14 UTC (permalink / raw)
  To: mptcp, Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Florian Westphal, Kishen Maloor,
	Shuah Khan, Peter Krystad, Christoph Paasch
  Cc: netdev, linux-kernel, linux-kselftest, Matthieu Baerts (NGI0),
	stable

From: Paolo Abeni <pabeni@redhat.com>

when inserting not contiguous data in the subflow write queue,
the protocol creates a new skb and prevent the TCP stack from
merging it later with already queued skbs by setting the EOR marker.

Still no push flag is explicitly set at the end of previous GSO
packet, making the aggregation on the receiver side sub-optimal -
and packetdrill self-tests less predictable.

Explicitly mark the end of not contiguous DSS with the push flag.

Fixes: 6d0060f600ad ("mptcp: Write MPTCP DSS headers to outgoing data packets")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 net/mptcp/protocol.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 948606a537da..442fa7d9b57a 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1260,6 +1260,7 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
 		mpext = mptcp_get_ext(skb);
 		if (!mptcp_skb_can_collapse_to(data_seq, skb, mpext)) {
 			TCP_SKB_CB(skb)->eor = 1;
+			tcp_mark_push(tcp_sk(ssk), skb);
 			goto alloc_skb;
 		}
 

-- 
2.43.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net 05/10] mptcp: fix snd_wnd initialization for passive socket
  2024-02-23 16:14 [PATCH net 00/10] mptcp: more misc. fixes for v6.8 Matthieu Baerts (NGI0)
                   ` (3 preceding siblings ...)
  2024-02-23 16:14 ` [PATCH net 04/10] mptcp: push at DSS boundaries Matthieu Baerts (NGI0)
@ 2024-02-23 16:14 ` Matthieu Baerts (NGI0)
  2024-02-23 16:35   ` [PATCH net 05/10] mptcp: fix snd_wnd initialization for passive socket: manual merge Matthieu Baerts
  2024-02-23 16:14 ` [PATCH net 06/10] mptcp: fix potential wake-up event loss Matthieu Baerts (NGI0)
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 18+ messages in thread
From: Matthieu Baerts (NGI0) @ 2024-02-23 16:14 UTC (permalink / raw)
  To: mptcp, Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Florian Westphal, Kishen Maloor,
	Shuah Khan, Peter Krystad, Christoph Paasch
  Cc: netdev, linux-kernel, linux-kselftest, Matthieu Baerts (NGI0),
	stable

From: Paolo Abeni <pabeni@redhat.com>

Such value should be inherited from the first subflow, but
passive sockets always used 'rsk_rcv_wnd'.

Fixes: 6f8a612a33e4 ("mptcp: keep track of advertised windows right edge")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 net/mptcp/protocol.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 442fa7d9b57a..2c8f931c6d5b 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -3211,7 +3211,7 @@ struct sock *mptcp_sk_clone_init(const struct sock *sk,
 	msk->write_seq = subflow_req->idsn + 1;
 	msk->snd_nxt = msk->write_seq;
 	msk->snd_una = msk->write_seq;
-	msk->wnd_end = msk->snd_nxt + req->rsk_rcv_wnd;
+	msk->wnd_end = msk->snd_nxt + tcp_sk(ssk)->snd_wnd;
 	msk->setsockopt_seq = mptcp_sk(sk)->setsockopt_seq;
 	mptcp_init_sched(msk, mptcp_sk(sk)->sched);
 

-- 
2.43.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net 06/10] mptcp: fix potential wake-up event loss
  2024-02-23 16:14 [PATCH net 00/10] mptcp: more misc. fixes for v6.8 Matthieu Baerts (NGI0)
                   ` (4 preceding siblings ...)
  2024-02-23 16:14 ` [PATCH net 05/10] mptcp: fix snd_wnd initialization for passive socket Matthieu Baerts (NGI0)
@ 2024-02-23 16:14 ` Matthieu Baerts (NGI0)
  2024-02-23 16:14 ` [PATCH net 07/10] selftests: mptcp: join: add ss mptcp support check Matthieu Baerts (NGI0)
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Matthieu Baerts (NGI0) @ 2024-02-23 16:14 UTC (permalink / raw)
  To: mptcp, Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Florian Westphal, Kishen Maloor,
	Shuah Khan, Peter Krystad, Christoph Paasch
  Cc: netdev, linux-kernel, linux-kselftest, Matthieu Baerts (NGI0),
	stable

From: Paolo Abeni <pabeni@redhat.com>

After the blamed commit below, the send buffer auto-tuning can
happen after that the mptcp_propagate_sndbuf() completes - via
the delegated action infrastructure.

We must check for write space even after such change or we risk
missing the wake-up event.

Fixes: 8005184fd1ca ("mptcp: refactor sndbuf auto-tuning")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 net/mptcp/protocol.h | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 631a7f445f34..07f6242afc1a 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -790,6 +790,16 @@ static inline bool mptcp_data_fin_enabled(const struct mptcp_sock *msk)
 	       READ_ONCE(msk->write_seq) == READ_ONCE(msk->snd_nxt);
 }
 
+static inline void mptcp_write_space(struct sock *sk)
+{
+	if (sk_stream_is_writeable(sk)) {
+		/* pairs with memory barrier in mptcp_poll */
+		smp_mb();
+		if (test_and_clear_bit(MPTCP_NOSPACE, &mptcp_sk(sk)->flags))
+			sk_stream_write_space(sk);
+	}
+}
+
 static inline void __mptcp_sync_sndbuf(struct sock *sk)
 {
 	struct mptcp_subflow_context *subflow;
@@ -808,6 +818,7 @@ static inline void __mptcp_sync_sndbuf(struct sock *sk)
 
 	/* the msk max wmem limit is <nr_subflows> * tcp wmem[2] */
 	WRITE_ONCE(sk->sk_sndbuf, new_sndbuf);
+	mptcp_write_space(sk);
 }
 
 /* The called held both the msk socket and the subflow socket locks,
@@ -838,16 +849,6 @@ static inline void mptcp_propagate_sndbuf(struct sock *sk, struct sock *ssk)
 	local_bh_enable();
 }
 
-static inline void mptcp_write_space(struct sock *sk)
-{
-	if (sk_stream_is_writeable(sk)) {
-		/* pairs with memory barrier in mptcp_poll */
-		smp_mb();
-		if (test_and_clear_bit(MPTCP_NOSPACE, &mptcp_sk(sk)->flags))
-			sk_stream_write_space(sk);
-	}
-}
-
 void mptcp_destroy_common(struct mptcp_sock *msk, unsigned int flags);
 
 #define MPTCP_TOKEN_MAX_RETRIES	4

-- 
2.43.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net 07/10] selftests: mptcp: join: add ss mptcp support check
  2024-02-23 16:14 [PATCH net 00/10] mptcp: more misc. fixes for v6.8 Matthieu Baerts (NGI0)
                   ` (5 preceding siblings ...)
  2024-02-23 16:14 ` [PATCH net 06/10] mptcp: fix potential wake-up event loss Matthieu Baerts (NGI0)
@ 2024-02-23 16:14 ` Matthieu Baerts (NGI0)
  2024-02-23 16:14 ` [PATCH net 08/10] mptcp: fix double-free on socket dismantle Matthieu Baerts (NGI0)
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Matthieu Baerts (NGI0) @ 2024-02-23 16:14 UTC (permalink / raw)
  To: mptcp, Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Florian Westphal, Kishen Maloor,
	Shuah Khan, Peter Krystad, Christoph Paasch
  Cc: netdev, linux-kernel, linux-kselftest, Matthieu Baerts (NGI0),
	Geliang Tang, stable

From: Geliang Tang <tanggeliang@kylinos.cn>

Commands 'ss -M' are used in script mptcp_join.sh to display only MPTCP
sockets. So it must be checked if ss tool supports MPTCP in this script.

Fixes: e274f7154008 ("selftests: mptcp: add subflow limits test-cases")
Cc: stable@vger.kernel.org
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 tools/testing/selftests/net/mptcp/mptcp_join.sh | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
index e68b1bc2c2e4..e4581b0dfb96 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -161,6 +161,11 @@ check_tools()
 		exit $ksft_skip
 	fi
 
+	if ! ss -h | grep -q MPTCP; then
+		echo "SKIP: ss tool does not support MPTCP"
+		exit $ksft_skip
+	fi
+
 	# Use the legacy version if available to support old kernel versions
 	if iptables-legacy -V &> /dev/null; then
 		iptables="iptables-legacy"

-- 
2.43.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net 08/10] mptcp: fix double-free on socket dismantle
  2024-02-23 16:14 [PATCH net 00/10] mptcp: more misc. fixes for v6.8 Matthieu Baerts (NGI0)
                   ` (6 preceding siblings ...)
  2024-02-23 16:14 ` [PATCH net 07/10] selftests: mptcp: join: add ss mptcp support check Matthieu Baerts (NGI0)
@ 2024-02-23 16:14 ` Matthieu Baerts (NGI0)
  2024-02-23 16:14 ` [PATCH net 09/10] mptcp: fix possible deadlock in subflow diag Matthieu Baerts (NGI0)
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Matthieu Baerts (NGI0) @ 2024-02-23 16:14 UTC (permalink / raw)
  To: mptcp, Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Florian Westphal, Kishen Maloor,
	Shuah Khan, Peter Krystad, Christoph Paasch
  Cc: netdev, linux-kernel, linux-kselftest, Matthieu Baerts (NGI0),
	Davide Caratti, stable

From: Davide Caratti <dcaratti@redhat.com>

when MPTCP server accepts an incoming connection, it clones its listener
socket. However, the pointer to 'inet_opt' for the new socket has the same
value as the original one: as a consequence, on program exit it's possible
to observe the following splat:

  BUG: KASAN: double-free in inet_sock_destruct+0x54f/0x8b0
  Free of addr ffff888485950880 by task swapper/25/0

  CPU: 25 PID: 0 Comm: swapper/25 Kdump: loaded Not tainted 6.8.0-rc1+ #609
  Hardware name: Supermicro SYS-6027R-72RF/X9DRH-7TF/7F/iTF/iF, BIOS 3.0  07/26/2013
  Call Trace:
   <IRQ>
   dump_stack_lvl+0x32/0x50
   print_report+0xca/0x620
   kasan_report_invalid_free+0x64/0x90
   __kasan_slab_free+0x1aa/0x1f0
   kfree+0xed/0x2e0
   inet_sock_destruct+0x54f/0x8b0
   __sk_destruct+0x48/0x5b0
   rcu_do_batch+0x34e/0xd90
   rcu_core+0x559/0xac0
   __do_softirq+0x183/0x5a4
   irq_exit_rcu+0x12d/0x170
   sysvec_apic_timer_interrupt+0x6b/0x80
   </IRQ>
   <TASK>
   asm_sysvec_apic_timer_interrupt+0x16/0x20
  RIP: 0010:cpuidle_enter_state+0x175/0x300
  Code: 30 00 0f 84 1f 01 00 00 83 e8 01 83 f8 ff 75 e5 48 83 c4 18 44 89 e8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc fb 45 85 ed <0f> 89 60 ff ff ff 48 c1 e5 06 48 c7 43 18 00 00 00 00 48 83 44 2b
  RSP: 0018:ffff888481cf7d90 EFLAGS: 00000202
  RAX: 0000000000000000 RBX: ffff88887facddc8 RCX: 0000000000000000
  RDX: 1ffff1110ff588b1 RSI: 0000000000000019 RDI: ffff88887fac4588
  RBP: 0000000000000004 R08: 0000000000000002 R09: 0000000000043080
  R10: 0009b02ea273363f R11: ffff88887fabf42b R12: ffffffff932592e0
  R13: 0000000000000004 R14: 0000000000000000 R15: 00000022c880ec80
   cpuidle_enter+0x4a/0xa0
   do_idle+0x310/0x410
   cpu_startup_entry+0x51/0x60
   start_secondary+0x211/0x270
   secondary_startup_64_no_verify+0x184/0x18b
   </TASK>

  Allocated by task 6853:
   kasan_save_stack+0x1c/0x40
   kasan_save_track+0x10/0x30
   __kasan_kmalloc+0xa6/0xb0
   __kmalloc+0x1eb/0x450
   cipso_v4_sock_setattr+0x96/0x360
   netlbl_sock_setattr+0x132/0x1f0
   selinux_netlbl_socket_post_create+0x6c/0x110
   selinux_socket_post_create+0x37b/0x7f0
   security_socket_post_create+0x63/0xb0
   __sock_create+0x305/0x450
   __sys_socket_create.part.23+0xbd/0x130
   __sys_socket+0x37/0xb0
   __x64_sys_socket+0x6f/0xb0
   do_syscall_64+0x83/0x160
   entry_SYSCALL_64_after_hwframe+0x6e/0x76

  Freed by task 6858:
   kasan_save_stack+0x1c/0x40
   kasan_save_track+0x10/0x30
   kasan_save_free_info+0x3b/0x60
   __kasan_slab_free+0x12c/0x1f0
   kfree+0xed/0x2e0
   inet_sock_destruct+0x54f/0x8b0
   __sk_destruct+0x48/0x5b0
   subflow_ulp_release+0x1f0/0x250
   tcp_cleanup_ulp+0x6e/0x110
   tcp_v4_destroy_sock+0x5a/0x3a0
   inet_csk_destroy_sock+0x135/0x390
   tcp_fin+0x416/0x5c0
   tcp_data_queue+0x1bc8/0x4310
   tcp_rcv_state_process+0x15a3/0x47b0
   tcp_v4_do_rcv+0x2c1/0x990
   tcp_v4_rcv+0x41fb/0x5ed0
   ip_protocol_deliver_rcu+0x6d/0x9f0
   ip_local_deliver_finish+0x278/0x360
   ip_local_deliver+0x182/0x2c0
   ip_rcv+0xb5/0x1c0
   __netif_receive_skb_one_core+0x16e/0x1b0
   process_backlog+0x1e3/0x650
   __napi_poll+0xa6/0x500
   net_rx_action+0x740/0xbb0
   __do_softirq+0x183/0x5a4

  The buggy address belongs to the object at ffff888485950880
   which belongs to the cache kmalloc-64 of size 64
  The buggy address is located 0 bytes inside of
   64-byte region [ffff888485950880, ffff8884859508c0)

  The buggy address belongs to the physical page:
  page:0000000056d1e95e refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888485950700 pfn:0x485950
  flags: 0x57ffffc0000800(slab|node=1|zone=2|lastcpupid=0x1fffff)
  page_type: 0xffffffff()
  raw: 0057ffffc0000800 ffff88810004c640 ffffea00121b8ac0 dead000000000006
  raw: ffff888485950700 0000000000200019 00000001ffffffff 0000000000000000
  page dumped because: kasan: bad access detected

  Memory state around the buggy address:
   ffff888485950780: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
   ffff888485950800: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
  >ffff888485950880: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
                     ^
   ffff888485950900: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
   ffff888485950980: 00 00 00 00 00 01 fc fc fc fc fc fc fc fc fc fc

Something similar (a refcount underflow) happens with CALIPSO/IPv6. Fix
this by duplicating IP / IPv6 options after clone, so that
ip{,6}_sock_destruct() doesn't end up freeing the same memory area twice.

Fixes: cf7da0d66cc1 ("mptcp: Create SUBFLOW socket for incoming connections")
Cc: stable@vger.kernel.org
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 net/mptcp/protocol.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 2c8f931c6d5b..7833a49f6214 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -3178,8 +3178,50 @@ static struct ipv6_pinfo *mptcp_inet6_sk(const struct sock *sk)
 
 	return (struct ipv6_pinfo *)(((u8 *)sk) + offset);
 }
+
+static void mptcp_copy_ip6_options(struct sock *newsk, const struct sock *sk)
+{
+	const struct ipv6_pinfo *np = inet6_sk(sk);
+	struct ipv6_txoptions *opt;
+	struct ipv6_pinfo *newnp;
+
+	newnp = inet6_sk(newsk);
+
+	rcu_read_lock();
+	opt = rcu_dereference(np->opt);
+	if (opt) {
+		opt = ipv6_dup_options(newsk, opt);
+		if (!opt)
+			net_warn_ratelimited("%s: Failed to copy ip6 options\n", __func__);
+	}
+	RCU_INIT_POINTER(newnp->opt, opt);
+	rcu_read_unlock();
+}
 #endif
 
+static void mptcp_copy_ip_options(struct sock *newsk, const struct sock *sk)
+{
+	struct ip_options_rcu *inet_opt, *newopt = NULL;
+	const struct inet_sock *inet = inet_sk(sk);
+	struct inet_sock *newinet;
+
+	newinet = inet_sk(newsk);
+
+	rcu_read_lock();
+	inet_opt = rcu_dereference(inet->inet_opt);
+	if (inet_opt) {
+		newopt = sock_kmalloc(newsk, sizeof(*inet_opt) +
+				      inet_opt->opt.optlen, GFP_ATOMIC);
+		if (newopt)
+			memcpy(newopt, inet_opt, sizeof(*inet_opt) +
+			       inet_opt->opt.optlen);
+		else
+			net_warn_ratelimited("%s: Failed to copy ip options\n", __func__);
+	}
+	RCU_INIT_POINTER(newinet->inet_opt, newopt);
+	rcu_read_unlock();
+}
+
 struct sock *mptcp_sk_clone_init(const struct sock *sk,
 				 const struct mptcp_options_received *mp_opt,
 				 struct sock *ssk,
@@ -3200,6 +3242,13 @@ struct sock *mptcp_sk_clone_init(const struct sock *sk,
 
 	__mptcp_init_sock(nsk);
 
+#if IS_ENABLED(CONFIG_MPTCP_IPV6)
+	if (nsk->sk_family == AF_INET6)
+		mptcp_copy_ip6_options(nsk, sk);
+	else
+#endif
+		mptcp_copy_ip_options(nsk, sk);
+
 	msk = mptcp_sk(nsk);
 	msk->local_key = subflow_req->local_key;
 	msk->token = subflow_req->token;

-- 
2.43.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net 09/10] mptcp: fix possible deadlock in subflow diag
  2024-02-23 16:14 [PATCH net 00/10] mptcp: more misc. fixes for v6.8 Matthieu Baerts (NGI0)
                   ` (7 preceding siblings ...)
  2024-02-23 16:14 ` [PATCH net 08/10] mptcp: fix double-free on socket dismantle Matthieu Baerts (NGI0)
@ 2024-02-23 16:14 ` Matthieu Baerts (NGI0)
  2024-02-23 16:14 ` [PATCH net 10/10] selftests: mptcp: explicitly trigger the listener diag code-path Matthieu Baerts (NGI0)
  2024-02-27  3:10 ` [PATCH net 00/10] mptcp: more misc. fixes for v6.8 patchwork-bot+netdevbpf
  10 siblings, 0 replies; 18+ messages in thread
From: Matthieu Baerts (NGI0) @ 2024-02-23 16:14 UTC (permalink / raw)
  To: mptcp, Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Florian Westphal, Kishen Maloor,
	Shuah Khan, Peter Krystad, Christoph Paasch
  Cc: netdev, linux-kernel, linux-kselftest, Matthieu Baerts (NGI0),
	stable

From: Paolo Abeni <pabeni@redhat.com>

Syzbot and Eric reported a lockdep splat in the subflow diag:

   WARNING: possible circular locking dependency detected
   6.8.0-rc4-syzkaller-00212-g40b9385dd8e6 #0 Not tainted

   syz-executor.2/24141 is trying to acquire lock:
   ffff888045870130 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at:
   tcp_diag_put_ulp net/ipv4/tcp_diag.c:100 [inline]
   ffff888045870130 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at:
   tcp_diag_get_aux+0x738/0x830 net/ipv4/tcp_diag.c:137

   but task is already holding lock:
   ffffc9000135e488 (&h->lhash2[i].lock){+.+.}-{2:2}, at: spin_lock
   include/linux/spinlock.h:351 [inline]
   ffffc9000135e488 (&h->lhash2[i].lock){+.+.}-{2:2}, at:
   inet_diag_dump_icsk+0x39f/0x1f80 net/ipv4/inet_diag.c:1038

   which lock already depends on the new lock.

   the existing dependency chain (in reverse order) is:

   -> #1 (&h->lhash2[i].lock){+.+.}-{2:2}:
   lock_acquire+0x1e3/0x530 kernel/locking/lockdep.c:5754
   __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
   _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
   spin_lock include/linux/spinlock.h:351 [inline]
   __inet_hash+0x335/0xbe0 net/ipv4/inet_hashtables.c:743
   inet_csk_listen_start+0x23a/0x320 net/ipv4/inet_connection_sock.c:1261
   __inet_listen_sk+0x2a2/0x770 net/ipv4/af_inet.c:217
   inet_listen+0xa3/0x110 net/ipv4/af_inet.c:239
   rds_tcp_listen_init+0x3fd/0x5a0 net/rds/tcp_listen.c:316
   rds_tcp_init_net+0x141/0x320 net/rds/tcp.c:577
   ops_init+0x352/0x610 net/core/net_namespace.c:136
   __register_pernet_operations net/core/net_namespace.c:1214 [inline]
   register_pernet_operations+0x2cb/0x660 net/core/net_namespace.c:1283
   register_pernet_device+0x33/0x80 net/core/net_namespace.c:1370
   rds_tcp_init+0x62/0xd0 net/rds/tcp.c:735
   do_one_initcall+0x238/0x830 init/main.c:1236
   do_initcall_level+0x157/0x210 init/main.c:1298
   do_initcalls+0x3f/0x80 init/main.c:1314
   kernel_init_freeable+0x42f/0x5d0 init/main.c:1551
   kernel_init+0x1d/0x2a0 init/main.c:1441
   ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
   ret_from_fork_asm+0x1b/0x30 arch/x86/entry/entry_64.S:242

   -> #0 (k-sk_lock-AF_INET6){+.+.}-{0:0}:
   check_prev_add kernel/locking/lockdep.c:3134 [inline]
   check_prevs_add kernel/locking/lockdep.c:3253 [inline]
   validate_chain+0x18ca/0x58e0 kernel/locking/lockdep.c:3869
   __lock_acquire+0x1345/0x1fd0 kernel/locking/lockdep.c:5137
   lock_acquire+0x1e3/0x530 kernel/locking/lockdep.c:5754
   lock_sock_fast include/net/sock.h:1723 [inline]
   subflow_get_info+0x166/0xd20 net/mptcp/diag.c:28
   tcp_diag_put_ulp net/ipv4/tcp_diag.c:100 [inline]
   tcp_diag_get_aux+0x738/0x830 net/ipv4/tcp_diag.c:137
   inet_sk_diag_fill+0x10ed/0x1e00 net/ipv4/inet_diag.c:345
   inet_diag_dump_icsk+0x55b/0x1f80 net/ipv4/inet_diag.c:1061
   __inet_diag_dump+0x211/0x3a0 net/ipv4/inet_diag.c:1263
   inet_diag_dump_compat+0x1c1/0x2d0 net/ipv4/inet_diag.c:1371
   netlink_dump+0x59b/0xc80 net/netlink/af_netlink.c:2264
   __netlink_dump_start+0x5df/0x790 net/netlink/af_netlink.c:2370
   netlink_dump_start include/linux/netlink.h:338 [inline]
   inet_diag_rcv_msg_compat+0x209/0x4c0 net/ipv4/inet_diag.c:1405
   sock_diag_rcv_msg+0xe7/0x410
   netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2543
   sock_diag_rcv+0x2a/0x40 net/core/sock_diag.c:280
   netlink_unicast_kernel net/netlink/af_netlink.c:1341 [inline]
   netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1367
   netlink_sendmsg+0xa3b/0xd70 net/netlink/af_netlink.c:1908
   sock_sendmsg_nosec net/socket.c:730 [inline]
   __sock_sendmsg+0x221/0x270 net/socket.c:745
   ____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
   ___sys_sendmsg net/socket.c:2638 [inline]
   __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
   do_syscall_64+0xf9/0x240
   entry_SYSCALL_64_after_hwframe+0x6f/0x77

As noted by Eric we can break the lock dependency chain avoid
dumping any extended info for the mptcp subflow listener:
nothing actually useful is presented there.

Fixes: b8adb69a7d29 ("mptcp: fix lockless access in subflow ULP diag")
Cc: stable@vger.kernel.org
Reported-by: Eric Dumazet <edumazet@google.com>
Closes: https://lore.kernel.org/netdev/CANn89iJ=Oecw6OZDwmSYc9HJKQ_G32uN11L+oUcMu+TOD5Xiaw@mail.gmail.com/
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 net/mptcp/diag.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/mptcp/diag.c b/net/mptcp/diag.c
index 6ff6f14674aa..7017dd60659d 100644
--- a/net/mptcp/diag.c
+++ b/net/mptcp/diag.c
@@ -21,6 +21,9 @@ static int subflow_get_info(struct sock *sk, struct sk_buff *skb)
 	bool slow;
 	int err;
 
+	if (inet_sk_state_load(sk) == TCP_LISTEN)
+		return 0;
+
 	start = nla_nest_start_noflag(skb, INET_ULP_INFO_MPTCP);
 	if (!start)
 		return -EMSGSIZE;

-- 
2.43.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net 10/10] selftests: mptcp: explicitly trigger the listener diag code-path
  2024-02-23 16:14 [PATCH net 00/10] mptcp: more misc. fixes for v6.8 Matthieu Baerts (NGI0)
                   ` (8 preceding siblings ...)
  2024-02-23 16:14 ` [PATCH net 09/10] mptcp: fix possible deadlock in subflow diag Matthieu Baerts (NGI0)
@ 2024-02-23 16:14 ` Matthieu Baerts (NGI0)
  2024-02-26 17:58   ` Simon Horman
  2024-03-01 14:37   ` Jakub Kicinski
  2024-02-27  3:10 ` [PATCH net 00/10] mptcp: more misc. fixes for v6.8 patchwork-bot+netdevbpf
  10 siblings, 2 replies; 18+ messages in thread
From: Matthieu Baerts (NGI0) @ 2024-02-23 16:14 UTC (permalink / raw)
  To: mptcp, Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Florian Westphal, Kishen Maloor,
	Shuah Khan, Peter Krystad, Christoph Paasch
  Cc: netdev, linux-kernel, linux-kselftest, Matthieu Baerts (NGI0)

From: Paolo Abeni <pabeni@redhat.com>

The mptcp diag interface already experienced a few locking bugs
that lockdep and appropriate coverage have detected in advance.

Let's add a test-case triggering the relevant code path, to prevent
similar issues in the future.

Be careful to cope with very slow environments.

Note that we don't need an explicit timeout on the mptcp_connect
subprocess to cope with eventual bug/hang-up as the final cleanup
terminating the child processes will take care of that.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 tools/testing/selftests/net/mptcp/diag.sh | 30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/net/mptcp/diag.sh b/tools/testing/selftests/net/mptcp/diag.sh
index 0a58ebb8b04c..f300f4e1eb59 100755
--- a/tools/testing/selftests/net/mptcp/diag.sh
+++ b/tools/testing/selftests/net/mptcp/diag.sh
@@ -20,7 +20,7 @@ flush_pids()
 
 	ip netns pids "${ns}" | xargs --no-run-if-empty kill -SIGUSR1 &>/dev/null
 
-	for _ in $(seq 10); do
+	for _ in $(seq $((timeout_poll * 10))); do
 		[ -z "$(ip netns pids "${ns}")" ] && break
 		sleep 0.1
 	done
@@ -91,6 +91,15 @@ chk_msk_nr()
 	__chk_msk_nr "grep -c token:" "$@"
 }
 
+chk_listener_nr()
+{
+	local expected=$1
+	local msg="$2"
+
+	__chk_nr "ss -inmlHMON $ns | wc -l" "$expected" "$msg - mptcp" 0
+	__chk_nr "ss -inmlHtON $ns | wc -l" "$expected" "$msg - subflows"
+}
+
 wait_msk_nr()
 {
 	local condition="grep -c token:"
@@ -289,5 +298,24 @@ flush_pids
 chk_msk_inuse 0 "many->0"
 chk_msk_cestab 0 "many->0"
 
+chk_listener_nr 0 "no listener sockets"
+NR_SERVERS=100
+for I in $(seq 1 $NR_SERVERS); do
+	ip netns exec $ns ./mptcp_connect -p $((I + 20001)) \
+		-t ${timeout_poll} -l 0.0.0.0 >/dev/null 2>&1 &
+done
+
+for I in $(seq 1 $NR_SERVERS); do
+	mptcp_lib_wait_local_port_listen $ns $((I + 20001))
+done
+
+chk_listener_nr $NR_SERVERS "many listener sockets"
+
+# graceful termination
+for I in $(seq 1 $NR_SERVERS); do
+	echo a | ip netns exec $ns ./mptcp_connect -p $((I + 20001)) 127.0.0.1 >/dev/null 2>&1 &
+done
+flush_pids
+
 mptcp_lib_result_print_all_tap
 exit $ret

-- 
2.43.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH net 05/10] mptcp: fix snd_wnd initialization for passive socket: manual merge
  2024-02-23 16:14 ` [PATCH net 05/10] mptcp: fix snd_wnd initialization for passive socket Matthieu Baerts (NGI0)
@ 2024-02-23 16:35   ` Matthieu Baerts
  0 siblings, 0 replies; 18+ messages in thread
From: Matthieu Baerts @ 2024-02-23 16:35 UTC (permalink / raw)
  To: Jakub Kicinski, Paolo Abeni, Stephen Rothwell
  Cc: mptcp, Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
	Florian Westphal, Kishen Maloor, Shuah Khan, Peter Krystad,
	Christoph Paasch, netdev, linux-kernel, linux-kselftest, stable

[-- Attachment #1: Type: text/plain, Size: 1651 bytes --]

Hi Jakub, Paolo, Stephen,

On 23/02/2024 5:14 pm, Matthieu Baerts (NGI0) wrote:
> From: Paolo Abeni <pabeni@redhat.com>
> 
> Such value should be inherited from the first subflow, but
> passive sockets always used 'rsk_rcv_wnd'.
> 
> Fixes: 6f8a612a33e4 ("mptcp: keep track of advertised windows right edge")
> Cc: stable@vger.kernel.org
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> Reviewed-by: Mat Martineau <martineau@kernel.org>
> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
> ---
>  net/mptcp/protocol.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 442fa7d9b57a..2c8f931c6d5b 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -3211,7 +3211,7 @@ struct sock *mptcp_sk_clone_init(const struct sock *sk,
>  	msk->write_seq = subflow_req->idsn + 1;
>  	msk->snd_nxt = msk->write_seq;
>  	msk->snd_una = msk->write_seq;
> -	msk->wnd_end = msk->snd_nxt + req->rsk_rcv_wnd;
> +	msk->wnd_end = msk->snd_nxt + tcp_sk(ssk)->snd_wnd;

Please note that this patch will conflict with the following commit from
net-next:

  3f83d8a77eee ("mptcp: fix more tx path fields initialization")

That's because this commit modifies the same line as the one modified
here. We cannot avoid a conflict here. To fix it, please use
'WRITE_ONCE()' with the new line from -net:

  WRITE_ONCE(msk->wnd_end, msk->snd_nxt + tcp_sk(ssk)->snd_wnd);

A 3-way diff has been attached to this email.

>  	msk->setsockopt_seq = mptcp_sk(sk)->setsockopt_seq;
>  	mptcp_init_sched(msk, mptcp_sk(sk)->sched);
>  
> 

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.

[-- Attachment #2: conflict-mptcp-fix-more-tx-path-fields-initialization.diff --]
[-- Type: text/x-patch, Size: 1322 bytes --]

diff --cc net/mptcp/protocol.c
index 7833a49f6214,9df4eaddfd48..000000000000
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@@ -3242,25 -3204,18 +3247,25 @@@ struct sock *mptcp_sk_clone_init(const 
  
  	__mptcp_init_sock(nsk);
  
 +#if IS_ENABLED(CONFIG_MPTCP_IPV6)
 +	if (nsk->sk_family == AF_INET6)
 +		mptcp_copy_ip6_options(nsk, sk);
 +	else
 +#endif
 +		mptcp_copy_ip_options(nsk, sk);
 +
  	msk = mptcp_sk(nsk);
- 	msk->local_key = subflow_req->local_key;
- 	msk->token = subflow_req->token;
+ 	WRITE_ONCE(msk->local_key, subflow_req->local_key);
+ 	WRITE_ONCE(msk->token, subflow_req->token);
  	msk->in_accept_queue = 1;
  	WRITE_ONCE(msk->fully_established, false);
  	if (mp_opt->suboptions & OPTION_MPTCP_CSUMREQD)
  		WRITE_ONCE(msk->csum_enabled, true);
  
- 	msk->write_seq = subflow_req->idsn + 1;
- 	msk->snd_nxt = msk->write_seq;
- 	msk->snd_una = msk->write_seq;
- 	msk->wnd_end = msk->snd_nxt + tcp_sk(ssk)->snd_wnd;
+ 	WRITE_ONCE(msk->write_seq, subflow_req->idsn + 1);
+ 	WRITE_ONCE(msk->snd_nxt, msk->write_seq);
+ 	WRITE_ONCE(msk->snd_una, msk->write_seq);
 -	WRITE_ONCE(msk->wnd_end, msk->snd_nxt + req->rsk_rcv_wnd);
++	WRITE_ONCE(msk->wnd_end, msk->snd_nxt + tcp_sk(ssk)->snd_wnd);
  	msk->setsockopt_seq = mptcp_sk(sk)->setsockopt_seq;
  	mptcp_init_sched(msk, mptcp_sk(sk)->sched);
  

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net 10/10] selftests: mptcp: explicitly trigger the listener diag code-path
  2024-02-23 16:14 ` [PATCH net 10/10] selftests: mptcp: explicitly trigger the listener diag code-path Matthieu Baerts (NGI0)
@ 2024-02-26 17:58   ` Simon Horman
  2024-03-01 14:37   ` Jakub Kicinski
  1 sibling, 0 replies; 18+ messages in thread
From: Simon Horman @ 2024-02-26 17:58 UTC (permalink / raw)
  To: Matthieu Baerts (NGI0)
  Cc: mptcp, Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Florian Westphal, Kishen Maloor,
	Shuah Khan, Peter Krystad, Christoph Paasch, netdev, linux-kernel,
	linux-kselftest

On Fri, Feb 23, 2024 at 05:14:20PM +0100, Matthieu Baerts (NGI0) wrote:
> From: Paolo Abeni <pabeni@redhat.com>
> 
> The mptcp diag interface already experienced a few locking bugs
> that lockdep and appropriate coverage have detected in advance.
> 
> Let's add a test-case triggering the relevant code path, to prevent
> similar issues in the future.
> 
> Be careful to cope with very slow environments.
> 
> Note that we don't need an explicit timeout on the mptcp_connect
> subprocess to cope with eventual bug/hang-up as the final cleanup
> terminating the child processes will take care of that.
> 
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net 00/10] mptcp: more misc. fixes for v6.8
  2024-02-23 16:14 [PATCH net 00/10] mptcp: more misc. fixes for v6.8 Matthieu Baerts (NGI0)
                   ` (9 preceding siblings ...)
  2024-02-23 16:14 ` [PATCH net 10/10] selftests: mptcp: explicitly trigger the listener diag code-path Matthieu Baerts (NGI0)
@ 2024-02-27  3:10 ` patchwork-bot+netdevbpf
  10 siblings, 0 replies; 18+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-02-27  3:10 UTC (permalink / raw)
  To: Matthieu Baerts
  Cc: mptcp, martineau, geliang, davem, edumazet, kuba, pabeni, fw,
	kishen.maloor, shuah, peter.krystad, cpaasch, netdev,
	linux-kernel, linux-kselftest, tanggeliang, stable, dcaratti

Hello:

This series was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Fri, 23 Feb 2024 17:14:10 +0100 you wrote:
> This series includes 6 types of fixes:
> 
> - Patch 1 fixes v4 mapped in v6 addresses support for the userspace PM,
>   when asking to delete a subflow. It was done everywhere else, but not
>   there. Patch 2 validates the modification, thanks to a subtest in
>   mptcp_join.sh. These patches can be backported up to v5.19.
> 
> [...]

Here is the summary with links:
  - [net,01/10] mptcp: map v4 address to v6 when destroying subflow
    https://git.kernel.org/netdev/net/c/535d620ea5ff
  - [net,02/10] selftests: mptcp: rm subflow with v4/v4mapped addr
    https://git.kernel.org/netdev/net/c/7092dbee2328
  - [net,03/10] mptcp: avoid printing warning once on client side
    https://git.kernel.org/netdev/net/c/5b49c41ac8f2
  - [net,04/10] mptcp: push at DSS boundaries
    https://git.kernel.org/netdev/net/c/b9cd26f640a3
  - [net,05/10] mptcp: fix snd_wnd initialization for passive socket
    https://git.kernel.org/netdev/net/c/adf1bb78dab5
  - [net,06/10] mptcp: fix potential wake-up event loss
    https://git.kernel.org/netdev/net/c/b111d8fbd2cb
  - [net,07/10] selftests: mptcp: join: add ss mptcp support check
    https://git.kernel.org/netdev/net/c/9480f388a2ef
  - [net,08/10] mptcp: fix double-free on socket dismantle
    https://git.kernel.org/netdev/net/c/10048689def7
  - [net,09/10] mptcp: fix possible deadlock in subflow diag
    https://git.kernel.org/netdev/net/c/d6a9608af9a7
  - [net,10/10] selftests: mptcp: explicitly trigger the listener diag code-path
    https://git.kernel.org/netdev/net/c/b4b51d36bbaa

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net 10/10] selftests: mptcp: explicitly trigger the listener diag code-path
  2024-02-23 16:14 ` [PATCH net 10/10] selftests: mptcp: explicitly trigger the listener diag code-path Matthieu Baerts (NGI0)
  2024-02-26 17:58   ` Simon Horman
@ 2024-03-01 14:37   ` Jakub Kicinski
  2024-03-01 15:10     ` Matthieu Baerts
  1 sibling, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2024-03-01 14:37 UTC (permalink / raw)
  To: Matthieu Baerts (NGI0), Mat Martineau, Paolo Abeni
  Cc: mptcp, Geliang Tang, David S. Miller, Eric Dumazet,
	Florian Westphal, Kishen Maloor, Shuah Khan, Peter Krystad,
	Christoph Paasch, netdev, linux-kernel, linux-kselftest

On Fri, 23 Feb 2024 17:14:20 +0100 Matthieu Baerts (NGI0) wrote:
> From: Paolo Abeni <pabeni@redhat.com>
> 
> The mptcp diag interface already experienced a few locking bugs
> that lockdep and appropriate coverage have detected in advance.
> 
> Let's add a test-case triggering the relevant code path, to prevent
> similar issues in the future.
> 
> Be careful to cope with very slow environments.
> 
> Note that we don't need an explicit timeout on the mptcp_connect
> subprocess to cope with eventual bug/hang-up as the final cleanup
> terminating the child processes will take care of that.

Hi!

There's a failure in CI under debug after merging net and net-next 
in diag.sh. Maybe because of the patch which lowered timeout?
https://lore.kernel.org/all/20240223-upstream-net-next-20240223-misc-improvements-v1-8-b6c8a10396bd@kernel.org/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net 10/10] selftests: mptcp: explicitly trigger the listener diag code-path
  2024-03-01 14:37   ` Jakub Kicinski
@ 2024-03-01 15:10     ` Matthieu Baerts
  2024-03-01 18:24       ` Matthieu Baerts
  0 siblings, 1 reply; 18+ messages in thread
From: Matthieu Baerts @ 2024-03-01 15:10 UTC (permalink / raw)
  To: Jakub Kicinski, Mat Martineau, Paolo Abeni
  Cc: mptcp, Geliang Tang, David S. Miller, Eric Dumazet,
	Florian Westphal, Kishen Maloor, Shuah Khan, Peter Krystad,
	Christoph Paasch, netdev, linux-kernel, linux-kselftest

Hi Jakub,

On 01/03/2024 15:37, Jakub Kicinski wrote:
> On Fri, 23 Feb 2024 17:14:20 +0100 Matthieu Baerts (NGI0) wrote:
>> From: Paolo Abeni <pabeni@redhat.com>
>>
>> The mptcp diag interface already experienced a few locking bugs
>> that lockdep and appropriate coverage have detected in advance.
>>
>> Let's add a test-case triggering the relevant code path, to prevent
>> similar issues in the future.
>>
>> Be careful to cope with very slow environments.
>>
>> Note that we don't need an explicit timeout on the mptcp_connect
>> subprocess to cope with eventual bug/hang-up as the final cleanup
>> terminating the child processes will take care of that.
> 
> Hi!
> 
> There's a failure in CI under debug after merging net and net-next 
> in diag.sh. Maybe because of the patch which lowered timeout?
> https://lore.kernel.org/all/20240223-upstream-net-next-20240223-misc-improvements-v1-8-b6c8a10396bd@kernel.org/

Thank you for this message!

I didn't have this error on my side, even without '-d SLUB_DEBUG_ON' we
do on top of the debug kconfig, but I see I can reproduce it on slower
environments. Indeed, it looks like it can be caused by that
modification. I will send a fix ASAP!

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net 10/10] selftests: mptcp: explicitly trigger the listener diag code-path
  2024-03-01 15:10     ` Matthieu Baerts
@ 2024-03-01 18:24       ` Matthieu Baerts
  2024-03-01 18:55         ` Jakub Kicinski
  0 siblings, 1 reply; 18+ messages in thread
From: Matthieu Baerts @ 2024-03-01 18:24 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Mat Martineau, Paolo Abeni, mptcp, Geliang Tang, David S. Miller,
	Eric Dumazet, Florian Westphal, Kishen Maloor, Shuah Khan,
	Peter Krystad, Christoph Paasch, netdev, linux-kernel,
	linux-kselftest

Hi Jakub,

On 01/03/2024 16:10, Matthieu Baerts wrote:
> On 01/03/2024 15:37, Jakub Kicinski wrote:
>> On Fri, 23 Feb 2024 17:14:20 +0100 Matthieu Baerts (NGI0) wrote:
>>> From: Paolo Abeni <pabeni@redhat.com>
>>>
>>> The mptcp diag interface already experienced a few locking bugs
>>> that lockdep and appropriate coverage have detected in advance.
>>>
>>> Let's add a test-case triggering the relevant code path, to prevent
>>> similar issues in the future.
>>>
>>> Be careful to cope with very slow environments.
>>>
>>> Note that we don't need an explicit timeout on the mptcp_connect
>>> subprocess to cope with eventual bug/hang-up as the final cleanup
>>> terminating the child processes will take care of that.
>>
>> Hi!
>>
>> There's a failure in CI under debug after merging net and net-next 
>> in diag.sh. Maybe because of the patch which lowered timeout?
>> https://lore.kernel.org/all/20240223-upstream-net-next-20240223-misc-improvements-v1-8-b6c8a10396bd@kernel.org/
> 
> Thank you for this message!
> 
> I didn't have this error on my side, even without '-d SLUB_DEBUG_ON' we
> do on top of the debug kconfig, but I see I can reproduce it on slower
> environments. Indeed, it looks like it can be caused by that
> modification. I will send a fix ASAP!

The following patch fixes the issue on my side (when using 'renice 20'
and stress-ng in parallel :) ):

https://lore.kernel.org/netdev/20240301-upstream-net-20240301-selftests-mptcp-diag-exit-timeout-v1-2-07cb2fa15c06@kernel.org/T/

I queued it for -net, but the issue should only be visible in net-next
due to the reduction of the timeout, as you suggested. This patch can
also be applied in net-next if preferred, it will not cause any
conflicts between net and net-next.

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net 10/10] selftests: mptcp: explicitly trigger the listener diag code-path
  2024-03-01 18:24       ` Matthieu Baerts
@ 2024-03-01 18:55         ` Jakub Kicinski
  0 siblings, 0 replies; 18+ messages in thread
From: Jakub Kicinski @ 2024-03-01 18:55 UTC (permalink / raw)
  To: Matthieu Baerts
  Cc: Mat Martineau, Paolo Abeni, mptcp, Geliang Tang, David S. Miller,
	Eric Dumazet, Florian Westphal, Kishen Maloor, Shuah Khan,
	Peter Krystad, Christoph Paasch, netdev, linux-kernel,
	linux-kselftest

On Fri, 1 Mar 2024 19:24:53 +0100 Matthieu Baerts wrote:
> > Thank you for this message!
> > 
> > I didn't have this error on my side, even without '-d SLUB_DEBUG_ON' we
> > do on top of the debug kconfig, but I see I can reproduce it on slower
> > environments. Indeed, it looks like it can be caused by that
> > modification. I will send a fix ASAP!  
> 
> The following patch fixes the issue on my side (when using 'renice 20'
> and stress-ng in parallel :) ):
> 
> https://lore.kernel.org/netdev/20240301-upstream-net-20240301-selftests-mptcp-diag-exit-timeout-v1-2-07cb2fa15c06@kernel.org/T/

Nice! Note only a fix but also lowers runtime, if I'm reading right!

> I queued it for -net, but the issue should only be visible in net-next
> due to the reduction of the timeout, as you suggested. This patch can
> also be applied in net-next if preferred, it will not cause any
> conflicts between net and net-next.

No strong preference, net sounds good.

Thanks for a quick turnaround!

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2024-03-01 18:55 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-02-23 16:14 [PATCH net 00/10] mptcp: more misc. fixes for v6.8 Matthieu Baerts (NGI0)
2024-02-23 16:14 ` [PATCH net 01/10] mptcp: map v4 address to v6 when destroying subflow Matthieu Baerts (NGI0)
2024-02-23 16:14 ` [PATCH net 02/10] selftests: mptcp: rm subflow with v4/v4mapped addr Matthieu Baerts (NGI0)
2024-02-23 16:14 ` [PATCH net 03/10] mptcp: avoid printing warning once on client side Matthieu Baerts (NGI0)
2024-02-23 16:14 ` [PATCH net 04/10] mptcp: push at DSS boundaries Matthieu Baerts (NGI0)
2024-02-23 16:14 ` [PATCH net 05/10] mptcp: fix snd_wnd initialization for passive socket Matthieu Baerts (NGI0)
2024-02-23 16:35   ` [PATCH net 05/10] mptcp: fix snd_wnd initialization for passive socket: manual merge Matthieu Baerts
2024-02-23 16:14 ` [PATCH net 06/10] mptcp: fix potential wake-up event loss Matthieu Baerts (NGI0)
2024-02-23 16:14 ` [PATCH net 07/10] selftests: mptcp: join: add ss mptcp support check Matthieu Baerts (NGI0)
2024-02-23 16:14 ` [PATCH net 08/10] mptcp: fix double-free on socket dismantle Matthieu Baerts (NGI0)
2024-02-23 16:14 ` [PATCH net 09/10] mptcp: fix possible deadlock in subflow diag Matthieu Baerts (NGI0)
2024-02-23 16:14 ` [PATCH net 10/10] selftests: mptcp: explicitly trigger the listener diag code-path Matthieu Baerts (NGI0)
2024-02-26 17:58   ` Simon Horman
2024-03-01 14:37   ` Jakub Kicinski
2024-03-01 15:10     ` Matthieu Baerts
2024-03-01 18:24       ` Matthieu Baerts
2024-03-01 18:55         ` Jakub Kicinski
2024-02-27  3:10 ` [PATCH net 00/10] mptcp: more misc. fixes for v6.8 patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).