Linux kernel -stable discussions
* [PATCH net-next 00/15] mptcp: pm: special case for c-flag + laminar endp
@ 2025-09-25 10:32 Matthieu Baerts (NGI0)
From: Matthieu Baerts (NGI0) @ 2025-09-25 10:32 UTC
  To: Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan,
	Martin KaFai Lau
  Cc: netdev, mptcp, linux-kernel, linux-kselftest,
	Matthieu Baerts (NGI0), stable

Here are some patches for the MPTCP PM, including some refactoring that
I thought would be best to send at the end of a cycle, to avoid
conflicts between net and net-next that could last a few weeks.

The most interesting changes are in the first and last patch, the rest
are patches refactoring the code & tests to validate the modifications.

- Patches 1 & 2: When servers set the C-flag in their MP_CAPABLE to tell
  clients not to create subflows to the initial address and port -- e.g.
  a deployment behind an L4 load balancer, like a typical CDN deployment
  -- clients will not use their other endpoints when default settings
  are used. That's because the in-kernel path-manager uses the 'subflow'
  endpoints to create subflows only to the initial address and port. The
  first patch fixes that (for >=v5.14), and the second one validates it.

- Patches 3-14: various patches refactoring the code around the
  in-kernel PM (mainly): split overly long functions, rename variables
  and functions to avoid confusion, reduce structure size, and compare
  IDs instead of IP addresses. Note that one patch modifies one internal
  variable used in one BPF selftest.

- Patch 15: ability to control endpoints that are used in reaction to a
  new address announced by the other peer. With that, endpoints can be
  used only once.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
Notes:
 - Patches 1 & 2 are sent to net-next on purpose: to delay the backports
   a bit, just in case, and, because we are at the end of a cycle, to
   avoid delaying the other refactoring patches.
 - Sorry, I wanted to send this series earlier on, but due to some
   unrelated issues (and holiday), it got delayed. Most patches are
   pure refactoring ones.

---
Matthieu Baerts (NGI0) (15):
      mptcp: pm: in-kernel: usable client side with C-flag
      selftests: mptcp: join: validate C-flag + def limit
      mptcp: pm: in-kernel: refactor fill_local_addresses_vec
      mptcp: pm: in-kernel: refactor fill_remote_addresses_vec
      mptcp: pm: rename 'subflows' to 'extra_subflows'
      mptcp: pm: in-kernel: rename 'subflows_max' to 'limit_extra_subflows'
      mptcp: pm: in-kernel: rename 'add_addr_signal_max' to 'endp_signal_max'
      mptcp: pm: in-kernel: rename 'add_addr_accept_max' to 'limit_add_addr_accepted'
      mptcp: pm: in-kernel: rename 'local_addr_max' to 'endp_subflow_max'
      mptcp: pm: in-kernel: rename 'local_addr_list' to 'endp_list'
      mptcp: pm: in-kernel: rename 'addrs' to 'endpoints'
      mptcp: pm: in-kernel: remove stale_loss_cnt
      mptcp: pm: in-kernel: reduce pernet struct size
      mptcp: pm: in-kernel: compare IDs instead of addresses
      mptcp: pm: in-kernel: add laminar endpoints

 include/uapi/linux/mptcp.h                        |  11 +-
 net/mptcp/pm.c                                    |  32 +-
 net/mptcp/pm_kernel.c                             | 569 ++++++++++++++--------
 net/mptcp/pm_userspace.c                          |   2 +-
 net/mptcp/protocol.h                              |  21 +-
 net/mptcp/sockopt.c                               |  22 +-
 tools/testing/selftests/bpf/progs/mptcp_subflow.c |   2 +-
 tools/testing/selftests/net/mptcp/mptcp_join.sh   |  11 +
 8 files changed, 441 insertions(+), 229 deletions(-)
---
base-commit: a1f1f2422e098485b09e55a492de05cf97f9954d
change-id: 20250925-net-next-mptcp-c-flag-laminar-f8442e4d4bd9

Best regards,
-- 
Matthieu Baerts (NGI0) <matttbe@kernel.org>



* [PATCH net-next 01/15] mptcp: pm: in-kernel: usable client side with C-flag
@ 2025-09-25 10:32 ` Matthieu Baerts (NGI0)
From: Matthieu Baerts (NGI0) @ 2025-09-25 10:32 UTC
  To: Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan,
	Martin KaFai Lau
  Cc: netdev, mptcp, linux-kernel, linux-kselftest,
	Matthieu Baerts (NGI0), stable

When servers set the C-flag in their MP_CAPABLE to tell clients not to
create subflows to the initial address and port, clients will likely not
use their other endpoints. That's because the in-kernel path-manager
uses the 'subflow' endpoints to create subflows only to the initial
address and port.

If the limits have not been modified to accept ADD_ADDR, the client
doesn't try to establish new subflows. If the limits accept ADD_ADDR,
the routing rules will be used to select the source IP.

The C-flag is typically set when the server is operating behind a legacy
Layer 4 load balancer, or using an anycast IP address. Clients that have
their different 'subflow' endpoints set up then don't end up creating
multiple subflows as expected, which causes some deployment issues.

A special case is then added here: when servers set the C-flag in the
MPC and directly send an ADD_ADDR, this single ADD_ADDR is accepted.
The 'subflow' endpoints will then be used with this new remote IP and
port. This exception is only allowed when the ADD_ADDR is sent
immediately after the 3WHS, and makes the client switch to the 'fully
established' mode. After that, 'select_local_address()' will not be able
to find any subflows, because 'id_avail_bitmap' will be filled in
mptcp_pm_create_subflow_or_signal_addr(), when switching to 'fully
established' mode.
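
For readers who want to see when the exception applies, here is a
minimal userspace sketch of the four conditions checked by
mptcp_pm_add_addr_c_flag_case() in the diff below. The 'pm_model'
struct and the 'c_flag_case()' name are hypothetical stand-ins for
illustration only; the real helper reads these values from the msk and
the per-netns limits.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few PM values the check reads. */
struct pm_model {
	bool remote_deny_join_id0;        /* peer set the C-flag in MP_CAPABLE */
	unsigned int local_addr_used;     /* 'subflow' endpoints already used */
	unsigned int add_addr_accept_max; /* limit on accepted ADD_ADDR (default: 0) */
	unsigned int subflows;            /* extra subflows already created */
	unsigned int subflows_max;        /* limit on extra subflows */
};

/* Same four conditions as in the patch, on the simplified model. */
static bool c_flag_case(const struct pm_model *pm)
{
	return pm->remote_deny_join_id0 &&
	       pm->local_addr_used == 0 &&
	       pm->add_addr_accept_max == 0 &&
	       pm->subflows < pm->subflows_max;
}
```

In this model, the exception stops applying as soon as one local
endpoint has been used, which is what restricts it to a single
ADD_ADDR received right after the 3WHS.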

Fixes: df377be38725 ("mptcp: add deny_join_id0 in mptcp_options_received")
Cc: stable@vger.kernel.org
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/536
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 net/mptcp/pm.c        |  7 +++++--
 net/mptcp/pm_kernel.c | 50 +++++++++++++++++++++++++++++++++++++++++++++++++-
 net/mptcp/protocol.h  |  8 ++++++++
 3 files changed, 62 insertions(+), 3 deletions(-)

diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index 204e1f61212e2be77a8476f024b59be67d04b80a..584cab90aa6eff4c01cdf4ca4d3dce8894829920 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -637,9 +637,12 @@ void mptcp_pm_add_addr_received(const struct sock *ssk,
 		} else {
 			__MPTCP_INC_STATS(sock_net((struct sock *)msk), MPTCP_MIB_ADDADDRDROP);
 		}
-	/* id0 should not have a different address */
+	/* - id0 should not have a different address
+	 * - special case for C-flag: linked to fill_local_addresses_vec()
+	 */
 	} else if ((addr->id == 0 && !mptcp_pm_is_init_remote_addr(msk, addr)) ||
-		   (addr->id > 0 && !READ_ONCE(pm->accept_addr))) {
+		   (addr->id > 0 && !READ_ONCE(pm->accept_addr) &&
+		    !mptcp_pm_add_addr_c_flag_case(msk))) {
 		mptcp_pm_announce_addr(msk, addr, true);
 		mptcp_pm_add_addr_send_ack(msk);
 	} else if (mptcp_pm_schedule_work(msk, MPTCP_PM_ADD_ADDR_RECEIVED)) {
diff --git a/net/mptcp/pm_kernel.c b/net/mptcp/pm_kernel.c
index 667803d72b643a0bb98365003b136c53f2a9a975..8c46493a0835b0e2d5e70950662ae6e845393777 100644
--- a/net/mptcp/pm_kernel.c
+++ b/net/mptcp/pm_kernel.c
@@ -389,10 +389,12 @@ static unsigned int fill_local_addresses_vec(struct mptcp_sock *msk,
 	struct mptcp_addr_info mpc_addr;
 	struct pm_nl_pernet *pernet;
 	unsigned int subflows_max;
+	bool c_flag_case;
 	int i = 0;
 
 	pernet = pm_nl_get_pernet_from_msk(msk);
 	subflows_max = mptcp_pm_get_subflows_max(msk);
+	c_flag_case = remote->id && mptcp_pm_add_addr_c_flag_case(msk);
 
 	mptcp_local_address((struct sock_common *)msk, &mpc_addr);
 
@@ -405,12 +407,27 @@ static unsigned int fill_local_addresses_vec(struct mptcp_sock *msk,
 			continue;
 
 		if (msk->pm.subflows < subflows_max) {
+			bool is_id0;
+
 			locals[i].addr = entry->addr;
 			locals[i].flags = entry->flags;
 			locals[i].ifindex = entry->ifindex;
 
+			is_id0 = mptcp_addresses_equal(&locals[i].addr,
+						       &mpc_addr,
+						       locals[i].addr.port);
+
+			if (c_flag_case &&
+			    (entry->flags & MPTCP_PM_ADDR_FLAG_SUBFLOW)) {
+				__clear_bit(locals[i].addr.id,
+					    msk->pm.id_avail_bitmap);
+
+				if (!is_id0)
+					msk->pm.local_addr_used++;
+			}
+
 			/* Special case for ID0: set the correct ID */
-			if (mptcp_addresses_equal(&locals[i].addr, &mpc_addr, locals[i].addr.port))
+			if (is_id0)
 				locals[i].addr.id = 0;
 
 			msk->pm.subflows++;
@@ -419,6 +436,37 @@ static unsigned int fill_local_addresses_vec(struct mptcp_sock *msk,
 	}
 	rcu_read_unlock();
 
+	/* Special case: peer sets the C flag, accept one ADD_ADDR if default
+	 * limits are used -- accepting no ADD_ADDR -- and use subflow endpoints
+	 */
+	if (!i && c_flag_case) {
+		unsigned int local_addr_max = mptcp_pm_get_local_addr_max(msk);
+
+		while (msk->pm.local_addr_used < local_addr_max &&
+		       msk->pm.subflows < subflows_max) {
+			struct mptcp_pm_local *local = &locals[i];
+
+			if (!select_local_address(pernet, msk, local))
+				break;
+
+			__clear_bit(local->addr.id, msk->pm.id_avail_bitmap);
+
+			if (!mptcp_pm_addr_families_match(sk, &local->addr,
+							  remote))
+				continue;
+
+			if (mptcp_addresses_equal(&local->addr, &mpc_addr,
+						  local->addr.port))
+				continue;
+
+			msk->pm.local_addr_used++;
+			msk->pm.subflows++;
+			i++;
+		}
+
+		return i;
+	}
+
 	/* If the array is empty, fill in the single
 	 * 'IPADDRANY' local address
 	 */
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index a1787a1344ac1bbeefdb4548740d6aef980b79e7..cbe54331e5c745989af50409d9cb79c6d90a8201 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -1199,6 +1199,14 @@ static inline void mptcp_pm_close_subflow(struct mptcp_sock *msk)
 	spin_unlock_bh(&msk->pm.lock);
 }
 
+static inline bool mptcp_pm_add_addr_c_flag_case(struct mptcp_sock *msk)
+{
+	return READ_ONCE(msk->pm.remote_deny_join_id0) &&
+	       msk->pm.local_addr_used == 0 &&
+	       mptcp_pm_get_add_addr_accept_max(msk) == 0 &&
+	       msk->pm.subflows < mptcp_pm_get_subflows_max(msk);
+}
+
 void mptcp_sockopt_sync_locked(struct mptcp_sock *msk, struct sock *ssk);
 
 static inline struct mptcp_ext *mptcp_get_ext(const struct sk_buff *skb)

-- 
2.51.0



* [PATCH net-next 02/15] selftests: mptcp: join: validate C-flag + def limit
@ 2025-09-25 10:32 ` Matthieu Baerts (NGI0)
From: Matthieu Baerts (NGI0) @ 2025-09-25 10:32 UTC
  To: Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan,
	Martin KaFai Lau
  Cc: netdev, mptcp, linux-kernel, linux-kselftest,
	Matthieu Baerts (NGI0), stable

The previous commit adds an exception for the C-flag case. The
'mptcp_join.sh' selftest is extended to validate this case.

In this subtest, there is a typical CDN deployment with a client where
MPTCP endpoints have been 'automatically' configured:

- the server sets net.mptcp.allow_join_initial_addr_port=0

- the client has multiple 'subflow' endpoints, and the default limits:
  not accepting ADD_ADDRs.

Without the parent patch, the client is not able to establish new
subflows using its 'subflow' endpoints. The parent commit fixes that.
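
As a rough model of why the new subtest expects two joins
(chk_join_nr 2 2 2), the sketch below walks a list of 'subflow'
endpoints the way the special-case loop of the parent patch does: each
endpoint is tried once, the one matching the initial subflow is
skipped, and the extra subflows limit is respected. The 'endp_model'
struct and the function name are hypothetical, and the model
deliberately ignores the address family check and the 'local_addr_max'
limit of the real code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a 'subflow' endpoint. */
struct endp_model {
	unsigned int id;
	bool is_initial; /* same address as the initial subflow (ID 0) */
	bool used;       /* models a cleared bit in id_avail_bitmap */
};

/* Count the extra subflows created towards the single accepted ADD_ADDR. */
static unsigned int count_c_flag_subflows(struct endp_model *endps, size_t n,
					  unsigned int subflows_max)
{
	unsigned int subflows = 0;
	size_t i;

	for (i = 0; i < n && subflows < subflows_max; i++) {
		if (endps[i].used)
			continue;

		/* each endpoint is only tried once */
		endps[i].used = true;

		/* the initial address already carries the first subflow */
		if (endps[i].is_initial)
			continue;

		subflows++;
	}

	return subflows;
}
```

With two unused endpoints and a limit of two extra subflows, as
configured in the subtest, this model returns 2. A second call on the
same endpoints returns 0, because they have all been marked as used.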

The 'Fixes' tag here below is the same as the one from the previous
commit: this patch here is not fixing anything wrong in the selftests,
but it validates the previous fix for an issue introduced by this commit
ID.

Fixes: df377be38725 ("mptcp: add deny_join_id0 in mptcp_options_received")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 tools/testing/selftests/net/mptcp/mptcp_join.sh | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
index 6055ee5762e13108e5e2924a0e77d58da584d008..a94b3960ad5e009dbead66b6ff2aa01f70aa3e1f 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -3306,6 +3306,17 @@ deny_join_id0_tests()
 		run_tests $ns1 $ns2 10.0.1.1
 		chk_join_nr 1 1 1
 	fi
+
+	# default limits, server deny join id 0 + signal
+	if reset_with_allow_join_id0 "default limits, server deny join id 0" 0 1; then
+		pm_nl_set_limits $ns1 0 2
+		pm_nl_set_limits $ns2 0 2
+		pm_nl_add_endpoint $ns1 10.0.2.1 flags signal
+		pm_nl_add_endpoint $ns2 10.0.3.2 flags subflow
+		pm_nl_add_endpoint $ns2 10.0.4.2 flags subflow
+		run_tests $ns1 $ns2 10.0.1.1
+		chk_join_nr 2 2 2
+	fi
 }
 
 fullmesh_tests()

-- 
2.51.0



* Re: [PATCH net-next 00/15] mptcp: pm: special case for c-flag + laminar endp
@ 2025-09-27  1:20 ` patchwork-bot+netdevbpf
From: patchwork-bot+netdevbpf @ 2025-09-27  1:20 UTC
  To: Matthieu Baerts
  Cc: martineau, geliang, davem, edumazet, kuba, pabeni, horms, shuah,
	martin.lau, netdev, mptcp, linux-kernel, linux-kselftest, stable

Hello:

This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Thu, 25 Sep 2025 12:32:35 +0200 you wrote:
> Here are some patches for the MPTCP PM, including some refactoring that
> I thought it would be best to send at the end of a cycle to avoid
> conflicts between net and net-next that could last a few weeks.
> 
> The most interesting changes are in the first and last patch, the rest
> are patches refactoring the code & tests to validate the modifications.
> 
> [...]

Here is the summary with links:
  - [net-next,01/15] mptcp: pm: in-kernel: usable client side with C-flag
    https://git.kernel.org/netdev/net-next/c/4b1ff850e0c1
  - [net-next,02/15] selftests: mptcp: join: validate C-flag + def limit
    https://git.kernel.org/netdev/net-next/c/008385efd05e
  - [net-next,03/15] mptcp: pm: in-kernel: refactor fill_local_addresses_vec
    https://git.kernel.org/netdev/net-next/c/8dc63ade451d
  - [net-next,04/15] mptcp: pm: in-kernel: refactor fill_remote_addresses_vec
    https://git.kernel.org/netdev/net-next/c/a845b2bbf26e
  - [net-next,05/15] mptcp: pm: rename 'subflows' to 'extra_subflows'
    https://git.kernel.org/netdev/net-next/c/c5273f6ca166
  - [net-next,06/15] mptcp: pm: in-kernel: rename 'subflows_max' to 'limit_extra_subflows'
    https://git.kernel.org/netdev/net-next/c/3eb3c9a9596a
  - [net-next,07/15] mptcp: pm: in-kernel: rename 'add_addr_signal_max' to 'endp_signal_max'
    https://git.kernel.org/netdev/net-next/c/45cae570664d
  - [net-next,08/15] mptcp: pm: in-kernel: rename 'add_addr_accept_max' to 'limit_add_addr_accepted'
    https://git.kernel.org/netdev/net-next/c/37712d84dfc2
  - [net-next,09/15] mptcp: pm: in-kernel: rename 'local_addr_max' to 'endp_subflow_max'
    https://git.kernel.org/netdev/net-next/c/e7757b6d3a62
  - [net-next,10/15] mptcp: pm: in-kernel: rename 'local_addr_list' to 'endp_list'
    https://git.kernel.org/netdev/net-next/c/35e71e43a56d
  - [net-next,11/15] mptcp: pm: in-kernel: rename 'addrs' to 'endpoints'
    https://git.kernel.org/netdev/net-next/c/e9aa044f4a1f
  - [net-next,12/15] mptcp: pm: in-kernel: remove stale_loss_cnt
    https://git.kernel.org/netdev/net-next/c/db9a0e3858ba
  - [net-next,13/15] mptcp: pm: in-kernel: reduce pernet struct size
    https://git.kernel.org/netdev/net-next/c/4984fe6254f8
  - [net-next,14/15] mptcp: pm: in-kernel: compare IDs instead of addresses
    https://git.kernel.org/netdev/net-next/c/f596293314b2
  - [net-next,15/15] mptcp: pm: in-kernel: add laminar endpoints
    https://git.kernel.org/netdev/net-next/c/539f6b9de39e

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


