All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup
@ 2026-04-22 16:40 Matthieu Baerts (NGI0)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 01/19] mptcp: pm: kernel: correctly retransmit ADD_ADDR ID 0 Matthieu Baerts (NGI0)
                   ` (20 more replies)
  0 siblings, 21 replies; 23+ messages in thread
From: Matthieu Baerts (NGI0) @ 2026-04-22 16:40 UTC (permalink / raw)
  To: MPTCP Upstream; +Cc: Matthieu Baerts (NGI0)

The initial intension was to increase the limits, but some fixes were
needed, then when looking at the code around, other fixes had to be
added too. So now I ended up with multiple fixes (including one early
exit while at it), followed by patches increasing limits with some
adaptations required before, some to validate the modifications, and one
small improvement in the selftests.

The patches with a Fixes tag are for -net, the rest for net-next.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
Changes in v6:
- patch 4: fix label name in commit message + mention next patch.
- patch 7: use min() to improve clarity (Mat)
- patch 15: fix typo.
- patch 20: removed: will be sent in another series after that one.
- Link to v5: https://patch.msgid.link/20260415-mptcp-inc-limits-v5-0-e54c3bf80e4e@kernel.org

Changes in v5:
- patch 5: only set timer_done to true in the timer handler: safer.
- patch 9: fix typo in commit message.
- patch 10: 3rd check expects error, better err msg, rename var.
- Link to v4: https://patch.msgid.link/20260414-mptcp-inc-limits-v4-0-78274edda5dd@kernel.org

Changes in v4:
- patch 5: reset timer_done when the timer is reset.
- patch 10: new: selftests: correctly catch error in 'check output'.
- patch 16: add example with init_shapers to silence AI reviews
- Patch 18: clearer way to check errors (without buggy '&&')
- Link to v3: https://patch.msgid.link/20260413-mptcp-inc-limits-v3-0-dd36c9360432@kernel.org

Changes in v3:
- patch 1: new: retransmit ADD_ADDR for ID 0 with right ID.
- patch 5: moved below + support calling sk_free().
- patch 13: new: allow flushing more than 8 endpoints.
- patch 12: clarify single batch of 8 subflows as known limit.
- patch 16: fix typo in comment.
- Link to v2: https://patch.msgid.link/20260410-mptcp-inc-limits-v2-0-5402209f05d3@kernel.org

Changes in v2:
- patch 2: new: fix potential data-race.
- re-order patches 1 to 5 and use shorter prefix (ADD_ADDR rtx).
- patch 6: restore accidentally deleted icsk->icsk_rto > max check.
- patch 16: fix already present checkpatch warning.
- Link to v1: https://patch.msgid.link/20260409-mptcp-inc-limits-v1-0-0e45fa30d914@kernel.org

---
Matthieu Baerts (NGI0) (19):
      mptcp: pm: kernel: correctly retransmit ADD_ADDR ID 0
      mptcp: pm: ADD_ADDR rtx: fix potential data-race
      mptcp: pm: ADD_ADDR rtx: allow ID 0
      mptcp: pm: ADD_ADDR rtx: always decrease sk refcount
      mptcp: pm: ADD_ADDR rtx: free sk if last
      mptcp: pm: ADD_ADDR rtx: resched blocked ADD_ADDR quicker
      mptcp: pm: ADD_ADDR rtx: skip inactive subflows
      mptcp: pm: retrans ADD_ADDR: return early if no retrans
      mptcp: pm: prio: skip closed subflows
      selftests: mptcp: check output: catch cmd errors
      selftests: mptcp: pm: restrict 'unknown' check to pm_nl_ctl
      mptcp: pm: in-kernel: explicitly limit batches to array size
      mptcp: pm: in-kernel: increase all limits to 64
      mptcp: pm: kernel: allow flushing more than 8 endpoints
      mptcp: pm: in-kernel: increase endpoints limit
      selftests: mptcp: join: allow changing ifaces nr per test
      selftests: mptcp: join: validate 8x8 subflows
      selftests: mptcp: pm: validate new limits
      selftests: mptcp: pm: use simpler send/recv forms

 net/mptcp/pm.c                                  | 62 +++++++++++------
 net/mptcp/pm_kernel.c                           | 90 ++++++++++++++++---------
 tools/testing/selftests/net/mptcp/mptcp_join.sh | 33 ++++++++-
 tools/testing/selftests/net/mptcp/mptcp_lib.sh  | 16 +++--
 tools/testing/selftests/net/mptcp/pm_netlink.sh | 74 ++++++++++++--------
 tools/testing/selftests/net/mptcp/pm_nl_ctl.c   |  8 +--
 6 files changed, 193 insertions(+), 90 deletions(-)
---
base-commit: b3fa12c3ddab70d570c42071efb0edd5dad07060
change-id: 20260403-mptcp-inc-limits-ce9811024066

Best regards,
--  
Matthieu Baerts (NGI0) <matttbe@kernel.org>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH mptcp-net v6 01/19] mptcp: pm: kernel: correctly retransmit ADD_ADDR ID 0
  2026-04-22 16:40 [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup Matthieu Baerts (NGI0)
@ 2026-04-22 16:40 ` Matthieu Baerts (NGI0)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 02/19] mptcp: pm: ADD_ADDR rtx: fix potential data-race Matthieu Baerts (NGI0)
                   ` (19 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: Matthieu Baerts (NGI0) @ 2026-04-22 16:40 UTC (permalink / raw)
  To: MPTCP Upstream; +Cc: Matthieu Baerts (NGI0)

When adding the ADD_ADDR to the list, the address including the IP, port
and ID are copied. On the other hand, when the endpoint corresponds to
the one from the initial subflow, the ID is set to 0, as specified by
the MPTCP protocol.

The issue is that the ID was reset after having copied the ID in the
ADD_ADDR entry. So the retransmission was done, but using a different ID
than the initial one.

Fixes: 8b8ed1b429f8 ("mptcp: pm: reuse ID 0 after delete and re-add")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 net/mptcp/pm_kernel.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/net/mptcp/pm_kernel.c b/net/mptcp/pm_kernel.c
index 7895fb5f982e..906a1d5ab9ed 100644
--- a/net/mptcp/pm_kernel.c
+++ b/net/mptcp/pm_kernel.c
@@ -347,6 +347,8 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk)
 
 	/* check first for announce */
 	if (msk->pm.add_addr_signaled < endp_signal_max) {
+		u8 endp_id;
+
 		/* due to racing events on both ends we can reach here while
 		 * previous add address is still running: if we invoke now
 		 * mptcp_pm_announce_addr(), that will fail and the
@@ -360,19 +362,20 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk)
 		if (!select_signal_address(pernet, msk, &local))
 			goto subflow;
 
+		/* Special case for ID0: set the correct ID */
+		endp_id = local.addr.id;
+		if (endp_id == msk->mpc_endpoint_id)
+			local.addr.id = 0;
+
 		/* If the alloc fails, we are on memory pressure, not worth
 		 * continuing, and trying to create subflows.
 		 */
 		if (!mptcp_pm_alloc_anno_list(msk, &local.addr))
 			return;
 
-		__clear_bit(local.addr.id, msk->pm.id_avail_bitmap);
+		__clear_bit(endp_id, msk->pm.id_avail_bitmap);
 		msk->pm.add_addr_signaled++;
 
-		/* Special case for ID0: set the correct ID */
-		if (local.addr.id == msk->mpc_endpoint_id)
-			local.addr.id = 0;
-
 		mptcp_pm_announce_addr(msk, &local.addr, false);
 		mptcp_pm_addr_send_ack(msk);
 

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH mptcp-net v6 02/19] mptcp: pm: ADD_ADDR rtx: fix potential data-race
  2026-04-22 16:40 [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup Matthieu Baerts (NGI0)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 01/19] mptcp: pm: kernel: correctly retransmit ADD_ADDR ID 0 Matthieu Baerts (NGI0)
@ 2026-04-22 16:40 ` Matthieu Baerts (NGI0)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 03/19] mptcp: pm: ADD_ADDR rtx: allow ID 0 Matthieu Baerts (NGI0)
                   ` (18 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: Matthieu Baerts (NGI0) @ 2026-04-22 16:40 UTC (permalink / raw)
  To: MPTCP Upstream; +Cc: Matthieu Baerts (NGI0)

This mptcp_pm_add_timer() helper is executed as a timer callback in
softirq context. To avoid any data races, the socket lock needs to be
held with bh_lock_sock().

If the socket is in use, retry again soon after, similar to what is done
with the keepalive timer.

Fixes: 00cfd77b9063 ("mptcp: retransmit ADD_ADDR when timeout")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 net/mptcp/pm.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index fdadd0e8f77c..43a11b65235f 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -340,6 +340,13 @@ static void mptcp_pm_add_timer(struct timer_list *timer)
 	if (!entry->addr.id)
 		return;
 
+	bh_lock_sock(sk);
+	if (sock_owned_by_user(sk)) {
+		/* Try again later. */
+		sk_reset_timer(sk, timer, jiffies + HZ / 20);
+		goto out;
+	}
+
 	if (mptcp_pm_should_add_signal_addr(msk)) {
 		sk_reset_timer(sk, timer, jiffies + TCP_RTO_MAX / 8);
 		goto out;
@@ -368,6 +375,7 @@ static void mptcp_pm_add_timer(struct timer_list *timer)
 		mptcp_pm_subflow_established(msk);
 
 out:
+	bh_unlock_sock(sk);
 	__sock_put(sk);
 }
 

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH mptcp-net v6 03/19] mptcp: pm: ADD_ADDR rtx: allow ID 0
  2026-04-22 16:40 [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup Matthieu Baerts (NGI0)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 01/19] mptcp: pm: kernel: correctly retransmit ADD_ADDR ID 0 Matthieu Baerts (NGI0)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 02/19] mptcp: pm: ADD_ADDR rtx: fix potential data-race Matthieu Baerts (NGI0)
@ 2026-04-22 16:40 ` Matthieu Baerts (NGI0)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 04/19] mptcp: pm: ADD_ADDR rtx: always decrease sk refcount Matthieu Baerts (NGI0)
                   ` (17 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: Matthieu Baerts (NGI0) @ 2026-04-22 16:40 UTC (permalink / raw)
  To: MPTCP Upstream; +Cc: Matthieu Baerts (NGI0)

ADD_ADDR can be sent for the ID 0, which corresponds to the local
address and port linked to the initial subflow.

Indeed, this address could be removed, and re-added later on, e.g. what
is done in the "delete re-add signal" MPTCP Join selftests. So no reason
to ignore it.

Fixes: 00cfd77b9063 ("mptcp: retransmit ADD_ADDR when timeout")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 net/mptcp/pm.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index 43a11b65235f..6a75470dcf5e 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -337,9 +337,6 @@ static void mptcp_pm_add_timer(struct timer_list *timer)
 	if (inet_sk_state_load(sk) == TCP_CLOSE)
 		return;
 
-	if (!entry->addr.id)
-		return;
-
 	bh_lock_sock(sk);
 	if (sock_owned_by_user(sk)) {
 		/* Try again later. */

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH mptcp-net v6 04/19] mptcp: pm: ADD_ADDR rtx: always decrease sk refcount
  2026-04-22 16:40 [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup Matthieu Baerts (NGI0)
                   ` (2 preceding siblings ...)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 03/19] mptcp: pm: ADD_ADDR rtx: allow ID 0 Matthieu Baerts (NGI0)
@ 2026-04-22 16:40 ` Matthieu Baerts (NGI0)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 05/19] mptcp: pm: ADD_ADDR rtx: free sk if last Matthieu Baerts (NGI0)
                   ` (16 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: Matthieu Baerts (NGI0) @ 2026-04-22 16:40 UTC (permalink / raw)
  To: MPTCP Upstream; +Cc: Matthieu Baerts (NGI0)

When an ADD_ADDR is retransmitted, the sk is held in sk_reset_timer().
It should then be released in all cases at the end.

Some (unlikely) checks were returning directly instead of calling
sock_put() to decrease the refcount. Jump to a new 'exit' label to call
__sock_put() (which will become sock_put() in the next commit) to fix
this potential leak.

While at it, drop the '!msk' check which cannot happen because it is
never reset, and explicitly mark the remaining one as "unlikely".

Fixes: 00cfd77b9063 ("mptcp: retransmit ADD_ADDR when timeout")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
v3: remove '!msk' check: cannot be true.
---
 net/mptcp/pm.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index 6a75470dcf5e..d3fcf441b208 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -331,11 +331,8 @@ static void mptcp_pm_add_timer(struct timer_list *timer)
 
 	pr_debug("msk=%p\n", msk);
 
-	if (!msk)
-		return;
-
-	if (inet_sk_state_load(sk) == TCP_CLOSE)
-		return;
+	if (unlikely(inet_sk_state_load(sk) == TCP_CLOSE))
+		goto exit;
 
 	bh_lock_sock(sk);
 	if (sock_owned_by_user(sk)) {
@@ -373,6 +370,7 @@ static void mptcp_pm_add_timer(struct timer_list *timer)
 
 out:
 	bh_unlock_sock(sk);
+exit:
 	__sock_put(sk);
 }
 

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH mptcp-net v6 05/19] mptcp: pm: ADD_ADDR rtx: free sk if last
  2026-04-22 16:40 [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup Matthieu Baerts (NGI0)
                   ` (3 preceding siblings ...)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 04/19] mptcp: pm: ADD_ADDR rtx: always decrease sk refcount Matthieu Baerts (NGI0)
@ 2026-04-22 16:40 ` Matthieu Baerts (NGI0)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 06/19] mptcp: pm: ADD_ADDR rtx: resched blocked ADD_ADDR quicker Matthieu Baerts (NGI0)
                   ` (15 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: Matthieu Baerts (NGI0) @ 2026-04-22 16:40 UTC (permalink / raw)
  To: MPTCP Upstream; +Cc: Matthieu Baerts (NGI0)

When an ADD_ADDR is retransmitted, the sk is held in sk_reset_timer(),
and released at the end.

If at that moment, it was the last reference being held, the sk would
not be freed. sock_put() should then be called instead of __sock_put().

But that's not enough: if it is the last reference, sock_put() will call
sk_free(), which will end up calling sk_stop_timer_sync() on the same
timer, and waiting indefinitely to finish. So it is needed to mark that
the timer is done at the end of the timer handler when it has not been
rescheduled, not to call sk_stop_timer_sync() on "itself".

Fixes: 00cfd77b9063 ("mptcp: retransmit ADD_ADDR when timeout")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
v3: support calling sk_free() from the timer handler. Note: I'm not very
    happy with this patch, it looks too big. Did I miss a simpler way?
v4: init timer_done after 'reset_timer' label to handle cases where the
    sysctl is changed in between.
v5: only set timer_done to true in the timer handler, safer and easier.
---
 net/mptcp/pm.c | 30 +++++++++++++++++++-----------
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index d3fcf441b208..0ff35f49d451 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -16,6 +16,7 @@ struct mptcp_pm_add_entry {
 	struct list_head	list;
 	struct mptcp_addr_info	addr;
 	u8			retrans_times;
+	bool			timer_done;
 	struct timer_list	add_timer;
 	struct mptcp_sock	*sock;
 	struct rcu_head		rcu;
@@ -327,22 +328,22 @@ static void mptcp_pm_add_timer(struct timer_list *timer)
 							      add_timer);
 	struct mptcp_sock *msk = entry->sock;
 	struct sock *sk = (struct sock *)msk;
-	unsigned int timeout;
+	unsigned int timeout = 0;
 
 	pr_debug("msk=%p\n", msk);
 
-	if (unlikely(inet_sk_state_load(sk) == TCP_CLOSE))
-		goto exit;
-
 	bh_lock_sock(sk);
+	if (unlikely(inet_sk_state_load(sk) == TCP_CLOSE))
+		goto out;
+
 	if (sock_owned_by_user(sk)) {
 		/* Try again later. */
-		sk_reset_timer(sk, timer, jiffies + HZ / 20);
+		timeout = HZ / 20;
 		goto out;
 	}
 
 	if (mptcp_pm_should_add_signal_addr(msk)) {
-		sk_reset_timer(sk, timer, jiffies + TCP_RTO_MAX / 8);
+		timeout = TCP_RTO_MAX / 8;
 		goto out;
 	}
 
@@ -360,8 +361,9 @@ static void mptcp_pm_add_timer(struct timer_list *timer)
 	}
 
 	if (entry->retrans_times < ADD_ADDR_RETRANS_MAX)
-		sk_reset_timer(sk, timer,
-			       jiffies + (timeout << entry->retrans_times));
+		timeout <<= entry->retrans_times;
+	else
+		timeout = 0;
 
 	spin_unlock_bh(&msk->pm.lock);
 
@@ -369,9 +371,13 @@ static void mptcp_pm_add_timer(struct timer_list *timer)
 		mptcp_pm_subflow_established(msk);
 
 out:
+	if (timeout)
+		sk_reset_timer(sk, timer, jiffies + timeout);
+	else
+		/* if sock_put calls sk_free: avoid waiting for this timer */
+		entry->timer_done = true;
 	bh_unlock_sock(sk);
-exit:
-	__sock_put(sk);
+	sock_put(sk);
 }
 
 struct mptcp_pm_add_entry *
@@ -434,6 +440,7 @@ bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk,
 
 	timer_setup(&add_entry->add_timer, mptcp_pm_add_timer, 0);
 reset_timer:
+	add_entry->timer_done = false;
 	timeout = mptcp_adjust_add_addr_timeout(msk);
 	if (timeout)
 		sk_reset_timer(sk, &add_entry->add_timer, jiffies + timeout);
@@ -454,7 +461,8 @@ static void mptcp_pm_free_anno_list(struct mptcp_sock *msk)
 	spin_unlock_bh(&msk->pm.lock);
 
 	list_for_each_entry_safe(entry, tmp, &free_list, list) {
-		sk_stop_timer_sync(sk, &entry->add_timer);
+		if (!entry->timer_done)
+			sk_stop_timer_sync(sk, &entry->add_timer);
 		kfree_rcu(entry, rcu);
 	}
 }

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH mptcp-net v6 06/19] mptcp: pm: ADD_ADDR rtx: resched blocked ADD_ADDR quicker
  2026-04-22 16:40 [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup Matthieu Baerts (NGI0)
                   ` (4 preceding siblings ...)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 05/19] mptcp: pm: ADD_ADDR rtx: free sk if last Matthieu Baerts (NGI0)
@ 2026-04-22 16:40 ` Matthieu Baerts (NGI0)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 07/19] mptcp: pm: ADD_ADDR rtx: skip inactive subflows Matthieu Baerts (NGI0)
                   ` (14 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: Matthieu Baerts (NGI0) @ 2026-04-22 16:40 UTC (permalink / raw)
  To: MPTCP Upstream; +Cc: Matthieu Baerts (NGI0)

When an ADD_ADDR needs to be retransmitted and another one has already
been prepared -- e.g. multiple ADD_ADDRs have been sent in a row and
need to be retransmitted later -- this additional retransmission will
need to wait.

In this case, the timer was reset to TCP_RTO_MAX / 8, which is ~15
seconds. This delay is unnecessary long: it should just be rescheduled
at the next opportunity, e.g. after the retransmission timeout.

Without this modification, some issues can be seen from time to time in
the selftests when multiple ADD_ADDRs are sent, and the host takes time
to process them, e.g. the "signal addresses, ADD_ADDR timeout" MPTCP
Join selftest, especially with a debug kernel config.

Note that on older kernels, 'timeout' is not available. It should be
enough to replace it by one second (HZ).

Fixes: 00cfd77b9063 ("mptcp: retransmit ADD_ADDR when timeout")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 net/mptcp/pm.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index 0ff35f49d451..bc5168831085 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -342,13 +342,8 @@ static void mptcp_pm_add_timer(struct timer_list *timer)
 		goto out;
 	}
 
-	if (mptcp_pm_should_add_signal_addr(msk)) {
-		timeout = TCP_RTO_MAX / 8;
-		goto out;
-	}
-
 	timeout = mptcp_adjust_add_addr_timeout(msk);
-	if (!timeout)
+	if (!timeout || mptcp_pm_should_add_signal_addr(msk))
 		goto out;
 
 	spin_lock_bh(&msk->pm.lock);

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH mptcp-net v6 07/19] mptcp: pm: ADD_ADDR rtx: skip inactive subflows
  2026-04-22 16:40 [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup Matthieu Baerts (NGI0)
                   ` (5 preceding siblings ...)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 06/19] mptcp: pm: ADD_ADDR rtx: resched blocked ADD_ADDR quicker Matthieu Baerts (NGI0)
@ 2026-04-22 16:40 ` Matthieu Baerts (NGI0)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 08/19] mptcp: pm: retrans ADD_ADDR: return early if no retrans Matthieu Baerts (NGI0)
                   ` (13 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: Matthieu Baerts (NGI0) @ 2026-04-22 16:40 UTC (permalink / raw)
  To: MPTCP Upstream; +Cc: Matthieu Baerts (NGI0)

When looking at the maximum RTO amongst the subflows, inactive subflows
were taken into account: that includes stale ones, and the initial one
if it has been already been closed.

Unusable subflows are now simply skipped. Stale ones are used as an
alternative: if there are only stale ones, to take their maximum RTO and
avoid to eventually fallback to net.mptcp.add_addr_timeout, which is set
to 2 minutes by default.

Fixes: 30549eebc4d8 ("mptcp: make ADD_ADDR retransmission timeout adaptive")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
v2: restore accidentally deleted icsk->icsk_rto > max check
v6: use min() (Mat)
---
 net/mptcp/pm.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index bc5168831085..94a8207e51e9 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -306,18 +306,28 @@ static unsigned int mptcp_adjust_add_addr_timeout(struct mptcp_sock *msk)
 	const struct net *net = sock_net((struct sock *)msk);
 	unsigned int rto = mptcp_get_add_addr_timeout(net);
 	struct mptcp_subflow_context *subflow;
-	unsigned int max = 0;
+	unsigned int max = 0, max_stale = 0;
 
 	mptcp_for_each_subflow(msk, subflow) {
 		struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
 		struct inet_connection_sock *icsk = inet_csk(ssk);
 
-		if (icsk->icsk_rto > max)
+		if (!__mptcp_subflow_active(subflow))
+			continue;
+
+		if (unlikely(subflow->stale)) {
+			if (icsk->icsk_rto > max_stale)
+				max_stale = icsk->icsk_rto;
+		} else if (icsk->icsk_rto > max) {
 			max = icsk->icsk_rto;
+		}
 	}
 
-	if (max && max < rto)
-		rto = max;
+	if (max)
+		return min(max, rto);
+
+	if (max_stale)
+		return min(max_stale, rto);
 
 	return rto;
 }

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH mptcp-net v6 08/19] mptcp: pm: retrans ADD_ADDR: return early if no retrans
  2026-04-22 16:40 [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup Matthieu Baerts (NGI0)
                   ` (6 preceding siblings ...)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 07/19] mptcp: pm: ADD_ADDR rtx: skip inactive subflows Matthieu Baerts (NGI0)
@ 2026-04-22 16:40 ` Matthieu Baerts (NGI0)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 09/19] mptcp: pm: prio: skip closed subflows Matthieu Baerts (NGI0)
                   ` (12 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: Matthieu Baerts (NGI0) @ 2026-04-22 16:40 UTC (permalink / raw)
  To: MPTCP Upstream; +Cc: Matthieu Baerts (NGI0)

No need to iterate over all subflows if there is no retransmission
needed.

Exit early in this case then.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 net/mptcp/pm.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index 94a8207e51e9..a75889f3dac7 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -308,6 +308,9 @@ static unsigned int mptcp_adjust_add_addr_timeout(struct mptcp_sock *msk)
 	struct mptcp_subflow_context *subflow;
 	unsigned int max = 0, max_stale = 0;
 
+	if (!rto)
+		return 0;
+
 	mptcp_for_each_subflow(msk, subflow) {
 		struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
 		struct inet_connection_sock *icsk = inet_csk(ssk);

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH mptcp-net v6 09/19] mptcp: pm: prio: skip closed subflows
  2026-04-22 16:40 [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup Matthieu Baerts (NGI0)
                   ` (7 preceding siblings ...)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 08/19] mptcp: pm: retrans ADD_ADDR: return early if no retrans Matthieu Baerts (NGI0)
@ 2026-04-22 16:40 ` Matthieu Baerts (NGI0)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 10/19] selftests: mptcp: check output: catch cmd errors Matthieu Baerts (NGI0)
                   ` (11 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: Matthieu Baerts (NGI0) @ 2026-04-22 16:40 UTC (permalink / raw)
  To: MPTCP Upstream; +Cc: Matthieu Baerts (NGI0)

When sending an MP_PRIO, closed subflows need to be skipped.

This fixes the case where the initial subflow got closed, re-opened
later, then an MP_PRIO is needed for the same local address.

Note that explicit MP_PRIO cannot be sent during the 3WHS, so it is fine
to use __mptcp_subflow_active().

Fixes: 067065422fcd ("mptcp: add the outgoing MP_PRIO support")
Fixes: b29fcfb54cd7 ("mptcp: full disconnect implementation")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 net/mptcp/pm.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index a75889f3dac7..10df7c0aef38 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -284,6 +284,9 @@ int mptcp_pm_mp_prio_send_ack(struct mptcp_sock *msk,
 		struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
 		struct mptcp_addr_info local, remote;
 
+		if (!__mptcp_subflow_active(subflow))
+			continue;
+
 		mptcp_local_address((struct sock_common *)ssk, &local);
 		if (!mptcp_addresses_equal(&local, addr, addr->port))
 			continue;

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH mptcp-net v6 10/19] selftests: mptcp: check output: catch cmd errors
  2026-04-22 16:40 [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup Matthieu Baerts (NGI0)
                   ` (8 preceding siblings ...)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 09/19] mptcp: pm: prio: skip closed subflows Matthieu Baerts (NGI0)
@ 2026-04-22 16:40 ` Matthieu Baerts (NGI0)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 11/19] selftests: mptcp: pm: restrict 'unknown' check to pm_nl_ctl Matthieu Baerts (NGI0)
                   ` (10 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: Matthieu Baerts (NGI0) @ 2026-04-22 16:40 UTC (permalink / raw)
  To: MPTCP Upstream; +Cc: Matthieu Baerts (NGI0)

Using '${?}' inside the if-statement to check the returned value from
the command that was evaluated as part of the if-statement is not
correct: here, '${?}' will be linked to the previous instruction, not
the one that is expected here (${cmd}).

Instead, simply mark the error, except if an error is expected. If
that's the case, 1 can be passed as the 4th argument of this helper.
Three checks from pm_netlink.sh expect an error.

While at it, improve the error message when the command unexpectedly
fails or succeeds.

Note that we could expect a specific returned value, but the checks
currently expecting an error can be used with 'ip mptcp' or 'pm_nl_ctl',
and these two tools don't return the same error code.

Fixes: 2d0c1d27ea4e ("selftests: mptcp: add mptcp_lib_check_output helper")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
v5:
 - improve err msg in mptcp_lib.sh, supporting unexpected success
 - s/no_err/exp_error/ in pm_netlink.sh
 - a 3rd check expects an error in pm_netlink.sh (removed later on)
---
 tools/testing/selftests/net/mptcp/mptcp_lib.sh  | 16 ++++++++++------
 tools/testing/selftests/net/mptcp/pm_netlink.sh | 10 ++++++----
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/tools/testing/selftests/net/mptcp/mptcp_lib.sh b/tools/testing/selftests/net/mptcp/mptcp_lib.sh
index 5fea7e7df628..989a5975dcea 100644
--- a/tools/testing/selftests/net/mptcp/mptcp_lib.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_lib.sh
@@ -474,20 +474,24 @@ mptcp_lib_wait_local_port_listen() {
 	wait_local_port_listen "${@}" "tcp"
 }
 
+# $1: error file, $2: cmd, $3: expected msg, [$4: expected error]
 mptcp_lib_check_output() {
 	local err="${1}"
 	local cmd="${2}"
 	local expected="${3}"
+	local exp_error="${4:-0}"
 	local cmd_ret=0
 	local out
 
-	if ! out=$(${cmd} 2>"${err}"); then
-		cmd_ret=${?}
-	fi
+	out=$(${cmd} 2>"${err}") || cmd_ret=1
 
-	if [ ${cmd_ret} -ne 0 ]; then
-		mptcp_lib_pr_fail "command execution '${cmd}' stderr"
-		cat "${err}"
+	if [ "${cmd_ret}" != "${exp_error}" ]; then
+		mptcp_lib_pr_fail "unexpected returned code for '${cmd}', info:"
+		if [ "${exp_error}" = 0 ]; then
+			cat "${err}"
+		else
+			echo "${out}"
+		fi
 		return 2
 	elif [ "${out}" = "${expected}" ]; then
 		return 0
diff --git a/tools/testing/selftests/net/mptcp/pm_netlink.sh b/tools/testing/selftests/net/mptcp/pm_netlink.sh
index 123d9d7a0278..b69f30fcb91e 100755
--- a/tools/testing/selftests/net/mptcp/pm_netlink.sh
+++ b/tools/testing/selftests/net/mptcp/pm_netlink.sh
@@ -122,10 +122,12 @@ check()
 	local cmd="$1"
 	local expected="$2"
 	local msg="$3"
+	local exp_error="$4"
 	local rc=0
 
 	mptcp_lib_print_title "$msg"
-	mptcp_lib_check_output "${err}" "${cmd}" "${expected}" || rc=${?}
+	mptcp_lib_check_output "${err}" "${cmd}" "${expected}" "${exp_error}" ||
+		rc=${?}
 	if [ ${rc} -eq 2 ]; then
 		mptcp_lib_result_fail "${msg} # error ${rc}"
 		ret=${KSFT_FAIL}
@@ -158,13 +160,13 @@ check "show_endpoints" \
 			    "3,10.0.1.3,signal backup")" "dump addrs"
 
 del_endpoint 2
-check "get_endpoint 2" "" "simple del addr"
+check "get_endpoint 2" "" "simple del addr" 1
 check "show_endpoints" \
 	"$(format_endpoints "1,10.0.1.1" \
 			    "3,10.0.1.3,signal backup")" "dump addrs after del"
 
 add_endpoint 10.0.1.3 2>/dev/null
-check "get_endpoint 4" "" "duplicate addr"
+check "get_endpoint 4" "" "duplicate addr" 1
 
 add_endpoint 10.0.1.4 flags signal
 check "get_endpoint 4" "$(format_endpoints "4,10.0.1.4,signal")" "id addr increment"
@@ -173,7 +175,7 @@ for i in $(seq 5 9); do
 	add_endpoint "10.0.1.${i}" flags signal >/dev/null 2>&1
 done
 check "get_endpoint 9" "$(format_endpoints "9,10.0.1.9,signal")" "hard addr limit"
-check "get_endpoint 10" "" "above hard addr limit"
+check "get_endpoint 10" "" "above hard addr limit" 1
 
 del_endpoint 9
 for i in $(seq 10 255); do

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH mptcp-net v6 11/19] selftests: mptcp: pm: restrict 'unknown' check to pm_nl_ctl
  2026-04-22 16:40 [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup Matthieu Baerts (NGI0)
                   ` (9 preceding siblings ...)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 10/19] selftests: mptcp: check output: catch cmd errors Matthieu Baerts (NGI0)
@ 2026-04-22 16:40 ` Matthieu Baerts (NGI0)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 12/19] mptcp: pm: in-kernel: explicitly limit batches to array size Matthieu Baerts (NGI0)
                   ` (9 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: Matthieu Baerts (NGI0) @ 2026-04-22 16:40 UTC (permalink / raw)
  To: MPTCP Upstream; +Cc: Matthieu Baerts (NGI0)

When pm_netlink.sh is executed with '-i', 'ip mptcp' is used instead of
'pm_nl_ctl'. IPRoute2 doesn't support the 'unknown' flag, which has only
been added to 'pm_nl_ctl' for this specific check: to ensure that the
kernel ignores such unsupported flag.

No reason to add this flag to 'ip mptcp'. Then, this check should be
skipped when 'ip mptcp' is used.

Fixes: 29f4801e9c8d ("selftests: mptcp: pm: ensure unknown flags are ignored")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 tools/testing/selftests/net/mptcp/pm_netlink.sh | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/net/mptcp/pm_netlink.sh b/tools/testing/selftests/net/mptcp/pm_netlink.sh
index b69f30fcb91e..04594dfc22b1 100755
--- a/tools/testing/selftests/net/mptcp/pm_netlink.sh
+++ b/tools/testing/selftests/net/mptcp/pm_netlink.sh
@@ -194,9 +194,13 @@ check "show_endpoints" \
 flush_endpoint
 check "show_endpoints" "" "flush addrs"
 
-add_endpoint 10.0.1.1 flags unknown
-check "show_endpoints" "$(format_endpoints "1,10.0.1.1")" "ignore unknown flags"
-flush_endpoint
+# "unknown" flag is only supported by pm_nl_ctl
+if ! mptcp_lib_is_ip_mptcp; then
+	add_endpoint 10.0.1.1 flags unknown
+	check "show_endpoints" "$(format_endpoints "1,10.0.1.1")" \
+	      "ignore unknown flags"
+	flush_endpoint
+fi
 
 set_limits 9 1 2>/dev/null
 check "get_limits" "${default_limits}" "rcv addrs above hard limit"

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH mptcp-net v6 12/19] mptcp: pm: in-kernel: explicitly limit batches to array size
  2026-04-22 16:40 [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup Matthieu Baerts (NGI0)
                   ` (10 preceding siblings ...)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 11/19] selftests: mptcp: pm: restrict 'unknown' check to pm_nl_ctl Matthieu Baerts (NGI0)
@ 2026-04-22 16:40 ` Matthieu Baerts (NGI0)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 13/19] mptcp: pm: in-kernel: increase all limits to 64 Matthieu Baerts (NGI0)
                   ` (8 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: Matthieu Baerts (NGI0) @ 2026-04-22 16:40 UTC (permalink / raw)
  To: MPTCP Upstream; +Cc: Matthieu Baerts (NGI0)

The in-kernel PM can create subflows in reply to ADD_ADDR by batch of
maximum 8 subflows for the moment. Same when adding new "subflow"
endpoints with the fullmesh flag. This limit is linked to the arrays
used during these steps.

There was no explicit limit to the arrays size (8), because the limit of
extra subflows is the same (8). It seems safer to use an explicit limit,
but also these two sizes are going to be different in the next commit.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 net/mptcp/pm_kernel.c | 32 +++++++++++++++++++++-----------
 1 file changed, 21 insertions(+), 11 deletions(-)

diff --git a/net/mptcp/pm_kernel.c b/net/mptcp/pm_kernel.c
index 906a1d5ab9ed..a678b1c33f56 100644
--- a/net/mptcp/pm_kernel.c
+++ b/net/mptcp/pm_kernel.c
@@ -201,7 +201,8 @@ fill_remote_addr(struct mptcp_sock *msk, struct mptcp_addr_info *local,
 static unsigned int
 fill_remote_addresses_fullmesh(struct mptcp_sock *msk,
 			       struct mptcp_addr_info *local,
-			       struct mptcp_addr_info *addrs)
+			       struct mptcp_addr_info *addrs,
+			       int addrs_size)
 {
 	u8 limit_extra_subflows = mptcp_pm_get_limit_extra_subflows(msk);
 	bool deny_id0 = READ_ONCE(msk->pm.remote_deny_join_id0);
@@ -236,7 +237,8 @@ fill_remote_addresses_fullmesh(struct mptcp_sock *msk,
 		msk->pm.extra_subflows++;
 		i++;
 
-		if (msk->pm.extra_subflows >= limit_extra_subflows)
+		if (msk->pm.extra_subflows >= limit_extra_subflows ||
+		    i == addrs_size)
 			break;
 	}
 
@@ -248,7 +250,8 @@ fill_remote_addresses_fullmesh(struct mptcp_sock *msk,
  */
 static unsigned int
 fill_remote_addresses_vec(struct mptcp_sock *msk, struct mptcp_addr_info *local,
-			  bool fullmesh, struct mptcp_addr_info *addrs)
+			  bool fullmesh, struct mptcp_addr_info *addrs,
+			  int addrs_size)
 {
 	/* Non-fullmesh: fill in the single entry corresponding to the primary
 	 * MPC subflow remote address, and return 1, corresponding to 1 entry.
@@ -257,7 +260,7 @@ fill_remote_addresses_vec(struct mptcp_sock *msk, struct mptcp_addr_info *local,
 		return fill_remote_addr(msk, local, addrs);
 
 	/* Fullmesh endpoint: fill all possible remote addresses */
-	return fill_remote_addresses_fullmesh(msk, local, addrs);
+	return fill_remote_addresses_fullmesh(msk, local, addrs, addrs_size);
 }
 
 static struct mptcp_pm_addr_entry *
@@ -410,7 +413,8 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk)
 		else /* local_addr_used is not decr for ID 0 */
 			msk->pm.local_addr_used++;
 
-		nr = fill_remote_addresses_vec(msk, &local.addr, fullmesh, addrs);
+		nr = fill_remote_addresses_vec(msk, &local.addr, fullmesh,
+					       addrs, ARRAY_SIZE(addrs));
 		if (nr == 0)
 			continue;
 
@@ -447,6 +451,7 @@ static unsigned int
 fill_local_addresses_vec_fullmesh(struct mptcp_sock *msk,
 				  struct mptcp_addr_info *remote,
 				  struct mptcp_pm_local *locals,
+				  int locals_size,
 				  bool c_flag_case)
 {
 	u8 limit_extra_subflows = mptcp_pm_get_limit_extra_subflows(msk);
@@ -488,7 +493,8 @@ fill_local_addresses_vec_fullmesh(struct mptcp_sock *msk,
 		msk->pm.extra_subflows++;
 		i++;
 
-		if (msk->pm.extra_subflows >= limit_extra_subflows)
+		if (msk->pm.extra_subflows >= limit_extra_subflows ||
+		    i == locals_size)
 			break;
 	}
 	rcu_read_unlock();
@@ -559,7 +565,8 @@ fill_local_laminar_endp(struct mptcp_sock *msk, struct mptcp_addr_info *remote,
 static unsigned int
 fill_local_addresses_vec_c_flag(struct mptcp_sock *msk,
 				struct mptcp_addr_info *remote,
-				struct mptcp_pm_local *locals)
+				struct mptcp_pm_local *locals,
+				int locals_size)
 {
 	u8 limit_extra_subflows = mptcp_pm_get_limit_extra_subflows(msk);
 	struct pm_nl_pernet *pernet = pm_nl_get_pernet_from_msk(msk);
@@ -586,7 +593,8 @@ fill_local_addresses_vec_c_flag(struct mptcp_sock *msk,
 		msk->pm.extra_subflows++;
 		i++;
 
-		if (msk->pm.extra_subflows >= limit_extra_subflows)
+		if (msk->pm.extra_subflows >= limit_extra_subflows ||
+		    i == locals_size)
 			break;
 	}
 
@@ -620,13 +628,14 @@ fill_local_address_any(struct mptcp_sock *msk, struct mptcp_addr_info *remote,
  */
 static unsigned int
 fill_local_addresses_vec(struct mptcp_sock *msk, struct mptcp_addr_info *remote,
-			 struct mptcp_pm_local *locals)
+			 struct mptcp_pm_local *locals, int locals_size)
 {
 	bool c_flag_case = remote->id && mptcp_pm_add_addr_c_flag_case(msk);
 
 	/* If there is at least one MPTCP endpoint with a fullmesh flag */
 	if (mptcp_pm_get_endp_fullmesh_max(msk))
 		return fill_local_addresses_vec_fullmesh(msk, remote, locals,
+							 locals_size,
 							 c_flag_case);
 
 	/* If there is at least one MPTCP endpoint with a laminar flag */
@@ -637,7 +646,8 @@ fill_local_addresses_vec(struct mptcp_sock *msk, struct mptcp_addr_info *remote,
 	 * limits are used -- accepting no ADD_ADDR -- and use subflow endpoints
 	 */
 	if (c_flag_case)
-		return fill_local_addresses_vec_c_flag(msk, remote, locals);
+		return fill_local_addresses_vec_c_flag(msk, remote, locals,
+						       locals_size);
 
 	/* No special case: fill in the single 'IPADDRANY' local address */
 	return fill_local_address_any(msk, remote, &locals[0]);
@@ -672,7 +682,7 @@ static void mptcp_pm_nl_add_addr_received(struct mptcp_sock *msk)
 	/* connect to the specified remote address, using whatever
 	 * local address the routing configuration will pick.
 	 */
-	nr = fill_local_addresses_vec(msk, &remote, locals);
+	nr = fill_local_addresses_vec(msk, &remote, locals, ARRAY_SIZE(locals));
 	if (nr == 0)
 		return;
 

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH mptcp-net v6 13/19] mptcp: pm: in-kernel: increase all limits to 64
  2026-04-22 16:40 [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup Matthieu Baerts (NGI0)
                   ` (11 preceding siblings ...)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 12/19] mptcp: pm: in-kernel: explicitly limit batches to array size Matthieu Baerts (NGI0)
@ 2026-04-22 16:40 ` Matthieu Baerts (NGI0)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 14/19] mptcp: pm: kernel: allow flushing more than 8 endpoints Matthieu Baerts (NGI0)
                   ` (7 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: Matthieu Baerts (NGI0) @ 2026-04-22 16:40 UTC (permalink / raw)
  To: MPTCP Upstream; +Cc: Matthieu Baerts (NGI0)

This means switching the maximum from 8 to 64 for the number of subflows
and accepted ADD_ADDR.

The previous limit of 8 subflows makes sense in most cases. Using more
subflows will very likely *not* improve the situation, and could even
decrease the performances. But there are no technical limitations nor
performance impact to raise this limit, so let's do it: this will allow
people with very specific use-cases, and researchers to easily create
more subflows, and measure the performance impact by themselves.

The theoretical limit is 255 -- the ID is written in a u8 on the wire --
but 64 is more than enough. With so many subflows, it will be costly to
iterate over all of them when operations are done in bottom half.

Note that the in-kernel PM will continue to create subflows in reply to
ADD_ADDR with a single batch of maximum 8 subflows. Same when adding new
"subflow" endpoints with the fullmesh flag. Increasing those batch
limits would have a memory impact, and it looks fine not to cover these
cases with larger batches for the moment. If more is needed later, the
position of the last subflow from the list could be remembered, and the
list iteration could continue later.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/434
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 net/mptcp/pm_kernel.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/mptcp/pm_kernel.c b/net/mptcp/pm_kernel.c
index a678b1c33f56..f163d40e7375 100644
--- a/net/mptcp/pm_kernel.c
+++ b/net/mptcp/pm_kernel.c
@@ -30,6 +30,7 @@ struct pm_nl_pernet {
 };
 
 #define MPTCP_PM_ADDR_MAX	8
+#define MPTCP_PM_SUBFLOWS_MAX	64
 
 static struct pm_nl_pernet *pm_nl_get_pernet(const struct net *net)
 {
@@ -1381,10 +1382,10 @@ static int parse_limit(struct genl_info *info, int id, unsigned int *limit)
 		return 0;
 
 	*limit = nla_get_u32(attr);
-	if (*limit > MPTCP_PM_ADDR_MAX) {
+	if (*limit > MPTCP_PM_SUBFLOWS_MAX) {
 		NL_SET_ERR_MSG_ATTR_FMT(info->extack, attr,
 					"limit greater than maximum (%u)",
-					MPTCP_PM_ADDR_MAX);
+					MPTCP_PM_SUBFLOWS_MAX);
 		return -EINVAL;
 	}
 	return 0;

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH mptcp-net v6 14/19] mptcp: pm: kernel: allow flushing more than 8 endpoints
  2026-04-22 16:40 [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup Matthieu Baerts (NGI0)
                   ` (12 preceding siblings ...)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 13/19] mptcp: pm: in-kernel: increase all limits to 64 Matthieu Baerts (NGI0)
@ 2026-04-22 16:40 ` Matthieu Baerts (NGI0)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 15/19] mptcp: pm: in-kernel: increase endpoints limit Matthieu Baerts (NGI0)
                   ` (6 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: Matthieu Baerts (NGI0) @ 2026-04-22 16:40 UTC (permalink / raw)
  To: MPTCP Upstream; +Cc: Matthieu Baerts (NGI0)

The mptcp_rm_list structure contains an array of IDs of 8 entries: to be
able to send a RM_ADDR with 8 IDs. This limitation was OK so far because
there could maximum 8 endpoints.

But this is going to change in the next commit. To cope with that, if
one of the arrays is full, the iteration stops, the lists are processed,
then the iteration continues where it previously stopped.

Note that if there are many endpoints to remove, and multiple RM_ADDR to
send, it might be more likely that some of these RM_ADDRs are dropped or
lost. This is a known limitation: RM_ADDR are not retransmitted in
MPTCPv1.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 net/mptcp/pm_kernel.c | 38 +++++++++++++++++++++++++++-----------
 1 file changed, 27 insertions(+), 11 deletions(-)

diff --git a/net/mptcp/pm_kernel.c b/net/mptcp/pm_kernel.c
index f163d40e7375..1480bb7473b4 100644
--- a/net/mptcp/pm_kernel.c
+++ b/net/mptcp/pm_kernel.c
@@ -1224,19 +1224,30 @@ int mptcp_pm_nl_del_addr_doit(struct sk_buff *skb, struct genl_info *info)
 }
 
 static void mptcp_pm_flush_addrs_and_subflows(struct mptcp_sock *msk,
-					      struct list_head *rm_list)
+					      struct list_head *rm_list,
+					      struct mptcp_pm_addr_entry *entry)
 {
-	struct mptcp_rm_list alist = { .nr = 0 }, slist = { .nr = 0 };
-	struct mptcp_pm_addr_entry *entry;
+	struct mptcp_rm_list alist, slist;
+	bool more;
 
-	list_for_each_entry(entry, rm_list, list) {
-		if (slist.nr < MPTCP_RM_IDS_MAX &&
-		    mptcp_lookup_subflow_by_saddr(&msk->conn_list, &entry->addr))
+again:
+	alist.nr = 0;
+	slist.nr = 0;
+	more = false;
+
+	entry = list_prepare_entry(entry, rm_list, list);
+	list_for_each_entry_continue(entry, rm_list, list) {
+		if (mptcp_lookup_subflow_by_saddr(&msk->conn_list, &entry->addr))
 			slist.ids[slist.nr++] = mptcp_endp_get_local_id(msk, &entry->addr);
 
-		if (alist.nr < MPTCP_RM_IDS_MAX &&
-		    mptcp_remove_anno_list_by_saddr(msk, &entry->addr))
+		if (mptcp_remove_anno_list_by_saddr(msk, &entry->addr))
 			alist.ids[alist.nr++] = mptcp_endp_get_local_id(msk, &entry->addr);
+
+		if (slist.nr == MPTCP_RM_IDS_MAX ||
+		    alist.nr == MPTCP_RM_IDS_MAX) {
+			more = !list_is_last(&entry->list, rm_list);
+			break;
+		}
 	}
 
 	spin_lock_bh(&msk->pm.lock);
@@ -1247,9 +1258,14 @@ static void mptcp_pm_flush_addrs_and_subflows(struct mptcp_sock *msk,
 	if (slist.nr)
 		mptcp_pm_rm_subflow(msk, &slist);
 	/* Reset counters: maybe some subflows have been removed before */
-	bitmap_fill(msk->pm.id_avail_bitmap, MPTCP_PM_MAX_ADDR_ID + 1);
-	msk->pm.local_addr_used = 0;
+	if (!more) {
+		bitmap_fill(msk->pm.id_avail_bitmap, MPTCP_PM_MAX_ADDR_ID + 1);
+		msk->pm.local_addr_used = 0;
+	}
 	spin_unlock_bh(&msk->pm.lock);
+
+	if (more)
+		goto again;
 }
 
 static void mptcp_nl_flush_addrs_list(struct net *net,
@@ -1266,7 +1282,7 @@ static void mptcp_nl_flush_addrs_list(struct net *net,
 
 		if (!mptcp_pm_is_userspace(msk)) {
 			lock_sock(sk);
-			mptcp_pm_flush_addrs_and_subflows(msk, rm_list);
+			mptcp_pm_flush_addrs_and_subflows(msk, rm_list, NULL);
 			release_sock(sk);
 		}
 

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH mptcp-net v6 15/19] mptcp: pm: in-kernel: increase endpoints limit
  2026-04-22 16:40 [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup Matthieu Baerts (NGI0)
                   ` (13 preceding siblings ...)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 14/19] mptcp: pm: kernel: allow flushing more than 8 endpoints Matthieu Baerts (NGI0)
@ 2026-04-22 16:40 ` Matthieu Baerts (NGI0)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 16/19] selftests: mptcp: join: allow changing ifaces nr per test Matthieu Baerts (NGI0)
                   ` (5 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: Matthieu Baerts (NGI0) @ 2026-04-22 16:40 UTC (permalink / raw)
  To: MPTCP Upstream; +Cc: Matthieu Baerts (NGI0)

The endpoints are managed in a list which was limited to 8 entries.

This limit can be too small in some cases: by having the same limit as
the number of subflows, it might not allow creating all expected
subflows when having a mix of v4 and v6 addresses that can all use MPTCP
on v4/v6 only networks.

While increasing the limit above the new subflows one, why not using the
technical limit: 255. Indeed, the endpoint will each have an ID that
will be used on the wire, limited to u8, and the ID 0 is reserved to the
initial subflow.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
v6: commit message: limit is at 255, not 254
---
 net/mptcp/pm_kernel.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/mptcp/pm_kernel.c b/net/mptcp/pm_kernel.c
index 1480bb7473b4..c75a2c29bcf2 100644
--- a/net/mptcp/pm_kernel.c
+++ b/net/mptcp/pm_kernel.c
@@ -746,7 +746,7 @@ static int mptcp_pm_nl_append_new_local_addr(struct pm_nl_pernet *pernet,
 	 */
 	if (pernet->next_id == MPTCP_PM_MAX_ADDR_ID)
 		pernet->next_id = 1;
-	if (pernet->endpoints >= MPTCP_PM_ADDR_MAX) {
+	if (pernet->endpoints == MPTCP_PM_MAX_ADDR_ID) {
 		ret = -ERANGE;
 		goto out;
 	}

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH mptcp-net v6 16/19] selftests: mptcp: join: allow changing ifaces nr per test
  2026-04-22 16:40 [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup Matthieu Baerts (NGI0)
                   ` (14 preceding siblings ...)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 15/19] mptcp: pm: in-kernel: increase endpoints limit Matthieu Baerts (NGI0)
@ 2026-04-22 16:40 ` Matthieu Baerts (NGI0)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 17/19] selftests: mptcp: join: validate 8x8 subflows Matthieu Baerts (NGI0)
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: Matthieu Baerts (NGI0) @ 2026-04-22 16:40 UTC (permalink / raw)
  To: MPTCP Upstream; +Cc: Matthieu Baerts (NGI0)

By default, 4 network interfaces are created per subtest in a dedicated
net namespace. Each netns has a dedicated pair of v4 and v6 addresses.
Future tests will need more.

Simply always creating more network interfaces per test will increase
the execution time for all other tests, for no other benefits. So now it
is possible to change this number only when needed, by setting ifaces_nr
when calling 'reset' and 'init_shapers', e.g.

  ifaces_nr=8 reset "Subtest title"
  ifaces_nr=8 init_shapers

Note that it might also be interesting to decrease the default value to
2 to reduce the setup time, especially when a debug kernel config is
being used.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 tools/testing/selftests/net/mptcp/mptcp_join.sh | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
index beec41f6662a..28da9df797ae 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -63,6 +63,7 @@ unset fastclose
 unset fullmesh
 unset speed
 unset bind_addr
+unset ifaces_nr
 unset join_syn_rej
 unset join_csum_ns1
 unset join_csum_ns2
@@ -146,7 +147,7 @@ init_partial()
 	# ns1eth4    ns2eth4
 
 	local i
-	for i in $(seq 1 4); do
+	for i in $(seq 1 "${ifaces_nr:-4}"); do
 		ip link add ns1eth$i netns "$ns1" type veth peer name ns2eth$i netns "$ns2"
 		ip -net "$ns1" addr add 10.0.$i.1/24 dev ns1eth$i
 		ip -net "$ns1" addr add dead:beef:$i::1/64 dev ns1eth$i nodad
@@ -165,7 +166,7 @@ init_partial()
 init_shapers()
 {
 	local i
-	for i in $(seq 1 4); do
+	for i in $(seq 1 "${ifaces_nr:-4}"); do
 		tc -n $ns1 qdisc add dev ns1eth$i root netem rate 20mbit delay 1ms
 		tc -n $ns2 qdisc add dev ns2eth$i root netem rate 20mbit delay 1ms
 	done

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH mptcp-net v6 17/19] selftests: mptcp: join: validate 8x8 subflows
  2026-04-22 16:40 [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup Matthieu Baerts (NGI0)
                   ` (15 preceding siblings ...)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 16/19] selftests: mptcp: join: allow changing ifaces nr per test Matthieu Baerts (NGI0)
@ 2026-04-22 16:40 ` Matthieu Baerts (NGI0)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 18/19] selftests: mptcp: pm: validate new limits Matthieu Baerts (NGI0)
                   ` (3 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: Matthieu Baerts (NGI0) @ 2026-04-22 16:40 UTC (permalink / raw)
  To: MPTCP Upstream; +Cc: Matthieu Baerts (NGI0)

The limits have been recently increased, it is required to validate that
having 64 subflows is allowed.

Here, both the client and the server have 8 network interfaces. The
server has 8 endpoints marked as 'signal' to announce all its v4
addresses. The client also has 8 endpoints, but marked as 'subflow' and
'fullmesh' in order to create 8 subflows to each address announced by
the server. This means 63 additional subflows will be created after the
initial one.

If it is not possible to increase the limits to 64, it means an older
kernel version is being used, and the test is skipped.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 tools/testing/selftests/net/mptcp/mptcp_join.sh | 28 +++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
index 28da9df797ae..c6bb345d056b 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -513,6 +513,19 @@ reset_with_tcp_filter()
 	fi
 }
 
+# For kernel supporting limits above 8
+# $1: title ; $2,4: addrs limit ns1,2 ; $3,5: subflows limit ns1,2
+reset_with_high_limits()
+{
+	reset "${1}" || return 1
+
+	if ! pm_nl_set_limits "${ns1}" "${2}" "${3}" 2>/dev/null ||
+	   ! pm_nl_set_limits "${ns2}" "${4}" "${5}" 2>/dev/null; then
+		mark_as_skipped "unable to set the limits to ${*:2}"
+		return 1
+	fi
+}
+
 # $1: err msg
 fail_test()
 {
@@ -3670,6 +3683,21 @@ fullmesh_tests()
 		chk_prio_nr 0 1 1 0
 		chk_rm_nr 0 1
 	fi
+
+	# fullmesh in 8x8 to create 63 additional subflows
+	if ifaces_nr=8 reset_with_high_limits "fullmesh 8x8" 64 64 64 64; then
+		# higher chance to lose ADD_ADDR: allow retransmissions
+		ip netns exec $ns1 sysctl -q net.mptcp.add_addr_timeout=1
+		local i
+		for i in $(seq 1 8); do
+			pm_nl_add_endpoint $ns2 10.0.$i.2 flags subflow,fullmesh
+			pm_nl_add_endpoint $ns1 10.0.$i.1 flags signal
+		done
+		speed=slow \
+			run_tests $ns1 $ns2 10.0.1.1
+		chk_join_nr 63 63 63
+	fi
+
 }
 
 fastclose_tests()

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH mptcp-net v6 18/19] selftests: mptcp: pm: validate new limits
  2026-04-22 16:40 [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup Matthieu Baerts (NGI0)
                   ` (16 preceding siblings ...)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 17/19] selftests: mptcp: join: validate 8x8 subflows Matthieu Baerts (NGI0)
@ 2026-04-22 16:40 ` Matthieu Baerts (NGI0)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 19/19] selftests: mptcp: pm: use simpler send/recv forms Matthieu Baerts (NGI0)
                   ` (2 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: Matthieu Baerts (NGI0) @ 2026-04-22 16:40 UTC (permalink / raw)
  To: MPTCP Upstream; +Cc: Matthieu Baerts (NGI0)

These limits have been recently updated, from 8 to:

- 64 for the subflows and accepted add_addr

- 255 for the MPTCP endpoints

These modifications validate the new limits, but are also compatible
with the previous ones, to be able to continue to validate stable kernel
using the last version of the selftests. That's why new variables are
now used instead of hard-coded values.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
v4: avoid using "&&" with check(), use a clearer way instead.
---
 tools/testing/selftests/net/mptcp/pm_netlink.sh | 56 +++++++++++++++----------
 1 file changed, 35 insertions(+), 21 deletions(-)

diff --git a/tools/testing/selftests/net/mptcp/pm_netlink.sh b/tools/testing/selftests/net/mptcp/pm_netlink.sh
index 04594dfc22b1..21bfe1311f11 100755
--- a/tools/testing/selftests/net/mptcp/pm_netlink.sh
+++ b/tools/testing/selftests/net/mptcp/pm_netlink.sh
@@ -66,6 +66,15 @@ get_limits() {
 	fi
 }
 
+get_limits_nb() {
+	if mptcp_lib_is_ip_mptcp; then
+		ip -n "${ns1}" mptcp limits | awk '{ print $2" "$4 }'
+	else
+		ip netns exec "${ns1}" ./pm_nl_ctl limits | \
+			awk '{ printf "%s ", $2 }'
+	fi
+}
+
 format_endpoints() {
 	mptcp_lib_pm_nl_format_endpoints "${@}"
 }
@@ -164,6 +173,7 @@ check "get_endpoint 2" "" "simple del addr" 1
 check "show_endpoints" \
 	"$(format_endpoints "1,10.0.1.1" \
 			    "3,10.0.1.3,signal backup")" "dump addrs after del"
+add_endpoint 10.0.1.2 id 2
 
 add_endpoint 10.0.1.3 2>/dev/null
 check "get_endpoint 4" "" "duplicate addr" 1
@@ -171,25 +181,29 @@ check "get_endpoint 4" "" "duplicate addr" 1
 add_endpoint 10.0.1.4 flags signal
 check "get_endpoint 4" "$(format_endpoints "4,10.0.1.4,signal")" "id addr increment"
 
-for i in $(seq 5 9); do
-	add_endpoint "10.0.1.${i}" flags signal >/dev/null 2>&1
-done
-check "get_endpoint 9" "$(format_endpoints "9,10.0.1.9,signal")" "hard addr limit"
-check "get_endpoint 10" "" "above hard addr limit" 1
+read -r -a default_limits_nb <<< "$(get_limits_nb)"
+# limits have been increased: from 8 to 64 for subflows/add_addr & 255 for endp
+if mptcp_lib_expect_all_features || set_limits 9 9 2>/dev/null; then
+	max_endp=255
+	max_limits=64
+else
+	max_endp=8
+	max_limits=8
+fi
+set_limits "${default_limits_nb[@]}"
 
-del_endpoint 9
-for i in $(seq 10 255); do
-	add_endpoint 10.0.0.9 id "${i}"
-	del_endpoint "${i}"
+for i in $(seq 5 ${max_endp}); do
+	add_endpoint "10.0.0.${i}" id "${i}"
 done
-check "show_endpoints" \
-	"$(format_endpoints "1,10.0.1.1" \
-			    "3,10.0.1.3,signal backup" \
-			    "4,10.0.1.4,signal" \
-			    "5,10.0.1.5,signal" \
-			    "6,10.0.1.6,signal" \
-			    "7,10.0.1.7,signal" \
-			    "8,10.0.1.8,signal")" "id limit"
+check "get_endpoint ${max_endp}" \
+	"$(format_endpoints "${max_endp},10.0.0.${max_endp}")" "id limit"
+
+if add_endpoint '10.0.0.1' &>/dev/null; then
+	hardlimit="no error"
+else
+	hardlimit="error"
+fi
+check "echo ${hardlimit}" "error" "above hard addr limit"
 
 flush_endpoint
 check "show_endpoints" "" "flush addrs"
@@ -202,15 +216,15 @@ if ! mptcp_lib_is_ip_mptcp; then
 	flush_endpoint
 fi
 
-set_limits 9 1 2>/dev/null
+set_limits $((max_limits + 1)) 1 2>/dev/null
 check "get_limits" "${default_limits}" "rcv addrs above hard limit"
 
-set_limits 1 9 2>/dev/null
+set_limits 1 $((max_limits + 1)) 2>/dev/null
 check "get_limits" "${default_limits}" "subflows above hard limit"
 
-set_limits 8 8
+set_limits ${max_limits} ${max_limits}
 flush_endpoint  ## to make sure it doesn't affect the limits
-check "get_limits" "$(format_limits 8 8)" "set limits"
+check "get_limits" "$(format_limits ${max_limits} ${max_limits})" "set limits"
 
 flush_endpoint
 add_endpoint 10.0.1.1

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH mptcp-net v6 19/19] selftests: mptcp: pm: use simpler send/recv forms
  2026-04-22 16:40 [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup Matthieu Baerts (NGI0)
                   ` (17 preceding siblings ...)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 18/19] selftests: mptcp: pm: validate new limits Matthieu Baerts (NGI0)
@ 2026-04-22 16:40 ` Matthieu Baerts (NGI0)
  2026-04-22 17:47 ` [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup MPTCP CI
  2026-04-29  0:09 ` Mat Martineau
  20 siblings, 0 replies; 23+ messages in thread
From: Matthieu Baerts (NGI0) @ 2026-04-22 16:40 UTC (permalink / raw)
  To: MPTCP Upstream; +Cc: Matthieu Baerts (NGI0)

Instead of sendto() and recvfrom() which the NL address that was already
provided before.

Just simpler and easier to read without the to/from variants.

While at it, fix a checkpatch warning by removing multiple assignments.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
v2: fix already present checkpatch warning.
---
 tools/testing/selftests/net/mptcp/pm_nl_ctl.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/net/mptcp/pm_nl_ctl.c b/tools/testing/selftests/net/mptcp/pm_nl_ctl.c
index 99eecccbf0c8..78180da1efcc 100644
--- a/tools/testing/selftests/net/mptcp/pm_nl_ctl.c
+++ b/tools/testing/selftests/net/mptcp/pm_nl_ctl.c
@@ -217,8 +217,6 @@ static int capture_events(int fd, int event_group)
 /* do a netlink command and, if max > 0, fetch the reply ; nh's size >1024B */
 static int do_nl_req(int fd, struct nlmsghdr *nh, int len, int max)
 {
-	struct sockaddr_nl nladdr = { .nl_family = AF_NETLINK };
-	socklen_t addr_len;
 	void *data = nh;
 	int rem, ret;
 	int err = 0;
@@ -230,15 +228,15 @@ static int do_nl_req(int fd, struct nlmsghdr *nh, int len, int max)
 	}
 
 	nh->nlmsg_len = len;
-	ret = sendto(fd, data, len, 0, (void *)&nladdr, sizeof(nladdr));
+	ret = send(fd, data, len, 0);
 	if (ret != len)
 		error(1, errno, "send netlink: %uB != %uB\n", ret, len);
 
-	addr_len = sizeof(nladdr);
-	rem = ret = recvfrom(fd, data, max, 0, (void *)&nladdr, &addr_len);
+	ret = recv(fd, data, max, 0);
 	if (ret < 0)
 		error(1, errno, "recv netlink: %uB\n", ret);
 
+	rem = ret;
 	/* Beware: the NLMSG_NEXT macro updates the 'rem' argument */
 	for (; NLMSG_OK(nh, rem); nh = NLMSG_NEXT(nh, rem)) {
 		if (nh->nlmsg_type == NLMSG_DONE)

-- 
2.53.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup
  2026-04-22 16:40 [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup Matthieu Baerts (NGI0)
                   ` (18 preceding siblings ...)
  2026-04-22 16:40 ` [PATCH mptcp-net v6 19/19] selftests: mptcp: pm: use simpler send/recv forms Matthieu Baerts (NGI0)
@ 2026-04-22 17:47 ` MPTCP CI
  2026-04-29  0:09 ` Mat Martineau
  20 siblings, 0 replies; 23+ messages in thread
From: MPTCP CI @ 2026-04-22 17:47 UTC (permalink / raw)
  To: Matthieu Baerts; +Cc: mptcp

Hi Matthieu,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal (except selftest_mptcp_join): Success! ✅
- KVM Validation: normal (only selftest_mptcp_join): Success! ✅
- KVM Validation: debug (except selftest_mptcp_join): Unstable: 2 failed test(s): packetdrill_fastopen packetdrill_mp_capable ⚠️ 
- KVM Validation: debug (only selftest_mptcp_join): Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/24791273302

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/5f1ff4c3eff3
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1084373


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup
  2026-04-22 16:40 [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup Matthieu Baerts (NGI0)
                   ` (19 preceding siblings ...)
  2026-04-22 17:47 ` [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup MPTCP CI
@ 2026-04-29  0:09 ` Mat Martineau
  2026-04-29  7:39   ` Matthieu Baerts
  20 siblings, 1 reply; 23+ messages in thread
From: Mat Martineau @ 2026-04-29  0:09 UTC (permalink / raw)
  To: Matthieu Baerts (NGI0); +Cc: MPTCP Upstream

On Wed, 22 Apr 2026, Matthieu Baerts (NGI0) wrote:

> The initial intension was to increase the limits, but some fixes were
> needed, then when looking at the code around, other fixes had to be
> added too. So now I ended up with multiple fixes (including one early
> exit while at it), followed by patches increasing limits with some
> adaptations required before, some to validate the modifications, and one
> small improvement in the selftests.
>
> The patches with a Fixes tag are for -net, the rest for net-next.
>
> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
> ---
> Changes in v6:
> - patch 4: fix label name in commit message + mention next patch.
> - patch 7: use min() to improve clarity (Mat)
> - patch 15: fix typo.
> - patch 20: removed: will be sent in another series after that one.
> - Link to v5: https://patch.msgid.link/20260415-mptcp-inc-limits-v5-0-e54c3bf80e4e@kernel.org

Hi Matthieu -

v6 LGTM, thanks for the updates.

Reviewed-by: Mat Martineau <martineau@kernel.org>

>
> Changes in v5:
> - patch 5: only set timer_done to true in the timer handler: safer.
> - patch 9: fix typo in commit message.
> - patch 10: 3rd check expects error, better err msg, rename var.
> - Link to v4: https://patch.msgid.link/20260414-mptcp-inc-limits-v4-0-78274edda5dd@kernel.org
>
> Changes in v4:
> - patch 5: reset timer_done when the timer is reset.
> - patch 10: new: selftests: correctly catch error in 'check output'.
> - patch 16: add example with init_shapers to silence AI reviews
> - Patch 18: clearer way to check errors (without buggy '&&')
> - Link to v3: https://patch.msgid.link/20260413-mptcp-inc-limits-v3-0-dd36c9360432@kernel.org
>
> Changes in v3:
> - patch 1: new: retransmit ADD_ADDR for ID 0 with right ID.
> - patch 5: moved below + support calling sk_free().
> - patch 13: new: allow flushing more than 8 endpoints.
> - patch 12: clarify single batch of 8 subflows as known limit.
> - patch 16: fix typo in comment.
> - Link to v2: https://patch.msgid.link/20260410-mptcp-inc-limits-v2-0-5402209f05d3@kernel.org
>
> Changes in v2:
> - patch 2: new: fix potential data-race.
> - re-order patches 1 to 5 and use shorter prefix (ADD_ADDR rtx).
> - patch 6: restore accidentally deleted icsk->icsk_rto > max check.
> - patch 16: fix already present checkpatch warning.
> - Link to v1: https://patch.msgid.link/20260409-mptcp-inc-limits-v1-0-0e45fa30d914@kernel.org
>
> ---
> Matthieu Baerts (NGI0) (19):
>      mptcp: pm: kernel: correctly retransmit ADD_ADDR ID 0
>      mptcp: pm: ADD_ADDR rtx: fix potential data-race
>      mptcp: pm: ADD_ADDR rtx: allow ID 0
>      mptcp: pm: ADD_ADDR rtx: always decrease sk refcount
>      mptcp: pm: ADD_ADDR rtx: free sk if last
>      mptcp: pm: ADD_ADDR rtx: resched blocked ADD_ADDR quicker
>      mptcp: pm: ADD_ADDR rtx: skip inactive subflows
>      mptcp: pm: retrans ADD_ADDR: return early if no retrans
>      mptcp: pm: prio: skip closed subflows
>      selftests: mptcp: check output: catch cmd errors
>      selftests: mptcp: pm: restrict 'unknown' check to pm_nl_ctl
>      mptcp: pm: in-kernel: explicitly limit batches to array size
>      mptcp: pm: in-kernel: increase all limits to 64
>      mptcp: pm: kernel: allow flushing more than 8 endpoints
>      mptcp: pm: in-kernel: increase endpoints limit
>      selftests: mptcp: join: allow changing ifaces nr per test
>      selftests: mptcp: join: validate 8x8 subflows
>      selftests: mptcp: pm: validate new limits
>      selftests: mptcp: pm: use simpler send/recv forms
>
> net/mptcp/pm.c                                  | 62 +++++++++++------
> net/mptcp/pm_kernel.c                           | 90 ++++++++++++++++---------
> tools/testing/selftests/net/mptcp/mptcp_join.sh | 33 ++++++++-
> tools/testing/selftests/net/mptcp/mptcp_lib.sh  | 16 +++--
> tools/testing/selftests/net/mptcp/pm_netlink.sh | 74 ++++++++++++--------
> tools/testing/selftests/net/mptcp/pm_nl_ctl.c   |  8 +--
> 6 files changed, 193 insertions(+), 90 deletions(-)
> ---
> base-commit: b3fa12c3ddab70d570c42071efb0edd5dad07060
> change-id: 20260403-mptcp-inc-limits-ce9811024066
>
> Best regards,
> --
> Matthieu Baerts (NGI0) <matttbe@kernel.org>
>
>
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup
  2026-04-29  0:09 ` Mat Martineau
@ 2026-04-29  7:39   ` Matthieu Baerts
  0 siblings, 0 replies; 23+ messages in thread
From: Matthieu Baerts @ 2026-04-29  7:39 UTC (permalink / raw)
  To: Mat Martineau; +Cc: MPTCP Upstream

Hi Mat,

On 29/04/2026 02:09, Mat Martineau wrote:
> On Wed, 22 Apr 2026, Matthieu Baerts (NGI0) wrote:
> 
>> The initial intension was to increase the limits, but some fixes were
>> needed, then when looking at the code around, other fixes had to be
>> added too. So now I ended up with multiple fixes (including one early
>> exit while at it), followed by patches increasing limits with some
>> adaptations required before, some to validate the modifications, and one
>> small improvement in the selftests.
>>
>> The patches with a Fixes tag are for -net, the rest for net-next.
>>
>> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
>> ---
>> Changes in v6:
>> - patch 4: fix label name in commit message + mention next patch.
>> - patch 7: use min() to improve clarity (Mat)
>> - patch 15: fix typo.
>> - patch 20: removed: will be sent in another series after that one.
>> - Link to v5: https://patch.msgid.link/20260415-mptcp-inc-limits-v5-0-
>> e54c3bf80e4e@kernel.org
> 
> Hi Matthieu -
> 
> v6 LGTM, thanks for the updates.
Thank you for the review!

Now in our tree:

New patches for t/upstream-net and t/upstream:
- 4cd1d0bb0a77: mptcp: pm: kernel: correctly retransmit ADD_ADDR ID 0
- 24d6e9230abb: mptcp: pm: ADD_ADDR rtx: fix potential data-race
- 1d62197aebe7: mptcp: pm: ADD_ADDR rtx: allow ID 0
- 3e8311058b8a: mptcp: pm: ADD_ADDR rtx: always decrease sk refcount
- c7492016b448: mptcp: pm: ADD_ADDR rtx: free sk if last
- 999a20ff8f59: mptcp: pm: ADD_ADDR rtx: resched blocked ADD_ADDR quicker
- 20269ec49bd8: mptcp: pm: ADD_ADDR rtx: skip inactive subflows
- a91b0640998b: mptcp: pm: ADD_ADDR rtx: return early if no retrans
- 09b51d25b19b: mptcp: pm: prio: skip closed subflows
- 53421f820dd0: selftests: mptcp: check output: catch cmd errors
- f1dcf11f1a08: selftests: mptcp: pm: restrict 'unknown' check to pm_nl_ctl
- Results: 90cb02b20cf2..2a1f2d9ea26c (export-net)
- Results: f68dc562c883..eff6f637e315 (export)

New patches for t/upstream:
- 43bbf39ee203: mptcp: pm: in-kernel: explicitly limit batches to array size
- aec0321c2d29: mptcp: pm: in-kernel: increase all limits to 64
- a6d8f6739145: mptcp: pm: kernel: allow flushing more than 8 endpoints
- db7b26d14b03: mptcp: pm: in-kernel: increase endpoints limit
- 4097388c8d85: selftests: mptcp: join: allow changing ifaces nr per test
- b72b1d12a597: selftests: mptcp: join: validate 8x8 subflows
- 67c280697ab2: selftests: mptcp: pm: validate new limits
- dade65b5a91d: selftests: mptcp: pm: use simpler send/recv forms
- Results: eff6f637e315..d56fb1eba9f2 (export)

Tests are now in progress:

- export-net:
https://github.com/multipath-tcp/mptcp_net-next/commit/6f73a831d415e285b7e49565b9079495b7274a89/checks
- export:
https://github.com/multipath-tcp/mptcp_net-next/commit/ea2792a55b2660faf08c4a65ad46d7b87adf6f1c/checks

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2026-04-29  7:39 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-22 16:40 [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup Matthieu Baerts (NGI0)
2026-04-22 16:40 ` [PATCH mptcp-net v6 01/19] mptcp: pm: kernel: correctly retransmit ADD_ADDR ID 0 Matthieu Baerts (NGI0)
2026-04-22 16:40 ` [PATCH mptcp-net v6 02/19] mptcp: pm: ADD_ADDR rtx: fix potential data-race Matthieu Baerts (NGI0)
2026-04-22 16:40 ` [PATCH mptcp-net v6 03/19] mptcp: pm: ADD_ADDR rtx: allow ID 0 Matthieu Baerts (NGI0)
2026-04-22 16:40 ` [PATCH mptcp-net v6 04/19] mptcp: pm: ADD_ADDR rtx: always decrease sk refcount Matthieu Baerts (NGI0)
2026-04-22 16:40 ` [PATCH mptcp-net v6 05/19] mptcp: pm: ADD_ADDR rtx: free sk if last Matthieu Baerts (NGI0)
2026-04-22 16:40 ` [PATCH mptcp-net v6 06/19] mptcp: pm: ADD_ADDR rtx: resched blocked ADD_ADDR quicker Matthieu Baerts (NGI0)
2026-04-22 16:40 ` [PATCH mptcp-net v6 07/19] mptcp: pm: ADD_ADDR rtx: skip inactive subflows Matthieu Baerts (NGI0)
2026-04-22 16:40 ` [PATCH mptcp-net v6 08/19] mptcp: pm: retrans ADD_ADDR: return early if no retrans Matthieu Baerts (NGI0)
2026-04-22 16:40 ` [PATCH mptcp-net v6 09/19] mptcp: pm: prio: skip closed subflows Matthieu Baerts (NGI0)
2026-04-22 16:40 ` [PATCH mptcp-net v6 10/19] selftests: mptcp: check output: catch cmd errors Matthieu Baerts (NGI0)
2026-04-22 16:40 ` [PATCH mptcp-net v6 11/19] selftests: mptcp: pm: restrict 'unknown' check to pm_nl_ctl Matthieu Baerts (NGI0)
2026-04-22 16:40 ` [PATCH mptcp-net v6 12/19] mptcp: pm: in-kernel: explicitly limit batches to array size Matthieu Baerts (NGI0)
2026-04-22 16:40 ` [PATCH mptcp-net v6 13/19] mptcp: pm: in-kernel: increase all limits to 64 Matthieu Baerts (NGI0)
2026-04-22 16:40 ` [PATCH mptcp-net v6 14/19] mptcp: pm: kernel: allow flushing more than 8 endpoints Matthieu Baerts (NGI0)
2026-04-22 16:40 ` [PATCH mptcp-net v6 15/19] mptcp: pm: in-kernel: increase endpoints limit Matthieu Baerts (NGI0)
2026-04-22 16:40 ` [PATCH mptcp-net v6 16/19] selftests: mptcp: join: allow changing ifaces nr per test Matthieu Baerts (NGI0)
2026-04-22 16:40 ` [PATCH mptcp-net v6 17/19] selftests: mptcp: join: validate 8x8 subflows Matthieu Baerts (NGI0)
2026-04-22 16:40 ` [PATCH mptcp-net v6 18/19] selftests: mptcp: pm: validate new limits Matthieu Baerts (NGI0)
2026-04-22 16:40 ` [PATCH mptcp-net v6 19/19] selftests: mptcp: pm: use simpler send/recv forms Matthieu Baerts (NGI0)
2026-04-22 17:47 ` [PATCH mptcp-net v6 00/19] mptcp: pm: increase limits, and related fixes and cleanup MPTCP CI
2026-04-29  0:09 ` Mat Martineau
2026-04-29  7:39   ` Matthieu Baerts

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.