* [PATCH net v2 0/6] mptcp: misc fixes for v7.1-rc4
@ 2026-05-15 4:27 Matthieu Baerts (NGI0)
2026-05-15 4:27 ` [PATCH net v2 1/6] mptcp: do not drop partial packets Matthieu Baerts (NGI0)
` (6 more replies)
0 siblings, 7 replies; 11+ messages in thread
From: Matthieu Baerts (NGI0) @ 2026-05-15 4:27 UTC (permalink / raw)
To: Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan
Cc: netdev, mptcp, linux-kernel, Matthieu Baerts (NGI0),
linux-kselftest, Eric Dumazet, Shardul Bankar, stable, Li Xiasong,
Gang Yan
Here are various unrelated fixes:
- Patch 1: avoid dropping partial packets. A previous version was sent
a few weeks ago. A fix for v5.10.
- Patches 2-3: stop ADD_ADDR timer when an ADD_ADDR can never be sent
due to insufficient option space. A fix for v5.10.
- Patch 4: reset rcv_wnd_sent on disconnect, just in case the next
connection falls back to TCP. A fix for v5.17.
- Patch 5: update window_clamp when SO_RCVBUF is set during the
connection. A fix similar to a recent one on TCP side, for v6.6.
- Patch 6: avoid wrong time being displayed in the selftests when using
uutils 0.8.0 which contains a regression with 'date +%3N'. It doesn't
fix an issue in the kernel selftests, but having the fix is helpful
for those using uutils 0.8.0.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
Changes in v2:
- Patch 2: note for sashiko-nipa
- Patch 5: remove 'inline' keyword (NIPA) + update Fixes tag (Jakub)
- Patch 6: new
- Remove Eric's duplicated address with a typo (not sure how I did that)
- Link to v1: https://patch.msgid.link/20260511-net-mptcp-misc-fixes-7-1-rc4-v1-0-5ee57cb2b7eb@kernel.org
To: Matthieu Baerts <matttbe@kernel.org>
To: Mat Martineau <martineau@kernel.org>
To: Geliang Tang <geliang@kernel.org>
To: "David S. Miller" <davem@davemloft.net>
To: Eric Dumazet <edumazet@google.com>
To: Jakub Kicinski <kuba@kernel.org>
To: Paolo Abeni <pabeni@redhat.com>
To: Simon Horman <horms@kernel.org>
To: Shuah Khan <shuah@kernel.org>
Cc: netdev@vger.kernel.org
Cc: mptcp@lists.linux.dev
Cc: linux-kernel@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
Cc: Eric Dumazet <edumaze@google.com>
---
Gang Yan (1):
mptcp: update window_clamp on subflows when SO_RCVBUF is set
Li Xiasong (2):
mptcp: pm: fix ADD_ADDR timer infinite retry on option space insufficient
selftests: mptcp: join: cover ADD_ADDR tx drop and list progress
Matthieu Baerts (NGI0) (1):
selftests: mptcp: drop nanoseconds width specifier
Paolo Abeni (1):
mptcp: reset rcv wnd on disconnect
Shardul Bankar (1):
mptcp: do not drop partial packets
net/mptcp/pm.c | 56 ++++++++++++++++++----
net/mptcp/protocol.c | 25 ++++++++--
net/mptcp/sockopt.c | 10 +++-
tools/testing/selftests/net/mptcp/mptcp_connect.sh | 6 +--
tools/testing/selftests/net/mptcp/mptcp_join.sh | 31 ++++++++++++
tools/testing/selftests/net/mptcp/mptcp_lib.sh | 10 ++--
6 files changed, 113 insertions(+), 25 deletions(-)
---
base-commit: 5db89c99566fc4728cc92e941d8e1975711e24b5
change-id: 20260511-net-mptcp-misc-fixes-7-1-rc4-e2640fd4ef2c
Best regards,
--
Matthieu Baerts (NGI0) <matttbe@kernel.org>
* [PATCH net v2 1/6] mptcp: do not drop partial packets
2026-05-15 4:27 [PATCH net v2 0/6] mptcp: misc fixes for v7.1-rc4 Matthieu Baerts (NGI0)
@ 2026-05-15 4:27 ` Matthieu Baerts (NGI0)
2026-05-15 4:27 ` [PATCH net v2 2/6] mptcp: pm: fix ADD_ADDR timer infinite retry on option space insufficient Matthieu Baerts (NGI0)
` (5 subsequent siblings)
6 siblings, 0 replies; 11+ messages in thread
From: Matthieu Baerts (NGI0) @ 2026-05-15 4:27 UTC (permalink / raw)
To: Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan
Cc: netdev, mptcp, linux-kernel, Matthieu Baerts (NGI0),
linux-kselftest, Eric Dumazet, Shardul Bankar, stable
From: Shardul Bankar <shardul.b@mpiricsoftware.com>
When a packet arrives with map_seq < ack_seq < end_seq, the beginning
of the packet has already been acknowledged but the end contains new
data. Currently the entire packet is dropped as "old data," forcing
the sender to retransmit.
Instead, skip the already-acked bytes by adjusting the skb offset and
enqueue only the new portion. Update bytes_received and ack_seq to
reflect the new data consumed.
A previous attempt at this fix has been sent by Paolo Abeni [1], but had
issues [2]: it also added a zero-window check and changed rcv_wnd_sent
initialization, which caused test regressions. This version addresses
only the partial packet handling without modifying receive window
accounting.
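For illustration only (this is not part of the patch), the sequence arithmetic described above can be modelled in userspace C. The struct and function names below are hypothetical stand-ins; only the arithmetic is meant to follow the diff, and the sketch ignores queueing, memory accounting and MIB counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the per-skb MPTCP control block. */
struct seg {
	uint64_t map_seq;  /* data sequence number of the first byte */
	uint64_t end_seq;  /* data sequence number just past the last byte */
	uint32_t offset;   /* bytes to skip at the start of the payload */
};

/* Wrap-safe "a is after b" comparison for 64-bit sequence numbers. */
static bool seq_after(uint64_t a, uint64_t b)
{
	return (int64_t)(b - a) < 0;
}

/*
 * Handle a segment that starts before ack_seq (map_seq < ack_seq).
 * Returns the number of new bytes accepted; 0 means the segment holds
 * only old data and would be dropped as a duplicate.
 */
static uint64_t accept_overlapping(struct seg *s, uint64_t *ack_seq)
{
	uint64_t skip, copy_len;

	assert(seq_after(*ack_seq, s->map_seq));

	if (!seq_after(s->end_seq, *ack_seq))
		return 0;                     /* completely old data */

	skip = *ack_seq - s->map_seq;         /* already-acked prefix */
	copy_len = s->end_seq - *ack_seq;     /* new tail */
	s->offset += (uint32_t)skip;
	s->map_seq += skip;                   /* now equal to the old ack_seq */
	*ack_seq += copy_len;
	return copy_len;
}
```

With map_seq = 100, end_seq = 150 and ack_seq = 120, the sketch skips 20 bytes, accepts 30, and moves ack_seq to 150.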
Fixes: ab174ad8ef76 ("mptcp: move ooo skbs into msk out of order queue.")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/c9b426a4e163aa3c4fe8b80c79f1a610f47ae7d8.1763075056.git.pabeni@redhat.com [1]
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/600 [2]
Signed-off-by: Shardul Bankar <shardul.b@mpiricsoftware.com>
[pabeni@redhat.com: update map]
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
v3: (Paolo)
- update map_seq, too (AI tool)
v2: (Shardul)
- Drop the mptcp_try_coalesce() attempt for partial packets, since
non-zero offset always prevents coalescing (Paolo).
- https://lore.kernel.org/20260422143931.43281-1-shardul.b@mpiricsoftware.com
v1: (Shardul)
- https://lore.kernel.org/20260422120954.8877-1-shardul.b@mpiricsoftware.com
v0: (Paolo)
- https://lore.kernel.org/mptcp/c9b426a4e163aa3c4fe8b80c79f1a610f47ae7d8.1763075056.git.pabeni@redhat.com
---
net/mptcp/protocol.c | 24 +++++++++++++++++++-----
1 file changed, 19 insertions(+), 5 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 4546a8b09884..859df49e16dc 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -397,12 +397,26 @@ static bool __mptcp_move_skb(struct sock *sk, struct sk_buff *skb)
return false;
}
- /* old data, keep it simple and drop the whole pkt, sender
- * will retransmit as needed, if needed.
+ /* Completely old data? */
+ if (!after64(MPTCP_SKB_CB(skb)->end_seq, msk->ack_seq)) {
+ MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_DUPDATA);
+ mptcp_drop(sk, skb);
+ return false;
+ }
+
+ /* Partial packet: map_seq < ack_seq < end_seq.
+ * Skip the already-acked bytes and enqueue the new data.
*/
- MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_DUPDATA);
- mptcp_drop(sk, skb);
- return false;
+ copy_len = MPTCP_SKB_CB(skb)->end_seq - msk->ack_seq;
+ MPTCP_SKB_CB(skb)->offset += msk->ack_seq - MPTCP_SKB_CB(skb)->map_seq;
+ MPTCP_SKB_CB(skb)->map_seq += msk->ack_seq -
+ MPTCP_SKB_CB(skb)->map_seq;
+ msk->bytes_received += copy_len;
+ WRITE_ONCE(msk->ack_seq, msk->ack_seq + copy_len);
+
+ skb_set_owner_r(skb, sk);
+ __skb_queue_tail(&sk->sk_receive_queue, skb);
+ return true;
}
static void mptcp_stop_rtx_timer(struct sock *sk)
--
2.53.0
* [PATCH net v2 2/6] mptcp: pm: fix ADD_ADDR timer infinite retry on option space insufficient
2026-05-15 4:27 [PATCH net v2 0/6] mptcp: misc fixes for v7.1-rc4 Matthieu Baerts (NGI0)
2026-05-15 4:27 ` [PATCH net v2 1/6] mptcp: do not drop partial packets Matthieu Baerts (NGI0)
@ 2026-05-15 4:27 ` Matthieu Baerts (NGI0)
2026-05-17 5:50 ` Matthieu Baerts
2026-05-15 4:27 ` [PATCH net v2 3/6] selftests: mptcp: join: cover ADD_ADDR tx drop and list progress Matthieu Baerts (NGI0)
` (4 subsequent siblings)
6 siblings, 1 reply; 11+ messages in thread
From: Matthieu Baerts (NGI0) @ 2026-05-15 4:27 UTC (permalink / raw)
To: Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan
Cc: netdev, mptcp, linux-kernel, Matthieu Baerts (NGI0),
linux-kselftest, Eric Dumazet, Li Xiasong, stable
From: Li Xiasong <lixiasong1@huawei.com>
When TCP option space is insufficient (e.g., when sending ADD_ADDR with an
IPv6 address and port while tcp_timestamps is enabled), the original code
jumped to out_unlock without clearing the addr_signal flag. This caused
mptcp_pm_add_timer to keep rescheduling indefinitely, not sending ADD_ADDR,
preventing subsequent addresses in the endpoint list from being announced.
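As a rough illustration of why the option does not fit, here is a userspace sketch of the size arithmetic. It assumes the ADD_ADDR layout from RFC 8684 (4 header bytes, the address, an optional 2-byte port, and an 8-byte truncated HMAC on non-echo options), 40 bytes of TCP option space, and a timestamp option padded to 12 bytes. The helper names are hypothetical and the kernel's own mptcp_add_addr_len() may compute the value differently.

```c
#include <assert.h>
#include <stdbool.h>

#define TCP_MAX_OPT_SPACE   40  /* 60-byte max TCP header minus 20 fixed bytes */
#define TCP_TSTAMP_ALIGNED  12  /* 10-byte timestamp option padded to 4 bytes */

static_assert(TCP_TSTAMP_ALIGNED < TCP_MAX_OPT_SPACE,
	      "timestamps must leave room for other options");

/* ADD_ADDR size following the RFC 8684 layout, padded to 4 bytes. */
static unsigned int add_addr_len(bool ipv6, bool echo, bool port)
{
	unsigned int len = 4;      /* kind, length, subtype/flags, address ID */

	len += ipv6 ? 16 : 4;      /* address */
	if (port)
		len += 2;          /* optional port */
	if (!echo)
		len += 8;          /* truncated HMAC, absent in echoes */
	return (len + 3) & ~3u;    /* TCP options are 32-bit aligned */
}

/* True if the option fits in what is left after optional timestamps. */
static bool add_addr_fits(bool ipv6, bool echo, bool port, bool timestamps)
{
	unsigned int remaining = TCP_MAX_OPT_SPACE;

	if (timestamps)
		remaining -= TCP_TSTAMP_ALIGNED;
	return add_addr_len(ipv6, echo, port) <= remaining;
}
```

Under these assumptions, an IPv6 ADD_ADDR with a port needs 32 bytes while only 28 remain once timestamps are present, whereas the same option without a port needs exactly 28 and still fits.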
Handle this case by clearing the ADD_ADDR signal and skipping the matching
ADD_ADDR retransmission entry. The skip path cancels the matching timer
(with id check) and advances PM state progression, preserving forward
progress to subsequent PM work.
This cancellation is inherently best-effort. A concurrent add_timer
callback may already be running and may acquire pm.lock before the
cancel path updates entry state. In that case, one final ADD_ADDR
transmit attempt can still be executed.
Once the cancel path sets entry->retrans_times to ADD_ADDR_RETRANS_MAX,
the callback-side retrans_times check suppresses further ADD_ADDR
retransmissions.
Note that when an ADD_ADDR is being prepared, a pure-ACK is queued. On
the output side, it means that it is fine to skip non-pure-ACK packets,
when drop_other_suboptions is set: a pure-ACK will be processed soon
after.
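The resulting decision in mptcp_pm_add_addr_signal() can be summarised with a small userspace sketch. The enum and function names are hypothetical; the branches are meant to mirror the diff below, where drop_other_suboptions is the flag that the new code comment associates with the pure-ACK case.

```c
#include <assert.h>
#include <stdbool.h>

enum add_addr_action {
	ADD_ADDR_SEND,        /* enough room: emit the option, clear the signal bit */
	ADD_ADDR_WAIT,        /* not a pure ACK: keep the signal bit, retry later */
	ADD_ADDR_DROP_ECHO,   /* echo: count the drop, clear the signal bit */
	ADD_ADDR_DROP_SIGNAL, /* signal: as above, and cancel the retransmit timer */
};

static enum add_addr_action
add_addr_decide(unsigned int remaining, unsigned int needed,
		bool echo, bool drop_other_suboptions)
{
	assert(needed > 0);   /* an ADD_ADDR option is never empty */

	if (remaining >= needed)
		return ADD_ADDR_SEND;
	if (!drop_other_suboptions)
		return ADD_ADDR_WAIT;
	return echo ? ADD_ADDR_DROP_ECHO : ADD_ADDR_DROP_SIGNAL;
}
```

Only the last case stops the retransmission timer and lets the path manager move on to the next endpoint; the echo case just clears the pending bit and bumps its own counter.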
Fixes: 00cfd77b9063 ("mptcp: retransmit ADD_ADDR when timeout")
Cc: stable@vger.kernel.org
Signed-off-by: Li Xiasong <lixiasong1@huawei.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
v2: add a note for sashiko-nipa: it's a false positive.
---
net/mptcp/pm.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++----------
1 file changed, 46 insertions(+), 10 deletions(-)
diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index 3c152bf66cd5..3e770c7407e1 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -364,7 +364,13 @@ static void mptcp_pm_add_timer(struct timer_list *timer)
spin_lock_bh(&msk->pm.lock);
- if (!mptcp_pm_should_add_signal_addr(msk)) {
+ /* The cancel path (mptcp_pm_del_add_timer()) can race with this
+ * callback. Once cancel updates retrans_times to MAX, suppress further
+ * retransmissions here. If this callback acquires pm.lock first, one
+ * final transmit attempt is still possible.
+ */
+ if (entry->retrans_times < ADD_ADDR_RETRANS_MAX &&
+ !mptcp_pm_should_add_signal_addr(msk)) {
pr_debug("retransmit ADD_ADDR id=%d\n", entry->addr.id);
mptcp_pm_announce_addr(msk, &entry->addr, false);
mptcp_pm_add_addr_send_ack(msk);
@@ -414,8 +420,12 @@ mptcp_pm_del_add_timer(struct mptcp_sock *msk,
/* Note: entry might have been removed by another thread.
* We hold rcu_read_lock() to ensure it is not freed under us.
*/
- if (stop_timer)
- sk_stop_timer_sync(sk, &entry->add_timer);
+ if (stop_timer) {
+ if (check_id)
+ sk_stop_timer(sk, &entry->add_timer);
+ else
+ sk_stop_timer_sync(sk, &entry->add_timer);
+ }
rcu_read_unlock();
return entry;
@@ -882,6 +892,7 @@ bool mptcp_pm_add_addr_signal(struct mptcp_sock *msk, const struct sk_buff *skb,
struct mptcp_addr_info *addr, bool *echo,
bool *drop_other_suboptions)
{
+ bool skip_add_addr = false;
int ret = false;
u8 add_addr;
u8 family;
@@ -903,24 +914,49 @@ bool mptcp_pm_add_addr_signal(struct mptcp_sock *msk, const struct sk_buff *skb,
}
*echo = mptcp_pm_should_add_signal_echo(msk);
- port = !!(*echo ? msk->pm.remote.port : msk->pm.local.port);
-
- family = *echo ? msk->pm.remote.family : msk->pm.local.family;
- if (remaining < mptcp_add_addr_len(family, *echo, port))
- goto out_unlock;
-
if (*echo) {
*addr = msk->pm.remote;
add_addr = msk->pm.addr_signal & ~BIT(MPTCP_ADD_ADDR_ECHO);
+ port = !!msk->pm.remote.port;
+ family = msk->pm.remote.family;
} else {
*addr = msk->pm.local;
add_addr = msk->pm.addr_signal & ~BIT(MPTCP_ADD_ADDR_SIGNAL);
+ port = !!msk->pm.local.port;
+ family = msk->pm.local.family;
}
- WRITE_ONCE(msk->pm.addr_signal, add_addr);
+
+ if (remaining < mptcp_add_addr_len(family, *echo, port)) {
+ struct net *net = sock_net((struct sock *)msk);
+
+ if (!*drop_other_suboptions)
+ goto out_unlock;
+
+ if (*echo) {
+ MPTCP_INC_STATS(net, MPTCP_MIB_ECHOADDTXDROP);
+ } else {
+ skip_add_addr = true;
+ MPTCP_INC_STATS(net, MPTCP_MIB_ADDADDRTXDROP);
+ }
+ goto drop_signal_mark;
+ }
+
ret = true;
+drop_signal_mark:
+ WRITE_ONCE(msk->pm.addr_signal, add_addr);
+
out_unlock:
spin_unlock_bh(&msk->pm.lock);
+
+ /* On pure-ACK option-space exhaustion, stop retrying this ADD_ADDR:
+ * clear the signal bit, cancel the matching retransmission timer, and
+ * let the PM state machine progress.
+ */
+ if (skip_add_addr) {
+ mptcp_pm_del_add_timer(msk, addr, true);
+ mptcp_pm_subflow_established(msk);
+ }
return ret;
}
--
2.53.0
* [PATCH net v2 3/6] selftests: mptcp: join: cover ADD_ADDR tx drop and list progress
2026-05-15 4:27 [PATCH net v2 0/6] mptcp: misc fixes for v7.1-rc4 Matthieu Baerts (NGI0)
2026-05-15 4:27 ` [PATCH net v2 1/6] mptcp: do not drop partial packets Matthieu Baerts (NGI0)
2026-05-15 4:27 ` [PATCH net v2 2/6] mptcp: pm: fix ADD_ADDR timer infinite retry on option space insufficient Matthieu Baerts (NGI0)
@ 2026-05-15 4:27 ` Matthieu Baerts (NGI0)
2026-05-15 4:27 ` [PATCH net v2 4/6] mptcp: reset rcv wnd on disconnect Matthieu Baerts (NGI0)
` (3 subsequent siblings)
6 siblings, 0 replies; 11+ messages in thread
From: Matthieu Baerts (NGI0) @ 2026-05-15 4:27 UTC (permalink / raw)
To: Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan
Cc: netdev, mptcp, linux-kernel, Matthieu Baerts (NGI0),
linux-kselftest, Eric Dumazet, Li Xiasong
From: Li Xiasong <lixiasong1@huawei.com>
Extend add_addr_ports_tests with IPv6 signaling cases that exercise
ADD_ADDR tx-space shortage when tcp_timestamps is enabled.
Add one case to verify PM still progresses to later signal endpoints
after the first one is dropped.
This covers both failure accounting and the non-blocking behavior of
the announce list after a tx-space drop on pure ACK.
Signed-off-by: Li Xiasong <lixiasong1@huawei.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
To: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
---
tools/testing/selftests/net/mptcp/mptcp_join.sh | 31 +++++++++++++++++++++++++
1 file changed, 31 insertions(+)
diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
index beec41f6662a..5acd12021e6e 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -1828,6 +1828,22 @@ chk_add_tx_nr()
fi
}
+chk_add_drop_tx_nr()
+{
+ local drop_tx_nr=$1
+ local count
+
+ print_check "add addr tx drop"
+ count=$(mptcp_lib_get_counter ${ns1} "MPTcpExtAddAddrTxDrop")
+ if [ -z "$count" ]; then
+ print_skip
+ elif [ "$count" != "$drop_tx_nr" ]; then
+ fail_test "got $count ADD_ADDR drop[s] TX, expected $drop_tx_nr"
+ else
+ print_ok
+ fi
+}
+
chk_rm_nr()
{
local rm_addr_nr=$1
@@ -3278,6 +3294,21 @@ add_addr_ports_tests()
chk_mpc_endp_attempt ${retl} 1
fi
+
+ # first signal address drops, second one still progresses
+ if reset "signal addr list progresses after tx drop"; then
+ pm_nl_set_limits $ns1 0 2
+ pm_nl_set_limits $ns2 1 0
+ ip netns exec $ns1 sysctl -q net.ipv4.tcp_timestamps=1
+ ip netns exec $ns2 sysctl -q net.ipv4.tcp_timestamps=1
+
+ pm_nl_add_endpoint $ns1 dead:beef:2::1 flags signal port 10100
+ pm_nl_add_endpoint $ns1 dead:beef:3::1 flags signal
+ run_tests $ns1 $ns2 dead:beef:1::1
+ chk_add_drop_tx_nr 1
+ chk_add_tx_nr 1 1
+ chk_add_nr 1 1 0
+ fi
}
bind_tests()
--
2.53.0
* [PATCH net v2 4/6] mptcp: reset rcv wnd on disconnect
2026-05-15 4:27 [PATCH net v2 0/6] mptcp: misc fixes for v7.1-rc4 Matthieu Baerts (NGI0)
` (2 preceding siblings ...)
2026-05-15 4:27 ` [PATCH net v2 3/6] selftests: mptcp: join: cover ADD_ADDR tx drop and list progress Matthieu Baerts (NGI0)
@ 2026-05-15 4:27 ` Matthieu Baerts (NGI0)
2026-05-15 4:27 ` [PATCH net v2 5/6] mptcp: update window_clamp on subflows when SO_RCVBUF is set Matthieu Baerts (NGI0)
` (2 subsequent siblings)
6 siblings, 0 replies; 11+ messages in thread
From: Matthieu Baerts (NGI0) @ 2026-05-15 4:27 UTC (permalink / raw)
To: Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan
Cc: netdev, mptcp, linux-kernel, Matthieu Baerts (NGI0),
linux-kselftest, Eric Dumazet, stable
From: Paolo Abeni <pabeni@redhat.com>
If the MPTCP socket falls back to TCP before the MP handshake completes,
the IASN remains 0, and the rcv_wnd_sent field is not explicitly
initialized, just incremented over time with the data transfer.
At disconnect time, this value is not cleared. If the next connection falls
back to TCP before the MP handshake completes, the data transfer will
keep incrementing the receive window end sequence starting from the last
value used in the previous connection: the announced window will be
unrelated to the actual receive buffer size and likely too big.
Address the issue by zeroing the field at disconnect time.
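The effect can be pictured with a deliberately simplified userspace model. Everything below is an assumption made for illustration: the struct, the helper names, and in particular the idea that the announced window is the distance between rcv_wnd_sent and ack_seq. The real accounting in the kernel is more involved.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, much-simplified receive-side state of one MPTCP socket. */
struct rx_state {
	uint64_t ack_seq;       /* next expected data sequence number */
	uint64_t rcv_wnd_sent;  /* right edge of the last announced window */
};

/* Window a peer would see if it were derived from the two fields above. */
static uint64_t announced_window(const struct rx_state *rx)
{
	assert(rx->rcv_wnd_sent >= rx->ack_seq);
	return rx->rcv_wnd_sent - rx->ack_seq;
}

/* Toy fallback data path: the right edge only ever moves forward. */
static void rx_receive(struct rx_state *rx, uint64_t len, uint64_t space)
{
	rx->ack_seq += len;
	if (rx->ack_seq + space > rx->rcv_wnd_sent)
		rx->rcv_wnd_sent = rx->ack_seq + space;
}

/* Disconnect; reset_wnd selects the behaviour with or without the fix. */
static void rx_disconnect(struct rx_state *rx, int reset_wnd)
{
	rx->ack_seq = 0;
	if (reset_wnd)
		rx->rcv_wnd_sent = 0;
}
```

In this model, receiving 1000000 bytes with 65536 bytes of buffer space and then disconnecting without the reset leaves a window of 1065536 for the next connection, far above the buffer space; with the reset, the window starts again from 0.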
Fixes: b29fcfb54cd7 ("mptcp: full disconnect implementation")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
net/mptcp/protocol.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 859df49e16dc..a72a6ad6ee8b 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -3487,6 +3487,7 @@ static int mptcp_disconnect(struct sock *sk, int flags)
/* for fallback's sake */
WRITE_ONCE(msk->ack_seq, 0);
+ atomic64_set(&msk->rcv_wnd_sent, 0);
WRITE_ONCE(sk->sk_shutdown, 0);
sk_error_report(sk);
--
2.53.0
* [PATCH net v2 5/6] mptcp: update window_clamp on subflows when SO_RCVBUF is set
2026-05-15 4:27 [PATCH net v2 0/6] mptcp: misc fixes for v7.1-rc4 Matthieu Baerts (NGI0)
` (3 preceding siblings ...)
2026-05-15 4:27 ` [PATCH net v2 4/6] mptcp: reset rcv wnd on disconnect Matthieu Baerts (NGI0)
@ 2026-05-15 4:27 ` Matthieu Baerts (NGI0)
2026-05-15 4:27 ` [PATCH net v2 6/6] selftests: mptcp: drop nanoseconds width specifier Matthieu Baerts (NGI0)
2026-05-19 13:50 ` [PATCH net v2 0/6] mptcp: misc fixes for v7.1-rc4 patchwork-bot+netdevbpf
6 siblings, 0 replies; 11+ messages in thread
From: Matthieu Baerts (NGI0) @ 2026-05-15 4:27 UTC (permalink / raw)
To: Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan
Cc: netdev, mptcp, linux-kernel, Matthieu Baerts (NGI0),
linux-kselftest, Eric Dumazet, Gang Yan, stable
From: Gang Yan <yangang@kylinos.cn>
Add __mptcp_subflow_set_rcvbuf() helper to write the subflow sk_rcvbuf,
but also to call the recently added tcp_set_rcvbuf() helper to update
window_clamp. This is needed because the window clamp is otherwise only
updated when scaling_ratio changes, in tcp_measure_rcv_mss(). Until
scaling_ratio changes, the subflow is stuck with the old window clamp,
which may be based on a small initial buffer.
Use this new helper in both mptcp_sol_socket_sync_intval() (setsockopt
path) and sync_socket_options() (new subflow creation path).
Note that this patch depends on commit b025461303d8 ("tcp: update
window_clamp when SO_RCVBUF is set"): it fixes the issue on TCP side,
but the same fix is needed on MPTCP side as well.
Fixes: a2cbb1603943 ("tcp: Update window clamping condition")
Cc: stable@vger.kernel.org
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/619
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
v2: remove 'inline' keyword (NIPA) + update Fixes tag (Jakub)
---
net/mptcp/sockopt.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c
index 1cf608e7357b..87b5796d0135 100644
--- a/net/mptcp/sockopt.c
+++ b/net/mptcp/sockopt.c
@@ -67,6 +67,12 @@ static int mptcp_get_int_option(struct mptcp_sock *msk, sockptr_t optval,
return 0;
}
+static void __mptcp_subflow_set_rcvbuf(struct sock *ssk, int val)
+{
+ WRITE_ONCE(ssk->sk_rcvbuf, val);
+ tcp_set_rcvbuf(ssk, val);
+}
+
static void mptcp_sol_socket_sync_intval(struct mptcp_sock *msk, int optname, int val)
{
struct mptcp_subflow_context *subflow;
@@ -100,7 +106,7 @@ static void mptcp_sol_socket_sync_intval(struct mptcp_sock *msk, int optname, in
case SO_RCVBUF:
case SO_RCVBUFFORCE:
ssk->sk_userlocks |= SOCK_RCVBUF_LOCK;
- WRITE_ONCE(ssk->sk_rcvbuf, sk->sk_rcvbuf);
+ __mptcp_subflow_set_rcvbuf(ssk, sk->sk_rcvbuf);
break;
case SO_MARK:
if (READ_ONCE(ssk->sk_mark) != sk->sk_mark) {
@@ -1560,7 +1566,7 @@ static void sync_socket_options(struct mptcp_sock *msk, struct sock *ssk)
mptcp_subflow_ctx(ssk)->cached_sndbuf = sk->sk_sndbuf;
}
if (sk->sk_userlocks & SOCK_RCVBUF_LOCK)
- WRITE_ONCE(ssk->sk_rcvbuf, sk->sk_rcvbuf);
+ __mptcp_subflow_set_rcvbuf(ssk, sk->sk_rcvbuf);
}
if (sock_flag(sk, SOCK_LINGER)) {
--
2.53.0
* [PATCH net v2 6/6] selftests: mptcp: drop nanoseconds width specifier
2026-05-15 4:27 [PATCH net v2 0/6] mptcp: misc fixes for v7.1-rc4 Matthieu Baerts (NGI0)
` (4 preceding siblings ...)
2026-05-15 4:27 ` [PATCH net v2 5/6] mptcp: update window_clamp on subflows when SO_RCVBUF is set Matthieu Baerts (NGI0)
@ 2026-05-15 4:27 ` Matthieu Baerts (NGI0)
2026-05-17 5:44 ` Matthieu Baerts
2026-05-19 13:50 ` [PATCH net v2 0/6] mptcp: misc fixes for v7.1-rc4 patchwork-bot+netdevbpf
6 siblings, 1 reply; 11+ messages in thread
From: Matthieu Baerts (NGI0) @ 2026-05-15 4:27 UTC (permalink / raw)
To: Mat Martineau, Geliang Tang, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan
Cc: netdev, mptcp, linux-kernel, Matthieu Baerts (NGI0),
linux-kselftest, Eric Dumazet, stable
Using the format specifier +%s%3N with GNU date is honoured, and only
prints 3 digits of the nanoseconds portion of the seconds since epoch,
which corresponds to the milliseconds.
The uutils implementation of date currently does not honour this, and
always prints all 9 digits. This is a known issue [1], but can be worked
around by adapting this test to use nanoseconds instead of microseconds,
and then divide it by 1e6.
This fix is similar to what has been done on systemd side [2], and it is
needed to run the selftests on Ubuntu 26.04, containing uutils 0.8.0.
Note that the Fixes tag is there even if this patch doesn't fix an issue
in the kernel selftests, but it is useful for those using uutils 0.8.0.
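For readers who prefer to see the arithmetic outside of shell, here is a small C sketch of the same conversion. The function names are hypothetical; now_ns() is meant to produce the same integer as `date +%s%N`, and elapsed_ms() mirrors the `/ 1000000` division added in the scripts. Nanosecond timestamps since the epoch are around 1.8e18 today, which still fits in a signed 64-bit integer.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC  INT64_C(1000000000)
#define NSEC_PER_MSEC INT64_C(1000000)

/* Nanoseconds since the epoch as one integer, like `date +%s%N`. */
static int64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_REALTIME, &ts);
	return (int64_t)ts.tv_sec * NSEC_PER_SEC + ts.tv_nsec;
}

/* Mirrors the shell expression $(((stop - start) / 1000000)). */
static int64_t elapsed_ms(int64_t start_ns, int64_t stop_ns)
{
	assert(stop_ns >= start_ns);
	return (stop_ns - start_ns) / NSEC_PER_MSEC;
}
```

Taking two now_ns() samples around the measured section and passing them to elapsed_ms() gives the duration in whole milliseconds, truncated like the shell integer division.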
Fixes: 048d19d444be ("mptcp: add basic kselftest for mptcp")
Cc: stable@vger.kernel.org
Link: https://github.com/uutils/coreutils/issues/11658 [1]
Link: https://github.com/systemd/systemd/pull/41627 [2]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
To: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
---
tools/testing/selftests/net/mptcp/mptcp_connect.sh | 6 +++---
tools/testing/selftests/net/mptcp/mptcp_lib.sh | 10 +++++-----
2 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.sh b/tools/testing/selftests/net/mptcp/mptcp_connect.sh
index a6447f7a31fe..d158678fa6ab 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_connect.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_connect.sh
@@ -401,7 +401,7 @@ do_transfer()
mptcp_lib_wait_local_port_listen "${listener_ns}" "${port}"
local start
- start=$(date +%s%3N)
+ start=$(date +%s%N)
ip netns exec ${connector_ns} \
./mptcp_connect -t ${timeout_poll} -p $port -s ${cl_proto} \
$extra_args $connect_addr < "$cin" > "$cout" &
@@ -423,7 +423,7 @@ do_transfer()
fi
local stop
- stop=$(date +%s%3N)
+ stop=$(date +%s%N)
if $capture; then
sleep 1
@@ -439,7 +439,7 @@ do_transfer()
fi
local duration
- duration=$((stop-start))
+ duration=$(((stop-start) / 1000000))
printf "(duration %05sms) " "${duration}"
if [ ${rets} -ne 0 ] || [ ${retc} -ne 0 ] || [ ${timeout_pid} -ne 0 ]; then
mptcp_lib_pr_fail "client exit code $retc, server $rets"
diff --git a/tools/testing/selftests/net/mptcp/mptcp_lib.sh b/tools/testing/selftests/net/mptcp/mptcp_lib.sh
index 989a5975dcea..5ef6033775c8 100644
--- a/tools/testing/selftests/net/mptcp/mptcp_lib.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_lib.sh
@@ -28,7 +28,7 @@ declare -rx MPTCP_LIB_AF_INET6=10
MPTCP_LIB_SUBTESTS=()
MPTCP_LIB_SUBTESTS_DUPLICATED=0
MPTCP_LIB_SUBTEST_FLAKY=0
-MPTCP_LIB_SUBTESTS_LAST_TS_MS=
+MPTCP_LIB_SUBTESTS_LAST_TS_NS=
MPTCP_LIB_TEST_COUNTER=0
MPTCP_LIB_TEST_FORMAT="%02u %-50s"
MPTCP_LIB_IP_MPTCP=0
@@ -236,7 +236,7 @@ mptcp_lib_kversion_ge() {
}
mptcp_lib_subtests_last_ts_reset() {
- MPTCP_LIB_SUBTESTS_LAST_TS_MS="$(date +%s%3N)"
+ MPTCP_LIB_SUBTESTS_LAST_TS_NS="$(date +%s%N)"
}
mptcp_lib_subtests_last_ts_reset
@@ -255,7 +255,7 @@ __mptcp_lib_result_check_duplicated() {
__mptcp_lib_result_add() {
local result="${1}"
local time="time="
- local ts_prev_ms
+ local ts_prev_ns
shift
local id=$((${#MPTCP_LIB_SUBTESTS[@]} + 1))
@@ -265,9 +265,9 @@ __mptcp_lib_result_add() {
# not to add two '#'
[[ "${*}" != *"#"* ]] && time="# ${time}"
- ts_prev_ms="${MPTCP_LIB_SUBTESTS_LAST_TS_MS}"
+ ts_prev_ns="${MPTCP_LIB_SUBTESTS_LAST_TS_NS}"
mptcp_lib_subtests_last_ts_reset
- time+="$((MPTCP_LIB_SUBTESTS_LAST_TS_MS - ts_prev_ms))ms"
+ time+="$(((MPTCP_LIB_SUBTESTS_LAST_TS_NS - ts_prev_ns) / 1000000))ms"
MPTCP_LIB_SUBTESTS+=("${result} ${id} - ${KSFT_TEST}: ${*} ${time}")
}
--
2.53.0
* Re: [PATCH net v2 6/6] selftests: mptcp: drop nanoseconds width specifier
2026-05-15 4:27 ` [PATCH net v2 6/6] selftests: mptcp: drop nanoseconds width specifier Matthieu Baerts (NGI0)
@ 2026-05-17 5:44 ` Matthieu Baerts
0 siblings, 0 replies; 11+ messages in thread
From: Matthieu Baerts @ 2026-05-17 5:44 UTC (permalink / raw)
To: netdev, Jakub Kicinski, Paolo Abeni
Cc: mptcp, linux-kernel, linux-kselftest, stable, Mat Martineau,
Geliang Tang, David S. Miller, Eric Dumazet, Simon Horman,
Shuah Khan
Hello,
On 15/05/2026 06:27, Matthieu Baerts (NGI0) wrote:
> Using the format specifier +%s%3N with GNU date is honoured, and only
> prints 3 digits of the nanoseconds portion of the seconds since epoch,
> which corresponds to the milliseconds.
>
> The uutils implementation of date currently does not honour this, and
> always prints all 9 digits. This is a known issue [1], but can be worked
> around by adapting this test to use nanoseconds instead of microseconds,
Sashiko noticed that I wrote microseconds instead of milliseconds. Do I
need to send a v3 just to fix this typo?
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
* Re: [PATCH net v2 2/6] mptcp: pm: fix ADD_ADDR timer infinite retry on option space insufficient
2026-05-15 4:27 ` [PATCH net v2 2/6] mptcp: pm: fix ADD_ADDR timer infinite retry on option space insufficient Matthieu Baerts (NGI0)
@ 2026-05-17 5:50 ` Matthieu Baerts
2026-05-18 6:34 ` Li Xiasong
0 siblings, 1 reply; 11+ messages in thread
From: Matthieu Baerts @ 2026-05-17 5:50 UTC (permalink / raw)
To: Jakub Kicinski, Paolo Abeni, Li Xiasong
Cc: netdev, mptcp, linux-kernel, linux-kselftest, Eric Dumazet,
stable, Mat Martineau, Geliang Tang, David S. Miller,
Eric Dumazet, Simon Horman, Shuah Khan
Hello,
On 15/05/2026 06:27, Matthieu Baerts (NGI0) wrote:
> From: Li Xiasong <lixiasong1@huawei.com>
>
> When TCP option space is insufficient (e.g., when sending ADD_ADDR with an
> IPv6 address and port while tcp_timestamps is enabled), the original code
> jumped to out_unlock without clearing the addr_signal flag. This caused
> mptcp_pm_add_timer to keep rescheduling indefinitely, not sending ADD_ADDR,
> preventing subsequent addresses in the endpoint list from being announced.
(...)
> diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
> index 3c152bf66cd5..3e770c7407e1 100644
> --- a/net/mptcp/pm.c
> +++ b/net/mptcp/pm.c
(...)
> @@ -414,8 +420,12 @@ mptcp_pm_del_add_timer(struct mptcp_sock *msk,
> /* Note: entry might have been removed by another thread.
> * We hold rcu_read_lock() to ensure it is not freed under us.
> */
> - if (stop_timer)
> - sk_stop_timer_sync(sk, &entry->add_timer);
FYI, sashiko found a pre-existing issue here, but I guess that's not
blocking this series.
https://sashiko.dev/#/patchset/20260515-net-mptcp-misc-fixes-7-1-rc4-v2-0-701e96419f2f%40kernel.org
@Li: just to know what I do with this pre-existing issue, do you plan to
look at it?
Just in case, I just opened:
https://github.com/multipath-tcp/mptcp_net-next/issues/623
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
* Re: [PATCH net v2 2/6] mptcp: pm: fix ADD_ADDR timer infinite retry on option space insufficient
2026-05-17 5:50 ` Matthieu Baerts
@ 2026-05-18 6:34 ` Li Xiasong
0 siblings, 0 replies; 11+ messages in thread
From: Li Xiasong @ 2026-05-18 6:34 UTC (permalink / raw)
To: Matthieu Baerts
Cc: netdev@vger.kernel.org, mptcp@lists.linux.dev,
linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
Eric Dumazet, stable@vger.kernel.org, Mat Martineau, Geliang Tang,
David S. Miller, Eric Dumazet, Simon Horman, Shuah Khan,
Jakub Kicinski, Paolo Abeni, weiyongjun (A), yuehaibing,
zhangchangzhong
Hi, Matt
On 5/17/2026 1:50 PM, Matthieu Baerts wrote:
> Hello,
>
> On 15/05/2026 06:27, Matthieu Baerts (NGI0) wrote:
>> From: Li Xiasong <lixiasong1@huawei.com>
>>
>> When TCP option space is insufficient (e.g., when sending ADD_ADDR with an
>> IPv6 address and port while tcp_timestamps is enabled), the original code
>> jumped to out_unlock without clearing the addr_signal flag. This caused
>> mptcp_pm_add_timer to keep rescheduling indefinitely, not sending ADD_ADDR,
>> preventing subsequent addresses in the endpoint list from being announced.
>
> (...)
>
>> diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
>> index 3c152bf66cd5..3e770c7407e1 100644
>> --- a/net/mptcp/pm.c
>> +++ b/net/mptcp/pm.c
>
> (...)
>
>> @@ -414,8 +420,12 @@ mptcp_pm_del_add_timer(struct mptcp_sock *msk,
>> /* Note: entry might have been removed by another thread.
>> * We hold rcu_read_lock() to ensure it is not freed under us.
>> */
>> - if (stop_timer)
>> - sk_stop_timer_sync(sk, &entry->add_timer);
> FYI, sashiko found a pre-existing issue here, but I guess that's not
> blocking this series.
>
> https://sashiko.dev/#/patchset/20260515-net-mptcp-misc-fixes-7-1-rc4-v2-0-701e96419f2f%40kernel.org
>
> @Li: just to know what I do with this pre-existing issue, do you plan to
> look at it?
>
> Just in case, I just opened:
>
> https://github.com/multipath-tcp/mptcp_net-next/issues/623
>
Thanks for the heads-up. I’ll take a look at #623 separately and follow up
there.
Best regards,
Li Xiasong
* Re: [PATCH net v2 0/6] mptcp: misc fixes for v7.1-rc4
2026-05-15 4:27 [PATCH net v2 0/6] mptcp: misc fixes for v7.1-rc4 Matthieu Baerts (NGI0)
` (5 preceding siblings ...)
2026-05-15 4:27 ` [PATCH net v2 6/6] selftests: mptcp: drop nanoseconds width specifier Matthieu Baerts (NGI0)
@ 2026-05-19 13:50 ` patchwork-bot+netdevbpf
6 siblings, 0 replies; 11+ messages in thread
From: patchwork-bot+netdevbpf @ 2026-05-19 13:50 UTC (permalink / raw)
To: Matthieu Baerts
Cc: martineau, geliang, davem, edumazet, kuba, pabeni, horms, shuah,
netdev, mptcp, linux-kernel, linux-kselftest, edumaze, shardul.b,
stable, lixiasong1, yangang
Hello:
This series was applied to netdev/net.git (main)
by Paolo Abeni <pabeni@redhat.com>:
On Fri, 15 May 2026 06:27:31 +0200 you wrote:
> Here are various unrelated fixes:
>
> - Patch 1: avoid dropping partial packets. A previous version has been
> sent a few week ago. A fix for 5.10.
>
> - Patches 2-3: stop ADD_ADDR timer when an ADD_ADDR can never been sent
> due to insufficient option space. A fix for v5.10.
>
> [...]
Here is the summary with links:
- [net,v2,1/6] mptcp: do not drop partial packets
https://git.kernel.org/netdev/net/c/50c2d91c5dfa
- [net,v2,2/6] mptcp: pm: fix ADD_ADDR timer infinite retry on option space insufficient
https://git.kernel.org/netdev/net/c/51e398a3b896
- [net,v2,3/6] selftests: mptcp: join: cover ADD_ADDR tx drop and list progress
https://git.kernel.org/netdev/net/c/fc5ef4331810
- [net,v2,4/6] mptcp: reset rcv wnd on disconnect
https://git.kernel.org/netdev/net/c/0981f90e1a05
- [net,v2,5/6] mptcp: update window_clamp on subflows when SO_RCVBUF is set
https://git.kernel.org/netdev/net/c/3a543ae0e209
- [net,v2,6/6] selftests: mptcp: drop nanoseconds width specifier
https://git.kernel.org/netdev/net/c/01ff78e4b3d9
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html