* [PATCH mptcp-net v3 0/2] mptcp: pm: fix ADD_ADDR timer infinite retry on option space shortage
@ 2026-05-07 12:52 Li Xiasong
2026-05-07 12:52 ` [PATCH mptcp-net v3 1/2] mptcp: pm: fix ADD_ADDR timer infinite retry on option space insufficient Li Xiasong
` (4 more replies)
0 siblings, 5 replies; 7+ messages in thread
From: Li Xiasong @ 2026-05-07 12:52 UTC (permalink / raw)
To: mptcp; +Cc: weiyongjun1, yuehaibing, zhangchangzhong
This series fixes a bug where the ADD_ADDR timer can keep rescheduling
indefinitely when TCP option space is insufficient, blocking subsequent
addresses in the endpoint list from being announced.
The issue is reproducible when sending an IPv6 ADD_ADDR with port while
tcp_timestamps is enabled. In that case, option space may be exhausted on
pure ACKs, and ADD_ADDR retransmission can stall PM forward progress.
Patch 1 clears the signal in this path, skips the matching ADD_ADDR
retransmission entry, and preserves PM state-machine progression.
Note on concurrency: timer cancellation in this path is best-effort.
A concurrent add_timer callback that acquires pm.lock first may still
perform one final ADD_ADDR send attempt before cancel state is applied.
After cancel sets entry->retrans_times to ADD_ADDR_RETRANS_MAX, the
callback-side check prevents further ADD_ADDR retransmissions.
Patch 2 adds a selftest to cover this behavior and prevent regressions.
Patch summary:
- patch 1: mptcp: pm: fix ADD_ADDR timer infinite retry on option space
insufficient
- patch 2: selftests: mptcp: join: cover ADD_ADDR tx drop and list
progress
Changes since v2:
- document the possible timer-cancel race window in the commit message
and code comments
- keep best-effort cancellation semantics (possible one extra send
attempt under lock-order race), while preventing further retries via
retrans_times gating
- selftests: remove the redundant ADD_ADDR tx-drop subtest that
duplicated existing coverage for list progression behavior
- link to v2:
https://lore.kernel.org/mptcp/20260430112026.343691-1-lixiasong1@huawei.com/
Changes since v1:
- add explicit MIB accounting for this drop path:
- MPTCP_MIB_ADDADDRTXDROP for non-echo ADD_ADDR
- MPTCP_MIB_ECHOADDTXDROP for echo ADD_ADDR
- add explicit handling to stop the matching ADD_ADDR retransmission
timer when skipping non-echo ADD_ADDR on pure-ACK option-space
exhaustion
- keep PM forward progress by advancing PM state after a successful
skip of the matching ADD_ADDR entry
- add selftest to verify IPv6 ADD_ADDR with port tx-drop when
tcp_timestamps is enabled
- link to v1:
https://lore.kernel.org/netdev/20260418100018.2219500-1-lixiasong1@huawei.com/
Thanks to Matthieu Baerts for the detailed review and suggestions.
Li Xiasong (2):
mptcp: pm: fix ADD_ADDR timer infinite retry on option space
insufficient
selftests: mptcp: join: cover ADD_ADDR tx drop and list progress
net/mptcp/pm.c | 56 +++++++++++++++----
.../testing/selftests/net/mptcp/mptcp_join.sh | 31 ++++++++++
2 files changed, 77 insertions(+), 10 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 7+ messages in thread* [PATCH mptcp-net v3 1/2] mptcp: pm: fix ADD_ADDR timer infinite retry on option space insufficient 2026-05-07 12:52 [PATCH mptcp-net v3 0/2] mptcp: pm: fix ADD_ADDR timer infinite retry on option space shortage Li Xiasong @ 2026-05-07 12:52 ` Li Xiasong 2026-05-07 12:52 ` [PATCH mptcp-net v3 2/2] selftests: mptcp: join: cover ADD_ADDR tx drop and list progress Li Xiasong ` (3 subsequent siblings) 4 siblings, 0 replies; 7+ messages in thread From: Li Xiasong @ 2026-05-07 12:52 UTC (permalink / raw) To: mptcp; +Cc: weiyongjun1, yuehaibing, zhangchangzhong When TCP option space is insufficient (e.g., when sending ADD_ADDR with an IPv6 address and port while tcp_timestamps is enabled), the original code jumped to out_unlock without clearing the addr_signal flag. This caused mptcp_pm_add_timer to keep rescheduling indefinitely, not sending ADD_ADDR, preventing subsequent addresses in the endpoint list from being announced. Handle this case by clearing the ADD_ADDR signal and skipping the matching ADD_ADDR retransmission entry. The skip path cancels the matching timer (with id check) and advances PM state progression, preserving forward progress to subsequent PM work. This cancellation is inherently best-effort. A concurrent add_timer callback may already be running and may acquire pm.lock before the cancel path updates entry state. In that case, one final ADD_ADDR transmit attempt can still be executed. Once the cancel path sets entry->retrans_times to ADD_ADDR_RETRANS_MAX, the callback-side retrans_times check suppresses further ADD_ADDR retransmissions. Fixes: 00cfd77b9063 ("mptcp: retransmit ADD_ADDR when timeout") Signed-off-by: Li Xiasong <lixiasong1@huawei.com> --- net/mptcp/pm.c | 56 +++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 46 insertions(+), 10 deletions(-) diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c index 57a456690406..8a86f1e93b78 100644 --- a/net/mptcp/pm.c +++ b/net/mptcp/pm.c @@ -351,7 +351,13 @@ static void mptcp_pm_add_timer(struct timer_list *timer) spin_lock_bh(&msk->pm.lock); - if (!mptcp_pm_should_add_signal_addr(msk)) { + /* The cancel path (mptcp_pm_del_add_timer()) can race with this + * callback. Once cancel updates retrans_times to MAX, suppress further + * retransmissions here. If this callback acquires pm.lock first, one + * final transmit attempt is still possible. + */ + if (entry->retrans_times < ADD_ADDR_RETRANS_MAX && + !mptcp_pm_should_add_signal_addr(msk)) { pr_debug("retransmit ADD_ADDR id=%d\n", entry->addr.id); mptcp_pm_announce_addr(msk, &entry->addr, false); mptcp_pm_add_addr_send_ack(msk); @@ -394,8 +400,12 @@ mptcp_pm_del_add_timer(struct mptcp_sock *msk, /* Note: entry might have been removed by another thread. * We hold rcu_read_lock() to ensure it is not freed under us. */ - if (stop_timer) - sk_stop_timer_sync(sk, &entry->add_timer); + if (stop_timer) { + if (check_id) + sk_stop_timer(sk, &entry->add_timer); + else + sk_stop_timer_sync(sk, &entry->add_timer); + } rcu_read_unlock(); return entry; @@ -860,6 +870,7 @@ bool mptcp_pm_add_addr_signal(struct mptcp_sock *msk, const struct sk_buff *skb, struct mptcp_addr_info *addr, bool *echo, bool *drop_other_suboptions) { + bool skip_add_addr = false; int ret = false; u8 add_addr; u8 family; @@ -881,24 +892,49 @@ bool mptcp_pm_add_addr_signal(struct mptcp_sock *msk, const struct sk_buff *skb, } *echo = mptcp_pm_should_add_signal_echo(msk); - port = !!(*echo ? msk->pm.remote.port : msk->pm.local.port); - - family = *echo ? msk->pm.remote.family : msk->pm.local.family; - if (remaining < mptcp_add_addr_len(family, *echo, port)) - goto out_unlock; - if (*echo) { *addr = msk->pm.remote; add_addr = msk->pm.addr_signal & ~BIT(MPTCP_ADD_ADDR_ECHO); + port = !!msk->pm.remote.port; + family = msk->pm.remote.family; } else { *addr = msk->pm.local; add_addr = msk->pm.addr_signal & ~BIT(MPTCP_ADD_ADDR_SIGNAL); + port = !!msk->pm.local.port; + family = msk->pm.local.family; } - WRITE_ONCE(msk->pm.addr_signal, add_addr); + + if (remaining < mptcp_add_addr_len(family, *echo, port)) { + struct net *net = sock_net((struct sock *)msk); + + if (!*drop_other_suboptions) + goto out_unlock; + + if (*echo) { + MPTCP_INC_STATS(net, MPTCP_MIB_ECHOADDTXDROP); + } else { + skip_add_addr = true; + MPTCP_INC_STATS(net, MPTCP_MIB_ADDADDRTXDROP); + } + goto drop_signal_mark; + } + ret = true; +drop_signal_mark: + WRITE_ONCE(msk->pm.addr_signal, add_addr); + out_unlock: spin_unlock_bh(&msk->pm.lock); + + /* On pure-ACK option-space exhaustion, stop retrying this ADD_ADDR: + * clear the signal bit, cancel the matching retransmission timer, and + * let the PM state machine progress. + */ + if (skip_add_addr) { + mptcp_pm_del_add_timer(msk, addr, true); + mptcp_pm_subflow_established(msk); + } return ret; } -- 2.34.1 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH mptcp-net v3 2/2] selftests: mptcp: join: cover ADD_ADDR tx drop and list progress 2026-05-07 12:52 [PATCH mptcp-net v3 0/2] mptcp: pm: fix ADD_ADDR timer infinite retry on option space shortage Li Xiasong 2026-05-07 12:52 ` [PATCH mptcp-net v3 1/2] mptcp: pm: fix ADD_ADDR timer infinite retry on option space insufficient Li Xiasong @ 2026-05-07 12:52 ` Li Xiasong 2026-05-08 11:37 ` Matthieu Baerts 2026-05-07 13:40 ` [PATCH mptcp-net v3 0/2] mptcp: pm: fix ADD_ADDR timer infinite retry on option space shortage MPTCP CI ` (2 subsequent siblings) 4 siblings, 1 reply; 7+ messages in thread From: Li Xiasong @ 2026-05-07 12:52 UTC (permalink / raw) To: mptcp; +Cc: weiyongjun1, yuehaibing, zhangchangzhong Extend add_addr_ports_tests with IPv6 signaling cases that exercise ADD_ADDR tx-space shortage when tcp_timestamps are enabled. Add one case to verify PM still progresses to later signal endpoints after the first one is dropped. This covers both failure accounting and the non-blocking behavior of the announce list after a tx-space drop on pure ACK. Signed-off-by: Li Xiasong <lixiasong1@huawei.com> --- .../testing/selftests/net/mptcp/mptcp_join.sh | 31 +++++++++++++++++++ 1 file changed, 31 insertions(+) diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh index beec41f6662a..5acd12021e6e 100755 --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -1828,6 +1828,22 @@ chk_add_tx_nr() fi } +chk_add_drop_tx_nr() +{ + local drop_tx_nr=$1 + local count + + print_check "add addr tx drop" + count=$(mptcp_lib_get_counter ${ns1} "MPTcpExtAddAddrTxDrop") + if [ -z "$count" ]; then + print_skip + elif [ "$count" != "$drop_tx_nr" ]; then + fail_test "got $count ADD_ADDR drop[s] TX, expected $drop_tx_nr" + else + print_ok + fi +} + chk_rm_nr() { local rm_addr_nr=$1 @@ -3278,6 +3294,21 @@ add_addr_ports_tests() chk_mpc_endp_attempt ${retl} 1 fi + + # first signal address drops, second one still progresses + if reset "signal addr list progresses after tx drop"; then + pm_nl_set_limits $ns1 0 2 + pm_nl_set_limits $ns2 1 0 + ip netns exec $ns1 sysctl -q net.ipv4.tcp_timestamps=1 + ip netns exec $ns2 sysctl -q net.ipv4.tcp_timestamps=1 + + pm_nl_add_endpoint $ns1 dead:beef:2::1 flags signal port 10100 + pm_nl_add_endpoint $ns1 dead:beef:3::1 flags signal + run_tests $ns1 $ns2 dead:beef:1::1 + chk_add_drop_tx_nr 1 + chk_add_tx_nr 1 1 + chk_add_nr 1 1 0 + fi } bind_tests() -- 2.34.1 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH mptcp-net v3 2/2] selftests: mptcp: join: cover ADD_ADDR tx drop and list progress 2026-05-07 12:52 ` [PATCH mptcp-net v3 2/2] selftests: mptcp: join: cover ADD_ADDR tx drop and list progress Li Xiasong @ 2026-05-08 11:37 ` Matthieu Baerts 0 siblings, 0 replies; 7+ messages in thread From: Matthieu Baerts @ 2026-05-08 11:37 UTC (permalink / raw) To: Li Xiasong, mptcp; +Cc: weiyongjun1, yuehaibing, zhangchangzhong Hi Li, On 07/05/2026 14:52, Li Xiasong wrote: > Extend add_addr_ports_tests with IPv6 signaling cases that exercise > ADD_ADDR tx-space shortage when tcp_timestamps are enabled. > > Add one case to verify PM still progresses to later signal endpoints > after the first one is dropped. > > This covers both failure accounting and the non-blocking behavior of > the announce list after a tx-space drop on pure ACK. > > Signed-off-by: Li Xiasong <lixiasong1@huawei.com> > --- > .../testing/selftests/net/mptcp/mptcp_join.sh | 31 +++++++++++++++++++ > 1 file changed, 31 insertions(+) > > diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh > index beec41f6662a..5acd12021e6e 100755 > --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh > +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh (...) > @@ -3278,6 +3294,21 @@ add_addr_ports_tests() > > chk_mpc_endp_attempt ${retl} 1 > fi > + > + # first signal address drops, second one still progresses > + if reset "signal addr list progresses after tx drop"; then > + pm_nl_set_limits $ns1 0 2 > + pm_nl_set_limits $ns2 1 0 > + ip netns exec $ns1 sysctl -q net.ipv4.tcp_timestamps=1 > + ip netns exec $ns2 sysctl -q net.ipv4.tcp_timestamps=1 > + > + pm_nl_add_endpoint $ns1 dead:beef:2::1 flags signal port 10100 > + pm_nl_add_endpoint $ns1 dead:beef:3::1 flags signal > + run_tests $ns1 $ns2 dead:beef:1::1 > + chk_add_drop_tx_nr 1 > + chk_add_tx_nr 1 1 It would be good to call chk_add_drop_tx_nr from chk_add_tx_nr, similar to what is done with chk_join_tx_nr from chk_join_nr. For example, by default, assuming there is no drop, except when add_addr_drop_tx_nr=X is passed when calling chk_add_tx_nr. Or as a new optional argument for chk_add_tx_nr. But that's more mptcp-next material, so I'm not going to block this patch for that! But if you are whiling to add this, don't hesitate to :) > + chk_add_nr 1 1 0 > + fi > } > > bind_tests() Cheers, Matt -- Sponsored by the NGI0 Core fund. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH mptcp-net v3 0/2] mptcp: pm: fix ADD_ADDR timer infinite retry on option space shortage 2026-05-07 12:52 [PATCH mptcp-net v3 0/2] mptcp: pm: fix ADD_ADDR timer infinite retry on option space shortage Li Xiasong 2026-05-07 12:52 ` [PATCH mptcp-net v3 1/2] mptcp: pm: fix ADD_ADDR timer infinite retry on option space insufficient Li Xiasong 2026-05-07 12:52 ` [PATCH mptcp-net v3 2/2] selftests: mptcp: join: cover ADD_ADDR tx drop and list progress Li Xiasong @ 2026-05-07 13:40 ` MPTCP CI 2026-05-08 11:37 ` Matthieu Baerts 2026-05-08 11:47 ` Matthieu Baerts 4 siblings, 0 replies; 7+ messages in thread From: MPTCP CI @ 2026-05-07 13:40 UTC (permalink / raw) To: Li Xiasong; +Cc: mptcp Hi Li, Thank you for your modifications, that's great! Our CI did some validations and here is its report: - KVM Validation: normal (except selftest_mptcp_join): Success! ✅ - KVM Validation: normal (only selftest_mptcp_join): Success! ✅ - KVM Validation: debug (except selftest_mptcp_join): Success! ✅ - KVM Validation: debug (only selftest_mptcp_join): Success! ✅ - KVM Validation: btf-normal (only bpftest_all): Success! ✅ - KVM Validation: btf-debug (only bpftest_all): Success! ✅ - Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/25496768062 Initiator: Patchew Applier Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/274d555dc3cf Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1091028 If there are some issues, you can reproduce them using the same environment as the one used by the CI thanks to a docker image, e.g.: $ cd [kernel source code] $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \ --pull always mptcp/mptcp-upstream-virtme-docker:latest \ auto-normal For more details: https://github.com/multipath-tcp/mptcp-upstream-virtme-docker Please note that despite all the efforts that have been already done to have a stable tests suite when executed on a public CI like here, it is possible some reported issues are not due to your modifications. Still, do not hesitate to help us improve that ;-) Cheers, MPTCP GH Action bot Bot operated by Matthieu Baerts (NGI0 Core) ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH mptcp-net v3 0/2] mptcp: pm: fix ADD_ADDR timer infinite retry on option space shortage 2026-05-07 12:52 [PATCH mptcp-net v3 0/2] mptcp: pm: fix ADD_ADDR timer infinite retry on option space shortage Li Xiasong ` (2 preceding siblings ...) 2026-05-07 13:40 ` [PATCH mptcp-net v3 0/2] mptcp: pm: fix ADD_ADDR timer infinite retry on option space shortage MPTCP CI @ 2026-05-08 11:37 ` Matthieu Baerts 2026-05-08 11:47 ` Matthieu Baerts 4 siblings, 0 replies; 7+ messages in thread From: Matthieu Baerts @ 2026-05-08 11:37 UTC (permalink / raw) To: Li Xiasong, mptcp; +Cc: weiyongjun1, yuehaibing, zhangchangzhong Hi Li, On 07/05/2026 14:52, Li Xiasong wrote: > This series fixes a bug where the ADD_ADDR timer can keep rescheduling > indefinitely when TCP option space is insufficient, blocking subsequent > addresses in the endpoint list from being announced. > > The issue is reproducible when sending an IPv6 ADD_ADDR with port while > tcp_timestamps is enabled. In that case, option space may be exhausted on > pure ACKs, and ADD_ADDR retransmission can stall PM forward progress. > > Patch 1 clears the signal in this path, skips the matching ADD_ADDR > retransmission entry, and preserves PM state-machine progression. > > Note on concurrency: timer cancellation in this path is best-effort. > A concurrent add_timer callback that acquires pm.lock first may still > perform one final ADD_ADDR send attempt before cancel state is applied. > After cancel sets entry->retrans_times to ADD_ADDR_RETRANS_MAX, the > callback-side check prevents further ADD_ADDR retransmissions. > > Patch 2 adds a selftest to cover this behavior and prevent regressions. > > Patch summary: > - patch 1: mptcp: pm: fix ADD_ADDR timer infinite retry on option space > insufficient > - patch 2: selftests: mptcp: join: cover ADD_ADDR tx drop and list > progress > > Changes since v2: > - document the possible timer-cancel race window in the commit message > and code comments > - keep best-effort cancellation semantics (possible one extra send > attempt under lock-order race), while preventing further retries via > retrans_times gating > - selftests: remove the redundant ADD_ADDR tx-drop subtest that > duplicated existing coverage for list progression behavior > - link to v2: > https://lore.kernel.org/mptcp/20260430112026.343691-1-lixiasong1@huawei.com/ Thank you for the v3, it looks good to me! Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Cheers, Matt -- Sponsored by the NGI0 Core fund. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH mptcp-net v3 0/2] mptcp: pm: fix ADD_ADDR timer infinite retry on option space shortage 2026-05-07 12:52 [PATCH mptcp-net v3 0/2] mptcp: pm: fix ADD_ADDR timer infinite retry on option space shortage Li Xiasong ` (3 preceding siblings ...) 2026-05-08 11:37 ` Matthieu Baerts @ 2026-05-08 11:47 ` Matthieu Baerts 4 siblings, 0 replies; 7+ messages in thread From: Matthieu Baerts @ 2026-05-08 11:47 UTC (permalink / raw) To: Li Xiasong, mptcp; +Cc: weiyongjun1, yuehaibing, zhangchangzhong Hi Li, On 07/05/2026 14:52, Li Xiasong wrote: > This series fixes a bug where the ADD_ADDR timer can keep rescheduling > indefinitely when TCP option space is insufficient, blocking subsequent > addresses in the endpoint list from being announced. > > The issue is reproducible when sending an IPv6 ADD_ADDR with port while > tcp_timestamps is enabled. In that case, option space may be exhausted on > pure ACKs, and ADD_ADDR retransmission can stall PM forward progress. Now in our tree: New patches for t/upstream-net and t/upstream: - 89d79b9cc99b: mptcp: pm: fix ADD_ADDR timer infinite retry on option space insufficient - 330dfb1d1df7: selftests: mptcp: join: cover ADD_ADDR tx drop and list progress - Results: 112def6c73e7..67f6d956336b (export-net) - Results: 3fc124e79727..1b5f90649a5a (export) Tests are now in progress: - export-net: https://github.com/multipath-tcp/mptcp_net-next/commit/292cba7c4eb827d0bf79fa02422e6c3daacf8229/checks - export: https://github.com/multipath-tcp/mptcp_net-next/commit/63b133728231ebba5167bd1e53dda9bcf0bee7c7/checks Cheers, Matt -- Sponsored by the NGI0 Core fund. ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2026-05-08 11:47 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-05-07 12:52 [PATCH mptcp-net v3 0/2] mptcp: pm: fix ADD_ADDR timer infinite retry on option space shortage Li Xiasong 2026-05-07 12:52 ` [PATCH mptcp-net v3 1/2] mptcp: pm: fix ADD_ADDR timer infinite retry on option space insufficient Li Xiasong 2026-05-07 12:52 ` [PATCH mptcp-net v3 2/2] selftests: mptcp: join: cover ADD_ADDR tx drop and list progress Li Xiasong 2026-05-08 11:37 ` Matthieu Baerts 2026-05-07 13:40 ` [PATCH mptcp-net v3 0/2] mptcp: pm: fix ADD_ADDR timer infinite retry on option space shortage MPTCP CI 2026-05-08 11:37 ` Matthieu Baerts 2026-05-08 11:47 ` Matthieu Baerts
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.