* [PATCH mptcp-next v3 0/3] mptcp: improve ADD_ADDR timeout handling
@ 2025-08-06 3:17 Geliang Tang
2025-08-06 3:17 ` [PATCH mptcp-next v3 1/3] mptcp: remove duplicate sk_reset_timer call Geliang Tang
` (3 more replies)
0 siblings, 4 replies; 19+ messages in thread
From: Geliang Tang @ 2025-08-06 3:17 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
From: Geliang Tang <tanggeliang@kylinos.cn>
v3:
- a new patch, remove duplicate sk_reset_timer call
- use __tcp_set_rto() * 2 instead of icsk->icsk_rto
- update selftests.
Based-on: <cover.1754449726.git.tanggeliang@kylinos.cn>
v2:
- do not rename add_addr_timeout
- use icsk->icsk_rto instead of tp->srtt_us
- increase the time after each retransmission
- https://patchwork.kernel.org/project/mptcp/patch/40706ad511220729207ccd7cf48e320e4d0d7dea.1754028199.git.tanggeliang@kylinos.cn/
v1:
- https://patchwork.kernel.org/project/mptcp/cover/cover.1753777199.git.tanggeliang@kylinos.cn/
Geliang Tang (3):
mptcp: remove duplicate sk_reset_timer call
mptcp: make ADD_ADDR retransmission timeout adaptive
selftests: mptcp: remove add_addr_timeout settings
Documentation/networking/mptcp-sysctl.rst | 4 +--
net/mptcp/pm.c | 36 +++++++++++++++----
.../testing/selftests/net/mptcp/mptcp_join.sh | 3 --
3 files changed, 31 insertions(+), 12 deletions(-)
--
2.48.1
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH mptcp-next v3 1/3] mptcp: remove duplicate sk_reset_timer call
2025-08-06 3:17 [PATCH mptcp-next v3 0/3] mptcp: improve ADD_ADDR timeout handling Geliang Tang
@ 2025-08-06 3:17 ` Geliang Tang
2025-08-06 15:31 ` Matthieu Baerts
2025-08-06 3:17 ` [PATCH mptcp-next v3 2/3] mptcp: make ADD_ADDR retransmission timeout adaptive Geliang Tang
` (2 subsequent siblings)
3 siblings, 1 reply; 19+ messages in thread
From: Geliang Tang @ 2025-08-06 3:17 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
From: Geliang Tang <tanggeliang@kylinos.cn>
sk_reset_timer() was called twice in mptcp_pm_alloc_anno_list. Simplify the
code by using a 'goto' statement to eliminate the duplication.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
net/mptcp/pm.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index 5dec018fdda2..1330af6a2154 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -348,7 +348,6 @@ bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk,
{
struct mptcp_pm_add_entry *add_entry = NULL;
struct sock *sk = (struct sock *)msk;
- struct net *net = sock_net(sk);
lockdep_assert_held(&msk->pm.lock);
@@ -358,9 +357,7 @@ bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk,
if (WARN_ON_ONCE(mptcp_pm_is_kernel(msk)))
return false;
- sk_reset_timer(sk, &add_entry->add_timer,
- jiffies + mptcp_get_add_addr_timeout(net));
- return true;
+ goto out;
}
add_entry = kmalloc(sizeof(*add_entry), GFP_ATOMIC);
@@ -374,8 +371,9 @@ bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk,
add_entry->retrans_times = 0;
timer_setup(&add_entry->add_timer, mptcp_pm_add_timer, 0);
+out:
sk_reset_timer(sk, &add_entry->add_timer,
- jiffies + mptcp_get_add_addr_timeout(net));
+ jiffies + mptcp_get_add_addr_timeout(sock_net(sk)));
return true;
}
--
2.48.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH mptcp-next v3 2/3] mptcp: make ADD_ADDR retransmission timeout adaptive
2025-08-06 3:17 [PATCH mptcp-next v3 0/3] mptcp: improve ADD_ADDR timeout handling Geliang Tang
2025-08-06 3:17 ` [PATCH mptcp-next v3 1/3] mptcp: remove duplicate sk_reset_timer call Geliang Tang
@ 2025-08-06 3:17 ` Geliang Tang
2025-08-06 15:34 ` Matthieu Baerts
2025-08-06 3:17 ` [PATCH mptcp-next v3 3/3] selftests: mptcp: remove add_addr_timeout settings Geliang Tang
2025-08-06 15:29 ` [PATCH mptcp-next v3 0/3] mptcp: improve ADD_ADDR timeout handling MPTCP CI
3 siblings, 1 reply; 19+ messages in thread
From: Geliang Tang @ 2025-08-06 3:17 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang, Christoph Paasch
From: Geliang Tang <tanggeliang@kylinos.cn>
Replace the fixed ADD_ADDR retransmission timeout with an adaptive
mechanism that uses the maximum RTO among existing subflows. This
improves responsiveness when establishing new subflows while maintaining
an upper bound through the configured add_addr_timeout value.
Key changes:
1. Compute timeout as min(max_subflow_RTO, configured_max_timeout)
2. Apply exponential backoff based on retransmission count
3. Maintain fallback to configured max timeout when no subflows exist
4. Update documentation to clarify add_addr_timeout is now the maximum
value
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/576
Reviewed-by: Christoph Paasch <cpaasch@openai.com>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
Documentation/networking/mptcp-sysctl.rst | 4 +--
net/mptcp/pm.c | 30 ++++++++++++++++++++---
2 files changed, 29 insertions(+), 5 deletions(-)
diff --git a/Documentation/networking/mptcp-sysctl.rst b/Documentation/networking/mptcp-sysctl.rst
index 1683c139821e..2b8e8fb99ee2 100644
--- a/Documentation/networking/mptcp-sysctl.rst
+++ b/Documentation/networking/mptcp-sysctl.rst
@@ -8,8 +8,8 @@ MPTCP Sysfs variables
===============================
add_addr_timeout - INTEGER (seconds)
- Set the timeout after which an ADD_ADDR control message will be
- resent to an MPTCP peer that has not acknowledged a previous
+ Set the maximum value of timeout after which an ADD_ADDR control message
+ will be resent to an MPTCP peer that has not acknowledged a previous
ADD_ADDR message.
Do not retransmit if set to 0.
diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index 1330af6a2154..34cc4f45bc7d 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -268,6 +268,30 @@ int mptcp_pm_mp_prio_send_ack(struct mptcp_sock *msk,
return -EINVAL;
}
+static unsigned int mptcp_adjust_add_addr_timeout(struct mptcp_sock *msk)
+{
+ struct mptcp_subflow_context *subflow;
+ struct sock *sk = (struct sock *)msk;
+ unsigned int max_rto = 0;
+ unsigned int timeout;
+
+ timeout = mptcp_get_add_addr_timeout(sock_net(sk));
+
+ mptcp_for_each_subflow(msk, subflow) {
+ struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+ unsigned int subflow_rto;
+
+ subflow_rto = __tcp_set_rto(tcp_sk(ssk)) * 2;
+ if (subflow_rto > max_rto)
+ max_rto = subflow_rto;
+ }
+
+ if (max_rto && max_rto < timeout)
+ timeout = max_rto;
+
+ return timeout;
+}
+
static void mptcp_pm_add_timer(struct timer_list *timer)
{
struct mptcp_pm_add_entry *entry = timer_container_of(entry, timer,
@@ -292,7 +316,7 @@ static void mptcp_pm_add_timer(struct timer_list *timer)
goto out;
}
- timeout = mptcp_get_add_addr_timeout(sock_net(sk));
+ timeout = mptcp_adjust_add_addr_timeout(msk);
if (!timeout)
goto out;
@@ -307,7 +331,7 @@ static void mptcp_pm_add_timer(struct timer_list *timer)
if (entry->retrans_times < ADD_ADDR_RETRANS_MAX)
sk_reset_timer(sk, timer,
- jiffies + timeout);
+ jiffies + (timeout << entry->retrans_times));
spin_unlock_bh(&msk->pm.lock);
@@ -373,7 +397,7 @@ bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk,
timer_setup(&add_entry->add_timer, mptcp_pm_add_timer, 0);
out:
sk_reset_timer(sk, &add_entry->add_timer,
- jiffies + mptcp_get_add_addr_timeout(sock_net(sk)));
+ jiffies + mptcp_adjust_add_addr_timeout(msk));
return true;
}
--
2.48.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH mptcp-next v3 3/3] selftests: mptcp: remove add_addr_timeout settings
2025-08-06 3:17 [PATCH mptcp-next v3 0/3] mptcp: improve ADD_ADDR timeout handling Geliang Tang
2025-08-06 3:17 ` [PATCH mptcp-next v3 1/3] mptcp: remove duplicate sk_reset_timer call Geliang Tang
2025-08-06 3:17 ` [PATCH mptcp-next v3 2/3] mptcp: make ADD_ADDR retransmission timeout adaptive Geliang Tang
@ 2025-08-06 3:17 ` Geliang Tang
2025-08-06 15:29 ` [PATCH mptcp-next v3 0/3] mptcp: improve ADD_ADDR timeout handling MPTCP CI
3 siblings, 0 replies; 19+ messages in thread
From: Geliang Tang @ 2025-08-06 3:17 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
From: Geliang Tang <tanggeliang@kylinos.cn>
Now that add_addr_timeout can be dynamically adjusted, there is no need
to set specific timeout values in the mptcp_join.sh tests. This patch
removes the explicit sysctl settings for net.mptcp.add_addr_timeout
from the test scripts.
The change simplifies the test setup and ensures the tests work with
the default or dynamically adjusted timeout values.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
tools/testing/selftests/net/mptcp/mptcp_join.sh | 3 ---
1 file changed, 3 deletions(-)
diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
index 82cae37d9c20..e85bb62046e0 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -347,8 +347,6 @@ reset_with_add_addr_timeout()
tables="${ip6tables}"
fi
- ip netns exec $ns1 sysctl -q net.mptcp.add_addr_timeout=1
-
if ! ip netns exec $ns2 $tables -A OUTPUT -p tcp \
-m tcp --tcp-option 30 \
-m bpf --bytecode \
@@ -2183,7 +2181,6 @@ signal_address_tests()
pm_nl_add_endpoint $ns2 10.0.4.2 flags signal
# the peer could possibly miss some addr notification, allow retransmission
- ip netns exec $ns1 sysctl -q net.mptcp.add_addr_timeout=1
speed=slow \
run_tests $ns1 $ns2 10.0.1.1
--
2.48.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH mptcp-next v3 0/3] mptcp: improve ADD_ADDR timeout handling
2025-08-06 3:17 [PATCH mptcp-next v3 0/3] mptcp: improve ADD_ADDR timeout handling Geliang Tang
` (2 preceding siblings ...)
2025-08-06 3:17 ` [PATCH mptcp-next v3 3/3] selftests: mptcp: remove add_addr_timeout settings Geliang Tang
@ 2025-08-06 15:29 ` MPTCP CI
2025-08-06 15:36 ` Matthieu Baerts
3 siblings, 1 reply; 19+ messages in thread
From: MPTCP CI @ 2025-08-06 15:29 UTC (permalink / raw)
To: Geliang Tang; +Cc: mptcp
Hi Geliang,
Thank you for your modifications, that's great!
Our CI did some validations and here is its report:
- KVM Validation: normal: Unstable: 2 failed test(s): packetdrill_add_addr selftest_mptcp_connect 🔴
- KVM Validation: debug: Unstable: 1 failed test(s): selftest_mptcp_connect 🔴
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/16778676332
Initiator: Matthieu Baerts (NGI0)
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/287779d926ff
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=988606
If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:
$ cd [kernel source code]
$ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
--pull always mptcp/mptcp-upstream-virtme-docker:latest \
auto-normal
For more details:
https://github.com/multipath-tcp/mptcp-upstream-virtme-docker
Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)
Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH mptcp-next v3 1/3] mptcp: remove duplicate sk_reset_timer call
2025-08-06 3:17 ` [PATCH mptcp-next v3 1/3] mptcp: remove duplicate sk_reset_timer call Geliang Tang
@ 2025-08-06 15:31 ` Matthieu Baerts
0 siblings, 0 replies; 19+ messages in thread
From: Matthieu Baerts @ 2025-08-06 15:31 UTC (permalink / raw)
To: Geliang Tang, mptcp; +Cc: Geliang Tang
Hi Geliang,
On 06/08/2025 05:17, Geliang Tang wrote:
> From: Geliang Tang <tanggeliang@kylinos.cn>
>
> sk_reset_timer() was called twice in mptcp_pm_alloc_anno_list. Simplify the
> code by using a 'goto' statement to eliminate the duplication.
>
> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
> ---
> net/mptcp/pm.c | 8 +++-----
> 1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
> index 5dec018fdda2..1330af6a2154 100644
> --- a/net/mptcp/pm.c
> +++ b/net/mptcp/pm.c
> @@ -348,7 +348,6 @@ bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk,
> {
> struct mptcp_pm_add_entry *add_entry = NULL;
> struct sock *sk = (struct sock *)msk;
> - struct net *net = sock_net(sk);
>
> lockdep_assert_held(&msk->pm.lock);
>
> @@ -358,9 +357,7 @@ bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk,
> if (WARN_ON_ONCE(mptcp_pm_is_kernel(msk)))
> return false;
>
> - sk_reset_timer(sk, &add_entry->add_timer,
> - jiffies + mptcp_get_add_addr_timeout(net));
> - return true;
> + goto out;
Good idea, but maybe better to have a more explicit label, e.g. reset_timer.
> }
>
> add_entry = kmalloc(sizeof(*add_entry), GFP_ATOMIC);
> @@ -374,8 +371,9 @@ bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk,
> add_entry->retrans_times = 0;
>
> timer_setup(&add_entry->add_timer, mptcp_pm_add_timer, 0);
> +out:
> sk_reset_timer(sk, &add_entry->add_timer,
> - jiffies + mptcp_get_add_addr_timeout(net));
> + jiffies + mptcp_get_add_addr_timeout(sock_net(sk)));
>
> return true;
> }
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH mptcp-next v3 2/3] mptcp: make ADD_ADDR retransmission timeout adaptive
2025-08-06 3:17 ` [PATCH mptcp-next v3 2/3] mptcp: make ADD_ADDR retransmission timeout adaptive Geliang Tang
@ 2025-08-06 15:34 ` Matthieu Baerts
2025-08-07 3:25 ` Geliang Tang
0 siblings, 1 reply; 19+ messages in thread
From: Matthieu Baerts @ 2025-08-06 15:34 UTC (permalink / raw)
To: Geliang Tang, mptcp; +Cc: Geliang Tang, Christoph Paasch
Hi Geliang,
On 06/08/2025 05:17, Geliang Tang wrote:
> From: Geliang Tang <tanggeliang@kylinos.cn>
>
> Replace the fixed ADD_ADDR retransmission timeout with an adaptive
> mechanism that uses the maximum RTO among existing subflows. This
> improves responsiveness when establishing new subflows while maintaining
> an upper bound through the configured add_addr_timeout value.
>
> Key changes:
> 1. Compute timeout as min(max_subflow_RTO, configured_max_timeout)
> 2. Apply exponential backoff based on retransmission count
> 3. Maintain fallback to configured max timeout when no subflows exist
> 4. Update documentation to clarify add_addr_timeout is now the maximum
> value
>
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/576
> Reviewed-by: Christoph Paasch <cpaasch@openai.com>
> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
> ---
> Documentation/networking/mptcp-sysctl.rst | 4 +--
> net/mptcp/pm.c | 30 ++++++++++++++++++++---
> 2 files changed, 29 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/networking/mptcp-sysctl.rst b/Documentation/networking/mptcp-sysctl.rst
> index 1683c139821e..2b8e8fb99ee2 100644
> --- a/Documentation/networking/mptcp-sysctl.rst
> +++ b/Documentation/networking/mptcp-sysctl.rst
> @@ -8,8 +8,8 @@ MPTCP Sysfs variables
> ===============================
>
> add_addr_timeout - INTEGER (seconds)
> - Set the timeout after which an ADD_ADDR control message will be
> - resent to an MPTCP peer that has not acknowledged a previous
> + Set the maximum value of timeout after which an ADD_ADDR control message
> + will be resent to an MPTCP peer that has not acknowledged a previous
> ADD_ADDR message.
>
> Do not retransmit if set to 0.
> diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
> index 1330af6a2154..34cc4f45bc7d 100644
> --- a/net/mptcp/pm.c
> +++ b/net/mptcp/pm.c
> @@ -268,6 +268,30 @@ int mptcp_pm_mp_prio_send_ack(struct mptcp_sock *msk,
> return -EINVAL;
> }
>
> +static unsigned int mptcp_adjust_add_addr_timeout(struct mptcp_sock *msk)
> +{
> + struct mptcp_subflow_context *subflow;
> + struct sock *sk = (struct sock *)msk;
> + unsigned int max_rto = 0;
> + unsigned int timeout;
> +
> + timeout = mptcp_get_add_addr_timeout(sock_net(sk));
> +
> + mptcp_for_each_subflow(msk, subflow) {
> + struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
> + unsigned int subflow_rto;
> +
> + subflow_rto = __tcp_set_rto(tcp_sk(ssk)) * 2;
Why "* 2"? Is it what TCP is doing?
(Probably best to add a comment here above, and in the commit message.)
(Also, I guess the compiler will do the optimisation, but you could
write: ">> 1" instead of "* 2".
> + if (subflow_rto > max_rto)
> + max_rto = subflow_rto;
> + }
> +
> + if (max_rto && max_rto < timeout)
> + timeout = max_rto;
> +
> + return timeout;
> +}
> +
> static void mptcp_pm_add_timer(struct timer_list *timer)
> {
> struct mptcp_pm_add_entry *entry = timer_container_of(entry, timer,
> @@ -292,7 +316,7 @@ static void mptcp_pm_add_timer(struct timer_list *timer)
> goto out;
> }
>
> - timeout = mptcp_get_add_addr_timeout(sock_net(sk));
> + timeout = mptcp_adjust_add_addr_timeout(msk);
> if (!timeout)
> goto out;
>
> @@ -307,7 +331,7 @@ static void mptcp_pm_add_timer(struct timer_list *timer)
>
> if (entry->retrans_times < ADD_ADDR_RETRANS_MAX)
> sk_reset_timer(sk, timer,
> - jiffies + timeout);
> + jiffies + (timeout << entry->retrans_times));
>
> spin_unlock_bh(&msk->pm.lock);
>
> @@ -373,7 +397,7 @@ bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk,
> timer_setup(&add_entry->add_timer, mptcp_pm_add_timer, 0);
> out:
> sk_reset_timer(sk, &add_entry->add_timer,
> - jiffies + mptcp_get_add_addr_timeout(sock_net(sk)));
> + jiffies + mptcp_adjust_add_addr_timeout(msk));
>
> return true;
> }
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH mptcp-next v3 0/3] mptcp: improve ADD_ADDR timeout handling
2025-08-06 15:29 ` [PATCH mptcp-next v3 0/3] mptcp: improve ADD_ADDR timeout handling MPTCP CI
@ 2025-08-06 15:36 ` Matthieu Baerts
2025-08-07 5:51 ` Geliang Tang
0 siblings, 1 reply; 19+ messages in thread
From: Matthieu Baerts @ 2025-08-06 15:36 UTC (permalink / raw)
To: mptcp, Geliang Tang
Hi Geliang,
On 06/08/2025 17:29, MPTCP CI wrote:
> Hi Geliang,
>
> Thank you for your modifications, that's great!
>
> Our CI did some validations and here is its report:
>
> - KVM Validation: normal: Unstable: 2 failed test(s): packetdrill_add_addr selftest_mptcp_connect 🔴
Don't forget to update the packetdrill tests.
Also, if you are still using mptcp-upstream-virtme-docker, please make
sure you are using a recent version where the tolerances are now only
increased in debug mode.
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH mptcp-next v3 2/3] mptcp: make ADD_ADDR retransmission timeout adaptive
2025-08-06 15:34 ` Matthieu Baerts
@ 2025-08-07 3:25 ` Geliang Tang
2025-08-07 11:56 ` Matthieu Baerts
0 siblings, 1 reply; 19+ messages in thread
From: Geliang Tang @ 2025-08-07 3:25 UTC (permalink / raw)
To: Matthieu Baerts, mptcp; +Cc: Geliang Tang, Christoph Paasch
Hi Matt,
On Wed, 2025-08-06 at 17:34 +0200, Matthieu Baerts wrote:
> Hi Geliang,
>
> On 06/08/2025 05:17, Geliang Tang wrote:
> > From: Geliang Tang <tanggeliang@kylinos.cn>
> >
> > Replace the fixed ADD_ADDR retransmission timeout with an adaptive
> > mechanism that uses the maximum RTO among existing subflows. This
> > improves responsiveness when establishing new subflows while
> > maintaining
> > an upper bound through the configured add_addr_timeout value.
> >
> > Key changes:
> > 1. Compute timeout as min(max_subflow_RTO, configured_max_timeout)
> > 2. Apply exponential backoff based on retransmission count
> > 3. Maintain fallback to configured max timeout when no subflows
> > exist
> > 4. Update documentation to clarify add_addr_timeout is now the
> > maximum
> > value
> >
> > Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/576
> > Reviewed-by: Christoph Paasch <cpaasch@openai.com>
> > Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
> > ---
> > Documentation/networking/mptcp-sysctl.rst | 4 +--
> > net/mptcp/pm.c | 30
> > ++++++++++++++++++++---
> > 2 files changed, 29 insertions(+), 5 deletions(-)
> >
> > diff --git a/Documentation/networking/mptcp-sysctl.rst
> > b/Documentation/networking/mptcp-sysctl.rst
> > index 1683c139821e..2b8e8fb99ee2 100644
> > --- a/Documentation/networking/mptcp-sysctl.rst
> > +++ b/Documentation/networking/mptcp-sysctl.rst
> > @@ -8,8 +8,8 @@ MPTCP Sysfs variables
> > ===============================
> >
> > add_addr_timeout - INTEGER (seconds)
> > - Set the timeout after which an ADD_ADDR control message
> > will be
> > - resent to an MPTCP peer that has not acknowledged a
> > previous
> > + Set the maximum value of timeout after which an ADD_ADDR
> > control message
> > + will be resent to an MPTCP peer that has not acknowledged
> > a previous
> > ADD_ADDR message.
> >
> > Do not retransmit if set to 0.
> > diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
> > index 1330af6a2154..34cc4f45bc7d 100644
> > --- a/net/mptcp/pm.c
> > +++ b/net/mptcp/pm.c
> > @@ -268,6 +268,30 @@ int mptcp_pm_mp_prio_send_ack(struct
> > mptcp_sock *msk,
> > return -EINVAL;
> > }
> >
> > +static unsigned int mptcp_adjust_add_addr_timeout(struct
> > mptcp_sock *msk)
> > +{
> > + struct mptcp_subflow_context *subflow;
> > + struct sock *sk = (struct sock *)msk;
> > + unsigned int max_rto = 0;
> > + unsigned int timeout;
> > +
> > + timeout = mptcp_get_add_addr_timeout(sock_net(sk));
> > +
> > + mptcp_for_each_subflow(msk, subflow) {
> > + struct sock *ssk =
> > mptcp_subflow_tcp_sock(subflow);
> > + unsigned int subflow_rto;
> > +
> > + subflow_rto = __tcp_set_rto(tcp_sk(ssk)) * 2;
>
> Why "* 2"? Is it what TCP is doing?
>
> (Probably best to add a comment here above, and in the commit
> message.)
>
> (Also, I guess the compiler will do the optimisation, but you could
> write: ">> 1" instead of "* 2".
Yes, "<< 1" is much better. I added a comment here:
+ /*
+ * icsk->icsk_rto is not used directly since
+ * it's not updated frequently enough.
+ *
+ * subflow_rto is twice of rto, and allow
+ * some tolerance for add_addr echo.
+ */
+ subflow_rto = __tcp_set_rto(tcp_sk(ssk)) << 1;
Is this okay?
Thanks,
-Geliang
>
> > + if (subflow_rto > max_rto)
> > + max_rto = subflow_rto;
> > + }
> > +
> > + if (max_rto && max_rto < timeout)
> > + timeout = max_rto;
> > +
> > + return timeout;
> > +}
> > +
> > static void mptcp_pm_add_timer(struct timer_list *timer)
> > {
> > struct mptcp_pm_add_entry *entry =
> > timer_container_of(entry, timer,
> > @@ -292,7 +316,7 @@ static void mptcp_pm_add_timer(struct
> > timer_list *timer)
> > goto out;
> > }
> >
> > - timeout = mptcp_get_add_addr_timeout(sock_net(sk));
> > + timeout = mptcp_adjust_add_addr_timeout(msk);
> > if (!timeout)
> > goto out;
> >
> > @@ -307,7 +331,7 @@ static void mptcp_pm_add_timer(struct
> > timer_list *timer)
> >
> > if (entry->retrans_times < ADD_ADDR_RETRANS_MAX)
> > sk_reset_timer(sk, timer,
> > - jiffies + timeout);
> > + jiffies + (timeout << entry-
> > >retrans_times));
> >
> > spin_unlock_bh(&msk->pm.lock);
> >
> > @@ -373,7 +397,7 @@ bool mptcp_pm_alloc_anno_list(struct mptcp_sock
> > *msk,
> > timer_setup(&add_entry->add_timer, mptcp_pm_add_timer, 0);
> > out:
> > sk_reset_timer(sk, &add_entry->add_timer,
> > - jiffies +
> > mptcp_get_add_addr_timeout(sock_net(sk)));
> > + jiffies +
> > mptcp_adjust_add_addr_timeout(msk));
> >
> > return true;
> > }
>
> Cheers,
> Matt
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH mptcp-next v3 0/3] mptcp: improve ADD_ADDR timeout handling
2025-08-06 15:36 ` Matthieu Baerts
@ 2025-08-07 5:51 ` Geliang Tang
2025-08-07 11:56 ` Matthieu Baerts
0 siblings, 1 reply; 19+ messages in thread
From: Geliang Tang @ 2025-08-07 5:51 UTC (permalink / raw)
To: Matthieu Baerts, mptcp, Geliang Tang
Hi Matt,
On Wed, 2025-08-06 at 17:36 +0200, Matthieu Baerts wrote:
> Hi Geliang,
>
> On 06/08/2025 17:29, MPTCP CI wrote:
> > Hi Geliang,
> >
> > Thank you for your modifications, that's great!
> >
> > Our CI did some validations and here is its report:
> >
> > - KVM Validation: normal: Unstable: 2 failed test(s):
> > packetdrill_add_addr selftest_mptcp_connect 🔴
>
> Don't forget to update the packetdrill tests.
>
> Also, if you are still using mptcp-upstream-virtme-docker, please
> make
> sure you are using a recent version where the tolerances are now only
> increased in debug mode.
Here it is:
https://github.com/multipath-tcp/packetdrill/pull/166
Thanks,
-Geliang
>
> Cheers,
> Matt
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH mptcp-next v3 2/3] mptcp: make ADD_ADDR retransmission timeout adaptive
2025-08-07 3:25 ` Geliang Tang
@ 2025-08-07 11:56 ` Matthieu Baerts
2025-08-10 1:13 ` Geliang Tang
0 siblings, 1 reply; 19+ messages in thread
From: Matthieu Baerts @ 2025-08-07 11:56 UTC (permalink / raw)
To: Geliang Tang, mptcp; +Cc: Geliang Tang, Christoph Paasch
Hi Geliang,
On 07/08/2025 05:25, Geliang Tang wrote:
> Hi Matt,
>
> On Wed, 2025-08-06 at 17:34 +0200, Matthieu Baerts wrote:
>> Hi Geliang,
>>
>> On 06/08/2025 05:17, Geliang Tang wrote:
>>> From: Geliang Tang <tanggeliang@kylinos.cn>
>>>
>>> Replace the fixed ADD_ADDR retransmission timeout with an adaptive
>>> mechanism that uses the maximum RTO among existing subflows. This
>>> improves responsiveness when establishing new subflows while
>>> maintaining
>>> an upper bound through the configured add_addr_timeout value.
>>>
>>> Key changes:
>>> 1. Compute timeout as min(max_subflow_RTO, configured_max_timeout)
>>> 2. Apply exponential backoff based on retransmission count
>>> 3. Maintain fallback to configured max timeout when no subflows
>>> exist
>>> 4. Update documentation to clarify add_addr_timeout is now the
>>> maximum
>>> value
>>>
>>> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/576
>>> Reviewed-by: Christoph Paasch <cpaasch@openai.com>
>>> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
>>> ---
>>> Documentation/networking/mptcp-sysctl.rst | 4 +--
>>> net/mptcp/pm.c | 30
>>> ++++++++++++++++++++---
>>> 2 files changed, 29 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/Documentation/networking/mptcp-sysctl.rst
>>> b/Documentation/networking/mptcp-sysctl.rst
>>> index 1683c139821e..2b8e8fb99ee2 100644
>>> --- a/Documentation/networking/mptcp-sysctl.rst
>>> +++ b/Documentation/networking/mptcp-sysctl.rst
>>> @@ -8,8 +8,8 @@ MPTCP Sysfs variables
>>> ===============================
>>>
>>> add_addr_timeout - INTEGER (seconds)
>>> - Set the timeout after which an ADD_ADDR control message
>>> will be
>>> - resent to an MPTCP peer that has not acknowledged a
>>> previous
>>> + Set the maximum value of timeout after which an ADD_ADDR
>>> control message
>>> + will be resent to an MPTCP peer that has not acknowledged
>>> a previous
>>> ADD_ADDR message.
>>>
>>> Do not retransmit if set to 0.
>>> diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
>>> index 1330af6a2154..34cc4f45bc7d 100644
>>> --- a/net/mptcp/pm.c
>>> +++ b/net/mptcp/pm.c
>>> @@ -268,6 +268,30 @@ int mptcp_pm_mp_prio_send_ack(struct
>>> mptcp_sock *msk,
>>> return -EINVAL;
>>> }
>>>
>>> +static unsigned int mptcp_adjust_add_addr_timeout(struct
>>> mptcp_sock *msk)
>>> +{
>>> + struct mptcp_subflow_context *subflow;
>>> + struct sock *sk = (struct sock *)msk;
>>> + unsigned int max_rto = 0;
>>> + unsigned int timeout;
>>> +
>>> + timeout = mptcp_get_add_addr_timeout(sock_net(sk));
>>> +
>>> + mptcp_for_each_subflow(msk, subflow) {
>>> + struct sock *ssk =
>>> mptcp_subflow_tcp_sock(subflow);
>>> + unsigned int subflow_rto;
>>> +
>>> + subflow_rto = __tcp_set_rto(tcp_sk(ssk)) * 2;
>>
>> Why "* 2"? Is it what TCP is doing?
>>
>> (Probably best to add a comment here above, and in the commit
>> message.)
>>
>> (Also, I guess the compiler will do the optimisation, but you could
>> write: ">> 1" instead of "* 2".
>
> Yes, "<< 1" is much better. I added a comment here:
>
> + /*
> + * icsk->icsk_rto is not used directly since
> + * it's not updated frequently enough.
> + *
> + * subflow_rto is twice of rto, and allow
> + * some tolerance for add_addr echo.
> + */
> + subflow_rto = __tcp_set_rto(tcp_sk(ssk)) << 1;
>
> Is this okay?
No :)
I don't think the comment explain why it is twice the default TCP RTO.
Why having an exception here for the ADD_ADDR? I mean: I'm OK with the
idea of doubling the RTO, but only if there is a reason behind that :)
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH mptcp-next v3 0/3] mptcp: improve ADD_ADDR timeout handling
2025-08-07 5:51 ` Geliang Tang
@ 2025-08-07 11:56 ` Matthieu Baerts
0 siblings, 0 replies; 19+ messages in thread
From: Matthieu Baerts @ 2025-08-07 11:56 UTC (permalink / raw)
To: Geliang Tang, mptcp, Geliang Tang
Hi Geliang,
On 07/08/2025 07:51, Geliang Tang wrote:
> Hi Matt,
>
> On Wed, 2025-08-06 at 17:36 +0200, Matthieu Baerts wrote:
>> Hi Geliang,
>>
>> On 06/08/2025 17:29, MPTCP CI wrote:
>>> Hi Geliang,
>>>
>>> Thank you for your modifications, that's great!
>>>
>>> Our CI did some validations and here is its report:
>>>
>>> - KVM Validation: normal: Unstable: 2 failed test(s):
>>> packetdrill_add_addr selftest_mptcp_connect 🔴
>>
>> Don't forget to update the packetdrill tests.
>>
>> Also, if you are still using mptcp-upstream-virtme-docker, please
>> make
>> sure you are using a recent version where the tolerances are now only
>> increased in debug mode.
>
> Here it is:
>
> https://github.com/multipath-tcp/packetdrill/pull/166
Thank you!
I will check ASAP (but I'm still travelling).
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH mptcp-next v3 2/3] mptcp: make ADD_ADDR retransmission timeout adaptive
2025-08-07 11:56 ` Matthieu Baerts
@ 2025-08-10 1:13 ` Geliang Tang
2025-08-11 14:51 ` Matthieu Baerts
0 siblings, 1 reply; 19+ messages in thread
From: Geliang Tang @ 2025-08-10 1:13 UTC (permalink / raw)
To: Matthieu Baerts, mptcp; +Cc: Geliang Tang, Christoph Paasch
Hi Matt,
On Thu, 2025-08-07 at 13:56 +0200, Matthieu Baerts wrote:
> Hi Geliang,
>
> On 07/08/2025 05:25, Geliang Tang wrote:
> > Hi Matt,
> >
> > On Wed, 2025-08-06 at 17:34 +0200, Matthieu Baerts wrote:
> > > Hi Geliang,
> > >
> > > On 06/08/2025 05:17, Geliang Tang wrote:
> > > > From: Geliang Tang <tanggeliang@kylinos.cn>
> > > >
> > > > Replace the fixed ADD_ADDR retransmission timeout with an
> > > > adaptive
> > > > mechanism that uses the maximum RTO among existing subflows.
> > > > This
> > > > improves responsiveness when establishing new subflows while
> > > > maintaining
> > > > an upper bound through the configured add_addr_timeout value.
> > > >
> > > > Key changes:
> > > > 1. Compute timeout as min(max_subflow_RTO,
> > > > configured_max_timeout)
> > > > 2. Apply exponential backoff based on retransmission count
> > > > 3. Maintain fallback to configured max timeout when no subflows
> > > > exist
> > > > 4. Update documentation to clarify add_addr_timeout is now the
> > > > maximum
> > > > value
> > > >
> > > > Closes:
> > > > https://github.com/multipath-tcp/mptcp_net-next/issues/576
> > > > Reviewed-by: Christoph Paasch <cpaasch@openai.com>
> > > > Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
> > > > ---
> > > > Documentation/networking/mptcp-sysctl.rst | 4 +--
> > > > net/mptcp/pm.c | 30
> > > > ++++++++++++++++++++---
> > > > 2 files changed, 29 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/Documentation/networking/mptcp-sysctl.rst
> > > > b/Documentation/networking/mptcp-sysctl.rst
> > > > index 1683c139821e..2b8e8fb99ee2 100644
> > > > --- a/Documentation/networking/mptcp-sysctl.rst
> > > > +++ b/Documentation/networking/mptcp-sysctl.rst
> > > > @@ -8,8 +8,8 @@ MPTCP Sysfs variables
> > > > ===============================
> > > >
> > > > add_addr_timeout - INTEGER (seconds)
> > > > - Set the timeout after which an ADD_ADDR control
> > > > message
> > > > will be
> > > > - resent to an MPTCP peer that has not acknowledged a
> > > > previous
> > > > + Set the maximum value of timeout after which an
> > > > ADD_ADDR
> > > > control message
> > > > + will be resent to an MPTCP peer that has not
> > > > acknowledged
> > > > a previous
> > > > ADD_ADDR message.
> > > >
> > > > Do not retransmit if set to 0.
> > > > diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
> > > > index 1330af6a2154..34cc4f45bc7d 100644
> > > > --- a/net/mptcp/pm.c
> > > > +++ b/net/mptcp/pm.c
> > > > @@ -268,6 +268,30 @@ int mptcp_pm_mp_prio_send_ack(struct
> > > > mptcp_sock *msk,
> > > > return -EINVAL;
> > > > }
> > > >
> > > > +static unsigned int mptcp_adjust_add_addr_timeout(struct
> > > > mptcp_sock *msk)
> > > > +{
> > > > + struct mptcp_subflow_context *subflow;
> > > > + struct sock *sk = (struct sock *)msk;
> > > > + unsigned int max_rto = 0;
> > > > + unsigned int timeout;
> > > > +
> > > > + timeout = mptcp_get_add_addr_timeout(sock_net(sk));
> > > > +
> > > > + mptcp_for_each_subflow(msk, subflow) {
> > > > + struct sock *ssk =
> > > > mptcp_subflow_tcp_sock(subflow);
> > > > + unsigned int subflow_rto;
> > > > +
> > > > + subflow_rto = __tcp_set_rto(tcp_sk(ssk)) * 2;
> > >
> > > Why "* 2"? Is it what TCP is doing?
> > >
> > > (Probably best to add a comment here above, and in the commit
> > > message.)
> > >
> > > (Also, I guess the compiler will do the optimisation, but you
> > > could
> > > write: ">> 1" instead of "* 2".
> >
> > Yes, "<< 1" is much better. I added a comment here:
> >
> > + /*
> > + * icsk->icsk_rto is not used directly since
> > + * it's not updated frequently enough.
> > + *
> > + * subflow_rto is twice of rto, and allow
> > + * some tolerance for add_addr echo.
> > + */
> > + subflow_rto = __tcp_set_rto(tcp_sk(ssk)) << 1;
> >
> > Is this okay?
>
> No :)
>
> I don't think the comment explain why it is twice the default TCP
> RTO.
> Why having an exception here for the ADD_ADDR? I mean: I'm OK with
> the
> idea of doubling the RTO, but only if there is a reason behind that
> :)
How about this one (answered by DeepSeek, an AI from China):
/*
* icsk->icsk_rto is not used directly since it's not
* updated frequently enough.
*
* Setting subflow_rto to twice of RTO provides a safety
* buffer for RTT estimation variance, kernel RTO update
* delays, and a full transmission retry cycle while
* minimizing false failures.
*/
subflow_rto = __tcp_set_rto(tcp_sk(ssk)) << 1;
Thanks,
-Geliang
>
> Cheers,
> Matt
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH mptcp-next v3 2/3] mptcp: make ADD_ADDR retransmission timeout adaptive
2025-08-10 1:13 ` Geliang Tang
@ 2025-08-11 14:51 ` Matthieu Baerts
2025-08-11 15:34 ` Christoph Paasch
0 siblings, 1 reply; 19+ messages in thread
From: Matthieu Baerts @ 2025-08-11 14:51 UTC (permalink / raw)
To: Geliang Tang, mptcp; +Cc: Geliang Tang, Christoph Paasch
Hi Geliang,
On 10/08/2025 03:13, Geliang Tang wrote:
> Hi Matt,
>
> On Thu, 2025-08-07 at 13:56 +0200, Matthieu Baerts wrote:
>> Hi Geliang,
>>
>> On 07/08/2025 05:25, Geliang Tang wrote:
>>> Hi Matt,
>>>
>>> On Wed, 2025-08-06 at 17:34 +0200, Matthieu Baerts wrote:
>>>> Hi Geliang,
>>>>
>>>> On 06/08/2025 05:17, Geliang Tang wrote:
>>>>> From: Geliang Tang <tanggeliang@kylinos.cn>
>>>>>
>>>>> Replace the fixed ADD_ADDR retransmission timeout with an
>>>>> adaptive
>>>>> mechanism that uses the maximum RTO among existing subflows.
>>>>> This
>>>>> improves responsiveness when establishing new subflows while
>>>>> maintaining
>>>>> an upper bound through the configured add_addr_timeout value.
>>>>>
>>>>> Key changes:
>>>>> 1. Compute timeout as min(max_subflow_RTO,
>>>>> configured_max_timeout)
>>>>> 2. Apply exponential backoff based on retransmission count
>>>>> 3. Maintain fallback to configured max timeout when no subflows
>>>>> exist
>>>>> 4. Update documentation to clarify add_addr_timeout is now the
>>>>> maximum
>>>>> value
>>>>>
>>>>> Closes:
>>>>> https://github.com/multipath-tcp/mptcp_net-next/issues/576
>>>>> Reviewed-by: Christoph Paasch <cpaasch@openai.com>
>>>>> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
>>>>> ---
>>>>> Documentation/networking/mptcp-sysctl.rst | 4 +--
>>>>> net/mptcp/pm.c | 30
>>>>> ++++++++++++++++++++---
>>>>> 2 files changed, 29 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/Documentation/networking/mptcp-sysctl.rst
>>>>> b/Documentation/networking/mptcp-sysctl.rst
>>>>> index 1683c139821e..2b8e8fb99ee2 100644
>>>>> --- a/Documentation/networking/mptcp-sysctl.rst
>>>>> +++ b/Documentation/networking/mptcp-sysctl.rst
>>>>> @@ -8,8 +8,8 @@ MPTCP Sysfs variables
>>>>> ===============================
>>>>>
>>>>> add_addr_timeout - INTEGER (seconds)
>>>>> - Set the timeout after which an ADD_ADDR control
>>>>> message
>>>>> will be
>>>>> - resent to an MPTCP peer that has not acknowledged a
>>>>> previous
>>>>> + Set the maximum value of timeout after which an
>>>>> ADD_ADDR
>>>>> control message
>>>>> + will be resent to an MPTCP peer that has not
>>>>> acknowledged
>>>>> a previous
>>>>> ADD_ADDR message.
>>>>>
>>>>> Do not retransmit if set to 0.
>>>>> diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
>>>>> index 1330af6a2154..34cc4f45bc7d 100644
>>>>> --- a/net/mptcp/pm.c
>>>>> +++ b/net/mptcp/pm.c
>>>>> @@ -268,6 +268,30 @@ int mptcp_pm_mp_prio_send_ack(struct
>>>>> mptcp_sock *msk,
>>>>> return -EINVAL;
>>>>> }
>>>>>
>>>>> +static unsigned int mptcp_adjust_add_addr_timeout(struct
>>>>> mptcp_sock *msk)
>>>>> +{
>>>>> + struct mptcp_subflow_context *subflow;
>>>>> + struct sock *sk = (struct sock *)msk;
>>>>> + unsigned int max_rto = 0;
>>>>> + unsigned int timeout;
>>>>> +
>>>>> + timeout = mptcp_get_add_addr_timeout(sock_net(sk));
>>>>> +
>>>>> + mptcp_for_each_subflow(msk, subflow) {
>>>>> + struct sock *ssk =
>>>>> mptcp_subflow_tcp_sock(subflow);
>>>>> + unsigned int subflow_rto;
>>>>> +
>>>>> + subflow_rto = __tcp_set_rto(tcp_sk(ssk)) * 2;
>>>>
>>>> Why "* 2"? Is it what TCP is doing?
>>>>
>>>> (Probably best to add a comment here above, and in the commit
>>>> message.)
>>>>
>>>> (Also, I guess the compiler will do the optimisation, but you
>>>> could
>>>> write: ">> 1" instead of "* 2".
>>>
>>> Yes, "<< 1" is much better. I added a comment here:
>>>
>>> + /*
>>> + * icsk->icsk_rto is not used directly since
>>> + * it's not updated frequently enough.
>>> + *
>>> + * subflow_rto is twice of rto, and allow
>>> + * some tolerance for add_addr echo.
>>> + */
>>> + subflow_rto = __tcp_set_rto(tcp_sk(ssk)) << 1;
>>>
>>> Is this okay?
>>
>> No :)
>>
>> I don't think the comment explain why it is twice the default TCP
>> RTO.
>> Why having an exception here for the ADD_ADDR? I mean: I'm OK with
>> the
>> idea of doubling the RTO, but only if there is a reason behind that
>> :)
>
> How about this one (answered by DeepSeek, an AI from China):
>
> /*
> * icsk->icsk_rto is not used directly since it's not
> * updated frequently enough.
What about: (...) since it might not have been updated recently.
> * Setting subflow_rto to twice of RTO provides a safety
> * buffer for RTT estimation variance, kernel RTO update
> * delays, and a full transmission retry cycle while
> * minimizing false failures.
A bit long and not clear (typical AI, making commit messages too
verbose, it almost force me to use one to give me the summary :) )
What about:
Setting this RTO to twice of TCP one, not to be too agressive for this
notification that might take longer to process than "simple" data.
Still, is icsk_rto not already "long" enough by default? I mean: when I
see your description in the related packetdrill PR [1] where the delay
is supposed to be 0.01 sec, but "subflow_rto" (or was it isck_rto?) is
0.65 sec, that's quite different, and half of it would be more than
enough :)
But maybe that's different when it happens later during the connection?
[1] https://github.com/multipath-tcp/packetdrill/pull/166
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH mptcp-next v3 2/3] mptcp: make ADD_ADDR retransmission timeout adaptive
2025-08-11 14:51 ` Matthieu Baerts
@ 2025-08-11 15:34 ` Christoph Paasch
2025-08-12 8:33 ` Geliang Tang
2025-08-12 8:39 ` Matthieu Baerts
0 siblings, 2 replies; 19+ messages in thread
From: Christoph Paasch @ 2025-08-11 15:34 UTC (permalink / raw)
To: Matthieu Baerts; +Cc: Geliang Tang, mptcp, Geliang Tang
On Mon, Aug 11, 2025 at 7:51 AM Matthieu Baerts <matttbe@kernel.org> wrote:
>
> Hi Geliang,
>
> On 10/08/2025 03:13, Geliang Tang wrote:
> > Hi Matt,
> >
> > On Thu, 2025-08-07 at 13:56 +0200, Matthieu Baerts wrote:
> >> Hi Geliang,
> >>
> >> On 07/08/2025 05:25, Geliang Tang wrote:
> >>> Hi Matt,
> >>>
> >>> On Wed, 2025-08-06 at 17:34 +0200, Matthieu Baerts wrote:
> >>>> Hi Geliang,
> >>>>
> >>>> On 06/08/2025 05:17, Geliang Tang wrote:
> >>>>> From: Geliang Tang <tanggeliang@kylinos.cn>
> >>>>>
> >>>>> Replace the fixed ADD_ADDR retransmission timeout with an
> >>>>> adaptive
> >>>>> mechanism that uses the maximum RTO among existing subflows.
> >>>>> This
> >>>>> improves responsiveness when establishing new subflows while
> >>>>> maintaining
> >>>>> an upper bound through the configured add_addr_timeout value.
> >>>>>
> >>>>> Key changes:
> >>>>> 1. Compute timeout as min(max_subflow_RTO,
> >>>>> configured_max_timeout)
> >>>>> 2. Apply exponential backoff based on retransmission count
> >>>>> 3. Maintain fallback to configured max timeout when no subflows
> >>>>> exist
> >>>>> 4. Update documentation to clarify add_addr_timeout is now the
> >>>>> maximum
> >>>>> value
> >>>>>
> >>>>> Closes:
> >>>>> https://github.com/multipath-tcp/mptcp_net-next/issues/576
> >>>>> Reviewed-by: Christoph Paasch <cpaasch@openai.com>
> >>>>> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
> >>>>> ---
> >>>>> Documentation/networking/mptcp-sysctl.rst | 4 +--
> >>>>> net/mptcp/pm.c | 30
> >>>>> ++++++++++++++++++++---
> >>>>> 2 files changed, 29 insertions(+), 5 deletions(-)
> >>>>>
> >>>>> diff --git a/Documentation/networking/mptcp-sysctl.rst
> >>>>> b/Documentation/networking/mptcp-sysctl.rst
> >>>>> index 1683c139821e..2b8e8fb99ee2 100644
> >>>>> --- a/Documentation/networking/mptcp-sysctl.rst
> >>>>> +++ b/Documentation/networking/mptcp-sysctl.rst
> >>>>> @@ -8,8 +8,8 @@ MPTCP Sysfs variables
> >>>>> ===============================
> >>>>>
> >>>>> add_addr_timeout - INTEGER (seconds)
> >>>>> - Set the timeout after which an ADD_ADDR control
> >>>>> message
> >>>>> will be
> >>>>> - resent to an MPTCP peer that has not acknowledged a
> >>>>> previous
> >>>>> + Set the maximum value of timeout after which an
> >>>>> ADD_ADDR
> >>>>> control message
> >>>>> + will be resent to an MPTCP peer that has not
> >>>>> acknowledged
> >>>>> a previous
> >>>>> ADD_ADDR message.
> >>>>>
> >>>>> Do not retransmit if set to 0.
> >>>>> diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
> >>>>> index 1330af6a2154..34cc4f45bc7d 100644
> >>>>> --- a/net/mptcp/pm.c
> >>>>> +++ b/net/mptcp/pm.c
> >>>>> @@ -268,6 +268,30 @@ int mptcp_pm_mp_prio_send_ack(struct
> >>>>> mptcp_sock *msk,
> >>>>> return -EINVAL;
> >>>>> }
> >>>>>
> >>>>> +static unsigned int mptcp_adjust_add_addr_timeout(struct
> >>>>> mptcp_sock *msk)
> >>>>> +{
> >>>>> + struct mptcp_subflow_context *subflow;
> >>>>> + struct sock *sk = (struct sock *)msk;
> >>>>> + unsigned int max_rto = 0;
> >>>>> + unsigned int timeout;
> >>>>> +
> >>>>> + timeout = mptcp_get_add_addr_timeout(sock_net(sk));
> >>>>> +
> >>>>> + mptcp_for_each_subflow(msk, subflow) {
> >>>>> + struct sock *ssk =
> >>>>> mptcp_subflow_tcp_sock(subflow);
> >>>>> + unsigned int subflow_rto;
> >>>>> +
> >>>>> + subflow_rto = __tcp_set_rto(tcp_sk(ssk)) * 2;
> >>>>
> >>>> Why "* 2"? Is it what TCP is doing?
> >>>>
> >>>> (Probably best to add a comment here above, and in the commit
> >>>> message.)
> >>>>
> >>>> (Also, I guess the compiler will do the optimisation, but you
> >>>> could
> >>>> write: ">> 1" instead of "* 2".
> >>>
> >>> Yes, "<< 1" is much better. I added a comment here:
> >>>
> >>> + /*
> >>> + * icsk->icsk_rto is not used directly since
> >>> + * it's not updated frequently enough.
> >>> + *
> >>> + * subflow_rto is twice of rto, and allow
> >>> + * some tolerance for add_addr echo.
> >>> + */
> >>> + subflow_rto = __tcp_set_rto(tcp_sk(ssk)) << 1;
> >>>
> >>> Is this okay?
> >>
> >> No :)
> >>
> >> I don't think the comment explain why it is twice the default TCP
> >> RTO.
> >> Why having an exception here for the ADD_ADDR? I mean: I'm OK with
> >> the
> >> idea of doubling the RTO, but only if there is a reason behind that
> >> :)
> >
> > How about this one (answered by DeepSeek, an AI from China):
> >
> > /*
> > * icsk->icsk_rto is not used directly since it's not
> > * updated frequently enough.
>
> What about: (...) since it might not have been updated recently.
>
> > * Setting subflow_rto to twice of RTO provides a safety
> > * buffer for RTT estimation variance, kernel RTO update
> > * delays, and a full transmission retry cycle while
> > * minimizing false failures.
>
> A bit long and not clear (typical AI, making commit messages too
> verbose, it almost force me to use one to give me the summary :) )
>
> What about:
>
> Setting this RTO to twice of TCP one, not to be too agressive for this
> notification that might take longer to process than "simple" data.
>
> Still, is icsk_rto not already "long" enough by default? I mean: when I
> see your description in the related packetdrill PR [1] where the delay
> is supposed to be 0.01 sec, but "subflow_rto" (or was it isck_rto?) is
> 0.65 sec, that's quite different, and half of it would be more than
> enough :)
>
> But maybe that's different when it happens later during the connection?
Why wouldn't icsk_rto not be good enough ? We just need a reasonable
number here. And, I must say that 2 x __tcp_set_rto() seems a bit
arbitrary. Let's just use icsk_rto. Does it really matter that it is
not updated very frequently ?
Christoph
>
> [1] https://github.com/multipath-tcp/packetdrill/pull/166
>
> Cheers,
> Matt
> --
> Sponsored by the NGI0 Core fund.
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH mptcp-next v3 2/3] mptcp: make ADD_ADDR retransmission timeout adaptive
2025-08-11 15:34 ` Christoph Paasch
@ 2025-08-12 8:33 ` Geliang Tang
2025-08-12 8:39 ` Matthieu Baerts
1 sibling, 0 replies; 19+ messages in thread
From: Geliang Tang @ 2025-08-12 8:33 UTC (permalink / raw)
To: Christoph Paasch, Matthieu Baerts; +Cc: mptcp, Geliang Tang
Hi Christoph, Matt,
On Mon, 2025-08-11 at 08:34 -0700, Christoph Paasch wrote:
> On Mon, Aug 11, 2025 at 7:51 AM Matthieu Baerts <matttbe@kernel.org>
> wrote:
> >
> > Hi Geliang,
> >
> > On 10/08/2025 03:13, Geliang Tang wrote:
> > > Hi Matt,
> > >
> > > On Thu, 2025-08-07 at 13:56 +0200, Matthieu Baerts wrote:
> > > > Hi Geliang,
> > > >
> > > > On 07/08/2025 05:25, Geliang Tang wrote:
> > > > > Hi Matt,
> > > > >
> > > > > On Wed, 2025-08-06 at 17:34 +0200, Matthieu Baerts wrote:
> > > > > > Hi Geliang,
> > > > > >
> > > > > > On 06/08/2025 05:17, Geliang Tang wrote:
> > > > > > > From: Geliang Tang <tanggeliang@kylinos.cn>
> > > > > > >
> > > > > > > Replace the fixed ADD_ADDR retransmission timeout with an
> > > > > > > adaptive
> > > > > > > mechanism that uses the maximum RTO among existing
> > > > > > > subflows.
> > > > > > > This
> > > > > > > improves responsiveness when establishing new subflows
> > > > > > > while
> > > > > > > maintaining
> > > > > > > an upper bound through the configured add_addr_timeout
> > > > > > > value.
> > > > > > >
> > > > > > > Key changes:
> > > > > > > 1. Compute timeout as min(max_subflow_RTO,
> > > > > > > configured_max_timeout)
> > > > > > > 2. Apply exponential backoff based on retransmission
> > > > > > > count
> > > > > > > 3. Maintain fallback to configured max timeout when no
> > > > > > > subflows
> > > > > > > exist
> > > > > > > 4. Update documentation to clarify add_addr_timeout is
> > > > > > > now the
> > > > > > > maximum
> > > > > > > value
> > > > > > >
> > > > > > > Closes:
> > > > > > > https://github.com/multipath-tcp/mptcp_net-next/issues/576
> > > > > > > Reviewed-by: Christoph Paasch <cpaasch@openai.com>
> > > > > > > Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
> > > > > > > ---
> > > > > > > Documentation/networking/mptcp-sysctl.rst | 4 +--
> > > > > > > net/mptcp/pm.c | 30
> > > > > > > ++++++++++++++++++++---
> > > > > > > 2 files changed, 29 insertions(+), 5 deletions(-)
> > > > > > >
> > > > > > > diff --git a/Documentation/networking/mptcp-sysctl.rst
> > > > > > > b/Documentation/networking/mptcp-sysctl.rst
> > > > > > > index 1683c139821e..2b8e8fb99ee2 100644
> > > > > > > --- a/Documentation/networking/mptcp-sysctl.rst
> > > > > > > +++ b/Documentation/networking/mptcp-sysctl.rst
> > > > > > > @@ -8,8 +8,8 @@ MPTCP Sysfs variables
> > > > > > > ===============================
> > > > > > >
> > > > > > > add_addr_timeout - INTEGER (seconds)
> > > > > > > - Set the timeout after which an ADD_ADDR control
> > > > > > > message
> > > > > > > will be
> > > > > > > - resent to an MPTCP peer that has not acknowledged a
> > > > > > > previous
> > > > > > > + Set the maximum value of timeout after which an
> > > > > > > ADD_ADDR
> > > > > > > control message
> > > > > > > + will be resent to an MPTCP peer that has not
> > > > > > > acknowledged
> > > > > > > a previous
> > > > > > > ADD_ADDR message.
> > > > > > >
> > > > > > > Do not retransmit if set to 0.
> > > > > > > diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
> > > > > > > index 1330af6a2154..34cc4f45bc7d 100644
> > > > > > > --- a/net/mptcp/pm.c
> > > > > > > +++ b/net/mptcp/pm.c
> > > > > > > @@ -268,6 +268,30 @@ int mptcp_pm_mp_prio_send_ack(struct
> > > > > > > mptcp_sock *msk,
> > > > > > > return -EINVAL;
> > > > > > > }
> > > > > > >
> > > > > > > +static unsigned int mptcp_adjust_add_addr_timeout(struct
> > > > > > > mptcp_sock *msk)
> > > > > > > +{
> > > > > > > + struct mptcp_subflow_context *subflow;
> > > > > > > + struct sock *sk = (struct sock *)msk;
> > > > > > > + unsigned int max_rto = 0;
> > > > > > > + unsigned int timeout;
> > > > > > > +
> > > > > > > + timeout = mptcp_get_add_addr_timeout(sock_net(sk));
> > > > > > > +
> > > > > > > + mptcp_for_each_subflow(msk, subflow) {
> > > > > > > + struct sock *ssk =
> > > > > > > mptcp_subflow_tcp_sock(subflow);
> > > > > > > + unsigned int subflow_rto;
> > > > > > > +
> > > > > > > + subflow_rto = __tcp_set_rto(tcp_sk(ssk)) * 2;
> > > > > >
> > > > > > Why "* 2"? Is it what TCP is doing?
> > > > > >
> > > > > > (Probably best to add a comment here above, and in the
> > > > > > commit
> > > > > > message.)
> > > > > >
> > > > > > (Also, I guess the compiler will do the optimisation, but
> > > > > > you
> > > > > > could
> > > > > > write: ">> 1" instead of "* 2".
> > > > >
> > > > > Yes, "<< 1" is much better. I added a comment here:
> > > > >
> > > > > + /*
> > > > > + * icsk->icsk_rto is not used directly since
> > > > > + * it's not updated frequently enough.
> > > > > + *
> > > > > + * subflow_rto is twice of rto, and allow
> > > > > + * some tolerance for add_addr echo.
> > > > > + */
> > > > > + subflow_rto = __tcp_set_rto(tcp_sk(ssk)) <<
> > > > > 1;
> > > > >
> > > > > Is this okay?
> > > >
> > > > No :)
> > > >
> > > > I don't think the comment explain why it is twice the default
> > > > TCP
> > > > RTO.
> > > > Why having an exception here for the ADD_ADDR? I mean: I'm OK
> > > > with
> > > > the
> > > > idea of doubling the RTO, but only if there is a reason behind
> > > > that
> > > > :)
> > >
> > > How about this one (answered by DeepSeek, an AI from China):
> > >
> > > /*
> > > * icsk->icsk_rto is not used directly since it's not
> > > * updated frequently enough.
> >
> > What about: (...) since it might not have been updated recently.
> >
> > > * Setting subflow_rto to twice of RTO provides a safety
> > > * buffer for RTT estimation variance, kernel RTO update
> > > * delays, and a full transmission retry cycle while
> > > * minimizing false failures.
> >
> > A bit long and not clear (typical AI, making commit messages too
> > verbose, it almost force me to use one to give me the summary :) )
> >
> > What about:
> >
> > Setting this RTO to twice of TCP one, not to be too agressive for
> > this
> > notification that might take longer to process than "simple"
> > data.
> >
> > Still, is icsk_rto not already "long" enough by default? I mean:
> > when I
> > see your description in the related packetdrill PR [1] where the
> > delay
> > is supposed to be 0.01 sec, but "subflow_rto" (or was it isck_rto?)
> > is
> > 0.65 sec, that's quite different, and half of it would be more than
> > enough :)
> >
> > But maybe that's different when it happens later during the
> > connection?
>
> Why wouldn't icsk_rto not be good enough ? We just need a reasonable
> number here. And, I must say that 2 x __tcp_set_rto() seems a bit
> arbitrary. Let's just use icsk_rto. Does it really matter that it is
> not updated very frequently ?
Sure, I'll update this in v4, just using icsk->icsk_rto.
Thanks,
-Geliang
>
>
> Christoph
>
> >
> > [1] https://github.com/multipath-tcp/packetdrill/pull/166
> >
> > Cheers,
> > Matt
> > --
> > Sponsored by the NGI0 Core fund.
> >
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH mptcp-next v3 2/3] mptcp: make ADD_ADDR retransmission timeout adaptive
2025-08-11 15:34 ` Christoph Paasch
2025-08-12 8:33 ` Geliang Tang
@ 2025-08-12 8:39 ` Matthieu Baerts
2025-08-12 9:20 ` Geliang Tang
1 sibling, 1 reply; 19+ messages in thread
From: Matthieu Baerts @ 2025-08-12 8:39 UTC (permalink / raw)
To: Christoph Paasch; +Cc: Geliang Tang, mptcp, Geliang Tang
Hi Christoph,
On 11/08/2025 17:34, Christoph Paasch wrote:
> On Mon, Aug 11, 2025 at 7:51 AM Matthieu Baerts <matttbe@kernel.org> wrote:
>>
>> Hi Geliang,
>>
>> On 10/08/2025 03:13, Geliang Tang wrote:
>>> Hi Matt,
>>>
>>> On Thu, 2025-08-07 at 13:56 +0200, Matthieu Baerts wrote:
>>>> Hi Geliang,
>>>>
>>>> On 07/08/2025 05:25, Geliang Tang wrote:
>>>>> Hi Matt,
>>>>>
>>>>> On Wed, 2025-08-06 at 17:34 +0200, Matthieu Baerts wrote:
>>>>>> Hi Geliang,
>>>>>>
>>>>>> On 06/08/2025 05:17, Geliang Tang wrote:
>>>>>>> From: Geliang Tang <tanggeliang@kylinos.cn>
>>>>>>>
>>>>>>> Replace the fixed ADD_ADDR retransmission timeout with an
>>>>>>> adaptive
>>>>>>> mechanism that uses the maximum RTO among existing subflows.
>>>>>>> This
>>>>>>> improves responsiveness when establishing new subflows while
>>>>>>> maintaining
>>>>>>> an upper bound through the configured add_addr_timeout value.
>>>>>>>
>>>>>>> Key changes:
>>>>>>> 1. Compute timeout as min(max_subflow_RTO,
>>>>>>> configured_max_timeout)
>>>>>>> 2. Apply exponential backoff based on retransmission count
>>>>>>> 3. Maintain fallback to configured max timeout when no subflows
>>>>>>> exist
>>>>>>> 4. Update documentation to clarify add_addr_timeout is now the
>>>>>>> maximum
>>>>>>> value
>>>>>>>
>>>>>>> Closes:
>>>>>>> https://github.com/multipath-tcp/mptcp_net-next/issues/576
>>>>>>> Reviewed-by: Christoph Paasch <cpaasch@openai.com>
>>>>>>> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
>>>>>>> ---
>>>>>>> Documentation/networking/mptcp-sysctl.rst | 4 +--
>>>>>>> net/mptcp/pm.c | 30
>>>>>>> ++++++++++++++++++++---
>>>>>>> 2 files changed, 29 insertions(+), 5 deletions(-)
>>>>>>>
>>>>>>> diff --git a/Documentation/networking/mptcp-sysctl.rst
>>>>>>> b/Documentation/networking/mptcp-sysctl.rst
>>>>>>> index 1683c139821e..2b8e8fb99ee2 100644
>>>>>>> --- a/Documentation/networking/mptcp-sysctl.rst
>>>>>>> +++ b/Documentation/networking/mptcp-sysctl.rst
>>>>>>> @@ -8,8 +8,8 @@ MPTCP Sysfs variables
>>>>>>> ===============================
>>>>>>>
>>>>>>> add_addr_timeout - INTEGER (seconds)
>>>>>>> - Set the timeout after which an ADD_ADDR control
>>>>>>> message
>>>>>>> will be
>>>>>>> - resent to an MPTCP peer that has not acknowledged a
>>>>>>> previous
>>>>>>> + Set the maximum value of timeout after which an
>>>>>>> ADD_ADDR
>>>>>>> control message
>>>>>>> + will be resent to an MPTCP peer that has not
>>>>>>> acknowledged
>>>>>>> a previous
>>>>>>> ADD_ADDR message.
>>>>>>>
>>>>>>> Do not retransmit if set to 0.
>>>>>>> diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
>>>>>>> index 1330af6a2154..34cc4f45bc7d 100644
>>>>>>> --- a/net/mptcp/pm.c
>>>>>>> +++ b/net/mptcp/pm.c
>>>>>>> @@ -268,6 +268,30 @@ int mptcp_pm_mp_prio_send_ack(struct
>>>>>>> mptcp_sock *msk,
>>>>>>> return -EINVAL;
>>>>>>> }
>>>>>>>
>>>>>>> +static unsigned int mptcp_adjust_add_addr_timeout(struct
>>>>>>> mptcp_sock *msk)
>>>>>>> +{
>>>>>>> + struct mptcp_subflow_context *subflow;
>>>>>>> + struct sock *sk = (struct sock *)msk;
>>>>>>> + unsigned int max_rto = 0;
>>>>>>> + unsigned int timeout;
>>>>>>> +
>>>>>>> + timeout = mptcp_get_add_addr_timeout(sock_net(sk));
>>>>>>> +
>>>>>>> + mptcp_for_each_subflow(msk, subflow) {
>>>>>>> + struct sock *ssk =
>>>>>>> mptcp_subflow_tcp_sock(subflow);
>>>>>>> + unsigned int subflow_rto;
>>>>>>> +
>>>>>>> + subflow_rto = __tcp_set_rto(tcp_sk(ssk)) * 2;
>>>>>>
>>>>>> Why "* 2"? Is it what TCP is doing?
>>>>>>
>>>>>> (Probably best to add a comment here above, and in the commit
>>>>>> message.)
>>>>>>
>>>>>> (Also, I guess the compiler will do the optimisation, but you
>>>>>> could
>>>>>> write: ">> 1" instead of "* 2".
>>>>>
>>>>> Yes, "<< 1" is much better. I added a comment here:
>>>>>
>>>>> + /*
>>>>> + * icsk->icsk_rto is not used directly since
>>>>> + * it's not updated frequently enough.
>>>>> + *
>>>>> + * subflow_rto is twice of rto, and allow
>>>>> + * some tolerance for add_addr echo.
>>>>> + */
>>>>> + subflow_rto = __tcp_set_rto(tcp_sk(ssk)) << 1;
>>>>>
>>>>> Is this okay?
>>>>
>>>> No :)
>>>>
>>>> I don't think the comment explain why it is twice the default TCP
>>>> RTO.
>>>> Why having an exception here for the ADD_ADDR? I mean: I'm OK with
>>>> the
>>>> idea of doubling the RTO, but only if there is a reason behind that
>>>> :)
>>>
>>> How about this one (answered by DeepSeek, an AI from China):
>>>
>>> /*
>>> * icsk->icsk_rto is not used directly since it's not
>>> * updated frequently enough.
>>
>> What about: (...) since it might not have been updated recently.
>>
>>> * Setting subflow_rto to twice of RTO provides a safety
>>> * buffer for RTT estimation variance, kernel RTO update
>>> * delays, and a full transmission retry cycle while
>>> * minimizing false failures.
>>
>> A bit long and not clear (typical AI, making commit messages too
>> verbose, it almost force me to use one to give me the summary :) )
>>
>> What about:
>>
>> Setting this RTO to twice of TCP one, not to be too agressive for this
>> notification that might take longer to process than "simple" data.
>>
>> Still, is icsk_rto not already "long" enough by default? I mean: when I
>> see your description in the related packetdrill PR [1] where the delay
>> is supposed to be 0.01 sec, but "subflow_rto" (or was it isck_rto?) is
>> 0.65 sec, that's quite different, and half of it would be more than
>> enough :)
>>
>> But maybe that's different when it happens later during the connection?
>
> Why wouldn't icsk_rto not be good enough ? We just need a reasonable
> number here.
I didn't have time to investigate in between, but is icsk_rto not
updated only when needed (@Geliang: did you check?). So could it be
possible that at the beginning of the connection, the value is not correct?
If yes, why not using __tcp_set_rto()?
> And, I must say that 2 x __tcp_set_rto() seems a bit
> arbitrary. Let's just use icsk_rto. Does it really matter that it is
> not updated very frequently ?
Maybe yes.
@Geliang: but before sending a new version, please check my comment:
https://github.com/multipath-tcp/packetdrill/pull/166
Especially this part:
> 0.65s looks a lot to me: The 3rd ACK at line 15 should be injected
> with a delay of 0.01s.
>
> What are the values of icsk->icsk_rto, tp->srtt_us, tp->rttvar_us,
> __tcp_set_rto(tp) and the one you set for the timer when scheduling
> each retransmissions and when running this test?
>
> (for the rto value in jiffies, please convert them to usec like the
> others → jiffies_to_usecs())
Sharing such info when running the packetdrill tests using a non debug
kernel (and ideally also sharing the output of packetdrill, when running
it with the extra -vvvv), would help better understand the situation here.
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH mptcp-next v3 2/3] mptcp: make ADD_ADDR retransmission timeout adaptive
2025-08-12 8:39 ` Matthieu Baerts
@ 2025-08-12 9:20 ` Geliang Tang
2025-08-12 9:32 ` Matthieu Baerts
0 siblings, 1 reply; 19+ messages in thread
From: Geliang Tang @ 2025-08-12 9:20 UTC (permalink / raw)
To: Matthieu Baerts, Christoph Paasch; +Cc: mptcp, Geliang Tang
[-- Attachment #1: Type: text/plain, Size: 12036 bytes --]
Hi Matt, Christoph,
On Tue, 2025-08-12 at 10:39 +0200, Matthieu Baerts wrote:
> Hi Christoph,
>
> On 11/08/2025 17:34, Christoph Paasch wrote:
> > On Mon, Aug 11, 2025 at 7:51 AM Matthieu Baerts
> > <matttbe@kernel.org> wrote:
> > >
> > > Hi Geliang,
> > >
> > > On 10/08/2025 03:13, Geliang Tang wrote:
> > > > Hi Matt,
> > > >
> > > > On Thu, 2025-08-07 at 13:56 +0200, Matthieu Baerts wrote:
> > > > > Hi Geliang,
> > > > >
> > > > > On 07/08/2025 05:25, Geliang Tang wrote:
> > > > > > Hi Matt,
> > > > > >
> > > > > > On Wed, 2025-08-06 at 17:34 +0200, Matthieu Baerts wrote:
> > > > > > > Hi Geliang,
> > > > > > >
> > > > > > > On 06/08/2025 05:17, Geliang Tang wrote:
> > > > > > > > From: Geliang Tang <tanggeliang@kylinos.cn>
> > > > > > > >
> > > > > > > > Replace the fixed ADD_ADDR retransmission timeout with
> > > > > > > > an
> > > > > > > > adaptive
> > > > > > > > mechanism that uses the maximum RTO among existing
> > > > > > > > subflows.
> > > > > > > > This
> > > > > > > > improves responsiveness when establishing new subflows
> > > > > > > > while
> > > > > > > > maintaining
> > > > > > > > an upper bound through the configured add_addr_timeout
> > > > > > > > value.
> > > > > > > >
> > > > > > > > Key changes:
> > > > > > > > 1. Compute timeout as min(max_subflow_RTO,
> > > > > > > > configured_max_timeout)
> > > > > > > > 2. Apply exponential backoff based on retransmission
> > > > > > > > count
> > > > > > > > 3. Maintain fallback to configured max timeout when no
> > > > > > > > subflows
> > > > > > > > exist
> > > > > > > > 4. Update documentation to clarify add_addr_timeout is
> > > > > > > > now the
> > > > > > > > maximum
> > > > > > > > value
> > > > > > > >
> > > > > > > > Closes:
> > > > > > > > https://github.com/multipath-tcp/mptcp_net-next/issues/576
> > > > > > > > Reviewed-by: Christoph Paasch <cpaasch@openai.com>
> > > > > > > > Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
> > > > > > > > ---
> > > > > > > > Documentation/networking/mptcp-sysctl.rst | 4 +--
> > > > > > > > net/mptcp/pm.c | 30
> > > > > > > > ++++++++++++++++++++---
> > > > > > > > 2 files changed, 29 insertions(+), 5 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/Documentation/networking/mptcp-sysctl.rst
> > > > > > > > b/Documentation/networking/mptcp-sysctl.rst
> > > > > > > > index 1683c139821e..2b8e8fb99ee2 100644
> > > > > > > > --- a/Documentation/networking/mptcp-sysctl.rst
> > > > > > > > +++ b/Documentation/networking/mptcp-sysctl.rst
> > > > > > > > @@ -8,8 +8,8 @@ MPTCP Sysfs variables
> > > > > > > > ===============================
> > > > > > > >
> > > > > > > > add_addr_timeout - INTEGER (seconds)
> > > > > > > > - Set the timeout after which an ADD_ADDR control
> > > > > > > > message
> > > > > > > > will be
> > > > > > > > - resent to an MPTCP peer that has not acknowledged a
> > > > > > > > previous
> > > > > > > > + Set the maximum value of timeout after which an
> > > > > > > > ADD_ADDR
> > > > > > > > control message
> > > > > > > > + will be resent to an MPTCP peer that has not
> > > > > > > > acknowledged
> > > > > > > > a previous
> > > > > > > > ADD_ADDR message.
> > > > > > > >
> > > > > > > > Do not retransmit if set to 0.
> > > > > > > > diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
> > > > > > > > index 1330af6a2154..34cc4f45bc7d 100644
> > > > > > > > --- a/net/mptcp/pm.c
> > > > > > > > +++ b/net/mptcp/pm.c
> > > > > > > > @@ -268,6 +268,30 @@ int
> > > > > > > > mptcp_pm_mp_prio_send_ack(struct
> > > > > > > > mptcp_sock *msk,
> > > > > > > > return -EINVAL;
> > > > > > > > }
> > > > > > > >
> > > > > > > > +static unsigned int
> > > > > > > > mptcp_adjust_add_addr_timeout(struct
> > > > > > > > mptcp_sock *msk)
> > > > > > > > +{
> > > > > > > > + struct mptcp_subflow_context *subflow;
> > > > > > > > + struct sock *sk = (struct sock *)msk;
> > > > > > > > + unsigned int max_rto = 0;
> > > > > > > > + unsigned int timeout;
> > > > > > > > +
> > > > > > > > + timeout = mptcp_get_add_addr_timeout(sock_net(sk));
> > > > > > > > +
> > > > > > > > + mptcp_for_each_subflow(msk, subflow) {
> > > > > > > > + struct sock *ssk =
> > > > > > > > mptcp_subflow_tcp_sock(subflow);
> > > > > > > > + unsigned int subflow_rto;
> > > > > > > > +
> > > > > > > > + subflow_rto = __tcp_set_rto(tcp_sk(ssk)) * 2;
> > > > > > >
> > > > > > > Why "* 2"? Is it what TCP is doing?
> > > > > > >
> > > > > > > (Probably best to add a comment here above, and in the
> > > > > > > commit
> > > > > > > message.)
> > > > > > >
> > > > > > > (Also, I guess the compiler will do the optimisation, but
> > > > > > > you
> > > > > > > could
> > > > > > > write: ">> 1" instead of "* 2".
> > > > > >
> > > > > > Yes, "<< 1" is much better. I added a comment here:
> > > > > >
> > > > > > + /*
> > > > > > + * icsk->icsk_rto is not used directly
> > > > > > since
> > > > > > + * it's not updated frequently enough.
> > > > > > + *
> > > > > > + * subflow_rto is twice of rto, and allow
> > > > > > + * some tolerance for add_addr echo.
> > > > > > + */
> > > > > > + subflow_rto = __tcp_set_rto(tcp_sk(ssk)) <<
> > > > > > 1;
> > > > > >
> > > > > > Is this okay?
> > > > >
> > > > > No :)
> > > > >
> > > > > I don't think the comment explain why it is twice the default
> > > > > TCP
> > > > > RTO.
> > > > > Why having an exception here for the ADD_ADDR? I mean: I'm OK
> > > > > with
> > > > > the
> > > > > idea of doubling the RTO, but only if there is a reason
> > > > > behind that
> > > > > :)
> > > >
> > > > How about this one (answered by DeepSeek, an AI from China):
> > > >
> > > > /*
> > > > * icsk->icsk_rto is not used directly since it's not
> > > > * updated frequently enough.
> > >
> > > What about: (...) since it might not have been updated recently.
> > >
> > > > * Setting subflow_rto to twice of RTO provides a
> > > > safety
> > > > * buffer for RTT estimation variance, kernel RTO
> > > > update
> > > > * delays, and a full transmission retry cycle while
> > > > * minimizing false failures.
> > >
> > > A bit long and not clear (typical AI, making commit messages too
> > > verbose, it almost force me to use one to give me the summary :)
> > > )
> > >
> > > What about:
> > >
> > > Setting this RTO to twice of TCP one, not to be too agressive
> > > for this
> > > notification that might take longer to process than "simple"
> > > data.
> > >
> > > Still, is icsk_rto not already "long" enough by default? I mean:
> > > when I
> > > see your description in the related packetdrill PR [1] where the
> > > delay
> > > is supposed to be 0.01 sec, but "subflow_rto" (or was it
> > > isck_rto?) is
> > > 0.65 sec, that's quite different, and half of it would be more
> > > than
> > > enough :)
> > >
> > > But maybe that's different when it happens later during the
> > > connection?
> >
> > Why wouldn't icsk_rto not be good enough ? We just need a
> > reasonable
> > number here.
>
> I didn't have time to investigate in between, but is icsk_rto not
> updated only when needed (@Geliang: did you check?). So could it be
> possible that at the beginning of the connection, the value is not
> correct?
>
> If yes, why not using __tcp_set_rto()?
>
> > And, I must say that 2 x __tcp_set_rto() seems a bit
> > arbitrary. Let's just use icsk_rto. Does it really matter that it
> > is
> > not updated very frequently ?
>
> Maybe yes.
>
> @Geliang: but before sending a new version, please check my comment:
>
> https://github.com/multipath-tcp/packetdrill/pull/166
>
> Especially this part:
>
> > 0.65s looks a lot to me: The 3rd ACK at line 15 should be injected
> > with a delay of 0.01s.
> >
> > What are the values of icsk->icsk_rto, tp->srtt_us, tp->rttvar_us,
> > __tcp_set_rto(tp) and the one you set for the timer when scheduling
> > each retransmissions and when running this test?
> >
> > (for the rto value in jiffies, please convert them to usec like the
> > others → jiffies_to_usecs())
>
> Sharing such info when running the packetdrill tests using a non
> debug
> kernel (and ideally also sharing the output of packetdrill, when
> running
> it with the extra -vvvv), would help better understand the situation
> here.
Here is the debug patch based on v4:
diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index 0497a7d2446e..1f203c279620 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -271,6 +271,12 @@ static unsigned int
mptcp_adjust_add_addr_timeout(struct mptcp_sock *msk)
mptcp_for_each_subflow(msk, subflow) {
struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
struct inet_connection_sock *icsk = inet_csk(ssk);
+ struct tcp_sock *tp = tcp_sk(ssk);
+
+ pr_info("icsk->icsk_rto=%uus, __tcp_set_rto(tp)=%uus,
tp->srtt_us=%uus, tp->rttvar_us=%uus",
+ jiffies_to_usecs(icsk->icsk_rto),
+ jiffies_to_usecs(__tcp_set_rto(tp)),
+ tp->srtt_us, tp->rttvar_us);
if (icsk->icsk_rto > max)
max = icsk->icsk_rto;
@@ -319,9 +325,14 @@ static void mptcp_pm_add_timer(struct timer_list
*timer)
entry->retrans_times++;
}
- if (entry->retrans_times < ADD_ADDR_RETRANS_MAX)
+ if (entry->retrans_times < ADD_ADDR_RETRANS_MAX) {
+ pr_info("retrans_times=%u add_addr_timeout=%uus\n",
+ entry->retrans_times,
+ jiffies_to_usecs(timeout << entry-
>retrans_times));
+
sk_reset_timer(sk, timer,
jiffies + (timeout << entry-
>retrans_times));
+ }
spin_unlock_bh(&msk->pm.lock);
And here is the log:
OK [/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr6_server.pkt
(ipv6)]
[ 374.343386] MPTCP: icsk->icsk_rto=211000us,
__tcp_set_rto(tp)=211000us, tp->srtt_us=83640us, tp->rttvar_us=200000us
[ 374.343396] MPTCP: retrans_times=1 add_addr_timeout=422000us
[ 374.343695] MPTCP: icsk->icsk_rto=211000us,
__tcp_set_rto(tp)=211000us, tp->srtt_us=83520us, tp->rttvar_us=200000us
[ 374.343696] MPTCP: retrans_times=1 add_addr_timeout=422000us
[ 374.351277] MPTCP: icsk->icsk_rto=211000us,
__tcp_set_rto(tp)=211000us, tp->srtt_us=81064us, tp->rttvar_us=200000us
[ 374.351280] MPTCP: retrans_times=1 add_addr_timeout=422000us
[ 374.359279] MPTCP: icsk->icsk_rto=211000us,
__tcp_set_rto(tp)=211000us, tp->srtt_us=83416us, tp->rttvar_us=200000us
[ 374.359280] MPTCP: retrans_times=1 add_addr_timeout=422000us
[ 374.362808] MPTCP: icsk->icsk_rto=433000us,
__tcp_set_rto(tp)=433000us, tp->srtt_us=1121560us, tp-
>rttvar_us=291850us
[ 374.362907] MPTCP: icsk->icsk_rto=312000us,
__tcp_set_rto(tp)=312000us, tp->srtt_us=827589us, tp-
>rttvar_us=207860us
[ 374.367273] MPTCP: icsk->icsk_rto=211000us,
__tcp_set_rto(tp)=211000us, tp->srtt_us=80728us, tp->rttvar_us=200000us
[ 374.367324] MPTCP: retrans_times=1 add_addr_timeout=422000us
FAIL
[/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr_retry_plain.pkt
(ipv4)]
stdout:
stderr:
add_addr_retry_plain.pkt:22: error handling packet: timing error:
expected outbound packet at 1.164616 sec but happened at 0.382298 sec;
tolerance 0.400000 sec
script packet: 1.164616 . 1:1(0) ack 1 <add_address address_id: 1
ipv4: 192.168.0.2 hmac: 18072054855919674796>
actual packet: 0.382298 . 1:1(0) ack 1 win 1050 <add_address
address_id: 1 ipv4: 192.168.0.2 hmac: 18072054855919674796>
Full log is attached.
In this test, __tcp_set_rto(tp) is the same as icsk->icsk_rto.
Thanks,
-Geliang
>
> Cheers,
> Matt
[-- Attachment #2: add_addr_timeout.log --]
[-- Type: text/x-log, Size: 8065 bytes --]
root@mptcpdev:/opt/packetdrill/gtests/net# ./packetdrill/run_all.py -lv mptcp/add_addr/
[ 164.023274] MPTCP: icsk->icsk_rto=310000us, __tcp_set_rto(tp)=310000us, tp->srtt_us=826456us, tp->rttvar_us=206614us
[ 372.324724] MPTCP: icsk->icsk_rto=212000us, __tcp_set_rto(tp)=212000us, tp->srtt_us=95520us, tp->rttvar_us=200000us
[ 372.334676] MPTCP: icsk->icsk_rto=211000us, __tcp_set_rto(tp)=211000us, tp->srtt_us=81552us, tp->rttvar_us=200000us
[ 372.349761] MPTCP: icsk->icsk_rto=212000us, __tcp_set_rto(tp)=212000us, tp->srtt_us=90816us, tp->rttvar_us=200000us
OK [/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr4_client.pkt (ipv4)]
OK [/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr4_port_client.pkt (ipv4)]
OK [/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr4_server.pkt (ipv6)]
OK [/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr4_client.pkt (ipv4-mapped-v6)]
OK [/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr4_port_client.pkt (ipv4-mapped-v6)]
OK [/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr4_server.pkt (ipv4)]
[ 373.062029] MPTCP: icsk->icsk_rto=212000us, __tcp_set_rto(tp)=212000us, tp->srtt_us=93024us, tp->rttvar_us=200000us
OK [/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr4_client.pkt (ipv6)]
OK [/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr4_server.pkt (ipv4-mapped-v6)]
OK [/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr4_port_client.pkt (ipv6)]
OK [/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr6_client.pkt (ipv4)]
[ 373.303826] MPTCP: icsk->icsk_rto=211000us, __tcp_set_rto(tp)=211000us, tp->srtt_us=85528us, tp->rttvar_us=200000us
OK [/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr6_client.pkt (ipv4-mapped-v6)]
OK [/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr6_port_client.pkt (ipv4)]
OK [/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr6_port_client.pkt (ipv6)]
OK [/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr6_server.pkt (ipv4-mapped-v6)]
[ 373.726341] MPTCP: icsk->icsk_rto=211000us, __tcp_set_rto(tp)=211000us, tp->srtt_us=82856us, tp->rttvar_us=200000us
OK [/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr6_client.pkt (ipv6)]
OK [/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr6_server.pkt (ipv4)]
[ 373.977508] MPTCP: icsk->icsk_rto=211000us, __tcp_set_rto(tp)=211000us, tp->srtt_us=81896us, tp->rttvar_us=200000us
[ 374.125663] MPTCP: icsk->icsk_rto=211000us, __tcp_set_rto(tp)=211000us, tp->srtt_us=83520us, tp->rttvar_us=200000us
[ 374.125737] MPTCP: icsk->icsk_rto=211000us, __tcp_set_rto(tp)=211000us, tp->srtt_us=83640us, tp->rttvar_us=200000us
[ 374.132917] MPTCP: icsk->icsk_rto=211000us, __tcp_set_rto(tp)=211000us, tp->srtt_us=81064us, tp->rttvar_us=200000us
[ 374.142526] MPTCP: icsk->icsk_rto=211000us, __tcp_set_rto(tp)=211000us, tp->srtt_us=83416us, tp->rttvar_us=200000us
OK [/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr6_port_client.pkt (ipv4-mapped-v6)]
[ 374.154510] MPTCP: icsk->icsk_rto=211000us, __tcp_set_rto(tp)=211000us, tp->srtt_us=80728us, tp->rttvar_us=200000us
[ 374.191280] MPTCP: icsk->icsk_rto=211000us, __tcp_set_rto(tp)=211000us, tp->srtt_us=81896us, tp->rttvar_us=200000us
[ 374.191537] MPTCP: retrans_times=1 add_addr_timeout=422000us
OK [/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr6_server.pkt (ipv6)]
[ 374.343386] MPTCP: icsk->icsk_rto=211000us, __tcp_set_rto(tp)=211000us, tp->srtt_us=83640us, tp->rttvar_us=200000us
[ 374.343396] MPTCP: retrans_times=1 add_addr_timeout=422000us
[ 374.343695] MPTCP: icsk->icsk_rto=211000us, __tcp_set_rto(tp)=211000us, tp->srtt_us=83520us, tp->rttvar_us=200000us
[ 374.343696] MPTCP: retrans_times=1 add_addr_timeout=422000us
[ 374.351277] MPTCP: icsk->icsk_rto=211000us, __tcp_set_rto(tp)=211000us, tp->srtt_us=81064us, tp->rttvar_us=200000us
[ 374.351280] MPTCP: retrans_times=1 add_addr_timeout=422000us
[ 374.359279] MPTCP: icsk->icsk_rto=211000us, __tcp_set_rto(tp)=211000us, tp->srtt_us=83416us, tp->rttvar_us=200000us
[ 374.359280] MPTCP: retrans_times=1 add_addr_timeout=422000us
[ 374.362808] MPTCP: icsk->icsk_rto=433000us, __tcp_set_rto(tp)=433000us, tp->srtt_us=1121560us, tp->rttvar_us=291850us
[ 374.362907] MPTCP: icsk->icsk_rto=312000us, __tcp_set_rto(tp)=312000us, tp->srtt_us=827589us, tp->rttvar_us=207860us
[ 374.367273] MPTCP: icsk->icsk_rto=211000us, __tcp_set_rto(tp)=211000us, tp->srtt_us=80728us, tp->rttvar_us=200000us
[ 374.367324] MPTCP: retrans_times=1 add_addr_timeout=422000us
FAIL [/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr_retry_plain.pkt (ipv4)]
stdout:
stderr:
add_addr_retry_plain.pkt:22: error handling packet: timing error: expected outbound packet at 1.164616 sec but happened at 0.382298 sec; tolerance 0.400000 sec
script packet: 1.164616 . 1:1(0) ack 1 <add_address address_id: 1 ipv4: 192.168.0.2 hmac: 18072054855919674796>
actual packet: 0.382298 . 1:1(0) ack 1 win 1050 <add_address address_id: 1 ipv4: 192.168.0.2 hmac: 18072054855919674796>
FAIL [/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr_retry_plain.pkt (ipv4-mapped-v6)]
stdout:
stderr:
add_addr_retry_plain.pkt:22: error handling packet: timing error: expected outbound packet at 1.164602 sec but happened at 0.382412 sec; tolerance 0.400000 sec
script packet: 1.164602 . 1:1(0) ack 1 <add_address address_id: 1 ipv4: 192.168.0.2 hmac: 10396528243396168920>
actual packet: 0.382412 . 1:1(0) ack 1 win 1050 <add_address address_id: 1 ipv4: 192.168.0.2 hmac: 10396528243396168920>
FAIL [/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr_retry_plain.pkt (ipv6)]
stdout:
stderr:
add_addr_retry_plain.pkt:22: error handling packet: timing error: expected outbound packet at 1.083163 sec but happened at 0.295888 sec; tolerance 0.400000 sec
script packet: 1.083163 . 1:1(0) ack 1 <add_address address_id: 1 ipv6: fd3d:a0b:17d6::2 hmac: 15336863693204847186>
actual packet: 0.295888 . 1:1(0) ack 1 win 1050 <add_address address_id: 1 ipv6: fd3d:a0b:17d6::2 hmac: 15336863693204847186>
[ 374.615405] MPTCP: icsk->icsk_rto=211000us, __tcp_set_rto(tp)=211000us, tp->srtt_us=81896us, tp->rttvar_us=200000us
[ 374.615414] MPTCP: retrans_times=2 add_addr_timeout=844000us
[ 374.693298] MPTCP: icsk->icsk_rto=302000us, __tcp_set_rto(tp)=302000us, tp->srtt_us=803324us, tp->rttvar_us=200912us
[ 374.767297] MPTCP: icsk->icsk_rto=211000us, __tcp_set_rto(tp)=211000us, tp->srtt_us=83520us, tp->rttvar_us=200000us
[ 374.767489] MPTCP: retrans_times=2 add_addr_timeout=844000us
[ 374.767658] MPTCP: icsk->icsk_rto=211000us, __tcp_set_rto(tp)=211000us, tp->srtt_us=83640us, tp->rttvar_us=200000us
[ 374.767659] MPTCP: retrans_times=2 add_addr_timeout=844000us
[ 374.775398] MPTCP: icsk->icsk_rto=211000us, __tcp_set_rto(tp)=211000us, tp->srtt_us=81064us, tp->rttvar_us=200000us
[ 374.775406] MPTCP: retrans_times=2 add_addr_timeout=844000us
[ 374.783281] MPTCP: icsk->icsk_rto=211000us, __tcp_set_rto(tp)=211000us, tp->srtt_us=83416us, tp->rttvar_us=200000us
[ 374.783283] MPTCP: retrans_times=2 add_addr_timeout=844000us
[ 374.791282] MPTCP: icsk->icsk_rto=211000us, __tcp_set_rto(tp)=211000us, tp->srtt_us=80728us, tp->rttvar_us=200000us
[ 374.791286] MPTCP: retrans_times=2 add_addr_timeout=844000us
OK [/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr_echo_new_sf.pkt (ipv4)]
OK [/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr_echo_new_sf.pkt (ipv4-mapped-v6)]
OK [/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr_echo_new_sf.pkt (ipv6)]
[ 375.671518] MPTCP: icsk->icsk_rto=211000us, __tcp_set_rto(tp)=211000us, tp->srtt_us=80728us, tp->rttvar_us=200000us
[ 375.671535] MPTCP: icsk->icsk_rto=211000us, __tcp_set_rto(tp)=211000us, tp->srtt_us=83640us, tp->rttvar_us=200000us
OK [/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr_retry_errors.pkt (ipv4)]
OK [/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr_retry_errors.pkt (ipv4-mapped-v6)]
OK [/opt/packetdrill/gtests/net/mptcp/add_addr/add_addr_retry_errors.pkt (ipv6)]
Ran 27 tests: 24 passing, 3 failing, 0 timed out (5.41 sec): mptcp/add_addr/
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH mptcp-next v3 2/3] mptcp: make ADD_ADDR retransmission timeout adaptive
2025-08-12 9:20 ` Geliang Tang
@ 2025-08-12 9:32 ` Matthieu Baerts
0 siblings, 0 replies; 19+ messages in thread
From: Matthieu Baerts @ 2025-08-12 9:32 UTC (permalink / raw)
To: Geliang Tang, Christoph Paasch; +Cc: mptcp, Geliang Tang
Hi Geliang,
On 12/08/2025 11:20, Geliang Tang wrote:
> On Tue, 2025-08-12 at 10:39 +0200, Matthieu Baerts wrote:
>> On 11/08/2025 17:34, Christoph Paasch wrote:
(...)
>>> Why wouldn't icsk_rto not be good enough ? We just need a
>>> reasonable
>>> number here.
>>
>> I didn't have time to investigate in between, but is icsk_rto not
>> updated only when needed (@Geliang: did you check?). So could it be
>> possible that at the beginning of the connection, the value is not
>> correct?
>>
>> If yes, why not using __tcp_set_rto()?
>>
>>> And, I must say that 2 x __tcp_set_rto() seems a bit
>>> arbitrary. Let's just use icsk_rto. Does it really matter that it
>>> is
>>> not updated very frequently ?
>>
>> Maybe yes.
>>
>> @Geliang: but before sending a new version, please check my comment:
>>
>> https://github.com/multipath-tcp/packetdrill/pull/166
>>
>> Especially this part:
>>
>>> 0.65s looks a lot to me: The 3rd ACK at line 15 should be injected
>>> with a delay of 0.01s.
>>>
>>> What are the values of icsk->icsk_rto, tp->srtt_us, tp->rttvar_us,
>>> __tcp_set_rto(tp) and the one you set for the timer when scheduling
>>> each retransmissions and when running this test?
>>>
>>> (for the rto value in jiffies, please convert them to usec like the
>>> others → jiffies_to_usecs())
>>
>> Sharing such info when running the packetdrill tests using a non
>> debug
>> kernel (and ideally also sharing the output of packetdrill, when
>> running
>> it with the extra -vvvv), would help better understand the situation
>> here.
>
> Here is the debug patch based on v4:
(...)
Thank you for having checked! I suggest discussing the logs on the
packetdrill ticket where I just asked you to check when not all the
packetdrill tests are executed in parallel, certainly taking all CPU
resources:
https://github.com/multipath-tcp/packetdrill/pull/166
> In this test, __tcp_set_rto(tp) is the same as icsk->icsk_rto.
That's interesting. Then maybe we can simply use icsk_rto if that's easier.
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2025-08-12 9:32 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-06 3:17 [PATCH mptcp-next v3 0/3] mptcp: improve ADD_ADDR timeout handling Geliang Tang
2025-08-06 3:17 ` [PATCH mptcp-next v3 1/3] mptcp: remove duplicate sk_reset_timer call Geliang Tang
2025-08-06 15:31 ` Matthieu Baerts
2025-08-06 3:17 ` [PATCH mptcp-next v3 2/3] mptcp: make ADD_ADDR retransmission timeout adaptive Geliang Tang
2025-08-06 15:34 ` Matthieu Baerts
2025-08-07 3:25 ` Geliang Tang
2025-08-07 11:56 ` Matthieu Baerts
2025-08-10 1:13 ` Geliang Tang
2025-08-11 14:51 ` Matthieu Baerts
2025-08-11 15:34 ` Christoph Paasch
2025-08-12 8:33 ` Geliang Tang
2025-08-12 8:39 ` Matthieu Baerts
2025-08-12 9:20 ` Geliang Tang
2025-08-12 9:32 ` Matthieu Baerts
2025-08-06 3:17 ` [PATCH mptcp-next v3 3/3] selftests: mptcp: remove add_addr_timeout settings Geliang Tang
2025-08-06 15:29 ` [PATCH mptcp-next v3 0/3] mptcp: improve ADD_ADDR timeout handling MPTCP CI
2025-08-06 15:36 ` Matthieu Baerts
2025-08-07 5:51 ` Geliang Tang
2025-08-07 11:56 ` Matthieu Baerts
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).