* [PATCH RFC mptcp-next] mptcp: support net.ipv4.tcp_rcvbuf_low_rtt
@ 2025-11-27 15:58 Matthieu Baerts (NGI0)
2025-11-27 17:20 ` Paolo Abeni
2025-11-27 17:24 ` MPTCP CI
0 siblings, 2 replies; 4+ messages in thread
From: Matthieu Baerts (NGI0) @ 2025-11-27 15:58 UTC (permalink / raw)
To: MPTCP Upstream; +Cc: Paolo Abeni, Matthieu Baerts (NGI0)
This is a follow-up to commit ecfea98b7d0d ("tcp: add
net.ipv4.tcp_rcvbuf_low_rtt"), but adapted to MPTCP.
MPTCP has mptcp_rcvbuf_grow(), which is similar to tcp_rcvbuf_grow(), but
adapted for the MPTCP-level socket.
The idea here is similar to what has been done on the TCP side: do not let
mptcp_rcvbuf_grow() grow sk->sk_rcvbuf too fast for small RTT flows.
Quoting Eric: If sk->sk_rcvbuf is too big, this can force NIC driver to
not recycle pages from their page pool, and also can cause cache
evictions for DDIO enabled cpus/NIC, as receivers are usually slower
than senders.
If the RTT is smaller than the new net.ipv4.tcp_rcvbuf_low_rtt sysctl value,
use the RTT / tcp_rcvbuf_low_rtt ratio to control sk_rcvbuf inflation.
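To make the ratio concrete, here is a minimal userspace sketch of the @grow
computation, modelled on the hunk in this patch. The function name
rcvbuf_grow_model() is hypothetical, and the sketch assumes the caller has
already made sure that oldval is non-zero and newval is larger than oldval.

```c
#include <stdint.h>

/* Hypothetical userspace model of the @grow computation (not kernel code).
 * rtt_us and rtt_threshold are both in microseconds.
 * Assumption: oldval != 0 and newval > oldval, checked by the caller.
 */
static uint64_t rcvbuf_grow_model(uint32_t newval, uint32_t oldval,
				  uint32_t rtt_us, uint32_t rtt_threshold)
{
	/* DRS is always one RTT late. */
	uint32_t rcvwin = newval << 1;

	/* Small RTT: scale the growth by rtt_us / rtt_threshold. */
	if (rtt_us < rtt_threshold)
		return ((uint64_t)rcvwin * rtt_us) / rtt_threshold;

	/* Otherwise, slow start: allow the sender to double its rate. */
	return (((uint64_t)rcvwin << 1) * (newval - oldval)) / oldval;
}
```

With illustrative values (newval = 200000, oldval = 100000 and a threshold of
1000 us), a 100 us RTT gives a grow of 40000 bytes, while a 2000 us RTT takes
the slow start branch and gives 800000 bytes. A threshold of 0 always takes
the slow start branch, so the division by rtt_threshold is never reached.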
Tested: NO :)
This is why it is still an RFC. My perf test env is currently broken. I'm
sharing this patch just in case it is easy for someone to validate this
patch. Ideally such tests should be done on top of "trace: mptcp: add
mptcp_rcvbuf_grow tracepoint" patch from Paolo (and probably on top of
the related series), following similar tests to the ones done by Eric,
making sure the receiver is slower than the sender. Feel free to take
the patch, and send new versions changing the author, etc. if needed.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
net/mptcp/protocol.c | 18 ++++++++++++++----
1 file changed, 14 insertions(+), 4 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index e484c6391b48..715a9a072c6a 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -208,6 +208,7 @@ static bool mptcp_rcvbuf_grow(struct sock *sk, u32 newval)
 	struct mptcp_sock *msk = mptcp_sk(sk);
 	const struct net *net = sock_net(sk);
 	u32 rcvwin, rcvbuf, cap, oldval;
+	u32 rtt_threshold, rtt_us;
 	u64 grow;
 
 	oldval = msk->rcvq_space.space;
@@ -219,10 +220,19 @@ static bool mptcp_rcvbuf_grow(struct sock *sk, u32 newval)
 	/* DRS is always one RTT late. */
 	rcvwin = newval << 1;
 
-	/* slow start: allow the sender to double its rate. */
-	grow = (u64)rcvwin * (newval - oldval);
-	do_div(grow, oldval);
-	rcvwin += grow << 1;
+	rtt_us = msk->rcvq_space.rtt_us >> 3;
+	rtt_threshold = READ_ONCE(net->ipv4.sysctl_tcp_rcvbuf_low_rtt);
+	if (rtt_us < rtt_threshold) {
+		/* For small RTT, we set @grow to rcvwin * rtt_us/rtt_threshold.
+		 * It might take few additional ms to reach 'line rate',
+		 * but will avoid sk_rcvbuf inflation and poor cache use.
+		 */
+		grow = div_u64((u64)rcvwin * rtt_us, rtt_threshold);
+	} else {
+		/* slow start: allow the sender to double its rate. */
+		grow = div_u64(((u64)rcvwin << 1) * (newval - oldval), oldval);
+	}
+	rcvwin += grow;
 
 	if (!RB_EMPTY_ROOT(&msk->out_of_order_queue))
 		rcvwin += MPTCP_SKB_CB(msk->ooo_last_skb)->end_seq - msk->ack_seq;
---
base-commit: 1fea9a6bd10f5c5494b7973141083ec56ecffd74
change-id: 20251127-mptcp-tcp_rcvbuf_low_rtt-fc64120b153a
Best regards,
--
Matthieu Baerts (NGI0) <matttbe@kernel.org>
* Re: [PATCH RFC mptcp-next] mptcp: support net.ipv4.tcp_rcvbuf_low_rtt
2025-11-27 15:58 [PATCH RFC mptcp-next] mptcp: support net.ipv4.tcp_rcvbuf_low_rtt Matthieu Baerts (NGI0)
@ 2025-11-27 17:20 ` Paolo Abeni
2025-11-27 17:45 ` Matthieu Baerts
2025-11-27 17:24 ` MPTCP CI
1 sibling, 1 reply; 4+ messages in thread
From: Paolo Abeni @ 2025-11-27 17:20 UTC (permalink / raw)
To: Matthieu Baerts (NGI0), MPTCP Upstream
On 11/27/25 4:58 PM, Matthieu Baerts (NGI0) wrote:
> This is a follow up of commit ecfea98b7d0d ("tcp: add
> net.ipv4.tcp_rcvbuf_low_rtt"), but adapted to MPTCP.
>
> MPTCP has mptcp_rcvbuf_grow(), which is similar to tcp_rcvbuf_grow, but
> adapted for the MPTCP-level socket.
>
> The idea here is similar to what has been done on TCP side: not let
> mptcp_rcvbuf_grow() grow sk->sk_rcvbuf too fast for small RTT flows.
> Quoting Eric: If sk->sk_rcvbuf is too big, this can force NIC driver to
> not recycle pages from their page pool, and also can cause cache
> evictions for DDIO enabled cpus/NIC, as receivers are usually slower
> than senders.
>
> If RTT if smaller than the new net.ipv4.tcp_rcvbuf_low_rtt sysctl value,
> use the RTT / tcp_rcvbuf_low_rtt ratio to control sk_rcvbuf inflation.
Instead of duplicating the TCP math, I suggest factoring it out into a
helper and using it in both the TCP and MPTCP code.
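One possible shape for such a helper, as a rough userspace sketch only: the
names rcvbuf_grow_amount(), tcp_like_rcvwin() and mptcp_like_rcvwin() are
hypothetical and do not exist in the kernel, and the two callers only model
how each side might feed its own RTT estimate (stored shifted left by 3, as in
the patch) into the shared math.

```c
#include <stdint.h>

/* Hypothetical shared helper: returns how much to add to rcvwin.
 * Assumption: oldval != 0 and newval > oldval, checked by the caller.
 */
static uint64_t rcvbuf_grow_amount(uint32_t rcvwin, uint32_t newval,
				   uint32_t oldval, uint32_t rtt_us,
				   uint32_t rtt_threshold)
{
	/* Small RTT: grow by rcvwin * rtt_us / rtt_threshold only. */
	if (rtt_us < rtt_threshold)
		return ((uint64_t)rcvwin * rtt_us) / rtt_threshold;

	/* Slow start: allow the sender to double its rate. */
	return (((uint64_t)rcvwin << 1) * (newval - oldval)) / oldval;
}

/* Model of a TCP-side caller; rtt_us_x8 is the RTT estimate << 3. */
static uint32_t tcp_like_rcvwin(uint32_t newval, uint32_t oldval,
				uint32_t rtt_us_x8, uint32_t rtt_threshold)
{
	uint32_t rcvwin = newval << 1;

	return rcvwin + (uint32_t)rcvbuf_grow_amount(rcvwin, newval, oldval,
						     rtt_us_x8 >> 3,
						     rtt_threshold);
}

/* Model of an MPTCP-side caller: same math, plus the out-of-order span
 * that mptcp_rcvbuf_grow() adds on top (ooo_bytes is a stand-in for it).
 */
static uint32_t mptcp_like_rcvwin(uint32_t newval, uint32_t oldval,
				  uint32_t rtt_us_x8, uint32_t rtt_threshold,
				  uint32_t ooo_bytes)
{
	return tcp_like_rcvwin(newval, oldval, rtt_us_x8, rtt_threshold) +
	       ooo_bytes;
}
```

The point of this layout is that only the source of the RTT value and the
MPTCP-specific out-of-order adjustment differ between the two callers.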
/P
* Re: [PATCH RFC mptcp-next] mptcp: support net.ipv4.tcp_rcvbuf_low_rtt
2025-11-27 17:20 ` Paolo Abeni
@ 2025-11-27 17:45 ` Matthieu Baerts
0 siblings, 0 replies; 4+ messages in thread
From: Matthieu Baerts @ 2025-11-27 17:45 UTC (permalink / raw)
To: Paolo Abeni, MPTCP Upstream
Hi Paolo,
On 27/11/2025 18:20, Paolo Abeni wrote:
> On 11/27/25 4:58 PM, Matthieu Baerts (NGI0) wrote:
>> This is a follow up of commit ecfea98b7d0d ("tcp: add
>> net.ipv4.tcp_rcvbuf_low_rtt"), but adapted to MPTCP.
>>
>> MPTCP has mptcp_rcvbuf_grow(), which is similar to tcp_rcvbuf_grow, but
>> adapted for the MPTCP-level socket.
>>
>> The idea here is similar to what has been done on TCP side: not let
>> mptcp_rcvbuf_grow() grow sk->sk_rcvbuf too fast for small RTT flows.
>> Quoting Eric: If sk->sk_rcvbuf is too big, this can force NIC driver to
>> not recycle pages from their page pool, and also can cause cache
>> evictions for DDIO enabled cpus/NIC, as receivers are usually slower
>> than senders.
>>
>> If RTT if smaller than the new net.ipv4.tcp_rcvbuf_low_rtt sysctl value,
>> use the RTT / tcp_rcvbuf_low_rtt ratio to control sk_rcvbuf inflation.
>
> Instead of duplicating the TCP math, I suggest factoring it out in an
> helper and use it in both the TCP and MPTCP code.
Thank you for the review! Good idea!
I guess this patch can wait for the next cycle, right? Or should I rush to
get this in soon to stay in sync with TCP?
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
* Re: [PATCH RFC mptcp-next] mptcp: support net.ipv4.tcp_rcvbuf_low_rtt
2025-11-27 15:58 [PATCH RFC mptcp-next] mptcp: support net.ipv4.tcp_rcvbuf_low_rtt Matthieu Baerts (NGI0)
2025-11-27 17:20 ` Paolo Abeni
@ 2025-11-27 17:24 ` MPTCP CI
1 sibling, 0 replies; 4+ messages in thread
From: MPTCP CI @ 2025-11-27 17:24 UTC (permalink / raw)
To: Matthieu Baerts; +Cc: mptcp
Hi Matthieu,
Thank you for your modifications, that's great!
Our CI did some validations and here is its report:
- KVM Validation: normal (except selftest_mptcp_join): Success! ✅
- KVM Validation: normal (only selftest_mptcp_join): Success! ✅
- KVM Validation: debug (except selftest_mptcp_join): Unstable: 1 failed test(s): packetdrill_add_addr 🔴
- KVM Validation: debug (only selftest_mptcp_join): Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/19742476256
Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/3d5676c09a8f
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1028361
If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:
    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal
For more details:
https://github.com/multipath-tcp/mptcp-upstream-virtme-docker
Please note that despite all the efforts already made to keep the test suite
stable when executed on a public CI like this one, some reported issues might
not be caused by your modifications. Still, do not hesitate to help us improve
that ;-)
Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)