* [PATCH net v3 0/4] tcp: fix receive autotune again
@ 2025-10-28 11:57 Matthieu Baerts (NGI0)
2025-10-28 11:58 ` [PATCH net v3 2/4] trace: tcp: add three metrics to trace_tcp_rcvbuf_grow() Matthieu Baerts (NGI0)
2025-10-30 0:40 ` [PATCH net v3 0/4] tcp: fix receive autotune again patchwork-bot+netdevbpf
0 siblings, 2 replies; 4+ messages in thread
From: Matthieu Baerts (NGI0) @ 2025-10-28 11:57 UTC
To: Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller,
Jakub Kicinski, Paolo Abeni, Simon Horman, David Ahern,
Matthieu Baerts, Mat Martineau, Geliang Tang
Cc: netdev, linux-kernel, mptcp, Steven Rostedt, Masami Hiramatsu,
Mathieu Desnoyers, linux-trace-kernel
Neal Cardwell found that recent kernels were hitting RWIN-limited
issues, even when net.ipv4.tcp_rmem[2] was set to a very large value
such as 512MB.
He suspected that the tcp_stream default buffer size (64KB) was triggering
the heuristic added in ea33537d8292 ("tcp: add receive queue awareness
in tcp_rcv_space_adjust()").
After more testing, it turned out the bug was introduced earlier,
by commit 65c5287892e9 ("tcp: fix sk_rcvbuf overshoot").
I forgot once again that DRS (Dynamic Right-Sizing) has one RTT of latency.
MPTCP has the same issue.
This series:
- Prevents calling tcp_rcvbuf_grow() on some MPTCP subflows.
- Adds rcv_ssthresh, window_clamp and rcv_wnd to trace_tcp_rcvbuf_grow().
- Refactors code in a patch with no functional changes.
- Fixes the issue in the final patch.
Signed-off-by: Eric Dumazet <edumazet@google.com>
[ Added patch 1/4. ]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
Changes in v3:
- Fix warnings at build time by moving 'oldval' declaration (Matthieu)
- Prevent possible divide by zero issue in mptcp_rcv_space_adjust() (Paolo)
- Note: this v3 is not being sent by Eric because he is unavailable.
- Link to v2: https://patch.msgid.link/20251027073809.2112498-1-edumazet@google.com
Changes in v2:
- Rebased to net tree
- Changed mptcp_rcvbuf_grow() to read/write msk->rcvq_space.space (Paolo)
- Link to v1: https://patch.msgid.link/20251024075027.3178786-1-edumazet@google.com
---
Eric Dumazet (3):
trace: tcp: add three metrics to trace_tcp_rcvbuf_grow()
tcp: add newval parameter to tcp_rcvbuf_grow()
tcp: fix too slow tcp_rcvbuf_grow() action
Paolo Abeni (1):
mptcp: fix subflow rcvbuf adjust
include/net/tcp.h | 2 +-
include/trace/events/tcp.h | 9 +++++++++
net/ipv4/tcp_input.c | 21 ++++++++++++++-------
net/mptcp/protocol.c | 26 +++++++++++++++++---------
4 files changed, 41 insertions(+), 17 deletions(-)
---
base-commit: 210b35d6a7ea415494ce75490c4b43b4e717d935
change-id: 20251028-net-tcp-recv-autotune-5876d6d85d8a
Best regards,
--
Matthieu Baerts (NGI0) <matttbe@kernel.org>
* [PATCH net v3 2/4] trace: tcp: add three metrics to trace_tcp_rcvbuf_grow()
From: Matthieu Baerts (NGI0) @ 2025-10-28 11:58 UTC
To: Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller,
Jakub Kicinski, Paolo Abeni, Simon Horman, David Ahern,
Matthieu Baerts, Mat Martineau, Geliang Tang
Cc: netdev, linux-kernel, mptcp, Steven Rostedt, Masami Hiramatsu,
Mathieu Desnoyers, linux-trace-kernel
From: Eric Dumazet <edumazet@google.com>
While chasing yet another receive autotuning bug,
I found it useful to add rcv_ssthresh, window_clamp and rcv_wnd.
tcp_stream 40597 [068] 2172.978198: tcp:tcp_rcvbuf_grow: time=50307 rtt_us=50179 copied=77824 inq=0 space=40960 ooo=0 scaling_ratio=219 rcvbuf=131072 rcv_ssthresh=107474 window_clamp=112128 rcv_wnd=110592
tcp_stream 40597 [068] 2173.028528: tcp:tcp_rcvbuf_grow: time=50336 rtt_us=50206 copied=110592 inq=0 space=77824 ooo=0 scaling_ratio=219 rcvbuf=509444 rcv_ssthresh=328658 window_clamp=435813 rcv_wnd=331776
tcp_stream 40597 [068] 2173.078830: tcp:tcp_rcvbuf_grow: time=50305 rtt_us=50070 copied=270336 inq=0 space=110592 ooo=0 scaling_ratio=219 rcvbuf=509444 rcv_ssthresh=431159 window_clamp=435813 rcv_wnd=434176
tcp_stream 40597 [068] 2173.129137: tcp:tcp_rcvbuf_grow: time=50313 rtt_us=50118 copied=434176 inq=0 space=270336 ooo=0 scaling_ratio=219 rcvbuf=2457847 rcv_ssthresh=1299511 window_clamp=2102611 rcv_wnd=1302528
tcp_stream 40597 [068] 2173.179451: tcp:tcp_rcvbuf_grow: time=50318 rtt_us=50041 copied=1019904 inq=0 space=434176 ooo=0 scaling_ratio=219 rcvbuf=2457847 rcv_ssthresh=2087445 window_clamp=2102611 rcv_wnd=2088960
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
To: Steven Rostedt <rostedt@goodmis.org>
To: Masami Hiramatsu <mhiramat@kernel.org>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: linux-trace-kernel@vger.kernel.org
---
include/trace/events/tcp.h | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
index 9d2c36c6a0ed..6757233bd064 100644
--- a/include/trace/events/tcp.h
+++ b/include/trace/events/tcp.h
@@ -218,6 +218,9 @@ TRACE_EVENT(tcp_rcvbuf_grow,
 		__field(__u32, space)
 		__field(__u32, ooo_space)
 		__field(__u32, rcvbuf)
+		__field(__u32, rcv_ssthresh)
+		__field(__u32, window_clamp)
+		__field(__u32, rcv_wnd)
 		__field(__u8, scaling_ratio)
 		__field(__u16, sport)
 		__field(__u16, dport)
@@ -245,6 +248,9 @@ TRACE_EVENT(tcp_rcvbuf_grow,
 				     tp->rcv_nxt;
 		__entry->rcvbuf = sk->sk_rcvbuf;
+		__entry->rcv_ssthresh = tp->rcv_ssthresh;
+		__entry->window_clamp = tp->window_clamp;
+		__entry->rcv_wnd = tp->rcv_wnd;
 		__entry->scaling_ratio = tp->scaling_ratio;
 		__entry->sport = ntohs(inet->inet_sport);
 		__entry->dport = ntohs(inet->inet_dport);
@@ -264,11 +270,14 @@ TRACE_EVENT(tcp_rcvbuf_grow,
 	),
 	TP_printk("time=%u rtt_us=%u copied=%u inq=%u space=%u ooo=%u scaling_ratio=%u rcvbuf=%u "
+		  "rcv_ssthresh=%u window_clamp=%u rcv_wnd=%u "
 		  "family=%s sport=%hu dport=%hu saddr=%pI4 daddr=%pI4 "
 		  "saddrv6=%pI6c daddrv6=%pI6c skaddr=%p sock_cookie=%llx",
 		  __entry->time, __entry->rtt_us, __entry->copied,
 		  __entry->inq, __entry->space, __entry->ooo_space,
 		  __entry->scaling_ratio, __entry->rcvbuf,
+		  __entry->rcv_ssthresh, __entry->window_clamp,
+		  __entry->rcv_wnd,
 		  show_family_name(__entry->family),
 		  __entry->sport, __entry->dport,
 		  __entry->saddr, __entry->daddr,
--
2.51.0
* Re: [PATCH net v3 2/4] trace: tcp: add three metrics to trace_tcp_rcvbuf_grow()
From: Neal Cardwell @ 2025-10-29 13:41 UTC
To: Matthieu Baerts (NGI0)
Cc: Eric Dumazet, Kuniyuki Iwashima, David S. Miller, Jakub Kicinski,
Paolo Abeni, Simon Horman, David Ahern, Mat Martineau,
Geliang Tang, netdev, linux-kernel, mptcp, Steven Rostedt,
Masami Hiramatsu, Mathieu Desnoyers, linux-trace-kernel
On Tue, Oct 28, 2025 at 7:58 AM Matthieu Baerts (NGI0)
<matttbe@kernel.org> wrote:
>
> From: Eric Dumazet <edumazet@google.com>
>
> While chasing yet another receive autotuning bug,
> I found it useful to add rcv_ssthresh, window_clamp and rcv_wnd.
>
> tcp_stream 40597 [068] 2172.978198: tcp:tcp_rcvbuf_grow: time=50307 rtt_us=50179 copied=77824 inq=0 space=40960 ooo=0 scaling_ratio=219 rcvbuf=131072 rcv_ssthresh=107474 window_clamp=112128 rcv_wnd=110592
> tcp_stream 40597 [068] 2173.028528: tcp:tcp_rcvbuf_grow: time=50336 rtt_us=50206 copied=110592 inq=0 space=77824 ooo=0 scaling_ratio=219 rcvbuf=509444 rcv_ssthresh=328658 window_clamp=435813 rcv_wnd=331776
> tcp_stream 40597 [068] 2173.078830: tcp:tcp_rcvbuf_grow: time=50305 rtt_us=50070 copied=270336 inq=0 space=110592 ooo=0 scaling_ratio=219 rcvbuf=509444 rcv_ssthresh=431159 window_clamp=435813 rcv_wnd=434176
> tcp_stream 40597 [068] 2173.129137: tcp:tcp_rcvbuf_grow: time=50313 rtt_us=50118 copied=434176 inq=0 space=270336 ooo=0 scaling_ratio=219 rcvbuf=2457847 rcv_ssthresh=1299511 window_clamp=2102611 rcv_wnd=1302528
> tcp_stream 40597 [068] 2173.179451: tcp:tcp_rcvbuf_grow: time=50318 rtt_us=50041 copied=1019904 inq=0 space=434176 ooo=0 scaling_ratio=219 rcvbuf=2457847 rcv_ssthresh=2087445 window_clamp=2102611 rcv_wnd=2088960
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
> ---
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Thanks, Eric and Matthieu!
neal
* Re: [PATCH net v3 0/4] tcp: fix receive autotune again
From: patchwork-bot+netdevbpf @ 2025-10-30 0:40 UTC
To: Matthieu Baerts
Cc: edumazet, ncardwell, kuniyu, davem, kuba, pabeni, horms, dsahern,
martineau, geliang, netdev, linux-kernel, mptcp, rostedt,
mhiramat, mathieu.desnoyers, linux-trace-kernel
Hello:
This series was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Tue, 28 Oct 2025 12:57:58 +0100 you wrote:
> Neal Cardwell found that recent kernels were hitting RWIN-limited
> issues, even when net.ipv4.tcp_rmem[2] was set to a very large value
> such as 512MB.
>
> He suspected that the tcp_stream default buffer size (64KB) was triggering
> the heuristic added in ea33537d8292 ("tcp: add receive queue awareness
> in tcp_rcv_space_adjust()").
>
> [...]
Here is the summary with links:
- [net,v3,1/4] mptcp: fix subflow rcvbuf adjust
https://git.kernel.org/netdev/net/c/a6f0459aadf1
- [net,v3,2/4] trace: tcp: add three metrics to trace_tcp_rcvbuf_grow()
https://git.kernel.org/netdev/net/c/24990d89c23d
- [net,v3,3/4] tcp: add newval parameter to tcp_rcvbuf_grow()
https://git.kernel.org/netdev/net/c/b1e014a1f327
- [net,v3,4/4] tcp: fix too slow tcp_rcvbuf_grow() action
https://git.kernel.org/netdev/net/c/aa251c84636c
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html