linux-trace-kernel.vger.kernel.org archive mirror
* [PATCH net v3 0/4] tcp: fix receive autotune again
@ 2025-10-28 11:57 Matthieu Baerts (NGI0)
  2025-10-28 11:58 ` [PATCH net v3 2/4] trace: tcp: add three metrics to trace_tcp_rcvbuf_grow() Matthieu Baerts (NGI0)
  2025-10-30  0:40 ` [PATCH net v3 0/4] tcp: fix receive autotune again patchwork-bot+netdevbpf
  0 siblings, 2 replies; 4+ messages in thread
From: Matthieu Baerts (NGI0) @ 2025-10-28 11:57 UTC
  To: Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller,
	Jakub Kicinski, Paolo Abeni, Simon Horman, David Ahern,
	Matthieu Baerts, Mat Martineau, Geliang Tang
  Cc: netdev, linux-kernel, mptcp, Steven Rostedt, Masami Hiramatsu,
	Mathieu Desnoyers, linux-trace-kernel

Neal Cardwell found that recent kernels were having RWIN-limited
issues, even when net.ipv4.tcp_rmem[2] was set to a very big value like
512MB.

He suspected that the tcp_stream default buffer size (64KB) was triggering
the heuristic added in ea33537d8292 ("tcp: add receive queue awareness
in tcp_rcv_space_adjust()").

After more testing, it turns out the bug was added earlier
with commit 65c5287892e9 ("tcp: fix sk_rcvbuf overshoot").

I forgot once again that DRS has one RTT latency.

MPTCP has the same issue.

This series:

- Prevents calling tcp_rcvbuf_grow() on some MPTCP subflows.
- Adds rcv_ssthresh, window_clamp and rcv_wnd to trace_tcp_rcvbuf_grow().
- Refactors code in a patch with no functional changes.
- Fixes the issue in the final patch.

Signed-off-by: Eric Dumazet <edumazet@google.com>
[ Added patch 1/4. ]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
Changes in v3:
- Fix warnings at build time by moving 'oldval' declaration (Matthieu)
- Prevent possible divide by zero issue in mptcp_rcv_space_adjust() (Paolo)
- Note: this v3 is not being sent by Eric because he is unavailable.
- Link to v2: https://patch.msgid.link/20251027073809.2112498-1-edumazet@google.com
Changes in v2:
- Rebased to net tree
- Changed mptcp_rcvbuf_grow() to read/write msk->rcvq_space.space (Paolo)
- Link to v1: https://patch.msgid.link/20251024075027.3178786-1-edumazet@google.com

---
Eric Dumazet (3):
      trace: tcp: add three metrics to trace_tcp_rcvbuf_grow()
      tcp: add newval parameter to tcp_rcvbuf_grow()
      tcp: fix too slow tcp_rcvbuf_grow() action

Paolo Abeni (1):
      mptcp: fix subflow rcvbuf adjust

 include/net/tcp.h          |  2 +-
 include/trace/events/tcp.h |  9 +++++++++
 net/ipv4/tcp_input.c       | 21 ++++++++++++++-------
 net/mptcp/protocol.c       | 26 +++++++++++++++++---------
 4 files changed, 41 insertions(+), 17 deletions(-)
---
base-commit: 210b35d6a7ea415494ce75490c4b43b4e717d935
change-id: 20251028-net-tcp-recv-autotune-5876d6d85d8a

Best regards,
-- 
Matthieu Baerts (NGI0) <matttbe@kernel.org>



* [PATCH net v3 2/4] trace: tcp: add three metrics to trace_tcp_rcvbuf_grow()
  2025-10-28 11:57 [PATCH net v3 0/4] tcp: fix receive autotune again Matthieu Baerts (NGI0)
@ 2025-10-28 11:58 ` Matthieu Baerts (NGI0)
  2025-10-29 13:41   ` Neal Cardwell
  2025-10-30  0:40 ` [PATCH net v3 0/4] tcp: fix receive autotune again patchwork-bot+netdevbpf
  1 sibling, 1 reply; 4+ messages in thread
From: Matthieu Baerts (NGI0) @ 2025-10-28 11:58 UTC
  To: Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller,
	Jakub Kicinski, Paolo Abeni, Simon Horman, David Ahern,
	Matthieu Baerts, Mat Martineau, Geliang Tang
  Cc: netdev, linux-kernel, mptcp, Steven Rostedt, Masami Hiramatsu,
	Mathieu Desnoyers, linux-trace-kernel

From: Eric Dumazet <edumazet@google.com>

While chasing yet another receive autotuning bug,
I found it useful to add rcv_ssthresh, window_clamp and rcv_wnd.

tcp_stream 40597 [068]  2172.978198: tcp:tcp_rcvbuf_grow: time=50307 rtt_us=50179 copied=77824 inq=0 space=40960 ooo=0 scaling_ratio=219 rcvbuf=131072 rcv_ssthresh=107474 window_clamp=112128 rcv_wnd=110592
tcp_stream 40597 [068]  2173.028528: tcp:tcp_rcvbuf_grow: time=50336 rtt_us=50206 copied=110592 inq=0 space=77824 ooo=0 scaling_ratio=219 rcvbuf=509444 rcv_ssthresh=328658 window_clamp=435813 rcv_wnd=331776
tcp_stream 40597 [068]  2173.078830: tcp:tcp_rcvbuf_grow: time=50305 rtt_us=50070 copied=270336 inq=0 space=110592 ooo=0 scaling_ratio=219 rcvbuf=509444 rcv_ssthresh=431159 window_clamp=435813 rcv_wnd=434176
tcp_stream 40597 [068]  2173.129137: tcp:tcp_rcvbuf_grow: time=50313 rtt_us=50118 copied=434176 inq=0 space=270336 ooo=0 scaling_ratio=219 rcvbuf=2457847 rcv_ssthresh=1299511 window_clamp=2102611 rcv_wnd=1302528
tcp_stream 40597 [068]  2173.179451: tcp:tcp_rcvbuf_grow: time=50318 rtt_us=50041 copied=1019904 inq=0 space=434176 ooo=0 scaling_ratio=219 rcvbuf=2457847 rcv_ssthresh=2087445 window_clamp=2102611 rcv_wnd=2088960
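
In the sample above, window_clamp tracks rcvbuf scaled by
scaling_ratio / 256. A minimal sketch of that arithmetic, assuming the
conversion is a 64-bit multiply followed by an 8-bit right shift;
win_from_space() and RMEM_TO_WIN_SHIFT are hypothetical names used only
for this illustration, not kernel identifiers:

```c
#include <stdint.h>

/* Assumption: scaling_ratio is a fixed-point fraction of 256,
 * so the conversion ends with a right shift by 8 bits. */
#define RMEM_TO_WIN_SHIFT 8	/* hypothetical name for this sketch */

/* Window bytes usable out of `space` bytes of receive buffer memory. */
static uint32_t win_from_space(uint8_t scaling_ratio, uint32_t space)
{
	/* Widen before multiplying to avoid 32-bit overflow. */
	uint64_t scaled = (uint64_t)space * scaling_ratio;

	return (uint32_t)(scaled >> RMEM_TO_WIN_SHIFT);
}
```

With scaling_ratio=219, this yields 112128 for rcvbuf=131072, 435813 for
rcvbuf=509444 and 2102611 for rcvbuf=2457847, which are the window_clamp
values shown in the trace lines.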

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
To: Steven Rostedt <rostedt@goodmis.org>
To: Masami Hiramatsu <mhiramat@kernel.org>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: linux-trace-kernel@vger.kernel.org
---
 include/trace/events/tcp.h | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
index 9d2c36c6a0ed..6757233bd064 100644
--- a/include/trace/events/tcp.h
+++ b/include/trace/events/tcp.h
@@ -218,6 +218,9 @@ TRACE_EVENT(tcp_rcvbuf_grow,
 		__field(__u32, space)
 		__field(__u32, ooo_space)
 		__field(__u32, rcvbuf)
+		__field(__u32, rcv_ssthresh)
+		__field(__u32, window_clamp)
+		__field(__u32, rcv_wnd)
 		__field(__u8, scaling_ratio)
 		__field(__u16, sport)
 		__field(__u16, dport)
@@ -245,6 +248,9 @@ TRACE_EVENT(tcp_rcvbuf_grow,
 				     tp->rcv_nxt;
 
 		__entry->rcvbuf = sk->sk_rcvbuf;
+		__entry->rcv_ssthresh = tp->rcv_ssthresh;
+		__entry->window_clamp = tp->window_clamp;
+		__entry->rcv_wnd = tp->rcv_wnd;
 		__entry->scaling_ratio = tp->scaling_ratio;
 		__entry->sport = ntohs(inet->inet_sport);
 		__entry->dport = ntohs(inet->inet_dport);
@@ -264,11 +270,14 @@ TRACE_EVENT(tcp_rcvbuf_grow,
 	),
 
 	TP_printk("time=%u rtt_us=%u copied=%u inq=%u space=%u ooo=%u scaling_ratio=%u rcvbuf=%u "
+		  "rcv_ssthresh=%u window_clamp=%u rcv_wnd=%u "
 		  "family=%s sport=%hu dport=%hu saddr=%pI4 daddr=%pI4 "
 		  "saddrv6=%pI6c daddrv6=%pI6c skaddr=%p sock_cookie=%llx",
 		  __entry->time, __entry->rtt_us, __entry->copied,
 		  __entry->inq, __entry->space, __entry->ooo_space,
 		  __entry->scaling_ratio, __entry->rcvbuf,
+		  __entry->rcv_ssthresh, __entry->window_clamp,
+		  __entry->rcv_wnd,
 		  show_family_name(__entry->family),
 		  __entry->sport, __entry->dport,
 		  __entry->saddr, __entry->daddr,

-- 
2.51.0
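
For post-processing captured trace text, the three new fields can be
pulled out of each line with a short user-space helper. This is only a
sketch: it assumes the field order given by the TP_printk() format in
the diff above (rcvbuf, then rcv_ssthresh, window_clamp, rcv_wnd), and
struct rcvbuf_sample and parse_rcvbuf_grow() are hypothetical names
introduced for the example:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the fields of interest. */
struct rcvbuf_sample {
	unsigned int rcvbuf;
	unsigned int rcv_ssthresh;
	unsigned int window_clamp;
	unsigned int rcv_wnd;
};

/* Returns 0 on success, -1 if the line lacks the four fields
 * (for example, a line produced by a kernel without this patch). */
static int parse_rcvbuf_grow(const char *line, struct rcvbuf_sample *s)
{
	/* The leading space skips the "tcp_rcvbuf_grow:" event name. */
	const char *p = strstr(line, " rcvbuf=");

	if (!p)
		return -1;
	if (sscanf(p, " rcvbuf=%u rcv_ssthresh=%u window_clamp=%u rcv_wnd=%u",
		   &s->rcvbuf, &s->rcv_ssthresh,
		   &s->window_clamp, &s->rcv_wnd) != 4)
		return -1;
	return 0;
}
```

Comparing rcv_wnd against window_clamp across consecutive samples then
shows how quickly the advertised window approaches its clamp after each
rcvbuf increase.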



* Re: [PATCH net v3 2/4] trace: tcp: add three metrics to trace_tcp_rcvbuf_grow()
  2025-10-28 11:58 ` [PATCH net v3 2/4] trace: tcp: add three metrics to trace_tcp_rcvbuf_grow() Matthieu Baerts (NGI0)
@ 2025-10-29 13:41   ` Neal Cardwell
  0 siblings, 0 replies; 4+ messages in thread
From: Neal Cardwell @ 2025-10-29 13:41 UTC
  To: Matthieu Baerts (NGI0)
  Cc: Eric Dumazet, Kuniyuki Iwashima, David S. Miller, Jakub Kicinski,
	Paolo Abeni, Simon Horman, David Ahern, Mat Martineau,
	Geliang Tang, netdev, linux-kernel, mptcp, Steven Rostedt,
	Masami Hiramatsu, Mathieu Desnoyers, linux-trace-kernel

On Tue, Oct 28, 2025 at 7:58 AM Matthieu Baerts (NGI0)
<matttbe@kernel.org> wrote:
>
> From: Eric Dumazet <edumazet@google.com>
>
> While chasing yet another receive autotuning bug,
> I found it useful to add rcv_ssthresh, window_clamp and rcv_wnd.
>
> tcp_stream 40597 [068]  2172.978198: tcp:tcp_rcvbuf_grow: time=50307 rtt_us=50179 copied=77824 inq=0 space=40960 ooo=0 scaling_ratio=219 rcvbuf=131072 rcv_ssthresh=107474 window_clamp=112128 rcv_wnd=110592
> tcp_stream 40597 [068]  2173.028528: tcp:tcp_rcvbuf_grow: time=50336 rtt_us=50206 copied=110592 inq=0 space=77824 ooo=0 scaling_ratio=219 rcvbuf=509444 rcv_ssthresh=328658 window_clamp=435813 rcv_wnd=331776
> tcp_stream 40597 [068]  2173.078830: tcp:tcp_rcvbuf_grow: time=50305 rtt_us=50070 copied=270336 inq=0 space=110592 ooo=0 scaling_ratio=219 rcvbuf=509444 rcv_ssthresh=431159 window_clamp=435813 rcv_wnd=434176
> tcp_stream 40597 [068]  2173.129137: tcp:tcp_rcvbuf_grow: time=50313 rtt_us=50118 copied=434176 inq=0 space=270336 ooo=0 scaling_ratio=219 rcvbuf=2457847 rcv_ssthresh=1299511 window_clamp=2102611 rcv_wnd=1302528
> tcp_stream 40597 [068]  2173.179451: tcp:tcp_rcvbuf_grow: time=50318 rtt_us=50041 copied=1019904 inq=0 space=434176 ooo=0 scaling_ratio=219 rcvbuf=2457847 rcv_ssthresh=2087445 window_clamp=2102611 rcv_wnd=2088960
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
> ---

Reviewed-by: Neal Cardwell <ncardwell@google.com>

Thanks, Eric and Matthieu!

neal


* Re: [PATCH net v3 0/4] tcp: fix receive autotune again
  2025-10-28 11:57 [PATCH net v3 0/4] tcp: fix receive autotune again Matthieu Baerts (NGI0)
  2025-10-28 11:58 ` [PATCH net v3 2/4] trace: tcp: add three metrics to trace_tcp_rcvbuf_grow() Matthieu Baerts (NGI0)
@ 2025-10-30  0:40 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 4+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-10-30  0:40 UTC
  To: Matthieu Baerts
  Cc: edumazet, ncardwell, kuniyu, davem, kuba, pabeni, horms, dsahern,
	martineau, geliang, netdev, linux-kernel, mptcp, rostedt,
	mhiramat, mathieu.desnoyers, linux-trace-kernel

Hello:

This series was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Tue, 28 Oct 2025 12:57:58 +0100 you wrote:
> Neal Cardwell found that recent kernels were having RWIN-limited
> issues, even when net.ipv4.tcp_rmem[2] was set to a very big value like
> 512MB.
> 
> He suspected that the tcp_stream default buffer size (64KB) was triggering
> the heuristic added in ea33537d8292 ("tcp: add receive queue awareness
> in tcp_rcv_space_adjust()").
> 
> [...]

Here is the summary with links:
  - [net,v3,1/4] mptcp: fix subflow rcvbuf adjust
    https://git.kernel.org/netdev/net/c/a6f0459aadf1
  - [net,v3,2/4] trace: tcp: add three metrics to trace_tcp_rcvbuf_grow()
    https://git.kernel.org/netdev/net/c/24990d89c23d
  - [net,v3,3/4] tcp: add newval parameter to tcp_rcvbuf_grow()
    https://git.kernel.org/netdev/net/c/b1e014a1f327
  - [net,v3,4/4] tcp: fix too slow tcp_rcvbuf_grow() action
    https://git.kernel.org/netdev/net/c/aa251c84636c

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


