* [PATCH net v3 1/4] mptcp: fix subflow rcvbuf adjust
2025-10-28 11:57 [PATCH net v3 0/4] tcp: fix receive autotune again Matthieu Baerts (NGI0)
@ 2025-10-28 11:57 ` Matthieu Baerts (NGI0)
2025-10-28 11:58 ` [PATCH net v3 2/4] trace: tcp: add three metrics to trace_tcp_rcvbuf_grow() Matthieu Baerts (NGI0)
` (3 subsequent siblings)
4 siblings, 0 replies; 9+ messages in thread
From: Matthieu Baerts (NGI0) @ 2025-10-28 11:57 UTC (permalink / raw)
To: Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller,
Jakub Kicinski, Paolo Abeni, Simon Horman, David Ahern,
Matthieu Baerts, Mat Martineau, Geliang Tang
Cc: netdev, linux-kernel, mptcp
From: Paolo Abeni <pabeni@redhat.com>
The mptcp PM can add subflow to the conn_list before tcp_init_transfer().
Calling tcp_rcvbuf_grow() on such subflow is not correct as later
init will overwrite the update.
Fix the issue calling tcp_rcvbuf_grow() only after init buffer
initialization.
Fixes: e118cdc34dd1 ("mptcp: rcvbuf auto-tuning improvement")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
net/mptcp/protocol.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 0292162a14ee..a8a3bdf95543 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2051,6 +2051,7 @@ static void mptcp_rcv_space_adjust(struct mptcp_sock *msk, int copied)
msk->rcvq_space.space = msk->rcvq_space.copied;
if (mptcp_rcvbuf_grow(sk)) {
+ int copied = msk->rcvq_space.copied;
/* Make subflows follow along. If we do not do this, we
* get drops at subflow level if skbs can't be moved to
@@ -2063,8 +2064,11 @@ static void mptcp_rcv_space_adjust(struct mptcp_sock *msk, int copied)
ssk = mptcp_subflow_tcp_sock(subflow);
slow = lock_sock_fast(ssk);
- tcp_sk(ssk)->rcvq_space.space = msk->rcvq_space.copied;
- tcp_rcvbuf_grow(ssk);
+ /* subflows can be added before tcp_init_transfer() */
+ if (tcp_sk(ssk)->rcvq_space.space) {
+ tcp_sk(ssk)->rcvq_space.space = copied;
+ tcp_rcvbuf_grow(ssk);
+ }
unlock_sock_fast(ssk, slow);
}
}
--
2.51.0
^ permalink raw reply related [flat|nested] 9+ messages in thread* [PATCH net v3 2/4] trace: tcp: add three metrics to trace_tcp_rcvbuf_grow()
2025-10-28 11:57 [PATCH net v3 0/4] tcp: fix receive autotune again Matthieu Baerts (NGI0)
2025-10-28 11:57 ` [PATCH net v3 1/4] mptcp: fix subflow rcvbuf adjust Matthieu Baerts (NGI0)
@ 2025-10-28 11:58 ` Matthieu Baerts (NGI0)
2025-10-29 13:41 ` Neal Cardwell
2025-10-28 11:58 ` [PATCH net v3 3/4] tcp: add newval parameter to tcp_rcvbuf_grow() Matthieu Baerts (NGI0)
` (2 subsequent siblings)
4 siblings, 1 reply; 9+ messages in thread
From: Matthieu Baerts (NGI0) @ 2025-10-28 11:58 UTC (permalink / raw)
To: Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller,
Jakub Kicinski, Paolo Abeni, Simon Horman, David Ahern,
Matthieu Baerts, Mat Martineau, Geliang Tang
Cc: netdev, linux-kernel, mptcp, Steven Rostedt, Masami Hiramatsu,
Mathieu Desnoyers, linux-trace-kernel
From: Eric Dumazet <edumazet@google.com>
While chasing yet another receive autotuning bug,
I found useful to add rcv_ssthresh, window_clamp and rcv_wnd.
tcp_stream 40597 [068] 2172.978198: tcp:tcp_rcvbuf_grow: time=50307 rtt_us=50179 copied=77824 inq=0 space=40960 ooo=0 scaling_ratio=219 rcvbuf=131072 rcv_ssthresh=107474 window_clamp=112128 rcv_wnd=110592
tcp_stream 40597 [068] 2173.028528: tcp:tcp_rcvbuf_grow: time=50336 rtt_us=50206 copied=110592 inq=0 space=77824 ooo=0 scaling_ratio=219 rcvbuf=509444 rcv_ssthresh=328658 window_clamp=435813 rcv_wnd=331776
tcp_stream 40597 [068] 2173.078830: tcp:tcp_rcvbuf_grow: time=50305 rtt_us=50070 copied=270336 inq=0 space=110592 ooo=0 scaling_ratio=219 rcvbuf=509444 rcv_ssthresh=431159 window_clamp=435813 rcv_wnd=434176
tcp_stream 40597 [068] 2173.129137: tcp:tcp_rcvbuf_grow: time=50313 rtt_us=50118 copied=434176 inq=0 space=270336 ooo=0 scaling_ratio=219 rcvbuf=2457847 rcv_ssthresh=1299511 window_clamp=2102611 rcv_wnd=1302528
tcp_stream 40597 [068] 2173.179451: tcp:tcp_rcvbuf_grow: time=50318 rtt_us=50041 copied=1019904 inq=0 space=434176 ooo=0 scaling_ratio=219 rcvbuf=2457847 rcv_ssthresh=2087445 window_clamp=2102611 rcv_wnd=2088960
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
To: Steven Rostedt <rostedt@goodmis.org>
To: Masami Hiramatsu <mhiramat@kernel.org>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: linux-trace-kernel@vger.kernel.org
---
include/trace/events/tcp.h | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
index 9d2c36c6a0ed..6757233bd064 100644
--- a/include/trace/events/tcp.h
+++ b/include/trace/events/tcp.h
@@ -218,6 +218,9 @@ TRACE_EVENT(tcp_rcvbuf_grow,
__field(__u32, space)
__field(__u32, ooo_space)
__field(__u32, rcvbuf)
+ __field(__u32, rcv_ssthresh)
+ __field(__u32, window_clamp)
+ __field(__u32, rcv_wnd)
__field(__u8, scaling_ratio)
__field(__u16, sport)
__field(__u16, dport)
@@ -245,6 +248,9 @@ TRACE_EVENT(tcp_rcvbuf_grow,
tp->rcv_nxt;
__entry->rcvbuf = sk->sk_rcvbuf;
+ __entry->rcv_ssthresh = tp->rcv_ssthresh;
+ __entry->window_clamp = tp->window_clamp;
+ __entry->rcv_wnd = tp->rcv_wnd;
__entry->scaling_ratio = tp->scaling_ratio;
__entry->sport = ntohs(inet->inet_sport);
__entry->dport = ntohs(inet->inet_dport);
@@ -264,11 +270,14 @@ TRACE_EVENT(tcp_rcvbuf_grow,
),
TP_printk("time=%u rtt_us=%u copied=%u inq=%u space=%u ooo=%u scaling_ratio=%u rcvbuf=%u "
+ "rcv_ssthresh=%u window_clamp=%u rcv_wnd=%u "
"family=%s sport=%hu dport=%hu saddr=%pI4 daddr=%pI4 "
"saddrv6=%pI6c daddrv6=%pI6c skaddr=%p sock_cookie=%llx",
__entry->time, __entry->rtt_us, __entry->copied,
__entry->inq, __entry->space, __entry->ooo_space,
__entry->scaling_ratio, __entry->rcvbuf,
+ __entry->rcv_ssthresh, __entry->window_clamp,
+ __entry->rcv_wnd,
show_family_name(__entry->family),
__entry->sport, __entry->dport,
__entry->saddr, __entry->daddr,
--
2.51.0
^ permalink raw reply related [flat|nested] 9+ messages in thread* Re: [PATCH net v3 2/4] trace: tcp: add three metrics to trace_tcp_rcvbuf_grow()
2025-10-28 11:58 ` [PATCH net v3 2/4] trace: tcp: add three metrics to trace_tcp_rcvbuf_grow() Matthieu Baerts (NGI0)
@ 2025-10-29 13:41 ` Neal Cardwell
0 siblings, 0 replies; 9+ messages in thread
From: Neal Cardwell @ 2025-10-29 13:41 UTC (permalink / raw)
To: Matthieu Baerts (NGI0)
Cc: Eric Dumazet, Kuniyuki Iwashima, David S. Miller, Jakub Kicinski,
Paolo Abeni, Simon Horman, David Ahern, Mat Martineau,
Geliang Tang, netdev, linux-kernel, mptcp, Steven Rostedt,
Masami Hiramatsu, Mathieu Desnoyers, linux-trace-kernel
On Tue, Oct 28, 2025 at 7:58 AM Matthieu Baerts (NGI0)
<matttbe@kernel.org> wrote:
>
> From: Eric Dumazet <edumazet@google.com>
>
> While chasing yet another receive autotuning bug,
> I found useful to add rcv_ssthresh, window_clamp and rcv_wnd.
>
> tcp_stream 40597 [068] 2172.978198: tcp:tcp_rcvbuf_grow: time=50307 rtt_us=50179 copied=77824 inq=0 space=40960 ooo=0 scaling_ratio=219 rcvbuf=131072 rcv_ssthresh=107474 window_clamp=112128 rcv_wnd=110592
> tcp_stream 40597 [068] 2173.028528: tcp:tcp_rcvbuf_grow: time=50336 rtt_us=50206 copied=110592 inq=0 space=77824 ooo=0 scaling_ratio=219 rcvbuf=509444 rcv_ssthresh=328658 window_clamp=435813 rcv_wnd=331776
> tcp_stream 40597 [068] 2173.078830: tcp:tcp_rcvbuf_grow: time=50305 rtt_us=50070 copied=270336 inq=0 space=110592 ooo=0 scaling_ratio=219 rcvbuf=509444 rcv_ssthresh=431159 window_clamp=435813 rcv_wnd=434176
> tcp_stream 40597 [068] 2173.129137: tcp:tcp_rcvbuf_grow: time=50313 rtt_us=50118 copied=434176 inq=0 space=270336 ooo=0 scaling_ratio=219 rcvbuf=2457847 rcv_ssthresh=1299511 window_clamp=2102611 rcv_wnd=1302528
> tcp_stream 40597 [068] 2173.179451: tcp:tcp_rcvbuf_grow: time=50318 rtt_us=50041 copied=1019904 inq=0 space=434176 ooo=0 scaling_ratio=219 rcvbuf=2457847 rcv_ssthresh=2087445 window_clamp=2102611 rcv_wnd=2088960
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
> ---
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Thanks, Eric and Matthieu!
neal
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH net v3 3/4] tcp: add newval parameter to tcp_rcvbuf_grow()
2025-10-28 11:57 [PATCH net v3 0/4] tcp: fix receive autotune again Matthieu Baerts (NGI0)
2025-10-28 11:57 ` [PATCH net v3 1/4] mptcp: fix subflow rcvbuf adjust Matthieu Baerts (NGI0)
2025-10-28 11:58 ` [PATCH net v3 2/4] trace: tcp: add three metrics to trace_tcp_rcvbuf_grow() Matthieu Baerts (NGI0)
@ 2025-10-28 11:58 ` Matthieu Baerts (NGI0)
2025-10-29 13:42 ` Neal Cardwell
2025-10-28 11:58 ` [PATCH net v3 4/4] tcp: fix too slow tcp_rcvbuf_grow() action Matthieu Baerts (NGI0)
2025-10-30 0:40 ` [PATCH net v3 0/4] tcp: fix receive autotune again patchwork-bot+netdevbpf
4 siblings, 1 reply; 9+ messages in thread
From: Matthieu Baerts (NGI0) @ 2025-10-28 11:58 UTC (permalink / raw)
To: Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller,
Jakub Kicinski, Paolo Abeni, Simon Horman, David Ahern,
Matthieu Baerts, Mat Martineau, Geliang Tang
Cc: netdev, linux-kernel, mptcp
From: Eric Dumazet <edumazet@google.com>
This patch has no functional change, and prepares the following one.
tcp_rcvbuf_grow() will need to have access to tp->rcvq_space.space
old and new values.
Change mptcp_rcvbuf_grow() in a similar way.
Signed-off-by: Eric Dumazet <edumazet@google.com>
[ Moved 'oldval' declaration to the next patch to avoid warnings at
build time. ]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
include/net/tcp.h | 2 +-
net/ipv4/tcp_input.c | 14 +++++++-------
net/mptcp/protocol.c | 20 ++++++++------------
3 files changed, 16 insertions(+), 20 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 5ca230ed526a..ab20f549b8f9 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -370,7 +370,7 @@ void tcp_delack_timer_handler(struct sock *sk);
int tcp_ioctl(struct sock *sk, int cmd, int *karg);
enum skb_drop_reason tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb);
void tcp_rcv_established(struct sock *sk, struct sk_buff *skb);
-void tcp_rcvbuf_grow(struct sock *sk);
+void tcp_rcvbuf_grow(struct sock *sk, u32 newval);
void tcp_rcv_space_adjust(struct sock *sk);
int tcp_twsk_unique(struct sock *sk, struct sock *sktw, void *twp);
void tcp_twsk_destructor(struct sock *sk);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 31ea5af49f2d..cb4e07f84ae2 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -891,18 +891,20 @@ static inline void tcp_rcv_rtt_measure_ts(struct sock *sk,
}
}
-void tcp_rcvbuf_grow(struct sock *sk)
+void tcp_rcvbuf_grow(struct sock *sk, u32 newval)
{
const struct net *net = sock_net(sk);
struct tcp_sock *tp = tcp_sk(sk);
- int rcvwin, rcvbuf, cap;
+ u32 rcvwin, rcvbuf, cap;
+
+ tp->rcvq_space.space = newval;
if (!READ_ONCE(net->ipv4.sysctl_tcp_moderate_rcvbuf) ||
(sk->sk_userlocks & SOCK_RCVBUF_LOCK))
return;
/* slow start: allow the sender to double its rate. */
- rcvwin = tp->rcvq_space.space << 1;
+ rcvwin = newval << 1;
if (!RB_EMPTY_ROOT(&tp->out_of_order_queue))
rcvwin += TCP_SKB_CB(tp->ooo_last_skb)->end_seq - tp->rcv_nxt;
@@ -943,9 +945,7 @@ void tcp_rcv_space_adjust(struct sock *sk)
trace_tcp_rcvbuf_grow(sk, time);
- tp->rcvq_space.space = copied;
-
- tcp_rcvbuf_grow(sk);
+ tcp_rcvbuf_grow(sk, copied);
new_measure:
tp->rcvq_space.seq = tp->copied_seq;
@@ -5270,7 +5270,7 @@ static void tcp_data_queue_ofo(struct sock *sk, struct sk_buff *skb)
}
/* do not grow rcvbuf for not-yet-accepted or orphaned sockets. */
if (sk->sk_socket)
- tcp_rcvbuf_grow(sk);
+ tcp_rcvbuf_grow(sk, tp->rcvq_space.space);
}
static int __must_check tcp_queue_rcv(struct sock *sk, struct sk_buff *skb,
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index a8a3bdf95543..052a0c62023f 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -194,17 +194,18 @@ static bool mptcp_ooo_try_coalesce(struct mptcp_sock *msk, struct sk_buff *to,
* - mptcp does not maintain a msk-level window clamp
* - returns true when the receive buffer is actually updated
*/
-static bool mptcp_rcvbuf_grow(struct sock *sk)
+static bool mptcp_rcvbuf_grow(struct sock *sk, u32 newval)
{
struct mptcp_sock *msk = mptcp_sk(sk);
const struct net *net = sock_net(sk);
- int rcvwin, rcvbuf, cap;
+ u32 rcvwin, rcvbuf, cap;
+ msk->rcvq_space.space = newval;
if (!READ_ONCE(net->ipv4.sysctl_tcp_moderate_rcvbuf) ||
(sk->sk_userlocks & SOCK_RCVBUF_LOCK))
return false;
- rcvwin = msk->rcvq_space.space << 1;
+ rcvwin = newval << 1;
if (!RB_EMPTY_ROOT(&msk->out_of_order_queue))
rcvwin += MPTCP_SKB_CB(msk->ooo_last_skb)->end_seq - msk->ack_seq;
@@ -334,7 +335,7 @@ static void mptcp_data_queue_ofo(struct mptcp_sock *msk, struct sk_buff *skb)
skb_set_owner_r(skb, sk);
/* do not grow rcvbuf for not-yet-accepted or orphaned sockets. */
if (sk->sk_socket)
- mptcp_rcvbuf_grow(sk);
+ mptcp_rcvbuf_grow(sk, msk->rcvq_space.space);
}
static void mptcp_init_skb(struct sock *ssk, struct sk_buff *skb, int offset,
@@ -2049,10 +2050,7 @@ static void mptcp_rcv_space_adjust(struct mptcp_sock *msk, int copied)
if (msk->rcvq_space.copied <= msk->rcvq_space.space)
goto new_measure;
- msk->rcvq_space.space = msk->rcvq_space.copied;
- if (mptcp_rcvbuf_grow(sk)) {
- int copied = msk->rcvq_space.copied;
-
+ if (mptcp_rcvbuf_grow(sk, msk->rcvq_space.copied)) {
/* Make subflows follow along. If we do not do this, we
* get drops at subflow level if skbs can't be moved to
* the mptcp rx queue fast enough (announced rcv_win can
@@ -2065,10 +2063,8 @@ static void mptcp_rcv_space_adjust(struct mptcp_sock *msk, int copied)
ssk = mptcp_subflow_tcp_sock(subflow);
slow = lock_sock_fast(ssk);
/* subflows can be added before tcp_init_transfer() */
- if (tcp_sk(ssk)->rcvq_space.space) {
- tcp_sk(ssk)->rcvq_space.space = copied;
- tcp_rcvbuf_grow(ssk);
- }
+ if (tcp_sk(ssk)->rcvq_space.space)
+ tcp_rcvbuf_grow(ssk, msk->rcvq_space.copied);
unlock_sock_fast(ssk, slow);
}
}
--
2.51.0
^ permalink raw reply related [flat|nested] 9+ messages in thread* Re: [PATCH net v3 3/4] tcp: add newval parameter to tcp_rcvbuf_grow()
2025-10-28 11:58 ` [PATCH net v3 3/4] tcp: add newval parameter to tcp_rcvbuf_grow() Matthieu Baerts (NGI0)
@ 2025-10-29 13:42 ` Neal Cardwell
0 siblings, 0 replies; 9+ messages in thread
From: Neal Cardwell @ 2025-10-29 13:42 UTC (permalink / raw)
To: Matthieu Baerts (NGI0)
Cc: Eric Dumazet, Kuniyuki Iwashima, David S. Miller, Jakub Kicinski,
Paolo Abeni, Simon Horman, David Ahern, Mat Martineau,
Geliang Tang, netdev, linux-kernel, mptcp
On Tue, Oct 28, 2025 at 7:58 AM Matthieu Baerts (NGI0)
<matttbe@kernel.org> wrote:
>
> From: Eric Dumazet <edumazet@google.com>
>
> This patch has no functional change, and prepares the following one.
>
> tcp_rcvbuf_grow() will need to have access to tp->rcvq_space.space
> old and new values.
>
> Change mptcp_rcvbuf_grow() in a similar way.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> [ Moved 'oldval' declaration to the next patch to avoid warnings at
> build time. ]
> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
> ---
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Thanks, Eric and Matthieu!
neal
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH net v3 4/4] tcp: fix too slow tcp_rcvbuf_grow() action
2025-10-28 11:57 [PATCH net v3 0/4] tcp: fix receive autotune again Matthieu Baerts (NGI0)
` (2 preceding siblings ...)
2025-10-28 11:58 ` [PATCH net v3 3/4] tcp: add newval parameter to tcp_rcvbuf_grow() Matthieu Baerts (NGI0)
@ 2025-10-28 11:58 ` Matthieu Baerts (NGI0)
2025-10-29 13:42 ` Neal Cardwell
2025-10-30 0:40 ` [PATCH net v3 0/4] tcp: fix receive autotune again patchwork-bot+netdevbpf
4 siblings, 1 reply; 9+ messages in thread
From: Matthieu Baerts (NGI0) @ 2025-10-28 11:58 UTC (permalink / raw)
To: Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller,
Jakub Kicinski, Paolo Abeni, Simon Horman, David Ahern,
Matthieu Baerts, Mat Martineau, Geliang Tang
Cc: netdev, linux-kernel, mptcp
From: Eric Dumazet <edumazet@google.com>
While the blamed commits apparently avoided an overshoot,
they also limited how fast a sender can increase BDP at each RTT.
This is not exactly a revert, we do not add the 16 * tp->advmss
cushion we had, and we are keeping the out_of_order_queue
contribution.
Do the same in mptcp_rcvbuf_grow().
Tested:
emulated 50ms rtt (tcp_stream --tcp-tx-delay 50000), cubic 20 second flow.
net.ipv4.tcp_rmem set to "4096 131072 67000000"
perf record -a -e tcp:tcp_rcvbuf_grow sleep 20
perf script
Before:
We can see we fail to roughly double RWIN at each RTT.
Sender is RWIN limited while CWND is ramping up (before getting tcp_wmem
limited).
tcp_stream 33793 [010] 825.717525: tcp:tcp_rcvbuf_grow: time=100869 rtt_us=50428 copied=49152 inq=0 space=40960 ooo=0 scaling_ratio=219 rcvbuf=131072 rcv_ssthresh=103970 window_clamp=112128 rcv_wnd=106496
tcp_stream 33793 [010] 825.768966: tcp:tcp_rcvbuf_grow: time=51447 rtt_us=50362 copied=86016 inq=0 space=49152 ooo=0 scaling_ratio=219 rcvbuf=131072 rcv_ssthresh=107474 window_clamp=112128 rcv_wnd=106496
tcp_stream 33793 [010] 825.821539: tcp:tcp_rcvbuf_grow: time=52577 rtt_us=50243 copied=114688 inq=0 space=86016 ooo=0 scaling_ratio=219 rcvbuf=201096 rcv_ssthresh=167377 window_clamp=172031 rcv_wnd=167936
tcp_stream 33793 [010] 825.871781: tcp:tcp_rcvbuf_grow: time=50248 rtt_us=50237 copied=167936 inq=0 space=114688 ooo=0 scaling_ratio=219 rcvbuf=268129 rcv_ssthresh=224722 window_clamp=229375 rcv_wnd=225280
tcp_stream 33793 [010] 825.922475: tcp:tcp_rcvbuf_grow: time=50698 rtt_us=50183 copied=241664 inq=0 space=167936 ooo=0 scaling_ratio=219 rcvbuf=392617 rcv_ssthresh=331217 window_clamp=335871 rcv_wnd=323584
tcp_stream 33793 [010] 825.973326: tcp:tcp_rcvbuf_grow: time=50855 rtt_us=50213 copied=339968 inq=0 space=241664 ooo=0 scaling_ratio=219 rcvbuf=564986 rcv_ssthresh=478674 window_clamp=483327 rcv_wnd=462848
tcp_stream 33793 [010] 826.023970: tcp:tcp_rcvbuf_grow: time=50647 rtt_us=50248 copied=491520 inq=0 space=339968 ooo=0 scaling_ratio=219 rcvbuf=794811 rcv_ssthresh=671778 window_clamp=679935 rcv_wnd=651264
tcp_stream 33793 [010] 826.074612: tcp:tcp_rcvbuf_grow: time=50648 rtt_us=50227 copied=700416 inq=0 space=491520 ooo=0 scaling_ratio=219 rcvbuf=1149124 rcv_ssthresh=974881 window_clamp=983039 rcv_wnd=942080
tcp_stream 33793 [010] 826.125452: tcp:tcp_rcvbuf_grow: time=50845 rtt_us=50225 copied=987136 inq=8192 space=700416 ooo=0 scaling_ratio=219 rcvbuf=1637502 rcv_ssthresh=1392674 window_clamp=1400831 rcv_wnd=1339392
tcp_stream 33793 [010] 826.175698: tcp:tcp_rcvbuf_grow: time=50250 rtt_us=50198 copied=1347584 inq=0 space=978944 ooo=0 scaling_ratio=219 rcvbuf=2288672 rcv_ssthresh=1949729 window_clamp=1957887 rcv_wnd=1945600
tcp_stream 33793 [010] 826.225947: tcp:tcp_rcvbuf_grow: time=50252 rtt_us=50240 copied=1945600 inq=0 space=1347584 ooo=0 scaling_ratio=219 rcvbuf=3150516 rcv_ssthresh=2687010 window_clamp=2695167 rcv_wnd=2691072
tcp_stream 33793 [010] 826.276175: tcp:tcp_rcvbuf_grow: time=50233 rtt_us=50224 copied=2691072 inq=0 space=1945600 ooo=0 scaling_ratio=219 rcvbuf=4548617 rcv_ssthresh=3883041 window_clamp=3891199 rcv_wnd=3887104
tcp_stream 33793 [010] 826.326403: tcp:tcp_rcvbuf_grow: time=50233 rtt_us=50229 copied=3887104 inq=0 space=2691072 ooo=0 scaling_ratio=219 rcvbuf=6291456 rcv_ssthresh=5370482 window_clamp=5382144 rcv_wnd=5373952
tcp_stream 33793 [010] 826.376723: tcp:tcp_rcvbuf_grow: time=50323 rtt_us=50218 copied=5373952 inq=0 space=3887104 ooo=0 scaling_ratio=219 rcvbuf=9087658 rcv_ssthresh=7755537 window_clamp=7774207 rcv_wnd=7757824
tcp_stream 33793 [010] 826.426991: tcp:tcp_rcvbuf_grow: time=50274 rtt_us=50196 copied=7757824 inq=180224 space=5373952 ooo=0 scaling_ratio=219 rcvbuf=12563759 rcv_ssthresh=10729233 window_clamp=10747903 rcv_wnd=10575872
tcp_stream 33793 [010] 826.477229: tcp:tcp_rcvbuf_grow: time=50241 rtt_us=50078 copied=10731520 inq=180224 space=7577600 ooo=0 scaling_ratio=219 rcvbuf=17715667 rcv_ssthresh=15136529 window_clamp=15155199 rcv_wnd=14983168
tcp_stream 33793 [010] 826.527482: tcp:tcp_rcvbuf_grow: time=50258 rtt_us=50153 copied=15138816 inq=360448 space=10551296 ooo=0 scaling_ratio=219 rcvbuf=24667870 rcv_ssthresh=21073410 window_clamp=21102591 rcv_wnd=20766720
tcp_stream 33793 [010] 826.577712: tcp:tcp_rcvbuf_grow: time=50234 rtt_us=50228 copied=21073920 inq=0 space=14778368 ooo=0 scaling_ratio=219 rcvbuf=34550339 rcv_ssthresh=29517041 window_clamp=29556735 rcv_wnd=29519872
tcp_stream 33793 [010] 826.627982: tcp:tcp_rcvbuf_grow: time=50275 rtt_us=50220 copied=29519872 inq=540672 space=21073920 ooo=0 scaling_ratio=219 rcvbuf=49268707 rcv_ssthresh=42090625 window_clamp=42147839 rcv_wnd=41627648
tcp_stream 33793 [010] 826.678274: tcp:tcp_rcvbuf_grow: time=50296 rtt_us=50185 copied=42053632 inq=761856 space=28979200 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57238168 window_clamp=57316406 rcv_wnd=56606720
tcp_stream 33793 [010] 826.728627: tcp:tcp_rcvbuf_grow: time=50357 rtt_us=50128 copied=43913216 inq=851968 space=41291776 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57290728 window_clamp=57316406 rcv_wnd=56524800
tcp_stream 33793 [010] 827.131364: tcp:tcp_rcvbuf_grow: time=50239 rtt_us=50127 copied=43843584 inq=655360 space=43061248 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57290728 window_clamp=57316406 rcv_wnd=56696832
tcp_stream 33793 [010] 827.181613: tcp:tcp_rcvbuf_grow: time=50254 rtt_us=50115 copied=43843584 inq=524288 space=43188224 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57290728 window_clamp=57316406 rcv_wnd=56807424
tcp_stream 33793 [010] 828.339635: tcp:tcp_rcvbuf_grow: time=50283 rtt_us=50110 copied=43843584 inq=458752 space=43319296 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57290728 window_clamp=57316406 rcv_wnd=56864768
tcp_stream 33793 [010] 828.440350: tcp:tcp_rcvbuf_grow: time=50404 rtt_us=50099 copied=43843584 inq=393216 space=43384832 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57290728 window_clamp=57316406 rcv_wnd=56922112
tcp_stream 33793 [010] 829.195106: tcp:tcp_rcvbuf_grow: time=50154 rtt_us=50077 copied=43843584 inq=196608 space=43450368 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57290728 window_clamp=57316406 rcv_wnd=57090048
After:
It takes few steps to increase RWIN. Sender is no longer RWIN limited.
tcp_stream 50826 [010] 935.634212: tcp:tcp_rcvbuf_grow: time=100788 rtt_us=50315 copied=49152 inq=0 space=40960 ooo=0 scaling_ratio=219 rcvbuf=131072 rcv_ssthresh=103970 window_clamp=112128 rcv_wnd=106496
tcp_stream 50826 [010] 935.685642: tcp:tcp_rcvbuf_grow: time=51437 rtt_us=50361 copied=86016 inq=0 space=49152 ooo=0 scaling_ratio=219 rcvbuf=160875 rcv_ssthresh=132969 window_clamp=137623 rcv_wnd=131072
tcp_stream 50826 [010] 935.738299: tcp:tcp_rcvbuf_grow: time=52660 rtt_us=50256 copied=139264 inq=0 space=86016 ooo=0 scaling_ratio=219 rcvbuf=502741 rcv_ssthresh=411497 window_clamp=430079 rcv_wnd=413696
tcp_stream 50826 [010] 935.788544: tcp:tcp_rcvbuf_grow: time=50249 rtt_us=50233 copied=307200 inq=0 space=139264 ooo=0 scaling_ratio=219 rcvbuf=728690 rcv_ssthresh=618717 window_clamp=623371 rcv_wnd=618496
tcp_stream 50826 [010] 935.838796: tcp:tcp_rcvbuf_grow: time=50258 rtt_us=50202 copied=618496 inq=0 space=307200 ooo=0 scaling_ratio=219 rcvbuf=2450338 rcv_ssthresh=1855709 window_clamp=2096187 rcv_wnd=1859584
tcp_stream 50826 [010] 935.889140: tcp:tcp_rcvbuf_grow: time=50347 rtt_us=50166 copied=1261568 inq=0 space=618496 ooo=0 scaling_ratio=219 rcvbuf=4376503 rcv_ssthresh=3725291 window_clamp=3743961 rcv_wnd=3706880
tcp_stream 50826 [010] 935.939435: tcp:tcp_rcvbuf_grow: time=50300 rtt_us=50185 copied=2478080 inq=24576 space=1261568 ooo=0 scaling_ratio=219 rcvbuf=9082648 rcv_ssthresh=7733731 window_clamp=7769921 rcv_wnd=7692288
tcp_stream 50826 [010] 935.989681: tcp:tcp_rcvbuf_grow: time=50251 rtt_us=50221 copied=4915200 inq=114688 space=2453504 ooo=0 scaling_ratio=219 rcvbuf=16574936 rcv_ssthresh=14108110 window_clamp=14179339 rcv_wnd=14024704
tcp_stream 50826 [010] 936.039967: tcp:tcp_rcvbuf_grow: time=50289 rtt_us=50279 copied=9830400 inq=114688 space=4800512 ooo=0 scaling_ratio=219 rcvbuf=32695050 rcv_ssthresh=27896187 window_clamp=27969593 rcv_wnd=27815936
tcp_stream 50826 [010] 936.090172: tcp:tcp_rcvbuf_grow: time=50211 rtt_us=50200 copied=19841024 inq=114688 space=9715712 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57245176 window_clamp=57316406 rcv_wnd=57163776
tcp_stream 50826 [010] 936.140430: tcp:tcp_rcvbuf_grow: time=50262 rtt_us=50197 copied=39501824 inq=114688 space=19726336 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57245176 window_clamp=57316406 rcv_wnd=57163776
tcp_stream 50826 [010] 936.190527: tcp:tcp_rcvbuf_grow: time=50101 rtt_us=50071 copied=43655168 inq=262144 space=39387136 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57259192 window_clamp=57316406 rcv_wnd=57032704
tcp_stream 50826 [010] 936.240719: tcp:tcp_rcvbuf_grow: time=50197 rtt_us=50057 copied=43843584 inq=262144 space=43393024 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57259192 window_clamp=57316406 rcv_wnd=57032704
tcp_stream 50826 [010] 936.341271: tcp:tcp_rcvbuf_grow: time=50297 rtt_us=50123 copied=43843584 inq=131072 space=43581440 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57259192 window_clamp=57316406 rcv_wnd=57147392
tcp_stream 50826 [010] 936.642503: tcp:tcp_rcvbuf_grow: time=50131 rtt_us=50084 copied=43843584 inq=0 space=43712512 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57259192 window_clamp=57316406 rcv_wnd=57262080
Fixes: 65c5287892e9 ("tcp: fix sk_rcvbuf overshoot")
Fixes: e118cdc34dd1 ("mptcp: rcvbuf auto-tuning improvement")
Reported-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/589
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
net/ipv4/tcp_input.c | 11 +++++++++--
net/mptcp/protocol.c | 10 +++++++++-
2 files changed, 18 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index cb4e07f84ae2..e4a979b75cc6 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -895,17 +895,24 @@ void tcp_rcvbuf_grow(struct sock *sk, u32 newval)
{
const struct net *net = sock_net(sk);
struct tcp_sock *tp = tcp_sk(sk);
- u32 rcvwin, rcvbuf, cap;
+ u32 rcvwin, rcvbuf, cap, oldval;
+ u64 grow;
+ oldval = tp->rcvq_space.space;
tp->rcvq_space.space = newval;
if (!READ_ONCE(net->ipv4.sysctl_tcp_moderate_rcvbuf) ||
(sk->sk_userlocks & SOCK_RCVBUF_LOCK))
return;
- /* slow start: allow the sender to double its rate. */
+ /* DRS is always one RTT late. */
rcvwin = newval << 1;
+ /* slow start: allow the sender to double its rate. */
+ grow = (u64)rcvwin * (newval - oldval);
+ do_div(grow, oldval);
+ rcvwin += grow << 1;
+
if (!RB_EMPTY_ROOT(&tp->out_of_order_queue))
rcvwin += TCP_SKB_CB(tp->ooo_last_skb)->end_seq - tp->rcv_nxt;
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 052a0c62023f..875027b9319c 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -198,15 +198,23 @@ static bool mptcp_rcvbuf_grow(struct sock *sk, u32 newval)
{
struct mptcp_sock *msk = mptcp_sk(sk);
const struct net *net = sock_net(sk);
- u32 rcvwin, rcvbuf, cap;
+ u32 rcvwin, rcvbuf, cap, oldval;
+ u64 grow;
+ oldval = msk->rcvq_space.space;
msk->rcvq_space.space = newval;
if (!READ_ONCE(net->ipv4.sysctl_tcp_moderate_rcvbuf) ||
(sk->sk_userlocks & SOCK_RCVBUF_LOCK))
return false;
+ /* DRS is always one RTT late. */
rcvwin = newval << 1;
+ /* slow start: allow the sender to double its rate. */
+ grow = (u64)rcvwin * (newval - oldval);
+ do_div(grow, oldval);
+ rcvwin += grow << 1;
+
if (!RB_EMPTY_ROOT(&msk->out_of_order_queue))
rcvwin += MPTCP_SKB_CB(msk->ooo_last_skb)->end_seq - msk->ack_seq;
--
2.51.0
^ permalink raw reply related [flat|nested] 9+ messages in thread* Re: [PATCH net v3 4/4] tcp: fix too slow tcp_rcvbuf_grow() action
2025-10-28 11:58 ` [PATCH net v3 4/4] tcp: fix too slow tcp_rcvbuf_grow() action Matthieu Baerts (NGI0)
@ 2025-10-29 13:42 ` Neal Cardwell
0 siblings, 0 replies; 9+ messages in thread
From: Neal Cardwell @ 2025-10-29 13:42 UTC (permalink / raw)
To: Matthieu Baerts (NGI0)
Cc: Eric Dumazet, Kuniyuki Iwashima, David S. Miller, Jakub Kicinski,
Paolo Abeni, Simon Horman, David Ahern, Mat Martineau,
Geliang Tang, netdev, linux-kernel, mptcp
On Tue, Oct 28, 2025 at 7:58 AM Matthieu Baerts (NGI0)
<matttbe@kernel.org> wrote:
>
> From: Eric Dumazet <edumazet@google.com>
>
> While the blamed commits apparently avoided an overshoot,
> they also limited how fast a sender can increase BDP at each RTT.
>
> This is not exactly a revert, we do not add the 16 * tp->advmss
> cushion we had, and we are keeping the out_of_order_queue
> contribution.
>
> Do the same in mptcp_rcvbuf_grow().
>
> Tested:
>
> emulated 50ms rtt (tcp_stream --tcp-tx-delay 50000), cubic 20 second flow.
> net.ipv4.tcp_rmem set to "4096 131072 67000000"
>
> perf record -a -e tcp:tcp_rcvbuf_grow sleep 20
> perf script
>
> Before:
>
> We can see we fail to roughly double RWIN at each RTT.
> Sender is RWIN limited while CWND is ramping up (before getting tcp_wmem
> limited).
>
> tcp_stream 33793 [010] 825.717525: tcp:tcp_rcvbuf_grow: time=100869 rtt_us=50428 copied=49152 inq=0 space=40960 ooo=0 scaling_ratio=219 rcvbuf=131072 rcv_ssthresh=103970 window_clamp=112128 rcv_wnd=106496
> tcp_stream 33793 [010] 825.768966: tcp:tcp_rcvbuf_grow: time=51447 rtt_us=50362 copied=86016 inq=0 space=49152 ooo=0 scaling_ratio=219 rcvbuf=131072 rcv_ssthresh=107474 window_clamp=112128 rcv_wnd=106496
> tcp_stream 33793 [010] 825.821539: tcp:tcp_rcvbuf_grow: time=52577 rtt_us=50243 copied=114688 inq=0 space=86016 ooo=0 scaling_ratio=219 rcvbuf=201096 rcv_ssthresh=167377 window_clamp=172031 rcv_wnd=167936
> tcp_stream 33793 [010] 825.871781: tcp:tcp_rcvbuf_grow: time=50248 rtt_us=50237 copied=167936 inq=0 space=114688 ooo=0 scaling_ratio=219 rcvbuf=268129 rcv_ssthresh=224722 window_clamp=229375 rcv_wnd=225280
> tcp_stream 33793 [010] 825.922475: tcp:tcp_rcvbuf_grow: time=50698 rtt_us=50183 copied=241664 inq=0 space=167936 ooo=0 scaling_ratio=219 rcvbuf=392617 rcv_ssthresh=331217 window_clamp=335871 rcv_wnd=323584
> tcp_stream 33793 [010] 825.973326: tcp:tcp_rcvbuf_grow: time=50855 rtt_us=50213 copied=339968 inq=0 space=241664 ooo=0 scaling_ratio=219 rcvbuf=564986 rcv_ssthresh=478674 window_clamp=483327 rcv_wnd=462848
> tcp_stream 33793 [010] 826.023970: tcp:tcp_rcvbuf_grow: time=50647 rtt_us=50248 copied=491520 inq=0 space=339968 ooo=0 scaling_ratio=219 rcvbuf=794811 rcv_ssthresh=671778 window_clamp=679935 rcv_wnd=651264
> tcp_stream 33793 [010] 826.074612: tcp:tcp_rcvbuf_grow: time=50648 rtt_us=50227 copied=700416 inq=0 space=491520 ooo=0 scaling_ratio=219 rcvbuf=1149124 rcv_ssthresh=974881 window_clamp=983039 rcv_wnd=942080
> tcp_stream 33793 [010] 826.125452: tcp:tcp_rcvbuf_grow: time=50845 rtt_us=50225 copied=987136 inq=8192 space=700416 ooo=0 scaling_ratio=219 rcvbuf=1637502 rcv_ssthresh=1392674 window_clamp=1400831 rcv_wnd=1339392
> tcp_stream 33793 [010] 826.175698: tcp:tcp_rcvbuf_grow: time=50250 rtt_us=50198 copied=1347584 inq=0 space=978944 ooo=0 scaling_ratio=219 rcvbuf=2288672 rcv_ssthresh=1949729 window_clamp=1957887 rcv_wnd=1945600
> tcp_stream 33793 [010] 826.225947: tcp:tcp_rcvbuf_grow: time=50252 rtt_us=50240 copied=1945600 inq=0 space=1347584 ooo=0 scaling_ratio=219 rcvbuf=3150516 rcv_ssthresh=2687010 window_clamp=2695167 rcv_wnd=2691072
> tcp_stream 33793 [010] 826.276175: tcp:tcp_rcvbuf_grow: time=50233 rtt_us=50224 copied=2691072 inq=0 space=1945600 ooo=0 scaling_ratio=219 rcvbuf=4548617 rcv_ssthresh=3883041 window_clamp=3891199 rcv_wnd=3887104
> tcp_stream 33793 [010] 826.326403: tcp:tcp_rcvbuf_grow: time=50233 rtt_us=50229 copied=3887104 inq=0 space=2691072 ooo=0 scaling_ratio=219 rcvbuf=6291456 rcv_ssthresh=5370482 window_clamp=5382144 rcv_wnd=5373952
> tcp_stream 33793 [010] 826.376723: tcp:tcp_rcvbuf_grow: time=50323 rtt_us=50218 copied=5373952 inq=0 space=3887104 ooo=0 scaling_ratio=219 rcvbuf=9087658 rcv_ssthresh=7755537 window_clamp=7774207 rcv_wnd=7757824
> tcp_stream 33793 [010] 826.426991: tcp:tcp_rcvbuf_grow: time=50274 rtt_us=50196 copied=7757824 inq=180224 space=5373952 ooo=0 scaling_ratio=219 rcvbuf=12563759 rcv_ssthresh=10729233 window_clamp=10747903 rcv_wnd=10575872
> tcp_stream 33793 [010] 826.477229: tcp:tcp_rcvbuf_grow: time=50241 rtt_us=50078 copied=10731520 inq=180224 space=7577600 ooo=0 scaling_ratio=219 rcvbuf=17715667 rcv_ssthresh=15136529 window_clamp=15155199 rcv_wnd=14983168
> tcp_stream 33793 [010] 826.527482: tcp:tcp_rcvbuf_grow: time=50258 rtt_us=50153 copied=15138816 inq=360448 space=10551296 ooo=0 scaling_ratio=219 rcvbuf=24667870 rcv_ssthresh=21073410 window_clamp=21102591 rcv_wnd=20766720
> tcp_stream 33793 [010] 826.577712: tcp:tcp_rcvbuf_grow: time=50234 rtt_us=50228 copied=21073920 inq=0 space=14778368 ooo=0 scaling_ratio=219 rcvbuf=34550339 rcv_ssthresh=29517041 window_clamp=29556735 rcv_wnd=29519872
> tcp_stream 33793 [010] 826.627982: tcp:tcp_rcvbuf_grow: time=50275 rtt_us=50220 copied=29519872 inq=540672 space=21073920 ooo=0 scaling_ratio=219 rcvbuf=49268707 rcv_ssthresh=42090625 window_clamp=42147839 rcv_wnd=41627648
> tcp_stream 33793 [010] 826.678274: tcp:tcp_rcvbuf_grow: time=50296 rtt_us=50185 copied=42053632 inq=761856 space=28979200 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57238168 window_clamp=57316406 rcv_wnd=56606720
> tcp_stream 33793 [010] 826.728627: tcp:tcp_rcvbuf_grow: time=50357 rtt_us=50128 copied=43913216 inq=851968 space=41291776 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57290728 window_clamp=57316406 rcv_wnd=56524800
> tcp_stream 33793 [010] 827.131364: tcp:tcp_rcvbuf_grow: time=50239 rtt_us=50127 copied=43843584 inq=655360 space=43061248 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57290728 window_clamp=57316406 rcv_wnd=56696832
> tcp_stream 33793 [010] 827.181613: tcp:tcp_rcvbuf_grow: time=50254 rtt_us=50115 copied=43843584 inq=524288 space=43188224 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57290728 window_clamp=57316406 rcv_wnd=56807424
> tcp_stream 33793 [010] 828.339635: tcp:tcp_rcvbuf_grow: time=50283 rtt_us=50110 copied=43843584 inq=458752 space=43319296 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57290728 window_clamp=57316406 rcv_wnd=56864768
> tcp_stream 33793 [010] 828.440350: tcp:tcp_rcvbuf_grow: time=50404 rtt_us=50099 copied=43843584 inq=393216 space=43384832 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57290728 window_clamp=57316406 rcv_wnd=56922112
> tcp_stream 33793 [010] 829.195106: tcp:tcp_rcvbuf_grow: time=50154 rtt_us=50077 copied=43843584 inq=196608 space=43450368 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57290728 window_clamp=57316406 rcv_wnd=57090048
>
> After:
>
> It takes few steps to increase RWIN. Sender is no longer RWIN limited.
>
> tcp_stream 50826 [010] 935.634212: tcp:tcp_rcvbuf_grow: time=100788 rtt_us=50315 copied=49152 inq=0 space=40960 ooo=0 scaling_ratio=219 rcvbuf=131072 rcv_ssthresh=103970 window_clamp=112128 rcv_wnd=106496
> tcp_stream 50826 [010] 935.685642: tcp:tcp_rcvbuf_grow: time=51437 rtt_us=50361 copied=86016 inq=0 space=49152 ooo=0 scaling_ratio=219 rcvbuf=160875 rcv_ssthresh=132969 window_clamp=137623 rcv_wnd=131072
> tcp_stream 50826 [010] 935.738299: tcp:tcp_rcvbuf_grow: time=52660 rtt_us=50256 copied=139264 inq=0 space=86016 ooo=0 scaling_ratio=219 rcvbuf=502741 rcv_ssthresh=411497 window_clamp=430079 rcv_wnd=413696
> tcp_stream 50826 [010] 935.788544: tcp:tcp_rcvbuf_grow: time=50249 rtt_us=50233 copied=307200 inq=0 space=139264 ooo=0 scaling_ratio=219 rcvbuf=728690 rcv_ssthresh=618717 window_clamp=623371 rcv_wnd=618496
> tcp_stream 50826 [010] 935.838796: tcp:tcp_rcvbuf_grow: time=50258 rtt_us=50202 copied=618496 inq=0 space=307200 ooo=0 scaling_ratio=219 rcvbuf=2450338 rcv_ssthresh=1855709 window_clamp=2096187 rcv_wnd=1859584
> tcp_stream 50826 [010] 935.889140: tcp:tcp_rcvbuf_grow: time=50347 rtt_us=50166 copied=1261568 inq=0 space=618496 ooo=0 scaling_ratio=219 rcvbuf=4376503 rcv_ssthresh=3725291 window_clamp=3743961 rcv_wnd=3706880
> tcp_stream 50826 [010] 935.939435: tcp:tcp_rcvbuf_grow: time=50300 rtt_us=50185 copied=2478080 inq=24576 space=1261568 ooo=0 scaling_ratio=219 rcvbuf=9082648 rcv_ssthresh=7733731 window_clamp=7769921 rcv_wnd=7692288
> tcp_stream 50826 [010] 935.989681: tcp:tcp_rcvbuf_grow: time=50251 rtt_us=50221 copied=4915200 inq=114688 space=2453504 ooo=0 scaling_ratio=219 rcvbuf=16574936 rcv_ssthresh=14108110 window_clamp=14179339 rcv_wnd=14024704
> tcp_stream 50826 [010] 936.039967: tcp:tcp_rcvbuf_grow: time=50289 rtt_us=50279 copied=9830400 inq=114688 space=4800512 ooo=0 scaling_ratio=219 rcvbuf=32695050 rcv_ssthresh=27896187 window_clamp=27969593 rcv_wnd=27815936
> tcp_stream 50826 [010] 936.090172: tcp:tcp_rcvbuf_grow: time=50211 rtt_us=50200 copied=19841024 inq=114688 space=9715712 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57245176 window_clamp=57316406 rcv_wnd=57163776
> tcp_stream 50826 [010] 936.140430: tcp:tcp_rcvbuf_grow: time=50262 rtt_us=50197 copied=39501824 inq=114688 space=19726336 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57245176 window_clamp=57316406 rcv_wnd=57163776
> tcp_stream 50826 [010] 936.190527: tcp:tcp_rcvbuf_grow: time=50101 rtt_us=50071 copied=43655168 inq=262144 space=39387136 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57259192 window_clamp=57316406 rcv_wnd=57032704
> tcp_stream 50826 [010] 936.240719: tcp:tcp_rcvbuf_grow: time=50197 rtt_us=50057 copied=43843584 inq=262144 space=43393024 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57259192 window_clamp=57316406 rcv_wnd=57032704
> tcp_stream 50826 [010] 936.341271: tcp:tcp_rcvbuf_grow: time=50297 rtt_us=50123 copied=43843584 inq=131072 space=43581440 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57259192 window_clamp=57316406 rcv_wnd=57147392
> tcp_stream 50826 [010] 936.642503: tcp:tcp_rcvbuf_grow: time=50131 rtt_us=50084 copied=43843584 inq=0 space=43712512 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57259192 window_clamp=57316406 rcv_wnd=57262080
>
> Fixes: 65c5287892e9 ("tcp: fix sk_rcvbuf overshoot")
> Fixes: e118cdc34dd1 ("mptcp: rcvbuf auto-tuning improvement")
> Reported-by: Neal Cardwell <ncardwell@google.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/589
> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
> ---
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Thanks, Eric and Matthieu!
neal
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net v3 0/4] tcp: fix receive autotune again
2025-10-28 11:57 [PATCH net v3 0/4] tcp: fix receive autotune again Matthieu Baerts (NGI0)
` (3 preceding siblings ...)
2025-10-28 11:58 ` [PATCH net v3 4/4] tcp: fix too slow tcp_rcvbuf_grow() action Matthieu Baerts (NGI0)
@ 2025-10-30 0:40 ` patchwork-bot+netdevbpf
4 siblings, 0 replies; 9+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-10-30 0:40 UTC (permalink / raw)
To: Matthieu Baerts
Cc: edumazet, ncardwell, kuniyu, davem, kuba, pabeni, horms, dsahern,
martineau, geliang, netdev, linux-kernel, mptcp, rostedt,
mhiramat, mathieu.desnoyers, linux-trace-kernel
Hello:
This series was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Tue, 28 Oct 2025 12:57:58 +0100 you wrote:
> Neal Cardwell found that recent kernels were having RWIN limited
> issues, even when net.ipv4.tcp_rmem[2] was set to a very big value like
> 512MB.
>
> He suspected that tcp_stream default buffer size (64KB) was triggering
> heuristic added in ea33537d8292 ("tcp: add receive queue awareness
> in tcp_rcv_space_adjust()").
>
> [...]
Here is the summary with links:
- [net,v3,1/4] mptcp: fix subflow rcvbuf adjust
https://git.kernel.org/netdev/net/c/a6f0459aadf1
- [net,v3,2/4] trace: tcp: add three metrics to trace_tcp_rcvbuf_grow()
https://git.kernel.org/netdev/net/c/24990d89c23d
- [net,v3,3/4] tcp: add newval parameter to tcp_rcvbuf_grow()
https://git.kernel.org/netdev/net/c/b1e014a1f327
- [net,v3,4/4] tcp: fix too slow tcp_rcvbuf_grow() action
https://git.kernel.org/netdev/net/c/aa251c84636c
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 9+ messages in thread