public inbox for netdev@vger.kernel.org
From: Wesley Atwell <atwellwea@gmail.com>
To: netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>
Cc: Eric Dumazet <edumazet@google.com>,
	Neal Cardwell <ncardwell@google.com>,
	Kuniyuki Iwashima <kuniyu@google.com>,
	David Ahern <dsahern@kernel.org>, Simon Horman <horms@kernel.org>,
	Simon Baatz <gmbnomis@gmail.com>, Shuah Khan <shuah@kernel.org>,
	linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org,
	Wesley Atwell <atwellwea@gmail.com>
Subject: [PATCH net-next v3 2/3] tcp: keep scaled no-shrink window representable
Date: Tue, 24 Mar 2026 14:53:00 -0600
Message-ID: <20260324205301.1361608-3-atwellwea@gmail.com>
In-Reply-To: <20260324205301.1361608-1-atwellwea@gmail.com>

In the scaled no-shrink path, __tcp_select_window() currently rounds the
raw free-space value up to the receive-window scale quantum.

When the raw, backed free_space sits just below the next quantum
boundary, that rounding can expose fresh sender-visible credit (up to
granularity - 1 bytes) beyond the currently backed receive space.

Fix this by keeping tp->rcv_wnd representable in scaled units: round
larger windows down to the scale quantum and preserve only the small
non-zero case that would otherwise scale away to zero.

This series intentionally leaves that smaller longstanding non-zero case
unchanged. The proven bug and the new reproducer are both in the
larger-window path where free_space is at least one scale quantum, so
changing 0 < free_space < granularity into zero would be a separate
behavior change.

That representability matters across ACK transitions too, not only on
the immediate raw-free_space-limited ACK. tcp_select_window() preserves
the currently offered window when shrinking is disallowed, so if an
earlier ACK stores a rounded-up value in tp->rcv_wnd, a later
raw-free_space-limited ACK can keep inheriting that extra unit.

Keeping tp->rcv_wnd representable throughout the scaled no-shrink path
prevents that carry-forward and makes later no-shrink decisions reason
from a right edge the peer could actually have seen on the wire.

The net effect is that the larger-window quantization slack goes away,
while a small non-zero offer is still not scaled away to zero.

Signed-off-by: Wesley Atwell <atwellwea@gmail.com>
---
v3:
- keep granularity in signed int space so the free_space comparison
  stays type-safe

v2:
- rename gran to granularity
- clarify why representable tp->rcv_wnd state is required across later
  no-shrink transitions
- clarify that this series still intentionally leaves the smaller
  longstanding non-zero case unchanged

 net/ipv4/tcp_output.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 35c3b0ab5a0cb714155d5720fe56888f71aecced..5fc0e0d22f10bf56ece1be536b75013768112acf 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3375,13 +3375,19 @@ u32 __tcp_select_window(struct sock *sk)
 	 * scaled window will not line up with the MSS boundary anyway.
 	 */
 	if (tp->rx_opt.rcv_wscale) {
-		window = free_space;
+		int granularity = 1 << tp->rx_opt.rcv_wscale;
 
-		/* Advertise enough space so that it won't get scaled away.
-		 * Import case: prevent zero window announcement if
-		 * 1<<rcv_wscale > mss.
+		/* Keep tp->rcv_wnd representable in scaled units so later
+		 * no-shrink decisions reason about the same right edge we
+		 * can advertise on the wire. Preserve only a small non-zero
+		 * offer that would otherwise get scaled away to zero.
 		 */
-		window = ALIGN(window, (1 << tp->rx_opt.rcv_wscale));
+		if (free_space >= granularity)
+			window = round_down(free_space, granularity);
+		else if (free_space > 0)
+			window = granularity;
+		else
+			window = 0;
 	} else {
 		window = tp->rcv_wnd;
 		/* Get the largest window that is a nice multiple of mss.
-- 
2.43.0

Thread overview: 9+ messages
2026-03-24 20:52 [PATCH net-next v3 0/3] tcp: fix scaled no-shrink rwnd quantization slack Wesley Atwell
2026-03-24 20:52 ` [PATCH net-next v3 1/3] selftests: packetdrill: stop pinning rwnd in tcp_ooo_rcv_mss Wesley Atwell
2026-03-24 20:53 ` Wesley Atwell [this message]
2026-03-24 20:53 ` [PATCH net-next v3 3/3] selftests: packetdrill: cover scaled rwnd quantization slack Wesley Atwell
2026-03-25  7:53   ` Simon Baatz
2026-03-25  7:58 ` [PATCH net-next v3 0/3] tcp: fix scaled no-shrink " Simon Baatz
2026-03-25 15:14   ` Eric Dumazet
2026-03-25 17:17     ` Wesley Atwell
2026-03-25 17:28       ` Eric Dumazet
