From: Wesley Atwell <atwellwea@gmail.com>
To: netdev@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org,
davem@davemloft.net, edumazet@google.com, ncardwell@google.com,
kuniyu@google.com, dsahern@kernel.org, kuba@kernel.org,
pabeni@redhat.com, horms@kernel.org, shuah@kernel.org,
gmbnomis@gmail.com, Wesley Atwell <atwellwea@gmail.com>
Subject: [PATCH net-next v2 2/3] tcp: keep scaled no-shrink window representable
Date: Tue, 24 Mar 2026 00:04:09 -0600
Message-ID: <20260324060410.1137199-3-atwellwea@gmail.com>
In-Reply-To: <20260324060410.1137199-1-atwellwea@gmail.com>

In the scaled no-shrink path, __tcp_select_window() currently rounds the
raw free-space value up to the receive-window scale quantum
(1 << rcv_wscale).

When the backed free_space sits just below the next quantum boundary,
rounding up can expose fresh sender-visible credit beyond the currently
backed receive space.

Fix this by keeping tp->rcv_wnd representable in scaled units: round
larger windows down to the scale quantum and preserve only the small
non-zero case that would otherwise scale away to zero.

This series intentionally leaves that smaller longstanding non-zero case
unchanged. The proven bug and the new reproducer are both in the
larger-window path where free_space is at least one scale quantum, so
changing 0 < free_space < granularity into zero would be a separate
behavior change.
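
As a rough userspace illustration of the three cases described above, the
sketch below compares the old round-up rule with the new rule. The helper
names (align_up, align_down, old_scaled_window, new_scaled_window) are
hypothetical stand-ins chosen for this example, not kernel symbols, and
the numbers assume rcv_wscale = 7 (a quantum of 128 bytes).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's ALIGN() and round_down();
 * both assume q is a power of two, as 1 << rcv_wscale always is.
 */
static uint32_t align_up(uint32_t x, uint32_t q)
{
	return (x + q - 1) & ~(q - 1);
}

static uint32_t align_down(uint32_t x, uint32_t q)
{
	return x & ~(q - 1);
}

/* Old behaviour: always round free_space up to the scale quantum. */
static uint32_t old_scaled_window(uint32_t free_space, unsigned int wscale)
{
	return align_up(free_space, UINT32_C(1) << wscale);
}

/* New behaviour, mirroring the three cases in this patch. */
static uint32_t new_scaled_window(uint32_t free_space, unsigned int wscale)
{
	uint32_t granularity = UINT32_C(1) << wscale;

	if (free_space >= granularity)
		return align_down(free_space, granularity);
	if (free_space > 0)
		return granularity;	/* keep a small offer from scaling to 0 */
	return 0;
}

static void scaled_window_selfcheck(void)
{
	/* wscale 7 => quantum 128; 8191 = 63 * 128 + 127 */
	assert(old_scaled_window(8191, 7) == 8192);	/* 1 byte beyond free_space */
	assert(new_scaled_window(8191, 7) == 8064);	/* stays within free_space */
	assert(new_scaled_window(100, 7) == 128);	/* small non-zero case kept */
	assert(new_scaled_window(0, 7) == 0);
}
```

Calling scaled_window_selfcheck() from a test program exercises all three
branches. Only the first case changes relative to the old rule, which is
the larger-window path this series targets.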

That representability matters across ACK transitions too, not only on
the immediate raw-free_space-limited ACK. tcp_select_window() preserves
the currently offered window when shrinking is disallowed, so if an
earlier ACK stores a rounded-up value in tp->rcv_wnd, a later
raw-free_space-limited ACK can keep inheriting that extra unit.

Keeping tp->rcv_wnd representable throughout the scaled no-shrink path
prevents that carry-forward and makes later no-shrink decisions reason
from a right edge the peer could actually have seen on the wire.
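
The carry-forward can be pictured with a deliberately simplified model.
This is a sketch under stated assumptions, not the kernel logic: the
no-shrink step is reduced to "never offer less than what is already
offered", no data arrives between ACKs, and all names (select_round_up,
select_round_down, offer_no_shrink) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Previous selection rule: round up to the next quantum. */
static uint32_t select_round_up(uint32_t free_space, unsigned int wscale)
{
	uint32_t granularity = UINT32_C(1) << wscale;

	return (free_space + granularity - 1) & ~(granularity - 1);
}

/* Selection rule from this patch: round down, keep small non-zero. */
static uint32_t select_round_down(uint32_t free_space, unsigned int wscale)
{
	uint32_t granularity = UINT32_C(1) << wscale;

	if (free_space >= granularity)
		return free_space & ~(granularity - 1);
	return free_space ? granularity : 0;
}

/* Assumed no-shrink step: keep the larger of the stored offer and the
 * newly selected window. Sequence-number bookkeeping is ignored.
 */
static uint32_t offer_no_shrink(uint32_t *rcv_wnd, uint32_t selected)
{
	if (selected > *rcv_wnd)
		*rcv_wnd = selected;
	return *rcv_wnd;
}

static void carry_forward_selfcheck(void)
{
	uint32_t old_wnd = 0;	/* state seeded by the round-up rule */
	uint32_t new_wnd = 0;	/* state seeded by the round-down rule */

	/* First ACK: 8191 bytes are backed, quantum is 128. */
	assert(offer_no_shrink(&old_wnd, select_round_up(8191, 7)) == 8192);
	assert(offer_no_shrink(&new_wnd, select_round_down(8191, 7)) == 8064);

	/* Later ACK with the same backing: the rounded-up state is
	 * inherited even though this selection rounds down.
	 */
	assert(offer_no_shrink(&old_wnd, select_round_down(8191, 7)) == 8192);
	assert(offer_no_shrink(&new_wnd, select_round_down(8191, 7)) == 8064);
}
```

In this model the stored offer that started from a rounded-up value never
drops back to the backed 8064 bytes, which is the inheritance the
paragraph above describes.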

This removes the larger-window quantization slack while preserving the
small non-zero case needed to avoid scaling away to zero.

Signed-off-by: Wesley Atwell <atwellwea@gmail.com>
---
v2:
- rename gran to granularity
- clarify why representable tp->rcv_wnd state is required across later
  no-shrink transitions
- clarify that this series still intentionally leaves the smaller
  longstanding non-zero case unchanged

 net/ipv4/tcp_output.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 35c3b0ab5a0c..e5c4c09101be 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3375,13 +3375,19 @@ u32 __tcp_select_window(struct sock *sk)
 	 * scaled window will not line up with the MSS boundary anyway.
 	 */
 	if (tp->rx_opt.rcv_wscale) {
-		window = free_space;
+		u32 granularity = 1U << tp->rx_opt.rcv_wscale;
 
-		/* Advertise enough space so that it won't get scaled away.
-		 * Import case: prevent zero window announcement if
-		 * 1<<rcv_wscale > mss.
+		/* Keep tp->rcv_wnd representable in scaled units so later
+		 * no-shrink decisions reason about the same right edge we
+		 * can advertise on the wire. Preserve only a small non-zero
+		 * offer that would otherwise get scaled away to zero.
 		 */
-		window = ALIGN(window, (1 << tp->rx_opt.rcv_wscale));
+		if (free_space >= granularity)
+			window = round_down(free_space, granularity);
+		else if (free_space > 0)
+			window = granularity;
+		else
+			window = 0;
 	} else {
 		window = tp->rcv_wnd;
 		/* Get the largest window that is a nice multiple of mss.
--
2.43.0

Thread overview: 6+ messages
2026-03-24 6:04 [PATCH net-next v2 0/3] tcp: fix scaled no-shrink rwnd quantization slack Wesley Atwell
2026-03-24 6:04 ` [PATCH net-next v2 1/3] selftests: packetdrill: stop pinning rwnd in tcp_ooo_rcv_mss Wesley Atwell
2026-03-24 14:28 ` Eric Dumazet
2026-03-24 6:04 ` Wesley Atwell [this message]
2026-03-24 7:18 ` [PATCH net-next v2 2/3] tcp: keep scaled no-shrink window representable Eric Dumazet
2026-03-24 6:04 ` [PATCH net-next v2 3/3] selftests: packetdrill: cover scaled rwnd quantization slack Wesley Atwell