From: Wesley Atwell <atwellwea@gmail.com>
To: netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>
Cc: Eric Dumazet <edumazet@google.com>,
Neal Cardwell <ncardwell@google.com>,
Kuniyuki Iwashima <kuniyu@google.com>,
David Ahern <dsahern@kernel.org>, Simon Horman <horms@kernel.org>,
Simon Baatz <gmbnomis@gmail.com>, Shuah Khan <shuah@kernel.org>,
linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org,
Wesley Atwell <atwellwea@gmail.com>
Subject: [PATCH net-next v3 0/3] tcp: fix scaled no-shrink rwnd quantization slack
Date: Tue, 24 Mar 2026 14:52:58 -0600
Message-ID: <20260324205301.1361608-1-atwellwea@gmail.com>
Hi,
This v3 addresses the follow-up review on v2.
Eric pointed out that 1/3 does not need the added packetdrill comment
and that 2/3 compared signed free_space against an unsigned
granularity.
This revision drops the extra in-file comment from 1/3 and keeps
the scaled-window granularity in int space in 2/3 so the comparison
stays type-safe. The overall approach and reproducer remain unchanged
from v2.
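
For readers following along, the comparison issue can be sketched in
isolation. This is only an illustration of the C conversion rule, not
code from 2/3; the helper names are hypothetical, and the cast in the
first helper spells out what the implicit conversion would do when a
signed int is compared against an unsigned int.

```c
#include <assert.h>

/* Hypothetical helper: granularity held as unsigned int. Comparing a
 * signed free_space against it converts free_space to unsigned first;
 * the explicit cast below makes that conversion visible. */
static inline int below_quantum_unsigned(int free_space, int wscale)
{
	unsigned int granularity;

	assert(wscale >= 0 && wscale <= 14);	/* TCP window scale range */
	granularity = 1U << wscale;
	return (unsigned int)free_space < granularity;
}

/* Hypothetical helper: granularity kept in int space, so a negative
 * free_space still compares as smaller than the quantum. */
static inline int below_quantum_signed(int free_space, int wscale)
{
	int granularity;

	assert(wscale >= 0 && wscale <= 14);
	granularity = 1 << wscale;
	return free_space < granularity;
}
```

With wscale 10, a free_space of -1 is reported as "not below the
quantum" by the unsigned variant and as "below the quantum" by the
signed one; for non-negative inputs the two agree.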
Simon was right that the original 3/3 only showed the explicit
rcv_ssthresh-limited ALIGN-up behavior. For v2, 3/3 was replaced with
an OOO-memory-based reproducer that first grows rcv_ssthresh with
in-order data and then drives raw backed free_space below
rcv_ssthresh without advancing rcv_nxt. In the instrumented
old-behavior run that shaped this test, the critical ACK reached
free_space=86190, rcv_ssthresh=86286, and still advertised 87040
(85 << 10). With 2/3 applied, the same ACK advertises 86016
(84 << 10).
That follow-up also clarified why the broader 2/3 change is required.
A narrower variant that preserved the old rcv_ssthresh-limited ALIGN-up
behavior was not sufficient: earlier ACKs still stored 85 in tp->rcv_wnd,
and tcp_select_window() later preserved that extra unit because shrinking
was disallowed. Keeping tp->rcv_wnd representable across the scaled
no-shrink path is what lets later ACKs settle at the correct
wire-visible edge.
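
A simplified model of that interaction, assuming the no-shrink branch
re-aligns the currently offered window up to the scale quantum whenever
the newly computed window is smaller. The function name and signature
are hypothetical; this is a sketch of the effect described above, not
the kernel's tcp_select_window().

```c
#include <assert.h>

/* Hypothetical model: cur_win is the window still on offer from the
 * previous ACK (in bytes), new_win is the freshly computed window (in
 * bytes). Returns the value that would go into the 16-bit window
 * field, i.e. the byte window shifted down by wscale. */
static inline int no_shrink_wire_window(int cur_win, int new_win, int wscale)
{
	int quantum;

	assert(cur_win >= 0 && new_win >= 0);
	assert(wscale >= 0 && wscale <= 14);
	quantum = 1 << wscale;

	/* Shrinking disallowed: keep offering what was already promised,
	 * rounded up to a representable multiple of the quantum. */
	if (new_win < cur_win)
		new_win = (cur_win + quantum - 1) & ~(quantum - 1);

	return new_win >> wscale;
}
```

If an earlier ACK stored 87040 in the offered window, a later
computation of 86016 is overridden and the wire value stays at 85. If
the earlier ACK had stored 86016, the later ACK advertises 84.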
Problem
=======

In the scaled no-shrink path, __tcp_select_window() rounds free_space up
to the receive-window scale quantum:

	window = ALIGN(free_space, 1 << tp->rx_opt.rcv_wscale);

When raw backed free_space sits just below the next quantum, that can
expose fresh sender-visible credit that is not actually backed by the
current receive-memory state.
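
As a worked example of the rounding, using the free_space value from
the instrumented run and a window scale of 10. The helper names are
hypothetical, and the round-up is written out by hand for a
power-of-two quantum rather than using the kernel's ALIGN() macro.

```c
#include <assert.h>

/* Round x up to the next multiple of the power-of-two quantum
 * (1 << wscale). Hypothetical stand-alone equivalent of the
 * round-up in the line quoted above. */
static inline int align_up_to_quantum(int x, int wscale)
{
	int quantum;

	assert(x >= 0 && wscale >= 0 && wscale <= 14);
	quantum = 1 << wscale;
	return (x + quantum - 1) & ~(quantum - 1);
}

/* Bytes of advertised window that free_space does not back. */
static inline int unbacked_credit(int free_space, int wscale)
{
	return align_up_to_quantum(free_space, wscale) - free_space;
}
```

For free_space=86190 and wscale=10, the rounded window is 87040
(85 << 10), so 850 bytes of the advertised window are not backed by
free_space.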
Approach
========
This repost keeps the part with a clear fail-before/pass-after case:
- relax one unrelated packetdrill test which was pinning an
  incidental advertised window
- keep tp->rcv_wnd representable in scaled units by rounding larger
  windows down to the scale quantum
- preserve only the small non-zero case that would otherwise scale
  away to zero; changing that longstanding non-zero-to-zero behavior
  would be a separate change from the bug proven here
- prove the actual raw-free_space case with a packetdrill sequence
  that reaches free_space < rcv_ssthresh without changing SO_RCVBUF
  after the handshake
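
The rounding rule in the second and third bullets can be sketched as
follows. This is an illustration under stated assumptions, not the diff
in 2/3: the function name is hypothetical, and it assumes the small
non-zero case is kept by rounding up to exactly one quantum.

```c
#include <assert.h>

/* Hypothetical sketch: quantize free_space (bytes) to a window that is
 * representable with the given window scale. */
static inline int quantize_window(int free_space, int wscale)
{
	int granularity;	/* kept in int space, like free_space */

	assert(wscale >= 0 && wscale <= 14);
	granularity = 1 << wscale;

	if (free_space <= 0)
		return 0;

	/* Small non-zero case: rounding down would scale to zero, so
	 * (by assumption) keep advertising one quantum. */
	if (free_space < granularity)
		return granularity;

	/* Larger windows: round down so no unbacked credit is exposed. */
	return free_space & ~(granularity - 1);
}
```

With wscale=10, free_space=86190 quantizes to 86016 (84 << 10), and a
free_space of 500 still yields 1024 (1 << 10) rather than zero.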
Tests
=====
Local validation included:
- git diff --check
- checkpatch on the touched diff
- /home/wes/nipa/local/vmksft dirty --tests
  'net/packetdrill:tcp_ooo_rcv_mss.pkt
   net/packetdrill:tcp_rcv_quantization_credit.pkt'
  passes in run 20260324-202158-4929 for ipv4, ipv6, and
  ipv4-mapped-ipv6
- the same quantization packetdrill fails on HEAD without 2/3 with:
    expected: win 84
    actual: win 85
Changes in v3
=============
- drop the unnecessary explanatory packetdrill comment from 1/3
- keep 2/3 granularity in signed int space to avoid the free_space
signed/unsigned comparison bug Eric pointed out
- keep 3/3 unchanged
Series layout
=============
1/3 selftests: packetdrill: stop pinning rwnd in tcp_ooo_rcv_mss
2/3 tcp: keep scaled no-shrink window representable
3/3 selftests: packetdrill: cover scaled rwnd quantization slack
Thanks,
Wesley Atwell
---
 net/ipv4/tcp_output.c                              | 16 +++++++++++-----
 .../selftests/net/packetdrill/tcp_ooo_rcv_mss.pkt  |  5 ++---
 .../packetdrill/tcp_rcv_quantization_credit.pkt    | 62 ++++++++++++++++++++++
 3 files changed, 75 insertions(+), 8 deletions(-)
base-commit: 5446b8691eb8278f10deca92048fad84ffd1e4d5
--
2.43.0