From: Wesley Atwell <atwellwea@gmail.com>
To: netdev@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org,
	davem@davemloft.net, edumazet@google.com, ncardwell@google.com,
	kuniyu@google.com, dsahern@kernel.org, kuba@kernel.org,
	pabeni@redhat.com, horms@kernel.org, shuah@kernel.org,
	gmbnomis@gmail.com, Wesley Atwell <atwellwea@gmail.com>
Subject: [PATCH net-next v2 0/3] tcp: fix scaled no-shrink rwnd quantization slack
Date: Tue, 24 Mar 2026 00:04:07 -0600
Message-ID: <20260324060410.1137199-1-atwellwea@gmail.com>

Hi,

This v2 addresses the review feedback on the earlier quantization series.

Simon was right that the original 3/3 only showed the explicit
rcv_ssthresh-limited ALIGN-up behavior. For v2, 3/3 is replaced with an
OOO-memory-based reproducer that first grows rcv_ssthresh with in-order
data and then drives raw backed free_space below rcv_ssthresh without
advancing rcv_nxt. In the instrumented old-behavior run that shaped this
test, the critical ACK reached free_space=86190, rcv_ssthresh=86286,
and still advertised 87040 (85 << 10). With 2/3 applied, the same ACK
advertises 84 (84 << 10 = 86016), which stays within free_space.

That follow-up also clarified why the broader 2/3 change is required.
A narrower variant that preserved the old rcv_ssthresh-limited ALIGN-up
behavior was not sufficient: earlier ACKs still stored 85 in tp->rcv_wnd,
and tcp_select_window() later preserved that extra unit because shrinking
was disallowed. Keeping tp->rcv_wnd representable across the scaled
no-shrink path is what lets later ACKs settle at the correct
wire-visible edge.
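
To make the no-shrink interaction concrete, here is a minimal user-space
model. It is a sketch, not the kernel code: struct rwnd_model and
select_window_no_shrink() are hypothetical names, rcv_nxt is assumed not
to advance between ACKs, and shrinking is assumed to be disallowed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the receiver state relevant here. */
struct rwnd_model {
	uint32_t rcv_wnd;	/* last advertised window, in bytes */
	unsigned int wscale;	/* receive window scale shift */
};

/* Round x up to a multiple of a (a must be a power of two). */
static uint32_t align_up(uint32_t x, uint32_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * Simplified no-shrink step: if the freshly computed window is smaller
 * than what is already on offer, keep the old offer instead of shrinking.
 * Returns the scaled value that would go on the wire.
 */
static uint32_t select_window_no_shrink(struct rwnd_model *m, uint32_t new_win)
{
	/* Assumption: rcv_nxt did not move, so the whole old offer is outstanding. */
	uint32_t cur_win = m->rcv_wnd;

	assert(m->wscale <= 14);	/* RFC 7323 caps the shift count at 14 */
	if (new_win < cur_win)
		new_win = align_up(cur_win, 1u << m->wscale);
	m->rcv_wnd = new_win;
	return new_win >> m->wscale;
}
```

In this model, once an earlier ACK has stored 87040 in rcv_wnd, a later
request for 86016 still goes out as 85. If the earlier ACK had stored
86016, the later one goes out as 84.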

Problem
=======

In the scaled no-shrink path, __tcp_select_window() rounds free_space up
to the receive-window scale quantum:

  window = ALIGN(free_space, 1 << tp->rx_opt.rcv_wscale);

When raw backed free_space sits just below the next quantum, that can
expose fresh sender-visible credit that is not actually backed by the
current receive-memory state.
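
The arithmetic can be checked in user space. The sketch below assumes a
window scale of 10 and uses the free_space value from the instrumented
run; align_up() is a stand-in for the kernel's ALIGN(), and the other
helper names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for ALIGN(): round x up to a multiple of a (power of two). */
static uint32_t align_up(uint32_t x, uint32_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Scaled window value that the round-up path would put on the wire. */
static uint32_t advertised_units_round_up(uint32_t free_space,
					  unsigned int wscale)
{
	assert(wscale <= 14);	/* RFC 7323 caps the shift count at 14 */
	return align_up(free_space, 1u << wscale) >> wscale;
}

/* Bytes of sender-visible credit beyond the backed free_space. */
static uint32_t unbacked_credit(uint32_t free_space, unsigned int wscale)
{
	return (advertised_units_round_up(free_space, wscale) << wscale) -
	       free_space;
}
```

With free_space=86190 and a shift of 10, the round-up path yields 85
units (87040 bytes), which is 850 bytes more than free_space.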

Approach
========

This repost keeps only the part with a clear fail-before/pass-after
story today:

  - relax one unrelated packetdrill test which was pinning an
    incidental advertised window
  - keep tp->rcv_wnd representable in scaled units by rounding larger
    windows down to the scale quantum
  - preserve only the small non-zero case that would otherwise scale
    away to zero; changing that longstanding non-zero-to-zero behavior
    would be a separate change from the bug proven here
  - prove the actual raw-free_space case with a packetdrill sequence
    that reaches free_space < rcv_ssthresh without changing SO_RCVBUF
    after the handshake
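
The rounding policy described in the list above could look roughly like
the following. This is a hypothetical user-space sketch, not the code in
2/3: quantize_window() is an invented name, and the handling of the small
non-zero case is an assumption based on the description here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch of the described policy:
 *  - zero stays zero
 *  - a non-zero value below one quantum is kept non-zero (rounded up),
 *    matching the longstanding behavior this series leaves alone
 *  - anything larger is rounded down to the scale quantum
 */
static uint32_t quantize_window(uint32_t free_space, unsigned int wscale)
{
	uint32_t granularity;

	assert(wscale <= 14);	/* RFC 7323 caps the shift count at 14 */
	granularity = 1u << wscale;

	if (free_space == 0)
		return 0;
	if (free_space < granularity)
		return granularity;
	return free_space & ~(granularity - 1);
}
```

For free_space=86190 and a shift of 10 this returns 86016, i.e. 84 on the
wire, while a free_space of 500 still advertises one unit.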

Tests
=====

Local validation:
- git diff --check
- checkpatch on the touched diff
- local vmksft targeted run of
  net/packetdrill:tcp_rcv_quantization_credit.pkt passes with this
  series applied for ipv4, ipv6, and ipv4-mapped-ipv6
- the same packetdrill fails on HEAD without 2/3 with:

    expected: win 84
      actual: win 85

Changes in v2
=============

- leave 1/3 unchanged
- rename gran to granularity in 2/3
- clarify in 2/3 why representable tp->rcv_wnd state is required across
  later no-shrink transitions
- clarify in 2/3 that the smaller longstanding non-zero case remains
  intentionally unchanged in this series
- replace 3/3 with the proven OOO-memory reproducer for the raw
  free_space case
- drop the IPv4-only restriction in 3/3 after validating the test on
  the default packetdrill protocol set

Series layout
=============

  1/3 selftests: packetdrill: stop pinning rwnd in tcp_ooo_rcv_mss
  2/3 tcp: keep scaled no-shrink window representable
  3/3 selftests: packetdrill: cover scaled rwnd quantization slack

Thanks,
Wesley Atwell

---
 net/ipv4/tcp_output.c                              | 16 +++++++++++-----
 .../selftests/net/packetdrill/tcp_ooo_rcv_mss.pkt |  8 +++++---
 .../packetdrill/tcp_rcv_quantization_credit.pkt   | 62 ++++++++++++++++++++++
 3 files changed, 78 insertions(+), 8 deletions(-)

-- 
2.43.0

Thread overview: 6+ messages
2026-03-24  6:04 Wesley Atwell [this message]
2026-03-24  6:04 ` [PATCH net-next v2 1/3] selftests: packetdrill: stop pinning rwnd in tcp_ooo_rcv_mss Wesley Atwell
2026-03-24 14:28   ` Eric Dumazet
2026-03-24  6:04 ` [PATCH net-next v2 2/3] tcp: keep scaled no-shrink window representable Wesley Atwell
2026-03-24  7:18   ` Eric Dumazet
2026-03-24  6:04 ` [PATCH net-next v2 3/3] selftests: packetdrill: cover scaled rwnd quantization slack Wesley Atwell
