* [PATCH net v3 0/2] tcp: protect locked SO_RCVBUF from Silly Window Syndrome
@ 2026-05-07 15:48 Ankit Jain
2026-05-07 15:48 ` [PATCH net v3 1/2] " Ankit Jain
2026-05-07 15:48 ` [PATCH net v3 2/2] selftests/net: add packetdrill test for locked SO_RCVBUF SWS Ankit Jain
0 siblings, 2 replies; 4+ messages in thread
From: Ankit Jain @ 2026-05-07 15:48 UTC (permalink / raw)
To: edumazet, netdev
Cc: kuba, davem, pabeni, ncardwell, kuniyu, horms, shuah,
quic_subashab, quic_stranche, linux-kselftest, linux-kernel,
karen.badiryan, ajay.kaher, alexey.makhalov,
vamsi-krishna.brahmajosyula, yin.ding, tapas.kundu, Ankit Jain
This patch series fixes Silly Window Syndrome (SWS) for sockets using
locked SO_RCVBUF.
When applications like Tomcat lock SO_RCVBUF, receiving small packets
causes the memory truesize penalty to drop the scaling_ratio to 1.
This shrinks the internal window clamp and leads to 504 Gateway Timeouts.
Patch 1 bypasses this penalty for locked sockets, except for GRO packets.
Patch 2 adds a packetdrill test to validate this fix.
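
As a rough illustration of the arithmetic described above, the following
userspace sketch models how a payload/truesize ratio could translate into a
usable window. It is not the kernel code: the helper names are hypothetical,
the shift of 8 is assumed to match TCP_RMEM_TO_WIN_SCALE, and the truesize
values a caller passes in are made-up inputs, since real truesize depends on
the driver and kernel configuration.

```c
#include <stdint.h>

/* Assumption: mirrors the kernel's TCP_RMEM_TO_WIN_SCALE (8). */
#define MODEL_RMEM_TO_WIN_SCALE 8

/*
 * Hypothetical model of the ratio update: payload length divided by
 * truesize, expressed in 1/256 units, with a floor of 1.
 * truesize must be nonzero. The cap at 255 is a guard added for this
 * model only, so the result always fits in a u8.
 */
static uint8_t model_scaling_ratio(uint32_t len, uint32_t truesize)
{
	uint64_t val = ((uint64_t)len << MODEL_RMEM_TO_WIN_SCALE) / truesize;

	if (val > 255)
		val = 255;
	return val ? (uint8_t)val : 1;
}

/*
 * Hypothetical model of converting receive buffer space into window
 * space using the ratio computed above.
 */
static uint32_t model_win_from_space(uint32_t space, uint8_t ratio)
{
	return (uint32_t)(((uint64_t)space * ratio) >> MODEL_RMEM_TO_WIN_SCALE);
}
```

With these assumed inputs, a 1-byte payload in a 768-byte truesize buffer
yields a ratio of 1, so 65536 bytes of buffer space map to a 256-byte window.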
Link to v1:
https://lore.kernel.org/all/20260427152756.1205-1-ankit-aj.jain@broadcom.com/
Link to v2:
https://lore.kernel.org/all/20260504144945.13477-1-ankit-aj.jain@broadcom.com/
v2 -> v3:
- Changed GRO detection from checking tp->advmss to skb->len > len
based on Eric Dumazet's suggestion. This correctly detects GRO
packets even if they contain tiny segments.
- Updated packetdrill script. Removed the ad-hoc mss 48 configuration.
It now uses a standard 1460 MSS and sends varying packet sizes
(600, 700, 800 bytes) to naturally trigger the scaling_ratio
recalculation.
Testing:
- Verified fix in a live Java/Tomcat environment (504 timeouts resolved).
- Passed the newly added packetdrill test demonstrating the clamp bypass.
- Passed upstream regression tests: tcp_rcv_neg_window.pkt,
tcp_rcv_wnd_shrink_allowed.pkt, tcp_rcv_wnd_shrink_nomem.pkt,
tcp_rcv_zero_wnd_fin.pkt, and tcp_rcv_big_endseq.pkt
Ankit Jain (2):
tcp: protect locked SO_RCVBUF from Silly Window Syndrome
selftests/net: add packetdrill test for locked SO_RCVBUF SWS
net/ipv4/tcp_input.c | 7 ++++-
.../net/packetdrill/tcp_locked_rcvbuf_sws.pkt | 29 +++++++++++++++++++
2 files changed, 35 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/net/packetdrill/tcp_locked_rcvbuf_sws.pkt
--
2.53.0
* [PATCH net v3 1/2] tcp: protect locked SO_RCVBUF from Silly Window Syndrome
2026-05-07 15:48 [PATCH net v3 0/2] tcp: protect locked SO_RCVBUF from Silly Window Syndrome Ankit Jain
@ 2026-05-07 15:48 ` Ankit Jain
2026-05-11 7:18 ` Eric Dumazet
2026-05-07 15:48 ` [PATCH net v3 2/2] selftests/net: add packetdrill test for locked SO_RCVBUF SWS Ankit Jain
1 sibling, 1 reply; 4+ messages in thread
From: Ankit Jain @ 2026-05-07 15:48 UTC (permalink / raw)
To: edumazet, netdev
Cc: kuba, davem, pabeni, ncardwell, kuniyu, horms, shuah,
quic_subashab, quic_stranche, linux-kselftest, linux-kernel,
karen.badiryan, ajay.kaher, alexey.makhalov,
vamsi-krishna.brahmajosyula, yin.ding, tapas.kundu, Ankit Jain
When an application locks SO_RCVBUF, it disables TCP window auto-tuning.
However, the kernel still applies dynamic truesize penalties to the
scaling_ratio.
For small packets, this penalty drops the scaling_ratio to 1. This
reduces the advertised window and causes Silly Window Syndrome (SWS)
along with 504 Gateway Timeouts in applications like Tomcat.
This patch bypasses the truesize penalty if SOCK_RCVBUF_LOCK is set.
To prevent memory exhaustion from large aggregate payloads, the penalty
is still applied for GRO packets (skb->len > len).
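
The condition added by this patch can be restated as a standalone predicate.
This is only a sketch for reading the diff below: the function name and the
MODEL_ constant are hypothetical, the lock bit value is assumed to equal the
kernel's SOCK_RCVBUF_LOCK, and len is assumed to be the per-segment size that
tcp_measure_rcv_mss() derives (gso_size when set, otherwise skb->len), which
is why skb_len > len is read as "this is a GRO packet".

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: same bit value as the kernel's SOCK_RCVBUF_LOCK. */
#define MODEL_SOCK_RCVBUF_LOCK 2

/*
 * Hypothetical restatement of the patched condition: decide whether
 * scaling_ratio would be recomputed for this packet.
 */
static bool model_should_update_ratio(uint32_t len, uint32_t rcv_mss,
				      uint32_t skb_len, unsigned int userlocks)
{
	/* Unchanged segment size: nothing is recomputed. */
	if (len == rcv_mss)
		return false;

	/* SO_RCVBUF not locked: behave as before the patch. */
	if (!(userlocks & MODEL_SOCK_RCVBUF_LOCK))
		return true;

	/* Locked SO_RCVBUF: only aggregated (GRO) packets are penalized. */
	return skb_len > len;
}
```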
Fixes: a2cbb1603943 ("tcp: Update window clamping condition")
Reported-by: Karen Badiryan <karen.badiryan@broadcom.com>
Signed-off-by: Ankit Jain <ankit-aj.jain@broadcom.com>
---
net/ipv4/tcp_input.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index d5c9e65d9760..4b1832b3face 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -240,8 +240,13 @@ static void tcp_measure_rcv_mss(struct sock *sk, const struct sk_buff *skb)
/* Note: divides are still a bit expensive.
* For the moment, only adjust scaling_ratio
* when we update icsk_ack.rcv_mss.
+ *
+ * Bypass truesize penalty for locked SO_RCVBUF to prevent
+ * window collapse. Still apply it to GRO packets.
*/
- if (unlikely(len != icsk->icsk_ack.rcv_mss)) {
+ if (unlikely(len != icsk->icsk_ack.rcv_mss &&
+ (!(sk->sk_userlocks & SOCK_RCVBUF_LOCK) ||
+ skb->len > len))) {
u64 val = (u64)skb->len << TCP_RMEM_TO_WIN_SCALE;
u8 old_ratio = tcp_sk(sk)->scaling_ratio;
--
2.53.0
* Re: [PATCH net v3 1/2] tcp: protect locked SO_RCVBUF from Silly Window Syndrome
2026-05-07 15:48 ` [PATCH net v3 1/2] " Ankit Jain
@ 2026-05-11 7:18 ` Eric Dumazet
0 siblings, 0 replies; 4+ messages in thread
From: Eric Dumazet @ 2026-05-11 7:18 UTC (permalink / raw)
To: Ankit Jain
Cc: netdev, kuba, davem, pabeni, ncardwell, kuniyu, horms, shuah,
quic_subashab, quic_stranche, linux-kselftest, linux-kernel,
karen.badiryan, ajay.kaher, alexey.makhalov,
vamsi-krishna.brahmajosyula, yin.ding, tapas.kundu
On Thu, May 7, 2026 at 8:51 AM Ankit Jain <ankit-aj.jain@broadcom.com> wrote:
>
> When an application locks SO_RCVBUF, it disables TCP window auto-tuning.
> However, the kernel still applies dynamic truesize penalties to the
> scaling_ratio.
>
> For small packets, this penalty drops the scaling_ratio to 1. This
> reduces the advertised window and causes Silly Window Syndrome (SWS)
> along with 504 Gateway Timeouts in applications like Tomcat.
>
> This patch bypasses the truesize penalty if SOCK_RCVBUF_LOCK is set.
> To prevent memory exhaustion from large aggregate payloads, the penalty
> is still applied for GRO packets (skb->len > len).
>
> Fixes: a2cbb1603943 ("tcp: Update window clamping condition")
> Reported-by: Karen Badiryan <karen.badiryan@broadcom.com>
> Signed-off-by: Ankit Jain <ankit-aj.jain@broadcom.com>
I still do not see why the current behavior has a 'bug'. I think it is
quite sane/reasonable.
Your selftest does not show anything wrong IMO.
If you want to avoid SWS, perhaps the application needs to _not_ use
SO_RCVBUF with a value smaller than the device MTU?
Applications using SO_RCVBUF with tiny values cannot expect kernel
behavior to be stable,
since rcvbuf management depends on metadata size, which can vary
between kernel versions/options.
If a kernel change is needed, I would rather enforce a sane sk_rcvbuf
floor when the MSS is learnt
at accept()/connect() time.
( TCP_SKB_MIN_TRUESIZE / SOCK_MIN_RCVBUF definitions )
* [PATCH net v3 2/2] selftests/net: add packetdrill test for locked SO_RCVBUF SWS
2026-05-07 15:48 [PATCH net v3 0/2] tcp: protect locked SO_RCVBUF from Silly Window Syndrome Ankit Jain
2026-05-07 15:48 ` [PATCH net v3 1/2] " Ankit Jain
@ 2026-05-07 15:48 ` Ankit Jain
1 sibling, 0 replies; 4+ messages in thread
From: Ankit Jain @ 2026-05-07 15:48 UTC (permalink / raw)
To: edumazet, netdev
Cc: kuba, davem, pabeni, ncardwell, kuniyu, horms, shuah,
quic_subashab, quic_stranche, linux-kselftest, linux-kernel,
karen.badiryan, ajay.kaher, alexey.makhalov,
vamsi-krishna.brahmajosyula, yin.ding, tapas.kundu, Ankit Jain
Add a packetdrill test to verify that locked SO_RCVBUF sockets do not
suffer from scaling_ratio truesize penalties.
The test uses a standard 1460 MSS and sends medium-sized packets
(600, 700, 800 bytes) to trigger the recalculation logic. It checks
that the internal window clamp (tcpi_rcv_ssthresh) does not drop
unexpectedly.
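
For readers who want to inspect the same field outside packetdrill, a minimal
sketch in C follows. It reads tcpi_rcv_ssthresh through the TCP_INFO socket
option; the helper name read_rcv_ssthresh is hypothetical and is not part of
this series.

```c
#define _DEFAULT_SOURCE
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: fetch tcpi_rcv_ssthresh for a TCP socket.
 * Returns 0 and stores the value in *out on success, -1 on failure.
 */
static int read_rcv_ssthresh(int fd, unsigned int *out)
{
	struct tcp_info info;
	socklen_t len = sizeof(info);

	memset(&info, 0, sizeof(info));
	if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &info, &len) < 0)
		return -1;

	*out = info.tcpi_rcv_ssthresh;
	return 0;
}
```

A caller would pass the accepted connection's descriptor and compare the
result against the threshold it cares about, as the packetdrill assertion
does with 28000.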
Signed-off-by: Ankit Jain <ankit-aj.jain@broadcom.com>
---
.../net/packetdrill/tcp_locked_rcvbuf_sws.pkt | 29 +++++++++++++++++++
1 file changed, 29 insertions(+)
create mode 100644 tools/testing/selftests/net/packetdrill/tcp_locked_rcvbuf_sws.pkt
diff --git a/tools/testing/selftests/net/packetdrill/tcp_locked_rcvbuf_sws.pkt b/tools/testing/selftests/net/packetdrill/tcp_locked_rcvbuf_sws.pkt
new file mode 100644
index 000000000000..43e1d00d5f26
--- /dev/null
+++ b/tools/testing/selftests/net/packetdrill/tcp_locked_rcvbuf_sws.pkt
@@ -0,0 +1,29 @@
+// SPDX-License-Identifier: GPL-2.0
+// Test that TCP does not reduce scaling_ratio for locked SO_RCVBUF.
+
+0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
++0 setsockopt(3, SOL_SOCKET, SO_RCVBUF, [32768], 4) = 0
++0 bind(3, ..., ...) = 0
++0 listen(3, 1) = 0
+
+// Establish connection with standard MSS.
++0 < S 0:0(0) win 65535 <mss 1460,nop,wscale 8>
++0 > S. 0:0(0) ack 1 <...>
++0 < . 1:1(0) ack 1 win 65535
++0 accept(3, ..., ...) = 4
+
+// Inject varying payload sizes to force scaling_ratio recalculation.
++0.1 < P. 1:601(600) ack 1 win 65535
++0 > . 1:1(0) ack 601 <...>
+
++0.1 < P. 601:1301(700) ack 1 win 65535
++0 > . 1:1(0) ack 1301 <...>
+
++0.1 < P. 1301:2101(800) ack 1 win 65535
++0 > . 1:1(0) ack 2101 <...>
+
+// Check that truesize penalty did not reduce the window clamp.
+// On unpatched kernels, rcv_ssthresh drops to ~22K.
++0.1 %{
+assert tcpi_rcv_ssthresh > 28000, f"rcv_ssthresh dropped unexpectedly: {tcpi_rcv_ssthresh}"
+}%
--
2.53.0