Netdev List
* [BUG] TCP connection deadlock under simultaneous bidirectional ICSK_ACK_NOMEM (OOM)
@ 2026-06-04  8:22 xietangxin
From: xietangxin @ 2026-06-04  8:22 UTC (permalink / raw)
  To: edumazet, davem, kuba, pabeni
  Cc: jmaloy, menglong8.dong, kuniyu, horms, willemb, netdev,
	linux-kernel

Hi all,

We have observed a TCP connection deadlock on stable 6.6 under heavy stress testing.

1. Both Peer A and Peer B enter the ICSK_ACK_NOMEM branch in tcp_select_window().
After commit 8c670bdfa58e ("tcp: correct handling of extreme memory squeeze"),
both peers freeze their rcv_nxt and set rcv_wnd = 0.

2. Before freezing, both sides had already sent data that was still in flight.
Since both sides are dropping incoming data packets due to OOM, rcv_nxt stops advancing,
but the seq of each peer's subsequent packets continues to grow.

3. When Peer A receives Peer B's zero-window ACK,
the packet's seq is far ahead of Peer A's frozen rcv_nxt, so Peer A drops it.
The same happens in the other direction. No zero-window probes are triggered
either, because snd_wnd is never updated to 0.


Simplified Packet Trace:

Assume Peer A's rcv_nxt = 1000, and Peer B's rcv_nxt = 5000 initially.

Time  Dir      Type        Seq   Ack   Win  Len  Status
------------------------------------------------------------------------
T1:   B -> A   [PSH, ACK]  1000  5000  3000 100  (A hits OOM, rcv_nxt=1000)
T2:   B -> A   [ACK]       1100  5000  3000 200  (Dropped due to A's OOM)
T3:   B -> A   [PSH, ACK]  1300  5000  3000 200  (Dropped due to A's OOM)

T4:   A -> B   [PSH, ACK]  5000  1000  3000 100  (B hits OOM, rcv_nxt=5000)
T5:   A -> B   [ACK]       5100  1000  3000 200  (Dropped due to B's OOM)
T6:   A -> B   [PSH, ACK]  5300  1000  3000 200  (Dropped due to B's OOM)

-- Both sides are now in OOM. B's Seq is 1500; A's Seq is 5500 --

T7:   B -> A   [ZeroWin]   1500  5000  0    0    (Dropped: Seq 1500 != 1000)
T8:   A -> B   [ZeroWin]   5500  1000  0    0    (Dropped: Seq 5500 != 5000)
T9:   A -> B   [WinUpdate] 5500  1000  20   0    (Dropped: Seq 5500 != 5000)
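
To make the drops at T7-T9 concrete, here is a small toy model of a receive-side
sequence acceptability check. It is a simplified sketch and not the kernel's
tcp_sequence() source: the helper names (before, after, seq_acceptable, receive)
and the dict-based peer state are hypothetical, and it assumes the check rejects
any segment whose seq lies beyond rcv_nxt + rcv_wnd.

```python
# Toy model only: a simplified receive-side sequence check.
# This is NOT the kernel's tcp_sequence() implementation.

MASK = 0xFFFFFFFF  # TCP sequence numbers are 32-bit and wrap around


def before(a: int, b: int) -> bool:
    """True if sequence number a comes before b, modulo 2**32."""
    return ((a - b) & MASK) >= 0x80000000


def after(a: int, b: int) -> bool:
    """True if sequence number a comes after b, modulo 2**32."""
    return before(b, a)


def seq_acceptable(seq: int, end_seq: int, rcv_nxt: int, rcv_wnd: int) -> bool:
    """Accept a segment only if it is neither entirely old nor past the window."""
    if before(end_seq, rcv_nxt):
        return False  # entirely old data
    if after(seq, (rcv_nxt + rcv_wnd) & MASK):
        return False  # starts beyond the right edge of the receive window
    return True


def receive(peer: dict, seq: int, length: int, win: int) -> bool:
    """Apply the advertised window to snd_wnd only when the segment is accepted."""
    end_seq = (seq + length) & MASK
    if not seq_acceptable(seq, end_seq, peer["rcv_nxt"], peer["rcv_wnd"]):
        return False
    peer["snd_wnd"] = win
    return True


# State from the trace: both peers frozen with rcv_wnd = 0,
# both still believing the other side offers Win=3000.
peer_a = {"rcv_nxt": 1000, "rcv_wnd": 0, "snd_wnd": 3000}
peer_b = {"rcv_nxt": 5000, "rcv_wnd": 0, "snd_wnd": 3000}

t7 = receive(peer_a, seq=1500, length=0, win=0)   # B -> A [ZeroWin]
t8 = receive(peer_b, seq=5500, length=0, win=0)   # A -> B [ZeroWin]
t9 = receive(peer_b, seq=5500, length=0, win=20)  # A -> B [WinUpdate]
print(t7, t8, t9, peer_a["snd_wnd"], peer_b["snd_wnd"])
```

In this model all three segments are rejected and snd_wnd stays at 3000 on both
sides, so neither peer ever sees a zero window and neither starts probing.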

Should we relax the sequence check in tcp_sequence() for zero-window ACKs?

Any feedback or guidance would be greatly appreciated.

-- 
Best regards,
Tangxin Xie



* Re: [BUG] TCP connection deadlock under simultaneous bidirectional ICSK_ACK_NOMEM (OOM)
@ 2026-06-08 11:55 ` Menglong Dong
From: Menglong Dong @ 2026-06-08 11:55 UTC (permalink / raw)
  To: xietangxin
  Cc: edumazet, davem, kuba, pabeni, jmaloy, menglong8.dong, kuniyu,
	horms, willemb, netdev, linux-kernel, linux-stable

On 2026/6/4 16:22 xietangxin <xietangxin@yeah.net> wrote:
> Hi all,
> 
> We have observed a TCP connection deadlock on stable 6.6 under heavy stress testing.
> 
> 1. Both Peer A and Peer B enter the ICSK_ACK_NOMEM branch in tcp_select_window().
> After commit 8c670bdfa58e ("tcp: correct handling of extreme memory squeeze"),
> both peers freeze their rcv_nxt and set rcv_wnd = 0.
> 
> 2. Before freezing, both sides had already sent data that was still in flight.
> Since both sides are dropping incoming data packets due to OOM, rcv_nxt stops advancing,
> but the seq of each peer's subsequent packets continues to grow.
> 
> 3. When Peer A receives Peer B's zero-window ACK,
> the packet's seq is far ahead of Peer A's frozen rcv_nxt, so Peer A drops it.
> The same happens in the other direction. No zero-window probes are triggered
> either, because snd_wnd is never updated to 0.
> 

Hi,

The problem you describe is already fixed by this commit:
0e24d17bd966 ("tcp: implement RFC 7323 window retraction receiver requirements"),
which hasn't been backported to the 6.6 branch.

That patch doesn't have a Fixes tag, so I'm not sure whether it will be picked
up for the 6.6 branch. I've CCed linux-stable :)

Thanks!
Menglong Dong


* Re: [BUG] TCP connection deadlock under simultaneous bidirectional ICSK_ACK_NOMEM (OOM)
@ 2026-06-25 13:22   ` xietangxin
From: xietangxin @ 2026-06-25 13:22 UTC (permalink / raw)
  To: Menglong Dong
  Cc: edumazet, davem, kuba, pabeni, jmaloy, menglong8.dong, kuniyu,
	horms, willemb, netdev, linux-kernel, linux-stable



Hi,

We observed a throughput regression (dropping from ~1GB/s to 100MB/s)
in our test environment after commit 0e24d17bd966
("tcp: implement RFC 7323 window retraction receiver requirements").

When the receive buffer comes under memory pressure, tcp_clamp_window() is
triggered and rcv_ssthresh is strictly capped to 2 * advmss.
Subsequently, even after the user completely consumes the data and frees up
a large amount of buffer space (free_space), tcp_select_window() is still heavily
suppressed by the clamped rcv_ssthresh. As a result, the receiver advertises
an extremely small window (Win=23) to the peer.
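
As a rough sanity check of the Win=23 figure, the arithmetic can be sketched as
follows. This is a simplified model and not the kernel's window selection code:
the function name is hypothetical, advmss = 1448 is inferred from the Len=1448
retransmission in the capture, and a window scale of 7 and the 4 MiB of free
space are assumptions (the scale factor is not visible in the capture). It also
assumes Win=23 is the raw, unscaled window field.

```python
def advertised_window_field(free_space: int, rcv_ssthresh: int, wscale: int) -> int:
    """Return the 16-bit window field for a window limited by rcv_ssthresh."""
    window = min(free_space, rcv_ssthresh)   # the clamp wins when free_space is large
    unit = 1 << wscale                       # granularity imposed by window scaling
    window = -(-window // unit) * unit       # round up to a multiple of the unit
    return window >> wscale


ADVMSS = 1448                  # assumption: taken from Len=1448 in the capture
WSCALE = 7                     # assumption: not visible in the capture
FREE_SPACE = 4 * 1024 * 1024   # assumption: "large" free space after the reader drains

field = advertised_window_field(FREE_SPACE, 2 * ADVMSS, WSCALE)
print(field, field << WSCALE)
```

Under these assumptions the field comes out as 23, i.e. an effective window of
2944 bytes, which is roughly two full-sized segments no matter how much buffer
space has actually been freed.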

The sender cannot transmit any new data segments until its RTO timer
expires and triggers slow-start recovery. This 200ms silence slashes
our bandwidth by 90%.


No.   Time           Source       Destination  Info
-----------------------------------------------------------------------------------------------
1045  08:16:06.8005  192.168.1.9  192.168.1.10  [TCP ZeroWindow] 57334 -> 6666 [PSH, ACK] Win=0
1052  08:16:06.8013  192.168.1.9  192.168.1.10  [TCP Window Update] 57334 -> 6666 [ACK] Win=23
1055  08:16:06.8036  192.168.1.10  192.168.1.9  6666 -> 57334 [ACK] Seq=2999704568 Ack=2416286095
=========================== 200ms  SILENCE (RTO WAITING) ===================================
1088  08:16:07.0056  192.168.1.10  192.168.1.9  [TCP Retransmission] 6666 -> 57334 Len=1448
1090  08:16:07.0060  192.168.1.10  192.168.1.9  [TCP Retransmission] Len=2896

-- 
Best regards,
Tangxin Xie



* Re: [BUG] TCP connection deadlock under simultaneous bidirectional ICSK_ACK_NOMEM (OOM)
@ 2026-06-25 14:34     ` Eric Dumazet
From: Eric Dumazet @ 2026-06-25 14:34 UTC (permalink / raw)
  To: xietangxin
  Cc: Menglong Dong, davem, kuba, pabeni, jmaloy, menglong8.dong,
	kuniyu, horms, willemb, netdev, linux-kernel, linux-stable

On Thu, Jun 25, 2026 at 6:22 AM xietangxin <xietangxin@yeah.net> wrote:
> Hi,
>
> We observed a throughput regression (dropping from ~1GB/s to 100MB/s)
> in our test environment after commit 0e24d17bd966
> ("tcp: implement RFC 7323 window retraction receiver requirements").
>

Could you provide instructions on how you/we can deterministically
reproduce this issue?
