* [BUG] TCP connection deadlock under simultaneous bidirectional ICSK_ACK_NOMEM (OOM)
From: xietangxin @ 2026-06-04 8:22 UTC
To: edumazet, davem, kuba, pabeni
Cc: jmaloy, menglong8.dong, kuniyu, horms, willemb, netdev,
	linux-kernel
Hi all,

We have observed a TCP connection deadlock on stable 6.6 under heavy stress testing.

1. Both Peer A and Peer B enter the ICSK_ACK_NOMEM branch in tcp_select_window().
   After commit 8c670bdfa58e ("tcp: correct handling of extreme memory squeeze"),
   both peers freeze their rcv_nxt and set rcv_wnd = 0.

2. Prior to freezing, both sides already had data in flight.
   Since both sides are dropping incoming data packets due to OOM, rcv_nxt stops
   advancing, but the seq of the peer's subsequent packets continues to grow.

3. When Peer A receives Peer B's Zero Window ACK,
   the packet's seq is far ahead of Peer A's frozen rcv_nxt.
   Both peers drop each other's packets, and no Zero Window Probes are triggered
   because snd_wnd is never updated to 0.
Simplified Packet Trace:

Assume Peer A's rcv_nxt = 1000, and Peer B's rcv_nxt = 5000 initially.

Time  Dir     Type         Seq   Ack   Win   Len  Status
----------------------------------------------------------------------------
T1:   B -> A  [PSH, ACK]   1000  5000  3000  100  (A hits OOM, rcv_nxt=1000)
T2:   B -> A  [ACK]        1100  5000  3000  200  (Dropped due to A's OOM)
T3:   B -> A  [PSH, ACK]   1300  5000  3000  200  (Dropped due to A's OOM)

T4:   A -> B  [PSH, ACK]   5000  1000  3000  100  (B hits OOM, rcv_nxt=5000)
T5:   A -> B  [ACK]        5100  1000  3000  200  (Dropped due to B's OOM)
T6:   A -> B  [PSH, ACK]   5300  1000  3000  200  (Dropped due to B's OOM)

-- Both sides are now in OOM. B's Seq is 1500; A's Seq is 5500 --

T7:   B -> A  [ZeroWin]    1500  5000  0     0    (Dropped: Seq 1500 != 1000)
T8:   A -> B  [ZeroWin]    5500  1000  0     0    (Dropped: Seq 5500 != 5000)
T9:   A -> B  [WinUpdate]  5500  1000  20    0    (Dropped: Seq 5500 != 5000)
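
The trace can be replayed with a small model. This is only a sketch of the acceptability test as described in this report (a segment is rejected when its seq lies beyond rcv_nxt plus the offered window). It is not the kernel code, which performs additional checks. The Peer class, its fields, and the probing flag are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

MASK32 = 0xFFFFFFFF


def seq_after(a: int, b: int) -> bool:
    """True if 32-bit sequence number a is later than b (modulo 2**32)."""
    return ((b - a) & MASK32) >= 0x80000000


@dataclass
class Peer:
    """Hypothetical, stripped-down per-endpoint state (illustration only)."""
    name: str
    rcv_nxt: int            # next sequence number expected from the peer
    rcv_wnd: int            # receive window currently offered
    snd_wnd: int = 3000     # last window learned from the peer's ACKs
    probing: bool = False   # would zero-window probing be armed?

    def accepts(self, seq: int) -> bool:
        # Simplified check: a segment may not start beyond the right edge
        # of the offered window, i.e. rcv_nxt + rcv_wnd.
        right_edge = (self.rcv_nxt + self.rcv_wnd) & MASK32
        return not seq_after(seq, right_edge)

    def receive_ack(self, seq: int, win: int) -> bool:
        """Process a pure ACK; return True if it was accepted."""
        if not self.accepts(seq):
            return False            # dropped before the window is looked at
        self.snd_wnd = win
        self.probing = (win == 0)
        return True


# State frozen by the memory squeeze, using the numbers from the trace.
a = Peer("A", rcv_nxt=1000, rcv_wnd=0)
b = Peer("B", rcv_nxt=5000, rcv_wnd=0)

t7 = a.receive_ack(seq=1500, win=0)    # B -> A zero-window ACK
t8 = b.receive_ack(seq=5500, win=0)    # A -> B zero-window ACK
t9 = b.receive_ack(seq=5500, win=20)   # A -> B window update

for label, ok in (("T7", t7), ("T8", t8), ("T9", t9)):
    print(label, "accepted" if ok else "dropped")
print("A: snd_wnd =", a.snd_wnd, "probing =", a.probing)
print("B: snd_wnd =", b.snd_wnd, "probing =", b.probing)
```

In this model all three ACKs are dropped, so neither side ever records a zero window from its peer and neither starts probing, which matches the stall described in points 1 to 3. Only a segment that starts exactly at the frozen rcv_nxt would pass the check.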

Should we relax the sequence check in tcp_sequence() for zero-window ACKs?

Any feedback or guidance would be greatly appreciated.

--
Best regards,
Tangxin Xie

* Re: [BUG] TCP connection deadlock under simultaneous bidirectional ICSK_ACK_NOMEM (OOM)
From: Menglong Dong @ 2026-06-08 11:55 UTC
To: xietangxin
Cc: edumazet, davem, kuba, pabeni, jmaloy, menglong8.dong, kuniyu, horms,
	willemb, netdev, linux-kernel, linux-stable

On 2026/6/4 16:22 xietangxin <xietangxin@yeah.net> wrote:
> Hi all,
>
> We have observed a TCP connection deadlock on stable 6.6 under heavy stress testing.
>
> 1. Both Peer A and Peer B enter the ICSK_ACK_NOMEM branch in tcp_select_window().
>    After commit 8c670bdfa58e ("tcp: correct handling of extreme memory squeeze"),
>    both peers freeze their rcv_nxt and set rcv_wnd = 0.
>
> 2. Prior to freezing, both sides already had data in flight.
>    Since both sides are dropping incoming data packets due to OOM, rcv_nxt stops
>    advancing, but the seq of the peer's subsequent packets continues to grow.
>
> 3. When Peer A receives Peer B's Zero Window ACK,
>    the packet's seq is far ahead of Peer A's frozen rcv_nxt.
>    Both peers drop each other's packets, and no Zero Window Probes are triggered
>    because snd_wnd is never updated to 0.
>

Hi,

The problem you described is already fixed by this commit:
0e24d17bd966 ("tcp: implement RFC 7323 window retraction receiver requirements"),
which hasn't been picked into the 6.6 branch.

That patch doesn't have a Fixes tag, so I'm not sure whether it will be picked
into the 6.6 branch. CC'ing linux-stable :)

Thanks!
Menglong Dong

> [...]

* Re: [BUG] TCP connection deadlock under simultaneous bidirectional ICSK_ACK_NOMEM (OOM)
From: xietangxin @ 2026-06-25 13:22 UTC
To: Menglong Dong
Cc: edumazet, davem, kuba, pabeni, jmaloy, menglong8.dong, kuniyu, horms,
	willemb, netdev, linux-kernel, linux-stable

On 6/8/2026 7:55 PM, Menglong Dong wrote:
> [...]
>
> Hi,
>
> The problem you described is already fixed by this commit:
> 0e24d17bd966 ("tcp: implement RFC 7323 window retraction receiver requirements"),
> which hasn't been picked into the 6.6 branch.
>
> That patch doesn't have a Fixes tag, so I'm not sure whether it will be picked
> into the 6.6 branch. CC'ing linux-stable :)
>
> Thanks!
> Menglong Dong
>
> [...]

Hi,

We observed a throughput regression (dropping from ~1GB/s to 100MB/s)
in our test environment after commit 0e24d17bd966
("tcp: implement RFC 7323 window retraction receiver requirements").

When rcv_buf comes under pressure, tcp_clamp_window() is triggered and
rcv_ssthresh is strictly capped to 2 * advmss.
Subsequently, even after the user completely consumes the data and releases
a massive amount of free_space, tcp_select_window() is still heavily
suppressed by the clamped rcv_ssthresh. As a result, the receiver advertises
an extremely small window (Win=23) to the peer.

The sender cannot transmit any new data segments until its RTO timer
expires and triggers a slow-start recovery. This 200ms silence window slashes
our bandwidth by 90%.

No.   Time           Source        Destination   Info
-----------------------------------------------------------------------------------------------
1045  08:16:06.8005  192.168.1.9   192.168.1.10  [TCP ZeroWindow] 57334 -> 6666 [PSH, ACK] Win=0
1052  08:16:06.8013  192.168.1.9   192.168.1.10  [TCP Window Update] 57334 -> 6666 [ACK] Win=23
1055  08:16:06.8036  192.168.1.10  192.168.1.9   6666 -> 57334 [ACK] Seq=2999704568 Ack=2416286095
=========================== 200ms SILENCE (RTO WAITING) ===================================
1088  08:16:07.0056  192.168.1.10  192.168.1.9   [TCP Retransmission] 6666 -> 57334 Len=1448
1090  08:16:07.0060  192.168.1.10  192.168.1.9   [TCP Retransmission] Len=2896

--
Best regards,
Tangxin Xie
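
The regression report above makes two quantitative claims that can be checked with a few lines of arithmetic. The sketch below is a simplified model, not the kernel's window selection logic: it assumes the offered window is the smaller of free_space and rcv_ssthresh, as the report describes, and that the flow alternates between bursts at the peak rate and complete stalls. The function names, the 4 MiB free_space value, and advmss = 1448 (chosen to match the Len=1448 retransmission in the capture) are assumptions made for illustration.

```python
def advertised_window(free_space: int, rcv_ssthresh: int) -> int:
    """Window offered when free space is capped by rcv_ssthresh (simplified)."""
    return max(0, min(free_space, rcv_ssthresh))


def effective_rate(peak_rate: float, active_s: float, stall_s: float) -> float:
    """Average rate when bursts at peak_rate alternate with idle stalls."""
    return peak_rate * active_s / (active_s + stall_s)


def active_time_for(ratio: float, stall_s: float) -> float:
    """Burst length for which average/peak equals ratio, given the stall length."""
    return stall_s * ratio / (1.0 - ratio)


ADVMSS = 1448                    # assumption, taken from Len=1448 in the capture
clamped_ssthresh = 2 * ADVMSS    # cap described in the report
free_space = 4 * 1024 * 1024     # hypothetical: reader has drained the buffer

# Even with megabytes free, the offer stays at the clamped value.
window = advertised_window(free_space, clamped_ssthresh)

# How short must each burst be for 200 ms stalls to cost 90% of the rate?
burst_s = active_time_for(ratio=0.1, stall_s=0.2)
average = effective_rate(peak_rate=1000.0, active_s=burst_s, stall_s=0.2)

print("advertised window (bytes):", window)
print("burst length per stall (ms):", round(burst_s * 1000, 1))
print("average rate (MB/s):", round(average, 1))
```

Under these assumptions, a 90% drop corresponds to roughly 22 ms of transfer at full rate between consecutive 200 ms stalls, so the stalls would need to recur about four to five times per second to explain the reported numbers.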

* Re: [BUG] TCP connection deadlock under simultaneous bidirectional ICSK_ACK_NOMEM (OOM)
From: Eric Dumazet @ 2026-06-25 14:34 UTC
To: xietangxin
Cc: Menglong Dong, davem, kuba, pabeni, jmaloy, menglong8.dong, kuniyu, horms,
	willemb, netdev, linux-kernel, linux-stable

On Thu, Jun 25, 2026 at 6:22 AM xietangxin <xietangxin@yeah.net> wrote:
>
> [...]
>
> Hi,
>
> We observed a throughput regression (dropping from ~1GB/s to 100MB/s)
> in our test environment after commit 0e24d17bd966
> ("tcp: implement RFC 7323 window retraction receiver requirements").
>

Could you provide instructions on how you/we can deterministically
reproduce this issue?

> [...]