netdev.vger.kernel.org archive mirror
* [BUG] TCP: Duplicate ACK storm after reordering with delayed packet (BBR RTO triggered)
From: Ahmed, Shehab Sarar @ 2025-08-28  1:12 UTC
  To: netdev@vger.kernel.org
  Cc: edumazet@google.com, ncardwell@google.com, kuniyu@google.com

Hello,

I am a PhD student doing research on adversarial testing of different TCP protocols. I recently observed an interesting TCP behavior, which I describe below.

The network RTT was high for about a second and then dropped abruptly. Some packets sent during the high-RTT phase took a long time to reach the destination, while later packets, benefiting from the lower RTT, arrived earlier. This out-of-order arrival caused the receiver to generate duplicate acknowledgments (dup ACKs). Because of the low RTT, these dup ACKs reached the sender quickly. Upon receiving three dup ACKs, the sender fast-retransmitted an earlier packet that was not lost but was simply taking longer to arrive.

Interestingly, although the fast-retransmitted packet experienced the lower RTT, the original delayed packet still arrived first. When the receiver received this original packet, it sent an ACK for the next packet in sequence. However, when it later received the fast-retransmitted copy, an issue arose in its logic for updating the acknowledgment number. As a result, even after the next expected packet was received, the acknowledgment number was not updated correctly. The receiver kept sending dup ACKs, which ultimately forced the congestion control protocol into a retransmission timeout (RTO).
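
For reference, here is a minimal sketch of the cumulative-ACK behavior one would expect from a receiver in the scenario above. It is a toy model in Python, not the Linux code in tcp_input.c. Sequence numbers count whole segments, and the `Receiver` class and the arrival order are illustrative assumptions rather than data from the actual trace.

```python
class Receiver:
    """Toy cumulative-ACK receiver (hypothetical model, not Linux TCP)."""

    def __init__(self, first_seq=1):
        self.rcv_nxt = first_seq   # next in-order segment expected
        self.ooo = set()           # segments buffered out of order

    def on_segment(self, seq):
        """Process one arriving segment and return the ACK number sent."""
        if seq == self.rcv_nxt:
            self.rcv_nxt += 1
            # Drain any buffered segments that are now in order.
            while self.rcv_nxt in self.ooo:
                self.ooo.remove(self.rcv_nxt)
                self.rcv_nxt += 1
        elif seq > self.rcv_nxt:
            self.ooo.add(seq)      # hole before this segment: buffer it
        # seq < rcv_nxt: duplicate (e.g. a spurious retransmit); nothing to store.
        return self.rcv_nxt


# Assumed arrival order: segment 1 is delayed, 2-4 overtake it, the
# original 1 arrives, then the fast-retransmitted copy of 1, then 5.
arrivals = [2, 3, 4, 1, 1, 5]
rx = Receiver()
acks = [rx.on_segment(s) for s in arrivals]
print(acks)  # [1, 1, 1, 5, 5, 6]
```

In this model the duplicate copy of segment 1 produces one extra ACK for 5, and the ACK number advances to 6 as soon as segment 5 arrives. The reported behavior differs from this at the last step.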

I observed this behavior on Linux kernel version 5.4.230 and was wondering whether the same issue persists in the most recent kernel. Do you know of any commit that addressed this issue? If not, I would be glad to investigate further. My suspicion is that the problem lies in tcp_input.c. I look forward to your reply.

Thanks
Shehab

* Re: [BUG] TCP: Duplicate ACK storm after reordering with delayed packet (BBR RTO triggered)
From: Eric Dumazet @ 2025-08-28  3:15 UTC
  To: Ahmed, Shehab Sarar
  Cc: netdev@vger.kernel.org, ncardwell@google.com, kuniyu@google.com

I really wonder why anyone would do any research on v5.4.230, a
kernel that is more than 2 years old and clearly unsupported.

I suggest you write a packetdrill test to exhibit the issue, then run
a reverse bisection to find the commit fixing it (assuming recent
kernels are fixed).

There are about 8200 patches between v5.4.230 and v5.4.296, so a
bisection should be fast.
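
To illustrate why a bisection over roughly 8200 patches is fast, here is a small Python sketch of a "reverse" bisection: a binary search for the first commit where the bug is fixed, given a known-buggy start and a known-fixed end. The commit indices, the position of the fix (5000), and the `is_fixed` predicate are hypothetical. In practice the predicate would be a packetdrill test run against a kernel built at that commit.

```python
def first_fixed(n_commits, is_fixed):
    """Find the first fixed commit by binary search.

    Assumes commit 0 is buggy and commit n_commits - 1 is fixed.
    Returns (index of first fixed commit, number of commits tested).
    """
    lo, hi = 0, n_commits - 1      # invariant: lo is buggy, hi is fixed
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1                 # one kernel build + test run per step
        if is_fixed(mid):
            hi = mid
        else:
            lo = mid
    return hi, tests


# Hypothetical example: about 8200 commits, fix landed at index 5000.
commit, tests = first_fixed(8200, lambda i: i >= 5000)
print(commit, tests)
```

With about 8200 candidates the search needs at most 14 build-and-test steps, since each step halves the remaining range.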

* Re: [BUG] TCP: Duplicate ACK storm after reordering with delayed packet (BBR RTO triggered)
From: Neal Cardwell @ 2025-08-28 20:51 UTC
  To: Eric Dumazet
  Cc: Ahmed, Shehab Sarar, netdev@vger.kernel.org, kuniyu@google.com

Thanks for your report, Shehab.

I agree with Eric's suggestion to try writing a packetdrill test case
for this, so we have a reproducer for the behavior, and if there is a
bug we can create a regression test for Linux TCP with that.

Shehab, while you are working on a packetdrill reproducer of this
case, if you can share a binary tcpdump .pcap trace of such a
scenario, that would be very useful. From your detailed description it
sounds like you have such a trace. If you can share it, that would be
great. A visualization with tcptrace or similar tools may be easier
for us to parse than this English prose description. ;-)
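
As a rough aid for summarizing such a trace, the following Python sketch counts runs of repeated ACK numbers. It assumes the ACK numbers have already been extracted from the capture into a plain list; the function name and the sample data are hypothetical. It also simplifies the definition of a duplicate ACK, since it ignores payload, window changes, and SACK blocks.

```python
def dup_ack_runs(acks):
    """Return (ack_number, duplicate_count) for each ACK value that is
    repeated consecutively. The first ACK of a run is not a duplicate."""
    runs = []
    prev = None
    count = 0
    for ack in acks:
        if ack == prev:
            count += 1             # same ACK number again: a duplicate
        else:
            if prev is not None and count > 0:
                runs.append((prev, count))
            prev, count = ack, 0   # start tracking a new ACK number
    if prev is not None and count > 0:
        runs.append((prev, count))
    return runs


# Hypothetical ACK sequence, in segment units.
print(dup_ack_runs([1, 1, 1, 5, 5, 6]))  # [(1, 2), (5, 1)]
```

A long run at a single ACK number that ends only with an RTO would correspond to the storm described in the report.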

best regards,
neal
