[PATCH net 0/2] tcp: re-enable acceptance of FIN packets when RWIN is 0

public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH net 0/2] tcp: re-enable acceptance of FIN packets when RWIN is 0
@ 2026-02-22 12:35 Simon Baatz via B4 Relay
  2026-02-22 12:35 ` [PATCH net 1/2] " Simon Baatz via B4 Relay
  2026-02-22 12:35 ` [PATCH net 2/2] selftests/net: packetdrill: Verify " Simon Baatz via B4 Relay
  0 siblings, 2 replies; 9+ messages in thread
From: Simon Baatz via B4 Relay @ 2026-02-22 12:35 UTC (permalink / raw)
  To: Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller,
	David Ahern, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Shuah Khan
  Cc: netdev, linux-kernel, linux-kselftest, Simon Baatz

Hi,

this series restores the ability to accept in‑sequence FIN packets 
even when the advertised receive window is zero, and adds a 
packetdrill test to guard the behavior.

Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
---
Simon Baatz (2):
      tcp: re-enable acceptance of FIN packets when RWIN is 0
      selftests/net: packetdrill: Verify acceptance of FIN packets when RWIN is 0

 net/ipv4/tcp_input.c                               |  7 +++++-
 .../net/packetdrill/tcp_rcv_zero_wnd_fin.pkt       | 27 ++++++++++++++++++++++
 2 files changed, 33 insertions(+), 1 deletion(-)
---
base-commit: d4f687fbbce45b5e88438e89b5e26c0c15847992
change-id: 20260221-fix_zero_wnd_fin-d1ba11cd3b07

Best regards,
-- 
Simon Baatz <gmbnomis@gmail.com>



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH net 1/2] tcp: re-enable acceptance of FIN packets when RWIN is 0
  2026-02-22 12:35 [PATCH net 0/2] tcp: re-enable acceptance of FIN packets when RWIN is 0 Simon Baatz via B4 Relay
@ 2026-02-22 12:35 ` Simon Baatz via B4 Relay
  2026-02-23  1:36   ` Kuniyuki Iwashima
  2026-02-23  8:01   ` Eric Dumazet
  2026-02-22 12:35 ` [PATCH net 2/2] selftests/net: packetdrill: Verify " Simon Baatz via B4 Relay
  1 sibling, 2 replies; 9+ messages in thread
From: Simon Baatz via B4 Relay @ 2026-02-22 12:35 UTC (permalink / raw)
  To: Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller,
	David Ahern, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Shuah Khan
  Cc: netdev, linux-kernel, linux-kselftest, Simon Baatz

From: Simon Baatz <gmbnomis@gmail.com>

Commit 2bd99aef1b19 ("tcp: accept bare FIN packets under memory
pressure") allowed accepting FIN packets in tcp_data_queue() even when
the receive window was closed, to prevent ACK/FIN loops with broken
clients.

Such a FIN packet is in sequence, but because the FIN consumes a
sequence number, it extends beyond the window. Before commit
9ca48d616ed7 ("tcp: do not accept packets beyond window"),
tcp_sequence() only required the seq to be within the window. After
that change, the entire packet (including the FIN) must fit within the
window. As a result, such FIN packets are now dropped and the handling
path is no longer reached.

Be more lenient by not counting the sequence number consumed by the
FIN when calling tcp_sequence(), restoring the previous behavior for
cases where only the FIN extends beyond the window.

Fixes: 9ca48d616ed7 ("tcp: do not accept packets beyond window")
Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
---
 net/ipv4/tcp_input.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index e7b41abb82aad33d8cab4fcfa989cc4771149b41..fde612f12d3625000081958d13cec8779684a642 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6379,7 +6379,12 @@ static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb,

 step1:
 	/* Step 1: check sequence number */
-	reason = tcp_sequence(sk, TCP_SKB_CB(skb)->seq, TCP_SKB_CB(skb)->end_seq);
+
+	/* Some stacks are known to handle FIN incorrectly; allow the FIN
+	 * to extend beyond the window and check it in detail later.
+	 */
+	reason = tcp_sequence(sk, TCP_SKB_CB(skb)->seq,
+			      TCP_SKB_CB(skb)->end_seq - th->fin);
 	if (reason) {
 		/* RFC793, page 37: "In all states except SYN-SENT, all reset
 		 * (RST) segments are validated by checking their SEQ-fields."

-- 
2.53.0

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net 2/2] selftests/net: packetdrill: Verify acceptance of FIN packets when RWIN is 0
  2026-02-22 12:35 [PATCH net 0/2] tcp: re-enable acceptance of FIN packets when RWIN is 0 Simon Baatz via B4 Relay
  2026-02-22 12:35 ` [PATCH net 1/2] " Simon Baatz via B4 Relay
@ 2026-02-22 12:35 ` Simon Baatz via B4 Relay
  2026-02-23  1:37   ` Kuniyuki Iwashima
  1 sibling, 1 reply; 9+ messages in thread
From: Simon Baatz via B4 Relay @ 2026-02-22 12:35 UTC (permalink / raw)
  To: Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller,
	David Ahern, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Shuah Khan
  Cc: netdev, linux-kernel, linux-kselftest, Simon Baatz

From: Simon Baatz <gmbnomis@gmail.com>

Add a packetdrill test that verifies we accept bare FIN packets when
the advertised receive window is zero.

Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
---
 .../net/packetdrill/tcp_rcv_zero_wnd_fin.pkt       | 27 ++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/tools/testing/selftests/net/packetdrill/tcp_rcv_zero_wnd_fin.pkt b/tools/testing/selftests/net/packetdrill/tcp_rcv_zero_wnd_fin.pkt
new file mode 100644
index 0000000000000000000000000000000000000000..e245359a1a91d4b6d7ef64496681cea57792cb71
--- /dev/null
+++ b/tools/testing/selftests/net/packetdrill/tcp_rcv_zero_wnd_fin.pkt
@@ -0,0 +1,27 @@
+// SPDX-License-Identifier: GPL-2.0
+
+// Some TCP stacks send FINs even though the window is closed. We break
+// a possible FIN/ACK loop by accepting the FIN.
+
+--mss=1000
+
+`./defaults.sh`
+
+// Establish a connection.
+   +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+   +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+   +0 setsockopt(3, SOL_SOCKET, SO_RCVBUF, [20000], 4) = 0
+   +0 bind(3, ..., ...) = 0
+   +0 listen(3, 1) = 0
+
+   +0 < S 0:0(0) win 32792 <mss 1000,nop,wscale 7>
+   +0 > S. 0:0(0) ack 1 <mss 1460,nop,wscale 0>
+   +0 < . 1:1(0) ack 1 win 257
+
+   +0 accept(3, ..., ...) = 4
+
+   +0 < P. 1:60001(60000) ack 1 win 257
+    * > .  1:1(0) ack 60001 win 0
+
+   +0 < F. 60001:60001(0) ack 1 win 257
+   +0 > . 1:1(0) ack 60002 win 0

-- 
2.53.0



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH net 1/2] tcp: re-enable acceptance of FIN packets when RWIN is 0
  2026-02-22 12:35 ` [PATCH net 1/2] " Simon Baatz via B4 Relay
@ 2026-02-23  1:36   ` Kuniyuki Iwashima
  2026-02-23  8:01   ` Eric Dumazet
  1 sibling, 0 replies; 9+ messages in thread
From: Kuniyuki Iwashima @ 2026-02-23  1:36 UTC (permalink / raw)
  To: gmbnomis
  Cc: Eric Dumazet, Neal Cardwell, David S. Miller, David Ahern,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan, netdev,
	linux-kernel, linux-kselftest

On Sun, Feb 22, 2026 at 4:35 AM Simon Baatz via B4 Relay
<devnull+gmbnomis.gmail.com@kernel.org> wrote:
>
> From: Simon Baatz <gmbnomis@gmail.com>
>
> Commit 2bd99aef1b19 ("tcp: accept bare FIN packets under memory
> pressure") allowed accepting FIN packets in tcp_data_queue() even when
> the receive window was closed, to prevent ACK/FIN loops with broken
> clients.
>
> Such a FIN packet is in sequence, but because the FIN consumes a
> sequence number, it extends beyond the window. Before commit
> 9ca48d616ed7 ("tcp: do not accept packets beyond window"),
> tcp_sequence() only required the seq to be within the window. After
> that change, the entire packet (including the FIN) must fit within the
> window. As a result, such FIN packets are now dropped and the handling
> path is no longer reached.
>
> Be more lenient by not counting the sequence number consumed by the
> FIN when calling tcp_sequence(), restoring the previous behavior for
> cases where only the FIN extends beyond the window.
>
> Fixes: 9ca48d616ed7 ("tcp: do not accept packets beyond window")
> Signed-off-by: Simon Baatz <gmbnomis@gmail.com>

Good catch, thanks !

Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net 2/2] selftests/net: packetdrill: Verify acceptance of FIN packets when RWIN is 0
  2026-02-22 12:35 ` [PATCH net 2/2] selftests/net: packetdrill: Verify " Simon Baatz via B4 Relay
@ 2026-02-23  1:37   ` Kuniyuki Iwashima
  0 siblings, 0 replies; 9+ messages in thread
From: Kuniyuki Iwashima @ 2026-02-23  1:37 UTC (permalink / raw)
  To: gmbnomis
  Cc: Eric Dumazet, Neal Cardwell, David S. Miller, David Ahern,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan, netdev,
	linux-kernel, linux-kselftest

On Sun, Feb 22, 2026 at 4:35 AM Simon Baatz via B4 Relay
<devnull+gmbnomis.gmail.com@kernel.org> wrote:
>
> From: Simon Baatz <gmbnomis@gmail.com>
>
> Add a packetdrill test that verifies we accept bare FIN packets when
> the advertised receive window is zero.
>
> Signed-off-by: Simon Baatz <gmbnomis@gmail.com>

Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>


> ---
>  .../net/packetdrill/tcp_rcv_zero_wnd_fin.pkt       | 27 ++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
>
> diff --git a/tools/testing/selftests/net/packetdrill/tcp_rcv_zero_wnd_fin.pkt b/tools/testing/selftests/net/packetdrill/tcp_rcv_zero_wnd_fin.pkt
> new file mode 100644
> index 0000000000000000000000000000000000000000..e245359a1a91d4b6d7ef64496681cea57792cb71
> --- /dev/null
> +++ b/tools/testing/selftests/net/packetdrill/tcp_rcv_zero_wnd_fin.pkt
> @@ -0,0 +1,27 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +// Some TCP stacks send FINs even though the window is closed. We break
> +// a possible FIN/ACK loop by accepting the FIN.
> +
> +--mss=1000
> +
> +`./defaults.sh`
> +
> +// Establish a connection.
> +   +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
> +   +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
> +   +0 setsockopt(3, SOL_SOCKET, SO_RCVBUF, [20000], 4) = 0
> +   +0 bind(3, ..., ...) = 0
> +   +0 listen(3, 1) = 0
> +
> +   +0 < S 0:0(0) win 32792 <mss 1000,nop,wscale 7>
> +   +0 > S. 0:0(0) ack 1 <mss 1460,nop,wscale 0>
> +   +0 < . 1:1(0) ack 1 win 257
> +
> +   +0 accept(3, ..., ...) = 4
> +
> +   +0 < P. 1:60001(60000) ack 1 win 257
> +    * > .  1:1(0) ack 60001 win 0
> +
> +   +0 < F. 60001:60001(0) ack 1 win 257
> +   +0 > . 1:1(0) ack 60002 win 0
>
> --
> 2.53.0
>
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net 1/2] tcp: re-enable acceptance of FIN packets when RWIN is 0
  2026-02-22 12:35 ` [PATCH net 1/2] " Simon Baatz via B4 Relay
  2026-02-23  1:36   ` Kuniyuki Iwashima
@ 2026-02-23  8:01   ` Eric Dumazet
  2026-02-23  8:36     ` Eric Dumazet
  1 sibling, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2026-02-23  8:01 UTC (permalink / raw)
  To: gmbnomis
  Cc: Neal Cardwell, Kuniyuki Iwashima, David S. Miller, David Ahern,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan, netdev,
	linux-kernel, linux-kselftest

On Sun, Feb 22, 2026 at 1:35 PM Simon Baatz via B4 Relay
<devnull+gmbnomis.gmail.com@kernel.org> wrote:
>
> From: Simon Baatz <gmbnomis@gmail.com>
>
> Commit 2bd99aef1b19 ("tcp: accept bare FIN packets under memory
> pressure") allowed accepting FIN packets in tcp_data_queue() even when
> the receive window was closed, to prevent ACK/FIN loops with broken
> clients.
>
> Such a FIN packet is in sequence, but because the FIN consumes a
> sequence number, it extends beyond the window. Before commit
> 9ca48d616ed7 ("tcp: do not accept packets beyond window"),
> tcp_sequence() only required the seq to be within the window. After
> that change, the entire packet (including the FIN) must fit within the
> window. As a result, such FIN packets are now dropped and the handling
> path is no longer reached.
>
> Be more lenient by not counting the sequence number consumed by the
> FIN when calling tcp_sequence(), restoring the previous behavior for
> cases where only the FIN extends beyond the window.
>
> Fixes: 9ca48d616ed7 ("tcp: do not accept packets beyond window")

OK, but this commit is fine ? It seems the issue is coming from buggy peers ?

Eventually the receive queue would be drained by the application, the
peer would retransmit
this FIN, and it would be accepted.

> Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
> ---
>  net/ipv4/tcp_input.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index e7b41abb82aad33d8cab4fcfa989cc4771149b41..fde612f12d3625000081958d13cec8779684a642 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -6379,7 +6379,12 @@ static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb,
>
>  step1:
>         /* Step 1: check sequence number */
> -       reason = tcp_sequence(sk, TCP_SKB_CB(skb)->seq, TCP_SKB_CB(skb)->end_seq);
> +
> +       /* Some stacks are known to handle FIN incorrectly; allow the FIN
> +        * to extend beyond the window and check it in detail later.
> +        */
> +       reason = tcp_sequence(sk, TCP_SKB_CB(skb)->seq,
> +                             TCP_SKB_CB(skb)->end_seq - th->fin);

I don't think this is the right fix. Basically it says that FIN do not
count, but TCP RFC says otherwise.

It also adds code in TCP fast path.

See for reference

commit 22555032c513e62fe744d4cdd553539897e8e922
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Apr 18 21:45:58 2024 +0000

    tcp: remove dubious FIN exception from tcp_cwnd_test()

    tcp_cwnd_test() has a special handing for the last packet in
    the write queue if it is smaller than one MSS and has the FIN flag.

    This is in violation of TCP RFC, and seems quite dubious.

    This packet can be sent only if the current CWND is bigger
    than the number of packets in flight.

    Making tcp_cwnd_test() result independent of the first skb
    in the write queue is needed for the last patch of the series.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Link: https://lore.kernel.org/r/20240418214600.1291486-2-edumazet@google.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net 1/2] tcp: re-enable acceptance of FIN packets when RWIN is 0
  2026-02-23  8:01   ` Eric Dumazet
@ 2026-02-23  8:36     ` Eric Dumazet
  2026-02-23 17:16       ` Simon Baatz
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2026-02-23  8:36 UTC (permalink / raw)
  To: gmbnomis
  Cc: Neal Cardwell, Kuniyuki Iwashima, David S. Miller, David Ahern,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan, netdev,
	linux-kernel, linux-kselftest

On Mon, Feb 23, 2026 at 9:01 AM Eric Dumazet <edumazet@google.com> wrote:
>
> On Sun, Feb 22, 2026 at 1:35 PM Simon Baatz via B4 Relay
> <devnull+gmbnomis.gmail.com@kernel.org> wrote:
> >
> > From: Simon Baatz <gmbnomis@gmail.com>
> >
> > Commit 2bd99aef1b19 ("tcp: accept bare FIN packets under memory
> > pressure") allowed accepting FIN packets in tcp_data_queue() even when
> > the receive window was closed, to prevent ACK/FIN loops with broken
> > clients.
> >
> > Such a FIN packet is in sequence, but because the FIN consumes a
> > sequence number, it extends beyond the window. Before commit
> > 9ca48d616ed7 ("tcp: do not accept packets beyond window"),
> > tcp_sequence() only required the seq to be within the window. After
> > that change, the entire packet (including the FIN) must fit within the
> > window. As a result, such FIN packets are now dropped and the handling
> > path is no longer reached.
> >
> > Be more lenient by not counting the sequence number consumed by the
> > FIN when calling tcp_sequence(), restoring the previous behavior for
> > cases where only the FIN extends beyond the window.
> >
> > Fixes: 9ca48d616ed7 ("tcp: do not accept packets beyond window")
>
> OK, but this commit is fine ? It seems the issue is coming from buggy peers ?
>
> Eventually the receive queue would be drained by the application, the
> peer would retransmit
> this FIN, and it would be accepted.
>
> > Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
> > ---
> >  net/ipv4/tcp_input.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> > index e7b41abb82aad33d8cab4fcfa989cc4771149b41..fde612f12d3625000081958d13cec8779684a642 100644
> > --- a/net/ipv4/tcp_input.c
> > +++ b/net/ipv4/tcp_input.c
> > @@ -6379,7 +6379,12 @@ static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb,
> >
> >  step1:
> >         /* Step 1: check sequence number */
> > -       reason = tcp_sequence(sk, TCP_SKB_CB(skb)->seq, TCP_SKB_CB(skb)->end_seq);
> > +
> > +       /* Some stacks are known to handle FIN incorrectly; allow the FIN
> > +        * to extend beyond the window and check it in detail later.
> > +        */
> > +       reason = tcp_sequence(sk, TCP_SKB_CB(skb)->seq,
> > +                             TCP_SKB_CB(skb)->end_seq - th->fin);
>
> I don't think this is the right fix. Basically it says that FIN do not
> count, but TCP RFC says otherwise.
>
> It also adds code in TCP fast path.

We can keep fast path unchanged with this variant.

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index e7b41abb82aad33d8cab4fcfa989cc4771149b41..156c92450f3ed00357aff2ef3e586b83f3cecb5e
100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4858,15 +4858,24 @@ static enum skb_drop_reason
tcp_disordered_ack_check(const struct sock *sk,
  */

 static enum skb_drop_reason tcp_sequence(const struct sock *sk,
-                                        u32 seq, u32 end_seq)
+                                        u32 seq, u32 end_seq,
+                                        const struct tcphdr *th)
 {
        const struct tcp_sock *tp = tcp_sk(sk);
+       u32 seq_limit;

        if (before(end_seq, tp->rcv_wup))
                return SKB_DROP_REASON_TCP_OLD_SEQUENCE;

-       if (after(end_seq, tp->rcv_nxt + tcp_receive_window(tp))) {
-               if (after(seq, tp->rcv_nxt + tcp_receive_window(tp)))
+       seq_limit = tp->rcv_nxt + tcp_receive_window(tp);
+       if (unlikely(after(end_seq, seq_limit))) {
+               /* Some stacks are known to handle FIN incorrectly;
allow the FIN
+                * to extend beyond the window and check it in detail later.
+                */
+               if (!after(end_seq - th->fin, seq_limit))
+                       return SKB_NOT_DROPPED_YET;
+
+               if (after(seq, seq_limit))
                        return SKB_DROP_REASON_TCP_INVALID_SEQUENCE;

                /* Only accept this packet if receive queue is empty. */
@@ -6379,7 +6388,8 @@ static bool tcp_validate_incoming(struct sock
*sk, struct sk_buff *skb,

 step1:
        /* Step 1: check sequence number */
-       reason = tcp_sequence(sk, TCP_SKB_CB(skb)->seq,
TCP_SKB_CB(skb)->end_seq);
+       reason = tcp_sequence(sk, TCP_SKB_CB(skb)->seq,
+                             TCP_SKB_CB(skb)->end_seq, th);
        if (reason) {
                /* RFC793, page 37: "In all states except SYN-SENT, all reset
                 * (RST) segments are validated by checking their SEQ-fields."

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net 1/2] tcp: re-enable acceptance of FIN packets when RWIN is 0
  2026-02-23  8:36     ` Eric Dumazet
@ 2026-02-23 17:16       ` Simon Baatz
  2026-02-23 17:38         ` Eric Dumazet
  0 siblings, 1 reply; 9+ messages in thread
From: Simon Baatz @ 2026-02-23 17:16 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Neal Cardwell, Kuniyuki Iwashima, David S. Miller, David Ahern,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan, netdev,
	linux-kernel, linux-kselftest

Hi Eric,

On Mon, Feb 23, 2026 at 09:36:38AM +0100, Eric Dumazet wrote:
> On Mon, Feb 23, 2026 at 9:01???AM Eric Dumazet <edumazet@google.com> wrote:
> ...
> > >
> > > Fixes: 9ca48d616ed7 ("tcp: do not accept packets beyond window")
> >
> > OK, but this commit is fine ? It seems the issue is coming from buggy peers ?

That "Fixes" tag is a technicality, I did not want to imply that the
commit is broken: 9ca48d616ed7 (which is RFC compliant) turns the
workaround (which isn't RFC compliant) 2bd99aef1b19 ("tcp:
accept bare FIN packets under memory pressure") into dead code.

If we still need that workaround, technically, this is a regression
caused by 9ca48d616ed7.  If not, I am perfectly happy to propose a
commit that removes the dead code.

> > Eventually the receive queue would be drained by the application, the
> > peer would retransmit
> > this FIN, and it would be accepted.

If I understand the problem correctly, the workaround was introduced
to break a FIN/ACK loop because the broken peer (macOS) did not use
exponential backoff in that scenario.  So yes, it would be accepted,
but until then we and the peer would ping-pong FIN/ACKs.

But as said, if we don't need the workaround I can submit a patch to
remove it.

> ...
> > > +       reason = tcp_sequence(sk, TCP_SKB_CB(skb)->seq,
> > > +                             TCP_SKB_CB(skb)->end_seq - th->fin);
> >
> > I don't think this is the right fix. Basically it says that FIN do not
> > count, but TCP RFC says otherwise.

We can be more specific and only allow that if we have a zero window. 
(I thought I would not hurt much to accept that FIN which does not
take up real space)

> > It also adds code in TCP fast path.

Hmm, the call site for tcp_validate_incoming() in
tcp_rcv_established() is:

	/*
	 *	Standard slow path.
	 */
validate:
	if (!tcp_validate_incoming(sk, skb, th, 1))
		return;


Am I missing something?

 
> We can keep fast path unchanged with this variant.
> 
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index e7b41abb82aad33d8cab4fcfa989cc4771149b41..156c92450f3ed00357aff2ef3e586b83f3cecb5e
> 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -4858,15 +4858,24 @@ static enum skb_drop_reason
> tcp_disordered_ack_check(const struct sock *sk,
>   */
> 
>  static enum skb_drop_reason tcp_sequence(const struct sock *sk,
> -                                        u32 seq, u32 end_seq)
> +                                        u32 seq, u32 end_seq,
> +                                        const struct tcphdr *th)
>  {
>         const struct tcp_sock *tp = tcp_sk(sk);
> +       u32 seq_limit;
> 
>         if (before(end_seq, tp->rcv_wup))
>                 return SKB_DROP_REASON_TCP_OLD_SEQUENCE;
> 
> -       if (after(end_seq, tp->rcv_nxt + tcp_receive_window(tp))) {
> -               if (after(seq, tp->rcv_nxt + tcp_receive_window(tp)))
> +       seq_limit = tp->rcv_nxt + tcp_receive_window(tp);
> +       if (unlikely(after(end_seq, seq_limit))) {
> +               /* Some stacks are known to handle FIN incorrectly;
> allow the FIN
> +                * to extend beyond the window and check it in detail later.
> +                */
> +               if (!after(end_seq - th->fin, seq_limit))
> +                       return SKB_NOT_DROPPED_YET;
> +
> +               if (after(seq, seq_limit))
>                         return SKB_DROP_REASON_TCP_INVALID_SEQUENCE;
> 
>                 /* Only accept this packet if receive queue is empty. */
> @@ -6379,7 +6388,8 @@ static bool tcp_validate_incoming(struct sock
> *sk, struct sk_buff *skb,
> 
>  step1:
>         /* Step 1: check sequence number */
> -       reason = tcp_sequence(sk, TCP_SKB_CB(skb)->seq,
> TCP_SKB_CB(skb)->end_seq);
> +       reason = tcp_sequence(sk, TCP_SKB_CB(skb)->seq,
> +                             TCP_SKB_CB(skb)->end_seq, th);
>         if (reason) {
>                 /* RFC793, page 37: "In all states except SYN-SENT, all reset
>                  * (RST) segments are validated by checking their SEQ-fields."

Sure, I can do that.  Do you want me to add the
tcp_receive_window(tp) == 0 check to narrow it down further?

(And a dumb question: As you are the author of that code, how do I
attribute that commit when submitting a v2?)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net 1/2] tcp: re-enable acceptance of FIN packets when RWIN is 0
  2026-02-23 17:16       ` Simon Baatz
@ 2026-02-23 17:38         ` Eric Dumazet
  0 siblings, 0 replies; 9+ messages in thread
From: Eric Dumazet @ 2026-02-23 17:38 UTC (permalink / raw)
  To: Simon Baatz
  Cc: Neal Cardwell, Kuniyuki Iwashima, David S. Miller, David Ahern,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan, netdev,
	linux-kernel, linux-kselftest

On Mon, Feb 23, 2026 at 6:16 PM Simon Baatz <gmbnomis@gmail.com> wrote:
>
> Hi Eric,
>
> On Mon, Feb 23, 2026 at 09:36:38AM +0100, Eric Dumazet wrote:
> > On Mon, Feb 23, 2026 at 9:01???AM Eric Dumazet <edumazet@google.com> wrote:
> > ...
> > > >
> > > > Fixes: 9ca48d616ed7 ("tcp: do not accept packets beyond window")
> > >
> > > OK, but this commit is fine ? It seems the issue is coming from buggy peers ?
>
> That "Fixes" tag is a technicality, I did not want to imply that the
> commit is broken: 9ca48d616ed7 (which is RFC compliant) turns the
> workaround (which isn't RFC compliant) 2bd99aef1b19 ("tcp:
> accept bare FIN packets under memory pressure") into dead code.
>
> If we still need that workaround, technically, this is a regression
> caused by 9ca48d616ed7.  If not, I am perfectly happy to propose a
> commit that removes the dead code.
>
> > > Eventually the receive queue would be drained by the application, the
> > > peer would retransmit
> > > this FIN, and it would be accepted.
>
> If I understand the problem correctly, the workaround was introduced
> to break a FIN/ACK loop because the broken peer (macOS) did not use
> exponential backoff in that scenario.  So yes, it would be accepted,
> but until then we and the peer would ping-pong FIN/ACKs.
>
> But as said, if we don't need the workaround I can submit a patch to
> remove it.
>
> > ...
> > > > +       reason = tcp_sequence(sk, TCP_SKB_CB(skb)->seq,
> > > > +                             TCP_SKB_CB(skb)->end_seq - th->fin);
> > >
> > > I don't think this is the right fix. Basically it says that FIN do not
> > > count, but TCP RFC says otherwise.
>
> We can be more specific and only allow that if we have a zero window.
> (I thought I would not hurt much to accept that FIN which does not
> take up real space)
>
> > > It also adds code in TCP fast path.
>
> Hmm, the call site for tcp_validate_incoming() in
> tcp_rcv_established() is:
>
>         /*
>          *      Standard slow path.
>          */
> validate:
>         if (!tcp_validate_incoming(sk, skb, th, 1))
>                 return;
>
>
> Am I missing something?

The 'slow path' comment does not mean everything from this can be really slow.
'slow path' is quite often used, header prediction is less and less a thing.

>
>
> > We can keep fast path unchanged with this variant.
> >
> > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> > index e7b41abb82aad33d8cab4fcfa989cc4771149b41..156c92450f3ed00357aff2ef3e586b83f3cecb5e
> > 100644
> > --- a/net/ipv4/tcp_input.c
> > +++ b/net/ipv4/tcp_input.c
> > @@ -4858,15 +4858,24 @@ static enum skb_drop_reason
> > tcp_disordered_ack_check(const struct sock *sk,
> >   */
> >
> >  static enum skb_drop_reason tcp_sequence(const struct sock *sk,
> > -                                        u32 seq, u32 end_seq)
> > +                                        u32 seq, u32 end_seq,
> > +                                        const struct tcphdr *th)
> >  {
> >         const struct tcp_sock *tp = tcp_sk(sk);
> > +       u32 seq_limit;
> >
> >         if (before(end_seq, tp->rcv_wup))
> >                 return SKB_DROP_REASON_TCP_OLD_SEQUENCE;
> >
> > -       if (after(end_seq, tp->rcv_nxt + tcp_receive_window(tp))) {
> > -               if (after(seq, tp->rcv_nxt + tcp_receive_window(tp)))
> > +       seq_limit = tp->rcv_nxt + tcp_receive_window(tp);
> > +       if (unlikely(after(end_seq, seq_limit))) {
> > +               /* Some stacks are known to handle FIN incorrectly;
> > allow the FIN
> > +                * to extend beyond the window and check it in detail later.
> > +                */
> > +               if (!after(end_seq - th->fin, seq_limit))
> > +                       return SKB_NOT_DROPPED_YET;
> > +
> > +               if (after(seq, seq_limit))
> >                         return SKB_DROP_REASON_TCP_INVALID_SEQUENCE;
> >
> >                 /* Only accept this packet if receive queue is empty. */
> > @@ -6379,7 +6388,8 @@ static bool tcp_validate_incoming(struct sock
> > *sk, struct sk_buff *skb,
> >
> >  step1:
> >         /* Step 1: check sequence number */
> > -       reason = tcp_sequence(sk, TCP_SKB_CB(skb)->seq,
> > TCP_SKB_CB(skb)->end_seq);
> > +       reason = tcp_sequence(sk, TCP_SKB_CB(skb)->seq,
> > +                             TCP_SKB_CB(skb)->end_seq, th);
> >         if (reason) {
> >                 /* RFC793, page 37: "In all states except SYN-SENT, all reset
> >                  * (RST) segments are validated by checking their SEQ-fields."
>
> Sure, I can do that.  Do you want me to add the
> tcp_receive_window(tp) == 0 check to narrow it down further?

Not sure what you mean.

>
> (And a dumb question: As you are the author of that code, how do I
> attribute that commit when submitting a v2?)

No worries, this is part of the review process, I give you permission to
be the author, even if this happens to be something I wrote.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2026-02-23 17:38 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-02-22 12:35 [PATCH net 0/2] tcp: re-enable acceptance of FIN packets when RWIN is 0 Simon Baatz via B4 Relay
2026-02-22 12:35 ` [PATCH net 1/2] " Simon Baatz via B4 Relay
2026-02-23  1:36   ` Kuniyuki Iwashima
2026-02-23  8:01   ` Eric Dumazet
2026-02-23  8:36     ` Eric Dumazet
2026-02-23 17:16       ` Simon Baatz
2026-02-23 17:38         ` Eric Dumazet
2026-02-22 12:35 ` [PATCH net 2/2] selftests/net: packetdrill: Verify " Simon Baatz via B4 Relay
2026-02-23  1:37   ` Kuniyuki Iwashima

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox