netdev.vger.kernel.org archive mirror
* [PATCH net-next 0/8] tcp: receiver changes
@ 2025-07-11 11:39 Eric Dumazet
  2025-07-11 11:39 ` [PATCH net-next 1/8] tcp: do not accept packets beyond window Eric Dumazet
                   ` (9 more replies)
  0 siblings, 10 replies; 29+ messages in thread
From: Eric Dumazet @ 2025-07-11 11:39 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni, Neal Cardwell
  Cc: Simon Horman, Kuniyuki Iwashima, Willem de Bruijn, netdev,
	eric.dumazet, Eric Dumazet

Before accepting an incoming packet:

- Make sure not to accept a packet beyond the advertised RWIN.
  If such a packet arrives, increment a new SNMP counter
  (LINUX_MIB_BEYOND_WINDOW).

- Out-of-order (ooo) packets should update rcv_mss and tp->scaling_ratio.

- Make sure not to accept a packet beyond the sk_rcvbuf limit.

This series includes three associated packetdrill tests.

Eric Dumazet (8):
  tcp: do not accept packets beyond window
  tcp: add LINUX_MIB_BEYOND_WINDOW
  selftests/net: packetdrill: add tcp_rcv_big_endseq.pkt
  tcp: call tcp_measure_rcv_mss() for ooo packets
  selftests/net: packetdrill: add tcp_ooo_rcv_mss.pkt
  tcp: add const to tcp_try_rmem_schedule() and sk_rmem_schedule() skb
  tcp: stronger sk_rcvbuf checks
  selftests/net: packetdrill: add tcp_rcv_toobig.pkt

 .../networking/net_cachelines/snmp.rst        |  1 +
 include/net/dropreason-core.h                 |  9 +++-
 include/net/sock.h                            |  2 +-
 include/uapi/linux/snmp.h                     |  1 +
 net/ipv4/proc.c                               |  1 +
 net/ipv4/tcp_input.c                          | 48 ++++++++++++++-----
 .../net/packetdrill/tcp_ooo_rcv_mss.pkt       | 27 +++++++++++
 .../net/packetdrill/tcp_rcv_big_endseq.pkt    | 44 +++++++++++++++++
 .../net/packetdrill/tcp_rcv_toobig.pkt        | 33 +++++++++++++
 9 files changed, 152 insertions(+), 14 deletions(-)
 create mode 100644 tools/testing/selftests/net/packetdrill/tcp_ooo_rcv_mss.pkt
 create mode 100644 tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt
 create mode 100644 tools/testing/selftests/net/packetdrill/tcp_rcv_toobig.pkt

-- 
2.50.0.727.gbf7dc18ff4-goog



* [PATCH net-next 1/8] tcp: do not accept packets beyond window
  2025-07-11 11:39 [PATCH net-next 0/8] tcp: receiver changes Eric Dumazet
@ 2025-07-11 11:39 ` Eric Dumazet
  2025-07-12 20:52   ` Kuniyuki Iwashima
  2025-07-15  1:38   ` Jakub Kicinski
  2025-07-11 11:40 ` [PATCH net-next 2/8] tcp: add LINUX_MIB_BEYOND_WINDOW Eric Dumazet
                   ` (8 subsequent siblings)
  9 siblings, 2 replies; 29+ messages in thread
From: Eric Dumazet @ 2025-07-11 11:39 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni, Neal Cardwell
  Cc: Simon Horman, Kuniyuki Iwashima, Willem de Bruijn, netdev,
	eric.dumazet, Eric Dumazet

Currently, TCP accepts incoming packets which might go beyond the
offered RWIN.

Add to tcp_sequence() the validation of packet end sequence.

Add the corresponding check in the fast path.

We relax this new constraint if the receive queue is empty,
to not freeze flows from buggy peers.

Add a new drop reason : SKB_DROP_REASON_TCP_INVALID_END_SEQUENCE.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/dropreason-core.h |  7 ++++++-
 net/ipv4/tcp_input.c          | 22 +++++++++++++++++-----
 2 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/include/net/dropreason-core.h b/include/net/dropreason-core.h
index b9e78290269e6e7d9d9155171f6b0ef03c7697c9..d88ff9a75d15fe60a961332a7eb4be94c5c7c3ec 100644
--- a/include/net/dropreason-core.h
+++ b/include/net/dropreason-core.h
@@ -45,6 +45,7 @@
 	FN(TCP_LISTEN_OVERFLOW)		\
 	FN(TCP_OLD_SEQUENCE)		\
 	FN(TCP_INVALID_SEQUENCE)	\
+	FN(TCP_INVALID_END_SEQUENCE)	\
 	FN(TCP_INVALID_ACK_SEQUENCE)	\
 	FN(TCP_RESET)			\
 	FN(TCP_INVALID_SYN)		\
@@ -303,8 +304,12 @@ enum skb_drop_reason {
 	SKB_DROP_REASON_TCP_LISTEN_OVERFLOW,
 	/** @SKB_DROP_REASON_TCP_OLD_SEQUENCE: Old SEQ field (duplicate packet) */
 	SKB_DROP_REASON_TCP_OLD_SEQUENCE,
-	/** @SKB_DROP_REASON_TCP_INVALID_SEQUENCE: Not acceptable SEQ field */
+	/** @SKB_DROP_REASON_TCP_INVALID_SEQUENCE: Not acceptable SEQ field.
+	 */
 	SKB_DROP_REASON_TCP_INVALID_SEQUENCE,
+	/** @SKB_DROP_REASON_TCP_INVALID_END_SEQUENCE: Not acceptable END_SEQ field.
+	 */
+	SKB_DROP_REASON_TCP_INVALID_END_SEQUENCE,
 	/**
 	 * @SKB_DROP_REASON_TCP_INVALID_ACK_SEQUENCE: Not acceptable ACK SEQ
 	 * field because ack sequence is not in the window between snd_una
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 9b03c44c12b862b5d33f4390cfc85e2f8897cd8e..f0f9c78654b449cb2a122e8c53fdcc96e5317de7 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4391,14 +4391,22 @@ static enum skb_drop_reason tcp_disordered_ack_check(const struct sock *sk,
  * (borrowed from freebsd)
  */
 
-static enum skb_drop_reason tcp_sequence(const struct tcp_sock *tp,
+static enum skb_drop_reason tcp_sequence(const struct sock *sk,
 					 u32 seq, u32 end_seq)
 {
+	const struct tcp_sock *tp = tcp_sk(sk);
+
 	if (before(end_seq, tp->rcv_wup))
 		return SKB_DROP_REASON_TCP_OLD_SEQUENCE;
 
-	if (after(seq, tp->rcv_nxt + tcp_receive_window(tp)))
-		return SKB_DROP_REASON_TCP_INVALID_SEQUENCE;
+	if (after(end_seq, tp->rcv_nxt + tcp_receive_window(tp))) {
+		if (after(seq, tp->rcv_nxt + tcp_receive_window(tp)))
+			return SKB_DROP_REASON_TCP_INVALID_SEQUENCE;
+
+		/* Only accept this packet if receive queue is empty. */
+		if (skb_queue_len(&sk->sk_receive_queue))
+			return SKB_DROP_REASON_TCP_INVALID_END_SEQUENCE;
+	}
 
 	return SKB_NOT_DROPPED_YET;
 }
@@ -5881,7 +5889,7 @@ static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb,
 
 step1:
 	/* Step 1: check sequence number */
-	reason = tcp_sequence(tp, TCP_SKB_CB(skb)->seq, TCP_SKB_CB(skb)->end_seq);
+	reason = tcp_sequence(sk, TCP_SKB_CB(skb)->seq, TCP_SKB_CB(skb)->end_seq);
 	if (reason) {
 		/* RFC793, page 37: "In all states except SYN-SENT, all reset
 		 * (RST) segments are validated by checking their SEQ-fields."
@@ -6110,6 +6118,10 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb)
 			if (tcp_checksum_complete(skb))
 				goto csum_error;
 
+			if (after(TCP_SKB_CB(skb)->end_seq,
+				  tp->rcv_nxt + tcp_receive_window(tp)))
+				goto validate;
+
 			if ((int)skb->truesize > sk->sk_forward_alloc)
 				goto step5;
 
@@ -6165,7 +6177,7 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb)
 	/*
 	 *	Standard slow path.
 	 */
-
+validate:
 	if (!tcp_validate_incoming(sk, skb, th, 1))
 		return;
 
-- 
2.50.0.727.gbf7dc18ff4-goog
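
For readers following the thread, here is a minimal user-space sketch of the
decision order the patch above gives tcp_sequence(). It is not the kernel
code: struct rx_state, check_sequence(), the seq_before()/seq_after() helpers
and the verdict names are hypothetical stand-ins, and the window is passed in
as a plain field rather than computed by tcp_receive_window().

```c
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe 32-bit sequence comparisons (same idea as before()/after()). */
static bool seq_before(uint32_t a, uint32_t b) { return (int32_t)(a - b) < 0; }
static bool seq_after(uint32_t a, uint32_t b)  { return seq_before(b, a); }

/* Hypothetical verdicts standing in for the SKB_DROP_REASON_* values. */
enum verdict {
	ACCEPT,
	DROP_OLD_SEQUENCE,
	DROP_INVALID_SEQUENCE,
	DROP_INVALID_END_SEQUENCE,
};

/* Hypothetical subset of receiver state; not the kernel's struct layout. */
struct rx_state {
	uint32_t rcv_wup;	/* rcv_nxt at the time of the last window update */
	uint32_t rcv_nxt;	/* next sequence number expected */
	uint32_t rcv_wnd;	/* assumed result of tcp_receive_window() */
	unsigned int queued;	/* number of skbs in the receive queue */
};

static enum verdict check_sequence(const struct rx_state *rx,
				   uint32_t seq, uint32_t end_seq)
{
	uint32_t right_edge = rx->rcv_nxt + rx->rcv_wnd;

	if (seq_before(end_seq, rx->rcv_wup))
		return DROP_OLD_SEQUENCE;

	if (seq_after(end_seq, right_edge)) {
		/* The packet starts beyond the window: always rejected. */
		if (seq_after(seq, right_edge))
			return DROP_INVALID_SEQUENCE;

		/* It starts inside but ends beyond: tolerated only
		 * when the receive queue is empty.
		 */
		if (rx->queued)
			return DROP_INVALID_END_SEQUENCE;
	}
	return ACCEPT;
}
```

With rcv_nxt = 4001 and a 5000-byte window, the right edge is 9001, so a
segment 4001:54001 is rejected while data is queued and accepted once the
queue has been drained.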



* [PATCH net-next 2/8] tcp: add LINUX_MIB_BEYOND_WINDOW
  2025-07-11 11:39 [PATCH net-next 0/8] tcp: receiver changes Eric Dumazet
  2025-07-11 11:39 ` [PATCH net-next 1/8] tcp: do not accept packets beyond window Eric Dumazet
@ 2025-07-11 11:40 ` Eric Dumazet
  2025-07-12 20:55   ` Kuniyuki Iwashima
  2025-07-11 11:40 ` [PATCH net-next 3/8] selftests/net: packetdrill: add tcp_rcv_big_endseq.pkt Eric Dumazet
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 29+ messages in thread
From: Eric Dumazet @ 2025-07-11 11:40 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni, Neal Cardwell
  Cc: Simon Horman, Kuniyuki Iwashima, Willem de Bruijn, netdev,
	eric.dumazet, Eric Dumazet

Add a new SNMP MIB : LINUX_MIB_BEYOND_WINDOW

Incremented when an incoming packet is received beyond the
receiver window.

nstat -az | grep TcpExtBeyondWindow

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 Documentation/networking/net_cachelines/snmp.rst | 1 +
 include/net/dropreason-core.h                    | 2 ++
 include/uapi/linux/snmp.h                        | 1 +
 net/ipv4/proc.c                                  | 1 +
 net/ipv4/tcp_input.c                             | 1 +
 5 files changed, 6 insertions(+)

diff --git a/Documentation/networking/net_cachelines/snmp.rst b/Documentation/networking/net_cachelines/snmp.rst
index bd44b3eebbef75352599883b9dde36e7889d4120..bce4eb35ec48112ec43d99c58351d3b646a708ec 100644
--- a/Documentation/networking/net_cachelines/snmp.rst
+++ b/Documentation/networking/net_cachelines/snmp.rst
@@ -36,6 +36,7 @@ unsigned_long  LINUX_MIB_TIMEWAITRECYCLED
 unsigned_long  LINUX_MIB_TIMEWAITKILLED
 unsigned_long  LINUX_MIB_PAWSACTIVEREJECTED
 unsigned_long  LINUX_MIB_PAWSESTABREJECTED
+unsigned_long  LINUX_MIB_BEYOND_WINDOW
 unsigned_long  LINUX_MIB_TSECR_REJECTED
 unsigned_long  LINUX_MIB_PAWS_OLD_ACK
 unsigned_long  LINUX_MIB_PAWS_TW_REJECTED
diff --git a/include/net/dropreason-core.h b/include/net/dropreason-core.h
index d88ff9a75d15fe60a961332a7eb4be94c5c7c3ec..6176e060541f330792014dd6081d1d0857536640 100644
--- a/include/net/dropreason-core.h
+++ b/include/net/dropreason-core.h
@@ -305,9 +305,11 @@ enum skb_drop_reason {
 	/** @SKB_DROP_REASON_TCP_OLD_SEQUENCE: Old SEQ field (duplicate packet) */
 	SKB_DROP_REASON_TCP_OLD_SEQUENCE,
 	/** @SKB_DROP_REASON_TCP_INVALID_SEQUENCE: Not acceptable SEQ field.
+	 * Corresponds to LINUX_MIB_BEYOND_WINDOW.
 	 */
 	SKB_DROP_REASON_TCP_INVALID_SEQUENCE,
 	/** @SKB_DROP_REASON_TCP_INVALID_END_SEQUENCE: Not acceptable END_SEQ field.
+	 * Corresponds to LINUX_MIB_BEYOND_WINDOW.
 	 */
 	SKB_DROP_REASON_TCP_INVALID_END_SEQUENCE,
 	/**
diff --git a/include/uapi/linux/snmp.h b/include/uapi/linux/snmp.h
index 1d234d7e1892778c5ff04c240f8360608f391401..49f5640092a0df7ca2bfb01e87a627d9b1bc4233 100644
--- a/include/uapi/linux/snmp.h
+++ b/include/uapi/linux/snmp.h
@@ -186,6 +186,7 @@ enum
 	LINUX_MIB_TIMEWAITKILLED,		/* TimeWaitKilled */
 	LINUX_MIB_PAWSACTIVEREJECTED,		/* PAWSActiveRejected */
 	LINUX_MIB_PAWSESTABREJECTED,		/* PAWSEstabRejected */
+	LINUX_MIB_BEYOND_WINDOW,		/* BeyondWindow */
 	LINUX_MIB_TSECRREJECTED,		/* TSEcrRejected */
 	LINUX_MIB_PAWS_OLD_ACK,			/* PAWSOldAck */
 	LINUX_MIB_PAWS_TW_REJECTED,		/* PAWSTimewait */
diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index ea2f01584379a59a0a01226ae0f45d3614733fef..65b0d0ab0084029db43135a91da6eeb1f1fba024 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -189,6 +189,7 @@ static const struct snmp_mib snmp4_net_list[] = {
 	SNMP_MIB_ITEM("TWKilled", LINUX_MIB_TIMEWAITKILLED),
 	SNMP_MIB_ITEM("PAWSActive", LINUX_MIB_PAWSACTIVEREJECTED),
 	SNMP_MIB_ITEM("PAWSEstab", LINUX_MIB_PAWSESTABREJECTED),
+	SNMP_MIB_ITEM("BeyondWindow", LINUX_MIB_BEYOND_WINDOW),
 	SNMP_MIB_ITEM("TSEcrRejected", LINUX_MIB_TSECRREJECTED),
 	SNMP_MIB_ITEM("PAWSOldAck", LINUX_MIB_PAWS_OLD_ACK),
 	SNMP_MIB_ITEM("PAWSTimewait", LINUX_MIB_PAWS_TW_REJECTED),
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index f0f9c78654b449cb2a122e8c53fdcc96e5317de7..5e2d82c273e2fc914706651a660464db4aba8504 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5900,6 +5900,7 @@ static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb,
 		if (!th->rst) {
 			if (th->syn)
 				goto syn_challenge;
+			NET_INC_STATS(sock_net(sk), LINUX_MIB_BEYOND_WINDOW);
 			if (!tcp_oow_rate_limited(sock_net(sk), skb,
 						  LINUX_MIB_TCPACKSKIPPEDSEQ,
 						  &tp->last_oow_ack_time))
-- 
2.50.0.727.gbf7dc18ff4-goog
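
As a companion to the nstat command in the commit message, the counter can
also be read programmatically. The sketch below assumes the usual
/proc/net/netstat layout, where a line of counter names is followed by a line
of values carrying the same "TcpExt:" prefix; netstat_lookup() is a
hypothetical helper, not an existing API, and it works on two lines the
caller has already read.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Walk a names line and a values line in lockstep, token by token, and
 * return the value sitting under `key`. Both lines are assumed to start
 * with the same prefix token (for example "TcpExt:").
 */
static bool netstat_lookup(const char *names, const char *values,
			   const char *key, unsigned long long *out)
{
	size_t klen = strlen(key);

	for (;;) {
		size_t nlen, vlen;

		/* Skip separators, then measure the next token on each line. */
		names += strspn(names, " \n");
		values += strspn(values, " \n");
		nlen = strcspn(names, " \n");
		vlen = strcspn(values, " \n");

		if (nlen == 0 || vlen == 0)
			return false;	/* ran out of tokens */

		if (nlen == klen && memcmp(names, key, klen) == 0) {
			*out = strtoull(values, NULL, 10);
			return true;
		}
		names += nlen;
		values += vlen;
	}
}
```

The key to look up would be "BeyondWindow", the name registered in
snmp4_net_list by this patch; nstat shows it with the TcpExt prefix.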



* [PATCH net-next 3/8] selftests/net: packetdrill: add tcp_rcv_big_endseq.pkt
  2025-07-11 11:39 [PATCH net-next 0/8] tcp: receiver changes Eric Dumazet
  2025-07-11 11:39 ` [PATCH net-next 1/8] tcp: do not accept packets beyond window Eric Dumazet
  2025-07-11 11:40 ` [PATCH net-next 2/8] tcp: add LINUX_MIB_BEYOND_WINDOW Eric Dumazet
@ 2025-07-11 11:40 ` Eric Dumazet
  2025-07-12 20:58   ` Kuniyuki Iwashima
  2025-07-11 11:40 ` [PATCH net-next 4/8] tcp: call tcp_measure_rcv_mss() for ooo packets Eric Dumazet
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 29+ messages in thread
From: Eric Dumazet @ 2025-07-11 11:40 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni, Neal Cardwell
  Cc: Simon Horman, Kuniyuki Iwashima, Willem de Bruijn, netdev,
	eric.dumazet, Eric Dumazet

This test checks TCP behavior when receiving a packet beyond the window.

It checks the new TcpExtBeyondWindow SNMP counter.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 .../net/packetdrill/tcp_rcv_big_endseq.pkt    | 44 +++++++++++++++++++
 1 file changed, 44 insertions(+)
 create mode 100644 tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt

diff --git a/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt b/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt
new file mode 100644
index 0000000000000000000000000000000000000000..7e170b94fd366ef516d68cf97bf921fdbf437ca8
--- /dev/null
+++ b/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt
@@ -0,0 +1,44 @@
+// SPDX-License-Identifier: GPL-2.0
+
+--mss=1000
+
+`./defaults.sh`
+
+    0 `nstat -n`
+
+// Establish a connection.
+   +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+   +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+   +0 setsockopt(3, SOL_SOCKET, SO_RCVBUF, [10000], 4) = 0
+   +0 bind(3, ..., ...) = 0
+   +0 listen(3, 1) = 0
+
+   +0 < S 0:0(0) win 32792 <mss 1000,nop,wscale 7>
+   +0 > S. 0:0(0) ack 1 <mss 1460,nop,wscale 0>
+  +.1 < . 1:1(0) ack 1 win 257
+
+  +0 accept(3, ..., ...) = 4
+
+  +0 < P. 1:4001(4000) ack 1 win 257
+  +0 > .  1:1(0) ack 4001 win 5000
+
+// packet in sequence : SKB_DROP_REASON_TCP_INVALID_END_SEQUENCE / LINUX_MIB_BEYOND_WINDOW
+  +0 < P. 4001:54001(50000) ack 1 win 257
+  +0 > .  1:1(0) ack 4001 win 5000
+
+// ooo packet. : SKB_DROP_REASON_TCP_INVALID_END_SEQUENCE / LINUX_MIB_BEYOND_WINDOW
+  +1 < P. 5001:55001(50000) ack 1 win 257
+  +0 > .  1:1(0) ack 4001 win 5000
+
+// SKB_DROP_REASON_TCP_INVALID_SEQUENCE / LINUX_MIB_BEYOND_WINDOW
+  +0 < P. 70001:80001(10000) ack 1 win 257
+  +0 > .  1:1(0) ack 4001 win 5000
+
+  +0 read(4, ..., 100000) = 4000
+
+// If queue is empty, accept a packet even if its end_seq is above wup + rcv_wnd
+  +0 < P. 4001:54001(50000) ack 1 win 257
+  +.040 > .  1:1(0) ack 54001 win 0
+
+// Check LINUX_MIB_BEYOND_WINDOW has been incremented 3 times.
++0 `nstat | grep TcpExtBeyondWindow | grep -q " 3 "`
-- 
2.50.0.727.gbf7dc18ff4-goog



* [PATCH net-next 4/8] tcp: call tcp_measure_rcv_mss() for ooo packets
  2025-07-11 11:39 [PATCH net-next 0/8] tcp: receiver changes Eric Dumazet
                   ` (2 preceding siblings ...)
  2025-07-11 11:40 ` [PATCH net-next 3/8] selftests/net: packetdrill: add tcp_rcv_big_endseq.pkt Eric Dumazet
@ 2025-07-11 11:40 ` Eric Dumazet
  2025-07-12 21:11   ` Kuniyuki Iwashima
  2025-07-11 11:40 ` [PATCH net-next 5/8] selftests/net: packetdrill: add tcp_ooo_rcv_mss.pkt Eric Dumazet
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 29+ messages in thread
From: Eric Dumazet @ 2025-07-11 11:40 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni, Neal Cardwell
  Cc: Simon Horman, Kuniyuki Iwashima, Willem de Bruijn, netdev,
	eric.dumazet, Eric Dumazet

tcp_measure_rcv_mss() is used to update icsk->icsk_ack.rcv_mss
(tcpi_rcv_mss in tcp_info) and tp->scaling_ratio.

Calling it from tcp_data_queue_ofo() makes sure these
fields are updated, and permits a better tuning
of sk->sk_rcvbuf, in the case a new flow receives many ooo
packets.

Fixes: dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale")
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/tcp_input.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 5e2d82c273e2fc914706651a660464db4aba8504..78da05933078b5b665113b57a0edc03b29820496 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4923,6 +4923,7 @@ static void tcp_data_queue_ofo(struct sock *sk, struct sk_buff *skb)
 		return;
 	}
 
+	tcp_measure_rcv_mss(sk, skb);
 	/* Disable header prediction. */
 	tp->pred_flags = 0;
 	inet_csk_schedule_ack(sk);
-- 
2.50.0.727.gbf7dc18ff4-goog



* [PATCH net-next 5/8] selftests/net: packetdrill: add tcp_ooo_rcv_mss.pkt
  2025-07-11 11:39 [PATCH net-next 0/8] tcp: receiver changes Eric Dumazet
                   ` (3 preceding siblings ...)
  2025-07-11 11:40 ` [PATCH net-next 4/8] tcp: call tcp_measure_rcv_mss() for ooo packets Eric Dumazet
@ 2025-07-11 11:40 ` Eric Dumazet
  2025-07-12 21:42   ` Kuniyuki Iwashima
  2025-07-11 11:40 ` [PATCH net-next 6/8] tcp: add const to tcp_try_rmem_schedule() and sk_rmem_schedule() skb Eric Dumazet
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 29+ messages in thread
From: Eric Dumazet @ 2025-07-11 11:40 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni, Neal Cardwell
  Cc: Simon Horman, Kuniyuki Iwashima, Willem de Bruijn, netdev,
	eric.dumazet, Eric Dumazet

We make sure tcpi_rcv_mss and tp->scaling_ratio
are correctly updated if no in-order packet has been received yet.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 .../net/packetdrill/tcp_ooo_rcv_mss.pkt       | 27 +++++++++++++++++++
 1 file changed, 27 insertions(+)
 create mode 100644 tools/testing/selftests/net/packetdrill/tcp_ooo_rcv_mss.pkt

diff --git a/tools/testing/selftests/net/packetdrill/tcp_ooo_rcv_mss.pkt b/tools/testing/selftests/net/packetdrill/tcp_ooo_rcv_mss.pkt
new file mode 100644
index 0000000000000000000000000000000000000000..7e6bc5fb0c8d78f36dc3d18842ff11d938c4e41b
--- /dev/null
+++ b/tools/testing/selftests/net/packetdrill/tcp_ooo_rcv_mss.pkt
@@ -0,0 +1,27 @@
+// SPDX-License-Identifier: GPL-2.0
+
+--mss=1000
+
+`./defaults.sh
+sysctl -q net.ipv4.tcp_rmem="4096 131072 $((32*1024*1024))"`
+
+   +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+   +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+   +0 bind(3, ..., ...) = 0
+   +0 listen(3, 1) = 0
+
+   +0 < S 0:0(0) win 65535 <mss 1000,nop,nop,sackOK,nop,wscale 7>
+   +0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 10>
+  +.1 < . 1:1(0) ack 1 win 257
+
+   +0 accept(3, ..., ...) = 4
+
+   +0 < . 2001:11001(9000) ack 1 win 257
+   +0 > . 1:1(0) ack 1 win 81 <nop,nop,sack 2001:11001>
+
+// check that ooo packet properly updates tcpi_rcv_mss
+   +0 %{ assert tcpi_rcv_mss == 1000, tcpi_rcv_mss }%
+
+   +0 < . 11001:21001(10000) ack 1 win 257
+   +0 > . 1:1(0) ack 1 win 81 <nop,nop,sack 2001:21001>
+
-- 
2.50.0.727.gbf7dc18ff4-goog



* [PATCH net-next 6/8] tcp: add const to tcp_try_rmem_schedule() and sk_rmem_schedule() skb
  2025-07-11 11:39 [PATCH net-next 0/8] tcp: receiver changes Eric Dumazet
                   ` (4 preceding siblings ...)
  2025-07-11 11:40 ` [PATCH net-next 5/8] selftests/net: packetdrill: add tcp_ooo_rcv_mss.pkt Eric Dumazet
@ 2025-07-11 11:40 ` Eric Dumazet
  2025-07-12 21:43   ` Kuniyuki Iwashima
  2025-07-11 11:40 ` [PATCH net-next 7/8] tcp: stronger sk_rcvbuf checks Eric Dumazet
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 29+ messages in thread
From: Eric Dumazet @ 2025-07-11 11:40 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni, Neal Cardwell
  Cc: Simon Horman, Kuniyuki Iwashima, Willem de Bruijn, netdev,
	eric.dumazet, Eric Dumazet

These functions do not modify the skb; add a const qualifier.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/sock.h   | 2 +-
 net/ipv4/tcp_input.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 0f2443d4ec581639eb3bdc46cb9b2932123e9246..c8a4b283df6fc4b931270502ddbb5df7ae1e4aa2 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1553,7 +1553,7 @@ __sk_rmem_schedule(struct sock *sk, int size, bool pfmemalloc)
 }
 
 static inline bool
-sk_rmem_schedule(struct sock *sk, struct sk_buff *skb, int size)
+sk_rmem_schedule(struct sock *sk, const struct sk_buff *skb, int size)
 {
 	return __sk_rmem_schedule(sk, size, skb_pfmemalloc(skb));
 }
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 78da05933078b5b665113b57a0edc03b29820496..39de55ff898e6ec9c6e5bc9dc7b80ec9d235ca44 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4888,7 +4888,7 @@ static void tcp_ofo_queue(struct sock *sk)
 static bool tcp_prune_ofo_queue(struct sock *sk, const struct sk_buff *in_skb);
 static int tcp_prune_queue(struct sock *sk, const struct sk_buff *in_skb);
 
-static int tcp_try_rmem_schedule(struct sock *sk, struct sk_buff *skb,
+static int tcp_try_rmem_schedule(struct sock *sk, const struct sk_buff *skb,
 				 unsigned int size)
 {
 	if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf ||
-- 
2.50.0.727.gbf7dc18ff4-goog



* [PATCH net-next 7/8] tcp: stronger sk_rcvbuf checks
  2025-07-11 11:39 [PATCH net-next 0/8] tcp: receiver changes Eric Dumazet
                   ` (5 preceding siblings ...)
  2025-07-11 11:40 ` [PATCH net-next 6/8] tcp: add const to tcp_try_rmem_schedule() and sk_rmem_schedule() skb Eric Dumazet
@ 2025-07-11 11:40 ` Eric Dumazet
  2025-07-12 21:54   ` Kuniyuki Iwashima
  2025-07-11 11:40 ` [PATCH net-next 8/8] selftests/net: packetdrill: add tcp_rcv_toobig.pkt Eric Dumazet
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 29+ messages in thread
From: Eric Dumazet @ 2025-07-11 11:40 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni, Neal Cardwell
  Cc: Simon Horman, Kuniyuki Iwashima, Willem de Bruijn, netdev,
	eric.dumazet, Eric Dumazet

Currently, the TCP stack accepts an incoming packet if the sizes of the
receive queues are below the sk->sk_rcvbuf limit.

This can cause memory overshoot if the packet is big, like a 1/2 MB
BIG TCP one.

Refine the check to take into account the incoming skb truesize.

Note that we still accept the packet if the receive queue is empty,
to not completely freeze TCP flows in pathological conditions.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/tcp_input.c | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 39de55ff898e6ec9c6e5bc9dc7b80ec9d235ca44..9c5baace4b7b24140ab5e0eafc397f124c8c64dd 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4888,10 +4888,20 @@ static void tcp_ofo_queue(struct sock *sk)
 static bool tcp_prune_ofo_queue(struct sock *sk, const struct sk_buff *in_skb);
 static int tcp_prune_queue(struct sock *sk, const struct sk_buff *in_skb);
 
+/* Check if this incoming skb can be added to socket receive queues
+ * while satisfying sk->sk_rcvbuf limit.
+ */
+static bool tcp_can_ingest(const struct sock *sk, const struct sk_buff *skb)
+{
+	unsigned int new_mem = atomic_read(&sk->sk_rmem_alloc) + skb->truesize;
+
+	return new_mem <= sk->sk_rcvbuf;
+}
+
 static int tcp_try_rmem_schedule(struct sock *sk, const struct sk_buff *skb,
 				 unsigned int size)
 {
-	if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf ||
+	if (!tcp_can_ingest(sk, skb) ||
 	    !sk_rmem_schedule(sk, skb, size)) {
 
 		if (tcp_prune_queue(sk, skb) < 0)
@@ -5507,7 +5517,7 @@ static bool tcp_prune_ofo_queue(struct sock *sk, const struct sk_buff *in_skb)
 		tcp_drop_reason(sk, skb, SKB_DROP_REASON_TCP_OFO_QUEUE_PRUNE);
 		tp->ooo_last_skb = rb_to_skb(prev);
 		if (!prev || goal <= 0) {
-			if (atomic_read(&sk->sk_rmem_alloc) <= sk->sk_rcvbuf &&
+			if (tcp_can_ingest(sk, skb) &&
 			    !tcp_under_memory_pressure(sk))
 				break;
 			goal = sk->sk_rcvbuf >> 3;
@@ -5541,12 +5551,12 @@ static int tcp_prune_queue(struct sock *sk, const struct sk_buff *in_skb)
 
 	NET_INC_STATS(sock_net(sk), LINUX_MIB_PRUNECALLED);
 
-	if (atomic_read(&sk->sk_rmem_alloc) >= sk->sk_rcvbuf)
+	if (!tcp_can_ingest(sk, in_skb))
 		tcp_clamp_window(sk);
 	else if (tcp_under_memory_pressure(sk))
 		tcp_adjust_rcv_ssthresh(sk);
 
-	if (atomic_read(&sk->sk_rmem_alloc) <= sk->sk_rcvbuf)
+	if (tcp_can_ingest(sk, in_skb))
 		return 0;
 
 	tcp_collapse_ofo_queue(sk);
@@ -5556,7 +5566,7 @@ static int tcp_prune_queue(struct sock *sk, const struct sk_buff *in_skb)
 			     NULL,
 			     tp->copied_seq, tp->rcv_nxt);
 
-	if (atomic_read(&sk->sk_rmem_alloc) <= sk->sk_rcvbuf)
+	if (tcp_can_ingest(sk, in_skb))
 		return 0;
 
 	/* Collapsing did not help, destructive actions follow.
@@ -5564,7 +5574,7 @@ static int tcp_prune_queue(struct sock *sk, const struct sk_buff *in_skb)
 
 	tcp_prune_ofo_queue(sk, in_skb);
 
-	if (atomic_read(&sk->sk_rmem_alloc) <= sk->sk_rcvbuf)
+	if (tcp_can_ingest(sk, in_skb))
 		return 0;
 
 	/* If we are really being abused, tell the caller to silently
-- 
2.50.0.727.gbf7dc18ff4-goog
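
To make the difference between the old and new checks concrete, here is a
small user-space sketch. The two functions are hypothetical stand-ins that
take plain integers instead of a struct sock and an skb; can_ingest() mirrors
the arithmetic of tcp_can_ingest() from the patch above, and the sizes used
in the note below are made-up illustration values.

```c
#include <stdbool.h>

/* Old behavior: only memory already charged to the socket is compared
 * against the limit, so the size of the incoming skb is ignored.
 */
static bool old_check_passes(unsigned int rmem_alloc, unsigned int rcvbuf)
{
	return rmem_alloc <= rcvbuf;
}

/* New behavior: the truesize of the incoming skb is added before the
 * comparison, as tcp_can_ingest() does.
 */
static bool can_ingest(unsigned int rmem_alloc, unsigned int truesize,
		       unsigned int rcvbuf)
{
	unsigned int new_mem = rmem_alloc + truesize;

	return new_mem <= rcvbuf;
}
```

For example, with 100000 bytes already queued, a 131072-byte limit and an
incoming skb whose truesize is 524288, the old check passes while the new one
fails. The empty-receive-queue exception described in the commit message is
outside the scope of this sketch.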



* [PATCH net-next 8/8] selftests/net: packetdrill: add tcp_rcv_toobig.pkt
  2025-07-11 11:39 [PATCH net-next 0/8] tcp: receiver changes Eric Dumazet
                   ` (6 preceding siblings ...)
  2025-07-11 11:40 ` [PATCH net-next 7/8] tcp: stronger sk_rcvbuf checks Eric Dumazet
@ 2025-07-11 11:40 ` Eric Dumazet
  2025-07-12 21:57   ` Kuniyuki Iwashima
  2025-07-15  2:20 ` [PATCH net-next 0/8] tcp: receiver changes patchwork-bot+netdevbpf
  2025-07-15  8:25 ` Paolo Abeni
  9 siblings, 1 reply; 29+ messages in thread
From: Eric Dumazet @ 2025-07-11 11:40 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni, Neal Cardwell
  Cc: Simon Horman, Kuniyuki Iwashima, Willem de Bruijn, netdev,
	eric.dumazet, Eric Dumazet

Check TCP receiver behavior after "tcp: stronger sk_rcvbuf checks".

A packet that is too big is dropped unless the receive queue is empty.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 .../net/packetdrill/tcp_rcv_toobig.pkt        | 33 +++++++++++++++++++
 1 file changed, 33 insertions(+)
 create mode 100644 tools/testing/selftests/net/packetdrill/tcp_rcv_toobig.pkt

diff --git a/tools/testing/selftests/net/packetdrill/tcp_rcv_toobig.pkt b/tools/testing/selftests/net/packetdrill/tcp_rcv_toobig.pkt
new file mode 100644
index 0000000000000000000000000000000000000000..f575c0ff89da3c856208b315358c1c4a4c331d12
--- /dev/null
+++ b/tools/testing/selftests/net/packetdrill/tcp_rcv_toobig.pkt
@@ -0,0 +1,33 @@
+// SPDX-License-Identifier: GPL-2.0
+
+--mss=1000
+
+`./defaults.sh`
+
+    0 `nstat -n`
+
+// Establish a connection.
+   +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+   +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+   +0 setsockopt(3, SOL_SOCKET, SO_RCVBUF, [20000], 4) = 0
+   +0 bind(3, ..., ...) = 0
+   +0 listen(3, 1) = 0
+
+   +0 < S 0:0(0) win 32792 <mss 1000,nop,wscale 7>
+   +0 > S. 0:0(0) ack 1 win 18980 <mss 1460,nop,wscale 0>
+  +.1 < . 1:1(0) ack 1 win 257
+
+   +0 accept(3, ..., ...) = 4
+
+   +0 < P. 1:20001(20000) ack 1 win 257
+ +.04 > .  1:1(0) ack 20001 win 18000
+
+   +0 setsockopt(4, SOL_SOCKET, SO_RCVBUF, [12000], 4) = 0
+   +0 < P. 20001:80001(60000) ack 1 win 257
+   +0 > .  1:1(0) ack 20001 win 18000
+
+   +0 read(4, ..., 20000) = 20000
+// A too big packet is accepted if the receive queue is empty
+   +0 < P. 20001:80001(60000) ack 1 win 257
+   +0 > .  1:1(0) ack 80001 win 0
+
-- 
2.50.0.727.gbf7dc18ff4-goog



* Re: [PATCH net-next 1/8] tcp: do not accept packets beyond window
  2025-07-11 11:39 ` [PATCH net-next 1/8] tcp: do not accept packets beyond window Eric Dumazet
@ 2025-07-12 20:52   ` Kuniyuki Iwashima
  2025-07-15  1:38   ` Jakub Kicinski
  1 sibling, 0 replies; 29+ messages in thread
From: Kuniyuki Iwashima @ 2025-07-12 20:52 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Neal Cardwell,
	Simon Horman, Willem de Bruijn, netdev, eric.dumazet

On Fri, Jul 11, 2025 at 4:40 AM Eric Dumazet <edumazet@google.com> wrote:
>
> Currently, TCP accepts incoming packets which might go beyond the
> offered RWIN.
>
> Add to tcp_sequence() the validation of packet end sequence.
>
> Add the corresponding check in the fast path.
>
> We relax this new constraint if the receive queue is empty,
> to not freeze flows from buggy peers.
>
> Add a new drop reason : SKB_DROP_REASON_TCP_INVALID_END_SEQUENCE.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>


* Re: [PATCH net-next 2/8] tcp: add LINUX_MIB_BEYOND_WINDOW
  2025-07-11 11:40 ` [PATCH net-next 2/8] tcp: add LINUX_MIB_BEYOND_WINDOW Eric Dumazet
@ 2025-07-12 20:55   ` Kuniyuki Iwashima
  0 siblings, 0 replies; 29+ messages in thread
From: Kuniyuki Iwashima @ 2025-07-12 20:55 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Neal Cardwell,
	Simon Horman, Willem de Bruijn, netdev, eric.dumazet

On Fri, Jul 11, 2025 at 4:40 AM Eric Dumazet <edumazet@google.com> wrote:
>
> Add a new SNMP MIB : LINUX_MIB_BEYOND_WINDOW
>
> Incremented when an incoming packet is received beyond the
> receiver window.
>
> nstat -az | grep TcpExtBeyondWindow
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>


* Re: [PATCH net-next 3/8] selftests/net: packetdrill: add tcp_rcv_big_endseq.pkt
  2025-07-11 11:40 ` [PATCH net-next 3/8] selftests/net: packetdrill: add tcp_rcv_big_endseq.pkt Eric Dumazet
@ 2025-07-12 20:58   ` Kuniyuki Iwashima
  0 siblings, 0 replies; 29+ messages in thread
From: Kuniyuki Iwashima @ 2025-07-12 20:58 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Neal Cardwell,
	Simon Horman, Willem de Bruijn, netdev, eric.dumazet

On Fri, Jul 11, 2025 at 4:40 AM Eric Dumazet <edumazet@google.com> wrote:
>
> This test checks TCP behavior when receiving a packet beyond the window.
>
> It checks the new TcpExtBeyondWindow SNMP counter.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>


* Re: [PATCH net-next 4/8] tcp: call tcp_measure_rcv_mss() for ooo packets
  2025-07-11 11:40 ` [PATCH net-next 4/8] tcp: call tcp_measure_rcv_mss() for ooo packets Eric Dumazet
@ 2025-07-12 21:11   ` Kuniyuki Iwashima
  0 siblings, 0 replies; 29+ messages in thread
From: Kuniyuki Iwashima @ 2025-07-12 21:11 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Neal Cardwell,
	Simon Horman, Willem de Bruijn, netdev, eric.dumazet

On Fri, Jul 11, 2025 at 4:40 AM Eric Dumazet <edumazet@google.com> wrote:
>
> tcp_measure_rcv_mss() is used to update icsk->icsk_ack.rcv_mss
> (tcpi_rcv_mss in tcp_info) and tp->scaling_ratio.
>
> Calling it from tcp_data_queue_ofo() makes sure these
> fields are updated, and permits a better tuning
> of sk->sk_rcvbuf, in the case a new flow receives many ooo
> packets.
>
> Fixes: dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale")
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>


* Re: [PATCH net-next 5/8] selftests/net: packetdrill: add tcp_ooo_rcv_mss.pkt
  2025-07-11 11:40 ` [PATCH net-next 5/8] selftests/net: packetdrill: add tcp_ooo_rcv_mss.pkt Eric Dumazet
@ 2025-07-12 21:42   ` Kuniyuki Iwashima
  0 siblings, 0 replies; 29+ messages in thread
From: Kuniyuki Iwashima @ 2025-07-12 21:42 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Neal Cardwell,
	Simon Horman, Willem de Bruijn, netdev, eric.dumazet

On Fri, Jul 11, 2025 at 4:40 AM Eric Dumazet <edumazet@google.com> wrote:
>
> We make sure tcpi_rcv_mss and tp->scaling_ratio
> are correctly updated if no in-order packet has been received yet.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>


* Re: [PATCH net-next 6/8] tcp: add const to tcp_try_rmem_schedule() and sk_rmem_schedule() skb
  2025-07-11 11:40 ` [PATCH net-next 6/8] tcp: add const to tcp_try_rmem_schedule() and sk_rmem_schedule() skb Eric Dumazet
@ 2025-07-12 21:43   ` Kuniyuki Iwashima
  0 siblings, 0 replies; 29+ messages in thread
From: Kuniyuki Iwashima @ 2025-07-12 21:43 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Neal Cardwell,
	Simon Horman, Willem de Bruijn, netdev, eric.dumazet

On Fri, Jul 11, 2025 at 4:40 AM Eric Dumazet <edumazet@google.com> wrote:
>
> These functions do not modify the skb, so add a const qualifier.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>


* Re: [PATCH net-next 7/8] tcp: stronger sk_rcvbuf checks
  2025-07-11 11:40 ` [PATCH net-next 7/8] tcp: stronger sk_rcvbuf checks Eric Dumazet
@ 2025-07-12 21:54   ` Kuniyuki Iwashima
  0 siblings, 0 replies; 29+ messages in thread
From: Kuniyuki Iwashima @ 2025-07-12 21:54 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Neal Cardwell,
	Simon Horman, Willem de Bruijn, netdev, eric.dumazet

On Fri, Jul 11, 2025 at 4:40 AM Eric Dumazet <edumazet@google.com> wrote:
>
> Currently, the TCP stack accepts an incoming packet if the sizes of the
> receive queues are below the sk->sk_rcvbuf limit.
>
> This can cause memory overshoot if the packet is big, like a 1/2 MB
> BIG TCP one.
>
> Refine the check to take into account the incoming skb truesize.
>
> Note that we still accept the packet if the receive queue is empty,
> to not completely freeze TCP flows in pathological conditions.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
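
The refined check described in the quoted commit message can be sketched in plain C. This is a simplified userspace model, not the kernel code: struct rx_state, its fields, and both helper functions are hypothetical names, and the exact comparisons are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of a socket's receive accounting. */
struct rx_state {
	size_t rmem_alloc;   /* bytes already charged to the receive queues */
	size_t rcvbuf;       /* limit, playing the role of sk->sk_rcvbuf */
	bool queue_empty;    /* true when the receive queue holds no packet */
};

/* Old-style check: only looks at what is already queued. */
static bool can_queue_old(const struct rx_state *rx)
{
	return rx->rmem_alloc <= rx->rcvbuf;
}

/* Refined check: also account for the incoming packet's truesize,
 * but always accept when the queue is empty so the flow cannot freeze.
 */
static bool can_queue_new(const struct rx_state *rx, size_t truesize)
{
	if (rx->queue_empty)
		return true;
	return rx->rmem_alloc + truesize <= rx->rcvbuf;
}
```

With 100000 bytes queued against a 131072-byte limit, the old-style check would let a 524288-byte packet in, while the refined one rejects it unless the queue is empty.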


* Re: [PATCH net-next 8/8] selftests/net: packetdrill: add tcp_rcv_toobig.pkt
  2025-07-11 11:40 ` [PATCH net-next 8/8] selftests/net: packetdrill: add tcp_rcv_toobig.pkt Eric Dumazet
@ 2025-07-12 21:57   ` Kuniyuki Iwashima
  0 siblings, 0 replies; 29+ messages in thread
From: Kuniyuki Iwashima @ 2025-07-12 21:57 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Neal Cardwell,
	Simon Horman, Willem de Bruijn, netdev, eric.dumazet

On Fri, Jul 11, 2025 at 4:40 AM Eric Dumazet <edumazet@google.com> wrote:
>
> Check TCP receiver behavior after "tcp: stronger sk_rcvbuf checks".
>
> A too-fat packet is dropped unless the receive queue is empty.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>


* Re: [PATCH net-next 1/8] tcp: do not accept packets beyond window
  2025-07-11 11:39 ` [PATCH net-next 1/8] tcp: do not accept packets beyond window Eric Dumazet
  2025-07-12 20:52   ` Kuniyuki Iwashima
@ 2025-07-15  1:38   ` Jakub Kicinski
  1 sibling, 0 replies; 29+ messages in thread
From: Jakub Kicinski @ 2025-07-15  1:38 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Paolo Abeni, Neal Cardwell, Simon Horman,
	Kuniyuki Iwashima, Willem de Bruijn, netdev, eric.dumazet

On Fri, 11 Jul 2025 11:39:59 +0000 Eric Dumazet wrote:
> -	/** @SKB_DROP_REASON_TCP_INVALID_SEQUENCE: Not acceptable SEQ field */
> +	/** @SKB_DROP_REASON_TCP_INVALID_SEQUENCE: Not acceptable SEQ field.
> +	 */
>  	SKB_DROP_REASON_TCP_INVALID_SEQUENCE,
> +	/** @SKB_DROP_REASON_TCP_INVALID_END_SEQUENCE: Not acceptable END_SEQ field.
> +	 */
> +	SKB_DROP_REASON_TCP_INVALID_END_SEQUENCE,

FWIW this is not valid kdoc. We can either do:

	/** @WORDS: bla bla bla */

or

	/**
	 * @WORDS: bla bla bla
	 */

but "networking inspired style":

	/** @WORDS: bla bla bla
	 */

is not allowed.

Ima fix for you when applying.
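
For illustration, the two accepted comment forms look like this on a small enum. The enum and its member names are hypothetical and only mirror the shape of the quoted hunk.

```c
/* Hypothetical enum, for illustration only. */
enum demo_drop_reason {
	/** @DEMO_DROP_INVALID_SEQUENCE: Not acceptable SEQ field. */
	DEMO_DROP_INVALID_SEQUENCE,
	/**
	 * @DEMO_DROP_INVALID_END_SEQUENCE: Not acceptable END_SEQ field.
	 */
	DEMO_DROP_INVALID_END_SEQUENCE,
};
```

The first member uses the single-line form, the second the multi-line form in which the opening marker sits on a line of its own.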


* Re: [PATCH net-next 0/8] tcp: receiver changes
  2025-07-11 11:39 [PATCH net-next 0/8] tcp: receiver changes Eric Dumazet
                   ` (7 preceding siblings ...)
  2025-07-11 11:40 ` [PATCH net-next 8/8] selftests/net: packetdrill: add tcp_rcv_toobig.pkt Eric Dumazet
@ 2025-07-15  2:20 ` patchwork-bot+netdevbpf
  2025-07-15  8:25 ` Paolo Abeni
  9 siblings, 0 replies; 29+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-07-15  2:20 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: davem, kuba, pabeni, ncardwell, horms, kuniyu, willemb, netdev,
	eric.dumazet

Hello:

This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Fri, 11 Jul 2025 11:39:58 +0000 you wrote:
> Before accepting an incoming packet:
> 
> - Make sure to not accept a packet beyond advertized RWIN.
>   If not, increment a new SNMP counter (LINUX_MIB_BEYOND_WINDOW)
> 
> - ooo packets should update rcv_mss and tp->scaling_ratio.
> 
> [...]

Here is the summary with links:
  - [net-next,1/8] tcp: do not accept packets beyond window
    https://git.kernel.org/netdev/net-next/c/9ca48d616ed7
  - [net-next,2/8] tcp: add LINUX_MIB_BEYOND_WINDOW
    https://git.kernel.org/netdev/net-next/c/6c758062c64d
  - [net-next,3/8] selftests/net: packetdrill: add tcp_rcv_big_endseq.pkt
    https://git.kernel.org/netdev/net-next/c/f5fda1a86884
  - [net-next,4/8] tcp: call tcp_measure_rcv_mss() for ooo packets
    https://git.kernel.org/netdev/net-next/c/38d7e4443365
  - [net-next,5/8] selftests/net: packetdrill: add tcp_ooo_rcv_mss.pkt
    https://git.kernel.org/netdev/net-next/c/445e0cc38d49
  - [net-next,6/8] tcp: add const to tcp_try_rmem_schedule() and sk_rmem_schedule() skb
    https://git.kernel.org/netdev/net-next/c/75dff0584cce
  - [net-next,7/8] tcp: stronger sk_rcvbuf checks
    https://git.kernel.org/netdev/net-next/c/1d2fbaad7cd8
  - [net-next,8/8] selftests/net: packetdrill: add tcp_rcv_toobig.pkt
    https://git.kernel.org/netdev/net-next/c/906893cf2cf2

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [PATCH net-next 0/8] tcp: receiver changes
  2025-07-11 11:39 [PATCH net-next 0/8] tcp: receiver changes Eric Dumazet
                   ` (8 preceding siblings ...)
  2025-07-15  2:20 ` [PATCH net-next 0/8] tcp: receiver changes patchwork-bot+netdevbpf
@ 2025-07-15  8:25 ` Paolo Abeni
  2025-07-15  9:21   ` Matthieu Baerts
  9 siblings, 1 reply; 29+ messages in thread
From: Paolo Abeni @ 2025-07-15  8:25 UTC (permalink / raw)
  To: Eric Dumazet, Neal Cardwell, Matthieu Baerts (NGI0)
  Cc: Simon Horman, Kuniyuki Iwashima, Willem de Bruijn, netdev,
	eric.dumazet, David S . Miller, Jakub Kicinski

On 7/11/25 1:39 PM, Eric Dumazet wrote:
> Before accepting an incoming packet:
> 
> - Make sure to not accept a packet beyond advertized RWIN.
>   If not, increment a new SNMP counter (LINUX_MIB_BEYOND_WINDOW)
> 
> - ooo packets should update rcv_mss and tp->scaling_ratio.
> 
> - Make sure to not accept packet beyond sk_rcvbuf limit.
> 
> This series includes three associated packetdrill tests.

I suspect this series is causing pktdrill failures for the
tcp_rcv_big_endseq.pkt test case:

# selftests: net/packetdrill: tcp_rcv_big_endseq.pkt
# TAP version 13
# 1..2
# tcp_rcv_big_endseq.pkt:41: error handling packet: timing error: expected outbound packet at 1.347964 sec but happened at 1.307939 sec; tolerance 0.014000 sec
# script packet:  1.347964 . 1:1(0) ack 54001 win 0
# actual packet:  1.307939 . 1:1(0) ack 54001 win 0
# not ok 1 ipv4
# tcp_rcv_big_endseq.pkt:41: error handling packet: timing error: expected outbound packet at 1.354946 sec but happened at 1.314923 sec; tolerance 0.014000 sec
# script packet:  1.354946 . 1:1(0) ack 54001 win 0
# actual packet:  1.314923 . 1:1(0) ack 54001 win 0
# not ok 2 ipv6
# # Totals: pass:0 fail:2 xfail:0 xpass:0 skip:0 error:0

the event is happening _before_ the expected time, I guess it's more a
functional issue than a timing one.
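
As a quick check of the numbers in the log above, here is a small C sketch of the comparison. The helper names are hypothetical and this is not packetdrill's own code, only the arithmetic implied by the error message.

```c
#include <stdbool.h>

/* Signed offset of the observed packet relative to the script, in seconds.
 * A negative value means the packet left earlier than the script expected.
 */
static double timing_offset(double expected_sec, double actual_sec)
{
	return actual_sec - expected_sec;
}

/* True when the offset falls within +/- tolerance_sec. */
static bool within_tolerance(double offset_sec, double tolerance_sec)
{
	return offset_sec >= -tolerance_sec && offset_sec <= tolerance_sec;
}
```

For the ipv4 run, timing_offset(1.347964, 1.307939) is about -0.040 sec, i.e. the ACK left roughly 40 ms early, well outside the 0.014 sec tolerance.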

I also suspect this series is causing flakes in mptcp tests, i.e.:

# INFO: disconnect
# 63 ns1 MPTCP -> ns1 (10.0.1.1:20001      ) MPTCP     (duration 227ms) [ OK ]
# 64 ns1 MPTCP -> ns1 (10.0.1.1:20002      ) TCP       (duration 96ms) [ OK ]
# 65 ns1 TCP   -> ns1 (10.0.1.1:20003      ) MPTCP     copyfd_io_poll: poll timed out (events: POLLIN 0, POLLOUT 4)
# copyfd_io_poll: poll timed out (events: POLLIN 1, POLLOUT 0)
# (duration 30318ms) [FAIL] client exit code 2, server 0
#
# netns ns1-VslcTV (listener) socket stat for 20003:
# Netid State      Recv-Q Send-Q Local Address:Port  Peer Address:Port
# tcp   FIN-WAIT-2 0      0           10.0.1.1:20003     10.0.1.1:60698 timer:(timewait,59sec,0) ino:0 sk:1012
#
# tcp   TIME-WAIT  0      0           10.0.1.1:20003     10.0.1.1:60696 timer:(timewait,29sec,0) ino:0 sk:1013
#
# TcpActiveOpens                  3                  0.0
# TcpPassiveOpens                 3                  0.0
# TcpInSegs                       1472               0.0
# TcpOutSegs                      1471               0.0
# TcpRetransSegs                  3                  0.0
# TcpExtPruneCalled               4                  0.0
# TcpExtRcvPruned                 3                  0.0
# TcpExtTW                        3                  0.0
# TcpExtBeyondWindow              7                  0.0
# TcpExtTCPHPHits                 34                 0.0
# TcpExtTCPPureAcks               386                0.0
# TcpExtTCPHPAcks                 33                 0.0
# TcpExtTCPSackRecovery           1                  0.0
# TcpExtTCPFastRetrans            1                  0.0
# TcpExtTCPLossProbes             2                  0.0
# TcpExtTCPLossProbeRecovery      1                  0.0
# TcpExtTCPRcvCollapsed           3                  0.0
# TcpExtTCPBacklogCoalesce        261                0.0
# TcpExtTCPSackShiftFallback      1                  0.0
# TcpExtTCPRcvCoalesce            500                0.0
# TcpExtTCPOFOQueue               1                  0.0
# TcpExtTCPFromZeroWindowAdv      60                 0.0
# TcpExtTCPToZeroWindowAdv        58                 0.0
# TcpExtTCPWantZeroWindowAdv      296                0.0
# TcpExtTCPOrigDataSent           1038               0.0
# TcpExtTCPHystartTrainDetect     1                  0.0
# TcpExtTCPHystartTrainCwnd       16                 0.0
# TcpExtTCPACKSkippedSeq          1                  0.0
# TcpExtTCPWinProbe               7                  0.0
# TcpExtTCPDelivered              1041               0.0
# TcpExtTCPRcvQDrop               2                  0.0
#
# netns ns1-VslcTV (connector) socket stat for 20003:
# Failed to find cgroup2 mount
# Failed to find cgroup2 mount
# Netid State     Recv-Q Send-Q  Local Address:Port  Peer Address:Port
# tcp   TIME-WAIT 0      0            10.0.1.1:60684     10.0.1.1:20003 timer:(timewait,29sec,0) ino:0 sk:11
#
# tcp   LAST-ACK  0      1735147      10.0.1.1:60698     10.0.1.1:20003 timer:(persist,22sec,0) ino:0 sk:12 cgroup:unreachable:1 ---
#  skmem:(r0,rb361100,t0,tb2626560,f2838,w1758442,o0,bl0,d61) ts sack cubic wscale:7,7 rto:201 backoff:7 rtt:0.12/0.215 ato:40 mss:65483 pmtu:65535 rcvmss:65483 advmss:65483 cwnd:7 ssthresh:7 bytes_sent:1738187 bytes_retrans:65461 bytes_acked:1672727 bytes_received:7659224 segs_out:180 segs_in:243 data_segs_out:103 data_segs_in:221 send 30558733333bps lastsnd:30125 lastrcv:30322 lastack:3693 pacing_rate 36480477512bps delivery_rate 196449000000bps delivered:103 app_limited busy:30351ms rwnd_limited:30350ms(100.0%) retrans:0/1 rcv_rtt:0.005 rcv_space:289974 rcv_ssthresh:324480 notsent:1735147 minrtt:0.001 rcv_wnd:324480

@Matttbe: can you reproduce the flakes locally? if so, does reverting
that series stop them? (not that I'm planning a revert, just to validate
my guess).

Thanks,

Paolo



* Re: [PATCH net-next 0/8] tcp: receiver changes
  2025-07-15  8:25 ` Paolo Abeni
@ 2025-07-15  9:21   ` Matthieu Baerts
  2025-07-15 10:14     ` Paolo Abeni
  0 siblings, 1 reply; 29+ messages in thread
From: Matthieu Baerts @ 2025-07-15  9:21 UTC (permalink / raw)
  To: Paolo Abeni, Eric Dumazet, Neal Cardwell
  Cc: Simon Horman, Kuniyuki Iwashima, Willem de Bruijn, netdev,
	eric.dumazet, David S . Miller, Jakub Kicinski

Hi Paolo,

Thank you for having CCed me!

On 15/07/2025 10:25, Paolo Abeni wrote:
> On 7/11/25 1:39 PM, Eric Dumazet wrote:
>> Before accepting an incoming packet:
>>
>> - Make sure to not accept a packet beyond advertized RWIN.
>>   If not, increment a new SNMP counter (LINUX_MIB_BEYOND_WINDOW)
>>
>> - ooo packets should update rcv_mss and tp->scaling_ratio.
>>
>> - Make sure to not accept packet beyond sk_rcvbuf limit.
>>
>> This series includes three associated packetdrill tests.
> 
> I suspect this series is causing pktdrill failures for the
> tcp_rcv_big_endseq.pkt test case:

(Note that this series introduces this new pktdrill test)

(...)
> the event is happening _before_ the expected time, I guess it's more a
> functional issue than a timing one.
> 
> I also suspect this series is causing flakes in mptcp tests, i.e.:

(...)

> @Matttbe: can you reproduce the flakes locally? if so, does reverting
> that series stop them? (not that I'm planning a revert, just to validate
> my guess).

I'm trying to reproduce this locally on top of net-next, no luck so far.
I will also continue to monitor the MPTCP CI.

For the moment, I don't think it is linked to this series: NIPA has been
validating it since the 11th, and the issues only appeared last night.
Plus, I recently added new MPTCP selftests running these tests in 3
additional modes. If this flake was present for a long time, it might be
more visible today.

Finally, because the failure is due to a poll timeout, and other
unrelated tests failed at that time too, could it be due to
overloaded test machines?

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.



* Re: [PATCH net-next 0/8] tcp: receiver changes
  2025-07-15  9:21   ` Matthieu Baerts
@ 2025-07-15 10:14     ` Paolo Abeni
  2025-07-15 10:40       ` Matthieu Baerts
  2025-07-15 13:28       ` Jakub Kicinski
  0 siblings, 2 replies; 29+ messages in thread
From: Paolo Abeni @ 2025-07-15 10:14 UTC (permalink / raw)
  To: Matthieu Baerts, Eric Dumazet, Neal Cardwell
  Cc: Simon Horman, Kuniyuki Iwashima, Willem de Bruijn, netdev,
	eric.dumazet, David S . Miller, Jakub Kicinski

On 7/15/25 11:21 AM, Matthieu Baerts wrote:
> On 15/07/2025 10:25, Paolo Abeni wrote:
>> @Matttbe: can you reproduce the flakes locally? if so, does reverting
>> that series stop them? (not that I'm planning a revert, just to validate
>> my guess).
> 
> I'm trying to reproduce this locally on top of net-next, no luck so far.
> I will also continue to monitor the MPTCP CI.
> 
> For the moment, I don't think it might be linked to this series: 

Agreed. I did not notice the pending mptcp patches, which are a more
relevant suspect here.

> Eventually, because the failure is due to a poll timed out, and other
> unrelated tests have failed at that time too, could it be due to
> overloaded test machines?

Not for a 60s timeout, I guess :-P

/P




* Re: [PATCH net-next 0/8] tcp: receiver changes
  2025-07-15 10:14     ` Paolo Abeni
@ 2025-07-15 10:40       ` Matthieu Baerts
  2025-07-15 13:28       ` Jakub Kicinski
  1 sibling, 0 replies; 29+ messages in thread
From: Matthieu Baerts @ 2025-07-15 10:40 UTC (permalink / raw)
  To: Paolo Abeni, Eric Dumazet, Neal Cardwell
  Cc: Simon Horman, Kuniyuki Iwashima, Willem de Bruijn, netdev,
	eric.dumazet, David S . Miller, Jakub Kicinski

Hi Paolo,

On 15/07/2025 12:14, Paolo Abeni wrote:
> On 7/15/25 11:21 AM, Matthieu Baerts wrote:
>> On 15/07/2025 10:25, Paolo Abeni wrote:
>>> @Matttbe: can you reproduce the flakes locally? if so, does reverting
>>> that series stop them? (not that I'm planning a revert, just to validate
>>> my guess).
>>
>> I'm trying to reproduce this locally on top of net-next, no luck so far.
>> I will also continue to monitor the MPTCP CI.
>>
>> For the moment, I don't think it might be linked to this series: 
> 
> Agreed. I did not notice the pending mptcp patches, which are a more
> relevant suspect here.
> 
>> Eventually, because the failure is due to a poll timed out, and other
>> unrelated tests have failed at that time too, could it be due to
>> overloaded test machines?
> 
> Not for a 60s timeout, I guess :-P

:)

The poll timeout is set to 10s I think. But yes, it is still too long to
be caused by an overloaded test machine I suppose.

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.



* Re: [PATCH net-next 0/8] tcp: receiver changes
  2025-07-15 10:14     ` Paolo Abeni
  2025-07-15 10:40       ` Matthieu Baerts
@ 2025-07-15 13:28       ` Jakub Kicinski
  2025-07-15 13:33         ` Jakub Kicinski
  2025-07-15 13:50         ` Paolo Abeni
  1 sibling, 2 replies; 29+ messages in thread
From: Jakub Kicinski @ 2025-07-15 13:28 UTC (permalink / raw)
  To: Paolo Abeni, Neal Cardwell
  Cc: Matthieu Baerts, Eric Dumazet, Simon Horman, Kuniyuki Iwashima,
	Willem de Bruijn, netdev, eric.dumazet, David S . Miller

On Tue, 15 Jul 2025 12:14:34 +0200 Paolo Abeni wrote:
> > Eventually, because the failure is due to a poll timed out, and other
> > unrelated tests have failed at that time too, could it be due to
> > overloaded test machines?  
> 
> Not for a 60s timeout, I guess :-P

I think the timeout may be packetdrill-version related.
I tried with the Fedora packetdrill and the test times out.
With packetdrill built from source on my laptop I get:

# (null):17: error handling packet: timing error: expected outbound packet at 0.074144 sec but happened at -1752585909.757339 sec; tolerance 0.004000 sec
# script packet:  0.074144 S. 0:0(0) ack 1 <mss 1460,nop,wscale 0>
# actual packet: -1752585909.757339 S.0 0:0(0) ack 1 <mss 1460,nop,wscale 0>

:o

But the CI just gets the failure Paolo quoted.

I'm leaning towards Eric using a different packetdrill, and/or this
being packetdrill / compiler related. On Fedora I'm hitting this build
failure which may explain why the distro hasn't updated recently:

cc -g -Wall -Werror   -c -o code.o code.c
In file included from code.h:29,
                 from code.c:26:
types.h:64:12: error: two or more data types in declaration specifiers
   64 | typedef u8 bool;
      |            ^~~~
types.h:64:1: error: useless type name in empty declaration [-Werror]
   64 | typedef u8 bool;
      | ^~~~~~~
types.h:66:9: error: cannot use keyword ‘false’ as enumeration constant
   66 |         false = 0,
      |         ^~~~~
types.h:66:9: note: ‘false’ is a keyword with ‘-std=c23’ onwards
cc1: all warnings being treated as errors
make: *** [<builtin>: code.o] Error 1


Neal?
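
For context, the errors above come from a header that defines its own bool and false, which the compiler note says are keywords from -std=c23 onwards. A minimal sketch of a header fragment that avoids the clash is shown here; it is an assumption about one possible approach, not the actual packetdrill fix, and is_set() is a hypothetical helper added only to exercise the types.

```c
#include <stdbool.h>  /* provides bool/true/false before C23; harmless after */
#include <stdint.h>

typedef uint8_t u8;

/* Hypothetical helper: report whether a bit is set in a flags byte. */
static bool is_set(u8 flags, u8 bit)
{
	return (flags & bit) != 0;
}
```

Relying on <stdbool.h> instead of a private typedef keeps the same source building whether or not the compiler treats bool, true and false as keywords.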


* Re: [PATCH net-next 0/8] tcp: receiver changes
  2025-07-15 13:28       ` Jakub Kicinski
@ 2025-07-15 13:33         ` Jakub Kicinski
  2025-07-15 13:52           ` Paolo Abeni
  2025-07-15 14:48           ` Kuniyuki Iwashima
  2025-07-15 13:50         ` Paolo Abeni
  1 sibling, 2 replies; 29+ messages in thread
From: Jakub Kicinski @ 2025-07-15 13:33 UTC (permalink / raw)
  To: Paolo Abeni, Neal Cardwell
  Cc: Matthieu Baerts, Eric Dumazet, Simon Horman, Kuniyuki Iwashima,
	Willem de Bruijn, netdev, eric.dumazet, David S . Miller

On Tue, 15 Jul 2025 06:28:29 -0700 Jakub Kicinski wrote:
> # (null):17: error handling packet: timing error: expected outbound packet at 0.074144 sec but happened at -1752585909.757339 sec; tolerance 0.004000 sec
> # script packet:  0.074144 S. 0:0(0) ack 1 <mss 1460,nop,wscale 0>
> # actual packet: -1752585909.757339 S.0 0:0(0) ack 1 <mss 1460,nop,wscale 0>

This is definitely compiler related, I rebuilt with clang and the build
error goes away. Now I get a more sane failure:

# tcp_rcv_big_endseq.pkt:41: error handling packet: timing error: expected outbound packet at 1.230105 sec but happened at 1.190101 sec; tolerance 0.005046 sec
# script packet:  1.230105 . 1:1(0) ack 54001 win 0 
# actual packet:  1.190101 . 1:1(0) ack 54001 win 0 

$ gcc --version
gcc (GCC) 15.1.1 20250521 (Red Hat 15.1.1-2)

I don't understand why the ack is supposed to be delayed, should we
just do this? (I think Eric is OOO, FWIW)

diff --git a/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt b/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt
index 7e170b94fd36..3848b419e68c 100644
--- a/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt
+++ b/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt
@@ -38,7 +38,7 @@
 
 // If queue is empty, accept a packet even if its end_seq is above wup + rcv_wnd
   +0 < P. 4001:54001(50000) ack 1 win 257
-  +.040 > .  1:1(0) ack 54001 win 0
+  +0 > .  1:1(0) ack 54001 win 0
 
 // Check LINUX_MIB_BEYOND_WINDOW has been incremented 3 times.
 +0 `nstat | grep TcpExtBeyondWindow | grep -q " 3 "`
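
The test comment in the hunk above compares a packet's end_seq against wup + rcv_wnd. A wraparound-safe version of that comparison can be sketched as follows. This is a simplified model under stated assumptions: beyond_window() and its parameters are hypothetical names, and the empty-queue exception is taken from the test comment rather than from the kernel source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe "seq1 is after seq2" for 32-bit TCP sequence numbers. */
static bool seq_after(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq2 - seq1) < 0;
}

/* Hypothetical helper: true when the packet ends beyond the advertised
 * window. An empty receive queue is exempt, per the test comment.
 */
static bool beyond_window(uint32_t end_seq, uint32_t wup, uint32_t rcv_wnd,
			  bool queue_empty)
{
	if (queue_empty)
		return false;
	return seq_after(end_seq, wup + rcv_wnd);
}
```

The signed cast is what keeps the comparison correct when sequence numbers wrap past 2^32.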


* Re: [PATCH net-next 0/8] tcp: receiver changes
  2025-07-15 13:28       ` Jakub Kicinski
  2025-07-15 13:33         ` Jakub Kicinski
@ 2025-07-15 13:50         ` Paolo Abeni
  1 sibling, 0 replies; 29+ messages in thread
From: Paolo Abeni @ 2025-07-15 13:50 UTC (permalink / raw)
  To: Jakub Kicinski, Neal Cardwell
  Cc: Matthieu Baerts, Eric Dumazet, Simon Horman, Kuniyuki Iwashima,
	Willem de Bruijn, netdev, eric.dumazet, David S . Miller

On 7/15/25 3:28 PM, Jakub Kicinski wrote:
> On Tue, 15 Jul 2025 12:14:34 +0200 Paolo Abeni wrote:
>>> Eventually, because the failure is due to a poll timed out, and other
>>> unrelated tests have failed at that time too, could it be due to
>>> overloaded test machines?  
>>
>> Not for a 60s timeout, I guess :-P

FTR, the above referred to the mptcp selftest failure/timeout.

/P



* Re: [PATCH net-next 0/8] tcp: receiver changes
  2025-07-15 13:33         ` Jakub Kicinski
@ 2025-07-15 13:52           ` Paolo Abeni
  2025-07-15 14:54             ` Jakub Kicinski
  2025-07-15 14:48           ` Kuniyuki Iwashima
  1 sibling, 1 reply; 29+ messages in thread
From: Paolo Abeni @ 2025-07-15 13:52 UTC (permalink / raw)
  To: Jakub Kicinski, Neal Cardwell
  Cc: Matthieu Baerts, Eric Dumazet, Simon Horman, Kuniyuki Iwashima,
	Willem de Bruijn, netdev, eric.dumazet, David S . Miller

On 7/15/25 3:33 PM, Jakub Kicinski wrote:
> On Tue, 15 Jul 2025 06:28:29 -0700 Jakub Kicinski wrote:
>> # (null):17: error handling packet: timing error: expected outbound packet at 0.074144 sec but happened at -1752585909.757339 sec; tolerance 0.004000 sec
>> # script packet:  0.074144 S. 0:0(0) ack 1 <mss 1460,nop,wscale 0>
>> # actual packet: -1752585909.757339 S.0 0:0(0) ack 1 <mss 1460,nop,wscale 0>
> 
> This is definitely compiler related, I rebuilt with clang and the build
> error goes away. Now I get a more sane failure:
> 
> # tcp_rcv_big_endseq.pkt:41: error handling packet: timing error: expected outbound packet at 1.230105 sec but happened at 1.190101 sec; tolerance 0.005046 sec
> # script packet:  1.230105 . 1:1(0) ack 54001 win 0 
> # actual packet:  1.190101 . 1:1(0) ack 54001 win 0 
> 
> $ gcc --version
> gcc (GCC) 15.1.1 20250521 (Red Hat 15.1.1-2)
> 
> I don't understand why the ack is supposed to be delayed, should we
> just do this? (I think Eric is OOO, FWIW)
> 
> diff --git a/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt b/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt
> index 7e170b94fd36..3848b419e68c 100644
> --- a/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt
> +++ b/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt
> @@ -38,7 +38,7 @@
>  
>  // If queue is empty, accept a packet even if its end_seq is above wup + rcv_wnd
>    +0 < P. 4001:54001(50000) ack 1 win 257
> -  +.040 > .  1:1(0) ack 54001 win 0
> +  +0 > .  1:1(0) ack 54001 win 0
>  
>  // Check LINUX_MIB_BEYOND_WINDOW has been incremented 3 times.
>  +0 `nstat | grep TcpExtBeyondWindow | grep -q " 3 "`

The above looks sane to me, but a Neal or Willem ack would be appreciated.

/P






* Re: [PATCH net-next 0/8] tcp: receiver changes
  2025-07-15 13:33         ` Jakub Kicinski
  2025-07-15 13:52           ` Paolo Abeni
@ 2025-07-15 14:48           ` Kuniyuki Iwashima
  1 sibling, 0 replies; 29+ messages in thread
From: Kuniyuki Iwashima @ 2025-07-15 14:48 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Paolo Abeni, Neal Cardwell, Matthieu Baerts, Eric Dumazet,
	Simon Horman, Willem de Bruijn, netdev, eric.dumazet,
	David S . Miller

On Tue, Jul 15, 2025 at 6:33 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue, 15 Jul 2025 06:28:29 -0700 Jakub Kicinski wrote:
> > # (null):17: error handling packet: timing error: expected outbound packet at 0.074144 sec but happened at -1752585909.757339 sec; tolerance 0.004000 sec
> > # script packet:  0.074144 S. 0:0(0) ack 1 <mss 1460,nop,wscale 0>
> > # actual packet: -1752585909.757339 S.0 0:0(0) ack 1 <mss 1460,nop,wscale 0>
>
> This is definitely compiler related, I rebuilt with clang and the build
> error goes away. Now I get a more sane failure:
>
> # tcp_rcv_big_endseq.pkt:41: error handling packet: timing error: expected outbound packet at 1.230105 sec but happened at 1.190101 sec; tolerance 0.005046 sec
> # script packet:  1.230105 . 1:1(0) ack 54001 win 0
> # actual packet:  1.190101 . 1:1(0) ack 54001 win 0
>
> $ gcc --version
> gcc (GCC) 15.1.1 20250521 (Red Hat 15.1.1-2)
>
> I don't understand why the ack is supposed to be delayed, should we
> just do this? (I think Eric is OOO, FWIW)
>
> diff --git a/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt b/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt
> index 7e170b94fd36..3848b419e68c 100644
> --- a/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt
> +++ b/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt
> @@ -38,7 +38,7 @@
>
>  // If queue is empty, accept a packet even if its end_seq is above wup + rcv_wnd
>    +0 < P. 4001:54001(50000) ack 1 win 257
> -  +.040 > .  1:1(0) ack 54001 win 0
> +  +0 > .  1:1(0) ack 54001 win 0
>
>  // Check LINUX_MIB_BEYOND_WINDOW has been incremented 3 times.
>  +0 `nstat | grep TcpExtBeyondWindow | grep -q " 3 "`

I remember I didn't see this error just after the commit that added the test,
and now I see the failure after commit 1d2fbaad7cd8c ("tcp: stronger
sk_rcvbuf checks").

[root@fedora packetdrill]# uname -r
6.16.0-rc5-01431-g75dff0584cce
[root@fedora packetdrill]# ./ksft_runner.sh tcp_rcv_big_endseq.pkt
TAP version 13
1..2
ok 1 ipv4
ok 2 ipv6
# Totals: pass:2 fail:0 xfail:0 xpass:0 skip:0 error:0

[root@fedora packetdrill]# uname -r
6.16.0-rc5-01432-g1d2fbaad7cd8
[root@fedora packetdrill]# ./ksft_runner.sh tcp_rcv_big_endseq.pkt
TAP version 13
1..2
tcp_rcv_big_endseq.pkt:41: error handling packet: timing error: expected outbound packet at 1.148682 sec but happened at 1.108681 sec; tolerance 0.005005 sec
script packet:  1.148682 . 1:1(0) ack 54001 win 0
actual packet:  1.108681 . 1:1(0) ack 54001 win 0
not ok 1 ipv4
tcp_rcv_big_endseq.pkt:41: error handling packet: timing error: expected outbound packet at 1.146130 sec but happened at 1.106130 sec; tolerance 0.005005 sec
script packet:  1.146130 . 1:1(0) ack 54001 win 0
actual packet:  1.106130 . 1:1(0) ack 54001 win 0
not ok 2 ipv6
# Totals: pass:0 fail:2 xfail:0 xpass:0 skip:0 error:0


On 75dff0584cce, the test failed if I removed the delay.
I haven't checked where it comes from, but probably that's
why Eric added the delay?

[root@fedora packetdrill]# ./ksft_runner.sh tcp_rcv_big_endseq.pkt
TAP version 13
1..2
tcp_rcv_big_endseq.pkt:41: error handling packet: timing error: expected outbound packet at 1.105941 sec but happened at 1.146774 sec; tolerance 0.004000 sec
script packet:  1.105941 . 1:1(0) ack 54001 win 0
actual packet:  1.146774 . 1:1(0) ack 54001 win 0
not ok 1 ipv4
tcp_rcv_big_endseq.pkt:41: error handling packet: timing error: expected outbound packet at 1.106215 sec but happened at 1.146815 sec; tolerance 0.004000 sec
script packet:  1.106215 . 1:1(0) ack 54001 win 0
actual packet:  1.146815 . 1:1(0) ack 54001 win 0

not ok 2 ipv6
# Totals: pass:0 fail:2 xfail:0 xpass:0 skip:0 error:0


* Re: [PATCH net-next 0/8] tcp: receiver changes
  2025-07-15 13:52           ` Paolo Abeni
@ 2025-07-15 14:54             ` Jakub Kicinski
  0 siblings, 0 replies; 29+ messages in thread
From: Jakub Kicinski @ 2025-07-15 14:54 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Neal Cardwell, Matthieu Baerts, Eric Dumazet, Simon Horman,
	Kuniyuki Iwashima, Willem de Bruijn, netdev, eric.dumazet,
	David S . Miller

On Tue, 15 Jul 2025 15:52:33 +0200 Paolo Abeni wrote:
> On 7/15/25 3:33 PM, Jakub Kicinski wrote:
> > On Tue, 15 Jul 2025 06:28:29 -0700 Jakub Kicinski wrote:  
> >> # (null):17: error handling packet: timing error: expected outbound packet at 0.074144 sec but happened at -1752585909.757339 sec; tolerance 0.004000 sec
> >> # script packet:  0.074144 S. 0:0(0) ack 1 <mss 1460,nop,wscale 0>
> >> # actual packet: -1752585909.757339 S.0 0:0(0) ack 1 <mss 1460,nop,wscale 0>  
> > 
> > This is definitely compiler related, I rebuilt with clang and the build
> > error goes away. Now I get a more sane failure:
> > 
> > # tcp_rcv_big_endseq.pkt:41: error handling packet: timing error: expected outbound packet at 1.230105 sec but happened at 1.190101 sec; tolerance 0.005046 sec
> > # script packet:  1.230105 . 1:1(0) ack 54001 win 0 
> > # actual packet:  1.190101 . 1:1(0) ack 54001 win 0 
> > 
> > $ gcc --version
> > gcc (GCC) 15.1.1 20250521 (Red Hat 15.1.1-2)
> > 
> > I don't understand why the ack is supposed to be delayed, should we
> > just do this? (I think Eric is OOO, FWIW)
> > 
> > diff --git a/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt b/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt
> > index 7e170b94fd36..3848b419e68c 100644
> > --- a/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt
> > +++ b/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt
> > @@ -38,7 +38,7 @@
> >  
> >  // If queue is empty, accept a packet even if its end_seq is above wup + rcv_wnd
> >    +0 < P. 4001:54001(50000) ack 1 win 257
> > -  +.040 > .  1:1(0) ack 54001 win 0
> > +  +0 > .  1:1(0) ack 54001 win 0
> >  
> >  // Check LINUX_MIB_BEYOND_WINDOW has been incremented 3 times.
> >  +0 `nstat | grep TcpExtBeyondWindow | grep -q " 3 "`  
> 
> The above looks sane to me, but a Neal or Willem ack would be appreciated.

Posted officially here to get it queued to the CI already:
https://lore.kernel.org/all/20250715142849.959444-1-kuba@kernel.org/



Thread overview: 29+ messages
2025-07-11 11:39 [PATCH net-next 0/8] tcp: receiver changes Eric Dumazet
2025-07-11 11:39 ` [PATCH net-next 1/8] tcp: do not accept packets beyond window Eric Dumazet
2025-07-12 20:52   ` Kuniyuki Iwashima
2025-07-15  1:38   ` Jakub Kicinski
2025-07-11 11:40 ` [PATCH net-next 2/8] tcp: add LINUX_MIB_BEYOND_WINDOW Eric Dumazet
2025-07-12 20:55   ` Kuniyuki Iwashima
2025-07-11 11:40 ` [PATCH net-next 3/8] selftests/net: packetdrill: add tcp_rcv_big_endseq.pkt Eric Dumazet
2025-07-12 20:58   ` Kuniyuki Iwashima
2025-07-11 11:40 ` [PATCH net-next 4/8] tcp: call tcp_measure_rcv_mss() for ooo packets Eric Dumazet
2025-07-12 21:11   ` Kuniyuki Iwashima
2025-07-11 11:40 ` [PATCH net-next 5/8] selftests/net: packetdrill: add tcp_ooo_rcv_mss.pkt Eric Dumazet
2025-07-12 21:42   ` Kuniyuki Iwashima
2025-07-11 11:40 ` [PATCH net-next 6/8] tcp: add const to tcp_try_rmem_schedule() and sk_rmem_schedule() skb Eric Dumazet
2025-07-12 21:43   ` Kuniyuki Iwashima
2025-07-11 11:40 ` [PATCH net-next 7/8] tcp: stronger sk_rcvbuf checks Eric Dumazet
2025-07-12 21:54   ` Kuniyuki Iwashima
2025-07-11 11:40 ` [PATCH net-next 8/8] selftests/net: packetdrill: add tcp_rcv_toobig.pkt Eric Dumazet
2025-07-12 21:57   ` Kuniyuki Iwashima
2025-07-15  2:20 ` [PATCH net-next 0/8] tcp: receiver changes patchwork-bot+netdevbpf
2025-07-15  8:25 ` Paolo Abeni
2025-07-15  9:21   ` Matthieu Baerts
2025-07-15 10:14     ` Paolo Abeni
2025-07-15 10:40       ` Matthieu Baerts
2025-07-15 13:28       ` Jakub Kicinski
2025-07-15 13:33         ` Jakub Kicinski
2025-07-15 13:52           ` Paolo Abeni
2025-07-15 14:54             ` Jakub Kicinski
2025-07-15 14:48           ` Kuniyuki Iwashima
2025-07-15 13:50         ` Paolo Abeni
