* [PATCH net-next] selftests: drv-net: gro: signal over-coalescing more reliably
@ 2026-06-07  0:24 Jakub Kicinski
From: Jakub Kicinski @ 2026-06-07  0:24 UTC
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, Jakub Kicinski,
	shuah, willemb, linux-kselftest

The GRO test is very timing-sensitive: packets may be delayed
by the network or just sent slowly. Because of this we retry
each test case up to 6 times.

This makes perfect sense for positive cases, in which we want
to see coalescing. Negative test cases, which modify headers
and expect no coalescing, should get the opposite treatment.
We should really try 6 times and make sure that no attempt
coalesced the packets. This would, however, require that we
annotate each test to indicate whether it's positive or negative.
Let's start with a simpler improvement. Do not allow
retries if we detected over-coalescing. Previously the negative
case would have to get lucky at least once in 6 tries to pass.
Now the first over-coalescing failure breaks the retry loop.
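
The retry policy described above can be sketched roughly as follows.
This is an illustration only, not the code in gro.py: run_once and
run_with_retries are hypothetical names, and the sketch assumes the
receiver exits with 0 on success, 42 on over-coalescing and some other
non-zero code for failures that may be timing- or loss-related.

```python
EXIT_OVER_COALESCE = 42  # same value the receiver uses in this patch
MAX_RETRY = 6            # retry budget mentioned in the commit message


def run_with_retries(run_once, max_retry=MAX_RETRY):
    """Run one GRO case and return (passed, attempts_used).

    run_once is a hypothetical callable that performs a single attempt
    and returns the receiver's exit code.
    """
    for attempt in range(1, max_retry + 1):
        ret = run_once()
        if ret == 0:
            return True, attempt
        if ret == EXIT_OVER_COALESCE:
            # Hard failure: another attempt could only hide it.
            return False, attempt
        # Any other failure may be timing- or loss-related, so retry.
    return False, max_retry


# The second attempt over-coalesces, so the third one is never run.
codes = iter([1, EXIT_OVER_COALESCE, 0])
print(run_with_retries(lambda: next(codes)))  # (False, 2)
```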

For background - NICs tend to ignore the contents of the TCP
timestamp option, so that test case commonly fails. In NIPA,
having 6 attempts was nevertheless enough for some NICs to get
multiple successful runs in a row, getting the test cases
auto-classified as expected to pass, even though the NIC does
not comply with the expectations.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
CC: shuah@kernel.org
CC: willemb@google.com
CC: linux-kselftest@vger.kernel.org
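
For reviewers, a rough Python model of the receiver-side check added to
check_recv_pkts() below. It is a simplified sketch, not the C code:
classify() is a hypothetical helper, payload sizes are plain lists of
byte counts, and returning 1 for a payload size mismatch is an
assumption (the patch only shows exit code 1 for a wrong packet count).

```python
EXIT_OVER_COALESCE = 42  # value defined by this patch in gro.c


def classify(expected, received):
    """Map expected vs. received payload sizes to an exit code.

    Both arguments are lists of per-packet payload sizes in bytes.
    """
    if len(received) < len(expected) and sum(received) == sum(expected):
        # All bytes arrived, but in fewer packets: something merged
        # segments that should have stayed separate.
        return EXIT_OVER_COALESCE
    if len(received) != len(expected):
        # Wrong packet count; could be loss or under-coalescing.
        return 1
    if any(e != r for e, r in zip(expected, received)):
        # Right count, wrong sizes (assumed to be a generic failure).
        return 1
    return 0


print(classify([100, 100], [200]))       # 42: merged despite header change
print(classify([100, 100], [100]))       # 1: a packet went missing
print(classify([100, 100], [100, 100]))  # 0: as expected
```

Comparing byte totals is what separates over-coalescing from packet
loss: in both cases fewer packets arrive, but only in the former do
all the bytes show up.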
---
 tools/testing/selftests/net/lib/gro.c      | 16 +++++++++++++++-
 tools/testing/selftests/drivers/net/gro.py |  8 +++++++-
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/net/lib/gro.c b/tools/testing/selftests/net/lib/gro.c
index fa35dfc8e790..7a333155de1a 100644
--- a/tools/testing/selftests/net/lib/gro.c
+++ b/tools/testing/selftests/net/lib/gro.c
@@ -108,6 +108,8 @@
 #define EXT_PAYLOAD_1 "\x00\x00\x00\x00\x00\x00"
 #define EXT_PAYLOAD_2 "\x11\x11\x11\x11\x11\x11"
 
+#define EXIT_OVER_COALESCE	42
+
 #define ipv6_optlen(p)  (((p)->hdrlen+1) << 3) /* calculate IPv6 extension header len */
 #define BUILD_BUG_ON(condition) ((void)sizeof(char[1 - 2*!!(condition)]))
 
@@ -1165,6 +1167,8 @@ static void check_recv_pkts(int fd, int *correct_payload,
 	struct ipv6hdr *ip6h = (struct ipv6hdr *)(buffer + nhoff);
 	struct tcphdr *tcph;
 	bool bad_packet = false;
+	int bytes_expected = 0;
+	int bytes_received = 0;
 	int tcp_ext_len = 0;
 	int ip_ext_len = 0;
 	int pkt_size = -1;
@@ -1173,8 +1177,10 @@ static void check_recv_pkts(int fd, int *correct_payload,
 	int i;
 
 	vlog("Expected {");
-	for (i = 0; i < correct_num_pkts; i++)
+	for (i = 0; i < correct_num_pkts; i++) {
 		vlog("%d ", correct_payload[i]);
+		bytes_expected += correct_payload[i];
+	}
 	vlog("}, Total %d packets\nReceived {", correct_num_pkts);
 
 	while (1) {
@@ -1209,9 +1215,17 @@ static void check_recv_pkts(int fd, int *correct_payload,
 			vlog("[!=%d]", correct_payload[num_pkt]);
 			bad_packet = true;
 		}
+		bytes_received += data_len;
 		num_pkt++;
 	}
 	vlog("}, Total %d packets.\n", num_pkt);
+	/* Signal over-coalescing explicitly, it's a hard failure, unlike
+	 * under-coalescing which could be timing- or loss-related.
+	 */
+	if (num_pkt < correct_num_pkts && bytes_received == bytes_expected)
+		error(EXIT_OVER_COALESCE, 0,
+		      "over-coalesced: got %d pkts vs expected %d (%d B)",
+		      num_pkt, correct_num_pkts, bytes_received);
 	if (num_pkt != correct_num_pkts)
 		error(1, 0, "incorrect number of packets");
 	if (bad_packet)
diff --git a/tools/testing/selftests/drivers/net/gro.py b/tools/testing/selftests/drivers/net/gro.py
index fd158c775b1c..6ab8c97880d1 100755
--- a/tools/testing/selftests/drivers/net/gro.py
+++ b/tools/testing/selftests/drivers/net/gro.py
@@ -40,7 +40,7 @@ import glob
 import os
 import re
 from lib.py import ksft_run, ksft_exit, ksft_pr
-from lib.py import NetDrvEpEnv, KsftXfailEx
+from lib.py import NetDrvEpEnv, KsftFailEx, KsftXfailEx
 from lib.py import NetdevFamily, EthtoolFamily
 from lib.py import bkg, cmd, defer, ethtool, ip
 from lib.py import ksft_variants, KsftNamedVariant
@@ -370,6 +370,12 @@ def _run_gro_bin(cfg, test_name, protocol=None, num_flows=None,
 
         ksft_pr(rx_proc)
 
+        # ret==42 means the receiver detected over-coalescing.
+        # This is unambiguous proof of a bug, retries can only cause
+        # false negatives.
+        if rx_proc.ret == 42:
+            raise KsftFailEx(f"GRO over-coalesced in {protocol}/{test_name}")
+
         if test_name.startswith("large_") and os.environ.get("KSFT_MACHINE_SLOW"):
             ksft_pr(f"Ignoring {protocol}/{test_name} failure due to slow environment")
             return
-- 
2.54.0

