public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next 0/9] net: stats, tools, driver tests for HW GRO
@ 2026-02-05 22:05 Jakub Kicinski
  2026-02-05 22:05 ` [PATCH net-next 1/9] eth: bnxt: gather and report HW-GRO stats Jakub Kicinski
                   ` (9 more replies)
  0 siblings, 10 replies; 18+ messages in thread
From: Jakub Kicinski @ 2026-02-05 22:05 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, shuah, willemb,
	petrm, donald.hunter, michael.chan, pavan.chebbi, linux-kselftest,
	Jakub Kicinski

Add miscellaneous pieces related to production use of HW-GRO:
 - report standard stats from drivers (bnxt included here,
   Gal recently posted patches for mlx5 which is great)
 - CLI tool for calculating HW GRO savings / effectiveness
 - tests for the stats, packet ordering and depth

Jakub Kicinski (9):
  eth: bnxt: gather and report HW-GRO stats
  tools: ynltool: factor out qstat dumping
  tools: ynltool: add qstats analysis for HW-GRO efficiency / savings
  selftests: net: move gro to lib for HW vs SW reuse
  selftests: drv-net: give HW stats sync time extra 25% of margin
  selftests: drv-net: gro: use SO_TXTIME to schedule packets together
  selftests: drv-net: gro: test GRO stats
  selftests: drv-net: gro: add test for packet ordering
  selftests: drv-net: gro: add a test for HW-GRO depth

 tools/testing/selftests/drivers/net/Makefile  |   1 -
 .../testing/selftests/drivers/net/hw/Makefile |   1 +
 tools/testing/selftests/net/lib/Makefile      |   1 +
 drivers/net/ethernet/broadcom/bnxt/bnxt.h     |   4 +
 drivers/net/ethernet/broadcom/bnxt/bnxt.c     |  17 +-
 tools/net/ynl/ynltool/qstats.c                | 170 +++++---
 .../selftests/{drivers/net => net/lib}/gro.c  | 242 +++++++++++-
 .../testing/selftests/drivers/net/.gitignore  |   1 -
 tools/testing/selftests/drivers/net/gro.py    |   2 +-
 .../selftests/drivers/net/hw/gro_hw.py        | 374 ++++++++++++++++++
 .../selftests/drivers/net/lib/py/env.py       |   4 +-
 tools/testing/selftests/net/lib/.gitignore    |   1 +
 12 files changed, 748 insertions(+), 70 deletions(-)
 rename tools/testing/selftests/{drivers/net => net/lib}/gro.c (87%)
 create mode 100755 tools/testing/selftests/drivers/net/hw/gro_hw.py

-- 
2.53.0


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH net-next 1/9] eth: bnxt: gather and report HW-GRO stats
  2026-02-05 22:05 [PATCH net-next 0/9] net: stats, tools, driver tests for HW GRO Jakub Kicinski
@ 2026-02-05 22:05 ` Jakub Kicinski
  2026-02-05 22:44   ` Michael Chan
  2026-02-05 22:05 ` [PATCH net-next 2/9] tools: ynltool: factor out qstat dumping Jakub Kicinski
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2026-02-05 22:05 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, shuah, willemb,
	petrm, donald.hunter, michael.chan, pavan.chebbi, linux-kselftest,
	Jakub Kicinski

Count and report HW-GRO stats as seen by the kernel.
The device stats for GRO do not seem to reflect reality;
perhaps they count sessions which did not actually result
in any aggregation. They also count wire packets, so we
have to count super-frames ourselves anyway.
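
The accounting added here can be modeled roughly as follows (an
illustrative Python sketch, not part of the patch; the dict keys
mirror the new bnxt_rx_sw_stats fields):

```python
def count_tpa_completion(stats, segs, hw_gro_enabled=True):
    """Rough model of the bnxt_gro_skb() accounting: a TPA
    completion that aggregated `segs` wire packets bumps the
    super-frame counter once and the wire counter by `segs`;
    single-segment completions are not counted as HW-GRO."""
    if segs == 1:
        return
    if hw_gro_enabled:
        stats["rx_hw_gro_packets"] += 1
        stats["rx_hw_gro_wire_packets"] += segs

stats = {"rx_hw_gro_packets": 0, "rx_hw_gro_wire_packets": 0}
for segs in (1, 4, 16):
    count_tpa_completion(stats, segs)
print(stats)  # {'rx_hw_gro_packets': 2, 'rx_hw_gro_wire_packets': 20}
```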

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  4 ++++
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 17 +++++++++++++++--
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index f036ef60230b..b2efdbdd1356 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1128,6 +1128,8 @@ struct bnxt_rx_sw_stats {
 	u64			rx_buf_errors;
 	u64			rx_oom_discards;
 	u64			rx_netpoll_discards;
+	u64			rx_hw_gro_packets;
+	u64			rx_hw_gro_wire_packets;
 };
 
 struct bnxt_tx_sw_stats {
@@ -1151,6 +1153,8 @@ struct bnxt_total_ring_err_stats {
 	u64			rx_total_oom_discards;
 	u64			rx_total_netpoll_discards;
 	u64			rx_total_ring_discards;
+	u64			rx_total_hw_gro_packets;
+	u64			rx_total_hw_gro_wire_packets;
 	u64			tx_total_resets;
 	u64			tx_total_ring_discards;
 	u64			total_missed_irqs;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 466e0fc6141f..4a4145f138f9 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -1804,7 +1804,8 @@ static inline struct sk_buff *bnxt_gro_skb(struct bnxt *bp,
 					   struct bnxt_tpa_info *tpa_info,
 					   struct rx_tpa_end_cmp *tpa_end,
 					   struct rx_tpa_end_cmp_ext *tpa_end1,
-					   struct sk_buff *skb)
+					   struct sk_buff *skb,
+					   struct bnxt_rx_sw_stats *rx_stats)
 {
 #ifdef CONFIG_INET
 	int payload_off;
@@ -1814,6 +1815,11 @@ static inline struct sk_buff *bnxt_gro_skb(struct bnxt *bp,
 	if (segs == 1)
 		return skb;
 
+	if (bp->dev->features & NETIF_F_GRO_HW) {
+		rx_stats->rx_hw_gro_packets++;
+		rx_stats->rx_hw_gro_wire_packets += segs;
+	}
+
 	NAPI_GRO_CB(skb)->count = segs;
 	skb_shinfo(skb)->gso_size =
 		le32_to_cpu(tpa_end1->rx_tpa_end_cmp_seg_len);
@@ -1987,7 +1993,8 @@ static inline struct sk_buff *bnxt_tpa_end(struct bnxt *bp,
 	}
 
 	if (gro)
-		skb = bnxt_gro_skb(bp, tpa_info, tpa_end, tpa_end1, skb);
+		skb = bnxt_gro_skb(bp, tpa_info, tpa_end, tpa_end1, skb,
+				   &cpr->sw_stats->rx);
 
 	return skb;
 }
@@ -13492,6 +13499,8 @@ static void bnxt_get_one_ring_err_stats(struct bnxt *bp,
 	stats->rx_total_netpoll_discards += sw_stats->rx.rx_netpoll_discards;
 	stats->rx_total_ring_discards +=
 		BNXT_GET_RING_STATS64(hw_stats, rx_discard_pkts);
+	stats->rx_total_hw_gro_packets += sw_stats->rx.rx_hw_gro_packets;
+	stats->rx_total_hw_gro_wire_packets += sw_stats->rx.rx_hw_gro_wire_packets;
 	stats->tx_total_resets += sw_stats->tx.tx_resets;
 	stats->tx_total_ring_discards +=
 		BNXT_GET_RING_STATS64(hw_stats, tx_discard_pkts);
@@ -15931,6 +15940,8 @@ static void bnxt_get_queue_stats_rx(struct net_device *dev, int i,
 	stats->bytes += BNXT_GET_RING_STATS64(sw, rx_bcast_bytes);
 
 	stats->alloc_fail = cpr->sw_stats->rx.rx_oom_discards;
+	stats->hw_gro_packets = cpr->sw_stats->rx.rx_hw_gro_packets;
+	stats->hw_gro_wire_packets = cpr->sw_stats->rx.rx_hw_gro_wire_packets;
 }
 
 static void bnxt_get_queue_stats_tx(struct net_device *dev, int i,
@@ -15966,6 +15977,8 @@ static void bnxt_get_base_stats(struct net_device *dev,
 	rx->packets = bp->net_stats_prev.rx_packets;
 	rx->bytes = bp->net_stats_prev.rx_bytes;
 	rx->alloc_fail = bp->ring_err_stats_prev.rx_total_oom_discards;
+	rx->hw_gro_packets = bp->ring_err_stats_prev.rx_total_hw_gro_packets;
+	rx->hw_gro_wire_packets = bp->ring_err_stats_prev.rx_total_hw_gro_wire_packets;
 
 	tx->packets = bp->net_stats_prev.tx_packets;
 	tx->bytes = bp->net_stats_prev.tx_bytes;
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next 2/9] tools: ynltool: factor out qstat dumping
  2026-02-05 22:05 [PATCH net-next 0/9] net: stats, tools, driver tests for HW GRO Jakub Kicinski
  2026-02-05 22:05 ` [PATCH net-next 1/9] eth: bnxt: gather and report HW-GRO stats Jakub Kicinski
@ 2026-02-05 22:05 ` Jakub Kicinski
  2026-02-06 14:58   ` Petr Machata
  2026-02-05 22:05 ` [PATCH net-next 3/9] tools: ynltool: add qstats analysis for HW-GRO efficiency / savings Jakub Kicinski
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2026-02-05 22:05 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, shuah, willemb,
	petrm, donald.hunter, michael.chan, pavan.chebbi, linux-kselftest,
	Jakub Kicinski

The logic to open a socket and dump the queues is the same
across sub-commands. Factor it out; we'll need it again.

No functional changes intended.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 tools/net/ynl/ynltool/qstats.c | 95 +++++++++++++++-------------------
 1 file changed, 41 insertions(+), 54 deletions(-)

diff --git a/tools/net/ynl/ynltool/qstats.c b/tools/net/ynl/ynltool/qstats.c
index 31fb45709ffa..d19acab0bf2a 100644
--- a/tools/net/ynl/ynltool/qstats.c
+++ b/tools/net/ynl/ynltool/qstats.c
@@ -237,13 +237,47 @@ static void print_plain_qstats(struct netdev_qstats_get_list *qstats)
 	}
 }
 
-static int do_show(int argc, char **argv)
+static struct netdev_qstats_get_list *
+qstats_dump(enum netdev_qstats_scope scope)
 {
 	struct netdev_qstats_get_list *qstats;
 	struct netdev_qstats_get_req *req;
 	struct ynl_error yerr;
 	struct ynl_sock *ys;
-	int ret = 0;
+
+	ys = ynl_sock_create(&ynl_netdev_family, &yerr);
+	if (!ys) {
+		p_err("YNL: %s", yerr.msg);
+		return NULL;
+	}
+
+	req = netdev_qstats_get_req_alloc();
+	if (!req) {
+		p_err("failed to allocate qstats request");
+		goto err_close;
+	}
+
+	if (scope)
+		netdev_qstats_get_req_set_scope(req, scope);
+
+	qstats = netdev_qstats_get_dump(ys, req);
+	netdev_qstats_get_req_free(req);
+	if (!qstats) {
+		p_err("failed to get queue stats: %s", ys->err.msg);
+		goto err_close;
+	}
+
+	ynl_sock_destroy(ys);
+	return qstats;
+
+err_close:
+	ynl_sock_destroy(ys);
+	return NULL;
+}
+
+static int do_show(int argc, char **argv)
+{
+	struct netdev_qstats_get_list *qstats;
 
 	/* Parse options */
 	while (argc > 0) {
@@ -268,29 +302,9 @@ static int do_show(int argc, char **argv)
 		}
 	}
 
-	ys = ynl_sock_create(&ynl_netdev_family, &yerr);
-	if (!ys) {
-		p_err("YNL: %s", yerr.msg);
+	qstats = qstats_dump(scope);
+	if (!qstats)
 		return -1;
-	}
-
-	req = netdev_qstats_get_req_alloc();
-	if (!req) {
-		p_err("failed to allocate qstats request");
-		ret = -1;
-		goto exit_close;
-	}
-
-	if (scope)
-		netdev_qstats_get_req_set_scope(req, scope);
-
-	qstats = netdev_qstats_get_dump(ys, req);
-	netdev_qstats_get_req_free(req);
-	if (!qstats) {
-		p_err("failed to get queue stats: %s", ys->err.msg);
-		ret = -1;
-		goto exit_close;
-	}
 
 	/* Print the stats as returned by the kernel */
 	if (json_output)
@@ -299,9 +313,7 @@ static int do_show(int argc, char **argv)
 		print_plain_qstats(qstats);
 
 	netdev_qstats_get_list_free(qstats);
-exit_close:
-	ynl_sock_destroy(ys);
-	return ret;
+	return 0;
 }
 
 static void compute_stats(__u64 *values, unsigned int count,
@@ -406,10 +418,7 @@ static int cmp_ifindex_type(const void *a, const void *b)
 static int do_balance(int argc, char **argv __attribute__((unused)))
 {
 	struct netdev_qstats_get_list *qstats;
-	struct netdev_qstats_get_req *req;
 	struct netdev_qstats_get_rsp **sorted;
-	struct ynl_error yerr;
-	struct ynl_sock *ys;
 	unsigned int count = 0;
 	unsigned int i, j;
 	int ret = 0;
@@ -419,29 +428,9 @@ static int do_balance(int argc, char **argv __attribute__((unused)))
 		return -1;
 	}
 
-	ys = ynl_sock_create(&ynl_netdev_family, &yerr);
-	if (!ys) {
-		p_err("YNL: %s", yerr.msg);
+	qstats = qstats_dump(NETDEV_QSTATS_SCOPE_QUEUE);
+	if (!qstats)
 		return -1;
-	}
-
-	req = netdev_qstats_get_req_alloc();
-	if (!req) {
-		p_err("failed to allocate qstats request");
-		ret = -1;
-		goto exit_close;
-	}
-
-	/* Always use queue scope for balance analysis */
-	netdev_qstats_get_req_set_scope(req, NETDEV_QSTATS_SCOPE_QUEUE);
-
-	qstats = netdev_qstats_get_dump(ys, req);
-	netdev_qstats_get_req_free(req);
-	if (!qstats) {
-		p_err("failed to get queue stats: %s", ys->err.msg);
-		ret = -1;
-		goto exit_close;
-	}
 
 	/* Count and sort queues */
 	ynl_dump_foreach(qstats, qs)
@@ -576,8 +565,6 @@ static int do_balance(int argc, char **argv __attribute__((unused)))
 	free(sorted);
 exit_free_qstats:
 	netdev_qstats_get_list_free(qstats);
-exit_close:
-	ynl_sock_destroy(ys);
 	return ret;
 }
 
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next 3/9] tools: ynltool: add qstats analysis for HW-GRO efficiency / savings
  2026-02-05 22:05 [PATCH net-next 0/9] net: stats, tools, driver tests for HW GRO Jakub Kicinski
  2026-02-05 22:05 ` [PATCH net-next 1/9] eth: bnxt: gather and report HW-GRO stats Jakub Kicinski
  2026-02-05 22:05 ` [PATCH net-next 2/9] tools: ynltool: factor out qstat dumping Jakub Kicinski
@ 2026-02-05 22:05 ` Jakub Kicinski
  2026-02-06 13:44   ` Petr Machata
  2026-02-05 22:05 ` [PATCH net-next 4/9] selftests: net: move gro to lib for HW vs SW reuse Jakub Kicinski
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2026-02-05 22:05 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, shuah, willemb,
	petrm, donald.hunter, michael.chan, pavan.chebbi, linux-kselftest,
	Jakub Kicinski

Extend ynltool to compute a HW-GRO savings metric - how many
packets HW GRO has saved the kernel from having to process.

Note that this definition does not actually take into account
whether the segments were or weren't eligible for HW GRO.
If a machine is receiving all-UDP traffic, the new metric will
show HW-GRO savings of 0%. Conversely, since the super-packet
still counts as a received packet, savings of 100% are not
achievable. Perfect HW-GRO on a machine with 4k MTU and 64kB
super-frames would show ~93.75% savings. With 1.5k MTU we may
see up to ~97.8% savings (if my math is right).
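
A quick sanity check of those numbers (plain Python; this assumes,
as on bnxt, that rx-packets counts wire packets because the ring
counters are maintained by the HW):

```python
def hw_gro_savings(wire_pkts, super_pkts, rx_pkts):
    # The metric computed by `ynltool qstats hw-gro`: fraction of
    # received packets the kernel never had to process as skbs.
    return (wire_pkts - super_pkts) / rx_pkts * 100.0

# Perfect HW-GRO, 4k MTU, 64kB super-frames: 16 segments each.
print(hw_gro_savings(16, 1, 16))            # 93.75

# 1.5k MTU (1448 byte MSS), 64kB super-frames: 45 segments each.
print(round(hw_gro_savings(45, 1, 45), 1))  # 97.8
```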

Example after 10 sec of iperf on a freshly booted machine
with 1.5k MTU:

  $ ynltool qstats show
  eth0     rx-packets:  40681280               rx-bytes:   61575208437
        rx-alloc-fail:         0      rx-hw-gro-packets:       1225133
                                 rx-hw-gro-wire-packets:      40656633
  $ ynltool qstats hw-gro
  eth0: 96.9% savings

None of the NICs I have access to can report "missed" HW-GRO
opportunities, so computing a true "effectiveness" metric
is not possible. One could also argue that an effectiveness
metric is inferior in environments where we control both senders
and receivers: the savings metric captures both regressions in
the receiver's HW-GRO effectiveness and regressions in senders
sending smaller TSO trains. And we care about both. The main
downside is that it's hard to tell at a glance how well the NIC
is doing, because the savings depend on traffic patterns.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 tools/net/ynl/ynltool/qstats.c | 75 +++++++++++++++++++++++++++++++---
 1 file changed, 70 insertions(+), 5 deletions(-)

diff --git a/tools/net/ynl/ynltool/qstats.c b/tools/net/ynl/ynltool/qstats.c
index d19acab0bf2a..e5b83cf9bf3b 100644
--- a/tools/net/ynl/ynltool/qstats.c
+++ b/tools/net/ynl/ynltool/qstats.c
@@ -568,6 +568,64 @@ static int do_balance(int argc, char **argv __attribute__((unused)))
 	return ret;
 }
 
+static int do_hw_gro(int argc, char **argv __attribute__((unused)))
+{
+	struct netdev_qstats_get_list *qstats;
+
+	if (argc > 0) {
+		p_err("hw-gro command takes no arguments");
+		return -1;
+	}
+
+	qstats = qstats_dump(0);
+	if (!qstats)
+		return -1;
+
+	if (json_output)
+		jsonw_start_array(json_wtr);
+
+	ynl_dump_foreach(qstats, qs) {
+		char ifname[IF_NAMESIZE];
+		const char *name;
+		double savings;
+
+		if (!qs->_present.rx_packets ||
+		    !qs->_present.rx_hw_gro_packets ||
+		    !qs->_present.rx_hw_gro_wire_packets)
+			continue;
+
+		if (!qs->rx_packets)
+			continue;
+
+		/* How many skbs did we avoid allocating thanks to HW GRO */
+		savings = (double)(qs->rx_hw_gro_wire_packets -
+				   qs->rx_hw_gro_packets) /
+			qs->rx_packets * 100.0;
+
+		name = if_indextoname(qs->ifindex, ifname);
+
+		if (json_output) {
+			jsonw_start_object(json_wtr);
+			if (name)
+				jsonw_string_field(json_wtr, "ifname", name);
+			jsonw_float_field(json_wtr, "savings", savings);
+			jsonw_end_object(json_wtr);
+		} else {
+			if (name)
+				printf("%s", name);
+			else
+				printf("ifindex:%u", qs->ifindex);
+			printf(": %.1f%% savings\n", savings);
+		}
+	}
+
+	if (json_output)
+		jsonw_end_array(json_wtr);
+
+	netdev_qstats_get_list_free(qstats);
+	return 0;
+}
+
 static int do_help(int argc __attribute__((unused)),
 		   char **argv __attribute__((unused)))
 {
@@ -580,6 +638,7 @@ static int do_help(int argc __attribute__((unused)),
 		"Usage: %s qstats { COMMAND | help }\n"
 		"       %s qstats [ show ] [ OPTIONS ]\n"
 		"       %s qstats balance\n"
+		"       %s qstats hw-gro\n"
 		"\n"
 		"       OPTIONS := { scope queue | group-by { device | queue } }\n"
 		"\n"
@@ -588,17 +647,23 @@ static int do_help(int argc __attribute__((unused)),
 		"       show scope queue      - Display per-queue statistics\n"
 		"       show group-by device  - Display device-aggregated statistics (default)\n"
 		"       show group-by queue   - Display per-queue statistics\n"
-		"       balance               - Analyze traffic distribution balance.\n"
+		"\n"
+		"  Analysis:\n"
+		"       balance               - Traffic distribution between queues.\n"
+		"       hw-gro                - HW GRO effectiveness analysis\n"
+		"                               - savings - delta between packets received\n"
+		"                                 on the wire and packets seen by the kernel.\n"
 		"",
-		bin_name, bin_name, bin_name);
+		bin_name, bin_name, bin_name, bin_name);
 
 	return 0;
 }
 
 static const struct cmd qstats_cmds[] = {
-	{ "show",	do_show },
-	{ "balance",	do_balance },
-	{ "help",	do_help },
+	{ "show",		do_show },
+	{ "balance",		do_balance },
+	{ "hw-gro",	        do_hw_gro },
+	{ "help",		do_help },
 	{ 0 }
 };
 
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next 4/9] selftests: net: move gro to lib for HW vs SW reuse
  2026-02-05 22:05 [PATCH net-next 0/9] net: stats, tools, driver tests for HW GRO Jakub Kicinski
                   ` (2 preceding siblings ...)
  2026-02-05 22:05 ` [PATCH net-next 3/9] tools: ynltool: add qstats analysis for HW-GRO efficiency / savings Jakub Kicinski
@ 2026-02-05 22:05 ` Jakub Kicinski
  2026-02-06 15:01   ` Petr Machata
  2026-02-05 22:05 ` [PATCH net-next 5/9] selftests: drv-net: give HW stats sync time extra 25% of margin Jakub Kicinski
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2026-02-05 22:05 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, shuah, willemb,
	petrm, donald.hunter, michael.chan, pavan.chebbi, linux-kselftest,
	Jakub Kicinski

The gro.c packet sender is used for SW testing, but the bulk of
the incoming new tests will be HW-specific. It's better to put
those under drivers/net/hw/, to avoid tip-toeing around netdevsim.
Move gro.c to lib so we can reuse it.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 tools/testing/selftests/drivers/net/Makefile           | 1 -
 tools/testing/selftests/net/lib/Makefile               | 1 +
 tools/testing/selftests/{drivers/net => net/lib}/gro.c | 0
 tools/testing/selftests/drivers/net/.gitignore         | 1 -
 tools/testing/selftests/drivers/net/gro.py             | 2 +-
 tools/testing/selftests/net/lib/.gitignore             | 1 +
 6 files changed, 3 insertions(+), 3 deletions(-)
 rename tools/testing/selftests/{drivers/net => net/lib}/gro.c (100%)

diff --git a/tools/testing/selftests/drivers/net/Makefile b/tools/testing/selftests/drivers/net/Makefile
index 8154d6d429d3..7c7fa75b80c2 100644
--- a/tools/testing/selftests/drivers/net/Makefile
+++ b/tools/testing/selftests/drivers/net/Makefile
@@ -6,7 +6,6 @@ TEST_INCLUDES := $(wildcard lib/py/*.py) \
 		 ../../net/lib.sh \
 
 TEST_GEN_FILES := \
-	gro \
 	napi_id_helper \
 # end of TEST_GEN_FILES
 
diff --git a/tools/testing/selftests/net/lib/Makefile b/tools/testing/selftests/net/lib/Makefile
index 5339f56329e1..ff83603397d0 100644
--- a/tools/testing/selftests/net/lib/Makefile
+++ b/tools/testing/selftests/net/lib/Makefile
@@ -14,6 +14,7 @@ TEST_FILES := \
 TEST_GEN_FILES := \
 	$(patsubst %.c,%.o,$(wildcard *.bpf.c)) \
 	csum \
+	gro \
 	xdp_helper \
 # end of TEST_GEN_FILES
 
diff --git a/tools/testing/selftests/drivers/net/gro.c b/tools/testing/selftests/net/lib/gro.c
similarity index 100%
rename from tools/testing/selftests/drivers/net/gro.c
rename to tools/testing/selftests/net/lib/gro.c
diff --git a/tools/testing/selftests/drivers/net/.gitignore b/tools/testing/selftests/drivers/net/.gitignore
index 3633c7a3ed65..585ecb4d5dc4 100644
--- a/tools/testing/selftests/drivers/net/.gitignore
+++ b/tools/testing/selftests/drivers/net/.gitignore
@@ -1,4 +1,3 @@
 # SPDX-License-Identifier: GPL-2.0-only
-gro
 napi_id_helper
 psp_responder
diff --git a/tools/testing/selftests/drivers/net/gro.py b/tools/testing/selftests/drivers/net/gro.py
index cbc1b19dbc91..2da53686354f 100755
--- a/tools/testing/selftests/drivers/net/gro.py
+++ b/tools/testing/selftests/drivers/net/gro.py
@@ -117,7 +117,7 @@ from lib.py import ksft_variants
     """ Setup hardware loopback mode for GRO testing. """
 
     if not hasattr(cfg, "bin_remote"):
-        cfg.bin_local = cfg.test_dir / "gro"
+        cfg.bin_local = cfg.net_lib_dir / "gro"
         cfg.bin_remote = cfg.remote.deploy(cfg.bin_local)
 
     if not hasattr(cfg, "feat"):
diff --git a/tools/testing/selftests/net/lib/.gitignore b/tools/testing/selftests/net/lib/.gitignore
index bbc97d6bf556..6cd2b762af5d 100644
--- a/tools/testing/selftests/net/lib/.gitignore
+++ b/tools/testing/selftests/net/lib/.gitignore
@@ -1,3 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0-only
 csum
+gro
 xdp_helper
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next 5/9] selftests: drv-net: give HW stats sync time extra 25% of margin
  2026-02-05 22:05 [PATCH net-next 0/9] net: stats, tools, driver tests for HW GRO Jakub Kicinski
                   ` (3 preceding siblings ...)
  2026-02-05 22:05 ` [PATCH net-next 4/9] selftests: net: move gro to lib for HW vs SW reuse Jakub Kicinski
@ 2026-02-05 22:05 ` Jakub Kicinski
  2026-02-06 14:40   ` Petr Machata
  2026-02-05 22:05 ` [PATCH net-next 6/9] selftests: drv-net: gro: use SO_TXTIME to schedule packets together Jakub Kicinski
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2026-02-05 22:05 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, shuah, willemb,
	petrm, donald.hunter, michael.chan, pavan.chebbi, linux-kselftest,
	Jakub Kicinski

There are transient failures on devices which update stats
periodically, especially when the FW DMAs the stats to the host
rather than host-side periodic work querying the FW. Wait 25%
longer than strictly necessary.

For devices which don't report stats-block-usecs we retain
25 msec as the default wait time.
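
The resulting wait, sketched in plain Python (same arithmetic as
the env.py change):

```python
def stats_settle_time(stats_block_usecs=None):
    # Wait 25% longer than the device's stats sync period;
    # default to 20000 usec when the device doesn't report
    # stats-block-usecs, retaining the old 25 msec wait.
    usecs = 20000 if stats_block_usecs is None else stats_block_usecs
    return 1.25 * usecs / 1000 / 1000

print(stats_settle_time())        # 0.025 (seconds)
print(stats_settle_time(100000))  # 0.125
```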

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 tools/testing/selftests/drivers/net/lib/py/env.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/drivers/net/lib/py/env.py b/tools/testing/selftests/drivers/net/lib/py/env.py
index 41cc248ac848..39a98eb2592e 100644
--- a/tools/testing/selftests/drivers/net/lib/py/env.py
+++ b/tools/testing/selftests/drivers/net/lib/py/env.py
@@ -285,7 +285,7 @@ from .remote import Remote
                 if "Operation not supported" not in e.cmd.stderr:
                     raise
 
-            self._stats_settle_time = 0.025 + \
-                data.get('stats-block-usecs', 0) / 1000 / 1000
+            self._stats_settle_time = \
+                1.25 * data.get('stats-block-usecs', 20000) / 1000 / 1000
 
         time.sleep(self._stats_settle_time)
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next 6/9] selftests: drv-net: gro: use SO_TXTIME to schedule packets together
  2026-02-05 22:05 [PATCH net-next 0/9] net: stats, tools, driver tests for HW GRO Jakub Kicinski
                   ` (4 preceding siblings ...)
  2026-02-05 22:05 ` [PATCH net-next 5/9] selftests: drv-net: give HW stats sync time extra 25% of margin Jakub Kicinski
@ 2026-02-05 22:05 ` Jakub Kicinski
  2026-02-06 15:19   ` Petr Machata
  2026-02-05 22:05 ` [PATCH net-next 7/9] selftests: drv-net: gro: test GRO stats Jakub Kicinski
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2026-02-05 22:05 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, shuah, willemb,
	petrm, donald.hunter, michael.chan, pavan.chebbi, linux-kselftest,
	Jakub Kicinski

Longer packet-sequence tests are quite flaky when the test is run
over a real network. Try to avoid at least the jitter on the sender
side by scheduling all the packets to be sent at once using
SO_TXTIME. Use a hardcoded TX time of 5 msec in the future. In my
tests, increasing this time past 2 msec makes no difference, so
5 msec is plenty of margin. Since we now expect more output
buffering, make sure to raise SNDBUF.

Experimenting with long sequences I see frequent failures when sending
200 packets, only 50-100 packets get coalesced. With this change
up to 1000 packets get coalesced relatively reliably.
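
The timestamp computation itself is simple; an equivalent sketch
in Python (the constant matches TXTIME_DELAY_MS in gro.c):

```python
import time

TXTIME_DELAY_MS = 5

def future_txtime_ns(now_ns=None):
    # Absolute CLOCK_MONOTONIC time, in ns, at which every packet
    # of the burst should be released to the wire via SO_TXTIME.
    if now_ns is None:
        now_ns = time.clock_gettime_ns(time.CLOCK_MONOTONIC)
    return now_ns + TXTIME_DELAY_MS * 1_000_000

print(future_txtime_ns(1_000_000_000))  # 1005000000
```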

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 tools/testing/selftests/net/lib/gro.c | 49 +++++++++++++++++++++++++--
 1 file changed, 46 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/net/lib/gro.c b/tools/testing/selftests/net/lib/gro.c
index 3c0745b68bfa..832edf3e1290 100644
--- a/tools/testing/selftests/net/lib/gro.c
+++ b/tools/testing/selftests/net/lib/gro.c
@@ -63,6 +63,7 @@
 #include <linux/filter.h>
 #include <linux/if_packet.h>
 #include <linux/ipv6.h>
+#include <linux/net_tstamp.h>
 #include <net/ethernet.h>
 #include <net/if.h>
 #include <netinet/in.h>
@@ -74,6 +75,7 @@
 #include <stdio.h>
 #include <stdarg.h>
 #include <string.h>
+#include <time.h>
 #include <unistd.h>
 
 #include "kselftest.h"
@@ -123,6 +125,9 @@ static int tcp_offset = -1;
 static int total_hdr_len = -1;
 static int ethhdr_proto = -1;
 static bool ipip;
+static uint64_t txtime_ns;
+
+#define TXTIME_DELAY_MS 5
 
 static void vlog(const char *fmt, ...)
 {
@@ -330,13 +335,35 @@ static void fill_transportlayer(void *buf, int seq_offset, int ack_offset,
 
 static void write_packet(int fd, char *buf, int len, struct sockaddr_ll *daddr)
 {
+	char control[CMSG_SPACE(sizeof(uint64_t))];
+	struct msghdr msg = {};
+	struct iovec iov = {};
+	struct cmsghdr *cm;
 	int ret = -1;
 
-	ret = sendto(fd, buf, len, 0, (struct sockaddr *)daddr, sizeof(*daddr));
+	iov.iov_base = buf;
+	iov.iov_len = len;
+
+	msg.msg_iov = &iov;
+	msg.msg_iovlen = 1;
+	msg.msg_name = daddr;
+	msg.msg_namelen = sizeof(*daddr);
+
+	memset(control, 0, sizeof(control));
+	msg.msg_control = control;
+	msg.msg_controllen = sizeof(control);
+
+	cm = CMSG_FIRSTHDR(&msg);
+	cm->cmsg_level = SOL_SOCKET;
+	cm->cmsg_type = SCM_TXTIME;
+	cm->cmsg_len = CMSG_LEN(sizeof(uint64_t));
+	memcpy(CMSG_DATA(cm), &txtime_ns, sizeof(txtime_ns));
+
+	ret = sendmsg(fd, &msg, 0);
 	if (ret == -1)
-		error(1, errno, "sendto failure");
+		error(1, errno, "sendmsg failure");
 	if (ret != len)
-		error(1, errno, "sendto wrong length");
+		error(1, 0, "sendmsg wrong length: %d vs %d", ret, len);
 }
 
 static void create_packet(void *buf, int seq_offset, int ack_offset,
@@ -1058,15 +1085,31 @@ static void check_recv_pkts(int fd, int *correct_payload,
 
 static void gro_sender(void)
 {
+	struct sock_txtime so_txtime = { .clockid = CLOCK_MONOTONIC, };
+	int bufsize = 4 * 1024 * 1024; /* 4 MB */
 	const int fin_delay_us = 100 * 1000;
 	static char fin_pkt[MAX_HDR_LEN];
 	struct sockaddr_ll daddr = {};
+	struct timespec ts;
 	int txfd = -1;
 
 	txfd = socket(PF_PACKET, SOCK_RAW, IPPROTO_RAW);
 	if (txfd < 0)
 		error(1, errno, "socket creation");
 
+	if (setsockopt(txfd, SOL_SOCKET, SO_SNDBUF, &bufsize, sizeof(bufsize)))
+		error(1, errno, "cannot set sndbuf size, setsockopt failed");
+
+	if (setsockopt(txfd, SOL_SOCKET, SO_TXTIME,
+		       &so_txtime, sizeof(so_txtime)))
+		error(1, errno, "setsockopt SO_TXTIME");
+
+	if (clock_gettime(CLOCK_MONOTONIC, &ts))
+		error(1, errno, "clock_gettime");
+
+	txtime_ns = ts.tv_sec * 1000000000ULL + ts.tv_nsec;
+	txtime_ns += TXTIME_DELAY_MS * 1000000ULL;
+
 	memset(&daddr, 0, sizeof(daddr));
 	daddr.sll_ifindex = if_nametoindex(ifname);
 	if (daddr.sll_ifindex == 0)
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next 7/9] selftests: drv-net: gro: test GRO stats
  2026-02-05 22:05 [PATCH net-next 0/9] net: stats, tools, driver tests for HW GRO Jakub Kicinski
                   ` (5 preceding siblings ...)
  2026-02-05 22:05 ` [PATCH net-next 6/9] selftests: drv-net: gro: use SO_TXTIME to schedule packets together Jakub Kicinski
@ 2026-02-05 22:05 ` Jakub Kicinski
  2026-02-05 22:05 ` [PATCH net-next 8/9] selftests: drv-net: gro: add test for packet ordering Jakub Kicinski
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Jakub Kicinski @ 2026-02-05 22:05 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, shuah, willemb,
	petrm, donald.hunter, michael.chan, pavan.chebbi, linux-kselftest,
	Jakub Kicinski

Test the accuracy of GRO stats. We want to cover two potentially
tricky cases:
 - single-segment GRO
 - packets which were eligible but didn't get GRO'd

The first case is trivial: teach gro.c to send one packet, and
check that the GRO stats didn't move.

The second case requires gro.c to send a lot of flows, expecting
the NIC to run out of GRO flow capacity.

To avoid noise from other system traffic we steer the packets
to a dedicated queue and operate on its qstats.
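
For reference, the send order used by the capacity test can be
sketched as follows (illustrative Python; flow letters as in the
send_capacity() comment, second packets carry PSH to flush):

```python
def capacity_send_order(num_flows):
    # First packet of every flow (no PSH), then the second packet
    # of every flow (PSH set), so all flows hold a HW-GRO context
    # at the same time: A1 B1 C1 D1 ... A2 B2 C2 D2 ...
    flows = [chr(ord('A') + i) for i in range(num_flows)]
    first = [(f, 1, False) for f in flows]
    second = [(f, 2, True) for f in flows]
    return first + second

print([f"{f}{n}" for f, n, _ in capacity_send_order(4)])
# ['A1', 'B1', 'C1', 'D1', 'A2', 'B2', 'C2', 'D2']
```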

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 .../testing/selftests/drivers/net/hw/Makefile |   1 +
 tools/testing/selftests/net/lib/gro.c         | 161 ++++++++++-
 .../selftests/drivers/net/hw/gro_hw.py        | 271 ++++++++++++++++++
 3 files changed, 432 insertions(+), 1 deletion(-)
 create mode 100755 tools/testing/selftests/drivers/net/hw/gro_hw.py

diff --git a/tools/testing/selftests/drivers/net/hw/Makefile b/tools/testing/selftests/drivers/net/hw/Makefile
index a64140333a46..19e5ca223273 100644
--- a/tools/testing/selftests/drivers/net/hw/Makefile
+++ b/tools/testing/selftests/drivers/net/hw/Makefile
@@ -26,6 +26,7 @@ TEST_PROGS = \
 	ethtool_extended_state.sh \
 	ethtool_mm.sh \
 	ethtool_rmon.sh \
+	gro_hw.py \
 	hw_stats_l3.sh \
 	hw_stats_l3_gre.sh \
 	iou-zcrx.py \
diff --git a/tools/testing/selftests/net/lib/gro.c b/tools/testing/selftests/net/lib/gro.c
index 832edf3e1290..d37676c009e8 100644
--- a/tools/testing/selftests/net/lib/gro.c
+++ b/tools/testing/selftests/net/lib/gro.c
@@ -43,6 +43,10 @@
  *   - large_max: exceeding max size
  *   - large_rem: remainder handling
  *
+ * single, capacity:
+ *  Boring cases used to test coalescing machinery itself and stats
+ *  more than protocol behavior.
+ *
  * MSS is defined as 4096 - header because if it is too small
  * (i.e. 1500 MTU - header), it will result in many packets,
  * increasing the "large" test case's flakiness. This is because
@@ -126,6 +130,9 @@ static int total_hdr_len = -1;
 static int ethhdr_proto = -1;
 static bool ipip;
 static uint64_t txtime_ns;
+static int num_flows = 4;
+
+#define CAPACITY_PAYLOAD_LEN 200
 
 #define TXTIME_DELAY_MS 5
 
@@ -387,6 +394,45 @@ static void create_packet(void *buf, int seq_offset, int ack_offset,
 	fill_datalinklayer(buf);
 }
 
+static void create_capacity_packet(void *buf, int flow_id, int pkt_idx, int psh)
+{
+	int seq_offset = pkt_idx * CAPACITY_PAYLOAD_LEN;
+	struct tcphdr *tcph;
+
+	create_packet(buf, seq_offset, 0, CAPACITY_PAYLOAD_LEN, 0);
+
+	/* Customize for this flow id */
+	memset(buf + total_hdr_len, 'a' + flow_id, CAPACITY_PAYLOAD_LEN);
+
+	tcph = buf + tcp_offset;
+	tcph->source = htons(SPORT + flow_id);
+	tcph->psh = psh;
+	tcph->check = 0;
+	tcph->check = tcp_checksum(tcph, CAPACITY_PAYLOAD_LEN);
+}
+
+/* Send a capacity test, 2 packets per flow, all first packets then all second:
+ *  A1 B1 C1 D1 ... A2 B2 C2 D2 ...
+ */
+static void send_capacity(int fd, struct sockaddr_ll *daddr)
+{
+	static char buf[MAX_HDR_LEN + CAPACITY_PAYLOAD_LEN];
+	int pkt_size = total_hdr_len + CAPACITY_PAYLOAD_LEN;
+	int i;
+
+	/* Send first packet of each flow (no PSH) */
+	for (i = 0; i < num_flows; i++) {
+		create_capacity_packet(buf, i, 0, 0);
+		write_packet(fd, buf, pkt_size, daddr);
+	}
+
+	/* Send second packet of each flow (with PSH to flush) */
+	for (i = 0; i < num_flows; i++) {
+		create_capacity_packet(buf, i, 1, 1);
+		write_packet(fd, buf, pkt_size, daddr);
+	}
+}
+
 #ifndef TH_CWR
 #define TH_CWR 0x80
 #endif
@@ -1083,6 +1129,93 @@ static void check_recv_pkts(int fd, int *correct_payload,
 	printf("Test succeeded\n\n");
 }
 
+static void check_capacity_pkts(int fd)
+{
+	static char buffer[IP_MAXPACKET + ETH_HLEN + 1];
+	struct iphdr *iph = (struct iphdr *)(buffer + ETH_HLEN);
+	struct ipv6hdr *ip6h = (struct ipv6hdr *)(buffer + ETH_HLEN);
+	const char *fail_reason = NULL;
+	int flow_order[num_flows * 2];
+	int coalesced[num_flows];
+	struct tcphdr *tcph;
+	int ip_ext_len = 0;
+	int total_data = 0;
+	int pkt_size = -1;
+	int data_len = 0;
+	int num_pkt = 0;
+	int num_coal = 0;
+	int flow_id;
+	int sport;
+
+	memset(coalesced, 0, sizeof(coalesced));
+	memset(flow_order, -1, sizeof(flow_order));
+
+	while (total_data < num_flows * CAPACITY_PAYLOAD_LEN * 2) {
+		ip_ext_len = 0;
+		pkt_size = recv(fd, buffer, IP_MAXPACKET + ETH_HLEN + 1, 0);
+		if (pkt_size < 0)
+			recv_error(fd, errno);
+
+		if (iph->version == 4)
+			ip_ext_len = (iph->ihl - 5) * 4;
+		else if (ip6h->version == 6 && ip6h->nexthdr != IPPROTO_TCP)
+			ip_ext_len = MIN_EXTHDR_SIZE;
+
+		tcph = (struct tcphdr *)(buffer + tcp_offset + ip_ext_len);
+
+		/* FIN packet terminates reception */
+		if (tcph->fin)
+			break;
+
+		sport = ntohs(tcph->source);
+		flow_id = sport - SPORT;
+
+		if (flow_id < 0 || flow_id >= num_flows) {
+			vlog("Invalid flow_id %d from sport %d\n",
+			     flow_id, sport);
+			fail_reason = fail_reason ?: "invalid packet";
+			continue;
+		}
+
+		/* Calculate payload length */
+		if (pkt_size == ETH_ZLEN && iph->version == 4) {
+			data_len = ntohs(iph->tot_len)
+				- sizeof(struct tcphdr) - sizeof(struct iphdr);
+		} else {
+			data_len = pkt_size - total_hdr_len - ip_ext_len;
+		}
+
+		flow_order[num_pkt] = flow_id;
+		coalesced[flow_id] = data_len;
+
+		if (data_len == CAPACITY_PAYLOAD_LEN * 2) {
+			num_coal++;
+		} else {
+			vlog("Pkt %d: flow %d, sport %d, len %d (expected %d)\n",
+			     num_pkt, flow_id, sport, data_len,
+			     CAPACITY_PAYLOAD_LEN * 2);
+			fail_reason = fail_reason ?: "not coalesced";
+		}
+
+		num_pkt++;
+		total_data += data_len;
+	}
+
+	if (!fail_reason) {
+		vlog("All %d flows coalesced correctly\n", num_flows);
+		printf("Test succeeded\n\n");
+	} else {
+		printf("FAILED\n");
+	}
+
+	/* Always print stats for external validation */
+	printf("STATS: received=%d wire=%d coalesced=%d\n",
+	       num_pkt, num_pkt + num_coal, num_coal);
+
+	if (fail_reason)
+		error(1, 0, "capacity test failed %s", fail_reason);
+}
+
 static void gro_sender(void)
 {
 	struct sock_txtime so_txtime = { .clockid = CLOCK_MONOTONIC, };
@@ -1242,6 +1375,19 @@ static void gro_sender(void)
 
 		send_large(txfd, &daddr, remainder + 1);
 		write_packet(txfd, fin_pkt, total_hdr_len, &daddr);
+
+	/* machinery sub-tests */
+	} else if (strcmp(testname, "single") == 0) {
+		static char buf[MAX_HDR_LEN + PAYLOAD_LEN];
+
+		create_packet(buf, 0, 0, PAYLOAD_LEN, 0);
+		write_packet(txfd, buf, total_hdr_len + PAYLOAD_LEN, &daddr);
+		write_packet(txfd, fin_pkt, total_hdr_len, &daddr);
+	} else if (strcmp(testname, "capacity") == 0) {
+		send_capacity(txfd, &daddr);
+		usleep(fin_delay_us);
+		write_packet(txfd, fin_pkt, total_hdr_len, &daddr);
+
 	} else {
 		error(1, 0, "Unknown testcase: %s", testname);
 	}
@@ -1437,6 +1583,15 @@ static void gro_receiver(void)
 		correct_payload[2] = remainder + 1;
 		printf("last segment sent individually: ");
 		check_recv_pkts(rxfd, correct_payload, 3);
+
+	/* machinery sub-tests */
+	} else if (strcmp(testname, "single") == 0) {
+		printf("single data packet: ");
+		correct_payload[0] = PAYLOAD_LEN;
+		check_recv_pkts(rxfd, correct_payload, 1);
+	} else if (strcmp(testname, "capacity") == 0) {
+		check_capacity_pkts(rxfd);
+
 	} else {
 		error(1, 0, "Test case error: unknown testname %s", testname);
 	}
@@ -1454,6 +1609,7 @@ static void parse_args(int argc, char **argv)
 		{ "ipv4", no_argument, NULL, '4' },
 		{ "ipv6", no_argument, NULL, '6' },
 		{ "ipip", no_argument, NULL, 'e' },
+		{ "num-flows", required_argument, NULL, 'n' },
 		{ "rx", no_argument, NULL, 'r' },
 		{ "saddr", required_argument, NULL, 's' },
 		{ "smac", required_argument, NULL, 'S' },
@@ -1463,7 +1619,7 @@ static void parse_args(int argc, char **argv)
 	};
 	int c;
 
-	while ((c = getopt_long(argc, argv, "46d:D:ei:rs:S:t:v", opts, NULL)) != -1) {
+	while ((c = getopt_long(argc, argv, "46d:D:ei:n:rs:S:t:v", opts, NULL)) != -1) {
 		switch (c) {
 		case '4':
 			proto = PF_INET;
@@ -1487,6 +1643,9 @@ static void parse_args(int argc, char **argv)
 		case 'i':
 			ifname = optarg;
 			break;
+		case 'n':
+			num_flows = atoi(optarg);
+			break;
 		case 'r':
 			tx_socket = false;
 			break;
diff --git a/tools/testing/selftests/drivers/net/hw/gro_hw.py b/tools/testing/selftests/drivers/net/hw/gro_hw.py
new file mode 100755
index 000000000000..3bca19e8f339
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/hw/gro_hw.py
@@ -0,0 +1,271 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+"""
+HW GRO tests focusing on device machinery like stats, rather than protocol
+processing.
+"""
+
+import glob
+import re
+
+from lib.py import ksft_run, ksft_exit, ksft_pr
+from lib.py import ksft_eq, ksft_ge
+from lib.py import NetDrvEpEnv, NetdevFamily
+from lib.py import KsftSkipEx
+from lib.py import bkg, cmd, defer, ethtool, ip
+
+
+# gro.c uses hardcoded DPORT=8000
+GRO_DPORT = 8000
+
+
+def _get_queue_stats(cfg, queue_id):
+    """Get stats for a specific Rx queue."""
+    cfg.wait_hw_stats_settle()
+    data = cfg.netnl.qstats_get({"ifindex": cfg.ifindex, "scope": ["queue"]},
+                                dump=True)
+    for q in data:
+        if q.get('queue-type') == 'rx' and q.get('queue-id') == queue_id:
+            return q
+    return {}
+
+
+def _resolve_dmac(cfg, ipver):
+    """Find the destination MAC address for sending packets."""
+    attr = "dmac" + ipver
+    if hasattr(cfg, attr):
+        return getattr(cfg, attr)
+
+    route = ip(f"-{ipver} route get {cfg.addr_v[ipver]}",
+               json=True, host=cfg.remote)[0]
+    gw = route.get("gateway")
+    if not gw:
+        setattr(cfg, attr, cfg.dev['address'])
+        return getattr(cfg, attr)
+
+    cmd(f"ping -c1 -W0 -I{cfg.remote_ifname} {gw}", host=cfg.remote)
+    neigh = ip(f"neigh get {gw} dev {cfg.remote_ifname}",
+               json=True, host=cfg.remote)[0]
+    setattr(cfg, attr, neigh['lladdr'])
+    return getattr(cfg, attr)
+
+
+def _setup_isolated_queue(cfg):
+    """Set up an isolated queue for testing using ntuple filter.
+
+    Remove queue 1 from the default RSS context and steer test traffic to it.
+    """
+    test_queue = 1
+
+    qcnt = len(glob.glob(f"/sys/class/net/{cfg.ifname}/queues/rx-*"))
+    if qcnt < 2:
+        raise KsftSkipEx(f"Need at least 2 queues, have {qcnt}")
+
+    # Remove queue 1 from default RSS context by setting its weight to 0
+    weights = ["1"] * qcnt
+    weights[test_queue] = "0"
+    ethtool(f"-X {cfg.ifname} weight " + " ".join(weights))
+    defer(ethtool, f"-X {cfg.ifname} default")
+
+    # Set up ntuple filter to steer our test traffic to the isolated queue
+    flow  = f"flow-type tcp{cfg.addr_ipver} "
+    flow += f"dst-ip {cfg.addr} dst-port {GRO_DPORT} action {test_queue}"
+    output = ethtool(f"-N {cfg.ifname} {flow}").stdout
+    ntuple_id = int(output.split()[-1])
+    defer(ethtool, f"-N {cfg.ifname} delete {ntuple_id}")
+
+    return test_queue
+
+
+def _run_gro_test(cfg, test_name, num_flows=None, ignore_fail=False):
+    """Run gro binary with given test and return output."""
+    if not hasattr(cfg, "bin_remote"):
+        cfg.bin_local = cfg.net_lib_dir / "gro"
+        cfg.bin_remote = cfg.remote.deploy(cfg.bin_local)
+
+    ipver = cfg.addr_ipver
+    protocol = f"--ipv{ipver}"
+    dmac = _resolve_dmac(cfg, ipver)
+
+    base_args = [
+        protocol,
+        f"--dmac {dmac}",
+        f"--smac {cfg.remote_dev['address']}",
+        f"--daddr {cfg.addr}",
+        f"--saddr {cfg.remote_addr_v[ipver]}",
+        f"--test {test_name}",
+    ]
+    if num_flows:
+        base_args.append(f"--num-flows {num_flows}")
+
+    args = " ".join(base_args)
+
+    rx_cmd = f"{cfg.bin_local} {args} --rx --iface {cfg.ifname}"
+    tx_cmd = f"{cfg.bin_remote} {args} --iface {cfg.remote_ifname}"
+
+    with bkg(rx_cmd, ksft_ready=True, exit_wait=True, fail=False) as rx_proc:
+        cmd(tx_cmd, host=cfg.remote)
+
+    if not ignore_fail:
+        ksft_eq(rx_proc.ret, 0)
+        if rx_proc.ret != 0:
+            ksft_pr(rx_proc)
+
+    return rx_proc.stdout
+
+
+def _require_hw_gro_stats(cfg, queue_id):
+    """Check if device reports HW GRO stats for the queue."""
+    stats = _get_queue_stats(cfg, queue_id)
+    required = ['rx-packets', 'rx-hw-gro-packets', 'rx-hw-gro-wire-packets']
+    for stat in required:
+        if stat not in stats:
+            raise KsftSkipEx(f"Driver does not report '{stat}' via qstats")
+
+
+def _set_ethtool_feat(cfg, current, feats):
+    """Set ethtool features with defer to restore original state."""
+    s2n = {True: "on", False: "off"}
+
+    new = ["-K", cfg.ifname]
+    old = ["-K", cfg.ifname]
+    no_change = True
+    for name, state in feats.items():
+        new += [name, s2n[state]]
+        old += [name, s2n[current[name]["active"]]]
+
+        if current[name]["active"] != state:
+            no_change = False
+            if current[name]["fixed"]:
+                raise KsftSkipEx(f"Device does not support {name}")
+    if no_change:
+        return
+
+    eth_cmd = ethtool(" ".join(new))
+    defer(ethtool, " ".join(old))
+
+    # If ethtool printed something, the kernel must have modified some features
+    if eth_cmd.stdout:
+        ksft_pr(eth_cmd)
+
+
+def _setup_hw_gro(cfg):
+    """Enable HW GRO on the device, disabling SW GRO."""
+    feat = ethtool(f"-k {cfg.ifname}", json=True)[0]
+
+    # Try to disable SW GRO and enable HW GRO
+    _set_ethtool_feat(cfg, feat,
+                      {"generic-receive-offload": False,
+                       "rx-gro-hw": True,
+                       "large-receive-offload": False})
+
+    # Some NICs treat HW GRO as a GRO sub-feature, so disabling GRO
+    # also clears HW GRO.  Work around this by attaching generic XDP,
+    # which makes the stack skip SW GRO even when it is enabled.
+    feat = ethtool(f"-k {cfg.ifname}", json=True)[0]
+    if not feat["rx-gro-hw"]["active"]:
+        ksft_pr("Driver clears HW GRO when SW GRO is cleared, using generic XDP workaround")
+        prog = cfg.net_lib_dir / "xdp_dummy.bpf.o"
+        ip(f"link set dev {cfg.ifname} xdpgeneric obj {prog} sec xdp")
+        defer(ip, f"link set dev {cfg.ifname} xdpgeneric off")
+
+        # Attaching XDP may change features, fetch the latest state
+        feat = ethtool(f"-k {cfg.ifname}", json=True)[0]
+
+        _set_ethtool_feat(cfg, feat,
+                          {"generic-receive-offload": True,
+                           "rx-gro-hw": True,
+                           "large-receive-offload": False})
+
+
+def _check_gro_stats(cfg, test_queue, stats_before,
+                     expect_rx, expect_gro, expect_wire):
+    """Validate GRO stats against expected values."""
+    stats_after = _get_queue_stats(cfg, test_queue)
+
+    rx_delta = (stats_after.get('rx-packets', 0) -
+                stats_before.get('rx-packets', 0))
+    gro_delta = (stats_after.get('rx-hw-gro-packets', 0) -
+                 stats_before.get('rx-hw-gro-packets', 0))
+    wire_delta = (stats_after.get('rx-hw-gro-wire-packets', 0) -
+                  stats_before.get('rx-hw-gro-wire-packets', 0))
+
+    ksft_eq(rx_delta, expect_rx, comment="rx-packets")
+    ksft_eq(gro_delta, expect_gro, comment="rx-hw-gro-packets")
+    ksft_eq(wire_delta, expect_wire, comment="rx-hw-gro-wire-packets")
+
+
+def test_gro_stats_single(cfg):
+    """
+    Test that a single packet doesn't affect GRO stats.
+
+    Send a single packet that cannot be coalesced (nothing to coalesce with).
+    GRO stats should not increase since no coalescing occurred.
+    rx-packets should increase by 2 (1 data + 1 FIN).
+    """
+    _setup_hw_gro(cfg)
+
+    test_queue = _setup_isolated_queue(cfg)
+    _require_hw_gro_stats(cfg, test_queue)
+
+    stats_before = _get_queue_stats(cfg, test_queue)
+
+    _run_gro_test(cfg, "single")
+
+    # 1 data + 1 FIN = 2 rx-packets, no coalescing
+    _check_gro_stats(cfg, test_queue, stats_before,
+                     expect_rx=2, expect_gro=0, expect_wire=0)
+
+
+def test_gro_stats_full(cfg):
+    """
+    Test GRO stats when overwhelming HW GRO capacity.
+
+    Send 500 flows to exceed HW GRO flow capacity on a single queue.
+    This should result in some packets not being coalesced.
+    Validate that qstats match what gro.c observed.
+    """
+    _setup_hw_gro(cfg)
+
+    test_queue = _setup_isolated_queue(cfg)
+    _require_hw_gro_stats(cfg, test_queue)
+
+    num_flows = 500
+    stats_before = _get_queue_stats(cfg, test_queue)
+
+    # Run capacity test - will likely fail because not all packets coalesce
+    output = _run_gro_test(cfg, "capacity", num_flows=num_flows,
+                           ignore_fail=True)
+
+    # Parse gro.c output: "STATS: received=X wire=Y coalesced=Z"
+    match = re.search(r'STATS: received=(\d+) wire=(\d+) coalesced=(\d+)',
+                      output)
+    if not match:
+        raise KsftSkipEx(f"Could not parse gro.c output: {output}")
+
+    rx_frames = int(match.group(2))
+    gro_coalesced = int(match.group(3))
+
+    ksft_ge(gro_coalesced, 1,
+            comment="At least some packets should coalesce")
+
+    # received + 1 FIN, coalesced super-packets, coalesced * 2 wire packets
+    _check_gro_stats(cfg, test_queue, stats_before,
+                     expect_rx=rx_frames + 1,
+                     expect_gro=gro_coalesced,
+                     expect_wire=gro_coalesced * 2)
+
+
+def main() -> None:
+    """ Ksft boiler plate main """
+
+    with NetDrvEpEnv(__file__, nsim_test=False) as cfg:
+        cfg.netnl = NetdevFamily()
+        ksft_run([test_gro_stats_single,
+                  test_gro_stats_full], args=(cfg,))
+    ksft_exit()
+
+
+if __name__ == "__main__":
+    main()
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread
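[Editorial note: the qstat cross-check in test_gro_stats_full() above reduces to a little arithmetic on the STATS line that gro.c prints -- one extra rx packet for the FIN, and two wire segments behind every coalesced super-frame. A standalone sketch with made-up numbers; the helper name is ours, not part of the patch:]

```python
import re

def expected_qstat_deltas(stats_line):
    """Derive the expected per-queue qstat deltas from the STATS line
    printed by gro.c, mirroring test_gro_stats_full(): the receiver
    additionally gets one FIN, and each coalesced super-frame stands
    for the two segments sent per flow."""
    m = re.search(r"STATS: received=(\d+) wire=(\d+) coalesced=(\d+)",
                  stats_line)
    if m is None:
        raise ValueError(f"cannot parse: {stats_line!r}")
    received, wire, coalesced = map(int, m.groups())
    return {
        "rx-packets": wire + 1,                   # all wire frames + FIN
        "rx-hw-gro-packets": coalesced,           # super-frames built by HW
        "rx-hw-gro-wire-packets": coalesced * 2,  # 2 segments per flow
    }
```

Note that gro.c itself prints wire = received + coalesced, since each super-frame hides exactly one extra wire packet in this test.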

* [PATCH net-next 8/9] selftests: drv-net: gro: add test for packet ordering
  2026-02-05 22:05 [PATCH net-next 0/9] net: stats, tools, driver tests for HW GRO Jakub Kicinski
                   ` (6 preceding siblings ...)
  2026-02-05 22:05 ` [PATCH net-next 7/9] selftests: drv-net: gro: test GRO stats Jakub Kicinski
@ 2026-02-05 22:05 ` Jakub Kicinski
  2026-02-05 22:05 ` [PATCH net-next 9/9] selftests: drv-net: gro: add a test for HW-GRO depth Jakub Kicinski
  2026-02-06 15:36 ` [PATCH net-next 0/9] net: stats, tools, driver tests for HW GRO Petr Machata
  9 siblings, 0 replies; 18+ messages in thread
From: Jakub Kicinski @ 2026-02-05 22:05 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, shuah, willemb,
	petrm, donald.hunter, michael.chan, pavan.chebbi, linux-kselftest,
	Jakub Kicinski

Add a test to check whether the NIC reorders packets when they hit GRO.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 tools/testing/selftests/net/lib/gro.c         | 38 +++++++++++++++++--
 .../selftests/drivers/net/hw/gro_hw.py        | 29 ++++++++++++--
 2 files changed, 61 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/net/lib/gro.c b/tools/testing/selftests/net/lib/gro.c
index d37676c009e8..3d3312b37248 100644
--- a/tools/testing/selftests/net/lib/gro.c
+++ b/tools/testing/selftests/net/lib/gro.c
@@ -131,6 +131,7 @@ static int ethhdr_proto = -1;
 static bool ipip;
 static uint64_t txtime_ns;
 static int num_flows = 4;
+static bool order_check;
 
 #define CAPACITY_PAYLOAD_LEN 200
 
@@ -1134,6 +1135,7 @@ static void check_capacity_pkts(int fd)
 	static char buffer[IP_MAXPACKET + ETH_HLEN + 1];
 	struct iphdr *iph = (struct iphdr *)(buffer + ETH_HLEN);
 	struct ipv6hdr *ip6h = (struct ipv6hdr *)(buffer + ETH_HLEN);
+	int num_pkt = 0, num_coal = 0, pkt_idx;
 	const char *fail_reason = NULL;
 	int flow_order[num_flows * 2];
 	int coalesced[num_flows];
@@ -1142,8 +1144,6 @@ static void check_capacity_pkts(int fd)
 	int total_data = 0;
 	int pkt_size = -1;
 	int data_len = 0;
-	int num_pkt = 0;
-	int num_coal = 0;
 	int flow_id;
 	int sport;
 
@@ -1201,6 +1201,34 @@ static void check_capacity_pkts(int fd)
 		total_data += data_len;
 	}
 
+	/* Check flow ordering. We expect to see all non-coalesced first segs
+	 * then interleaved coalesced and non-coalesced second frames.
+	 */
+	pkt_idx = 0;
+	for (flow_id = 0; order_check && flow_id < num_flows; flow_id++) {
+		bool coaled = coalesced[flow_id] > CAPACITY_PAYLOAD_LEN;
+
+		if (coaled)
+			continue;
+
+		if (flow_order[pkt_idx] != flow_id) {
+			vlog("Flow order mismatch (non-coalesced) at position %d: expected flow %d, got flow %d\n",
+			     pkt_idx, flow_id, flow_order[pkt_idx]);
+			fail_reason = fail_reason ?: "bad packet order (1)";
+		}
+		pkt_idx++;
+	}
+	for (flow_id = 0; order_check && flow_id < num_flows; flow_id++) {
+		bool coaled = coalesced[flow_id] > CAPACITY_PAYLOAD_LEN;
+
+		if (flow_order[pkt_idx] != flow_id) {
+			vlog("Flow order mismatch at position %d: expected flow %d, got flow %d, coalesced: %d\n",
+			     pkt_idx, flow_id, flow_order[pkt_idx], coaled);
+			fail_reason = fail_reason ?: "bad packet order (2)";
+		}
+		pkt_idx++;
+	}
+
 	if (!fail_reason) {
 		vlog("All %d flows coalesced correctly\n", num_flows);
 		printf("Test succeeded\n\n");
@@ -1614,12 +1642,13 @@ static void parse_args(int argc, char **argv)
 		{ "saddr", required_argument, NULL, 's' },
 		{ "smac", required_argument, NULL, 'S' },
 		{ "test", required_argument, NULL, 't' },
+		{ "order-check", no_argument, NULL, 'o' },
 		{ "verbose", no_argument, NULL, 'v' },
 		{ 0, 0, 0, 0 }
 	};
 	int c;
 
-	while ((c = getopt_long(argc, argv, "46d:D:ei:n:rs:S:t:v", opts, NULL)) != -1) {
+	while ((c = getopt_long(argc, argv, "46d:D:ei:n:rs:S:t:ov", opts, NULL)) != -1) {
 		switch (c) {
 		case '4':
 			proto = PF_INET;
@@ -1658,6 +1687,9 @@ static void parse_args(int argc, char **argv)
 		case 't':
 			testname = optarg;
 			break;
+		case 'o':
+			order_check = true;
+			break;
 		case 'v':
 			verbose = true;
 			break;
diff --git a/tools/testing/selftests/drivers/net/hw/gro_hw.py b/tools/testing/selftests/drivers/net/hw/gro_hw.py
index 3bca19e8f339..18a3b1bceefd 100755
--- a/tools/testing/selftests/drivers/net/hw/gro_hw.py
+++ b/tools/testing/selftests/drivers/net/hw/gro_hw.py
@@ -10,7 +10,7 @@ import glob
 import re
 
 from lib.py import ksft_run, ksft_exit, ksft_pr
-from lib.py import ksft_eq, ksft_ge
+from lib.py import ksft_eq, ksft_ge, ksft_variants
 from lib.py import NetDrvEpEnv, NetdevFamily
 from lib.py import KsftSkipEx
 from lib.py import bkg, cmd, defer, ethtool, ip
@@ -78,7 +78,8 @@ GRO_DPORT = 8000
     return test_queue
 
 
-def _run_gro_test(cfg, test_name, num_flows=None, ignore_fail=False):
+def _run_gro_test(cfg, test_name, num_flows=None, ignore_fail=False,
+                  order_check=False):
     """Run gro binary with given test and return output."""
     if not hasattr(cfg, "bin_remote"):
         cfg.bin_local = cfg.net_lib_dir / "gro"
@@ -98,6 +99,8 @@ GRO_DPORT = 8000
     ]
     if num_flows:
         base_args.append(f"--num-flows {num_flows}")
+    if order_check:
+        base_args.append("--order-check")
 
     args = " ".join(base_args)
 
@@ -257,13 +260,33 @@ def _check_gro_stats(cfg, test_queue, stats_before,
                      expect_wire=gro_coalesced * 2)
 
 
+@ksft_variants([4, 32, 512])
+def test_gro_order(cfg, num_flows):
+    """
+    Test that HW GRO preserves packet ordering between flows.
+
+    Packets may get delayed until the aggregate is released,
+    but reordering between aggregates, packets terminating
+    an aggregate, and normal packets should not happen.
+
+    Note that this test is stricter than truly required.
+    Reordering packets between flows should not cause issues.
+    This test will also fail if traffic is run over an ECMP fabric.
+    """
+    _setup_hw_gro(cfg)
+    _setup_isolated_queue(cfg)
+
+    _run_gro_test(cfg, "capacity", num_flows=num_flows, order_check=True)
+
+
 def main() -> None:
     """ Ksft boiler plate main """
 
     with NetDrvEpEnv(__file__, nsim_test=False) as cfg:
         cfg.netnl = NetdevFamily()
         ksft_run([test_gro_stats_single,
-                  test_gro_stats_full], args=(cfg,))
+                  test_gro_stats_full,
+                  test_gro_order], args=(cfg,))
     ksft_exit()
 
 
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread
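[Editorial note: the ordering invariant the C checker above enforces can be stated compactly -- a sketch under the same assumptions as the test (the helper is ours):]

```python
def expected_order(coalesced):
    """Expected receive order for the capacity test: first the lone
    first-segments of every flow the NIC did not coalesce, in flow
    order, then one frame per flow (either the super-frame or the
    un-coalesced second segment), again in flow order.
    coalesced[f] is True if flow f was coalesced by the NIC."""
    not_coalesced = [f for f, coal in enumerate(coalesced) if not coal]
    return not_coalesced + list(range(len(coalesced)))
```

For example, with flows 0 and 2 coalesced out of four, `expected_order([True, False, True, False])` yields `[1, 3, 0, 1, 2, 3]`: the stray first segments of flows 1 and 3, then one frame per flow.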

* [PATCH net-next 9/9] selftests: drv-net: gro: add a test for HW-GRO depth
  2026-02-05 22:05 [PATCH net-next 0/9] net: stats, tools, driver tests for HW GRO Jakub Kicinski
                   ` (7 preceding siblings ...)
  2026-02-05 22:05 ` [PATCH net-next 8/9] selftests: drv-net: gro: add test for packet ordering Jakub Kicinski
@ 2026-02-05 22:05 ` Jakub Kicinski
  2026-02-06 15:36 ` [PATCH net-next 0/9] net: stats, tools, driver tests for HW GRO Petr Machata
  9 siblings, 0 replies; 18+ messages in thread
From: Jakub Kicinski @ 2026-02-05 22:05 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, shuah, willemb,
	petrm, donald.hunter, michael.chan, pavan.chebbi, linux-kselftest,
	Jakub Kicinski

Reuse the long sequence test to max out the NIC HW-GRO depth.
Repeat for a single queue and RSS context with 8 queues.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 .../selftests/drivers/net/hw/gro_hw.py        | 86 ++++++++++++++++++-
 1 file changed, 83 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/drivers/net/hw/gro_hw.py b/tools/testing/selftests/drivers/net/hw/gro_hw.py
index 18a3b1bceefd..7450b8884f44 100755
--- a/tools/testing/selftests/drivers/net/hw/gro_hw.py
+++ b/tools/testing/selftests/drivers/net/hw/gro_hw.py
@@ -10,8 +10,8 @@ import glob
 import re
 
 from lib.py import ksft_run, ksft_exit, ksft_pr
-from lib.py import ksft_eq, ksft_ge, ksft_variants
-from lib.py import NetDrvEpEnv, NetdevFamily
+from lib.py import ksft_eq, ksft_ge, ksft_variants, KsftNamedVariant
+from lib.py import NetDrvEpEnv, NetdevFamily, EthtoolFamily
 from lib.py import KsftSkipEx
 from lib.py import bkg, cmd, defer, ethtool, ip
 
@@ -78,6 +78,19 @@ GRO_DPORT = 8000
     return test_queue
 
 
+def _setup_queue_count(cfg, num_queues):
+    """Configure the NIC to use a specific number of queues."""
+    channels = cfg.ethnl.channels_get({'header': {'dev-index': cfg.ifindex}})
+    ch_max = channels.get('combined-max', 0)
+    qcnt = channels['combined-count']
+
+    if ch_max < num_queues:
+        raise KsftSkipEx(f"Need at least {num_queues} queues, max={ch_max}")
+
+    defer(ethtool, f"-L {cfg.ifname} combined {qcnt}")
+    ethtool(f"-L {cfg.ifname} combined {num_queues}")
+
+
 def _run_gro_test(cfg, test_name, num_flows=None, ignore_fail=False,
                   order_check=False):
     """Run gro binary with given test and return output."""
@@ -279,14 +292,81 @@ def _check_gro_stats(cfg, test_queue, stats_before,
     _run_gro_test(cfg, "capacity", num_flows=num_flows, order_check=True)
 
 
+@ksft_variants([
+    KsftNamedVariant("isolated", _setup_isolated_queue),
+    KsftNamedVariant("1q", lambda cfg: _setup_queue_count(cfg, 1)),
+    KsftNamedVariant("8q", lambda cfg: _setup_queue_count(cfg, 8)),
+])
+def test_gro_capacity(cfg, setup_func):
+    """
+    Probe HW GRO capacity.
+
+    Start with 8 flows and double the flow count on each successful run.
+    Retry up to 3 times on failure.
+
+    Variants:
+      - isolated: Use a single queue isolated from RSS
+      - 1q: Configure NIC to use 1 queue
+      - 8q: Configure NIC to use 8 queues
+    """
+    max_retries = 3
+
+    _setup_hw_gro(cfg)
+    queue_id = setup_func(cfg)
+
+    num_flows = 8
+    while True:
+        success = False
+        for attempt in range(max_retries):
+            if queue_id is not None:
+                stats_before = _get_queue_stats(cfg, queue_id)
+
+            output = _run_gro_test(cfg, "capacity", num_flows=num_flows,
+                                   ignore_fail=True)
+
+            if queue_id is not None:
+                stats_after = _get_queue_stats(cfg, queue_id)
+                qstat_pkts = (stats_after.get('rx-packets', 0) -
+                              stats_before.get('rx-packets', 0))
+                qstat_str = f" qstat={qstat_pkts}"
+            else:
+                qstat_str = ""
+
+            # Parse and print STATS line
+            match = re.search(
+                r'STATS: received=(\d+) wire=(\d+) coalesced=(\d+)', output)
+            if match:
+                received = int(match.group(1))
+                wire = int(match.group(2))
+                coalesced = int(match.group(3))
+                status = "PASS" if received == num_flows else "FAIL"
+                ksft_pr(f"flows={num_flows} attempt={attempt + 1} "
+                        f"received={received} wire={wire} "
+                        f"coalesced={coalesced}{qstat_str} [{status}]")
+                if received == num_flows:
+                    success = True
+                    break
+            else:
+                ksft_pr(f"flows={num_flows} attempt={attempt + 1}"
+                        f"{qstat_str} [FAIL - can't parse stats]")
+
+        if not success:
+            ksft_pr(f"Stopped at {num_flows} flows")
+            break
+
+        num_flows *= 2
+
+
 def main() -> None:
     """ Ksft boiler plate main """
 
     with NetDrvEpEnv(__file__, nsim_test=False) as cfg:
+        cfg.ethnl = EthtoolFamily()
         cfg.netnl = NetdevFamily()
         ksft_run([test_gro_stats_single,
                   test_gro_stats_full,
-                  test_gro_order], args=(cfg,))
+                  test_gro_order,
+                  test_gro_capacity], args=(cfg,))
     ksft_exit()
 
 
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread
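[Editorial note: the probing loop in test_gro_capacity() is a geometric search with retries; a minimal sketch of the control flow, with the run/report plumbing abstracted behind a callback (names are ours):]

```python
def probe_capacity(run_once, start=8, growth=2, max_retries=3):
    """Geometric probe of HW-GRO flow capacity: keep growing the flow
    count while at least one of max_retries attempts fully coalesces,
    and report the last count that passed (None if even the starting
    count fails).  run_once(num_flows) must return True when all
    flows were coalesced; any() short-circuits on the first success,
    matching the break-on-success in the test."""
    last_good = None
    num_flows = start
    while any(run_once(num_flows) for _ in range(max_retries)):
        last_good = num_flows
        num_flows *= growth
    return last_good
```

A NIC that can track 32 concurrent aggregation sessions would stop the probe at 32 (8, 16, 32 pass; 64 fails three times).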

* Re: [PATCH net-next 1/9] eth: bnxt: gather and report HW-GRO stats
  2026-02-05 22:05 ` [PATCH net-next 1/9] eth: bnxt: gather and report HW-GRO stats Jakub Kicinski
@ 2026-02-05 22:44   ` Michael Chan
  2026-02-06  0:24     ` Jakub Kicinski
  0 siblings, 1 reply; 18+ messages in thread
From: Michael Chan @ 2026-02-05 22:44 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, shuah,
	willemb, petrm, donald.hunter, pavan.chebbi, linux-kselftest


On Thu, Feb 5, 2026 at 2:07 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> Count and report HW-GRO stats as seen by the kernel.
> The device stats for GRO seem to not reflect the reality,
> perhaps they count sessions which did not actually result
> in any aggregation.

Yes, the HW count includes single packets without additional
aggregations.  In the driver, when we see only 1 segment, we treat it
as a non GRO packet.  That's likely the discrepancy you're seeing.

Also, for completeness, should we count LRO packets as well?

> Also they count wire packets, so we
> have to count super-frames, anyway.
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---

> @@ -1804,7 +1804,8 @@ static inline struct sk_buff *bnxt_gro_skb(struct bnxt *bp,
>                                            struct bnxt_tpa_info *tpa_info,
>                                            struct rx_tpa_end_cmp *tpa_end,
>                                            struct rx_tpa_end_cmp_ext *tpa_end1,
> -                                          struct sk_buff *skb)
> +                                          struct sk_buff *skb,
> +                                          struct bnxt_rx_sw_stats *rx_stats)
>  {
>  #ifdef CONFIG_INET
>         int payload_off;
> @@ -1814,6 +1815,11 @@ static inline struct sk_buff *bnxt_gro_skb(struct bnxt *bp,
>         if (segs == 1)
>                 return skb;
>
> +       if (bp->dev->features & NETIF_F_GRO_HW) {

If we enter this function, NETIF_F_GRO_HW should always be true.

> +               rx_stats->rx_hw_gro_packets++;
> +               rx_stats->rx_hw_gro_wire_packets += segs;
> +       }
> +
>         NAPI_GRO_CB(skb)->count = segs;
>         skb_shinfo(skb)->gso_size =
>                 le32_to_cpu(tpa_end1->rx_tpa_end_cmp_seg_len);


^ permalink raw reply	[flat|nested] 18+ messages in thread
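[Editorial note: a toy model of the counting semantics discussed in this sub-thread, under the assumption stated by Michael -- the driver treats a 1-segment TPA session as a plain packet, while the device counts every session; the function and its inputs are illustrative only:]

```python
def hw_gro_counters(tpa_sessions):
    """tpa_sessions lists the number of wire segments each TPA
    (aggregation) session ended with.  Only multi-segment sessions
    bump the new rx-hw-gro counters; the device-level counter counts
    every session, which explains the discrepancy discussed above."""
    super_pkts = sum(1 for segs in tpa_sessions if segs > 1)
    wire_pkts = sum(segs for segs in tpa_sessions if segs > 1)
    device_sessions = len(tpa_sessions)
    return super_pkts, wire_pkts, device_sessions
```

Four sessions ending with 1, 4, 1, and 16 segments would report 2 super-packets and 20 wire packets via qstats, while the device-side session counter reads 4.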

* Re: [PATCH net-next 1/9] eth: bnxt: gather and report HW-GRO stats
  2026-02-05 22:44   ` Michael Chan
@ 2026-02-06  0:24     ` Jakub Kicinski
  0 siblings, 0 replies; 18+ messages in thread
From: Jakub Kicinski @ 2026-02-06  0:24 UTC (permalink / raw)
  To: Michael Chan
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, shuah,
	willemb, petrm, donald.hunter, pavan.chebbi, linux-kselftest

On Thu, 5 Feb 2026 14:44:53 -0800 Michael Chan wrote:
> On Thu, Feb 5, 2026 at 2:07 PM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > Count and report HW-GRO stats as seen by the kernel.
> > The device stats for GRO seem to not reflect the reality,
> > perhaps they count sessions which did not actually result
> > in any aggregation.  
> 
> Yes, the HW count includes single packets without additional
> aggregations.  In the driver, when we see only 1 segment, we treat it
> as a non GRO packet.  That's likely the discrepancy you're seeing.
> 
> Also, for completeness, should we count LRO packets as well?

Not in this counter:

      -
        name: rx-hw-gro-wire-packets
        doc: |
          Number of packets that were coalesced to bigger packets with the
          HW-GRO netdevice feature. LRO-coalesced packets are not counted.

I don't think we have a counter defined for LRO, yet.

> > @@ -1814,6 +1815,11 @@ static inline struct sk_buff *bnxt_gro_skb(struct bnxt *bp,
> >         if (segs == 1)
> >                 return skb;
> >
> > +       if (bp->dev->features & NETIF_F_GRO_HW) {  
> 
> If we enter this function, NETIF_F_GRO_HW should always be true.

Ah, I see it now.. will drop the condition.

^ permalink raw reply	[flat|nested] 18+ messages in thread
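[Editorial note: one formula consistent with the numbers in patch 3's commit message (93.75% for perfect 16-segment aggregation at 4k MTU, ~96.9% for the quoted iperf run), assuming per-queue rx-packets counts wire frames as in that bnxt example; ynltool's actual computation may differ:]

```python
def hw_gro_savings(rx_packets, gro_packets, gro_wire_packets):
    """Fraction of wire packets the kernel never had to process
    individually.  The kernel only saw
    rx_packets - (gro_wire_packets - gro_packets) packets, so the
    saved fraction is (gro_wire_packets - gro_packets) / rx_packets.
    Assumes rx-packets counts wire frames."""
    return (gro_wire_packets - gro_packets) / rx_packets
```

With 16 segments folded into each 64kB super-frame this gives 15/16 = 93.75%, and the counters from the commit-message example (rx 40681280, super 1225133, wire 40656633) give roughly 96.9%.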

* Re: [PATCH net-next 3/9] tools: ynltool: add qstats analysis for HW-GRO efficiency / savings
  2026-02-05 22:05 ` [PATCH net-next 3/9] tools: ynltool: add qstats analysis for HW-GRO efficiency / savings Jakub Kicinski
@ 2026-02-06 13:44   ` Petr Machata
  0 siblings, 0 replies; 18+ messages in thread
From: Petr Machata @ 2026-02-06 13:44 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, shuah,
	willemb, petrm, donald.hunter, michael.chan, pavan.chebbi,
	linux-kselftest


Jakub Kicinski <kuba@kernel.org> writes:

> Extend ynltool to compute a HW GRO savings metric - how many
> packets HW GRO has been able to save the kernel from seeing.
>
> Note that this definition does not actually take into account
> whether the segments were or weren't eligible for HW GRO.
> If a machine is receiving all-UDP traffic - the new metric will show
> HW-GRO savings of 0%. Conversely, since the super-packet still
> counts as a received packet, savings of 100% is not achievable.
> Perfect HW-GRO on a machine with 4k MTU and 64kB super-frames
> would show ~93.75% savings. With 1.5k MTU we may see up to
> ~97.8% savings (if my math is right).
>
> Example after 10 sec of iperf on a freshly booted machine
> with 1.5k MTU:
>
>   $ ynltool qstats show
>   eth0     rx-packets:  40681280               rx-bytes:   61575208437
>         rx-alloc-fail:         0      rx-hw-gro-packets:       1225133
>                                  rx-hw-gro-wire-packets:      40656633
>   $ ynltool qstats hw-gro
>   eth0: 96.9% savings
>
> None of the NICs I have access to can report "missed" HW-GRO
> opportunities, so computing a true "effectiveness" metric
> is not possible. One could also argue that an effectiveness metric
> is inferior in environments where we control both senders and
> receivers: the savings metric will capture both regressions
> in the receiver's HW GRO effectiveness and regressions in senders
> sending smaller TSO trains. And we care about both. The main
> downside is that it's hard to tell at a glance how well the NIC
> is doing, because the savings depend on traffic patterns.
>
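The limits quoted above check out; a quick sanity-check, assuming one
super-packet per aggregation, 16 segments per 64 kB frame at 4k MTU
and ~45 segments at an MSS of ~1448 bytes:

```python
def savings(wire_pkts, super_pkts):
    # Fraction of wire packets the kernel did not have to process;
    # the super-packet itself still counts as one received packet.
    return 1 - super_pkts / wire_pkts

perfect_4k = savings(16, 1)     # 64kB / 4k MTU      -> 16 segments
perfect_1500 = savings(45, 1)   # 64kB / ~1448B MSS  -> ~45 segments
iperf = savings(40656633, 1225133)   # the iperf example above
```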
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
>  tools/net/ynl/ynltool/qstats.c | 75 +++++++++++++++++++++++++++++++---
>  1 file changed, 70 insertions(+), 5 deletions(-)
>
> diff --git a/tools/net/ynl/ynltool/qstats.c b/tools/net/ynl/ynltool/qstats.c
> index d19acab0bf2a..e5b83cf9bf3b 100644
> --- a/tools/net/ynl/ynltool/qstats.c
> +++ b/tools/net/ynl/ynltool/qstats.c

Since I see there's going to be a v2, a nit:

> @@ -580,6 +638,7 @@ static int do_help(int argc __attribute__((unused)),
>  		"Usage: %s qstats { COMMAND | help }\n"
>  		"       %s qstats [ show ] [ OPTIONS ]\n"
>  		"       %s qstats balance\n"
> +		"       %s qstats hw-gro\n"
>  		"\n"
>  		"       OPTIONS := { scope queue | group-by { device | queue } }\n"
>  		"\n"

I think at this point it would make sense to convert to %1$s throughout
instead of pumping in more arguments.

> @@ -588,17 +647,23 @@ static int do_help(int argc __attribute__((unused)),
>  		"       show scope queue      - Display per-queue statistics\n"
>  		"       show group-by device  - Display device-aggregated statistics (default)\n"
>  		"       show group-by queue   - Display per-queue statistics\n"
> -		"       balance               - Analyze traffic distribution balance.\n"
> +		"\n"
> +		"  Analysis:\n"
> +		"       balance               - Traffic distribution between queues.\n"
> +		"       hw-gro                - HW GRO effectiveness analysis\n"
> +		"                               - savings - delta between packets received\n"
> +		"                                 on the wire and packets seen by the kernel.\n"
>  		"",
> -		bin_name, bin_name, bin_name);
> +		bin_name, bin_name, bin_name, bin_name);
>  
>  	return 0;
>  }


* Re: [PATCH net-next 5/9] selftests: drv-net: give HW stats sync time extra 25% of margin
  2026-02-05 22:05 ` [PATCH net-next 5/9] selftests: drv-net: give HW stats sync time extra 25% of margin Jakub Kicinski
@ 2026-02-06 14:40   ` Petr Machata
  0 siblings, 0 replies; 18+ messages in thread
From: Petr Machata @ 2026-02-06 14:40 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, shuah,
	willemb, petrm, donald.hunter, michael.chan, pavan.chebbi,
	linux-kselftest


Jakub Kicinski <kuba@kernel.org> writes:

> There are transient failures for devices which update stats
> periodically, especially if it's the FW DMA'ing the stats
> rather than host periodic work querying the FW. Wait 25%
> longer than strictly necessary.
>
> For devices which don't report stats-block-usecs we retain
> 25 msec as the default wait time.

Right, because now the default is 0.02, and we apply the 1.25 multiplier.

> Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Reviewed-by: Petr Machata <petrm@nvidia.com>

> ---
>  tools/testing/selftests/drivers/net/lib/py/env.py | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/tools/testing/selftests/drivers/net/lib/py/env.py b/tools/testing/selftests/drivers/net/lib/py/env.py
> index 41cc248ac848..39a98eb2592e 100644
> --- a/tools/testing/selftests/drivers/net/lib/py/env.py
> +++ b/tools/testing/selftests/drivers/net/lib/py/env.py
> @@ -285,7 +285,7 @@ from .remote import Remote
>                  if "Operation not supported" not in e.cmd.stderr:
>                      raise
>  
> -            self._stats_settle_time = 0.025 + \
> -                data.get('stats-block-usecs', 0) / 1000 / 1000
> +            self._stats_settle_time = \
> +                1.25 * data.get('stats-block-usecs', 20000) / 1000 / 1000
>  
>          time.sleep(self._stats_settle_time)
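The before/after arithmetic, for reference (a sketch of the two
formulas, not the test-suite code itself):

```python
def settle_time_old(stats_block_usecs=0):
    # Old: a fixed 25 msec on top of the device's refresh period.
    return 0.025 + stats_block_usecs / 1000 / 1000

def settle_time_new(stats_block_usecs=20000):
    # New: 25% margin on the refresh period; devices which do not
    # report stats-block-usecs get the 20000 usec default, so the
    # default wait stays at 25 msec.
    return 1.25 * stats_block_usecs / 1000 / 1000
```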



* Re: [PATCH net-next 2/9] tools: ynltool: factor out qstat dumping
  2026-02-05 22:05 ` [PATCH net-next 2/9] tools: ynltool: factor out qstat dumping Jakub Kicinski
@ 2026-02-06 14:58   ` Petr Machata
  0 siblings, 0 replies; 18+ messages in thread
From: Petr Machata @ 2026-02-06 14:58 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, shuah,
	willemb, petrm, donald.hunter, michael.chan, pavan.chebbi,
	linux-kselftest


Jakub Kicinski <kuba@kernel.org> writes:

> The logic to open a socket and dump the queues is the same
> across sub-commands. Factor it out, we'll need it again.
>
> No functional changes intended.
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Reviewed-by: Petr Machata <petrm@nvidia.com>


* Re: [PATCH net-next 4/9] selftests: net: move gro to lib for HW vs SW reuse
  2026-02-05 22:05 ` [PATCH net-next 4/9] selftests: net: move gro to lib for HW vs SW reuse Jakub Kicinski
@ 2026-02-06 15:01   ` Petr Machata
  0 siblings, 0 replies; 18+ messages in thread
From: Petr Machata @ 2026-02-06 15:01 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, shuah,
	willemb, petrm, donald.hunter, michael.chan, pavan.chebbi,
	linux-kselftest


Jakub Kicinski <kuba@kernel.org> writes:

> The gro.c packet sender is used for SW testing but the bulk of incoming
> new tests will be HW-specific. So it's better to put them under
> drivers/net/hw/, to avoid tip-toeing around netdevsim. Move gro.c
> to lib so we can reuse it.
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Reviewed-by: Petr Machata <petrm@nvidia.com>


* Re: [PATCH net-next 6/9] selftests: drv-net: gro: use SO_TXTIME to schedule packets together
  2026-02-05 22:05 ` [PATCH net-next 6/9] selftests: drv-net: gro: use SO_TXTIME to schedule packets together Jakub Kicinski
@ 2026-02-06 15:19   ` Petr Machata
  0 siblings, 0 replies; 18+ messages in thread
From: Petr Machata @ 2026-02-06 15:19 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, shuah,
	willemb, petrm, donald.hunter, michael.chan, pavan.chebbi,
	linux-kselftest


Jakub Kicinski <kuba@kernel.org> writes:

> Longer packet sequence tests are quite flaky when the test is run
> over a real network. Try to avoid at least the jitter on the sender
> side by scheduling all the packets to be sent at once using SO_TXTIME.
> Use a hardcoded tx time of 5 msec in the future. In my tests, increasing
> this time past 2 msec makes no difference, so 5 msec is plenty of margin.
> Since we now expect more output buffering, make sure to raise SNDBUF.
>
> Experimenting with long sequences I see frequent failures when sending
> 200 packets, only 50-100 packets get coalesced. With this change
> up to 1000 packets get coalesced relatively reliably.
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
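For readers unfamiliar with the mechanism, a rough sketch of the
SO_TXTIME setup (constants hand-copied from the Linux uapi headers,
so verify them against your kernel; `txtime_cmsg` and `txtime_sockopt`
are hypothetical helpers, not the selftest's actual code):

```python
import struct

# From the Linux uapi headers:
# <asm-generic/socket.h>: SO_TXTIME = 61, SCM_TXTIME = SO_TXTIME
# <linux/time.h>:         CLOCK_TAI = 11
SOL_SOCKET = 1
SO_TXTIME = 61
SCM_TXTIME = SO_TXTIME
CLOCK_TAI = 11

def txtime_sockopt(clockid=CLOCK_TAI, flags=0):
    # struct sock_txtime { clockid_t clockid; __u32 flags; }
    return struct.pack("@iI", clockid, flags)

def txtime_cmsg(txtime_ns):
    """Ancillary data carrying the transmit time as a single u64 (ns)."""
    return [(SOL_SOCKET, SCM_TXTIME, struct.pack("@q", txtime_ns))]

# Usage (Linux only), e.g. to launch all queued packets ~5 msec out:
#   sock.setsockopt(SOL_SOCKET, SO_TXTIME, txtime_sockopt())
#   when = time.clock_gettime_ns(time.CLOCK_TAI) + 5_000_000
#   for pkt in pkts:
#       sock.sendmsg([pkt], txtime_cmsg(when))
```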

Reviewed-by: Petr Machata <petrm@nvidia.com>


* Re: [PATCH net-next 0/9] net: stats, tools, driver tests for HW GRO
  2026-02-05 22:05 [PATCH net-next 0/9] net: stats, tools, driver tests for HW GRO Jakub Kicinski
                   ` (8 preceding siblings ...)
  2026-02-05 22:05 ` [PATCH net-next 9/9] selftests: drv-net: gro: add a test for HW-GRO depth Jakub Kicinski
@ 2026-02-06 15:36 ` Petr Machata
  9 siblings, 0 replies; 18+ messages in thread
From: Petr Machata @ 2026-02-06 15:36 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, shuah,
	willemb, petrm, donald.hunter, michael.chan, pavan.chebbi,
	linux-kselftest


Jakub Kicinski <kuba@kernel.org> writes:

>   selftests: drv-net: gro: test GRO stats
>   selftests: drv-net: gro: add test for packet ordering
>   selftests: drv-net: gro: add a test for HW-GRO depth

I went over these three, and have to admit that I don't understand GRO
enough to have an opinion on them. Likewise the Broadcom-specific code.
So I'm done with the review.


end of thread, other threads:[~2026-02-06 15:39 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-02-05 22:05 [PATCH net-next 0/9] net: stats, tools, driver tests for HW GRO Jakub Kicinski
2026-02-05 22:05 ` [PATCH net-next 1/9] eth: bnxt: gather and report HW-GRO stats Jakub Kicinski
2026-02-05 22:44   ` Michael Chan
2026-02-06  0:24     ` Jakub Kicinski
2026-02-05 22:05 ` [PATCH net-next 2/9] tools: ynltool: factor out qstat dumping Jakub Kicinski
2026-02-06 14:58   ` Petr Machata
2026-02-05 22:05 ` [PATCH net-next 3/9] tools: ynltool: add qstats analysis for HW-GRO efficiency / savings Jakub Kicinski
2026-02-06 13:44   ` Petr Machata
2026-02-05 22:05 ` [PATCH net-next 4/9] selftests: net: move gro to lib for HW vs SW reuse Jakub Kicinski
2026-02-06 15:01   ` Petr Machata
2026-02-05 22:05 ` [PATCH net-next 5/9] selftests: drv-net: give HW stats sync time extra 25% of margin Jakub Kicinski
2026-02-06 14:40   ` Petr Machata
2026-02-05 22:05 ` [PATCH net-next 6/9] selftests: drv-net: gro: use SO_TXTIME to schedule packets together Jakub Kicinski
2026-02-06 15:19   ` Petr Machata
2026-02-05 22:05 ` [PATCH net-next 7/9] selftests: drv-net: gro: test GRO stats Jakub Kicinski
2026-02-05 22:05 ` [PATCH net-next 8/9] selftests: drv-net: gro: add test for packet ordering Jakub Kicinski
2026-02-05 22:05 ` [PATCH net-next 9/9] selftests: drv-net: gro: add a test for HW-GRO depth Jakub Kicinski
2026-02-06 15:36 ` [PATCH net-next 0/9] net: stats, tools, driver tests for HW GRO Petr Machata
