netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next v3 00/10] eth: bnxt: maintain basic pkt/byte counters in SW
@ 2025-03-05 22:52 Jakub Kicinski
  2025-03-05 22:52 ` [PATCH net-next v3 01/10] eth: bnxt: use napi_consume_skb() Jakub Kicinski
                   ` (10 more replies)
  0 siblings, 11 replies; 18+ messages in thread
From: Jakub Kicinski @ 2025-03-05 22:52 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, michael.chan,
	pavan.chebbi, przemyslaw.kitszel, Jakub Kicinski

Some workloads want to be able to track bandwidth utilization on
the scale of 10s of msecs. bnxt uses HW stats and async stats
updates, with update frequency controlled via ethtool -C.
Updating all HW stats more often than 100 msec is both hard for
the device and consumes PCIe bandwidth. Switch to maintaining
basic Rx / Tx packet and byte counters in SW.

Tested with drivers/net/stats.py:
  # Totals: pass:7 fail:0 xfail:0 xpass:0 skip:0 error:0

Manually tested by comparing the ethtool -S stats (which continues
to show HW stats) with qstats, and total interface stats.
With and without HW-GRO, and with XDP on / off.
Stopping and starting the interface also doesn't corrupt the values.

v3:
 - try to include vlan tag and padding length in the stats
v2: https://lore.kernel.org/20250228012534.3460918-1-kuba@kernel.org
 - fix skipping XDP vs the XDP Tx ring handling (Michael)
 - rename the defines as well as the structs (Przemek)
 - fix counding frag'ed packets in XDP Tx
v1: https://lore.kernel.org/20250226211003.2790916-1-kuba@kernel.org

Jakub Kicinski (10):
  eth: bnxt: use napi_consume_skb()
  eth: bnxt: don't run xdp programs on fallback traffic
  eth: bnxt: rename ring_err_stats -> ring_drv_stats
  eth: bnxt: snapshot driver stats
  eth: bnxt: don't use ifdef to check for CONFIG_INET in GRO
  eth: bnxt: consolidate the GRO-but-not-really paths in bnxt_gro_skb()
  eth: bnxt: extract VLAN info early on
  eth: bnxt: maintain rx pkt/byte stats in SW
  eth: bnxt: maintain tx pkt/byte stats in SW
  eth: bnxt: count xdp xmit packets

 drivers/net/ethernet/broadcom/bnxt/bnxt.h     |  49 +++-
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h |   5 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt.c     | 272 +++++++++++-------
 .../net/ethernet/broadcom/bnxt/bnxt_ethtool.c |  20 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c |  47 ++-
 5 files changed, 264 insertions(+), 129 deletions(-)

-- 
2.48.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH net-next v3 01/10] eth: bnxt: use napi_consume_skb()
  2025-03-05 22:52 [PATCH net-next v3 00/10] eth: bnxt: maintain basic pkt/byte counters in SW Jakub Kicinski
@ 2025-03-05 22:52 ` Jakub Kicinski
  2025-03-05 22:52 ` [PATCH net-next v3 02/10] eth: bnxt: don't run xdp programs on fallback traffic Jakub Kicinski
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Jakub Kicinski @ 2025-03-05 22:52 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, michael.chan,
	pavan.chebbi, przemyslaw.kitszel, Jakub Kicinski

Use napi_consume_skb() to improve skb recycling.
__bnxt_tx_int() already has the real NAPI passed in budget.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 15c57a06ecaf..f6a26f6f85bb 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -855,7 +855,7 @@ static bool __bnxt_tx_int(struct bnxt *bp, struct bnxt_tx_ring_info *txr,
 next_tx_int:
 		cons = NEXT_TX(cons);
 
-		dev_consume_skb_any(skb);
+		napi_consume_skb(skb, budget);
 	}
 
 	WRITE_ONCE(txr->tx_cons, cons);
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next v3 02/10] eth: bnxt: don't run xdp programs on fallback traffic
  2025-03-05 22:52 [PATCH net-next v3 00/10] eth: bnxt: maintain basic pkt/byte counters in SW Jakub Kicinski
  2025-03-05 22:52 ` [PATCH net-next v3 01/10] eth: bnxt: use napi_consume_skb() Jakub Kicinski
@ 2025-03-05 22:52 ` Jakub Kicinski
  2025-03-05 22:52 ` [PATCH net-next v3 03/10] eth: bnxt: rename ring_err_stats -> ring_drv_stats Jakub Kicinski
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Jakub Kicinski @ 2025-03-05 22:52 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, michael.chan,
	pavan.chebbi, przemyslaw.kitszel, Jakub Kicinski

The XDP program attached to the PF should not be executed
on the fallback traffic. Pass the desired dev to bnxt_rx_xdp()
and abort if the packet is for a representor. bnxt_rx_xdp()
has a lot of arguments already, so presumably adding one
more is okay.

Compile tested only.

Well behaved drivers (nfp) do not execute XDP on fallback
traffic, but perhaps this is a matter of opinion rather than
a hard rule, therefore I'm not considering this a fix.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
v2:
 - pass dev to bnxt_rx_xdp(), and skip just the BPF execution,
   to avoid unintentionally skipping the Tx ring handling
v1: https://lore.kernel.org/20250226211003.2790916-3-kuba@kernel.org
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h |  3 ++-
 drivers/net/ethernet/broadcom/bnxt/bnxt.c     | 11 +++++++----
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c |  8 ++++++--
 3 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h
index 0122782400b8..752b6cf0022c 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h
@@ -17,7 +17,8 @@ struct bnxt_sw_tx_bd *bnxt_xmit_bd(struct bnxt *bp,
 				   dma_addr_t mapping, u32 len,
 				   struct xdp_buff *xdp);
 void bnxt_tx_int_xdp(struct bnxt *bp, struct bnxt_napi *bnapi, int budget);
-bool bnxt_rx_xdp(struct bnxt *bp, struct bnxt_rx_ring_info *rxr, u16 cons,
+bool bnxt_rx_xdp(struct bnxt *bp, struct net_device *dev,
+		 struct bnxt_rx_ring_info *rxr, u16 cons,
 		 struct xdp_buff *xdp, struct page *page, u8 **data_ptr,
 		 unsigned int *len, u8 *event);
 int bnxt_xdp(struct net_device *dev, struct netdev_bpf *xdp);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index f6a26f6f85bb..94bc9121d3f9 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -2036,7 +2036,7 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
 {
 	struct bnxt_napi *bnapi = cpr->bnapi;
 	struct bnxt_rx_ring_info *rxr = bnapi->rx_ring;
-	struct net_device *dev = bp->dev;
+	struct net_device *dev;
 	struct rx_cmp *rxcmp;
 	struct rx_cmp_ext *rxcmp1;
 	u32 tmp_raw_cons = *raw_cons;
@@ -2159,6 +2159,10 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
 	len = flags >> RX_CMP_LEN_SHIFT;
 	dma_addr = rx_buf->mapping;
 
+	dev = bp->dev;
+	if (cmp_type == CMP_TYPE_RX_L2_CMP)
+		dev = bnxt_get_pkt_dev(bp, RX_CMP_CFA_CODE(rxcmp1));
+
 	if (bnxt_xdp_attached(bp, rxr)) {
 		bnxt_xdp_buff_init(bp, rxr, cons, data_ptr, len, &xdp);
 		if (agg_bufs) {
@@ -2172,7 +2176,8 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
 	}
 
 	if (xdp_active) {
-		if (bnxt_rx_xdp(bp, rxr, cons, &xdp, data, &data_ptr, &len, event)) {
+		if (bnxt_rx_xdp(bp, dev, rxr, cons, &xdp, data, &data_ptr, &len,
+				event)) {
 			rc = 1;
 			goto next_rx;
 		}
@@ -2239,8 +2244,6 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
 		skb_set_hash(skb, le32_to_cpu(rxcmp->rx_cmp_rss_hash), type);
 	}
 
-	if (cmp_type == CMP_TYPE_RX_L2_CMP)
-		dev = bnxt_get_pkt_dev(bp, RX_CMP_CFA_CODE(rxcmp1));
 	skb->protocol = eth_type_trans(skb, dev);
 
 	if (skb->dev->features & BNXT_HW_FEATURE_VLAN_ALL_RX) {
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
index e6c64e4bd66c..aba49ddb0e66 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
@@ -221,7 +221,8 @@ void bnxt_xdp_buff_frags_free(struct bnxt_rx_ring_info *rxr,
  * true    - packet consumed by XDP and new buffer is allocated.
  * false   - packet should be passed to the stack.
  */
-bool bnxt_rx_xdp(struct bnxt *bp, struct bnxt_rx_ring_info *rxr, u16 cons,
+bool bnxt_rx_xdp(struct bnxt *bp, struct net_device *dev,
+		 struct bnxt_rx_ring_info *rxr, u16 cons,
 		 struct xdp_buff *xdp, struct page *page, u8 **data_ptr,
 		 unsigned int *len, u8 *event)
 {
@@ -246,7 +247,10 @@ bool bnxt_rx_xdp(struct bnxt *bp, struct bnxt_rx_ring_info *rxr, u16 cons,
 	/* BNXT_RX_PAGE_MODE(bp) when XDP enabled */
 	orig_data = xdp->data;
 
-	act = bpf_prog_run_xdp(xdp_prog, xdp);
+	if (bp->dev == dev)
+		act = bpf_prog_run_xdp(xdp_prog, xdp);
+	else /* packet is for a VF representor */
+		act = XDP_PASS;
 
 	tx_avail = bnxt_tx_avail(bp, txr);
 	/* If the tx ring is not full, we must not update the rx producer yet
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next v3 03/10] eth: bnxt: rename ring_err_stats -> ring_drv_stats
  2025-03-05 22:52 [PATCH net-next v3 00/10] eth: bnxt: maintain basic pkt/byte counters in SW Jakub Kicinski
  2025-03-05 22:52 ` [PATCH net-next v3 01/10] eth: bnxt: use napi_consume_skb() Jakub Kicinski
  2025-03-05 22:52 ` [PATCH net-next v3 02/10] eth: bnxt: don't run xdp programs on fallback traffic Jakub Kicinski
@ 2025-03-05 22:52 ` Jakub Kicinski
  2025-03-05 22:52 ` [PATCH net-next v3 04/10] eth: bnxt: snapshot driver stats Jakub Kicinski
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Jakub Kicinski @ 2025-03-05 22:52 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, michael.chan,
	pavan.chebbi, przemyslaw.kitszel, Jakub Kicinski

We will soon store non-error stats to the ring struct.
Rename them to "drv" stats, as these are all maintained
by the driver (even if partially based on info from descriptors).

Pure rename using sed.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h     |  8 ++++----
 drivers/net/ethernet/broadcom/bnxt/bnxt.c     | 14 ++++++-------
 .../net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 20 +++++++++----------
 3 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index e85b5ce94f58..34f23ddd4d71 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1136,7 +1136,7 @@ struct bnxt_sw_stats {
 	struct bnxt_cmn_sw_stats cmn;
 };
 
-struct bnxt_total_ring_err_stats {
+struct bnxt_total_ring_drv_stats {
 	u64			rx_total_l4_csum_errors;
 	u64			rx_total_resets;
 	u64			rx_total_buf_errors;
@@ -2538,7 +2538,7 @@ struct bnxt {
 	u8			pri2cos_idx[8];
 	u8			pri2cos_valid;
 
-	struct bnxt_total_ring_err_stats ring_err_stats_prev;
+	struct bnxt_total_ring_drv_stats ring_drv_stats_prev;
 
 	u16			hwrm_max_req_len;
 	u16			hwrm_max_ext_req_len;
@@ -2936,8 +2936,8 @@ int bnxt_half_open_nic(struct bnxt *bp);
 void bnxt_half_close_nic(struct bnxt *bp);
 void bnxt_reenable_sriov(struct bnxt *bp);
 void bnxt_close_nic(struct bnxt *, bool, bool);
-void bnxt_get_ring_err_stats(struct bnxt *bp,
-			     struct bnxt_total_ring_err_stats *stats);
+void bnxt_get_ring_drv_stats(struct bnxt *bp,
+			     struct bnxt_total_ring_drv_stats *stats);
 bool bnxt_rfs_capable(struct bnxt *bp, bool new_rss_ctx);
 int bnxt_dbg_hwrm_rd_reg(struct bnxt *bp, u32 reg_off, u16 num_words,
 			 u32 *reg_buf);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 94bc9121d3f9..4b85f224c344 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -12941,7 +12941,7 @@ static void __bnxt_close_nic(struct bnxt *bp, bool irq_re_init,
 	/* Save ring stats before shutdown */
 	if (bp->bnapi && irq_re_init) {
 		bnxt_get_ring_stats(bp, &bp->net_stats_prev);
-		bnxt_get_ring_err_stats(bp, &bp->ring_err_stats_prev);
+		bnxt_get_ring_drv_stats(bp, &bp->ring_drv_stats_prev);
 	}
 	if (irq_re_init) {
 		bnxt_free_irq(bp);
@@ -13191,8 +13191,8 @@ bnxt_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
 	clear_bit(BNXT_STATE_READ_STATS, &bp->state);
 }
 
-static void bnxt_get_one_ring_err_stats(struct bnxt *bp,
-					struct bnxt_total_ring_err_stats *stats,
+static void bnxt_get_one_ring_drv_stats(struct bnxt *bp,
+					struct bnxt_total_ring_drv_stats *stats,
 					struct bnxt_cp_ring_info *cpr)
 {
 	struct bnxt_sw_stats *sw_stats = cpr->sw_stats;
@@ -13211,13 +13211,13 @@ static void bnxt_get_one_ring_err_stats(struct bnxt *bp,
 	stats->total_missed_irqs += sw_stats->cmn.missed_irqs;
 }
 
-void bnxt_get_ring_err_stats(struct bnxt *bp,
-			     struct bnxt_total_ring_err_stats *stats)
+void bnxt_get_ring_drv_stats(struct bnxt *bp,
+			     struct bnxt_total_ring_drv_stats *stats)
 {
 	int i;
 
 	for (i = 0; i < bp->cp_nr_rings; i++)
-		bnxt_get_one_ring_err_stats(bp, stats, &bp->bnapi[i]->cp_ring);
+		bnxt_get_one_ring_drv_stats(bp, stats, &bp->bnapi[i]->cp_ring);
 }
 
 static bool bnxt_mc_list_updated(struct bnxt *bp, u32 *rx_mask)
@@ -15643,7 +15643,7 @@ static void bnxt_get_base_stats(struct net_device *dev,
 
 	rx->packets = bp->net_stats_prev.rx_packets;
 	rx->bytes = bp->net_stats_prev.rx_bytes;
-	rx->alloc_fail = bp->ring_err_stats_prev.rx_total_oom_discards;
+	rx->alloc_fail = bp->ring_drv_stats_prev.rx_total_oom_discards;
 
 	tx->packets = bp->net_stats_prev.tx_packets;
 	tx->bytes = bp->net_stats_prev.tx_bytes;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index 9c5820839514..023a0c2d52fd 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -340,7 +340,7 @@ enum {
 	RX_NETPOLL_DISCARDS,
 };
 
-static const char *const bnxt_ring_err_stats_arr[] = {
+static const char *const bnxt_ring_drv_stats_arr[] = {
 	"rx_total_l4_csum_errors",
 	"rx_total_resets",
 	"rx_total_buf_errors",
@@ -500,7 +500,7 @@ static const struct {
 	BNXT_TX_STATS_PRI_ENTRIES(tx_packets),
 };
 
-#define BNXT_NUM_RING_ERR_STATS	ARRAY_SIZE(bnxt_ring_err_stats_arr)
+#define BNXT_NUM_RING_DRV_STATS	ARRAY_SIZE(bnxt_ring_drv_stats_arr)
 #define BNXT_NUM_PORT_STATS ARRAY_SIZE(bnxt_port_stats_arr)
 #define BNXT_NUM_STATS_PRI			\
 	(ARRAY_SIZE(bnxt_rx_bytes_pri_arr) +	\
@@ -539,7 +539,7 @@ static int bnxt_get_num_stats(struct bnxt *bp)
 	int num_stats = bnxt_get_num_ring_stats(bp);
 	int len;
 
-	num_stats += BNXT_NUM_RING_ERR_STATS;
+	num_stats += BNXT_NUM_RING_DRV_STATS;
 
 	if (bp->flags & BNXT_FLAG_PORT_STATS)
 		num_stats += BNXT_NUM_PORT_STATS;
@@ -594,7 +594,7 @@ static bool is_tx_ring(struct bnxt *bp, int ring_num)
 static void bnxt_get_ethtool_stats(struct net_device *dev,
 				   struct ethtool_stats *stats, u64 *buf)
 {
-	struct bnxt_total_ring_err_stats ring_err_stats = {0};
+	struct bnxt_total_ring_drv_stats ring_drv_stats = {0};
 	struct bnxt *bp = netdev_priv(dev);
 	u64 *curr, *prev;
 	u32 tpa_stats;
@@ -643,12 +643,12 @@ static void bnxt_get_ethtool_stats(struct net_device *dev,
 			buf[j] = sw[k];
 	}
 
-	bnxt_get_ring_err_stats(bp, &ring_err_stats);
+	bnxt_get_ring_drv_stats(bp, &ring_drv_stats);
 
 skip_ring_stats:
-	curr = &ring_err_stats.rx_total_l4_csum_errors;
-	prev = &bp->ring_err_stats_prev.rx_total_l4_csum_errors;
-	for (i = 0; i < BNXT_NUM_RING_ERR_STATS; i++, j++, curr++, prev++)
+	curr = &ring_drv_stats.rx_total_l4_csum_errors;
+	prev = &bp->ring_drv_stats_prev.rx_total_l4_csum_errors;
+	for (i = 0; i < BNXT_NUM_RING_DRV_STATS; i++, j++, curr++, prev++)
 		buf[j] = *curr + *prev;
 
 	if (bp->flags & BNXT_FLAG_PORT_STATS) {
@@ -752,8 +752,8 @@ static void bnxt_get_strings(struct net_device *dev, u32 stringset, u8 *buf)
 				ethtool_sprintf(&buf, "[%d]: %s", i, str);
 			}
 		}
-		for (i = 0; i < BNXT_NUM_RING_ERR_STATS; i++)
-			ethtool_puts(&buf, bnxt_ring_err_stats_arr[i]);
+		for (i = 0; i < BNXT_NUM_RING_DRV_STATS; i++)
+			ethtool_puts(&buf, bnxt_ring_drv_stats_arr[i]);
 
 		if (bp->flags & BNXT_FLAG_PORT_STATS)
 			for (i = 0; i < BNXT_NUM_PORT_STATS; i++) {
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next v3 04/10] eth: bnxt: snapshot driver stats
  2025-03-05 22:52 [PATCH net-next v3 00/10] eth: bnxt: maintain basic pkt/byte counters in SW Jakub Kicinski
                   ` (2 preceding siblings ...)
  2025-03-05 22:52 ` [PATCH net-next v3 03/10] eth: bnxt: rename ring_err_stats -> ring_drv_stats Jakub Kicinski
@ 2025-03-05 22:52 ` Jakub Kicinski
  2025-03-05 22:52 ` [PATCH net-next v3 05/10] eth: bnxt: don't use ifdef to check for CONFIG_INET in GRO Jakub Kicinski
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Jakub Kicinski @ 2025-03-05 22:52 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, michael.chan,
	pavan.chebbi, przemyslaw.kitszel, Jakub Kicinski

Subsequent commits will add datapath stats which need u64_stats
protection. Make current readers work on a snapshot, so it's
easier to extend this code without much duplication.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 31 +++++++++++++++--------
 1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 4b85f224c344..854e7ec5390b 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -13093,6 +13093,12 @@ static int bnxt_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 	return -EOPNOTSUPP;
 }
 
+static void bnxt_drv_stat_snapshot(const struct bnxt_sw_stats *sw_stats,
+				   struct bnxt_sw_stats *snapshot)
+{
+	memcpy(snapshot, sw_stats, sizeof(*snapshot));
+}
+
 static void bnxt_get_ring_stats(struct bnxt *bp,
 				struct rtnl_link_stats64 *stats)
 {
@@ -13101,8 +13107,11 @@ static void bnxt_get_ring_stats(struct bnxt *bp,
 	for (i = 0; i < bp->cp_nr_rings; i++) {
 		struct bnxt_napi *bnapi = bp->bnapi[i];
 		struct bnxt_cp_ring_info *cpr = &bnapi->cp_ring;
+		struct bnxt_sw_stats sw_stats;
 		u64 *sw = cpr->stats.sw_stats;
 
+		bnxt_drv_stat_snapshot(cpr->sw_stats, &sw_stats);
+
 		stats->rx_packets += BNXT_GET_RING_STATS64(sw, rx_ucast_pkts);
 		stats->rx_packets += BNXT_GET_RING_STATS64(sw, rx_mcast_pkts);
 		stats->rx_packets += BNXT_GET_RING_STATS64(sw, rx_bcast_pkts);
@@ -13127,8 +13136,8 @@ static void bnxt_get_ring_stats(struct bnxt *bp,
 		stats->tx_dropped += BNXT_GET_RING_STATS64(sw, tx_error_pkts);
 
 		stats->rx_dropped +=
-			cpr->sw_stats->rx.rx_netpoll_discards +
-			cpr->sw_stats->rx.rx_oom_discards;
+			sw_stats.rx.rx_netpoll_discards +
+			sw_stats.rx.rx_oom_discards;
 	}
 }
 
@@ -13195,20 +13204,22 @@ static void bnxt_get_one_ring_drv_stats(struct bnxt *bp,
 					struct bnxt_total_ring_drv_stats *stats,
 					struct bnxt_cp_ring_info *cpr)
 {
-	struct bnxt_sw_stats *sw_stats = cpr->sw_stats;
 	u64 *hw_stats = cpr->stats.sw_stats;
+	struct bnxt_sw_stats sw_stats;
 
-	stats->rx_total_l4_csum_errors += sw_stats->rx.rx_l4_csum_errors;
-	stats->rx_total_resets += sw_stats->rx.rx_resets;
-	stats->rx_total_buf_errors += sw_stats->rx.rx_buf_errors;
-	stats->rx_total_oom_discards += sw_stats->rx.rx_oom_discards;
-	stats->rx_total_netpoll_discards += sw_stats->rx.rx_netpoll_discards;
+	bnxt_drv_stat_snapshot(cpr->sw_stats, &sw_stats);
+
+	stats->rx_total_l4_csum_errors += sw_stats.rx.rx_l4_csum_errors;
+	stats->rx_total_resets += sw_stats.rx.rx_resets;
+	stats->rx_total_buf_errors += sw_stats.rx.rx_buf_errors;
+	stats->rx_total_oom_discards += sw_stats.rx.rx_oom_discards;
+	stats->rx_total_netpoll_discards += sw_stats.rx.rx_netpoll_discards;
 	stats->rx_total_ring_discards +=
 		BNXT_GET_RING_STATS64(hw_stats, rx_discard_pkts);
-	stats->tx_total_resets += sw_stats->tx.tx_resets;
+	stats->tx_total_resets += sw_stats.tx.tx_resets;
 	stats->tx_total_ring_discards +=
 		BNXT_GET_RING_STATS64(hw_stats, tx_discard_pkts);
-	stats->total_missed_irqs += sw_stats->cmn.missed_irqs;
+	stats->total_missed_irqs += sw_stats.cmn.missed_irqs;
 }
 
 void bnxt_get_ring_drv_stats(struct bnxt *bp,
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next v3 05/10] eth: bnxt: don't use ifdef to check for CONFIG_INET in GRO
  2025-03-05 22:52 [PATCH net-next v3 00/10] eth: bnxt: maintain basic pkt/byte counters in SW Jakub Kicinski
                   ` (3 preceding siblings ...)
  2025-03-05 22:52 ` [PATCH net-next v3 04/10] eth: bnxt: snapshot driver stats Jakub Kicinski
@ 2025-03-05 22:52 ` Jakub Kicinski
  2025-03-05 22:52 ` [PATCH net-next v3 06/10] eth: bnxt: consolidate the GRO-but-not-really paths in bnxt_gro_skb() Jakub Kicinski
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Jakub Kicinski @ 2025-03-05 22:52 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, michael.chan,
	pavan.chebbi, przemyslaw.kitszel, Jakub Kicinski

Use IS_ENABLED(CONFIG_INET) to make the code easier to refactor.
Now all packets which did not go thru GRO will exit in the same
branch.

Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 854e7ec5390b..d8a24a8bcfe8 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -1740,12 +1740,11 @@ static inline struct sk_buff *bnxt_gro_skb(struct bnxt *bp,
 					   struct rx_tpa_end_cmp_ext *tpa_end1,
 					   struct sk_buff *skb)
 {
-#ifdef CONFIG_INET
 	int payload_off;
 	u16 segs;
 
 	segs = TPA_END_TPA_SEGS(tpa_end);
-	if (segs == 1)
+	if (segs == 1 || !IS_ENABLED(CONFIG_INET))
 		return skb;
 
 	NAPI_GRO_CB(skb)->count = segs;
@@ -1759,7 +1758,6 @@ static inline struct sk_buff *bnxt_gro_skb(struct bnxt *bp,
 	skb = bp->gro_func(tpa_info, payload_off, TPA_END_GRO_TS(tpa_end), skb);
 	if (likely(skb))
 		tcp_gro_complete(skb);
-#endif
 	return skb;
 }
 
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next v3 06/10] eth: bnxt: consolidate the GRO-but-not-really paths in bnxt_gro_skb()
  2025-03-05 22:52 [PATCH net-next v3 00/10] eth: bnxt: maintain basic pkt/byte counters in SW Jakub Kicinski
                   ` (4 preceding siblings ...)
  2025-03-05 22:52 ` [PATCH net-next v3 05/10] eth: bnxt: don't use ifdef to check for CONFIG_INET in GRO Jakub Kicinski
@ 2025-03-05 22:52 ` Jakub Kicinski
  2025-03-05 22:52 ` [PATCH net-next v3 07/10] eth: bnxt: extract VLAN info early on Jakub Kicinski
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Jakub Kicinski @ 2025-03-05 22:52 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, michael.chan,
	pavan.chebbi, przemyslaw.kitszel, Jakub Kicinski

bnxt_tpa_end() skips calling bnxt_gro_skb() if it determines that GRO
should not be performed. For ease of packet counting pass the gro bool
into bnxt_gro_skb(), this way we have a single point thru which all
packets coming out of bnxt_tpa_end() should pass.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
v3:
 - flatten it harder, so we can reuse on normal Rx
v2: https://lore.kernel.org/20250228012534.3460918-7-kuba@kernel.org
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index d8a24a8bcfe8..dba4779f0925 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -1735,6 +1735,7 @@ static struct sk_buff *bnxt_gro_func_5730x(struct bnxt_tpa_info *tpa_info,
 }
 
 static inline struct sk_buff *bnxt_gro_skb(struct bnxt *bp,
+					   bool gro,
 					   struct bnxt_tpa_info *tpa_info,
 					   struct rx_tpa_end_cmp *tpa_end,
 					   struct rx_tpa_end_cmp_ext *tpa_end1,
@@ -1743,9 +1744,9 @@ static inline struct sk_buff *bnxt_gro_skb(struct bnxt *bp,
 	int payload_off;
 	u16 segs;
 
-	segs = TPA_END_TPA_SEGS(tpa_end);
+	segs = gro ? TPA_END_TPA_SEGS(tpa_end) : 1;
 	if (segs == 1 || !IS_ENABLED(CONFIG_INET))
-		return skb;
+		goto non_gro;
 
 	NAPI_GRO_CB(skb)->count = segs;
 	skb_shinfo(skb)->gso_size =
@@ -1756,8 +1757,12 @@ static inline struct sk_buff *bnxt_gro_skb(struct bnxt *bp,
 	else
 		payload_off = TPA_END_PAYLOAD_OFF(tpa_end);
 	skb = bp->gro_func(tpa_info, payload_off, TPA_END_GRO_TS(tpa_end), skb);
-	if (likely(skb))
-		tcp_gro_complete(skb);
+	if (!skb)
+		goto non_gro;
+
+	tcp_gro_complete(skb);
+
+non_gro: /* note: skb may be null! */
 	return skb;
 }
 
@@ -1917,10 +1922,7 @@ static inline struct sk_buff *bnxt_tpa_end(struct bnxt *bp,
 			(tpa_info->flags2 & RX_CMP_FLAGS2_T_L4_CS_CALC) >> 3;
 	}
 
-	if (gro)
-		skb = bnxt_gro_skb(bp, tpa_info, tpa_end, tpa_end1, skb);
-
-	return skb;
+	return bnxt_gro_skb(bp, gro, tpa_info, tpa_end, tpa_end1, skb);
 }
 
 static void bnxt_tpa_agg(struct bnxt *bp, struct bnxt_rx_ring_info *rxr,
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next v3 07/10] eth: bnxt: extract VLAN info early on
  2025-03-05 22:52 [PATCH net-next v3 00/10] eth: bnxt: maintain basic pkt/byte counters in SW Jakub Kicinski
                   ` (5 preceding siblings ...)
  2025-03-05 22:52 ` [PATCH net-next v3 06/10] eth: bnxt: consolidate the GRO-but-not-really paths in bnxt_gro_skb() Jakub Kicinski
@ 2025-03-05 22:52 ` Jakub Kicinski
  2025-03-05 22:52 ` [PATCH net-next v3 08/10] eth: bnxt: maintain rx pkt/byte stats in SW Jakub Kicinski
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Jakub Kicinski @ 2025-03-05 22:52 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, michael.chan,
	pavan.chebbi, przemyslaw.kitszel, Jakub Kicinski

Michael would like the SW stats to include VLAN bytes, perhaps
uniquely among ethernet drivers. To do this we need to extract
the VLAN info before we call XDP, so before skb is allocated.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 42 ++++++++++-------------
 1 file changed, 18 insertions(+), 24 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index dba4779f0925..b0a9e3c6b377 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -1966,45 +1966,36 @@ static bool bnxt_rx_ts_valid(struct bnxt *bp, u32 flags,
 	return true;
 }
 
-static struct sk_buff *bnxt_rx_vlan(struct sk_buff *skb, u8 cmp_type,
-				    struct rx_cmp *rxcmp,
-				    struct rx_cmp_ext *rxcmp1)
+static u32
+bnxt_rx_vlan(u8 cmp_type, struct rx_cmp *rxcmp, struct rx_cmp_ext *rxcmp1)
 {
-	__be16 vlan_proto;
-	u16 vtag;
+	u16 vlan_proto = 0, vtag = 0;
 
 	if (cmp_type == CMP_TYPE_RX_L2_CMP) {
 		__le32 flags2 = rxcmp1->rx_cmp_flags2;
 		u32 meta_data;
 
 		if (!(flags2 & cpu_to_le32(RX_CMP_FLAGS2_META_FORMAT_VLAN)))
-			return skb;
+			return 0;
 
 		meta_data = le32_to_cpu(rxcmp1->rx_cmp_meta_data);
 		vtag = meta_data & RX_CMP_FLAGS2_METADATA_TCI_MASK;
-		vlan_proto = htons(meta_data >> RX_CMP_FLAGS2_METADATA_TPID_SFT);
-		if (eth_type_vlan(vlan_proto))
-			__vlan_hwaccel_put_tag(skb, vlan_proto, vtag);
-		else
-			goto vlan_err;
+		vlan_proto = meta_data >> RX_CMP_FLAGS2_METADATA_TPID_SFT;
 	} else if (cmp_type == CMP_TYPE_RX_L2_V3_CMP) {
 		if (RX_CMP_VLAN_VALID(rxcmp)) {
 			u32 tpid_sel = RX_CMP_VLAN_TPID_SEL(rxcmp);
 
 			if (tpid_sel == RX_CMP_METADATA1_TPID_8021Q)
-				vlan_proto = htons(ETH_P_8021Q);
+				vlan_proto = ETH_P_8021Q;
 			else if (tpid_sel == RX_CMP_METADATA1_TPID_8021AD)
-				vlan_proto = htons(ETH_P_8021AD);
+				vlan_proto = ETH_P_8021AD;
 			else
-				goto vlan_err;
+				vlan_proto = 0xffff;
 			vtag = RX_CMP_METADATA0_TCI(rxcmp1);
-			__vlan_hwaccel_put_tag(skb, vlan_proto, vtag);
 		}
 	}
-	return skb;
-vlan_err:
-	dev_kfree_skb(skb);
-	return NULL;
+
+	return (u32)vlan_proto << 16 | vtag;
 }
 
 static enum pkt_hash_types bnxt_rss_ext_op(struct bnxt *bp,
@@ -2049,6 +2040,7 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
 	struct sk_buff *skb;
 	struct xdp_buff xdp;
 	u32 flags, misc;
+	u32 vlan_info;
 	u32 cmpl_ts;
 	void *data;
 	int rc = 0;
@@ -2163,6 +2155,10 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
 	if (cmp_type == CMP_TYPE_RX_L2_CMP)
 		dev = bnxt_get_pkt_dev(bp, RX_CMP_CFA_CODE(rxcmp1));
 
+	vlan_info = bnxt_rx_vlan(cmp_type, rxcmp, rxcmp1);
+	if (vlan_info && !eth_type_vlan(htons(vlan_info >> 16)))
+		goto next_rx;
+
 	if (bnxt_xdp_attached(bp, rxr)) {
 		bnxt_xdp_buff_init(bp, rxr, cons, data_ptr, len, &xdp);
 		if (agg_bufs) {
@@ -2246,11 +2242,9 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
 
 	skb->protocol = eth_type_trans(skb, dev);
 
-	if (skb->dev->features & BNXT_HW_FEATURE_VLAN_ALL_RX) {
-		skb = bnxt_rx_vlan(skb, cmp_type, rxcmp, rxcmp1);
-		if (!skb)
-			goto next_rx;
-	}
+	if (vlan_info && skb->dev->features & BNXT_HW_FEATURE_VLAN_ALL_RX)
+		__vlan_hwaccel_put_tag(skb, htons(vlan_info >> 16),
+				       vlan_info & 0xffff);
 
 	skb_checksum_none_assert(skb);
 	if (RX_CMP_L4_CS_OK(rxcmp1)) {
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next v3 08/10] eth: bnxt: maintain rx pkt/byte stats in SW
  2025-03-05 22:52 [PATCH net-next v3 00/10] eth: bnxt: maintain basic pkt/byte counters in SW Jakub Kicinski
                   ` (6 preceding siblings ...)
  2025-03-05 22:52 ` [PATCH net-next v3 07/10] eth: bnxt: extract VLAN info early on Jakub Kicinski
@ 2025-03-05 22:52 ` Jakub Kicinski
  2025-03-05 22:52 ` [PATCH net-next v3 09/10] eth: bnxt: maintain tx " Jakub Kicinski
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Jakub Kicinski @ 2025-03-05 22:52 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, michael.chan,
	pavan.chebbi, przemyslaw.kitszel, Jakub Kicinski

Some workloads want to be able to track bandwidth utilization on
the scale of 10s of msecs. Updating all HW stats at this rate is
both hard and wasteful of PCIe bandwidth.

Maintain basic Rx pkt/byte counters in software. ethtool -S will still
show the HW stats, but qstats and rtnl stats will show SW statistics.

We need to take care of HW-GRO, XDP and VF representors. Per netdev
qstat definition Rx stats should reflect packets passed to XDP (if
active, otherwise to the stack). XDP and GRO do not interoperate
in bnxt, so we need to count the packets in a few places.
Add a helper and call it where needed.

Do not count VF representor traffic as traffic for the main netdev.

The stats are added towards the end of the struct since ethtool
code expects existing members to be first.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
v3:
 - count vlan tags
 - pass all non-XDP packets thru bnxt_gro_skb()
v2: https://lore.kernel.org/20250228012534.3460918-8-kuba@kernel.org
 - adjust to skipping XDP inside bnxt_rx_pkt()
v1: https://lore.kernel.org/20250226211003.2790916-8-kuba@kernel.org
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h     | 25 +++++++
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h |  2 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt.c     | 68 +++++++++++--------
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c | 18 ++++-
 4 files changed, 82 insertions(+), 31 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 34f23ddd4d71..376141a41ecf 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1118,8 +1118,12 @@ struct bnxt_rx_sw_stats {
 	u64			rx_l4_csum_errors;
 	u64			rx_resets;
 	u64			rx_buf_errors;
+	/* non-ethtool stats follow */
 	u64			rx_oom_discards;
 	u64			rx_netpoll_discards;
+	u64			rx_packets;
+	u64			rx_bytes;
+	struct u64_stats_sync	syncp;
 };
 
 struct bnxt_tx_sw_stats {
@@ -1146,6 +1150,9 @@ struct bnxt_total_ring_drv_stats {
 	u64			tx_total_resets;
 	u64			tx_total_ring_discards;
 	u64			total_missed_irqs;
+	/* non-ethtool stats follow */
+	u64			rx_total_packets;
+	u64			rx_total_bytes;
 };
 
 struct bnxt_stats_mem {
@@ -2797,6 +2804,24 @@ static inline u32 bnxt_tx_avail(struct bnxt *bp,
 	return bp->tx_ring_size - (used & bp->tx_ring_mask);
 }
 
+static inline void bnxt_rx_pkt_cnt(struct bnxt *bp, struct bnxt_napi *bnapi,
+				   struct net_device *dev,
+				   int segs, int len, int payload_off)
+{
+	struct bnxt_sw_stats *sw_stats = bnapi->cp_ring.sw_stats;
+
+	/* Packet is for a representor */
+	if (bp->dev != dev)
+		return;
+
+	u64_stats_update_begin(&sw_stats->rx.syncp);
+	sw_stats->rx.rx_packets	+= segs;
+	sw_stats->rx.rx_bytes	+= len;
+	if (segs > 1)
+		sw_stats->rx.rx_bytes += (segs - 1) * payload_off;
+	u64_stats_update_end(&sw_stats->rx.syncp);
+}
+
 static inline void bnxt_writeq(struct bnxt *bp, u64 val,
 			       volatile void __iomem *addr)
 {
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h
index 752b6cf0022c..0461f6783d95 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h
@@ -20,7 +20,7 @@ void bnxt_tx_int_xdp(struct bnxt *bp, struct bnxt_napi *bnapi, int budget);
 bool bnxt_rx_xdp(struct bnxt *bp, struct net_device *dev,
 		 struct bnxt_rx_ring_info *rxr, u16 cons,
 		 struct xdp_buff *xdp, struct page *page, u8 **data_ptr,
-		 unsigned int *len, u8 *event);
+		 unsigned int *len, u8 *event, u32 vlan_info);
 int bnxt_xdp(struct net_device *dev, struct netdev_bpf *xdp);
 int bnxt_xdp_xmit(struct net_device *dev, int num_frames,
 		  struct xdp_frame **frames, u32 flags);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index b0a9e3c6b377..bdc052b247ba 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -1735,15 +1735,23 @@ static struct sk_buff *bnxt_gro_func_5730x(struct bnxt_tpa_info *tpa_info,
 }
 
 static inline struct sk_buff *bnxt_gro_skb(struct bnxt *bp,
+					   struct bnxt_napi *bnapi,
 					   bool gro,
 					   struct bnxt_tpa_info *tpa_info,
 					   struct rx_tpa_end_cmp *tpa_end,
 					   struct rx_tpa_end_cmp_ext *tpa_end1,
 					   struct sk_buff *skb)
 {
+	struct net_device *dev = skb->dev;
 	int payload_off;
+	int full_len;
 	u16 segs;
 
+	full_len = skb->len - skb_mac_offset(skb);
+	if (skb_vlan_tag_present(skb))
+		full_len += 4;
+
+	payload_off = 0;
 	segs = gro ? TPA_END_TPA_SEGS(tpa_end) : 1;
 	if (segs == 1 || !IS_ENABLED(CONFIG_INET))
 		goto non_gro;
@@ -1763,6 +1771,7 @@ static inline struct sk_buff *bnxt_gro_skb(struct bnxt *bp,
 	tcp_gro_complete(skb);
 
 non_gro: /* note: skb may be null! */
+	bnxt_rx_pkt_cnt(bp, bnapi, dev, segs, full_len, payload_off);
 	return skb;
 }
 
@@ -1922,7 +1931,7 @@ static inline struct sk_buff *bnxt_tpa_end(struct bnxt *bp,
 			(tpa_info->flags2 & RX_CMP_FLAGS2_T_L4_CS_CALC) >> 3;
 	}
 
-	return bnxt_gro_skb(bp, gro, tpa_info, tpa_end, tpa_end1, skb);
+	return bnxt_gro_skb(bp, bnapi, gro, tpa_info, tpa_end, tpa_end1, skb);
 }
 
 static void bnxt_tpa_agg(struct bnxt *bp, struct bnxt_rx_ring_info *rxr,
@@ -2173,7 +2182,7 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
 
 	if (xdp_active) {
 		if (bnxt_rx_xdp(bp, dev, rxr, cons, &xdp, data, &data_ptr, &len,
-				event)) {
+				event, vlan_info)) {
 			rc = 1;
 			goto next_rx;
 		}
@@ -2273,6 +2282,11 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
 			}
 		}
 	}
+
+	if (!xdp_active)
+		/* For accounting only, XDP packets counted in bnxt_rx_xdp() */
+		bnxt_gro_skb(bp, bnapi, false, NULL, NULL, NULL, skb);
+
 	bnxt_deliver_skb(bp, bnapi, skb);
 	rc = 1;
 
@@ -5113,6 +5127,8 @@ static int bnxt_alloc_stats(struct bnxt *bp)
 		if (!cpr->sw_stats)
 			return -ENOMEM;
 
+		u64_stats_init(&cpr->sw_stats->rx.syncp);
+
 		cpr->stats.len = size;
 		rc = bnxt_alloc_stats_mem(bp, &cpr->stats, !i);
 		if (rc)
@@ -13090,7 +13106,14 @@ static int bnxt_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 static void bnxt_drv_stat_snapshot(const struct bnxt_sw_stats *sw_stats,
 				   struct bnxt_sw_stats *snapshot)
 {
-	memcpy(snapshot, sw_stats, sizeof(*snapshot));
+	unsigned int seq_rx;
+
+	do {
+		seq_rx = u64_stats_fetch_begin(&sw_stats->rx.syncp);
+
+		memcpy(snapshot, sw_stats, sizeof(*snapshot));
+
+	} while (u64_stats_fetch_retry(&sw_stats->rx.syncp, seq_rx));
 }
 
 static void bnxt_get_ring_stats(struct bnxt *bp,
@@ -13106,18 +13129,13 @@ static void bnxt_get_ring_stats(struct bnxt *bp,
 
 		bnxt_drv_stat_snapshot(cpr->sw_stats, &sw_stats);
 
-		stats->rx_packets += BNXT_GET_RING_STATS64(sw, rx_ucast_pkts);
-		stats->rx_packets += BNXT_GET_RING_STATS64(sw, rx_mcast_pkts);
-		stats->rx_packets += BNXT_GET_RING_STATS64(sw, rx_bcast_pkts);
+		stats->rx_packets += sw_stats.rx.rx_packets;
+		stats->rx_bytes += sw_stats.rx.rx_bytes;
 
 		stats->tx_packets += BNXT_GET_RING_STATS64(sw, tx_ucast_pkts);
 		stats->tx_packets += BNXT_GET_RING_STATS64(sw, tx_mcast_pkts);
 		stats->tx_packets += BNXT_GET_RING_STATS64(sw, tx_bcast_pkts);
 
-		stats->rx_bytes += BNXT_GET_RING_STATS64(sw, rx_ucast_bytes);
-		stats->rx_bytes += BNXT_GET_RING_STATS64(sw, rx_mcast_bytes);
-		stats->rx_bytes += BNXT_GET_RING_STATS64(sw, rx_bcast_bytes);
-
 		stats->tx_bytes += BNXT_GET_RING_STATS64(sw, tx_ucast_bytes);
 		stats->tx_bytes += BNXT_GET_RING_STATS64(sw, tx_mcast_bytes);
 		stats->tx_bytes += BNXT_GET_RING_STATS64(sw, tx_bcast_bytes);
@@ -13208,6 +13226,8 @@ static void bnxt_get_one_ring_drv_stats(struct bnxt *bp,
 	stats->rx_total_buf_errors += sw_stats.rx.rx_buf_errors;
 	stats->rx_total_oom_discards += sw_stats.rx.rx_oom_discards;
 	stats->rx_total_netpoll_discards += sw_stats.rx.rx_netpoll_discards;
+	stats->rx_total_packets += sw_stats.rx.rx_packets;
+	stats->rx_total_bytes += sw_stats.rx.rx_bytes;
 	stats->rx_total_ring_discards +=
 		BNXT_GET_RING_STATS64(hw_stats, rx_discard_pkts);
 	stats->tx_total_resets += sw_stats.tx.tx_resets;
@@ -15600,23 +15620,17 @@ static void bnxt_get_queue_stats_rx(struct net_device *dev, int i,
 				    struct netdev_queue_stats_rx *stats)
 {
 	struct bnxt *bp = netdev_priv(dev);
-	struct bnxt_cp_ring_info *cpr;
-	u64 *sw;
+	struct bnxt_sw_stats *sw_stats;
+	unsigned int seq;
 
-	cpr = &bp->bnapi[i]->cp_ring;
-	sw = cpr->stats.sw_stats;
+	sw_stats = bp->bnapi[i]->cp_ring.sw_stats;
 
-	stats->packets = 0;
-	stats->packets += BNXT_GET_RING_STATS64(sw, rx_ucast_pkts);
-	stats->packets += BNXT_GET_RING_STATS64(sw, rx_mcast_pkts);
-	stats->packets += BNXT_GET_RING_STATS64(sw, rx_bcast_pkts);
-
-	stats->bytes = 0;
-	stats->bytes += BNXT_GET_RING_STATS64(sw, rx_ucast_bytes);
-	stats->bytes += BNXT_GET_RING_STATS64(sw, rx_mcast_bytes);
-	stats->bytes += BNXT_GET_RING_STATS64(sw, rx_bcast_bytes);
-
-	stats->alloc_fail = cpr->sw_stats->rx.rx_oom_discards;
+	do {
+		seq = u64_stats_fetch_begin(&sw_stats->rx.syncp);
+		stats->packets = sw_stats->rx.rx_packets;
+		stats->bytes = sw_stats->rx.rx_bytes;
+		stats->alloc_fail = sw_stats->rx.rx_oom_discards;
+	} while (u64_stats_fetch_retry(&sw_stats->rx.syncp, seq));
 }
 
 static void bnxt_get_queue_stats_tx(struct net_device *dev, int i,
@@ -15646,8 +15660,8 @@ static void bnxt_get_base_stats(struct net_device *dev,
 {
 	struct bnxt *bp = netdev_priv(dev);
 
-	rx->packets = bp->net_stats_prev.rx_packets;
-	rx->bytes = bp->net_stats_prev.rx_bytes;
+	rx->packets = bp->ring_drv_stats_prev.rx_total_packets;
+	rx->bytes = bp->ring_drv_stats_prev.rx_total_bytes;
 	rx->alloc_fail = bp->ring_drv_stats_prev.rx_total_oom_discards;
 
 	tx->packets = bp->net_stats_prev.tx_packets;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
index aba49ddb0e66..d13c8e06d299 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
@@ -224,7 +224,7 @@ void bnxt_xdp_buff_frags_free(struct bnxt_rx_ring_info *rxr,
 bool bnxt_rx_xdp(struct bnxt *bp, struct net_device *dev,
 		 struct bnxt_rx_ring_info *rxr, u16 cons,
 		 struct xdp_buff *xdp, struct page *page, u8 **data_ptr,
-		 unsigned int *len, u8 *event)
+		 unsigned int *len, u8 *event, u32 vlan_info)
 {
 	struct bpf_prog *xdp_prog = READ_ONCE(rxr->xdp_prog);
 	struct bnxt_tx_ring_info *txr;
@@ -247,10 +247,22 @@ bool bnxt_rx_xdp(struct bnxt *bp, struct net_device *dev,
 	/* BNXT_RX_PAGE_MODE(bp) when XDP enabled */
 	orig_data = xdp->data;
 
-	if (bp->dev == dev)
+	if (bp->dev == dev) {
+		struct skb_shared_info *sinfo;
+		int stat_len = *len;
+
+		if (vlan_info)
+			stat_len += 4;
+		if (unlikely(xdp_buff_has_frags(xdp))) {
+			sinfo = xdp_get_shared_info_from_buff(xdp);
+			stat_len += sinfo->xdp_frags_size;
+		}
+
+		bnxt_rx_pkt_cnt(bp, rxr->bnapi, dev, 1, stat_len, 0);
 		act = bpf_prog_run_xdp(xdp_prog, xdp);
-	else /* packet is for a VF representor */
+	} else { /* packet is for a VF representor */
 		act = XDP_PASS;
+	}
 
 	tx_avail = bnxt_tx_avail(bp, txr);
 	/* If the tx ring is not full, we must not update the rx producer yet
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next v3 09/10] eth: bnxt: maintain tx pkt/byte stats in SW
  2025-03-05 22:52 [PATCH net-next v3 00/10] eth: bnxt: maintain basic pkt/byte counters in SW Jakub Kicinski
                   ` (7 preceding siblings ...)
  2025-03-05 22:52 ` [PATCH net-next v3 08/10] eth: bnxt: maintain rx pkt/byte stats in SW Jakub Kicinski
@ 2025-03-05 22:52 ` Jakub Kicinski
  2025-03-09  8:25   ` Michael Chan
  2025-03-05 22:52 ` [PATCH net-next v3 10/10] eth: bnxt: count xdp xmit packets Jakub Kicinski
  2025-03-06 10:54 ` [PATCH net-next v3 00/10] eth: bnxt: maintain basic pkt/byte counters in SW Taehee Yoo
  10 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2025-03-05 22:52 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, michael.chan,
	pavan.chebbi, przemyslaw.kitszel, Jakub Kicinski

Some workloads want to be able to track bandwidth utilization on
the scale of 10s of msecs. Updating all HW stats at this rate is
both hard and wasteful of PCIe bandwidth.

Maintain basic Tx pkt/byte counters in software. ethtool -S will still
show the HW stats, but qstats and rtnl stats will show SW statistics.

We need to take care of TSO and VF representors, record relevant
state in tx_buf to avoid touching potentially cold skb. Use existing
holes in the struct (no size change). Note that according to TX_BD_HSIZE
max header size is 255, so u8 should be enough.

It's not obvious whether VF representor traffic should be counted,
on one hand it doesn't belong to the local interface. On the other
it does go thru normal queuing so Qdisc will show it. I opted to
_not_ count it.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
v3:
 - count padding and vlans
v2: https://lore.kernel.org/20250228012534.3460918-9-kuba@kernel.org
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h | 13 +++-
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 78 +++++++++++++++--------
 2 files changed, 64 insertions(+), 27 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 376141a41ecf..37d7f08a73c3 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -771,6 +771,7 @@ struct nqe_cn {
 	 SKB_DATA_ALIGN((unsigned int)sizeof(struct skb_shared_info)))
 
 #define BNXT_MIN_PKT_SIZE	52
+#define BNXT_MIN_ETH_SIZE	60
 
 #define BNXT_DEFAULT_RX_RING_SIZE	511
 #define BNXT_DEFAULT_TX_RING_SIZE	511
@@ -877,11 +878,15 @@ struct bnxt_sw_tx_bd {
 		struct sk_buff		*skb;
 		struct xdp_frame	*xdpf;
 	};
+	struct page		*page;
 	DEFINE_DMA_UNMAP_ADDR(mapping);
 	DEFINE_DMA_UNMAP_LEN(len);
-	struct page		*page;
+	u16			extra_segs;
+	u8			extra_bytes;
+	u8			hdr_size;
 	u8			is_ts_pkt;
 	u8			is_push;
+	u8			is_vfr;
 	u8			action;
 	unsigned short		nr_frags;
 	union {
@@ -1128,6 +1133,10 @@ struct bnxt_rx_sw_stats {
 
 struct bnxt_tx_sw_stats {
 	u64			tx_resets;
+	/* non-ethtool stats follow */
+	u64			tx_packets;
+	u64			tx_bytes;
+	struct u64_stats_sync	syncp;
 };
 
 struct bnxt_cmn_sw_stats {
@@ -1153,6 +1162,8 @@ struct bnxt_total_ring_drv_stats {
 	/* non-ethtool stats follow */
 	u64			rx_total_packets;
 	u64			rx_total_bytes;
+	u64			tx_total_packets;
+	u64			tx_total_bytes;
 };
 
 struct bnxt_stats_mem {
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index bdc052b247ba..893102c0d24e 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -514,6 +514,7 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	vlan_tag_flags = 0;
 	cfa_action = bnxt_xmit_get_cfa_action(skb);
+	tx_buf->is_vfr = !!cfa_action;
 	if (skb_vlan_tag_present(skb)) {
 		vlan_tag_flags = TX_BD_CFA_META_KEY_VLAN |
 				 skb_vlan_tag_get(skb);
@@ -522,6 +523,7 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		 */
 		if (skb->vlan_proto == htons(ETH_P_8021Q))
 			vlan_tag_flags |= 1 << TX_BD_CFA_META_TPID_SHIFT;
+		tx_buf->extra_bytes += VLAN_HLEN;
 	}
 
 	if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) && ptp &&
@@ -610,6 +612,8 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
 				    DB_RING_IDX(&txr->tx_db, prod));
 		WRITE_ONCE(txr->tx_prod, prod);
 
+		if (skb->len < BNXT_MIN_ETH_SIZE)
+			tx_buf->extra_bytes += BNXT_MIN_ETH_SIZE - skb->len;
 		tx_buf->is_push = 1;
 		netdev_tx_sent_queue(txq, skb->len);
 		wmb();	/* Sync is_push and byte queue before pushing data */
@@ -634,6 +638,8 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
 			goto tx_kick_pending;
 		length = BNXT_MIN_PKT_SIZE;
 	}
+	if (skb->len < BNXT_MIN_ETH_SIZE)
+		tx_buf->extra_bytes += BNXT_MIN_ETH_SIZE - skb->len;
 
 	mapping = dma_map_single(&pdev->dev, skb->data, len, DMA_TO_DEVICE);
 
@@ -675,6 +681,9 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		length = skb_shinfo(skb)->gso_size;
 		txbd1->tx_bd_mss = cpu_to_le32(length);
 		length += hdr_len;
+
+		tx_buf->hdr_size = hdr_len;
+		tx_buf->extra_segs = skb_shinfo(skb)->gso_segs - 1;
 	} else if (skb->ip_summed == CHECKSUM_PARTIAL) {
 		txbd1->tx_bd_hsize_lflags |=
 			cpu_to_le32(TX_BD_FLAGS_TCP_UDP_CHKSUM);
@@ -784,6 +793,10 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	if (txr->kick_pending)
 		bnxt_txr_db_kick(bp, txr, txr->tx_prod);
 	txr->tx_buf_ring[txr->tx_prod].skb = NULL;
+	txr->tx_buf_ring[txr->tx_prod].extra_segs = 0;
+	txr->tx_buf_ring[txr->tx_prod].extra_bytes = 0;
+	txr->tx_buf_ring[txr->tx_prod].hdr_size = 0;
+	txr->tx_buf_ring[txr->tx_prod].is_vfr = 0;
 	dev_core_stats_tx_dropped_inc(dev);
 	return NETDEV_TX_OK;
 }
@@ -793,11 +806,12 @@ static bool __bnxt_tx_int(struct bnxt *bp, struct bnxt_tx_ring_info *txr,
 			  int budget)
 {
 	struct netdev_queue *txq = netdev_get_tx_queue(bp->dev, txr->txq_index);
+	struct bnxt_sw_stats *sw_stats = txr->bnapi->cp_ring.sw_stats;
 	struct pci_dev *pdev = bp->pdev;
+	int adj_bytes = 0, tx_bytes = 0;
+	int adj_pkts = 0, tx_pkts = 0;
 	u16 hw_cons = txr->tx_hw_cons;
-	unsigned int tx_bytes = 0;
 	u16 cons = txr->tx_cons;
-	int tx_pkts = 0;
 	bool rc = false;
 
 	while (RING_TX(bp, cons) != hw_cons) {
@@ -823,8 +837,20 @@ static bool __bnxt_tx_int(struct bnxt *bp, struct bnxt_tx_ring_info *txr,
 		cons = NEXT_TX(cons);
 		tx_pkts++;
 		tx_bytes += skb->len;
+		if (!tx_buf->is_vfr) {
+			adj_pkts += tx_buf->extra_segs;
+			adj_bytes += tx_buf->extra_bytes +
+				tx_buf->extra_segs * tx_buf->hdr_size;
+		} else {
+			adj_pkts--;
+			adj_bytes -= skb->len;
+		}
 		tx_buf->skb = NULL;
+		tx_buf->extra_segs = 0;
+		tx_buf->extra_bytes = 0;
+		tx_buf->hdr_size = 0;
 		tx_buf->is_ts_pkt = 0;
+		tx_buf->is_vfr = 0;
 
 		if (tx_buf->is_push) {
 			tx_buf->is_push = 0;
@@ -860,6 +886,11 @@ static bool __bnxt_tx_int(struct bnxt *bp, struct bnxt_tx_ring_info *txr,
 
 	WRITE_ONCE(txr->tx_cons, cons);
 
+	u64_stats_update_begin(&sw_stats->tx.syncp);
+	sw_stats->tx.tx_packets += tx_pkts + adj_pkts;
+	sw_stats->tx.tx_bytes   += tx_bytes + adj_bytes;
+	u64_stats_update_end(&sw_stats->tx.syncp);
+
 	__netif_txq_completed_wake(txq, tx_pkts, tx_bytes,
 				   bnxt_tx_avail(bp, txr), bp->tx_wake_thresh,
 				   READ_ONCE(txr->dev_state) == BNXT_DEV_STATE_CLOSING);
@@ -5128,6 +5159,7 @@ static int bnxt_alloc_stats(struct bnxt *bp)
 			return -ENOMEM;
 
 		u64_stats_init(&cpr->sw_stats->rx.syncp);
+		u64_stats_init(&cpr->sw_stats->tx.syncp);
 
 		cpr->stats.len = size;
 		rc = bnxt_alloc_stats_mem(bp, &cpr->stats, !i);
@@ -13106,14 +13138,16 @@ static int bnxt_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 static void bnxt_drv_stat_snapshot(const struct bnxt_sw_stats *sw_stats,
 				   struct bnxt_sw_stats *snapshot)
 {
-	unsigned int seq_rx;
+	unsigned int seq_rx, seq_tx;
 
 	do {
 		seq_rx = u64_stats_fetch_begin(&sw_stats->rx.syncp);
+		seq_tx = u64_stats_fetch_begin(&sw_stats->tx.syncp);
 
 		memcpy(snapshot, sw_stats, sizeof(*snapshot));
 
-	} while (u64_stats_fetch_retry(&sw_stats->rx.syncp, seq_rx));
+	} while (u64_stats_fetch_retry(&sw_stats->rx.syncp, seq_rx) ||
+		 u64_stats_fetch_retry(&sw_stats->tx.syncp, seq_tx));
 }
 
 static void bnxt_get_ring_stats(struct bnxt *bp,
@@ -13132,13 +13166,8 @@ static void bnxt_get_ring_stats(struct bnxt *bp,
 		stats->rx_packets += sw_stats.rx.rx_packets;
 		stats->rx_bytes += sw_stats.rx.rx_bytes;
 
-		stats->tx_packets += BNXT_GET_RING_STATS64(sw, tx_ucast_pkts);
-		stats->tx_packets += BNXT_GET_RING_STATS64(sw, tx_mcast_pkts);
-		stats->tx_packets += BNXT_GET_RING_STATS64(sw, tx_bcast_pkts);
-
-		stats->tx_bytes += BNXT_GET_RING_STATS64(sw, tx_ucast_bytes);
-		stats->tx_bytes += BNXT_GET_RING_STATS64(sw, tx_mcast_bytes);
-		stats->tx_bytes += BNXT_GET_RING_STATS64(sw, tx_bcast_bytes);
+		stats->tx_packets += sw_stats.tx.tx_packets;
+		stats->tx_bytes += sw_stats.tx.tx_bytes;
 
 		stats->rx_missed_errors +=
 			BNXT_GET_RING_STATS64(sw, rx_discard_pkts);
@@ -13230,6 +13259,8 @@ static void bnxt_get_one_ring_drv_stats(struct bnxt *bp,
 	stats->rx_total_bytes += sw_stats.rx.rx_bytes;
 	stats->rx_total_ring_discards +=
 		BNXT_GET_RING_STATS64(hw_stats, rx_discard_pkts);
+	stats->tx_total_packets += sw_stats.tx.tx_packets;
+	stats->tx_total_bytes += sw_stats.tx.tx_bytes;
 	stats->tx_total_resets += sw_stats.tx.tx_resets;
 	stats->tx_total_ring_discards +=
 		BNXT_GET_RING_STATS64(hw_stats, tx_discard_pkts);
@@ -15637,21 +15668,16 @@ static void bnxt_get_queue_stats_tx(struct net_device *dev, int i,
 				    struct netdev_queue_stats_tx *stats)
 {
 	struct bnxt *bp = netdev_priv(dev);
-	struct bnxt_napi *bnapi;
-	u64 *sw;
+	struct bnxt_sw_stats *sw_stats;
+	unsigned int seq;
 
-	bnapi = bp->tx_ring[bp->tx_ring_map[i]].bnapi;
-	sw = bnapi->cp_ring.stats.sw_stats;
+	sw_stats = bp->tx_ring[bp->tx_ring_map[i]].bnapi->cp_ring.sw_stats;
 
-	stats->packets = 0;
-	stats->packets += BNXT_GET_RING_STATS64(sw, tx_ucast_pkts);
-	stats->packets += BNXT_GET_RING_STATS64(sw, tx_mcast_pkts);
-	stats->packets += BNXT_GET_RING_STATS64(sw, tx_bcast_pkts);
-
-	stats->bytes = 0;
-	stats->bytes += BNXT_GET_RING_STATS64(sw, tx_ucast_bytes);
-	stats->bytes += BNXT_GET_RING_STATS64(sw, tx_mcast_bytes);
-	stats->bytes += BNXT_GET_RING_STATS64(sw, tx_bcast_bytes);
+	do {
+		seq = u64_stats_fetch_begin(&sw_stats->tx.syncp);
+		stats->packets = sw_stats->tx.tx_packets;
+		stats->bytes = sw_stats->tx.tx_bytes;
+	} while (u64_stats_fetch_retry(&sw_stats->tx.syncp, seq));
 }
 
 static void bnxt_get_base_stats(struct net_device *dev,
@@ -15664,8 +15690,8 @@ static void bnxt_get_base_stats(struct net_device *dev,
 	rx->bytes = bp->ring_drv_stats_prev.rx_total_bytes;
 	rx->alloc_fail = bp->ring_drv_stats_prev.rx_total_oom_discards;
 
-	tx->packets = bp->net_stats_prev.tx_packets;
-	tx->bytes = bp->net_stats_prev.tx_bytes;
+	tx->packets = bp->ring_drv_stats_prev.tx_total_packets;
+	tx->bytes = bp->ring_drv_stats_prev.tx_total_bytes;
 }
 
 static const struct netdev_stat_ops bnxt_stat_ops = {
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next v3 10/10] eth: bnxt: count xdp xmit packets
  2025-03-05 22:52 [PATCH net-next v3 00/10] eth: bnxt: maintain basic pkt/byte counters in SW Jakub Kicinski
                   ` (8 preceding siblings ...)
  2025-03-05 22:52 ` [PATCH net-next v3 09/10] eth: bnxt: maintain tx " Jakub Kicinski
@ 2025-03-05 22:52 ` Jakub Kicinski
  2025-03-06 10:54 ` [PATCH net-next v3 00/10] eth: bnxt: maintain basic pkt/byte counters in SW Taehee Yoo
  10 siblings, 0 replies; 18+ messages in thread
From: Jakub Kicinski @ 2025-03-05 22:52 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, michael.chan,
	pavan.chebbi, przemyslaw.kitszel, Jakub Kicinski

Count XDP_TX and XDP_REDIRECT packets. Since the Tx rings are separate
we count the packets sent to the base stats, not per-queues stats.

The XDP stats are protected by the Rx syncp since they are in NAPI
context. Feels slightly less ugly than having a Tx stats in Rx struct.
But neither is ideal.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
v3:
 - count XDP on same members as Tx
v2: https://lore.kernel.org/20250228012534.3460918-10-kuba@kernel.org
 - move tx_buf init sooner so that shinfo handling can access it
v1: https://lore.kernel.org/20250226211003.2790916-10-kuba@kernel.org
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h     |  9 ++++---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c     | 16 ++++++++++++
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c | 25 +++++++++++++++----
 3 files changed, 42 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 37d7f08a73c3..0e9702871fd3 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -881,7 +881,10 @@ struct bnxt_sw_tx_bd {
 	struct page		*page;
 	DEFINE_DMA_UNMAP_ADDR(mapping);
 	DEFINE_DMA_UNMAP_LEN(len);
-	u16			extra_segs;
+	union {
+		u16			extra_segs;
+		u16			xdp_len;
+	};
 	u8			extra_bytes;
 	u8			hdr_size;
 	u8			is_ts_pkt;
@@ -1134,8 +1137,8 @@ struct bnxt_rx_sw_stats {
 struct bnxt_tx_sw_stats {
 	u64			tx_resets;
 	/* non-ethtool stats follow */
-	u64			tx_packets;
-	u64			tx_bytes;
+	u64			tx_packets; /* for XDP_TX, under rx syncp */
+	u64			tx_bytes; /* for XDP_TX, under rx syncp */
 	struct u64_stats_sync	syncp;
 };
 
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 893102c0d24e..30d2b6b25301 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -15685,6 +15685,7 @@ static void bnxt_get_base_stats(struct net_device *dev,
 				struct netdev_queue_stats_tx *tx)
 {
 	struct bnxt *bp = netdev_priv(dev);
+	int i;
 
 	rx->packets = bp->ring_drv_stats_prev.rx_total_packets;
 	rx->bytes = bp->ring_drv_stats_prev.rx_total_bytes;
@@ -15692,6 +15693,21 @@ static void bnxt_get_base_stats(struct net_device *dev,
 
 	tx->packets = bp->ring_drv_stats_prev.tx_total_packets;
 	tx->bytes = bp->ring_drv_stats_prev.tx_total_bytes;
+
+	for (i = 0; i < bp->tx_nr_rings_xdp; i++) {
+		struct bnxt_sw_stats *sw_stats = bp->bnapi[i]->cp_ring.sw_stats;
+		unsigned int seq;
+		u64 pkts, bytes;
+
+		do {
+			seq = u64_stats_fetch_begin(&sw_stats->rx.syncp);
+			pkts = sw_stats->tx.tx_packets;
+			bytes = sw_stats->tx.tx_bytes;
+		} while (u64_stats_fetch_retry(&sw_stats->rx.syncp, seq));
+
+		tx->packets += pkts;
+		tx->bytes += bytes;
+	}
 }
 
 static const struct netdev_stat_ops bnxt_stat_ops = {
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
index d13c8e06d299..8ab40ae5c443 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
@@ -35,14 +35,17 @@ struct bnxt_sw_tx_bd *bnxt_xmit_bd(struct bnxt *bp,
 	u16 prod;
 	int i;
 
-	if (xdp && xdp_buff_has_frags(xdp)) {
-		sinfo = xdp_get_shared_info_from_buff(xdp);
-		num_frags = sinfo->nr_frags;
-	}
-
 	/* fill up the first buffer */
 	prod = txr->tx_prod;
 	tx_buf = &txr->tx_buf_ring[RING_TX(bp, prod)];
+	tx_buf->xdp_len = len;
+
+	if (xdp && xdp_buff_has_frags(xdp)) {
+		sinfo = xdp_get_shared_info_from_buff(xdp);
+		tx_buf->xdp_len += sinfo->xdp_frags_size;
+		num_frags = sinfo->nr_frags;
+	}
+
 	tx_buf->nr_frags = num_frags;
 	if (xdp)
 		tx_buf->page = virt_to_head_page(xdp->data);
@@ -120,9 +123,11 @@ static void __bnxt_xmit_xdp_redirect(struct bnxt *bp,
 
 void bnxt_tx_int_xdp(struct bnxt *bp, struct bnxt_napi *bnapi, int budget)
 {
+	struct bnxt_sw_stats *sw_stats = bnapi->cp_ring.sw_stats;
 	struct bnxt_tx_ring_info *txr = bnapi->tx_ring[0];
 	struct bnxt_rx_ring_info *rxr = bnapi->rx_ring;
 	u16 tx_hw_cons = txr->tx_hw_cons;
+	unsigned int pkts = 0, bytes = 0;
 	bool rx_doorbell_needed = false;
 	struct bnxt_sw_tx_bd *tx_buf;
 	u16 tx_cons = txr->tx_cons;
@@ -135,6 +140,10 @@ void bnxt_tx_int_xdp(struct bnxt *bp, struct bnxt_napi *bnapi, int budget)
 	while (RING_TX(bp, tx_cons) != tx_hw_cons) {
 		tx_buf = &txr->tx_buf_ring[RING_TX(bp, tx_cons)];
 
+		pkts++;
+		bytes += tx_buf->xdp_len;
+		tx_buf->xdp_len = 0;
+
 		if (tx_buf->action == XDP_REDIRECT) {
 			struct pci_dev *pdev = bp->pdev;
 
@@ -163,6 +172,12 @@ void bnxt_tx_int_xdp(struct bnxt *bp, struct bnxt_napi *bnapi, int budget)
 		tx_cons = NEXT_TX(tx_cons);
 	}
 
+	/* Note: Rx sync here, because Rx == NAPI context */
+	u64_stats_update_begin(&sw_stats->rx.syncp);
+	sw_stats->tx.tx_packets += pkts;
+	sw_stats->tx.tx_bytes += bytes;
+	u64_stats_update_end(&sw_stats->rx.syncp);
+
 	bnapi->events &= ~BNXT_TX_CMP_EVENT;
 	WRITE_ONCE(txr->tx_cons, tx_cons);
 	if (rx_doorbell_needed) {
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next v3 00/10] eth: bnxt: maintain basic pkt/byte counters in SW
  2025-03-05 22:52 [PATCH net-next v3 00/10] eth: bnxt: maintain basic pkt/byte counters in SW Jakub Kicinski
                   ` (9 preceding siblings ...)
  2025-03-05 22:52 ` [PATCH net-next v3 10/10] eth: bnxt: count xdp xmit packets Jakub Kicinski
@ 2025-03-06 10:54 ` Taehee Yoo
  2025-03-06 15:24   ` Jakub Kicinski
  10 siblings, 1 reply; 18+ messages in thread
From: Taehee Yoo @ 2025-03-06 10:54 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms,
	michael.chan, pavan.chebbi, przemyslaw.kitszel

On Thu, Mar 6, 2025 at 7:52 AM Jakub Kicinski <kuba@kernel.org> wrote:
>

Hi Jakub,
Thanks a lot for this work!

> Some workloads want to be able to track bandwidth utilization on
> the scale of 10s of msecs. bnxt uses HW stats and async stats
> updates, with update frequency controlled via ethtool -C.
> Updating all HW stats more often than 100 msec is both hard for
> the device and consumes PCIe bandwidth. Switch to maintaining
> basic Rx / Tx packet and byte counters in SW.
>
> Tested with drivers/net/stats.py:
>   # Totals: pass:7 fail:0 xfail:0 xpass:0 skip:0 error:0

I found kernel panic while testing stats.py, it occurs when the
interface is down. If the interface is down, cp_ring is not allocated
but bnxt_get_queue_stats_rx() accesses it without null check.

BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 163990067 P4D 163990067 PUD 144363067 PMD 0
Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 0 UID: 0 PID: 1654 Comm: python3 Not tainted 6.14.0-rc1+ #6
da0f9ad0522edf8bf0c96e8453594913017a5fc9
Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021
RIP: 0010:bnxt_get_queue_stats_rx+0xf/0x70 [bnxt_en]
Code: c6 87 b5 18 00 00 02 eb a2 66 90 90 90 90 90 90 90 90 90 90 90
90 90 90 90 90 90 0f 1f 44 00 00 48 8b 87 48 0b 00 00 48 63 f6 <48> 8b
1
RSP: 0018:ffffa95ac3c2b7e0 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffffffffc0650710 RCX: 0000000000000000
RDX: ffffa95ac3c2b858 RSI: 0000000000000000 RDI: ffffa25a1c1e8000
RBP: ffffa259e3947100 R08: 0000000000000004 R09: ffffa259e4a6601c
R10: 0000000000000015 R11: ffffa259e4a66000 R12: 0000000000000000
R13: ffffa95ac3c2b8c0 R14: ffffa25a1c1e8000 R15: 0000000000000000
FS:  00007f0b3ccbf080(0000) GS:ffffa260df600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000162d5a000 CR4: 00000000007506f0
PKRU: 55555554
Call Trace:
 <TASK>
 ? __die+0x20/0x70
 ? page_fault_oops+0x15a/0x460
 ? exc_page_fault+0x6e/0x180
 ? asm_exc_page_fault+0x22/0x30
 ? bnxt_get_queue_stats_rx+0xf/0x70 [bnxt_en
3bf73dc1ebebb3ca46ef8948d1fc1a94acbeeba1]
 netdev_nl_stats_by_netdev+0x2b1/0x4e0
 ? xas_load+0x9/0xb0
 ? xas_find+0x183/0x1d0
 ? xa_find+0x8b/0xe0
 netdev_nl_qstats_get_dumpit+0xbf/0x1e0
 genl_dumpit+0x31/0x90
 netlink_dump+0x1a8/0x360

This is not a bug of this series, we can reproduce top of net/net-next
without this series.
Reproduce:
 ip link set $interface down
 ./cli.py --spec netdev.yaml --dump qstats-get
OR
 python ./stats.py

It seems that the driver is supposed to return qstats even if interface
is down. So, I think bnxt driver needs to store the sw_stats when the
interface is down. that may be very similar to the bnxt_get_ring_stats()
and bnxt_get_ring_drv_stats().
What do you think about it?

Thanks a lot!
Taehee Yoo

>
> Manually tested by comparing the ethtool -S stats (which continues
> to show HW stats) with qstats, and total interface stats.
> With and without HW-GRO, and with XDP on / off.
> Stopping and starting the interface also doesn't corrupt the values.
>
> v3:
>  - try to include vlan tag and padding length in the stats
> v2: https://lore.kernel.org/20250228012534.3460918-1-kuba@kernel.org
>  - fix skipping XDP vs the XDP Tx ring handling (Michael)
>  - rename the defines as well as the structs (Przemek)
>  - fix counding frag'ed packets in XDP Tx
> v1: https://lore.kernel.org/20250226211003.2790916-1-kuba@kernel.org
>
> Jakub Kicinski (10):
>   eth: bnxt: use napi_consume_skb()
>   eth: bnxt: don't run xdp programs on fallback traffic
>   eth: bnxt: rename ring_err_stats -> ring_drv_stats
>   eth: bnxt: snapshot driver stats
>   eth: bnxt: don't use ifdef to check for CONFIG_INET in GRO
>   eth: bnxt: consolidate the GRO-but-not-really paths in bnxt_gro_skb()
>   eth: bnxt: extract VLAN info early on
>   eth: bnxt: maintain rx pkt/byte stats in SW
>   eth: bnxt: maintain tx pkt/byte stats in SW
>   eth: bnxt: count xdp xmit packets
>
>  drivers/net/ethernet/broadcom/bnxt/bnxt.h     |  49 +++-
>  drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h |   5 +-
>  drivers/net/ethernet/broadcom/bnxt/bnxt.c     | 272 +++++++++++-------
>  .../net/ethernet/broadcom/bnxt/bnxt_ethtool.c |  20 +-
>  drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c |  47 ++-
>  5 files changed, 264 insertions(+), 129 deletions(-)
>
> --
> 2.48.1
>
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next v3 00/10] eth: bnxt: maintain basic pkt/byte counters in SW
  2025-03-06 10:54 ` [PATCH net-next v3 00/10] eth: bnxt: maintain basic pkt/byte counters in SW Taehee Yoo
@ 2025-03-06 15:24   ` Jakub Kicinski
  2025-03-06 19:25     ` Michael Chan
  0 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2025-03-06 15:24 UTC (permalink / raw)
  To: Taehee Yoo
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms,
	michael.chan, pavan.chebbi, przemyslaw.kitszel

On Thu, 6 Mar 2025 19:54:22 +0900 Taehee Yoo wrote:
> It seems that the driver is supposed to return qstats even if interface
> is down. So, I think bnxt driver needs to store the sw_stats when the
> interface is down.

I think the driver already stores the stats on close and will report
them in "base" values. So all we need in per-queue callbacks is to
return early, when the interface is down?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next v3 00/10] eth: bnxt: maintain basic pkt/byte counters in SW
  2025-03-06 15:24   ` Jakub Kicinski
@ 2025-03-06 19:25     ` Michael Chan
  2025-03-07  5:25       ` Taehee Yoo
  0 siblings, 1 reply; 18+ messages in thread
From: Michael Chan @ 2025-03-06 19:25 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Taehee Yoo, davem, netdev, edumazet, pabeni, andrew+netdev, horms,
	pavan.chebbi, przemyslaw.kitszel

[-- Attachment #1: Type: text/plain, Size: 553 bytes --]

On Thu, Mar 6, 2025 at 7:25 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 6 Mar 2025 19:54:22 +0900 Taehee Yoo wrote:
> > It seems that the driver is supposed to return qstats even if interface
> > is down. So, I think bnxt driver needs to store the sw_stats when the
> > interface is down.
>
> I think the driver already stores the stats on close and will report
> them in "base" values. So all we need in per-queue callbacks is to
> return early, when the interface is down?

Yes, we can check if (!bp->bnapi) and return early.

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4196 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next v3 00/10] eth: bnxt: maintain basic pkt/byte counters in SW
  2025-03-06 19:25     ` Michael Chan
@ 2025-03-07  5:25       ` Taehee Yoo
  2025-03-07  5:30         ` Michael Chan
  0 siblings, 1 reply; 18+ messages in thread
From: Taehee Yoo @ 2025-03-07  5:25 UTC (permalink / raw)
  To: Michael Chan
  Cc: Jakub Kicinski, davem, netdev, edumazet, pabeni, andrew+netdev,
	horms, pavan.chebbi, przemyslaw.kitszel

On Fri, Mar 7, 2025 at 4:25 AM Michael Chan <michael.chan@broadcom.com> wrote:
>
> On Thu, Mar 6, 2025 at 7:25 AM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > On Thu, 6 Mar 2025 19:54:22 +0900 Taehee Yoo wrote:
> > > It seems that the driver is supposed to return qstats even if interface
> > > is down. So, I think bnxt driver needs to store the sw_stats when the
> > > interface is down.
> >
> > I think the driver already stores the stats on close and will report
> > them in "base" values. So all we need in per-queue callbacks is to
> > return early, when the interface is down?
>
> Yes, we can check if (!bp->bnapi) and return early.

Hi Jakub and Michael,
Thanks a lot for the review!

I checked that early return if the interface is down, It works well
without any problem.
So if you're okay with it, I would like to add this fix to my current patchset.
https://lore.kernel.org/netdev/20250306072422.3303386-1-ap420073@gmail.com

Thanks a lot!
Taehee Yoo

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next v3 00/10] eth: bnxt: maintain basic pkt/byte counters in SW
  2025-03-07  5:25       ` Taehee Yoo
@ 2025-03-07  5:30         ` Michael Chan
  2025-03-07 16:43           ` Jakub Kicinski
  0 siblings, 1 reply; 18+ messages in thread
From: Michael Chan @ 2025-03-07  5:30 UTC (permalink / raw)
  To: Taehee Yoo
  Cc: Jakub Kicinski, davem, netdev, edumazet, pabeni, andrew+netdev,
	horms, pavan.chebbi, przemyslaw.kitszel

[-- Attachment #1: Type: text/plain, Size: 584 bytes --]

On Thu, Mar 6, 2025 at 9:26 PM Taehee Yoo <ap420073@gmail.com> wrote:
>
> On Fri, Mar 7, 2025 at 4:25 AM Michael Chan <michael.chan@broadcom.com> wrote:
> > Yes, we can check if (!bp->bnapi) and return early.
>
> Hi Jakub and Michael,
> Thanks a lot for the review!
>
> I checked that early return if the interface is down, It works well
> without any problem.
> So if you're okay with it, I would like to add this fix to my current patchset.
> https://lore.kernel.org/netdev/20250306072422.3303386-1-ap420073@gmail.com
>

Yes, please go ahead and add it.  Thanks.

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4196 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next v3 00/10] eth: bnxt: maintain basic pkt/byte counters in SW
  2025-03-07  5:30         ` Michael Chan
@ 2025-03-07 16:43           ` Jakub Kicinski
  0 siblings, 0 replies; 18+ messages in thread
From: Jakub Kicinski @ 2025-03-07 16:43 UTC (permalink / raw)
  To: Taehee Yoo
  Cc: Michael Chan, davem, netdev, edumazet, pabeni, andrew+netdev,
	horms, pavan.chebbi, przemyslaw.kitszel

On Thu, 6 Mar 2025 21:30:00 -0800 Michael Chan wrote:
> > I checked that early return if the interface is down, It works well
> > without any problem.
> > So if you're okay with it, I would like to add this fix to my current patchset.
> > https://lore.kernel.org/netdev/20250306072422.3303386-1-ap420073@gmail.com
>
> Yes, please go ahead and add it.  Thanks.

Yes from me, too, FWIW, thanks!

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next v3 09/10] eth: bnxt: maintain tx pkt/byte stats in SW
  2025-03-05 22:52 ` [PATCH net-next v3 09/10] eth: bnxt: maintain tx " Jakub Kicinski
@ 2025-03-09  8:25   ` Michael Chan
  0 siblings, 0 replies; 18+ messages in thread
From: Michael Chan @ 2025-03-09  8:25 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms,
	pavan.chebbi, przemyslaw.kitszel

[-- Attachment #1: Type: text/plain, Size: 1129 bytes --]

On Wed, Mar 5, 2025 at 2:52 PM Jakub Kicinski <kuba@kernel.org> wrote:

> @@ -610,6 +612,8 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
>                                     DB_RING_IDX(&txr->tx_db, prod));
>                 WRITE_ONCE(txr->tx_prod, prod);
>
> +               if (skb->len < BNXT_MIN_ETH_SIZE)
> +                       tx_buf->extra_bytes += BNXT_MIN_ETH_SIZE - skb->len;

s/BNXT_MIN_ETH_SIZE/ETH_ZLEN

BNXT_MIN_ETH_SIZE is 52 bytes but HW will pad to 60 bytes.  So we need
to also include the HW padding.

>                 tx_buf->is_push = 1;
>                 netdev_tx_sent_queue(txq, skb->len);
>                 wmb();  /* Sync is_push and byte queue before pushing data */
> @@ -634,6 +638,8 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
>                         goto tx_kick_pending;
>                 length = BNXT_MIN_PKT_SIZE;
>         }
> +       if (skb->len < BNXT_MIN_ETH_SIZE)
> +               tx_buf->extra_bytes += BNXT_MIN_ETH_SIZE - skb->len;

Same here for non-push packets.  Thanks.

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4196 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2025-03-09  8:25 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-05 22:52 [PATCH net-next v3 00/10] eth: bnxt: maintain basic pkt/byte counters in SW Jakub Kicinski
2025-03-05 22:52 ` [PATCH net-next v3 01/10] eth: bnxt: use napi_consume_skb() Jakub Kicinski
2025-03-05 22:52 ` [PATCH net-next v3 02/10] eth: bnxt: don't run xdp programs on fallback traffic Jakub Kicinski
2025-03-05 22:52 ` [PATCH net-next v3 03/10] eth: bnxt: rename ring_err_stats -> ring_drv_stats Jakub Kicinski
2025-03-05 22:52 ` [PATCH net-next v3 04/10] eth: bnxt: snapshot driver stats Jakub Kicinski
2025-03-05 22:52 ` [PATCH net-next v3 05/10] eth: bnxt: don't use ifdef to check for CONFIG_INET in GRO Jakub Kicinski
2025-03-05 22:52 ` [PATCH net-next v3 06/10] eth: bnxt: consolidate the GRO-but-not-really paths in bnxt_gro_skb() Jakub Kicinski
2025-03-05 22:52 ` [PATCH net-next v3 07/10] eth: bnxt: extract VLAN info early on Jakub Kicinski
2025-03-05 22:52 ` [PATCH net-next v3 08/10] eth: bnxt: maintain rx pkt/byte stats in SW Jakub Kicinski
2025-03-05 22:52 ` [PATCH net-next v3 09/10] eth: bnxt: maintain tx " Jakub Kicinski
2025-03-09  8:25   ` Michael Chan
2025-03-05 22:52 ` [PATCH net-next v3 10/10] eth: bnxt: count xdp xmit packets Jakub Kicinski
2025-03-06 10:54 ` [PATCH net-next v3 00/10] eth: bnxt: maintain basic pkt/byte counters in SW Taehee Yoo
2025-03-06 15:24   ` Jakub Kicinski
2025-03-06 19:25     ` Michael Chan
2025-03-07  5:25       ` Taehee Yoo
2025-03-07  5:30         ` Michael Chan
2025-03-07 16:43           ` Jakub Kicinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).