* [PATCH net-next v4 00/10] bnxt_en: Add NPAR 1.2 and TPH support
From: Michael Chan @ 2025-02-08 20:29 UTC
  To: davem
  Cc: netdev, edumazet, kuba, pabeni, andrew+netdev, pavan.chebbi,
	andrew.gospodarek, michal.swiatkowski, helgaas, horms

The first patch adds NPAR 1.2 support.  Patches 2 to 10 add TPH
(TLP Processing Hints) support.  These TPH driver patches are new
revisions of the patches originally posted as part of the TPH PCI
patch series.  Additional driver refactoring has been done so that
we can free and reallocate the RX completion ring and the TX rings
when the channel is a combined channel.  We also add napi_disable()
and napi_enable() during queue_stop() and queue_start() respectively,
and add a reset for error handling in queue_start().  The resulting
ordering is sketched below.
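
For reference, the resulting ordering for a combined channel looks
roughly like this (condensed from patch #9; error paths, page pool
teardown and the TPH-only conditionals are omitted):

	queue_stop():
		bnxt_hwrm_rx_ring_free(bp, rxr, false);
		bnxt_hwrm_rx_agg_ring_free(bp, rxr, false);
		bnxt_tx_queue_stop(bp, idx);	/* also frees the TX/TX
						 * cmpl rings under TPH */
		napi_disable(&bnapi->napi);	/* after ring free; FW
						 * completes ring free
						 * via NAPI */

	queue_start():
		/* reallocate RX, RX cmpl, RX AGG, TX and TX cmpl rings */
		bnxt_tx_queue_start(bp, idx);
		napi_enable(&bnapi->napi);
		bnxt_db_nq_arm(bp, &cpr->cp_db, cpr->cp_raw_cons);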

v4:
Fix NPAR typo in patch #1 and improve the description of NPAR

v3:
Fix build bot warning in patch #9
Add MODULE_IMPORT_NS("NETDEV_INTERNAL") to patch #10
Optimize ring operations when TPH is not enabled in patch #8 and #9

v2:
Major change is error handling in patch #9.

https://lore.kernel.org/netdev/20250116192343.34535-1-michael.chan@broadcom.com/

v1: 
https://lore.kernel.org/netdev/20250113063927.4017173-1-michael.chan@broadcom.com/

Discussion about adding napi_disable()/napi_enable():

https://lore.kernel.org/netdev/5336d624-8d8b-40a6-b732-b020e4a119a2@davidwei.uk/#t

Previous driver series fixing rtnl_lock and empty release function:

https://lore.kernel.org/netdev/20241115200412.1340286-1-wei.huang2@amd.com/

v5 of the PCI series using netdev_rx_queue_restart():

https://lore.kernel.org/netdev/20240916205103.3882081-5-wei.huang2@amd.com/

v1 of the PCI series using open/close:

https://lore.kernel.org/netdev/20240509162741.1937586-9-wei.huang2@amd.com/

Manoj Panicker (1):
  bnxt_en: Add TPH support in BNXT driver

Michael Chan (5):
  bnxt_en: Set NPAR 1.2 support when registering with firmware
  bnxt_en: Refactor completion ring allocation logic for P5_PLUS chips
  bnxt_en: Refactor TX ring allocation logic
  bnxt_en: Refactor RX/RX AGG ring parameters setup for P5_PLUS
  bnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG rings

Somnath Kotur (4):
  bnxt_en: Refactor completion ring free routine
  bnxt_en: Refactor bnxt_free_tx_rings() to free per TX ring
  bnxt_en: Reallocate RX completion ring for TPH support
  bnxt_en: Extend queue stop/start for TX rings

 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 531 ++++++++++++++++------
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |   8 +
 2 files changed, 408 insertions(+), 131 deletions(-)

-- 
2.30.1



* [PATCH net-next v4 01/10] bnxt_en: Set NPAR 1.2 support when registering with firmware
From: Michael Chan @ 2025-02-08 20:29 UTC
  To: davem
  Cc: netdev, edumazet, kuba, pabeni, andrew+netdev, pavan.chebbi,
	andrew.gospodarek, michal.swiatkowski, helgaas, horms,
	Somnath Kotur

NPAR (Network interface card partitioning)[1] 1.2 adds a transparent
VLAN tag for all packets between the NIC and the switch.  Because of
that, RX VLAN acceleration cannot be supported for any additional
host-configured VLANs.  The driver must acknowledge that it cannot
support RX VLAN acceleration and set the NPAR 1.2 supported flag
when registering with the FW.  Otherwise, the FW call will fail and
the driver will abort on these NPAR 1.2 NICs with this error:

bnxt_en 0000:26:00.0 (unnamed net_device) (uninitialized): hwrm req_type 0x1d seq id 0xb error 0x2

[1] https://techdocs.broadcom.com/us/en/storage-and-ethernet-connectivity/ethernet-nic-controllers/bcm957xxx/adapters/introduction/features/network-partitioning-npar.html

Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
v4: Fix typo in NPAR and improve the description of NPAR
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 5 +++++
 drivers/net/ethernet/broadcom/bnxt/bnxt.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 7b8b5b39c7bb..fc870c104c56 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -5565,6 +5565,8 @@ int bnxt_hwrm_func_drv_rgtr(struct bnxt *bp, unsigned long *bmap, int bmap_size,
 	if (bp->fw_cap & BNXT_FW_CAP_ERROR_RECOVERY)
 		flags |= FUNC_DRV_RGTR_REQ_FLAGS_ERROR_RECOVERY_SUPPORT |
 			 FUNC_DRV_RGTR_REQ_FLAGS_MASTER_SUPPORT;
+	if (bp->fw_cap & BNXT_FW_CAP_NPAR_1_2)
+		flags |= FUNC_DRV_RGTR_REQ_FLAGS_NPAR_1_2_SUPPORT;
 	req->flags = cpu_to_le32(flags);
 	req->ver_maj_8b = DRV_VER_MAJ;
 	req->ver_min_8b = DRV_VER_MIN;
@@ -8365,6 +8367,7 @@ static int bnxt_hwrm_func_qcfg(struct bnxt *bp)
 
 	switch (resp->port_partition_type) {
 	case FUNC_QCFG_RESP_PORT_PARTITION_TYPE_NPAR1_0:
+	case FUNC_QCFG_RESP_PORT_PARTITION_TYPE_NPAR1_2:
 	case FUNC_QCFG_RESP_PORT_PARTITION_TYPE_NPAR1_5:
 	case FUNC_QCFG_RESP_PORT_PARTITION_TYPE_NPAR2_0:
 		bp->port_partition_type = resp->port_partition_type;
@@ -9529,6 +9532,8 @@ static int __bnxt_hwrm_func_qcaps(struct bnxt *bp)
 		bp->fw_cap |= BNXT_FW_CAP_HOT_RESET_IF;
 	if (BNXT_PF(bp) && (flags_ext & FUNC_QCAPS_RESP_FLAGS_EXT_FW_LIVEPATCH_SUPPORTED))
 		bp->fw_cap |= BNXT_FW_CAP_LIVEPATCH;
+	if (flags_ext & FUNC_QCAPS_RESP_FLAGS_EXT_NPAR_1_2_SUPPORTED)
+		bp->fw_cap |= BNXT_FW_CAP_NPAR_1_2;
 	if (BNXT_PF(bp) && (flags_ext & FUNC_QCAPS_RESP_FLAGS_EXT_DFLT_VLAN_TPID_PCP_SUPPORTED))
 		bp->fw_cap |= BNXT_FW_CAP_DFLT_VLAN_TPID_PCP;
 	if (flags_ext & FUNC_QCAPS_RESP_FLAGS_EXT_BS_V2_SUPPORTED)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 2373f423a523..db0be469a3db 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -2492,6 +2492,7 @@ struct bnxt {
 	#define BNXT_FW_CAP_CFA_RFS_RING_TBL_IDX_V3	BIT_ULL(39)
 	#define BNXT_FW_CAP_VNIC_RE_FLUSH		BIT_ULL(40)
 	#define BNXT_FW_CAP_SW_MAX_RESOURCE_LIMITS	BIT_ULL(41)
+	#define BNXT_FW_CAP_NPAR_1_2			BIT_ULL(42)
 
 	u32			fw_dbg_cap;
 
-- 
2.30.1



* [PATCH net-next v4 02/10] bnxt_en: Refactor completion ring allocation logic for P5_PLUS chips
From: Michael Chan @ 2025-02-08 20:29 UTC
  To: davem
  Cc: netdev, edumazet, kuba, pabeni, andrew+netdev, pavan.chebbi,
	andrew.gospodarek, michal.swiatkowski, helgaas, horms,
	Ajit Khaparde

Add a new bnxt_hwrm_cp_ring_alloc_p5() function to handle allocating
one completion ring on P5_PLUS chips.  This simplifies the existing code
and will be useful later in the series.

Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
v2: Use const for a variable
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 44 +++++++++++------------
 1 file changed, 21 insertions(+), 23 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index fc870c104c56..0e16ea823fbd 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -7199,6 +7199,25 @@ static int bnxt_hwrm_rx_agg_ring_alloc(struct bnxt *bp,
 	return 0;
 }
 
+static int bnxt_hwrm_cp_ring_alloc_p5(struct bnxt *bp,
+				      struct bnxt_cp_ring_info *cpr)
+{
+	const u32 type = HWRM_RING_ALLOC_CMPL;
+	struct bnxt_napi *bnapi = cpr->bnapi;
+	struct bnxt_ring_struct *ring;
+	u32 map_idx = bnapi->index;
+	int rc;
+
+	ring = &cpr->cp_ring_struct;
+	ring->handle = BNXT_SET_NQ_HDL(cpr);
+	rc = hwrm_ring_alloc_send_msg(bp, ring, type, map_idx);
+	if (rc)
+		return rc;
+	bnxt_set_db(bp, &cpr->cp_db, type, map_idx, ring->fw_ring_id);
+	bnxt_db_cq(bp, &cpr->cp_db, cpr->cp_raw_cons);
+	return 0;
+}
+
 static int bnxt_hwrm_ring_alloc(struct bnxt *bp)
 {
 	bool agg_rings = !!(bp->flags & BNXT_FLAG_AGG_RINGS);
@@ -7242,19 +7261,9 @@ static int bnxt_hwrm_ring_alloc(struct bnxt *bp)
 		u32 map_idx;
 
 		if (bp->flags & BNXT_FLAG_CHIP_P5_PLUS) {
-			struct bnxt_cp_ring_info *cpr2 = txr->tx_cpr;
-			struct bnxt_napi *bnapi = txr->bnapi;
-			u32 type2 = HWRM_RING_ALLOC_CMPL;
-
-			ring = &cpr2->cp_ring_struct;
-			ring->handle = BNXT_SET_NQ_HDL(cpr2);
-			map_idx = bnapi->index;
-			rc = hwrm_ring_alloc_send_msg(bp, ring, type2, map_idx);
+			rc = bnxt_hwrm_cp_ring_alloc_p5(bp, txr->tx_cpr);
 			if (rc)
 				goto err_out;
-			bnxt_set_db(bp, &cpr2->cp_db, type2, map_idx,
-				    ring->fw_ring_id);
-			bnxt_db_cq(bp, &cpr2->cp_db, cpr2->cp_raw_cons);
 		}
 		ring = &txr->tx_ring_struct;
 		map_idx = i;
@@ -7274,20 +7283,9 @@ static int bnxt_hwrm_ring_alloc(struct bnxt *bp)
 		if (!agg_rings)
 			bnxt_db_write(bp, &rxr->rx_db, rxr->rx_prod);
 		if (bp->flags & BNXT_FLAG_CHIP_P5_PLUS) {
-			struct bnxt_cp_ring_info *cpr2 = rxr->rx_cpr;
-			struct bnxt_napi *bnapi = rxr->bnapi;
-			u32 type2 = HWRM_RING_ALLOC_CMPL;
-			struct bnxt_ring_struct *ring;
-			u32 map_idx = bnapi->index;
-
-			ring = &cpr2->cp_ring_struct;
-			ring->handle = BNXT_SET_NQ_HDL(cpr2);
-			rc = hwrm_ring_alloc_send_msg(bp, ring, type2, map_idx);
+			rc = bnxt_hwrm_cp_ring_alloc_p5(bp, rxr->rx_cpr);
 			if (rc)
 				goto err_out;
-			bnxt_set_db(bp, &cpr2->cp_db, type2, map_idx,
-				    ring->fw_ring_id);
-			bnxt_db_cq(bp, &cpr2->cp_db, cpr2->cp_raw_cons);
 		}
 	}
 
-- 
2.30.1



* [PATCH net-next v4 03/10] bnxt_en: Refactor TX ring allocation logic
From: Michael Chan @ 2025-02-08 20:29 UTC
  To: davem
  Cc: netdev, edumazet, kuba, pabeni, andrew+netdev, pavan.chebbi,
	andrew.gospodarek, michal.swiatkowski, helgaas, horms,
	Ajit Khaparde

Add a new bnxt_hwrm_tx_ring_alloc() function to handle allocating
a transmit ring.  This will be useful later in the series.

Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
v2: Use const for a variable
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 22 +++++++++++++++-------
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 0e16ea823fbd..8ab7345acb0a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -7218,6 +7218,20 @@ static int bnxt_hwrm_cp_ring_alloc_p5(struct bnxt *bp,
 	return 0;
 }
 
+static int bnxt_hwrm_tx_ring_alloc(struct bnxt *bp,
+				   struct bnxt_tx_ring_info *txr, u32 tx_idx)
+{
+	struct bnxt_ring_struct *ring = &txr->tx_ring_struct;
+	const u32 type = HWRM_RING_ALLOC_TX;
+	int rc;
+
+	rc = hwrm_ring_alloc_send_msg(bp, ring, type, tx_idx);
+	if (rc)
+		return rc;
+	bnxt_set_db(bp, &txr->tx_db, type, tx_idx, ring->fw_ring_id);
+	return 0;
+}
+
 static int bnxt_hwrm_ring_alloc(struct bnxt *bp)
 {
 	bool agg_rings = !!(bp->flags & BNXT_FLAG_AGG_RINGS);
@@ -7254,23 +7268,17 @@ static int bnxt_hwrm_ring_alloc(struct bnxt *bp)
 		}
 	}
 
-	type = HWRM_RING_ALLOC_TX;
 	for (i = 0; i < bp->tx_nr_rings; i++) {
 		struct bnxt_tx_ring_info *txr = &bp->tx_ring[i];
-		struct bnxt_ring_struct *ring;
-		u32 map_idx;
 
 		if (bp->flags & BNXT_FLAG_CHIP_P5_PLUS) {
 			rc = bnxt_hwrm_cp_ring_alloc_p5(bp, txr->tx_cpr);
 			if (rc)
 				goto err_out;
 		}
-		ring = &txr->tx_ring_struct;
-		map_idx = i;
-		rc = hwrm_ring_alloc_send_msg(bp, ring, type, map_idx);
+		rc = bnxt_hwrm_tx_ring_alloc(bp, txr, i);
 		if (rc)
 			goto err_out;
-		bnxt_set_db(bp, &txr->tx_db, type, map_idx, ring->fw_ring_id);
 	}
 
 	for (i = 0; i < bp->rx_nr_rings; i++) {
-- 
2.30.1



* [PATCH net-next v4 04/10] bnxt_en: Refactor completion ring free routine
From: Michael Chan @ 2025-02-08 20:29 UTC
  To: davem
  Cc: netdev, edumazet, kuba, pabeni, andrew+netdev, pavan.chebbi,
	andrew.gospodarek, michal.swiatkowski, helgaas, horms,
	Somnath Kotur, Kalesh AP

From: Somnath Kotur <somnath.kotur@broadcom.com>

Add a wrapper routine to free L2 completion rings.  This will be
useful later in the series.

Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 26 ++++++++++++++---------
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 8ab7345acb0a..52d4dc222759 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -7405,6 +7405,20 @@ static void bnxt_hwrm_rx_agg_ring_free(struct bnxt *bp,
 	bp->grp_info[grp_idx].agg_fw_ring_id = INVALID_HW_RING_ID;
 }
 
+static void bnxt_hwrm_cp_ring_free(struct bnxt *bp,
+				   struct bnxt_cp_ring_info *cpr)
+{
+	struct bnxt_ring_struct *ring;
+
+	ring = &cpr->cp_ring_struct;
+	if (ring->fw_ring_id == INVALID_HW_RING_ID)
+		return;
+
+	hwrm_ring_free_send_msg(bp, ring, RING_FREE_REQ_RING_TYPE_L2_CMPL,
+				INVALID_HW_RING_ID);
+	ring->fw_ring_id = INVALID_HW_RING_ID;
+}
+
 static void bnxt_hwrm_ring_free(struct bnxt *bp, bool close_path)
 {
 	u32 type;
@@ -7450,17 +7464,9 @@ static void bnxt_hwrm_ring_free(struct bnxt *bp, bool close_path)
 		struct bnxt_ring_struct *ring;
 		int j;
 
-		for (j = 0; j < cpr->cp_ring_count && cpr->cp_ring_arr; j++) {
-			struct bnxt_cp_ring_info *cpr2 = &cpr->cp_ring_arr[j];
+		for (j = 0; j < cpr->cp_ring_count && cpr->cp_ring_arr; j++)
+			bnxt_hwrm_cp_ring_free(bp, &cpr->cp_ring_arr[j]);
 
-			ring = &cpr2->cp_ring_struct;
-			if (ring->fw_ring_id == INVALID_HW_RING_ID)
-				continue;
-			hwrm_ring_free_send_msg(bp, ring,
-						RING_FREE_REQ_RING_TYPE_L2_CMPL,
-						INVALID_HW_RING_ID);
-			ring->fw_ring_id = INVALID_HW_RING_ID;
-		}
 		ring = &cpr->cp_ring_struct;
 		if (ring->fw_ring_id != INVALID_HW_RING_ID) {
 			hwrm_ring_free_send_msg(bp, ring, type,
-- 
2.30.1



* [PATCH net-next v4 05/10] bnxt_en: Refactor bnxt_free_tx_rings() to free per TX ring
From: Michael Chan @ 2025-02-08 20:29 UTC
  To: davem
  Cc: netdev, edumazet, kuba, pabeni, andrew+netdev, pavan.chebbi,
	andrew.gospodarek, michal.swiatkowski, helgaas, horms,
	Somnath Kotur

From: Somnath Kotur <somnath.kotur@broadcom.com>

Modify bnxt_free_tx_rings() to free the skbs per TX ring.
This will be useful later in the series.

Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 115 ++++++++++++----------
 1 file changed, 61 insertions(+), 54 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 52d4dc222759..453f52648145 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -3314,74 +3314,81 @@ static int bnxt_poll_p5(struct napi_struct *napi, int budget)
 	return work_done;
 }
 
-static void bnxt_free_tx_skbs(struct bnxt *bp)
+static void bnxt_free_one_tx_ring_skbs(struct bnxt *bp,
+				       struct bnxt_tx_ring_info *txr, int idx)
 {
 	int i, max_idx;
 	struct pci_dev *pdev = bp->pdev;
 
-	if (!bp->tx_ring)
-		return;
-
 	max_idx = bp->tx_nr_pages * TX_DESC_CNT;
-	for (i = 0; i < bp->tx_nr_rings; i++) {
-		struct bnxt_tx_ring_info *txr = &bp->tx_ring[i];
-		int j;
 
-		if (!txr->tx_buf_ring)
+	for (i = 0; i < max_idx;) {
+		struct bnxt_sw_tx_bd *tx_buf = &txr->tx_buf_ring[i];
+		struct sk_buff *skb;
+		int j, last;
+
+		if (idx  < bp->tx_nr_rings_xdp &&
+		    tx_buf->action == XDP_REDIRECT) {
+			dma_unmap_single(&pdev->dev,
+					 dma_unmap_addr(tx_buf, mapping),
+					 dma_unmap_len(tx_buf, len),
+					 DMA_TO_DEVICE);
+			xdp_return_frame(tx_buf->xdpf);
+			tx_buf->action = 0;
+			tx_buf->xdpf = NULL;
+			i++;
 			continue;
+		}
 
-		for (j = 0; j < max_idx;) {
-			struct bnxt_sw_tx_bd *tx_buf = &txr->tx_buf_ring[j];
-			struct sk_buff *skb;
-			int k, last;
-
-			if (i < bp->tx_nr_rings_xdp &&
-			    tx_buf->action == XDP_REDIRECT) {
-				dma_unmap_single(&pdev->dev,
-					dma_unmap_addr(tx_buf, mapping),
-					dma_unmap_len(tx_buf, len),
-					DMA_TO_DEVICE);
-				xdp_return_frame(tx_buf->xdpf);
-				tx_buf->action = 0;
-				tx_buf->xdpf = NULL;
-				j++;
-				continue;
-			}
+		skb = tx_buf->skb;
+		if (!skb) {
+			i++;
+			continue;
+		}
 
-			skb = tx_buf->skb;
-			if (!skb) {
-				j++;
-				continue;
-			}
+		tx_buf->skb = NULL;
 
-			tx_buf->skb = NULL;
+		if (tx_buf->is_push) {
+			dev_kfree_skb(skb);
+			i += 2;
+			continue;
+		}
 
-			if (tx_buf->is_push) {
-				dev_kfree_skb(skb);
-				j += 2;
-				continue;
-			}
+		dma_unmap_single(&pdev->dev,
+				 dma_unmap_addr(tx_buf, mapping),
+				 skb_headlen(skb),
+				 DMA_TO_DEVICE);
 
-			dma_unmap_single(&pdev->dev,
-					 dma_unmap_addr(tx_buf, mapping),
-					 skb_headlen(skb),
-					 DMA_TO_DEVICE);
+		last = tx_buf->nr_frags;
+		i += 2;
+		for (j = 0; j < last; j++, i++) {
+			int ring_idx = i & bp->tx_ring_mask;
+			skb_frag_t *frag = &skb_shinfo(skb)->frags[j];
 
-			last = tx_buf->nr_frags;
-			j += 2;
-			for (k = 0; k < last; k++, j++) {
-				int ring_idx = j & bp->tx_ring_mask;
-				skb_frag_t *frag = &skb_shinfo(skb)->frags[k];
-
-				tx_buf = &txr->tx_buf_ring[ring_idx];
-				dma_unmap_page(
-					&pdev->dev,
-					dma_unmap_addr(tx_buf, mapping),
-					skb_frag_size(frag), DMA_TO_DEVICE);
-			}
-			dev_kfree_skb(skb);
+			tx_buf = &txr->tx_buf_ring[ring_idx];
+			dma_unmap_page(&pdev->dev,
+				       dma_unmap_addr(tx_buf, mapping),
+				       skb_frag_size(frag), DMA_TO_DEVICE);
 		}
-		netdev_tx_reset_queue(netdev_get_tx_queue(bp->dev, i));
+		dev_kfree_skb(skb);
+	}
+	netdev_tx_reset_queue(netdev_get_tx_queue(bp->dev, idx));
+}
+
+static void bnxt_free_tx_skbs(struct bnxt *bp)
+{
+	int i;
+
+	if (!bp->tx_ring)
+		return;
+
+	for (i = 0; i < bp->tx_nr_rings; i++) {
+		struct bnxt_tx_ring_info *txr = &bp->tx_ring[i];
+
+		if (!txr->tx_buf_ring)
+			continue;
+
+		bnxt_free_one_tx_ring_skbs(bp, txr, i);
 	}
 }
 
-- 
2.30.1



* [PATCH net-next v4 06/10] bnxt_en: Refactor RX/RX AGG ring parameters setup for P5_PLUS
From: Michael Chan @ 2025-02-08 20:29 UTC
  To: davem
  Cc: netdev, edumazet, kuba, pabeni, andrew+netdev, pavan.chebbi,
	andrew.gospodarek, michal.swiatkowski, helgaas, horms,
	Hongguang Gao, Ajit Khaparde

There is some common code for setting up RX and RX AGG ring allocation
parameters for P5_PLUS chips.  Refactor the logic into a new function.

Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 58 +++++++++++------------
 1 file changed, 28 insertions(+), 30 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 453f52648145..ac63d3feaa1d 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -6944,6 +6944,28 @@ static void bnxt_hwrm_ring_grp_free(struct bnxt *bp)
 	hwrm_req_drop(bp, req);
 }
 
+static void bnxt_set_rx_ring_params_p5(struct bnxt *bp, u32 ring_type,
+				       struct hwrm_ring_alloc_input *req,
+				       struct bnxt_ring_struct *ring)
+{
+	struct bnxt_ring_grp_info *grp_info = &bp->grp_info[ring->grp_idx];
+	u32 enables = RING_ALLOC_REQ_ENABLES_RX_BUF_SIZE_VALID;
+
+	if (ring_type == HWRM_RING_ALLOC_AGG) {
+		req->ring_type = RING_ALLOC_REQ_RING_TYPE_RX_AGG;
+		req->rx_ring_id = cpu_to_le16(grp_info->rx_fw_ring_id);
+		req->rx_buf_size = cpu_to_le16(BNXT_RX_PAGE_SIZE);
+		enables |= RING_ALLOC_REQ_ENABLES_RX_RING_ID_VALID;
+	} else {
+		req->rx_buf_size = cpu_to_le16(bp->rx_buf_use_size);
+		if (NET_IP_ALIGN == 2)
+			req->flags =
+				cpu_to_le16(RING_ALLOC_REQ_FLAGS_RX_SOP_PAD);
+	}
+	req->stat_ctx_id = cpu_to_le32(grp_info->fw_stats_ctx);
+	req->enables |= cpu_to_le32(enables);
+}
+
 static int hwrm_ring_alloc_send_msg(struct bnxt *bp,
 				    struct bnxt_ring_struct *ring,
 				    u32 ring_type, u32 map_index)
@@ -6995,37 +7017,13 @@ static int hwrm_ring_alloc_send_msg(struct bnxt *bp,
 		break;
 	}
 	case HWRM_RING_ALLOC_RX:
-		req->ring_type = RING_ALLOC_REQ_RING_TYPE_RX;
-		req->length = cpu_to_le32(bp->rx_ring_mask + 1);
-		if (bp->flags & BNXT_FLAG_CHIP_P5_PLUS) {
-			u16 flags = 0;
-
-			/* Association of rx ring with stats context */
-			grp_info = &bp->grp_info[ring->grp_idx];
-			req->rx_buf_size = cpu_to_le16(bp->rx_buf_use_size);
-			req->stat_ctx_id = cpu_to_le32(grp_info->fw_stats_ctx);
-			req->enables |= cpu_to_le32(
-				RING_ALLOC_REQ_ENABLES_RX_BUF_SIZE_VALID);
-			if (NET_IP_ALIGN == 2)
-				flags = RING_ALLOC_REQ_FLAGS_RX_SOP_PAD;
-			req->flags = cpu_to_le16(flags);
-		}
-		break;
 	case HWRM_RING_ALLOC_AGG:
-		if (bp->flags & BNXT_FLAG_CHIP_P5_PLUS) {
-			req->ring_type = RING_ALLOC_REQ_RING_TYPE_RX_AGG;
-			/* Association of agg ring with rx ring */
-			grp_info = &bp->grp_info[ring->grp_idx];
-			req->rx_ring_id = cpu_to_le16(grp_info->rx_fw_ring_id);
-			req->rx_buf_size = cpu_to_le16(BNXT_RX_PAGE_SIZE);
-			req->stat_ctx_id = cpu_to_le32(grp_info->fw_stats_ctx);
-			req->enables |= cpu_to_le32(
-				RING_ALLOC_REQ_ENABLES_RX_RING_ID_VALID |
-				RING_ALLOC_REQ_ENABLES_RX_BUF_SIZE_VALID);
-		} else {
-			req->ring_type = RING_ALLOC_REQ_RING_TYPE_RX;
-		}
-		req->length = cpu_to_le32(bp->rx_agg_ring_mask + 1);
+		req->ring_type = RING_ALLOC_REQ_RING_TYPE_RX;
+		req->length = (ring_type == HWRM_RING_ALLOC_RX) ?
+			      cpu_to_le32(bp->rx_ring_mask + 1) :
+			      cpu_to_le32(bp->rx_agg_ring_mask + 1);
+		if (bp->flags & BNXT_FLAG_CHIP_P5_PLUS)
+			bnxt_set_rx_ring_params_p5(bp, ring_type, req, ring);
 		break;
 	case HWRM_RING_ALLOC_CMPL:
 		req->ring_type = RING_ALLOC_REQ_RING_TYPE_L2_CMPL;
-- 
2.30.1



* [PATCH net-next v4 07/10] bnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG rings
From: Michael Chan @ 2025-02-08 20:29 UTC
  To: davem
  Cc: netdev, edumazet, kuba, pabeni, andrew+netdev, pavan.chebbi,
	andrew.gospodarek, michal.swiatkowski, helgaas, horms,
	Hongguang Gao, Ajit Khaparde

Newer firmware can use the NQ ring ID associated with each RX/RX AGG
ring to enable PCIe Steering Tags on P5_PLUS chips.  When allocating
RX/RX AGG rings, pass along the NQ ring ID for the firmware to use.
This information helps optimize DMA writes by directing them to the
cache closer to the CPU consuming the data, potentially improving
processing speed.  This change is backward-compatible with older
firmware, which will simply disregard the information.

Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index ac63d3feaa1d..c6cf575af53f 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -6949,7 +6949,8 @@ static void bnxt_set_rx_ring_params_p5(struct bnxt *bp, u32 ring_type,
 				       struct bnxt_ring_struct *ring)
 {
 	struct bnxt_ring_grp_info *grp_info = &bp->grp_info[ring->grp_idx];
-	u32 enables = RING_ALLOC_REQ_ENABLES_RX_BUF_SIZE_VALID;
+	u32 enables = RING_ALLOC_REQ_ENABLES_RX_BUF_SIZE_VALID |
+		      RING_ALLOC_REQ_ENABLES_NQ_RING_ID_VALID;
 
 	if (ring_type == HWRM_RING_ALLOC_AGG) {
 		req->ring_type = RING_ALLOC_REQ_RING_TYPE_RX_AGG;
@@ -6963,6 +6964,7 @@ static void bnxt_set_rx_ring_params_p5(struct bnxt *bp, u32 ring_type,
 				cpu_to_le16(RING_ALLOC_REQ_FLAGS_RX_SOP_PAD);
 	}
 	req->stat_ctx_id = cpu_to_le32(grp_info->fw_stats_ctx);
+	req->nq_ring_id = cpu_to_le16(grp_info->cp_fw_ring_id);
 	req->enables |= cpu_to_le32(enables);
 }
 
-- 
2.30.1



* [PATCH net-next v4 08/10] bnxt_en: Reallocate RX completion ring for TPH support
From: Michael Chan @ 2025-02-08 20:29 UTC
  To: davem
  Cc: netdev, edumazet, kuba, pabeni, andrew+netdev, pavan.chebbi,
	andrew.gospodarek, michal.swiatkowski, helgaas, horms,
	Somnath Kotur, David Wei

From: Somnath Kotur <somnath.kotur@broadcom.com>

In order to program the correct Steering Tag during an IRQ affinity
change, we need to free and re-allocate the RX completion ring during
queue_restart.  If TPH is enabled, call the FW to free the RX
completion ring and clear the ring entries in queue_stop(), then
re-allocate it in queue_start().  Note that TPH mode is not enabled
in this patch and will be enabled later in the patch series.

While modifying bnxt_queue_stop(), remove the unnecessary zeroing of
rxr->rx_next_cons.  It gets overwritten by the clone in
bnxt_queue_start().

Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
Cc: David Wei <dw@davidwei.uk>

v3: Only free/allocate the RX cmpl ring when TPH is enabled.
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 31 +++++++++++++++++++++--
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  2 ++
 2 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index c6cf575af53f..019a8433b0d6 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -7426,6 +7426,19 @@ static void bnxt_hwrm_cp_ring_free(struct bnxt *bp,
 	ring->fw_ring_id = INVALID_HW_RING_ID;
 }
 
+static void bnxt_clear_one_cp_ring(struct bnxt *bp, struct bnxt_cp_ring_info *cpr)
+{
+	struct bnxt_ring_struct *ring = &cpr->cp_ring_struct;
+	int i, size = ring->ring_mem.page_size;
+
+	cpr->cp_raw_cons = 0;
+	cpr->toggle = 0;
+
+	for (i = 0; i < bp->cp_nr_pages; i++)
+		if (cpr->cp_desc_ring[i])
+			memset(cpr->cp_desc_ring[i], 0, size);
+}
+
 static void bnxt_hwrm_ring_free(struct bnxt *bp, bool close_path)
 {
 	u32 type;
@@ -15645,9 +15658,16 @@ static int bnxt_queue_start(struct net_device *dev, void *qmem, int idx)
 	rc = bnxt_hwrm_rx_ring_alloc(bp, rxr);
 	if (rc)
 		return rc;
+
+	if (bp->tph_mode) {
+		rc = bnxt_hwrm_cp_ring_alloc_p5(bp, rxr->rx_cpr);
+		if (rc)
+			goto err_free_hwrm_rx_ring;
+	}
+
 	rc = bnxt_hwrm_rx_agg_ring_alloc(bp, rxr);
 	if (rc)
-		goto err_free_hwrm_rx_ring;
+		goto err_free_hwrm_cp_ring;
 
 	bnxt_db_write(bp, &rxr->rx_db, rxr->rx_prod);
 	if (bp->flags & BNXT_FLAG_AGG_RINGS)
@@ -15672,6 +15692,9 @@ static int bnxt_queue_start(struct net_device *dev, void *qmem, int idx)
 
 	return 0;
 
+err_free_hwrm_cp_ring:
+	if (bp->tph_mode)
+		bnxt_hwrm_cp_ring_free(bp, rxr->rx_cpr);
 err_free_hwrm_rx_ring:
 	bnxt_hwrm_rx_ring_free(bp, rxr, false);
 	return rc;
@@ -15696,11 +15719,15 @@ static int bnxt_queue_stop(struct net_device *dev, void *qmem, int idx)
 	cancel_work_sync(&rxr->bnapi->cp_ring.dim.work);
 	bnxt_hwrm_rx_ring_free(bp, rxr, false);
 	bnxt_hwrm_rx_agg_ring_free(bp, rxr, false);
-	rxr->rx_next_cons = 0;
 	page_pool_disable_direct_recycling(rxr->page_pool);
 	if (bnxt_separate_head_pool())
 		page_pool_disable_direct_recycling(rxr->head_pool);
 
+	if (bp->tph_mode) {
+		bnxt_hwrm_cp_ring_free(bp, rxr->rx_cpr);
+		bnxt_clear_one_cp_ring(bp, rxr->rx_cpr);
+	}
+
 	memcpy(qmem, rxr, sizeof(*rxr));
 	bnxt_init_rx_ring_struct(bp, qmem);
 
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index db0be469a3db..4e20878e7714 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -2410,6 +2410,8 @@ struct bnxt {
 	u8			max_q;
 	u8			num_tc;
 
+	u8			tph_mode;
+
 	unsigned int		current_interval;
 #define BNXT_TIMER_INTERVAL	HZ
 
-- 
2.30.1



* [PATCH net-next v4 09/10] bnxt_en: Extend queue stop/start for TX rings
From: Michael Chan @ 2025-02-08 20:29 UTC
  To: davem
  Cc: netdev, edumazet, kuba, pabeni, andrew+netdev, pavan.chebbi,
	andrew.gospodarek, michal.swiatkowski, helgaas, horms,
	Somnath Kotur, Ajit Khaparde, David Wei

From: Somnath Kotur <somnath.kotur@broadcom.com>

In order to use queue_stop/queue_start to support the new Steering
Tags, we need to free the TX ring and TX completion ring if it is a
combined channel with TX/RX sharing the same NAPI.  Otherwise, TX
completions will not have the updated Steering Tag.  If TPH is not
enabled, we just stop the TX ring without freeing the TX/TX cmpl
rings.  With that we can now add napi_disable() and napi_enable()
during queue_stop()/queue_start().  This will guarantee that NAPI
will stop processing the completion entries in case there are
additional pending entries in the completion rings after queue_stop().

There could be some NQEs sitting unprocessed while NAPI is disabled,
thereby leaving the NQ unarmed.  Explicitly re-arm the NQ after
napi_enable() in queue_start() so that NAPI will resume properly.

Error handling in bnxt_queue_start() requires a reset.  If a TX
ring cannot be allocated or initialized properly, it will cause a
TX timeout.  The reset will also free any partially allocated
rings.

Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
Cc: David Wei <dw@davidwei.uk>

v3:
Fix build bot warning.
Only free TX/TX cmpl rings when TPH is enabled.

v2:
Add reset for error handling in queue_start().
Fix compile error.

Discussion about adding napi_disable()/napi_enable():

https://lore.kernel.org/netdev/5336d624-8d8b-40a6-b732-b020e4a119a2@davidwei.uk/#t
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 132 ++++++++++++++++++++--
 1 file changed, 122 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 019a8433b0d6..fee9baff9e5a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -7368,6 +7368,22 @@ static int hwrm_ring_free_send_msg(struct bnxt *bp,
 	return 0;
 }
 
+static void bnxt_hwrm_tx_ring_free(struct bnxt *bp, struct bnxt_tx_ring_info *txr,
+				   bool close_path)
+{
+	struct bnxt_ring_struct *ring = &txr->tx_ring_struct;
+	u32 cmpl_ring_id;
+
+	if (ring->fw_ring_id == INVALID_HW_RING_ID)
+		return;
+
+	cmpl_ring_id = close_path ? bnxt_cp_ring_for_tx(bp, txr) :
+		       INVALID_HW_RING_ID;
+	hwrm_ring_free_send_msg(bp, ring, RING_FREE_REQ_RING_TYPE_TX,
+				cmpl_ring_id);
+	ring->fw_ring_id = INVALID_HW_RING_ID;
+}
+
 static void bnxt_hwrm_rx_ring_free(struct bnxt *bp,
 				   struct bnxt_rx_ring_info *rxr,
 				   bool close_path)
@@ -11274,6 +11290,75 @@ int bnxt_reserve_rings(struct bnxt *bp, bool irq_re_init)
 	return 0;
 }
 
+static void bnxt_tx_queue_stop(struct bnxt *bp, int idx)
+{
+	struct bnxt_tx_ring_info *txr;
+	struct netdev_queue *txq;
+	struct bnxt_napi *bnapi;
+	int i;
+
+	bnapi = bp->bnapi[idx];
+	bnxt_for_each_napi_tx(i, bnapi, txr) {
+		WRITE_ONCE(txr->dev_state, BNXT_DEV_STATE_CLOSING);
+		synchronize_net();
+
+		if (!(bnapi->flags & BNXT_NAPI_FLAG_XDP)) {
+			txq = netdev_get_tx_queue(bp->dev, txr->txq_index);
+			if (txq) {
+				__netif_tx_lock_bh(txq);
+				netif_tx_stop_queue(txq);
+				__netif_tx_unlock_bh(txq);
+			}
+		}
+
+		if (!bp->tph_mode)
+			continue;
+
+		bnxt_hwrm_tx_ring_free(bp, txr, true);
+		bnxt_hwrm_cp_ring_free(bp, txr->tx_cpr);
+		bnxt_free_one_tx_ring_skbs(bp, txr, txr->txq_index);
+		bnxt_clear_one_cp_ring(bp, txr->tx_cpr);
+	}
+}
+
+static int bnxt_tx_queue_start(struct bnxt *bp, int idx)
+{
+	struct bnxt_tx_ring_info *txr;
+	struct netdev_queue *txq;
+	struct bnxt_napi *bnapi;
+	int rc, i;
+
+	bnapi = bp->bnapi[idx];
+	bnxt_for_each_napi_tx(i, bnapi, txr) {
+		if (!bp->tph_mode)
+			goto start_tx;
+
+		rc = bnxt_hwrm_cp_ring_alloc_p5(bp, txr->tx_cpr);
+		if (rc)
+			return rc;
+
+		rc = bnxt_hwrm_tx_ring_alloc(bp, txr, false);
+		if (rc)
+			return rc;
+
+		txr->tx_prod = 0;
+		txr->tx_cons = 0;
+		txr->tx_hw_cons = 0;
+start_tx:
+		WRITE_ONCE(txr->dev_state, 0);
+		synchronize_net();
+
+		if (bnapi->flags & BNXT_NAPI_FLAG_XDP)
+			continue;
+
+		txq = netdev_get_tx_queue(bp->dev, txr->txq_index);
+		if (txq)
+			netif_tx_start_queue(txq);
+	}
+
+	return 0;
+}
+
 static void bnxt_free_irq(struct bnxt *bp)
 {
 	struct bnxt_irq *irq;
@@ -15638,6 +15723,7 @@ static int bnxt_queue_start(struct net_device *dev, void *qmem, int idx)
 	struct bnxt_rx_ring_info *rxr, *clone;
 	struct bnxt_cp_ring_info *cpr;
 	struct bnxt_vnic_info *vnic;
+	struct bnxt_napi *bnapi;
 	int i, rc;
 
 	rxr = &bp->rx_ring[idx];
@@ -15655,27 +15741,42 @@ static int bnxt_queue_start(struct net_device *dev, void *qmem, int idx)
 
 	bnxt_copy_rx_ring(bp, rxr, clone);
 
+	bnapi = rxr->bnapi;
+	cpr = &bnapi->cp_ring;
 	rc = bnxt_hwrm_rx_ring_alloc(bp, rxr);
 	if (rc)
-		return rc;
+		goto err_reset_rx;
 
 	if (bp->tph_mode) {
 		rc = bnxt_hwrm_cp_ring_alloc_p5(bp, rxr->rx_cpr);
 		if (rc)
-			goto err_free_hwrm_rx_ring;
+			goto err_reset_rx;
 	}
 
 	rc = bnxt_hwrm_rx_agg_ring_alloc(bp, rxr);
 	if (rc)
-		goto err_free_hwrm_cp_ring;
+		goto err_reset_rx;
 
 	bnxt_db_write(bp, &rxr->rx_db, rxr->rx_prod);
 	if (bp->flags & BNXT_FLAG_AGG_RINGS)
 		bnxt_db_write(bp, &rxr->rx_agg_db, rxr->rx_agg_prod);
 
-	cpr = &rxr->bnapi->cp_ring;
 	cpr->sw_stats->rx.rx_resets++;
 
+	if (bp->flags & BNXT_FLAG_SHARED_RINGS) {
+		cpr->sw_stats->tx.tx_resets++;
+		rc = bnxt_tx_queue_start(bp, idx);
+		if (rc) {
+			netdev_warn(bp->dev,
+				    "tx queue restart failed: rc=%d\n", rc);
+			bnapi->tx_fault = 1;
+			goto err_reset;
+		}
+	}
+
+	napi_enable(&bnapi->napi);
+	bnxt_db_nq_arm(bp, &cpr->cp_db, cpr->cp_raw_cons);
+
 	for (i = 0; i <= BNXT_VNIC_NTUPLE; i++) {
 		vnic = &bp->vnic_info[i];
 
@@ -15692,11 +15793,12 @@ static int bnxt_queue_start(struct net_device *dev, void *qmem, int idx)
 
 	return 0;
 
-err_free_hwrm_cp_ring:
-	if (bp->tph_mode)
-		bnxt_hwrm_cp_ring_free(bp, rxr->rx_cpr);
-err_free_hwrm_rx_ring:
-	bnxt_hwrm_rx_ring_free(bp, rxr, false);
+err_reset_rx:
+	rxr->bnapi->in_reset = true;
+err_reset:
+	napi_enable(&bnapi->napi);
+	bnxt_db_nq_arm(bp, &cpr->cp_db, cpr->cp_raw_cons);
+	bnxt_reset_task(bp, true);
 	return rc;
 }
 
@@ -15704,7 +15806,9 @@ static int bnxt_queue_stop(struct net_device *dev, void *qmem, int idx)
 {
 	struct bnxt *bp = netdev_priv(dev);
 	struct bnxt_rx_ring_info *rxr;
+	struct bnxt_cp_ring_info *cpr;
 	struct bnxt_vnic_info *vnic;
+	struct bnxt_napi *bnapi;
 	int i;
 
 	for (i = 0; i <= BNXT_VNIC_NTUPLE; i++) {
@@ -15716,17 +15820,25 @@ static int bnxt_queue_stop(struct net_device *dev, void *qmem, int idx)
 	/* Make sure NAPI sees that the VNIC is disabled */
 	synchronize_net();
 	rxr = &bp->rx_ring[idx];
-	cancel_work_sync(&rxr->bnapi->cp_ring.dim.work);
+	bnapi = rxr->bnapi;
+	cpr = &bnapi->cp_ring;
+	cancel_work_sync(&cpr->dim.work);
 	bnxt_hwrm_rx_ring_free(bp, rxr, false);
 	bnxt_hwrm_rx_agg_ring_free(bp, rxr, false);
 	page_pool_disable_direct_recycling(rxr->page_pool);
 	if (bnxt_separate_head_pool())
 		page_pool_disable_direct_recycling(rxr->head_pool);
 
+	if (bp->flags & BNXT_FLAG_SHARED_RINGS)
+		bnxt_tx_queue_stop(bp, idx);
+
+	napi_disable(&bnapi->napi);
+
 	if (bp->tph_mode) {
 		bnxt_hwrm_cp_ring_free(bp, rxr->rx_cpr);
 		bnxt_clear_one_cp_ring(bp, rxr->rx_cpr);
 	}
+	bnxt_db_nq(bp, &cpr->cp_db, cpr->cp_raw_cons);
 
 	memcpy(qmem, rxr, sizeof(*rxr));
 	bnxt_init_rx_ring_struct(bp, qmem);
-- 
2.30.1



* [PATCH net-next v4 10/10] bnxt_en: Add TPH support in BNXT driver
From: Michael Chan @ 2025-02-08 20:29 UTC
  To: davem
  Cc: netdev, edumazet, kuba, pabeni, andrew+netdev, pavan.chebbi,
	andrew.gospodarek, michal.swiatkowski, helgaas, horms,
	Manoj Panicker, Somnath Kotur, Wei Huang, Ajit Khaparde

From: Manoj Panicker <manoj.panicker2@amd.com>

Add TPH support to the Broadcom BNXT device driver. This allows the
driver to utilize TPH functions for retrieving and configuring Steering
Tags when changing interrupt affinity. With compatible NIC firmware,
network traffic will be tagged correctly with Steering Tags, resulting
in significant memory bandwidth savings and other advantages as
demonstrated by real network benchmarks on TPH-capable platforms.
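
In short, the affinity-notify path added below boils down to the
following (condensed from the diff; locking, bounds checks and error
handling omitted):

	u16 tag;

	/* IRQ affinity changed: look up the Steering Tag for the new
	 * CPU, program it into the device's ST table, then restart the
	 * RX queue so its rings are re-created with the new tag.
	 */
	pcie_tph_get_cpu_st(irq->bp->pdev, TPH_MEM_TYPE_VM,
			    cpumask_first(irq->cpu_mask), &tag);
	pcie_tph_set_st_entry(irq->bp->pdev, irq->msix_nr, tag);
	netdev_rx_queue_restart(irq->bp->dev, irq->ring_nr);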

Co-developed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Co-developed-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Manoj Panicker <manoj.panicker2@amd.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
Cc: Bjorn Helgaas <helgaas@kernel.org>

v3:
Add MODULE_IMPORT_NS("NETDEV_INTERNAL")

Previous driver series fixing rtnl_lock and empty release function:

https://lore.kernel.org/netdev/20241115200412.1340286-1-wei.huang2@amd.com/

v5 of the PCI series using netdev_rx_queue_restart():

https://lore.kernel.org/netdev/20240916205103.3882081-5-wei.huang2@amd.com/

v1 of the PCI series using open/close:

https://lore.kernel.org/netdev/20240509162741.1937586-9-wei.huang2@amd.com/
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 106 ++++++++++++++++++++++
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |   5 +
 2 files changed, 111 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index fee9baff9e5a..2b7df91840fd 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -55,6 +55,8 @@
 #include <net/page_pool/helpers.h>
 #include <linux/align.h>
 #include <net/netdev_queues.h>
+#include <net/netdev_rx_queue.h>
+#include <linux/pci-tph.h>
 
 #include "bnxt_hsi.h"
 #include "bnxt.h"
@@ -76,6 +78,7 @@
 #define BNXT_DEF_MSG_ENABLE	(NETIF_MSG_DRV | NETIF_MSG_HW | \
 				 NETIF_MSG_TX_ERR)
 
+MODULE_IMPORT_NS("NETDEV_INTERNAL");
 MODULE_LICENSE("GPL");
 MODULE_DESCRIPTION("Broadcom NetXtreme network driver");
 
@@ -11359,6 +11362,83 @@ static int bnxt_tx_queue_start(struct bnxt *bp, int idx)
 	return 0;
 }
 
+static void bnxt_irq_affinity_notify(struct irq_affinity_notify *notify,
+				     const cpumask_t *mask)
+{
+	struct bnxt_irq *irq;
+	u16 tag;
+	int err;
+
+	irq = container_of(notify, struct bnxt_irq, affinity_notify);
+
+	if (!irq->bp->tph_mode)
+		return;
+
+	cpumask_copy(irq->cpu_mask, mask);
+
+	if (irq->ring_nr >= irq->bp->rx_nr_rings)
+		return;
+
+	if (pcie_tph_get_cpu_st(irq->bp->pdev, TPH_MEM_TYPE_VM,
+				cpumask_first(irq->cpu_mask), &tag))
+		return;
+
+	if (pcie_tph_set_st_entry(irq->bp->pdev, irq->msix_nr, tag))
+		return;
+
+	rtnl_lock();
+	if (netif_running(irq->bp->dev)) {
+		err = netdev_rx_queue_restart(irq->bp->dev, irq->ring_nr);
+		if (err)
+			netdev_err(irq->bp->dev,
+				   "RX queue restart failed: err=%d\n", err);
+	}
+	rtnl_unlock();
+}
+
+static void bnxt_irq_affinity_release(struct kref *ref)
+{
+	struct irq_affinity_notify *notify =
+		container_of(ref, struct irq_affinity_notify, kref);
+	struct bnxt_irq *irq;
+
+	irq = container_of(notify, struct bnxt_irq, affinity_notify);
+
+	if (!irq->bp->tph_mode)
+		return;
+
+	if (pcie_tph_set_st_entry(irq->bp->pdev, irq->msix_nr, 0)) {
+		netdev_err(irq->bp->dev,
+			   "Setting ST=0 for MSIX entry %d failed\n",
+			   irq->msix_nr);
+		return;
+	}
+}
+
+static void bnxt_release_irq_notifier(struct bnxt_irq *irq)
+{
+	irq_set_affinity_notifier(irq->vector, NULL);
+}
+
+static void bnxt_register_irq_notifier(struct bnxt *bp, struct bnxt_irq *irq)
+{
+	struct irq_affinity_notify *notify;
+
+	irq->bp = bp;
+
+	/* Nothing to do if TPH is not enabled */
+	if (!bp->tph_mode)
+		return;
+
+	/* Register IRQ affinity notifier */
+	notify = &irq->affinity_notify;
+	notify->irq = irq->vector;
+	notify->notify = bnxt_irq_affinity_notify;
+	notify->release = bnxt_irq_affinity_release;
+
+	irq_set_affinity_notifier(irq->vector, notify);
+}
+
 static void bnxt_free_irq(struct bnxt *bp)
 {
 	struct bnxt_irq *irq;
@@ -11381,11 +11461,18 @@ static void bnxt_free_irq(struct bnxt *bp)
 				free_cpumask_var(irq->cpu_mask);
 				irq->have_cpumask = 0;
 			}
+
+			bnxt_release_irq_notifier(irq);
+
 			free_irq(irq->vector, bp->bnapi[i]);
 		}
 
 		irq->requested = 0;
 	}
+
+	/* Disable TPH support */
+	pcie_disable_tph(bp->pdev);
+	bp->tph_mode = 0;
 }
 
 static int bnxt_request_irq(struct bnxt *bp)
@@ -11405,6 +11492,12 @@ static int bnxt_request_irq(struct bnxt *bp)
 #ifdef CONFIG_RFS_ACCEL
 	rmap = bp->dev->rx_cpu_rmap;
 #endif
+
+	/* Enable TPH support as part of IRQ request */
+	rc = pcie_enable_tph(bp->pdev, PCI_TPH_ST_IV_MODE);
+	if (!rc)
+		bp->tph_mode = PCI_TPH_ST_IV_MODE;
+
 	for (i = 0, j = 0; i < bp->cp_nr_rings; i++) {
 		int map_idx = bnxt_cp_num_to_irq_num(bp, i);
 		struct bnxt_irq *irq = &bp->irq_tbl[map_idx];
@@ -11428,8 +11521,11 @@ static int bnxt_request_irq(struct bnxt *bp)
 
 		if (zalloc_cpumask_var(&irq->cpu_mask, GFP_KERNEL)) {
 			int numa_node = dev_to_node(&bp->pdev->dev);
+			u16 tag;
 
 			irq->have_cpumask = 1;
+			irq->msix_nr = map_idx;
+			irq->ring_nr = i;
 			cpumask_set_cpu(cpumask_local_spread(i, numa_node),
 					irq->cpu_mask);
 			rc = irq_update_affinity_hint(irq->vector, irq->cpu_mask);
@@ -11439,6 +11535,16 @@ static int bnxt_request_irq(struct bnxt *bp)
 					    irq->vector);
 				break;
 			}
+
+			bnxt_register_irq_notifier(bp, irq);
+
+			/* Init ST table entry */
+			if (pcie_tph_get_cpu_st(irq->bp->pdev, TPH_MEM_TYPE_VM,
+						cpumask_first(irq->cpu_mask),
+						&tag))
+				continue;
+
+			pcie_tph_set_st_entry(irq->bp->pdev, irq->msix_nr, tag);
 		}
 	}
 	return rc;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 4e20878e7714..e85b5ce94f58 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1234,6 +1234,11 @@ struct bnxt_irq {
 	u8		have_cpumask:1;
 	char		name[IFNAMSIZ + BNXT_IRQ_NAME_EXTRA];
 	cpumask_var_t	cpu_mask;
+
+	struct bnxt	*bp;
+	int		msix_nr;
+	int		ring_nr;
+	struct irq_affinity_notify affinity_notify;
 };
 
 #define HWRM_RING_ALLOC_TX	0x1
-- 
2.30.1



* Re: [PATCH net-next v4 09/10] bnxt_en: Extend queue stop/start for TX rings
From: Jakub Kicinski @ 2025-02-12  1:44 UTC
  To: Michael Chan
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, pavan.chebbi,
	andrew.gospodarek, michal.swiatkowski, helgaas, horms,
	Somnath Kotur, Ajit Khaparde, David Wei

On Sat,  8 Feb 2025 12:29:15 -0800 Michael Chan wrote:
> +		rc = bnxt_hwrm_cp_ring_alloc_p5(bp, txr->tx_cpr);
> +		if (rc)
> +			return rc;
> +
> +		rc = bnxt_hwrm_tx_ring_alloc(bp, txr, false);
> +		if (rc)
> +			return rc;

Under what circumstances can these alloc calls fail?
"alloc" sounds concerning in a start call.

> +		txr->tx_prod = 0;
> +		txr->tx_cons = 0;
> +		txr->tx_hw_cons = 0;

>  	cpr->sw_stats->rx.rx_resets++;
>  
> +	if (bp->flags & BNXT_FLAG_SHARED_RINGS) {
> +		cpr->sw_stats->tx.tx_resets++;

Is there a reason why queue op stop/start cycles are counted as resets?
IIUC previously only faults (~errors) would be counted as resets.
ifdown / ifup or ring reconfig (ethtool -L / -G) would not increment
resets. I think queue reconfig is more like ethtool -L than a fault.
It'd be more consistent with existing code not to increment these
counters.

> +		rc = bnxt_tx_queue_start(bp, idx);
> +		if (rc) {
> +			netdev_warn(bp->dev,
> +				    "tx queue restart failed: rc=%d\n", rc);
> +			bnapi->tx_fault = 1;
> +			goto err_reset;
> +		}
> +	}
> +
> +	napi_enable(&bnapi->napi);

Here you first start the queue then enable NAPI...

> +	bnxt_db_nq_arm(bp, &cpr->cp_db, cpr->cp_raw_cons);
> +
>  	for (i = 0; i <= BNXT_VNIC_NTUPLE; i++) {
>  		vnic = &bp->vnic_info[i];
>  

> @@ -15716,17 +15820,25 @@ static int bnxt_queue_stop(struct net_device *dev, void *qmem, int idx)
>  	/* Make sure NAPI sees that the VNIC is disabled */
>  	synchronize_net();
>  	rxr = &bp->rx_ring[idx];
> -	cancel_work_sync(&rxr->bnapi->cp_ring.dim.work);
> +	bnapi = rxr->bnapi;
> +	cpr = &bnapi->cp_ring;
> +	cancel_work_sync(&cpr->dim.work);
>  	bnxt_hwrm_rx_ring_free(bp, rxr, false);
>  	bnxt_hwrm_rx_agg_ring_free(bp, rxr, false);
>  	page_pool_disable_direct_recycling(rxr->page_pool);
>  	if (bnxt_separate_head_pool())
>  		page_pool_disable_direct_recycling(rxr->head_pool);
>  
> +	if (bp->flags & BNXT_FLAG_SHARED_RINGS)
> +		bnxt_tx_queue_stop(bp, idx);
> +
> +	napi_disable(&bnapi->napi);

... but here you do the opposite, and require extra synchronization 
in bnxt_tx_queue_stop() to set your magic flag, sync the NAPI etc.
Why can't the start and stop paths be the mirror image?


* Re: [PATCH net-next v4 09/10] bnxt_en: Extend queue stop/start for TX rings
From: Michael Chan @ 2025-02-12  2:31 UTC
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, pavan.chebbi,
	andrew.gospodarek, michal.swiatkowski, helgaas, horms,
	Somnath Kotur, Ajit Khaparde, David Wei

On Tue, Feb 11, 2025 at 5:44 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Sat,  8 Feb 2025 12:29:15 -0800 Michael Chan wrote:
> > +             rc = bnxt_hwrm_cp_ring_alloc_p5(bp, txr->tx_cpr);
> > +             if (rc)
> > +                     return rc;
> > +
> > +             rc = bnxt_hwrm_tx_ring_alloc(bp, txr, false);
> > +             if (rc)
> > +                     return rc;
>
> Under what circumstances can these alloc calls fail?
> "alloc" sounds concerning in a start call.

The ring has been previously reserved with FW, so it normally should
not fail.  I'll need to ask the FW team for some possible failure
scenarios.

>
> > +             txr->tx_prod = 0;
> > +             txr->tx_cons = 0;
> > +             txr->tx_hw_cons = 0;
>
> >       cpr->sw_stats->rx.rx_resets++;
> >
> > +     if (bp->flags & BNXT_FLAG_SHARED_RINGS) {
> > +             cpr->sw_stats->tx.tx_resets++;
>
> Is there a reason why queue op stop/start cycles are counted as resets?
> IIUC previously only faults (~errors) would be counted as resets.
> ifdown / ifup or ring reconfig (ethtool -L / -G) would not increment
> resets. I think queue reconfig is more like ethtool -L than a fault.
> It'd be more consistent with existing code not to increment these
> counters.

I think David's original code increments the rx_reset counter for
every queue_start.  We're just following that.  Maybe it came from the
original plan to use HWRM_RING_RESET to do the RX
queue_stop/queue_start.  We can remove the reset counters for all
queue_stop/queue_start if that makes more sense.

>
> > +             rc = bnxt_tx_queue_start(bp, idx);
> > +             if (rc) {
> > +                     netdev_warn(bp->dev,
> > +                                 "tx queue restart failed: rc=%d\n", rc);
> > +                     bnapi->tx_fault = 1;
> > +                     goto err_reset;
> > +             }
> > +     }
> > +
> > +     napi_enable(&bnapi->napi);
>
> Here you first start the queue then enable NAPI...
>
> > +     bnxt_db_nq_arm(bp, &cpr->cp_db, cpr->cp_raw_cons);
> > +
> >       for (i = 0; i <= BNXT_VNIC_NTUPLE; i++) {
> >               vnic = &bp->vnic_info[i];
> >
>
> > @@ -15716,17 +15820,25 @@ static int bnxt_queue_stop(struct net_device *dev, void *qmem, int idx)
> >       /* Make sure NAPI sees that the VNIC is disabled */
> >       synchronize_net();
> >       rxr = &bp->rx_ring[idx];
> > -     cancel_work_sync(&rxr->bnapi->cp_ring.dim.work);
> > +     bnapi = rxr->bnapi;
> > +     cpr = &bnapi->cp_ring;
> > +     cancel_work_sync(&cpr->dim.work);
> >       bnxt_hwrm_rx_ring_free(bp, rxr, false);
> >       bnxt_hwrm_rx_agg_ring_free(bp, rxr, false);
> >       page_pool_disable_direct_recycling(rxr->page_pool);
> >       if (bnxt_separate_head_pool())
> >               page_pool_disable_direct_recycling(rxr->head_pool);
> >
> > +     if (bp->flags & BNXT_FLAG_SHARED_RINGS)
> > +             bnxt_tx_queue_stop(bp, idx);
> > +
> > +     napi_disable(&bnapi->napi);
>
> ... but here you do the opposite, and require extra synchronization
> in bnxt_tx_queue_stop() to set your magic flag, sync the NAPI etc.
> Why can't the start and stop paths be the mirror image?

The ring free operation requires interrupt/NAPI to be working.  FW
signals the completion of the ring free command on the completion ring
associated with the ring we're freeing.  When we see this completion
during NAPI, it guarantees that this is the last DMA on that ring.
Only ring free FW commands are handled this way, requiring NAPI.
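
In code terms, queue_stop() must keep NAPI running across the ring
free calls, roughly (sketch; comment wording is mine):

	bnxt_hwrm_rx_ring_free(bp, rxr, false);	/* FW posts the ring-free
						 * completion on this
						 * ring's CQ; NAPI must
						 * still be enabled to
						 * consume it and observe
						 * the last DMA */
	...
	napi_disable(&bnapi->napi);		/* only after the free
						 * completions are reaped */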



* Re: [PATCH net-next v4 09/10] bnxt_en: Extend queue stop/start for TX rings
From: Jakub Kicinski @ 2025-02-12  2:43 UTC
  To: Michael Chan
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, pavan.chebbi,
	andrew.gospodarek, michal.swiatkowski, helgaas, horms,
	Somnath Kotur, Ajit Khaparde, David Wei

On Tue, 11 Feb 2025 18:31:21 -0800 Michael Chan wrote:
> On Tue, Feb 11, 2025 at 5:44 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > On Sat,  8 Feb 2025 12:29:15 -0800 Michael Chan wrote:  
> > > +             rc = bnxt_hwrm_cp_ring_alloc_p5(bp, txr->tx_cpr);
> > > +             if (rc)
> > > +                     return rc;
> > > +
> > > +             rc = bnxt_hwrm_tx_ring_alloc(bp, txr, false);
> > > +             if (rc)
> > > +                     return rc;  
> >
> > Under what circumstances can these alloc calls fail?
> > "alloc" sounds concerning in a start call.  
> 
> The ring has been previously reserved with FW, so it normally should
> not fail.  I'll need to ask the FW team for some possible failure
> scenarios.

Thanks, expectation is that start never fails.
If the FW team comes back with "should never happen if rings 
are reserved" please add a comment to that effect here. Since
this is one of very few implementations people may read it
and incorrectly assume that allocating is okay.
If the FW team comes back with a list of possible but unlikely
scenarios I'm afraid a rework will be needed.

> > >       cpr->sw_stats->rx.rx_resets++;
> > >
> > > +     if (bp->flags & BNXT_FLAG_SHARED_RINGS) {
> > > +             cpr->sw_stats->tx.tx_resets++;  
> >
> > Is there a reason why queue op stop/start cycles are counted as resets?
> > IIUC previously only faults (~errors) would be counted as resets.
> > ifdown / ifup or ring reconfig (ethtool -L / -G) would not increment
> > resets. I think queue reconfig is more like ethtool -L than a fault.
> > It'd be more consistent with existing code not to increment these
> > counters.  
> 
> I think David's original code increments the rx_reset counter for
> every queue_start.  We're just following that.  Maybe it came from the
> original plan to use HWRM_RING_RESET to do the RX
> queue_stop/queue_start.  We can remove the reset counters for all
> queue_stop/queue_start if that makes more sense.

I vote remove, just to be crystal clear.

> > > @@ -15716,17 +15820,25 @@ static int bnxt_queue_stop(struct net_device *dev, void *qmem, int idx)
> > >       /* Make sure NAPI sees that the VNIC is disabled */
> > >       synchronize_net();
> > >       rxr = &bp->rx_ring[idx];
> > > -     cancel_work_sync(&rxr->bnapi->cp_ring.dim.work);
> > > +     bnapi = rxr->bnapi;
> > > +     cpr = &bnapi->cp_ring;
> > > +     cancel_work_sync(&cpr->dim.work);
> > >       bnxt_hwrm_rx_ring_free(bp, rxr, false);
> > >       bnxt_hwrm_rx_agg_ring_free(bp, rxr, false);
> > >       page_pool_disable_direct_recycling(rxr->page_pool);
> > >       if (bnxt_separate_head_pool())
> > >               page_pool_disable_direct_recycling(rxr->head_pool);
> > >
> > > +     if (bp->flags & BNXT_FLAG_SHARED_RINGS)
> > > +             bnxt_tx_queue_stop(bp, idx);
> > > +
> > > +     napi_disable(&bnapi->napi);  
> >
> > ... but here you do the opposite, and require extra synchronization
> > in bnxt_tx_queue_stop() to set your magic flag, sync the NAPI etc.
> > Why can't the start and stop paths be the mirror image?  
> 
> The ring free operation requires interrupt/NAPI to be working.  FW
> signals the completion of the ring free command on the completion ring
> associated with the ring we're freeing.  When we see this completion
> during NAPI, it guarantees that this is the last DMA on that ring.
> Only ring free FW commands are handled this way, requiring NAPI.

Ugh, I feel like this was explained to me before, sorry.
Again, a comment in the code would go a long way for non-Broadcom
readers.


* Re: [PATCH net-next v4 09/10] bnxt_en: Extend queue stop/start for TX rings
From: Michael Chan @ 2025-02-12 22:59 UTC
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, pavan.chebbi,
	andrew.gospodarek, michal.swiatkowski, helgaas, horms,
	Somnath Kotur, Ajit Khaparde, David Wei

On Tue, Feb 11, 2025 at 6:43 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue, 11 Feb 2025 18:31:21 -0800 Michael Chan wrote:
> > The ring has been previously reserved with FW, so it normally should
> > not fail.  I'll need to ask the FW team for some possible failure
> > scenarios.
>
> Thanks, expectation is that start never fails.
> If the FW team comes back with "should never happen if rings
> are reserved" please add a comment to that effect here. Since
> this is one of very few implementations people may read it
> and incorrectly assume that allocating is okay.
> If the FW team comes back with a list of possible but unlikely
> scenarios I'm afraid a rework will be needed.
>

The FW team has confirmed that it should never fail in this case.  The
ring has been previously reserved and allocated.  Re-allocating it
with essentially the same parameters should never fail.  A comment to
that effect above the alloc calls, along these lines (wording
hypothetical, not from the posted patch), would capture it:
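
	/* The TX ring and its completion ring were previously
	 * reserved and allocated; per the FW team, re-allocating
	 * them with the same parameters will never fail, so
	 * queue_start is not expected to fail here.
	 */
	rc = bnxt_hwrm_cp_ring_alloc_p5(bp, txr->tx_cpr);

I will be sending V5 soon.  Thanks.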


