* [PATCH net-next v1 0/3] bnxt_en: implement netdev_queue_mgmt_ops
@ 2024-06-11 2:33 David Wei
2024-06-11 2:33 ` [PATCH net-next v1 1/3] bnxt_en: Add support to call FW to update a VNIC David Wei
` (3 more replies)
0 siblings, 4 replies; 13+ messages in thread
From: David Wei @ 2024-06-11 2:33 UTC (permalink / raw)
To: Michael Chan, Andy Gospodarek, Adrian Alvarado, Somnath Kotur,
netdev
Cc: Pavel Begunkov, Jakub Kicinski, David Ahern, David S. Miller,
Eric Dumazet, Paolo Abeni
Implement netdev_queue_mgmt_ops, added in [1], for bnxt. This will be used
in the io_uring ZC Rx patchset to configure queues with a custom page
pool with a special memory provider for zero copy support.
The first two patches prep the driver, while the final patch adds the
implementation.
This implementation can only reset Rx queues that are _not_ in the main
RSS context. That is, ethtool -X must be called to reserve a number of
queues outside of the main RSS context, and only these queues work with
netdev_queue_mgmt_ops. Otherwise, EOPNOTSUPP is returned.
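(For reference, the check each queue ndo applies in patch 3/3 boils down
to the snippet below; bnxt_get_max_rss_ring() is assumed here to report
the highest ring index referenced by the default RSS indirection table.)

	/* Rings covered by the default RSS context cannot be restarted. */
	if (bnxt_get_max_rss_ring(bp) >= idx)
		return -EOPNOTSUPP;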
I didn't include the netdev core API that consumes netdev_queue_mgmt_ops
because Mina is adding it in his devmem TCP series [2]. But I'm happy to
include it if folks want a user included with this series.
I tested this series on BCM957504-N1100FY4 with FW 229.1.123.0. I
manually injected failures at all the places that can return an errno
and confirmed that the device/queue is never left in a broken state.
[1]: https://lore.kernel.org/netdev/20240501232549.1327174-2-shailend@google.com/
[2]: https://lore.kernel.org/netdev/20240607005127.3078656-2-almasrymina@google.com/
David Wei (2):
bnxt_en: split rx ring helpers out from ring helpers
bnxt_en: implement queue API
Michael Chan (1):
bnxt_en: Add support to call FW to update a VNIC
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 623 +++++++++++++++---
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 3 +
drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h | 37 ++
3 files changed, 555 insertions(+), 108 deletions(-)
--
2.43.0
* [PATCH net-next v1 1/3] bnxt_en: Add support to call FW to update a VNIC
2024-06-11 2:33 [PATCH net-next v1 0/3] bnxt_en: implement netdev_queue_mgmt_ops David Wei
@ 2024-06-11 2:33 ` David Wei
2024-06-11 2:33 ` [PATCH net-next v1 2/3] bnxt_en: split rx ring helpers out from ring helpers David Wei
` (2 subsequent siblings)
3 siblings, 0 replies; 13+ messages in thread
From: David Wei @ 2024-06-11 2:33 UTC (permalink / raw)
To: Michael Chan, Andy Gospodarek, Adrian Alvarado, Somnath Kotur,
netdev
Cc: Pavel Begunkov, Jakub Kicinski, David Ahern, David S. Miller,
Eric Dumazet, Paolo Abeni
From: Michael Chan <michael.chan@broadcom.com>
Add the HWRM_VNIC_UPDATE message structures and the function to
send the message to firmware. This message can be used when
disabling and enabling a receive ring within a VNIC. The mru,
which is the maximum receive size of packets received by the
VNIC, can be updated.
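For illustration only, a minimal usage sketch modeled on the queue
stop/start paths added in patch 3/3 (error handling omitted; the
MTU-derived value matches what bnxt_hwrm_vnic_cfg() programs):

	/* Quiesce RX on a VNIC by forcing the MRU to 0 ... */
	vnic->mru = 0;
	bnxt_hwrm_vnic_update(bp, vnic, VNIC_UPDATE_REQ_ENABLES_MRU_VALID);

	/* ... and re-enable it by restoring the MTU-derived MRU. */
	vnic->mru = bp->dev->mtu + ETH_HLEN + VLAN_HLEN;
	bnxt_hwrm_vnic_update(bp, vnic, VNIC_UPDATE_REQ_ENABLES_MRU_VALID);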
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David Wei <dw@davidwei.uk>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 23 +++++++++++-
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 3 ++
drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h | 37 +++++++++++++++++++
3 files changed, 62 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 7dc00c0d8992..ebcf393f06dc 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -6452,7 +6452,8 @@ int bnxt_hwrm_vnic_cfg(struct bnxt *bp, struct bnxt_vnic_info *vnic)
req->dflt_ring_grp = cpu_to_le16(bp->grp_info[grp_idx].fw_grp_id);
req->lb_rule = cpu_to_le16(0xffff);
vnic_mru:
- req->mru = cpu_to_le16(bp->dev->mtu + ETH_HLEN + VLAN_HLEN);
+ vnic->mru = bp->dev->mtu + ETH_HLEN + VLAN_HLEN;
+ req->mru = cpu_to_le16(vnic->mru);
req->vnic_id = cpu_to_le16(vnic->fw_vnic_id);
#ifdef CONFIG_BNXT_SRIOV
@@ -9912,6 +9913,26 @@ static int __bnxt_setup_vnic(struct bnxt *bp, struct bnxt_vnic_info *vnic)
return rc;
}
+int bnxt_hwrm_vnic_update(struct bnxt *bp, struct bnxt_vnic_info *vnic,
+ u8 valid)
+{
+ struct hwrm_vnic_update_input *req;
+ int rc;
+
+ rc = hwrm_req_init(bp, req, HWRM_VNIC_UPDATE);
+ if (rc)
+ return rc;
+
+ req->vnic_id = cpu_to_le32(vnic->fw_vnic_id);
+
+ if (valid & VNIC_UPDATE_REQ_ENABLES_MRU_VALID)
+ req->mru = cpu_to_le16(vnic->mru);
+
+ req->enables = cpu_to_le32(valid);
+
+ return hwrm_req_send(bp, req);
+}
+
int bnxt_hwrm_vnic_rss_cfg_p5(struct bnxt *bp, struct bnxt_vnic_info *vnic)
{
int rc;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 656ab81c0272..fb989dd97db9 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1221,6 +1221,7 @@ struct bnxt_vnic_info {
#define BNXT_MAX_CTX_PER_VNIC 8
u16 fw_rss_cos_lb_ctx[BNXT_MAX_CTX_PER_VNIC];
u16 fw_l2_ctx_id;
+ u16 mru;
#define BNXT_MAX_UC_ADDRS 4
struct bnxt_l2_filter *l2_filters[BNXT_MAX_UC_ADDRS];
/* index 0 always dev_addr */
@@ -2753,6 +2754,8 @@ int bnxt_hwrm_free_wol_fltr(struct bnxt *bp);
int bnxt_hwrm_func_resc_qcaps(struct bnxt *bp, bool all);
int bnxt_hwrm_func_qcaps(struct bnxt *bp);
int bnxt_hwrm_fw_set_time(struct bnxt *);
+int bnxt_hwrm_vnic_update(struct bnxt *bp, struct bnxt_vnic_info *vnic,
+ u8 valid);
int bnxt_hwrm_vnic_rss_cfg_p5(struct bnxt *bp, struct bnxt_vnic_info *vnic);
int __bnxt_setup_vnic_p5(struct bnxt *bp, struct bnxt_vnic_info *vnic);
void bnxt_del_one_rss_ctx(struct bnxt *bp, struct bnxt_rss_ctx *rss_ctx,
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h
index 06ea86c80be1..abb1463b056c 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h
@@ -6469,6 +6469,43 @@ struct hwrm_vnic_alloc_output {
u8 valid;
};
+/* hwrm_vnic_update_input (size:256b/32B) */
+struct hwrm_vnic_update_input {
+ __le16 req_type;
+ __le16 cmpl_ring;
+ __le16 seq_id;
+ __le16 target_id;
+ __le64 resp_addr;
+ __le32 vnic_id;
+ __le32 enables;
+ #define VNIC_UPDATE_REQ_ENABLES_VNIC_STATE_VALID 0x1UL
+ #define VNIC_UPDATE_REQ_ENABLES_MRU_VALID 0x2UL
+ #define VNIC_UPDATE_REQ_ENABLES_METADATA_FORMAT_TYPE_VALID 0x4UL
+ u8 vnic_state;
+ #define VNIC_UPDATE_REQ_VNIC_STATE_NORMAL 0x0UL
+ #define VNIC_UPDATE_REQ_VNIC_STATE_DROP 0x1UL
+ #define VNIC_UPDATE_REQ_VNIC_STATE_LAST VNIC_UPDATE_REQ_VNIC_STATE_DROP
+ u8 metadata_format_type;
+ #define VNIC_UPDATE_REQ_METADATA_FORMAT_TYPE_0 0x0UL
+ #define VNIC_UPDATE_REQ_METADATA_FORMAT_TYPE_1 0x1UL
+ #define VNIC_UPDATE_REQ_METADATA_FORMAT_TYPE_2 0x2UL
+ #define VNIC_UPDATE_REQ_METADATA_FORMAT_TYPE_3 0x3UL
+ #define VNIC_UPDATE_REQ_METADATA_FORMAT_TYPE_4 0x4UL
+ #define VNIC_UPDATE_REQ_METADATA_FORMAT_TYPE_LAST VNIC_UPDATE_REQ_METADATA_FORMAT_TYPE_4
+ __le16 mru;
+ u8 unused_1[4];
+};
+
+/* hwrm_vnic_update_output (size:128b/16B) */
+struct hwrm_vnic_update_output {
+ __le16 error_code;
+ __le16 req_type;
+ __le16 seq_id;
+ __le16 resp_len;
+ u8 unused_0[7];
+ u8 valid;
+};
+
/* hwrm_vnic_free_input (size:192b/24B) */
struct hwrm_vnic_free_input {
__le16 req_type;
--
2.43.0
* [PATCH net-next v1 2/3] bnxt_en: split rx ring helpers out from ring helpers
2024-06-11 2:33 [PATCH net-next v1 0/3] bnxt_en: implement netdev_queue_mgmt_ops David Wei
2024-06-11 2:33 ` [PATCH net-next v1 1/3] bnxt_en: Add support to call FW to update a VNIC David Wei
@ 2024-06-11 2:33 ` David Wei
2024-06-14 20:01 ` Simon Horman
2024-06-11 2:33 ` [PATCH net-next v1 3/3] bnxt_en: implement netdev_queue_mgmt_ops David Wei
2024-06-11 2:40 ` [PATCH net-next v1 0/3] " David Ahern
3 siblings, 1 reply; 13+ messages in thread
From: David Wei @ 2024-06-11 2:33 UTC (permalink / raw)
To: Michael Chan, Andy Gospodarek, Adrian Alvarado, Somnath Kotur,
netdev
Cc: Pavel Begunkov, Jakub Kicinski, David Ahern, David S. Miller,
Eric Dumazet, Paolo Abeni
To prepare for the queue API implementation, split rx ring functions out
from the ring helpers. These new helpers will be called from the queue
API implementation.
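For orientation, the new per-ring helpers pair up roughly as sketched
below; how the queue API sequences them is shown in patch 3/3:

	/* HWRM (firmware) ring alloc/free:
	 *   bnxt_hwrm_rx_ring_alloc()     / bnxt_hwrm_rx_ring_free()
	 *   bnxt_hwrm_rx_agg_ring_alloc() / bnxt_hwrm_rx_agg_ring_free()
	 * Host buffer fill/free:
	 *   bnxt_alloc_one_rx_ring_skb()  / bnxt_free_one_rx_ring()
	 *   bnxt_alloc_one_rx_ring_page() / bnxt_free_one_rx_agg_ring()
	 * Descriptor ring init:
	 *   bnxt_init_one_rx_ring_rxbd(), bnxt_init_one_rx_agg_ring_rxbd()
	 */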
Signed-off-by: David Wei <dw@davidwei.uk>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 302 ++++++++++++++--------
1 file changed, 195 insertions(+), 107 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index ebcf393f06dc..b5895e0854d1 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -3317,37 +3317,12 @@ static void bnxt_free_tx_skbs(struct bnxt *bp)
}
}
-static void bnxt_free_one_rx_ring_skbs(struct bnxt *bp, int ring_nr)
+static void bnxt_free_one_rx_ring(struct bnxt *bp, struct bnxt_rx_ring_info *rxr)
{
- struct bnxt_rx_ring_info *rxr = &bp->rx_ring[ring_nr];
struct pci_dev *pdev = bp->pdev;
- struct bnxt_tpa_idx_map *map;
- int i, max_idx, max_agg_idx;
+ int i, max_idx;
max_idx = bp->rx_nr_pages * RX_DESC_CNT;
- max_agg_idx = bp->rx_agg_nr_pages * RX_DESC_CNT;
- if (!rxr->rx_tpa)
- goto skip_rx_tpa_free;
-
- for (i = 0; i < bp->max_tpa; i++) {
- struct bnxt_tpa_info *tpa_info = &rxr->rx_tpa[i];
- u8 *data = tpa_info->data;
-
- if (!data)
- continue;
-
- dma_unmap_single_attrs(&pdev->dev, tpa_info->mapping,
- bp->rx_buf_use_size, bp->rx_dir,
- DMA_ATTR_WEAK_ORDERING);
-
- tpa_info->data = NULL;
-
- skb_free_frag(data);
- }
-
-skip_rx_tpa_free:
- if (!rxr->rx_buf_ring)
- goto skip_rx_buf_free;
for (i = 0; i < max_idx; i++) {
struct bnxt_sw_rx_bd *rx_buf = &rxr->rx_buf_ring[i];
@@ -3367,12 +3342,15 @@ static void bnxt_free_one_rx_ring_skbs(struct bnxt *bp, int ring_nr)
skb_free_frag(data);
}
}
+}
-skip_rx_buf_free:
- if (!rxr->rx_agg_ring)
- goto skip_rx_agg_free;
+static void bnxt_free_one_rx_agg_ring(struct bnxt *bp, struct bnxt_rx_ring_info *rxr)
+{
+ int i, max_idx;
+
+ max_idx = bp->rx_agg_nr_pages * RX_DESC_CNT;
- for (i = 0; i < max_agg_idx; i++) {
+ for (i = 0; i < max_idx; i++) {
struct bnxt_sw_rx_agg_bd *rx_agg_buf = &rxr->rx_agg_ring[i];
struct page *page = rx_agg_buf->page;
@@ -3384,6 +3362,45 @@ static void bnxt_free_one_rx_ring_skbs(struct bnxt *bp, int ring_nr)
page_pool_recycle_direct(rxr->page_pool, page);
}
+}
+
+static void bnxt_free_one_rx_ring_skbs(struct bnxt *bp, int ring_nr)
+{
+ struct bnxt_rx_ring_info *rxr = &bp->rx_ring[ring_nr];
+ struct pci_dev *pdev = bp->pdev;
+ struct bnxt_tpa_idx_map *map;
+ int i;
+
+ if (!rxr->rx_tpa)
+ goto skip_rx_tpa_free;
+
+ for (i = 0; i < bp->max_tpa; i++) {
+ struct bnxt_tpa_info *tpa_info = &rxr->rx_tpa[i];
+ u8 *data = tpa_info->data;
+
+ if (!data)
+ continue;
+
+ dma_unmap_single_attrs(&pdev->dev, tpa_info->mapping,
+ bp->rx_buf_use_size, bp->rx_dir,
+ DMA_ATTR_WEAK_ORDERING);
+
+ tpa_info->data = NULL;
+
+ skb_free_frag(data);
+ }
+
+skip_rx_tpa_free:
+ if (!rxr->rx_buf_ring)
+ goto skip_rx_buf_free;
+
+ bnxt_free_one_rx_ring(bp, rxr);
+
+skip_rx_buf_free:
+ if (!rxr->rx_agg_ring)
+ goto skip_rx_agg_free;
+
+ bnxt_free_one_rx_agg_ring(bp, rxr);
skip_rx_agg_free:
map = rxr->rx_tpa_idx_map;
@@ -4062,37 +4079,55 @@ static void bnxt_init_rxbd_pages(struct bnxt_ring_struct *ring, u32 type)
}
}
-static int bnxt_alloc_one_rx_ring(struct bnxt *bp, int ring_nr)
+static void bnxt_alloc_one_rx_ring_skb(struct bnxt *bp,
+ struct bnxt_rx_ring_info *rxr,
+ int ring_nr)
{
- struct bnxt_rx_ring_info *rxr = &bp->rx_ring[ring_nr];
- struct net_device *dev = bp->dev;
u32 prod;
int i;
prod = rxr->rx_prod;
for (i = 0; i < bp->rx_ring_size; i++) {
if (bnxt_alloc_rx_data(bp, rxr, prod, GFP_KERNEL)) {
- netdev_warn(dev, "init'ed rx ring %d with %d/%d skbs only\n",
+ netdev_warn(bp->dev, "init'ed rx ring %d with %d/%d skbs only\n",
ring_nr, i, bp->rx_ring_size);
break;
}
prod = NEXT_RX(prod);
}
rxr->rx_prod = prod;
+}
- if (!(bp->flags & BNXT_FLAG_AGG_RINGS))
- return 0;
+static void bnxt_alloc_one_rx_ring_page(struct bnxt *bp,
+ struct bnxt_rx_ring_info *rxr,
+ int ring_nr)
+{
+ u32 prod;
+ int i;
prod = rxr->rx_agg_prod;
for (i = 0; i < bp->rx_agg_ring_size; i++) {
if (bnxt_alloc_rx_page(bp, rxr, prod, GFP_KERNEL)) {
- netdev_warn(dev, "init'ed rx ring %d with %d/%d pages only\n",
+ netdev_warn(bp->dev, "init'ed rx ring %d with %d/%d pages only\n",
ring_nr, i, bp->rx_ring_size);
break;
}
prod = NEXT_RX_AGG(prod);
}
rxr->rx_agg_prod = prod;
+}
+
+static int bnxt_alloc_one_rx_ring(struct bnxt *bp, int ring_nr)
+{
+ struct bnxt_rx_ring_info *rxr = &bp->rx_ring[ring_nr];
+ int i;
+
+ bnxt_alloc_one_rx_ring_skb(bp, rxr, ring_nr);
+
+ if (!(bp->flags & BNXT_FLAG_AGG_RINGS))
+ return 0;
+
+ bnxt_alloc_one_rx_ring_page(bp, rxr, ring_nr);
if (rxr->rx_tpa) {
dma_addr_t mapping;
@@ -4111,9 +4146,9 @@ static int bnxt_alloc_one_rx_ring(struct bnxt *bp, int ring_nr)
return 0;
}
-static int bnxt_init_one_rx_ring(struct bnxt *bp, int ring_nr)
+static void bnxt_init_one_rx_ring_rxbd(struct bnxt *bp,
+ struct bnxt_rx_ring_info *rxr)
{
- struct bnxt_rx_ring_info *rxr;
struct bnxt_ring_struct *ring;
u32 type;
@@ -4123,28 +4158,45 @@ static int bnxt_init_one_rx_ring(struct bnxt *bp, int ring_nr)
if (NET_IP_ALIGN == 2)
type |= RX_BD_FLAGS_SOP;
- rxr = &bp->rx_ring[ring_nr];
ring = &rxr->rx_ring_struct;
bnxt_init_rxbd_pages(ring, type);
-
- netif_queue_set_napi(bp->dev, ring_nr, NETDEV_QUEUE_TYPE_RX,
- &rxr->bnapi->napi);
-
- if (BNXT_RX_PAGE_MODE(bp) && bp->xdp_prog) {
- bpf_prog_add(bp->xdp_prog, 1);
- rxr->xdp_prog = bp->xdp_prog;
- }
ring->fw_ring_id = INVALID_HW_RING_ID;
+}
+
+static void bnxt_init_one_rx_agg_ring_rxbd(struct bnxt *bp,
+ struct bnxt_rx_ring_info *rxr)
+{
+ struct bnxt_ring_struct *ring;
+ u32 type;
ring = &rxr->rx_agg_ring_struct;
ring->fw_ring_id = INVALID_HW_RING_ID;
-
if ((bp->flags & BNXT_FLAG_AGG_RINGS)) {
type = ((u32)BNXT_RX_PAGE_SIZE << RX_BD_LEN_SHIFT) |
RX_BD_TYPE_RX_AGG_BD | RX_BD_FLAGS_SOP;
bnxt_init_rxbd_pages(ring, type);
}
+}
+
+static int bnxt_init_one_rx_ring(struct bnxt *bp, int ring_nr)
+{
+ struct bnxt_rx_ring_info *rxr;
+ struct bnxt_ring_struct *ring;
+
+ rxr = &bp->rx_ring[ring_nr];
+ ring = &rxr->rx_ring_struct;
+ bnxt_init_one_rx_ring_rxbd(bp, rxr);
+
+ netif_queue_set_napi(bp->dev, ring_nr, NETDEV_QUEUE_TYPE_RX,
+ &rxr->bnapi->napi);
+
+ if (BNXT_RX_PAGE_MODE(bp) && bp->xdp_prog) {
+ bpf_prog_add(bp->xdp_prog, 1);
+ rxr->xdp_prog = bp->xdp_prog;
+ }
+
+ bnxt_init_one_rx_agg_ring_rxbd(bp, rxr);
return bnxt_alloc_one_rx_ring(bp, ring_nr);
}
@@ -6870,6 +6922,48 @@ static void bnxt_set_db(struct bnxt *bp, struct bnxt_db_info *db, u32 ring_type,
bnxt_set_db_mask(bp, db, ring_type);
}
+static int bnxt_hwrm_rx_ring_alloc(struct bnxt *bp,
+ struct bnxt_rx_ring_info *rxr)
+{
+ struct bnxt_ring_struct *ring = &rxr->rx_ring_struct;
+ struct bnxt_napi *bnapi = rxr->bnapi;
+ u32 type = HWRM_RING_ALLOC_RX;
+ u32 map_idx = bnapi->index;
+ int rc;
+
+ rc = hwrm_ring_alloc_send_msg(bp, ring, type, map_idx);
+ if (rc)
+ return rc;
+
+ bnxt_set_db(bp, &rxr->rx_db, type, map_idx, ring->fw_ring_id);
+ bp->grp_info[map_idx].rx_fw_ring_id = ring->fw_ring_id;
+
+ return 0;
+}
+
+static int bnxt_hwrm_rx_agg_ring_alloc(struct bnxt *bp,
+ struct bnxt_rx_ring_info *rxr)
+{
+ struct bnxt_ring_struct *ring = &rxr->rx_agg_ring_struct;
+ u32 type = HWRM_RING_ALLOC_AGG;
+ u32 grp_idx = ring->grp_idx;
+ u32 map_idx;
+ int rc;
+
+ map_idx = grp_idx + bp->rx_nr_rings;
+ rc = hwrm_ring_alloc_send_msg(bp, ring, type, map_idx);
+ if (rc)
+ return rc;
+
+ bnxt_set_db(bp, &rxr->rx_agg_db, type, map_idx,
+ ring->fw_ring_id);
+ bnxt_db_write(bp, &rxr->rx_agg_db, rxr->rx_agg_prod);
+ bnxt_db_write(bp, &rxr->rx_db, rxr->rx_prod);
+ bp->grp_info[grp_idx].agg_fw_ring_id = ring->fw_ring_id;
+
+ return 0;
+}
+
static int bnxt_hwrm_ring_alloc(struct bnxt *bp)
{
bool agg_rings = !!(bp->flags & BNXT_FLAG_AGG_RINGS);
@@ -6935,24 +7029,21 @@ static int bnxt_hwrm_ring_alloc(struct bnxt *bp)
bnxt_set_db(bp, &txr->tx_db, type, map_idx, ring->fw_ring_id);
}
- type = HWRM_RING_ALLOC_RX;
for (i = 0; i < bp->rx_nr_rings; i++) {
struct bnxt_rx_ring_info *rxr = &bp->rx_ring[i];
- struct bnxt_ring_struct *ring = &rxr->rx_ring_struct;
- struct bnxt_napi *bnapi = rxr->bnapi;
- u32 map_idx = bnapi->index;
- rc = hwrm_ring_alloc_send_msg(bp, ring, type, map_idx);
+ rc = bnxt_hwrm_rx_ring_alloc(bp, rxr);
if (rc)
goto err_out;
- bnxt_set_db(bp, &rxr->rx_db, type, map_idx, ring->fw_ring_id);
/* If we have agg rings, post agg buffers first. */
if (!agg_rings)
bnxt_db_write(bp, &rxr->rx_db, rxr->rx_prod);
- bp->grp_info[map_idx].rx_fw_ring_id = ring->fw_ring_id;
if (bp->flags & BNXT_FLAG_CHIP_P5_PLUS) {
struct bnxt_cp_ring_info *cpr2 = rxr->rx_cpr;
+ struct bnxt_napi *bnapi = rxr->bnapi;
u32 type2 = HWRM_RING_ALLOC_CMPL;
+ struct bnxt_ring_struct *ring;
+ u32 map_idx = bnapi->index;
ring = &cpr2->cp_ring_struct;
ring->handle = BNXT_SET_NQ_HDL(cpr2);
@@ -6966,23 +7057,10 @@ static int bnxt_hwrm_ring_alloc(struct bnxt *bp)
}
if (agg_rings) {
- type = HWRM_RING_ALLOC_AGG;
for (i = 0; i < bp->rx_nr_rings; i++) {
- struct bnxt_rx_ring_info *rxr = &bp->rx_ring[i];
- struct bnxt_ring_struct *ring =
- &rxr->rx_agg_ring_struct;
- u32 grp_idx = ring->grp_idx;
- u32 map_idx = grp_idx + bp->rx_nr_rings;
-
- rc = hwrm_ring_alloc_send_msg(bp, ring, type, map_idx);
+ rc = bnxt_hwrm_rx_agg_ring_alloc(bp, &bp->rx_ring[i]);
if (rc)
goto err_out;
-
- bnxt_set_db(bp, &rxr->rx_agg_db, type, map_idx,
- ring->fw_ring_id);
- bnxt_db_write(bp, &rxr->rx_agg_db, rxr->rx_agg_prod);
- bnxt_db_write(bp, &rxr->rx_db, rxr->rx_prod);
- bp->grp_info[grp_idx].agg_fw_ring_id = ring->fw_ring_id;
}
}
err_out:
@@ -7022,6 +7100,50 @@ static int hwrm_ring_free_send_msg(struct bnxt *bp,
return 0;
}
+static void bnxt_hwrm_rx_ring_free(struct bnxt *bp,
+ struct bnxt_rx_ring_info *rxr,
+ bool close_path)
+{
+ struct bnxt_ring_struct *ring = &rxr->rx_ring_struct;
+ u32 grp_idx = rxr->bnapi->index;
+ u32 cmpl_ring_id;
+
+ if (ring->fw_ring_id == INVALID_HW_RING_ID)
+ return;
+
+ cmpl_ring_id = bnxt_cp_ring_for_rx(bp, rxr);
+ hwrm_ring_free_send_msg(bp, ring,
+ RING_FREE_REQ_RING_TYPE_RX,
+ close_path ? cmpl_ring_id :
+ INVALID_HW_RING_ID);
+ ring->fw_ring_id = INVALID_HW_RING_ID;
+ bp->grp_info[grp_idx].rx_fw_ring_id = INVALID_HW_RING_ID;
+}
+
+static void bnxt_hwrm_rx_agg_ring_free(struct bnxt *bp,
+ struct bnxt_rx_ring_info *rxr,
+ bool close_path)
+{
+ struct bnxt_ring_struct *ring = &rxr->rx_agg_ring_struct;
+ u32 grp_idx = rxr->bnapi->index;
+ u32 type, cmpl_ring_id;
+
+ if (bp->flags & BNXT_FLAG_CHIP_P5_PLUS)
+ type = RING_FREE_REQ_RING_TYPE_RX_AGG;
+ else
+ type = RING_FREE_REQ_RING_TYPE_RX;
+
+ if (ring->fw_ring_id == INVALID_HW_RING_ID)
+ return;
+
+ cmpl_ring_id = bnxt_cp_ring_for_rx(bp, rxr);
+ hwrm_ring_free_send_msg(bp, ring, type,
+ close_path ? cmpl_ring_id :
+ INVALID_HW_RING_ID);
+ ring->fw_ring_id = INVALID_HW_RING_ID;
+ bp->grp_info[grp_idx].agg_fw_ring_id = INVALID_HW_RING_ID;
+}
+
static void bnxt_hwrm_ring_free(struct bnxt *bp, bool close_path)
{
u32 type;
@@ -7046,42 +7168,8 @@ static void bnxt_hwrm_ring_free(struct bnxt *bp, bool close_path)
}
for (i = 0; i < bp->rx_nr_rings; i++) {
- struct bnxt_rx_ring_info *rxr = &bp->rx_ring[i];
- struct bnxt_ring_struct *ring = &rxr->rx_ring_struct;
- u32 grp_idx = rxr->bnapi->index;
-
- if (ring->fw_ring_id != INVALID_HW_RING_ID) {
- u32 cmpl_ring_id = bnxt_cp_ring_for_rx(bp, rxr);
-
- hwrm_ring_free_send_msg(bp, ring,
- RING_FREE_REQ_RING_TYPE_RX,
- close_path ? cmpl_ring_id :
- INVALID_HW_RING_ID);
- ring->fw_ring_id = INVALID_HW_RING_ID;
- bp->grp_info[grp_idx].rx_fw_ring_id =
- INVALID_HW_RING_ID;
- }
- }
-
- if (bp->flags & BNXT_FLAG_CHIP_P5_PLUS)
- type = RING_FREE_REQ_RING_TYPE_RX_AGG;
- else
- type = RING_FREE_REQ_RING_TYPE_RX;
- for (i = 0; i < bp->rx_nr_rings; i++) {
- struct bnxt_rx_ring_info *rxr = &bp->rx_ring[i];
- struct bnxt_ring_struct *ring = &rxr->rx_agg_ring_struct;
- u32 grp_idx = rxr->bnapi->index;
-
- if (ring->fw_ring_id != INVALID_HW_RING_ID) {
- u32 cmpl_ring_id = bnxt_cp_ring_for_rx(bp, rxr);
-
- hwrm_ring_free_send_msg(bp, ring, type,
- close_path ? cmpl_ring_id :
- INVALID_HW_RING_ID);
- ring->fw_ring_id = INVALID_HW_RING_ID;
- bp->grp_info[grp_idx].agg_fw_ring_id =
- INVALID_HW_RING_ID;
- }
+ bnxt_hwrm_rx_ring_free(bp, &bp->rx_ring[i], close_path);
+ bnxt_hwrm_rx_agg_ring_free(bp, &bp->rx_ring[i], close_path);
}
/* The completion rings are about to be freed. After that the
--
2.43.0
* [PATCH net-next v1 3/3] bnxt_en: implement netdev_queue_mgmt_ops
2024-06-11 2:33 [PATCH net-next v1 0/3] bnxt_en: implement netdev_queue_mgmt_ops David Wei
2024-06-11 2:33 ` [PATCH net-next v1 1/3] bnxt_en: Add support to call FW to update a VNIC David Wei
2024-06-11 2:33 ` [PATCH net-next v1 2/3] bnxt_en: split rx ring helpers out from ring helpers David Wei
@ 2024-06-11 2:33 ` David Wei
2024-06-14 20:07 ` Simon Horman
2024-06-11 2:40 ` [PATCH net-next v1 0/3] " David Ahern
3 siblings, 1 reply; 13+ messages in thread
From: David Wei @ 2024-06-11 2:33 UTC (permalink / raw)
To: Michael Chan, Andy Gospodarek, Adrian Alvarado, Somnath Kotur,
netdev
Cc: Pavel Begunkov, Jakub Kicinski, David Ahern, David S. Miller,
Eric Dumazet, Paolo Abeni
Implement netdev_queue_mgmt_ops, added in [1], for bnxt.
Two bnxt_rx_ring_info structs are allocated to hold the new/old queue
memory. Queue memory is copied from/to the main bp->rx_ring[idx]
bnxt_rx_ring_info.
Queue memory is pre-allocated in bnxt_queue_mem_alloc() into a clone,
and then copied into bp->rx_ring[idx] in bnxt_queue_start().
Similarly, when bp->rx_ring[idx] is stopped, its queue memory is copied
into a clone, and then freed later in bnxt_queue_mem_free().
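For context, the core netdev_rx_queue_restart() helper (being added
separately and not part of this series) is expected to drive these ndos
roughly in the order sketched below; the signatures match the ops
registered at the end of this patch:

	/* Rough restart sequence (core side, sketch only):
	 *
	 *   ndo_queue_mem_alloc(dev, new_qmem, idx)  - pre-allocate the clone
	 *   ndo_queue_stop(dev, old_qmem, idx)       - stop ring, save old memory
	 *   ndo_queue_start(dev, new_qmem, idx)      - install the clone
	 *   ndo_queue_mem_free(dev, old_qmem)        - free the old memory
	 */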
I tested this patchset with netdev_rx_queue_restart(), including
inducing errors in all places that return an error code. In all cases,
the queue is left in a good working state.
Rx queues are stopped/started using bnxt_hwrm_vnic_update(), which only
affects queues that are not in the default RSS context. This is
different from GVE, which also implemented the queue API recently and
where arbitrary Rx queues can be stopped. Due to this limitation, all
ndos return EOPNOTSUPP if the queue is in the default RSS context.
Thanks to Somnath for helping me use bnxt_hwrm_vnic_update() to
stop/start an Rx queue. With their permission I've added them as
Acked-by.
[1]: https://lore.kernel.org/netdev/20240501232549.1327174-2-shailend@google.com/
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: David Wei <dw@davidwei.uk>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 307 ++++++++++++++++++++++
1 file changed, 307 insertions(+)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index b5895e0854d1..aa857b109378 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -3997,6 +3997,62 @@ static int bnxt_alloc_cp_rings(struct bnxt *bp)
return 0;
}
+static void bnxt_init_rx_ring_struct(struct bnxt *bp,
+ struct bnxt_rx_ring_info *rxr)
+{
+ struct bnxt_ring_mem_info *rmem;
+ struct bnxt_ring_struct *ring;
+
+ ring = &rxr->rx_ring_struct;
+ rmem = &ring->ring_mem;
+ rmem->nr_pages = bp->rx_nr_pages;
+ rmem->page_size = HW_RXBD_RING_SIZE;
+ rmem->pg_arr = (void **)rxr->rx_desc_ring;
+ rmem->dma_arr = rxr->rx_desc_mapping;
+ rmem->vmem_size = SW_RXBD_RING_SIZE * bp->rx_nr_pages;
+ rmem->vmem = (void **)&rxr->rx_buf_ring;
+
+ ring = &rxr->rx_agg_ring_struct;
+ rmem = &ring->ring_mem;
+ rmem->nr_pages = bp->rx_agg_nr_pages;
+ rmem->page_size = HW_RXBD_RING_SIZE;
+ rmem->pg_arr = (void **)rxr->rx_agg_desc_ring;
+ rmem->dma_arr = rxr->rx_agg_desc_mapping;
+ rmem->vmem_size = SW_RXBD_AGG_RING_SIZE * bp->rx_agg_nr_pages;
+ rmem->vmem = (void **)&rxr->rx_agg_ring;
+}
+
+static void bnxt_reset_rx_ring_struct(struct bnxt *bp,
+ struct bnxt_rx_ring_info *rxr)
+{
+ struct bnxt_ring_mem_info *rmem;
+ struct bnxt_ring_struct *ring;
+ int i;
+
+ rxr->page_pool->p.napi = NULL;
+ rxr->page_pool = NULL;
+
+ ring = &rxr->rx_ring_struct;
+ rmem = &ring->ring_mem;
+ rmem->pg_tbl = NULL;
+ rmem->pg_tbl_map = 0;
+ for (i = 0; i < rmem->nr_pages; i++) {
+ rmem->pg_arr[i] = NULL;
+ rmem->dma_arr[i] = 0;
+ }
+ *rmem->vmem = NULL;
+
+ ring = &rxr->rx_agg_ring_struct;
+ rmem = &ring->ring_mem;
+ rmem->pg_tbl = NULL;
+ rmem->pg_tbl_map = 0;
+ for (i = 0; i < rmem->nr_pages; i++) {
+ rmem->pg_arr[i] = NULL;
+ rmem->dma_arr[i] = 0;
+ }
+ *rmem->vmem = NULL;
+}
+
static void bnxt_init_ring_struct(struct bnxt *bp)
{
int i, j;
@@ -14937,6 +14993,256 @@ static const struct netdev_stat_ops bnxt_stat_ops = {
.get_base_stats = bnxt_get_base_stats,
};
+static int bnxt_alloc_rx_agg_bmap(struct bnxt *bp, struct bnxt_rx_ring_info *rxr)
+{
+ u16 mem_size;
+
+ rxr->rx_agg_bmap_size = bp->rx_agg_ring_mask + 1;
+ mem_size = rxr->rx_agg_bmap_size / 8;
+ rxr->rx_agg_bmap = kzalloc(mem_size, GFP_KERNEL);
+ if (!rxr->rx_agg_bmap)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static int bnxt_queue_mem_alloc(struct net_device *dev, void *qmem, int idx)
+{
+ struct bnxt_rx_ring_info *rxr, *clone;
+ struct bnxt *bp = netdev_priv(dev);
+ struct bnxt_ring_struct *ring;
+ int rc;
+
+ if (bnxt_get_max_rss_ring(bp) >= idx)
+ return -EOPNOTSUPP;
+
+ rxr = &bp->rx_ring[idx];
+ clone = qmem;
+ memcpy(clone, rxr, sizeof(*rxr));
+ bnxt_init_rx_ring_struct(bp, clone);
+ bnxt_reset_rx_ring_struct(bp, clone);
+
+ clone->rx_prod = 0;
+ clone->rx_agg_prod = 0;
+ clone->rx_sw_agg_prod = 0;
+ clone->rx_next_cons = 0;
+
+ rc = bnxt_alloc_rx_page_pool(bp, clone, rxr->page_pool->p.nid);
+ if (rc)
+ return rc;
+
+ ring = &clone->rx_ring_struct;
+ rc = bnxt_alloc_ring(bp, &ring->ring_mem);
+ if (rc)
+ goto err_free_rx_ring;
+
+ if (bp->flags & BNXT_FLAG_AGG_RINGS) {
+ ring = &clone->rx_agg_ring_struct;
+ rc = bnxt_alloc_ring(bp, &ring->ring_mem);
+ if (rc)
+ goto err_free_rx_agg_ring;
+
+ rc = bnxt_alloc_rx_agg_bmap(bp, clone);
+ if (rc)
+ goto err_free_rx_agg_ring;
+ }
+
+ bnxt_init_one_rx_ring_rxbd(bp, clone);
+ bnxt_init_one_rx_agg_ring_rxbd(bp, clone);
+
+ bnxt_alloc_one_rx_ring_skb(bp, clone, idx);
+ if (bp->flags & BNXT_FLAG_AGG_RINGS)
+ bnxt_alloc_one_rx_ring_page(bp, clone, idx);
+
+ return 0;
+
+err_free_rx_agg_ring:
+ bnxt_free_ring(bp, &clone->rx_agg_ring_struct.ring_mem);
+err_free_rx_ring:
+ bnxt_free_ring(bp, &clone->rx_ring_struct.ring_mem);
+ clone->page_pool->p.napi = NULL;
+ page_pool_destroy(clone->page_pool);
+ clone->page_pool = NULL;
+ return rc;
+}
+
+static void bnxt_queue_mem_free(struct net_device *dev, void *qmem)
+{
+ struct bnxt_rx_ring_info *rxr = qmem;
+ struct bnxt *bp = netdev_priv(dev);
+ struct bnxt_ring_struct *ring;
+
+ if (bnxt_get_max_rss_ring(bp) >= idx)
+ return -EOPNOTSUPP;
+
+ bnxt_free_one_rx_ring(bp, rxr);
+ bnxt_free_one_rx_agg_ring(bp, rxr);
+
+ /* At this point, this NAPI instance has another page pool associated
+ * with it. Disconnect here before freeing the old page pool to avoid
+ * warnings.
+ */
+ rxr->page_pool->p.napi = NULL;
+ page_pool_destroy(rxr->page_pool);
+ rxr->page_pool = NULL;
+
+ ring = &rxr->rx_ring_struct;
+ bnxt_free_ring(bp, &ring->ring_mem);
+
+ ring = &rxr->rx_agg_ring_struct;
+ bnxt_free_ring(bp, &ring->ring_mem);
+
+ kfree(rxr->rx_agg_bmap);
+ rxr->rx_agg_bmap = NULL;
+}
+
+static void bnxt_copy_rx_ring(struct bnxt *bp,
+ struct bnxt_rx_ring_info *dst,
+ struct bnxt_rx_ring_info *src)
+{
+ struct bnxt_ring_mem_info *dst_rmem, *src_rmem;
+ struct bnxt_ring_struct *dst_ring, *src_ring;
+ int i;
+
+ dst_ring = &dst->rx_ring_struct;
+ dst_rmem = &dst_ring->ring_mem;
+ src_ring = &src->rx_ring_struct;
+ src_rmem = &src_ring->ring_mem;
+
+ WARN_ON(dst_rmem->nr_pages != src_rmem->nr_pages);
+ WARN_ON(dst_rmem->page_size != src_rmem->page_size);
+ WARN_ON(dst_rmem->flags != src_rmem->flags);
+ WARN_ON(dst_rmem->depth != src_rmem->depth);
+ WARN_ON(dst_rmem->vmem_size != src_rmem->vmem_size);
+ WARN_ON(dst_rmem->ctx_mem != src_rmem->ctx_mem);
+
+ dst_rmem->pg_tbl = src_rmem->pg_tbl;
+ dst_rmem->pg_tbl_map = src_rmem->pg_tbl_map;
+ *dst_rmem->vmem = *src_rmem->vmem;
+ for (i = 0; i < dst_rmem->nr_pages; i++) {
+ dst_rmem->pg_arr[i] = src_rmem->pg_arr[i];
+ dst_rmem->dma_arr[i] = src_rmem->dma_arr[i];
+ }
+
+ if (!(bp->flags & BNXT_FLAG_AGG_RINGS))
+ return;
+
+ dst_ring = &dst->rx_agg_ring_struct;
+ dst_rmem = &dst_ring->ring_mem;
+ src_ring = &src->rx_agg_ring_struct;
+ src_rmem = &src_ring->ring_mem;
+
+ WARN_ON(dst_rmem->nr_pages != src_rmem->nr_pages);
+ WARN_ON(dst_rmem->page_size != src_rmem->page_size);
+ WARN_ON(dst_rmem->flags != src_rmem->flags);
+ WARN_ON(dst_rmem->depth != src_rmem->depth);
+ WARN_ON(dst_rmem->vmem_size != src_rmem->vmem_size);
+ WARN_ON(dst_rmem->ctx_mem != src_rmem->ctx_mem);
+ WARN_ON(dst->rx_agg_bmap_size != src->rx_agg_bmap_size);
+
+ dst_rmem->pg_tbl = src_rmem->pg_tbl;
+ dst_rmem->pg_tbl_map = src_rmem->pg_tbl_map;
+ *dst_rmem->vmem = *src_rmem->vmem;
+ for (i = 0; i < dst_rmem->nr_pages; i++) {
+ dst_rmem->pg_arr[i] = src_rmem->pg_arr[i];
+ dst_rmem->dma_arr[i] = src_rmem->dma_arr[i];
+ }
+
+ dst->rx_agg_bmap = src->rx_agg_bmap;
+}
+
+static int bnxt_queue_start(struct net_device *dev, void *qmem, int idx)
+{
+ struct bnxt *bp = netdev_priv(dev);
+ struct bnxt_rx_ring_info *rxr, *clone;
+ struct bnxt_cp_ring_info *cpr;
+ struct bnxt_vnic_info *vnic;
+ int rc;
+
+ if (bnxt_get_max_rss_ring(bp) >= idx)
+ return -EOPNOTSUPP;
+
+ rxr = &bp->rx_ring[idx];
+ clone = qmem;
+
+ rxr->rx_prod = clone->rx_prod;
+ rxr->rx_agg_prod = clone->rx_agg_prod;
+ rxr->rx_sw_agg_prod = clone->rx_sw_agg_prod;
+ rxr->rx_next_cons = clone->rx_next_cons;
+ rxr->page_pool = clone->page_pool;
+
+ bnxt_copy_rx_ring(bp, rxr, clone);
+
+ rc = bnxt_hwrm_rx_ring_alloc(bp, rxr);
+ if (rc)
+ return rc;
+ rc = bnxt_hwrm_rx_agg_ring_alloc(bp, rxr);
+ if (rc)
+ goto err_free_hwrm_rx_ring;
+
+ bnxt_db_write(bp, &rxr->rx_db, rxr->rx_prod);
+ if (bp->flags & BNXT_FLAG_AGG_RINGS)
+ bnxt_db_write(bp, &rxr->rx_agg_db, rxr->rx_agg_prod);
+
+ napi_enable(&rxr->bnapi->napi);
+
+ vnic = &bp->vnic_info[BNXT_VNIC_NTUPLE];
+ vnic->mru = bp->dev->mtu + ETH_HLEN + VLAN_HLEN;
+ rc = bnxt_hwrm_vnic_update(bp, vnic,
+ VNIC_UPDATE_REQ_ENABLES_MRU_VALID);
+ if (rc)
+ goto err_free_hwrm_rx_agg_ring;
+
+ cpr = &rxr->bnapi->cp_ring;
+ cpr->sw_stats->rx.rx_resets++;
+
+ return 0;
+
+err_free_hwrm_rx_agg_ring:
+ napi_disable(&rxr->bnapi->napi);
+ bnxt_hwrm_rx_agg_ring_free(bp, rxr, false);
+err_free_hwrm_rx_ring:
+ bnxt_hwrm_rx_ring_free(bp, rxr, false);
+ return rc;
+}
+
+static int bnxt_queue_stop(struct net_device *dev, void *qmem, int idx)
+{
+ struct bnxt *bp = netdev_priv(dev);
+ struct bnxt_rx_ring_info *rxr;
+ struct bnxt_vnic_info *vnic;
+ int rc;
+
+ if (bnxt_get_max_rss_ring(bp) >= idx)
+ return -EOPNOTSUPP;
+
+ vnic = &bp->vnic_info[BNXT_VNIC_NTUPLE];
+ vnic->mru = 0;
+ rc = bnxt_hwrm_vnic_update(bp, vnic,
+ VNIC_UPDATE_REQ_ENABLES_MRU_VALID);
+ if (rc)
+ return rc;
+
+ rxr = &bp->rx_ring[idx];
+ napi_disable(&rxr->bnapi->napi);
+ bnxt_hwrm_rx_ring_free(bp, rxr, false);
+ bnxt_hwrm_rx_agg_ring_free(bp, rxr, false);
+ rxr->rx_next_cons = 0;
+
+ memcpy(qmem, rxr, sizeof(*rxr));
+ bnxt_init_rx_ring_struct(bp, qmem);
+
+ return 0;
+}
+
+static const struct netdev_queue_mgmt_ops bnxt_queue_mgmt_ops = {
+ .ndo_queue_mem_size = sizeof(struct bnxt_rx_ring_info),
+ .ndo_queue_mem_alloc = bnxt_queue_mem_alloc,
+ .ndo_queue_mem_free = bnxt_queue_mem_free,
+ .ndo_queue_start = bnxt_queue_start,
+ .ndo_queue_stop = bnxt_queue_stop,
+};
+
static void bnxt_remove_one(struct pci_dev *pdev)
{
struct net_device *dev = pci_get_drvdata(pdev);
@@ -15402,6 +15708,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
dev->stat_ops = &bnxt_stat_ops;
dev->watchdog_timeo = BNXT_TX_TIMEOUT;
dev->ethtool_ops = &bnxt_ethtool_ops;
+ dev->queue_mgmt_ops = &bnxt_queue_mgmt_ops;
pci_set_drvdata(pdev, dev);
rc = bnxt_alloc_hwrm_resources(bp);
--
2.43.0
* Re: [PATCH net-next v1 0/3] bnxt_en: implement netdev_queue_mgmt_ops
2024-06-11 2:33 [PATCH net-next v1 0/3] bnxt_en: implement netdev_queue_mgmt_ops David Wei
` (2 preceding siblings ...)
2024-06-11 2:33 ` [PATCH net-next v1 3/3] bnxt_en: implement netdev_queue_mgmt_ops David Wei
@ 2024-06-11 2:40 ` David Ahern
2024-06-11 3:41 ` David Wei
3 siblings, 1 reply; 13+ messages in thread
From: David Ahern @ 2024-06-11 2:40 UTC (permalink / raw)
To: David Wei, Michael Chan, Andy Gospodarek, Adrian Alvarado,
Somnath Kotur, netdev
Cc: Pavel Begunkov, Jakub Kicinski, David S. Miller, Eric Dumazet,
Paolo Abeni
On 6/10/24 8:33 PM, David Wei wrote:
> Implement netdev_queue_mgmt_ops for bnxt added in [1]. This will be used
> in the io_uring ZC Rx patchset to configure queues with a custom page
> pool w/ a special memory provider for zero copy support.
>
I do not see it explicitly called out, so asking here: does this change
enable / require header split if using memory from this "special memory
provider"?
* Re: [PATCH net-next v1 0/3] bnxt_en: implement netdev_queue_mgmt_ops
2024-06-11 2:40 ` [PATCH net-next v1 0/3] " David Ahern
@ 2024-06-11 3:41 ` David Wei
2024-06-12 15:52 ` David Ahern
0 siblings, 1 reply; 13+ messages in thread
From: David Wei @ 2024-06-11 3:41 UTC (permalink / raw)
To: David Ahern, Michael Chan, Andy Gospodarek, Adrian Alvarado,
Somnath Kotur, netdev
Cc: Pavel Begunkov, Jakub Kicinski, David S. Miller, Eric Dumazet,
Paolo Abeni
On 2024-06-10 19:40, David Ahern wrote:
> On 6/10/24 8:33 PM, David Wei wrote:
>> Implement netdev_queue_mgmt_ops for bnxt added in [1]. This will be used
>> in the io_uring ZC Rx patchset to configure queues with a custom page
>> pool w/ a special memory provider for zero copy support.
>>
>
>
> I do not see it explicitly called out, so asking here: does this change
> enable / require header split if using memory from this "special memory
> provider"?
This patchset is orthogonal to header split and page pool memory
providers. It implements netdev_queue_mgmt_ops, which enables dynamically
resetting a single Rx queue without needing a full device close/open,
and pre-allocates queue descriptor memory _before_ stopping the queue to
minimise downtime. It shows that, with FW support and carefully written
code, it is possible for drivers to move away from doing full resets on
changes.
My first intent is to use it for reconfiguring an Rx queue with a
different page pool, but that isn't the only purpose. For example, it
opens up the possibility of dynamically creating queues that share a
single napi instance.
* Re: [PATCH net-next v1 0/3] bnxt_en: implement netdev_queue_mgmt_ops
2024-06-11 3:41 ` David Wei
@ 2024-06-12 15:52 ` David Ahern
2024-06-12 18:19 ` David Wei
0 siblings, 1 reply; 13+ messages in thread
From: David Ahern @ 2024-06-12 15:52 UTC (permalink / raw)
To: David Wei, Michael Chan, Andy Gospodarek, Adrian Alvarado,
Somnath Kotur, netdev
Cc: Pavel Begunkov, Jakub Kicinski, David S. Miller, Eric Dumazet,
Paolo Abeni
On 6/10/24 9:41 PM, David Wei wrote:
>
> This patchset is orthogonal to header split and page pool memory
> providers. It implements netdev_queue_mgmt_ops which enables dynamically
Ok, where is the validation that these queues must be configured for
header-data split to use non-kernel memory?
* Re: [PATCH net-next v1 0/3] bnxt_en: implement netdev_queue_mgmt_ops
2024-06-12 15:52 ` David Ahern
@ 2024-06-12 18:19 ` David Wei
2024-06-12 21:11 ` Jakub Kicinski
2024-06-12 21:54 ` Andy Gospodarek
0 siblings, 2 replies; 13+ messages in thread
From: David Wei @ 2024-06-12 18:19 UTC (permalink / raw)
To: David Ahern, Michael Chan, Andy Gospodarek, Adrian Alvarado,
Somnath Kotur, netdev
Cc: Pavel Begunkov, Jakub Kicinski, David S. Miller, Eric Dumazet,
Paolo Abeni
On 2024-06-12 08:52, David Ahern wrote:
> On 6/10/24 9:41 PM, David Wei wrote:
>>
>> This patchset is orthogonal to header split and page pool memory
>> providers. It implements netdev_queue_mgmt_ops which enables dynamically
>
> Ok, where is the validation that these queues must be configured for
> header-data split to use non-kernel memory?
Any validation would be done outside of this patchset, which only
focuses on resetting an Rx queue. Reconfiguring page pools and HDS is
orthogonal and unrelated to this patchset.
The netdev core API consuming netdev_queue_mgmt_ops would be
netdev_rx_queue_restart() added in [1].
[1]: https://lore.kernel.org/lkml/20240607005127.3078656-2-almasrymina@google.com/
Validation would be done at a layer above, i.e. whatever calls
netdev_rx_queue_restart(), as opposed to netdev_rx_queue_restart() or
netdev_queue_mgmt_ops itself.
* Re: [PATCH net-next v1 0/3] bnxt_en: implement netdev_queue_mgmt_ops
2024-06-12 18:19 ` David Wei
@ 2024-06-12 21:11 ` Jakub Kicinski
2024-06-12 21:54 ` Andy Gospodarek
1 sibling, 0 replies; 13+ messages in thread
From: Jakub Kicinski @ 2024-06-12 21:11 UTC (permalink / raw)
To: David Wei
Cc: David Ahern, Michael Chan, Andy Gospodarek, Adrian Alvarado,
Somnath Kotur, netdev, Pavel Begunkov, David S. Miller,
Eric Dumazet, Paolo Abeni
On Wed, 12 Jun 2024 11:19:47 -0700 David Wei wrote:
> Any validation would be done outside of this patchset, which only
> focuses on resetting an Rx queue. Reconfiguring page pools and HDS is
> orthogonal and unrelated to this patchset.
>
> The netdev core API consuming netdev_queue_mgmt_ops would be
> netdev_rx_queue_restart() added in [1].
>
> [1]: https://lore.kernel.org/lkml/20240607005127.3078656-2-almasrymina@google.com/
>
> Validation would be done at a layer above, i.e. whatever that calls
> netdev_rx_queue_restart(), as opposed to netdev_rx_queue_restart() or
> netdev_queue_mgmt_ops itself.
One could be tempted to pass a "configuration" object to tell the
driver that this queue will be required to run with HDS enabled.
I wonder where I've seen such a thing...
* Re: [PATCH net-next v1 0/3] bnxt_en: implement netdev_queue_mgmt_ops
2024-06-12 18:19 ` David Wei
2024-06-12 21:11 ` Jakub Kicinski
@ 2024-06-12 21:54 ` Andy Gospodarek
1 sibling, 0 replies; 13+ messages in thread
From: Andy Gospodarek @ 2024-06-12 21:54 UTC (permalink / raw)
To: David Wei
Cc: David Ahern, Michael Chan, Andy Gospodarek, Adrian Alvarado,
Somnath Kotur, netdev, Pavel Begunkov, Jakub Kicinski,
David S. Miller, Eric Dumazet, Paolo Abeni
On Wed, Jun 12, 2024 at 11:19:47AM -0700, David Wei wrote:
> On 2024-06-12 08:52, David Ahern wrote:
> > On 6/10/24 9:41 PM, David Wei wrote:
> >>
> >> This patchset is orthogonal to header split and page pool memory
> >> providers. It implements netdev_queue_mgmt_ops which enables dynamically
> >
> > Ok, where is the validation that these queues must be configured for
> > header-data split to use non-kernel memory?
>
> Any validation would be done outside of this patchset, which only
> focuses on resetting an Rx queue. Reconfiguring page pools and HDS is
> orthogonal and unrelated to this patchset.
I can confirm this. These changes are just to keep the allocation
schemes the same as they are today, but add support for stopping/freeing
and starting/allocating rings individually and via the new API.
> The netdev core API consuming netdev_queue_mgmt_ops would be
> netdev_rx_queue_restart() added in [1].
>
> [1]: https://lore.kernel.org/lkml/20240607005127.3078656-2-almasrymina@google.com/
>
> Validation would be done at a layer above, i.e. whatever that calls
> netdev_rx_queue_restart(), as opposed to netdev_rx_queue_restart() or
> netdev_queue_mgmt_ops itself.
* Re: [PATCH net-next v1 2/3] bnxt_en: split rx ring helpers out from ring helpers
2024-06-11 2:33 ` [PATCH net-next v1 2/3] bnxt_en: split rx ring helpers out from ring helpers David Wei
@ 2024-06-14 20:01 ` Simon Horman
0 siblings, 0 replies; 13+ messages in thread
From: Simon Horman @ 2024-06-14 20:01 UTC (permalink / raw)
To: David Wei
Cc: Michael Chan, Andy Gospodarek, Adrian Alvarado, Somnath Kotur,
netdev, Pavel Begunkov, Jakub Kicinski, David Ahern,
David S. Miller, Eric Dumazet, Paolo Abeni
On Mon, Jun 10, 2024 at 07:33:23PM -0700, David Wei wrote:
> To prepare for queue API implementation, split rx ring functions out
> from ring helpers. These new helpers will be called from queue API
> implementation.
>
> Signed-off-by: David Wei <dw@davidwei.uk>
> ---
> drivers/net/ethernet/broadcom/bnxt/bnxt.c | 302 ++++++++++++++--------
> 1 file changed, 195 insertions(+), 107 deletions(-)
>
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
...
> +static int bnxt_init_one_rx_ring(struct bnxt *bp, int ring_nr)
> +{
> + struct bnxt_rx_ring_info *rxr;
> + struct bnxt_ring_struct *ring;
> +
> + rxr = &bp->rx_ring[ring_nr];
> + ring = &rxr->rx_ring_struct;
Hi David,
A minor nit from my side.
ring is set but otherwise unused in this function.
Flagged by W=1 builds with gcc-13 and clang-18.
> + bnxt_init_one_rx_ring_rxbd(bp, rxr);
> +
> + netif_queue_set_napi(bp->dev, ring_nr, NETDEV_QUEUE_TYPE_RX,
> + &rxr->bnapi->napi);
> +
> + if (BNXT_RX_PAGE_MODE(bp) && bp->xdp_prog) {
> + bpf_prog_add(bp->xdp_prog, 1);
> + rxr->xdp_prog = bp->xdp_prog;
> + }
> +
> + bnxt_init_one_rx_agg_ring_rxbd(bp, rxr);
>
> return bnxt_alloc_one_rx_ring(bp, ring_nr);
> }
...
* Re: [PATCH net-next v1 3/3] bnxt_en: implement netdev_queue_mgmt_ops
2024-06-11 2:33 ` [PATCH net-next v1 3/3] bnxt_en: implement netdev_queue_mgmt_ops David Wei
@ 2024-06-14 20:07 ` Simon Horman
2024-06-18 4:36 ` David Wei
0 siblings, 1 reply; 13+ messages in thread
From: Simon Horman @ 2024-06-14 20:07 UTC (permalink / raw)
To: David Wei
Cc: Michael Chan, Andy Gospodarek, Adrian Alvarado, Somnath Kotur,
netdev, Pavel Begunkov, Jakub Kicinski, David Ahern,
David S. Miller, Eric Dumazet, Paolo Abeni
On Mon, Jun 10, 2024 at 07:33:24PM -0700, David Wei wrote:
> Implement netdev_queue_mgmt_ops for bnxt added in [1].
>
> Two bnxt_rx_ring_info structs are allocated to hold the new/old queue
> memory. Queue memory is copied from/to the main bp->rx_ring[idx]
> bnxt_rx_ring_info.
>
> Queue memory is pre-allocated in bnxt_queue_mem_alloc() into a clone,
> and then copied into bp->rx_ring[idx] in bnxt_queue_mem_start().
>
> Similarly, when bp->rx_ring[idx] is stopped its queue memory is copied
> into a clone, and then freed later in bnxt_queue_mem_free().
>
> I tested this patchset with netdev_rx_queue_restart(), including
> inducing errors in all places that returns an error code. In all cases,
> the queue is left in a good working state.
>
> Rx queues are stopped/started using bnxt_hwrm_vnic_update(), which only
> affects queues that are not in the default RSS context. This is
> different to the GVE that also implemented the queue API recently where
> arbitrary Rx queues can be stopped. Due to this limitation, all ndos
> returns EOPNOTSUPP if the queue is in the default RSS context.
>
> Thanks to Somnath for helping me with using bnxt_hwrm_vnic_update() to
> stop/start an Rx queue. With their permission I've added them as
> Acked-by.
>
> [1]: https://lore.kernel.org/netdev/20240501232549.1327174-2-shailend@google.com/
>
> Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
> Signed-off-by: David Wei <dw@davidwei.uk>
> ---
> drivers/net/ethernet/broadcom/bnxt/bnxt.c | 307 ++++++++++++++++++++++
> 1 file changed, 307 insertions(+)
>
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
...
> +static void bnxt_queue_mem_free(struct net_device *dev, void *qmem)
> +{
> + struct bnxt_rx_ring_info *rxr = qmem;
> + struct bnxt *bp = netdev_priv(dev);
> + struct bnxt_ring_struct *ring;
> +
> + if (bnxt_get_max_rss_ring(bp) >= idx)
> + return -EOPNOTSUPP;
Hi David,
I guess there was some last-minute refactoring and these slipped through the
cracks. The two lines above seem a bit out of place here.
* idx doesn't exist in this context
* The return type of this function is void
> +
> + bnxt_free_one_rx_ring(bp, rxr);
> + bnxt_free_one_rx_agg_ring(bp, rxr);
> +
> + /* At this point, this NAPI instance has another page pool associated
> + * with it. Disconnect here before freeing the old page pool to avoid
> + * warnings.
> + */
> + rxr->page_pool->p.napi = NULL;
> + page_pool_destroy(rxr->page_pool);
> + rxr->page_pool = NULL;
> +
> + ring = &rxr->rx_ring_struct;
> + bnxt_free_ring(bp, &ring->ring_mem);
> +
> + ring = &rxr->rx_agg_ring_struct;
> + bnxt_free_ring(bp, &ring->ring_mem);
> +
> + kfree(rxr->rx_agg_bmap);
> + rxr->rx_agg_bmap = NULL;
> +}
...
* Re: [PATCH net-next v1 3/3] bnxt_en: implement netdev_queue_mgmt_ops
2024-06-14 20:07 ` Simon Horman
@ 2024-06-18 4:36 ` David Wei
0 siblings, 0 replies; 13+ messages in thread
From: David Wei @ 2024-06-18 4:36 UTC (permalink / raw)
To: Simon Horman
Cc: Michael Chan, Andy Gospodarek, Adrian Alvarado, Somnath Kotur,
netdev, Pavel Begunkov, Jakub Kicinski, David Ahern,
David S. Miller, Eric Dumazet, Paolo Abeni
On 2024-06-14 13:07, Simon Horman wrote:
> On Mon, Jun 10, 2024 at 07:33:24PM -0700, David Wei wrote:
>> Implement netdev_queue_mgmt_ops for bnxt added in [1].
>>
>> Two bnxt_rx_ring_info structs are allocated to hold the new/old queue
>> memory. Queue memory is copied from/to the main bp->rx_ring[idx]
>> bnxt_rx_ring_info.
>>
>> Queue memory is pre-allocated in bnxt_queue_mem_alloc() into a clone,
>> and then copied into bp->rx_ring[idx] in bnxt_queue_mem_start().
>>
>> Similarly, when bp->rx_ring[idx] is stopped its queue memory is copied
>> into a clone, and then freed later in bnxt_queue_mem_free().
>>
>> I tested this patchset with netdev_rx_queue_restart(), including
>> inducing errors in all places that returns an error code. In all cases,
>> the queue is left in a good working state.
>>
>> Rx queues are stopped/started using bnxt_hwrm_vnic_update(), which only
>> affects queues that are not in the default RSS context. This is
>> different to the GVE that also implemented the queue API recently where
>> arbitrary Rx queues can be stopped. Due to this limitation, all ndos
>> returns EOPNOTSUPP if the queue is in the default RSS context.
>>
>> Thanks to Somnath for helping me with using bnxt_hwrm_vnic_update() to
>> stop/start an Rx queue. With their permission I've added them as
>> Acked-by.
>>
>> [1]: https://lore.kernel.org/netdev/20240501232549.1327174-2-shailend@google.com/
>>
>> Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
>> Signed-off-by: David Wei <dw@davidwei.uk>
>> ---
>> drivers/net/ethernet/broadcom/bnxt/bnxt.c | 307 ++++++++++++++++++++++
>> 1 file changed, 307 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
>
> ...
>
>> +static void bnxt_queue_mem_free(struct net_device *dev, void *qmem)
>> +{
>> + struct bnxt_rx_ring_info *rxr = qmem;
>> + struct bnxt *bp = netdev_priv(dev);
>> + struct bnxt_ring_struct *ring;
>> +
>> + if (bnxt_get_max_rss_ring(bp) >= idx)
>> + return -EOPNOTSUPP;
>
> Hi David,
>
> I guess there was some last minute refactoring and these sloped through the
> cracks. The two lines above seem a bit out of place here.
>
> * idx doesn't exist in this context
> * The return type of this function is void
Hi Simon, thanks, you were absolutely right and these slipped through at
the end. Will be fixed in the next version.
>
>> +
>> + bnxt_free_one_rx_ring(bp, rxr);
>> + bnxt_free_one_rx_agg_ring(bp, rxr);
>> +
>> + /* At this point, this NAPI instance has another page pool associated
>> + * with it. Disconnect here before freeing the old page pool to avoid
>> + * warnings.
>> + */
>> + rxr->page_pool->p.napi = NULL;
>> + page_pool_destroy(rxr->page_pool);
>> + rxr->page_pool = NULL;
>> +
>> + ring = &rxr->rx_ring_struct;
>> + bnxt_free_ring(bp, &ring->ring_mem);
>> +
>> + ring = &rxr->rx_agg_ring_struct;
>> + bnxt_free_ring(bp, &ring->ring_mem);
>> +
>> + kfree(rxr->rx_agg_bmap);
>> + rxr->rx_agg_bmap = NULL;
>> +}
>
> ...