public inbox for netdev@vger.kernel.org
* [net-next PATCH v2 0/4] octeontx2: CN20K NPA Halo context support
@ 2026-03-19 11:47 Subbaraya Sundeep
  2026-03-19 11:47 ` [net-next PATCH v2 1/4] octeontx2-af: npa: cn20k: Add NPA Halo support Subbaraya Sundeep
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Subbaraya Sundeep @ 2026-03-19 11:47 UTC (permalink / raw)
  To: andrew+netdev, davem, edumazet, kuba, pabeni, sgoutham, gakula,
	bbhushan2
  Cc: netdev, linux-kernel, Subbaraya Sundeep

This series adds NPA Halo support for CN20K in the octeontx2 AF and
PF drivers. On CN20K, NPA supports a unified "Halo" context that combines
the aura and pool contexts in a single structure. This is a hardware
simplification that removes the need to initialize both an Aura and a
Pool context for each queue. Separate Aura and Pool contexts are only
needed when, for example, many Auras must point to a single Pool, but
the octeontx2 netdev driver always uses a 1:1 Aura-to-Pool mapping.
Hence, use the Halo context for netdevs on CN20K.

The series:

  1) Adds Halo context type, mbox handling, and halo_bmap tracking in AF.
  2) Adds 32 NPA DPC (diagnostic/performance) counters with
     per-LF permit registers, mbox alloc/free, and teardown handling.
  3) Adds debugfs for Halo (halo_ctx file and NPA context display/write
     for HALO ctype).
  4) Switches the CN20K PF driver to use the unified Halo context and
     allocates a DPC counter for the NPA LF.

Changes for v2:
 Addressed all AI review comments
 Removed inline and added const for npa_ctype_str() (as per Simon)
 Fixed a build warning flagged with W=1


Thanks,
Sundeep

Linu Cherian (3):
  octeontx2-af: npa: cn20k: Add NPA Halo support
  octeontx2-af: npa: cn20k: Add DPC support
  octeontx2-af: npa: cn20k: Add debugfs for Halo

Subbaraya Sundeep (1):
  octeontx2-pf: cn20k: Use unified Halo context

 .../ethernet/marvell/octeontx2/af/cn20k/api.h |   6 +
 .../marvell/octeontx2/af/cn20k/debugfs.c      |  60 +++++
 .../marvell/octeontx2/af/cn20k/debugfs.h      |   2 +
 .../ethernet/marvell/octeontx2/af/cn20k/npa.c | 143 ++++++++++++
 .../ethernet/marvell/octeontx2/af/cn20k/reg.h |   7 +
 .../marvell/octeontx2/af/cn20k/struct.h       |  81 +++++++
 .../net/ethernet/marvell/octeontx2/af/mbox.h  |  25 +++
 .../net/ethernet/marvell/octeontx2/af/rvu.h   |   5 +
 .../marvell/octeontx2/af/rvu_debugfs.c        |  74 ++++++-
 .../ethernet/marvell/octeontx2/af/rvu_npa.c   |  77 ++++++-
 .../marvell/octeontx2/af/rvu_struct.h         |   1 +
 .../ethernet/marvell/octeontx2/nic/cn20k.c    | 207 +++++++++---------
 .../ethernet/marvell/octeontx2/nic/cn20k.h    |   3 +
 .../marvell/octeontx2/nic/otx2_common.h       |   2 +
 .../ethernet/marvell/octeontx2/nic/otx2_pf.c  |   6 +
 15 files changed, 580 insertions(+), 119 deletions(-)

-- 
2.48.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [net-next PATCH v2 1/4] octeontx2-af: npa: cn20k: Add NPA Halo support
  2026-03-19 11:47 [net-next PATCH v2 0/4] octeontx2: CN20K NPA Halo context support Subbaraya Sundeep
@ 2026-03-19 11:47 ` Subbaraya Sundeep
  2026-03-20 16:52   ` Simon Horman
  2026-03-19 11:47 ` [net-next PATCH v2 2/4] octeontx2-af: npa: cn20k: Add DPC support Subbaraya Sundeep
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Subbaraya Sundeep @ 2026-03-19 11:47 UTC (permalink / raw)
  To: andrew+netdev, davem, edumazet, kuba, pabeni, sgoutham, gakula,
	bbhushan2
  Cc: netdev, linux-kernel, Linu Cherian, Subbaraya Sundeep

From: Linu Cherian <lcherian@marvell.com>

CN20K silicon implements a unified aura and pool context
type called Halo for better resource usage. Add support for
Halo context type operations.

Signed-off-by: Linu Cherian <lcherian@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
---
 .../ethernet/marvell/octeontx2/af/cn20k/npa.c | 27 +++++++
 .../marvell/octeontx2/af/cn20k/struct.h       | 81 +++++++++++++++++++
 .../net/ethernet/marvell/octeontx2/af/mbox.h  |  6 ++
 .../net/ethernet/marvell/octeontx2/af/rvu.h   |  2 +
 .../ethernet/marvell/octeontx2/af/rvu_npa.c   | 63 +++++++++++++--
 .../marvell/octeontx2/af/rvu_struct.h         |  1 +
 6 files changed, 173 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npa.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npa.c
index fe8f926c8b75..c963f43dc7b0 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npa.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npa.c
@@ -19,3 +19,30 @@ int rvu_mbox_handler_npa_cn20k_aq_enq(struct rvu *rvu,
 				   (struct npa_aq_enq_rsp *)rsp);
 }
 EXPORT_SYMBOL(rvu_mbox_handler_npa_cn20k_aq_enq);
+
+int rvu_npa_halo_hwctx_disable(struct npa_aq_enq_req *req)
+{
+	struct npa_cn20k_aq_enq_req *hreq;
+
+	hreq = (struct npa_cn20k_aq_enq_req *)req;
+
+	hreq->halo.bp_ena_0 = 0;
+	hreq->halo.bp_ena_1 = 0;
+	hreq->halo.bp_ena_2 = 0;
+	hreq->halo.bp_ena_3 = 0;
+	hreq->halo.bp_ena_4 = 0;
+	hreq->halo.bp_ena_5 = 0;
+	hreq->halo.bp_ena_6 = 0;
+	hreq->halo.bp_ena_7 = 0;
+
+	hreq->halo_mask.bp_ena_0 = 1;
+	hreq->halo_mask.bp_ena_1 = 1;
+	hreq->halo_mask.bp_ena_2 = 1;
+	hreq->halo_mask.bp_ena_3 = 1;
+	hreq->halo_mask.bp_ena_4 = 1;
+	hreq->halo_mask.bp_ena_5 = 1;
+	hreq->halo_mask.bp_ena_6 = 1;
+	hreq->halo_mask.bp_ena_7 = 1;
+
+	return 0;
+}
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/struct.h b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/struct.h
index 763f6cabd7c2..2364bafd329d 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/struct.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/struct.h
@@ -377,4 +377,85 @@ struct npa_cn20k_pool_s {
 
 static_assert(sizeof(struct npa_cn20k_pool_s) == NIX_MAX_CTX_SIZE);
 
+struct npa_cn20k_halo_s {
+	u64 stack_base                  : 64;
+	u64 ena                         :  1;
+	u64 nat_align                   :  1;
+	u64 reserved_66_67              :  2;
+	u64 stack_caching               :  1;
+	u64 reserved_69_71              :  3;
+	u64 aura_drop_ena               :  1;
+	u64 reserved_73_79              :  7;
+	u64 aura_drop                   :  8;
+	u64 buf_offset                  : 12;
+	u64 reserved_100_103            :  4;
+	u64 buf_size                    : 12;
+	u64 reserved_116_119            :  4;
+	u64 ref_cnt_prof                :  3;
+	u64 reserved_123_127            :  5;
+	u64 stack_max_pages             : 32;
+	u64 stack_pages                 : 32;
+	u64 bp_0                        :  7;
+	u64 bp_1                        :  7;
+	u64 bp_2                        :  7;
+	u64 bp_3                        :  7;
+	u64 bp_4                        :  7;
+	u64 bp_5                        :  7;
+	u64 bp_6                        :  7;
+	u64 bp_7                        :  7;
+	u64 bp_ena_0                    :  1;
+	u64 bp_ena_1                    :  1;
+	u64 bp_ena_2                    :  1;
+	u64 bp_ena_3                    :  1;
+	u64 bp_ena_4                    :  1;
+	u64 bp_ena_5                    :  1;
+	u64 bp_ena_6                    :  1;
+	u64 bp_ena_7                    :  1;
+	u64 stack_offset                :  4;
+	u64 reserved_260_263            :  4;
+	u64 shift                       :  6;
+	u64 reserved_270_271            :  2;
+	u64 avg_level                   :  8;
+	u64 avg_con                     :  9;
+	u64 fc_ena                      :  1;
+	u64 fc_stype                    :  2;
+	u64 fc_hyst_bits                :  4;
+	u64 fc_up_crossing              :  1;
+	u64 reserved_297_299            :  3;
+	u64 update_time                 : 16;
+	u64 reserved_316_319            :  4;
+	u64 fc_addr                     : 64;
+	u64 ptr_start                   : 64;
+	u64 ptr_end                     : 64;
+	u64 bpid_0                      : 12;
+	u64 reserved_524_535            : 12;
+	u64 err_int                     :  8;
+	u64 err_int_ena                 :  8;
+	u64 thresh_int                  :  1;
+	u64 thresh_int_ena              :  1;
+	u64 thresh_up                   :  1;
+	u64 reserved_555                :  1;
+	u64 thresh_qint_idx             :  7;
+	u64 reserved_563                :  1;
+	u64 err_qint_idx                :  7;
+	u64 reserved_571_575            :  5;
+	u64 thresh                      : 36;
+	u64 reserved_612_615            :  4;
+	u64 fc_msh_dst                  : 11;
+	u64 reserved_627_630            :  4;
+	u64 op_dpc_ena                  :  1;
+	u64 op_dpc_set                  :  5;
+	u64 reserved_637_637            :  1;
+	u64 stream_ctx                  :  1;
+	u64 unified_ctx                 :  1;
+	u64 reserved_640_703            : 64;
+	u64 reserved_704_767            : 64;
+	u64 reserved_768_831            : 64;
+	u64 reserved_832_895            : 64;
+	u64 reserved_896_959            : 64;
+	u64 reserved_960_1023           : 64;
+};
+
+static_assert(sizeof(struct npa_cn20k_halo_s) == NIX_MAX_CTX_SIZE);
+
 #endif
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/mbox.h b/drivers/net/ethernet/marvell/octeontx2/af/mbox.h
index dc42c81c0942..4a97bd93d882 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/mbox.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/mbox.h
@@ -884,6 +884,8 @@ struct npa_cn20k_aq_enq_req {
 		struct npa_cn20k_aura_s aura;
 		/* Valid when op == WRITE/INIT and ctype == POOL */
 		struct npa_cn20k_pool_s pool;
+		/* Valid when op == WRITE/INIT and ctype == HALO */
+		struct npa_cn20k_halo_s halo;
 	};
 	/* Mask data when op == WRITE (1=write, 0=don't write) */
 	union {
@@ -891,6 +893,8 @@ struct npa_cn20k_aq_enq_req {
 		struct npa_cn20k_aura_s aura_mask;
 		/* Valid when op == WRITE and ctype == POOL */
 		struct npa_cn20k_pool_s pool_mask;
+		/* Valid when op == WRITE/INIT and ctype == HALO */
+		struct npa_cn20k_halo_s halo_mask;
 	};
 };
 
@@ -901,6 +905,8 @@ struct npa_cn20k_aq_enq_rsp {
 		struct npa_cn20k_aura_s aura;
 		/* Valid when op == READ and ctype == POOL */
 		struct npa_cn20k_pool_s pool;
+		/* Valid when op == READ and ctype == HALO */
+		struct npa_cn20k_halo_s halo;
 	};
 };
 
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu.h b/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
index a466181cf908..36a71d32b894 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
@@ -261,6 +261,7 @@ struct rvu_pfvf {
 	struct qmem	*pool_ctx;
 	struct qmem	*npa_qints_ctx;
 	unsigned long	*aura_bmap;
+	unsigned long	*halo_bmap; /* Aura and Halo are mutually exclusive */
 	unsigned long	*pool_bmap;
 
 	/* NIX contexts */
@@ -1008,6 +1009,7 @@ void rvu_npa_freemem(struct rvu *rvu);
 void rvu_npa_lf_teardown(struct rvu *rvu, u16 pcifunc, int npalf);
 int rvu_npa_aq_enq_inst(struct rvu *rvu, struct npa_aq_enq_req *req,
 			struct npa_aq_enq_rsp *rsp);
+int rvu_npa_halo_hwctx_disable(struct npa_aq_enq_req *req);
 
 /* NIX APIs */
 bool is_nixlf_attached(struct rvu *rvu, u16 pcifunc);
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npa.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npa.c
index e2a33e46b48a..96904b8eea62 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npa.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npa.c
@@ -12,6 +12,11 @@
 #include "rvu_reg.h"
 #include "rvu.h"
 
+static inline bool npa_ctype_invalid(struct rvu *rvu, int ctype)
+{
+	return !is_cn20k(rvu->pdev) && ctype == NPA_AQ_CTYPE_HALO;
+}
+
 static int npa_aq_enqueue_wait(struct rvu *rvu, struct rvu_block *block,
 			       struct npa_aq_inst_s *inst)
 {
@@ -72,13 +77,19 @@ int rvu_npa_aq_enq_inst(struct rvu *rvu, struct npa_aq_enq_req *req,
 	bool ena;
 
 	pfvf = rvu_get_pfvf(rvu, pcifunc);
-	if (!pfvf->aura_ctx || req->aura_id >= pfvf->aura_ctx->qsize)
+	if (!pfvf->aura_ctx || req->aura_id >= pfvf->aura_ctx->qsize ||
+	    npa_ctype_invalid(rvu, req->ctype))
 		return NPA_AF_ERR_AQ_ENQUEUE;
 
 	blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPA, pcifunc);
 	if (!pfvf->npalf || blkaddr < 0)
 		return NPA_AF_ERR_AF_LF_INVALID;
 
+	/* Ensure halo bitmap is exclusive to halo ctype */
+	if (is_cn20k(rvu->pdev) && req->ctype != NPA_AQ_CTYPE_HALO &&
+	    test_bit(req->aura_id, pfvf->halo_bmap))
+		return NPA_AF_ERR_AQ_ENQUEUE;
+
 	block = &hw->block[blkaddr];
 	aq = block->aq;
 	if (!aq) {
@@ -119,7 +130,7 @@ int rvu_npa_aq_enq_inst(struct rvu *rvu, struct npa_aq_enq_req *req,
 			memcpy(mask, &req->aura_mask,
 			       sizeof(struct npa_aura_s));
 			memcpy(ctx, &req->aura, sizeof(struct npa_aura_s));
-		} else {
+		} else { /* Applies to pool and halo since size is same */
 			memcpy(mask, &req->pool_mask,
 			       sizeof(struct npa_pool_s));
 			memcpy(ctx, &req->pool, sizeof(struct npa_pool_s));
@@ -135,7 +146,7 @@ int rvu_npa_aq_enq_inst(struct rvu *rvu, struct npa_aq_enq_req *req,
 			req->aura.pool_addr = pfvf->pool_ctx->iova +
 			(req->aura.pool_addr * pfvf->pool_ctx->entry_sz);
 			memcpy(ctx, &req->aura, sizeof(struct npa_aura_s));
-		} else { /* POOL's context */
+		} else { /* Applies to pool and halo since size is same */
 			memcpy(ctx, &req->pool, sizeof(struct npa_pool_s));
 		}
 		break;
@@ -176,6 +187,20 @@ int rvu_npa_aq_enq_inst(struct rvu *rvu, struct npa_aq_enq_req *req,
 		}
 	}
 
+	if (req->ctype == NPA_AQ_CTYPE_HALO) {
+		if (req->op == NPA_AQ_INSTOP_INIT && req->aura.ena)
+			__set_bit(req->aura_id, pfvf->halo_bmap);
+		if (req->op == NPA_AQ_INSTOP_WRITE) {
+			ena = (req->aura.ena & req->aura_mask.ena) |
+				(test_bit(req->aura_id, pfvf->halo_bmap) &
+				~req->aura_mask.ena);
+			if (ena)
+				__set_bit(req->aura_id, pfvf->halo_bmap);
+			else
+				__clear_bit(req->aura_id, pfvf->halo_bmap);
+		}
+	}
+
 	/* Set pool bitmap if pool hw context is enabled */
 	if (req->ctype == NPA_AQ_CTYPE_POOL) {
 		if (req->op == NPA_AQ_INSTOP_INIT && req->pool.ena)
@@ -198,7 +223,7 @@ int rvu_npa_aq_enq_inst(struct rvu *rvu, struct npa_aq_enq_req *req,
 			if (req->ctype == NPA_AQ_CTYPE_AURA)
 				memcpy(&rsp->aura, ctx,
 				       sizeof(struct npa_aura_s));
-			else
+			else /* Applies to pool and halo since size is same */
 				memcpy(&rsp->pool, ctx,
 				       sizeof(struct npa_pool_s));
 		}
@@ -210,12 +235,14 @@ int rvu_npa_aq_enq_inst(struct rvu *rvu, struct npa_aq_enq_req *req,
 static int npa_lf_hwctx_disable(struct rvu *rvu, struct hwctx_disable_req *req)
 {
 	struct rvu_pfvf *pfvf = rvu_get_pfvf(rvu, req->hdr.pcifunc);
+	const char *context = "Unknown";
 	struct npa_aq_enq_req aq_req;
 	unsigned long *bmap;
 	int id, cnt = 0;
 	int err = 0, rc;
 
-	if (!pfvf->pool_ctx || !pfvf->aura_ctx)
+	if (!pfvf->pool_ctx || !pfvf->aura_ctx ||
+	    npa_ctype_invalid(rvu, req->ctype))
 		return NPA_AF_ERR_AQ_ENQUEUE;
 
 	memset(&aq_req, 0, sizeof(struct npa_aq_enq_req));
@@ -226,6 +253,7 @@ static int npa_lf_hwctx_disable(struct rvu *rvu, struct hwctx_disable_req *req)
 		aq_req.pool_mask.ena = 1;
 		cnt = pfvf->pool_ctx->qsize;
 		bmap = pfvf->pool_bmap;
+		context = "Pool";
 	} else if (req->ctype == NPA_AQ_CTYPE_AURA) {
 		aq_req.aura.ena = 0;
 		aq_req.aura_mask.ena = 1;
@@ -233,6 +261,14 @@ static int npa_lf_hwctx_disable(struct rvu *rvu, struct hwctx_disable_req *req)
 		aq_req.aura_mask.bp_ena = 1;
 		cnt = pfvf->aura_ctx->qsize;
 		bmap = pfvf->aura_bmap;
+		context = "Aura";
+	} else if (req->ctype == NPA_AQ_CTYPE_HALO) {
+		aq_req.aura.ena = 0;
+		aq_req.aura_mask.ena = 1;
+		rvu_npa_halo_hwctx_disable(&aq_req);
+		cnt = pfvf->aura_ctx->qsize;
+		bmap = pfvf->halo_bmap;
+		context = "Halo";
 	}
 
 	aq_req.ctype = req->ctype;
@@ -246,8 +282,7 @@ static int npa_lf_hwctx_disable(struct rvu *rvu, struct hwctx_disable_req *req)
 		if (rc) {
 			err = rc;
 			dev_err(rvu->dev, "Failed to disable %s:%d context\n",
-				(req->ctype == NPA_AQ_CTYPE_AURA) ?
-				"Aura" : "Pool", id);
+				context, id);
 		}
 	}
 
@@ -311,6 +346,9 @@ static void npa_ctx_free(struct rvu *rvu, struct rvu_pfvf *pfvf)
 	kfree(pfvf->aura_bmap);
 	pfvf->aura_bmap = NULL;
 
+	kfree(pfvf->halo_bmap);
+	pfvf->halo_bmap = NULL;
+
 	qmem_free(rvu->dev, pfvf->aura_ctx);
 	pfvf->aura_ctx = NULL;
 
@@ -374,6 +412,13 @@ int rvu_mbox_handler_npa_lf_alloc(struct rvu *rvu,
 	if (!pfvf->aura_bmap)
 		goto free_mem;
 
+	if (is_cn20k(rvu->pdev)) {
+		pfvf->halo_bmap = kcalloc(NPA_AURA_COUNT(req->aura_sz),
+					  sizeof(long), GFP_KERNEL);
+		if (!pfvf->halo_bmap)
+			goto free_mem;
+	}
+
 	/* Alloc memory for pool HW contexts */
 	hwctx_size = 1UL << ((ctx_cfg >> 4) & 0xF);
 	err = qmem_alloc(rvu->dev, &pfvf->pool_ctx, req->nr_pools, hwctx_size);
@@ -562,6 +607,10 @@ void rvu_npa_lf_teardown(struct rvu *rvu, u16 pcifunc, int npalf)
 	ctx_req.ctype = NPA_AQ_CTYPE_AURA;
 	npa_lf_hwctx_disable(rvu, &ctx_req);
 
+	/* Disable all Halos */
+	ctx_req.ctype = NPA_AQ_CTYPE_HALO;
+	npa_lf_hwctx_disable(rvu, &ctx_req);
+
 	npa_ctx_free(rvu, pfvf);
 }
 
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_struct.h b/drivers/net/ethernet/marvell/octeontx2/af/rvu_struct.h
index 8e868f815de1..d37cf2cf0fee 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_struct.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_struct.h
@@ -130,6 +130,7 @@ enum npa_aq_comp {
 enum npa_aq_ctype {
 	NPA_AQ_CTYPE_AURA = 0x0,
 	NPA_AQ_CTYPE_POOL = 0x1,
+	NPA_AQ_CTYPE_HALO = 0x2,
 };
 
 /* NPA admin queue instruction opcodes */
-- 
2.48.1



* [net-next PATCH v2 2/4] octeontx2-af: npa: cn20k: Add DPC support
  2026-03-19 11:47 [net-next PATCH v2 0/4] octeontx2: CN20K NPA Halo context support Subbaraya Sundeep
  2026-03-19 11:47 ` [net-next PATCH v2 1/4] octeontx2-af: npa: cn20k: Add NPA Halo support Subbaraya Sundeep
@ 2026-03-19 11:47 ` Subbaraya Sundeep
  2026-03-20 16:50   ` Simon Horman
  2026-03-19 11:47 ` [net-next PATCH v2 3/4] octeontx2-af: npa: cn20k: Add debugfs for Halo Subbaraya Sundeep
  2026-03-19 11:47 ` [net-next PATCH v2 4/4] octeontx2-pf: cn20k: Use unified Halo context Subbaraya Sundeep
  3 siblings, 1 reply; 9+ messages in thread
From: Subbaraya Sundeep @ 2026-03-19 11:47 UTC (permalink / raw)
  To: andrew+netdev, davem, edumazet, kuba, pabeni, sgoutham, gakula,
	bbhushan2
  Cc: netdev, linux-kernel, Linu Cherian, Subbaraya Sundeep

From: Linu Cherian <lcherian@marvell.com>

CN20K introduces 32 diagnostic and performance
counters that are shared across all NPA LFs.

Since the counters are shared, each PF driver needs to
request a counter with the required configuration from the
AF, which then allocates a counter and maps it to the
respective LF with that configuration.

Add new mbox messages, npa_cn20k_dpc_alloc/free, to handle this.

Also ensure that all LF to DPC counter mappings are cleared
when the LF is freed or torn down.
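
For reference, the permit-register indexing in this patch packs two
64-bit LF-permit registers per counter: LFs 0-63 select the even
register, LFs 64-127 the odd one, with one grant bit per LF. A
standalone sketch of that arithmetic (plain C mirroring the patch,
illustration only, not kernel code):

```c
#include <assert.h>
#include <stdint.h>

#define NPA_DPC_LFS_PER_REG 64

/* NPA_AF_DPC_PERMITX register index for a (counter, LF) pair:
 * two permit registers per counter, selected by lf / 64.
 */
static int dpc_permit_idx(int cntr, int lf)
{
	return 2 * cntr + (lf >> 6);
}

/* Bit within that permit register granting access to this LF. */
static uint64_t dpc_lf_mask(int lf)
{
	return 1ULL << (lf % NPA_DPC_LFS_PER_REG);
}
```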

Signed-off-by: Linu Cherian <lcherian@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
---
 .../ethernet/marvell/octeontx2/af/cn20k/api.h |   6 +
 .../ethernet/marvell/octeontx2/af/cn20k/npa.c | 116 ++++++++++++++++++
 .../ethernet/marvell/octeontx2/af/cn20k/reg.h |   7 ++
 .../net/ethernet/marvell/octeontx2/af/mbox.h  |  19 +++
 .../net/ethernet/marvell/octeontx2/af/rvu.h   |   3 +
 .../ethernet/marvell/octeontx2/af/rvu_npa.c   |  14 ++-
 6 files changed, 164 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/api.h b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/api.h
index 4285b5d6a6a2..b13e7628f767 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/api.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/api.h
@@ -29,4 +29,10 @@ int cn20k_mbox_setup(struct otx2_mbox *mbox, struct pci_dev *pdev,
 		     void *reg_base, int direction, int ndevs);
 void cn20k_rvu_enable_afvf_intr(struct rvu *rvu, int vfs);
 void cn20k_rvu_disable_afvf_intr(struct rvu *rvu, int vfs);
+
+int npa_cn20k_dpc_alloc(struct rvu *rvu, struct npa_cn20k_dpc_alloc_req *req,
+			struct npa_cn20k_dpc_alloc_rsp *rsp);
+int npa_cn20k_dpc_free(struct rvu *rvu, struct npa_cn20k_dpc_free_req *req);
+void npa_cn20k_dpc_free_all(struct rvu *rvu, u16 pcifunc);
+
 #endif /* CN20K_API_H */
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npa.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npa.c
index c963f43dc7b0..1def2504872f 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npa.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npa.c
@@ -8,6 +8,8 @@
 #include <linux/module.h>
 #include <linux/pci.h>
 
+#include "cn20k/api.h"
+#include "cn20k/reg.h"
 #include "struct.h"
 #include "../rvu.h"
 
@@ -46,3 +48,117 @@ int rvu_npa_halo_hwctx_disable(struct npa_aq_enq_req *req)
 
 	return 0;
 }
+
+int npa_cn20k_dpc_alloc(struct rvu *rvu, struct npa_cn20k_dpc_alloc_req *req,
+			struct npa_cn20k_dpc_alloc_rsp *rsp)
+{
+	struct rvu_hwinfo *hw = rvu->hw;
+	u16 pcifunc = req->hdr.pcifunc;
+	int cntr, lf, blkaddr, ridx;
+	struct rvu_block *block;
+	struct rvu_pfvf *pfvf;
+	u64 val, lfmask;
+
+	pfvf = rvu_get_pfvf(rvu, pcifunc);
+
+	blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPA, 0);
+	if (!pfvf->npalf || blkaddr < 0)
+		return NPA_AF_ERR_AF_LF_INVALID;
+
+	block = &hw->block[blkaddr];
+	lf = rvu_get_lf(rvu, block, pcifunc, 0);
+	if (lf < 0)
+		return NPA_AF_ERR_AF_LF_INVALID;
+
+	/* allocate a new counter */
+	cntr = rvu_alloc_rsrc(&rvu->npa_dpc);
+	if (cntr < 0)
+		return cntr;
+	rsp->cntr_id = cntr;
+
+	/* DPC counter config */
+	rvu_write64(rvu, blkaddr, NPA_AF_DPCX_CFG(cntr), req->dpc_conf);
+
+	/* 0 to 63 lfs -> idx 0, 64 - 127 lfs -> idx 1 */
+	ridx = lf >> 6;
+	lfmask = BIT_ULL(ridx ? lf - NPA_DPC_LFS_PER_REG : lf);
+
+	ridx = 2 * cntr + ridx;
+	/* Give permission for LF access */
+	val = rvu_read64(rvu, blkaddr, NPA_AF_DPC_PERMITX(ridx));
+	val |= lfmask;
+	rvu_write64(rvu, blkaddr, NPA_AF_DPC_PERMITX(ridx), val);
+
+	return 0;
+}
+
+int rvu_mbox_handler_npa_cn20k_dpc_alloc(struct rvu *rvu,
+					 struct npa_cn20k_dpc_alloc_req *req,
+					 struct npa_cn20k_dpc_alloc_rsp *rsp)
+{
+	return npa_cn20k_dpc_alloc(rvu, req, rsp);
+}
+
+int npa_cn20k_dpc_free(struct rvu *rvu, struct npa_cn20k_dpc_free_req *req)
+{
+	struct rvu_hwinfo *hw = rvu->hw;
+	u16 pcifunc = req->hdr.pcifunc;
+	int cntr, lf, blkaddr, ridx;
+	struct rvu_block *block;
+	struct rvu_pfvf *pfvf;
+	u64 val, lfmask;
+
+	pfvf = rvu_get_pfvf(rvu, pcifunc);
+
+	blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPA, 0);
+	if (!pfvf->npalf || blkaddr < 0)
+		return NPA_AF_ERR_AF_LF_INVALID;
+
+	block = &hw->block[blkaddr];
+	lf = rvu_get_lf(rvu, block, pcifunc, 0);
+	if (lf < 0)
+		return NPA_AF_ERR_AF_LF_INVALID;
+
+	if (req->cntr_id >= NPA_DPC_MAX)
+		return NPA_AF_ERR_PARAM;
+
+	/* 0 to 63 lfs -> idx 0, 64 - 127 lfs -> idx 1 */
+	ridx = lf >> 6;
+	lfmask = BIT_ULL(ridx ? lf - NPA_DPC_LFS_PER_REG : lf);
+	cntr = req->cntr_id;
+
+	ridx = 2 * cntr + ridx;
+
+	val = rvu_read64(rvu, blkaddr, NPA_AF_DPC_PERMITX(ridx));
+	/* Check if the counter is allotted to this LF */
+	if (!(val & lfmask))
+		return 0;
+
+	/* Revert permission */
+	val &= ~lfmask;
+	rvu_write64(rvu, blkaddr, NPA_AF_DPC_PERMITX(ridx), val);
+
+	/* Free this counter */
+	rvu_free_rsrc(&rvu->npa_dpc, req->cntr_id);
+
+	return 0;
+}
+
+void npa_cn20k_dpc_free_all(struct rvu *rvu, u16 pcifunc)
+{
+	struct npa_cn20k_dpc_free_req req;
+	int i;
+
+	req.hdr.pcifunc = pcifunc;
+	for (i = 0; i < NPA_DPC_MAX; i++) {
+		req.cntr_id = i;
+		npa_cn20k_dpc_free(rvu, &req);
+	}
+}
+
+int rvu_mbox_handler_npa_cn20k_dpc_free(struct rvu *rvu,
+					struct npa_cn20k_dpc_free_req *req,
+					struct msg_rsp *rsp)
+{
+	return npa_cn20k_dpc_free(rvu, req);
+}
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/reg.h b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/reg.h
index 8bfaa507ee50..9b49e376878e 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/reg.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/reg.h
@@ -143,4 +143,11 @@
 	offset = (0xb000000ull | (a) << 4 | (b) << 20);		\
 	offset; })
 
+/* NPA Registers */
+#define NPA_AF_DPCX_CFG(a)		(0x800 | (a) << 6)
+#define NPA_AF_DPC_PERMITX(a)		(0x1000 | (a) << 3)
+
+#define NPA_DPC_MAX			32
+#define NPA_DPC_LFS_PER_REG		64
+
 #endif /* RVU_MBOX_REG_H */
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/mbox.h b/drivers/net/ethernet/marvell/octeontx2/af/mbox.h
index 4a97bd93d882..b29ec26b66b7 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/mbox.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/mbox.h
@@ -213,6 +213,10 @@ M(NPA_AQ_ENQ,		0x402, npa_aq_enq, npa_aq_enq_req, npa_aq_enq_rsp)   \
 M(NPA_HWCTX_DISABLE,	0x403, npa_hwctx_disable, hwctx_disable_req, msg_rsp)\
 M(NPA_CN20K_AQ_ENQ,	0x404, npa_cn20k_aq_enq, npa_cn20k_aq_enq_req,	\
 				npa_cn20k_aq_enq_rsp)			\
+M(NPA_CN20K_DPC_ALLOC,	0x405, npa_cn20k_dpc_alloc, npa_cn20k_dpc_alloc_req, \
+				npa_cn20k_dpc_alloc_rsp)		\
+M(NPA_CN20K_DPC_FREE,	0x406, npa_cn20k_dpc_free, npa_cn20k_dpc_free_req, \
+				msg_rsp)				\
 /* SSO/SSOW mbox IDs (range 0x600 - 0x7FF) */				\
 /* TIM mbox IDs (range 0x800 - 0x9FF) */				\
 /* CPT mbox IDs (range 0xA00 - 0xBFF) */				\
@@ -910,6 +914,21 @@ struct npa_cn20k_aq_enq_rsp {
 	};
 };
 
+struct npa_cn20k_dpc_alloc_req {
+	struct mbox_msghdr hdr;
+	u16 dpc_conf;
+};
+
+struct npa_cn20k_dpc_alloc_rsp {
+	struct mbox_msghdr hdr;
+	u8 cntr_id;
+};
+
+struct npa_cn20k_dpc_free_req {
+	struct mbox_msghdr hdr;
+	u8 cntr_id;
+};
+
 /* Disable all contexts of type 'ctype' */
 struct hwctx_disable_req {
 	struct mbox_msghdr hdr;
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu.h b/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
index 36a71d32b894..0299fa1bd3bc 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
@@ -663,6 +663,9 @@ struct rvu {
 	/* CPT interrupt lock */
 	spinlock_t		cpt_intr_lock;
 
+	/* NPA */
+	struct rsrc_bmap	npa_dpc;
+
 	struct mutex		mbox_lock; /* Serialize mbox up and down msgs */
 	u16			rep_pcifunc;
 	bool			altaf_ready;
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npa.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npa.c
index 96904b8eea62..3cd24226007b 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npa.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npa.c
@@ -8,6 +8,8 @@
 #include <linux/module.h>
 #include <linux/pci.h>
 
+#include "cn20k/api.h"
+#include "cn20k/reg.h"
 #include "rvu_struct.h"
 #include "rvu_reg.h"
 #include "rvu.h"
@@ -504,6 +506,8 @@ int rvu_mbox_handler_npa_lf_free(struct rvu *rvu, struct msg_req *req,
 		return NPA_AF_ERR_LF_RESET;
 	}
 
+	if (is_cn20k(rvu->pdev))
+		npa_cn20k_dpc_free_all(rvu, pcifunc);
 	npa_ctx_free(rvu, pfvf);
 
 	return 0;
@@ -569,12 +573,17 @@ static int npa_aq_init(struct rvu *rvu, struct rvu_block *block)
 int rvu_npa_init(struct rvu *rvu)
 {
 	struct rvu_hwinfo *hw = rvu->hw;
-	int blkaddr;
+	int err, blkaddr;
 
 	blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPA, 0);
 	if (blkaddr < 0)
 		return 0;
 
+	rvu->npa_dpc.max = NPA_DPC_MAX;
+	err = rvu_alloc_bitmap(&rvu->npa_dpc);
+	if (err)
+		return err;
+
 	/* Initialize admin queue */
 	return npa_aq_init(rvu, &hw->block[blkaddr]);
 }
@@ -591,6 +600,7 @@ void rvu_npa_freemem(struct rvu *rvu)
 
 	block = &hw->block[blkaddr];
 	rvu_aq_free(rvu, block->aq);
+	kfree(rvu->npa_dpc.bmap);
 }
 
 void rvu_npa_lf_teardown(struct rvu *rvu, u16 pcifunc, int npalf)
@@ -611,6 +621,8 @@ void rvu_npa_lf_teardown(struct rvu *rvu, u16 pcifunc, int npalf)
 	ctx_req.ctype = NPA_AQ_CTYPE_HALO;
 	npa_lf_hwctx_disable(rvu, &ctx_req);
 
+	if (is_cn20k(rvu->pdev))
+		npa_cn20k_dpc_free_all(rvu, pcifunc);
 	npa_ctx_free(rvu, pfvf);
 }
 
-- 
2.48.1



* [net-next PATCH v2 3/4] octeontx2-af: npa: cn20k: Add debugfs for Halo
  2026-03-19 11:47 [net-next PATCH v2 0/4] octeontx2: CN20K NPA Halo context support Subbaraya Sundeep
  2026-03-19 11:47 ` [net-next PATCH v2 1/4] octeontx2-af: npa: cn20k: Add NPA Halo support Subbaraya Sundeep
  2026-03-19 11:47 ` [net-next PATCH v2 2/4] octeontx2-af: npa: cn20k: Add DPC support Subbaraya Sundeep
@ 2026-03-19 11:47 ` Subbaraya Sundeep
  2026-03-19 11:47 ` [net-next PATCH v2 4/4] octeontx2-pf: cn20k: Use unified Halo context Subbaraya Sundeep
  3 siblings, 0 replies; 9+ messages in thread
From: Subbaraya Sundeep @ 2026-03-19 11:47 UTC (permalink / raw)
  To: andrew+netdev, davem, edumazet, kuba, pabeni, sgoutham, gakula,
	bbhushan2
  Cc: netdev, linux-kernel, Linu Cherian, Subbaraya Sundeep

From: Linu Cherian <lcherian@marvell.com>

Similar to other hardware contexts, add debugfs support for
the unified Halo context.

Sample output on CN20K:
/sys/kernel/debug/cn20k/npa # cat halo_ctx
======halo : 2=======
W0: Stack base          ffffff790000
W1: ena                 1
W1: nat_align           0
W1: stack_caching       1
W1: aura drop ena       0
W1: aura drop           0
W1: buf_offset          0
W1: buf_size            32
W1: ref_cnt_prof                0
W2: stack_max_pages     13
W2: stack_pages         11
W3: bp_0                0
W3: bp_1                0
W3: bp_2                0

snip ..

Signed-off-by: Linu Cherian <lcherian@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
---
 .../marvell/octeontx2/af/cn20k/debugfs.c      | 60 +++++++++++++++
 .../marvell/octeontx2/af/cn20k/debugfs.h      |  2 +
 .../marvell/octeontx2/af/rvu_debugfs.c        | 74 +++++++++++++++++--
 3 files changed, 128 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c
index 3debf2fae1a4..c0cfd3a39c23 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c
@@ -489,3 +489,63 @@ void print_npa_cn20k_pool_ctx(struct seq_file *m,
 		   pool->thresh_qint_idx, pool->err_qint_idx);
 	seq_printf(m, "W8: fc_msh_dst\t\t%d\n", pool->fc_msh_dst);
 }
+
+void print_npa_cn20k_halo_ctx(struct seq_file *m, struct npa_aq_enq_rsp *rsp)
+{
+	struct npa_cn20k_aq_enq_rsp *cn20k_rsp;
+	struct npa_cn20k_halo_s *halo;
+
+	cn20k_rsp = (struct npa_cn20k_aq_enq_rsp *)rsp;
+	halo = &cn20k_rsp->halo;
+
+	seq_printf(m, "W0: Stack base\t\t%llx\n", halo->stack_base);
+
+	seq_printf(m, "W1: ena \t\t%d\nW1: nat_align \t\t%d\n",
+		   halo->ena, halo->nat_align);
+	seq_printf(m, "W1: stack_caching\t%d\n",
+		   halo->stack_caching);
+	seq_printf(m, "W1: aura drop ena\t%d\n", halo->aura_drop_ena);
+	seq_printf(m, "W1: aura drop\t\t%d\n", halo->aura_drop);
+	seq_printf(m, "W1: buf_offset\t\t%d\nW1: buf_size\t\t%d\n",
+		   halo->buf_offset, halo->buf_size);
+	seq_printf(m, "W1: ref_cnt_prof\t\t%d\n", halo->ref_cnt_prof);
+	seq_printf(m, "W2: stack_max_pages \t%d\nW2: stack_pages\t\t%d\n",
+		   halo->stack_max_pages, halo->stack_pages);
+	seq_printf(m, "W3: bp_0\t\t%d\nW3: bp_1\t\t%d\nW3: bp_2\t\t%d\n",
+		   halo->bp_0, halo->bp_1, halo->bp_2);
+	seq_printf(m, "W3: bp_3\t\t%d\nW3: bp_4\t\t%d\nW3: bp_5\t\t%d\n",
+		   halo->bp_3, halo->bp_4, halo->bp_5);
+	seq_printf(m, "W3: bp_6\t\t%d\nW3: bp_7\t\t%d\nW3: bp_ena_0\t\t%d\n",
+		   halo->bp_6, halo->bp_7, halo->bp_ena_0);
+	seq_printf(m, "W3: bp_ena_1\t\t%d\nW3: bp_ena_2\t\t%d\n",
+		   halo->bp_ena_1, halo->bp_ena_2);
+	seq_printf(m, "W3: bp_ena_3\t\t%d\nW3: bp_ena_4\t\t%d\n",
+		   halo->bp_ena_3, halo->bp_ena_4);
+	seq_printf(m, "W3: bp_ena_5\t\t%d\nW3: bp_ena_6\t\t%d\n",
+		   halo->bp_ena_5, halo->bp_ena_6);
+	seq_printf(m, "W3: bp_ena_7\t\t%d\n", halo->bp_ena_7);
+	seq_printf(m, "W4: stack_offset\t%d\nW4: shift\t\t%d\nW4: avg_level\t\t%d\n",
+		   halo->stack_offset, halo->shift, halo->avg_level);
+	seq_printf(m, "W4: avg_con \t\t%d\nW4: fc_ena\t\t%d\nW4: fc_stype\t\t%d\n",
+		   halo->avg_con, halo->fc_ena, halo->fc_stype);
+	seq_printf(m, "W4: fc_hyst_bits\t%d\nW4: fc_up_crossing\t%d\n",
+		   halo->fc_hyst_bits, halo->fc_up_crossing);
+	seq_printf(m, "W4: update_time\t\t%d\n", halo->update_time);
+	seq_printf(m, "W5: fc_addr\t\t%llx\n", halo->fc_addr);
+	seq_printf(m, "W6: ptr_start\t\t%llx\n", halo->ptr_start);
+	seq_printf(m, "W7: ptr_end\t\t%llx\n", halo->ptr_end);
+	seq_printf(m, "W8: bpid_0\t\t%d\n", halo->bpid_0);
+	seq_printf(m, "W8: err_int \t\t%d\nW8: err_int_ena\t\t%d\n",
+		   halo->err_int, halo->err_int_ena);
+	seq_printf(m, "W8: thresh_int\t\t%d\nW8: thresh_int_ena \t%d\n",
+		   halo->thresh_int, halo->thresh_int_ena);
+	seq_printf(m, "W8: thresh_up\t\t%d\nW8: thresh_qint_idx\t%d\n",
+		   halo->thresh_up, halo->thresh_qint_idx);
+	seq_printf(m, "W8: err_qint_idx \t%d\n", halo->err_qint_idx);
+	seq_printf(m, "W9: thresh\t\t%llu\n", (u64)halo->thresh);
+	seq_printf(m, "W9: fc_msh_dst\t\t%d\n", halo->fc_msh_dst);
+	seq_printf(m, "W9: op_dpc_ena\t\t%d\nW9: op_dpc_set\t\t%d\n",
+		   halo->op_dpc_ena, halo->op_dpc_set);
+	seq_printf(m, "W9: stream_ctx\t\t%d\nW9: unified_ctx\t\t%d\n",
+		   halo->stream_ctx, halo->unified_ctx);
+}
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.h b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.h
index 0c5f05883666..7e00c7499e35 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.h
@@ -27,5 +27,7 @@ void print_npa_cn20k_aura_ctx(struct seq_file *m,
 			      struct npa_cn20k_aq_enq_rsp *rsp);
 void print_npa_cn20k_pool_ctx(struct seq_file *m,
 			      struct npa_cn20k_aq_enq_rsp *rsp);
+void print_npa_cn20k_halo_ctx(struct seq_file *m,
+			      struct npa_aq_enq_rsp *rsp);
 
 #endif
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c
index 413f9fa40b33..3d73bf7f0b1f 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c
@@ -975,6 +975,12 @@ static void print_npa_qsize(struct seq_file *m, struct rvu_pfvf *pfvf)
 					pfvf->aura_ctx->qsize);
 		seq_printf(m, "Aura count : %d\n", pfvf->aura_ctx->qsize);
 		seq_printf(m, "Aura context ena/dis bitmap : %s\n", buf);
+		if (pfvf->halo_bmap) {
+			bitmap_print_to_pagebuf(false, buf, pfvf->halo_bmap,
+						pfvf->aura_ctx->qsize);
+			seq_printf(m, "Halo context ena/dis bitmap : %s\n",
+				   buf);
+		}
 	}
 
 	if (!pfvf->pool_ctx) {
@@ -1204,6 +1210,20 @@ static void print_npa_pool_ctx(struct seq_file *m, struct npa_aq_enq_rsp *rsp)
 		seq_printf(m, "W8: fc_msh_dst\t\t%d\n", pool->fc_msh_dst);
 }
 
+static const char *npa_ctype_str(int ctype)
+{
+	switch (ctype) {
+	case NPA_AQ_CTYPE_AURA:
+		return "aura";
+	case NPA_AQ_CTYPE_HALO:
+		return "halo";
+	case NPA_AQ_CTYPE_POOL:
+		return "pool";
+	default:
+		return "unknown";
+	}
+}
+
 /* Reads aura/pool's ctx from admin queue */
 static int rvu_dbg_npa_ctx_display(struct seq_file *m, void *unused, int ctype)
 {
@@ -1220,6 +1240,7 @@ static int rvu_dbg_npa_ctx_display(struct seq_file *m, void *unused, int ctype)
 
 	switch (ctype) {
 	case NPA_AQ_CTYPE_AURA:
+	case NPA_AQ_CTYPE_HALO:
 		npalf = rvu->rvu_dbg.npa_aura_ctx.lf;
 		id = rvu->rvu_dbg.npa_aura_ctx.id;
 		all = rvu->rvu_dbg.npa_aura_ctx.all;
@@ -1244,6 +1265,9 @@ static int rvu_dbg_npa_ctx_display(struct seq_file *m, void *unused, int ctype)
 	} else if (ctype == NPA_AQ_CTYPE_POOL && !pfvf->pool_ctx) {
 		seq_puts(m, "Pool context is not initialized\n");
 		return -EINVAL;
+	} else if (ctype == NPA_AQ_CTYPE_HALO && !pfvf->aura_ctx) {
+		seq_puts(m, "Halo context is not initialized\n");
+		return -EINVAL;
 	}
 
 	memset(&aq_req, 0, sizeof(struct npa_aq_enq_req));
@@ -1253,6 +1277,9 @@ static int rvu_dbg_npa_ctx_display(struct seq_file *m, void *unused, int ctype)
 	if (ctype == NPA_AQ_CTYPE_AURA) {
 		max_id = pfvf->aura_ctx->qsize;
 		print_npa_ctx = print_npa_aura_ctx;
+	} else if (ctype == NPA_AQ_CTYPE_HALO) {
+		max_id = pfvf->aura_ctx->qsize;
+		print_npa_ctx = print_npa_cn20k_halo_ctx;
 	} else {
 		max_id = pfvf->pool_ctx->qsize;
 		print_npa_ctx = print_npa_pool_ctx;
@@ -1260,8 +1287,7 @@ static int rvu_dbg_npa_ctx_display(struct seq_file *m, void *unused, int ctype)
 
 	if (id < 0 || id >= max_id) {
 		seq_printf(m, "Invalid %s, valid range is 0-%d\n",
-			   (ctype == NPA_AQ_CTYPE_AURA) ? "aura" : "pool",
-			max_id - 1);
+			   npa_ctype_str(ctype), max_id - 1);
 		return -EINVAL;
 	}
 
@@ -1274,12 +1300,19 @@ static int rvu_dbg_npa_ctx_display(struct seq_file *m, void *unused, int ctype)
 		aq_req.aura_id = aura;
 
 		/* Skip if queue is uninitialized */
+		if (ctype == NPA_AQ_CTYPE_AURA &&
+		    !test_bit(aura, pfvf->aura_bmap))
+			continue;
+
+		if (ctype == NPA_AQ_CTYPE_HALO &&
+		    !test_bit(aura, pfvf->halo_bmap))
+			continue;
+
 		if (ctype == NPA_AQ_CTYPE_POOL && !test_bit(aura, pfvf->pool_bmap))
 			continue;
 
-		seq_printf(m, "======%s : %d=======\n",
-			   (ctype == NPA_AQ_CTYPE_AURA) ? "AURA" : "POOL",
-			aq_req.aura_id);
+		seq_printf(m, "======%s : %d=======\n", npa_ctype_str(ctype),
+			   aq_req.aura_id);
 		rc = rvu_npa_aq_enq_inst(rvu, &aq_req, &rsp);
 		if (rc) {
 			seq_puts(m, "Failed to read context\n");
@@ -1308,6 +1341,12 @@ static int write_npa_ctx(struct rvu *rvu, bool all,
 			return -EINVAL;
 		}
 		max_id = pfvf->aura_ctx->qsize;
+	} else if (ctype == NPA_AQ_CTYPE_HALO) {
+		if (!pfvf->aura_ctx) {
+			dev_warn(rvu->dev, "Halo context is not initialized\n");
+			return -EINVAL;
+		}
+		max_id = pfvf->aura_ctx->qsize;
 	} else if (ctype == NPA_AQ_CTYPE_POOL) {
 		if (!pfvf->pool_ctx) {
 			dev_warn(rvu->dev, "Pool context is not initialized\n");
@@ -1318,13 +1357,14 @@ static int write_npa_ctx(struct rvu *rvu, bool all,
 
 	if (id < 0 || id >= max_id) {
 		dev_warn(rvu->dev, "Invalid %s, valid range is 0-%d\n",
-			 (ctype == NPA_AQ_CTYPE_AURA) ? "aura" : "pool",
+			 npa_ctype_str(ctype),
 			max_id - 1);
 		return -EINVAL;
 	}
 
 	switch (ctype) {
 	case NPA_AQ_CTYPE_AURA:
+	case NPA_AQ_CTYPE_HALO:
 		rvu->rvu_dbg.npa_aura_ctx.lf = npalf;
 		rvu->rvu_dbg.npa_aura_ctx.id = id;
 		rvu->rvu_dbg.npa_aura_ctx.all = all;
@@ -1383,12 +1423,12 @@ static ssize_t rvu_dbg_npa_ctx_write(struct file *filp,
 				     const char __user *buffer,
 				     size_t count, loff_t *ppos, int ctype)
 {
-	char *cmd_buf, *ctype_string = (ctype == NPA_AQ_CTYPE_AURA) ?
-					"aura" : "pool";
+	const char *ctype_string = npa_ctype_str(ctype);
 	struct seq_file *seqfp = filp->private_data;
 	struct rvu *rvu = seqfp->private;
 	int npalf, id = 0, ret;
 	bool all = false;
+	char *cmd_buf;
 
 	if ((*ppos != 0) || !count)
 		return -EINVAL;
@@ -1426,6 +1466,21 @@ static int rvu_dbg_npa_aura_ctx_display(struct seq_file *filp, void *unused)
 
 RVU_DEBUG_SEQ_FOPS(npa_aura_ctx, npa_aura_ctx_display, npa_aura_ctx_write);
 
+static ssize_t rvu_dbg_npa_halo_ctx_write(struct file *filp,
+					  const char __user *buffer,
+					  size_t count, loff_t *ppos)
+{
+	return rvu_dbg_npa_ctx_write(filp, buffer, count, ppos,
+				     NPA_AQ_CTYPE_HALO);
+}
+
+static int rvu_dbg_npa_halo_ctx_display(struct seq_file *filp, void *unused)
+{
+	return rvu_dbg_npa_ctx_display(filp, unused, NPA_AQ_CTYPE_HALO);
+}
+
+RVU_DEBUG_SEQ_FOPS(npa_halo_ctx, npa_halo_ctx_display, npa_halo_ctx_write);
+
 static ssize_t rvu_dbg_npa_pool_ctx_write(struct file *filp,
 					  const char __user *buffer,
 					  size_t count, loff_t *ppos)
@@ -2816,6 +2871,9 @@ static void rvu_dbg_npa_init(struct rvu *rvu)
 			    &rvu_dbg_npa_qsize_fops);
 	debugfs_create_file("aura_ctx", 0600, rvu->rvu_dbg.npa, rvu,
 			    &rvu_dbg_npa_aura_ctx_fops);
+	if (is_cn20k(rvu->pdev))
+		debugfs_create_file("halo_ctx", 0600, rvu->rvu_dbg.npa, rvu,
+				    &rvu_dbg_npa_halo_ctx_fops);
 	debugfs_create_file("pool_ctx", 0600, rvu->rvu_dbg.npa, rvu,
 			    &rvu_dbg_npa_pool_ctx_fops);
 
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [net-next PATCH v2 4/4] octeontx2-pf: cn20k: Use unified Halo context
  2026-03-19 11:47 [net-next PATCH v2 0/4] octeontx2: CN20K NPA Halo context support Subbaraya Sundeep
                   ` (2 preceding siblings ...)
  2026-03-19 11:47 ` [net-next PATCH v2 3/4] octeontx2-af: npa: cn20k: Add debugfs for Halo Subbaraya Sundeep
@ 2026-03-19 11:47 ` Subbaraya Sundeep
  2026-03-20 16:50   ` Simon Horman
  3 siblings, 1 reply; 9+ messages in thread
From: Subbaraya Sundeep @ 2026-03-19 11:47 UTC (permalink / raw)
  To: andrew+netdev, davem, edumazet, kuba, pabeni, sgoutham, gakula,
	bbhushan2
  Cc: netdev, linux-kernel, Subbaraya Sundeep

Use the unified Halo context present in CN20K hardware for
octeontx2 netdevs instead of separate aura and pool contexts.

Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
---
 .../ethernet/marvell/octeontx2/nic/cn20k.c    | 207 +++++++++---------
 .../ethernet/marvell/octeontx2/nic/cn20k.h    |   3 +
 .../marvell/octeontx2/nic/otx2_common.h       |   2 +
 .../ethernet/marvell/octeontx2/nic/otx2_pf.c  |   6 +
 4 files changed, 115 insertions(+), 103 deletions(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/cn20k.c b/drivers/net/ethernet/marvell/octeontx2/nic/cn20k.c
index a5a8f4558717..866f48e758a2 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/cn20k.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/cn20k.c
@@ -242,15 +242,6 @@ int cn20k_register_pfvf_mbox_intr(struct otx2_nic *pf, int numvfs)
 
 #define RQ_BP_LVL_AURA   (255 - ((85 * 256) / 100)) /* BP when 85% is full */
 
-static u8 cn20k_aura_bpid_idx(struct otx2_nic *pfvf, int aura_id)
-{
-#ifdef CONFIG_DCB
-	return pfvf->queue_to_pfc_map[aura_id];
-#else
-	return 0;
-#endif
-}
-
 static int cn20k_tc_get_entry_index(struct otx2_flow_config *flow_cfg,
 				    struct otx2_tc_flow *node)
 {
@@ -517,84 +508,7 @@ int cn20k_tc_alloc_entry(struct otx2_nic *nic,
 	return 0;
 }
 
-static int cn20k_aura_aq_init(struct otx2_nic *pfvf, int aura_id,
-			      int pool_id, int numptrs)
-{
-	struct npa_cn20k_aq_enq_req *aq;
-	struct otx2_pool *pool;
-	u8 bpid_idx;
-	int err;
-
-	pool = &pfvf->qset.pool[pool_id];
-
-	/* Allocate memory for HW to update Aura count.
-	 * Alloc one cache line, so that it fits all FC_STYPE modes.
-	 */
-	if (!pool->fc_addr) {
-		err = qmem_alloc(pfvf->dev, &pool->fc_addr, 1, OTX2_ALIGN);
-		if (err)
-			return err;
-	}
-
-	/* Initialize this aura's context via AF */
-	aq = otx2_mbox_alloc_msg_npa_cn20k_aq_enq(&pfvf->mbox);
-	if (!aq) {
-		/* Shared mbox memory buffer is full, flush it and retry */
-		err = otx2_sync_mbox_msg(&pfvf->mbox);
-		if (err)
-			return err;
-		aq = otx2_mbox_alloc_msg_npa_cn20k_aq_enq(&pfvf->mbox);
-		if (!aq)
-			return -ENOMEM;
-	}
-
-	aq->aura_id = aura_id;
-
-	/* Will be filled by AF with correct pool context address */
-	aq->aura.pool_addr = pool_id;
-	aq->aura.pool_caching = 1;
-	aq->aura.shift = ilog2(numptrs) - 8;
-	aq->aura.count = numptrs;
-	aq->aura.limit = numptrs;
-	aq->aura.avg_level = 255;
-	aq->aura.ena = 1;
-	aq->aura.fc_ena = 1;
-	aq->aura.fc_addr = pool->fc_addr->iova;
-	aq->aura.fc_hyst_bits = 0; /* Store count on all updates */
-
-	/* Enable backpressure for RQ aura */
-	if (aura_id < pfvf->hw.rqpool_cnt && !is_otx2_lbkvf(pfvf->pdev)) {
-		aq->aura.bp_ena = 0;
-		/* If NIX1 LF is attached then specify NIX1_RX.
-		 *
-		 * Below NPA_AURA_S[BP_ENA] is set according to the
-		 * NPA_BPINTF_E enumeration given as:
-		 * 0x0 + a*0x1 where 'a' is 0 for NIX0_RX and 1 for NIX1_RX so
-		 * NIX0_RX is 0x0 + 0*0x1 = 0
-		 * NIX1_RX is 0x0 + 1*0x1 = 1
-		 * But in HRM it is given that
-		 * "NPA_AURA_S[BP_ENA](w1[33:32]) - Enable aura backpressure to
-		 * NIX-RX based on [BP] level. One bit per NIX-RX; index
-		 * enumerated by NPA_BPINTF_E."
-		 */
-		if (pfvf->nix_blkaddr == BLKADDR_NIX1)
-			aq->aura.bp_ena = 1;
-
-		bpid_idx = cn20k_aura_bpid_idx(pfvf, aura_id);
-		aq->aura.bpid = pfvf->bpid[bpid_idx];
-
-		/* Set backpressure level for RQ's Aura */
-		aq->aura.bp = RQ_BP_LVL_AURA;
-	}
-
-	/* Fill AQ info */
-	aq->ctype = NPA_AQ_CTYPE_AURA;
-	aq->op = NPA_AQ_INSTOP_INIT;
-
-	return 0;
-}
-
-static int cn20k_pool_aq_init(struct otx2_nic *pfvf, u16 pool_id,
+static int cn20k_halo_aq_init(struct otx2_nic *pfvf, u16 pool_id,
 			      int stack_pages, int numptrs, int buf_size,
 			      int type)
 {
@@ -610,36 +524,55 @@ static int cn20k_pool_aq_init(struct otx2_nic *pfvf, u16 pool_id,
 	if (err)
 		return err;
 
+	/* Allocate memory for HW to update Aura count.
+	 * Alloc one cache line, so that it fits all FC_STYPE modes.
+	 */
+	if (!pool->fc_addr) {
+		err = qmem_alloc(pfvf->dev, &pool->fc_addr, 1, OTX2_ALIGN);
+		if (err) {
+			qmem_free(pfvf->dev, pool->stack);
+			return err;
+		}
+	}
+
 	pool->rbsize = buf_size;
 
-	/* Initialize this pool's context via AF */
+	/* Initialize this aura's context via AF */
 	aq = otx2_mbox_alloc_msg_npa_cn20k_aq_enq(&pfvf->mbox);
 	if (!aq) {
 		/* Shared mbox memory buffer is full, flush it and retry */
 		err = otx2_sync_mbox_msg(&pfvf->mbox);
-		if (err) {
-			qmem_free(pfvf->dev, pool->stack);
-			return err;
-		}
+		if (err)
+			goto free_mem;
 		aq = otx2_mbox_alloc_msg_npa_cn20k_aq_enq(&pfvf->mbox);
 		if (!aq) {
-			qmem_free(pfvf->dev, pool->stack);
-			return -ENOMEM;
+			err = -ENOMEM;
+			goto free_mem;
 		}
 	}
 
 	aq->aura_id = pool_id;
-	aq->pool.stack_base = pool->stack->iova;
-	aq->pool.stack_caching = 1;
-	aq->pool.ena = 1;
-	aq->pool.buf_size = buf_size / 128;
-	aq->pool.stack_max_pages = stack_pages;
-	aq->pool.shift = ilog2(numptrs) - 8;
-	aq->pool.ptr_start = 0;
-	aq->pool.ptr_end = ~0ULL;
+
+	aq->halo.stack_base = pool->stack->iova;
+	aq->halo.stack_caching = 1;
+	aq->halo.ena = 1;
+	aq->halo.buf_size = buf_size / 128;
+	aq->halo.stack_max_pages = stack_pages;
+	aq->halo.shift = ilog2(numptrs) - 8;
+	aq->halo.ptr_start = 0;
+	aq->halo.ptr_end = ~0ULL;
+
+	aq->halo.avg_level = 255;
+	aq->halo.fc_ena = 1;
+	aq->halo.fc_addr = pool->fc_addr->iova;
+	aq->halo.fc_hyst_bits = 0; /* Store count on all updates */
+
+	aq->halo.op_dpc_ena = 1;
+	aq->halo.op_dpc_set = pfvf->npa_dpc;
+	aq->halo.unified_ctx = 1;
 
 	/* Fill AQ info */
-	aq->ctype = NPA_AQ_CTYPE_POOL;
+	aq->ctype = NPA_AQ_CTYPE_HALO;
 	aq->op = NPA_AQ_INSTOP_INIT;
 
 	if (type != AURA_NIX_RQ) {
@@ -661,6 +594,74 @@ static int cn20k_pool_aq_init(struct otx2_nic *pfvf, u16 pool_id,
 	}
 
 	return 0;
+
+free_mem:
+	qmem_free(pfvf->dev, pool->stack);
+	qmem_free(pfvf->dev, pool->fc_addr);
+	return err;
+}
+
+static int cn20k_aura_aq_init(struct otx2_nic *pfvf, int aura_id,
+			      int pool_id, int numptrs)
+{
+	return 0;
+}
+
+static int cn20k_pool_aq_init(struct otx2_nic *pfvf, u16 pool_id,
+			      int stack_pages, int numptrs, int buf_size,
+			      int type)
+{
+	return cn20k_halo_aq_init(pfvf, pool_id, stack_pages,
+				  numptrs, buf_size, type);
+}
+
+int cn20k_npa_alloc_dpc(struct otx2_nic *nic)
+{
+	struct npa_cn20k_dpc_alloc_req *req;
+	struct npa_cn20k_dpc_alloc_rsp *rsp;
+	int err;
+
+	req = otx2_mbox_alloc_msg_npa_cn20k_dpc_alloc(&nic->mbox);
+	if (!req)
+		return -ENOMEM;
+
+	/* Count successful ALLOC requests only */
+	req->dpc_conf = 1ULL << 4;
+
+	err = otx2_sync_mbox_msg(&nic->mbox);
+	if (err)
+		return err;
+
+	rsp = (struct npa_cn20k_dpc_alloc_rsp *)otx2_mbox_get_rsp(&nic->mbox.mbox,
+								  0, &req->hdr);
+	if (IS_ERR(rsp))
+		return PTR_ERR(rsp);
+
+	nic->npa_dpc = rsp->cntr_id;
+
+	return 0;
+}
+
+int cn20k_npa_free_dpc(struct otx2_nic *nic)
+{
+	struct npa_cn20k_dpc_free_req *req;
+	int err;
+
+	mutex_lock(&nic->mbox.lock);
+
+	req = otx2_mbox_alloc_msg_npa_cn20k_dpc_free(&nic->mbox);
+	if (!req) {
+		mutex_unlock(&nic->mbox.lock);
+		return -ENOMEM;
+	}
+
+	req->cntr_id = nic->npa_dpc;
+
+	err = otx2_sync_mbox_msg(&nic->mbox);
+
+	mutex_unlock(&nic->mbox.lock);
+
+	return err;
 }
 
 static int cn20k_sq_aq_init(void *dev, u16 qidx, u8 chan_offset, u16 sqb_aura)
diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/cn20k.h b/drivers/net/ethernet/marvell/octeontx2/nic/cn20k.h
index b5e527f6d7eb..16a69d84ea79 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/cn20k.h
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/cn20k.h
@@ -28,4 +28,7 @@ int cn20k_tc_alloc_entry(struct otx2_nic *nic,
 			 struct otx2_tc_flow *new_node,
 			 struct npc_install_flow_req *dummy);
 int cn20k_tc_free_mcam_entry(struct otx2_nic *nic, u16 entry);
+int cn20k_npa_alloc_dpc(struct otx2_nic *nic);
+int cn20k_npa_free_dpc(struct otx2_nic *nic);
+
 #endif /* CN20K_H */
diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h
index eecee612b7b2..06d96059d026 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h
@@ -592,6 +592,8 @@ struct otx2_nic {
 	struct cn10k_ipsec	ipsec;
 	/* af_xdp zero-copy */
 	unsigned long		*af_xdp_zc_qidx;
+
+	u8			npa_dpc; /* NPA DPC counter id */
 };
 
 static inline bool is_otx2_lbkvf(struct pci_dev *pdev)
diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
index ee623476e5ff..2941549d46c8 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
@@ -1651,6 +1651,9 @@ int otx2_init_hw_resources(struct otx2_nic *pf)
 	if (!is_otx2_lbkvf(pf->pdev))
 		otx2_nix_config_bp(pf, true);
 
+	if (is_cn20k(pf->pdev))
+		cn20k_npa_alloc_dpc(pf);
+
 	/* Init Auras and pools used by NIX RQ, for free buffer ptrs */
 	err = otx2_rq_aura_pool_init(pf);
 	if (err) {
@@ -1790,6 +1793,9 @@ void otx2_free_hw_resources(struct otx2_nic *pf)
 
 	otx2_free_sq_res(pf);
 
+	if (is_cn20k(pf->pdev))
+		cn20k_npa_free_dpc(pf);
+
 	/* Free RQ buffer pointers*/
 	otx2_free_aura_ptr(pf, AURA_NIX_RQ);
 
-- 
2.48.1



* Re: [net-next PATCH v2 2/4] octeontx2-af: npa: cn20k: Add DPC support
  2026-03-19 11:47 ` [net-next PATCH v2 2/4] octeontx2-af: npa: cn20k: Add DPC support Subbaraya Sundeep
@ 2026-03-20 16:50   ` Simon Horman
  0 siblings, 0 replies; 9+ messages in thread
From: Simon Horman @ 2026-03-20 16:50 UTC (permalink / raw)
  To: sbhatta
  Cc: Simon Horman, kuba, gakula, bbhushan2, netdev, linux-kernel,
	davem, lcherian, andrew+netdev, edumazet, pabeni, sgoutham

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.

For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
octeontx2-af: npa: cn20k: Add DPC support

CN20K introduces 32 diagnostic and performance counters that are shared
across all NPA LFs. This patch adds new mbox messages npa_dpc_alloc/free
to allocate and map counters to LFs, and ensures LF-to-DPC counter
mappings are cleared at LF free/teardown.

> diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npa.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npa.c
> index c963f43dc7b0..1def2504872f 100644
> --- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npa.c
> +++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npa.c

[ ... ]

> @@ -46,3 +48,117 @@ int rvu_npa_halo_hwctx_disable(struct npa_aq_enq_req *req)
>
>  	return 0;
>  }
> +
> +int npa_cn20k_dpc_alloc(struct rvu *rvu, struct npa_cn20k_dpc_alloc_req *req,
> +			struct npa_cn20k_dpc_alloc_rsp *rsp)
> +{
> +	struct rvu_hwinfo *hw = rvu->hw;
> +	u16 pcifunc = req->hdr.pcifunc;
> +	int cntr, lf, blkaddr, ridx;
> +	struct rvu_block *block;
> +	struct rvu_pfvf *pfvf;
> +	u64 val, lfmask;
> +
> +	pfvf = rvu_get_pfvf(rvu, pcifunc);
> +
> +	blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPA, 0);
> +	if (!pfvf->npalf || blkaddr < 0)
> +		return NPA_AF_ERR_AF_LF_INVALID;
> +
> +	block = &hw->block[blkaddr];
> +	lf = rvu_get_lf(rvu, block, pcifunc, 0);
> +	if (lf < 0)
> +		return NPA_AF_ERR_AF_LF_INVALID;
> +
> +	/* allocate a new counter */
> +	cntr = rvu_alloc_rsrc(&rvu->npa_dpc);
                ^^^^

Are rvu_alloc_rsrc() and rvu_free_rsrc() calls synchronized here?

These functions use non-atomic find_first_zero_bit() + __set_bit() and
__clear_bit() operations on the bitmap. Looking at other callers like
rvu_mbox_handler_attach_resources() in rvu.c:

int rvu_mbox_handler_attach_resources(struct rvu *rvu,
                                      struct rsrc_attach *attach,
                                      struct msg_rsp *rsp)
{
    ...
    mutex_lock(&rvu->rsrc_lock);

    for (q_cnt = 0; q_cnt < attach->npa_msixoff; q_cnt++) {
        msix = rvu_alloc_rsrc(&pfvf->msix);
        ...
    }
    mutex_unlock(&rvu->rsrc_lock);
    ...
}

Similarly, rvu_mbox_handler_nix_bandprof_alloc() and
rvu_mbox_handler_mcs_alloc_resources() also acquire rvu->rsrc_lock
before calling rvu_alloc_rsrc(). rvu_mbox_handler_npc_mcam_alloc_counter()
uses mcam->lock.

Since npa_cn20k_dpc_alloc() can be invoked by the mbox handler
rvu_mbox_handler_npa_cn20k_dpc_alloc() concurrently from different
PF/VF workqueue items in __rvu_mbox_handler(), two threads could race:

Thread 1                                 Thread 2
npa_cn20k_dpc_alloc()                    npa_cn20k_dpc_alloc()
  find_first_zero_bit(npa_dpc) = 0         find_first_zero_bit(npa_dpc) = 0
  __set_bit(0, npa_dpc)                    __set_bit(0, npa_dpc)
  return cntr = 0                          return cntr = 0

Both threads get counter 0, leading to duplicate allocation. The same
race exists in npa_cn20k_dpc_free() with rvu_free_rsrc().

Should these calls be protected with mutex_lock(&rvu->rsrc_lock) similar
to other resource allocation paths?

> +	if (cntr < 0)
> +		return cntr;
> +	rsp->cntr_id = cntr;
> +
> +	/* DPC counter config */
> +	rvu_write64(rvu, blkaddr, NPA_AF_DPCX_CFG(cntr), req->dpc_conf);
> +
> +	/* 0 to 63 lfs -> idx 0, 64 - 127 lfs -> idx 1 */
> +	ridx = lf >> 6;
> +	lfmask = BIT_ULL(ridx ? lf - NPA_DPC_LFS_PER_REG : lf);
> +
> +	ridx = 2 * cntr + ridx;
> +	/* Give permission for LF access */
> +	val = rvu_read64(rvu, blkaddr, NPA_AF_DPC_PERMITX(ridx));
> +	val |= lfmask;
> +	rvu_write64(rvu, blkaddr, NPA_AF_DPC_PERMITX(ridx), val);
> +
> +	return 0;
> +}
> +
> +int rvu_mbox_handler_npa_cn20k_dpc_alloc(struct rvu *rvu,
> +					 struct npa_cn20k_dpc_alloc_req *req,
> +					 struct npa_cn20k_dpc_alloc_rsp *rsp)
> +{
> +	return npa_cn20k_dpc_alloc(rvu, req, rsp);
> +}
> +
> +int npa_cn20k_dpc_free(struct rvu *rvu, struct npa_cn20k_dpc_free_req *req)
> +{
> +	struct rvu_hwinfo *hw = rvu->hw;
> +	u16 pcifunc = req->hdr.pcifunc;
> +	int cntr, lf, blkaddr, ridx;
> +	struct rvu_block *block;
> +	struct rvu_pfvf *pfvf;
> +	u64 val, lfmask;
> +
> +	pfvf = rvu_get_pfvf(rvu, pcifunc);
> +
> +	blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPA, 0);
> +	if (!pfvf->npalf || blkaddr < 0)
> +		return NPA_AF_ERR_AF_LF_INVALID;
> +
> +	block = &hw->block[blkaddr];
> +	lf = rvu_get_lf(rvu, block, pcifunc, 0);
> +	if (lf < 0)
> +		return NPA_AF_ERR_AF_LF_INVALID;
> +
> +	if (req->cntr_id >= NPA_DPC_MAX)
> +		return NPA_AF_ERR_PARAM;
> +
> +	/* 0 to 63 lfs -> idx 0, 64 - 127 lfs -> idx 1 */
> +	ridx = lf >> 6;
> +	lfmask = BIT_ULL(ridx ? lf - NPA_DPC_LFS_PER_REG : lf);
> +	cntr = req->cntr_id;
> +
> +	ridx = 2 * cntr + ridx;
> +
> +	val = rvu_read64(rvu, blkaddr, NPA_AF_DPC_PERMITX(ridx));
> +	/* Check if the counter is allotted to this LF */
> +	if (!(val & lfmask))
> +		return 0;
> +
> +	/* Revert permission */
> +	val &= ~lfmask;
> +	rvu_write64(rvu, blkaddr, NPA_AF_DPC_PERMITX(ridx), val);
> +
> +	/* Free this counter */
> +	rvu_free_rsrc(&rvu->npa_dpc, req->cntr_id);
> +
> +	return 0;
> +}

[ ... ]


* Re: [net-next PATCH v2 4/4] octeontx2-pf: cn20k: Use unified Halo context
  2026-03-19 11:47 ` [net-next PATCH v2 4/4] octeontx2-pf: cn20k: Use unified Halo context Subbaraya Sundeep
@ 2026-03-20 16:50   ` Simon Horman
  2026-03-26  7:00     ` Subbaraya Sundeep
  0 siblings, 1 reply; 9+ messages in thread
From: Simon Horman @ 2026-03-20 16:50 UTC (permalink / raw)
  To: sbhatta
  Cc: Simon Horman, kuba, gakula, bbhushan2, netdev, linux-kernel,
	davem, lcherian, andrew+netdev, edumazet, pabeni, sgoutham

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.

For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
octeontx2-pf: cn20k: Use unified Halo context

Use unified Halo context present in CN20K hardware for
octeontx2 netdevs instead of aura and pool contexts.

> diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/cn20k.c b/drivers/net/ethernet/marvell/octeontx2/nic/cn20k.c
> --- a/drivers/net/ethernet/marvell/octeontx2/nic/cn20k.c
> +++ b/drivers/net/ethernet/marvell/octeontx2/nic/cn20k.c

[ ... ]

> +int cn20k_npa_alloc_dpc(struct otx2_nic *nic)
> +{
> +	struct npa_cn20k_dpc_alloc_req *req;
> +	struct npa_cn20k_dpc_alloc_rsp *rsp;
> +	int err;
> +
> +	req = otx2_mbox_alloc_msg_npa_cn20k_dpc_alloc(&nic->mbox);
> +	if (!req)
> +		return -ENOMEM;
> +
> +	/* Count successful ALLOC requests only */
> +	req->dpc_conf = 1ULL << 4;
> +
> +	err = otx2_sync_mbox_msg(&nic->mbox);
> +	if (err)
> +		return err;
> +
> +	rsp = (struct npa_cn20k_dpc_alloc_rsp *)otx2_mbox_get_rsp(&nic->mbox.mbox,
> +								  0, &req->hdr);
> +	if (IS_ERR(rsp))
> +		return PTR_ERR(rsp);
> +
> +	nic->npa_dpc = rsp->cntr_id;
> +
> +	return 0;
> +}

[ ... ]

> diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
> --- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
> +++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
> @@ -1651,6 +1651,9 @@ int otx2_init_hw_resources(struct otx2_nic *pf)
>  	if (!is_otx2_lbkvf(pf->pdev))
>  		otx2_nix_config_bp(pf, true);
>
> +	if (is_cn20k(pf->pdev))
> +		cn20k_npa_alloc_dpc(pf);
                ^^^^^^^^^^^^^^^^^^^^^

Should the return value of cn20k_npa_alloc_dpc() be checked here?

Looking at cn20k_npa_alloc_dpc(), it can return -ENOMEM or errors from
otx2_sync_mbox_msg() or PTR_ERR() from otx2_mbox_get_rsp(). If the DPC
counter allocation fails, pf->npa_dpc remains 0 (from struct
zero-initialization).

Then cn20k_halo_aq_init() configures all NPA auras with
aq->halo.op_dpc_set = pfvf->npa_dpc (= 0), referencing an unallocated
DPC performance counter. On teardown, cn20k_npa_free_dpc() sends a mbox
message to free counter ID 0, which was never allocated.

Every other error-returning mbox operation in this function checks the
return value and jumps to an error label on failure. Should this follow
the same pattern?

> +
>  	/* Init Auras and pools used by NIX RQ, for free buffer ptrs */
>  	err = otx2_rq_aura_pool_init(pf);
>  	if (err) {

[ ... ]


* Re: [net-next PATCH v2 1/4] octeontx2-af: npa: cn20k: Add NPA Halo support
  2026-03-19 11:47 ` [net-next PATCH v2 1/4] octeontx2-af: npa: cn20k: Add NPA Halo support Subbaraya Sundeep
@ 2026-03-20 16:52   ` Simon Horman
  0 siblings, 0 replies; 9+ messages in thread
From: Simon Horman @ 2026-03-20 16:52 UTC (permalink / raw)
  To: Subbaraya Sundeep
  Cc: andrew+netdev, davem, edumazet, kuba, pabeni, sgoutham, gakula,
	bbhushan2, netdev, linux-kernel, Linu Cherian

On Thu, Mar 19, 2026 at 05:17:03PM +0530, Subbaraya Sundeep wrote:
> From: Linu Cherian <lcherian@marvell.com>

...

> diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npa.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npa.c
> index e2a33e46b48a..96904b8eea62 100644
> --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npa.c
> +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npa.c
> @@ -12,6 +12,11 @@
>  #include "rvu_reg.h"
>  #include "rvu.h"
>  
> +static inline bool npa_ctype_invalid(struct rvu *rvu, int ctype)
> +{
> +	return !is_cn20k(rvu->pdev) && ctype == NPA_AQ_CTYPE_HALO;
> +}

Please drop the inline keyword here.
We can let the compiler chose to inline things (or not).

...


* Re: [net-next PATCH v2 4/4] octeontx2-pf: cn20k: Use unified Halo context
  2026-03-20 16:50   ` Simon Horman
@ 2026-03-26  7:00     ` Subbaraya Sundeep
  0 siblings, 0 replies; 9+ messages in thread
From: Subbaraya Sundeep @ 2026-03-26  7:00 UTC (permalink / raw)
  To: Simon Horman
  Cc: kuba, gakula, bbhushan2, netdev, linux-kernel, davem,
	andrew+netdev, edumazet, pabeni, sgoutham

On 2026-03-20 at 22:20:37, Simon Horman (horms@kernel.org) wrote:
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> 
> For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
> ---
> octeontx2-pf: cn20k: Use unified Halo context
> 
> Use unified Halo context present in CN20K hardware for
> octeontx2 netdevs instead of aura and pool contexts.
> 
> > diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/cn20k.c b/drivers/net/ethernet/marvell/octeontx2/nic/cn20k.c
> > --- a/drivers/net/ethernet/marvell/octeontx2/nic/cn20k.c
> > +++ b/drivers/net/ethernet/marvell/octeontx2/nic/cn20k.c
> 
> [ ... ]
> 
> > +int cn20k_npa_alloc_dpc(struct otx2_nic *nic)
> > +{
> > +	struct npa_cn20k_dpc_alloc_req *req;
> > +	struct npa_cn20k_dpc_alloc_rsp *rsp;
> > +	int err;
> > +
> > +	req = otx2_mbox_alloc_msg_npa_cn20k_dpc_alloc(&nic->mbox);
> > +	if (!req)
> > +		return -ENOMEM;
> > +
> > +	/* Count successful ALLOC requests only */
> > +	req->dpc_conf = 1ULL << 4;
> > +
> > +	err = otx2_sync_mbox_msg(&nic->mbox);
> > +	if (err)
> > +		return err;
> > +
> > +	rsp = (struct npa_cn20k_dpc_alloc_rsp *)otx2_mbox_get_rsp(&nic->mbox.mbox,
> > +								  0, &req->hdr);
> > +	if (IS_ERR(rsp))
> > +		return PTR_ERR(rsp);
> > +
> > +	nic->npa_dpc = rsp->cntr_id;
> > +
> > +	return 0;
> > +}
> 
> [ ... ]
> 
> > diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
> > --- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
> > +++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
> > @@ -1651,6 +1651,9 @@ int otx2_init_hw_resources(struct otx2_nic *pf)
> >  	if (!is_otx2_lbkvf(pf->pdev))
> >  		otx2_nix_config_bp(pf, true);
> >
> > +	if (is_cn20k(pf->pdev))
> > +		cn20k_npa_alloc_dpc(pf);
>                 ^^^^^^^^^^^^^^^^^^^^^
> 
> Should the return value of cn20k_npa_alloc_dpc() be checked here?
> 
DPC counters are for debugging only, so we can proceed even if the
counter was not allocated.

> Looking at cn20k_npa_alloc_dpc(), it can return -ENOMEM or errors from
> otx2_sync_mbox_msg() or PTR_ERR() from otx2_mbox_get_rsp(). If the DPC
> counter allocation fails, pf->npa_dpc remains 0 (from struct
> zero-initialization).
> 
> Then cn20k_halo_aq_init() configures all NPA auras with
> aq->halo.op_dpc_set = pfvf->npa_dpc (= 0), referencing an unallocated
> DPC performance counter. On teardown, cn20k_npa_free_dpc() sends a mbox
> message to free counter ID 0, which was never allocated.
Agreed. I will add an npa_dpc_valid flag and use pfvf->npa_dpc only
when that flag is set, to fix this.

Thanks,
Sundeep
> 
> Every other error-returning mbox operation in this function checks the
> return value and jumps to an error label on failure. Should this follow
> the same pattern?
> 
> > +
> >  	/* Init Auras and pools used by NIX RQ, for free buffer ptrs */
> >  	err = otx2_rq_aura_pool_init(pf);
> >  	if (err) {
> 
> [ ... ]
