Netdev List
 help / color / mirror / Atom feed
* [PATCH v20 net-next 0/9] octeontx2-af: npc: Enhancements.
@ 2026-06-09  4:04 Ratheesh Kannoth
  2026-06-09  4:04 ` [PATCH v20 net-next 1/9] octeontx2-af: enforce single RVU AF probe Ratheesh Kannoth
                   ` (8 more replies)
  0 siblings, 9 replies; 20+ messages in thread
From: Ratheesh Kannoth @ 2026-06-09  4:04 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
	pabeni, sgoutham, Ratheesh Kannoth

This series extends Marvell octeontx2-af support for CN20K NPC (MCAM
debuggability, allocation policy, default-rule lifetime, optional KPU
profiles from firmware files, X2/X4 MCAM keyword handling in flows and
defaults, and dynamic CN20K NPC private state), adds a devlink mechanism
for multi-value parameters, and moves devlink_nl_param_fill() temporaries
to the heap so stack usage stays reasonable once union devlink_param_value
grows (patch 3).

Patch 1 enforces a single RVU admin-function PCI device in the kernel.
On Octeon series SoCs, hardware resources such as NPC, NIX and related
blocks are global and coordinated by the AF driver; PFs and VFs request
them through AF mailbox messages.  Firmware exposes only one AF PCI
function at boot, so two AF driver instances cannot both own that state.
rvu_probe() rejects a second bind with -EBUSY, logs a warning, clears the
probe gate on early allocation failures, and aligns the driver model with
hardware so reviewers and automation can rely on exactly one bound AF.

Patch 2 improves CN20K MCAM visibility in debugfs: mcam_layout marks
enabled entries, dstats reports per-entry hit deltas (baseline updated in
software after each read; hardware counters are not cleared), and mismatch
lists enabled entries without a PF mapping.

Patch 3 allocates the per-configuration-mode union devlink_param_value
buffers and struct devlink_param_gset_ctx used by devlink_nl_param_fill()
with kcalloc()/kzalloc_obj() and funnels failures through a single cleanup
path so the netlink reply path stays safe as the union grows.

Patch 4 (Saeed) introduces DEVLINK_PARAM_TYPE_U64_ARRAY and nested
DEVLINK_ATTR_PARAM_VALUE_DATA attributes so drivers and user space can
exchange bounded u64 arrays; YAML, uapi, and netlink validation are
updated.

Patch 5 adds a runtime devlink parameter srch_order to reorder CN20K
subbank search during MCAM allocation (the param uses the u64 array type
from patch 4).

Patch 6 ties default MCAM entries to NIX LF alloc/free on CN20K, adds
NIX_LF_DONT_FREE_DFT_IDXS for PF teardown paths that must not drop default
NPC indexes while the driver still owns state, and tightens nix_lf_alloc
error propagation.

Patch 7 allows loading a custom KPU profile from /lib/firmware/kpu via
module parameter kpu_profile, with cam2 / ptype_mask wiring and helpers
that share firmware-sourced vs filesystem-sourced profile layouts.

Patch 8 makes default-rule allocation, AF flow install, and PF-side RSS,
defaults, and ethtool flows respect the active CN20K MCAM keyword width
(X2 vs X4), including X4 reference-index masking and -EOPNOTSUPP when a
flow needs X4 keys on an X2-only profile.

Patch 9 replaces file-scope npc_priv and static dstats with allocation
sized from discovered bank/subbank geometry, threads npc_priv_get()
through CN20K NPC paths, and allocates dstats via devm_kzalloc for the
debugfs helper.

Patch 1 is ordered first so later patches assume a single bound AF.
Heap-backed devlink_nl_param_fill() sits immediately before the U64 array
param work so incremental builds stay stack-safe as the union grows; the
CN20K patches keep srch_order ahead of NIX LF coordination, optional KPU
profile load from firmware files, X2/X4 handling, and the npc_priv refactor
that touches the same files heavily.

Ratheesh Kannoth (8):
  octeontx2-af: Enforce single RVU AF probe
  octeontx2-af: npc: cn20k: debugfs enhancements
  devlink: heap-allocate param fill buffers in devlink_nl_param_fill
  octeontx2-af: npc: cn20k: add subbank search order control
  octeontx2: cn20k: Coordinate default rules with NIX LF lifecycle
  octeontx2-af: npc: Support for custom KPU profile from filesystem
  octeontx2: cn20k: Respect NPC MCAM X2/X4 profile in flows and DFT
    alloc
  octeontx2-af: npc: cn20k: Allocate npc_priv and dstats dynamically.

Saeed Mahameed (1):
  devlink: Implement devlink param multi attribute nested data values

 Documentation/netlink/specs/devlink.yaml      |   4 +
 .../marvell/octeontx2/af/cn20k/debugfs.c      | 163 ++++-
 .../ethernet/marvell/octeontx2/af/cn20k/npc.c | 638 ++++++++++++------
 .../ethernet/marvell/octeontx2/af/cn20k/npc.h |  18 +-
 .../net/ethernet/marvell/octeontx2/af/mbox.h  |   1 +
 .../net/ethernet/marvell/octeontx2/af/npc.h   |  17 +
 .../net/ethernet/marvell/octeontx2/af/rvu.c   |  14 +-
 .../net/ethernet/marvell/octeontx2/af/rvu.h   |  12 +-
 .../marvell/octeontx2/af/rvu_devlink.c        |  92 ++-
 .../ethernet/marvell/octeontx2/af/rvu_nix.c   |  77 ++-
 .../ethernet/marvell/octeontx2/af/rvu_npc.c   | 486 ++++++++++---
 .../ethernet/marvell/octeontx2/af/rvu_npc.h   |  17 +
 .../marvell/octeontx2/af/rvu_npc_fs.c         |  12 +-
 .../marvell/octeontx2/af/rvu_reg.h            |   1 +
 .../marvell/octeontx2/nic/otx2_flows.c        |  48 +-
 .../ethernet/marvell/octeontx2/nic/otx2_pf.c  |   6 +-
 include/net/devlink.h                         |   8 +
 include/uapi/linux/devlink.h                  |   1 +
 net/devlink/netlink_gen.c                     |   2 +
 net/devlink/param.c                           |  95 ++-
 20 files changed, 1306 insertions(+), 406 deletions(-)

--
v19 -> v20: Addressed Jakub comments
	https://lore.kernel.org/netdev/20260605063245.3553861-1-rkannoth@marvell.com/

v18 -> v19: Addressed Jakub comments.
	https://lore.kernel.org/netdev/20260602060359.1894952-1-rkannoth@marvell.com/
	Added 1 more patch.

v17 -> v18: Addressed sashiko comments.
	https://lore.kernel.org/netdev/20260601025844.865865-1-rkannoth@marvell.com/

v16 -> v17: Addressed Jakub comments.
	https://lore.kernel.org/netdev/20260521095303.2395584-1-rkannoth@marvell.com/

v15 -> v16: Addressed Sashiko comments
	https://lore.kernel.org/netdev/20260520020939.1457231-1-rkannoth@marvell.com/

v14 -> v15: Addressed Paolo comments
	https://lore.kernel.org/netdev/20260514062537.3813802-1-rkannoth@marvell.com/

v13 -> v14: Addressed sashiko comments.
	I had to revert Jiri comment in v11 as sashiko was complaining about
	leaking kernel memory to userspace.
	https://lore.kernel.org/netdev/20260511033923.1301976-1-rkannoth@marvell.com/

v12 -> v13: Addressed David Laight comments
	https://lore.kernel.org/netdev/20260508034912.4082520-1-rkannoth@marvell.com/

v11 -> v12: Addressed Paolo,Jiri comments.
	https://lore.kernel.org/netdev/20260409025055.1664053-1-rkannoth@marvell.com/
	Added one patch which was rejected by simon in net
	(as it was kind of enhancement rather than a bug)
	Added one more patch- which allocates two variables from heap.

v10 -> v11: Addressed Paolo comments.
	https://lore.kernel.org/netdev/20260403025533.6250-1-rkannoth@marvell.com/

v9 -> v10: Addressed Paolo comments
	https://lore.kernel.org/netdev/
	20260330053105.2722453-1-rkannoth@marvell.com/

v8 -> v9: Addressed Simon comments
	https://lore.kernel.org/netdev/
	20260325072159.1126964-1-rkannoth@marvell.com/

v7 -> v8: Addressed Simon comments
	https://lore.kernel.org/netdev/
	20260323035110.3908741-1-rkannoth@marvell.com/T/#t

v6 -> v7: Addressed Simon comments
	https://lore.kernel.org/netdev/20260320165432.98832-1-horms@kernel.org/

v5 -> v6: Addressed Jakub,Jiri comments
	https://lore.kernel.org/netdev/
	20260317045623.250187-1-rkannoth@marvell.com/

v4 -> v5: Addressed Jakub comments
	https://lore.kernel.org/netdev/
	20260312022754.2029595-6-rkannoth@marvell.com/

v3 -> v4: Addressed Simon comments
	https://lore.kernel.org/netdev/abDeXLpMMxp7G1v3@rkannoth-OptiPlex-7090/#t

v2 -> v3: Addressed Simon comments.
	https://lore.kernel.org/netdev/
	20260304043032.3661647-1-rkannoth@marvell.com/

v1 -> v2: Addressed Jakub comments.
	https://lore.kernel.org/netdev/
	20260302085803.2449828-1-rkannoth@marvell.com/#t

2.43.0

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v20 net-next 1/9] octeontx2-af: enforce single RVU AF probe
  2026-06-09  4:04 [PATCH v20 net-next 0/9] octeontx2-af: npc: Enhancements Ratheesh Kannoth
@ 2026-06-09  4:04 ` Ratheesh Kannoth
  2026-06-09  4:04 ` [PATCH v20 net-next 2/9] octeontx2-af: npc: cn20k: debugfs enhancements Ratheesh Kannoth
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Ratheesh Kannoth @ 2026-06-09  4:04 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
	pabeni, sgoutham, Ratheesh Kannoth

On Octeon series SoCs, the AF is an integrated device within the SoC, and
hardware resources such as NPC, NIX and related blocks are global and
coordinated by the AF driver.  Physical and virtual functions request those
resources via AF mailbox messages, so two AF driver instances cannot both
own that global state; firmware exposes only one AF PCI function at boot
and any further octeontx2-af PCI probe returns -EBUSY so software matches
the single-AF model.

Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
---
 drivers/net/ethernet/marvell/octeontx2/af/rvu.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
index 3cf131508ecf..12db4c7a11f8 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
@@ -3542,19 +3542,29 @@ static void rvu_update_module_params(struct rvu *rvu)
 		kpu_profile ? kpu_profile : default_pfl_name, KPU_NAME_LEN);
 }
 
+static atomic_t device_bound = ATOMIC_INIT(0);
+
 static int rvu_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 {
 	struct device *dev = &pdev->dev;
 	struct rvu *rvu;
 	int    err;
 
+	if (atomic_cmpxchg(&device_bound, 0, 1) != 0) {
+		dev_warn(dev, "Only one af device is supported.\n");
+		return -EBUSY;
+	}
+
 	rvu = devm_kzalloc(dev, sizeof(*rvu), GFP_KERNEL);
-	if (!rvu)
+	if (!rvu) {
+		atomic_set(&device_bound, 0);
 		return -ENOMEM;
+	}
 
 	rvu->hw = devm_kzalloc(dev, sizeof(struct rvu_hwinfo), GFP_KERNEL);
 	if (!rvu->hw) {
 		devm_kfree(dev, rvu);
+		atomic_set(&device_bound, 0);
 		return -ENOMEM;
 	}
 
@@ -3687,6 +3697,7 @@ static int rvu_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	pci_set_drvdata(pdev, NULL);
 	devm_kfree(&pdev->dev, rvu->hw);
 	devm_kfree(dev, rvu);
+	atomic_set(&device_bound, 0);
 	return err;
 }
 
@@ -3716,6 +3727,7 @@ static void rvu_remove(struct pci_dev *pdev)
 		cn20k_free_mbox_memory(rvu);
 	kfree(rvu->ng_rvu);
 	devm_kfree(&pdev->dev, rvu);
+	atomic_set(&device_bound, 0);
 }
 
 static void rvu_shutdown(struct pci_dev *pdev)
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v20 net-next 2/9] octeontx2-af: npc: cn20k: debugfs enhancements
  2026-06-09  4:04 [PATCH v20 net-next 0/9] octeontx2-af: npc: Enhancements Ratheesh Kannoth
  2026-06-09  4:04 ` [PATCH v20 net-next 1/9] octeontx2-af: enforce single RVU AF probe Ratheesh Kannoth
@ 2026-06-09  4:04 ` Ratheesh Kannoth
  2026-06-10  4:19   ` Ratheesh Kannoth
  2026-06-10  5:08   ` Ratheesh Kannoth
  2026-06-09  4:04 ` [PATCH v20 net-next 3/9] devlink: heap-allocate param fill buffers in devlink_nl_param_fill Ratheesh Kannoth
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 20+ messages in thread
From: Ratheesh Kannoth @ 2026-06-09  4:04 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
	pabeni, sgoutham, Ratheesh Kannoth

Improve MCAM visibility and field debugging for CN20K NPC.

- Extend "mcam_layout" to show enabled (+) or disabled state per entry
  so status can be verified without parsing the full "mcam_entry" dump.
- Add "dstats" debugfs entry: for enabled MCAM indices, print hit deltas
  since the prior read by comparing hardware counters to a per-entry
  software baseline and advancing that baseline after each read (hardware
  counters are not cleared).
- Add "mismatch" debugfs entry: lists MCAM entries that are enabled
  but not explicitly allocated, helping diagnose allocation/field issues.

Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
---
 .../marvell/octeontx2/af/cn20k/debugfs.c      | 158 +++++++++++++++++-
 .../ethernet/marvell/octeontx2/af/cn20k/npc.c |  43 ++++-
 .../ethernet/marvell/octeontx2/af/cn20k/npc.h |  11 ++
 3 files changed, 197 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c
index 6f13296303cb..730ef97a57e6 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c
@@ -13,6 +13,7 @@
 #include "struct.h"
 #include "rvu.h"
 #include "debugfs.h"
+#include "cn20k/reg.h"
 #include "cn20k/npc.h"
 
 static int npc_mcam_layout_show(struct seq_file *s, void *unused)
@@ -58,7 +59,8 @@ static int npc_mcam_layout_show(struct seq_file *s, void *unused)
 						 "v:%u", vidx0);
 				}
 
-				seq_printf(s, "\t%u(%#x) %s\n", idx0, pf1,
+				seq_printf(s, "\t%u(%#x)%c %s\n", idx0, pf1,
+					   test_bit(idx0, npc_priv->en_map) ? '+' : ' ',
 					   map ? buf0 : " ");
 			}
 			goto next;
@@ -101,9 +103,13 @@ static int npc_mcam_layout_show(struct seq_file *s, void *unused)
 						 vidx1);
 				}
 
-				seq_printf(s, "%05u(%#x) %s\t\t%05u(%#x) %s\n",
-					   idx1, pf2, v1 ? buf1 : "       ",
-					   idx0, pf1, v0 ? buf0 : "       ");
+				seq_printf(s, "%05u(%#x)%c %s\t\t%05u(%#x)%c %s\n",
+					   idx1, pf2,
+					   test_bit(idx1, npc_priv->en_map) ? '+' : ' ',
+					   v1 ? buf1 : "       ",
+					   idx0, pf1,
+					   test_bit(idx0, npc_priv->en_map) ? '+' : ' ',
+					   v0 ? buf0 : "       ");
 
 				continue;
 			}
@@ -120,8 +126,9 @@ static int npc_mcam_layout_show(struct seq_file *s, void *unused)
 						 vidx0);
 				}
 
-				seq_printf(s, "\t\t   \t\t%05u(%#x) %s\n", idx0,
-					   pf1, map ? buf0 : " ");
+				seq_printf(s, "\t\t   \t\t%05u(%#x)%c %s\n", idx0, pf1,
+					   test_bit(idx0, npc_priv->en_map) ? '+' : ' ',
+					   map ? buf0 : " ");
 				continue;
 			}
 
@@ -134,7 +141,8 @@ static int npc_mcam_layout_show(struct seq_file *s, void *unused)
 				snprintf(buf1, sizeof(buf1), "v:%05u", vidx1);
 			}
 
-			seq_printf(s, "%05u(%#x) %s\n", idx1, pf1,
+			seq_printf(s, "%05u(%#x)%c %s\n", idx1, pf1,
+				   test_bit(idx1, npc_priv->en_map) ? '+' : ' ',
 				   map ? buf1 : " ");
 		}
 next:
@@ -145,6 +153,136 @@ static int npc_mcam_layout_show(struct seq_file *s, void *unused)
 
 DEFINE_SHOW_ATTRIBUTE(npc_mcam_layout);
 
+#define __OCTEONTX2_DEBUGFS_ATTRIBUTE_FOPS(__name)			\
+static const struct file_operations __name ## _fops = {			\
+	.owner = THIS_MODULE,						\
+	.open = __name ## _open,					\
+	.read = seq_read,						\
+	.llseek = seq_lseek,						\
+	.release = single_release,					\
+}
+
+#define DEFINE_OCTEONTX2_DEBUGFS_ATTRIBUTE_WITH_SIZE(__name, __size)		\
+static int __name ## _open(struct inode *inode, struct file *file)		\
+{										\
+	return single_open_size(file, __name ## _show, inode->i_private,	\
+				__size);					\
+}										\
+__OCTEONTX2_DEBUGFS_ATTRIBUTE_FOPS(__name)
+
+static DEFINE_MUTEX(stats_lock);
+
+/* MAX_NUM_BANKS, MAX_SUBBANK_DEPTH and MAX_NUM_SUB_BANKS represent
+ * hard limit on all silicon variants, preventing any possibility of
+ * out-of-bounds access.
+ */
+static u64 dstats[MAX_NUM_BANKS][MAX_SUBBANK_DEPTH * MAX_NUM_SUB_BANKS] = {};
+static int npc_mcam_dstats_show(struct seq_file *s, void *unused)
+{
+	struct npc_priv_t *npc_priv;
+	int blkaddr, pf, mcam_idx;
+	u64 stats, delta;
+	struct rvu *rvu;
+	char buff[64];
+	u8 key_type;
+	void *map;
+
+	npc_priv = npc_priv_get();
+	rvu = s->private;
+	blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPC, 0);
+	if (blkaddr < 0)
+		return 0;
+
+	mutex_lock(&stats_lock);
+	seq_puts(s, "idx\tpfunc\tstats\n");
+	for (int bank = npc_priv->num_banks - 1; bank >= 0; bank--) {
+		for (int idx = npc_priv->bank_depth - 1; idx >= 0; idx--) {
+			mcam_idx = bank * npc_priv->bank_depth + idx;
+
+			if (npc_mcam_idx_2_key_type(rvu, mcam_idx, &key_type))
+				continue;
+
+			if (key_type == NPC_MCAM_KEY_X4 && bank != 0)
+				continue;
+
+			if (!test_bit(mcam_idx, npc_priv->en_map))
+				continue;
+
+			stats = rvu_read64(rvu, blkaddr,
+					   NPC_AF_CN20K_MCAMEX_BANKX_STAT_EXT(idx, bank));
+			if (!stats)
+				continue;
+			if (stats == dstats[bank][idx])
+				continue;
+
+			if (stats < dstats[bank][idx])
+				dstats[bank][idx] = 0;
+
+			pf = 0xFFFF;
+			map = xa_load(&npc_priv->xa_idx2pf_map, mcam_idx);
+			if (map)
+				pf = xa_to_value(map);
+
+			delta = stats - dstats[bank][idx];
+
+			snprintf(buff, sizeof(buff), "%u\t%#04x\t%llu\n",
+				 mcam_idx, pf, delta);
+			seq_puts(s, buff);
+
+			dstats[bank][idx] = stats;
+		}
+	}
+
+	mutex_unlock(&stats_lock);
+	return 0;
+}
+
+/*  "%u\t%#04x\t%llu\n" needs less than 64 characters to print */
+#define TOTAL_SZ (MAX_NUM_BANKS * MAX_NUM_SUB_BANKS * MAX_SUBBANK_DEPTH * 64)
+DEFINE_OCTEONTX2_DEBUGFS_ATTRIBUTE_WITH_SIZE(npc_mcam_dstats, TOTAL_SZ);
+
+static int npc_mcam_mismatch_show(struct seq_file *s, void *unused)
+{
+	struct npc_priv_t *npc_priv;
+	struct npc_subbank *sb;
+	int mcam_idx, sb_off;
+	struct rvu *rvu;
+	char buff[64];
+	void *map;
+	int rc;
+
+	npc_priv = npc_priv_get();
+	rvu = s->private;
+
+	seq_puts(s, "index\tsb idx\tkw type\n");
+	for (int bank = npc_priv->num_banks - 1; bank >= 0; bank--) {
+		for (int idx = npc_priv->bank_depth - 1; idx >= 0; idx--) {
+			mcam_idx = bank * npc_priv->bank_depth + idx;
+
+			if (!test_bit(mcam_idx, npc_priv->en_map))
+				continue;
+
+			map = xa_load(&npc_priv->xa_idx2pf_map, mcam_idx);
+			if (map)
+				continue;
+
+			rc = npc_mcam_idx_2_subbank_idx(rvu, mcam_idx,
+							&sb, &sb_off);
+			if (rc)
+				continue;
+
+			snprintf(buff, sizeof(buff), "%u\t%d\t%u\n",
+				 mcam_idx, sb->idx, sb->key_type);
+
+			seq_puts(s, buff);
+		}
+	}
+	return 0;
+}
+
+/* "%u\t%d\t%u\n" needs less than 64 characters to print. */
+DEFINE_OCTEONTX2_DEBUGFS_ATTRIBUTE_WITH_SIZE(npc_mcam_mismatch, TOTAL_SZ);
+
 static int npc_mcam_default_show(struct seq_file *s, void *unused)
 {
 	struct npc_priv_t *npc_priv;
@@ -259,6 +397,12 @@ int npc_cn20k_debugfs_init(struct rvu *rvu)
 	debugfs_create_file("vidx2idx", 0444, rvu->rvu_dbg.npc,
 			    npc_priv, &npc_vidx2idx_map_fops);
 
+	debugfs_create_file("dstats", 0444, rvu->rvu_dbg.npc, rvu,
+			    &npc_mcam_dstats_fops);
+
+	debugfs_create_file("mismatch", 0444, rvu->rvu_dbg.npc, rvu,
+			    &npc_mcam_mismatch_fops);
+
 	debugfs_create_file("idx2vidx", 0444, rvu->rvu_dbg.npc,
 			    npc_priv, &npc_idx2vidx_map_fops);
 
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
index 003487d7c3cf..dd5d88ba0bc2 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
@@ -824,7 +824,7 @@ npc_cn20k_enable_mcam_entry(struct rvu *rvu, int blkaddr,
 		rvu_write64(rvu, blkaddr,
 			    NPC_AF_CN20K_MCAMEX_BANKX_CFG_EXT(mcam_idx, bank),
 			    cfg);
-		return 0;
+		goto update_en_map;
 	}
 
 	/* For NPC_CN20K_MCAM_KEY_X4 keys, both the banks
@@ -842,6 +842,12 @@ npc_cn20k_enable_mcam_entry(struct rvu *rvu, int blkaddr,
 			    cfg);
 	}
 
+update_en_map:
+	if (enable)
+		set_bit(index, npc_priv.en_map);
+	else
+		clear_bit(index, npc_priv.en_map);
+
 	return 0;
 }
 
@@ -1789,9 +1795,9 @@ static int npc_subbank_idx_2_mcam_idx(struct rvu *rvu, struct npc_subbank *sb,
 	return 0;
 }
 
-static int npc_mcam_idx_2_subbank_idx(struct rvu *rvu, u16 mcam_idx,
-				      struct npc_subbank **sb,
-				      int *sb_off)
+int npc_mcam_idx_2_subbank_idx(struct rvu *rvu, u16 mcam_idx,
+			       struct npc_subbank **sb,
+			       int *sb_off)
 {
 	int bank_off, sb_id;
 
@@ -4498,11 +4504,17 @@ static int npc_priv_init(struct rvu *rvu)
 		npc_const2 = rvu_read64(rvu, blkaddr, NPC_AF_CONST2);
 
 	num_banks = mcam->banks;
-	bank_depth = mcam->banksize;
+	if (!num_banks || num_banks > MAX_NUM_BANKS) {
+		dev_err(rvu->dev,
+			"Number of banks(%u) is invalid\n", num_banks);
+		return -EINVAL;
+	}
 
 	num_subbanks = FIELD_GET(GENMASK_ULL(39, 32), npc_const2);
-	if (!num_subbanks) {
-		dev_err(rvu->dev, "Number of subbanks is zero\n");
+	if (!num_subbanks || num_subbanks > MAX_NUM_SUB_BANKS) {
+		dev_err(rvu->dev,
+			"Number of subbanks is invalid %u\n",
+			num_subbanks);
 		return -EFAULT;
 	}
 
@@ -4513,10 +4525,23 @@ static int npc_priv_init(struct rvu *rvu)
 		return -EINVAL;
 	}
 
-	npc_priv.num_subbanks = num_subbanks;
+	bank_depth = mcam->banksize;
+	if (!bank_depth || bank_depth % num_subbanks) {
+		dev_err(rvu->dev,
+			"Bank depth(%u) should be a multiple of num_subbanks(%u)\n",
+			bank_depth, num_subbanks);
+		return -EINVAL;
+	}
 
 	subbank_depth =	bank_depth / num_subbanks;
+	if (!subbank_depth || subbank_depth > MAX_SUBBANK_DEPTH) {
+		dev_err(rvu->dev,
+			"Invalid subbank depth %u\n",
+			subbank_depth);
+		return -EINVAL;
+	}
 
+	npc_priv.num_subbanks = num_subbanks;
 	npc_priv.bank_depth = bank_depth;
 	npc_priv.subbank_depth = subbank_depth;
 
@@ -4605,6 +4630,8 @@ void npc_cn20k_deinit(struct rvu *rvu)
 	 */
 	kfree(npc_priv.sb);
 	kfree(subbank_srch_order);
+	bitmap_clear(npc_priv.en_map, 0, MAX_NUM_BANKS * MAX_NUM_SUB_BANKS *
+		     MAX_SUBBANK_DEPTH);
 }
 
 static int npc_setup_mcam_section(struct rvu *rvu, int key_type)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
index 3d5eb952cc07..3e851950be64 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
@@ -10,6 +10,10 @@
 
 #define MKEX_CN20K_SIGN	0x19bbfdbd160
 
+/* MAX_NUM_BANKS, MAX_SUBBANK_DEPTH and MAX_NUM_SUB_BANKS represent
+ * hard limit on all silicon variants, preventing any possibility of
+ * out-of-bounds access on matrix defined using these values.
+ */
 #define MAX_NUM_BANKS 2
 #define MAX_NUM_SUB_BANKS 32
 #define MAX_SUBBANK_DEPTH 256
@@ -170,6 +174,7 @@ struct npc_defrag_show_node {
  * @num_banks:		Number of banks.
  * @num_subbanks:	Number of subbanks.
  * @subbank_depth:	Depth of subbank.
+ * @en_map:		Enable/disable status.
  * @kw:			Kex configured key type.
  * @sb:			Subbank array.
  * @xa_sb_used:		Array of used subbanks.
@@ -193,6 +198,9 @@ struct npc_priv_t {
 	const int num_banks;
 	int num_subbanks;
 	int subbank_depth;
+	DECLARE_BITMAP(en_map, MAX_NUM_BANKS *
+		       MAX_NUM_SUB_BANKS *
+		       MAX_SUBBANK_DEPTH);
 	u8 kw;
 	struct npc_subbank *sb;
 	struct xarray xa_sb_used;
@@ -336,5 +344,8 @@ u16 npc_cn20k_vidx2idx(u16 index);
 u16 npc_cn20k_idx2vidx(u16 idx);
 int npc_cn20k_defrag(struct rvu *rvu);
 bool npc_is_cgx_or_lbk(struct rvu *rvu, u16 pcifunc);
+int npc_mcam_idx_2_subbank_idx(struct rvu *rvu, u16 mcam_idx,
+			       struct npc_subbank **sb,
+			       int *sb_off);
 
 #endif /* NPC_CN20K_H */
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v20 net-next 3/9] devlink: heap-allocate param fill buffers in devlink_nl_param_fill
  2026-06-09  4:04 [PATCH v20 net-next 0/9] octeontx2-af: npc: Enhancements Ratheesh Kannoth
  2026-06-09  4:04 ` [PATCH v20 net-next 1/9] octeontx2-af: enforce single RVU AF probe Ratheesh Kannoth
  2026-06-09  4:04 ` [PATCH v20 net-next 2/9] octeontx2-af: npc: cn20k: debugfs enhancements Ratheesh Kannoth
@ 2026-06-09  4:04 ` Ratheesh Kannoth
  2026-06-09  4:04 ` [PATCH v20 net-next 4/9] devlink: Implement devlink param multi attribute nested data values Ratheesh Kannoth
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Ratheesh Kannoth @ 2026-06-09  4:04 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
	pabeni, sgoutham, Ratheesh Kannoth

devlink_nl_param_fill() kept two per-configuration-mode copies of
union devlink_param_value plus a struct devlink_param_gset_ctx on the
stack while building the Netlink reply. Allocate those with kcalloc()
and kzalloc_obj() instead, and route failures through a single cleanup
path so temporary buffers are always freed.

Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
---
 net/devlink/param.c | 62 +++++++++++++++++++++++++++++++++------------
 1 file changed, 46 insertions(+), 16 deletions(-)

diff --git a/net/devlink/param.c b/net/devlink/param.c
index 1a196d3a843d..bd3881349c60 100644
--- a/net/devlink/param.c
+++ b/net/devlink/param.c
@@ -304,56 +304,79 @@ static int devlink_nl_param_fill(struct sk_buff *msg, struct devlink *devlink,
 				 u32 portid, u32 seq, int flags,
 				 struct netlink_ext_ack *extack)
 {
-	union devlink_param_value default_value[DEVLINK_PARAM_CMODE_MAX + 1];
-	union devlink_param_value param_value[DEVLINK_PARAM_CMODE_MAX + 1];
 	bool default_value_set[DEVLINK_PARAM_CMODE_MAX + 1] = {};
 	bool param_value_set[DEVLINK_PARAM_CMODE_MAX + 1] = {};
 	const struct devlink_param *param = param_item->param;
-	struct devlink_param_gset_ctx ctx;
+	union devlink_param_value *default_value;
+	union devlink_param_value *param_value;
+	struct devlink_param_gset_ctx *ctx;
 	struct nlattr *param_values_list;
 	struct nlattr *param_attr;
 	void *hdr;
 	int err;
 	int i;
 
+	default_value = kcalloc(DEVLINK_PARAM_CMODE_MAX + 1,
+				sizeof(*default_value), GFP_KERNEL);
+	if (!default_value)
+		return -ENOMEM;
+
+	param_value = kcalloc(DEVLINK_PARAM_CMODE_MAX + 1,
+			      sizeof(*param_value), GFP_KERNEL);
+	if (!param_value) {
+		kfree(default_value);
+		return -ENOMEM;
+	}
+
+	ctx = kzalloc_obj(*ctx);
+	if (!ctx) {
+		kfree(param_value);
+		kfree(default_value);
+		return -ENOMEM;
+	}
+
 	/* Get value from driver part to driverinit configuration mode */
 	for (i = 0; i <= DEVLINK_PARAM_CMODE_MAX; i++) {
 		if (!devlink_param_cmode_is_supported(param, i))
 			continue;
 		if (i == DEVLINK_PARAM_CMODE_DRIVERINIT) {
-			if (param_item->driverinit_value_new_valid)
+			if (param_item->driverinit_value_new_valid) {
 				param_value[i] = param_item->driverinit_value_new;
-			else if (param_item->driverinit_value_valid)
+			} else if (param_item->driverinit_value_valid) {
 				param_value[i] = param_item->driverinit_value;
-			else
-				return -EOPNOTSUPP;
+			} else {
+				err = -EOPNOTSUPP;
+				goto get_put_fail;
+			}
 
 			if (param_item->driverinit_value_valid) {
 				default_value[i] = param_item->driverinit_default;
 				default_value_set[i] = true;
 			}
 		} else {
-			ctx.cmode = i;
-			err = devlink_param_get(devlink, param, &ctx, extack);
+			ctx->cmode = i;
+			err = devlink_param_get(devlink, param, ctx, extack);
 			if (err)
-				return err;
-			param_value[i] = ctx.val;
+				goto get_put_fail;
 
-			err = devlink_param_get_default(devlink, param, &ctx,
+			param_value[i] = ctx->val;
+
+			err = devlink_param_get_default(devlink, param, ctx,
 							extack);
 			if (!err) {
-				default_value[i] = ctx.val;
+				default_value[i] = ctx->val;
 				default_value_set[i] = true;
 			} else if (err != -EOPNOTSUPP) {
-				return err;
+				goto get_put_fail;
 			}
 		}
 		param_value_set[i] = true;
 	}
 
+	err = -EMSGSIZE;
 	hdr = genlmsg_put(msg, portid, seq, &devlink_nl_family, flags, cmd);
 	if (!hdr)
-		return -EMSGSIZE;
+		goto get_put_fail;
 
 	if (devlink_nl_put_handle(msg, devlink))
 		goto genlmsg_cancel;
@@ -393,6 +416,9 @@ static int devlink_nl_param_fill(struct sk_buff *msg, struct devlink *devlink,
 	nla_nest_end(msg, param_values_list);
 	nla_nest_end(msg, param_attr);
 	genlmsg_end(msg, hdr);
+	kfree(default_value);
+	kfree(param_value);
+	kfree(ctx);
 	return 0;
 
 values_list_nest_cancel:
@@ -401,7 +427,11 @@ static int devlink_nl_param_fill(struct sk_buff *msg, struct devlink *devlink,
 	nla_nest_cancel(msg, param_attr);
 genlmsg_cancel:
 	genlmsg_cancel(msg, hdr);
-	return -EMSGSIZE;
+get_put_fail:
+	kfree(default_value);
+	kfree(param_value);
+	kfree(ctx);
+	return err;
 }
 
 static void devlink_param_notify(struct devlink *devlink,
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v20 net-next 4/9] devlink: Implement devlink param multi attribute nested data values
  2026-06-09  4:04 [PATCH v20 net-next 0/9] octeontx2-af: npc: Enhancements Ratheesh Kannoth
                   ` (2 preceding siblings ...)
  2026-06-09  4:04 ` [PATCH v20 net-next 3/9] devlink: heap-allocate param fill buffers in devlink_nl_param_fill Ratheesh Kannoth
@ 2026-06-09  4:04 ` Ratheesh Kannoth
  2026-06-09  4:04 ` [PATCH v20 net-next 5/9] octeontx2-af: npc: cn20k: add subbank search order control Ratheesh Kannoth
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Ratheesh Kannoth @ 2026-06-09  4:04 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
	pabeni, sgoutham, Saeed Mahameed, Ratheesh Kannoth

From: Saeed Mahameed <saeedm@nvidia.com>

Devlink param value attribute is not defined since devlink is handling
the value validating and parsing internally, this allows us to implement
multi attribute values without breaking any policies.

Devlink param multi-attribute values are considered to be dynamically
sized arrays of u64 values, by introducing a new devlink param type
DEVLINK_PARAM_TYPE_U64_ARRAY, driver and user space can set a variable
count of u64 values into the DEVLINK_ATTR_PARAM_VALUE_DATA attribute.

Implement get/set parsing and add to the internal value structure passed
to drivers.

This is useful for devices that need to configure a list of values for
a specific configuration.

example:
$ devlink dev param show pci/... name multi-value-param
name multi-value-param type driver-specific
values:
cmode permanent value: 0,1,2,3,4,5,6,7

$ devlink dev param set pci/... name multi-value-param \
		value 4,5,6,7,0,1,2,3 cmode permanent

Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
---
 Documentation/netlink/specs/devlink.yaml |  4 +++
 include/net/devlink.h                    |  8 ++++++
 include/uapi/linux/devlink.h             |  1 +
 net/devlink/netlink_gen.c                |  2 ++
 net/devlink/param.c                      | 33 +++++++++++++++++++++++-
 5 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/Documentation/netlink/specs/devlink.yaml b/Documentation/netlink/specs/devlink.yaml
index 247b147d689f..52ad1e7805d1 100644
--- a/Documentation/netlink/specs/devlink.yaml
+++ b/Documentation/netlink/specs/devlink.yaml
@@ -234,6 +234,10 @@ definitions:
         value: 10
       -
         name: binary
+      -
+        name: u64-array
+        value: 129
+
   -
     name: rate-tc-index-max
     type: const
diff --git a/include/net/devlink.h b/include/net/devlink.h
index 5f4083dc4345..dd546dbd57cf 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -433,6 +433,13 @@ enum devlink_param_type {
 	DEVLINK_PARAM_TYPE_U64 = DEVLINK_VAR_ATTR_TYPE_U64,
 	DEVLINK_PARAM_TYPE_STRING = DEVLINK_VAR_ATTR_TYPE_STRING,
 	DEVLINK_PARAM_TYPE_BOOL = DEVLINK_VAR_ATTR_TYPE_FLAG,
+	DEVLINK_PARAM_TYPE_U64_ARRAY = DEVLINK_VAR_ATTR_TYPE_U64_ARRAY,
+};
+
+#define __DEVLINK_PARAM_MAX_ARRAY_SIZE 32
+struct devlink_param_u64_array {
+	u64 size;
+	u64 val[__DEVLINK_PARAM_MAX_ARRAY_SIZE];
 };
 
 union devlink_param_value {
@@ -442,6 +449,7 @@ union devlink_param_value {
 	u64 vu64;
 	char vstr[__DEVLINK_PARAM_MAX_STRING_VALUE];
 	bool vbool;
+	struct devlink_param_u64_array u64arr;
 };
 
 struct devlink_param_gset_ctx {
diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index 0b165eac7619..ca713bcc47b9 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -406,6 +406,7 @@ enum devlink_var_attr_type {
 	DEVLINK_VAR_ATTR_TYPE_BINARY,
 	__DEVLINK_VAR_ATTR_TYPE_CUSTOM_BASE = 0x80,
 	/* Any possible custom types, unrelated to NLA_* values go below */
+	DEVLINK_VAR_ATTR_TYPE_U64_ARRAY,
 };
 
 enum devlink_attr {
diff --git a/net/devlink/netlink_gen.c b/net/devlink/netlink_gen.c
index 81899786fd98..f52b0c2b19ed 100644
--- a/net/devlink/netlink_gen.c
+++ b/net/devlink/netlink_gen.c
@@ -37,6 +37,8 @@ devlink_attr_param_type_validate(const struct nlattr *attr,
 	case DEVLINK_VAR_ATTR_TYPE_NUL_STRING:
 		fallthrough;
 	case DEVLINK_VAR_ATTR_TYPE_BINARY:
+		fallthrough;
+	case DEVLINK_VAR_ATTR_TYPE_U64_ARRAY:
 		return 0;
 	}
 	NL_SET_ERR_MSG_ATTR(extack, attr, "invalid enum value");
diff --git a/net/devlink/param.c b/net/devlink/param.c
index bd3881349c60..3e9d2e5750c2 100644
--- a/net/devlink/param.c
+++ b/net/devlink/param.c
@@ -252,6 +252,15 @@ devlink_nl_param_value_put(struct sk_buff *msg, enum devlink_param_type type,
 				return -EMSGSIZE;
 		}
 		break;
+	case DEVLINK_PARAM_TYPE_U64_ARRAY:
+		if (val->u64arr.size > __DEVLINK_PARAM_MAX_ARRAY_SIZE)
+			return -EMSGSIZE;
+
+		for (int i = 0; i < val->u64arr.size; i++) {
+			if (nla_put_uint(msg, nla_type, val->u64arr.val[i]))
+				return -EMSGSIZE;
+		}
+		break;
 	}
 	return 0;
 }
@@ -537,7 +546,7 @@ devlink_param_value_get_from_info(const struct devlink_param *param,
 				  union devlink_param_value *value)
 {
 	struct nlattr *param_data;
-	int len;
+	int len, cnt, rem;
 
 	param_data = info->attrs[DEVLINK_ATTR_PARAM_VALUE_DATA];
 
@@ -577,6 +586,28 @@ devlink_param_value_get_from_info(const struct devlink_param *param,
 			return -EINVAL;
 		value->vbool = nla_get_flag(param_data);
 		break;
+
+	case DEVLINK_PARAM_TYPE_U64_ARRAY:
+		cnt = 0;
+		nla_for_each_attr_type(param_data,
+				       DEVLINK_ATTR_PARAM_VALUE_DATA,
+				       genlmsg_data(info->genlhdr),
+				       genlmsg_len(info->genlhdr), rem) {
+			if (cnt >= __DEVLINK_PARAM_MAX_ARRAY_SIZE)
+				return -EMSGSIZE;
+
+			if ((nla_len(param_data) != sizeof(u64)) &&
+			    (nla_len(param_data) != sizeof(u32))) {
+				NL_SET_BAD_ATTR(info->extack, param_data);
+				return -EINVAL;
+			}
+
+			value->u64arr.val[cnt] = nla_get_uint(param_data);
+			cnt++;
+		}
+
+		value->u64arr.size = cnt;
+		break;
 	}
 	return 0;
 }
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v20 net-next 5/9] octeontx2-af: npc: cn20k: add subbank search order control
  2026-06-09  4:04 [PATCH v20 net-next 0/9] octeontx2-af: npc: Enhancements Ratheesh Kannoth
                   ` (3 preceding siblings ...)
  2026-06-09  4:04 ` [PATCH v20 net-next 4/9] devlink: Implement devlink param multi attribute nested data values Ratheesh Kannoth
@ 2026-06-09  4:04 ` Ratheesh Kannoth
  2026-06-10  4:28   ` Ratheesh Kannoth
  2026-06-10  5:10   ` Ratheesh Kannoth
  2026-06-09  4:04 ` [PATCH v20 net-next 6/9] octeontx2: cn20k: Coordinate default rules with NIX LF lifecycle Ratheesh Kannoth
                   ` (3 subsequent siblings)
  8 siblings, 2 replies; 20+ messages in thread
From: Ratheesh Kannoth @ 2026-06-09  4:04 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
	pabeni, sgoutham, Ratheesh Kannoth

CN20K NPC MCAM is split into 32 subbanks that are searched in a
predefined order during allocation. Lower-numbered subbanks have
higher priority than higher-numbered ones.

Add a runtime "srch_order" to control the order in which
subbanks are searched during MCAM allocation.

Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
---
 .../ethernet/marvell/octeontx2/af/cn20k/npc.c | 120 +++++++++++++++++-
 .../ethernet/marvell/octeontx2/af/cn20k/npc.h |   3 +
 .../marvell/octeontx2/af/rvu_devlink.c        |  92 ++++++++++++--
 3 files changed, 203 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
index dd5d88ba0bc2..5bc2f07b5525 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
@@ -3376,7 +3376,7 @@ rvu_mbox_handler_npc_cn20k_get_kex_cfg(struct rvu *rvu,
 	return 0;
 }
 
-static int *subbank_srch_order;
+static u32 *subbank_srch_order;
 
 static void npc_populate_restricted_idxs(int num_subbanks)
 {
@@ -3388,7 +3388,7 @@ static int npc_create_srch_order(int cnt)
 {
 	int val = 0;
 
-	subbank_srch_order = kcalloc(cnt, sizeof(int),
+	subbank_srch_order = kcalloc(cnt, sizeof(u32),
 				     GFP_KERNEL);
 	if (!subbank_srch_order)
 		return -ENOMEM;
@@ -3906,6 +3906,122 @@ static void npc_unlock_all_subbank(void)
 		mutex_unlock(&npc_priv.sb[i].lock);
 }
 
+int npc_cn20k_search_order_set(struct rvu *rvu,
+			       u64 narr[MAX_NUM_SUB_BANKS], int cnt)
+{
+	struct npc_mcam *mcam = &rvu->hw->mcam;
+	int rsrc[2][MAX_NUM_SUB_BANKS] = { };
+	u8 save[MAX_NUM_SUB_BANKS] = { };
+	struct npc_subbank *sb;
+	struct xarray *xa;
+	int prio, rc, err;
+	int sb_idx;
+	enum {
+		FREE = 0,
+		USED = 1,
+	};
+
+	if (cnt != npc_priv.num_subbanks) {
+		dev_err(rvu->dev, "Number of entries(%u) != %u\n",
+			cnt, npc_priv.num_subbanks);
+		return -EINVAL;
+	}
+
+	mutex_lock(&mcam->lock);
+	npc_lock_all_subbank();
+
+	for (sb_idx = 0; sb_idx < cnt; sb_idx++) {
+		sb = &npc_priv.sb[sb_idx];
+		save[sb->idx] = sb->arr_idx;
+	}
+
+	for (prio = 0; prio < cnt; prio++) {
+		sb_idx = narr[prio];
+		sb = &npc_priv.sb[sb_idx];
+
+		if (sb->flags & NPC_SUBBANK_FLAG_USED)
+			xa = &npc_priv.xa_sb_used;
+		else
+			xa = &npc_priv.xa_sb_free;
+
+		rc = xa_err(xa_store(xa, prio,
+				     xa_mk_value(sb_idx), GFP_KERNEL));
+		if (rc) {
+			dev_err(rvu->dev,
+				"Setting arr_idx=%d for sb=%d failed\n",
+				sb->arr_idx, sb_idx);
+			goto fail;
+		}
+
+		if (sb->flags & NPC_SUBBANK_FLAG_USED) {
+			rsrc[USED][sb->arr_idx] -= 1;
+			rsrc[USED][prio] += 1;
+		} else {
+			rsrc[FREE][sb->arr_idx] -= 1;
+			rsrc[FREE][prio] += 1;
+		}
+
+		sb->arr_idx = prio;
+	}
+
+	for (prio = 0; prio < cnt; prio++) {
+		if (rsrc[FREE][prio] == -1)
+			xa_erase(&npc_priv.xa_sb_free, prio);
+
+		if (rsrc[USED][prio] == -1)
+			xa_erase(&npc_priv.xa_sb_used, prio);
+	}
+
+	for (int i = 0; i < cnt; i++)
+		subbank_srch_order[i] = (u32)narr[i];
+
+	restrict_valid = false;
+
+	npc_unlock_all_subbank();
+	mutex_unlock(&mcam->lock);
+
+	return 0;
+
+fail:
+	for (prio = 0; prio < cnt; prio++) {
+		if (rsrc[FREE][prio] == 1)
+			xa_erase(&npc_priv.xa_sb_free, prio);
+
+		if (rsrc[USED][prio] == 1)
+			xa_erase(&npc_priv.xa_sb_used, prio);
+	}
+
+	for (sb_idx = 0; sb_idx < cnt; sb_idx++) {
+		sb = &npc_priv.sb[sb_idx];
+		sb->arr_idx = save[sb_idx];
+
+		if (sb->flags & NPC_SUBBANK_FLAG_USED)
+			xa = &npc_priv.xa_sb_used;
+		else
+			xa = &npc_priv.xa_sb_free;
+
+		/* Since the entry already exists, xa_store() replaces
+		 * the value without a kmalloc(), making failure highly unlikely.
+		 */
+		err = xa_err(xa_store(xa, sb->arr_idx,
+				      xa_mk_value(sb->idx), GFP_KERNEL));
+		WARN(!!err, "Failed to rollback sb=%u idx=%u\n",
+		     sb->idx, sb->arr_idx);
+	}
+
+	npc_unlock_all_subbank();
+	mutex_unlock(&mcam->lock);
+
+	return rc;
+}
+
+const u32 *npc_cn20k_search_order_get(bool *restricted_order, u32 *sz)
+{
+	*restricted_order = restrict_valid;
+	*sz = npc_priv.num_subbanks;
+	return subbank_srch_order;
+}
+
 /* Only non-ref non-contigous mcam indexes
  * are picked for defrag process
  */
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
index 3e851950be64..8bf857317e49 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
@@ -347,5 +347,8 @@ bool npc_is_cgx_or_lbk(struct rvu *rvu, u16 pcifunc);
 int npc_mcam_idx_2_subbank_idx(struct rvu *rvu, u16 mcam_idx,
 			       struct npc_subbank **sb,
 			       int *sb_off);
+const u32 *npc_cn20k_search_order_get(bool *restricted_order, u32 *sz);
+int npc_cn20k_search_order_set(struct rvu *rvu, u64 narr[MAX_NUM_SUB_BANKS],
+			       int cnt);
 
 #endif /* NPC_CN20K_H */
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
index a42404e6db7c..aa3ecab5ebd8 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
@@ -1258,6 +1258,7 @@ enum rvu_af_dl_param_id {
 	RVU_AF_DEVLINK_PARAM_ID_NPC_EXACT_FEATURE_DISABLE,
 	RVU_AF_DEVLINK_PARAM_ID_NPC_DEF_RULE_CNTR_ENABLE,
 	RVU_AF_DEVLINK_PARAM_ID_NPC_DEFRAG,
+	RVU_AF_DEVLINK_PARAM_ID_NPC_SRCH_ORDER,
 	RVU_AF_DEVLINK_PARAM_ID_NIX_MAXLF,
 };
 
@@ -1619,12 +1620,83 @@ static int rvu_devlink_eswitch_mode_set(struct devlink *devlink, u16 mode,
 	return 0;
 }
 
+static int rvu_af_dl_npc_srch_order_set(struct devlink *devlink, u32 id,
+					struct devlink_param_gset_ctx *ctx,
+					struct netlink_ext_ack *extack)
+{
+	struct rvu_devlink *rvu_dl = devlink_priv(devlink);
+	struct rvu *rvu = rvu_dl->rvu;
+
+	return npc_cn20k_search_order_set(rvu,
+					  ctx->val.u64arr.val,
+					  ctx->val.u64arr.size);
+}
+
+static int rvu_af_dl_npc_srch_order_get(struct devlink *devlink, u32 id,
+					struct devlink_param_gset_ctx *ctx,
+					struct netlink_ext_ack *extack)
+{
+	bool restricted_order;
+	const u32 *order;
+	u32 sz;
+
+	order = npc_cn20k_search_order_get(&restricted_order, &sz);
+	ctx->val.u64arr.size = sz;
+	for (int i = 0; i < sz; i++)
+		ctx->val.u64arr.val[i] = order[i];
+
+	return 0;
+}
+
+static int rvu_af_dl_npc_srch_order_validate(struct devlink *devlink, u32 id,
+					     union devlink_param_value *val,
+					     struct netlink_ext_ack *extack)
+{
+	struct rvu_devlink *rvu_dl = devlink_priv(devlink);
+	struct rvu *rvu = rvu_dl->rvu;
+	bool restricted_order;
+	unsigned long w = 0;
+	u64 *arr;
+	u32 sz;
+
+	npc_cn20k_search_order_get(&restricted_order, &sz);
+	if (sz != val->u64arr.size) {
+		dev_err(rvu->dev,
+			"Wrong size %llu, should be %u\n",
+			val->u64arr.size, sz);
+		return -EINVAL;
+	}
+
+	arr = val->u64arr.val;
+	for (int i = 0; i < sz; i++) {
+		if (arr[i] >= sz)
+			return -EINVAL;
+
+		w |= BIT_ULL(arr[i]);
+	}
+
+	if (bitmap_weight(&w, sz) != sz) {
+		dev_err(rvu->dev,
+			"Duplicate or out-of-range subbank index. %lu\n",
+			find_first_zero_bit(&w, sz));
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 static const struct devlink_ops rvu_devlink_ops = {
 	.eswitch_mode_get = rvu_devlink_eswitch_mode_get,
 	.eswitch_mode_set = rvu_devlink_eswitch_mode_set,
 };
 
-static const struct devlink_param rvu_af_dl_param_defrag[] = {
+static const struct devlink_param rvu_af_dl_cn20k_params[] = {
+	DEVLINK_PARAM_DRIVER(RVU_AF_DEVLINK_PARAM_ID_NPC_SRCH_ORDER,
+			     "npc_srch_order", DEVLINK_PARAM_TYPE_U64_ARRAY,
+			     BIT(DEVLINK_PARAM_CMODE_RUNTIME),
+			     rvu_af_dl_npc_srch_order_get,
+			     rvu_af_dl_npc_srch_order_set,
+			     rvu_af_dl_npc_srch_order_validate),
 	DEVLINK_PARAM_DRIVER(RVU_AF_DEVLINK_PARAM_ID_NPC_DEFRAG,
 			     "npc_defrag", DEVLINK_PARAM_TYPE_STRING,
 			     BIT(DEVLINK_PARAM_CMODE_RUNTIME),
@@ -1666,13 +1738,13 @@ int rvu_register_dl(struct rvu *rvu)
 	}
 
 	if (is_cn20k(rvu->pdev)) {
-		err = devlink_params_register(dl, rvu_af_dl_param_defrag,
-					      ARRAY_SIZE(rvu_af_dl_param_defrag));
+		err = devlink_params_register(dl, rvu_af_dl_cn20k_params,
+					      ARRAY_SIZE(rvu_af_dl_cn20k_params));
 		if (err) {
 			dev_err(rvu->dev,
-				"devlink defrag params register failed with error %d",
+				"devlink cn20k params register failed with error %d",
 				err);
-			goto err_dl_defrag;
+			goto err_dl_cn20k_params;
 		}
 	}
 
@@ -1695,10 +1767,10 @@ int rvu_register_dl(struct rvu *rvu)
 
 err_dl_exact_match:
 	if (is_cn20k(rvu->pdev))
-		devlink_params_unregister(dl, rvu_af_dl_param_defrag,
-					  ARRAY_SIZE(rvu_af_dl_param_defrag));
+		devlink_params_unregister(dl, rvu_af_dl_cn20k_params,
+					  ARRAY_SIZE(rvu_af_dl_cn20k_params));
 
-err_dl_defrag:
+err_dl_cn20k_params:
 	devlink_params_unregister(dl, rvu_af_dl_params, ARRAY_SIZE(rvu_af_dl_params));
 
 err_dl_health:
@@ -1717,8 +1789,8 @@ void rvu_unregister_dl(struct rvu *rvu)
 	devlink_params_unregister(dl, rvu_af_dl_params, ARRAY_SIZE(rvu_af_dl_params));
 
 	if (is_cn20k(rvu->pdev))
-		devlink_params_unregister(dl, rvu_af_dl_param_defrag,
-					  ARRAY_SIZE(rvu_af_dl_param_defrag));
+		devlink_params_unregister(dl, rvu_af_dl_cn20k_params,
+					  ARRAY_SIZE(rvu_af_dl_cn20k_params));
 
 	/* Unregister exact match devlink only for CN10K-B */
 	if (rvu_npc_exact_has_match_table(rvu))
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v20 net-next 6/9] octeontx2: cn20k: Coordinate default rules with NIX LF lifecycle
  2026-06-09  4:04 [PATCH v20 net-next 0/9] octeontx2-af: npc: Enhancements Ratheesh Kannoth
                   ` (4 preceding siblings ...)
  2026-06-09  4:04 ` [PATCH v20 net-next 5/9] octeontx2-af: npc: cn20k: add subbank search order control Ratheesh Kannoth
@ 2026-06-09  4:04 ` Ratheesh Kannoth
  2026-06-10  4:30   ` Ratheesh Kannoth
  2026-06-10  4:31   ` Ratheesh Kannoth
  2026-06-09  4:04 ` [PATCH v20 net-next 7/9] octeontx2-af: npc: Support for custom KPU profile from filesystem Ratheesh Kannoth
                   ` (2 subsequent siblings)
  8 siblings, 2 replies; 20+ messages in thread
From: Ratheesh Kannoth @ 2026-06-09  4:04 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
	pabeni, sgoutham, Ratheesh Kannoth

Add NIX_LF_DONT_FREE_DFT_IDXS so the PF can send NIX LF free during hw
reinit or teardown without the AF freeing CN20K default NPC rule indexes
while the driver still owns that state (otx2_init_hw_resources and
otx2_free_hw_resources).

On CN20K, allocate default NPC rules from NIX LF alloc before
nix_interface_init, roll back with npc_cn20k_dft_rules_free on failure,
and free from NIX LF free when the new flag is not set. Tighten
rvu_mbox_handler_nix_lf_alloc error handling: use a single rc, propagate
qmem_alloc and other errors, and set -ENOMEM only when kcalloc fails
(remove the blanket -ENOMEM at the free_mem path).

Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
---
 .../net/ethernet/marvell/octeontx2/af/mbox.h  |  1 +
 .../ethernet/marvell/octeontx2/af/rvu_nix.c   | 77 +++++++++++++------
 .../ethernet/marvell/octeontx2/af/rvu_npc.c   | 20 +++--
 .../ethernet/marvell/octeontx2/nic/otx2_pf.c  |  6 +-
 4 files changed, 69 insertions(+), 35 deletions(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/af/mbox.h b/drivers/net/ethernet/marvell/octeontx2/af/mbox.h
index dc42c81c0942..e07fbf842b94 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/mbox.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/mbox.h
@@ -1009,6 +1009,7 @@ struct nix_lf_free_req {
 	struct mbox_msghdr hdr;
 #define NIX_LF_DISABLE_FLOWS		BIT_ULL(0)
 #define NIX_LF_DONT_FREE_TX_VTAG	BIT_ULL(1)
+#define NIX_LF_DONT_FREE_DFT_IDXS	BIT_ULL(2)
 	u64 flags;
 };
 
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
index f977734ae712..d8989395e875 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
@@ -16,6 +16,7 @@
 #include "cgx.h"
 #include "lmac_common.h"
 #include "rvu_npc_hash.h"
+#include "cn20k/npc.h"
 
 static void nix_free_tx_vtag_entries(struct rvu *rvu, u16 pcifunc);
 static int rvu_nix_get_bpid(struct rvu *rvu, struct nix_bp_cfg_req *req,
@@ -1499,9 +1500,11 @@ int rvu_mbox_handler_nix_lf_alloc(struct rvu *rvu,
 				  struct nix_lf_alloc_req *req,
 				  struct nix_lf_alloc_rsp *rsp)
 {
-	int nixlf, qints, hwctx_size, intf, err, rc = 0;
+	int nixlf, qints, hwctx_size, intf, rc = 0;
+	u16 bcast, mcast, promisc, ucast;
 	struct rvu_hwinfo *hw = rvu->hw;
 	u16 pcifunc = req->hdr.pcifunc;
+	bool rules_created = false;
 	struct rvu_block *block;
 	struct rvu_pfvf *pfvf;
 	u64 cfg, ctx_cfg;
@@ -1555,8 +1558,8 @@ int rvu_mbox_handler_nix_lf_alloc(struct rvu *rvu,
 		return NIX_AF_ERR_RSS_GRPS_INVALID;
 
 	/* Reset this NIX LF */
-	err = rvu_lf_reset(rvu, block, nixlf);
-	if (err) {
+	rc = rvu_lf_reset(rvu, block, nixlf);
+	if (rc) {
 		dev_err(rvu->dev, "Failed to reset NIX%d LF%d\n",
 			block->addr - BLKADDR_NIX0, nixlf);
 		return NIX_AF_ERR_LF_RESET;
@@ -1566,13 +1569,15 @@ int rvu_mbox_handler_nix_lf_alloc(struct rvu *rvu,
 
 	/* Alloc NIX RQ HW context memory and config the base */
 	hwctx_size = 1UL << ((ctx_cfg >> 4) & 0xF);
-	err = qmem_alloc(rvu->dev, &pfvf->rq_ctx, req->rq_cnt, hwctx_size);
-	if (err)
+	rc = qmem_alloc(rvu->dev, &pfvf->rq_ctx, req->rq_cnt, hwctx_size);
+	if (rc)
 		goto free_mem;
 
 	pfvf->rq_bmap = kcalloc(req->rq_cnt, sizeof(long), GFP_KERNEL);
-	if (!pfvf->rq_bmap)
+	if (!pfvf->rq_bmap) {
+		rc = -ENOMEM;
 		goto free_mem;
+	}
 
 	rvu_write64(rvu, blkaddr, NIX_AF_LFX_RQS_BASE(nixlf),
 		    (u64)pfvf->rq_ctx->iova);
@@ -1583,13 +1588,15 @@ int rvu_mbox_handler_nix_lf_alloc(struct rvu *rvu,
 
 	/* Alloc NIX SQ HW context memory and config the base */
 	hwctx_size = 1UL << (ctx_cfg & 0xF);
-	err = qmem_alloc(rvu->dev, &pfvf->sq_ctx, req->sq_cnt, hwctx_size);
-	if (err)
+	rc = qmem_alloc(rvu->dev, &pfvf->sq_ctx, req->sq_cnt, hwctx_size);
+	if (rc)
 		goto free_mem;
 
 	pfvf->sq_bmap = kcalloc(req->sq_cnt, sizeof(long), GFP_KERNEL);
-	if (!pfvf->sq_bmap)
+	if (!pfvf->sq_bmap) {
+		rc = -ENOMEM;
 		goto free_mem;
+	}
 
 	rvu_write64(rvu, blkaddr, NIX_AF_LFX_SQS_BASE(nixlf),
 		    (u64)pfvf->sq_ctx->iova);
@@ -1599,13 +1606,15 @@ int rvu_mbox_handler_nix_lf_alloc(struct rvu *rvu,
 
 	/* Alloc NIX CQ HW context memory and config the base */
 	hwctx_size = 1UL << ((ctx_cfg >> 8) & 0xF);
-	err = qmem_alloc(rvu->dev, &pfvf->cq_ctx, req->cq_cnt, hwctx_size);
-	if (err)
+	rc = qmem_alloc(rvu->dev, &pfvf->cq_ctx, req->cq_cnt, hwctx_size);
+	if (rc)
 		goto free_mem;
 
 	pfvf->cq_bmap = kcalloc(req->cq_cnt, sizeof(long), GFP_KERNEL);
-	if (!pfvf->cq_bmap)
+	if (!pfvf->cq_bmap) {
+		rc = -ENOMEM;
 		goto free_mem;
+	}
 
 	rvu_write64(rvu, blkaddr, NIX_AF_LFX_CQS_BASE(nixlf),
 		    (u64)pfvf->cq_ctx->iova);
@@ -1615,18 +1624,18 @@ int rvu_mbox_handler_nix_lf_alloc(struct rvu *rvu,
 
 	/* Initialize receive side scaling (RSS) */
 	hwctx_size = 1UL << ((ctx_cfg >> 12) & 0xF);
-	err = nixlf_rss_ctx_init(rvu, blkaddr, pfvf, nixlf, req->rss_sz,
-				 req->rss_grps, hwctx_size, req->way_mask,
-				 !!(req->flags & NIX_LF_RSS_TAG_LSB_AS_ADDER));
-	if (err)
+	rc = nixlf_rss_ctx_init(rvu, blkaddr, pfvf, nixlf, req->rss_sz,
+				req->rss_grps, hwctx_size, req->way_mask,
+				!!(req->flags & NIX_LF_RSS_TAG_LSB_AS_ADDER));
+	if (rc)
 		goto free_mem;
 
 	/* Alloc memory for CQINT's HW contexts */
 	cfg = rvu_read64(rvu, blkaddr, NIX_AF_CONST2);
 	qints = (cfg >> 24) & 0xFFF;
 	hwctx_size = 1UL << ((ctx_cfg >> 24) & 0xF);
-	err = qmem_alloc(rvu->dev, &pfvf->cq_ints_ctx, qints, hwctx_size);
-	if (err)
+	rc = qmem_alloc(rvu->dev, &pfvf->cq_ints_ctx, qints, hwctx_size);
+	if (rc)
 		goto free_mem;
 
 	rvu_write64(rvu, blkaddr, NIX_AF_LFX_CINTS_BASE(nixlf),
@@ -1639,8 +1648,8 @@ int rvu_mbox_handler_nix_lf_alloc(struct rvu *rvu,
 	cfg = rvu_read64(rvu, blkaddr, NIX_AF_CONST2);
 	qints = (cfg >> 12) & 0xFFF;
 	hwctx_size = 1UL << ((ctx_cfg >> 20) & 0xF);
-	err = qmem_alloc(rvu->dev, &pfvf->nix_qints_ctx, qints, hwctx_size);
-	if (err)
+	rc = qmem_alloc(rvu->dev, &pfvf->nix_qints_ctx, qints, hwctx_size);
+	if (rc)
 		goto free_mem;
 
 	rvu_write64(rvu, blkaddr, NIX_AF_LFX_QINTS_BASE(nixlf),
@@ -1684,10 +1693,22 @@ int rvu_mbox_handler_nix_lf_alloc(struct rvu *rvu,
 	if (is_sdp_pfvf(rvu, pcifunc))
 		intf = NIX_INTF_TYPE_SDP;
 
-	err = nix_interface_init(rvu, pcifunc, intf, nixlf, rsp,
-				 !!(req->flags & NIX_LF_LBK_BLK_SEL));
-	if (err)
-		goto free_mem;
+	if (is_cn20k(rvu->pdev)) {
+		rc = npc_cn20k_dft_rules_idx_get(rvu, pcifunc, &bcast, &mcast,
+						 &promisc, &ucast);
+		if (rc) {
+			rc = npc_cn20k_dft_rules_alloc(rvu, pcifunc);
+			if (rc)
+				goto free_mem;
+
+			rules_created = true;
+		}
+	}
+
+	rc = nix_interface_init(rvu, pcifunc, intf, nixlf, rsp,
+				!!(req->flags & NIX_LF_LBK_BLK_SEL));
+	if (rc)
+		goto free_dft;
 
 	/* Disable NPC entries as NIXLF's contexts are not initialized yet */
 	rvu_npc_disable_default_entries(rvu, pcifunc, nixlf);
@@ -1699,9 +1720,12 @@ int rvu_mbox_handler_nix_lf_alloc(struct rvu *rvu,
 
 	goto exit;
 
+free_dft:
+	if (is_cn20k(rvu->pdev) && rules_created)
+		npc_cn20k_dft_rules_free(rvu, pcifunc);
+
 free_mem:
 	nix_ctx_free(rvu, pfvf);
-	rc = -ENOMEM;
 
 exit:
 	/* Set macaddr of this PF/VF */
@@ -1775,6 +1799,9 @@ int rvu_mbox_handler_nix_lf_free(struct rvu *rvu, struct nix_lf_free_req *req,
 
 	nix_ctx_free(rvu, pfvf);
 
+	if (is_cn20k(rvu->pdev) && !(req->flags & NIX_LF_DONT_FREE_DFT_IDXS))
+		npc_cn20k_dft_rules_free(rvu, pcifunc);
+
 	return 0;
 }
 
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
index d301a3f0f87a..150d50b72c48 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
@@ -1285,11 +1285,18 @@ void npc_enadis_default_mce_entry(struct rvu *rvu, u16 pcifunc,
 	struct nix_mce_list *mce_list;
 	int index, blkaddr, mce_idx;
 	struct rvu_pfvf *pfvf;
+	u16 ptr[4];
 
 	/* multicast pkt replication is not enabled for AF's VFs & SDP links */
 	if (is_lbk_vf(rvu, pcifunc) || is_sdp_pfvf(rvu, pcifunc))
 		return;
 
+	/* In cn20k, only CGX mapped devices have default MCAST entry */
+	if (is_cn20k(rvu->pdev) &&
+	    npc_cn20k_dft_rules_idx_get(rvu, pcifunc, &ptr[0], &ptr[1],
+					&ptr[2], &ptr[3]))
+		return;
+
 	blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPC, 0);
 	if (blkaddr < 0)
 		return;
@@ -1329,9 +1336,12 @@ static void npc_enadis_default_entries(struct rvu *rvu, u16 pcifunc,
 	struct rvu_pfvf *pfvf = rvu_get_pfvf(rvu, pcifunc);
 	struct npc_mcam *mcam = &rvu->hw->mcam;
 	int index, blkaddr;
+	u16 ptr[4];
 
 	/* only CGX or LBK interfaces have default entries */
-	if (is_cn20k(rvu->pdev) && !npc_is_cgx_or_lbk(rvu, pcifunc))
+	if (is_cn20k(rvu->pdev) &&
+	    npc_cn20k_dft_rules_idx_get(rvu, pcifunc, &ptr[0], &ptr[1],
+					&ptr[2], &ptr[3]))
 		return;
 
 	blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPC, 0);
@@ -4085,12 +4095,10 @@ void rvu_npc_clear_ucast_entry(struct rvu *rvu, int pcifunc, int nixlf)
 
 	ucast_idx = npc_get_nixlf_mcam_index(mcam, pcifunc,
 					     nixlf, NIXLF_UCAST_ENTRY);
-	if (ucast_idx < 0) {
-		dev_err(rvu->dev,
-			"%s: Error to get ucast entry for pcifunc=%#x\n",
-			__func__, pcifunc);
+
+	/* In cn20k, default rules are freed before detach rsrc */
+	if (ucast_idx < 0)
 		return;
-	}
 
 	npc_enable_mcam_entry(rvu, mcam, blkaddr, ucast_idx, false);
 
diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
index f9fbf0c17648..b4538edb13f8 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
@@ -1053,7 +1053,6 @@ irqreturn_t otx2_pfaf_mbox_intr_handler(int irq, void *pf_irq)
 	/* Clear the IRQ */
 	otx2_write64(pf, RVU_PF_INT, BIT_ULL(0));
 
-
 	mbox_data = otx2_read64(pf, RVU_PF_PFAF_MBOX0);
 
 	if (mbox_data & MBOX_UP_MSG) {
@@ -1729,7 +1728,7 @@ int otx2_init_hw_resources(struct otx2_nic *pf)
 	mutex_lock(&mbox->lock);
 	free_req = otx2_mbox_alloc_msg_nix_lf_free(mbox);
 	if (free_req) {
-		free_req->flags = NIX_LF_DISABLE_FLOWS;
+		free_req->flags = NIX_LF_DISABLE_FLOWS | NIX_LF_DONT_FREE_DFT_IDXS;
 		if (otx2_sync_mbox_msg(mbox))
 			dev_err(pf->dev, "%s failed to free nixlf\n", __func__);
 	}
@@ -1803,7 +1802,7 @@ void otx2_free_hw_resources(struct otx2_nic *pf)
 	/* Reset NIX LF */
 	free_req = otx2_mbox_alloc_msg_nix_lf_free(mbox);
 	if (free_req) {
-		free_req->flags = NIX_LF_DISABLE_FLOWS;
+		free_req->flags = NIX_LF_DISABLE_FLOWS | NIX_LF_DONT_FREE_DFT_IDXS;
 		if (!(pf->flags & OTX2_FLAG_PF_SHUTDOWN))
 			free_req->flags |= NIX_LF_DONT_FREE_TX_VTAG;
 		if (otx2_sync_mbox_msg(mbox))
@@ -1926,7 +1925,6 @@ int otx2_alloc_queue_mem(struct otx2_nic *pf)
 	struct otx2_qset *qset = &pf->qset;
 	struct otx2_cq_poll *cq_poll;
 
-
 	/* RQ and SQs are mapped to different CQs,
 	 * so find out max CQ IRQs (i.e CINTs) needed.
 	 */
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v20 net-next 7/9] octeontx2-af: npc: Support for custom KPU profile from filesystem
  2026-06-09  4:04 [PATCH v20 net-next 0/9] octeontx2-af: npc: Enhancements Ratheesh Kannoth
                   ` (5 preceding siblings ...)
  2026-06-09  4:04 ` [PATCH v20 net-next 6/9] octeontx2: cn20k: Coordinate default rules with NIX LF lifecycle Ratheesh Kannoth
@ 2026-06-09  4:04 ` Ratheesh Kannoth
  2026-06-10  5:24   ` Ratheesh Kannoth
  2026-06-09  4:04 ` [PATCH v20 net-next 8/9] octeontx2: cn20k: Respect NPC MCAM X2/X4 profile in flows and DFT alloc Ratheesh Kannoth
  2026-06-09  4:04 ` [PATCH v20 net-next 9/9] octeontx2-af: npc: cn20k: Allocate npc_priv and dstats dynamically Ratheesh Kannoth
  8 siblings, 1 reply; 20+ messages in thread
From: Ratheesh Kannoth @ 2026-06-09  4:04 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
	pabeni, sgoutham, Ratheesh Kannoth

Flashing updated firmware on deployed devices is cumbersome. Provide a
mechanism to load a custom KPU (Key Parse Unit) profile directly from
the filesystem at module load time.

When the rvu_af module is loaded with the kpu_profile parameter, the
specified profile is read from /lib/firmware/kpu and programmed into
the KPU registers. Add npc_kpu_profile_cam2 for the extended cam format
used by filesystem-loaded profiles and support ptype/ptype_mask in
npc_config_kpucam when profile->from_fs is set.

Usage:
  1. Copy the KPU profile file to /lib/firmware/kpu.
  2. Build OCTEONTX2_AF as a module.
  3. Load: insmod rvu_af.ko kpu_profile=<profile_name>

Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
---
 .../ethernet/marvell/octeontx2/af/cn20k/npc.c |  57 ++-
 .../net/ethernet/marvell/octeontx2/af/npc.h   |  17 +
 .../net/ethernet/marvell/octeontx2/af/rvu.h   |  12 +-
 .../ethernet/marvell/octeontx2/af/rvu_npc.c   | 466 ++++++++++++++----
 .../ethernet/marvell/octeontx2/af/rvu_npc.h   |  17 +
 .../ethernet/marvell/octeontx2/af/rvu_reg.h   |   1 +
 6 files changed, 459 insertions(+), 111 deletions(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
index 5bc2f07b5525..d2939726eb8d 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
@@ -521,13 +521,17 @@ npc_program_single_kpm_profile(struct rvu *rvu, int blkaddr,
 			       int kpm, int start_entry,
 			       const struct npc_kpu_profile *profile)
 {
+	int num_cam_entries, num_action_entries;
 	int entry, num_entries, max_entries;
 	u64 idx;
 
-	if (profile->cam_entries != profile->action_entries) {
+	num_cam_entries = npc_get_num_kpu_cam_entries(rvu, profile);
+	num_action_entries = npc_get_num_kpu_action_entries(rvu, profile);
+
+	if (num_cam_entries != num_action_entries) {
 		dev_err(rvu->dev,
 			"kpm%d: CAM and action entries [%d != %d] not equal\n",
-			kpm, profile->cam_entries, profile->action_entries);
+			kpm, num_cam_entries, num_action_entries);
 
 		WARN(1, "Fatal error\n");
 		return;
@@ -536,16 +540,18 @@ npc_program_single_kpm_profile(struct rvu *rvu, int blkaddr,
 	max_entries = rvu->hw->npc_kpu_entries / 2;
 	entry = start_entry;
 	/* Program CAM match entries for previous kpm extracted data */
-	num_entries = min_t(int, profile->cam_entries, max_entries);
+	num_entries = min_t(int, num_cam_entries, max_entries);
 	for (idx = 0; entry < num_entries + start_entry; entry++, idx++)
-		npc_config_kpmcam(rvu, blkaddr, &profile->cam[idx],
+		npc_config_kpmcam(rvu, blkaddr,
+				  npc_get_kpu_cam_nth_entry(rvu, profile, idx),
 				  kpm, entry);
 
 	entry = start_entry;
 	/* Program this kpm's actions */
-	num_entries = min_t(int, profile->action_entries, max_entries);
+	num_entries = min_t(int, num_action_entries, max_entries);
 	for (idx = 0; entry < num_entries + start_entry; entry++, idx++)
-		npc_config_kpmaction(rvu, blkaddr, &profile->action[idx],
+		npc_config_kpmaction(rvu, blkaddr,
+				     npc_get_kpu_action_nth_entry(rvu, profile, idx),
 				     kpm, entry, false);
 }
 
@@ -611,20 +617,23 @@ npc_enable_kpm_entry(struct rvu *rvu, int blkaddr, int kpm, int num_entries)
 static void npc_program_kpm_profile(struct rvu *rvu, int blkaddr, int num_kpms)
 {
 	const struct npc_kpu_profile *profile1, *profile2;
+	int pfl1_num_cam_entries, pfl2_num_cam_entries;
 	int idx, total_cam_entries;
 
 	for (idx = 0; idx < num_kpms; idx++) {
 		profile1 = &rvu->kpu.kpu[idx];
+		pfl1_num_cam_entries = npc_get_num_kpu_cam_entries(rvu, profile1);
 		npc_program_single_kpm_profile(rvu, blkaddr, idx, 0, profile1);
 		profile2 = &rvu->kpu.kpu[idx + KPU_OFFSET];
+		pfl2_num_cam_entries = npc_get_num_kpu_cam_entries(rvu, profile2);
+
 		npc_program_single_kpm_profile(rvu, blkaddr, idx,
-					       profile1->cam_entries,
+					       pfl1_num_cam_entries,
 					       profile2);
-		total_cam_entries = profile1->cam_entries +
-			profile2->cam_entries;
+		total_cam_entries = pfl1_num_cam_entries + pfl2_num_cam_entries;
 		npc_enable_kpm_entry(rvu, blkaddr, idx, total_cam_entries);
 		rvu_write64(rvu, blkaddr, NPC_AF_KPMX_PASS2_OFFSET(idx),
-			    profile1->cam_entries);
+			    pfl1_num_cam_entries);
 		/* Enable the KPUs associated with this KPM */
 		rvu_write64(rvu, blkaddr, NPC_AF_KPUX_CFG(idx), 0x01);
 		rvu_write64(rvu, blkaddr, NPC_AF_KPUX_CFG(idx + KPU_OFFSET),
@@ -634,6 +643,7 @@ static void npc_program_kpm_profile(struct rvu *rvu, int blkaddr, int num_kpms)
 
 void npc_cn20k_parser_profile_init(struct rvu *rvu, int blkaddr)
 {
+	struct npc_kpu_profile_action *act;
 	struct rvu_hwinfo *hw = rvu->hw;
 	int num_pkinds, idx;
 
@@ -665,9 +675,15 @@ void npc_cn20k_parser_profile_init(struct rvu *rvu, int blkaddr)
 	num_pkinds = rvu->kpu.pkinds;
 	num_pkinds = min_t(int, hw->npc_pkinds, num_pkinds);
 
-	for (idx = 0; idx < num_pkinds; idx++)
-		npc_config_kpmaction(rvu, blkaddr, &rvu->kpu.ikpu[idx],
+	/* Cn20k does not support Custom profile from filesystem */
+	for (idx = 0; idx < num_pkinds; idx++) {
+		act = npc_get_ikpu_nth_entry(rvu, idx);
+		if (!act)
+			continue;
+
+		npc_config_kpmaction(rvu, blkaddr, act,
 				     0, idx, true);
+	}
 
 	/* Program KPM CAM and Action profiles */
 	npc_program_kpm_profile(rvu, blkaddr, hw->npc_kpms);
@@ -679,7 +695,7 @@ struct npc_priv_t *npc_priv_get(void)
 }
 
 static void npc_program_mkex_rx(struct rvu *rvu, int blkaddr,
-				struct npc_mcam_kex_extr *mkex_extr,
+				const struct npc_mcam_kex_extr *mkex_extr,
 				u8 intf)
 {
 	u8 num_extr = rvu->hw->npc_kex_extr;
@@ -708,7 +724,7 @@ static void npc_program_mkex_rx(struct rvu *rvu, int blkaddr,
 }
 
 static void npc_program_mkex_tx(struct rvu *rvu, int blkaddr,
-				struct npc_mcam_kex_extr *mkex_extr,
+				const struct npc_mcam_kex_extr *mkex_extr,
 				u8 intf)
 {
 	u8 num_extr = rvu->hw->npc_kex_extr;
@@ -737,7 +753,7 @@ static void npc_program_mkex_tx(struct rvu *rvu, int blkaddr,
 }
 
 static void npc_program_mkex_profile(struct rvu *rvu, int blkaddr,
-				     struct npc_mcam_kex_extr *mkex_extr)
+				     const struct npc_mcam_kex_extr *mkex_extr)
 {
 	struct rvu_hwinfo *hw = rvu->hw;
 	u8 intf;
@@ -1630,8 +1646,8 @@ npc_cn20k_update_action_entries_n_flags(struct rvu *rvu,
 int npc_cn20k_apply_custom_kpu(struct rvu *rvu,
 			       struct npc_kpu_profile_adapter *profile)
 {
+	const struct npc_cn20k_kpu_profile_fwdata *fw = rvu->kpu_fwdata;
 	size_t hdr_sz = sizeof(struct npc_cn20k_kpu_profile_fwdata);
-	struct npc_cn20k_kpu_profile_fwdata *fw = rvu->kpu_fwdata;
 	struct npc_kpu_profile_action *action;
 	struct npc_kpu_profile_cam *cam;
 	struct npc_kpu_fwdata *fw_kpu;
@@ -1676,8 +1692,15 @@ int npc_cn20k_apply_custom_kpu(struct rvu *rvu,
 	}
 
 	/* Verify if profile fits the HW */
+	if (fw->kpus > rvu->hw->npc_kpus) {
+		dev_warn(rvu->dev, "Not enough KPUs: %d > %d\n", fw->kpus,
+			 rvu->hw->npc_kpus);
+		return -EINVAL;
+	}
+
+	/* Check if there is enough memory */
 	if (fw->kpus > profile->kpus) {
-		dev_warn(rvu->dev, "Not enough KPUs: %d > %ld\n", fw->kpus,
+		dev_warn(rvu->dev, "Not enough KPUs: %d > %zu\n", fw->kpus,
 			 profile->kpus);
 		return -EINVAL;
 	}
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/npc.h b/drivers/net/ethernet/marvell/octeontx2/af/npc.h
index 2138c044fe41..eaed172f1606 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/npc.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/npc.h
@@ -267,6 +267,19 @@ struct npc_kpu_profile_cam {
 	u16 dp2_mask;
 } __packed;
 
+struct npc_kpu_profile_cam2 {
+	u8 state;
+	u8 state_mask;
+	u16 dp0;
+	u16 dp0_mask;
+	u16 dp1;
+	u16 dp1_mask;
+	u16 dp2;
+	u16 dp2_mask;
+	u8 ptype;
+	u8 ptype_mask;
+} __packed;
+
 struct npc_kpu_profile_action {
 	u8 errlev;
 	u8 errcode;
@@ -292,6 +305,10 @@ struct npc_kpu_profile {
 	int action_entries;
 	struct npc_kpu_profile_cam *cam;
 	struct npc_kpu_profile_action *action;
+	int cam_entries2;
+	int action_entries2;
+	struct npc_kpu_profile_action *action2;
+	struct npc_kpu_profile_cam2 *cam2;
 };
 
 /* NPC KPU register formats */
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu.h b/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
index 65397daae4c2..7f3505ae6860 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
@@ -553,17 +553,19 @@ struct npc_kpu_profile_adapter {
 	const char			*name;
 	u64				version;
 	const struct npc_lt_def_cfg	*lt_def;
-	const struct npc_kpu_profile_action	*ikpu; /* array[pkinds] */
-	const struct npc_kpu_profile	*kpu; /* array[kpus] */
+	struct npc_kpu_profile_action	*ikpu; /* array[pkinds] */
+	struct npc_kpu_profile_action	*ikpu2; /* array[pkinds] */
+	struct npc_kpu_profile	*kpu; /* array[kpus] */
 	union npc_mcam_key_prfl {
-		struct npc_mcam_kex		*mkex;
+		const struct npc_mcam_kex		*mkex;
 					/* used for cn9k and cn10k */
-		struct npc_mcam_kex_extr	*mkex_extr; /* used for cn20k */
+		const struct npc_mcam_kex_extr	*mkex_extr; /* used for cn20k */
 	} mcam_kex_prfl;
 	struct npc_mcam_kex_hash	*mkex_hash;
 	bool				custom;
 	size_t				pkinds;
 	size_t				kpus;
+	bool				from_fs;
 };
 
 #define RVU_SWITCH_LBK_CHAN	63
@@ -634,7 +636,7 @@ struct rvu {
 
 	/* Firmware data */
 	struct rvu_fwdata	*fwdata;
-	void			*kpu_fwdata;
+	const void		*kpu_fwdata;
 	size_t			kpu_fwdata_sz;
 	void __iomem		*kpu_prfl_addr;
 
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
index 150d50b72c48..b4635d78f9d5 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
@@ -1495,7 +1495,8 @@ void rvu_npc_free_mcam_entries(struct rvu *rvu, u16 pcifunc, int nixlf)
 }
 
 static void npc_program_mkex_rx(struct rvu *rvu, int blkaddr,
-				struct npc_mcam_kex *mkex, u8 intf)
+				const struct npc_mcam_kex *mkex,
+				u8 intf)
 {
 	int lid, lt, ld, fl;
 
@@ -1524,7 +1525,8 @@ static void npc_program_mkex_rx(struct rvu *rvu, int blkaddr,
 }
 
 static void npc_program_mkex_tx(struct rvu *rvu, int blkaddr,
-				struct npc_mcam_kex *mkex, u8 intf)
+				const struct npc_mcam_kex *mkex,
+				u8 intf)
 {
 	int lid, lt, ld, fl;
 
@@ -1553,7 +1555,7 @@ static void npc_program_mkex_tx(struct rvu *rvu, int blkaddr,
 }
 
 static void npc_program_mkex_profile(struct rvu *rvu, int blkaddr,
-				     struct npc_mcam_kex *mkex)
+				     const struct npc_mcam_kex *mkex)
 {
 	struct rvu_hwinfo *hw = rvu->hw;
 	u8 intf;
@@ -1693,8 +1695,12 @@ static void npc_config_kpucam(struct rvu *rvu, int blkaddr,
 			      const struct npc_kpu_profile_cam *kpucam,
 			      int kpu, int entry)
 {
+	const struct npc_kpu_profile_cam2 *kpucam2 = (void *)kpucam;
+	struct npc_kpu_profile_adapter *profile = &rvu->kpu;
 	struct npc_kpu_cam cam0 = {0};
 	struct npc_kpu_cam cam1 = {0};
+	u64 *val = (u64 *)&cam1;
+	u64 *mask = (u64 *)&cam0;
 
 	cam1.state = kpucam->state & kpucam->state_mask;
 	cam1.dp0_data = kpucam->dp0 & kpucam->dp0_mask;
@@ -1706,6 +1712,14 @@ static void npc_config_kpucam(struct rvu *rvu, int blkaddr,
 	cam0.dp1_data = ~kpucam->dp1 & kpucam->dp1_mask;
 	cam0.dp2_data = ~kpucam->dp2 & kpucam->dp2_mask;
 
+	if (profile->from_fs) {
+		u8 ptype = kpucam2->ptype;
+		u8 pmask = kpucam2->ptype_mask;
+
+		*val |= FIELD_PREP(GENMASK_ULL(57, 56), ptype & pmask);
+		*mask |= FIELD_PREP(GENMASK_ULL(57, 56), ~ptype & pmask);
+	}
+
 	rvu_write64(rvu, blkaddr,
 		    NPC_AF_KPUX_ENTRYX_CAMX(kpu, entry, 0), *(u64 *)&cam0);
 	rvu_write64(rvu, blkaddr,
@@ -1717,34 +1731,104 @@ u64 npc_enable_mask(int count)
 	return (((count) < 64) ? ~(BIT_ULL(count) - 1) : (0x00ULL));
 }
 
+struct npc_kpu_profile_action *
+npc_get_ikpu_nth_entry(struct rvu *rvu, int n)
+{
+	struct npc_kpu_profile_adapter *profile = &rvu->kpu;
+
+	if (profile->from_fs)
+		return &profile->ikpu2[n];
+
+	return &profile->ikpu[n];
+}
+
+int
+npc_get_num_kpu_cam_entries(struct rvu *rvu,
+			    const struct npc_kpu_profile *kpu_pfl)
+{
+	struct npc_kpu_profile_adapter *profile = &rvu->kpu;
+
+	if (profile->from_fs)
+		return kpu_pfl->cam_entries2;
+
+	return kpu_pfl->cam_entries;
+}
+
+struct npc_kpu_profile_cam *
+npc_get_kpu_cam_nth_entry(struct rvu *rvu,
+			  const struct npc_kpu_profile *kpu_pfl, int n)
+{
+	struct npc_kpu_profile_adapter *profile = &rvu->kpu;
+
+	if (profile->from_fs)
+		return (void *)&kpu_pfl->cam2[n];
+
+	return (void *)&kpu_pfl->cam[n];
+}
+
+int
+npc_get_num_kpu_action_entries(struct rvu *rvu,
+			       const struct npc_kpu_profile *kpu_pfl)
+{
+	struct npc_kpu_profile_adapter *profile = &rvu->kpu;
+
+	if (profile->from_fs)
+		return kpu_pfl->action_entries2;
+
+	return kpu_pfl->action_entries;
+}
+
+struct npc_kpu_profile_action *
+npc_get_kpu_action_nth_entry(struct rvu *rvu,
+			     const struct npc_kpu_profile *kpu_pfl,
+			     int n)
+{
+	struct npc_kpu_profile_adapter *profile = &rvu->kpu;
+
+	if (profile->from_fs)
+		return (void *)&kpu_pfl->action2[n];
+
+	return (void *)&kpu_pfl->action[n];
+}
+
 static void npc_program_kpu_profile(struct rvu *rvu, int blkaddr, int kpu,
 				    const struct npc_kpu_profile *profile)
 {
+	int num_cam_entries, num_action_entries;
 	int entry, num_entries, max_entries;
 	u64 entry_mask;
 
-	if (profile->cam_entries != profile->action_entries) {
+	num_cam_entries = npc_get_num_kpu_cam_entries(rvu, profile);
+	num_action_entries = npc_get_num_kpu_action_entries(rvu, profile);
+
+	if (num_cam_entries != num_action_entries) {
 		dev_err(rvu->dev,
 			"KPU%d: CAM and action entries [%d != %d] not equal\n",
-			kpu, profile->cam_entries, profile->action_entries);
+			kpu, num_cam_entries, num_action_entries);
 	}
 
 	max_entries = rvu->hw->npc_kpu_entries;
 
+	WARN(num_cam_entries > max_entries,
+	     "KPU%u: err: hw max entries=%u, input entries=%u\n",
+	     kpu,  rvu->hw->npc_kpu_entries, num_cam_entries);
+
 	/* Program CAM match entries for previous KPU extracted data */
-	num_entries = min_t(int, profile->cam_entries, max_entries);
+	num_entries = min_t(int, num_cam_entries, max_entries);
 	for (entry = 0; entry < num_entries; entry++)
 		npc_config_kpucam(rvu, blkaddr,
-				  &profile->cam[entry], kpu, entry);
+				  (void *)npc_get_kpu_cam_nth_entry(rvu, profile, entry),
+				  kpu, entry);
 
 	/* Program this KPU's actions */
-	num_entries = min_t(int, profile->action_entries, max_entries);
+	num_entries = min_t(int, num_action_entries, max_entries);
 	for (entry = 0; entry < num_entries; entry++)
-		npc_config_kpuaction(rvu, blkaddr, &profile->action[entry],
+		npc_config_kpuaction(rvu, blkaddr,
+				     (void *)npc_get_kpu_action_nth_entry(rvu, profile, entry),
 				     kpu, entry, false);
 
 	/* Enable all programmed entries */
-	num_entries = min_t(int, profile->action_entries, profile->cam_entries);
+	num_entries = min_t(int, num_action_entries, num_cam_entries);
 	entry_mask = npc_enable_mask(num_entries);
 	/* Disable first KPU_MAX_CST_ENT entries for built-in profile */
 	if (!rvu->kpu.custom)
@@ -1788,26 +1872,175 @@ static void npc_prepare_default_kpu(struct rvu *rvu,
 	npc_cn20k_update_action_entries_n_flags(rvu, profile);
 }
 
-static int npc_apply_custom_kpu(struct rvu *rvu,
-				struct npc_kpu_profile_adapter *profile)
+static int npc_alloc_kpu_cam2_n_action2(struct rvu *rvu, int kpu_num,
+					int num_entries)
+{
+	struct npc_kpu_profile_adapter *adapter = &rvu->kpu;
+	struct npc_kpu_profile *kpu;
+
+	kpu = &adapter->kpu[kpu_num];
+
+	kpu->cam2 = devm_kcalloc(rvu->dev, num_entries,
+				 sizeof(*kpu->cam2), GFP_KERNEL);
+	if (!kpu->cam2)
+		return -ENOMEM;
+
+	kpu->action2 = devm_kcalloc(rvu->dev, num_entries,
+				    sizeof(*kpu->action2), GFP_KERNEL);
+	if (!kpu->action2)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static int npc_apply_custom_kpu_from_fw(struct rvu *rvu,
+					struct npc_kpu_profile_adapter *profile)
 {
 	size_t hdr_sz = sizeof(struct npc_kpu_profile_fwdata), offset = 0;
+	const struct npc_kpu_profile_fwdata *fw;
 	struct npc_kpu_profile_action *action;
-	struct npc_kpu_profile_fwdata *fw;
 	struct npc_kpu_profile_cam *cam;
 	struct npc_kpu_fwdata *fw_kpu;
-	int entries;
-	u16 kpu, entry;
+	int entries, entry, kpu;
 
-	if (is_cn20k(rvu->pdev))
-		return npc_cn20k_apply_custom_kpu(rvu, profile);
+	fw = rvu->kpu_fwdata;
+
+	for (kpu = 0; kpu < fw->kpus; kpu++) {
+		if (rvu->kpu_fwdata_sz < hdr_sz + offset) {
+			dev_warn(rvu->dev,
+				 "Profile size mismatch on KPU%i parsing\n",
+				 kpu + 1);
+			return -EINVAL;
+		}
+
+		fw_kpu = (struct npc_kpu_fwdata *)(fw->data + offset);
+		if (fw_kpu->entries < 0) {
+			dev_warn(rvu->dev,
+				 "Profile entries is negative on KPU%i parsing\n",
+				 kpu + 1);
+			return -EINVAL;
+		}
+
+		if (fw_kpu->entries > KPU_MAX_CST_ENT)
+			dev_warn(rvu->dev,
+				 "Too many custom entries on KPU%d: %d > %d\n",
+				 kpu, fw_kpu->entries, KPU_MAX_CST_ENT);
+		entries = min_t(int, fw_kpu->entries, KPU_MAX_CST_ENT);
+		cam = (struct npc_kpu_profile_cam *)fw_kpu->data;
+		offset += sizeof(*fw_kpu) + fw_kpu->entries * sizeof(*cam);
+		action = (struct npc_kpu_profile_action *)(fw->data + offset);
+		offset += fw_kpu->entries * sizeof(*action);
+		if (rvu->kpu_fwdata_sz < hdr_sz + offset) {
+			dev_warn(rvu->dev,
+				 "Profile size mismatch on KPU%i parsing.\n",
+				 kpu + 1);
+			return -EINVAL;
+		}
+		for (entry = 0; entry < entries; entry++) {
+			profile->kpu[kpu].cam[entry] = cam[entry];
+			profile->kpu[kpu].action[entry] = action[entry];
+		}
+	}
+
+	return 0;
+}
+
+static int npc_apply_custom_kpu_from_fs(struct rvu *rvu,
+					struct npc_kpu_profile_adapter *profile)
+{
+	size_t hdr_sz = sizeof(struct npc_kpu_profile_fwdata), offset = 0;
+	const struct npc_kpu_profile_fwdata *fw;
+	struct npc_kpu_profile_action *action;
+	struct npc_kpu_profile_cam2 *cam2;
+	struct npc_kpu_fwdata *fw_kpu;
+	int entries, ret, entry, kpu;
 
 	fw = rvu->kpu_fwdata;
 
+	/* Binary blob contains ikpu actions entries at start of data[0] */
+	profile->ikpu2 = devm_kcalloc(rvu->dev, 1,
+				      sizeof(ikpu_action_entries),
+				      GFP_KERNEL);
+	if (!profile->ikpu2)
+		return -ENOMEM;
+
+	action = (struct npc_kpu_profile_action *)(fw->data + offset);
+
+	if (rvu->kpu_fwdata_sz < hdr_sz + sizeof(ikpu_action_entries))
+		return -EINVAL;
+
+	/* The firmware layout does dependent on the internal size of
+	 * ikpu_action_entries.
+	 */
+	memcpy((void *)profile->ikpu2, action, sizeof(ikpu_action_entries));
+	offset += sizeof(ikpu_action_entries);
+
+	for (kpu = 0; kpu < fw->kpus; kpu++) {
+		if (rvu->kpu_fwdata_sz < hdr_sz + offset + sizeof(*fw_kpu)) {
+			dev_warn(rvu->dev,
+				 "profile size mismatch on kpu%i parsing\n",
+				 kpu + 1);
+			return -EINVAL;
+		}
+
+		fw_kpu = (struct npc_kpu_fwdata *)(fw->data + offset);
+		if (fw_kpu->entries <= 0) {
+			dev_warn(rvu->dev,
+				 "Invalid kpu entries on KPU%d\n", kpu);
+			return -EINVAL;
+		}
+
+		entries = min_t(int, fw_kpu->entries, rvu->hw->npc_kpu_entries);
+		dev_info(rvu->dev,
+			 "Loading %u entries on KPU%d\n", entries, kpu);
+
+		cam2 = (struct npc_kpu_profile_cam2 *)fw_kpu->data;
+		offset += sizeof(*fw_kpu) + fw_kpu->entries * sizeof(*cam2);
+		action = (struct npc_kpu_profile_action *)(fw->data + offset);
+		offset += fw_kpu->entries * sizeof(*action);
+		if (rvu->kpu_fwdata_sz < hdr_sz + offset) {
+			dev_warn(rvu->dev,
+				 "profile size mismatch on kpu%i parsing.\n",
+				 kpu + 1);
+			return -EINVAL;
+		}
+
+		profile->kpu[kpu].cam_entries2 = entries;
+		profile->kpu[kpu].action_entries2 = entries;
+		ret = npc_alloc_kpu_cam2_n_action2(rvu, kpu, entries);
+		if (ret) {
+			dev_warn(rvu->dev,
+				 "profile entry allocation failed for kpu=%d for %d entries\n",
+				 kpu, entries);
+			return -EINVAL;
+		}
+
+		for (entry = 0; entry < entries; entry++) {
+			profile->kpu[kpu].cam2[entry] = cam2[entry];
+			profile->kpu[kpu].action2[entry] = action[entry];
+		}
+	}
+
+	return 0;
+}
+
+static int npc_apply_custom_kpu(struct rvu *rvu,
+				struct npc_kpu_profile_adapter *profile,
+				bool from_fs, int *fw_kpus)
+{
+	size_t hdr_sz = sizeof(struct npc_kpu_profile_fwdata);
+	const struct npc_kpu_profile_fwdata *fw;
+	struct npc_kpu_profile_fwdata *sfw;
+
+	if (is_cn20k(rvu->pdev))
+		return npc_cn20k_apply_custom_kpu(rvu, profile);
+
 	if (rvu->kpu_fwdata_sz < hdr_sz) {
 		dev_warn(rvu->dev, "Invalid KPU profile size\n");
 		return -EINVAL;
 	}
+
+	fw = rvu->kpu_fwdata;
 	if (le64_to_cpu(fw->signature) != KPU_SIGN) {
 		dev_warn(rvu->dev, "Invalid KPU profile signature %llx\n",
 			 fw->signature);
@@ -1835,42 +2068,38 @@ static int npc_apply_custom_kpu(struct rvu *rvu,
 		return -EINVAL;
 	}
 	/* Verify if profile fits the HW */
+	if (fw->kpus > rvu->hw->npc_kpus) {
+		dev_warn(rvu->dev, "Not enough KPUs: %d > %d\n", fw->kpus,
+			 rvu->hw->npc_kpus);
+		return -EINVAL;
+	}
+
+	/* Check if there is enough memory for fw loading.
+	 * Check if there is enough entries for profile->kpu[] to
+	 * set cam_entries2 and action_entries2
+	 */
 	if (fw->kpus > profile->kpus) {
-		dev_warn(rvu->dev, "Not enough KPUs: %d > %ld\n", fw->kpus,
+		dev_warn(rvu->dev, "Not enough KPUs: %d > %zu\n", fw->kpus,
 			 profile->kpus);
 		return -EINVAL;
 	}
 
+	*fw_kpus = fw->kpus;
+
+	sfw = devm_kcalloc(rvu->dev, 1, sizeof(*sfw), GFP_KERNEL);
+	if (!sfw)
+		return -ENOMEM;
+
+	memcpy(sfw, fw, sizeof(*sfw));
+
 	profile->custom = 1;
-	profile->name = fw->name;
+	profile->name = sfw->name;
 	profile->version = le64_to_cpu(fw->version);
-	profile->mcam_kex_prfl.mkex = &fw->mkex;
-	profile->lt_def = &fw->lt_def;
-
-	for (kpu = 0; kpu < fw->kpus; kpu++) {
-		fw_kpu = (struct npc_kpu_fwdata *)(fw->data + offset);
-		if (fw_kpu->entries > KPU_MAX_CST_ENT)
-			dev_warn(rvu->dev,
-				 "Too many custom entries on KPU%d: %d > %d\n",
-				 kpu, fw_kpu->entries, KPU_MAX_CST_ENT);
-		entries = min(fw_kpu->entries, KPU_MAX_CST_ENT);
-		cam = (struct npc_kpu_profile_cam *)fw_kpu->data;
-		offset += sizeof(*fw_kpu) + fw_kpu->entries * sizeof(*cam);
-		action = (struct npc_kpu_profile_action *)(fw->data + offset);
-		offset += fw_kpu->entries * sizeof(*action);
-		if (rvu->kpu_fwdata_sz < hdr_sz + offset) {
-			dev_warn(rvu->dev,
-				 "Profile size mismatch on KPU%i parsing.\n",
-				 kpu + 1);
-			return -EINVAL;
-		}
-		for (entry = 0; entry < entries; entry++) {
-			profile->kpu[kpu].cam[entry] = cam[entry];
-			profile->kpu[kpu].action[entry] = action[entry];
-		}
-	}
+	profile->mcam_kex_prfl.mkex = &sfw->mkex;
+	profile->lt_def = &sfw->lt_def;
 
-	return 0;
+	return from_fs ? npc_apply_custom_kpu_from_fs(rvu, profile) :
+		npc_apply_custom_kpu_from_fw(rvu, profile);
 }
 
 static int npc_load_kpu_prfl_img(struct rvu *rvu, void __iomem *prfl_addr,
@@ -1958,45 +2187,19 @@ static int npc_load_kpu_profile_fwdb(struct rvu *rvu, const char *kpu_profile)
 	return ret;
 }
 
-void npc_load_kpu_profile(struct rvu *rvu)
+static int npc_load_kpu_profile_from_fw(struct rvu *rvu)
 {
 	struct npc_kpu_profile_adapter *profile = &rvu->kpu;
 	const char *kpu_profile = rvu->kpu_pfl_name;
-	const struct firmware *fw = NULL;
-	bool retry_fwdb = false;
-
-	/* If user not specified profile customization */
-	if (!strncmp(kpu_profile, def_pfl_name, KPU_NAME_LEN))
-		goto revert_to_default;
-	/* First prepare default KPU, then we'll customize top entries. */
-	npc_prepare_default_kpu(rvu, profile);
-
-	/* Order of preceedence for load loading NPC profile (high to low)
-	 * Firmware binary in filesystem.
-	 * Firmware database method.
-	 * Default KPU profile.
-	 */
-	if (!request_firmware_direct(&fw, kpu_profile, rvu->dev)) {
-		dev_info(rvu->dev, "Loading KPU profile from firmware: %s\n",
-			 kpu_profile);
-		rvu->kpu_fwdata = kzalloc(fw->size, GFP_KERNEL);
-		if (rvu->kpu_fwdata) {
-			memcpy(rvu->kpu_fwdata, fw->data, fw->size);
-			rvu->kpu_fwdata_sz = fw->size;
-		}
-		release_firmware(fw);
-		retry_fwdb = true;
-		goto program_kpu;
-	}
+	int fw_kpus = 0;
 
-load_image_fwdb:
 	/* Loading the KPU profile using firmware database */
 	if (npc_load_kpu_profile_fwdb(rvu, kpu_profile))
-		goto revert_to_default;
+		return -EFAULT;
 
-program_kpu:
 	/* Apply profile customization if firmware was loaded. */
-	if (!rvu->kpu_fwdata_sz || npc_apply_custom_kpu(rvu, profile)) {
+	if (!rvu->kpu_fwdata_sz ||
+	    npc_apply_custom_kpu(rvu, profile, false, &fw_kpus)) {
 		/* If image from firmware filesystem fails to load or invalid
 		 * retry with firmware database method.
 		 */
@@ -2010,10 +2213,6 @@ void npc_load_kpu_profile(struct rvu *rvu)
 			}
 			rvu->kpu_fwdata = NULL;
 			rvu->kpu_fwdata_sz = 0;
-			if (retry_fwdb) {
-				retry_fwdb = false;
-				goto load_image_fwdb;
-			}
 		}
 
 		dev_warn(rvu->dev,
@@ -2021,22 +2220,101 @@ void npc_load_kpu_profile(struct rvu *rvu)
 			 kpu_profile);
 		kfree(rvu->kpu_fwdata);
 		rvu->kpu_fwdata = NULL;
-		goto revert_to_default;
+		return -EFAULT;
 	}
 
-	dev_info(rvu->dev, "Using custom profile '%s', version %d.%d.%d\n",
+	dev_info(rvu->dev, "Using custom profile '%.32s', version %d.%d.%d\n",
 		 profile->name, NPC_KPU_VER_MAJ(profile->version),
 		 NPC_KPU_VER_MIN(profile->version),
 		 NPC_KPU_VER_PATCH(profile->version));
 
-	return;
+	return 0;
+}
+
+static int npc_load_kpu_profile_from_fs(struct rvu *rvu)
+{
+	struct npc_kpu_profile_adapter *profile = &rvu->kpu;
+	const char *kpu_profile = rvu->kpu_pfl_name;
+	const struct firmware *fw = NULL;
+	int ret, fw_kpus = 0;
+	char path[512] = "kpu/";
+
+	if (strlen(kpu_profile) > sizeof(path) - strlen("kpu/") - 1) {
+		dev_err(rvu->dev, "kpu profile name is too big\n");
+		return -ENOSPC;
+	}
+
+	strcat(path, kpu_profile);
+
+	if (request_firmware_direct(&fw, path, rvu->dev))
+		return -ENOENT;
+
+	dev_info(rvu->dev, "Loading KPU profile from filesystem: %s\n",
+		 path);
+
+	rvu->kpu_fwdata = fw->data;
+	rvu->kpu_fwdata_sz = fw->size;
+
+	ret = npc_apply_custom_kpu(rvu, profile, true, &fw_kpus);
+	release_firmware(fw);
+	rvu->kpu_fwdata = NULL;
+
+	if (ret) {
+		rvu->kpu_fwdata_sz = 0;
+		dev_err(rvu->dev,
+			"Loading KPU profile from filesystem failed\n");
+		return ret;
+	}
+
+	/* In firmware loading from filesystem method, all entries are from
+	 * same binary blob.
+	 */
+	rvu->kpu.kpus = fw_kpus;
+	profile->kpus = fw_kpus;
+	profile->from_fs = true;
+	return 0;
+}
+
+void npc_load_kpu_profile(struct rvu *rvu)
+{
+	struct npc_kpu_profile_adapter *profile = &rvu->kpu;
+	const char *kpu_profile = rvu->kpu_pfl_name;
+
+	profile->from_fs = false;
+
+	npc_prepare_default_kpu(rvu, profile);
+
+	/* If user not specified profile customization */
+	if (!strncmp(kpu_profile, def_pfl_name, KPU_NAME_LEN))
+		return;
+
+	/* Order of preceedence for load loading NPC profile (high to low)
+	 * Firmware binary in filesystem.
+	 * Firmware database method.
+	 * Default KPU profile.
+	 */
+
+	/* Filesystem-based KPU loading is not supported on cn20k.
+	 * npc_prepare_default_kpu() was invoked earlier, but control
+	 * reached this point because the default profile was not selected.
+	 * No need to call it again.
+	 */
+	if (!is_cn20k(rvu->pdev)) {
+		if (!npc_load_kpu_profile_from_fs(rvu))
+			return;
+	}
+
+	/* First prepare default KPU, then we'll customize top entries. */
+	npc_prepare_default_kpu(rvu, profile);
+	if (!npc_load_kpu_profile_from_fw(rvu))
+		return;
 
-revert_to_default:
 	npc_prepare_default_kpu(rvu, profile);
 }
 
 static void npc_parser_profile_init(struct rvu *rvu, int blkaddr)
 {
+	struct npc_kpu_profile_adapter *profile = &rvu->kpu;
 	struct rvu_hwinfo *hw = rvu->hw;
 	int num_pkinds, num_kpus, idx;
 
@@ -2060,7 +2338,9 @@ static void npc_parser_profile_init(struct rvu *rvu, int blkaddr)
 	num_pkinds = min_t(int, hw->npc_pkinds, num_pkinds);
 
 	for (idx = 0; idx < num_pkinds; idx++)
-		npc_config_kpuaction(rvu, blkaddr, &rvu->kpu.ikpu[idx], 0, idx, true);
+		npc_config_kpuaction(rvu, blkaddr,
+				     npc_get_ikpu_nth_entry(rvu, idx),
+				     0, idx, true);
 
 	/* Program KPU CAM and Action profiles */
 	num_kpus = rvu->kpu.kpus;
@@ -2068,6 +2348,11 @@ static void npc_parser_profile_init(struct rvu *rvu, int blkaddr)
 
 	for (idx = 0; idx < num_kpus; idx++)
 		npc_program_kpu_profile(rvu, blkaddr, idx, &rvu->kpu.kpu[idx]);
+
+	if (profile->from_fs) {
+		rvu_write64(rvu, blkaddr, NPC_AF_PKINDX_TYPE(54), 0x03);
+		rvu_write64(rvu, blkaddr, NPC_AF_PKINDX_TYPE(58), 0x03);
+	}
 }
 
 void npc_mcam_rsrcs_deinit(struct rvu *rvu)
@@ -2297,18 +2582,21 @@ static void rvu_npc_hw_init(struct rvu *rvu, int blkaddr)
 
 static void rvu_npc_setup_interfaces(struct rvu *rvu, int blkaddr)
 {
-	struct npc_mcam_kex_extr *mkex_extr = rvu->kpu.mcam_kex_prfl.mkex_extr;
-	struct npc_mcam_kex *mkex = rvu->kpu.mcam_kex_prfl.mkex;
+	const struct npc_mcam_kex_extr *mkex_extr;
 	struct npc_mcam *mcam = &rvu->hw->mcam;
 	struct rvu_hwinfo *hw = rvu->hw;
+	const struct npc_mcam_kex *mkex;
 	u64 nibble_ena, rx_kex, tx_kex;
 	u64 *keyx_cfg, reg;
 	u8 intf;
 
+	mkex_extr = rvu->kpu.mcam_kex_prfl.mkex_extr;
+	mkex = rvu->kpu.mcam_kex_prfl.mkex;
+
 	if (is_cn20k(rvu->pdev)) {
-		keyx_cfg = mkex_extr->keyx_cfg;
+		keyx_cfg = (u64 *)mkex_extr->keyx_cfg;
 	} else {
-		keyx_cfg = mkex->keyx_cfg;
+		keyx_cfg = (u64 *)mkex->keyx_cfg;
 		/* Reserve last counter for MCAM RX miss action which is set to
 		 * drop packet. This way we will know how many pkts didn't
 		 * match any MCAM entry.
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.h b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.h
index 83c5e32e2afc..662f6693cfe9 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.h
@@ -18,4 +18,21 @@ int npc_fwdb_prfl_img_map(struct rvu *rvu, void __iomem **prfl_img_addr,
 
 void npc_mcam_clear_bit(struct npc_mcam *mcam, u16 index);
 void npc_mcam_set_bit(struct npc_mcam *mcam, u16 index);
+
+struct npc_kpu_profile_action *
+npc_get_ikpu_nth_entry(struct rvu *rvu, int n);
+
+int
+npc_get_num_kpu_cam_entries(struct rvu *rvu,
+			    const struct npc_kpu_profile *kpu_pfl);
+struct npc_kpu_profile_cam *
+npc_get_kpu_cam_nth_entry(struct rvu *rvu,
+			  const struct npc_kpu_profile *kpu_pfl, int n);
+
+int
+npc_get_num_kpu_action_entries(struct rvu *rvu,
+			       const struct npc_kpu_profile *kpu_pfl);
+struct npc_kpu_profile_action *
+npc_get_kpu_action_nth_entry(struct rvu *rvu,
+			     const struct npc_kpu_profile *kpu_pfl, int n);
 #endif /* RVU_NPC_H */
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_reg.h b/drivers/net/ethernet/marvell/octeontx2/af/rvu_reg.h
index 62cdc714ba57..ab89b8c6e490 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_reg.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_reg.h
@@ -596,6 +596,7 @@
 #define NPC_AF_INTFX_KEX_CFG(a)		(0x01010 | (a) << 8)
 #define NPC_AF_PKINDX_ACTION0(a)	(0x80000ull | (a) << 6)
 #define NPC_AF_PKINDX_ACTION1(a)	(0x80008ull | (a) << 6)
+#define NPC_AF_PKINDX_TYPE(a)		(0x80010ull | (a) << 6)
 #define NPC_AF_PKINDX_CPI_DEFX(a, b)	(0x80020ull | (a) << 6 | (b) << 3)
 #define NPC_AF_KPUX_ENTRYX_CAMX(a, b, c) \
 		(0x100000 | (a) << 14 | (b) << 6 | (c) << 3)
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v20 net-next 8/9] octeontx2: cn20k: Respect NPC MCAM X2/X4 profile in flows and DFT alloc
  2026-06-09  4:04 [PATCH v20 net-next 0/9] octeontx2-af: npc: Enhancements Ratheesh Kannoth
                   ` (6 preceding siblings ...)
  2026-06-09  4:04 ` [PATCH v20 net-next 7/9] octeontx2-af: npc: Support for custom KPU profile from filesystem Ratheesh Kannoth
@ 2026-06-09  4:04 ` Ratheesh Kannoth
  2026-06-10  5:31   ` Ratheesh Kannoth
  2026-06-09  4:04 ` [PATCH v20 net-next 9/9] octeontx2-af: npc: cn20k: Allocate npc_priv and dstats dynamically Ratheesh Kannoth
  8 siblings, 1 reply; 20+ messages in thread
From: Ratheesh Kannoth @ 2026-06-09  4:04 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
	pabeni, sgoutham, Ratheesh Kannoth

Default CN20K NPC rule allocation now keys off the active MCAM keyword
width: use X4 with a bank-masked reference index when the silicon uses
X4 keys, and X2 with the raw index otherwise (replacing the previous
always-X2 / eidx + 1 behaviour).

In the AF flow-install path, flows that need more than 256 key bits
query the NPC profile; if the platform is fixed to X2 entries, fail
with -EOPNOTSUPP instead of requesting X4. Otherwise select X4 for the
MCAM alloc.

On the PF, cache and pass the profile kw_type from npc_get_pfl_info
through otx2_mcam_pfl_info_get(), and use it when allocating MCAM
entries for RSS/defaults and when installing ethtool flows on CN20K,
including masking the reference index for X4 slot layout.

Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
---
 .../ethernet/marvell/octeontx2/af/cn20k/npc.c | 21 ++++++--
 .../marvell/octeontx2/af/rvu_npc_fs.c         | 12 ++++-
 .../marvell/octeontx2/nic/otx2_flows.c        | 48 +++++++++++++------
 3 files changed, 61 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
index d2939726eb8d..78d8cf522be1 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
@@ -4500,10 +4500,16 @@ int npc_cn20k_dft_rules_alloc(struct rvu *rvu, u16 pcifunc)
 	pfvf = rvu_get_pfvf(rvu, pcifunc);
 	pfvf->hw_prio = NPC_DFT_RULE_PRIO;
 
+	if (npc_priv.kw == NPC_MCAM_KEY_X4) {
+		req.kw_type = NPC_MCAM_KEY_X4;
+		req.ref_entry = eidx & (npc_priv.bank_depth - 1);
+	} else {
+		req.kw_type = NPC_MCAM_KEY_X2;
+		req.ref_entry = eidx;
+	}
+
 	req.contig = false;
 	req.ref_prio = NPC_MCAM_HIGHER_PRIO;
-	req.ref_entry = eidx;
-	req.kw_type = NPC_MCAM_KEY_X2;
 	req.count = cnt;
 	req.hdr.pcifunc = pcifunc;
 
@@ -4533,11 +4539,18 @@ int npc_cn20k_dft_rules_alloc(struct rvu *rvu, u16 pcifunc)
 	 * as NPC_DFT_RULE_PRIO - 1 (higher hw priority)
 	 */
 	req.contig = false;
-	req.kw_type = NPC_MCAM_KEY_X2;
 	req.count = cnt;
 	req.hdr.pcifunc = pcifunc;
 	req.ref_prio = NPC_MCAM_LOWER_PRIO;
-	req.ref_entry = eidx + 1;
+
+	if (npc_priv.kw == NPC_MCAM_KEY_X4) {
+		req.kw_type = NPC_MCAM_KEY_X4;
+		req.ref_entry = eidx & (npc_priv.bank_depth - 1);
+	} else {
+		req.kw_type = NPC_MCAM_KEY_X2;
+		req.ref_entry = eidx;
+	}
+
 	ret = rvu_mbox_handler_npc_mcam_alloc_entry(rvu, &req, &rsp);
 	if (ret) {
 		dev_err(rvu->dev,
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_fs.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_fs.c
index 34f1e066707b..a22decbe3449 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_fs.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_fs.c
@@ -1671,9 +1671,11 @@ rvu_npc_alloc_entry_for_flow_install(struct rvu *rvu,
 {
 	struct npc_mcam_alloc_entry_req entry_req;
 	struct npc_mcam_alloc_entry_rsp entry_rsp;
+	struct npc_get_pfl_info_rsp rsp = { 0 };
 	struct npc_get_num_kws_req kws_req;
 	struct npc_get_num_kws_rsp kws_rsp;
 	int off, kw_bits, rc;
+	struct msg_req req;
 	u8 *src, *dst;
 
 	if (!is_cn20k(rvu->pdev)) {
@@ -1697,8 +1699,16 @@ rvu_npc_alloc_entry_for_flow_install(struct rvu *rvu,
 	kw_bits = kws_rsp.kws * 64;
 
 	*kw_type = NPC_MCAM_KEY_X2;
-	if (kw_bits > 256)
+	if (kw_bits > 256) {
+		rvu_mbox_handler_npc_get_pfl_info(rvu, &req, &rsp);
+		if (rsp.kw_type == NPC_MCAM_KEY_X2) {
+			dev_err(rvu->dev,
+				"Only X2 entries are supported in X2 profile\n");
+			return -EOPNOTSUPP;
+		}
+
 		*kw_type = NPC_MCAM_KEY_X4;
+	}
 
 	memset(&entry_req, 0, sizeof(entry_req));
 	memset(&entry_rsp, 0, sizeof(entry_rsp));
diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c
index 38cc539d724d..5dd0591fed99 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c
@@ -37,14 +37,13 @@ static void otx2_clear_ntuple_flow_info(struct otx2_nic *pfvf, struct otx2_flow_
 	flow_cfg->max_flows = 0;
 }
 
-static int otx2_mcam_pfl_info_get(struct otx2_nic *pfvf, bool *is_x2,
-				  u16 *x4_slots)
+static int otx2_mcam_pfl_info_get(struct otx2_nic *pfvf, u16 *x4_slots, u8 *kw_type)
 {
 	struct npc_get_pfl_info_rsp *rsp;
 	struct msg_req *req;
 	static struct {
 		bool is_set;
-		bool is_x2;
+		u8 kw_type;
 		u16 x4_slots;
 	} pfl_info;
 
@@ -53,8 +52,8 @@ static int otx2_mcam_pfl_info_get(struct otx2_nic *pfvf, bool *is_x2,
 	 */
 	mutex_lock(&pfvf->mbox.lock);
 	if (pfl_info.is_set) {
-		*is_x2 = pfl_info.is_x2;
 		*x4_slots = pfl_info.x4_slots;
+		*kw_type = pfl_info.kw_type;
 		mutex_unlock(&pfvf->mbox.lock);
 		return 0;
 	}
@@ -79,16 +78,16 @@ static int otx2_mcam_pfl_info_get(struct otx2_nic *pfvf, bool *is_x2,
 		return -EFAULT;
 	}
 
-	*is_x2 = (rsp->kw_type == NPC_MCAM_KEY_X2);
-	if (*is_x2)
-		*x4_slots = 0;
+	pfl_info.kw_type = rsp->kw_type;
+	if (rsp->kw_type == NPC_MCAM_KEY_X2)
+		pfl_info.x4_slots = 0;
 	else
-		*x4_slots = rsp->x4_slots;
-
-	pfl_info.is_x2 = *is_x2;
-	pfl_info.x4_slots = *x4_slots;
+		pfl_info.x4_slots = rsp->x4_slots;
 	pfl_info.is_set = true;
 
+	*x4_slots = pfl_info.x4_slots;
+	*kw_type = pfl_info.kw_type;
+
 	mutex_unlock(&pfvf->mbox.lock);
 	return 0;
 }
@@ -164,6 +163,7 @@ int otx2_alloc_mcam_entries(struct otx2_nic *pfvf, u16 count)
 	u16 dft_idx = 0, x4_slots = 0;
 	int ent, allocated = 0, ref;
 	bool is_x2 = false;
+	u8 kw_type = 0;
 	int rc;
 
 	/* Free current ones and allocate new ones with requested count */
@@ -182,12 +182,14 @@ int otx2_alloc_mcam_entries(struct otx2_nic *pfvf, u16 count)
 	}
 
 	if (is_cn20k(pfvf->pdev)) {
-		rc = otx2_mcam_pfl_info_get(pfvf, &is_x2, &x4_slots);
+		rc = otx2_mcam_pfl_info_get(pfvf, &x4_slots, &kw_type);
 		if (rc) {
 			netdev_err(pfvf->netdev, "Error to retrieve profile info\n");
 			return rc;
 		}
 
+		is_x2 = kw_type == NPC_MCAM_KEY_X2;
+
 		rc = otx2_get_dft_rl_idx(pfvf, &dft_idx);
 		if (rc) {
 			netdev_err(pfvf->netdev,
@@ -289,6 +291,8 @@ int otx2_mcam_entry_init(struct otx2_nic *pfvf)
 	struct npc_mcam_alloc_entry_rsp *rsp;
 	int vf_vlan_max_flows, count;
 	int rc, ref, prio, ent;
+	u8 kw_type = 0;
+	u16 x4_slots;
 	u16 dft_idx;
 
 	ref = 0;
@@ -315,6 +319,16 @@ int otx2_mcam_entry_init(struct otx2_nic *pfvf)
 	if (!flow_cfg->def_ent)
 		return -ENOMEM;
 
+	kw_type = NPC_MCAM_KEY_X2;
+	if (is_cn20k(pfvf->pdev)) {
+		rc = otx2_mcam_pfl_info_get(pfvf, &x4_slots, &kw_type);
+		if (rc) {
+			netdev_err(pfvf->netdev,
+				   "Error to get pfl info\n");
+			return rc;
+		}
+	}
+
 	mutex_lock(&pfvf->mbox.lock);
 
 	req = otx2_mbox_alloc_msg_npc_mcam_alloc_entry(&pfvf->mbox);
@@ -324,6 +338,10 @@ int otx2_mcam_entry_init(struct otx2_nic *pfvf)
 	}
 
 	req->kw_type = NPC_MCAM_KEY_X2;
+	if (is_cn20k(pfvf->pdev) && kw_type == NPC_MCAM_KEY_X4) {
+		req->kw_type = NPC_MCAM_KEY_X4;
+		ref &= (x4_slots - 1);
+	}
 	req->contig = false;
 	req->count = count;
 	req->ref_prio = prio;
@@ -1174,15 +1192,14 @@ static int otx2_add_flow_msg(struct otx2_nic *pfvf, struct otx2_flow *flow)
 #ifdef CONFIG_DCB
 	int vlan_prio, qidx, pfc_rule = 0;
 #endif
+	bool modify = false, is_x2;
 	int err, vf = 0, off, sz;
-	bool modify = false;
 	u8 kw_type = 0;
 	u8 *src, *dst;
 	u16 x4_slots;
-	bool is_x2;
 
 	if (is_cn20k(pfvf->pdev)) {
-		err = otx2_mcam_pfl_info_get(pfvf, &is_x2, &x4_slots);
+		err = otx2_mcam_pfl_info_get(pfvf, &x4_slots, &kw_type);
 		if (err) {
 			netdev_err(pfvf->netdev,
 				   "Error to retrieve NPC profile info, pcifunc=%#x\n",
@@ -1190,6 +1207,7 @@ static int otx2_add_flow_msg(struct otx2_nic *pfvf, struct otx2_flow *flow)
 			return -EFAULT;
 		}
 
+		is_x2 = kw_type == NPC_MCAM_KEY_X2;
 		if (!is_x2) {
 			err = otx2_prepare_flow_request(&flow->flow_spec,
 							&treq);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v20 net-next 9/9] octeontx2-af: npc: cn20k: Allocate npc_priv and dstats dynamically.
  2026-06-09  4:04 [PATCH v20 net-next 0/9] octeontx2-af: npc: Enhancements Ratheesh Kannoth
                   ` (7 preceding siblings ...)
  2026-06-09  4:04 ` [PATCH v20 net-next 8/9] octeontx2: cn20k: Respect NPC MCAM X2/X4 profile in flows and DFT alloc Ratheesh Kannoth
@ 2026-06-09  4:04 ` Ratheesh Kannoth
  2026-06-10  4:40   ` Ratheesh Kannoth
  2026-06-10  5:33   ` Ratheesh Kannoth
  8 siblings, 2 replies; 20+ messages in thread
From: Ratheesh Kannoth @ 2026-06-09  4:04 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
	pabeni, sgoutham, Ratheesh Kannoth

Replace the file-scope static npc_priv with a kcalloc'd struct filled
from hardware bank/subbank geometry at init (num_banks is no longer a
const compile-time constant; drop init_done and use a non-NULL
npc_priv pointer for liveness). Thread npc_priv_get() / pointer access
through the CN20K NPC code paths, extend teardown to kfree the root
struct on failure and in npc_cn20k_deinit, and adjust MCAM section
setup to use the discovered subbank count.

Allocate MCAM debugfs dstats via devm_kzalloc instead of a static matrix,
and use the allocated backing store consistently when computing deltas
(including the counter rollover compare).

Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
---
 .../marvell/octeontx2/af/cn20k/debugfs.c      |  17 +-
 .../ethernet/marvell/octeontx2/af/cn20k/npc.c | 442 +++++++++---------
 .../ethernet/marvell/octeontx2/af/cn20k/npc.h |   4 +-
 3 files changed, 240 insertions(+), 223 deletions(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c
index 730ef97a57e6..b6fda42e44c7 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c
@@ -176,7 +176,8 @@ static DEFINE_MUTEX(stats_lock);
  * hard limit on all silicon variants, preventing any possibility of
  * out-of-bounds access.
  */
-static u64 dstats[MAX_NUM_BANKS][MAX_SUBBANK_DEPTH * MAX_NUM_SUB_BANKS] = {};
+static u64 (*dstats)[MAX_NUM_BANKS][MAX_SUBBANK_DEPTH * MAX_NUM_SUB_BANKS];
+
 static int npc_mcam_dstats_show(struct seq_file *s, void *unused)
 {
 	struct npc_priv_t *npc_priv;
@@ -212,24 +213,24 @@ static int npc_mcam_dstats_show(struct seq_file *s, void *unused)
 					   NPC_AF_CN20K_MCAMEX_BANKX_STAT_EXT(idx, bank));
 			if (!stats)
 				continue;
-			if (stats == dstats[bank][idx])
+			if (stats == dstats[0][bank][idx])
 				continue;
 
-			if (stats < dstats[bank][idx])
-				dstats[bank][idx] = 0;
+			if (stats < dstats[0][bank][idx])
+				dstats[0][bank][idx] = 0;
 
 			pf = 0xFFFF;
 			map = xa_load(&npc_priv->xa_idx2pf_map, mcam_idx);
 			if (map)
 				pf = xa_to_value(map);
 
-			delta = stats - dstats[bank][idx];
+			delta = stats - dstats[0][bank][idx];
 
 			snprintf(buff, sizeof(buff), "%u\t%#04x\t%llu\n",
 				 mcam_idx, pf, delta);
 			seq_puts(s, buff);
 
-			dstats[bank][idx] = stats;
+			dstats[0][bank][idx] = stats;
 		}
 	}
 
@@ -397,6 +398,10 @@ int npc_cn20k_debugfs_init(struct rvu *rvu)
 	debugfs_create_file("vidx2idx", 0444, rvu->rvu_dbg.npc,
 			    npc_priv, &npc_vidx2idx_map_fops);
 
+	dstats = devm_kzalloc(rvu->dev, sizeof(*dstats), GFP_KERNEL);
+	if (!dstats)
+		return -ENOMEM;
+
 	debugfs_create_file("dstats", 0444, rvu->rvu_dbg.npc, rvu,
 			    &npc_mcam_dstats_fops);
 
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
index 78d8cf522be1..79e6684ed855 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
@@ -16,9 +16,7 @@
 #include "cn20k/reg.h"
 #include "rvu_npc_fs.h"
 
-static struct npc_priv_t npc_priv = {
-	.num_banks = MAX_NUM_BANKS,
-};
+static struct npc_priv_t *npc_priv;
 
 static const char *npc_kw_name[NPC_MCAM_KEY_MAX] = {
 	[NPC_MCAM_KEY_DYN] = "DYNAMIC",
@@ -226,7 +224,7 @@ static u16 npc_idx2vidx(u16 idx)
 	vidx = idx;
 	index = idx;
 
-	map = xa_load(&npc_priv.xa_idx2vidx_map, index);
+	map = xa_load(&npc_priv->xa_idx2vidx_map, index);
 	if (!map)
 		goto done;
 
@@ -242,7 +240,7 @@ static u16 npc_idx2vidx(u16 idx)
 
 static bool npc_is_vidx(u16 vidx)
 {
-	return vidx >= npc_priv.bank_depth * 2;
+	return vidx >= npc_priv->bank_depth * 2;
 }
 
 static u16 npc_vidx2idx(u16 vidx)
@@ -256,7 +254,7 @@ static u16 npc_vidx2idx(u16 vidx)
 	idx = vidx;
 	index = vidx;
 
-	map = xa_load(&npc_priv.xa_vidx2idx_map, index);
+	map = xa_load(&npc_priv->xa_vidx2idx_map, index);
 	if (!map)
 		goto done;
 
@@ -272,7 +270,7 @@ static u16 npc_vidx2idx(u16 vidx)
 
 u16 npc_cn20k_vidx2idx(u16 idx)
 {
-	if (!npc_priv.init_done)
+	if (!npc_priv)
 		return idx;
 
 	if (!npc_is_vidx(idx))
@@ -283,7 +281,7 @@ u16 npc_cn20k_vidx2idx(u16 idx)
 
 u16 npc_cn20k_idx2vidx(u16 idx)
 {
-	if (!npc_priv.init_done)
+	if (!npc_priv)
 		return idx;
 
 	if (npc_is_vidx(idx))
@@ -306,7 +304,7 @@ static int npc_vidx_maps_del_entry(struct rvu *rvu, u16 vidx, u16 *old_midx)
 
 	mcam_idx = npc_vidx2idx(vidx);
 
-	map = xa_erase(&npc_priv.xa_vidx2idx_map, vidx);
+	map = xa_erase(&npc_priv->xa_vidx2idx_map, vidx);
 	if (!map) {
 		dev_err(rvu->dev,
 			"%s: vidx(%u) does not map to proper mcam idx\n",
@@ -314,7 +312,7 @@ static int npc_vidx_maps_del_entry(struct rvu *rvu, u16 vidx, u16 *old_midx)
 		return -ESRCH;
 	}
 
-	map = xa_erase(&npc_priv.xa_idx2vidx_map, mcam_idx);
+	map = xa_erase(&npc_priv->xa_idx2vidx_map, mcam_idx);
 	if (!map) {
 		dev_err(rvu->dev,
 			"%s: vidx(%u) is not valid\n",
@@ -341,7 +339,7 @@ static int npc_vidx_maps_modify(struct rvu *rvu, u16 vidx, u16 new_midx)
 		return -ESRCH;
 	}
 
-	map = xa_erase(&npc_priv.xa_vidx2idx_map, vidx);
+	map = xa_erase(&npc_priv->xa_vidx2idx_map, vidx);
 	if (!map) {
 		dev_err(rvu->dev,
 			"%s: vidx(%u) could not be deleted from vidx2idx map\n",
@@ -351,7 +349,7 @@ static int npc_vidx_maps_modify(struct rvu *rvu, u16 vidx, u16 new_midx)
 
 	old_midx = xa_to_value(map);
 
-	rc = xa_insert(&npc_priv.xa_vidx2idx_map, vidx,
+	rc = xa_insert(&npc_priv->xa_vidx2idx_map, vidx,
 		       xa_mk_value(new_midx), GFP_KERNEL);
 	if (rc) {
 		dev_err(rvu->dev,
@@ -360,7 +358,7 @@ static int npc_vidx_maps_modify(struct rvu *rvu, u16 vidx, u16 new_midx)
 		goto fail1;
 	}
 
-	map = xa_erase(&npc_priv.xa_idx2vidx_map, old_midx);
+	map = xa_erase(&npc_priv->xa_idx2vidx_map, old_midx);
 	if (!map) {
 		dev_err(rvu->dev,
 			"%s: old_midx(%u, vidx(%u)) cannot be added to idx2vidx map\n",
@@ -369,7 +367,7 @@ static int npc_vidx_maps_modify(struct rvu *rvu, u16 vidx, u16 new_midx)
 		goto fail2;
 	}
 
-	rc = xa_insert(&npc_priv.xa_idx2vidx_map, new_midx,
+	rc = xa_insert(&npc_priv->xa_idx2vidx_map, new_midx,
 		       xa_mk_value(vidx), GFP_KERNEL);
 	if (rc) {
 		dev_err(rvu->dev,
@@ -382,21 +380,21 @@ static int npc_vidx_maps_modify(struct rvu *rvu, u16 vidx, u16 new_midx)
 
 fail3:
 	/* Restore vidx at old_midx location */
-	if (xa_insert(&npc_priv.xa_idx2vidx_map, old_midx,
+	if (xa_insert(&npc_priv->xa_idx2vidx_map, old_midx,
 		      xa_mk_value(vidx), GFP_KERNEL))
 		dev_err(rvu->dev,
 			"%s: Error to roll back idx2vidx old_midx=%u vidx=%u\n",
 			__func__, old_midx, vidx);
 fail2:
 	/* Erase new_midx inserted at vidx */
-	if (!xa_erase(&npc_priv.xa_vidx2idx_map, vidx))
+	if (!xa_erase(&npc_priv->xa_vidx2idx_map, vidx))
 		dev_err(rvu->dev,
 			"%s: Failed to roll back vidx2idx vidx=%u\n",
 			__func__, vidx);
 
 fail1:
 	/* Restore old_midx at vidx location */
-	if (xa_insert(&npc_priv.xa_vidx2idx_map, vidx,
+	if (xa_insert(&npc_priv->xa_vidx2idx_map, vidx,
 		      xa_mk_value(old_midx), GFP_KERNEL))
 		dev_err(rvu->dev,
 			"%s: Failed to roll back vidx2idx to old_midx=%u, vidx=%u\n",
@@ -412,10 +410,10 @@ static int npc_vidx_maps_add_entry(struct rvu *rvu, u16 mcam_idx, int pcifunc,
 	u32 id;
 
 	/* Virtual index start from maximum mcam index + 1 */
-	max = npc_priv.bank_depth * 2 * 2 - 1;
-	min = npc_priv.bank_depth * 2;
+	max = npc_priv->bank_depth * 2 * 2 - 1;
+	min = npc_priv->bank_depth * 2;
 
-	rc = xa_alloc(&npc_priv.xa_vidx2idx_map, &id,
+	rc = xa_alloc(&npc_priv->xa_vidx2idx_map, &id,
 		      xa_mk_value(mcam_idx),
 		      XA_LIMIT(min, max), GFP_KERNEL);
 	if (rc) {
@@ -425,7 +423,7 @@ static int npc_vidx_maps_add_entry(struct rvu *rvu, u16 mcam_idx, int pcifunc,
 		goto fail1;
 	}
 
-	rc = xa_insert(&npc_priv.xa_idx2vidx_map, mcam_idx,
+	rc = xa_insert(&npc_priv->xa_idx2vidx_map, mcam_idx,
 		       xa_mk_value(id), GFP_KERNEL);
 	if (rc) {
 		dev_err(rvu->dev,
@@ -440,7 +438,7 @@ static int npc_vidx_maps_add_entry(struct rvu *rvu, u16 mcam_idx, int pcifunc,
 	return 0;
 
 fail2:
-	xa_erase(&npc_priv.xa_vidx2idx_map, id);
+	xa_erase(&npc_priv->xa_vidx2idx_map, id);
 fail1:
 	return rc;
 }
@@ -691,7 +689,7 @@ void npc_cn20k_parser_profile_init(struct rvu *rvu, int blkaddr)
 
 struct npc_priv_t *npc_priv_get(void)
 {
-	return &npc_priv;
+	return npc_priv;
 }
 
 static void npc_program_mkex_rx(struct rvu *rvu, int blkaddr,
@@ -860,9 +858,9 @@ npc_cn20k_enable_mcam_entry(struct rvu *rvu, int blkaddr,
 
 update_en_map:
 	if (enable)
-		set_bit(index, npc_priv.en_map);
+		set_bit(index, npc_priv->en_map);
 	else
-		clear_bit(index, npc_priv.en_map);
+		clear_bit(index, npc_priv->en_map);
 
 	return 0;
 }
@@ -1751,28 +1749,28 @@ int npc_mcam_idx_2_key_type(struct rvu *rvu, u16 mcam_idx, u8 *key_type)
 	int bank_off, sb_id;
 
 	/* mcam_idx should be less than (2 * bank depth) */
-	if (mcam_idx >= npc_priv.bank_depth * 2) {
+	if (mcam_idx >= npc_priv->bank_depth * 2) {
 		dev_err(rvu->dev, "%s: bad params\n",
 			__func__);
 		return -EINVAL;
 	}
 
 	/* find mcam offset per bank */
-	bank_off = mcam_idx & (npc_priv.bank_depth - 1);
+	bank_off = mcam_idx & (npc_priv->bank_depth - 1);
 
 	/* Find subbank id */
-	sb_id = bank_off / npc_priv.subbank_depth;
+	sb_id = bank_off / npc_priv->subbank_depth;
 
 	/* Check if subbank id is more than maximum
 	 * number of subbanks available
 	 */
-	if (sb_id >= npc_priv.num_subbanks) {
+	if (sb_id >= npc_priv->num_subbanks) {
 		dev_err(rvu->dev, "%s: invalid subbank %d\n",
 			__func__, sb_id);
 		return -EINVAL;
 	}
 
-	sb = &npc_priv.sb[sb_id];
+	sb = &npc_priv->sb[sb_id];
 
 	*key_type = sb->key_type;
 
@@ -1788,7 +1786,7 @@ static int npc_subbank_idx_2_mcam_idx(struct rvu *rvu, struct npc_subbank *sb,
 	 * subsection depth - 1
 	 */
 	if (sb->key_type == NPC_MCAM_KEY_X4 &&
-	    sub_off >= npc_priv.subbank_depth) {
+	    sub_off >= npc_priv->subbank_depth) {
 		dev_err(rvu->dev,
 			"%s: Failed to get mcam idx (x4) sb->idx=%u sub_off=%u",
 			__func__, sb->idx, sub_off);
@@ -1799,7 +1797,7 @@ static int npc_subbank_idx_2_mcam_idx(struct rvu *rvu, struct npc_subbank *sb,
 	 * 2 * subsection depth - 1
 	 */
 	if (sb->key_type == NPC_MCAM_KEY_X2 &&
-	    sub_off >= npc_priv.subbank_depth * 2) {
+	    sub_off >= npc_priv->subbank_depth * 2) {
 		dev_err(rvu->dev,
 			"%s: Failed to get mcam idx (x2) sb->idx=%u sub_off=%u",
 			__func__, sb->idx, sub_off);
@@ -1807,12 +1805,12 @@ static int npc_subbank_idx_2_mcam_idx(struct rvu *rvu, struct npc_subbank *sb,
 	}
 
 	/* Find subbank offset from respective subbank (w.r.t bank) */
-	off = sub_off & (npc_priv.subbank_depth - 1);
+	off = sub_off & (npc_priv->subbank_depth - 1);
 
 	/* if subsection idx is in bank1, add bank depth,
 	 * which is part of sb->b1b
 	 */
-	bot = sub_off >= npc_priv.subbank_depth ? sb->b1b : sb->b0b;
+	bot = sub_off >= npc_priv->subbank_depth ? sb->b1b : sb->b0b;
 
 	*mcam_idx = bot + off;
 	return 0;
@@ -1825,37 +1823,37 @@ int npc_mcam_idx_2_subbank_idx(struct rvu *rvu, u16 mcam_idx,
 	int bank_off, sb_id;
 
 	/* mcam_idx should be less than (2 * bank depth) */
-	if (mcam_idx >= npc_priv.bank_depth * 2) {
+	if (mcam_idx >= npc_priv->bank_depth * 2) {
 		dev_err(rvu->dev, "%s: Invalid mcam idx %u\n",
 			__func__, mcam_idx);
 		return -EINVAL;
 	}
 
 	/* find mcam offset per bank */
-	bank_off = mcam_idx & (npc_priv.bank_depth - 1);
+	bank_off = mcam_idx & (npc_priv->bank_depth - 1);
 
 	/* Find subbank id */
-	sb_id = bank_off / npc_priv.subbank_depth;
+	sb_id = bank_off / npc_priv->subbank_depth;
 
 	/* Check if subbank id is more than maximum
 	 * number of subbanks available
 	 */
-	if (sb_id >= npc_priv.num_subbanks) {
+	if (sb_id >= npc_priv->num_subbanks) {
 		dev_err(rvu->dev, "%s: invalid subbank %d\n",
 			__func__, sb_id);
 		return -EINVAL;
 	}
 
-	*sb = &npc_priv.sb[sb_id];
+	*sb = &npc_priv->sb[sb_id];
 
 	/* Subbank offset per bank */
-	*sb_off = bank_off % npc_priv.subbank_depth;
+	*sb_off = bank_off % npc_priv->subbank_depth;
 
 	/* Index in a subbank should add subbank depth
 	 * if it is in bank1
 	 */
-	if (mcam_idx >= npc_priv.bank_depth)
-		*sb_off += npc_priv.subbank_depth;
+	if (mcam_idx >= npc_priv->bank_depth)
+		*sb_off += npc_priv->subbank_depth;
 
 	return 0;
 }
@@ -1871,9 +1869,9 @@ static int __npc_subbank_contig_alloc(struct rvu *rvu,
 	int k, offset, delta = 0;
 	int cnt = 0, sbd;
 
-	sbd = npc_priv.subbank_depth;
+	sbd = npc_priv->subbank_depth;
 
-	if (sidx >= npc_priv.bank_depth)
+	if (sidx >= npc_priv->bank_depth)
 		delta = sbd;
 
 	switch (prio) {
@@ -1940,8 +1938,8 @@ static int __npc_subbank_non_contig_alloc(struct rvu *rvu,
 	int cnt = 0, delta;
 	int k, sbd;
 
-	sbd = npc_priv.subbank_depth;
-	delta = sidx >= npc_priv.bank_depth ? sbd : 0;
+	sbd = npc_priv->subbank_depth;
+	delta = sidx >= npc_priv->bank_depth ? sbd : 0;
 
 	switch (prio) {
 		/* Find an area of size 'count' from sidx to eidx */
@@ -2002,7 +2000,7 @@ static void __npc_subbank_sboff_2_off(struct rvu *rvu, struct npc_subbank *sb,
 {
 	int sbd;
 
-	sbd = npc_priv.subbank_depth;
+	sbd = npc_priv->subbank_depth;
 
 	*off = sb_off & (sbd - 1);
 	*bmap = (sb_off >= sbd) ? sb->b1map : sb->b0map;
@@ -2051,20 +2049,20 @@ static int __npc_subbank_mark_free(struct rvu *rvu, struct npc_subbank *sb)
 	sb->flags = NPC_SUBBANK_FLAG_FREE;
 	sb->key_type = 0;
 
-	bitmap_clear(sb->b0map, 0, npc_priv.subbank_depth);
-	bitmap_clear(sb->b1map, 0, npc_priv.subbank_depth);
+	bitmap_clear(sb->b0map, 0, npc_priv->subbank_depth);
+	bitmap_clear(sb->b1map, 0, npc_priv->subbank_depth);
 
-	if (!xa_erase(&npc_priv.xa_sb_used, sb->arr_idx)) {
+	if (!xa_erase(&npc_priv->xa_sb_used, sb->arr_idx)) {
 		dev_err(rvu->dev,
 			"%s: Error to delete from xa_sb_used array\n",
 			__func__);
 		return -EFAULT;
 	}
 
-	rc = xa_insert(&npc_priv.xa_sb_free, sb->arr_idx,
+	rc = xa_insert(&npc_priv->xa_sb_free, sb->arr_idx,
 		       xa_mk_value(sb->idx), GFP_KERNEL);
 	if (rc) {
-		rc = xa_insert(&npc_priv.xa_sb_used, sb->arr_idx,
+		rc = xa_insert(&npc_priv->xa_sb_used, sb->arr_idx,
 			       xa_mk_value(sb->idx), GFP_KERNEL);
 		if (rc)
 			dev_err(rvu->dev,
@@ -2093,21 +2091,21 @@ static int __npc_subbank_mark_used(struct rvu *rvu, struct npc_subbank *sb,
 	sb->flags = NPC_SUBBANK_FLAG_USED;
 	sb->key_type = key_type;
 	if (key_type == NPC_MCAM_KEY_X4)
-		sb->free_cnt = npc_priv.subbank_depth;
+		sb->free_cnt = npc_priv->subbank_depth;
 	else
-		sb->free_cnt = 2 * npc_priv.subbank_depth;
+		sb->free_cnt = 2 * npc_priv->subbank_depth;
 
-	bitmap_clear(sb->b0map, 0, npc_priv.subbank_depth);
-	bitmap_clear(sb->b1map, 0, npc_priv.subbank_depth);
+	bitmap_clear(sb->b0map, 0, npc_priv->subbank_depth);
+	bitmap_clear(sb->b1map, 0, npc_priv->subbank_depth);
 
-	if (!xa_erase(&npc_priv.xa_sb_free, sb->arr_idx)) {
+	if (!xa_erase(&npc_priv->xa_sb_free, sb->arr_idx)) {
 		dev_err(rvu->dev,
 			"%s: Error to delete from xa_sb_free array\n",
 			__func__);
 		return -EFAULT;
 	}
 
-	rc = xa_insert(&npc_priv.xa_sb_used, sb->arr_idx,
+	rc = xa_insert(&npc_priv->xa_sb_used, sb->arr_idx,
 		       xa_mk_value(sb->idx), GFP_KERNEL);
 	if (rc)
 		dev_err(rvu->dev,
@@ -2131,10 +2129,10 @@ static bool __npc_subbank_free(struct rvu *rvu, struct npc_subbank *sb,
 
 	/* Check whether we can mark whole subbank as free */
 	if (sb->key_type == NPC_MCAM_KEY_X4) {
-		if (sb->free_cnt < npc_priv.subbank_depth)
+		if (sb->free_cnt < npc_priv->subbank_depth)
 			goto done;
 	} else {
-		if (sb->free_cnt < 2 * npc_priv.subbank_depth)
+		if (sb->free_cnt < 2 * npc_priv->subbank_depth)
 			goto done;
 	}
 
@@ -2213,7 +2211,7 @@ static int __npc_subbank_alloc(struct rvu *rvu, struct npc_subbank *sb,
 
 	/* x4 indexes are from 0 to bank size as it combines two x2 banks */
 	if (key_type == NPC_MCAM_KEY_X4 &&
-	    (ref >= npc_priv.bank_depth || limit >= npc_priv.bank_depth)) {
+	    (ref >= npc_priv->bank_depth || limit >= npc_priv->bank_depth)) {
 		dev_err(rvu->dev,
 			"%s: Wrong ref_enty(%d) or limit(%d) for x4\n",
 			__func__, ref, limit);
@@ -2223,8 +2221,8 @@ static int __npc_subbank_alloc(struct rvu *rvu, struct npc_subbank *sb,
 	/* This function is called either bank0 or bank1 portion of a subbank.
 	 * so ref and limit should be on same bank.
 	 */
-	diffbank = !!((ref & npc_priv.bank_depth) ^
-		      (limit & npc_priv.bank_depth));
+	diffbank = !!((ref & npc_priv->bank_depth) ^
+		      (limit & npc_priv->bank_depth));
 	if (diffbank) {
 		dev_err(rvu->dev,
 			"%s: request ref and limit should be from same bank\n",
@@ -2248,7 +2246,7 @@ static int __npc_subbank_alloc(struct rvu *rvu, struct npc_subbank *sb,
 	 * or equal to mcam entries available in the subbank if contig.
 	 */
 	if (sb->flags & NPC_SUBBANK_FLAG_FREE) {
-		if (contig && count > npc_priv.subbank_depth) {
+		if (contig && count > npc_priv->subbank_depth) {
 			dev_err(rvu->dev, "%s: Less number of entries\n",
 				__func__);
 			return -ENOSPC;
@@ -2271,10 +2269,10 @@ static int __npc_subbank_alloc(struct rvu *rvu, struct npc_subbank *sb,
 	}
 
 process:
-	/* if ref or limit >= npc_priv.bank_depth, index are in bank1.
+	/* if ref or limit >= npc_priv->bank_depth, index are in bank1.
 	 * else bank0.
 	 */
-	if (ref >= npc_priv.bank_depth) {
+	if (ref >= npc_priv->bank_depth) {
 		bmap = sb->b1map;
 		t = sb->b1t;
 		b = sb->b1b;
@@ -2285,8 +2283,8 @@ static int __npc_subbank_alloc(struct rvu *rvu, struct npc_subbank *sb,
 	}
 
 	/* Calculate free slots */
-	bw = bitmap_weight(bmap, npc_priv.subbank_depth);
-	bfree = npc_priv.subbank_depth - bw;
+	bw = bitmap_weight(bmap, npc_priv->subbank_depth);
+	bfree = npc_priv->subbank_depth - bw;
 
 	if (!bfree) {
 		dev_dbg(rvu->dev, "%s: subbank is full\n", __func__);
@@ -2415,7 +2413,7 @@ npc_del_from_pf_maps(struct rvu *rvu, u16 mcam_idx)
 	int pcifunc, idx;
 	void *map;
 
-	map = xa_erase(&npc_priv.xa_idx2pf_map, mcam_idx);
+	map = xa_erase(&npc_priv->xa_idx2pf_map, mcam_idx);
 	if (!map) {
 		dev_err(rvu->dev,
 			"%s: failed to erase mcam_idx(%u) from xa_idx2pf map\n",
@@ -2424,7 +2422,7 @@ npc_del_from_pf_maps(struct rvu *rvu, u16 mcam_idx)
 	}
 
 	pcifunc = xa_to_value(map);
-	map = xa_load(&npc_priv.xa_pf_map, pcifunc);
+	map = xa_load(&npc_priv->xa_pf_map, pcifunc);
 	if (!map) {
 		dev_err(rvu->dev,
 			"%s: failed to find entry for (%u) from xa_pf_map, mcam=%u\n",
@@ -2434,7 +2432,7 @@ npc_del_from_pf_maps(struct rvu *rvu, u16 mcam_idx)
 
 	idx = xa_to_value(map);
 
-	map = xa_erase(&npc_priv.xa_pf2idx_map[idx], mcam_idx);
+	map = xa_erase(&npc_priv->xa_pf2idx_map[idx], mcam_idx);
 	if (!map) {
 		dev_err(rvu->dev,
 			"%s: failed to erase mcam_idx(%u) from xa_pf2idx_map map\n",
@@ -2454,18 +2452,18 @@ npc_add_to_pf_maps(struct rvu *rvu, u16 mcam_idx, int pcifunc)
 		"%s: add2maps mcam_idx(%u) to xa_idx2pf map pcifunc=%#x\n",
 		__func__, mcam_idx, pcifunc);
 
-	rc = xa_insert(&npc_priv.xa_idx2pf_map, mcam_idx,
+	rc = xa_insert(&npc_priv->xa_idx2pf_map, mcam_idx,
 		       xa_mk_value(pcifunc), GFP_KERNEL);
 
 	if (rc) {
-		map = xa_load(&npc_priv.xa_idx2pf_map, mcam_idx);
+		map = xa_load(&npc_priv->xa_idx2pf_map, mcam_idx);
 		dev_err(rvu->dev,
 			"%s: failed to insert mcam_idx(%u) to xa_idx2pf map, existing value=%lu\n",
 			__func__, mcam_idx, xa_to_value(map));
 		return -EFAULT;
 	}
 
-	map = xa_load(&npc_priv.xa_pf_map, pcifunc);
+	map = xa_load(&npc_priv->xa_pf_map, pcifunc);
 	if (!map) {
 		dev_err(rvu->dev,
 			"%s: failed to find pf map entry for pcifunc=%#x, mcam=%u\n",
@@ -2475,12 +2473,12 @@ npc_add_to_pf_maps(struct rvu *rvu, u16 mcam_idx, int pcifunc)
 
 	idx = xa_to_value(map);
 
-	rc = xa_insert(&npc_priv.xa_pf2idx_map[idx], mcam_idx,
+	rc = xa_insert(&npc_priv->xa_pf2idx_map[idx], mcam_idx,
 		       xa_mk_value(pcifunc), GFP_KERNEL);
 
 	if (rc) {
-		map = xa_load(&npc_priv.xa_pf2idx_map[idx], mcam_idx);
-		xa_erase(&npc_priv.xa_idx2pf_map, mcam_idx);
+		map = xa_load(&npc_priv->xa_pf2idx_map[idx], mcam_idx);
+		xa_erase(&npc_priv->xa_idx2pf_map, mcam_idx);
 		dev_err(rvu->dev,
 			"%s: failed to insert mcam_idx(%u) to xa_pf2idx_map map, earlier value=%lu idx=%u\n",
 			__func__, mcam_idx, xa_to_value(map), idx);
@@ -2510,9 +2508,9 @@ npc_subbank_suits(struct npc_subbank *sb, int key_type)
 	return false;
 }
 
-#define SB_ALIGN_UP(val)   (((val) + npc_priv.subbank_depth) & \
-			    ~((npc_priv.subbank_depth) - 1))
-#define SB_ALIGN_DOWN(val) ALIGN_DOWN((val), npc_priv.subbank_depth)
+#define SB_ALIGN_UP(val)   (((val) + npc_priv->subbank_depth) & \
+			    ~((npc_priv->subbank_depth) - 1))
+#define SB_ALIGN_DOWN(val) ALIGN_DOWN((val), npc_priv->subbank_depth)
 
 static void npc_subbank_iter_down(struct rvu *rvu,
 				  int ref, int limit,
@@ -2538,7 +2536,7 @@ static void npc_subbank_iter_down(struct rvu *rvu,
 	}
 
 	*cur_ref = *cur_limit - 1;
-	align = *cur_ref - npc_priv.subbank_depth + 1;
+	align = *cur_ref - npc_priv->subbank_depth + 1;
 	if (align <= limit) {
 		*stop = true;
 		*cur_limit = limit;
@@ -2578,7 +2576,7 @@ static void npc_subbank_iter_up(struct rvu *rvu,
 	}
 
 	*cur_ref = *cur_limit + 1;
-	align = *cur_ref + npc_priv.subbank_depth - 1;
+	align = *cur_ref + npc_priv->subbank_depth - 1;
 
 	if (align >= limit) {
 		*stop = true;
@@ -2606,17 +2604,17 @@ npc_subbank_iter(struct rvu *rvu, int key_type,
 
 	/* limit and ref should < bank_depth for x4 */
 	if (key_type == NPC_MCAM_KEY_X4) {
-		if (*cur_ref >= npc_priv.bank_depth)
+		if (*cur_ref >= npc_priv->bank_depth)
 			return -EINVAL;
 
-		if (*cur_limit >= npc_priv.bank_depth)
+		if (*cur_limit >= npc_priv->bank_depth)
 			return -EINVAL;
 	}
 	/* limit and ref should < 2 * bank_depth, for x2 */
-	if (*cur_ref >= 2 * npc_priv.bank_depth)
+	if (*cur_ref >= 2 * npc_priv->bank_depth)
 		return -EINVAL;
 
-	if (*cur_limit >= 2 * npc_priv.bank_depth)
+	if (*cur_limit >= 2 * npc_priv->bank_depth)
 		return -EINVAL;
 
 	return 0;
@@ -2651,7 +2649,7 @@ static int npc_idx_free(struct rvu *rvu, u16 *mcam_idx, int count,
 			vidx = npc_idx2vidx(midx);
 		}
 
-		if (midx >= npc_priv.bank_depth * npc_priv.num_banks) {
+		if (midx >= npc_priv->bank_depth * npc_priv->num_banks) {
 			dev_err(rvu->dev,
 				"%s: Invalid mcam_idx=%u cannot be deleted\n",
 				__func__, mcam_idx[i]);
@@ -2846,7 +2844,7 @@ static int npc_subbank_free_cnt(struct rvu *rvu, struct npc_subbank *sb,
 {
 	int cnt, spd;
 
-	spd = npc_priv.subbank_depth;
+	spd = npc_priv->subbank_depth;
 	mutex_lock(&sb->lock);
 
 	if (sb->flags & NPC_SUBBANK_FLAG_FREE)
@@ -3005,7 +3003,7 @@ static int npc_subbank_noref_alloc(struct rvu *rvu, int key_type, bool contig,
 	max_alloc = !contig;
 
 	/* Check used subbanks for free slots */
-	xa_for_each(&npc_priv.xa_sb_used, index, val) {
+	xa_for_each(&npc_priv->xa_sb_used, index, val) {
 		idx = xa_to_value(val);
 
 		/* Minimize allocation from restricted subbanks
@@ -3014,7 +3012,7 @@ static int npc_subbank_noref_alloc(struct rvu *rvu, int key_type, bool contig,
 		if (npc_subbank_restrict_usage(rvu, idx))
 			continue;
 
-		sb = &npc_priv.sb[idx];
+		sb = &npc_priv->sb[idx];
 
 		/* Skip if not suitable subbank */
 		if (!npc_subbank_suits(sb, key_type))
@@ -3071,9 +3069,9 @@ static int npc_subbank_noref_alloc(struct rvu *rvu, int key_type, bool contig,
 	}
 
 	/* Allocate in free subbanks */
-	xa_for_each(&npc_priv.xa_sb_free, index, val) {
+	xa_for_each(&npc_priv->xa_sb_free, index, val) {
 		idx = xa_to_value(val);
-		sb = &npc_priv.sb[idx];
+		sb = &npc_priv->sb[idx];
 
 		/* Minimize allocation from restricted subbanks
 		 * in noref allocations.
@@ -3129,7 +3127,7 @@ static int npc_subbank_noref_alloc(struct rvu *rvu, int key_type, bool contig,
 	for (i = 0; restrict_valid &&
 	     (i < ARRAY_SIZE(npc_subbank_restricted_idxs)); i++) {
 		idx = npc_subbank_restricted_idxs[i];
-		sb = &npc_priv.sb[idx];
+		sb = &npc_priv->sb[idx];
 
 		/* Skip if not suitable subbank */
 		if (!npc_subbank_suits(sb, key_type))
@@ -3209,7 +3207,7 @@ int npc_cn20k_ref_idx_alloc(struct rvu *rvu, int pcifunc, int key_type,
 	bool ref_valid;
 	u16 vidx;
 
-	bd = npc_priv.bank_depth;
+	bd = npc_priv->bank_depth;
 
 	/* Special case: ref == 0 && limit= 0 && prio == HIGH && count == 1
 	 * Here user wants to allocate 0th entry
@@ -3227,7 +3225,7 @@ int npc_cn20k_ref_idx_alloc(struct rvu *rvu, int pcifunc, int key_type,
 	ref_valid = !!(limit || ref);
 	defrag_candidate = !ref_valid && !contig && virt;
 	if (!ref_valid) {
-		if (contig && count > npc_priv.subbank_depth)
+		if (contig && count > npc_priv->subbank_depth)
 			goto try_noref_multi_subbank;
 
 		rc = npc_subbank_noref_alloc(rvu, key_type, contig,
@@ -3272,7 +3270,7 @@ int npc_cn20k_ref_idx_alloc(struct rvu *rvu, int pcifunc, int key_type,
 		return -EINVAL;
 	}
 
-	if (contig && count > npc_priv.subbank_depth)
+	if (contig && count > npc_priv->subbank_depth)
 		goto try_ref_multi_subbank;
 
 	rc = npc_subbank_ref_alloc(rvu, key_type, ref, limit,
@@ -3334,8 +3332,8 @@ void npc_cn20k_subbank_calc_free(struct rvu *rvu, int *x2_free,
 	*x4_free = 0;
 	*sb_free = 0;
 
-	for (i = 0; i < npc_priv.num_subbanks; i++) {
-		sb = &npc_priv.sb[i];
+	for (i = 0; i < npc_priv->num_subbanks; i++) {
+		sb = &npc_priv->sb[i];
 		mutex_lock(&sb->lock);
 
 		/* Count number of free subbanks */
@@ -3433,11 +3431,11 @@ static void npc_subbank_init(struct rvu *rvu, struct npc_subbank *sb, int idx)
 {
 	mutex_init(&sb->lock);
 
-	sb->b0b = idx * npc_priv.subbank_depth;
-	sb->b0t = sb->b0b + npc_priv.subbank_depth - 1;
+	sb->b0b = idx * npc_priv->subbank_depth;
+	sb->b0t = sb->b0b + npc_priv->subbank_depth - 1;
 
-	sb->b1b = npc_priv.bank_depth + idx * npc_priv.subbank_depth;
-	sb->b1t = sb->b1b + npc_priv.subbank_depth - 1;
+	sb->b1b = npc_priv->bank_depth + idx * npc_priv->subbank_depth;
+	sb->b1t = sb->b1b + npc_priv->subbank_depth - 1;
 
 	sb->flags = NPC_SUBBANK_FLAG_FREE;
 	sb->idx = idx;
@@ -3449,7 +3447,7 @@ static void npc_subbank_init(struct rvu *rvu, struct npc_subbank *sb, int idx)
 	/* Keep first and last subbank at end of free array; so that
 	 * it will be used at last
 	 */
-	xa_store(&npc_priv.xa_sb_free, sb->arr_idx,
+	xa_store(&npc_priv->xa_sb_free, sb->arr_idx,
 		 xa_mk_value(sb->idx), GFP_KERNEL);
 }
 
@@ -3474,7 +3472,7 @@ static int npc_pcifunc_map_create(struct rvu *rvu)
 
 		pcifunc = pf << 9;
 
-		xa_store(&npc_priv.xa_pf_map, (unsigned long)pcifunc,
+		xa_store(&npc_priv->xa_pf_map, (unsigned long)pcifunc,
 			 xa_mk_value(cnt), GFP_KERNEL);
 
 		cnt++;
@@ -3483,7 +3481,7 @@ static int npc_pcifunc_map_create(struct rvu *rvu)
 		for (vf = 0; vf < numvfs; vf++) {
 			pcifunc = (pf << 9) | (vf + 1);
 
-			xa_store(&npc_priv.xa_pf_map, (unsigned long)pcifunc,
+			xa_store(&npc_priv->xa_pf_map, (unsigned long)pcifunc,
 				 xa_mk_value(cnt), GFP_KERNEL);
 			cnt++;
 		}
@@ -3569,7 +3567,7 @@ static int npc_defrag_alloc_free_slots(struct rvu *rvu,
 	int rc, sb_off, i, err;
 	bool deleted;
 
-	sb = &npc_priv.sb[f->idx];
+	sb = &npc_priv->sb[f->idx];
 
 	alloc_cnt1 = 0;
 	alloc_cnt2 = 0;
@@ -3639,9 +3637,9 @@ static int npc_defrag_add_2_show_list(struct rvu *rvu, u16 old_midx,
 	node->vidx = vidx;
 	INIT_LIST_HEAD(&node->list);
 
-	mutex_lock(&npc_priv.lock);
-	list_add_tail(&node->list, &npc_priv.defrag_lh);
-	mutex_unlock(&npc_priv.lock);
+	mutex_lock(&npc_priv->lock);
+	list_add_tail(&node->list, &npc_priv->defrag_lh);
+	mutex_unlock(&npc_priv->lock);
 
 	return 0;
 }
@@ -3745,7 +3743,7 @@ int npc_defrag_move_vdx_to_free(struct rvu *rvu,
 		}
 
 		/* save pcifunc */
-		map = xa_load(&npc_priv.xa_idx2pf_map, old_midx);
+		map = xa_load(&npc_priv->xa_idx2pf_map, old_midx);
 		pcifunc = xa_to_value(map);
 
 		/* delete from pf maps */
@@ -3904,29 +3902,29 @@ static void npc_defrag_list_clear(void)
 {
 	struct npc_defrag_show_node *node, *next;
 
-	mutex_lock(&npc_priv.lock);
-	list_for_each_entry_safe(node, next, &npc_priv.defrag_lh, list) {
+	mutex_lock(&npc_priv->lock);
+	list_for_each_entry_safe(node, next, &npc_priv->defrag_lh, list) {
 		list_del_init(&node->list);
 		kfree(node);
 	}
 
-	mutex_unlock(&npc_priv.lock);
+	mutex_unlock(&npc_priv->lock);
 }
 
 static void npc_lock_all_subbank(void)
 {
 	int i;
 
-	for (i = 0; i < npc_priv.num_subbanks; i++)
-		mutex_lock(&npc_priv.sb[i].lock);
+	for (i = 0; i < npc_priv->num_subbanks; i++)
+		mutex_lock(&npc_priv->sb[i].lock);
 }
 
 static void npc_unlock_all_subbank(void)
 {
 	int i;
 
-	for (i = npc_priv.num_subbanks - 1; i >= 0; i--)
-		mutex_unlock(&npc_priv.sb[i].lock);
+	for (i = npc_priv->num_subbanks - 1; i >= 0; i--)
+		mutex_unlock(&npc_priv->sb[i].lock);
 }
 
 int npc_cn20k_search_order_set(struct rvu *rvu,
@@ -3944,9 +3942,9 @@ int npc_cn20k_search_order_set(struct rvu *rvu,
 		USED = 1,
 	};
 
-	if (cnt != npc_priv.num_subbanks) {
+	if (cnt != npc_priv->num_subbanks) {
 		dev_err(rvu->dev, "Number of entries(%u) != %u\n",
-			cnt, npc_priv.num_subbanks);
+			cnt, npc_priv->num_subbanks);
 		return -EINVAL;
 	}
 
@@ -3954,18 +3952,18 @@ int npc_cn20k_search_order_set(struct rvu *rvu,
 	npc_lock_all_subbank();
 
 	for (sb_idx = 0; sb_idx < cnt; sb_idx++) {
-		sb = &npc_priv.sb[sb_idx];
+		sb = &npc_priv->sb[sb_idx];
 		save[sb->idx] = sb->arr_idx;
 	}
 
 	for (prio = 0; prio < cnt; prio++) {
 		sb_idx = narr[prio];
-		sb = &npc_priv.sb[sb_idx];
+		sb = &npc_priv->sb[sb_idx];
 
 		if (sb->flags & NPC_SUBBANK_FLAG_USED)
-			xa = &npc_priv.xa_sb_used;
+			xa = &npc_priv->xa_sb_used;
 		else
-			xa = &npc_priv.xa_sb_free;
+			xa = &npc_priv->xa_sb_free;
 
 		rc = xa_err(xa_store(xa, prio,
 				     xa_mk_value(sb_idx), GFP_KERNEL));
@@ -3989,10 +3987,10 @@ int npc_cn20k_search_order_set(struct rvu *rvu,
 
 	for (prio = 0; prio < cnt; prio++) {
 		if (rsrc[FREE][prio] == -1)
-			xa_erase(&npc_priv.xa_sb_free, prio);
+			xa_erase(&npc_priv->xa_sb_free, prio);
 
 		if (rsrc[USED][prio] == -1)
-			xa_erase(&npc_priv.xa_sb_used, prio);
+			xa_erase(&npc_priv->xa_sb_used, prio);
 	}
 
 	for (int i = 0; i < cnt; i++)
@@ -4008,20 +4006,20 @@ int npc_cn20k_search_order_set(struct rvu *rvu,
 fail:
 	for (prio = 0; prio < cnt; prio++) {
 		if (rsrc[FREE][prio] == 1)
-			xa_erase(&npc_priv.xa_sb_free, prio);
+			xa_erase(&npc_priv->xa_sb_free, prio);
 
 		if (rsrc[USED][prio] == 1)
-			xa_erase(&npc_priv.xa_sb_used, prio);
+			xa_erase(&npc_priv->xa_sb_used, prio);
 	}
 
 	for (sb_idx = 0; sb_idx < cnt; sb_idx++) {
-		sb = &npc_priv.sb[sb_idx];
+		sb = &npc_priv->sb[sb_idx];
 		sb->arr_idx = save[sb_idx];
 
 		if (sb->flags & NPC_SUBBANK_FLAG_USED)
-			xa = &npc_priv.xa_sb_used;
+			xa = &npc_priv->xa_sb_used;
 		else
-			xa = &npc_priv.xa_sb_free;
+			xa = &npc_priv->xa_sb_free;
 
 		/* Since the entry already exists, xa_store() replaces
 		 * the value without a kmalloc(), making failure highly unlikely.
@@ -4041,7 +4039,7 @@ int npc_cn20k_search_order_set(struct rvu *rvu,
 const u32 *npc_cn20k_search_order_get(bool *restricted_order, u32 *sz)
 {
 	*restricted_order = restrict_valid;
-	*sz = npc_priv.num_subbanks;
+	*sz = npc_priv->num_subbanks;
 	return subbank_srch_order;
 }
 
@@ -4065,7 +4063,7 @@ int npc_cn20k_defrag(struct rvu *rvu)
 	INIT_LIST_HEAD(&x4lh);
 	INIT_LIST_HEAD(&x2lh);
 
-	node = kcalloc(npc_priv.num_subbanks, sizeof(*node), GFP_KERNEL);
+	node = kcalloc(npc_priv->num_subbanks, sizeof(*node), GFP_KERNEL);
 	if (!node)
 		return -ENOMEM;
 
@@ -4074,13 +4072,13 @@ int npc_cn20k_defrag(struct rvu *rvu)
 	npc_lock_all_subbank();
 
 	/* Fill in node with subbank properties */
-	for (i = 0; i < npc_priv.num_subbanks; i++) {
-		sb = &npc_priv.sb[i];
+	for (i = 0; i < npc_priv->num_subbanks; i++) {
+		sb = &npc_priv->sb[i];
 
 		node[i].idx = i;
 		node[i].key_type = sb->key_type;
 		node[i].free_cnt = sb->free_cnt;
-		node[i].vidx = kcalloc(npc_priv.subbank_depth * 2,
+		node[i].vidx = kcalloc(npc_priv->subbank_depth * 2,
 				       sizeof(*node[i].vidx),
 				       GFP_KERNEL);
 		if (!node[i].vidx) {
@@ -4110,8 +4108,8 @@ int npc_cn20k_defrag(struct rvu *rvu)
 	}
 
 	/* Filling vidx[] array with all vidx in that subbank */
-	xa_for_each_start(&npc_priv.xa_vidx2idx_map, index, map,
-			  npc_priv.bank_depth * 2) {
+	xa_for_each_start(&npc_priv->xa_vidx2idx_map, index, map,
+			  npc_priv->bank_depth * 2) {
 		midx = xa_to_value(map);
 		rc =  npc_mcam_idx_2_subbank_idx(rvu, midx,
 						 &sb, &sb_off);
@@ -4128,14 +4126,14 @@ int npc_cn20k_defrag(struct rvu *rvu)
 	}
 
 	/* Mark all subbank which has ref allocation */
-	for (i = 0; i < npc_priv.num_subbanks; i++) {
+	for (i = 0; i < npc_priv->num_subbanks; i++) {
 		tnode = &node[i];
 
 		if (!tnode->valid)
 			continue;
 
 		tot = (tnode->key_type == NPC_MCAM_KEY_X2) ?
-			npc_priv.subbank_depth * 2 : npc_priv.subbank_depth;
+			npc_priv->subbank_depth * 2 : npc_priv->subbank_depth;
 
 		if (node[i].vidx_cnt != tot - tnode->free_cnt)
 			tnode->refs = true;
@@ -4152,7 +4150,7 @@ int npc_cn20k_defrag(struct rvu *rvu)
 free_vidx:
 	npc_unlock_all_subbank();
 	mutex_unlock(&mcam->lock);
-	for (i = 0; i < npc_priv.num_subbanks; i++)
+	for (i = 0; i < npc_priv->num_subbanks; i++)
 		kfree(node[i].vidx);
 	kfree(node);
 	return rc;
@@ -4180,7 +4178,7 @@ int npc_cn20k_dft_rules_idx_get(struct rvu *rvu, u16 pcifunc, u16 *bcast,
 		*ptr[i] = USHRT_MAX;
 	}
 
-	if (!npc_priv.init_done)
+	if (!npc_priv)
 		return 0;
 
 	if (is_lbk_vf(rvu, pcifunc)) {
@@ -4188,7 +4186,7 @@ int npc_cn20k_dft_rules_idx_get(struct rvu *rvu, u16 pcifunc, u16 *bcast,
 			return -EINVAL;
 
 		idx = NPC_DFT_RULE_ID_MK(pcifunc, NPC_DFT_RULE_PROMISC_ID);
-		val = xa_load(&npc_priv.xa_pf2dfl_rmap, idx);
+		val = xa_load(&npc_priv->xa_pf2dfl_rmap, idx);
 		if (!val) {
 			pr_debug("%s: Failed to find %s index for pcifunc=%#x\n",
 				 __func__,
@@ -4207,7 +4205,7 @@ int npc_cn20k_dft_rules_idx_get(struct rvu *rvu, u16 pcifunc, u16 *bcast,
 			return -EINVAL;
 
 		idx = NPC_DFT_RULE_ID_MK(pcifunc, NPC_DFT_RULE_UCAST_ID);
-		val = xa_load(&npc_priv.xa_pf2dfl_rmap, idx);
+		val = xa_load(&npc_priv->xa_pf2dfl_rmap, idx);
 		if (!val) {
 			pr_debug("%s: Failed to find %s index for pcifunc=%#x\n",
 				 __func__,
@@ -4227,7 +4225,7 @@ int npc_cn20k_dft_rules_idx_get(struct rvu *rvu, u16 pcifunc, u16 *bcast,
 			continue;
 
 		idx = NPC_DFT_RULE_ID_MK(pcifunc, i);
-		val = xa_load(&npc_priv.xa_pf2dfl_rmap, idx);
+		val = xa_load(&npc_priv->xa_pf2dfl_rmap, idx);
 		if (!val) {
 			pr_debug("%s: Failed to find %s index for pcifunc=%#x\n",
 				 __func__,
@@ -4251,8 +4249,8 @@ int rvu_mbox_handler_npc_get_pfl_info(struct rvu *rvu, struct msg_req *req,
 		return -EOPNOTSUPP;
 	}
 
-	rsp->kw_type = npc_priv.kw;
-	rsp->x4_slots = npc_priv.bank_depth;
+	rsp->kw_type = npc_priv->kw;
+	rsp->x4_slots = npc_priv->bank_depth;
 	return 0;
 }
 
@@ -4342,7 +4340,7 @@ void npc_cn20k_dft_rules_free(struct rvu *rvu, u16 pcifunc)
 	int blkaddr, rc, i;
 	void *map;
 
-	if (!npc_priv.init_done)
+	if (!npc_priv)
 		return;
 
 	if (!npc_is_cgx_or_lbk(rvu, pcifunc)) {
@@ -4360,7 +4358,7 @@ void npc_cn20k_dft_rules_free(struct rvu *rvu, u16 pcifunc)
 	/* LBK */
 	if (is_lbk_vf(rvu, pcifunc)) {
 		index = NPC_DFT_RULE_ID_MK(pcifunc, NPC_DFT_RULE_PROMISC_ID);
-		map = xa_erase(&npc_priv.xa_pf2dfl_rmap, index);
+		map = xa_erase(&npc_priv->xa_pf2dfl_rmap, index);
 		if (!map)
 			dev_dbg(rvu->dev,
 				"%s: Err from delete %s mcam idx from xarray (pcifunc=%#x\n",
@@ -4374,7 +4372,7 @@ void npc_cn20k_dft_rules_free(struct rvu *rvu, u16 pcifunc)
 	/* VF */
 	if (is_vf(pcifunc)) {
 		index = NPC_DFT_RULE_ID_MK(pcifunc, NPC_DFT_RULE_UCAST_ID);
-		map = xa_erase(&npc_priv.xa_pf2dfl_rmap, index);
+		map = xa_erase(&npc_priv->xa_pf2dfl_rmap, index);
 		if (!map)
 			dev_dbg(rvu->dev,
 				"%s: Err from delete %s mcam idx from xarray (pcifunc=%#x\n",
@@ -4388,7 +4386,7 @@ void npc_cn20k_dft_rules_free(struct rvu *rvu, u16 pcifunc)
 	/* PF */
 	for (i = NPC_DFT_RULE_START_ID; i < NPC_DFT_RULE_MAX_ID; i++)  {
 		index = NPC_DFT_RULE_ID_MK(pcifunc, i);
-		map = xa_erase(&npc_priv.xa_pf2dfl_rmap, index);
+		map = xa_erase(&npc_priv->xa_pf2dfl_rmap, index);
 		if (!map)
 			dev_dbg(rvu->dev,
 				"%s: Err from delete %s mcam idx from xarray (pcifunc=%#x\n",
@@ -4448,7 +4446,7 @@ int npc_cn20k_dft_rules_alloc(struct rvu *rvu, u16 pcifunc)
 	struct msg_rsp free_rsp;
 	u16 b, m, p, u;
 
-	if (!npc_priv.init_done)
+	if (!npc_priv)
 		return 0;
 
 	if (!npc_is_cgx_or_lbk(rvu, pcifunc)) {
@@ -4471,7 +4469,7 @@ int npc_cn20k_dft_rules_alloc(struct rvu *rvu, u16 pcifunc)
 	}
 
 	/* Set ref index as lowest priority index */
-	eidx = 2 * npc_priv.bank_depth - 1;
+	eidx = 2 * npc_priv->bank_depth - 1;
 
 	/* Install only UCAST for VF */
 	cnt = is_vf(pcifunc) ? 1 : ARRAY_SIZE(mcam_idx);
@@ -4500,9 +4498,9 @@ int npc_cn20k_dft_rules_alloc(struct rvu *rvu, u16 pcifunc)
 	pfvf = rvu_get_pfvf(rvu, pcifunc);
 	pfvf->hw_prio = NPC_DFT_RULE_PRIO;
 
-	if (npc_priv.kw == NPC_MCAM_KEY_X4) {
+	if (npc_priv->kw == NPC_MCAM_KEY_X4) {
 		req.kw_type = NPC_MCAM_KEY_X4;
-		req.ref_entry = eidx & (npc_priv.bank_depth - 1);
+		req.ref_entry = eidx & (npc_priv->bank_depth - 1);
 	} else {
 		req.kw_type = NPC_MCAM_KEY_X2;
 		req.ref_entry = eidx;
@@ -4543,9 +4541,9 @@ int npc_cn20k_dft_rules_alloc(struct rvu *rvu, u16 pcifunc)
 	req.hdr.pcifunc = pcifunc;
 	req.ref_prio = NPC_MCAM_LOWER_PRIO;
 
-	if (npc_priv.kw == NPC_MCAM_KEY_X4) {
+	if (npc_priv->kw == NPC_MCAM_KEY_X4) {
 		req.kw_type = NPC_MCAM_KEY_X4;
-		req.ref_entry = eidx & (npc_priv.bank_depth - 1);
+		req.ref_entry = eidx & (npc_priv->bank_depth - 1);
 	} else {
 		req.kw_type = NPC_MCAM_KEY_X2;
 		req.ref_entry = eidx;
@@ -4569,7 +4567,7 @@ int npc_cn20k_dft_rules_alloc(struct rvu *rvu, u16 pcifunc)
 	/* LBK */
 	if (is_lbk_vf(rvu, pcifunc)) {
 		index = NPC_DFT_RULE_ID_MK(pcifunc, NPC_DFT_RULE_PROMISC_ID);
-		ret = xa_insert(&npc_priv.xa_pf2dfl_rmap, index,
+		ret = xa_insert(&npc_priv->xa_pf2dfl_rmap, index,
 				xa_mk_value(mcam_idx[0]), GFP_KERNEL);
 		if (ret) {
 			dev_err(rvu->dev,
@@ -4586,7 +4584,7 @@ int npc_cn20k_dft_rules_alloc(struct rvu *rvu, u16 pcifunc)
 	/* VF */
 	if (is_vf(pcifunc)) {
 		index = NPC_DFT_RULE_ID_MK(pcifunc, NPC_DFT_RULE_UCAST_ID);
-		ret = xa_insert(&npc_priv.xa_pf2dfl_rmap, index,
+		ret = xa_insert(&npc_priv->xa_pf2dfl_rmap, index,
 				xa_mk_value(mcam_idx[0]), GFP_KERNEL);
 		if (ret) {
 			dev_err(rvu->dev,
@@ -4604,7 +4602,7 @@ int npc_cn20k_dft_rules_alloc(struct rvu *rvu, u16 pcifunc)
 	for (i = NPC_DFT_RULE_START_ID, k = 0; i < NPC_DFT_RULE_MAX_ID &&
 	     k < cnt; i++, k++) {
 		index = NPC_DFT_RULE_ID_MK(pcifunc, i);
-		ret = xa_insert(&npc_priv.xa_pf2dfl_rmap, index,
+		ret = xa_insert(&npc_priv->xa_pf2dfl_rmap, index,
 				xa_mk_value(mcam_idx[k]), GFP_KERNEL);
 		if (ret) {
 			dev_err(rvu->dev,
@@ -4613,7 +4611,7 @@ int npc_cn20k_dft_rules_alloc(struct rvu *rvu, u16 pcifunc)
 				pcifunc);
 			for (int p = NPC_DFT_RULE_START_ID; p < i; p++) {
 				index = NPC_DFT_RULE_ID_MK(pcifunc, p);
-				xa_erase(&npc_priv.xa_pf2dfl_rmap, index);
+				xa_erase(&npc_priv->xa_pf2dfl_rmap, index);
 			}
 			goto err;
 		}
@@ -4693,71 +4691,79 @@ static int npc_priv_init(struct rvu *rvu)
 		return -EINVAL;
 	}
 
-	npc_priv.num_subbanks = num_subbanks;
-	npc_priv.bank_depth = bank_depth;
-	npc_priv.subbank_depth = subbank_depth;
+	npc_priv = kcalloc(1, sizeof(*npc_priv), GFP_KERNEL);
+	if (!npc_priv)
+		return -ENOMEM;
+
+	npc_priv->num_banks = num_banks;
+	npc_priv->num_subbanks = num_subbanks;
+	npc_priv->bank_depth = bank_depth;
+	npc_priv->subbank_depth = subbank_depth;
 
 	/* Get kex configured key size */
 	cfg = rvu_read64(rvu, blkaddr, NPC_AF_INTFX_KEX_CFG(0));
-	npc_priv.kw = FIELD_GET(GENMASK_ULL(34, 32), cfg);
+	npc_priv->kw = FIELD_GET(GENMASK_ULL(34, 32), cfg);
 
 	dev_info(rvu->dev,
 		 "banks=%u depth=%u, subbanks=%u depth=%u, key type=%s\n",
 		 num_banks, bank_depth, num_subbanks, subbank_depth,
-		 npc_kw_name[npc_priv.kw]);
+		 npc_kw_name[npc_priv->kw]);
 
-	npc_priv.sb = kcalloc(num_subbanks, sizeof(struct npc_subbank),
-			      GFP_KERNEL);
-	if (!npc_priv.sb)
-		return -ENOMEM;
+	npc_priv->sb = kcalloc(num_subbanks, sizeof(struct npc_subbank),
+			       GFP_KERNEL);
+	if (!npc_priv->sb)
+		goto fail1;
 
-	xa_init_flags(&npc_priv.xa_sb_used, XA_FLAGS_ALLOC);
-	xa_init_flags(&npc_priv.xa_sb_free, XA_FLAGS_ALLOC);
-	xa_init_flags(&npc_priv.xa_idx2pf_map, XA_FLAGS_ALLOC);
-	xa_init_flags(&npc_priv.xa_pf_map, XA_FLAGS_ALLOC);
-	xa_init_flags(&npc_priv.xa_pf2dfl_rmap, XA_FLAGS_ALLOC);
-	xa_init_flags(&npc_priv.xa_idx2vidx_map, XA_FLAGS_ALLOC);
-	xa_init_flags(&npc_priv.xa_vidx2idx_map, XA_FLAGS_ALLOC);
+	xa_init_flags(&npc_priv->xa_sb_used, XA_FLAGS_ALLOC);
+	xa_init_flags(&npc_priv->xa_sb_free, XA_FLAGS_ALLOC);
+	xa_init_flags(&npc_priv->xa_idx2pf_map, XA_FLAGS_ALLOC);
+	xa_init_flags(&npc_priv->xa_pf_map, XA_FLAGS_ALLOC);
+	xa_init_flags(&npc_priv->xa_pf2dfl_rmap, XA_FLAGS_ALLOC);
+	xa_init_flags(&npc_priv->xa_idx2vidx_map, XA_FLAGS_ALLOC);
+	xa_init_flags(&npc_priv->xa_vidx2idx_map, XA_FLAGS_ALLOC);
 
 	if (npc_create_srch_order(num_subbanks))
-		goto fail1;
+		goto fail2;
 
 	npc_populate_restricted_idxs(num_subbanks);
 
 	/* Initialize subbanks */
-	for (i = 0, sb = npc_priv.sb; i < num_subbanks; i++, sb++)
+	for (i = 0, sb = npc_priv->sb; i < num_subbanks; i++, sb++)
 		npc_subbank_init(rvu, sb, i);
 
 	/* Get number of pcifuncs in the system */
-	npc_priv.pf_cnt = npc_pcifunc_map_create(rvu);
-	npc_priv.xa_pf2idx_map = kcalloc(npc_priv.pf_cnt,
-					 sizeof(struct xarray),
-					 GFP_KERNEL);
-	if (!npc_priv.xa_pf2idx_map)
-		goto fail2;
+	npc_priv->pf_cnt = npc_pcifunc_map_create(rvu);
+	npc_priv->xa_pf2idx_map = kcalloc(npc_priv->pf_cnt,
+					  sizeof(struct xarray),
+					  GFP_KERNEL);
+	if (!npc_priv->xa_pf2idx_map)
+		goto fail3;
 
-	for (i = 0; i < npc_priv.pf_cnt; i++)
-		xa_init_flags(&npc_priv.xa_pf2idx_map[i], XA_FLAGS_ALLOC);
+	for (i = 0; i < npc_priv->pf_cnt; i++)
+		xa_init_flags(&npc_priv->xa_pf2idx_map[i], XA_FLAGS_ALLOC);
 
-	INIT_LIST_HEAD(&npc_priv.defrag_lh);
-	mutex_init(&npc_priv.lock);
+	INIT_LIST_HEAD(&npc_priv->defrag_lh);
+	mutex_init(&npc_priv->lock);
 
 	return 0;
 
-fail2:
+fail3:
 	kfree(subbank_srch_order);
 	subbank_srch_order = NULL;
 
+fail2:
+	xa_destroy(&npc_priv->xa_sb_used);
+	xa_destroy(&npc_priv->xa_sb_free);
+	xa_destroy(&npc_priv->xa_idx2pf_map);
+	xa_destroy(&npc_priv->xa_pf_map);
+	xa_destroy(&npc_priv->xa_pf2dfl_rmap);
+	xa_destroy(&npc_priv->xa_idx2vidx_map);
+	xa_destroy(&npc_priv->xa_vidx2idx_map);
+	kfree(npc_priv->sb);
+	npc_priv->sb = NULL;
 fail1:
-	xa_destroy(&npc_priv.xa_sb_used);
-	xa_destroy(&npc_priv.xa_sb_free);
-	xa_destroy(&npc_priv.xa_idx2pf_map);
-	xa_destroy(&npc_priv.xa_pf_map);
-	xa_destroy(&npc_priv.xa_pf2dfl_rmap);
-	xa_destroy(&npc_priv.xa_idx2vidx_map);
-	xa_destroy(&npc_priv.xa_vidx2idx_map);
-	kfree(npc_priv.sb);
-	npc_priv.sb = NULL;
+	kfree(npc_priv);
+	npc_priv = NULL;
 	return -ENOMEM;
 }
 
@@ -4765,25 +4771,31 @@ void npc_cn20k_deinit(struct rvu *rvu)
 {
 	int i;
 
-	xa_destroy(&npc_priv.xa_sb_used);
-	xa_destroy(&npc_priv.xa_sb_free);
-	xa_destroy(&npc_priv.xa_idx2pf_map);
-	xa_destroy(&npc_priv.xa_pf_map);
-	xa_destroy(&npc_priv.xa_pf2dfl_rmap);
-	xa_destroy(&npc_priv.xa_idx2vidx_map);
-	xa_destroy(&npc_priv.xa_vidx2idx_map);
+	if (!npc_priv)
+		return;
 
-	for (i = 0; i < npc_priv.pf_cnt; i++)
-		xa_destroy(&npc_priv.xa_pf2idx_map[i]);
+	xa_destroy(&npc_priv->xa_sb_used);
+	xa_destroy(&npc_priv->xa_sb_free);
+	xa_destroy(&npc_priv->xa_idx2pf_map);
+	xa_destroy(&npc_priv->xa_pf_map);
+	xa_destroy(&npc_priv->xa_pf2dfl_rmap);
+	xa_destroy(&npc_priv->xa_idx2vidx_map);
+	xa_destroy(&npc_priv->xa_vidx2idx_map);
 
-	kfree(npc_priv.xa_pf2idx_map);
+	for (i = 0; i < npc_priv->pf_cnt; i++)
+		xa_destroy(&npc_priv->xa_pf2idx_map[i]);
+
+	kfree(npc_priv->xa_pf2idx_map);
 	/* No need to destroy mutex lock as it is
 	 * part of subbank structure
 	 */
-	kfree(npc_priv.sb);
+	kfree(npc_priv->sb);
 	kfree(subbank_srch_order);
-	bitmap_clear(npc_priv.en_map, 0, MAX_NUM_BANKS * MAX_NUM_SUB_BANKS *
+	bitmap_clear(npc_priv->en_map, 0, MAX_NUM_BANKS * MAX_NUM_SUB_BANKS *
 		     MAX_SUBBANK_DEPTH);
+	npc_defrag_list_clear();
+	kfree(npc_priv);
+	npc_priv = NULL;
 }
 
 static int npc_setup_mcam_section(struct rvu *rvu, int key_type)
@@ -4796,7 +4808,7 @@ static int npc_setup_mcam_section(struct rvu *rvu, int key_type)
 		return -ENODEV;
 	}
 
-	for (sec = 0; sec < npc_priv.num_subbanks; sec++)
+	for (sec = 0; sec < npc_priv->num_subbanks; sec++)
 		rvu_write64(rvu, blkaddr,
 			    NPC_AF_MCAM_SECTIONX_CFG_EXT(sec), key_type);
 
@@ -4818,10 +4830,12 @@ int npc_cn20k_init(struct rvu *rvu)
 	if (err) {
 		dev_err(rvu->dev, "%s: mcam section configuration failure\n",
 			__func__);
-		return err;
+		goto fail;
 	}
 
-	npc_priv.init_done = true;
-
 	return 0;
+
+fail:
+	npc_cn20k_deinit(rvu);
+	return err;
 }
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
index 8bf857317e49..10e5bab50f62 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
@@ -183,7 +183,6 @@ struct npc_defrag_show_node {
  * @xa_idx2pf_map:	Mcam index to PF map.
  * @xa_pf_map:		Pcifunc to index map.
  * @pf_cnt:		Number of PFs.
- * @init_done:		Indicates MCAM initialization is done.
  * @xa_pf2dfl_rmap:	PF to default rule index map.
  * @xa_idx2vidx_map:	Mcam index to virtual index map.
  * @xa_vidx2idx_map:	virtual index to mcam index map.
@@ -195,7 +194,7 @@ struct npc_defrag_show_node {
  */
 struct npc_priv_t {
 	int bank_depth;
-	const int num_banks;
+	int num_banks;
 	int num_subbanks;
 	int subbank_depth;
 	DECLARE_BITMAP(en_map, MAX_NUM_BANKS *
@@ -214,7 +213,6 @@ struct npc_priv_t {
 	struct list_head defrag_lh;
 	struct mutex lock; /* protect defrag nodes */
 	int pf_cnt;
-	bool init_done;
 };
 
 struct npc_kpm_action0 {
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v20 net-next 2/9] octeontx2-af: npc: cn20k: debugfs enhancements
  2026-06-09  4:04 ` [PATCH v20 net-next 2/9] octeontx2-af: npc: cn20k: debugfs enhancements Ratheesh Kannoth
@ 2026-06-10  4:19   ` Ratheesh Kannoth
  2026-06-10  5:08   ` Ratheesh Kannoth
  1 sibling, 0 replies; 20+ messages in thread
From: Ratheesh Kannoth @ 2026-06-10  4:19 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
	pabeni, sgoutham

On 2026-06-09 at 09:34:46, Ratheesh Kannoth (rkannoth@marvell.com) wrote:
> Improve MCAM visibility and field debugging for CN20K NPC.
>
> - Extend "mcam_layout" to show enabled (+) or disabled state per entry
>   so status can be verified without parsing the full "mcam_entry" dump.
> - Add "dstats" debugfs entry: for enabled MCAM indices, print hit deltas
>   since the prior read by comparing hardware counters to a per-entry
>   software baseline and advancing that baseline after each read (hardware
>   counters are not cleared).
> - Add "mismatch" debugfs entry: lists MCAM entries that are enabled
>   but not explicitly allocated, helping diagnose allocation/field issues.
>
> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>

https://sashiko.dev/#/patchset/20260609040453.711932-1-rkannoth%40marvell.com says

>> +static DEFINE_MUTEX(stats_lock);
>> +
>> +/* MAX_NUM_BANKS, MAX_SUBBANK_DEPTH and MAX_NUM_SUB_BANKS represent
>> + * hard limit on all silicon variants, preventing any possibility of
>> + * out-of-bounds access.
>> + */
>> +static u64 dstats[MAX_NUM_BANKS][MAX_SUBBANK_DEPTH * MAX_NUM_SUB_BANKS] = {};
>> +static int npc_mcam_dstats_show(struct seq_file *s, void *unused)
>> +{
>[ ... ]
>> +			stats = rvu_read64(rvu, blkaddr,
>> +					   NPC_AF_CN20K_MCAMEX_BANKX_STAT_EXT(idx, bank));
>> +			if (!stats)
>> +				continue;
>> +			if (stats == dstats[bank][idx])
>> +				continue;
>> +
>> +			if (stats < dstats[bank][idx])
>> +				dstats[bank][idx] = 0;
>Can stale dstats baselines persist across MCAM entry reallocation and
>silently corrupt hit telemetry?
>The dstats array maintains a per-entry software baseline. When an entry is
>freed, its hardware hit counter is reset to 0 in npc_clear_x2_entry(), but
>its corresponding dstats baseline is never cleared.
>If the entry is later reallocated and its hardware counter accumulates more
>hits than the old stale baseline before the next debugfs read, this fallback
>logic evaluates to false. The calculation will then incorrectly subtract
>the stale baseline, massively under-reporting the actual hits.
>Should the baseline be explicitly cleared synchronously when the entry
>is freed?

Thanks for pointing this out. The dstats matrix is intended purely
as a lightweight telemetry tool to identify which MCAM entries are
actively being hit by current traffic via debugfs, rather than
providing strict, transactionally guaranteed billing or accounting
metrics.

You are correct that a stale baseline could cause a transient
under-reporting on the very first read after a reallocation if the
hardware counter quickly surpasses the old baseline. However,
during that first read, the fallback logic will immediately catch
up and synchronize dstats[bank][idx] to the current hardware state
(stats). All subsequent reads will reflect completely accurate hit
increments.

Given that this is non-critical debugfs telemetry, i would like to
keep this as it is. or fix it as part of "net" tree.

>> +
>> +			pf = 0xFFFF;
>> +			map = xa_load(&npc_priv->xa_idx2pf_map, mcam_idx);
>> +			if (map)
>> +				pf = xa_to_value(map);
>> +
>> +			delta = stats - dstats[bank][idx];
>> +
>> +			snprintf(buff, sizeof(buff), "%u\t%#04x\t%llu\n",
>> +				 mcam_idx, pf, delta);
>> +			seq_puts(s, buff);
>> +
>> +			dstats[bank][idx] = stats;
>Does this global baseline destructively corrupt telemetry under concurrent
>reads?
>When any process reads the debugfs file, it updates dstats[bank][idx],
>destructively advancing the baseline for all other readers. If a monitoring
>daemon reads the file periodically and a user manually reads it, the manual
>read consumes the delta and resets the baseline, causing the monitoring
>daemon to miss all hits that occurred in between.
>Should telemetry exposed to userspace either provide raw cumulative counters
>or track baselines per-file-descriptor context?

Same comment as above.

>> +	mutex_unlock(&stats_lock);
>> +	return 0;
>> +}
>> +
>> +/*  "%u\t%#04x\t%llu\n" needs less than 64 characters to print */
>> +#define TOTAL_SZ (MAX_NUM_BANKS * MAX_NUM_SUB_BANKS * MAX_SUBBANK_DEPTH * 64)
>> +DEFINE_OCTEONTX2_DEBUGFS_ATTRIBUTE_WITH_SIZE(npc_mcam_dstats, TOTAL_SZ);
>Does this bypass the chunking design of seq_file and cause unconditionally
>large allocations?
>By calling single_open_size() with a hardcoded TOTAL_SZ of roughly 1MB
>(16,384 entries * 64 bytes), this unconditionally allocates 1MB of kernel
>memory via kvmalloc every time the file is opened.
>Instead of dumping everything in a single show() call, would it be better
>to implement the seq_file iterator callbacks (start, next, show, stop) to
>process the MCAM entries in page-sized chunks?

I dont see any issue with current approach and would like to keep it.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v20 net-next 5/9] octeontx2-af: npc: cn20k: add subbank search order control
  2026-06-09  4:04 ` [PATCH v20 net-next 5/9] octeontx2-af: npc: cn20k: add subbank search order control Ratheesh Kannoth
@ 2026-06-10  4:28   ` Ratheesh Kannoth
  2026-06-10  5:10   ` Ratheesh Kannoth
  1 sibling, 0 replies; 20+ messages in thread
From: Ratheesh Kannoth @ 2026-06-10  4:28 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
	pabeni, sgoutham

On 2026-06-09 at 09:34:49, Ratheesh Kannoth (rkannoth@marvell.com) wrote:
> CN20K NPC MCAM is split into 32 subbanks that are searched in a
> predefined order during allocation. Lower-numbered subbanks have
> higher priority than higher-numbered ones.
>
> Add a runtime "srch_order" to control the order in which
> subbanks are searched during MCAM allocation.
>
> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
> ---

https://sashiko.dev/#/patchset/20260609040453.711932-1-rkannoth%40marvell.com says:

>> @@ -3906,6 +3906,122 @@ static void npc_unlock_all_subbank(void)
>>  		mutex_unlock(&npc_priv.sb[i].lock);
>>  }
>>
>> +int npc_cn20k_search_order_set(struct rvu *rvu,
>> +			       u64 narr[MAX_NUM_SUB_BANKS], int cnt)
>> +{
>[ ... ]
>> +	for (prio = 0; prio < cnt; prio++) {
>> +		if (rsrc[FREE][prio] == -1)
>> +			xa_erase(&npc_priv.xa_sb_free, prio);
>> +
>> +		if (rsrc[USED][prio] == -1)
>> +			xa_erase(&npc_priv.xa_sb_used, prio);
>> +	}
>> +
>> +	for (int i = 0; i < cnt; i++)
>> +		subbank_srch_order[i] = (u32)narr[i];
>Does this change the semantic meaning of the subbank_srch_order array?
>During initialization in npc_create_srch_order(), subbank_srch_order is
>mapped as subbank_idx -> priority. The devlink parameter npc_srch_order
>expects an array ordered by priority (narr[priority] = subbank_idx),
>which is handled correctly by the earlier loops in this function.
>However, assigning subbank_srch_order[i] = narr[i] here silently changes
>the array in-place from mapping (idx -> priority) to mapping (priority -> idx).
Thank you. npc_create_srch_order() is creating it properly but npc_subbank_init() is
not settig this order properly. This is a bug. This patch did not introduce this bug.
Will post a patch to "net" tree.

>> +
>> +	restrict_valid = false;
>> +
>> +	npc_unlock_all_subbank();
>> +	mutex_unlock(&mcam->lock);
>> +
>> +	return 0;
>[ ... ]
>> +const u32 *npc_cn20k_search_order_get(bool *restricted_order, u32 *sz)
>> +{
>> +	*restricted_order = restrict_valid;
>> +	*sz = npc_priv.num_subbanks;
>> +	return subbank_srch_order;
>Could this result in scrambled subbank search orders if a user space tool
>reads the default order and then writes it back?
>Since this directly returns subbank_srch_order, if it is called before any
>set operation occurs, it returns the original idx -> priority mapping.
>If called after a set operation, it returns a priority -> subbank_idx
>mapping.
>If the original default order is read by devlink and then written back
>unmodified, the set path will misinterpret the original idx -> priority
>array as a priority -> subbank_idx array, scrambling the search order.

Same comment as above.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v20 net-next 6/9] octeontx2: cn20k: Coordinate default rules with NIX LF lifecycle
  2026-06-09  4:04 ` [PATCH v20 net-next 6/9] octeontx2: cn20k: Coordinate default rules with NIX LF lifecycle Ratheesh Kannoth
@ 2026-06-10  4:30   ` Ratheesh Kannoth
  2026-06-10  4:31   ` Ratheesh Kannoth
  1 sibling, 0 replies; 20+ messages in thread
From: Ratheesh Kannoth @ 2026-06-10  4:30 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
	pabeni, sgoutham

On 2026-06-09 at 09:34:50, Ratheesh Kannoth (rkannoth@marvell.com) wrote:
> Add NIX_LF_DONT_FREE_DFT_IDXS so the PF can send NIX LF free during hw
> reinit or teardown without the AF freeing CN20K default NPC rule indexes
> while the driver still owns that state (otx2_init_hw_resources and
> otx2_free_hw_resources).
>
> On CN20K, allocate default NPC rules from NIX LF alloc before
> nix_interface_init, roll back with npc_cn20k_dft_rules_free on failure,
> and free from NIX LF free when the new flag is not set. Tighten
> rvu_mbox_handler_nix_lf_alloc error handling: use a single rc, propagate
> qmem_alloc and other errors, and set -ENOMEM only when kcalloc fails
> (remove the blanket -ENOMEM at the free_mem path).
>
> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>

https://sashiko.dev/#/patchset/20260609040453.711932-1-rkannoth%40marvell.com says

>> -	if (err)
>> +	rc = qmem_alloc(rvu->dev, &pfvf->rq_ctx, req->rq_cnt, hwctx_size);
>> +	if (rc)
>>  		goto free_mem;
>>
>>  	pfvf->rq_bmap = kcalloc(req->rq_cnt, sizeof(long), GFP_KERNEL);
>> -	if (!pfvf->rq_bmap)
>> +	if (!pfvf->rq_bmap) {
>> +		rc = -ENOMEM;
>>  		goto free_mem;
>> +	}
>This is a pre-existing issue, but does this code allow a memory leak if a VF
>sends consecutive allocation requests?
>Looking at rvu_mbox_handler_nix_lf_alloc(), the AF driver handles the NIX LF
>allocation by allocating DMA memory via qmem_alloc() and heap memory via
>kcalloc(). It appears to directly overwrite the context pointers (such as
>pfvf->rq_ctx and pfvf->rq_bmap) without checking if they are already
>allocated.
>If a misbehaving VF sends this allocation message multiple times sequentially
>without an intervening NIX_LF_FREE message, could the previously allocated
>memory be silently overwritten and leaked?
>It looks like a similar pattern might also exist for NPA LF allocations in
>rvu_mbox_handler_npa_lf_alloc() where pfvf->aura_ctx and pfvf->pool_ctx are
>allocated in a similar manner.

This is a pre-existing issue.

A hardening series to "net-next" post merge of this series would address
this issue.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v20 net-next 6/9] octeontx2: cn20k: Coordinate default rules with NIX LF lifecycle
  2026-06-09  4:04 ` [PATCH v20 net-next 6/9] octeontx2: cn20k: Coordinate default rules with NIX LF lifecycle Ratheesh Kannoth
  2026-06-10  4:30   ` Ratheesh Kannoth
@ 2026-06-10  4:31   ` Ratheesh Kannoth
  1 sibling, 0 replies; 20+ messages in thread
From: Ratheesh Kannoth @ 2026-06-10  4:31 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
	pabeni, sgoutham

On 2026-06-09 at 09:34:50, Ratheesh Kannoth (rkannoth@marvell.com) wrote:
> Add NIX_LF_DONT_FREE_DFT_IDXS so the PF can send NIX LF free during hw
> reinit or teardown without the AF freeing CN20K default NPC rule indexes
> while the driver still owns that state (otx2_init_hw_resources and
> otx2_free_hw_resources).
>
> On CN20K, allocate default NPC rules from NIX LF alloc before
> nix_interface_init, roll back with npc_cn20k_dft_rules_free on failure,
> and free from NIX LF free when the new flag is not set. Tighten
> rvu_mbox_handler_nix_lf_alloc error handling: use a single rc, propagate
> qmem_alloc and other errors, and set -ENOMEM only when kcalloc fails
> (remove the blanket -ENOMEM at the free_mem path).
>
> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>

https://sashiko.dev/#/patchset/20260609040453.711932-1-rkannoth%40marvell.com says:
>
>> -	if (err)
>> +	rc = qmem_alloc(rvu->dev, &pfvf->rq_ctx, req->rq_cnt, hwctx_size);
>> +	if (rc)
>>  		goto free_mem;
>>
>>  	pfvf->rq_bmap = kcalloc(req->rq_cnt, sizeof(long), GFP_KERNEL);
>> -	if (!pfvf->rq_bmap)
>> +	if (!pfvf->rq_bmap) {
>> +		rc = -ENOMEM;
>>  		goto free_mem;
>> +	}
>This is a pre-existing issue, but does this code allow a memory leak if a VF
>sends consecutive allocation requests?
>Looking at rvu_mbox_handler_nix_lf_alloc(), the AF driver handles the NIX LF
>allocation by allocating DMA memory via qmem_alloc() and heap memory via
>kcalloc(). It appears to directly overwrite the context pointers (such as
>pfvf->rq_ctx and pfvf->rq_bmap) without checking if they are already
>allocated.
>If a misbehaving VF sends this allocation message multiple times sequentially
>without an intervening NIX_LF_FREE message, could the previously allocated
>memory be silently overwritten and leaked?
>It looks like a similar pattern might also exist for NPA LF allocations in
>rvu_mbox_handler_npa_lf_alloc() where pfvf->aura_ctx and pfvf->pool_ctx are
>allocated in a similar manner.

This is a pre-existing issue, will post a patch to "net" tree post merge of this series.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v20 net-next 9/9] octeontx2-af: npc: cn20k: Allocate npc_priv and dstats dynamically.
  2026-06-09  4:04 ` [PATCH v20 net-next 9/9] octeontx2-af: npc: cn20k: Allocate npc_priv and dstats dynamically Ratheesh Kannoth
@ 2026-06-10  4:40   ` Ratheesh Kannoth
  2026-06-10  5:33   ` Ratheesh Kannoth
  1 sibling, 0 replies; 20+ messages in thread
From: Ratheesh Kannoth @ 2026-06-10  4:40 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
	pabeni, sgoutham

On 2026-06-09 at 09:34:53, Ratheesh Kannoth (rkannoth@marvell.com) wrote:
> Replace the file-scope static npc_priv with a kcalloc'd struct filled
> from hardware bank/subbank geometry at init (num_banks is no longer a
> const compile-time constant; drop init_done and use a non-NULL
> npc_priv pointer for liveness). Thread npc_priv_get() / pointer access
> through the CN20K NPC code paths, extend teardown to kfree the root
> struct on failure and in npc_cn20k_deinit, and adjust MCAM section
> setup to use the discovered subbank count.
>
> Allocate MCAM debugfs dstats via devm_kzalloc instead of a static matrix,
> and use the allocated backing store consistently when computing deltas
> (including the counter rollover compare).
>
> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>

https://sashiko.dev/#/patchset/20260609040453.711932-1-rkannoth%40marvell.com says:

> @@ -397,6 +398,10 @@ int npc_cn20k_debugfs_init(struct rvu *rvu)
>>  	debugfs_create_file("vidx2idx", 0444, rvu->rvu_dbg.npc,
>>  			    npc_priv, &npc_vidx2idx_map_fops);
>>
>> +	dstats = devm_kzalloc(rvu->dev, sizeof(*dstats), GFP_KERNEL);
>> +	if (!dstats)
>> +		return -ENOMEM;
>Does allocating this memory dynamically per-device but assigning it to a
>static global pointer break multi-device support?

Exist only one AF device per Soc

>If there are multiple RVU devices, the second device will overwrite the
>global dstats pointer, which leaks the devm-managed memory of the first
>device.
No.

>Also, because it uses devm_kzalloc(), if the specific device unbinds, the
>global pointer is freed and left dangling, potentially causing a
>Use-After-Free for the remaining active devices.

Next rvu probe() would allocate this memory again during probe(). Use will happen only after allocation.
As part of hardening series (post merge of this series), will plan to move this
pointer to "rvu" structure.

>>  	/* mcam_idx should be less than (2 * bank depth) */
>> -	if (mcam_idx >= npc_priv.bank_depth * 2) {
>> +	if (mcam_idx >= npc_priv->bank_depth * 2) {
>Can npc_priv be NULL here?
>If a mailbox message is processed before npc_cn20k_init() fully allocates
>npc_priv, or if initialization failed and left it NULL, this will cause a
>kernel panic. Should there be a check to ensure npc_priv is allocated
>before dereferencing?

mbox message is from PF/VF. PF/VF driver need AF driver to intialize properly. if npc_priv is NULL,
AF driver probe will fail as npc_priv is allocated during AF driver probe.

>> @@ -1825,37 +1823,37 @@ int npc_mcam_idx_2_subbank_idx(struct rvu *rvu, u16 mcam_idx,
>>  	int bank_off, sb_id;
>>
>>  	/* mcam_idx should be less than (2 * bank depth) */
>> -	if (mcam_idx >= npc_priv.bank_depth * 2) {
>> +	if (mcam_idx >= npc_priv->bank_depth * 2) {
>Can this crash if npc_priv is NULL here?
>Similar to the above, unconditionally dereferencing npc_priv here might
>lead to a NULL pointer dereference.
Same comment as above

>> @@ -4251,8 +4249,8 @@ int rvu_mbox_handler_npc_get_pfl_info(struct rvu *rvu, struct msg_req *req,
>>  		return -EOPNOTSUPP;
>>  	}
>>
>> -	rsp->kw_type = npc_priv.kw;
>> -	rsp->x4_slots = npc_priv.bank_depth;
>> +	rsp->kw_type = npc_priv->kw;
>> +	rsp->x4_slots = npc_priv->bank_depth;
>>  	return 0;
>Is it possible for rvu_mbox_handler_npc_get_pfl_info() to be called before
>npc_priv is successfully allocated?
No

>If this mailbox handler fires while npc_priv is NULL, it will panic. Before
>this patch, accessing the uninitialized static struct safely returned zeros.
Mailbox request is from PF. PF driver will fail if AF is not intialized properly.
npc_priv is allocated during AF driver probe().

>> -	npc_priv.num_subbanks = num_subbanks;
>> -	npc_priv.bank_depth = bank_depth;
>> -	npc_priv.subbank_depth = subbank_depth;
>> +	npc_priv = kcalloc(1, sizeof(*npc_priv), GFP_KERNEL);
>Could this overwrite the global npc_priv pointer if multiple RVU devices are
>present?
First patch in the series enforce only one AF device per system.

>
>> +	npc_defrag_list_clear();
>> +	kfree(npc_priv);
>> +	npc_priv = NULL;
>>  }
>Will setting the static global npc_priv to NULL during teardown crash other
>active devices?
>If multiple devices are present and one is unbound, it will free the shared
>global npc_priv pointer and set it to NULL. The other active devices will
>then crash the next time they attempt to dereference it.

Same comment as above.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v20 net-next 2/9] octeontx2-af: npc: cn20k: debugfs enhancements
  2026-06-09  4:04 ` [PATCH v20 net-next 2/9] octeontx2-af: npc: cn20k: debugfs enhancements Ratheesh Kannoth
  2026-06-10  4:19   ` Ratheesh Kannoth
@ 2026-06-10  5:08   ` Ratheesh Kannoth
  1 sibling, 0 replies; 20+ messages in thread
From: Ratheesh Kannoth @ 2026-06-10  5:08 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
	pabeni, sgoutham

On 2026-06-09 at 09:34:46, Ratheesh Kannoth (rkannoth@marvell.com) wrote:
> Improve MCAM visibility and field debugging for CN20K NPC.
>
> - Extend "mcam_layout" to show enabled (+) or disabled state per entry
>   so status can be verified without parsing the full "mcam_entry" dump.
> - Add "dstats" debugfs entry: for enabled MCAM indices, print hit deltas
>   since the prior read by comparing hardware counters to a per-entry
>   software baseline and advancing that baseline after each read (hardware
>   counters are not cleared).
> - Add "mismatch" debugfs entry: lists MCAM entries that are enabled
>   but not explicitly allocated, helping diagnose allocation/field issues.
>
> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>

https://netdev-ai.bots.linux.dev/sashiko/#/patchset/20260609040453.711932-1-rkannoth%40marvell.com says:

> +				continue;
>> +			if (stats == dstats[bank][idx])
>> +				continue;
>> +
>> +			if (stats < dstats[bank][idx])
>> +				dstats[bank][idx] = 0;
>The dstats[][] array is module-global and the show handler both reads
>and writes it on every read.  A few observations about that contract,
>which the commit message describes as "hit deltas since the prior
>read":
>The baseline survives across npc_cn20k_enable_mcam_entry(..., false)
>followed by a re-enable with a different tenant.  If the new tenant's
>hardware counter happens to equal the cached baseline, this branch
>silently skips the entry:
>	if (stats == dstats[bank][idx])
>		continue;
>On counter rollback the code zeroes the baseline first and then
>computes the delta against zero:
>	if (stats < dstats[bank][idx])
>		dstats[bank][idx] = 0;
>	...
>	delta = stats - dstats[bank][idx];
>so the very next read after a rollback reports the full absolute
>counter value rather than a delta.  Is that the intended behaviour?
>Also, since dstats[][] is a single shared baseline, two concurrent
>readers (for example an operator session and a monitoring script)
>serialise on stats_lock but each read advances the baseline for the
>other, so each one sees only a fraction of the deltas.  Would
>per-open state (allocated in the open() callback rather than file
>scope) be more in line with what the commit message promises?

Thanks for pointing this out. The dstats matrix is intended purely
as a lightweight telemetry tool to identify which MCAM entries are
actively being hit by current traffic via debugfs, rather than
providing strict, transactionally guaranteed billing or accounting
metrics.
You are correct that a stale baseline could cause a transient
under-reporting on the very first read after a reallocation if the
hardware counter quickly surpasses the old baseline. However,
during that first read, the fallback logic will immediately catch
up and synchronize dstats[bank][idx] to the current hardware state
(stats). All subsequent reads will reflect completely accurate hit
increments.
Given that this is non-critical debugfs telemetry, i would like to
keep this as it is. or fix it as part of "net" tree.

>> +
>> +			pf = 0xFFFF;
>> +			map = xa_load(&npc_priv->xa_idx2pf_map, mcam_idx);
>> +			if (map)
>> +				pf = xa_to_value(map);
>> +
>> +			delta = stats - dstats[bank][idx];
>> +
>> +			snprintf(buff, sizeof(buff), "%u\t%#04x\t%llu\n",
>> +				 mcam_idx, pf, delta);
>> +			seq_puts(s, buff);
>> +
>> +			dstats[bank][idx] = stats;
>> +		}
>> +	}
>> +
>> +	mutex_unlock(&stats_lock);
>> +	return 0;
>> +}
>npc_mcam_dstats_show() calls npc_mcam_idx_2_key_type() and then uses
>key_type to decide whether to skip bank 1 for X4 entries, but it does
>not take sb->lock while doing so.  The existing npc_mcam_layout_show()
>in the same file holds sb->lock around exactly these field reads:
>	mutex_lock(&sb->lock);
>	...
>	if (sb->key_type == NPC_MCAM_KEY_X4) {
>	...
>The mutators __npc_subbank_mark_used() and __npc_subbank_free() update
>sb->key_type under sb->lock via their npc_subbank_alloc /
>npc_subbank_free wrappers.
>If a subbank is concurrently being retyped, can this reader observe a
>transient sb->key_type value (for example 0 mid-free, or X4 for a
>subbank being retyped to X2) and as a result drop or duplicate
>entries due to the "key_type == NPC_MCAM_KEY_X4 && bank != 0" filter?

The dstats matrix is intended purely
as a lightweight telemetry tool to identify which MCAM entries are
actively being hit by current traffic via debugfs, rather than
providing strict, transactionally guaranteed billing or accounting
metrics. So these concurrency issue wont defeat the purpose of this
mcam entry.

>> +static int npc_mcam_mismatch_show(struct seq_file *s, void *unused)
>> +{
>> +	struct npc_priv_t *npc_priv;
>> +	struct npc_subbank *sb;
>> +	int mcam_idx, sb_off;
>> +	struct rvu *rvu;
>> +	char buff[64];
>> +	void *map;
>> +	int rc;
>> +
>> +	npc_priv = npc_priv_get();
>> +	rvu = s->private;
>> +
>> +	seq_puts(s, "index\tsb idx\tkw type\n");
>> +	for (int bank = npc_priv->num_banks - 1; bank >= 0; bank--) {
>> +		for (int idx = npc_priv->bank_depth - 1; idx >= 0; idx--) {
>> +			mcam_idx = bank * npc_priv->bank_depth + idx;
>> +
>> +			if (!test_bit(mcam_idx, npc_priv->en_map))
>> +				continue;
>> +
>> +			map = xa_load(&npc_priv->xa_idx2pf_map, mcam_idx);
>> +			if (map)
>> +				continue;
>> +
>> +			rc = npc_mcam_idx_2_subbank_idx(rvu, mcam_idx,
>> +							&sb, &sb_off);
>> +			if (rc)
>> +				continue;
>> +
>> +			snprintf(buff, sizeof(buff), "%u\t%d\t%u\n",
>> +				 mcam_idx, sb->idx, sb->key_type);
>> +
>> +			seq_puts(s, buff);
>> +		}
>> +	}
>> +	return 0;
>> +}
>The same observation applies to npc_mcam_mismatch_show(): sb->idx and
>sb->key_type are read here without holding sb->lock, while the rest
>of npc.c serialises mutations of these fields under that lock.  Was
>sb->lock omitted intentionally for these debug paths?

This was omitted intentionally. npc_mcam_mismatch_show() is a debugfs dump
intended to provide a point-in-time snapshot. Acquiring sb->lock per subbank would
only protect that specific iteration and wouldn't guarantee full consistency
across the entire loop snapshot anyway. Therefore, we avoided the locks here.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v20 net-next 5/9] octeontx2-af: npc: cn20k: add subbank search order control
  2026-06-09  4:04 ` [PATCH v20 net-next 5/9] octeontx2-af: npc: cn20k: add subbank search order control Ratheesh Kannoth
  2026-06-10  4:28   ` Ratheesh Kannoth
@ 2026-06-10  5:10   ` Ratheesh Kannoth
  1 sibling, 0 replies; 20+ messages in thread
From: Ratheesh Kannoth @ 2026-06-10  5:10 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
	pabeni, sgoutham

On 2026-06-09 at 09:34:49, Ratheesh Kannoth (rkannoth@marvell.com) wrote:
> CN20K NPC MCAM is split into 32 subbanks that are searched in a
> predefined order during allocation. Lower-numbered subbanks have
> higher priority than higher-numbered ones.
>
> Add a runtime "srch_order" to control the order in which
> subbanks are searched during MCAM allocation.
>
> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>

https://netdev-ai.bots.linux.dev/sashiko/#/patchset/20260609040453.711932-1-rkannoth%40marvell.com says:
>> @@ -3906,6 +3906,122 @@ static void npc_unlock_all_subbank(void)
>>  		mutex_unlock(&npc_priv.sb[i].lock);
>>  }
>>
>> +int npc_cn20k_search_order_set(struct rvu *rvu,
>> +			       u64 narr[MAX_NUM_SUB_BANKS], int cnt)
>> +{
>[ ... ]
>> +	for (prio = 0; prio < cnt; prio++) {
>> +		sb_idx = narr[prio];
>> +		sb = &npc_priv.sb[sb_idx];
>[ ... ]
>> +		sb->arr_idx = prio;
>> +	}
>[ ... ]
>> +	for (int i = 0; i < cnt; i++)
>> +		subbank_srch_order[i] = (u32)narr[i];
>Do the get and set paths agree with init on what subbank_srch_order[]
>means?  The init path appears to treat the array as indexed by sb_idx
>with priority as the value:
>    /* npc_create_srch_order() */
>    for (int i = 0; i < cnt; i += 2) {
>        subbank_srch_order[i]     = cnt / 2 - val - 1;
>        subbank_srch_order[i + 1] = cnt / 2 + 1 + val;
>        val++;
>    }
>    subbank_srch_order[cnt - 1] = cnt / 2;
>    /* npc_subbank_init() */
>    sb->idx = idx;
>    sb->arr_idx = subbank_srch_order[idx];
>So initially subbank_srch_order[sb_idx] == priority (arr_idx).
>The new set helper, however, treats narr as indexed by priority with
>sb_idx as the value:
>    for (prio = 0; prio < cnt; prio++) {
>            sb_idx = narr[prio];
>            sb = &npc_priv.sb[sb_idx];
>            ...
>            sb->arr_idx = prio;
>    }
>    ...
>    for (int i = 0; i < cnt; i++)
>            subbank_srch_order[i] = (u32)narr[i];
>and the get helper returns subbank_srch_order verbatim.
>With cnt = 4 the round-trip then looks like:
>  after init:        subbank_srch_order = [1, 3, 0, 2]
>                     arr_idx[sb0..sb3]  = [1, 3, 0, 2]
>                     priority -> sb     = [2, 0, 3, 1]
>  user issues GET  -> [1, 3, 0, 2]  (validate accepts: distinct, < cnt)
>  user issues SET with the same array:
>                     arr_idx[sb0..sb3]  = [2, 0, 3, 1]
>                     priority -> sb     = [1, 3, 0, 2]
>That is, an unmodified GET -> SET echo silently inverts the runtime
>priority order of the MCAM subbanks.
>Should init be reworked so subbank_srch_order is indexed by priority
>(with npc_subbank_init() doing something like
>sb[subbank_srch_order[p]].arr_idx = p), or should set/get use the same
>indexing convention as init so a no-op round-trip is actually a no-op?
>> +	restrict_valid = false;
>> +
>> +	npc_unlock_all_subbank();
>> +	mutex_unlock(&mcam->lock);
>> +
>> +	return 0;
>[ ... ]
>> +}
>> +
>> +const u32 *npc_cn20k_search_order_get(bool *restricted_order, u32 *sz)
>> +{
>> +	*restricted_order = restrict_valid;
>> +	*sz = npc_priv.num_subbanks;
>> +	return subbank_srch_order;

There is due to bug in npc_subbank_init() (not introduced by this patch).
Will post a patch to "net" tree.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v20 net-next 7/9] octeontx2-af: npc: Support for custom KPU profile from filesystem
  2026-06-09  4:04 ` [PATCH v20 net-next 7/9] octeontx2-af: npc: Support for custom KPU profile from filesystem Ratheesh Kannoth
@ 2026-06-10  5:24   ` Ratheesh Kannoth
  0 siblings, 0 replies; 20+ messages in thread
From: Ratheesh Kannoth @ 2026-06-10  5:24 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
	pabeni, sgoutham

On 2026-06-09 at 09:34:51, Ratheesh Kannoth (rkannoth@marvell.com) wrote:
> Flashing updated firmware on deployed devices is cumbersome. Provide a
> mechanism to load a custom KPU (Key Parse Unit) profile directly from
> the filesystem at module load time.
>
> When the rvu_af module is loaded with the kpu_profile parameter, the
> specified profile is read from /lib/firmware/kpu and programmed into
> the KPU registers. Add npc_kpu_profile_cam2 for the extended cam format
> used by filesystem-loaded profiles and support ptype/ptype_mask in
> npc_config_kpucam when profile->from_fs is set.
>
> Usage:
>   1. Copy the KPU profile file to /lib/firmware/kpu.
>   2. Build OCTEONTX2_AF as a module.
>   3. Load: insmod rvu_af.ko kpu_profile=<profile_name>
>
> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>

https://netdev-ai.bots.linux.dev/sashiko/#/patchset/20260609040453.711932-1-rkannoth%40marvell.com says:

>  	cam0.dp2_data = ~kpucam->dp2 & kpucam->dp2_mask;
>>
>> +	if (profile->from_fs) {
>> +		u8 ptype = kpucam2->ptype;
>> +		u8 pmask = kpucam2->ptype_mask;
>> +
>> +		*val |= FIELD_PREP(GENMASK_ULL(57, 56), ptype & pmask);
>> +		*mask |= FIELD_PREP(GENMASK_ULL(57, 56), ~ptype & pmask);
>> +	}
>Bits [57:56] of this register fall inside the field that struct
>npc_kpu_cam currently declares as reserved:
>    struct npc_kpu_cam {
>    #if defined(__BIG_ENDIAN_BITFIELD)
>        u64 rsvd_63_56     : 8;
>        ...
>    #else
>        ...
>        u64 rsvd_63_56     : 8;
>    #endif
>    }
>The new code repurposes two of those reserved bits, gated only by the
>software flag profile->from_fs.  Is there a hardware capability
>indicator (chip revision, NPC_AF_CONST bit, etc.) that should also be
>consulted before driving these bits?  On silicon that does not
>implement the extended cam format, what behaviour do bits [57:56]
>exhibit?
>Should struct npc_kpu_cam be updated so the reserved field reflects
>the new ptype usage rather than leaving the kernel's hardware
>description and the runtime behaviour disagreeing?
Will update kernel's hw description as a patch to "net-next" once this
series is merged.

>> +		profile->kpu[kpu].cam_entries2 = entries;
>> +		profile->kpu[kpu].action_entries2 = entries;
>> +		ret = npc_alloc_kpu_cam2_n_action2(rvu, kpu, entries);
>> +		if (ret) {
>> +			dev_warn(rvu->dev,
>> +				 "profile entry allocation failed for kpu=%d for %d entries\n",
>> +				 kpu, entries);
>> +			return -EINVAL;
>> +		}
>Are cam_entries2 and action_entries2 being assigned in the wrong
>order with the allocation?  When npc_alloc_kpu_cam2_n_action2() fails
>(for example after cam2 succeeded but action2 returned -ENOMEM), the
>function returns -EINVAL but profile->kpu[kpu].cam_entries2 and
>action_entries2 stay at the entries value while cam2 / action2 may
>be partly NULL.
>profile->kpu aliases the file-scope npc_kpu_profiles[] array (set in
>npc_prepare_default_kpu()), so the inconsistent state persists in
>the static array even after the caller falls back to default.  The
>fallback path in npc_load_kpu_profile() calls npc_prepare_default_kpu()
>again, but that helper does not clear cam_entries2, action_entries2,
>cam2 or action2.
>Could the counts be set only after successful allocation, or zeroed
>on the error return?
cam_entries2 and action_entries2 are used only in case of loading from fs. So
it wont cause any issues.

> +static int npc_load_kpu_profile_from_fs(struct rvu *rvu)
>> +{
>> +	struct npc_kpu_profile_adapter *profile = &rvu->kpu;
>> +	const char *kpu_profile = rvu->kpu_pfl_name;
>> +	const struct firmware *fw = NULL;
>> +	int ret, fw_kpus = 0;
>> +	char path[512] = "kpu/";
>> +
>> +	if (strlen(kpu_profile) > sizeof(path) - strlen("kpu/") - 1) {
>> +		dev_err(rvu->dev, "kpu profile name is too big\n");
>> +		return -ENOSPC;
>> +	}
>> +
>> +	strcat(path, kpu_profile);
>> +
>> +	if (request_firmware_direct(&fw, path, rvu->dev))
>> +		return -ENOENT;
>> +
>> +	dev_info(rvu->dev, "Loading KPU profile from filesystem: %s\n",
>> +		 path);
>> +
>> +	rvu->kpu_fwdata = fw->data;
>> +	rvu->kpu_fwdata_sz = fw->size;
>> +
>> +	ret = npc_apply_custom_kpu(rvu, profile, true, &fw_kpus);
>> +	release_firmware(fw);
>> +	rvu->kpu_fwdata = NULL;
>> +
>> +	if (ret) {
>> +		rvu->kpu_fwdata_sz = 0;
>> +		dev_err(rvu->dev,
>> +			"Loading KPU profile from filesystem failed\n");
>> +		return ret;
>> +	}
>[Medium, Low]
>Is the change of the firmware search path a backwards-compatibility
>regression for existing users?
This was modified as part of review comment from Simon.

>On a separate note, on the success path this function leaves
>rvu->kpu_fwdata == NULL but rvu->kpu_fwdata_sz unchanged at fw->size
>(only the error path resets the size to 0).  rvu_npc_freemem() does
>kfree(rvu->kpu_fwdata) for the non-mmapped case, which assumes that
>field is either NULL or kmalloc'd memory; here rvu->kpu_fwdata is
>briefly aliased to fw->data (not kmalloc'd) before being NULLed.
>Should kpu_fwdata_sz be reset to 0 on success too, and should the
>firmware bytes be copied into kmalloc storage as the old code did to
>keep the pointer/size pair consistent?

This was intentional change. rvu->kpu_fwdata_sz is used later so we kept it
as such.

>> +				     npc_get_ikpu_nth_entry(rvu, idx),
>> +				     0, idx, true);
>>
>>  	/* Program KPU CAM and Action profiles */
>>  	num_kpus = rvu->kpu.kpus;
>> @@ -2068,6 +2348,11 @@ static void npc_parser_profile_init(struct rvu *rvu, int blkaddr)
>>
>>  	for (idx = 0; idx < num_kpus; idx++)
>>  		npc_program_kpu_profile(rvu, blkaddr, idx, &rvu->kpu.kpu[idx]);
>> +
>> +	if (profile->from_fs) {
>> +		rvu_write64(rvu, blkaddr, NPC_AF_PKINDX_TYPE(54), 0x03);
>> +		rvu_write64(rvu, blkaddr, NPC_AF_PKINDX_TYPE(58), 0x03);
>> +	}
>>  }
>Just above, the same function clamps PKIND iteration with an explicit
>comment about hardware capability:
>    /* First program IKPU profile i.e PKIND configs.
>     * Check HW max count to avoid configuring junk or
>     * writing to unsupported CSR addresses.
>     */
>    num_pkinds = rvu->kpu.pkinds;
>    num_pkinds = min_t(int, hw->npc_pkinds, num_pkinds);
>Do these new writes to NPC_AF_PKINDX_TYPE(54) and
>NPC_AF_PKINDX_TYPE(58) bypass that safeguard?  hw->npc_pkinds is read
>from NPC_AF_CONST1 at runtime and stored as u8, so 54/58 are not
>statically guaranteed to be valid on all silicon variants where this
>non-cn20k path runs.
>Could these be guarded with a check against hw->npc_pkinds, or
>expressed via a symbolic constant tied to a chip-revision check?
>Also, the values 54, 58, and 0x03 appear here without commentary or
>mention in the commit log.  Could a brief comment or named constant
>explain what these PKIND indices represent and what 0x03 selects, so
>the kernel-firmware contract is visible to readers?
This is very low level hw bits (similar to npc_profile.h)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v20 net-next 8/9] octeontx2: cn20k: Respect NPC MCAM X2/X4 profile in flows and DFT alloc
  2026-06-09  4:04 ` [PATCH v20 net-next 8/9] octeontx2: cn20k: Respect NPC MCAM X2/X4 profile in flows and DFT alloc Ratheesh Kannoth
@ 2026-06-10  5:31   ` Ratheesh Kannoth
  0 siblings, 0 replies; 20+ messages in thread
From: Ratheesh Kannoth @ 2026-06-10  5:31 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
	pabeni, sgoutham

On 2026-06-09 at 09:34:52, Ratheesh Kannoth (rkannoth@marvell.com) wrote:
> Default CN20K NPC rule allocation now keys off the active MCAM keyword
> width: use X4 with a bank-masked reference index when the silicon uses
> X4 keys, and X2 with the raw index otherwise (replacing the previous
> always-X2 / eidx + 1 behaviour).
>
> In the AF flow-install path, flows that need more than 256 key bits
> query the NPC profile; if the platform is fixed to X2 entries, fail
> with -EOPNOTSUPP instead of requesting X4. Otherwise select X4 for the
> MCAM alloc.
>
> On the PF, cache and pass the profile kw_type from npc_get_pfl_info
> through otx2_mcam_pfl_info_get(), and use it when allocating MCAM
> entries for RSS/defaults and when installing ethtool flows on CN20K,
> including masking the reference index for X4 slot layout.
>
> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
> ---

https://netdev-ai.bots.linux.dev/sashiko/#/patchset/20260609040453.711932-1-rkannoth%40marvell.com says:

>>  	pfvf = rvu_get_pfvf(rvu, pcifunc);
>>  	pfvf->hw_prio = NPC_DFT_RULE_PRIO;
>>
>> +	if (npc_priv.kw == NPC_MCAM_KEY_X4) {
>> +		req.kw_type = NPC_MCAM_KEY_X4;
>> +		req.ref_entry = eidx & (npc_priv.bank_depth - 1);
>> +	} else {
>> +		req.kw_type = NPC_MCAM_KEY_X2;
>> +		req.ref_entry = eidx;
>> +	}
>> +
>How is the X1/DYN profile expected to be handled here?  In
>include/.../mbox.h the enum sets:
DYN profile can request X2/X4 entry.

>
>>  	*kw_type = NPC_MCAM_KEY_X2;
>> -	if (kw_bits > 256)
>> +	if (kw_bits > 256) {
>> +		rvu_mbox_handler_npc_get_pfl_info(rvu, &req, &rsp);
>> +		if (rsp.kw_type == NPC_MCAM_KEY_X2) {
>> +			dev_err(rvu->dev,
>> +				"Only X2 entries are supported in X2 profile\n");
>> +			return -EOPNOTSUPP;
>> +		}
>> +
>>  		*kw_type = NPC_MCAM_KEY_X4;
>> +	}
>On a DYN-configured platform rsp.kw_type is NPC_MCAM_KEY_DYN (== 0), so
>the equality check rsp.kw_type == NPC_MCAM_KEY_X2 is false and the
>function proceeds to set *kw_type = NPC_MCAM_KEY_X4.
>The commit log says:
>    if the platform is fixed to X2 entries, fail with -EOPNOTSUPP
>    instead of requesting X4
>Is the intent for this rejection to also fire on DYN, or to only fire
>on X2?  As written, the dft-rule path in npc_cn20k_dft_rules_alloc()
>buckets DYN with X2, while this flow-install path treats DYN as X4-
>capable, so the same npc_priv.kw == DYN value drives different kw_type
>choices in the two callers.
>Could the patch handle DYN explicitly (or document why it cannot occur
>on the paths reached here) so the two sites agree?
This is intentional. if profile is X2, we can't alloc an X4 entry.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v20 net-next 9/9] octeontx2-af: npc: cn20k: Allocate npc_priv and dstats dynamically.
  2026-06-09  4:04 ` [PATCH v20 net-next 9/9] octeontx2-af: npc: cn20k: Allocate npc_priv and dstats dynamically Ratheesh Kannoth
  2026-06-10  4:40   ` Ratheesh Kannoth
@ 2026-06-10  5:33   ` Ratheesh Kannoth
  1 sibling, 0 replies; 20+ messages in thread
From: Ratheesh Kannoth @ 2026-06-10  5:33 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
	pabeni, sgoutham

On 2026-06-09 at 09:34:53, Ratheesh Kannoth (rkannoth@marvell.com) wrote:
> Replace the file-scope static npc_priv with a kcalloc'd struct filled
> from hardware bank/subbank geometry at init (num_banks is no longer a
> const compile-time constant; drop init_done and use a non-NULL
> npc_priv pointer for liveness). Thread npc_priv_get() / pointer access
> through the CN20K NPC code paths, extend teardown to kfree the root
> struct on failure and in npc_cn20k_deinit, and adjust MCAM section
> setup to use the discovered subbank count.
>
> Allocate MCAM debugfs dstats via devm_kzalloc instead of a static matrix,
> and use the allocated backing store consistently when computing deltas
> (including the counter rollover compare).
>
> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>

https://netdev-ai.bots.linux.dev/sashiko/#/patchset/20260609040453.711932-1-rkannoth%40marvell.com says:

> -	kfree(npc_priv.xa_pf2idx_map);
>> +	kfree(npc_priv->xa_pf2idx_map);
>>  	/* No need to destroy mutex lock as it is
>>  	 * part of subbank structure
>>  	 */
>> -	kfree(npc_priv.sb);
>> +	kfree(npc_priv->sb);
>>  	kfree(subbank_srch_order);
>> -	bitmap_clear(npc_priv.en_map, 0, MAX_NUM_BANKS * MAX_NUM_SUB_BANKS *
>> +	bitmap_clear(npc_priv->en_map, 0, MAX_NUM_BANKS * MAX_NUM_SUB_BANKS *
>>  		     MAX_SUBBANK_DEPTH);
>> +	npc_defrag_list_clear();
>> +	kfree(npc_priv);
>> +	npc_priv = NULL;
>>  }
>Now that npc_priv is dynamically allocated, en_map is an embedded
>DECLARE_BITMAP inside the struct that is unconditionally kfree'd two
>lines later:
>	bitmap_clear(npc_priv->en_map, 0, MAX_NUM_BANKS * MAX_NUM_SUB_BANKS *
>		     MAX_SUBBANK_DEPTH);
>	npc_defrag_list_clear();
>	kfree(npc_priv);
>	npc_priv = NULL;
>Does this bitmap_clear still serve a purpose? Pre-patch it cleared
>state on a static struct that persisted across unbind/rebind, but with
>the dynamic allocation it can be removed.
Not needed. But kept as part of graceful shutdown.

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2026-06-10  5:33 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-09  4:04 [PATCH v20 net-next 0/9] octeontx2-af: npc: Enhancements Ratheesh Kannoth
2026-06-09  4:04 ` [PATCH v20 net-next 1/9] octeontx2-af: enforce single RVU AF probe Ratheesh Kannoth
2026-06-09  4:04 ` [PATCH v20 net-next 2/9] octeontx2-af: npc: cn20k: debugfs enhancements Ratheesh Kannoth
2026-06-10  4:19   ` Ratheesh Kannoth
2026-06-10  5:08   ` Ratheesh Kannoth
2026-06-09  4:04 ` [PATCH v20 net-next 3/9] devlink: heap-allocate param fill buffers in devlink_nl_param_fill Ratheesh Kannoth
2026-06-09  4:04 ` [PATCH v20 net-next 4/9] devlink: Implement devlink param multi attribute nested data values Ratheesh Kannoth
2026-06-09  4:04 ` [PATCH v20 net-next 5/9] octeontx2-af: npc: cn20k: add subbank search order control Ratheesh Kannoth
2026-06-10  4:28   ` Ratheesh Kannoth
2026-06-10  5:10   ` Ratheesh Kannoth
2026-06-09  4:04 ` [PATCH v20 net-next 6/9] octeontx2: cn20k: Coordinate default rules with NIX LF lifecycle Ratheesh Kannoth
2026-06-10  4:30   ` Ratheesh Kannoth
2026-06-10  4:31   ` Ratheesh Kannoth
2026-06-09  4:04 ` [PATCH v20 net-next 7/9] octeontx2-af: npc: Support for custom KPU profile from filesystem Ratheesh Kannoth
2026-06-10  5:24   ` Ratheesh Kannoth
2026-06-09  4:04 ` [PATCH v20 net-next 8/9] octeontx2: cn20k: Respect NPC MCAM X2/X4 profile in flows and DFT alloc Ratheesh Kannoth
2026-06-10  5:31   ` Ratheesh Kannoth
2026-06-09  4:04 ` [PATCH v20 net-next 9/9] octeontx2-af: npc: cn20k: Allocate npc_priv and dstats dynamically Ratheesh Kannoth
2026-06-10  4:40   ` Ratheesh Kannoth
2026-06-10  5:33   ` Ratheesh Kannoth

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox