* [PATCH v18 net-next 0/8] octeontx2-af: npc: Enhancements.
From: Ratheesh Kannoth @ 2026-06-02 6:03 UTC
To: linux-kernel, netdev
Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
pabeni, sgoutham, Ratheesh Kannoth
This series extends Marvell octeontx2-af support for CN20K NPC (MCAM
debuggability, allocation policy, default-rule lifetime, optional KPU
profiles from firmware files, X2/X4 MCAM keyword handling in flows and
defaults, and dynamic CN20K NPC private state), adds a devlink mechanism
for multi-value parameters, and moves devlink_nl_param_fill() temporaries
to the heap so stack usage stays reasonable once union devlink_param_value
grows (patch 3).
Patch 1 improves CN20K MCAM visibility in debugfs: mcam_layout marks
enabled entries, dstats reports per-entry hit deltas (baseline updated in
software after each read; hardware counters are not cleared), and mismatch
lists enabled entries without a PF mapping.
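The dstats bookkeeping described for patch 1 can be sketched in plain
userspace C. This is only an illustration, not the driver code: struct
hit_baseline and hit_delta() are hypothetical names, and the hardware
counter is modelled as a value passed in by the caller.

```c
#include <stdint.h>

/* Hypothetical per-entry state: the counter value seen at the last read. */
struct hit_baseline {
	uint64_t last;
};

/*
 * Return the number of hits since the previous read and advance the
 * software baseline. The hardware counter itself is never written.
 * If the counter appears to have gone backwards, restart the baseline
 * from zero, as the check in npc_mcam_dstats_show() does.
 */
static uint64_t hit_delta(struct hit_baseline *b, uint64_t hw_counter)
{
	uint64_t delta;

	if (hw_counter < b->last)
		b->last = 0;

	delta = hw_counter - b->last;
	b->last = hw_counter;
	return delta;
}
```

Keeping the baseline in software leaves the hardware counters untouched
for anything else that reads them.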
Patch 2 allocates the per-configuration-mode union devlink_param_value
buffers and struct devlink_param_gset_ctx used by devlink_nl_param_fill()
with kcalloc()/kzalloc_obj() and funnels failures through a single cleanup
path so the netlink reply path stays safe as the union grows.
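As a rough userspace sketch of the single-cleanup-path idea in patch 2,
assuming calloc()/free() in place of kcalloc()/kzalloc_obj()/kfree():
NUM_MODES, union value, struct ctx and fill_reply() are hypothetical
stand-ins, and unlike the patch this variant also routes allocation
failures through the common label.

```c
#include <errno.h>
#include <stdlib.h>

#define NUM_MODES 3	/* hypothetical stand-in for DEVLINK_PARAM_CMODE_MAX + 1 */

union value {
	unsigned long long u64;
	char str[32];
};

struct ctx {
	union value val;
	int mode;
};

/* fail_at_mode < 0 means "no failure"; otherwise fail at that mode. */
static int fill_reply(int fail_at_mode)
{
	union value *defaults, *current;
	struct ctx *ctx;
	int err = -ENOMEM;
	int i;

	defaults = calloc(NUM_MODES, sizeof(*defaults));
	current = calloc(NUM_MODES, sizeof(*current));
	ctx = calloc(1, sizeof(*ctx));
	if (!defaults || !current || !ctx)
		goto out;

	for (i = 0; i < NUM_MODES; i++) {
		if (i == fail_at_mode) {
			err = -EOPNOTSUPP;
			goto out;
		}
		ctx->mode = i;
		current[i] = ctx->val;
	}
	err = 0;
out:
	/* free(NULL) is a no-op, so one label covers every exit. */
	free(ctx);
	free(current);
	free(defaults);
	return err;
}
```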
Patch 3 (Saeed) introduces DEVLINK_PARAM_TYPE_U64_ARRAY and nested
DEVLINK_ATTR_PARAM_VALUE_DATA attributes so drivers and user space can
exchange bounded u64 arrays; YAML, uapi, and netlink validation are
updated.
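A minimal sketch of the bounded array container from patch 3, outside
the kernel: struct u64_array mirrors struct devlink_param_u64_array and
MAX_ARRAY_SIZE mirrors __DEVLINK_PARAM_MAX_ARRAY_SIZE, while
u64_array_fill() is a hypothetical helper that only illustrates the
bounds check (the real code walks netlink attributes instead).

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ARRAY_SIZE 32	/* mirrors __DEVLINK_PARAM_MAX_ARRAY_SIZE */

struct u64_array {
	uint64_t size;
	uint64_t val[MAX_ARRAY_SIZE];
};

/* Copy n values into dst, rejecting input that exceeds the fixed bound. */
static int u64_array_fill(struct u64_array *dst, const uint64_t *src, size_t n)
{
	size_t i;

	if (n > MAX_ARRAY_SIZE)
		return -EMSGSIZE;

	for (i = 0; i < n; i++)
		dst->val[i] = src[i];

	dst->size = n;
	return 0;
}
```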
Patch 4 adds a runtime devlink parameter srch_order to reorder CN20K
subbank search during MCAM allocation (the param uses the u64 array type
from patch 3).
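To illustrate what a search order means for allocation, here is a
simplified sketch. pick_subbank() and the has_free[] array are
hypothetical; the driver keeps subbanks in xarrays keyed by priority
rather than scanning a flat array like this.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Visit subbanks in the configured order and return the first one that
 * still has free entries, or -1 if none does. order[prio] holds the
 * subbank index searched at priority prio (0 is searched first).
 */
static int pick_subbank(const uint32_t *order, const bool *has_free, int cnt)
{
	int prio;

	for (prio = 0; prio < cnt; prio++) {
		uint32_t sb = order[prio];

		if (has_free[sb])
			return (int)sb;
	}

	return -1;
}
```

With the same free/used state, changing only order[] changes which
subbank an allocation lands in.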
Patch 5 ties default MCAM entries to NIX LF alloc/free on CN20K, adds
NIX_LF_DONT_FREE_DFT_IDXS for PF teardown paths that must not drop default
NPC indexes while the driver still owns state, and tightens nix_lf_alloc
error propagation.
Patch 6 allows loading a custom KPU profile from /lib/firmware/kpu via
module parameter kpu_profile, with cam2 / ptype_mask wiring and helpers
that share firmware-sourced vs filesystem-sourced profile layouts.
Patch 7 makes default-rule allocation, AF flow install, and PF-side RSS,
defaults, and ethtool flows respect the active CN20K MCAM keyword width
(X2 vs X4), including X4 reference-index masking and -EOPNOTSUPP when a
flow needs X4 keys on an X2-only profile.
Patch 8 replaces file-scope npc_priv and static dstats with allocation
sized from discovered bank/subbank geometry, threads npc_priv_get()
through CN20K NPC paths, and allocates dstats via devm_kzalloc for the
debugfs helper.
The heap-backed devlink_nl_param_fill() change sits immediately before the
u64 array param work so that every intermediate build stays stack-safe as
the union grows. Among the CN20K patches, srch_order comes before NIX LF
coordination, optional KPU profile loading from firmware files, X2/X4
handling, and the npc_priv refactor, which touches the same files heavily.
Ratheesh Kannoth (7):
octeontx2-af: npc: cn20k: debugfs enhancements
devlink: heap-allocate param fill buffers in devlink_nl_param_fill
octeontx2-af: npc: cn20k: add subbank search order control
octeontx2: cn20k: Coordinate default rules with NIX LF lifecycle
octeontx2-af: npc: Support for custom KPU profile from filesystem
octeontx2: cn20k: Respect NPC MCAM X2/X4 profile in flows and DFT
alloc
octeontx2-af: npc: cn20k: Allocate npc_priv and dstats dynamically.
Saeed Mahameed (1):
devlink: Implement devlink param multi attribute nested data values
Documentation/netlink/specs/devlink.yaml | 4 +
.../marvell/octeontx2/af/cn20k/debugfs.c | 163 ++++-
.../ethernet/marvell/octeontx2/af/cn20k/npc.c | 633 ++++++++++++------
.../ethernet/marvell/octeontx2/af/cn20k/npc.h | 17 +-
.../net/ethernet/marvell/octeontx2/af/mbox.h | 1 +
.../net/ethernet/marvell/octeontx2/af/npc.h | 17 +
.../net/ethernet/marvell/octeontx2/af/rvu.h | 12 +-
.../marvell/octeontx2/af/rvu_devlink.c | 92 ++-
.../ethernet/marvell/octeontx2/af/rvu_nix.c | 69 +-
.../ethernet/marvell/octeontx2/af/rvu_npc.c | 486 +++++++++++---
.../ethernet/marvell/octeontx2/af/rvu_npc.h | 17 +
.../marvell/octeontx2/af/rvu_npc_fs.c | 12 +-
.../ethernet/marvell/octeontx2/af/rvu_reg.h | 1 +
.../marvell/octeontx2/nic/otx2_flows.c | 48 +-
.../ethernet/marvell/octeontx2/nic/otx2_pf.c | 6 +-
include/net/devlink.h | 8 +
include/uapi/linux/devlink.h | 1 +
net/devlink/netlink_gen.c | 2 +
net/devlink/param.c | 95 ++-
19 files changed, 1280 insertions(+), 404 deletions(-)
--
v17 -> v18: Addressed sashiko comments.
https://lore.kernel.org/netdev/20260601025844.865865-1-rkannoth@marvell.com/
v16 -> v17: Addressed Jakub comments.
https://lore.kernel.org/netdev/20260521095303.2395584-1-rkannoth@marvell.com/
v15 -> v16: Addressed Sashiko comments
https://lore.kernel.org/netdev/20260520020939.1457231-1-rkannoth@marvell.com/
v14 -> v15: Addressed Paolo comments
https://lore.kernel.org/netdev/20260514062537.3813802-1-rkannoth@marvell.com/
v13 -> v14: Addressed sashiko comments.
I had to revert the change made for Jiri's v11 comment, as sashiko
complained that it leaked kernel memory to userspace.
https://lore.kernel.org/netdev/20260511033923.1301976-1-rkannoth@marvell.com/
v12 -> v13: Addressed David Laight comments
https://lore.kernel.org/netdev/20260508034912.4082520-1-rkannoth@marvell.com/
v11 -> v12: Addressed Paolo, Jiri comments.
https://lore.kernel.org/netdev/20260409025055.1664053-1-rkannoth@marvell.com/
Added one patch that Simon rejected for net
(it was more of an enhancement than a bug fix).
Added one more patch, which allocates two variables from the heap.
v10 -> v11: Addressed Paolo comments.
https://lore.kernel.org/netdev/20260403025533.6250-1-rkannoth@marvell.com/
v9 -> v10: Addressed Paolo comments
https://lore.kernel.org/netdev/20260330053105.2722453-1-rkannoth@marvell.com/
v8 -> v9: Addressed Simon comments
https://lore.kernel.org/netdev/20260325072159.1126964-1-rkannoth@marvell.com/
v7 -> v8: Addressed Simon comments
https://lore.kernel.org/netdev/20260323035110.3908741-1-rkannoth@marvell.com/T/#t
v6 -> v7: Addressed Simon comments
https://lore.kernel.org/netdev/20260320165432.98832-1-horms@kernel.org/
v5 -> v6: Addressed Jakub, Jiri comments
https://lore.kernel.org/netdev/20260317045623.250187-1-rkannoth@marvell.com/
v4 -> v5: Addressed Jakub comments
https://lore.kernel.org/netdev/20260312022754.2029595-6-rkannoth@marvell.com/
v3 -> v4: Addressed Simon comments
https://lore.kernel.org/netdev/abDeXLpMMxp7G1v3@rkannoth-OptiPlex-7090/#t
v2 -> v3: Addressed Simon comments.
https://lore.kernel.org/netdev/20260304043032.3661647-1-rkannoth@marvell.com/
v1 -> v2: Addressed Jakub comments.
https://lore.kernel.org/netdev/20260302085803.2449828-1-rkannoth@marvell.com/#t
2.43.0
* [PATCH v18 net-next 1/8] octeontx2-af: npc: cn20k: debugfs enhancements
From: Ratheesh Kannoth @ 2026-06-02 6:03 UTC
To: linux-kernel, netdev
Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
pabeni, sgoutham, Ratheesh Kannoth
Improve MCAM visibility and field debugging for CN20K NPC.
- Extend "mcam_layout" to show enabled (+) or disabled state per entry
so status can be verified without parsing the full "mcam_entry" dump.
- Add "dstats" debugfs entry: for enabled MCAM indices, print hit deltas
since the prior read by comparing hardware counters to a per-entry
software baseline and advancing that baseline after each read (hardware
counters are not cleared).
- Add "mismatch" debugfs entry: lists MCAM entries that are enabled
but not explicitly allocated, helping diagnose allocation/field issues.
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
---
.../marvell/octeontx2/af/cn20k/debugfs.c | 158 +++++++++++++++++-
.../ethernet/marvell/octeontx2/af/cn20k/npc.c | 37 +++-
.../ethernet/marvell/octeontx2/af/cn20k/npc.h | 11 ++
3 files changed, 191 insertions(+), 15 deletions(-)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c
index 6f13296303cb..730ef97a57e6 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c
@@ -13,6 +13,7 @@
#include "struct.h"
#include "rvu.h"
#include "debugfs.h"
+#include "cn20k/reg.h"
#include "cn20k/npc.h"
static int npc_mcam_layout_show(struct seq_file *s, void *unused)
@@ -58,7 +59,8 @@ static int npc_mcam_layout_show(struct seq_file *s, void *unused)
"v:%u", vidx0);
}
- seq_printf(s, "\t%u(%#x) %s\n", idx0, pf1,
+ seq_printf(s, "\t%u(%#x)%c %s\n", idx0, pf1,
+ test_bit(idx0, npc_priv->en_map) ? '+' : ' ',
map ? buf0 : " ");
}
goto next;
@@ -101,9 +103,13 @@ static int npc_mcam_layout_show(struct seq_file *s, void *unused)
vidx1);
}
- seq_printf(s, "%05u(%#x) %s\t\t%05u(%#x) %s\n",
- idx1, pf2, v1 ? buf1 : " ",
- idx0, pf1, v0 ? buf0 : " ");
+ seq_printf(s, "%05u(%#x)%c %s\t\t%05u(%#x)%c %s\n",
+ idx1, pf2,
+ test_bit(idx1, npc_priv->en_map) ? '+' : ' ',
+ v1 ? buf1 : " ",
+ idx0, pf1,
+ test_bit(idx0, npc_priv->en_map) ? '+' : ' ',
+ v0 ? buf0 : " ");
continue;
}
@@ -120,8 +126,9 @@ static int npc_mcam_layout_show(struct seq_file *s, void *unused)
vidx0);
}
- seq_printf(s, "\t\t \t\t%05u(%#x) %s\n", idx0,
- pf1, map ? buf0 : " ");
+ seq_printf(s, "\t\t \t\t%05u(%#x)%c %s\n", idx0, pf1,
+ test_bit(idx0, npc_priv->en_map) ? '+' : ' ',
+ map ? buf0 : " ");
continue;
}
@@ -134,7 +141,8 @@ static int npc_mcam_layout_show(struct seq_file *s, void *unused)
snprintf(buf1, sizeof(buf1), "v:%05u", vidx1);
}
- seq_printf(s, "%05u(%#x) %s\n", idx1, pf1,
+ seq_printf(s, "%05u(%#x)%c %s\n", idx1, pf1,
+ test_bit(idx1, npc_priv->en_map) ? '+' : ' ',
map ? buf1 : " ");
}
next:
@@ -145,6 +153,136 @@ static int npc_mcam_layout_show(struct seq_file *s, void *unused)
DEFINE_SHOW_ATTRIBUTE(npc_mcam_layout);
+#define __OCTEONTX2_DEBUGFS_ATTRIBUTE_FOPS(__name) \
+static const struct file_operations __name ## _fops = { \
+ .owner = THIS_MODULE, \
+ .open = __name ## _open, \
+ .read = seq_read, \
+ .llseek = seq_lseek, \
+ .release = single_release, \
+}
+
+#define DEFINE_OCTEONTX2_DEBUGFS_ATTRIBUTE_WITH_SIZE(__name, __size) \
+static int __name ## _open(struct inode *inode, struct file *file) \
+{ \
+ return single_open_size(file, __name ## _show, inode->i_private, \
+ __size); \
+} \
+__OCTEONTX2_DEBUGFS_ATTRIBUTE_FOPS(__name)
+
+static DEFINE_MUTEX(stats_lock);
+
+/* MAX_NUM_BANKS, MAX_SUBBANK_DEPTH and MAX_NUM_SUB_BANKS represent
+ * hard limit on all silicon variants, preventing any possibility of
+ * out-of-bounds access.
+ */
+static u64 dstats[MAX_NUM_BANKS][MAX_SUBBANK_DEPTH * MAX_NUM_SUB_BANKS] = {};
+static int npc_mcam_dstats_show(struct seq_file *s, void *unused)
+{
+ struct npc_priv_t *npc_priv;
+ int blkaddr, pf, mcam_idx;
+ u64 stats, delta;
+ struct rvu *rvu;
+ char buff[64];
+ u8 key_type;
+ void *map;
+
+ npc_priv = npc_priv_get();
+ rvu = s->private;
+ blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPC, 0);
+ if (blkaddr < 0)
+ return 0;
+
+ mutex_lock(&stats_lock);
+ seq_puts(s, "idx\tpfunc\tstats\n");
+ for (int bank = npc_priv->num_banks - 1; bank >= 0; bank--) {
+ for (int idx = npc_priv->bank_depth - 1; idx >= 0; idx--) {
+ mcam_idx = bank * npc_priv->bank_depth + idx;
+
+ if (npc_mcam_idx_2_key_type(rvu, mcam_idx, &key_type))
+ continue;
+
+ if (key_type == NPC_MCAM_KEY_X4 && bank != 0)
+ continue;
+
+ if (!test_bit(mcam_idx, npc_priv->en_map))
+ continue;
+
+ stats = rvu_read64(rvu, blkaddr,
+ NPC_AF_CN20K_MCAMEX_BANKX_STAT_EXT(idx, bank));
+ if (!stats)
+ continue;
+ if (stats == dstats[bank][idx])
+ continue;
+
+ if (stats < dstats[bank][idx])
+ dstats[bank][idx] = 0;
+
+ pf = 0xFFFF;
+ map = xa_load(&npc_priv->xa_idx2pf_map, mcam_idx);
+ if (map)
+ pf = xa_to_value(map);
+
+ delta = stats - dstats[bank][idx];
+
+ snprintf(buff, sizeof(buff), "%u\t%#04x\t%llu\n",
+ mcam_idx, pf, delta);
+ seq_puts(s, buff);
+
+ dstats[bank][idx] = stats;
+ }
+ }
+
+ mutex_unlock(&stats_lock);
+ return 0;
+}
+
+/* "%u\t%#04x\t%llu\n" needs less than 64 characters to print */
+#define TOTAL_SZ (MAX_NUM_BANKS * MAX_NUM_SUB_BANKS * MAX_SUBBANK_DEPTH * 64)
+DEFINE_OCTEONTX2_DEBUGFS_ATTRIBUTE_WITH_SIZE(npc_mcam_dstats, TOTAL_SZ);
+
+static int npc_mcam_mismatch_show(struct seq_file *s, void *unused)
+{
+ struct npc_priv_t *npc_priv;
+ struct npc_subbank *sb;
+ int mcam_idx, sb_off;
+ struct rvu *rvu;
+ char buff[64];
+ void *map;
+ int rc;
+
+ npc_priv = npc_priv_get();
+ rvu = s->private;
+
+ seq_puts(s, "index\tsb idx\tkw type\n");
+ for (int bank = npc_priv->num_banks - 1; bank >= 0; bank--) {
+ for (int idx = npc_priv->bank_depth - 1; idx >= 0; idx--) {
+ mcam_idx = bank * npc_priv->bank_depth + idx;
+
+ if (!test_bit(mcam_idx, npc_priv->en_map))
+ continue;
+
+ map = xa_load(&npc_priv->xa_idx2pf_map, mcam_idx);
+ if (map)
+ continue;
+
+ rc = npc_mcam_idx_2_subbank_idx(rvu, mcam_idx,
+ &sb, &sb_off);
+ if (rc)
+ continue;
+
+ snprintf(buff, sizeof(buff), "%u\t%d\t%u\n",
+ mcam_idx, sb->idx, sb->key_type);
+
+ seq_puts(s, buff);
+ }
+ }
+ return 0;
+}
+
+/* "%u\t%d\t%u\n" needs less than 64 characters to print. */
+DEFINE_OCTEONTX2_DEBUGFS_ATTRIBUTE_WITH_SIZE(npc_mcam_mismatch, TOTAL_SZ);
+
static int npc_mcam_default_show(struct seq_file *s, void *unused)
{
struct npc_priv_t *npc_priv;
@@ -259,6 +397,12 @@ int npc_cn20k_debugfs_init(struct rvu *rvu)
debugfs_create_file("vidx2idx", 0444, rvu->rvu_dbg.npc,
npc_priv, &npc_vidx2idx_map_fops);
+ debugfs_create_file("dstats", 0444, rvu->rvu_dbg.npc, rvu,
+ &npc_mcam_dstats_fops);
+
+ debugfs_create_file("mismatch", 0444, rvu->rvu_dbg.npc, rvu,
+ &npc_mcam_mismatch_fops);
+
debugfs_create_file("idx2vidx", 0444, rvu->rvu_dbg.npc,
npc_priv, &npc_idx2vidx_map_fops);
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
index 6b3f453fd500..9fa9a589cf9c 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
@@ -824,7 +824,7 @@ npc_cn20k_enable_mcam_entry(struct rvu *rvu, int blkaddr,
rvu_write64(rvu, blkaddr,
NPC_AF_CN20K_MCAMEX_BANKX_CFG_EXT(mcam_idx, bank),
cfg);
- return 0;
+ goto update_en_map;
}
/* For NPC_CN20K_MCAM_KEY_X4 keys, both the banks
@@ -842,6 +842,12 @@ npc_cn20k_enable_mcam_entry(struct rvu *rvu, int blkaddr,
cfg);
}
+update_en_map:
+ if (enable)
+ set_bit(index, npc_priv.en_map);
+ else
+ clear_bit(index, npc_priv.en_map);
+
return 0;
}
@@ -1789,9 +1795,9 @@ static int npc_subbank_idx_2_mcam_idx(struct rvu *rvu, struct npc_subbank *sb,
return 0;
}
-static int npc_mcam_idx_2_subbank_idx(struct rvu *rvu, u16 mcam_idx,
- struct npc_subbank **sb,
- int *sb_off)
+int npc_mcam_idx_2_subbank_idx(struct rvu *rvu, u16 mcam_idx,
+ struct npc_subbank **sb,
+ int *sb_off)
{
int bank_off, sb_id;
@@ -4498,11 +4504,19 @@ static int npc_priv_init(struct rvu *rvu)
npc_const2 = rvu_read64(rvu, blkaddr, NPC_AF_CONST2);
num_banks = mcam->banks;
+ if (num_banks > MAX_NUM_BANKS) {
+ dev_err(rvu->dev,
+ "Number of banks(%u) is invalid\n", num_banks);
+ return -EINVAL;
+ }
+
bank_depth = mcam->banksize;
num_subbanks = FIELD_GET(GENMASK_ULL(39, 32), npc_const2);
- if (!num_subbanks) {
- dev_err(rvu->dev, "Number of subbanks is zero\n");
+ if (!num_subbanks || num_subbanks > MAX_NUM_SUB_BANKS) {
+ dev_err(rvu->dev,
+ "Number of subbanks is invalid %u\n",
+ num_subbanks);
return -EFAULT;
}
@@ -4513,10 +4527,15 @@ static int npc_priv_init(struct rvu *rvu)
return -EINVAL;
}
- npc_priv.num_subbanks = num_subbanks;
-
subbank_depth = bank_depth / num_subbanks;
+ if (subbank_depth > MAX_SUBBANK_DEPTH) {
+ dev_err(rvu->dev,
+ "Invalid subbank depth %u\n",
+ subbank_depth);
+ return -EINVAL;
+ }
+ npc_priv.num_subbanks = num_subbanks;
npc_priv.bank_depth = bank_depth;
npc_priv.subbank_depth = subbank_depth;
@@ -4605,6 +4624,8 @@ void npc_cn20k_deinit(struct rvu *rvu)
*/
kfree(npc_priv.sb);
kfree(subbank_srch_order);
+ bitmap_clear(npc_priv.en_map, 0, MAX_NUM_BANKS * MAX_NUM_SUB_BANKS *
+ MAX_SUBBANK_DEPTH);
}
static int npc_setup_mcam_section(struct rvu *rvu, int key_type)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
index 3d5eb952cc07..3e851950be64 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
@@ -10,6 +10,10 @@
#define MKEX_CN20K_SIGN 0x19bbfdbd160
+/* MAX_NUM_BANKS, MAX_SUBBANK_DEPTH and MAX_NUM_SUB_BANKS represent
+ * hard limit on all silicon variants, preventing any possibility of
+ * out-of-bounds access on matrix defined using these values.
+ */
#define MAX_NUM_BANKS 2
#define MAX_NUM_SUB_BANKS 32
#define MAX_SUBBANK_DEPTH 256
@@ -170,6 +174,7 @@ struct npc_defrag_show_node {
* @num_banks: Number of banks.
* @num_subbanks: Number of subbanks.
* @subbank_depth: Depth of subbank.
+ * @en_map: Enable/disable status.
* @kw: Kex configured key type.
* @sb: Subbank array.
* @xa_sb_used: Array of used subbanks.
@@ -193,6 +198,9 @@ struct npc_priv_t {
const int num_banks;
int num_subbanks;
int subbank_depth;
+ DECLARE_BITMAP(en_map, MAX_NUM_BANKS *
+ MAX_NUM_SUB_BANKS *
+ MAX_SUBBANK_DEPTH);
u8 kw;
struct npc_subbank *sb;
struct xarray xa_sb_used;
@@ -336,5 +344,8 @@ u16 npc_cn20k_vidx2idx(u16 index);
u16 npc_cn20k_idx2vidx(u16 idx);
int npc_cn20k_defrag(struct rvu *rvu);
bool npc_is_cgx_or_lbk(struct rvu *rvu, u16 pcifunc);
+int npc_mcam_idx_2_subbank_idx(struct rvu *rvu, u16 mcam_idx,
+ struct npc_subbank **sb,
+ int *sb_off);
#endif /* NPC_CN20K_H */
--
2.43.0
* [PATCH v18 net-next 2/8] devlink: heap-allocate param fill buffers in devlink_nl_param_fill
From: Ratheesh Kannoth @ 2026-06-02 6:03 UTC
To: linux-kernel, netdev
Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
pabeni, sgoutham, Ratheesh Kannoth
devlink_nl_param_fill() kept two per-configuration-mode copies of
union devlink_param_value plus a struct devlink_param_gset_ctx on the
stack while building the Netlink reply. Allocate those with kcalloc()
and kzalloc_obj() instead, and route failures through a single cleanup
path so temporary buffers are always freed.
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
---
net/devlink/param.c | 62 +++++++++++++++++++++++++++++++++------------
1 file changed, 46 insertions(+), 16 deletions(-)
diff --git a/net/devlink/param.c b/net/devlink/param.c
index 1a196d3a843d..bd3881349c60 100644
--- a/net/devlink/param.c
+++ b/net/devlink/param.c
@@ -304,56 +304,79 @@ static int devlink_nl_param_fill(struct sk_buff *msg, struct devlink *devlink,
u32 portid, u32 seq, int flags,
struct netlink_ext_ack *extack)
{
- union devlink_param_value default_value[DEVLINK_PARAM_CMODE_MAX + 1];
- union devlink_param_value param_value[DEVLINK_PARAM_CMODE_MAX + 1];
bool default_value_set[DEVLINK_PARAM_CMODE_MAX + 1] = {};
bool param_value_set[DEVLINK_PARAM_CMODE_MAX + 1] = {};
const struct devlink_param *param = param_item->param;
- struct devlink_param_gset_ctx ctx;
+ union devlink_param_value *default_value;
+ union devlink_param_value *param_value;
+ struct devlink_param_gset_ctx *ctx;
struct nlattr *param_values_list;
struct nlattr *param_attr;
void *hdr;
int err;
int i;
+ default_value = kcalloc(DEVLINK_PARAM_CMODE_MAX + 1,
+ sizeof(*default_value), GFP_KERNEL);
+ if (!default_value)
+ return -ENOMEM;
+
+ param_value = kcalloc(DEVLINK_PARAM_CMODE_MAX + 1,
+ sizeof(*param_value), GFP_KERNEL);
+ if (!param_value) {
+ kfree(default_value);
+ return -ENOMEM;
+ }
+
+ ctx = kzalloc_obj(*ctx);
+ if (!ctx) {
+ kfree(param_value);
+ kfree(default_value);
+ return -ENOMEM;
+ }
+
/* Get value from driver part to driverinit configuration mode */
for (i = 0; i <= DEVLINK_PARAM_CMODE_MAX; i++) {
if (!devlink_param_cmode_is_supported(param, i))
continue;
if (i == DEVLINK_PARAM_CMODE_DRIVERINIT) {
- if (param_item->driverinit_value_new_valid)
+ if (param_item->driverinit_value_new_valid) {
param_value[i] = param_item->driverinit_value_new;
- else if (param_item->driverinit_value_valid)
+ } else if (param_item->driverinit_value_valid) {
param_value[i] = param_item->driverinit_value;
- else
- return -EOPNOTSUPP;
+ } else {
+ err = -EOPNOTSUPP;
+ goto get_put_fail;
+ }
if (param_item->driverinit_value_valid) {
default_value[i] = param_item->driverinit_default;
default_value_set[i] = true;
}
} else {
- ctx.cmode = i;
- err = devlink_param_get(devlink, param, &ctx, extack);
+ ctx->cmode = i;
+ err = devlink_param_get(devlink, param, ctx, extack);
if (err)
- return err;
- param_value[i] = ctx.val;
+ goto get_put_fail;
- err = devlink_param_get_default(devlink, param, &ctx,
+ param_value[i] = ctx->val;
+
+ err = devlink_param_get_default(devlink, param, ctx,
extack);
if (!err) {
- default_value[i] = ctx.val;
+ default_value[i] = ctx->val;
default_value_set[i] = true;
} else if (err != -EOPNOTSUPP) {
- return err;
+ goto get_put_fail;
}
}
param_value_set[i] = true;
}
+ err = -EMSGSIZE;
hdr = genlmsg_put(msg, portid, seq, &devlink_nl_family, flags, cmd);
if (!hdr)
- return -EMSGSIZE;
+ goto get_put_fail;
if (devlink_nl_put_handle(msg, devlink))
goto genlmsg_cancel;
@@ -393,6 +416,9 @@ static int devlink_nl_param_fill(struct sk_buff *msg, struct devlink *devlink,
nla_nest_end(msg, param_values_list);
nla_nest_end(msg, param_attr);
genlmsg_end(msg, hdr);
+ kfree(default_value);
+ kfree(param_value);
+ kfree(ctx);
return 0;
values_list_nest_cancel:
@@ -401,7 +427,11 @@ static int devlink_nl_param_fill(struct sk_buff *msg, struct devlink *devlink,
nla_nest_cancel(msg, param_attr);
genlmsg_cancel:
genlmsg_cancel(msg, hdr);
- return -EMSGSIZE;
+get_put_fail:
+ kfree(default_value);
+ kfree(param_value);
+ kfree(ctx);
+ return err;
}
static void devlink_param_notify(struct devlink *devlink,
--
2.43.0
* [PATCH v18 net-next 3/8] devlink: Implement devlink param multi attribute nested data values
From: Ratheesh Kannoth @ 2026-06-02 6:03 UTC
To: linux-kernel, netdev
Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
pabeni, sgoutham, Saeed Mahameed, Ratheesh Kannoth
From: Saeed Mahameed <saeedm@nvidia.com>
The devlink param value attribute is not defined in the netlink policy,
because devlink validates and parses the value internally. This allows
multi-attribute values to be implemented without breaking any policies.
Devlink param multi-attribute values are treated as dynamically sized
arrays of u64 values. With the new devlink param type
DEVLINK_PARAM_TYPE_U64_ARRAY, drivers and user space can pass a variable
number of u64 values (at most __DEVLINK_PARAM_MAX_ARRAY_SIZE, i.e. 32)
through DEVLINK_ATTR_PARAM_VALUE_DATA attributes.
Implement get/set parsing and add the array to the internal value
structure passed to drivers.
This is useful for devices that need to configure a list of values for
a specific configuration.
Example:
$ devlink dev param show pci/... name multi-value-param
name multi-value-param type driver-specific
values:
cmode permanent value: 0,1,2,3,4,5,6,7
$ devlink dev param set pci/... name multi-value-param \
value 4,5,6,7,0,1,2,3 cmode permanent
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
---
Documentation/netlink/specs/devlink.yaml | 4 +++
include/net/devlink.h | 8 ++++++
include/uapi/linux/devlink.h | 1 +
net/devlink/netlink_gen.c | 2 ++
net/devlink/param.c | 33 +++++++++++++++++++++++-
5 files changed, 47 insertions(+), 1 deletion(-)
diff --git a/Documentation/netlink/specs/devlink.yaml b/Documentation/netlink/specs/devlink.yaml
index 247b147d689f..52ad1e7805d1 100644
--- a/Documentation/netlink/specs/devlink.yaml
+++ b/Documentation/netlink/specs/devlink.yaml
@@ -234,6 +234,10 @@ definitions:
value: 10
-
name: binary
+ -
+ name: u64-array
+ value: 129
+
-
name: rate-tc-index-max
type: const
diff --git a/include/net/devlink.h b/include/net/devlink.h
index 5f4083dc4345..dd546dbd57cf 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -433,6 +433,13 @@ enum devlink_param_type {
DEVLINK_PARAM_TYPE_U64 = DEVLINK_VAR_ATTR_TYPE_U64,
DEVLINK_PARAM_TYPE_STRING = DEVLINK_VAR_ATTR_TYPE_STRING,
DEVLINK_PARAM_TYPE_BOOL = DEVLINK_VAR_ATTR_TYPE_FLAG,
+ DEVLINK_PARAM_TYPE_U64_ARRAY = DEVLINK_VAR_ATTR_TYPE_U64_ARRAY,
+};
+
+#define __DEVLINK_PARAM_MAX_ARRAY_SIZE 32
+struct devlink_param_u64_array {
+ u64 size;
+ u64 val[__DEVLINK_PARAM_MAX_ARRAY_SIZE];
};
union devlink_param_value {
@@ -442,6 +449,7 @@ union devlink_param_value {
u64 vu64;
char vstr[__DEVLINK_PARAM_MAX_STRING_VALUE];
bool vbool;
+ struct devlink_param_u64_array u64arr;
};
struct devlink_param_gset_ctx {
diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index 0b165eac7619..ca713bcc47b9 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -406,6 +406,7 @@ enum devlink_var_attr_type {
DEVLINK_VAR_ATTR_TYPE_BINARY,
__DEVLINK_VAR_ATTR_TYPE_CUSTOM_BASE = 0x80,
/* Any possible custom types, unrelated to NLA_* values go below */
+ DEVLINK_VAR_ATTR_TYPE_U64_ARRAY,
};
enum devlink_attr {
diff --git a/net/devlink/netlink_gen.c b/net/devlink/netlink_gen.c
index 81899786fd98..f52b0c2b19ed 100644
--- a/net/devlink/netlink_gen.c
+++ b/net/devlink/netlink_gen.c
@@ -37,6 +37,8 @@ devlink_attr_param_type_validate(const struct nlattr *attr,
case DEVLINK_VAR_ATTR_TYPE_NUL_STRING:
fallthrough;
case DEVLINK_VAR_ATTR_TYPE_BINARY:
+ fallthrough;
+ case DEVLINK_VAR_ATTR_TYPE_U64_ARRAY:
return 0;
}
NL_SET_ERR_MSG_ATTR(extack, attr, "invalid enum value");
diff --git a/net/devlink/param.c b/net/devlink/param.c
index bd3881349c60..3e9d2e5750c2 100644
--- a/net/devlink/param.c
+++ b/net/devlink/param.c
@@ -252,6 +252,15 @@ devlink_nl_param_value_put(struct sk_buff *msg, enum devlink_param_type type,
return -EMSGSIZE;
}
break;
+ case DEVLINK_PARAM_TYPE_U64_ARRAY:
+ if (val->u64arr.size > __DEVLINK_PARAM_MAX_ARRAY_SIZE)
+ return -EMSGSIZE;
+
+ for (int i = 0; i < val->u64arr.size; i++) {
+ if (nla_put_uint(msg, nla_type, val->u64arr.val[i]))
+ return -EMSGSIZE;
+ }
+ break;
}
return 0;
}
@@ -537,7 +546,7 @@ devlink_param_value_get_from_info(const struct devlink_param *param,
union devlink_param_value *value)
{
struct nlattr *param_data;
- int len;
+ int len, cnt, rem;
param_data = info->attrs[DEVLINK_ATTR_PARAM_VALUE_DATA];
@@ -577,6 +586,28 @@ devlink_param_value_get_from_info(const struct devlink_param *param,
return -EINVAL;
value->vbool = nla_get_flag(param_data);
break;
+
+ case DEVLINK_PARAM_TYPE_U64_ARRAY:
+ cnt = 0;
+ nla_for_each_attr_type(param_data,
+ DEVLINK_ATTR_PARAM_VALUE_DATA,
+ genlmsg_data(info->genlhdr),
+ genlmsg_len(info->genlhdr), rem) {
+ if (cnt >= __DEVLINK_PARAM_MAX_ARRAY_SIZE)
+ return -EMSGSIZE;
+
+ if ((nla_len(param_data) != sizeof(u64)) &&
+ (nla_len(param_data) != sizeof(u32))) {
+ NL_SET_BAD_ATTR(info->extack, param_data);
+ return -EINVAL;
+ }
+
+ value->u64arr.val[cnt] = nla_get_uint(param_data);
+ cnt++;
+ }
+
+ value->u64arr.size = cnt;
+ break;
}
return 0;
}
--
2.43.0
* [PATCH v18 net-next 4/8] octeontx2-af: npc: cn20k: add subbank search order control
From: Ratheesh Kannoth @ 2026-06-02 6:03 UTC
To: linux-kernel, netdev
Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
pabeni, sgoutham, Ratheesh Kannoth
CN20K NPC MCAM is split into 32 subbanks that are searched in a
predefined order during allocation. Lower-numbered subbanks have
higher priority than higher-numbered ones.
Add a runtime devlink parameter "srch_order" to control the order in
which subbanks are searched during MCAM allocation.
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
---
.../ethernet/marvell/octeontx2/af/cn20k/npc.c | 120 +++++++++++++++++-
.../ethernet/marvell/octeontx2/af/cn20k/npc.h | 3 +
.../marvell/octeontx2/af/rvu_devlink.c | 92 ++++++++++++--
3 files changed, 203 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
index 9fa9a589cf9c..0e1744609ccf 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
@@ -3376,7 +3376,7 @@ rvu_mbox_handler_npc_cn20k_get_kex_cfg(struct rvu *rvu,
return 0;
}
-static int *subbank_srch_order;
+static u32 *subbank_srch_order;
static void npc_populate_restricted_idxs(int num_subbanks)
{
@@ -3388,7 +3388,7 @@ static int npc_create_srch_order(int cnt)
{
int val = 0;
- subbank_srch_order = kcalloc(cnt, sizeof(int),
+ subbank_srch_order = kcalloc(cnt, sizeof(u32),
GFP_KERNEL);
if (!subbank_srch_order)
return -ENOMEM;
@@ -3906,6 +3906,122 @@ static void npc_unlock_all_subbank(void)
mutex_unlock(&npc_priv.sb[i].lock);
}
+int npc_cn20k_search_order_set(struct rvu *rvu,
+ u64 narr[MAX_NUM_SUB_BANKS], int cnt)
+{
+ struct npc_mcam *mcam = &rvu->hw->mcam;
+ int rsrc[2][MAX_NUM_SUB_BANKS] = { };
+ u8 save[MAX_NUM_SUB_BANKS] = { };
+ struct npc_subbank *sb;
+ struct xarray *xa;
+ int prio, rc, err;
+ int sb_idx;
+ enum {
+ FREE = 0,
+ USED = 1,
+ };
+
+ if (cnt != npc_priv.num_subbanks) {
+ dev_err(rvu->dev, "Number of entries(%u) != %u\n",
+ cnt, npc_priv.num_subbanks);
+ return -EINVAL;
+ }
+
+ mutex_lock(&mcam->lock);
+ npc_lock_all_subbank();
+
+ for (sb_idx = 0; sb_idx < cnt; sb_idx++) {
+ sb = &npc_priv.sb[sb_idx];
+ save[sb->idx] = sb->arr_idx;
+ }
+
+ for (prio = 0; prio < cnt; prio++) {
+ sb_idx = narr[prio];
+ sb = &npc_priv.sb[sb_idx];
+
+ if (sb->flags & NPC_SUBBANK_FLAG_USED)
+ xa = &npc_priv.xa_sb_used;
+ else
+ xa = &npc_priv.xa_sb_free;
+
+ rc = xa_err(xa_store(xa, prio,
+ xa_mk_value(sb_idx), GFP_KERNEL));
+ if (rc) {
+ dev_err(rvu->dev,
+ "Setting arr_idx=%d for sb=%d failed\n",
+ sb->arr_idx, sb_idx);
+ goto fail;
+ }
+
+ if (sb->flags & NPC_SUBBANK_FLAG_USED) {
+ rsrc[USED][sb->arr_idx] -= 1;
+ rsrc[USED][prio] += 1;
+ } else {
+ rsrc[FREE][sb->arr_idx] -= 1;
+ rsrc[FREE][prio] += 1;
+ }
+
+ sb->arr_idx = prio;
+ }
+
+ for (prio = 0; prio < cnt; prio++) {
+ if (rsrc[FREE][prio] == -1)
+ xa_erase(&npc_priv.xa_sb_free, prio);
+
+ if (rsrc[USED][prio] == -1)
+ xa_erase(&npc_priv.xa_sb_used, prio);
+ }
+
+ for (int i = 0; i < cnt; i++)
+ subbank_srch_order[i] = (u32)narr[i];
+
+ restrict_valid = false;
+
+ npc_unlock_all_subbank();
+ mutex_unlock(&mcam->lock);
+
+ return 0;
+
+fail:
+ for (prio = 0; prio < cnt; prio++) {
+ if (rsrc[FREE][prio] == 1)
+ xa_erase(&npc_priv.xa_sb_free, prio);
+
+ if (rsrc[USED][prio] == 1)
+ xa_erase(&npc_priv.xa_sb_used, prio);
+ }
+
+ for (sb_idx = 0; sb_idx < cnt; sb_idx++) {
+ sb = &npc_priv.sb[sb_idx];
+ sb->arr_idx = save[sb_idx];
+
+ if (sb->flags & NPC_SUBBANK_FLAG_USED)
+ xa = &npc_priv.xa_sb_used;
+ else
+ xa = &npc_priv.xa_sb_free;
+
+ /* Since the entry already exists, xa_store() replaces
+ * the value without a kmalloc(), making failure highly unlikely.
+ */
+ err = xa_err(xa_store(xa, sb->arr_idx,
+ xa_mk_value(sb->idx), GFP_KERNEL));
+ WARN(!!err, "Failed to rollback sb=%u idx=%u\n",
+ sb->idx, sb->arr_idx);
+ }
+
+ npc_unlock_all_subbank();
+ mutex_unlock(&mcam->lock);
+
+ return rc;
+}
+
+const u32 *npc_cn20k_search_order_get(bool *restricted_order, u32 *sz)
+{
+ *restricted_order = restrict_valid;
+ *sz = npc_priv.num_subbanks;
+ return subbank_srch_order;
+}
+
/* Only non-ref non-contigous mcam indexes
* are picked for defrag process
*/
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
index 3e851950be64..8bf857317e49 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
@@ -347,5 +347,8 @@ bool npc_is_cgx_or_lbk(struct rvu *rvu, u16 pcifunc);
int npc_mcam_idx_2_subbank_idx(struct rvu *rvu, u16 mcam_idx,
struct npc_subbank **sb,
int *sb_off);
+const u32 *npc_cn20k_search_order_get(bool *restricted_order, u32 *sz);
+int npc_cn20k_search_order_set(struct rvu *rvu, u64 narr[MAX_NUM_SUB_BANKS],
+ int cnt);
#endif /* NPC_CN20K_H */
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
index a42404e6db7c..aa3ecab5ebd8 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
@@ -1258,6 +1258,7 @@ enum rvu_af_dl_param_id {
RVU_AF_DEVLINK_PARAM_ID_NPC_EXACT_FEATURE_DISABLE,
RVU_AF_DEVLINK_PARAM_ID_NPC_DEF_RULE_CNTR_ENABLE,
RVU_AF_DEVLINK_PARAM_ID_NPC_DEFRAG,
+ RVU_AF_DEVLINK_PARAM_ID_NPC_SRCH_ORDER,
RVU_AF_DEVLINK_PARAM_ID_NIX_MAXLF,
};
@@ -1619,12 +1620,83 @@ static int rvu_devlink_eswitch_mode_set(struct devlink *devlink, u16 mode,
return 0;
}
+static int rvu_af_dl_npc_srch_order_set(struct devlink *devlink, u32 id,
+ struct devlink_param_gset_ctx *ctx,
+ struct netlink_ext_ack *extack)
+{
+ struct rvu_devlink *rvu_dl = devlink_priv(devlink);
+ struct rvu *rvu = rvu_dl->rvu;
+
+ return npc_cn20k_search_order_set(rvu,
+ ctx->val.u64arr.val,
+ ctx->val.u64arr.size);
+}
+
+static int rvu_af_dl_npc_srch_order_get(struct devlink *devlink, u32 id,
+ struct devlink_param_gset_ctx *ctx,
+ struct netlink_ext_ack *extack)
+{
+ bool restricted_order;
+ const u32 *order;
+ u32 sz;
+
+ order = npc_cn20k_search_order_get(&restricted_order, &sz);
+ ctx->val.u64arr.size = sz;
+ for (int i = 0; i < sz; i++)
+ ctx->val.u64arr.val[i] = order[i];
+
+ return 0;
+}
+
+static int rvu_af_dl_npc_srch_order_validate(struct devlink *devlink, u32 id,
+ union devlink_param_value *val,
+ struct netlink_ext_ack *extack)
+{
+ struct rvu_devlink *rvu_dl = devlink_priv(devlink);
+ struct rvu *rvu = rvu_dl->rvu;
+ bool restricted_order;
+ unsigned long w = 0;
+ u64 *arr;
+ u32 sz;
+
+ npc_cn20k_search_order_get(&restricted_order, &sz);
+ if (sz != val->u64arr.size) {
+ dev_err(rvu->dev,
+ "Wrong size %llu, should be %u\n",
+ val->u64arr.size, sz);
+ return -EINVAL;
+ }
+
+ arr = val->u64arr.val;
+ for (int i = 0; i < sz; i++) {
+ if (arr[i] >= sz)
+ return -EINVAL;
+
+ w |= BIT_ULL(arr[i]);
+ }
+
+ if (bitmap_weight(&w, sz) != sz) {
+ dev_err(rvu->dev,
+ "Duplicate or out-of-range subbank index. %lu\n",
+ find_first_zero_bit(&w, sz));
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
static const struct devlink_ops rvu_devlink_ops = {
.eswitch_mode_get = rvu_devlink_eswitch_mode_get,
.eswitch_mode_set = rvu_devlink_eswitch_mode_set,
};
-static const struct devlink_param rvu_af_dl_param_defrag[] = {
+static const struct devlink_param rvu_af_dl_cn20k_params[] = {
+ DEVLINK_PARAM_DRIVER(RVU_AF_DEVLINK_PARAM_ID_NPC_SRCH_ORDER,
+ "npc_srch_order", DEVLINK_PARAM_TYPE_U64_ARRAY,
+ BIT(DEVLINK_PARAM_CMODE_RUNTIME),
+ rvu_af_dl_npc_srch_order_get,
+ rvu_af_dl_npc_srch_order_set,
+ rvu_af_dl_npc_srch_order_validate),
DEVLINK_PARAM_DRIVER(RVU_AF_DEVLINK_PARAM_ID_NPC_DEFRAG,
"npc_defrag", DEVLINK_PARAM_TYPE_STRING,
BIT(DEVLINK_PARAM_CMODE_RUNTIME),
@@ -1666,13 +1738,13 @@ int rvu_register_dl(struct rvu *rvu)
}
if (is_cn20k(rvu->pdev)) {
- err = devlink_params_register(dl, rvu_af_dl_param_defrag,
- ARRAY_SIZE(rvu_af_dl_param_defrag));
+ err = devlink_params_register(dl, rvu_af_dl_cn20k_params,
+ ARRAY_SIZE(rvu_af_dl_cn20k_params));
if (err) {
dev_err(rvu->dev,
- "devlink defrag params register failed with error %d",
+ "devlink cn20k params register failed with error %d",
err);
- goto err_dl_defrag;
+ goto err_dl_cn20k_params;
}
}
@@ -1695,10 +1767,10 @@ int rvu_register_dl(struct rvu *rvu)
err_dl_exact_match:
if (is_cn20k(rvu->pdev))
- devlink_params_unregister(dl, rvu_af_dl_param_defrag,
- ARRAY_SIZE(rvu_af_dl_param_defrag));
+ devlink_params_unregister(dl, rvu_af_dl_cn20k_params,
+ ARRAY_SIZE(rvu_af_dl_cn20k_params));
-err_dl_defrag:
+err_dl_cn20k_params:
devlink_params_unregister(dl, rvu_af_dl_params, ARRAY_SIZE(rvu_af_dl_params));
err_dl_health:
@@ -1717,8 +1789,8 @@ void rvu_unregister_dl(struct rvu *rvu)
devlink_params_unregister(dl, rvu_af_dl_params, ARRAY_SIZE(rvu_af_dl_params));
if (is_cn20k(rvu->pdev))
- devlink_params_unregister(dl, rvu_af_dl_param_defrag,
- ARRAY_SIZE(rvu_af_dl_param_defrag));
+ devlink_params_unregister(dl, rvu_af_dl_cn20k_params,
+ ARRAY_SIZE(rvu_af_dl_cn20k_params));
/* Unregister exact match devlink only for CN10K-B */
if (rvu_npc_exact_has_match_table(rvu))
--
2.43.0
* [PATCH v18 net-next 5/8] octeontx2: cn20k: Coordinate default rules with NIX LF lifecycle
2026-06-02 6:03 [PATCH v18 net-next 0/8] octeontx2-af: npc: Enhancements Ratheesh Kannoth
` (3 preceding siblings ...)
2026-06-02 6:03 ` [PATCH v18 net-next 4/8] octeontx2-af: npc: cn20k: add subbank search order control Ratheesh Kannoth
@ 2026-06-02 6:03 ` Ratheesh Kannoth
2026-06-03 6:37 ` Ratheesh Kannoth
2026-06-04 2:41 ` Ratheesh Kannoth
2026-06-02 6:03 ` [PATCH v18 net-next 6/8] octeontx2-af: npc: Support for custom KPU profile from filesystem Ratheesh Kannoth
` (2 subsequent siblings)
7 siblings, 2 replies; 21+ messages in thread
From: Ratheesh Kannoth @ 2026-06-02 6:03 UTC (permalink / raw)
To: linux-kernel, netdev
Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
pabeni, sgoutham, Ratheesh Kannoth
Add NIX_LF_DONT_FREE_DFT_IDXS so that the PF can send NIX LF free during hw
reinit or teardown (otx2_init_hw_resources() and otx2_free_hw_resources())
without the AF freeing the CN20K default NPC rule indexes while the driver
still owns that state.
On CN20K, allocate default NPC rules from NIX LF alloc before
nix_interface_init, roll back with npc_cn20k_dft_rules_free on failure,
and free from NIX LF free when the new flag is not set. Tighten
rvu_mbox_handler_nix_lf_alloc error handling: use a single rc, propagate
qmem_alloc and other errors, and set -ENOMEM only when kcalloc fails
(remove the blanket -ENOMEM at the free_mem path).
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
---
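Note for reviewers: the control flow described in the commit message can be
modelled in plain userspace C. This is only a sketch under stated
assumptions, not the driver code. The three flag values are copied from
mbox.h in this patch; struct lf_model, struct lf_faults, model_lf_alloc()
and model_lf_free() are hypothetical names made up for illustration, and
the fault-injection fields stand in for npc_cn20k_dft_rules_alloc() and
nix_interface_init() failing.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Flag values mirror mbox.h in this patch; everything else is hypothetical. */
#define NIX_LF_DISABLE_FLOWS		(1ULL << 0)
#define NIX_LF_DONT_FREE_TX_VTAG	(1ULL << 1)
#define NIX_LF_DONT_FREE_DFT_IDXS	(1ULL << 2)

struct lf_model {
	void *ctx;		/* stands in for the LF context memory */
	bool dft_rules;		/* true while default rule indexes are held */
};

/* Failure injection for the sketch only. */
struct lf_faults {
	int dft_alloc_err;	/* result of the default-rule allocation step */
	int intf_init_err;	/* result of the interface init step */
};

static int model_lf_alloc(struct lf_model *lf, const struct lf_faults *f)
{
	int rc;

	assert(lf && f && !lf->ctx);

	lf->ctx = calloc(1, 64);
	if (!lf->ctx)
		return -ENOMEM;	/* -ENOMEM only where an allocation failed */

	/* Default rules are set up before the interface init step. */
	rc = f->dft_alloc_err;
	if (rc)
		goto free_mem;
	lf->dft_rules = true;

	rc = f->intf_init_err;
	if (rc)
		goto free_dft;

	return 0;

free_dft:
	lf->dft_rules = false;	/* undo only what was set up after free_mem */
free_mem:
	free(lf->ctx);
	lf->ctx = NULL;
	return rc;		/* the original error code is propagated */
}

static void model_lf_free(struct lf_model *lf, uint64_t flags)
{
	free(lf->ctx);
	lf->ctx = NULL;

	/* The caller may ask to keep the default rule indexes. */
	if (!(flags & NIX_LF_DONT_FREE_DFT_IDXS))
		lf->dft_rules = false;
}
```

Calling model_lf_free() with NIX_LF_DISABLE_FLOWS | NIX_LF_DONT_FREE_DFT_IDXS
leaves dft_rules set, which is the behaviour the PF relies on across reinit.
A failing interface init step unwinds through both labels and returns the
step's own error code instead of a blanket -ENOMEM.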
.../net/ethernet/marvell/octeontx2/af/mbox.h | 1 +
.../ethernet/marvell/octeontx2/af/rvu_nix.c | 69 ++++++++++++-------
.../ethernet/marvell/octeontx2/af/rvu_npc.c | 20 ++++--
.../ethernet/marvell/octeontx2/nic/otx2_pf.c | 6 +-
4 files changed, 61 insertions(+), 35 deletions(-)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/mbox.h b/drivers/net/ethernet/marvell/octeontx2/af/mbox.h
index dc42c81c0942..e07fbf842b94 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/mbox.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/mbox.h
@@ -1009,6 +1009,7 @@ struct nix_lf_free_req {
struct mbox_msghdr hdr;
#define NIX_LF_DISABLE_FLOWS BIT_ULL(0)
#define NIX_LF_DONT_FREE_TX_VTAG BIT_ULL(1)
+#define NIX_LF_DONT_FREE_DFT_IDXS BIT_ULL(2)
u64 flags;
};
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
index f977734ae712..7df256a9e01c 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
@@ -16,6 +16,7 @@
#include "cgx.h"
#include "lmac_common.h"
#include "rvu_npc_hash.h"
+#include "cn20k/npc.h"
static void nix_free_tx_vtag_entries(struct rvu *rvu, u16 pcifunc);
static int rvu_nix_get_bpid(struct rvu *rvu, struct nix_bp_cfg_req *req,
@@ -1499,7 +1500,7 @@ int rvu_mbox_handler_nix_lf_alloc(struct rvu *rvu,
struct nix_lf_alloc_req *req,
struct nix_lf_alloc_rsp *rsp)
{
- int nixlf, qints, hwctx_size, intf, err, rc = 0;
+ int nixlf, qints, hwctx_size, intf, rc = 0;
struct rvu_hwinfo *hw = rvu->hw;
u16 pcifunc = req->hdr.pcifunc;
struct rvu_block *block;
@@ -1555,8 +1556,8 @@ int rvu_mbox_handler_nix_lf_alloc(struct rvu *rvu,
return NIX_AF_ERR_RSS_GRPS_INVALID;
/* Reset this NIX LF */
- err = rvu_lf_reset(rvu, block, nixlf);
- if (err) {
+ rc = rvu_lf_reset(rvu, block, nixlf);
+ if (rc) {
dev_err(rvu->dev, "Failed to reset NIX%d LF%d\n",
block->addr - BLKADDR_NIX0, nixlf);
return NIX_AF_ERR_LF_RESET;
@@ -1566,13 +1567,15 @@ int rvu_mbox_handler_nix_lf_alloc(struct rvu *rvu,
/* Alloc NIX RQ HW context memory and config the base */
hwctx_size = 1UL << ((ctx_cfg >> 4) & 0xF);
- err = qmem_alloc(rvu->dev, &pfvf->rq_ctx, req->rq_cnt, hwctx_size);
- if (err)
+ rc = qmem_alloc(rvu->dev, &pfvf->rq_ctx, req->rq_cnt, hwctx_size);
+ if (rc)
goto free_mem;
pfvf->rq_bmap = kcalloc(req->rq_cnt, sizeof(long), GFP_KERNEL);
- if (!pfvf->rq_bmap)
+ if (!pfvf->rq_bmap) {
+ rc = -ENOMEM;
goto free_mem;
+ }
rvu_write64(rvu, blkaddr, NIX_AF_LFX_RQS_BASE(nixlf),
(u64)pfvf->rq_ctx->iova);
@@ -1583,13 +1586,15 @@ int rvu_mbox_handler_nix_lf_alloc(struct rvu *rvu,
/* Alloc NIX SQ HW context memory and config the base */
hwctx_size = 1UL << (ctx_cfg & 0xF);
- err = qmem_alloc(rvu->dev, &pfvf->sq_ctx, req->sq_cnt, hwctx_size);
- if (err)
+ rc = qmem_alloc(rvu->dev, &pfvf->sq_ctx, req->sq_cnt, hwctx_size);
+ if (rc)
goto free_mem;
pfvf->sq_bmap = kcalloc(req->sq_cnt, sizeof(long), GFP_KERNEL);
- if (!pfvf->sq_bmap)
+ if (!pfvf->sq_bmap) {
+ rc = -ENOMEM;
goto free_mem;
+ }
rvu_write64(rvu, blkaddr, NIX_AF_LFX_SQS_BASE(nixlf),
(u64)pfvf->sq_ctx->iova);
@@ -1599,13 +1604,15 @@ int rvu_mbox_handler_nix_lf_alloc(struct rvu *rvu,
/* Alloc NIX CQ HW context memory and config the base */
hwctx_size = 1UL << ((ctx_cfg >> 8) & 0xF);
- err = qmem_alloc(rvu->dev, &pfvf->cq_ctx, req->cq_cnt, hwctx_size);
- if (err)
+ rc = qmem_alloc(rvu->dev, &pfvf->cq_ctx, req->cq_cnt, hwctx_size);
+ if (rc)
goto free_mem;
pfvf->cq_bmap = kcalloc(req->cq_cnt, sizeof(long), GFP_KERNEL);
- if (!pfvf->cq_bmap)
+ if (!pfvf->cq_bmap) {
+ rc = -ENOMEM;
goto free_mem;
+ }
rvu_write64(rvu, blkaddr, NIX_AF_LFX_CQS_BASE(nixlf),
(u64)pfvf->cq_ctx->iova);
@@ -1615,18 +1622,18 @@ int rvu_mbox_handler_nix_lf_alloc(struct rvu *rvu,
/* Initialize receive side scaling (RSS) */
hwctx_size = 1UL << ((ctx_cfg >> 12) & 0xF);
- err = nixlf_rss_ctx_init(rvu, blkaddr, pfvf, nixlf, req->rss_sz,
- req->rss_grps, hwctx_size, req->way_mask,
- !!(req->flags & NIX_LF_RSS_TAG_LSB_AS_ADDER));
- if (err)
+ rc = nixlf_rss_ctx_init(rvu, blkaddr, pfvf, nixlf, req->rss_sz,
+ req->rss_grps, hwctx_size, req->way_mask,
+ !!(req->flags & NIX_LF_RSS_TAG_LSB_AS_ADDER));
+ if (rc)
goto free_mem;
/* Alloc memory for CQINT's HW contexts */
cfg = rvu_read64(rvu, blkaddr, NIX_AF_CONST2);
qints = (cfg >> 24) & 0xFFF;
hwctx_size = 1UL << ((ctx_cfg >> 24) & 0xF);
- err = qmem_alloc(rvu->dev, &pfvf->cq_ints_ctx, qints, hwctx_size);
- if (err)
+ rc = qmem_alloc(rvu->dev, &pfvf->cq_ints_ctx, qints, hwctx_size);
+ if (rc)
goto free_mem;
rvu_write64(rvu, blkaddr, NIX_AF_LFX_CINTS_BASE(nixlf),
@@ -1639,8 +1646,8 @@ int rvu_mbox_handler_nix_lf_alloc(struct rvu *rvu,
cfg = rvu_read64(rvu, blkaddr, NIX_AF_CONST2);
qints = (cfg >> 12) & 0xFFF;
hwctx_size = 1UL << ((ctx_cfg >> 20) & 0xF);
- err = qmem_alloc(rvu->dev, &pfvf->nix_qints_ctx, qints, hwctx_size);
- if (err)
+ rc = qmem_alloc(rvu->dev, &pfvf->nix_qints_ctx, qints, hwctx_size);
+ if (rc)
goto free_mem;
rvu_write64(rvu, blkaddr, NIX_AF_LFX_QINTS_BASE(nixlf),
@@ -1684,10 +1691,16 @@ int rvu_mbox_handler_nix_lf_alloc(struct rvu *rvu,
if (is_sdp_pfvf(rvu, pcifunc))
intf = NIX_INTF_TYPE_SDP;
- err = nix_interface_init(rvu, pcifunc, intf, nixlf, rsp,
- !!(req->flags & NIX_LF_LBK_BLK_SEL));
- if (err)
- goto free_mem;
+ if (is_cn20k(rvu->pdev)) {
+ rc = npc_cn20k_dft_rules_alloc(rvu, pcifunc);
+ if (rc)
+ goto free_mem;
+ }
+
+ rc = nix_interface_init(rvu, pcifunc, intf, nixlf, rsp,
+ !!(req->flags & NIX_LF_LBK_BLK_SEL));
+ if (rc)
+ goto free_dft;
/* Disable NPC entries as NIXLF's contexts are not initialized yet */
rvu_npc_disable_default_entries(rvu, pcifunc, nixlf);
@@ -1699,9 +1712,12 @@ int rvu_mbox_handler_nix_lf_alloc(struct rvu *rvu,
goto exit;
+free_dft:
+ if (is_cn20k(rvu->pdev))
+ npc_cn20k_dft_rules_free(rvu, pcifunc);
+
free_mem:
nix_ctx_free(rvu, pfvf);
- rc = -ENOMEM;
exit:
/* Set macaddr of this PF/VF */
@@ -1775,6 +1791,9 @@ int rvu_mbox_handler_nix_lf_free(struct rvu *rvu, struct nix_lf_free_req *req,
nix_ctx_free(rvu, pfvf);
+ if (is_cn20k(rvu->pdev) && !(req->flags & NIX_LF_DONT_FREE_DFT_IDXS))
+ npc_cn20k_dft_rules_free(rvu, pcifunc);
+
return 0;
}
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
index 607d0cf1a778..5fa9e1c7ae9f 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
@@ -1285,11 +1285,18 @@ void npc_enadis_default_mce_entry(struct rvu *rvu, u16 pcifunc,
struct nix_mce_list *mce_list;
int index, blkaddr, mce_idx;
struct rvu_pfvf *pfvf;
+ u16 ptr[4];
/* multicast pkt replication is not enabled for AF's VFs & SDP links */
if (is_lbk_vf(rvu, pcifunc) || is_sdp_pfvf(rvu, pcifunc))
return;
+ /* In cn20k, only CGX mapped devices have default MCAST entry */
+ if (is_cn20k(rvu->pdev) &&
+ npc_cn20k_dft_rules_idx_get(rvu, pcifunc, &ptr[0], &ptr[1],
+ &ptr[2], &ptr[3]))
+ return;
+
blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPC, 0);
if (blkaddr < 0)
return;
@@ -1329,9 +1336,12 @@ static void npc_enadis_default_entries(struct rvu *rvu, u16 pcifunc,
struct rvu_pfvf *pfvf = rvu_get_pfvf(rvu, pcifunc);
struct npc_mcam *mcam = &rvu->hw->mcam;
int index, blkaddr;
+ u16 ptr[4];
/* only CGX or LBK interfaces have default entries */
- if (is_cn20k(rvu->pdev) && !npc_is_cgx_or_lbk(rvu, pcifunc))
+ if (is_cn20k(rvu->pdev) &&
+ npc_cn20k_dft_rules_idx_get(rvu, pcifunc, &ptr[0], &ptr[1],
+ &ptr[2], &ptr[3]))
return;
blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPC, 0);
@@ -4085,12 +4095,10 @@ void rvu_npc_clear_ucast_entry(struct rvu *rvu, int pcifunc, int nixlf)
ucast_idx = npc_get_nixlf_mcam_index(mcam, pcifunc,
nixlf, NIXLF_UCAST_ENTRY);
- if (ucast_idx < 0) {
- dev_err(rvu->dev,
- "%s: Error to get ucast entry for pcifunc=%#x\n",
- __func__, pcifunc);
+
+ /* In cn20k, default rules are freed before detach rsrc */
+ if (ucast_idx < 0)
return;
- }
npc_enable_mcam_entry(rvu, mcam, blkaddr, ucast_idx, false);
diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
index ee623476e5ff..81b088f5a016 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
@@ -1053,7 +1053,6 @@ irqreturn_t otx2_pfaf_mbox_intr_handler(int irq, void *pf_irq)
/* Clear the IRQ */
otx2_write64(pf, RVU_PF_INT, BIT_ULL(0));
-
mbox_data = otx2_read64(pf, RVU_PF_PFAF_MBOX0);
if (mbox_data & MBOX_UP_MSG) {
@@ -1729,7 +1728,7 @@ int otx2_init_hw_resources(struct otx2_nic *pf)
mutex_lock(&mbox->lock);
free_req = otx2_mbox_alloc_msg_nix_lf_free(mbox);
if (free_req) {
- free_req->flags = NIX_LF_DISABLE_FLOWS;
+ free_req->flags = NIX_LF_DISABLE_FLOWS | NIX_LF_DONT_FREE_DFT_IDXS;
if (otx2_sync_mbox_msg(mbox))
dev_err(pf->dev, "%s failed to free nixlf\n", __func__);
}
@@ -1803,7 +1802,7 @@ void otx2_free_hw_resources(struct otx2_nic *pf)
/* Reset NIX LF */
free_req = otx2_mbox_alloc_msg_nix_lf_free(mbox);
if (free_req) {
- free_req->flags = NIX_LF_DISABLE_FLOWS;
+ free_req->flags = NIX_LF_DISABLE_FLOWS | NIX_LF_DONT_FREE_DFT_IDXS;
if (!(pf->flags & OTX2_FLAG_PF_SHUTDOWN))
free_req->flags |= NIX_LF_DONT_FREE_TX_VTAG;
if (otx2_sync_mbox_msg(mbox))
@@ -1926,7 +1925,6 @@ int otx2_alloc_queue_mem(struct otx2_nic *pf)
struct otx2_qset *qset = &pf->qset;
struct otx2_cq_poll *cq_poll;
-
/* RQ and SQs are mapped to different CQs,
* so find out max CQ IRQs (i.e CINTs) needed.
*/
--
2.43.0
* [PATCH v18 net-next 6/8] octeontx2-af: npc: Support for custom KPU profile from filesystem
2026-06-02 6:03 [PATCH v18 net-next 0/8] octeontx2-af: npc: Enhancements Ratheesh Kannoth
` (4 preceding siblings ...)
2026-06-02 6:03 ` [PATCH v18 net-next 5/8] octeontx2: cn20k: Coordinate default rules with NIX LF lifecycle Ratheesh Kannoth
@ 2026-06-02 6:03 ` Ratheesh Kannoth
2026-06-03 6:46 ` Ratheesh Kannoth
2026-06-04 3:07 ` Ratheesh Kannoth
2026-06-02 6:03 ` [PATCH v18 net-next 7/8] octeontx2: cn20k: Respect NPC MCAM X2/X4 profile in flows and DFT alloc Ratheesh Kannoth
2026-06-02 6:03 ` [PATCH v18 net-next 8/8] octeontx2-af: npc: cn20k: Allocate npc_priv and dstats dynamically Ratheesh Kannoth
7 siblings, 2 replies; 21+ messages in thread
From: Ratheesh Kannoth @ 2026-06-02 6:03 UTC (permalink / raw)
To: linux-kernel, netdev
Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
pabeni, sgoutham, Ratheesh Kannoth
Flashing updated firmware on deployed devices is cumbersome. Provide a
mechanism to load a custom KPU (Key Parse Unit) profile directly from
the filesystem at module load time.
When the rvu_af module is loaded with the kpu_profile parameter, the
specified profile is read from /lib/firmware/kpu and programmed into
the KPU registers. Add npc_kpu_profile_cam2 for the extended CAM format
used by filesystem-loaded profiles, and support ptype/ptype_mask in
npc_config_kpucam() when profile->from_fs is set.
Usage:
1. Copy the KPU profile file to /lib/firmware/kpu.
2. Build OCTEONTX2_AF as a module.
3. Load: insmod rvu_af.ko kpu_profile=<profile_name>
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
---
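Note for reviewers: the ptype handling added to npc_config_kpucam() can be
illustrated with a small userspace model. The bit position (57:56) and the
"value & mask" / "~value & mask" split are taken from the hunk in this
patch. struct cam_words, cam_set_ptype() and cam_ptype_matches() are
hypothetical names, and the matching rule in cam_ptype_matches() is an
assumption about how the two CAM words are interpreted (both bits clear
meaning "don't care"), not something this patch states.

```c
#include <assert.h>
#include <stdint.h>

/* ptype occupies bits 57:56, as in GENMASK_ULL(57, 56) in the patch. */
#define PTYPE_SHIFT	56
#define PTYPE_FIELD	(0x3ULL << PTYPE_SHIFT)

struct cam_words {
	uint64_t cam0;	/* receives ~value & mask */
	uint64_t cam1;	/* receives  value & mask */
};

/* Userspace equivalent of the two FIELD_PREP() updates in the patch. */
static void cam_set_ptype(struct cam_words *w, uint8_t ptype, uint8_t pmask)
{
	uint64_t set = (uint64_t)(ptype & pmask);
	uint64_t clr = (uint64_t)(~ptype & pmask);

	assert(w);
	w->cam1 |= (set << PTYPE_SHIFT) & PTYPE_FIELD;
	w->cam0 |= (clr << PTYPE_SHIFT) & PTYPE_FIELD;
}

/*
 * Assumed ternary match rule: a bit set in cam1 requires a 1 in the key,
 * a bit set in cam0 requires a 0 in the key, both clear matches anything.
 */
static int cam_ptype_matches(const struct cam_words *w, uint8_t key_ptype)
{
	uint64_t key = ((uint64_t)key_ptype << PTYPE_SHIFT) & PTYPE_FIELD;

	if (w->cam1 & PTYPE_FIELD & ~key)
		return 0;
	if (w->cam0 & PTYPE_FIELD & key)
		return 0;
	return 1;
}
```

With ptype = 2 and ptype_mask = 3, cam1 gets 2 and cam0 gets 1 in the
two-bit field, so under the assumed rule only a key with ptype 2 matches. A
zero ptype_mask leaves both fields clear, so the entry ignores ptype, which
is also what a built-in (non from_fs) profile ends up with.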
.../ethernet/marvell/octeontx2/af/cn20k/npc.c | 57 ++-
.../net/ethernet/marvell/octeontx2/af/npc.h | 17 +
.../net/ethernet/marvell/octeontx2/af/rvu.h | 12 +-
.../ethernet/marvell/octeontx2/af/rvu_npc.c | 466 ++++++++++++++----
.../ethernet/marvell/octeontx2/af/rvu_npc.h | 17 +
.../ethernet/marvell/octeontx2/af/rvu_reg.h | 1 +
6 files changed, 459 insertions(+), 111 deletions(-)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
index 0e1744609ccf..513e68711962 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
@@ -521,13 +521,17 @@ npc_program_single_kpm_profile(struct rvu *rvu, int blkaddr,
int kpm, int start_entry,
const struct npc_kpu_profile *profile)
{
+ int num_cam_entries, num_action_entries;
int entry, num_entries, max_entries;
u64 idx;
- if (profile->cam_entries != profile->action_entries) {
+ num_cam_entries = npc_get_num_kpu_cam_entries(rvu, profile);
+ num_action_entries = npc_get_num_kpu_action_entries(rvu, profile);
+
+ if (num_cam_entries != num_action_entries) {
dev_err(rvu->dev,
"kpm%d: CAM and action entries [%d != %d] not equal\n",
- kpm, profile->cam_entries, profile->action_entries);
+ kpm, num_cam_entries, num_action_entries);
WARN(1, "Fatal error\n");
return;
@@ -536,16 +540,18 @@ npc_program_single_kpm_profile(struct rvu *rvu, int blkaddr,
max_entries = rvu->hw->npc_kpu_entries / 2;
entry = start_entry;
/* Program CAM match entries for previous kpm extracted data */
- num_entries = min_t(int, profile->cam_entries, max_entries);
+ num_entries = min_t(int, num_cam_entries, max_entries);
for (idx = 0; entry < num_entries + start_entry; entry++, idx++)
- npc_config_kpmcam(rvu, blkaddr, &profile->cam[idx],
+ npc_config_kpmcam(rvu, blkaddr,
+ npc_get_kpu_cam_nth_entry(rvu, profile, idx),
kpm, entry);
entry = start_entry;
/* Program this kpm's actions */
- num_entries = min_t(int, profile->action_entries, max_entries);
+ num_entries = min_t(int, num_action_entries, max_entries);
for (idx = 0; entry < num_entries + start_entry; entry++, idx++)
- npc_config_kpmaction(rvu, blkaddr, &profile->action[idx],
+ npc_config_kpmaction(rvu, blkaddr,
+ npc_get_kpu_action_nth_entry(rvu, profile, idx),
kpm, entry, false);
}
@@ -611,20 +617,23 @@ npc_enable_kpm_entry(struct rvu *rvu, int blkaddr, int kpm, int num_entries)
static void npc_program_kpm_profile(struct rvu *rvu, int blkaddr, int num_kpms)
{
const struct npc_kpu_profile *profile1, *profile2;
+ int pfl1_num_cam_entries, pfl2_num_cam_entries;
int idx, total_cam_entries;
for (idx = 0; idx < num_kpms; idx++) {
profile1 = &rvu->kpu.kpu[idx];
+ pfl1_num_cam_entries = npc_get_num_kpu_cam_entries(rvu, profile1);
npc_program_single_kpm_profile(rvu, blkaddr, idx, 0, profile1);
profile2 = &rvu->kpu.kpu[idx + KPU_OFFSET];
+ pfl2_num_cam_entries = npc_get_num_kpu_cam_entries(rvu, profile2);
+
npc_program_single_kpm_profile(rvu, blkaddr, idx,
- profile1->cam_entries,
+ pfl1_num_cam_entries,
profile2);
- total_cam_entries = profile1->cam_entries +
- profile2->cam_entries;
+ total_cam_entries = pfl1_num_cam_entries + pfl2_num_cam_entries;
npc_enable_kpm_entry(rvu, blkaddr, idx, total_cam_entries);
rvu_write64(rvu, blkaddr, NPC_AF_KPMX_PASS2_OFFSET(idx),
- profile1->cam_entries);
+ pfl1_num_cam_entries);
/* Enable the KPUs associated with this KPM */
rvu_write64(rvu, blkaddr, NPC_AF_KPUX_CFG(idx), 0x01);
rvu_write64(rvu, blkaddr, NPC_AF_KPUX_CFG(idx + KPU_OFFSET),
@@ -634,6 +643,7 @@ static void npc_program_kpm_profile(struct rvu *rvu, int blkaddr, int num_kpms)
void npc_cn20k_parser_profile_init(struct rvu *rvu, int blkaddr)
{
+ struct npc_kpu_profile_action *act;
struct rvu_hwinfo *hw = rvu->hw;
int num_pkinds, idx;
@@ -665,9 +675,15 @@ void npc_cn20k_parser_profile_init(struct rvu *rvu, int blkaddr)
num_pkinds = rvu->kpu.pkinds;
num_pkinds = min_t(int, hw->npc_pkinds, num_pkinds);
- for (idx = 0; idx < num_pkinds; idx++)
- npc_config_kpmaction(rvu, blkaddr, &rvu->kpu.ikpu[idx],
+ /* CN20K does not support custom profiles from the filesystem */
+ for (idx = 0; idx < num_pkinds; idx++) {
+ act = npc_get_ikpu_nth_entry(rvu, idx);
+ if (!act)
+ continue;
+
+ npc_config_kpmaction(rvu, blkaddr, act,
0, idx, true);
+ }
/* Program KPM CAM and Action profiles */
npc_program_kpm_profile(rvu, blkaddr, hw->npc_kpms);
@@ -679,7 +695,7 @@ struct npc_priv_t *npc_priv_get(void)
}
static void npc_program_mkex_rx(struct rvu *rvu, int blkaddr,
- struct npc_mcam_kex_extr *mkex_extr,
+ const struct npc_mcam_kex_extr *mkex_extr,
u8 intf)
{
u8 num_extr = rvu->hw->npc_kex_extr;
@@ -708,7 +724,7 @@ static void npc_program_mkex_rx(struct rvu *rvu, int blkaddr,
}
static void npc_program_mkex_tx(struct rvu *rvu, int blkaddr,
- struct npc_mcam_kex_extr *mkex_extr,
+ const struct npc_mcam_kex_extr *mkex_extr,
u8 intf)
{
u8 num_extr = rvu->hw->npc_kex_extr;
@@ -737,7 +753,7 @@ static void npc_program_mkex_tx(struct rvu *rvu, int blkaddr,
}
static void npc_program_mkex_profile(struct rvu *rvu, int blkaddr,
- struct npc_mcam_kex_extr *mkex_extr)
+ const struct npc_mcam_kex_extr *mkex_extr)
{
struct rvu_hwinfo *hw = rvu->hw;
u8 intf;
@@ -1630,8 +1646,8 @@ npc_cn20k_update_action_entries_n_flags(struct rvu *rvu,
int npc_cn20k_apply_custom_kpu(struct rvu *rvu,
struct npc_kpu_profile_adapter *profile)
{
+ const struct npc_cn20k_kpu_profile_fwdata *fw = rvu->kpu_fwdata;
size_t hdr_sz = sizeof(struct npc_cn20k_kpu_profile_fwdata);
- struct npc_cn20k_kpu_profile_fwdata *fw = rvu->kpu_fwdata;
struct npc_kpu_profile_action *action;
struct npc_kpu_profile_cam *cam;
struct npc_kpu_fwdata *fw_kpu;
@@ -1676,8 +1692,15 @@ int npc_cn20k_apply_custom_kpu(struct rvu *rvu,
}
/* Verify if profile fits the HW */
+ if (fw->kpus > rvu->hw->npc_kpus) {
+ dev_warn(rvu->dev, "Not enough KPUs: %d > %d\n", fw->kpus,
+ rvu->hw->npc_kpus);
+ return -EINVAL;
+ }
+
+ /* Check if there is enough memory */
if (fw->kpus > profile->kpus) {
- dev_warn(rvu->dev, "Not enough KPUs: %d > %ld\n", fw->kpus,
+ dev_warn(rvu->dev, "Not enough KPUs: %d > %zu\n", fw->kpus,
profile->kpus);
return -EINVAL;
}
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/npc.h b/drivers/net/ethernet/marvell/octeontx2/af/npc.h
index cefc5d70f3e4..c8c0cb68535c 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/npc.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/npc.h
@@ -265,6 +265,19 @@ struct npc_kpu_profile_cam {
u16 dp2_mask;
} __packed;
+struct npc_kpu_profile_cam2 {
+ u8 state;
+ u8 state_mask;
+ u16 dp0;
+ u16 dp0_mask;
+ u16 dp1;
+ u16 dp1_mask;
+ u16 dp2;
+ u16 dp2_mask;
+ u8 ptype;
+ u8 ptype_mask;
+} __packed;
+
struct npc_kpu_profile_action {
u8 errlev;
u8 errcode;
@@ -290,6 +303,10 @@ struct npc_kpu_profile {
int action_entries;
struct npc_kpu_profile_cam *cam;
struct npc_kpu_profile_action *action;
+ int cam_entries2;
+ int action_entries2;
+ struct npc_kpu_profile_action *action2;
+ struct npc_kpu_profile_cam2 *cam2;
};
/* NPC KPU register formats */
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu.h b/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
index de3fbd3d15d6..be85913a3591 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
@@ -553,17 +553,19 @@ struct npc_kpu_profile_adapter {
const char *name;
u64 version;
const struct npc_lt_def_cfg *lt_def;
- const struct npc_kpu_profile_action *ikpu; /* array[pkinds] */
- const struct npc_kpu_profile *kpu; /* array[kpus] */
+ struct npc_kpu_profile_action *ikpu; /* array[pkinds] */
+ struct npc_kpu_profile_action *ikpu2; /* array[pkinds] */
+ struct npc_kpu_profile *kpu; /* array[kpus] */
union npc_mcam_key_prfl {
- struct npc_mcam_kex *mkex;
+ const struct npc_mcam_kex *mkex;
/* used for cn9k and cn10k */
- struct npc_mcam_kex_extr *mkex_extr; /* used for cn20k */
+ const struct npc_mcam_kex_extr *mkex_extr; /* used for cn20k */
} mcam_kex_prfl;
struct npc_mcam_kex_hash *mkex_hash;
bool custom;
size_t pkinds;
size_t kpus;
+ bool from_fs;
};
#define RVU_SWITCH_LBK_CHAN 63
@@ -634,7 +636,7 @@ struct rvu {
/* Firmware data */
struct rvu_fwdata *fwdata;
- void *kpu_fwdata;
+ const void *kpu_fwdata;
size_t kpu_fwdata_sz;
void __iomem *kpu_prfl_addr;
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
index 5fa9e1c7ae9f..8a90b44872b6 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
@@ -1495,7 +1495,8 @@ void rvu_npc_free_mcam_entries(struct rvu *rvu, u16 pcifunc, int nixlf)
}
static void npc_program_mkex_rx(struct rvu *rvu, int blkaddr,
- struct npc_mcam_kex *mkex, u8 intf)
+ const struct npc_mcam_kex *mkex,
+ u8 intf)
{
int lid, lt, ld, fl;
@@ -1524,7 +1525,8 @@ static void npc_program_mkex_rx(struct rvu *rvu, int blkaddr,
}
static void npc_program_mkex_tx(struct rvu *rvu, int blkaddr,
- struct npc_mcam_kex *mkex, u8 intf)
+ const struct npc_mcam_kex *mkex,
+ u8 intf)
{
int lid, lt, ld, fl;
@@ -1553,7 +1555,7 @@ static void npc_program_mkex_tx(struct rvu *rvu, int blkaddr,
}
static void npc_program_mkex_profile(struct rvu *rvu, int blkaddr,
- struct npc_mcam_kex *mkex)
+ const struct npc_mcam_kex *mkex)
{
struct rvu_hwinfo *hw = rvu->hw;
u8 intf;
@@ -1693,8 +1695,12 @@ static void npc_config_kpucam(struct rvu *rvu, int blkaddr,
const struct npc_kpu_profile_cam *kpucam,
int kpu, int entry)
{
+ const struct npc_kpu_profile_cam2 *kpucam2 = (void *)kpucam;
+ struct npc_kpu_profile_adapter *profile = &rvu->kpu;
struct npc_kpu_cam cam0 = {0};
struct npc_kpu_cam cam1 = {0};
+ u64 *val = (u64 *)&cam1;
+ u64 *mask = (u64 *)&cam0;
cam1.state = kpucam->state & kpucam->state_mask;
cam1.dp0_data = kpucam->dp0 & kpucam->dp0_mask;
@@ -1706,6 +1712,14 @@ static void npc_config_kpucam(struct rvu *rvu, int blkaddr,
cam0.dp1_data = ~kpucam->dp1 & kpucam->dp1_mask;
cam0.dp2_data = ~kpucam->dp2 & kpucam->dp2_mask;
+ if (profile->from_fs) {
+ u8 ptype = kpucam2->ptype;
+ u8 pmask = kpucam2->ptype_mask;
+
+ *val |= FIELD_PREP(GENMASK_ULL(57, 56), ptype & pmask);
+ *mask |= FIELD_PREP(GENMASK_ULL(57, 56), ~ptype & pmask);
+ }
+
rvu_write64(rvu, blkaddr,
NPC_AF_KPUX_ENTRYX_CAMX(kpu, entry, 0), *(u64 *)&cam0);
rvu_write64(rvu, blkaddr,
@@ -1717,34 +1731,104 @@ u64 npc_enable_mask(int count)
return (((count) < 64) ? ~(BIT_ULL(count) - 1) : (0x00ULL));
}
+struct npc_kpu_profile_action *
+npc_get_ikpu_nth_entry(struct rvu *rvu, int n)
+{
+ struct npc_kpu_profile_adapter *profile = &rvu->kpu;
+
+ if (profile->from_fs)
+ return &profile->ikpu2[n];
+
+ return &profile->ikpu[n];
+}
+
+int
+npc_get_num_kpu_cam_entries(struct rvu *rvu,
+ const struct npc_kpu_profile *kpu_pfl)
+{
+ struct npc_kpu_profile_adapter *profile = &rvu->kpu;
+
+ if (profile->from_fs)
+ return kpu_pfl->cam_entries2;
+
+ return kpu_pfl->cam_entries;
+}
+
+struct npc_kpu_profile_cam *
+npc_get_kpu_cam_nth_entry(struct rvu *rvu,
+ const struct npc_kpu_profile *kpu_pfl, int n)
+{
+ struct npc_kpu_profile_adapter *profile = &rvu->kpu;
+
+ if (profile->from_fs)
+ return (void *)&kpu_pfl->cam2[n];
+
+ return (void *)&kpu_pfl->cam[n];
+}
+
+int
+npc_get_num_kpu_action_entries(struct rvu *rvu,
+ const struct npc_kpu_profile *kpu_pfl)
+{
+ struct npc_kpu_profile_adapter *profile = &rvu->kpu;
+
+ if (profile->from_fs)
+ return kpu_pfl->action_entries2;
+
+ return kpu_pfl->action_entries;
+}
+
+struct npc_kpu_profile_action *
+npc_get_kpu_action_nth_entry(struct rvu *rvu,
+ const struct npc_kpu_profile *kpu_pfl,
+ int n)
+{
+ struct npc_kpu_profile_adapter *profile = &rvu->kpu;
+
+ if (profile->from_fs)
+ return (void *)&kpu_pfl->action2[n];
+
+ return (void *)&kpu_pfl->action[n];
+}
+
static void npc_program_kpu_profile(struct rvu *rvu, int blkaddr, int kpu,
const struct npc_kpu_profile *profile)
{
+ int num_cam_entries, num_action_entries;
int entry, num_entries, max_entries;
u64 entry_mask;
- if (profile->cam_entries != profile->action_entries) {
+ num_cam_entries = npc_get_num_kpu_cam_entries(rvu, profile);
+ num_action_entries = npc_get_num_kpu_action_entries(rvu, profile);
+
+ if (num_cam_entries != num_action_entries) {
dev_err(rvu->dev,
"KPU%d: CAM and action entries [%d != %d] not equal\n",
- kpu, profile->cam_entries, profile->action_entries);
+ kpu, num_cam_entries, num_action_entries);
}
max_entries = rvu->hw->npc_kpu_entries;
+ WARN(num_cam_entries > max_entries,
+ "KPU%u: err: hw max entries=%u, input entries=%u\n",
+ kpu, rvu->hw->npc_kpu_entries, num_cam_entries);
+
/* Program CAM match entries for previous KPU extracted data */
- num_entries = min_t(int, profile->cam_entries, max_entries);
+ num_entries = min_t(int, num_cam_entries, max_entries);
for (entry = 0; entry < num_entries; entry++)
npc_config_kpucam(rvu, blkaddr,
- &profile->cam[entry], kpu, entry);
+ (void *)npc_get_kpu_cam_nth_entry(rvu, profile, entry),
+ kpu, entry);
/* Program this KPU's actions */
- num_entries = min_t(int, profile->action_entries, max_entries);
+ num_entries = min_t(int, num_action_entries, max_entries);
for (entry = 0; entry < num_entries; entry++)
- npc_config_kpuaction(rvu, blkaddr, &profile->action[entry],
+ npc_config_kpuaction(rvu, blkaddr,
+ (void *)npc_get_kpu_action_nth_entry(rvu, profile, entry),
kpu, entry, false);
/* Enable all programmed entries */
- num_entries = min_t(int, profile->action_entries, profile->cam_entries);
+ num_entries = min_t(int, num_action_entries, num_cam_entries);
entry_mask = npc_enable_mask(num_entries);
/* Disable first KPU_MAX_CST_ENT entries for built-in profile */
if (!rvu->kpu.custom)
@@ -1788,26 +1872,175 @@ static void npc_prepare_default_kpu(struct rvu *rvu,
npc_cn20k_update_action_entries_n_flags(rvu, profile);
}
-static int npc_apply_custom_kpu(struct rvu *rvu,
- struct npc_kpu_profile_adapter *profile)
+static int npc_alloc_kpu_cam2_n_action2(struct rvu *rvu, int kpu_num,
+ int num_entries)
+{
+ struct npc_kpu_profile_adapter *adapter = &rvu->kpu;
+ struct npc_kpu_profile *kpu;
+
+ kpu = &adapter->kpu[kpu_num];
+
+ kpu->cam2 = devm_kcalloc(rvu->dev, num_entries,
+ sizeof(*kpu->cam2), GFP_KERNEL);
+ if (!kpu->cam2)
+ return -ENOMEM;
+
+ kpu->action2 = devm_kcalloc(rvu->dev, num_entries,
+ sizeof(*kpu->action2), GFP_KERNEL);
+ if (!kpu->action2)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static int npc_apply_custom_kpu_from_fw(struct rvu *rvu,
+ struct npc_kpu_profile_adapter *profile)
{
size_t hdr_sz = sizeof(struct npc_kpu_profile_fwdata), offset = 0;
+ const struct npc_kpu_profile_fwdata *fw;
struct npc_kpu_profile_action *action;
- struct npc_kpu_profile_fwdata *fw;
struct npc_kpu_profile_cam *cam;
struct npc_kpu_fwdata *fw_kpu;
- int entries;
- u16 kpu, entry;
+ int entries, entry, kpu;
- if (is_cn20k(rvu->pdev))
- return npc_cn20k_apply_custom_kpu(rvu, profile);
+ fw = rvu->kpu_fwdata;
+
+ for (kpu = 0; kpu < fw->kpus; kpu++) {
+ if (rvu->kpu_fwdata_sz < hdr_sz + offset) {
+ dev_warn(rvu->dev,
+ "Profile size mismatch on KPU%i parsing\n",
+ kpu + 1);
+ return -EINVAL;
+ }
+
+ fw_kpu = (struct npc_kpu_fwdata *)(fw->data + offset);
+ if (fw_kpu->entries < 0) {
+ dev_warn(rvu->dev,
+ "Profile entries is negative on KPU%i parsing\n",
+ kpu + 1);
+ return -EINVAL;
+ }
+
+ if (fw_kpu->entries > KPU_MAX_CST_ENT)
+ dev_warn(rvu->dev,
+ "Too many custom entries on KPU%d: %d > %d\n",
+ kpu, fw_kpu->entries, KPU_MAX_CST_ENT);
+ entries = min_t(int, fw_kpu->entries, KPU_MAX_CST_ENT);
+ cam = (struct npc_kpu_profile_cam *)fw_kpu->data;
+ offset += sizeof(*fw_kpu) + fw_kpu->entries * sizeof(*cam);
+ action = (struct npc_kpu_profile_action *)(fw->data + offset);
+ offset += fw_kpu->entries * sizeof(*action);
+ if (rvu->kpu_fwdata_sz < hdr_sz + offset) {
+ dev_warn(rvu->dev,
+ "Profile size mismatch on KPU%i parsing.\n",
+ kpu + 1);
+ return -EINVAL;
+ }
+ for (entry = 0; entry < entries; entry++) {
+ profile->kpu[kpu].cam[entry] = cam[entry];
+ profile->kpu[kpu].action[entry] = action[entry];
+ }
+ }
+
+ return 0;
+}
+
+static int npc_apply_custom_kpu_from_fs(struct rvu *rvu,
+ struct npc_kpu_profile_adapter *profile)
+{
+ size_t hdr_sz = sizeof(struct npc_kpu_profile_fwdata), offset = 0;
+ const struct npc_kpu_profile_fwdata *fw;
+ struct npc_kpu_profile_action *action;
+ struct npc_kpu_profile_cam2 *cam2;
+ struct npc_kpu_fwdata *fw_kpu;
+ int entries, ret, entry, kpu;
fw = rvu->kpu_fwdata;
+ /* Binary blob contains ikpu action entries at start of data[0] */
+ profile->ikpu2 = devm_kcalloc(rvu->dev, 1,
+ sizeof(ikpu_action_entries),
+ GFP_KERNEL);
+ if (!profile->ikpu2)
+ return -ENOMEM;
+
+ action = (struct npc_kpu_profile_action *)(fw->data + offset);
+
+ if (rvu->kpu_fwdata_sz < hdr_sz + sizeof(ikpu_action_entries))
+ return -EINVAL;
+
+ /* The firmware layout depends on the internal size of
+ * ikpu_action_entries.
+ */
+ memcpy((void *)profile->ikpu2, action, sizeof(ikpu_action_entries));
+ offset += sizeof(ikpu_action_entries);
+
+ for (kpu = 0; kpu < fw->kpus; kpu++) {
+ if (rvu->kpu_fwdata_sz < hdr_sz + offset + sizeof(*fw_kpu)) {
+ dev_warn(rvu->dev,
+ "profile size mismatch on kpu%i parsing\n",
+ kpu + 1);
+ return -EINVAL;
+ }
+
+ fw_kpu = (struct npc_kpu_fwdata *)(fw->data + offset);
+ if (fw_kpu->entries <= 0) {
+ dev_warn(rvu->dev,
+ "Invalid kpu entries on KPU%d\n", kpu);
+ return -EINVAL;
+ }
+
+ entries = min_t(int, fw_kpu->entries, rvu->hw->npc_kpu_entries);
+ dev_info(rvu->dev,
+ "Loading %u entries on KPU%d\n", entries, kpu);
+
+ cam2 = (struct npc_kpu_profile_cam2 *)fw_kpu->data;
+ offset += sizeof(*fw_kpu) + fw_kpu->entries * sizeof(*cam2);
+ action = (struct npc_kpu_profile_action *)(fw->data + offset);
+ offset += fw_kpu->entries * sizeof(*action);
+ if (rvu->kpu_fwdata_sz < hdr_sz + offset) {
+ dev_warn(rvu->dev,
+ "profile size mismatch on kpu%i parsing.\n",
+ kpu + 1);
+ return -EINVAL;
+ }
+
+ profile->kpu[kpu].cam_entries2 = entries;
+ profile->kpu[kpu].action_entries2 = entries;
+ ret = npc_alloc_kpu_cam2_n_action2(rvu, kpu, entries);
+ if (ret) {
+ dev_warn(rvu->dev,
+ "profile entry allocation failed for kpu=%d for %d entries\n",
+ kpu, entries);
+ return ret;
+ }
+
+ for (entry = 0; entry < entries; entry++) {
+ profile->kpu[kpu].cam2[entry] = cam2[entry];
+ profile->kpu[kpu].action2[entry] = action[entry];
+ }
+ }
+
+ return 0;
+}
+
+static int npc_apply_custom_kpu(struct rvu *rvu,
+ struct npc_kpu_profile_adapter *profile,
+ bool from_fs, int *fw_kpus)
+{
+ size_t hdr_sz = sizeof(struct npc_kpu_profile_fwdata);
+ const struct npc_kpu_profile_fwdata *fw;
+ struct npc_kpu_profile_fwdata *sfw;
+
+ if (is_cn20k(rvu->pdev))
+ return npc_cn20k_apply_custom_kpu(rvu, profile);
+
if (rvu->kpu_fwdata_sz < hdr_sz) {
dev_warn(rvu->dev, "Invalid KPU profile size\n");
return -EINVAL;
}
+
+ fw = rvu->kpu_fwdata;
if (le64_to_cpu(fw->signature) != KPU_SIGN) {
dev_warn(rvu->dev, "Invalid KPU profile signature %llx\n",
fw->signature);
@@ -1835,42 +2068,38 @@ static int npc_apply_custom_kpu(struct rvu *rvu,
return -EINVAL;
}
/* Verify if profile fits the HW */
+ if (fw->kpus > rvu->hw->npc_kpus) {
+ dev_warn(rvu->dev, "Not enough KPUs: %d > %d\n", fw->kpus,
+ rvu->hw->npc_kpus);
+ return -EINVAL;
+ }
+
+ /* Check if there is enough memory for fw loading.
+ * Check if there are enough entries for profile->kpu[] to
+ * set cam_entries2 and action_entries2
+ */
if (fw->kpus > profile->kpus) {
- dev_warn(rvu->dev, "Not enough KPUs: %d > %ld\n", fw->kpus,
+ dev_warn(rvu->dev, "Not enough KPUs: %d > %zu\n", fw->kpus,
profile->kpus);
return -EINVAL;
}
+ *fw_kpus = fw->kpus;
+
+ sfw = devm_kcalloc(rvu->dev, 1, sizeof(*sfw), GFP_KERNEL);
+ if (!sfw)
+ return -ENOMEM;
+
+ memcpy(sfw, fw, sizeof(*sfw));
+
profile->custom = 1;
- profile->name = fw->name;
+ profile->name = sfw->name;
profile->version = le64_to_cpu(fw->version);
- profile->mcam_kex_prfl.mkex = &fw->mkex;
- profile->lt_def = &fw->lt_def;
-
- for (kpu = 0; kpu < fw->kpus; kpu++) {
- fw_kpu = (struct npc_kpu_fwdata *)(fw->data + offset);
- if (fw_kpu->entries > KPU_MAX_CST_ENT)
- dev_warn(rvu->dev,
- "Too many custom entries on KPU%d: %d > %d\n",
- kpu, fw_kpu->entries, KPU_MAX_CST_ENT);
- entries = min(fw_kpu->entries, KPU_MAX_CST_ENT);
- cam = (struct npc_kpu_profile_cam *)fw_kpu->data;
- offset += sizeof(*fw_kpu) + fw_kpu->entries * sizeof(*cam);
- action = (struct npc_kpu_profile_action *)(fw->data + offset);
- offset += fw_kpu->entries * sizeof(*action);
- if (rvu->kpu_fwdata_sz < hdr_sz + offset) {
- dev_warn(rvu->dev,
- "Profile size mismatch on KPU%i parsing.\n",
- kpu + 1);
- return -EINVAL;
- }
- for (entry = 0; entry < entries; entry++) {
- profile->kpu[kpu].cam[entry] = cam[entry];
- profile->kpu[kpu].action[entry] = action[entry];
- }
- }
+ profile->mcam_kex_prfl.mkex = &sfw->mkex;
+ profile->lt_def = &sfw->lt_def;
- return 0;
+ return from_fs ? npc_apply_custom_kpu_from_fs(rvu, profile) :
+ npc_apply_custom_kpu_from_fw(rvu, profile);
}
static int npc_load_kpu_prfl_img(struct rvu *rvu, void __iomem *prfl_addr,
@@ -1958,45 +2187,19 @@ static int npc_load_kpu_profile_fwdb(struct rvu *rvu, const char *kpu_profile)
return ret;
}
-void npc_load_kpu_profile(struct rvu *rvu)
+static int npc_load_kpu_profile_from_fw(struct rvu *rvu)
{
struct npc_kpu_profile_adapter *profile = &rvu->kpu;
const char *kpu_profile = rvu->kpu_pfl_name;
- const struct firmware *fw = NULL;
- bool retry_fwdb = false;
-
- /* If user not specified profile customization */
- if (!strncmp(kpu_profile, def_pfl_name, KPU_NAME_LEN))
- goto revert_to_default;
- /* First prepare default KPU, then we'll customize top entries. */
- npc_prepare_default_kpu(rvu, profile);
-
- /* Order of preceedence for load loading NPC profile (high to low)
- * Firmware binary in filesystem.
- * Firmware database method.
- * Default KPU profile.
- */
- if (!request_firmware_direct(&fw, kpu_profile, rvu->dev)) {
- dev_info(rvu->dev, "Loading KPU profile from firmware: %s\n",
- kpu_profile);
- rvu->kpu_fwdata = kzalloc(fw->size, GFP_KERNEL);
- if (rvu->kpu_fwdata) {
- memcpy(rvu->kpu_fwdata, fw->data, fw->size);
- rvu->kpu_fwdata_sz = fw->size;
- }
- release_firmware(fw);
- retry_fwdb = true;
- goto program_kpu;
- }
+ int fw_kpus = 0;
-load_image_fwdb:
/* Loading the KPU profile using firmware database */
if (npc_load_kpu_profile_fwdb(rvu, kpu_profile))
- goto revert_to_default;
+ return -EFAULT;
-program_kpu:
/* Apply profile customization if firmware was loaded. */
- if (!rvu->kpu_fwdata_sz || npc_apply_custom_kpu(rvu, profile)) {
+ if (!rvu->kpu_fwdata_sz ||
+ npc_apply_custom_kpu(rvu, profile, false, &fw_kpus)) {
/* If image from firmware filesystem fails to load or invalid
* retry with firmware database method.
*/
@@ -2010,10 +2213,6 @@ void npc_load_kpu_profile(struct rvu *rvu)
}
rvu->kpu_fwdata = NULL;
rvu->kpu_fwdata_sz = 0;
- if (retry_fwdb) {
- retry_fwdb = false;
- goto load_image_fwdb;
- }
}
dev_warn(rvu->dev,
@@ -2021,22 +2220,101 @@ void npc_load_kpu_profile(struct rvu *rvu)
kpu_profile);
kfree(rvu->kpu_fwdata);
rvu->kpu_fwdata = NULL;
- goto revert_to_default;
+ return -EFAULT;
}
- dev_info(rvu->dev, "Using custom profile '%s', version %d.%d.%d\n",
+ dev_info(rvu->dev, "Using custom profile '%.32s', version %d.%d.%d\n",
profile->name, NPC_KPU_VER_MAJ(profile->version),
NPC_KPU_VER_MIN(profile->version),
NPC_KPU_VER_PATCH(profile->version));
- return;
+ return 0;
+}
+
+static int npc_load_kpu_profile_from_fs(struct rvu *rvu)
+{
+ struct npc_kpu_profile_adapter *profile = &rvu->kpu;
+ const char *kpu_profile = rvu->kpu_pfl_name;
+ const struct firmware *fw = NULL;
+ int ret, fw_kpus = 0;
+ char path[512] = "kpu/";
+
+ if (strlen(kpu_profile) > sizeof(path) - strlen("kpu/") - 1) {
+ dev_err(rvu->dev, "kpu profile name is too big\n");
+ return -ENOSPC;
+ }
+
+ strcat(path, kpu_profile);
+
+ if (request_firmware_direct(&fw, path, rvu->dev))
+ return -ENOENT;
+
+ dev_info(rvu->dev, "Loading KPU profile from filesystem: %s\n",
+ path);
+
+ rvu->kpu_fwdata = fw->data;
+ rvu->kpu_fwdata_sz = fw->size;
+
+ ret = npc_apply_custom_kpu(rvu, profile, true, &fw_kpus);
+ release_firmware(fw);
+ rvu->kpu_fwdata = NULL;
+
+ if (ret) {
+ rvu->kpu_fwdata_sz = 0;
+ dev_err(rvu->dev,
+ "Loading KPU profile from filesystem failed\n");
+ return ret;
+ }
+
+ /* In firmware loading from filesystem method, all entries are from
+ * same binary blob.
+ */
+ rvu->kpu.kpus = fw_kpus;
+ profile->kpus = fw_kpus;
+ profile->from_fs = true;
+ return 0;
+}
+
+void npc_load_kpu_profile(struct rvu *rvu)
+{
+ struct npc_kpu_profile_adapter *profile = &rvu->kpu;
+ const char *kpu_profile = rvu->kpu_pfl_name;
+
+ profile->from_fs = false;
+
+ npc_prepare_default_kpu(rvu, profile);
+
+ /* If user has not specified profile customization */
+ if (!strncmp(kpu_profile, def_pfl_name, KPU_NAME_LEN))
+ return;
+
+ /* Order of precedence for loading NPC profile (high to low)
+ * Firmware binary in filesystem.
+ * Firmware database method.
+ * Default KPU profile.
+ */
+
+ /* Filesystem-based KPU loading is not supported on cn20k.
+ * npc_prepare_default_kpu() was invoked earlier, but control
+ * reached this point because the default profile was not selected.
+ * No need to call it again.
+ */
+ if (!is_cn20k(rvu->pdev)) {
+ if (!npc_load_kpu_profile_from_fs(rvu))
+ return;
+ }
+
+ /* First prepare default KPU, then we'll customize top entries. */
+ npc_prepare_default_kpu(rvu, profile);
+ if (!npc_load_kpu_profile_from_fw(rvu))
+ return;
-revert_to_default:
npc_prepare_default_kpu(rvu, profile);
}
static void npc_parser_profile_init(struct rvu *rvu, int blkaddr)
{
+ struct npc_kpu_profile_adapter *profile = &rvu->kpu;
struct rvu_hwinfo *hw = rvu->hw;
int num_pkinds, num_kpus, idx;
@@ -2060,7 +2338,9 @@ static void npc_parser_profile_init(struct rvu *rvu, int blkaddr)
num_pkinds = min_t(int, hw->npc_pkinds, num_pkinds);
for (idx = 0; idx < num_pkinds; idx++)
- npc_config_kpuaction(rvu, blkaddr, &rvu->kpu.ikpu[idx], 0, idx, true);
+ npc_config_kpuaction(rvu, blkaddr,
+ npc_get_ikpu_nth_entry(rvu, idx),
+ 0, idx, true);
/* Program KPU CAM and Action profiles */
num_kpus = rvu->kpu.kpus;
@@ -2068,6 +2348,11 @@ static void npc_parser_profile_init(struct rvu *rvu, int blkaddr)
for (idx = 0; idx < num_kpus; idx++)
npc_program_kpu_profile(rvu, blkaddr, idx, &rvu->kpu.kpu[idx]);
+
+ if (profile->from_fs) {
+ rvu_write64(rvu, blkaddr, NPC_AF_PKINDX_TYPE(54), 0x03);
+ rvu_write64(rvu, blkaddr, NPC_AF_PKINDX_TYPE(58), 0x03);
+ }
}
void npc_mcam_rsrcs_deinit(struct rvu *rvu)
@@ -2297,18 +2582,21 @@ static void rvu_npc_hw_init(struct rvu *rvu, int blkaddr)
static void rvu_npc_setup_interfaces(struct rvu *rvu, int blkaddr)
{
- struct npc_mcam_kex_extr *mkex_extr = rvu->kpu.mcam_kex_prfl.mkex_extr;
- struct npc_mcam_kex *mkex = rvu->kpu.mcam_kex_prfl.mkex;
+ const struct npc_mcam_kex_extr *mkex_extr;
struct npc_mcam *mcam = &rvu->hw->mcam;
struct rvu_hwinfo *hw = rvu->hw;
+ const struct npc_mcam_kex *mkex;
u64 nibble_ena, rx_kex, tx_kex;
u64 *keyx_cfg, reg;
u8 intf;
+ mkex_extr = rvu->kpu.mcam_kex_prfl.mkex_extr;
+ mkex = rvu->kpu.mcam_kex_prfl.mkex;
+
if (is_cn20k(rvu->pdev)) {
- keyx_cfg = mkex_extr->keyx_cfg;
+ keyx_cfg = (u64 *)mkex_extr->keyx_cfg;
} else {
- keyx_cfg = mkex->keyx_cfg;
+ keyx_cfg = (u64 *)mkex->keyx_cfg;
/* Reserve last counter for MCAM RX miss action which is set to
* drop packet. This way we will know how many pkts didn't
* match any MCAM entry.
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.h b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.h
index 83c5e32e2afc..662f6693cfe9 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.h
@@ -18,4 +18,21 @@ int npc_fwdb_prfl_img_map(struct rvu *rvu, void __iomem **prfl_img_addr,
void npc_mcam_clear_bit(struct npc_mcam *mcam, u16 index);
void npc_mcam_set_bit(struct npc_mcam *mcam, u16 index);
+
+struct npc_kpu_profile_action *
+npc_get_ikpu_nth_entry(struct rvu *rvu, int n);
+
+int
+npc_get_num_kpu_cam_entries(struct rvu *rvu,
+ const struct npc_kpu_profile *kpu_pfl);
+struct npc_kpu_profile_cam *
+npc_get_kpu_cam_nth_entry(struct rvu *rvu,
+ const struct npc_kpu_profile *kpu_pfl, int n);
+
+int
+npc_get_num_kpu_action_entries(struct rvu *rvu,
+ const struct npc_kpu_profile *kpu_pfl);
+struct npc_kpu_profile_action *
+npc_get_kpu_action_nth_entry(struct rvu *rvu,
+ const struct npc_kpu_profile *kpu_pfl, int n);
#endif /* RVU_NPC_H */
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_reg.h b/drivers/net/ethernet/marvell/octeontx2/af/rvu_reg.h
index 62cdc714ba57..ab89b8c6e490 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_reg.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_reg.h
@@ -596,6 +596,7 @@
#define NPC_AF_INTFX_KEX_CFG(a) (0x01010 | (a) << 8)
#define NPC_AF_PKINDX_ACTION0(a) (0x80000ull | (a) << 6)
#define NPC_AF_PKINDX_ACTION1(a) (0x80008ull | (a) << 6)
+#define NPC_AF_PKINDX_TYPE(a) (0x80010ull | (a) << 6)
#define NPC_AF_PKINDX_CPI_DEFX(a, b) (0x80020ull | (a) << 6 | (b) << 3)
#define NPC_AF_KPUX_ENTRYX_CAMX(a, b, c) \
(0x100000 | (a) << 14 | (b) << 6 | (c) << 3)
--
2.43.0
* [PATCH v18 net-next 7/8] octeontx2: cn20k: Respect NPC MCAM X2/X4 profile in flows and DFT alloc
2026-06-02 6:03 [PATCH v18 net-next 0/8] octeontx2-af: npc: Enhancements Ratheesh Kannoth
` (5 preceding siblings ...)
2026-06-02 6:03 ` [PATCH v18 net-next 6/8] octeontx2-af: npc: Support for custom KPU profile from filesystem Ratheesh Kannoth
@ 2026-06-02 6:03 ` Ratheesh Kannoth
2026-06-03 6:54 ` Ratheesh Kannoth
2026-06-04 3:16 ` Ratheesh Kannoth
2026-06-02 6:03 ` [PATCH v18 net-next 8/8] octeontx2-af: npc: cn20k: Allocate npc_priv and dstats dynamically Ratheesh Kannoth
7 siblings, 2 replies; 21+ messages in thread
From: Ratheesh Kannoth @ 2026-06-02 6:03 UTC (permalink / raw)
To: linux-kernel, netdev
Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
pabeni, sgoutham, Ratheesh Kannoth
Default CN20K NPC rule allocation now keys off the active MCAM keyword
width: use X4 with a bank-masked reference index when the silicon uses
X4 keys, and X2 with the raw index otherwise (replacing the previous
always-X2 / eidx + 1 behaviour).
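The reference-index choice described above can be sketched in standalone C
as follows. This is only an illustration, not the driver code: the enum
values, the struct and the helper name are assumptions made for this
example, and bank_depth is assumed to be a power of two so that the mask
acts as a modulo.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's key-width constants. */
enum sketch_kw_type { SKETCH_KEY_X2 = 2, SKETCH_KEY_X4 = 4 };

/* Hypothetical subset of the allocation request fields. */
struct sketch_dft_req {
	uint8_t kw_type;
	uint16_t ref_entry;
};

/* Pick the key width and reference index for a default-rule allocation. */
static struct sketch_dft_req
sketch_dft_ref(uint8_t active_kw, uint16_t eidx, uint16_t bank_depth)
{
	struct sketch_dft_req req;

	/* The mask below is only a modulo if bank_depth is a power of two. */
	assert(bank_depth && !(bank_depth & (bank_depth - 1)));

	if (active_kw == SKETCH_KEY_X4) {
		/* X4: reduce the index to an offset within one bank. */
		req.kw_type = SKETCH_KEY_X4;
		req.ref_entry = eidx & (bank_depth - 1);
	} else {
		/* X2: the raw index is used unchanged. */
		req.kw_type = SKETCH_KEY_X2;
		req.ref_entry = eidx;
	}
	return req;
}
```

With an assumed bank_depth of 4096, an eidx of 5000 would map to
reference index 904 in the X4 case and stay 5000 in the X2 case.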
In the AF flow-install path, flows that need more than 256 key bits
query the NPC profile; if the platform is fixed to X2 entries, fail
with -EOPNOTSUPP instead of requesting X4. Otherwise select X4 for the
MCAM alloc.
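A minimal standalone sketch of that selection rule, assuming 64-bit key
words as in the AF code (kws * 64). The enum and the function name are
hypothetical and exist only for this example; the real path obtains the
profile key width through the npc_get_pfl_info handler.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for NPC_MCAM_KEY_X2 / NPC_MCAM_KEY_X4. */
enum sketch_kw { SKETCH_KW_X2 = 2, SKETCH_KW_X4 = 4 };

/*
 * Choose the MCAM key width for a flow that needs 'kws' 64-bit key words.
 * 'profile_kw' is the width the platform profile is fixed to.
 * Returns 0 on success, -EOPNOTSUPP if the flow needs X4 on an X2 profile.
 */
static int sketch_pick_kw(unsigned int kws, uint8_t profile_kw,
			  uint8_t *kw_type)
{
	unsigned int kw_bits = kws * 64;

	*kw_type = SKETCH_KW_X2;
	if (kw_bits > 256) {
		if (profile_kw == SKETCH_KW_X2)
			return -EOPNOTSUPP;
		*kw_type = SKETCH_KW_X4;
	}
	return 0;
}
```

A flow of four key words (256 bits) stays on X2 in this sketch; five key
words select X4 on an X4 profile and are rejected on an X2 profile.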
On the PF, cache and pass the profile kw_type from npc_get_pfl_info
through otx2_mcam_pfl_info_get(), and use it when allocating MCAM
entries for RSS/defaults and when installing ethtool flows on CN20K,
including masking the reference index for X4 slot layout.
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
---
.../ethernet/marvell/octeontx2/af/cn20k/npc.c | 21 ++++++--
.../marvell/octeontx2/af/rvu_npc_fs.c | 12 ++++-
.../marvell/octeontx2/nic/otx2_flows.c | 48 +++++++++++++------
3 files changed, 61 insertions(+), 20 deletions(-)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
index 513e68711962..ae4683e0405d 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
@@ -4500,10 +4500,16 @@ int npc_cn20k_dft_rules_alloc(struct rvu *rvu, u16 pcifunc)
pfvf = rvu_get_pfvf(rvu, pcifunc);
pfvf->hw_prio = NPC_DFT_RULE_PRIO;
+ if (npc_priv.kw == NPC_MCAM_KEY_X4) {
+ req.kw_type = NPC_MCAM_KEY_X4;
+ req.ref_entry = eidx & (npc_priv.bank_depth - 1);
+ } else {
+ req.kw_type = NPC_MCAM_KEY_X2;
+ req.ref_entry = eidx;
+ }
+
req.contig = false;
req.ref_prio = NPC_MCAM_HIGHER_PRIO;
- req.ref_entry = eidx;
- req.kw_type = NPC_MCAM_KEY_X2;
req.count = cnt;
req.hdr.pcifunc = pcifunc;
@@ -4533,11 +4539,18 @@ int npc_cn20k_dft_rules_alloc(struct rvu *rvu, u16 pcifunc)
* as NPC_DFT_RULE_PRIO - 1 (higher hw priority)
*/
req.contig = false;
- req.kw_type = NPC_MCAM_KEY_X2;
req.count = cnt;
req.hdr.pcifunc = pcifunc;
req.ref_prio = NPC_MCAM_LOWER_PRIO;
- req.ref_entry = eidx + 1;
+
+ if (npc_priv.kw == NPC_MCAM_KEY_X4) {
+ req.kw_type = NPC_MCAM_KEY_X4;
+ req.ref_entry = eidx & (npc_priv.bank_depth - 1);
+ } else {
+ req.kw_type = NPC_MCAM_KEY_X2;
+ req.ref_entry = eidx;
+ }
+
ret = rvu_mbox_handler_npc_mcam_alloc_entry(rvu, &req, &rsp);
if (ret) {
dev_err(rvu->dev,
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_fs.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_fs.c
index 6ae9cdcb608b..d20eb0e47d7d 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_fs.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_fs.c
@@ -1671,9 +1671,11 @@ rvu_npc_alloc_entry_for_flow_install(struct rvu *rvu,
{
struct npc_mcam_alloc_entry_req entry_req;
struct npc_mcam_alloc_entry_rsp entry_rsp;
+ struct npc_get_pfl_info_rsp rsp = { 0 };
struct npc_get_num_kws_req kws_req;
struct npc_get_num_kws_rsp kws_rsp;
int off, kw_bits, rc;
+ struct msg_req req;
u8 *src, *dst;
if (!is_cn20k(rvu->pdev)) {
@@ -1697,8 +1699,16 @@ rvu_npc_alloc_entry_for_flow_install(struct rvu *rvu,
kw_bits = kws_rsp.kws * 64;
*kw_type = NPC_MCAM_KEY_X2;
- if (kw_bits > 256)
+ if (kw_bits > 256) {
+ rvu_mbox_handler_npc_get_pfl_info(rvu, &req, &rsp);
+ if (rsp.kw_type == NPC_MCAM_KEY_X2) {
+ dev_err(rvu->dev,
+ "Only X2 entries are supported in X2 profile\n");
+ return -EOPNOTSUPP;
+ }
+
*kw_type = NPC_MCAM_KEY_X4;
+ }
memset(&entry_req, 0, sizeof(entry_req));
memset(&entry_rsp, 0, sizeof(entry_rsp));
diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c
index 38cc539d724d..5dd0591fed99 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c
@@ -37,14 +37,13 @@ static void otx2_clear_ntuple_flow_info(struct otx2_nic *pfvf, struct otx2_flow_
flow_cfg->max_flows = 0;
}
-static int otx2_mcam_pfl_info_get(struct otx2_nic *pfvf, bool *is_x2,
- u16 *x4_slots)
+static int otx2_mcam_pfl_info_get(struct otx2_nic *pfvf, u16 *x4_slots, u8 *kw_type)
{
struct npc_get_pfl_info_rsp *rsp;
struct msg_req *req;
static struct {
bool is_set;
- bool is_x2;
+ u8 kw_type;
u16 x4_slots;
} pfl_info;
@@ -53,8 +52,8 @@ static int otx2_mcam_pfl_info_get(struct otx2_nic *pfvf, bool *is_x2,
*/
mutex_lock(&pfvf->mbox.lock);
if (pfl_info.is_set) {
- *is_x2 = pfl_info.is_x2;
*x4_slots = pfl_info.x4_slots;
+ *kw_type = pfl_info.kw_type;
mutex_unlock(&pfvf->mbox.lock);
return 0;
}
@@ -79,16 +78,16 @@ static int otx2_mcam_pfl_info_get(struct otx2_nic *pfvf, bool *is_x2,
return -EFAULT;
}
- *is_x2 = (rsp->kw_type == NPC_MCAM_KEY_X2);
- if (*is_x2)
- *x4_slots = 0;
+ pfl_info.kw_type = rsp->kw_type;
+ if (rsp->kw_type == NPC_MCAM_KEY_X2)
+ pfl_info.x4_slots = 0;
else
- *x4_slots = rsp->x4_slots;
-
- pfl_info.is_x2 = *is_x2;
- pfl_info.x4_slots = *x4_slots;
+ pfl_info.x4_slots = rsp->x4_slots;
pfl_info.is_set = true;
+ *x4_slots = pfl_info.x4_slots;
+ *kw_type = pfl_info.kw_type;
+
mutex_unlock(&pfvf->mbox.lock);
return 0;
}
@@ -164,6 +163,7 @@ int otx2_alloc_mcam_entries(struct otx2_nic *pfvf, u16 count)
u16 dft_idx = 0, x4_slots = 0;
int ent, allocated = 0, ref;
bool is_x2 = false;
+ u8 kw_type = 0;
int rc;
/* Free current ones and allocate new ones with requested count */
@@ -182,12 +182,14 @@ int otx2_alloc_mcam_entries(struct otx2_nic *pfvf, u16 count)
}
if (is_cn20k(pfvf->pdev)) {
- rc = otx2_mcam_pfl_info_get(pfvf, &is_x2, &x4_slots);
+ rc = otx2_mcam_pfl_info_get(pfvf, &x4_slots, &kw_type);
if (rc) {
netdev_err(pfvf->netdev, "Error to retrieve profile info\n");
return rc;
}
+ is_x2 = kw_type == NPC_MCAM_KEY_X2;
+
rc = otx2_get_dft_rl_idx(pfvf, &dft_idx);
if (rc) {
netdev_err(pfvf->netdev,
@@ -289,6 +291,8 @@ int otx2_mcam_entry_init(struct otx2_nic *pfvf)
struct npc_mcam_alloc_entry_rsp *rsp;
int vf_vlan_max_flows, count;
int rc, ref, prio, ent;
+ u8 kw_type = 0;
+ u16 x4_slots;
u16 dft_idx;
ref = 0;
@@ -315,6 +319,16 @@ int otx2_mcam_entry_init(struct otx2_nic *pfvf)
if (!flow_cfg->def_ent)
return -ENOMEM;
+ kw_type = NPC_MCAM_KEY_X2;
+ if (is_cn20k(pfvf->pdev)) {
+ rc = otx2_mcam_pfl_info_get(pfvf, &x4_slots, &kw_type);
+ if (rc) {
+ netdev_err(pfvf->netdev,
+ "Error to get pfl info\n");
+ return rc;
+ }
+ }
+
mutex_lock(&pfvf->mbox.lock);
req = otx2_mbox_alloc_msg_npc_mcam_alloc_entry(&pfvf->mbox);
@@ -324,6 +338,10 @@ int otx2_mcam_entry_init(struct otx2_nic *pfvf)
}
req->kw_type = NPC_MCAM_KEY_X2;
+ if (is_cn20k(pfvf->pdev) && kw_type == NPC_MCAM_KEY_X4) {
+ req->kw_type = NPC_MCAM_KEY_X4;
+ ref &= (x4_slots - 1);
+ }
req->contig = false;
req->count = count;
req->ref_prio = prio;
@@ -1174,15 +1192,14 @@ static int otx2_add_flow_msg(struct otx2_nic *pfvf, struct otx2_flow *flow)
#ifdef CONFIG_DCB
int vlan_prio, qidx, pfc_rule = 0;
#endif
+ bool modify = false, is_x2;
int err, vf = 0, off, sz;
- bool modify = false;
u8 kw_type = 0;
u8 *src, *dst;
u16 x4_slots;
- bool is_x2;
if (is_cn20k(pfvf->pdev)) {
- err = otx2_mcam_pfl_info_get(pfvf, &is_x2, &x4_slots);
+ err = otx2_mcam_pfl_info_get(pfvf, &x4_slots, &kw_type);
if (err) {
netdev_err(pfvf->netdev,
"Error to retrieve NPC profile info, pcifunc=%#x\n",
@@ -1190,6 +1207,7 @@ static int otx2_add_flow_msg(struct otx2_nic *pfvf, struct otx2_flow *flow)
return -EFAULT;
}
+ is_x2 = kw_type == NPC_MCAM_KEY_X2;
if (!is_x2) {
err = otx2_prepare_flow_request(&flow->flow_spec,
&treq);
--
2.43.0
* [PATCH v18 net-next 8/8] octeontx2-af: npc: cn20k: Allocate npc_priv and dstats dynamically.
2026-06-02 6:03 [PATCH v18 net-next 0/8] octeontx2-af: npc: Enhancements Ratheesh Kannoth
` (6 preceding siblings ...)
2026-06-02 6:03 ` [PATCH v18 net-next 7/8] octeontx2: cn20k: Respect NPC MCAM X2/X4 profile in flows and DFT alloc Ratheesh Kannoth
@ 2026-06-02 6:03 ` Ratheesh Kannoth
2026-06-03 7:03 ` Ratheesh Kannoth
2026-06-04 3:21 ` Ratheesh Kannoth
7 siblings, 2 replies; 21+ messages in thread
From: Ratheesh Kannoth @ 2026-06-02 6:03 UTC (permalink / raw)
To: linux-kernel, netdev
Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
pabeni, sgoutham, Ratheesh Kannoth
Replace the file-scope static npc_priv with a kcalloc'd struct filled
from hardware bank/subbank geometry at init (num_banks is no longer a
const compile-time constant; drop init_done and use a non-NULL
npc_priv pointer for liveness). Thread npc_priv_get() / pointer access
through the CN20K NPC code paths, extend teardown to kfree the root
struct on failure and in npc_cn20k_deinit, and adjust MCAM section
setup to use the discovered subbank count.
Allocate MCAM debugfs dstats via devm_kzalloc instead of a static matrix,
and use the allocated backing store consistently when computing deltas
(including the counter rollover compare).
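The delta bookkeeping on a heap-allocated matrix can be sketched in
userspace C as below. This is an illustration under assumptions: the
dimensions, names and the use of calloc() are stand-ins for this example
(the patch uses devm_kzalloc() and the MAX_* limits), and unlike the
debugfs handler the sketch does not skip zero or unchanged counters.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed small dimensions; the driver uses its own MAX_* limits. */
#define SKETCH_BANKS 2
#define SKETCH_DEPTH 8

/* Pointer to a whole 2D array, so one allocation backs the matrix. */
static uint64_t (*sketch_base)[SKETCH_BANKS][SKETCH_DEPTH];

/* Allocate the zeroed baseline matrix; returns 0 on success. */
static int sketch_dstats_init(void)
{
	sketch_base = calloc(1, sizeof(*sketch_base));
	return sketch_base ? 0 : -1;
}

/*
 * Return hits since the previous read of (bank, idx) and move the
 * software baseline forward. The hardware counter is never cleared.
 */
static uint64_t sketch_delta(unsigned int bank, unsigned int idx,
			     uint64_t hw)
{
	uint64_t *base = &(*sketch_base)[bank][idx];
	uint64_t delta;

	/* Counter went backwards (rollover or reset): restart from zero. */
	if (hw < *base)
		*base = 0;

	delta = hw - *base;
	*base = hw;
	return delta;
}
```

Reading 10, then 25, then 25 again for the same entry yields deltas of
10, 15 and 0; a later reading of 4 is treated as a rollover and yields 4.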
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
---
.../marvell/octeontx2/af/cn20k/debugfs.c | 17 +-
.../ethernet/marvell/octeontx2/af/cn20k/npc.c | 442 +++++++++---------
.../ethernet/marvell/octeontx2/af/cn20k/npc.h | 3 +-
3 files changed, 240 insertions(+), 222 deletions(-)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c
index 730ef97a57e6..b6fda42e44c7 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c
@@ -176,7 +176,8 @@ static DEFINE_MUTEX(stats_lock);
* hard limit on all silicon variants, preventing any possibility of
* out-of-bounds access.
*/
-static u64 dstats[MAX_NUM_BANKS][MAX_SUBBANK_DEPTH * MAX_NUM_SUB_BANKS] = {};
+static u64 (*dstats)[MAX_NUM_BANKS][MAX_SUBBANK_DEPTH * MAX_NUM_SUB_BANKS];
+
static int npc_mcam_dstats_show(struct seq_file *s, void *unused)
{
struct npc_priv_t *npc_priv;
@@ -212,24 +213,24 @@ static int npc_mcam_dstats_show(struct seq_file *s, void *unused)
NPC_AF_CN20K_MCAMEX_BANKX_STAT_EXT(idx, bank));
if (!stats)
continue;
- if (stats == dstats[bank][idx])
+ if (stats == dstats[0][bank][idx])
continue;
- if (stats < dstats[bank][idx])
- dstats[bank][idx] = 0;
+ if (stats < dstats[0][bank][idx])
+ dstats[0][bank][idx] = 0;
pf = 0xFFFF;
map = xa_load(&npc_priv->xa_idx2pf_map, mcam_idx);
if (map)
pf = xa_to_value(map);
- delta = stats - dstats[bank][idx];
+ delta = stats - dstats[0][bank][idx];
snprintf(buff, sizeof(buff), "%u\t%#04x\t%llu\n",
mcam_idx, pf, delta);
seq_puts(s, buff);
- dstats[bank][idx] = stats;
+ dstats[0][bank][idx] = stats;
}
}
@@ -397,6 +398,10 @@ int npc_cn20k_debugfs_init(struct rvu *rvu)
debugfs_create_file("vidx2idx", 0444, rvu->rvu_dbg.npc,
npc_priv, &npc_vidx2idx_map_fops);
+ dstats = devm_kzalloc(rvu->dev, sizeof(*dstats), GFP_KERNEL);
+ if (!dstats)
+ return -ENOMEM;
+
debugfs_create_file("dstats", 0444, rvu->rvu_dbg.npc, rvu,
&npc_mcam_dstats_fops);
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
index ae4683e0405d..94a766b3ac07 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
@@ -16,9 +16,7 @@
#include "cn20k/reg.h"
#include "rvu_npc_fs.h"
-static struct npc_priv_t npc_priv = {
- .num_banks = MAX_NUM_BANKS,
-};
+static struct npc_priv_t *npc_priv;
static const char *npc_kw_name[NPC_MCAM_KEY_MAX] = {
[NPC_MCAM_KEY_DYN] = "DYNAMIC",
@@ -226,7 +224,7 @@ static u16 npc_idx2vidx(u16 idx)
vidx = idx;
index = idx;
- map = xa_load(&npc_priv.xa_idx2vidx_map, index);
+ map = xa_load(&npc_priv->xa_idx2vidx_map, index);
if (!map)
goto done;
@@ -242,7 +240,7 @@ static u16 npc_idx2vidx(u16 idx)
static bool npc_is_vidx(u16 vidx)
{
- return vidx >= npc_priv.bank_depth * 2;
+ return vidx >= npc_priv->bank_depth * 2;
}
static u16 npc_vidx2idx(u16 vidx)
@@ -256,7 +254,7 @@ static u16 npc_vidx2idx(u16 vidx)
idx = vidx;
index = vidx;
- map = xa_load(&npc_priv.xa_vidx2idx_map, index);
+ map = xa_load(&npc_priv->xa_vidx2idx_map, index);
if (!map)
goto done;
@@ -272,7 +270,7 @@ static u16 npc_vidx2idx(u16 vidx)
u16 npc_cn20k_vidx2idx(u16 idx)
{
- if (!npc_priv.init_done)
+ if (!npc_priv)
return idx;
if (!npc_is_vidx(idx))
@@ -283,7 +281,7 @@ u16 npc_cn20k_vidx2idx(u16 idx)
u16 npc_cn20k_idx2vidx(u16 idx)
{
- if (!npc_priv.init_done)
+ if (!npc_priv)
return idx;
if (npc_is_vidx(idx))
@@ -306,7 +304,7 @@ static int npc_vidx_maps_del_entry(struct rvu *rvu, u16 vidx, u16 *old_midx)
mcam_idx = npc_vidx2idx(vidx);
- map = xa_erase(&npc_priv.xa_vidx2idx_map, vidx);
+ map = xa_erase(&npc_priv->xa_vidx2idx_map, vidx);
if (!map) {
dev_err(rvu->dev,
"%s: vidx(%u) does not map to proper mcam idx\n",
@@ -314,7 +312,7 @@ static int npc_vidx_maps_del_entry(struct rvu *rvu, u16 vidx, u16 *old_midx)
return -ESRCH;
}
- map = xa_erase(&npc_priv.xa_idx2vidx_map, mcam_idx);
+ map = xa_erase(&npc_priv->xa_idx2vidx_map, mcam_idx);
if (!map) {
dev_err(rvu->dev,
"%s: vidx(%u) is not valid\n",
@@ -341,7 +339,7 @@ static int npc_vidx_maps_modify(struct rvu *rvu, u16 vidx, u16 new_midx)
return -ESRCH;
}
- map = xa_erase(&npc_priv.xa_vidx2idx_map, vidx);
+ map = xa_erase(&npc_priv->xa_vidx2idx_map, vidx);
if (!map) {
dev_err(rvu->dev,
"%s: vidx(%u) could not be deleted from vidx2idx map\n",
@@ -351,7 +349,7 @@ static int npc_vidx_maps_modify(struct rvu *rvu, u16 vidx, u16 new_midx)
old_midx = xa_to_value(map);
- rc = xa_insert(&npc_priv.xa_vidx2idx_map, vidx,
+ rc = xa_insert(&npc_priv->xa_vidx2idx_map, vidx,
xa_mk_value(new_midx), GFP_KERNEL);
if (rc) {
dev_err(rvu->dev,
@@ -360,7 +358,7 @@ static int npc_vidx_maps_modify(struct rvu *rvu, u16 vidx, u16 new_midx)
goto fail1;
}
- map = xa_erase(&npc_priv.xa_idx2vidx_map, old_midx);
+ map = xa_erase(&npc_priv->xa_idx2vidx_map, old_midx);
if (!map) {
dev_err(rvu->dev,
"%s: old_midx(%u, vidx(%u)) cannot be added to idx2vidx map\n",
@@ -369,7 +367,7 @@ static int npc_vidx_maps_modify(struct rvu *rvu, u16 vidx, u16 new_midx)
goto fail2;
}
- rc = xa_insert(&npc_priv.xa_idx2vidx_map, new_midx,
+ rc = xa_insert(&npc_priv->xa_idx2vidx_map, new_midx,
xa_mk_value(vidx), GFP_KERNEL);
if (rc) {
dev_err(rvu->dev,
@@ -382,21 +380,21 @@ static int npc_vidx_maps_modify(struct rvu *rvu, u16 vidx, u16 new_midx)
fail3:
/* Restore vidx at old_midx location */
- if (xa_insert(&npc_priv.xa_idx2vidx_map, old_midx,
+ if (xa_insert(&npc_priv->xa_idx2vidx_map, old_midx,
xa_mk_value(vidx), GFP_KERNEL))
dev_err(rvu->dev,
"%s: Error to roll back idx2vidx old_midx=%u vidx=%u\n",
__func__, old_midx, vidx);
fail2:
/* Erase new_midx inserted at vidx */
- if (!xa_erase(&npc_priv.xa_vidx2idx_map, vidx))
+ if (!xa_erase(&npc_priv->xa_vidx2idx_map, vidx))
dev_err(rvu->dev,
"%s: Failed to roll back vidx2idx vidx=%u\n",
__func__, vidx);
fail1:
/* Restore old_midx at vidx location */
- if (xa_insert(&npc_priv.xa_vidx2idx_map, vidx,
+ if (xa_insert(&npc_priv->xa_vidx2idx_map, vidx,
xa_mk_value(old_midx), GFP_KERNEL))
dev_err(rvu->dev,
"%s: Failed to roll back vidx2idx to old_midx=%u, vidx=%u\n",
@@ -412,10 +410,10 @@ static int npc_vidx_maps_add_entry(struct rvu *rvu, u16 mcam_idx, int pcifunc,
u32 id;
/* Virtual index start from maximum mcam index + 1 */
- max = npc_priv.bank_depth * 2 * 2 - 1;
- min = npc_priv.bank_depth * 2;
+ max = npc_priv->bank_depth * 2 * 2 - 1;
+ min = npc_priv->bank_depth * 2;
- rc = xa_alloc(&npc_priv.xa_vidx2idx_map, &id,
+ rc = xa_alloc(&npc_priv->xa_vidx2idx_map, &id,
xa_mk_value(mcam_idx),
XA_LIMIT(min, max), GFP_KERNEL);
if (rc) {
@@ -425,7 +423,7 @@ static int npc_vidx_maps_add_entry(struct rvu *rvu, u16 mcam_idx, int pcifunc,
goto fail1;
}
- rc = xa_insert(&npc_priv.xa_idx2vidx_map, mcam_idx,
+ rc = xa_insert(&npc_priv->xa_idx2vidx_map, mcam_idx,
xa_mk_value(id), GFP_KERNEL);
if (rc) {
dev_err(rvu->dev,
@@ -440,7 +438,7 @@ static int npc_vidx_maps_add_entry(struct rvu *rvu, u16 mcam_idx, int pcifunc,
return 0;
fail2:
- xa_erase(&npc_priv.xa_vidx2idx_map, id);
+ xa_erase(&npc_priv->xa_vidx2idx_map, id);
fail1:
return rc;
}
@@ -691,7 +689,7 @@ void npc_cn20k_parser_profile_init(struct rvu *rvu, int blkaddr)
struct npc_priv_t *npc_priv_get(void)
{
- return &npc_priv;
+ return npc_priv;
}
static void npc_program_mkex_rx(struct rvu *rvu, int blkaddr,
@@ -860,9 +858,9 @@ npc_cn20k_enable_mcam_entry(struct rvu *rvu, int blkaddr,
update_en_map:
if (enable)
- set_bit(index, npc_priv.en_map);
+ set_bit(index, npc_priv->en_map);
else
- clear_bit(index, npc_priv.en_map);
+ clear_bit(index, npc_priv->en_map);
return 0;
}
@@ -1751,28 +1749,28 @@ int npc_mcam_idx_2_key_type(struct rvu *rvu, u16 mcam_idx, u8 *key_type)
int bank_off, sb_id;
/* mcam_idx should be less than (2 * bank depth) */
- if (mcam_idx >= npc_priv.bank_depth * 2) {
+ if (mcam_idx >= npc_priv->bank_depth * 2) {
dev_err(rvu->dev, "%s: bad params\n",
__func__);
return -EINVAL;
}
/* find mcam offset per bank */
- bank_off = mcam_idx & (npc_priv.bank_depth - 1);
+ bank_off = mcam_idx & (npc_priv->bank_depth - 1);
/* Find subbank id */
- sb_id = bank_off / npc_priv.subbank_depth;
+ sb_id = bank_off / npc_priv->subbank_depth;
/* Check if subbank id is more than maximum
* number of subbanks available
*/
- if (sb_id >= npc_priv.num_subbanks) {
+ if (sb_id >= npc_priv->num_subbanks) {
dev_err(rvu->dev, "%s: invalid subbank %d\n",
__func__, sb_id);
return -EINVAL;
}
- sb = &npc_priv.sb[sb_id];
+ sb = &npc_priv->sb[sb_id];
*key_type = sb->key_type;
@@ -1788,7 +1786,7 @@ static int npc_subbank_idx_2_mcam_idx(struct rvu *rvu, struct npc_subbank *sb,
* subsection depth - 1
*/
if (sb->key_type == NPC_MCAM_KEY_X4 &&
- sub_off >= npc_priv.subbank_depth) {
+ sub_off >= npc_priv->subbank_depth) {
dev_err(rvu->dev,
"%s: Failed to get mcam idx (x4) sb->idx=%u sub_off=%u",
__func__, sb->idx, sub_off);
@@ -1799,7 +1797,7 @@ static int npc_subbank_idx_2_mcam_idx(struct rvu *rvu, struct npc_subbank *sb,
* 2 * subsection depth - 1
*/
if (sb->key_type == NPC_MCAM_KEY_X2 &&
- sub_off >= npc_priv.subbank_depth * 2) {
+ sub_off >= npc_priv->subbank_depth * 2) {
dev_err(rvu->dev,
"%s: Failed to get mcam idx (x2) sb->idx=%u sub_off=%u",
__func__, sb->idx, sub_off);
@@ -1807,12 +1805,12 @@ static int npc_subbank_idx_2_mcam_idx(struct rvu *rvu, struct npc_subbank *sb,
}
/* Find subbank offset from respective subbank (w.r.t bank) */
- off = sub_off & (npc_priv.subbank_depth - 1);
+ off = sub_off & (npc_priv->subbank_depth - 1);
/* if subsection idx is in bank1, add bank depth,
* which is part of sb->b1b
*/
- bot = sub_off >= npc_priv.subbank_depth ? sb->b1b : sb->b0b;
+ bot = sub_off >= npc_priv->subbank_depth ? sb->b1b : sb->b0b;
*mcam_idx = bot + off;
return 0;
@@ -1825,37 +1823,37 @@ int npc_mcam_idx_2_subbank_idx(struct rvu *rvu, u16 mcam_idx,
int bank_off, sb_id;
/* mcam_idx should be less than (2 * bank depth) */
- if (mcam_idx >= npc_priv.bank_depth * 2) {
+ if (mcam_idx >= npc_priv->bank_depth * 2) {
dev_err(rvu->dev, "%s: Invalid mcam idx %u\n",
__func__, mcam_idx);
return -EINVAL;
}
/* find mcam offset per bank */
- bank_off = mcam_idx & (npc_priv.bank_depth - 1);
+ bank_off = mcam_idx & (npc_priv->bank_depth - 1);
/* Find subbank id */
- sb_id = bank_off / npc_priv.subbank_depth;
+ sb_id = bank_off / npc_priv->subbank_depth;
/* Check if subbank id is more than maximum
* number of subbanks available
*/
- if (sb_id >= npc_priv.num_subbanks) {
+ if (sb_id >= npc_priv->num_subbanks) {
dev_err(rvu->dev, "%s: invalid subbank %d\n",
__func__, sb_id);
return -EINVAL;
}
- *sb = &npc_priv.sb[sb_id];
+ *sb = &npc_priv->sb[sb_id];
/* Subbank offset per bank */
- *sb_off = bank_off % npc_priv.subbank_depth;
+ *sb_off = bank_off % npc_priv->subbank_depth;
/* Index in a subbank should add subbank depth
* if it is in bank1
*/
- if (mcam_idx >= npc_priv.bank_depth)
- *sb_off += npc_priv.subbank_depth;
+ if (mcam_idx >= npc_priv->bank_depth)
+ *sb_off += npc_priv->subbank_depth;
return 0;
}
@@ -1871,9 +1869,9 @@ static int __npc_subbank_contig_alloc(struct rvu *rvu,
int k, offset, delta = 0;
int cnt = 0, sbd;
- sbd = npc_priv.subbank_depth;
+ sbd = npc_priv->subbank_depth;
- if (sidx >= npc_priv.bank_depth)
+ if (sidx >= npc_priv->bank_depth)
delta = sbd;
switch (prio) {
@@ -1940,8 +1938,8 @@ static int __npc_subbank_non_contig_alloc(struct rvu *rvu,
int cnt = 0, delta;
int k, sbd;
- sbd = npc_priv.subbank_depth;
- delta = sidx >= npc_priv.bank_depth ? sbd : 0;
+ sbd = npc_priv->subbank_depth;
+ delta = sidx >= npc_priv->bank_depth ? sbd : 0;
switch (prio) {
/* Find an area of size 'count' from sidx to eidx */
@@ -2002,7 +2000,7 @@ static void __npc_subbank_sboff_2_off(struct rvu *rvu, struct npc_subbank *sb,
{
int sbd;
- sbd = npc_priv.subbank_depth;
+ sbd = npc_priv->subbank_depth;
*off = sb_off & (sbd - 1);
*bmap = (sb_off >= sbd) ? sb->b1map : sb->b0map;
@@ -2051,20 +2049,20 @@ static int __npc_subbank_mark_free(struct rvu *rvu, struct npc_subbank *sb)
sb->flags = NPC_SUBBANK_FLAG_FREE;
sb->key_type = 0;
- bitmap_clear(sb->b0map, 0, npc_priv.subbank_depth);
- bitmap_clear(sb->b1map, 0, npc_priv.subbank_depth);
+ bitmap_clear(sb->b0map, 0, npc_priv->subbank_depth);
+ bitmap_clear(sb->b1map, 0, npc_priv->subbank_depth);
- if (!xa_erase(&npc_priv.xa_sb_used, sb->arr_idx)) {
+ if (!xa_erase(&npc_priv->xa_sb_used, sb->arr_idx)) {
dev_err(rvu->dev,
"%s: Error to delete from xa_sb_used array\n",
__func__);
return -EFAULT;
}
- rc = xa_insert(&npc_priv.xa_sb_free, sb->arr_idx,
+ rc = xa_insert(&npc_priv->xa_sb_free, sb->arr_idx,
xa_mk_value(sb->idx), GFP_KERNEL);
if (rc) {
- rc = xa_insert(&npc_priv.xa_sb_used, sb->arr_idx,
+ rc = xa_insert(&npc_priv->xa_sb_used, sb->arr_idx,
xa_mk_value(sb->idx), GFP_KERNEL);
if (rc)
dev_err(rvu->dev,
@@ -2093,21 +2091,21 @@ static int __npc_subbank_mark_used(struct rvu *rvu, struct npc_subbank *sb,
sb->flags = NPC_SUBBANK_FLAG_USED;
sb->key_type = key_type;
if (key_type == NPC_MCAM_KEY_X4)
- sb->free_cnt = npc_priv.subbank_depth;
+ sb->free_cnt = npc_priv->subbank_depth;
else
- sb->free_cnt = 2 * npc_priv.subbank_depth;
+ sb->free_cnt = 2 * npc_priv->subbank_depth;
- bitmap_clear(sb->b0map, 0, npc_priv.subbank_depth);
- bitmap_clear(sb->b1map, 0, npc_priv.subbank_depth);
+ bitmap_clear(sb->b0map, 0, npc_priv->subbank_depth);
+ bitmap_clear(sb->b1map, 0, npc_priv->subbank_depth);
- if (!xa_erase(&npc_priv.xa_sb_free, sb->arr_idx)) {
+ if (!xa_erase(&npc_priv->xa_sb_free, sb->arr_idx)) {
dev_err(rvu->dev,
"%s: Error to delete from xa_sb_free array\n",
__func__);
return -EFAULT;
}
- rc = xa_insert(&npc_priv.xa_sb_used, sb->arr_idx,
+ rc = xa_insert(&npc_priv->xa_sb_used, sb->arr_idx,
xa_mk_value(sb->idx), GFP_KERNEL);
if (rc)
dev_err(rvu->dev,
@@ -2131,10 +2129,10 @@ static bool __npc_subbank_free(struct rvu *rvu, struct npc_subbank *sb,
/* Check whether we can mark whole subbank as free */
if (sb->key_type == NPC_MCAM_KEY_X4) {
- if (sb->free_cnt < npc_priv.subbank_depth)
+ if (sb->free_cnt < npc_priv->subbank_depth)
goto done;
} else {
- if (sb->free_cnt < 2 * npc_priv.subbank_depth)
+ if (sb->free_cnt < 2 * npc_priv->subbank_depth)
goto done;
}
@@ -2213,7 +2211,7 @@ static int __npc_subbank_alloc(struct rvu *rvu, struct npc_subbank *sb,
/* x4 indexes are from 0 to bank size as it combines two x2 banks */
if (key_type == NPC_MCAM_KEY_X4 &&
- (ref >= npc_priv.bank_depth || limit >= npc_priv.bank_depth)) {
+ (ref >= npc_priv->bank_depth || limit >= npc_priv->bank_depth)) {
dev_err(rvu->dev,
"%s: Wrong ref_enty(%d) or limit(%d) for x4\n",
__func__, ref, limit);
@@ -2223,8 +2221,8 @@ static int __npc_subbank_alloc(struct rvu *rvu, struct npc_subbank *sb,
/* This function is called either bank0 or bank1 portion of a subbank.
* so ref and limit should be on same bank.
*/
- diffbank = !!((ref & npc_priv.bank_depth) ^
- (limit & npc_priv.bank_depth));
+ diffbank = !!((ref & npc_priv->bank_depth) ^
+ (limit & npc_priv->bank_depth));
if (diffbank) {
dev_err(rvu->dev,
"%s: request ref and limit should be from same bank\n",
@@ -2248,7 +2246,7 @@ static int __npc_subbank_alloc(struct rvu *rvu, struct npc_subbank *sb,
* or equal to mcam entries available in the subbank if contig.
*/
if (sb->flags & NPC_SUBBANK_FLAG_FREE) {
- if (contig && count > npc_priv.subbank_depth) {
+ if (contig && count > npc_priv->subbank_depth) {
dev_err(rvu->dev, "%s: Less number of entries\n",
__func__);
return -ENOSPC;
@@ -2271,10 +2269,10 @@ static int __npc_subbank_alloc(struct rvu *rvu, struct npc_subbank *sb,
}
process:
- /* if ref or limit >= npc_priv.bank_depth, index are in bank1.
+ /* if ref or limit >= npc_priv->bank_depth, indexes are in bank1,
* else bank0.
*/
- if (ref >= npc_priv.bank_depth) {
+ if (ref >= npc_priv->bank_depth) {
bmap = sb->b1map;
t = sb->b1t;
b = sb->b1b;
@@ -2285,8 +2283,8 @@ static int __npc_subbank_alloc(struct rvu *rvu, struct npc_subbank *sb,
}
/* Calculate free slots */
- bw = bitmap_weight(bmap, npc_priv.subbank_depth);
- bfree = npc_priv.subbank_depth - bw;
+ bw = bitmap_weight(bmap, npc_priv->subbank_depth);
+ bfree = npc_priv->subbank_depth - bw;
if (!bfree) {
dev_dbg(rvu->dev, "%s: subbank is full\n", __func__);
@@ -2415,7 +2413,7 @@ npc_del_from_pf_maps(struct rvu *rvu, u16 mcam_idx)
int pcifunc, idx;
void *map;
- map = xa_erase(&npc_priv.xa_idx2pf_map, mcam_idx);
+ map = xa_erase(&npc_priv->xa_idx2pf_map, mcam_idx);
if (!map) {
dev_err(rvu->dev,
"%s: failed to erase mcam_idx(%u) from xa_idx2pf map\n",
@@ -2424,7 +2422,7 @@ npc_del_from_pf_maps(struct rvu *rvu, u16 mcam_idx)
}
pcifunc = xa_to_value(map);
- map = xa_load(&npc_priv.xa_pf_map, pcifunc);
+ map = xa_load(&npc_priv->xa_pf_map, pcifunc);
if (!map) {
dev_err(rvu->dev,
"%s: failed to find entry for (%u) from xa_pf_map, mcam=%u\n",
@@ -2434,7 +2432,7 @@ npc_del_from_pf_maps(struct rvu *rvu, u16 mcam_idx)
idx = xa_to_value(map);
- map = xa_erase(&npc_priv.xa_pf2idx_map[idx], mcam_idx);
+ map = xa_erase(&npc_priv->xa_pf2idx_map[idx], mcam_idx);
if (!map) {
dev_err(rvu->dev,
"%s: failed to erase mcam_idx(%u) from xa_pf2idx_map map\n",
@@ -2454,18 +2452,18 @@ npc_add_to_pf_maps(struct rvu *rvu, u16 mcam_idx, int pcifunc)
"%s: add2maps mcam_idx(%u) to xa_idx2pf map pcifunc=%#x\n",
__func__, mcam_idx, pcifunc);
- rc = xa_insert(&npc_priv.xa_idx2pf_map, mcam_idx,
+ rc = xa_insert(&npc_priv->xa_idx2pf_map, mcam_idx,
xa_mk_value(pcifunc), GFP_KERNEL);
if (rc) {
- map = xa_load(&npc_priv.xa_idx2pf_map, mcam_idx);
+ map = xa_load(&npc_priv->xa_idx2pf_map, mcam_idx);
dev_err(rvu->dev,
"%s: failed to insert mcam_idx(%u) to xa_idx2pf map, existing value=%lu\n",
__func__, mcam_idx, xa_to_value(map));
return -EFAULT;
}
- map = xa_load(&npc_priv.xa_pf_map, pcifunc);
+ map = xa_load(&npc_priv->xa_pf_map, pcifunc);
if (!map) {
dev_err(rvu->dev,
"%s: failed to find pf map entry for pcifunc=%#x, mcam=%u\n",
@@ -2475,12 +2473,12 @@ npc_add_to_pf_maps(struct rvu *rvu, u16 mcam_idx, int pcifunc)
idx = xa_to_value(map);
- rc = xa_insert(&npc_priv.xa_pf2idx_map[idx], mcam_idx,
+ rc = xa_insert(&npc_priv->xa_pf2idx_map[idx], mcam_idx,
xa_mk_value(pcifunc), GFP_KERNEL);
if (rc) {
- map = xa_load(&npc_priv.xa_pf2idx_map[idx], mcam_idx);
- xa_erase(&npc_priv.xa_idx2pf_map, mcam_idx);
+ map = xa_load(&npc_priv->xa_pf2idx_map[idx], mcam_idx);
+ xa_erase(&npc_priv->xa_idx2pf_map, mcam_idx);
dev_err(rvu->dev,
"%s: failed to insert mcam_idx(%u) to xa_pf2idx_map map, earlier value=%lu idx=%u\n",
__func__, mcam_idx, xa_to_value(map), idx);
@@ -2510,9 +2508,9 @@ npc_subbank_suits(struct npc_subbank *sb, int key_type)
return false;
}
-#define SB_ALIGN_UP(val) (((val) + npc_priv.subbank_depth) & \
- ~((npc_priv.subbank_depth) - 1))
-#define SB_ALIGN_DOWN(val) ALIGN_DOWN((val), npc_priv.subbank_depth)
+#define SB_ALIGN_UP(val) (((val) + npc_priv->subbank_depth) & \
+ ~((npc_priv->subbank_depth) - 1))
+#define SB_ALIGN_DOWN(val) ALIGN_DOWN((val), npc_priv->subbank_depth)
static void npc_subbank_iter_down(struct rvu *rvu,
int ref, int limit,
@@ -2538,7 +2536,7 @@ static void npc_subbank_iter_down(struct rvu *rvu,
}
*cur_ref = *cur_limit - 1;
- align = *cur_ref - npc_priv.subbank_depth + 1;
+ align = *cur_ref - npc_priv->subbank_depth + 1;
if (align <= limit) {
*stop = true;
*cur_limit = limit;
@@ -2578,7 +2576,7 @@ static void npc_subbank_iter_up(struct rvu *rvu,
}
*cur_ref = *cur_limit + 1;
- align = *cur_ref + npc_priv.subbank_depth - 1;
+ align = *cur_ref + npc_priv->subbank_depth - 1;
if (align >= limit) {
*stop = true;
@@ -2606,17 +2604,17 @@ npc_subbank_iter(struct rvu *rvu, int key_type,
/* limit and ref should < bank_depth for x4 */
if (key_type == NPC_MCAM_KEY_X4) {
- if (*cur_ref >= npc_priv.bank_depth)
+ if (*cur_ref >= npc_priv->bank_depth)
return -EINVAL;
- if (*cur_limit >= npc_priv.bank_depth)
+ if (*cur_limit >= npc_priv->bank_depth)
return -EINVAL;
}
/* limit and ref should < 2 * bank_depth, for x2 */
- if (*cur_ref >= 2 * npc_priv.bank_depth)
+ if (*cur_ref >= 2 * npc_priv->bank_depth)
return -EINVAL;
- if (*cur_limit >= 2 * npc_priv.bank_depth)
+ if (*cur_limit >= 2 * npc_priv->bank_depth)
return -EINVAL;
return 0;
@@ -2651,7 +2649,7 @@ static int npc_idx_free(struct rvu *rvu, u16 *mcam_idx, int count,
vidx = npc_idx2vidx(midx);
}
- if (midx >= npc_priv.bank_depth * npc_priv.num_banks) {
+ if (midx >= npc_priv->bank_depth * npc_priv->num_banks) {
dev_err(rvu->dev,
"%s: Invalid mcam_idx=%u cannot be deleted\n",
__func__, mcam_idx[i]);
@@ -2846,7 +2844,7 @@ static int npc_subbank_free_cnt(struct rvu *rvu, struct npc_subbank *sb,
{
int cnt, spd;
- spd = npc_priv.subbank_depth;
+ spd = npc_priv->subbank_depth;
mutex_lock(&sb->lock);
if (sb->flags & NPC_SUBBANK_FLAG_FREE)
@@ -3005,7 +3003,7 @@ static int npc_subbank_noref_alloc(struct rvu *rvu, int key_type, bool contig,
max_alloc = !contig;
/* Check used subbanks for free slots */
- xa_for_each(&npc_priv.xa_sb_used, index, val) {
+ xa_for_each(&npc_priv->xa_sb_used, index, val) {
idx = xa_to_value(val);
/* Minimize allocation from restricted subbanks
@@ -3014,7 +3012,7 @@ static int npc_subbank_noref_alloc(struct rvu *rvu, int key_type, bool contig,
if (npc_subbank_restrict_usage(rvu, idx))
continue;
- sb = &npc_priv.sb[idx];
+ sb = &npc_priv->sb[idx];
/* Skip if not suitable subbank */
if (!npc_subbank_suits(sb, key_type))
@@ -3071,9 +3069,9 @@ static int npc_subbank_noref_alloc(struct rvu *rvu, int key_type, bool contig,
}
/* Allocate in free subbanks */
- xa_for_each(&npc_priv.xa_sb_free, index, val) {
+ xa_for_each(&npc_priv->xa_sb_free, index, val) {
idx = xa_to_value(val);
- sb = &npc_priv.sb[idx];
+ sb = &npc_priv->sb[idx];
/* Minimize allocation from restricted subbanks
* in noref allocations.
@@ -3129,7 +3127,7 @@ static int npc_subbank_noref_alloc(struct rvu *rvu, int key_type, bool contig,
for (i = 0; restrict_valid &&
(i < ARRAY_SIZE(npc_subbank_restricted_idxs)); i++) {
idx = npc_subbank_restricted_idxs[i];
- sb = &npc_priv.sb[idx];
+ sb = &npc_priv->sb[idx];
/* Skip if not suitable subbank */
if (!npc_subbank_suits(sb, key_type))
@@ -3209,7 +3207,7 @@ int npc_cn20k_ref_idx_alloc(struct rvu *rvu, int pcifunc, int key_type,
bool ref_valid;
u16 vidx;
- bd = npc_priv.bank_depth;
+ bd = npc_priv->bank_depth;
/* Special case: ref == 0 && limit= 0 && prio == HIGH && count == 1
* Here user wants to allocate 0th entry
@@ -3227,7 +3225,7 @@ int npc_cn20k_ref_idx_alloc(struct rvu *rvu, int pcifunc, int key_type,
ref_valid = !!(limit || ref);
defrag_candidate = !ref_valid && !contig && virt;
if (!ref_valid) {
- if (contig && count > npc_priv.subbank_depth)
+ if (contig && count > npc_priv->subbank_depth)
goto try_noref_multi_subbank;
rc = npc_subbank_noref_alloc(rvu, key_type, contig,
@@ -3272,7 +3270,7 @@ int npc_cn20k_ref_idx_alloc(struct rvu *rvu, int pcifunc, int key_type,
return -EINVAL;
}
- if (contig && count > npc_priv.subbank_depth)
+ if (contig && count > npc_priv->subbank_depth)
goto try_ref_multi_subbank;
rc = npc_subbank_ref_alloc(rvu, key_type, ref, limit,
@@ -3334,8 +3332,8 @@ void npc_cn20k_subbank_calc_free(struct rvu *rvu, int *x2_free,
*x4_free = 0;
*sb_free = 0;
- for (i = 0; i < npc_priv.num_subbanks; i++) {
- sb = &npc_priv.sb[i];
+ for (i = 0; i < npc_priv->num_subbanks; i++) {
+ sb = &npc_priv->sb[i];
mutex_lock(&sb->lock);
/* Count number of free subbanks */
@@ -3433,11 +3431,11 @@ static void npc_subbank_init(struct rvu *rvu, struct npc_subbank *sb, int idx)
{
mutex_init(&sb->lock);
- sb->b0b = idx * npc_priv.subbank_depth;
- sb->b0t = sb->b0b + npc_priv.subbank_depth - 1;
+ sb->b0b = idx * npc_priv->subbank_depth;
+ sb->b0t = sb->b0b + npc_priv->subbank_depth - 1;
- sb->b1b = npc_priv.bank_depth + idx * npc_priv.subbank_depth;
- sb->b1t = sb->b1b + npc_priv.subbank_depth - 1;
+ sb->b1b = npc_priv->bank_depth + idx * npc_priv->subbank_depth;
+ sb->b1t = sb->b1b + npc_priv->subbank_depth - 1;
sb->flags = NPC_SUBBANK_FLAG_FREE;
sb->idx = idx;
@@ -3449,7 +3447,7 @@ static void npc_subbank_init(struct rvu *rvu, struct npc_subbank *sb, int idx)
/* Keep first and last subbank at end of free array; so that
* it will be used at last
*/
- xa_store(&npc_priv.xa_sb_free, sb->arr_idx,
+ xa_store(&npc_priv->xa_sb_free, sb->arr_idx,
xa_mk_value(sb->idx), GFP_KERNEL);
}
@@ -3474,7 +3472,7 @@ static int npc_pcifunc_map_create(struct rvu *rvu)
pcifunc = pf << 9;
- xa_store(&npc_priv.xa_pf_map, (unsigned long)pcifunc,
+ xa_store(&npc_priv->xa_pf_map, (unsigned long)pcifunc,
xa_mk_value(cnt), GFP_KERNEL);
cnt++;
@@ -3483,7 +3481,7 @@ static int npc_pcifunc_map_create(struct rvu *rvu)
for (vf = 0; vf < numvfs; vf++) {
pcifunc = (pf << 9) | (vf + 1);
- xa_store(&npc_priv.xa_pf_map, (unsigned long)pcifunc,
+ xa_store(&npc_priv->xa_pf_map, (unsigned long)pcifunc,
xa_mk_value(cnt), GFP_KERNEL);
cnt++;
}
@@ -3569,7 +3567,7 @@ static int npc_defrag_alloc_free_slots(struct rvu *rvu,
int rc, sb_off, i, err;
bool deleted;
- sb = &npc_priv.sb[f->idx];
+ sb = &npc_priv->sb[f->idx];
alloc_cnt1 = 0;
alloc_cnt2 = 0;
@@ -3639,9 +3637,9 @@ static int npc_defrag_add_2_show_list(struct rvu *rvu, u16 old_midx,
node->vidx = vidx;
INIT_LIST_HEAD(&node->list);
- mutex_lock(&npc_priv.lock);
- list_add_tail(&node->list, &npc_priv.defrag_lh);
- mutex_unlock(&npc_priv.lock);
+ mutex_lock(&npc_priv->lock);
+ list_add_tail(&node->list, &npc_priv->defrag_lh);
+ mutex_unlock(&npc_priv->lock);
return 0;
}
@@ -3745,7 +3743,7 @@ int npc_defrag_move_vdx_to_free(struct rvu *rvu,
}
/* save pcifunc */
- map = xa_load(&npc_priv.xa_idx2pf_map, old_midx);
+ map = xa_load(&npc_priv->xa_idx2pf_map, old_midx);
pcifunc = xa_to_value(map);
/* delete from pf maps */
@@ -3904,29 +3902,29 @@ static void npc_defrag_list_clear(void)
{
struct npc_defrag_show_node *node, *next;
- mutex_lock(&npc_priv.lock);
- list_for_each_entry_safe(node, next, &npc_priv.defrag_lh, list) {
+ mutex_lock(&npc_priv->lock);
+ list_for_each_entry_safe(node, next, &npc_priv->defrag_lh, list) {
list_del_init(&node->list);
kfree(node);
}
- mutex_unlock(&npc_priv.lock);
+ mutex_unlock(&npc_priv->lock);
}
static void npc_lock_all_subbank(void)
{
int i;
- for (i = 0; i < npc_priv.num_subbanks; i++)
- mutex_lock(&npc_priv.sb[i].lock);
+ for (i = 0; i < npc_priv->num_subbanks; i++)
+ mutex_lock(&npc_priv->sb[i].lock);
}
static void npc_unlock_all_subbank(void)
{
int i;
- for (i = npc_priv.num_subbanks - 1; i >= 0; i--)
- mutex_unlock(&npc_priv.sb[i].lock);
+ for (i = npc_priv->num_subbanks - 1; i >= 0; i--)
+ mutex_unlock(&npc_priv->sb[i].lock);
}
int npc_cn20k_search_order_set(struct rvu *rvu,
@@ -3944,9 +3942,9 @@ int npc_cn20k_search_order_set(struct rvu *rvu,
USED = 1,
};
- if (cnt != npc_priv.num_subbanks) {
+ if (cnt != npc_priv->num_subbanks) {
dev_err(rvu->dev, "Number of entries(%u) != %u\n",
- cnt, npc_priv.num_subbanks);
+ cnt, npc_priv->num_subbanks);
return -EINVAL;
}
@@ -3954,18 +3952,18 @@ int npc_cn20k_search_order_set(struct rvu *rvu,
npc_lock_all_subbank();
for (sb_idx = 0; sb_idx < cnt; sb_idx++) {
- sb = &npc_priv.sb[sb_idx];
+ sb = &npc_priv->sb[sb_idx];
save[sb->idx] = sb->arr_idx;
}
for (prio = 0; prio < cnt; prio++) {
sb_idx = narr[prio];
- sb = &npc_priv.sb[sb_idx];
+ sb = &npc_priv->sb[sb_idx];
if (sb->flags & NPC_SUBBANK_FLAG_USED)
- xa = &npc_priv.xa_sb_used;
+ xa = &npc_priv->xa_sb_used;
else
- xa = &npc_priv.xa_sb_free;
+ xa = &npc_priv->xa_sb_free;
rc = xa_err(xa_store(xa, prio,
xa_mk_value(sb_idx), GFP_KERNEL));
@@ -3989,10 +3987,10 @@ int npc_cn20k_search_order_set(struct rvu *rvu,
for (prio = 0; prio < cnt; prio++) {
if (rsrc[FREE][prio] == -1)
- xa_erase(&npc_priv.xa_sb_free, prio);
+ xa_erase(&npc_priv->xa_sb_free, prio);
if (rsrc[USED][prio] == -1)
- xa_erase(&npc_priv.xa_sb_used, prio);
+ xa_erase(&npc_priv->xa_sb_used, prio);
}
for (int i = 0; i < cnt; i++)
@@ -4008,20 +4006,20 @@ int npc_cn20k_search_order_set(struct rvu *rvu,
fail:
for (prio = 0; prio < cnt; prio++) {
if (rsrc[FREE][prio] == 1)
- xa_erase(&npc_priv.xa_sb_free, prio);
+ xa_erase(&npc_priv->xa_sb_free, prio);
if (rsrc[USED][prio] == 1)
- xa_erase(&npc_priv.xa_sb_used, prio);
+ xa_erase(&npc_priv->xa_sb_used, prio);
}
for (sb_idx = 0; sb_idx < cnt; sb_idx++) {
- sb = &npc_priv.sb[sb_idx];
+ sb = &npc_priv->sb[sb_idx];
sb->arr_idx = save[sb_idx];
if (sb->flags & NPC_SUBBANK_FLAG_USED)
- xa = &npc_priv.xa_sb_used;
+ xa = &npc_priv->xa_sb_used;
else
- xa = &npc_priv.xa_sb_free;
+ xa = &npc_priv->xa_sb_free;
/* Since the entry already exists, xa_store() replaces
* the value without a kmalloc(), making failure highly unlikely.
@@ -4041,7 +4039,7 @@ int npc_cn20k_search_order_set(struct rvu *rvu,
const u32 *npc_cn20k_search_order_get(bool *restricted_order, u32 *sz)
{
*restricted_order = restrict_valid;
- *sz = npc_priv.num_subbanks;
+ *sz = npc_priv->num_subbanks;
return subbank_srch_order;
}
@@ -4065,7 +4063,7 @@ int npc_cn20k_defrag(struct rvu *rvu)
INIT_LIST_HEAD(&x4lh);
INIT_LIST_HEAD(&x2lh);
- node = kcalloc(npc_priv.num_subbanks, sizeof(*node), GFP_KERNEL);
+ node = kcalloc(npc_priv->num_subbanks, sizeof(*node), GFP_KERNEL);
if (!node)
return -ENOMEM;
@@ -4074,13 +4072,13 @@ int npc_cn20k_defrag(struct rvu *rvu)
npc_lock_all_subbank();
/* Fill in node with subbank properties */
- for (i = 0; i < npc_priv.num_subbanks; i++) {
- sb = &npc_priv.sb[i];
+ for (i = 0; i < npc_priv->num_subbanks; i++) {
+ sb = &npc_priv->sb[i];
node[i].idx = i;
node[i].key_type = sb->key_type;
node[i].free_cnt = sb->free_cnt;
- node[i].vidx = kcalloc(npc_priv.subbank_depth * 2,
+ node[i].vidx = kcalloc(npc_priv->subbank_depth * 2,
sizeof(*node[i].vidx),
GFP_KERNEL);
if (!node[i].vidx) {
@@ -4110,8 +4108,8 @@ int npc_cn20k_defrag(struct rvu *rvu)
}
/* Filling vidx[] array with all vidx in that subbank */
- xa_for_each_start(&npc_priv.xa_vidx2idx_map, index, map,
- npc_priv.bank_depth * 2) {
+ xa_for_each_start(&npc_priv->xa_vidx2idx_map, index, map,
+ npc_priv->bank_depth * 2) {
midx = xa_to_value(map);
rc = npc_mcam_idx_2_subbank_idx(rvu, midx,
&sb, &sb_off);
@@ -4128,14 +4126,14 @@ int npc_cn20k_defrag(struct rvu *rvu)
}
/* Mark all subbank which has ref allocation */
- for (i = 0; i < npc_priv.num_subbanks; i++) {
+ for (i = 0; i < npc_priv->num_subbanks; i++) {
tnode = &node[i];
if (!tnode->valid)
continue;
tot = (tnode->key_type == NPC_MCAM_KEY_X2) ?
- npc_priv.subbank_depth * 2 : npc_priv.subbank_depth;
+ npc_priv->subbank_depth * 2 : npc_priv->subbank_depth;
if (node[i].vidx_cnt != tot - tnode->free_cnt)
tnode->refs = true;
@@ -4152,7 +4150,7 @@ int npc_cn20k_defrag(struct rvu *rvu)
free_vidx:
npc_unlock_all_subbank();
mutex_unlock(&mcam->lock);
- for (i = 0; i < npc_priv.num_subbanks; i++)
+ for (i = 0; i < npc_priv->num_subbanks; i++)
kfree(node[i].vidx);
kfree(node);
return rc;
@@ -4180,7 +4178,7 @@ int npc_cn20k_dft_rules_idx_get(struct rvu *rvu, u16 pcifunc, u16 *bcast,
*ptr[i] = USHRT_MAX;
}
- if (!npc_priv.init_done)
+ if (!npc_priv)
return 0;
if (is_lbk_vf(rvu, pcifunc)) {
@@ -4188,7 +4186,7 @@ int npc_cn20k_dft_rules_idx_get(struct rvu *rvu, u16 pcifunc, u16 *bcast,
return -EINVAL;
idx = NPC_DFT_RULE_ID_MK(pcifunc, NPC_DFT_RULE_PROMISC_ID);
- val = xa_load(&npc_priv.xa_pf2dfl_rmap, idx);
+ val = xa_load(&npc_priv->xa_pf2dfl_rmap, idx);
if (!val) {
pr_debug("%s: Failed to find %s index for pcifunc=%#x\n",
__func__,
@@ -4207,7 +4205,7 @@ int npc_cn20k_dft_rules_idx_get(struct rvu *rvu, u16 pcifunc, u16 *bcast,
return -EINVAL;
idx = NPC_DFT_RULE_ID_MK(pcifunc, NPC_DFT_RULE_UCAST_ID);
- val = xa_load(&npc_priv.xa_pf2dfl_rmap, idx);
+ val = xa_load(&npc_priv->xa_pf2dfl_rmap, idx);
if (!val) {
pr_debug("%s: Failed to find %s index for pcifunc=%#x\n",
__func__,
@@ -4227,7 +4225,7 @@ int npc_cn20k_dft_rules_idx_get(struct rvu *rvu, u16 pcifunc, u16 *bcast,
continue;
idx = NPC_DFT_RULE_ID_MK(pcifunc, i);
- val = xa_load(&npc_priv.xa_pf2dfl_rmap, idx);
+ val = xa_load(&npc_priv->xa_pf2dfl_rmap, idx);
if (!val) {
pr_debug("%s: Failed to find %s index for pcifunc=%#x\n",
__func__,
@@ -4251,8 +4249,8 @@ int rvu_mbox_handler_npc_get_pfl_info(struct rvu *rvu, struct msg_req *req,
return -EOPNOTSUPP;
}
- rsp->kw_type = npc_priv.kw;
- rsp->x4_slots = npc_priv.bank_depth;
+ rsp->kw_type = npc_priv->kw;
+ rsp->x4_slots = npc_priv->bank_depth;
return 0;
}
@@ -4342,7 +4340,7 @@ void npc_cn20k_dft_rules_free(struct rvu *rvu, u16 pcifunc)
int blkaddr, rc, i;
void *map;
- if (!npc_priv.init_done)
+ if (!npc_priv)
return;
if (!npc_is_cgx_or_lbk(rvu, pcifunc)) {
@@ -4360,7 +4358,7 @@ void npc_cn20k_dft_rules_free(struct rvu *rvu, u16 pcifunc)
/* LBK */
if (is_lbk_vf(rvu, pcifunc)) {
index = NPC_DFT_RULE_ID_MK(pcifunc, NPC_DFT_RULE_PROMISC_ID);
- map = xa_erase(&npc_priv.xa_pf2dfl_rmap, index);
+ map = xa_erase(&npc_priv->xa_pf2dfl_rmap, index);
if (!map)
dev_dbg(rvu->dev,
"%s: Err from delete %s mcam idx from xarray (pcifunc=%#x\n",
@@ -4374,7 +4372,7 @@ void npc_cn20k_dft_rules_free(struct rvu *rvu, u16 pcifunc)
/* VF */
if (is_vf(pcifunc)) {
index = NPC_DFT_RULE_ID_MK(pcifunc, NPC_DFT_RULE_UCAST_ID);
- map = xa_erase(&npc_priv.xa_pf2dfl_rmap, index);
+ map = xa_erase(&npc_priv->xa_pf2dfl_rmap, index);
if (!map)
dev_dbg(rvu->dev,
"%s: Err from delete %s mcam idx from xarray (pcifunc=%#x\n",
@@ -4388,7 +4386,7 @@ void npc_cn20k_dft_rules_free(struct rvu *rvu, u16 pcifunc)
/* PF */
for (i = NPC_DFT_RULE_START_ID; i < NPC_DFT_RULE_MAX_ID; i++) {
index = NPC_DFT_RULE_ID_MK(pcifunc, i);
- map = xa_erase(&npc_priv.xa_pf2dfl_rmap, index);
+ map = xa_erase(&npc_priv->xa_pf2dfl_rmap, index);
if (!map)
dev_dbg(rvu->dev,
"%s: Err from delete %s mcam idx from xarray (pcifunc=%#x\n",
@@ -4448,7 +4446,7 @@ int npc_cn20k_dft_rules_alloc(struct rvu *rvu, u16 pcifunc)
struct msg_rsp free_rsp;
u16 b, m, p, u;
- if (!npc_priv.init_done)
+ if (!npc_priv)
return 0;
if (!npc_is_cgx_or_lbk(rvu, pcifunc)) {
@@ -4471,7 +4469,7 @@ int npc_cn20k_dft_rules_alloc(struct rvu *rvu, u16 pcifunc)
}
/* Set ref index as lowest priority index */
- eidx = 2 * npc_priv.bank_depth - 1;
+ eidx = 2 * npc_priv->bank_depth - 1;
/* Install only UCAST for VF */
cnt = is_vf(pcifunc) ? 1 : ARRAY_SIZE(mcam_idx);
@@ -4500,9 +4498,9 @@ int npc_cn20k_dft_rules_alloc(struct rvu *rvu, u16 pcifunc)
pfvf = rvu_get_pfvf(rvu, pcifunc);
pfvf->hw_prio = NPC_DFT_RULE_PRIO;
- if (npc_priv.kw == NPC_MCAM_KEY_X4) {
+ if (npc_priv->kw == NPC_MCAM_KEY_X4) {
req.kw_type = NPC_MCAM_KEY_X4;
- req.ref_entry = eidx & (npc_priv.bank_depth - 1);
+ req.ref_entry = eidx & (npc_priv->bank_depth - 1);
} else {
req.kw_type = NPC_MCAM_KEY_X2;
req.ref_entry = eidx;
@@ -4543,9 +4541,9 @@ int npc_cn20k_dft_rules_alloc(struct rvu *rvu, u16 pcifunc)
req.hdr.pcifunc = pcifunc;
req.ref_prio = NPC_MCAM_LOWER_PRIO;
- if (npc_priv.kw == NPC_MCAM_KEY_X4) {
+ if (npc_priv->kw == NPC_MCAM_KEY_X4) {
req.kw_type = NPC_MCAM_KEY_X4;
- req.ref_entry = eidx & (npc_priv.bank_depth - 1);
+ req.ref_entry = eidx & (npc_priv->bank_depth - 1);
} else {
req.kw_type = NPC_MCAM_KEY_X2;
req.ref_entry = eidx;
@@ -4569,7 +4567,7 @@ int npc_cn20k_dft_rules_alloc(struct rvu *rvu, u16 pcifunc)
/* LBK */
if (is_lbk_vf(rvu, pcifunc)) {
index = NPC_DFT_RULE_ID_MK(pcifunc, NPC_DFT_RULE_PROMISC_ID);
- ret = xa_insert(&npc_priv.xa_pf2dfl_rmap, index,
+ ret = xa_insert(&npc_priv->xa_pf2dfl_rmap, index,
xa_mk_value(mcam_idx[0]), GFP_KERNEL);
if (ret) {
dev_err(rvu->dev,
@@ -4586,7 +4584,7 @@ int npc_cn20k_dft_rules_alloc(struct rvu *rvu, u16 pcifunc)
/* VF */
if (is_vf(pcifunc)) {
index = NPC_DFT_RULE_ID_MK(pcifunc, NPC_DFT_RULE_UCAST_ID);
- ret = xa_insert(&npc_priv.xa_pf2dfl_rmap, index,
+ ret = xa_insert(&npc_priv->xa_pf2dfl_rmap, index,
xa_mk_value(mcam_idx[0]), GFP_KERNEL);
if (ret) {
dev_err(rvu->dev,
@@ -4604,7 +4602,7 @@ int npc_cn20k_dft_rules_alloc(struct rvu *rvu, u16 pcifunc)
for (i = NPC_DFT_RULE_START_ID, k = 0; i < NPC_DFT_RULE_MAX_ID &&
k < cnt; i++, k++) {
index = NPC_DFT_RULE_ID_MK(pcifunc, i);
- ret = xa_insert(&npc_priv.xa_pf2dfl_rmap, index,
+ ret = xa_insert(&npc_priv->xa_pf2dfl_rmap, index,
xa_mk_value(mcam_idx[k]), GFP_KERNEL);
if (ret) {
dev_err(rvu->dev,
@@ -4613,7 +4611,7 @@ int npc_cn20k_dft_rules_alloc(struct rvu *rvu, u16 pcifunc)
pcifunc);
for (int p = NPC_DFT_RULE_START_ID; p < i; p++) {
index = NPC_DFT_RULE_ID_MK(pcifunc, p);
- xa_erase(&npc_priv.xa_pf2dfl_rmap, index);
+ xa_erase(&npc_priv->xa_pf2dfl_rmap, index);
}
goto err;
}
@@ -4687,71 +4685,79 @@ static int npc_priv_init(struct rvu *rvu)
return -EINVAL;
}
- npc_priv.num_subbanks = num_subbanks;
- npc_priv.bank_depth = bank_depth;
- npc_priv.subbank_depth = subbank_depth;
+ npc_priv = kzalloc(sizeof(*npc_priv), GFP_KERNEL);
+ if (!npc_priv)
+ return -ENOMEM;
+
+ npc_priv->num_banks = num_banks;
+ npc_priv->num_subbanks = num_subbanks;
+ npc_priv->bank_depth = bank_depth;
+ npc_priv->subbank_depth = subbank_depth;
/* Get kex configured key size */
cfg = rvu_read64(rvu, blkaddr, NPC_AF_INTFX_KEX_CFG(0));
- npc_priv.kw = FIELD_GET(GENMASK_ULL(34, 32), cfg);
+ npc_priv->kw = FIELD_GET(GENMASK_ULL(34, 32), cfg);
dev_info(rvu->dev,
"banks=%u depth=%u, subbanks=%u depth=%u, key type=%s\n",
num_banks, bank_depth, num_subbanks, subbank_depth,
- npc_kw_name[npc_priv.kw]);
+ npc_kw_name[npc_priv->kw]);
- npc_priv.sb = kcalloc(num_subbanks, sizeof(struct npc_subbank),
- GFP_KERNEL);
- if (!npc_priv.sb)
- return -ENOMEM;
+ npc_priv->sb = kcalloc(num_subbanks, sizeof(struct npc_subbank),
+ GFP_KERNEL);
+ if (!npc_priv->sb)
+ goto fail1;
- xa_init_flags(&npc_priv.xa_sb_used, XA_FLAGS_ALLOC);
- xa_init_flags(&npc_priv.xa_sb_free, XA_FLAGS_ALLOC);
- xa_init_flags(&npc_priv.xa_idx2pf_map, XA_FLAGS_ALLOC);
- xa_init_flags(&npc_priv.xa_pf_map, XA_FLAGS_ALLOC);
- xa_init_flags(&npc_priv.xa_pf2dfl_rmap, XA_FLAGS_ALLOC);
- xa_init_flags(&npc_priv.xa_idx2vidx_map, XA_FLAGS_ALLOC);
- xa_init_flags(&npc_priv.xa_vidx2idx_map, XA_FLAGS_ALLOC);
+ xa_init_flags(&npc_priv->xa_sb_used, XA_FLAGS_ALLOC);
+ xa_init_flags(&npc_priv->xa_sb_free, XA_FLAGS_ALLOC);
+ xa_init_flags(&npc_priv->xa_idx2pf_map, XA_FLAGS_ALLOC);
+ xa_init_flags(&npc_priv->xa_pf_map, XA_FLAGS_ALLOC);
+ xa_init_flags(&npc_priv->xa_pf2dfl_rmap, XA_FLAGS_ALLOC);
+ xa_init_flags(&npc_priv->xa_idx2vidx_map, XA_FLAGS_ALLOC);
+ xa_init_flags(&npc_priv->xa_vidx2idx_map, XA_FLAGS_ALLOC);
if (npc_create_srch_order(num_subbanks))
- goto fail1;
+ goto fail2;
npc_populate_restricted_idxs(num_subbanks);
/* Initialize subbanks */
- for (i = 0, sb = npc_priv.sb; i < num_subbanks; i++, sb++)
+ for (i = 0, sb = npc_priv->sb; i < num_subbanks; i++, sb++)
npc_subbank_init(rvu, sb, i);
/* Get number of pcifuncs in the system */
- npc_priv.pf_cnt = npc_pcifunc_map_create(rvu);
- npc_priv.xa_pf2idx_map = kcalloc(npc_priv.pf_cnt,
- sizeof(struct xarray),
- GFP_KERNEL);
- if (!npc_priv.xa_pf2idx_map)
- goto fail2;
+ npc_priv->pf_cnt = npc_pcifunc_map_create(rvu);
+ npc_priv->xa_pf2idx_map = kcalloc(npc_priv->pf_cnt,
+ sizeof(struct xarray),
+ GFP_KERNEL);
+ if (!npc_priv->xa_pf2idx_map)
+ goto fail3;
- for (i = 0; i < npc_priv.pf_cnt; i++)
- xa_init_flags(&npc_priv.xa_pf2idx_map[i], XA_FLAGS_ALLOC);
+ for (i = 0; i < npc_priv->pf_cnt; i++)
+ xa_init_flags(&npc_priv->xa_pf2idx_map[i], XA_FLAGS_ALLOC);
- INIT_LIST_HEAD(&npc_priv.defrag_lh);
- mutex_init(&npc_priv.lock);
+ INIT_LIST_HEAD(&npc_priv->defrag_lh);
+ mutex_init(&npc_priv->lock);
return 0;
-fail2:
+fail3:
kfree(subbank_srch_order);
subbank_srch_order = NULL;
+fail2:
+ xa_destroy(&npc_priv->xa_sb_used);
+ xa_destroy(&npc_priv->xa_sb_free);
+ xa_destroy(&npc_priv->xa_idx2pf_map);
+ xa_destroy(&npc_priv->xa_pf_map);
+ xa_destroy(&npc_priv->xa_pf2dfl_rmap);
+ xa_destroy(&npc_priv->xa_idx2vidx_map);
+ xa_destroy(&npc_priv->xa_vidx2idx_map);
+ kfree(npc_priv->sb);
+ npc_priv->sb = NULL;
fail1:
- xa_destroy(&npc_priv.xa_sb_used);
- xa_destroy(&npc_priv.xa_sb_free);
- xa_destroy(&npc_priv.xa_idx2pf_map);
- xa_destroy(&npc_priv.xa_pf_map);
- xa_destroy(&npc_priv.xa_pf2dfl_rmap);
- xa_destroy(&npc_priv.xa_idx2vidx_map);
- xa_destroy(&npc_priv.xa_vidx2idx_map);
- kfree(npc_priv.sb);
- npc_priv.sb = NULL;
+ kfree(npc_priv);
+ npc_priv = NULL;
return -ENOMEM;
}
@@ -4759,25 +4765,31 @@ void npc_cn20k_deinit(struct rvu *rvu)
{
int i;
- xa_destroy(&npc_priv.xa_sb_used);
- xa_destroy(&npc_priv.xa_sb_free);
- xa_destroy(&npc_priv.xa_idx2pf_map);
- xa_destroy(&npc_priv.xa_pf_map);
- xa_destroy(&npc_priv.xa_pf2dfl_rmap);
- xa_destroy(&npc_priv.xa_idx2vidx_map);
- xa_destroy(&npc_priv.xa_vidx2idx_map);
+ if (!npc_priv)
+ return;
- for (i = 0; i < npc_priv.pf_cnt; i++)
- xa_destroy(&npc_priv.xa_pf2idx_map[i]);
+ xa_destroy(&npc_priv->xa_sb_used);
+ xa_destroy(&npc_priv->xa_sb_free);
+ xa_destroy(&npc_priv->xa_idx2pf_map);
+ xa_destroy(&npc_priv->xa_pf_map);
+ xa_destroy(&npc_priv->xa_pf2dfl_rmap);
+ xa_destroy(&npc_priv->xa_idx2vidx_map);
+ xa_destroy(&npc_priv->xa_vidx2idx_map);
- kfree(npc_priv.xa_pf2idx_map);
+ for (i = 0; i < npc_priv->pf_cnt; i++)
+ xa_destroy(&npc_priv->xa_pf2idx_map[i]);
+
+ kfree(npc_priv->xa_pf2idx_map);
/* No need to destroy mutex lock as it is
* part of subbank structure
*/
- kfree(npc_priv.sb);
+ kfree(npc_priv->sb);
kfree(subbank_srch_order);
- bitmap_clear(npc_priv.en_map, 0, MAX_NUM_BANKS * MAX_NUM_SUB_BANKS *
+ bitmap_clear(npc_priv->en_map, 0, MAX_NUM_BANKS * MAX_NUM_SUB_BANKS *
MAX_SUBBANK_DEPTH);
+ npc_defrag_list_clear();
+ kfree(npc_priv);
+ npc_priv = NULL;
}
static int npc_setup_mcam_section(struct rvu *rvu, int key_type)
@@ -4790,7 +4802,7 @@ static int npc_setup_mcam_section(struct rvu *rvu, int key_type)
return -ENODEV;
}
- for (sec = 0; sec < npc_priv.num_subbanks; sec++)
+ for (sec = 0; sec < npc_priv->num_subbanks; sec++)
rvu_write64(rvu, blkaddr,
NPC_AF_MCAM_SECTIONX_CFG_EXT(sec), key_type);
@@ -4812,10 +4824,12 @@ int npc_cn20k_init(struct rvu *rvu)
if (err) {
dev_err(rvu->dev, "%s: mcam section configuration failure\n",
__func__);
- return err;
+ goto fail;
}
- npc_priv.init_done = true;
-
return 0;
+
+fail:
+ npc_cn20k_deinit(rvu);
+ return err;
}
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
index 8bf857317e49..b759aa022a48 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
@@ -195,7 +195,7 @@ struct npc_defrag_show_node {
*/
struct npc_priv_t {
int bank_depth;
- const int num_banks;
+ int num_banks;
int num_subbanks;
int subbank_depth;
DECLARE_BITMAP(en_map, MAX_NUM_BANKS *
@@ -214,7 +214,6 @@ struct npc_priv_t {
struct list_head defrag_lh;
struct mutex lock; /* protect defrag nodes */
int pf_cnt;
- bool init_done;
};
struct npc_kpm_action0 {
--
2.43.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH v18 net-next 1/8] octeontx2-af: npc: cn20k: debugfs enhancements
2026-06-02 6:03 ` [PATCH v18 net-next 1/8] octeontx2-af: npc: cn20k: debugfs enhancements Ratheesh Kannoth
@ 2026-06-03 6:19 ` Ratheesh Kannoth
2026-06-04 2:19 ` Ratheesh Kannoth
1 sibling, 0 replies; 21+ messages in thread
From: Ratheesh Kannoth @ 2026-06-03 6:19 UTC (permalink / raw)
To: linux-kernel, netdev
Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
pabeni, sgoutham
On 2026-06-02 at 11:33:52, Ratheesh Kannoth (rkannoth@marvell.com) wrote:
> Improve MCAM visibility and field debugging for CN20K NPC.
>
> - Extend "mcam_layout" to show enabled (+) or disabled state per entry
> so status can be verified without parsing the full "mcam_entry" dump.
> - Add "dstats" debugfs entry: for enabled MCAM indices, print hit deltas
> since the prior read by comparing hardware counters to a per-entry
> software baseline and advancing that baseline after each read (hardware
> counters are not cleared).
> - Add "mismatch" debugfs entry: lists MCAM entries that are enabled
> but not explicitly allocated, helping diagnose allocation/field issues.
>
> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
https://sashiko.dev/#/patchset/20260602060359.1894952-1-rkannoth%40marvell.com
sashiko says
>> + pf = 0xFFFF;
>> + map = xa_load(&npc_priv->xa_idx2pf_map, mcam_idx);
>> + if (map)
>> + pf = xa_to_value(map);
>> +
>> + delta = stats - dstats[bank][idx];
>> +
>> + snprintf(buff, sizeof(buff), "%u\t%#04x\t%llu\n",
>> + mcam_idx, pf, delta);
>> + seq_puts(s, buff);
>> +
>> + dstats[bank][idx] = stats;
>Does this code allow an out-of-bounds array access?
No.
>In npc_priv_init() (shown below), integer division truncation allows
>a hardware-supplied bank_depth to exceed statically allocated array
>dimensions. Because remainders are truncated, a bank_depth value such
>as 8223 (with num_subbanks = 32) yields subbank_depth = 256.
npc_priv_init() does check against the MAX_BANKDEPTH limit and bails out on
error.
>I noticed this was fixed later in the series in commit "octeontx2-af: npc:
>cn20k: Allocate npc_priv and dstats dynamically.", but this patch
>introduces a concrete buffer overflow.
These checks are in the current patch itself.
>+update_en_map:
>> + if (enable)
>> + set_bit(index, npc_priv.en_map);
>Can index exceed the bitmap size here?
>Due to the same integer division truncation issue mentioned above, index
>can exceed the expected bounds, causing set_bit() to write beyond the
>statically allocated 16384 bits of npc_priv.en_map.
No. The "en_map" bitmap is defined with the maximum possible size. Checks against these max limits
are done during probe (in npc_priv_init()), which returns an error if they are exceeded.
> subbank_depth = bank_depth / num_subbanks;
>> + if (subbank_depth > MAX_SUBBANK_DEPTH) {
>This is a pre-existing issue, but can this integer division evaluate to 0?
>If hardware reports a banksize (which becomes bank_depth) less than
>num_subbanks (e.g., 16 vs 32), the integer division evaluates to 0.
Pre-existing issue, so not addressing it here. We need to check against zero; will post that
as a separate patch to the "net" tree, or maybe as part of a hardening series.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v18 net-next 5/8] octeontx2: cn20k: Coordinate default rules with NIX LF lifecycle
2026-06-02 6:03 ` [PATCH v18 net-next 5/8] octeontx2: cn20k: Coordinate default rules with NIX LF lifecycle Ratheesh Kannoth
@ 2026-06-03 6:37 ` Ratheesh Kannoth
2026-06-04 2:41 ` Ratheesh Kannoth
1 sibling, 0 replies; 21+ messages in thread
From: Ratheesh Kannoth @ 2026-06-03 6:37 UTC (permalink / raw)
To: linux-kernel, netdev
Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
pabeni, sgoutham
On 2026-06-02 at 11:33:56, Ratheesh Kannoth (rkannoth@marvell.com) wrote:
> Add NIX_LF_DONT_FREE_DFT_IDXS so the PF can send NIX LF free during hw
> reinit or teardown without the AF freeing CN20K default NPC rule indexes
> while the driver still owns that state (otx2_init_hw_resources and
> otx2_free_hw_resources).
>
> On CN20K, allocate default NPC rules from NIX LF alloc before
> nix_interface_init, roll back with npc_cn20k_dft_rules_free on failure,
> and free from NIX LF free when the new flag is not set. Tighten
> rvu_mbox_handler_nix_lf_alloc error handling: use a single rc, propagate
> qmem_alloc and other errors, and set -ENOMEM only when kcalloc fails
> (remove the blanket -ENOMEM at the free_mem path).
>
> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
https://sashiko.dev/#/patchset/20260602060359.1894952-1-rkannoth%40marvell.com says
>> /* Alloc NIX RQ HW context memory and config the base */
>> hwctx_size = 1UL << ((ctx_cfg >> 4) & 0xF);
>> - err = qmem_alloc(rvu->dev, &pfvf->rq_ctx, req->rq_cnt, hwctx_size);
>> - if (err)
>> + rc = qmem_alloc(rvu->dev, &pfvf->rq_ctx, req->rq_cnt, hwctx_size);
>> + if (rc)
>> goto free_mem;
>>
>> pfvf->rq_bmap = kcalloc(req->rq_cnt, sizeof(long), GFP_KERNEL);
>> - if (!pfvf->rq_bmap)
>> + if (!pfvf->rq_bmap) {
>> + rc = -ENOMEM;
>> goto free_mem;
>> + }
>This is a pre-existing issue, but can repeated mailbox messages cause a memory
>leak here?
>If an untrusted VF continuously sends MBOX_MSG_NIX_LF_ALLOC messages without
>sending a corresponding NIX_LF_FREE, it appears qmem_alloc() and kcalloc()
>will overwrite the pointers in the pfvf structure (like pfvf->rq_ctx and
>pfvf->rq_bmap) without freeing the previously allocated memory.
>Could this be used by a guest to cause host DMA or heap memory exhaustion?
Pre-existing issue, not addressing it as part of this series. A hardening series after this one,
or a patch that introduces some limit, can be planned.
>> + rc = nix_interface_init(rvu, pcifunc, intf, nixlf, rsp,
>> + !!(req->flags & NIX_LF_LBK_BLK_SEL));
>> + if (rc)
>> + goto free_dft;
>>
>> /* Disable NPC entries as NIXLF's contexts are not initialized yet */
>> rvu_npc_disable_default_entries(rvu, pcifunc, nixlf);
>> @@ -1699,9 +1712,12 @@ int rvu_mbox_handler_nix_lf_alloc(struct rvu *rvu,
>>
>> goto exit;
>>
>> +free_dft:
>> + if (is_cn20k(rvu->pdev))
>> + npc_cn20k_dft_rules_free(rvu, pcifunc);
>Does this rollback path accidentally destroy pre-existing default rules?
>If the PF preserved the rules using NIX_LF_DONT_FREE_DFT_IDXS during teardown,
>npc_cn20k_dft_rules_alloc() returns 0 without allocating anything when they
>are already installed:
>npc_cn20k_dft_rules_alloc(struct rvu *rvu, u16 pcifunc)
>{
> ...
> ret = npc_cn20k_dft_rules_idx_get(rvu, pcifunc, &b, &m, &p, &u);
> if (!ret) {
> dev_dbg(rvu->dev, "default rules are already installed...");
> return 0;
> }
> ...
>}
>If nix_interface_init() subsequently fails, the error handler jumps to free_dft
>and unconditionally frees the rules.
Even if the rules are freed, the next successful NIX LF alloc will set up these default rules again.
We can modify npc_cn20k_dft_rules_alloc() to take one more parameter (bool *) that reports whether the
rules were newly allocated or already present, and invoke npc_cn20k_dft_rules_free() based on that flag.
Will take it up in a hardening series after this series is merged.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v18 net-next 6/8] octeontx2-af: npc: Support for custom KPU profile from filesystem
2026-06-02 6:03 ` [PATCH v18 net-next 6/8] octeontx2-af: npc: Support for custom KPU profile from filesystem Ratheesh Kannoth
@ 2026-06-03 6:46 ` Ratheesh Kannoth
2026-06-04 3:07 ` Ratheesh Kannoth
1 sibling, 0 replies; 21+ messages in thread
From: Ratheesh Kannoth @ 2026-06-03 6:46 UTC (permalink / raw)
To: linux-kernel, netdev
Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
pabeni, sgoutham
On 2026-06-02 at 11:33:57, Ratheesh Kannoth (rkannoth@marvell.com) wrote:
> Flashing updated firmware on deployed devices is cumbersome. Provide a
> mechanism to load a custom KPU (Key Parse Unit) profile directly from
> the filesystem at module load time.
>
> When the rvu_af module is loaded with the kpu_profile parameter, the
> specified profile is read from /lib/firmware/kpu and programmed into
> the KPU registers. Add npc_kpu_profile_cam2 for the extended cam format
> used by filesystem-loaded profiles and support ptype/ptype_mask in
> npc_config_kpucam when profile->from_fs is set.
>
> Usage:
> 1. Copy the KPU profile file to /lib/firmware/kpu.
> 2. Build OCTEONTX2_AF as a module.
> 3. Load: insmod rvu_af.ko kpu_profile=<profile_name>
>
> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
https://sashiko.dev/#/patchset/20260602060359.1894952-1-rkannoth%40marvell.com says
> +static int npc_alloc_kpu_cam2_n_action2(struct rvu *rvu, int kpu_num,
>> + int num_entries)
>> +{
>> + struct npc_kpu_profile_adapter *adapter = &rvu->kpu;
>> + struct npc_kpu_profile *kpu;
>> +
>> + kpu = &adapter->kpu[kpu_num];
>> +
>> + kpu->cam2 = devm_kcalloc(rvu->dev, num_entries,
>> + sizeof(*kpu->cam2), GFP_KERNEL);
>Does this code introduce a use-after-free risk?
No.
>The adapter's kpu pointer (profile->kpu) is set to the global static array
>npc_kpu_profiles in npc_prepare_default_kpu(). When this function is called,
>it allocates device-managed memory (devm_kcalloc) and stores the pointer
>into that globally shared array.
>If the device unbinds, the devres subsystem will automatically free this
>memory, leaving dangling pointers in the global array. A subsequent probe or
>concurrent access by another device could result in a use-after-free.
There is only one AF device per system.
When the device unbinds, the structure that holds these pointers (the global array) is also freed.
> + struct npc_kpu_fwdata *fw_kpu;
>> + int entries, entry, kpu;
>>
>> - if (is_cn20k(rvu->pdev))
>> - return npc_cn20k_apply_custom_kpu(rvu, profile);
>> + fw = rvu->kpu_fwdata;
>> +
>> + for (kpu = 0; kpu < fw->kpus; kpu++) {
>> + if (rvu->kpu_fwdata_sz < hdr_sz + offset) {
>This is a pre-existing issue, but does this boundary check allow an
>out-of-bounds read?
Pre-existing issue, will post a patch to the net tree once this series is merged.
> + return -EINVAL;
>> +
>> + /* The firmware layout does dependent on the internal size of
>> + * ikpu_action_entries.
>> + */
>> + memcpy((void *)profile->ikpu2, action, sizeof(ikpu_action_entries));
>> + offset += sizeof(ikpu_action_entries);
>Will this logic break backward compatibility with existing firmware binaries?
>The firmware parsing offset is incremented by sizeof(ikpu_action_entries).
>Because ikpu_action_entries is a static kernel array whose size is determined
>by the number of supported port kinds (PKINDs), any future addition of a
>PKIND to the kernel will increase the array's size.
>This would misalign the parser's offset calculation for all existing
>firmware binaries on disk. Could this be made independent of the internal
>kernel structure size?
No. Since there is no field in this structure to indicate the number of ikpu_action_entries, we
will always keep the same number of entries.
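For the boundary-check question above, a rough userspace sketch of a bounds-checked read helper. sketch_fw_read() is a hypothetical name and not part of the driver; it only illustrates validating offset and length against the blob size before every copy, in a form that cannot overflow.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy len bytes from the firmware blob at *offset into dst and
 * advance *offset. Returns -EINVAL, leaving *offset unchanged, if
 * the request would read past the end of the blob.
 */
int sketch_fw_read(const unsigned char *fw, size_t fw_sz, size_t *offset,
		   void *dst, size_t len)
{
	assert(fw != NULL && offset != NULL && dst != NULL);

	/* Written as a subtraction so offset + len cannot wrap. */
	if (*offset > fw_sz || len > fw_sz - *offset)
		return -EINVAL;

	memcpy(dst, fw + *offset, len);
	*offset += len;
	return 0;
}
```

A parser built on such a helper would still depend on the fixed entry count mentioned above; the helper only guarantees that a short or truncated file is rejected instead of being read out of bounds.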
>> + const struct npc_mcam_kex *mkex;
>> u64 nibble_ena, rx_kex, tx_kex;
>> u64 *keyx_cfg, reg;
>> u8 intf;
>>
>> + mkex_extr = rvu->kpu.mcam_kex_prfl.mkex_extr;
>> + mkex = rvu->kpu.mcam_kex_prfl.mkex;
>> +
>> if (is_cn20k(rvu->pdev)) {
>> - keyx_cfg = mkex_extr->keyx_cfg;
>> + keyx_cfg = (u64 *)mkex_extr->keyx_cfg;
>This is a pre-existing issue, but does this explicit cast mask a potential
>kernel panic?
>By casting the const pointer mkex_extr->keyx_cfg to a non-const u64 *,
>the code allows mutating the underlying data later in this function
>(e.g., keyx_cfg[NIX_INTF_TX] = tx_kex).
>If the profile was loaded via the firmware database (fwdb), this memory
>is an __iomem region. A direct assignment to __iomem on ARM64 without
>using the appropriate I/O accessors (like writeq) can cause a synchronous
>external abort.
This is firmware loaded in memory (request_firmware_direct()), not iomem.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v18 net-next 7/8] octeontx2: cn20k: Respect NPC MCAM X2/X4 profile in flows and DFT alloc
2026-06-02 6:03 ` [PATCH v18 net-next 7/8] octeontx2: cn20k: Respect NPC MCAM X2/X4 profile in flows and DFT alloc Ratheesh Kannoth
@ 2026-06-03 6:54 ` Ratheesh Kannoth
2026-06-04 3:16 ` Ratheesh Kannoth
1 sibling, 0 replies; 21+ messages in thread
From: Ratheesh Kannoth @ 2026-06-03 6:54 UTC (permalink / raw)
To: linux-kernel, netdev
Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
pabeni, sgoutham
On 2026-06-02 at 11:33:58, Ratheesh Kannoth (rkannoth@marvell.com) wrote:
> Default CN20K NPC rule allocation now keys off the active MCAM keyword
> width: use X4 with a bank-masked reference index when the silicon uses
> X4 keys, and X2 with the raw index otherwise (replacing the previous
> always-X2 / eidx + 1 behaviour).
>
> In the AF flow-install path, flows that need more than 256 key bits
> query the NPC profile; if the platform is fixed to X2 entries, fail
> with -EOPNOTSUPP instead of requesting X4. Otherwise select X4 for the
> MCAM alloc.
>
> On the PF, cache and pass the profile kw_type from npc_get_pfl_info
> through otx2_mcam_pfl_info_get(), and use it when allocating MCAM
> entries for RSS/defaults and when installing ethtool flows on CN20K,
> including masking the reference index for X4 slot layout.
>
> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
https://sashiko.dev/#/patchset/20260602060359.1894952-1-rkannoth%40marvell.com says
>
>> - u16 *x4_slots)
>> +static int otx2_mcam_pfl_info_get(struct otx2_nic *pfvf, u16 *x4_slots, u8 *kw_type)
>> {
>> struct npc_get_pfl_info_rsp *rsp;
>> struct msg_req *req;
>> static struct {
>> bool is_set;
>> - bool is_x2;
>> + u8 kw_type;
>> u16 x4_slots;
>> } pfl_info;
>This is a pre-existing issue, but does using a static structure to cache
>the hardware profile break multi-device support?
These are values of the AF device (a singleton device per system). The AF processes mbox messages serially, so whichever PF updates
the cache first, the value is the same and can be used by all other PFs.
>Additionally, the cache is protected by a per-instance lock, which provides
>no mutual exclusion when multiple distinct devices access it concurrently.
>Could this lead to data races?
No races, as the value is the same for every PF and is written by the first PF that requests it.
>> @@ -289,6 +291,8 @@ int otx2_mcam_entry_init(struct otx2_nic *pfvf)
>> struct npc_mcam_alloc_entry_rsp *rsp;
>> int vf_vlan_max_flows, count;
>> int rc, ref, prio, ent;
>> + u8 kw_type = 0;
>> + u16 x4_slots;
>> u16 dft_idx;
>>
>> ref = 0;
>> @@ -315,6 +319,16 @@ int otx2_mcam_entry_init(struct otx2_nic *pfvf)
>> if (!flow_cfg->def_ent)
>> return -ENOMEM;
>This is a pre-existing issue, but does this function leak devres-managed
>memory when triggered by devlink parameters?
Pre-existing issue, so it will be addressed in the net tree once this series is merged.
>> u8 kw_type = 0;
>> u8 *src, *dst;
>> u16 x4_slots;
>> - bool is_x2;
>This is a pre-existing issue, but does the VF validation error path in
>otx2_add_flow_msg() leak a mailbox message buffer?
>A mailbox message is dynamically allocated early in the function:
> req = otx2_mbox_alloc_msg_npc_install_flow(&pfvf->mbox);
>During the flow specification validation, if the VF is invalid, the
>function unlocks the mailbox and returns an error:
> vf = ethtool_get_flow_spec_ring_vf(ring_cookie);
> if (vf > pci_num_vf(pfvf->pdev)) {
> mutex_unlock(&pfvf->mbox.lock);
> return -EINVAL;
> }
>However, it fails to call otx2_mbox_reset(), which means the message slot
>remains active or consumed in the mailbox queue.
>Could a user repeatedly submitting invalid flows exhaust the mailbox queue
>and break communications with the Admin Function?
Same as above.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v18 net-next 8/8] octeontx2-af: npc: cn20k: Allocate npc_priv and dstats dynamically.
2026-06-02 6:03 ` [PATCH v18 net-next 8/8] octeontx2-af: npc: cn20k: Allocate npc_priv and dstats dynamically Ratheesh Kannoth
@ 2026-06-03 7:03 ` Ratheesh Kannoth
2026-06-04 3:21 ` Ratheesh Kannoth
1 sibling, 0 replies; 21+ messages in thread
From: Ratheesh Kannoth @ 2026-06-03 7:03 UTC (permalink / raw)
To: linux-kernel, netdev
Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
pabeni, sgoutham
On 2026-06-02 at 11:33:59, Ratheesh Kannoth (rkannoth@marvell.com) wrote:
> Replace the file-scope static npc_priv with a kcalloc'd struct filled
> from hardware bank/subbank geometry at init (num_banks is no longer a
> const compile-time constant; drop init_done and use a non-NULL
> npc_priv pointer for liveness). Thread npc_priv_get() / pointer access
> through the CN20K NPC code paths, extend teardown to kfree the root
> struct on failure and in npc_cn20k_deinit, and adjust MCAM section
> setup to use the discovered subbank count.
>
> Allocate MCAM debugfs dstats via devm_kzalloc instead of a static matrix,
> and use the allocated backing store consistently when computing deltas
> (including the counter rollover compare).
>
> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
https://sashiko.dev/#/patchset/20260602060359.1894952-1-rkannoth%40marvell.com says
>>
>> + dstats = devm_kzalloc(rvu->dev, sizeof(*dstats), GFP_KERNEL);
>> + if (!dstats)
>> + return -ENOMEM;
>> +
>Since dstats remains a global static pointer, won't multiple RVU devices
>probing cause this to be unconditionally overwritten, leaking the previously
>allocated memory?
>
There is only one admin-function PCI device per system. Will post a patch to net-next (after this series is merged) to reject any additional AF probe
with -EBUSY, so the driver model matches the hardware and automated reviewers can rely on a single bound instance.
>Additionally, because this uses devm_kzalloc, if a device is unbound, won't
>devres free the memory while leaving the global pointer dangling? Could this
>lead to a use-after-free for other active devices that attempt to read from
>the debugfs file?
There is only one AF device per system, and unbinding the AF device is not supported, so it always stays attached.
>> - bw = bitmap_weight(bmap, npc_priv.subbank_depth);
>> - bfree = npc_priv.subbank_depth - bw;
>> + bw = bitmap_weight(bmap, npc_priv->subbank_depth);
>> + bfree = npc_priv->subbank_depth - bw;
>>
>> if (!bfree) {
>> dev_dbg(rvu->dev, "%s: subbank is full\n", __func__);
>This is a pre-existing issue, but does __npc_subbank_alloc() properly verify
>the bounds of the mcam_idx buffer before writing to it?
Pre-existing issue, will add a bounds check as part of a hardening series after this patch series is merged.
> - npc_priv.subbank_depth = subbank_depth;
>> + npc_priv = kcalloc(1, sizeof(*npc_priv), GFP_KERNEL);
>> + if (!npc_priv)
>> + return -ENOMEM;
>> +
>Since npc_priv remains a file-scope global pointer, if multiple PCI physical
>functions probe, won't the second device initialization overwrite this pointer
>and permanently leak the memory allocated by the first device? Should this
>state be encapsulated within the per-device private data structure (such as
>struct rvu) instead?
There is only one AF device, so only one npc_priv_t instance.
> + kfree(npc_priv->sb);
>> kfree(subbank_srch_order);
>> - bitmap_clear(npc_priv.en_map, 0, MAX_NUM_BANKS * MAX_NUM_SUB_BANKS *
>> + bitmap_clear(npc_priv->en_map, 0, MAX_NUM_BANKS * MAX_NUM_SUB_BANKS *
>> MAX_SUBBANK_DEPTH);
>> + npc_defrag_list_clear();
>> + kfree(npc_priv);
>> + npc_priv = NULL;
>> }
>If one device is torn down and calls npc_cn20k_deinit(), it explicitly frees
>the global npc_priv pointer and sets it to NULL. Won't this immediately break
>any other active RVU devices that are concurrently using this shared global
>pointer, leading to use-after-free or NULL pointer crashes?
>
There is only one AF device, so only one npc_priv_t instance.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v18 net-next 1/8] octeontx2-af: npc: cn20k: debugfs enhancements
2026-06-02 6:03 ` [PATCH v18 net-next 1/8] octeontx2-af: npc: cn20k: debugfs enhancements Ratheesh Kannoth
2026-06-03 6:19 ` Ratheesh Kannoth
@ 2026-06-04 2:19 ` Ratheesh Kannoth
2026-06-04 14:45 ` Jakub Kicinski
1 sibling, 1 reply; 21+ messages in thread
From: Ratheesh Kannoth @ 2026-06-04 2:19 UTC (permalink / raw)
To: linux-kernel, netdev
Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
pabeni, sgoutham
On 2026-06-02 at 11:33:52, Ratheesh Kannoth (rkannoth@marvell.com) wrote:
> Improve MCAM visibility and field debugging for CN20K NPC.
>
> - Extend "mcam_layout" to show enabled (+) or disabled state per entry
> so status can be verified without parsing the full "mcam_entry" dump.
> - Add "dstats" debugfs entry: for enabled MCAM indices, print hit deltas
> since the prior read by comparing hardware counters to a per-entry
> software baseline and advancing that baseline after each read (hardware
> counters are not cleared).
> - Add "mismatch" debugfs entry: lists MCAM entries that are enabled
> but not explicitly allocated, helping diagnose allocation/field issues.
>
> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
https://netdev-ai.bots.linux.dev/sashiko/#/patchset/20260602060359.1894952-1-rkannoth%40marvell.com says
>> +static DEFINE_MUTEX(stats_lock);
>> +
>> +/* MAX_NUM_BANKS, MAX_SUBBANK_DEPTH and MAX_NUM_SUB_BANKS represent
>> + * hard limit on all silicon variants, preventing any possibility of
>> + * out-of-bounds access.
>> + */
>> +static u64 dstats[MAX_NUM_BANKS][MAX_SUBBANK_DEPTH * MAX_NUM_SUB_BANKS] = {};
>The dstats array is a file-scope static and is never reset on driver
>teardown. npc_cn20k_deinit() now clears npc_priv.en_map but leaves
>dstats alone:
> void npc_cn20k_deinit(struct rvu *rvu)
> {
> ...
> bitmap_clear(npc_priv.en_map, 0, MAX_NUM_BANKS * MAX_NUM_SUB_BANKS *
> MAX_SUBBANK_DEPTH);
> }
>On a probe -> remove -> probe sequence (or after an NPC reset) within
>the same module load, do the stale baselines from the previous device
>incarnation remain in dstats[][]?
>The wrap-around guard only self-corrects when the new hardware counter
>is smaller than the stored baseline:
> if (stats < dstats[bank][idx])
> dstats[bank][idx] = 0;
>If the post-reinit counter is greater than the stale value, would the
>first read of dstats report stats - stale_baseline rather than the true
>post-init delta? The commit message states baselines are advanced
>"after each read", which seems to imply per-incarnation correctness.
>Note: this appears to be addressed later in the same series by the
>patch "octeontx2-af: npc: cn20k: Allocate npc_priv and dstats
>dynamically" (4767c16e8909), which moves dstats to a devm_kzalloc()
>allocation that is freshly zeroed at each probe.
There is only one AF device per system, so this case won't happen.
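For reference, the baseline/delta logic under discussion can be modelled in plain C roughly as below. sketch_hit_delta() is a hypothetical helper written for this mail, not code from the patch; it mirrors the quoted hunk (skip zero or unchanged counters, reset the baseline when the counter is smaller, then advance the baseline).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute the hit delta for one MCAM entry.
 * hw:       current hardware counter (never cleared)
 * baseline: software baseline from the previous read
 * Returns true and fills *delta when there is something to print.
 */
bool sketch_hit_delta(uint64_t hw, uint64_t *baseline, uint64_t *delta)
{
	assert(baseline != NULL && delta != NULL);

	if (!hw || hw == *baseline)
		return false;

	/* Counter went backwards (for example after a reset): restart. */
	if (hw < *baseline)
		*baseline = 0;

	*delta = hw - *baseline;
	*baseline = hw;		/* advance the baseline after each read */
	return true;
}
```

In this model a stale baseline of 100 followed by a counter value of 150 reports 50, which is the case the review describes; it only matters if the baseline can outlive the device instance.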
>> + mutex_lock(&stats_lock);
>> + seq_puts(s, "idx\tpfunc\tstats\n");
>> + for (int bank = npc_priv->num_banks - 1; bank >= 0; bank--) {
>> + for (int idx = npc_priv->bank_depth - 1; idx >= 0; idx--) {
>> + mcam_idx = bank * npc_priv->bank_depth + idx;
>> +
>> + if (npc_mcam_idx_2_key_type(rvu, mcam_idx, &key_type))
>> + continue;
>> +
>> + if (key_type == NPC_MCAM_KEY_X4 && bank != 0)
>> + continue;
>> +
>> + if (!test_bit(mcam_idx, npc_priv->en_map))
>> + continue;
>> +
>> + stats = rvu_read64(rvu, blkaddr,
>> + NPC_AF_CN20K_MCAMEX_BANKX_STAT_EXT(idx, bank));
>Can idx exceed the second dimension of dstats[][] here? The inner loop
>runs idx from npc_priv->bank_depth - 1 down to 0, but dstats is sized
>MAX_SUBBANK_DEPTH * MAX_NUM_SUB_BANKS (8192).
>In npc_priv_init() the new validation only constrains the quotient:
> subbank_depth = bank_depth / num_subbanks;
> if (subbank_depth > MAX_SUBBANK_DEPTH) {
> ...
> return -EINVAL;
> }
>Because of integer truncation, bank_depth values in the range
>(MAX_NUM_SUB_BANKS * MAX_SUBBANK_DEPTH, MAX_NUM_SUB_BANKS *
>MAX_SUBBANK_DEPTH + num_subbanks - 1] (for example 8193..8223 with
>num_subbanks=32) pass this check while exceeding 8192.
>mcam->banksize is sourced from a 16-bit hardware-reported field, so the
>value is structurally allowed even if current silicon programs a
>power-of-2 banksize. With such a banksize, would the new
>set_bit(index, npc_priv.en_map) in npc_cn20k_enable_mcam_entry() (which
>only validates index < mcam->total_entries) and the dstats[bank][idx]
>indexing here write past the bitmap and array?
>The new comment in cn20k/npc.h states:
> /* MAX_NUM_BANKS, MAX_SUBBANK_DEPTH and MAX_NUM_SUB_BANKS represent
> * hard limit on all silicon variants, preventing any possibility of
> * out-of-bounds access on matrix defined using these values.
> */
>Would an explicit check such as bank_depth > MAX_NUM_SUB_BANKS *
>MAX_SUBBANK_DEPTH (or mcam->total_entries against the bitmap size) make
>the comment match the code?
>> + if (!stats)
>> + continue;
>> + if (stats == dstats[bank][idx])
>> + continue;
>> +
>> + if (stats < dstats[bank][idx])
>> + dstats[bank][idx] = 0;
>> +
npc_priv_init() does check against the MAX_SUBBANK_DEPTH and MAX_NUM_SUB_BANKS limits during probe().
So the condition won't happen.
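To make the geometry question concrete, here is a small standalone sketch of the kind of check being asked for. The SKETCH_* limits are taken from the numbers quoted in the review (32 subbanks, 8192 entries per bank) and sketch_check_geometry() is a hypothetical function, not the npc_priv_init() code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Illustrative limits, matching the values quoted in the review. */
#define SKETCH_MAX_NUM_SUB_BANKS	32
#define SKETCH_MAX_SUBBANK_DEPTH	256

/* Validate bank geometry; returns 0 on success or -EINVAL. */
int sketch_check_geometry(unsigned int bank_depth, unsigned int num_subbanks,
			  unsigned int *subbank_depth)
{
	assert(subbank_depth != NULL);

	if (!num_subbanks || num_subbanks > SKETCH_MAX_NUM_SUB_BANKS)
		return -EINVAL;

	/* Bound the raw depth so a truncated quotient cannot hide it. */
	if (bank_depth > SKETCH_MAX_NUM_SUB_BANKS * SKETCH_MAX_SUBBANK_DEPTH)
		return -EINVAL;

	*subbank_depth = bank_depth / num_subbanks;

	/* bank_depth < num_subbanks yields a zero quotient. */
	if (!*subbank_depth || *subbank_depth > SKETCH_MAX_SUBBANK_DEPTH)
		return -EINVAL;

	return 0;
}
```

With these limits, 8192 / 32 passes with a subbank depth of 256, while 8223 / 32 (quotient 256 after truncation) and 16 / 32 (quotient 0) are both rejected.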
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v18 net-next 4/8] octeontx2-af: npc: cn20k: add subbank search order control
2026-06-02 6:03 ` [PATCH v18 net-next 4/8] octeontx2-af: npc: cn20k: add subbank search order control Ratheesh Kannoth
@ 2026-06-04 2:34 ` Ratheesh Kannoth
0 siblings, 0 replies; 21+ messages in thread
From: Ratheesh Kannoth @ 2026-06-04 2:34 UTC (permalink / raw)
To: linux-kernel, netdev
Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
pabeni, sgoutham
On 2026-06-02 at 11:33:55, Ratheesh Kannoth (rkannoth@marvell.com) wrote:
> CN20K NPC MCAM is split into 32 subbanks that are searched in a
> predefined order during allocation. Lower-numbered subbanks have
> higher priority than higher-numbered ones.
>
> Add a runtime "srch_order" to control the order in which
> subbanks are searched during MCAM allocation.
>
> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
https://netdev-ai.bots.linux.dev/sashiko/#/patchset/20260602060359.1894952-1-rkannoth%40marvell.com says
> @@ -3906,6 +3906,122 @@ static void npc_unlock_all_subbank(void)
>> mutex_unlock(&npc_priv.sb[i].lock);
>> }
>>
>> +int npc_cn20k_search_order_set(struct rvu *rvu,
>> + u64 narr[MAX_NUM_SUB_BANKS], int cnt)
>> +{
>> + struct npc_mcam *mcam = &rvu->hw->mcam;
>> + int rsrc[2][MAX_NUM_SUB_BANKS] = { };
>> + u8 save[MAX_NUM_SUB_BANKS] = { };
>> + struct npc_subbank *sb;
>> + struct xarray *xa;
>> + int prio, rc, err;
>> + int sb_idx;
>The rvu pointer is taken as an argument, but the state being mutated
>(subbank_srch_order, restrict_valid, npc_priv) lives in file-scope statics
>in cn20k/npc.c. The rvu parameter is used only for dev_err() and to grab
>mcam->lock. Devlink parameters are per-device, so what happens on systems
>with more than one CN20K AF instance? It looks like setting npc_srch_order
>on one instance would silently apply to all of them, and a get on any
>instance would return whatever the last writer set globally. Was this
>intended, or should the state be moved under struct rvu (or its hw/npc
>container)?
There is only one AF device per system.
>The save[] array stores sb->arr_idx, but arr_idx is u16 in struct
>npc_subbank while save[] is declared u8. With MAX_NUM_SUB_BANKS == 32
>today the values fit, but is a BUILD_BUG_ON or a u16 declaration
>appropriate so that a future bump of MAX_NUM_SUB_BANKS does not silently
>truncate restored values during rollback? The neighbouring rsrc[][] array
>is correctly typed int.
The maximum number of subbanks available on a high-end SoC is 32.
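If a compile-time guard is wanted anyway, it could look roughly like the sketch below. This uses C11 _Static_assert in userspace form (the kernel equivalent would be BUILD_BUG_ON); SKETCH_MAX_NUM_SUB_BANKS and sketch_save_arr_idx() are illustrative names only.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_NUM_SUB_BANKS	32

/* Every valid arr_idx (0..MAX-1) must fit in the u8 save slot. */
_Static_assert(SKETCH_MAX_NUM_SUB_BANKS - 1 <= UINT8_MAX,
	       "save[] element type too narrow for arr_idx");

/* Narrow a u16 arr_idx into the u8 save slot. */
uint8_t sketch_save_arr_idx(uint16_t arr_idx)
{
	/* Runtime counterpart of the compile-time guard above. */
	assert(arr_idx < SKETCH_MAX_NUM_SUB_BANKS);
	return (uint8_t)arr_idx;
}
```

The build would then fail if the subbank limit were ever raised past what the u8 array can hold, instead of truncating silently during rollback.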
>> + for (int i = 0; i < cnt; i++)
>> + subbank_srch_order[i] = (u32)narr[i];
>> +
>> + restrict_valid = false;
>restrict_valid is unconditionally set to false here on every successful
>set, and there is no path elsewhere that sets it back to true. The flag
>gates several other behaviours in cn20k/npc.c:
> - npc_subbank_restrict_usage() returns false once restrict_valid is
> false (disabling the minimization of allocations from restricted
> subbanks).
> - The "Allocate from restricted subbanks" fallback loop is skipped.
> - npc_defrag_skip_restricted_sb() always returns false.
>Is it intended that the first userspace set of npc_srch_order silently
>disables the restricted-subbank usage-minimization feature and the
>restricted-subbank allocation fallback for the remaining lifetime of the
>driver? If so, could that be documented and exposed through the same
>uAPI so userspace can observe and revert it?
This is intentional. Once the user changes the search order, "restrict_valid" is no longer valid.
>> +fail:
>> + for (prio = 0; prio < cnt; prio++) {
>> + if (rsrc[FREE][prio] == 1)
>> + xa_erase(&npc_priv.xa_sb_free, prio);
>> +
>> + if (rsrc[USED][prio] == 1)
>> + xa_erase(&npc_priv.xa_sb_used, prio);
>> + }
>> +
>> + for (sb_idx = 0; sb_idx < cnt; sb_idx++) {
>> + sb = &npc_priv.sb[sb_idx];
>> + sb->arr_idx = save[sb_idx];
>> +
>> + if (sb->flags & NPC_SUBBANK_FLAG_USED)
>> + xa = &npc_priv.xa_sb_used;
>> + else
>> + xa = &npc_priv.xa_sb_free;
>> +
>> + /* Since the entry already exists, xa_store() replaces
>> + * the value without a kmalloc(), making failure highly unlikely.
>> + */
>> + err = xa_err(xa_store(xa, sb->arr_idx,
>> + xa_mk_value(sb->idx), GFP_KERNEL));
>> + WARN(!!err, "Failed to rollback sb=%u idx=%u\n",
>> + sb->idx, sb->arr_idx);
>> + }
>Is the comment above accurate on the failure path?
>The first loop in fail: erases keys where rsrc[FLAG][prio] == 1, which
>are exactly the prio slots written by completed iterations. When the
>rollback loop then writes the saved subbanks back to those keys via
>xa_store(xa, sb->arr_idx, ...), those keys are no longer present, so the
>xa_store has to allocate a new node and is not a simple replace.
>Under memory pressure that allocation can return -ENOMEM, and the only
>response here is WARN().
rsrc[FLAG][prio] == 1 indicates entries that were newly allocated, so the rollback
deletes those. The other slots still exist, so xa_store() won't fail due to memory pressure.
>> +static int rvu_af_dl_npc_srch_order_get(struct devlink *devlink, u32 id,
>> + struct devlink_param_gset_ctx *ctx,
>> + struct netlink_ext_ack *extack)
>> +{
>> + bool restricted_order;
>> + const u32 *order;
>> + u32 sz;
>> +
>> + order = npc_cn20k_search_order_get(&restricted_order, &sz);
>> + ctx->val.u64arr.size = sz;
>> + for (int i = 0; i < sz; i++)
>> + ctx->val.u64arr.val[i] = order[i];
>> +
>> + return 0;
>> +}
>The restricted_order out parameter is written here but never read. The
>same is true in rvu_af_dl_npc_srch_order_validate() below. Is the
>intention to expose whether the default (restricted) order or a
>user-provided order is currently active to userspace, perhaps via a
>separate read-only param? As written, the value never leaves the local
>stack, so userspace has no way to learn it.
If the default srch_order is modified, restricted_order will be false. There is
no need to read this value from userspace.
* Re: [PATCH v18 net-next 5/8] octeontx2: cn20k: Coordinate default rules with NIX LF lifecycle
2026-06-02 6:03 ` [PATCH v18 net-next 5/8] octeontx2: cn20k: Coordinate default rules with NIX LF lifecycle Ratheesh Kannoth
2026-06-03 6:37 ` Ratheesh Kannoth
@ 2026-06-04 2:41 ` Ratheesh Kannoth
1 sibling, 0 replies; 21+ messages in thread
From: Ratheesh Kannoth @ 2026-06-04 2:41 UTC (permalink / raw)
To: linux-kernel, netdev
Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
pabeni, sgoutham
On 2026-06-02 at 11:33:56, Ratheesh Kannoth (rkannoth@marvell.com) wrote:
> Add NIX_LF_DONT_FREE_DFT_IDXS so the PF can send NIX LF free during hw
> reinit or teardown without the AF freeing CN20K default NPC rule indexes
> while the driver still owns that state (otx2_init_hw_resources and
> otx2_free_hw_resources).
>
> On CN20K, allocate default NPC rules from NIX LF alloc before
> nix_interface_init, roll back with npc_cn20k_dft_rules_free on failure,
> and free from NIX LF free when the new flag is not set. Tighten
> rvu_mbox_handler_nix_lf_alloc error handling: use a single rc, propagate
> qmem_alloc and other errors, and set -ENOMEM only when kcalloc fails
> (remove the blanket -ENOMEM at the free_mem path).
>
> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
https://netdev-ai.bots.linux.dev/sashiko/#/patchset/20260602060359.1894952-1-rkannoth%40marvell.com says:
>4. nix_interface_init() then calls
> nix_update_mce_rule(NIXLF_BCAST_ENTRY, true), which can fail (for
> example -EINVAL from npc_get_nixlf_mcam_index in the cn20k path,
> or kzalloc failure in nix_update_mce_list).
>5. The new error path goto free_dft -> npc_cn20k_dft_rules_free()
> walks mcam->mcam_rules and kfree()s every rule whose entry matches
> one of the freed dft indexes, including the rule installed in
> step 3.
>6. pfvf->def_ucast_rule is not cleared, so it now points at freed
> memory.
>Subsequent mbox traffic for the same pcifunc dereferences that
>dangling pointer in places like:
If NIX LF alloc fails, subsequent mbox traffic won't arrive for that pcifunc.
The next NIX LF alloc would allocate the default MCAM rules again. In practice this is not possible.
I will add a hardening patch series to remove the idempotency attribute of
npc_cn20k_dft_rules_alloc() after this series is merged.
* Re: [PATCH v18 net-next 6/8] octeontx2-af: npc: Support for custom KPU profile from filesystem
2026-06-02 6:03 ` [PATCH v18 net-next 6/8] octeontx2-af: npc: Support for custom KPU profile from filesystem Ratheesh Kannoth
2026-06-03 6:46 ` Ratheesh Kannoth
@ 2026-06-04 3:07 ` Ratheesh Kannoth
1 sibling, 0 replies; 21+ messages in thread
From: Ratheesh Kannoth @ 2026-06-04 3:07 UTC (permalink / raw)
To: linux-kernel, netdev
Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
pabeni, sgoutham
On 2026-06-02 at 11:33:57, Ratheesh Kannoth (rkannoth@marvell.com) wrote:
> Flashing updated firmware on deployed devices is cumbersome. Provide a
> mechanism to load a custom KPU (Key Parse Unit) profile directly from
> the filesystem at module load time.
>
> When the rvu_af module is loaded with the kpu_profile parameter, the
> specified profile is read from /lib/firmware/kpu and programmed into
> the KPU registers. Add npc_kpu_profile_cam2 for the extended cam format
> used by filesystem-loaded profiles and support ptype/ptype_mask in
> npc_config_kpucam when profile->from_fs is set.
>
> Usage:
> 1. Copy the KPU profile file to /lib/firmware/kpu.
> 2. Build OCTEONTX2_AF as a module.
> 3. Load: insmod rvu_af.ko kpu_profile=<profile_name>
>
> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
https://netdev-ai.bots.linux.dev/sashiko/#/patchset/20260602060359.1894952-1-rkannoth%40marvell.com says
>> - for (idx = 0; idx < num_pkinds; idx++)
>> - npc_config_kpmaction(rvu, blkaddr, &rvu->kpu.ikpu[idx],
>> + /* Cn20k does not support Custom profile from filesystem */
>> + for (idx = 0; idx < num_pkinds; idx++) {
>> + act = npc_get_ikpu_nth_entry(rvu, idx);
>> + if (!act)
>> + continue;
>> +
>> + npc_config_kpmaction(rvu, blkaddr, act,
>> 0, idx, true);
>> + }
>Can act ever be NULL on this path? npc_get_ikpu_nth_entry() returns
>either &profile->ikpu2[n] (when from_fs is true) or &profile->ikpu[n],
>neither of which can be NULL. And as the comment above notes, cn20k
>does not take the from_fs path at all (it is gated by !is_cn20k() in
>npc_load_kpu_profile()), so ikpu always points at the static
>ikpu_action_entries[].
>If that is the case, the if (!act) continue check is dead.
I agree, but there is a plan to add it in the future, so I would like to keep this check.
>> + if (profile->from_fs) {
>> + u8 ptype = kpucam2->ptype;
>> + u8 pmask = kpucam2->ptype_mask;
>> +
>> + *val |= FIELD_PREP(GENMASK_ULL(57, 56), ptype & pmask);
>> + *mask |= FIELD_PREP(GENMASK_ULL(57, 56), ~ptype & pmask);
>> + }
>The cast const struct npc_kpu_profile_cam2 *kpucam2 = (void *)kpucam is
>unconditional, but the function's parameter is typed as
>struct npc_kpu_profile_cam *. The two structures differ in size
>(12 vs 14 bytes packed).
>The reads of kpucam2->ptype and kpucam2->ptype_mask are gated by
>profile->from_fs, but the gate relies on a cross-function invariant:
>the from_fs flag must imply that npc_get_kpu_cam_nth_entry() was used
>and returned a cam2-typed element. There is no static assertion or
>runtime check tying these two together.
>Could a BUILD_BUG_ON ensuring
>offsetof(struct npc_kpu_profile_cam2, ptype) == sizeof(struct npc_kpu_profile_cam)
>be added so that a future layout change is caught at compile time?
>The bits that ptype/ptype_mask are written into are bits 57:56 of cam0
>and cam1, but struct npc_kpu_cam in npc.h still declares those bits as
>part of rsvd_63_56 : 8 in both endianness branches. Should the cam
>struct be updated with an explicit ptype : 2 / rsvd_63_58 : 6 split (or
>similar) so the contract that those two reserved bits are now live is
>visible at the struct definition?
>Is there also a way to assert that the silicon variants on which the
>filesystem profile path runs (cn9k, cn10k) actually decode bits 57:56
>as ptype-match?
The rsvd_63_56 field in struct npc_kpu_cam remains intentionally generic because this shared structure header
is consumed across multiple legacy silicon generations where those bits are strictly reserved.
Modifying the base bitfield definition would break backward compatibility metrics for older targets.
The localized use of FIELD_PREP(GENMASK_ULL(57, 56), ...) inside the from_fs block serves as the explicit.
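To make the layout question concrete, here is a minimal user-space sketch of the compile-time check the reviewer asks about, together with the bits 57:56 encoding from the quoted hunk. The struct layouts and the names cam_v1, cam_v2 and encode_ptype() are hypothetical and do not reproduce the driver's real definitions; static_assert stands in for BUILD_BUG_ON(), and __attribute__((packed)) assumes GCC or Clang:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts, NOT the driver's real structures. */
struct cam_v1 {
	uint8_t state, state_mask;
	uint16_t dp0, dp0_mask;
} __attribute__((packed));

struct cam_v2 {
	uint8_t state, state_mask;
	uint16_t dp0, dp0_mask;
	uint8_t ptype, ptype_mask;	/* extension appended at the tail */
} __attribute__((packed));

/* The extension must start exactly where the base layout ends. */
static_assert(offsetof(struct cam_v2, ptype) == sizeof(struct cam_v1),
	      "cam_v2 must extend cam_v1 at its tail");

#define PTYPE_SHIFT 56
#define PTYPE_FIELD 0x3ULL	/* two bits wide: 57:56 */

/* OR the ptype match into bits 57:56 of a CAM value/mask pair. */
static void encode_ptype(uint64_t *val, uint64_t *mask,
			 uint8_t ptype, uint8_t pmask)
{
	*val  |= ((uint64_t)(ptype & pmask) & PTYPE_FIELD) << PTYPE_SHIFT;
	*mask |= ((uint64_t)(~ptype & pmask) & PTYPE_FIELD) << PTYPE_SHIFT;
}
```

The assertion only ties the two layouts together; it does not check that a given caller really passes a cam_v2 element, which remains the cross-function invariant discussed above.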
>> + return -ENOMEM;
>> +
>> + kpu->action2 = devm_kcalloc(rvu->dev, num_entries,
>> + sizeof(*kpu->action2), GFP_KERNEL);
>> + if (!kpu->action2)
>> + return -ENOMEM;
>> +
>> + return 0;
>> +}
>Could this function be storing per-device devm allocations in
>module-global static state?
>adapter->kpu is set by npc_prepare_default_kpu() to npc_kpu_profiles
>from npc_profile.h, which is a static module-global array.
>kpu->cam2 and kpu->action2 are devm_kcalloc(rvu->dev, ...) so the
>underlying memory is freed at device release, but the pointers (and
>the cam_entries2/action_entries2 counters set in
>npc_apply_custom_kpu_from_fs()) remain in the static array.
>After unbind, would a subsequent rebind that uses the default profile
>or the from_fw path leave dangling pointers and non-zero
There is only one AF device per system.
>> +static int npc_apply_custom_kpu_from_fw(struct rvu *rvu,
>> + struct npc_kpu_profile_adapter *profile)
>> {
>> size_t hdr_sz = sizeof(struct npc_kpu_profile_fwdata), offset = 0;
>> + const struct npc_kpu_profile_fwdata *fw;
>> struct npc_kpu_profile_action *action;
>> - struct npc_kpu_profile_fwdata *fw;
>> struct npc_kpu_profile_cam *cam;
>> struct npc_kpu_fwdata *fw_kpu;
>> - int entries;
>> - u16 kpu, entry;
>> + int entries, entry, kpu;
>>
>> - if (is_cn20k(rvu->pdev))
>> - return npc_cn20k_apply_custom_kpu(rvu, profile);
>> + fw = rvu->kpu_fwdata;
>> +
>> + for (kpu = 0; kpu < fw->kpus; kpu++) {
>> + if (rvu->kpu_fwdata_sz < hdr_sz + offset) {
>> + dev_warn(rvu->dev,
>> + "Profile size mismatch on KPU%i parsing\n",
>> + kpu + 1);
>> + return -EINVAL;
>> + }
>> +
>> + fw_kpu = (struct npc_kpu_fwdata *)(fw->data + offset);
>> + if (fw_kpu->entries < 0) {
>Is the size check above missing sizeof(*fw_kpu)? The check guards
>that offset is reachable, but the next statement reads fw_kpu->entries
>which is at fw->data + offset + 0..3. When kpu_fwdata_sz equals
>hdr_sz + offset exactly, the read of fw_kpu->entries is past the end
>of the firmware buffer.
This is a pre-existing issue in the code; I will fix it in a patch to the "net" tree
once this series is merged.
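As a sketch of the bounds check under discussion: the size test has to cover the record header that is about to be read, not only the offset at which it starts. This is a user-space illustration only; struct rec_hdr and read_entries() are hypothetical names, and the real record type is struct npc_kpu_fwdata:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-KPU record header holding the entry count. */
struct rec_hdr {
	int32_t entries;
};

/* Return 0 and store the count if a complete record header is readable at
 * hdr_sz + offset, or -1 if the buffer is too short. The comparisons are
 * written as subtractions so they cannot wrap around.
 */
static int read_entries(const uint8_t *buf, size_t buf_sz, size_t hdr_sz,
			size_t offset, int32_t *entries)
{
	struct rec_hdr hdr;

	if (hdr_sz > buf_sz || offset > buf_sz - hdr_sz ||
	    buf_sz - hdr_sz - offset < sizeof(hdr))
		return -1;

	/* memcpy avoids an unaligned access into the blob. */
	memcpy(&hdr, buf + hdr_sz + offset, sizeof(hdr));
	*entries = hdr.entries;
	return 0;
}
```

With this form, a buffer whose size equals hdr_sz + offset exactly is rejected, which is the boundary case the review points out.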
>>
>> fw = rvu->kpu_fwdata;
>>
>> + /* Binary blob contains ikpu actions entries at start of data[0] */
>> + profile->ikpu2 = devm_kcalloc(rvu->dev, 1,
>> + sizeof(ikpu_action_entries),
>> + GFP_KERNEL);
>> + if (!profile->ikpu2)
>> + return -ENOMEM;
>> +
>> + action = (struct npc_kpu_profile_action *)(fw->data + offset);
>> +
>> + if (rvu->kpu_fwdata_sz < hdr_sz + sizeof(ikpu_action_entries))
>> + return -EINVAL;
>Should the size validation happen before the devm allocation? When
>kpu_fwdata_sz is too small the function returns -EINVAL with
>profile->ikpu2 already allocated and stored. Devm reclaims at device
>release, but on every malformed-firmware probe a small buffer is left
>orphaned and reachable through profile->ikpu2 (with from_fs reset to
>false on the fallback path, so it is not currently dereferenced).
There is only one AF device per system, and the profiles are loaded at driver
init() time.
>> +
>> + /* The firmware layout does dependent on the internal size of
>> + * ikpu_action_entries.
>> + */
>> + memcpy((void *)profile->ikpu2, action, sizeof(ikpu_action_entries));
>> + offset += sizeof(ikpu_action_entries);
>Is the firmware ABI here tied to a kernel-internal sizeof?
>ikpu_action_entries[] is a static array in npc_profile.h. The
>filesystem blob has no field declaring how many ikpu entries it
>contains, so the offset advance depends on the kernel build. If a
>future kernel adds a pkind to ikpu_action_entries, every existing
>filesystem blob will be silently misparsed (offset advances by the
>wrong amount, then cam entries are read from the wrong location)
>even though the major version check still passes.
We support little endian only. The ikpu_action_entries structure currently has no field
to hold a "size", and we can't add one now as it would break backward compatibility.
>> return npc_cn20k_apply_custom_kpu(rvu, profile);
>>
>> if (rvu->kpu_fwdata_sz < hdr_sz) {
>> dev_warn(rvu->dev, "Invalid KPU profile size\n");
>> return -EINVAL;
>> }
>> +
>> + fw = rvu->kpu_fwdata;
>> if (le64_to_cpu(fw->signature) != KPU_SIGN) {
>> dev_warn(rvu->dev, "Invalid KPU profile signature %llx\n",
>> fw->signature);
>The new filesystem-loaded blob and the existing fwdb blob share the
>same KPU_SIGN signature and the same struct npc_kpu_profile_fwdata
>header, but data[] has a different binary layout: the filesystem
>format prepends a sizeof(ikpu_action_entries) block of ikpu actions
>and uses npc_kpu_profile_cam2 (16 bytes packed) instead of
>npc_kpu_profile_cam (14 bytes packed).
>Dispatch is decided entirely by the loader path:
> return from_fs ? npc_apply_custom_kpu_from_fs(rvu, profile) :
> npc_apply_custom_kpu_from_fw(rvu, profile);
>with nothing in the file itself indicating which layout applies.
>If a fwdb-formatted blob is dropped into /lib/firmware/kpu/<name>,
>signature and version checks pass, then the parser silently
>misinterprets the bytes, programming arbitrary values plus enabled
>ptype-match bits into the KPU CAM and PKIND ACTION CSRs.
>Should the on-disk header carry a format/cam_format_version field, or
>should the cam2 layout use a distinct signature?
This is intentional as per customer requirement.
>> +
>> profile->custom = 1;
>> - profile->name = fw->name;
>> + profile->name = sfw->name;
>> profile->version = le64_to_cpu(fw->version);
>> - profile->mcam_kex_prfl.mkex = &fw->mkex;
>> - profile->lt_def = &fw->lt_def;
>If npc_load_kpu_profile_from_fs() runs first and fails, then
>npc_load_kpu_profile_from_fw() retries via npc_apply_custom_kpu()
>again, a second sfw is allocated and overwrites profile->name,
>profile->mcam_kex_prfl.mkex and profile->lt_def. The first sfw is
>still tracked by devm but no longer referenced.
>Is one orphaned sfw per probe acceptable, or should the helper reuse
>the existing sfw on retry?
There is only one AF device per system.
>> + release_firmware(fw);
>> + rvu->kpu_fwdata = NULL;
>> +
>> + if (ret) {
>> + rvu->kpu_fwdata_sz = 0;
>> + dev_err(rvu->dev,
>> + "Loading KPU profile from filesystem failed\n");
>> + return ret;
>> + }
>On the success path, kpu_fwdata is cleared but kpu_fwdata_sz is left
>non-zero. On the failure path both are cleared. Is the asymmetry
>intentional?
Yes.
>
>> for (idx = 0; idx < num_kpus; idx++)
>> npc_program_kpu_profile(rvu, blkaddr, idx, &rvu->kpu.kpu[idx]);
>> +
>> + if (profile->from_fs) {
>> + rvu_write64(rvu, blkaddr, NPC_AF_PKINDX_TYPE(54), 0x03);
>> + rvu_write64(rvu, blkaddr, NPC_AF_PKINDX_TYPE(58), 0x03);
>> + }
>> }
>A few questions about these unconditional writes:
>PKIND indices 54 and 58 are hardcoded. Other sites in the file clamp
>via min_t(int, hw->npc_pkinds, num_pkinds) before programming PKIND
>CSRs. Is there a guarantee that hw->npc_pkinds is greater than 58 on
>all silicon variants that take this path?
>The value 0x03 has no comment explaining what it programs. Could a
>brief comment or named constant be added?
>These writes are profile-specific (they encode knowledge of two
>particular PKINDs the customer's profile uses) but apply to any
>filesystem-loaded blob. Should these settings be expressed in the
>firmware blob itself rather than hardcoded in the kernel?
>The commit message describes the change as programming the profile
>read from the file and does not mention these register writes.
This is very much hw specific and intentional.
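For reference, the reviewer's two requests (named constants and a range check against the discovered PKIND count) could look roughly like the user-space sketch below. The macro names, the pkind_type[] array standing in for the NPC_AF_PKINDX_TYPE registers, and program_pkind_type() are all hypothetical; the thread does not say what the value 0x03 encodes, so the sketch does not claim a meaning for it:

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PKIND_SLOTS	64	/* size of the stand-in register file */
#define PKIND_A		54	/* indexes taken from the quoted hunk */
#define PKIND_B		58
#define PKIND_TYPE_VAL	0x03	/* meaning not stated in the thread */

/* Stand-in for the per-PKIND type registers. */
static uint64_t pkind_type[NUM_PKIND_SLOTS];

/* Program one PKIND type register, refusing indexes the hardware
 * (npc_pkinds) does not implement.
 */
static int program_pkind_type(unsigned int npc_pkinds, unsigned int pkind,
			      uint64_t val)
{
	if (pkind >= npc_pkinds || pkind >= NUM_PKIND_SLOTS)
		return -1;

	pkind_type[pkind] = val;
	return 0;
}
```

Whether such a guard is needed depends on the PKIND count of the silicon variants that take this path, which the thread leaves open.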
* Re: [PATCH v18 net-next 7/8] octeontx2: cn20k: Respect NPC MCAM X2/X4 profile in flows and DFT alloc
2026-06-02 6:03 ` [PATCH v18 net-next 7/8] octeontx2: cn20k: Respect NPC MCAM X2/X4 profile in flows and DFT alloc Ratheesh Kannoth
2026-06-03 6:54 ` Ratheesh Kannoth
@ 2026-06-04 3:16 ` Ratheesh Kannoth
1 sibling, 0 replies; 21+ messages in thread
From: Ratheesh Kannoth @ 2026-06-04 3:16 UTC (permalink / raw)
To: linux-kernel, netdev
Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
pabeni, sgoutham
On 2026-06-02 at 11:33:58, Ratheesh Kannoth (rkannoth@marvell.com) wrote:
> Default CN20K NPC rule allocation now keys off the active MCAM keyword
> width: use X4 with a bank-masked reference index when the silicon uses
> X4 keys, and X2 with the raw index otherwise (replacing the previous
> always-X2 / eidx + 1 behaviour).
>
> In the AF flow-install path, flows that need more than 256 key bits
> query the NPC profile; if the platform is fixed to X2 entries, fail
> with -EOPNOTSUPP instead of requesting X4. Otherwise select X4 for the
> MCAM alloc.
>
> On the PF, cache and pass the profile kw_type from npc_get_pfl_info
> through otx2_mcam_pfl_info_get(), and use it when allocating MCAM
> entries for RSS/defaults and when installing ethtool flows on CN20K,
> including masking the reference index for X4 slot layout.
>
> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
https://netdev-ai.bots.linux.dev/sashiko/#/patchset/20260602060359.1894952-1-rkannoth%40marvell.com says:
>> --- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
>> +++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
>> @@ -4533,11 +4539,18 @@ int npc_cn20k_dft_rules_alloc(struct rvu *rvu, u16 pcifunc)
>> * as NPC_DFT_RULE_PRIO - 1 (higher hw priority)
>> */
>> req.contig = false;
>> - req.kw_type = NPC_MCAM_KEY_X2;
>> req.count = cnt;
>> req.hdr.pcifunc = pcifunc;
>> req.ref_prio = NPC_MCAM_LOWER_PRIO;
>> - req.ref_entry = eidx + 1;
>> +
>> + if (npc_priv.kw == NPC_MCAM_KEY_X4) {
>> + req.kw_type = NPC_MCAM_KEY_X4;
>> + req.ref_entry = eidx & (npc_priv.bank_depth - 1);
>> + } else {
>> + req.kw_type = NPC_MCAM_KEY_X2;
>> + req.ref_entry = eidx;
>> + }
>> +
>The first allocation block in npc_cn20k_dft_rules_alloc() previously used
>req.ref_entry = eidx, so adding only the X4 path to it is purely additive.
>This second (LOWER_PRIO fallback) block is different: it used to use
>req.ref_entry = eidx + 1, and the new X2 branch silently drops the +1 and
>uses eidx instead.
>Was the previous eidx + 1 (that is, 2 * npc_priv.bank_depth) in the
>fallback path intentional?
This is intentional; it fixes a bug, as eidx + 1 may cross the boundary of the resource.
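A small worked example of the reference-index selection in the quoted hunk may help. This is a user-space sketch with hypothetical names (enum key_type, ref_entry_for()); the bank depth of 4096 used in the usage note is an example value only, and the mask acts as a modulo only when bank_depth is a power of two:

```c
#include <assert.h>
#include <stdint.h>

enum key_type { KEY_X2, KEY_X4 };	/* hypothetical stand-ins */

/* Pick the reference entry for an allocation request: X4 keys use the
 * index within a bank, X2 keys use the raw index.
 */
static uint16_t ref_entry_for(enum key_type kw, uint16_t eidx,
			      uint16_t bank_depth)
{
	if (kw == KEY_X4)
		return eidx & (bank_depth - 1);

	return eidx;
}
```

With bank_depth = 4096 and eidx = 8191 (the last entry of the second bank), the X4 branch yields 4095 and the X2 branch yields 8191; the old eidx + 1 would have been 8192, one past that range.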
>> - if (kw_bits > 256)
>> + if (kw_bits > 256) {
>> + rvu_mbox_handler_npc_get_pfl_info(rvu, &req, &rsp);
>> + if (rsp.kw_type == NPC_MCAM_KEY_X2) {
>> + dev_err(rvu->dev,
>> + "Only X2 entries are supported in X2 profile\n");
>> + return -EOPNOTSUPP;
>> + }
>> +
>> *kw_type = NPC_MCAM_KEY_X4;
>> + }
>A couple of concerns here.
>The struct msg_req req; is left uninitialized on the stack and then passed
>to rvu_mbox_handler_npc_get_pfl_info(). The handler does not currently
>read req, but it is a registered mbox dispatch entry whose contract has
>its hdr filled in by the mbox layer, so should req be zero-initialized at
>the declaration to match that contract, similar to the on-the-same-line
>struct npc_get_pfl_info_rsp rsp = { 0 } above it?
>The int return of rvu_mbox_handler_npc_get_pfl_info() is also discarded,
>and rsp.kw_type is then tested unconditionally. Since rsp starts at zero
>and NPC_MCAM_KEY_X1 is 0, a hypothetical handler failure would skip the
>X2-profile guard and select NPC_MCAM_KEY_X4. The handler's only failure
>return today is the !is_cn20k() check, which the function has already
>short-circuited above, so there is no live failure path; would it still
>make sense to capture the return and bail out, matching the rc-checking
>pattern used for rvu_mbox_handler_npc_get_num_kws() just a few lines
>earlier?
rvu_mbox_handler_npc_get_pfl_info() behaves purely as a local accessor function in this execution path.
It retrieves a statically cached profile configuration without processing any mailboxes.
However, to maintain strict consistency with the mailbox contract and prevent
any future static analysis tools from flagging uninitialized stack usage, we will add an error handler
as part of the hardening series once this patch series is merged.
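The rc-checking pattern the reviewer refers to can be sketched in user space as follows. All names here (struct pfl_rsp, get_pfl_info(), select_kw_type()) are hypothetical, the "fail" argument only simulates an error return, and defaulting to X2 for narrow keys is an assumption of this sketch rather than something the thread states:

```c
#include <assert.h>
#include <errno.h>

/* X1 == 0, as noted in the review, so a zeroed response reads as X1. */
enum key_type { KEY_X1 = 0, KEY_X2, KEY_X4 };

struct pfl_rsp {
	enum key_type kw_type;
};

/* Hypothetical accessor; 'fail' simulates a handler error. */
static int get_pfl_info(struct pfl_rsp *rsp, int fail, enum key_type cached)
{
	if (fail)
		return -EINVAL;

	rsp->kw_type = cached;
	return 0;
}

static int select_kw_type(int kw_bits, int fail, enum key_type cached,
			  enum key_type *out)
{
	struct pfl_rsp rsp = { 0 };
	int rc;

	*out = KEY_X2;		/* assumption: narrow keys stay on X2 */
	if (kw_bits <= 256)
		return 0;

	rc = get_pfl_info(&rsp, fail, cached);
	if (rc)
		return rc;	/* never interpret a zeroed rsp as a profile */

	if (rsp.kw_type == KEY_X2)
		return -EOPNOTSUPP;

	*out = KEY_X4;
	return 0;
}
```

Capturing rc costs one branch and removes the case where a failed lookup falls through to the X4 selection.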
* Re: [PATCH v18 net-next 8/8] octeontx2-af: npc: cn20k: Allocate npc_priv and dstats dynamically.
2026-06-02 6:03 ` [PATCH v18 net-next 8/8] octeontx2-af: npc: cn20k: Allocate npc_priv and dstats dynamically Ratheesh Kannoth
2026-06-03 7:03 ` Ratheesh Kannoth
@ 2026-06-04 3:21 ` Ratheesh Kannoth
1 sibling, 0 replies; 21+ messages in thread
From: Ratheesh Kannoth @ 2026-06-04 3:21 UTC (permalink / raw)
To: linux-kernel, netdev
Cc: andrew+netdev, davem, donald.hunter, edumazet, horms, jiri, kuba,
pabeni, sgoutham
On 2026-06-02 at 11:33:59, Ratheesh Kannoth (rkannoth@marvell.com) wrote:
> Replace the file-scope static npc_priv with a kcalloc'd struct filled
> from hardware bank/subbank geometry at init (num_banks is no longer a
> const compile-time constant; drop init_done and use a non-NULL
> npc_priv pointer for liveness). Thread npc_priv_get() / pointer access
> through the CN20K NPC code paths, extend teardown to kfree the root
> struct on failure and in npc_cn20k_deinit, and adjust MCAM section
> setup to use the discovered subbank count.
>
> Allocate MCAM debugfs dstats via devm_kzalloc instead of a static matrix,
> and use the allocated backing store consistently when computing deltas
> (including the counter rollover compare).
>
> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
https://netdev-ai.bots.linux.dev/sashiko/#/patchset/20260602060359.1894952-1-rkannoth%40marvell.com says
>> +++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c
>[ ... ]
>> @@ -397,6 +398,10 @@ int npc_cn20k_debugfs_init(struct rvu *rvu)
>> debugfs_create_file("vidx2idx", 0444, rvu->rvu_dbg.npc,
>> npc_priv, &npc_vidx2idx_map_fops);
>>
>> + dstats = devm_kzalloc(rvu->dev, sizeof(*dstats), GFP_KERNEL);
>> + if (!dstats)
>> + return -ENOMEM;
>> +
>> debugfs_create_file("dstats", 0444, rvu->rvu_dbg.npc, rvu,
>> &npc_mcam_dstats_fops);
>Should the dstats allocation move to the top of npc_cn20k_debugfs_init()?
>By the time the kzalloc fails, three debugfs files (mcam_layout,
>mcam_default, vidx2idx) have already been created, and on -ENOMEM the
>function returns without unwinding them. The sole caller in
>rvu_dbg_npc_init() is:
> if (is_cn20k(rvu->pdev))
> npc_cn20k_debugfs_init(rvu);
>rvu_dbg_npc_init() is static void and discards the return value, so under
>memory pressure during probe the device completes probe with a partially
>populated debugfs hierarchy and no error is logged. Pre-patch the function
>unconditionally returned 0, so this is a new failure mode introduced by
>this change.
I don't see any problem with this.
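The ordering the reviewer suggests, doing the fallible allocation before anything is published, can be illustrated with a short user-space sketch. The names struct dbg_state and dbg_init() are hypothetical, calloc() stands in for devm_kzalloc(), a counter stands in for the debugfs_create_file() calls, and "simulate_oom" exists only to exercise the failure path:

```c
#include <assert.h>
#include <stdlib.h>

struct dbg_state {
	int files_created;	/* stand-in for created debugfs files */
	unsigned long *dstats;	/* backing store for the dstats file */
};

/* Allocate first; only publish files once nothing can fail any more. */
static int dbg_init(struct dbg_state *st, size_t nstats, int simulate_oom)
{
	unsigned long *dstats;

	dstats = simulate_oom ? NULL : calloc(nstats, sizeof(*dstats));
	if (!dstats)
		return -1;	/* nothing was created, nothing to unwind */

	st->dstats = dstats;
	st->files_created = 4;	/* mcam_layout, mcam_default, vidx2idx, dstats */
	return 0;
}
```

On failure the hierarchy is simply absent instead of partially populated, so the caller does not need an unwind path.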
>> - xa_destroy(&npc_priv.xa_sb_used);
>[ ... ]
>> + if (!npc_priv)
>> + return;
>[ ... ]
>> + kfree(npc_priv->sb);
>> kfree(subbank_srch_order);
>> - bitmap_clear(npc_priv.en_map, 0, MAX_NUM_BANKS * MAX_NUM_SUB_BANKS *
>> + bitmap_clear(npc_priv->en_map, 0, MAX_NUM_BANKS * MAX_NUM_SUB_BANKS *
>> MAX_SUBBANK_DEPTH);
>> + npc_defrag_list_clear();
>> + kfree(npc_priv);
>> + npc_priv = NULL;
>> }
>Should npc_cn20k_deinit() also NULL out subbank_srch_order after
>kfree()ing it, to match the cleanup symmetry of the fail3 path in
>npc_priv_init()?
I don't see a need, as there is only one AF device per system.
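For completeness, the free-and-clear idiom the reviewer is asking about looks like this in a user-space sketch. The variable srch_order and the helpers order_init() and order_deinit() are hypothetical, and free() stands in for kfree(); both accept a NULL pointer:

```c
#include <assert.h>
#include <stdlib.h>

static unsigned int *srch_order;	/* hypothetical file-scope state */

static int order_init(size_t n)
{
	srch_order = calloc(n, sizeof(*srch_order));
	return srch_order ? 0 : -1;
}

/* Free and clear, so a repeated deinit or a later liveness check on the
 * pointer cannot touch freed memory.
 */
static void order_deinit(void)
{
	free(srch_order);
	srch_order = NULL;
}
```

Clearing the pointer makes deinit idempotent regardless of how many devices exist, which matters mainly if the module is ever unbound and rebound.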
>> - const int num_banks;
>> + int num_banks;
>> int num_subbanks;
>> int subbank_depth;
>> DECLARE_BITMAP(en_map, MAX_NUM_BANKS *
>> @@ -214,7 +214,6 @@ struct npc_priv_t {
>> struct list_head defrag_lh;
>> struct mutex lock; /* protect defrag nodes */
>> int pf_cnt;
>> - bool init_done;
>> };
>The init_done field is removed from struct npc_priv_t, but the
>kernel-doc block above the struct still carries the corresponding
>description:
> * @init_done: Indicates MCAM initialization is done.
>Should this @init_done line in the kernel-doc be removed alongside the
>field? scripts/kernel-doc and make htmldocs will warn with "Excess
>struct member init_done description in npc_priv_t".
ACK. Should I post v19 to address this?
* Re: [PATCH v18 net-next 1/8] octeontx2-af: npc: cn20k: debugfs enhancements
2026-06-04 2:19 ` Ratheesh Kannoth
@ 2026-06-04 14:45 ` Jakub Kicinski
0 siblings, 0 replies; 21+ messages in thread
From: Jakub Kicinski @ 2026-06-04 14:45 UTC (permalink / raw)
To: Ratheesh Kannoth
Cc: linux-kernel, netdev, andrew+netdev, davem, donald.hunter,
edumazet, horms, jiri, pabeni, sgoutham
On Thu, 4 Jun 2026 07:49:43 +0530 Ratheesh Kannoth wrote:
> There is only one AF device per system. So this case wont happen.
Can you explain why? I thought this was a PCIe card, not an SoC driver.