* [PATCH v10 net-next 0/6] octeontx2-af: npc: Enhancements.
@ 2026-04-03 2:55 Ratheesh Kannoth
2026-04-03 2:55 ` [PATCH v10 net-next 1/6] octeontx2-af: npc: cn20k: debugfs enhancements Ratheesh Kannoth
` (5 more replies)
0 siblings, 6 replies; 9+ messages in thread
From: Ratheesh Kannoth @ 2026-04-03 2:55 UTC (permalink / raw)
To: netdev, linux-kernel, linux-rdma
Cc: sgoutham, andrew+netdev, davem, edumazet, kuba, pabeni,
donald.hunter, horms, jiri, chuck.lever, matttbe, cjubran, saeedm,
leon, tariqt, mbloch, dtatulea, Ratheesh Kannoth
This series extends Marvell octeontx2-af support for CN20K NPC (MCAM
debuggability, allocation policy, default-rule lifetime, and optional
KPU profiles from firmware files) and adds a devlink mechanism for
multi-value parameters, with a small mlx5 follow-up to keep stack usage
within -Wframe-larger-than limits once union devlink_param_value grows.
Patch 1 improves CN20K MCAM visibility in debugfs: mcam_layout marks
enabled entries, dstats reports per-entry hit deltas, and mismatch lists
enabled entries without a PF mapping. MCAM enable state is tracked in a
bitmap updated from the CN20K enable path.
Patch 2 heap-allocates the temporary devlink param value array in
mlx5e_pcie_cong_get_thresh_config() so a larger union devlink_param_value
does not overflow the stack (patch 3).
Patch 3 (Saeed) introduces DEVLINK_PARAM_TYPE_U64_ARRAY and nested
DEVLINK_ATTR_PARAM_VALUE_DATA attributes so drivers and user space can
exchange bounded u64 arrays; devlink_nl_param_fill() moves large locals
to the heap as well.
Patch 4 adds a runtime devlink parameter npc_srch_order (U64 array) to
reorder CN20K subbank search during MCAM allocation.
Patch 5 ties default MCAM entries (broadcast, multicast, promisc, ucast)
to NIX LF alloc/free on CN20K, adds NIX_LF_DONT_FREE_DFT_IDXS for kernel
PF suspend-style teardown, and adjusts free-all and default-entry paths
so default rules are not freed as ordinary user rules.
Patch 6 allows loading a custom KPU profile from /lib/firmware/kpu via
module parameter kpu_profile on non-CN20K paths, with cam2/ptype support
and shared helpers for firmware-sourced vs filesystem-sourced profiles;
CN20K continues to use its existing custom KPU apply path.
The mlx5 change is placed immediately before the devlink union growth
so the series applies cleanly and stays warning-free when built
incrementally.
Ratheesh Kannoth (5):
octeontx2-af: npc: cn20k: debugfs enhancements
net/mlx5e: heap-allocate devlink param values
octeontx2-af: npc: cn20k: add subbank search order control
octeontx2-af: npc: cn20k: dynamically allocate and free default MCAM
entries
octeontx2-af: npc: Support for custom KPU profile from filesystem
Saeed Mahameed (1):
devlink: Implement devlink param multi attribute nested data values
Documentation/netlink/specs/devlink.yaml | 4 +
.../marvell/octeontx2/af/cn20k/debugfs.c | 126 ++++-
.../ethernet/marvell/octeontx2/af/cn20k/npc.c | 255 +++++++--
.../ethernet/marvell/octeontx2/af/cn20k/npc.h | 10 +
.../net/ethernet/marvell/octeontx2/af/mbox.h | 1 +
.../net/ethernet/marvell/octeontx2/af/npc.h | 17 +
.../net/ethernet/marvell/octeontx2/af/rvu.h | 6 +-
.../marvell/octeontx2/af/rvu_devlink.c | 92 +++-
.../ethernet/marvell/octeontx2/af/rvu_nix.c | 16 +-
.../ethernet/marvell/octeontx2/af/rvu_npc.c | 521 ++++++++++++++----
.../ethernet/marvell/octeontx2/af/rvu_npc.h | 17 +
.../ethernet/marvell/octeontx2/af/rvu_reg.h | 1 +
.../ethernet/marvell/octeontx2/nic/otx2_pf.c | 6 +-
.../mellanox/mlx5/core/en/pcie_cong_event.c | 11 +-
include/net/devlink.h | 8 +
include/uapi/linux/devlink.h | 1 +
net/devlink/netlink_gen.c | 2 +
net/devlink/param.c | 91 ++-
18 files changed, 979 insertions(+), 206 deletions(-)
--
v9 -> v10: Addressed Paolo comments
https://lore.kernel.org/netdev/20260330053105.2722453-1-rkannoth@marvell.com/
v8 -> v9: Addressed Simon comments
https://lore.kernel.org/netdev/20260325072159.1126964-1-rkannoth@marvell.com/
v7 -> v8: Addressed Simon comments
https://lore.kernel.org/netdev/20260323035110.3908741-1-rkannoth@marvell.com/T/#t
v6 -> v7: Addressed Simon comments
https://lore.kernel.org/netdev/20260320165432.98832-1-horms@kernel.org/
v5 -> v6: Addressed Jakub,Jiri comments
https://lore.kernel.org/netdev/20260317045623.250187-1-rkannoth@marvell.com/
v4 -> v5: Addressed Jakub comments
https://lore.kernel.org/netdev/20260312022754.2029595-6-rkannoth@marvell.com/
v3 -> v4: Addressed Simon comments
https://lore.kernel.org/netdev/abDeXLpMMxp7G1v3@rkannoth-OptiPlex-7090/#t
v2 -> v3: Addressed Simon comments.
https://lore.kernel.org/netdev/20260304043032.3661647-1-rkannoth@marvell.com/
v1 -> v2: Addressed Jakub comments.
https://lore.kernel.org/netdev/20260302085803.2449828-1-rkannoth@marvell.com/#t
2.43.0
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v10 net-next 1/6] octeontx2-af: npc: cn20k: debugfs enhancements
2026-04-03 2:55 [PATCH v10 net-next 0/6] octeontx2-af: npc: Enhancements Ratheesh Kannoth
@ 2026-04-03 2:55 ` Ratheesh Kannoth
2026-04-03 2:55 ` [PATCH v10 net-next 2/6] net/mlx5e: heap-allocate devlink param values Ratheesh Kannoth
` (4 subsequent siblings)
5 siblings, 0 replies; 9+ messages in thread
From: Ratheesh Kannoth @ 2026-04-03 2:55 UTC (permalink / raw)
To: netdev, linux-kernel, linux-rdma
Cc: sgoutham, andrew+netdev, davem, edumazet, kuba, pabeni,
donald.hunter, horms, jiri, chuck.lever, matttbe, cjubran, saeedm,
leon, tariqt, mbloch, dtatulea, Ratheesh Kannoth
Improve MCAM visibility and field debugging for CN20K NPC.
- Extend "mcam_layout" to show enabled (+) or disabled state per entry
so status can be verified without parsing the full "mcam_entry" dump.
- Add "dstats" debugfs entry: reports recently hit MCAM indices with
packet counts; stats are cleared on read so each read shows deltas.
- Add "mismatch" debugfs entry: lists MCAM entries that are enabled
but not explicitly allocated, helping diagnose allocation/field issues.
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
---
.../marvell/octeontx2/af/cn20k/debugfs.c | 126 +++++++++++++++++-
.../ethernet/marvell/octeontx2/af/cn20k/npc.c | 16 ++-
.../ethernet/marvell/octeontx2/af/cn20k/npc.h | 7 +
3 files changed, 135 insertions(+), 14 deletions(-)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c
index 3debf2fae1a4..e8f85ed5ead7 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/debugfs.c
@@ -13,6 +13,7 @@
#include "struct.h"
#include "rvu.h"
#include "debugfs.h"
+#include "cn20k/reg.h"
#include "cn20k/npc.h"
static int npc_mcam_layout_show(struct seq_file *s, void *unused)
@@ -58,7 +59,8 @@ static int npc_mcam_layout_show(struct seq_file *s, void *unused)
"v:%u", vidx0);
}
- seq_printf(s, "\t%u(%#x) %s\n", idx0, pf1,
+ seq_printf(s, "\t%u(%#x)%c %s\n", idx0, pf1,
+ test_bit(idx0, npc_priv->en_map) ? '+' : ' ',
map ? buf0 : " ");
}
goto next;
@@ -101,9 +103,13 @@ static int npc_mcam_layout_show(struct seq_file *s, void *unused)
vidx1);
}
- seq_printf(s, "%05u(%#x) %s\t\t%05u(%#x) %s\n",
- idx1, pf2, v1 ? buf1 : " ",
- idx0, pf1, v0 ? buf0 : " ");
+ seq_printf(s, "%05u(%#x)%c %s\t\t%05u(%#x)%c %s\n",
+ idx1, pf2,
+ test_bit(idx1, npc_priv->en_map) ? '+' : ' ',
+ v1 ? buf1 : " ",
+ idx0, pf1,
+ test_bit(idx0, npc_priv->en_map) ? '+' : ' ',
+ v0 ? buf0 : " ");
continue;
}
@@ -120,8 +126,9 @@ static int npc_mcam_layout_show(struct seq_file *s, void *unused)
vidx0);
}
- seq_printf(s, "\t\t \t\t%05u(%#x) %s\n", idx0,
- pf1, map ? buf0 : " ");
+ seq_printf(s, "\t\t \t\t%05u(%#x)%c %s\n", idx0, pf1,
+ test_bit(idx0, npc_priv->en_map) ? '+' : ' ',
+ map ? buf0 : " ");
continue;
}
@@ -134,7 +141,8 @@ static int npc_mcam_layout_show(struct seq_file *s, void *unused)
snprintf(buf1, sizeof(buf1), "v:%05u", vidx1);
}
- seq_printf(s, "%05u(%#x) %s\n", idx1, pf1,
+ seq_printf(s, "%05u(%#x)%c %s\n", idx1, pf1,
+ test_bit(idx1, npc_priv->en_map) ? '+' : ' ',
map ? buf1 : " ");
}
next:
@@ -145,6 +153,100 @@ static int npc_mcam_layout_show(struct seq_file *s, void *unused)
DEFINE_SHOW_ATTRIBUTE(npc_mcam_layout);
+static u64 dstats[MAX_NUM_BANKS][MAX_SUBBANK_DEPTH * MAX_NUM_SUB_BANKS] = {};
+static int npc_mcam_dstats_show(struct seq_file *s, void *unused)
+{
+ struct npc_priv_t *npc_priv;
+ int blkaddr, pf, mcam_idx;
+ u64 stats, delta;
+ struct rvu *rvu;
+ u8 key_type;
+ void *map;
+
+ npc_priv = npc_priv_get();
+ rvu = s->private;
+ blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPC, 0);
+ if (blkaddr < 0)
+ return 0;
+
+ seq_puts(s, "idx\tpfunc\tstats\n");
+ for (int bank = npc_priv->num_banks - 1; bank >= 0; bank--) {
+ for (int idx = npc_priv->bank_depth - 1; idx >= 0; idx--) {
+ mcam_idx = bank * npc_priv->bank_depth + idx;
+
+ npc_mcam_idx_2_key_type(rvu, mcam_idx, &key_type);
+ if (key_type == NPC_MCAM_KEY_X4 && bank != 0)
+ continue;
+
+ if (!test_bit(mcam_idx, npc_priv->en_map))
+ continue;
+
+ stats = rvu_read64(rvu, blkaddr,
+ NPC_AF_CN20K_MCAMEX_BANKX_STAT_EXT(idx, bank));
+ if (!stats)
+ continue;
+ if (stats == dstats[bank][idx])
+ continue;
+
+ if (stats < dstats[bank][idx])
+ dstats[bank][idx] = 0;
+
+ pf = 0xFFFF;
+ map = xa_load(&npc_priv->xa_idx2pf_map, mcam_idx);
+ if (map)
+ pf = xa_to_value(map);
+
+ if (stats > dstats[bank][idx])
+ delta = stats - dstats[bank][idx];
+ else
+ delta = stats;
+
+ seq_printf(s, "%u\t%#04x\t%llu\n",
+ mcam_idx, pf, delta);
+ dstats[bank][idx] = stats;
+ }
+ }
+ return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(npc_mcam_dstats);
+
+static int npc_mcam_mismatch_show(struct seq_file *s, void *unused)
+{
+ struct npc_priv_t *npc_priv;
+ struct npc_subbank *sb;
+ int mcam_idx, sb_off;
+ struct rvu *rvu;
+ void *map;
+ int rc;
+
+ npc_priv = npc_priv_get();
+ rvu = s->private;
+
+ seq_puts(s, "index\tsb idx\tkw type\n");
+ for (int bank = npc_priv->num_banks - 1; bank >= 0; bank--) {
+ for (int idx = npc_priv->bank_depth - 1; idx >= 0; idx--) {
+ mcam_idx = bank * npc_priv->bank_depth + idx;
+
+ if (!test_bit(mcam_idx, npc_priv->en_map))
+ continue;
+
+ map = xa_load(&npc_priv->xa_idx2pf_map, mcam_idx);
+ if (map)
+ continue;
+
+ rc = npc_mcam_idx_2_subbank_idx(rvu, mcam_idx,
+ &sb, &sb_off);
+ if (rc)
+ continue;
+
+ seq_printf(s, "%u\t%d\t%u\n", mcam_idx, sb->idx,
+ sb->key_type);
+ }
+ }
+ return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(npc_mcam_mismatch);
+
static int npc_mcam_default_show(struct seq_file *s, void *unused)
{
struct npc_priv_t *npc_priv;
@@ -257,6 +359,16 @@ int npc_cn20k_debugfs_init(struct rvu *rvu)
if (!npc_dentry)
return -EFAULT;
+ npc_dentry = debugfs_create_file("dstats", 0444, rvu->rvu_dbg.npc, rvu,
+ &npc_mcam_dstats_fops);
+ if (!npc_dentry)
+ return -EFAULT;
+
+ npc_dentry = debugfs_create_file("mismatch", 0444, rvu->rvu_dbg.npc, rvu,
+ &npc_mcam_mismatch_fops);
+ if (!npc_dentry)
+ return -EFAULT;
+
npc_dentry = debugfs_create_file("mcam_default", 0444, rvu->rvu_dbg.npc,
rvu, &npc_mcam_default_fops);
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
index 7291fdb89b03..e854b85ced9e 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
@@ -808,6 +808,9 @@ npc_cn20k_enable_mcam_entry(struct rvu *rvu, int blkaddr,
u64 cfg, hw_prio;
u8 kw_type;
+ enable ? set_bit(index, npc_priv.en_map) :
+ clear_bit(index, npc_priv.en_map);
+
npc_mcam_idx_2_key_type(rvu, index, &kw_type);
if (kw_type == NPC_MCAM_KEY_X2) {
cfg = rvu_read64(rvu, blkaddr,
@@ -1016,14 +1019,12 @@ static void npc_cn20k_config_kw_x4(struct rvu *rvu, struct npc_mcam *mcam,
static void
npc_cn20k_set_mcam_bank_cfg(struct rvu *rvu, int blkaddr, int mcam_idx,
- int bank, u8 kw_type, bool enable, u8 hw_prio)
+ int bank, u8 kw_type, u8 hw_prio)
{
struct npc_mcam *mcam = &rvu->hw->mcam;
u64 bank_cfg;
bank_cfg = (u64)hw_prio << 24;
- if (enable)
- bank_cfg |= 0x1;
if (kw_type == NPC_MCAM_KEY_X2) {
rvu_write64(rvu, blkaddr,
@@ -1119,7 +1120,8 @@ void npc_cn20k_config_mcam_entry(struct rvu *rvu, int blkaddr, int index,
/* TODO: */
/* PF installing VF rule */
npc_cn20k_set_mcam_bank_cfg(rvu, blkaddr, mcam_idx, bank,
- kw_type, enable, hw_prio);
+ kw_type, hw_prio);
+ npc_cn20k_enable_mcam_entry(rvu, blkaddr, index, enable);
}
void npc_cn20k_copy_mcam_entry(struct rvu *rvu, int blkaddr, u16 src, u16 dest)
@@ -1735,9 +1737,9 @@ static int npc_subbank_idx_2_mcam_idx(struct rvu *rvu, struct npc_subbank *sb,
return 0;
}
-static int npc_mcam_idx_2_subbank_idx(struct rvu *rvu, u16 mcam_idx,
- struct npc_subbank **sb,
- int *sb_off)
+int npc_mcam_idx_2_subbank_idx(struct rvu *rvu, u16 mcam_idx,
+ struct npc_subbank **sb,
+ int *sb_off)
{
int bank_off, sb_id;
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
index 815d0b257a7e..004a556c7b90 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
@@ -170,6 +170,7 @@ struct npc_defrag_show_node {
* @num_banks: Number of banks.
* @num_subbanks: Number of subbanks.
* @subbank_depth: Depth of subbank.
+ * @en_map: Enable/disable status.
* @kw: Kex configured key type.
* @sb: Subbank array.
* @xa_sb_used: Array of used subbanks.
@@ -193,6 +194,9 @@ struct npc_priv_t {
const int num_banks;
int num_subbanks;
int subbank_depth;
+ DECLARE_BITMAP(en_map, MAX_NUM_BANKS *
+ MAX_NUM_SUB_BANKS *
+ MAX_SUBBANK_DEPTH);
u8 kw;
struct npc_subbank *sb;
struct xarray xa_sb_used;
@@ -336,5 +340,8 @@ int npc_mcam_idx_2_key_type(struct rvu *rvu, u16 mcam_idx, u8 *key_type);
u16 npc_cn20k_vidx2idx(u16 index);
u16 npc_cn20k_idx2vidx(u16 idx);
int npc_cn20k_defrag(struct rvu *rvu);
+int npc_mcam_idx_2_subbank_idx(struct rvu *rvu, u16 mcam_idx,
+ struct npc_subbank **sb,
+ int *sb_off);
#endif /* NPC_CN20K_H */
--
2.43.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v10 net-next 2/6] net/mlx5e: heap-allocate devlink param values
2026-04-03 2:55 [PATCH v10 net-next 0/6] octeontx2-af: npc: Enhancements Ratheesh Kannoth
2026-04-03 2:55 ` [PATCH v10 net-next 1/6] octeontx2-af: npc: cn20k: debugfs enhancements Ratheesh Kannoth
@ 2026-04-03 2:55 ` Ratheesh Kannoth
2026-04-03 2:55 ` [PATCH v10 net-next 3/6] devlink: Implement devlink param multi attribute nested data values Ratheesh Kannoth
` (3 subsequent siblings)
5 siblings, 0 replies; 9+ messages in thread
From: Ratheesh Kannoth @ 2026-04-03 2:55 UTC (permalink / raw)
To: netdev, linux-kernel, linux-rdma
Cc: sgoutham, andrew+netdev, davem, edumazet, kuba, pabeni,
donald.hunter, horms, jiri, chuck.lever, matttbe, cjubran, saeedm,
leon, tariqt, mbloch, dtatulea, Ratheesh Kannoth
union devlink_param_value grows when U64 array params
are added to devlink. Keeping a four-element array of that
union on the stack in mlx5e_pcie_cong_get_thresh_config()
then trips -Wframe-larger-than=1280. Allocate the temporary
values with kcalloc() and free them on success and
error paths.
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
---
.../ethernet/mellanox/mlx5/core/en/pcie_cong_event.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/pcie_cong_event.c b/drivers/net/ethernet/mellanox/mlx5/core/en/pcie_cong_event.c
index 2eb666a46f39..f02995552129 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/pcie_cong_event.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/pcie_cong_event.c
@@ -259,15 +259,21 @@ mlx5e_pcie_cong_get_thresh_config(struct mlx5_core_dev *dev,
MLX5_DEVLINK_PARAM_ID_PCIE_CONG_OUT_HIGH,
};
struct devlink *devlink = priv_to_devlink(dev);
- union devlink_param_value val[4];
+ union devlink_param_value *val;
+
+ val = kcalloc(4, sizeof(*val), GFP_KERNEL);
+ if (!val)
+ return -ENOMEM;
for (int i = 0; i < 4; i++) {
u32 id = ids[i];
int err;
err = devl_param_driverinit_value_get(devlink, id, &val[i]);
- if (err)
+ if (err) {
+ kfree(val);
return err;
+ }
}
config->inbound_low = val[0].vu16;
@@ -275,6 +281,7 @@ mlx5e_pcie_cong_get_thresh_config(struct mlx5_core_dev *dev,
config->outbound_low = val[2].vu16;
config->outbound_high = val[3].vu16;
+ kfree(val);
return 0;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v10 net-next 3/6] devlink: Implement devlink param multi attribute nested data values
2026-04-03 2:55 [PATCH v10 net-next 0/6] octeontx2-af: npc: Enhancements Ratheesh Kannoth
2026-04-03 2:55 ` [PATCH v10 net-next 1/6] octeontx2-af: npc: cn20k: debugfs enhancements Ratheesh Kannoth
2026-04-03 2:55 ` [PATCH v10 net-next 2/6] net/mlx5e: heap-allocate devlink param values Ratheesh Kannoth
@ 2026-04-03 2:55 ` Ratheesh Kannoth
2026-04-07 9:58 ` Paolo Abeni
2026-04-03 2:55 ` [PATCH v10 net-next 4/6] octeontx2-af: npc: cn20k: add subbank search order control Ratheesh Kannoth
` (2 subsequent siblings)
5 siblings, 1 reply; 9+ messages in thread
From: Ratheesh Kannoth @ 2026-04-03 2:55 UTC (permalink / raw)
To: netdev, linux-kernel, linux-rdma
Cc: sgoutham, andrew+netdev, davem, edumazet, kuba, pabeni,
donald.hunter, horms, jiri, chuck.lever, matttbe, cjubran, saeedm,
leon, tariqt, mbloch, dtatulea, Ratheesh Kannoth
From: Saeed Mahameed <saeedm@nvidia.com>
Devlink param value attribute is not defined since devlink is handling
the value validating and parsing internally, this allows us to implement
multi attribute values without breaking any policies.
Devlink param multi-attribute values are considered to be dynamically
sized arrays of u64 values, by introducing a new devlink param type
DEVLINK_PARAM_TYPE_U64_ARRAY, driver and user space can set a variable
count of u32 values into the DEVLINK_ATTR_PARAM_VALUE_DATA attribute.
Implement get/set parsing and add to the internal value structure passed
to drivers.
This is useful for devices that need to configure a list of values for
a specific configuration.
example:
$ devlink dev param show pci/... name multi-value-param
name multi-value-param type driver-specific
values:
cmode permanent value: 0,1,2,3,4,5,6,7
$ devlink dev param set pci/... name multi-value-param \
value 4,5,6,7,0,1,2,3 cmode permanent
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
---
Documentation/netlink/specs/devlink.yaml | 4 ++
include/net/devlink.h | 8 +++
include/uapi/linux/devlink.h | 1 +
net/devlink/netlink_gen.c | 2 +
net/devlink/param.c | 91 +++++++++++++++++++-----
5 files changed, 89 insertions(+), 17 deletions(-)
diff --git a/Documentation/netlink/specs/devlink.yaml b/Documentation/netlink/specs/devlink.yaml
index b495d56b9137..b619de4fe08a 100644
--- a/Documentation/netlink/specs/devlink.yaml
+++ b/Documentation/netlink/specs/devlink.yaml
@@ -226,6 +226,10 @@ definitions:
value: 10
-
name: binary
+ -
+ name: u64-array
+ value: 129
+
-
name: rate-tc-index-max
type: const
diff --git a/include/net/devlink.h b/include/net/devlink.h
index 3038af6ec017..3a355fea8189 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -432,6 +432,13 @@ enum devlink_param_type {
DEVLINK_PARAM_TYPE_U64 = DEVLINK_VAR_ATTR_TYPE_U64,
DEVLINK_PARAM_TYPE_STRING = DEVLINK_VAR_ATTR_TYPE_STRING,
DEVLINK_PARAM_TYPE_BOOL = DEVLINK_VAR_ATTR_TYPE_FLAG,
+ DEVLINK_PARAM_TYPE_U64_ARRAY = DEVLINK_VAR_ATTR_TYPE_U64_ARRAY,
+};
+
+#define __DEVLINK_PARAM_MAX_ARRAY_SIZE 32
+struct devlink_param_u64_array {
+ u64 size;
+ u64 val[__DEVLINK_PARAM_MAX_ARRAY_SIZE];
};
union devlink_param_value {
@@ -441,6 +448,7 @@ union devlink_param_value {
u64 vu64;
char vstr[__DEVLINK_PARAM_MAX_STRING_VALUE];
bool vbool;
+ struct devlink_param_u64_array u64arr;
};
struct devlink_param_gset_ctx {
diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index 7de2d8cc862f..5332223dd6d0 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -406,6 +406,7 @@ enum devlink_var_attr_type {
DEVLINK_VAR_ATTR_TYPE_BINARY,
__DEVLINK_VAR_ATTR_TYPE_CUSTOM_BASE = 0x80,
/* Any possible custom types, unrelated to NLA_* values go below */
+ DEVLINK_VAR_ATTR_TYPE_U64_ARRAY,
};
enum devlink_attr {
diff --git a/net/devlink/netlink_gen.c b/net/devlink/netlink_gen.c
index eb35e80e01d1..7aaf462f27ee 100644
--- a/net/devlink/netlink_gen.c
+++ b/net/devlink/netlink_gen.c
@@ -37,6 +37,8 @@ devlink_attr_param_type_validate(const struct nlattr *attr,
case DEVLINK_VAR_ATTR_TYPE_NUL_STRING:
fallthrough;
case DEVLINK_VAR_ATTR_TYPE_BINARY:
+ fallthrough;
+ case DEVLINK_VAR_ATTR_TYPE_U64_ARRAY:
return 0;
}
NL_SET_ERR_MSG_ATTR(extack, attr, "invalid enum value");
diff --git a/net/devlink/param.c b/net/devlink/param.c
index cf95268da5b0..2ec85dffd8ac 100644
--- a/net/devlink/param.c
+++ b/net/devlink/param.c
@@ -252,6 +252,14 @@ devlink_nl_param_value_put(struct sk_buff *msg, enum devlink_param_type type,
return -EMSGSIZE;
}
break;
+ case DEVLINK_PARAM_TYPE_U64_ARRAY:
+ if (val.u64arr.size > __DEVLINK_PARAM_MAX_ARRAY_SIZE)
+ return -EMSGSIZE;
+
+ for (int i = 0; i < val.u64arr.size; i++)
+ if (nla_put_uint(msg, nla_type, val.u64arr.val[i]))
+ return -EMSGSIZE;
+ break;
}
return 0;
}
@@ -304,56 +312,78 @@ static int devlink_nl_param_fill(struct sk_buff *msg, struct devlink *devlink,
u32 portid, u32 seq, int flags,
struct netlink_ext_ack *extack)
{
- union devlink_param_value default_value[DEVLINK_PARAM_CMODE_MAX + 1];
- union devlink_param_value param_value[DEVLINK_PARAM_CMODE_MAX + 1];
bool default_value_set[DEVLINK_PARAM_CMODE_MAX + 1] = {};
bool param_value_set[DEVLINK_PARAM_CMODE_MAX + 1] = {};
const struct devlink_param *param = param_item->param;
- struct devlink_param_gset_ctx ctx;
+ union devlink_param_value *default_value;
+ union devlink_param_value *param_value;
+ struct devlink_param_gset_ctx *ctx;
struct nlattr *param_values_list;
struct nlattr *param_attr;
void *hdr;
int err;
int i;
+ default_value = kcalloc(DEVLINK_PARAM_CMODE_MAX + 1,
+ sizeof(*default_value), GFP_KERNEL);
+ if (!default_value)
+ return -ENOMEM;
+
+ param_value = kcalloc(DEVLINK_PARAM_CMODE_MAX + 1,
+ sizeof(*param_value), GFP_KERNEL);
+ if (!param_value) {
+ kfree(default_value);
+ return -ENOMEM;
+ }
+
+ ctx = kmalloc_obj(*ctx);
+ if (!ctx) {
+ kfree(param_value);
+ kfree(default_value);
+ return -ENOMEM;
+ }
+
/* Get value from driver part to driverinit configuration mode */
for (i = 0; i <= DEVLINK_PARAM_CMODE_MAX; i++) {
if (!devlink_param_cmode_is_supported(param, i))
continue;
if (i == DEVLINK_PARAM_CMODE_DRIVERINIT) {
- if (param_item->driverinit_value_new_valid)
+ if (param_item->driverinit_value_new_valid) {
param_value[i] = param_item->driverinit_value_new;
- else if (param_item->driverinit_value_valid)
+ } else if (param_item->driverinit_value_valid) {
param_value[i] = param_item->driverinit_value;
- else
- return -EOPNOTSUPP;
+ } else {
+ err = -EOPNOTSUPP;
+ goto get_put_fail;
+ }
if (param_item->driverinit_value_valid) {
default_value[i] = param_item->driverinit_default;
default_value_set[i] = true;
}
} else {
- ctx.cmode = i;
- err = devlink_param_get(devlink, param, &ctx, extack);
+ ctx->cmode = i;
+ err = devlink_param_get(devlink, param, ctx, extack);
if (err)
- return err;
- param_value[i] = ctx.val;
+ goto get_put_fail;
+ param_value[i] = ctx->val;
- err = devlink_param_get_default(devlink, param, &ctx,
+ err = devlink_param_get_default(devlink, param, ctx,
extack);
if (!err) {
- default_value[i] = ctx.val;
+ default_value[i] = ctx->val;
default_value_set[i] = true;
} else if (err != -EOPNOTSUPP) {
- return err;
+ goto get_put_fail;
}
}
param_value_set[i] = true;
}
+ err = -EMSGSIZE;
hdr = genlmsg_put(msg, portid, seq, &devlink_nl_family, flags, cmd);
if (!hdr)
- return -EMSGSIZE;
+ goto get_put_fail;
if (devlink_nl_put_handle(msg, devlink))
goto genlmsg_cancel;
@@ -393,6 +423,9 @@ static int devlink_nl_param_fill(struct sk_buff *msg, struct devlink *devlink,
nla_nest_end(msg, param_values_list);
nla_nest_end(msg, param_attr);
genlmsg_end(msg, hdr);
+ kfree(default_value);
+ kfree(param_value);
+ kfree(ctx);
return 0;
values_list_nest_cancel:
@@ -401,7 +434,11 @@ static int devlink_nl_param_fill(struct sk_buff *msg, struct devlink *devlink,
nla_nest_cancel(msg, param_attr);
genlmsg_cancel:
genlmsg_cancel(msg, hdr);
- return -EMSGSIZE;
+get_put_fail:
+ kfree(default_value);
+ kfree(param_value);
+ kfree(ctx);
+ return err;
}
static void devlink_param_notify(struct devlink *devlink,
@@ -507,7 +544,7 @@ devlink_param_value_get_from_info(const struct devlink_param *param,
union devlink_param_value *value)
{
struct nlattr *param_data;
- int len;
+ int len, cnt, rem;
param_data = info->attrs[DEVLINK_ATTR_PARAM_VALUE_DATA];
@@ -547,6 +584,26 @@ devlink_param_value_get_from_info(const struct devlink_param *param,
return -EINVAL;
value->vbool = nla_get_flag(param_data);
break;
+
+ case DEVLINK_PARAM_TYPE_U64_ARRAY:
+ cnt = 0;
+ nla_for_each_attr_type(param_data,
+ DEVLINK_ATTR_PARAM_VALUE_DATA,
+ genlmsg_data(info->genlhdr),
+ genlmsg_len(info->genlhdr), rem) {
+ if (cnt >= __DEVLINK_PARAM_MAX_ARRAY_SIZE)
+ return -EMSGSIZE;
+
+ if ((nla_len(param_data) != sizeof(u64)) &&
+ (nla_len(param_data) != sizeof(u32)))
+ return -EINVAL;
+
+ value->u64arr.val[cnt] = (u64)nla_get_uint(param_data);
+ cnt++;
+ }
+
+ value->u64arr.size = cnt;
+ break;
}
return 0;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v10 net-next 4/6] octeontx2-af: npc: cn20k: add subbank search order control
2026-04-03 2:55 [PATCH v10 net-next 0/6] octeontx2-af: npc: Enhancements Ratheesh Kannoth
` (2 preceding siblings ...)
2026-04-03 2:55 ` [PATCH v10 net-next 3/6] devlink: Implement devlink param multi attribute nested data values Ratheesh Kannoth
@ 2026-04-03 2:55 ` Ratheesh Kannoth
2026-04-03 2:55 ` [PATCH v10 net-next 5/6] octeontx2-af: npc: cn20k: dynamically allocate and free default MCAM entries Ratheesh Kannoth
2026-04-03 2:55 ` [PATCH v10 net-next 6/6] octeontx2-af: npc: Support for custom KPU profile from filesystem Ratheesh Kannoth
5 siblings, 0 replies; 9+ messages in thread
From: Ratheesh Kannoth @ 2026-04-03 2:55 UTC (permalink / raw)
To: netdev, linux-kernel, linux-rdma
Cc: sgoutham, andrew+netdev, davem, edumazet, kuba, pabeni,
donald.hunter, horms, jiri, chuck.lever, matttbe, cjubran, saeedm,
leon, tariqt, mbloch, dtatulea, Ratheesh Kannoth
CN20K NPC MCAM is split into 32 subbanks that are searched in a
predefined order during allocation. Lower-numbered subbanks have
higher priority than higher-numbered ones.
Add a runtime devlink parameter "srch_order" (
DEVLINK_PARAM_TYPE_U32_ARRAY) to control the order in which
subbanks are searched during MCAM allocation.
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
---
.../ethernet/marvell/octeontx2/af/cn20k/npc.c | 91 +++++++++++++++++-
.../ethernet/marvell/octeontx2/af/cn20k/npc.h | 2 +
.../marvell/octeontx2/af/rvu_devlink.c | 92 +++++++++++++++++--
3 files changed, 173 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
index e854b85ced9e..153765b3e504 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
@@ -3317,7 +3317,7 @@ rvu_mbox_handler_npc_cn20k_get_kex_cfg(struct rvu *rvu,
return 0;
}
-static int *subbank_srch_order;
+static u32 *subbank_srch_order;
static void npc_populate_restricted_idxs(int num_subbanks)
{
@@ -3329,7 +3329,7 @@ static int npc_create_srch_order(int cnt)
{
int val = 0;
- subbank_srch_order = kcalloc(cnt, sizeof(int),
+ subbank_srch_order = kcalloc(cnt, sizeof(u32),
GFP_KERNEL);
if (!subbank_srch_order)
return -ENOMEM;
@@ -3809,6 +3809,93 @@ static void npc_unlock_all_subbank(void)
mutex_unlock(&npc_priv.sb[i].lock);
}
+int npc_cn20k_search_order_set(struct rvu *rvu,
+ u64 arr[MAX_NUM_SUB_BANKS], int cnt)
+{
+ struct npc_mcam *mcam = &rvu->hw->mcam;
+ u32 fslots[MAX_NUM_SUB_BANKS][2];
+ u32 uslots[MAX_NUM_SUB_BANKS][2];
+ int fcnt = 0, ucnt = 0;
+ struct npc_subbank *sb;
+ int idx, val, rc = 0;
+
+ unsigned long index;
+ void *v;
+
+ if (cnt != npc_priv.num_subbanks) {
+ dev_err(rvu->dev, "Number of entries(%u) != %u\n",
+ cnt, npc_priv.num_subbanks);
+ return -EINVAL;
+ }
+
+ mutex_lock(&mcam->lock);
+ npc_lock_all_subbank();
+ restrict_valid = false;
+
+ for (int i = 0; i < cnt; i++)
+ subbank_srch_order[i] = (u32)arr[i];
+
+ xa_for_each(&npc_priv.xa_sb_used, index, v) {
+ val = xa_to_value(v);
+ uslots[ucnt][0] = index;
+ uslots[ucnt][1] = val;
+ xa_erase(&npc_priv.xa_sb_used, index);
+ ucnt++;
+ }
+
+ xa_for_each(&npc_priv.xa_sb_free, index, v) {
+ val = xa_to_value(v);
+ fslots[fcnt][0] = index;
+ fslots[fcnt][1] = val;
+ xa_erase(&npc_priv.xa_sb_free, index);
+ fcnt++;
+ }
+
+ /* xa_store() is done under lock. If xa_store fails
+ * ,no rollback is planned as it might also fail.
+ */
+ for (int i = 0; i < ucnt; i++) {
+ idx = uslots[i][1];
+ sb = &npc_priv.sb[idx];
+ sb->arr_idx = subbank_srch_order[sb->idx];
+ rc = xa_err(xa_store(&npc_priv.xa_sb_used, sb->arr_idx,
+ xa_mk_value(sb->idx), GFP_KERNEL));
+ if (rc) {
+ dev_err(rvu->dev,
+ "Error to insert index to used list %u\n",
+ sb->idx);
+ goto fail_used;
+ }
+ }
+
+ for (int i = 0; i < fcnt; i++) {
+ idx = fslots[i][1];
+ sb = &npc_priv.sb[idx];
+ sb->arr_idx = subbank_srch_order[sb->idx];
+ rc = xa_err(xa_store(&npc_priv.xa_sb_free, sb->arr_idx,
+ xa_mk_value(sb->idx), GFP_KERNEL));
+ if (rc) {
+ dev_err(rvu->dev,
+ "Error to insert index to free list %u\n",
+ sb->idx);
+ goto fail_used;
+ }
+ }
+
+fail_used:
+ npc_unlock_all_subbank();
+ mutex_unlock(&mcam->lock);
+
+ return rc;
+}
+
+const u32 *npc_cn20k_search_order_get(bool *restricted_order, u32 *sz)
+{
+ *restricted_order = restrict_valid;
+ *sz = npc_priv.num_subbanks;
+ return subbank_srch_order;
+}
+
/* Only non-ref non-contigous mcam indexes
* are picked for defrag process
*/
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
index 004a556c7b90..6f9f796940f3 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
@@ -343,5 +343,7 @@ int npc_cn20k_defrag(struct rvu *rvu);
int npc_mcam_idx_2_subbank_idx(struct rvu *rvu, u16 mcam_idx,
struct npc_subbank **sb,
int *sb_off);
+const u32 *npc_cn20k_search_order_get(bool *restricted_order, u32 *sz);
+int npc_cn20k_search_order_set(struct rvu *rvu, u64 arr[32], int cnt);
#endif /* NPC_CN20K_H */
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
index 6494a9ee2f0d..0e8e33c836c9 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
@@ -1258,6 +1258,7 @@ enum rvu_af_dl_param_id {
RVU_AF_DEVLINK_PARAM_ID_NPC_EXACT_FEATURE_DISABLE,
RVU_AF_DEVLINK_PARAM_ID_NPC_DEF_RULE_CNTR_ENABLE,
RVU_AF_DEVLINK_PARAM_ID_NPC_DEFRAG,
+ RVU_AF_DEVLINK_PARAM_ID_NPC_SRCH_ORDER,
RVU_AF_DEVLINK_PARAM_ID_NIX_MAXLF,
};
@@ -1619,12 +1620,83 @@ static int rvu_devlink_eswitch_mode_set(struct devlink *devlink, u16 mode,
return 0;
}
+static int rvu_af_dl_npc_srch_order_set(struct devlink *devlink, u32 id,
+ struct devlink_param_gset_ctx *ctx,
+ struct netlink_ext_ack *extack)
+{
+ struct rvu_devlink *rvu_dl = devlink_priv(devlink);
+ struct rvu *rvu = rvu_dl->rvu;
+
+ return npc_cn20k_search_order_set(rvu,
+ ctx->val.u64arr.val,
+ ctx->val.u64arr.size);
+}
+
+static int rvu_af_dl_npc_srch_order_get(struct devlink *devlink, u32 id,
+ struct devlink_param_gset_ctx *ctx,
+ struct netlink_ext_ack *extack)
+{
+ bool restricted_order;
+ const u32 *order;
+ u32 sz;
+
+ order = npc_cn20k_search_order_get(&restricted_order, &sz);
+ ctx->val.u64arr.size = sz;
+ for (int i = 0; i < sz; i++)
+ ctx->val.u64arr.val[i] = order[i];
+
+ return 0;
+}
+
+static int rvu_af_dl_npc_srch_order_validate(struct devlink *devlink, u32 id,
+ union devlink_param_value val,
+ struct netlink_ext_ack *extack)
+{
+ struct rvu_devlink *rvu_dl = devlink_priv(devlink);
+ struct rvu *rvu = rvu_dl->rvu;
+ bool restricted_order;
+ unsigned long w = 0;
+ u64 *arr;
+ u32 sz;
+
+ npc_cn20k_search_order_get(&restricted_order, &sz);
+ if (sz != val.u64arr.size) {
+ dev_err(rvu->dev,
+ "Wrong size %llu, should be %u\n",
+ val.u64arr.size, sz);
+ return -EINVAL;
+ }
+
+ arr = val.u64arr.val;
+ for (int i = 0; i < sz; i++) {
+ if (arr[i] >= sz)
+ return -EINVAL;
+
+ w |= BIT_ULL(arr[i]);
+ }
+
+ if (bitmap_weight(&w, sz) != sz) {
+ dev_err(rvu->dev,
+ "Duplicate or out-of-range subbank index. %lu\n",
+ find_first_zero_bit(&w, sz));
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
static const struct devlink_ops rvu_devlink_ops = {
.eswitch_mode_get = rvu_devlink_eswitch_mode_get,
.eswitch_mode_set = rvu_devlink_eswitch_mode_set,
};
-static const struct devlink_param rvu_af_dl_param_defrag[] = {
+static const struct devlink_param rvu_af_dl_cn20k_params[] = {
+ DEVLINK_PARAM_DRIVER(RVU_AF_DEVLINK_PARAM_ID_NPC_SRCH_ORDER,
+ "npc_srch_order", DEVLINK_PARAM_TYPE_U64_ARRAY,
+ BIT(DEVLINK_PARAM_CMODE_RUNTIME),
+ rvu_af_dl_npc_srch_order_get,
+ rvu_af_dl_npc_srch_order_set,
+ rvu_af_dl_npc_srch_order_validate),
DEVLINK_PARAM_DRIVER(RVU_AF_DEVLINK_PARAM_ID_NPC_DEFRAG,
"npc_defrag", DEVLINK_PARAM_TYPE_STRING,
BIT(DEVLINK_PARAM_CMODE_RUNTIME),
@@ -1666,13 +1738,13 @@ int rvu_register_dl(struct rvu *rvu)
}
if (is_cn20k(rvu->pdev)) {
- err = devlink_params_register(dl, rvu_af_dl_param_defrag,
- ARRAY_SIZE(rvu_af_dl_param_defrag));
+ err = devlink_params_register(dl, rvu_af_dl_cn20k_params,
+ ARRAY_SIZE(rvu_af_dl_cn20k_params));
if (err) {
dev_err(rvu->dev,
- "devlink defrag params register failed with error %d",
+ "devlink cn20k params register failed with error %d",
err);
- goto err_dl_defrag;
+ goto err_dl_cn20k_params;
}
}
@@ -1695,10 +1767,10 @@ int rvu_register_dl(struct rvu *rvu)
err_dl_exact_match:
if (is_cn20k(rvu->pdev))
- devlink_params_unregister(dl, rvu_af_dl_param_defrag,
- ARRAY_SIZE(rvu_af_dl_param_defrag));
+ devlink_params_unregister(dl, rvu_af_dl_cn20k_params,
+ ARRAY_SIZE(rvu_af_dl_cn20k_params));
-err_dl_defrag:
+err_dl_cn20k_params:
devlink_params_unregister(dl, rvu_af_dl_params, ARRAY_SIZE(rvu_af_dl_params));
err_dl_health:
@@ -1717,8 +1789,8 @@ void rvu_unregister_dl(struct rvu *rvu)
devlink_params_unregister(dl, rvu_af_dl_params, ARRAY_SIZE(rvu_af_dl_params));
if (is_cn20k(rvu->pdev))
- devlink_params_unregister(dl, rvu_af_dl_param_defrag,
- ARRAY_SIZE(rvu_af_dl_param_defrag));
+ devlink_params_unregister(dl, rvu_af_dl_cn20k_params,
+ ARRAY_SIZE(rvu_af_dl_cn20k_params));
/* Unregister exact match devlink only for CN10K-B */
if (rvu_npc_exact_has_match_table(rvu))
--
2.43.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v10 net-next 5/6] octeontx2-af: npc: cn20k: dynamically allocate and free default MCAM entries
2026-04-03 2:55 [PATCH v10 net-next 0/6] octeontx2-af: npc: Enhancements Ratheesh Kannoth
` (3 preceding siblings ...)
2026-04-03 2:55 ` [PATCH v10 net-next 4/6] octeontx2-af: npc: cn20k: add subbank search order control Ratheesh Kannoth
@ 2026-04-03 2:55 ` Ratheesh Kannoth
2026-04-03 2:55 ` [PATCH v10 net-next 6/6] octeontx2-af: npc: Support for custom KPU profile from filesystem Ratheesh Kannoth
5 siblings, 0 replies; 9+ messages in thread
From: Ratheesh Kannoth @ 2026-04-03 2:55 UTC (permalink / raw)
To: netdev, linux-kernel, linux-rdma
Cc: sgoutham, andrew+netdev, davem, edumazet, kuba, pabeni,
donald.hunter, horms, jiri, chuck.lever, matttbe, cjubran, saeedm,
leon, tariqt, mbloch, dtatulea, Ratheesh Kannoth
Improve MCAM utilization by tying default (broadcast, multicast,
promisc, ucast) entry lifetime to NIX LF usage.
- On NIX LF alloc (e.g. kernel or DPDK), allocate default MCAM entries
if missing; on NIX LF free, release them so they return to the pool.
- Add NIX_LF_DONT_FREE_DFT_IDXS so the kernel PF driver can free the
NIX LF without releasing default entries (e.g. across suspend/resume).
- When NIX LF is used by DPDK, default entries are allocated on first
use and freed when the LF is released if NIX_LF_DONT_FREE_DFT_IDXS is
not set
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
---
.../ethernet/marvell/octeontx2/af/cn20k/npc.c | 108 ++++++++++-----
.../ethernet/marvell/octeontx2/af/cn20k/npc.h | 1 +
.../net/ethernet/marvell/octeontx2/af/mbox.h | 1 +
.../ethernet/marvell/octeontx2/af/rvu_nix.c | 69 ++++++----
.../ethernet/marvell/octeontx2/af/rvu_npc.c | 125 +++++++++++++-----
.../ethernet/marvell/octeontx2/nic/otx2_pf.c | 4 +-
6 files changed, 218 insertions(+), 90 deletions(-)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
index 153765b3e504..69439ff76e10 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
@@ -808,6 +808,11 @@ npc_cn20k_enable_mcam_entry(struct rvu *rvu, int blkaddr,
u64 cfg, hw_prio;
u8 kw_type;
+ if (index < 0 || index >= mcam->total_entries) {
+ WARN(1, "Wrong mcam index %d\n", index);
+ return;
+ }
+
enable ? set_bit(index, npc_priv.en_map) :
clear_bit(index, npc_priv.en_map);
@@ -1053,6 +1058,11 @@ void npc_cn20k_config_mcam_entry(struct rvu *rvu, int blkaddr, int index,
int kw = 0;
u8 kw_type;
+ if (index < 0 || index >= mcam->total_entries) {
+ WARN(1, "Wrong mcam index %d\n", index);
+ return;
+ }
+
/* Disable before mcam entry update */
npc_cn20k_enable_mcam_entry(rvu, blkaddr, index, false);
@@ -1132,6 +1142,11 @@ void npc_cn20k_copy_mcam_entry(struct rvu *rvu, int blkaddr, u16 src, u16 dest)
int bank, i, sb, db;
int dbank, sbank;
+ if (src >= mcam->total_entries || dest >= mcam->total_entries) {
+ WARN(1, "Wrong mcam index src=%u dest=%u\n", src, dest);
+ return;
+ }
+
dbank = npc_get_bank(mcam, dest);
sbank = npc_get_bank(mcam, src);
npc_mcam_idx_2_key_type(rvu, src, &src_kwtype);
@@ -1190,11 +1205,24 @@ void npc_cn20k_read_mcam_entry(struct rvu *rvu, int blkaddr, u16 index,
int kw = 0, bank;
u8 kw_type;
+ if (index >= mcam->total_entries) {
+ WARN(1, "Wrong mcam index %u\n", index);
+ return;
+ }
+
npc_mcam_idx_2_key_type(rvu, index, &kw_type);
bank = npc_get_bank(mcam, index);
index &= (mcam->banksize - 1);
+ cfg = rvu_read64(rvu, blkaddr,
+ NPC_AF_CN20K_MCAMEX_BANKX_ACTIONX_EXT(index, bank, 0));
+ entry->action = cfg;
+
+ cfg = rvu_read64(rvu, blkaddr,
+ NPC_AF_CN20K_MCAMEX_BANKX_ACTIONX_EXT(index, bank, 1));
+ entry->vtag_action = cfg;
+
cfg = rvu_read64(rvu, blkaddr,
NPC_AF_CN20K_MCAMEX_BANKX_CAMX_INTF_EXT(index,
bank, 1)) & 3;
@@ -1244,7 +1272,7 @@ void npc_cn20k_read_mcam_entry(struct rvu *rvu, int blkaddr, u16 index,
bank,
0));
npc_cn20k_fill_entryword(entry, kw + 3, cam0, cam1);
- goto read_action;
+ return;
}
for (bank = 0; bank < mcam->banks_per_entry; bank++, kw = kw + 4) {
@@ -1289,17 +1317,6 @@ void npc_cn20k_read_mcam_entry(struct rvu *rvu, int blkaddr, u16 index,
npc_cn20k_fill_entryword(entry, kw + 3, cam0, cam1);
}
-read_action:
- /* 'action' is set to same value for both bank '0' and '1'.
- * Hence, reading bank '0' should be enough.
- */
- cfg = rvu_read64(rvu, blkaddr,
- NPC_AF_CN20K_MCAMEX_BANKX_ACTIONX_EXT(index, 0, 0));
- entry->action = cfg;
-
- cfg = rvu_read64(rvu, blkaddr,
- NPC_AF_CN20K_MCAMEX_BANKX_ACTIONX_EXT(index, 0, 1));
- entry->vtag_action = cfg;
}
int rvu_mbox_handler_npc_cn20k_mcam_write_entry(struct rvu *rvu,
@@ -1671,8 +1688,8 @@ int npc_mcam_idx_2_key_type(struct rvu *rvu, u16 mcam_idx, u8 *key_type)
/* mcam_idx should be less than (2 * bank depth) */
if (mcam_idx >= npc_priv.bank_depth * 2) {
- dev_err(rvu->dev, "%s: bad params\n",
- __func__);
+ dev_err(rvu->dev, "%s: bad params mcam_idx=%u\n",
+ __func__, mcam_idx);
return -EINVAL;
}
@@ -4024,6 +4041,13 @@ int npc_cn20k_dft_rules_idx_get(struct rvu *rvu, u16 pcifunc, u16 *bcast,
void *val;
int i, j;
+ for (int i = 0; i < ARRAY_SIZE(ptr); i++) {
+ if (!ptr[i])
+ continue;
+
+ *ptr[i] = USHRT_MAX;
+ }
+
if (!npc_priv.init_done)
return 0;
@@ -4039,7 +4063,6 @@ int npc_cn20k_dft_rules_idx_get(struct rvu *rvu, u16 pcifunc, u16 *bcast,
npc_dft_rule_name[NPC_DFT_RULE_PROMISC_ID],
pcifunc);
- *ptr[0] = USHRT_MAX;
return -ESRCH;
}
@@ -4059,7 +4082,6 @@ int npc_cn20k_dft_rules_idx_get(struct rvu *rvu, u16 pcifunc, u16 *bcast,
npc_dft_rule_name[NPC_DFT_RULE_UCAST_ID],
pcifunc);
- *ptr[3] = USHRT_MAX;
return -ESRCH;
}
@@ -4079,7 +4101,6 @@ int npc_cn20k_dft_rules_idx_get(struct rvu *rvu, u16 pcifunc, u16 *bcast,
__func__,
npc_dft_rule_name[i], pcifunc);
- *ptr[j] = USHRT_MAX;
continue;
}
@@ -4174,7 +4195,7 @@ int rvu_mbox_handler_npc_get_dft_rl_idxs(struct rvu *rvu, struct msg_req *req,
return 0;
}
-static bool npc_is_cgx_or_lbk(struct rvu *rvu, u16 pcifunc)
+bool npc_is_cgx_or_lbk(struct rvu *rvu, u16 pcifunc)
{
return is_pf_cgxmapped(rvu, rvu_get_pf(rvu->pdev, pcifunc)) ||
is_lbk_vf(rvu, pcifunc);
@@ -4182,9 +4203,10 @@ static bool npc_is_cgx_or_lbk(struct rvu *rvu, u16 pcifunc)
void npc_cn20k_dft_rules_free(struct rvu *rvu, u16 pcifunc)
{
- struct npc_mcam_free_entry_req free_req = { 0 };
+ struct npc_mcam *mcam = &rvu->hw->mcam;
+ struct rvu_npc_mcam_rule *rule, *tmp;
unsigned long index;
- struct msg_rsp rsp;
+ int blkaddr;
u16 ptr[4];
int rc, i;
void *map;
@@ -4209,7 +4231,7 @@ void npc_cn20k_dft_rules_free(struct rvu *rvu, u16 pcifunc)
index = NPC_DFT_RULE_ID_MK(pcifunc, NPC_DFT_RULE_PROMISC_ID);
map = xa_erase(&npc_priv.xa_pf2dfl_rmap, index);
if (!map)
- dev_dbg(rvu->dev,
+ dev_err(rvu->dev,
"%s: Err from delete %s mcam idx from xarray (pcifunc=%#x\n",
__func__,
npc_dft_rule_name[NPC_DFT_RULE_PROMISC_ID],
@@ -4223,7 +4245,7 @@ void npc_cn20k_dft_rules_free(struct rvu *rvu, u16 pcifunc)
index = NPC_DFT_RULE_ID_MK(pcifunc, NPC_DFT_RULE_UCAST_ID);
map = xa_erase(&npc_priv.xa_pf2dfl_rmap, index);
if (!map)
- dev_dbg(rvu->dev,
+ dev_err(rvu->dev,
"%s: Err from delete %s mcam idx from xarray (pcifunc=%#x\n",
__func__,
npc_dft_rule_name[NPC_DFT_RULE_UCAST_ID],
@@ -4237,21 +4259,47 @@ void npc_cn20k_dft_rules_free(struct rvu *rvu, u16 pcifunc)
index = NPC_DFT_RULE_ID_MK(pcifunc, i);
map = xa_erase(&npc_priv.xa_pf2dfl_rmap, index);
if (!map)
- dev_dbg(rvu->dev,
+ dev_err(rvu->dev,
"%s: Err from delete %s mcam idx from xarray (pcifunc=%#x\n",
__func__, npc_dft_rule_name[i],
pcifunc);
}
free_rules:
+ blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPC, 0);
+ if (blkaddr < 0)
+ return;
- free_req.hdr.pcifunc = pcifunc;
- free_req.all = 1;
- rc = rvu_mbox_handler_npc_mcam_free_entry(rvu, &free_req, &rsp);
- if (rc)
- dev_err(rvu->dev,
- "%s: Error deleting default entries (pcifunc=%#x\n",
- __func__, pcifunc);
+ for (int i = 0; i < 4; i++) {
+ if (ptr[i] == USHRT_MAX)
+ continue;
+
+ mutex_lock(&mcam->lock);
+ npc_mcam_clear_bit(mcam, ptr[i]);
+ mcam->entry2pfvf_map[ptr[i]] = NPC_MCAM_INVALID_MAP;
+ npc_cn20k_enable_mcam_entry(rvu, blkaddr, ptr[i], false);
+ mcam->entry2target_pffunc[ptr[i]] = 0x0;
+ mutex_unlock(&mcam->lock);
+
+ rc = npc_cn20k_idx_free(rvu, &ptr[i], 1);
+ if (rc)
+ dev_err(rvu->dev,
+ "%s:%d Error deleting default entries (pcifunc=%#x) mcam_idx=%u\n",
+ __func__, __LINE__, pcifunc, ptr[i]);
+ }
+
+ mutex_lock(&mcam->lock);
+ list_for_each_entry_safe(rule, tmp, &mcam->mcam_rules, list) {
+ for (int i = 0; i < 4; i++) {
+ if (ptr[i] != rule->entry)
+ continue;
+
+ list_del(&rule->list);
+ kfree(rule);
+ break;
+ }
+ }
+ mutex_unlock(&mcam->lock);
}
int npc_cn20k_dft_rules_alloc(struct rvu *rvu, u16 pcifunc)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
index 6f9f796940f3..1b4b4a6fa378 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
@@ -345,5 +345,6 @@ int npc_mcam_idx_2_subbank_idx(struct rvu *rvu, u16 mcam_idx,
int *sb_off);
const u32 *npc_cn20k_search_order_get(bool *restricted_order, u32 *sz);
int npc_cn20k_search_order_set(struct rvu *rvu, u64 arr[32], int cnt);
+bool npc_is_cgx_or_lbk(struct rvu *rvu, u16 pcifunc);
#endif /* NPC_CN20K_H */
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/mbox.h b/drivers/net/ethernet/marvell/octeontx2/af/mbox.h
index dc42c81c0942..e07fbf842b94 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/mbox.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/mbox.h
@@ -1009,6 +1009,7 @@ struct nix_lf_free_req {
struct mbox_msghdr hdr;
#define NIX_LF_DISABLE_FLOWS BIT_ULL(0)
#define NIX_LF_DONT_FREE_TX_VTAG BIT_ULL(1)
+#define NIX_LF_DONT_FREE_DFT_IDXS BIT_ULL(2)
u64 flags;
};
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
index ef5b081162eb..584e98e25f11 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
@@ -16,6 +16,7 @@
#include "cgx.h"
#include "lmac_common.h"
#include "rvu_npc_hash.h"
+#include "cn20k/npc.h"
static void nix_free_tx_vtag_entries(struct rvu *rvu, u16 pcifunc);
static int rvu_nix_get_bpid(struct rvu *rvu, struct nix_bp_cfg_req *req,
@@ -1499,7 +1500,7 @@ int rvu_mbox_handler_nix_lf_alloc(struct rvu *rvu,
struct nix_lf_alloc_req *req,
struct nix_lf_alloc_rsp *rsp)
{
- int nixlf, qints, hwctx_size, intf, err, rc = 0;
+ int nixlf, qints, hwctx_size, intf, rc = 0;
struct rvu_hwinfo *hw = rvu->hw;
u16 pcifunc = req->hdr.pcifunc;
struct rvu_block *block;
@@ -1555,8 +1556,8 @@ int rvu_mbox_handler_nix_lf_alloc(struct rvu *rvu,
return NIX_AF_ERR_RSS_GRPS_INVALID;
/* Reset this NIX LF */
- err = rvu_lf_reset(rvu, block, nixlf);
- if (err) {
+ rc = rvu_lf_reset(rvu, block, nixlf);
+ if (rc) {
dev_err(rvu->dev, "Failed to reset NIX%d LF%d\n",
block->addr - BLKADDR_NIX0, nixlf);
return NIX_AF_ERR_LF_RESET;
@@ -1566,13 +1567,15 @@ int rvu_mbox_handler_nix_lf_alloc(struct rvu *rvu,
/* Alloc NIX RQ HW context memory and config the base */
hwctx_size = 1UL << ((ctx_cfg >> 4) & 0xF);
- err = qmem_alloc(rvu->dev, &pfvf->rq_ctx, req->rq_cnt, hwctx_size);
- if (err)
+ rc = qmem_alloc(rvu->dev, &pfvf->rq_ctx, req->rq_cnt, hwctx_size);
+ if (rc)
goto free_mem;
pfvf->rq_bmap = kcalloc(req->rq_cnt, sizeof(long), GFP_KERNEL);
- if (!pfvf->rq_bmap)
+ if (!pfvf->rq_bmap) {
+ rc = -ENOMEM;
goto free_mem;
+ }
rvu_write64(rvu, blkaddr, NIX_AF_LFX_RQS_BASE(nixlf),
(u64)pfvf->rq_ctx->iova);
@@ -1583,13 +1586,15 @@ int rvu_mbox_handler_nix_lf_alloc(struct rvu *rvu,
/* Alloc NIX SQ HW context memory and config the base */
hwctx_size = 1UL << (ctx_cfg & 0xF);
- err = qmem_alloc(rvu->dev, &pfvf->sq_ctx, req->sq_cnt, hwctx_size);
- if (err)
+ rc = qmem_alloc(rvu->dev, &pfvf->sq_ctx, req->sq_cnt, hwctx_size);
+ if (rc)
goto free_mem;
pfvf->sq_bmap = kcalloc(req->sq_cnt, sizeof(long), GFP_KERNEL);
- if (!pfvf->sq_bmap)
+ if (!pfvf->sq_bmap) {
+ rc = -ENOMEM;
goto free_mem;
+ }
rvu_write64(rvu, blkaddr, NIX_AF_LFX_SQS_BASE(nixlf),
(u64)pfvf->sq_ctx->iova);
@@ -1599,13 +1604,15 @@ int rvu_mbox_handler_nix_lf_alloc(struct rvu *rvu,
/* Alloc NIX CQ HW context memory and config the base */
hwctx_size = 1UL << ((ctx_cfg >> 8) & 0xF);
- err = qmem_alloc(rvu->dev, &pfvf->cq_ctx, req->cq_cnt, hwctx_size);
- if (err)
+ rc = qmem_alloc(rvu->dev, &pfvf->cq_ctx, req->cq_cnt, hwctx_size);
+ if (rc)
goto free_mem;
pfvf->cq_bmap = kcalloc(req->cq_cnt, sizeof(long), GFP_KERNEL);
- if (!pfvf->cq_bmap)
+ if (!pfvf->cq_bmap) {
+ rc = -ENOMEM;
goto free_mem;
+ }
rvu_write64(rvu, blkaddr, NIX_AF_LFX_CQS_BASE(nixlf),
(u64)pfvf->cq_ctx->iova);
@@ -1615,18 +1622,18 @@ int rvu_mbox_handler_nix_lf_alloc(struct rvu *rvu,
/* Initialize receive side scaling (RSS) */
hwctx_size = 1UL << ((ctx_cfg >> 12) & 0xF);
- err = nixlf_rss_ctx_init(rvu, blkaddr, pfvf, nixlf, req->rss_sz,
- req->rss_grps, hwctx_size, req->way_mask,
- !!(req->flags & NIX_LF_RSS_TAG_LSB_AS_ADDER));
- if (err)
+ rc = nixlf_rss_ctx_init(rvu, blkaddr, pfvf, nixlf, req->rss_sz,
+ req->rss_grps, hwctx_size, req->way_mask,
+ !!(req->flags & NIX_LF_RSS_TAG_LSB_AS_ADDER));
+ if (rc)
goto free_mem;
/* Alloc memory for CQINT's HW contexts */
cfg = rvu_read64(rvu, blkaddr, NIX_AF_CONST2);
qints = (cfg >> 24) & 0xFFF;
hwctx_size = 1UL << ((ctx_cfg >> 24) & 0xF);
- err = qmem_alloc(rvu->dev, &pfvf->cq_ints_ctx, qints, hwctx_size);
- if (err)
+ rc = qmem_alloc(rvu->dev, &pfvf->cq_ints_ctx, qints, hwctx_size);
+ if (rc)
goto free_mem;
rvu_write64(rvu, blkaddr, NIX_AF_LFX_CINTS_BASE(nixlf),
@@ -1639,8 +1646,8 @@ int rvu_mbox_handler_nix_lf_alloc(struct rvu *rvu,
cfg = rvu_read64(rvu, blkaddr, NIX_AF_CONST2);
qints = (cfg >> 12) & 0xFFF;
hwctx_size = 1UL << ((ctx_cfg >> 20) & 0xF);
- err = qmem_alloc(rvu->dev, &pfvf->nix_qints_ctx, qints, hwctx_size);
- if (err)
+ rc = qmem_alloc(rvu->dev, &pfvf->nix_qints_ctx, qints, hwctx_size);
+ if (rc)
goto free_mem;
rvu_write64(rvu, blkaddr, NIX_AF_LFX_QINTS_BASE(nixlf),
@@ -1684,10 +1691,16 @@ int rvu_mbox_handler_nix_lf_alloc(struct rvu *rvu,
if (is_sdp_pfvf(rvu, pcifunc))
intf = NIX_INTF_TYPE_SDP;
- err = nix_interface_init(rvu, pcifunc, intf, nixlf, rsp,
- !!(req->flags & NIX_LF_LBK_BLK_SEL));
- if (err)
- goto free_mem;
+ if (is_cn20k(rvu->pdev)) {
+ rc = npc_cn20k_dft_rules_alloc(rvu, pcifunc);
+ if (rc)
+ goto free_mem;
+ }
+
+ rc = nix_interface_init(rvu, pcifunc, intf, nixlf, rsp,
+ !!(req->flags & NIX_LF_LBK_BLK_SEL));
+ if (rc)
+ goto free_dft;
/* Disable NPC entries as NIXLF's contexts are not initialized yet */
rvu_npc_disable_default_entries(rvu, pcifunc, nixlf);
@@ -1699,9 +1712,12 @@ int rvu_mbox_handler_nix_lf_alloc(struct rvu *rvu,
goto exit;
+free_dft:
+ if (is_cn20k(rvu->pdev))
+ npc_cn20k_dft_rules_free(rvu, pcifunc);
+
free_mem:
nix_ctx_free(rvu, pfvf);
- rc = -ENOMEM;
exit:
/* Set macaddr of this PF/VF */
@@ -1775,6 +1791,9 @@ int rvu_mbox_handler_nix_lf_free(struct rvu *rvu, struct nix_lf_free_req *req,
nix_ctx_free(rvu, pfvf);
+ if (is_cn20k(rvu->pdev) && !(req->flags & NIX_LF_DONT_FREE_DFT_IDXS))
+ npc_cn20k_dft_rules_free(rvu, pcifunc);
+
return 0;
}
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
index c2ca5ed1d028..8d260bcfbf38 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
@@ -165,12 +165,20 @@ int npc_get_nixlf_mcam_index(struct npc_mcam *mcam,
switch (type) {
case NIXLF_BCAST_ENTRY:
+ if (bcast == USHRT_MAX)
+ return -EINVAL;
return bcast;
case NIXLF_ALLMULTI_ENTRY:
+ if (mcast == USHRT_MAX)
+ return -EINVAL;
return mcast;
case NIXLF_PROMISC_ENTRY:
+ if (promisc == USHRT_MAX)
+ return -EINVAL;
return promisc;
case NIXLF_UCAST_ENTRY:
+ if (ucast == USHRT_MAX)
+ return -EINVAL;
return ucast;
default:
return -EINVAL;
@@ -237,12 +245,8 @@ void npc_enable_mcam_entry(struct rvu *rvu, struct npc_mcam *mcam,
int bank = npc_get_bank(mcam, index);
int actbank = bank;
- if (is_cn20k(rvu->pdev)) {
- if (index < 0 || index >= mcam->banksize * mcam->banks)
- return;
-
+ if (is_cn20k(rvu->pdev))
return npc_cn20k_enable_mcam_entry(rvu, blkaddr, index, enable);
- }
index &= (mcam->banksize - 1);
for (; bank < (actbank + mcam->banks_per_entry); bank++) {
@@ -1113,7 +1117,7 @@ void rvu_npc_update_flowkey_alg_idx(struct rvu *rvu, u16 pcifunc, int nixlf,
index = mcam_index;
}
- if (index >= mcam->total_entries)
+ if (index < 0 || index >= mcam->total_entries)
return;
bank = npc_get_bank(mcam, index);
@@ -1158,16 +1162,18 @@ void rvu_npc_update_flowkey_alg_idx(struct rvu *rvu, u16 pcifunc, int nixlf,
/* If PF's promiscuous entry is enabled,
* Set RSS action for that entry as well
*/
- npc_update_rx_action_with_alg_idx(rvu, action, pfvf, index,
- blkaddr, alg_idx);
+ if (index >= 0)
+ npc_update_rx_action_with_alg_idx(rvu, action, pfvf,
+ index, blkaddr, alg_idx);
index = npc_get_nixlf_mcam_index(mcam, pcifunc,
nixlf, NIXLF_ALLMULTI_ENTRY);
/* If PF's allmulti entry is enabled,
* Set RSS action for that entry as well
*/
- npc_update_rx_action_with_alg_idx(rvu, action, pfvf, index,
- blkaddr, alg_idx);
+ if (index >= 0)
+ npc_update_rx_action_with_alg_idx(rvu, action, pfvf,
+ index, blkaddr, alg_idx);
}
}
@@ -1212,8 +1218,13 @@ static void npc_enadis_default_entries(struct rvu *rvu, u16 pcifunc,
{
struct rvu_pfvf *pfvf = rvu_get_pfvf(rvu, pcifunc);
struct npc_mcam *mcam = &rvu->hw->mcam;
+ int type = NIXLF_UCAST_ENTRY;
int index, blkaddr;
+ /* only CGX or LBK interfaces have default entries */
+ if (is_cn20k(rvu->pdev) && !npc_is_cgx_or_lbk(rvu, pcifunc))
+ return;
+
blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPC, 0);
if (blkaddr < 0)
return;
@@ -1221,8 +1232,11 @@ static void npc_enadis_default_entries(struct rvu *rvu, u16 pcifunc,
/* Ucast MCAM match entry of this PF/VF */
if (npc_is_feature_supported(rvu, BIT_ULL(NPC_DMAC),
pfvf->nix_rx_intf)) {
+ if (is_cn20k(rvu->pdev) && is_lbk_vf(rvu, pcifunc))
+ type = NIXLF_PROMISC_ENTRY;
+
index = npc_get_nixlf_mcam_index(mcam, pcifunc,
- nixlf, NIXLF_UCAST_ENTRY);
+ nixlf, type);
npc_enable_mcam_entry(rvu, mcam, blkaddr, index, enable);
}
@@ -1232,9 +1246,13 @@ static void npc_enadis_default_entries(struct rvu *rvu, u16 pcifunc,
if ((pcifunc & RVU_PFVF_FUNC_MASK) && !rvu->hw->cap.nix_rx_multicast)
return;
+ type = NIXLF_BCAST_ENTRY;
+ if (is_cn20k(rvu->pdev) && is_lbk_vf(rvu, pcifunc))
+ type = NIXLF_PROMISC_ENTRY;
+
/* add/delete pf_func to broadcast MCE list */
npc_enadis_default_mce_entry(rvu, pcifunc, nixlf,
- NIXLF_BCAST_ENTRY, enable);
+ type, enable);
}
void rvu_npc_disable_default_entries(struct rvu *rvu, u16 pcifunc, int nixlf)
@@ -1244,6 +1262,9 @@ void rvu_npc_disable_default_entries(struct rvu *rvu, u16 pcifunc, int nixlf)
npc_enadis_default_entries(rvu, pcifunc, nixlf, false);
+ if (is_cn20k(rvu->pdev) && is_vf(pcifunc))
+ return;
+
/* Delete multicast and promisc MCAM entries */
npc_enadis_default_mce_entry(rvu, pcifunc, nixlf,
NIXLF_ALLMULTI_ENTRY, false);
@@ -1301,6 +1322,10 @@ void rvu_npc_disable_mcam_entries(struct rvu *rvu, u16 pcifunc, int nixlf)
if (blkaddr < 0)
return;
+ /* only CGX or LBK interfaces have default entries */
+ if (is_cn20k(rvu->pdev) && !npc_is_cgx_or_lbk(rvu, pcifunc))
+ return;
+
mutex_lock(&mcam->lock);
/* Disable MCAM entries directing traffic to this 'pcifunc' */
@@ -2504,33 +2529,58 @@ void npc_mcam_clear_bit(struct npc_mcam *mcam, u16 index)
static void npc_mcam_free_all_entries(struct rvu *rvu, struct npc_mcam *mcam,
int blkaddr, u16 pcifunc)
{
+ u16 dft_idxs[NPC_DFT_RULE_MAX_ID] = {[0 ... NPC_DFT_RULE_MAX_ID - 1] = USHRT_MAX};
u16 index, cntr;
+ bool dft_rl;
int rc;
+ npc_cn20k_dft_rules_idx_get(rvu, pcifunc,
+ &dft_idxs[NPC_DFT_RULE_BCAST_ID],
+ &dft_idxs[NPC_DFT_RULE_MCAST_ID],
+ &dft_idxs[NPC_DFT_RULE_PROMISC_ID],
+ &dft_idxs[NPC_DFT_RULE_UCAST_ID]);
+
/* Scan all MCAM entries and free the ones mapped to 'pcifunc' */
for (index = 0; index < mcam->bmap_entries; index++) {
- if (mcam->entry2pfvf_map[index] == pcifunc) {
- mcam->entry2pfvf_map[index] = NPC_MCAM_INVALID_MAP;
- /* Free the entry in bitmap */
- npc_mcam_clear_bit(mcam, index);
- /* Disable the entry */
- npc_enable_mcam_entry(rvu, mcam, blkaddr, index, false);
-
- /* Update entry2counter mapping */
- cntr = mcam->entry2cntr_map[index];
- if (cntr != NPC_MCAM_INVALID_MAP)
- npc_unmap_mcam_entry_and_cntr(rvu, mcam,
- blkaddr, index,
- cntr);
- mcam->entry2target_pffunc[index] = 0x0;
- if (is_cn20k(rvu->pdev)) {
- rc = npc_cn20k_idx_free(rvu, &index, 1);
- if (rc)
- dev_err(rvu->dev,
- "Failed to free mcam idx=%u pcifunc=%#x\n",
- index, pcifunc);
+ if (mcam->entry2pfvf_map[index] != pcifunc)
+ continue;
+
+ mcam->entry2pfvf_map[index] = NPC_MCAM_INVALID_MAP;
+
+ dft_rl = false;
+ if (is_cn20k(rvu->pdev)) {
+ if (dft_idxs[NPC_DFT_RULE_BCAST_ID] == index ||
+ dft_idxs[NPC_DFT_RULE_MCAST_ID] == index ||
+ dft_idxs[NPC_DFT_RULE_PROMISC_ID] == index ||
+ dft_idxs[NPC_DFT_RULE_UCAST_ID] == index) {
+ dft_rl = true;
}
}
+
+ /* Free the entry in bitmap.*/
+ if (!dft_rl)
+ npc_mcam_clear_bit(mcam, index);
+
+ /* Disable the entry */
+ npc_enable_mcam_entry(rvu, mcam, blkaddr, index, false);
+
+ /* Update entry2counter mapping */
+ cntr = mcam->entry2cntr_map[index];
+ if (cntr != NPC_MCAM_INVALID_MAP)
+ npc_unmap_mcam_entry_and_cntr(rvu, mcam,
+ blkaddr, index,
+ cntr);
+ mcam->entry2target_pffunc[index] = 0x0;
+ if (is_cn20k(rvu->pdev)) {
+ if (dft_rl)
+ continue;
+
+ rc = npc_cn20k_idx_free(rvu, &index, 1);
+ if (rc)
+ dev_err(rvu->dev,
+ "Failed to free mcam idx=%u pcifunc=%#x\n",
+ index, pcifunc);
+ }
}
}
@@ -3917,13 +3967,22 @@ void rvu_npc_clear_ucast_entry(struct rvu *rvu, int pcifunc, int nixlf)
struct npc_mcam *mcam = &rvu->hw->mcam;
struct rvu_npc_mcam_rule *rule;
int ucast_idx, blkaddr;
+ u8 type;
blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPC, 0);
if (blkaddr < 0)
return;
+ type = NIXLF_UCAST_ENTRY;
+ if (is_cn20k(rvu->pdev) && is_lbk_vf(rvu, pcifunc))
+ type = NIXLF_PROMISC_ENTRY;
+
ucast_idx = npc_get_nixlf_mcam_index(mcam, pcifunc,
- nixlf, NIXLF_UCAST_ENTRY);
+ nixlf, type);
+
+ /* In cn20k, default rules are freed before detach rsrc */
+ if (ucast_idx < 0)
+ return;
npc_enable_mcam_entry(rvu, mcam, blkaddr, ucast_idx, false);
diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
index ee623476e5ff..366850742862 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
@@ -1729,7 +1729,7 @@ int otx2_init_hw_resources(struct otx2_nic *pf)
mutex_lock(&mbox->lock);
free_req = otx2_mbox_alloc_msg_nix_lf_free(mbox);
if (free_req) {
- free_req->flags = NIX_LF_DISABLE_FLOWS;
+ free_req->flags = NIX_LF_DISABLE_FLOWS | NIX_LF_DONT_FREE_DFT_IDXS;
if (otx2_sync_mbox_msg(mbox))
dev_err(pf->dev, "%s failed to free nixlf\n", __func__);
}
@@ -1803,7 +1803,7 @@ void otx2_free_hw_resources(struct otx2_nic *pf)
/* Reset NIX LF */
free_req = otx2_mbox_alloc_msg_nix_lf_free(mbox);
if (free_req) {
- free_req->flags = NIX_LF_DISABLE_FLOWS;
+ free_req->flags = NIX_LF_DISABLE_FLOWS | NIX_LF_DONT_FREE_DFT_IDXS;
if (!(pf->flags & OTX2_FLAG_PF_SHUTDOWN))
free_req->flags |= NIX_LF_DONT_FREE_TX_VTAG;
if (otx2_sync_mbox_msg(mbox))
--
2.43.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v10 net-next 6/6] octeontx2-af: npc: Support for custom KPU profile from filesystem
2026-04-03 2:55 [PATCH v10 net-next 0/6] octeontx2-af: npc: Enhancements Ratheesh Kannoth
` (4 preceding siblings ...)
2026-04-03 2:55 ` [PATCH v10 net-next 5/6] octeontx2-af: npc: cn20k: dynamically allocate and free default MCAM entries Ratheesh Kannoth
@ 2026-04-03 2:55 ` Ratheesh Kannoth
5 siblings, 0 replies; 9+ messages in thread
From: Ratheesh Kannoth @ 2026-04-03 2:55 UTC (permalink / raw)
To: netdev, linux-kernel, linux-rdma
Cc: sgoutham, andrew+netdev, davem, edumazet, kuba, pabeni,
donald.hunter, horms, jiri, chuck.lever, matttbe, cjubran, saeedm,
leon, tariqt, mbloch, dtatulea, Ratheesh Kannoth
Flashing updated firmware on deployed devices is cumbersome. Provide a
mechanism to load a custom KPU (Key Parse Unit) profile directly from
the filesystem at module load time.
When the rvu_af module is loaded with the kpu_profile parameter, the
specified profile is read from /lib/firmware/kpu and programmed into
the KPU registers. Add npc_kpu_profile_cam2 for the extended cam format
used by filesystem-loaded profiles and support ptype/ptype_mask in
npc_config_kpucam when profile->from_fs is set.
Usage:
1. Copy the KPU profile file to /lib/firmware/kpu.
2. Build OCTEONTX2_AF as a module.
3. Load: insmod rvu_af.ko kpu_profile=<profile_name>
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
---
.../ethernet/marvell/octeontx2/af/cn20k/npc.c | 57 ++-
.../net/ethernet/marvell/octeontx2/af/npc.h | 17 +
.../net/ethernet/marvell/octeontx2/af/rvu.h | 12 +-
.../ethernet/marvell/octeontx2/af/rvu_npc.c | 446 ++++++++++++++----
.../ethernet/marvell/octeontx2/af/rvu_npc.h | 17 +
.../ethernet/marvell/octeontx2/af/rvu_reg.h | 1 +
6 files changed, 438 insertions(+), 112 deletions(-)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
index 69439ff76e10..9f71b7ea06ce 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
@@ -521,13 +521,17 @@ npc_program_single_kpm_profile(struct rvu *rvu, int blkaddr,
int kpm, int start_entry,
const struct npc_kpu_profile *profile)
{
+ int num_cam_entries, num_action_entries;
int entry, num_entries, max_entries;
u64 idx;
- if (profile->cam_entries != profile->action_entries) {
+ num_cam_entries = npc_get_num_kpu_cam_entries(rvu, profile);
+ num_action_entries = npc_get_num_kpu_action_entries(rvu, profile);
+
+ if (num_cam_entries != num_action_entries) {
dev_err(rvu->dev,
"kpm%d: CAM and action entries [%d != %d] not equal\n",
- kpm, profile->cam_entries, profile->action_entries);
+ kpm, num_cam_entries, num_action_entries);
WARN(1, "Fatal error\n");
return;
@@ -536,16 +540,18 @@ npc_program_single_kpm_profile(struct rvu *rvu, int blkaddr,
max_entries = rvu->hw->npc_kpu_entries / 2;
entry = start_entry;
/* Program CAM match entries for previous kpm extracted data */
- num_entries = min_t(int, profile->cam_entries, max_entries);
+ num_entries = min_t(int, num_cam_entries, max_entries);
for (idx = 0; entry < num_entries + start_entry; entry++, idx++)
- npc_config_kpmcam(rvu, blkaddr, &profile->cam[idx],
+ npc_config_kpmcam(rvu, blkaddr,
+ npc_get_kpu_cam_nth_entry(rvu, profile, idx),
kpm, entry);
entry = start_entry;
/* Program this kpm's actions */
- num_entries = min_t(int, profile->action_entries, max_entries);
+ num_entries = min_t(int, num_action_entries, max_entries);
for (idx = 0; entry < num_entries + start_entry; entry++, idx++)
- npc_config_kpmaction(rvu, blkaddr, &profile->action[idx],
+ npc_config_kpmaction(rvu, blkaddr,
+ npc_get_kpu_action_nth_entry(rvu, profile, idx),
kpm, entry, false);
}
@@ -611,20 +617,23 @@ npc_enable_kpm_entry(struct rvu *rvu, int blkaddr, int kpm, int num_entries)
static void npc_program_kpm_profile(struct rvu *rvu, int blkaddr, int num_kpms)
{
const struct npc_kpu_profile *profile1, *profile2;
+ int pfl1_num_cam_entries, pfl2_num_cam_entries;
int idx, total_cam_entries;
for (idx = 0; idx < num_kpms; idx++) {
profile1 = &rvu->kpu.kpu[idx];
+ pfl1_num_cam_entries = npc_get_num_kpu_cam_entries(rvu, profile1);
npc_program_single_kpm_profile(rvu, blkaddr, idx, 0, profile1);
profile2 = &rvu->kpu.kpu[idx + KPU_OFFSET];
+ pfl2_num_cam_entries = npc_get_num_kpu_cam_entries(rvu, profile2);
+
npc_program_single_kpm_profile(rvu, blkaddr, idx,
- profile1->cam_entries,
+ pfl1_num_cam_entries,
profile2);
- total_cam_entries = profile1->cam_entries +
- profile2->cam_entries;
+ total_cam_entries = pfl1_num_cam_entries + pfl2_num_cam_entries;
npc_enable_kpm_entry(rvu, blkaddr, idx, total_cam_entries);
rvu_write64(rvu, blkaddr, NPC_AF_KPMX_PASS2_OFFSET(idx),
- profile1->cam_entries);
+ pfl1_num_cam_entries);
/* Enable the KPUs associated with this KPM */
rvu_write64(rvu, blkaddr, NPC_AF_KPUX_CFG(idx), 0x01);
rvu_write64(rvu, blkaddr, NPC_AF_KPUX_CFG(idx + KPU_OFFSET),
@@ -634,6 +643,7 @@ static void npc_program_kpm_profile(struct rvu *rvu, int blkaddr, int num_kpms)
void npc_cn20k_parser_profile_init(struct rvu *rvu, int blkaddr)
{
+ struct npc_kpu_profile_action *act;
struct rvu_hwinfo *hw = rvu->hw;
int num_pkinds, idx;
@@ -665,9 +675,15 @@ void npc_cn20k_parser_profile_init(struct rvu *rvu, int blkaddr)
num_pkinds = rvu->kpu.pkinds;
num_pkinds = min_t(int, hw->npc_pkinds, num_pkinds);
- for (idx = 0; idx < num_pkinds; idx++)
- npc_config_kpmaction(rvu, blkaddr, &rvu->kpu.ikpu[idx],
+ /* Cn20k does not support Custom profile from filesystem */
+ for (idx = 0; idx < num_pkinds; idx++) {
+ act = npc_get_ikpu_nth_entry(rvu, idx);
+ if (!act)
+ continue;
+
+ npc_config_kpmaction(rvu, blkaddr, act,
0, idx, true);
+ }
/* Program KPM CAM and Action profiles */
npc_program_kpm_profile(rvu, blkaddr, hw->npc_kpms);
@@ -679,7 +695,7 @@ struct npc_priv_t *npc_priv_get(void)
}
static void npc_program_mkex_rx(struct rvu *rvu, int blkaddr,
- struct npc_mcam_kex_extr *mkex_extr,
+ const struct npc_mcam_kex_extr *mkex_extr,
u8 intf)
{
u8 num_extr = rvu->hw->npc_kex_extr;
@@ -708,7 +724,7 @@ static void npc_program_mkex_rx(struct rvu *rvu, int blkaddr,
}
static void npc_program_mkex_tx(struct rvu *rvu, int blkaddr,
- struct npc_mcam_kex_extr *mkex_extr,
+ const struct npc_mcam_kex_extr *mkex_extr,
u8 intf)
{
u8 num_extr = rvu->hw->npc_kex_extr;
@@ -737,7 +753,7 @@ static void npc_program_mkex_tx(struct rvu *rvu, int blkaddr,
}
static void npc_program_mkex_profile(struct rvu *rvu, int blkaddr,
- struct npc_mcam_kex_extr *mkex_extr)
+ const struct npc_mcam_kex_extr *mkex_extr)
{
struct rvu_hwinfo *hw = rvu->hw;
u8 intf;
@@ -1589,8 +1605,8 @@ npc_cn20k_update_action_entries_n_flags(struct rvu *rvu,
int npc_cn20k_apply_custom_kpu(struct rvu *rvu,
struct npc_kpu_profile_adapter *profile)
{
+ const struct npc_cn20k_kpu_profile_fwdata *fw = rvu->kpu_fwdata;
size_t hdr_sz = sizeof(struct npc_cn20k_kpu_profile_fwdata);
- struct npc_cn20k_kpu_profile_fwdata *fw = rvu->kpu_fwdata;
struct npc_kpu_profile_action *action;
struct npc_kpu_profile_cam *cam;
struct npc_kpu_fwdata *fw_kpu;
@@ -1635,8 +1651,15 @@ int npc_cn20k_apply_custom_kpu(struct rvu *rvu,
}
/* Verify if profile fits the HW */
+ if (fw->kpus > rvu->hw->npc_kpus) {
+ dev_warn(rvu->dev, "Not enough KPUs: %d > %d\n", fw->kpus,
+ rvu->hw->npc_kpus);
+ return -EINVAL;
+ }
+
+ /* Check if there is enough memory */
if (fw->kpus > profile->kpus) {
- dev_warn(rvu->dev, "Not enough KPUs: %d > %ld\n", fw->kpus,
+ dev_warn(rvu->dev, "Not enough KPUs: %d > %zu\n", fw->kpus,
profile->kpus);
return -EINVAL;
}
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/npc.h b/drivers/net/ethernet/marvell/octeontx2/af/npc.h
index cefc5d70f3e4..c8c0cb68535c 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/npc.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/npc.h
@@ -265,6 +265,19 @@ struct npc_kpu_profile_cam {
u16 dp2_mask;
} __packed;
+struct npc_kpu_profile_cam2 {
+ u8 state;
+ u8 state_mask;
+ u16 dp0;
+ u16 dp0_mask;
+ u16 dp1;
+ u16 dp1_mask;
+ u16 dp2;
+ u16 dp2_mask;
+ u8 ptype;
+ u8 ptype_mask;
+} __packed;
+
struct npc_kpu_profile_action {
u8 errlev;
u8 errcode;
@@ -290,6 +303,10 @@ struct npc_kpu_profile {
int action_entries;
struct npc_kpu_profile_cam *cam;
struct npc_kpu_profile_action *action;
+ int cam_entries2;
+ int action_entries2;
+ struct npc_kpu_profile_action *action2;
+ struct npc_kpu_profile_cam2 *cam2;
};
/* NPC KPU register formats */
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu.h b/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
index a466181cf908..2a2f2287e0c0 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
@@ -553,17 +553,19 @@ struct npc_kpu_profile_adapter {
const char *name;
u64 version;
const struct npc_lt_def_cfg *lt_def;
- const struct npc_kpu_profile_action *ikpu; /* array[pkinds] */
- const struct npc_kpu_profile *kpu; /* array[kpus] */
+ struct npc_kpu_profile_action *ikpu; /* array[pkinds] */
+ struct npc_kpu_profile_action *ikpu2; /* array[pkinds] */
+ struct npc_kpu_profile *kpu; /* array[kpus] */
union npc_mcam_key_prfl {
- struct npc_mcam_kex *mkex;
+ const struct npc_mcam_kex *mkex;
/* used for cn9k and cn10k */
- struct npc_mcam_kex_extr *mkex_extr; /* used for cn20k */
+ const struct npc_mcam_kex_extr *mkex_extr; /* used for cn20k */
} mcam_kex_prfl;
struct npc_mcam_kex_hash *mkex_hash;
bool custom;
size_t pkinds;
size_t kpus;
+ bool from_fs;
};
#define RVU_SWITCH_LBK_CHAN 63
@@ -634,7 +636,7 @@ struct rvu {
/* Firmware data */
struct rvu_fwdata *fwdata;
- void *kpu_fwdata;
+ const void *kpu_fwdata;
size_t kpu_fwdata_sz;
void __iomem *kpu_prfl_addr;
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
index 8d260bcfbf38..9cb9a850e903 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
@@ -1384,7 +1384,8 @@ void rvu_npc_free_mcam_entries(struct rvu *rvu, u16 pcifunc, int nixlf)
}
static void npc_program_mkex_rx(struct rvu *rvu, int blkaddr,
- struct npc_mcam_kex *mkex, u8 intf)
+ const struct npc_mcam_kex *mkex,
+ u8 intf)
{
int lid, lt, ld, fl;
@@ -1413,7 +1414,8 @@ static void npc_program_mkex_rx(struct rvu *rvu, int blkaddr,
}
static void npc_program_mkex_tx(struct rvu *rvu, int blkaddr,
- struct npc_mcam_kex *mkex, u8 intf)
+ const struct npc_mcam_kex *mkex,
+ u8 intf)
{
int lid, lt, ld, fl;
@@ -1442,7 +1444,7 @@ static void npc_program_mkex_tx(struct rvu *rvu, int blkaddr,
}
static void npc_program_mkex_profile(struct rvu *rvu, int blkaddr,
- struct npc_mcam_kex *mkex)
+ const struct npc_mcam_kex *mkex)
{
struct rvu_hwinfo *hw = rvu->hw;
u8 intf;
@@ -1582,8 +1584,12 @@ static void npc_config_kpucam(struct rvu *rvu, int blkaddr,
const struct npc_kpu_profile_cam *kpucam,
int kpu, int entry)
{
+ const struct npc_kpu_profile_cam2 *kpucam2 = (void *)kpucam;
+ struct npc_kpu_profile_adapter *profile = &rvu->kpu;
struct npc_kpu_cam cam0 = {0};
struct npc_kpu_cam cam1 = {0};
+ u64 *val = (u64 *)&cam1;
+ u64 *mask = (u64 *)&cam0;
cam1.state = kpucam->state & kpucam->state_mask;
cam1.dp0_data = kpucam->dp0 & kpucam->dp0_mask;
@@ -1595,6 +1601,14 @@ static void npc_config_kpucam(struct rvu *rvu, int blkaddr,
cam0.dp1_data = ~kpucam->dp1 & kpucam->dp1_mask;
cam0.dp2_data = ~kpucam->dp2 & kpucam->dp2_mask;
+ if (profile->from_fs) {
+ u8 ptype = kpucam2->ptype;
+ u8 pmask = kpucam2->ptype_mask;
+
+ *val |= FIELD_PREP(GENMASK_ULL(57, 56), ptype & pmask);
+ *mask |= FIELD_PREP(GENMASK_ULL(57, 56), ~ptype & pmask);
+ }
+
rvu_write64(rvu, blkaddr,
NPC_AF_KPUX_ENTRYX_CAMX(kpu, entry, 0), *(u64 *)&cam0);
rvu_write64(rvu, blkaddr,
@@ -1606,34 +1620,104 @@ u64 npc_enable_mask(int count)
return (((count) < 64) ? ~(BIT_ULL(count) - 1) : (0x00ULL));
}
+struct npc_kpu_profile_action *
+npc_get_ikpu_nth_entry(struct rvu *rvu, int n)
+{
+ struct npc_kpu_profile_adapter *profile = &rvu->kpu;
+
+ if (profile->from_fs)
+ return &profile->ikpu2[n];
+
+ return &profile->ikpu[n];
+}
+
+int
+npc_get_num_kpu_cam_entries(struct rvu *rvu,
+ const struct npc_kpu_profile *kpu_pfl)
+{
+ struct npc_kpu_profile_adapter *profile = &rvu->kpu;
+
+ if (profile->from_fs)
+ return kpu_pfl->cam_entries2;
+
+ return kpu_pfl->cam_entries;
+}
+
+struct npc_kpu_profile_cam *
+npc_get_kpu_cam_nth_entry(struct rvu *rvu,
+ const struct npc_kpu_profile *kpu_pfl, int n)
+{
+ struct npc_kpu_profile_adapter *profile = &rvu->kpu;
+
+ if (profile->from_fs)
+ return (void *)&kpu_pfl->cam2[n];
+
+ return (void *)&kpu_pfl->cam[n];
+}
+
+int
+npc_get_num_kpu_action_entries(struct rvu *rvu,
+ const struct npc_kpu_profile *kpu_pfl)
+{
+ struct npc_kpu_profile_adapter *profile = &rvu->kpu;
+
+ if (profile->from_fs)
+ return kpu_pfl->action_entries2;
+
+ return kpu_pfl->action_entries;
+}
+
+struct npc_kpu_profile_action *
+npc_get_kpu_action_nth_entry(struct rvu *rvu,
+ const struct npc_kpu_profile *kpu_pfl,
+ int n)
+{
+ struct npc_kpu_profile_adapter *profile = &rvu->kpu;
+
+ if (profile->from_fs)
+ return (void *)&kpu_pfl->action2[n];
+
+ return (void *)&kpu_pfl->action[n];
+}
+
static void npc_program_kpu_profile(struct rvu *rvu, int blkaddr, int kpu,
const struct npc_kpu_profile *profile)
{
+ int num_cam_entries, num_action_entries;
int entry, num_entries, max_entries;
u64 entry_mask;
- if (profile->cam_entries != profile->action_entries) {
+ num_cam_entries = npc_get_num_kpu_cam_entries(rvu, profile);
+ num_action_entries = npc_get_num_kpu_action_entries(rvu, profile);
+
+ if (num_cam_entries != num_action_entries) {
dev_err(rvu->dev,
"KPU%d: CAM and action entries [%d != %d] not equal\n",
- kpu, profile->cam_entries, profile->action_entries);
+ kpu, num_cam_entries, num_action_entries);
}
max_entries = rvu->hw->npc_kpu_entries;
+ WARN(num_cam_entries > max_entries,
+ "KPU%u: err: hw max entries=%u, input entries=%u\n",
+ kpu, rvu->hw->npc_kpu_entries, num_cam_entries);
+
/* Program CAM match entries for previous KPU extracted data */
- num_entries = min_t(int, profile->cam_entries, max_entries);
+ num_entries = min_t(int, num_cam_entries, max_entries);
for (entry = 0; entry < num_entries; entry++)
npc_config_kpucam(rvu, blkaddr,
- &profile->cam[entry], kpu, entry);
+ (void *)npc_get_kpu_cam_nth_entry(rvu, profile, entry),
+ kpu, entry);
/* Program this KPU's actions */
- num_entries = min_t(int, profile->action_entries, max_entries);
+ num_entries = min_t(int, num_action_entries, max_entries);
for (entry = 0; entry < num_entries; entry++)
- npc_config_kpuaction(rvu, blkaddr, &profile->action[entry],
+ npc_config_kpuaction(rvu, blkaddr,
+ (void *)npc_get_kpu_action_nth_entry(rvu, profile, entry),
kpu, entry, false);
/* Enable all programmed entries */
- num_entries = min_t(int, profile->action_entries, profile->cam_entries);
+ num_entries = min_t(int, num_action_entries, num_cam_entries);
entry_mask = npc_enable_mask(num_entries);
/* Disable first KPU_MAX_CST_ENT entries for built-in profile */
if (!rvu->kpu.custom)
@@ -1677,26 +1761,166 @@ static void npc_prepare_default_kpu(struct rvu *rvu,
npc_cn20k_update_action_entries_n_flags(rvu, profile);
}
-static int npc_apply_custom_kpu(struct rvu *rvu,
- struct npc_kpu_profile_adapter *profile)
+static int npc_alloc_kpu_cam2_n_action2(struct rvu *rvu, int kpu_num,
+ int num_entries)
+{
+ struct npc_kpu_profile_adapter *adapter = &rvu->kpu;
+ struct npc_kpu_profile *kpu;
+
+ kpu = &adapter->kpu[kpu_num];
+
+ kpu->cam2 = devm_kcalloc(rvu->dev, num_entries,
+ sizeof(*kpu->cam2), GFP_KERNEL);
+ if (!kpu->cam2)
+ return -ENOMEM;
+
+ kpu->action2 = devm_kcalloc(rvu->dev, num_entries,
+ sizeof(*kpu->action2), GFP_KERNEL);
+ if (!kpu->action2)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static int npc_apply_custom_kpu_from_fw(struct rvu *rvu,
+ struct npc_kpu_profile_adapter *profile)
{
size_t hdr_sz = sizeof(struct npc_kpu_profile_fwdata), offset = 0;
+ const struct npc_kpu_profile_fwdata *fw;
struct npc_kpu_profile_action *action;
- struct npc_kpu_profile_fwdata *fw;
struct npc_kpu_profile_cam *cam;
struct npc_kpu_fwdata *fw_kpu;
- int entries;
- u16 kpu, entry;
+ int entries, entry, kpu;
- if (is_cn20k(rvu->pdev))
- return npc_cn20k_apply_custom_kpu(rvu, profile);
+ fw = rvu->kpu_fwdata;
+
+ /* Check if there is enough memory */
+ if (fw->kpus > profile->kpus) {
+ dev_warn(rvu->dev, "Not enough KPUs: %d > %zu\n", fw->kpus,
+ profile->kpus);
+ return -EINVAL;
+ }
+
+ for (kpu = 0; kpu < fw->kpus; kpu++) {
+ if (rvu->kpu_fwdata_sz < hdr_sz + offset) {
+ dev_warn(rvu->dev,
+ "Profile size mismatch on KPU%i parsing\n",
+ kpu + 1);
+ return -EINVAL;
+ }
+
+ fw_kpu = (struct npc_kpu_fwdata *)(fw->data + offset);
+ if (fw_kpu->entries > KPU_MAX_CST_ENT)
+ dev_warn(rvu->dev,
+ "Too many custom entries on KPU%d: %d > %d\n",
+ kpu, fw_kpu->entries, KPU_MAX_CST_ENT);
+ entries = min(fw_kpu->entries, KPU_MAX_CST_ENT);
+ cam = (struct npc_kpu_profile_cam *)fw_kpu->data;
+ offset += sizeof(*fw_kpu) + fw_kpu->entries * sizeof(*cam);
+ action = (struct npc_kpu_profile_action *)(fw->data + offset);
+ offset += fw_kpu->entries * sizeof(*action);
+ if (rvu->kpu_fwdata_sz < hdr_sz + offset) {
+ dev_warn(rvu->dev,
+ "Profile size mismatch on KPU%i parsing.\n",
+ kpu + 1);
+ return -EINVAL;
+ }
+ for (entry = 0; entry < entries; entry++) {
+ profile->kpu[kpu].cam[entry] = cam[entry];
+ profile->kpu[kpu].action[entry] = action[entry];
+ }
+ }
+
+ return 0;
+}
+
+static int npc_apply_custom_kpu_from_fs(struct rvu *rvu,
+ struct npc_kpu_profile_adapter *profile)
+{
+ size_t hdr_sz = sizeof(struct npc_kpu_profile_fwdata), offset = 0;
+ const struct npc_kpu_profile_fwdata *fw;
+ struct npc_kpu_profile_action *action;
+ struct npc_kpu_profile_cam2 *cam2;
+ struct npc_kpu_fwdata *fw_kpu;
+ int entries, ret, entry, kpu;
fw = rvu->kpu_fwdata;
+ /* Binary blob contains ikpu actions entries at start of data[0] */
+ profile->ikpu2 = devm_kcalloc(rvu->dev, 1,
+ sizeof(ikpu_action_entries),
+ GFP_KERNEL);
+ if (!profile->ikpu2)
+ return -ENOMEM;
+
+ action = (struct npc_kpu_profile_action *)(fw->data + offset);
+
+ if (rvu->kpu_fwdata_sz < hdr_sz + sizeof(ikpu_action_entries))
+ return -ENOMEM;
+
+ memcpy((void *)profile->ikpu2, action, sizeof(ikpu_action_entries));
+ offset += sizeof(ikpu_action_entries);
+
+ for (kpu = 0; kpu < fw->kpus; kpu++) {
+ if (rvu->kpu_fwdata_sz < hdr_sz + offset) {
+ dev_warn(rvu->dev,
+ "profile size mismatch on kpu%i parsing\n",
+ kpu + 1);
+ return -EINVAL;
+ }
+
+ fw_kpu = (struct npc_kpu_fwdata *)(fw->data + offset);
+ entries = min(fw_kpu->entries, rvu->hw->npc_kpu_entries);
+ dev_info(rvu->dev,
+ "Loading %u entries on KPU%d\n", entries, kpu);
+
+ cam2 = (struct npc_kpu_profile_cam2 *)fw_kpu->data;
+ offset += sizeof(*fw_kpu) + fw_kpu->entries * sizeof(*cam2);
+ action = (struct npc_kpu_profile_action *)(fw->data + offset);
+ offset += fw_kpu->entries * sizeof(*action);
+ if (rvu->kpu_fwdata_sz < hdr_sz + offset) {
+ dev_warn(rvu->dev,
+ "profile size mismatch on kpu%i parsing.\n",
+ kpu + 1);
+ return -EINVAL;
+ }
+
+ profile->kpu[kpu].cam_entries2 = entries;
+ profile->kpu[kpu].action_entries2 = entries;
+ ret = npc_alloc_kpu_cam2_n_action2(rvu, kpu, entries);
+ if (ret) {
+ dev_warn(rvu->dev,
+ "profile entry allocation failed for kpu=%d for %d entries\n",
+ kpu, entries);
+ return -EINVAL;
+ }
+
+ for (entry = 0; entry < entries; entry++) {
+ profile->kpu[kpu].cam2[entry] = cam2[entry];
+ profile->kpu[kpu].action2[entry] = action[entry];
+ }
+ }
+
+ return 0;
+}
+
+static int npc_apply_custom_kpu(struct rvu *rvu,
+ struct npc_kpu_profile_adapter *profile,
+ bool from_fs, int *fw_kpus)
+{
+ size_t hdr_sz = sizeof(struct npc_kpu_profile_fwdata);
+ const struct npc_kpu_profile_fwdata *fw;
+ struct npc_kpu_profile_fwdata *sfw;
+
+ if (is_cn20k(rvu->pdev))
+ return npc_cn20k_apply_custom_kpu(rvu, profile);
+
if (rvu->kpu_fwdata_sz < hdr_sz) {
dev_warn(rvu->dev, "Invalid KPU profile size\n");
return -EINVAL;
}
+
+ fw = rvu->kpu_fwdata;
if (le64_to_cpu(fw->signature) != KPU_SIGN) {
dev_warn(rvu->dev, "Invalid KPU profile signature %llx\n",
fw->signature);
@@ -1724,42 +1948,28 @@ static int npc_apply_custom_kpu(struct rvu *rvu,
return -EINVAL;
}
/* Verify if profile fits the HW */
- if (fw->kpus > profile->kpus) {
- dev_warn(rvu->dev, "Not enough KPUs: %d > %ld\n", fw->kpus,
- profile->kpus);
+ if (fw->kpus > rvu->hw->npc_kpus) {
+ dev_warn(rvu->dev, "Not enough KPUs: %d > %d\n", fw->kpus,
+ rvu->hw->npc_kpus);
return -EINVAL;
}
+ *fw_kpus = fw->kpus;
+
+ sfw = devm_kcalloc(rvu->dev, 1, sizeof(*sfw), GFP_KERNEL);
+ if (!sfw)
+ return -ENOMEM;
+
+ memcpy(sfw, fw, sizeof(*sfw));
+
profile->custom = 1;
- profile->name = fw->name;
+ profile->name = sfw->name;
profile->version = le64_to_cpu(fw->version);
- profile->mcam_kex_prfl.mkex = &fw->mkex;
- profile->lt_def = &fw->lt_def;
+ profile->mcam_kex_prfl.mkex = &sfw->mkex;
+ profile->lt_def = &sfw->lt_def;
- for (kpu = 0; kpu < fw->kpus; kpu++) {
- fw_kpu = (struct npc_kpu_fwdata *)(fw->data + offset);
- if (fw_kpu->entries > KPU_MAX_CST_ENT)
- dev_warn(rvu->dev,
- "Too many custom entries on KPU%d: %d > %d\n",
- kpu, fw_kpu->entries, KPU_MAX_CST_ENT);
- entries = min(fw_kpu->entries, KPU_MAX_CST_ENT);
- cam = (struct npc_kpu_profile_cam *)fw_kpu->data;
- offset += sizeof(*fw_kpu) + fw_kpu->entries * sizeof(*cam);
- action = (struct npc_kpu_profile_action *)(fw->data + offset);
- offset += fw_kpu->entries * sizeof(*action);
- if (rvu->kpu_fwdata_sz < hdr_sz + offset) {
- dev_warn(rvu->dev,
- "Profile size mismatch on KPU%i parsing.\n",
- kpu + 1);
- return -EINVAL;
- }
- for (entry = 0; entry < entries; entry++) {
- profile->kpu[kpu].cam[entry] = cam[entry];
- profile->kpu[kpu].action[entry] = action[entry];
- }
- }
-
- return 0;
+ return from_fs ? npc_apply_custom_kpu_from_fs(rvu, profile) :
+ npc_apply_custom_kpu_from_fw(rvu, profile);
}
static int npc_load_kpu_prfl_img(struct rvu *rvu, void __iomem *prfl_addr,
@@ -1847,45 +2057,19 @@ static int npc_load_kpu_profile_fwdb(struct rvu *rvu, const char *kpu_profile)
return ret;
}
-void npc_load_kpu_profile(struct rvu *rvu)
+static int npc_load_kpu_profile_from_fw(struct rvu *rvu)
{
struct npc_kpu_profile_adapter *profile = &rvu->kpu;
const char *kpu_profile = rvu->kpu_pfl_name;
- const struct firmware *fw = NULL;
- bool retry_fwdb = false;
-
- /* If user not specified profile customization */
- if (!strncmp(kpu_profile, def_pfl_name, KPU_NAME_LEN))
- goto revert_to_default;
- /* First prepare default KPU, then we'll customize top entries. */
- npc_prepare_default_kpu(rvu, profile);
-
- /* Order of preceedence for load loading NPC profile (high to low)
- * Firmware binary in filesystem.
- * Firmware database method.
- * Default KPU profile.
- */
- if (!request_firmware_direct(&fw, kpu_profile, rvu->dev)) {
- dev_info(rvu->dev, "Loading KPU profile from firmware: %s\n",
- kpu_profile);
- rvu->kpu_fwdata = kzalloc(fw->size, GFP_KERNEL);
- if (rvu->kpu_fwdata) {
- memcpy(rvu->kpu_fwdata, fw->data, fw->size);
- rvu->kpu_fwdata_sz = fw->size;
- }
- release_firmware(fw);
- retry_fwdb = true;
- goto program_kpu;
- }
+ int fw_kpus = 0;
-load_image_fwdb:
/* Loading the KPU profile using firmware database */
if (npc_load_kpu_profile_fwdb(rvu, kpu_profile))
- goto revert_to_default;
+ return -EFAULT;
-program_kpu:
/* Apply profile customization if firmware was loaded. */
- if (!rvu->kpu_fwdata_sz || npc_apply_custom_kpu(rvu, profile)) {
+ if (!rvu->kpu_fwdata_sz ||
+ npc_apply_custom_kpu(rvu, profile, false, &fw_kpus)) {
/* If image from firmware filesystem fails to load or invalid
* retry with firmware database method.
*/
@@ -1899,10 +2083,6 @@ void npc_load_kpu_profile(struct rvu *rvu)
}
rvu->kpu_fwdata = NULL;
rvu->kpu_fwdata_sz = 0;
- if (retry_fwdb) {
- retry_fwdb = false;
- goto load_image_fwdb;
- }
}
dev_warn(rvu->dev,
@@ -1910,7 +2090,7 @@ void npc_load_kpu_profile(struct rvu *rvu)
kpu_profile);
kfree(rvu->kpu_fwdata);
rvu->kpu_fwdata = NULL;
- goto revert_to_default;
+ return -EFAULT;
}
dev_info(rvu->dev, "Using custom profile '%s', version %d.%d.%d\n",
@@ -1918,14 +2098,90 @@ void npc_load_kpu_profile(struct rvu *rvu)
NPC_KPU_VER_MIN(profile->version),
NPC_KPU_VER_PATCH(profile->version));
- return;
+ return 0;
+}
+
+static int npc_load_kpu_profile_from_fs(struct rvu *rvu)
+{
+ struct npc_kpu_profile_adapter *profile = &rvu->kpu;
+ const char *kpu_profile = rvu->kpu_pfl_name;
+ const struct firmware *fw = NULL;
+ int ret, fw_kpus = 0;
+ char path[512] = "kpu/";
+
+ if (strlen(kpu_profile) > sizeof(path) - strlen("kpu/") - 1) {
+ dev_err(rvu->dev, "kpu profile name is too big\n");
+ return -ENOSPC;
+ }
+
+ strcat(path, kpu_profile);
+
+ if (request_firmware_direct(&fw, path, rvu->dev))
+ return -ENOENT;
+
+ dev_info(rvu->dev, "Loading KPU profile from filesystem: %s\n",
+ path);
+
+ rvu->kpu_fwdata = fw->data;
+ rvu->kpu_fwdata_sz = fw->size;
+
+ ret = npc_apply_custom_kpu(rvu, profile, true, &fw_kpus);
+ release_firmware(fw);
+ rvu->kpu_fwdata = NULL;
+
+ if (ret) {
+ rvu->kpu_fwdata_sz = 0;
+ dev_err(rvu->dev,
+ "Loading KPU profile from filesystem failed\n");
+ return ret;
+ }
+
+ rvu->kpu.kpus = fw_kpus;
+ profile->kpus = fw_kpus;
+ profile->from_fs = true;
+ return 0;
+}
+
+void npc_load_kpu_profile(struct rvu *rvu)
+{
+ struct npc_kpu_profile_adapter *profile = &rvu->kpu;
+ const char *kpu_profile = rvu->kpu_pfl_name;
+
+ profile->from_fs = false;
+
+ npc_prepare_default_kpu(rvu, profile);
+
+ /* If user not specified profile customization */
+ if (!strncmp(kpu_profile, def_pfl_name, KPU_NAME_LEN))
+ return;
+
+ /* Order of preceedence for load loading NPC profile (high to low)
+ * Firmware binary in filesystem.
+ * Firmware database method.
+ * Default KPU profile.
+ */
+
+ /* Filesystem-based KPU loading is not supported on cn20k.
+ * npc_prepare_default_kpu() was invoked earlier, but control
+ * reached this point because the default profile was not selected.
+ * No need to call it again.
+ */
+ if (!is_cn20k(rvu->pdev)) {
+ if (!npc_load_kpu_profile_from_fs(rvu))
+ return;
+ }
+
+ /* First prepare default KPU, then we'll customize top entries. */
+ npc_prepare_default_kpu(rvu, profile);
+ if (!npc_load_kpu_profile_from_fw(rvu))
+ return;
-revert_to_default:
npc_prepare_default_kpu(rvu, profile);
}
static void npc_parser_profile_init(struct rvu *rvu, int blkaddr)
{
+ struct npc_kpu_profile_adapter *profile = &rvu->kpu;
struct rvu_hwinfo *hw = rvu->hw;
int num_pkinds, num_kpus, idx;
@@ -1949,7 +2205,9 @@ static void npc_parser_profile_init(struct rvu *rvu, int blkaddr)
num_pkinds = min_t(int, hw->npc_pkinds, num_pkinds);
for (idx = 0; idx < num_pkinds; idx++)
- npc_config_kpuaction(rvu, blkaddr, &rvu->kpu.ikpu[idx], 0, idx, true);
+ npc_config_kpuaction(rvu, blkaddr,
+ npc_get_ikpu_nth_entry(rvu, idx),
+ 0, idx, true);
/* Program KPU CAM and Action profiles */
num_kpus = rvu->kpu.kpus;
@@ -1957,6 +2215,11 @@ static void npc_parser_profile_init(struct rvu *rvu, int blkaddr)
for (idx = 0; idx < num_kpus; idx++)
npc_program_kpu_profile(rvu, blkaddr, idx, &rvu->kpu.kpu[idx]);
+
+ if (profile->from_fs) {
+ rvu_write64(rvu, blkaddr, NPC_AF_PKINDX_TYPE(54), 0x03);
+ rvu_write64(rvu, blkaddr, NPC_AF_PKINDX_TYPE(58), 0x03);
+ }
}
void npc_mcam_rsrcs_deinit(struct rvu *rvu)
@@ -2186,18 +2449,21 @@ static void rvu_npc_hw_init(struct rvu *rvu, int blkaddr)
static void rvu_npc_setup_interfaces(struct rvu *rvu, int blkaddr)
{
- struct npc_mcam_kex_extr *mkex_extr = rvu->kpu.mcam_kex_prfl.mkex_extr;
- struct npc_mcam_kex *mkex = rvu->kpu.mcam_kex_prfl.mkex;
+ const struct npc_mcam_kex_extr *mkex_extr;
struct npc_mcam *mcam = &rvu->hw->mcam;
struct rvu_hwinfo *hw = rvu->hw;
+ const struct npc_mcam_kex *mkex;
u64 nibble_ena, rx_kex, tx_kex;
u64 *keyx_cfg, reg;
u8 intf;
+ mkex_extr = rvu->kpu.mcam_kex_prfl.mkex_extr;
+ mkex = rvu->kpu.mcam_kex_prfl.mkex;
+
if (is_cn20k(rvu->pdev)) {
- keyx_cfg = mkex_extr->keyx_cfg;
+ keyx_cfg = (u64 *)mkex_extr->keyx_cfg;
} else {
- keyx_cfg = mkex->keyx_cfg;
+ keyx_cfg = (u64 *)mkex->keyx_cfg;
/* Reserve last counter for MCAM RX miss action which is set to
* drop packet. This way we will know how many pkts didn't
* match any MCAM entry.
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.h b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.h
index 83c5e32e2afc..662f6693cfe9 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.h
@@ -18,4 +18,21 @@ int npc_fwdb_prfl_img_map(struct rvu *rvu, void __iomem **prfl_img_addr,
void npc_mcam_clear_bit(struct npc_mcam *mcam, u16 index);
void npc_mcam_set_bit(struct npc_mcam *mcam, u16 index);
+
+struct npc_kpu_profile_action *
+npc_get_ikpu_nth_entry(struct rvu *rvu, int n);
+
+int
+npc_get_num_kpu_cam_entries(struct rvu *rvu,
+ const struct npc_kpu_profile *kpu_pfl);
+struct npc_kpu_profile_cam *
+npc_get_kpu_cam_nth_entry(struct rvu *rvu,
+ const struct npc_kpu_profile *kpu_pfl, int n);
+
+int
+npc_get_num_kpu_action_entries(struct rvu *rvu,
+ const struct npc_kpu_profile *kpu_pfl);
+struct npc_kpu_profile_action *
+npc_get_kpu_action_nth_entry(struct rvu *rvu,
+ const struct npc_kpu_profile *kpu_pfl, int n);
#endif /* RVU_NPC_H */
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_reg.h b/drivers/net/ethernet/marvell/octeontx2/af/rvu_reg.h
index 62cdc714ba57..ab89b8c6e490 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_reg.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_reg.h
@@ -596,6 +596,7 @@
#define NPC_AF_INTFX_KEX_CFG(a) (0x01010 | (a) << 8)
#define NPC_AF_PKINDX_ACTION0(a) (0x80000ull | (a) << 6)
#define NPC_AF_PKINDX_ACTION1(a) (0x80008ull | (a) << 6)
+#define NPC_AF_PKINDX_TYPE(a) (0x80010ull | (a) << 6)
#define NPC_AF_PKINDX_CPI_DEFX(a, b) (0x80020ull | (a) << 6 | (b) << 3)
#define NPC_AF_KPUX_ENTRYX_CAMX(a, b, c) \
(0x100000 | (a) << 14 | (b) << 6 | (c) << 3)
--
2.43.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v10 net-next 3/6] devlink: Implement devlink param multi attribute nested data values
2026-04-03 2:55 ` [PATCH v10 net-next 3/6] devlink: Implement devlink param multi attribute nested data values Ratheesh Kannoth
@ 2026-04-07 9:58 ` Paolo Abeni
2026-04-08 10:41 ` Ratheesh Kannoth
0 siblings, 1 reply; 9+ messages in thread
From: Paolo Abeni @ 2026-04-07 9:58 UTC (permalink / raw)
To: Ratheesh Kannoth, netdev, linux-kernel, linux-rdma
Cc: sgoutham, andrew+netdev, davem, edumazet, kuba, donald.hunter,
horms, jiri, chuck.lever, matttbe, cjubran, saeedm, leon, tariqt,
mbloch, dtatulea
On 4/3/26 4:55 AM, Ratheesh Kannoth wrote:
> From: Saeed Mahameed <saeedm@nvidia.com>
>
> Devlink param value attribute is not defined since devlink is handling
> the value validating and parsing internally, this allows us to implement
> multi attribute values without breaking any policies.
>
> Devlink param multi-attribute values are considered to be dynamically
> sized arrays of u64 values, by introducing a new devlink param type
> DEVLINK_PARAM_TYPE_U64_ARRAY, driver and user space can set a variable
> count of u32 values into the DEVLINK_ATTR_PARAM_VALUE_DATA attribute.
>
> Implement get/set parsing and add to the internal value structure passed
> to drivers.
>
> This is useful for devices that need to configure a list of values for
> a specific configuration.
>
> example:
> $ devlink dev param show pci/... name multi-value-param
> name multi-value-param type driver-specific
> values:
> cmode permanent value: 0,1,2,3,4,5,6,7
>
> $ devlink dev param set pci/... name multi-value-param \
> value 4,5,6,7,0,1,2,3 cmode permanent
>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
> ---
> Documentation/netlink/specs/devlink.yaml | 4 ++
> include/net/devlink.h | 8 +++
> include/uapi/linux/devlink.h | 1 +
> net/devlink/netlink_gen.c | 2 +
> net/devlink/param.c | 91 +++++++++++++++++++-----
> 5 files changed, 89 insertions(+), 17 deletions(-)
>
> diff --git a/Documentation/netlink/specs/devlink.yaml b/Documentation/netlink/specs/devlink.yaml
> index b495d56b9137..b619de4fe08a 100644
> --- a/Documentation/netlink/specs/devlink.yaml
> +++ b/Documentation/netlink/specs/devlink.yaml
> @@ -226,6 +226,10 @@ definitions:
> value: 10
> -
> name: binary
> + -
> + name: u64-array
> + value: 129
> +
> -
> name: rate-tc-index-max
> type: const
> diff --git a/include/net/devlink.h b/include/net/devlink.h
> index 3038af6ec017..3a355fea8189 100644
> --- a/include/net/devlink.h
> +++ b/include/net/devlink.h
> @@ -432,6 +432,13 @@ enum devlink_param_type {
> DEVLINK_PARAM_TYPE_U64 = DEVLINK_VAR_ATTR_TYPE_U64,
> DEVLINK_PARAM_TYPE_STRING = DEVLINK_VAR_ATTR_TYPE_STRING,
> DEVLINK_PARAM_TYPE_BOOL = DEVLINK_VAR_ATTR_TYPE_FLAG,
> + DEVLINK_PARAM_TYPE_U64_ARRAY = DEVLINK_VAR_ATTR_TYPE_U64_ARRAY,
> +};
> +
> +#define __DEVLINK_PARAM_MAX_ARRAY_SIZE 32
> +struct devlink_param_u64_array {
> + u64 size;
> + u64 val[__DEVLINK_PARAM_MAX_ARRAY_SIZE];
> };
>
> union devlink_param_value {
> @@ -441,6 +448,7 @@ union devlink_param_value {
> u64 vu64;
> char vstr[__DEVLINK_PARAM_MAX_STRING_VALUE];
> bool vbool;
> + struct devlink_param_u64_array u64arr;
Sashiko as a couple of relevant remarks here, specifically:
---
Does this increase the size of union devlink_param_value from 32 bytes
to over 264 bytes?
Looking at existing functions like devlink_nl_param_value_fill_one() and
devlink_nl_param_value_put(), they take multiple copies of this union by
value. Passing two of these unions by value consumes over 528 bytes of
stack space, and combined in a call chain this pushes nearly 800 bytes
of arguments onto the stack.
Could this create a risk of hitting CONFIG_FRAME_WARN limits deep in
driver notification contexts? Should the signatures of the internal
functions and exported APIs be updated to pass the unions by pointer
instead?
---
> };
>
> struct devlink_param_gset_ctx {
> diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
> index 7de2d8cc862f..5332223dd6d0 100644
> --- a/include/uapi/linux/devlink.h
> +++ b/include/uapi/linux/devlink.h
> @@ -406,6 +406,7 @@ enum devlink_var_attr_type {
> DEVLINK_VAR_ATTR_TYPE_BINARY,
> __DEVLINK_VAR_ATTR_TYPE_CUSTOM_BASE = 0x80,
> /* Any possible custom types, unrelated to NLA_* values go below */
> + DEVLINK_VAR_ATTR_TYPE_U64_ARRAY,
> };
>
> enum devlink_attr {
> diff --git a/net/devlink/netlink_gen.c b/net/devlink/netlink_gen.c
> index eb35e80e01d1..7aaf462f27ee 100644
> --- a/net/devlink/netlink_gen.c
> +++ b/net/devlink/netlink_gen.c
> @@ -37,6 +37,8 @@ devlink_attr_param_type_validate(const struct nlattr *attr,
> case DEVLINK_VAR_ATTR_TYPE_NUL_STRING:
> fallthrough;
> case DEVLINK_VAR_ATTR_TYPE_BINARY:
> + fallthrough;
> + case DEVLINK_VAR_ATTR_TYPE_U64_ARRAY:
> return 0;
> }
> NL_SET_ERR_MSG_ATTR(extack, attr, "invalid enum value");
> diff --git a/net/devlink/param.c b/net/devlink/param.c
> index cf95268da5b0..2ec85dffd8ac 100644
> --- a/net/devlink/param.c
> +++ b/net/devlink/param.c
> @@ -252,6 +252,14 @@ devlink_nl_param_value_put(struct sk_buff *msg, enum devlink_param_type type,
> return -EMSGSIZE;
> }
> break;
> + case DEVLINK_PARAM_TYPE_U64_ARRAY:
> + if (val.u64arr.size > __DEVLINK_PARAM_MAX_ARRAY_SIZE)
> + return -EMSGSIZE;
> +
> + for (int i = 0; i < val.u64arr.size; i++)
> + if (nla_put_uint(msg, nla_type, val.u64arr.val[i]))
> + return -EMSGSIZE;
> + break;
> }
> return 0;
> }
> @@ -304,56 +312,78 @@ static int devlink_nl_param_fill(struct sk_buff *msg, struct devlink *devlink,
> u32 portid, u32 seq, int flags,
> struct netlink_ext_ack *extack)
> {
> - union devlink_param_value default_value[DEVLINK_PARAM_CMODE_MAX + 1];
> - union devlink_param_value param_value[DEVLINK_PARAM_CMODE_MAX + 1];
> bool default_value_set[DEVLINK_PARAM_CMODE_MAX + 1] = {};
> bool param_value_set[DEVLINK_PARAM_CMODE_MAX + 1] = {};
> const struct devlink_param *param = param_item->param;
> - struct devlink_param_gset_ctx ctx;
> + union devlink_param_value *default_value;
> + union devlink_param_value *param_value;
> + struct devlink_param_gset_ctx *ctx;
> struct nlattr *param_values_list;
> struct nlattr *param_attr;
> void *hdr;
> int err;
> int i;
>
> + default_value = kcalloc(DEVLINK_PARAM_CMODE_MAX + 1,
> + sizeof(*default_value), GFP_KERNEL);
> + if (!default_value)
> + return -ENOMEM;
> +
> + param_value = kcalloc(DEVLINK_PARAM_CMODE_MAX + 1,
> + sizeof(*param_value), GFP_KERNEL);
> + if (!param_value) {
> + kfree(default_value);
> + return -ENOMEM;
> + }
> +
> + ctx = kmalloc_obj(*ctx);
> + if (!ctx) {
> + kfree(param_value);
> + kfree(default_value);
> + return -ENOMEM;
> + }
> +
> /* Get value from driver part to driverinit configuration mode */
> for (i = 0; i <= DEVLINK_PARAM_CMODE_MAX; i++) {
> if (!devlink_param_cmode_is_supported(param, i))
> continue;
> if (i == DEVLINK_PARAM_CMODE_DRIVERINIT) {
> - if (param_item->driverinit_value_new_valid)
> + if (param_item->driverinit_value_new_valid) {
> param_value[i] = param_item->driverinit_value_new;
> - else if (param_item->driverinit_value_valid)
> + } else if (param_item->driverinit_value_valid) {
> param_value[i] = param_item->driverinit_value;
> - else
> - return -EOPNOTSUPP;
> + } else {
> + err = -EOPNOTSUPP;
> + goto get_put_fail;
> + }
>
> if (param_item->driverinit_value_valid) {
> default_value[i] = param_item->driverinit_default;
> default_value_set[i] = true;
> }
> } else {
> - ctx.cmode = i;
> - err = devlink_param_get(devlink, param, &ctx, extack);
> + ctx->cmode = i;
> + err = devlink_param_get(devlink, param, ctx, extack);
> if (err)
> - return err;
> - param_value[i] = ctx.val;
> + goto get_put_fail;
> + param_value[i] = ctx->val;
>
> - err = devlink_param_get_default(devlink, param, &ctx,
> + err = devlink_param_get_default(devlink, param, ctx,
> extack);
> if (!err) {
> - default_value[i] = ctx.val;
> + default_value[i] = ctx->val;
> default_value_set[i] = true;
> } else if (err != -EOPNOTSUPP) {
> - return err;
> + goto get_put_fail;
> }
> }
> param_value_set[i] = true;
> }
>
> + err = -EMSGSIZE;
> hdr = genlmsg_put(msg, portid, seq, &devlink_nl_family, flags, cmd);
> if (!hdr)
> - return -EMSGSIZE;
> + goto get_put_fail;
>
> if (devlink_nl_put_handle(msg, devlink))
> goto genlmsg_cancel;
> @@ -393,6 +423,9 @@ static int devlink_nl_param_fill(struct sk_buff *msg, struct devlink *devlink,
> nla_nest_end(msg, param_values_list);
> nla_nest_end(msg, param_attr);
> genlmsg_end(msg, hdr);
> + kfree(default_value);
> + kfree(param_value);
> + kfree(ctx);
> return 0;
>
> values_list_nest_cancel:
> @@ -401,7 +434,11 @@ static int devlink_nl_param_fill(struct sk_buff *msg, struct devlink *devlink,
> nla_nest_cancel(msg, param_attr);
> genlmsg_cancel:
> genlmsg_cancel(msg, hdr);
> - return -EMSGSIZE;
> +get_put_fail:
> + kfree(default_value);
> + kfree(param_value);
> + kfree(ctx);
> + return err;
> }
>
> static void devlink_param_notify(struct devlink *devlink,
> @@ -507,7 +544,7 @@ devlink_param_value_get_from_info(const struct devlink_param *param,
> union devlink_param_value *value)
> {
> struct nlattr *param_data;
> - int len;
> + int len, cnt, rem;
>
> param_data = info->attrs[DEVLINK_ATTR_PARAM_VALUE_DATA];
>
> @@ -547,6 +584,26 @@ devlink_param_value_get_from_info(const struct devlink_param *param,
> return -EINVAL;
> value->vbool = nla_get_flag(param_data);
> break;
> +
> + case DEVLINK_PARAM_TYPE_U64_ARRAY:
> + cnt = 0;
> + nla_for_each_attr_type(param_data,
> + DEVLINK_ATTR_PARAM_VALUE_DATA,
> + genlmsg_data(info->genlhdr),
> + genlmsg_len(info->genlhdr), rem) {
> + if (cnt >= __DEVLINK_PARAM_MAX_ARRAY_SIZE)
> + return -EMSGSIZE;
> +
> + if ((nla_len(param_data) != sizeof(u64)) &&
> + (nla_len(param_data) != sizeof(u32)))
> + return -EINVAL;
> +
> + value->u64arr.val[cnt] = (u64)nla_get_uint(param_data);
> + cnt++;
> + }
> +
> + value->u64arr.size = cnt;
> + break;
Sashiko says:
---
Does this make it impossible to set an empty array to clear a
multi-value parameter?
If userspace provides 0 elements, param_data will be NULL. Earlier in
devlink_param_value_get_from_info(), there is a check:
param_data = info->attrs[DEVLINK_ATTR_PARAM_VALUE_DATA];
if (param->type != DEVLINK_PARAM_TYPE_BOOL && !param_data)
return -EINVAL;
If the parameter is a U64_ARRAY and no data is provided, this check will
immediately return -EINVAL.
The kernel can successfully emit an empty array on a GET request if the
size is 0. Should the SET path similarly support receiving 0 elements to
allow userspace to clear a multi-value parameter?
---
There are several others NIC-specific remarks, which IMHO are mostly
pre-existing issues, but please have a look:
https://sashiko.dev/#/patchset/20260403025533.6250-1-rkannoth%40marvell.com
/P
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v10 net-next 3/6] devlink: Implement devlink param multi attribute nested data values
2026-04-07 9:58 ` Paolo Abeni
@ 2026-04-08 10:41 ` Ratheesh Kannoth
0 siblings, 0 replies; 9+ messages in thread
From: Ratheesh Kannoth @ 2026-04-08 10:41 UTC (permalink / raw)
To: Paolo Abeni
Cc: netdev, linux-kernel, linux-rdma, sgoutham, andrew+netdev, davem,
edumazet, kuba, donald.hunter, horms, jiri, chuck.lever, matttbe,
cjubran, saeedm, leon, tariqt, mbloch, dtatulea
On 2026-04-07 at 15:28:09, Paolo Abeni (pabeni@redhat.com) wrote:
> On 4/3/26 4:55 AM, Ratheesh Kannoth wrote:
> > From: Saeed Mahameed <saeedm@nvidia.com>
> >
> > Devlink param value attribute is not defined since devlink is handling
> > the value validating and parsing internally, this allows us to implement
> > multi attribute values without breaking any policies.
> >
> > Devlink param multi-attribute values are considered to be dynamically
> > sized arrays of u64 values, by introducing a new devlink param type
> > DEVLINK_PARAM_TYPE_U64_ARRAY, driver and user space can set a variable
> > count of u32 values into the DEVLINK_ATTR_PARAM_VALUE_DATA attribute.
> >
> > Implement get/set parsing and add to the internal value structure passed
> > to drivers.
> >
> > This is useful for devices that need to configure a list of values for
> > a specific configuration.
> >
> > example:
> > $ devlink dev param show pci/... name multi-value-param
> > name multi-value-param type driver-specific
> > values:
> > cmode permanent value: 0,1,2,3,4,5,6,7
> >
> > $ devlink dev param set pci/... name multi-value-param \
> > value 4,5,6,7,0,1,2,3 cmode permanent
> >
> > Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> > Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
> > ---
> > Documentation/netlink/specs/devlink.yaml | 4 ++
> > include/net/devlink.h | 8 +++
> > include/uapi/linux/devlink.h | 1 +
> > net/devlink/netlink_gen.c | 2 +
> > net/devlink/param.c | 91 +++++++++++++++++++-----
> > 5 files changed, 89 insertions(+), 17 deletions(-)
> >
> > diff --git a/Documentation/netlink/specs/devlink.yaml b/Documentation/netlink/specs/devlink.yaml
> > index b495d56b9137..b619de4fe08a 100644
> > --- a/Documentation/netlink/specs/devlink.yaml
> > +++ b/Documentation/netlink/specs/devlink.yaml
> > @@ -226,6 +226,10 @@ definitions:
> > value: 10
> > -
> > name: binary
> > + -
> > + name: u64-array
> > + value: 129
> > +
> > -
> > name: rate-tc-index-max
> > type: const
> > diff --git a/include/net/devlink.h b/include/net/devlink.h
> > index 3038af6ec017..3a355fea8189 100644
> > --- a/include/net/devlink.h
> > +++ b/include/net/devlink.h
> > @@ -432,6 +432,13 @@ enum devlink_param_type {
> > DEVLINK_PARAM_TYPE_U64 = DEVLINK_VAR_ATTR_TYPE_U64,
> > DEVLINK_PARAM_TYPE_STRING = DEVLINK_VAR_ATTR_TYPE_STRING,
> > DEVLINK_PARAM_TYPE_BOOL = DEVLINK_VAR_ATTR_TYPE_FLAG,
> > + DEVLINK_PARAM_TYPE_U64_ARRAY = DEVLINK_VAR_ATTR_TYPE_U64_ARRAY,
> > +};
> > +
> > +#define __DEVLINK_PARAM_MAX_ARRAY_SIZE 32
> > +struct devlink_param_u64_array {
> > + u64 size;
> > + u64 val[__DEVLINK_PARAM_MAX_ARRAY_SIZE];
> > };
> >
> > union devlink_param_value {
> > @@ -441,6 +448,7 @@ union devlink_param_value {
> > u64 vu64;
> > char vstr[__DEVLINK_PARAM_MAX_STRING_VALUE];
> > bool vbool;
> > + struct devlink_param_u64_array u64arr;
>
> Sashiko as a couple of relevant remarks here, specifically:
>
> ---
> Does this increase the size of union devlink_param_value from 32 bytes
> to over 264 bytes?
> Looking at existing functions like devlink_nl_param_value_fill_one() and
> devlink_nl_param_value_put(), they take multiple copies of this union by
> value. Passing two of these unions by value consumes over 528 bytes of
> stack space, and combined in a call chain this pushes nearly 800 bytes
> of arguments onto the stack.
> Could this create a risk of hitting CONFIG_FRAME_WARN limits deep in
> driver notification contexts? Should the signatures of the internal
> functions and exported APIs be updated to pass the unions by pointer
> instead?
ACK. will make as seperate patch.
> ---
>
> > };
> >
> > struct devlink_param_gset_ctx {
> > diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
> > index 7de2d8cc862f..5332223dd6d0 100644
> > --- a/include/uapi/linux/devlink.h
> > +++ b/include/uapi/linux/devlink.h
> > @@ -406,6 +406,7 @@ enum devlink_var_attr_type {
> > DEVLINK_VAR_ATTR_TYPE_BINARY,
> > __DEVLINK_VAR_ATTR_TYPE_CUSTOM_BASE = 0x80,
> > /* Any possible custom types, unrelated to NLA_* values go below */
> > + DEVLINK_VAR_ATTR_TYPE_U64_ARRAY,
> > };
> >
> > enum devlink_attr {
> > diff --git a/net/devlink/netlink_gen.c b/net/devlink/netlink_gen.c
> > index eb35e80e01d1..7aaf462f27ee 100644
> > --- a/net/devlink/netlink_gen.c
> > +++ b/net/devlink/netlink_gen.c
> > @@ -37,6 +37,8 @@ devlink_attr_param_type_validate(const struct nlattr *attr,
> > case DEVLINK_VAR_ATTR_TYPE_NUL_STRING:
> > fallthrough;
> > case DEVLINK_VAR_ATTR_TYPE_BINARY:
> > + fallthrough;
> > + case DEVLINK_VAR_ATTR_TYPE_U64_ARRAY:
> > return 0;
> > }
> > NL_SET_ERR_MSG_ATTR(extack, attr, "invalid enum value");
> > diff --git a/net/devlink/param.c b/net/devlink/param.c
> > index cf95268da5b0..2ec85dffd8ac 100644
> > --- a/net/devlink/param.c
> > +++ b/net/devlink/param.c
> > @@ -252,6 +252,14 @@ devlink_nl_param_value_put(struct sk_buff *msg, enum devlink_param_type type,
> > return -EMSGSIZE;
> > }
> > break;
> > + case DEVLINK_PARAM_TYPE_U64_ARRAY:
> > + if (val.u64arr.size > __DEVLINK_PARAM_MAX_ARRAY_SIZE)
> > + return -EMSGSIZE;
> > +
> > + for (int i = 0; i < val.u64arr.size; i++)
> > + if (nla_put_uint(msg, nla_type, val.u64arr.val[i]))
> > + return -EMSGSIZE;
> > + break;
> > }
> > return 0;
> > }
> > @@ -304,56 +312,78 @@ static int devlink_nl_param_fill(struct sk_buff *msg, struct devlink *devlink,
> > u32 portid, u32 seq, int flags,
> > struct netlink_ext_ack *extack)
> > {
> > - union devlink_param_value default_value[DEVLINK_PARAM_CMODE_MAX + 1];
> > - union devlink_param_value param_value[DEVLINK_PARAM_CMODE_MAX + 1];
> > bool default_value_set[DEVLINK_PARAM_CMODE_MAX + 1] = {};
> > bool param_value_set[DEVLINK_PARAM_CMODE_MAX + 1] = {};
> > const struct devlink_param *param = param_item->param;
> > - struct devlink_param_gset_ctx ctx;
> > + union devlink_param_value *default_value;
> > + union devlink_param_value *param_value;
> > + struct devlink_param_gset_ctx *ctx;
> > struct nlattr *param_values_list;
> > struct nlattr *param_attr;
> > void *hdr;
> > int err;
> > int i;
> >
> > + default_value = kcalloc(DEVLINK_PARAM_CMODE_MAX + 1,
> > + sizeof(*default_value), GFP_KERNEL);
> > + if (!default_value)
> > + return -ENOMEM;
> > +
> > + param_value = kcalloc(DEVLINK_PARAM_CMODE_MAX + 1,
> > + sizeof(*param_value), GFP_KERNEL);
> > + if (!param_value) {
> > + kfree(default_value);
> > + return -ENOMEM;
> > + }
> > +
> > + ctx = kmalloc_obj(*ctx);
> > + if (!ctx) {
> > + kfree(param_value);
> > + kfree(default_value);
> > + return -ENOMEM;
> > + }
> > +
> > /* Get value from driver part to driverinit configuration mode */
> > for (i = 0; i <= DEVLINK_PARAM_CMODE_MAX; i++) {
> > if (!devlink_param_cmode_is_supported(param, i))
> > continue;
> > if (i == DEVLINK_PARAM_CMODE_DRIVERINIT) {
> > - if (param_item->driverinit_value_new_valid)
> > + if (param_item->driverinit_value_new_valid) {
> > param_value[i] = param_item->driverinit_value_new;
> > - else if (param_item->driverinit_value_valid)
> > + } else if (param_item->driverinit_value_valid) {
> > param_value[i] = param_item->driverinit_value;
> > - else
> > - return -EOPNOTSUPP;
> > + } else {
> > + err = -EOPNOTSUPP;
> > + goto get_put_fail;
> > + }
> >
> > if (param_item->driverinit_value_valid) {
> > default_value[i] = param_item->driverinit_default;
> > default_value_set[i] = true;
> > }
> > } else {
> > - ctx.cmode = i;
> > - err = devlink_param_get(devlink, param, &ctx, extack);
> > + ctx->cmode = i;
> > + err = devlink_param_get(devlink, param, ctx, extack);
> > if (err)
> > - return err;
> > - param_value[i] = ctx.val;
> > + goto get_put_fail;
> > + param_value[i] = ctx->val;
> >
> > - err = devlink_param_get_default(devlink, param, &ctx,
> > + err = devlink_param_get_default(devlink, param, ctx,
> > extack);
> > if (!err) {
> > - default_value[i] = ctx.val;
> > + default_value[i] = ctx->val;
> > default_value_set[i] = true;
> > } else if (err != -EOPNOTSUPP) {
> > - return err;
> > + goto get_put_fail;
> > }
> > }
> > param_value_set[i] = true;
> > }
> >
> > + err = -EMSGSIZE;
> > hdr = genlmsg_put(msg, portid, seq, &devlink_nl_family, flags, cmd);
> > if (!hdr)
> > - return -EMSGSIZE;
> > + goto get_put_fail;
> >
> > if (devlink_nl_put_handle(msg, devlink))
> > goto genlmsg_cancel;
> > @@ -393,6 +423,9 @@ static int devlink_nl_param_fill(struct sk_buff *msg, struct devlink *devlink,
> > nla_nest_end(msg, param_values_list);
> > nla_nest_end(msg, param_attr);
> > genlmsg_end(msg, hdr);
> > + kfree(default_value);
> > + kfree(param_value);
> > + kfree(ctx);
> > return 0;
> >
> > values_list_nest_cancel:
> > @@ -401,7 +434,11 @@ static int devlink_nl_param_fill(struct sk_buff *msg, struct devlink *devlink,
> > nla_nest_cancel(msg, param_attr);
> > genlmsg_cancel:
> > genlmsg_cancel(msg, hdr);
> > - return -EMSGSIZE;
> > +get_put_fail:
> > + kfree(default_value);
> > + kfree(param_value);
> > + kfree(ctx);
> > + return err;
> > }
> >
> > static void devlink_param_notify(struct devlink *devlink,
> > @@ -507,7 +544,7 @@ devlink_param_value_get_from_info(const struct devlink_param *param,
> > union devlink_param_value *value)
> > {
> > struct nlattr *param_data;
> > - int len;
> > + int len, cnt, rem;
> >
> > param_data = info->attrs[DEVLINK_ATTR_PARAM_VALUE_DATA];
> >
> > @@ -547,6 +584,26 @@ devlink_param_value_get_from_info(const struct devlink_param *param,
> > return -EINVAL;
> > value->vbool = nla_get_flag(param_data);
> > break;
> > +
> > + case DEVLINK_PARAM_TYPE_U64_ARRAY:
> > + cnt = 0;
> > + nla_for_each_attr_type(param_data,
> > + DEVLINK_ATTR_PARAM_VALUE_DATA,
> > + genlmsg_data(info->genlhdr),
> > + genlmsg_len(info->genlhdr), rem) {
> > + if (cnt >= __DEVLINK_PARAM_MAX_ARRAY_SIZE)
> > + return -EMSGSIZE;
> > +
> > + if ((nla_len(param_data) != sizeof(u64)) &&
> > + (nla_len(param_data) != sizeof(u32)))
> > + return -EINVAL;
> > +
> > + value->u64arr.val[cnt] = (u64)nla_get_uint(param_data);
> > + cnt++;
> > + }
> > +
> > + value->u64arr.size = cnt;
> > + break;
>
> Sashiko says:
>
> ---
> Does this make it impossible to set an empty array to clear a
> multi-value parameter?
> If userspace provides 0 elements, param_data will be NULL. Earlier in
> devlink_param_value_get_from_info(), there is a check:
> param_data = info->attrs[DEVLINK_ATTR_PARAM_VALUE_DATA];
> if (param->type != DEVLINK_PARAM_TYPE_BOOL && !param_data)
> return -EINVAL;
> If the parameter is a U64_ARRAY and no data is provided, this check will
> immediately return -EINVAL.
> The kernel can successfully emit an empty array on a GET request if the
> size is 0. Should the SET path similarly support receiving 0 elements to
> allow userspace to clear a multi-value parameter?
multi-value parameter is to set values. if some one wants to set to zero
they can set all values as zero and pass a "size" array. we can ignore this comment, right ?
> ---
>
> There are several others NIC-specific remarks, which IMHO are mostly
> pre-existing issues, but please have a look:
>
> https://sashiko.dev/#/patchset/20260403025533.6250-1-rkannoth%40marvell.com
>
> /P
>
A. Patch 1: [PATCH v10 net-next 1/6] octeontx2-af: npc: cn20k: debugfs enhancements
1.
"Is it safe to use a global static array here?
Concurrent reads of the dstats debugfs file could race to read and write to
this array, leading to torn updates. Additionally, on systems with multiple
RVU devices, wouldn't they share the same array and cross-contaminate their
statistics?"
The purpose of using "dstats" is to identify which rules current traffic is hitting.
We are not particularly concerned with overall packet statistics, as those can
be obtained through rule's stats register. Each rule hit will display the corresponding
rule during 'cating', so this comment can be disregarded.
2.
"Are the hardware parameters read into npc_priv->bank_depth and
npc_priv->num_banks guaranteed to fit within the static bounds of dstats?
If the hardware reports a bank_depth that makes mcam_idx exceed the
pre-allocated array dimensions, could this result in an out-of-bounds
memory access here?"
static u64 dstats[MAX_NUM_BANKS][MAX_SUBBANK_DEPTH * MAX_NUM_SUB_BANKS] = {};
MAX_SUBBANK_DEPTH and MAX_NUM_SUB_BANKS are maximum possible. So above case
wont happen
3.
"Could this cause data loss if the output exceeds the initial seq_file
buffer size?
DEFINE_SHOW_ATTRIBUTE() uses single_open(), which will discard the buffer,
double its size, and re-invoke this show callback from the beginning if it
overflows.
Since dstats[bank][idx] is updated during the first execution, a
re-invocation would see stats == dstats[bank][idx] and silently drop all
previously processed entries."
The purpose of "dstats" entry to find live traffic hitting rules. It may be showing "wrong"
value for the first time, but second time onwards, it will be okay. Usually user runs 'cat'ing mutliple
times to find "htting" rule.
5.
"Similar to the dstats array, does this code overflow en_map?
If the index provided here can exceed the compile-time bounds of
MAX_NUM_BANKS * MAX_NUM_SUB_BANKS * MAX_SUBBANK_DEPTH due to dynamically
read hardware sizes, it could corrupt memory.
Additionally, is en_map explicitly cleared during device initialization
or teardown?
Since npc_priv is global, if the driver is unbound and rebound without
unloading the module, could en_map retain stale state from the previous
session?"
MAX_NUM_BANKS * MAX_NUM_SUB_BANKS * MAX_SUBBANK_DEPTH is big enough to hold
for all SoC variants.
B. Patch 2: [PATCH v10 net-next 2/6] net/mlx5e: heap-allocate devlink param values
1.
"Does returning -ENOMEM here trigger a redundant WARN_ON() in the caller?
Looking at mlx5e_pcie_cong_event_init(), it checks the return value like this:
drivers/net/ethernet/mellanox/mlx5/core/en/pcie_cong_event.c:mlx5e_pcie_cong_event_init() {
...
err = mlx5e_pcie_cong_get_thresh_config(mdev, &thresh_config);
if (WARN_ON(err))
return err;
...
}
Since kcalloc() already prints a warning and stack trace on allocation failure
(unless __GFP_NOWARN is used), wrapping the -ENOMEM return in WARN_ON() will
result in a double stack trace in the kernel log.
Is it possible to drop the WARN_ON() in the caller to avoid this redundant
splat?"
Existing code issue. i believe, i dont have to fix this. correct me pls.
C. Patch 3: [PATCH v10 net-next 3/6] devlink: Implement devlink param multi attribute nested data values
Replied above as part of your comments.
D. Patch 4: [PATCH v10 net-next 4/6] octeontx2-af: npc: cn20k: add subbank search order control
1.
"If xa_store() fails and we jump to fail_used, doesn't this leave the device
in an unrecoverable state?
The code previously erased all subbank entries from npc_priv.xa_sb_used and
npc_priv.xa_sb_free. By aborting without restoring the remaining subbanks,
the erased subbanks appear to be permanently leaked, which would permanently
break subsequent hardware MCAM allocations. Global state is also modified
before the insertion loop, leaving the driver's internal state inconsistent.
Could this be avoided by pre-allocating the required XArray slots with
xa_reserve() or by building new XArrays and swapping them upon success?"
We are doing this operation underlocks and there wont be repetitions of slots in
the array. This above condiition is very unlikely, if there is some issue, we
flag the error thru print. This means that there is some other deep issue which
caused this error. So we can ignore this.
2.
" + if (bitmap_weight(&w, sz) != sz) {
> + dev_err(rvu->dev,
> + "Duplicate or out-of-range subbank index. %lu\n",
> + find_first_zero_bit(&w, sz));
> + return -EINVAL;
> + }
Does this error message print the missing index instead of the duplicate
one?
Since find_first_zero_bit() finds the first unset bit, it will return the
index that the user failed to provide, not the one they duplicated or
provided out of range. Could this lead to a confusing error message for
users?"
I will change print error message to "Duplicate or missing or out-of-range subbank index. %lu\n","
E. Patch 5: [PATCH v10 net-next 5/6] octeontx2-af: npc: cn20k: dynamically allocate and free default MCAM entries
1.
> u64 cfg, hw_prio;
> u8 kw_type;
>
> + if (index < 0 || index >= mcam->total_entries) {
> + WARN(1, "Wrong mcam index %d\n", index);
> + return;
> + }
Could this cause kernel log spam due to unchecked negative MCAM indices?
In npc_enadis_default_entries() and npc_enadis_default_mce_entry(), the code
queries the MCAM index using npc_get_nixlf_mcam_index(), which correctly
returns a negative error code if the entry is not allocated.
This unchecked negative index is then passed directly to
npc_cn20k_enable_mcam_entry(), which checks if (index < 0) and triggers a
stack trace via WARN(1, ...). This means that a routine NIX LF teardown of
an uninitialized interface (or a VF without multicast support querying the
PF's missing BCAST entry) will repeatedly trigger stack traces.
We have checks in npc_enadis_default_entries() to check for default entries and bailout.
/* only CGX or LBK interfaces have default entries */
if (is_cn20k(rvu->pdev) &&
!npc_is_cgx_or_lbk(rvu, pcifunc & ~RVU_PFVF_FUNC_MASK))
return;
will add the same for npc_enadis_default_mce_entry() as well.
2.
> @@ -4024,6 +4041,13 @@ int npc_cn20k_dft_rules_idx_get(struct rvu *rvu, u16 pcifunc, u16 *bcast,
> void *val;
> int i, j;
>
> + for (int i = 0; i < ARRAY_SIZE(ptr); i++) {
> + if (!ptr[i])
> + continue;
There is variable shadowing here. The variable 'i' is declared at the
function scope, but the loop re-declares 'i', shadowing the outer scope
variable. This triggers -Wshadow compiler warnings.
ACK.
3.
Does this code use virtual MCAM indices as physical indices, potentially
causing massive out-of-bounds memory corruption?
In npc_cn20k_dft_rules_alloc(), default MCAM entries are allocated via the
mailbox handler (rvu_mbox_handler_npc_mcam_alloc_entry), which returns
virtual indices (vidx) in rsp.entry_list for CN20K hardware. These vidx
values are inserted directly into the xa_pf2dfl_rmap xarray.
Later, this code in npc_cn20k_dft_rules_free() retrieves these vidx values
and passes them directly to physical index APIs: npc_mcam_clear_bit() and
mcam->entry2pfvf_map[]. Because vidx indices are structurally shifted
(starting at npc_priv.bank_depth * 2), they are guaranteed to be larger
than mcam->bmap_entries.
This causes rentry = mcam->bmap_entries - index - 1 in npc_mcam_clear_bit()
to underflow, leading __clear_bit() to perform out-of-bounds writes on
kernel heap memory. The vidx values should probably be mapped back to
physical indices using npc_cn20k_vidx2idx() before accessing software state
arrays.
virt field in npc_mcam_alloc_entry_req structure is not set for while calling for
default mcam index allocation using rvu_mbox_handler_npc_mcam_alloc_entry(). So
this comment can be ignored.
4.
Is there a race condition here that could lead to MCAM index corruption due
to premature lock release?
The mcam->lock is held while calling npc_mcam_clear_bit(), which
immediately marks the MCAM index as free in the software bitmap
(mcam->bmap). The code then explicitly unlocks mcam->lock before calling
npc_cn20k_idx_free() to release the index in the firmware/hardware.
Releasing the lock creates a critical race window: another thread can
concurrently allocate this exact MCAM index (since it appears free in the
bitmap), assign it to a new interface, and configure it.
When the original thread resumes and calls npc_cn20k_idx_free(), it
instructs the hardware to free the index, silently destroying the newly
allocated flow rules of the concurrent thread. Should the hardware cleanup
be performed before the software bitmap is marked free, or should the lock
remain held?
These alloc free request are thru mbox messages and are processed one by one (serially)
So ignore this case.
5.
int npc_cn20k_dft_rules_alloc(struct rvu *rvu, u16 pcifunc)
Is there a software bitmap leak and multi-free bug in the error path of
npc_cn20k_dft_rules_alloc()?
If xa_insert() fails, the code jumps to the err label to free the allocated
entries. The error path executes:
for (i = 0; i < cnt; i++)
npc_idx_free(rvu, mcam_idx, cnt, false);
Since npc_idx_free() takes an array and a count to free 'cnt' entries,
looping over it causes the exact same array of indices to be freed 'cnt'
times.
Furthermore, the error path fails to clear the software bits in the core
hardware bitmap (mcam->bmap) via npc_mcam_clear_bit() and does not
invalidate mcam->entry2pfvf_map. Since the allocation handler successfully
marked these entries as used, omitting these steps permanently leaks the
software state for these MCAM indices.
dft rules fail means a critical issue as this will affect system operation. Theere is
no point in recovering cleanly from it. But anyway, i dont find a clear 1:1 with code, am
i missing anything here.
6.
Does this early return skip the cleanup of user-defined flow rules for
non-CGX/LBK interfaces?
While the intent seems to be skipping the disablement of default entries
for interfaces that don't have them (like SDP or CPT), this function is
also responsible for disabling ALL user-defined MCAM flow rules associated
with the pcifunc by iterating over mcam->mcam_rules.
By returning early here, all user-defined rules are left fully active in
the hardware parser, pointing to a NIX LF that is being torn down and freed.
This could lead to hardware DMA faults and context errors when traffic
matches these orphaned rules. Should this check be exclusively inside
rvu_npc_disable_default_entries() instead?
ACK.
7.
Can this cause a permanent resource leak of default MCAM entries on PF driver
teardown?
otx2_free_hw_resources() unconditionally sets the NIX_LF_DONT_FREE_DFT_IDXS
flag during NIX LF free. While intended for cases like suspend/resume, this
function is called across all driver unbind/remove paths.
Consequently, when the otx2_pf driver is unloaded, the AF driver skips
freeing the default MCAM rules, leaving them permanently orphaned in the
AF's xa_pf2dfl_rmap and hardware allocation bitmap. They can never be
recovered or reused by other physical or virtual functions, causing a
severe leak of globally shared MCAM resources on every module detach.
These default mcam indexes are freed during rvu_detach_rsrcs mbox call.
rvu_detach_rsrcs() ->npc_cn20k_dft_rules_free().
F: Patch 6: [PATCH v10 net-next 6/6] octeontx2-af: npc: Support for custom KPU profile from filesystem
1.
Can this corrupt global state? In npc_prepare_default_kpu(), adapter->kpu
is initialized to the global static array npc_kpu_profiles. By assigning
kpu->cam2 and kpu->action2 here, this modifies the shared static array with
device-managed memory (devm_kcalloc).
If the device is unbound, the memory will be freed, leaving dangling pointers
in the global array which could lead to a use-after-free for subsequent device
instances.
This memory is part of "AF" and is initlaized during boot. AF memory will be freed when
device is shutdown. There wont be any case of af getting freed and come again on a live system.
2.
> +
> + action = (struct npc_kpu_profile_action *)(fw->data + offset);
> +
> + if (rvu->kpu_fwdata_sz < hdr_sz + sizeof(ikpu_action_entries))
> + return -ENOMEM;
Is -ENOMEM the correct return value here? A malformed or truncated firmware
file would cause this condition, so -EINVAL might be more appropriate than
indicating a memory allocation failure.
ACK.
3.
Could this cause firmware compatibility issues in the future?
ikpu_action_entries is an internal kernel array. If new entries are added
to it in future kernel versions, sizeof(ikpu_action_entries) will grow. This
would cause older firmware files to fail the size check above, or miscalculate
the offset for subsequent entries.
There could be a chance, but there is not field in "struct npc_kpu_profile_fwdata"
to indicate sizeof ikpu_action_entries. Adding a new field will break backward compatability etc.
Anyway In future if more entries are planned, we need to come up with a new design to address
this issue.
4.
+
> + for (kpu = 0; kpu < fw->kpus; kpu++) {
Does this loop have a bounds checking issue? Unlike the firmware database
path which checks if fw->kpus is greater than profile->kpus, this path only
checked if fw->kpus is greater than rvu->hw->npc_kpus earlier.
If the hardware limit exceeds the size of the statically allocated
profile->kpu array, this could overflow profile->kpu.
ACK.
5.
+ "profile size mismatch on kpu%i parsing\n",
> + kpu + 1);
> + return -EINVAL;
> + }
> +
> + fw_kpu = (struct npc_kpu_fwdata *)(fw->data + offset);
> + entries = min(fw_kpu->entries, rvu->hw->npc_kpu_entries);
Can this result in an out-of-bounds read? The size check above only ensures
that the start of fw_kpu is within the firmware data boundaries. It does not
verify that the entire fw_kpu structure fits within the remaining space before
dereferencing fw_kpu->entries.
> + dev_info(rvu->dev,
> + "Loading %u entries on KPU%d\n", entries, kpu);
> +
ACK.
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2026-04-08 10:41 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-03 2:55 [PATCH v10 net-next 0/6] octeontx2-af: npc: Enhancements Ratheesh Kannoth
2026-04-03 2:55 ` [PATCH v10 net-next 1/6] octeontx2-af: npc: cn20k: debugfs enhancements Ratheesh Kannoth
2026-04-03 2:55 ` [PATCH v10 net-next 2/6] net/mlx5e: heap-allocate devlink param values Ratheesh Kannoth
2026-04-03 2:55 ` [PATCH v10 net-next 3/6] devlink: Implement devlink param multi attribute nested data values Ratheesh Kannoth
2026-04-07 9:58 ` Paolo Abeni
2026-04-08 10:41 ` Ratheesh Kannoth
2026-04-03 2:55 ` [PATCH v10 net-next 4/6] octeontx2-af: npc: cn20k: add subbank search order control Ratheesh Kannoth
2026-04-03 2:55 ` [PATCH v10 net-next 5/6] octeontx2-af: npc: cn20k: dynamically allocate and free default MCAM entries Ratheesh Kannoth
2026-04-03 2:55 ` [PATCH v10 net-next 6/6] octeontx2-af: npc: Support for custom KPU profile from filesystem Ratheesh Kannoth
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox