Re: [v3,net-next,5/5] octeontx2-af: Add support for loading custom KPU profile from filesystem

public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed

From: Ratheesh Kannoth <rkannoth@marvell.com>
To: Simon Horman <horms@kernel.org>
Cc: <cjubran@nvidia.com>, <matttbe@kernel.org>,
	<chuck.lever@oracle.com>, <shshitrit@nvidia.com>,
	<linux-kernel@vger.kernel.org>, <netdev@vger.kernel.org>,
	<sgoutham@marvell.com>, <jiri@resnulli.us>, <davem@davemloft.net>,
	<saeedm@nvidia.com>, <kuba@kernel.org>, <edumazet@google.com>,
	<pabeni@redhat.com>, <donald.hunter@gmail.com>
Subject: Re: [v3,net-next,5/5] octeontx2-af: Add support for loading custom KPU profile from filesystem
Date: Wed, 11 Mar 2026 08:45:40 +0530	[thread overview]
Message-ID: <abDeXLpMMxp7G1v3@rkannoth-OptiPlex-7090> (raw)
In-Reply-To: <20260310172100.890161-1-horms@kernel.org>

On 2026-03-10 at 22:51:00, Simon Horman (horms@kernel.org) wrote:
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
>
> For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
> ---
> octeontx2-af: Add support for loading custom KPU profile from filesystem
>
> This commit adds functionality to load custom KPU (Key Parse Unit)
> profiles from the filesystem at module load time, providing an
> alternative to flashing updated firmware on deployed devices.
>
> > diff --git a/drivers/net/ethernet/marvell/octeontx2/af/npc.h b/drivers/net/ethernet/marvell/octeontx2/af/npc.h
> > index cefc5d70f3e4..5928c4d3cc8b 100644
> > --- a/drivers/net/ethernet/marvell/octeontx2/af/npc.h
> > +++ b/drivers/net/ethernet/marvell/octeontx2/af/npc.h
>
> [ ... ]
>
> > @@ -290,6 +303,7 @@ struct npc_kpu_profile {
> >  	int action_entries;
> >  	struct npc_kpu_profile_cam *cam;
> >  	struct npc_kpu_profile_action *action;
> > +	struct npc_kpu_profile_cam2 *cam2;
> >  };
>
> [ ... ]
>
> > diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu.h b/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
> > index a466181cf908..1efc53b63b67 100644
> > --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
> > +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
>
> [ ... ]
>
> > @@ -553,8 +553,8 @@ struct npc_kpu_profile_adapter {
> >  	const char			*name;
> >  	u64				version;
> >  	const struct npc_lt_def_cfg	*lt_def;
> > -	const struct npc_kpu_profile_action	*ikpu; /* array[pkinds] */
> > -	const struct npc_kpu_profile	*kpu; /* array[kpus] */
> > +	struct npc_kpu_profile_action	*ikpu; /* array[pkinds] */
> > +	struct npc_kpu_profile	*kpu; /* array[kpus] */
>
> [ ... ]
>
> > @@ -564,6 +564,7 @@ struct npc_kpu_profile_adapter {
> >  	bool				custom;
> >  	size_t				pkinds;
> >  	size_t				kpus;
> > +	bool				from_fs;
> >  };
>
> [ ... ]
>
> > diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
> > index 352dc5f1d5b9..05aabe26f030 100644
> > --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
> > +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
>
> [ ... ]
>
> > @@ -1610,6 +1622,7 @@ u64 npc_enable_mask(int count)
> >  static void npc_program_kpu_profile(struct rvu *rvu, int blkaddr, int kpu,
> >  				    const struct npc_kpu_profile *profile)
> >  {
> > +	struct npc_kpu_profile_adapter *adapter = &rvu->kpu;
> >  	int entry, num_entries, max_entries;
> >  	u64 entry_mask;
> >
> > @@ -1621,10 +1634,15 @@ static void npc_program_kpu_profile(struct rvu *rvu, int blkaddr, int kpu,
> >
> >  	max_entries = rvu->hw->npc_kpu_entries;
> >
> > +	WARN(profile->cam_entries > max_entries,
> > +	     "KPU%u: err: hw max entries=%u, input entries=%u\n",
> > +	     kpu,  rvu->hw->npc_kpu_entries, profile->cam_entries);
> > +
> >  	/* Program CAM match entries for previous KPU extracted data */
> >  	num_entries = min_t(int, profile->cam_entries, max_entries);
> >  	for (entry = 0; entry < num_entries; entry++)
> >  		npc_config_kpucam(rvu, blkaddr,
> > +				  adapter->from_fs ? (void *)&profile->cam2[entry] :
> >  				  &profile->cam[entry], kpu, entry);
>
> Can this dereference NULL when from_fs is true for KPUs not covered by
> the filesystem firmware?
>
> Looking at the filesystem loading path in npc_apply_custom_kpu:
>
>     for (kpu = 0; kpu < fw->kpus; kpu++) {
>         ...
>         ret = npc_alloc_kpu_cam2_n_action(rvu, kpu, entries);
>         ...
>         for (entry = 0; entry < entries; entry++) {
>             profile->kpu[kpu].cam2[entry] = cam2[entry];
>             ...
>         }
>     }
>
> The code allocates cam2 and populates entries only for the first
> fw->kpus KPUs found in the firmware blob.
>
> However, npc_program_kpu_profile is later called for all KPUs in
> npc_parser_profile_init:
>
>     for (idx = 0; idx < num_kpus; idx++)
>         npc_program_kpu_profile(rvu, blkaddr, idx, &rvu->kpu.kpu[idx]);
>
> Since adapter->from_fs is a global flag (not per-KPU), when idx >=
> fw->kpus, profile->cam2 remains NULL (the default npc_kpu_profiles[]
> static initializer doesn't include cam2), causing &profile->cam2[entry]
> to compute an invalid pointer.
>
> Shouldn't there be a check to verify cam2 is non-NULL before
> dereferencing it, or allocate cam2 for all KPUs when from_fs is true?

The filsystem binary contains entries for all KPUs. So we can ignore this
case.

>
> >
> >  	/* Program this KPU's actions */
>
> [ ... ]
>
> > @@ -1678,26 +1696,50 @@ static void npc_prepare_default_kpu(struct rvu *rvu,
> >  	npc_cn20k_update_action_entries_n_flags(rvu, profile);
> >  }
> >
> > +static int npc_alloc_kpu_cam2_n_action(struct rvu *rvu, int kpu_num,
> > +				       int num_entries)
> > +{
> > +	struct npc_kpu_profile_adapter *adapter = &rvu->kpu;
> > +	struct npc_kpu_profile *kpu;
> > +
> > +	kpu = &adapter->kpu[kpu_num];
> > +
> > +	kpu->cam2 = devm_kcalloc(rvu->dev, num_entries,
> > +				 sizeof(*kpu->cam2), GFP_KERNEL);
> > +	if (!kpu->cam2)
> > +		return -ENOMEM;
> > +
> > +	kpu->action = devm_kcalloc(rvu->dev, num_entries,
> > +				   sizeof(*kpu->action), GFP_KERNEL);
> > +	if (!kpu->action)
> > +		return -ENOMEM;
> > +
> > +	return 0;
> > +}
>
> Can this permanently corrupt the default profile when the second
> allocation fails?
Expectation is to load the correct binary. Due to memory constraints, we decided
to skip rollback on errors. It is user responsibility to remove the corrupted FW
binary and reboot.

>
> The kpu pointer targets the static npc_kpu_profiles[] array (adapter->kpu
> starts as &npc_kpu_profiles[0]). When the first devm_kcalloc succeeds but
> the second fails:
>
>     kpu->cam2 = devm_kcalloc(...)  // succeeds, modifies static array
>     kpu->action = devm_kcalloc(...) // fails
>     return -ENOMEM;                 // kpu->action is now NULL in static array
>
> The caller in npc_apply_custom_kpu returns the error without restoring
> the original action pointer.
>
> Later, npc_parser_profile_init calls npc_program_kpu_profile for all
> KPUs, which calls npc_config_kpuaction with &profile->action[entry]
> where action is NULL, causing a crash when dereferencing the NULL pointer.
>
> Should the function restore the original pointers on allocation failure,
> or use a separate copy of the profile entries for filesystem loading?
Same as above.
>
> > +
> >  static int npc_apply_custom_kpu(struct rvu *rvu,
> > -				struct npc_kpu_profile_adapter *profile)
> > +				struct npc_kpu_profile_adapter *profile,
> > +				bool from_fs)
> >  {
> >  	size_t hdr_sz = sizeof(struct npc_kpu_profile_fwdata), offset = 0;
> >  	struct npc_kpu_profile_action *action;
> > +	struct npc_kpu_profile_fwdata *sfw;
> >  	struct npc_kpu_profile_fwdata *fw;
> > +	struct npc_kpu_profile_cam2 *cam2;
> >  	struct npc_kpu_profile_cam *cam;
> >  	struct npc_kpu_fwdata *fw_kpu;
> > -	int entries;
> > +	int entries, ret;
> >  	u16 kpu, entry;
>
> [ ... ]
>
> > @@ -1731,31 +1773,80 @@ static int npc_apply_custom_kpu(struct rvu *rvu,
> >  		return -EINVAL;
> >  	}
> >
> > +	sfw = devm_kcalloc(rvu->dev, 1, sizeof(*sfw), GFP_KERNEL);
> > +	if (!sfw)
> > +		return -ENOMEM;
> > +
> > +	memcpy(sfw, fw, sizeof(*sfw));
> > +
> >  	profile->custom = 1;
> > -	profile->name = fw->name;
> > +	profile->name = sfw->name;
> >  	profile->version = le64_to_cpu(fw->version);
> > -	profile->mcam_kex_prfl.mkex = &fw->mkex;
> > -	profile->lt_def = &fw->lt_def;
> > +	profile->mcam_kex_prfl.mkex = &sfw->mkex;
> > +	profile->lt_def = &sfw->lt_def;
> > +
> > +	/* Binary blob contains ikpu actions entries at start of data[0] */
> > +	if (from_fs) {
> > +		action = (struct npc_kpu_profile_action *)(fw->data + offset);
> > +
> > +		if (rvu->kpu_fwdata_sz < hdr_sz + sizeof(ikpu_action_entries))
> > +			return -ENOMEM;
> > +
> > +		memcpy((void *)profile->ikpu, action, sizeof(ikpu_action_entries));
> > +		offset += sizeof(ikpu_action_entries);
> > +	}
> >
> >  	for (kpu = 0; kpu < fw->kpus; kpu++) {
> >  		fw_kpu = (struct npc_kpu_fwdata *)(fw->data + offset);
> > -		if (fw_kpu->entries > KPU_MAX_CST_ENT)
> > -			dev_warn(rvu->dev,
> > -				 "Too many custom entries on KPU%d: %d > %d\n",
> > -				 kpu, fw_kpu->entries, KPU_MAX_CST_ENT);
> > -		entries = min(fw_kpu->entries, KPU_MAX_CST_ENT);
> > -		cam = (struct npc_kpu_profile_cam *)fw_kpu->data;
> > -		offset += sizeof(*fw_kpu) + fw_kpu->entries * sizeof(*cam);
> > +		if (!from_fs) {
> > +			if (fw_kpu->entries > KPU_MAX_CST_ENT)
> > +				dev_warn(rvu->dev,
> > +					 "Too many custom entries on KPU%d: %d > %d\n",
> > +					 kpu, fw_kpu->entries, KPU_MAX_CST_ENT);
> > +			entries = min(fw_kpu->entries, KPU_MAX_CST_ENT);
> > +			cam = (struct npc_kpu_profile_cam *)fw_kpu->data;
> > +			offset += sizeof(*fw_kpu) + fw_kpu->entries * sizeof(*cam);
> > +			action = (struct npc_kpu_profile_action *)(fw->data + offset);
> > +			offset += fw_kpu->entries * sizeof(*action);
> > +			if (rvu->kpu_fwdata_sz < hdr_sz + offset) {
> > +				dev_warn(rvu->dev,
> > +					 "Profile size mismatch on KPU%i parsing.\n",
> > +					 kpu + 1);
> > +				return -EINVAL;
> > +			}
> > +			for (entry = 0; entry < entries; entry++) {
> > +				profile->kpu[kpu].cam[entry] = cam[entry];
> > +				profile->kpu[kpu].action[entry] = action[entry];
> > +			}
> > +			continue;
> > +		}
> > +		entries = fw_kpu->entries;
> > +		dev_info(rvu->dev,
> > +			 "Loading %u entries on KPU%d\n", entries, kpu);
> > +
> > +		cam2 = (struct npc_kpu_profile_cam2 *)fw_kpu->data;
> > +		offset += sizeof(*fw_kpu) + fw_kpu->entries * sizeof(*cam2);
> >  		action = (struct npc_kpu_profile_action *)(fw->data + offset);
> >  		offset += fw_kpu->entries * sizeof(*action);
> >  		if (rvu->kpu_fwdata_sz < hdr_sz + offset) {
> >  			dev_warn(rvu->dev,
> > -				 "Profile size mismatch on KPU%i parsing.\n",
> > +				 "profile size mismatch on kpu%i parsing.\n",
> >  				 kpu + 1);
> >  			return -EINVAL;
> >  		}
> > +
> > +		profile->kpu[kpu].cam_entries = entries;
> > +		profile->kpu[kpu].action_entries = entries;
> > +		ret = npc_alloc_kpu_cam2_n_action(rvu, kpu, entries);
> > +		if (ret) {
> > +			dev_warn(rvu->dev,
> > +				 "profile entry allocation failed for kpu=%d for %d entries\n",
> > +				 kpu, entries);
> > +			return -EINVAL;
> > +		}
> > +
> >  		for (entry = 0; entry < entries; entry++) {
> > -			profile->kpu[kpu].cam[entry] = cam[entry];
> > +			profile->kpu[kpu].cam2[entry] = cam2[entry];
> >  			profile->kpu[kpu].action[entry] = action[entry];
> >  		}
>
> Could this loop overflow and hang the kernel if firmware specifies more
> than 65535 entries?
FW binary is prepared by marvell and tested before delivering to customers.
And the entries wont cross u16 limit

>
> The loop variable 'entry' is declared as u16:
>
>     u16 kpu, entry;
>
> while 'entries' is an int:
>
>     entries = fw_kpu->entries;
>
> In the from_fs path, 'entries' is read directly from the firmware blob
> without capping it (unlike the !from_fs path which uses
> min(fw_kpu->entries, KPU_MAX_CST_ENT)).
>
> If a corrupt or malicious firmware file specifies fw_kpu->entries >
> 65535, the u16 variable wraps around at 65536:
>
>     for (entry = 0; entry < entries; entry++)
>
> When entry reaches 65535, incrementing it produces 0, which is still
> less than entries, creating an infinite loop that hangs the kernel.
>
> Should the code either cap entries in the from_fs path or use an
> int-sized loop variable?
>
> >  	}
>
> [ ... ]

next prev parent reply	other threads:[~2026-03-11  3:16 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-09  2:46 [PATCH v3 net-next 0/5] octeontx2-af: npc: Enhancements Ratheesh Kannoth
2026-03-09  2:46 ` [PATCH v3 net-next 1/5] octeontx2-af: npc: cn20k: debugfs enhancements Ratheesh Kannoth
2026-03-09  2:46 ` [PATCH v3 net-next 2/5] devlink: Implement devlink param multi attribute nested data values Ratheesh Kannoth
2026-03-09  2:46 ` [PATCH v3 net-next 3/5] octeontx2-af: npc: cn20k: add subbank search order control Ratheesh Kannoth
2026-03-09  2:46 ` [PATCH v3 net-next 4/5] octeontx2-af: npc: cn20k: dynamically allocate and free default MCAM entries Ratheesh Kannoth
2026-03-09  2:46 ` [PATCH v3 net-next 5/5] octeontx2-af: Add support for loading custom KPU profile from filesystem Ratheesh Kannoth
2026-03-10 17:21   ` [v3,net-next,5/5] " Simon Horman
2026-03-11  3:15     ` Ratheesh Kannoth [this message]
2026-03-11 10:14       ` Simon Horman
2026-03-11 14:28         ` Ratheesh Kannoth
2026-03-13  2:08     ` Ratheesh Kannoth

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=abDeXLpMMxp7G1v3@rkannoth-OptiPlex-7090 \
    --to=rkannoth@marvell.com \
    --cc=chuck.lever@oracle.com \
    --cc=cjubran@nvidia.com \
    --cc=davem@davemloft.net \
    --cc=donald.hunter@gmail.com \
    --cc=edumazet@google.com \
    --cc=horms@kernel.org \
    --cc=jiri@resnulli.us \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=matttbe@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=saeedm@nvidia.com \
    --cc=sgoutham@marvell.com \
    --cc=shshitrit@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox