Re: [PATCH v4 net 01/10] octeontx2-af: npc: cn20k: Propagate MCAM key-type errors on cn20k

From: Ratheesh Kannoth <rkannoth@marvell.com>
To: <netdev@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Cc: <sgoutham@marvell.com>, <davem@davemloft.net>,
	<edumazet@google.com>, <kuba@kernel.org>, <pabeni@redhat.com>,
	<andrew+netdev@lunn.ch>, Suman Ghosh <sumang@marvell.com>,
	Dan Carpenter <error27@gmail.com>
Subject: Re: [PATCH v4 net 01/10] octeontx2-af: npc: cn20k: Propagate MCAM key-type errors on cn20k
Date: Mon, 27 Apr 2026 15:15:46 +0530
Message-ID: <ae8wSuXlBdTJBkbu@rkannoth-OptiPlex-7090>
In-Reply-To: <20260427063213.3937451-2-rkannoth@marvell.com>

On 2026-04-27 at 12:02:04, Ratheesh Kannoth (rkannoth@marvell.com) wrote:
> npc_mcam_idx_2_key_type() can fail; callers used to ignore it and still
> used kw_type when enabling, configuring, copying, and reading MCAM
> entries. That could program or decode hardware with an undefined key
> type.
>
> Return -EINVAL when key-type lookup fails. Return -EINVAL from
> npc_cn20k_copy_mcam_entry() when src and dest key types differ instead
> of failing silently.
>
> Change npc_cn20k_{enable,config,copy,read}_mcam_entry() to return int on
> success or error. Thread those errors through the cn20k MCAM write and
> read mbox handlers, the cn20k baseline steer read path, NPC defrag
> move (disable/copy/enable with dev_err and -EFAULT), and the DMAC
> update path in rvu_npc_fs.c.
>
> Make npc_copy_mcam_entry() return int so the cn20k branch can return
> npc_cn20k_copy_mcam_entry() without a void/int mismatch, and fail
> NPC_MCAM_SHIFT_ENTRY when copy fails.

Replies to the review comments posted at:
https://netdev-ai.bots.linux.dev/sashiko/#/patchset/20260427063213.3937451-1-rkannoth%40marvell.com

>> diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
>> index 7291fdb89b03..8d5cee833af7 100644
>> --- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
>> +++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
>[ ... ]
>> @@ -1335,11 +1354,10 @@ int rvu_mbox_handler_npc_cn20k_mcam_write_entry(struct rvu *rvu,
>>  	if (is_pffunc_af(req->hdr.pcifunc))
>>  		nix_intf = req->intf;
>>
>> -	npc_cn20k_config_mcam_entry(rvu, blkaddr, req->entry, nix_intf,
>> -				    &req->entry_data, req->enable_entry,
>> -				    req->hw_prio, req->req_kw_type);
>> +	rc = npc_cn20k_config_mcam_entry(rvu, blkaddr, req->entry, nix_intf,
>> +					 &req->entry_data, req->enable_entry,
>> +					 req->hw_prio, req->req_kw_type);
>>
>> -	rc = 0;
>>  exit:
>>  	mutex_unlock(&mcam->lock);
>>  	return rc;
>Is it intentional that this handler, and the other cn20k mbox handlers
>touched here, now return a bare -EINVAL from the new failure paths while
>every other error exit in the same handlers returns one of the
>NPC_MCAM_* enum codes defined in mbox.h (NPC_MCAM_INVALID_REQ,
>NPC_MCAM_PERM_DENIED, NPC_MCAM_ALLOC_FAILED)?
>A caller switching on those enum values would not match -EINVAL coming
>out of npc_cn20k_config_mcam_entry() or npc_cn20k_read_mcam_entry().

I agree. The error handling needs a lot of cleanup, and that is a candidate
for net-next. This patch is meant to fix a smatch error, so I don't want to
address this comment as part of this change. I will do it in a follow-up patch.
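
To make the reviewer's concern concrete, here is a rough user-space sketch (not driver code) of one way a follow-up could translate a bare -EINVAL into an mbox-style code before the handler returns. The helper name and the numeric values are placeholders chosen for illustration; the real NPC_MCAM_* codes are the ones defined in mbox.h.

```c
#include <errno.h>

/* Placeholder values for illustration only; the real codes live in mbox.h. */
enum sketch_npc_mcam_err {
	SKETCH_NPC_MCAM_INVALID_REQ = -701,
	SKETCH_NPC_MCAM_PERM_DENIED = -704,
};

/* Hypothetical helper: map a bare errno coming out of a low-level MCAM
 * helper to the mbox-style code that the handler's other exits use.
 */
static int sketch_mcam_rc_to_mbox(int rc)
{
	if (rc == -EINVAL)
		return SKETCH_NPC_MCAM_INVALID_REQ;

	/* 0 and codes that are already mbox-style pass through unchanged. */
	return rc;
}
```

With something along these lines, a caller switching on the enum values would see the invalid-request code instead of an errno it does not recognise. Whether that mapping is the right one is exactly what the follow-up needs to decide.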

>> @@ -1361,11 +1379,13 @@ int rvu_mbox_handler_npc_cn20k_mcam_read_entry(struct rvu *rvu,
>>
>>  	mutex_lock(&mcam->lock);
>>  	rc = npc_mcam_verify_entry(mcam, pcifunc, req->entry);
>> -	if (!rc)
>> -		npc_cn20k_read_mcam_entry(rvu, blkaddr, req->entry,
>> -					  &rsp->entry_data, &rsp->intf,
>> -					  &rsp->enable, &rsp->hw_prio);
>> +	if (rc)
>> +		goto fail;
>>
>> +	rc = npc_cn20k_read_mcam_entry(rvu, blkaddr, req->entry,
>> +				       &rsp->entry_data, &rsp->intf,
>> +				       &rsp->enable, &rsp->hw_prio);
>> +fail:
>>  	mutex_unlock(&mcam->lock);
>>  	return rc;
>>  }
>When npc_cn20k_read_mcam_entry() fails here (and in the base-steer
>read handler below), rc is returned but rsp->entry_data and friends
>may be partially populated or untouched. Should the response buffer
>be zeroed on the error path, or is it documented elsewhere that
>callers must ignore rsp on error?

Callers do not use rsp on error, so this comment can be ignored.
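
For reference, if clearing the response were ever wanted, it would amount to something like the user-space sketch below. The struct, the reader, and the wrapper are hypothetical stand-ins, not the driver's types or functions; the reader deliberately writes one field before failing to mimic a partially populated response.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for the mbox response; not the driver's struct. */
struct sketch_read_rsp {
	unsigned long long kw[8];
	unsigned char intf;
	unsigned char enable;
	unsigned char hw_prio;
};

/* Hypothetical reader: writes part of rsp, then fails for entry < 0. */
static int sketch_read_entry(int entry, struct sketch_read_rsp *rsp)
{
	rsp->kw[0] = 0xabcdULL;	/* partial write before the failure check */
	if (entry < 0)
		return -EINVAL;

	rsp->intf = 1;
	rsp->enable = 1;
	return 0;
}

/* Handler-shaped wrapper that zeroes rsp whenever the read fails. */
static int sketch_read_handler(int entry, struct sketch_read_rsp *rsp)
{
	int rc = sketch_read_entry(entry, rsp);

	if (rc)
		memset(rsp, 0, sizeof(*rsp));
	return rc;
}
```

Since callers check rc first, this only guards against a caller that reads rsp despite the error.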

>> @@ -1415,14 +1435,14 @@ int rvu_mbox_handler_npc_cn20k_mcam_alloc_and_write_entry(struct rvu *rvu,
>>  	else
>>  		nix_intf = pfvf->nix_rx_intf;
>>
>> -	npc_cn20k_config_mcam_entry(rvu, blkaddr, entry, nix_intf,
>> -				    &req->entry_data, req->enable_entry,
>> -				    req->hw_prio, req->req_kw_type);
>> +	rc = npc_cn20k_config_mcam_entry(rvu, blkaddr, entry, nix_intf,
>> +					 &req->entry_data, req->enable_entry,
>> +					 req->hw_prio, req->req_kw_type);
>>
>>  	mutex_unlock(&mcam->lock);
>>
>>  	rsp->entry = entry_rsp.entry;
>> -	return 0;
>> +	return rc;
>>  }
>Does this path now leak the MCAM entry on the new error?
>The entry was just obtained via rvu_mbox_handler_npc_mcam_alloc_entry()
>and its slot is recorded in entry_rsp. When npc_cn20k_config_mcam_entry()
>returns -EINVAL, the function returns rc without freeing the allocated
>entry, and it still writes rsp->entry = entry_rsp.entry so the caller
>gets an error alongside what looks like a valid entry index.
>Should a failure here release the entry (and likely clear rsp->

I agree, but this also falls into the same category of proper error handling.
I believe we can take this up in a follow-up patch. Please advise, since
handling errors from all callers needs a lot of code changes.
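
The shape of the fix being discussed is roughly the following user-space sketch: release the freshly allocated entry when the configure step fails, and report the entry index only on success. The allocator, the configure step, and all names here are hypothetical stand-ins, not the driver's functions.

```c
#include <errno.h>
#include <stdbool.h>

#define SKETCH_NUM_ENTRIES 4

/* Hypothetical allocation bitmap standing in for the MCAM entry table. */
static bool sketch_in_use[SKETCH_NUM_ENTRIES];

static int sketch_alloc_entry(void)
{
	for (int i = 0; i < SKETCH_NUM_ENTRIES; i++) {
		if (!sketch_in_use[i]) {
			sketch_in_use[i] = true;
			return i;
		}
	}
	return -ENOSPC;
}

static void sketch_free_entry(int entry)
{
	sketch_in_use[entry] = false;
}

/* Hypothetical configure step: fails when the key type is invalid. */
static int sketch_config_entry(int entry, int kw_type)
{
	(void)entry;
	return kw_type < 0 ? -EINVAL : 0;
}

/* Allocate, configure, and undo the allocation if configuring fails. */
static int sketch_alloc_and_write(int kw_type, int *entry_out)
{
	int entry, rc;

	entry = sketch_alloc_entry();
	if (entry < 0)
		return entry;

	rc = sketch_config_entry(entry, kw_type);
	if (rc) {
		sketch_free_entry(entry);	/* do not leak the slot */
		return rc;			/* entry_out stays untouched */
	}

	*entry_out = entry;
	return 0;
}
```

In the real handler the rollback would also have to respect the mcam lock and the existing free path, which is part of why this is larger than the current fix.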

>> diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
>> index c2ca5ed1d028..ecaf0946b852 100644
>> --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
>> +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
>[ ... ]
>> @@ -3266,7 +3270,10 @@ int rvu_mbox_handler_npc_mcam_shift_entry(struct rvu *rvu,
>>  		npc_enable_mcam_entry(rvu, mcam, blkaddr, new_entry, false);
>>
>>  		/* Copy rule from old entry to new entry */
>> -		npc_copy_mcam_entry(rvu, mcam, blkaddr, old_entry, new_entry);
>> +		if (npc_copy_mcam_entry(rvu, mcam, blkaddr, old_entry, new_entry)) {
>> +			rc = NPC_MCAM_INVALID_REQ;
>> +			break;
>> +		}
>>
>>  		/* Copy counter mapping, if any */
>>  		cntr = mcam->entry2cntr_map[old_entry];
>> @@ -3284,7 +3291,8 @@ int rvu_mbox_handler_npc_mcam_shift_entry(struct rvu *rvu,
>>
>>  	/* If shift has failed then report the failed index */
>>  	if (index != req->shift_count) {
>> -		rc = NPC_MCAM_PERM_DENIED;
>> +		if (!rc)
>> +			rc = NPC_MCAM_PERM_DENIED;
>>  		rsp->failed_entry_idx = index;
>>  	}
>Is NPC_MCAM_INVALID_REQ the intended code for a copy failure here?
>The sibling error paths in this loop propagate whatever
>npc_mcam_verify_entry() returns (NPC_MCAM_INVALID_REQ or
>NPC_MCAM_PERM_DENIED), and the post-loop block used to set
>NPC_MCAM_PERM_DENIED unconditionally. The copy-failure case picks
>NPC_MCAM_INVALID_REQ which is inconsistent with the existing codes
>for mid-shift failures in the same function.

Here the earlier rc was getting overwritten, which is wrong. Handling error
codes properly is a bigger piece of work and will be taken up as a follow-up.
I believe this fix does not need to handle that. Please advise.
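
The pattern in the hunk above, reduced to a user-space sketch: an error set inside the loop is kept, and the post-loop default applies only when nothing was recorded. The step enum, the function name, and the numeric codes are placeholders for illustration, not the driver's definitions.

```c
/* Placeholder codes for illustration only; the real ones are in mbox.h. */
enum { SKETCH_INVALID_REQ = -701, SKETCH_PERM_DENIED = -704 };

/* Hypothetical outcome of processing one entry in the shift loop. */
enum sketch_step { STEP_OK, STEP_COPY_FAIL, STEP_STOP };

static int sketch_shift(const enum sketch_step *steps, int count,
			int *failed_idx)
{
	int rc = 0, index;

	for (index = 0; index < count; index++) {
		if (steps[index] == STEP_COPY_FAIL) {
			rc = SKETCH_INVALID_REQ;	/* specific error */
			break;
		}
		if (steps[index] == STEP_STOP)
			break;	/* leaves the loop without setting rc */
	}

	/* Report the failed index; keep any rc the loop already set. */
	if (index != count) {
		if (!rc)
			rc = SKETCH_PERM_DENIED;
		*failed_idx = index;
	}
	return rc;
}
```

Without the `if (!rc)` guard, the copy-failure code would be replaced by the post-loop default, which is the overwrite this patch avoids.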

Thread overview: 16+ messages
2026-04-27  6:32 [PATCH v4 net 0/10] octeontx2-af: npc: cn20k: MCAM fixes Ratheesh Kannoth
2026-04-27  6:32 ` [PATCH v4 net 01/10] octeontx2-af: npc: cn20k: Propagate MCAM key-type errors on cn20k Ratheesh Kannoth
2026-04-27  9:45   ` Ratheesh Kannoth [this message]
2026-04-27  6:32 ` [PATCH v4 net 02/10] octeontx2-af: npc: cn20k: Drop debugfs_create_file() error checks in init Ratheesh Kannoth
2026-04-27  6:32 ` [PATCH v4 net 03/10] octeontx2-af: npc: cn20k: Propagate errors in defrag MCAM alloc rollback Ratheesh Kannoth
2026-04-27  9:56   ` Ratheesh Kannoth
2026-04-27  6:32 ` [PATCH v4 net 04/10] octeontx2-af: npc: cn20k: Fix target map and rule Ratheesh Kannoth
2026-04-27  6:32 ` [PATCH v4 net 05/10] octeontx2-af: npc: cn20k: Clear MCAM entries by index and key width Ratheesh Kannoth
2026-04-27 10:01   ` Ratheesh Kannoth
2026-04-27  6:32 ` [PATCH v4 net 06/10] octeontx2-af: npc: cn20k: Fix bank value Ratheesh Kannoth
2026-04-27  6:32 ` [PATCH v4 net 07/10] octeontx2-af: npc: cn20k: Fix MCAM actions read Ratheesh Kannoth
2026-04-27  6:32 ` [PATCH v4 net 08/10] octeontx2-af: npc: cn20k: Initialize default-rule index outputs up front Ratheesh Kannoth
2026-04-27  6:32 ` [PATCH v4 net 09/10] octeontx2-af: npc: cn20k: Tear down default MCAM rules explicitly on free Ratheesh Kannoth
2026-04-27 10:09   ` Ratheesh Kannoth
2026-04-27  6:32 ` [PATCH v4 net 10/10] octeontx2-af: npc: cn20k: Reject missing default-rule MCAM indices Ratheesh Kannoth
2026-04-27 10:13   ` Ratheesh Kannoth
