Re: [PATCH v2 net 0/11] octeontx2-af: npc: cn20k: MCAM fixes

public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed

From: Ratheesh Kannoth <rkannoth@marvell.com>
To: Paolo Abeni <pabeni@redhat.com>
Cc: <netdev@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<sgoutham@marvell.com>, <davem@davemloft.net>,
	<edumazet@google.com>, <kuba@kernel.org>, <andrew+netdev@lunn.ch>,
	<dan.carpenter@linaro.org>
Subject: Re: [PATCH v2 net 0/11] octeontx2-af: npc: cn20k: MCAM fixes
Date: Thu, 23 Apr 2026 15:50:32 +0530	[thread overview]
Message-ID: <aenycIVFXeB2oCcK@rkannoth-OptiPlex-7090> (raw)
In-Reply-To: <a93c34d6-8c9f-417e-9ed2-ff4a9f668d11@redhat.com>

On 2026-04-23 at 15:14:10, Paolo Abeni (pabeni@redhat.com) wrote:
> On 4/20/26 4:34 AM, Ratheesh Kannoth wrote:
> > This series tightens Marvell OcteonTX2 AF NPC support for CN20K
> > silicon around MCAM key typing, optional debugfs setup, defrag
> > allocation rollback, x2 versus x4 KEX profiles and default-rule
> > allocation, logical MCAM clear and configuration, default-rule index
> > bookkeeping and explicit teardown, and NIXLF reserved-slot lookup when
> > default rules are missing.
> >
> > Patches 1 through 3 focus on AF error handling: propagate
> > npc_mcam_idx_2_key_type() failures through cn20k MCAM enable, config,
> > copy, and read paths; treat cn20k NPC debugfs files as optional so
> > probe does not fail when debugfs is unavailable; and fix defrag MCAM
> > allocation rollback so allocation errno is not overwritten by subbank
> > index resolution.
> >
> > Patches 4 and 5 align default-rule and flow-install behaviour with the
> > loaded mkex profile: prefer x4 default entries when the profile is x4,
> > and reject x4 flow keys when the profile is strictly x2.
> >
> > Patches 6 through 8 refine cn20k MCAM programming: clear entries by
> > logical index and resolved key width, fix bank and CFG sequencing in
> > npc_cn20k_config_mcam_entry(), and read action metadata from the
> > correct bank in npc_cn20k_read_mcam_entry().
> >
> > Patches 9 through 11 complete default-rule lifecycle handling:
> > initialize all default-rule index outputs up front, tear down default
> > MCAM rules explicitly (coordinated with npc_mcam_free_all_entries()),
> > and reject USHRT_MAX sentinel indices in npc_get_nixlf_mcam_index()
> > for cn20k.
> >
> > Ratheesh Kannoth (11):
> >   octeontx2-af: npc: cn20k: Propagate MCAM key-type errors on cn20k
> >   octeontx2-af: npc: cn20k: Drop debugfs_create_file() error checks in
> >   init
> >   octeontx2-af: npc: cn20k: Propagate errors in defrag MCAM alloc
> >   rollback
> >   octeontx2-af: npc: cn20k: Make default entries as x4.
> >   octeontx2-af: npc: cn20k: Reject request for x4 entries in x2
> >   profile.
> >   octeontx2-af: npc: cn20k: Clear MCAM entries by index and key width
> >   octeontx2-af: npc: cn20k: Fix bank value.
> >   octeontx2-af: npc: cn20k: Fix MCAM actions read
> >   octeontx2-af: npc: cn20k: Initialize default-rule index outputs up
> >   front
> >   octeontx2-af: npc: cn20k: Tear down default MCAM rules explicitly on
> >   free
> >   octeontx2-af: npc: cn20k: Reject missing default-rule MCAM indices
> >
> > Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
> >
> > --
> > v1 -> v2: Addressed simon comments. Added more patch fixes to this series.
> > 	https://lore.kernel.org/netdev/20260418162013.GG280379@horms.kernel.org/
>
> I strongly suggest avoid extending a series to address issues found by
> the AI review, unless such issues are actual regressions/problems
> introduced by the new code. Sashiko tends to find a lot of collateral
> problem and the series could easily expand beyond any hope of merging.
>
> Instead you should reply on the ML, commenting on sashiko's review,
> explain why or why not the outlined problems are valid and/or should be
> handled separately.
>
> Thanks,
>
> Paolo
>

Patch 1: [PATCH v2 net 01/11] octeontx2-af: npc: cn20k: Propagate MCAM key-type errors on cn20k

>Is it intentional to discard the return value of npc_cn20k_enable_mcam_entry()
>here?
>If disabling an entry fails during teardown (e.g., in
>rvu_npc_disable_mcam_entries()), this could silently leave stale, active packet
>steering rules in the hardware.

Yes, the return value of npc_cn20k_enable_mcam_entry() is
intentionally discarded in this context.

The primary reason is that the validity of the MCAM index is
already verified prior to this call, so the function is not
expected to fail under normal conditions.

Additionally, during
rvu_npc_disable_mcam_entries()), failure handling is
best-effort. At this stage, the system is already
transitioning state, and retrying or propagating errors
does not provide meaningful recovery.

Introducing strict error handling here would require
propagating return values through multiple call sites,
which would significantly increase the scope and
complexity of this change.

Any modification to the return type or contract of
npc_enable_mcam_entry() would involve a broader refactor
across the driver and is therefore intentionally deferred
to a separate, dedicated change.

Given these constraints, ignoring the return value here is
a deliberate design decision and does not introduce
functional risk beyond what is already accepted in
teardown scenarios.

>If npc_update_dmac_value() returns an error, its caller in rvu_npc_fs.c seems
>to ignore it:
>			if (rule->vfvlan_cfg)
>				npc_update_dmac_value(rvu, blkaddr, rule, pfvf);
>			if (rule->rx_action.op == NIX_RX_ACTION_DEFAULT) {
>				...
>			}
>			npc_enable_mcam_entry(rvu, mcam, blkaddr,
>					      rule->entry, true);
>Could this lead to the code continuing and activating the rule in hardware with
>an invalid or outdated MAC address?

Handling the error returned by npc_update_dmac_value() only
at this call site would not be sufficient.

For correctness, the error would need to be propagated to
higher-level callers, which would require changes across
multiple layers of the call chain. This significantly
increases the scope and complexity of the change.

Such propagation would also introduce additional handling
requirements in several paths, potentially leading to more
issues being flagged by static analysis tools.

Given that this patch is focused on a specific fix,
extending it to include broader error-handling changes is
out of scope.

Therefore, this is being treated as a separate enhancement
and is intentionally not addressed here.

Patch 4: [PATCH v2 net 04/11] octeontx2-af: npc: cn20k: Make default entries as x4.

>Does this in-place modification break the sorting invariant of the mcam_rules
>list?
>In rvu_mcam_add_rule(), the mcam->mcam_rules list appears to be strictly
>maintained in ascending order by entry. Modifying rule->entry in-place
>without removing and correctly re-inserting the node could leave the list
>unsorted.
>Could this cause subsequent rule insertions or debug routines that rely on
>the list order to behave incorrectly?

The mcam_rules list is not required to maintain a strict
sorting invariant for correctness.

Although rvu_mcam_add_rule() inserts entries in ascending
order, there is no functional dependency on this ordering
in subsequent operations.

All lookups and operations on the mcam_rules list iterate
over the entire list rather than relying on ordering.
Therefore, modifying rule->entry in place does not impact
correctness, even if the list becomes unsorted.

As a result, leaving the list temporarily or permanently
unsorted does not lead to incorrect behavior in rule
insertion, lookup, or debug routines.

>While this static structure predates this patch, can this design cause state
>contamination across multiple devices?
>Because pfl_info is static, it is shared globally across all network
>interfaces and physical NICs on the host. However, it is protected by
>mutex_lock(&pfvf->mbox.lock), which is a per-NIC lock.
>If multiple interfaces probe concurrently, they could acquire their own locks
>and race on this global struct. Additionally, if the host has multiple cards
>configured with different hardware profiles (such as one X2 and one X4), the
>first card to query the hardware will permanently cache its profile for all
>subsequent cards.
>Would it be safer to cache this hardware profile data in a device-private
> +			 */
> +			WARN(1, "%s Error deleting default entries (pcifunc=%#x) mcam_idx=%u\n",
> +			     __func__, pcifunc, ptr[i]);
> +			return;
> +		}
AF processes mailbox (mbox) requests from PFs and VFs
serially. This guarantees that initialization of pfl_info
is not performed concurrently by multiple devices.

The profile information is populated once by AF during NPC
initialization and remains constant for the lifetime of the
DUT. There are no subsequent writes or updates to this data
after initialization.

As a result, pfl_info effectively becomes read-only after
being set. All later accesses from different PFs or VFs are
only reading the same stable value.

Even if multiple interfaces probe concurrently, they only
perform reads after initialization is complete. Concurrent
reads of immutable data do not introduce race conditions.

Regarding systems with multiple NICs or different hardware
profiles, the AF provides a consistent hardware view per
system. The profile data exposed through mbox is not
expected to vary dynamically per device after init.

Therefore, using a static structure here does not lead to
state contamination or incorrect behavior, and no
additional per-device storage is required in this context.

Patch 10: [PATCH v2 net 10/11] octeontx2-af: npc: cn20k: Tear down default MCAM rules explicitly on free

>> +		rc = npc_cn20k_idx_free(rvu, &ptr[i], 1);
>> +		if (rc) {
>> +			/* Non recoverable error. Let us WARN and return. Keep system alive to
>> +			 * enable debugging
>> +			 */
>> +			WARN(1, "%s Error deleting default entries (pcifunc=%#x) mcam_idx=%u\n",
>> +			     __func__, pcifunc, ptr[i]);
>> +			return;
>> +		}
>Does this early return on failure skip the software list cleanup and leak
>the rvu_npc_mcam_rule list nodes for both this index and any successfully
>freed indices?
>Since the final list cleanup loop is bypassed, it appears these software
>rules remain permanently. Also, does this skip hardware cleanup for any
>remaining default rules in the array?
Default MCAM rules are allocated upfront during
initialization, and their presence is required for correct
system operation. A failure during allocation itself would
already indicate a critical issue.

If npc_cn20k_idx_free() fails, it indicates that freeing of
previously allocated indices has failed. This suggests a
serious inconsistency in the allocation/free logic rather
than a recoverable runtime error.

In such a scenario, attempting partial cleanup (either
software list or remaining hardware entries) is unsafe, as
it can further corrupt the allocation state and make
debugging significantly harder.

Therefore, the function intentionally emits a WARN and
returns early to preserve the current state for debugging
purposes. This avoids masking the underlying issue.

While this does skip subsequent cleanup of software list
nodes and remaining hardware entries, continuing execution
after such a failure is not considered safe or meaningful.

This condition is treated as a non-recoverable error and
indicates a deeper issue in the allocation/free algorithm
that must be investigated separately.

>+	}
>x_handler_npc_cn20k_mcam_alloc_and_write_entry
>> +
>> +	mutex_lock(&mcam->lock);
>> +	list_for_each_entry_safe(rule, tmp, &mcam->mcam_rules, list) {
>> +		for (int i = 0; i < 4; i++) {
>> +			if (ptr[i] != rule->entry)
>> +				continue;
>> +
>> +			list_del(&rule->list);
>> +			kfree(rule);
>> +			break;
>> +		}
>> +	}
>> +	mutex_unlock(&mcam->lock);
>Can dropping mcam->lock during npc_cn20k_idx_free() lead to a race
>condition? When the hardware index is freed, a concurrent thread could
>immediately allocate it and insert a new rule into mcam->mcam_rules with
>the same entry ID.

>Later, when this thread re-acquires the lock to clean up the software
>rules, the inner loop break only exits the for loop, meaning the list
>traversal continues. Would this cause both the old rule and the newly
>allocated concurrent rule to be deleted and freed, leading to a
>use-after-free for the active thread?

No. There is no concurrent threads;
default indexes are allocated thru mbox messages to AF.
AF process mbox messages processed one by one. So there is no chance of concurrency.

     prev parent reply	other threads:[~2026-04-23 10:20 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-20  2:34 [PATCH v2 net 0/11] octeontx2-af: npc: cn20k: MCAM fixes Ratheesh Kannoth
2026-04-20  2:34 ` [PATCH v2 net 01/11] octeontx2-af: npc: cn20k: Propagate MCAM key-type errors on cn20k Ratheesh Kannoth
2026-04-23  9:38   ` [v2,net,01/11] " Paolo Abeni
2026-04-23 10:01   ` [PATCH v2 net 01/11] " Simon Horman
2026-04-20  2:34 ` [PATCH v2 net 02/11] octeontx2-af: npc: cn20k: Drop debugfs_create_file() error checks in init Ratheesh Kannoth
2026-04-20  2:34 ` [PATCH v2 net 03/11] octeontx2-af: npc: cn20k: Propagate errors in defrag MCAM alloc rollback Ratheesh Kannoth
2026-04-23 11:13   ` Simon Horman
2026-04-20  2:34 ` [PATCH v2 net 04/11] octeontx2-af: npc: cn20k: Make default entries as x4 Ratheesh Kannoth
2026-04-23 12:42   ` Simon Horman
2026-04-20  2:34 ` [PATCH v2 net 05/11] octeontx2-af: npc: cn20k: Reject request for x4 entries in x2 profile Ratheesh Kannoth
2026-04-20  2:34 ` [PATCH v2 net 06/11] octeontx2-af: npc: cn20k: Clear MCAM entries by index and key width Ratheesh Kannoth
2026-04-23  9:39   ` [v2,net,06/11] " Paolo Abeni
2026-04-20  2:34 ` [PATCH v2 net 07/11] octeontx2-af: npc: cn20k: Fix bank value Ratheesh Kannoth
2026-04-20  2:34 ` [PATCH v2 net 08/11] octeontx2-af: npc: cn20k: Fix MCAM actions read Ratheesh Kannoth
2026-04-20  2:34 ` [PATCH v2 net 09/11] octeontx2-af: npc: cn20k: Initialize default-rule index outputs up front Ratheesh Kannoth
2026-04-20  2:34 ` [PATCH v2 net 10/11] octeontx2-af: npc: cn20k: Tear down default MCAM rules explicitly on free Ratheesh Kannoth
2026-04-20  2:34 ` [PATCH v2 net 11/11] octeontx2-af: npc: cn20k: Reject missing default-rule MCAM indices Ratheesh Kannoth
2026-04-23  9:44 ` [PATCH v2 net 0/11] octeontx2-af: npc: cn20k: MCAM fixes Paolo Abeni
2026-04-23 10:20   ` Ratheesh Kannoth [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aenycIVFXeB2oCcK@rkannoth-OptiPlex-7090 \
    --to=rkannoth@marvell.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=dan.carpenter@linaro.org \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=sgoutham@marvell.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox