public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: Ratheesh Kannoth <rkannoth@marvell.com>
To: <netdev@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Cc: <sgoutham@marvell.com>, <davem@davemloft.net>,
	<edumazet@google.com>, <kuba@kernel.org>, <pabeni@redhat.com>,
	<andrew+netdev@lunn.ch>
Subject: Re: [PATCH v5 net 09/10] octeontx2-af: npc: cn20k: Tear down default MCAM rules explicitly on free
Date: Thu, 30 Apr 2026 09:45:44 +0530	[thread overview]
Message-ID: <afLXcOpeJB556CYg@rkannoth-OptiPlex-7090> (raw)
In-Reply-To: <20260429022722.1110289-10-rkannoth@marvell.com>

On 2026-04-29 at 07:57:21, Ratheesh Kannoth (rkannoth@marvell.com) wrote:
> npc_cn20k_dft_rules_free() used the NPC MCAM mbox "free all" path, which
> does not match how cn20k tracks default-rule MCAM slots indexes.
>
> Resolve the default-rule indices, then for each valid slot clear the bitmap
> entry, drop the PF/VF map, disable the MCAM line, clear the target
> function, and npc_cn20k_idx_free(). Remove any matching software mcam_rules
> nodes. On hard failure from idx_free, WARN and stop so the box stays up for
> analysis.
>
> In npc_mcam_free_all_entries(), prefetch the same default-rule indices and,
> on cn20k, skip bitmap clear and idx_free when the scanned entry is one of
> those reserved defaults (they are released by npc_cn20k_dft_rules_free).
>
> Fixes: 09d3b7a1403f ("octeontx2-af: npc: cn20k: Allocate default MCAM indexes")
> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>

Patch 9: [PATCH v5 net 09/10] octeontx2-af: npc: cn20k: Tear down default MCAM rules explicitly on free

>  free_rules:
> +	blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPC, 0);
> +	if (blkaddr < 0)
> +		return;
>Can this lead to a resource leak if blkaddr < 0?
>The tracking entries are removed from xa_pf2dfl_rmap via xa_erase() earlier
>in the function, before validating the block address. If rvu_get_blkaddr()
>fails and returns early, the rules remain allocated but the tracking
>information is permanently lost.
This condition will never fail for NPC block. This condition is there because of legacy coding method where every function
has this check. We need to put an effort to remove all these check from all functions and should have only one check during probe time to
validate this. That should be a candidate for net-next.

> +	for (int i = 0; i < 4; i++) {
> +		if (ptr[i] == USHRT_MAX)
> +			continue;
>
> -	free_req.hdr.pcifunc = pcifunc;
> -	free_req.all = 1;
> -	rc = rvu_mbox_handler_npc_mcam_free_entry(rvu, &free_req, &rsp);
> -	if (rc)
> -		dev_err(rvu->dev,
> -			"%s: Error deleting default entries (pcifunc=%#x\n",
> -			__func__, pcifunc);
> +		mutex_lock(&mcam->lock);
> +		npc_mcam_clear_bit(mcam, ptr[i]);
> +		mcam->entry2pfvf_map[ptr[i]] = NPC_MCAM_INVALID_MAP;
> +		npc_cn20k_enable_mcam_entry(rvu, blkaddr, ptr[i], false);
> +		mcam->entry2target_pffunc[ptr[i]] = 0x0;
> +		mutex_unlock(&mcam->lock);
>Is it possible for another thread to allocate this index before the rules
>list is cleaned up?
>The mcam->lock is released immediately after npc_mcam_clear_bit() is called,
>which makes the index available in the software bitmap. If a concurrent
>thread allocates this same index and adds its own rule to mcam->mcam_rules,
>could the subsequent list_for_each_entry_safe() loop mistakenly delete and
>free the newly allocated rule?
No. These are allocated thru mbox messages and are processed serially by AF driver.
So multiple thread issue wont happen.

> +
> +		rc = npc_cn20k_idx_free(rvu, &ptr[i], 1);
> +		if (rc) {
> +			/* Non recoverable error. Let us WARN and return. Keep system alive to
> +			 * enable debugging
> +			 */
> +			WARN(1, "%s Error deleting default entries (pcifunc=%#x) mcam_idx=%u\n",
> +			     __func__, pcifunc, ptr[i]);
> +			return;
>Does returning here leak the mcam_rules entries?
>If npc_cn20k_idx_free() fails, the function returns early. This skips the
>trailing list deletion loop, permanently leaking the rule structures for the
>failed index and any previously successful indices. Additionally, because
>npc_mcam_clear_bit() was already called for these indices, they appear
>available in the software bitmap, meaning subsequent allocations might get an
>index while the old rule node is still dangling.
Yes. we need that state to debug the issue if it ever happens than freeing mcam rule entries, as mcam rule entries
debugfs provide more information to debug. so lets keep this way.

>  		}
> +
> +		/* Update entry2counter mapping */
> +		cntr = mcam->entry2cntr_map[index];
> +		if (cntr != NPC_MCAM_INVALID_MAP)
> +			npc_unmap_mcam_entry_and_cntr(rvu, mcam,
> +						      blkaddr, index,
> +						      cntr);
>Does this unintentionally disable default rules in hardware?
>The calls to npc_enable_mcam_entry(..., false) and
>npc_unmap_mcam_entry_and_cntr(...) now occur before the cn20k_dft_rl check
>and the if (!cn20k_dft_rl) block. If a VF resets its filters (e.g., via an
>all=1 free request), its default rules (like UCAST or BCAST) appear to be
>permanently disabled in hardware and their counters unmapped, which might
>cause the VF to lose network connectivity until completely reinitialized.
> +
> +		if (!is_cn20k(rvu->pdev) || cn20k_dft_rl)
> +			continue;
> +
> +		rc = npc_cn20k_idx_free(rvu, &index, 1);
> +		if (rc)
> +			dev_err(rvu->dev,
> +				"Failed to free mcam idx=%u pcifunc=%#x\n",
> +				index, pcifunc);
>  	}
>  }
VF has only UCAST rules.

  reply	other threads:[~2026-04-30  4:16 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-29  2:27 [PATCH v5 net 00/10] octeontx2-af: npc: cn20k: MCAM fixes Ratheesh Kannoth
2026-04-29  2:27 ` [PATCH v5 net 01/10] octeontx2-af: npc: cn20k: Propagate MCAM key-type errors on cn20k Ratheesh Kannoth
2026-04-30  4:05   ` Ratheesh Kannoth
2026-04-29  2:27 ` [PATCH v5 net 02/10] octeontx2-af: npc: cn20k: Drop debugfs_create_file() error checks in init Ratheesh Kannoth
2026-04-29  2:27 ` [PATCH v5 net 03/10] octeontx2-af: npc: cn20k: Propagate errors in defrag MCAM alloc rollback Ratheesh Kannoth
2026-04-30  4:10   ` Ratheesh Kannoth
2026-04-29  2:27 ` [PATCH v5 net 04/10] octeontx2-af: npc: cn20k: Fix target map and rule Ratheesh Kannoth
2026-04-30  4:13   ` Ratheesh Kannoth
2026-04-29  2:27 ` [PATCH v5 net 05/10] octeontx2-af: npc: cn20k: Clear MCAM entries by index and key width Ratheesh Kannoth
2026-04-29  2:27 ` [PATCH v5 net 06/10] octeontx2-af: npc: cn20k: Fix bank value Ratheesh Kannoth
2026-04-29  2:27 ` [PATCH v5 net 07/10] octeontx2-af: npc: cn20k: Fix MCAM actions read Ratheesh Kannoth
2026-04-29  2:27 ` [PATCH v5 net 08/10] octeontx2-af: npc: cn20k: Initialize default-rule index outputs up front Ratheesh Kannoth
2026-04-29  2:27 ` [PATCH v5 net 09/10] octeontx2-af: npc: cn20k: Tear down default MCAM rules explicitly on free Ratheesh Kannoth
2026-04-30  4:15   ` Ratheesh Kannoth [this message]
2026-04-29  2:27 ` [PATCH v5 net 10/10] octeontx2-af: npc: cn20k: Reject missing default-rule MCAM indices Ratheesh Kannoth
2026-04-30  4:17   ` Ratheesh Kannoth
2026-05-01  2:00 ` [PATCH v5 net 00/10] octeontx2-af: npc: cn20k: MCAM fixes patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=afLXcOpeJB556CYg@rkannoth-OptiPlex-7090 \
    --to=rkannoth@marvell.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=sgoutham@marvell.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox