Netdev List
 help / color / mirror / Atom feed
From: Marcin Szycik <marcin.szycik@linux.intel.com>
To: Petr Oros <poros@redhat.com>, netdev@vger.kernel.org
Cc: Ivan Vecera <ivecera@redhat.com>,
	Alice Michael <alice.michael@intel.com>,
	Przemek Kitszel <przemyslaw.kitszel@intel.com>,
	Eric Dumazet <edumazet@google.com>,
	linux-kernel@vger.kernel.org,
	Martyna Szapar-Mudlaw <martyna.szapar-mudlaw@linux.intel.com>,
	Andrew Lunn <andrew+netdev@lunn.ch>,
	Tony Nguyen <anthony.l.nguyen@intel.com>,
	Simon Horman <horms@kernel.org>,
	intel-wired-lan@lists.osuosl.org,
	Jacob Keller <jacob.e.keller@intel.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: Re: [Intel-wired-lan] [PATCH iwl-net v3 1/2] ice: skip per-VLAN promisc rules when default VSI Rx rule is set
Date: Fri, 3 Jul 2026 18:34:30 +0200	[thread overview]
Message-ID: <f03d0930-9f51-45ad-9ed4-e9df335b8fa7@linux.intel.com> (raw)
In-Reply-To: <20260701133601.2118382-2-poros@redhat.com>



On 01.07.2026 15:36, Petr Oros wrote:
> When an ice port in a vlan-filtering bridge goes promiscuous (typical for
> bond slaves), the driver installs a per-VLAN ICE_SW_LKUP_PROMISC_VLAN rule
> for every VID on top of the broad ICE_SW_LKUP_DFLT VSI Rx rule. Each rule
> consumes one of the ~32K Flow Lookup Unit (FLU) entries the device shares
> across PFs, so a wide trunk (vid 2-4094) over several PFs overruns the
> pool: firmware rejects further Add Switch Rules with ENOSPC (AQ 0x10) and
> the DFLT Rx rule itself fails to install:
> 
>   ice 0000:5c:00.1: Failed to set VSI 14 as the default forwarding
>                     VSI, error -5
>   ice 0000:5c:00.1 ens1f1: Error -5 setting default VSI 14 Rx rule
> 
> Once a switch context is overrun the retries can also come back as ENOENT
> (AQ 0x2), which has misled triage toward a perceived recipe binding defect
> rather than a capacity issue.
> 
> The DFLT rule already catches every packet on the port regardless of VLAN
> tag, so the per-VLAN promisc expansion is redundant while it is installed.
> Skip it at the two sites that drive it, ice_set_promisc() and
> ice_vlan_rx_add_vid(), keyed on ice_is_vsi_dflt_vsi() rather than the
> netdev IFF_PROMISC flag so a failed or LAG-suppressed DFLT install still
> falls back to the per-VLAN rules.
> 
> IFF_ALLMULTI and IFF_PROMISC can reach ice_vsi_sync_fltr() in separate
> passes (a bridge join sets them through separate calls), so the allmulti
> pass may expand the per-VID rules before the DFLT rule exists. Drop those
> now-redundant rules right after ice_set_dflt_vsi() installs the DFLT rule;
> ice_vsi_exit_dflt_promisc() reinstates them when promisc is cleared.
> 
> ice_vsi_sync_fltr() subscribed multicast promiscuity only inside the
> "default VSI not yet in use" branch, so a promiscuous VSI that finds the
> default VSI rule already present (owned by another VSI, or preserved
> across a switchdev session) ended up in unicast promisc with no multicast
> subscription. Issue ice_set_promisc(ICE_MCAST_PROMISC_BITS) whenever the
> netdev is promiscuous; it is idempotent and returns 0 if the rule is
> already present.
> 
> Fixes: 1273f89578f2 ("ice: Fix broken IFF_ALLMULTI handling")
> Signed-off-by: Petr Oros <poros@redhat.com>
> ---
> v3:
> - Dropped the two vid=0 ICE_SW_LKUP_PROMISC <-> ICE_SW_LKUP_PROMISC_VLAN
>   recipe-swap guards in ice_vlan_rx_add_vid() and ice_vlan_rx_kill_vid();
>   each swap is net-zero and guarding the demote stranded the vid=0 rule
>   in ICE_SW_LKUP_PROMISC_VLAN when the last VLAN was removed under the
>   DFLT rule. Reported by review.
> - Drop the now-redundant per-VID multicast promisc rules right after
>   ice_set_dflt_vsi(). A bridge join raises IFF_ALLMULTI and IFF_PROMISC
>   in separate sync passes, so the allmulti pass expands the per-VID rules
>   before the DFLT rule exists; the cleanup keeps them from lingering and
>   exhausting the FLU pool. ice_vsi_exit_dflt_promisc() reinstates them on
>   promisc off. Reported by review.
> - Issue ice_set_promisc(ICE_MCAST_PROMISC_BITS) whenever the netdev is
>   promiscuous, not only when this VSI installs the default VSI rule, so
>   multicast promisc is not lost when the rule is already in use (owned by
>   another VSI, or preserved across a switchdev session). Reported by
>   review.
> - Hoisted the combined VLAN promisc mask in ice_clear_promisc() into a
>   local for alignment. Dropped Aleksandr's Reviewed-by since the code
>   changed.
> 
> v2: https://lore.kernel.org/all/20260622113428.2565255-2-poros@redhat.com/
> v1: https://lore.kernel.org/all/89efbea9831175e6f57e9fe8557f7a0e48e050b7.1781786935.git.poros@redhat.com/
> ---
>  drivers/net/ethernet/intel/ice/ice_main.c | 111 ++++++++++++++++++----
>  1 file changed, 90 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> index b43d420ece99ca..a84de6cf6eb078 100644
> --- a/drivers/net/ethernet/intel/ice/ice_main.c
> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> @@ -274,7 +274,8 @@ static int ice_set_promisc(struct ice_vsi *vsi, u8 promisc_m)
>  	if (vsi->type != ICE_VSI_PF)
>  		return 0;
>  
> -	if (ice_vsi_has_non_zero_vlans(vsi)) {
> +	/* skip per-VID expansion; the DFLT Rx rule already covers every VID */
> +	if (ice_vsi_has_non_zero_vlans(vsi) && !ice_is_vsi_dflt_vsi(vsi)) {
>  		promisc_m |= (ICE_PROMISC_VLAN_RX | ICE_PROMISC_VLAN_TX);
>  		status = ice_fltr_set_vlan_vsi_promisc(&vsi->back->hw, vsi,
>  						       promisc_m);
> @@ -304,9 +305,20 @@ static int ice_clear_promisc(struct ice_vsi *vsi, u8 promisc_m)
>  		return 0;
>  
>  	if (ice_vsi_has_non_zero_vlans(vsi)) {
> -		promisc_m |= (ICE_PROMISC_VLAN_RX | ICE_PROMISC_VLAN_TX);
> +		u8 vlan_promisc_m = promisc_m | ICE_PROMISC_VLAN_RX |
> +				    ICE_PROMISC_VLAN_TX;
> +		int vid0_status;
> +
> +		/* set time used either recipe (per-VID PROMISC_VLAN, or vid=0

I find this sentence hard to understand - did you mean "ice_set_promisc() used
either recipe..."?

> +		 * PROMISC via the ice_set_promisc() else branch), so clear
> +		 * both; clearing an absent rule succeeds

What do you mean by this? Both will return -EEXIST if rule is absent. There can
also be other errors.

> +		 */
>  		status = ice_fltr_clear_vlan_vsi_promisc(&vsi->back->hw, vsi,
> -							 promisc_m);
> +							 vlan_promisc_m);
> +		vid0_status = ice_fltr_clear_vsi_promisc(&vsi->back->hw,
> +							 vsi->idx, promisc_m, 0);
> +		if (status == 0)
> +			status = vid0_status;
>  	} else {
>  		status = ice_fltr_clear_vsi_promisc(&vsi->back->hw, vsi->idx,
>  						    promisc_m, 0);
> @@ -317,6 +329,59 @@ static int ice_clear_promisc(struct ice_vsi *vsi, u8 promisc_m)
>  	return status;
>  }
>  
> +/**
> + * ice_vsi_exit_dflt_promisc - drop the default VSI Rx rule on promisc off
> + * @vsi: the VSI leaving promiscuous mode
> + *
> + * For an IFF_ALLMULTI VSI with VLANs the per-VID multicast rules are
> + * reinstated before the default rule is cleared so coverage never lapses;
> + * the then redundant vid=0 rule is dropped best-effort. The callees log
> + * their own failures, so error returns are not re-logged here.
> + *
> + * Return: 0 on success, negative on error with the default rule left in place.
> + */
> +static int ice_vsi_exit_dflt_promisc(struct ice_vsi *vsi)
> +{
> +	struct ice_vsi_vlan_ops *vlan_ops = ice_get_compat_vsi_vlan_ops(vsi);
> +	struct net_device *netdev = vsi->netdev;
> +	struct ice_hw *hw = &vsi->back->hw;
> +	bool restore_mc;
> +	int err;
> +
> +	restore_mc = (vsi->current_netdev_flags & IFF_ALLMULTI) &&
> +		     ice_vsi_has_non_zero_vlans(vsi);
> +
> +	if (restore_mc) {
> +		err = ice_fltr_set_vlan_vsi_promisc(hw, vsi,
> +						    ICE_MCAST_VLAN_PROMISC_BITS);
> +		if (err && err != -EEXIST)
> +			return err;
> +	}
> +
> +	err = ice_clear_dflt_vsi(vsi);
> +	if (err)
> +		return err;
> +
> +	if (netdev->features & NETIF_F_HW_VLAN_CTAG_FILTER)
> +		vlan_ops->ena_rx_filtering(vsi);
> +
> +	if (restore_mc)
> +		ice_fltr_clear_vsi_promisc(hw, vsi->idx, ICE_MCAST_PROMISC_BITS,
> +					   0);
> +
> +	return 0;
> +}
> +
> +/* Drop the per-VID multicast promisc rules, redundant once the default
> + * VSI Rx rule covers every VID. A no-op when the VSI has no VLANs.
> + */
> +static void ice_vsi_clear_vlan_mc_promisc(struct ice_vsi *vsi)
> +{
> +	if (ice_vsi_has_non_zero_vlans(vsi))

Nit: could flip condition to decrease indent level.

> +		ice_fltr_clear_vlan_vsi_promisc(&vsi->back->hw, vsi,
> +						ICE_MCAST_VLAN_PROMISC_BITS);

Error code ignored, not sure if intentionally.

> +}
> +
>  /**
>   * ice_vsi_sync_fltr - Update the VSI filter list to the HW
>   * @vsi: ptr to the VSI
> @@ -429,30 +494,35 @@ static int ice_vsi_sync_fltr(struct ice_vsi *vsi)
>  				err = 0;
>  				vlan_ops->dis_rx_filtering(vsi);
>  
> -				/* promiscuous mode implies allmulticast so
> -				 * that VSIs that are in promiscuous mode are
> -				 * subscribed to multicast packets coming to
> -				 * the port
> +				/* DFLT now covers every VID; drop the per-VID
> +				 * multicast promisc rules a prior IFF_ALLMULTI
> +				 * pass may have installed (separate passes on a
> +				 * bridge join) so they do not linger and exhaust
> +				 * the FLU pool. exit_dflt_promisc() reinstates

Please use the full function name.

> +				 * them on promisc off.
>  				 */
> -				err = ice_set_promisc(vsi,
> -						      ICE_MCAST_PROMISC_BITS);
> -				if (err)
> -					goto out_promisc;
> +				ice_vsi_clear_vlan_mc_promisc(vsi);
>  			}
> +
> +			/* Promiscuous mode implies allmulticast. Subscribe
> +			 * the VSI to all multicast even when the default VSI
> +			 * rule is already in use and the block above is
> +			 * skipped (it may be owned by another VSI, or
> +			 * preserved across a switchdev session); the unicast
> +			 * catch-all does not cover the multicast subscription.
> +			 */
> +			err = ice_set_promisc(vsi, ICE_MCAST_PROMISC_BITS);
> +			if (err)
> +				goto out_promisc;
>  		} else {
>  			/* Clear Rx filter to remove traffic from wire */
>  			if (ice_is_vsi_dflt_vsi(vsi)) {
> -				err = ice_clear_dflt_vsi(vsi);
> +				err = ice_vsi_exit_dflt_promisc(vsi);
>  				if (err) {
> -					netdev_err(netdev, "Error %d clearing default VSI %i Rx rule\n",
> -						   err, vsi->vsi_num);
>  					vsi->current_netdev_flags |=
>  						IFF_PROMISC;
>  					goto out_promisc;
>  				}
> -				if (vsi->netdev->features &
> -				    NETIF_F_HW_VLAN_CTAG_FILTER)
> -					vlan_ops->ena_rx_filtering(vsi);
>  			}
>  
>  			/* disable allmulti here, but only if allmulti is not
> @@ -3676,10 +3746,9 @@ int ice_vlan_rx_add_vid(struct net_device *netdev, __be16 proto, u16 vid)
>  	while (test_and_set_bit(ICE_CFG_BUSY, vsi->state))
>  		usleep_range(1000, 2000);
>  
> -	/* Add multicast promisc rule for the VLAN ID to be added if
> -	 * all-multicast is currently enabled.
> -	 */
> -	if (vsi->current_netdev_flags & IFF_ALLMULTI) {
> +	/* skip the per-VID rule when the DFLT Rx rule already covers this VID */
> +	if ((vsi->current_netdev_flags & IFF_ALLMULTI) &&
> +	    !ice_is_vsi_dflt_vsi(vsi)) {
>  		ret = ice_fltr_set_vsi_promisc(&vsi->back->hw, vsi->idx,
>  					       ICE_MCAST_VLAN_PROMISC_BITS,
>  					       vid);
Thanks,
Marcin

  reply	other threads:[~2026-07-03 16:34 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-07-01 13:35 [PATCH iwl-net v3 0/2] ice: fix DFLT Rx rule handling for promisc and switchdev Petr Oros
2026-07-01 13:36 ` [PATCH iwl-net v3 1/2] ice: skip per-VLAN promisc rules when default VSI Rx rule is set Petr Oros
2026-07-03 16:34   ` Marcin Szycik [this message]
2026-07-01 13:36 ` [PATCH iwl-net v3 2/2] ice: preserve uplink DFLT Rx rule on switchdev release Petr Oros
2026-07-03 16:36   ` [Intel-wired-lan] " Marcin Szycik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f03d0930-9f51-45ad-9ed4-e9df335b8fa7@linux.intel.com \
    --to=marcin.szycik@linux.intel.com \
    --cc=alice.michael@intel.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=anthony.l.nguyen@intel.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=horms@kernel.org \
    --cc=intel-wired-lan@lists.osuosl.org \
    --cc=ivecera@redhat.com \
    --cc=jacob.e.keller@intel.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=martyna.szapar-mudlaw@linux.intel.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=poros@redhat.com \
    --cc=przemyslaw.kitszel@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox