From: Marcin Szycik <marcin.szycik@linux.intel.com>
To: Petr Oros <poros@redhat.com>, netdev@vger.kernel.org
Cc: Ivan Vecera <ivecera@redhat.com>,
Alice Michael <alice.michael@intel.com>,
Przemek Kitszel <przemyslaw.kitszel@intel.com>,
Eric Dumazet <edumazet@google.com>,
linux-kernel@vger.kernel.org,
Martyna Szapar-Mudlaw <martyna.szapar-mudlaw@linux.intel.com>,
Andrew Lunn <andrew+netdev@lunn.ch>,
Tony Nguyen <anthony.l.nguyen@intel.com>,
Simon Horman <horms@kernel.org>,
intel-wired-lan@lists.osuosl.org,
Jacob Keller <jacob.e.keller@intel.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
"David S. Miller" <davem@davemloft.net>
Subject: Re: [Intel-wired-lan] [PATCH iwl-net v3 1/2] ice: skip per-VLAN promisc rules when default VSI Rx rule is set
Date: Fri, 3 Jul 2026 18:34:30 +0200 [thread overview]
Message-ID: <f03d0930-9f51-45ad-9ed4-e9df335b8fa7@linux.intel.com> (raw)
In-Reply-To: <20260701133601.2118382-2-poros@redhat.com>
On 01.07.2026 15:36, Petr Oros wrote:
> When an ice port in a vlan-filtering bridge goes promiscuous (typical for
> bond slaves), the driver installs a per-VLAN ICE_SW_LKUP_PROMISC_VLAN rule
> for every VID on top of the broad ICE_SW_LKUP_DFLT VSI Rx rule. Each rule
> consumes one of the ~32K Flow Lookup Unit (FLU) entries the device shares
> across PFs, so a wide trunk (vid 2-4094) over several PFs overruns the
> pool: firmware rejects further Add Switch Rules with ENOSPC (AQ 0x10) and
> the DFLT Rx rule itself fails to install:
>
> ice 0000:5c:00.1: Failed to set VSI 14 as the default forwarding
> VSI, error -5
> ice 0000:5c:00.1 ens1f1: Error -5 setting default VSI 14 Rx rule
>
> Once a switch context is overrun the retries can also come back as ENOENT
> (AQ 0x2), which has misled triage toward a perceived recipe binding defect
> rather than a capacity issue.
>
> The DFLT rule already catches every packet on the port regardless of VLAN
> tag, so the per-VLAN promisc expansion is redundant while it is installed.
> Skip it at the two sites that drive it, ice_set_promisc() and
> ice_vlan_rx_add_vid(), keyed on ice_is_vsi_dflt_vsi() rather than the
> netdev IFF_PROMISC flag so a failed or LAG-suppressed DFLT install still
> falls back to the per-VLAN rules.
>
> IFF_ALLMULTI and IFF_PROMISC can reach ice_vsi_sync_fltr() in separate
> passes (a bridge join sets them through separate calls), so the allmulti
> pass may expand the per-VID rules before the DFLT rule exists. Drop those
> now-redundant rules right after ice_set_dflt_vsi() installs the DFLT rule;
> ice_vsi_exit_dflt_promisc() reinstates them when promisc is cleared.
>
> ice_vsi_sync_fltr() subscribed multicast promiscuity only inside the
> "default VSI not yet in use" branch, so a promiscuous VSI that finds the
> default VSI rule already present (owned by another VSI, or preserved
> across a switchdev session) ended up in unicast promisc with no multicast
> subscription. Issue ice_set_promisc(ICE_MCAST_PROMISC_BITS) whenever the
> netdev is promiscuous; it is idempotent and returns 0 if the rule is
> already present.
>
> Fixes: 1273f89578f2 ("ice: Fix broken IFF_ALLMULTI handling")
> Signed-off-by: Petr Oros <poros@redhat.com>
> ---
> v3:
> - Dropped the two vid=0 ICE_SW_LKUP_PROMISC <-> ICE_SW_LKUP_PROMISC_VLAN
> recipe-swap guards in ice_vlan_rx_add_vid() and ice_vlan_rx_kill_vid();
> each swap is net-zero and guarding the demote stranded the vid=0 rule
> in ICE_SW_LKUP_PROMISC_VLAN when the last VLAN was removed under the
> DFLT rule. Reported by review.
> - Drop the now-redundant per-VID multicast promisc rules right after
> ice_set_dflt_vsi(). A bridge join raises IFF_ALLMULTI and IFF_PROMISC
> in separate sync passes, so the allmulti pass expands the per-VID rules
> before the DFLT rule exists; the cleanup keeps them from lingering and
> exhausting the FLU pool. ice_vsi_exit_dflt_promisc() reinstates them on
> promisc off. Reported by review.
> - Issue ice_set_promisc(ICE_MCAST_PROMISC_BITS) whenever the netdev is
> promiscuous, not only when this VSI installs the default VSI rule, so
> multicast promisc is not lost when the rule is already in use (owned by
> another VSI, or preserved across a switchdev session). Reported by
> review.
> - Hoisted the combined VLAN promisc mask in ice_clear_promisc() into a
> local for alignment. Dropped Aleksandr's Reviewed-by since the code
> changed.
>
> v2: https://lore.kernel.org/all/20260622113428.2565255-2-poros@redhat.com/
> v1: https://lore.kernel.org/all/89efbea9831175e6f57e9fe8557f7a0e48e050b7.1781786935.git.poros@redhat.com/
> ---
> drivers/net/ethernet/intel/ice/ice_main.c | 111 ++++++++++++++++++----
> 1 file changed, 90 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> index b43d420ece99ca..a84de6cf6eb078 100644
> --- a/drivers/net/ethernet/intel/ice/ice_main.c
> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> @@ -274,7 +274,8 @@ static int ice_set_promisc(struct ice_vsi *vsi, u8 promisc_m)
> if (vsi->type != ICE_VSI_PF)
> return 0;
>
> - if (ice_vsi_has_non_zero_vlans(vsi)) {
> + /* skip per-VID expansion; the DFLT Rx rule already covers every VID */
> + if (ice_vsi_has_non_zero_vlans(vsi) && !ice_is_vsi_dflt_vsi(vsi)) {
> promisc_m |= (ICE_PROMISC_VLAN_RX | ICE_PROMISC_VLAN_TX);
> status = ice_fltr_set_vlan_vsi_promisc(&vsi->back->hw, vsi,
> promisc_m);
> @@ -304,9 +305,20 @@ static int ice_clear_promisc(struct ice_vsi *vsi, u8 promisc_m)
> return 0;
>
> if (ice_vsi_has_non_zero_vlans(vsi)) {
> - promisc_m |= (ICE_PROMISC_VLAN_RX | ICE_PROMISC_VLAN_TX);
> + u8 vlan_promisc_m = promisc_m | ICE_PROMISC_VLAN_RX |
> + ICE_PROMISC_VLAN_TX;
> + int vid0_status;
> +
> + /* set time used either recipe (per-VID PROMISC_VLAN, or vid=0
I find this sentence hard to understand - did you mean "ice_set_promisc() used
either recipe..."?
> + * PROMISC via the ice_set_promisc() else branch), so clear
> + * both; clearing an absent rule succeeds
What do you mean by this? Both will return -EEXIST if rule is absent. There can
also be other errors.
> + */
> status = ice_fltr_clear_vlan_vsi_promisc(&vsi->back->hw, vsi,
> - promisc_m);
> + vlan_promisc_m);
> + vid0_status = ice_fltr_clear_vsi_promisc(&vsi->back->hw,
> + vsi->idx, promisc_m, 0);
> + if (status == 0)
> + status = vid0_status;
> } else {
> status = ice_fltr_clear_vsi_promisc(&vsi->back->hw, vsi->idx,
> promisc_m, 0);
> @@ -317,6 +329,59 @@ static int ice_clear_promisc(struct ice_vsi *vsi, u8 promisc_m)
> return status;
> }
>
> +/**
> + * ice_vsi_exit_dflt_promisc - drop the default VSI Rx rule on promisc off
> + * @vsi: the VSI leaving promiscuous mode
> + *
> + * For an IFF_ALLMULTI VSI with VLANs the per-VID multicast rules are
> + * reinstated before the default rule is cleared so coverage never lapses;
> + * the then redundant vid=0 rule is dropped best-effort. The callees log
> + * their own failures, so error returns are not re-logged here.
> + *
> + * Return: 0 on success, negative on error with the default rule left in place.
> + */
> +static int ice_vsi_exit_dflt_promisc(struct ice_vsi *vsi)
> +{
> + struct ice_vsi_vlan_ops *vlan_ops = ice_get_compat_vsi_vlan_ops(vsi);
> + struct net_device *netdev = vsi->netdev;
> + struct ice_hw *hw = &vsi->back->hw;
> + bool restore_mc;
> + int err;
> +
> + restore_mc = (vsi->current_netdev_flags & IFF_ALLMULTI) &&
> + ice_vsi_has_non_zero_vlans(vsi);
> +
> + if (restore_mc) {
> + err = ice_fltr_set_vlan_vsi_promisc(hw, vsi,
> + ICE_MCAST_VLAN_PROMISC_BITS);
> + if (err && err != -EEXIST)
> + return err;
> + }
> +
> + err = ice_clear_dflt_vsi(vsi);
> + if (err)
> + return err;
> +
> + if (netdev->features & NETIF_F_HW_VLAN_CTAG_FILTER)
> + vlan_ops->ena_rx_filtering(vsi);
> +
> + if (restore_mc)
> + ice_fltr_clear_vsi_promisc(hw, vsi->idx, ICE_MCAST_PROMISC_BITS,
> + 0);
> +
> + return 0;
> +}
> +
> +/* Drop the per-VID multicast promisc rules, redundant once the default
> + * VSI Rx rule covers every VID. A no-op when the VSI has no VLANs.
> + */
> +static void ice_vsi_clear_vlan_mc_promisc(struct ice_vsi *vsi)
> +{
> + if (ice_vsi_has_non_zero_vlans(vsi))
Nit: could flip condition to decrease indent level.
> + ice_fltr_clear_vlan_vsi_promisc(&vsi->back->hw, vsi,
> + ICE_MCAST_VLAN_PROMISC_BITS);
Error code ignored, not sure if intentionally.
> +}
> +
> /**
> * ice_vsi_sync_fltr - Update the VSI filter list to the HW
> * @vsi: ptr to the VSI
> @@ -429,30 +494,35 @@ static int ice_vsi_sync_fltr(struct ice_vsi *vsi)
> err = 0;
> vlan_ops->dis_rx_filtering(vsi);
>
> - /* promiscuous mode implies allmulticast so
> - * that VSIs that are in promiscuous mode are
> - * subscribed to multicast packets coming to
> - * the port
> + /* DFLT now covers every VID; drop the per-VID
> + * multicast promisc rules a prior IFF_ALLMULTI
> + * pass may have installed (separate passes on a
> + * bridge join) so they do not linger and exhaust
> + * the FLU pool. exit_dflt_promisc() reinstates
Please use the full function name.
> + * them on promisc off.
> */
> - err = ice_set_promisc(vsi,
> - ICE_MCAST_PROMISC_BITS);
> - if (err)
> - goto out_promisc;
> + ice_vsi_clear_vlan_mc_promisc(vsi);
> }
> +
> + /* Promiscuous mode implies allmulticast. Subscribe
> + * the VSI to all multicast even when the default VSI
> + * rule is already in use and the block above is
> + * skipped (it may be owned by another VSI, or
> + * preserved across a switchdev session); the unicast
> + * catch-all does not cover the multicast subscription.
> + */
> + err = ice_set_promisc(vsi, ICE_MCAST_PROMISC_BITS);
> + if (err)
> + goto out_promisc;
> } else {
> /* Clear Rx filter to remove traffic from wire */
> if (ice_is_vsi_dflt_vsi(vsi)) {
> - err = ice_clear_dflt_vsi(vsi);
> + err = ice_vsi_exit_dflt_promisc(vsi);
> if (err) {
> - netdev_err(netdev, "Error %d clearing default VSI %i Rx rule\n",
> - err, vsi->vsi_num);
> vsi->current_netdev_flags |=
> IFF_PROMISC;
> goto out_promisc;
> }
> - if (vsi->netdev->features &
> - NETIF_F_HW_VLAN_CTAG_FILTER)
> - vlan_ops->ena_rx_filtering(vsi);
> }
>
> /* disable allmulti here, but only if allmulti is not
> @@ -3676,10 +3746,9 @@ int ice_vlan_rx_add_vid(struct net_device *netdev, __be16 proto, u16 vid)
> while (test_and_set_bit(ICE_CFG_BUSY, vsi->state))
> usleep_range(1000, 2000);
>
> - /* Add multicast promisc rule for the VLAN ID to be added if
> - * all-multicast is currently enabled.
> - */
> - if (vsi->current_netdev_flags & IFF_ALLMULTI) {
> + /* skip the per-VID rule when the DFLT Rx rule already covers this VID */
> + if ((vsi->current_netdev_flags & IFF_ALLMULTI) &&
> + !ice_is_vsi_dflt_vsi(vsi)) {
> ret = ice_fltr_set_vsi_promisc(&vsi->back->hw, vsi->idx,
> ICE_MCAST_VLAN_PROMISC_BITS,
> vid);
Thanks,
Marcin
next prev parent reply other threads:[~2026-07-03 16:34 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-07-01 13:35 [PATCH iwl-net v3 0/2] ice: fix DFLT Rx rule handling for promisc and switchdev Petr Oros
2026-07-01 13:36 ` [PATCH iwl-net v3 1/2] ice: skip per-VLAN promisc rules when default VSI Rx rule is set Petr Oros
2026-07-03 16:34 ` Marcin Szycik [this message]
2026-07-01 13:36 ` [PATCH iwl-net v3 2/2] ice: preserve uplink DFLT Rx rule on switchdev release Petr Oros
2026-07-03 16:36 ` [Intel-wired-lan] " Marcin Szycik
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f03d0930-9f51-45ad-9ed4-e9df335b8fa7@linux.intel.com \
--to=marcin.szycik@linux.intel.com \
--cc=alice.michael@intel.com \
--cc=andrew+netdev@lunn.ch \
--cc=anthony.l.nguyen@intel.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=horms@kernel.org \
--cc=intel-wired-lan@lists.osuosl.org \
--cc=ivecera@redhat.com \
--cc=jacob.e.keller@intel.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=martyna.szapar-mudlaw@linux.intel.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=poros@redhat.com \
--cc=przemyslaw.kitszel@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox