* [PATCH iwl-net v2 0/2] ice: fix DFLT Rx rule handling for promisc and switchdev @ 2026-06-22 11:34 Petr Oros 2026-06-22 11:34 ` [PATCH iwl-net v2 1/2] ice: skip per-VLAN promisc rules when default VSI Rx rule is set Petr Oros 2026-06-22 11:34 ` [PATCH iwl-net v2 2/2] ice: preserve uplink DFLT Rx rule on switchdev release Petr Oros 0 siblings, 2 replies; 7+ messages in thread From: Petr Oros @ 2026-06-22 11:34 UTC (permalink / raw) To: netdev; +Cc: Petr Oros Two fixes for the uplink default VSI Rx rule (DFLT) on E810 when the netdev is in IFF_PROMISC. Patch 1 drops the redundant per-VLAN promisc expansion that exhausts the FLU pool on a wide VLAN trunk across several PFs. Patch 2 keeps the DFLT Rx rule across a switchdev teardown instead of clobbering the promisc state the operator asked for. Changes since v1: - Patch 2: reworked to avoid the service task entirely. v1 scheduled a filter sync in ice_eswitch_disable_switchdev(); that work could run after ice_remove() freed the uplink VSI (use-after-free) and was not guaranteed to fire if ice_set_rx_mode() never ran again. v2 keeps or drops the DFLT Rx rule synchronously in ice_eswitch_release_env() by testing the live netdev->flags IFF_PROMISC, the same value ice_cfg_vlan_pruning() already keys on. No service task is scheduled and no symbol is exported. Dropped Aleksandr's Reviewed-by since the fix mechanism changed. - Patch 1: no functional changes, collected Aleksandr's Reviewed-by. Link to v1: https://lore.kernel.org/all/cover.1781786935.git.poros@redhat.com/ Petr Oros (2): ice: skip per-VLAN promisc rules when default VSI Rx rule is set ice: preserve uplink DFLT Rx rule on switchdev release drivers/net/ethernet/intel/ice/ice_eswitch.c | 18 +++- drivers/net/ethernet/intel/ice/ice_main.c | 90 +++++++++++++++----- 2 files changed, 84 insertions(+), 24 deletions(-) -- 2.53.0 ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH iwl-net v2 1/2] ice: skip per-VLAN promisc rules when default VSI Rx rule is set 2026-06-22 11:34 [PATCH iwl-net v2 0/2] ice: fix DFLT Rx rule handling for promisc and switchdev Petr Oros @ 2026-06-22 11:34 ` Petr Oros 2026-06-23 16:25 ` Simon Horman 2026-06-22 11:34 ` [PATCH iwl-net v2 2/2] ice: preserve uplink DFLT Rx rule on switchdev release Petr Oros 1 sibling, 1 reply; 7+ messages in thread From: Petr Oros @ 2026-06-22 11:34 UTC (permalink / raw) To: netdev; +Cc: Petr Oros, Aleksandr Loktionov When an ice port is part of a vlan-filtering bridge with a wide VLAN trunk and the netdev is in IFF_PROMISC (typical for bond slaves attached to a bridge), the driver installs per-VLAN ICE_SW_LKUP_PROMISC_VLAN entries (recipe 9) in addition to the broad ICE_SW_LKUP_DFLT VSI Rx rule (recipe 5). Each per-VLAN rule consumes one Flow Lookup Unit (FLU) entry from a fixed hardware pool of "up to 32K FLU entries" per device, documented in the E810 datasheet (613875-009 section 7.8.10, Table 7-18, page 1015). With three active PFs sharing one switch context and a bridge trunk of vid 2-4094, the configuration would require roughly 3 PFs * 4093 VLANs * 3 rules per VLAN per PF ~= 36,800 rules which exceeds the 32K FLU budget. Firmware then responds to further Add Switch Rules requests with AQ retval 0x10 (LIBIE_AQ_RC_ENOSPC) and the user-visible failure surfaces as ice 0000:5c:00.1: Failed to set VSI 14 as the default forwarding VSI, error -5 ice 0000:5c:00.1 ens1f1: Error -5 setting default VSI 14 Rx rule After a switch context has been driven into overrun, subsequent retries can come back as AQ retval 0x2 (LIBIE_AQ_RC_ENOENT), which has misled triage attempts toward a perceived recipe binding defect rather than a capacity issue. When the DFLT VSI Rx rule is in place it catches every packet on the lport regardless of VLAN tag, so the per-VLAN PROMISC_VLAN expansion is redundant. The recipe 4 VLAN prune entries are still installed per VLAN and continue to track the allowed VID set, but the IFF_PROMISC sync path disables their enforcement on the VSI via vlan_ops->dis_rx_filtering() before ice_set_promisc() runs. ena_rx_filtering() is restored when IFF_PROMISC is cleared. Skip the per-VLAN expansion at the two call sites that drive it: ice_set_promisc() falls through to ice_fltr_set_vsi_promisc() and ice_vlan_rx_add_vid() omits the per-VLAN ICE_MCAST_VLAN_PROMISC_BITS add. Plain IFF_ALLMULTI without an installed DFLT VSI rule is unchanged and still installs per-VLAN multicast promisc rules. Both checks use ice_is_vsi_dflt_vsi() which inspects the recipe filter list for an installed DFLT rule on this VSI, not netdev->flags & IFF_PROMISC. The HW-state predicate avoids two regression vectors that a user-intent predicate would introduce: 1. ice_lag_is_switchdev_running() short-circuits ice_set_dflt_vsi() to return 0 without installing the DFLT rule for a PF in switchdev LAG mode. An IFF_PROMISC-only check would also suppress the per-VLAN fallback, leaving the PF with no rule. 2. When ice_set_dflt_vsi() returns a non-EEXIST error (FLU exhausted, switch context divergence), the driver clears IFF_PROMISC from vsi->current_netdev_flags but the netdev's own flags retain IFF_PROMISC. The user-intent predicate would still suppress the per-VLAN fallback even though DFLT failed to install. The predicate is install-time only. The IFF_PROMISC off path closes the lifecycle gap in ice_vsi_exit_dflt_promisc(): for an IFF_ALLMULTI VSI with VLANs it reinstates the per-VID rules before clearing the default rule, so multicast coverage never lapses. If that AQ call fails the default rule is left in place, ice_vsi_exit_dflt_promisc() returns the error, and the sync_fltr pass bails with vsi->current_netdev_flags |= IFF_PROMISC; the current/netdev flag mismatch re-fires the IFF_PROMISC off path on the next sync. Clearing the default rule first would instead expose a window where neither the default rule nor the per-VID rules carry multicast. If ice_clear_dflt_vsi() fails after the per-VID rules were reinstated they are deliberately not rolled back. Clearing the default rule is a removal that frees an FLU entry rather than allocating one, so it cannot fail for lack of space; a failure is a transient AdminQ error. The per-VID rules are the steady state for an IFF_ALLMULTI VLAN VSI, so the only redundant entry left behind is the single un-removed default rule, not the per-VID set. The retry re-enters this path, ice_fltr_set_vlan_vsi_promisc() returns -EEXIST for the rules that already exist so nothing is reallocated, and the default rule is removed on the next attempt. Rolling the per-VID rules back here would instead churn thousands of removes and re-adds on every retry. After the default rule is gone the vid=0 PROMISC rule that paired with it is redundant and is dropped, but only to reclaim a filter entry, so a failure there is logged and does not abort the transition. ice_set_vsi_promisc() and ice_clear_vsi_promisc() dispatch the recipe based on whether ICE_PROMISC_VLAN_RX/TX bits are present in the mask: with the bits set, recipe ICE_SW_LKUP_PROMISC_VLAN is used; otherwise ICE_SW_LKUP_PROMISC. The else branch in ice_set_promisc() installs the vid=0 rule in ICE_SW_LKUP_PROMISC. Because ice_clear_promisc() with VLANs present adds the VLAN bits and would search ICE_SW_LKUP_PROMISC_VLAN, the recipe mismatch would leave the vid=0 ICE_SW_LKUP_PROMISC rule orphaned when VLANs are configured. This is a single stale rule, not a per-cycle leak: re-adding it on the next promisc on returns -EEXIST rather than allocating a new entry. The set-time recipe is not recorded, so ice_clear_promisc() clears both recipes; clearing a rule that is not present succeeds, both clears run unconditionally, and the first error is returned. The two VLAN-0 recipe transition blocks in ice_vlan_rx_add_vid() and ice_vlan_rx_kill_vid() that promote / demote the vid=0 rule between ICE_SW_LKUP_PROMISC and ICE_SW_LKUP_PROMISC_VLAN are likewise guarded by !ice_is_vsi_dflt_vsi(). With DFLT in place the vid=0 rule already covers every VID and a recipe swap would only install a redundant rule. Lab reproduction on an E810-C with the same firmware family (4.80, NVM 1.3805.0, DDP 1.3.43.0) using four PFs in vlan-filtering bridges with vid 2-4094 and the slaves brought to IFF_PROMISC before the bridge VLAN bulk add: before fix: ~12,279 AQ Add Switch Rules per PF, ENOSPC and ENOENT responses in dmesg, DFLT VSI Rx rule install fails on the affected PF after fix: ~4,093 AQ Add Switch Rules per PF, no AQ errors, DFLT VSI Rx rule installs on every PF The 66.7% reduction in installed switch rules per PF matches the expected per-VLAN saving: a single DFLT rule replaces the per-VID PROMISC_VLAN expansion. Functional regression test with vid 2-100 trunk between two ice ports through the lab switch (40/40 PASS, 0 AQ errors, 0 ENOSPC at 4093-VID customer scale): vid 50 unicast, vid 100 unicast, vid 50 broadcast ARP, vid 100 multicast IPv6 ND vid 200/500/1500/4000 isolation (out-of-trunk) and untagged not leaked: 0 packets reach any bridge endpoint IGMP/MLD snooping, Jumbo MTU 9000, reserved-multicast STP BPDU IFF_PROMISC + IFF_ALLMULTI transition (off while allmulti stays) Regression reproducer for commit 1273f89578f2 ("ice: Fix broken IFF_ALLMULTI handling"): allmulti on -> add vid -> allmulti off -> allmulti on plus the orphan-rule Scenario 2; both converge with no stale rules 100-VID, 1000-VID, 4093-VID stress cycles (5/3/2 iterations each) switchdev mode toggle preserves IFF_PROMISC pruning state across the session (vid 999 multicast received before and after the legacy -> switchdev -> legacy cycle) SR-IOV: VFs unaffected because ice_set_promisc() early-returns for non-PF VSI and VF representors do not register ndo_vlan_rx_add_vid Fixes: 1273f89578f2 ("ice: Fix broken IFF_ALLMULTI handling") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Petr Oros <poros@redhat.com> --- v2: - No functional changes; collected the Reviewed-by. v1: https://lore.kernel.org/all/89efbea9831175e6f57e9fe8557f7a0e48e050b7.1781786935.git.poros@redhat.com/ --- drivers/net/ethernet/intel/ice/ice_main.c | 90 ++++++++++++++++++----- 1 file changed, 70 insertions(+), 20 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index 6d24056c247cf4..af8df81fc45623 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -274,7 +274,8 @@ static int ice_set_promisc(struct ice_vsi *vsi, u8 promisc_m) if (vsi->type != ICE_VSI_PF) return 0; - if (ice_vsi_has_non_zero_vlans(vsi)) { + /* skip per-VID expansion; the DFLT Rx rule already covers every VID */ + if (ice_vsi_has_non_zero_vlans(vsi) && !ice_is_vsi_dflt_vsi(vsi)) { promisc_m |= (ICE_PROMISC_VLAN_RX | ICE_PROMISC_VLAN_TX); status = ice_fltr_set_vlan_vsi_promisc(&vsi->back->hw, vsi, promisc_m); @@ -304,9 +305,19 @@ static int ice_clear_promisc(struct ice_vsi *vsi, u8 promisc_m) return 0; if (ice_vsi_has_non_zero_vlans(vsi)) { - promisc_m |= (ICE_PROMISC_VLAN_RX | ICE_PROMISC_VLAN_TX); + int vid0_status; + + /* set time used either recipe (per-VID PROMISC_VLAN, or vid=0 + * PROMISC via the ice_set_promisc() else branch), so clear + * both; clearing an absent rule succeeds + */ status = ice_fltr_clear_vlan_vsi_promisc(&vsi->back->hw, vsi, - promisc_m); + promisc_m | ICE_PROMISC_VLAN_RX | + ICE_PROMISC_VLAN_TX); + vid0_status = ice_fltr_clear_vsi_promisc(&vsi->back->hw, + vsi->idx, promisc_m, 0); + if (!status) + status = vid0_status; } else { status = ice_fltr_clear_vsi_promisc(&vsi->back->hw, vsi->idx, promisc_m, 0); @@ -317,6 +328,49 @@ static int ice_clear_promisc(struct ice_vsi *vsi, u8 promisc_m) return status; } +/** + * ice_vsi_exit_dflt_promisc - drop the default VSI Rx rule on promisc off + * @vsi: the VSI leaving promiscuous mode + * + * For an IFF_ALLMULTI VSI with VLANs the per-VID multicast rules are + * reinstated before the default rule is cleared so coverage never lapses; + * the then redundant vid=0 rule is dropped best-effort. The callees log + * their own failures, so error returns are not re-logged here. + * + * Return: 0 on success, negative on error with the default rule left in place. + */ +static int ice_vsi_exit_dflt_promisc(struct ice_vsi *vsi) +{ + struct ice_vsi_vlan_ops *vlan_ops = ice_get_compat_vsi_vlan_ops(vsi); + struct net_device *netdev = vsi->netdev; + struct ice_hw *hw = &vsi->back->hw; + bool restore_mc; + int err; + + restore_mc = (vsi->current_netdev_flags & IFF_ALLMULTI) && + ice_vsi_has_non_zero_vlans(vsi); + + if (restore_mc) { + err = ice_fltr_set_vlan_vsi_promisc(hw, vsi, + ICE_MCAST_VLAN_PROMISC_BITS); + if (err && err != -EEXIST) + return err; + } + + err = ice_clear_dflt_vsi(vsi); + if (err) + return err; + + if (netdev->features & NETIF_F_HW_VLAN_CTAG_FILTER) + vlan_ops->ena_rx_filtering(vsi); + + if (restore_mc) + ice_fltr_clear_vsi_promisc(hw, vsi->idx, ICE_MCAST_PROMISC_BITS, + 0); + + return 0; +} + /** * ice_vsi_sync_fltr - Update the VSI filter list to the HW * @vsi: ptr to the VSI @@ -442,17 +496,12 @@ static int ice_vsi_sync_fltr(struct ice_vsi *vsi) } else { /* Clear Rx filter to remove traffic from wire */ if (ice_is_vsi_dflt_vsi(vsi)) { - err = ice_clear_dflt_vsi(vsi); + err = ice_vsi_exit_dflt_promisc(vsi); if (err) { - netdev_err(netdev, "Error %d clearing default VSI %i Rx rule\n", - err, vsi->vsi_num); vsi->current_netdev_flags |= IFF_PROMISC; goto out_promisc; } - if (vsi->netdev->features & - NETIF_F_HW_VLAN_CTAG_FILTER) - vlan_ops->ena_rx_filtering(vsi); } /* disable allmulti here, but only if allmulti is not @@ -3676,10 +3725,9 @@ int ice_vlan_rx_add_vid(struct net_device *netdev, __be16 proto, u16 vid) while (test_and_set_bit(ICE_CFG_BUSY, vsi->state)) usleep_range(1000, 2000); - /* Add multicast promisc rule for the VLAN ID to be added if - * all-multicast is currently enabled. - */ - if (vsi->current_netdev_flags & IFF_ALLMULTI) { + /* skip the per-VID rule when the DFLT Rx rule already covers this VID */ + if ((vsi->current_netdev_flags & IFF_ALLMULTI) && + !ice_is_vsi_dflt_vsi(vsi)) { ret = ice_fltr_set_vsi_promisc(&vsi->back->hw, vsi->idx, ICE_MCAST_VLAN_PROMISC_BITS, vid); @@ -3697,11 +3745,12 @@ int ice_vlan_rx_add_vid(struct net_device *netdev, __be16 proto, u16 vid) if (ret) goto finish; - /* If all-multicast is currently enabled and this VLAN ID is only one - * besides VLAN-0 we have to update look-up type of multicast promisc - * rule for VLAN-0 from ICE_SW_LKUP_PROMISC to ICE_SW_LKUP_PROMISC_VLAN. + /* On the first non-zero VLAN, promote the VLAN-0 multicast promisc + * rule from ICE_SW_LKUP_PROMISC to ICE_SW_LKUP_PROMISC_VLAN. Skip when + * the DFLT Rx rule is installed; it already covers every VID. */ if ((vsi->current_netdev_flags & IFF_ALLMULTI) && + !ice_is_vsi_dflt_vsi(vsi) && ice_vsi_num_non_zero_vlans(vsi) == 1) { ice_fltr_clear_vsi_promisc(&vsi->back->hw, vsi->idx, ICE_MCAST_PROMISC_BITS, 0); @@ -3764,11 +3813,12 @@ int ice_vlan_rx_kill_vid(struct net_device *netdev, __be16 proto, u16 vid) ICE_MCAST_VLAN_PROMISC_BITS, vid); if (!ice_vsi_has_non_zero_vlans(vsi)) { - /* Update look-up type of multicast promisc rule for VLAN 0 - * from ICE_SW_LKUP_PROMISC_VLAN to ICE_SW_LKUP_PROMISC when - * all-multicast is enabled and VLAN 0 is the only VLAN rule. + /* Last non-zero VLAN gone: demote the VLAN-0 multicast promisc + * rule back to ICE_SW_LKUP_PROMISC. Skip when the DFLT Rx rule + * is installed; no recipe swap is needed. */ - if (vsi->current_netdev_flags & IFF_ALLMULTI) { + if ((vsi->current_netdev_flags & IFF_ALLMULTI) && + !ice_is_vsi_dflt_vsi(vsi)) { ice_fltr_clear_vsi_promisc(&vsi->back->hw, vsi->idx, ICE_MCAST_VLAN_PROMISC_BITS, 0); -- 2.53.0 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH iwl-net v2 1/2] ice: skip per-VLAN promisc rules when default VSI Rx rule is set 2026-06-22 11:34 ` [PATCH iwl-net v2 1/2] ice: skip per-VLAN promisc rules when default VSI Rx rule is set Petr Oros @ 2026-06-23 16:25 ` Simon Horman 2026-06-29 12:06 ` Petr Oros 0 siblings, 1 reply; 7+ messages in thread From: Simon Horman @ 2026-06-23 16:25 UTC (permalink / raw) To: poros; +Cc: Simon Horman, netdev, aleksandr.loktionov This is an AI-generated review of your patch. The human sending this email has considered the AI review valid, or at least plausible. Full review at: https://sashiko.dev --- Subject: ice: skip per-VLAN promisc rules when default VSI Rx rule is set This patch optimizes the number of hardware switch rules used when an ice port is part of a vlan-filtering bridge. It avoids installing per-VLAN promiscuous rules if the broad default VSI Rx rule is already catching all traffic. > diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c > index e2fbe111f8496..efb1f536de6c7 100644 > --- a/drivers/net/ethernet/intel/ice/ice_main.c > +++ b/drivers/net/ethernet/intel/ice/ice_main.c [ ... ] > @@ -442,17 +496,12 @@ static int ice_vsi_sync_fltr(struct ice_vsi *vsi) [Severity: High] This is a pre-existing issue, but does MAC filter overflow fail to force promiscuous mode, resulting in silent packet drops? When the hardware MAC filter limit is reached (LIBIE_AQ_RC_ENOSPC), the driver attempts to fall back to promiscuous mode by setting promisc_forced_on = true. However, the execution flow then reaches a check for IFF_PROMISC: if (vsi->current_netdev_flags & IFF_PROMISC) { /* Apply Rx filter rule to get traffic from wire */ ... } else { /* Clear Rx filter to remove traffic from wire */ Since the user never explicitly requested IFF_PROMISC, this evaluates to false, causing the driver to take the else branch. The else branch actively disables the default VSI rule and clears promiscuous mode instead of enabling it. The fallback feature appears to be non-functional, leading to lost traffic for newly added MACs. [Severity: High] This is a pre-existing issue, but could this function prevent retries of incomplete configurations and break multicast promiscuity if any default VSI is already active? In ice_vsi_sync_fltr(), the setup of multicast promiscuity and disabling of Rx filtering is gated: if (!ice_is_dflt_vsi_in_use(vsi->port_info)) { err = ice_set_dflt_vsi(vsi); ... err = ice_set_promisc(vsi, ICE_MCAST_PROMISC_BITS); if (err) goto out_promisc; } If ice_set_dflt_vsi() succeeds but the subsequent ice_set_promisc() fails, the function jumps to out_promisc to trigger a retry. On the next cycle, ice_is_dflt_vsi_in_use() returns true because this VSI is now the default, causing the entire block to be skipped and leaving the VSI misconfigured. Furthermore, if another VSI (such as a VF) is already the default VSI, the PF VSI entirely skips setting multicast promiscuity, breaking multicast reception in promiscuous mode. > } else { > /* Clear Rx filter to remove traffic from wire */ > if (ice_is_vsi_dflt_vsi(vsi)) { > - err = ice_clear_dflt_vsi(vsi); > + err = ice_vsi_exit_dflt_promisc(vsi); > if (err) { > - netdev_err(netdev, "Error %d clearing default VSI %i Rx rule\n", > - err, vsi->vsi_num); > vsi->current_netdev_flags |= > IFF_PROMISC; > goto out_promisc; > } > - if (vsi->netdev->features & > - NETIF_F_HW_VLAN_CTAG_FILTER) > - vlan_ops->ena_rx_filtering(vsi); > } [ ... ] > @@ -3675,10 +3724,9 @@ int ice_vlan_rx_add_vid(struct net_device *netdev, __be16 proto, u16 vid) > while (test_and_set_bit(ICE_CFG_BUSY, vsi->state)) > usleep_range(1000, 2000); [Severity: High] This is a pre-existing issue, but isn't there a risk of TOCTOU races with this ad-hoc lock pattern lacking memory barriers? The driver spins on ICE_CFG_BUSY using test_and_set_bit() and usleep_range() to guard VSI filter synchronization. The lock is subsequently dropped using clear_bit(ICE_CFG_BUSY, vsi->state) at the exit label. Because clear_bit() lacks a memory release barrier, the CPU or compiler can reorder memory stores from the critical section to occur after the flag is cleared, leading to data races and corruption of shared filter state on weakly ordered architectures. > - /* Add multicast promisc rule for the VLAN ID to be added if > - * all-multicast is currently enabled. > - */ > - if (vsi->current_netdev_flags & IFF_ALLMULTI) { > + /* skip the per-VID rule when the DFLT Rx rule already covers this VID */ > + if ((vsi->current_netdev_flags & IFF_ALLMULTI) && > + !ice_is_vsi_dflt_vsi(vsi)) { > ret = ice_fltr_set_vsi_promisc(&vsi->back->hw, vsi->idx, > ICE_MCAST_VLAN_PROMISC_BITS, > vid); > @@ -3696,11 +3744,12 @@ int ice_vlan_rx_add_vid(struct net_device *netdev, __be16 proto, u16 vid) > if (ret) > goto finish; [Severity: High] This is a pre-existing issue, but what happens if vlan_ops->add_vlan() fails here? Is the hardware switch rule leaked? When adding a VLAN to an interface with IFF_ALLMULTI enabled, this function first allocates a promiscuous switch rule in hardware via ice_fltr_set_vsi_promisc(). If this succeeds but the subsequent call to vlan_ops->add_vlan() fails (e.g., due to ENOSPC or memory allocation failure), the function returns early without rolling back the promiscuous rule. Because the network stack considers the VLAN addition to have failed, ice_vlan_rx_kill_vid() will never be called for this VID, permanently leaking a Flow Lookup Unit (FLU). > - /* If all-multicast is currently enabled and this VLAN ID is only one > - * besides VLAN-0 we have to update look-up type of multicast promisc > - * rule for VLAN-0 from ICE_SW_LKUP_PROMISC to ICE_SW_LKUP_PROMISC_VLAN. > + /* On the first non-zero VLAN, promote the VLAN-0 multicast promisc > + * rule from ICE_SW_LKUP_PROMISC to ICE_SW_LKUP_PROMISC_VLAN. Skip when > + * the DFLT Rx rule is installed; it already covers every VID. > */ > if ((vsi->current_netdev_flags & IFF_ALLMULTI) && > + !ice_is_vsi_dflt_vsi(vsi) && > ice_vsi_num_non_zero_vlans(vsi) == 1) { [ ... ] > @@ -3763,11 +3812,12 @@ int ice_vlan_rx_kill_vid(struct net_device *netdev, __be16 proto, u16 vid) > ICE_MCAST_VLAN_PROMISC_BITS, vid); > > if (!ice_vsi_has_non_zero_vlans(vsi)) { > - /* Update look-up type of multicast promisc rule for VLAN 0 > - * from ICE_SW_LKUP_PROMISC_VLAN to ICE_SW_LKUP_PROMISC when > - * all-multicast is enabled and VLAN 0 is the only VLAN rule. > + /* Last non-zero VLAN gone: demote the VLAN-0 multicast promisc > + * rule back to ICE_SW_LKUP_PROMISC. Skip when the DFLT Rx rule > + * is installed; no recipe swap is needed. > */ > - if (vsi->current_netdev_flags & IFF_ALLMULTI) { > + if ((vsi->current_netdev_flags & IFF_ALLMULTI) && > + !ice_is_vsi_dflt_vsi(vsi)) { [Severity: High] Will this modification cause a stale switch rule (ICE_SW_LKUP_PROMISC_VLAN for vid=0) to be permanently leaked if the last VLAN is removed while the DFLT VSI rule is active? This patch skips demoting the vid=0 multicast promiscuous rule from ICE_SW_LKUP_PROMISC_VLAN back to ICE_SW_LKUP_PROMISC when the last VLAN is removed, if ice_is_vsi_dflt_vsi(vsi) is true. However, if the interface later drops IFF_PROMISC (removing the DFLT VSI rule) and then drops IFF_ALLMULTI, ice_clear_promisc() is called. Because the interface now has zero VLANs, ice_clear_promisc() executes its else branch: } else { status = ice_fltr_clear_vsi_promisc(&vsi->back->hw, vsi->idx, promisc_m, 0); } This only searches and clears rules from the ICE_SW_LKUP_PROMISC recipe. The stranded rule in the ICE_SW_LKUP_PROMISC_VLAN recipe is never cleared, resulting in a permanent hardware switch rule leak. > ice_fltr_clear_vsi_promisc(&vsi->back->hw, vsi->idx, > ICE_MCAST_VLAN_PROMISC_BITS, > 0); ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH iwl-net v2 1/2] ice: skip per-VLAN promisc rules when default VSI Rx rule is set 2026-06-23 16:25 ` Simon Horman @ 2026-06-29 12:06 ` Petr Oros 0 siblings, 0 replies; 7+ messages in thread From: Petr Oros @ 2026-06-29 12:06 UTC (permalink / raw) To: Simon Horman; +Cc: netdev, aleksandr.loktionov On 6/23/26 18:25, Simon Horman wrote: > This is an AI-generated review of your patch. The human sending this > email has considered the AI review valid, or at least plausible. > Full review at: https://sashiko.dev > --- > Subject: ice: skip per-VLAN promisc rules when default VSI Rx rule is set > > This patch optimizes the number of hardware switch rules used when an ice > port is part of a vlan-filtering bridge. It avoids installing per-VLAN > promiscuous rules if the broad default VSI Rx rule is already catching > all traffic. > >> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c >> index e2fbe111f8496..efb1f536de6c7 100644 >> --- a/drivers/net/ethernet/intel/ice/ice_main.c >> +++ b/drivers/net/ethernet/intel/ice/ice_main.c > [ ... ] >> @@ -442,17 +496,12 @@ static int ice_vsi_sync_fltr(struct ice_vsi *vsi) > [Severity: High] > This is a pre-existing issue, but does MAC filter overflow fail to force > promiscuous mode, resulting in silent packet drops? Pre-existing; the promisc_forced_on fallback is not changed by this patch. > > When the hardware MAC filter limit is reached (LIBIE_AQ_RC_ENOSPC), the driver > attempts to fall back to promiscuous mode by setting promisc_forced_on = true. > > However, the execution flow then reaches a check for IFF_PROMISC: > > if (vsi->current_netdev_flags & IFF_PROMISC) { > /* Apply Rx filter rule to get traffic from wire */ > ... > } else { > /* Clear Rx filter to remove traffic from wire */ > > Since the user never explicitly requested IFF_PROMISC, this evaluates to false, > causing the driver to take the else branch. The else branch actively disables > the default VSI rule and clears promiscuous mode instead of enabling it. The > fallback feature appears to be non-functional, leading to lost traffic for > newly added MACs. > > [Severity: High] > This is a pre-existing issue, but could this function prevent retries of > incomplete configurations and break multicast promiscuity if any default > VSI is already active? v3 will address the multicast half: ice_set_promisc(ICE_MCAST_PROMISC_BITS) will be issued whenever the netdev is promiscuous, not only inside the !ice_is_dflt_vsi_in_use() branch. > > In ice_vsi_sync_fltr(), the setup of multicast promiscuity and disabling of > Rx filtering is gated: > > if (!ice_is_dflt_vsi_in_use(vsi->port_info)) { > err = ice_set_dflt_vsi(vsi); > ... > err = ice_set_promisc(vsi, ICE_MCAST_PROMISC_BITS); > if (err) > goto out_promisc; > } > > If ice_set_dflt_vsi() succeeds but the subsequent ice_set_promisc() fails, > the function jumps to out_promisc to trigger a retry. On the next cycle, > ice_is_dflt_vsi_in_use() returns true because this VSI is now the default, > causing the entire block to be skipped and leaving the VSI misconfigured. > > Furthermore, if another VSI (such as a VF) is already the default VSI, > the PF VSI entirely skips setting multicast promiscuity, breaking multicast > reception in promiscuous mode. > >> } else { >> /* Clear Rx filter to remove traffic from wire */ >> if (ice_is_vsi_dflt_vsi(vsi)) { >> - err = ice_clear_dflt_vsi(vsi); >> + err = ice_vsi_exit_dflt_promisc(vsi); >> if (err) { >> - netdev_err(netdev, "Error %d clearing default VSI %i Rx rule\n", >> - err, vsi->vsi_num); >> vsi->current_netdev_flags |= >> IFF_PROMISC; >> goto out_promisc; >> } >> - if (vsi->netdev->features & >> - NETIF_F_HW_VLAN_CTAG_FILTER) >> - vlan_ops->ena_rx_filtering(vsi); >> } > [ ... ] >> @@ -3675,10 +3724,9 @@ int ice_vlan_rx_add_vid(struct net_device *netdev, __be16 proto, u16 vid) >> while (test_and_set_bit(ICE_CFG_BUSY, vsi->state)) >> usleep_range(1000, 2000); > [Severity: High] > This is a pre-existing issue, but isn't there a risk of TOCTOU races with > this ad-hoc lock pattern lacking memory barriers? Pre-existing; the ICE_CFG_BUSY locking is not changed by this patch. > > The driver spins on ICE_CFG_BUSY using test_and_set_bit() and usleep_range() > to guard VSI filter synchronization. The lock is subsequently dropped using > clear_bit(ICE_CFG_BUSY, vsi->state) at the exit label. > > Because clear_bit() lacks a memory release barrier, the CPU or compiler can > reorder memory stores from the critical section to occur after the flag is > cleared, leading to data races and corruption of shared filter state on > weakly ordered architectures. > >> - /* Add multicast promisc rule for the VLAN ID to be added if >> - * all-multicast is currently enabled. >> - */ >> - if (vsi->current_netdev_flags & IFF_ALLMULTI) { >> + /* skip the per-VID rule when the DFLT Rx rule already covers this VID */ >> + if ((vsi->current_netdev_flags & IFF_ALLMULTI) && >> + !ice_is_vsi_dflt_vsi(vsi)) { >> ret = ice_fltr_set_vsi_promisc(&vsi->back->hw, vsi->idx, >> ICE_MCAST_VLAN_PROMISC_BITS, >> vid); >> @@ -3696,11 +3744,12 @@ int ice_vlan_rx_add_vid(struct net_device *netdev, __be16 proto, u16 vid) >> if (ret) >> goto finish; > [Severity: High] > This is a pre-existing issue, but what happens if vlan_ops->add_vlan() > fails here? Is the hardware switch rule leaked? Pre-existing; the add_vlan() error unwind is not changed by this patch. > > When adding a VLAN to an interface with IFF_ALLMULTI enabled, this function > first allocates a promiscuous switch rule in hardware via > ice_fltr_set_vsi_promisc(). > > If this succeeds but the subsequent call to vlan_ops->add_vlan() fails > (e.g., due to ENOSPC or memory allocation failure), the function returns > early without rolling back the promiscuous rule. Because the network stack > considers the VLAN addition to have failed, ice_vlan_rx_kill_vid() will > never be called for this VID, permanently leaking a Flow Lookup Unit (FLU). > >> - /* If all-multicast is currently enabled and this VLAN ID is only one >> - * besides VLAN-0 we have to update look-up type of multicast promisc >> - * rule for VLAN-0 from ICE_SW_LKUP_PROMISC to ICE_SW_LKUP_PROMISC_VLAN. >> + /* On the first non-zero VLAN, promote the VLAN-0 multicast promisc >> + * rule from ICE_SW_LKUP_PROMISC to ICE_SW_LKUP_PROMISC_VLAN. Skip when >> + * the DFLT Rx rule is installed; it already covers every VID. >> */ >> if ((vsi->current_netdev_flags & IFF_ALLMULTI) && >> + !ice_is_vsi_dflt_vsi(vsi) && >> ice_vsi_num_non_zero_vlans(vsi) == 1) { > [ ... ] >> @@ -3763,11 +3812,12 @@ int ice_vlan_rx_kill_vid(struct net_device *netdev, __be16 proto, u16 vid) >> ICE_MCAST_VLAN_PROMISC_BITS, vid); >> >> if (!ice_vsi_has_non_zero_vlans(vsi)) { >> - /* Update look-up type of multicast promisc rule for VLAN 0 >> - * from ICE_SW_LKUP_PROMISC_VLAN to ICE_SW_LKUP_PROMISC when >> - * all-multicast is enabled and VLAN 0 is the only VLAN rule. >> + /* Last non-zero VLAN gone: demote the VLAN-0 multicast promisc >> + * rule back to ICE_SW_LKUP_PROMISC. Skip when the DFLT Rx rule >> + * is installed; no recipe swap is needed. >> */ >> - if (vsi->current_netdev_flags & IFF_ALLMULTI) { >> + if ((vsi->current_netdev_flags & IFF_ALLMULTI) && >> + !ice_is_vsi_dflt_vsi(vsi)) { > [Severity: High] > Will this modification cause a stale switch rule (ICE_SW_LKUP_PROMISC_VLAN for > vid=0) to be permanently leaked if the last VLAN is removed while the DFLT VSI > rule is active? Real, and introduced by v1/v2. v3 will drop the two vid=0 recipe-swap guards, so the rule is demoted back to ICE_SW_LKUP_PROMISC when the last VLAN is removed and never stranded. Petr > > This patch skips demoting the vid=0 multicast promiscuous rule from > ICE_SW_LKUP_PROMISC_VLAN back to ICE_SW_LKUP_PROMISC when the last VLAN is > removed, if ice_is_vsi_dflt_vsi(vsi) is true. > > However, if the interface later drops IFF_PROMISC (removing the DFLT VSI rule) > and then drops IFF_ALLMULTI, ice_clear_promisc() is called. Because the > interface now has zero VLANs, ice_clear_promisc() executes its else branch: > > } else { > status = ice_fltr_clear_vsi_promisc(&vsi->back->hw, vsi->idx, > promisc_m, 0); > } > > This only searches and clears rules from the ICE_SW_LKUP_PROMISC recipe. The > stranded rule in the ICE_SW_LKUP_PROMISC_VLAN recipe is never cleared, > resulting in a permanent hardware switch rule leak. > >> ice_fltr_clear_vsi_promisc(&vsi->back->hw, vsi->idx, >> ICE_MCAST_VLAN_PROMISC_BITS, >> 0); ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH iwl-net v2 2/2] ice: preserve uplink DFLT Rx rule on switchdev release 2026-06-22 11:34 [PATCH iwl-net v2 0/2] ice: fix DFLT Rx rule handling for promisc and switchdev Petr Oros 2026-06-22 11:34 ` [PATCH iwl-net v2 1/2] ice: skip per-VLAN promisc rules when default VSI Rx rule is set Petr Oros @ 2026-06-22 11:34 ` Petr Oros 2026-06-23 16:25 ` Simon Horman 1 sibling, 1 reply; 7+ messages in thread From: Petr Oros @ 2026-06-22 11:34 UTC (permalink / raw) To: netdev; +Cc: Petr Oros ice_eswitch_setup_env() calls ice_set_dflt_vsi() to install the ICE_SW_LKUP_DFLT Rx rule on the uplink VSI. The helper returns 0 even when the rule is already in place, so the call is a no-op if ice_vsi_sync_fltr() had previously installed the DFLT rule in response to IFF_PROMISC on the uplink netdev. ice_remove_vsi_fltr() called earlier in ice_eswitch_setup_env() does not affect this rule because ice_remove_vsi_lkup_fltr() lacks a case for ICE_SW_LKUP_DFLT and falls into its default branch which only logs. Switchdev mode then adds an ICE_FLTR_TX leg via ice_cfg_dflt_vsi() on the same VSI handle. ice_eswitch_release_env() unconditionally removed both the Rx and Tx DFLT rules. When the Rx DFLT was installed by ice_vsi_sync_fltr() before the switchdev session started, this clobbered promisc state the operator had asked for: the DFLT Rx rule disappeared while IFF_PROMISC was still set on the netdev, and the IFF_PROMISC sync path was not retriggered, so the uplink ended the session without the catch-all rule the netdev flags requested. Skip the Rx DFLT removal when the uplink is promiscuous, both in ice_eswitch_release_env() and in the err_def_tx unwind of ice_eswitch_setup_env(). The Tx leg installed by switchdev is always removed since switchdev owns it. Test the live netdev->flags for this decision. The ena_rx_filtering() call right above in ice_eswitch_release_env() reaches ice_cfg_vlan_pruning(), which already keys on the live netdev->flags IFF_PROMISC bit, so reusing the same value keeps the preserved DFLT rule and the VLAN pruning state mutually consistent across every promisc transition, including one the operator made while switchdev ran: ice_set_rx_mode() is gated off for the uplink during the session, so such a change never reaches the filter sync, but it is reflected in netdev->flags and is therefore honored here on release. Fixes: 1a1c40df2e80 ("ice: set and release switchdev environment") Signed-off-by: Petr Oros <poros@redhat.com> --- v2: - Reworked the fix to avoid the service task entirely. v1 scheduled a filter sync in ice_eswitch_disable_switchdev() to reconcile the uplink DFLT Rx rule; that work could run after ice_remove() freed the uplink VSI (use-after-free) and was not guaranteed to fire if ice_set_rx_mode() never ran again. v2 keeps or drops the DFLT Rx rule synchronously in ice_eswitch_release_env() (and the setup_env error unwind) by testing the live netdev->flags IFF_PROMISC, the same value ice_cfg_vlan_pruning() already keys on, so the preserved rule and the pruning state stay consistent. No service task is scheduled and no symbol is exported. - Dropped the Reviewed-by since the fix mechanism changed. v1: https://lore.kernel.org/all/deef5756e534ef06c12d910c5305d3fd205d30a0.1781786935.git.poros@redhat.com/ --- drivers/net/ethernet/intel/ice/ice_eswitch.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch.c b/drivers/net/ethernet/intel/ice/ice_eswitch.c index 2e4f0969035f77..48273ef9f69dc8 100644 --- a/drivers/net/ethernet/intel/ice/ice_eswitch.c +++ b/drivers/net/ethernet/intel/ice/ice_eswitch.c @@ -66,8 +66,10 @@ static int ice_eswitch_setup_env(struct ice_pf *pf) ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx, false, ICE_FLTR_TX); err_def_tx: - ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx, false, - ICE_FLTR_RX); + /* keep the Rx DFLT rule if the uplink is promiscuous (see release_env) */ + if (!(uplink_vsi->netdev->flags & IFF_PROMISC)) + ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx, + false, ICE_FLTR_RX); err_def_rx: ice_vsi_del_vlan_zero(uplink_vsi); err_vlan_zero: @@ -278,8 +280,16 @@ static void ice_eswitch_release_env(struct ice_pf *pf) vlan_ops->ena_rx_filtering(uplink_vsi); ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx, false, ICE_FLTR_TX); - ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx, false, - ICE_FLTR_RX); + + /* Keep the Rx DFLT rule if the uplink is promiscuous; it must outlive + * the session. Test the live netdev->flags, the same value + * ena_rx_filtering() -> ice_cfg_vlan_pruning() above keys its decision + * on, so the preserved DFLT rule and the pruning state stay consistent. + */ + if (!(uplink_vsi->netdev->flags & IFF_PROMISC)) + ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx, + false, ICE_FLTR_RX); + ice_fltr_add_mac_and_broadcast(uplink_vsi, uplink_vsi->port_info->mac.perm_addr, ICE_FWD_TO_VSI); -- 2.53.0 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH iwl-net v2 2/2] ice: preserve uplink DFLT Rx rule on switchdev release 2026-06-22 11:34 ` [PATCH iwl-net v2 2/2] ice: preserve uplink DFLT Rx rule on switchdev release Petr Oros @ 2026-06-23 16:25 ` Simon Horman 2026-06-29 13:21 ` Petr Oros 0 siblings, 1 reply; 7+ messages in thread From: Simon Horman @ 2026-06-23 16:25 UTC (permalink / raw) To: poros; +Cc: Simon Horman, netdev, aleksandr.loktionov This is an AI-generated review of your patch. The human sending this email has considered the AI review valid, or at least plausible. Full review at: https://sashiko.dev --- ice: preserve uplink DFLT Rx rule on switchdev release This commit preserves the default Rx rule on the uplink VSI when releasing the switchdev environment if the interface is in promiscuous mode. This prevents the catch-all rule from being removed while the netdev flags still request promiscuity. > diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch.c b/drivers/net/ethernet/intel/ice/ice_eswitch.c > index 2e4f0969035f7..48273ef9f69dc 100644 > --- a/drivers/net/ethernet/intel/ice/ice_eswitch.c > +++ b/drivers/net/ethernet/intel/ice/ice_eswitch.c [ ... ] > @@ -278,8 +280,16 @@ static void ice_eswitch_release_env(struct ice_pf *pf) > vlan_ops->ena_rx_filtering(uplink_vsi); [Severity: High] Could this lockless read of netdev->flags cause a regression due to a TOCTOU race? The ena_rx_filtering() call eventually reaches ice_cfg_vlan_pruning(), which reads uplink_vsi->netdev->flags & IFF_PROMISC locklessly to decide whether to enable VLAN pruning. > ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx, false, > ICE_FLTR_TX); > - ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx, false, > - ICE_FLTR_RX); > + > + /* Keep the Rx DFLT rule if the uplink is promiscuous; it must outlive > + * the session. Test the live netdev->flags, the same value > + * ena_rx_filtering() -> ice_cfg_vlan_pruning() above keys its decision > + * on, so the preserved DFLT rule and the pruning state stay consistent. > + */ > + if (!(uplink_vsi->netdev->flags & IFF_PROMISC)) ... and then the exact same flags are read locklessly again right here. Because setup/teardown paths are protected by devl_lock rather than rtnl_lock, could a concurrent dev_change_flags() holding rtnl_lock toggle IFF_PROMISC between the two checks? If IFF_PROMISC changes from 0 to 1 in this window, the code would incorrectly enable VLAN pruning (seeing 0) and then incorrectly retain the DFLT_RX rule (seeing 1), leaving the hardware with VLAN pruning enabled during promiscuous mode. [Severity: High] Does preserving the Rx DFLT rule here cause a regression by permanently breaking multicast promiscuity configuration if IFF_PROMISC was set? During the switchdev session, ice_remove_vsi_fltr() removes the ICE_SW_LKUP_PROMISC rules. By manually preserving the DFLT rule here, any future synchronization via ice_vsi_sync_fltr() will evaluate !ice_is_dflt_vsi_in_use(vsi->port_info) as false: ice_main.c:ice_vsi_sync_fltr() { ... if (!ice_is_dflt_vsi_in_use(vsi->port_info)) { err = ice_set_dflt_vsi(vsi); ... err = ice_set_promisc(vsi, ICE_MCAST_PROMISC_BITS); } ... } Because it is skipped, ICE_MCAST_PROMISC_BITS is never configured, leaving the VSI in unicast promiscuous mode but silently dropping multicast traffic destined for other MAC addresses. > + ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx, > + false, ICE_FLTR_RX); > + > ice_fltr_add_mac_and_broadcast(uplink_vsi, > uplink_vsi->port_info->mac.perm_addr, > ICE_FWD_TO_VSI); [Severity: High] This is a pre-existing issue, but does this teardown function fail to synchronize hardware state for previously joined multicast groups and secondary unicast addresses? When entering switchdev mode, __dev_uc_unsync() and __dev_mc_unsync() clear the synchronization state of OS-level MAC addresses, and ice_remove_vsi_fltr() removes the actual hardware filters. While switchdev runs, calls to ndo_set_rx_mode return early and do not update hardware. When returning to legacy mode, this function restores only the primary MAC and broadcast addresses, but doesn't seem to invoke netif_set_rx_mode() or equivalent to resynchronize the deferred Rx filters from the netdev to the hardware. Could this cause the hardware to silently drop packets for previously joined groups until the interface is bounced? ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH iwl-net v2 2/2] ice: preserve uplink DFLT Rx rule on switchdev release 2026-06-23 16:25 ` Simon Horman @ 2026-06-29 13:21 ` Petr Oros 0 siblings, 0 replies; 7+ messages in thread From: Petr Oros @ 2026-06-29 13:21 UTC (permalink / raw) To: Simon Horman; +Cc: netdev, aleksandr.loktionov On 6/23/26 18:25, Simon Horman wrote: > This is an AI-generated review of your patch. The human sending this > email has considered the AI review valid, or at least plausible. > Full review at: https://sashiko.dev > --- > ice: preserve uplink DFLT Rx rule on switchdev release > > This commit preserves the default Rx rule on the uplink VSI when releasing > the switchdev environment if the interface is in promiscuous mode. This > prevents the catch-all rule from being removed while the netdev flags still > request promiscuity. > >> diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch.c b/drivers/net/ethernet/intel/ice/ice_eswitch.c >> index 2e4f0969035f7..48273ef9f69dc 100644 >> --- a/drivers/net/ethernet/intel/ice/ice_eswitch.c >> +++ b/drivers/net/ethernet/intel/ice/ice_eswitch.c > [ ... ] >> @@ -278,8 +280,16 @@ static void ice_eswitch_release_env(struct ice_pf *pf) >> vlan_ops->ena_rx_filtering(uplink_vsi); > [Severity: High] > Could this lockless read of netdev->flags cause a regression due to a TOCTOU > race? Both reads key on the same live netdev->flags & IFF_PROMISC (the pruning read just above included); it is a single word, so no torn read, and a flip in the two-statement window is transient and reconciled by the next ice_vsi_sync_fltr() after switchdev stops. I kept the read live on purpose so the preserved rule and the pruning decision key on the same source. > The ena_rx_filtering() call eventually reaches ice_cfg_vlan_pruning(), > which reads uplink_vsi->netdev->flags & IFF_PROMISC locklessly to decide > whether to enable VLAN pruning. > >> ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx, false, >> ICE_FLTR_TX); >> - ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx, false, >> - ICE_FLTR_RX); >> + >> + /* Keep the Rx DFLT rule if the uplink is promiscuous; it must outlive >> + * the session. Test the live netdev->flags, the same value >> + * ena_rx_filtering() -> ice_cfg_vlan_pruning() above keys its decision >> + * on, so the preserved DFLT rule and the pruning state stay consistent. >> + */ >> + if (!(uplink_vsi->netdev->flags & IFF_PROMISC)) > ... and then the exact same flags are read locklessly again right here. > > Because setup/teardown paths are protected by devl_lock rather than > rtnl_lock, could a concurrent dev_change_flags() holding rtnl_lock toggle > IFF_PROMISC between the two checks? > > If IFF_PROMISC changes from 0 to 1 in this window, the code would incorrectly > enable VLAN pruning (seeing 0) and then incorrectly retain the DFLT_RX rule > (seeing 1), leaving the hardware with VLAN pruning enabled during promiscuous > mode. > > [Severity: High] > Does preserving the Rx DFLT rule here cause a regression by permanently > breaking multicast promiscuity configuration if IFF_PROMISC was set? Correct, the DFLT rule is unicast only. v3 will fix this in patch 1: ice_set_promisc(ICE_MCAST_PROMISC_BITS) will run whenever the netdev is promiscuous, including when ice_is_dflt_vsi_in_use() is already true. > > During the switchdev session, ice_remove_vsi_fltr() removes the > ICE_SW_LKUP_PROMISC rules. By manually preserving the DFLT rule here, any > future synchronization via ice_vsi_sync_fltr() will evaluate > !ice_is_dflt_vsi_in_use(vsi->port_info) as false: > > ice_main.c:ice_vsi_sync_fltr() { > ... > if (!ice_is_dflt_vsi_in_use(vsi->port_info)) { > err = ice_set_dflt_vsi(vsi); > ... > err = ice_set_promisc(vsi, ICE_MCAST_PROMISC_BITS); > } > ... > } > > Because it is skipped, ICE_MCAST_PROMISC_BITS is never configured, leaving > the VSI in unicast promiscuous mode but silently dropping multicast traffic > destined for other MAC addresses. > >> + ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx, >> + false, ICE_FLTR_RX); >> + >> ice_fltr_add_mac_and_broadcast(uplink_vsi, >> uplink_vsi->port_info->mac.perm_addr, >> ICE_FWD_TO_VSI); > [Severity: High] > This is a pre-existing issue, but does this teardown function fail to > synchronize hardware state for previously joined multicast groups and > secondary unicast addresses? Pre-existing (rx_mode is gated during switchdev). This patch only gates the DFLT Rx removal in ice_eswitch_release_env(); the multicast-promisc half is covered by the v3 patch 1 change above, and replaying the full deferred uc/mc list is a wider item left out here. Petr > > When entering switchdev mode, __dev_uc_unsync() and __dev_mc_unsync() clear > the synchronization state of OS-level MAC addresses, and > ice_remove_vsi_fltr() removes the actual hardware filters. While switchdev > runs, calls to ndo_set_rx_mode return early and do not update hardware. > > When returning to legacy mode, this function restores only the primary MAC > and broadcast addresses, but doesn't seem to invoke netif_set_rx_mode() or > equivalent to resynchronize the deferred Rx filters from the netdev to the > hardware. Could this cause the hardware to silently drop packets for > previously joined groups until the interface is bounced? > ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2026-06-29 13:23 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-06-22 11:34 [PATCH iwl-net v2 0/2] ice: fix DFLT Rx rule handling for promisc and switchdev Petr Oros 2026-06-22 11:34 ` [PATCH iwl-net v2 1/2] ice: skip per-VLAN promisc rules when default VSI Rx rule is set Petr Oros 2026-06-23 16:25 ` Simon Horman 2026-06-29 12:06 ` Petr Oros 2026-06-22 11:34 ` [PATCH iwl-net v2 2/2] ice: preserve uplink DFLT Rx rule on switchdev release Petr Oros 2026-06-23 16:25 ` Simon Horman 2026-06-29 13:21 ` Petr Oros
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox