* [PATCH net v4 0/4] Fix i40e/ice/iavf VF bonding after netdev lock changes
@ 2026-04-23 13:04 Jose Ignacio Tornos Martinez
2026-04-23 13:04 ` [PATCH net v4 1/4] iavf: return EBUSY if reset in progress or not ready during MAC change Jose Ignacio Tornos Martinez
` (3 more replies)
0 siblings, 4 replies; 19+ messages in thread
From: Jose Ignacio Tornos Martinez @ 2026-04-23 13:04 UTC (permalink / raw)
To: netdev
Cc: intel-wired-lan, przemyslaw.kitszel, aleksandr.loktionov,
jacob.e.keller, horms, jesse.brandeburg, anthony.l.nguyen, davem,
edumazet, kuba, pabeni, Jose Ignacio Tornos Martinez
This series fixes VF bonding failures introduced by commit ad7c7b2172c3
("net: hold netdev instance lock during sysfs operations").
When adding VFs to a bond immediately after setting trust mode, MAC
address changes fail with -EAGAIN, preventing bonding setup. This
affects both i40e (700-series) and ice (800-series) Intel NICs.
The core issue is lock contention: iavf_set_mac() is now called with the
netdev lock held and waits for MAC change completion while holding it.
However, both the watchdog task that sends the request and the adminq_task
that processes PF responses also need this lock, creating a deadlock where
neither can run, causing timeouts.
Additionally, setting VF trust triggers an unnecessary ~10 second VF reset
in the i40e driver that delays bonding setup, even though filter
synchronization happens naturally during normal VF operation. In the ice
driver the delay is shorter, but the reset is equally unnecessary.
This series:
1. Adds a safety guard to prevent MAC changes during reset or early
initialization (before VF is ready)
2. Eliminates unnecessary VF reset when setting trust in i40e, resetting
only when ADQ cloud filters need cleanup
3. Fixes lock contention by polling admin queue synchronously
4. Eliminates unnecessary VF reset when setting trust in ice, resetting
only when MAC LLDP filters need cleanup
The key fix (patch 3/4) implements a synchronous MAC change operation
similar to the approach used for the ndo_change_mtu deadlock fix:
https://lore.kernel.org/intel-wired-lan/20260211191855.1532226-1-poros@redhat.com/
Instead of scheduling work and waiting, it:
- Sends the virtchnl message directly (not via watchdog)
- Polls the admin queue hardware directly for responses
- Processes all messages inline (including non-MAC messages)
- Returns when complete or times out
This allows the operation to complete synchronously while holding
netdev_lock, without relying on watchdog or adminq_task.
The function can sleep for up to 2.5 seconds polling hardware, but this
is acceptable since netdev_lock is per-device and only serializes
operations on the same interface.
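As a rough illustration of the polling idea described above, here is a
minimal, self-contained sketch in plain C. It is not the driver code: the
fake_queue type, fake_queue_pop(), poll_for_response() and
wanted_opcode_seen() are hypothetical names invented for this example, an
attempt budget stands in for the 2.5 second timeout, and opcode 0 is assumed
to model a non-virtchnl event that must be skipped.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the admin queue; not an iavf structure. */
struct fake_queue {
	const int *msgs;	/* opcodes waiting to be received */
	size_t count;
	size_t next;
	int last_handled;	/* state updated by inline message processing */
};

typedef bool (*done_fn)(const struct fake_queue *q, const void *data,
			int opcode);

/* Fetch one message; returns false when nothing is pending. */
static bool fake_queue_pop(struct fake_queue *q, int *opcode)
{
	if (q->next >= q->count)
		return false;
	*opcode = q->msgs[q->next++];
	return true;
}

/*
 * Poll the queue directly, process every message inline, and stop when
 * the completion callback is satisfied or the attempt budget runs out.
 * No worker thread is involved, so the caller may keep its lock held.
 */
static int poll_for_response(struct fake_queue *q, done_fn done,
			     const void *data, unsigned int max_polls)
{
	unsigned int i;

	assert(done);

	for (i = 0; i < max_polls; i++) {
		int opcode;

		if (!fake_queue_pop(q, &opcode))
			continue;	/* a real driver would sleep briefly here */

		if (opcode == 0)
			continue;	/* assumed "unknown" event: never completes */

		q->last_handled = opcode;	/* process inline, any opcode */

		if (done(q, data, opcode))
			return 0;
	}

	return -EAGAIN;	/* budget exhausted without the expected response */
}

/* Completion check: has the opcode we are waiting for been handled? */
static bool wanted_opcode_seen(const struct fake_queue *q, const void *data,
			       int opcode)
{
	(void)opcode;
	return q->last_handled == *(const int *)data;
}
```

The callback receives both the shared state and the opcode just processed,
so a caller can decide completion from either one. Unrelated messages are
still handled instead of being dropped.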
Testing shows VF bonding now works reliably in ~5 seconds vs 15+ seconds
before (i40e), without timeouts or errors (i40e and ice).
Tested on Intel 700-series (i40e) and 800-series (ice) dual-port NICs
with iavf driver.
Thanks to Jan Tluka <jtluka@redhat.com> and Yuying Ma <yuma@redhat.com> for
reporting the issues.
Note: The refactoring suggested in v2 review to unify polling functions
and call iavf_virtchnl_completion() for all messages will be submitted
separately to net-next after this fix merges.
Jose Ignacio Tornos Martinez (4):
iavf: return EBUSY if reset in progress or not ready during MAC change
i40e: skip unnecessary VF reset when setting trust
iavf: send MAC change request synchronously
ice: skip unnecessary VF reset when setting trust
---
v4:
- No changes to patch 1 from v3
- Complete patch 2 with AI review (sashiko.dev) from Simon Horman.
- Complete patch 3 with the comments from Przemek Kitszel and AI review
from Simon Horman.
- Complete patch 4 with AI review (sashiko.dev) from Simon Horman and
issues addressed when comparing with i40e.
- Drop patch 5 from v3 (refactoring is postponed)
v3: https://lore.kernel.org/netdev/20260414110006.124286-1-jtornosm@redhat.com/
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 42 ++++++++++++++++++++++++++++++++----------
drivers/net/ethernet/intel/iavf/iavf.h | 10 ++++++++--
drivers/net/ethernet/intel/iavf/iavf_main.c | 73 +++++++++++++++++++++++++++++++++++++++++++++++++++++++------------------
drivers/net/ethernet/intel/iavf/iavf_virtchnl.c | 99 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------
drivers/net/ethernet/intel/ice/ice_sriov.c | 41 +++++++++++++++++++++++++++++++++++++----
drivers/net/ethernet/intel/ice/ice_vf_lib.c | 2 +-
drivers/net/ethernet/intel/ice/ice_vf_lib.h | 1 +
7 files changed, 225 insertions(+), 43 deletions(-)
--
2.43.0
* [PATCH net v4 1/4] iavf: return EBUSY if reset in progress or not ready during MAC change
2026-04-23 13:04 [PATCH net v4 0/4] Fix i40e/ice/iavf VF bonding after netdev lock changes Jose Ignacio Tornos Martinez
@ 2026-04-23 13:04 ` Jose Ignacio Tornos Martinez
2026-04-23 13:14 ` Loktionov, Aleksandr
2026-04-23 13:04 ` [PATCH net v4 2/4] i40e: skip unnecessary VF reset when setting trust Jose Ignacio Tornos Martinez
` (2 subsequent siblings)
3 siblings, 1 reply; 19+ messages in thread
From: Jose Ignacio Tornos Martinez @ 2026-04-23 13:04 UTC (permalink / raw)
To: netdev
Cc: intel-wired-lan, przemyslaw.kitszel, aleksandr.loktionov,
jacob.e.keller, horms, jesse.brandeburg, anthony.l.nguyen, davem,
edumazet, kuba, pabeni, Jose Ignacio Tornos Martinez
When a MAC address change is requested while the VF is resetting or still
initializing, return -EBUSY immediately instead of attempting the
operation.
Additionally, during early initialization states (before __IAVF_DOWN),
the PF may be slow to respond to MAC change requests, causing long
delays. Only allow MAC changes once the VF reaches __IAVF_DOWN state or
later, when the watchdog is running and the VF is ready for operations.
After commit ad7c7b2172c3 ("net: hold netdev instance lock
during sysfs operations"), MAC changes are called with the netdev lock
held, so we should not wait with the lock held during reset or
initialization. This allows the caller to retry or handle the busy state
appropriately without blocking other operations.
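For illustration only, a caller that chooses to retry on -EBUSY could look
roughly like the sketch below. Everything here is hypothetical: the
set_mac_fn type, set_mac_with_retry() and busy_then_ok() are invented for
the example and are not part of the kernel or of this series, and a real
caller would sleep or back off between attempts.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical operation that may report -EBUSY while a device is resetting. */
typedef int (*set_mac_fn)(void *ctx);

/*
 * Retry the operation only while it reports -EBUSY, up to max_tries
 * attempts. Any other result (success or a different error) is final.
 */
static int set_mac_with_retry(set_mac_fn op, void *ctx, unsigned int max_tries)
{
	int ret = -EBUSY;
	unsigned int i;

	assert(op);

	for (i = 0; i < max_tries; i++) {
		ret = op(ctx);
		if (ret != -EBUSY)
			break;
		/* a real caller would wait here before trying again */
	}

	return ret;
}

/* Example operation: busy for the first *ctx calls, then succeeds. */
static int busy_then_ok(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -EBUSY;
	}

	return 0;
}
```

Because the guard returns immediately, the retry policy stays with the
caller and no time is spent waiting with the netdev lock held.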
Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
---
drivers/net/ethernet/intel/iavf/iavf_main.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index dad001abc908..67aa14350b1b 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -1060,6 +1060,9 @@ static int iavf_set_mac(struct net_device *netdev, void *p)
struct sockaddr *addr = p;
int ret;
+ if (iavf_is_reset_in_progress(adapter) || adapter->state < __IAVF_DOWN)
+ return -EBUSY;
+
if (!is_valid_ether_addr(addr->sa_data))
return -EADDRNOTAVAIL;
--
2.53.0
* [PATCH net v4 2/4] i40e: skip unnecessary VF reset when setting trust
2026-04-23 13:04 [PATCH net v4 0/4] Fix i40e/ice/iavf VF bonding after netdev lock changes Jose Ignacio Tornos Martinez
2026-04-23 13:04 ` [PATCH net v4 1/4] iavf: return EBUSY if reset in progress or not ready during MAC change Jose Ignacio Tornos Martinez
@ 2026-04-23 13:04 ` Jose Ignacio Tornos Martinez
2026-04-23 13:14 ` Loktionov, Aleksandr
2026-04-27 16:25 ` Simon Horman
2026-04-23 13:04 ` [PATCH net v4 3/4] iavf: send MAC change request synchronously Jose Ignacio Tornos Martinez
2026-04-23 13:04 ` [PATCH net v4 4/4] ice: skip unnecessary VF reset when setting trust Jose Ignacio Tornos Martinez
3 siblings, 2 replies; 19+ messages in thread
From: Jose Ignacio Tornos Martinez @ 2026-04-23 13:04 UTC (permalink / raw)
To: netdev
Cc: intel-wired-lan, przemyslaw.kitszel, aleksandr.loktionov,
jacob.e.keller, horms, jesse.brandeburg, anthony.l.nguyen, davem,
edumazet, kuba, pabeni, Jose Ignacio Tornos Martinez
The current implementation triggers a VF reset whenever the trust
setting changes, which delays bonding setup.
In every case the reset causes a ~10 second delay during which:
- VF must reinitialize completely
- Any in-progress operations (like bonding enslave) fail with timeouts
- VF is unavailable
When granting trust, no reset is needed - we can just set the capability
flag to allow privileged operations.
When revoking trust, we need to:
1. Clear the capability flag to block privileged operations
2. Disable promiscuous mode if it was enabled (trusted VFs can enable it)
3. Only reset if ADQ is enabled (to clean up cloud filters)
When we do reset (ADQ case), we reset first to clear VF_STATE_ACTIVE
(which blocks new cloud filter additions), then delete existing cloud
filters safely. This avoids the race condition where VF could add filters
during deletion.
When we don't reset, we manually handle the capability flag and promiscuous
mode via a helper function, eliminating the delay.
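The decision described above can be modeled with a small stand-alone sketch.
This is a simplified model, not the i40e code: struct vf_model, model_reset()
and model_set_trust() are hypothetical names, and the reset is assumed to
restore the capability flag and promiscuous state itself, as the description
states.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified, hypothetical model of the VF state relevant to trust changes. */
struct vf_model {
	bool privileged;		/* capability flag */
	bool promisc;			/* promiscuous mode enabled */
	bool adq_enabled;
	unsigned int cloud_filters;
	bool was_reset;			/* records whether the slow path ran */
};

/* Models a full VF reset, which reapplies capability and promisc state. */
static void model_reset(struct vf_model *vf, bool trusted)
{
	vf->was_reset = true;
	vf->privileged = trusted;
	vf->promisc = false;
}

static void model_set_trust(struct vf_model *vf, bool setting)
{
	assert(vf);

	if (vf->adq_enabled && !setting) {
		/* Reset first so no new filters can be added, then clean up. */
		model_reset(vf, setting);
		vf->cloud_filters = 0;
		return;
	}

	/* Fast path: adjust the flag and promiscuous mode without a reset. */
	vf->privileged = setting;
	if (!setting)
		vf->promisc = false;
}
```

Only the combination of revoking trust with ADQ enabled takes the slow reset
path in this model. Granting trust, or revoking it without ADQ, completes
without a reset.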
Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
---
v4: Address AI review (sashiko.dev) from Simon Horman:
- Manually set/clear capability flag when not resetting
- Explicitly disable promiscuous mode when revoking trust
- Fix cloud filter race: reset FIRST (clears VF_STATE_ACTIVE),
delete filters AFTER (no race window)
- Add helper function i40e_setup_vf_trust() for non-reset path
v3: https://lore.kernel.org/all/20260414110006.124286-3-jtornosm@redhat.com/
.../ethernet/intel/i40e/i40e_virtchnl_pf.c | 42 ++++++++++++++-----
1 file changed, 32 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index a26c3d47ec15..69f68fec6809 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -4943,6 +4943,30 @@ int i40e_ndo_set_vf_spoofchk(struct net_device *netdev, int vf_id, bool enable)
return ret;
}
+/**
+ * i40e_setup_vf_trust - Enable/disable VF trust mode without reset
+ * @vf: VF to configure
+ * @setting: trust setting
+ *
+ * Manually handle capability flag and promiscuous mode when changing trust
+ * without performing a VF reset.
+ * When reset is performed, this is not necessary as the reset procedure
+ * already handles this.
+ **/
+static void i40e_setup_vf_trust(struct i40e_vf *vf, bool setting)
+{
+ if (setting) {
+ set_bit(I40E_VIRTCHNL_VF_CAP_PRIVILEGE, &vf->vf_caps);
+ } else {
+ clear_bit(I40E_VIRTCHNL_VF_CAP_PRIVILEGE, &vf->vf_caps);
+
+ if (test_bit(I40E_VF_STATE_UC_PROMISC, &vf->vf_states) ||
+ test_bit(I40E_VF_STATE_MC_PROMISC, &vf->vf_states))
+ i40e_config_vf_promiscuous_mode(vf, vf->lan_vsi_idx,
+ false, false);
+ }
+}
+
/**
* i40e_ndo_set_vf_trust
* @netdev: network interface device structure of the pf
@@ -4987,19 +5011,17 @@ int i40e_ndo_set_vf_trust(struct net_device *netdev, int vf_id, bool setting)
set_bit(__I40E_MACVLAN_SYNC_PENDING, pf->state);
pf->vsi[vf->lan_vsi_idx]->flags |= I40E_VSI_FLAG_FILTER_CHANGED;
- i40e_vc_reset_vf(vf, true);
+ /* Reset only if revoking trust with ADQ (for cloud filter cleanup) */
+ if (vf->adq_enabled && !setting) {
+ i40e_vc_reset_vf(vf, true);
+ i40e_del_all_cloud_filters(vf);
+ } else {
+ i40e_setup_vf_trust(vf, setting);
+ }
+
dev_info(&pf->pdev->dev, "VF %u is now %strusted\n",
vf_id, setting ? "" : "un");
- if (vf->adq_enabled) {
- if (!vf->trusted) {
- dev_info(&pf->pdev->dev,
- "VF %u no longer Trusted, deleting all cloud filters\n",
- vf_id);
- i40e_del_all_cloud_filters(vf);
- }
- }
-
out:
clear_bit(__I40E_VIRTCHNL_OP_PENDING, pf->state);
return ret;
--
2.53.0
* [PATCH net v4 3/4] iavf: send MAC change request synchronously
2026-04-23 13:04 [PATCH net v4 0/4] Fix i40e/ice/iavf VF bonding after netdev lock changes Jose Ignacio Tornos Martinez
2026-04-23 13:04 ` [PATCH net v4 1/4] iavf: return EBUSY if reset in progress or not ready during MAC change Jose Ignacio Tornos Martinez
2026-04-23 13:04 ` [PATCH net v4 2/4] i40e: skip unnecessary VF reset when setting trust Jose Ignacio Tornos Martinez
@ 2026-04-23 13:04 ` Jose Ignacio Tornos Martinez
2026-04-23 13:14 ` Loktionov, Aleksandr
` (2 more replies)
2026-04-23 13:04 ` [PATCH net v4 4/4] ice: skip unnecessary VF reset when setting trust Jose Ignacio Tornos Martinez
3 siblings, 3 replies; 19+ messages in thread
From: Jose Ignacio Tornos Martinez @ 2026-04-23 13:04 UTC (permalink / raw)
To: netdev
Cc: intel-wired-lan, przemyslaw.kitszel, aleksandr.loktionov,
jacob.e.keller, horms, jesse.brandeburg, anthony.l.nguyen, davem,
edumazet, kuba, pabeni, Jose Ignacio Tornos Martinez, stable
After commit ad7c7b2172c3 ("net: hold netdev instance lock during sysfs
operations"), iavf_set_mac() is called with the netdev instance lock
already held.
The function queues a MAC address change request via
iavf_replace_primary_mac() and then waits for completion. However, in
the current flow, the actual virtchnl message is sent by the watchdog
task, which also needs to acquire the netdev lock to run. Additionally,
the adminq_task which processes virtchnl responses also needs the netdev
lock.
This creates a deadlock scenario:
1. iavf_set_mac() holds netdev lock and waits for MAC change
2. Watchdog needs netdev lock to send the request -> blocked
3. Even if request is sent, adminq_task needs netdev lock to process
PF response -> blocked
4. MAC change times out after 2.5 seconds
5. iavf_set_mac() returns -EAGAIN
This particularly affects VFs during bonding setup when multiple VFs are
enslaved in quick succession.
Fix by implementing a synchronous MAC change operation similar to the
approach used in commit fdadbf6e84c4 ("iavf: fix incorrect reset handling
in callbacks").
The solution:
1. Send the virtchnl ADD_ETH_ADDR message directly (not via watchdog)
2. Poll the admin queue hardware directly for responses
3. Process all received messages (including non-MAC messages)
4. Return when MAC change completes or times out
A new generic function iavf_poll_virtchnl_response() is introduced that
can be reused for any future synchronous virtchnl operations. It takes a
callback to check completion, allowing flexible condition checking.
This allows the operation to complete synchronously while holding
netdev_lock, without relying on watchdog or adminq_task. The function
can sleep for up to 2.5 seconds polling hardware, but this is acceptable
since netdev_lock is per-device and only serializes operations on the
same interface.
To support this, change iavf_add_ether_addrs() to return an error code
instead of void, allowing callers to detect failures. Additionally,
export iavf_mac_add_reject() to enable proper rollback on local failures
(timeouts, send errors) - PF rejections are already handled automatically
by iavf_virtchnl_completion().
Fixes: ad7c7b2172c3 ("net: hold netdev instance lock during sysfs operations")
cc: stable@vger.kernel.org
Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
---
v4: Complete with Przemek Kitszel comments:
- Remove vc_waitqueue entirely (not needed any more)
- Add named parameters to callback function pointer declaration for
clarity
- Simplify callback signature: add v_op parameter so callback
receives the opcode from the processed message to identify which
response was received
- Optimize polling loop to single condition check per iteration
instead of checking both before and after message processing
Address AI review (sashiko.dev) from Simon Horman:
- Complete iavf_add_ether_addrs() error handling
- Skip non-virtchnl hardware events (received_op=VIRTCHNL_OP_UNKNOWN),
these can cause false completion detection
- Complete rollback for local failures (not PF rejection) reusing
iavf_mac_add_reject() to restore the old primary filter
v3: https://lore.kernel.org/netdev/20260414110006.124286-4-jtornosm@redhat.com/
drivers/net/ethernet/intel/iavf/iavf.h | 10 +-
drivers/net/ethernet/intel/iavf/iavf_main.c | 70 +++++++++----
.../net/ethernet/intel/iavf/iavf_virtchnl.c | 99 +++++++++++++++++--
3 files changed, 151 insertions(+), 28 deletions(-)
diff --git a/drivers/net/ethernet/intel/iavf/iavf.h b/drivers/net/ethernet/intel/iavf/iavf.h
index e9fb0a0919e3..78fa3df06e11 100644
--- a/drivers/net/ethernet/intel/iavf/iavf.h
+++ b/drivers/net/ethernet/intel/iavf/iavf.h
@@ -260,7 +260,6 @@ struct iavf_adapter {
struct work_struct adminq_task;
struct work_struct finish_config;
wait_queue_head_t down_waitqueue;
- wait_queue_head_t vc_waitqueue;
struct iavf_q_vector *q_vectors;
struct list_head vlan_filter_list;
int num_vlan_filters;
@@ -589,8 +588,9 @@ void iavf_configure_queues(struct iavf_adapter *adapter);
void iavf_enable_queues(struct iavf_adapter *adapter);
void iavf_disable_queues(struct iavf_adapter *adapter);
void iavf_map_queues(struct iavf_adapter *adapter);
-void iavf_add_ether_addrs(struct iavf_adapter *adapter);
+int iavf_add_ether_addrs(struct iavf_adapter *adapter);
void iavf_del_ether_addrs(struct iavf_adapter *adapter);
+void iavf_mac_add_reject(struct iavf_adapter *adapter);
void iavf_add_vlans(struct iavf_adapter *adapter);
void iavf_del_vlans(struct iavf_adapter *adapter);
void iavf_set_promiscuous(struct iavf_adapter *adapter);
@@ -607,6 +607,12 @@ void iavf_disable_vlan_stripping(struct iavf_adapter *adapter);
void iavf_virtchnl_completion(struct iavf_adapter *adapter,
enum virtchnl_ops v_opcode,
enum iavf_status v_retval, u8 *msg, u16 msglen);
+int iavf_poll_virtchnl_response(struct iavf_adapter *adapter,
+ bool (*condition)(struct iavf_adapter *adapter,
+ const void *data,
+ enum virtchnl_ops v_op),
+ const void *cond_data,
+ unsigned int timeout_ms);
int iavf_config_rss(struct iavf_adapter *adapter);
void iavf_cfg_queues_bw(struct iavf_adapter *adapter);
void iavf_cfg_queues_quanta_size(struct iavf_adapter *adapter);
diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index 67aa14350b1b..bc5994bf2cd9 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -1047,6 +1047,48 @@ static bool iavf_is_mac_set_handled(struct net_device *netdev,
return ret;
}
+/**
+ * iavf_mac_change_done - Check if MAC change completed
+ * @adapter: board private structure
+ * @data: MAC address being checked (as const void *)
+ * @v_op: virtchnl opcode from processed message
+ *
+ * Callback for iavf_poll_virtchnl_response() to check if MAC change completed.
+ *
+ * Returns true if MAC change completed, false otherwise
+ */
+static bool iavf_mac_change_done(struct iavf_adapter *adapter,
+ const void *data, enum virtchnl_ops v_op)
+{
+ const u8 *addr = data;
+
+ return iavf_is_mac_set_handled(adapter->netdev, addr);
+}
+
+/**
+ * iavf_set_mac_sync - Synchronously change MAC address
+ * @adapter: board private structure
+ * @addr: MAC address to set
+ *
+ * Sends MAC change request to PF and polls admin queue for response.
+ * Caller must hold netdev_lock. This can sleep for up to 2.5 seconds.
+ *
+ * Returns 0 on success, negative on failure
+ */
+static int iavf_set_mac_sync(struct iavf_adapter *adapter, const u8 *addr)
+{
+ int ret;
+
+ netdev_assert_locked(adapter->netdev);
+
+ ret = iavf_add_ether_addrs(adapter);
+ if (ret)
+ return ret;
+
+ return iavf_poll_virtchnl_response(adapter, iavf_mac_change_done,
+ addr, 2500);
+}
+
/**
* iavf_set_mac - NDO callback to set port MAC address
* @netdev: network interface device structure
@@ -1067,25 +1109,20 @@ static int iavf_set_mac(struct net_device *netdev, void *p)
return -EADDRNOTAVAIL;
ret = iavf_replace_primary_mac(adapter, addr->sa_data);
-
if (ret)
return ret;
- ret = wait_event_interruptible_timeout(adapter->vc_waitqueue,
- iavf_is_mac_set_handled(netdev, addr->sa_data),
- msecs_to_jiffies(2500));
-
- /* If ret < 0 then it means wait was interrupted.
- * If ret == 0 then it means we got a timeout.
- * else it means we got response for set MAC from PF,
- * check if netdev MAC was updated to requested MAC,
- * if yes then set MAC succeeded otherwise it failed return -EACCES
- */
- if (ret < 0)
+ ret = iavf_set_mac_sync(adapter, addr->sa_data);
+ if (ret) {
+ /* Rollback for local failures (timeout, send error, -EBUSY).
+ * Note: If PF rejects the request (sends error response),
+ * iavf_virtchnl_completion() automatically calls
+ * iavf_mac_add_reject(), ret=0, and this is not executed.
+ * Only local failures (no PF response received) need manual rollback.
+ */
+ iavf_mac_add_reject(adapter);
return ret;
-
- if (!ret)
- return -EAGAIN;
+ }
if (!ether_addr_equal(netdev->dev_addr, addr->sa_data))
return -EACCES;
@@ -5415,9 +5452,6 @@ static int iavf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
/* Setup the wait queue for indicating transition to down status */
init_waitqueue_head(&adapter->down_waitqueue);
- /* Setup the wait queue for indicating virtchannel events */
- init_waitqueue_head(&adapter->vc_waitqueue);
-
INIT_LIST_HEAD(&adapter->ptp.aq_cmds);
init_waitqueue_head(&adapter->ptp.phc_time_waitqueue);
mutex_init(&adapter->ptp.aq_cmd_lock);
diff --git a/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c b/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c
index a52c100dcbc5..d1afb8261c24 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c
@@ -2,6 +2,7 @@
/* Copyright(c) 2013 - 2018 Intel Corporation. */
#include <linux/net/intel/libie/rx.h>
+#include <net/netdev_lock.h>
#include "iavf.h"
#include "iavf_ptp.h"
@@ -555,20 +556,23 @@ iavf_set_mac_addr_type(struct virtchnl_ether_addr *virtchnl_ether_addr,
* @adapter: adapter structure
*
* Request that the PF add one or more addresses to our filters.
+ *
+ * Return: 0 on success, negative on failure
**/
-void iavf_add_ether_addrs(struct iavf_adapter *adapter)
+int iavf_add_ether_addrs(struct iavf_adapter *adapter)
{
struct virtchnl_ether_addr_list *veal;
struct iavf_mac_filter *f;
int i = 0, count = 0;
bool more = false;
size_t len;
+ int ret;
if (adapter->current_op != VIRTCHNL_OP_UNKNOWN) {
/* bail because we already have a command pending */
dev_err(&adapter->pdev->dev, "Cannot add filters, command %d pending\n",
adapter->current_op);
- return;
+ return -EBUSY;
}
spin_lock_bh(&adapter->mac_vlan_list_lock);
@@ -580,7 +584,7 @@ void iavf_add_ether_addrs(struct iavf_adapter *adapter)
if (!count) {
adapter->aq_required &= ~IAVF_FLAG_AQ_ADD_MAC_FILTER;
spin_unlock_bh(&adapter->mac_vlan_list_lock);
- return;
+ return 0;
}
adapter->current_op = VIRTCHNL_OP_ADD_ETH_ADDR;
@@ -594,8 +598,9 @@ void iavf_add_ether_addrs(struct iavf_adapter *adapter)
veal = kzalloc(len, GFP_ATOMIC);
if (!veal) {
+ adapter->current_op = VIRTCHNL_OP_UNKNOWN;
spin_unlock_bh(&adapter->mac_vlan_list_lock);
- return;
+ return -ENOMEM;
}
veal->vsi_id = adapter->vsi_res->vsi_id;
@@ -615,8 +620,15 @@ void iavf_add_ether_addrs(struct iavf_adapter *adapter)
spin_unlock_bh(&adapter->mac_vlan_list_lock);
- iavf_send_pf_msg(adapter, VIRTCHNL_OP_ADD_ETH_ADDR, (u8 *)veal, len);
+ ret = iavf_send_pf_msg(adapter, VIRTCHNL_OP_ADD_ETH_ADDR, (u8 *)veal, len);
kfree(veal);
+ if (ret) {
+ dev_err(&adapter->pdev->dev,
+ "Unable to send ADD_ETH_ADDR message to PF, error %d\n", ret);
+ adapter->current_op = VIRTCHNL_OP_UNKNOWN;
+ }
+
+ return ret;
}
/**
@@ -713,7 +725,7 @@ static void iavf_mac_add_ok(struct iavf_adapter *adapter)
*
* Remove filters from list based on PF response.
**/
-static void iavf_mac_add_reject(struct iavf_adapter *adapter)
+void iavf_mac_add_reject(struct iavf_adapter *adapter)
{
struct net_device *netdev = adapter->netdev;
struct iavf_mac_filter *f, *ftmp;
@@ -2389,7 +2401,6 @@ void iavf_virtchnl_completion(struct iavf_adapter *adapter,
iavf_mac_add_reject(adapter);
/* restore administratively set MAC address */
ether_addr_copy(adapter->hw.mac.addr, netdev->dev_addr);
- wake_up(&adapter->vc_waitqueue);
break;
case VIRTCHNL_OP_DEL_VLAN:
dev_err(&adapter->pdev->dev, "Failed to delete VLAN filter, error %s\n",
@@ -2586,7 +2597,6 @@ void iavf_virtchnl_completion(struct iavf_adapter *adapter,
eth_hw_addr_set(netdev, adapter->hw.mac.addr);
netif_addr_unlock_bh(netdev);
}
- wake_up(&adapter->vc_waitqueue);
break;
case VIRTCHNL_OP_GET_STATS: {
struct iavf_eth_stats *stats =
@@ -2956,3 +2966,76 @@ void iavf_virtchnl_completion(struct iavf_adapter *adapter,
} /* switch v_opcode */
adapter->current_op = VIRTCHNL_OP_UNKNOWN;
}
+
+/**
+ * iavf_poll_virtchnl_response - Poll admin queue for virtchnl response
+ * @adapter: adapter structure
+ * @condition: callback to check if desired response received
+ * @cond_data: context data passed to condition callback
+ * @timeout_ms: maximum time to wait in milliseconds
+ *
+ * Polls the admin queue and processes all incoming virtchnl messages.
+ * After processing each valid message, calls the condition callback to check
+ * if the expected response has been received. The callback receives the opcode
+ * of the processed message to identify which response was received. Continues
+ * polling until the callback returns true or timeout expires.
+ * Clear current_op on timeout to prevent permanent -EBUSY state.
+ * Caller must hold netdev_lock. This can sleep for up to timeout_ms while
+ * polling hardware.
+ *
+ * Return: 0 on success (condition met), -EAGAIN on timeout, or error code
+ **/
+int iavf_poll_virtchnl_response(struct iavf_adapter *adapter,
+ bool (*condition)(struct iavf_adapter *adapter,
+ const void *data,
+ enum virtchnl_ops v_op),
+ const void *cond_data,
+ unsigned int timeout_ms)
+{
+ struct iavf_hw *hw = &adapter->hw;
+ struct iavf_arq_event_info event;
+ enum virtchnl_ops received_op;
+ unsigned long timeout;
+ u32 v_retval;
+ u16 pending;
+ int ret = -EAGAIN;
+
+ netdev_assert_locked(adapter->netdev);
+
+ event.buf_len = IAVF_MAX_AQ_BUF_SIZE;
+ event.msg_buf = kzalloc(event.buf_len, GFP_KERNEL);
+ if (!event.msg_buf)
+ return -ENOMEM;
+
+ timeout = jiffies + msecs_to_jiffies(timeout_ms);
+ do {
+ if (iavf_clean_arq_element(hw, &event, &pending) == IAVF_SUCCESS) {
+ received_op = (enum virtchnl_ops)le32_to_cpu(event.desc.cookie_high);
+ if (received_op != VIRTCHNL_OP_UNKNOWN) {
+ v_retval = le32_to_cpu(event.desc.cookie_low);
+
+ iavf_virtchnl_completion(adapter, received_op,
+ (enum iavf_status)v_retval,
+ event.msg_buf, event.msg_len);
+
+ if (condition(adapter, cond_data, received_op)) {
+ ret = 0;
+ break;
+ }
+ }
+
+ memset(event.msg_buf, 0, IAVF_MAX_AQ_BUF_SIZE);
+
+ if (pending)
+ continue;
+ }
+
+ usleep_range(50, 75);
+ } while (time_before(jiffies, timeout));
+
+ if (ret == -EAGAIN && adapter->current_op != VIRTCHNL_OP_UNKNOWN)
+ adapter->current_op = VIRTCHNL_OP_UNKNOWN;
+
+ kfree(event.msg_buf);
+ return ret;
+}
--
2.53.0
* [PATCH net v4 4/4] ice: skip unnecessary VF reset when setting trust
2026-04-23 13:04 [PATCH net v4 0/4] Fix i40e/ice/iavf VF bonding after netdev lock changes Jose Ignacio Tornos Martinez
` (2 preceding siblings ...)
2026-04-23 13:04 ` [PATCH net v4 3/4] iavf: send MAC change request synchronously Jose Ignacio Tornos Martinez
@ 2026-04-23 13:04 ` Jose Ignacio Tornos Martinez
2026-04-23 13:17 ` Loktionov, Aleksandr
2026-04-27 16:50 ` Simon Horman
3 siblings, 2 replies; 19+ messages in thread
From: Jose Ignacio Tornos Martinez @ 2026-04-23 13:04 UTC (permalink / raw)
To: netdev
Cc: intel-wired-lan, przemyslaw.kitszel, aleksandr.loktionov,
jacob.e.keller, horms, jesse.brandeburg, anthony.l.nguyen, davem,
edumazet, kuba, pabeni, Jose Ignacio Tornos Martinez
As in i40e, ice_set_vf_trust() unconditionally calls ice_reset_vf() when
the trust setting changes. While the delay is smaller than in i40e, this
reset is still unnecessary in most cases.
Additionally, the original code has a race condition: it deletes MAC LLDP
filters BEFORE resetting the VF. During this deletion, the VF is still
ACTIVE and can add new MAC LLDP filters concurrently, potentially
corrupting the filter list.
When granting trust, no reset is needed - we can just set the capability
flag to allow privileged operations.
When revoking trust, we need to:
1. Clear the capability flag to block privileged operations
2. Disable promiscuous mode if it was enabled (trusted VFs can enable it)
3. Only reset if MAC LLDP filters exist (to clean them up)
When we do reset (MAC LLDP case), we fix the race condition by resetting
first to clear VF state (which blocks new MAC LLDP filter additions), then
delete existing filters safely. During cleanup, vf->trusted remains true so
ice_vf_is_lldp_ena() works properly. Only after cleanup do we set
vf->trusted = false.
When we don't reset, we manually handle the capability flag and promiscuous
mode via a helper function.
The ice driver already has logic to clean up MAC LLDP filters when
removing trust. After this cleanup, the VF reset is only necessary if
there were actually filters to remove (num_mac_lldp was non-zero).
This saves time and eliminates unnecessary service disruption when
changing VF trust settings in most cases, while properly handling filter
cleanup.
Fixes: 2296345416b0 ("ice: receive LLDP on trusted VFs")
Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
---
v4:
- Address AI review (sashiko.dev) from Simon Horman:
vf->trusted ordering bug
- Fix upstream race condition when comparing with i40e code
- Apply capability flag and promiscuous mode fixes from i40e AI review
- Add helper function ice_setup_vf_trust() for non-reset path
- Export ice_vf_clear_all_promisc_modes() for code reuse
v3: https://lore.kernel.org/all/20260414110006.124286-5-jtornosm@redhat.com/
drivers/net/ethernet/intel/ice/ice_sriov.c | 41 +++++++++++++++++++--
drivers/net/ethernet/intel/ice/ice_vf_lib.c | 2 +-
drivers/net/ethernet/intel/ice/ice_vf_lib.h | 1 +
3 files changed, 39 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_sriov.c b/drivers/net/ethernet/intel/ice/ice_sriov.c
index 7e00e091756d..d0da7f6adc23 100644
--- a/drivers/net/ethernet/intel/ice/ice_sriov.c
+++ b/drivers/net/ethernet/intel/ice/ice_sriov.c
@@ -1364,6 +1364,34 @@ int ice_set_vf_mac(struct net_device *netdev, int vf_id, u8 *mac)
return __ice_set_vf_mac(ice_netdev_to_pf(netdev), vf_id, mac);
}
+/**
+ * ice_setup_vf_trust - Enable/disable VF trust mode without reset
+ * @vf: VF to configure
+ * @setting: trust setting
+ *
+ * Manually handle capability flag and promiscuous mode when changing trust
+ * without performing a VF reset.
+ * When reset is performed, this is not necessary as the reset procedure
+ * already handles this.
+ **/
+static void ice_setup_vf_trust(struct ice_vf *vf, bool setting)
+{
+ struct ice_vsi *vsi;
+
+ if (setting) {
+ set_bit(ICE_VIRTCHNL_VF_CAP_PRIVILEGE, &vf->vf_caps);
+ } else {
+ clear_bit(ICE_VIRTCHNL_VF_CAP_PRIVILEGE, &vf->vf_caps);
+
+ if (test_bit(ICE_VF_STATE_UC_PROMISC, vf->vf_states) ||
+ test_bit(ICE_VF_STATE_MC_PROMISC, vf->vf_states)) {
+ vsi = ice_get_vf_vsi(vf);
+ if (vsi)
+ ice_vf_clear_all_promisc_modes(vf, vsi);
+ }
+ }
+}
+
/**
* ice_set_vf_trust
* @netdev: network interface device structure
@@ -1399,11 +1427,16 @@ int ice_set_vf_trust(struct net_device *netdev, int vf_id, bool trusted)
mutex_lock(&vf->cfg_lock);
- while (!trusted && vf->num_mac_lldp)
- ice_vf_update_mac_lldp_num(vf, ice_get_vf_vsi(vf), false);
-
+ /* Reset only if revoking trust with MAC LLDP filters */
+ if (!trusted && vf->num_mac_lldp) {
+ ice_reset_vf(vf, ICE_VF_RESET_NOTIFY);
+ while (vf->num_mac_lldp)
+ ice_vf_update_mac_lldp_num(vf, ice_get_vf_vsi(vf), false);
+ } else {
+ ice_setup_vf_trust(vf, trusted);
+ }
vf->trusted = trusted;
- ice_reset_vf(vf, ICE_VF_RESET_NOTIFY);
+
dev_info(ice_pf_to_dev(pf), "VF %u is now %strusted\n",
vf_id, trusted ? "" : "un");
diff --git a/drivers/net/ethernet/intel/ice/ice_vf_lib.c b/drivers/net/ethernet/intel/ice/ice_vf_lib.c
index c8bc952f05cd..81bbf30e5c29 100644
--- a/drivers/net/ethernet/intel/ice/ice_vf_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_vf_lib.c
@@ -623,7 +623,7 @@ ice_vf_get_promisc_masks(struct ice_vf *vf, struct ice_vsi *vsi,
*
* Clear all promiscuous/allmulticast filters for a VF
*/
-static int
+int
ice_vf_clear_all_promisc_modes(struct ice_vf *vf, struct ice_vsi *vsi)
{
struct ice_pf *pf = vf->pf;
diff --git a/drivers/net/ethernet/intel/ice/ice_vf_lib.h b/drivers/net/ethernet/intel/ice/ice_vf_lib.h
index 7a9c75d1d07c..a3501bd92311 100644
--- a/drivers/net/ethernet/intel/ice/ice_vf_lib.h
+++ b/drivers/net/ethernet/intel/ice/ice_vf_lib.h
@@ -310,6 +310,7 @@ bool ice_is_any_vf_in_unicast_promisc(struct ice_pf *pf);
void
ice_vf_get_promisc_masks(struct ice_vf *vf, struct ice_vsi *vsi,
u8 *ucast_m, u8 *mcast_m);
+int ice_vf_clear_all_promisc_modes(struct ice_vf *vf, struct ice_vsi *vsi);
int
ice_vf_set_vsi_promisc(struct ice_vf *vf, struct ice_vsi *vsi, u8 promisc_m);
int
--
2.53.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
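As an illustration only (not part of the patch), the ordering in the ice_set_vf_trust() hunk above can be modelled in plain userspace C. Every model_* name is hypothetical, the decrement loop stands in for the per-filter removal, and the simulated reset simply drops privileged state, which is an assumption standing in for the real reset path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of the VF fields the hunk touches. */
struct model_vf {
	bool trusted;
	bool cap_privilege;
	bool uc_promisc;
	bool mc_promisc;
	unsigned int num_mac_lldp;
	unsigned int resets;	/* number of simulated VF resets */
};

/* Assumption: a reset drops privileged state; the real path does more. */
void model_reset(struct model_vf *vf)
{
	vf->resets++;
	vf->cap_privilege = false;
	vf->uc_promisc = false;
	vf->mc_promisc = false;
}

void model_set_trust(struct model_vf *vf, bool trusted)
{
	assert(vf != NULL);

	if (!trusted && vf->num_mac_lldp) {
		/* Reset first, then drop filters while trusted is still set. */
		model_reset(vf);
		while (vf->num_mac_lldp)
			vf->num_mac_lldp--;
	} else {
		/* No reset: update the flag and promiscuous state in place. */
		vf->cap_privilege = trusted;
		if (!trusted) {
			vf->uc_promisc = false;
			vf->mc_promisc = false;
		}
	}
	vf->trusted = trusted;	/* updated only after cleanup */
}
```

In this model, granting trust and revoking it without MAC LLDP filters never increments the reset counter; only revocation with filters present does.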
* RE: [PATCH net v4 1/4] iavf: return EBUSY if reset in progress or not ready during MAC change
2026-04-23 13:04 ` [PATCH net v4 1/4] iavf: return EBUSY if reset in progress or not ready during MAC change Jose Ignacio Tornos Martinez
@ 2026-04-23 13:14 ` Loktionov, Aleksandr
0 siblings, 0 replies; 19+ messages in thread
From: Loktionov, Aleksandr @ 2026-04-23 13:14 UTC (permalink / raw)
To: Jose Ignacio Tornos Martinez, netdev@vger.kernel.org
Cc: intel-wired-lan@lists.osuosl.org, Kitszel, Przemyslaw,
Keller, Jacob E, horms@kernel.org, jesse.brandeburg@intel.com,
Nguyen, Anthony L, davem@davemloft.net, edumazet@google.com,
kuba@kernel.org, pabeni@redhat.com
> -----Original Message-----
> From: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
> Sent: Thursday, April 23, 2026 3:04 PM
> To: netdev@vger.kernel.org
> Cc: intel-wired-lan@lists.osuosl.org; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; Loktionov, Aleksandr
> <aleksandr.loktionov@intel.com>; Keller, Jacob E
> <jacob.e.keller@intel.com>; horms@kernel.org;
> jesse.brandeburg@intel.com; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; davem@davemloft.net;
> edumazet@google.com; kuba@kernel.org; pabeni@redhat.com; Jose Ignacio
> Tornos Martinez <jtornosm@redhat.com>
> Subject: [PATCH net v4 1/4] iavf: return EBUSY if reset in progress or
> not ready during MAC change
>
> When a MAC address change is requested while the VF is resetting or
> still initializing, return -EBUSY immediately instead of attempting
> the operation.
>
> Additionally, during early initialization states (before __IAVF_DOWN),
> the PF may be slow to respond to MAC change requests, causing long
> delays. Only allow MAC changes once the VF reaches __IAVF_DOWN state
> or later, when the watchdog is running and the VF is ready for
> operations.
>
> After commit ad7c7b2172c3 ("net: hold netdev instance lock during
> sysfs operations"), MAC changes are called with the netdev lock held,
> so we should not wait with the lock held during reset or
> initialization. This allows the caller to retry or handle the busy
> state appropriately without blocking other operations.
>
> Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
> ---
>
> drivers/net/ethernet/intel/iavf/iavf_main.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
> index dad001abc908..67aa14350b1b 100644
> --- a/drivers/net/ethernet/intel/iavf/iavf_main.c
> +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
> @@ -1060,6 +1060,9 @@ static int iavf_set_mac(struct net_device *netdev, void *p)
> struct sockaddr *addr = p;
> int ret;
>
> + if (iavf_is_reset_in_progress(adapter) || adapter->state < __IAVF_DOWN)
> + return -EBUSY;
> +
> if (!is_valid_ether_addr(addr->sa_data))
> return -EADDRNOTAVAIL;
>
> --
> 2.53.0
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
^ permalink raw reply [flat|nested] 19+ messages in thread
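As an illustration only, the guard added by patch 1/4 can be sketched in userspace C. The state enum below is a hypothetical, shortened stand-in for the driver's state list (only the ordering relative to the "down" state matters), and all model_* names are assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, shortened state list; order is what the guard relies on. */
enum model_state {
	MODEL_STARTUP,
	MODEL_INIT,
	MODEL_DOWN,
	MODEL_RUNNING,
};

struct model_adapter {
	enum model_state state;
	bool reset_in_progress;
};

/* Return 0 if a MAC change may proceed, -EBUSY if the caller should retry. */
int model_set_mac_guard(const struct model_adapter *a)
{
	assert(a != NULL);

	if (a->reset_in_progress || a->state < MODEL_DOWN)
		return -EBUSY;
	return 0;
}
```

The check returns immediately rather than waiting, which matches the stated goal of not sleeping with the netdev lock held during reset or initialization.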
* RE: [PATCH net v4 2/4] i40e: skip unnecessary VF reset when setting trust
2026-04-23 13:04 ` [PATCH net v4 2/4] i40e: skip unnecessary VF reset when setting trust Jose Ignacio Tornos Martinez
@ 2026-04-23 13:14 ` Loktionov, Aleksandr
2026-04-27 16:25 ` Simon Horman
1 sibling, 0 replies; 19+ messages in thread
From: Loktionov, Aleksandr @ 2026-04-23 13:14 UTC (permalink / raw)
To: Jose Ignacio Tornos Martinez, netdev@vger.kernel.org
Cc: intel-wired-lan@lists.osuosl.org, Kitszel, Przemyslaw,
Keller, Jacob E, horms@kernel.org, jesse.brandeburg@intel.com,
Nguyen, Anthony L, davem@davemloft.net, edumazet@google.com,
kuba@kernel.org, pabeni@redhat.com
> -----Original Message-----
> From: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
> Sent: Thursday, April 23, 2026 3:04 PM
> To: netdev@vger.kernel.org
> Cc: intel-wired-lan@lists.osuosl.org; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; Loktionov, Aleksandr
> <aleksandr.loktionov@intel.com>; Keller, Jacob E
> <jacob.e.keller@intel.com>; horms@kernel.org;
> jesse.brandeburg@intel.com; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; davem@davemloft.net;
> edumazet@google.com; kuba@kernel.org; pabeni@redhat.com; Jose Ignacio
> Tornos Martinez <jtornosm@redhat.com>
> Subject: [PATCH net v4 2/4] i40e: skip unnecessary VF reset when
> setting trust
>
> The current implementation triggers a VF reset when changing the trust
> setting, causing a ~10 second delay during bonding setup.
>
> In all the cases, the reset causes a ~10 second delay during which:
> - VF must reinitialize completely
> - Any in-progress operations (like bonding enslave) fail with timeouts
> - VF is unavailable
>
> When granting trust, no reset is needed - we can just set the
> capability flag to allow privileged operations.
>
> When revoking trust, we need to:
> 1. Clear the capability flag to block privileged operations
> 2. Disable promiscuous mode if it was enabled (trusted VFs can enable it)
> 3. Only reset if ADQ is enabled (to clean up cloud filters)
>
> When we do reset (ADQ case), we reset first to clear VF_STATE_ACTIVE
> (which blocks new cloud filter additions), then delete existing cloud
> filters safely. This avoids the race condition where VF could add
> filters during deletion.
>
> When we don't reset, we manually handle capability flag and
> promiscuous mode via helper function, eliminating the delay.
>
> Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
> ---
> v4: Address AI review (sashiko.dev) from Simon Horman:
> - Manually set/clear capability flag when not resetting
> - Explicitly disable promiscuous mode when revoking trust
> - Fix cloud filter race: reset FIRST (clears VF_STATE_ACTIVE),
> delete filters AFTER (no race window)
> - Add helper function i40e_setup_vf_trust() for non-reset path
> v3: https://lore.kernel.org/all/20260414110006.124286-3-jtornosm@redhat.com/
>
> .../ethernet/intel/i40e/i40e_virtchnl_pf.c | 42 ++++++++++++++-----
> 1 file changed, 32 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
> index a26c3d47ec15..69f68fec6809 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
> @@ -4943,6 +4943,30 @@ int i40e_ndo_set_vf_spoofchk(struct net_device *netdev, int vf_id, bool enable)
> return ret;
> }
>
> +/**
> + * i40e_setup_vf_trust - Enable/disable VF trust mode without reset
> + * @vf: VF to configure
> + * @setting: trust setting
> + *
> + * Manually handle capability flag and promiscuous mode when changing trust
> + * without performing a VF reset.
> + * When reset is performed, this is not necessary as the reset procedure
> + * already handles this.
> + **/
> +static void i40e_setup_vf_trust(struct i40e_vf *vf, bool setting)
> +{
> + if (setting) {
> + set_bit(I40E_VIRTCHNL_VF_CAP_PRIVILEGE, &vf->vf_caps);
> + } else {
> + clear_bit(I40E_VIRTCHNL_VF_CAP_PRIVILEGE, &vf->vf_caps);
> +
> + if (test_bit(I40E_VF_STATE_UC_PROMISC, &vf->vf_states) ||
> + test_bit(I40E_VF_STATE_MC_PROMISC, &vf->vf_states))
> + i40e_config_vf_promiscuous_mode(vf, vf->lan_vsi_idx,
> + false, false);
> + }
> +}
> +
> /**
> * i40e_ndo_set_vf_trust
> * @netdev: network interface device structure of the pf
> @@ -4987,19 +5011,17 @@ int i40e_ndo_set_vf_trust(struct net_device *netdev, int vf_id, bool setting)
> set_bit(__I40E_MACVLAN_SYNC_PENDING, pf->state);
> pf->vsi[vf->lan_vsi_idx]->flags |= I40E_VSI_FLAG_FILTER_CHANGED;
>
> - i40e_vc_reset_vf(vf, true);
> + /* Reset only if revoking trust with ADQ (for cloud filter cleanup) */
> + if (vf->adq_enabled && !setting) {
> + i40e_vc_reset_vf(vf, true);
> + i40e_del_all_cloud_filters(vf);
> + } else {
> + i40e_setup_vf_trust(vf, setting);
> + }
> +
> dev_info(&pf->pdev->dev, "VF %u is now %strusted\n",
> vf_id, setting ? "" : "un");
>
> - if (vf->adq_enabled) {
> - if (!vf->trusted) {
> - dev_info(&pf->pdev->dev,
> - "VF %u no longer Trusted, deleting all cloud filters\n",
> - vf_id);
> - i40e_del_all_cloud_filters(vf);
> - }
> - }
> -
> out:
> clear_bit(__I40E_VIRTCHNL_OP_PENDING, pf->state);
> return ret;
> --
> 2.53.0
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: [PATCH net v4 3/4] iavf: send MAC change request synchronously
2026-04-23 13:04 ` [PATCH net v4 3/4] iavf: send MAC change request synchronously Jose Ignacio Tornos Martinez
@ 2026-04-23 13:14 ` Loktionov, Aleksandr
2026-04-27 9:23 ` Przemek Kitszel
2026-04-27 16:43 ` Simon Horman
2 siblings, 0 replies; 19+ messages in thread
From: Loktionov, Aleksandr @ 2026-04-23 13:14 UTC (permalink / raw)
To: Jose Ignacio Tornos Martinez, netdev@vger.kernel.org
Cc: intel-wired-lan@lists.osuosl.org, Kitszel, Przemyslaw,
Keller, Jacob E, horms@kernel.org, jesse.brandeburg@intel.com,
Nguyen, Anthony L, davem@davemloft.net, edumazet@google.com,
kuba@kernel.org, pabeni@redhat.com, stable@vger.kernel.org
> -----Original Message-----
> From: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
> Sent: Thursday, April 23, 2026 3:04 PM
> To: netdev@vger.kernel.org
> Cc: intel-wired-lan@lists.osuosl.org; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; Loktionov, Aleksandr
> <aleksandr.loktionov@intel.com>; Keller, Jacob E
> <jacob.e.keller@intel.com>; horms@kernel.org;
> jesse.brandeburg@intel.com; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; davem@davemloft.net;
> edumazet@google.com; kuba@kernel.org; pabeni@redhat.com; Jose Ignacio
> Tornos Martinez <jtornosm@redhat.com>; stable@vger.kernel.org
> Subject: [PATCH net v4 3/4] iavf: send MAC change request
> synchronously
>
> After commit ad7c7b2172c3 ("net: hold netdev instance lock during
> sysfs operations"), iavf_set_mac() is called with the netdev instance
> lock already held.
>
> The function queues a MAC address change request via
> iavf_replace_primary_mac() and then waits for completion. However, in
> the current flow, the actual virtchnl message is sent by the watchdog
> task, which also needs to acquire the netdev lock to run.
> Additionally, the adminq_task which processes virtchnl responses also
> needs the netdev lock.
>
> This creates a deadlock scenario:
> 1. iavf_set_mac() holds netdev lock and waits for MAC change
> 2. Watchdog needs netdev lock to send the request -> blocked
> 3. Even if request is sent, adminq_task needs netdev lock to process
>    PF response -> blocked
> 4. MAC change times out after 2.5 seconds
> 5. iavf_set_mac() returns -EAGAIN
>
> This particularly affects VFs during bonding setup when multiple VFs
> are enslaved in quick succession.
>
> Fix by implementing a synchronous MAC change operation similar to the
> approach used in commit fdadbf6e84c4 ("iavf: fix incorrect reset
> handling in callbacks").
>
> The solution:
> 1. Send the virtchnl ADD_ETH_ADDR message directly (not via watchdog)
> 2. Poll the admin queue hardware directly for responses
> 3. Process all received messages (including non-MAC messages)
> 4. Return when MAC change completes or times out
>
> A new generic function iavf_poll_virtchnl_response() is introduced
> that can be reused for any future synchronous virtchnl operations. It
> takes a callback to check completion, allowing flexible condition
> checking.
>
> This allows the operation to complete synchronously while holding
> netdev_lock, without relying on watchdog or adminq_task. The function
> can sleep for up to 2.5 seconds polling hardware, but this is
> acceptable since netdev_lock is per-device and only serializes
> operations on the same interface.
>
> To support this, change iavf_add_ether_addrs() to return an error code
> instead of void, allowing callers to detect failures. Additionally,
> export iavf_mac_add_reject() to enable proper rollback on local
> failures (timeouts, send errors) - PF rejections are already handled
> automatically by iavf_virtchnl_completion().
>
> Fixes: ad7c7b2172c3 ("net: hold netdev instance lock during sysfs operations")
> cc: stable@vger.kernel.org
> Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
> ---
> v4: Complete with Przemek Kitszel comments:
> - Remove vc_waitqueue entirely (not needed any more)
> - Add named parameters to callback function pointer declaration
> for clarity
> - Simplify callback signature: add v_op parameter so callback
> receives the opcode from the processed message to identify which
> response was received
> - Optimize polling loop to single condition check per iteration
> instead of checking both before and after message processing
> Address AI review (sashiko.dev) from Simon Horman:
> - Complete iavf_add_ether_addrs() error handling
> - Skip non-virtchnl hardware events (received_op=VIRTCHNL_OP_UNKNOWN),
> these can cause false completion detection
> - Complete rollback for local failures (not PF rejection) reusing
> iavf_mac_add_reject() to restore the old primary filter
> v3: https://lore.kernel.org/netdev/20260414110006.124286-4-jtornosm@redhat.com/
>
> drivers/net/ethernet/intel/iavf/iavf.h | 10 +-
> drivers/net/ethernet/intel/iavf/iavf_main.c | 70 +++++++++----
> .../net/ethernet/intel/iavf/iavf_virtchnl.c | 99 +++++++++++++++++--
> 3 files changed, 151 insertions(+), 28 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/iavf/iavf.h b/drivers/net/ethernet/intel/iavf/iavf.h
> index e9fb0a0919e3..78fa3df06e11 100644
> --- a/drivers/net/ethernet/intel/iavf/iavf.h
> +++ b/drivers/net/ethernet/intel/iavf/iavf.h
> @@ -260,7 +260,6 @@ struct iavf_adapter {
> struct work_struct adminq_task;
> struct work_struct finish_config;
> wait_queue_head_t down_waitqueue;
...
> --
> 2.53.0
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
^ permalink raw reply [flat|nested] 19+ messages in thread
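As an illustration only, the polling scheme described in the 3/4 commit message (poll directly, process every message, skip non-virtchnl events, let a callback decide completion from the opcode) can be sketched in userspace C. This is not the elided driver code: every model_* name, the scripted queue, the bounded poll count used in place of a time budget, and the -EAGAIN return on exhaustion are assumptions made for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_OP_UNKNOWN 0	/* stands in for a non-virtchnl hardware event */

/* Fetch one pending message; return false when the queue is empty. */
typedef bool (*model_recv_fn)(void *ctx, int *v_op);
/* Decide, from the opcode just processed, whether the wait is over. */
typedef bool (*model_done_fn)(void *ctx, int v_op);

int model_poll_response(model_recv_fn recv, model_done_fn done, void *ctx,
			unsigned int max_polls)
{
	unsigned int i;
	int v_op;

	assert(recv != NULL && done != NULL);

	for (i = 0; i < max_polls; i++) {
		if (!recv(ctx, &v_op))
			continue;	/* nothing yet; a driver would sleep here */
		if (v_op == MODEL_OP_UNKNOWN)
			continue;	/* cannot complete the wait, skip it */
		if (done(ctx, v_op))
			return 0;	/* single completion check per iteration */
	}
	return -EAGAIN;
}

/* Scripted message source used to exercise the loop. */
struct model_queue {
	const int *ops;
	size_t len;
	size_t pos;
	int wanted_op;
};

bool model_queue_recv(void *ctx, int *v_op)
{
	struct model_queue *q = ctx;

	if (q->pos >= q->len)
		return false;
	*v_op = q->ops[q->pos++];
	return true;
}

bool model_queue_done(void *ctx, int v_op)
{
	const struct model_queue *q = ctx;

	return v_op == q->wanted_op;
}
```

Passing the opcode to the completion callback keeps the loop generic: the caller decides which reply ends the wait, and unrelated replies are still consumed along the way.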
* RE: [PATCH net v4 4/4] ice: skip unnecessary VF reset when setting trust
2026-04-23 13:04 ` [PATCH net v4 4/4] ice: skip unnecessary VF reset when setting trust Jose Ignacio Tornos Martinez
@ 2026-04-23 13:17 ` Loktionov, Aleksandr
2026-04-24 10:32 ` Jose Ignacio Tornos Martinez
2026-04-24 16:05 ` Loktionov, Aleksandr
2026-04-27 16:50 ` Simon Horman
1 sibling, 2 replies; 19+ messages in thread
From: Loktionov, Aleksandr @ 2026-04-23 13:17 UTC (permalink / raw)
To: Jose Ignacio Tornos Martinez, netdev@vger.kernel.org
Cc: intel-wired-lan@lists.osuosl.org, Kitszel, Przemyslaw,
Keller, Jacob E, horms@kernel.org, jesse.brandeburg@intel.com,
Nguyen, Anthony L, davem@davemloft.net, edumazet@google.com,
kuba@kernel.org, pabeni@redhat.com
> -----Original Message-----
> From: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
> Sent: Thursday, April 23, 2026 3:04 PM
> To: netdev@vger.kernel.org
> Cc: intel-wired-lan@lists.osuosl.org; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; Loktionov, Aleksandr
> <aleksandr.loktionov@intel.com>; Keller, Jacob E
> <jacob.e.keller@intel.com>; horms@kernel.org;
> jesse.brandeburg@intel.com; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; davem@davemloft.net;
> edumazet@google.com; kuba@kernel.org; pabeni@redhat.com; Jose Ignacio
> Tornos Martinez <jtornosm@redhat.com>
> Subject: [PATCH net v4 4/4] ice: skip unnecessary VF reset when
> setting trust
>
> Similar to the i40e fix, ice_set_vf_trust() unconditionally calls
> ice_reset_vf() when the trust setting changes. While the delay is
> smaller than i40e this reset is still unnecessary in most cases.
>
> Additionally, the original code has a race condition: it deletes MAC
> LLDP filters BEFORE resetting the VF. During this deletion, the VF is
> still ACTIVE and can add new MAC LLDP filters concurrently,
> potentially corrupting the filter list.
>
> When granting trust, no reset is needed - we can just set the
> capability flag to allow privileged operations.
>
> When revoking trust, we need to:
> 1. Clear the capability flag to block privileged operations
> 2. Disable promiscuous mode if it was enabled (trusted VFs can enable it)
> 3. Only reset if MAC LLDP filters exist (to clean them up)
>
> When we do reset (MAC LLDP case), we fix the race condition by
> resetting first to clear VF state (which blocks new MAC LLDP filter
> additions), then delete existing filters safely. During cleanup,
> vf->trusted remains true so ice_vf_is_lldp_ena() works properly. Only
> after cleanup do we set vf->trusted = false.
>
> When we don't reset, we manually handle capability flag and
> promiscuous mode via helper function.
>
> The ice driver already has logic to clean up MAC LLDP filters when
> removing trust. After this cleanup, the VF reset is only necessary if
> there were actually filters to remove (num_mac_lldp was non-zero).
>
> This saves time and eliminates unnecessary service disruption when
> changing VF trust settings in most cases, while properly handling
> filter cleanup.
>
> Fixes: 2296345416b0 ("ice: receive LLDP on trusted VFs")
For me it looks like cc: stable@vger.kernel.org must be added
> Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
> ---
> v4:
> - Address AI review (sashiko.dev) from Simon Horman:
> vf->trusted ordering bug
> - Fix upstream race condition when comparing with i40e code
> - Apply capability flag and promiscuous mode fixes from i40e AI
> review
> - Add helper function ice_setup_vf_trust() for non-reset path
> - Export ice_vf_clear_all_promisc_modes() for code reuse
> v3: https://lore.kernel.org/all/20260414110006.124286-5-jtornosm@redhat.com/
>
> drivers/net/ethernet/intel/ice/ice_sriov.c | 41 +++++++++++++++++++--
> drivers/net/ethernet/intel/ice/ice_vf_lib.c | 2 +-
> drivers/net/ethernet/intel/ice/ice_vf_lib.h | 1 +
> 3 files changed, 39 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_sriov.c b/drivers/net/ethernet/intel/ice/ice_sriov.c
> index 7e00e091756d..d0da7f6adc23 100644
> --- a/drivers/net/ethernet/intel/ice/ice_sriov.c
> +++ b/drivers/net/ethernet/intel/ice/ice_sriov.c
> @@ -1364,6 +1364,34 @@ int ice_set_vf_mac(struct net_device *netdev, int vf_id, u8 *mac)
> return __ice_set_vf_mac(ice_netdev_to_pf(netdev), vf_id, mac);
> }
>
> +/**
> + * ice_setup_vf_trust - Enable/disable VF trust mode without reset
> + * @vf: VF to configure
> + * @setting: trust setting
> + *
> + * Manually handle capability flag and promiscuous mode when changing trust
> + * without performing a VF reset.
> + * When reset is performed, this is not necessary as the reset procedure
> + * already handles this.
> + **/
> +static void ice_setup_vf_trust(struct ice_vf *vf, bool setting)
> +{
> + struct ice_vsi *vsi;
> +
> + if (setting) {
> + set_bit(ICE_VIRTCHNL_VF_CAP_PRIVILEGE, &vf->vf_caps);
> + } else {
> + clear_bit(ICE_VIRTCHNL_VF_CAP_PRIVILEGE, &vf->vf_caps);
> +
> + if (test_bit(ICE_VF_STATE_UC_PROMISC, vf->vf_states) ||
> + test_bit(ICE_VF_STATE_MC_PROMISC, vf->vf_states)) {
> + vsi = ice_get_vf_vsi(vf);
> + if (vsi)
> + ice_vf_clear_all_promisc_modes(vf, vsi);
You declare ice_vf_clear_all_promisc_modes() as returning int, but ignore the return value.
Looks suspicious, doesn't it?
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> + }
> + }
> +}
> +
> /**
> * ice_set_vf_trust
> * @netdev: network interface device structure
> @@ -1399,11 +1427,16 @@ int ice_set_vf_trust(struct net_device *netdev, int vf_id, bool trusted)
>
> mutex_lock(&vf->cfg_lock);
>
> - while (!trusted && vf->num_mac_lldp)
> - ice_vf_update_mac_lldp_num(vf, ice_get_vf_vsi(vf), false);
> -
> + /* Reset only if revoking trust with MAC LLDP filters */
> + if (!trusted && vf->num_mac_lldp) {
> + ice_reset_vf(vf, ICE_VF_RESET_NOTIFY);
> + while (vf->num_mac_lldp)
> + ice_vf_update_mac_lldp_num(vf, ice_get_vf_vsi(vf), false);
> + } else {
> + ice_setup_vf_trust(vf, trusted);
> + }
> vf->trusted = trusted;
> - ice_reset_vf(vf, ICE_VF_RESET_NOTIFY);
> +
> dev_info(ice_pf_to_dev(pf), "VF %u is now %strusted\n",
> vf_id, trusted ? "" : "un");
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_vf_lib.c
> b/drivers/net/ethernet/intel/ice/ice_vf_lib.c
> index c8bc952f05cd..81bbf30e5c29 100644
> --- a/drivers/net/ethernet/intel/ice/ice_vf_lib.c
> +++ b/drivers/net/ethernet/intel/ice/ice_vf_lib.c
> @@ -623,7 +623,7 @@ ice_vf_get_promisc_masks(struct ice_vf *vf, struct ice_vsi *vsi,
> *
> * Clear all promiscuous/allmulticast filters for a VF
> */
> -static int
> +int
> ice_vf_clear_all_promisc_modes(struct ice_vf *vf, struct ice_vsi *vsi)
> {
> struct ice_pf *pf = vf->pf;
> diff --git a/drivers/net/ethernet/intel/ice/ice_vf_lib.h b/drivers/net/ethernet/intel/ice/ice_vf_lib.h
> index 7a9c75d1d07c..a3501bd92311 100644
> --- a/drivers/net/ethernet/intel/ice/ice_vf_lib.h
> +++ b/drivers/net/ethernet/intel/ice/ice_vf_lib.h
> @@ -310,6 +310,7 @@ bool ice_is_any_vf_in_unicast_promisc(struct ice_pf *pf);
> void
> ice_vf_get_promisc_masks(struct ice_vf *vf, struct ice_vsi *vsi,
> u8 *ucast_m, u8 *mcast_m);
> +int ice_vf_clear_all_promisc_modes(struct ice_vf *vf, struct ice_vsi *vsi);
> int
> ice_vf_set_vsi_promisc(struct ice_vf *vf, struct ice_vsi *vsi, u8 promisc_m);
> int
> --
> 2.53.0
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: [PATCH net v4 4/4] ice: skip unnecessary VF reset when setting trust
2026-04-23 13:17 ` Loktionov, Aleksandr
@ 2026-04-24 10:32 ` Jose Ignacio Tornos Martinez
2026-04-24 10:37 ` Loktionov, Aleksandr
2026-04-24 16:05 ` Loktionov, Aleksandr
1 sibling, 1 reply; 19+ messages in thread
From: Jose Ignacio Tornos Martinez @ 2026-04-24 10:32 UTC (permalink / raw)
To: aleksandr.loktionov
Cc: anthony.l.nguyen, davem, edumazet, horms, intel-wired-lan,
jacob.e.keller, jesse.brandeburg, jtornosm, kuba, netdev, pabeni,
przemyslaw.kitszel
Hello Aleksandr,
> For me it looks like cc: stable@vger.kernel.org must be added
I am not sure about that, because the bugs fixed here (vf->trusted
ordering and race condition) only trigger when MAC LLDP filters exist,
which is an uncommon scenario.
And most users will benefit from the performance improvement that is an
optimization rather than the bug fixes.
I mean, I included the commit fixed as a reference but due to optimization
as the main reason, I didn't dare to request this for older versions.
> You declare ice_vf_clear_all_promisc_modes() returning int, but ignore
> the return value.
> Looks suspicious, doesn't it?
Well, it is used like that when the function is called locally (the
function is not modified, just made public), and really my intention was
to clean up as much as possible (so error checking is not necessary).
In my opinion it would be enough to warn about the possible problems
(already done in the existing function).
Anyway, if, despite the reasons I have tried to explain, you still think
the same way, please let me know so I can adjust them (if you don't mind,
I would wait for more reviews to include them in a next version).
Thanks
Best regards
Jose Ignacio
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: [PATCH net v4 4/4] ice: skip unnecessary VF reset when setting trust
2026-04-24 10:32 ` Jose Ignacio Tornos Martinez
@ 2026-04-24 10:37 ` Loktionov, Aleksandr
2026-04-24 12:40 ` Jose Ignacio Tornos Martinez
0 siblings, 1 reply; 19+ messages in thread
From: Loktionov, Aleksandr @ 2026-04-24 10:37 UTC (permalink / raw)
To: Jose Ignacio Tornos Martinez
Cc: Nguyen, Anthony L, davem@davemloft.net, edumazet@google.com,
horms@kernel.org, intel-wired-lan@lists.osuosl.org,
Keller, Jacob E, jesse.brandeburg@intel.com, kuba@kernel.org,
netdev@vger.kernel.org, pabeni@redhat.com, Kitszel, Przemyslaw
> -----Original Message-----
> From: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
> Sent: Friday, April 24, 2026 12:33 PM
> To: Loktionov, Aleksandr <aleksandr.loktionov@intel.com>
> Cc: Nguyen, Anthony L <anthony.l.nguyen@intel.com>;
> davem@davemloft.net; edumazet@google.com; horms@kernel.org; intel-
> wired-lan@lists.osuosl.org; Keller, Jacob E
> <jacob.e.keller@intel.com>; jesse.brandeburg@intel.com;
> jtornosm@redhat.com; kuba@kernel.org; netdev@vger.kernel.org;
> pabeni@redhat.com; Kitszel, Przemyslaw <przemyslaw.kitszel@intel.com>
> Subject: RE: [PATCH net v4 4/4] ice: skip unnecessary VF reset when
> setting trust
>
> Hello Aleksandr,
>
> > For me it looks like cc: stable@vger.kernel.org must be added
> I am not sure about that, because the bugs fixed here (vf->trusted
> ordering and race condition) only trigger when MAC LLDP filters exist,
> which is an uncommon scenario.
> And most users will benefit from the performance improvement that is
> an optimization rather than the bug fixes.
> I mean, I included the commit fixed as a reference but due to
> optimization as the main reason, I didn't dare to request this for
> older versions.
Ok I see your point. As it's optimization, you are right.
>
> > You declare ice_vf_clear_all_promisc_modes() returning int, but ignore
> > the return value.
> > Looks suspicious, doesn't it?
> Well, it is used like that when the function is called locally (the
> function is not modified, just made public), and really my intention
> was to clean up as much as possible (so error checking is not necessary).
> In my opinion it would be enough to warn about the possible problems
> (already done in the existing function).
Can you go the extra mile and add error code handling?
Or at least document in the code why you don't do it?
>
> Anyway, if, despite the reasons I have tried to explain, you still
> think the same way, please let me know so I can adjust them (if you
> don't mind, I would wait for more reviews to include them in a next
> version).
>
> Thanks
>
> Best regards
> Jose Ignacio
Thank you
Alex
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: [PATCH net v4 4/4] ice: skip unnecessary VF reset when setting trust
2026-04-24 10:37 ` Loktionov, Aleksandr
@ 2026-04-24 12:40 ` Jose Ignacio Tornos Martinez
0 siblings, 0 replies; 19+ messages in thread
From: Jose Ignacio Tornos Martinez @ 2026-04-24 12:40 UTC (permalink / raw)
To: aleksandr.loktionov
Cc: anthony.l.nguyen, davem, edumazet, horms, intel-wired-lan,
jacob.e.keller, jesse.brandeburg, jtornosm, kuba, netdev, pabeni,
przemyslaw.kitszel
Hello Aleksandr,
>>> You declare ice_vf_clear_all_promisc_modes() returning int, but
>>> ignore the return value.
>>> Looks suspicious, doesn't it?
>> Well, it is used like that when the function is called locally (the
>> function is not modified, just made public), and really my intention
>> was to clean up as much as possible (so error checking is not necessary).
>> In my opinion it would be enough to warn about the possible problems
>> (already done in the existing function).
> Can you go the extra mile and add error code handling?
> Or at least document in the code why you don't do it?
Ok, I can add the error handling in ice_setup_vf_trust and an extra warning
to indicate that promiscuous mode clear failed when revoking trust.
Just let me wait a bit longer for more possible reviews to create the next
version of the series.
Thanks
Best regards
Jose Ignacio
^ permalink raw reply [flat|nested] 19+ messages in thread
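As an illustration only, the error handling agreed on above (propagate the result of the promiscuous-mode clear and warn when it fails during trust revocation) could take roughly this shape. It is a userspace sketch, not the next version of the patch: every sketch_* name is hypothetical, the failure is injected through a flag, and fprintf() stands in for a kernel warning.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the VF state involved. */
struct sketch_vf {
	bool cap_privilege;
	bool promisc;
	bool clear_fails;	/* injected failure for the simulated clear */
};

/* Stand-in for a promiscuous-mode clear that can fail. */
int sketch_clear_promisc(struct sketch_vf *vf)
{
	if (vf->clear_fails)
		return -EIO;
	vf->promisc = false;
	return 0;
}

/* Update trust in place; report a failed promiscuous-mode clear. */
int sketch_setup_trust(struct sketch_vf *vf, bool setting)
{
	int err;

	assert(vf != NULL);

	vf->cap_privilege = setting;
	if (setting || !vf->promisc)
		return 0;

	err = sketch_clear_promisc(vf);
	if (err)
		fprintf(stderr,
			"failed to clear promiscuous mode while revoking trust: %d\n",
			err);
	return err;
}
```

Whether the caller should abort the trust change or only log on a non-zero return is a design choice the thread leaves open; the sketch only shows the value being surfaced instead of dropped.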
* RE: [PATCH net v4 4/4] ice: skip unnecessary VF reset when setting trust
2026-04-23 13:17 ` Loktionov, Aleksandr
2026-04-24 10:32 ` Jose Ignacio Tornos Martinez
@ 2026-04-24 16:05 ` Loktionov, Aleksandr
2026-04-27 7:59 ` Jose Ignacio Tornos Martinez
1 sibling, 1 reply; 19+ messages in thread
From: Loktionov, Aleksandr @ 2026-04-24 16:05 UTC (permalink / raw)
To: Loktionov, Aleksandr, Jose Ignacio Tornos Martinez,
netdev@vger.kernel.org
Cc: intel-wired-lan@lists.osuosl.org, Kitszel, Przemyslaw,
Keller, Jacob E, horms@kernel.org, jesse.brandeburg@intel.com,
Nguyen, Anthony L, davem@davemloft.net, edumazet@google.com,
kuba@kernel.org, pabeni@redhat.com
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf
> Of Loktionov, Aleksandr
> Sent: Thursday, April 23, 2026 3:17 PM
> To: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>;
> netdev@vger.kernel.org
> Cc: intel-wired-lan@lists.osuosl.org; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; Keller, Jacob E
> <jacob.e.keller@intel.com>; horms@kernel.org;
> jesse.brandeburg@intel.com; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; davem@davemloft.net;
> edumazet@google.com; kuba@kernel.org; pabeni@redhat.com
> Subject: Re: [Intel-wired-lan] [PATCH net v4 4/4] ice: skip
> unnecessary VF reset when setting trust
>
>
>
> > -----Original Message-----
> > From: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
> > Sent: Thursday, April 23, 2026 3:04 PM
> > To: netdev@vger.kernel.org
> > Cc: intel-wired-lan@lists.osuosl.org; Kitszel, Przemyslaw
> > <przemyslaw.kitszel@intel.com>; Loktionov, Aleksandr
> > <aleksandr.loktionov@intel.com>; Keller, Jacob E
> > <jacob.e.keller@intel.com>; horms@kernel.org;
> > jesse.brandeburg@intel.com; Nguyen, Anthony L
> > <anthony.l.nguyen@intel.com>; davem@davemloft.net;
> > edumazet@google.com; kuba@kernel.org; pabeni@redhat.com; Jose Ignacio
> > Tornos Martinez <jtornosm@redhat.com>
> > Subject: [PATCH net v4 4/4] ice: skip unnecessary VF reset when
> > setting trust
> >
> > Similar to the i40e fix, ice_set_vf_trust() unconditionally calls
> > ice_reset_vf() when the trust setting changes. While the delay is
> > smaller than i40e this reset is still unnecessary in most cases.
> >
> > Additionally, the original code has a race condition: it deletes MAC
> > LLDP filters BEFORE resetting the VF. During this deletion, the VF is
> > still ACTIVE and can add new MAC LLDP filters concurrently,
> > potentially corrupting the filter list.
> >
> > When granting trust, no reset is needed - we can just set the
> > capability flag to allow privileged operations.
> >
> > When revoking trust, we need to:
> > 1. Clear the capability flag to block privileged operations
> > 2. Disable promiscuous mode if it was enabled (trusted VFs can enable it)
> > 3. Only reset if MAC LLDP filters exist (to clean them up)
> >
> > When we do reset (MAC LLDP case), we fix the race condition by
> > resetting first to clear VF state (which blocks new MAC LLDP filter
> > additions), then delete existing filters safely. During cleanup,
> > vf->trusted remains true so ice_vf_is_lldp_ena() works properly. Only
> > after cleanup do we set vf->trusted = false.
> >
> > When we don't reset, we manually handle capability flag and
> > promiscuous mode via helper function.
> >
> > The ice driver already has logic to clean up MAC LLDP filters when
> > removing trust. After this cleanup, the VF reset is only necessary if
> > there were actually filters to remove (num_mac_lldp was non-zero).
> >
> > This saves time and eliminates unnecessary service disruption when
> > changing VF trust settings in most cases, while properly handling
> > filter cleanup.
> >
> > Fixes: 2296345416b0 ("ice: receive LLDP on trusted VFs")
> For me it looks like cc: stable@vger.kernel.org must be added
>
> > Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
> > ---
> > v4:
> > - Address AI review (sashiko.dev) from Simon Horman:
> > vf->trusted ordering bug
> > - Fix upstream race condition when comparing with i40e code
> > - Apply capability flag and promiscuous mode fixes from i40e AI
> > review
> > - Add helper function ice_setup_vf_trust() for non-reset path
> > - Export ice_vf_clear_all_promisc_modes() for code reuse
> > v3: https://lore.kernel.org/all/20260414110006.124286-5-jtornosm@redhat.com/
> >
> > drivers/net/ethernet/intel/ice/ice_sriov.c | 41 +++++++++++++++++++--
> > drivers/net/ethernet/intel/ice/ice_vf_lib.c | 2 +-
> > drivers/net/ethernet/intel/ice/ice_vf_lib.h | 1 +
> > 3 files changed, 39 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/intel/ice/ice_sriov.c
> > b/drivers/net/ethernet/intel/ice/ice_sriov.c
> > index 7e00e091756d..d0da7f6adc23 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_sriov.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_sriov.c
> > @@ -1364,6 +1364,34 @@ int ice_set_vf_mac(struct net_device *netdev, int vf_id, u8 *mac)
> > return __ice_set_vf_mac(ice_netdev_to_pf(netdev), vf_id, mac);
> > }
> >
> > +/**
> > + * ice_setup_vf_trust - Enable/disable VF trust mode without reset
> > + * @vf: VF to configure
> > + * @setting: trust setting
> > + *
> > + * Manually handle capability flag and promiscuous mode when changing trust
> > + * without performing a VF reset.
> > + * When reset is performed, this is not necessary as the reset procedure
> > + * already handles this.
> > + **/
One more nit, kdoc should end with '*/' not '**/'
With the best regards
Alex
> > +static void ice_setup_vf_trust(struct ice_vf *vf, bool setting)
> > +{
> > + struct ice_vsi *vsi;
> > +
> > + if (setting) {
> > + set_bit(ICE_VIRTCHNL_VF_CAP_PRIVILEGE, &vf->vf_caps);
> > + } else {
> > + clear_bit(ICE_VIRTCHNL_VF_CAP_PRIVILEGE, &vf->vf_caps);
> > +
> > + if (test_bit(ICE_VF_STATE_UC_PROMISC, vf->vf_states) ||
> > + test_bit(ICE_VF_STATE_MC_PROMISC, vf->vf_states)) {
> > + vsi = ice_get_vf_vsi(vf);
> > + if (vsi)
> > + ice_vf_clear_all_promisc_modes(vf, vsi);
> ice_vf_clear_all_promisc_modes() is declared as returning int, but its
> return value is ignored here.
> That looks suspicious, doesn't it?
>
> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
>
> > + }
> > + }
> > +}
> > +
> > /**
> > * ice_set_vf_trust
> > * @netdev: network interface device structure
> > @@ -1399,11 +1427,16 @@ int ice_set_vf_trust(struct net_device *netdev, int vf_id, bool trusted)
> >
> > mutex_lock(&vf->cfg_lock);
> >
> > - while (!trusted && vf->num_mac_lldp)
> > - ice_vf_update_mac_lldp_num(vf, ice_get_vf_vsi(vf), false);
> > -
> > + /* Reset only if revoking trust with MAC LLDP filters */
> > + if (!trusted && vf->num_mac_lldp) {
> > + ice_reset_vf(vf, ICE_VF_RESET_NOTIFY);
> > + while (vf->num_mac_lldp)
> > + ice_vf_update_mac_lldp_num(vf, ice_get_vf_vsi(vf), false);
> > + } else {
> > + ice_setup_vf_trust(vf, trusted);
> > + }
> > vf->trusted = trusted;
> > - ice_reset_vf(vf, ICE_VF_RESET_NOTIFY);
> > +
> > dev_info(ice_pf_to_dev(pf), "VF %u is now %strusted\n",
> > vf_id, trusted ? "" : "un");
> >
> > diff --git a/drivers/net/ethernet/intel/ice/ice_vf_lib.c
> > b/drivers/net/ethernet/intel/ice/ice_vf_lib.c
> > index c8bc952f05cd..81bbf30e5c29 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_vf_lib.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_vf_lib.c
> > @@ -623,7 +623,7 @@ ice_vf_get_promisc_masks(struct ice_vf *vf, struct ice_vsi *vsi,
> > *
> > * Clear all promiscuous/allmulticast filters for a VF
> > */
> > -static int
> > +int
> > ice_vf_clear_all_promisc_modes(struct ice_vf *vf, struct ice_vsi *vsi)
> > {
> > struct ice_pf *pf = vf->pf;
> > diff --git a/drivers/net/ethernet/intel/ice/ice_vf_lib.h
> > b/drivers/net/ethernet/intel/ice/ice_vf_lib.h
> > index 7a9c75d1d07c..a3501bd92311 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_vf_lib.h
> > +++ b/drivers/net/ethernet/intel/ice/ice_vf_lib.h
> > @@ -310,6 +310,7 @@ bool ice_is_any_vf_in_unicast_promisc(struct ice_pf *pf);
> > void ice_vf_get_promisc_masks(struct ice_vf *vf, struct ice_vsi *vsi,
> > u8 *ucast_m, u8 *mcast_m);
> > +int ice_vf_clear_all_promisc_modes(struct ice_vf *vf, struct ice_vsi *vsi);
> > int
> > ice_vf_set_vsi_promisc(struct ice_vf *vf, struct ice_vsi *vsi, u8 promisc_m);
> > int
> > --
> > 2.53.0
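To illustrate the remark about the ignored return value, here is a minimal
user-space sketch, assuming the non-reset helper is changed to return int and
to hand a cleanup failure back to its caller. The names toy_vf,
toy_clear_all_promisc() and toy_setup_vf_trust() are hypothetical stand-ins,
not the ice driver's API. Whether ice_set_vf_trust() should fail or only log
in this case is left to the patch author.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the VF state touched by the trust helper. */
struct toy_vf {
	bool privileged;	/* models the CAP_PRIVILEGE bit */
	bool uc_promisc;	/* models the UC_PROMISC state bit */
	bool mc_promisc;	/* models the MC_PROMISC state bit */
	bool hw_fails;		/* test knob: make the "hardware" call fail */
};

/* Models a promiscuous-mode cleanup that can fail. */
static int toy_clear_all_promisc(struct toy_vf *vf)
{
	assert(vf);

	if (vf->hw_fails)
		return -EIO;

	vf->uc_promisc = false;
	vf->mc_promisc = false;
	return 0;
}

/* Non-reset trust change that reports cleanup failures to its caller. */
static int toy_setup_vf_trust(struct toy_vf *vf, bool setting)
{
	assert(vf);

	if (setting) {
		vf->privileged = true;
		return 0;
	}

	vf->privileged = false;
	if (vf->uc_promisc || vf->mc_promisc)
		return toy_clear_all_promisc(vf);

	return 0;
}
```

With this shape the caller can at least log the failure instead of silently
reporting that trust was revoked while promiscuous mode is still enabled.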
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: [PATCH net v4 4/4] ice: skip unnecessary VF reset when setting trust
2026-04-24 16:05 ` Loktionov, Aleksandr
@ 2026-04-27 7:59 ` Jose Ignacio Tornos Martinez
0 siblings, 0 replies; 19+ messages in thread
From: Jose Ignacio Tornos Martinez @ 2026-04-27 7:59 UTC (permalink / raw)
To: aleksandr.loktionov
Cc: anthony.l.nguyen, davem, edumazet, horms, intel-wired-lan,
jacob.e.keller, jesse.brandeburg, jtornosm, kuba, netdev, pabeni,
przemyslaw.kitszel
Hello Aleksandr,
> One more nit, kdoc should end with '*/' not '**/'
OK. The current code already mixes function headers ending in '**/' with
others ending in '*/'. In the next version, every function I have added,
or whose prototype I have changed in this series, will end in '*/' as you
suggest.
Thanks
Best regards
Jose Ignacio
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH net v4 3/4] iavf: send MAC change request synchronously
2026-04-23 13:04 ` [PATCH net v4 3/4] iavf: send MAC change request synchronously Jose Ignacio Tornos Martinez
2026-04-23 13:14 ` Loktionov, Aleksandr
@ 2026-04-27 9:23 ` Przemek Kitszel
2026-04-27 11:34 ` Jose Ignacio Tornos Martinez
2026-04-27 16:43 ` Simon Horman
2 siblings, 1 reply; 19+ messages in thread
From: Przemek Kitszel @ 2026-04-27 9:23 UTC (permalink / raw)
To: Jose Ignacio Tornos Martinez
Cc: intel-wired-lan, aleksandr.loktionov, jacob.e.keller, horms,
anthony.l.nguyen, davem, edumazet, kuba, pabeni, netdev
On 4/23/26 15:04, Jose Ignacio Tornos Martinez wrote:
> After commit ad7c7b2172c3 ("net: hold netdev instance lock during sysfs
> operations"), iavf_set_mac() is called with the netdev instance lock
> already held.
>
> The function queues a MAC address change request via
> iavf_replace_primary_mac() and then waits for completion. However, in
> the current flow, the actual virtchnl message is sent by the watchdog
> task, which also needs to acquire the netdev lock to run. Additionally,
> the adminq_task which processes virtchnl responses also needs the netdev
> lock.
>
> This creates a deadlock scenario:
> 1. iavf_set_mac() holds netdev lock and waits for MAC change
> 2. Watchdog needs netdev lock to send the request -> blocked
> 3. Even if request is sent, adminq_task needs netdev lock to process
> PF response -> blocked
> 4. MAC change times out after 2.5 seconds
> 5. iavf_set_mac() returns -EAGAIN
>
> This particularly affects VFs during bonding setup when multiple VFs are
> enslaved in quick succession.
>
> Fix by implementing a synchronous MAC change operation similar to the
> approach used in commit fdadbf6e84c4 ("iavf: fix incorrect reset handling
> in callbacks").
>
> The solution:
> 1. Send the virtchnl ADD_ETH_ADDR message directly (not via watchdog)
> 2. Poll the admin queue hardware directly for responses
> 3. Process all received messages (including non-MAC messages)
> 4. Return when MAC change completes or times out
>
> A new generic function iavf_poll_virtchnl_response() is introduced that
> can be reused for any future synchronous virtchnl operations. It takes a
> callback to check completion, allowing flexible condition checking.
>
> This allows the operation to complete synchronously while holding
> netdev_lock, without relying on watchdog or adminq_task. The function
> can sleep for up to 2.5 seconds polling hardware, but this is acceptable
> since netdev_lock is per-device and only serializes operations on the
> same interface.
>
> To support this, change iavf_add_ether_addrs() to return an error code
> instead of void, allowing callers to detect failures. Additionally,
> export iavf_mac_add_reject() to enable proper rollback on local failures
> (timeouts, send errors) - PF rejections are already handled automatically
> by iavf_virtchnl_completion().
>
> Fixes: ad7c7b2172c3 ("net: hold netdev instance lock during sysfs operations")
> cc: stable@vger.kernel.org
> Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
> ---
> v4: Complete with Przemek Kitszel comments:
> - Remove vc_waitqueue entirely (not needed any more)
nit: I would add a short note about this to the commit message too
thanks a lot for the rest of changes
I have a few last nits, please find below
> - Add named parameters to callback function pointer declaration for
> clarity
> - Simplify callback signature: add v_op parameter so callback
> receives the opcode from the processed message to identify which
> response was received
> - Optimize polling loop to single condition check per iteration
> instead of checking both before and after message processing
> Address AI review (sashiko.dev) from Simon Horman:
> - Complete iavf_add_ether_addrs() error handling
> - Skip non-virtchnl hardware events (received_op=VIRTCHNL_OP_UNKNOWN),
> these can cause false completion detection
> - Complete rollback for local failures (not PF rejection) reusing
> iavf_mac_add_reject() to restore the old primary filter
> v3: https://lore.kernel.org/netdev/20260414110006.124286-4-jtornosm@redhat.com/
>
> drivers/net/ethernet/intel/iavf/iavf.h | 10 +-
> drivers/net/ethernet/intel/iavf/iavf_main.c | 70 +++++++++----
> .../net/ethernet/intel/iavf/iavf_virtchnl.c | 99 +++++++++++++++++--
> 3 files changed, 151 insertions(+), 28 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/iavf/iavf.h b/drivers/net/ethernet/intel/iavf/iavf.h
> index e9fb0a0919e3..78fa3df06e11 100644
> --- a/drivers/net/ethernet/intel/iavf/iavf.h
> +++ b/drivers/net/ethernet/intel/iavf/iavf.h
> @@ -260,7 +260,6 @@ struct iavf_adapter {
> struct work_struct adminq_task;
> struct work_struct finish_config;
> wait_queue_head_t down_waitqueue;
> - wait_queue_head_t vc_waitqueue;
> struct iavf_q_vector *q_vectors;
> struct list_head vlan_filter_list;
> int num_vlan_filters;
> @@ -589,8 +588,9 @@ void iavf_configure_queues(struct iavf_adapter *adapter);
> void iavf_enable_queues(struct iavf_adapter *adapter);
> void iavf_disable_queues(struct iavf_adapter *adapter);
> void iavf_map_queues(struct iavf_adapter *adapter);
> -void iavf_add_ether_addrs(struct iavf_adapter *adapter);
> +int iavf_add_ether_addrs(struct iavf_adapter *adapter);
> void iavf_del_ether_addrs(struct iavf_adapter *adapter);
> +void iavf_mac_add_reject(struct iavf_adapter *adapter);
> void iavf_add_vlans(struct iavf_adapter *adapter);
> void iavf_del_vlans(struct iavf_adapter *adapter);
> void iavf_set_promiscuous(struct iavf_adapter *adapter);
> @@ -607,6 +607,12 @@ void iavf_disable_vlan_stripping(struct iavf_adapter *adapter);
> void iavf_virtchnl_completion(struct iavf_adapter *adapter,
> enum virtchnl_ops v_opcode,
> enum iavf_status v_retval, u8 *msg, u16 msglen);
> +int iavf_poll_virtchnl_response(struct iavf_adapter *adapter,
> + bool (*condition)(struct iavf_adapter *adapter,
> + const void *data,
> + enum virtchnl_ops v_op),
> + const void *cond_data,
> + unsigned int timeout_ms);
> int iavf_config_rss(struct iavf_adapter *adapter);
> void iavf_cfg_queues_bw(struct iavf_adapter *adapter);
> void iavf_cfg_queues_quanta_size(struct iavf_adapter *adapter);
> diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
> index 67aa14350b1b..bc5994bf2cd9 100644
> --- a/drivers/net/ethernet/intel/iavf/iavf_main.c
> +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
> @@ -1047,6 +1047,48 @@ static bool iavf_is_mac_set_handled(struct net_device *netdev,
> return ret;
> }
>
> +/**
> + * iavf_mac_change_done - Check if MAC change completed
> + * @adapter: board private structure
> + * @data: MAC address being checked (as const void *)
> + * @v_op: virtchnl opcode from processed message
> + *
> + * Callback for iavf_poll_virtchnl_response() to check if MAC change completed.
> + *
> + * Returns true if MAC change completed, false otherwise
I'm not a fan of kdoc, but would rather write kdoc-compliant comments:
Return: ...
> + */
> +static bool iavf_mac_change_done(struct iavf_adapter *adapter,
> + const void *data, enum virtchnl_ops v_op)
> +{
> + const u8 *addr = data;
> +
> + return iavf_is_mac_set_handled(adapter->netdev, addr);
> +}
> +
> +/**
> + * iavf_set_mac_sync - Synchronously change MAC address
> + * @adapter: board private structure
> + * @addr: MAC address to set
> + *
> + * Sends MAC change request to PF and polls admin queue for response.
> + * Caller must hold netdev_lock. This can sleep for up to 2.5 seconds.
> + *
> + * Returns 0 on success, negative on failure
ditto kdoc "Return:"
> + */
> +static int iavf_set_mac_sync(struct iavf_adapter *adapter, const u8 *addr)
> +{
> + int ret;
> +
> + netdev_assert_locked(adapter->netdev);
> +
> + ret = iavf_add_ether_addrs(adapter);
> + if (ret)
> + return ret;
> +
> + return iavf_poll_virtchnl_response(adapter, iavf_mac_change_done,
> + addr, 2500);
> +}
> +
> /**
> * iavf_set_mac - NDO callback to set port MAC address
> * @netdev: network interface device structure
> @@ -1067,25 +1109,20 @@ static int iavf_set_mac(struct net_device *netdev, void *p)
> return -EADDRNOTAVAIL;
>
> ret = iavf_replace_primary_mac(adapter, addr->sa_data);
> -
> if (ret)
> return ret;
>
> - ret = wait_event_interruptible_timeout(adapter->vc_waitqueue,
> - iavf_is_mac_set_handled(netdev, addr->sa_data),
> - msecs_to_jiffies(2500));
> -
> - /* If ret < 0 then it means wait was interrupted.
> - * If ret == 0 then it means we got a timeout.
> - * else it means we got response for set MAC from PF,
> - * check if netdev MAC was updated to requested MAC,
> - * if yes then set MAC succeeded otherwise it failed return -EACCES
> - */
> - if (ret < 0)
> + ret = iavf_set_mac_sync(adapter, addr->sa_data);
> + if (ret) {
> + /* Rollback for local failures (timeout, send error, -EBUSY).
> + * Note: If PF rejects the request (sends error response),
> + * iavf_virtchnl_completion() automatically calls
> + * iavf_mac_add_reject(), ret=0, and this is not executed.
> + * Only local failures (no PF response received) need manual rollback.
> + */
> + iavf_mac_add_reject(adapter);
> return ret;
> -
> - if (!ret)
> - return -EAGAIN;
> + }
>
> if (!ether_addr_equal(netdev->dev_addr, addr->sa_data))
> return -EACCES;
> @@ -5415,9 +5452,6 @@ static int iavf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> /* Setup the wait queue for indicating transition to down status */
> init_waitqueue_head(&adapter->down_waitqueue);
>
> - /* Setup the wait queue for indicating virtchannel events */
> - init_waitqueue_head(&adapter->vc_waitqueue);
> -
> INIT_LIST_HEAD(&adapter->ptp.aq_cmds);
> init_waitqueue_head(&adapter->ptp.phc_time_waitqueue);
> mutex_init(&adapter->ptp.aq_cmd_lock);
> diff --git a/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c b/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c
> index a52c100dcbc5..d1afb8261c24 100644
> --- a/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c
> +++ b/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c
> @@ -2,6 +2,7 @@
> /* Copyright(c) 2013 - 2018 Intel Corporation. */
>
> #include <linux/net/intel/libie/rx.h>
> +#include <net/netdev_lock.h>
>
> #include "iavf.h"
> #include "iavf_ptp.h"
> @@ -555,20 +556,23 @@ iavf_set_mac_addr_type(struct virtchnl_ether_addr *virtchnl_ether_addr,
> * @adapter: adapter structure
> *
> * Request that the PF add one or more addresses to our filters.
> + *
> + * Return: 0 on success, negative on failure
> **/
thank you for also changing the kdoc when touching the function :)
> -void iavf_add_ether_addrs(struct iavf_adapter *adapter)
> +int iavf_add_ether_addrs(struct iavf_adapter *adapter)
> {
> struct virtchnl_ether_addr_list *veal;
> struct iavf_mac_filter *f;
> int i = 0, count = 0;
> bool more = false;
> size_t len;
> + int ret;
>
> if (adapter->current_op != VIRTCHNL_OP_UNKNOWN) {
> /* bail because we already have a command pending */
> dev_err(&adapter->pdev->dev, "Cannot add filters, command %d pending\n",
> adapter->current_op);
> - return;
> + return -EBUSY;
> }
>
> spin_lock_bh(&adapter->mac_vlan_list_lock);
> @@ -580,7 +584,7 @@ void iavf_add_ether_addrs(struct iavf_adapter *adapter)
> if (!count) {
> adapter->aq_required &= ~IAVF_FLAG_AQ_ADD_MAC_FILTER;
> spin_unlock_bh(&adapter->mac_vlan_list_lock);
> - return;
> + return 0;
> }
> adapter->current_op = VIRTCHNL_OP_ADD_ETH_ADDR;
>
> @@ -594,8 +598,9 @@ void iavf_add_ether_addrs(struct iavf_adapter *adapter)
>
> veal = kzalloc(len, GFP_ATOMIC);
> if (!veal) {
> + adapter->current_op = VIRTCHNL_OP_UNKNOWN;
> spin_unlock_bh(&adapter->mac_vlan_list_lock);
> - return;
> + return -ENOMEM;
> }
>
> veal->vsi_id = adapter->vsi_res->vsi_id;
> @@ -615,8 +620,15 @@ void iavf_add_ether_addrs(struct iavf_adapter *adapter)
>
> spin_unlock_bh(&adapter->mac_vlan_list_lock);
>
> - iavf_send_pf_msg(adapter, VIRTCHNL_OP_ADD_ETH_ADDR, (u8 *)veal, len);
> + ret = iavf_send_pf_msg(adapter, VIRTCHNL_OP_ADD_ETH_ADDR, (u8 *)veal, len);
> kfree(veal);
> + if (ret) {
> + dev_err(&adapter->pdev->dev,
> + "Unable to send ADD_ETH_ADDR message to PF, error %d\n", ret);
> + adapter->current_op = VIRTCHNL_OP_UNKNOWN;
> + }
> +
> + return ret;
> }
>
> /**
> @@ -713,7 +725,7 @@ static void iavf_mac_add_ok(struct iavf_adapter *adapter)
> *
> * Remove filters from list based on PF response.
> **/
> -static void iavf_mac_add_reject(struct iavf_adapter *adapter)
> +void iavf_mac_add_reject(struct iavf_adapter *adapter)
> {
> struct net_device *netdev = adapter->netdev;
> struct iavf_mac_filter *f, *ftmp;
> @@ -2389,7 +2401,6 @@ void iavf_virtchnl_completion(struct iavf_adapter *adapter,
> iavf_mac_add_reject(adapter);
> /* restore administratively set MAC address */
> ether_addr_copy(adapter->hw.mac.addr, netdev->dev_addr);
> - wake_up(&adapter->vc_waitqueue);
> break;
> case VIRTCHNL_OP_DEL_VLAN:
> dev_err(&adapter->pdev->dev, "Failed to delete VLAN filter, error %s\n",
> @@ -2586,7 +2597,6 @@ void iavf_virtchnl_completion(struct iavf_adapter *adapter,
> eth_hw_addr_set(netdev, adapter->hw.mac.addr);
> netif_addr_unlock_bh(netdev);
> }
> - wake_up(&adapter->vc_waitqueue);
> break;
> case VIRTCHNL_OP_GET_STATS: {
> struct iavf_eth_stats *stats =
> @@ -2956,3 +2966,76 @@ void iavf_virtchnl_completion(struct iavf_adapter *adapter,
> } /* switch v_opcode */
> adapter->current_op = VIRTCHNL_OP_UNKNOWN;
> }
> +
> +/**
> + * iavf_poll_virtchnl_response - Poll admin queue for virtchnl response
> + * @adapter: adapter structure
> + * @condition: callback to check if desired response received
> + * @cond_data: context data passed to condition callback
> + * @timeout_ms: maximum time to wait in milliseconds
> + *
> + * Polls the admin queue and processes all incoming virtchnl messages.
> + * After processing each valid message, calls the condition callback to check
> + * if the expected response has been received. The callback receives the opcode
> + * of the processed message to identify which response was received. Continues
> + * polling until the callback returns true or timeout expires.
> + * Clear current_op on timeout to prevent permanent -EBUSY state.
> + * Caller must hold netdev_lock. This can sleep for up to timeout_ms while
> + * polling hardware.
> + *
> + * Return: 0 on success (condition met), -EAGAIN on timeout, or error code
> + **/
single star for closing comments: **/ → */
> +int iavf_poll_virtchnl_response(struct iavf_adapter *adapter,
> + bool (*condition)(struct iavf_adapter *adapter,
> + const void *data,
> + enum virtchnl_ops v_op),
> + const void *cond_data,
> + unsigned int timeout_ms)
> +{
> + struct iavf_hw *hw = &adapter->hw;
> + struct iavf_arq_event_info event;
> + enum virtchnl_ops received_op;
> + unsigned long timeout;
> + u32 v_retval;
> + u16 pending;
> + int ret = -EAGAIN;
RCT (reverse Christmas tree) violation - we sort declaration lines from longest to shortest
> +
> + netdev_assert_locked(adapter->netdev);
> +
> + event.buf_len = IAVF_MAX_AQ_BUF_SIZE;
> + event.msg_buf = kzalloc(event.buf_len, GFP_KERNEL);
> + if (!event.msg_buf)
> + return -ENOMEM;
> +
> + timeout = jiffies + msecs_to_jiffies(timeout_ms);
> + do {
> + if (iavf_clean_arq_element(hw, &event, &pending) == IAVF_SUCCESS) {
> + received_op = (enum virtchnl_ops)le32_to_cpu(event.desc.cookie_high);
> + if (received_op != VIRTCHNL_OP_UNKNOWN) {
> + v_retval = le32_to_cpu(event.desc.cookie_low);
> +
> + iavf_virtchnl_completion(adapter, received_op,
> + (enum iavf_status)v_retval,
> + event.msg_buf, event.msg_len);
> +
> + if (condition(adapter, cond_data, received_op)) {
> + ret = 0;
> + break;
> + }
> + }
> +
> + memset(event.msg_buf, 0, IAVF_MAX_AQ_BUF_SIZE);
> +
> + if (pending)
> + continue;
> + }
> +
> + usleep_range(50, 75);
This brings us back to the "sleep, then check the time" situation.
To resolve that, you could initialize @pending to 0 and sleep at the very
beginning of each loop iteration if (!pending).
After that I will have no more complaints about this patch,
thank you again for the work!
> + } while (time_before(jiffies, timeout));
> +
> + if (ret == -EAGAIN && adapter->current_op != VIRTCHNL_OP_UNKNOWN)
> + adapter->current_op = VIRTCHNL_OP_UNKNOWN;
> +
> + kfree(event.msg_buf);
> + return ret;
> +}
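The loop ordering suggested above (initialize pending to 0, sleep at the top
of an iteration only when nothing is pending) can be sketched in user space as
follows. This is only an illustration under stated assumptions: toy_queue,
toy_sleep(), toy_clean_element() and toy_poll() are hypothetical names, and a
fake tick counter replaces jiffies and usleep_range() so that the order of
"sleep, check queue, check time" is easy to follow.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the admin queue, driven by a fake clock. */
struct toy_queue {
	unsigned long now;	/* fake clock, advanced by toy_sleep() */
	unsigned long ready_at;	/* tick at which the response is available */
	unsigned int sleeps;	/* number of sleeps performed */
};

static void toy_sleep(struct toy_queue *q)
{
	q->now++;
	q->sleeps++;
}

/* Returns true when a message was dequeued; nothing else is ever pending. */
static bool toy_clean_element(struct toy_queue *q, unsigned int *pending)
{
	*pending = 0;
	return q->now >= q->ready_at;
}

/*
 * Sleep first (only when the previous pass left nothing pending), then check
 * the queue, then test the deadline. The queue is therefore always checked
 * once after the last sleep, before the loop can give up.
 */
static int toy_poll(struct toy_queue *q, unsigned long timeout_ticks)
{
	unsigned long deadline = q->now + timeout_ticks;
	unsigned int pending = 0;

	assert(timeout_ticks > 0);

	do {
		if (!pending)
			toy_sleep(q);

		if (toy_clean_element(q, &pending))
			return 0;
	} while (q->now < deadline);

	return -EAGAIN;
}
```

A response that becomes available exactly at the deadline is still picked up
here, whereas a "check, sleep, test time" ordering would leave the loop right
after the final sleep without looking at the queue again.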
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH net v4 3/4] iavf: send MAC change request synchronously
2026-04-27 9:23 ` Przemek Kitszel
@ 2026-04-27 11:34 ` Jose Ignacio Tornos Martinez
0 siblings, 0 replies; 19+ messages in thread
From: Jose Ignacio Tornos Martinez @ 2026-04-27 11:34 UTC (permalink / raw)
To: przemyslaw.kitszel
Cc: aleksandr.loktionov, anthony.l.nguyen, davem, edumazet, horms,
intel-wired-lan, jacob.e.keller, jtornosm, kuba, netdev, pabeni
Hello Przemek,
Thank you again for your comments.
I will try to include the new ones too in the next version.
Indeed, you had already commented on the kdoc "Return:" format; I will try to
address all the occurrences now.
Thanks
Best regards
Jose Ignacio
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH net v4 2/4] i40e: skip unnecessary VF reset when setting trust
2026-04-23 13:04 ` [PATCH net v4 2/4] i40e: skip unnecessary VF reset when setting trust Jose Ignacio Tornos Martinez
2026-04-23 13:14 ` Loktionov, Aleksandr
@ 2026-04-27 16:25 ` Simon Horman
1 sibling, 0 replies; 19+ messages in thread
From: Simon Horman @ 2026-04-27 16:25 UTC (permalink / raw)
To: jtornosm
Cc: 'Simon Horman', netdev, intel-wired-lan,
przemyslaw.kitszel, aleksandr.loktionov, jacob.e.keller,
jesse.brandeburg, anthony.l.nguyen, davem, edumazet, kuba, pabeni
From: 'Simon Horman' <horms@kernel.org>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
i40e: skip unnecessary VF reset when setting trust
This commit attempts to reduce the bonding setup delay by skipping the
synchronous VF reset when changing the trust setting, manually handling
the capability flag and promiscuous mode instead.
> When we do reset (ADQ case), we reset first to clear VF_STATE_ACTIVE
> (which blocks new cloud filter additions), then delete existing cloud
> filters safely. This avoids the race condition where VF could add filters
> during deletion.
Is this description accurate?
i40e_vc_reset_vf() is synchronous and restores I40E_VF_STATE_ACTIVE before
returning. When i40e_del_all_cloud_filters() traverses
vf->cloud_filter_list locklessly afterwards, the VF is already active again.
Could a concurrent VIRTCHNL_OP_ADD_CLOUD_FILTER message from the guest be
processed asynchronously and corrupt the list?
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
> index a26c3d47ec1561..26b62650aac9aa 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
[ ... ]
> @@ -4943,6 +4943,30 @@ int i40e_ndo_set_vf_spoofchk(struct net_device *netdev, int vf_id, bool enable)
> return ret;
> }
>
> +/**
> + * i40e_setup_vf_trust - Enable/disable VF trust mode without reset
> + * @vf: VF to configure
> + * @setting: trust setting
> + *
> + * Manually handle capability flag and promiscuous mode when changing trust
> + * without performing a VF reset.
> + * When reset is performed, this is not necessary as the reset procedure
> + * already handles this.
> + **/
> +static void i40e_setup_vf_trust(struct i40e_vf *vf, bool setting)
> +{
> + if (setting) {
> + set_bit(I40E_VIRTCHNL_VF_CAP_PRIVILEGE, &vf->vf_caps);
> + } else {
> + clear_bit(I40E_VIRTCHNL_VF_CAP_PRIVILEGE, &vf->vf_caps);
> +
> + if (test_bit(I40E_VF_STATE_UC_PROMISC, &vf->vf_states) ||
> + test_bit(I40E_VF_STATE_MC_PROMISC, &vf->vf_states))
> + i40e_config_vf_promiscuous_mode(vf, vf->lan_vsi_idx,
> + false, false);
Does this pass the internal software array index (vf->lan_vsi_idx) instead of
the firmware VSI ID (vf->lan_vsi_id)?
i40e_config_vf_promiscuous_mode() uses i40e_find_vsi_from_id() which matches
against the firmware ID. Could this cause the lookup to fail and silently
leave promiscuous mode enabled on the hardware?
Also, should this clear the I40E_VF_STATE_UC_PROMISC and
I40E_VF_STATE_MC_PROMISC flags from vf->vf_states?
If an untrusted VF performs a valid operation like adding a new VLAN, could
i40e_vc_add_vlan_msg() check the uncleared state bit and automatically issue
an AdminQ command to re-enable promiscuous mode, bypassing the revocation?
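The index-versus-ID question can be illustrated with a small user-space
sketch. It assumes, as the review describes, a lookup that matches on the
firmware-assigned ID rather than on the array position. All names here
(toy_pf, toy_vsi, toy_find_vsi_from_id(), toy_clear_promisc()) are
hypothetical and do not reproduce the i40e data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_VSI 4

/* Hypothetical VSI record: 'id' is the firmware-assigned identifier. */
struct toy_vsi {
	uint16_t id;
	int promisc;
};

/* The position of an entry in this array plays the role of the "idx". */
struct toy_pf {
	struct toy_vsi vsi[TOY_MAX_VSI];
};

/* Lookup keyed on the firmware ID, never on the array index. */
static struct toy_vsi *toy_find_vsi_from_id(struct toy_pf *pf, uint16_t id)
{
	assert(pf);

	for (size_t i = 0; i < TOY_MAX_VSI; i++)
		if (pf->vsi[i].id == id)
			return &pf->vsi[i];

	return NULL;
}

/* Returns 0 when promiscuous mode was cleared, -1 when no VSI matched. */
static int toy_clear_promisc(struct toy_pf *pf, uint16_t vsi_id)
{
	struct toy_vsi *vsi = toy_find_vsi_from_id(pf, vsi_id);

	if (!vsi)
		return -1;

	vsi->promisc = 0;
	return 0;
}
```

Passing the array index where the ID is expected makes the lookup fail (or,
if the number happens to match another entry's ID, hit the wrong VSI), so the
intended VSI keeps promiscuous mode enabled unless the caller checks the
result.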
> + }
> +}
> +
> /**
> * i40e_ndo_set_vf_trust
> * @netdev: network interface device structure of the pf
> @@ -4987,19 +5011,17 @@ int i40e_ndo_set_vf_trust(struct net_device *netdev, int vf_id, bool setting)
> set_bit(__I40E_MACVLAN_SYNC_PENDING, pf->state);
> pf->vsi[vf->lan_vsi_idx]->flags |= I40E_VSI_FLAG_FILTER_CHANGED;
>
> - i40e_vc_reset_vf(vf, true);
> + /* Reset only if revoking trust with ADQ (for cloud filter cleanup) */
> + if (vf->adq_enabled && !setting) {
> + i40e_vc_reset_vf(vf, true);
> + i40e_del_all_cloud_filters(vf);
Could this lead to a memory leak of the cloud filters?
The synchronous reset in i40e_vc_reset_vf() destroys the VF's VSI and
allocates a new one with a new firmware SEID. When
i40e_del_all_cloud_filters() executes afterwards, it looks up the VSI using
the old SEID from the filters, which might fail and cause it to skip the
free operation entirely.
> + } else {
> + i40e_setup_vf_trust(vf, setting);
> + }
By skipping the reset when revoking trust for non-ADQ VFs, do we leave excess
hardware filters intact? Trusted VFs can configure MAC and VLAN filters
beyond standard limits. Without a reset to flush them, does the untrusted
VF retain its configured spoofed MAC addresses and excessive VLANs
indefinitely?
Furthermore, when granting trust, skipping the reset prevents sending the
VIRTCHNL_EVENT_RESET_IMPENDING event to the guest. Since virtchnl VF drivers
negotiate their capabilities and resource requests during initialization,
does this mean the guest driver remains unaware of its newly granted
privileges until the interface is manually restarted?
> +
> dev_info(&pf->pdev->dev, "VF %u is now %strusted\n",
> vf_id, setting ? "" : "un");
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH net v4 3/4] iavf: send MAC change request synchronously
2026-04-23 13:04 ` [PATCH net v4 3/4] iavf: send MAC change request synchronously Jose Ignacio Tornos Martinez
2026-04-23 13:14 ` Loktionov, Aleksandr
2026-04-27 9:23 ` Przemek Kitszel
@ 2026-04-27 16:43 ` Simon Horman
2 siblings, 0 replies; 19+ messages in thread
From: Simon Horman @ 2026-04-27 16:43 UTC (permalink / raw)
To: jtornosm
Cc: 'Simon Horman', netdev, intel-wired-lan,
przemyslaw.kitszel, aleksandr.loktionov, jacob.e.keller,
jesse.brandeburg, anthony.l.nguyen, davem, edumazet, kuba, pabeni
From: 'Simon Horman' <horms@kernel.org>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
iavf: send MAC change request synchronously
This commit updates the MAC address setting in the iavf driver to be
synchronous. It sends the virtchnl message directly and polls the admin
queue for the response, rather than relying on the watchdog task, to avoid
deadlocks when holding the netdev instance lock.
> diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
> index 67aa14350b1b01..bc5994bf2cd99d 100644
> --- a/drivers/net/ethernet/intel/iavf/iavf_main.c
> +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
[ ... ]
> @@ -1067,25 +1109,20 @@ static int iavf_set_mac(struct net_device *netdev, void *p)
> return -EADDRNOTAVAIL;
>
> ret = iavf_replace_primary_mac(adapter, addr->sa_data);
> -
> if (ret)
> return ret;
>
> - ret = wait_event_interruptible_timeout(adapter->vc_waitqueue,
> - iavf_is_mac_set_handled(netdev, addr->sa_data),
> - msecs_to_jiffies(2500));
> -
> - /* If ret < 0 then it means wait was interrupted.
> - * If ret == 0 then it means we got a timeout.
> - * else it means we got response for set MAC from PF,
> - * check if netdev MAC was updated to requested MAC,
> - * if yes then set MAC succeeded otherwise it failed return -EACCES
> - */
> - if (ret < 0)
> + ret = iavf_set_mac_sync(adapter, addr->sa_data);
> + if (ret) {
> + /* Rollback for local failures (timeout, send error, -EBUSY).
> + * Note: If PF rejects the request (sends error response),
> + * iavf_virtchnl_completion() automatically calls
> + * iavf_mac_add_reject(), ret=0, and this is not executed.
> + * Only local failures (no PF response received) need manual rollback.
> + */
> + iavf_mac_add_reject(adapter);
> return ret;
> -
> - if (!ret)
> - return -EAGAIN;
> + }
Does the manual rollback in iavf_set_mac() need to restore adapter->hw.mac.addr?
Before calling iavf_set_mac_sync(), iavf_replace_primary_mac() optimistically
updates adapter->hw.mac.addr to the new MAC address and unsets the is_primary
flag on the old MAC filter. If iavf_set_mac_sync() fails locally,
iavf_mac_add_reject() deletes the pending new MAC filter but doesn't restore
adapter->hw.mac.addr back to netdev->dev_addr.
By contrast, the PF-rejection path in iavf_virtchnl_completion() explicitly
restores it:
iavf_mac_add_reject(adapter);
/* restore administratively set MAC address */
ether_addr_copy(adapter->hw.mac.addr, netdev->dev_addr);
If the local failure was a timeout and a delayed success response arrives from
the PF later, could iavf_virtchnl_completion() blindly copy the corrupted
adapter->hw.mac.addr into netdev->dev_addr? This seems like it would silently
change the interface MAC to an address already deleted from the internal list.
Additionally, on subsequent MAC change attempts, wouldn't
iavf_replace_primary_mac() search for the old filter using the corrupted
adapter->hw.mac.addr, fail to find it, and permanently leak the old filter in
the PF's hardware tables?
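A minimal user-space sketch of the rollback concern follows, assuming the
local-failure path is extended to restore the cached hardware address in the
same way the PF-rejection path quoted above does. toy_adapter,
toy_replace_primary_mac() and toy_rollback_local_failure() are hypothetical
names; the filter-list handling is deliberately left out.

```c
#include <assert.h>
#include <string.h>

#define TOY_ETH_ALEN 6

/* Hypothetical stand-in for the two MAC copies discussed in the review. */
struct toy_adapter {
	unsigned char hw_mac[TOY_ETH_ALEN];	/* models adapter->hw.mac.addr */
	unsigned char dev_addr[TOY_ETH_ALEN];	/* models netdev->dev_addr */
};

/* Models the optimistic update done before the request is sent. */
static void toy_replace_primary_mac(struct toy_adapter *a,
				    const unsigned char *new_mac)
{
	assert(a && new_mac);
	memcpy(a->hw_mac, new_mac, TOY_ETH_ALEN);
}

/* Local-failure rollback that also restores the cached hardware address. */
static void toy_rollback_local_failure(struct toy_adapter *a)
{
	assert(a);
	/* Filter-list cleanup would happen here in a real driver. */
	memcpy(a->hw_mac, a->dev_addr, TOY_ETH_ALEN);
}
```

After the rollback both copies agree again, so a later MAC change starts its
search for the old filter from the address the interface actually uses.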
[ ... ]
> diff --git a/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c b/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c
> index a52c100dcbc56d..d1afb8261c2412 100644
> --- a/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c
> +++ b/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c
[ ... ]
> @@ -2956,3 +2966,76 @@ void iavf_virtchnl_completion(struct iavf_adapter *adapter,
> } /* switch v_opcode */
> adapter->current_op = VIRTCHNL_OP_UNKNOWN;
> }
> +
> +/**
> + * iavf_poll_virtchnl_response - Poll admin queue for virtchnl response
> + * @adapter: adapter structure
> + * @condition: callback to check if desired response received
> + * @cond_data: context data passed to condition callback
> + * @timeout_ms: maximum time to wait in milliseconds
> + *
> + * Polls the admin queue and processes all incoming virtchnl messages.
> + * After processing each valid message, calls the condition callback to check
> + * if the expected response has been received. The callback receives the opcode
> + * of the processed message to identify which response was received. Continues
> + * polling until the callback returns true or timeout expires.
> + * Clear current_op on timeout to prevent permanent -EBUSY state.
> + * Caller must hold netdev_lock. This can sleep for up to timeout_ms while
> + * polling hardware.
> + *
> + * Return: 0 on success (condition met), -EAGAIN on timeout, or error code
> + **/
> +int iavf_poll_virtchnl_response(struct iavf_adapter *adapter,
> +				bool (*condition)(struct iavf_adapter *adapter,
> +						  const void *data,
> +						  enum virtchnl_ops v_op),
> +				const void *cond_data,
> +				unsigned int timeout_ms)
> +{
[ ... ]
> +		if (iavf_clean_arq_element(hw, &event, &pending) == IAVF_SUCCESS) {
> +			received_op = (enum virtchnl_ops)le32_to_cpu(event.desc.cookie_high);
> +			if (received_op != VIRTCHNL_OP_UNKNOWN) {
> +				v_retval = le32_to_cpu(event.desc.cookie_low);
> +
> +				iavf_virtchnl_completion(adapter, received_op,
> +							 (enum iavf_status)v_retval,
> +							 event.msg_buf, event.msg_len);
> +
> +				if (condition(adapter, cond_data, received_op)) {
> +					ret = 0;
> +					break;
> +				}
> +			}
> +
> +			memset(event.msg_buf, 0, IAVF_MAX_AQ_BUF_SIZE);
> +
> +			if (pending)
> +				continue;
> +		}
> +
> +		usleep_range(50, 75);
> +	} while (time_before(jiffies, timeout));
> +
> +	if (ret == -EAGAIN && adapter->current_op != VIRTCHNL_OP_UNKNOWN)
> +		adapter->current_op = VIRTCHNL_OP_UNKNOWN;
Could clearing adapter->current_op on timeout cause overlapping virtchnl
commands and state corruption?
If the timeout expires and adapter->current_op is forced to
VIRTCHNL_OP_UNKNOWN, the VF watchdog might consider the admin queue free
and send a new command while the PF is still actively processing the
timed-out request.
If the PF eventually completes the timed-out request and sends a late response,
iavf_virtchnl_completion() will process it. At the end of
iavf_virtchnl_completion(), the driver unconditionally clears the state:
	adapter->current_op = VIRTCHNL_OP_UNKNOWN;
Would this prematurely clear the tracking state for the newly in-flight
overlapping command, compounding the state machine corruption and allowing
even more commands to be sent concurrently?
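The sequence can be illustrated with a small user-space model, not driver
code. The opcodes, struct chan and the chan_*() helpers are hypothetical. The
guarded completion is only one possible mitigation, shown for contrast with
the unconditional clear. It is not something the patch or the driver
implements:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical opcodes with arbitrary values, for illustration only. */
enum op { OP_UNKNOWN = 0, OP_ADD_ETH_ADDR = 1, OP_GET_STATS = 2 };

/* Models the single in-flight slot tracked by adapter->current_op. */
struct chan {
	enum op current_op;
};

/* Send a command; refuse if another one is still tracked as in flight. */
static bool chan_send(struct chan *c, enum op op)
{
	if (c->current_op != OP_UNKNOWN)
		return false;
	c->current_op = op;
	return true;
}

/* Timeout path: forget the in-flight command, as the patch does. */
static void chan_timeout(struct chan *c)
{
	c->current_op = OP_UNKNOWN;
}

/* Models the unconditional clear at the end of the completion handler. */
static void chan_complete_unconditional(struct chan *c, enum op received)
{
	(void)received;
	c->current_op = OP_UNKNOWN;
}

/* Illustrative alternative: clear only when the response matches. */
static void chan_complete_guarded(struct chan *c, enum op received)
{
	if (c->current_op == received)
		c->current_op = OP_UNKNOWN;
}
```

In this model a late OP_ADD_ETH_ADDR response wipes the tracking of a newer
OP_GET_STATS under the unconditional variant, but not under the guarded one.
The guard would still be fooled if the same opcode were re-sent after a
timeout, so it is a sketch of the problem rather than a complete fix.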
> +
> +	kfree(event.msg_buf);
> +	return ret;
> +}
* Re: [PATCH net v4 4/4] ice: skip unnecessary VF reset when setting trust
2026-04-23 13:04 ` [PATCH net v4 4/4] ice: skip unnecessary VF reset when setting trust Jose Ignacio Tornos Martinez
2026-04-23 13:17 ` Loktionov, Aleksandr
@ 2026-04-27 16:50 ` Simon Horman
1 sibling, 0 replies; 19+ messages in thread
From: Simon Horman @ 2026-04-27 16:50 UTC (permalink / raw)
To: jtornosm
Cc: 'Simon Horman', netdev, intel-wired-lan,
przemyslaw.kitszel, aleksandr.loktionov, jacob.e.keller,
jesse.brandeburg, anthony.l.nguyen, davem, edumazet, kuba, pabeni
From: 'Simon Horman' <horms@kernel.org>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
ice: skip unnecessary VF reset when setting trust
This patch attempts to optimize VF trust setting changes by skipping the VF
reset when granting trust, and only resetting when revoking trust if MAC LLDP
filters are present. It also fixes a race condition by doing the reset before
clearing the LLDP filters.
> diff --git a/drivers/net/ethernet/intel/ice/ice_sriov.c b/drivers/net/ethernet/intel/ice/ice_sriov.c
> index 7e00e091756ddc..8686c382404f3f 100644
> --- a/drivers/net/ethernet/intel/ice/ice_sriov.c
> +++ b/drivers/net/ethernet/intel/ice/ice_sriov.c
[ ... ]
> @@ -1399,11 +1427,16 @@ int ice_set_vf_trust(struct net_device *netdev, int vf_id, bool trusted)
>
>  	mutex_lock(&vf->cfg_lock);
>
> -	while (!trusted && vf->num_mac_lldp)
> -		ice_vf_update_mac_lldp_num(vf, ice_get_vf_vsi(vf), false);
> -
> +	/* Reset only if revoking trust with MAC LLDP filters */
> +	if (!trusted && vf->num_mac_lldp) {
> +		ice_reset_vf(vf, ICE_VF_RESET_NOTIFY);
Does ice_reset_vf() clear vf->num_mac_lldp to 0 when it calls
ice_vf_pre_vsi_rebuild() internally?
If so, does this make the following while loop dead code, bypassing the
ice_vf_update_mac_lldp_num() calls that cleanly decrement the count and
trigger ice_vsi_cfg_sw_lldp() to clean up the LLDP software switch rules?
> +		while (vf->num_mac_lldp)
> +			ice_vf_update_mac_lldp_num(vf, ice_get_vf_vsi(vf), false);
> +	} else {
> +		ice_setup_vf_trust(vf, trusted);
> +	}
If we revoke trust (!trusted) and vf->num_mac_lldp > 0, we enter the if block
and skip the else block containing ice_setup_vf_trust().
When ice_reset_vf() rebuilds the VSI, vf->trusted is still true. Will the
rebuild process use this true value to set the ICE_VIRTCHNL_VF_CAP_PRIVILEGE
bit in vf->vf_caps?
Since ice_setup_vf_trust() is skipped, does this fail to explicitly clear the
privilege capability bit? Could a malicious VM add a MAC LLDP filter before
trust is revoked to ensure num_mac_lldp > 0, thereby retaining its privileges?
Furthermore, when revoking trust and vf->num_mac_lldp == 0, we take the else
branch and skip ice_reset_vf().
Trusted VFs are permitted to exceed the ICE_MAX_MACADDR_PER_VF and
ICE_MAX_VLAN_PER_VF limits. Previously, the unconditional ice_reset_vf()
would tear down the VSI and purge these extra non-default filters.
Without the reset, is there any logic in ice_setup_vf_trust() to prune the
extra MAC and VLAN filters that were added while the VF was trusted? Will
skipping the reset allow an untrusted VF to retain excessive hardware filters
indefinitely and potentially exhaust PF resources?
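The ordering question can be shown with a small user-space model, not driver
code. It assumes, as the questions above suppose, that a reset derives the
privilege bit from vf->trusted and zeroes num_mac_lldp, and that
ice_setup_vf_trust() sets the privilege bit from its argument. struct
vf_model and all helpers are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the VF fields discussed above. */
struct vf_model {
	bool trusted;          /* models vf->trusted */
	bool privileged;       /* models the privilege bit in vf->vf_caps */
	unsigned num_mac_lldp; /* models vf->num_mac_lldp */
	unsigned resets;       /* counts modelled resets */
};

/* Assumption: a reset rebuilds caps from the trust flag and drops filters. */
static void vf_reset(struct vf_model *vf)
{
	vf->privileged = vf->trusted;
	vf->num_mac_lldp = 0;
	vf->resets++;
}

/* Ordering as in the patch: reset first, record the new trust value last. */
static void set_trust_as_patched(struct vf_model *vf, bool trusted)
{
	if (!trusted && vf->num_mac_lldp)
		vf_reset(vf);
	else
		vf->privileged = trusted; /* models ice_setup_vf_trust() */
	vf->trusted = trusted;
}

/* Illustrative alternative: record the new trust value before resetting. */
static void set_trust_update_first(struct vf_model *vf, bool trusted)
{
	assert(vf);
	vf->trusted = trusted;
	if (!trusted && vf->num_mac_lldp)
		vf_reset(vf);
	else
		vf->privileged = trusted;
}
```

Under these assumptions, revoking trust with LLDP filters present leaves the
privilege bit set in the patched ordering and clears it in the alternative.
Neither variant models pruning of extra MAC or VLAN filters.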
>  	vf->trusted = trusted;
> -	ice_reset_vf(vf, ICE_VF_RESET_NOTIFY);
> +
>  	dev_info(ice_pf_to_dev(pf), "VF %u is now %strusted\n",
>  		 vf_id, trusted ? "" : "un");
>