* [PATCH 0/3] net/iavf: vf reset fixes
@ 2026-06-08 14:55 Ciara Loftus
2026-06-08 14:55 ` [PATCH 1/3] net/iavf: downgrade opcode 0 ARQ log to debug Ciara Loftus
` (3 more replies)
0 siblings, 4 replies; 6+ messages in thread
From: Ciara Loftus @ 2026-06-08 14:55 UTC (permalink / raw)
To: dev; +Cc: Ciara Loftus
The patch [1] aimed to address a race condition in the iavf driver
during a reset and also reduced noisy logging during resets.
Patch 1 of this series extracts the noisy logging fix into its own
commit.
Patch 2 offers an alternative approach to fixing the race condition.
Patch 3 fixes a pre-existing refcount imbalance in the shared event
handler thread that became visible while investigating the reset path.
[1] https://patches.dpdk.org/project/dpdk/patch/20260605123646.1328492-1-chaitanyababux.talluri@intel.com/
Ciara Loftus (2):
net/iavf: wait for PF reset start before reinitializing
net/iavf: fix event handler refcount leak on HW reset
Talluri Chaitanyababu (1):
net/iavf: downgrade opcode 0 ARQ log to debug
drivers/net/intel/iavf/iavf.h | 1 +
drivers/net/intel/iavf/iavf_ethdev.c | 14 +++++++++++++-
drivers/net/intel/iavf/iavf_vchnl.c | 11 +++++++++--
3 files changed, 23 insertions(+), 3 deletions(-)
--
2.43.0
* [PATCH 1/3] net/iavf: downgrade opcode 0 ARQ log to debug
2026-06-08 14:55 [PATCH 0/3] net/iavf: vf reset fixes Ciara Loftus
@ 2026-06-08 14:55 ` Ciara Loftus
2026-06-09 14:28 ` Bruce Richardson
2026-06-08 14:55 ` [PATCH 2/3] net/iavf: wait for PF reset start before reinitializing Ciara Loftus
` (2 subsequent siblings)
3 siblings, 1 reply; 6+ messages in thread
From: Ciara Loftus @ 2026-06-08 14:55 UTC (permalink / raw)
To: dev; +Cc: Talluri Chaitanyababu
From: Talluri Chaitanyababu <chaitanyababux.talluri@intel.com>
After admin queue reinitialisation, completions from uninitialised
ARQ ring descriptor memory may arrive before any real PF response.
These carry opcode 0 (`VIRTCHNL_OP_UNKNOWN`) and trigger a WARNING
log on every poll iteration, flooding the log during reset recovery.
Treat opcode 0 as a distinct case and log it at DEBUG level, while
retaining WARNING for genuine opcode mismatches.
Signed-off-by: Talluri Chaitanyababu <chaitanyababux.talluri@intel.com>
---
drivers/net/intel/iavf/iavf_vchnl.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/net/intel/iavf/iavf_vchnl.c b/drivers/net/intel/iavf/iavf_vchnl.c
index 94ccfb5d6e..cd90d35023 100644
--- a/drivers/net/intel/iavf/iavf_vchnl.c
+++ b/drivers/net/intel/iavf/iavf_vchnl.c
@@ -299,8 +299,15 @@ iavf_read_msg_from_pf(struct iavf_adapter *adapter, uint16_t buf_len,
 		/* async reply msg on command issued by vf previously */
 		result = IAVF_MSG_CMD;
 		if (opcode != vf->pend_cmd) {
-			PMD_DRV_LOG(WARNING, "command mismatch, expect %u, get %u",
-					vf->pend_cmd, opcode);
+			if (opcode == VIRTCHNL_OP_UNKNOWN)
+				PMD_DRV_LOG(DEBUG,
+					"Spurious msg with opcode 0, pending cmd %u",
+					vf->pend_cmd);
+			else
+				PMD_DRV_LOG(WARNING,
+					"command mismatch, expect %u, get %u",
+					vf->pend_cmd, opcode);
+
 			result = IAVF_MSG_ERR;
 		}
 	}
--
2.43.0
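
The log-level split in this patch can be illustrated outside the driver. What follows is a minimal standalone sketch, not the driver code: `pick_log_level()`, the `log_level` enum and `OP_UNKNOWN` are hypothetical stand-ins introduced here. Only the rule itself is taken from the diff above: a mismatching opcode 0 is logged at DEBUG, any other mismatch at WARNING, and both are still treated as an error.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's log levels. */
enum log_level { LOG_NONE, LOG_DEBUG, LOG_WARNING };

/* Opcode 0 is VIRTCHNL_OP_UNKNOWN in the patch; redefined here for illustration. */
#define OP_UNKNOWN 0u

/*
 * Decide how a reply should be logged when its opcode may not match the
 * pending command. *is_err is set for any mismatch, mirroring the patch,
 * which still returns IAVF_MSG_ERR in both mismatch cases.
 */
static enum log_level
pick_log_level(uint32_t opcode, uint32_t pend_cmd, bool *is_err)
{
	*is_err = (opcode != pend_cmd);
	if (!*is_err)
		return LOG_NONE;
	return (opcode == OP_UNKNOWN) ? LOG_DEBUG : LOG_WARNING;
}
```

Under these assumptions, a reply with opcode 0 while command 5 is pending selects `LOG_DEBUG`, while a reply with opcode 7 selects `LOG_WARNING`; both set `*is_err`.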
* [PATCH 2/3] net/iavf: wait for PF reset start before reinitializing
2026-06-08 14:55 [PATCH 0/3] net/iavf: vf reset fixes Ciara Loftus
2026-06-08 14:55 ` [PATCH 1/3] net/iavf: downgrade opcode 0 ARQ log to debug Ciara Loftus
@ 2026-06-08 14:55 ` Ciara Loftus
2026-06-08 14:55 ` [PATCH 3/3] net/iavf: fix event handler refcount leak on HW reset Ciara Loftus
2026-06-09 14:53 ` [PATCH 0/3] net/iavf: vf reset fixes Bruce Richardson
3 siblings, 0 replies; 6+ messages in thread
From: Ciara Loftus @ 2026-06-08 14:55 UTC (permalink / raw)
To: dev; +Cc: Ciara Loftus, stable, Talluri Chaitanyababu
Commit 1428895ad417 ("net/iavf: fix disabling of promiscuous modes on
close") added a synchronous VIRTCHNL round-trip on the close path
before the reset request is sent. This delays the reset just long
enough that `IAVF_VFGEN_RSTAT` still reads `VIRTCHNL_VFR_VFACTIVE`
when the re-init path polls it for reset completion. The driver
interprets this as the reset being complete, when in fact it has not
yet started, and proceeds to issue VIRTCHNL commands before the PF
has disabled the VF mailbox.
Fix by polling `IAVF_VF_ARQLEN1.ARQENABLE` immediately after the reset
request and before shutting down the admin queue, when the close is
triggered by a reset. The PF clears this bit as its first reset action,
providing an unambiguous signal that the reset is in progress.
Fixes: 1428895ad417 ("net/iavf: fix disabling of promiscuous modes on close")
Cc: stable@dpdk.org
Reported-by: Talluri Chaitanyababu <chaitanyababux.talluri@intel.com>
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
---
drivers/net/intel/iavf/iavf.h | 1 +
drivers/net/intel/iavf/iavf_ethdev.c | 12 ++++++++++++
2 files changed, 13 insertions(+)
diff --git a/drivers/net/intel/iavf/iavf.h b/drivers/net/intel/iavf/iavf.h
index 2615b6f034..4444602a30 100644
--- a/drivers/net/intel/iavf/iavf.h
+++ b/drivers/net/intel/iavf/iavf.h
@@ -291,6 +291,7 @@ struct iavf_info {
 	struct rte_eth_dev *eth_dev;
 	bool in_reset_recovery;
+	bool reset_pending;
 	uint32_t ptp_caps;
 	rte_spinlock_t phc_time_aq_lock;
diff --git a/drivers/net/intel/iavf/iavf_ethdev.c b/drivers/net/intel/iavf/iavf_ethdev.c
index a8031e23a5..a38132e80e 100644
--- a/drivers/net/intel/iavf/iavf_ethdev.c
+++ b/drivers/net/intel/iavf/iavf_ethdev.c
@@ -106,6 +106,7 @@ static int iavf_dev_start(struct rte_eth_dev *dev);
 static int iavf_dev_stop(struct rte_eth_dev *dev);
 static int iavf_dev_close(struct rte_eth_dev *dev);
 static int iavf_dev_reset(struct rte_eth_dev *dev);
+static bool iavf_is_reset_detected(struct iavf_adapter *adapter);
 static int iavf_dev_info_get(struct rte_eth_dev *dev,
 			     struct rte_eth_dev_info *dev_info);
 static const uint32_t *iavf_dev_supported_ptypes_get(struct rte_eth_dev *dev,
@@ -3196,6 +3197,14 @@ iavf_dev_close(struct rte_eth_dev *dev)
 	iavf_flow_uninit(adapter);
 	iavf_vf_reset(hw);
+	/*
+	 * If a reset is pending, wait for the PF to disable the VF's admin
+	 * receive queue (its first reset action) before we shut it down
+	 * ourselves. This ensures iavf_check_vf_reset_done() does not see
+	 * a stale VFACTIVE value on the re-init path.
+	 */
+	if (vf->reset_pending)
+		iavf_is_reset_detected(adapter);
 	vf->aq_intr_enabled = false;
 	iavf_shutdown_adminq(hw);
 	if (vf->vf_res->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_WB_ON_ITR) {
@@ -3273,6 +3282,7 @@ iavf_dev_reset(struct rte_eth_dev *dev)
 	struct iavf_adapter *adapter =
 		IAVF_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
 	struct iavf_hw *hw = IAVF_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(dev->data->dev_private);
 	/*
 	 * Check whether the VF reset has been done and inform application,
 	 * to avoid calling the virtual channel command, which may cause
@@ -3285,8 +3295,10 @@ iavf_dev_reset(struct rte_eth_dev *dev)
 	}
 	iavf_set_no_poll(adapter, false);
+	vf->reset_pending = true;
 	PMD_DRV_LOG(DEBUG, "Start dev_reset ...");
 	ret = iavf_dev_uninit(dev);
+	vf->reset_pending = false;
 	if (ret)
 		return ret;
--
2.43.0
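
The wait described in the commit message (poll the enable bit until the PF clears it, then proceed) can be sketched without hardware. This is an illustrative model only and does not reproduce `iavf_is_reset_detected()`: `wait_for_reset_start()`, `fake_arqlen_read()`, the `reg_read_fn` callback type and the `ARQ_ENABLE_BIT` position are all assumptions made for the example, and a real driver would also sleep between reads.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: bit position chosen for illustration only. */
#define ARQ_ENABLE_BIT (1u << 31)

/* Caller-supplied register read, so the sketch runs without hardware. */
typedef uint32_t (*reg_read_fn)(void *ctx);

/*
 * Poll until the enable bit reads as clear, or until max_polls reads
 * have been made. Returns true if the bit was seen clear, i.e. the
 * reset has visibly started. No delay between reads in this sketch.
 */
static bool
wait_for_reset_start(reg_read_fn read_reg, void *ctx, unsigned int max_polls)
{
	for (unsigned int i = 0; i < max_polls; i++) {
		if ((read_reg(ctx) & ARQ_ENABLE_BIT) == 0)
			return true;
	}
	return false;
}

/* Fake register: reports "enabled" for *remaining reads, then reads as 0. */
static uint32_t
fake_arqlen_read(void *ctx)
{
	unsigned int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return ARQ_ENABLE_BIT | 32u;
	}
	return 0;
}
```

With a counter of 3 and a budget of 10 polls, the fake register clears on the fourth read and the wait succeeds; with a counter larger than the budget, the wait times out and returns false, which is the case a caller would need to decide how to handle.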
* [PATCH 3/3] net/iavf: fix event handler refcount leak on HW reset
2026-06-08 14:55 [PATCH 0/3] net/iavf: vf reset fixes Ciara Loftus
2026-06-08 14:55 ` [PATCH 1/3] net/iavf: downgrade opcode 0 ARQ log to debug Ciara Loftus
2026-06-08 14:55 ` [PATCH 2/3] net/iavf: wait for PF reset start before reinitializing Ciara Loftus
@ 2026-06-08 14:55 ` Ciara Loftus
2026-06-09 14:53 ` [PATCH 0/3] net/iavf: vf reset fixes Bruce Richardson
3 siblings, 0 replies; 6+ messages in thread
From: Ciara Loftus @ 2026-06-08 14:55 UTC (permalink / raw)
To: dev; +Cc: Ciara Loftus, stable
Currently, when handling a hardware reset, the uninit path skips
releasing the event handler reference while in_reset_recovery is set,
to prevent premature teardown of the event handler thread. However, the
subsequent re-init call unconditionally increments the reference count,
inflating ndev on every reset cycle. On the final device removal, the
count never reaches zero and the event handler thread is never joined.
Fix it by also skipping the event handler reference acquisition during
reset recovery, matching the symmetric skip in the uninit path so the
count stays stable across each reset cycle.
Fixes: 3e6a5d2d310a ("net/iavf: add devargs to enable VF auto-reset")
Cc: stable@dpdk.org
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
---
drivers/net/intel/iavf/iavf_ethdev.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/intel/iavf/iavf_ethdev.c b/drivers/net/intel/iavf/iavf_ethdev.c
index a38132e80e..ec1ad02826 100644
--- a/drivers/net/intel/iavf/iavf_ethdev.c
+++ b/drivers/net/intel/iavf/iavf_ethdev.c
@@ -3031,7 +3031,7 @@ iavf_dev_init(struct rte_eth_dev *eth_dev)
 	adapter->tpid = RTE_ETHER_TYPE_VLAN; /* VLAN TPID set to 0x8100 by default */
 	rte_spinlock_init(&adapter->phc_sync_lock);
-	if (iavf_dev_event_handler_init())
+	if (!vf->in_reset_recovery && iavf_dev_event_handler_init())
 		goto init_vf_err;
 	if (iavf_init_vf(eth_dev) != 0) {
--
2.43.0
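
The imbalance described in the commit message can be reproduced with a toy counter. The sketch below is a simplified model, not the driver's event handler: `struct handler`, `dev_init()`, `dev_uninit()` and `final_count()` are hypothetical names, and the `fixed` flag simply switches between the behaviour before and after this patch.

```c
#include <stdbool.h>

/* Hypothetical model of the shared handler's device count (ndev). */
struct handler {
	int ndev;
};

/* Init path: with 'fixed' set, the acquire is skipped during reset recovery. */
static void
dev_init(struct handler *h, bool in_reset_recovery, bool fixed)
{
	if (fixed && in_reset_recovery)
		return;
	h->ndev++;
}

/* Uninit path: the release is already skipped during reset recovery. */
static void
dev_uninit(struct handler *h, bool in_reset_recovery)
{
	if (in_reset_recovery)
		return;
	h->ndev--;
}

/* Probe once, run 'resets' reset cycles, remove once; return the final count. */
static int
final_count(int resets, bool fixed)
{
	struct handler h = { 0 };

	dev_init(&h, false, fixed);
	for (int i = 0; i < resets; i++) {
		dev_uninit(&h, true);
		dev_init(&h, true, fixed);
	}
	dev_uninit(&h, false);
	return h.ndev;
}
```

In this model, three reset cycles without the fix leave the count at 3 after removal, so a teardown gated on the count reaching zero never runs; with the fix the count returns to 0.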
* Re: [PATCH 1/3] net/iavf: downgrade opcode 0 ARQ log to debug
2026-06-08 14:55 ` [PATCH 1/3] net/iavf: downgrade opcode 0 ARQ log to debug Ciara Loftus
@ 2026-06-09 14:28 ` Bruce Richardson
0 siblings, 0 replies; 6+ messages in thread
From: Bruce Richardson @ 2026-06-09 14:28 UTC (permalink / raw)
To: Ciara Loftus; +Cc: dev, Talluri Chaitanyababu
On Mon, Jun 08, 2026 at 02:55:16PM +0000, Ciara Loftus wrote:
> From: Talluri Chaitanyababu <chaitanyababux.talluri@intel.com>
>
> After admin queue reinitialisation, completions from uninitialised
> ARQ ring descriptor memory may arrive before any real PF response.
> These carry opcode 0 (`VIRTCHNL_OP_UNKNOWN`) and trigger a WARNING
> log on every poll iteration, flooding the log during reset recovery.
>
> Treat opcode 0 as a distinct case and log it at DEBUG level, while
> retaining WARNING for genuine opcode mismatches.
>
> Signed-off-by: Talluri Chaitanyababu <chaitanyababux.talluri@intel.com>
> ---
> drivers/net/intel/iavf/iavf_vchnl.c | 11 +++++++++--
> 1 file changed, 9 insertions(+), 2 deletions(-)
>
Should this be backported as a bugfix?
* Re: [PATCH 0/3] net/iavf: vf reset fixes
2026-06-08 14:55 [PATCH 0/3] net/iavf: vf reset fixes Ciara Loftus
` (2 preceding siblings ...)
2026-06-08 14:55 ` [PATCH 3/3] net/iavf: fix event handler refcount leak on HW reset Ciara Loftus
@ 2026-06-09 14:53 ` Bruce Richardson
3 siblings, 0 replies; 6+ messages in thread
From: Bruce Richardson @ 2026-06-09 14:53 UTC (permalink / raw)
To: Ciara Loftus; +Cc: dev
On Mon, Jun 08, 2026 at 02:55:15PM +0000, Ciara Loftus wrote:
> The patch [1] aimed to address a race condition in the iavf driver
> during a reset and also reduced noisy logging during resets.
> Patch 1 of this series extracts the noisy logging fix into its own
> commit.
> Patch 2 offers an alternative approach to fixing the race condition.
> Patch 3 fixes a pre-existing refcount imbalance in the shared event
> handler thread that became visible while investigating the reset path.
>
> [1] https://patches.dpdk.org/project/dpdk/patch/20260605123646.1328492-1-chaitanyababux.talluri@intel.com/
>
> Ciara Loftus (2):
> net/iavf: wait for PF reset start before reinitializing
> net/iavf: fix event handler refcount leak on HW reset
>
> Talluri Chaitanyababu (1):
> net/iavf: downgrade opcode 0 ARQ log to debug
>
> drivers/net/intel/iavf/iavf.h | 1 +
> drivers/net/intel/iavf/iavf_ethdev.c | 14 +++++++++++++-
> drivers/net/intel/iavf/iavf_vchnl.c | 11 +++++++++--
> 3 files changed, 23 insertions(+), 3 deletions(-)
>
Series-acked-by: Bruce Richardson <bruce.richardson@intel.com>
Series applied to dpdk-next-net-intel.
Thanks,
/Bruce