public inbox for netdev@vger.kernel.org
* [PATCH iwl-next v2 0/2] Introduce IDPF PCI callbacks
@ 2026-04-14  3:16 Emil Tantilov
  2026-04-14  3:16 ` [PATCH iwl-next v2 1/2] idpf: remove conditonal MBX deinit from idpf_vc_core_deinit() Emil Tantilov
  2026-04-14  3:16 ` [PATCH iwl-next v2 2/2] idpf: implement pci error handlers Emil Tantilov
  0 siblings, 2 replies; 12+ messages in thread
From: Emil Tantilov @ 2026-04-14  3:16 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, przemyslaw.kitszel, jay.bhat, ivan.d.barrera,
	aleksandr.loktionov, larysa.zaremba, anthony.l.nguyen,
	andrew+netdev, davem, edumazet, kuba, pabeni, aleksander.lobakin,
	linux-pci, madhu.chittim, decot, willemb, sheenamo, lukas

This series implements PCI callbacks for the purpose of handling FLR and
PCI errors in the IDPF driver.

The first patch removes the conditional deinitialization of the mailbox in
the idpf_vc_core_deinit() function. Aside from being redundant, due to the
shutdown of the mailbox after a reset is detected, the check was also
preventing the driver from sending messages to stop and disable the vports
and queues on the FW side, which is needed for the prepare phase of the FLR
handling.

The second patch implements the PCI callbacks. The logic here follows
the reset handling done in idpf_init_hard_reset(), but is split into
prepare and resume phases: idpf_reset_prepare() stops all driver
operations, and the resume callback attempts to recover following the
reset or the PCI error event.

Testing hints:
1. FLR via sysfs:
echo 1 > /sys/class/net/<ifname>/device/reset

Previously this would have been handled by idpf_init_hard_reset() as the
driver detects the reset. Now it will be done by the PCI err callbacks,
so this is the easiest way to test the reset_prepare/resume path.

2. PCI errors can be tested with aer-inject:
./aer-inject -s 83:00.0 examples/<error_type>

3. Stress testing can be done by combining various callbacks with the
reset from step 1:
echo 1 > /sys/class/net/<if>/device/reset& ethtool -L <if> combined 8
ethtool -L <if> combined 16& echo 1 > /sys/class/net/<if>/device/reset

Changelog:
v1->v2:
- Removed the call to pci_save_state() from idpf_pci_err_slot_reset(),
  as it is no longer needed after pci_restore_state(). Suggested by
  Lukas Wunner.

v1:
https://lore.kernel.org/netdev/20260411003959.30959-1-emil.s.tantilov@intel.com/

Emil Tantilov (2):
  idpf: remove conditonal MBX deinit from idpf_vc_core_deinit()
  idpf: implement pci error handlers

 drivers/net/ethernet/intel/idpf/idpf.h        |   3 +
 drivers/net/ethernet/intel/idpf/idpf_lib.c    |  13 +-
 drivers/net/ethernet/intel/idpf/idpf_main.c   | 112 ++++++++++++++++++
 .../net/ethernet/intel/idpf/idpf_virtchnl.c   |  11 +-
 4 files changed, 127 insertions(+), 12 deletions(-)

-- 
2.37.3


* [PATCH iwl-next v2 1/2] idpf: remove conditonal MBX deinit from idpf_vc_core_deinit()
  2026-04-14  3:16 [PATCH iwl-next v2 0/2] Introduce IDPF PCI callbacks Emil Tantilov
@ 2026-04-14  3:16 ` Emil Tantilov
  2026-04-14 11:07   ` Loktionov, Aleksandr
  2026-04-14  3:16 ` [PATCH iwl-next v2 2/2] idpf: implement pci error handlers Emil Tantilov
  1 sibling, 1 reply; 12+ messages in thread
From: Emil Tantilov @ 2026-04-14  3:16 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, przemyslaw.kitszel, jay.bhat, ivan.d.barrera,
	aleksandr.loktionov, larysa.zaremba, anthony.l.nguyen,
	andrew+netdev, davem, edumazet, kuba, pabeni, aleksander.lobakin,
	linux-pci, madhu.chittim, decot, willemb, sheenamo, lukas

Previously it was assumed that idpf_vc_core_deinit() is always being
called during reset handling, with remove being an exception. Ideally
the driver needs to communicate the changes to FW in all instances where
the MBX is not already disabled. Remove the remove_in_prog check from
idpf_vc_core_deinit() as the MBX was already disabled while handling the
reset via libie_ctlq_xn_shutdown() by the service task. This is also
needed by the following patch, introducing PCI callbacks support.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Reviewed-by: Jay Bhat <jay.bhat@intel.com>
Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
---
 drivers/net/ethernet/intel/idpf/idpf_virtchnl.c | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
index 129c8f6b0faa..fceaf3ec1cd4 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
@@ -3229,24 +3229,15 @@ int idpf_vc_core_init(struct idpf_adapter *adapter)
  */
 void idpf_vc_core_deinit(struct idpf_adapter *adapter)
 {
-	bool remove_in_prog;
-
 	if (!test_bit(IDPF_VC_CORE_INIT, adapter->flags))
 		return;
 
-	/* Avoid transaction timeouts when called during reset */
-	remove_in_prog = test_bit(IDPF_REMOVE_IN_PROG, adapter->flags);
-	if (!remove_in_prog)
-		idpf_deinit_dflt_mbx(adapter);
-
 	idpf_ptp_release(adapter);
 	idpf_deinit_task(adapter);
 	idpf_idc_deinit_core_aux_device(adapter);
 	idpf_rel_rx_pt_lkup(adapter);
 	idpf_intr_rel(adapter);
-
-	if (remove_in_prog)
-		idpf_deinit_dflt_mbx(adapter);
+	idpf_deinit_dflt_mbx(adapter);
 
 	cancel_delayed_work_sync(&adapter->serv_task);
 
-- 
2.37.3


* [PATCH iwl-next v2 2/2] idpf: implement pci error handlers
  2026-04-14  3:16 [PATCH iwl-next v2 0/2] Introduce IDPF PCI callbacks Emil Tantilov
  2026-04-14  3:16 ` [PATCH iwl-next v2 1/2] idpf: remove conditonal MBX deinit from idpf_vc_core_deinit() Emil Tantilov
@ 2026-04-14  3:16 ` Emil Tantilov
  2026-04-14 11:09   ` Loktionov, Aleksandr
  2026-04-14 15:13   ` Lukas Wunner
  1 sibling, 2 replies; 12+ messages in thread
From: Emil Tantilov @ 2026-04-14  3:16 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, przemyslaw.kitszel, jay.bhat, ivan.d.barrera,
	aleksandr.loktionov, larysa.zaremba, anthony.l.nguyen,
	andrew+netdev, davem, edumazet, kuba, pabeni, aleksander.lobakin,
	linux-pci, madhu.chittim, decot, willemb, sheenamo, lukas

Add callbacks to handle PCI errors and FLR reset. When preparing to handle
reset on the bus, the driver must stop all operations that can lead to MMIO
access in order to prevent HW errors. To accomplish this introduce helper
idpf_reset_prepare() that gets called prior to FLR or when PCI error is
detected. Upon resume the recovery is done through the existing reset path
by starting the event task.

The following callbacks are implemented:
.reset_prepare runs the first portion of the generic reset path leading up
to the part where we wait for the reset to complete.
.reset_done/resume runs the recovery part of the reset handling.
.error_detected is the callback dealing with PCI errors, similar to the
prepare call, we stop all operations, prior to attempting a recovery.
.slot_reset is the callback attempting to restore the device, provided a
PCI reset was initiated by the AER driver.

Whereas previously the init logic guaranteed netdevs during reset, the
addition of idpf_detach_and_close() to the PCI callbacks flow makes it
possible for the function to be called without netdevs. Add check to
avoid NULL pointer dereference in that case.
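
For illustration only, the return-value decisions of .error_detected and
.slot_reset described above can be modeled in userspace C. This is a
minimal sketch under stated assumptions, not driver code: every model_*
name and enum is hypothetical and stands in for pci_channel_state_t,
pci_ers_result_t and the reset status register read, and the quiesce step
done by idpf_reset_prepare() is not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for pci_channel_state_t values. */
enum model_channel_state {
	MODEL_IO_NORMAL,
	MODEL_IO_FROZEN,
	MODEL_IO_PERM_FAILURE,
};

/* Hypothetical stand-ins for pci_ers_result_t values. */
enum model_ers_result {
	MODEL_ERS_NEED_RESET,
	MODEL_ERS_DISCONNECT,
	MODEL_ERS_RECOVERED,
};

/* Mirrors the decision in .error_detected: a permanent failure cannot be
 * recovered, anything else asks the PCI core for a reset.
 */
static enum model_ers_result
model_error_detected(enum model_channel_state state)
{
	if (state == MODEL_IO_PERM_FAILURE)
		return MODEL_ERS_DISCONNECT;

	return MODEL_ERS_NEED_RESET;
}

/* Mirrors the decision in .slot_reset: an all-ones register read is
 * treated as "device not responding".
 */
static enum model_ers_result model_slot_reset(uint32_t rstat)
{
	if (rstat != 0xFFFFFFFFu)
		return MODEL_ERS_RECOVERED;

	return MODEL_ERS_DISCONNECT;
}

/* Chains the two steps the way the recovery flow would visit them. */
static enum model_ers_result
model_recover(enum model_channel_state state, uint32_t rstat)
{
	enum model_ers_result res = model_error_detected(state);

	if (res == MODEL_ERS_DISCONNECT)
		return res;

	/* Only NEED_RESET can reach the slot reset step in this model. */
	assert(res == MODEL_ERS_NEED_RESET);

	return model_slot_reset(rstat);
}
```

In this model a frozen channel followed by a sane register read ends in
MODEL_ERS_RECOVERED, while a permanent failure never reaches the slot
reset step.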

Co-developed-by: Alan Brady <alan.brady@intel.com>
Signed-off-by: Alan Brady <alan.brady@intel.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Reviewed-by: Jay Bhat <jay.bhat@intel.com>
Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
---
 drivers/net/ethernet/intel/idpf/idpf.h      |   3 +
 drivers/net/ethernet/intel/idpf/idpf_lib.c  |  13 ++-
 drivers/net/ethernet/intel/idpf/idpf_main.c | 112 ++++++++++++++++++++
 3 files changed, 126 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/idpf/idpf.h b/drivers/net/ethernet/intel/idpf/idpf.h
index 1d0e32e47e87..164d2f3e233a 100644
--- a/drivers/net/ethernet/intel/idpf/idpf.h
+++ b/drivers/net/ethernet/intel/idpf/idpf.h
@@ -88,6 +88,7 @@ enum idpf_state {
  * @IDPF_REMOVE_IN_PROG: Driver remove in progress
  * @IDPF_MB_INTR_MODE: Mailbox in interrupt mode
  * @IDPF_VC_CORE_INIT: virtchnl core has been init
+ * @IDPF_PCI_CB_RESET: Reset via the PCI callbacks
  * @IDPF_FLAGS_NBITS: Must be last
  */
 enum idpf_flags {
@@ -97,6 +98,7 @@ enum idpf_flags {
 	IDPF_REMOVE_IN_PROG,
 	IDPF_MB_INTR_MODE,
 	IDPF_VC_CORE_INIT,
+	IDPF_PCI_CB_RESET,
 	IDPF_FLAGS_NBITS,
 };
 
@@ -1012,4 +1014,5 @@ void idpf_idc_vdev_mtu_event(struct iidc_rdma_vport_dev_info *vdev_info,
 int idpf_add_del_fsteer_filters(struct idpf_adapter *adapter,
 				struct virtchnl2_flow_rule_add_del *rule,
 				enum virtchnl2_op opcode);
+void idpf_detach_and_close(struct idpf_adapter *adapter);
 #endif /* !_IDPF_H_ */
diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c
index 7988836fbae0..1e706beb0098 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_lib.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c
@@ -758,13 +758,16 @@ static int idpf_init_mac_addr(struct idpf_vport *vport,
 	return 0;
 }
 
-static void idpf_detach_and_close(struct idpf_adapter *adapter)
+void idpf_detach_and_close(struct idpf_adapter *adapter)
 {
 	int max_vports = adapter->max_vports;
 
 	for (int i = 0; i < max_vports; i++) {
 		struct net_device *netdev = adapter->netdevs[i];
 
+		if (!netdev)
+			continue;
+
 		/* If the interface is in detached state, that means the
 		 * previous reset was not handled successfully for this
 		 * vport.
@@ -1908,6 +1911,10 @@ static void idpf_init_hard_reset(struct idpf_adapter *adapter)
 
 	dev_info(dev, "Device HW Reset initiated\n");
 
+	/* Reset has already happened, skip to recovery. */
+	if (test_and_clear_bit(IDPF_PCI_CB_RESET, adapter->flags))
+		goto check_rst_complete;
+
 	/* Prepare for reset */
 	if (test_bit(IDPF_HR_DRV_LOAD, adapter->flags)) {
 		reg_ops->trigger_reset(adapter, IDPF_HR_DRV_LOAD);
@@ -1925,6 +1932,7 @@ static void idpf_init_hard_reset(struct idpf_adapter *adapter)
 		goto unlock_mutex;
 	}
 
+check_rst_complete:
 	/* Wait for reset to complete */
 	err = idpf_check_reset_complete(adapter, &adapter->reset_reg);
 	if (err) {
@@ -1984,7 +1992,8 @@ void idpf_vc_event_task(struct work_struct *work)
 	if (test_bit(IDPF_HR_FUNC_RESET, adapter->flags))
 		goto func_reset;
 
-	if (test_bit(IDPF_HR_DRV_LOAD, adapter->flags))
+	if (test_bit(IDPF_HR_DRV_LOAD, adapter->flags) ||
+	    test_bit(IDPF_PCI_CB_RESET, adapter->flags))
 		goto drv_load;
 
 	return;
diff --git a/drivers/net/ethernet/intel/idpf/idpf_main.c b/drivers/net/ethernet/intel/idpf/idpf_main.c
index d99f759c55e1..54fca25c09f7 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_main.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_main.c
@@ -234,6 +234,7 @@ static int idpf_cfg_device(struct idpf_adapter *adapter)
 	if (err)
 		pci_dbg(pdev, "PCIe PTM is not supported by PCIe bus/controller\n");
 
+	pci_save_state(pdev);
 	pci_set_drvdata(pdev, adapter);
 
 	return 0;
@@ -360,6 +361,116 @@ static int idpf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	return err;
 }
 
+static void idpf_reset_prepare(struct idpf_adapter *adapter)
+{
+	pci_dbg(adapter->pdev, "resetting\n");
+	set_bit(IDPF_HR_RESET_IN_PROG, adapter->flags);
+	cancel_delayed_work_sync(&adapter->serv_task);
+	cancel_delayed_work_sync(&adapter->vc_event_task);
+	idpf_detach_and_close(adapter);
+	idpf_idc_issue_reset_event(adapter->cdev_info);
+	idpf_vc_core_deinit(adapter);
+}
+
+/**
+ * idpf_pci_err_detected - PCI error detected, about to attempt recovery
+ * @pdev: PCI device struct
+ * @err: err detected
+ *
+ * Return: %PCI_ERS_RESULT_NEED_RESET to attempt recovery,
+ * %PCI_ERS_RESULT_DISCONNECT if recovery is not possible.
+ */
+static pci_ers_result_t
+idpf_pci_err_detected(struct pci_dev *pdev, pci_channel_state_t err)
+{
+	struct idpf_adapter *adapter = pci_get_drvdata(pdev);
+
+	/* Shutdown the mailbox if PCI I/O is in a bad state to avoid MBX
+	 * timeouts during the prepare stage.
+	 */
+	if (pci_channel_offline(pdev))
+		libie_ctlq_xn_shutdown(adapter->xnm);
+
+	idpf_reset_prepare(adapter);
+
+	if (err == pci_channel_io_perm_failure)
+		return PCI_ERS_RESULT_DISCONNECT;
+
+	/* When called due to PCI error, driver will have to force PFR on
+	 * resume, in order to complete the recovery via the event task.
+	 */
+	set_bit(IDPF_PCI_CB_RESET, adapter->flags);
+
+	return PCI_ERS_RESULT_NEED_RESET;
+}
+
+/**
+ * idpf_pci_err_slot_reset - PCI undergoing reset
+ * @pdev: PCI device struct
+ *
+ * Reset PCI state and use a register read to see if we're good.
+ *
+ * Return: %PCI_ERS_RESULT_RECOVERED on success,
+ * %PCI_ERS_RESULT_DISCONNECT on failure.
+ */
+static pci_ers_result_t
+idpf_pci_err_slot_reset(struct pci_dev *pdev)
+{
+	struct idpf_adapter *adapter = pci_get_drvdata(pdev);
+
+	pci_restore_state(pdev);
+	pci_set_master(pdev);
+	pci_wake_from_d3(pdev, false);
+	if (readl(adapter->reset_reg.rstat) != 0xFFFFFFFF)
+		return PCI_ERS_RESULT_RECOVERED;
+
+	return PCI_ERS_RESULT_DISCONNECT;
+}
+
+/**
+ * idpf_pci_err_resume - Resume operations after PCI error recovery
+ * @pdev: PCI device struct
+ */
+static void idpf_pci_err_resume(struct pci_dev *pdev)
+{
+	struct idpf_adapter *adapter = pci_get_drvdata(pdev);
+
+	/* Force a PFR when resuming from PCI error. */
+	if (test_and_set_bit(IDPF_PCI_CB_RESET, adapter->flags))
+		adapter->dev_ops.reg_ops.trigger_reset(adapter, IDPF_HR_FUNC_RESET);
+
+	queue_delayed_work(adapter->vc_event_wq,
+			   &adapter->vc_event_task,
+			   msecs_to_jiffies(300));
+}
+
+/**
+ * idpf_pci_err_reset_prepare - Prepare driver for PCI reset
+ * @pdev: PCI device struct
+ */
+static void idpf_pci_err_reset_prepare(struct pci_dev *pdev)
+{
+	idpf_reset_prepare(pci_get_drvdata(pdev));
+}
+
+/**
+ * idpf_pci_err_reset_done - PCI err reset recovery complete
+ * @pdev: PCI device struct
+ */
+static void idpf_pci_err_reset_done(struct pci_dev *pdev)
+{
+	pci_dbg(pdev, "reset: done\n");
+	idpf_pci_err_resume(pdev);
+}
+
+static const struct pci_error_handlers idpf_pci_err_handler = {
+	.error_detected = idpf_pci_err_detected,
+	.slot_reset = idpf_pci_err_slot_reset,
+	.reset_prepare = idpf_pci_err_reset_prepare,
+	.reset_done = idpf_pci_err_reset_done,
+	.resume = idpf_pci_err_resume,
+};
+
 /* idpf_pci_tbl - PCI Dev idpf ID Table
  */
 static const struct pci_device_id idpf_pci_tbl[] = {
@@ -377,5 +488,6 @@ static struct pci_driver idpf_driver = {
 	.sriov_configure	= idpf_sriov_configure,
 	.remove			= idpf_remove,
 	.shutdown		= idpf_shutdown,
+	.err_handler		= &idpf_pci_err_handler,
 };
 module_pci_driver(idpf_driver);
-- 
2.37.3


* RE: [PATCH iwl-next v2 1/2] idpf: remove conditonal MBX deinit from idpf_vc_core_deinit()
  2026-04-14  3:16 ` [PATCH iwl-next v2 1/2] idpf: remove conditonal MBX deinit from idpf_vc_core_deinit() Emil Tantilov
@ 2026-04-14 11:07   ` Loktionov, Aleksandr
  2026-04-14 14:56     ` Tantilov, Emil S
  0 siblings, 1 reply; 12+ messages in thread
From: Loktionov, Aleksandr @ 2026-04-14 11:07 UTC (permalink / raw)
  To: Tantilov, Emil S, intel-wired-lan@lists.osuosl.org
  Cc: netdev@vger.kernel.org, Kitszel, Przemyslaw, Bhat, Jay,
	Barrera, Ivan D, Zaremba, Larysa, Nguyen, Anthony L,
	andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
	kuba@kernel.org, pabeni@redhat.com, Lobakin, Aleksander,
	linux-pci@vger.kernel.org, Chittim, Madhu, decot@google.com,
	willemb@google.com, sheenamo@google.com, lukas@wunner.de



> -----Original Message-----
> From: Tantilov, Emil S <emil.s.tantilov@intel.com>
> Sent: Tuesday, April 14, 2026 5:17 AM
> To: intel-wired-lan@lists.osuosl.org
> Cc: netdev@vger.kernel.org; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; Bhat, Jay <jay.bhat@intel.com>;
> Barrera, Ivan D <ivan.d.barrera@intel.com>; Loktionov, Aleksandr
> <aleksandr.loktionov@intel.com>; Zaremba, Larysa
> <larysa.zaremba@intel.com>; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; andrew+netdev@lunn.ch;
> davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
> pabeni@redhat.com; Lobakin, Aleksander <aleksander.lobakin@intel.com>;
> linux-pci@vger.kernel.org; Chittim, Madhu <madhu.chittim@intel.com>;
> decot@google.com; willemb@google.com; sheenamo@google.com;
> lukas@wunner.de
> Subject: [PATCH iwl-next v2 1/2] idpf: remove conditonal MBX deinit
> from idpf_vc_core_deinit()
"conditonal" -> "conditional"

Everything else looks fine
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>

> 
> Previously it was assumed that idpf_vc_core_deinit() is always being
> called during reset handling, with remove being an exception. Ideally
> the driver needs to communicate the changes to FW in all instances
> where the MBX is not already disabled. Remove the remove_in_prog check
> from
> idpf_vc_core_deinit() as the MBX was already disabled while handling
> the reset via libie_ctlq_xn_shutdown() by the service task. This is
> also needed by the following patch, introducing PCI callbacks support.
> 
> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
> Reviewed-by: Jay Bhat <jay.bhat@intel.com>
> Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
> ---
>  drivers/net/ethernet/intel/idpf/idpf_virtchnl.c | 11 +----------
>  1 file changed, 1 insertion(+), 10 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
> index 129c8f6b0faa..fceaf3ec1cd4 100644
> --- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
> +++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
> @@ -3229,24 +3229,15 @@ int idpf_vc_core_init(struct idpf_adapter *adapter)
>   */
>  void idpf_vc_core_deinit(struct idpf_adapter *adapter)
>  {
> -	bool remove_in_prog;
> -
>  	if (!test_bit(IDPF_VC_CORE_INIT, adapter->flags))
>  		return;
> 
> -	/* Avoid transaction timeouts when called during reset */
> -	remove_in_prog = test_bit(IDPF_REMOVE_IN_PROG, adapter->flags);
> -	if (!remove_in_prog)
> -		idpf_deinit_dflt_mbx(adapter);
> -
>  	idpf_ptp_release(adapter);
>  	idpf_deinit_task(adapter);
>  	idpf_idc_deinit_core_aux_device(adapter);
>  	idpf_rel_rx_pt_lkup(adapter);
>  	idpf_intr_rel(adapter);
> -
> -	if (remove_in_prog)
> -		idpf_deinit_dflt_mbx(adapter);
> +	idpf_deinit_dflt_mbx(adapter);
> 
>  	cancel_delayed_work_sync(&adapter->serv_task);
> 
> --
> 2.37.3


* RE: [PATCH iwl-next v2 2/2] idpf: implement pci error handlers
  2026-04-14  3:16 ` [PATCH iwl-next v2 2/2] idpf: implement pci error handlers Emil Tantilov
@ 2026-04-14 11:09   ` Loktionov, Aleksandr
  2026-04-14 15:01     ` Tantilov, Emil S
  2026-04-14 15:10     ` Lukas Wunner
  2026-04-14 15:13   ` Lukas Wunner
  1 sibling, 2 replies; 12+ messages in thread
From: Loktionov, Aleksandr @ 2026-04-14 11:09 UTC (permalink / raw)
  To: Tantilov, Emil S, intel-wired-lan@lists.osuosl.org
  Cc: netdev@vger.kernel.org, Kitszel, Przemyslaw, Bhat, Jay,
	Barrera, Ivan D, Zaremba, Larysa, Nguyen, Anthony L,
	andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
	kuba@kernel.org, pabeni@redhat.com, Lobakin, Aleksander,
	linux-pci@vger.kernel.org, Chittim, Madhu, decot@google.com,
	willemb@google.com, sheenamo@google.com, lukas@wunner.de



> -----Original Message-----
> From: Tantilov, Emil S <emil.s.tantilov@intel.com>
> Sent: Tuesday, April 14, 2026 5:17 AM
> To: intel-wired-lan@lists.osuosl.org
> Cc: netdev@vger.kernel.org; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; Bhat, Jay <jay.bhat@intel.com>;
> Barrera, Ivan D <ivan.d.barrera@intel.com>; Loktionov, Aleksandr
> <aleksandr.loktionov@intel.com>; Zaremba, Larysa
> <larysa.zaremba@intel.com>; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; andrew+netdev@lunn.ch;
> davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
> pabeni@redhat.com; Lobakin, Aleksander <aleksander.lobakin@intel.com>;
> linux-pci@vger.kernel.org; Chittim, Madhu <madhu.chittim@intel.com>;
> decot@google.com; willemb@google.com; sheenamo@google.com;
> lukas@wunner.de
> Subject: [PATCH iwl-next v2 2/2] idpf: implement pci error handlers
> 
> Add callbacks to handle PCI errors and FLR reset. When preparing to
> handle reset on the bus, the driver must stop all operations that can
> lead to MMIO access in order to prevent HW errors. To accomplish this
> introduce helper
> idpf_reset_prepare() that gets called prior to FLR or when PCI error
> is detected. Upon resume the recovery is done through the existing
> reset path by starting the event task.
> 
> The following callbacks are implemented:
> .reset_prepare runs the first portion of the generic reset path
> leading up to the part where we wait for the reset to complete.
> .reset_done/resume runs the recovery part of the reset handling.
> .error_detected is the callback dealing with PCI errors, similar to
> the prepare call, we stop all operations, prior to attempting a
> recovery.
> .slot_reset is the callback attempting to restore the device, provided
> a PCI reset was initiated by the AER driver.
> 
> Whereas previously the init logic guaranteed netdevs during reset, the
> addition of idpf_detach_and_close() to the PCI callbacks flow makes it
> possible for the function to be called without netdevs. Add check to
> avoid NULL pointer dereference in that case.
> 
> Co-developed-by: Alan Brady <alan.brady@intel.com>
> Signed-off-by: Alan Brady <alan.brady@intel.com>
> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
> Reviewed-by: Jay Bhat <jay.bhat@intel.com>
> Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
> ---
>  drivers/net/ethernet/intel/idpf/idpf.h      |   3 +
>  drivers/net/ethernet/intel/idpf/idpf_lib.c  |  13 ++-
>  drivers/net/ethernet/intel/idpf/idpf_main.c | 112 ++++++++++++++++++++
>  3 files changed, 126 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/idpf/idpf.h b/drivers/net/ethernet/intel/idpf/idpf.h
> index 1d0e32e47e87..164d2f3e233a 100644
> --- a/drivers/net/ethernet/intel/idpf/idpf.h
> +++ b/drivers/net/ethernet/intel/idpf/idpf.h
> @@ -88,6 +88,7 @@ enum idpf_state {
>   * @IDPF_REMOVE_IN_PROG: Driver remove in progress
>   * @IDPF_MB_INTR_MODE: Mailbox in interrupt mode
>   * @IDPF_VC_CORE_INIT: virtchnl core has been init
> + * @IDPF_PCI_CB_RESET: Reset via the PCI callbacks
>   * @IDPF_FLAGS_NBITS: Must be last
>   */
>  enum idpf_flags {
> @@ -97,6 +98,7 @@ enum idpf_flags {
>  	IDPF_REMOVE_IN_PROG,
>  	IDPF_MB_INTR_MODE,
>  	IDPF_VC_CORE_INIT,

...

> +/**
> + * idpf_pci_err_resume - Resume operations after PCI error recovery
> + * @pdev: PCI device struct
> + */
> +static void idpf_pci_err_resume(struct pci_dev *pdev)
> +{
> +	struct idpf_adapter *adapter = pci_get_drvdata(pdev);
> +
> +	/* Force a PFR when resuming from PCI error. */
> +	if (test_and_set_bit(IDPF_PCI_CB_RESET, adapter->flags))
> +		adapter->dev_ops.reg_ops.trigger_reset(adapter, IDPF_HR_FUNC_RESET);
You say "Force a PFR", but PFR is only triggered on the AER path, not on the FLR path.

Everything else looks fine
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>

> +
> +	queue_delayed_work(adapter->vc_event_wq,
> +			   &adapter->vc_event_task,
> +			   msecs_to_jiffies(300));
> +}

...

>  };
>  module_pci_driver(idpf_driver);
> --
> 2.37.3


* Re: [PATCH iwl-next v2 1/2] idpf: remove conditonal MBX deinit from idpf_vc_core_deinit()
  2026-04-14 11:07   ` Loktionov, Aleksandr
@ 2026-04-14 14:56     ` Tantilov, Emil S
  0 siblings, 0 replies; 12+ messages in thread
From: Tantilov, Emil S @ 2026-04-14 14:56 UTC (permalink / raw)
  To: Loktionov, Aleksandr, intel-wired-lan@lists.osuosl.org
  Cc: netdev@vger.kernel.org, Kitszel, Przemyslaw, Bhat, Jay,
	Barrera, Ivan D, Zaremba, Larysa, Nguyen, Anthony L,
	andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
	kuba@kernel.org, pabeni@redhat.com, Lobakin, Aleksander,
	linux-pci@vger.kernel.org, Chittim, Madhu, decot@google.com,
	willemb@google.com, sheenamo@google.com, lukas@wunner.de



On 4/14/2026 4:07 AM, Loktionov, Aleksandr wrote:
> 
> 
>> -----Original Message-----
>> From: Tantilov, Emil S <emil.s.tantilov@intel.com>
>> Sent: Tuesday, April 14, 2026 5:17 AM
>> To: intel-wired-lan@lists.osuosl.org
>> Cc: netdev@vger.kernel.org; Kitszel, Przemyslaw
>> <przemyslaw.kitszel@intel.com>; Bhat, Jay <jay.bhat@intel.com>;
>> Barrera, Ivan D <ivan.d.barrera@intel.com>; Loktionov, Aleksandr
>> <aleksandr.loktionov@intel.com>; Zaremba, Larysa
>> <larysa.zaremba@intel.com>; Nguyen, Anthony L
>> <anthony.l.nguyen@intel.com>; andrew+netdev@lunn.ch;
>> davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
>> pabeni@redhat.com; Lobakin, Aleksander <aleksander.lobakin@intel.com>;
>> linux-pci@vger.kernel.org; Chittim, Madhu <madhu.chittim@intel.com>;
>> decot@google.com; willemb@google.com; sheenamo@google.com;
>> lukas@wunner.de
>> Subject: [PATCH iwl-next v2 1/2] idpf: remove conditonal MBX deinit
>> from idpf_vc_core_deinit()
> "conditonal" -> "conditional"

Doh. I will make sure it is corrected.

> 
> Everything else looks fine
> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>

Thanks,
Emil

> 
>>
>> Previously it was assumed that idpf_vc_core_deinit() is always being
>> called during reset handling, with remove being an exception. Ideally
>> the driver needs to communicate the changes to FW in all instances
>> where the MBX is not already disabled. Remove the remove_in_prog check
>> from
>> idpf_vc_core_deinit() as the MBX was already disabled while handling
>> the reset via libie_ctlq_xn_shutdown() by the service task. This is
>> also needed by the following patch, introducing PCI callbacks support.
>>
>> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
>> Reviewed-by: Jay Bhat <jay.bhat@intel.com>
>> Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
>> ---
>>   drivers/net/ethernet/intel/idpf/idpf_virtchnl.c | 11 +----------
>>   1 file changed, 1 insertion(+), 10 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
>> index 129c8f6b0faa..fceaf3ec1cd4 100644
>> --- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
>> +++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
>> @@ -3229,24 +3229,15 @@ int idpf_vc_core_init(struct idpf_adapter *adapter)
>>    */
>>   void idpf_vc_core_deinit(struct idpf_adapter *adapter)
>>   {
>> -	bool remove_in_prog;
>> -
>>   	if (!test_bit(IDPF_VC_CORE_INIT, adapter->flags))
>>   		return;
>>
>> -	/* Avoid transaction timeouts when called during reset */
>> -	remove_in_prog = test_bit(IDPF_REMOVE_IN_PROG, adapter->flags);
>> -	if (!remove_in_prog)
>> -		idpf_deinit_dflt_mbx(adapter);
>> -
>>   	idpf_ptp_release(adapter);
>>   	idpf_deinit_task(adapter);
>>   	idpf_idc_deinit_core_aux_device(adapter);
>>   	idpf_rel_rx_pt_lkup(adapter);
>>   	idpf_intr_rel(adapter);
>> -
>> -	if (remove_in_prog)
>> -		idpf_deinit_dflt_mbx(adapter);
>> +	idpf_deinit_dflt_mbx(adapter);
>>
>>   	cancel_delayed_work_sync(&adapter->serv_task);
>>
>> --
>> 2.37.3
> 


* Re: [PATCH iwl-next v2 2/2] idpf: implement pci error handlers
  2026-04-14 11:09   ` Loktionov, Aleksandr
@ 2026-04-14 15:01     ` Tantilov, Emil S
  2026-04-15  8:53       ` Loktionov, Aleksandr
  2026-04-14 15:10     ` Lukas Wunner
  1 sibling, 1 reply; 12+ messages in thread
From: Tantilov, Emil S @ 2026-04-14 15:01 UTC (permalink / raw)
  To: Loktionov, Aleksandr, intel-wired-lan@lists.osuosl.org
  Cc: netdev@vger.kernel.org, Kitszel, Przemyslaw, Bhat, Jay,
	Barrera, Ivan D, Zaremba, Larysa, Nguyen, Anthony L,
	andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
	kuba@kernel.org, pabeni@redhat.com, Lobakin, Aleksander,
	linux-pci@vger.kernel.org, Chittim, Madhu, decot@google.com,
	willemb@google.com, sheenamo@google.com, lukas@wunner.de



On 4/14/2026 4:09 AM, Loktionov, Aleksandr wrote:
> 
> 
>> -----Original Message-----
>> From: Tantilov, Emil S <emil.s.tantilov@intel.com>
>> Sent: Tuesday, April 14, 2026 5:17 AM
>> To: intel-wired-lan@lists.osuosl.org
>> Cc: netdev@vger.kernel.org; Kitszel, Przemyslaw
>> <przemyslaw.kitszel@intel.com>; Bhat, Jay <jay.bhat@intel.com>;
>> Barrera, Ivan D <ivan.d.barrera@intel.com>; Loktionov, Aleksandr
>> <aleksandr.loktionov@intel.com>; Zaremba, Larysa
>> <larysa.zaremba@intel.com>; Nguyen, Anthony L
>> <anthony.l.nguyen@intel.com>; andrew+netdev@lunn.ch;
>> davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
>> pabeni@redhat.com; Lobakin, Aleksander <aleksander.lobakin@intel.com>;
>> linux-pci@vger.kernel.org; Chittim, Madhu <madhu.chittim@intel.com>;
>> decot@google.com; willemb@google.com; sheenamo@google.com;
>> lukas@wunner.de
>> Subject: [PATCH iwl-next v2 2/2] idpf: implement pci error handlers
>>
>> Add callbacks to handle PCI errors and FLR reset. When preparing to
>> handle reset on the bus, the driver must stop all operations that can
>> lead to MMIO access in order to prevent HW errors. To accomplish this
>> introduce helper
>> idpf_reset_prepare() that gets called prior to FLR or when PCI error
>> is detected. Upon resume the recovery is done through the existing
>> reset path by starting the event task.
>>
>> The following callbacks are implemented:
>> .reset_prepare runs the first portion of the generic reset path
>> leading up to the part where we wait for the reset to complete.
>> .reset_done/resume runs the recovery part of the reset handling.
>> .error_detected is the callback dealing with PCI errors, similar to
>> the prepare call, we stop all operations, prior to attempting a
>> recovery.
>> .slot_reset is the callback attempting to restore the device, provided
>> a PCI reset was initiated by the AER driver.
>>
>> Whereas previously the init logic guaranteed netdevs during reset, the
>> addition of idpf_detach_and_close() to the PCI callbacks flow makes it
>> possible for the function to be called without netdevs. Add check to
>> avoid NULL pointer dereference in that case.
>>
>> Co-developed-by: Alan Brady <alan.brady@intel.com>
>> Signed-off-by: Alan Brady <alan.brady@intel.com>
>> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
>> Reviewed-by: Jay Bhat <jay.bhat@intel.com>
>> Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
>> ---
>>   drivers/net/ethernet/intel/idpf/idpf.h      |   3 +
>>   drivers/net/ethernet/intel/idpf/idpf_lib.c  |  13 ++-
>>   drivers/net/ethernet/intel/idpf/idpf_main.c | 112 ++++++++++++++++++++
>>   3 files changed, 126 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/intel/idpf/idpf.h b/drivers/net/ethernet/intel/idpf/idpf.h
>> index 1d0e32e47e87..164d2f3e233a 100644
>> --- a/drivers/net/ethernet/intel/idpf/idpf.h
>> +++ b/drivers/net/ethernet/intel/idpf/idpf.h
>> @@ -88,6 +88,7 @@ enum idpf_state {
>>    * @IDPF_REMOVE_IN_PROG: Driver remove in progress
>>    * @IDPF_MB_INTR_MODE: Mailbox in interrupt mode
>>    * @IDPF_VC_CORE_INIT: virtchnl core has been init
>> + * @IDPF_PCI_CB_RESET: Reset via the PCI callbacks
>>    * @IDPF_FLAGS_NBITS: Must be last
>>    */
>>   enum idpf_flags {
>> @@ -97,6 +98,7 @@ enum idpf_flags {
>>   	IDPF_REMOVE_IN_PROG,
>>   	IDPF_MB_INTR_MODE,
>>   	IDPF_VC_CORE_INIT,
> 
> ...
> 
>> +/**
>> + * idpf_pci_err_resume - Resume operations after PCI error recovery
>> + * @pdev: PCI device struct
>> + */
>> +static void idpf_pci_err_resume(struct pci_dev *pdev)
>> +{
>> +	struct idpf_adapter *adapter = pci_get_drvdata(pdev);
>> +
>> +	/* Force a PFR when resuming from PCI error. */
>> +	if (test_and_set_bit(IDPF_PCI_CB_RESET, adapter->flags))
>> +		adapter->dev_ops.reg_ops.trigger_reset(adapter, IDPF_HR_FUNC_RESET);
> You say "Force a PFR", but PFR is only triggered on the AER path, not on the FLR path.

Hence the "force" - the call to `trigger_reset` results in a PFR and is
only needed in the case of a PCI error. If this function was called
because a user issued an FLR, the kernel will trigger it for us. This
way we can reuse the reset handling path to restore the operation of the
netdevs.

Though I may be misunderstanding - are you referring to the wording or
the logic?

Thanks,
Emil

> 
> Everything else looks fine
> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> 
>> +
>> +	queue_delayed_work(adapter->vc_event_wq,
>> +			   &adapter->vc_event_task,
>> +			   msecs_to_jiffies(300));
>> +}
> 
> ...
> 
>>   };
>>   module_pci_driver(idpf_driver);
>> --
>> 2.37.3
> 



* Re: [PATCH iwl-next v2 2/2] idpf: implement pci error handlers
  2026-04-14 11:09   ` Loktionov, Aleksandr
  2026-04-14 15:01     ` Tantilov, Emil S
@ 2026-04-14 15:10     ` Lukas Wunner
  2026-04-14 21:42       ` Tantilov, Emil S
  1 sibling, 1 reply; 12+ messages in thread
From: Lukas Wunner @ 2026-04-14 15:10 UTC (permalink / raw)
  To: Loktionov, Aleksandr
  Cc: Tantilov, Emil S, intel-wired-lan@lists.osuosl.org,
	netdev@vger.kernel.org, Kitszel, Przemyslaw, Bhat, Jay,
	Barrera, Ivan D, Zaremba, Larysa, Nguyen, Anthony L,
	andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
	kuba@kernel.org, pabeni@redhat.com, Lobakin, Aleksander,
	linux-pci@vger.kernel.org, Chittim, Madhu, decot@google.com,
	willemb@google.com, sheenamo@google.com

On Tue, Apr 14, 2026 at 11:09:05AM +0000, Loktionov, Aleksandr wrote:
> > From: Tantilov, Emil S <emil.s.tantilov@intel.com>
> > .slot_reset is the callback attempting to restore the device, provided
> > a PCI reset was initiated by the AER driver.

Just for clarity, those callbacks are invoked by PCI core error handling
code and are shared by EEH, AER, DPC as well as s390 error recovery flows.
So it's not only AER.

> > +/**
> > + * idpf_pci_err_resume - Resume operations after PCI error recovery
> > + * @pdev: PCI device struct
> > + */
> > +static void idpf_pci_err_resume(struct pci_dev *pdev) {
> > +	struct idpf_adapter *adapter = pci_get_drvdata(pdev);
> > +
> > +	/* Force a PFR when resuming from PCI error. */
> > +	if (test_and_set_bit(IDPF_PCI_CB_RESET, adapter->flags))
> > +		adapter->dev_ops.reg_ops.trigger_reset(adapter,
> > IDPF_HR_FUNC_RESET);
> 
> You say "Force a PFR", but PFR is only triggered on the AER path,
> not on the FLR path.

And?  idpf_pci_err_resume() is only invoked in the error recovery path
(aka AER path), not FLR path AFAICS.

Thanks,

Lukas


* Re: [PATCH iwl-next v2 2/2] idpf: implement pci error handlers
  2026-04-14  3:16 ` [PATCH iwl-next v2 2/2] idpf: implement pci error handlers Emil Tantilov
  2026-04-14 11:09   ` Loktionov, Aleksandr
@ 2026-04-14 15:13   ` Lukas Wunner
  2026-04-14 21:43     ` Tantilov, Emil S
  1 sibling, 1 reply; 12+ messages in thread
From: Lukas Wunner @ 2026-04-14 15:13 UTC (permalink / raw)
  To: Emil Tantilov
  Cc: intel-wired-lan, netdev, przemyslaw.kitszel, jay.bhat,
	ivan.d.barrera, aleksandr.loktionov, larysa.zaremba,
	anthony.l.nguyen, andrew+netdev, davem, edumazet, kuba, pabeni,
	aleksander.lobakin, linux-pci, madhu.chittim, decot, willemb,
	sheenamo

On Mon, Apr 13, 2026 at 08:16:31PM -0700, Emil Tantilov wrote:
> +static pci_ers_result_t
> +idpf_pci_err_slot_reset(struct pci_dev *pdev)
> +{
> +	struct idpf_adapter *adapter = pci_get_drvdata(pdev);
> +
> +	pci_restore_state(pdev);
> +	pci_set_master(pdev);
> +	pci_wake_from_d3(pdev, false);
> +	if (readl(adapter->reset_reg.rstat) != 0xFFFFFFFF)
> +		return PCI_ERS_RESULT_RECOVERED;

FWIW, there's a PCI_POSSIBLE_ERROR() helper that you may find useful
to check for an "all ones" MMIO read.
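For illustration, a minimal userspace sketch of how that check could look.
The two macros are copied from include/linux/pci.h so the snippet builds
outside the kernel (in the driver you would just include that header and
drop the local copies). fake_readl() and rstat_readable() are hypothetical
names standing in for readl(adapter->reset_reg.rstat) and the test in
idpf_pci_err_slot_reset(); they are not part of the patch.

```c
#include <stdint.h>

/* Local copies of the helpers from include/linux/pci.h (assumption: only
 * needed here because this sketch is built in userspace).
 */
#define PCI_ERROR_RESPONSE	(~0ULL)
#define PCI_POSSIBLE_ERROR(val)	((val) == ((__typeof__(val)) PCI_ERROR_RESPONSE))

/* Hypothetical stand-in for readl(): reads a 32-bit "register". */
static uint32_t fake_readl(const volatile uint32_t *reg)
{
	return *reg;
}

/* Returns 1 if the register read does not look like an "all ones"
 * error response, 0 otherwise. This mirrors the != 0xFFFFFFFF test
 * in the quoted hunk without open-coding the constant.
 */
static int rstat_readable(const volatile uint32_t *rstat)
{
	uint32_t val = fake_readl(rstat);

	return !PCI_POSSIBLE_ERROR(val);
}
```

The macro casts the error pattern to the type of its argument, so the same
check works for 8-, 16- and 32-bit reads without a width-specific constant.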

Thanks,

Lukas


* Re: [PATCH iwl-next v2 2/2] idpf: implement pci error handlers
  2026-04-14 15:10     ` Lukas Wunner
@ 2026-04-14 21:42       ` Tantilov, Emil S
  0 siblings, 0 replies; 12+ messages in thread
From: Tantilov, Emil S @ 2026-04-14 21:42 UTC (permalink / raw)
  To: Lukas Wunner, Loktionov, Aleksandr
  Cc: intel-wired-lan@lists.osuosl.org, netdev@vger.kernel.org,
	Kitszel, Przemyslaw, Bhat, Jay, Barrera, Ivan D, Zaremba, Larysa,
	Nguyen, Anthony L, andrew+netdev@lunn.ch, davem@davemloft.net,
	edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
	Lobakin, Aleksander, linux-pci@vger.kernel.org, Chittim, Madhu,
	decot@google.com, willemb@google.com, sheenamo@google.com



On 4/14/2026 8:10 AM, Lukas Wunner wrote:
> On Tue, Apr 14, 2026 at 11:09:05AM +0000, Loktionov, Aleksandr wrote:
>>> From: Tantilov, Emil S <emil.s.tantilov@intel.com>
>>> .slot_reset is the callback attempting to restore the device, provided
>>> a PCI reset was initiated by the AER driver.
> 
> Just for clarity, those callbacks are invoked by PCI core error handling
> code and are shared by EEH, AER, DPC as well as s390 error recovery flows.
> So it's not only AER.

Understood. I can change the wording to be more generic.

> 
>>> +/**
>>> + * idpf_pci_err_resume - Resume operations after PCI error recovery
>>> + * @pdev: PCI device struct
>>> + */
>>> +static void idpf_pci_err_resume(struct pci_dev *pdev) {
>>> +	struct idpf_adapter *adapter = pci_get_drvdata(pdev);
>>> +
>>> +	/* Force a PFR when resuming from PCI error. */
>>> +	if (test_and_set_bit(IDPF_PCI_CB_RESET, adapter->flags))
>>> +		adapter->dev_ops.reg_ops.trigger_reset(adapter,
>>> IDPF_HR_FUNC_RESET);
>>
>> You say "Force a PFR", but PFR is only triggered on the AER path,
>> not on the FLR path.
> 
> And?  idpf_pci_err_resume() is only invoked in the error recovery path
> (aka AER path), not FLR path AFAICS.

The driver calls it in idpf_pci_err_reset_done():

<...>-86378   [009] ..... 342752.746321: idpf_pci_err_reset_prepare <-pci_dev_save_and_disable
bash-86378   [045] ..... 342756.748148: idpf_pci_err_reset_done <-pci_reset_function
bash-86378   [045] ..... 342756.748272: idpf_pci_err_resume <-pci_reset_function

Thanks,
Emil

> 
> Thanks,
> 
> Lukas



* Re: [PATCH iwl-next v2 2/2] idpf: implement pci error handlers
  2026-04-14 15:13   ` Lukas Wunner
@ 2026-04-14 21:43     ` Tantilov, Emil S
  0 siblings, 0 replies; 12+ messages in thread
From: Tantilov, Emil S @ 2026-04-14 21:43 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: intel-wired-lan, netdev, przemyslaw.kitszel, jay.bhat,
	ivan.d.barrera, aleksandr.loktionov, larysa.zaremba,
	anthony.l.nguyen, andrew+netdev, davem, edumazet, kuba, pabeni,
	aleksander.lobakin, linux-pci, madhu.chittim, decot, willemb,
	sheenamo



On 4/14/2026 8:13 AM, Lukas Wunner wrote:
> On Mon, Apr 13, 2026 at 08:16:31PM -0700, Emil Tantilov wrote:
>> +static pci_ers_result_t
>> +idpf_pci_err_slot_reset(struct pci_dev *pdev)
>> +{
>> +	struct idpf_adapter *adapter = pci_get_drvdata(pdev);
>> +
>> +	pci_restore_state(pdev);
>> +	pci_set_master(pdev);
>> +	pci_wake_from_d3(pdev, false);
>> +	if (readl(adapter->reset_reg.rstat) != 0xFFFFFFFF)
>> +		return PCI_ERS_RESULT_RECOVERED;
> 
> FWIW, there's a PCI_POSSIBLE_ERROR() helper that you may find useful
> to check for an "all ones" MMIO read.

Will check it out.

Thanks,
Emil

> 
> Thanks,
> 
> Lukas



* RE: [PATCH iwl-next v2 2/2] idpf: implement pci error handlers
  2026-04-14 15:01     ` Tantilov, Emil S
@ 2026-04-15  8:53       ` Loktionov, Aleksandr
  0 siblings, 0 replies; 12+ messages in thread
From: Loktionov, Aleksandr @ 2026-04-15  8:53 UTC (permalink / raw)
  To: Tantilov, Emil S, intel-wired-lan@lists.osuosl.org
  Cc: netdev@vger.kernel.org, Kitszel, Przemyslaw, Bhat, Jay,
	Barrera, Ivan D, Zaremba, Larysa, Nguyen, Anthony L,
	andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
	kuba@kernel.org, pabeni@redhat.com, Lobakin, Aleksander,
	linux-pci@vger.kernel.org, Chittim, Madhu, decot@google.com,
	willemb@google.com, sheenamo@google.com, lukas@wunner.de



> -----Original Message-----
> From: Tantilov, Emil S <emil.s.tantilov@intel.com>
> Sent: Tuesday, April 14, 2026 5:01 PM
> To: Loktionov, Aleksandr <aleksandr.loktionov@intel.com>; intel-wired-
> lan@lists.osuosl.org
> Cc: netdev@vger.kernel.org; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; Bhat, Jay <jay.bhat@intel.com>;
> Barrera, Ivan D <ivan.d.barrera@intel.com>; Zaremba, Larysa
> <larysa.zaremba@intel.com>; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; andrew+netdev@lunn.ch;
> davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
> pabeni@redhat.com; Lobakin, Aleksander <aleksander.lobakin@intel.com>;
> linux-pci@vger.kernel.org; Chittim, Madhu <madhu.chittim@intel.com>;
> decot@google.com; willemb@google.com; sheenamo@google.com;
> lukas@wunner.de
> Subject: Re: [PATCH iwl-next v2 2/2] idpf: implement pci error
> handlers
> 
> 
> 
> On 4/14/2026 4:09 AM, Loktionov, Aleksandr wrote:
> >
> >
> >> -----Original Message-----
> >> From: Tantilov, Emil S <emil.s.tantilov@intel.com>
> >> Sent: Tuesday, April 14, 2026 5:17 AM
> >> To: intel-wired-lan@lists.osuosl.org
> >> Cc: netdev@vger.kernel.org; Kitszel, Przemyslaw
> >> <przemyslaw.kitszel@intel.com>; Bhat, Jay <jay.bhat@intel.com>;
> >> Barrera, Ivan D <ivan.d.barrera@intel.com>; Loktionov, Aleksandr
> >> <aleksandr.loktionov@intel.com>; Zaremba, Larysa
> >> <larysa.zaremba@intel.com>; Nguyen, Anthony L
> >> <anthony.l.nguyen@intel.com>; andrew+netdev@lunn.ch;
> >> davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
> >> pabeni@redhat.com; Lobakin, Aleksander
> >> <aleksander.lobakin@intel.com>; linux-pci@vger.kernel.org; Chittim,
> >> Madhu <madhu.chittim@intel.com>; decot@google.com;
> >> willemb@google.com; sheenamo@google.com; lukas@wunner.de
> >> Subject: [PATCH iwl-next v2 2/2] idpf: implement pci error handlers
> >>
> >> Add callbacks to handle PCI errors and FLR reset. When preparing to
> >> handle reset on the bus, the driver must stop all operations that can
> >> lead to MMIO access in order to prevent HW errors. To accomplish this
> >> introduce helper idpf_reset_prepare() that gets called prior to FLR or
> >> when PCI error is detected. Upon resume the recovery is done through
> >> the existing reset path by starting the event task.
> >>
> >> The following callbacks are implemented:
> >> .reset_prepare runs the first portion of the generic reset path
> >> leading up to the part where we wait for the reset to complete.
> >> .reset_done/resume runs the recovery part of the reset handling.
> >> .error_detected is the callback dealing with PCI errors, similar to
> >> the prepare call, we stop all operations, prior to attempting a
> >> recovery.
> >> .slot_reset is the callback attempting to restore the device,
> >> provided a PCI reset was initiated by the AER driver.
> >>
> >> Whereas previously the init logic guaranteed netdevs during reset,
> >> the addition of idpf_detach_and_close() to the PCI callbacks flow
> >> makes it possible for the function to be called without netdevs. Add
> >> check to avoid NULL pointer dereference in that case.
> >>
> >> Co-developed-by: Alan Brady <alan.brady@intel.com>
> >> Signed-off-by: Alan Brady <alan.brady@intel.com>
> >> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
> >> Reviewed-by: Jay Bhat <jay.bhat@intel.com>
> >> Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
> >> ---
> >>   drivers/net/ethernet/intel/idpf/idpf.h      |   3 +
> >>   drivers/net/ethernet/intel/idpf/idpf_lib.c  |  13 ++-
> >>   drivers/net/ethernet/intel/idpf/idpf_main.c | 112 ++++++++++++++++++++
> >>   3 files changed, 126 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/net/ethernet/intel/idpf/idpf.h
> >> b/drivers/net/ethernet/intel/idpf/idpf.h
> >> index 1d0e32e47e87..164d2f3e233a 100644
> >> --- a/drivers/net/ethernet/intel/idpf/idpf.h
> >> +++ b/drivers/net/ethernet/intel/idpf/idpf.h
> >> @@ -88,6 +88,7 @@ enum idpf_state {
> >>    * @IDPF_REMOVE_IN_PROG: Driver remove in progress
> >>    * @IDPF_MB_INTR_MODE: Mailbox in interrupt mode
> >>    * @IDPF_VC_CORE_INIT: virtchnl core has been init
> >> + * @IDPF_PCI_CB_RESET: Reset via the PCI callbacks
> >>    * @IDPF_FLAGS_NBITS: Must be last
> >>    */
> >>   enum idpf_flags {
> >> @@ -97,6 +98,7 @@ enum idpf_flags {
> >>   	IDPF_REMOVE_IN_PROG,
> >>   	IDPF_MB_INTR_MODE,
> >>   	IDPF_VC_CORE_INIT,
> >
> > ...
> >
> >> +/**
> >> + * idpf_pci_err_resume - Resume operations after PCI error recovery
> >> + * @pdev: PCI device struct
> >> + */
> >> +static void idpf_pci_err_resume(struct pci_dev *pdev) {
> >> +	struct idpf_adapter *adapter = pci_get_drvdata(pdev);
> >> +
> >> +	/* Force a PFR when resuming from PCI error. */
> >> +	if (test_and_set_bit(IDPF_PCI_CB_RESET, adapter->flags))
> >> +		adapter->dev_ops.reg_ops.trigger_reset(adapter,
> >> IDPF_HR_FUNC_RESET);
> > > You say "Force a PFR", but PFR is only triggered on the AER path,
> > > not on the FLR path.
> 
> Hence the "force" - the call to `trigger_reset` results in a PFR and
> is only needed in the case of a PCI error. If this function was called
> because a user issued an FLR, the kernel will trigger it for us. This
> way we can reuse the reset handling path to restore the operation of
> the netdevs.
> 
> Though I may be misunderstanding - are you referring to the wording or
> the logic?
At first glance the comment looks misleading to me.
Please consider rewording.

> 
> Thanks,
> Emil
> 
> >
> > Everything else looks fine
> > Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> >
> >> +
> >> +	queue_delayed_work(adapter->vc_event_wq,
> >> +			   &adapter->vc_event_task,
> >> +			   msecs_to_jiffies(300));
> >> +}
> >
> > ...
> >
> >>   };
> >>   module_pci_driver(idpf_driver);
> >> --
> >> 2.37.3
> >



end of thread, other threads:[~2026-04-15  8:53 UTC | newest]

Thread overview: 12+ messages
2026-04-14  3:16 [PATCH iwl-next v2 0/2] Introduce IDPF PCI callbacks Emil Tantilov
2026-04-14  3:16 ` [PATCH iwl-next v2 1/2] idpf: remove conditonal MBX deinit from idpf_vc_core_deinit() Emil Tantilov
2026-04-14 11:07   ` Loktionov, Aleksandr
2026-04-14 14:56     ` Tantilov, Emil S
2026-04-14  3:16 ` [PATCH iwl-next v2 2/2] idpf: implement pci error handlers Emil Tantilov
2026-04-14 11:09   ` Loktionov, Aleksandr
2026-04-14 15:01     ` Tantilov, Emil S
2026-04-15  8:53       ` Loktionov, Aleksandr
2026-04-14 15:10     ` Lukas Wunner
2026-04-14 21:42       ` Tantilov, Emil S
2026-04-14 15:13   ` Lukas Wunner
2026-04-14 21:43     ` Tantilov, Emil S
