netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH iwl-net V4,0/2] Fix repeated EEH reports in MSI domain
@ 2024-05-15 21:07 Thinh Tran
  2024-05-15 21:07 ` [PATCH iwl-net V4,1/2] i40e: factoring out i40e_suspend/i40e_resume Thinh Tran
  2024-05-15 21:07 ` [PATCH iwl-net V4,2/2] i40e: Fully suspend and resume IO operations in EEH case Thinh Tran
  0 siblings, 2 replies; 10+ messages in thread
From: Thinh Tran @ 2024-05-15 21:07 UTC (permalink / raw)
  To: netdev, kuba, anthony.l.nguyen, aleksandr.loktionov,
	przemyslaw.kitszel, pmenzel
  Cc: jesse.brandeburg, davem, edumazet, pabeni, intel-wired-lan,
	rob.thomas, Thinh Tran

The patch fixes an issue where repeated EEH reports with a single error
on the bus of Intel X710 4-port 10G Base-T adapter in the MSI domain
cause the device to be permanently disabled.  It fully resets and
restarts the device when handling the PCI EEH error.

v4: corrected another typos.
v3: moved text commit messages from the cover letter to appropriate
    patches.
v2: fixed typos and split into two commits

Thinh Tran (2):
  i40e: factoring out i40e_suspend/i40e_resume
  i40e: Fully suspend and resume IO operations in EEH case

 drivers/net/ethernet/intel/i40e/i40e_main.c | 257 +++++++++++---------
 1 file changed, 140 insertions(+), 117 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH iwl-net V4,1/2] i40e: factoring out i40e_suspend/i40e_resume
  2024-05-15 21:07 [PATCH iwl-net V4,0/2] Fix repeated EEH reports in MSI domain Thinh Tran
@ 2024-05-15 21:07 ` Thinh Tran
  2024-05-16  8:50   ` Simon Horman
                     ` (2 more replies)
  2024-05-15 21:07 ` [PATCH iwl-net V4,2/2] i40e: Fully suspend and resume IO operations in EEH case Thinh Tran
  1 sibling, 3 replies; 10+ messages in thread
From: Thinh Tran @ 2024-05-15 21:07 UTC (permalink / raw)
  To: netdev, kuba, anthony.l.nguyen, aleksandr.loktionov,
	przemyslaw.kitszel, pmenzel
  Cc: jesse.brandeburg, davem, edumazet, pabeni, intel-wired-lan,
	rob.thomas, Thinh Tran

Two new functions, i40e_io_suspend() and i40e_io_resume(), have been
introduced.  These functions were factored out from the existing
i40e_suspend() and i40e_resume() respectively.  This factoring was
done due to concerns about the logic of the I40E_SUSPENSED state, which
caused the device to be unable to recover.  The functions are now used
in the EEH handling for device suspend/resume callbacks.

The function i40e_enable_mc_magic_wake() has been moved ahead of
i40e_io_suspend() to ensure it is declared before being used.

Tested-by: Robert Thomas <rob.thomas@ibm.com>
Signed-off-by: Thinh Tran <thinhtr@linux.ibm.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 248 +++++++++++---------
 1 file changed, 134 insertions(+), 114 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index ffb9f9f15c52..281c8ec27af2 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -16303,6 +16303,138 @@ static void i40e_remove(struct pci_dev *pdev)
 	pci_disable_device(pdev);
 }
 
+/**
+ * i40e_enable_mc_magic_wake - enable multicast magic packet wake up
+ * using the mac_address_write admin q function
+ * @pf: pointer to i40e_pf struct
+ **/
+static void i40e_enable_mc_magic_wake(struct i40e_pf *pf)
+{
+	struct i40e_hw *hw = &pf->hw;
+	u8 mac_addr[6];
+	u16 flags = 0;
+	int ret;
+
+	/* Get current MAC address in case it's an LAA */
+	if (pf->vsi[pf->lan_vsi] && pf->vsi[pf->lan_vsi]->netdev) {
+		ether_addr_copy(mac_addr,
+				pf->vsi[pf->lan_vsi]->netdev->dev_addr);
+	} else {
+		dev_err(&pf->pdev->dev,
+			"Failed to retrieve MAC address; using default\n");
+		ether_addr_copy(mac_addr, hw->mac.addr);
+	}
+
+	/* The FW expects the mac address write cmd to first be called with
+	 * one of these flags before calling it again with the multicast
+	 * enable flags.
+	 */
+	flags = I40E_AQC_WRITE_TYPE_LAA_WOL;
+
+	if (hw->func_caps.flex10_enable && hw->partition_id != 1)
+		flags = I40E_AQC_WRITE_TYPE_LAA_ONLY;
+
+	ret = i40e_aq_mac_address_write(hw, flags, mac_addr, NULL);
+	if (ret) {
+		dev_err(&pf->pdev->dev,
+			"Failed to update MAC address registers; cannot enable Multicast Magic packet wake up");
+		return;
+	}
+
+	flags = I40E_AQC_MC_MAG_EN
+			| I40E_AQC_WOL_PRESERVE_ON_PFR
+			| I40E_AQC_WRITE_TYPE_UPDATE_MC_MAG;
+	ret = i40e_aq_mac_address_write(hw, flags, mac_addr, NULL);
+	if (ret)
+		dev_err(&pf->pdev->dev,
+			"Failed to enable Multicast Magic Packet wake up\n");
+}
+
+/**
+ * i40e_io_suspend - suspend all IO operations
+ * @pf: pointer to i40e_pf struct
+ *
+ **/
+static int i40e_io_suspend(struct i40e_pf *pf)
+{
+	struct i40e_hw *hw = &pf->hw;
+
+	set_bit(__I40E_DOWN, pf->state);
+
+	/* Ensure service task will not be running */
+	del_timer_sync(&pf->service_timer);
+	cancel_work_sync(&pf->service_task);
+
+	/* Client close must be called explicitly here because the timer
+	 * has been stopped.
+	 */
+	i40e_notify_client_of_netdev_close(pf->vsi[pf->lan_vsi], false);
+
+	if (test_bit(I40E_HW_CAP_WOL_MC_MAGIC_PKT_WAKE, pf->hw.caps) &&
+	    pf->wol_en)
+		i40e_enable_mc_magic_wake(pf);
+
+	/* Since we're going to destroy queues during the
+	 * i40e_clear_interrupt_scheme() we should hold the RTNL lock for this
+	 * whole section
+	 */
+	rtnl_lock();
+
+	i40e_prep_for_reset(pf);
+
+	wr32(hw, I40E_PFPM_APM, (pf->wol_en ? I40E_PFPM_APM_APME_MASK : 0));
+	wr32(hw, I40E_PFPM_WUFC, (pf->wol_en ? I40E_PFPM_WUFC_MAG_MASK : 0));
+
+	/* Clear the interrupt scheme and release our IRQs so that the system
+	 * can safely hibernate even when there are a large number of CPUs.
+	 * Otherwise hibernation might fail when mapping all the vectors back
+	 * to CPU0.
+	 */
+	i40e_clear_interrupt_scheme(pf);
+
+	rtnl_unlock();
+
+	return 0;
+}
+
+/**
+ * i40e_io_resume - resume IO operations
+ * @pf: pointer to i40e_pf struct
+ *
+ **/
+static int i40e_io_resume(struct i40e_pf *pf)
+{
+	int err;
+
+	/* We need to hold the RTNL lock prior to restoring interrupt schemes,
+	 * since we're going to be restoring queues
+	 */
+	rtnl_lock();
+
+	/* We cleared the interrupt scheme when we suspended, so we need to
+	 * restore it now to resume device functionality.
+	 */
+	err = i40e_restore_interrupt_scheme(pf);
+	if (err) {
+		dev_err(&pf->pdev->dev, "Cannot restore interrupt scheme: %d\n",
+			err);
+	}
+
+	clear_bit(__I40E_DOWN, pf->state);
+	i40e_reset_and_rebuild(pf, false, true);
+
+	rtnl_unlock();
+
+	/* Clear suspended state last after everything is recovered */
+	clear_bit(__I40E_SUSPENDED, pf->state);
+
+	/* Restart the service task */
+	mod_timer(&pf->service_timer,
+		  round_jiffies(jiffies + pf->service_timer_period));
+
+	return 0;
+}
+
 /**
  * i40e_pci_error_detected - warning that something funky happened in PCI land
  * @pdev: PCI device information struct
@@ -16415,53 +16547,6 @@ static void i40e_pci_error_resume(struct pci_dev *pdev)
 	i40e_handle_reset_warning(pf, false);
 }
 
-/**
- * i40e_enable_mc_magic_wake - enable multicast magic packet wake up
- * using the mac_address_write admin q function
- * @pf: pointer to i40e_pf struct
- **/
-static void i40e_enable_mc_magic_wake(struct i40e_pf *pf)
-{
-	struct i40e_hw *hw = &pf->hw;
-	u8 mac_addr[6];
-	u16 flags = 0;
-	int ret;
-
-	/* Get current MAC address in case it's an LAA */
-	if (pf->vsi[pf->lan_vsi] && pf->vsi[pf->lan_vsi]->netdev) {
-		ether_addr_copy(mac_addr,
-				pf->vsi[pf->lan_vsi]->netdev->dev_addr);
-	} else {
-		dev_err(&pf->pdev->dev,
-			"Failed to retrieve MAC address; using default\n");
-		ether_addr_copy(mac_addr, hw->mac.addr);
-	}
-
-	/* The FW expects the mac address write cmd to first be called with
-	 * one of these flags before calling it again with the multicast
-	 * enable flags.
-	 */
-	flags = I40E_AQC_WRITE_TYPE_LAA_WOL;
-
-	if (hw->func_caps.flex10_enable && hw->partition_id != 1)
-		flags = I40E_AQC_WRITE_TYPE_LAA_ONLY;
-
-	ret = i40e_aq_mac_address_write(hw, flags, mac_addr, NULL);
-	if (ret) {
-		dev_err(&pf->pdev->dev,
-			"Failed to update MAC address registers; cannot enable Multicast Magic packet wake up");
-		return;
-	}
-
-	flags = I40E_AQC_MC_MAG_EN
-			| I40E_AQC_WOL_PRESERVE_ON_PFR
-			| I40E_AQC_WRITE_TYPE_UPDATE_MC_MAG;
-	ret = i40e_aq_mac_address_write(hw, flags, mac_addr, NULL);
-	if (ret)
-		dev_err(&pf->pdev->dev,
-			"Failed to enable Multicast Magic Packet wake up\n");
-}
-
 /**
  * i40e_shutdown - PCI callback for shutting down
  * @pdev: PCI device information struct
@@ -16521,48 +16606,11 @@ static void i40e_shutdown(struct pci_dev *pdev)
 static int __maybe_unused i40e_suspend(struct device *dev)
 {
 	struct i40e_pf *pf = dev_get_drvdata(dev);
-	struct i40e_hw *hw = &pf->hw;
 
 	/* If we're already suspended, then there is nothing to do */
 	if (test_and_set_bit(__I40E_SUSPENDED, pf->state))
 		return 0;
-
-	set_bit(__I40E_DOWN, pf->state);
-
-	/* Ensure service task will not be running */
-	del_timer_sync(&pf->service_timer);
-	cancel_work_sync(&pf->service_task);
-
-	/* Client close must be called explicitly here because the timer
-	 * has been stopped.
-	 */
-	i40e_notify_client_of_netdev_close(pf->vsi[pf->lan_vsi], false);
-
-	if (test_bit(I40E_HW_CAP_WOL_MC_MAGIC_PKT_WAKE, pf->hw.caps) &&
-	    pf->wol_en)
-		i40e_enable_mc_magic_wake(pf);
-
-	/* Since we're going to destroy queues during the
-	 * i40e_clear_interrupt_scheme() we should hold the RTNL lock for this
-	 * whole section
-	 */
-	rtnl_lock();
-
-	i40e_prep_for_reset(pf);
-
-	wr32(hw, I40E_PFPM_APM, (pf->wol_en ? I40E_PFPM_APM_APME_MASK : 0));
-	wr32(hw, I40E_PFPM_WUFC, (pf->wol_en ? I40E_PFPM_WUFC_MAG_MASK : 0));
-
-	/* Clear the interrupt scheme and release our IRQs so that the system
-	 * can safely hibernate even when there are a large number of CPUs.
-	 * Otherwise hibernation might fail when mapping all the vectors back
-	 * to CPU0.
-	 */
-	i40e_clear_interrupt_scheme(pf);
-
-	rtnl_unlock();
-
-	return 0;
+	return i40e_io_suspend(pf);
 }
 
 /**
@@ -16572,39 +16620,11 @@ static int __maybe_unused i40e_suspend(struct device *dev)
 static int __maybe_unused i40e_resume(struct device *dev)
 {
 	struct i40e_pf *pf = dev_get_drvdata(dev);
-	int err;
 
 	/* If we're not suspended, then there is nothing to do */
 	if (!test_bit(__I40E_SUSPENDED, pf->state))
 		return 0;
-
-	/* We need to hold the RTNL lock prior to restoring interrupt schemes,
-	 * since we're going to be restoring queues
-	 */
-	rtnl_lock();
-
-	/* We cleared the interrupt scheme when we suspended, so we need to
-	 * restore it now to resume device functionality.
-	 */
-	err = i40e_restore_interrupt_scheme(pf);
-	if (err) {
-		dev_err(dev, "Cannot restore interrupt scheme: %d\n",
-			err);
-	}
-
-	clear_bit(__I40E_DOWN, pf->state);
-	i40e_reset_and_rebuild(pf, false, true);
-
-	rtnl_unlock();
-
-	/* Clear suspended state last after everything is recovered */
-	clear_bit(__I40E_SUSPENDED, pf->state);
-
-	/* Restart the service task */
-	mod_timer(&pf->service_timer,
-		  round_jiffies(jiffies + pf->service_timer_period));
-
-	return 0;
+	return i40e_io_resume(pf);
 }
 
 static const struct pci_error_handlers i40e_err_handler = {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH iwl-net V4,2/2] i40e: Fully suspend and resume IO operations in EEH case
  2024-05-15 21:07 [PATCH iwl-net V4,0/2] Fix repeated EEH reports in MSI domain Thinh Tran
  2024-05-15 21:07 ` [PATCH iwl-net V4,1/2] i40e: factoring out i40e_suspend/i40e_resume Thinh Tran
@ 2024-05-15 21:07 ` Thinh Tran
  2024-05-16  8:51   ` Simon Horman
                     ` (2 more replies)
  1 sibling, 3 replies; 10+ messages in thread
From: Thinh Tran @ 2024-05-15 21:07 UTC (permalink / raw)
  To: netdev, kuba, anthony.l.nguyen, aleksandr.loktionov,
	przemyslaw.kitszel, pmenzel
  Cc: jesse.brandeburg, davem, edumazet, pabeni, intel-wired-lan,
	rob.thomas, Thinh Tran, Jacob Keller

When EEH events occurs, the callback functions in the i40e, which are
managed by the EEH driver, will completely suspend and resume all IO
operations.

- In the PCI error detected callback, replaced i40e_prep_for_reset()
  with i40e_io_suspend(). The change is to fully suspend all I/O
  operations
- In the PCI error slot reset callback, replaced pci_enable_device_mem()
  with pci_enable_device(). This change enables both I/O and memory of
  the device.
- In the PCI error resume callback, replaced i40e_handle_reset_warning()
  with i40e_io_resume(). This change allows the system to resume I/O
  operations

Fixes: a5f3d2c17b07 ("powerpc/pseries/pci: Add MSI domains")
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Robert Thomas <rob.thomas@ibm.com>
Signed-off-by: Thinh Tran <thinhtr@linux.ibm.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 281c8ec27af2..9f71a61e0c52 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -11138,6 +11138,8 @@ static void i40e_reset_and_rebuild(struct i40e_pf *pf, bool reinit,
 	ret = i40e_reset(pf);
 	if (!ret)
 		i40e_rebuild(pf, reinit, lock_acquired);
+	else
+		dev_err(&pf->pdev->dev, "%s: i40e_reset() FAILED", __func__);
 }
 
 /**
@@ -16459,7 +16461,7 @@ static pci_ers_result_t i40e_pci_error_detected(struct pci_dev *pdev,
 
 	/* shutdown all operations */
 	if (!test_bit(__I40E_SUSPENDED, pf->state))
-		i40e_prep_for_reset(pf);
+		i40e_io_suspend(pf);
 
 	/* Request a slot reset */
 	return PCI_ERS_RESULT_NEED_RESET;
@@ -16481,7 +16483,8 @@ static pci_ers_result_t i40e_pci_error_slot_reset(struct pci_dev *pdev)
 	u32 reg;
 
 	dev_dbg(&pdev->dev, "%s\n", __func__);
-	if (pci_enable_device_mem(pdev)) {
+	/* enable I/O and memory of the device  */
+	if (pci_enable_device(pdev)) {
 		dev_info(&pdev->dev,
 			 "Cannot re-enable PCI device after reset.\n");
 		result = PCI_ERS_RESULT_DISCONNECT;
@@ -16544,7 +16547,7 @@ static void i40e_pci_error_resume(struct pci_dev *pdev)
 	if (test_bit(__I40E_SUSPENDED, pf->state))
 		return;
 
-	i40e_handle_reset_warning(pf, false);
+	i40e_io_resume(pf);
 }
 
 /**
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH iwl-net V4,1/2] i40e: factoring out i40e_suspend/i40e_resume
  2024-05-15 21:07 ` [PATCH iwl-net V4,1/2] i40e: factoring out i40e_suspend/i40e_resume Thinh Tran
@ 2024-05-16  8:50   ` Simon Horman
  2024-05-16 19:11   ` [Intel-wired-lan] [PATCH iwl-net V4, 1/2] " Jacob Keller
  2024-05-23 17:57   ` Pucha, HimasekharX Reddy
  2 siblings, 0 replies; 10+ messages in thread
From: Simon Horman @ 2024-05-16  8:50 UTC (permalink / raw)
  To: Thinh Tran
  Cc: netdev, kuba, anthony.l.nguyen, aleksandr.loktionov,
	przemyslaw.kitszel, pmenzel, jesse.brandeburg, davem, edumazet,
	pabeni, intel-wired-lan, rob.thomas

On Wed, May 15, 2024 at 04:07:04PM -0500, Thinh Tran wrote:
> Two new functions, i40e_io_suspend() and i40e_io_resume(), have been
> introduced.  These functions were factored out from the existing
> i40e_suspend() and i40e_resume() respectively.  This factoring was
> done due to concerns about the logic of the I40E_SUSPENSED state, which
> caused the device to be unable to recover.  The functions are now used
> in the EEH handling for device suspend/resume callbacks.
> 
> The function i40e_enable_mc_magic_wake() has been moved ahead of
> i40e_io_suspend() to ensure it is declared before being used.
> 
> Tested-by: Robert Thomas <rob.thomas@ibm.com>
> Signed-off-by: Thinh Tran <thinhtr@linux.ibm.com>

Hi Thrinh,

Sorry to nit-pick, but the request from Paul in his review of v3
was to use imperative mood in the title.

	Factor out i40e_suspend/i40e_resume

In any case, this patch looks good to me.

Reviewed-by: Simon Horman <horms@kernel.org>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH iwl-net V4,2/2] i40e: Fully suspend and resume IO operations in EEH case
  2024-05-15 21:07 ` [PATCH iwl-net V4,2/2] i40e: Fully suspend and resume IO operations in EEH case Thinh Tran
@ 2024-05-16  8:51   ` Simon Horman
  2024-05-23 17:59   ` [Intel-wired-lan] [PATCH iwl-net V4, 2/2] " Pucha, HimasekharX Reddy
  2025-08-07  4:50   ` [PATCH iwl-net V4,2/2] " Lukas Wunner
  2 siblings, 0 replies; 10+ messages in thread
From: Simon Horman @ 2024-05-16  8:51 UTC (permalink / raw)
  To: Thinh Tran
  Cc: netdev, kuba, anthony.l.nguyen, aleksandr.loktionov,
	przemyslaw.kitszel, pmenzel, jesse.brandeburg, davem, edumazet,
	pabeni, intel-wired-lan, rob.thomas, Jacob Keller

On Wed, May 15, 2024 at 04:07:05PM -0500, Thinh Tran wrote:
> When EEH events occurs, the callback functions in the i40e, which are
> managed by the EEH driver, will completely suspend and resume all IO
> operations.
> 
> - In the PCI error detected callback, replaced i40e_prep_for_reset()
>   with i40e_io_suspend(). The change is to fully suspend all I/O
>   operations
> - In the PCI error slot reset callback, replaced pci_enable_device_mem()
>   with pci_enable_device(). This change enables both I/O and memory of
>   the device.
> - In the PCI error resume callback, replaced i40e_handle_reset_warning()
>   with i40e_io_resume(). This change allows the system to resume I/O
>   operations
> 
> Fixes: a5f3d2c17b07 ("powerpc/pseries/pci: Add MSI domains")
> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
> Tested-by: Robert Thomas <rob.thomas@ibm.com>
> Signed-off-by: Thinh Tran <thinhtr@linux.ibm.com>

Reviewed-by: Simon Horman <horms@kernel.org>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Intel-wired-lan] [PATCH iwl-net V4, 1/2] i40e: factoring out i40e_suspend/i40e_resume
  2024-05-15 21:07 ` [PATCH iwl-net V4,1/2] i40e: factoring out i40e_suspend/i40e_resume Thinh Tran
  2024-05-16  8:50   ` Simon Horman
@ 2024-05-16 19:11   ` Jacob Keller
  2024-05-16 21:34     ` Thinh Tran
  2024-05-23 17:57   ` Pucha, HimasekharX Reddy
  2 siblings, 1 reply; 10+ messages in thread
From: Jacob Keller @ 2024-05-16 19:11 UTC (permalink / raw)
  To: Thinh Tran, netdev, kuba, anthony.l.nguyen, aleksandr.loktionov,
	przemyslaw.kitszel, pmenzel
  Cc: edumazet, rob.thomas, intel-wired-lan, pabeni, davem



On 5/15/2024 2:07 PM, Thinh Tran wrote:
> Two new functions, i40e_io_suspend() and i40e_io_resume(), have been
> introduced.  These functions were factored out from the existing
> i40e_suspend() and i40e_resume() respectively.  This factoring was
> done due to concerns about the logic of the I40E_SUSPENSED state, which
> caused the device to be unable to recover.  The functions are now used
> in the EEH handling for device suspend/resume callbacks.
> 
> The function i40e_enable_mc_magic_wake() has been moved ahead of
> i40e_io_suspend() to ensure it is declared before being used.
> 
> Tested-by: Robert Thomas <rob.thomas@ibm.com>
> Signed-off-by: Thinh Tran <thinhtr@linux.ibm.com>
> ---
>  drivers/net/ethernet/intel/i40e/i40e_main.c | 248 +++++++++++---------
>  1 file changed, 134 insertions(+), 114 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
> index ffb9f9f15c52..281c8ec27af2 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_main.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
> @@ -16303,6 +16303,138 @@ static void i40e_remove(struct pci_dev *pdev)
>  	pci_disable_device(pdev);
>  }
>  

I applied this to IWL net dev-queue, but I had some conflicts when
applying which I resolved manually. I would appreciate review of the
contents as committed:

https://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue.git/commit/?h=dev-queue&id=b0bdaaffc27a79460a8053c2808fc54e4cbdd576

Thanks,
Jake

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Intel-wired-lan] [PATCH iwl-net V4, 1/2] i40e: factoring out i40e_suspend/i40e_resume
  2024-05-16 19:11   ` [Intel-wired-lan] [PATCH iwl-net V4, 1/2] " Jacob Keller
@ 2024-05-16 21:34     ` Thinh Tran
  0 siblings, 0 replies; 10+ messages in thread
From: Thinh Tran @ 2024-05-16 21:34 UTC (permalink / raw)
  To: Jacob Keller, netdev, kuba, anthony.l.nguyen, aleksandr.loktionov,
	przemyslaw.kitszel, pmenzel
  Cc: edumazet, rob.thomas, intel-wired-lan, pabeni, davem



On 5/16/2024 2:11 PM, Jacob Keller wrote:
> 
> I applied this to IWL net dev-queue, but I had some conflicts when
> applying which I resolved manually. I would appreciate review of the
> contents as committed:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue.git/commit/?h=dev-queue&id=b0bdaaffc27a79460a8053c2808fc54e4cbdd576
> 
> Thanks,
> Jake
Hi Jake,
Your updated commit looks good.
Thank you for applying the patch to the IWL net dev-queue. I apologize 
for any conflicts that arose during the process. I appreciate your 
effort in manually resolving them. I noticed that my local repository, 
which was last pulled from the upstream kernel last week, did not 
include the commit ‘i40e: Add helper to access main VSI’.

Thank You,
Thinh Tran

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [Intel-wired-lan] [PATCH iwl-net V4, 1/2] i40e: factoring out i40e_suspend/i40e_resume
  2024-05-15 21:07 ` [PATCH iwl-net V4,1/2] i40e: factoring out i40e_suspend/i40e_resume Thinh Tran
  2024-05-16  8:50   ` Simon Horman
  2024-05-16 19:11   ` [Intel-wired-lan] [PATCH iwl-net V4, 1/2] " Jacob Keller
@ 2024-05-23 17:57   ` Pucha, HimasekharX Reddy
  2 siblings, 0 replies; 10+ messages in thread
From: Pucha, HimasekharX Reddy @ 2024-05-23 17:57 UTC (permalink / raw)
  To: Thinh Tran, netdev@vger.kernel.org, kuba@kernel.org,
	Nguyen, Anthony L, Loktionov, Aleksandr, Kitszel, Przemyslaw,
	pmenzel@molgen.mpg.de
  Cc: edumazet@google.com, Thomas, Rob,
	intel-wired-lan@lists.osuosl.org, pabeni@redhat.com,
	davem@davemloft.net

> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of Thinh Tran
> Sent: Thursday, May 16, 2024 2:37 AM
> To: netdev@vger.kernel.org; kuba@kernel.org; Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Loktionov, Aleksandr <aleksandr.loktionov@intel.com>; Kitszel, Przemyslaw <przemyslaw.kitszel@intel.com>; pmenzel@molgen.mpg.de
> Cc: edumazet@google.com; Thomas, Rob <rob.thomas@ibm.com>; Thinh Tran <thinhtr@linux.ibm.com>; intel-wired-lan@lists.osuosl.org; pabeni@redhat.com; davem@davemloft.net
> Subject: [Intel-wired-lan] [PATCH iwl-net V4, 1/2] i40e: factoring out i40e_suspend/i40e_resume
>
> Two new functions, i40e_io_suspend() and i40e_io_resume(), have been introduced.  These functions were factored out from the existing
> i40e_suspend() and i40e_resume() respectively.  This factoring was done due to concerns about the logic of the I40E_SUSPENSED state, which caused the device to be unable to recover.  The functions are now used in the EEH handling for device suspend/resume callbacks.
>
> The function i40e_enable_mc_magic_wake() has been moved ahead of
> i40e_io_suspend() to ensure it is declared before being used.
>
> Tested-by: Robert Thomas <rob.thomas@ibm.com>
> Signed-off-by: Thinh Tran <thinhtr@linux.ibm.com>
> ---
>  drivers/net/ethernet/intel/i40e/i40e_main.c | 248 +++++++++++---------
>  1 file changed, 134 insertions(+), 114 deletions(-)
>

Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)


^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [Intel-wired-lan] [PATCH iwl-net V4, 2/2] i40e: Fully suspend and resume IO operations in EEH case
  2024-05-15 21:07 ` [PATCH iwl-net V4,2/2] i40e: Fully suspend and resume IO operations in EEH case Thinh Tran
  2024-05-16  8:51   ` Simon Horman
@ 2024-05-23 17:59   ` Pucha, HimasekharX Reddy
  2025-08-07  4:50   ` [PATCH iwl-net V4,2/2] " Lukas Wunner
  2 siblings, 0 replies; 10+ messages in thread
From: Pucha, HimasekharX Reddy @ 2024-05-23 17:59 UTC (permalink / raw)
  To: Thinh Tran, netdev@vger.kernel.org, kuba@kernel.org,
	Nguyen, Anthony L, Loktionov, Aleksandr, Kitszel, Przemyslaw,
	pmenzel@molgen.mpg.de
  Cc: edumazet@google.com, Thomas, Rob, Keller, Jacob E,
	intel-wired-lan@lists.osuosl.org, pabeni@redhat.com,
	davem@davemloft.net

> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of Thinh Tran
> Sent: Thursday, May 16, 2024 2:37 AM
> To: netdev@vger.kernel.org; kuba@kernel.org; Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Loktionov, Aleksandr <aleksandr.loktionov@intel.com>; Kitszel, Przemyslaw <przemyslaw.kitszel@intel.com>; pmenzel@molgen.mpg.de
> Cc: edumazet@google.com; Thomas, Rob <rob.thomas@ibm.com>; Thinh Tran <thinhtr@linux.ibm.com>; Keller, Jacob E <jacob.e.keller@intel.com>; intel-wired-lan@lists.osuosl.org; pabeni@redhat.com; davem@davemloft.net
> Subject: [Intel-wired-lan] [PATCH iwl-net V4, 2/2] i40e: Fully suspend and resume IO operations in EEH case
>
> When EEH events occurs, the callback functions in the i40e, which are managed by the EEH driver, will completely suspend and resume all IO operations.
>
> - In the PCI error detected callback, replaced i40e_prep_for_reset()
>   with i40e_io_suspend(). The change is to fully suspend all I/O
>   operations
> - In the PCI error slot reset callback, replaced pci_enable_device_mem()
>   with pci_enable_device(). This change enables both I/O and memory of
>   the device.
> - In the PCI error resume callback, replaced i40e_handle_reset_warning()
>   with i40e_io_resume(). This change allows the system to resume I/O
>   operations
>
> Fixes: a5f3d2c17b07 ("powerpc/pseries/pci: Add MSI domains")
> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
> Tested-by: Robert Thomas <rob.thomas@ibm.com>
> Signed-off-by: Thinh Tran <thinhtr@linux.ibm.com>
> ---
>  drivers/net/ethernet/intel/i40e/i40e_main.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>

Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH iwl-net V4,2/2] i40e: Fully suspend and resume IO operations in EEH case
  2024-05-15 21:07 ` [PATCH iwl-net V4,2/2] i40e: Fully suspend and resume IO operations in EEH case Thinh Tran
  2024-05-16  8:51   ` Simon Horman
  2024-05-23 17:59   ` [Intel-wired-lan] [PATCH iwl-net V4, 2/2] " Pucha, HimasekharX Reddy
@ 2025-08-07  4:50   ` Lukas Wunner
  2 siblings, 0 replies; 10+ messages in thread
From: Lukas Wunner @ 2025-08-07  4:50 UTC (permalink / raw)
  To: Thinh Tran
  Cc: netdev, kuba, anthony.l.nguyen, aleksandr.loktionov,
	przemyslaw.kitszel, pmenzel, jesse.brandeburg, davem, edumazet,
	pabeni, intel-wired-lan, rob.thomas, Jacob Keller

On Wed, May 15, 2024 at 04:07:05PM -0500, Thinh Tran wrote:
> When EEH events occurs, the callback functions in the i40e, which are
> managed by the EEH driver, will completely suspend and resume all IO
> operations.
> 
> - In the PCI error detected callback, replaced i40e_prep_for_reset()
>   with i40e_io_suspend(). The change is to fully suspend all I/O
>   operations
> - In the PCI error slot reset callback, replaced pci_enable_device_mem()
>   with pci_enable_device(). This change enables both I/O and memory of
>   the device.
> - In the PCI error resume callback, replaced i40e_handle_reset_warning()
>   with i40e_io_resume(). This change allows the system to resume I/O
>   operations

The above was applied as commit c80b6538d35a.

> @@ -16481,7 +16483,8 @@ static pci_ers_result_t i40e_pci_error_slot_reset(struct pci_dev *pdev)
>  	u32 reg;
>  
>  	dev_dbg(&pdev->dev, "%s\n", __func__);
> -	if (pci_enable_device_mem(pdev)) {
> +	/* enable I/O and memory of the device  */
> +	if (pci_enable_device(pdev)) {
>  		dev_info(&pdev->dev,
>  			 "Cannot re-enable PCI device after reset.\n");
>  		result = PCI_ERS_RESULT_DISCONNECT;

Why was this change made?

The driver calls pci_enable_device_mem() in i40e_probe(),
so calling pci_enable_device() here doesn't seem to make any sense.

The difference between pci_enable_device() and pci_enable_device_mem()
is that the former also enables access to the I/O Space of the device.
However I/O Space access is usually not used outside of x86.
And your patch targets powerpc because you seek to support EEH,
a powerpc-specific mechanism.

Unfortunately the commit message is not helpful at all because it
merely lists the code changes in prose form but doesn't explain
the *reason* for the change.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2025-08-07  4:50 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-15 21:07 [PATCH iwl-net V4,0/2] Fix repeated EEH reports in MSI domain Thinh Tran
2024-05-15 21:07 ` [PATCH iwl-net V4,1/2] i40e: factoring out i40e_suspend/i40e_resume Thinh Tran
2024-05-16  8:50   ` Simon Horman
2024-05-16 19:11   ` [Intel-wired-lan] [PATCH iwl-net V4, 1/2] " Jacob Keller
2024-05-16 21:34     ` Thinh Tran
2024-05-23 17:57   ` Pucha, HimasekharX Reddy
2024-05-15 21:07 ` [PATCH iwl-net V4,2/2] i40e: Fully suspend and resume IO operations in EEH case Thinh Tran
2024-05-16  8:51   ` Simon Horman
2024-05-23 17:59   ` [Intel-wired-lan] [PATCH iwl-net V4, 2/2] " Pucha, HimasekharX Reddy
2025-08-07  4:50   ` [PATCH iwl-net V4,2/2] " Lukas Wunner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).