* [PATCH iwl-net V4,0/2] Fix repeated EEH reports in MSI domain
@ 2024-05-15 21:07 Thinh Tran
2024-05-15 21:07 ` [PATCH iwl-net V4,1/2] i40e: factoring out i40e_suspend/i40e_resume Thinh Tran
2024-05-15 21:07 ` [PATCH iwl-net V4,2/2] i40e: Fully suspend and resume IO operations in EEH case Thinh Tran
0 siblings, 2 replies; 10+ messages in thread
From: Thinh Tran @ 2024-05-15 21:07 UTC (permalink / raw)
To: netdev, kuba, anthony.l.nguyen, aleksandr.loktionov,
przemyslaw.kitszel, pmenzel
Cc: jesse.brandeburg, davem, edumazet, pabeni, intel-wired-lan,
rob.thomas, Thinh Tran
The patch fixes an issue where repeated EEH reports with a single error
on the bus of Intel X710 4-port 10G Base-T adapter in the MSI domain
cause the device to be permanently disabled. It fully resets and
restarts the device when handling the PCI EEH error.
v4: corrected another typos.
v3: moved text commit messages from the cover letter to appropriate
patches.
v2: fixed typos and split into two commits
Thinh Tran (2):
i40e: factoring out i40e_suspend/i40e_resume
i40e: Fully suspend and resume IO operations in EEH case
drivers/net/ethernet/intel/i40e/i40e_main.c | 257 +++++++++++---------
1 file changed, 140 insertions(+), 117 deletions(-)
--
2.25.1
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH iwl-net V4,1/2] i40e: factoring out i40e_suspend/i40e_resume
2024-05-15 21:07 [PATCH iwl-net V4,0/2] Fix repeated EEH reports in MSI domain Thinh Tran
@ 2024-05-15 21:07 ` Thinh Tran
2024-05-16 8:50 ` Simon Horman
` (2 more replies)
2024-05-15 21:07 ` [PATCH iwl-net V4,2/2] i40e: Fully suspend and resume IO operations in EEH case Thinh Tran
1 sibling, 3 replies; 10+ messages in thread
From: Thinh Tran @ 2024-05-15 21:07 UTC (permalink / raw)
To: netdev, kuba, anthony.l.nguyen, aleksandr.loktionov,
przemyslaw.kitszel, pmenzel
Cc: jesse.brandeburg, davem, edumazet, pabeni, intel-wired-lan,
rob.thomas, Thinh Tran
Two new functions, i40e_io_suspend() and i40e_io_resume(), have been
introduced. These functions were factored out from the existing
i40e_suspend() and i40e_resume() respectively. This factoring was
done due to concerns about the logic of the I40E_SUSPENSED state, which
caused the device to be unable to recover. The functions are now used
in the EEH handling for device suspend/resume callbacks.
The function i40e_enable_mc_magic_wake() has been moved ahead of
i40e_io_suspend() to ensure it is declared before being used.
Tested-by: Robert Thomas <rob.thomas@ibm.com>
Signed-off-by: Thinh Tran <thinhtr@linux.ibm.com>
---
drivers/net/ethernet/intel/i40e/i40e_main.c | 248 +++++++++++---------
1 file changed, 134 insertions(+), 114 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index ffb9f9f15c52..281c8ec27af2 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -16303,6 +16303,138 @@ static void i40e_remove(struct pci_dev *pdev)
pci_disable_device(pdev);
}
+/**
+ * i40e_enable_mc_magic_wake - enable multicast magic packet wake up
+ * using the mac_address_write admin q function
+ * @pf: pointer to i40e_pf struct
+ **/
+static void i40e_enable_mc_magic_wake(struct i40e_pf *pf)
+{
+ struct i40e_hw *hw = &pf->hw;
+ u8 mac_addr[6];
+ u16 flags = 0;
+ int ret;
+
+ /* Get current MAC address in case it's an LAA */
+ if (pf->vsi[pf->lan_vsi] && pf->vsi[pf->lan_vsi]->netdev) {
+ ether_addr_copy(mac_addr,
+ pf->vsi[pf->lan_vsi]->netdev->dev_addr);
+ } else {
+ dev_err(&pf->pdev->dev,
+ "Failed to retrieve MAC address; using default\n");
+ ether_addr_copy(mac_addr, hw->mac.addr);
+ }
+
+ /* The FW expects the mac address write cmd to first be called with
+ * one of these flags before calling it again with the multicast
+ * enable flags.
+ */
+ flags = I40E_AQC_WRITE_TYPE_LAA_WOL;
+
+ if (hw->func_caps.flex10_enable && hw->partition_id != 1)
+ flags = I40E_AQC_WRITE_TYPE_LAA_ONLY;
+
+ ret = i40e_aq_mac_address_write(hw, flags, mac_addr, NULL);
+ if (ret) {
+ dev_err(&pf->pdev->dev,
+ "Failed to update MAC address registers; cannot enable Multicast Magic packet wake up");
+ return;
+ }
+
+ flags = I40E_AQC_MC_MAG_EN
+ | I40E_AQC_WOL_PRESERVE_ON_PFR
+ | I40E_AQC_WRITE_TYPE_UPDATE_MC_MAG;
+ ret = i40e_aq_mac_address_write(hw, flags, mac_addr, NULL);
+ if (ret)
+ dev_err(&pf->pdev->dev,
+ "Failed to enable Multicast Magic Packet wake up\n");
+}
+
+/**
+ * i40e_io_suspend - suspend all IO operations
+ * @pf: pointer to i40e_pf struct
+ *
+ **/
+static int i40e_io_suspend(struct i40e_pf *pf)
+{
+ struct i40e_hw *hw = &pf->hw;
+
+ set_bit(__I40E_DOWN, pf->state);
+
+ /* Ensure service task will not be running */
+ del_timer_sync(&pf->service_timer);
+ cancel_work_sync(&pf->service_task);
+
+ /* Client close must be called explicitly here because the timer
+ * has been stopped.
+ */
+ i40e_notify_client_of_netdev_close(pf->vsi[pf->lan_vsi], false);
+
+ if (test_bit(I40E_HW_CAP_WOL_MC_MAGIC_PKT_WAKE, pf->hw.caps) &&
+ pf->wol_en)
+ i40e_enable_mc_magic_wake(pf);
+
+ /* Since we're going to destroy queues during the
+ * i40e_clear_interrupt_scheme() we should hold the RTNL lock for this
+ * whole section
+ */
+ rtnl_lock();
+
+ i40e_prep_for_reset(pf);
+
+ wr32(hw, I40E_PFPM_APM, (pf->wol_en ? I40E_PFPM_APM_APME_MASK : 0));
+ wr32(hw, I40E_PFPM_WUFC, (pf->wol_en ? I40E_PFPM_WUFC_MAG_MASK : 0));
+
+ /* Clear the interrupt scheme and release our IRQs so that the system
+ * can safely hibernate even when there are a large number of CPUs.
+ * Otherwise hibernation might fail when mapping all the vectors back
+ * to CPU0.
+ */
+ i40e_clear_interrupt_scheme(pf);
+
+ rtnl_unlock();
+
+ return 0;
+}
+
+/**
+ * i40e_io_resume - resume IO operations
+ * @pf: pointer to i40e_pf struct
+ *
+ **/
+static int i40e_io_resume(struct i40e_pf *pf)
+{
+ int err;
+
+ /* We need to hold the RTNL lock prior to restoring interrupt schemes,
+ * since we're going to be restoring queues
+ */
+ rtnl_lock();
+
+ /* We cleared the interrupt scheme when we suspended, so we need to
+ * restore it now to resume device functionality.
+ */
+ err = i40e_restore_interrupt_scheme(pf);
+ if (err) {
+ dev_err(&pf->pdev->dev, "Cannot restore interrupt scheme: %d\n",
+ err);
+ }
+
+ clear_bit(__I40E_DOWN, pf->state);
+ i40e_reset_and_rebuild(pf, false, true);
+
+ rtnl_unlock();
+
+ /* Clear suspended state last after everything is recovered */
+ clear_bit(__I40E_SUSPENDED, pf->state);
+
+ /* Restart the service task */
+ mod_timer(&pf->service_timer,
+ round_jiffies(jiffies + pf->service_timer_period));
+
+ return 0;
+}
+
/**
* i40e_pci_error_detected - warning that something funky happened in PCI land
* @pdev: PCI device information struct
@@ -16415,53 +16547,6 @@ static void i40e_pci_error_resume(struct pci_dev *pdev)
i40e_handle_reset_warning(pf, false);
}
-/**
- * i40e_enable_mc_magic_wake - enable multicast magic packet wake up
- * using the mac_address_write admin q function
- * @pf: pointer to i40e_pf struct
- **/
-static void i40e_enable_mc_magic_wake(struct i40e_pf *pf)
-{
- struct i40e_hw *hw = &pf->hw;
- u8 mac_addr[6];
- u16 flags = 0;
- int ret;
-
- /* Get current MAC address in case it's an LAA */
- if (pf->vsi[pf->lan_vsi] && pf->vsi[pf->lan_vsi]->netdev) {
- ether_addr_copy(mac_addr,
- pf->vsi[pf->lan_vsi]->netdev->dev_addr);
- } else {
- dev_err(&pf->pdev->dev,
- "Failed to retrieve MAC address; using default\n");
- ether_addr_copy(mac_addr, hw->mac.addr);
- }
-
- /* The FW expects the mac address write cmd to first be called with
- * one of these flags before calling it again with the multicast
- * enable flags.
- */
- flags = I40E_AQC_WRITE_TYPE_LAA_WOL;
-
- if (hw->func_caps.flex10_enable && hw->partition_id != 1)
- flags = I40E_AQC_WRITE_TYPE_LAA_ONLY;
-
- ret = i40e_aq_mac_address_write(hw, flags, mac_addr, NULL);
- if (ret) {
- dev_err(&pf->pdev->dev,
- "Failed to update MAC address registers; cannot enable Multicast Magic packet wake up");
- return;
- }
-
- flags = I40E_AQC_MC_MAG_EN
- | I40E_AQC_WOL_PRESERVE_ON_PFR
- | I40E_AQC_WRITE_TYPE_UPDATE_MC_MAG;
- ret = i40e_aq_mac_address_write(hw, flags, mac_addr, NULL);
- if (ret)
- dev_err(&pf->pdev->dev,
- "Failed to enable Multicast Magic Packet wake up\n");
-}
-
/**
* i40e_shutdown - PCI callback for shutting down
* @pdev: PCI device information struct
@@ -16521,48 +16606,11 @@ static void i40e_shutdown(struct pci_dev *pdev)
static int __maybe_unused i40e_suspend(struct device *dev)
{
struct i40e_pf *pf = dev_get_drvdata(dev);
- struct i40e_hw *hw = &pf->hw;
/* If we're already suspended, then there is nothing to do */
if (test_and_set_bit(__I40E_SUSPENDED, pf->state))
return 0;
-
- set_bit(__I40E_DOWN, pf->state);
-
- /* Ensure service task will not be running */
- del_timer_sync(&pf->service_timer);
- cancel_work_sync(&pf->service_task);
-
- /* Client close must be called explicitly here because the timer
- * has been stopped.
- */
- i40e_notify_client_of_netdev_close(pf->vsi[pf->lan_vsi], false);
-
- if (test_bit(I40E_HW_CAP_WOL_MC_MAGIC_PKT_WAKE, pf->hw.caps) &&
- pf->wol_en)
- i40e_enable_mc_magic_wake(pf);
-
- /* Since we're going to destroy queues during the
- * i40e_clear_interrupt_scheme() we should hold the RTNL lock for this
- * whole section
- */
- rtnl_lock();
-
- i40e_prep_for_reset(pf);
-
- wr32(hw, I40E_PFPM_APM, (pf->wol_en ? I40E_PFPM_APM_APME_MASK : 0));
- wr32(hw, I40E_PFPM_WUFC, (pf->wol_en ? I40E_PFPM_WUFC_MAG_MASK : 0));
-
- /* Clear the interrupt scheme and release our IRQs so that the system
- * can safely hibernate even when there are a large number of CPUs.
- * Otherwise hibernation might fail when mapping all the vectors back
- * to CPU0.
- */
- i40e_clear_interrupt_scheme(pf);
-
- rtnl_unlock();
-
- return 0;
+ return i40e_io_suspend(pf);
}
/**
@@ -16572,39 +16620,11 @@ static int __maybe_unused i40e_suspend(struct device *dev)
static int __maybe_unused i40e_resume(struct device *dev)
{
struct i40e_pf *pf = dev_get_drvdata(dev);
- int err;
/* If we're not suspended, then there is nothing to do */
if (!test_bit(__I40E_SUSPENDED, pf->state))
return 0;
-
- /* We need to hold the RTNL lock prior to restoring interrupt schemes,
- * since we're going to be restoring queues
- */
- rtnl_lock();
-
- /* We cleared the interrupt scheme when we suspended, so we need to
- * restore it now to resume device functionality.
- */
- err = i40e_restore_interrupt_scheme(pf);
- if (err) {
- dev_err(dev, "Cannot restore interrupt scheme: %d\n",
- err);
- }
-
- clear_bit(__I40E_DOWN, pf->state);
- i40e_reset_and_rebuild(pf, false, true);
-
- rtnl_unlock();
-
- /* Clear suspended state last after everything is recovered */
- clear_bit(__I40E_SUSPENDED, pf->state);
-
- /* Restart the service task */
- mod_timer(&pf->service_timer,
- round_jiffies(jiffies + pf->service_timer_period));
-
- return 0;
+ return i40e_io_resume(pf);
}
static const struct pci_error_handlers i40e_err_handler = {
--
2.25.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH iwl-net V4,2/2] i40e: Fully suspend and resume IO operations in EEH case
2024-05-15 21:07 [PATCH iwl-net V4,0/2] Fix repeated EEH reports in MSI domain Thinh Tran
2024-05-15 21:07 ` [PATCH iwl-net V4,1/2] i40e: factoring out i40e_suspend/i40e_resume Thinh Tran
@ 2024-05-15 21:07 ` Thinh Tran
2024-05-16 8:51 ` Simon Horman
` (2 more replies)
1 sibling, 3 replies; 10+ messages in thread
From: Thinh Tran @ 2024-05-15 21:07 UTC (permalink / raw)
To: netdev, kuba, anthony.l.nguyen, aleksandr.loktionov,
przemyslaw.kitszel, pmenzel
Cc: jesse.brandeburg, davem, edumazet, pabeni, intel-wired-lan,
rob.thomas, Thinh Tran, Jacob Keller
When EEH events occurs, the callback functions in the i40e, which are
managed by the EEH driver, will completely suspend and resume all IO
operations.
- In the PCI error detected callback, replaced i40e_prep_for_reset()
with i40e_io_suspend(). The change is to fully suspend all I/O
operations
- In the PCI error slot reset callback, replaced pci_enable_device_mem()
with pci_enable_device(). This change enables both I/O and memory of
the device.
- In the PCI error resume callback, replaced i40e_handle_reset_warning()
with i40e_io_resume(). This change allows the system to resume I/O
operations
Fixes: a5f3d2c17b07 ("powerpc/pseries/pci: Add MSI domains")
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Robert Thomas <rob.thomas@ibm.com>
Signed-off-by: Thinh Tran <thinhtr@linux.ibm.com>
---
drivers/net/ethernet/intel/i40e/i40e_main.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 281c8ec27af2..9f71a61e0c52 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -11138,6 +11138,8 @@ static void i40e_reset_and_rebuild(struct i40e_pf *pf, bool reinit,
ret = i40e_reset(pf);
if (!ret)
i40e_rebuild(pf, reinit, lock_acquired);
+ else
+ dev_err(&pf->pdev->dev, "%s: i40e_reset() FAILED", __func__);
}
/**
@@ -16459,7 +16461,7 @@ static pci_ers_result_t i40e_pci_error_detected(struct pci_dev *pdev,
/* shutdown all operations */
if (!test_bit(__I40E_SUSPENDED, pf->state))
- i40e_prep_for_reset(pf);
+ i40e_io_suspend(pf);
/* Request a slot reset */
return PCI_ERS_RESULT_NEED_RESET;
@@ -16481,7 +16483,8 @@ static pci_ers_result_t i40e_pci_error_slot_reset(struct pci_dev *pdev)
u32 reg;
dev_dbg(&pdev->dev, "%s\n", __func__);
- if (pci_enable_device_mem(pdev)) {
+ /* enable I/O and memory of the device */
+ if (pci_enable_device(pdev)) {
dev_info(&pdev->dev,
"Cannot re-enable PCI device after reset.\n");
result = PCI_ERS_RESULT_DISCONNECT;
@@ -16544,7 +16547,7 @@ static void i40e_pci_error_resume(struct pci_dev *pdev)
if (test_bit(__I40E_SUSPENDED, pf->state))
return;
- i40e_handle_reset_warning(pf, false);
+ i40e_io_resume(pf);
}
/**
--
2.25.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH iwl-net V4,1/2] i40e: factoring out i40e_suspend/i40e_resume
2024-05-15 21:07 ` [PATCH iwl-net V4,1/2] i40e: factoring out i40e_suspend/i40e_resume Thinh Tran
@ 2024-05-16 8:50 ` Simon Horman
2024-05-16 19:11 ` [Intel-wired-lan] [PATCH iwl-net V4, 1/2] " Jacob Keller
2024-05-23 17:57 ` Pucha, HimasekharX Reddy
2 siblings, 0 replies; 10+ messages in thread
From: Simon Horman @ 2024-05-16 8:50 UTC (permalink / raw)
To: Thinh Tran
Cc: netdev, kuba, anthony.l.nguyen, aleksandr.loktionov,
przemyslaw.kitszel, pmenzel, jesse.brandeburg, davem, edumazet,
pabeni, intel-wired-lan, rob.thomas
On Wed, May 15, 2024 at 04:07:04PM -0500, Thinh Tran wrote:
> Two new functions, i40e_io_suspend() and i40e_io_resume(), have been
> introduced. These functions were factored out from the existing
> i40e_suspend() and i40e_resume() respectively. This factoring was
> done due to concerns about the logic of the I40E_SUSPENSED state, which
> caused the device to be unable to recover. The functions are now used
> in the EEH handling for device suspend/resume callbacks.
>
> The function i40e_enable_mc_magic_wake() has been moved ahead of
> i40e_io_suspend() to ensure it is declared before being used.
>
> Tested-by: Robert Thomas <rob.thomas@ibm.com>
> Signed-off-by: Thinh Tran <thinhtr@linux.ibm.com>
Hi Thrinh,
Sorry to nit-pick, but the request from Paul in his review of v3
was to use imperative mood in the title.
Factor out i40e_suspend/i40e_resume
In any case, this patch looks good to me.
Reviewed-by: Simon Horman <horms@kernel.org>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH iwl-net V4,2/2] i40e: Fully suspend and resume IO operations in EEH case
2024-05-15 21:07 ` [PATCH iwl-net V4,2/2] i40e: Fully suspend and resume IO operations in EEH case Thinh Tran
@ 2024-05-16 8:51 ` Simon Horman
2024-05-23 17:59 ` [Intel-wired-lan] [PATCH iwl-net V4, 2/2] " Pucha, HimasekharX Reddy
2025-08-07 4:50 ` [PATCH iwl-net V4,2/2] " Lukas Wunner
2 siblings, 0 replies; 10+ messages in thread
From: Simon Horman @ 2024-05-16 8:51 UTC (permalink / raw)
To: Thinh Tran
Cc: netdev, kuba, anthony.l.nguyen, aleksandr.loktionov,
przemyslaw.kitszel, pmenzel, jesse.brandeburg, davem, edumazet,
pabeni, intel-wired-lan, rob.thomas, Jacob Keller
On Wed, May 15, 2024 at 04:07:05PM -0500, Thinh Tran wrote:
> When EEH events occurs, the callback functions in the i40e, which are
> managed by the EEH driver, will completely suspend and resume all IO
> operations.
>
> - In the PCI error detected callback, replaced i40e_prep_for_reset()
> with i40e_io_suspend(). The change is to fully suspend all I/O
> operations
> - In the PCI error slot reset callback, replaced pci_enable_device_mem()
> with pci_enable_device(). This change enables both I/O and memory of
> the device.
> - In the PCI error resume callback, replaced i40e_handle_reset_warning()
> with i40e_io_resume(). This change allows the system to resume I/O
> operations
>
> Fixes: a5f3d2c17b07 ("powerpc/pseries/pci: Add MSI domains")
> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
> Tested-by: Robert Thomas <rob.thomas@ibm.com>
> Signed-off-by: Thinh Tran <thinhtr@linux.ibm.com>
Reviewed-by: Simon Horman <horms@kernel.org>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Intel-wired-lan] [PATCH iwl-net V4, 1/2] i40e: factoring out i40e_suspend/i40e_resume
2024-05-15 21:07 ` [PATCH iwl-net V4,1/2] i40e: factoring out i40e_suspend/i40e_resume Thinh Tran
2024-05-16 8:50 ` Simon Horman
@ 2024-05-16 19:11 ` Jacob Keller
2024-05-16 21:34 ` Thinh Tran
2024-05-23 17:57 ` Pucha, HimasekharX Reddy
2 siblings, 1 reply; 10+ messages in thread
From: Jacob Keller @ 2024-05-16 19:11 UTC (permalink / raw)
To: Thinh Tran, netdev, kuba, anthony.l.nguyen, aleksandr.loktionov,
przemyslaw.kitszel, pmenzel
Cc: edumazet, rob.thomas, intel-wired-lan, pabeni, davem
On 5/15/2024 2:07 PM, Thinh Tran wrote:
> Two new functions, i40e_io_suspend() and i40e_io_resume(), have been
> introduced. These functions were factored out from the existing
> i40e_suspend() and i40e_resume() respectively. This factoring was
> done due to concerns about the logic of the I40E_SUSPENSED state, which
> caused the device to be unable to recover. The functions are now used
> in the EEH handling for device suspend/resume callbacks.
>
> The function i40e_enable_mc_magic_wake() has been moved ahead of
> i40e_io_suspend() to ensure it is declared before being used.
>
> Tested-by: Robert Thomas <rob.thomas@ibm.com>
> Signed-off-by: Thinh Tran <thinhtr@linux.ibm.com>
> ---
> drivers/net/ethernet/intel/i40e/i40e_main.c | 248 +++++++++++---------
> 1 file changed, 134 insertions(+), 114 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
> index ffb9f9f15c52..281c8ec27af2 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_main.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
> @@ -16303,6 +16303,138 @@ static void i40e_remove(struct pci_dev *pdev)
> pci_disable_device(pdev);
> }
>
I applied this to IWL net dev-queue, but I had some conflicts when
applying which I resolved manually. I would appreciate review of the
contents as committed:
https://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue.git/commit/?h=dev-queue&id=b0bdaaffc27a79460a8053c2808fc54e4cbdd576
Thanks,
Jake
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Intel-wired-lan] [PATCH iwl-net V4, 1/2] i40e: factoring out i40e_suspend/i40e_resume
2024-05-16 19:11 ` [Intel-wired-lan] [PATCH iwl-net V4, 1/2] " Jacob Keller
@ 2024-05-16 21:34 ` Thinh Tran
0 siblings, 0 replies; 10+ messages in thread
From: Thinh Tran @ 2024-05-16 21:34 UTC (permalink / raw)
To: Jacob Keller, netdev, kuba, anthony.l.nguyen, aleksandr.loktionov,
przemyslaw.kitszel, pmenzel
Cc: edumazet, rob.thomas, intel-wired-lan, pabeni, davem
On 5/16/2024 2:11 PM, Jacob Keller wrote:
>
> I applied this to IWL net dev-queue, but I had some conflicts when
> applying which I resolved manually. I would appreciate review of the
> contents as committed:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue.git/commit/?h=dev-queue&id=b0bdaaffc27a79460a8053c2808fc54e4cbdd576
>
> Thanks,
> Jake
Hi Jake,
Your updated commit looks good.
Thank you for applying the patch to the IWL net dev-queue. I apologize
for any conflicts that arose during the process. I appreciate your
effort in manually resolving them. I noticed that my local repository,
which was last pulled from the upstream kernel last week, did not
include the commit ‘i40e: Add helper to access main VSI’.
Thank You,
Thinh Tran
^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: [Intel-wired-lan] [PATCH iwl-net V4, 1/2] i40e: factoring out i40e_suspend/i40e_resume
2024-05-15 21:07 ` [PATCH iwl-net V4,1/2] i40e: factoring out i40e_suspend/i40e_resume Thinh Tran
2024-05-16 8:50 ` Simon Horman
2024-05-16 19:11 ` [Intel-wired-lan] [PATCH iwl-net V4, 1/2] " Jacob Keller
@ 2024-05-23 17:57 ` Pucha, HimasekharX Reddy
2 siblings, 0 replies; 10+ messages in thread
From: Pucha, HimasekharX Reddy @ 2024-05-23 17:57 UTC (permalink / raw)
To: Thinh Tran, netdev@vger.kernel.org, kuba@kernel.org,
Nguyen, Anthony L, Loktionov, Aleksandr, Kitszel, Przemyslaw,
pmenzel@molgen.mpg.de
Cc: edumazet@google.com, Thomas, Rob,
intel-wired-lan@lists.osuosl.org, pabeni@redhat.com,
davem@davemloft.net
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of Thinh Tran
> Sent: Thursday, May 16, 2024 2:37 AM
> To: netdev@vger.kernel.org; kuba@kernel.org; Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Loktionov, Aleksandr <aleksandr.loktionov@intel.com>; Kitszel, Przemyslaw <przemyslaw.kitszel@intel.com>; pmenzel@molgen.mpg.de
> Cc: edumazet@google.com; Thomas, Rob <rob.thomas@ibm.com>; Thinh Tran <thinhtr@linux.ibm.com>; intel-wired-lan@lists.osuosl.org; pabeni@redhat.com; davem@davemloft.net
> Subject: [Intel-wired-lan] [PATCH iwl-net V4, 1/2] i40e: factoring out i40e_suspend/i40e_resume
>
> Two new functions, i40e_io_suspend() and i40e_io_resume(), have been introduced. These functions were factored out from the existing
> i40e_suspend() and i40e_resume() respectively. This factoring was done due to concerns about the logic of the I40E_SUSPENSED state, which caused the device to be unable to recover. The functions are now used in the EEH handling for device suspend/resume callbacks.
>
> The function i40e_enable_mc_magic_wake() has been moved ahead of
> i40e_io_suspend() to ensure it is declared before being used.
>
> Tested-by: Robert Thomas <rob.thomas@ibm.com>
> Signed-off-by: Thinh Tran <thinhtr@linux.ibm.com>
> ---
> drivers/net/ethernet/intel/i40e/i40e_main.c | 248 +++++++++++---------
> 1 file changed, 134 insertions(+), 114 deletions(-)
>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: [Intel-wired-lan] [PATCH iwl-net V4, 2/2] i40e: Fully suspend and resume IO operations in EEH case
2024-05-15 21:07 ` [PATCH iwl-net V4,2/2] i40e: Fully suspend and resume IO operations in EEH case Thinh Tran
2024-05-16 8:51 ` Simon Horman
@ 2024-05-23 17:59 ` Pucha, HimasekharX Reddy
2025-08-07 4:50 ` [PATCH iwl-net V4,2/2] " Lukas Wunner
2 siblings, 0 replies; 10+ messages in thread
From: Pucha, HimasekharX Reddy @ 2024-05-23 17:59 UTC (permalink / raw)
To: Thinh Tran, netdev@vger.kernel.org, kuba@kernel.org,
Nguyen, Anthony L, Loktionov, Aleksandr, Kitszel, Przemyslaw,
pmenzel@molgen.mpg.de
Cc: edumazet@google.com, Thomas, Rob, Keller, Jacob E,
intel-wired-lan@lists.osuosl.org, pabeni@redhat.com,
davem@davemloft.net
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of Thinh Tran
> Sent: Thursday, May 16, 2024 2:37 AM
> To: netdev@vger.kernel.org; kuba@kernel.org; Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Loktionov, Aleksandr <aleksandr.loktionov@intel.com>; Kitszel, Przemyslaw <przemyslaw.kitszel@intel.com>; pmenzel@molgen.mpg.de
> Cc: edumazet@google.com; Thomas, Rob <rob.thomas@ibm.com>; Thinh Tran <thinhtr@linux.ibm.com>; Keller, Jacob E <jacob.e.keller@intel.com>; intel-wired-lan@lists.osuosl.org; pabeni@redhat.com; davem@davemloft.net
> Subject: [Intel-wired-lan] [PATCH iwl-net V4, 2/2] i40e: Fully suspend and resume IO operations in EEH case
>
> When EEH events occurs, the callback functions in the i40e, which are managed by the EEH driver, will completely suspend and resume all IO operations.
>
> - In the PCI error detected callback, replaced i40e_prep_for_reset()
> with i40e_io_suspend(). The change is to fully suspend all I/O
> operations
> - In the PCI error slot reset callback, replaced pci_enable_device_mem()
> with pci_enable_device(). This change enables both I/O and memory of
> the device.
> - In the PCI error resume callback, replaced i40e_handle_reset_warning()
> with i40e_io_resume(). This change allows the system to resume I/O
> operations
>
> Fixes: a5f3d2c17b07 ("powerpc/pseries/pci: Add MSI domains")
> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
> Tested-by: Robert Thomas <rob.thomas@ibm.com>
> Signed-off-by: Thinh Tran <thinhtr@linux.ibm.com>
> ---
> drivers/net/ethernet/intel/i40e/i40e_main.c | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH iwl-net V4,2/2] i40e: Fully suspend and resume IO operations in EEH case
2024-05-15 21:07 ` [PATCH iwl-net V4,2/2] i40e: Fully suspend and resume IO operations in EEH case Thinh Tran
2024-05-16 8:51 ` Simon Horman
2024-05-23 17:59 ` [Intel-wired-lan] [PATCH iwl-net V4, 2/2] " Pucha, HimasekharX Reddy
@ 2025-08-07 4:50 ` Lukas Wunner
2 siblings, 0 replies; 10+ messages in thread
From: Lukas Wunner @ 2025-08-07 4:50 UTC (permalink / raw)
To: Thinh Tran
Cc: netdev, kuba, anthony.l.nguyen, aleksandr.loktionov,
przemyslaw.kitszel, pmenzel, jesse.brandeburg, davem, edumazet,
pabeni, intel-wired-lan, rob.thomas, Jacob Keller
On Wed, May 15, 2024 at 04:07:05PM -0500, Thinh Tran wrote:
> When EEH events occurs, the callback functions in the i40e, which are
> managed by the EEH driver, will completely suspend and resume all IO
> operations.
>
> - In the PCI error detected callback, replaced i40e_prep_for_reset()
> with i40e_io_suspend(). The change is to fully suspend all I/O
> operations
> - In the PCI error slot reset callback, replaced pci_enable_device_mem()
> with pci_enable_device(). This change enables both I/O and memory of
> the device.
> - In the PCI error resume callback, replaced i40e_handle_reset_warning()
> with i40e_io_resume(). This change allows the system to resume I/O
> operations
The above was applied as commit c80b6538d35a.
> @@ -16481,7 +16483,8 @@ static pci_ers_result_t i40e_pci_error_slot_reset(struct pci_dev *pdev)
> u32 reg;
>
> dev_dbg(&pdev->dev, "%s\n", __func__);
> - if (pci_enable_device_mem(pdev)) {
> + /* enable I/O and memory of the device */
> + if (pci_enable_device(pdev)) {
> dev_info(&pdev->dev,
> "Cannot re-enable PCI device after reset.\n");
> result = PCI_ERS_RESULT_DISCONNECT;
Why was this change made?
The driver calls pci_enable_device_mem() in i40e_probe(),
so calling pci_enable_device() here doesn't seem to make any sense.
The difference between pci_enable_device() and pci_enable_device_mem()
is that the former also enables access to the I/O Space of the device.
However I/O Space access is usually not used outside of x86.
And your patch targets powerpc because you seek to support EEH,
a powerpc-specific mechanism.
Unfortunately the commit message is not helpful at all because it
merely lists the code changes in prose form but doesn't explain
the *reason* for the change.
Thanks,
Lukas
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2025-08-07 4:50 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-15 21:07 [PATCH iwl-net V4,0/2] Fix repeated EEH reports in MSI domain Thinh Tran
2024-05-15 21:07 ` [PATCH iwl-net V4,1/2] i40e: factoring out i40e_suspend/i40e_resume Thinh Tran
2024-05-16 8:50 ` Simon Horman
2024-05-16 19:11 ` [Intel-wired-lan] [PATCH iwl-net V4, 1/2] " Jacob Keller
2024-05-16 21:34 ` Thinh Tran
2024-05-23 17:57 ` Pucha, HimasekharX Reddy
2024-05-15 21:07 ` [PATCH iwl-net V4,2/2] i40e: Fully suspend and resume IO operations in EEH case Thinh Tran
2024-05-16 8:51 ` Simon Horman
2024-05-23 17:59 ` [Intel-wired-lan] [PATCH iwl-net V4, 2/2] " Pucha, HimasekharX Reddy
2025-08-07 4:50 ` [PATCH iwl-net V4,2/2] " Lukas Wunner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).