* [PATCH 0/3] ice/i40e: pci_enable_device() fixes
@ 2025-08-16 7:10 Lukas Wunner
2025-08-16 7:10 ` [PATCH 1/3] ice: Fix enable_cnt imbalance on resume Lukas Wunner
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Lukas Wunner @ 2025-08-16 7:10 UTC (permalink / raw)
To: Tony Nguyen, Przemek Kitszel; +Cc: intel-wired-lan, netdev
The ice and i40e drivers perform surplus calls to pci_enable_device()
on resume and on error recovery. This results in the Memory Space Enable
bit in the PCI Command register not being cleared on driver unbind.
Not a catastrophic issue, so although these commits contain Fixes tags,
I recommend applying through next-queue.git to let them bake in linux-next
for a couple of weeks.
I have neither an ice nor i40e card available, so this is compile-tested
only. I'm hoping Intel validation can test it.
Suggested test procedure:
- Unbind the driver through sysfs without having suspended the card:
echo D:B:D.F | sudo tee /sys/bus/pci/drivers/ice/unbind
(replace D:B:D.F with the device address, e.g. 0000:07:00.0)
- Verify with lspci that it says "Mem-" in the "Control:" register:
lspci -vv -s D:B:D.F
- Rebind the driver:
echo D:B:D.F | sudo tee /sys/bus/pci/drivers/ice/bind
- Suspend the card, resume the card, unbind the driver, re-run lspci.
Expected result without this series "Mem+", with this series "Mem-".
For error recovery, the procedure is the same but instead of suspending
and resuming the card, an error needs to be injected. See the section
on "Software error injection" in Documentation/PCI/pcieaer-howto.rst.
Thanks!
Lukas Wunner (3):
ice: Fix enable_cnt imbalance on resume
ice: Fix enable_cnt imbalance on PCIe error recovery
i40e: Fix enable_cnt imbalance on PCIe error recovery
drivers/net/ethernet/intel/i40e/i40e_main.c | 29 ++++++----------
drivers/net/ethernet/intel/ice/ice_main.c | 38 ++++++---------------
2 files changed, 21 insertions(+), 46 deletions(-)
--
2.47.2
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH 1/3] ice: Fix enable_cnt imbalance on resume
2025-08-16 7:10 [PATCH 0/3] ice/i40e: pci_enable_device() fixes Lukas Wunner
@ 2025-08-16 7:10 ` Lukas Wunner
2025-08-16 7:10 ` [PATCH 2/3] ice: Fix enable_cnt imbalance on PCIe error recovery Lukas Wunner
2025-08-16 7:10 ` [PATCH 3/3] i40e: " Lukas Wunner
2 siblings, 0 replies; 4+ messages in thread
From: Lukas Wunner @ 2025-08-16 7:10 UTC (permalink / raw)
To: Tony Nguyen, Przemek Kitszel; +Cc: intel-wired-lan, netdev
The ice driver calls pci_enable_device_mem() on resume without having
called pci_disable_device() on suspend.
This leads to an imbalance of the enable_cnt tracked by the PCI core:
Every time the adapter is resumed, the enable_cnt keeps growing. If the
adapter is resumed at least once and the driver is then unbound, the
device isn't disabled since the enable_cnt hasn't reached zero (and
never again will).
Moreover the call to pci_enable_device_mem() has no effect because the
enable_cnt was already incremented in ice_probe() through the call to
pcim_enable_device(). The subsequent pci_enable_device_mem() thus bails
out after invoking pci_update_current_state(). But current_state was
already updated by the PCI core:
pci_pm_resume_noirq()
pci_pm_default_resume_early()
pci_pm_power_up_and_verify_state()
pci_update_current_state()
In summary, the call to pci_enable_device_mem() is both harmful and
superfluous, so remove it.
The intended purpose of the call may have been to set the Memory Space
Enable bit in the Command register again on resume, but that is already
achieved by the preceding call to pci_restore_state().
Fixes: 769c500dcc1e ("ice: Add advanced power mgmt for WoL")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v5.9+
---
drivers/net/ethernet/intel/ice/ice_main.c | 6 ------
1 file changed, 6 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 10a473a50710..3be4347223ef 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -5643,12 +5643,6 @@ static int ice_resume(struct device *dev)
if (!pci_device_is_present(pdev))
return -ENODEV;
- ret = pci_enable_device_mem(pdev);
- if (ret) {
- dev_err(dev, "Cannot enable device after suspend\n");
- return ret;
- }
-
pf = pci_get_drvdata(pdev);
hw = &pf->hw;
--
2.47.2
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH 2/3] ice: Fix enable_cnt imbalance on PCIe error recovery
2025-08-16 7:10 [PATCH 0/3] ice/i40e: pci_enable_device() fixes Lukas Wunner
2025-08-16 7:10 ` [PATCH 1/3] ice: Fix enable_cnt imbalance on resume Lukas Wunner
@ 2025-08-16 7:10 ` Lukas Wunner
2025-08-16 7:10 ` [PATCH 3/3] i40e: " Lukas Wunner
2 siblings, 0 replies; 4+ messages in thread
From: Lukas Wunner @ 2025-08-16 7:10 UTC (permalink / raw)
To: Tony Nguyen, Przemek Kitszel; +Cc: intel-wired-lan, netdev
After a PCIe Uncorrectable Error has been reported by an ice adapter
and has been recovered through a Secondary Bus Reset, its driver calls
pci_enable_device_mem() without having called pci_disable_device().
This leads to an imbalance of the enable_cnt tracked by the PCI core:
Every time error recovery occurs, the enable_cnt keeps growing. If it
occurs at least once and the driver is then unbound, the device isn't
disabled since the enable_cnt hasn't reached zero (and never again will).
The call to pci_enable_device_mem() has almost no effect because the
enable_cnt was already incremented in ice_probe() through the call to
pcim_enable_device(). The subsequent pci_enable_device_mem() thus bails
out after invoking pci_update_current_state().
Remove pci_enable_device_mem(). In lieu of pci_update_current_state(),
set the power state to D0 because that's the power state after a
Secondary Bus Reset (PCIe r7.0 sec 5.3.1.1).
The intended purpose of pci_enable_device_mem() may have been to set
the Memory Space Enable bit in the Command register again after reset,
but that is already achieved by the subsequent call to
pci_restore_state().
Fixes: 5995b6d0c6fc ("ice: Implement pci_error_handler ops")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v5.2+
---
drivers/net/ethernet/intel/ice/ice_main.c | 32 ++++++++---------------
1 file changed, 11 insertions(+), 21 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 3be4347223ef..848d5b512319 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -5720,30 +5720,20 @@ ice_pci_err_detected(struct pci_dev *pdev, pci_channel_state_t err)
static pci_ers_result_t ice_pci_err_slot_reset(struct pci_dev *pdev)
{
struct ice_pf *pf = pci_get_drvdata(pdev);
- pci_ers_result_t result;
- int err;
u32 reg;
- err = pci_enable_device_mem(pdev);
- if (err) {
- dev_err(&pdev->dev, "Cannot re-enable PCI device after reset, error %d\n",
- err);
- result = PCI_ERS_RESULT_DISCONNECT;
- } else {
- pci_set_master(pdev);
- pci_restore_state(pdev);
- pci_save_state(pdev);
- pci_wake_from_d3(pdev, false);
-
- /* Check for life */
- reg = rd32(&pf->hw, GLGEN_RTRIG);
- if (!reg)
- result = PCI_ERS_RESULT_RECOVERED;
- else
- result = PCI_ERS_RESULT_DISCONNECT;
- }
+ pdev->current_state = PCI_D0;
+ pci_set_master(pdev);
+ pci_restore_state(pdev);
+ pci_save_state(pdev);
+ pci_wake_from_d3(pdev, false);
- return result;
+ /* Check for life */
+ reg = rd32(&pf->hw, GLGEN_RTRIG);
+ if (!reg)
+ return PCI_ERS_RESULT_RECOVERED;
+ else
+ return PCI_ERS_RESULT_DISCONNECT;
}
/**
--
2.47.2
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH 3/3] i40e: Fix enable_cnt imbalance on PCIe error recovery
2025-08-16 7:10 [PATCH 0/3] ice/i40e: pci_enable_device() fixes Lukas Wunner
2025-08-16 7:10 ` [PATCH 1/3] ice: Fix enable_cnt imbalance on resume Lukas Wunner
2025-08-16 7:10 ` [PATCH 2/3] ice: Fix enable_cnt imbalance on PCIe error recovery Lukas Wunner
@ 2025-08-16 7:10 ` Lukas Wunner
2 siblings, 0 replies; 4+ messages in thread
From: Lukas Wunner @ 2025-08-16 7:10 UTC (permalink / raw)
To: Tony Nguyen, Przemek Kitszel; +Cc: intel-wired-lan, netdev
After a PCIe Uncorrectable Error has been reported by an i40e adapter
and has been recovered through a Secondary Bus Reset, its driver calls
pci_enable_device() without having called pci_disable_device().
This leads to an imbalance of the enable_cnt tracked by the PCI core:
Every time error recovery occurs, the enable_cnt keeps growing. If it
occurs at least once and the driver is then unbound, the device isn't
disabled since the enable_cnt hasn't reached zero (and never again will).
The call to pci_enable_device() has almost no effect because the
enable_cnt was already incremented in i40e_probe() through the call to
pci_enable_device_mem(). The subsequent pci_enable_device() thus bails
out after invoking pci_update_current_state().
Remove pci_enable_device(). In lieu of pci_update_current_state(), set
the power state to D0 because that's the power state after a Secondary
Bus Reset (PCIe r7.0 sec 5.3.1.1).
The intended purpose of pci_enable_device() may have been to set the
Memory Space Enable bit in the Command register again after reset, but
that is already achieved by the subsequent call to pci_restore_state().
Fixes: 41c445ff0f48 ("i40e: main driver core")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v3.12+
---
drivers/net/ethernet/intel/i40e/i40e_main.c | 29 +++++++--------------
1 file changed, 10 insertions(+), 19 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 9d6d892602fa..7e87234fde67 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -16439,29 +16439,20 @@ static pci_ers_result_t i40e_pci_error_detected(struct pci_dev *pdev,
static pci_ers_result_t i40e_pci_error_slot_reset(struct pci_dev *pdev)
{
struct i40e_pf *pf = pci_get_drvdata(pdev);
- pci_ers_result_t result;
u32 reg;
dev_dbg(&pdev->dev, "%s\n", __func__);
- /* enable I/O and memory of the device */
- if (pci_enable_device(pdev)) {
- dev_info(&pdev->dev,
- "Cannot re-enable PCI device after reset.\n");
- result = PCI_ERS_RESULT_DISCONNECT;
- } else {
- pci_set_master(pdev);
- pci_restore_state(pdev);
- pci_save_state(pdev);
- pci_wake_from_d3(pdev, false);
-
- reg = rd32(&pf->hw, I40E_GLGEN_RTRIG);
- if (reg == 0)
- result = PCI_ERS_RESULT_RECOVERED;
- else
- result = PCI_ERS_RESULT_DISCONNECT;
- }
+ pdev->current_state = PCI_D0;
+ pci_set_master(pdev);
+ pci_restore_state(pdev);
+ pci_save_state(pdev);
+ pci_wake_from_d3(pdev, false);
- return result;
+ reg = rd32(&pf->hw, I40E_GLGEN_RTRIG);
+ if (reg == 0)
+ return PCI_ERS_RESULT_RECOVERED;
+ else
+ return PCI_ERS_RESULT_DISCONNECT;
}
/**
--
2.47.2
^ permalink raw reply related [flat|nested] 4+ messages in thread
end of thread, other threads:[~2025-08-16 7:14 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-16 7:10 [PATCH 0/3] ice/i40e: pci_enable_device() fixes Lukas Wunner
2025-08-16 7:10 ` [PATCH 1/3] ice: Fix enable_cnt imbalance on resume Lukas Wunner
2025-08-16 7:10 ` [PATCH 2/3] ice: Fix enable_cnt imbalance on PCIe error recovery Lukas Wunner
2025-08-16 7:10 ` [PATCH 3/3] i40e: " Lukas Wunner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).