* [PATCH v2] PCI: Explicitly put devices into D0 when initializing
@ 2025-04-24 4:31 Mario Limonciello
2025-04-24 15:07 ` Wysocki, Rafael J
` (4 more replies)
0 siblings, 5 replies; 19+ messages in thread
From: Mario Limonciello @ 2025-04-24 4:31 UTC (permalink / raw)
To: mario.limonciello, bhelgaas, rafael.j.wysocki, huang.ying.caritas,
stern
Cc: linux-pci
From: Mario Limonciello <mario.limonciello@amd.com>
AMD BIOS team has root caused an issue that NVME storage failed to come
back from suspend to a lack of a call to _REG when NVME device was probed.
commit 112a7f9c8edbf ("PCI/ACPI: Call _REG when transitioning D-states")
added support for calling _REG when transitioning D-states, but this only
works if the device actually "transitions" D-states.
commit 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI
devices") added support for runtime PM on PCI devices, but never actually
'explicitly' sets the device to D0.
To make sure that devices are in D0 and that platform methods such as
_REG are called, explicitly set all devices into D0 during initialization.
Fixes: 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI devices")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
v2:
* Move runtime PM calls after setting to D0
* Use pci_pm_power_up_and_verify_state()
---
drivers/pci/pci-driver.c | 6 ------
drivers/pci/pci.c | 13 ++++++++++---
drivers/pci/pci.h | 1 +
3 files changed, 11 insertions(+), 9 deletions(-)
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index c8bd71a739f72..082918ce03d8a 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -555,12 +555,6 @@ static void pci_pm_default_resume(struct pci_dev *pci_dev)
pci_enable_wake(pci_dev, PCI_D0, false);
}
-static void pci_pm_power_up_and_verify_state(struct pci_dev *pci_dev)
-{
- pci_power_up(pci_dev);
- pci_update_current_state(pci_dev, PCI_D0);
-}
-
static void pci_pm_default_resume_early(struct pci_dev *pci_dev)
{
pci_pm_power_up_and_verify_state(pci_dev);
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index e77d5b53c0cec..8d125998b30b7 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3192,6 +3192,12 @@ void pci_d3cold_disable(struct pci_dev *dev)
}
EXPORT_SYMBOL_GPL(pci_d3cold_disable);
+void pci_pm_power_up_and_verify_state(struct pci_dev *pci_dev)
+{
+ pci_power_up(pci_dev);
+ pci_update_current_state(pci_dev, PCI_D0);
+}
+
/**
* pci_pm_init - Initialize PM functions of given PCI device
* @dev: PCI device to handle.
@@ -3202,9 +3208,6 @@ void pci_pm_init(struct pci_dev *dev)
u16 status;
u16 pmc;
- pm_runtime_forbid(&dev->dev);
- pm_runtime_set_active(&dev->dev);
- pm_runtime_enable(&dev->dev);
device_enable_async_suspend(&dev->dev);
dev->wakeup_prepared = false;
@@ -3266,6 +3269,10 @@ void pci_pm_init(struct pci_dev *dev)
pci_read_config_word(dev, PCI_STATUS, &status);
if (status & PCI_STATUS_IMM_READY)
dev->imm_ready = 1;
+ pci_pm_power_up_and_verify_state(dev);
+ pm_runtime_forbid(&dev->dev);
+ pm_runtime_set_active(&dev->dev);
+ pm_runtime_enable(&dev->dev);
}
static unsigned long pci_ea_flags(struct pci_dev *dev, u8 prop)
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index b81e99cd4b62a..49165b739138b 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -148,6 +148,7 @@ void pci_dev_adjust_pme(struct pci_dev *dev);
void pci_dev_complete_resume(struct pci_dev *pci_dev);
void pci_config_pm_runtime_get(struct pci_dev *dev);
void pci_config_pm_runtime_put(struct pci_dev *dev);
+void pci_pm_power_up_and_verify_state(struct pci_dev *pci_dev);
void pci_pm_init(struct pci_dev *dev);
void pci_ea_init(struct pci_dev *dev);
void pci_msi_init(struct pci_dev *dev);
--
2.43.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v2] PCI: Explicitly put devices into D0 when initializing
2025-04-24 4:31 [PATCH v2] PCI: Explicitly put devices into D0 when initializing Mario Limonciello
@ 2025-04-24 15:07 ` Wysocki, Rafael J
2025-04-24 18:20 ` Mario Limonciello
2025-04-27 12:54 ` Denis Benato
` (3 subsequent siblings)
4 siblings, 1 reply; 19+ messages in thread
From: Wysocki, Rafael J @ 2025-04-24 15:07 UTC (permalink / raw)
To: Mario Limonciello, mario.limonciello, bhelgaas,
huang.ying.caritas, stern
Cc: linux-pci, linux-pm, Rafael J. Wysocki
On 4/24/2025 6:31 AM, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello@amd.com>
>
> AMD BIOS team has root caused an issue that NVME storage failed to come
> back from suspend to a lack of a call to _REG when NVME device was probed.
>
> commit 112a7f9c8edbf ("PCI/ACPI: Call _REG when transitioning D-states")
> added support for calling _REG when transitioning D-states, but this only
> works if the device actually "transitions" D-states.
>
> commit 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI
> devices") added support for runtime PM on PCI devices, but never actually
> 'explicitly' sets the device to D0.
>
> To make sure that devices are in D0 and that platform methods such as
> _REG are called, explicitly set all devices into D0 during initialization.
>
> Fixes: 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI devices")
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
> v2:
> * Move runtime PM calls after setting to D0
> * Use pci_pm_power_up_and_verify_state()
You could give me a credit for this suggestion.
Anyways
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
> ---
> drivers/pci/pci-driver.c | 6 ------
> drivers/pci/pci.c | 13 ++++++++++---
> drivers/pci/pci.h | 1 +
> 3 files changed, 11 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index c8bd71a739f72..082918ce03d8a 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -555,12 +555,6 @@ static void pci_pm_default_resume(struct pci_dev *pci_dev)
> pci_enable_wake(pci_dev, PCI_D0, false);
> }
>
> -static void pci_pm_power_up_and_verify_state(struct pci_dev *pci_dev)
> -{
> - pci_power_up(pci_dev);
> - pci_update_current_state(pci_dev, PCI_D0);
> -}
> -
> static void pci_pm_default_resume_early(struct pci_dev *pci_dev)
> {
> pci_pm_power_up_and_verify_state(pci_dev);
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index e77d5b53c0cec..8d125998b30b7 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3192,6 +3192,12 @@ void pci_d3cold_disable(struct pci_dev *dev)
> }
> EXPORT_SYMBOL_GPL(pci_d3cold_disable);
>
> +void pci_pm_power_up_and_verify_state(struct pci_dev *pci_dev)
> +{
> + pci_power_up(pci_dev);
> + pci_update_current_state(pci_dev, PCI_D0);
> +}
> +
> /**
> * pci_pm_init - Initialize PM functions of given PCI device
> * @dev: PCI device to handle.
> @@ -3202,9 +3208,6 @@ void pci_pm_init(struct pci_dev *dev)
> u16 status;
> u16 pmc;
>
> - pm_runtime_forbid(&dev->dev);
> - pm_runtime_set_active(&dev->dev);
> - pm_runtime_enable(&dev->dev);
> device_enable_async_suspend(&dev->dev);
> dev->wakeup_prepared = false;
>
> @@ -3266,6 +3269,10 @@ void pci_pm_init(struct pci_dev *dev)
> pci_read_config_word(dev, PCI_STATUS, &status);
> if (status & PCI_STATUS_IMM_READY)
> dev->imm_ready = 1;
> + pci_pm_power_up_and_verify_state(dev);
> + pm_runtime_forbid(&dev->dev);
> + pm_runtime_set_active(&dev->dev);
> + pm_runtime_enable(&dev->dev);
> }
>
> static unsigned long pci_ea_flags(struct pci_dev *dev, u8 prop)
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index b81e99cd4b62a..49165b739138b 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -148,6 +148,7 @@ void pci_dev_adjust_pme(struct pci_dev *dev);
> void pci_dev_complete_resume(struct pci_dev *pci_dev);
> void pci_config_pm_runtime_get(struct pci_dev *dev);
> void pci_config_pm_runtime_put(struct pci_dev *dev);
> +void pci_pm_power_up_and_verify_state(struct pci_dev *pci_dev);
> void pci_pm_init(struct pci_dev *dev);
> void pci_ea_init(struct pci_dev *dev);
> void pci_msi_init(struct pci_dev *dev);
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] PCI: Explicitly put devices into D0 when initializing
2025-04-24 15:07 ` Wysocki, Rafael J
@ 2025-04-24 18:20 ` Mario Limonciello
0 siblings, 0 replies; 19+ messages in thread
From: Mario Limonciello @ 2025-04-24 18:20 UTC (permalink / raw)
To: Wysocki, Rafael J, mario.limonciello, bhelgaas,
huang.ying.caritas, stern
Cc: linux-pci, linux-pm, Rafael J. Wysocki
On 4/24/2025 10:07 AM, Wysocki, Rafael J wrote:
> On 4/24/2025 6:31 AM, Mario Limonciello wrote:
>> From: Mario Limonciello <mario.limonciello@amd.com>
>>
>> AMD BIOS team has root caused an issue that NVME storage failed to come
>> back from suspend to a lack of a call to _REG when NVME device was
>> probed.
>>
>> commit 112a7f9c8edbf ("PCI/ACPI: Call _REG when transitioning D-states")
>> added support for calling _REG when transitioning D-states, but this only
>> works if the device actually "transitions" D-states.
>>
>> commit 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI
>> devices") added support for runtime PM on PCI devices, but never actually
>> 'explicitly' sets the device to D0.
>>
>> To make sure that devices are in D0 and that platform methods such as
>> _REG are called, explicitly set all devices into D0 during
>> initialization.
>>
>> Fixes: 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI
>> devices")
>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>> ---
>> v2:
>> * Move runtime PM calls after setting to D0
>> * Use pci_pm_power_up_and_verify_state()
>
> You could give me a credit for this suggestion.
Sorry about that, absolutely! I believe b4 can still pick it up.
Suggested-by: Rafael J. Wysocki <rafael@kernel.org>
>
> Anyways
>
> Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
>
>
>> ---
>> drivers/pci/pci-driver.c | 6 ------
>> drivers/pci/pci.c | 13 ++++++++++---
>> drivers/pci/pci.h | 1 +
>> 3 files changed, 11 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
>> index c8bd71a739f72..082918ce03d8a 100644
>> --- a/drivers/pci/pci-driver.c
>> +++ b/drivers/pci/pci-driver.c
>> @@ -555,12 +555,6 @@ static void pci_pm_default_resume(struct pci_dev
>> *pci_dev)
>> pci_enable_wake(pci_dev, PCI_D0, false);
>> }
>> -static void pci_pm_power_up_and_verify_state(struct pci_dev *pci_dev)
>> -{
>> - pci_power_up(pci_dev);
>> - pci_update_current_state(pci_dev, PCI_D0);
>> -}
>> -
>> static void pci_pm_default_resume_early(struct pci_dev *pci_dev)
>> {
>> pci_pm_power_up_and_verify_state(pci_dev);
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index e77d5b53c0cec..8d125998b30b7 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -3192,6 +3192,12 @@ void pci_d3cold_disable(struct pci_dev *dev)
>> }
>> EXPORT_SYMBOL_GPL(pci_d3cold_disable);
>> +void pci_pm_power_up_and_verify_state(struct pci_dev *pci_dev)
>> +{
>> + pci_power_up(pci_dev);
>> + pci_update_current_state(pci_dev, PCI_D0);
>> +}
>> +
>> /**
>> * pci_pm_init - Initialize PM functions of given PCI device
>> * @dev: PCI device to handle.
>> @@ -3202,9 +3208,6 @@ void pci_pm_init(struct pci_dev *dev)
>> u16 status;
>> u16 pmc;
>> - pm_runtime_forbid(&dev->dev);
>> - pm_runtime_set_active(&dev->dev);
>> - pm_runtime_enable(&dev->dev);
>> device_enable_async_suspend(&dev->dev);
>> dev->wakeup_prepared = false;
>> @@ -3266,6 +3269,10 @@ void pci_pm_init(struct pci_dev *dev)
>> pci_read_config_word(dev, PCI_STATUS, &status);
>> if (status & PCI_STATUS_IMM_READY)
>> dev->imm_ready = 1;
>> + pci_pm_power_up_and_verify_state(dev);
>> + pm_runtime_forbid(&dev->dev);
>> + pm_runtime_set_active(&dev->dev);
>> + pm_runtime_enable(&dev->dev);
>> }
>> static unsigned long pci_ea_flags(struct pci_dev *dev, u8 prop)
>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
>> index b81e99cd4b62a..49165b739138b 100644
>> --- a/drivers/pci/pci.h
>> +++ b/drivers/pci/pci.h
>> @@ -148,6 +148,7 @@ void pci_dev_adjust_pme(struct pci_dev *dev);
>> void pci_dev_complete_resume(struct pci_dev *pci_dev);
>> void pci_config_pm_runtime_get(struct pci_dev *dev);
>> void pci_config_pm_runtime_put(struct pci_dev *dev);
>> +void pci_pm_power_up_and_verify_state(struct pci_dev *pci_dev);
>> void pci_pm_init(struct pci_dev *dev);
>> void pci_ea_init(struct pci_dev *dev);
>> void pci_msi_init(struct pci_dev *dev);
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] PCI: Explicitly put devices into D0 when initializing
2025-04-24 4:31 [PATCH v2] PCI: Explicitly put devices into D0 when initializing Mario Limonciello
2025-04-24 15:07 ` Wysocki, Rafael J
@ 2025-04-27 12:54 ` Denis Benato
2025-05-05 19:20 ` Perry, David
` (2 subsequent siblings)
4 siblings, 0 replies; 19+ messages in thread
From: Denis Benato @ 2025-04-27 12:54 UTC (permalink / raw)
To: Mario Limonciello, mario.limonciello, bhelgaas, rafael.j.wysocki,
huang.ying.caritas, stern
Cc: linux-pci
> From: Mario Limonciello <mario.limonciello@amd.com>
>
> AMD BIOS team has root caused an issue that NVME storage failed to come
> back from suspend to a lack of a call to _REG when NVME device was probed.
>
> commit 112a7f9c8edbf ("PCI/ACPI: Call _REG when transitioning D-states")
> added support for calling _REG when transitioning D-states, but this only
> works if the device actually "transitions" D-states.
>
> commit 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI
> devices") added support for runtime PM on PCI devices, but never actually
> 'explicitly' sets the device to D0.
>
> To make sure that devices are in D0 and that platform methods such as
> _REG are called, explicitly set all devices into D0 during initialization.
>
> Fixes: 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI devices")
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
> v2:
> * Move runtime PM calls after setting to D0
> * Use pci_pm_power_up_and_verify_state()
> ---
> drivers/pci/pci-driver.c | 6 ------
> drivers/pci/pci.c | 13 ++++++++++---
> drivers/pci/pci.h | 1 +
> 3 files changed, 11 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index c8bd71a739f72..082918ce03d8a 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -555,12 +555,6 @@ static void pci_pm_default_resume(struct pci_dev *pci_dev)
> pci_enable_wake(pci_dev, PCI_D0, false);
> }
>
> -static void pci_pm_power_up_and_verify_state(struct pci_dev *pci_dev)
> -{
> - pci_power_up(pci_dev);
> - pci_update_current_state(pci_dev, PCI_D0);
> -}
> -
> static void pci_pm_default_resume_early(struct pci_dev *pci_dev)
> {
> pci_pm_power_up_and_verify_state(pci_dev);
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index e77d5b53c0cec..8d125998b30b7 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3192,6 +3192,12 @@ void pci_d3cold_disable(struct pci_dev *dev)
> }
> EXPORT_SYMBOL_GPL(pci_d3cold_disable);
>
> +void pci_pm_power_up_and_verify_state(struct pci_dev *pci_dev)
> +{
> + pci_power_up(pci_dev);
> + pci_update_current_state(pci_dev, PCI_D0);
> +}
> +
> /**
> * pci_pm_init - Initialize PM functions of given PCI device
> * @dev: PCI device to handle.
> @@ -3202,9 +3208,6 @@ void pci_pm_init(struct pci_dev *dev)
> u16 status;
> u16 pmc;
>
> - pm_runtime_forbid(&dev->dev);
> - pm_runtime_set_active(&dev->dev);
> - pm_runtime_enable(&dev->dev);
> device_enable_async_suspend(&dev->dev);
> dev->wakeup_prepared = false;
>
> @@ -3266,6 +3269,10 @@ void pci_pm_init(struct pci_dev *dev)
> pci_read_config_word(dev, PCI_STATUS, &status);
> if (status & PCI_STATUS_IMM_READY)
> dev->imm_ready = 1;
> + pci_pm_power_up_and_verify_state(dev);
> + pm_runtime_forbid(&dev->dev);
> + pm_runtime_set_active(&dev->dev);
> + pm_runtime_enable(&dev->dev);
> }
>
> static unsigned long pci_ea_flags(struct pci_dev *dev, u8 prop)
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index b81e99cd4b62a..49165b739138b 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -148,6 +148,7 @@ void pci_dev_adjust_pme(struct pci_dev *dev);
> void pci_dev_complete_resume(struct pci_dev *pci_dev);
> void pci_config_pm_runtime_get(struct pci_dev *dev);
> void pci_config_pm_runtime_put(struct pci_dev *dev);
> +void pci_pm_power_up_and_verify_state(struct pci_dev *pci_dev);
> void pci_pm_init(struct pci_dev *dev);
> void pci_ea_init(struct pci_dev *dev);
> void pci_msi_init(struct pci_dev *dev);
Applying this patch makes my laptop power-off quickly and no hardware-related services gets stuck anymore during shutdown.
Tested-by: Denis Benato <benato.denis96@gmail.com>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] PCI: Explicitly put devices into D0 when initializing
[not found] <BY5PR19MB3922D2994064085132D8FEBD9A822@BY5PR19MB3922.namprd19.prod.outlook.com>
@ 2025-05-01 14:14 ` Shen, Yijun
0 siblings, 0 replies; 19+ messages in thread
From: Shen, Yijun @ 2025-05-01 14:14 UTC (permalink / raw)
To: Mario Limonciello, Mario Limonciello, bhelgaas@google.com,
rafael.j.wysocki@intel.com, huang.ying.caritas@gmail.com,
stern@rowland.harvard.edu
Cc: linux-pci@vger.kernel.org
> From: Mario Limonciello mailto:mario.limonciello@amd.com
>
> AMD BIOS team has root caused an issue that NVME storage failed to come
> back from suspend to a lack of a call to _REG when NVME device was
> probed.
>
> commit 112a7f9c8edbf ("PCI/ACPI: Call _REG when transitioning D-states")
> added support for calling _REG when transitioning D-states, but this only
> works if the device actually "transitions" D-states.
>
> commit 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound
> PCI
> devices") added support for runtime PM on PCI devices, but never actually
> 'explicitly' sets the device to D0.
>
> To make sure that devices are in D0 and that platform methods such as
> _REG are called, explicitly set all devices into D0 during initialization.
>
> Fixes: 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI
> devices")
> Signed-off-by: Mario Limonciello mailto:mario.limonciello@amd.com
Dell hits the same issue.
Verified the patch on the issued system, the issue is gone.
Tested-By: Yijun Shen <Yijun_Shen@Dell.com>
> ---
> v2:
> * Move runtime PM calls after setting to D0
> * Use pci_pm_power_up_and_verify_state()
> ---
> drivers/pci/pci-driver.c | 6 ------
> drivers/pci/pci.c | 13 ++++++++++---
> drivers/pci/pci.h | 1 +
> 3 files changed, 11 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c index
> c8bd71a739f72..082918ce03d8a 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -555,12 +555,6 @@ static void pci_pm_default_resume(struct pci_dev
> *pci_dev)
> pci_enable_wake(pci_dev, PCI_D0, false); }
>
> -static void pci_pm_power_up_and_verify_state(struct pci_dev *pci_dev) -{
> - pci_power_up(pci_dev);
> - pci_update_current_state(pci_dev, PCI_D0);
> -}
> -
> static void pci_pm_default_resume_early(struct pci_dev *pci_dev) {
> pci_pm_power_up_and_verify_state(pci_dev);
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index
> e77d5b53c0cec..8d125998b30b7 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3192,6 +3192,12 @@ void pci_d3cold_disable(struct pci_dev *dev) }
> EXPORT_SYMBOL_GPL(pci_d3cold_disable);
>
> +void pci_pm_power_up_and_verify_state(struct pci_dev *pci_dev) {
> + pci_power_up(pci_dev);
> + pci_update_current_state(pci_dev, PCI_D0); }
> +
> /**
> * pci_pm_init - Initialize PM functions of given PCI device
> * @dev: PCI device to handle.
> @@ -3202,9 +3208,6 @@ void pci_pm_init(struct pci_dev *dev)
> u16 status;
> u16 pmc;
>
> - pm_runtime_forbid(&dev->dev);
> - pm_runtime_set_active(&dev->dev);
> - pm_runtime_enable(&dev->dev);
> device_enable_async_suspend(&dev->dev);
> dev->wakeup_prepared = false;
>
> @@ -3266,6 +3269,10 @@ void pci_pm_init(struct pci_dev *dev)
> pci_read_config_word(dev, PCI_STATUS, &status);
> if (status & PCI_STATUS_IMM_READY)
> dev->imm_ready = 1;
> + pci_pm_power_up_and_verify_state(dev);
> + pm_runtime_forbid(&dev->dev);
> + pm_runtime_set_active(&dev->dev);
> + pm_runtime_enable(&dev->dev);
> }
>
> static unsigned long pci_ea_flags(struct pci_dev *dev, u8 prop) diff --git
> a/drivers/pci/pci.h b/drivers/pci/pci.h index
> b81e99cd4b62a..49165b739138b 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -148,6 +148,7 @@ void pci_dev_adjust_pme(struct pci_dev *dev); void
> pci_dev_complete_resume(struct pci_dev *pci_dev); void
> pci_config_pm_runtime_get(struct pci_dev *dev); void
> pci_config_pm_runtime_put(struct pci_dev *dev);
> +void pci_pm_power_up_and_verify_state(struct pci_dev *pci_dev);
> void pci_pm_init(struct pci_dev *dev);
> void pci_ea_init(struct pci_dev *dev);
> void pci_msi_init(struct pci_dev *dev);
> --
> 2.43.0
Internal Use - Confidential
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] PCI: Explicitly put devices into D0 when initializing
2025-04-24 4:31 [PATCH v2] PCI: Explicitly put devices into D0 when initializing Mario Limonciello
2025-04-24 15:07 ` Wysocki, Rafael J
2025-04-27 12:54 ` Denis Benato
@ 2025-05-05 19:20 ` Perry, David
2025-05-05 20:01 ` Mario Limonciello
2025-05-05 23:06 ` Bjorn Helgaas
2025-06-11 12:52 ` [PATCH v2] PCI: Explicitly put devices into D0 when initializing - Bug report Cabiddu, Giovanni
4 siblings, 1 reply; 19+ messages in thread
From: Perry, David @ 2025-05-05 19:20 UTC (permalink / raw)
To: Mario Limonciello, mario.limonciello, bhelgaas, rafael.j.wysocki,
huang.ying.caritas, stern
Cc: linux-pci
Verified the patch on AMD reference board.
Tested-By: David Perry <david.perry@amd.com>
> From: Mario Limonciello <mario.limonciello@amd.com>
>
> AMD BIOS team has root caused an issue that NVME storage failed to come
> back from suspend to a lack of a call to _REG when NVME device was probed.
>
> commit 112a7f9c8edbf ("PCI/ACPI: Call _REG when transitioning D-states")
> added support for calling _REG when transitioning D-states, but this only
> works if the device actually "transitions" D-states.
>
> commit 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI
> devices") added support for runtime PM on PCI devices, but never actually
> 'explicitly' sets the device to D0.
>
> To make sure that devices are in D0 and that platform methods such as
> _REG are called, explicitly set all devices into D0 during initialization.
>
> Fixes: 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI devices")
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
> v2:
> * Move runtime PM calls after setting to D0
> * Use pci_pm_power_up_and_verify_state()
> ---
> drivers/pci/pci-driver.c | 6 ------
> drivers/pci/pci.c | 13 ++++++++++---
> drivers/pci/pci.h | 1 +
> 3 files changed, 11 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index c8bd71a739f72..082918ce03d8a 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -555,12 +555,6 @@ static void pci_pm_default_resume(struct pci_dev *pci_dev)
> pci_enable_wake(pci_dev, PCI_D0, false);
> }
>
> -static void pci_pm_power_up_and_verify_state(struct pci_dev *pci_dev)
> -{
> - pci_power_up(pci_dev);
> - pci_update_current_state(pci_dev, PCI_D0);
> -}
> -
> static void pci_pm_default_resume_early(struct pci_dev *pci_dev)
> {
> pci_pm_power_up_and_verify_state(pci_dev);
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index e77d5b53c0cec..8d125998b30b7 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3192,6 +3192,12 @@ void pci_d3cold_disable(struct pci_dev *dev)
> }
> EXPORT_SYMBOL_GPL(pci_d3cold_disable);
>
> +void pci_pm_power_up_and_verify_state(struct pci_dev *pci_dev)
> +{
> + pci_power_up(pci_dev);
> + pci_update_current_state(pci_dev, PCI_D0);
> +}
> +
> /**
> * pci_pm_init - Initialize PM functions of given PCI device
> * @dev: PCI device to handle.
> @@ -3202,9 +3208,6 @@ void pci_pm_init(struct pci_dev *dev)
> u16 status;
> u16 pmc;
>
> - pm_runtime_forbid(&dev->dev);
> - pm_runtime_set_active(&dev->dev);
> - pm_runtime_enable(&dev->dev);
> device_enable_async_suspend(&dev->dev);
> dev->wakeup_prepared = false;
>
> @@ -3266,6 +3269,10 @@ void pci_pm_init(struct pci_dev *dev)
> pci_read_config_word(dev, PCI_STATUS, &status);
> if (status & PCI_STATUS_IMM_READY)
> dev->imm_ready = 1;
> + pci_pm_power_up_and_verify_state(dev);
> + pm_runtime_forbid(&dev->dev);
> + pm_runtime_set_active(&dev->dev);
> + pm_runtime_enable(&dev->dev);
> }
>
> static unsigned long pci_ea_flags(struct pci_dev *dev, u8 prop)
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index b81e99cd4b62a..49165b739138b 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -148,6 +148,7 @@ void pci_dev_adjust_pme(struct pci_dev *dev);
> void pci_dev_complete_resume(struct pci_dev *pci_dev);
> void pci_config_pm_runtime_get(struct pci_dev *dev);
> void pci_config_pm_runtime_put(struct pci_dev *dev);
> +void pci_pm_power_up_and_verify_state(struct pci_dev *pci_dev);
> void pci_pm_init(struct pci_dev *dev);
> void pci_ea_init(struct pci_dev *dev);
> void pci_msi_init(struct pci_dev *dev);
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] PCI: Explicitly put devices into D0 when initializing
2025-05-05 19:20 ` Perry, David
@ 2025-05-05 20:01 ` Mario Limonciello
0 siblings, 0 replies; 19+ messages in thread
From: Mario Limonciello @ 2025-05-05 20:01 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, Perry, David, bhelgaas, rafael.j.wysocki,
huang.ying.caritas, stern
On 5/5/2025 2:20 PM, Perry, David wrote:
> Verified the patch on AMD reference board.
>
> Tested-By: David Perry <david.perry@amd.com>
Thanks Dave.
>
>> From: Mario Limonciello <mario.limonciello@amd.com>
>>
>> AMD BIOS team has root caused an issue that NVME storage failed to come
>> back from suspend to a lack of a call to _REG when NVME device was
>> probed.
>>
>> commit 112a7f9c8edbf ("PCI/ACPI: Call _REG when transitioning D-states")
>> added support for calling _REG when transitioning D-states, but this only
>> works if the device actually "transitions" D-states.
>>
>> commit 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI
>> devices") added support for runtime PM on PCI devices, but never actually
>> 'explicitly' sets the device to D0.
>>
>> To make sure that devices are in D0 and that platform methods such as
>> _REG are called, explicitly set all devices into D0 during
>> initialization.
>>
>> Fixes: 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI
>> devices")
>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>> ---
Bjorn,
We now have some positive testing reports from this on
* Intel system (Denis Benato)
* Dell system (Yijun Shen)
* AMD reference system (Dave Perry)
Any comments?
>> v2:
>> * Move runtime PM calls after setting to D0
>> * Use pci_pm_power_up_and_verify_state()
>> ---
>> drivers/pci/pci-driver.c | 6 ------
>> drivers/pci/pci.c | 13 ++++++++++---
>> drivers/pci/pci.h | 1 +
>> 3 files changed, 11 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
>> index c8bd71a739f72..082918ce03d8a 100644
>> --- a/drivers/pci/pci-driver.c
>> +++ b/drivers/pci/pci-driver.c
>> @@ -555,12 +555,6 @@ static void pci_pm_default_resume(struct pci_dev
>> *pci_dev)
>> pci_enable_wake(pci_dev, PCI_D0, false);
>> }
>> -static void pci_pm_power_up_and_verify_state(struct pci_dev *pci_dev)
>> -{
>> - pci_power_up(pci_dev);
>> - pci_update_current_state(pci_dev, PCI_D0);
>> -}
>> -
>> static void pci_pm_default_resume_early(struct pci_dev *pci_dev)
>> {
>> pci_pm_power_up_and_verify_state(pci_dev);
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index e77d5b53c0cec..8d125998b30b7 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -3192,6 +3192,12 @@ void pci_d3cold_disable(struct pci_dev *dev)
>> }
>> EXPORT_SYMBOL_GPL(pci_d3cold_disable);
>> +void pci_pm_power_up_and_verify_state(struct pci_dev *pci_dev)
>> +{
>> + pci_power_up(pci_dev);
>> + pci_update_current_state(pci_dev, PCI_D0);
>> +}
>> +
>> /**
>> * pci_pm_init - Initialize PM functions of given PCI device
>> * @dev: PCI device to handle.
>> @@ -3202,9 +3208,6 @@ void pci_pm_init(struct pci_dev *dev)
>> u16 status;
>> u16 pmc;
>> - pm_runtime_forbid(&dev->dev);
>> - pm_runtime_set_active(&dev->dev);
>> - pm_runtime_enable(&dev->dev);
>> device_enable_async_suspend(&dev->dev);
>> dev->wakeup_prepared = false;
>> @@ -3266,6 +3269,10 @@ void pci_pm_init(struct pci_dev *dev)
>> pci_read_config_word(dev, PCI_STATUS, &status);
>> if (status & PCI_STATUS_IMM_READY)
>> dev->imm_ready = 1;
>> + pci_pm_power_up_and_verify_state(dev);
>> + pm_runtime_forbid(&dev->dev);
>> + pm_runtime_set_active(&dev->dev);
>> + pm_runtime_enable(&dev->dev);
>> }
>> static unsigned long pci_ea_flags(struct pci_dev *dev, u8 prop)
>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
>> index b81e99cd4b62a..49165b739138b 100644
>> --- a/drivers/pci/pci.h
>> +++ b/drivers/pci/pci.h
>> @@ -148,6 +148,7 @@ void pci_dev_adjust_pme(struct pci_dev *dev);
>> void pci_dev_complete_resume(struct pci_dev *pci_dev);
>> void pci_config_pm_runtime_get(struct pci_dev *dev);
>> void pci_config_pm_runtime_put(struct pci_dev *dev);
>> +void pci_pm_power_up_and_verify_state(struct pci_dev *pci_dev);
>> void pci_pm_init(struct pci_dev *dev);
>> void pci_ea_init(struct pci_dev *dev);
>> void pci_msi_init(struct pci_dev *dev);
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] PCI: Explicitly put devices into D0 when initializing
2025-04-24 4:31 [PATCH v2] PCI: Explicitly put devices into D0 when initializing Mario Limonciello
` (2 preceding siblings ...)
2025-05-05 19:20 ` Perry, David
@ 2025-05-05 23:06 ` Bjorn Helgaas
2025-06-11 14:14 ` Nicolas Dichtel
2025-06-11 12:52 ` [PATCH v2] PCI: Explicitly put devices into D0 when initializing - Bug report Cabiddu, Giovanni
4 siblings, 1 reply; 19+ messages in thread
From: Bjorn Helgaas @ 2025-05-05 23:06 UTC (permalink / raw)
To: Mario Limonciello
Cc: mario.limonciello, bhelgaas, rafael.j.wysocki, huang.ying.caritas,
stern, linux-pci
On Wed, Apr 23, 2025 at 11:31:32PM -0500, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello@amd.com>
>
> AMD BIOS team has root caused an issue that NVME storage failed to come
> back from suspend to a lack of a call to _REG when NVME device was probed.
>
> commit 112a7f9c8edbf ("PCI/ACPI: Call _REG when transitioning D-states")
> added support for calling _REG when transitioning D-states, but this only
> works if the device actually "transitions" D-states.
>
> commit 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI
> devices") added support for runtime PM on PCI devices, but never actually
> 'explicitly' sets the device to D0.
>
> To make sure that devices are in D0 and that platform methods such as
> _REG are called, explicitly set all devices into D0 during initialization.
>
> Fixes: 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI devices")
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Applied to pci/pm for v6.16, thanks!
> ---
> v2:
> * Move runtime PM calls after setting to D0
> * Use pci_pm_power_up_and_verify_state()
> ---
> drivers/pci/pci-driver.c | 6 ------
> drivers/pci/pci.c | 13 ++++++++++---
> drivers/pci/pci.h | 1 +
> 3 files changed, 11 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index c8bd71a739f72..082918ce03d8a 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -555,12 +555,6 @@ static void pci_pm_default_resume(struct pci_dev *pci_dev)
> pci_enable_wake(pci_dev, PCI_D0, false);
> }
>
> -static void pci_pm_power_up_and_verify_state(struct pci_dev *pci_dev)
> -{
> - pci_power_up(pci_dev);
> - pci_update_current_state(pci_dev, PCI_D0);
> -}
> -
> static void pci_pm_default_resume_early(struct pci_dev *pci_dev)
> {
> pci_pm_power_up_and_verify_state(pci_dev);
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index e77d5b53c0cec..8d125998b30b7 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3192,6 +3192,12 @@ void pci_d3cold_disable(struct pci_dev *dev)
> }
> EXPORT_SYMBOL_GPL(pci_d3cold_disable);
>
> +void pci_pm_power_up_and_verify_state(struct pci_dev *pci_dev)
> +{
> + pci_power_up(pci_dev);
> + pci_update_current_state(pci_dev, PCI_D0);
> +}
> +
> /**
> * pci_pm_init - Initialize PM functions of given PCI device
> * @dev: PCI device to handle.
> @@ -3202,9 +3208,6 @@ void pci_pm_init(struct pci_dev *dev)
> u16 status;
> u16 pmc;
>
> - pm_runtime_forbid(&dev->dev);
> - pm_runtime_set_active(&dev->dev);
> - pm_runtime_enable(&dev->dev);
> device_enable_async_suspend(&dev->dev);
> dev->wakeup_prepared = false;
>
> @@ -3266,6 +3269,10 @@ void pci_pm_init(struct pci_dev *dev)
> pci_read_config_word(dev, PCI_STATUS, &status);
> if (status & PCI_STATUS_IMM_READY)
> dev->imm_ready = 1;
> + pci_pm_power_up_and_verify_state(dev);
> + pm_runtime_forbid(&dev->dev);
> + pm_runtime_set_active(&dev->dev);
> + pm_runtime_enable(&dev->dev);
> }
>
> static unsigned long pci_ea_flags(struct pci_dev *dev, u8 prop)
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index b81e99cd4b62a..49165b739138b 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -148,6 +148,7 @@ void pci_dev_adjust_pme(struct pci_dev *dev);
> void pci_dev_complete_resume(struct pci_dev *pci_dev);
> void pci_config_pm_runtime_get(struct pci_dev *dev);
> void pci_config_pm_runtime_put(struct pci_dev *dev);
> +void pci_pm_power_up_and_verify_state(struct pci_dev *pci_dev);
> void pci_pm_init(struct pci_dev *dev);
> void pci_ea_init(struct pci_dev *dev);
> void pci_msi_init(struct pci_dev *dev);
> --
> 2.43.0
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] PCI: Explicitly put devices into D0 when initializing - Bug report
2025-04-24 4:31 [PATCH v2] PCI: Explicitly put devices into D0 when initializing Mario Limonciello
` (3 preceding siblings ...)
2025-05-05 23:06 ` Bjorn Helgaas
@ 2025-06-11 12:52 ` Cabiddu, Giovanni
2025-06-11 13:50 ` Mario Limonciello
4 siblings, 1 reply; 19+ messages in thread
From: Cabiddu, Giovanni @ 2025-06-11 12:52 UTC (permalink / raw)
To: Mario Limonciello, bhelgaas, alex.williamson
Cc: mario.limonciello, rafael.j.wysocki, huang.ying.caritas, stern,
linux-pci, mike.ximing.chen, ahsan.atta, suman.kumar.chakraborty,
kvm, linux-crypto, linux-kernel
Hi Mario, Bjorn and Alex,
On Wed, Apr 23, 2025 at 11:31:32PM -0500, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello@amd.com>
>
> AMD BIOS team has root caused an issue that NVME storage failed to come
> back from suspend to a lack of a call to _REG when NVME device was probed.
>
> commit 112a7f9c8edbf ("PCI/ACPI: Call _REG when transitioning D-states")
> added support for calling _REG when transitioning D-states, but this only
> works if the device actually "transitions" D-states.
>
> commit 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI
> devices") added support for runtime PM on PCI devices, but never actually
> 'explicitly' sets the device to D0.
>
> To make sure that devices are in D0 and that platform methods such as
> _REG are called, explicitly set all devices into D0 during initialization.
>
> Fixes: 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI devices")
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
Through a bisect, we identified that this patch, in v6.16-rc1,
introduces a regression on vfio-pci across all Intel QuickAssist (QAT)
devices. Specifically, the ioctl VFIO_GROUP_GET_DEVICE_FD call fails
with -EACCES.
Upon further investigation, the -EACCES appears to originate from the
rpm_resume() function, which is called by pm_runtime_resume_and_get()
within vfio_pci_core_enable(). Here is the exact call trace:
drivers/base/power/runtime.c: rpm_resume()
drivers/base/power/runtime.c: __pm_runtime_resume()
include/linux/pm_runtime.h: pm_runtime_resume_and_get()
drivers/vfio/pci/vfio_pci_core.c: vfio_pci_core_enable()
drivers/vfio/pci/vfio_pci.c: vfio_pci_open_device()
drivers/vfio/vfio_main.c: device->ops->open_device()
drivers/vfio/vfio_main.c: vfio_df_device_first_open()
drivers/vfio/vfio_main.c: vfio_df_open()
drivers/vfio/group.c: vfio_df_group_open()
drivers/vfio/group.c: vfio_device_open_file()
drivers/vfio/group.c: vfio_group_ioctl_get_device_fd()
drivers/vfio/group.c: vfio_group_fops_unl_ioctl(..., VFIO_GROUP_GET_DEVICE_FD, ...)
Is this a known issue that affects other devices? Is there any ongoing
discussion or fix in progress?
Thanks,
--
Giovanni
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] PCI: Explicitly put devices into D0 when initializing - Bug report
2025-06-11 12:52 ` [PATCH v2] PCI: Explicitly put devices into D0 when initializing - Bug report Cabiddu, Giovanni
@ 2025-06-11 13:50 ` Mario Limonciello
2025-06-11 14:30 ` Cabiddu, Giovanni
2025-06-11 16:00 ` Alex Williamson
0 siblings, 2 replies; 19+ messages in thread
From: Mario Limonciello @ 2025-06-11 13:50 UTC (permalink / raw)
To: Cabiddu, Giovanni, bhelgaas, alex.williamson
Cc: mario.limonciello, rafael.j.wysocki, huang.ying.caritas, stern,
linux-pci, mike.ximing.chen, ahsan.atta, suman.kumar.chakraborty,
kvm, linux-crypto, linux-kernel
On 6/11/2025 5:52 AM, Cabiddu, Giovanni wrote:
> Hi Mario, Bjorn and Alex,
>
> On Wed, Apr 23, 2025 at 11:31:32PM -0500, Mario Limonciello wrote:
>> From: Mario Limonciello <mario.limonciello@amd.com>
>>
>> AMD BIOS team has root caused an issue that NVME storage failed to come
>> back from suspend to a lack of a call to _REG when NVME device was probed.
>>
>> commit 112a7f9c8edbf ("PCI/ACPI: Call _REG when transitioning D-states")
>> added support for calling _REG when transitioning D-states, but this only
>> works if the device actually "transitions" D-states.
>>
>> commit 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI
>> devices") added support for runtime PM on PCI devices, but never actually
>> 'explicitly' sets the device to D0.
>>
>> To make sure that devices are in D0 and that platform methods such as
>> _REG are called, explicitly set all devices into D0 during initialization.
>>
>> Fixes: 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI devices")
>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>> ---
> Through a bisect, we identified that this patch, in v6.16-rc1,
> introduces a regression on vfio-pci across all Intel QuickAssist (QAT)
> devices. Specifically, the ioctl VFIO_GROUP_GET_DEVICE_FD call fails
> with -EACCES.
>
> Upon further investigation, the -EACCES appears to originate from the
> rpm_resume() function, which is called by pm_runtime_resume_and_get()
> within vfio_pci_core_enable(). Here is the exact call trace:
>
> drivers/base/power/runtime.c: rpm_resume()
> drivers/base/power/runtime.c: __pm_runtime_resume()
> include/linux/pm_runtime.h: pm_runtime_resume_and_get()
> drivers/vfio/pci/vfio_pci_core.c: vfio_pci_core_enable()
> drivers/vfio/pci/vfio_pci.c: vfio_pci_open_device()
> drivers/vfio/vfio_main.c: device->ops->open_device()
> drivers/vfio/vfio_main.c: vfio_df_device_first_open()
> drivers/vfio/vfio_main.c: vfio_df_open()
> drivers/vfio/group.c: vfio_df_group_open()
> drivers/vfio/group.c: vfio_device_open_file()
> drivers/vfio/group.c: vfio_group_ioctl_get_device_fd()
> drivers/vfio/group.c: vfio_group_fops_unl_ioctl(..., VFIO_GROUP_GET_DEVICE_FD, ...)
>
> Is this a known issue that affects other devices? Is there any ongoing
> discussion or fix in progress?
>
> Thanks,
>
This is the first I've heard about an issue with that patch.
Does setting the VFIO parameter disable_idle_d3 help?
If so; this feels like an imbalance of runtime PM calls in the VFIO
stack that this patch exposed.
Alex, any ideas?
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] PCI: Explicitly put devices into D0 when initializing
2025-05-05 23:06 ` Bjorn Helgaas
@ 2025-06-11 14:14 ` Nicolas Dichtel
2025-06-11 14:56 ` Mario Limonciello
0 siblings, 1 reply; 19+ messages in thread
From: Nicolas Dichtel @ 2025-06-11 14:14 UTC (permalink / raw)
To: Bjorn Helgaas, Mario Limonciello
Cc: mario.limonciello, bhelgaas, rafael.j.wysocki, huang.ying.caritas,
stern, linux-pci, Olivier MATZ, dev@dpdk.org
Le 06/05/2025 à 01:06, Bjorn Helgaas a écrit :
> On Wed, Apr 23, 2025 at 11:31:32PM -0500, Mario Limonciello wrote:
>> From: Mario Limonciello <mario.limonciello@amd.com>
>>
>> AMD BIOS team has root caused an issue that NVME storage failed to come
>> back from suspend to a lack of a call to _REG when NVME device was probed.
>>
>> commit 112a7f9c8edbf ("PCI/ACPI: Call _REG when transitioning D-states")
>> added support for calling _REG when transitioning D-states, but this only
>> works if the device actually "transitions" D-states.
>>
>> commit 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI
>> devices") added support for runtime PM on PCI devices, but never actually
>> 'explicitly' sets the device to D0.
>>
>> To make sure that devices are in D0 and that platform methods such as
>> _REG are called, explicitly set all devices into D0 during initialization.
>>
>> Fixes: 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI devices")
>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>
> Applied to pci/pm for v6.16, thanks!
>
I've a regression after this commit.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4d4c10f763d7
I've started a QEMU with "-cpu host" on an AMD (AMD Ryzen 5 3600 6-Core
Processor) machine + virtio-net interfaces. When I try to start a testpmd (a
DPDK app), it cannot find the virtio port. The ioctl VFIO_GROUP_GET_DEVICE_FD fails.
To reproduce the issue:
qemu-system-x86_64 --enable-kvm -m 5G -cpu host \
-smp sockets=1,cores=2,threads=2 \
-snapshot -vga none -display none -nographic \
-drive if=none,file=/opt/vm/ubuntu-24.04-with-linux-net.qcow2,id=hda \
-device virtio-blk,drive=hda \
-device virtio-net,netdev=eth0,addr=03 -netdev user,id=eth0 \
-device virtio-net,netdev=eth1,addr=04 -netdev socket,id=eth1,mcast=230.0.0.1:1234
git clone git://dpdk.org/dpdk
cd dpdk/
meson build-static --werror --default-library=static --debug
ninja -C build-static
echo 3 > /proc/sys/vm/drop_caches
echo 256 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
modprobe vfio-pci
lspci
python3 ./usertools/dpdk-devbind.py --noiommu-mode -b vfio-pci 0000:00:04.0
./build-static/app/dpdk-testpmd -l 1,2 --socket-mem 512,0 -a 0000:00:04.0 -- -i
Here is the output:
EAL: Detected CPU lcores: 4
EAL: Detected NUMA nodes: 1
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
EAL: Using IOMMU type 8 (No-IOMMU)
EAL: Getting a vfio_dev_fd for 0000:00:04.0 failed
PCI_BUS: Cannot get offset of region 0.
PCI_BUS: fail to disable req notifier.
PCI_BUS: fail to disable req notifier.
VIRTIO_INIT: eth_virtio_pci_init(): Failed to init PCI device
PCI_BUS: Requested device 0000:00:04.0 cannot be used
EAL: Bus (pci) probe failed.
testpmd: No probed ethernet devices
Interactive-mode selected
testpmd: create a new mbuf pool <mb_pool_0>: n=155456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Done
testpmd>
=> the problem starts at the line "Getting a vfio_dev_fd for 0000:00:04.0 failed"
https://git.dpdk.org/dpdk/tree/lib/eal/linux/eal_vfio.c#n966
FWIW, here is the output when it starts correctly:
EAL: Detected CPU lcores: 4
EAL: Detected NUMA nodes: 1
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
EAL: Using IOMMU type 8 (No-IOMMU)
Interactive-mode selected
Warning: NUMA should be configured manually by using --port-numa-config and
--ring-numa-config parameters along with --numa.
testpmd: create a new mbuf pool <mb_pool_0>: n=155456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Warning! port-topology=paired and odd forward ports number, the last port will
pair with itself.
Configuring Port 0 (socket 0)
EAL: Error disabling MSI-X interrupts for fd 277
Port 0: DE:ED:01:E0:1B:75
Checking link statuses...
Done
testpmd>
Any help would be appreciated.
Regards,
Nicolas
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] PCI: Explicitly put devices into D0 when initializing - Bug report
2025-06-11 13:50 ` Mario Limonciello
@ 2025-06-11 14:30 ` Cabiddu, Giovanni
2025-06-11 16:00 ` Alex Williamson
1 sibling, 0 replies; 19+ messages in thread
From: Cabiddu, Giovanni @ 2025-06-11 14:30 UTC (permalink / raw)
To: Mario Limonciello
Cc: bhelgaas, alex.williamson, mario.limonciello, rafael.j.wysocki,
huang.ying.caritas, stern, linux-pci, mike.ximing.chen,
ahsan.atta, suman.kumar.chakraborty, kvm, linux-crypto,
linux-kernel
On Wed, Jun 11, 2025 at 06:50:59AM -0700, Mario Limonciello wrote:
> On 6/11/2025 5:52 AM, Cabiddu, Giovanni wrote:
> > Hi Mario, Bjorn and Alex,
> >
> > On Wed, Apr 23, 2025 at 11:31:32PM -0500, Mario Limonciello wrote:
> > > From: Mario Limonciello <mario.limonciello@amd.com>
> > >
> > > AMD BIOS team has root caused an issue that NVME storage failed to come
> > > back from suspend to a lack of a call to _REG when NVME device was probed.
> > >
> > > commit 112a7f9c8edbf ("PCI/ACPI: Call _REG when transitioning D-states")
> > > added support for calling _REG when transitioning D-states, but this only
> > > works if the device actually "transitions" D-states.
> > >
> > > commit 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI
> > > devices") added support for runtime PM on PCI devices, but never actually
> > > 'explicitly' sets the device to D0.
> > >
> > > To make sure that devices are in D0 and that platform methods such as
> > > _REG are called, explicitly set all devices into D0 during initialization.
> > >
> > > Fixes: 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI devices")
> > > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> > > ---
> > Through a bisect, we identified that this patch, in v6.16-rc1,
> > introduces a regression on vfio-pci across all Intel QuickAssist (QAT)
> > devices. Specifically, the ioctl VFIO_GROUP_GET_DEVICE_FD call fails
> > with -EACCES.
> >
> > Upon further investigation, the -EACCES appears to originate from the
> > rpm_resume() function, which is called by pm_runtime_resume_and_get()
> > within vfio_pci_core_enable(). Here is the exact call trace:
> >
> > drivers/base/power/runtime.c: rpm_resume()
> > drivers/base/power/runtime.c: __pm_runtime_resume()
> > include/linux/pm_runtime.h: pm_runtime_resume_and_get()
> > drivers/vfio/pci/vfio_pci_core.c: vfio_pci_core_enable()
> > drivers/vfio/pci/vfio_pci.c: vfio_pci_open_device()
> > drivers/vfio/vfio_main.c: device->ops->open_device()
> > drivers/vfio/vfio_main.c: vfio_df_device_first_open()
> > drivers/vfio/vfio_main.c: vfio_df_open()
> > drivers/vfio/group.c: vfio_df_group_open()
> > drivers/vfio/group.c: vfio_device_open_file()
> > drivers/vfio/group.c: vfio_group_ioctl_get_device_fd()
> > drivers/vfio/group.c: vfio_group_fops_unl_ioctl(..., VFIO_GROUP_GET_DEVICE_FD, ...)
> >
> > Is this a known issue that affects other devices? Is there any ongoing
> > discussion or fix in progress?
> >
> > Thanks,
> >
>
> This is the first I've heard about an issue with that patch.
>
> Does setting the VFIO parameter disable_idle_d3 help?
It does, ... a bit. With disable_idle_d3=1 the ioctl() is successful, but a
subsequent read on that file descriptor fails with -EIO.
ioctl(5, VFIO_GROUP_GET_DEVICE_FD, 0x7ffd2b38abf0) = 6
pread64(6, 0x7ffd2b38ab06, 2, 7696581394436) = -1 EIO (Input/output error)
>
> If so; this feels like an imbalance of runtime PM calls in the VFIO stack
> that this patch exposed.
>
> Alex, any ideas?
>
Regards,
--
Giovanni
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] PCI: Explicitly put devices into D0 when initializing
2025-06-11 14:14 ` Nicolas Dichtel
@ 2025-06-11 14:56 ` Mario Limonciello
2025-06-11 16:05 ` Alex Williamson
0 siblings, 1 reply; 19+ messages in thread
From: Mario Limonciello @ 2025-06-11 14:56 UTC (permalink / raw)
To: nicolas.dichtel, Bjorn Helgaas, Alex Williamson
Cc: mario.limonciello, bhelgaas, rafael.j.wysocki, huang.ying.caritas,
stern, linux-pci, Olivier MATZ, dev@dpdk.org
On 6/11/2025 7:14 AM, Nicolas Dichtel wrote:
> Le 06/05/2025 à 01:06, Bjorn Helgaas a écrit :
>> On Wed, Apr 23, 2025 at 11:31:32PM -0500, Mario Limonciello wrote:
>>> From: Mario Limonciello <mario.limonciello@amd.com>
>>>
>>> AMD BIOS team has root caused an issue that NVME storage failed to come
>>> back from suspend to a lack of a call to _REG when NVME device was probed.
>>>
>>> commit 112a7f9c8edbf ("PCI/ACPI: Call _REG when transitioning D-states")
>>> added support for calling _REG when transitioning D-states, but this only
>>> works if the device actually "transitions" D-states.
>>>
>>> commit 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI
>>> devices") added support for runtime PM on PCI devices, but never actually
>>> 'explicitly' sets the device to D0.
>>>
>>> To make sure that devices are in D0 and that platform methods such as
>>> _REG are called, explicitly set all devices into D0 during initialization.
>>>
>>> Fixes: 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI devices")
>>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>>
>> Applied to pci/pm for v6.16, thanks!
>>
>
> I've a regression after this commit.
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4d4c10f763d7
>
> I've started a QEMU with "-cpu host" on an AMD (AMD Ryzen 5 3600 6-Core
> Processor) machine + virtio-net interfaces. When I try to start a testpmd (a
> DPDK app), it cannot find the virtio port. The ioctl VFIO_GROUP_GET_DEVICE_FD fails.
>
> To reproduce the issue:
> qemu-system-x86_64 --enable-kvm -m 5G -cpu host \
> -smp sockets=1,cores=2,threads=2 \
> -snapshot -vga none -display none -nographic \
> -drive if=none,file=/opt/vm/ubuntu-24.04-with-linux-net.qcow2,id=hda \
> -device virtio-blk,drive=hda \
> -device virtio-net,netdev=eth0,addr=03 -netdev user,id=eth0 \
> -device virtio-net,netdev=eth1,addr=04 -netdev socket,id=eth1,mcast=230.0.0.1:1234
>
> git clone git://dpdk.org/dpdk
> cd dpdk/
> meson build-static --werror --default-library=static --debug
> ninja -C build-static
> echo 3 > /proc/sys/vm/drop_caches
> echo 256 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
> modprobe vfio-pci
> lspci
> python3 ./usertools/dpdk-devbind.py --noiommu-mode -b vfio-pci 0000:00:04.0
> ./build-static/app/dpdk-testpmd -l 1,2 --socket-mem 512,0 -a 0000:00:04.0 -- -i
>
> Here is the output:
> EAL: Detected CPU lcores: 4
> EAL: Detected NUMA nodes: 1
> EAL: Detected static linkage of DPDK
> EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> EAL: Selected IOVA mode 'PA'
> EAL: VFIO support initialized
> EAL: Using IOMMU type 8 (No-IOMMU)
> EAL: Getting a vfio_dev_fd for 0000:00:04.0 failed
> PCI_BUS: Cannot get offset of region 0.
> PCI_BUS: fail to disable req notifier.
> PCI_BUS: fail to disable req notifier.
> VIRTIO_INIT: eth_virtio_pci_init(): Failed to init PCI device
> PCI_BUS: Requested device 0000:00:04.0 cannot be used
> EAL: Bus (pci) probe failed.
> testpmd: No probed ethernet devices
> Interactive-mode selected
> testpmd: create a new mbuf pool <mb_pool_0>: n=155456, size=2176, socket=0
> testpmd: preferred mempool ops selected: ring_mp_mc
> Done
> testpmd>
>
> => the problem starts at the line "Getting a vfio_dev_fd for 0000:00:04.0 failed"
> https://git.dpdk.org/dpdk/tree/lib/eal/linux/eal_vfio.c#n966
>
> FWIW, here is the output when it starts correctly:
> EAL: Detected CPU lcores: 4
> EAL: Detected NUMA nodes: 1
> EAL: Detected static linkage of DPDK
> EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> EAL: Selected IOVA mode 'PA'
> EAL: VFIO support initialized
> EAL: Using IOMMU type 8 (No-IOMMU)
> Interactive-mode selected
> Warning: NUMA should be configured manually by using --port-numa-config and
> --ring-numa-config parameters along with --numa.
> testpmd: create a new mbuf pool <mb_pool_0>: n=155456, size=2176, socket=0
> testpmd: preferred mempool ops selected: ring_mp_mc
>
> Warning! port-topology=paired and odd forward ports number, the last port will
> pair with itself.
>
> Configuring Port 0 (socket 0)
> EAL: Error disabling MSI-X interrupts for fd 277
> Port 0: DE:ED:01:E0:1B:75
> Checking link statuses...
> Done
> testpmd>
>
> Any help would be appreciated.
>
> Regards,
> Nicolas
+AlexW
Thanks for the report and especially for the repro steps. This sounds
just like the one reported for the QAT regression also in this thread.
https://lore.kernel.org/linux-pci/aEmS+OQL7IbjdwKs@gcabiddu-mobl.ger.corp.intel.com/T/#m7e8929d6421690dc8bd6dc639d86c2b4db27cbc4
I'm traveling this week, but as your report doesn't have a dependency on
QAT hardware I will try to reproduce next week to understand what's
going on.
Alex - if you have any ideas please let me know.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] PCI: Explicitly put devices into D0 when initializing - Bug report
2025-06-11 13:50 ` Mario Limonciello
2025-06-11 14:30 ` Cabiddu, Giovanni
@ 2025-06-11 16:00 ` Alex Williamson
2025-06-11 16:13 ` Cabiddu, Giovanni
1 sibling, 1 reply; 19+ messages in thread
From: Alex Williamson @ 2025-06-11 16:00 UTC (permalink / raw)
To: Mario Limonciello
Cc: Cabiddu, Giovanni, bhelgaas, mario.limonciello, rafael.j.wysocki,
huang.ying.caritas, stern, linux-pci, mike.ximing.chen,
ahsan.atta, suman.kumar.chakraborty, kvm, linux-crypto,
linux-kernel
On Wed, 11 Jun 2025 06:50:59 -0700
Mario Limonciello <superm1@kernel.org> wrote:
> On 6/11/2025 5:52 AM, Cabiddu, Giovanni wrote:
> > Hi Mario, Bjorn and Alex,
> >
> > On Wed, Apr 23, 2025 at 11:31:32PM -0500, Mario Limonciello wrote:
> >> From: Mario Limonciello <mario.limonciello@amd.com>
> >>
> >> AMD BIOS team has root caused an issue that NVME storage failed to come
> >> back from suspend to a lack of a call to _REG when NVME device was probed.
> >>
> >> commit 112a7f9c8edbf ("PCI/ACPI: Call _REG when transitioning D-states")
> >> added support for calling _REG when transitioning D-states, but this only
> >> works if the device actually "transitions" D-states.
> >>
> >> commit 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI
> >> devices") added support for runtime PM on PCI devices, but never actually
> >> 'explicitly' sets the device to D0.
> >>
> >> To make sure that devices are in D0 and that platform methods such as
> >> _REG are called, explicitly set all devices into D0 during initialization.
> >>
> >> Fixes: 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI devices")
> >> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> >> ---
> > Through a bisect, we identified that this patch, in v6.16-rc1,
> > introduces a regression on vfio-pci across all Intel QuickAssist (QAT)
> > devices. Specifically, the ioctl VFIO_GROUP_GET_DEVICE_FD call fails
> > with -EACCES.
> >
> > Upon further investigation, the -EACCES appears to originate from the
> > rpm_resume() function, which is called by pm_runtime_resume_and_get()
> > within vfio_pci_core_enable(). Here is the exact call trace:
> >
> > drivers/base/power/runtime.c: rpm_resume()
> > drivers/base/power/runtime.c: __pm_runtime_resume()
> > include/linux/pm_runtime.h: pm_runtime_resume_and_get()
> > drivers/vfio/pci/vfio_pci_core.c: vfio_pci_core_enable()
> > drivers/vfio/pci/vfio_pci.c: vfio_pci_open_device()
> > drivers/vfio/vfio_main.c: device->ops->open_device()
> > drivers/vfio/vfio_main.c: vfio_df_device_first_open()
> > drivers/vfio/vfio_main.c: vfio_df_open()
> > drivers/vfio/group.c: vfio_df_group_open()
> > drivers/vfio/group.c: vfio_device_open_file()
> > drivers/vfio/group.c: vfio_group_ioctl_get_device_fd()
> > drivers/vfio/group.c: vfio_group_fops_unl_ioctl(..., VFIO_GROUP_GET_DEVICE_FD, ...)
> >
> > Is this a known issue that affects other devices? Is there any ongoing
> > discussion or fix in progress?
> >
> > Thanks,
> >
>
> This is the first I've heard about an issue with that patch.
>
> Does setting the VFIO parameter disable_idle_d3 help?
>
> If so; this feels like an imbalance of runtime PM calls in the VFIO
> stack that this patch exposed.
>
> Alex, any ideas?
Does the device in question have a PM capability? I note that
4d4c10f763d7 makes the sequence:
pm_runtime_forbid(&dev->dev);
pm_runtime_set_active(&dev->dev);
pm_runtime_enable(&dev->dev);
Dependent on the presence of a PM capability. The PM capability is
optional on SR-IOV VFs. This feels like a bug in the original patch,
we should be able to use pm_runtime ops on a device without
specifically checking if the device supports PCI PM.
vfio-pci also has a somewhat unique sequence versus other drivers, we
don't call pci_enable_device() until the user opens the device, but we
want to put the device into low power before that occurs. Historically
PCI-core left device in an unknown power state between driver uses, so
we've needed to manually move the device to D0 before calling
pm_runtime_allow() and pm_runtime_put() (see
vfio_pci_core_register_device()). Possibly this is redundant now but
we're using pci_set_power_state() which shouldn't interact with
pm_runtime, so my initial guess is that we might be unbalanced because
this is a VF w/o a PM capability and we've missed the expected
pm_runtime initialization sequence. Thanks,
Alex
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] PCI: Explicitly put devices into D0 when initializing
2025-06-11 14:56 ` Mario Limonciello
@ 2025-06-11 16:05 ` Alex Williamson
0 siblings, 0 replies; 19+ messages in thread
From: Alex Williamson @ 2025-06-11 16:05 UTC (permalink / raw)
To: Mario Limonciello
Cc: nicolas.dichtel, Bjorn Helgaas, mario.limonciello, bhelgaas,
rafael.j.wysocki, huang.ying.caritas, stern, linux-pci,
Olivier MATZ, dev@dpdk.org
On Wed, 11 Jun 2025 07:56:21 -0700
Mario Limonciello <superm1@kernel.org> wrote:
> On 6/11/2025 7:14 AM, Nicolas Dichtel wrote:
> > Le 06/05/2025 à 01:06, Bjorn Helgaas a écrit :
> >> On Wed, Apr 23, 2025 at 11:31:32PM -0500, Mario Limonciello wrote:
> >>> From: Mario Limonciello <mario.limonciello@amd.com>
> >>>
> >>> AMD BIOS team has root caused an issue that NVME storage failed to come
> >>> back from suspend to a lack of a call to _REG when NVME device was probed.
> >>>
> >>> commit 112a7f9c8edbf ("PCI/ACPI: Call _REG when transitioning D-states")
> >>> added support for calling _REG when transitioning D-states, but this only
> >>> works if the device actually "transitions" D-states.
> >>>
> >>> commit 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI
> >>> devices") added support for runtime PM on PCI devices, but never actually
> >>> 'explicitly' sets the device to D0.
> >>>
> >>> To make sure that devices are in D0 and that platform methods such as
> >>> _REG are called, explicitly set all devices into D0 during initialization.
> >>>
> >>> Fixes: 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI devices")
> >>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> >>
> >> Applied to pci/pm for v6.16, thanks!
> >>
> >
> > I've a regression after this commit.
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4d4c10f763d7
> >
> > I've started a QEMU with "-cpu host" on an AMD (AMD Ryzen 5 3600 6-Core
> > Processor) machine + virtio-net interfaces. When I try to start a testpmd (a
> > DPDK app), it cannot find the virtio port. The ioctl VFIO_GROUP_GET_DEVICE_FD fails.
> >
> > To reproduce the issue:
> > qemu-system-x86_64 --enable-kvm -m 5G -cpu host \
> > -smp sockets=1,cores=2,threads=2 \
> > -snapshot -vga none -display none -nographic \
> > -drive if=none,file=/opt/vm/ubuntu-24.04-with-linux-net.qcow2,id=hda \
> > -device virtio-blk,drive=hda \
> > -device virtio-net,netdev=eth0,addr=03 -netdev user,id=eth0 \
> > -device virtio-net,netdev=eth1,addr=04 -netdev socket,id=eth1,mcast=230.0.0.1:1234
> >
> > git clone git://dpdk.org/dpdk
> > cd dpdk/
> > meson build-static --werror --default-library=static --debug
> > ninja -C build-static
> > echo 3 > /proc/sys/vm/drop_caches
> > echo 256 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
> > modprobe vfio-pci
> > lspci
> > python3 ./usertools/dpdk-devbind.py --noiommu-mode -b vfio-pci 0000:00:04.0
> > ./build-static/app/dpdk-testpmd -l 1,2 --socket-mem 512,0 -a 0000:00:04.0 -- -i
> >
> > Here is the output:
> > EAL: Detected CPU lcores: 4
> > EAL: Detected NUMA nodes: 1
> > EAL: Detected static linkage of DPDK
> > EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> > EAL: Selected IOVA mode 'PA'
> > EAL: VFIO support initialized
> > EAL: Using IOMMU type 8 (No-IOMMU)
> > EAL: Getting a vfio_dev_fd for 0000:00:04.0 failed
> > PCI_BUS: Cannot get offset of region 0.
> > PCI_BUS: fail to disable req notifier.
> > PCI_BUS: fail to disable req notifier.
> > VIRTIO_INIT: eth_virtio_pci_init(): Failed to init PCI device
> > PCI_BUS: Requested device 0000:00:04.0 cannot be used
> > EAL: Bus (pci) probe failed.
> > testpmd: No probed ethernet devices
> > Interactive-mode selected
> > testpmd: create a new mbuf pool <mb_pool_0>: n=155456, size=2176, socket=0
> > testpmd: preferred mempool ops selected: ring_mp_mc
> > Done
> > testpmd>
> >
> > => the problem starts at the line "Getting a vfio_dev_fd for 0000:00:04.0 failed"
> > https://git.dpdk.org/dpdk/tree/lib/eal/linux/eal_vfio.c#n966
> >
> > FWIW, here is the output when it starts correctly:
> > EAL: Detected CPU lcores: 4
> > EAL: Detected NUMA nodes: 1
> > EAL: Detected static linkage of DPDK
> > EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> > EAL: Selected IOVA mode 'PA'
> > EAL: VFIO support initialized
> > EAL: Using IOMMU type 8 (No-IOMMU)
> > Interactive-mode selected
> > Warning: NUMA should be configured manually by using --port-numa-config and
> > --ring-numa-config parameters along with --numa.
> > testpmd: create a new mbuf pool <mb_pool_0>: n=155456, size=2176, socket=0
> > testpmd: preferred mempool ops selected: ring_mp_mc
> >
> > Warning! port-topology=paired and odd forward ports number, the last port will
> > pair with itself.
> >
> > Configuring Port 0 (socket 0)
> > EAL: Error disabling MSI-X interrupts for fd 277
> > Port 0: DE:ED:01:E0:1B:75
> > Checking link statuses...
> > Done
> > testpmd>
> >
> > Any help would be appreciated.
> >
> > Regards,
> > Nicolas
>
> +AlexW
>
> Thanks for the report and especially for the repro steps. This sounds
> just like the one reported for the QAT regression also in this thread.
>
> https://lore.kernel.org/linux-pci/aEmS+OQL7IbjdwKs@gcabiddu-mobl.ger.corp.intel.com/T/#m7e8929d6421690dc8bd6dc639d86c2b4db27cbc4
>
> I'm traveling this week, but as your report doesn't have a dependency on
> QAT hardware I will try to reproduce next week to understand what's
> going on.
>
> Alex - if you have any ideas please let me know.
Note that this instantiation of the virtio-net device creates it as a
non-PCIe device, where QEMU only seems to create a PM capability when
the device is exposed as PCIe. Therefore this could also be a
manifestation that we've made pm_runtime initialization dependent on
the device having a PM capability. Thanks,
Alex
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] PCI: Explicitly put devices into D0 when initializing - Bug report
2025-06-11 16:00 ` Alex Williamson
@ 2025-06-11 16:13 ` Cabiddu, Giovanni
2025-06-11 20:45 ` Mario Limonciello
0 siblings, 1 reply; 19+ messages in thread
From: Cabiddu, Giovanni @ 2025-06-11 16:13 UTC (permalink / raw)
To: Alex Williamson
Cc: Mario Limonciello, bhelgaas, mario.limonciello, rafael.j.wysocki,
huang.ying.caritas, stern, linux-pci, mike.ximing.chen,
ahsan.atta, suman.kumar.chakraborty, kvm, linux-crypto,
linux-kernel
On Wed, Jun 11, 2025 at 10:00:02AM -0600, Alex Williamson wrote:
> On Wed, 11 Jun 2025 06:50:59 -0700
> Mario Limonciello <superm1@kernel.org> wrote:
>
> > On 6/11/2025 5:52 AM, Cabiddu, Giovanni wrote:
> > > Hi Mario, Bjorn and Alex,
> > >
> > > On Wed, Apr 23, 2025 at 11:31:32PM -0500, Mario Limonciello wrote:
> > >> From: Mario Limonciello <mario.limonciello@amd.com>
> > >>
> > >> AMD BIOS team has root caused an issue that NVME storage failed to come
> > >> back from suspend to a lack of a call to _REG when NVME device was probed.
> > >>
> > >> commit 112a7f9c8edbf ("PCI/ACPI: Call _REG when transitioning D-states")
> > >> added support for calling _REG when transitioning D-states, but this only
> > >> works if the device actually "transitions" D-states.
> > >>
> > >> commit 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI
> > >> devices") added support for runtime PM on PCI devices, but never actually
> > >> 'explicitly' sets the device to D0.
> > >>
> > >> To make sure that devices are in D0 and that platform methods such as
> > >> _REG are called, explicitly set all devices into D0 during initialization.
> > >>
> > >> Fixes: 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI devices")
> > >> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> > >> ---
> > > Through a bisect, we identified that this patch, in v6.16-rc1,
> > > introduces a regression on vfio-pci across all Intel QuickAssist (QAT)
> > > devices. Specifically, the ioctl VFIO_GROUP_GET_DEVICE_FD call fails
> > > with -EACCES.
> > >
> > > Upon further investigation, the -EACCES appears to originate from the
> > > rpm_resume() function, which is called by pm_runtime_resume_and_get()
> > > within vfio_pci_core_enable(). Here is the exact call trace:
> > >
> > > drivers/base/power/runtime.c: rpm_resume()
> > > drivers/base/power/runtime.c: __pm_runtime_resume()
> > > include/linux/pm_runtime.h: pm_runtime_resume_and_get()
> > > drivers/vfio/pci/vfio_pci_core.c: vfio_pci_core_enable()
> > > drivers/vfio/pci/vfio_pci.c: vfio_pci_open_device()
> > > drivers/vfio/vfio_main.c: device->ops->open_device()
> > > drivers/vfio/vfio_main.c: vfio_df_device_first_open()
> > > drivers/vfio/vfio_main.c: vfio_df_open()
> > > drivers/vfio/group.c: vfio_df_group_open()
> > > drivers/vfio/group.c: vfio_device_open_file()
> > > drivers/vfio/group.c: vfio_group_ioctl_get_device_fd()
> > > drivers/vfio/group.c: vfio_group_fops_unl_ioctl(..., VFIO_GROUP_GET_DEVICE_FD, ...)
> > >
> > > Is this a known issue that affects other devices? Is there any ongoing
> > > discussion or fix in progress?
> > >
> > > Thanks,
> > >
> >
> > This is the first I've heard about an issue with that patch.
> >
> > Does setting the VFIO parameter disable_idle_d3 help?
> >
> > If so; this feels like an imbalance of runtime PM calls in the VFIO
> > stack that this patch exposed.
> >
> > Alex, any ideas?
>
> Does the device in question have a PM capability? I note that
> 4d4c10f763d7 makes the sequence:
>
> pm_runtime_forbid(&dev->dev);
> pm_runtime_set_active(&dev->dev);
> pm_runtime_enable(&dev->dev);
>
> Dependent on the presence of a PM capability. The PM capability is
> optional on SR-IOV VFs. This feels like a bug in the original patch,
> we should be able to use pm_runtime ops on a device without
> specifically checking if the device supports PCI PM.
>
> vfio-pci also has a somewhat unique sequence versus other drivers, we
> don't call pci_enable_device() until the user opens the device, but we
> want to put the device into low power before that occurs. Historically
> PCI-core left device in an unknown power state between driver uses, so
> we've needed to manually move the device to D0 before calling
> pm_runtime_allow() and pm_runtime_put() (see
> vfio_pci_core_register_device()). Possibly this is redundant now but
> we're using pci_set_power_state() which shouldn't interact with
> pm_runtime, so my initial guess is that we might be unbalanced because
> this is a VF w/o a PM capability and we've missed the expected
> pm_runtime initialization sequence. Thanks,
Yes, for Intel QAT, the issue occurs with a VF without the PM capability.
Thanks,
--
Giovanni
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] PCI: Explicitly put devices into D0 when initializing - Bug report
2025-06-11 16:13 ` Cabiddu, Giovanni
@ 2025-06-11 20:45 ` Mario Limonciello
2025-06-11 22:07 ` Giovanni Cabiddu
2025-06-19 9:11 ` Alexey Kardashevskiy
0 siblings, 2 replies; 19+ messages in thread
From: Mario Limonciello @ 2025-06-11 20:45 UTC (permalink / raw)
To: Cabiddu, Giovanni, Alex Williamson
Cc: bhelgaas, mario.limonciello, rafael.j.wysocki, huang.ying.caritas,
stern, linux-pci, mike.ximing.chen, ahsan.atta,
suman.kumar.chakraborty, kvm, linux-crypto, linux-kernel
On 6/11/2025 9:13 AM, Cabiddu, Giovanni wrote:
> On Wed, Jun 11, 2025 at 10:00:02AM -0600, Alex Williamson wrote:
>> On Wed, 11 Jun 2025 06:50:59 -0700
>> Mario Limonciello <superm1@kernel.org> wrote:
>>
>>> On 6/11/2025 5:52 AM, Cabiddu, Giovanni wrote:
>>>> Hi Mario, Bjorn and Alex,
>>>>
>>>> On Wed, Apr 23, 2025 at 11:31:32PM -0500, Mario Limonciello wrote:
>>>>> From: Mario Limonciello <mario.limonciello@amd.com>
>>>>>
>>>>> AMD BIOS team has root caused an issue that NVME storage failed to come
>>>>> back from suspend to a lack of a call to _REG when NVME device was probed.
>>>>>
>>>>> commit 112a7f9c8edbf ("PCI/ACPI: Call _REG when transitioning D-states")
>>>>> added support for calling _REG when transitioning D-states, but this only
>>>>> works if the device actually "transitions" D-states.
>>>>>
>>>>> commit 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI
>>>>> devices") added support for runtime PM on PCI devices, but never actually
>>>>> 'explicitly' sets the device to D0.
>>>>>
>>>>> To make sure that devices are in D0 and that platform methods such as
>>>>> _REG are called, explicitly set all devices into D0 during initialization.
>>>>>
>>>>> Fixes: 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI devices")
>>>>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>>>>> ---
>>>> Through a bisect, we identified that this patch, in v6.16-rc1,
>>>> introduces a regression on vfio-pci across all Intel QuickAssist (QAT)
>>>> devices. Specifically, the ioctl VFIO_GROUP_GET_DEVICE_FD call fails
>>>> with -EACCES.
>>>>
>>>> Upon further investigation, the -EACCES appears to originate from the
>>>> rpm_resume() function, which is called by pm_runtime_resume_and_get()
>>>> within vfio_pci_core_enable(). Here is the exact call trace:
>>>>
>>>> drivers/base/power/runtime.c: rpm_resume()
>>>> drivers/base/power/runtime.c: __pm_runtime_resume()
>>>> include/linux/pm_runtime.h: pm_runtime_resume_and_get()
>>>> drivers/vfio/pci/vfio_pci_core.c: vfio_pci_core_enable()
>>>> drivers/vfio/pci/vfio_pci.c: vfio_pci_open_device()
>>>> drivers/vfio/vfio_main.c: device->ops->open_device()
>>>> drivers/vfio/vfio_main.c: vfio_df_device_first_open()
>>>> drivers/vfio/vfio_main.c: vfio_df_open()
>>>> drivers/vfio/group.c: vfio_df_group_open()
>>>> drivers/vfio/group.c: vfio_device_open_file()
>>>> drivers/vfio/group.c: vfio_group_ioctl_get_device_fd()
>>>> drivers/vfio/group.c: vfio_group_fops_unl_ioctl(..., VFIO_GROUP_GET_DEVICE_FD, ...)
>>>>
>>>> Is this a known issue that affects other devices? Is there any ongoing
>>>> discussion or fix in progress?
>>>>
>>>> Thanks,
>>>>
>>>
>>> This is the first I've heard about an issue with that patch.
>>>
>>> Does setting the VFIO parameter disable_idle_d3 help?
>>>
>>> If so; this feels like an imbalance of runtime PM calls in the VFIO
>>> stack that this patch exposed.
>>>
>>> Alex, any ideas?
>>
>> Does the device in question have a PM capability? I note that
>> 4d4c10f763d7 makes the sequence:
>>
>> pm_runtime_forbid(&dev->dev);
>> pm_runtime_set_active(&dev->dev);
>> pm_runtime_enable(&dev->dev);
>>
>> Dependent on the presence of a PM capability. The PM capability is
>> optional on SR-IOV VFs. This feels like a bug in the original patch,
>> we should be able to use pm_runtime ops on a device without
>> specifically checking if the device supports PCI PM.
>>
>> vfio-pci also has a somewhat unique sequence versus other drivers, we
>> don't call pci_enable_device() until the user opens the device, but we
>> want to put the device into low power before that occurs. Historically
>> PCI-core left device in an unknown power state between driver uses, so
>> we've needed to manually move the device to D0 before calling
>> pm_runtime_allow() and pm_runtime_put() (see
>> vfio_pci_core_register_device()). Possibly this is redundant now but
>> we're using pci_set_power_state() which shouldn't interact with
>> pm_runtime, so my initial guess is that we might be unbalanced because
>> this is a VF w/o a PM capability and we've missed the expected
>> pm_runtime initialization sequence. Thanks,
>
> Yes, for Intel QAT, the issue occurs with a VF without the PM capability.
>
> Thanks,
>
Got it, thanks Alex! I think this should help return it to previous
behavior for devices without runtime PM and still fix the problem it
needed to.
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 3dd44d1ad829..c495c3c692f5 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3221,15 +3221,17 @@ void pci_pm_init(struct pci_dev *dev)
/* find PCI PM capability in list */
pm = pci_find_capability(dev, PCI_CAP_ID_PM);
- if (!pm)
+ if (!pm) {
+ goto poweron;
return;
+ }
/* Check device's ability to generate PME# */
pci_read_config_word(dev, pm + PCI_PM_PMC, &pmc);
if ((pmc & PCI_PM_CAP_VER_MASK) > 3) {
pci_err(dev, "unsupported PM cap regs version (%u)\n",
pmc & PCI_PM_CAP_VER_MASK);
- return;
+ goto poweron;
}
dev->pm_cap = pm;
@@ -3274,6 +3276,7 @@ void pci_pm_init(struct pci_dev *dev)
pci_read_config_word(dev, PCI_STATUS, &status);
if (status & PCI_STATUS_IMM_READY)
dev->imm_ready = 1;
+poweron:
pci_pm_power_up_and_verify_state(dev);
pm_runtime_forbid(&dev->dev);
pm_runtime_set_active(&dev->dev);
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v2] PCI: Explicitly put devices into D0 when initializing - Bug report
2025-06-11 20:45 ` Mario Limonciello
@ 2025-06-11 22:07 ` Giovanni Cabiddu
2025-06-19 9:11 ` Alexey Kardashevskiy
1 sibling, 0 replies; 19+ messages in thread
From: Giovanni Cabiddu @ 2025-06-11 22:07 UTC (permalink / raw)
To: Mario Limonciello
Cc: Alex Williamson, bhelgaas, mario.limonciello, rafael.j.wysocki,
huang.ying.caritas, stern, linux-pci, mike.ximing.chen,
ahsan.atta, suman.kumar.chakraborty, kvm, linux-crypto,
linux-kernel
On Wed, Jun 11, 2025 at 01:45:49PM -0700, Mario Limonciello wrote:
> On 6/11/2025 9:13 AM, Cabiddu, Giovanni wrote:
> > On Wed, Jun 11, 2025 at 10:00:02AM -0600, Alex Williamson wrote:
> > > On Wed, 11 Jun 2025 06:50:59 -0700
> > > Mario Limonciello <superm1@kernel.org> wrote:
> > >
> > > > On 6/11/2025 5:52 AM, Cabiddu, Giovanni wrote:
> > > > > Hi Mario, Bjorn and Alex,
> > > > >
> > > > > On Wed, Apr 23, 2025 at 11:31:32PM -0500, Mario Limonciello wrote:
> > > > > > From: Mario Limonciello <mario.limonciello@amd.com>
> > > > > >
> > > > > > AMD BIOS team has root caused an issue that NVME storage failed to come
> > > > > > back from suspend to a lack of a call to _REG when NVME device was probed.
> > > > > >
> > > > > > commit 112a7f9c8edbf ("PCI/ACPI: Call _REG when transitioning D-states")
> > > > > > added support for calling _REG when transitioning D-states, but this only
> > > > > > works if the device actually "transitions" D-states.
> > > > > >
> > > > > > commit 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI
> > > > > > devices") added support for runtime PM on PCI devices, but never actually
> > > > > > 'explicitly' sets the device to D0.
> > > > > >
> > > > > > To make sure that devices are in D0 and that platform methods such as
> > > > > > _REG are called, explicitly set all devices into D0 during initialization.
> > > > > >
> > > > > > Fixes: 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI devices")
> > > > > > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> > > > > > ---
> > > > > Through a bisect, we identified that this patch, in v6.16-rc1,
> > > > > introduces a regression on vfio-pci across all Intel QuickAssist (QAT)
> > > > > devices. Specifically, the ioctl VFIO_GROUP_GET_DEVICE_FD call fails
> > > > > with -EACCES.
> > > > >
> > > > > Upon further investigation, the -EACCES appears to originate from the
> > > > > rpm_resume() function, which is called by pm_runtime_resume_and_get()
> > > > > within vfio_pci_core_enable(). Here is the exact call trace:
> > > > >
> > > > > drivers/base/power/runtime.c: rpm_resume()
> > > > > drivers/base/power/runtime.c: __pm_runtime_resume()
> > > > > include/linux/pm_runtime.h: pm_runtime_resume_and_get()
> > > > > drivers/vfio/pci/vfio_pci_core.c: vfio_pci_core_enable()
> > > > > drivers/vfio/pci/vfio_pci.c: vfio_pci_open_device()
> > > > > drivers/vfio/vfio_main.c: device->ops->open_device()
> > > > > drivers/vfio/vfio_main.c: vfio_df_device_first_open()
> > > > > drivers/vfio/vfio_main.c: vfio_df_open()
> > > > > drivers/vfio/group.c: vfio_df_group_open()
> > > > > drivers/vfio/group.c: vfio_device_open_file()
> > > > > drivers/vfio/group.c: vfio_group_ioctl_get_device_fd()
> > > > > drivers/vfio/group.c: vfio_group_fops_unl_ioctl(..., VFIO_GROUP_GET_DEVICE_FD, ...)
> > > > >
> > > > > Is this a known issue that affects other devices? Is there any ongoing
> > > > > discussion or fix in progress?
> > > > >
> > > > > Thanks,
> > > >
> > > > This is the first I've heard about an issue with that patch.
> > > >
> > > > Does setting the VFIO parameter disable_idle_d3 help?
> > > >
> > > > If so; this feels like an imbalance of runtime PM calls in the VFIO
> > > > stack that this patch exposed.
> > > >
> > > > Alex, any ideas?
> > >
> > > Does the device in question have a PM capability? I note that
> > > 4d4c10f763d7 makes the sequence:
> > >
> > > pm_runtime_forbid(&dev->dev);
> > > pm_runtime_set_active(&dev->dev);
> > > pm_runtime_enable(&dev->dev);
> > >
> > > Dependent on the presence of a PM capability. The PM capability is
> > > optional on SR-IOV VFs. This feels like a bug in the original patch,
> > > we should be able to use pm_runtime ops on a device without
> > > specifically checking if the device supports PCI PM.
> > >
> > > vfio-pci also has a somewhat unique sequence versus other drivers, we
> > > don't call pci_enable_device() until the user opens the device, but we
> > > want to put the device into low power before that occurs. Historically
> > > PCI-core left device in an unknown power state between driver uses, so
> > > we've needed to manually move the device to D0 before calling
> > > pm_runtime_allow() and pm_runtime_put() (see
> > > vfio_pci_core_register_device()). Possibly this is redundant now but
> > > we're using pci_set_power_state() which shouldn't interact with
> > > pm_runtime, so my initial guess is that we might be unbalanced because
> > > this is a VF w/o a PM capability and we've missed the expected
> > > pm_runtime initialization sequence. Thanks,
> >
> > Yes, for Intel QAT, the issue occurs with a VF without the PM capability.
> >
> > Thanks,
> >
>
> Got it, thanks Alex! I think this should help return it to previous
> behavior for devices without runtime PM and still fix the problem it needed
> to.
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 3dd44d1ad829..c495c3c692f5 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3221,15 +3221,17 @@ void pci_pm_init(struct pci_dev *dev)
>
> /* find PCI PM capability in list */
> pm = pci_find_capability(dev, PCI_CAP_ID_PM);
> - if (!pm)
> + if (!pm) {
> + goto poweron;
> return;
> + }
> /* Check device's ability to generate PME# */
> pci_read_config_word(dev, pm + PCI_PM_PMC, &pmc);
>
> if ((pmc & PCI_PM_CAP_VER_MASK) > 3) {
> pci_err(dev, "unsupported PM cap regs version (%u)\n",
> pmc & PCI_PM_CAP_VER_MASK);
> - return;
> + goto poweron;
> }
>
> dev->pm_cap = pm;
> @@ -3274,6 +3276,7 @@ void pci_pm_init(struct pci_dev *dev)
> pci_read_config_word(dev, PCI_STATUS, &status);
> if (status & PCI_STATUS_IMM_READY)
> dev->imm_ready = 1;
> +poweron:
> pci_pm_power_up_and_verify_state(dev);
> pm_runtime_forbid(&dev->dev);
> pm_runtime_set_active(&dev->dev);
I tried this change and it works.
Thanks,
--
Giovanni
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] PCI: Explicitly put devices into D0 when initializing - Bug report
2025-06-11 20:45 ` Mario Limonciello
2025-06-11 22:07 ` Giovanni Cabiddu
@ 2025-06-19 9:11 ` Alexey Kardashevskiy
1 sibling, 0 replies; 19+ messages in thread
From: Alexey Kardashevskiy @ 2025-06-19 9:11 UTC (permalink / raw)
To: Mario Limonciello, Cabiddu, Giovanni, Alex Williamson
Cc: bhelgaas, mario.limonciello, rafael.j.wysocki, huang.ying.caritas,
stern, linux-pci, mike.ximing.chen, ahsan.atta,
suman.kumar.chakraborty, kvm, linux-crypto, linux-kernel
On 12/6/25 06:45, Mario Limonciello wrote:
> On 6/11/2025 9:13 AM, Cabiddu, Giovanni wrote:
>> On Wed, Jun 11, 2025 at 10:00:02AM -0600, Alex Williamson wrote:
>>> On Wed, 11 Jun 2025 06:50:59 -0700
>>> Mario Limonciello <superm1@kernel.org> wrote:
>>>
>>>> On 6/11/2025 5:52 AM, Cabiddu, Giovanni wrote:
>>>>> Hi Mario, Bjorn and Alex,
>>>>>
>>>>> On Wed, Apr 23, 2025 at 11:31:32PM -0500, Mario Limonciello wrote:
>>>>>> From: Mario Limonciello <mario.limonciello@amd.com>
>>>>>>
>>>>>> AMD BIOS team has root caused an issue that NVME storage failed to come
>>>>>> back from suspend to a lack of a call to _REG when NVME device was probed.
>>>>>>
>>>>>> commit 112a7f9c8edbf ("PCI/ACPI: Call _REG when transitioning D-states")
>>>>>> added support for calling _REG when transitioning D-states, but this only
>>>>>> works if the device actually "transitions" D-states.
>>>>>>
>>>>>> commit 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI
>>>>>> devices") added support for runtime PM on PCI devices, but never actually
>>>>>> 'explicitly' sets the device to D0.
>>>>>>
>>>>>> To make sure that devices are in D0 and that platform methods such as
>>>>>> _REG are called, explicitly set all devices into D0 during initialization.
>>>>>>
>>>>>> Fixes: 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI devices")
>>>>>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>>>>>> ---
>>>>> Through a bisect, we identified that this patch, in v6.16-rc1,
>>>>> introduces a regression on vfio-pci across all Intel QuickAssist (QAT)
>>>>> devices. Specifically, the ioctl VFIO_GROUP_GET_DEVICE_FD call fails
>>>>> with -EACCES.
>>>>>
>>>>> Upon further investigation, the -EACCES appears to originate from the
>>>>> rpm_resume() function, which is called by pm_runtime_resume_and_get()
>>>>> within vfio_pci_core_enable(). Here is the exact call trace:
>>>>>
>>>>> drivers/base/power/runtime.c: rpm_resume()
>>>>> drivers/base/power/runtime.c: __pm_runtime_resume()
>>>>> include/linux/pm_runtime.h: pm_runtime_resume_and_get()
>>>>> drivers/vfio/pci/vfio_pci_core.c: vfio_pci_core_enable()
>>>>> drivers/vfio/pci/vfio_pci.c: vfio_pci_open_device()
>>>>> drivers/vfio/vfio_main.c: device->ops->open_device()
>>>>> drivers/vfio/vfio_main.c: vfio_df_device_first_open()
>>>>> drivers/vfio/vfio_main.c: vfio_df_open()
>>>>> drivers/vfio/group.c: vfio_df_group_open()
>>>>> drivers/vfio/group.c: vfio_device_open_file()
>>>>> drivers/vfio/group.c: vfio_group_ioctl_get_device_fd()
>>>>> drivers/vfio/group.c: vfio_group_fops_unl_ioctl(..., VFIO_GROUP_GET_DEVICE_FD, ...)
>>>>>
>>>>> Is this a known issue that affects other devices? Is there any ongoing
>>>>> discussion or fix in progress?
>>>>>
>>>>> Thanks,
>>>>
>>>> This is the first I've heard about an issue with that patch.
>>>>
>>>> Does setting the VFIO parameter disable_idle_d3 help?
>>>>
>>>> If so; this feels like an imbalance of runtime PM calls in the VFIO
>>>> stack that this patch exposed.
>>>>
>>>> Alex, any ideas?
>>>
>>> Does the device in question have a PM capability? I note that
>>> 4d4c10f763d7 makes the sequence:
>>>
>>> pm_runtime_forbid(&dev->dev);
>>> pm_runtime_set_active(&dev->dev);
>>> pm_runtime_enable(&dev->dev);
>>>
>>> Dependent on the presence of a PM capability. The PM capability is
>>> optional on SR-IOV VFs. This feels like a bug in the original patch,
>>> we should be able to use pm_runtime ops on a device without
>>> specifically checking if the device supports PCI PM.
>>>
>>> vfio-pci also has a somewhat unique sequence versus other drivers, we
>>> don't call pci_enable_device() until the user opens the device, but we
>>> want to put the device into low power before that occurs. Historically
>>> PCI-core left device in an unknown power state between driver uses, so
>>> we've needed to manually move the device to D0 before calling
>>> pm_runtime_allow() and pm_runtime_put() (see
>>> vfio_pci_core_register_device()). Possibly this is redundant now but
>>> we're using pci_set_power_state() which shouldn't interact with
>>> pm_runtime, so my initial guess is that we might be unbalanced because
>>> this is a VF w/o a PM capability and we've missed the expected
>>> pm_runtime initialization sequence. Thanks,
>>
>> Yes, for Intel QAT, the issue occurs with a VF without the PM capability.
>>
>> Thanks,
>>
>
> Got it, thanks Alex! I think this should help return it to previous behavior for devices without runtime PM and still fix the problem it needed to.
Seems working for me too, thanks,
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 3dd44d1ad829..c495c3c692f5 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3221,15 +3221,17 @@ void pci_pm_init(struct pci_dev *dev)
>
> /* find PCI PM capability in list */
> pm = pci_find_capability(dev, PCI_CAP_ID_PM);
> - if (!pm)
> + if (!pm) {
> + goto poweron;
> return;
> + }
> /* Check device's ability to generate PME# */
> pci_read_config_word(dev, pm + PCI_PM_PMC, &pmc);
>
> if ((pmc & PCI_PM_CAP_VER_MASK) > 3) {
> pci_err(dev, "unsupported PM cap regs version (%u)\n",
> pmc & PCI_PM_CAP_VER_MASK);
> - return;
> + goto poweron;
> }
>
> dev->pm_cap = pm;
> @@ -3274,6 +3276,7 @@ void pci_pm_init(struct pci_dev *dev)
> pci_read_config_word(dev, PCI_STATUS, &status);
> if (status & PCI_STATUS_IMM_READY)
> dev->imm_ready = 1;
> +poweron:
> pci_pm_power_up_and_verify_state(dev);
> pm_runtime_forbid(&dev->dev);
> pm_runtime_set_active(&dev->dev);
--
Alexey
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2025-06-19 9:11 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-24 4:31 [PATCH v2] PCI: Explicitly put devices into D0 when initializing Mario Limonciello
2025-04-24 15:07 ` Wysocki, Rafael J
2025-04-24 18:20 ` Mario Limonciello
2025-04-27 12:54 ` Denis Benato
2025-05-05 19:20 ` Perry, David
2025-05-05 20:01 ` Mario Limonciello
2025-05-05 23:06 ` Bjorn Helgaas
2025-06-11 14:14 ` Nicolas Dichtel
2025-06-11 14:56 ` Mario Limonciello
2025-06-11 16:05 ` Alex Williamson
2025-06-11 12:52 ` [PATCH v2] PCI: Explicitly put devices into D0 when initializing - Bug report Cabiddu, Giovanni
2025-06-11 13:50 ` Mario Limonciello
2025-06-11 14:30 ` Cabiddu, Giovanni
2025-06-11 16:00 ` Alex Williamson
2025-06-11 16:13 ` Cabiddu, Giovanni
2025-06-11 20:45 ` Mario Limonciello
2025-06-11 22:07 ` Giovanni Cabiddu
2025-06-19 9:11 ` Alexey Kardashevskiy
[not found] <BY5PR19MB3922D2994064085132D8FEBD9A822@BY5PR19MB3922.namprd19.prod.outlook.com>
2025-05-01 14:14 ` [PATCH v2] PCI: Explicitly put devices into D0 when initializing Shen, Yijun
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).