* [PATCH v9 0/3] PCI: vmd: Enable PCI PM's L1 substates of remapped PCIe Root Port and NVMe
@ 2024-09-24 7:05 Jian-Hong Pan
2024-09-24 7:14 ` [PATCH v9 1/3] PCI: vmd: Set PCI devices to D0 before enable PCI PM's L1 substates Jian-Hong Pan
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Jian-Hong Pan @ 2024-09-24 7:05 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Johan Hovold, David Box, Ilpo Järvinen,
Kuppuswamy Sathyanarayanan, Nirmal Patel, Jonathan Derrick,
linux-pci, linux-kernel, linux, Jian-Hong Pan
Notice the VMD remapped PCIe Root Port and NVMe have PCI PM L1 substates
capability, but they are disabled originally.
Here is a failed example on ASUS B1400CEAE with enabled VMD:
10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core Processor PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal decode])
...
Capabilities: [200 v1] L1 PM Substates
L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
T_CommonMode=0us LTR1.2_Threshold=0ns
L1SubCtl2: T_PwrOn=0us
10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD Blue SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express])
...
Capabilities: [900 v1] L1 PM Substates
L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
T_CommonMode=0us LTR1.2_Threshold=101376ns
L1SubCtl2: T_PwrOn=50us
According to "PCIe r6.0, sec 5.5.4", to config the link between the PCIe
Root Port and the child device correctly:
* Ensure both devices are in D0 before enabling PCI-PM L1 PM Substates.
* Ensure L1.2 parameters: Common_Mode_Restore_Times, T_POWER_ON and
LTR_L1.2_THRESHOLD are programmed properly on both devices before enable
bits for L1.2.
Prepare this series to fix that.
Jian-Hong Pan (3):
PCI: vmd: Set PCI devices to D0 before enable PCI PM's L1 substates
PCI/ASPM: Add notes about enabling PCI-PM L1SS to
pci_enable_link_state(_locked)
PCI: vmd: Save/restore PCIe bridge states before/after pci_reset_bus()
drivers/pci/controller/vmd.c | 20 ++++++++++++++++----
drivers/pci/pcie/aspm.c | 6 ++++++
2 files changed, 22 insertions(+), 4 deletions(-)
--
2.46.1
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v9 1/3] PCI: vmd: Set PCI devices to D0 before enable PCI PM's L1 substates
2024-09-24 7:05 [PATCH v9 0/3] PCI: vmd: Enable PCI PM's L1 substates of remapped PCIe Root Port and NVMe Jian-Hong Pan
@ 2024-09-24 7:14 ` Jian-Hong Pan
2024-09-24 7:26 ` [PATCH v9 2/3] PCI/ASPM: Add notes about enabling PCI-PM L1SS to pci_enable_link_state(_locked) Jian-Hong Pan
2024-09-24 7:29 ` [PATCH v9 3/3] PCI: vmd: Save/restore PCIe bridge states before/after pci_reset_bus() Jian-Hong Pan
2 siblings, 0 replies; 10+ messages in thread
From: Jian-Hong Pan @ 2024-09-24 7:14 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Johan Hovold, David Box, Ilpo Järvinen,
Kuppuswamy Sathyanarayanan, Nirmal Patel, Jonathan Derrick,
linux-pci, linux-kernel, linux, Jian-Hong Pan
The remapped PCIe Root Port and the child device have PCI PM L1 substates
capability, but they are disabled originally.
Here is a failed example on ASUS B1400CEAE:
Capabilities: [900 v1] L1 PM Substates
L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
T_CommonMode=0us LTR1.2_Threshold=101376ns
L1SubCtl2: T_PwrOn=50us
Power on all of the VMD remapped PCI devices to D0 before enable PCI-PM L1
PM Substates by following "PCIe r6.0, sec 5.5.4".
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218394
Signed-off-by: Jian-Hong Pan <jhp@endlessos.org>
---
v2:
- Power on the VMD remapped devices with pci_set_power_state_locked()
- Prepare the PCIe LTR parameters before enable L1 Substates
- Add note into the comments of both pci_enable_link_state() and
pci_enable_link_state_locked() for kernel-doc.
- The original patch set can be split as individual patches.
v3:
- Re-send for the missed version information.
- Split drivers/pci/pcie/aspm.c modification into following patches.
- Fix the comment for enasuring the PCI devices in D0.
v4:
- The same
v5:
- Tweak the commit title and message
- Change the goto label from out_enable_link_state to out_state_change
v6~8:
- The same
v9:
- Update L1 PM Substates information against kernel v6.11 in commit message
drivers/pci/controller/vmd.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index 264a180403a0..11870d1fc818 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -740,11 +740,9 @@ static int vmd_pm_enable_quirk(struct pci_dev *pdev, void *userdata)
if (!(features & VMD_FEAT_BIOS_PM_QUIRK))
return 0;
- pci_enable_link_state_locked(pdev, PCIE_LINK_STATE_ALL);
-
pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_LTR);
if (!pos)
- return 0;
+ goto out_state_change;
/*
* Skip if the max snoop LTR is non-zero, indicating BIOS has set it
@@ -752,7 +750,7 @@ static int vmd_pm_enable_quirk(struct pci_dev *pdev, void *userdata)
*/
pci_read_config_dword(pdev, pos + PCI_LTR_MAX_SNOOP_LAT, <r_reg);
if (!!(ltr_reg & (PCI_LTR_VALUE_MASK | PCI_LTR_SCALE_MASK)))
- return 0;
+ goto out_state_change;
/*
* Set the default values to the maximum required by the platform to
@@ -764,6 +762,13 @@ static int vmd_pm_enable_quirk(struct pci_dev *pdev, void *userdata)
pci_write_config_dword(pdev, pos + PCI_LTR_MAX_SNOOP_LAT, ltr_reg);
pci_info(pdev, "VMD: Default LTR value set by driver\n");
+out_state_change:
+ /*
+ * Ensure devices are in D0 before enabling PCI-PM L1 PM Substates, per
+ * PCIe r6.0, sec 5.5.4.
+ */
+ pci_set_power_state_locked(pdev, PCI_D0);
+ pci_enable_link_state_locked(pdev, PCIE_LINK_STATE_ALL);
return 0;
}
--
2.46.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v9 2/3] PCI/ASPM: Add notes about enabling PCI-PM L1SS to pci_enable_link_state(_locked)
2024-09-24 7:05 [PATCH v9 0/3] PCI: vmd: Enable PCI PM's L1 substates of remapped PCIe Root Port and NVMe Jian-Hong Pan
2024-09-24 7:14 ` [PATCH v9 1/3] PCI: vmd: Set PCI devices to D0 before enable PCI PM's L1 substates Jian-Hong Pan
@ 2024-09-24 7:26 ` Jian-Hong Pan
2024-09-24 7:29 ` [PATCH v9 3/3] PCI: vmd: Save/restore PCIe bridge states before/after pci_reset_bus() Jian-Hong Pan
2 siblings, 0 replies; 10+ messages in thread
From: Jian-Hong Pan @ 2024-09-24 7:26 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Johan Hovold, David Box, Ilpo Järvinen,
Kuppuswamy Sathyanarayanan, Nirmal Patel, Jonathan Derrick,
linux-pci, linux-kernel, linux, Jian-Hong Pan, Bjorn Helgaas
According to "PCIe r6.0, sec 5.5.4", add note about D0 requirement in
pci_enable_link_state() kernel-doc.
Signed-off-by: Jian-Hong Pan <jhp@endlessos.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---
v3:
- Fix as readable comments
v4:
- The same
v5:
- Tweak and simplify the commit message
v6~9:
- The same
drivers/pci/pcie/aspm.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index cee2365e54b8..bd0a8a05647e 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -1442,6 +1442,9 @@ static int __pci_enable_link_state(struct pci_dev *pdev, int state, bool locked)
* touch the LNKCTL register. Also note that this does not enable states
* disabled by pci_disable_link_state(). Return 0 or a negative errno.
*
+ * Note: Ensure devices are in D0 before enabling PCI-PM L1 PM Substates, per
+ * PCIe r6.0, sec 5.5.4.
+ *
* @pdev: PCI device
* @state: Mask of ASPM link states to enable
*/
@@ -1458,6 +1461,9 @@ EXPORT_SYMBOL(pci_enable_link_state);
* can't touch the LNKCTL register. Also note that this does not enable states
* disabled by pci_disable_link_state(). Return 0 or a negative errno.
*
+ * Note: Ensure devices are in D0 before enabling PCI-PM L1 PM Substates, per
+ * PCIe r6.0, sec 5.5.4.
+ *
* @pdev: PCI device
* @state: Mask of ASPM link states to enable
*
--
2.46.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v9 3/3] PCI: vmd: Save/restore PCIe bridge states before/after pci_reset_bus()
2024-09-24 7:05 [PATCH v9 0/3] PCI: vmd: Enable PCI PM's L1 substates of remapped PCIe Root Port and NVMe Jian-Hong Pan
2024-09-24 7:14 ` [PATCH v9 1/3] PCI: vmd: Set PCI devices to D0 before enable PCI PM's L1 substates Jian-Hong Pan
2024-09-24 7:26 ` [PATCH v9 2/3] PCI/ASPM: Add notes about enabling PCI-PM L1SS to pci_enable_link_state(_locked) Jian-Hong Pan
@ 2024-09-24 7:29 ` Jian-Hong Pan
2024-09-26 2:51 ` David E. Box
2 siblings, 1 reply; 10+ messages in thread
From: Jian-Hong Pan @ 2024-09-24 7:29 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Johan Hovold, David Box, Ilpo Järvinen,
Kuppuswamy Sathyanarayanan, Nirmal Patel, Jonathan Derrick,
linux-pci, linux-kernel, linux, Jian-Hong Pan
PCI devices' parameters on the VMD bus have been programmed properly
originally. But, cleared after pci_reset_bus() and have not been restored
correctly. This leads the link's L1.2 between PCIe Root Port and child
device gets wrong configs.
Here is a failed example on ASUS B1400CEAE with enabled VMD. Both PCIe
bridge and NVMe device should have the same LTR1.2_Threshold value.
However, they are configured as different values in this case:
10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core Processor PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal decode])
...
Capabilities: [200 v1] L1 PM Substates
L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
T_CommonMode=0us LTR1.2_Threshold=0ns
L1SubCtl2: T_PwrOn=0us
10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD Blue SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express])
...
Capabilities: [900 v1] L1 PM Substates
L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
T_CommonMode=0us LTR1.2_Threshold=101376ns
L1SubCtl2: T_PwrOn=50us
Here is VMD mapped PCI device tree:
-+-[0000:00]-+-00.0 Intel Corporation Device 9a04
| ...
\-[10000:e0]-+-06.0-[e1]----00.0 Sandisk Corp WD Blue SN550 NVMe SSD
\-17.0 Intel Corporation Tiger Lake-LP SATA Controller
When pci_reset_bus() resets the bus [e1] of the NVMe, it only saves and
restores NVMe's state before and after reset. Because bus [e1] has only one
device: 10000:e1:00.0 NVMe. The PCIe bridge is missed. However, when it
restores the NVMe's state, it also restores the ASPM L1SS between the PCIe
bridge and the NVMe by pci_restore_aspm_l1ss_state(). The NVMe's L1SS is
restored correctly. But, the PCIe bridge's L1SS is restored with the wrong
value 0x0. Becuase, the PCIe bridge's state is not saved before reset.
That is why pci_restore_aspm_l1ss_state() gets the parent device (PCIe
bridge)'s saved state capability data and restores L1SS with value 0.
So, save the PCIe bridge's state before pci_reset_bus(). Then, restore the
state after pci_reset_bus(). The restoring state also consumes the saving
state.
Link: https://www.spinics.net/lists/linux-pci/msg159267.html
Signed-off-by: Jian-Hong Pan <jhp@endlessos.org>
---
v9:
- Drop the v8 fix about drivers/pci/pcie/aspm.c. Use this in VMD instead.
drivers/pci/controller/vmd.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index 11870d1fc818..2820005165b4 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -938,9 +938,16 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
if (!list_empty(&child->devices)) {
dev = list_first_entry(&child->devices,
struct pci_dev, bus_list);
+ /* To avoid pci_reset_bus restore wrong ASPM L1SS
+ * configuration due to missed saving parent device's
+ * states, save & restore the parent device's states
+ * as well.
+ */
+ pci_save_state(dev->bus->self);
ret = pci_reset_bus(dev);
if (ret)
pci_warn(dev, "can't reset device: %d\n", ret);
+ pci_restore_state(dev->bus->self);
break;
}
--
2.46.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH v9 3/3] PCI: vmd: Save/restore PCIe bridge states before/after pci_reset_bus()
2024-09-24 7:29 ` [PATCH v9 3/3] PCI: vmd: Save/restore PCIe bridge states before/after pci_reset_bus() Jian-Hong Pan
@ 2024-09-26 2:51 ` David E. Box
2024-09-26 4:05 ` Jian-Hong Pan
0 siblings, 1 reply; 10+ messages in thread
From: David E. Box @ 2024-09-26 2:51 UTC (permalink / raw)
To: Jian-Hong Pan, Bjorn Helgaas
Cc: Johan Hovold, Ilpo Järvinen, Kuppuswamy Sathyanarayanan,
Nirmal Patel, Jonathan Derrick, linux-pci, linux-kernel, linux
Hi Jian-Hong,
On Tue, 2024-09-24 at 15:29 +0800, Jian-Hong Pan wrote:
> PCI devices' parameters on the VMD bus have been programmed properly
> originally. But, cleared after pci_reset_bus() and have not been restored
> correctly. This leads the link's L1.2 between PCIe Root Port and child
> device gets wrong configs.
>
> Here is a failed example on ASUS B1400CEAE with enabled VMD. Both PCIe
> bridge and NVMe device should have the same LTR1.2_Threshold value.
> However, they are configured as different values in this case:
>
> 10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core Processor
> PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal decode])
> ...
> Capabilities: [200 v1] L1 PM Substates
> L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
> PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
> L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> T_CommonMode=0us LTR1.2_Threshold=0ns
> L1SubCtl2: T_PwrOn=0us
>
> 10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD Blue
> SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express])
> ...
> Capabilities: [900 v1] L1 PM Substates
> L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
> PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
> L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> T_CommonMode=0us LTR1.2_Threshold=101376ns
> L1SubCtl2: T_PwrOn=50us
>
> Here is VMD mapped PCI device tree:
>
> -+-[0000:00]-+-00.0 Intel Corporation Device 9a04
> | ...
> \-[10000:e0]-+-06.0-[e1]----00.0 Sandisk Corp WD Blue SN550 NVMe SSD
> \-17.0 Intel Corporation Tiger Lake-LP SATA Controller
>
> When pci_reset_bus() resets the bus [e1] of the NVMe, it only saves and
> restores NVMe's state before and after reset. Because bus [e1] has only one
> device: 10000:e1:00.0 NVMe. The PCIe bridge is missed. However, when it
> restores the NVMe's state, it also restores the ASPM L1SS between the PCIe
> bridge and the NVMe by pci_restore_aspm_l1ss_state(). The NVMe's L1SS is
> restored correctly. But, the PCIe bridge's L1SS is restored with the wrong
> value 0x0. Becuase, the PCIe bridge's state is not saved before reset.
> That is why pci_restore_aspm_l1ss_state() gets the parent device (PCIe
> bridge)'s saved state capability data and restores L1SS with value 0.
>
> So, save the PCIe bridge's state before pci_reset_bus(). Then, restore the
> state after pci_reset_bus(). The restoring state also consumes the saving
> state.
>
> Link: https://www.spinics.net/lists/linux-pci/msg159267.html
> Signed-off-by: Jian-Hong Pan <jhp@endlessos.org>
> ---
> v9:
> - Drop the v8 fix about drivers/pci/pcie/aspm.c. Use this in VMD instead.
>
> drivers/pci/controller/vmd.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> index 11870d1fc818..2820005165b4 100644
> --- a/drivers/pci/controller/vmd.c
> +++ b/drivers/pci/controller/vmd.c
> @@ -938,9 +938,16 @@ static int vmd_enable_domain(struct vmd_dev *vmd,
> unsigned long features)
> if (!list_empty(&child->devices)) {
> dev = list_first_entry(&child->devices,
> struct pci_dev, bus_list);
Newline here
> + /* To avoid pci_reset_bus restore wrong ASPM L1SS
> + * configuration due to missed saving parent device's
> + * states, save & restore the parent device's states
> + * as well.
> + */
No text on first line of comment.
/*
* To avoid pci_reset_bus restore wrong ASPM L1SS
* ...
*/
> + pci_save_state(dev->bus->self);
> ret = pci_reset_bus(dev);
> if (ret)
> pci_warn(dev, "can't reset device: %d\n",
> ret);
> + pci_restore_state(dev->bus->self);
I think Ilpo's point was that pci_save_aspm_l1ss_state() and
pci_restore_aspm_l1ss_state() should be more symmetric. If
pci_save_aspm_l1ss_state() is changed to also handle the state for the device
and its parent in the same call, then no change is needed here. But that would
only handle L1SS settings and not anything else that might need to be restored
after the bus reset. So I'm okay with it either way.
David
>
> break;
> }
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v9 3/3] PCI: vmd: Save/restore PCIe bridge states before/after pci_reset_bus()
2024-09-26 2:51 ` David E. Box
@ 2024-09-26 4:05 ` Jian-Hong Pan
2024-09-26 10:52 ` Ilpo Järvinen
0 siblings, 1 reply; 10+ messages in thread
From: Jian-Hong Pan @ 2024-09-26 4:05 UTC (permalink / raw)
To: david.e.box
Cc: Bjorn Helgaas, Johan Hovold, Ilpo Järvinen,
Kuppuswamy Sathyanarayanan, Nirmal Patel, Jonathan Derrick,
linux-pci, linux-kernel, linux
David E. Box <david.e.box@linux.intel.com> 於 2024年9月26日 週四 上午10:51寫道:
>
> Hi Jian-Hong,
>
> On Tue, 2024-09-24 at 15:29 +0800, Jian-Hong Pan wrote:
> > PCI devices' parameters on the VMD bus have been programmed properly
> > originally. But, cleared after pci_reset_bus() and have not been restored
> > correctly. This leads the link's L1.2 between PCIe Root Port and child
> > device gets wrong configs.
> >
> > Here is a failed example on ASUS B1400CEAE with enabled VMD. Both PCIe
> > bridge and NVMe device should have the same LTR1.2_Threshold value.
> > However, they are configured as different values in this case:
> >
> > 10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core Processor
> > PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal decode])
> > ...
> > Capabilities: [200 v1] L1 PM Substates
> > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
> > PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
> > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > T_CommonMode=0us LTR1.2_Threshold=0ns
> > L1SubCtl2: T_PwrOn=0us
> >
> > 10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD Blue
> > SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express])
> > ...
> > Capabilities: [900 v1] L1 PM Substates
> > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
> > PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
> > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > T_CommonMode=0us LTR1.2_Threshold=101376ns
> > L1SubCtl2: T_PwrOn=50us
> >
> > Here is VMD mapped PCI device tree:
> >
> > -+-[0000:00]-+-00.0 Intel Corporation Device 9a04
> > | ...
> > \-[10000:e0]-+-06.0-[e1]----00.0 Sandisk Corp WD Blue SN550 NVMe SSD
> > \-17.0 Intel Corporation Tiger Lake-LP SATA Controller
> >
> > When pci_reset_bus() resets the bus [e1] of the NVMe, it only saves and
> > restores NVMe's state before and after reset. Because bus [e1] has only one
> > device: 10000:e1:00.0 NVMe. The PCIe bridge is missed. However, when it
> > restores the NVMe's state, it also restores the ASPM L1SS between the PCIe
> > bridge and the NVMe by pci_restore_aspm_l1ss_state(). The NVMe's L1SS is
> > restored correctly. But, the PCIe bridge's L1SS is restored with the wrong
> > value 0x0. Becuase, the PCIe bridge's state is not saved before reset.
> > That is why pci_restore_aspm_l1ss_state() gets the parent device (PCIe
> > bridge)'s saved state capability data and restores L1SS with value 0.
> >
> > So, save the PCIe bridge's state before pci_reset_bus(). Then, restore the
> > state after pci_reset_bus(). The restoring state also consumes the saving
> > state.
> >
> > Link: https://www.spinics.net/lists/linux-pci/msg159267.html
> > Signed-off-by: Jian-Hong Pan <jhp@endlessos.org>
> > ---
> > v9:
> > - Drop the v8 fix about drivers/pci/pcie/aspm.c. Use this in VMD instead.
> >
> > drivers/pci/controller/vmd.c | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> > index 11870d1fc818..2820005165b4 100644
> > --- a/drivers/pci/controller/vmd.c
> > +++ b/drivers/pci/controller/vmd.c
> > @@ -938,9 +938,16 @@ static int vmd_enable_domain(struct vmd_dev *vmd,
> > unsigned long features)
> > if (!list_empty(&child->devices)) {
> > dev = list_first_entry(&child->devices,
> > struct pci_dev, bus_list);
>
> Newline here
> > + /* To avoid pci_reset_bus restore wrong ASPM L1SS
> > + * configuration due to missed saving parent device's
> > + * states, save & restore the parent device's states
> > + * as well.
> > + */
>
> No text on first line of comment.
Oops! Thanks
> /*
> * To avoid pci_reset_bus restore wrong ASPM L1SS
> * ...
> */
>
> > + pci_save_state(dev->bus->self);
> > ret = pci_reset_bus(dev);
> > if (ret)
> > pci_warn(dev, "can't reset device: %d\n",
> > ret);
> > + pci_restore_state(dev->bus->self);
>
> I think Ilpo's point was that pci_save_aspm_l1ss_state() and
> pci_restore_aspm_l1ss_state() should be more symmetric. If
> pci_save_aspm_l1ss_state() is changed to also handle the state for the device
> and its parent in the same call, then no change is needed here. But that would
> only handle L1SS settings and not anything else that might need to be restored
> after the bus reset. So I'm okay with it either way.
Yes, that made me think the whole day before I sent out this patch. I
do not know what is the best reset bus procedure, especially how other
drivers reset the bus.
If trace the code path:
pci_reset_bus()
__pci_reset_bus()
pci_bus_reset()
pci_bus_save_and_disable_locked()
pci_bus_save_and_disable_locked()'s comment shows "Save and disable
devices from the top of the tree down while holding the @dev mutex
lock for the entire tree". I think that means the PCI(e) bridge should
save state first, then the following bus' devices. However, VMD resets
the child device (10000:e1:00.0 NVMe)'s bus first and only saves the
child device (10000:e1:00.0 NVMe)'s state. It does not visit the tree
from the top to down. So, it misses the PCIe bridge. Therefore, I
make it save & restore its parent (10000:e0:06.0 PCIe bridge)'s state
as compensation.
Jian-Hong Pan
> >
> > break;
> > }
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v9 3/3] PCI: vmd: Save/restore PCIe bridge states before/after pci_reset_bus()
2024-09-26 4:05 ` Jian-Hong Pan
@ 2024-09-26 10:52 ` Ilpo Järvinen
2024-09-26 13:22 ` Ilpo Järvinen
2024-09-26 22:52 ` David E. Box
0 siblings, 2 replies; 10+ messages in thread
From: Ilpo Järvinen @ 2024-09-26 10:52 UTC (permalink / raw)
To: Jian-Hong Pan
Cc: david.e.box, Bjorn Helgaas, Johan Hovold,
Kuppuswamy Sathyanarayanan, Nirmal Patel, Jonathan Derrick,
linux-pci, LKML, linux
[-- Attachment #1: Type: text/plain, Size: 6703 bytes --]
On Thu, 26 Sep 2024, Jian-Hong Pan wrote:
> David E. Box <david.e.box@linux.intel.com> 於 2024年9月26日 週四 上午10:51寫道:
> >
> > Hi Jian-Hong,
> >
> > On Tue, 2024-09-24 at 15:29 +0800, Jian-Hong Pan wrote:
> > > PCI devices' parameters on the VMD bus have been programmed properly
> > > originally. But, cleared after pci_reset_bus() and have not been restored
> > > correctly. This leads the link's L1.2 between PCIe Root Port and child
> > > device gets wrong configs.
> > >
> > > Here is a failed example on ASUS B1400CEAE with enabled VMD. Both PCIe
> > > bridge and NVMe device should have the same LTR1.2_Threshold value.
> > > However, they are configured as different values in this case:
> > >
> > > 10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core Processor
> > > PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal decode])
> > > ...
> > > Capabilities: [200 v1] L1 PM Substates
> > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
> > > PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
> > > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > > T_CommonMode=0us LTR1.2_Threshold=0ns
> > > L1SubCtl2: T_PwrOn=0us
> > >
> > > 10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD Blue
> > > SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express])
> > > ...
> > > Capabilities: [900 v1] L1 PM Substates
> > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
> > > PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
> > > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > > T_CommonMode=0us LTR1.2_Threshold=101376ns
> > > L1SubCtl2: T_PwrOn=50us
> > >
> > > Here is VMD mapped PCI device tree:
> > >
> > > -+-[0000:00]-+-00.0 Intel Corporation Device 9a04
> > > | ...
> > > \-[10000:e0]-+-06.0-[e1]----00.0 Sandisk Corp WD Blue SN550 NVMe SSD
> > > \-17.0 Intel Corporation Tiger Lake-LP SATA Controller
> > >
> > > When pci_reset_bus() resets the bus [e1] of the NVMe, it only saves and
> > > restores NVMe's state before and after reset. Because bus [e1] has only one
> > > device: 10000:e1:00.0 NVMe. The PCIe bridge is missed. However, when it
> > > restores the NVMe's state, it also restores the ASPM L1SS between the PCIe
> > > bridge and the NVMe by pci_restore_aspm_l1ss_state(). The NVMe's L1SS is
> > > restored correctly. But, the PCIe bridge's L1SS is restored with the wrong
> > > value 0x0. Becuase, the PCIe bridge's state is not saved before reset.
> > > That is why pci_restore_aspm_l1ss_state() gets the parent device (PCIe
> > > bridge)'s saved state capability data and restores L1SS with value 0.
> > >
> > > So, save the PCIe bridge's state before pci_reset_bus(). Then, restore the
> > > state after pci_reset_bus(). The restoring state also consumes the saving
> > > state.
> > >
> > > Link: https://www.spinics.net/lists/linux-pci/msg159267.html
> > > Signed-off-by: Jian-Hong Pan <jhp@endlessos.org>
> > > ---
> > > v9:
> > > - Drop the v8 fix about drivers/pci/pcie/aspm.c. Use this in VMD instead.
> > >
> > > drivers/pci/controller/vmd.c | 7 +++++++
> > > 1 file changed, 7 insertions(+)
> > >
> > > diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> > > index 11870d1fc818..2820005165b4 100644
> > > --- a/drivers/pci/controller/vmd.c
> > > +++ b/drivers/pci/controller/vmd.c
> > > @@ -938,9 +938,16 @@ static int vmd_enable_domain(struct vmd_dev *vmd,
> > > unsigned long features)
> > > if (!list_empty(&child->devices)) {
> > > dev = list_first_entry(&child->devices,
> > > struct pci_dev, bus_list);
> >
> > Newline here
> > > + /* To avoid pci_reset_bus restore wrong ASPM L1SS
> > > + * configuration due to missed saving parent device's
> > > + * states, save & restore the parent device's states
> > > + * as well.
> > > + */
> >
> > No text on first line of comment.
>
> Oops! Thanks
>
> > /*
> > * To avoid pci_reset_bus restore wrong ASPM L1SS
> > * ...
> > */
> >
> > > + pci_save_state(dev->bus->self);
> > > ret = pci_reset_bus(dev);
> > > if (ret)
> > > pci_warn(dev, "can't reset device: %d\n",
> > > ret);
> > > + pci_restore_state(dev->bus->self);
> >
> > I think Ilpo's point was that pci_save_aspm_l1ss_state() and
> > pci_restore_aspm_l1ss_state() should be more symmetric. If
> > pci_save_aspm_l1ss_state() is changed to also handle the state for the device
> > and its parent in the same call, then no change is needed here. But that would
> > only handle L1SS settings and not anything else that might need to be restored
> > after the bus reset. So I'm okay with it either way.
Why would something else need to be restored? The spec explicitly says the
bus reset should not cause config changes on the parent other than
to status bits.
Based on what is seen so far, the problem here is not the reset itself
breaking parent's config but ASPM code restoring values from state beyond
what it had saved and thus it programs pseudogarbage into the L1SS
settings.
> Yes, that made me think the whole day before I sent out this patch. I
> do not know what is the best reset bus procedure, especially how other
> drivers reset the bus.
>
> If trace the code path:
> pci_reset_bus()
> __pci_reset_bus()
> pci_bus_reset()
> pci_bus_save_and_disable_locked()
>
> pci_bus_save_and_disable_locked()'s comment shows "Save and disable
> devices from the top of the tree down while holding the @dev mutex
> lock for the entire tree". I think that means the PCI(e) bridge should
> save state first, then the following bus' devices. However, VMD resets
> the child device (10000:e1:00.0 NVMe)'s bus first and only saves the
> child device (10000:e1:00.0 NVMe)'s state. It does not visit the tree
> from the top to down. So, it misses the PCIe bridge. Therefore, I
> make it save & restore its parent (10000:e0:06.0 PCIe bridge)'s state
> as compensation.
The problem with your fix is it only takes care of part of the cases (i.e.
VMD). But there are other callers of pci_reset_bus() which would also
restore incorrect L1SS settings, no?
I'd suggest making the L1SS code symmetric on this, even if it means
saving the register value twice when walking the tree downwards (it seems
harmless to write the same value twice).
--
i.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v9 3/3] PCI: vmd: Save/restore PCIe bridge states before/after pci_reset_bus()
2024-09-26 10:52 ` Ilpo Järvinen
@ 2024-09-26 13:22 ` Ilpo Järvinen
2024-09-27 10:42 ` Jian-Hong Pan
2024-09-26 22:52 ` David E. Box
1 sibling, 1 reply; 10+ messages in thread
From: Ilpo Järvinen @ 2024-09-26 13:22 UTC (permalink / raw)
To: Jian-Hong Pan
Cc: david.e.box, Bjorn Helgaas, Johan Hovold,
Kuppuswamy Sathyanarayanan, Nirmal Patel, Jonathan Derrick,
linux-pci, LKML, linux
[-- Attachment #1: Type: text/plain, Size: 7278 bytes --]
On Thu, 26 Sep 2024, Ilpo Järvinen wrote:
> On Thu, 26 Sep 2024, Jian-Hong Pan wrote:
>
> > David E. Box <david.e.box@linux.intel.com> 於 2024年9月26日 週四 上午10:51寫道:
> > >
> > > Hi Jian-Hong,
> > >
> > > On Tue, 2024-09-24 at 15:29 +0800, Jian-Hong Pan wrote:
> > > > PCI devices' parameters on the VMD bus have been programmed properly
> > > > originally. But, cleared after pci_reset_bus() and have not been restored
> > > > correctly. This leads the link's L1.2 between PCIe Root Port and child
> > > > device gets wrong configs.
> > > >
> > > > Here is a failed example on ASUS B1400CEAE with enabled VMD. Both PCIe
> > > > bridge and NVMe device should have the same LTR1.2_Threshold value.
> > > > However, they are configured as different values in this case:
> > > >
> > > > 10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core Processor
> > > > PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal decode])
> > > > ...
> > > > Capabilities: [200 v1] L1 PM Substates
> > > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
> > > > PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
> > > > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > > > T_CommonMode=0us LTR1.2_Threshold=0ns
> > > > L1SubCtl2: T_PwrOn=0us
> > > >
> > > > 10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD Blue
> > > > SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express])
> > > > ...
> > > > Capabilities: [900 v1] L1 PM Substates
> > > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
> > > > PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
> > > > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > > > T_CommonMode=0us LTR1.2_Threshold=101376ns
> > > > L1SubCtl2: T_PwrOn=50us
> > > >
> > > > Here is VMD mapped PCI device tree:
> > > >
> > > > -+-[0000:00]-+-00.0 Intel Corporation Device 9a04
> > > > | ...
> > > > \-[10000:e0]-+-06.0-[e1]----00.0 Sandisk Corp WD Blue SN550 NVMe SSD
> > > > \-17.0 Intel Corporation Tiger Lake-LP SATA Controller
> > > >
> > > > When pci_reset_bus() resets the bus [e1] of the NVMe, it only saves and
> > > > restores NVMe's state before and after reset. Because bus [e1] has only one
> > > > device: 10000:e1:00.0 NVMe. The PCIe bridge is missed. However, when it
> > > > restores the NVMe's state, it also restores the ASPM L1SS between the PCIe
> > > > bridge and the NVMe by pci_restore_aspm_l1ss_state(). The NVMe's L1SS is
> > > > restored correctly. But, the PCIe bridge's L1SS is restored with the wrong
> > > > value 0x0. Becuase, the PCIe bridge's state is not saved before reset.
> > > > That is why pci_restore_aspm_l1ss_state() gets the parent device (PCIe
> > > > bridge)'s saved state capability data and restores L1SS with value 0.
> > > >
> > > > So, save the PCIe bridge's state before pci_reset_bus(). Then, restore the
> > > > state after pci_reset_bus(). The restoring state also consumes the saving
> > > > state.
> > > >
> > > > Link: https://www.spinics.net/lists/linux-pci/msg159267.html
> > > > Signed-off-by: Jian-Hong Pan <jhp@endlessos.org>
> > > > ---
> > > > v9:
> > > > - Drop the v8 fix about drivers/pci/pcie/aspm.c. Use this in VMD instead.
> > > >
> > > > drivers/pci/controller/vmd.c | 7 +++++++
> > > > 1 file changed, 7 insertions(+)
> > > >
> > > > diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> > > > index 11870d1fc818..2820005165b4 100644
> > > > --- a/drivers/pci/controller/vmd.c
> > > > +++ b/drivers/pci/controller/vmd.c
> > > > @@ -938,9 +938,16 @@ static int vmd_enable_domain(struct vmd_dev *vmd,
> > > > unsigned long features)
> > > > if (!list_empty(&child->devices)) {
> > > > dev = list_first_entry(&child->devices,
> > > > struct pci_dev, bus_list);
> > >
> > > Newline here
> > > > + /* To avoid pci_reset_bus restore wrong ASPM L1SS
> > > > + * configuration due to missed saving parent device's
> > > > + * states, save & restore the parent device's states
> > > > + * as well.
> > > > + */
> > >
> > > No text on first line of comment.
> >
> > Oops! Thanks
> >
> > > /*
> > > * To avoid pci_reset_bus restore wrong ASPM L1SS
> > > * ...
> > > */
> > >
> > > > + pci_save_state(dev->bus->self);
> > > > ret = pci_reset_bus(dev);
> > > > if (ret)
> > > > pci_warn(dev, "can't reset device: %d\n",
> > > > ret);
> > > > + pci_restore_state(dev->bus->self);
> > >
> > > I think Ilpo's point was that pci_save_aspm_l1ss_state() and
> > > pci_restore_aspm_l1ss_state() should be more symmetric. If
> > > pci_save_aspm_l1ss_state() is changed to also handle the state for the device
> > > and its parent in the same call, then no change is needed here. But that would
> > > only handle L1SS settings and not anything else that might need to be restored
> > > after the bus reset. So I'm okay with it either way.
>
> Why would something else need to be restored? The spec explicitly says the
> bus reset should not cause config changes on the parent other than
> to status bits.
>
> Based on what is seen so far, the problem here is not the reset itself
> breaking parent's config but ASPM code restoring values from state beyond
> what it had saved and thus it programs pseudogarbage into the L1SS
> settings.
>
> > Yes, that made me think the whole day before I sent out this patch. I
> > do not know what is the best reset bus procedure, especially how other
> > drivers reset the bus.
> >
> > If trace the code path:
> > pci_reset_bus()
> > __pci_reset_bus()
> > pci_bus_reset()
> > pci_bus_save_and_disable_locked()
> >
> > pci_bus_save_and_disable_locked()'s comment shows "Save and disable
> > devices from the top of the tree down while holding the @dev mutex
> > lock for the entire tree". I think that means the PCI(e) bridge should
> > save state first, then the following bus' devices. However, VMD resets
> > the child device (10000:e1:00.0 NVMe)'s bus first and only saves the
> > child device (10000:e1:00.0 NVMe)'s state. It does not visit the tree
> > from the top to down. So, it misses the PCIe bridge. Therefore, I
> > make it save & restore its parent (10000:e0:06.0 PCIe bridge)'s state
> > as compensation.
>
> The problem with your fix is it only takes care of part of the cases (i.e.
> VMD). But there are other callers of pci_reset_bus() which would also
> restore incorrect L1SS settings, no?
>
> I'd suggest making the L1SS code symmetric on this, even if it means
> saving the register value twice when walking the tree downwards (it seems
> harmless to write the same value twice).
Perhaps parent->state_saved == true could even be used to avoid doing it
twice/unnecessarily if the parent is already saved.
--
i.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v9 3/3] PCI: vmd: Save/restore PCIe bridge states before/after pci_reset_bus()
2024-09-26 10:52 ` Ilpo Järvinen
2024-09-26 13:22 ` Ilpo Järvinen
@ 2024-09-26 22:52 ` David E. Box
1 sibling, 0 replies; 10+ messages in thread
From: David E. Box @ 2024-09-26 22:52 UTC (permalink / raw)
To: Ilpo Järvinen, Jian-Hong Pan
Cc: Bjorn Helgaas, Johan Hovold, Kuppuswamy Sathyanarayanan,
Nirmal Patel, Jonathan Derrick, linux-pci, LKML, linux
On Thu, 2024-09-26 at 13:52 +0300, Ilpo Järvinen wrote:
> On Thu, 26 Sep 2024, Jian-Hong Pan wrote:
>
> > David E. Box <david.e.box@linux.intel.com> 於 2024年9月26日 週四 上午10:51寫道:
> > >
> > > Hi Jian-Hong,
> > >
> > > On Tue, 2024-09-24 at 15:29 +0800, Jian-Hong Pan wrote:
> > > > PCI devices' parameters on the VMD bus have been programmed properly
> > > > originally. But, cleared after pci_reset_bus() and have not been
> > > > restored
> > > > correctly. This leads the link's L1.2 between PCIe Root Port and child
> > > > device gets wrong configs.
> > > >
> > > > Here is a failed example on ASUS B1400CEAE with enabled VMD. Both PCIe
> > > > bridge and NVMe device should have the same LTR1.2_Threshold value.
> > > > However, they are configured as different values in this case:
> > > >
> > > > 10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core
> > > > Processor
> > > > PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal decode])
> > > > ...
> > > > Capabilities: [200 v1] L1 PM Substates
> > > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+
> > > > L1_PM_Substates+
> > > > PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
> > > > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > > > T_CommonMode=0us LTR1.2_Threshold=0ns
> > > > L1SubCtl2: T_PwrOn=0us
> > > >
> > > > 10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD
> > > > Blue
> > > > SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express])
> > > > ...
> > > > Capabilities: [900 v1] L1 PM Substates
> > > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > > > L1_PM_Substates+
> > > > PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
> > > > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > > > T_CommonMode=0us LTR1.2_Threshold=101376ns
> > > > L1SubCtl2: T_PwrOn=50us
> > > >
> > > > Here is VMD mapped PCI device tree:
> > > >
> > > > -+-[0000:00]-+-00.0 Intel Corporation Device 9a04
> > > > | ...
> > > > ¥-[10000:e0]-+-06.0-[e1]----00.0 Sandisk Corp WD Blue SN550 NVMe SSD
> > > > ¥-17.0 Intel Corporation Tiger Lake-LP SATA Controller
> > > >
> > > > When pci_reset_bus() resets the bus [e1] of the NVMe, it only saves and
> > > > restores NVMe's state before and after reset. Because bus [e1] has only
> > > > one
> > > > device: 10000:e1:00.0 NVMe. The PCIe bridge is missed. However, when it
> > > > restores the NVMe's state, it also restores the ASPM L1SS between the
> > > > PCIe
> > > > bridge and the NVMe by pci_restore_aspm_l1ss_state(). The NVMe's L1SS is
> > > > restored correctly. But, the PCIe bridge's L1SS is restored with the
> > > > wrong
> > > > value 0x0. Becuase, the PCIe bridge's state is not saved before reset.
> > > > That is why pci_restore_aspm_l1ss_state() gets the parent device (PCIe
> > > > bridge)'s saved state capability data and restores L1SS with value 0.
> > > >
> > > > So, save the PCIe bridge's state before pci_reset_bus(). Then, restore
> > > > the
> > > > state after pci_reset_bus(). The restoring state also consumes the
> > > > saving
> > > > state.
> > > >
> > > > Link: https://www.spinics.net/lists/linux-pci/msg159267.html
> > > > Signed-off-by: Jian-Hong Pan <jhp@endlessos.org>
> > > > ---
> > > > v9:
> > > > - Drop the v8 fix about drivers/pci/pcie/aspm.c. Use this in VMD
> > > > instead.
> > > >
> > > > drivers/pci/controller/vmd.c | 7 +++++++
> > > > 1 file changed, 7 insertions(+)
> > > >
> > > > diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> > > > index 11870d1fc818..2820005165b4 100644
> > > > --- a/drivers/pci/controller/vmd.c
> > > > +++ b/drivers/pci/controller/vmd.c
> > > > @@ -938,9 +938,16 @@ static int vmd_enable_domain(struct vmd_dev *vmd,
> > > > unsigned long features)
> > > > if (!list_empty(&child->devices)) {
> > > > dev = list_first_entry(&child->devices,
> > > > struct pci_dev, bus_list);
> > >
> > > Newline here
> > > > + /* To avoid pci_reset_bus restore wrong ASPM L1SS
> > > > + * configuration due to missed saving parent
> > > > device's
> > > > + * states, save & restore the parent device's
> > > > states
> > > > + * as well.
> > > > + */
> > >
> > > No text on first line of comment.
> >
> > Oops! Thanks
> >
> > > /*
> > > * To avoid pci_reset_bus restore wrong ASPM L1SS
> > > * ...
> > > */
> > >
> > > > + pci_save_state(dev->bus->self);
> > > > ret = pci_reset_bus(dev);
> > > > if (ret)
> > > > pci_warn(dev, "can't reset device: %d¥n",
> > > > ret);
> > > > + pci_restore_state(dev->bus->self);
> > >
> > > I think Ilpo's point was that pci_save_aspm_l1ss_state() and
> > > pci_restore_aspm_l1ss_state() should be more symmetric. If
> > > pci_save_aspm_l1ss_state() is changed to also handle the state for the
> > > device
> > > and its parent in the same call, then no change is needed here. But that
> > > would
> > > only handle L1SS settings and not anything else that might need to be
> > > restored
> > > after the bus reset. So I'm okay with it either way.
>
> Why would something else need to be restored? The spec explicitly says the
> bus reset should not cause config changes on the parent other than
> to status bits.
>
> Based on what is seen so far, the problem here is not the reset itself
> breaking parent's config but ASPM code restoring values from state beyond
> what it had saved and thus it programs pseudogarbage into the L1SS
> settings.
I see. This must be an earlier state. If I had to guess, the parent (maybe the
child too) were saved before L1SS was configured and only the child was saved
during the bus reset leading to a mismatch in what was restored.
David
>
> > Yes, that made me think the whole day before I sent out this patch. I
> > do not know what is the best reset bus procedure, especially how other
> > drivers reset the bus.
> >
> > If trace the code path:
> > pci_reset_bus()
> > __pci_reset_bus()
> > pci_bus_reset()
> > pci_bus_save_and_disable_locked()
> >
> > pci_bus_save_and_disable_locked()'s comment shows "Save and disable
> > devices from the top of the tree down while holding the @dev mutex
> > lock for the entire tree". I think that means the PCI(e) bridge should
> > save state first, then the following bus' devices. However, VMD resets
> > the child device (10000:e1:00.0 NVMe)'s bus first and only saves the
> > child device (10000:e1:00.0 NVMe)'s state. It does not visit the tree
> > from the top to down. So, it misses the PCIe bridge. Therefore, I
> > make it save & restore its parent (10000:e0:06.0 PCIe bridge)'s state
> > as compensation.
>
> The problem with your fix is it only takes care of part of the cases (i.e.
> VMD). But there are other callers of pci_reset_bus() which would also
> restore incorrect L1SS settings, no?
>
> I'd suggest making the L1SS code symmetric on this, even if it means
> saving the register value twice when walking the tree downwards (it seems
> harmless to write the same value twice).
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v9 3/3] PCI: vmd: Save/restore PCIe bridge states before/after pci_reset_bus()
2024-09-26 13:22 ` Ilpo Järvinen
@ 2024-09-27 10:42 ` Jian-Hong Pan
0 siblings, 0 replies; 10+ messages in thread
From: Jian-Hong Pan @ 2024-09-27 10:42 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: david.e.box, Bjorn Helgaas, Johan Hovold,
Kuppuswamy Sathyanarayanan, Nirmal Patel, Jonathan Derrick,
linux-pci, LKML, linux
Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> 於 2024年9月26日 週四 下午9:22寫道:
>
> On Thu, 26 Sep 2024, Ilpo Järvinen wrote:
>
> > On Thu, 26 Sep 2024, Jian-Hong Pan wrote:
> >
> > > David E. Box <david.e.box@linux.intel.com> 於 2024年9月26日 週四 上午10:51寫道:
> > > >
> > > > Hi Jian-Hong,
> > > >
> > > > On Tue, 2024-09-24 at 15:29 +0800, Jian-Hong Pan wrote:
> > > > > PCI devices' parameters on the VMD bus have been programmed properly
> > > > > originally. But, cleared after pci_reset_bus() and have not been restored
> > > > > correctly. This leads the link's L1.2 between PCIe Root Port and child
> > > > > device gets wrong configs.
> > > > >
> > > > > Here is a failed example on ASUS B1400CEAE with enabled VMD. Both PCIe
> > > > > bridge and NVMe device should have the same LTR1.2_Threshold value.
> > > > > However, they are configured as different values in this case:
> > > > >
> > > > > 10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core Processor
> > > > > PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal decode])
> > > > > ...
> > > > > Capabilities: [200 v1] L1 PM Substates
> > > > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
> > > > > PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
> > > > > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > > > > T_CommonMode=0us LTR1.2_Threshold=0ns
> > > > > L1SubCtl2: T_PwrOn=0us
> > > > >
> > > > > 10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD Blue
> > > > > SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express])
> > > > > ...
> > > > > Capabilities: [900 v1] L1 PM Substates
> > > > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
> > > > > PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
> > > > > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > > > > T_CommonMode=0us LTR1.2_Threshold=101376ns
> > > > > L1SubCtl2: T_PwrOn=50us
> > > > >
> > > > > Here is VMD mapped PCI device tree:
> > > > >
> > > > > -+-[0000:00]-+-00.0 Intel Corporation Device 9a04
> > > > > | ...
> > > > > \-[10000:e0]-+-06.0-[e1]----00.0 Sandisk Corp WD Blue SN550 NVMe SSD
> > > > > \-17.0 Intel Corporation Tiger Lake-LP SATA Controller
> > > > >
> > > > > When pci_reset_bus() resets the bus [e1] of the NVMe, it only saves and
> > > > > restores NVMe's state before and after reset. Because bus [e1] has only one
> > > > > device: 10000:e1:00.0 NVMe. The PCIe bridge is missed. However, when it
> > > > > restores the NVMe's state, it also restores the ASPM L1SS between the PCIe
> > > > > bridge and the NVMe by pci_restore_aspm_l1ss_state(). The NVMe's L1SS is
> > > > > restored correctly. But, the PCIe bridge's L1SS is restored with the wrong
> > > > > value 0x0. Becuase, the PCIe bridge's state is not saved before reset.
> > > > > That is why pci_restore_aspm_l1ss_state() gets the parent device (PCIe
> > > > > bridge)'s saved state capability data and restores L1SS with value 0.
> > > > >
> > > > > So, save the PCIe bridge's state before pci_reset_bus(). Then, restore the
> > > > > state after pci_reset_bus(). The restoring state also consumes the saving
> > > > > state.
> > > > >
> > > > > Link: https://www.spinics.net/lists/linux-pci/msg159267.html
> > > > > Signed-off-by: Jian-Hong Pan <jhp@endlessos.org>
> > > > > ---
> > > > > v9:
> > > > > - Drop the v8 fix about drivers/pci/pcie/aspm.c. Use this in VMD instead.
> > > > >
> > > > > drivers/pci/controller/vmd.c | 7 +++++++
> > > > > 1 file changed, 7 insertions(+)
> > > > >
> > > > > diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> > > > > index 11870d1fc818..2820005165b4 100644
> > > > > --- a/drivers/pci/controller/vmd.c
> > > > > +++ b/drivers/pci/controller/vmd.c
> > > > > @@ -938,9 +938,16 @@ static int vmd_enable_domain(struct vmd_dev *vmd,
> > > > > unsigned long features)
> > > > > if (!list_empty(&child->devices)) {
> > > > > dev = list_first_entry(&child->devices,
> > > > > struct pci_dev, bus_list);
> > > >
> > > > Newline here
> > > > > + /* To avoid pci_reset_bus restore wrong ASPM L1SS
> > > > > + * configuration due to missed saving parent device's
> > > > > + * states, save & restore the parent device's states
> > > > > + * as well.
> > > > > + */
> > > >
> > > > No text on first line of comment.
> > >
> > > Oops! Thanks
> > >
> > > > /*
> > > > * To avoid pci_reset_bus restore wrong ASPM L1SS
> > > > * ...
> > > > */
> > > >
> > > > > + pci_save_state(dev->bus->self);
> > > > > ret = pci_reset_bus(dev);
> > > > > if (ret)
> > > > > pci_warn(dev, "can't reset device: %d\n",
> > > > > ret);
> > > > > + pci_restore_state(dev->bus->self);
> > > >
> > > > I think Ilpo's point was that pci_save_aspm_l1ss_state() and
> > > > pci_restore_aspm_l1ss_state() should be more symmetric. If
> > > > pci_save_aspm_l1ss_state() is changed to also handle the state for the device
> > > > and its parent in the same call, then no change is needed here. But that would
> > > > only handle L1SS settings and not anything else that might need to be restored
> > > > after the bus reset. So I'm okay with it either way.
> >
> > Why would something else need to be restored? The spec explicitly says the
> > bus reset should not cause config changes on the parent other than
> > to status bits.
> >
> > Based on what is seen so far, the problem here is not the reset itself
> > breaking parent's config but ASPM code restoring values from state beyond
> > what it had saved and thus it programs pseudogarbage into the L1SS
> > settings.
> >
> > > Yes, that made me think the whole day before I sent out this patch. I
> > > do not know what is the best reset bus procedure, especially how other
> > > drivers reset the bus.
> > >
> > > If trace the code path:
> > > pci_reset_bus()
> > > __pci_reset_bus()
> > > pci_bus_reset()
> > > pci_bus_save_and_disable_locked()
> > >
> > > pci_bus_save_and_disable_locked()'s comment shows "Save and disable
> > > devices from the top of the tree down while holding the @dev mutex
> > > lock for the entire tree". I think that means the PCI(e) bridge should
> > > save state first, then the following bus' devices. However, VMD resets
> > > the child device (10000:e1:00.0 NVMe)'s bus first and only saves the
> > > child device (10000:e1:00.0 NVMe)'s state. It does not visit the tree
> > > from the top to down. So, it misses the PCIe bridge. Therefore, I
> > > make it save & restore its parent (10000:e0:06.0 PCIe bridge)'s state
> > > as compensation.
> >
> > The problem with your fix is it only takes care of part of the cases (i.e.
> > VMD). But there are other callers of pci_reset_bus() which would also
> > restore incorrect L1SS settings, no?
> >
> > I'd suggest making the L1SS code symmetric on this, even if it means
> > saving the register value twice when walking the tree downwards (it seems
> > harmless to write the same value twice).
>
> Perhaps parent->state_saved == true could even be used to avoid doing it
> twice/unnecessarily if the parent is already saved.
Saving twice is the concern. Using parent->state_saved == true as
guard seems nice.
v10 patch has been sent out for review.
Jian-Hong Pan
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2024-09-27 10:43 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-24 7:05 [PATCH v9 0/3] PCI: vmd: Enable PCI PM's L1 substates of remapped PCIe Root Port and NVMe Jian-Hong Pan
2024-09-24 7:14 ` [PATCH v9 1/3] PCI: vmd: Set PCI devices to D0 before enable PCI PM's L1 substates Jian-Hong Pan
2024-09-24 7:26 ` [PATCH v9 2/3] PCI/ASPM: Add notes about enabling PCI-PM L1SS to pci_enable_link_state(_locked) Jian-Hong Pan
2024-09-24 7:29 ` [PATCH v9 3/3] PCI: vmd: Save/restore PCIe bridge states before/after pci_reset_bus() Jian-Hong Pan
2024-09-26 2:51 ` David E. Box
2024-09-26 4:05 ` Jian-Hong Pan
2024-09-26 10:52 ` Ilpo Järvinen
2024-09-26 13:22 ` Ilpo Järvinen
2024-09-27 10:42 ` Jian-Hong Pan
2024-09-26 22:52 ` David E. Box
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).