* [PATCH 1/2] PCI/portdrv: Add necessary delay for disabling hotplug events
@ 2025-02-04 5:37 Feng Tang
2025-02-04 5:37 ` [PATCH 2/2] PCI: Disable PCIE hotplug interrupts early when msi is disabled Feng Tang
` (3 more replies)
0 siblings, 4 replies; 19+ messages in thread
From: Feng Tang @ 2025-02-04 5:37 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Jonathan Cameron, ilpo.jarvinen, linux-pci, linux-kernel,
Feng Tang
According to PCIe 6.1 spec, section 6.7.3.2, software need to wait at
least 1 second for the command-complete event, before resending the cmd
or sending a new cmd.
Currently get_port_device_capability() sends slot control cmd to disable
PCIe hotplug interrupts without waiting for its completion and there was
real problem reported for the lack of waiting.
Add the necessary wait to comply with PCIe spec. The waiting logic refers
existing pcie_poll_cmd().
Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
---
drivers/pci/pci.h | 2 ++
drivers/pci/pcie/portdrv.c | 33 +++++++++++++++++++++++++++++++--
2 files changed, 33 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 01e51db8d285..c1e234d1b81d 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -759,12 +759,14 @@ static inline void pcie_ecrc_get_policy(char *str) { }
#ifdef CONFIG_PCIEPORTBUS
void pcie_reset_lbms_count(struct pci_dev *port);
int pcie_lbms_count(struct pci_dev *port, unsigned long *val);
+void pcie_disable_hp_interrupts_early(struct pci_dev *dev);
#else
static inline void pcie_reset_lbms_count(struct pci_dev *port) {}
static inline int pcie_lbms_count(struct pci_dev *port, unsigned long *val)
{
return -EOPNOTSUPP;
}
+static inline void pcie_disable_hp_interrupts_early(struct pci_dev *dev) {}
#endif
struct pci_dev_reset_methods {
diff --git a/drivers/pci/pcie/portdrv.c b/drivers/pci/pcie/portdrv.c
index 02e73099bad0..16010973bfe2 100644
--- a/drivers/pci/pcie/portdrv.c
+++ b/drivers/pci/pcie/portdrv.c
@@ -18,6 +18,7 @@
#include <linux/string.h>
#include <linux/slab.h>
#include <linux/aer.h>
+#include <linux/delay.h>
#include "../pci.h"
#include "portdrv.h"
@@ -205,6 +206,35 @@ static int pcie_init_service_irqs(struct pci_dev *dev, int *irqs, int mask)
return 0;
}
+static int pcie_wait_sltctl_cmd_raw(struct pci_dev *pdev)
+{
+ u16 slot_status;
+ /* 1000 ms, according toPCIe spec 6.1, section 6.7.3.2 */
+ int timeout = 1000;
+
+ do {
+ pcie_capability_read_word(pdev, PCI_EXP_SLTSTA, &slot_status);
+ if (slot_status & PCI_EXP_SLTSTA_CC) {
+ pcie_capability_write_word(pdev, PCI_EXP_SLTSTA,
+ PCI_EXP_SLTSTA_CC);
+ return 0;
+ }
+ msleep(10);
+ timeout -= 10;
+ } while (timeout);
+
+ /* Timeout */
+ return -1;
+}
+
+void pcie_disable_hp_interrupts_early(struct pci_dev *dev)
+{
+ pcie_capability_clear_word(dev, PCI_EXP_SLTCTL,
+ PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);
+ if (pcie_wait_sltctl_cmd_raw(dev))
+ pci_info(dev, "Timeout on disabling hot-plug interrupts\n");
+}
+
/**
* get_port_device_capability - discover capabilities of a PCI Express port
* @dev: PCI Express port to examine
@@ -230,8 +260,7 @@ static int get_port_device_capability(struct pci_dev *dev)
* Disable hot-plug interrupts in case they have been enabled
* by the BIOS and the hot-plug service driver is not loaded.
*/
- pcie_capability_clear_word(dev, PCI_EXP_SLTCTL,
- PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);
+ pcie_disable_hp_interrupts_early(dev);
}
#ifdef CONFIG_PCIEAER
--
2.43.5
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 2/2] PCI: Disable PCIE hotplug interrupts early when msi is disabled
2025-02-04 5:37 [PATCH 1/2] PCI/portdrv: Add necessary delay for disabling hotplug events Feng Tang
@ 2025-02-04 5:37 ` Feng Tang
2025-02-04 9:14 ` Lukas Wunner
2025-02-04 9:23 ` Lukas Wunner
2025-02-04 9:07 ` [PATCH 1/2] PCI/portdrv: Add necessary delay for disabling hotplug events Lukas Wunner
` (2 subsequent siblings)
3 siblings, 2 replies; 19+ messages in thread
From: Feng Tang @ 2025-02-04 5:37 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Jonathan Cameron, ilpo.jarvinen, linux-pci, linux-kernel,
Feng Tang
There was a irq storm bug when testing "pci=nomsi" case, and the root
cause is: 'nomsi' will disable MSI and let devices and root ports use
legacy INTX inerrupt, and likely make several devices/ports share one
interrupt. In the failure case, BIOS doesn't disable the PCIE hotplug
interrupts, and actually asserts the command-complete interrupt.
As MSI is disabled, ACPI initialization code will not enumerate root
port's PCIE hotplug capability, and pciehp service driver wont' be
enabled for the root port to handle that interrupt, later on when it is
shared and enabled by other device driver like NVME or NIC, the "nobody
care irq storm" happens.
So disable the pcie hotplug CCIE/HPIE interrupt in early boot phase when
MSI is not enbaled.
Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
---
drivers/pci/probe.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index b6536ed599c3..10d72156da9a 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1664,6 +1664,15 @@ void set_pcie_hotplug_bridge(struct pci_dev *pdev)
pcie_capability_read_dword(pdev, PCI_EXP_SLTCAP, ®32);
if (reg32 & PCI_EXP_SLTCAP_HPC)
pdev->is_hotplug_bridge = 1;
+
+ /*
+ * When MSI is disabled, root port will use legacy INTX, and likely
+ * share INTX interrupt line with other devices like NIC/NVME. There
+ * was real world issue that the CCIE IRQ is asserted afer boot, but
+ * will not be handled well and cause IRQ storm. So disable it early.
+ */
+ if (!pci_msi_enabled())
+ pcie_disable_hp_interrupts_early(pdev);
}
static void set_pcie_thunderbolt(struct pci_dev *dev)
--
2.43.5
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 1/2] PCI/portdrv: Add necessary delay for disabling hotplug events
2025-02-04 5:37 [PATCH 1/2] PCI/portdrv: Add necessary delay for disabling hotplug events Feng Tang
2025-02-04 5:37 ` [PATCH 2/2] PCI: Disable PCIE hotplug interrupts early when msi is disabled Feng Tang
@ 2025-02-04 9:07 ` Lukas Wunner
2025-02-05 2:46 ` Feng Tang
2025-02-05 17:48 ` Markus Elfring
2025-02-05 18:26 ` [PATCH 1/2] " Sathyanarayanan Kuppuswamy
3 siblings, 1 reply; 19+ messages in thread
From: Lukas Wunner @ 2025-02-04 9:07 UTC (permalink / raw)
To: Feng Tang
Cc: Bjorn Helgaas, Jonathan Cameron, ilpo.jarvinen, linux-pci,
linux-kernel
On Tue, Feb 04, 2025 at 01:37:57PM +0800, Feng Tang wrote:
> According to PCIe 6.1 spec, section 6.7.3.2, software need to wait at
> least 1 second for the command-complete event, before resending the cmd
> or sending a new cmd.
>
> Currently get_port_device_capability() sends slot control cmd to disable
> PCIe hotplug interrupts without waiting for its completion and there was
> real problem reported for the lack of waiting.
>
> Add the necessary wait to comply with PCIe spec. The waiting logic refers
> existing pcie_poll_cmd().
[...]
> --- a/drivers/pci/pcie/portdrv.c
> +++ b/drivers/pci/pcie/portdrv.c
> @@ -230,8 +260,7 @@ static int get_port_device_capability(struct pci_dev *dev)
> * Disable hot-plug interrupts in case they have been enabled
> * by the BIOS and the hot-plug service driver is not loaded.
> */
> - pcie_capability_clear_word(dev, PCI_EXP_SLTCTL,
> - PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);
> + pcie_disable_hp_interrupts_early(dev);
> }
The language of the code comment is a bit confusing in that it
says the hot-plug driver may not be "loaded". This sounds like
it could be modular. But it can't. It's always built-in.
So I think what is really meant here is that the driver may be
*disabled* in the config, i.e. CONFIG_HOTPLUG_PCI_PCIE=n.
Now if CONFIG_HOTPLUG_PCI_PCIE=n, you don't need to observe the
Command Completed delay because the hotplug driver won't touch
the Slot Control register afterwards. It's not compiled in.
On the other hand if CONFIG_HOTPLUG_PCI_PCIE=y, you don't need
to disable the CCIE and HPIE interrupt because the hotplug driver
will handle them.
So I think the proper solution here is to make the write to the
Slot Control register conditional on CONFIG_HOTPLUG_PCI_PCIE,
like this:
if (dev->is_hotplug_bridge &&
(pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT ||
- pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM) &&
- (pcie_ports_native || host->native_pcie_hotplug)) {
- services |= PCIE_PORT_SERVICE_HP;
+ pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM)) {
+ if (pcie_ports_native || host->native_pcie_hotplug)
+ services |= PCIE_PORT_SERVICE_HP;
/*
* Disable hot-plug interrupts in case they have been enabled
* by the BIOS and the hot-plug service driver is not loaded.
*/
- pcie_capability_clear_word(dev, PCI_EXP_SLTCTL,
- PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);
+ if (!IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE))
+ pcie_capability_clear_word(dev, PCI_EXP_SLTCTL,
+ PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);
}
The above patch also makes sure the interrupts are quiesced if the
platform didn't grant hotplug control to OSPM.
Thanks,
Lukas
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] PCI: Disable PCIE hotplug interrupts early when msi is disabled
2025-02-04 5:37 ` [PATCH 2/2] PCI: Disable PCIE hotplug interrupts early when msi is disabled Feng Tang
@ 2025-02-04 9:14 ` Lukas Wunner
2025-02-05 6:31 ` Feng Tang
2025-02-04 9:23 ` Lukas Wunner
1 sibling, 1 reply; 19+ messages in thread
From: Lukas Wunner @ 2025-02-04 9:14 UTC (permalink / raw)
To: Feng Tang
Cc: Bjorn Helgaas, Jonathan Cameron, ilpo.jarvinen, linux-pci,
linux-kernel
On Tue, Feb 04, 2025 at 01:37:58PM +0800, Feng Tang wrote:
> There was a irq storm bug when testing "pci=nomsi" case, and the root
> cause is: 'nomsi' will disable MSI and let devices and root ports use
> legacy INTX inerrupt, and likely make several devices/ports share one
> interrupt. In the failure case, BIOS doesn't disable the PCIE hotplug
> interrupts, and actually asserts the command-complete interrupt.
> As MSI is disabled, ACPI initialization code will not enumerate root
> port's PCIE hotplug capability, and pciehp service driver wont' be
> enabled for the root port to handle that interrupt, later on when it is
> shared and enabled by other device driver like NVME or NIC, the "nobody
> care irq storm" happens.
>
> So disable the pcie hotplug CCIE/HPIE interrupt in early boot phase when
> MSI is not enbaled.
So I think this issue should go away if disabling the interrupt
by portdrv is no longer conditional on
(pcie_ports_native || host->native_pcie_hotplug)
like I've just proposed here:
https://lore.kernel.org/r/Z6HYuBDP6uvE1Sf4@wunner.de/
... in which case this patch won't be necessary. Can you confirm that?
You can split the change I've proposed into two patches if you like.
Thanks,
Lukas
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] PCI: Disable PCIE hotplug interrupts early when msi is disabled
2025-02-04 5:37 ` [PATCH 2/2] PCI: Disable PCIE hotplug interrupts early when msi is disabled Feng Tang
2025-02-04 9:14 ` Lukas Wunner
@ 2025-02-04 9:23 ` Lukas Wunner
2025-02-05 3:58 ` Feng Tang
1 sibling, 1 reply; 19+ messages in thread
From: Lukas Wunner @ 2025-02-04 9:23 UTC (permalink / raw)
To: Feng Tang
Cc: Bjorn Helgaas, Jonathan Cameron, ilpo.jarvinen, linux-pci,
linux-kernel
On Tue, Feb 04, 2025 at 01:37:58PM +0800, Feng Tang wrote:
> There was a irq storm bug when testing "pci=nomsi" case, and the root
> cause is: 'nomsi' will disable MSI and let devices and root ports use
> legacy INTX inerrupt, and likely make several devices/ports share one
> interrupt. In the failure case, BIOS doesn't disable the PCIE hotplug
> interrupts, and actually asserts the command-complete interrupt.
> As MSI is disabled, ACPI initialization code will not enumerate root
> port's PCIE hotplug capability, and pciehp service driver wont' be
> enabled for the root port to handle that interrupt, later on when it is
> shared and enabled by other device driver like NVME or NIC, the "nobody
> care irq storm" happens.
Is there a section in the PCI Firmware Spec which says ACPI doesn't
enumerate the hotplug capability if MSI is disabled?
If so, it should be referenced in the commit message.
If not, I'm wondering if it's safe to fiddle with the Slot Control
register given the platform hasn't granted OSPM control of it.
Of course if this is spec-defined behavior in the nomsi case,
we could make the write to the Slot Control register conditional
on that. But if this turns out to be platform-specific behavior,
we can't deal with it in generic PCI code but only in a quirk.
Thanks,
Lukas
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/2] PCI/portdrv: Add necessary delay for disabling hotplug events
2025-02-04 9:07 ` [PATCH 1/2] PCI/portdrv: Add necessary delay for disabling hotplug events Lukas Wunner
@ 2025-02-05 2:46 ` Feng Tang
0 siblings, 0 replies; 19+ messages in thread
From: Feng Tang @ 2025-02-05 2:46 UTC (permalink / raw)
To: Lukas Wunner
Cc: Bjorn Helgaas, Jonathan Cameron, ilpo.jarvinen, linux-pci,
linux-kernel
Hi Lukas,
Thanks for your review and suggestions!
On Tue, Feb 04, 2025 at 10:07:04AM +0100, Lukas Wunner wrote:
> On Tue, Feb 04, 2025 at 01:37:57PM +0800, Feng Tang wrote:
> > According to PCIe 6.1 spec, section 6.7.3.2, software need to wait at
> > least 1 second for the command-complete event, before resending the cmd
> > or sending a new cmd.
> >
> > Currently get_port_device_capability() sends slot control cmd to disable
> > PCIe hotplug interrupts without waiting for its completion and there was
> > real problem reported for the lack of waiting.
> >
> > Add the necessary wait to comply with PCIe spec. The waiting logic refers
> > existing pcie_poll_cmd().
> [...]
> > --- a/drivers/pci/pcie/portdrv.c
> > +++ b/drivers/pci/pcie/portdrv.c
> > @@ -230,8 +260,7 @@ static int get_port_device_capability(struct pci_dev *dev)
> > * Disable hot-plug interrupts in case they have been enabled
> > * by the BIOS and the hot-plug service driver is not loaded.
> > */
> > - pcie_capability_clear_word(dev, PCI_EXP_SLTCTL,
> > - PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);
> > + pcie_disable_hp_interrupts_early(dev);
> > }
>
> The language of the code comment is a bit confusing in that it
> says the hot-plug driver may not be "loaded". This sounds like
> it could be modular. But it can't. It's always built-in.
I'm not sure what problem this piece of code try to avoid, maybe
something simliar to the irq storm isseu as mentioned in the 2/2 patch?
The code comments could be about the small time window between this
point and the loading of pciehp driver, which will request_irq and
enable hotplug interrupt again.
> So I think what is really meant here is that the driver may be
> *disabled* in the config, i.e. CONFIG_HOTPLUG_PCI_PCIE=n.
>
> Now if CONFIG_HOTPLUG_PCI_PCIE=n, you don't need to observe the
> Command Completed delay because the hotplug driver won't touch
> the Slot Control register afterwards. It's not compiled in.
>
> On the other hand if CONFIG_HOTPLUG_PCI_PCIE=y, you don't need
> to disable the CCIE and HPIE interrupt because the hotplug driver
> will handle them.
>
> So I think the proper solution here is to make the write to the
> Slot Control register conditional on CONFIG_HOTPLUG_PCI_PCIE,
> like this:
>
> if (dev->is_hotplug_bridge &&
> (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT ||
> - pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM) &&
> - (pcie_ports_native || host->native_pcie_hotplug)) {
> - services |= PCIE_PORT_SERVICE_HP;
> + pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM)) {
> + if (pcie_ports_native || host->native_pcie_hotplug)
> + services |= PCIE_PORT_SERVICE_HP;
>
> /*
> * Disable hot-plug interrupts in case they have been enabled
> * by the BIOS and the hot-plug service driver is not loaded.
> */
> - pcie_capability_clear_word(dev, PCI_EXP_SLTCTL,
> - PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);
> + if (!IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE))
> + pcie_capability_clear_word(dev, PCI_EXP_SLTCTL,
> + PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);
Yes, this could sovlve the original issue.
The problem I heard from BIOS developer is, they didn't expect to
receive 2 link control commands at almost the same time, which doesn't
comply to pcie spec, and normaly the handling of one command will take
some time, though not as long as 1 second.
Thanks,
Feng
> The above patch also makes sure the interrupts are quiesced if the
> platform didn't grant hotplug control to OSPM.
>
> Thanks,
>
> Lukas
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] PCI: Disable PCIE hotplug interrupts early when msi is disabled
2025-02-04 9:23 ` Lukas Wunner
@ 2025-02-05 3:58 ` Feng Tang
2025-02-06 6:21 ` Lukas Wunner
0 siblings, 1 reply; 19+ messages in thread
From: Feng Tang @ 2025-02-05 3:58 UTC (permalink / raw)
To: Lukas Wunner
Cc: Bjorn Helgaas, Jonathan Cameron, ilpo.jarvinen, linux-pci,
linux-kernel
Hi Lukas,
On Tue, Feb 04, 2025 at 10:23:45AM +0100, Lukas Wunner wrote:
> On Tue, Feb 04, 2025 at 01:37:58PM +0800, Feng Tang wrote:
> > There was a irq storm bug when testing "pci=nomsi" case, and the root
> > cause is: 'nomsi' will disable MSI and let devices and root ports use
> > legacy INTX inerrupt, and likely make several devices/ports share one
> > interrupt. In the failure case, BIOS doesn't disable the PCIE hotplug
> > interrupts, and actually asserts the command-complete interrupt.
> > As MSI is disabled, ACPI initialization code will not enumerate root
> > port's PCIE hotplug capability, and pciehp service driver wont' be
> > enabled for the root port to handle that interrupt, later on when it is
> > shared and enabled by other device driver like NVME or NIC, the "nobody
> > care irq storm" happens.
>
> Is there a section in the PCI Firmware Spec which says ACPI doesn't
> enumerate the hotplug capability if MSI is disabled?
No, I didn't get it from spec, but found the logic by code reading
during debugging the irq storm issue. The related code is about:
#define ACPI_PCIE_REQ_SUPPORT (OSC_PCI_EXT_CONFIG_SUPPORT \
| OSC_PCI_ASPM_SUPPORT \
| OSC_PCI_CLOCK_PM_SUPPORT \
| OSC_PCI_MSI_SUPPORT)
acpi_pci_root_add
negotiate_os_control
calculate_support
if (pci_msi_enabled())
support |= OSC_PCI_MSI_SUPPORT;
decode_osc_support
os_control_query_checks
if ((support & ACPI_PCIE_REQ_SUPPORT) != ACPI_PCIE_REQ_SUPPORT)
return false
acpi_pci_osc_control_set
And later in get_port_device_capability(), the pciehp service bit
won't be set, and driver is not loaded.
Thanks,
Feng
> If so, it should be referenced in the commit message.
>
> If not, I'm wondering if it's safe to fiddle with the Slot Control
> register given the platform hasn't granted OSPM control of it.
>
> Of course if this is spec-defined behavior in the nomsi case,
> we could make the write to the Slot Control register conditional
> on that. But if this turns out to be platform-specific behavior,
> we can't deal with it in generic PCI code but only in a quirk.
>
> Thanks,
>
> Lukas
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] PCI: Disable PCIE hotplug interrupts early when msi is disabled
2025-02-04 9:14 ` Lukas Wunner
@ 2025-02-05 6:31 ` Feng Tang
2025-02-05 13:31 ` Feng Tang
0 siblings, 1 reply; 19+ messages in thread
From: Feng Tang @ 2025-02-05 6:31 UTC (permalink / raw)
To: Lukas Wunner
Cc: Bjorn Helgaas, Jonathan Cameron, ilpo.jarvinen, linux-pci,
linux-kernel
On Tue, Feb 04, 2025 at 10:14:10AM +0100, Lukas Wunner wrote:
> On Tue, Feb 04, 2025 at 01:37:58PM +0800, Feng Tang wrote:
> > There was a irq storm bug when testing "pci=nomsi" case, and the root
> > cause is: 'nomsi' will disable MSI and let devices and root ports use
> > legacy INTX inerrupt, and likely make several devices/ports share one
> > interrupt. In the failure case, BIOS doesn't disable the PCIE hotplug
> > interrupts, and actually asserts the command-complete interrupt.
> > As MSI is disabled, ACPI initialization code will not enumerate root
> > port's PCIE hotplug capability, and pciehp service driver wont' be
> > enabled for the root port to handle that interrupt, later on when it is
> > shared and enabled by other device driver like NVME or NIC, the "nobody
> > care irq storm" happens.
> >
> > So disable the pcie hotplug CCIE/HPIE interrupt in early boot phase when
> > MSI is not enbaled.
>
> So I think this issue should go away if disabling the interrupt
> by portdrv is no longer conditional on
>
> (pcie_ports_native || host->native_pcie_hotplug)
>
> like I've just proposed here:
>
> https://lore.kernel.org/r/Z6HYuBDP6uvE1Sf4@wunner.de/
>
> ... in which case this patch won't be necessary. Can you confirm that?
Thanks for the suggestion! I will try to get the platform for test,
and report back.
As for the change,
+ if (!IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE))
+ pcie_capability_clear_word(dev, PCI_EXP_SLTCTL,
+ PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);
The CONFIG_HOTPLUG_PCI_PCIE is always enabled on our platform and many
distros, I guess the check needs to be removed, which sees the 1 second
waiting again, and need the waiting logic in 1/2 patch?
Thanks,
Feng
>
> You can split the change I've proposed into two patches if you like.
>
> Thanks,
>
> Lukas
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] PCI: Disable PCIE hotplug interrupts early when msi is disabled
2025-02-05 6:31 ` Feng Tang
@ 2025-02-05 13:31 ` Feng Tang
0 siblings, 0 replies; 19+ messages in thread
From: Feng Tang @ 2025-02-05 13:31 UTC (permalink / raw)
To: Lukas Wunner
Cc: Bjorn Helgaas, Jonathan Cameron, ilpo.jarvinen, linux-pci,
linux-kernel
On Wed, Feb 05, 2025 at 02:31:56PM +0800, Feng Tang wrote:
> On Tue, Feb 04, 2025 at 10:14:10AM +0100, Lukas Wunner wrote:
> > On Tue, Feb 04, 2025 at 01:37:58PM +0800, Feng Tang wrote:
> > > There was a irq storm bug when testing "pci=nomsi" case, and the root
> > > cause is: 'nomsi' will disable MSI and let devices and root ports use
> > > legacy INTX inerrupt, and likely make several devices/ports share one
> > > interrupt. In the failure case, BIOS doesn't disable the PCIE hotplug
> > > interrupts, and actually asserts the command-complete interrupt.
> > > As MSI is disabled, ACPI initialization code will not enumerate root
> > > port's PCIE hotplug capability, and pciehp service driver wont' be
> > > enabled for the root port to handle that interrupt, later on when it is
> > > shared and enabled by other device driver like NVME or NIC, the "nobody
> > > care irq storm" happens.
> > >
> > > So disable the pcie hotplug CCIE/HPIE interrupt in early boot phase when
> > > MSI is not enbaled.
> >
> > So I think this issue should go away if disabling the interrupt
> > by portdrv is no longer conditional on
> >
> > (pcie_ports_native || host->native_pcie_hotplug)
> >
> > like I've just proposed here:
> >
> > https://lore.kernel.org/r/Z6HYuBDP6uvE1Sf4@wunner.de/
> >
> > ... in which case this patch won't be necessary. Can you confirm that?
>
> Thanks for the suggestion! I will try to get the platform for test,
> and report back.
I haven't got the platform, but I recalled something, that disabling HP
interrupts inside get_port_device_capability()/portdrv_probe() got called
after the nvme_probe(), so it may still cause the irq storm due to:
* pcie root port's hotplug interrupt asserted
* the interrupt is shared with NVME and other device
* those device drivers enable the interrupt line early before portdrv's
probe()
That's why we tried to put the disabling early in PCI initialization code.
Thanks,
Feng
> As for the change,
> + if (!IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE))
> + pcie_capability_clear_word(dev, PCI_EXP_SLTCTL,
> + PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);
>
> The CONFIG_HOTPLUG_PCI_PCIE is always enabled on our platform and many
> distros, I guess the check needs to be removed, which sees the 1 second
> waiting again, and need the waiting logic in 1/2 patch?
>
> Thanks,
> Feng
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/2] PCI/portdrv: Add necessary delay for disabling hotplug events
2025-02-04 5:37 [PATCH 1/2] PCI/portdrv: Add necessary delay for disabling hotplug events Feng Tang
2025-02-04 5:37 ` [PATCH 2/2] PCI: Disable PCIE hotplug interrupts early when msi is disabled Feng Tang
2025-02-04 9:07 ` [PATCH 1/2] PCI/portdrv: Add necessary delay for disabling hotplug events Lukas Wunner
@ 2025-02-05 17:48 ` Markus Elfring
2025-02-06 2:42 ` Feng Tang
2025-02-05 18:26 ` [PATCH 1/2] " Sathyanarayanan Kuppuswamy
3 siblings, 1 reply; 19+ messages in thread
From: Markus Elfring @ 2025-02-05 17:48 UTC (permalink / raw)
To: Feng Tang, linux-pci, Bjorn Helgaas
Cc: LKML, Ilpo Järvinen, Jonathan Cameron
…
> Add the necessary wait to comply with PCIe spec. The waiting logic refers
> existing pcie_poll_cmd().
* How do you think about to add any tags (like “Fixes” and “Cc”) accordingly?
* Will cover letters be usually helpful for such patch series?
* You would probably like to avoid a typo in an email address.
Regards,
Markus
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/2] PCI/portdrv: Add necessary delay for disabling hotplug events
2025-02-04 5:37 [PATCH 1/2] PCI/portdrv: Add necessary delay for disabling hotplug events Feng Tang
` (2 preceding siblings ...)
2025-02-05 17:48 ` Markus Elfring
@ 2025-02-05 18:26 ` Sathyanarayanan Kuppuswamy
2025-02-06 3:18 ` Feng Tang
3 siblings, 1 reply; 19+ messages in thread
From: Sathyanarayanan Kuppuswamy @ 2025-02-05 18:26 UTC (permalink / raw)
To: Feng Tang, Bjorn Helgaas
Cc: Jonathan Cameron, ilpo.jarvinen, linux-pci, linux-kernel
On 2/3/25 9:37 PM, Feng Tang wrote:
> According to PCIe 6.1 spec, section 6.7.3.2, software need to wait at
> least 1 second for the command-complete event, before resending the cmd
> or sending a new cmd.
>
> Currently get_port_device_capability() sends slot control cmd to disable
> PCIe hotplug interrupts without waiting for its completion and there was
> real problem reported for the lack of waiting.
Can you include the error log associated with this issue? What is the
actual issue you are seeing and in which hardware?
>
> Add the necessary wait to comply with PCIe spec. The waiting logic refers
> existing pcie_poll_cmd().
>
> Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
> ---
> drivers/pci/pci.h | 2 ++
> drivers/pci/pcie/portdrv.c | 33 +++++++++++++++++++++++++++++++--
> 2 files changed, 33 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 01e51db8d285..c1e234d1b81d 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -759,12 +759,14 @@ static inline void pcie_ecrc_get_policy(char *str) { }
> #ifdef CONFIG_PCIEPORTBUS
> void pcie_reset_lbms_count(struct pci_dev *port);
> int pcie_lbms_count(struct pci_dev *port, unsigned long *val);
> +void pcie_disable_hp_interrupts_early(struct pci_dev *dev);
> #else
> static inline void pcie_reset_lbms_count(struct pci_dev *port) {}
> static inline int pcie_lbms_count(struct pci_dev *port, unsigned long *val)
> {
> return -EOPNOTSUPP;
> }
> +static inline void pcie_disable_hp_interrupts_early(struct pci_dev *dev) {}
> #endif
>
> struct pci_dev_reset_methods {
> diff --git a/drivers/pci/pcie/portdrv.c b/drivers/pci/pcie/portdrv.c
> index 02e73099bad0..16010973bfe2 100644
> --- a/drivers/pci/pcie/portdrv.c
> +++ b/drivers/pci/pcie/portdrv.c
> @@ -18,6 +18,7 @@
> #include <linux/string.h>
> #include <linux/slab.h>
> #include <linux/aer.h>
> +#include <linux/delay.h>
>
> #include "../pci.h"
> #include "portdrv.h"
> @@ -205,6 +206,35 @@ static int pcie_init_service_irqs(struct pci_dev *dev, int *irqs, int mask)
> return 0;
> }
>
> +static int pcie_wait_sltctl_cmd_raw(struct pci_dev *pdev)
> +{
> + u16 slot_status;
> + /* 1000 ms, according toPCIe spec 6.1, section 6.7.3.2 */
> + int timeout = 1000;
> +
> + do {
> + pcie_capability_read_word(pdev, PCI_EXP_SLTSTA, &slot_status);
> + if (slot_status & PCI_EXP_SLTSTA_CC) {
> + pcie_capability_write_word(pdev, PCI_EXP_SLTSTA,
> + PCI_EXP_SLTSTA_CC);
> + return 0;
> + }
> + msleep(10);
> + timeout -= 10;
> + } while (timeout);
> +
> + /* Timeout */
> + return -1;
> +}
May be this logic can be simplified using readl_poll_timeout()?
> +
> +void pcie_disable_hp_interrupts_early(struct pci_dev *dev)
> +{
> + pcie_capability_clear_word(dev, PCI_EXP_SLTCTL,
> + PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);
> + if (pcie_wait_sltctl_cmd_raw(dev))
> + pci_info(dev, "Timeout on disabling hot-plug interrupts\n");
> +}
> +
> /**
> * get_port_device_capability - discover capabilities of a PCI Express port
> * @dev: PCI Express port to examine
> @@ -230,8 +260,7 @@ static int get_port_device_capability(struct pci_dev *dev)
> * Disable hot-plug interrupts in case they have been enabled
> * by the BIOS and the hot-plug service driver is not loaded.
> */
> - pcie_capability_clear_word(dev, PCI_EXP_SLTCTL,
> - PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);
> + pcie_disable_hp_interrupts_early(dev);
> }
>
> #ifdef CONFIG_PCIEAER
--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/2] PCI/portdrv: Add necessary delay for disabling hotplug events
2025-02-05 17:48 ` Markus Elfring
@ 2025-02-06 2:42 ` Feng Tang
2025-02-06 11:40 ` [1/2] " Markus Elfring
0 siblings, 1 reply; 19+ messages in thread
From: Feng Tang @ 2025-02-06 2:42 UTC (permalink / raw)
To: Markus Elfring
Cc: linux-pci, Bjorn Helgaas, LKML, Ilpo Järvinen,
Jonathan Cameron, rafael.j.wysocki
Hi Markus,
Thanks for the review and reminding!
On Wed, Feb 05, 2025 at 06:48:35PM +0100, Markus Elfring wrote:
> …
> > Add the necessary wait to comply with PCIe spec. The waiting logic refers
> > existing pcie_poll_cmd().
>
> * How do you think about to add any tags (like “Fixes” and “Cc”) accordingly?
The code went through some refactor, and the related commit should be:
commit 2bd50dd800b52245294cfceb56be62020cdc7515
Author: Rafael J. Wysocki <rjw@rjwysocki.net>
Date: Sat Aug 21 01:57:39 2010 +0200
PCI: PCIe: Disable PCIe port services during port initialization
In principle PCIe port services may be enabled by the BIOS, so it's
better to disable them during port initialization to avoid spurious
events from being generated.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Add Rafael to cc.
> * Will cover letters be usually helpful for such patch series?
Initially I planned to send the 2 patches seprately as they handle
different issues, but as 2/2 will reuse the helper function added
by 1/2, I put them together.
> * You would probably like to avoid a typo in an email address.
Could you be more specific? I got the mail addresses from get_maintainers.pl.
thanks!
- Feng
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/2] PCI/portdrv: Add necessary delay for disabling hotplug events
2025-02-05 18:26 ` [PATCH 1/2] " Sathyanarayanan Kuppuswamy
@ 2025-02-06 3:18 ` Feng Tang
2025-02-07 4:26 ` Sathyanarayanan Kuppuswamy
0 siblings, 1 reply; 19+ messages in thread
From: Feng Tang @ 2025-02-06 3:18 UTC (permalink / raw)
To: Sathyanarayanan Kuppuswamy
Cc: Bjorn Helgaas, Jonathan Cameron, ilpo.jarvinen, linux-pci,
linux-kernel
Hi Sathyanarayanan,
On Wed, Feb 05, 2025 at 10:26:59AM -0800, Sathyanarayanan Kuppuswamy wrote:
>
> On 2/3/25 9:37 PM, Feng Tang wrote:
> > According to PCIe 6.1 spec, section 6.7.3.2, software need to wait at
> > least 1 second for the command-complete event, before resending the cmd
> > or sending a new cmd.
> >
> > Currently get_port_device_capability() sends slot control cmd to disable
> > PCIe hotplug interrupts without waiting for its completion and there was
> > real problem reported for the lack of waiting.
>
> Can you include the error log associated with this issue? What is the
> actual issue you are seeing and in which hardware?
For this one, we don't have specific log, as it was raised by firmware
developer, as in https://lore.kernel.org/lkml/Z6LRAozZm1UfgjqT@U-2FWC9VHC-2323.local/
When handling PCI hotplug problem, they hit issue and found their state
machine corrupted , and back traced to OS. They didn't expect to receive
2 link control commands at almost the same time, which doesn't comply to
pcie spec, and normally the handling of one command will take some time
in BIOS, though not as long as 1 second. The HW is an ARM server.
I will try to add these info to commit log in next version.
>
> >
> > Add the necessary wait to comply with PCIe spec. The waiting logic refers
> > existing pcie_poll_cmd().
> >
> > Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
> > ---
> > drivers/pci/pci.h | 2 ++
> > drivers/pci/pcie/portdrv.c | 33 +++++++++++++++++++++++++++++++--
> > 2 files changed, 33 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> > index 01e51db8d285..c1e234d1b81d 100644
> > --- a/drivers/pci/pci.h
> > +++ b/drivers/pci/pci.h
> > @@ -759,12 +759,14 @@ static inline void pcie_ecrc_get_policy(char *str) { }
> > #ifdef CONFIG_PCIEPORTBUS
> > void pcie_reset_lbms_count(struct pci_dev *port);
> > int pcie_lbms_count(struct pci_dev *port, unsigned long *val);
> > +void pcie_disable_hp_interrupts_early(struct pci_dev *dev);
> > #else
> > static inline void pcie_reset_lbms_count(struct pci_dev *port) {}
> > static inline int pcie_lbms_count(struct pci_dev *port, unsigned long *val)
> > {
> > return -EOPNOTSUPP;
> > }
> > +static inline void pcie_disable_hp_interrupts_early(struct pci_dev *dev) {}
> > #endif
> > struct pci_dev_reset_methods {
> > diff --git a/drivers/pci/pcie/portdrv.c b/drivers/pci/pcie/portdrv.c
> > index 02e73099bad0..16010973bfe2 100644
> > --- a/drivers/pci/pcie/portdrv.c
> > +++ b/drivers/pci/pcie/portdrv.c
> > @@ -18,6 +18,7 @@
> > #include <linux/string.h>
> > #include <linux/slab.h>
> > #include <linux/aer.h>
> > +#include <linux/delay.h>
> > #include "../pci.h"
> > #include "portdrv.h"
> > @@ -205,6 +206,35 @@ static int pcie_init_service_irqs(struct pci_dev *dev, int *irqs, int mask)
> > return 0;
> > }
> > +static int pcie_wait_sltctl_cmd_raw(struct pci_dev *pdev)
> > +{
> > + u16 slot_status;
> > + /* 1000 ms, according toPCIe spec 6.1, section 6.7.3.2 */
> > + int timeout = 1000;
> > +
> > + do {
> > + pcie_capability_read_word(pdev, PCI_EXP_SLTSTA, &slot_status);
> > + if (slot_status & PCI_EXP_SLTSTA_CC) {
> > + pcie_capability_write_word(pdev, PCI_EXP_SLTSTA,
> > + PCI_EXP_SLTSTA_CC);
> > + return 0;
> > + }
> > + msleep(10);
> > + timeout -= 10;
> > + } while (timeout);
> > +
> > + /* Timeout */
> > + return -1;
> > +}
>
> May be this logic can be simplified using readl_poll_timeout()?
Seems this is what exactly I needed :) Many thanks for the suggestion!
- Feng
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] PCI: Disable PCIE hotplug interrupts early when msi is disabled
2025-02-05 3:58 ` Feng Tang
@ 2025-02-06 6:21 ` Lukas Wunner
2025-02-12 13:04 ` Feng Tang
0 siblings, 1 reply; 19+ messages in thread
From: Lukas Wunner @ 2025-02-06 6:21 UTC (permalink / raw)
To: Feng Tang, Rafael J. Wysocki
Cc: Bjorn Helgaas, Jonathan Cameron, ilpo.jarvinen, linux-pci,
linux-kernel
[to += Rafael, start of thread is here:
https://lore.kernel.org/all/Z6HcoUB3i51bzQDs@wunner.de/
]
Hi Rafael,
On Wed, Feb 05, 2025 at 11:58:04AM +0800, Feng Tang wrote:
> On Tue, Feb 04, 2025 at 10:23:45AM +0100, Lukas Wunner wrote:
> > On Tue, Feb 04, 2025 at 01:37:58PM +0800, Feng Tang wrote:
> > > There was a irq storm bug when testing "pci=nomsi" case, and the root
> > > cause is: 'nomsi' will disable MSI and let devices and root ports use
> > > legacy INTX inerrupt, and likely make several devices/ports share one
> > > interrupt. In the failure case, BIOS doesn't disable the PCIE hotplug
> > > interrupts, and actually asserts the command-complete interrupt.
> > > As MSI is disabled, ACPI initialization code will not enumerate root
> > > port's PCIE hotplug capability, and pciehp service driver wont' be
> > > enabled for the root port to handle that interrupt, later on when it is
> > > shared and enabled by other device driver like NVME or NIC, the "nobody
> > > care irq storm" happens.
> >
> > Is there a section in the PCI Firmware Spec which says ACPI doesn't
> > enumerate the hotplug capability if MSI is disabled?
>
> No, I didn't get it from spec, but found the logic by code reading
> during debugging the irq storm issue. The related code is about:
>
> #define ACPI_PCIE_REQ_SUPPORT (OSC_PCI_EXT_CONFIG_SUPPORT \
> | OSC_PCI_ASPM_SUPPORT \
> | OSC_PCI_CLOCK_PM_SUPPORT \
> | OSC_PCI_MSI_SUPPORT)
Commit 415e12b23792 ("PCI/ACPI: Request _OSC control once for each root
bridge (v3)") contains a change which doesn't seem to be explained in
the commit message:
If the user passes "pci=nomsi" on the command line, Linux doesn't
request hotplug control (or any other control) from the platform.
So ACPI always remains responsible for hotplug in the "pci=nomsi"
case.
The commit sought to fix a cpu hog issue:
https://bugzilla.kernel.org/show_bug.cgi?id=29722
It's unclear to me if that bug was fixed by requesting _OSC only once,
as the commit message suggests, or if the addition of OSC_MSI_SUPPORT
to ACPI_PCIE_REQ_SUPPORT fixed the issue.
Since the latter is not mentioned in the commit message,
it seems plausible to assume that the OSC_MSI_SUPPORT change
was unintentional.
In any case it doesn't seem to make sense to not request any
control in the "pci=nomsi" case.
It's also worth noting that the behavior is different on
Apple machines as they use a fixed _OSC set even for "pci=nomsi".
I'm wondering if OSC_PCI_MSI_SUPPORT should simply be removed
from ACPI_PCIE_REQ_SUPPORT, but I'm worried that it may cause
reappearance of the cpu hog issue.
Thoughts?
Thanks,
Lukas
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [1/2] PCI/portdrv: Add necessary delay for disabling hotplug events
2025-02-06 2:42 ` Feng Tang
@ 2025-02-06 11:40 ` Markus Elfring
2025-02-07 1:40 ` Feng Tang
0 siblings, 1 reply; 19+ messages in thread
From: Markus Elfring @ 2025-02-06 11:40 UTC (permalink / raw)
To: Feng Tang
Cc: linux-pci, Bjorn Helgaas, LKML, Ilpo Järvinen,
Jonathan Cameron, rafael.j.wysocki
> Could you be more specific? I got the mail addresses from get_maintainers.pl.
Would you like to take another look at information also according to Jonathan Cameron
(for example in your patch recipient list)?
Regards,
Markus
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [1/2] PCI/portdrv: Add necessary delay for disabling hotplug events
2025-02-06 11:40 ` [1/2] " Markus Elfring
@ 2025-02-07 1:40 ` Feng Tang
0 siblings, 0 replies; 19+ messages in thread
From: Feng Tang @ 2025-02-07 1:40 UTC (permalink / raw)
To: Markus Elfring
Cc: linux-pci, Bjorn Helgaas, LKML, Ilpo Järvinen,
Jonathan Cameron, rafael.j.wysocki
On Thu, Feb 06, 2025 at 12:40:44PM +0100, Markus Elfring wrote:
> > Could you be more specific? I got the mail addresses from get_maintainers.pl.
> Would you like to take another look at information also according to Jonathan Cameron
> (for example in your patch recipient list)?
I see, thanks!
- Feng
> Regards,
> Markus
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/2] PCI/portdrv: Add necessary delay for disabling hotplug events
2025-02-06 3:18 ` Feng Tang
@ 2025-02-07 4:26 ` Sathyanarayanan Kuppuswamy
2025-02-07 6:17 ` Feng Tang
0 siblings, 1 reply; 19+ messages in thread
From: Sathyanarayanan Kuppuswamy @ 2025-02-07 4:26 UTC (permalink / raw)
To: Feng Tang
Cc: Bjorn Helgaas, Jonathan Cameron, ilpo.jarvinen, linux-pci,
linux-kernel
On 2/5/25 7:18 PM, Feng Tang wrote:
> Hi Sathyanarayanan,
>
> On Wed, Feb 05, 2025 at 10:26:59AM -0800, Sathyanarayanan Kuppuswamy wrote:
>> On 2/3/25 9:37 PM, Feng Tang wrote:
>>> According to PCIe 6.1 spec, section 6.7.3.2, software need to wait at
>>> least 1 second for the command-complete event, before resending the cmd
>>> or sending a new cmd.
>>>
>>> Currently get_port_device_capability() sends slot control cmd to disable
>>> PCIe hotplug interrupts without waiting for its completion and there was
>>> real problem reported for the lack of waiting.
>> Can you include the error log associated with this issue? What is the
>> actual issue you are seeing and in which hardware?
> For this one, we don't have specific log, as it was raised by firmware
> developer, as in https://lore.kernel.org/lkml/Z6LRAozZm1UfgjqT@U-2FWC9VHC-2323.local/
>
> When handling PCI hotplug problem, they hit issue and found their state
> machine corrupted , and back traced to OS. They didn't expect to receive
> 2 link control commands at almost the same time, which doesn't comply to
Which 2 commands from OS? Did you identify both commands?
> pcie spec, and normally the handling of one command will take some time
> in BIOS, though not as long as 1 second. The HW is an ARM server.
>
> I will try to add these info to commit log in next version.
Ok. Please include it.
>
>>> Add the necessary wait to comply with PCIe spec. The waiting logic refers
>>> existing pcie_poll_cmd().
>>>
>>> Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
>>> ---
>>> drivers/pci/pci.h | 2 ++
>>> drivers/pci/pcie/portdrv.c | 33 +++++++++++++++++++++++++++++++--
>>> 2 files changed, 33 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
>>> index 01e51db8d285..c1e234d1b81d 100644
>>> --- a/drivers/pci/pci.h
>>> +++ b/drivers/pci/pci.h
>>> @@ -759,12 +759,14 @@ static inline void pcie_ecrc_get_policy(char *str) { }
>>> #ifdef CONFIG_PCIEPORTBUS
>>> void pcie_reset_lbms_count(struct pci_dev *port);
>>> int pcie_lbms_count(struct pci_dev *port, unsigned long *val);
>>> +void pcie_disable_hp_interrupts_early(struct pci_dev *dev);
>>> #else
>>> static inline void pcie_reset_lbms_count(struct pci_dev *port) {}
>>> static inline int pcie_lbms_count(struct pci_dev *port, unsigned long *val)
>>> {
>>> return -EOPNOTSUPP;
>>> }
>>> +static inline void pcie_disable_hp_interrupts_early(struct pci_dev *dev) {}
>>> #endif
>>> struct pci_dev_reset_methods {
>>> diff --git a/drivers/pci/pcie/portdrv.c b/drivers/pci/pcie/portdrv.c
>>> index 02e73099bad0..16010973bfe2 100644
>>> --- a/drivers/pci/pcie/portdrv.c
>>> +++ b/drivers/pci/pcie/portdrv.c
>>> @@ -18,6 +18,7 @@
>>> #include <linux/string.h>
>>> #include <linux/slab.h>
>>> #include <linux/aer.h>
>>> +#include <linux/delay.h>
>>> #include "../pci.h"
>>> #include "portdrv.h"
>>> @@ -205,6 +206,35 @@ static int pcie_init_service_irqs(struct pci_dev *dev, int *irqs, int mask)
>>> return 0;
>>> }
>>> +static int pcie_wait_sltctl_cmd_raw(struct pci_dev *pdev)
>>> +{
>>> + u16 slot_status;
>>> + /* 1000 ms, according toPCIe spec 6.1, section 6.7.3.2 */
>>> + int timeout = 1000;
>>> +
>>> + do {
>>> + pcie_capability_read_word(pdev, PCI_EXP_SLTSTA, &slot_status);
>>> + if (slot_status & PCI_EXP_SLTSTA_CC) {
>>> + pcie_capability_write_word(pdev, PCI_EXP_SLTSTA,
>>> + PCI_EXP_SLTSTA_CC);
>>> + return 0;
>>> + }
>>> + msleep(10);
>>> + timeout -= 10;
>>> + } while (timeout);
>>> +
>>> + /* Timeout */
>>> + return -1;
>>> +}
>> May be this logic can be simplified using readl_poll_timeout()?
> Seems this is what exactly I needed :) Many thanks for the suggestion!
>
> - Feng
--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/2] PCI/portdrv: Add necessary delay for disabling hotplug events
2025-02-07 4:26 ` Sathyanarayanan Kuppuswamy
@ 2025-02-07 6:17 ` Feng Tang
0 siblings, 0 replies; 19+ messages in thread
From: Feng Tang @ 2025-02-07 6:17 UTC (permalink / raw)
To: Sathyanarayanan Kuppuswamy
Cc: Bjorn Helgaas, Jonathan Cameron, ilpo.jarvinen, linux-pci,
linux-kernel
On Thu, Feb 06, 2025 at 08:26:13PM -0800, Sathyanarayanan Kuppuswamy wrote:
>
> On 2/5/25 7:18 PM, Feng Tang wrote:
> > Hi Sathyanarayanan,
> >
> > On Wed, Feb 05, 2025 at 10:26:59AM -0800, Sathyanarayanan Kuppuswamy wrote:
> > > On 2/3/25 9:37 PM, Feng Tang wrote:
> > > > According to PCIe 6.1 spec, section 6.7.3.2, software need to wait at
> > > > least 1 second for the command-complete event, before resending the cmd
> > > > or sending a new cmd.
> > > >
> > > > Currently get_port_device_capability() sends slot control cmd to disable
> > > > PCIe hotplug interrupts without waiting for its completion and there was
> > > > real problem reported for the lack of waiting.
> > > Can you include the error log associated with this issue? What is the
> > > actual issue you are seeing and in which hardware?
> > For this one, we don't have specific log, as it was raised by firmware
> > developer, as in https://lore.kernel.org/lkml/Z6LRAozZm1UfgjqT@U-2FWC9VHC-2323.local/
> >
> > When handling PCI hotplug problem, they hit issue and found their state
> > machine corrupted , and back traced to OS. They didn't expect to receive
> > 2 link control commands at almost the same time, which doesn't comply to
>
> Which 2 commands from OS? Did you identify both commands?
Firmware developer saw 2 programming to PCIE Slot Control register of
one root port, first is writing 0x5ca, the second is writing 0x15f8.
IIUC, the first one is to disable CCIE/HPIE here from get_port_device_capability()
Thanks,
Feng
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/2] PCI: Disable PCIE hotplug interrupts early when msi is disabled
2025-02-06 6:21 ` Lukas Wunner
@ 2025-02-12 13:04 ` Feng Tang
0 siblings, 0 replies; 19+ messages in thread
From: Feng Tang @ 2025-02-12 13:04 UTC (permalink / raw)
To: Lukas Wunner, Rafael J. Wysocki
Cc: Rafael J. Wysocki, Bjorn Helgaas, Jonathan Cameron, ilpo.jarvinen,
linux-pci, linux-kernel
On Thu, Feb 06, 2025 at 07:21:47AM +0100, Lukas Wunner wrote:
> [to += Rafael, start of thread is here:
> https://lore.kernel.org/all/Z6HcoUB3i51bzQDs@wunner.de/
> ]
>
> Hi Rafael,
>
> On Wed, Feb 05, 2025 at 11:58:04AM +0800, Feng Tang wrote:
> > On Tue, Feb 04, 2025 at 10:23:45AM +0100, Lukas Wunner wrote:
> > > On Tue, Feb 04, 2025 at 01:37:58PM +0800, Feng Tang wrote:
> > > > There was a irq storm bug when testing "pci=nomsi" case, and the root
> > > > cause is: 'nomsi' will disable MSI and let devices and root ports use
> > > > legacy INTX inerrupt, and likely make several devices/ports share one
> > > > interrupt. In the failure case, BIOS doesn't disable the PCIE hotplug
> > > > interrupts, and actually asserts the command-complete interrupt.
> > > > As MSI is disabled, ACPI initialization code will not enumerate root
> > > > port's PCIE hotplug capability, and pciehp service driver wont' be
> > > > enabled for the root port to handle that interrupt, later on when it is
> > > > shared and enabled by other device driver like NVME or NIC, the "nobody
> > > > care irq storm" happens.
> > >
> > > Is there a section in the PCI Firmware Spec which says ACPI doesn't
> > > enumerate the hotplug capability if MSI is disabled?
> >
> > No, I didn't get it from spec, but found the logic by code reading
> > during debugging the irq storm issue. The related code is about:
> >
> > #define ACPI_PCIE_REQ_SUPPORT (OSC_PCI_EXT_CONFIG_SUPPORT \
> > | OSC_PCI_ASPM_SUPPORT \
> > | OSC_PCI_CLOCK_PM_SUPPORT \
> > | OSC_PCI_MSI_SUPPORT)
>
> Commit 415e12b23792 ("PCI/ACPI: Request _OSC control once for each root
> bridge (v3)") contains a change which doesn't seem to be explained in
> the commit message:
>
> If the user passes "pci=nomsi" on the command line, Linux doesn't
> request hotplug control (or any other control) from the platform.
> So ACPI always remains responsible for hotplug in the "pci=nomsi"
> case.
>
> The commit sought to fix a cpu hog issue:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=29722
>
> It's unclear to me if that bug was fixed by requesting _OSC only once,
> as the commit message suggests, or if the addition of OSC_MSI_SUPPORT
> to ACPI_PCIE_REQ_SUPPORT fixed the issue.
>
> Since the latter is not mentioned in the commit message,
> it seems plausible to assume that the OSC_MSI_SUPPORT change
> was unintentional.
>
> In any case it doesn't seem to make sense to not request any
> control in the "pci=nomsi" case.
>
> It's also worth noting that the behavior is different on
> Apple machines as they use a fixed _OSC set even for "pci=nomsi".
>
> I'm wondering if OSC_PCI_MSI_SUPPORT should simply be removed
> from ACPI_PCIE_REQ_SUPPORT, but I'm worried that it may cause
> reappearance of the cpu hog issue.
Hi Lukas,
I tried to remove OSC_PCI_MSI_SUPPORT from ACPI_PCIE_REQ_SUPPORT, but
after negotiate_os_control(), the 'PCIeHotplug' control is still
disabled in the control capability after ACPI query_osc, run_osc
routines (I haven't figured out why yet), thus the pciehp severvice
driver can't be loader.
Thanks,
Feng
> Thoughts?
>
> Thanks,
>
> Lukas
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2025-02-12 13:04 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-04 5:37 [PATCH 1/2] PCI/portdrv: Add necessary delay for disabling hotplug events Feng Tang
2025-02-04 5:37 ` [PATCH 2/2] PCI: Disable PCIE hotplug interrupts early when msi is disabled Feng Tang
2025-02-04 9:14 ` Lukas Wunner
2025-02-05 6:31 ` Feng Tang
2025-02-05 13:31 ` Feng Tang
2025-02-04 9:23 ` Lukas Wunner
2025-02-05 3:58 ` Feng Tang
2025-02-06 6:21 ` Lukas Wunner
2025-02-12 13:04 ` Feng Tang
2025-02-04 9:07 ` [PATCH 1/2] PCI/portdrv: Add necessary delay for disabling hotplug events Lukas Wunner
2025-02-05 2:46 ` Feng Tang
2025-02-05 17:48 ` Markus Elfring
2025-02-06 2:42 ` Feng Tang
2025-02-06 11:40 ` [1/2] " Markus Elfring
2025-02-07 1:40 ` Feng Tang
2025-02-05 18:26 ` [PATCH 1/2] " Sathyanarayanan Kuppuswamy
2025-02-06 3:18 ` Feng Tang
2025-02-07 4:26 ` Sathyanarayanan Kuppuswamy
2025-02-07 6:17 ` Feng Tang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox