From: Feng Tang <feng.tang@linux.alibaba.com>
To: Bjorn Helgaas <bhelgaas@google.com>,
Lukas Wunner <lukas@wunner.de>,
Sathyanarayanan Kuppuswamy
<sathyanarayanan.kuppuswamy@linux.intel.com>,
Liguang Zhang <zhangliguang@linux.alibaba.com>,
Guanghui Feng <guanghuifeng@linux.alibaba.com>,
rafael@kernel.org
Cc: Markus Elfring <Markus.Elfring@web.de>,
Jonathan Cameron <Jonathan.Cameron@huawei.com>,
ilpo.jarvinen@linux.intel.com, linux-pci@vger.kernel.org,
linux-kernel@vger.kernel.org,
Feng Tang <feng.tang@linux.alibaba.com>
Subject: [PATCH v2 1/2] PCI/portdrv: Add necessary wait for disabling hotplug events
Date: Tue, 18 Feb 2025 11:48:58 +0800
Message-ID: <20250218034859.40397-1-feng.tang@linux.alibaba.com>
Firmware developers reported that, on an ARM server, they received two
PCIe link control commands within a very short interval. This does not
comply with the PCIe spec, and it broke their state machine and work
flow. According to the PCIe 6.1 spec, section 6.7.3.2, software needs
to wait at least 1 second for the command-complete event before
resending the command or sending a new one.

The first link control command the firmware received comes from
get_port_device_capability(), which sends a command to disable PCIe
hotplug interrupts without waiting for its completion.

Fix it by adding the necessary wait to comply with the PCIe spec,
modeled on pcie_poll_cmd().

Also make the interrupt disabling independent of whether the pciehp
service driver will be loaded, as suggested by Lukas.
Fixes: 2bd50dd800b5 ("PCI: PCIe: Disable PCIe port services during port initialization")
Originally-by: Liguang Zhang <zhangliguang@linux.alibaba.com>
Suggested-by: Sathyanarayanan Kuppuswamy <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
---
Changelog:
since v1:
* Add the Originally-by tag for Liguang. The issue was found on a 5.10
kernel, then on 6.6. I was initially given a 5.10 kernel tarball without
git info to debug the issue, and made the patch. Thanks to Guanghui, who
recently pointed me to the tree https://gitee.com/anolis/cloud-kernel,
which shows that the wait logic in 5.10 was originally from Liguang and
never hit mainline.
* Make the irq disabling independent of whether the pciehp service driver
will be loaded (Lukas Wunner)
* Use the read_poll_timeout() API to simplify the waiting logic
(Sathyanarayanan Kuppuswamy)
* Add logic to skip irq disabling if it is already disabled.
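
For anyone who wants to reason about the intended control flow outside the
kernel, the following is a minimal userspace sketch of it, not the kernel
code itself. It assumes a simulated port: struct fake_port, read_sltsta(),
wait_cmd_complete() and disable_hp_interrupts() are hypothetical names
invented for this illustration, and each poll stands in for one 10 ms step
of the 1 second budget used in the patch. Only the bit values are taken
from include/uapi/linux/pci_regs.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values as defined in include/uapi/linux/pci_regs.h. */
#define SLTCTL_CCIE 0x0010 /* Command Completed Interrupt Enable */
#define SLTCTL_HPIE 0x0020 /* Hot-Plug Interrupt Enable */
#define SLTSTA_CC   0x0010 /* Command Completed */

/* Hypothetical model of a hotplug-capable port; not a kernel structure. */
struct fake_port {
	uint16_t sltctl;     /* simulated Slot Control register */
	uint16_t sltsta;     /* simulated Slot Status register */
	int polls_until_cc;  /* status reads before CC appears; < 0 means never */
	int writes;          /* number of Slot Control writes issued */
};

/* Simulated Slot Status read: CC shows up after the configured delay. */
static uint16_t read_sltsta(struct fake_port *p)
{
	if (p->polls_until_cc == 0)
		p->sltsta |= SLTSTA_CC;
	if (p->polls_until_cc > 0)
		p->polls_until_cc--;
	return p->sltsta;
}

/*
 * Poll for Command Completed. With 10 ms per poll, max_polls = 100
 * corresponds to the 1 second timeout. Returns 0 on completion, -1 on
 * timeout.
 */
static int wait_cmd_complete(struct fake_port *p, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (read_sltsta(p) & SLTSTA_CC) {
			/* Models the write-1-to-clear of the CC status bit. */
			p->sltsta &= (uint16_t)~SLTSTA_CC;
			return 0;
		}
	}
	return -1;
}

/* Clear CCIE/HPIE if either is set, then wait for the command to complete. */
static int disable_hp_interrupts(struct fake_port *p, int max_polls)
{
	assert(p != NULL);

	/* Nothing to do, so no command is issued and no wait is needed. */
	if (!(p->sltctl & (SLTCTL_CCIE | SLTCTL_HPIE)))
		return 0;

	p->sltctl &= (uint16_t)~(SLTCTL_CCIE | SLTCTL_HPIE);
	p->writes++;

	return wait_cmd_complete(p, max_polls);
}
```

Calling disable_hp_interrupts(&port, 100) on a port with both enable bits
clear issues no write at all, which is the point of the early bail-out: a
write to Slot Control is what the firmware sees as a command, so skipping
it also skips the wait.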
drivers/pci/pci.h | 2 ++
drivers/pci/pcie/portdrv.c | 44 +++++++++++++++++++++++++++++++++-----
2 files changed, 41 insertions(+), 5 deletions(-)
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 01e51db8d285..c1e234d1b81d 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -759,12 +759,14 @@ static inline void pcie_ecrc_get_policy(char *str) { }
#ifdef CONFIG_PCIEPORTBUS
void pcie_reset_lbms_count(struct pci_dev *port);
int pcie_lbms_count(struct pci_dev *port, unsigned long *val);
+void pcie_disable_hp_interrupts_early(struct pci_dev *dev);
#else
static inline void pcie_reset_lbms_count(struct pci_dev *port) {}
static inline int pcie_lbms_count(struct pci_dev *port, unsigned long *val)
{
return -EOPNOTSUPP;
}
+static inline void pcie_disable_hp_interrupts_early(struct pci_dev *dev) {}
#endif
struct pci_dev_reset_methods {
diff --git a/drivers/pci/pcie/portdrv.c b/drivers/pci/pcie/portdrv.c
index 02e73099bad0..2470333bba2f 100644
--- a/drivers/pci/pcie/portdrv.c
+++ b/drivers/pci/pcie/portdrv.c
@@ -18,6 +18,7 @@
#include <linux/string.h>
#include <linux/slab.h>
#include <linux/aer.h>
+#include <linux/iopoll.h>
#include "../pci.h"
#include "portdrv.h"
@@ -205,6 +206,40 @@ static int pcie_init_service_irqs(struct pci_dev *dev, int *irqs, int mask)
return 0;
}
+static int pcie_wait_sltctl_cmd_raw(struct pci_dev *dev)
+{
+ u16 slot_status = 0;
+ int ret, ret1, timeout_us;
+
+ /* 1 second, according to PCIe spec 6.1, section 6.7.3.2 */
+ timeout_us = 1000000;
+ ret = read_poll_timeout(pcie_capability_read_word, ret1,
+ (slot_status & PCI_EXP_SLTSTA_CC), 10000,
+ timeout_us, true, dev, PCI_EXP_SLTSTA,
+ &slot_status);
+ if (!ret)
+ pcie_capability_write_word(dev, PCI_EXP_SLTSTA,
+ PCI_EXP_SLTSTA_CC);
+
+ return ret;
+}
+
+void pcie_disable_hp_interrupts_early(struct pci_dev *dev)
+{
+ u16 slot_ctrl = 0;
+
+ pcie_capability_read_word(dev, PCI_EXP_SLTCTL, &slot_ctrl);
+ /* Bail out early if it is already disabled */
+ if (!(slot_ctrl & (PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE)))
+ return;
+
+ pcie_capability_clear_word(dev, PCI_EXP_SLTCTL,
+ PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);
+
+ if (pcie_wait_sltctl_cmd_raw(dev))
+ pci_info(dev, "Timeout on disabling PCIE hot-plug interrupt\n");
+}
+
/**
* get_port_device_capability - discover capabilities of a PCI Express port
* @dev: PCI Express port to examine
@@ -222,16 +257,15 @@ static int get_port_device_capability(struct pci_dev *dev)
if (dev->is_hotplug_bridge &&
(pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT ||
- pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM) &&
- (pcie_ports_native || host->native_pcie_hotplug)) {
- services |= PCIE_PORT_SERVICE_HP;
+ pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM)) {
+ if (pcie_ports_native || host->native_pcie_hotplug)
+ services |= PCIE_PORT_SERVICE_HP;
/*
* Disable hot-plug interrupts in case they have been enabled
* by the BIOS and the hot-plug service driver is not loaded.
*/
- pcie_capability_clear_word(dev, PCI_EXP_SLTCTL,
- PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);
+ pcie_disable_hp_interrupts_early(dev);
}
#ifdef CONFIG_PCIEAER
--
2.43.5