From: Sathyanarayanan Kuppuswamy <sathyanarayanan.kuppuswamy@linux.intel.com>
To: Lukas Wunner <lukas@wunner.de>
Cc: Bjorn Helgaas <helgaas@kernel.org>,
Keith Busch <kbusch@kernel.org>,
Yicong Yang <yangyicong@hisilicon.com>,
linux-pci@vger.kernel.org,
Stuart Hayes <stuart.w.hayes@gmail.com>,
Mika Westerberg <mika.westerberg@linux.intel.com>,
Ilpo Jarvinen <ilpo.jarvinen@linux.intel.com>,
Joel Mathew Thomas <proxy0@tutamail.com>,
Russ Weight <russ.weight@linux.dev>,
Matthew Gerlach <matthew.gerlach@altera.com>,
Yilun Xu <yilun.xu@intel.com>,
linux-fpga@vger.kernel.org, Moshe Shemesh <moshe@nvidia.com>,
Shay Drory <shayd@nvidia.com>, Saeed Mahameed <saeedm@nvidia.com>,
Alex Williamson <alex.williamson@redhat.com>
Subject: Re: [PATCH 2/2] PCI: pciehp: Ignore Link Down/Up caused by Secondary Bus Reset
Date: Sun, 13 Apr 2025 10:21:57 -0700 [thread overview]
Message-ID: <5e628372-5cf5-43e6-897e-166b7666e4ed@linux.intel.com> (raw)
In-Reply-To: <Z_nfuGrVh_CO7vbe@wunner.de>
On 4/11/25 8:36 PM, Lukas Wunner wrote:
> On Fri, Apr 11, 2025 at 03:28:15PM -0700, Sathyanarayanan Kuppuswamy wrote:
>> On 4/10/25 8:27 AM, Lukas Wunner wrote:
>>> Introduce two helpers to annotate code sections which cause spurious link
>>> changes: pci_hp_ignore_link_change() and pci_hp_unignore_link_change()
>>> Use those helpers in lieu of masking interrupts in the Slot Control
>>> register.
>>>
>>> Introduce a helper to check whether such a code section is executing
>>> concurrently and if so, await it: pci_hp_spurious_link_change()
>>> Invoke the helper in the hotplug IRQ thread pciehp_ist(). Re-use the
>>> IRQ thread's existing code which ignores DPC-induced link changes unless
>>> the link is unexpectedly down after reset recovery or the device was
>>> replaced during the bus reset.
>> Since most of the changes in this patch is related to adding framework to
>> ignore spurious hotplug event, why not split it in to a separate patch?
> The idea is to ease backporting. The issue fixes VFIO passthrough on
> v6.13-rc1 and newer, so the patch will have to be backported at least
> to v6.13, v6.14, v6.15.
Makes sense.
>
>> Is this fix tested in any platform?
> Yes, Joel Mathew Thomas successfully tested it:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=219765
>
> That's an Asus TUF FA507NU dual-GPU laptop (AMD CPU + Nvidia discrete GPU).
> The Nvidia GPU is passed through to a VM.
>
>
>>> --- a/drivers/pci/pci.h
>>> +++ b/drivers/pci/pci.h
>>> @@ -227,6 +227,7 @@ static inline bool pcie_downstream_port(const struct pci_dev *dev)
>>> /* Functions for PCI Hotplug drivers to use */
>>> int pci_hp_add_bridge(struct pci_dev *dev);
>>> +bool pci_hp_spurious_link_change(struct pci_dev *pdev);
>> Do you expect this function used outside hotplug code? If not why not leave
>> the declaration in pciehp.h?
> Any hotplug driver may call this. In particular, there are two other drivers
> implementing the ->reset_slot() callback: pnv_php.c and s390_pci_hpc.c
>
> pnv_php.c does a similar dance as pciehp_hpc.c (before this patch):
> It disables the interrupt, performs a Secondary Bus Reset, clears spurious
> events and re-enables the interrupt. I think it can be converted to use
> the newly introduced API. That should make it more robust against removal
> or replacement of the device during the Secondary Bus Reset.
>
> Also, to cope with runtime suspend to D3cold, there's an ignore_hotplug
> bit in struct pci_dev. The bit is set by drivers for discrete GPUs and
> is honored by acpiphp and pciehp. I'd like to remove the bit in favor
> of the new mechanism introduced here and that means acpiphp will have to
> be converted to call pci_hp_spurious_link_change().
>
> One thing that's problematic about the current behavior is that hotplug
> events are ignored wholesale for GPUs which runtime suspend to D3cold.
> That works for discrete GPUs in laptops which are soldered down on the
> mainboard, but doesn't work for GPUs which are plugged into an actual
> hotplug port, e.g. data center GPUs. The new API will allow detecting
> and ignoring spurious events in a more surgical manner. I envision
> that __pci_set_power_state() will call pci_hp_ignore_link_change()
> if the target power state is D3cold. Also vga_switcheroo.c will have
> to call that. But none of the GPU drivers will have to call
> pci_ignore_hotplug() anymore.
>
> To summarize, there are at least two other hotplug drivers besides pciehp
> which I expect will call pci_hp_spurious_link_change() in the foreseeable
> future, acpiphp and pnv_php, hence the declaration is not in pciehp.h
> but in drivers/pci/pci.h.
Thanks for sharing the details.
>
>>> --- a/include/linux/pci.h
>>> +++ b/include/linux/pci.h
>>> @@ -1848,6 +1848,14 @@ static inline void pcie_no_aspm(void) { }
>>> static inline bool pcie_aspm_enabled(struct pci_dev *pdev) { return false; }
>>> #endif
>>> +#ifdef CONFIG_HOTPLUG_PCI
>>> +void pci_hp_ignore_link_change(struct pci_dev *pdev);
>>> +void pci_hp_unignore_link_change(struct pci_dev *pdev);
>>> +#else
>>> +static inline void pci_hp_ignore_link_change(struct pci_dev *pdev) { }
>>> +static inline void pci_hp_unignore_link_change(struct pci_dev *pdev) { }
>>> +#endif
>>> +
>> Generally we expose APIs when we really need it. Since you have already
>> identified some use cases where you will use it in other drivers, why not
>> include one such change as an example?
> Mostly because I wanted to get this fix out the door quickly before more
> people come across the regression it addresses.
>
> Converting the Mellanox Ethernet driver to use the API would require an ack
> from its maintainers. Since it's not urgent, I was planning to do that in
> a subsequent cycle. Same for the conversion of D3cold handling.
Got it.
>
> Thanks,
>
> Lukas
>
--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer
next prev parent reply other threads:[~2025-04-13 17:21 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-10 15:27 [PATCH 0/2] Ignore spurious PCIe hotplug events Lukas Wunner
2025-04-10 15:27 ` [PATCH 1/2] PCI: pciehp: Ignore Presence Detect Changed caused by DPC Lukas Wunner
2025-04-11 2:34 ` Sathyanarayanan Kuppuswamy
2025-04-11 8:58 ` Lukas Wunner
2025-04-14 13:33 ` Ilpo Järvinen
2025-04-10 15:27 ` [PATCH 2/2] PCI: pciehp: Ignore Link Down/Up caused by Secondary Bus Reset Lukas Wunner
2025-04-11 22:28 ` Sathyanarayanan Kuppuswamy
2025-04-12 3:36 ` Lukas Wunner
2025-04-13 17:21 ` Sathyanarayanan Kuppuswamy [this message]
2025-04-13 17:22 ` Sathyanarayanan Kuppuswamy
2025-04-14 13:32 ` Ilpo Järvinen
2025-04-16 8:00 ` Ilpo Järvinen
2025-04-10 22:19 ` [PATCH 0/2] Ignore spurious PCIe hotplug events Bjorn Helgaas
2025-04-15 20:51 ` Keith Busch
2025-04-16 15:06 ` Lukas Wunner
2025-04-18 1:26 ` Keith Busch
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5e628372-5cf5-43e6-897e-166b7666e4ed@linux.intel.com \
--to=sathyanarayanan.kuppuswamy@linux.intel.com \
--cc=alex.williamson@redhat.com \
--cc=helgaas@kernel.org \
--cc=ilpo.jarvinen@linux.intel.com \
--cc=kbusch@kernel.org \
--cc=linux-fpga@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=lukas@wunner.de \
--cc=matthew.gerlach@altera.com \
--cc=mika.westerberg@linux.intel.com \
--cc=moshe@nvidia.com \
--cc=proxy0@tutamail.com \
--cc=russ.weight@linux.dev \
--cc=saeedm@nvidia.com \
--cc=shayd@nvidia.com \
--cc=stuart.w.hayes@gmail.com \
--cc=yangyicong@hisilicon.com \
--cc=yilun.xu@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).