Linux PCI subsystem development
 help / color / mirror / Atom feed
* [PATCH] PCI: pciehp: Disable link and turn off slot power while removing
@ 2026-05-06 16:43 Mario Limonciello
  2026-05-06 17:55 ` Lukas Wunner
  2026-05-06 20:58 ` sashiko-bot
  0 siblings, 2 replies; 6+ messages in thread
From: Mario Limonciello @ 2026-05-06 16:43 UTC (permalink / raw)
  To: bhelgaas; +Cc: Mario Limonciello (AMD), linux-pci

From: "Mario Limonciello (AMD)" <superm1@kernel.org>

When a hotplug slot is removed, pciehp still leaves both the slot and link
active. If a device is still physically connected, platform firmware
may continue to poll the link. On some systems with Thunderbolt 3 devices
connected this prevents the system from shutting down until the device is
disconnected.

To fix this add code to pciehp_power_off_slot() to reverse the init
sequence from pciehp_power_on_slot() (power on, then enable link becomes
disable link, then power off).

Call this cleanup when releasing the controller so the link is disabled
during driver removal.

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
---
 drivers/pci/hotplug/pciehp_hpc.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index 4c62140a3cb44..6705dcaf1ad0c 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -348,6 +348,11 @@ static int pciehp_link_enable(struct controller *ctrl)
 	return __pciehp_link_set(ctrl, true);
 }
 
+static int pciehp_link_disable(struct controller *ctrl)
+{
+	return __pciehp_link_set(ctrl, false);
+}
+
 int pciehp_get_raw_indicator_status(struct hotplug_slot *hotplug_slot,
 				    u8 *status)
 {
@@ -558,6 +563,12 @@ int pciehp_power_on_slot(struct controller *ctrl)
 
 void pciehp_power_off_slot(struct controller *ctrl)
 {
+	int retval;
+
+	retval = pciehp_link_disable(ctrl);
+	if (retval)
+		ctrl_err(ctrl, "%s: Can not disable the link!\n", __func__);
+
 	pcie_write_cmd(ctrl, PCI_EXP_SLTCTL_PWR_OFF, PCI_EXP_SLTCTL_PCC);
 	ctrl_dbg(ctrl, "%s: SLOTCTRL %x write cmd %x\n", __func__,
 		 pci_pcie_cap(ctrl->pcie->port) + PCI_EXP_SLTCTL,
@@ -1096,6 +1107,8 @@ struct controller *pcie_init(struct pcie_device *dev)
 void pciehp_release_ctrl(struct controller *ctrl)
 {
 	cancel_delayed_work_sync(&ctrl->button_work);
+	if (POWER_CTRL(ctrl))
+		pciehp_power_off_slot(ctrl);
 	kfree(ctrl);
 }
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] PCI: pciehp: Disable link and turn off slot power while removing
  2026-05-06 16:43 [PATCH] PCI: pciehp: Disable link and turn off slot power while removing Mario Limonciello
@ 2026-05-06 17:55 ` Lukas Wunner
  2026-05-06 19:16   ` Mario Limonciello
  2026-05-06 20:58 ` sashiko-bot
  1 sibling, 1 reply; 6+ messages in thread
From: Lukas Wunner @ 2026-05-06 17:55 UTC (permalink / raw)
  To: Mario Limonciello; +Cc: bhelgaas, Mario Limonciello (AMD), linux-pci

On Wed, May 06, 2026 at 11:43:50AM -0500, Mario Limonciello wrote:
> When a hotplug slot is removed, pciehp still leaves both the slot and link
> active. If a device is still physically connected, platform firmware
> may continue to poll the link. On some systems with Thunderbolt 3 devices
> connected this prevents the system from shutting down until the device is
> disconnected.

"On some systems" is a little vague.  Which ones?  What is platform firmware
doing there exactly, is this something in AML?

> To fix this add code to pciehp_power_off_slot() to reverse the init
> sequence from pciehp_power_on_slot() (power on, then enable link becomes
> disable link, then power off).

Unfortunately it's not as easy as that.  You're essentially reverting
commit b1811d2455f3 ("PCI: pciehp: Don't disable the link permanently
during removal"), so you'll regress those products for which the commit
deliberately removed link disablement on slot poweroff.

Moreover there are products where Presence Detect is hardwired to zero,
see commit 80696f991424 ("PCI: pciehp: Tolerate Presence Detect hardwired
to zero").  So we have to rely on a link change to detect that a new device
has been plugged in after slot poweroff.  That won't work if the link is
disabled.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] PCI: pciehp: Disable link and turn off slot power while removing
  2026-05-06 17:55 ` Lukas Wunner
@ 2026-05-06 19:16   ` Mario Limonciello
  2026-05-07  4:26     ` Lukas Wunner
  0 siblings, 1 reply; 6+ messages in thread
From: Mario Limonciello @ 2026-05-06 19:16 UTC (permalink / raw)
  To: Lukas Wunner; +Cc: bhelgaas, Mario Limonciello (AMD), linux-pci



On 5/6/26 12:55, Lukas Wunner wrote:
> On Wed, May 06, 2026 at 11:43:50AM -0500, Mario Limonciello wrote:
>> When a hotplug slot is removed, pciehp still leaves both the slot and link
>> active. If a device is still physically connected, platform firmware
>> may continue to poll the link. On some systems with Thunderbolt 3 devices
>> connected this prevents the system from shutting down until the device is
>> disconnected.
> 
> "On some systems" is a little vague.  Which ones?  What is platform firmware
> doing there exactly, is this something in AML?

In this case it was a Strix Halo OEM design that I don't have visiblity 
a layer lower.

But issue is specific to Thunderbolt 3 devices, doesn't happen with USB4 
devices with an active PCIe tunnel for some reason.

This doesn't reproduce on the reference designs we tried.

I /might/ be able to get some traces though from a USB4 analyzer with 
OEM design.

> 
>> To fix this add code to pciehp_power_off_slot() to reverse the init
>> sequence from pciehp_power_on_slot() (power on, then enable link becomes
>> disable link, then power off).
> 
> Unfortunately it's not as easy as that.  You're essentially reverting
> commit b1811d2455f3 ("PCI: pciehp: Don't disable the link permanently
> during removal"), so you'll regress those products for which the commit
> deliberately removed link disablement on slot poweroff.

Thanks for that pointer.  Unfortuantely the Link: tag for the discussion 
is HTTP 403.

> 
> Moreover there are products where Presence Detect is hardwired to zero,
> see commit 80696f991424 ("PCI: pciehp: Tolerate Presence Detect hardwired
> to zero").  So we have to rely on a link change to detect that a new device
> has been plugged in after slot poweroff.  That won't work if the link is
> disabled.
> 

How would you feel about turning off link detect somewhere specific to 
the shutdown path?

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] PCI: pciehp: Disable link and turn off slot power while removing
  2026-05-06 16:43 [PATCH] PCI: pciehp: Disable link and turn off slot power while removing Mario Limonciello
  2026-05-06 17:55 ` Lukas Wunner
@ 2026-05-06 20:58 ` sashiko-bot
  1 sibling, 0 replies; 6+ messages in thread
From: sashiko-bot @ 2026-05-06 20:58 UTC (permalink / raw)
  To: Mario Limonciello; +Cc: linux-pci

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [Critical] Unconditional slot power-off in `pciehp_release_ctrl()` abruptly cuts power to active devices, causing kernel crashes.
- [Low] Dead code in `pciehp_power_off_slot()` due to `pciehp_link_disable()` always returning 0.
--

commit 5574c68659aab56ef2cf2ec6acc4c999a6afa1fc
Author: Mario Limonciello (AMD) <superm1@kernel.org>

PCI: pciehp: Disable link and turn off slot power while removing

This commit adds code to pciehp_power_off_slot() to disable the link
before turning off slot power. It also introduces a cleanup sequence
in pciehp_release_ctrl() to disable the link and turn off slot power
during driver removal.

> diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
> index 4c62140a3cb44..6705dcaf1ad0c 100644
> --- a/drivers/pci/hotplug/pciehp_hpc.c
> +++ b/drivers/pci/hotplug/pciehp_hpc.c
[ ... ]
> @@ -558,6 +563,12 @@ int pciehp_power_on_slot(struct controller *ctrl)
>  
>  void pciehp_power_off_slot(struct controller *ctrl)
>  {
> +	int retval;
> +
> +	retval = pciehp_link_disable(ctrl);
> +	if (retval)
> +		ctrl_err(ctrl, "%s: Can not disable the link!\n", __func__);

Since pciehp_link_disable() calls __pciehp_link_set() which unconditionally
returns 0, will this error path ever be executed?

> +
>  	pcie_write_cmd(ctrl, PCI_EXP_SLTCTL_PWR_OFF, PCI_EXP_SLTCTL_PCC);
>  	ctrl_dbg(ctrl, "%s: SLOTCTRL %x write cmd %x\n", __func__,
>  		 pci_pcie_cap(ctrl->pcie->port) + PCI_EXP_SLTCTL,
[ ... ]
> @@ -1096,6 +1107,8 @@ struct controller *pcie_init(struct pcie_device *dev)
>  void pciehp_release_ctrl(struct controller *ctrl)
>  {
>  	cancel_delayed_work_sync(&ctrl->button_work);
> +	if (POWER_CTRL(ctrl))
> +		pciehp_power_off_slot(ctrl);
>  	kfree(ctrl);
>  }

If the pciehp driver is unbound manually or encounters an error during
probe, could this unconditionally cut power to active devices?

Since pciehp is a port service, unbinding it removes hotplug monitoring
but does not detach the active child PCIe devices (such as NVMe drives)
from the PCI core.

If pciehp_release_ctrl() is called while child drivers are still bound and
running, does this lead to unexpected link down events and unsupported
requests?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260506164353.1278571-1-mario.limonciello@amd.com?part=1

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] PCI: pciehp: Disable link and turn off slot power while removing
  2026-05-06 19:16   ` Mario Limonciello
@ 2026-05-07  4:26     ` Lukas Wunner
  2026-05-07  5:03       ` Mario Limonciello
  0 siblings, 1 reply; 6+ messages in thread
From: Lukas Wunner @ 2026-05-07  4:26 UTC (permalink / raw)
  To: Mario Limonciello; +Cc: bhelgaas, Mario Limonciello (AMD), linux-pci

On Wed, May 06, 2026 at 02:16:10PM -0500, Mario Limonciello wrote:
> On 5/6/26 12:55, Lukas Wunner wrote:
> > Moreover there are products where Presence Detect is hardwired to zero,
> > see commit 80696f991424 ("PCI: pciehp: Tolerate Presence Detect hardwired
> > to zero").  So we have to rely on a link change to detect that a new device
> > has been plugged in after slot poweroff.  That won't work if the link is
> > disabled.
> 
> How would you feel about turning off link detect somewhere specific to the
> shutdown path?

I *think* shutdown is performed bottom-up in the hierarchy and so
this might work.  But it's not great to introduce such a change
for everyone just because a single product needs it.  It would
be better to root-cause the issue.  Maybe the vendor is able to
provide a BIOS update which resolves it?

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] PCI: pciehp: Disable link and turn off slot power while removing
  2026-05-07  4:26     ` Lukas Wunner
@ 2026-05-07  5:03       ` Mario Limonciello
  0 siblings, 0 replies; 6+ messages in thread
From: Mario Limonciello @ 2026-05-07  5:03 UTC (permalink / raw)
  To: Lukas Wunner; +Cc: bhelgaas, Mario Limonciello (AMD), linux-pci



On 5/6/26 23:26, Lukas Wunner wrote:
> On Wed, May 06, 2026 at 02:16:10PM -0500, Mario Limonciello wrote:
>> On 5/6/26 12:55, Lukas Wunner wrote:
>>> Moreover there are products where Presence Detect is hardwired to zero,
>>> see commit 80696f991424 ("PCI: pciehp: Tolerate Presence Detect hardwired
>>> to zero").  So we have to rely on a link change to detect that a new device
>>> has been plugged in after slot poweroff.  That won't work if the link is
>>> disabled.
>>
>> How would you feel about turning off link detect somewhere specific to the
>> shutdown path?
> 
> I *think* shutdown is performed bottom-up in the hierarchy and so
> this might work.  But it's not great to introduce such a change
> for everyone just because a single product needs it.  

Generally speaking I'm not a fan of Linux changing hardware state and 
not cleaning up after itself at shutdown.  This can lead to weird 
behaviors for power consumption, wakeups, or inconsistent init across a 
kexec vs warm reboot vs cold reboot (that BIOS had a chance to swing 
it's hammer).

> It would
> be better to root-cause the issue.  

I'll see if we can figure out what's happening with traces on a USB 
analyzer.  But FWIW it's a Linux only issue :/

> Maybe the vendor is able to
> provide a BIOS update which resolves it?
> 

Given it's Linux only issue; unlikely.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2026-05-07  5:03 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-06 16:43 [PATCH] PCI: pciehp: Disable link and turn off slot power while removing Mario Limonciello
2026-05-06 17:55 ` Lukas Wunner
2026-05-06 19:16   ` Mario Limonciello
2026-05-07  4:26     ` Lukas Wunner
2026-05-07  5:03       ` Mario Limonciello
2026-05-06 20:58 ` sashiko-bot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox