linux-pci.vger.kernel.org archive mirror
* [PATCH 0/4] Don't make noise about disconnected USB4 devices
@ 2025-06-09  1:58 Mario Limonciello
  2025-06-09  1:58 ` [PATCH 1/4] PCI: Don't show errors on inaccessible PCI devices Mario Limonciello
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Mario Limonciello @ 2025-06-09  1:58 UTC
  To: mario.limonciello, bhelgaas, gregkh, mathias.nyman; +Cc: linux-pci, linux-usb

From: Mario Limonciello <mario.limonciello@amd.com>

When a USB4 or TBT3 dock is disconnected, a lot of warnings and errors
are emitted related to the PCIe tunnels and XHCI controllers in the
dock.

The messages are loud, mostly because the functions that emit them
don't check whether the device is actually alive. The PCIe hotplug
services mark the device as permanently dead, so that can be used to
hide some of the messages.

In the XHCI driver the device is marked as dying already, so that
can also be used to hide messages.
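
As a rough illustration of the idea in the two paragraphs above, the
sketch below gates an error message on a "dying" flag. It is a
user-space model, not code from this series: `struct model_hcd`,
`MODEL_STATE_DYING` and `model_report_error()` are hypothetical names
chosen for the example, and the real drivers keep their state in
their own structures.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical state bit standing in for a controller's "dying" flag. */
#define MODEL_STATE_DYING (1u << 0)

/* Hypothetical controller model; not a kernel structure. */
struct model_hcd {
	unsigned int state;	/* bitmask of MODEL_STATE_* flags */
	unsigned int messages;	/* number of errors actually reported */
};

/*
 * Report an error only while the controller is still considered alive.
 * Returns true if the message was emitted, false if it was suppressed.
 */
static bool model_report_error(struct model_hcd *hcd, const char *msg)
{
	if (hcd->state & MODEL_STATE_DYING)
		return false;

	fprintf(stderr, "model-hcd: %s\n", msg);
	hcd->messages++;
	return true;
}
```

Only the logging is skipped in this model; the caller is still
expected to handle the failure itself.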

Mario Limonciello (4):
  PCI: Don't show errors on inaccessible PCI devices
  PCI: Fix runtime PM usage count underflow
  usb: xhci: Avoid showing errors during surprise removal
  usb: xhci: Avoid showing warnings for dying controller

 drivers/pci/pci-driver.c     | 3 ++-
 drivers/pci/pci.c            | 5 +++--
 drivers/usb/host/xhci-ring.c | 7 +++++--
 drivers/usb/host/xhci.c      | 6 ++++--
 4 files changed, 14 insertions(+), 7 deletions(-)

-- 
2.43.0



* [PATCH 1/4] PCI: Don't show errors on inaccessible PCI devices
  2025-06-09  1:58 [PATCH 0/4] Don't make noise about disconnected USB4 devices Mario Limonciello
@ 2025-06-09  1:58 ` Mario Limonciello
  2025-06-09 15:09   ` Lukas Wunner
  2025-06-09  1:58 ` [PATCH 2/4] PCI: Fix runtime PM usage count underflow Mario Limonciello
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Mario Limonciello @ 2025-06-09  1:58 UTC
  To: mario.limonciello, bhelgaas; +Cc: linux-pci

From: Mario Limonciello <mario.limonciello@amd.com>

When a USB4 dock is unplugged the PCIe bridge it's connected to will
issue a "Link Down" and a "Card not detected" event. The PCI core
will treat this as a surprise hotplug event and unconfigure all downstream
devices. This involves setting the device error state to
`pci_channel_io_perm_failure`.

As the device is already gone and the PCI core is cleaning up there isn't
really any reason to show error messages to the user about failing to
change power states. Detect the error state and skip the messaging.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
 drivers/pci/pci.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index e9448d55113bd..7b0b4087da4d3 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1376,8 +1376,9 @@ int pci_power_up(struct pci_dev *dev)
 
 	pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
 	if (PCI_POSSIBLE_ERROR(pmcsr)) {
-		pci_err(dev, "Unable to change power state from %s to D0, device inaccessible\n",
-			pci_power_name(dev->current_state));
+		if (dev->error_state != pci_channel_io_perm_failure)
+			pci_err(dev, "Unable to change power state from %s to D0, device inaccessible\n",
+				pci_power_name(dev->current_state));
 		dev->current_state = PCI_D3cold;
 		return -EIO;
 	}
-- 
2.43.0



* [PATCH 2/4] PCI: Fix runtime PM usage count underflow
  2025-06-09  1:58 [PATCH 0/4] Don't make noise about disconnected USB4 devices Mario Limonciello
  2025-06-09  1:58 ` [PATCH 1/4] PCI: Don't show errors on inaccessible PCI devices Mario Limonciello
@ 2025-06-09  1:58 ` Mario Limonciello
  2025-06-09 15:16   ` Lukas Wunner
  2025-06-09  9:19 ` [PATCH 0/4] Don't make noise about disconnected USB4 devices Michał Pecio
  2025-06-09 15:20 ` Lukas Wunner
  3 siblings, 1 reply; 9+ messages in thread
From: Mario Limonciello @ 2025-06-09  1:58 UTC
  To: mario.limonciello, bhelgaas; +Cc: linux-pci

From: Mario Limonciello <mario.limonciello@amd.com>

When a USB4 dock is unplugged the PCIe bridge it's connected to will
issue a "Link Down" and a "Card not detected" event. The PCI core
will treat this as a surprise hotplug event and unconfigure all downstream
devices. This involves setting the device error state to
`pci_channel_io_perm_failure`.

When PCI core gets to the point that the device is removed using
pci_device_remove() the runtime count has already been decremented and
so calling pm_runtime_put_sync() will cause an underflow.

Detect the device is in the error state and skip the call for this cleanup
path.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
 drivers/pci/pci-driver.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 9f6e145d93d62..ab4cfdfc8fbc0 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -479,7 +479,8 @@ static void pci_device_remove(struct device *dev)
 	pci_iov_remove(pci_dev);
 
 	/* Undo the runtime PM settings in local_pci_probe() */
-	pm_runtime_put_sync(dev);
+	if (pci_dev->error_state != pci_channel_io_perm_failure)
+		pm_runtime_put_sync(dev);
 
 	/*
 	 * If the device is still on, set the power state as "unknown",
-- 
2.43.0
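
To make the underflow described in the commit message concrete, here
is a small user-space model of a usage counter. It assumes, as the
commit message states, that an earlier step has already dropped the
reference before the remove path runs. All names (`struct model_rpm`,
`model_rpm_put()`, `model_remove()`) are hypothetical and only loosely
mirror how a runtime PM usage count behaves.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a device's runtime PM bookkeeping. */
struct model_rpm {
	int usage_count;	/* outstanding references */
	unsigned int underflows;	/* unbalanced puts that were detected */
};

/*
 * Drop one reference. A put without a matching get is counted as an
 * underflow and leaves the counter at zero instead of going negative.
 */
static int model_rpm_put(struct model_rpm *rpm)
{
	if (rpm->usage_count == 0) {
		rpm->underflows++;
		return -EINVAL;
	}
	return --rpm->usage_count;
}

/*
 * Remove path with the guard from the patch: skip the put when the
 * device is already marked as disconnected.
 */
static void model_remove(struct model_rpm *rpm, bool disconnected)
{
	if (!disconnected)
		model_rpm_put(rpm);
}
```

The guard only hides the imbalance in this model. It does not show
where the earlier decrement comes from, which is a separate question.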



* Re: [PATCH 0/4] Don't make noise about disconnected USB4 devices
  2025-06-09  1:58 [PATCH 0/4] Don't make noise about disconnected USB4 devices Mario Limonciello
  2025-06-09  1:58 ` [PATCH 1/4] PCI: Don't show errors on inaccessible PCI devices Mario Limonciello
  2025-06-09  1:58 ` [PATCH 2/4] PCI: Fix runtime PM usage count underflow Mario Limonciello
@ 2025-06-09  9:19 ` Michał Pecio
  2025-06-09 13:05   ` Mario Limonciello
  2025-06-09 15:20 ` Lukas Wunner
  3 siblings, 1 reply; 9+ messages in thread
From: Michał Pecio @ 2025-06-09  9:19 UTC
  To: Mario Limonciello
  Cc: mario.limonciello, bhelgaas, gregkh, mathias.nyman, linux-pci,
	linux-usb

Hi,

General remarks:
- broken threading on 1/2 and 2/2
- some Cc missing on individual patch emails

On Sun,  8 Jun 2025 20:58:00 -0500, Mario Limonciello wrote:
> When a USB4 or TBT3 dock is disconnected a lot of warnings and errors
> are emitted related to the PCIe tunnels and XHCI controllers in th
> dock.

These patches will probably also trigger on any loss of PCIe link for
any reason: badly seated card, worn connector, EMI, etc.

Will there be any remaining message about dead PCIe links, or just
a silent disappearance? Like dev_info("USB disconnect ...") in USB.

> The messages are loud, but it's mostly because the functions that
> emit the messages don't check whether the device is actually alive.
> The PCIe hotplug services mark the device as perm dead, so that
> can be used to hide some of the messsages.
> 
> In the XHCI driver the device is marked as dying already, so that
> can also be used to hide messages.

Are PCI drivers expected to stay silent on sudden removal mid operation?
Is there no "safe ejection" procedure for those Thunderbolt devices?

> Mario Limonciello (4):
>   PCI: Don't show errors on inaccessible PCI devices
>   PCI: Fix runtime PM usage count underflow
>   usb: xhci: Avoid showing errors during surprise removal
>   usb: xhci: Avoid showing warnings for dying controller

Regards,
Michal


* Re: [PATCH 0/4] Don't make noise about disconnected USB4 devices
  2025-06-09  9:19 ` [PATCH 0/4] Don't make noise about disconnected USB4 devices Michał Pecio
@ 2025-06-09 13:05   ` Mario Limonciello
  0 siblings, 0 replies; 9+ messages in thread
From: Mario Limonciello @ 2025-06-09 13:05 UTC
  To: Michał Pecio, Rodrigo Siqueira
  Cc: mario.limonciello, bhelgaas, gregkh, mathias.nyman, linux-pci,
	linux-usb

On 6/9/2025 4:19 AM, Michał Pecio wrote:
> Hi,
> 
> General remarks:
> - broken threading on 1/2 and 2/2
> - some Cc missing on individual patch emails

Yeah; sorry about that.  I got bit by
https://github.com/kworkflow/kworkflow/issues/1207 once again.  Once I
realized that happened I figured unthreaded was better than missing, so I
ended up sending the missing ones to each of the lists that missed them.

If I send a v2 with them together again I'll just manually do to/cc for 
everything.

> 
> On Sun,  8 Jun 2025 20:58:00 -0500, Mario Limonciello wrote:
>> When a USB4 or TBT3 dock is disconnected a lot of warnings and errors
>> are emitted related to the PCIe tunnels and XHCI controllers in th
>> dock.
> 
> These patches will probably also trigger on any loss of PCIe link for
> any reason: badly seated card, worn connector, EMI, etc.
> 
> Will there be any remaining message about dead PCIe links, or just
> a silent disappearence? Like dev_info("USB disconnect ...") in USB.
> 

Good point on the PCIe patches with other failures.  Those wouldn't have
any "hotplug event" though, would they?  This all stems from the hotplug
event, so would it be worth storing the state on the struct pci_dev to
conditionally show these PCIe messages?

>> The messages are loud, but it's mostly because the functions that
>> emit the messages don't check whether the device is actually alive.
>> The PCIe hotplug services mark the device as perm dead, so that
>> can be used to hide some of the messsages.
>>
>> In the XHCI driver the device is marked as dying already, so that
>> can also be used to hide messages.
> 
> Are PCI drivers expected to stay silent on sudden removal mid operation?
> Is there no "safe ejection" procedure for those Thunderbolt devices?
> 

With docking, surprise hot removal is a standard operation.
Userspace doesn't offer a clean removal procedure for PCIe devices the
way it does for USB storage.

>> Mario Limonciello (4):
>>    PCI: Don't show errors on inaccessible PCI devices
>>    PCI: Fix runtime PM usage count underflow
>>    usb: xhci: Avoid showing errors during surprise removal
>>    usb: xhci: Avoid showing warnings for dying controller
> 
> Regards,
> Michal



* Re: [PATCH 1/4] PCI: Don't show errors on inaccessible PCI devices
  2025-06-09  1:58 ` [PATCH 1/4] PCI: Don't show errors on inaccessible PCI devices Mario Limonciello
@ 2025-06-09 15:09   ` Lukas Wunner
  2025-06-09 15:41     ` Mario Limonciello
  0 siblings, 1 reply; 9+ messages in thread
From: Lukas Wunner @ 2025-06-09 15:09 UTC
  To: Mario Limonciello
  Cc: mario.limonciello, bhelgaas, linux-pci, Rafael J. Wysocki,
	Mika Westerberg

[cc += Rafael, Mika]

On Sun, Jun 08, 2025 at 08:58:01PM -0500, Mario Limonciello wrote:
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1376,8 +1376,9 @@ int pci_power_up(struct pci_dev *dev)
>  
>  	pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
>  	if (PCI_POSSIBLE_ERROR(pmcsr)) {
> -		pci_err(dev, "Unable to change power state from %s to D0, device inaccessible\n",
> -			pci_power_name(dev->current_state));
> +		if (dev->error_state != pci_channel_io_perm_failure)
> +			pci_err(dev, "Unable to change power state from %s to D0, device inaccessible\n",
> +				pci_power_name(dev->current_state));
>  		dev->current_state = PCI_D3cold;
>  		return -EIO;
>  	}

Instead of merely silencing the error message, why not bail out early on
in the function, i.e.

	if (pci_dev_is_disconnected(dev)) {
		dev->current_state = PCI_D3cold;
		return -EIO;
	}
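
For illustration, the control flow suggested above can be modelled in
user space as follows. This is a sketch under stated assumptions, not
the kernel's pci_power_up(): the `model_*` types and helpers are
hypothetical stand-ins, `link_up` simulates whether config space is
reachable, and `errors_logged` stands in for the pci_err() call.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's channel and power states. */
enum model_channel_state {
	MODEL_IO_NORMAL = 1,
	MODEL_IO_FROZEN,
	MODEL_IO_PERM_FAILURE,
};

enum model_power_state { MODEL_D0, MODEL_D3HOT, MODEL_D3COLD };

struct model_pci_dev {
	enum model_channel_state error_state;
	enum model_power_state current_state;
	bool link_up;			/* simulated: is config space reachable? */
	unsigned int config_reads;	/* simulated config space accesses */
	unsigned int errors_logged;	/* stands in for pci_err() calls */
};

/* Models the intent of pci_dev_is_disconnected(). */
static bool model_dev_is_disconnected(const struct model_pci_dev *dev)
{
	return dev->error_state == MODEL_IO_PERM_FAILURE;
}

/* Simulated PMCSR read: all ones when the device does not respond. */
static unsigned short model_read_pmcsr(struct model_pci_dev *dev)
{
	dev->config_reads++;
	return dev->link_up ? 0x0003 : 0xffff;
}

static int model_power_up(struct model_pci_dev *dev)
{
	/* Early bail-out: no config access and no message for a gone device. */
	if (model_dev_is_disconnected(dev)) {
		dev->current_state = MODEL_D3COLD;
		return -EIO;
	}

	if (model_read_pmcsr(dev) == 0xffff) {
		/* Still reported: inaccessible, but not marked disconnected. */
		dev->errors_logged++;
		dev->current_state = MODEL_D3COLD;
		return -EIO;
	}

	dev->current_state = MODEL_D0;
	return 0;
}
```

In this model a device that stops responding without having been
marked disconnected still produces the error, while a device already
flagged by hotplug is neither read nor reported.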

Thanks,

Lukas


* Re: [PATCH 2/4] PCI: Fix runtime PM usage count underflow
  2025-06-09  1:58 ` [PATCH 2/4] PCI: Fix runtime PM usage count underflow Mario Limonciello
@ 2025-06-09 15:16   ` Lukas Wunner
  0 siblings, 0 replies; 9+ messages in thread
From: Lukas Wunner @ 2025-06-09 15:16 UTC
  To: Mario Limonciello
  Cc: mario.limonciello, bhelgaas, linux-pci, Rafael J. Wysocki,
	Mika Westerberg

[cc += Rafael, Mika]

On Sun, Jun 08, 2025 at 08:58:02PM -0500, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello@amd.com>
> 
> When a USB4 dock is unplugged the PCIe bridge it's connected to will
> remove issue a "Link Down" and "Card not detected event". The PCI core

Nit: s/remove//

> will treat this as a surprise hotplug event and unconfigure all downstream
> devices. This involves setting the device error state to
> `pci_channel_io_perm_failure`.
> 
> When PCI core gets to the point that the device is removed using
> pci_device_remove() the runtime count has already been decremented and
> so calling pm_runtime_put_sync() will cause an underflow.

Where has it been decremented?  I think this needs to be identified
and a Fixes tag added.

> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -479,7 +479,8 @@ static void pci_device_remove(struct device *dev)
>  	pci_iov_remove(pci_dev);
>  
>  	/* Undo the runtime PM settings in local_pci_probe() */
> -	pm_runtime_put_sync(dev);
> +	if (pci_dev->error_state != pci_channel_io_perm_failure)
> +		pm_runtime_put_sync(dev);

Usually pci_dev_is_disconnected() is used in lieu of checking for
the error_state directly.

Thanks,

Lukas


* Re: [PATCH 0/4] Don't make noise about disconnected USB4 devices
  2025-06-09  1:58 [PATCH 0/4] Don't make noise about disconnected USB4 devices Mario Limonciello
                   ` (2 preceding siblings ...)
  2025-06-09  9:19 ` [PATCH 0/4] Don't make noise about disconnected USB4 devices Michał Pecio
@ 2025-06-09 15:20 ` Lukas Wunner
  3 siblings, 0 replies; 9+ messages in thread
From: Lukas Wunner @ 2025-06-09 15:20 UTC
  To: Mario Limonciello
  Cc: mario.limonciello, bhelgaas, gregkh, mathias.nyman, linux-pci,
	linux-usb

On Sun, Jun 08, 2025 at 08:58:00PM -0500, Mario Limonciello wrote:
> Mario Limonciello (4):
>   PCI: Don't show errors on inaccessible PCI devices
>   PCI: Fix runtime PM usage count underflow
>   usb: xhci: Avoid showing errors during surprise removal
>   usb: xhci: Avoid showing warnings for dying controller

Patches [3/4] and [4/4] (which touch xhci) were only cc'ed to Bjorn.
You may want to resend these two to Mathias and Greg.
You might also want to split the series in two separate ones
for PCI and xhci if/when respinning.

Thanks,

Lukas


* Re: [PATCH 1/4] PCI: Don't show errors on inaccessible PCI devices
  2025-06-09 15:09   ` Lukas Wunner
@ 2025-06-09 15:41     ` Mario Limonciello
  0 siblings, 0 replies; 9+ messages in thread
From: Mario Limonciello @ 2025-06-09 15:41 UTC
  To: Lukas Wunner
  Cc: mario.limonciello, bhelgaas, linux-pci, Rafael J. Wysocki,
	Mika Westerberg

On 6/9/2025 10:09 AM, Lukas Wunner wrote:
> [cc += Rafael, Mika]
> 
> On Sun, Jun 08, 2025 at 08:58:01PM -0500, Mario Limonciello wrote:
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -1376,8 +1376,9 @@ int pci_power_up(struct pci_dev *dev)
>>   
>>   	pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
>>   	if (PCI_POSSIBLE_ERROR(pmcsr)) {
>> -		pci_err(dev, "Unable to change power state from %s to D0, device inaccessible\n",
>> -			pci_power_name(dev->current_state));
>> +		if (dev->error_state != pci_channel_io_perm_failure)
>> +			pci_err(dev, "Unable to change power state from %s to D0, device inaccessible\n",
>> +				pci_power_name(dev->current_state));
>>   		dev->current_state = PCI_D3cold;
>>   		return -EIO;
>>   	}
> 
> Instead of merely silencing the error message, why not bail out early on
> in the function, i.e.
> 
> 	if (pci_dev_is_disconnected(dev)) {
> 		dev->current_state = PCI_D3cold;
> 		return -EIO;
> 	}
> 

Thanks for the suggestion.  That sounds good to me, I'll have a try.


