public inbox for stable@vger.kernel.org
From: Brian Norris <briannorris@chromium.org>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
	stable@vger.kernel.org, Ethan Zhao <etzhao1900@gmail.com>,
	Andrey Ryabinin <ryabinin.a.a@gmail.com>
Subject: Re: [PATCH] PCI/sysfs: Ensure devices are powered for config reads
Date: Tue, 23 Sep 2025 16:07:29 -0700	[thread overview]
Message-ID: <aNMoMY17CTR2_jQz@google.com> (raw)
In-Reply-To: <20250923190231.GA2052830@bhelgaas>

Hi Bjorn,

On Tue, Sep 23, 2025 at 02:02:31PM -0500, Bjorn Helgaas wrote:
> [+cc Ethan, Andrey]
> 
> On Wed, Aug 20, 2025 at 10:26:08AM -0700, Brian Norris wrote:
> > From: Brian Norris <briannorris@google.com>
> > 
> > max_link_speed, max_link_width, current_link_speed, current_link_width,
> > secondary_bus_number, and subordinate_bus_number all access config
> > registers, but they don't check the runtime PM state. If the device is
> > in D3cold, we may see -EINVAL or even bogus values.
> > 
> > Wrap these accesses in pci_config_pm_runtime_{get,put}(), as most of
> > the similar sysfs attributes already do.
> 
> Protecting the config reads seems right to me.
> 
> If the device is in D3cold, a config read will result in a Completion
> Timeout.  On most x86 platforms that's "fine" and merely results in ~0
> data.  But that's convention, not a PCIe spec requirement.
> 
> I think it's a potential issue with PCIe controllers used on arm64 and
> might result in an SError or synchronous abort from which we don't
> recover well.  I'd love to hear actual experience about how reading
> "current_link_speed" works on a device in D3cold in an arm64 system.

I'm working on a few such arm64 systems :) (pcie-qcom Chromebooks, and
non-upstream DWC-based Pixel phones; I have a little more knowledge of
the latter.) The answers may vary by SoC, and especially by PCIe
implementation. ARM SoCs are notoriously ... diverse.

To my knowledge, the result can be any of several of the above on arm64 +
DWC.

* pci_generic_config_read() -> pci_ops::map_bus() may return NULL, in
  which case we get PCIBIOS_DEVICE_NOT_FOUND / -EINVAL. And specifically
  on arm64 with DWC PCIe, dw_pcie_other_conf_map_bus() may see the link
  down on a suspended bridge and return NULL.

* The map_bus() check is admittedly racy, so we might still *actually*
  hit the hardware, at which point this gets even more
  implementation-defined:

  (a) if the PCIe HW is not powered (for example, if we put the link in
      L3 and fully powered off the controller to save power), we might
      not even get a Completion Timeout; what happens depends on how
      the SoC is wired up. But I believe this tends to produce an
      SError, and a crash.

  (b) if the PCIe HW is powered but something else is down (e.g., link
      in L2, device in D3cold, etc.), we'll get a Completion Timeout
      and a ~0 response. I was also under the impression that a ~0
      response is not spec-mandated, but I believe it's noted in the
      Synopsys documentation.

NB: I'm not sure there is really great upstream support for arm64 +
D3cold yet. If systems aren't using ACPI (and few arm64 systems are), they
probably don't have the appropriate platform_pci_* hooks to really
manage it properly. There have been some prior attempts at adding
non-x86/ACPI hooks for this, although for different reasons:

    https://lore.kernel.org/linux-pci/a38e76d6f3a90d7c968c32cee97604f3c41cbccf.camel@mediatek.com/
    [PATCH] PCI:PM: Support platforms that do not implement ACPI

That submission stalled because it didn't really have the whole picture
(in that case, the wwan/modem driver in question).

> As Ethan and Andrey pointed out, we could skip max_link_speed_show()
> because pcie_get_speed_cap() already uses a cached value and doesn't
> do a config access.

Ack, I'll drop that part of the change.

> max_link_width_show() is similar and also comes from PCI_EXP_LNKCAP
> but is not currently cached, so I think we do need that one.  Worth a
> comment to explain the non-obvious difference.

Sure, I'll add a comment for max_link_width.

> PCI_EXP_LNKCAP is ostensibly read-only and could conceivably be
> cached, but the ASPM exit latencies can change based on the Common
> Clock Configuration.

I'll plan not to add additional caching, unless excess wakeups become a
problem.

Brian


Thread overview: 11+ messages
2025-08-20 17:26 [PATCH] PCI/sysfs: Ensure devices are powered for config reads Brian Norris
2025-08-21  0:54 ` Ethan Zhao
2025-08-21  2:56   ` Brian Norris
2025-08-21 12:41     ` Ethan Zhao
2025-08-21 15:28       ` Brian Norris
2025-08-22  1:11         ` Ethan Zhao
2025-09-10 18:35           ` Brian Norris
2025-09-23 17:56 ` Andrey Ryabinin
2025-09-23 19:02 ` Bjorn Helgaas
2025-09-23 23:07   ` Brian Norris [this message]
2025-09-23 23:27     ` Bjorn Helgaas
