From: "Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>
To: David Box <david.e.box@linux.intel.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>,
bhelgaas@google.com, andrea.righi@canonical.com,
vicamo.yang@canonical.com, kenny@panix.com,
nirmal.patel@linux.intel.com, linux-pm@vger.kernel.org,
linux-pci@vger.kernel.org, ilpo.jarvinen@linux.intel.com,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] PCI/ASPM: Allow ASPM enablement for devices behind Intel VMD
Date: Fri, 11 Jul 2025 17:49:03 +0300 (EEST) [thread overview]
Message-ID: <19452e8c-4e29-4703-afc9-3257a2d1183d@linux.intel.com> (raw)
In-Reply-To: <yqtsfh4pmbnogt67m6tk6pqpdcbz3kx3xx4lpinbgfvq4yi5wn@leoipmcnqsgs>
[-- Attachment #1: Type: text/plain, Size: 12444 bytes --]
On Fri, 11 Jul 2025, David Box wrote:
> Hey Rafael,
>
> On Thu, Jul 10, 2025 at 09:53:18PM +0200, Rafael J. Wysocki wrote:
> > On Thu, Jul 10, 2025 at 3:16 AM David E. Box
> > <david.e.box@linux.intel.com> wrote:
> > >
> > > Devices behind Intel's Volume Management Device (VMD) controller reside on
> > > a synthetic PCI hierarchy that is intentionally hidden from ACPI and
> > > firmware. As such, BIOS does not configure ASPM for these devices, and the
> > > responsibility for link power management, including ASPM and LTR, falls
> > > entirely to the VMD driver.
> > >
> > > However, the current PCIe ASPM code prevents ASPM configuration when the
> > > ACPI_FADT_NO_ASPM flag is set, disallowing OS management. This leaves ASPM
> > > permanently disabled for these devices, contrary to the platform's design
> > > intent.
> > >
> > > Introduce a callback mechanism that allows the VMD driver to enable ASPM
> > > for its domain, bypassing the global ACPI_FADT_NO_ASPM restriction that is
> > > not applicable in this context. This ensures that devices behind VMD can
> > > benefit from ASPM savings as originally intended.
> > >
> > > Link: https://lore.kernel.org/linux-pm/0b166ece-eeec-ba5d-2212-50d995611cef@panix.com
> > > Signed-off-by: David E. Box <david.e.box@linux.intel.com>
> >
> > First of all, thanks for doing this work, much appreciated!
> >
> > > ---
> > > drivers/pci/controller/vmd.c | 28 ++++++++++++++++++++++++++--
> > > drivers/pci/pci.h | 8 ++++++++
> > > drivers/pci/pcie/aspm.c | 36 +++++++++++++++++++++++++++++++++++-
> > > 3 files changed, 69 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> > > index 8df064b62a2f..e685586dc18b 100644
> > > --- a/drivers/pci/controller/vmd.c
> > > +++ b/drivers/pci/controller/vmd.c
> > > @@ -21,6 +21,8 @@
> > >
> > > #include <asm/irqdomain.h>
> > >
> > > +#include "../pci.h"
> > > +
> > > #define VMD_CFGBAR 0
> > > #define VMD_MEMBAR1 2
> > > #define VMD_MEMBAR2 4
> > > @@ -730,7 +732,7 @@ static void vmd_copy_host_bridge_flags(struct pci_host_bridge *root_bridge,
> > > }
> > >
> > > /*
> > > - * Enable ASPM and LTR settings on devices that aren't configured by BIOS.
> > > + * Enable LTR settings on devices that aren't configured by BIOS.
> > > */
> > > static int vmd_pm_enable_quirk(struct pci_dev *pdev, void *userdata)
> > > {
> > > @@ -770,10 +772,27 @@ static int vmd_pm_enable_quirk(struct pci_dev *pdev, void *userdata)
> > > * PCIe r6.0, sec 5.5.4.
> > > */
> > > pci_set_power_state_locked(pdev, PCI_D0);
> > > - pci_enable_link_state_locked(pdev, PCIE_LINK_STATE_ALL);
> >
> > Do I think correctly that this doesn't work because of the
> > aspm_disabled check in __pci_enable_link_state()?
>
> Yes.
>
> >
> > > return 0;
> > > }
> > >
> > > +static long vmd_get_link_state(struct pci_dev *pdev, void *data)
> > > +{
> > > + struct pci_bus *vmd_bus = data;
> > > + struct pci_bus *bus = pdev->bus;
> > > +
> > > + while (bus) {
> > > + if (bus == vmd_bus)
> > > + return PCIE_LINK_STATE_ALL;
> > > +
> > > + if (!bus->self)
> > > + break;
> > > +
> > > + bus = bus->self->bus;
> > > + }
> >
> > If I'm not mistaken, it would be sufficient to do a check like
> >
> > if (pci_dev->bus->ops == &vmd_ops)
> > return PCIE_LINK_STATE_ALL;
> >
> > instead of the above, or if not then why not?
>
> As a replacement in the loop, that does look sufficient. I'm not sure if
> you're also suggesting replacing the loop itself.
>
> >
> > > +
> > > + return -ENODEV;
> > > +}
> > > +
> > > static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
> > > {
> > > struct pci_sysdata *sd = &vmd->sysdata;
> > > @@ -785,6 +804,7 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
> > > resource_size_t membar2_offset = 0x2000;
> > > struct pci_bus *child;
> > > struct pci_dev *dev;
> > > + struct pcie_aspm_vmd_link_state vmd_link_state;
> > > int ret;
> > >
> > > /*
> > > @@ -911,6 +931,10 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
> > > return -ENODEV;
> > > }
> > >
> > > + vmd_link_state.cb = vmd_get_link_state;
> > > + vmd_link_state.data = vmd->bus;
> > > + pci_register_vmd_link_state_cb(&vmd_link_state);
> > > +
> > > vmd_copy_host_bridge_flags(pci_find_host_bridge(vmd->dev->bus),
> > > to_pci_host_bridge(vmd->bus->bridge));
> > >
> > > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> > > index 12215ee72afb..dcf7d39c660f 100644
> > > --- a/drivers/pci/pci.h
> > > +++ b/drivers/pci/pci.h
> > > @@ -821,6 +821,12 @@ void pci_configure_aspm_l1ss(struct pci_dev *dev);
> > > void pci_save_aspm_l1ss_state(struct pci_dev *dev);
> > > void pci_restore_aspm_l1ss_state(struct pci_dev *dev);
> > >
> > > +
> > > +struct pcie_aspm_vmd_link_state {
> > > + long (*cb)(struct pci_dev *pdev, void *data);
> > > + void *data;
> > > +};
> > > +
> > > #ifdef CONFIG_PCIEASPM
> > > void pcie_aspm_init_link_state(struct pci_dev *pdev);
> > > void pcie_aspm_exit_link_state(struct pci_dev *pdev);
> > > @@ -828,6 +834,7 @@ void pcie_aspm_pm_state_change(struct pci_dev *pdev, bool locked);
> > > void pcie_aspm_powersave_config_link(struct pci_dev *pdev);
> > > void pci_configure_ltr(struct pci_dev *pdev);
> > > void pci_bridge_reconfigure_ltr(struct pci_dev *pdev);
> > > +void pci_register_vmd_link_state_cb(struct pcie_aspm_vmd_link_state *state);
> > > #else
> > > static inline void pcie_aspm_init_link_state(struct pci_dev *pdev) { }
> > > static inline void pcie_aspm_exit_link_state(struct pci_dev *pdev) { }
> > > @@ -835,6 +842,7 @@ static inline void pcie_aspm_pm_state_change(struct pci_dev *pdev, bool locked)
> > > static inline void pcie_aspm_powersave_config_link(struct pci_dev *pdev) { }
> > > static inline void pci_configure_ltr(struct pci_dev *pdev) { }
> > > static inline void pci_bridge_reconfigure_ltr(struct pci_dev *pdev) { }
> > > +void pci_register_vmd_link_state_cb(struct pcie_aspm_vmd_link_state *state) { }
> > > #endif
> > >
> > > #ifdef CONFIG_PCIE_ECRC
> > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > > index 29fcb0689a91..c609d3c309be 100644
> > > --- a/drivers/pci/pcie/aspm.c
> > > +++ b/drivers/pci/pcie/aspm.c
> > > @@ -320,6 +320,27 @@ static int policy_to_clkpm_state(struct pcie_link_state *link)
> > > return 0;
> > > }
> > >
> > > +static struct pcie_aspm_vmd_link_state vmd_state;
> > > +
> > > +void pci_register_vmd_link_state_cb(struct pcie_aspm_vmd_link_state *state)
> > > +{
> > > + mutex_lock(&aspm_lock);
> > > + vmd_state.cb = state->cb;
> > > + vmd_state.data = state->data;
> > > + mutex_unlock(&aspm_lock);
> > > +}
> > > +EXPORT_SYMBOL_GPL(pci_register_vmd_link_state_cb);
> > > +
> > > +static long pci_get_vmd_link_state(struct pci_dev *pdev)
> > > +{
> > > + int state = -ENODEV;
> > > +
> > > + if (vmd_state.cb)
> > > + state = vmd_state.cb(pdev, vmd_state.data);
> > > +
> > > + return state;
> > > +}
> > > +
> > > static void pci_update_aspm_saved_state(struct pci_dev *dev)
> > > {
> > > struct pci_cap_saved_state *save_state;
> > > @@ -794,6 +815,7 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
> > > u32 parent_lnkcap, child_lnkcap;
> > > u16 parent_lnkctl, child_lnkctl;
> > > struct pci_bus *linkbus = parent->subordinate;
> > > + int vmd_aspm_default;
> > >
> > > if (blacklist) {
> > > /* Set enabled/disable so that we will disable ASPM later */
> > > @@ -865,8 +887,20 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
> > > pcie_capability_write_word(child, PCI_EXP_LNKCTL, child_lnkctl);
> > > }
> > >
> > > + /*
> > > + * Devices behind Intel VMD operate on a synthetic PCI bus that BIOS
> > > + * and ACPI do not enumerate or configure. ASPM for these devices must
> > > + * be managed by the VMD driver itself, independent of global ACPI ASPM
> > > + * constraints. This callback mechanism allows selective ASPM
> > > + * enablement for such domains.
> > > + */
> > > + vmd_aspm_default = pci_get_vmd_link_state(parent);
> > > +
> > > /* Save default state */
> > > - link->aspm_default = link->aspm_enabled;
> > > + if (vmd_aspm_default < 0)
> > > + link->aspm_default = link->aspm_enabled;
> > > + else
> > > + link->aspm_default = vmd_aspm_default;
> >
> > Well, this is not nice.
> >
> > First off, it adds VMD-specific stuff to otherwise generic ASPM code.
> > Second, it doesn't actually do anything about the aspm_disabled checks
> > all over the place, so they will still trigger even though ASPM will
> > be effectively enabled for devices on the VMD bus.
>
> I agree that it's not nice to be mixing VMD specific code here. It's a
> working bad solution to come up with a working good solution :)
>
> The reason this patch works is that the aspm_disabled checks only gate
> OS-controlled ASPM configuration. They don't affect the BIOS default
> values, which are what we're setting in link->aspm_default. The ASPM
> code uses link->aspm_default as the fallback when ASPM is globally
> disabled, which is exactly what we want for devices behind VMD where the
> driver, not BIOS, effectively is the platform provider of the defaults.
Oh, this was a big news to me.
So what you're saying is that if aspm_disabled is set, ->aspm_disable
cannot be set and thus pcie_config_aspm_link() that is not gated by
aspm_disabled can alter ASPM state despite OS not having ASPM control???
...That's really odd logic which the ASPM driver seems to be full of.
--
i.
> I put it here using a callback value because this where the BIOS
> defaults are saved for each device.
>
> >
> > >
> > > /* Setup initial capable state. Will be updated later */
> > > link->aspm_capable = link->aspm_support;
> > >
> > > base-commit: d0b3b7b22dfa1f4b515fd3a295b3fd958f9e81af
> > > --
> >
> > It appears to me that the underlying problem is that aspm_disabled is
> > global and it is set during PCI root bus creation in
> > acpi_pci_root_add(). Thus it affects all of the PCI buses even though
> > the BIOS says that it wants to control ASPM for this particular PCIe
> > hierarchy. If there were another PCI root enumerated by ACPI where
> > the OS would be allowed to control ASPM, it would not work just like
> > the VMD case.
>
> It would work in the non-VMD case because it has the BIOS default to
> use. VMD does not.
>
> >
> > To me, this suggests an approach based on moving the "ASPM disabled
> > because the BIOS wants to control it" setting to pci_bus_flags_t and
> > setting it on a per-hierarchy basis. Since the VMD bus is a separate
> > PCIe hierarchy from the kernel perspective, it will not have this flag
> > set and the OS should be able to configure ASPM for devices on that
> > bus.
> >
> > Do I think correctly or am I overlooking something here?
>
> It’s definitely along those lines. The issue is really caused by two
> things:
>
> 1. aspm_disabled global prevents OS control of ASPM
> 2. For VMD, there’s no BIOS provided link->aspm_default
>
> The solution I proposed addresses the problem by having a mechanism for
> the VMD driver to supply its own defaults in place of the missing BIOS
> configuration.
>
> Using pci_bus_flags_t could work as well, but would rather just say that
> aspm_disabled doesn't apply to the bus hierarchy, allowing the current
> pci_enable_link_state_locked() solution to work all the time. Either
> way, I'm definitely looking for a cleaner solution. The above hopefully
> clarifies the situation.
>
> David
>
next prev parent reply other threads:[~2025-07-11 14:49 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-07-10 1:16 [PATCH] PCI/ASPM: Allow ASPM enablement for devices behind Intel VMD David E. Box
2025-07-10 2:17 ` Kenneth R. Crudup
2025-07-10 19:53 ` Rafael J. Wysocki
2025-07-11 14:33 ` David Box
2025-07-11 14:49 ` Ilpo Järvinen [this message]
2025-07-11 16:06 ` David Box
2025-07-11 21:19 ` Bjorn Helgaas
2025-07-13 4:47 ` David Box
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=19452e8c-4e29-4703-afc9-3257a2d1183d@linux.intel.com \
--to=ilpo.jarvinen@linux.intel.com \
--cc=andrea.righi@canonical.com \
--cc=bhelgaas@google.com \
--cc=david.e.box@linux.intel.com \
--cc=kenny@panix.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=linux-pm@vger.kernel.org \
--cc=nirmal.patel@linux.intel.com \
--cc=rafael@kernel.org \
--cc=vicamo.yang@canonical.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.