* [REGRESSION] PCI: Revert "Enable ACS after configuring IOMMU for OF platforms"
From: John Hancock @ 2026-03-20 17:23 UTC (permalink / raw)
To: stable
Cc: bhelgaas, manivannan.sadhasivam, joro, linux-pci, iommu,
John Hancock
Commit 7a126c1b6cfa ("PCI: Enable ACS after configuring IOMMU for OF
platforms") introduced a regression affecting AMD IOMMU group isolation
on x86 systems, making PCIe passthrough non-functional.
While the commit addresses a legitimate ordering issue on OF/Device Tree
platforms, the fix modifies pci_dma_configure(), which executes on all
platforms regardless of firmware interface. On AMD systems with IOMMU,
moving pci_enable_acs() from pci_acs_init() to pci_dma_configure() alters
the point at which ACS is evaluated relative to IOMMU group assignment.
The result is that devices which previously occupied individual, exclusive
IOMMU groups are merged into a single group containing both passthrough
and non-passthrough members, violating IOMMU isolation requirements.
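For anyone wanting to confirm the symptom on their own machine, group
membership is visible under /sys/kernel/iommu_groups/<N>/devices/. The
following is a minimal userspace sketch, not part of the patch. The helper
names count_group_devices() and print_iommu_groups() are made up for
illustration, and the root path is passed in as a parameter so the helpers
can also be pointed at a test directory.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Count the entries under <root>/<group>/devices.
 * Returns the number of devices, or -1 if the directory cannot be read.
 */
static int count_group_devices(const char *root, const char *group)
{
	char path[4096];
	struct dirent *e;
	int count = 0;
	DIR *d;
	int n;

	n = snprintf(path, sizeof(path), "%s/%s/devices", root, group);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	d = opendir(path);
	if (!d)
		return -1;

	while ((e = readdir(d)) != NULL) {
		/* Skip the "." and ".." entries. */
		if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0)
			continue;
		count++;
	}
	closedir(d);
	return count;
}

/*
 * Print one line per IOMMU group found under root.
 * Returns the number of groups printed, or -1 if root cannot be opened.
 */
static int print_iommu_groups(const char *root)
{
	struct dirent *e;
	int groups = 0;
	DIR *d;

	d = opendir(root);
	if (!d)
		return -1;

	while ((e = readdir(d)) != NULL) {
		int devs;

		if (e->d_name[0] == '.')
			continue;
		devs = count_group_devices(root, e->d_name);
		if (devs < 0)
			continue;
		printf("group %s: %d device(s)\n", e->d_name, devs);
		groups++;
	}
	closedir(d);
	return groups;
}
```

Calling print_iommu_groups("/sys/kernel/iommu_groups") from a small main()
on a good and a bad kernel should make the difference visible: on the
affected kernels, a group that previously held a single passthrough device
would be expected to report several devices instead.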
The commit author notes that pci_enable_acs() is now called twice per
device and that this is "presumably not an issue." On AMD IOMMU hardware
this assumption does not hold -- the change in call ordering has
observable and breaking consequences for group topology.
It is worth noting that this is a stable/LTS series (6.12.y), where
changes to fundamental PCI initialization ordering carry significant
risk for production and specialized workloads that depend on stable
IOMMU behavior across kernel updates. A regression of this nature --
silently breaking PCIe passthrough without any configuration change on
the part of the user -- is particularly disruptive in a series that
users reasonably expect to be conservative.
This revert restores pci_enable_acs() to pci_acs_init() and marks it
static again, fully restoring correct IOMMU group topology on affected
hardware.
Regression introduced in: 6.12.75
Tested on: 6.12.77 with this revert applied
Hardware:
  CPU: AMD Ryzen Threadripper 2990WX (family 17h, Zen+)
  IOMMU: AMD-Vi
Bisect:
  6.12.74: GOOD -- IOMMU groups correct, passthrough functional
  6.12.75: BAD -- IOMMU groups collapsed, passthrough broken
  6.12.76: BAD -- still broken
  6.12.77: BAD -- still broken
Fixes: 7a126c1b6cfa ("PCI: Enable ACS after configuring IOMMU for OF platforms")
Signed-off-by: John Hancock <john@kernel.doghat.io>
---
 drivers/pci/pci-driver.c |  8 --------
 drivers/pci/pci.c        | 10 +++++++++-
 drivers/pci/pci.h        |  1 -
 3 files changed, 9 insertions(+), 10 deletions(-)
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 9846ab70c..a00a2ce01 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -1656,14 +1656,6 @@ static int pci_dma_configure(struct device *dev)
 		ret = acpi_dma_configure(dev, acpi_get_dma_attr(adev));
 	}
 
-	/*
-	 * Attempt to enable ACS regardless of capability because some Root
-	 * Ports (e.g. those quirked with *_intel_pch_acs_*) do not have
-	 * the standard ACS capability but still support ACS via those
-	 * quirks.
-	 */
-	pci_enable_acs(to_pci_dev(dev));
-
 	pci_put_host_bridge_device(bridge);
 
 	if (!ret && !driver->driver_managed_dma) {
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 040c0be2d..3eb878781 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1072,7 +1072,7 @@ static void pci_std_enable_acs(struct pci_dev *dev, struct pci_acs *caps)
  * pci_enable_acs - enable ACS if hardware support it
  * @dev: the PCI device
  */
-void pci_enable_acs(struct pci_dev *dev)
+static void pci_enable_acs(struct pci_dev *dev)
 {
 	struct pci_acs caps;
 	bool enable_acs = false;
@@ -3718,6 +3718,14 @@ bool pci_acs_path_enabled(struct pci_dev *start,
 void pci_acs_init(struct pci_dev *dev)
 {
 	dev->acs_cap = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ACS);
+
+	/*
+	 * Attempt to enable ACS regardless of capability because some Root
+	 * Ports (e.g. those quirked with *_intel_pch_acs_*) do not have
+	 * the standard ACS capability but still support ACS via those
+	 * quirks.
+	 */
+	pci_enable_acs(dev);
 }
 
 /**
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index b9b97ec0f..b1f393a42 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -653,7 +653,6 @@ static inline resource_size_t pci_resource_alignment(struct pci_dev *dev,
 }
 
 void pci_acs_init(struct pci_dev *dev);
-void pci_enable_acs(struct pci_dev *dev);
 #ifdef CONFIG_PCI_QUIRKS
 int pci_dev_specific_acs_enabled(struct pci_dev *dev, u16 acs_cfg);
 int pci_dev_specific_enable_acs(struct pci_dev *dev);
* Re: [REGRESSION] PCI: Revert "Enable ACS after configuring IOMMU for OF platforms"
From: bjorn.forsman @ 2026-03-23 12:31 UTC (permalink / raw)
To: john; +Cc: bhelgaas, iommu, joro, linux-pci, manivannan.sadhasivam, stable
Hi,
Thanks for submitting this fix/revert -- I spent a lot of time today
figuring out why my VM didn't start anymore, and finally came here.
Seeing that it's been a few days since this was posted, I figured I'd
add this message as a "ping" to the committers, in hopes of getting
this merged soon. (This regression is hitting end users now.)
Best regards,
Bjørn Forsman
* Re: [REGRESSION] PCI: Revert "Enable ACS after configuring IOMMU for OF platforms"
From: Manivannan Sadhasivam @ 2026-03-23 13:54 UTC (permalink / raw)
To: John Hancock, Robin Murphy
Cc: stable, bhelgaas, manivannan.sadhasivam, joro, linux-pci, iommu
+ Robin
On Fri, Mar 20, 2026 at 01:23:35PM -0400, John Hancock wrote:
> Commit 7a126c1b6cfa ("PCI: Enable ACS after configuring IOMMU for OF
> platforms") introduced a regression affecting AMD IOMMU group isolation
> on x86 systems, making PCIe passthrough non-functional.
>
> While the commit addresses a legitimate ordering issue on OF/Device Tree
> platforms, the fix modifies pci_dma_configure(), which executes on all
> platforms regardless of firmware interface. On AMD systems with IOMMU,
> moving pci_enable_acs() from pci_acs_init() to pci_dma_configure() alters
> the point at which ACS is evaluated relative to IOMMU group assignment.
> The result is that devices which previously occupied individual, exclusive
> IOMMU groups are merged into a single group containing both passthrough
> and non-passthrough members, violating IOMMU isolation requirements.
>
Ouch! Sorry for the breakage.
> The commit author notes that pci_enable_acs() is now called twice per
> device and that this is "presumably not an issue." On AMD IOMMU hardware
> this assumption does not hold -- the change in call ordering has
> observable and breaking consequences for group topology.
>
> It is worth noting that this is a stable/LTS series (6.12.y), where
> changes to fundamental PCI initialization ordering carry significant
> risk for production and specialized workloads that depend on stable
> IOMMU behavior across kernel updates. A regression of this nature --
> silently breaking PCIe passthrough without any configuration change on
> the part of the user -- is particularly disruptive in a series that
> users reasonably expect to be conservative.
>
I still haven't investigated this failure deeply, but it is also worth noting
that this regression only happens with v6.12 and earlier stable kernels as
mentioned in [1].
> This revert restores pci_enable_acs() to pci_acs_init() and marks it
> static again, fully restoring correct IOMMU group topology on affected
> hardware.
>
> Regression introduced in: 6.12.75
> Tested on: 6.12.77 with this revert applied
>
> Hardware:
> CPU: AMD Ryzen Threadripper 2990WX (family 23h, Zen+)
> IOMMU: AMD-Vi
>
> Bisect:
> 6.12.74: GOOD -- IOMMU groups correct, passthrough functional
> 6.12.75: BAD -- IOMMU groups collapsed, passthrough broken
> 6.12.76: BAD -- still broken
> 6.12.77: BAD -- still broken
>
> Fixes: 7a126c1b6cfa ("PCI: Enable ACS after configuring IOMMU for OF platforms")
> Signed-off-by: John Hancock <john@kernel.doghat.io>
Acked-by: Manivannan Sadhasivam <mani@kernel.org>
- Mani
[1] https://lore.kernel.org/all/2c30f181-ffc6-4d63-a64e-763cf4528f48@leemhuis.info/
--
மணிவண்ணன் சதாசிவம்
* Re: [REGRESSION] PCI: Revert "Enable ACS after configuring IOMMU for OF platforms"
From: Robin Murphy @ 2026-03-23 15:06 UTC (permalink / raw)
To: Manivannan Sadhasivam, John Hancock
Cc: stable, bhelgaas, manivannan.sadhasivam, joro, linux-pci, iommu
On 23/03/2026 1:54 pm, Manivannan Sadhasivam wrote:
> + Robin
>
> On Fri, Mar 20, 2026 at 01:23:35PM -0400, John Hancock wrote:
>> Commit 7a126c1b6cfa ("PCI: Enable ACS after configuring IOMMU for OF
>> platforms") introduced a regression affecting AMD IOMMU group isolation
>> on x86 systems, making PCIe passthrough non-functional.
>>
>> While the commit addresses a legitimate ordering issue on OF/Device Tree
>> platforms, the fix modifies pci_dma_configure(), which executes on all
>> platforms regardless of firmware interface. On AMD systems with IOMMU,
>> moving pci_enable_acs() from pci_acs_init() to pci_dma_configure() alters
>> the point at which ACS is evaluated relative to IOMMU group assignment.
>> The result is that devices which previously occupied individual, exclusive
>> IOMMU groups are merged into a single group containing both passthrough
>> and non-passthrough members, violating IOMMU isolation requirements.
>>
>
> Ouch! Sorry for the breakage.
>
>> The commit author notes that pci_enable_acs() is now called twice per
>> device and that this is "presumably not an issue." On AMD IOMMU hardware
>> this assumption does not hold -- the change in call ordering has
>> observable and breaking consequences for group topology.
>>
>> It is worth noting that this is a stable/LTS series (6.12.y), where
>> changes to fundamental PCI initialization ordering carry significant
>> risk for production and specialized workloads that depend on stable
>> IOMMU behavior across kernel updates. A regression of this nature --
>> silently breaking PCIe passthrough without any configuration change on
>> the part of the user -- is particularly disruptive in a series that
>> users reasonably expect to be conservative.
>>
>
> I still haven't investigated this failure deeply, but it is also worth noting
> that this regression only happens with v6.12 and earlier stable kernels as
> mentioned in [1].
Oops, indeed, relying on pci_dma_configure() to be called prior to group
assignment in iommu_init_device() only works since bcb81ac6ae3c ("iommu:
Get DT/ACPI parsing into the proper probe path") added that call path in
6.15 - thus the backport probably doesn't actually work for OF platforms
either.
Dropping this from 6.12.y and earlier stable branches seems like the
correct action to me (but not a mainline revert, obviously). ACS had
essentially *never* worked properly on OF platforms prior to 6.15, but
that was more down to fundamental design flaws in the OF-based IOMMU
probe path (dating back to 4.12) rather than any easily-fixable bug as
such, so realistically I think we just leave it that way.
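To make the ordering argument concrete, here is a toy userspace model in C.
It is purely illustrative: enum probe_step, struct toy_dev and toy_probe()
are invented names, not kernel APIs, and the model assumes the
simplification that group assignment samples the ACS state exactly once and
is never redone afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these types or functions exist in the kernel. */
enum probe_step {
	STEP_ACS_ENABLE,	/* stands in for pci_enable_acs() */
	STEP_GROUP_ASSIGN,	/* stands in for IOMMU group assignment */
};

struct toy_dev {
	bool acs_enabled;	/* set once the ACS enable step has run */
	bool isolated;		/* true if the device got its own group */
};

/*
 * Run the given steps in order. Group assignment only sees the ACS state
 * at the moment it runs; enabling ACS later does not regroup the device.
 */
static struct toy_dev toy_probe(const enum probe_step *steps, size_t n)
{
	struct toy_dev dev = { false, false };
	size_t i;

	for (i = 0; i < n; i++) {
		assert(steps[i] == STEP_ACS_ENABLE ||
		       steps[i] == STEP_GROUP_ASSIGN);

		if (steps[i] == STEP_ACS_ENABLE)
			dev.acs_enabled = true;
		else
			dev.isolated = dev.acs_enabled;
	}
	return dev;
}
```

With the steps ordered { STEP_ACS_ENABLE, STEP_GROUP_ASSIGN } the toy device
ends up isolated. With the order swapped, ACS is enabled in the end but the
device was already grouped without it, which is consistent with the merged
groups reported on 6.12.y.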
Thanks,
Robin.
>> This revert restores pci_enable_acs() to pci_acs_init() and marks it
>> static again, fully restoring correct IOMMU group topology on affected
>> hardware.
>>
>> Regression introduced in: 6.12.75
>> Tested on: 6.12.77 with this revert applied
>>
>> Hardware:
>> CPU: AMD Ryzen Threadripper 2990WX (family 23h, Zen+)
>> IOMMU: AMD-Vi
>>
>> Bisect:
>> 6.12.74: GOOD -- IOMMU groups correct, passthrough functional
>> 6.12.75: BAD -- IOMMU groups collapsed, passthrough broken
>> 6.12.76: BAD -- still broken
>> 6.12.77: BAD -- still broken
>>
>> Fixes: 7a126c1b6cfa ("PCI: Enable ACS after configuring IOMMU for OF platforms")
>> Signed-off-by: John Hancock <john@kernel.doghat.io>
>
> Acked-by: Manivannan Sadhasivam <mani@kernel.org>
>
> - Mani
>
> [1] https://lore.kernel.org/all/2c30f181-ffc6-4d63-a64e-763cf4528f48@leemhuis.info/
>
* Re: [REGRESSION] PCI: Revert "Enable ACS after configuring IOMMU for OF platforms"
From: Manivannan Sadhasivam @ 2026-03-23 15:09 UTC (permalink / raw)
To: John Hancock, Robin Murphy, Jason Gunthorpe
Cc: stable, bhelgaas, manivannan.sadhasivam, joro, linux-pci, iommu
+ Jason
On Mon, Mar 23, 2026 at 07:24:30PM +0530, Manivannan Sadhasivam wrote:
> + Robin
>
> On Fri, Mar 20, 2026 at 01:23:35PM -0400, John Hancock wrote:
> > Commit 7a126c1b6cfa ("PCI: Enable ACS after configuring IOMMU for OF
> > platforms") introduced a regression affecting AMD IOMMU group isolation
> > on x86 systems, making PCIe passthrough non-functional.
> >
> > While the commit addresses a legitimate ordering issue on OF/Device Tree
> > platforms, the fix modifies pci_dma_configure(), which executes on all
> > platforms regardless of firmware interface. On AMD systems with IOMMU,
> > moving pci_enable_acs() from pci_acs_init() to pci_dma_configure() alters
> > the point at which ACS is evaluated relative to IOMMU group assignment.
> > The result is that devices which previously occupied individual, exclusive
> > IOMMU groups are merged into a single group containing both passthrough
> > and non-passthrough members, violating IOMMU isolation requirements.
> >
>
> Ouch! Sorry for the breakage.
>
> > The commit author notes that pci_enable_acs() is now called twice per
> > device and that this is "presumably not an issue." On AMD IOMMU hardware
> > this assumption does not hold -- the change in call ordering has
> > observable and breaking consequences for group topology.
> >
> > It is worth noting that this is a stable/LTS series (6.12.y), where
> > changes to fundamental PCI initialization ordering carry significant
> > risk for production and specialized workloads that depend on stable
> > IOMMU behavior across kernel updates. A regression of this nature --
> > silently breaking PCIe passthrough without any configuration change on
> > the part of the user -- is particularly disruptive in a series that
> > users reasonably expect to be conservative.
> >
>
> I still haven't investigated this failure deeply, but it is also worth noting
> that this regression only happens with v6.12 and earlier stable kernels as
> mentioned in [1].
>
> > This revert restores pci_enable_acs() to pci_acs_init() and marks it
> > static again, fully restoring correct IOMMU group topology on affected
> > hardware.
> >
> > Regression introduced in: 6.12.75
> > Tested on: 6.12.77 with this revert applied
> >
> > Hardware:
> > CPU: AMD Ryzen Threadripper 2990WX (family 23h, Zen+)
> > IOMMU: AMD-Vi
> >
> > Bisect:
> > 6.12.74: GOOD -- IOMMU groups correct, passthrough functional
> > 6.12.75: BAD -- IOMMU groups collapsed, passthrough broken
> > 6.12.76: BAD -- still broken
> > 6.12.77: BAD -- still broken
> >
> > Fixes: 7a126c1b6cfa ("PCI: Enable ACS after configuring IOMMU for OF platforms")
> > Signed-off-by: John Hancock <john@kernel.doghat.io>
>
> Acked-by: Manivannan Sadhasivam <mani@kernel.org>
>
> - Mani
>
> [1] https://lore.kernel.org/all/2c30f181-ffc6-4d63-a64e-763cf4528f48@leemhuis.info/
>
> --
> மணிவண்ணன் சதாசிவம்
--
மணிவண்ணன் சதாசிவம்
* Re: [REGRESSION] PCI: Revert "Enable ACS after configuring IOMMU for OF platforms"
From: Manivannan Sadhasivam @ 2026-03-23 15:13 UTC (permalink / raw)
To: Robin Murphy
Cc: John Hancock, stable, bhelgaas, manivannan.sadhasivam, joro,
linux-pci, iommu
On Mon, Mar 23, 2026 at 03:06:16PM +0000, Robin Murphy wrote:
> On 23/03/2026 1:54 pm, Manivannan Sadhasivam wrote:
> > + Robin
> >
> > On Fri, Mar 20, 2026 at 01:23:35PM -0400, John Hancock wrote:
> > > Commit 7a126c1b6cfa ("PCI: Enable ACS after configuring IOMMU for OF
> > > platforms") introduced a regression affecting AMD IOMMU group isolation
> > > on x86 systems, making PCIe passthrough non-functional.
> > >
> > > While the commit addresses a legitimate ordering issue on OF/Device Tree
> > > platforms, the fix modifies pci_dma_configure(), which executes on all
> > > platforms regardless of firmware interface. On AMD systems with IOMMU,
> > > moving pci_enable_acs() from pci_acs_init() to pci_dma_configure() alters
> > > the point at which ACS is evaluated relative to IOMMU group assignment.
> > > The result is that devices which previously occupied individual, exclusive
> > > IOMMU groups are merged into a single group containing both passthrough
> > > and non-passthrough members, violating IOMMU isolation requirements.
> > >
> >
> > Ouch! Sorry for the breakage.
> >
> > > The commit author notes that pci_enable_acs() is now called twice per
> > > device and that this is "presumably not an issue." On AMD IOMMU hardware
> > > this assumption does not hold -- the change in call ordering has
> > > observable and breaking consequences for group topology.
> > >
> > > It is worth noting that this is a stable/LTS series (6.12.y), where
> > > changes to fundamental PCI initialization ordering carry significant
> > > risk for production and specialized workloads that depend on stable
> > > IOMMU behavior across kernel updates. A regression of this nature --
> > > silently breaking PCIe passthrough without any configuration change on
> > > the part of the user -- is particularly disruptive in a series that
> > > users reasonably expect to be conservative.
> > >
> >
> > I still haven't investigated this failure deeply, but it is also worth noting
> > that this regression only happens with v6.12 and earlier stable kernels as
> > mentioned in [1].
>
> Oops, indeed, relying on pci_dma_configure() to be called prior to group
> assignment in iommu_init_device() only works since bcb81ac6ae3c ("iommu: Get
> DT/ACPI parsing into the proper probe path") added that call path in 6.15 -
> thus the backport probably doesn't actually work for OF platforms either.
>
Ah, that makes sense. Thanks for finding the root cause. It might be very
obvious to you, but still... ;)
> Dropping this from 6.12.y and earlier stable branches seems like the correct
> action to me (but not a mainline revert, obviously). ACS had essentially
> *never* worked properly on OF platforms prior to 6.15, but that was more
> down to fundamental design flaws in the OF-based IOMMU probe path (dating
> back to 4.12) rather than any easily-fixable bug as such, so realistically I
> think we just leave it that way.
>
That's my opinion as well. I guess I need to send reverts for the rest of the
older stable kernels as well.
Thanks once again, Robin!
- Mani
--
மணிவண்ணன் சதாசிவம்
* Re: [REGRESSION] PCI: Revert "Enable ACS after configuring IOMMU for OF platforms"
From: Akemi Yagi @ 2026-03-23 22:27 UTC (permalink / raw)
To: stable; +Cc: linux-pci
On Mon, 23 Mar 2026 20:43:55 +0530, Manivannan Sadhasivam wrote:
>> Dropping this from 6.12.y and earlier stable branches seems like the
>> correct action to me (but not a mainline revert, obviously). ACS had
>> essentially *never* worked properly on OF platforms prior to 6.15, but
>> that was more down to fundamental design flaws in the OF-based IOMMU
>> probe path (dating back to 4.12) rather than any easily-fixable bug as
>> such, so realistically I think we just leave it that way.
>>
> That's my opinion as well. I guess I need to send reverts for rest of
> the older stable kernels as well.
This is a short note to say that the 6.1 kernel was affected and the
issue was fixed by reverting the referenced commit
(https://elrepo.org/bugs/view.php?id=1587).
Relevant bugzilla entry is here:
https://bugzilla.kernel.org/show_bug.cgi?id=221234
Akemi