* [PATCH 1/6] iommu/amd: Clear aliases before setting the rlookup_table to NULL
2026-06-01 13:41 [PATCH 0/6] iommu/amd: Refactors for ATS updates Pranjal Shrivastava
@ 2026-06-01 13:41 ` Pranjal Shrivastava
2026-06-01 13:57 ` sashiko-bot
2026-06-01 13:42 ` [PATCH 2/6] iommu/amd: Clear DTE with update_dte256 in iommu_ignore_device() Pranjal Shrivastava
` (5 subsequent siblings)
6 siblings, 1 reply; 20+ messages in thread
From: Pranjal Shrivastava @ 2026-06-01 13:41 UTC (permalink / raw)
To: iommu, linux-pci, linux-kernel
Cc: Joerg Roedel, Suravee Suthikulpanit, Vasant Hegde, Ankit Soni,
Jason Gunthorpe, Bjorn Helgaas, Samiullah Khawaja,
Pranjal Shrivastava, sashiko-bot
In iommu_ignore_device(), clearing the primary devid in the lookup table
before calling setup_aliases() causes rlookup_amd_iommu() to fail for
aliases. This prevents clearing the Device Table Entries (DTEs) for
DMA aliases.
Fix this by moving the setup_aliases() call before clearing the
rlookup_table entry.
Fixes: 99fc4ac3d297 ("iommu/amd: Introduce per PCI segment alias_table")
Reported-by: sashiko-bot@kernel.org
Closes: https://lore.kernel.org/all/20260529153216.2AD1E1F00899@smtp.kernel.org/
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/amd/iommu.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index f1333071da10..a94de66a885e 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -745,10 +745,12 @@ static void iommu_ignore_device(struct amd_iommu *iommu, struct device *dev)
return;
devid = PCI_SBDF_TO_DEVID(sbdf);
- pci_seg->rlookup_table[devid] = NULL;
- memset(&dev_table[devid], 0, sizeof(struct dev_table_entry));
+ /* Clear aliases while rlookup is still valid */
setup_aliases(iommu, dev);
+
+ pci_seg->rlookup_table[devid] = NULL;
+ memset(&dev_table[devid], 0, sizeof(struct dev_table_entry));
}
--
2.54.0.823.g6e5bcc1fc9-goog
^ permalink raw reply related [flat|nested] 20+ messages in thread* Re: [PATCH 1/6] iommu/amd: Clear aliases before setting the rlookup_table to NULL
2026-06-01 13:41 ` [PATCH 1/6] iommu/amd: Clear aliases before setting the rlookup_table to NULL Pranjal Shrivastava
@ 2026-06-01 13:57 ` sashiko-bot
0 siblings, 0 replies; 20+ messages in thread
From: sashiko-bot @ 2026-06-01 13:57 UTC (permalink / raw)
To: Pranjal Shrivastava; +Cc: linux-pci
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [High] Moving setup_aliases() before clearing the primary DTE incorrectly copies the active (uncleared) DTE to DMA aliases, failing to clear them.
- [Medium] Reordering setup_aliases() before clearing rlookup_table exposes a missing dev_data error path, causing log spam and aborting alias iteration.
--
commit 3c05610e625a1605bf16151c884edcdce09fdf2d
Author: Pranjal Shrivastava <praan@google.com>
iommu/amd: Clear aliases before setting the rlookup_table to NULL
In iommu_ignore_device(), clearing the primary devid in the lookup table
before calling setup_aliases() causes rlookup_amd_iommu() to fail for
aliases. This commit moves the setup_aliases() call before clearing the
rlookup_table entry.
Fixes: 99fc4ac3d297 ("iommu/amd: Introduce per PCI segment alias_table")
Reported-by: sashiko-bot@kernel.org
Closes: https://lore.kernel.org/all/20260529153216.2AD1E1F00899@smtp.kernel.org/
Signed-off-by: Pranjal Shrivastava <praan@google.com>
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index f1333071da103..a94de66a885e7 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -745,10 +745,12 @@ static void iommu_ignore_device(struct amd_iommu *iommu, struct device *dev)
> return;
>
> devid = PCI_SBDF_TO_DEVID(sbdf);
> - pci_seg->rlookup_table[devid] = NULL;
> - memset(&dev_table[devid], 0, sizeof(struct dev_table_entry));
>
> + /* Clear aliases while rlookup is still valid */
> setup_aliases(iommu, dev);
[Severity: High]
Does this copy the active DTE instead of the cleared DTE to the DMA aliases?
By moving the call to setup_aliases() before the memset() that clears the
primary Device Table Entry (DTE), setup_aliases() will read the primary
device's uncleared DTE and write this active DTE to the aliases.
Then memset() clears only the primary DTE, leaving the aliases retaining
their old active translations. Could this lead to unauthorized DMA or hardware
use-after-free if the domain's page tables are subsequently freed?
[Severity: Medium]
Does this reordering also expose a missing dev_data error path when
iommu_ignore_device() is called during early initialization?
If this is called from an early initialization error path before dev_data is
set, dev_data will be NULL. Previously, because rlookup_table[devid] was set
to NULL first, rlookup_amd_iommu() inside clone_alias() would return NULL and
safely exit without further processing.
With this patch, rlookup_table is still valid when setup_aliases() is called,
so clone_alias() proceeds and hits the check:
drivers/iommu/amd/iommu.c:clone_alias() {
...
dev_data = dev_iommu_priv_get(&pdev->dev);
if (!dev_data) {
pr_err("%s : Failed to get dev_data for 0x%x\n", __func__, devid);
ret = -EINVAL;
goto out;
}
...
}
Returning -EINVAL causes pci_for_each_dma_alias() to immediately abort its
iteration. Can this cause unintended log spam and prevent the remaining
aliases from being processed?
> +
> + pci_seg->rlookup_table[devid] = NULL;
> + memset(&dev_table[devid], 0, sizeof(struct dev_table_entry));
> }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260601134204.2150602-1-praan@google.com?part=1
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 2/6] iommu/amd: Clear DTE with update_dte256 in iommu_ignore_device()
2026-06-01 13:41 [PATCH 0/6] iommu/amd: Refactors for ATS updates Pranjal Shrivastava
2026-06-01 13:41 ` [PATCH 1/6] iommu/amd: Clear aliases before setting the rlookup_table to NULL Pranjal Shrivastava
@ 2026-06-01 13:42 ` Pranjal Shrivastava
2026-06-01 14:09 ` sashiko-bot
` (2 more replies)
2026-06-01 13:42 ` [PATCH 3/6] iommu/amd: Split probe error paths to preserve IRQ remapping Pranjal Shrivastava
` (4 subsequent siblings)
6 siblings, 3 replies; 20+ messages in thread
From: Pranjal Shrivastava @ 2026-06-01 13:42 UTC (permalink / raw)
To: iommu, linux-pci, linux-kernel
Cc: Joerg Roedel, Suravee Suthikulpanit, Vasant Hegde, Ankit Soni,
Jason Gunthorpe, Bjorn Helgaas, Samiullah Khawaja,
Pranjal Shrivastava, sashiko-bot
The iommu_ignore_device() function currently uses memset() to clear
the Device Table Entry (DTE), which risks torn writes because the
hardware reads DTEs as atomic 256-bit qwords. Fix this by using
update_dte256() to perform a hardware-safe atomic clear when a live
dev_data entry is available.
Fixes: 99fc4ac3d297 ("iommu/amd: Introduce per PCI segment alias_table")
Reported-by: sashiko-bot@kernel.org
Closes: https://lore.kernel.org/all/20260529153216.2AD1E1F00899@smtp.kernel.org/
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/amd/iommu.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index a94de66a885e..9b5861e241d7 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -750,7 +750,16 @@ static void iommu_ignore_device(struct amd_iommu *iommu, struct device *dev)
setup_aliases(iommu, dev);
pci_seg->rlookup_table[devid] = NULL;
- memset(&dev_table[devid], 0, sizeof(struct dev_table_entry));
+
+ /* Clear DTE if we have a live entry */
+ if (dev_data) {
+ struct dev_table_entry new = {};
+
+ amd_iommu_make_clear_dte(dev_data, &new);
+ update_dte256(iommu, dev_data, &new);
+ } else {
+ memset(&dev_table[devid], 0, sizeof(struct dev_table_entry));
+ }
}
--
2.54.0.823.g6e5bcc1fc9-goog
^ permalink raw reply related [flat|nested] 20+ messages in thread* Re: [PATCH 2/6] iommu/amd: Clear DTE with update_dte256 in iommu_ignore_device()
2026-06-01 13:42 ` [PATCH 2/6] iommu/amd: Clear DTE with update_dte256 in iommu_ignore_device() Pranjal Shrivastava
@ 2026-06-01 14:09 ` sashiko-bot
2026-06-01 14:19 ` Jason Gunthorpe
2026-06-02 8:55 ` Vasant Hegde
2 siblings, 0 replies; 20+ messages in thread
From: sashiko-bot @ 2026-06-01 14:09 UTC (permalink / raw)
To: Pranjal Shrivastava; +Cc: linux-pci
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
New issues:
- [High] Calling `setup_aliases()` before clearing the primary DTE propagates the active DTE to aliases instead of safely clearing them.
- [Low] The variable `dev_data` is used in `iommu_ignore_device()` without being declared or initialized, causing a build failure. Note: I found the fix later in the series in commit 'iommu/amd: Split probe error paths to preserve IRQ remapping'.
Pre-existing issues:
- [High] The patch leaves the hardware torn-write vulnerability unmitigated when `dev_data` is NULL by falling back to unsafe `memset()`.
--
commit 57abff52ae96ef6d8a5ffe516b907b27e8c51040
Author: Pranjal Shrivastava <praan@google.com>
Subject: iommu/amd: Clear DTE with update_dte256 in iommu_ignore_device()
This commit modifies iommu_ignore_device() to use update_dte256() instead
of memset() for clearing the Device Table Entry (DTE) when dev_data is
available. This aims to prevent torn writes caused by the hardware reading
256-bit DTEs atomically.
Link: https://lore.kernel.org/all/20260529153216.2AD1E1F00899@smtp.kernel.org/
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -750,7 +750,16 @@ static void iommu_ignore_device(struct amd_iommu *iommu, struct device *dev)
> setup_aliases(iommu, dev);
[Severity: High]
Does calling setup_aliases() before clearing the primary DTE cause the
aliases to receive the active DTE?
Looking at setup_aliases(), it invokes clone_aliases() which reads the
current primary DTE and copies it to all alias DTEs. Since this happens
before the primary DTE is cleared below, it appears the aliases remain
active with the uncleared DTE.
>
> pci_seg->rlookup_table[devid] = NULL;
> - memset(&dev_table[devid], 0, sizeof(struct dev_table_entry));
> +
> + /* Clear DTE if we have a live entry */
> + if (dev_data) {
[Severity: Low]
Is dev_data declared in this scope? It appears this will cause a compilation
failure since dev_data is evaluated without being declared or initialized in
iommu_ignore_device().
I noticed this build issue is fixed later in the series by commit
'iommu/amd: Split probe error paths to preserve IRQ remapping'.
> + struct dev_table_entry new = {};
> +
> + amd_iommu_make_clear_dte(dev_data, &new);
> + update_dte256(iommu, dev_data, &new);
> + } else {
> + memset(&dev_table[devid], 0, sizeof(struct dev_table_entry));
[Severity: High]
This is a pre-existing issue, but does falling back to memset() leave the
hardware torn-write vulnerability unmitigated when dev_data is absent?
If a device lacks dev_data, such as during failed initialization, the BIOS
or ACPI IVHD table may have already established a live DTE. Executing
memset() in this state could still cause a torn read by the active IOMMU
hardware.
> + }
> }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260601134204.2150602-1-praan@google.com?part=2
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH 2/6] iommu/amd: Clear DTE with update_dte256 in iommu_ignore_device()
2026-06-01 13:42 ` [PATCH 2/6] iommu/amd: Clear DTE with update_dte256 in iommu_ignore_device() Pranjal Shrivastava
2026-06-01 14:09 ` sashiko-bot
@ 2026-06-01 14:19 ` Jason Gunthorpe
2026-06-02 4:01 ` Pranjal Shrivastava
2026-06-02 8:55 ` Vasant Hegde
2 siblings, 1 reply; 20+ messages in thread
From: Jason Gunthorpe @ 2026-06-01 14:19 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, linux-pci, linux-kernel, Joerg Roedel,
Suravee Suthikulpanit, Vasant Hegde, Ankit Soni, Bjorn Helgaas,
Samiullah Khawaja, sashiko-bot
On Mon, Jun 01, 2026 at 01:42:00PM +0000, Pranjal Shrivastava wrote:
> The iommu_ignore_device() function currently uses memset() to clear
> the Device Table Entry (DTE), which risks torn writes because the
> hardware reads DTEs as atomic 256-bit qwords. Fix this by using
> update_dte256() to perform a hardware-safe atomic clear when a live
> dev_data entry is available.
>
> Fixes: 99fc4ac3d297 ("iommu/amd: Introduce per PCI segment alias_table")
> Reported-by: sashiko-bot@kernel.org
> Closes: https://lore.kernel.org/all/20260529153216.2AD1E1F00899@smtp.kernel.org/
> Signed-off-by: Pranjal Shrivastava <praan@google.com>
> ---
> drivers/iommu/amd/iommu.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index a94de66a885e..9b5861e241d7 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -750,7 +750,16 @@ static void iommu_ignore_device(struct amd_iommu *iommu, struct device *dev)
> setup_aliases(iommu, dev);
>
> pci_seg->rlookup_table[devid] = NULL;
> - memset(&dev_table[devid], 0, sizeof(struct dev_table_entry));
> +
> + /* Clear DTE if we have a live entry */
> + if (dev_data) {
> + struct dev_table_entry new = {};
> +
> + amd_iommu_make_clear_dte(dev_data, &new);
> + update_dte256(iommu, dev_data, &new);
> + } else {
> + memset(&dev_table[devid], 0, sizeof(struct dev_table_entry));
> + }
This seems a little weird, an ignored device shouldn't have a dev_data
really, or it will soon be freed.
I think you are better to replace the memset with a dedicated function
/* Cannot not be used on a probe'd device with a live dev_data */
disable_dte(..)
{
struct dev_table_entry new = {};
write_dte_lower128(ptr, new);
write_dte_upper128(ptr, new);
}
And then this new ordering breaks the clone_aliases flow, it was
supposed to copy the 0 from the current DTE to the aliases..
Jason
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH 2/6] iommu/amd: Clear DTE with update_dte256 in iommu_ignore_device()
2026-06-01 14:19 ` Jason Gunthorpe
@ 2026-06-02 4:01 ` Pranjal Shrivastava
0 siblings, 0 replies; 20+ messages in thread
From: Pranjal Shrivastava @ 2026-06-02 4:01 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: iommu, linux-pci, linux-kernel, Joerg Roedel,
Suravee Suthikulpanit, Vasant Hegde, Ankit Soni, Bjorn Helgaas,
Samiullah Khawaja, sashiko-bot
On Mon, Jun 01, 2026 at 11:19:56AM -0300, Jason Gunthorpe wrote:
> On Mon, Jun 01, 2026 at 01:42:00PM +0000, Pranjal Shrivastava wrote:
> > The iommu_ignore_device() function currently uses memset() to clear
> > the Device Table Entry (DTE), which risks torn writes because the
> > hardware reads DTEs as atomic 256-bit qwords. Fix this by using
> > update_dte256() to perform a hardware-safe atomic clear when a live
> > dev_data entry is available.
> >
> > Fixes: 99fc4ac3d297 ("iommu/amd: Introduce per PCI segment alias_table")
> > Reported-by: sashiko-bot@kernel.org
> > Closes: https://lore.kernel.org/all/20260529153216.2AD1E1F00899@smtp.kernel.org/
> > Signed-off-by: Pranjal Shrivastava <praan@google.com>
> > ---
> > drivers/iommu/amd/iommu.c | 11 ++++++++++-
> > 1 file changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> > index a94de66a885e..9b5861e241d7 100644
> > --- a/drivers/iommu/amd/iommu.c
> > +++ b/drivers/iommu/amd/iommu.c
> > @@ -750,7 +750,16 @@ static void iommu_ignore_device(struct amd_iommu *iommu, struct device *dev)
> > setup_aliases(iommu, dev);
> >
> > pci_seg->rlookup_table[devid] = NULL;
> > - memset(&dev_table[devid], 0, sizeof(struct dev_table_entry));
> > +
> > + /* Clear DTE if we have a live entry */
> > + if (dev_data) {
> > + struct dev_table_entry new = {};
> > +
> > + amd_iommu_make_clear_dte(dev_data, &new);
> > + update_dte256(iommu, dev_data, &new);
> > + } else {
> > + memset(&dev_table[devid], 0, sizeof(struct dev_table_entry));
> > + }
>
> This seems a little weird, an ignored device shouldn't have a dev_data
> really, or it will soon be freed.
>
> I think you are better to replace the memset with a dedicated function
>
> /* Cannot not be used on a probe'd device with a live dev_data */
> disable_dte(..)
> {
> struct dev_table_entry new = {};
>
> write_dte_lower128(ptr, new);
> write_dte_upper128(ptr, new);
> }
>
> And then this new ordering breaks the clone_aliases flow, it was
> supposed to copy the 0 from the current DTE to the aliases..
Ack.. I'll fix it.
Thanks,
Praan
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/6] iommu/amd: Clear DTE with update_dte256 in iommu_ignore_device()
2026-06-01 13:42 ` [PATCH 2/6] iommu/amd: Clear DTE with update_dte256 in iommu_ignore_device() Pranjal Shrivastava
2026-06-01 14:09 ` sashiko-bot
2026-06-01 14:19 ` Jason Gunthorpe
@ 2026-06-02 8:55 ` Vasant Hegde
2 siblings, 0 replies; 20+ messages in thread
From: Vasant Hegde @ 2026-06-02 8:55 UTC (permalink / raw)
To: Pranjal Shrivastava, iommu, linux-pci, linux-kernel
Cc: Joerg Roedel, Suravee Suthikulpanit, Ankit Soni, Jason Gunthorpe,
Bjorn Helgaas, Samiullah Khawaja, sashiko-bot
On 6/1/2026 7:12 PM, Pranjal Shrivastava wrote:
> The iommu_ignore_device() function currently uses memset() to clear
> the Device Table Entry (DTE), which risks torn writes because the
> hardware reads DTEs as atomic 256-bit qwords. Fix this by using
> update_dte256() to perform a hardware-safe atomic clear when a live
> dev_data entry is available.
>
> Fixes: 99fc4ac3d297 ("iommu/amd: Introduce per PCI segment alias_table")
> Reported-by: sashiko-bot@kernel.org
> Closes: https://lore.kernel.org/all/20260529153216.2AD1E1F00899@smtp.kernel.org/
> Signed-off-by: Pranjal Shrivastava <praan@google.com>
> ---
> drivers/iommu/amd/iommu.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index a94de66a885e..9b5861e241d7 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -750,7 +750,16 @@ static void iommu_ignore_device(struct amd_iommu *iommu, struct device *dev)
> setup_aliases(iommu, dev);
>
> pci_seg->rlookup_table[devid] = NULL;
> - memset(&dev_table[devid], 0, sizeof(struct dev_table_entry));
> +
> + /* Clear DTE if we have a live entry */
> + if (dev_data) {
> + struct dev_table_entry new = {};
> +
> + amd_iommu_make_clear_dte(dev_data, &new);
> + update_dte256(iommu, dev_data, &new);
> + } else {
> + memset(&dev_table[devid], 0, sizeof(struct dev_table_entry));
> + }
> }
>
Why not clear_dte_entry() that will disable the DMA for device and alias
devices? May be we should rename fucntion name as well iommu_ignore_device ->
iommu_disable_device_dma or something.
-Vasant
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 3/6] iommu/amd: Split probe error paths to preserve IRQ remapping
2026-06-01 13:41 [PATCH 0/6] iommu/amd: Refactors for ATS updates Pranjal Shrivastava
2026-06-01 13:41 ` [PATCH 1/6] iommu/amd: Clear aliases before setting the rlookup_table to NULL Pranjal Shrivastava
2026-06-01 13:42 ` [PATCH 2/6] iommu/amd: Clear DTE with update_dte256 in iommu_ignore_device() Pranjal Shrivastava
@ 2026-06-01 13:42 ` Pranjal Shrivastava
2026-06-01 14:30 ` sashiko-bot
2026-06-01 13:42 ` [PATCH 4/6] iommu/amd: Fix Use-After-Free in non-fatal probe error path Pranjal Shrivastava
` (3 subsequent siblings)
6 siblings, 1 reply; 20+ messages in thread
From: Pranjal Shrivastava @ 2026-06-01 13:42 UTC (permalink / raw)
To: iommu, linux-pci, linux-kernel
Cc: Joerg Roedel, Suravee Suthikulpanit, Vasant Hegde, Ankit Soni,
Jason Gunthorpe, Bjorn Helgaas, Samiullah Khawaja,
Pranjal Shrivastava, sashiko-bot
Split the amd_iommu_probe_device() error paths into err_deinit and
out_err. Proper init failures continue to call iommu_ignore_device()
while configuration failures (like PD_MODE_NONE or ATS mismatches) skip
it to preserve the rlookup_table entry required for IRQ remapping.
Reported-by: sashiko-bot@kernel.org
Closes: https://lore.kernel.org/all/20260529153216.2AD1E1F00899@smtp.kernel.org/
Suggested-by: Ankit Soni <ankit.soni@amd.com>
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/amd/iommu.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 9b5861e241d7..c3b3750d4a22 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -737,6 +737,7 @@ static int iommu_init_device(struct amd_iommu *iommu, struct device *dev)
static void iommu_ignore_device(struct amd_iommu *iommu, struct device *dev)
{
struct amd_iommu_pci_seg *pci_seg = iommu->pci_seg;
+ struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
struct dev_table_entry *dev_table = get_dev_table(iommu);
int devid, sbdf;
@@ -2477,8 +2478,7 @@ static struct iommu_device *amd_iommu_probe_device(struct device *dev)
if (ret) {
dev_err(dev, "Failed to initialize - trying to proceed anyway\n");
iommu_dev = ERR_PTR(ret);
- iommu_ignore_device(iommu, dev);
- goto out_err;
+ goto err_deinit;
}
amd_iommu_set_pci_msi_domain(dev, iommu);
@@ -2512,6 +2512,10 @@ static struct iommu_device *amd_iommu_probe_device(struct device *dev)
if (dev_is_pci(dev))
pci_prepare_ats(to_pci_dev(dev), PAGE_SHIFT);
+ return iommu_dev;
+
+err_deinit:
+ iommu_ignore_device(iommu, dev);
out_err:
return iommu_dev;
}
--
2.54.0.823.g6e5bcc1fc9-goog
^ permalink raw reply related [flat|nested] 20+ messages in thread* Re: [PATCH 3/6] iommu/amd: Split probe error paths to preserve IRQ remapping
2026-06-01 13:42 ` [PATCH 3/6] iommu/amd: Split probe error paths to preserve IRQ remapping Pranjal Shrivastava
@ 2026-06-01 14:30 ` sashiko-bot
0 siblings, 0 replies; 20+ messages in thread
From: sashiko-bot @ 2026-06-01 14:30 UTC (permalink / raw)
To: Pranjal Shrivastava; +Cc: linux-pci
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
New issues:
- [High] `iommu_ignore_device()` suffers from dead code and incorrect reordering, resulting in broken alias clearance, log spam, and unsafe non-atomic DTE clearing (torn writes).
- [Low] The commit silently fixes a build failure by adding a missing `dev_data` variable declaration that was used in the previous commit, breaking git bisect.
Pre-existing issues:
- [High] Skipping configuration on `PD_MODE_NONE` leaves `dev_data->max_irqs` uninitialized (0), completely breaking IRQ remapping.
--
commit b695f2fee16ac1c72b6f21b22dedf39b5ec55656
Author: Pranjal Shrivastava <praan@google.com>
iommu/amd: Split probe error paths to preserve IRQ remapping
This commit splits the error paths in amd_iommu_probe_device() to ensure
proper initialization failures clear the device, while configuration failures
like PD_MODE_NONE preserve the rlookup_table entry for IRQ remapping.
Reported-by: sashiko-bot@kernel.org
Closes: https://lore.kernel.org/all/20260529153216.2AD1E1F00899@smtp.kernel.org/
Suggested-by: Ankit Soni <ankit.soni@amd.com>
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index 9b5861e241d71..c3b3750d4a22d 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -737,6 +737,7 @@ static int iommu_init_device(struct amd_iommu *iommu, struct device *dev)
> static void iommu_ignore_device(struct amd_iommu *iommu, struct device *dev)
> {
> struct amd_iommu_pci_seg *pci_seg = iommu->pci_seg;
> + struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
[Severity: Low]
Does this silently fix a build failure from a previous commit?
It appears the preceding commit in this series introduced the use of dev_data
in iommu_ignore_device() but failed to declare it. Adding the declaration here
fixes the build, but breaks bisectability between the two commits.
Could this fix be squashed into the commit that introduced the bug to ensure
the series remains fully bisectable?
[Severity: High]
Is dev_data always NULL when iommu_ignore_device() is called here?
Since iommu_ignore_device() is exclusively invoked from the error path of
amd_iommu_probe_device() when iommu_init_device() fails, dev_data will be
NULL because iommu_init_device() fails before dev_iommu_priv_set() is called.
If dev_data is NULL, the atomic DTE clearing block in this function becomes
dead code, and it falls back to the unsafe memset(), which risks torn writes
on live hardware:
drivers/iommu/amd/iommu.c:iommu_ignore_device() {
...
if (dev_data) {
struct dev_table_entry new = {};
amd_iommu_make_clear_dte(dev_data, &new);
update_dte256(iommu, dev_data, &new);
} else {
memset(&dev_table[devid], 0, sizeof(struct dev_table_entry));
}
...
}
Additionally, because setup_aliases() is unconditionally called before this,
its internal clone_alias() will abort on the NULL dev_data and generate pr_err
log spam without actually clearing the alias DTEs.
[ ... ]
> @@ -2512,6 +2512,10 @@ static struct iommu_device *amd_iommu_probe_device(struct device *dev)
> if (dev_is_pci(dev))
> pci_prepare_ats(to_pci_dev(dev), PAGE_SHIFT);
>
> + return iommu_dev;
> +
> +err_deinit:
> + iommu_ignore_device(iommu, dev);
> out_err:
[Severity: High]
This is a pre-existing issue, but does skipping the device configuration
leave max_irqs uninitialized?
The commit message notes that configuration failures like PD_MODE_NONE
skip iommu_ignore_device() to preserve the rlookup_table entry for IRQ
remapping.
However, because amd_iommu_probe_device() jumps directly to out_err on
PD_MODE_NONE, it bypasses the initialization of dev_data->max_irqs, leaving
it at 0.
Later, when alloc_irq_index() tries to allocate an IRQ, it relies on max_irqs:
drivers/iommu/amd/iommu.c:alloc_irq_index() {
...
for (index = ALIGN(table->min_index, alignment), c = 0;
index < max_irqs;) {
...
}
Since max_irqs is 0, the loop is skipped and it returns -ENOSPC, which
completely breaks IRQ allocation for the device, defeating the purpose
of preserving the rlookup entry.
> return iommu_dev;
> }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260601134204.2150602-1-praan@google.com?part=3
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 4/6] iommu/amd: Fix Use-After-Free in non-fatal probe error path
2026-06-01 13:41 [PATCH 0/6] iommu/amd: Refactors for ATS updates Pranjal Shrivastava
` (2 preceding siblings ...)
2026-06-01 13:42 ` [PATCH 3/6] iommu/amd: Split probe error paths to preserve IRQ remapping Pranjal Shrivastava
@ 2026-06-01 13:42 ` Pranjal Shrivastava
2026-06-01 14:44 ` sashiko-bot
2026-06-01 13:42 ` [PATCH 5/6] iommu/amd: Fail probe on ATS configuration failure Pranjal Shrivastava
` (2 subsequent siblings)
6 siblings, 1 reply; 20+ messages in thread
From: Pranjal Shrivastava @ 2026-06-01 13:42 UTC (permalink / raw)
To: iommu, linux-pci, linux-kernel
Cc: Joerg Roedel, Suravee Suthikulpanit, Vasant Hegde, Ankit Soni,
Jason Gunthorpe, Bjorn Helgaas, Samiullah Khawaja,
Pranjal Shrivastava, sashiko-bot
When amd_iommu_probe_device() fails for optional features (e.g.
PD_MODE_NONE), the dev_data remains globally linked with a pointer
to the struct device. Since the IOMMU core does not invoke
release_device() after a probe failure, this could lead to a UAF once
the physical device is removed.
Fix this by introducing amd_iommu_clear_device_dte() to clear the HW
state and setting dev_data->dev = NULL on the non-fatal error path.
Reported-by: sashiko-bot@kernel.org
Closes: https://lore.kernel.org/all/20260529153216.2AD1E1F00899@smtp.kernel.org/
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/amd/iommu.c | 24 +++++++++++++++++-------
1 file changed, 17 insertions(+), 7 deletions(-)
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index c3b3750d4a22..4ef6024c5a4e 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -734,6 +734,15 @@ static int iommu_init_device(struct amd_iommu *iommu, struct device *dev)
return 0;
}
+static void amd_iommu_clear_device_dte(struct amd_iommu *iommu,
+ struct iommu_dev_data *dev_data)
+{
+ struct dev_table_entry new = {};
+
+ amd_iommu_make_clear_dte(dev_data, &new);
+ update_dte256(iommu, dev_data, &new);
+}
+
static void iommu_ignore_device(struct amd_iommu *iommu, struct device *dev)
{
struct amd_iommu_pci_seg *pci_seg = iommu->pci_seg;
@@ -753,14 +762,10 @@ static void iommu_ignore_device(struct amd_iommu *iommu, struct device *dev)
pci_seg->rlookup_table[devid] = NULL;
/* Clear DTE if we have a live entry */
- if (dev_data) {
- struct dev_table_entry new = {};
-
- amd_iommu_make_clear_dte(dev_data, &new);
- update_dte256(iommu, dev_data, &new);
- } else {
+ if (dev_data)
+ amd_iommu_clear_device_dte(iommu, dev_data);
+ else
memset(&dev_table[devid], 0, sizeof(struct dev_table_entry));
- }
}
@@ -2517,6 +2522,11 @@ static struct iommu_device *amd_iommu_probe_device(struct device *dev)
err_deinit:
iommu_ignore_device(iommu, dev);
out_err:
+ if (dev_data) {
+ amd_iommu_clear_device_dte(iommu, dev_data);
+ dev_data->dev = NULL;
+ }
+
return iommu_dev;
}
--
2.54.0.823.g6e5bcc1fc9-goog
^ permalink raw reply related [flat|nested] 20+ messages in thread* Re: [PATCH 4/6] iommu/amd: Fix Use-After-Free in non-fatal probe error path
2026-06-01 13:42 ` [PATCH 4/6] iommu/amd: Fix Use-After-Free in non-fatal probe error path Pranjal Shrivastava
@ 2026-06-01 14:44 ` sashiko-bot
0 siblings, 0 replies; 20+ messages in thread
From: sashiko-bot @ 2026-06-01 14:44 UTC (permalink / raw)
To: Pranjal Shrivastava; +Cc: linux-pci
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
New issues:
- [Critical] The local variable `dev_data` is used uninitialized on the `iommu_init_device` error path, leading to arbitrary memory operations.
- [Critical] Calling `amd_iommu_make_clear_dte` during device probe triggers a NULL pointer dereference because `dev->iommu->iommu_dev` is not yet set.
Pre-existing issues:
- [High] Uncleared active DTE is copied to DMA aliases in `iommu_ignore_device()`, bypassing IOMMU isolation.
--
commit 9fad373a2e7ad46c5f5a781a35dd6f77a626cb3b
Author: Pranjal Shrivastava <praan@google.com>
iommu/amd: Fix Use-After-Free in non-fatal probe error path
When amd_iommu_probe_device() fails for optional features, the dev_data
remains globally linked with a pointer to the struct device. This patch
introduces amd_iommu_clear_device_dte() to clear the HW state and sets
dev_data->dev = NULL on the non-fatal error path to prevent a UAF.
Reported-by: sashiko-bot@kernel.org
Closes: https://lore.kernel.org/all/20260529153216.2AD1E1F00899@smtp.kernel.org/
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index c3b3750d4a22d..4ef6024c5a4ea 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -734,6 +734,15 @@ static int iommu_init_device(struct amd_iommu *iommu, struct device *dev)
> return 0;
> }
>
> +static void amd_iommu_clear_device_dte(struct amd_iommu *iommu,
> + struct iommu_dev_data *dev_data)
> +{
> + struct dev_table_entry new = {};
> +
> + amd_iommu_make_clear_dte(dev_data, &new);
[Severity: Critical]
Does calling amd_iommu_make_clear_dte() during device probe trigger a null
pointer dereference?
If amd_iommu_make_clear_dte() is called during the probe error path, it
attempts to fetch the IOMMU via:
drivers/iommu/amd/amd_iommu.h:amd_iommu_make_clear_dte() {
...
struct amd_iommu *iommu = get_amd_iommu_from_dev(dev_data->dev);
...
}
This internally accesses dev->iommu->iommu_dev. However, because execution
occurs during the ops->probe_device() phase, the IOMMU core has not yet
populated this pointer. This causes __iommu_get_iommu_dev() to return null,
leading to a null pointer dereference when accessing iommu->pci_seg later in
the function.
Could we pass the explicitly provided iommu pointer directly to
amd_iommu_make_clear_dte() instead of fetching it again?
> + update_dte256(iommu, dev_data, &new);
> +}
> +
> static void iommu_ignore_device(struct amd_iommu *iommu, struct device *dev)
> {
> struct amd_iommu_pci_seg *pci_seg = iommu->pci_seg;
> @@ -753,14 +762,10 @@ static void iommu_ignore_device(struct amd_iommu *iommu, struct device *dev)
> pci_seg->rlookup_table[devid] = NULL;
>
> /* Clear DTE if we have a live entry */
[Severity: High]
This is a pre-existing issue, but does iommu_ignore_device() copy an
uncleared active DTE to DMA aliases?
Just before clearing the DTE, setup_aliases() is called:
drivers/iommu/amd/iommu.c:iommu_ignore_device() {
...
/* Clear aliases while rlookup is still valid */
setup_aliases(iommu, dev);
pci_seg->rlookup_table[devid] = NULL;
/* Clear DTE if we have a live entry */
...
}
Because setup_aliases() is called before clearing the primary device's DTE,
clone_alias() will read the primary device's active, uncleared DTE and write
it to the alias's DTE. When the primary device's DTE is eventually cleared,
the aliases remain fully active and capable of unauthorized DMA, which bypasses
IOMMU isolation for devices with aliases.
> - if (dev_data) {
> - struct dev_table_entry new = {};
> -
> - amd_iommu_make_clear_dte(dev_data, &new);
> - update_dte256(iommu, dev_data, &new);
> - } else {
> + if (dev_data)
> + amd_iommu_clear_device_dte(iommu, dev_data);
> + else
> memset(&dev_table[devid], 0, sizeof(struct dev_table_entry));
> - }
> }
>
>
> @@ -2517,6 +2522,11 @@ static struct iommu_device *amd_iommu_probe_device(struct device *dev)
> err_deinit:
> iommu_ignore_device(iommu, dev);
> out_err:
> + if (dev_data) {
[Severity: Critical]
Is it possible that dev_data is used uninitialized on the iommu_init_device()
error path?
If iommu_init_device() fails, the code jumps to the err_deinit label and falls
through to out_err:
drivers/iommu/amd/iommu.c:amd_iommu_probe_device() {
...
struct iommu_dev_data *dev_data;
...
ret = iommu_init_device(iommu, dev);
if (ret) {
dev_err(dev, "Failed to initialize - trying to proceed anyway\n");
iommu_dev = ERR_PTR(ret);
goto err_deinit;
}
...
err_deinit:
iommu_ignore_device(iommu, dev);
out_err:
if (dev_data) {
amd_iommu_clear_device_dte(iommu, dev_data);
...
}
Because dev_data is declared without initialization and isn't assigned until
later in the function, it will contain stack garbage when the error path is
taken. This could result in passing a garbage pointer to
amd_iommu_clear_device_dte(), leading to arbitrary memory operations.
> + amd_iommu_clear_device_dte(iommu, dev_data);
> + dev_data->dev = NULL;
> + }
> +
> return iommu_dev;
> }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260601134204.2150602-1-praan@google.com?part=4
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 5/6] iommu/amd: Fail probe on ATS configuration failure
2026-06-01 13:41 [PATCH 0/6] iommu/amd: Refactors for ATS updates Pranjal Shrivastava
` (3 preceding siblings ...)
2026-06-01 13:42 ` [PATCH 4/6] iommu/amd: Fix Use-After-Free in non-fatal probe error path Pranjal Shrivastava
@ 2026-06-01 13:42 ` Pranjal Shrivastava
2026-06-01 15:01 ` sashiko-bot
2026-06-01 13:42 ` [PATCH 6/6] PCI/ATS: Mandate checking pci_ats_supported() before pci_prepare_ats() Pranjal Shrivastava
2026-06-02 8:49 ` [PATCH 0/6] iommu/amd: Refactors for ATS updates Vasant Hegde
6 siblings, 1 reply; 20+ messages in thread
From: Pranjal Shrivastava @ 2026-06-01 13:42 UTC (permalink / raw)
To: iommu, linux-pci, linux-kernel
Cc: Joerg Roedel, Suravee Suthikulpanit, Vasant Hegde, Ankit Soni,
Jason Gunthorpe, Bjorn Helgaas, Samiullah Khawaja,
Pranjal Shrivastava
Update the driver to call pci_prepare_ats() after checking if
pci_ats_supported() and fail the probe_device if pci_prepare_ats()
returns an error. Additionally, update pdev_enable_cap_ats() to WARN_ON()
a failure in pci_enable_ats().
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/amd/iommu.c | 28 ++++++++++++++++++++++------
1 file changed, 22 insertions(+), 6 deletions(-)
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 4ef6024c5a4e..783f53cb8599 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -566,10 +566,17 @@ static inline int pdev_enable_cap_ats(struct pci_dev *pdev)
if (amd_iommu_iotlb_sup &&
(dev_data->flags & AMD_IOMMU_DEVICE_FLAG_ATS_SUP)) {
ret = pci_enable_ats(pdev, PAGE_SHIFT);
- if (!ret) {
- dev_data->ats_enabled = 1;
- dev_data->ats_qdep = pci_ats_queue_depth(pdev);
- }
+
+ /*
+ * pci_enable_ats() should not fail here because earlier
+ * checks have already verified support & config.
+ */
+ if (WARN_ON(ret))
+ return ret;
+
+ dev_data->ats_enabled = 1;
+ dev_data->ats_qdep = pci_ats_queue_depth(pdev);
+ ret = 0;
}
return ret;
@@ -2514,8 +2521,17 @@ static struct iommu_device *amd_iommu_probe_device(struct device *dev)
else
dev_data->max_irqs = MAX_IRQS_PER_TABLE_512;
- if (dev_is_pci(dev))
- pci_prepare_ats(to_pci_dev(dev), PAGE_SHIFT);
+ if (dev_is_pci(dev)) {
+ struct pci_dev *pdev = to_pci_dev(dev);
+
+ if (pci_ats_supported(pdev)) {
+ ret = pci_prepare_ats(pdev, PAGE_SHIFT);
+ if (ret) {
+ iommu_dev = ERR_PTR(ret);
+ goto out_err;
+ }
+ }
+ }
return iommu_dev;
--
2.54.0.823.g6e5bcc1fc9-goog
^ permalink raw reply related [flat|nested] 20+ messages in thread* Re: [PATCH 5/6] iommu/amd: Fail probe on ATS configuration failure
2026-06-01 13:42 ` [PATCH 5/6] iommu/amd: Fail probe on ATS configuration failure Pranjal Shrivastava
@ 2026-06-01 15:01 ` sashiko-bot
2026-06-01 15:05 ` Pranjal Shrivastava
0 siblings, 1 reply; 20+ messages in thread
From: sashiko-bot @ 2026-06-01 15:01 UTC (permalink / raw)
To: Pranjal Shrivastava; +Cc: linux-pci
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] The newly added WARN_ON(ret) in pdev_enable_cap_ats() can be spuriously triggered for SR-IOV Virtual Functions (VFs) if the Physical Function (PF) was not properly configured for ATS.
--
commit 40ae1c6d486f021eadc253f746299ffa4ca11dff
Author: Pranjal Shrivastava <praan@google.com>
iommu/amd: Fail probe on ATS configuration failure
This commit updates the AMD IOMMU driver to call pci_prepare_ats() during
device probe, failing the probe if it returns an error. It also adds a
WARN_ON() in pdev_enable_cap_ats() to catch failures from pci_enable_ats().
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index 4ef6024c5a4ea..783f53cb85996 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -566,10 +566,17 @@ static inline int pdev_enable_cap_ats(struct pci_dev *pdev)
> if (amd_iommu_iotlb_sup &&
> (dev_data->flags & AMD_IOMMU_DEVICE_FLAG_ATS_SUP)) {
> ret = pci_enable_ats(pdev, PAGE_SHIFT);
> - if (!ret) {
> - dev_data->ats_enabled = 1;
> - dev_data->ats_qdep = pci_ats_queue_depth(pdev);
> - }
> +
> + /*
> + * pci_enable_ats() should not fail here because earlier
> + * checks have already verified support & config.
> + */
> + if (WARN_ON(ret))
> + return ret;
[Severity: High]
Can this WARN_ON() trigger spuriously for SR-IOV Virtual Functions (VFs) if
the Physical Function (PF) was not properly configured for ATS?
Looking at pci_prepare_ats(), it has a fast-path that returns 0 immediately
for VFs without checking or configuring the PF's Smallest Translation Unit
(STU):
int pci_prepare_ats(struct pci_dev *dev, int ps)
{
...
if (dev->is_virtfn)
return 0;
...
}
Later, when pdev_enable_cap_ats() calls pci_enable_ats() for the VF, the
actual validation happens there:
int pci_enable_ats(struct pci_dev *dev, int ps)
{
...
if (pdev->ats_stu != ps)
return -EINVAL;
...
}
If the PF failed its IOMMU probe or was explicitly bypassed by the driver,
its ats_stu would remain 0.
When VFs are subsequently probed, pci_prepare_ats() would blindly succeed,
but pci_enable_ats() would fail with -EINVAL, triggering this new WARN_ON().
Could this lead to unnecessary bug reports or kernel crashes on systems
configured with panic_on_warn?
> +
> + dev_data->ats_enabled = 1;
> + dev_data->ats_qdep = pci_ats_queue_depth(pdev);
> + ret = 0;
> }
>
> return ret;
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260601134204.2150602-1-praan@google.com?part=5
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH 5/6] iommu/amd: Fail probe on ATS configuration failure
2026-06-01 15:01 ` sashiko-bot
@ 2026-06-01 15:05 ` Pranjal Shrivastava
0 siblings, 0 replies; 20+ messages in thread
From: Pranjal Shrivastava @ 2026-06-01 15:05 UTC (permalink / raw)
To: sashiko-reviews; +Cc: linux-pci
On Mon, Jun 01, 2026 at 03:01:25PM +0000, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
> - [High] The newly added WARN_ON(ret) in pdev_enable_cap_ats() can be spuriously triggered for SR-IOV Virtual Functions (VFs) if the Physical Function (PF) was not properly configured for ATS.
> --
>
> commit 40ae1c6d486f021eadc253f746299ffa4ca11dff
> Author: Pranjal Shrivastava <praan@google.com>
>
> iommu/amd: Fail probe on ATS configuration failure
>
> This commit updates the AMD IOMMU driver to call pci_prepare_ats() during
> device probe, failing the probe if it returns an error. It also adds a
> WARN_ON() in pdev_enable_cap_ats() to catch failures from pci_enable_ats().
>
> > diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> > index 4ef6024c5a4ea..783f53cb85996 100644
> > --- a/drivers/iommu/amd/iommu.c
> > +++ b/drivers/iommu/amd/iommu.c
> > @@ -566,10 +566,17 @@ static inline int pdev_enable_cap_ats(struct pci_dev *pdev)
> > if (amd_iommu_iotlb_sup &&
> > (dev_data->flags & AMD_IOMMU_DEVICE_FLAG_ATS_SUP)) {
> > ret = pci_enable_ats(pdev, PAGE_SHIFT);
> > - if (!ret) {
> > - dev_data->ats_enabled = 1;
> > - dev_data->ats_qdep = pci_ats_queue_depth(pdev);
> > - }
> > +
> > + /*
> > + * pci_enable_ats() should not fail here because earlier
> > + * checks have already verified support & config.
> > + */
> > + if (WARN_ON(ret))
> > + return ret;
>
> [Severity: High]
> Can this WARN_ON() trigger spuriously for SR-IOV Virtual Functions (VFs) if
> the Physical Function (PF) was not properly configured for ATS?
>
> Looking at pci_prepare_ats(), it has a fast-path that returns 0 immediately
> for VFs without checking or configuring the PF's Smallest Translation Unit
> (STU):
>
> int pci_prepare_ats(struct pci_dev *dev, int ps)
> {
> ...
> if (dev->is_virtfn)
> return 0;
> ...
> }
>
> Later, when pdev_enable_cap_ats() calls pci_enable_ats() for the VF, the
> actual validation happens there:
>
> int pci_enable_ats(struct pci_dev *dev, int ps)
> {
> ...
> if (pdev->ats_stu != ps)
> return -EINVAL;
> ...
> }
>
> If the PF failed its IOMMU probe or was explicitly bypassed by the driver,
> its ats_stu would remain 0.
>
> When VFs are subsequently probed, pci_prepare_ats() would blindly succeed,
> but pci_enable_ats() would fail with -EINVAL, triggering this new WARN_ON().
>
> Could this lead to unnecessary bug reports or kernel crashes on systems
> configured with panic_on_warn?
No.. this error won't be hit because the path highlighted above is being
changed in the ATS series:
https://lore.kernel.org/all/20260601143644.2358771-3-praan@google.com/
Thanks,
Praan
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 6/6] PCI/ATS: Mandate checking pci_ats_supported() before pci_prepare_ats()
2026-06-01 13:41 [PATCH 0/6] iommu/amd: Refactors for ATS updates Pranjal Shrivastava
` (4 preceding siblings ...)
2026-06-01 13:42 ` [PATCH 5/6] iommu/amd: Fail probe on ATS configuration failure Pranjal Shrivastava
@ 2026-06-01 13:42 ` Pranjal Shrivastava
2026-06-01 19:06 ` sashiko-bot
2026-06-02 8:49 ` [PATCH 0/6] iommu/amd: Refactors for ATS updates Vasant Hegde
6 siblings, 1 reply; 20+ messages in thread
From: Pranjal Shrivastava @ 2026-06-01 13:42 UTC (permalink / raw)
To: iommu, linux-pci, linux-kernel
Cc: Joerg Roedel, Suravee Suthikulpanit, Vasant Hegde, Ankit Soni,
Jason Gunthorpe, Bjorn Helgaas, Samiullah Khawaja,
Pranjal Shrivastava, Baolu Lu, Nicolin Chen
Currently, pci_prepare_ats() internally calls pci_ats_supported() and
returns -EINVAL if the device does not support ATS. While this provides
a silent safety check, it conflates support detection with configuration.
Update pci_prepare_ats() to wrap the internal pci_ats_supported check in
a WARN_ON(). This mandates all callers to call pci_prepare_ats() only if
the function supports ATS.
Update the function documentation to mention that callers must verify
ATS support (via pci_ats_supported()) before calling pci_prepare_ats().
Suggested-by: Baolu Lu <baolu.lu@linux.intel.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/pci/ats.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
index ec6c8dbdc5e9..e10a0ceb2645 100644
--- a/drivers/pci/ats.c
+++ b/drivers/pci/ats.c
@@ -53,7 +53,9 @@ EXPORT_SYMBOL_GPL(pci_ats_supported);
* @ps: the IOMMU page shift
*
* This must be done by the IOMMU driver on the PF before any VFs are created to
- * ensure that the VF can have ATS enabled.
+ * ensure that the VF can have ATS enabled. Callers must verify that ATS is
+ * supported by the device (e.g. via pci_ats_supported()) before calling this
+ * function.
*
* Returns 0 on success, or negative on failure.
*/
@@ -61,7 +63,7 @@ int pci_prepare_ats(struct pci_dev *dev, int ps)
{
u16 ctrl;
- if (!pci_ats_supported(dev))
+ if (WARN_ON(!pci_ats_supported(dev)))
return -EINVAL;
if (WARN_ON(dev->ats_enabled))
--
2.54.0.823.g6e5bcc1fc9-goog
^ permalink raw reply related [flat|nested] 20+ messages in thread* Re: [PATCH 6/6] PCI/ATS: Mandate checking pci_ats_supported() before pci_prepare_ats()
2026-06-01 13:42 ` [PATCH 6/6] PCI/ATS: Mandate checking pci_ats_supported() before pci_prepare_ats() Pranjal Shrivastava
@ 2026-06-01 19:06 ` sashiko-bot
0 siblings, 0 replies; 20+ messages in thread
From: sashiko-bot @ 2026-06-01 19:06 UTC (permalink / raw)
To: Pranjal Shrivastava; +Cc: linux-pci
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] Adding a WARN_ON() to pci_prepare_ats() causes spurious kernel warnings and potential panics because the ARM SMMUv3 driver does not check pci_ats_supported() prior to calling it.
--
commit eaccac64f7c8e00ef695d5c2076021c097520251
Author: Pranjal Shrivastava <praan@google.com>
PCI/ATS: Mandate checking pci_ats_supported() before pci_prepare_ats()
This commit updates pci_prepare_ats() to wrap the internal
pci_ats_supported() check in a WARN_ON(). It is intended to mandate that
all callers verify ATS support before calling the function.
> diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
> --- a/drivers/pci/ats.c
> +++ b/drivers/pci/ats.c
> @@ -61,7 +63,7 @@ int pci_prepare_ats(struct pci_dev *dev, int ps)
> {
> u16 ctrl;
>
> - if (!pci_ats_supported(dev))
> + if (WARN_ON(!pci_ats_supported(dev)))
[Severity: High]
Will this trigger spurious warnings or panics for non-ATS PCI devices
on ARM systems?
The arm_smmu_probe_device() function appears to unconditionally invoke
pci_prepare_ats() for all PCI devices without verifying ATS support first:
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:arm_smmu_probe_device() {
...
if (dev_is_pci(dev)) {
unsigned int stu = __ffs(smmu->pgsize_bitmap);
pci_prepare_ats(to_pci_dev(dev), stu);
}
...
}
Since this unpatched caller does not check pci_ats_supported() first, any
standard PCI device lacking ATS support might trigger this new WARN_ON()
during IOMMU device probe.
> return -EINVAL;
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260601134204.2150602-1-praan@google.com?part=6
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/6] iommu/amd: Refactors for ATS updates
2026-06-01 13:41 [PATCH 0/6] iommu/amd: Refactors for ATS updates Pranjal Shrivastava
` (5 preceding siblings ...)
2026-06-01 13:42 ` [PATCH 6/6] PCI/ATS: Mandate checking pci_ats_supported() before pci_prepare_ats() Pranjal Shrivastava
@ 2026-06-02 8:49 ` Vasant Hegde
2026-06-02 12:08 ` Jason Gunthorpe
6 siblings, 1 reply; 20+ messages in thread
From: Vasant Hegde @ 2026-06-02 8:49 UTC (permalink / raw)
To: Pranjal Shrivastava, iommu, linux-pci, linux-kernel
Cc: Joerg Roedel, Suravee Suthikulpanit, Ankit Soni, Jason Gunthorpe,
Bjorn Helgaas, Samiullah Khawaja
Pranjal,
On 6/1/2026 7:11 PM, Pranjal Shrivastava wrote:
> This series addresses some issues identified by Sashiko in the AMD IOMMU
> driver during the review of the subsystem-wide ATS update work [1].
>
> Patches 1-2 address the pre-existing bugs in iommu_ignore_device()
> regarding the order of alias clearing and cleaning up the DTEs.
>
Thanks for the series. While this series fixes few important bugs, at the end
certain checks are still scatters. I really want to move around the code bit so
that it makes easy to read.
ex: even after this series, in probe path its calling dev_is_pci() check 3
times! Device capability is still scattered between probe() and
iommu_init_device() function.
Does something like below makes sense (untested code)? This doesn't address
iommu_ignore_device() issues. May be we should rename iommu_ignore_device() to
something liek disable_device_dma that clears TV bit and other things.
-Vasant
---<---
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 57dc8fabc7d9..c628a2e7e3a8 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -672,11 +672,12 @@ static void pdev_disable_caps(struct pci_dev *pdev)
* This function checks if the driver got a valid device from the caller to
* avoid dereferencing invalid pointers.
*/
-static bool check_device(struct device *dev)
+static bool iommu_lookup_device(struct device *dev,
+ struct amd_iommu **iommu_out, u16 *devid_out)
{
struct amd_iommu_pci_seg *pci_seg;
- struct amd_iommu *iommu;
- int devid, sbdf;
+ int sbdf;
+ u16 devid;
if (!dev)
return false;
@@ -687,7 +688,8 @@ static bool check_device(struct device *dev)
devid = PCI_SBDF_TO_DEVID(sbdf);
iommu = rlookup_amd_iommu(dev);
- if (!iommu)
+ /* Not registered yet? */
+ if (!iommu || !iommu->iommu.ops)
return false;
/* Out of our scope? */
@@ -695,25 +697,19 @@ static bool check_device(struct device *dev)
if (devid > pci_seg->last_bdf)
return false;
+ *iommu_out = iommu;
+ *devid_out = devid;
return true;
}
-static int iommu_init_device(struct amd_iommu *iommu, struct device *dev)
+static struct iommu_dev_data *iommu_init_device(struct amd_iommu *iommu,
+ struct device *dev, u16 devid)
{
struct iommu_dev_data *dev_data;
- int devid, sbdf;
-
- if (dev_iommu_priv_get(dev))
- return 0;
- sbdf = get_device_sbdf_id(dev);
- if (sbdf < 0)
- return sbdf;
-
- devid = PCI_SBDF_TO_DEVID(sbdf);
dev_data = find_dev_data(iommu, devid);
if (!dev_data)
- return -ENOMEM;
+ return ERR_PTR(-ENOMEM);
dev_data->dev = dev;
@@ -724,6 +720,25 @@ static int iommu_init_device(struct amd_iommu *iommu,
struct device *dev)
dev_iommu_priv_set(dev, dev_data);
setup_aliases(iommu, dev);
+ iommu_completion_wait(iommu);
+
+ return dev_data;
+}
+
+static void iommu_init_device_caps(struct iommu_dev_data *dev_data,
+ struct device *dev,
+ struct amd_iommu *iommu)
+{
+ if (FEATURE_NUM_INT_REMAP_SUP_2K(amd_iommu_efr2))
+ dev_data->max_irqs = MAX_IRQS_PER_TABLE_2K;
+ else
+ dev_data->max_irqs = MAX_IRQS_PER_TABLE_512;
+
+ amd_iommu_set_pci_msi_domain(dev, iommu);
+
+ if (!dev_is_pci(dev))
+ return;
+
/*
* By default we use passthrough mode for IOMMUv2 capable device.
* But if amd_iommu=force_isolation is set (e.g. to debug DMA to
@@ -731,11 +746,21 @@ static int iommu_init_device(struct amd_iommu *iommu,
struct device *dev)
* it'll be forced to go into translation mode.
*/
if ((iommu_default_passthrough() || !amd_iommu_force_isolation) &&
- dev_is_pci(dev) && amd_iommu_gt_ppr_supported()) {
+ amd_iommu_gt_ppr_supported()) {
dev_data->flags = pdev_get_caps(to_pci_dev(dev));
}
- return 0;
+ /*
+ * If IOMMU and device supports PASID then it will contain max
+ * supported PASIDs, else it will be zero.
+ */
+ if (amd_iommu_pasid_supported() &&
+ pdev_pasid_supported(dev_data)) {
+ dev_data->max_pasids = min_t(u32, iommu->iommu.max_pasids,
+ pci_max_pasids(to_pci_dev(dev)));
+ }
+
+ pci_prepare_ats(to_pci_dev(dev), PAGE_SHIFT);
}
static void iommu_ignore_device(struct amd_iommu *iommu, struct device *dev)
@@ -2451,43 +2476,29 @@ static struct iommu_device
*amd_iommu_probe_device(struct device *dev)
struct amd_iommu *iommu;
struct iommu_dev_data *dev_data;
int ret;
+ u16 devid;
- if (!check_device(dev))
- return ERR_PTR(-ENODEV);
-
- iommu = rlookup_amd_iommu(dev);
- if (!iommu)
- return ERR_PTR(-ENODEV);
-
- /* Not registered yet? */
- if (!iommu->iommu.ops)
+ if (!iommu_lookup_device(dev, &iommu, &devid))
return ERR_PTR(-ENODEV);
if (dev_iommu_priv_get(dev))
return &iommu->iommu;
- ret = iommu_init_device(iommu, dev);
- if (ret) {
- dev_err(dev, "Failed to initialize - trying to proceed anyway\n");
- iommu_dev = ERR_PTR(ret);
+ dev_data = iommu_init_device(iommu, dev, devid);
+ if (IS_ERR(dev_data)) {
+ dev_err(dev, "Failed to initialize \n", devid);
iommu_ignore_device(iommu, dev);
- goto out_err;
+ return ERR_CAST(dev_data);
}
- amd_iommu_set_pci_msi_domain(dev, iommu);
+ iommu_init_device_caps(dev_data, dev, iommu);
iommu_dev = &iommu->iommu;
/*
- * If IOMMU and device supports PASID then it will contain max
- * supported PASIDs, else it will be zero.
+ * When DMA translation is unavailable return error so the iommu core
+ * won't attempt domain attach for this device. But interrupt-remap
+ * is still supported. Hence do not ignore the device.
*/
- dev_data = dev_iommu_priv_get(dev);
- if (amd_iommu_pasid_supported() && dev_is_pci(dev) &&
- pdev_pasid_supported(dev_data)) {
- dev_data->max_pasids = min_t(u32, iommu->iommu.max_pasids,
- pci_max_pasids(to_pci_dev(dev)));
- }
-
if (amd_iommu_pgtable == PD_MODE_NONE) {
pr_warn_once("%s: DMA translation not supported by iommu.\n",
__func__);
@@ -2495,16 +2506,6 @@ static struct iommu_device *amd_iommu_probe_device(struct
device *dev)
goto out_err;
}
- iommu_completion_wait(iommu);
-
- if (FEATURE_NUM_INT_REMAP_SUP_2K(amd_iommu_efr2))
- dev_data->max_irqs = MAX_IRQS_PER_TABLE_2K;
- else
- dev_data->max_irqs = MAX_IRQS_PER_TABLE_512;
-
- if (dev_is_pci(dev))
- pci_prepare_ats(to_pci_dev(dev), PAGE_SHIFT);
-
out_err:
return iommu_dev;
}
> Patches 3-4 refine the probe error paths. As noted[2] by Ankit & Sashiko,
> unconditionally calling iommu_ignore_device() on probe failure breaks
> IRQ remapping for devices in PD_MODE_NONE. We now split the error paths
> in probe_device to preserve interrupt mapping for non-fatal failures
> while ensuring that dangling device pointers are cleared to prevent
> potential UAFs.
>
> Finally, patch 5 implements the Fail Hard pattern being standardized
> for ATS, ensuring configuration errors are caught during probe_device and
> ATS enablement failures are reported with a WARN_ON().
>
> Patch 6 is carried forward as is from the original ATS work [3] to
> maintain bisectibility.
>
> [1] https://lore.kernel.org/all/20260529111208.387412-1-praan@google.com/
> [2] https://lore.kernel.org/all/20260529153216.2AD1E1F00899@smtp.kernel.org/
> [3] https://lore.kernel.org/all/20260529111208.387412-4-praan@google.com/
>
> Thanks,
> Praan
>
> Pranjal Shrivastava (6):
> iommu/amd: Clear aliases before setting the rlookup_table to NULL
> iommu/amd: Clear DTE with update_dte256 in iommu_ignore_device()
> iommu/amd: Split probe error paths to preserve IRQ remapping
> iommu/amd: Fix Use-After-Free in non-fatal probe error path
> iommu/amd: Fail probe on ATS configuration failure
> PCI/ATS: Mandate checking pci_ats_supported() before pci_prepare_ats()
>
> drivers/iommu/amd/iommu.c | 61 ++++++++++++++++++++++++++++++++-------
> drivers/pci/ats.c | 6 ++--
> 2 files changed, 55 insertions(+), 12 deletions(-)
>
> base-commit: 283d245468a2b61c41aa8b582f25ed5615d1c304
^ permalink raw reply related [flat|nested] 20+ messages in thread* Re: [PATCH 0/6] iommu/amd: Refactors for ATS updates
2026-06-02 8:49 ` [PATCH 0/6] iommu/amd: Refactors for ATS updates Vasant Hegde
@ 2026-06-02 12:08 ` Jason Gunthorpe
2026-06-04 8:23 ` Vasant Hegde
0 siblings, 1 reply; 20+ messages in thread
From: Jason Gunthorpe @ 2026-06-02 12:08 UTC (permalink / raw)
To: Vasant Hegde
Cc: Pranjal Shrivastava, iommu, linux-pci, linux-kernel, Joerg Roedel,
Suravee Suthikulpanit, Ankit Soni, Bjorn Helgaas,
Samiullah Khawaja
On Tue, Jun 02, 2026 at 02:19:54PM +0530, Vasant Hegde wrote:
> -static int iommu_init_device(struct amd_iommu *iommu, struct device *dev)
> +static struct iommu_dev_data *iommu_init_device(struct amd_iommu *iommu,
> + struct device *dev, u16 devid)
> {
> struct iommu_dev_data *dev_data;
> - int devid, sbdf;
> -
> - if (dev_iommu_priv_get(dev))
> - return 0;
>
> - sbdf = get_device_sbdf_id(dev);
> - if (sbdf < 0)
> - return sbdf;
> -
> - devid = PCI_SBDF_TO_DEVID(sbdf);
> dev_data = find_dev_data(iommu, devid);
> if (!dev_data)
> - return -ENOMEM;
> + return ERR_PTR(-ENOMEM);
Why does the AMD driver do this? The dev data is supposed to be
allocated during probe and freed during remove, it doesn't make sense
to search for it in probe - it should not exist if the kernel is
working right.
Jason
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH 0/6] iommu/amd: Refactors for ATS updates
2026-06-02 12:08 ` Jason Gunthorpe
@ 2026-06-04 8:23 ` Vasant Hegde
0 siblings, 0 replies; 20+ messages in thread
From: Vasant Hegde @ 2026-06-04 8:23 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Pranjal Shrivastava, iommu, linux-pci, linux-kernel, Joerg Roedel,
Suravee Suthikulpanit, Ankit Soni, Bjorn Helgaas,
Samiullah Khawaja
Jason,
On 6/2/2026 5:38 PM, Jason Gunthorpe wrote:
> On Tue, Jun 02, 2026 at 02:19:54PM +0530, Vasant Hegde wrote:
>> -static int iommu_init_device(struct amd_iommu *iommu, struct device *dev)
>> +static struct iommu_dev_data *iommu_init_device(struct amd_iommu *iommu,
>> + struct device *dev, u16 devid)
>> {
>> struct iommu_dev_data *dev_data;
>> - int devid, sbdf;
>> -
>> - if (dev_iommu_priv_get(dev))
>> - return 0;
>>
>> - sbdf = get_device_sbdf_id(dev);
>> - if (sbdf < 0)
>> - return sbdf;
>> -
>> - devid = PCI_SBDF_TO_DEVID(sbdf);
>> dev_data = find_dev_data(iommu, devid);
>> if (!dev_data)
>> - return -ENOMEM;
>> + return ERR_PTR(-ENOMEM);
>
> Why does the AMD driver do this? The dev data is supposed to be
> allocated during probe and freed during remove, it doesn't make sense
> to search for it in probe - it should not exist if the kernel is
> working right.
Digging into git history and how its all evolved, I agree. We should fix up
find_dev_data() stuff. we should separate out search from allocation. I will add
separate fix once this series settles.
-Vasant
^ permalink raw reply [flat|nested] 20+ messages in thread