Linux PCI subsystem development
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Pranjal Shrivastava" <praan@google.com>
Cc: linux-pci@vger.kernel.org
Subject: Re: [PATCH v6 6/6] iommu/amd: Fail probe on ATS configuration failure
Date: Fri, 29 May 2026 15:32:15 +0000	[thread overview]
Message-ID: <20260529153216.2AD1E1F00899@smtp.kernel.org> (raw)
In-Reply-To: <20260529111208.387412-7-praan@google.com>

Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:

New issues:
- [Critical] Unconditional iommu_ignore_device() on probe failure breaks IRQ remapping in PD_MODE_NONE.
- [High] Failing IOMMU probe on optional ATS configuration failure causes device breakage and leaves a dangling MSI domain.
- [Low] Redundant double invocation of iommu_ignore_device() on iommu_init_device() failure.

Pre-existing issues:
- [High] iommu_ignore_device() fails to clear DTEs for PCI aliases due to incorrect operation ordering.
- [High] Unprotected, non-atomic memset() on the hardware-accessed dev_table array.
--

commit e469ff12f2f6dc9d49a92308828b4812fa6c58ef
Author: Pranjal Shrivastava <praan@google.com>

iommu/amd: Fail probe on ATS configuration failure

The patch updates the AMD IOMMU driver to fail device probe if ATS
configuration (pci_prepare_ats) fails. It also adds a WARN_ON for
pci_enable_ats failure and unconditionally calls iommu_ignore_device
on all probe failures to fix a use-after-free risk.

> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index 84cad43dc188c..b74c4504bee39 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -2502,10 +2508,22 @@ static struct iommu_device *amd_iommu_probe_device(struct device *dev)
>  	else
>  		dev_data->max_irqs = MAX_IRQS_PER_TABLE_512;
>  
> -	if (dev_is_pci(dev))
> -		pci_prepare_ats(to_pci_dev(dev), PAGE_SHIFT);
> +	if (dev_is_pci(dev)) {
> +		struct pci_dev *pdev = to_pci_dev(dev);
> +
> +		if (pci_ats_supported(pdev)) {
> +			ret = pci_prepare_ats(pdev, PAGE_SHIFT);
> +			if (ret) {
> +				iommu_dev = ERR_PTR(ret);
> +				goto out_err;

[Severity: High]
Does this error path leave a dangling MSI domain?

Since ATS is an optional PCIe performance enhancement, failing the probe
disables the device in the IOMMU. However, earlier in the probe sequence,
amd_iommu_set_pci_msi_domain() assigned the device's MSI domain to the
IOMMU's IRQ domain.

Because this assignment is not reverted on error, the device is left in an
inconsistent state. The device will be assigned to a default bypass or
direct DMA ops, but will be forced to use the IOMMU's IRQ domain for MSIs.

When the driver requests MSIs, irq_remapping_alloc() will fail to find the
IOMMU in the rlookup_table, breaking interrupt allocation completely for
the device.

> +			}
> +		}
> +	}
>  
>  out_err:
> +	if (IS_ERR(iommu_dev))
> +		iommu_ignore_device(iommu, dev);

[Severity: Low]
Is this double invocation of iommu_ignore_device() intentional?

The patch adds a new call to iommu_ignore_device() here to ensure cleanup
on all probe failure paths, but it leaves the pre-existing explicit call
inside the error handling block for iommu_init_device():

drivers/iommu/amd/iommu.c:amd_iommu_probe_device() {
	...
	ret = iommu_init_device(iommu, dev);
	if (ret) {
		dev_err(dev, "Failed to initialize - trying to proceed anyway\n");
		iommu_dev = ERR_PTR(ret);
		iommu_ignore_device(iommu, dev);
		goto out_err;
	}
	...
}

If iommu_init_device() fails, the code evaluates IS_ERR(iommu_dev) as true
at out_err and calls the cleanup function a second time. While not functionally
damaging, it iterates over DMA aliases and clears tables twice unnecessarily.


[Severity: Critical]
Does unconditionally calling iommu_ignore_device() on probe failure break
IRQ remapping when the page table mode is PD_MODE_NONE?

Previously, when amd_iommu_pgtable == PD_MODE_NONE (DMA translation
unsupported/disabled), the probe returned -ENODEV to disable DMA
translation for the device but intentionally bypassed iommu_ignore_device().

This kept the device in the rlookup_table, allowing IRQ remapping to
function correctly. By executing iommu_ignore_device() unconditionally
here, the rlookup_table entry is cleared.

Consequently, irq_remapping_alloc() will fail to find the IOMMU for any
device, breaking MSI allocation system-wide when the IOMMU is in
IRQ-remapping-only mode.


[Severity: High]
This is a pre-existing issue, but does iommu_ignore_device() clear DTEs
for PCI aliases in the correct order?

Looking at iommu_ignore_device():

drivers/iommu/amd/iommu.c:iommu_ignore_device() {
	...
	pci_seg->rlookup_table[devid] = NULL;
	...
	setup_aliases(iommu, dev);
}

Inside setup_aliases(), the callback clone_alias() attempts to look up
the IOMMU using rlookup_amd_iommu(&pdev->dev), which internally reads
rlookup_table[devid].

Because it was already set to NULL, the lookup fails, and clone_alias()
immediately returns 0. This skips clearing the Device Table Entries (DTEs)
for the device's DMA aliases, leaving them active and dangling.


[Severity: High]
This is a pre-existing issue, but does iommu_ignore_device() use a
non-atomic memset() on the hardware-accessed dev_table array?

Looking at iommu_ignore_device():

drivers/iommu/amd/iommu.c:iommu_ignore_device() {
	...
	memset(&dev_table[devid], 0, sizeof(struct dev_table_entry));
	...
}

The AMD IOMMU driver explicitly documents that "IOMMU reads the entire
Device Table entry in a single 256-bit transaction" and requires that
the "DTE[V|GV] bit is being written first when clearing".

A non-atomic clear can cause torn writes where the IOMMU hardware
concurrently reads a partially modified entry (e.g., seeing the Valid
bit still set while the page table pointers are zeroed), leading to
IOMMU faults or wild DMAs.

> +
>  	return iommu_dev;
>  }

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260529111208.387412-1-praan@google.com?part=6

  reply	other threads:[~2026-05-29 15:32 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-29 11:12 [PATCH v6 0/6] iommu: Standardize ATS robustness and state tracking Pranjal Shrivastava
2026-05-29 11:12 ` [PATCH v6 1/6] PCI/ATS: Ensure pci_ats_supported() is PF-aware for VFs Pranjal Shrivastava
2026-05-29 11:12 ` [PATCH v6 2/6] PCI/ATS: Validate STU for VFs in pci_prepare_ats() Pranjal Shrivastava
2026-05-29 11:12 ` [PATCH v6 3/6] PCI/ATS: Mandate checking pci_ats_supported() before pci_prepare_ats() Pranjal Shrivastava
2026-05-29 13:07   ` sashiko-bot
2026-05-29 21:56   ` Nicolin Chen
2026-05-29 11:12 ` [PATCH v6 4/6] iommu/arm-smmu-v3: Standardize ATS enablement failure reporting Pranjal Shrivastava
2026-05-29 13:44   ` sashiko-bot
2026-05-29 21:51   ` Nicolin Chen
2026-05-31 17:13     ` Pranjal Shrivastava
2026-05-29 11:12 ` [PATCH v6 5/6] iommu/vt-d: Fail probe on ATS configuration failure Pranjal Shrivastava
2026-05-29 14:30   ` sashiko-bot
2026-05-29 11:12 ` [PATCH v6 6/6] iommu/amd: " Pranjal Shrivastava
2026-05-29 15:32   ` sashiko-bot [this message]
2026-06-01  6:00   ` Ankit Soni
2026-06-01  6:20     ` Pranjal Shrivastava
2026-06-01  8:17       ` Ankit Soni
2026-06-01 10:35         ` Pranjal Shrivastava
2026-06-01  9:28       ` Vasant Hegde
2026-06-01 10:41         ` Pranjal Shrivastava

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260529153216.2AD1E1F00899@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=praan@google.com \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox