Linux PCI subsystem development
 help / color / mirror / Atom feed
* [PATCH v6 0/6] iommu: Standardize ATS robustness and state tracking
@ 2026-05-29 11:12 Pranjal Shrivastava
  2026-05-29 11:12 ` [PATCH v6 1/6] PCI/ATS: Ensure pci_ats_supported() is PF-aware for VFs Pranjal Shrivastava
                   ` (5 more replies)
  0 siblings, 6 replies; 20+ messages in thread
From: Pranjal Shrivastava @ 2026-05-29 11:12 UTC (permalink / raw)
  To: iommu, linux-pci, linux-arm-kernel, linux-kernel
  Cc: Joerg Roedel, Will Deacon, Bjorn Helgaas, David Woodhouse,
	Lu Baolu, Robin Murphy, Suravee Suthikulpanit, Jason Gunthorpe,
	Nicolin Chen, David Matlack, Samiullah Khawaja, Daniel Mentz,
	Pasha Tatashin, Mostafa Saleh, Pranjal Shrivastava

The primary motivation for this series is an ATS state mismatch observed
under heavy load (via iova_stress). A failure in pci_enable_ats() leaves
IOMMU drivers like arm-smmu-v3 with inconsistent state leading to PCI core
warnings during device detach.

While David's recent work [1] addressed a discovery race for specific
quirked devices by moving them to the HEADER phase, gaps remained
regarding how Virtual Functions (VFs) inherit state from their Physical
Functions (PFs). Specifically, pci_ats_supported() did not account for
PF-level quirked status, and pci_prepare_ats() lacked STU validation for
VFs.

Based on discussion with Jason and Baolu in v3/v5, it was decided that the
IOMMU drivers should explicitly check pci_ats_supported() before calling
pci_prepare_ats(). To enforce this, pci_prepare_ats() now noisily checks
for support via WARN_ON(). Furthermore, the device probe should fail if
pci_prepare_ats() fails. Since these early gates preclude software
configuration errors, any remaining failure during pci_enable_ats() is
treated as a kernel bug.

This series standardizes this pattern across ARM SMMUv3, Intel VT-d, &
AMD IOMMU drivers.

[1] https://lore.kernel.org/linux-pci/20260403222750.1215002-1-dmatlack@google.com/

[v6]
  - Reverted the decoupling of pci_ats_supported() from pci_prepare_ats().
  - Added a WARN_ON() to the internal support check in pci_prepare_ats().
  - Dropped the standalone Intel bugfixes (RB-tree and UAF) to be sent as a
   separate standalone series per maintainer request.
  - Kept the folded UAF fix in the AMD IOMMU patch to ensure the new error
   path is immediately safe.
  - Collected Reviewed-by tags from Lu Baolu for PCI core patches.

[v5]
  - https://lore.kernel.org/all/20260528202353.3422206-1-praan@google.com/
  - Decoupled pci_ats_supported() from pci_prepare_ats() in the PCI core.
  - Rebased SMMUv3 support on top of Nicolin Chen's "Always-On ATS" series.
  - Fixed pre-existing RB-tree corruption in VT-d probe (Baolu/Sashiko).
  - Addressed the pre-existing UAF in AMD IOMMU probe suggested by Sashiko.

[v4]
  - https://lore.kernel.org/all/20260525184347.4059549-1-praan@google.com/
  - Standardized the pattern across Intel VT-d and AMD IOMMU drivers.
  - Replaced the SMMUv3 ats_prepared gate with a fatal probe-fail logic.
  - Utilized WARN() macros for runtime enablement failures in all drivers.
  - Collected R-b tags from Jason and Sami.

Pranjal Shrivastava (6):
  PCI/ATS: Ensure pci_ats_supported() is PF-aware for VFs
  PCI/ATS: Validate STU for VFs in pci_prepare_ats()
  PCI/ATS: Mandate checking pci_ats_supported() before pci_prepare_ats()
  iommu/arm-smmu-v3: Standardize ATS enablement failure reporting
  iommu/vt-d: Fail probe on ATS configuration failure
  iommu/amd: Fail probe on ATS configuration failure

 drivers/iommu/amd/iommu.c                   | 30 ++++++++++++++++-----
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 10 +++++--
 drivers/iommu/intel/iommu.c                 | 15 ++++++++---
 drivers/pci/ats.c                           | 19 +++++++++----
 4 files changed, 58 insertions(+), 16 deletions(-)

-- 
2.54.0.823.g6e5bcc1fc9-goog


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v6 1/6] PCI/ATS: Ensure pci_ats_supported() is PF-aware for VFs
  2026-05-29 11:12 [PATCH v6 0/6] iommu: Standardize ATS robustness and state tracking Pranjal Shrivastava
@ 2026-05-29 11:12 ` Pranjal Shrivastava
  2026-05-29 11:12 ` [PATCH v6 2/6] PCI/ATS: Validate STU for VFs in pci_prepare_ats() Pranjal Shrivastava
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 20+ messages in thread
From: Pranjal Shrivastava @ 2026-05-29 11:12 UTC (permalink / raw)
  To: iommu, linux-pci, linux-arm-kernel, linux-kernel
  Cc: Joerg Roedel, Will Deacon, Bjorn Helgaas, David Woodhouse,
	Lu Baolu, Robin Murphy, Suravee Suthikulpanit, Jason Gunthorpe,
	Nicolin Chen, David Matlack, Samiullah Khawaja, Daniel Mentz,
	Pasha Tatashin, Mostafa Saleh, Pranjal Shrivastava,
	Jason Gunthorpe

Update pci_ats_supported() to additionally check the associated PF's
status when called on a VF. This ensures that PF-level quirks and
untrusted status are correctly propagated to VFs, providing a robust
support check that aligns with the kernel's PF-centric ATS configuration
model and is immune to the timing of VF-specific fixups.

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Samiullah Khawaja <skhawaja@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
 drivers/pci/ats.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
index 96efa00d9743..679a3c3c1d54 100644
--- a/drivers/pci/ats.c
+++ b/drivers/pci/ats.c
@@ -40,10 +40,13 @@ void pci_ats_init(struct pci_dev *dev)
  */
 bool pci_ats_supported(struct pci_dev *dev)
 {
-	if (!dev->ats_cap)
+	if (!dev->ats_cap || dev->untrusted)
 		return false;
 
-	return (dev->untrusted == 0);
+	if (dev->is_virtfn)
+		return pci_ats_supported(pci_physfn(dev));
+
+	return true;
 }
 EXPORT_SYMBOL_GPL(pci_ats_supported);
 
-- 
2.54.0.823.g6e5bcc1fc9-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 2/6] PCI/ATS: Validate STU for VFs in pci_prepare_ats()
  2026-05-29 11:12 [PATCH v6 0/6] iommu: Standardize ATS robustness and state tracking Pranjal Shrivastava
  2026-05-29 11:12 ` [PATCH v6 1/6] PCI/ATS: Ensure pci_ats_supported() is PF-aware for VFs Pranjal Shrivastava
@ 2026-05-29 11:12 ` Pranjal Shrivastava
  2026-05-29 11:12 ` [PATCH v6 3/6] PCI/ATS: Mandate checking pci_ats_supported() before pci_prepare_ats() Pranjal Shrivastava
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 20+ messages in thread
From: Pranjal Shrivastava @ 2026-05-29 11:12 UTC (permalink / raw)
  To: iommu, linux-pci, linux-arm-kernel, linux-kernel
  Cc: Joerg Roedel, Will Deacon, Bjorn Helgaas, David Woodhouse,
	Lu Baolu, Robin Murphy, Suravee Suthikulpanit, Jason Gunthorpe,
	Nicolin Chen, David Matlack, Samiullah Khawaja, Daniel Mentz,
	Pasha Tatashin, Mostafa Saleh, Pranjal Shrivastava,
	Jason Gunthorpe

While every PCI Function that implements ATS has an independent ATS
Extended Capability structure with a Read/Write Smallest Translation
Unit (STU) field, the kernel manages SR-IOV ATS by requiring the IOMMU
driver to configure the STU on the Physical Function (PF) before any
any Virtual Functions (VFs) are created.

Currently, pci_prepare_ats() bails out early for VFs, assuming that the
PF has already been correctly prepared. However, this creates a potential
mismatch if a VF is subsequently prepared with a different page shift.

Update pci_prepare_ats() to validate that the requested page shift (ps)
matches the STU already configured in the associated PF. This ensures
early detection of incompatible configurations and maintains the kernel's
policy of consistent STU sizing across all functions associated with a
given SMMU.

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Samiullah Khawaja <skhawaja@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
 drivers/pci/ats.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
index 679a3c3c1d54..9cb23780093d 100644
--- a/drivers/pci/ats.c
+++ b/drivers/pci/ats.c
@@ -73,8 +73,12 @@ int pci_prepare_ats(struct pci_dev *dev, int ps)
 	if (ps < PCI_ATS_MIN_STU)
 		return -EINVAL;
 
-	if (dev->is_virtfn)
+	if (dev->is_virtfn) {
+		if (pci_physfn(dev)->ats_stu != ps)
+			return -EINVAL;
+
 		return 0;
+	}
 
 	dev->ats_stu = ps;
 	ctrl = PCI_ATS_CTRL_STU(dev->ats_stu - PCI_ATS_MIN_STU);
-- 
2.54.0.823.g6e5bcc1fc9-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 3/6] PCI/ATS: Mandate checking pci_ats_supported() before pci_prepare_ats()
  2026-05-29 11:12 [PATCH v6 0/6] iommu: Standardize ATS robustness and state tracking Pranjal Shrivastava
  2026-05-29 11:12 ` [PATCH v6 1/6] PCI/ATS: Ensure pci_ats_supported() is PF-aware for VFs Pranjal Shrivastava
  2026-05-29 11:12 ` [PATCH v6 2/6] PCI/ATS: Validate STU for VFs in pci_prepare_ats() Pranjal Shrivastava
@ 2026-05-29 11:12 ` Pranjal Shrivastava
  2026-05-29 13:07   ` sashiko-bot
  2026-05-29 21:56   ` Nicolin Chen
  2026-05-29 11:12 ` [PATCH v6 4/6] iommu/arm-smmu-v3: Standardize ATS enablement failure reporting Pranjal Shrivastava
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 20+ messages in thread
From: Pranjal Shrivastava @ 2026-05-29 11:12 UTC (permalink / raw)
  To: iommu, linux-pci, linux-arm-kernel, linux-kernel
  Cc: Joerg Roedel, Will Deacon, Bjorn Helgaas, David Woodhouse,
	Lu Baolu, Robin Murphy, Suravee Suthikulpanit, Jason Gunthorpe,
	Nicolin Chen, David Matlack, Samiullah Khawaja, Daniel Mentz,
	Pasha Tatashin, Mostafa Saleh, Pranjal Shrivastava

Currently, pci_prepare_ats() internally calls pci_ats_supported() and
returns -EINVAL if the device does not support ATS. While this provides
a silent safety check, it conflates support detection with configuration.

Update pci_prepare_ats() to wrap the internal pci_ats_supported check in
a WARN_ON(). This mandates all callers to call pci_prepare_ats() only if
the function supports ATS.

Update the function documentation to mention that callers must verify
ATS support (via pci_ats_supported()) before calling pci_prepare_ats().

Suggested-by: Baolu Lu <baolu.lu@linux.intel.com>
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
 drivers/pci/ats.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
index 9cb23780093d..f1434f86ac40 100644
--- a/drivers/pci/ats.c
+++ b/drivers/pci/ats.c
@@ -56,7 +56,9 @@ EXPORT_SYMBOL_GPL(pci_ats_supported);
  * @ps: the IOMMU page shift
  *
  * This must be done by the IOMMU driver on the PF before any VFs are created to
- * ensure that the VF can have ATS enabled.
+ * ensure that the VF can have ATS enabled. Callers must verify that ATS is
+ * supported by the device (e.g. via pci_ats_supported()) before calling this
+ * function.
  *
  * Returns 0 on success, or negative on failure.
  */
@@ -64,7 +66,7 @@ int pci_prepare_ats(struct pci_dev *dev, int ps)
 {
 	u16 ctrl;
 
-	if (!pci_ats_supported(dev))
+	if (WARN_ON(!pci_ats_supported(dev)))
 		return -EINVAL;
 
 	if (WARN_ON(dev->ats_enabled))
-- 
2.54.0.823.g6e5bcc1fc9-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 4/6] iommu/arm-smmu-v3: Standardize ATS enablement failure reporting
  2026-05-29 11:12 [PATCH v6 0/6] iommu: Standardize ATS robustness and state tracking Pranjal Shrivastava
                   ` (2 preceding siblings ...)
  2026-05-29 11:12 ` [PATCH v6 3/6] PCI/ATS: Mandate checking pci_ats_supported() before pci_prepare_ats() Pranjal Shrivastava
@ 2026-05-29 11:12 ` Pranjal Shrivastava
  2026-05-29 13:44   ` sashiko-bot
  2026-05-29 21:51   ` Nicolin Chen
  2026-05-29 11:12 ` [PATCH v6 5/6] iommu/vt-d: Fail probe on ATS configuration failure Pranjal Shrivastava
  2026-05-29 11:12 ` [PATCH v6 6/6] iommu/amd: " Pranjal Shrivastava
  5 siblings, 2 replies; 20+ messages in thread
From: Pranjal Shrivastava @ 2026-05-29 11:12 UTC (permalink / raw)
  To: iommu, linux-pci, linux-arm-kernel, linux-kernel
  Cc: Joerg Roedel, Will Deacon, Bjorn Helgaas, David Woodhouse,
	Lu Baolu, Robin Murphy, Suravee Suthikulpanit, Jason Gunthorpe,
	Nicolin Chen, David Matlack, Samiullah Khawaja, Daniel Mentz,
	Pasha Tatashin, Mostafa Saleh, Pranjal Shrivastava

The SMMUv3 driver currently has a two-phase commit in its ATS enablement
flow. During arm_smmu_attach_prepare(), it predicts whether ATS will be
enabled using arm_smmu_ats_supported() and accordingly increments
nr_ats_masters and merges ATS invalidations into the domain's invs array.

However, the actual hardware enablement via pci_enable_ats() happens
later in arm_smmu_attach_commit(). If this call to pci_enable_ats fails,
the SMMU driver's ATS state tracking remains polluted, i.e., the driver
tracks ATS as enabled on a master that is not actually using it. This
leads to an incorrect nr_ats_masters and triggers a warning in the PCI
core during detach:

 1 [  127.925080] ------------[ cut here ]------------
 2 [  127.925084] WARNING: drivers/pci/ats.c:132 at pci_disable_ats+0x94/0xa8
 3 ...
 4 [  128.068169] Call trace:
 5 [  128.070603]  pci_disable_ats+0x94/0xa8 (P)
 6 [  128.074688]  arm_smmu_attach_prepare+0x104/0x310
 7 [  128.079292]  arm_smmu_attach_dev_ste+0x128/0x1e0

The issue was exposed under heavy load when running a VFIO-based DMA
map stress test (iova_stress).

Following the addition of the arm_smmu_master_prepare_ats() [1] helper during
device probe, failable ATS configuration (STU setup) is now handled early
during probe. This ensures that any master reaching the attach phase is
guaranteed to have a valid ATS configuration.

Update arm_smmu_enable_ats() to use the WARN() macro for any
subsequent enablement failures during the commit phase. Since probe
checks now preclude software configuration errors, any failure here is
considered a kernel bug.

[1] https://lore.kernel.org/all/cover.1779392420.git.nicolinc@nvidia.com/

Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index a10affb483a4..aaebd72bc48d 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2956,8 +2956,14 @@ static void arm_smmu_enable_ats(struct arm_smmu_master *master)
 	 * ATC invalidation of PASID 0 causes the entire ATC to be flushed.
 	 */
 	arm_smmu_atc_inv_master(master, IOMMU_NO_PASID);
-	if (pci_enable_ats(pdev, stu))
-		dev_err(master->dev, "Failed to enable ATS (STU %zu)\n", stu);
+
+	/*
+	 * Any failure at this point is a kernel bug. pci_ats_supported()
+	 * and pci_prepare_ats() have already verified the hardware capability
+	 * and programmed the STU. Thus, pci_enable_ats() should not fail here.
+	 */
+	WARN(pci_enable_ats(pdev, stu),
+	     "Failed to enable ATS (STU %zu)\n", stu);
 }
 
 static int arm_smmu_enable_pasid(struct arm_smmu_master *master)
-- 
2.54.0.823.g6e5bcc1fc9-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 5/6] iommu/vt-d: Fail probe on ATS configuration failure
  2026-05-29 11:12 [PATCH v6 0/6] iommu: Standardize ATS robustness and state tracking Pranjal Shrivastava
                   ` (3 preceding siblings ...)
  2026-05-29 11:12 ` [PATCH v6 4/6] iommu/arm-smmu-v3: Standardize ATS enablement failure reporting Pranjal Shrivastava
@ 2026-05-29 11:12 ` Pranjal Shrivastava
  2026-05-29 14:30   ` sashiko-bot
  2026-05-29 11:12 ` [PATCH v6 6/6] iommu/amd: " Pranjal Shrivastava
  5 siblings, 1 reply; 20+ messages in thread
From: Pranjal Shrivastava @ 2026-05-29 11:12 UTC (permalink / raw)
  To: iommu, linux-pci, linux-arm-kernel, linux-kernel
  Cc: Joerg Roedel, Will Deacon, Bjorn Helgaas, David Woodhouse,
	Lu Baolu, Robin Murphy, Suravee Suthikulpanit, Jason Gunthorpe,
	Nicolin Chen, David Matlack, Samiullah Khawaja, Daniel Mentz,
	Pasha Tatashin, Mostafa Saleh, Pranjal Shrivastava

Update the Intel VT-d driver to handle ATS configuration and enablement
more strictly. Specifically, update the device probe to fail if
pci_prepare_ats() returns an error. This ensures that any ATS-capable
master reaching the attach phase is guaranteed to have a valid config.

Additionally, update iommu_enable_pci_ats() to WARN() if pci_enable_ats
fails. Since earlier checks in the probe phase preclude config-related
failures, any failure during hardware enablement is considered a kernel
bug.

Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
 drivers/iommu/intel/iommu.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 4d0e65bc131d..22308e4911e1 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -873,8 +873,14 @@ static void iommu_enable_pci_ats(struct device_domain_info *info)
 	if (!pci_ats_page_aligned(pdev))
 		return;
 
-	if (!pci_enable_ats(pdev, VTD_PAGE_SHIFT))
-		info->ats_enabled = 1;
+	/*
+	 * pci_enable_ats() should not fail here because earlier checks
+	 * have already verified support and configuration.
+	 */
+	if (WARN_ON(pci_enable_ats(pdev, VTD_PAGE_SHIFT)))
+		return;
+
+	info->ats_enabled = 1;
 }
 
 static void iommu_disable_pci_ats(struct device_domain_info *info)
@@ -3288,7 +3294,10 @@ static struct iommu_device *intel_iommu_probe_device(struct device *dev)
 
 	dev_iommu_priv_set(dev, info);
 	if (pdev && pci_ats_supported(pdev)) {
-		pci_prepare_ats(pdev, VTD_PAGE_SHIFT);
+		ret = pci_prepare_ats(pdev, VTD_PAGE_SHIFT);
+		if (ret)
+			goto free;
+
 		ret = device_rbtree_insert(iommu, info);
 		if (ret)
 			goto free;
-- 
2.54.0.823.g6e5bcc1fc9-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 6/6] iommu/amd: Fail probe on ATS configuration failure
  2026-05-29 11:12 [PATCH v6 0/6] iommu: Standardize ATS robustness and state tracking Pranjal Shrivastava
                   ` (4 preceding siblings ...)
  2026-05-29 11:12 ` [PATCH v6 5/6] iommu/vt-d: Fail probe on ATS configuration failure Pranjal Shrivastava
@ 2026-05-29 11:12 ` Pranjal Shrivastava
  2026-05-29 15:32   ` sashiko-bot
  2026-06-01  6:00   ` Ankit Soni
  5 siblings, 2 replies; 20+ messages in thread
From: Pranjal Shrivastava @ 2026-05-29 11:12 UTC (permalink / raw)
  To: iommu, linux-pci, linux-arm-kernel, linux-kernel
  Cc: Joerg Roedel, Will Deacon, Bjorn Helgaas, David Woodhouse,
	Lu Baolu, Robin Murphy, Suravee Suthikulpanit, Jason Gunthorpe,
	Nicolin Chen, David Matlack, Samiullah Khawaja, Daniel Mentz,
	Pasha Tatashin, Mostafa Saleh, Pranjal Shrivastava

Update the AMD IOMMU driver to handle ATS configuration and enablement
more strictly. Specifically, update the device probe to fail if
pci_prepare_ats() returns an error. This ensures that any ATS-capable
master reaching the attach phase is guaranteed to have a valid config.

Additionally, update pdev_enable_cap_ats() to WARN_ON() if pci_enable_ats
fails. Since earlier checks in the probe phase preclude config-related
failures, any failure during hardware enablement is considered a kernel
bug.

Fix a pre-existing Use-After-Free risk by ensuring iommu_ignore_device()
is called on all probe failures after iommu_init_device().

Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
 drivers/iommu/amd/iommu.c | 30 ++++++++++++++++++++++++------
 1 file changed, 24 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 84cad43dc188..b74c4504bee3 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -570,10 +570,16 @@ static inline int pdev_enable_cap_ats(struct pci_dev *pdev)
 	if (amd_iommu_iotlb_sup &&
 	    (dev_data->flags & AMD_IOMMU_DEVICE_FLAG_ATS_SUP)) {
 		ret = pci_enable_ats(pdev, PAGE_SHIFT);
-		if (!ret) {
-			dev_data->ats_enabled = 1;
-			dev_data->ats_qdep    = pci_ats_queue_depth(pdev);
-		}
+		/*
+		 * pci_enable_ats() should not fail here because earlier checks
+		 * have already verified support and configuration.
+		 */
+		if (WARN_ON(ret))
+			return ret;
+
+		dev_data->ats_enabled = 1;
+		dev_data->ats_qdep    = pci_ats_queue_depth(pdev);
+		ret = 0;
 	}
 
 	return ret;
@@ -2502,10 +2508,22 @@ static struct iommu_device *amd_iommu_probe_device(struct device *dev)
 	else
 		dev_data->max_irqs = MAX_IRQS_PER_TABLE_512;
 
-	if (dev_is_pci(dev))
-		pci_prepare_ats(to_pci_dev(dev), PAGE_SHIFT);
+	if (dev_is_pci(dev)) {
+		struct pci_dev *pdev = to_pci_dev(dev);
+
+		if (pci_ats_supported(pdev)) {
+			ret = pci_prepare_ats(pdev, PAGE_SHIFT);
+			if (ret) {
+				iommu_dev = ERR_PTR(ret);
+				goto out_err;
+			}
+		}
+	}
 
 out_err:
+	if (IS_ERR(iommu_dev))
+		iommu_ignore_device(iommu, dev);
+
 	return iommu_dev;
 }
 
-- 
2.54.0.823.g6e5bcc1fc9-goog


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 3/6] PCI/ATS: Mandate checking pci_ats_supported() before pci_prepare_ats()
  2026-05-29 11:12 ` [PATCH v6 3/6] PCI/ATS: Mandate checking pci_ats_supported() before pci_prepare_ats() Pranjal Shrivastava
@ 2026-05-29 13:07   ` sashiko-bot
  2026-05-29 21:56   ` Nicolin Chen
  1 sibling, 0 replies; 20+ messages in thread
From: sashiko-bot @ 2026-05-29 13:07 UTC (permalink / raw)
  To: Pranjal Shrivastava; +Cc: linux-pci

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] Adding a WARN_ON() inside pci_prepare_ats() causes severe kernel warning spam during boot and creates a DoS vector via hotplugging on AMD systems.
--

commit d186831ffc8e024f9925cf7f6047ad60755bc2d8
Author: Pranjal Shrivastava <praan@google.com>

PCI/ATS: Mandate checking pci_ats_supported() before pci_prepare_ats()

This commit updates pci_prepare_ats() to wrap the internal
pci_ats_supported() check in a WARN_ON(), mandating callers to verify ATS
support prior to calling it. It also updates the function documentation.

> diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
> index 9cb23780093dd..f1434f86ac40b 100644
> --- a/drivers/pci/ats.c
> +++ b/drivers/pci/ats.c
[ ... ]
> @@ -64,7 +66,7 @@ int pci_prepare_ats(struct pci_dev *dev, int ps)
>  {
>  	u16 ctrl;
>  
> -	if (!pci_ats_supported(dev))
> +	if (WARN_ON(!pci_ats_supported(dev)))

[Severity: High]
Will this trigger severe warning spam during boot on AMD systems?

Currently, the AMD IOMMU driver unconditionally calls pci_prepare_ats() for
every PCI device in amd_iommu_probe_device() without checking if ATS is
supported. Since most standard devices lack ATS capability,
pci_ats_supported() returns false, which seems like it would trigger this
stack trace for every such device during probe.

This breaks git bisect and causes boot panics on systems with panic_on_warn
enabled, or provides a local DoS vector via device hotplugging.

I noticed this is fixed later in the series by commit e469ff12f2f6
(iommu/amd: Fail probe on ATS configuration failure). Should the AMD IOMMU
driver be updated before this patch to preserve bisectability?

>  		return -EINVAL;

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260529111208.387412-1-praan@google.com?part=3

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 4/6] iommu/arm-smmu-v3: Standardize ATS enablement failure reporting
  2026-05-29 11:12 ` [PATCH v6 4/6] iommu/arm-smmu-v3: Standardize ATS enablement failure reporting Pranjal Shrivastava
@ 2026-05-29 13:44   ` sashiko-bot
  2026-05-29 21:51   ` Nicolin Chen
  1 sibling, 0 replies; 20+ messages in thread
From: sashiko-bot @ 2026-05-29 13:44 UTC (permalink / raw)
  To: Pranjal Shrivastava; +Cc: linux-pci

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [Medium] The error message drops the specific device context.
- [Low] State-mutating function call placed inside WARN() macro condition.
--

commit f858943c568bf1c4936c2371054b39b080e8ae6a
Author: Pranjal Shrivastava <praan@google.com>

iommu/arm-smmu-v3: Standardize ATS enablement failure reporting

This commit moves failable ATS configuration to the probe phase and
updates the commit phase to use WARN() for enablement failures. Since
hardware capabilities are verified earlier, failures here are considered
kernel bugs.

Link: https://lore.kernel.org/all/cover.1779392420.git.nicolinc@nvidia.com/

> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
[ ... ]
> @@ -2956,8 +2956,14 @@ static void arm_smmu_enable_ats(struct arm_smmu_master *master)
>  	 * ATC invalidation of PASID 0 causes the entire ATC to be flushed.
>  	 */
>  	arm_smmu_atc_inv_master(master, IOMMU_NO_PASID);
> -	if (pci_enable_ats(pdev, stu))
> -		dev_err(master->dev, "Failed to enable ATS (STU %zu)\n", stu);
> +
> +	/*
> +	 * Any failure at this point is a kernel bug. pci_ats_supported()
> +	 * and pci_prepare_ats() have already verified the hardware capability
> +	 * and programmed the STU. Thus, pci_enable_ats() should not fail here.
> +	 */
> +	WARN(pci_enable_ats(pdev, stu),
> +	     "Failed to enable ATS (STU %zu)\n", stu);

[Severity: Low]
Does placing a state-mutating function like pci_enable_ats() directly inside
the WARN() macro condition hide essential hardware configuration logic?

While Linux's generic WARN() macro currently evaluates its condition exactly
once, placing this logic inside an error-reporting macro can degrade code
readability and break developer expectations. Could the function call be
extracted and its return value evaluated instead?

[Severity: Medium]
Does this change drop the specific device context from the error message?

The previous dev_err() call automatically prefixed the log message with the
specific device's identifier (like its PCI BDF address), which is crucial
for determining exactly which hardware component failed. By switching to a
generic WARN(), this device context is lost.

Would it be better to use dev_WARN(master->dev, ...) or explicitly include
dev_name(master->dev) in the format string to preserve observability for
debugging specific hardware failures?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260529111208.387412-1-praan@google.com?part=4

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 5/6] iommu/vt-d: Fail probe on ATS configuration failure
  2026-05-29 11:12 ` [PATCH v6 5/6] iommu/vt-d: Fail probe on ATS configuration failure Pranjal Shrivastava
@ 2026-05-29 14:30   ` sashiko-bot
  0 siblings, 0 replies; 20+ messages in thread
From: sashiko-bot @ 2026-05-29 14:30 UTC (permalink / raw)
  To: Pranjal Shrivastava; +Cc: linux-pci

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:

Pre-existing issues:
- [High] Unconditional call to `device_rbtree_remove()` for an uninserted `rb_node` corrupts the IOMMU device rbtree.
--

commit 28c5be83fdd5509a29185b7292f784b77e8d22e7
Author: Pranjal Shrivastava <praan@google.com>

iommu/vt-d: Fail probe on ATS configuration failure

Update the Intel VT-d driver to handle ATS configuration strictly by failing
the device probe if pci_prepare_ats() returns an error. Also, add a WARN_ON
in iommu_enable_pci_ats() to catch enablement failures.

> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 4d0e65bc131d7..22308e4911e1f 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
[ ... ]
> @@ -3288,7 +3294,10 @@ static struct iommu_device *intel_iommu_probe_device(struct device *dev)
>  
>  	dev_iommu_priv_set(dev, info);
>  	if (pdev && pci_ats_supported(pdev)) {
> -		pci_prepare_ats(pdev, VTD_PAGE_SHIFT);
> +		ret = pci_prepare_ats(pdev, VTD_PAGE_SHIFT);
> +		if (ret)
> +			goto free;
> +
>  		ret = device_rbtree_insert(iommu, info);
>  		if (ret)
>  			goto free;

[Severity: High]
This is a pre-existing issue, but I wanted to ask about the error handling
paths in this function. If a device does not support ATS or isn't a PCI
device, device_rbtree_insert() is skipped here.

However, if a subsequent initialization step fails, such as allocating the
PASID table:

intel_iommu_probe_device() {
	...
	if (sm_supported(iommu) && !dev_is_real_dma_subdevice(dev)) {
		ret = intel_pasid_alloc_table(dev);
		if (ret) {
			dev_err(dev, "PASID table allocation failed\n");
			goto clear_rbtree;
		}
	...
clear_rbtree:
	device_rbtree_remove(info);
free:
	kfree(info);
	...
}

Does this unconditionally call device_rbtree_remove() on a zero-initialized
and uninserted rb_node?

If rb_erase() is called on this uninserted node, could it treat the zeroed
node as the root and overwrite the actual rbtree root with NULL, corrupting
the IOMMU device tracking tree and dropping previously probed ATS devices?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260529111208.387412-1-praan@google.com?part=5

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 6/6] iommu/amd: Fail probe on ATS configuration failure
  2026-05-29 11:12 ` [PATCH v6 6/6] iommu/amd: " Pranjal Shrivastava
@ 2026-05-29 15:32   ` sashiko-bot
  2026-06-01  6:00   ` Ankit Soni
  1 sibling, 0 replies; 20+ messages in thread
From: sashiko-bot @ 2026-05-29 15:32 UTC (permalink / raw)
  To: Pranjal Shrivastava; +Cc: linux-pci

Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:

New issues:
- [Critical] Unconditional iommu_ignore_device() on probe failure breaks IRQ remapping in PD_MODE_NONE.
- [High] Failing IOMMU probe on optional ATS configuration failure causes device breakage and leaves a dangling MSI domain.
- [Low] Redundant double invocation of iommu_ignore_device() on iommu_init_device() failure.

Pre-existing issues:
- [High] iommu_ignore_device() fails to clear DTEs for PCI aliases due to incorrect operation ordering.
- [High] Unprotected, non-atomic memset() on the hardware-accessed dev_table array.
--

commit e469ff12f2f6dc9d49a92308828b4812fa6c58ef
Author: Pranjal Shrivastava <praan@google.com>

iommu/amd: Fail probe on ATS configuration failure

The patch updates the AMD IOMMU driver to fail device probe if ATS
configuration (pci_prepare_ats) fails. It also adds a WARN_ON for
pci_enable_ats failure and unconditionally calls iommu_ignore_device
on all probe failures to fix a use-after-free risk.

> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index 84cad43dc188c..b74c4504bee39 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -2502,10 +2508,22 @@ static struct iommu_device *amd_iommu_probe_device(struct device *dev)
>  	else
>  		dev_data->max_irqs = MAX_IRQS_PER_TABLE_512;
>  
> -	if (dev_is_pci(dev))
> -		pci_prepare_ats(to_pci_dev(dev), PAGE_SHIFT);
> +	if (dev_is_pci(dev)) {
> +		struct pci_dev *pdev = to_pci_dev(dev);
> +
> +		if (pci_ats_supported(pdev)) {
> +			ret = pci_prepare_ats(pdev, PAGE_SHIFT);
> +			if (ret) {
> +				iommu_dev = ERR_PTR(ret);
> +				goto out_err;

[Severity: High]
Does this error path leave a dangling MSI domain?

Since ATS is an optional PCIe performance enhancement, failing the probe
disables the device in the IOMMU. However, earlier in the probe sequence,
amd_iommu_set_pci_msi_domain() assigned the device's MSI domain to the
IOMMU's IRQ domain.

Because this assignment is not reverted on error, the device is left in an
inconsistent state. The device will be assigned to a default bypass or
direct DMA ops, but will be forced to use the IOMMU's IRQ domain for MSIs.

When the driver requests MSIs, irq_remapping_alloc() will fail to find the
IOMMU in the rlookup_table, breaking interrupt allocation completely for
the device.

> +			}
> +		}
> +	}
>  
>  out_err:
> +	if (IS_ERR(iommu_dev))
> +		iommu_ignore_device(iommu, dev);

[Severity: Low]
Is this double invocation of iommu_ignore_device() intentional?

The patch adds a new call to iommu_ignore_device() here to ensure cleanup
on all probe failure paths, but it leaves the pre-existing explicit call
inside the error handling block for iommu_init_device():

drivers/iommu/amd/iommu.c:amd_iommu_probe_device() {
	...
	ret = iommu_init_device(iommu, dev);
	if (ret) {
		dev_err(dev, "Failed to initialize - trying to proceed anyway\n");
		iommu_dev = ERR_PTR(ret);
		iommu_ignore_device(iommu, dev);
		goto out_err;
	}
	...
}

If iommu_init_device() fails, the code evaluates IS_ERR(iommu_dev) as true
at out_err and calls the cleanup function a second time. While not functionally
damaging, it iterates over DMA aliases and clears tables twice unnecessarily.


[Severity: Critical]
Does unconditionally calling iommu_ignore_device() on probe failure break
IRQ remapping when the page table mode is PD_MODE_NONE?

Previously, when amd_iommu_pgtable == PD_MODE_NONE (DMA translation
unsupported/disabled), the probe returned -ENODEV to disable DMA
translation for the device but intentionally bypassed iommu_ignore_device().

This kept the device in the rlookup_table, allowing IRQ remapping to
function correctly. By executing iommu_ignore_device() unconditionally
here, the rlookup_table entry is cleared.

Consequently, irq_remapping_alloc() will fail to find the IOMMU for any
device, breaking MSI allocation system-wide when the IOMMU is in
IRQ-remapping-only mode.


[Severity: High]
This is a pre-existing issue, but does iommu_ignore_device() clear DTEs
for PCI aliases in the correct order?

Looking at iommu_ignore_device():

drivers/iommu/amd/iommu.c:iommu_ignore_device() {
	...
	pci_seg->rlookup_table[devid] = NULL;
	...
	setup_aliases(iommu, dev);
}

Inside setup_aliases(), the callback clone_alias() attempts to look up
the IOMMU using rlookup_amd_iommu(&pdev->dev), which internally reads
rlookup_table[devid].

Because it was already set to NULL, the lookup fails, and clone_alias()
immediately returns 0. This skips clearing the Device Table Entries (DTEs)
for the device's DMA aliases, leaving them active and dangling.


[Severity: High]
This is a pre-existing issue, but does iommu_ignore_device() use a
non-atomic memset() on the hardware-accessed dev_table array?

Looking at iommu_ignore_device():

drivers/iommu/amd/iommu.c:iommu_ignore_device() {
	...
	memset(&dev_table[devid], 0, sizeof(struct dev_table_entry));
	...
}

The AMD IOMMU driver explicitly documents that "IOMMU reads the entire
Device Table entry in a single 256-bit transaction" and requires that
the "DTE[V|GV] bit is being written first when clearing".

A non-atomic clear can cause torn writes where the IOMMU hardware
concurrently reads a partially modified entry (e.g., seeing the Valid
bit still set while the page table pointers are zeroed), leading to
IOMMU faults or wild DMAs.

> +
>  	return iommu_dev;
>  }

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260529111208.387412-1-praan@google.com?part=6

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 4/6] iommu/arm-smmu-v3: Standardize ATS enablement failure reporting
  2026-05-29 11:12 ` [PATCH v6 4/6] iommu/arm-smmu-v3: Standardize ATS enablement failure reporting Pranjal Shrivastava
  2026-05-29 13:44   ` sashiko-bot
@ 2026-05-29 21:51   ` Nicolin Chen
  2026-05-31 17:13     ` Pranjal Shrivastava
  1 sibling, 1 reply; 20+ messages in thread
From: Nicolin Chen @ 2026-05-29 21:51 UTC (permalink / raw)
  To: Pranjal Shrivastava
  Cc: iommu, linux-pci, linux-arm-kernel, linux-kernel, Joerg Roedel,
	Will Deacon, Bjorn Helgaas, David Woodhouse, Lu Baolu,
	Robin Murphy, Suravee Suthikulpanit, Jason Gunthorpe,
	David Matlack, Samiullah Khawaja, Daniel Mentz, Pasha Tatashin,
	Mostafa Saleh

On Fri, May 29, 2026 at 11:12:06AM +0000, Pranjal Shrivastava wrote:
> The SMMUv3 driver currently has a two-phase commit in its ATS enablement
> flow. During arm_smmu_attach_prepare(), it predicts whether ATS will be
> enabled using arm_smmu_ats_supported() and accordingly increments
> nr_ats_masters and merges ATS invalidations into the domain's invs array.
> 
> However, the actual hardware enablement via pci_enable_ats() happens
> later in arm_smmu_attach_commit(). If this call to pci_enable_ats fails,
> the SMMU driver's ATS state tracking remains polluted, i.e., the driver
> tracks ATS as enabled on a master that is not actually using it. This
> leads to an incorrect nr_ats_masters and triggers a warning in the PCI
> core during detach:
> 
>  1 [  127.925080] ------------[ cut here ]------------
>  2 [  127.925084] WARNING: drivers/pci/ats.c:132 at pci_disable_ats+0x94/0xa8
>  3 ...
>  4 [  128.068169] Call trace:
>  5 [  128.070603]  pci_disable_ats+0x94/0xa8 (P)
>  6 [  128.074688]  arm_smmu_attach_prepare+0x104/0x310
>  7 [  128.079292]  arm_smmu_attach_dev_ste+0x128/0x1e0
> 
> The issue was exposed under heavy load when running a VFIO-based DMA
> map stress test (iova_stress).
> 
> Following the addition of the arm_smmu_master_prepare_ats() [1] helper during
> device probe, failable ATS configuration (STU setup) is now handled early
> during probe. This ensures that any master reaching the attach phase is
> guaranteed to have a valid ATS configuration.
> 
> Update arm_smmu_enable_ats() to use the WARN() macro for any
> subsequent enablement failures during the commit phase. Since probe
> checks now preclude software configuration errors, any failure here is
> considered a kernel bug.

The commit message feels like mixing a stale background and the
real requirement (based on the latest code line). Could that DMA
map stress test still trigger the WARN_ON in pci_disable_ats(),
after having arm_smmu_master_prepare_ats()?

It'd be nicer if the writing can be simplified a bit.

>  	arm_smmu_atc_inv_master(master, IOMMU_NO_PASID);
> -	if (pci_enable_ats(pdev, stu))
> -		dev_err(master->dev, "Failed to enable ATS (STU %zu)\n", stu);
> +
> +	/*
> +	 * Any failure at this point is a kernel bug. pci_ats_supported()
> +	 * and pci_prepare_ats() have already verified the hardware capability
> +	 * and programmed the STU. Thus, pci_enable_ats() should not fail here.
> +	 */

The patch that removes pci_ats_supported() from pci_prepare_ats()
is dropped in this v6. So, my previous comments may stay true and
the two lines can be enough?

 	/*
 	 * As pci_prepare_ats() have already verified the hardware capability
 	 * and programmed the STE, pci_enable_ats() should not fail here.
 	 */

> +	WARN(pci_enable_ats(pdev, stu),
> +	     "Failed to enable ATS (STU %zu)\n", stu);

https://sashiko.dev/#/patchset/20260529111208.387412-1-praan%40google.com
Please check Sashiko review (for other patches in this series too).

I think it'd be cleaner to just have:

-	if (pci_enable_ats(pdev, stu))
+	if (WARN_ON(pci_enable_ats(pdev, stu)))

Nicolin

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 3/6] PCI/ATS: Mandate checking pci_ats_supported() before pci_prepare_ats()
  2026-05-29 11:12 ` [PATCH v6 3/6] PCI/ATS: Mandate checking pci_ats_supported() before pci_prepare_ats() Pranjal Shrivastava
  2026-05-29 13:07   ` sashiko-bot
@ 2026-05-29 21:56   ` Nicolin Chen
  1 sibling, 0 replies; 20+ messages in thread
From: Nicolin Chen @ 2026-05-29 21:56 UTC (permalink / raw)
  To: Pranjal Shrivastava
  Cc: iommu, linux-pci, linux-arm-kernel, linux-kernel, Joerg Roedel,
	Will Deacon, Bjorn Helgaas, David Woodhouse, Lu Baolu,
	Robin Murphy, Suravee Suthikulpanit, Jason Gunthorpe,
	David Matlack, Samiullah Khawaja, Daniel Mentz, Pasha Tatashin,
	Mostafa Saleh

On Fri, May 29, 2026 at 11:12:05AM +0000, Pranjal Shrivastava wrote:
> Currently, pci_prepare_ats() internally calls pci_ats_supported() and
> returns -EINVAL if the device does not support ATS. While this provides
> a silent safety check, it conflates support detection with configuration.
> 
> Update pci_prepare_ats() to wrap the internal pci_ats_supported check in
> a WARN_ON(). This mandates all callers to call pci_prepare_ats() only if
> the function supports ATS.
> 
> Update the function documentation to mention that callers must verify
> ATS support (via pci_ats_supported()) before calling pci_prepare_ats().
> 
> Suggested-by: Baolu Lu <baolu.lu@linux.intel.com>
> Signed-off-by: Pranjal Shrivastava <praan@google.com>

Sashiko pointed out a bisect issue. So, you might want to reorder
a bit. The patch itself looks good to me.

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 4/6] iommu/arm-smmu-v3: Standardize ATS enablement failure reporting
  2026-05-29 21:51   ` Nicolin Chen
@ 2026-05-31 17:13     ` Pranjal Shrivastava
  0 siblings, 0 replies; 20+ messages in thread
From: Pranjal Shrivastava @ 2026-05-31 17:13 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: iommu, linux-pci, linux-arm-kernel, linux-kernel, Joerg Roedel,
	Will Deacon, Bjorn Helgaas, David Woodhouse, Lu Baolu,
	Robin Murphy, Suravee Suthikulpanit, Jason Gunthorpe,
	David Matlack, Samiullah Khawaja, Daniel Mentz, Pasha Tatashin,
	Mostafa Saleh

On Fri, May 29, 2026 at 02:51:52PM -0700, Nicolin Chen wrote:
> On Fri, May 29, 2026 at 11:12:06AM +0000, Pranjal Shrivastava wrote:
> > The SMMUv3 driver currently has a two-phase commit in its ATS enablement
> > flow. During arm_smmu_attach_prepare(), it predicts whether ATS will be
> > enabled using arm_smmu_ats_supported() and accordingly increments
> > nr_ats_masters and merges ATS invalidations into the domain's invs array.
> > 
> > However, the actual hardware enablement via pci_enable_ats() happens
> > later in arm_smmu_attach_commit(). If this call to pci_enable_ats fails,
> > the SMMU driver's ATS state tracking remains polluted, i.e., the driver
> > tracks ATS as enabled on a master that is not actually using it. This
> > leads to an incorrect nr_ats_masters and triggers a warning in the PCI
> > core during detach:
> > 
> >  1 [  127.925080] ------------[ cut here ]------------
> >  2 [  127.925084] WARNING: drivers/pci/ats.c:132 at pci_disable_ats+0x94/0xa8
> >  3 ...
> >  4 [  128.068169] Call trace:
> >  5 [  128.070603]  pci_disable_ats+0x94/0xa8 (P)
> >  6 [  128.074688]  arm_smmu_attach_prepare+0x104/0x310
> >  7 [  128.079292]  arm_smmu_attach_dev_ste+0x128/0x1e0
> > 
> > The issue was exposed under heavy load when running a VFIO-based DMA
> > map stress test (iova_stress).
> > 
> > Following the addition of the arm_smmu_master_prepare_ats() [1] helper during
> > device probe, failable ATS configuration (STU setup) is now handled early
> > during probe. This ensures that any master reaching the attach phase is
> > guaranteed to have a valid ATS configuration.
> > 
> > Update arm_smmu_enable_ats() to use the WARN() macro for any
> > subsequent enablement failures during the commit phase. Since probe
> > checks now preclude software configuration errors, any failure here is
> > considered a kernel bug.
> 
> The commit message feels like mixing a stale background and the
> real requirement (based on the latest code line). Could that DMA
> map stress test still trigger the WARN_ON in pci_disable_ats(),
> after having arm_smmu_master_prepare_ats()?
> 
> It'd be nicer if the writing can be simplified a bit.

Ack. I'll re-word and remove stale context.

> 
> >  	arm_smmu_atc_inv_master(master, IOMMU_NO_PASID);
> > -	if (pci_enable_ats(pdev, stu))
> > -		dev_err(master->dev, "Failed to enable ATS (STU %zu)\n", stu);
> > +
> > +	/*
> > +	 * Any failure at this point is a kernel bug. pci_ats_supported()
> > +	 * and pci_prepare_ats() have already verified the hardware capability
> > +	 * and programmed the STU. Thus, pci_enable_ats() should not fail here.
> > +	 */
> 
> The patch that removes pci_ats_supported() from pci_prepare_ats()
> is dropped in this v6. So, my previous comments may stay true and
> the two lines can be enough?
> 
>  	/*
>  	 * As pci_prepare_ats() have already verified the hardware capability
>  	 * and programmed the STE, pci_enable_ats() should not fail here.
>  	 */
> 
> > +	WARN(pci_enable_ats(pdev, stu),
> > +	     "Failed to enable ATS (STU %zu)\n", stu);

Ack. I'll update this.

> 
> https://sashiko.dev/#/patchset/20260529111208.387412-1-praan%40google.com
> Please check Sashiko review (for other patches in this series too).

Yup, already sent out a series [1] to address Sashiko findings
separately.

> 
> I think it'd be cleaner to just have:
> 
> -	if (pci_enable_ats(pdev, stu))
> +	if (WARN_ON(pci_enable_ats(pdev, stu)))

Sure.. I'll also maybe keep the dev_err log that we have, knowing STU
mismatch is slightly helpful.

Thanks,
Praan

[1] https://lore.kernel.org/all/20260531170254.60493-1-praan@google.com/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 6/6] iommu/amd: Fail probe on ATS configuration failure
  2026-05-29 11:12 ` [PATCH v6 6/6] iommu/amd: " Pranjal Shrivastava
  2026-05-29 15:32   ` sashiko-bot
@ 2026-06-01  6:00   ` Ankit Soni
  2026-06-01  6:20     ` Pranjal Shrivastava
  1 sibling, 1 reply; 20+ messages in thread
From: Ankit Soni @ 2026-06-01  6:00 UTC (permalink / raw)
  To: Pranjal Shrivastava
  Cc: iommu, linux-pci, linux-arm-kernel, linux-kernel, Joerg Roedel,
	Will Deacon, Bjorn Helgaas, David Woodhouse, Lu Baolu,
	Robin Murphy, Suravee Suthikulpanit, Jason Gunthorpe,
	Nicolin Chen, David Matlack, Samiullah Khawaja, Daniel Mentz,
	Pasha Tatashin, Mostafa Saleh

On Fri, May 29, 2026 at 11:12:08AM +0000, Pranjal Shrivastava wrote:
> Update the AMD IOMMU driver to handle ATS configuration and enablement
> more strictly. Specifically, update the device probe to fail if
> pci_prepare_ats() returns an error. This ensures that any ATS-capable
> master reaching the attach phase is guaranteed to have a valid config.
> 
> Additionally, update pdev_enable_cap_ats() to WARN_ON() if pci_enable_ats
> fails. Since earlier checks in the probe phase preclude config-related
> failures, any failure during hardware enablement is considered a kernel
> bug.
> 
> Fix a pre-existing Use-After-Free risk by ensuring iommu_ignore_device()
> is called on all probe failures after iommu_init_device().
> 
> Signed-off-by: Pranjal Shrivastava <praan@google.com>
> ---
>  drivers/iommu/amd/iommu.c | 30 ++++++++++++++++++++++++------
>  1 file changed, 24 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index 84cad43dc188..b74c4504bee3 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -570,10 +570,16 @@ static inline int pdev_enable_cap_ats(struct pci_dev *pdev)
>  	if (amd_iommu_iotlb_sup &&
>  	    (dev_data->flags & AMD_IOMMU_DEVICE_FLAG_ATS_SUP)) {
>  		ret = pci_enable_ats(pdev, PAGE_SHIFT);
> -		if (!ret) {
> -			dev_data->ats_enabled = 1;
> -			dev_data->ats_qdep    = pci_ats_queue_depth(pdev);
> -		}
> +		/*
> +		 * pci_enable_ats() should not fail here because earlier checks
> +		 * have already verified support and configuration.
> +		 */
> +		if (WARN_ON(ret))
> +			return ret;
> +
> +		dev_data->ats_enabled = 1;
> +		dev_data->ats_qdep    = pci_ats_queue_depth(pdev);
> +		ret = 0;
>  	}
>  
>  	return ret;
> @@ -2502,10 +2508,22 @@ static struct iommu_device *amd_iommu_probe_device(struct device *dev)
>  	else
>  		dev_data->max_irqs = MAX_IRQS_PER_TABLE_512;
>  
> -	if (dev_is_pci(dev))
> -		pci_prepare_ats(to_pci_dev(dev), PAGE_SHIFT);
> +	if (dev_is_pci(dev)) {
> +		struct pci_dev *pdev = to_pci_dev(dev);
> +
> +		if (pci_ats_supported(pdev)) {
> +			ret = pci_prepare_ats(pdev, PAGE_SHIFT);
> +			if (ret) {
> +				iommu_dev = ERR_PTR(ret);
> +				goto out_err;
> +			}
> +		}
> +	}
>  
>  out_err:
> +	if (IS_ERR(iommu_dev))
> +		iommu_ignore_device(iommu, dev);
> +
>  	return iommu_dev;
>  }
>  

Hi,
This regresses IRQ remapping in the PD_MODE_NONE branch. By design
rlookup_table[devid] must stay valid for IR - init.c:2257 documents
this: "Do not return an error to enable IRQ remapping ...". Pre-patch
the PD_MODE_NONE branch returned ERR_PTR(-ENODEV) without nulling
rlookup, precisely so irq_remapping_alloc() / __rlookup_amd_iommu()
keep working; this unconditional cleanup violates that.
The new pci_prepare_ats() failure path has the same shape:
amd_iommu_set_pci_msi_domain() ran earlier and parented dev->msi_domain
on iommu->ir_domain, but on this new out_err that's not unwound. So
nulling rlookup_table[devid] makes irq_remapping_alloc() return -EINVAL
on the first MSI alloc for the device. Sashiko also flagged this in [1];

Also if iommu_init_device() branch fails, iommu_ignore_device() will be
called twice.

[1] https://lore.kernel.org/r/20260529153216.2AD1E1F00899@smtp.kernel.org

-Ankit

> -- 
> 2.54.0.823.g6e5bcc1fc9-goog
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 6/6] iommu/amd: Fail probe on ATS configuration failure
  2026-06-01  6:00   ` Ankit Soni
@ 2026-06-01  6:20     ` Pranjal Shrivastava
  2026-06-01  8:17       ` Ankit Soni
  2026-06-01  9:28       ` Vasant Hegde
  0 siblings, 2 replies; 20+ messages in thread
From: Pranjal Shrivastava @ 2026-06-01  6:20 UTC (permalink / raw)
  To: Ankit Soni
  Cc: iommu, linux-pci, linux-arm-kernel, linux-kernel, Joerg Roedel,
	Will Deacon, Bjorn Helgaas, David Woodhouse, Lu Baolu,
	Robin Murphy, Suravee Suthikulpanit, Jason Gunthorpe,
	Nicolin Chen, David Matlack, Samiullah Khawaja, Daniel Mentz,
	Pasha Tatashin, Mostafa Saleh

On Mon, Jun 01, 2026 at 06:00:15AM +0000, Ankit Soni wrote:
> > @@ -2502,10 +2508,22 @@ static struct iommu_device *amd_iommu_probe_device(struct device *dev)
> >  	else
> >  		dev_data->max_irqs = MAX_IRQS_PER_TABLE_512;
> >  
> > -	if (dev_is_pci(dev))
> > -		pci_prepare_ats(to_pci_dev(dev), PAGE_SHIFT);
> > +	if (dev_is_pci(dev)) {
> > +		struct pci_dev *pdev = to_pci_dev(dev);
> > +
> > +		if (pci_ats_supported(pdev)) {
> > +			ret = pci_prepare_ats(pdev, PAGE_SHIFT);
> > +			if (ret) {
> > +				iommu_dev = ERR_PTR(ret);
> > +				goto out_err;
> > +			}
> > +		}
> > +	}
> >  
> >  out_err:
> > +	if (IS_ERR(iommu_dev))
> > +		iommu_ignore_device(iommu, dev);
> > +
> >  	return iommu_dev;
> >  }
> >  
> 
> Hi,
> This regresses IRQ remapping in the PD_MODE_NONE branch. By design
> rlookup_table[devid] must stay valid for IR - init.c:2257 documents
> this: "Do not return an error to enable IRQ remapping ...". Pre-patch
> the PD_MODE_NONE branch returned ERR_PTR(-ENODEV) without nulling
> rlookup, precisely so irq_remapping_alloc() / __rlookup_amd_iommu()
> keep working; this unconditional cleanup violates that.
> The new pci_prepare_ats() failure path has the same shape:
> amd_iommu_set_pci_msi_domain() ran earlier and parented dev->msi_domain
> on iommu->ir_domain, but on this new out_err that's not unwound. So
> nulling rlookup_table[devid] makes irq_remapping_alloc() return -EINVAL
> on the first MSI alloc for the device. Sashiko also flagged this in [1];
> 
> Also if iommu_init_device() branch fails, iommu_ignore_device() will be
> called twice.
> 

Hi Ankit,

Ack. Sashiko made me realize that this regresses IRQ mapping for AMD,
and I agree that the call to iommu_ignore_device() is a bit too 
aggressive as it  wipes the rlookup_table entry required for IRQ 
remapping, particularly in PD_MODE_NONE.

I was thinkig to address this in the next version as follows:

1. Split the probe error paths:
   - Proper init failures (like iommu_init_device) will continue to call
     iommu_ignore_device(). I will fix the double invocation here.

   - Config failures (like ATS mismatch or PD_MODE_NONE) will return an
     an error but skip caling  iommu_ignore_device(), preserving the
     rlookup_table entry for IRQ remapping.

2. Resolve the Use-After-Free (UAF):
   To prevent the UAF on the "DMA-only" failure path, I will ensure that
   the hardware Device Table Entry (DTE) is set to a safe state (like
   blocked or bypass) and the dev_data->dev pointer is cleared, as the
   IOMMU core does not invoke release_device() after a probe failure.

3. Fix iommu_ignore_device() infrastructure:
   I will address the pre-existing bugs identified by Sashiko:
   - Fix clearing order (calling setup_aliases before clearing the
     rlookup_table).
   - Replace the non-atomic memset() on the hardware dev_table with an
     atomic DTE update.

That said, I'm investigating the safest way to revert the MSI domain
assignment on probe failure to avoid the dangling domain issue pointed
out by Sashiko. Maybe we can add an amd_iommu_restore_msi_domain() helper
to revert the assignment made in amd_iommu_set_pci_msi_domain() on probe
failure?

Please, let me know if that sounds okay?

Also, I'm wondering if I should send this as a separate series specific 
to AMD which is unrelated to this one? Or maybe handle AMD IOMMU in a
separate series altogether. Let me know if you (and Vasanth / Suravee) 
would prefer that?

Thanks,
Pranjal

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 6/6] iommu/amd: Fail probe on ATS configuration failure
  2026-06-01  6:20     ` Pranjal Shrivastava
@ 2026-06-01  8:17       ` Ankit Soni
  2026-06-01 10:35         ` Pranjal Shrivastava
  2026-06-01  9:28       ` Vasant Hegde
  1 sibling, 1 reply; 20+ messages in thread
From: Ankit Soni @ 2026-06-01  8:17 UTC (permalink / raw)
  To: Pranjal Shrivastava, Vasant Hegde
  Cc: iommu, linux-pci, linux-arm-kernel, linux-kernel, Joerg Roedel,
	Will Deacon, Bjorn Helgaas, David Woodhouse, Lu Baolu,
	Robin Murphy, Suravee Suthikulpanit, Jason Gunthorpe,
	Nicolin Chen, David Matlack, Samiullah Khawaja, Daniel Mentz,
	Pasha Tatashin, Mostafa Saleh

On Mon, Jun 01, 2026 at 06:20:58AM +0000, Pranjal Shrivastava wrote:
> On Mon, Jun 01, 2026 at 06:00:15AM +0000, Ankit Soni wrote:
> > > @@ -2502,10 +2508,22 @@ static struct iommu_device *amd_iommu_probe_device(struct device *dev)
> > >  	else
> > >  		dev_data->max_irqs = MAX_IRQS_PER_TABLE_512;
> > >  
> > > -	if (dev_is_pci(dev))
> > > -		pci_prepare_ats(to_pci_dev(dev), PAGE_SHIFT);
> > > +	if (dev_is_pci(dev)) {
> > > +		struct pci_dev *pdev = to_pci_dev(dev);
> > > +
> > > +		if (pci_ats_supported(pdev)) {
> > > +			ret = pci_prepare_ats(pdev, PAGE_SHIFT);
> > > +			if (ret) {
> > > +				iommu_dev = ERR_PTR(ret);
> > > +				goto out_err;
> > > +			}
> > > +		}
> > > +	}
> > >  
> > >  out_err:
> > > +	if (IS_ERR(iommu_dev))
> > > +		iommu_ignore_device(iommu, dev);
> > > +
> > >  	return iommu_dev;
> > >  }
> > >  
> > 
> > Hi,
> > This regresses IRQ remapping in the PD_MODE_NONE branch. By design
> > rlookup_table[devid] must stay valid for IR - init.c:2257 documents
> > this: "Do not return an error to enable IRQ remapping ...". Pre-patch
> > the PD_MODE_NONE branch returned ERR_PTR(-ENODEV) without nulling
> > rlookup, precisely so irq_remapping_alloc() / __rlookup_amd_iommu()
> > keep working; this unconditional cleanup violates that.
> > The new pci_prepare_ats() failure path has the same shape:
> > amd_iommu_set_pci_msi_domain() ran earlier and parented dev->msi_domain
> > on iommu->ir_domain, but on this new out_err that's not unwound. So
> > nulling rlookup_table[devid] makes irq_remapping_alloc() return -EINVAL
> > on the first MSI alloc for the device. Sashiko also flagged this in [1];
> > 
> > Also if iommu_init_device() branch fails, iommu_ignore_device() will be
> > called twice.
> > 
> 
> Hi Ankit,
> 
> Ack. Sashiko made me realize that this regresses IRQ mapping for AMD,
> and I agree that the call to iommu_ignore_device() is a bit too 
> aggressive as it  wipes the rlookup_table entry required for IRQ 
> remapping, particularly in PD_MODE_NONE.
> 
> I was thinkig to address this in the next version as follows:
> 
> 1. Split the probe error paths:
>    - Proper init failures (like iommu_init_device) will continue to call
>      iommu_ignore_device(). I will fix the double invocation here.
> 
>    - Config failures (like ATS mismatch or PD_MODE_NONE) will return an
>      an error but skip caling  iommu_ignore_device(), preserving the
>      rlookup_table entry for IRQ remapping.
> 
> 2. Resolve the Use-After-Free (UAF):
>    To prevent the UAF on the "DMA-only" failure path, I will ensure that
>    the hardware Device Table Entry (DTE) is set to a safe state (like
>    blocked or bypass) and the dev_data->dev pointer is cleared, as the
>    IOMMU core does not invoke release_device() after a probe failure.
> 
> 3. Fix iommu_ignore_device() infrastructure:
>    I will address the pre-existing bugs identified by Sashiko:
>    - Fix clearing order (calling setup_aliases before clearing the
>      rlookup_table).
>    - Replace the non-atomic memset() on the hardware dev_table with an
>      atomic DTE update.
> 
> That said, I'm investigating the safest way to revert the MSI domain
> assignment on probe failure to avoid the dangling domain issue pointed
> out by Sashiko. Maybe we can add an amd_iommu_restore_msi_domain() helper
> to revert the assignment made in amd_iommu_set_pci_msi_domain() on probe
> failure?
> 
> Please, let me know if that sounds okay?
> 
> Also, I'm wondering if I should send this as a separate series specific 
> to AMD which is unrelated to this one? Or maybe handle AMD IOMMU in a
> separate series altogether. Let me know if you (and Vasanth / Suravee) 
> would prefer that?
> 
> Thanks,
> Pranjal

Hi Pranjal,

Plan looks good. One pushback: I don't think you need the
amd_iommu_restore_msi_domain() helper.

If point 1 preserves rlookup_table on the PD_MODE_NONE and
pci_prepare_ats() failure paths, dev->msi_domain pointing at
iommu->ir_domain stays functional - irq_remapping_alloc() /
__rlookup_amd_iommu() find the iommu and the chain works.
So fixing rlookup makes the MSI assignment correct, 
not dangling - no restore needed.

On splitting: While patches 1-5 are essentially settled. I'd lean 
toward pulling AMD into a separate follow-up so the rest doesn't wait,
but defer to Vasant/Suravee on that.

+Vasant

Thanks,
Ankit

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 6/6] iommu/amd: Fail probe on ATS configuration failure
  2026-06-01  6:20     ` Pranjal Shrivastava
  2026-06-01  8:17       ` Ankit Soni
@ 2026-06-01  9:28       ` Vasant Hegde
  2026-06-01 10:41         ` Pranjal Shrivastava
  1 sibling, 1 reply; 20+ messages in thread
From: Vasant Hegde @ 2026-06-01  9:28 UTC (permalink / raw)
  To: Pranjal Shrivastava, Ankit Soni
  Cc: iommu, linux-pci, linux-arm-kernel, linux-kernel, Joerg Roedel,
	Will Deacon, Bjorn Helgaas, David Woodhouse, Lu Baolu,
	Robin Murphy, Suravee Suthikulpanit, Jason Gunthorpe,
	Nicolin Chen, David Matlack, Samiullah Khawaja, Daniel Mentz,
	Pasha Tatashin, Mostafa Saleh

Hi Pranjal,

On 6/1/2026 11:50 AM, Pranjal Shrivastava wrote:
> On Mon, Jun 01, 2026 at 06:00:15AM +0000, Ankit Soni wrote:
>>> @@ -2502,10 +2508,22 @@ static struct iommu_device *amd_iommu_probe_device(struct device *dev)
>>>  	else
>>>  		dev_data->max_irqs = MAX_IRQS_PER_TABLE_512;
>>>  
>>> -	if (dev_is_pci(dev))
>>> -		pci_prepare_ats(to_pci_dev(dev), PAGE_SHIFT);
>>> +	if (dev_is_pci(dev)) {
>>> +		struct pci_dev *pdev = to_pci_dev(dev);
>>> +
>>> +		if (pci_ats_supported(pdev)) {
>>> +			ret = pci_prepare_ats(pdev, PAGE_SHIFT);
>>> +			if (ret) {
>>> +				iommu_dev = ERR_PTR(ret);
>>> +				goto out_err;
>>> +			}
>>> +		}
>>> +	}
>>>  
>>>  out_err:
>>> +	if (IS_ERR(iommu_dev))
>>> +		iommu_ignore_device(iommu, dev);
>>> +
>>>  	return iommu_dev;
>>>  }
>>>  
>>
>> Hi,
>> This regresses IRQ remapping in the PD_MODE_NONE branch. By design
>> rlookup_table[devid] must stay valid for IR - init.c:2257 documents
>> this: "Do not return an error to enable IRQ remapping ...". Pre-patch
>> the PD_MODE_NONE branch returned ERR_PTR(-ENODEV) without nulling
>> rlookup, precisely so irq_remapping_alloc() / __rlookup_amd_iommu()
>> keep working; this unconditional cleanup violates that.
>> The new pci_prepare_ats() failure path has the same shape:
>> amd_iommu_set_pci_msi_domain() ran earlier and parented dev->msi_domain
>> on iommu->ir_domain, but on this new out_err that's not unwound. So
>> nulling rlookup_table[devid] makes irq_remapping_alloc() return -EINVAL
>> on the first MSI alloc for the device. Sashiko also flagged this in [1];
>>
>> Also if iommu_init_device() branch fails, iommu_ignore_device() will be
>> called twice.
>>
> 
> Hi Ankit,
> 
> Ack. Sashiko made me realize that this regresses IRQ mapping for AMD,
> and I agree that the call to iommu_ignore_device() is a bit too 
> aggressive as it  wipes the rlookup_table entry required for IRQ 
> remapping, particularly in PD_MODE_NONE.
> 
> I was thinkig to address this in the next version as follows:
> 
> 1. Split the probe error paths:
>    - Proper init failures (like iommu_init_device) will continue to call
>      iommu_ignore_device(). I will fix the double invocation here.
> 
>    - Config failures (like ATS mismatch or PD_MODE_NONE) will return an
>      an error but skip caling  iommu_ignore_device(), preserving the
>      rlookup_table entry for IRQ remapping.

I was reviewing v5 last Friday and decided to fix probe() code as its been long
time I wanted to cleanup this code.  I have a patch series which pretty much
does this.

I haven't fixed iommu_ignore_device() code, but should be simple to fix it.
we can use amd_iommu_make_clear_dte / amd_iommu_update_dte. It will set the
dte.tv=0 and essentially blocks all DMAs.

If you already have the patches then please go ahead, else I can post the series
this week.


> 
> 2. Resolve the Use-After-Free (UAF):
>    To prevent the UAF on the "DMA-only" failure path, I will ensure that
>    the hardware Device Table Entry (DTE) is set to a safe state (like
>    blocked or bypass) and the dev_data->dev pointer is cleared, as the
>    IOMMU core does not invoke release_device() after a probe failure.
> 
> 3. Fix iommu_ignore_device() infrastructure:
>    I will address the pre-existing bugs identified by Sashiko:
>    - Fix clearing order (calling setup_aliases before clearing the
>      rlookup_table).
>    - Replace the non-atomic memset() on the hardware dev_table with an
>      atomic DTE update.
> 
> That said, I'm investigating the safest way to revert the MSI domain
> assignment on probe failure to avoid the dangling domain issue pointed
> out by Sashiko. Maybe we can add an amd_iommu_restore_msi_domain() helper
> to revert the assignment made in amd_iommu_set_pci_msi_domain() on probe
> failure?
> 
> Please, let me know if that sounds okay?
> 
> Also, I'm wondering if I should send this as a separate series specific 
> to AMD which is unrelated to this one? Or maybe handle AMD IOMMU in a
> separate series altogether. Let me know if you (and Vasanth / Suravee) 
> would prefer that?

Lets separate out AMD fixes part as its not related to this series.

If you want to keep this patch then just the "iommu_ignore_device" part that
should be OK -OR- if you want to drop entirely and pick it up with AMD specific
series that's also works for me.

-Vasant


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 6/6] iommu/amd: Fail probe on ATS configuration failure
  2026-06-01  8:17       ` Ankit Soni
@ 2026-06-01 10:35         ` Pranjal Shrivastava
  0 siblings, 0 replies; 20+ messages in thread
From: Pranjal Shrivastava @ 2026-06-01 10:35 UTC (permalink / raw)
  To: Ankit Soni
  Cc: Vasant Hegde, iommu, linux-pci, linux-arm-kernel, linux-kernel,
	Joerg Roedel, Will Deacon, Bjorn Helgaas, David Woodhouse,
	Lu Baolu, Robin Murphy, Suravee Suthikulpanit, Jason Gunthorpe,
	Nicolin Chen, David Matlack, Samiullah Khawaja, Daniel Mentz,
	Pasha Tatashin, Mostafa Saleh

On Mon, Jun 01, 2026 at 08:17:23AM +0000, Ankit Soni wrote:
> 
> Hi Pranjal,
> 
> Plan looks good. One pushback: I don't think you need the
> amd_iommu_restore_msi_domain() helper.
> 
> If point 1 preserves rlookup_table on the PD_MODE_NONE and
> pci_prepare_ats() failure paths, dev->msi_domain pointing at
> iommu->ir_domain stays functional - irq_remapping_alloc() /
> __rlookup_amd_iommu() find the iommu and the chain works.
> So fixing rlookup makes the MSI assignment correct, 
> not dangling - no restore needed.

Ack. That makes sense—if we preserve the rlookup_table entry, the MSI
the MSI domain mapping remains functionally correct and the chain
stays intact. Thank you for clarifying that.

> 
> On splitting: While patches 1-5 are essentially settled. I'd lean 
> toward pulling AMD into a separate follow-up so the rest doesn't wait,
> but defer to Vasant/Suravee on that.

Ack. I agree with the suggestion to pull the AMD-specific work into a
separate follow-up series. 

> 
> +Vasant

Apologies for missing the CC! I thought I had Joerg's latest tree and
get_maintainers would automatically add Vasant.

Thanks,
Praan

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 6/6] iommu/amd: Fail probe on ATS configuration failure
  2026-06-01  9:28       ` Vasant Hegde
@ 2026-06-01 10:41         ` Pranjal Shrivastava
  0 siblings, 0 replies; 20+ messages in thread
From: Pranjal Shrivastava @ 2026-06-01 10:41 UTC (permalink / raw)
  To: Vasant Hegde
  Cc: Ankit Soni, iommu, linux-pci, linux-arm-kernel, linux-kernel,
	Joerg Roedel, Will Deacon, Bjorn Helgaas, David Woodhouse,
	Lu Baolu, Robin Murphy, Suravee Suthikulpanit, Jason Gunthorpe,
	Nicolin Chen, David Matlack, Samiullah Khawaja, Daniel Mentz,
	Pasha Tatashin, Mostafa Saleh

On Mon, Jun 01, 2026 at 02:58:11PM +0530, Vasant Hegde wrote:
> Hi Pranjal,
> 
> On 6/1/2026 11:50 AM, Pranjal Shrivastava wrote:
> > On Mon, Jun 01, 2026 at 06:00:15AM +0000, Ankit Soni wrote:
> > 
[...]
> > Hi Ankit,
> > 
> > Ack. Sashiko made me realize that this regresses IRQ mapping for AMD,
> > and I agree that the call to iommu_ignore_device() is a bit too 
> > aggressive as it  wipes the rlookup_table entry required for IRQ 
> > remapping, particularly in PD_MODE_NONE.
> > 
> > I was thinkig to address this in the next version as follows:
> > 
> > 1. Split the probe error paths:
> >    - Proper init failures (like iommu_init_device) will continue to call
> >      iommu_ignore_device(). I will fix the double invocation here.
> > 
> >    - Config failures (like ATS mismatch or PD_MODE_NONE) will return an
> >      an error but skip caling  iommu_ignore_device(), preserving the
> >      rlookup_table entry for IRQ remapping.
> 
> I was reviewing v5 last Friday and decided to fix probe() code as its been long
> time I wanted to cleanup this code.  I have a patch series which pretty much
> does this.
> 
> I haven't fixed iommu_ignore_device() code, but should be simple to fix it.
> we can use amd_iommu_make_clear_dte / amd_iommu_update_dte. It will set the
> dte.tv=0 and essentially blocks all DMAs.
> 
> If you already have the patches then please go ahead, else I can post the series
> this week.

Hi Vasant,

I have drafted the fixes (including the alias clearing order and the 
atomic DTE updates for iommu_ignore_device), 

I'll go ahead and post them as a separate preparatory series soon.
I'm also happy to incorporate any of your existing patches if you'd
like?

> 
> 
> > 
> > 2. Resolve the Use-After-Free (UAF):
> >    To prevent the UAF on the "DMA-only" failure path, I will ensure that
> >    the hardware Device Table Entry (DTE) is set to a safe state (like
> >    blocked or bypass) and the dev_data->dev pointer is cleared, as the
> >    IOMMU core does not invoke release_device() after a probe failure.
> > 
> > 3. Fix iommu_ignore_device() infrastructure:
> >    I will address the pre-existing bugs identified by Sashiko:
> >    - Fix clearing order (calling setup_aliases before clearing the
> >      rlookup_table).
> >    - Replace the non-atomic memset() on the hardware dev_table with an
> >      atomic DTE update.
> > 
> > That said, I'm investigating the safest way to revert the MSI domain
> > assignment on probe failure to avoid the dangling domain issue pointed
> > out by Sashiko. Maybe we can add an amd_iommu_restore_msi_domain() helper
> > to revert the assignment made in amd_iommu_set_pci_msi_domain() on probe
> > failure?
> > 
> > Please, let me know if that sounds okay?
> > 
> > Also, I'm wondering if I should send this as a separate series specific 
> > to AMD which is unrelated to this one? Or maybe handle AMD IOMMU in a
> > separate series altogether. Let me know if you (and Vasanth / Suravee) 
> > would prefer that?
> 
> Lets separate out AMD fixes part as its not related to this series.
> 
> If you want to keep this patch then just the "iommu_ignore_device" part that
> should be OK -OR- if you want to drop entirely and pick it up with AMD specific
> series that's also works for me.

Ack. I guess a separate series to handle AMD IOMMU sounds better. I'll
post one soon.

Thanks,
Praan

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2026-06-01 10:41 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-29 11:12 [PATCH v6 0/6] iommu: Standardize ATS robustness and state tracking Pranjal Shrivastava
2026-05-29 11:12 ` [PATCH v6 1/6] PCI/ATS: Ensure pci_ats_supported() is PF-aware for VFs Pranjal Shrivastava
2026-05-29 11:12 ` [PATCH v6 2/6] PCI/ATS: Validate STU for VFs in pci_prepare_ats() Pranjal Shrivastava
2026-05-29 11:12 ` [PATCH v6 3/6] PCI/ATS: Mandate checking pci_ats_supported() before pci_prepare_ats() Pranjal Shrivastava
2026-05-29 13:07   ` sashiko-bot
2026-05-29 21:56   ` Nicolin Chen
2026-05-29 11:12 ` [PATCH v6 4/6] iommu/arm-smmu-v3: Standardize ATS enablement failure reporting Pranjal Shrivastava
2026-05-29 13:44   ` sashiko-bot
2026-05-29 21:51   ` Nicolin Chen
2026-05-31 17:13     ` Pranjal Shrivastava
2026-05-29 11:12 ` [PATCH v6 5/6] iommu/vt-d: Fail probe on ATS configuration failure Pranjal Shrivastava
2026-05-29 14:30   ` sashiko-bot
2026-05-29 11:12 ` [PATCH v6 6/6] iommu/amd: " Pranjal Shrivastava
2026-05-29 15:32   ` sashiko-bot
2026-06-01  6:00   ` Ankit Soni
2026-06-01  6:20     ` Pranjal Shrivastava
2026-06-01  8:17       ` Ankit Soni
2026-06-01 10:35         ` Pranjal Shrivastava
2026-06-01  9:28       ` Vasant Hegde
2026-06-01 10:41         ` Pranjal Shrivastava

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox