* [PATCH v4 0/5] iommu: Standardize ATS robustness and state tracking
@ 2026-05-25 18:43 Pranjal Shrivastava
2026-05-25 18:43 ` [PATCH v4 1/5] PCI/ATS: Ensure pci_ats_supported() is PF-aware for VFs Pranjal Shrivastava
` (4 more replies)
0 siblings, 5 replies; 17+ messages in thread
From: Pranjal Shrivastava @ 2026-05-25 18:43 UTC (permalink / raw)
To: iommu, linux-pci, linux-arm-kernel, linux-kernel
Cc: Joerg Roedel, Will Deacon, Bjorn Helgaas, David Woodhouse,
Lu Baolu, Robin Murphy, Suravee Suthikulpanit, Jason Gunthorpe,
Nicolin Chen, David Matlack, Samiullah Khawaja, Daniel Mentz,
Pasha Tatashin, Mostafa Saleh, Pranjal Shrivastava
The primary motivation for this series is an ATS state mismatch observed
under heavy load (via iova_stress). A failure in pci_enable_ats() leaves
IOMMU drivers like arm-smmu-v3 with inconsistent state leading to PCI core
warnings during device detach.
While David's recent work [1] addressed a discovery race for specific
quirked devices by moving them to the HEADER phase, gaps remained
regarding how Virtual Functions (VFs) inherit state from their Physical
Functions (PFs). Specifically, pci_ats_supported() did not account for
PF-level quirked status, and pci_prepare_ats() lacked STU validation for
VFs.
Based on discussion with Jason in v3 it was decided that IOMMU drivers
should explicitly check pci_ats_supported before calling pci_prepare_ats
and the device probe should fail if pci_prepare_ats fails. Since these
early gates preclude software configuration errors, any remaining failure
during pci_enable_ats() is treated as a kernel bug & reported via WARN().
This series standardizes this pattern across ARM SMMUv3, Intel VT-d, &
AMD IOMMU drivers.
[1] https://lore.kernel.org/linux-pci/20260403222750.1215002-1-dmatlack@google.com/
[v4]
- Standardized the pattern across Intel VT-d and AMD IOMMU drivers.
- Replaced the SMMUv3 ats_prepared gate with a fatal probe-fail logic.
- Utilized WARN() macros for runtime enablement failures in all drivers.
- Collected R-b tags from Jason and Sami.
[v3] https://lore.kernel.org/all/20260519135323.1558777-1-praan@google.com/
[v2] https://lore.kernel.org/all/20260504163842.2692314-1-praan@google.com/
Pranjal Shrivastava (5):
PCI/ATS: Ensure pci_ats_supported() is PF-aware for VFs
PCI/ATS: Validate STU for VFs in pci_prepare_ats()
iommu/arm-smmu-v3: Fix ATS state tracking
iommu/vt-d: Fail probe on ATS configuration failure
iommu/amd: Fail probe on ATS configuration failure
drivers/iommu/amd/iommu.c | 27 ++++++++++++++++-----
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 21 +++++++++++++---
drivers/iommu/intel/iommu.c | 15 +++++++++---
drivers/pci/ats.c | 14 ++++++++---
4 files changed, 61 insertions(+), 16 deletions(-)
--
2.54.0.746.g67dd491aae-goog
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH v4 1/5] PCI/ATS: Ensure pci_ats_supported() is PF-aware for VFs
2026-05-25 18:43 [PATCH v4 0/5] iommu: Standardize ATS robustness and state tracking Pranjal Shrivastava
@ 2026-05-25 18:43 ` Pranjal Shrivastava
2026-05-25 19:42 ` Nicolin Chen
2026-05-25 18:43 ` [PATCH v4 2/5] PCI/ATS: Validate STU for VFs in pci_prepare_ats() Pranjal Shrivastava
` (3 subsequent siblings)
4 siblings, 1 reply; 17+ messages in thread
From: Pranjal Shrivastava @ 2026-05-25 18:43 UTC (permalink / raw)
To: iommu, linux-pci, linux-arm-kernel, linux-kernel
Cc: Joerg Roedel, Will Deacon, Bjorn Helgaas, David Woodhouse,
Lu Baolu, Robin Murphy, Suravee Suthikulpanit, Jason Gunthorpe,
Nicolin Chen, David Matlack, Samiullah Khawaja, Daniel Mentz,
Pasha Tatashin, Mostafa Saleh, Pranjal Shrivastava,
Jason Gunthorpe
Update pci_ats_supported() to additionally check the associated PF's
status when called on a VF. This ensures that PF-level quirks and
untrusted status are correctly propagated to VFs, providing a robust
support check that aligns with the kernel's PF-centric ATS configuration
model and is immune to the timing of VF-specific fixups.
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Samiullah Khawaja <skhawaja@google.com>
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/pci/ats.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
index ec6c8dbdc5e9..a5fa7745bce8 100644
--- a/drivers/pci/ats.c
+++ b/drivers/pci/ats.c
@@ -40,10 +40,13 @@ void pci_ats_init(struct pci_dev *dev)
*/
bool pci_ats_supported(struct pci_dev *dev)
{
- if (!dev->ats_cap)
+ if (!dev->ats_cap || dev->untrusted)
return false;
- return (dev->untrusted == 0);
+ if (dev->is_virtfn)
+ return pci_ats_supported(pci_physfn(dev));
+
+ return true;
}
EXPORT_SYMBOL_GPL(pci_ats_supported);
--
2.54.0.746.g67dd491aae-goog
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v4 2/5] PCI/ATS: Validate STU for VFs in pci_prepare_ats()
2026-05-25 18:43 [PATCH v4 0/5] iommu: Standardize ATS robustness and state tracking Pranjal Shrivastava
2026-05-25 18:43 ` [PATCH v4 1/5] PCI/ATS: Ensure pci_ats_supported() is PF-aware for VFs Pranjal Shrivastava
@ 2026-05-25 18:43 ` Pranjal Shrivastava
2026-05-25 19:46 ` Nicolin Chen
2026-05-25 18:43 ` [PATCH v4 3/5] iommu/arm-smmu-v3: Fix ATS state tracking Pranjal Shrivastava
` (2 subsequent siblings)
4 siblings, 1 reply; 17+ messages in thread
From: Pranjal Shrivastava @ 2026-05-25 18:43 UTC (permalink / raw)
To: iommu, linux-pci, linux-arm-kernel, linux-kernel
Cc: Joerg Roedel, Will Deacon, Bjorn Helgaas, David Woodhouse,
Lu Baolu, Robin Murphy, Suravee Suthikulpanit, Jason Gunthorpe,
Nicolin Chen, David Matlack, Samiullah Khawaja, Daniel Mentz,
Pasha Tatashin, Mostafa Saleh, Pranjal Shrivastava,
Jason Gunthorpe
While every PCI Function that implements ATS has an independent ATS
Extended Capability structure with a Read/Write Smallest Translation
Unit (STU) field, the kernel manages SR-IOV ATS by requiring the IOMMU
driver to configure the STU on the Physical Function (PF) before any
any Virtual Functions (VFs) are created.
Currently, pci_prepare_ats() bails out early for VFs, assuming that the
PF has already been correctly prepared. However, this creates a potential
mismatch if a VF is subsequently prepared with a different page shift.
Update pci_prepare_ats() to validate that the requested page shift (ps)
matches the STU already configured in the associated PF. This ensures
early detection of incompatible configurations and maintains the kernel's
policy of consistent STU sizing across all functions associated with a
given SMMU.
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Samiullah Khawaja <skhawaja@google.com>
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/pci/ats.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
index a5fa7745bce8..54319854bfd8 100644
--- a/drivers/pci/ats.c
+++ b/drivers/pci/ats.c
@@ -73,8 +73,13 @@ int pci_prepare_ats(struct pci_dev *dev, int ps)
if (ps < PCI_ATS_MIN_STU)
return -EINVAL;
- if (dev->is_virtfn)
+ if (dev->is_virtfn) {
+ struct pci_dev *pdev = pci_physfn(dev);
+
+ if (pdev->ats_stu != ps)
+ return -EINVAL;
return 0;
+ }
dev->ats_stu = ps;
ctrl = PCI_ATS_CTRL_STU(dev->ats_stu - PCI_ATS_MIN_STU);
--
2.54.0.746.g67dd491aae-goog
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v4 3/5] iommu/arm-smmu-v3: Fix ATS state tracking
2026-05-25 18:43 [PATCH v4 0/5] iommu: Standardize ATS robustness and state tracking Pranjal Shrivastava
2026-05-25 18:43 ` [PATCH v4 1/5] PCI/ATS: Ensure pci_ats_supported() is PF-aware for VFs Pranjal Shrivastava
2026-05-25 18:43 ` [PATCH v4 2/5] PCI/ATS: Validate STU for VFs in pci_prepare_ats() Pranjal Shrivastava
@ 2026-05-25 18:43 ` Pranjal Shrivastava
2026-05-25 20:05 ` Nicolin Chen
2026-05-25 20:24 ` sashiko-bot
2026-05-25 18:43 ` [PATCH v4 4/5] iommu/vt-d: Fail probe on ATS configuration failure Pranjal Shrivastava
2026-05-25 18:43 ` [PATCH v4 5/5] iommu/amd: " Pranjal Shrivastava
4 siblings, 2 replies; 17+ messages in thread
From: Pranjal Shrivastava @ 2026-05-25 18:43 UTC (permalink / raw)
To: iommu, linux-pci, linux-arm-kernel, linux-kernel
Cc: Joerg Roedel, Will Deacon, Bjorn Helgaas, David Woodhouse,
Lu Baolu, Robin Murphy, Suravee Suthikulpanit, Jason Gunthorpe,
Nicolin Chen, David Matlack, Samiullah Khawaja, Daniel Mentz,
Pasha Tatashin, Mostafa Saleh, Pranjal Shrivastava
The SMMUv3 driver currently has a two-phase commit in its ATS enablement
flow. During arm_smmu_attach_prepare(), it predicts whether ATS will be
enabled using arm_smmu_ats_supported() and accordingly increments
nr_ats_masters and merges ATS invalidations into the domain's invs array.
However, the actual hardware enablement via pci_enable_ats() happens
later in arm_smmu_attach_commit(). If this call to pci_enable_ats fails,
the SMMU driver's ATS state tracking remains polluted, i.e., the driver
tracks ATS enabled on a master that is not actually using ATS. This leads
to an incorrect nr_ats_masters and triggers a warning in the PCI core
during detach:
[ 127.925080] ------------[ cut here ]------------
[ 127.925084] WARNING: drivers/pci/ats.c:132 at pci_disable_ats+0x94/0xa8, CPU#42: iova_stress/12240
[ 127.949761] Modules linked in: vfat fat dummy bridge stp llc cdc_ncm cdc_eem cdc_ether usbnet mii xhci_pci xhci_hcd ehci_pci ehci_hcd
[ 127.961760] CPU: 42 UID: 0 PID: 12240 Comm: iova_stress Not tainted 7.1.0-smp-DEV #4 PREEMPTLAZY
[ 127.970619] Hardware name: <REDACTED>
[ 127.977655] pstate: 61400009 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[ 127.984603] pc : pci_disable_ats+0x94/0xa8
[ 127.988687] lr : arm_smmu_attach_prepare+0x104/0x310
...
[ 128.068169] Call trace:
[ 128.070603] pci_disable_ats+0x94/0xa8 (P)
[ 128.074688] arm_smmu_attach_prepare+0x104/0x310
[ 128.079292] arm_smmu_attach_dev_ste+0x128/0x1e0
[ 128.083899] arm_smmu_attach_dev_blocked+0x50/0x88
[ 128.088677] __iommu_attach_device+0x30/0x138
[ 128.093026] __iommu_group_set_domain_internal+0xdc/0x228
[ 128.098412] __iommu_take_dma_ownership+0x118/0x150
[ 128.103278] iommu_group_claim_dma_owner+0x48/0x80
[ 128.108056] vfio_container_attach_group+0xc8/0x1b0
[ 128.112927] vfio_group_fops_unl_ioctl+0x578/0x968
[ 128.117706] __arm64_sys_ioctl+0x90/0xe8
The issue was exposed under heavy load when running a VFIO-based DMA
map stress test (iova_stress).
Fix this by ensuring that all failable ATS configuration happens
early during device discovery. Update arm_smmu_probe_device() to call
pci_prepare_ats() only if ATS is supported and fail the probe if
pci_prepare_ats() returns an error, ensuring that any master that reaches
the attach phase is guaranteed to have a valid ATS configuration.
Additionally, update arm_smmu_enable_ats() to use the WARN() macro.
Since earlier checks now preclude configuration-related failures,
any failure during hardware enablement is a noisy kernel bug or fatal
hardware error that should be reported with a backtrace while
allowing the driver to continue in a balanced software state.
Fixes: 7497f4211f4f ("iommu/arm-smmu-v3: Make changing domains be hitless for ATS")
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 21 +++++++++++++++++----
1 file changed, 17 insertions(+), 4 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index e8d7dbe495f0..1d96064d314b 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3065,8 +3065,14 @@ static void arm_smmu_enable_ats(struct arm_smmu_master *master)
* ATC invalidation of PASID 0 causes the entire ATC to be flushed.
*/
arm_smmu_atc_inv_master(master, IOMMU_NO_PASID);
- if (pci_enable_ats(pdev, stu))
- dev_err(master->dev, "Failed to enable ATS (STU %zu)\n", stu);
+
+ /*
+ * Any failure at this point is a kernel bug. pci_ats_supported()
+ * and pci_prepare_ats() have already verified the hardware capability
+ * and programmed the STU. Thus, pci_enable_ats() should not fail here.
+ */
+ WARN(pci_enable_ats(pdev, stu),
+ "Failed to enable ATS (STU %zu)\n", stu);
}
static int arm_smmu_enable_pasid(struct arm_smmu_master *master)
@@ -4264,9 +4270,16 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
master->stall_enabled = true;
if (dev_is_pci(dev)) {
- unsigned int stu = __ffs(smmu->pgsize_bitmap);
+ struct pci_dev *pdev = to_pci_dev(dev);
- pci_prepare_ats(to_pci_dev(dev), stu);
+ if (pci_ats_supported(pdev)) {
+ unsigned int stu = __ffs(smmu->pgsize_bitmap);
+ int ret;
+
+ ret = pci_prepare_ats(pdev, stu);
+ if (ret)
+ return ERR_PTR(ret);
+ }
}
return &smmu->iommu;
--
2.54.0.746.g67dd491aae-goog
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v4 4/5] iommu/vt-d: Fail probe on ATS configuration failure
2026-05-25 18:43 [PATCH v4 0/5] iommu: Standardize ATS robustness and state tracking Pranjal Shrivastava
` (2 preceding siblings ...)
2026-05-25 18:43 ` [PATCH v4 3/5] iommu/arm-smmu-v3: Fix ATS state tracking Pranjal Shrivastava
@ 2026-05-25 18:43 ` Pranjal Shrivastava
2026-05-25 20:56 ` sashiko-bot
2026-05-25 18:43 ` [PATCH v4 5/5] iommu/amd: " Pranjal Shrivastava
4 siblings, 1 reply; 17+ messages in thread
From: Pranjal Shrivastava @ 2026-05-25 18:43 UTC (permalink / raw)
To: iommu, linux-pci, linux-arm-kernel, linux-kernel
Cc: Joerg Roedel, Will Deacon, Bjorn Helgaas, David Woodhouse,
Lu Baolu, Robin Murphy, Suravee Suthikulpanit, Jason Gunthorpe,
Nicolin Chen, David Matlack, Samiullah Khawaja, Daniel Mentz,
Pasha Tatashin, Mostafa Saleh, Pranjal Shrivastava
Update the Intel VT-d driver to handle ATS configuration and enablement
more strictly. Specifically, update the device probe to fail if
pci_prepare_ats() returns an error. This ensures that any ATS-capable
master reaching the attach phase is guaranteed to have a valid config.
Additionally, update iommu_enable_pci_ats() to WARN() if pci_enable_ats
fails. Since earlier checks in the probe phase preclude config-related
failures, any failure during hardware enablement is considered a kernel
bug.
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/intel/iommu.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 4d0e65bc131d..22308e4911e1 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -873,8 +873,14 @@ static void iommu_enable_pci_ats(struct device_domain_info *info)
if (!pci_ats_page_aligned(pdev))
return;
- if (!pci_enable_ats(pdev, VTD_PAGE_SHIFT))
- info->ats_enabled = 1;
+ /*
+ * pci_enable_ats() should not fail here because earlier checks
+ * have already verified support and configuration.
+ */
+ if (WARN_ON(pci_enable_ats(pdev, VTD_PAGE_SHIFT)))
+ return;
+
+ info->ats_enabled = 1;
}
static void iommu_disable_pci_ats(struct device_domain_info *info)
@@ -3288,7 +3294,10 @@ static struct iommu_device *intel_iommu_probe_device(struct device *dev)
dev_iommu_priv_set(dev, info);
if (pdev && pci_ats_supported(pdev)) {
- pci_prepare_ats(pdev, VTD_PAGE_SHIFT);
+ ret = pci_prepare_ats(pdev, VTD_PAGE_SHIFT);
+ if (ret)
+ goto free;
+
ret = device_rbtree_insert(iommu, info);
if (ret)
goto free;
--
2.54.0.746.g67dd491aae-goog
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v4 5/5] iommu/amd: Fail probe on ATS configuration failure
2026-05-25 18:43 [PATCH v4 0/5] iommu: Standardize ATS robustness and state tracking Pranjal Shrivastava
` (3 preceding siblings ...)
2026-05-25 18:43 ` [PATCH v4 4/5] iommu/vt-d: Fail probe on ATS configuration failure Pranjal Shrivastava
@ 2026-05-25 18:43 ` Pranjal Shrivastava
2026-05-25 21:35 ` sashiko-bot
4 siblings, 1 reply; 17+ messages in thread
From: Pranjal Shrivastava @ 2026-05-25 18:43 UTC (permalink / raw)
To: iommu, linux-pci, linux-arm-kernel, linux-kernel
Cc: Joerg Roedel, Will Deacon, Bjorn Helgaas, David Woodhouse,
Lu Baolu, Robin Murphy, Suravee Suthikulpanit, Jason Gunthorpe,
Nicolin Chen, David Matlack, Samiullah Khawaja, Daniel Mentz,
Pasha Tatashin, Mostafa Saleh, Pranjal Shrivastava
Update the AMD IOMMU driver to handle ATS configuration and enablement
more strictly. Specifically, update the device probe to fail if
pci_prepare_ats() returns an error. This ensures that any ATS-capable
master reaching the attach phase is guaranteed to have a valid config.
Additionally, update pdev_enable_cap_ats() to WARN_ON() if pci_enable_ats
fails. Since earlier checks in the probe phase preclude config-related
failures, any failure during hardware enablement is considered a kernel
bug.
Signed-off-by: Pranjal Shrivastava <praan@google.com>
---
drivers/iommu/amd/iommu.c | 27 +++++++++++++++++++++------
1 file changed, 21 insertions(+), 6 deletions(-)
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 84cad43dc188..1dddb08e7b22 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -570,10 +570,16 @@ static inline int pdev_enable_cap_ats(struct pci_dev *pdev)
if (amd_iommu_iotlb_sup &&
(dev_data->flags & AMD_IOMMU_DEVICE_FLAG_ATS_SUP)) {
ret = pci_enable_ats(pdev, PAGE_SHIFT);
- if (!ret) {
- dev_data->ats_enabled = 1;
- dev_data->ats_qdep = pci_ats_queue_depth(pdev);
- }
+ /*
+ * pci_enable_ats() should not fail here because earlier checks
+ * have already verified support and configuration.
+ */
+ if (WARN_ON(ret))
+ return ret;
+
+ dev_data->ats_enabled = 1;
+ dev_data->ats_qdep = pci_ats_queue_depth(pdev);
+ ret = 0;
}
return ret;
@@ -2502,8 +2508,17 @@ static struct iommu_device *amd_iommu_probe_device(struct device *dev)
else
dev_data->max_irqs = MAX_IRQS_PER_TABLE_512;
- if (dev_is_pci(dev))
- pci_prepare_ats(to_pci_dev(dev), PAGE_SHIFT);
+ if (dev_is_pci(dev)) {
+ struct pci_dev *pdev = to_pci_dev(dev);
+
+ if (pci_ats_supported(pdev)) {
+ ret = pci_prepare_ats(pdev, PAGE_SHIFT);
+ if (ret) {
+ iommu_dev = ERR_PTR(ret);
+ goto out_err;
+ }
+ }
+ }
out_err:
return iommu_dev;
--
2.54.0.746.g67dd491aae-goog
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v4 1/5] PCI/ATS: Ensure pci_ats_supported() is PF-aware for VFs
2026-05-25 18:43 ` [PATCH v4 1/5] PCI/ATS: Ensure pci_ats_supported() is PF-aware for VFs Pranjal Shrivastava
@ 2026-05-25 19:42 ` Nicolin Chen
0 siblings, 0 replies; 17+ messages in thread
From: Nicolin Chen @ 2026-05-25 19:42 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, linux-pci, linux-arm-kernel, linux-kernel, Joerg Roedel,
Will Deacon, Bjorn Helgaas, David Woodhouse, Lu Baolu,
Robin Murphy, Suravee Suthikulpanit, Jason Gunthorpe,
David Matlack, Samiullah Khawaja, Daniel Mentz, Pasha Tatashin,
Mostafa Saleh, Jason Gunthorpe
On Mon, May 25, 2026 at 06:43:43PM +0000, Pranjal Shrivastava wrote:
> Update pci_ats_supported() to additionally check the associated PF's
> status when called on a VF. This ensures that PF-level quirks and
> untrusted status are correctly propagated to VFs, providing a robust
> support check that aligns with the kernel's PF-centric ATS configuration
> model and is immune to the timing of VF-specific fixups.
>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: Samiullah Khawaja <skhawaja@google.com>
> Signed-off-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 2/5] PCI/ATS: Validate STU for VFs in pci_prepare_ats()
2026-05-25 18:43 ` [PATCH v4 2/5] PCI/ATS: Validate STU for VFs in pci_prepare_ats() Pranjal Shrivastava
@ 2026-05-25 19:46 ` Nicolin Chen
2026-05-28 20:27 ` Pranjal Shrivastava
0 siblings, 1 reply; 17+ messages in thread
From: Nicolin Chen @ 2026-05-25 19:46 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, linux-pci, linux-arm-kernel, linux-kernel, Joerg Roedel,
Will Deacon, Bjorn Helgaas, David Woodhouse, Lu Baolu,
Robin Murphy, Suravee Suthikulpanit, Jason Gunthorpe,
David Matlack, Samiullah Khawaja, Daniel Mentz, Pasha Tatashin,
Mostafa Saleh, Jason Gunthorpe
On Mon, May 25, 2026 at 06:43:44PM +0000, Pranjal Shrivastava wrote:
> While every PCI Function that implements ATS has an independent ATS
> Extended Capability structure with a Read/Write Smallest Translation
> Unit (STU) field, the kernel manages SR-IOV ATS by requiring the IOMMU
> driver to configure the STU on the Physical Function (PF) before any
> any Virtual Functions (VFs) are created.
>
> Currently, pci_prepare_ats() bails out early for VFs, assuming that the
> PF has already been correctly prepared. However, this creates a potential
> mismatch if a VF is subsequently prepared with a different page shift.
>
> Update pci_prepare_ats() to validate that the requested page shift (ps)
> matches the STU already configured in the associated PF. This ensures
> early detection of incompatible configurations and maintains the kernel's
> policy of consistent STU sizing across all functions associated with a
> given SMMU.
>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: Samiullah Khawaja <skhawaja@google.com>
> Signed-off-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Nit:
> @@ -73,8 +73,13 @@ int pci_prepare_ats(struct pci_dev *dev, int ps)
> if (ps < PCI_ATS_MIN_STU)
> return -EINVAL;
>
> - if (dev->is_virtfn)
> + if (dev->is_virtfn) {
> + struct pci_dev *pdev = pci_physfn(dev);
> +
> + if (pdev->ats_stu != ps)
> + return -EINVAL;
> return 0;
> + }
pdev doesn't seem useful. Dropping it reads better to me:
if (pci_physfn(dev)->ats_stu != ps)
return -EINVAL;
Nicolin
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 3/5] iommu/arm-smmu-v3: Fix ATS state tracking
2026-05-25 18:43 ` [PATCH v4 3/5] iommu/arm-smmu-v3: Fix ATS state tracking Pranjal Shrivastava
@ 2026-05-25 20:05 ` Nicolin Chen
2026-05-25 20:30 ` Pranjal Shrivastava
2026-05-25 20:24 ` sashiko-bot
1 sibling, 1 reply; 17+ messages in thread
From: Nicolin Chen @ 2026-05-25 20:05 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, linux-pci, linux-arm-kernel, linux-kernel, Joerg Roedel,
Will Deacon, Bjorn Helgaas, David Woodhouse, Lu Baolu,
Robin Murphy, Suravee Suthikulpanit, Jason Gunthorpe,
David Matlack, Samiullah Khawaja, Daniel Mentz, Pasha Tatashin,
Mostafa Saleh
On Mon, May 25, 2026 at 06:43:45PM +0000, Pranjal Shrivastava wrote:
> @@ -3065,8 +3065,14 @@ static void arm_smmu_enable_ats(struct arm_smmu_master *master)
> * ATC invalidation of PASID 0 causes the entire ATC to be flushed.
> */
> arm_smmu_atc_inv_master(master, IOMMU_NO_PASID);
> - if (pci_enable_ats(pdev, stu))
> - dev_err(master->dev, "Failed to enable ATS (STU %zu)\n", stu);
> +
> + /*
> + * Any failure at this point is a kernel bug. pci_ats_supported()
> + * and pci_prepare_ats() have already verified the hardware capability
> + * and programmed the STU. Thus, pci_enable_ats() should not fail here.
Nits:
- WARN usually indicates a kernel bug already.
- pci_prepare_ats() covers pci_ats_supported().
/*
* As pci_prepare_ats() have already verified the hardware capability
* and programmed the STE, pci_enable_ats() should not fail here.
*/
> @@ -4264,9 +4270,16 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
> master->stall_enabled = true;
>
> if (dev_is_pci(dev)) {
> - unsigned int stu = __ffs(smmu->pgsize_bitmap);
> + struct pci_dev *pdev = to_pci_dev(dev);
>
> - pci_prepare_ats(to_pci_dev(dev), stu);
> + if (pci_ats_supported(pdev)) {
> + unsigned int stu = __ffs(smmu->pgsize_bitmap);
> + int ret;
> +
> + ret = pci_prepare_ats(pdev, stu);
> + if (ret)
> + return ERR_PTR(ret);
> + }
Again, pci_prepare_ats() covers pci_ats_supported(). So, the check
is redundant. Instead, it should check arm_smmu_ats_supported().
By the way, this would conflict into my series:
https://lore.kernel.org/linux-iommu/18bb6f421b3be891caa8f1fb50f3a4d56b52d5be.1779392420.git.nicolinc@nvidia.com/
It would be nicer to have an arm_smmu_master_prepare_ats() so that
both series would have a common ground; mine would be just adding
some additional lines if your series goes in first.
Thanks
Nicolin
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 3/5] iommu/arm-smmu-v3: Fix ATS state tracking
2026-05-25 18:43 ` [PATCH v4 3/5] iommu/arm-smmu-v3: Fix ATS state tracking Pranjal Shrivastava
2026-05-25 20:05 ` Nicolin Chen
@ 2026-05-25 20:24 ` sashiko-bot
1 sibling, 0 replies; 17+ messages in thread
From: sashiko-bot @ 2026-05-25 20:24 UTC (permalink / raw)
To: Pranjal Shrivastava; +Cc: linux-pci
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] Direct return on `pci_prepare_ats()` failure bypasses required cleanup, leading to memory leaks, dangling pointers in the SMMU's RB-tree (potential Use-After-Free), and left-over PASID configuration.
--
commit 89f07c4b4ad178ffe815778fc25432a3966e94a5
Author: Pranjal Shrivastava <praan@google.com>
iommu/arm-smmu-v3: Fix ATS state tracking
This commit addresses an issue with ATS state tracking during a two-phase
commit. It moves failable ATS configuration to the device discovery phase
by calling pci_prepare_ats() early and failing the probe if it returns an
error.
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
[ ... ]
> @@ -4264,9 +4270,16 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
> master->stall_enabled = true;
>
> if (dev_is_pci(dev)) {
> - unsigned int stu = __ffs(smmu->pgsize_bitmap);
> + struct pci_dev *pdev = to_pci_dev(dev);
>
> - pci_prepare_ats(to_pci_dev(dev), stu);
> + if (pci_ats_supported(pdev)) {
> + unsigned int stu = __ffs(smmu->pgsize_bitmap);
> + int ret;
> +
> + ret = pci_prepare_ats(pdev, stu);
> + if (ret)
> + return ERR_PTR(ret);
[Severity: High]
Does this direct return bypass the required cleanup for the master structure?
Earlier in arm_smmu_probe_device(), the master structure is allocated and its
streams are inserted into the smmu->streams RB-tree using
arm_smmu_insert_master().
Returning ERR_PTR(ret) directly here leaves the master object and its
allocations leaked, and leaves dangling stream nodes in the smmu->streams
RB-tree.
If the SMMU later receives an event for this stream, could the driver
traverse the RB-tree, find the leaked stream, and attempt to access
stream->master->dev after the PCI subsystem has freed it, resulting in a
use-after-free?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260525184347.4059549-1-praan@google.com?part=3
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 3/5] iommu/arm-smmu-v3: Fix ATS state tracking
2026-05-25 20:05 ` Nicolin Chen
@ 2026-05-25 20:30 ` Pranjal Shrivastava
2026-05-25 20:40 ` Nicolin Chen
0 siblings, 1 reply; 17+ messages in thread
From: Pranjal Shrivastava @ 2026-05-25 20:30 UTC (permalink / raw)
To: Nicolin Chen
Cc: iommu, linux-pci, linux-arm-kernel, linux-kernel, Joerg Roedel,
Will Deacon, Bjorn Helgaas, David Woodhouse, Lu Baolu,
Robin Murphy, Suravee Suthikulpanit, Jason Gunthorpe,
David Matlack, Samiullah Khawaja, Daniel Mentz, Pasha Tatashin,
Mostafa Saleh
On Mon, May 25, 2026 at 01:05:36PM -0700, Nicolin Chen wrote:
> On Mon, May 25, 2026 at 06:43:45PM +0000, Pranjal Shrivastava wrote:
> > @@ -3065,8 +3065,14 @@ static void arm_smmu_enable_ats(struct arm_smmu_master *master)
> > * ATC invalidation of PASID 0 causes the entire ATC to be flushed.
> > */
> > arm_smmu_atc_inv_master(master, IOMMU_NO_PASID);
> > - if (pci_enable_ats(pdev, stu))
> > - dev_err(master->dev, "Failed to enable ATS (STU %zu)\n", stu);
> > +
> > + /*
> > + * Any failure at this point is a kernel bug. pci_ats_supported()
> > + * and pci_prepare_ats() have already verified the hardware capability
> > + * and programmed the STU. Thus, pci_enable_ats() should not fail here.
>
> Nits:
> - WARN usually indicates a kernel bug already.
> - pci_prepare_ats() covers pci_ats_supported().
Ack. For the pci_ats_supported.. since we're failing probe if
pci_prepare_ats() fails, the idea was to call prepare only if ATS was
supported [1] i.e. not penalise non-ATS callers of pci_prepare_ats. But
I agree.. maybe a better way would be to factor our pci_ats_supported().
>
> /*
> * As pci_prepare_ats() have already verified the hardware capability
> * and programmed the STE, pci_enable_ats() should not fail here.
> */
>
> > @@ -4264,9 +4270,16 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
> > master->stall_enabled = true;
> >
> > if (dev_is_pci(dev)) {
> > - unsigned int stu = __ffs(smmu->pgsize_bitmap);
> > + struct pci_dev *pdev = to_pci_dev(dev);
> >
> > - pci_prepare_ats(to_pci_dev(dev), stu);
> > + if (pci_ats_supported(pdev)) {
> > + unsigned int stu = __ffs(smmu->pgsize_bitmap);
> > + int ret;
> > +
> > + ret = pci_prepare_ats(pdev, stu);
> > + if (ret)
> > + return ERR_PTR(ret);
> > + }
>
> Again, pci_prepare_ats() covers pci_ats_supported(). So, the check
> is redundant. Instead, it should check arm_smmu_ats_supported().
>
Ack, I'll add a check with arm_smmu_ats_supported
> By the way, this would conflict into my series:
> https://lore.kernel.org/linux-iommu/18bb6f421b3be891caa8f1fb50f3a4d56b52d5be.1779392420.git.nicolinc@nvidia.com/
>
> It would be nicer to have an arm_smmu_master_prepare_ats() so that
> both series would have a common ground; mine would be just adding
> some additional lines if your series goes in first.
Ahh yes, I was thinking about the conflict to (I'm yet to review the
series though). That makes sense, we could add
arm_smmu_master_prepare_ats() for clarity
Thanks,
Praan
[1] https://lore.kernel.org/all/20260519145947.GK7702@ziepe.ca/
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 3/5] iommu/arm-smmu-v3: Fix ATS state tracking
2026-05-25 20:30 ` Pranjal Shrivastava
@ 2026-05-25 20:40 ` Nicolin Chen
0 siblings, 0 replies; 17+ messages in thread
From: Nicolin Chen @ 2026-05-25 20:40 UTC (permalink / raw)
To: Pranjal Shrivastava
Cc: iommu, linux-pci, linux-arm-kernel, linux-kernel, Joerg Roedel,
Will Deacon, Bjorn Helgaas, David Woodhouse, Lu Baolu,
Robin Murphy, Suravee Suthikulpanit, Jason Gunthorpe,
David Matlack, Samiullah Khawaja, Daniel Mentz, Pasha Tatashin,
Mostafa Saleh
On Mon, May 25, 2026 at 08:30:18PM +0000, Pranjal Shrivastava wrote:
> On Mon, May 25, 2026 at 01:05:36PM -0700, Nicolin Chen wrote:
> > On Mon, May 25, 2026 at 06:43:45PM +0000, Pranjal Shrivastava wrote:
> > - pci_prepare_ats() covers pci_ats_supported().
>
> the idea was to call prepare only if ATS was
> supported [1] i.e. not penalise non-ATS callers of pci_prepare_ats. But
Yea. I realized that it is slightly different after I replied.
Nicolin
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 4/5] iommu/vt-d: Fail probe on ATS configuration failure
2026-05-25 18:43 ` [PATCH v4 4/5] iommu/vt-d: Fail probe on ATS configuration failure Pranjal Shrivastava
@ 2026-05-25 20:56 ` sashiko-bot
2026-05-28 5:19 ` Baolu Lu
0 siblings, 1 reply; 17+ messages in thread
From: sashiko-bot @ 2026-05-25 20:56 UTC (permalink / raw)
To: Pranjal Shrivastava; +Cc: linux-pci
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
Pre-existing issues:
- [High] Unconditional RB-tree removal on error path causes tree corruption for non-ATS devices.
--
commit b8810979b183c84268a61e16cca0cbf36d0ac97a
Author: Pranjal Shrivastava <praan@google.com>
iommu/vt-d: Fail probe on ATS configuration failure
Updates the Intel vt-d driver to strictly handle ats configuration by failing
the device probe if pci_prepare_ats() returns an error. It also throws a
warning if pci_enable_ats() fails during enablement, treating it as a bug.
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 4d0e65bc131d7..22308e4911e1f 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
[ ... ]
> @@ -3288,7 +3294,10 @@ static struct iommu_device *intel_iommu_probe_device(struct device *dev)
>
> dev_iommu_priv_set(dev, info);
> if (pdev && pci_ats_supported(pdev)) {
> - pci_prepare_ats(pdev, VTD_PAGE_SHIFT);
> + ret = pci_prepare_ats(pdev, VTD_PAGE_SHIFT);
> + if (ret)
> + goto free;
> +
> ret = device_rbtree_insert(iommu, info);
> if (ret)
> goto free;
[Severity: High]
This is a pre-existing issue, but does unconditional removal of the rb-node
on the error path cause tree corruption for non-ats devices?
If a device does not support ats, the info structure skips the
device_rbtree_insert() call here, leaving info->node zero-initialized.
However, if a subsequent operation like intel_pasid_alloc_table() fails,
the error path jumps to clear_rbtree:
intel_iommu_probe_device()
...
clear_rbtree:
device_rbtree_remove(info);
free:
kfree(info);
Calling device_rbtree_remove() on this zero-initialized node causes the
rb-tree logic to treat it as a root node with no parent, ultimately
overwriting the actual tree root with null and corrupting the device_rbtree.
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260525184347.4059549-1-praan@google.com?part=4
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 5/5] iommu/amd: Fail probe on ATS configuration failure
2026-05-25 18:43 ` [PATCH v4 5/5] iommu/amd: " Pranjal Shrivastava
@ 2026-05-25 21:35 ` sashiko-bot
0 siblings, 0 replies; 17+ messages in thread
From: sashiko-bot @ 2026-05-25 21:35 UTC (permalink / raw)
To: Pranjal Shrivastava; +Cc: linux-pci
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
New issues:
- [Critical] The error path for pci_prepare_ats() in amd_iommu_probe_device() fails to revert previously initialized IOMMU state, leading to a potential Use-After-Free.
- [High] Failing the IOMMU device probe when pci_prepare_ats() fails breaks backward compatibility for devices that could function perfectly without ATS.
Pre-existing issues:
- [Critical] Pre-existing missing cleanup on error path when IOMMU page tables are disabled causes a Use-After-Free.
--
commit c40e9d6d303549c37f626ed11947fa01974222f8
Author: Pranjal Shrivastava <praan@google.com>
iommu/amd: Fail probe on ATS configuration failure
Update the AMD IOMMU driver to handle ATS configuration and enablement
more strictly. Specifically, update the device probe to fail if
pci_prepare_ats() returns an error. This ensures that any ATS-capable
master reaching the attach phase is guaranteed to have a valid config.
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index 84cad43dc188c..1dddb08e7b224 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
[ ... ]
> @@ -2502,8 +2508,17 @@ static struct iommu_device *amd_iommu_probe_device(struct device *dev)
> else
> dev_data->max_irqs = MAX_IRQS_PER_TABLE_512;
>
> - if (dev_is_pci(dev))
> - pci_prepare_ats(to_pci_dev(dev), PAGE_SHIFT);
> + if (dev_is_pci(dev)) {
> + struct pci_dev *pdev = to_pci_dev(dev);
> +
> + if (pci_ats_supported(pdev)) {
> + ret = pci_prepare_ats(pdev, PAGE_SHIFT);
> + if (ret) {
> + iommu_dev = ERR_PTR(ret);
> + goto out_err;
[Severity: High]
Since ATS is an optional performance optimization, does failing the IOMMU
probe when pci_prepare_ats() fails break backward compatibility?
This completely prevents devices with buggy ATS capabilities (or VF/PF STU
mismatches) from attaching to the IOMMU.
Could this disable DMA translation entirely for hardware that would otherwise
work correctly without ATS?
[Severity: Critical]
If pci_prepare_ats() fails, we return ERR_PTR(ret) without calling
iommu_ignore_device(). Could this lead to a Use-After-Free regression?
Since iommu_init_device() has already globally linked dev_data and populated
the hardware dev_table and rlookup_table with dev_data->dev pointing to the
PCI device struct, the IOMMU core receives an error and considers the device
unbound. It will not invoke release_device() upon removal.
When the physical device is later removed and freed by the PCI subsystem,
dev_data->dev becomes a dangling pointer.
If a spurious IOMMU fault or event occurs for that device ID, the interrupt
handler (e.g., amd_iommu_int_thread) will retrieve dev_data and dereference
the freed pointer (e.g., via dev_name(dev_data->dev) in dump_dte_entry()),
causing a Use-After-Free regression and system crash.
> + }
> + }
> + }
>
> out_err:
[Severity: Critical]
This is a pre-existing issue, but does the same Use-After-Free regression
occur when amd_iommu_pgtable == PD_MODE_NONE evaluates to true?
Similar to the newly introduced ATS error path, it jumps to out_err and
returns ERR_PTR(-ENODEV) without calling iommu_ignore_device().
This leaves dev_data globally linked with a populated hardware dev_table and
rlookup_table containing a dangling dev_data->dev pointer. Upon device
removal and a subsequent spurious IOMMU event, could this cause a
Use-After-Free regression for the exact same reasons?
> return iommu_dev;
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260525184347.4059549-1-praan@google.com?part=5
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 4/5] iommu/vt-d: Fail probe on ATS configuration failure
2026-05-25 20:56 ` sashiko-bot
@ 2026-05-28 5:19 ` Baolu Lu
2026-05-28 17:36 ` Pranjal Shrivastava
0 siblings, 1 reply; 17+ messages in thread
From: Baolu Lu @ 2026-05-28 5:19 UTC (permalink / raw)
To: sashiko-reviews, Pranjal Shrivastava; +Cc: linux-pci
On 5/26/26 04:56, sashiko-bot@kernel.org wrote:
>> @@ -3288,7 +3294,10 @@ static struct iommu_device *intel_iommu_probe_device(struct device *dev)
>>
>> dev_iommu_priv_set(dev, info);
>> if (pdev && pci_ats_supported(pdev)) {
>> - pci_prepare_ats(pdev, VTD_PAGE_SHIFT);
>> + ret = pci_prepare_ats(pdev, VTD_PAGE_SHIFT);
>> + if (ret)
>> + goto free;
>> +
>> ret = device_rbtree_insert(iommu, info);
>> if (ret)
>> goto free;
> [Severity: High]
> This is a pre-existing issue, but does unconditional removal of the rb-node
> on the error path cause tree corruption for non-ats devices?
>
> If a device does not support ats, the info structure skips the
> device_rbtree_insert() call here, leaving info->node zero-initialized.
>
> However, if a subsequent operation like intel_pasid_alloc_table() fails,
> the error path jumps to clear_rbtree:
>
> intel_iommu_probe_device()
> ...
> clear_rbtree:
> device_rbtree_remove(info);
> free:
> kfree(info);
>
> Calling device_rbtree_remove() on this zero-initialized node causes the
> rb-tree logic to treat it as a root node with no parent, ultimately
> overwriting the actual tree root with null and corrupting the device_rbtree.
Perhaps we can fix it like this?
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 4d0e65bc131d..849d06dfe1ae 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -157,7 +157,10 @@ static void device_rbtree_remove(struct
device_domain_info *info)
unsigned long flags;
spin_lock_irqsave(&iommu->device_rbtree_lock, flags);
- rb_erase(&info->node, &iommu->device_rbtree);
+ if (!RB_EMPTY_NODE(&info->node)) {
+ rb_erase(&info->node, &iommu->device_rbtree);
+ RB_CLEAR_NODE(&info->node);
+ }
spin_unlock_irqrestore(&iommu->device_rbtree_lock, flags);
}
@@ -3254,6 +3257,7 @@ static struct iommu_device
*intel_iommu_probe_device(struct device *dev)
info->dev = dev;
info->iommu = iommu;
+ RB_CLEAR_NODE(&info->node);
if (dev_is_pci(dev)) {
if (ecap_dev_iotlb_support(iommu->ecap) &&
pci_ats_supported(pdev) &&
Thanks,
baolu
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v4 4/5] iommu/vt-d: Fail probe on ATS configuration failure
2026-05-28 5:19 ` Baolu Lu
@ 2026-05-28 17:36 ` Pranjal Shrivastava
0 siblings, 0 replies; 17+ messages in thread
From: Pranjal Shrivastava @ 2026-05-28 17:36 UTC (permalink / raw)
To: Baolu Lu; +Cc: sashiko-reviews, linux-pci
On Thu, May 28, 2026 at 01:19:13PM +0800, Baolu Lu wrote:
> On 5/26/26 04:56, sashiko-bot@kernel.org wrote:
> > > @@ -3288,7 +3294,10 @@ static struct iommu_device *intel_iommu_probe_device(struct device *dev)
> > > dev_iommu_priv_set(dev, info);
> > > if (pdev && pci_ats_supported(pdev)) {
> > > - pci_prepare_ats(pdev, VTD_PAGE_SHIFT);
> > > + ret = pci_prepare_ats(pdev, VTD_PAGE_SHIFT);
> > > + if (ret)
> > > + goto free;
> > > +
> > > ret = device_rbtree_insert(iommu, info);
> > > if (ret)
> > > goto free;
> > [Severity: High]
> > This is a pre-existing issue, but does unconditional removal of the rb-node
> > on the error path cause tree corruption for non-ats devices?
> >
> > If a device does not support ats, the info structure skips the
> > device_rbtree_insert() call here, leaving info->node zero-initialized.
> >
> > However, if a subsequent operation like intel_pasid_alloc_table() fails,
> > the error path jumps to clear_rbtree:
> >
> > intel_iommu_probe_device()
> > ...
> > clear_rbtree:
> > device_rbtree_remove(info);
> > free:
> > kfree(info);
> >
> > Calling device_rbtree_remove() on this zero-initialized node causes the
> > rb-tree logic to treat it as a root node with no parent, ultimately
> > overwriting the actual tree root with null and corrupting the device_rbtree.
>
> Perhaps we can fix it like this?
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 4d0e65bc131d..849d06dfe1ae 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -157,7 +157,10 @@ static void device_rbtree_remove(struct
> device_domain_info *info)
> unsigned long flags;
>
> spin_lock_irqsave(&iommu->device_rbtree_lock, flags);
> - rb_erase(&info->node, &iommu->device_rbtree);
> + if (!RB_EMPTY_NODE(&info->node)) {
> + rb_erase(&info->node, &iommu->device_rbtree);
> + RB_CLEAR_NODE(&info->node);
> + }
> spin_unlock_irqrestore(&iommu->device_rbtree_lock, flags);
> }
>
> @@ -3254,6 +3257,7 @@ static struct iommu_device
> *intel_iommu_probe_device(struct device *dev)
>
> info->dev = dev;
> info->iommu = iommu;
> + RB_CLEAR_NODE(&info->node);
> if (dev_is_pci(dev)) {
> if (ecap_dev_iotlb_support(iommu->ecap) &&
> pci_ats_supported(pdev) &&
Ack. Yes. I'll add it to the next version.
Thanks,
Praan
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 2/5] PCI/ATS: Validate STU for VFs in pci_prepare_ats()
2026-05-25 19:46 ` Nicolin Chen
@ 2026-05-28 20:27 ` Pranjal Shrivastava
0 siblings, 0 replies; 17+ messages in thread
From: Pranjal Shrivastava @ 2026-05-28 20:27 UTC (permalink / raw)
To: Nicolin Chen
Cc: iommu, linux-pci, linux-arm-kernel, linux-kernel, Joerg Roedel,
Will Deacon, Bjorn Helgaas, David Woodhouse, Lu Baolu,
Robin Murphy, Suravee Suthikulpanit, Jason Gunthorpe,
David Matlack, Samiullah Khawaja, Daniel Mentz, Pasha Tatashin,
Mostafa Saleh, Jason Gunthorpe
On Mon, May 25, 2026 at 12:46:53PM -0700, Nicolin Chen wrote:
> On Mon, May 25, 2026 at 06:43:44PM +0000, Pranjal Shrivastava wrote:
> > While every PCI Function that implements ATS has an independent ATS
> > Extended Capability structure with a Read/Write Smallest Translation
> > Unit (STU) field, the kernel manages SR-IOV ATS by requiring the IOMMU
> > driver to configure the STU on the Physical Function (PF) before any
> > any Virtual Functions (VFs) are created.
> >
> > Currently, pci_prepare_ats() bails out early for VFs, assuming that the
> > PF has already been correctly prepared. However, this creates a potential
> > mismatch if a VF is subsequently prepared with a different page shift.
> >
> > Update pci_prepare_ats() to validate that the requested page shift (ps)
> > matches the STU already configured in the associated PF. This ensures
> > early detection of incompatible configurations and maintains the kernel's
> > policy of consistent STU sizing across all functions associated with a
> > given SMMU.
> >
> > Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> > Reviewed-by: Samiullah Khawaja <skhawaja@google.com>
> > Signed-off-by: Pranjal Shrivastava <praan@google.com>
>
> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
>
> Nit:
>
> > @@ -73,8 +73,13 @@ int pci_prepare_ats(struct pci_dev *dev, int ps)
> > if (ps < PCI_ATS_MIN_STU)
> > return -EINVAL;
> >
> > - if (dev->is_virtfn)
> > + if (dev->is_virtfn) {
> > + struct pci_dev *pdev = pci_physfn(dev);
> > +
> > + if (pdev->ats_stu != ps)
> > + return -EINVAL;
> > return 0;
> > + }
>
> pdev doesn't seem useful. Dropping it reads better to me:
> if (pci_physfn(dev)->ats_stu != ps)
> return -EINVAL;
Ahh, I missed this nit in the v5 I just posted. I'll surely address this
in the next revision.
Thanks,
Praan
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2026-05-28 20:28 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-25 18:43 [PATCH v4 0/5] iommu: Standardize ATS robustness and state tracking Pranjal Shrivastava
2026-05-25 18:43 ` [PATCH v4 1/5] PCI/ATS: Ensure pci_ats_supported() is PF-aware for VFs Pranjal Shrivastava
2026-05-25 19:42 ` Nicolin Chen
2026-05-25 18:43 ` [PATCH v4 2/5] PCI/ATS: Validate STU for VFs in pci_prepare_ats() Pranjal Shrivastava
2026-05-25 19:46 ` Nicolin Chen
2026-05-28 20:27 ` Pranjal Shrivastava
2026-05-25 18:43 ` [PATCH v4 3/5] iommu/arm-smmu-v3: Fix ATS state tracking Pranjal Shrivastava
2026-05-25 20:05 ` Nicolin Chen
2026-05-25 20:30 ` Pranjal Shrivastava
2026-05-25 20:40 ` Nicolin Chen
2026-05-25 20:24 ` sashiko-bot
2026-05-25 18:43 ` [PATCH v4 4/5] iommu/vt-d: Fail probe on ATS configuration failure Pranjal Shrivastava
2026-05-25 20:56 ` sashiko-bot
2026-05-28 5:19 ` Baolu Lu
2026-05-28 17:36 ` Pranjal Shrivastava
2026-05-25 18:43 ` [PATCH v4 5/5] iommu/amd: " Pranjal Shrivastava
2026-05-25 21:35 ` sashiko-bot
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox