* [PATCH v4 0/7] Disable ATS via iommu during PCI resets
From: Nicolin Chen @ 2025-08-31 23:31 UTC (permalink / raw)
To: joro, jgg, bhelgaas
Cc: suravee.suthikulpanit, will, robin.murphy, sven, j, alyssa, neal,
robin.clark, m.szyprowski, krzk, alim.akhtar, dwmw2, baolu.lu,
kevin.tian, yong.wu, matthias.bgg, angelogioacchino.delregno,
tjeznach, paul.walmsley, palmer, aou, alex, heiko, schnelle,
mjrosato, gerald.schaefer, orsonzhai, baolin.wang, zhang.lyra,
wens, jernej.skrabec, samuel, jean-philippe, rafael, lenb,
yi.l.liu, cwabbott0, quic_pbrahma, iommu, linux-kernel, asahi,
linux-arm-kernel, linux-arm-msm, linux-samsung-soc,
linux-mediatek, linux-riscv, linux-rockchip, linux-s390,
linux-sunxi, linux-tegra, virtualization, linux-acpi, linux-pci,
patches, vsethi, helgaas, etzhao1900
Hi all,
PCIe permits a device to ignore ATS invalidation TLPs while it is processing
a reset. This creates a problem visible to the OS: an ATS invalidation
command can time out. For example, an SVA domain has no coordination with a
reset event and can race with it, issuing ATS invalidations to a resetting
device.
The OS should mitigate this, as we do not want production systems to report
critical ATS failures, especially in a hypervisor environment. Broadly, the
OS could arrange to ignore the timeouts, block page table mutations to
prevent invalidations, or disable and block ATS.
The PCIe spec, in the sec 10.3.1 IMPLEMENTATION NOTE, recommends disabling
and blocking ATS before initiating a Function Level Reset. It also mentions
that other reset methods could have the same vulnerability.
Provide a callback from the PCI subsystem that will enclose the reset and
have the iommu core temporarily change domains to group->blocking_domain,
so that IOMMU drivers fence any incoming ATS queries, synchronously stop
issuing new ATS invalidations, and wait for existing ATS invalidations to
complete. Doing this avoids any ATS invalidation timeouts.
When a device is resetting, any new domain attachment has to be rejected
until the reset is finished, to prevent ATS from being re-enabled between
the two callback functions. Introduce a new pending_reset flag to reject a
concurrent __iommu_attach_device/set_group_pasid().
Finally, apply these iommu_dev_reset_prepare/done() functions in the PCI
reset functions.
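As a rough illustration of the bracket described above, the following
userspace sketch models a single device whose recorded domain is left
untouched while the hardware is parked on a blocking domain for the
duration of a reset. Every name here (model_gdev, model_reset_prepare, and
so on) is hypothetical, and the -EBUSY return value is an assumption; this
is not the code added by the series.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's real structures. */
struct model_domain {
	const char *name;
};

struct model_gdev {
	struct model_domain *group_domain; /* what the core records as attached */
	struct model_domain *hw_domain;    /* what the hardware currently uses */
	bool pending_reset;                /* set between prepare and done */
};

struct model_domain model_blocking = { "blocking" };

/* Called before the reset: park the device on the blocking domain. */
int model_reset_prepare(struct model_gdev *gdev)
{
	if (gdev->pending_reset)
		return -EBUSY; /* assumption: nested resets are refused */

	gdev->hw_domain = &model_blocking; /* group_domain is left unchanged */
	gdev->pending_reset = true;
	return 0;
}

/* A concurrent attach is rejected while the reset is pending. */
int model_attach(struct model_gdev *gdev, struct model_domain *new_domain)
{
	if (gdev->pending_reset)
		return -EBUSY; /* assumption: the real error code may differ */

	gdev->group_domain = new_domain;
	gdev->hw_domain = new_domain;
	return 0;
}

/* Called after the reset: restore the domain the core still records. */
void model_reset_done(struct model_gdev *gdev)
{
	assert(gdev->pending_reset);
	gdev->hw_domain = gdev->group_domain;
	gdev->pending_reset = false;
}
```

A reset path would call model_reset_prepare(), perform the reset, and then
call model_reset_done(); any model_attach() issued in between fails instead
of re-enabling translation on the resetting device.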
Note that this series doesn't work well for a resetting alias device or an
SRIOV PF, so skip these two corner cases. There is nothing we can do for
alias devices since they share the same RID. For an SRIOV PF, its VFs would
need to be blocked as well, and new dryrun attach_dev/set_group_pasid ops
will be required to allow a compatible domain to be cached concurrently.
Some future followups after this series:
- A pair of dryrun testing ops for attach_dev/set_dev_pasid to make sure
no incompatible attempt will be given to attach_dev/set_dev_pasid().
- Stage all VFs to the blocked domain as well, if their PF is resetting.
- Clean up all unlocked iommu_get_domain_for_dev() calls, which raise UAF
concerns. Replace them with safer alternative APIs.
This is on Github:
https://github.com/nicolinc/iommufd/commits/iommu_dev_reset-v4
Changelog
v4
* Add Reviewed-by from Baolu
* [iommu] Use guard(mutex)
* [iommu] Fix typos and revise wording in kdocs
* [iommu] Skip two corner cases (alias and SRIOV)
* [iommu] Rework attach_dev to pass in old domain pointer
* [iommu] Reject concurrent attach_dev/set_dev_pasid over compatibility
concerns
* [smmuv3] Drop the old_domain dependency in its release_dev callback
* [pci] Add pci_reset_iommu_prepare/_done() wrappers checking ATS cap
v3
https://lore.kernel.org/all/cover.1754952762.git.nicolinc@nvidia.com/
* Add Reviewed-by from Jason
* [iommu] Add a fast return in iommu_deferred_attach()
* [iommu] Update kdocs, inline comments, and commit logs
* [iommu] Use group->blocking_domain vs. ops->blocked_domain
* [iommu] Drop require_direct, iommu_group_get(), and xa_lock()
* [iommu] Set the pending_reset flag after RID/PASID domain setups
* [iommu] Do not bypass PASID domains when RID domain is already the
blocking_domain
* [iommu] Add iommu_get_domain_for_dev_locked to correctly return the
blocking_domain
v2
https://lore.kernel.org/all/cover.1751096303.git.nicolinc@nvidia.com/
* [iommu] Update kdocs, inline comments, and commit logs
* [iommu] Replace long-holding group->mutex with a pending_reset flag
* [pci] Abort reset routines if iommu_dev_reset_prepare() fails
* [pci] Apply the same vulnerability fix to other reset functions
v1
https://lore.kernel.org/all/cover.1749494161.git.nicolinc@nvidia.com/
Thanks
Nicolin
Nicolin Chen (7):
iommu/arm-smmu-v3: Add release_domain to attach prior to release_dev()
iommu: Lock group->mutex in iommu_deferred_attach()
iommu: Pass in gdev to __iommu_device_set_domain
iommu: Pass in old domain to attach_dev callback functions
iommu: Add iommu_get_domain_for_dev_locked() helper
iommu: Introduce iommu_dev_reset_prepare() and iommu_dev_reset_done()
pci: Suspend iommu function prior to resetting a device
drivers/pci/pci.h | 2 +
include/linux/iommu.h | 16 +-
drivers/iommu/amd/iommu.c | 11 +-
drivers/iommu/apple-dart.c | 9 +-
.../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 5 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 66 +++--
drivers/iommu/arm/arm-smmu/arm-smmu.c | 9 +-
drivers/iommu/arm/arm-smmu/qcom_iommu.c | 11 +-
drivers/iommu/dma-iommu.c | 2 +-
drivers/iommu/exynos-iommu.c | 6 +-
drivers/iommu/fsl_pamu_domain.c | 12 +-
drivers/iommu/intel/iommu.c | 10 +-
drivers/iommu/intel/nested.c | 2 +-
drivers/iommu/iommu.c | 256 +++++++++++++++++-
drivers/iommu/iommufd/selftest.c | 2 +-
drivers/iommu/ipmmu-vmsa.c | 10 +-
drivers/iommu/msm_iommu.c | 8 +-
drivers/iommu/mtk_iommu.c | 8 +-
drivers/iommu/mtk_iommu_v1.c | 7 +-
drivers/iommu/omap-iommu.c | 12 +-
drivers/iommu/riscv/iommu.c | 9 +-
drivers/iommu/rockchip-iommu.c | 20 +-
drivers/iommu/s390-iommu.c | 9 +-
drivers/iommu/sprd-iommu.c | 3 +-
drivers/iommu/sun50i-iommu.c | 8 +-
drivers/iommu/tegra-smmu.c | 10 +-
drivers/iommu/virtio-iommu.c | 6 +-
drivers/pci/pci-acpi.c | 12 +-
drivers/pci/pci.c | 68 ++++-
drivers/pci/quirks.c | 18 +-
30 files changed, 509 insertions(+), 118 deletions(-)
--
2.43.0
* [PATCH v4 1/7] iommu/arm-smmu-v3: Add release_domain to attach prior to release_dev()
From: Nicolin Chen @ 2025-08-31 23:31 UTC
The iommu_get_domain_for_dev() helper will be reworked to check a per-gdev
flag, so it will need to hold the group->mutex. This will cause trouble for
existing attach_dev callback functions that call the helper to get the
currently attached old domain, since group->mutex is already held in an
attach_dev context.
To address this, step one is to pass in the attached "old" domain pointer
to the attach_dev op, similar to the set_dev_pasid op.
However, the release_dev op is tricky in the iommu core, because it could
be invoked when the group isn't allocated, i.e. there is no way of
guaranteeing that the group->mutex is held. Thus, the release_dev callback
function, arm_smmu_release_device() here, would not be able to do any
attachment.
Add a release_domain, moving the attach from arm_smmu_release_device() to
iommu_deinit_device() in the core, so that arm_smmu_release_device() will
not need to worry about the group->mutex.
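The ordering this change relies on can be sketched in userspace as below:
the core first attaches the driver's release domain, and only then calls
the release callback, which is left with private teardown. All names
(model_ops, model_deinit_device, the model_ste enum) are hypothetical
simplifications, and the real core does more than this sketch, including
its own locking decisions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_dev;

/* Hypothetical, simplified domain: a single attach callback. */
struct model_domain {
	int (*attach_dev)(struct model_domain *domain, struct model_dev *dev);
};

/* Hypothetical, simplified driver ops. */
struct model_ops {
	struct model_domain *release_domain; /* optional */
	void (*release_dev)(struct model_dev *dev);
};

enum model_ste { MODEL_STE_TRANSLATE, MODEL_STE_BYPASS, MODEL_STE_ABORT };

struct model_dev {
	const struct model_ops *ops;
	bool require_direct;
	enum model_ste ste; /* stand-in for the stream table entry */
	bool released;
};

/* Restore the boot-time entry: bypass if direct mappings are required. */
int model_attach_release(struct model_domain *domain, struct model_dev *dev)
{
	(void)domain;
	dev->ste = dev->require_direct ? MODEL_STE_BYPASS : MODEL_STE_ABORT;
	return 0;
}

/* Driver-private teardown only; it no longer performs any attach. */
void model_release_dev(struct model_dev *dev)
{
	assert(dev->ste != MODEL_STE_TRANSLATE); /* the core attached first */
	dev->released = true;
}

struct model_domain model_release_domain = {
	.attach_dev = model_attach_release,
};

const struct model_ops model_driver_ops = {
	.release_domain = &model_release_domain,
	.release_dev = model_release_dev,
};

/* Core side: attach the release domain, then call the release callback. */
void model_deinit_device(struct model_dev *dev)
{
	const struct model_ops *ops = dev->ops;

	if (ops->release_domain)
		ops->release_domain->attach_dev(ops->release_domain, dev);
	if (ops->release_dev)
		ops->release_dev(dev);
}
```

Keeping the attach in the core means the driver callback never needs to
know whether a group, and therefore a group mutex, exists at release time.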
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 34 ++++++++++++++++-----
1 file changed, 26 insertions(+), 8 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 5968043ac8023..1a21d1a2dd454 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3291,6 +3291,31 @@ static struct iommu_domain arm_smmu_blocked_domain = {
.ops = &arm_smmu_blocked_ops,
};
+static int arm_smmu_attach_dev_release(struct iommu_domain *domain,
+ struct device *dev)
+{
+ struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+
+ WARN_ON(master->iopf_refcount);
+
+ /* Put the STE back to what arm_smmu_init_strtab() sets */
+ if (dev->iommu->require_direct)
+ arm_smmu_attach_dev_identity(&arm_smmu_identity_domain, dev);
+ else
+ arm_smmu_attach_dev_blocked(&arm_smmu_blocked_domain, dev);
+
+ return 0;
+}
+
+static const struct iommu_domain_ops arm_smmu_release_ops = {
+ .attach_dev = arm_smmu_attach_dev_release,
+};
+
+static struct iommu_domain arm_smmu_release_domain = {
+ .type = IOMMU_DOMAIN_BLOCKED,
+ .ops = &arm_smmu_release_ops,
+};
+
static struct iommu_domain *
arm_smmu_domain_alloc_paging_flags(struct device *dev, u32 flags,
const struct iommu_user_data *user_data)
@@ -3580,14 +3605,6 @@ static void arm_smmu_release_device(struct device *dev)
{
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
- WARN_ON(master->iopf_refcount);
-
- /* Put the STE back to what arm_smmu_init_strtab() sets */
- if (dev->iommu->require_direct)
- arm_smmu_attach_dev_identity(&arm_smmu_identity_domain, dev);
- else
- arm_smmu_attach_dev_blocked(&arm_smmu_blocked_domain, dev);
-
arm_smmu_disable_pasid(master);
arm_smmu_remove_master(master);
if (arm_smmu_cdtab_allocated(&master->cd_table))
@@ -3678,6 +3695,7 @@ static int arm_smmu_def_domain_type(struct device *dev)
static const struct iommu_ops arm_smmu_ops = {
.identity_domain = &arm_smmu_identity_domain,
.blocked_domain = &arm_smmu_blocked_domain,
+ .release_domain = &arm_smmu_release_domain,
.capable = arm_smmu_capable,
.hw_info = arm_smmu_hw_info,
.domain_alloc_sva = arm_smmu_sva_domain_alloc,
--
2.43.0
* [PATCH v4 2/7] iommu: Lock group->mutex in iommu_deferred_attach()
From: Nicolin Chen @ 2025-08-31 23:31 UTC
The iommu_deferred_attach() function invokes __iommu_attach_device() without
holding the group->mutex, unlike other __iommu_attach_device() callers.
Though no practical bug has been triggered so far, it would be better to
apply the same locking to this __iommu_attach_device() call, since IOMMU
drivers nowadays are more aware of the group->mutex: some of them use the
iommu_group_mutex_assert() function, which could be in the path of an
attach_dev callback function invoked by __iommu_attach_device().
The iommu_deferred_attach() will soon need to verify a new flag stored in
the struct group_device. Iterating the gdev list requires holding the
group->mutex as well.
So, grab the mutex to guard __iommu_attach_device() like other callers.
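For readers unfamiliar with scope-based guards, the sketch below imitates
the pattern in userspace with the GCC/Clang cleanup attribute: an unlocked
fast-path check, then a guard that releases the lock on every return path.
The toy model_mutex is single-threaded and exists only to make the
lock/unlock pairing observable; MODEL_GUARD and the other names are
hypothetical and are not the kernel's guard() implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-threaded lock, only to observe lock/unlock pairing. */
struct model_mutex {
	bool locked;
};

void model_mutex_lock(struct model_mutex *m)
{
	assert(!m->locked);
	m->locked = true;
}

void model_mutex_unlock(struct model_mutex *m)
{
	assert(m->locked);
	m->locked = false;
}

/* Cleanup handlers receive a pointer to the annotated variable. */
void model_guard_release(struct model_mutex **m)
{
	model_mutex_unlock(*m);
}

static inline struct model_mutex *model_guard_acquire(struct model_mutex *m)
{
	model_mutex_lock(m);
	return m;
}

/* Lock now, unlock automatically when the enclosing scope is left. */
#define MODEL_GUARD(m)                                                  \
	struct model_mutex *model_scope_guard                           \
		__attribute__((cleanup(model_guard_release), unused)) = \
		model_guard_acquire(m)

struct model_dev {
	struct model_mutex *group_mutex;
	bool attach_deferred;
	int attached; /* counts attach calls, for illustration */
};

/* Stand-in for an attach that expects the group lock to be held. */
int model_attach_locked(struct model_dev *dev)
{
	assert(dev->group_mutex->locked);
	dev->attached++;
	return 0;
}

int model_deferred_attach(struct model_dev *dev)
{
	/* Fast path: nothing deferred, so do not touch the lock at all. */
	if (!dev->attach_deferred)
		return 0;

	MODEL_GUARD(dev->group_mutex);

	return model_attach_locked(dev);
}
```

The return expression is evaluated while the lock is still held; the
cleanup handler runs afterwards, as the function's scope is exited.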
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/iommu.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 060ebe330ee16..1e0116bce0762 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2144,10 +2144,17 @@ EXPORT_SYMBOL_GPL(iommu_attach_device);
int iommu_deferred_attach(struct device *dev, struct iommu_domain *domain)
{
- if (dev->iommu && dev->iommu->attach_deferred)
- return __iommu_attach_device(domain, dev);
+ /*
+ * This is called on the dma mapping fast path so avoid locking. This is
+ * racy, but we have an expectation that the driver will setup its DMAs
+ * inside probe while being single threaded to avoid racing.
+ */
+ if (!dev->iommu || !dev->iommu->attach_deferred)
+ return 0;
- return 0;
+ guard(mutex)(&dev->iommu_group->mutex);
+
+ return __iommu_attach_device(domain, dev);
}
void iommu_detach_device(struct iommu_domain *domain, struct device *dev)
--
2.43.0
* [PATCH v4 3/7] iommu: Pass in gdev to __iommu_device_set_domain
From: Nicolin Chen @ 2025-08-31 23:31 UTC
The device under reset will be attached to a blocked domain without
updating the group->domain pointer. So there needs to be a per-device flag
indicating the reset state, which other iommu core functions can check so
as not to change the attached domain during the reset.
The regular device pointer can't store any private iommu flag. So the flag
has to be in the gdev structure.
Pass the gdev pointer, instead of the device pointer, to the functions that
will check that per-device flag.
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/iommu.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 1e0116bce0762..e6a66dacce1b8 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -112,7 +112,7 @@ enum {
};
static int __iommu_device_set_domain(struct iommu_group *group,
- struct device *dev,
+ struct group_device *gdev,
struct iommu_domain *new_domain,
unsigned int flags);
static int __iommu_group_set_domain_internal(struct iommu_group *group,
@@ -602,7 +602,7 @@ static int __iommu_probe_device(struct device *dev, struct list_head *group_list
if (group->default_domain)
iommu_create_device_direct_mappings(group->default_domain, dev);
if (group->domain) {
- ret = __iommu_device_set_domain(group, dev, group->domain, 0);
+ ret = __iommu_device_set_domain(group, gdev, group->domain, 0);
if (ret)
goto err_remove_gdev;
} else if (!group->default_domain && !group_list) {
@@ -2263,10 +2263,11 @@ int iommu_attach_group(struct iommu_domain *domain, struct iommu_group *group)
EXPORT_SYMBOL_GPL(iommu_attach_group);
static int __iommu_device_set_domain(struct iommu_group *group,
- struct device *dev,
+ struct group_device *gdev,
struct iommu_domain *new_domain,
unsigned int flags)
{
+ struct device *dev = gdev->dev;
int ret;
/*
@@ -2346,8 +2347,7 @@ static int __iommu_group_set_domain_internal(struct iommu_group *group,
*/
result = 0;
for_each_group_device(group, gdev) {
- ret = __iommu_device_set_domain(group, gdev->dev, new_domain,
- flags);
+ ret = __iommu_device_set_domain(group, gdev, new_domain, flags);
if (ret) {
result = ret;
/*
@@ -2379,7 +2379,7 @@ static int __iommu_group_set_domain_internal(struct iommu_group *group,
*/
if (group->domain)
WARN_ON(__iommu_device_set_domain(
- group, gdev->dev, group->domain,
+ group, gdev, group->domain,
IOMMU_SET_DOMAIN_MUST_SUCCEED));
if (gdev == last_gdev)
break;
--
2.43.0
* [PATCH v4 4/7] iommu: Pass in old domain to attach_dev callback functions
From: Nicolin Chen @ 2025-08-31 23:31 UTC
The IOMMU core attaches each device to a default domain on probe(). Then,
every new "attach" operation fundamentally has a two-fold meaning:
- detach from its currently attached (old) domain
- attach to a given new domain
Modern IOMMU drivers following this pattern usually want to clean up the
things related to the old domain, so they call iommu_get_domain_for_dev()
to fetch the old domain.
Pass in the old domain pointer from the core to drivers, aligning with the
set_dev_pasid op, which already takes it.
Ensure all low-level attach functions in the core can forward the correct
old domain pointer. Thus, rework those functions as well.
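The calling convention can be illustrated with a small userspace model: the
core remembers which domain a device is attached to and hands it to the
driver callback, so the driver can drop its old-domain bookkeeping without
calling back into the core. The struct and function names are hypothetical,
and the per-domain user counter is just an example of driver-side cleanup.

```c
#include <assert.h>
#include <stddef.h>

struct model_dev;
struct model_domain;

/* Hypothetical ops table mirroring the new three-argument attach. */
struct model_domain_ops {
	int (*attach_dev)(struct model_domain *domain, struct model_dev *dev,
			  struct model_domain *old);
};

struct model_domain {
	const struct model_domain_ops *ops;
	int users; /* driver-side bookkeeping, for illustration */
};

struct model_dev {
	struct model_domain *domain; /* tracked by the core */
};

/* Driver side: clean up the old domain using the pointer it is given. */
int model_driver_attach(struct model_domain *domain, struct model_dev *dev,
			struct model_domain *old)
{
	(void)dev;
	if (old) {
		assert(old->users > 0);
		old->users--;
	}
	domain->users++;
	return 0;
}

const struct model_domain_ops model_driver_ops = {
	.attach_dev = model_driver_attach,
};

/* Core side: forward the currently attached domain as "old". */
int model_core_attach(struct model_dev *dev, struct model_domain *new_domain)
{
	struct model_domain *old = dev->domain;
	int ret;

	if (old == new_domain)
		return 0;

	ret = new_domain->ops->attach_dev(new_domain, dev, old);
	if (ret)
		return ret;

	dev->domain = new_domain;
	return 0;
}
```

Because the core is the single source of truth for the old pointer, the
driver callback needs no lookup helper and therefore no extra locking.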
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
include/linux/iommu.h | 3 +-
drivers/iommu/amd/iommu.c | 11 ++++---
drivers/iommu/apple-dart.c | 9 +++--
.../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 5 +--
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 33 ++++++++++++-------
drivers/iommu/arm/arm-smmu/arm-smmu.c | 9 +++--
drivers/iommu/arm/arm-smmu/qcom_iommu.c | 11 ++++---
drivers/iommu/exynos-iommu.c | 6 ++--
drivers/iommu/fsl_pamu_domain.c | 12 +++----
drivers/iommu/intel/iommu.c | 10 ++++--
drivers/iommu/intel/nested.c | 2 +-
drivers/iommu/iommu.c | 26 +++++++++------
drivers/iommu/iommufd/selftest.c | 2 +-
drivers/iommu/ipmmu-vmsa.c | 10 +++---
drivers/iommu/msm_iommu.c | 8 ++---
drivers/iommu/mtk_iommu.c | 8 ++---
drivers/iommu/mtk_iommu_v1.c | 7 ++--
drivers/iommu/omap-iommu.c | 12 +++----
drivers/iommu/riscv/iommu.c | 9 +++--
drivers/iommu/rockchip-iommu.c | 20 ++++++++---
drivers/iommu/s390-iommu.c | 9 +++--
drivers/iommu/sprd-iommu.c | 3 +-
drivers/iommu/sun50i-iommu.c | 8 +++--
drivers/iommu/tegra-smmu.c | 10 +++---
drivers/iommu/virtio-iommu.c | 6 ++--
25 files changed, 152 insertions(+), 97 deletions(-)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index c30d12e16473d..801b2bd9e8d49 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -751,7 +751,8 @@ struct iommu_ops {
* @free: Release the domain after use.
*/
struct iommu_domain_ops {
- int (*attach_dev)(struct iommu_domain *domain, struct device *dev);
+ int (*attach_dev)(struct iommu_domain *domain, struct device *dev,
+ struct iommu_domain *old);
int (*set_dev_pasid)(struct iommu_domain *domain, struct device *dev,
ioasid_t pasid, struct iommu_domain *old);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index eb348c63a8d09..8a18a3bfa5a2d 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -69,8 +69,8 @@ int amd_iommu_max_glx_val = -1;
*/
DEFINE_IDA(pdom_ids);
-static int amd_iommu_attach_device(struct iommu_domain *dom,
- struct device *dev);
+static int amd_iommu_attach_device(struct iommu_domain *dom, struct device *dev,
+ struct iommu_domain *old);
static void set_dte_entry(struct amd_iommu *iommu,
struct iommu_dev_data *dev_data);
@@ -2634,7 +2634,8 @@ void amd_iommu_domain_free(struct iommu_domain *dom)
}
static int blocked_domain_attach_device(struct iommu_domain *domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
@@ -2692,8 +2693,8 @@ static struct iommu_domain release_domain = {
}
};
-static int amd_iommu_attach_device(struct iommu_domain *dom,
- struct device *dev)
+static int amd_iommu_attach_device(struct iommu_domain *dom, struct device *dev,
+ struct iommu_domain *old)
{
struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
struct protection_domain *domain = to_pdomain(dom);
diff --git a/drivers/iommu/apple-dart.c b/drivers/iommu/apple-dart.c
index 190f28d766151..88648e051e783 100644
--- a/drivers/iommu/apple-dart.c
+++ b/drivers/iommu/apple-dart.c
@@ -660,7 +660,8 @@ static int apple_dart_domain_add_streams(struct apple_dart_domain *domain,
}
static int apple_dart_attach_dev_paging(struct iommu_domain *domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
int ret, i;
struct apple_dart_stream_map *stream_map;
@@ -681,7 +682,8 @@ static int apple_dart_attach_dev_paging(struct iommu_domain *domain,
}
static int apple_dart_attach_dev_identity(struct iommu_domain *domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
struct apple_dart_stream_map *stream_map;
@@ -705,7 +707,8 @@ static struct iommu_domain apple_dart_identity_domain = {
};
static int apple_dart_attach_dev_blocked(struct iommu_domain *domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
struct apple_dart_stream_map *stream_map;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
index 8cd8929bbfdf8..313201a616991 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
@@ -138,14 +138,15 @@ void arm_smmu_master_clear_vmaster(struct arm_smmu_master *master)
}
static int arm_smmu_attach_dev_nested(struct iommu_domain *domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old_domain)
{
struct arm_smmu_nested_domain *nested_domain =
to_smmu_nested_domain(domain);
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
struct arm_smmu_attach_state state = {
.master = master,
- .old_domain = iommu_get_domain_for_dev(dev),
+ .old_domain = old_domain,
.ssid = IOMMU_NO_PASID,
};
struct arm_smmu_ste ste;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 1a21d1a2dd454..de02eeb524c15 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3002,7 +3002,8 @@ void arm_smmu_attach_commit(struct arm_smmu_attach_state *state)
arm_smmu_remove_master_domain(master, state->old_domain, state->ssid);
}
-static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
+static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev,
+ struct iommu_domain *old_domain)
{
int ret = 0;
struct arm_smmu_ste target;
@@ -3010,7 +3011,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
struct arm_smmu_device *smmu;
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
struct arm_smmu_attach_state state = {
- .old_domain = iommu_get_domain_for_dev(dev),
+ .old_domain = old_domain,
.ssid = IOMMU_NO_PASID,
};
struct arm_smmu_master *master;
@@ -3186,7 +3187,7 @@ static int arm_smmu_blocking_set_dev_pasid(struct iommu_domain *new_domain,
/*
* When the last user of the CD table goes away downgrade the STE back
- * to a non-cd_table one.
+ * to a non-cd_table one, by re-attaching its sid_domain.
*/
if (!arm_smmu_ssids_in_use(&master->cd_table)) {
struct iommu_domain *sid_domain =
@@ -3194,12 +3195,14 @@ static int arm_smmu_blocking_set_dev_pasid(struct iommu_domain *new_domain,
if (sid_domain->type == IOMMU_DOMAIN_IDENTITY ||
sid_domain->type == IOMMU_DOMAIN_BLOCKED)
- sid_domain->ops->attach_dev(sid_domain, dev);
+ sid_domain->ops->attach_dev(sid_domain, dev,
+ sid_domain);
}
return 0;
}
static void arm_smmu_attach_dev_ste(struct iommu_domain *domain,
+ struct iommu_domain *old_domain,
struct device *dev,
struct arm_smmu_ste *ste,
unsigned int s1dss)
@@ -3207,7 +3210,7 @@ static void arm_smmu_attach_dev_ste(struct iommu_domain *domain,
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
struct arm_smmu_attach_state state = {
.master = master,
- .old_domain = iommu_get_domain_for_dev(dev),
+ .old_domain = old_domain,
.ssid = IOMMU_NO_PASID,
};
@@ -3248,14 +3251,16 @@ static void arm_smmu_attach_dev_ste(struct iommu_domain *domain,
}
static int arm_smmu_attach_dev_identity(struct iommu_domain *domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old_domain)
{
struct arm_smmu_ste ste;
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
arm_smmu_master_clear_vmaster(master);
arm_smmu_make_bypass_ste(master->smmu, &ste);
- arm_smmu_attach_dev_ste(domain, dev, &ste, STRTAB_STE_1_S1DSS_BYPASS);
+ arm_smmu_attach_dev_ste(domain, old_domain, dev, &ste,
+ STRTAB_STE_1_S1DSS_BYPASS);
return 0;
}
@@ -3269,14 +3274,15 @@ static struct iommu_domain arm_smmu_identity_domain = {
};
static int arm_smmu_attach_dev_blocked(struct iommu_domain *domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old_domain)
{
struct arm_smmu_ste ste;
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
arm_smmu_master_clear_vmaster(master);
arm_smmu_make_abort_ste(&ste);
- arm_smmu_attach_dev_ste(domain, dev, &ste,
+ arm_smmu_attach_dev_ste(domain, old_domain, dev, &ste,
STRTAB_STE_1_S1DSS_TERMINATE);
return 0;
}
@@ -3292,7 +3298,8 @@ static struct iommu_domain arm_smmu_blocked_domain = {
};
static int arm_smmu_attach_dev_release(struct iommu_domain *domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old_domain)
{
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
@@ -3300,9 +3307,11 @@ static int arm_smmu_attach_dev_release(struct iommu_domain *domain,
/* Put the STE back to what arm_smmu_init_strtab() sets */
if (dev->iommu->require_direct)
- arm_smmu_attach_dev_identity(&arm_smmu_identity_domain, dev);
+ arm_smmu_attach_dev_identity(&arm_smmu_identity_domain, dev,
+ old_domain);
else
- arm_smmu_attach_dev_blocked(&arm_smmu_blocked_domain, dev);
+ arm_smmu_attach_dev_blocked(&arm_smmu_blocked_domain, dev,
+ old_domain);
return 0;
}
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 4ced4b5bee4df..5e690cf85ec96 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1165,7 +1165,8 @@ static void arm_smmu_master_install_s2crs(struct arm_smmu_master_cfg *cfg,
}
}
-static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
+static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev,
+ struct iommu_domain *old)
{
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
@@ -1234,7 +1235,8 @@ static int arm_smmu_attach_dev_type(struct device *dev,
}
static int arm_smmu_attach_dev_identity(struct iommu_domain *domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
return arm_smmu_attach_dev_type(dev, S2CR_TYPE_BYPASS);
}
@@ -1249,7 +1251,8 @@ static struct iommu_domain arm_smmu_identity_domain = {
};
static int arm_smmu_attach_dev_blocked(struct iommu_domain *domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
return arm_smmu_attach_dev_type(dev, S2CR_TYPE_FAULT);
}
diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
index c5be95e560317..9222a4a48bb33 100644
--- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c
+++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
@@ -359,7 +359,8 @@ static void qcom_iommu_domain_free(struct iommu_domain *domain)
kfree(qcom_domain);
}
-static int qcom_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
+static int qcom_iommu_attach_dev(struct iommu_domain *domain,
+ struct device *dev, struct iommu_domain *old)
{
struct qcom_iommu_dev *qcom_iommu = dev_iommu_priv_get(dev);
struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
@@ -388,18 +389,18 @@ static int qcom_iommu_attach_dev(struct iommu_domain *domain, struct device *dev
}
static int qcom_iommu_identity_attach(struct iommu_domain *identity_domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
- struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
struct qcom_iommu_domain *qcom_domain;
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct qcom_iommu_dev *qcom_iommu = dev_iommu_priv_get(dev);
unsigned int i;
- if (domain == identity_domain || !domain)
+ if (old == identity_domain || !old)
return 0;
- qcom_domain = to_qcom_iommu_domain(domain);
+ qcom_domain = to_qcom_iommu_domain(old);
if (WARN_ON(!qcom_domain->iommu))
return -EINVAL;
diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
index b6edd178fe25e..b30d2bb87fa96 100644
--- a/drivers/iommu/exynos-iommu.c
+++ b/drivers/iommu/exynos-iommu.c
@@ -984,7 +984,8 @@ static void exynos_iommu_domain_free(struct iommu_domain *iommu_domain)
}
static int exynos_iommu_identity_attach(struct iommu_domain *identity_domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
struct exynos_iommu_owner *owner = dev_iommu_priv_get(dev);
struct exynos_iommu_domain *domain;
@@ -1035,7 +1036,8 @@ static struct iommu_domain exynos_identity_domain = {
};
static int exynos_iommu_attach_device(struct iommu_domain *iommu_domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
struct exynos_iommu_domain *domain = to_exynos_domain(iommu_domain);
struct exynos_iommu_owner *owner = dev_iommu_priv_get(dev);
diff --git a/drivers/iommu/fsl_pamu_domain.c b/drivers/iommu/fsl_pamu_domain.c
index 5f08523f97cb9..9664ef9840d2c 100644
--- a/drivers/iommu/fsl_pamu_domain.c
+++ b/drivers/iommu/fsl_pamu_domain.c
@@ -238,7 +238,7 @@ static int update_domain_stash(struct fsl_dma_domain *dma_domain, u32 val)
}
static int fsl_pamu_attach_device(struct iommu_domain *domain,
- struct device *dev)
+ struct device *dev, struct iommu_domain *old)
{
struct fsl_dma_domain *dma_domain = to_fsl_dma_domain(domain);
unsigned long flags;
@@ -298,9 +298,9 @@ static int fsl_pamu_attach_device(struct iommu_domain *domain,
* switches to what looks like BLOCKING.
*/
static int fsl_pamu_platform_attach(struct iommu_domain *platform_domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
- struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
struct fsl_dma_domain *dma_domain;
const u32 *prop;
int len;
@@ -311,11 +311,11 @@ static int fsl_pamu_platform_attach(struct iommu_domain *platform_domain,
* Hack to keep things working as they always have, only leaving an
* UNMANAGED domain makes it BLOCKING.
*/
- if (domain == platform_domain || !domain ||
- domain->type != IOMMU_DOMAIN_UNMANAGED)
+ if (old == platform_domain || !old ||
+ old->type != IOMMU_DOMAIN_UNMANAGED)
return 0;
- dma_domain = to_fsl_dma_domain(domain);
+ dma_domain = to_fsl_dma_domain(old);
/*
* Use LIODN of the PCI controller while detaching a
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 9c3ab9d9f69a3..e9fbe9f6cc6cd 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -3225,7 +3225,8 @@ void device_block_translation(struct device *dev)
}
static int blocking_domain_attach_dev(struct iommu_domain *domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
@@ -3532,7 +3533,8 @@ int paging_domain_compatible(struct iommu_domain *domain, struct device *dev)
}
static int intel_iommu_attach_device(struct iommu_domain *domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
int ret;
@@ -4396,7 +4398,9 @@ static int device_setup_pass_through(struct device *dev)
context_setup_pass_through_cb, dev);
}
-static int identity_domain_attach_dev(struct iommu_domain *domain, struct device *dev)
+static int identity_domain_attach_dev(struct iommu_domain *domain,
+ struct device *dev,
+ struct iommu_domain *old)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
struct intel_iommu *iommu = info->iommu;
diff --git a/drivers/iommu/intel/nested.c b/drivers/iommu/intel/nested.c
index 1b6ad9c900a5a..760d7aa2ade84 100644
--- a/drivers/iommu/intel/nested.c
+++ b/drivers/iommu/intel/nested.c
@@ -19,7 +19,7 @@
#include "pasid.h"
static int intel_nested_attach_dev(struct iommu_domain *domain,
- struct device *dev)
+ struct device *dev, struct iommu_domain *old)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
struct dmar_domain *dmar_domain = to_dmar_domain(domain);
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index e6a66dacce1b8..ef3fd7bd1b553 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -100,7 +100,7 @@ static int iommu_bus_notifier(struct notifier_block *nb,
unsigned long action, void *data);
static void iommu_release_device(struct device *dev);
static int __iommu_attach_device(struct iommu_domain *domain,
- struct device *dev);
+ struct device *dev, struct iommu_domain *old);
static int __iommu_attach_group(struct iommu_domain *domain,
struct iommu_group *group);
static struct iommu_domain *__iommu_paging_domain_alloc_flags(struct device *dev,
@@ -114,6 +114,7 @@ enum {
static int __iommu_device_set_domain(struct iommu_group *group,
struct group_device *gdev,
struct iommu_domain *new_domain,
+ struct iommu_domain *old_domain,
unsigned int flags);
static int __iommu_group_set_domain_internal(struct iommu_group *group,
struct iommu_domain *new_domain,
@@ -517,7 +518,8 @@ static void iommu_deinit_device(struct device *dev)
* should still avoid touching any hardware configuration either.
*/
if (!dev->iommu->attach_deferred && ops->release_domain)
- ops->release_domain->ops->attach_dev(ops->release_domain, dev);
+ ops->release_domain->ops->attach_dev(ops->release_domain, dev,
+ group->domain);
if (ops->release_device)
ops->release_device(dev);
@@ -602,7 +604,8 @@ static int __iommu_probe_device(struct device *dev, struct list_head *group_list
if (group->default_domain)
iommu_create_device_direct_mappings(group->default_domain, dev);
if (group->domain) {
- ret = __iommu_device_set_domain(group, gdev, group->domain, 0);
+ ret = __iommu_device_set_domain(group, gdev, group->domain,
+ NULL, 0);
if (ret)
goto err_remove_gdev;
} else if (!group->default_domain && !group_list) {
@@ -2089,14 +2092,14 @@ static void __iommu_group_set_core_domain(struct iommu_group *group)
}
static int __iommu_attach_device(struct iommu_domain *domain,
- struct device *dev)
+ struct device *dev, struct iommu_domain *old)
{
int ret;
if (unlikely(domain->ops->attach_dev == NULL))
return -ENODEV;
- ret = domain->ops->attach_dev(domain, dev);
+ ret = domain->ops->attach_dev(domain, dev, old);
if (ret)
return ret;
dev->iommu->attach_deferred = 0;
@@ -2154,7 +2157,7 @@ int iommu_deferred_attach(struct device *dev, struct iommu_domain *domain)
guard(mutex)(&dev->iommu_group->mutex);
- return __iommu_attach_device(domain, dev);
+ return __iommu_attach_device(domain, dev, NULL);
}
void iommu_detach_device(struct iommu_domain *domain, struct device *dev)
@@ -2265,6 +2268,7 @@ EXPORT_SYMBOL_GPL(iommu_attach_group);
static int __iommu_device_set_domain(struct iommu_group *group,
struct group_device *gdev,
struct iommu_domain *new_domain,
+ struct iommu_domain *old_domain,
unsigned int flags)
{
struct device *dev = gdev->dev;
@@ -2291,7 +2295,7 @@ static int __iommu_device_set_domain(struct iommu_group *group,
dev->iommu->attach_deferred = 0;
}
- ret = __iommu_attach_device(new_domain, dev);
+ ret = __iommu_attach_device(new_domain, dev, old_domain);
if (ret) {
/*
* If we have a blocking domain then try to attach that in hopes
@@ -2301,7 +2305,8 @@ static int __iommu_device_set_domain(struct iommu_group *group,
if ((flags & IOMMU_SET_DOMAIN_MUST_SUCCEED) &&
group->blocking_domain &&
group->blocking_domain != new_domain)
- __iommu_attach_device(group->blocking_domain, dev);
+ __iommu_attach_device(group->blocking_domain, dev,
+ old_domain);
return ret;
}
return 0;
@@ -2347,7 +2352,8 @@ static int __iommu_group_set_domain_internal(struct iommu_group *group,
*/
result = 0;
for_each_group_device(group, gdev) {
- ret = __iommu_device_set_domain(group, gdev, new_domain, flags);
+ ret = __iommu_device_set_domain(group, gdev, new_domain,
+ group->domain, flags);
if (ret) {
result = ret;
/*
@@ -2379,7 +2385,7 @@ static int __iommu_group_set_domain_internal(struct iommu_group *group,
*/
if (group->domain)
WARN_ON(__iommu_device_set_domain(
- group, gdev, group->domain,
+ group, gdev, group->domain, new_domain,
IOMMU_SET_DOMAIN_MUST_SUCCEED));
if (gdev == last_gdev)
break;
diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c
index 61686603c7693..fa4359bb15e84 100644
--- a/drivers/iommu/iommufd/selftest.c
+++ b/drivers/iommu/iommufd/selftest.c
@@ -216,7 +216,7 @@ static inline struct selftest_obj *to_selftest_obj(struct iommufd_object *obj)
}
static int mock_domain_nop_attach(struct iommu_domain *domain,
- struct device *dev)
+ struct device *dev, struct iommu_domain *old)
{
struct mock_dev *mdev = to_mock_dev(dev);
struct mock_viommu *new_viommu = NULL;
diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index ffa892f657140..6667ecc331f01 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -590,7 +590,7 @@ static void ipmmu_domain_free(struct iommu_domain *io_domain)
}
static int ipmmu_attach_device(struct iommu_domain *io_domain,
- struct device *dev)
+ struct device *dev, struct iommu_domain *old)
{
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct ipmmu_vmsa_device *mmu = to_ipmmu(dev);
@@ -637,17 +637,17 @@ static int ipmmu_attach_device(struct iommu_domain *io_domain,
}
static int ipmmu_iommu_identity_attach(struct iommu_domain *identity_domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
- struct iommu_domain *io_domain = iommu_get_domain_for_dev(dev);
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct ipmmu_vmsa_domain *domain;
unsigned int i;
- if (io_domain == identity_domain || !io_domain)
+ if (old == identity_domain || !old)
return 0;
- domain = to_vmsa_domain(io_domain);
+ domain = to_vmsa_domain(old);
for (i = 0; i < fwspec->num_ids; ++i)
ipmmu_utlb_disable(domain, fwspec->ids[i]);
diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c
index 43a61ba021a51..2629fbd0606d3 100644
--- a/drivers/iommu/msm_iommu.c
+++ b/drivers/iommu/msm_iommu.c
@@ -441,19 +441,19 @@ static int msm_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
}
static int msm_iommu_identity_attach(struct iommu_domain *identity_domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
- struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
struct msm_priv *priv;
unsigned long flags;
struct msm_iommu_dev *iommu;
struct msm_iommu_ctx_dev *master;
int ret = 0;
- if (domain == identity_domain || !domain)
+ if (old == identity_domain || !old)
return 0;
- priv = to_msm_priv(domain);
+ priv = to_msm_priv(old);
free_io_pgtable_ops(priv->iop);
spin_lock_irqsave(&msm_iommu_lock, flags);
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 0e0285348d2b8..9747ef1644138 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -705,7 +705,7 @@ static void mtk_iommu_domain_free(struct iommu_domain *domain)
}
static int mtk_iommu_attach_device(struct iommu_domain *domain,
- struct device *dev)
+ struct device *dev, struct iommu_domain *old)
{
struct mtk_iommu_data *data = dev_iommu_priv_get(dev), *frstdata;
struct mtk_iommu_domain *dom = to_mtk_domain(domain);
@@ -773,12 +773,12 @@ static int mtk_iommu_attach_device(struct iommu_domain *domain,
}
static int mtk_iommu_identity_attach(struct iommu_domain *identity_domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
- struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
struct mtk_iommu_data *data = dev_iommu_priv_get(dev);
- if (domain == identity_domain || !domain)
+ if (old == identity_domain || !old)
return 0;
mtk_iommu_config(data, dev, false, 0);
diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
index 10cc0b1197e80..3b45650263ac3 100644
--- a/drivers/iommu/mtk_iommu_v1.c
+++ b/drivers/iommu/mtk_iommu_v1.c
@@ -303,7 +303,9 @@ static void mtk_iommu_v1_domain_free(struct iommu_domain *domain)
kfree(to_mtk_domain(domain));
}
-static int mtk_iommu_v1_attach_device(struct iommu_domain *domain, struct device *dev)
+static int mtk_iommu_v1_attach_device(struct iommu_domain *domain,
+ struct device *dev,
+ struct iommu_domain *old)
{
struct mtk_iommu_v1_data *data = dev_iommu_priv_get(dev);
struct mtk_iommu_v1_domain *dom = to_mtk_domain(domain);
@@ -329,7 +331,8 @@ static int mtk_iommu_v1_attach_device(struct iommu_domain *domain, struct device
}
static int mtk_iommu_v1_identity_attach(struct iommu_domain *identity_domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
struct mtk_iommu_v1_data *data = dev_iommu_priv_get(dev);
diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c
index 6fb93927bdb98..a38f69debac09 100644
--- a/drivers/iommu/omap-iommu.c
+++ b/drivers/iommu/omap-iommu.c
@@ -1431,8 +1431,8 @@ static void omap_iommu_detach_fini(struct omap_iommu_domain *odomain)
odomain->iommus = NULL;
}
-static int
-omap_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
+static int omap_iommu_attach_dev(struct iommu_domain *domain,
+ struct device *dev, struct iommu_domain *old)
{
struct omap_iommu_arch_data *arch_data = dev_iommu_priv_get(dev);
struct omap_iommu_domain *omap_domain = to_omap_domain(domain);
@@ -1536,15 +1536,15 @@ static void _omap_iommu_detach_dev(struct omap_iommu_domain *omap_domain,
}
static int omap_iommu_identity_attach(struct iommu_domain *identity_domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
- struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
struct omap_iommu_domain *omap_domain;
- if (domain == identity_domain || !domain)
+ if (old == identity_domain || !old)
return 0;
- omap_domain = to_omap_domain(domain);
+ omap_domain = to_omap_domain(old);
spin_lock(&omap_domain->lock);
_omap_iommu_detach_dev(omap_domain, dev);
spin_unlock(&omap_domain->lock);
diff --git a/drivers/iommu/riscv/iommu.c b/drivers/iommu/riscv/iommu.c
index 2d0d31ba28860..9c58413ad641e 100644
--- a/drivers/iommu/riscv/iommu.c
+++ b/drivers/iommu/riscv/iommu.c
@@ -1319,7 +1319,8 @@ static bool riscv_iommu_pt_supported(struct riscv_iommu_device *iommu, int pgd_m
}
static int riscv_iommu_attach_paging_domain(struct iommu_domain *iommu_domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
struct riscv_iommu_domain *domain = iommu_domain_to_riscv(iommu_domain);
struct riscv_iommu_device *iommu = dev_to_iommu(dev);
@@ -1424,7 +1425,8 @@ static struct iommu_domain *riscv_iommu_alloc_paging_domain(struct device *dev)
}
static int riscv_iommu_attach_blocking_domain(struct iommu_domain *iommu_domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
struct riscv_iommu_device *iommu = dev_to_iommu(dev);
struct riscv_iommu_info *info = dev_iommu_priv_get(dev);
@@ -1445,7 +1447,8 @@ static struct iommu_domain riscv_iommu_blocking_domain = {
};
static int riscv_iommu_attach_identity_domain(struct iommu_domain *iommu_domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
struct riscv_iommu_device *iommu = dev_to_iommu(dev);
struct riscv_iommu_info *info = dev_iommu_priv_get(dev);
diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
index 0861dd469bd86..85f3667e797c3 100644
--- a/drivers/iommu/rockchip-iommu.c
+++ b/drivers/iommu/rockchip-iommu.c
@@ -960,7 +960,8 @@ static int rk_iommu_enable(struct rk_iommu *iommu)
}
static int rk_iommu_identity_attach(struct iommu_domain *identity_domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
struct rk_iommu *iommu;
struct rk_iommu_domain *rk_domain;
@@ -1005,7 +1006,7 @@ static struct iommu_domain rk_identity_domain = {
};
static int rk_iommu_attach_device(struct iommu_domain *domain,
- struct device *dev)
+ struct device *dev, struct iommu_domain *old)
{
struct rk_iommu *iommu;
struct rk_iommu_domain *rk_domain = to_rk_domain(domain);
@@ -1026,7 +1027,7 @@ static int rk_iommu_attach_device(struct iommu_domain *domain,
if (iommu->domain == domain)
return 0;
- ret = rk_iommu_identity_attach(&rk_identity_domain, dev);
+ ret = rk_iommu_identity_attach(&rk_identity_domain, dev, old);
if (ret)
return ret;
@@ -1041,8 +1042,17 @@ static int rk_iommu_attach_device(struct iommu_domain *domain,
return 0;
ret = rk_iommu_enable(iommu);
- if (ret)
- WARN_ON(rk_iommu_identity_attach(&rk_identity_domain, dev));
+ if (ret) {
+ /*
+ * Note rk_iommu_identity_attach() might fail before physically
+ * attaching the dev to iommu->domain, in which case the actual
+ * old domain for this revert should be rk_identity_domain rather
+ * than iommu->domain. Since rk_iommu_identity_attach() does not care
+ * about the old domain argument for now, this is not a problem.
+ */
+ WARN_ON(rk_iommu_identity_attach(&rk_identity_domain, dev,
+ iommu->domain));
+ }
pm_runtime_put(iommu->dev);
diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
index 9c80d61deb2c0..f2f58bb21720b 100644
--- a/drivers/iommu/s390-iommu.c
+++ b/drivers/iommu/s390-iommu.c
@@ -653,7 +653,8 @@ int zpci_iommu_register_ioat(struct zpci_dev *zdev, u8 *status)
}
static int blocking_domain_attach_device(struct iommu_domain *domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
struct zpci_dev *zdev = to_zpci_dev(dev);
struct s390_domain *s390_domain;
@@ -677,7 +678,8 @@ static int blocking_domain_attach_device(struct iommu_domain *domain,
}
static int s390_iommu_attach_device(struct iommu_domain *domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
struct s390_domain *s390_domain = to_s390_domain(domain);
struct zpci_dev *zdev = to_zpci_dev(dev);
@@ -1113,7 +1115,8 @@ static int __init s390_iommu_init(void)
subsys_initcall(s390_iommu_init);
static int s390_attach_dev_identity(struct iommu_domain *domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
struct zpci_dev *zdev = to_zpci_dev(dev);
u8 status;
diff --git a/drivers/iommu/sprd-iommu.c b/drivers/iommu/sprd-iommu.c
index c7ca1d8a0b153..555d4505c747a 100644
--- a/drivers/iommu/sprd-iommu.c
+++ b/drivers/iommu/sprd-iommu.c
@@ -247,7 +247,8 @@ static void sprd_iommu_domain_free(struct iommu_domain *domain)
}
static int sprd_iommu_attach_device(struct iommu_domain *domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
struct sprd_iommu_device *sdev = dev_iommu_priv_get(dev);
struct sprd_iommu_domain *dom = to_sprd_domain(domain);
diff --git a/drivers/iommu/sun50i-iommu.c b/drivers/iommu/sun50i-iommu.c
index de10b569d9a94..d3b190be18b5a 100644
--- a/drivers/iommu/sun50i-iommu.c
+++ b/drivers/iommu/sun50i-iommu.c
@@ -771,7 +771,8 @@ static void sun50i_iommu_detach_domain(struct sun50i_iommu *iommu,
}
static int sun50i_iommu_identity_attach(struct iommu_domain *identity_domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
struct sun50i_iommu *iommu = dev_iommu_priv_get(dev);
struct sun50i_iommu_domain *sun50i_domain;
@@ -797,7 +798,8 @@ static struct iommu_domain sun50i_iommu_identity_domain = {
};
static int sun50i_iommu_attach_device(struct iommu_domain *domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
struct sun50i_iommu_domain *sun50i_domain = to_sun50i_domain(domain);
struct sun50i_iommu *iommu;
@@ -813,7 +815,7 @@ static int sun50i_iommu_attach_device(struct iommu_domain *domain,
if (iommu->domain == domain)
return 0;
- sun50i_iommu_identity_attach(&sun50i_iommu_identity_domain, dev);
+ sun50i_iommu_identity_attach(&sun50i_iommu_identity_domain, dev, old);
sun50i_iommu_attach_domain(iommu, sun50i_domain);
diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index 36cdd5fbab077..336e0a3ff41fb 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -490,7 +490,7 @@ static void tegra_smmu_as_unprepare(struct tegra_smmu *smmu,
}
static int tegra_smmu_attach_dev(struct iommu_domain *domain,
- struct device *dev)
+ struct device *dev, struct iommu_domain *old)
{
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct tegra_smmu *smmu = dev_iommu_priv_get(dev);
@@ -524,9 +524,9 @@ static int tegra_smmu_attach_dev(struct iommu_domain *domain,
}
static int tegra_smmu_identity_attach(struct iommu_domain *identity_domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
- struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct tegra_smmu_as *as;
struct tegra_smmu *smmu;
@@ -535,10 +535,10 @@ static int tegra_smmu_identity_attach(struct iommu_domain *identity_domain,
if (!fwspec)
return -ENODEV;
- if (domain == identity_domain || !domain)
+ if (old == identity_domain || !old)
return 0;
- as = to_smmu_as(domain);
+ as = to_smmu_as(old);
smmu = as->smmu;
for (index = 0; index < fwspec->num_ids; index++) {
tegra_smmu_disable(smmu, fwspec->ids[index], as->id);
diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 532db1de201ba..bd1af8a77005f 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -730,7 +730,8 @@ static struct iommu_domain *viommu_domain_alloc_identity(struct device *dev)
return domain;
}
-static int viommu_attach_dev(struct iommu_domain *domain, struct device *dev)
+static int viommu_attach_dev(struct iommu_domain *domain, struct device *dev,
+ struct iommu_domain *old)
{
int ret = 0;
struct virtio_iommu_req_attach req;
@@ -781,7 +782,8 @@ static int viommu_attach_dev(struct iommu_domain *domain, struct device *dev)
}
static int viommu_attach_identity_domain(struct iommu_domain *domain,
- struct device *dev)
+ struct device *dev,
+ struct iommu_domain *old)
{
int ret = 0;
struct virtio_iommu_req_attach req;
--
2.43.0
* [PATCH v4 5/7] iommu: Add iommu_get_domain_for_dev_locked() helper
2025-08-31 23:31 [PATCH v4 0/7] Disable ATS via iommu during PCI resets Nicolin Chen
` (3 preceding siblings ...)
2025-08-31 23:31 ` [PATCH v4 4/7] iommu: Pass in old domain to attach_dev callback functions Nicolin Chen
@ 2025-08-31 23:31 ` Nicolin Chen
2025-08-31 23:31 ` [PATCH v4 6/7] iommu: Introduce iommu_dev_reset_prepare() and iommu_dev_reset_done() Nicolin Chen
2025-08-31 23:31 ` [PATCH v4 7/7] pci: Suspend iommu function prior to resetting a device Nicolin Chen
6 siblings, 0 replies; 8+ messages in thread
From: Nicolin Chen @ 2025-08-31 23:31 UTC (permalink / raw)
To: joro, jgg, bhelgaas
Cc: suravee.suthikulpanit, will, robin.murphy, sven, j, alyssa, neal,
robin.clark, m.szyprowski, krzk, alim.akhtar, dwmw2, baolu.lu,
kevin.tian, yong.wu, matthias.bgg, angelogioacchino.delregno,
tjeznach, paul.walmsley, palmer, aou, alex, heiko, schnelle,
mjrosato, gerald.schaefer, orsonzhai, baolin.wang, zhang.lyra,
wens, jernej.skrabec, samuel, jean-philippe, rafael, lenb,
yi.l.liu, cwabbott0, quic_pbrahma, iommu, linux-kernel, asahi,
linux-arm-kernel, linux-arm-msm, linux-samsung-soc,
linux-mediatek, linux-riscv, linux-rockchip, linux-s390,
linux-sunxi, linux-tegra, virtualization, linux-acpi, linux-pci,
patches, vsethi, helgaas, etzhao1900
There is a need to temporarily stage a PCI device that is under reset on
the blocked domain (i.e. detach it from its previously attached domain),
and then to reattach it to its previous domain (i.e. detach it from the
blocked domain) after the reset.
During the reset stage, other attach/detach operations can race with it.
To solve this, a per-gdev reset flag will be introduced so that all the
attach functions will reject any concurrent attach_dev callbacks.
With that flag, iommu_get_domain_for_dev(), which currently always returns
group->domain, would have to return the blocked domain instead while the
per-gdev flag is set, and checking the flag requires holding the
group->mutex.
On the other hand, some callers like the SMMUv3 driver invoke it in one of
their set_dev_pasid functions, where the group->mutex is already held,
while other callers like non-IOMMU drivers invoke it outside IOMMU callback
functions, where the group->mutex is not held. This makes it difficult to
add the lock to the existing iommu_get_domain_for_dev().
Introduce a new iommu_get_domain_for_dev_locked() helper to be used in a
context that is already under the protection of the group->mutex.
Add a lockdep_assert_not_held() to the existing iommu_get_domain_for_dev()
to note that it should only be used outside the group->mutex.
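For illustration only, the split between the two helpers can be modelled in
userspace roughly as below. This is a sketch under stated assumptions, not
the kernel code: struct group, struct domain, group_lock(), group_unlock(),
get_domain() and get_domain_locked() are all hypothetical names, and the
mutex_held flag is a crude single-threaded stand-in for what lockdep tracks
per task.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
struct domain {
	int id;
};

struct group {
	pthread_mutex_t mutex;
	bool mutex_held;	/* crude substitute for lockdep state */
	struct domain *domain;
};

static void group_lock(struct group *g)
{
	pthread_mutex_lock(&g->mutex);
	g->mutex_held = true;
}

static void group_unlock(struct group *g)
{
	g->mutex_held = false;
	pthread_mutex_unlock(&g->mutex);
}

/* For callers outside IOMMU callbacks: the mutex must not be held. */
static struct domain *get_domain(struct group *g)
{
	if (!g)
		return NULL;
	assert(!g->mutex_held);
	return g->domain;
}

/* For callers inside IOMMU callbacks: the mutex must already be held. */
static struct domain *get_domain_locked(struct group *g)
{
	assert(g->mutex_held);
	return g->domain;
}
```

Keeping two entry points lets each one assert its own locking expectation,
instead of a single function that would have to guess whether its caller
already holds the mutex.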
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
include/linux/iommu.h | 1 +
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 5 +++--
drivers/iommu/dma-iommu.c | 2 +-
drivers/iommu/iommu.c | 14 ++++++++++++++
4 files changed, 19 insertions(+), 3 deletions(-)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 801b2bd9e8d49..6d6d068c3de48 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -910,6 +910,7 @@ extern int iommu_attach_device(struct iommu_domain *domain,
extern void iommu_detach_device(struct iommu_domain *domain,
struct device *dev);
extern struct iommu_domain *iommu_get_domain_for_dev(struct device *dev);
+struct iommu_domain *iommu_get_domain_for_dev_locked(struct device *dev);
extern struct iommu_domain *iommu_get_dma_domain(struct device *dev);
extern int iommu_map(struct iommu_domain *domain, unsigned long iova,
phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index de02eeb524c15..4a68bd121287a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3125,7 +3125,8 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
struct arm_smmu_cd *cd, struct iommu_domain *old)
{
- struct iommu_domain *sid_domain = iommu_get_domain_for_dev(master->dev);
+ struct iommu_domain *sid_domain =
+ iommu_get_domain_for_dev_locked(master->dev);
struct arm_smmu_attach_state state = {
.master = master,
.ssid = pasid,
@@ -3191,7 +3192,7 @@ static int arm_smmu_blocking_set_dev_pasid(struct iommu_domain *new_domain,
*/
if (!arm_smmu_ssids_in_use(&master->cd_table)) {
struct iommu_domain *sid_domain =
- iommu_get_domain_for_dev(master->dev);
+ iommu_get_domain_for_dev_locked(master->dev);
if (sid_domain->type == IOMMU_DOMAIN_IDENTITY ||
sid_domain->type == IOMMU_DOMAIN_BLOCKED)
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index ea2ef53bd4fef..99680cdb57265 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -2097,7 +2097,7 @@ EXPORT_SYMBOL_GPL(dma_iova_destroy);
void iommu_setup_dma_ops(struct device *dev)
{
- struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
+ struct iommu_domain *domain = iommu_get_domain_for_dev_locked(dev);
if (dev_is_pci(dev))
dev->iommu->pci_32bit_workaround = !iommu_dma_forcedac;
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index ef3fd7bd1b553..f08c177f30de8 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2179,6 +2179,7 @@ void iommu_detach_device(struct iommu_domain *domain, struct device *dev)
}
EXPORT_SYMBOL_GPL(iommu_detach_device);
+/* Caller must be a general/external function that isn't an IOMMU callback */
struct iommu_domain *iommu_get_domain_for_dev(struct device *dev)
{
/* Caller must be a probed driver on dev */
@@ -2187,10 +2188,23 @@ struct iommu_domain *iommu_get_domain_for_dev(struct device *dev)
if (!group)
return NULL;
+ lockdep_assert_not_held(&group->mutex);
+
return group->domain;
}
EXPORT_SYMBOL_GPL(iommu_get_domain_for_dev);
+/* Caller must be an IOMMU callback/internal function that holds group->mutex */
+struct iommu_domain *iommu_get_domain_for_dev_locked(struct device *dev)
+{
+ struct iommu_group *group = dev->iommu_group;
+
+ lockdep_assert_held(&group->mutex);
+
+ return group->domain;
+}
+EXPORT_SYMBOL_GPL(iommu_get_domain_for_dev_locked);
+
/*
* For IOMMU_DOMAIN_DMA implementations which already provide their own
* guarantees that the group and its default domain are valid and correct.
--
2.43.0
* [PATCH v4 6/7] iommu: Introduce iommu_dev_reset_prepare() and iommu_dev_reset_done()
2025-08-31 23:31 [PATCH v4 0/7] Disable ATS via iommu during PCI resets Nicolin Chen
` (4 preceding siblings ...)
2025-08-31 23:31 ` [PATCH v4 5/7] iommu: Add iommu_get_domain_for_dev_locked() helper Nicolin Chen
@ 2025-08-31 23:31 ` Nicolin Chen
2025-08-31 23:31 ` [PATCH v4 7/7] pci: Suspend iommu function prior to resetting a device Nicolin Chen
6 siblings, 0 replies; 8+ messages in thread
From: Nicolin Chen @ 2025-08-31 23:31 UTC (permalink / raw)
To: joro, jgg, bhelgaas
Cc: suravee.suthikulpanit, will, robin.murphy, sven, j, alyssa, neal,
robin.clark, m.szyprowski, krzk, alim.akhtar, dwmw2, baolu.lu,
kevin.tian, yong.wu, matthias.bgg, angelogioacchino.delregno,
tjeznach, paul.walmsley, palmer, aou, alex, heiko, schnelle,
mjrosato, gerald.schaefer, orsonzhai, baolin.wang, zhang.lyra,
wens, jernej.skrabec, samuel, jean-philippe, rafael, lenb,
yi.l.liu, cwabbott0, quic_pbrahma, iommu, linux-kernel, asahi,
linux-arm-kernel, linux-arm-msm, linux-samsung-soc,
linux-mediatek, linux-riscv, linux-rockchip, linux-s390,
linux-sunxi, linux-tegra, virtualization, linux-acpi, linux-pci,
patches, vsethi, helgaas, etzhao1900
PCIe permits a device to ignore ATS invalidation TLPs, while processing a
reset. This creates a problem visible to the OS where an ATS invalidation
command will time out. E.g. an SVA domain will have no coordination with a
reset event and can racily issue ATS invalidations to a resetting device.
The OS should do something to mitigate this as we do not want production
systems to be reporting critical ATS failures, especially in a hypervisor
environment. Broadly, the OS could arrange to ignore the timeouts, block
page table mutations to prevent invalidations, or disable and block ATS.
The PCIe spec in sec 10.3.1 IMPLEMENTATION NOTE recommends disabling and
blocking ATS before initiating a Function Level Reset. It also mentions that
other reset methods could have the same vulnerability as well.
Provide a callback from the PCI subsystem that will enclose the reset and
have the iommu core temporarily change all the attached domains to BLOCKED.
After attaching a BLOCKED domain, IOMMU hardware would fence any incoming
ATS queries. And IOMMU drivers should also synchronously stop issuing new
ATS invalidations and wait for all ATS invalidations to complete. This can
avoid any ATS invalidation timeouts.
However, if there is a domain attachment/replacement happening during an
ongoing reset, ATS routines may be re-activated between the two function
calls. So, introduce a new pending_reset flag in group_device, and reject
any concurrent attach_dev/set_dev_pasid call during a reset, out of concern
for compatibility failures.
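As a rough illustration of the intended behaviour, the reset window can be
modelled as a small state machine. This is a simplified userspace sketch,
not the code in this patch: struct gdev_model and the reset_prepare(),
reset_done(), attach() and current_domain() helpers are hypothetical names,
all locking is omitted, and returning -EBUSY from a nested reset_prepare()
is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; names only loosely follow the text above. */
struct domain {
	const char *name;
};

struct gdev_model {
	struct domain *attached;	/* what the hardware is programmed with */
	struct domain *group_domain;	/* domain to restore after the reset */
	struct domain *blocking;
	bool pending_reset;
};

/* Enter the reset window: park the device on the blocking domain. */
static int reset_prepare(struct gdev_model *g)
{
	if (g->pending_reset)
		return -EBUSY;	/* assumption: nested resets are refused */
	g->attached = g->blocking;
	g->pending_reset = true;
	return 0;
}

/* Leave the reset window: reattach the domain cached in group_domain. */
static void reset_done(struct gdev_model *g)
{
	if (!g->pending_reset)
		return;
	g->attached = g->group_domain;
	g->pending_reset = false;
}

/* Any attach that races with the reset window is rejected. */
static int attach(struct gdev_model *g, struct domain *new_domain)
{
	assert(new_domain);
	if (g->pending_reset)
		return -EBUSY;
	g->attached = new_domain;
	g->group_domain = new_domain;
	return 0;
}

/* While a reset is pending, the effective domain is the blocking one. */
static struct domain *current_domain(const struct gdev_model *g)
{
	return g->pending_reset ? g->blocking : g->group_domain;
}
```

In this model an attach issued between reset_prepare() and reset_done()
gets -EBUSY, and group_domain keeps caching the pre-reset domain so that
reset_done() has something to restore.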
There are two corner cases that won't work:
1. Alias devices that share the same RID
Blocking one device also blocks the other alias devices that might not
want a reset. Given that it's very rare for an alias device to support
ATS, simply skip the blocking routine.
2. SR-IOV devices whose PF is resetting while its VFs are not
Both the PF and its VFs should block RID and PASIDs. But, since a VF is not
aware of the reset, it is difficult to block it and reject concurrent
attach calls, because it's not logically reasonable to reject a VF
attachment due to a resetting PF unless the VF is resetting too. To address
this, we won't be able to reject concurrent attachments as simply as this
patch does; instead we will need two new compatibility testing ops for
attach_dev/set_dev_pasid to allow caching a compatible attach. This itself,
however, would be a big series. So, for now, skip the blocking routine for
PF devices, and leave a note.
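The two skip rules above could be expressed as a single predicate, roughly
as follows. This is only a sketch: struct dev_model and reset_should_block()
are hypothetical, the non-PCI check is an assumption, and treating "more
than one device in the group" as a proxy for RID aliasing is also an
assumption rather than something this message states.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device summary; the fields are assumptions of this sketch. */
struct dev_model {
	bool is_pci;
	bool is_physfn;			/* SR-IOV physical function */
	unsigned int group_devices;	/* devices in the same iommu group */
};

/* Return true if the reset path should move the device to BLOCKED. */
static bool reset_should_block(const struct dev_model *d)
{
	assert(d);

	/* Assumption: only PCI devices take part in the ATS handling. */
	if (!d->is_pci)
		return false;
	/* Corner case 1: blocking would also hit the other alias devices. */
	if (d->group_devices != 1)
		return false;
	/* Corner case 2: VFs of a resetting PF cannot be blocked yet. */
	if (d->is_physfn)
		return false;
	return true;
}
```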
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
include/linux/iommu.h | 12 +++
drivers/iommu/iommu.c | 199 ++++++++++++++++++++++++++++++++++++++++++
2 files changed, 211 insertions(+)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 6d6d068c3de48..0d8e252929c89 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -1169,6 +1169,9 @@ void dev_iommu_priv_set(struct device *dev, void *priv);
extern struct mutex iommu_probe_device_lock;
int iommu_probe_device(struct device *dev);
+int iommu_dev_reset_prepare(struct device *dev);
+void iommu_dev_reset_done(struct device *dev);
+
int iommu_device_use_default_domain(struct device *dev);
void iommu_device_unuse_default_domain(struct device *dev);
@@ -1453,6 +1456,15 @@ static inline int iommu_fwspec_add_ids(struct device *dev, u32 *ids,
return -ENODEV;
}
+static inline int iommu_dev_reset_prepare(struct device *dev)
+{
+ return 0;
+}
+
+static inline void iommu_dev_reset_done(struct device *dev)
+{
+}
+
static inline struct iommu_fwspec *dev_iommu_fwspec_get(struct device *dev)
{
return NULL;
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index f08c177f30de8..bcc239f3592f4 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -71,12 +71,29 @@ struct group_device {
struct list_head list;
struct device *dev;
char *name;
+ bool pending_reset : 1;
};
/* Iterate over each struct group_device in a struct iommu_group */
#define for_each_group_device(group, pos) \
list_for_each_entry(pos, &(group)->devices, list)
+/* Callers must hold the dev->iommu_group->mutex */
+static struct group_device *device_to_group_device(struct device *dev)
+{
+ struct iommu_group *group = dev->iommu_group;
+ struct group_device *gdev;
+
+ lockdep_assert_held(&group->mutex);
+
+ /* gdev must be in the list */
+ for_each_group_device(group, gdev) {
+ if (gdev->dev == dev)
+ break;
+ }
+ return gdev;
+}
+
struct iommu_group_attribute {
struct attribute attr;
ssize_t (*show)(struct iommu_group *group, char *buf);
@@ -2157,6 +2174,12 @@ int iommu_deferred_attach(struct device *dev, struct iommu_domain *domain)
guard(mutex)(&dev->iommu_group->mutex);
+ /*
+ * There is a concurrent attach while the device is resetting. Defer it
+ * until iommu_dev_reset_done() attaches the device to group->domain.
+ */
+ if (device_to_group_device(dev)->pending_reset)
+ return -EBUSY;
return __iommu_attach_device(domain, dev, NULL);
}
@@ -2201,6 +2224,16 @@ struct iommu_domain *iommu_get_domain_for_dev_locked(struct device *dev)
lockdep_assert_held(&group->mutex);
+ /*
+ * Driver handles the low-level __iommu_attach_device(), including the
+ * one invoked by iommu_dev_reset_done(), in which case the driver must
+ * see the blocking domain rather than group->domain, which still caches
+ * the domain prior to iommu_dev_reset_prepare(). Otherwise, it would end
+ * up attaching the device from group->domain (old) to group->domain (new).
+ */
+ if (device_to_group_device(dev)->pending_reset)
+ return group->blocking_domain;
+
return group->domain;
}
EXPORT_SYMBOL_GPL(iommu_get_domain_for_dev_locked);
@@ -2309,6 +2342,13 @@ static int __iommu_device_set_domain(struct iommu_group *group,
dev->iommu->attach_deferred = 0;
}
+ /*
+ * There is a concurrent attach while the device is resetting. Defer it
+ * until iommu_dev_reset_done() attaches the device to group->domain.
+ */
+ if (gdev->pending_reset)
+ return -EBUSY;
+
ret = __iommu_attach_device(new_domain, dev, old_domain);
if (ret) {
/*
@@ -3394,6 +3434,15 @@ static int __iommu_set_group_pasid(struct iommu_domain *domain,
int ret;
for_each_group_device(group, device) {
+ /*
+ * There is a concurrent attach while the device is resetting.
+ * Defer it until iommu_dev_reset_done() attaches the device to
+ * group->domain.
+ */
+ if (device->pending_reset) {
+ ret = -EBUSY;
+ goto err_revert;
+ }
if (device->dev->iommu->max_pasids > 0) {
ret = domain->ops->set_dev_pasid(domain, device->dev,
pasid, old);
@@ -3815,6 +3864,156 @@ int iommu_replace_group_handle(struct iommu_group *group,
}
EXPORT_SYMBOL_NS_GPL(iommu_replace_group_handle, "IOMMUFD_INTERNAL");
+/**
+ * iommu_dev_reset_prepare() - Block IOMMU to prepare for a device reset
+ * @dev: device that is going to enter a reset routine
+ *
+ * When a device is entering a reset routine, it wants to block any IOMMU
+ * activity during the reset routine. This includes blocking any translation as
+ * well as any cache invalidation (especially of the device cache).
+ *
+ * This function attaches the device's RID/PASIDs to IOMMU_DOMAIN_BLOCKED,
+ * allowing any blocked-domain-supporting IOMMU driver to pause translation and
+ * cache invalidation, but leaves the software domain pointers intact so that
+ * iommu_dev_reset_done() can later restore everything.
+ *
+ * Return: 0 on success or negative error code if the preparation failed.
+ *
+ * Caller must use iommu_dev_reset_prepare() and iommu_dev_reset_done() together
+ * before/after the core-level reset routine, to clear the pending_reset flag.
+ *
+ * These two functions are designed to be used by PCI reset functions that would
+ * not invoke any racy iommu_release_device(), since PCI sysfs node gets removed
+ * before it notifies with a BUS_NOTIFY_REMOVED_DEVICE. When using them in other
+ * cases, callers must ensure there will be no racy iommu_release_device() call,
+ * which otherwise would UAF the dev->iommu_group pointer.
+ */
+int iommu_dev_reset_prepare(struct device *dev)
+{
+ struct iommu_group *group = dev->iommu_group;
+ struct group_device *gdev;
+ unsigned long pasid;
+ void *entry;
+ int ret = 0;
+
+ if (!dev_has_iommu(dev))
+ return 0;
+
+ /*
+ * FIXME resetting a PF will reset any VF at the hardware level, so this
+ * should basically block both the PF and its VFs. On the other hand, VF
+ * software might not go through a reset, so it can run into any regular
+ * operation like invoking a concurrent attach_dev/set_dev_pasid call.
+ *
+ * Due to compatibility concerns, any concurrent attach_dev/set_dev_pasid
+ * is being rejected with -EBUSY. For a PF, this rejection is reasonable
+ * and simple since a concurrent attachment would not be sane. For a VF,
+ * however, it would be difficult to justify.
+ *
+ * One way to work this out is to have a new op running a compatibility
+ * test for a concurrent attachment. Then, so long as it is compatible,
+ * the attachment would be deferred to iommu_dev_reset_done(). Bypass PF
+ * devices for now.
+ */
+ if (dev_is_pci(dev) && pci_num_vf(to_pci_dev(dev)) > 0)
+ return 0;
+
+ guard(mutex)(&group->mutex);
+
+ /* We cannot block an RID that is shared with another device */
+ if (dev_is_pci(dev)) {
+ for_each_group_device(group, gdev) {
+ if (gdev->dev != dev && dev_is_pci(gdev->dev) &&
+ pci_devs_are_dma_aliases(to_pci_dev(gdev->dev),
+ to_pci_dev(dev)))
+ return 0;
+ }
+ }
+
+ ret = __iommu_group_alloc_blocking_domain(group);
+ if (ret)
+ return ret;
+
+ /* Stage RID domain at blocking_domain while retaining group->domain */
+ if (group->domain != group->blocking_domain) {
+ ret = __iommu_attach_device(group->blocking_domain, dev,
+ group->domain);
+ if (ret)
+ return ret;
+ }
+
+ /*
+ * Stage PASID domains at blocking_domain while retaining pasid_array.
+ *
+ * The pasid_array is mostly fenced by group->mutex, except one reader
+ * in iommu_attach_handle_get(), so it's safe to read without xa_lock.
+ */
+ xa_for_each_start(&group->pasid_array, pasid, entry, 1)
+ iommu_remove_dev_pasid(dev, pasid,
+ pasid_array_entry_to_domain(entry));
+
+ device_to_group_device(dev)->pending_reset = true;
+ return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_dev_reset_prepare);
+
+/**
+ * iommu_dev_reset_done() - Restore IOMMU after a device reset is finished
+ * @dev: device that has finished a reset routine
+ *
+ * When a device has finished a reset routine, it wants to restore its
+ * IOMMU activity, including new translation as well as cache invalidation, by
+ * re-attaching all RIDs/PASIDs of the device back to the domains retained in
+ * the core-level structure.
+ *
+ * Caller must pair it with a successfully returned iommu_dev_reset_prepare().
+ *
+ * Note that, although unlikely, there is a risk that re-attaching domains might
+ * fail due to some unexpected event like OOM.
+ */
+void iommu_dev_reset_done(struct device *dev)
+{
+ struct iommu_group *group = dev->iommu_group;
+ struct group_device *gdev;
+ unsigned long pasid;
+ void *entry;
+
+ if (!dev_has_iommu(dev))
+ return;
+
+ guard(mutex)(&group->mutex);
+
+ gdev = device_to_group_device(dev);
+
+ /* iommu_dev_reset_prepare() was not successfully called */
+ if (WARN_ON(!group->blocking_domain))
+ return;
+
+ /* iommu_dev_reset_prepare() was bypassed for the device */
+ if (!gdev->pending_reset)
+ return;
+
+ /* Re-attach RID domain back to group->domain */
+ if (group->domain != group->blocking_domain) {
+ WARN_ON(__iommu_attach_device(group->domain, dev,
+ group->blocking_domain));
+ }
+
+ /*
+ * Re-attach PASID domains back to the domains retained in pasid_array.
+ *
+ * The pasid_array is mostly fenced by group->mutex, except one reader
+ * in iommu_attach_handle_get(), so it's safe to read without xa_lock.
+ */
+ xa_for_each_start(&group->pasid_array, pasid, entry, 1)
+ WARN_ON(__iommu_set_group_pasid(
+ pasid_array_entry_to_domain(entry), group, pasid,
+ group->blocking_domain));
+
+ gdev->pending_reset = false;
+}
+EXPORT_SYMBOL_GPL(iommu_dev_reset_done);
+
#if IS_ENABLED(CONFIG_IRQ_MSI_IOMMU)
/**
* iommu_dma_prepare_msi() - Map the MSI page in the IOMMU domain
--
2.43.0
* [PATCH v4 7/7] pci: Suspend iommu function prior to resetting a device
2025-08-31 23:31 [PATCH v4 0/7] Disable ATS via iommu during PCI resets Nicolin Chen
` (5 preceding siblings ...)
2025-08-31 23:31 ` [PATCH v4 6/7] iommu: Introduce iommu_dev_reset_prepare() and iommu_dev_reset_done() Nicolin Chen
@ 2025-08-31 23:31 ` Nicolin Chen
6 siblings, 0 replies; 8+ messages in thread
From: Nicolin Chen @ 2025-08-31 23:31 UTC (permalink / raw)
To: joro, jgg, bhelgaas
Cc: suravee.suthikulpanit, will, robin.murphy, sven, j, alyssa, neal,
robin.clark, m.szyprowski, krzk, alim.akhtar, dwmw2, baolu.lu,
kevin.tian, yong.wu, matthias.bgg, angelogioacchino.delregno,
tjeznach, paul.walmsley, palmer, aou, alex, heiko, schnelle,
mjrosato, gerald.schaefer, orsonzhai, baolin.wang, zhang.lyra,
wens, jernej.skrabec, samuel, jean-philippe, rafael, lenb,
yi.l.liu, cwabbott0, quic_pbrahma, iommu, linux-kernel, asahi,
linux-arm-kernel, linux-arm-msm, linux-samsung-soc,
linux-mediatek, linux-riscv, linux-rockchip, linux-s390,
linux-sunxi, linux-tegra, virtualization, linux-acpi, linux-pci,
patches, vsethi, helgaas, etzhao1900
PCIe permits a device to ignore ATS invalidation TLPs, while processing a
reset. This creates a problem visible to the OS where an ATS invalidation
command will time out: e.g. an SVA domain will have no coordination with a
reset event and can racily issue ATS invalidations to a resetting device.
The PCIe spec in sec 10.3.1 IMPLEMENTATION NOTE recommends disabling and
blocking ATS before initiating a Function Level Reset. It also mentions
that other reset methods could have the same vulnerability as well.
The iommu_dev_reset_prepare/done() helpers were introduced for this purpose.
Use them in all the existing reset functions, which will attach the device
to an IOMMU_DOMAIN_BLOCKED during a reset, so as to allow the IOMMU driver to:
- invoke pci_disable_ats() and pci_enable_ats(), if necessary
- wait for all ATS invalidations to complete
- stop issuing new ATS invalidations
- fence any incoming ATS queries
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/pci/pci.h | 2 ++
drivers/pci/pci-acpi.c | 12 ++++++--
drivers/pci/pci.c | 68 ++++++++++++++++++++++++++++++++++++++----
drivers/pci/quirks.c | 18 ++++++++++-
4 files changed, 92 insertions(+), 8 deletions(-)
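Every reset method touched below follows the same bracket: prepare, bail out
before touching the device if prepare fails, issue the reset, then always call
done and return the reset's own result. A userspace sketch of that pattern,
for illustration only: the model_* names and the injected error fields are
hypothetical and stand in for pci_reset_iommu_prepare/done() and the actual
reset step.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a PCI device; not a kernel structure. */
struct model_pdev {
	bool ats_supported;	/* models pci_ats_supported() */
	bool iommu_blocked;	/* true between prepare and done */
	int prepare_err;	/* injected failure for the prepare step */
	int reset_err;		/* injected failure for the reset step */
	int resets_issued;	/* how many times the reset was actually issued */
};

/* Only ATS-capable devices involve the IOMMU at all. */
static int model_iommu_prepare(struct model_pdev *p)
{
	if (!p->ats_supported)
		return 0;
	if (p->prepare_err)
		return p->prepare_err;
	p->iommu_blocked = true;
	return 0;
}

static void model_iommu_done(struct model_pdev *p)
{
	if (!p->ats_supported)
		return;
	p->iommu_blocked = false;
}

/* The bracket: a failed prepare aborts; done runs even if the reset fails. */
static int model_reset(struct model_pdev *p)
{
	int ret;

	ret = model_iommu_prepare(p);
	if (ret)
		return ret;

	p->resets_issued++;
	ret = p->reset_err;

	model_iommu_done(p);
	return ret;
}
```

The point of returning through a single exit after the reset step is that the
IOMMU is never left blocked, whether the reset succeeded, failed, or returned
early because the device reports immediate readiness.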
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 34f65d69662e9..9700ebca55771 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -106,6 +106,8 @@ void pci_init_reset_methods(struct pci_dev *dev);
int pci_bridge_secondary_bus_reset(struct pci_dev *dev);
int pci_bus_error_reset(struct pci_dev *dev);
int __pci_reset_bus(struct pci_bus *bus);
+int pci_reset_iommu_prepare(struct pci_dev *dev);
+void pci_reset_iommu_done(struct pci_dev *dev);
struct pci_cap_saved_data {
u16 cap_nr;
diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index ddb25960ea47d..3291424730824 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -969,6 +969,7 @@ void pci_set_acpi_fwnode(struct pci_dev *dev)
int pci_dev_acpi_reset(struct pci_dev *dev, bool probe)
{
acpi_handle handle = ACPI_HANDLE(&dev->dev);
+ int ret = 0;
if (!handle || !acpi_has_method(handle, "_RST"))
return -ENOTTY;
@@ -976,12 +977,19 @@ int pci_dev_acpi_reset(struct pci_dev *dev, bool probe)
if (probe)
return 0;
+ ret = pci_reset_iommu_prepare(dev);
+ if (ret) {
+ pci_err(dev, "failed to stop IOMMU\n");
+ return ret;
+ }
+
if (ACPI_FAILURE(acpi_evaluate_object(handle, "_RST", NULL, NULL))) {
pci_warn(dev, "ACPI _RST failed\n");
- return -ENOTTY;
+ ret = -ENOTTY;
}
- return 0;
+ pci_reset_iommu_done(dev);
+ return ret;
}
bool acpi_pci_power_manageable(struct pci_dev *dev)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b0f4d98036cdd..b4ca44ea6f494 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -13,6 +13,7 @@
#include <linux/delay.h>
#include <linux/dmi.h>
#include <linux/init.h>
+#include <linux/iommu.h>
#include <linux/msi.h>
#include <linux/of.h>
#include <linux/pci.h>
@@ -25,6 +26,7 @@
#include <linux/logic_pio.h>
#include <linux/device.h>
#include <linux/pm_runtime.h>
+#include <linux/pci-ats.h>
#include <linux/pci_hotplug.h>
#include <linux/vmalloc.h>
#include <asm/dma.h>
@@ -95,6 +97,23 @@ bool pci_reset_supported(struct pci_dev *dev)
return dev->reset_methods[0] != 0;
}
+/*
+ * Per PCIe r6.3, sec 10.3.1 IMPLEMENTATION NOTE, software disables ATS before
+ * initiating a reset. Notify the iommu driver that enabled ATS.
+ */
+int pci_reset_iommu_prepare(struct pci_dev *dev)
+{
+ if (pci_ats_supported(dev))
+ return iommu_dev_reset_prepare(&dev->dev);
+ return 0;
+}
+
+void pci_reset_iommu_done(struct pci_dev *dev)
+{
+ if (pci_ats_supported(dev))
+ iommu_dev_reset_done(&dev->dev);
+}
+
#ifdef CONFIG_PCI_DOMAINS
int pci_domains_supported = 1;
#endif
@@ -4529,13 +4548,22 @@ EXPORT_SYMBOL(pci_wait_for_pending_transaction);
*/
int pcie_flr(struct pci_dev *dev)
{
+ int ret = 0;
+
if (!pci_wait_for_pending_transaction(dev))
pci_err(dev, "timed out waiting for pending transaction; performing function level reset anyway\n");
+ /* Have to call it after waiting for pending DMA transaction */
+ ret = pci_reset_iommu_prepare(dev);
+ if (ret) {
+ pci_err(dev, "failed to stop IOMMU\n");
+ return ret;
+ }
+
pcie_capability_set_word(dev, PCI_EXP_DEVCTL, PCI_EXP_DEVCTL_BCR_FLR);
if (dev->imm_ready)
- return 0;
+ goto done;
/*
* Per PCIe r4.0, sec 6.6.2, a device must complete an FLR within
@@ -4544,7 +4572,10 @@ int pcie_flr(struct pci_dev *dev)
*/
msleep(100);
- return pci_dev_wait(dev, "FLR", PCIE_RESET_READY_POLL_MS);
+ ret = pci_dev_wait(dev, "FLR", PCIE_RESET_READY_POLL_MS);
+done:
+ pci_reset_iommu_done(dev);
+ return ret;
}
EXPORT_SYMBOL_GPL(pcie_flr);
@@ -4572,6 +4603,7 @@ EXPORT_SYMBOL_GPL(pcie_reset_flr);
static int pci_af_flr(struct pci_dev *dev, bool probe)
{
+ int ret = 0;
int pos;
u8 cap;
@@ -4598,10 +4630,17 @@ static int pci_af_flr(struct pci_dev *dev, bool probe)
PCI_AF_STATUS_TP << 8))
pci_err(dev, "timed out waiting for pending transaction; performing AF function level reset anyway\n");
+ /* Have to call it after waiting for pending DMA transaction */
+ ret = pci_reset_iommu_prepare(dev);
+ if (ret) {
+ pci_err(dev, "failed to stop IOMMU\n");
+ return ret;
+ }
+
pci_write_config_byte(dev, pos + PCI_AF_CTRL, PCI_AF_CTRL_FLR);
if (dev->imm_ready)
- return 0;
+ goto done;
/*
* Per Advanced Capabilities for Conventional PCI ECN, 13 April 2006,
@@ -4611,7 +4650,10 @@ static int pci_af_flr(struct pci_dev *dev, bool probe)
*/
msleep(100);
- return pci_dev_wait(dev, "AF_FLR", PCIE_RESET_READY_POLL_MS);
+ ret = pci_dev_wait(dev, "AF_FLR", PCIE_RESET_READY_POLL_MS);
+done:
+ pci_reset_iommu_done(dev);
+ return ret;
}
/**
@@ -4632,6 +4674,7 @@ static int pci_af_flr(struct pci_dev *dev, bool probe)
static int pci_pm_reset(struct pci_dev *dev, bool probe)
{
u16 csr;
+ int ret;
if (!dev->pm_cap || dev->dev_flags & PCI_DEV_FLAGS_NO_PM_RESET)
return -ENOTTY;
@@ -4646,6 +4689,12 @@ static int pci_pm_reset(struct pci_dev *dev, bool probe)
if (dev->current_state != PCI_D0)
return -EINVAL;
+ ret = pci_reset_iommu_prepare(dev);
+ if (ret) {
+ pci_err(dev, "failed to stop IOMMU\n");
+ return ret;
+ }
+
csr &= ~PCI_PM_CTRL_STATE_MASK;
csr |= PCI_D3hot;
pci_write_config_word(dev, dev->pm_cap + PCI_PM_CTRL, csr);
@@ -4656,7 +4705,9 @@ static int pci_pm_reset(struct pci_dev *dev, bool probe)
pci_write_config_word(dev, dev->pm_cap + PCI_PM_CTRL, csr);
pci_dev_d3_sleep(dev);
- return pci_dev_wait(dev, "PM D3hot->D0", PCIE_RESET_READY_POLL_MS);
+ ret = pci_dev_wait(dev, "PM D3hot->D0", PCIE_RESET_READY_POLL_MS);
+ pci_reset_iommu_done(dev);
+ return ret;
}
/**
@@ -5111,6 +5162,12 @@ static int cxl_reset_bus_function(struct pci_dev *dev, bool probe)
if (rc)
return -ENOTTY;
+ rc = pci_reset_iommu_prepare(dev);
+ if (rc) {
+ pci_err(dev, "failed to stop IOMMU\n");
+ return rc;
+ }
+
if (reg & PCI_DVSEC_CXL_PORT_CTL_UNMASK_SBR) {
val = reg;
} else {
@@ -5125,6 +5182,7 @@ static int cxl_reset_bus_function(struct pci_dev *dev, bool probe)
pci_write_config_word(bridge, dvsec + PCI_DVSEC_CXL_PORT_CTL,
reg);
+ pci_reset_iommu_done(dev);
return rc;
}
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index d97335a401930..c1c32e57fe267 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4225,6 +4225,22 @@ static const struct pci_dev_reset_methods pci_dev_reset_methods[] = {
{ 0 }
};
+static int __pci_dev_specific_reset(struct pci_dev *dev, bool probe,
+ const struct pci_dev_reset_methods *i)
+{
+ int ret;
+
+ ret = pci_reset_iommu_prepare(dev);
+ if (ret) {
+ pci_err(dev, "failed to stop IOMMU\n");
+ return ret;
+ }
+
+ ret = i->reset(dev, probe);
+ pci_reset_iommu_done(dev);
+ return ret;
+}
+
/*
* These device-specific reset methods are here rather than in a driver
* because when a host assigns a device to a guest VM, the host may need
@@ -4239,7 +4255,7 @@ int pci_dev_specific_reset(struct pci_dev *dev, bool probe)
i->vendor == (u16)PCI_ANY_ID) &&
(i->device == dev->device ||
i->device == (u16)PCI_ANY_ID))
- return i->reset(dev, probe);
+ return __pci_dev_specific_reset(dev, probe, i);
}
return -ENOTTY;
--
2.43.0