* [PATCH v3 00/17] Consolidate the error handling around device attachment
@ 2023-04-05 23:38 Jason Gunthorpe
2023-04-05 23:38 ` [PATCH v3 01/17] iommu: Replace iommu_group_device_count() with list_count_nodes() Jason Gunthorpe
` (17 more replies)
0 siblings, 18 replies; 36+ messages in thread
From: Jason Gunthorpe @ 2023-04-05 23:38 UTC (permalink / raw)
To: iommu, Joerg Roedel, llvm, Nathan Chancellor, Nick Desaulniers,
Miguel Ojeda, Robin Murphy, Tom Rix, Will Deacon
Cc: Lu Baolu, Kevin Tian, Nicolin Chen
Device attachment has a bunch of different flows open coded in different
ways throughout the code.
One of the things that became apparent recently is that error handling
matters: we need to treat errors during attach consistently and have a
strategy to unwind back to a safe state.
Implement a single algorithm for this in one function. It calls each
device's attach; if that fails it tries to go back to the prior domain
or, as a contingency against a UAF crash, to a blocking domain.
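To make the flow concrete, here is a simplified sketch of that algorithm
(illustrative only, not the code added by the series; the real helpers in
patch 3 also handle NULL domains, deferred attach and the "must succeed"
mode used on destruction paths):

static int set_group_domain_sketch(struct iommu_group *group,
				   struct iommu_domain *new_domain)
{
	struct group_device *gdev, *last;
	int ret;

	for_each_group_device(group, gdev) {
		ret = __iommu_attach_device(new_domain, gdev->dev);
		if (ret)
			goto err_revert;
	}
	group->domain = new_domain;
	return 0;

err_revert:
	last = gdev;
	for_each_group_device(group, gdev) {
		/*
		 * Put the device back on the old domain; if even that
		 * fails, park it on the blocking domain so new_domain
		 * cannot be used after it is freed.
		 */
		if (__iommu_attach_device(group->domain, gdev->dev))
			__iommu_attach_device(group->blocking_domain,
					      gdev->dev);
		if (gdev == last)
			break;
	}
	return ret;
}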
As part of this, how the default domain is created and attached is also
consolidated into one place with a consistent flow.
The new worker functions are called __iommu_device_set_domain() and
__iommu_group_set_domain_internal(), each of which has sensible error
handling internally. By the end of the series,
__iommu_group_set_domain_internal() is the only function that stores to
group->domain, and it must be called to change this value.
Flags convey the intent of the caller: whether the caller cannot accept
a failure, or whether this is a first attach that wants to do the
deferred logic.
Several of the confusing workflows where we store things in
group->domain or group->default_domain before they are fully set up are
removed.
This has a followup series that does a similar de-duplication to the probe
path:
https://github.com/jgunthorpe/linux/commits/iommu_err_unwind
v3:
- New patch to do iommu_group_create_direct_mappings() before
attach on the hotplug path based on Lu and Robin's remarks
- Fix to return 0 if the group has conflicting default domain types like
the original code
- Put iommu_group_create_direct_mappings() before attach in setup_domains
- Split up the alloc changes from the setup_domain patch to their own
patch, implement Robin's point about how the iommu_def_domain_type should
work
- New patch to optionally do iommu_group_create_direct_mappings() after
attach
- Reword the setup_domain patch's commit message
v2: https://lore.kernel.org/r/0-v2-cd32667d2ba6+70bd1-iommu_err_unwind_jgg@nvidia.com
- New patch to remove iommu_group_device_count()
- New patch to add a for_each helper: for_each_group_device()
- Rebase on Joerg's tree
- IOMMU_SET_DOMAIN_MUST_SUCCEED instead of true
- Split patch to fix owned groups during first attach
- Change iommu_create_device_direct_mappings to accept a domain not a
group
- Significantly revise the "iommu: Consolidate the default_domain setup to
one function" patch to de-duplicate the domain type calculation logic
too
- New patch to clean the flow inside iommu_group_store_type()
v1: https://lore.kernel.org/r/0-v1-20507a7e6b7e+2d6-iommu_err_unwind_jgg@nvidia.com
Jason Gunthorpe (17):
iommu: Replace iommu_group_device_count() with list_count_nodes()
iommu: Add for_each_group_device()
iommu: Make __iommu_group_set_domain() handle error unwind
iommu: Use __iommu_group_set_domain() for __iommu_attach_group()
iommu: Use __iommu_group_set_domain() in iommu_change_dev_def_domain()
iommu: Replace __iommu_group_dma_first_attach() with set_domain
iommu: Make iommu_group_do_dma_first_attach() simpler
iommu: Make iommu_group_do_dma_first_attach() work with owned groups
iommu: Fix iommu_probe_device() to attach the right domain
iommu: Do iommu_group_create_direct_mappings() before attach
iommu: Remove the assignment of group->domain during default domain
alloc
iommu: Consolidate the code to calculate the target default domain
type
iommu: Revise iommu_group_alloc_default_domain()
iommu: Consolidate the default_domain setup to one function
iommu: Allow IOMMU_RESV_DIRECT to work on ARM
iommu: Remove __iommu_group_for_each_dev()
iommu: Tidy the control flow in iommu_group_store_type()
.clang-format | 1 +
drivers/iommu/iommu.c | 636 +++++++++++++++++++++---------------------
2 files changed, 325 insertions(+), 312 deletions(-)
base-commit: 3578e36f238ef81a1967e849e0a3106c9dd37e68
--
2.40.0
* [PATCH v3 01/17] iommu: Replace iommu_group_device_count() with list_count_nodes()
2023-04-05 23:38 [PATCH v3 00/17] Consolidate the error handling around device attachment Jason Gunthorpe
@ 2023-04-05 23:38 ` Jason Gunthorpe
2023-04-05 23:38 ` [PATCH v3 02/17] iommu: Add for_each_group_device() Jason Gunthorpe
` (16 subsequent siblings)
17 siblings, 0 replies; 36+ messages in thread
From: Jason Gunthorpe @ 2023-04-05 23:38 UTC (permalink / raw)
To: iommu, Joerg Roedel, llvm, Nathan Chancellor, Nick Desaulniers,
Miguel Ojeda, Robin Murphy, Tom Rix, Will Deacon
Cc: Lu Baolu, Kevin Tian, Nicolin Chen
No reason to wrap a standard function; just call the library helper directly.
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/iommu.c | 15 ++-------------
1 file changed, 2 insertions(+), 13 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 7abee83610b6c9..8cdf47279704c2 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1149,17 +1149,6 @@ void iommu_group_remove_device(struct device *dev)
}
EXPORT_SYMBOL_GPL(iommu_group_remove_device);
-static int iommu_group_device_count(struct iommu_group *group)
-{
- struct group_device *entry;
- int ret = 0;
-
- list_for_each_entry(entry, &group->devices, list)
- ret++;
-
- return ret;
-}
-
static int __iommu_group_for_each_dev(struct iommu_group *group, void *data,
int (*fn)(struct device *, void *))
{
@@ -2101,7 +2090,7 @@ int iommu_attach_device(struct iommu_domain *domain, struct device *dev)
*/
mutex_lock(&group->mutex);
ret = -EINVAL;
- if (iommu_group_device_count(group) != 1)
+ if (list_count_nodes(&group->devices) != 1)
goto out_unlock;
ret = __iommu_attach_group(domain, group);
@@ -2132,7 +2121,7 @@ void iommu_detach_device(struct iommu_domain *domain, struct device *dev)
mutex_lock(&group->mutex);
if (WARN_ON(domain != group->domain) ||
- WARN_ON(iommu_group_device_count(group) != 1))
+ WARN_ON(list_count_nodes(&group->devices) != 1))
goto out_unlock;
__iommu_group_set_core_domain(group);
--
2.40.0
* [PATCH v3 02/17] iommu: Add for_each_group_device()
2023-04-05 23:38 [PATCH v3 00/17] Consolidate the error handling around device attachment Jason Gunthorpe
2023-04-05 23:38 ` [PATCH v3 01/17] iommu: Replace iommu_group_device_count() with list_count_nodes() Jason Gunthorpe
@ 2023-04-05 23:38 ` Jason Gunthorpe
2023-04-05 23:38 ` [PATCH v3 03/17] iommu: Make __iommu_group_set_domain() handle error unwind Jason Gunthorpe
` (15 subsequent siblings)
17 siblings, 0 replies; 36+ messages in thread
From: Jason Gunthorpe @ 2023-04-05 23:38 UTC (permalink / raw)
To: iommu, Joerg Roedel, llvm, Nathan Chancellor, Nick Desaulniers,
Miguel Ojeda, Robin Murphy, Tom Rix, Will Deacon
Cc: Lu Baolu, Kevin Tian, Nicolin Chen
Convenience macro to iterate over every struct group_device in the group.
Replace all open-coded list_for_each_entry() loops with this macro.
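Usage follows the normal list-iterator pattern under the group mutex; a
minimal, hypothetical example (struct group_device is private to
drivers/iommu/iommu.c, so a caller like this can only live there):

	struct group_device *gdev;

	mutex_lock(&group->mutex);
	for_each_group_device(group, gdev)
		dev_info(gdev->dev, "member of group %d\n",
			 iommu_group_id(group));
	mutex_unlock(&group->mutex);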
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
.clang-format | 1 +
drivers/iommu/iommu.c | 16 ++++++++++------
2 files changed, 11 insertions(+), 6 deletions(-)
diff --git a/.clang-format b/.clang-format
index d988e9fa9b2653..bece1995f2c159 100644
--- a/.clang-format
+++ b/.clang-format
@@ -254,6 +254,7 @@ ForEachMacros:
- 'for_each_free_mem_range'
- 'for_each_free_mem_range_reverse'
- 'for_each_func_rsrc'
+ - 'for_each_group_device'
- 'for_each_group_evsel'
- 'for_each_group_member'
- 'for_each_hstate'
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 8cdf47279704c2..501257bd02fc3c 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -67,6 +67,10 @@ struct group_device {
char *name;
};
+/* Iterate over each struct group_device in a struct iommu_group */
+#define for_each_group_device(group, pos) \
+ list_for_each_entry(pos, &(group)->devices, list)
+
struct iommu_group_attribute {
struct attribute attr;
ssize_t (*show)(struct iommu_group *group, char *buf);
@@ -463,7 +467,7 @@ __iommu_group_remove_device(struct iommu_group *group, struct device *dev)
struct group_device *device;
lockdep_assert_held(&group->mutex);
- list_for_each_entry(device, &group->devices, list) {
+ for_each_group_device(group, device) {
if (device->dev == dev) {
list_del(&device->list);
return device;
@@ -702,7 +706,7 @@ int iommu_get_group_resv_regions(struct iommu_group *group,
int ret = 0;
mutex_lock(&group->mutex);
- list_for_each_entry(device, &group->devices, list) {
+ for_each_group_device(group, device) {
struct list_head dev_resv_regions;
/*
@@ -1155,7 +1159,7 @@ static int __iommu_group_for_each_dev(struct iommu_group *group, void *data,
struct group_device *device;
int ret = 0;
- list_for_each_entry(device, &group->devices, list) {
+ for_each_group_device(group, device) {
ret = fn(device->dev, data);
if (ret)
break;
@@ -1959,7 +1963,7 @@ bool iommu_group_has_isolated_msi(struct iommu_group *group)
bool ret = true;
mutex_lock(&group->mutex);
- list_for_each_entry(group_dev, &group->devices, list)
+ for_each_group_device(group, group_dev)
ret &= msi_device_has_isolated_msi(group_dev->dev);
mutex_unlock(&group->mutex);
return ret;
@@ -3261,7 +3265,7 @@ static int __iommu_set_group_pasid(struct iommu_domain *domain,
struct group_device *device;
int ret = 0;
- list_for_each_entry(device, &group->devices, list) {
+ for_each_group_device(group, device) {
ret = domain->ops->set_dev_pasid(domain, device->dev, pasid);
if (ret)
break;
@@ -3276,7 +3280,7 @@ static void __iommu_remove_group_pasid(struct iommu_group *group,
struct group_device *device;
const struct iommu_ops *ops;
- list_for_each_entry(device, &group->devices, list) {
+ for_each_group_device(group, device) {
ops = dev_iommu_ops(device->dev);
ops->remove_dev_pasid(device->dev, pasid);
}
--
2.40.0
* [PATCH v3 03/17] iommu: Make __iommu_group_set_domain() handle error unwind
2023-04-05 23:38 [PATCH v3 00/17] Consolidate the error handling around device attachment Jason Gunthorpe
2023-04-05 23:38 ` [PATCH v3 01/17] iommu: Replace iommu_group_device_count() with list_count_nodes() Jason Gunthorpe
2023-04-05 23:38 ` [PATCH v3 02/17] iommu: Add for_each_group_device() Jason Gunthorpe
@ 2023-04-05 23:38 ` Jason Gunthorpe
2023-04-06 18:08 ` Robin Murphy
2023-04-05 23:38 ` [PATCH v3 04/17] iommu: Use __iommu_group_set_domain() for __iommu_attach_group() Jason Gunthorpe
` (14 subsequent siblings)
17 siblings, 1 reply; 36+ messages in thread
From: Jason Gunthorpe @ 2023-04-05 23:38 UTC (permalink / raw)
To: iommu, Joerg Roedel, llvm, Nathan Chancellor, Nick Desaulniers,
Miguel Ojeda, Robin Murphy, Tom Rix, Will Deacon
Cc: Lu Baolu, Kevin Tian, Nicolin Chen
Let's try to have a consistent and clear strategy for error handling
during domain attach failures.
There are two broad categories. The first is callers doing destruction
and trying to set the domain back to a previously good domain. These
cases cannot handle failure during destruction flows and must succeed,
or at least avoid a UAF on the current group->domain, which is likely
about to be freed.
Many of the drivers are well behaved here and will not hit the WARN_ONs
or a UAF, but some are doing hypercalls/etc that can fail unpredictably
and don't meet the expectations.
The second case is attaching a domain for the first time in a failable
context; on failure the attachment should be restored back to
group->domain using the above unfailable operation.
Have __iommu_group_set_domain_internal() execute a common algorithm that
tries to achieve this and, in the worst case, would leave a device
"detached" or assigned to a global blocking domain. This relies on some
existing common driver behaviors, where an attach failure also acts as a
detach, and on true IOMMU_DOMAIN_BLOCKED implementations that are not
allowed to ever fail.
Name the first case with __iommu_group_set_domain_nofail() to make it
clear.
Pull all the error handling and WARN_ON generation into
__iommu_group_set_domain_internal().
Avoid the obfuscating use of __iommu_group_for_each_dev() and be more
careful about what should happen during failures by only touching devices
we've already touched.
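As a rough illustration of how the two cases map onto the new helpers
(a sketch of callers, not code from this patch; new_domain stands in for
whatever domain the caller wants to install):

	/* Failable context, e.g. a first attach: the error is returned
	 * and the group is left on its previous domain. */
	ret = __iommu_group_set_domain(group, new_domain);
	if (ret)
		return ret;

	/* Destruction/unwind context: failure cannot be reported, so the
	 * nofail variant WARNs and falls back to the blocking domain. */
	__iommu_group_set_domain_nofail(group, group->blocking_domain);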
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/iommu.c | 130 +++++++++++++++++++++++++++++++++++-------
1 file changed, 108 insertions(+), 22 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 501257bd02fc3c..b7cb2a56ebfe8c 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -99,8 +99,26 @@ static int __iommu_attach_device(struct iommu_domain *domain,
struct device *dev);
static int __iommu_attach_group(struct iommu_domain *domain,
struct iommu_group *group);
+
+enum {
+ IOMMU_SET_DOMAIN_MUST_SUCCEED = 1 << 0,
+};
+
+static int __iommu_group_set_domain_internal(struct iommu_group *group,
+ struct iommu_domain *new_domain,
+ unsigned int flags);
static int __iommu_group_set_domain(struct iommu_group *group,
- struct iommu_domain *new_domain);
+ struct iommu_domain *new_domain)
+{
+ return __iommu_group_set_domain_internal(group, new_domain, 0);
+}
+static void __iommu_group_set_domain_nofail(struct iommu_group *group,
+ struct iommu_domain *new_domain)
+{
+ WARN_ON(__iommu_group_set_domain_internal(
+ group, new_domain, IOMMU_SET_DOMAIN_MUST_SUCCEED));
+}
+
static int iommu_create_device_direct_mappings(struct iommu_group *group,
struct device *dev);
static struct iommu_group *iommu_group_get_for_dev(struct device *dev);
@@ -2040,15 +2058,13 @@ EXPORT_SYMBOL_GPL(iommu_domain_free);
static void __iommu_group_set_core_domain(struct iommu_group *group)
{
struct iommu_domain *new_domain;
- int ret;
if (group->owner)
new_domain = group->blocking_domain;
else
new_domain = group->default_domain;
- ret = __iommu_group_set_domain(group, new_domain);
- WARN(ret, "iommu driver failed to attach the default/blocking domain");
+ __iommu_group_set_domain_nofail(group, new_domain);
}
static int __iommu_attach_device(struct iommu_domain *domain,
@@ -2233,21 +2249,55 @@ int iommu_attach_group(struct iommu_domain *domain, struct iommu_group *group)
}
EXPORT_SYMBOL_GPL(iommu_attach_group);
-static int iommu_group_do_set_platform_dma(struct device *dev, void *data)
+static int __iommu_device_set_domain(struct iommu_group *group,
+ struct device *dev,
+ struct iommu_domain *new_domain,
+ unsigned int flags)
{
- const struct iommu_ops *ops = dev_iommu_ops(dev);
-
- if (!WARN_ON(!ops->set_platform_dma_ops))
- ops->set_platform_dma_ops(dev);
+ int ret;
+ ret = __iommu_attach_device(new_domain, dev);
+ if (ret) {
+ /*
+ * If we have a blocking domain then try to attach that in hopes
+ * of avoiding a UAF. Modern drivers should implement blocking
+ * domains as global statics that cannot fail.
+ */
+ if ((flags & IOMMU_SET_DOMAIN_MUST_SUCCEED) &&
+ group->blocking_domain &&
+ group->blocking_domain != new_domain)
+ __iommu_attach_device(group->blocking_domain, dev);
+ return ret;
+ }
return 0;
}
-static int __iommu_group_set_domain(struct iommu_group *group,
- struct iommu_domain *new_domain)
+/*
+ * If 0 is returned the group's domain is new_domain. If an error is returned
+ * then the group's domain will be set back to the existing domain unless
+ * IOMMU_SET_DOMAIN_MUST_SUCCEED is set, in which case an error is returned and
+ * the group's domain is left inconsistent. It is a driver bug to fail attach
+ * with a previously good domain. We try to avoid a kernel UAF because of this.
+ *
+ * IOMMU groups are really the natural working unit of the IOMMU, but the IOMMU
+ * API works on domains and devices. Bridge that gap by iterating over the
+ * devices in a group. Ideally we'd have a single device which represents the
+ * requestor ID of the group, but we also allow IOMMU drivers to create policy
+ * defined minimum sets, where the physical hardware may be able to distinguish
+ * members, but we wish to group them at a higher level (ex. untrusted
+ * multi-function PCI devices). Thus we attach each device.
+ */
+static int __iommu_group_set_domain_internal(struct iommu_group *group,
+ struct iommu_domain *new_domain,
+ unsigned int flags)
{
+ struct group_device *last_gdev;
+ struct group_device *gdev;
+ int result;
int ret;
+ lockdep_assert_held(&group->mutex);
+
if (group->domain == new_domain)
return 0;
@@ -2257,8 +2307,12 @@ static int __iommu_group_set_domain(struct iommu_group *group,
* platform specific behavior.
*/
if (!new_domain) {
- __iommu_group_for_each_dev(group, NULL,
- iommu_group_do_set_platform_dma);
+ for_each_group_device(group, gdev) {
+ const struct iommu_ops *ops = dev_iommu_ops(gdev->dev);
+
+ if (!WARN_ON(!ops->set_platform_dma_ops))
+ ops->set_platform_dma_ops(gdev->dev);
+ }
group->domain = NULL;
return 0;
}
@@ -2272,12 +2326,47 @@ static int __iommu_group_set_domain(struct iommu_group *group,
* Note that this is called in error unwind paths, attaching to a
* domain that has already been attached cannot fail.
*/
- ret = __iommu_group_for_each_dev(group, new_domain,
- iommu_group_do_attach_device);
- if (ret)
- return ret;
+ result = 0;
+ for_each_group_device(group, gdev) {
+ ret = __iommu_device_set_domain(group, gdev->dev, new_domain,
+ flags);
+ if (ret) {
+ result = ret;
+ /*
+ * Keep trying the other devices in the group. If a
+ * driver fails attach to an otherwise good domain, and
+ * does not support blocking domains, it should at least
+ * drop its reference on the current domain so we don't
+ * UAF.
+ */
+ if (flags & IOMMU_SET_DOMAIN_MUST_SUCCEED)
+ continue;
+ goto err_revert;
+ }
+ }
group->domain = new_domain;
- return 0;
+ return result;
+
+err_revert:
+ last_gdev = gdev;
+ for_each_group_device(group, gdev) {
+ const struct iommu_ops *ops = dev_iommu_ops(gdev->dev);
+
+ /*
+ * If set_platform_dma_ops is not present a NULL domain can
+ * happen only for first probe, in which case we leave
+ * group->domain as NULL and let release clean everything up.
+ */
+ if (group->domain)
+ WARN_ON(__iommu_device_set_domain(
+ group, gdev->dev, group->domain,
+ IOMMU_SET_DOMAIN_MUST_SUCCEED));
+ else if (ops->set_platform_dma_ops)
+ ops->set_platform_dma_ops(gdev->dev);
+ if (gdev == last_gdev)
+ break;
+ }
+ return ret;
}
void iommu_detach_group(struct iommu_domain *domain, struct iommu_group *group)
@@ -3194,16 +3283,13 @@ EXPORT_SYMBOL_GPL(iommu_device_claim_dma_owner);
static void __iommu_release_dma_ownership(struct iommu_group *group)
{
- int ret;
-
if (WARN_ON(!group->owner_cnt || !group->owner ||
!xa_empty(&group->pasid_array)))
return;
group->owner_cnt = 0;
group->owner = NULL;
- ret = __iommu_group_set_domain(group, group->default_domain);
- WARN(ret, "iommu driver failed to attach the default domain");
+ __iommu_group_set_domain_nofail(group, group->default_domain);
}
/**
--
2.40.0
* [PATCH v3 04/17] iommu: Use __iommu_group_set_domain() for __iommu_attach_group()
2023-04-05 23:38 [PATCH v3 00/17] Consolidate the error handling around device attachment Jason Gunthorpe
` (2 preceding siblings ...)
2023-04-05 23:38 ` [PATCH v3 03/17] iommu: Make __iommu_group_set_domain() handle error unwind Jason Gunthorpe
@ 2023-04-05 23:38 ` Jason Gunthorpe
2023-04-05 23:38 ` [PATCH v3 05/17] iommu: Use __iommu_group_set_domain() in iommu_change_dev_def_domain() Jason Gunthorpe
` (13 subsequent siblings)
17 siblings, 0 replies; 36+ messages in thread
From: Jason Gunthorpe @ 2023-04-05 23:38 UTC (permalink / raw)
To: iommu, Joerg Roedel, llvm, Nathan Chancellor, Nick Desaulniers,
Miguel Ojeda, Robin Murphy, Tom Rix, Will Deacon
Cc: Lu Baolu, Kevin Tian, Nicolin Chen
The error recovery here matches the recovery inside
__iommu_group_set_domain(), so just use it directly.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/iommu.c | 40 +---------------------------------------
1 file changed, 1 insertion(+), 39 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index b7cb2a56ebfe8c..ba9e988293f23f 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2177,52 +2177,14 @@ struct iommu_domain *iommu_get_dma_domain(struct device *dev)
return dev->iommu_group->default_domain;
}
-/*
- * IOMMU groups are really the natural working unit of the IOMMU, but
- * the IOMMU API works on domains and devices. Bridge that gap by
- * iterating over the devices in a group. Ideally we'd have a single
- * device which represents the requestor ID of the group, but we also
- * allow IOMMU drivers to create policy defined minimum sets, where
- * the physical hardware may be able to distiguish members, but we
- * wish to group them at a higher level (ex. untrusted multi-function
- * PCI devices). Thus we attach each device.
- */
-static int iommu_group_do_attach_device(struct device *dev, void *data)
-{
- struct iommu_domain *domain = data;
-
- return __iommu_attach_device(domain, dev);
-}
-
static int __iommu_attach_group(struct iommu_domain *domain,
struct iommu_group *group)
{
- int ret;
-
if (group->domain && group->domain != group->default_domain &&
group->domain != group->blocking_domain)
return -EBUSY;
- ret = __iommu_group_for_each_dev(group, domain,
- iommu_group_do_attach_device);
- if (ret == 0) {
- group->domain = domain;
- } else {
- /*
- * To recover from the case when certain device within the
- * group fails to attach to the new domain, we need force
- * attaching all devices back to the old domain. The old
- * domain is compatible for all devices in the group,
- * hence the iommu driver should always return success.
- */
- struct iommu_domain *old_domain = group->domain;
-
- group->domain = NULL;
- WARN(__iommu_group_set_domain(group, old_domain),
- "iommu driver failed to attach a compatible domain");
- }
-
- return ret;
+ return __iommu_group_set_domain(group, domain);
}
/**
--
2.40.0
* [PATCH v3 05/17] iommu: Use __iommu_group_set_domain() in iommu_change_dev_def_domain()
2023-04-05 23:38 [PATCH v3 00/17] Consolidate the error handling around device attachment Jason Gunthorpe
` (3 preceding siblings ...)
2023-04-05 23:38 ` [PATCH v3 04/17] iommu: Use __iommu_group_set_domain() for __iommu_attach_group() Jason Gunthorpe
@ 2023-04-05 23:38 ` Jason Gunthorpe
2023-04-05 23:38 ` [PATCH v3 06/17] iommu: Replace __iommu_group_dma_first_attach() with set_domain Jason Gunthorpe
` (12 subsequent siblings)
17 siblings, 0 replies; 36+ messages in thread
From: Jason Gunthorpe @ 2023-04-05 23:38 UTC (permalink / raw)
To: iommu, Joerg Roedel, llvm, Nathan Chancellor, Nick Desaulniers,
Miguel Ojeda, Robin Murphy, Tom Rix, Will Deacon
Cc: Lu Baolu, Kevin Tian, Nicolin Chen
This is missing re-attach error handling if the attach fails; use the
common code.
The ugly "group->domain = prev_domain" will be cleaned up in a later patch.
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/iommu.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index ba9e988293f23f..cd75d7405a3051 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2963,11 +2963,12 @@ static int iommu_change_dev_def_domain(struct iommu_group *group,
if (ret)
goto restore_old_domain;
- ret = iommu_group_create_direct_mappings(group);
+ group->domain = prev_dom;
+ ret = iommu_create_device_direct_mappings(group, dev);
if (ret)
goto free_new_domain;
- ret = __iommu_attach_group(group->default_domain, group);
+ ret = __iommu_group_set_domain(group, group->default_domain);
if (ret)
goto free_new_domain;
@@ -2979,7 +2980,6 @@ static int iommu_change_dev_def_domain(struct iommu_group *group,
iommu_domain_free(group->default_domain);
restore_old_domain:
group->default_domain = prev_dom;
- group->domain = prev_dom;
return ret;
}
--
2.40.0
* [PATCH v3 06/17] iommu: Replace __iommu_group_dma_first_attach() with set_domain
2023-04-05 23:38 [PATCH v3 00/17] Consolidate the error handling around device attachment Jason Gunthorpe
` (4 preceding siblings ...)
2023-04-05 23:38 ` [PATCH v3 05/17] iommu: Use __iommu_group_set_domain() in iommu_change_dev_def_domain() Jason Gunthorpe
@ 2023-04-05 23:38 ` Jason Gunthorpe
2023-04-06 15:58 ` Robin Murphy
2023-04-05 23:38 ` [PATCH v3 07/17] iommu: Make iommu_group_do_dma_first_attach() simpler Jason Gunthorpe
` (11 subsequent siblings)
17 siblings, 1 reply; 36+ messages in thread
From: Jason Gunthorpe @ 2023-04-05 23:38 UTC (permalink / raw)
To: iommu, Joerg Roedel, llvm, Nathan Chancellor, Nick Desaulniers,
Miguel Ojeda, Robin Murphy, Tom Rix, Will Deacon
Cc: Lu Baolu, Kevin Tian, Nicolin Chen
Prepare for removing the group->domain set from
iommu_group_alloc_default_domain() by calling
__iommu_group_set_domain_internal() to set the group->domain.
Add IOMMU_SET_DOMAIN_WITH_DEFERRED to allow it to do the attach_deferred
logic.
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/iommu.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index cd75d7405a3051..af3af752c255e4 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -102,6 +102,7 @@ static int __iommu_attach_group(struct iommu_domain *domain,
enum {
IOMMU_SET_DOMAIN_MUST_SUCCEED = 1 << 0,
+ IOMMU_SET_DOMAIN_WITH_DEFERRED = 1 << 1,
};
static int __iommu_group_set_domain_internal(struct iommu_group *group,
@@ -1855,12 +1856,6 @@ static void probe_alloc_default_domain(struct bus_type *bus,
}
-static int __iommu_group_dma_first_attach(struct iommu_group *group)
-{
- return __iommu_group_for_each_dev(group, group->default_domain,
- iommu_group_do_dma_first_attach);
-}
-
static int iommu_group_do_probe_finalize(struct device *dev, void *data)
{
const struct iommu_ops *ops = dev_iommu_ops(dev);
@@ -1923,7 +1918,10 @@ int bus_iommu_probe(struct bus_type *bus)
iommu_group_create_direct_mappings(group);
- ret = __iommu_group_dma_first_attach(group);
+ group->domain = NULL;
+ ret = __iommu_group_set_domain_internal(
+ group, group->default_domain,
+ IOMMU_SET_DOMAIN_WITH_DEFERRED);
mutex_unlock(&group->mutex);
@@ -2218,6 +2216,12 @@ static int __iommu_device_set_domain(struct iommu_group *group,
{
int ret;
+ if ((flags & IOMMU_SET_DOMAIN_WITH_DEFERRED) &&
+ iommu_is_attach_deferred(dev)) {
+ dev->iommu->attach_deferred = 1;
+ return 0;
+ }
+
ret = __iommu_attach_device(new_domain, dev);
if (ret) {
/*
--
2.40.0
* [PATCH v3 07/17] iommu: Make iommu_group_do_dma_first_attach() simpler
2023-04-05 23:38 [PATCH v3 00/17] Consolidate the error handling around device attachment Jason Gunthorpe
` (5 preceding siblings ...)
2023-04-05 23:38 ` [PATCH v3 06/17] iommu: Replace __iommu_group_dma_first_attach() with set_domain Jason Gunthorpe
@ 2023-04-05 23:38 ` Jason Gunthorpe
2023-04-06 3:28 ` Baolu Lu
2023-04-06 15:44 ` Robin Murphy
2023-04-05 23:38 ` [PATCH v3 08/17] iommu: Make iommu_group_do_dma_first_attach() work with owned groups Jason Gunthorpe
` (10 subsequent siblings)
17 siblings, 2 replies; 36+ messages in thread
From: Jason Gunthorpe @ 2023-04-05 23:38 UTC (permalink / raw)
To: iommu, Joerg Roedel, llvm, Nathan Chancellor, Nick Desaulniers,
Miguel Ojeda, Robin Murphy, Tom Rix, Will Deacon
Cc: Lu Baolu, Kevin Tian, Nicolin Chen
It should always attach to the current group->domain, so don't take in a
domain parameter. Use the __iommu_device_set_domain() common code to
handle the attach.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/iommu.c | 22 +++++++++-------------
1 file changed, 9 insertions(+), 13 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index af3af752c255e4..c68cf551d05dc4 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -105,6 +105,10 @@ enum {
IOMMU_SET_DOMAIN_WITH_DEFERRED = 1 << 1,
};
+static int __iommu_device_set_domain(struct iommu_group *group,
+ struct device *dev,
+ struct iommu_domain *new_domain,
+ unsigned int flags);
static int __iommu_group_set_domain_internal(struct iommu_group *group,
struct iommu_domain *new_domain,
unsigned int flags);
@@ -405,18 +409,10 @@ static bool iommu_is_attach_deferred(struct device *dev)
return false;
}
-static int iommu_group_do_dma_first_attach(struct device *dev, void *data)
+static int iommu_group_do_dma_first_attach(struct iommu_group *group, struct device *dev)
{
- struct iommu_domain *domain = data;
-
- lockdep_assert_held(&dev->iommu_group->mutex);
-
- if (iommu_is_attach_deferred(dev)) {
- dev->iommu->attach_deferred = 1;
- return 0;
- }
-
- return __iommu_attach_device(domain, dev);
+ return __iommu_device_set_domain(group, dev, group->domain,
+ IOMMU_SET_DOMAIN_WITH_DEFERRED);
}
int iommu_probe_device(struct device *dev)
@@ -449,7 +445,7 @@ int iommu_probe_device(struct device *dev)
* attach the default domain.
*/
if (group->default_domain && !group->owner) {
- ret = iommu_group_do_dma_first_attach(dev, group->default_domain);
+ ret = iommu_group_do_dma_first_attach(group, dev);
if (ret) {
mutex_unlock(&group->mutex);
iommu_group_put(group);
@@ -1117,7 +1113,7 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
mutex_lock(&group->mutex);
list_add_tail(&device->list, &group->devices);
if (group->domain)
- ret = iommu_group_do_dma_first_attach(dev, group->domain);
+ ret = iommu_group_do_dma_first_attach(group, dev);
mutex_unlock(&group->mutex);
if (ret)
goto err_put_group;
--
2.40.0
* [PATCH v3 08/17] iommu: Make iommu_group_do_dma_first_attach() work with owned groups
2023-04-05 23:38 [PATCH v3 00/17] Consolidate the error handling around device attachment Jason Gunthorpe
` (6 preceding siblings ...)
2023-04-05 23:38 ` [PATCH v3 07/17] iommu: Make iommu_group_do_dma_first_attach() simpler Jason Gunthorpe
@ 2023-04-05 23:38 ` Jason Gunthorpe
2023-04-06 16:37 ` Robin Murphy
2023-04-05 23:38 ` [PATCH v3 09/17] iommu: Fix iommu_probe_device() to attach the right domain Jason Gunthorpe
` (9 subsequent siblings)
17 siblings, 1 reply; 36+ messages in thread
From: Jason Gunthorpe @ 2023-04-05 23:38 UTC (permalink / raw)
To: iommu, Joerg Roedel, llvm, Nathan Chancellor, Nick Desaulniers,
Miguel Ojeda, Robin Murphy, Tom Rix, Will Deacon
Cc: Lu Baolu, Kevin Tian, Nicolin Chen
If the group is already owned then deferred attach should not be done
and the device should be directly connected to the correct domain.
Owned means that some driver is already bound to devices in the group and
is operating the group possibly without DMA API support. In this case
there would be no way to correct the mismatched domain.
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/iommu.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index c68cf551d05dc4..6fe8bc78db2016 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -411,8 +411,9 @@ static bool iommu_is_attach_deferred(struct device *dev)
static int iommu_group_do_dma_first_attach(struct iommu_group *group, struct device *dev)
{
- return __iommu_device_set_domain(group, dev, group->domain,
- IOMMU_SET_DOMAIN_WITH_DEFERRED);
+ return __iommu_device_set_domain(
+ group, dev, group->domain,
+ group->owner ? 0 : IOMMU_SET_DOMAIN_WITH_DEFERRED);
}
int iommu_probe_device(struct device *dev)
--
2.40.0
* [PATCH v3 09/17] iommu: Fix iommu_probe_device() to attach the right domain
2023-04-05 23:38 [PATCH v3 00/17] Consolidate the error handling around device attachment Jason Gunthorpe
` (7 preceding siblings ...)
2023-04-05 23:38 ` [PATCH v3 08/17] iommu: Make iommu_group_do_dma_first_attach() work with owned groups Jason Gunthorpe
@ 2023-04-05 23:38 ` Jason Gunthorpe
2023-04-05 23:38 ` [PATCH v3 10/17] iommu: Do iommu_group_create_direct_mappings() before attach Jason Gunthorpe
` (8 subsequent siblings)
17 siblings, 0 replies; 36+ messages in thread
From: Jason Gunthorpe @ 2023-04-05 23:38 UTC (permalink / raw)
To: iommu, Joerg Roedel, llvm, Nathan Chancellor, Nick Desaulniers,
Miguel Ojeda, Robin Murphy, Tom Rix, Will Deacon
Cc: Lu Baolu, Kevin Tian, Nicolin Chen
The general invariant is that all devices in an iommu_group are attached
to group->domain. We missed some cases here where an owned group would not
get the device attached.
Rework this logic so it follows the default domain flow of
bus_iommu_probe(): call iommu_alloc_default_domain(), then use
__iommu_group_set_domain_internal() to set up all the devices.
Finally always attach the device to the current domain if it is already
set.
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/iommu.c | 45 ++++++++++++++++++++++++-------------------
1 file changed, 25 insertions(+), 20 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 6fe8bc78db2016..5bbd667f727605 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -432,27 +432,32 @@ int iommu_probe_device(struct device *dev)
goto err_release;
}
- /*
- * Try to allocate a default domain - needs support from the
- * IOMMU driver. There are still some drivers which don't
- * support default domains, so the return value is not yet
- * checked.
- */
mutex_lock(&group->mutex);
- iommu_alloc_default_domain(group, dev);
- /*
- * If device joined an existing group which has been claimed, don't
- * attach the default domain.
- */
- if (group->default_domain && !group->owner) {
+ if (group->domain) {
ret = iommu_group_do_dma_first_attach(group, dev);
- if (ret) {
- mutex_unlock(&group->mutex);
- iommu_group_put(group);
- goto err_release;
- }
+ } else if (!group->default_domain) {
+ /*
+ * Try to allocate a default domain - needs support from the
+ * IOMMU driver. There are still some drivers which don't
+ * support default domains, so the return value is not yet
+ * checked.
+ */
+ iommu_alloc_default_domain(group, dev);
+ group->domain = NULL;
+ if (group->default_domain)
+ ret = __iommu_group_set_domain_internal(
+ group, group->default_domain,
+ IOMMU_SET_DOMAIN_WITH_DEFERRED);
+
+ /*
+ * We assume that the iommu driver starts up the device in
+ * 'set_platform_dma_ops' mode if it does not support default
+ * domains.
+ */
}
+ if (ret)
+ goto err_unlock;
iommu_create_device_direct_mappings(group, dev);
@@ -465,6 +470,9 @@ int iommu_probe_device(struct device *dev)
return 0;
+err_unlock:
+ mutex_unlock(&group->mutex);
+ iommu_group_put(group);
err_release:
iommu_release_device(dev);
@@ -1717,9 +1725,6 @@ static int iommu_alloc_default_domain(struct iommu_group *group,
{
unsigned int type;
- if (group->default_domain)
- return 0;
-
type = iommu_get_def_domain_type(dev) ? : iommu_def_domain_type;
return iommu_group_alloc_default_domain(dev->bus, group, type);
--
2.40.0
* [PATCH v3 10/17] iommu: Do iommu_group_create_direct_mappings() before attach
2023-04-05 23:38 [PATCH v3 00/17] Consolidate the error handling around device attachment Jason Gunthorpe
` (8 preceding siblings ...)
2023-04-05 23:38 ` [PATCH v3 09/17] iommu: Fix iommu_probe_device() to attach the right domain Jason Gunthorpe
@ 2023-04-05 23:38 ` Jason Gunthorpe
2023-04-06 3:14 ` Baolu Lu
2023-04-05 23:38 ` [PATCH v3 11/17] iommu: Remove the assignment of group->domain during default domain alloc Jason Gunthorpe
` (7 subsequent siblings)
17 siblings, 1 reply; 36+ messages in thread
From: Jason Gunthorpe @ 2023-04-05 23:38 UTC (permalink / raw)
To: iommu, Joerg Roedel, llvm, Nathan Chancellor, Nick Desaulniers,
Miguel Ojeda, Robin Murphy, Tom Rix, Will Deacon
Cc: Lu Baolu, Kevin Tian, Nicolin Chen
The iommu_probe_device() path calls iommu_create_device_direct_mappings()
after attaching the device.
IOMMU_RESV_DIRECT maps need to be in place at all times, so if a
hotplugged device has new ranges they should be mapped into the default
domain before the device is attached.
Move the iommu_create_device_direct_mappings() call up.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/iommu.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 5bbd667f727605..ca209fb3ddd1c9 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -435,6 +435,7 @@ int iommu_probe_device(struct device *dev)
mutex_lock(&group->mutex);
if (group->domain) {
+ iommu_create_device_direct_mappings(group, dev);
ret = iommu_group_do_dma_first_attach(group, dev);
} else if (!group->default_domain) {
/*
@@ -445,10 +446,12 @@ int iommu_probe_device(struct device *dev)
*/
iommu_alloc_default_domain(group, dev);
group->domain = NULL;
- if (group->default_domain)
+ if (group->default_domain) {
+ iommu_create_device_direct_mappings(group, dev);
ret = __iommu_group_set_domain_internal(
group, group->default_domain,
IOMMU_SET_DOMAIN_WITH_DEFERRED);
+ }
/*
* We assume that the iommu driver starts up the device in
@@ -459,8 +462,6 @@ int iommu_probe_device(struct device *dev)
if (ret)
goto err_unlock;
- iommu_create_device_direct_mappings(group, dev);
-
mutex_unlock(&group->mutex);
iommu_group_put(group);
--
2.40.0
* [PATCH v3 11/17] iommu: Remove the assignment of group->domain during default domain alloc
2023-04-05 23:38 [PATCH v3 00/17] Consolidate the error handling around device attachment Jason Gunthorpe
` (9 preceding siblings ...)
2023-04-05 23:38 ` [PATCH v3 10/17] iommu: Do iommu_group_create_direct_mappings() before attach Jason Gunthorpe
@ 2023-04-05 23:38 ` Jason Gunthorpe
2023-04-05 23:38 ` [PATCH v3 12/17] iommu: Consolidate the code to calculate the target default domain type Jason Gunthorpe
` (6 subsequent siblings)
17 siblings, 0 replies; 36+ messages in thread
From: Jason Gunthorpe @ 2023-04-05 23:38 UTC (permalink / raw)
To: iommu, Joerg Roedel, llvm, Nathan Chancellor, Nick Desaulniers,
Miguel Ojeda, Robin Murphy, Tom Rix, Will Deacon
Cc: Lu Baolu, Kevin Tian, Nicolin Chen
group->domain should only be set once all the device's drivers have
had their ops->attach_dev() called. iommu_group_alloc_default_domain()
doesn't do this, so it shouldn't set the value.
The previous patches organized things so that each caller of
iommu_group_alloc_default_domain() follows up with calling
__iommu_group_set_domain_internal() that does set the group->domain.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/iommu.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index ca209fb3ddd1c9..efa6f193dc9da0 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -445,7 +445,6 @@ int iommu_probe_device(struct device *dev)
* checked.
*/
iommu_alloc_default_domain(group, dev);
- group->domain = NULL;
if (group->default_domain) {
iommu_create_device_direct_mappings(group, dev);
ret = __iommu_group_set_domain_internal(
@@ -1716,8 +1715,6 @@ static int iommu_group_alloc_default_domain(struct bus_type *bus,
return -ENOMEM;
group->default_domain = dom;
- if (!group->domain)
- group->domain = dom;
return 0;
}
@@ -1921,7 +1918,6 @@ int bus_iommu_probe(struct bus_type *bus)
iommu_group_create_direct_mappings(group);
- group->domain = NULL;
ret = __iommu_group_set_domain_internal(
group, group->default_domain,
IOMMU_SET_DOMAIN_WITH_DEFERRED);
--
2.40.0
* [PATCH v3 12/17] iommu: Consolidate the code to calculate the target default domain type
2023-04-05 23:38 [PATCH v3 00/17] Consolidate the error handling around device attachment Jason Gunthorpe
` (10 preceding siblings ...)
2023-04-05 23:38 ` [PATCH v3 11/17] iommu: Remove the assignment of group->domain during default domain alloc Jason Gunthorpe
@ 2023-04-05 23:38 ` Jason Gunthorpe
2023-04-05 23:38 ` [PATCH v3 13/17] iommu: Revise iommu_group_alloc_default_domain() Jason Gunthorpe
` (5 subsequent siblings)
17 siblings, 0 replies; 36+ messages in thread
From: Jason Gunthorpe @ 2023-04-05 23:38 UTC (permalink / raw)
To: iommu, Joerg Roedel, llvm, Nathan Chancellor, Nick Desaulniers,
Miguel Ojeda, Robin Murphy, Tom Rix, Will Deacon
Cc: Lu Baolu, Kevin Tian, Nicolin Chen
Put all the code to calculate the default domain type into one
function. Make the function able to handle
iommu_change_dev_def_domain() by taking in the target domain type and
erroring out if the target type isn't reachable.
This makes it really clear that specifying a 0 type during
iommu_change_dev_def_domain() will have the same outcome as the normal
probe path.
Remove the obfuscating use of __iommu_group_for_each_dev() and related
struct __group_domain_type.
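As an illustrative sketch of the two callers (based on the diff below;
"req" is a placeholder for the type parsed from sysfs): the probe path
passes 0 so the best type is picked, while the sysfs path passes the
requested type and checks for a conflict:

	/* Normal probe path: pick the best type for the whole group */
	type = iommu_get_default_domain_type(group, 0);

	/* iommu_change_dev_def_domain(): validate the requested type */
	type = iommu_get_default_domain_type(group, req);
	if (type < 0)
		return -EINVAL;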
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/iommu.c | 88 +++++++++++++++++--------------------------
1 file changed, 35 insertions(+), 53 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index efa6f193dc9da0..8623d3edc61014 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1810,50 +1810,43 @@ static int iommu_bus_notifier(struct notifier_block *nb,
return 0;
}
-struct __group_domain_type {
- struct device *dev;
- unsigned int type;
-};
-
-static int probe_get_default_domain_type(struct device *dev, void *data)
+/* A target_type of 0 will select the best domain type and cannot fail */
+static int iommu_get_default_domain_type(struct iommu_group *group,
+ int target_type)
{
- struct __group_domain_type *gtype = data;
- unsigned int type = iommu_get_def_domain_type(dev);
+ int best_type = target_type;
+ struct group_device *gdev;
+ struct device *last_dev;
- if (type) {
- if (gtype->type && gtype->type != type) {
- dev_warn(dev, "Device needs domain type %s, but device %s in the same iommu group requires type %s - using default\n",
- iommu_domain_type_str(type),
- dev_name(gtype->dev),
- iommu_domain_type_str(gtype->type));
- gtype->type = 0;
- }
+ lockdep_assert_held(&group->mutex);
+
+ for_each_group_device(group, gdev) {
+ unsigned int type = iommu_get_def_domain_type(gdev->dev);
+
+ if (best_type && type && best_type != type) {
+ if (target_type) {
+ dev_err_ratelimited(
+ gdev->dev,
+ "Device cannot be in %s domain\n",
+ iommu_domain_type_str(target_type));
+ return -1;
+ }
- if (!gtype->dev) {
- gtype->dev = dev;
- gtype->type = type;
+ dev_warn(
+ gdev->dev,
+ "Device needs domain type %s, but device %s in the same iommu group requires type %s - using default\n",
+ iommu_domain_type_str(type), dev_name(last_dev),
+ iommu_domain_type_str(best_type));
+ return 0;
}
+ if (!best_type)
+ best_type = type;
+ last_dev = gdev->dev;
}
- return 0;
-}
-
-static void probe_alloc_default_domain(struct bus_type *bus,
- struct iommu_group *group)
-{
- struct __group_domain_type gtype;
-
- memset(&gtype, 0, sizeof(gtype));
-
- /* Ask for default domain requirements of all devices in the group */
- __iommu_group_for_each_dev(group, &gtype,
- probe_get_default_domain_type);
-
- if (!gtype.type)
- gtype.type = iommu_def_domain_type;
-
- iommu_group_alloc_default_domain(bus, group, gtype.type);
-
+ if (!best_type)
+ return iommu_def_domain_type;
+ return best_type;
}
static int iommu_group_do_probe_finalize(struct device *dev, void *data)
@@ -1909,7 +1902,8 @@ int bus_iommu_probe(struct bus_type *bus)
list_del_init(&group->entry);
/* Try to allocate default domain */
- probe_alloc_default_domain(bus, group);
+ iommu_group_alloc_default_domain(
+ bus, group, iommu_get_default_domain_type(group, 0));
if (!group->default_domain) {
mutex_unlock(&group->mutex);
@@ -2929,27 +2923,15 @@ EXPORT_SYMBOL_GPL(iommu_dev_disable_feature);
static int iommu_change_dev_def_domain(struct iommu_group *group,
struct device *dev, int type)
{
- struct __group_domain_type gtype = {NULL, 0};
struct iommu_domain *prev_dom;
int ret;
lockdep_assert_held(&group->mutex);
prev_dom = group->default_domain;
- __iommu_group_for_each_dev(group, &gtype,
- probe_get_default_domain_type);
- if (!type) {
- /*
- * If the user hasn't requested any specific type of domain and
- * if the device supports both the domains, then default to the
- * domain the device was booted with
- */
- type = gtype.type ? : iommu_def_domain_type;
- } else if (gtype.type && type != gtype.type) {
- dev_err_ratelimited(dev, "Device cannot be in %s domain\n",
- iommu_domain_type_str(type));
+ type = iommu_get_default_domain_type(group, type);
+ if (type < 0)
return -EINVAL;
- }
/*
* Switch to a new domain only if the requested domain type is different
--
2.40.0
* [PATCH v3 13/17] iommu: Revise iommu_group_alloc_default_domain()
2023-04-05 23:38 [PATCH v3 00/17] Consolidate the error handling around device attachment Jason Gunthorpe
` (11 preceding siblings ...)
2023-04-05 23:38 ` [PATCH v3 12/17] iommu: Consolidate the code to calculate the target default domain type Jason Gunthorpe
@ 2023-04-05 23:38 ` Jason Gunthorpe
2023-04-06 3:40 ` Baolu Lu
2023-04-05 23:38 ` [PATCH v3 14/17] iommu: Consolidate the default_domain setup to one function Jason Gunthorpe
` (4 subsequent siblings)
17 siblings, 1 reply; 36+ messages in thread
From: Jason Gunthorpe @ 2023-04-05 23:38 UTC (permalink / raw)
To: iommu, Joerg Roedel, llvm, Nathan Chancellor, Nick Desaulniers,
Miguel Ojeda, Robin Murphy, Tom Rix, Will Deacon
Cc: Lu Baolu, Kevin Tian, Nicolin Chen
Robin points out that the fallback to guessing what domains the driver
supports should only happen if the driver doesn't return a preference from
its ops->def_domain_type().
Re-organize iommu_group_alloc_default_domain() so it internally uses
iommu_def_domain_type only during the fallback and makes it clearer how
the fallback sequence works.
Make iommu_group_alloc_default_domain() return the domain so the
return-based logic is cleaner and to prepare for the next patch.
Remove the iommu_alloc_default_domain() function as it is now trivially
just calling iommu_group_alloc_default_domain().
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/iommu.c | 81 +++++++++++++++++++++++--------------------
1 file changed, 43 insertions(+), 38 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 8623d3edc61014..5f6aecfd92c657 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -91,8 +91,9 @@ static const char * const iommu_group_resv_type_string[] = {
static int iommu_bus_notifier(struct notifier_block *nb,
unsigned long action, void *data);
-static int iommu_alloc_default_domain(struct iommu_group *group,
- struct device *dev);
+static struct iommu_domain *
+iommu_group_alloc_default_domain(struct iommu_group *group, int req_type);
+static int iommu_get_def_domain_type(struct device *dev);
static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
unsigned type);
static int __iommu_attach_device(struct iommu_domain *domain,
@@ -444,13 +445,16 @@ int iommu_probe_device(struct device *dev)
* support default domains, so the return value is not yet
* checked.
*/
- iommu_alloc_default_domain(group, dev);
- if (group->default_domain) {
- iommu_create_device_direct_mappings(group, dev);
- ret = __iommu_group_set_domain_internal(
- group, group->default_domain,
- IOMMU_SET_DOMAIN_WITH_DEFERRED);
+ group->default_domain = iommu_group_alloc_default_domain(
+ group, iommu_get_def_domain_type(dev));
+ if (!group->default_domain) {
+ ret = -EINVAL;
+ goto err_unlock;
}
+ iommu_create_device_direct_mappings(group, dev);
+ ret = __iommu_group_set_domain_internal(
+ group, group->default_domain,
+ IOMMU_SET_DOMAIN_WITH_DEFERRED);
/*
* We assume that the iommu driver starts up the device in
@@ -1697,35 +1701,38 @@ static int iommu_get_def_domain_type(struct device *dev)
return 0;
}
-static int iommu_group_alloc_default_domain(struct bus_type *bus,
- struct iommu_group *group,
- unsigned int type)
+/*
+ * req_type of 0 means "auto" which means to select a domain based on
+ * iommu_def_domain_type or what the driver actually supports.
+ */
+static struct iommu_domain *
+iommu_group_alloc_default_domain(struct iommu_group *group, int req_type)
{
+ struct bus_type *bus =
+ list_first_entry(&group->devices, struct group_device, list)
+ ->dev->bus;
struct iommu_domain *dom;
- dom = __iommu_domain_alloc(bus, type);
- if (!dom && type != IOMMU_DOMAIN_DMA) {
- dom = __iommu_domain_alloc(bus, IOMMU_DOMAIN_DMA);
- if (dom)
- pr_warn("Failed to allocate default IOMMU domain of type %u for group %s - Falling back to IOMMU_DOMAIN_DMA",
- type, group->name);
- }
+ lockdep_assert_held(&group->mutex);
- if (!dom)
- return -ENOMEM;
+ if (req_type)
+ return __iommu_domain_alloc(bus, req_type);
- group->default_domain = dom;
- return 0;
-}
-
-static int iommu_alloc_default_domain(struct iommu_group *group,
- struct device *dev)
-{
- unsigned int type;
+ /* The driver gave no guidance on what type to use, try the default */
+ dom = __iommu_domain_alloc(bus, iommu_def_domain_type);
+ if (dom)
+ return dom;
- type = iommu_get_def_domain_type(dev) ? : iommu_def_domain_type;
+ /* Otherwise IDENTITY and DMA_FQ defaults will try DMA */
+ if (iommu_def_domain_type == IOMMU_DOMAIN_DMA)
+ return NULL;
+ dom = __iommu_domain_alloc(bus, IOMMU_DOMAIN_DMA);
+ if (!dom)
+ return NULL;
- return iommu_group_alloc_default_domain(dev->bus, group, type);
+ pr_warn("Failed to allocate default IOMMU domain of type %u for group %s - Falling back to IOMMU_DOMAIN_DMA",
+ iommu_def_domain_type, group->name);
+ return dom;
}
/**
@@ -1843,9 +1850,6 @@ static int iommu_get_default_domain_type(struct iommu_group *group,
best_type = type;
last_dev = gdev->dev;
}
-
- if (!best_type)
- return iommu_def_domain_type;
return best_type;
}
@@ -1902,9 +1906,8 @@ int bus_iommu_probe(struct bus_type *bus)
list_del_init(&group->entry);
/* Try to allocate default domain */
- iommu_group_alloc_default_domain(
- bus, group, iommu_get_default_domain_type(group, 0));
-
+ group->default_domain = iommu_group_alloc_default_domain(
+ group, iommu_get_default_domain_type(group, 0));
if (!group->default_domain) {
mutex_unlock(&group->mutex);
continue;
@@ -2944,9 +2947,11 @@ static int iommu_change_dev_def_domain(struct iommu_group *group,
group->domain = NULL;
/* Sets group->default_domain to the newly allocated domain */
- ret = iommu_group_alloc_default_domain(dev->bus, group, type);
- if (ret)
+ group->default_domain = iommu_group_alloc_default_domain(group, type);
+ if (!group->default_domain) {
+ ret = -EINVAL;
goto restore_old_domain;
+ }
group->domain = prev_dom;
ret = iommu_create_device_direct_mappings(group, dev);
--
2.40.0
* [PATCH v3 14/17] iommu: Consolidate the default_domain setup to one function
2023-04-05 23:38 [PATCH v3 00/17] Consolidate the error handling around device attachment Jason Gunthorpe
` (12 preceding siblings ...)
2023-04-05 23:38 ` [PATCH v3 13/17] iommu: Revise iommu_group_alloc_default_domain() Jason Gunthorpe
@ 2023-04-05 23:38 ` Jason Gunthorpe
2023-04-06 4:43 ` Baolu Lu
2023-04-05 23:38 ` [PATCH v3 15/17] iommu: Allow IOMMU_RESV_DIRECT to work on ARM Jason Gunthorpe
` (3 subsequent siblings)
17 siblings, 1 reply; 36+ messages in thread
From: Jason Gunthorpe @ 2023-04-05 23:38 UTC (permalink / raw)
To: iommu, Joerg Roedel, llvm, Nathan Chancellor, Nick Desaulniers,
Miguel Ojeda, Robin Murphy, Tom Rix, Will Deacon
Cc: Lu Baolu, Kevin Tian, Nicolin Chen
Make iommu_change_dev_def_domain() general enough to set up the initial
default_domain or replace it with a new default_domain. Call the new
function iommu_setup_default_domain() and make it the only place in the
code that stores to group->default_domain.
Consolidate the three copies of the default_domain setup sequence. The
flow requires:
- Determining the domain type to use
- Checking if the current default domain is the same type
- Allocating a domain
- Doing iommu_create_device_direct_mappings()
- Attaching it to devices
- Store group->default_domain
This adjusts the domain allocation from the prior patch to be able to
detect whether each of the allocation steps returns the domain we
already have, which is a more robust version of what the
change-default-domain path was already doing.
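A condensed sketch of that sequence (simplified from the diff below; the
real helper also restores or frees the previous default_domain on
failure):

static int setup_default_domain_sketch(struct iommu_group *group,
				       int target_type)
{
	struct group_device *gdev;
	struct iommu_domain *dom;
	int req_type, ret;

	/* 1) Determine the type to use (target_type == 0 means "auto") */
	req_type = iommu_get_default_domain_type(group, target_type);
	if (req_type < 0)
		return -EINVAL;

	/* 2)+3) Allocate, reusing the current default_domain if it is
	 *       already of the requested type */
	dom = iommu_group_alloc_default_domain(group, req_type);
	if (!dom)
		return -ENOMEM;

	/* 4) Put IOMMU_RESV_DIRECT mappings in place before any attach */
	for_each_group_device(group, gdev)
		iommu_create_device_direct_mappings(dom, gdev->dev);

	/* 5) Attach every device to the new domain */
	ret = __iommu_group_set_domain(group, dom);
	if (ret)
		return ret;

	/* 6) Record the new default_domain */
	group->default_domain = dom;
	return 0;
}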
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/iommu.c | 200 ++++++++++++++++++------------------------
1 file changed, 83 insertions(+), 117 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 5f6aecfd92c657..b1e8e8bf955bcd 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -91,9 +91,6 @@ static const char * const iommu_group_resv_type_string[] = {
static int iommu_bus_notifier(struct notifier_block *nb,
unsigned long action, void *data);
-static struct iommu_domain *
-iommu_group_alloc_default_domain(struct iommu_group *group, int req_type);
-static int iommu_get_def_domain_type(struct device *dev);
static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
unsigned type);
static int __iommu_attach_device(struct iommu_domain *domain,
@@ -125,7 +122,9 @@ static void __iommu_group_set_domain_nofail(struct iommu_group *group,
group, new_domain, IOMMU_SET_DOMAIN_MUST_SUCCEED));
}
-static int iommu_create_device_direct_mappings(struct iommu_group *group,
+static int iommu_setup_default_domain(struct iommu_group *group,
+ int target_type);
+static int iommu_create_device_direct_mappings(struct iommu_domain *domain,
struct device *dev);
static struct iommu_group *iommu_group_get_for_dev(struct device *dev);
static ssize_t iommu_group_store_type(struct iommu_group *group,
@@ -436,34 +435,15 @@ int iommu_probe_device(struct device *dev)
mutex_lock(&group->mutex);
if (group->domain) {
- iommu_create_device_direct_mappings(group, dev);
+ iommu_create_device_direct_mappings(group->domain, dev);
ret = iommu_group_do_dma_first_attach(group, dev);
+ if (ret)
+ goto err_unlock;
} else if (!group->default_domain) {
- /*
- * Try to allocate a default domain - needs support from the
- * IOMMU driver. There are still some drivers which don't
- * support default domains, so the return value is not yet
- * checked.
- */
- group->default_domain = iommu_group_alloc_default_domain(
- group, iommu_get_def_domain_type(dev));
- if (!group->default_domain) {
- ret = -EINVAL;
+ ret = iommu_setup_default_domain(group, 0);
+ if (ret)
goto err_unlock;
- }
- iommu_create_device_direct_mappings(group, dev);
- ret = __iommu_group_set_domain_internal(
- group, group->default_domain,
- IOMMU_SET_DOMAIN_WITH_DEFERRED);
-
- /*
- * We assume that the iommu driver starts up the device in
- * 'set_platform_dma_ops' mode if it does not support default
- * domains.
- */
}
- if (ret)
- goto err_unlock;
mutex_unlock(&group->mutex);
iommu_group_put(group);
@@ -1009,16 +989,15 @@ int iommu_group_set_name(struct iommu_group *group, const char *name)
}
EXPORT_SYMBOL_GPL(iommu_group_set_name);
-static int iommu_create_device_direct_mappings(struct iommu_group *group,
+static int iommu_create_device_direct_mappings(struct iommu_domain *domain,
struct device *dev)
{
- struct iommu_domain *domain = group->default_domain;
struct iommu_resv_region *entry;
struct list_head mappings;
unsigned long pg_size;
int ret = 0;
- if (!domain || !iommu_is_dma_domain(domain))
+ if (!iommu_is_dma_domain(domain))
return 0;
BUG_ON(!domain->pgsize_bitmap);
@@ -1701,6 +1680,15 @@ static int iommu_get_def_domain_type(struct device *dev)
return 0;
}
+static struct iommu_domain *
+__iommu_group_alloc_default_domain(struct bus_type *bus,
+ struct iommu_group *group, int req_type)
+{
+ if (group->default_domain && group->default_domain->type == req_type)
+ return group->default_domain;
+ return __iommu_domain_alloc(bus, req_type);
+}
+
/*
* req_type of 0 means "auto" which means to select a domain based on
* iommu_def_domain_type or what the driver actually supports.
@@ -1716,17 +1704,17 @@ iommu_group_alloc_default_domain(struct iommu_group *group, int req_type)
lockdep_assert_held(&group->mutex);
if (req_type)
- return __iommu_domain_alloc(bus, req_type);
+ return __iommu_group_alloc_default_domain(bus, group, req_type);
/* The driver gave no guidance on what type to use, try the default */
- dom = __iommu_domain_alloc(bus, iommu_def_domain_type);
+ dom = __iommu_group_alloc_default_domain(bus, group, iommu_def_domain_type);
if (dom)
return dom;
/* Otherwise IDENTITY and DMA_FQ defaults will try DMA */
if (iommu_def_domain_type == IOMMU_DOMAIN_DMA)
return NULL;
- dom = __iommu_domain_alloc(bus, IOMMU_DOMAIN_DMA);
+ dom = __iommu_group_alloc_default_domain(bus, group, IOMMU_DOMAIN_DMA);
if (!dom)
return NULL;
@@ -1869,21 +1857,6 @@ static void __iommu_group_dma_finalize(struct iommu_group *group)
iommu_group_do_probe_finalize);
}
-static int iommu_do_create_direct_mappings(struct device *dev, void *data)
-{
- struct iommu_group *group = data;
-
- iommu_create_device_direct_mappings(group, dev);
-
- return 0;
-}
-
-static int iommu_group_create_direct_mappings(struct iommu_group *group)
-{
- return __iommu_group_for_each_dev(group, group,
- iommu_do_create_direct_mappings);
-}
-
int bus_iommu_probe(struct bus_type *bus)
{
struct iommu_group *group, *next;
@@ -1905,29 +1878,16 @@ int bus_iommu_probe(struct bus_type *bus)
/* Remove item from the list */
list_del_init(&group->entry);
- /* Try to allocate default domain */
- group->default_domain = iommu_group_alloc_default_domain(
- group, iommu_get_default_domain_type(group, 0));
- if (!group->default_domain) {
+ ret = iommu_setup_default_domain(group, 0);
+ if (ret) {
mutex_unlock(&group->mutex);
- continue;
+ return ret;
}
-
- iommu_group_create_direct_mappings(group);
-
- ret = __iommu_group_set_domain_internal(
- group, group->default_domain,
- IOMMU_SET_DOMAIN_WITH_DEFERRED);
-
mutex_unlock(&group->mutex);
-
- if (ret)
- break;
-
__iommu_group_dma_finalize(group);
}
- return ret;
+ return 0;
}
bool iommu_present(struct bus_type *bus)
@@ -2909,68 +2869,79 @@ int iommu_dev_disable_feature(struct device *dev, enum iommu_dev_features feat)
}
EXPORT_SYMBOL_GPL(iommu_dev_disable_feature);
-/*
- * Changes the default domain of an iommu group
- *
- * @group: The group for which the default domain should be changed
- * @dev: The first device in the group
- * @type: The type of the new default domain that gets associated with the group
- *
- * Returns 0 on success and error code on failure
+/**
+ * iommu_setup_default_domain - Set the default_domain for the group
+ * @group: Group to change
+ * @target_type: Domain type to set as the default_domain
*
- * Note:
- * 1. Presently, this function is called only when user requests to change the
- * group's default domain type through /sys/kernel/iommu_groups/<grp_id>/type
- * Please take a closer look if intended to use for other purposes.
+ * Allocate a default domain and set it as the current domain on the group. If
+ * the group already has a default domain it will be changed to the target_type.
+ * When target_type is 0 the default domain is selected based on driver and
+ * system preferences.
*/
-static int iommu_change_dev_def_domain(struct iommu_group *group,
- struct device *dev, int type)
+static int iommu_setup_default_domain(struct iommu_group *group,
+ int target_type)
{
- struct iommu_domain *prev_dom;
+ struct group_device *gdev;
+ struct iommu_domain *dom;
+ int req_type;
int ret;
lockdep_assert_held(&group->mutex);
- prev_dom = group->default_domain;
- type = iommu_get_default_domain_type(group, type);
- if (type < 0)
+ req_type = iommu_get_default_domain_type(group, target_type);
+ if (req_type < 0)
return -EINVAL;
/*
- * Switch to a new domain only if the requested domain type is different
- * from the existing default domain type
+ * There are still some drivers which don't support default domains, so
+ * we ignore the failure and leave group->default_domain NULL.
+ *
+ * We assume that the iommu driver starts up the device in
+ * 'set_platform_dma_ops' mode if it does not support default domains.
*/
- if (prev_dom->type == type)
- return 0;
-
- group->default_domain = NULL;
- group->domain = NULL;
-
- /* Sets group->default_domain to the newly allocated domain */
- group->default_domain = iommu_group_alloc_default_domain(group, type);
- if (!group->default_domain) {
- ret = -EINVAL;
- goto restore_old_domain;
+ dom = iommu_group_alloc_default_domain(group, req_type);
+ if (!dom) {
+ /* Once in default_domain mode we never leave */
+ if (group->default_domain)
+ return -ENODEV;
+ ret = 0;
+ goto out_set;
}
- group->domain = prev_dom;
- ret = iommu_create_device_direct_mappings(group, dev);
- if (ret)
- goto free_new_domain;
-
- ret = __iommu_group_set_domain(group, group->default_domain);
- if (ret)
- goto free_new_domain;
+ if (group->default_domain == dom)
+ return 0;
- iommu_domain_free(prev_dom);
+ /*
+ * IOMMU_RESV_DIRECT and IOMMU_RESV_DIRECT_RELAXABLE regions must be
+ * mapped before their device is attached, in order to guarantee
+ * continuity with any FW activity
+ */
+ for_each_group_device(group, gdev)
+ iommu_create_device_direct_mappings(dom, gdev->dev);
- return 0;
+ ret = __iommu_group_set_domain_internal(group, dom,
+ IOMMU_SET_DOMAIN_WITH_DEFERRED);
+ if (ret) {
+ /*
+ * An attach_dev failure may result in some devices being left
+ * attached to dom. This is not cleaned up until release_device
+ * is called. Thus we can't always free dom on failure, we have
+ * no choice but to stick the broken domain into
+ * group->default_domain to defer the free and try to continue.
+ */
+ if (list_count_nodes(&group->devices) > 1)
+ goto out_set;
-free_new_domain:
- iommu_domain_free(group->default_domain);
-restore_old_domain:
- group->default_domain = prev_dom;
+ iommu_domain_free(dom);
+ dom = NULL;
+ goto out_set;
+ }
+out_set:
+ if (group->default_domain)
+ iommu_domain_free(group->default_domain);
+ group->default_domain = dom;
return ret;
}
@@ -2986,8 +2957,6 @@ static int iommu_change_dev_def_domain(struct iommu_group *group,
static ssize_t iommu_group_store_type(struct iommu_group *group,
const char *buf, size_t count)
{
- struct group_device *grp_dev;
- struct device *dev;
int ret, req_type;
if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO))
@@ -3025,10 +2994,7 @@ static ssize_t iommu_group_store_type(struct iommu_group *group,
return -EPERM;
}
- grp_dev = list_first_entry(&group->devices, struct group_device, list);
- dev = grp_dev->dev;
-
- ret = iommu_change_dev_def_domain(group, dev, req_type);
+ ret = iommu_setup_default_domain(group, req_type);
/*
* Release the mutex here because ops->probe_finalize() call-back of
--
2.40.0
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH v3 15/17] iommu: Allow IOMMU_RESV_DIRECT to work on ARM
2023-04-05 23:38 [PATCH v3 00/17] Consolidate the error handling around device attachment Jason Gunthorpe
` (13 preceding siblings ...)
2023-04-05 23:38 ` [PATCH v3 14/17] iommu: Consolidate the default_domain setup to one function Jason Gunthorpe
@ 2023-04-05 23:38 ` Jason Gunthorpe
2023-04-06 3:53 ` Baolu Lu
2023-04-05 23:38 ` [PATCH v3 16/17] iommu: Remove __iommu_group_for_each_dev() Jason Gunthorpe
` (2 subsequent siblings)
17 siblings, 1 reply; 36+ messages in thread
From: Jason Gunthorpe @ 2023-04-05 23:38 UTC (permalink / raw)
To: iommu, Joerg Roedel, llvm, Nathan Chancellor, Nick Desaulniers,
Miguel Ojeda, Robin Murphy, Tom Rix, Will Deacon
Cc: Lu Baolu, Kevin Tian, Nicolin Chen
For now several ARM drivers do not allow mappings to be created until a
domain is attached. This means they do not technically support
IOMMU_RESV_DIRECT as it requires the 1:1 maps to work continuously.
Currently if the platform requests these maps on ARM systems they are
silently ignored.
Work around this by trying again to establish the direct mappings after
the domain is attached if the pre-attach attempt failed.
In the long run the drivers will be fixed to fully setup domains when they
are created without waiting for attachment.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/iommu.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index b1e8e8bf955bcd..d0aa837b31cdfd 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2884,6 +2884,7 @@ static int iommu_setup_default_domain(struct iommu_group *group,
{
struct group_device *gdev;
struct iommu_domain *dom;
+ bool direct_failed;
int req_type;
int ret;
@@ -2917,8 +2918,11 @@ static int iommu_setup_default_domain(struct iommu_group *group,
* mapped before their device is attached, in order to guarantee
* continuity with any FW activity
*/
- for_each_group_device(group, gdev)
- iommu_create_device_direct_mappings(dom, gdev->dev);
+ direct_failed = false;
+ for_each_group_device(group, gdev) {
+ if (iommu_create_device_direct_mappings(dom, gdev->dev))
+ direct_failed = true;
+ }
ret = __iommu_group_set_domain_internal(group, dom,
IOMMU_SET_DOMAIN_WITH_DEFERRED);
@@ -2938,6 +2942,17 @@ static int iommu_setup_default_domain(struct iommu_group *group,
goto out_set;
}
+ /*
+ * Drivers are supposed to allow mappings to be installed in a domain
+ * before device attachment, but some don't. Hack around this defect by
+ * trying again after attaching. If this happens it means the device
+ * will not continuously have the IOMMU_RESV_DIRECT map.
+ */
+ if (direct_failed) {
+ for_each_group_device(group, gdev)
+ iommu_create_device_direct_mappings(dom, gdev->dev);
+ }
+
out_set:
if (group->default_domain)
iommu_domain_free(group->default_domain);
--
2.40.0
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH v3 16/17] iommu: Remove __iommu_group_for_each_dev()
2023-04-05 23:38 [PATCH v3 00/17] Consolidate the error handling around device attachment Jason Gunthorpe
` (14 preceding siblings ...)
2023-04-05 23:38 ` [PATCH v3 15/17] iommu: Allow IOMMU_RESV_DIRECT to work on ARM Jason Gunthorpe
@ 2023-04-05 23:38 ` Jason Gunthorpe
2023-04-05 23:38 ` [PATCH v3 17/17] iommu: Tidy the control flow in iommu_group_store_type() Jason Gunthorpe
2023-04-07 11:02 ` [PATCH v3 00/17] Consolidate the error handling around device attachment Naresh Kamboju
17 siblings, 0 replies; 36+ messages in thread
From: Jason Gunthorpe @ 2023-04-05 23:38 UTC (permalink / raw)
To: iommu, Joerg Roedel, llvm, Nathan Chancellor, Nick Desaulniers,
Miguel Ojeda, Robin Murphy, Tom Rix, Will Deacon
Cc: Lu Baolu, Kevin Tian, Nicolin Chen
The last two users of it are quite trivial, just open code the one line
loop.
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/iommu.c | 53 ++++++++++++++++++++-----------------------
1 file changed, 25 insertions(+), 28 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index d0aa837b31cdfd..cfacd99df78a05 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1160,20 +1160,6 @@ void iommu_group_remove_device(struct device *dev)
}
EXPORT_SYMBOL_GPL(iommu_group_remove_device);
-static int __iommu_group_for_each_dev(struct iommu_group *group, void *data,
- int (*fn)(struct device *, void *))
-{
- struct group_device *device;
- int ret = 0;
-
- for_each_group_device(group, device) {
- ret = fn(device->dev, data);
- if (ret)
- break;
- }
- return ret;
-}
-
/**
* iommu_group_for_each_dev - iterate over each device in the group
* @group: the group
@@ -1188,10 +1174,15 @@ static int __iommu_group_for_each_dev(struct iommu_group *group, void *data,
int iommu_group_for_each_dev(struct iommu_group *group, void *data,
int (*fn)(struct device *, void *))
{
- int ret;
+ struct group_device *device;
+ int ret = 0;
mutex_lock(&group->mutex);
- ret = __iommu_group_for_each_dev(group, data, fn);
+ for_each_group_device(group, device) {
+ ret = fn(device->dev, data);
+ if (ret)
+ break;
+ }
mutex_unlock(&group->mutex);
return ret;
@@ -1841,20 +1832,12 @@ static int iommu_get_default_domain_type(struct iommu_group *group,
return best_type;
}
-static int iommu_group_do_probe_finalize(struct device *dev, void *data)
+static void iommu_group_do_probe_finalize(struct device *dev)
{
const struct iommu_ops *ops = dev_iommu_ops(dev);
if (ops->probe_finalize)
ops->probe_finalize(dev);
-
- return 0;
-}
-
-static void __iommu_group_dma_finalize(struct iommu_group *group)
-{
- __iommu_group_for_each_dev(group, group->default_domain,
- iommu_group_do_probe_finalize);
}
int bus_iommu_probe(struct bus_type *bus)
@@ -1873,6 +1856,8 @@ int bus_iommu_probe(struct bus_type *bus)
return ret;
list_for_each_entry_safe(group, next, &group_list, entry) {
+ struct group_device *gdev;
+
mutex_lock(&group->mutex);
/* Remove item from the list */
@@ -1884,7 +1869,15 @@ int bus_iommu_probe(struct bus_type *bus)
return ret;
}
mutex_unlock(&group->mutex);
- __iommu_group_dma_finalize(group);
+
+ /*
+ * Mis-locked because the ops->probe_finalize() call-back of
+ * some IOMMU drivers calls arm_iommu_attach_device() which
+ * in-turn might call back into IOMMU core code, where it tries
+ * to take group->mutex, resulting in a deadlock.
+ */
+ for_each_group_device(group, gdev)
+ iommu_group_do_probe_finalize(gdev->dev);
}
return 0;
@@ -3020,8 +3013,12 @@ static ssize_t iommu_group_store_type(struct iommu_group *group,
mutex_unlock(&group->mutex);
/* Make sure dma_ops is appropriatley set */
- if (!ret)
- __iommu_group_dma_finalize(group);
+ if (!ret) {
+ struct group_device *gdev;
+
+ for_each_group_device(group, gdev)
+ iommu_group_do_probe_finalize(gdev->dev);
+ }
return ret ?: count;
}
--
2.40.0
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH v3 17/17] iommu: Tidy the control flow in iommu_group_store_type()
2023-04-05 23:38 [PATCH v3 00/17] Consolidate the error handling around device attachment Jason Gunthorpe
` (15 preceding siblings ...)
2023-04-05 23:38 ` [PATCH v3 16/17] iommu: Remove __iommu_group_for_each_dev() Jason Gunthorpe
@ 2023-04-05 23:38 ` Jason Gunthorpe
2023-04-07 11:02 ` [PATCH v3 00/17] Consolidate the error handling around device attachment Naresh Kamboju
17 siblings, 0 replies; 36+ messages in thread
From: Jason Gunthorpe @ 2023-04-05 23:38 UTC (permalink / raw)
To: iommu, Joerg Roedel, llvm, Nathan Chancellor, Nick Desaulniers,
Miguel Ojeda, Robin Murphy, Tom Rix, Will Deacon
Cc: Lu Baolu, Kevin Tian, Nicolin Chen
Use a normal "goto unwind" instead of trying to be clever with checking
!ret and manually managing the unlock.
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/iommu.c | 27 +++++++++++++++------------
1 file changed, 15 insertions(+), 12 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index cfacd99df78a05..c72aa3e81aae50 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2965,6 +2965,7 @@ static int iommu_setup_default_domain(struct iommu_group *group,
static ssize_t iommu_group_store_type(struct iommu_group *group,
const char *buf, size_t count)
{
+ struct group_device *gdev;
int ret, req_type;
if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO))
@@ -2989,20 +2990,23 @@ static ssize_t iommu_group_store_type(struct iommu_group *group,
if (req_type == IOMMU_DOMAIN_DMA_FQ &&
group->default_domain->type == IOMMU_DOMAIN_DMA) {
ret = iommu_dma_init_fq(group->default_domain);
- if (!ret)
- group->default_domain->type = IOMMU_DOMAIN_DMA_FQ;
- mutex_unlock(&group->mutex);
+ if (ret)
+ goto out_unlock;
- return ret ?: count;
+ group->default_domain->type = IOMMU_DOMAIN_DMA_FQ;
+ ret = count;
+ goto out_unlock;
}
/* Otherwise, ensure that device exists and no driver is bound. */
if (list_empty(&group->devices) || group->owner_cnt) {
- mutex_unlock(&group->mutex);
- return -EPERM;
+ ret = -EPERM;
+ goto out_unlock;
}
ret = iommu_setup_default_domain(group, req_type);
+ if (ret)
+ goto out_unlock;
/*
* Release the mutex here because ops->probe_finalize() call-back of
@@ -3013,13 +3017,12 @@ static ssize_t iommu_group_store_type(struct iommu_group *group,
mutex_unlock(&group->mutex);
/* Make sure dma_ops is appropriatley set */
- if (!ret) {
- struct group_device *gdev;
-
- for_each_group_device(group, gdev)
- iommu_group_do_probe_finalize(gdev->dev);
- }
+ for_each_group_device(group, gdev)
+ iommu_group_do_probe_finalize(gdev->dev);
+ return count;
+out_unlock:
+ mutex_unlock(&group->mutex);
return ret ?: count;
}
--
2.40.0
^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: [PATCH v3 10/17] iommu: Do iommu_group_create_direct_mappings() before attach
2023-04-05 23:38 ` [PATCH v3 10/17] iommu: Do iommu_group_create_direct_mappings() before attach Jason Gunthorpe
@ 2023-04-06 3:14 ` Baolu Lu
0 siblings, 0 replies; 36+ messages in thread
From: Baolu Lu @ 2023-04-06 3:14 UTC (permalink / raw)
To: Jason Gunthorpe, iommu, Joerg Roedel, llvm, Nathan Chancellor,
Nick Desaulniers, Miguel Ojeda, Robin Murphy, Tom Rix,
Will Deacon
Cc: baolu.lu, Kevin Tian, Nicolin Chen
On 4/6/23 7:38 AM, Jason Gunthorpe wrote:
> The iommu_probe_device() path calls iommu_create_device_direct_mappings()
> after attaching the device.
>
> IOMMU_RESV_DIRECT maps need to be continually in place, so if a hotplugged
> device has new ranges they should have been mapped into the default domain
> before it is attached.
>
> Move the iommu_create_device_direct_mappings() call up.
>
> Signed-off-by: Jason Gunthorpe<jgg@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Best regards,
baolu
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v3 07/17] iommu: Make iommu_group_do_dma_first_attach() simpler
2023-04-05 23:38 ` [PATCH v3 07/17] iommu: Make iommu_group_do_dma_first_attach() simpler Jason Gunthorpe
@ 2023-04-06 3:28 ` Baolu Lu
2023-04-06 15:44 ` Robin Murphy
1 sibling, 0 replies; 36+ messages in thread
From: Baolu Lu @ 2023-04-06 3:28 UTC (permalink / raw)
To: Jason Gunthorpe, iommu, Joerg Roedel, llvm, Nathan Chancellor,
Nick Desaulniers, Miguel Ojeda, Robin Murphy, Tom Rix,
Will Deacon
Cc: baolu.lu, Kevin Tian, Nicolin Chen
On 4/6/23 7:38 AM, Jason Gunthorpe wrote:
> It should always attach to the current group->domain, so don't take in a
> domain parameter. Use the __iommu_device_set_domain() common code to
> handle the attach.
>
> Reviewed-by: Kevin Tian<kevin.tian@intel.com>
> Signed-off-by: Jason Gunthorpe<jgg@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Best regards,
baolu
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v3 13/17] iommu: Revise iommu_group_alloc_default_domain()
2023-04-05 23:38 ` [PATCH v3 13/17] iommu: Revise iommu_group_alloc_default_domain() Jason Gunthorpe
@ 2023-04-06 3:40 ` Baolu Lu
2023-04-06 11:34 ` Jason Gunthorpe
0 siblings, 1 reply; 36+ messages in thread
From: Baolu Lu @ 2023-04-06 3:40 UTC (permalink / raw)
To: Jason Gunthorpe, iommu, Joerg Roedel, llvm, Nathan Chancellor,
Nick Desaulniers, Miguel Ojeda, Robin Murphy, Tom Rix,
Will Deacon
Cc: baolu.lu, Kevin Tian, Nicolin Chen
On 4/6/23 7:38 AM, Jason Gunthorpe wrote:
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 8623d3edc61014..5f6aecfd92c657 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -91,8 +91,9 @@ static const char * const iommu_group_resv_type_string[] = {
>
> static int iommu_bus_notifier(struct notifier_block *nb,
> unsigned long action, void *data);
> -static int iommu_alloc_default_domain(struct iommu_group *group,
> - struct device *dev);
> +static struct iommu_domain *
> +iommu_group_alloc_default_domain(struct iommu_group *group, int req_type);
> +static int iommu_get_def_domain_type(struct device *dev);
> static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
> unsigned type);
> static int __iommu_attach_device(struct iommu_domain *domain,
> @@ -444,13 +445,16 @@ int iommu_probe_device(struct device *dev)
> * support default domains, so the return value is not yet
> * checked.
> */
> - iommu_alloc_default_domain(group, dev);
> - if (group->default_domain) {
> - iommu_create_device_direct_mappings(group, dev);
> - ret = __iommu_group_set_domain_internal(
> - group, group->default_domain,
> - IOMMU_SET_DOMAIN_WITH_DEFERRED);
> + group->default_domain = iommu_group_alloc_default_domain(
> + group, iommu_get_def_domain_type(dev));
> + if (!group->default_domain) {
> + ret = -EINVAL;
> + goto err_unlock;
> }
There are still some drivers that don't support default domains. It was
allowed previously, but the above seems to change that behavior:
iommu_probe_device() now fails on those platforms. Or perhaps I
overlooked something?
> + iommu_create_device_direct_mappings(group, dev);
> + ret = __iommu_group_set_domain_internal(
> + group, group->default_domain,
> + IOMMU_SET_DOMAIN_WITH_DEFERRED);
Best regards,
baolu
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v3 15/17] iommu: Allow IOMMU_RESV_DIRECT to work on ARM
2023-04-05 23:38 ` [PATCH v3 15/17] iommu: Allow IOMMU_RESV_DIRECT to work on ARM Jason Gunthorpe
@ 2023-04-06 3:53 ` Baolu Lu
0 siblings, 0 replies; 36+ messages in thread
From: Baolu Lu @ 2023-04-06 3:53 UTC (permalink / raw)
To: Jason Gunthorpe, iommu, Joerg Roedel, llvm, Nathan Chancellor,
Nick Desaulniers, Miguel Ojeda, Robin Murphy, Tom Rix,
Will Deacon
Cc: baolu.lu, Kevin Tian, Nicolin Chen
On 4/6/23 7:38 AM, Jason Gunthorpe wrote:
> For now several ARM drivers do not allow mappings to be created until a
> domain is attached. This means they do not technically support
> IOMMU_RESV_DIRECT as it requires the 1:1 maps to work continuously.
>
> Currently if the platform requests these maps on ARM systems they are
> silently ignored.
>
> Work around this by trying again to establish the direct mappings after
> the domain is attached if the pre-attach attempt failed.
>
> In the long run the drivers will be fixed to fully setup domains when they
> are created without waiting for attachment.
>
> Signed-off-by: Jason Gunthorpe<jgg@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Best regards,
baolu
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v3 14/17] iommu: Consolidate the default_domain setup to one function
2023-04-05 23:38 ` [PATCH v3 14/17] iommu: Consolidate the default_domain setup to one function Jason Gunthorpe
@ 2023-04-06 4:43 ` Baolu Lu
0 siblings, 0 replies; 36+ messages in thread
From: Baolu Lu @ 2023-04-06 4:43 UTC (permalink / raw)
To: Jason Gunthorpe, iommu, Joerg Roedel, llvm, Nathan Chancellor,
Nick Desaulniers, Miguel Ojeda, Robin Murphy, Tom Rix,
Will Deacon
Cc: baolu.lu, Kevin Tian, Nicolin Chen
On 4/6/23 7:38 AM, Jason Gunthorpe wrote:
> Make iommu_change_dev_def_domain() general enough to setup the initial
> default_domain or replace it with a new default_domain. Call the new
> function iommu_setup_default_domain() and make it the only place in the
> code that stores to group->default_domain.
>
> Consolidate the three copies of the default_domain setup sequence. The flow
> requires:
>
> - Determining the domain type to use
> - Checking if the current default domain is the same type
> - Allocating a domain
> - Doing iommu_create_device_direct_mappings()
> - Attaching it to devices
> - Store group->default_domain
>
> This adjusts the domain allocation from the prior patch to be able to
> detect if each of the allocation steps is already the domain we already
> have, which is a more robust version of what change default domain was
> already doing.
>
> Signed-off-by: Jason Gunthorpe<jgg@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Best regards,
baolu
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v3 13/17] iommu: Revise iommu_group_alloc_default_domain()
2023-04-06 3:40 ` Baolu Lu
@ 2023-04-06 11:34 ` Jason Gunthorpe
0 siblings, 0 replies; 36+ messages in thread
From: Jason Gunthorpe @ 2023-04-06 11:34 UTC (permalink / raw)
To: Baolu Lu
Cc: iommu, Joerg Roedel, llvm, Nathan Chancellor, Nick Desaulniers,
Miguel Ojeda, Robin Murphy, Tom Rix, Will Deacon, Kevin Tian,
Nicolin Chen
On Thu, Apr 06, 2023 at 11:40:12AM +0800, Baolu Lu wrote:
> On 4/6/23 7:38 AM, Jason Gunthorpe wrote:
> > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> > index 8623d3edc61014..5f6aecfd92c657 100644
> > --- a/drivers/iommu/iommu.c
> > +++ b/drivers/iommu/iommu.c
> > @@ -91,8 +91,9 @@ static const char * const iommu_group_resv_type_string[] = {
> > static int iommu_bus_notifier(struct notifier_block *nb,
> > unsigned long action, void *data);
> > -static int iommu_alloc_default_domain(struct iommu_group *group,
> > - struct device *dev);
> > +static struct iommu_domain *
> > +iommu_group_alloc_default_domain(struct iommu_group *group, int req_type);
> > +static int iommu_get_def_domain_type(struct device *dev);
> > static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
> > unsigned type);
> > static int __iommu_attach_device(struct iommu_domain *domain,
> > @@ -444,13 +445,16 @@ int iommu_probe_device(struct device *dev)
> > * support default domains, so the return value is not yet
> > * checked.
> > */
> > - iommu_alloc_default_domain(group, dev);
> > - if (group->default_domain) {
> > - iommu_create_device_direct_mappings(group, dev);
> > - ret = __iommu_group_set_domain_internal(
> > - group, group->default_domain,
> > - IOMMU_SET_DOMAIN_WITH_DEFERRED);
> > + group->default_domain = iommu_group_alloc_default_domain(
> > + group, iommu_get_def_domain_type(dev));
> > + if (!group->default_domain) {
> > + ret = -EINVAL;
> > + goto err_unlock;
> > }
>
> There are still some drivers that don't support default domains. It was
> allowed previously, but above seems to change that behavior. The
> iommu_probe_device() fails now on those platforms. Or perhaps I
> overlooked anything?
Blah, yes, this is fixed up in the next patches; it is an artifact of
trying to split patches up. Let me think about it.
Jason
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v3 07/17] iommu: Make iommu_group_do_dma_first_attach() simpler
2023-04-05 23:38 ` [PATCH v3 07/17] iommu: Make iommu_group_do_dma_first_attach() simpler Jason Gunthorpe
2023-04-06 3:28 ` Baolu Lu
@ 2023-04-06 15:44 ` Robin Murphy
2023-04-06 16:09 ` Jason Gunthorpe
1 sibling, 1 reply; 36+ messages in thread
From: Robin Murphy @ 2023-04-06 15:44 UTC (permalink / raw)
To: Jason Gunthorpe, iommu, Joerg Roedel, llvm, Nathan Chancellor,
Nick Desaulniers, Miguel Ojeda, Tom Rix, Will Deacon
Cc: Lu Baolu, Kevin Tian, Nicolin Chen
On 06/04/2023 12:38 am, Jason Gunthorpe wrote:
> It should always attach to the current group->domain, so don't take in a
> domain parameter. Use the __iommu_device_set_domain() common code to
> handle the attach.
Hmm, why do we still end up with this in two places, and potentially
called twice in the same path?
iommu_probe_device()
__iommu_probe_device()
iommu_group_get_for_dev()
iommu_group_add_device()
iommu_group_do_dma_first_attach()
iommu_create_device_direct_mappings()
iommu_group_do_dma_first_attach()
Seems off to me... at best it's confusing, but at worst it's plain wrong
(it's the one in iommu_group_add_device() that I'm most suspicious of)
Not to mention the slightly different open-coded variant we still seem
to have in iommu_setup_default_domain() via the "else if
(!group->default_domain)" path :/
(BTW if any comments seem misplaced, it's because I'm really looking at
the end result at the cover-letter commit in your branch, rather than
trying to review each patch independently)
Thanks,
Robin.
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
> drivers/iommu/iommu.c | 22 +++++++++-------------
> 1 file changed, 9 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index af3af752c255e4..c68cf551d05dc4 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -105,6 +105,10 @@ enum {
> IOMMU_SET_DOMAIN_WITH_DEFERRED = 1 << 1,
> };
>
> +static int __iommu_device_set_domain(struct iommu_group *group,
> + struct device *dev,
> + struct iommu_domain *new_domain,
> + unsigned int flags);
> static int __iommu_group_set_domain_internal(struct iommu_group *group,
> struct iommu_domain *new_domain,
> unsigned int flags);
> @@ -405,18 +409,10 @@ static bool iommu_is_attach_deferred(struct device *dev)
> return false;
> }
>
> -static int iommu_group_do_dma_first_attach(struct device *dev, void *data)
> +static int iommu_group_do_dma_first_attach(struct iommu_group *group, struct device *dev)
> {
> - struct iommu_domain *domain = data;
> -
> - lockdep_assert_held(&dev->iommu_group->mutex);
> -
> - if (iommu_is_attach_deferred(dev)) {
> - dev->iommu->attach_deferred = 1;
> - return 0;
> - }
> -
> - return __iommu_attach_device(domain, dev);
> + return __iommu_device_set_domain(group, dev, group->domain,
> + IOMMU_SET_DOMAIN_WITH_DEFERRED);
> }
>
> int iommu_probe_device(struct device *dev)
> @@ -449,7 +445,7 @@ int iommu_probe_device(struct device *dev)
> * attach the default domain.
> */
> if (group->default_domain && !group->owner) {
> - ret = iommu_group_do_dma_first_attach(dev, group->default_domain);
> + ret = iommu_group_do_dma_first_attach(group, dev);
> if (ret) {
> mutex_unlock(&group->mutex);
> iommu_group_put(group);
> @@ -1117,7 +1113,7 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
> mutex_lock(&group->mutex);
> list_add_tail(&device->list, &group->devices);
> if (group->domain)
> - ret = iommu_group_do_dma_first_attach(dev, group->domain);
> + ret = iommu_group_do_dma_first_attach(group, dev);
> mutex_unlock(&group->mutex);
> if (ret)
> goto err_put_group;
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v3 06/17] iommu: Replace __iommu_group_dma_first_attach() with set_domain
2023-04-05 23:38 ` [PATCH v3 06/17] iommu: Replace __iommu_group_dma_first_attach() with set_domain Jason Gunthorpe
@ 2023-04-06 15:58 ` Robin Murphy
2023-04-06 19:17 ` Jason Gunthorpe
0 siblings, 1 reply; 36+ messages in thread
From: Robin Murphy @ 2023-04-06 15:58 UTC (permalink / raw)
To: Jason Gunthorpe, iommu, Joerg Roedel, llvm, Nathan Chancellor,
Nick Desaulniers, Miguel Ojeda, Tom Rix, Will Deacon
Cc: Lu Baolu, Kevin Tian, Nicolin Chen
On 06/04/2023 12:38 am, Jason Gunthorpe wrote:
> Prepare for removing the group->domain set from
> iommu_group_alloc_default_domain() by calling
> __iommu_group_set_domain_internal() to set the group->domain.
>
> Add IOMMU_SET_DOMAIN_WITH_DEFERRED to allow it to do the attach_deferred
> logic.
This is way overcomplicated; if we evaluate iommu_is_attach_deferred()
earlier in the probe_device path, then the rest of the handling at the
point of attach should simply reduce to:
if (dev->iommu->attach_deferred && iommu_is_dma_domain(domain))
return 0;
No magical flags required, and especially not the confusing thing with
group-owner because the thing that still says it's for attaching the
default domain now might also be attaching a non-default domain.
Thanks,
Robin.
> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
> drivers/iommu/iommu.c | 18 +++++++++++-------
> 1 file changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index cd75d7405a3051..af3af752c255e4 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -102,6 +102,7 @@ static int __iommu_attach_group(struct iommu_domain *domain,
>
> enum {
> IOMMU_SET_DOMAIN_MUST_SUCCEED = 1 << 0,
> + IOMMU_SET_DOMAIN_WITH_DEFERRED = 1 << 1,
> };
>
> static int __iommu_group_set_domain_internal(struct iommu_group *group,
> @@ -1855,12 +1856,6 @@ static void probe_alloc_default_domain(struct bus_type *bus,
>
> }
>
> -static int __iommu_group_dma_first_attach(struct iommu_group *group)
> -{
> - return __iommu_group_for_each_dev(group, group->default_domain,
> - iommu_group_do_dma_first_attach);
> -}
> -
> static int iommu_group_do_probe_finalize(struct device *dev, void *data)
> {
> const struct iommu_ops *ops = dev_iommu_ops(dev);
> @@ -1923,7 +1918,10 @@ int bus_iommu_probe(struct bus_type *bus)
>
> iommu_group_create_direct_mappings(group);
>
> - ret = __iommu_group_dma_first_attach(group);
> + group->domain = NULL;
> + ret = __iommu_group_set_domain_internal(
> + group, group->default_domain,
> + IOMMU_SET_DOMAIN_WITH_DEFERRED);
>
> mutex_unlock(&group->mutex);
>
> @@ -2218,6 +2216,12 @@ static int __iommu_device_set_domain(struct iommu_group *group,
> {
> int ret;
>
> + if ((flags & IOMMU_SET_DOMAIN_WITH_DEFERRED) &&
> + iommu_is_attach_deferred(dev)) {
> + dev->iommu->attach_deferred = 1;
> + return 0;
> + }
> +
> ret = __iommu_attach_device(new_domain, dev);
> if (ret) {
> /*
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v3 07/17] iommu: Make iommu_group_do_dma_first_attach() simpler
2023-04-06 15:44 ` Robin Murphy
@ 2023-04-06 16:09 ` Jason Gunthorpe
2023-04-06 19:05 ` Robin Murphy
0 siblings, 1 reply; 36+ messages in thread
From: Jason Gunthorpe @ 2023-04-06 16:09 UTC (permalink / raw)
To: Robin Murphy
Cc: iommu, Joerg Roedel, llvm, Nathan Chancellor, Nick Desaulniers,
Miguel Ojeda, Tom Rix, Will Deacon, Lu Baolu, Kevin Tian,
Nicolin Chen
On Thu, Apr 06, 2023 at 04:44:57PM +0100, Robin Murphy wrote:
> On 06/04/2023 12:38 am, Jason Gunthorpe wrote:
> > It should always attach to the current group->domain, so don't take in a
> > domain parameter. Use the __iommu_device_set_domain() common code to
> > handle the attach.
>
> Hmm, why do we still end up with this in two places, and potentially called
> twice in the same path?
>
> iommu_probe_device()
> __iommu_probe_device()
> iommu_group_get_for_dev()
> iommu_group_add_device()
> iommu_group_do_dma_first_attach()
> iommu_create_device_direct_mappings()
> iommu_group_do_dma_first_attach()
>
> Seems off to me... at best it's confusing, but at worst it's plain wrong
> (it's the one in iommu_group_add_device() that I'm most suspicious of)
Lu and Kevin pointed to this as well, it started out that way.
I made a whole other series tidying the probe related flows just based
on this observation.
I can pull one patch into here since this seems to catch everyone.
AFAICT it is harmless at this point since we can safely call
first_attach twice?
> Not to mention the slightly different open-coded variant we still seem to
> have in iommu_setup_default_domain() via the "else if
> (!group->default_domain)" path :/
Yes.. Humm. More parts could be moved into a function but I think it
would obfuscate what iommu_setup_default_domain() does.
Oh, I see another mistake:
if (group->domain) {
iommu_create_device_direct_mappings(group->domain, dev);
^^^^^^^^^^^^^^^
iommu_create_device_direct_mappings should only be called on the
default domain.
> (BTW if any comments seem misplaced, it's because I'm really looking at the
> end result at the cover-letter commit in your branch, rather than trying to
> review each patch independently)
That is probably best. The patches are OK but it is hard to see the
final arrangement
Thanks,
Jason
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v3 08/17] iommu: Make iommu_group_do_dma_first_attach() work with owned groups
2023-04-05 23:38 ` [PATCH v3 08/17] iommu: Make iommu_group_do_dma_first_attach() work with owned groups Jason Gunthorpe
@ 2023-04-06 16:37 ` Robin Murphy
2023-04-06 19:34 ` Jason Gunthorpe
0 siblings, 1 reply; 36+ messages in thread
From: Robin Murphy @ 2023-04-06 16:37 UTC (permalink / raw)
To: Jason Gunthorpe, iommu, Joerg Roedel, llvm, Nathan Chancellor,
Nick Desaulniers, Miguel Ojeda, Tom Rix, Will Deacon
Cc: Lu Baolu, Kevin Tian, Nicolin Chen
On 06/04/2023 12:38 am, Jason Gunthorpe wrote:
> If the group is already owned then deferred attach should not be done and
> the device should be directly connected to the correct domain.
>
> Owned means that some driver is already bound to devices in the group and
> is operating the group possibly without DMA API support. In this case
> there would be no way to correct the mismatched domain.
Looking more closely at the current code, it seems we just skip any
attach at all in this case. Thus although there can be technically be a
mismatch (physically, the device will be either partly or fully assigned
to the current domain if it is grouped by ID aliasing, or in limbo if it
is grouped for ACS reasons), it *should* be corrected whenever the
current owner next detaches its domain. This shouldn't actually be a
problem in the usual case, since the new device generally won't do
anything if the existing ownership prevents its driver from binding.
I don't have a particularly strong opinion on changing this behaviour,
but I guess what's nagging me is that we could do a better job of
calling out that it *is* a behaviour change, and its implications.
Namely that they are slightly better if the device binds to the same
owner, in that it could rely use the current domain straight off without
the risk of being in ACS limbo, but also slightly worse in that if the
attach fails for any reason, it will now trash aliased RIDs of devices
that *were* working OK. It also means that the device could bind to any
*other* driver with driver_managed_dma and have a much better guarantee
of having access to the other owner's domain, with all the pitfalls
therein, but that's only a strengthening of a pre-existing problem :)
Thanks,
Robin.
> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
> drivers/iommu/iommu.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index c68cf551d05dc4..6fe8bc78db2016 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -411,8 +411,9 @@ static bool iommu_is_attach_deferred(struct device *dev)
>
> static int iommu_group_do_dma_first_attach(struct iommu_group *group, struct device *dev)
> {
> - return __iommu_device_set_domain(group, dev, group->domain,
> - IOMMU_SET_DOMAIN_WITH_DEFERRED);
> + return __iommu_device_set_domain(
> + group, dev, group->domain,
> + group->owner ? 0 : IOMMU_SET_DOMAIN_WITH_DEFERRED);
> }
>
> int iommu_probe_device(struct device *dev)
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v3 03/17] iommu: Make __iommu_group_set_domain() handle error unwind
2023-04-05 23:38 ` [PATCH v3 03/17] iommu: Make __iommu_group_set_domain() handle error unwind Jason Gunthorpe
@ 2023-04-06 18:08 ` Robin Murphy
2023-04-06 19:09 ` Jason Gunthorpe
0 siblings, 1 reply; 36+ messages in thread
From: Robin Murphy @ 2023-04-06 18:08 UTC (permalink / raw)
To: Jason Gunthorpe, iommu, Joerg Roedel, llvm, Nathan Chancellor,
Nick Desaulniers, Miguel Ojeda, Tom Rix, Will Deacon
Cc: Lu Baolu, Kevin Tian, Nicolin Chen
On 06/04/2023 12:38 am, Jason Gunthorpe wrote:
> Let's try to have a consistent and clear strategy for error handling
> during domain attach failures.
>
> There are two broad categories, the first is callers doing destruction and
> trying to set the domain back to a previously good domain. These cases
> cannot handle failure during destruction flows and must succeed, or at
> least avoid a UAF on the current group->domain which is likely about to be
> freed.
>
> Many of the drivers are well behaved here and will not hit the WARN_ON's
> or a UAF, but some are doing hypercalls/etc that can fail unpredictably
> and don't meet the expectations.
>
> The second case is attaching a domain for the first time in a failable
> context, failure should restore the attachment back to group->domain using
> the above unfailable operation.
>
> Have __iommu_group_set_domain_internal() execute a common algorithm that
> tries to achieve this, and in the worst case, would leave a device
> "detached" or assigned to a global blocking domain. This relies on some
> existing common driver behaviors where attach failure will also do detach
> and true IOMMU_DOMAIN_BLOCK implementations that are not allowed to ever
> fail.
>
> Name the first case with __iommu_group_set_domain_nofail() to make it
> clear.
>
> Pull all the error handling and WARN_ON generation into
> __iommu_group_set_domain_internal().
>
> Avoid the obfuscating use of __iommu_group_for_each_dev() and be more
> careful about what should happen during failures by only touching devices
> we've already touched.
>
> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
> drivers/iommu/iommu.c | 130 +++++++++++++++++++++++++++++++++++-------
> 1 file changed, 108 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 501257bd02fc3c..b7cb2a56ebfe8c 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -99,8 +99,26 @@ static int __iommu_attach_device(struct iommu_domain *domain,
> struct device *dev);
> static int __iommu_attach_group(struct iommu_domain *domain,
> struct iommu_group *group);
> +
> +enum {
> + IOMMU_SET_DOMAIN_MUST_SUCCEED = 1 << 0,
> +};
> +
> +static int __iommu_group_set_domain_internal(struct iommu_group *group,
> + struct iommu_domain *new_domain,
> + unsigned int flags);
> static int __iommu_group_set_domain(struct iommu_group *group,
> - struct iommu_domain *new_domain);
> + struct iommu_domain *new_domain)
> +{
> + return __iommu_group_set_domain_internal(group, new_domain, 0);
> +}
> +static void __iommu_group_set_domain_nofail(struct iommu_group *group,
> + struct iommu_domain *new_domain)
> +{
> + WARN_ON(__iommu_group_set_domain_internal(
> + group, new_domain, IOMMU_SET_DOMAIN_MUST_SUCCEED));
> +}
> +
> static int iommu_create_device_direct_mappings(struct iommu_group *group,
> struct device *dev);
> static struct iommu_group *iommu_group_get_for_dev(struct device *dev);
> @@ -2040,15 +2058,13 @@ EXPORT_SYMBOL_GPL(iommu_domain_free);
> static void __iommu_group_set_core_domain(struct iommu_group *group)
> {
> struct iommu_domain *new_domain;
> - int ret;
>
> if (group->owner)
> new_domain = group->blocking_domain;
> else
> new_domain = group->default_domain;
>
> - ret = __iommu_group_set_domain(group, new_domain);
> - WARN(ret, "iommu driver failed to attach the default/blocking domain");
> + __iommu_group_set_domain_nofail(group, new_domain);
> }
>
> static int __iommu_attach_device(struct iommu_domain *domain,
> @@ -2233,21 +2249,55 @@ int iommu_attach_group(struct iommu_domain *domain, struct iommu_group *group)
> }
> EXPORT_SYMBOL_GPL(iommu_attach_group);
>
> -static int iommu_group_do_set_platform_dma(struct device *dev, void *data)
> +static int __iommu_device_set_domain(struct iommu_group *group,
> + struct device *dev,
> + struct iommu_domain *new_domain,
> + unsigned int flags)
> {
> - const struct iommu_ops *ops = dev_iommu_ops(dev);
> -
> - if (!WARN_ON(!ops->set_platform_dma_ops))
> - ops->set_platform_dma_ops(dev);
> + int ret;
>
> + ret = __iommu_attach_device(new_domain, dev);
> + if (ret) {
> + /*
> + * If we have a blocking domain then try to attach that in hopes
> + * of avoiding a UAF. Modern drivers should implement blocking
> + * domains as global statics that cannot fail.
> + */
> + if ((flags & IOMMU_SET_DOMAIN_MUST_SUCCEED) &&
> + group->blocking_domain &&
> + group->blocking_domain != new_domain)
> + __iommu_attach_device(group->blocking_domain, dev);
> + return ret;
> + }
> return 0;
> }
>
> -static int __iommu_group_set_domain(struct iommu_group *group,
> - struct iommu_domain *new_domain)
> +/*
> + * If 0 is returned the group's domain is new_domain. If an error is returned
> + * then the group's domain will be set back to the existing domain unless
> + * IOMMU_SET_DOMAIN_MUST_SUCCEED, otherwise an error is returned and the group's
> + * domains is left inconsistent. This is a driver bug to fail attach with a
> + * previously good domain. We try to avoid a kernel UAF because of this.
> + *
> + * IOMMU groups are really the natural working unit of the IOMMU, but the IOMMU
> + * API works on domains and devices. Bridge that gap by iterating over the
> + * devices in a group. Ideally we'd have a single device which represents the
> + * requestor ID of the group, but we also allow IOMMU drivers to create policy
> + * defined minimum sets, where the physical hardware may be able to distinguish
> + * members, but we wish to group them at a higher level (ex. untrusted
> + * multi-function PCI devices). Thus we attach each device.
> + */
> +static int __iommu_group_set_domain_internal(struct iommu_group *group,
> + struct iommu_domain *new_domain,
> + unsigned int flags)
> {
> + struct group_device *last_gdev;
> + struct group_device *gdev;
> + int result;
> int ret;
>
> + lockdep_assert_held(&group->mutex);
> +
> if (group->domain == new_domain)
> return 0;
>
> @@ -2257,8 +2307,12 @@ static int __iommu_group_set_domain(struct iommu_group *group,
> * platform specific behavior.
> */
> if (!new_domain) {
> - __iommu_group_for_each_dev(group, NULL,
> - iommu_group_do_set_platform_dma);
> + for_each_group_device(group, gdev) {
> + const struct iommu_ops *ops = dev_iommu_ops(gdev->dev);
> +
> + if (!WARN_ON(!ops->set_platform_dma_ops))
> + ops->set_platform_dma_ops(gdev->dev);
> + }
> group->domain = NULL;
> return 0;
> }
> @@ -2272,12 +2326,47 @@ static int __iommu_group_set_domain(struct iommu_group *group,
> * Note that this is called in error unwind paths, attaching to a
> * domain that has already been attached cannot fail.
This comment now seems at odds with the rest of the code.
> */
> - ret = __iommu_group_for_each_dev(group, new_domain,
> - iommu_group_do_attach_device);
> - if (ret)
> - return ret;
> + result = 0;
> + for_each_group_device(group, gdev) {
> + ret = __iommu_device_set_domain(group, gdev->dev, new_domain,
> + flags);
> + if (ret) {
> + result = ret;
> + /*
> + * Keep trying the other devices in the group. If a
> + * driver fails attach to an otherwise good domain, and
> + * does not support blocking domains, it should at least
> + * drop its reference on the current domain so we don't
> + * UAF.
> + */
> + if (flags & IOMMU_SET_DOMAIN_MUST_SUCCEED)
> + continue;
I know it's called out above, but my brain truly can't cope with a flag
named "MUST_SUCCEED" which actually means "can fail, but you can't
reason about the state after failure". As before, I think the actual
flag itself can go away, but the behaviour itself will still be no nicer
for being implicit :/
By this point I've started to lose sight of whether we've really
meaningfully improved the handling of failure if we still have multiple
different behaviours and one of them is still this mess... The
big-picture view seems to be that in order to avoid relying on the
assumption that the second and subsequent attaches within a group can't
fail if the first one didn't, we can now "handle" such a failure, by
requiring the assumption that the second and subsequent attaches to a
domain in general must not fail if the first one didn't. See the problem
there?
TBH I still think the responsibility should be on the drivers to handle
rollback of their internal per-device state if returning failure from an
.attach_dev call. As above, it fundamentally *can't* generalise at the
core API level since after the one special case of resetting the first
device in a group when an initial attach failed, it becomes
self-contradictory. And that special case itself is basically working
around a dumb driver behaviour that's always going to be simplest to fix
directly in any driver that does it. But what would be really really
really good, now that .detach_dev has truly gone, is to pass *both* "to"
and "from" domains to .attach_dev (and a "from" domain to
.release_device), so that drivers don't have to squirrel away all these
internal references to their own notions of "current" domains which are
the root of this whole class of horrid UAF and consistency bugs.
Thanks,
Robin.
> + goto err_revert;
> + }
> + }
> group->domain = new_domain;
> - return 0;
> + return result;
> +
> +err_revert:
> + last_gdev = gdev;
> + for_each_group_device(group, gdev) {
> + const struct iommu_ops *ops = dev_iommu_ops(gdev->dev);
> +
> + /*
> + * If set_platform_dma_ops is not present a NULL domain can
> + * happen only for first probe, in which case we leave
> + * group->domain as NULL and let release clean everything up.
> + */
> + if (group->domain)
> + WARN_ON(__iommu_device_set_domain(
> + group, gdev->dev, group->domain,
> + IOMMU_SET_DOMAIN_MUST_SUCCEED));
> + else if (ops->set_platform_dma_ops)
> + ops->set_platform_dma_ops(gdev->dev);
> + if (gdev == last_gdev)
> + break;
> + }
> + return ret;
> }
>
> void iommu_detach_group(struct iommu_domain *domain, struct iommu_group *group)
> @@ -3194,16 +3283,13 @@ EXPORT_SYMBOL_GPL(iommu_device_claim_dma_owner);
>
> static void __iommu_release_dma_ownership(struct iommu_group *group)
> {
> - int ret;
> -
> if (WARN_ON(!group->owner_cnt || !group->owner ||
> !xa_empty(&group->pasid_array)))
> return;
>
> group->owner_cnt = 0;
> group->owner = NULL;
> - ret = __iommu_group_set_domain(group, group->default_domain);
> - WARN(ret, "iommu driver failed to attach the default domain");
> + __iommu_group_set_domain_nofail(group, group->default_domain);
> }
>
> /**
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v3 07/17] iommu: Make iommu_group_do_dma_first_attach() simpler
2023-04-06 16:09 ` Jason Gunthorpe
@ 2023-04-06 19:05 ` Robin Murphy
0 siblings, 0 replies; 36+ messages in thread
From: Robin Murphy @ 2023-04-06 19:05 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: iommu, Joerg Roedel, llvm, Nathan Chancellor, Nick Desaulniers,
Miguel Ojeda, Tom Rix, Will Deacon, Lu Baolu, Kevin Tian,
Nicolin Chen
On 06/04/2023 5:09 pm, Jason Gunthorpe wrote:
> On Thu, Apr 06, 2023 at 04:44:57PM +0100, Robin Murphy wrote:
>> On 06/04/2023 12:38 am, Jason Gunthorpe wrote:
>>> It should always attach to the current group->domain, so don't take in a
>>> domain parameter. Use the __iommu_device_set_domain() common code to
>>> handle the attach.
>>
>> Hmm, why do we still end up with this in two places, and potentially called
>> twice in the same path?
>>
>> iommu_probe_device()
>> __iommu_probe_device()
>> iommu_group_get_for_dev()
>> iommu_group_add_device()
>> iommu_group_do_dma_first_attach()
>> iommu_create_device_direct_mappings()
>> iommu_group_do_dma_first_attach()
>>
>> Seems off to me... at best it's confusing, but at worst it's plain wrong
>> (it's the one in iommu_group_add_device() that I'm most suspicious of)
>
> Lu and Kevin pointed to this as well, it started out that way.
>
> I made a whole other series tidying the probe related flows just based
> on this observation..
>
> I can pull one patch into here since this seems to catch everyone.
>
> AFIACT it is harmless at this point since we can safely call
> first_attach twice?
Possibly; the other attach paths usually check and avoid calling into
the driver to reattach the currently-attached domain, so at worst it
could conceivably cause refcounting or data corruption issues inside a
driver which tracks a current domain pointer internally and assumes it
won't alias the new one. However I guess in practice either our current
drivers don't mind, or any which do aren't seeing multi-device groups.
From a quick history dive, it seems that the iommu_group_add_device()
one was in fact the original default domain attach site, but things
diverged in 6e1aa2049154 ("iommu: Move default domain allocation to
iommu_probe_device()"), where IIRC we still had to deal with a mix of
drivers calling iommu_group_add_device() and/or
iommu_group_get_for_dev() plus the new probe_device path, so it may have
still been needed (or at least seemed risky to remove) at that point,
but then it was overlooked for cleanup after the whole probe_device
conversion was finished?
Thanks,
Robin.
>> Not to mention the slightly different open-coded variant we still seem to
>> have in iommu_setup_default_domain() via the "else if
>> (!group->default_domain)" path :/
>
> Yes.. Humm. More parts could be moved into a function but I think it
> would obfuscates what iommu_setup_default_domain() does.
>
> Oh, I see another mistake:
>
> if (group->domain) {
> iommu_create_device_direct_mappings(group->domain, dev);
> ^^^^^^^^^^^^^^^
>
> iommu_create_device_direct_mappings should only be called on the
> default domain.
>
>> (BTW if any comments seem misplaced, it's because I'm really looking at the
>> end result at the cover-letter commit in your branch, rather than trying to
>> review each patch independently)
>
> That is probably best. The patches are OK but it is hard to see the
> final arrangment
>
> Thanks,
> Jason
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v3 03/17] iommu: Make __iommu_group_set_domain() handle error unwind
2023-04-06 18:08 ` Robin Murphy
@ 2023-04-06 19:09 ` Jason Gunthorpe
2023-04-13 16:47 ` Robin Murphy
0 siblings, 1 reply; 36+ messages in thread
From: Jason Gunthorpe @ 2023-04-06 19:09 UTC (permalink / raw)
To: Robin Murphy
Cc: iommu, Joerg Roedel, llvm, Nathan Chancellor, Nick Desaulniers,
Miguel Ojeda, Tom Rix, Will Deacon, Lu Baolu, Kevin Tian,
Nicolin Chen
On Thu, Apr 06, 2023 at 07:08:59PM +0100, Robin Murphy wrote:
> > + if (ret) {
> > + result = ret;
> > + /*
> > + * Keep trying the other devices in the group. If a
> > + * driver fails attach to an otherwise good domain, and
> > + * does not support blocking domains, it should at least
> > + * drop its reference on the current domain so we don't
> > + * UAF.
> > + */
> > + if (flags & IOMMU_SET_DOMAIN_MUST_SUCCEED)
> > + continue;
>
> I know it's called out above, but my brain truly can't cope with a flag
> named "MUST_SUCCEED" which actually means "can fail, but you can't reason
> about the state after failure". As before, I think the actual flag itself
> can go away, but the behaviour itself will still be no nicer for being
> implicit :/
What I saw is that a bunch of drivers tend to implement attach_dev
with first some kind of detach, which often cannot fail, and then can
fail the new attach.
So, this is intended to go along with that design pattern. We call
every device's attach and hope that the outcome is either to release
the current iommu_domain or to attach to the new one.
If a driver doesn't follow this pattern then we'll UAF.
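Rough sketch of that pattern, with made-up helper names standing in for
whatever a given driver actually does:

static int sketch_attach_dev(struct iommu_domain *domain, struct device *dev)
{
	/* Step 1: drop whatever is currently attached, cannot fail */
	sketch_detach_current(dev);

	/*
	 * Step 2: install the new domain, may fail. If it does the device is
	 * left detached rather than still referencing the old domain, which
	 * is what lets the core free the old domain without a UAF.
	 */
	return sketch_install(domain, dev);
}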
The primary purpose of the MUST_SUCCEED flag is to prevent the code
from trying to go back to the old domain. This really should only
depend on the caller's context and ability to handle errors, so I
don't see what would make it implicit?
We definitely do not want to attach back the old domain if we are
about to unconditionally free it.
> TBH I still think the responsibility should be on the drivers to handle
> rollback of their internal per-device state if returning failure from an
> .attach_dev call.
It helps the driver stay consistent, but I don't think it helps the
core code too much.
With the two patches to add the rodata IDENTITY domain to drivers I
think we should go in that direction.
Let's have an "ops->emergency_domain" that is something the driver can
always attach. Ideally it will be the BLOCKING domain, but IDENTITY
could work too.
When this function gets into trouble it will put the entire group on
'emergency_domain' and things will be safe, no UAF.
This seems pretty simple for drivers since the attach_dev can just
leave behind whatever state is convenient, so long as their
emergency_domain cannot fail.
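Roughly, as a sketch only (nothing called emergency_domain exists today and
the core-side hook is made up):

	/* in struct iommu_ops: a static domain whose attach can never fail */
	struct iommu_domain *emergency_domain;

	/* in the core's must-succeed error path, instead of retrying the old domain */
	if (ops->emergency_domain)
		WARN_ON(__iommu_attach_device(ops->emergency_domain, dev));

attach_dev can then leave behind whatever state it likes on failure, so long
as attaching the emergency domain afterwards always works.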
> But what would be really really really good, now that .detach_dev has
> truly gone, is to pass *both* "to" and "from" domains to .attach_dev (and a
> "from" domain to .release_device), so that drivers don't have to squirrel
> away all these internal references to their own notions of "current" domains
> which are the root of this whole class of horrid UAF and consistency
> bugs.
Usually the IOMMU HW will be touching the IOPTEs via DMA enclosed by
the iommu_domain, so even if we could remove the iommu_domain pointers
from the drivers it still retains the HW programming to the
memory. Freeing the domain will still UAF.
Refcounts would be another way to address the UAF, then we can leak
the iommu_domain if things go bad. I think this is more complex on drivers.
Jason
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v3 06/17] iommu: Replace __iommu_group_dma_first_attach() with set_domain
2023-04-06 15:58 ` Robin Murphy
@ 2023-04-06 19:17 ` Jason Gunthorpe
0 siblings, 0 replies; 36+ messages in thread
From: Jason Gunthorpe @ 2023-04-06 19:17 UTC (permalink / raw)
To: Robin Murphy
Cc: iommu, Joerg Roedel, llvm, Nathan Chancellor, Nick Desaulniers,
Miguel Ojeda, Tom Rix, Will Deacon, Lu Baolu, Kevin Tian,
Nicolin Chen
On Thu, Apr 06, 2023 at 04:58:27PM +0100, Robin Murphy wrote:
> On 06/04/2023 12:38 am, Jason Gunthorpe wrote:
> > Prepare for removing the group->domain set from
> > iommu_group_alloc_default_domain() by calling
> > __iommu_group_set_domain_internal() to set the group->domain.
> >
> > Add IOMMU_SET_DOMAIN_WITH_DEFERRED to allow it to do the attach_deferred
> > logic.
>
> This is way overcomplicated; if we evaluate iommu_is_attach_deferred()
> earlier in the probe_device path, then the rest of the handling at the point
> of attach should simply reduce to:
>
> if (dev->iommu->attach_deferred && iommu_is_dma_domain(domain))
> return 0;
Ok, that seems to work without too much trouble elsewhere.
Deferred domains currently work with IDENTITY domains too, so the
iommu_is_dma_domain() check needed a fix.
Against the whole series it looks like this:
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index c72aa3e81aae50..c5ed2d27a2da05 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -100,7 +100,6 @@ static int __iommu_attach_group(struct iommu_domain *domain,
enum {
IOMMU_SET_DOMAIN_MUST_SUCCEED = 1 << 0,
- IOMMU_SET_DOMAIN_WITH_DEFERRED = 1 << 1,
};
static int __iommu_device_set_domain(struct iommu_group *group,
@@ -365,6 +364,8 @@ static int __iommu_probe_device(struct device *dev, struct list_head *group_list
dev->iommu->iommu_dev = iommu_dev;
dev->iommu->max_pasids = dev_iommu_get_max_pasids(dev);
+ if (ops->is_attach_deferred)
+ dev->iommu->attach_deferred = ops->is_attach_deferred(dev);
group = iommu_group_get_for_dev(dev);
if (IS_ERR(group)) {
@@ -399,23 +400,6 @@ static int __iommu_probe_device(struct device *dev, struct list_head *group_list
return ret;
}
-static bool iommu_is_attach_deferred(struct device *dev)
-{
- const struct iommu_ops *ops = dev_iommu_ops(dev);
-
- if (ops->is_attach_deferred)
- return ops->is_attach_deferred(dev);
-
- return false;
-}
-
-static int iommu_group_do_dma_first_attach(struct iommu_group *group, struct device *dev)
-{
- return __iommu_device_set_domain(
- group, dev, group->domain,
- group->owner ? 0 : IOMMU_SET_DOMAIN_WITH_DEFERRED);
-}
-
int iommu_probe_device(struct device *dev)
{
const struct iommu_ops *ops;
@@ -434,9 +418,11 @@ int iommu_probe_device(struct device *dev)
mutex_lock(&group->mutex);
+ if (group->default_domain)
+ iommu_create_device_direct_mappings(group->default_domain, dev);
+
if (group->domain) {
- iommu_create_device_direct_mappings(group->domain, dev);
- ret = iommu_group_do_dma_first_attach(group, dev);
+ ret = __iommu_device_set_domain(group, dev, group->domain, 0);
if (ret)
goto err_unlock;
} else if (!group->default_domain) {
@@ -1104,25 +1090,13 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
mutex_lock(&group->mutex);
list_add_tail(&device->list, &group->devices);
- if (group->domain)
- ret = iommu_group_do_dma_first_attach(group, dev);
mutex_unlock(&group->mutex);
- if (ret)
- goto err_put_group;
-
trace_add_device_to_group(group->id, dev);
dev_info(dev, "Adding to iommu group %d\n", group->id);
return 0;
-err_put_group:
- mutex_lock(&group->mutex);
- list_del(&device->list);
- mutex_unlock(&group->mutex);
- dev->iommu_group = NULL;
- kobject_put(group->devices_kobj);
- sysfs_remove_link(group->devices_kobj, device->name);
err_free_name:
kfree(device->name);
err_remove_link:
@@ -2165,10 +2139,10 @@ static int __iommu_device_set_domain(struct iommu_group *group,
{
int ret;
- if ((flags & IOMMU_SET_DOMAIN_WITH_DEFERRED) &&
- iommu_is_attach_deferred(dev)) {
- dev->iommu->attach_deferred = 1;
- return 0;
+ if (dev->iommu->attach_deferred) {
+ if (new_domain == group->default_domain)
+ return 0;
+ dev->iommu->attach_deferred = 0;
}
ret = __iommu_attach_device(new_domain, dev);
@@ -2875,6 +2849,7 @@ EXPORT_SYMBOL_GPL(iommu_dev_disable_feature);
static int iommu_setup_default_domain(struct iommu_group *group,
int target_type)
{
+ struct iommu_domain *old_dom = group->default_domain;
struct group_device *gdev;
struct iommu_domain *dom;
bool direct_failed;
@@ -2899,8 +2874,8 @@ static int iommu_setup_default_domain(struct iommu_group *group,
/* Once in default_domain mode we never leave */
if (group->default_domain)
return -ENODEV;
- ret = 0;
- goto out_set;
+ group->default_domain = NULL;
+ return 0;
}
if (group->default_domain == dom)
@@ -2917,8 +2892,10 @@ static int iommu_setup_default_domain(struct iommu_group *group,
direct_failed = true;
}
- ret = __iommu_group_set_domain_internal(group, dom,
- IOMMU_SET_DOMAIN_WITH_DEFERRED);
+ /* We must set default_domain early for __iommu_device_set_domain */
+ group->default_domain = dom;
+ ret = __iommu_group_set_domain_internal(
+ group, dom, group->domain ? 0 : IOMMU_SET_DOMAIN_MUST_SUCCEED);
if (ret) {
/*
* An attach_dev failure may result in some devices being left
@@ -2927,12 +2904,16 @@ static int iommu_setup_default_domain(struct iommu_group *group,
* no choice but to stick the broken domain into
* group->default_domain to defer the free and try to continue.
*/
- if (list_count_nodes(&group->devices) > 1)
+ if (!group->domain)
goto out_set;
+ /*
+ * Otherwise assume that cleanup worked and the devices were put
+ * back to old_dom.
+ */
iommu_domain_free(dom);
- dom = NULL;
- goto out_set;
+ group->default_domain = old_dom;
+ return ret;
}
/*
@@ -2947,8 +2928,8 @@ static int iommu_setup_default_domain(struct iommu_group *group,
}
out_set:
- if (group->default_domain)
- iommu_domain_free(group->default_domain);
+ if (old_dom)
+ iommu_domain_free(old_dom);
group->default_domain = dom;
return ret;
}
^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: [PATCH v3 08/17] iommu: Make iommu_group_do_dma_first_attach() work with owned groups
2023-04-06 16:37 ` Robin Murphy
@ 2023-04-06 19:34 ` Jason Gunthorpe
0 siblings, 0 replies; 36+ messages in thread
From: Jason Gunthorpe @ 2023-04-06 19:34 UTC (permalink / raw)
To: Robin Murphy
Cc: iommu, Joerg Roedel, llvm, Nathan Chancellor, Nick Desaulniers,
Miguel Ojeda, Tom Rix, Will Deacon, Lu Baolu, Kevin Tian,
Nicolin Chen
On Thu, Apr 06, 2023 at 05:37:57PM +0100, Robin Murphy wrote:
> I don't have a particularly strong opinion on changing this behaviour, but I
> guess what's nagging me is that we could do a better job of calling out that
> it *is* a behaviour change, and its implications.
This patch got dropped with your other suggestion.
Instead this distinction is handled inside __iommu_device_set_domain():
if (dev->iommu->attach_deferred) {
if (new_domain == group->default_domain)
return 0;
dev->iommu->attach_deferred = 0;
}
This relies on the "owned means group->domain != default_domain"
reasoning we used elsewhere.
> slightly better if the device binds to the same owner, in that it could rely
> use the current domain straight off without the risk of being in ACS limbo,
> but also slightly worse in that if the attach fails for any reason, it will
> now trash aliased RIDs of devices that *were* working OK.
The functional issue is that iommufd uses iommu_attach_group() - so
conceivably it is possible to hot plug in a new group member, add a
vfio driver to it and then hot add it to an existing iommufd. In this
case it is required that the core code set the iommu_domain properly
since iommufd won't call iommu_attach_group() again.
On the alias RID side, I would say a driver should be checking if the
actual RID is already set and not disturb anything.
> the device could bind to any *other* driver with driver_managed_dma and have
> a much better guarantee of having access to the other owner's
> domain,
Right, this is what the 'owner' argument to
iommu_device_claim_dma_owner() is supposed to allow.
I added the above note about iommufd to the patch "iommu: Fix
iommu_probe_device() to attach the right domain"
Jason
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v3 00/17] Consolidate the error handling around device attachment
2023-04-05 23:38 [PATCH v3 00/17] Consolidate the error handling around device attachment Jason Gunthorpe
` (16 preceding siblings ...)
2023-04-05 23:38 ` [PATCH v3 17/17] iommu: Tidy the control flow in iommu_group_store_type() Jason Gunthorpe
@ 2023-04-07 11:02 ` Naresh Kamboju
17 siblings, 0 replies; 36+ messages in thread
From: Naresh Kamboju @ 2023-04-07 11:02 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: iommu, Joerg Roedel, llvm, Nathan Chancellor, Nick Desaulniers,
Miguel Ojeda, Robin Murphy, Tom Rix, Will Deacon, Lu Baolu,
Kevin Tian, Nicolin Chen
On Thu, 6 Apr 2023 at 05:08, Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> Device attachment has a bunch of different flows open coded in different
> ways throughout the code.
>
> One of the things that became apparently recently is that error handling
> is important and we do need to consistently treat errors during attach and
> have some strategy to unwind back to a safe state.
>
> Implement a single algorithm for this in one function. It will call each
> device's attach, if it fails it will try to go back to the prior domain or
> as a contingency against a UAF crash try to go to a blocking domain.
>
> As part of this we consolidate how the default domain is created and
> attached as well into one place with a consistent flow.
>
> The new worker functions are called __iommu_device_set_domain() and
> __iommu_group_set_domain_internal(), each has sensible error handling
> internally. At the end __iommu_group_set_domain_internal() is the only
> function that stores to group->domain, and must be called to change this
> value.
>
> Some flags tell the intent of the caller, if the caller cannot accept a
> failure, or if the caller is a first attach and wants to do the deferred
> logic.
>
> Several of the confusing workflows where we store things in group->domain
> or group->default_domain before they are fully setup are removed.
>
> This has a followup series that does a similar de-duplication to the probe
> path:
>
> https://github.com/jgunthorpe/linux/commits/iommu_err_unwind
This repository and the iommu_err_unwind branch were built with clang-16
and boot tested on arm x15, arm64 Rpi4, Qcom Db410c and juno-r2 devices.
All boot tests and LTP smoke tests pass.
>
> v3:
> - New patch to do iommu_group_create_direct_mappings() before
> attach on the hotplug path based on Lu and Robin's remarks
> - Fix to return 0 if the group has conflicting default domain types like
> the original code
> - Put iommu_group_create_direct_mappings() before attach in setup_domains
> - Split up the alloc changes from the setup_domain patch to their own
> patch, implement Robin's point about how the iommu_def_domain_type should
> work
> - New patch to optionally do iommu_group_create_direct_mappings() after
> attach
> - Reword the setup_domain patch's commit message
<trim>
> .clang-format | 1 +
> drivers/iommu/iommu.c | 636 +++++++++++++++++++++---------------------
> 2 files changed, 325 insertions(+), 312 deletions(-)
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
>
> base-commit: 3578e36f238ef81a1967e849e0a3106c9dd37e68
> --
> 2.40.0
>
Best regards
Naresh
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v3 03/17] iommu: Make __iommu_group_set_domain() handle error unwind
2023-04-06 19:09 ` Jason Gunthorpe
@ 2023-04-13 16:47 ` Robin Murphy
2023-04-13 18:39 ` Jason Gunthorpe
0 siblings, 1 reply; 36+ messages in thread
From: Robin Murphy @ 2023-04-13 16:47 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: iommu, Joerg Roedel, llvm, Nathan Chancellor, Nick Desaulniers,
Miguel Ojeda, Tom Rix, Will Deacon, Lu Baolu, Kevin Tian,
Nicolin Chen
On 06/04/2023 8:09 pm, Jason Gunthorpe wrote:
> On Thu, Apr 06, 2023 at 07:08:59PM +0100, Robin Murphy wrote:
>
>>> + if (ret) {
>>> + result = ret;
>>> + /*
>>> + * Keep trying the other devices in the group. If a
>>> + * driver fails attach to an otherwise good domain, and
>>> + * does not support blocking domains, it should at least
>>> + * drop its reference on the current domain so we don't
>>> + * UAF.
>>> + */
>>> + if (flags & IOMMU_SET_DOMAIN_MUST_SUCCEED)
>>> + continue;
>>
>> I know it's called out above, but my brain truly can't cope with a flag
>> named "MUST_SUCCEED" which actually means "can fail, but you can't reason
>> about the state after failure". As before, I think the actual flag itself
>> can go away, but the behaviour itself will still be no nicer for being
>> implicit :/
>
> What I saw is that a bunch of drivers tend to implement attach_dev
> with first some kind of detach, which often cannot fail, and then can
> fail the new attach.
>
> So, this is intended to go along with that design pattern. We call
> every device's attach and hope that the outcome is either to release
> the current iommu_domain or to attach to the new one.
It's that design pattern which is most of the problem, though. It's a
left-over from the old pre-default-domain era (quite literally, in most
cases, where drivers were just minimally bodged to simulate the previous
core call of ops->detach_dev). In the modern
always-attached-to-something paradigm, drivers should absolutely be able
to figure out if an attach will fail *before* they throw away any
existing device state.
If we reach the point of adding core code that only serves to help
drivers emulate the new API using the old API because we'd rather
undermine the new API design than fix the drivers, then frankly that
seems like an argument for just reverting back to the old API.
Partial failure at the group level - i.e. one device's attach call
failing after previous ones have succeeded and leaving the group as a
whole in an inconsistent state - is a separate issue dating all the way
back to day 1 of groups. I don't disagree with questioning the current
implicit assumption that that just shouldn't ever happen, but it touches
on different aspects of driver design than the above.
> If a driver doesn't follow this pattern then we'll UAF.
>
> The primary purpose of the MUST_SUCCEED flag is to prevent the code
> from trying to go back to the old domain. This really should only
> depend on the caller's context and ability to handle errors, so I
> don't see what would make it implicit?
It could be implicit *because* it's contextual; the domain types already
carry as much information as the flag would. Attaching to a non-core
(unmanaged or SVA) domain type is the one case where core code has clear
reason to unwind if an attach fails mid-group. The initial attach to a
core domain from nothing in probe_device could be allowed to fail, but
there's clearly nothing to unwind there, and leaving the device in limbo
seems OK. Attaching from non-core to core should represent a repeat of
something which previously succeeded for the given device+domain
combination, so whether we permit that to fail-to-limbo or mandate that
drivers must be able to repeat a previous success mostly depends on what
we think is a reasonable expectation of how to design a driver (I think
it comes down to whether anyone really needs any domain-specific
allocations that have to belong to devices rather than the domain
itself, and cannot otherwise be preserved over a detach/reattach cycle).
Finally, core to core transitions via sysfs can probably be their own
special case depending on what falls out of the others.
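As a rough illustration only (a hypothetical helper, not something in
this series), that inference could be as small as:

	/* Infer the unwind policy from the target domain type rather than
	 * from a flag passed by the caller. */
	static bool attach_failure_should_unwind(struct iommu_domain *new_domain)
	{
		/* Only non-core (unmanaged/SVA) domains clearly need unwind */
		return new_domain->type == IOMMU_DOMAIN_UNMANAGED ||
		       new_domain->type == IOMMU_DOMAIN_SVA;
	}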
> We definitely do not want to attach back the old domain if we are
> about to unconditionally free it.
>
>> TBH I still think the responsibility should be on the drivers to handle
>> rollback of their internal per-device state if returning failure from an
>> .attach_dev call.
>
> It helps the driver stay consistent, but I don't think it helps the
> core code too much.
>
> With the two patches to add the rodata IDENTITY domain to drivers I
> think we should go in that direction.
>
> Lets have an "ops->emergency_domain" that is something the driver can
> always attach. Ideally it will be the BLOCKING domain, but IDENTITY
> could work too.
>
> When this function gets into trouble it will put the entire group on
> 'emergency_domain' and things will be safe, no UAF.
>
> This seems pretty simple for drivers since the attach_dev can just
> leave behind whatever state is convenient, so long as their
> emergency_domain cannot fail.
>
>> But what would be really really really good, now that .detach_dev has
>> truly gone, is to pass *both* "to" and "from" domains to .attach_dev (and a
>> "from" domain to .release_device), so that drivers don't have to squirrel
>> away all these internal references to their own notions of "current" domains
>> which are the root of this whole class of horrid UAF and consistency
>> bugs.
>
> Usually the IOMMU HW will be touching the IOPTEs via DMA enclosed by
> the iommu_domain, so even if we could remove the iommu_domain pointers
> from the drivers it still retains the HW programming to the
> memory. Freeing the domain will still UAF.
>
> Refcounts would be another way to address the UAF, then we can leak
> the iommu_domain if things go bad. I think this is more complex on drivers.
Ah, we're thinking of different notions of UAF - I just mean drivers
dereferencing stale pointers in software. If software leaves the
hardware pointing at the wrong thing, then it's already not going to
work as expected from then on, regardless of whether that
descriptor/pagetable/whatever is subsequently freed or not, so as far as
I'm concerned there isn't much meaningful functional difference between
"broken" and "much more badly broken". Obviously it does matter more
from an overall kernel safety/stability viewpoint, but that's only a
concern for when we can't get the API-level functional correctness right
in the first place.
Thanks,
Robin.
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v3 03/17] iommu: Make __iommu_group_set_domain() handle error unwind
2023-04-13 16:47 ` Robin Murphy
@ 2023-04-13 18:39 ` Jason Gunthorpe
0 siblings, 0 replies; 36+ messages in thread
From: Jason Gunthorpe @ 2023-04-13 18:39 UTC (permalink / raw)
To: Robin Murphy
Cc: iommu, Joerg Roedel, llvm, Nathan Chancellor, Nick Desaulniers,
Miguel Ojeda, Tom Rix, Will Deacon, Lu Baolu, Kevin Tian,
Nicolin Chen
On Thu, Apr 13, 2023 at 05:47:14PM +0100, Robin Murphy wrote:
> On 06/04/2023 8:09 pm, Jason Gunthorpe wrote:
> > On Thu, Apr 06, 2023 at 07:08:59PM +0100, Robin Murphy wrote:
> >
> > > > + if (ret) {
> > > > + result = ret;
> > > > + /*
> > > > + * Keep trying the other devices in the group. If a
> > > > + * driver fails attach to an otherwise good domain, and
> > > > + * does not support blocking domains, it should at least
> > > > + * drop its reference on the current domain so we don't
> > > > + * UAF.
> > > > + */
> > > > + if (flags & IOMMU_SET_DOMAIN_MUST_SUCCEED)
> > > > + continue;
> > >
> > > I know it's called out above, but my brain truly can't cope with a flag
> > > named "MUST_SUCCEED" which actually means "can fail, but you can't reason
> > > about the state after failure". As before, I think the actual flag itself
> > > can go away, but the behaviour itself will still be no nicer for being
> > > implicit :/
> >
> > What I saw is that a bunch of drivers tend to implement attach_dev
> > with first some kind of detach, which often cannot fail, and then can
> > fail the new attach.
> >
> > So, this is intended to go along with that design pattern. We call
> > every device's attach and hope that the outcome is either to release
> > the current iommu_domain or to attach to the new one.
>
> It's that design pattern which is most of the problem, though.
Yes, this is definitely true.
> It's a left-over from the old pre-default-domain era (quite
> literally, in most cases, where drivers were just minimally bodged
> to simulate the previous core call of ops->detach_dev). In the
> modern always-attached-to-something paradigm, drivers should
> absolutely be able to figure out if an attach will fail *before*
> they throw away any existing device state.
That is fine
I think I muddled the description I gave last time.
The loop is doing this:
for_each_group_device(group, gdev) {
ret = __iommu_device_set_domain(group, gdev->dev, new_domain,
flags);
if (ret) {
result = ret;
/*
* Keep trying the other devices in the group. If a
* driver fails attach to an otherwise good domain, and
* does not support blocking domains, it should at least
* drop its reference on the current domain so we don't
* UAF.
*/
if (flags & IOMMU_SET_DOMAIN_MUST_SUCCEED)
continue;
goto err_revert;
And then in __iommu_device_set_domain:
ret = __iommu_attach_device(new_domain, dev);
if (ret) {
/*
* If we have a blocking domain then try to attach that in hopes
* of avoiding a UAF. Modern drivers should implement blocking
* domains as global statics that cannot fail.
*/
if ((flags & IOMMU_SET_DOMAIN_MUST_SUCCEED) &&
group->blocking_domain &&
group->blocking_domain != new_domain)
__iommu_attach_device(group->blocking_domain, dev);
So, for "old" drivers the ops->attach_dev will disconnect the domain while
it fails.
"new" drivers will leave the domain attached and the above will then
go on an explicitly use the blocking_domain.
So the purpose of the loop is to go over every device and detach the
old domain, either via the sketchy old-driver implicit detach or via the
new-driver explicit blocking domain attach.
The blocking domain part isn't 100% finished yet, as
group->blocking_domain is often NULL, and might be an UNMANAGED
domain, but from a failure management perspective this is what I am
proposing.
> If we reach the point of adding core code that only serves to help drivers
> emulate the new API using the old API because we'd rather undermine the new
> API design than fix the drivers, then frankly that seems like an argument
> for just reverting back to the old API.
I don't see it as undermining; what a new high-quality driver should
implement is:
- Do all attach checks before changing any state. Once state changing
happens it should succeed
- Thus leave the old domain in place on attach failure
- Provide a blocking domain with an attach_dev that cannot fail
- If attach must fail and cannot go back to the old domain, it must
go to a blocking domain
- Re-attach of a previously attached domain should succeed
- [missing] provide the blocking domain pointer through the device
ops so it is always available.
- Do not fail when attaching the blocking domain
Effectively the blocking domain is filling the error handling role
left by detach_dev. What we are doing is unmuddling the mixture of
detach_dev behaviors where some drivers set IDENTITY and some drivers
set BLOCKING.
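A minimal sketch of that driver-side pattern (all example_* names are
made up for illustration; this is not a real driver):

	/* Static blocking domain whose attach is written so it cannot fail */
	static int example_attach_blocked(struct iommu_domain *domain,
					  struct device *dev)
	{
		example_hw_disable_translation(dev_iommu_priv_get(dev));
		return 0;
	}

	static const struct iommu_domain_ops example_blocked_ops = {
		.attach_dev = example_attach_blocked,
	};

	static struct iommu_domain example_blocked_domain = {
		.type = IOMMU_DOMAIN_BLOCKED,
		.ops = &example_blocked_ops,
	};

	static int example_attach_dev(struct iommu_domain *domain,
				      struct device *dev)
	{
		struct example_master *master = dev_iommu_priv_get(dev);
		int ret;

		/* Do every check that can fail before changing any state */
		ret = example_check_compatible(master, domain);
		if (ret)
			return ret;

		/*
		 * Past this point nothing fails; the old domain stays intact
		 * until the new one is fully installed.
		 */
		example_hw_install_domain(master, domain);
		return 0;
	}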
> > The primary purpose of the MUST_SUCCEED flag is to prevent the code
> > from trying to go back to the old domain. This really should only
> > depend on the caller's context and ability to handle errors, so I
> > don't see what would make it implicit?
>
> It could be implicit *because* it's contextual; the domain types already
> carry as much information as the flag would. Attaching to a non-core
> (unmanaged or SVA) domain type is the one case where core code has clear
> reason to unwind if an attach fails mid-group.
It is not universal, e.g. the "change default domain" flow looks like this:
/* We must set default_domain early for __iommu_device_set_domain */
group->default_domain = dom;
if (!group->domain) {
..
} else {
ret = __iommu_group_set_domain(group, dom);
if (ret) {
In this case even though we are attaching the (new) default domain
what we want is to unroll back to group->domain and leave the old
default_domain in place.
I find it much clearer to directly specify the error unwind expectation
of the caller based on the caller's logic and ability to handle the
returned error than to try to infer it.
> Attaching from non-core to core should represent a repeat of something which
> previously succeeded for the given device+domain combination, so whether we
> permit that to fail-to-limbo or mandate that drivers must be able to repeat
> a previous success mostly depends on what we think is a reasonable
> expectation of how to design a driver
I think most drivers are OK to repeat, but some, like s390, have FW
interfaces that are not reliable, so a repeat can fail there. I would
say drivers should strive for a design where a repeat works.
The repeat-safe drivers never hit these error cases anyhow.
When thinking about this error handling I was looking at cases like
s390. For their flow the first thing attach does is a FW call to set a
blocking behavior; if they have a failure from that point on they can
leave the device blocking. When the core code asks for blocking during
the error handling it is a NOP.
Jason
^ permalink raw reply [flat|nested] 36+ messages in thread
end of thread, other threads:[~2023-04-13 18:39 UTC | newest]
Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-05 23:38 [PATCH v3 00/17] Consolidate the error handling around device attachment Jason Gunthorpe
2023-04-05 23:38 ` [PATCH v3 01/17] iommu: Replace iommu_group_device_count() with list_count_nodes() Jason Gunthorpe
2023-04-05 23:38 ` [PATCH v3 02/17] iommu: Add for_each_group_device() Jason Gunthorpe
2023-04-05 23:38 ` [PATCH v3 03/17] iommu: Make __iommu_group_set_domain() handle error unwind Jason Gunthorpe
2023-04-06 18:08 ` Robin Murphy
2023-04-06 19:09 ` Jason Gunthorpe
2023-04-13 16:47 ` Robin Murphy
2023-04-13 18:39 ` Jason Gunthorpe
2023-04-05 23:38 ` [PATCH v3 04/17] iommu: Use __iommu_group_set_domain() for __iommu_attach_group() Jason Gunthorpe
2023-04-05 23:38 ` [PATCH v3 05/17] iommu: Use __iommu_group_set_domain() in iommu_change_dev_def_domain() Jason Gunthorpe
2023-04-05 23:38 ` [PATCH v3 06/17] iommu: Replace __iommu_group_dma_first_attach() with set_domain Jason Gunthorpe
2023-04-06 15:58 ` Robin Murphy
2023-04-06 19:17 ` Jason Gunthorpe
2023-04-05 23:38 ` [PATCH v3 07/17] iommu: Make iommu_group_do_dma_first_attach() simpler Jason Gunthorpe
2023-04-06 3:28 ` Baolu Lu
2023-04-06 15:44 ` Robin Murphy
2023-04-06 16:09 ` Jason Gunthorpe
2023-04-06 19:05 ` Robin Murphy
2023-04-05 23:38 ` [PATCH v3 08/17] iommu: Make iommu_group_do_dma_first_attach() work with owned groups Jason Gunthorpe
2023-04-06 16:37 ` Robin Murphy
2023-04-06 19:34 ` Jason Gunthorpe
2023-04-05 23:38 ` [PATCH v3 09/17] iommu: Fix iommu_probe_device() to attach the right domain Jason Gunthorpe
2023-04-05 23:38 ` [PATCH v3 10/17] iommu: Do iommu_group_create_direct_mappings() before attach Jason Gunthorpe
2023-04-06 3:14 ` Baolu Lu
2023-04-05 23:38 ` [PATCH v3 11/17] iommu: Remove the assignment of group->domain during default domain alloc Jason Gunthorpe
2023-04-05 23:38 ` [PATCH v3 12/17] iommu: Consolidate the code to calculate the target default domain type Jason Gunthorpe
2023-04-05 23:38 ` [PATCH v3 13/17] iommu: Revise iommu_group_alloc_default_domain() Jason Gunthorpe
2023-04-06 3:40 ` Baolu Lu
2023-04-06 11:34 ` Jason Gunthorpe
2023-04-05 23:38 ` [PATCH v3 14/17] iommu: Consolidate the default_domain setup to one function Jason Gunthorpe
2023-04-06 4:43 ` Baolu Lu
2023-04-05 23:38 ` [PATCH v3 15/17] iommu: Allow IOMMU_RESV_DIRECT to work on ARM Jason Gunthorpe
2023-04-06 3:53 ` Baolu Lu
2023-04-05 23:38 ` [PATCH v3 16/17] iommu: Remove __iommu_group_for_each_dev() Jason Gunthorpe
2023-04-05 23:38 ` [PATCH v3 17/17] iommu: Tidy the control flow in iommu_group_store_type() Jason Gunthorpe
2023-04-07 11:02 ` [PATCH v3 00/17] Consolidate the error handling around device attachment Naresh Kamboju