* [PATCH v3 01/12] iommu/vt-d: Add cache tag assignment interface
2024-04-16 8:06 [PATCH v3 00/12] Consolidate domain cache invalidation Lu Baolu
@ 2024-04-16 8:06 ` Lu Baolu
2024-04-23 8:17 ` Tian, Kevin
2024-04-23 9:01 ` Tian, Kevin
2024-04-16 8:06 ` [PATCH v3 02/12] iommu/vt-d: Add cache tag invalidation helpers Lu Baolu
` (11 subsequent siblings)
12 siblings, 2 replies; 22+ messages in thread
From: Lu Baolu @ 2024-04-16 8:06 UTC (permalink / raw)
To: Joerg Roedel, Will Deacon, Robin Murphy, Kevin Tian,
Jason Gunthorpe
Cc: Tina Zhang, Yi Liu, iommu, linux-kernel, Lu Baolu
A caching tag is a combination of tags used by the hardware to cache various
translations. Whenever a mapping in a domain is changed, the IOMMU driver
should invalidate the caches with the caching tags. The VT-d specification
describes caching tags in Section 6.2.1, Tagging of Cached Translations.
Add an interface to assign caching tags to an IOMMU domain when it is
attached to a RID or PASID, and to unassign them when the domain is detached
from the RID or PASID. All caching tags are kept in a per-domain list and
are protected by a dedicated lock.
In addition to the basic IOTLB and devTLB caching tag types, PARENT_IOTLB
and PARENT_DEVTLB tag types are also introduced. These tags are used for
caches that store translations for DMA accesses through a nested user
domain. They are affected by changes to mappings in the parent domain.
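As a usage illustration (a sketch only, not part of this patch), an attach
path is expected to pair the two helpers as below; the variable names and
the error label are assumptions borrowed from the attach paths touched by
this patch:

	/* Attach: create the tags before installing the translation. */
	ret = cache_tag_assign_domain(dmar_domain, dev, IOMMU_NO_PASID);
	if (ret)
		goto out_detach_iommu;

	/* Detach: release the tags assigned above. */
	cache_tag_unassign_domain(dmar_domain, dev, IOMMU_NO_PASID);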
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
drivers/iommu/intel/iommu.h | 31 +++++
drivers/iommu/intel/cache.c | 214 +++++++++++++++++++++++++++++++++++
drivers/iommu/intel/iommu.c | 28 ++++-
drivers/iommu/intel/nested.c | 19 +++-
drivers/iommu/intel/svm.c | 10 +-
drivers/iommu/intel/Makefile | 2 +-
6 files changed, 295 insertions(+), 9 deletions(-)
create mode 100644 drivers/iommu/intel/cache.c
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index 404d2476a877..52471f5337d5 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -607,6 +607,9 @@ struct dmar_domain {
struct list_head devices; /* all devices' list */
struct list_head dev_pasids; /* all attached pasids */
+ spinlock_t cache_lock; /* Protect the cache tag list */
+ struct list_head cache_tags; /* Cache tag list */
+
int iommu_superpage;/* Level of superpages supported:
0 == 4KiB (no superpages), 1 == 2MiB,
2 == 1GiB, 3 == 512GiB, 4 == 1TiB */
@@ -1092,6 +1095,34 @@ struct iommu_domain *intel_nested_domain_alloc(struct iommu_domain *parent,
const struct iommu_user_data *user_data);
struct device *device_rbtree_find(struct intel_iommu *iommu, u16 rid);
+enum cache_tag_type {
+ CACHE_TAG_IOTLB,
+ CACHE_TAG_DEVTLB,
+ CACHE_TAG_NESTING_IOTLB,
+ CACHE_TAG_NESTING_DEVTLB,
+};
+
+struct cache_tag {
+ struct list_head node;
+ enum cache_tag_type type;
+ struct intel_iommu *iommu;
+ /*
+ * The @dev field represents the location of the cache. For IOTLB, it
+ * resides on the IOMMU hardware. @dev stores the device pointer to
+ * the IOMMU hardware. For DevTLB, it resides in the PCIe endpoint.
+ * @dev stores the device pointer to that endpoint.
+ */
+ struct device *dev;
+ u16 domain_id;
+ ioasid_t pasid;
+ unsigned int users;
+};
+
+int cache_tag_assign_domain(struct dmar_domain *domain,
+ struct device *dev, ioasid_t pasid);
+void cache_tag_unassign_domain(struct dmar_domain *domain,
+ struct device *dev, ioasid_t pasid);
+
#ifdef CONFIG_INTEL_IOMMU_SVM
void intel_svm_check(struct intel_iommu *iommu);
int intel_svm_enable_prq(struct intel_iommu *iommu);
diff --git a/drivers/iommu/intel/cache.c b/drivers/iommu/intel/cache.c
new file mode 100644
index 000000000000..296f1645a739
--- /dev/null
+++ b/drivers/iommu/intel/cache.c
@@ -0,0 +1,214 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * cache.c - Intel VT-d cache invalidation
+ *
+ * Copyright (C) 2024 Intel Corporation
+ *
+ * Author: Lu Baolu <baolu.lu@linux.intel.com>
+ */
+
+#define pr_fmt(fmt) "DMAR: " fmt
+
+#include <linux/dmar.h>
+#include <linux/iommu.h>
+#include <linux/memory.h>
+#include <linux/spinlock.h>
+
+#include "iommu.h"
+#include "pasid.h"
+
+/* Check if an existing cache tag can be reused for a new association. */
+static bool cache_tage_match(struct cache_tag *tag, u16 domain_id,
+ struct intel_iommu *iommu, struct device *dev,
+ ioasid_t pasid, enum cache_tag_type type)
+{
+ if (tag->type != type)
+ return false;
+
+ if (tag->domain_id != domain_id || tag->pasid != pasid)
+ return false;
+
+ if (type == CACHE_TAG_IOTLB || type == CACHE_TAG_NESTING_IOTLB)
+ return tag->iommu == iommu;
+
+ if (type == CACHE_TAG_DEVTLB || type == CACHE_TAG_NESTING_DEVTLB)
+ return tag->dev == dev;
+
+ return false;
+}
+
+/* Assign a cache tag with specified type to domain. */
+static int cache_tag_assign(struct dmar_domain *domain, u16 did,
+ struct device *dev, ioasid_t pasid,
+ enum cache_tag_type type)
+{
+ struct device_domain_info *info = dev_iommu_priv_get(dev);
+ struct intel_iommu *iommu = info->iommu;
+ struct cache_tag *tag, *temp;
+ unsigned long flags;
+
+ tag = kzalloc(sizeof(*tag), GFP_KERNEL);
+ if (!tag)
+ return -ENOMEM;
+
+ tag->type = type;
+ tag->iommu = iommu;
+ tag->domain_id = did;
+ tag->pasid = pasid;
+ tag->users = 1;
+
+ if (type == CACHE_TAG_DEVTLB || type == CACHE_TAG_NESTING_DEVTLB)
+ tag->dev = dev;
+ else
+ tag->dev = iommu->iommu.dev;
+
+ spin_lock_irqsave(&domain->cache_lock, flags);
+ list_for_each_entry(temp, &domain->cache_tags, node) {
+ if (cache_tage_match(temp, did, iommu, dev, pasid, type)) {
+ temp->users++;
+ spin_unlock_irqrestore(&domain->cache_lock, flags);
+ kfree(tag);
+ return 0;
+ }
+ }
+ list_add_tail(&tag->node, &domain->cache_tags);
+ spin_unlock_irqrestore(&domain->cache_lock, flags);
+
+ return 0;
+}
+
+/* Unassign a cache tag with specified type from domain. */
+static void cache_tag_unassign(struct dmar_domain *domain, u16 did,
+ struct device *dev, ioasid_t pasid,
+ enum cache_tag_type type)
+{
+ struct device_domain_info *info = dev_iommu_priv_get(dev);
+ struct intel_iommu *iommu = info->iommu;
+ struct cache_tag *tag;
+ unsigned long flags;
+
+ spin_lock_irqsave(&domain->cache_lock, flags);
+ list_for_each_entry(tag, &domain->cache_tags, node) {
+ if (cache_tage_match(tag, did, iommu, dev, pasid, type)) {
+ if (--tag->users == 0) {
+ list_del(&tag->node);
+ kfree(tag);
+ }
+ break;
+ }
+ }
+ spin_unlock_irqrestore(&domain->cache_lock, flags);
+}
+
+static int __cache_tag_assign_domain(struct dmar_domain *domain, u16 did,
+ struct device *dev, ioasid_t pasid)
+{
+ struct device_domain_info *info = dev_iommu_priv_get(dev);
+ int ret;
+
+ ret = cache_tag_assign(domain, did, dev, pasid, CACHE_TAG_IOTLB);
+ if (ret || !info->ats_enabled)
+ return ret;
+
+ ret = cache_tag_assign(domain, did, dev, pasid, CACHE_TAG_DEVTLB);
+ if (ret)
+ cache_tag_unassign(domain, did, dev, pasid, CACHE_TAG_IOTLB);
+
+ return ret;
+}
+
+static void __cache_tag_unassign_domain(struct dmar_domain *domain, u16 did,
+ struct device *dev, ioasid_t pasid)
+{
+ struct device_domain_info *info = dev_iommu_priv_get(dev);
+
+ cache_tag_unassign(domain, did, dev, pasid, CACHE_TAG_IOTLB);
+
+ if (info->ats_enabled)
+ cache_tag_unassign(domain, did, dev, pasid, CACHE_TAG_DEVTLB);
+}
+
+static int __cache_tag_assign_parent_domain(struct dmar_domain *domain, u16 did,
+ struct device *dev, ioasid_t pasid)
+{
+ struct device_domain_info *info = dev_iommu_priv_get(dev);
+ int ret;
+
+ ret = cache_tag_assign(domain, did, dev, pasid, CACHE_TAG_NESTING_IOTLB);
+ if (ret || !info->ats_enabled)
+ return ret;
+
+ ret = cache_tag_assign(domain, did, dev, pasid, CACHE_TAG_NESTING_DEVTLB);
+ if (ret)
+ cache_tag_unassign(domain, did, dev, pasid, CACHE_TAG_NESTING_IOTLB);
+
+ return ret;
+}
+
+static void __cache_tag_unassign_parent_domain(struct dmar_domain *domain, u16 did,
+ struct device *dev, ioasid_t pasid)
+{
+ struct device_domain_info *info = dev_iommu_priv_get(dev);
+
+ cache_tag_unassign(domain, did, dev, pasid, CACHE_TAG_NESTING_IOTLB);
+
+ if (info->ats_enabled)
+ cache_tag_unassign(domain, did, dev, pasid, CACHE_TAG_NESTING_DEVTLB);
+}
+
+static u16 domain_get_id_for_dev(struct dmar_domain *domain, struct device *dev)
+{
+ struct device_domain_info *info = dev_iommu_priv_get(dev);
+ struct intel_iommu *iommu = info->iommu;
+
+ /*
+ * The driver assigns different domain IDs for all domains except
+ * the SVA type.
+ */
+ if (domain->domain.type == IOMMU_DOMAIN_SVA)
+ return FLPT_DEFAULT_DID;
+
+ return domain_id_iommu(domain, iommu);
+}
+
+/*
+ * Assign cache tags to a domain when it's associated with a device's
+ * PASID using a specific domain ID.
+ *
+ * On success (return value of 0), cache tags are created and added to the
+ * domain's cache tag list. On failure (negative return value), an error
+ * code is returned indicating the reason for the failure.
+ */
+int cache_tag_assign_domain(struct dmar_domain *domain,
+ struct device *dev, ioasid_t pasid)
+{
+ u16 did = domain_get_id_for_dev(domain, dev);
+ int ret;
+
+ ret = __cache_tag_assign_domain(domain, did, dev, pasid);
+ if (ret || domain->domain.type != IOMMU_DOMAIN_NESTED)
+ return ret;
+
+ ret = __cache_tag_assign_parent_domain(domain->s2_domain, did, dev, pasid);
+ if (ret)
+ __cache_tag_unassign_domain(domain, did, dev, pasid);
+
+ return ret;
+}
+
+/*
+ * Remove the cache tags associated with a device's PASID when the domain is
+ * detached from the device.
+ *
+ * The cache tags must be previously assigned to the domain by calling the
+ * assign interface.
+ */
+void cache_tag_unassign_domain(struct dmar_domain *domain,
+ struct device *dev, ioasid_t pasid)
+{
+ u16 did = domain_get_id_for_dev(domain, dev);
+
+ __cache_tag_unassign_domain(domain, did, dev, pasid);
+ if (domain->domain.type == IOMMU_DOMAIN_NESTED)
+ __cache_tag_unassign_parent_domain(domain->s2_domain, did, dev, pasid);
+}
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index f0a67e9d9faf..0c0b8e493fda 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1745,7 +1745,9 @@ static struct dmar_domain *alloc_domain(unsigned int type)
domain->has_iotlb_device = false;
INIT_LIST_HEAD(&domain->devices);
INIT_LIST_HEAD(&domain->dev_pasids);
+ INIT_LIST_HEAD(&domain->cache_tags);
spin_lock_init(&domain->lock);
+ spin_lock_init(&domain->cache_lock);
xa_init(&domain->iommu_array);
return domain;
@@ -1757,6 +1759,9 @@ int domain_attach_iommu(struct dmar_domain *domain, struct intel_iommu *iommu)
unsigned long ndomains;
int num, ret = -ENOSPC;
+ if (domain->domain.type == IOMMU_DOMAIN_SVA)
+ return 0;
+
info = kzalloc(sizeof(*info), GFP_KERNEL);
if (!info)
return -ENOMEM;
@@ -1804,6 +1809,9 @@ void domain_detach_iommu(struct dmar_domain *domain, struct intel_iommu *iommu)
{
struct iommu_domain_info *info;
+ if (domain->domain.type == IOMMU_DOMAIN_SVA)
+ return;
+
spin_lock(&iommu->lock);
info = xa_load(&domain->iommu_array, iommu->seq_id);
if (--info->refcnt == 0) {
@@ -2322,6 +2330,13 @@ static int dmar_domain_attach_device(struct dmar_domain *domain,
ret = domain_attach_iommu(domain, iommu);
if (ret)
return ret;
+
+ ret = cache_tag_assign_domain(domain, dev, IOMMU_NO_PASID);
+ if (ret) {
+ domain_detach_iommu(domain, iommu);
+ return ret;
+ }
+
info->domain = domain;
spin_lock_irqsave(&domain->lock, flags);
list_add(&info->link, &domain->devices);
@@ -3810,6 +3825,7 @@ void device_block_translation(struct device *dev)
list_del(&info->link);
spin_unlock_irqrestore(&info->domain->lock, flags);
+ cache_tag_unassign_domain(info->domain, dev, IOMMU_NO_PASID);
domain_detach_iommu(info->domain, iommu);
info->domain = NULL;
}
@@ -4597,6 +4613,7 @@ static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
domain = iommu_get_domain_for_dev_pasid(dev, pasid, 0);
if (WARN_ON_ONCE(!domain))
goto out_tear_down;
+ dmar_domain = to_dmar_domain(domain);
/*
* The SVA implementation needs to handle its own stuffs like the mm
@@ -4605,10 +4622,10 @@ static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
*/
if (domain->type == IOMMU_DOMAIN_SVA) {
intel_svm_remove_dev_pasid(dev, pasid);
+ cache_tag_unassign_domain(dmar_domain, dev, pasid);
goto out_tear_down;
}
- dmar_domain = to_dmar_domain(domain);
spin_lock_irqsave(&dmar_domain->lock, flags);
list_for_each_entry(curr, &dmar_domain->dev_pasids, link_domain) {
if (curr->dev == dev && curr->pasid == pasid) {
@@ -4620,6 +4637,7 @@ static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
WARN_ON_ONCE(!dev_pasid);
spin_unlock_irqrestore(&dmar_domain->lock, flags);
+ cache_tag_unassign_domain(dmar_domain, dev, pasid);
domain_detach_iommu(dmar_domain, iommu);
intel_iommu_debugfs_remove_dev_pasid(dev_pasid);
kfree(dev_pasid);
@@ -4659,6 +4677,10 @@ static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
if (ret)
goto out_free;
+ ret = cache_tag_assign_domain(dmar_domain, dev, pasid);
+ if (ret)
+ goto out_detach_iommu;
+
if (domain_type_is_si(dmar_domain))
ret = intel_pasid_setup_pass_through(iommu, dev, pasid);
else if (dmar_domain->use_first_level)
@@ -4668,7 +4690,7 @@ static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
ret = intel_pasid_setup_second_level(iommu, dmar_domain,
dev, pasid);
if (ret)
- goto out_detach_iommu;
+ goto out_unassign_tag;
dev_pasid->dev = dev;
dev_pasid->pasid = pasid;
@@ -4680,6 +4702,8 @@ static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
intel_iommu_debugfs_create_dev_pasid(dev_pasid);
return 0;
+out_unassign_tag:
+ cache_tag_unassign_domain(dmar_domain, dev, pasid);
out_detach_iommu:
domain_detach_iommu(dmar_domain, iommu);
out_free:
diff --git a/drivers/iommu/intel/nested.c b/drivers/iommu/intel/nested.c
index a7d68f3d518a..13406ee742bf 100644
--- a/drivers/iommu/intel/nested.c
+++ b/drivers/iommu/intel/nested.c
@@ -52,13 +52,14 @@ static int intel_nested_attach_dev(struct iommu_domain *domain,
return ret;
}
+ ret = cache_tag_assign_domain(dmar_domain, dev, IOMMU_NO_PASID);
+ if (ret)
+ goto detach_iommu;
+
ret = intel_pasid_setup_nested(iommu, dev,
IOMMU_NO_PASID, dmar_domain);
- if (ret) {
- domain_detach_iommu(dmar_domain, iommu);
- dev_err_ratelimited(dev, "Failed to setup pasid entry\n");
- return ret;
- }
+ if (ret)
+ goto unassign_tag;
info->domain = dmar_domain;
spin_lock_irqsave(&dmar_domain->lock, flags);
@@ -68,6 +69,12 @@ static int intel_nested_attach_dev(struct iommu_domain *domain,
domain_update_iotlb(dmar_domain);
return 0;
+unassign_tag:
+ cache_tag_unassign_domain(dmar_domain, dev, IOMMU_NO_PASID);
+detach_iommu:
+ domain_detach_iommu(dmar_domain, iommu);
+
+ return ret;
}
static void intel_nested_domain_free(struct iommu_domain *domain)
@@ -206,7 +213,9 @@ struct iommu_domain *intel_nested_domain_alloc(struct iommu_domain *parent,
domain->domain.type = IOMMU_DOMAIN_NESTED;
INIT_LIST_HEAD(&domain->devices);
INIT_LIST_HEAD(&domain->dev_pasids);
+ INIT_LIST_HEAD(&domain->cache_tags);
spin_lock_init(&domain->lock);
+ spin_lock_init(&domain->cache_lock);
xa_init(&domain->iommu_array);
spin_lock(&s2_domain->s1_lock);
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index ee3b469e2da1..641d0cef3737 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -366,17 +366,23 @@ static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
sdev->qdep = 0;
}
+ ret = cache_tag_assign_domain(to_dmar_domain(domain), dev, pasid);
+ if (ret)
+ goto free_sdev;
+
/* Setup the pasid table: */
sflags = cpu_feature_enabled(X86_FEATURE_LA57) ? PASID_FLAG_FL5LP : 0;
ret = intel_pasid_setup_first_level(iommu, dev, mm->pgd, pasid,
FLPT_DEFAULT_DID, sflags);
if (ret)
- goto free_sdev;
+ goto unassign_tag;
list_add_rcu(&sdev->list, &svm->devs);
return 0;
+unassign_tag:
+ cache_tag_unassign_domain(to_dmar_domain(domain), dev, pasid);
free_sdev:
kfree(sdev);
free_svm:
@@ -795,6 +801,8 @@ struct iommu_domain *intel_svm_domain_alloc(void)
if (!domain)
return NULL;
domain->domain.ops = &intel_svm_domain_ops;
+ INIT_LIST_HEAD(&domain->cache_tags);
+ spin_lock_init(&domain->cache_lock);
return &domain->domain;
}
diff --git a/drivers/iommu/intel/Makefile b/drivers/iommu/intel/Makefile
index 5402b699a122..c8beb0281559 100644
--- a/drivers/iommu/intel/Makefile
+++ b/drivers/iommu/intel/Makefile
@@ -1,6 +1,6 @@
# SPDX-License-Identifier: GPL-2.0
obj-$(CONFIG_DMAR_TABLE) += dmar.o
-obj-$(CONFIG_INTEL_IOMMU) += iommu.o pasid.o nested.o
+obj-$(CONFIG_INTEL_IOMMU) += iommu.o pasid.o nested.o cache.o
obj-$(CONFIG_DMAR_TABLE) += trace.o cap_audit.o
obj-$(CONFIG_DMAR_PERF) += perf.o
obj-$(CONFIG_INTEL_IOMMU_DEBUGFS) += debugfs.o
--
2.34.1
* RE: [PATCH v3 01/12] iommu/vt-d: Add cache tag assignment interface
2024-04-16 8:06 ` [PATCH v3 01/12] iommu/vt-d: Add cache tag assignment interface Lu Baolu
@ 2024-04-23 8:17 ` Tian, Kevin
2024-04-24 1:48 ` Baolu Lu
2024-04-23 9:01 ` Tian, Kevin
1 sibling, 1 reply; 22+ messages in thread
From: Tian, Kevin @ 2024-04-23 8:17 UTC (permalink / raw)
To: Lu Baolu, Joerg Roedel, Will Deacon, Robin Murphy,
Jason Gunthorpe
Cc: Zhang, Tina, Liu, Yi L, iommu@lists.linux.dev,
linux-kernel@vger.kernel.org
> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Tuesday, April 16, 2024 4:07 PM
>
> In addition to the basic IOTLB and devTLB caching tag types, PARENT_IOTLB
> and PARENT_DEVTLB tag types are also introduced. These tags are used for
it's NESTING_IOTLB/DEVTLB now.
* Re: [PATCH v3 01/12] iommu/vt-d: Add cache tag assignment interface
2024-04-23 8:17 ` Tian, Kevin
@ 2024-04-24 1:48 ` Baolu Lu
0 siblings, 0 replies; 22+ messages in thread
From: Baolu Lu @ 2024-04-24 1:48 UTC (permalink / raw)
To: Tian, Kevin, Joerg Roedel, Will Deacon, Robin Murphy,
Jason Gunthorpe
Cc: baolu.lu, Zhang, Tina, Liu, Yi L, iommu@lists.linux.dev,
linux-kernel@vger.kernel.org
On 4/23/24 4:17 PM, Tian, Kevin wrote:
>> From: Lu Baolu <baolu.lu@linux.intel.com>
>> Sent: Tuesday, April 16, 2024 4:07 PM
>>
>> In addition to the basic IOTLB and devTLB caching tag types, PARENT_IOTLB
>> and PARENT_DEVTLB tag types are also introduced. These tags are used for
>
> it's NESTING_IOTLB/DEVTLB now.
>
Fixed.
Best regards,
baolu
* RE: [PATCH v3 01/12] iommu/vt-d: Add cache tag assignment interface
2024-04-16 8:06 ` [PATCH v3 01/12] iommu/vt-d: Add cache tag assignment interface Lu Baolu
2024-04-23 8:17 ` Tian, Kevin
@ 2024-04-23 9:01 ` Tian, Kevin
2024-04-24 2:44 ` Baolu Lu
1 sibling, 1 reply; 22+ messages in thread
From: Tian, Kevin @ 2024-04-23 9:01 UTC (permalink / raw)
To: Lu Baolu, Joerg Roedel, Will Deacon, Robin Murphy,
Jason Gunthorpe
Cc: Zhang, Tina, Liu, Yi L, iommu@lists.linux.dev,
linux-kernel@vger.kernel.org
> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Tuesday, April 16, 2024 4:07 PM
>
> @@ -1757,6 +1759,9 @@ int domain_attach_iommu(struct dmar_domain *domain, struct intel_iommu *iommu)
> unsigned long ndomains;
> int num, ret = -ENOSPC;
>
> + if (domain->domain.type == IOMMU_DOMAIN_SVA)
> + return 0;
> +
> info = kzalloc(sizeof(*info), GFP_KERNEL);
> if (!info)
> return -ENOMEM;
> @@ -1804,6 +1809,9 @@ void domain_detach_iommu(struct dmar_domain *domain, struct intel_iommu *iommu)
> {
> struct iommu_domain_info *info;
>
> + if (domain->domain.type == IOMMU_DOMAIN_SVA)
> + return;
> +
> spin_lock(&iommu->lock);
> info = xa_load(&domain->iommu_array, iommu->seq_id);
> if (--info->refcnt == 0) {
The above two are not called for SVA. Why do they start checking the
SVA type now?
* Re: [PATCH v3 01/12] iommu/vt-d: Add cache tag assignment interface
2024-04-23 9:01 ` Tian, Kevin
@ 2024-04-24 2:44 ` Baolu Lu
0 siblings, 0 replies; 22+ messages in thread
From: Baolu Lu @ 2024-04-24 2:44 UTC (permalink / raw)
To: Tian, Kevin, Joerg Roedel, Will Deacon, Robin Murphy,
Jason Gunthorpe
Cc: baolu.lu, Zhang, Tina, Liu, Yi L, iommu@lists.linux.dev,
linux-kernel@vger.kernel.org
On 4/23/24 5:01 PM, Tian, Kevin wrote:
>> From: Lu Baolu <baolu.lu@linux.intel.com>
>> Sent: Tuesday, April 16, 2024 4:07 PM
>>
>> @@ -1757,6 +1759,9 @@ int domain_attach_iommu(struct dmar_domain *domain, struct intel_iommu *iommu)
>> unsigned long ndomains;
>> int num, ret = -ENOSPC;
>>
>> + if (domain->domain.type == IOMMU_DOMAIN_SVA)
>> + return 0;
>> +
>> info = kzalloc(sizeof(*info), GFP_KERNEL);
>> if (!info)
>> return -ENOMEM;
>> @@ -1804,6 +1809,9 @@ void domain_detach_iommu(struct dmar_domain *domain, struct intel_iommu *iommu)
>> {
>> struct iommu_domain_info *info;
>>
>> + if (domain->domain.type == IOMMU_DOMAIN_SVA)
>> + return;
>> +
>> spin_lock(&iommu->lock);
>> info = xa_load(&domain->iommu_array, iommu->seq_id);
>> if (--info->refcnt == 0) {
>
> above two are not called for SVA. Why do they start checking
> SVA type now?
domain_detach_iommu() is called in the remove_dev_pasid path, which covers
the SVA domain as well.
I am moving the check into the domain_attach/detach_iommu() helpers so
that the call sites can be generic. I will later move the domain ID
management into the cache tag code, and this kind of check will be dropped
then.
Best regards,
baolu
* [PATCH v3 02/12] iommu/vt-d: Add cache tag invalidation helpers
2024-04-16 8:06 [PATCH v3 00/12] Consolidate domain cache invalidation Lu Baolu
2024-04-16 8:06 ` [PATCH v3 01/12] iommu/vt-d: Add cache tag assignment interface Lu Baolu
@ 2024-04-16 8:06 ` Lu Baolu
2024-04-22 5:30 ` Baolu Lu
2024-04-16 8:06 ` [PATCH v3 03/12] iommu/vt-d: Add trace events for cache tag interface Lu Baolu
` (10 subsequent siblings)
12 siblings, 1 reply; 22+ messages in thread
From: Lu Baolu @ 2024-04-16 8:06 UTC (permalink / raw)
To: Joerg Roedel, Will Deacon, Robin Murphy, Kevin Tian,
Jason Gunthorpe
Cc: Tina Zhang, Yi Liu, iommu, linux-kernel, Lu Baolu
Add several helpers to invalidate the caches after mappings in the
affected domain are changed.
- cache_tag_flush_range() invalidates a range of caches after mappings
within this range are changed. It uses the page-selective cache
invalidation methods.
- cache_tag_flush_all() invalidates all caches tagged by a domain ID.
It uses the domain-selective cache invalidation methods.
- cache_tag_flush_range_np() invalidates a range of caches when new
mappings are created in the domain and the corresponding page table
entries change from non-present to present.
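As an illustration (a sketch, not part of this patch) of where each helper
is expected to be called; 'domain', 'start' and 'end' are assumed variables:

	/* After modifying mappings within [start, end] of a domain: */
	cache_tag_flush_range(domain, start, end, 0);

	/* After a change that affects the entire domain: */
	cache_tag_flush_all(domain);

	/* After creating new (non-present to present) mappings: */
	cache_tag_flush_range_np(domain, start, end);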
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
drivers/iommu/intel/iommu.h | 14 +++
drivers/iommu/intel/cache.c | 195 ++++++++++++++++++++++++++++++++++++
drivers/iommu/intel/iommu.c | 12 ---
3 files changed, 209 insertions(+), 12 deletions(-)
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index 52471f5337d5..e17683ecef4b 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -35,6 +35,8 @@
#define VTD_PAGE_MASK (((u64)-1) << VTD_PAGE_SHIFT)
#define VTD_PAGE_ALIGN(addr) (((addr) + VTD_PAGE_SIZE - 1) & VTD_PAGE_MASK)
+#define IOVA_PFN(addr) ((addr) >> PAGE_SHIFT)
+
#define VTD_STRIDE_SHIFT (9)
#define VTD_STRIDE_MASK (((u64)-1) << VTD_STRIDE_SHIFT)
@@ -1041,6 +1043,13 @@ static inline void context_set_sm_pre(struct context_entry *context)
context->lo |= BIT_ULL(4);
}
+/* Returns a number of VTD pages, but aligned to MM page size */
+static inline unsigned long aligned_nrpages(unsigned long host_addr, size_t size)
+{
+ host_addr &= ~PAGE_MASK;
+ return PAGE_ALIGN(host_addr + size) >> VTD_PAGE_SHIFT;
+}
+
/* Convert value to context PASID directory size field coding. */
#define context_pdts(pds) (((pds) & 0x7) << 9)
@@ -1122,6 +1131,11 @@ int cache_tag_assign_domain(struct dmar_domain *domain,
struct device *dev, ioasid_t pasid);
void cache_tag_unassign_domain(struct dmar_domain *domain,
struct device *dev, ioasid_t pasid);
+void cache_tag_flush_range(struct dmar_domain *domain, unsigned long start,
+ unsigned long end, int ih);
+void cache_tag_flush_all(struct dmar_domain *domain);
+void cache_tag_flush_range_np(struct dmar_domain *domain, unsigned long start,
+ unsigned long end);
#ifdef CONFIG_INTEL_IOMMU_SVM
void intel_svm_check(struct intel_iommu *iommu);
diff --git a/drivers/iommu/intel/cache.c b/drivers/iommu/intel/cache.c
index 296f1645a739..0539275a9d20 100644
--- a/drivers/iommu/intel/cache.c
+++ b/drivers/iommu/intel/cache.c
@@ -12,6 +12,7 @@
#include <linux/dmar.h>
#include <linux/iommu.h>
#include <linux/memory.h>
+#include <linux/pci.h>
#include <linux/spinlock.h>
#include "iommu.h"
@@ -212,3 +213,197 @@ void cache_tag_unassign_domain(struct dmar_domain *domain,
if (domain->domain.type == IOMMU_DOMAIN_NESTED)
__cache_tag_unassign_parent_domain(domain->s2_domain, did, dev, pasid);
}
+
+static unsigned long calculate_psi_aligned_address(unsigned long start,
+ unsigned long end,
+ unsigned long *_pages,
+ unsigned long *_mask)
+{
+ unsigned long pages = aligned_nrpages(start, end - start + 1);
+ unsigned long aligned_pages = __roundup_pow_of_two(pages);
+ unsigned long bitmask = aligned_pages - 1;
+ unsigned long mask = ilog2(aligned_pages);
+ unsigned long pfn = IOVA_PFN(start);
+
+ /*
+ * PSI masks the low order bits of the base address. If the
+ * address isn't aligned to the mask, then compute a mask value
+ * needed to ensure the target range is flushed.
+ */
+ if (unlikely(bitmask & pfn)) {
+ unsigned long end_pfn = pfn + pages - 1, shared_bits;
+
+ /*
+ * Since end_pfn <= pfn + bitmask, the only way bits
+ * higher than bitmask can differ in pfn and end_pfn is
+ * by carrying. This means after masking out bitmask,
+ * high bits starting with the first set bit in
+ * shared_bits are all equal in both pfn and end_pfn.
+ */
+ shared_bits = ~(pfn ^ end_pfn) & ~bitmask;
+ mask = shared_bits ? __ffs(shared_bits) : BITS_PER_LONG;
+ }
+
+ *_pages = aligned_pages;
+ *_mask = mask;
+
+ return ALIGN_DOWN(start, VTD_PAGE_SIZE << mask);
+}
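+
+/*
+ * Worked example with illustrative values: for start = 0x12f000 and
+ * end = 0x133fff, pages = 5, aligned_pages = 8 and the initial mask
+ * is 3. Since pfn 0x12f is not aligned to bitmask 0x7, end_pfn is
+ * 0x133 and shared_bits = ~(0x12f ^ 0x133) & ~0x7 has bit 5 as its
+ * first set bit, so mask becomes 5. The returned address is
+ * ALIGN_DOWN(0x12f000, 128K) = 0x120000, and a PSI with mask 5 covers
+ * 0x120000-0x13ffff, enclosing the requested range.
+ */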
+
+/*
+ * Invalidates a range of IOVA from @start (inclusive) to @end (inclusive)
+ * when the memory mappings in the target domain have been modified.
+ */
+void cache_tag_flush_range(struct dmar_domain *domain, unsigned long start,
+ unsigned long end, int ih)
+{
+ unsigned long pages, mask, addr;
+ struct cache_tag *tag;
+ unsigned long flags;
+
+ addr = calculate_psi_aligned_address(start, end, &pages, &mask);
+
+ spin_lock_irqsave(&domain->cache_lock, flags);
+ list_for_each_entry(tag, &domain->cache_tags, node) {
+ struct intel_iommu *iommu = tag->iommu;
+ struct device_domain_info *info;
+ u16 sid;
+
+ switch (tag->type) {
+ case CACHE_TAG_IOTLB:
+ case CACHE_TAG_NESTING_IOTLB:
+ if (domain->use_first_level) {
+ qi_flush_piotlb(iommu, tag->domain_id,
+ tag->pasid, addr, pages, ih);
+ } else {
+ /*
+ * Fallback to domain selective flush if no
+ * PSI support or the size is too big.
+ */
+ if (!cap_pgsel_inv(iommu->cap) ||
+ mask > cap_max_amask_val(iommu->cap))
+ iommu->flush.flush_iotlb(iommu, tag->domain_id,
+ 0, 0, DMA_TLB_DSI_FLUSH);
+ else
+ iommu->flush.flush_iotlb(iommu, tag->domain_id,
+ addr | ih, mask,
+ DMA_TLB_PSI_FLUSH);
+ }
+ break;
+ case CACHE_TAG_NESTING_DEVTLB:
+ /*
+ * Address translation cache in device side caches the
+ * result of nested translation. There is no easy way
+ * to identify the exact set of nested translations
+ * affected by a change in S2. So just flush the entire
+ * device cache.
+ */
+ addr = 0;
+ mask = MAX_AGAW_PFN_WIDTH;
+ fallthrough;
+ case CACHE_TAG_DEVTLB:
+ info = dev_iommu_priv_get(tag->dev);
+ sid = PCI_DEVID(info->bus, info->devfn);
+
+ if (tag->pasid == IOMMU_NO_PASID)
+ qi_flush_dev_iotlb(iommu, sid, info->pfsid,
+ info->ats_qdep, addr, mask);
+ else
+ qi_flush_dev_iotlb_pasid(iommu, sid, info->pfsid,
+ tag->pasid, info->ats_qdep,
+ addr, mask);
+
+ quirk_extra_dev_tlb_flush(info, addr, mask, tag->pasid, info->ats_qdep);
+ break;
+ }
+ }
+ spin_unlock_irqrestore(&domain->cache_lock, flags);
+}
+
+/*
+ * Invalidates all ranges of IOVA when the memory mappings in the target
+ * domain have been modified.
+ */
+void cache_tag_flush_all(struct dmar_domain *domain)
+{
+ struct cache_tag *tag;
+ unsigned long flags;
+
+ spin_lock_irqsave(&domain->cache_lock, flags);
+ list_for_each_entry(tag, &domain->cache_tags, node) {
+ struct intel_iommu *iommu = tag->iommu;
+ struct device_domain_info *info;
+ u16 sid;
+
+ switch (tag->type) {
+ case CACHE_TAG_IOTLB:
+ case CACHE_TAG_NESTING_IOTLB:
+ if (domain->use_first_level)
+ qi_flush_piotlb(iommu, tag->domain_id,
+ tag->pasid, 0, -1, 0);
+ else
+ iommu->flush.flush_iotlb(iommu, tag->domain_id,
+ 0, 0, DMA_TLB_DSI_FLUSH);
+ break;
+ case CACHE_TAG_DEVTLB:
+ case CACHE_TAG_NESTING_DEVTLB:
+ info = dev_iommu_priv_get(tag->dev);
+ sid = PCI_DEVID(info->bus, info->devfn);
+
+ qi_flush_dev_iotlb(iommu, sid, info->pfsid, info->ats_qdep,
+ 0, MAX_AGAW_PFN_WIDTH);
+ quirk_extra_dev_tlb_flush(info, 0, MAX_AGAW_PFN_WIDTH,
+ IOMMU_NO_PASID, info->ats_qdep);
+ break;
+ }
+ }
+ spin_unlock_irqrestore(&domain->cache_lock, flags);
+}
+
+/*
+ * Invalidate a range of IOVA when new mappings are created in the target
+ * domain.
+ *
+ * - VT-d spec, Section 6.1 Caching Mode: When the CM field is reported as
+ * Set, any software updates to remapping structures other than first-
+ * stage mapping requires explicit invalidation of the caches.
+ * - VT-d spec, Section 6.8 Write Buffer Flushing: For hardware that requires
+ * write buffer flushing, software must explicitly perform write-buffer
+ * flushing, if cache invalidation is not required.
+ */
+void cache_tag_flush_range_np(struct dmar_domain *domain, unsigned long start,
+ unsigned long end)
+{
+ unsigned long pages, mask, addr;
+ struct cache_tag *tag;
+ unsigned long flags;
+
+ addr = calculate_psi_aligned_address(start, end, &pages, &mask);
+
+ spin_lock_irqsave(&domain->cache_lock, flags);
+ list_for_each_entry(tag, &domain->cache_tags, node) {
+ struct intel_iommu *iommu = tag->iommu;
+
+ if (!cap_caching_mode(iommu->cap) || domain->use_first_level) {
+ iommu_flush_write_buffer(iommu);
+ continue;
+ }
+
+ if (tag->type == CACHE_TAG_IOTLB ||
+ tag->type == CACHE_TAG_NESTING_IOTLB) {
+ /*
+ * Fallback to domain selective flush if no
+ * PSI support or the size is too big.
+ */
+ if (!cap_pgsel_inv(iommu->cap) ||
+ mask > cap_max_amask_val(iommu->cap))
+ iommu->flush.flush_iotlb(iommu, tag->domain_id,
+ 0, 0, DMA_TLB_DSI_FLUSH);
+ else
+ iommu->flush.flush_iotlb(iommu, tag->domain_id,
+ addr, mask,
+ DMA_TLB_PSI_FLUSH);
+ }
+ }
+ spin_unlock_irqrestore(&domain->cache_lock, flags);
+}
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 0c0b8e493fda..ac413097058c 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -54,11 +54,6 @@
__DOMAIN_MAX_PFN(gaw), (unsigned long)-1))
#define DOMAIN_MAX_ADDR(gaw) (((uint64_t)__DOMAIN_MAX_PFN(gaw)) << VTD_PAGE_SHIFT)
-/* IO virtual address start page frame number */
-#define IOVA_START_PFN (1)
-
-#define IOVA_PFN(addr) ((addr) >> PAGE_SHIFT)
-
static void __init check_tylersburg_isoch(void);
static int rwbf_quirk;
@@ -1991,13 +1986,6 @@ domain_context_mapping(struct dmar_domain *domain, struct device *dev)
domain_context_mapping_cb, domain);
}
-/* Returns a number of VTD pages, but aligned to MM page size */
-static unsigned long aligned_nrpages(unsigned long host_addr, size_t size)
-{
- host_addr &= ~PAGE_MASK;
- return PAGE_ALIGN(host_addr + size) >> VTD_PAGE_SHIFT;
-}
-
/* Return largest possible superpage level for a given mapping */
static int hardware_largepage_caps(struct dmar_domain *domain, unsigned long iov_pfn,
unsigned long phy_pfn, unsigned long pages)
--
2.34.1
* Re: [PATCH v3 02/12] iommu/vt-d: Add cache tag invalidation helpers
2024-04-16 8:06 ` [PATCH v3 02/12] iommu/vt-d: Add cache tag invalidation helpers Lu Baolu
@ 2024-04-22 5:30 ` Baolu Lu
2024-04-23 8:42 ` Tian, Kevin
0 siblings, 1 reply; 22+ messages in thread
From: Baolu Lu @ 2024-04-22 5:30 UTC (permalink / raw)
To: Joerg Roedel, Will Deacon, Robin Murphy, Kevin Tian,
Jason Gunthorpe
Cc: baolu.lu, Tina Zhang, Yi Liu, iommu, linux-kernel
On 4/16/24 4:06 PM, Lu Baolu wrote:
> Add several helpers to invalidate the caches after mappings in the
> affected domain are changed.
>
> - cache_tag_flush_range() invalidates a range of caches after mappings
> within this range are changed. It uses the page-selective cache
> invalidation methods.
>
> - cache_tag_flush_all() invalidates all caches tagged by a domain ID.
> It uses the domain-selective cache invalidation methods.
>
> - cache_tag_flush_range_np() invalidates a range of caches when new
> mappings are created in the domain and the corresponding page table
> entries change from non-present to present.
>
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
> drivers/iommu/intel/iommu.h | 14 +++
> drivers/iommu/intel/cache.c | 195 ++++++++++++++++++++++++++++++++++++
> drivers/iommu/intel/iommu.c | 12 ---
> 3 files changed, 209 insertions(+), 12 deletions(-)
[...]
> +
> +/*
> + * Invalidates a range of IOVA from @start (inclusive) to @end (inclusive)
> + * when the memory mappings in the target domain have been modified.
> + */
> +void cache_tag_flush_range(struct dmar_domain *domain, unsigned long start,
> + unsigned long end, int ih)
> +{
> + unsigned long pages, mask, addr;
> + struct cache_tag *tag;
> + unsigned long flags;
> +
> + addr = calculate_psi_aligned_address(start, end, &pages, &mask);
> +
> + spin_lock_irqsave(&domain->cache_lock, flags);
> + list_for_each_entry(tag, &domain->cache_tags, node) {
> + struct intel_iommu *iommu = tag->iommu;
> + struct device_domain_info *info;
> + u16 sid;
> +
> + switch (tag->type) {
> + case CACHE_TAG_IOTLB:
> + case CACHE_TAG_NESTING_IOTLB:
> + if (domain->use_first_level) {
> + qi_flush_piotlb(iommu, tag->domain_id,
> + tag->pasid, addr, pages, ih);
> + } else {
> + /*
> + * Fallback to domain selective flush if no
> + * PSI support or the size is too big.
> + */
> + if (!cap_pgsel_inv(iommu->cap) ||
> + mask > cap_max_amask_val(iommu->cap))
> + iommu->flush.flush_iotlb(iommu, tag->domain_id,
> + 0, 0, DMA_TLB_DSI_FLUSH);
> + else
> + iommu->flush.flush_iotlb(iommu, tag->domain_id,
> + addr | ih, mask,
> + DMA_TLB_PSI_FLUSH);
> + }
> + break;
> + case CACHE_TAG_NESTING_DEVTLB:
> + /*
> + * Address translation cache in device side caches the
> + * result of nested translation. There is no easy way
> + * to identify the exact set of nested translations
> + * affected by a change in S2. So just flush the entire
> + * device cache.
> + */
> + addr = 0;
> + mask = MAX_AGAW_PFN_WIDTH;
> + fallthrough;
I realized that the logic above is not right. Setting both @addr and
@mask to 0 doesn't mean flushing all caches on the device. I will change
it like below:
diff --git a/drivers/iommu/intel/cache.c b/drivers/iommu/intel/cache.c
index e8418cdd8331..18debb82272a 100644
--- a/drivers/iommu/intel/cache.c
+++ b/drivers/iommu/intel/cache.c
@@ -302,9 +302,14 @@ void cache_tag_flush_range(struct dmar_domain *domain, unsigned long start,
                         * affected by a change in S2. So just flush the entire
                         * device cache.
                         */
-                       addr = 0;
-                       mask = MAX_AGAW_PFN_WIDTH;
-                       fallthrough;
+                       info = dev_iommu_priv_get(tag->dev);
+                       sid = PCI_DEVID(info->bus, info->devfn);
+
+                       qi_flush_dev_iotlb(iommu, sid, info->pfsid, info->ats_qdep,
+                                          0, MAX_AGAW_PFN_WIDTH);
+                       quirk_extra_dev_tlb_flush(info, 0, MAX_AGAW_PFN_WIDTH,
+                                                 IOMMU_NO_PASID, info->ats_qdep);
+                       break;
                case CACHE_TAG_DEVTLB:
                        info = dev_iommu_priv_get(tag->dev);
                        sid = PCI_DEVID(info->bus, info->devfn);
> + case CACHE_TAG_DEVTLB:
> + info = dev_iommu_priv_get(tag->dev);
> + sid = PCI_DEVID(info->bus, info->devfn);
> +
> + if (tag->pasid == IOMMU_NO_PASID)
> + qi_flush_dev_iotlb(iommu, sid, info->pfsid,
> + info->ats_qdep, addr, mask);
> + else
> + qi_flush_dev_iotlb_pasid(iommu, sid, info->pfsid,
> + tag->pasid, info->ats_qdep,
> + addr, mask);
> +
> + quirk_extra_dev_tlb_flush(info, addr, mask, tag->pasid, info->ats_qdep);
> + break;
> + }
> + }
> + spin_unlock_irqrestore(&domain->cache_lock, flags);
Best regards,
baolu
* RE: [PATCH v3 02/12] iommu/vt-d: Add cache tag invalidation helpers
2024-04-22 5:30 ` Baolu Lu
@ 2024-04-23 8:42 ` Tian, Kevin
2024-04-24 1:45 ` Baolu Lu
0 siblings, 1 reply; 22+ messages in thread
From: Tian, Kevin @ 2024-04-23 8:42 UTC (permalink / raw)
To: Baolu Lu, Joerg Roedel, Will Deacon, Robin Murphy,
Jason Gunthorpe
Cc: Zhang, Tina, Liu, Yi L, iommu@lists.linux.dev,
linux-kernel@vger.kernel.org
> From: Baolu Lu <baolu.lu@linux.intel.com>
> Sent: Monday, April 22, 2024 1:30 PM
>
> On 4/16/24 4:06 PM, Lu Baolu wrote:
> > + case CACHE_TAG_NESTING_DEVTLB:
> > + /*
> > + * Address translation cache in device side caches the
> > + * result of nested translation. There is no easy way
> > + * to identify the exact set of nested translations
> > + * affected by a change in S2. So just flush the entire
> > + * device cache.
> > + */
> > + addr = 0;
> > + mask = MAX_AGAW_PFN_WIDTH;
> > + fallthrough;
>
> I realized that the logic above is not right. Setting both @addr and
> @mask to 0 doesn't mean flushing all caches on the device. I will change
> it like below:
I didn't get it. The above code doesn't set @mask to 0.
>
> diff --git a/drivers/iommu/intel/cache.c b/drivers/iommu/intel/cache.c
> index e8418cdd8331..18debb82272a 100644
> --- a/drivers/iommu/intel/cache.c
> +++ b/drivers/iommu/intel/cache.c
> @@ -302,9 +302,14 @@ void cache_tag_flush_range(struct dmar_domain *domain, unsigned long start,
>                          * affected by a change in S2. So just flush the entire
>                          * device cache.
>                          */
> -                       addr = 0;
> -                       mask = MAX_AGAW_PFN_WIDTH;
> -                       fallthrough;
> +                       info = dev_iommu_priv_get(tag->dev);
> +                       sid = PCI_DEVID(info->bus, info->devfn);
> +
> +                       qi_flush_dev_iotlb(iommu, sid, info->pfsid, info->ats_qdep,
> +                                          0, MAX_AGAW_PFN_WIDTH);
> +                       quirk_extra_dev_tlb_flush(info, 0, MAX_AGAW_PFN_WIDTH,
> +                                                 IOMMU_NO_PASID, info->ats_qdep);
> +                       break;
and I didn't get this change. It goes backward by ignoring tag->pasid.
What's the exact problem with the fallthrough logic in the original code?
> case CACHE_TAG_DEVTLB:
> info = dev_iommu_priv_get(tag->dev);
> sid = PCI_DEVID(info->bus, info->devfn);
>
> > + case CACHE_TAG_DEVTLB:
> > + info = dev_iommu_priv_get(tag->dev);
> > + sid = PCI_DEVID(info->bus, info->devfn);
> > +
> > +           if (tag->pasid == IOMMU_NO_PASID)
> > +                   qi_flush_dev_iotlb(iommu, sid, info->pfsid,
> > +                                      info->ats_qdep, addr, mask);
> > +           else
> > +                   qi_flush_dev_iotlb_pasid(iommu, sid, info->pfsid,
> > +                                            tag->pasid, info->ats_qdep,
> > +                                            addr, mask);
> > +
> > +           quirk_extra_dev_tlb_flush(info, addr, mask, tag->pasid, info->ats_qdep);
> > +           break;
> > + }
> > + }
> > + spin_unlock_irqrestore(&domain->cache_lock, flags);
>
> Best regards,
> baolu
* Re: [PATCH v3 02/12] iommu/vt-d: Add cache tag invalidation helpers
2024-04-23 8:42 ` Tian, Kevin
@ 2024-04-24 1:45 ` Baolu Lu
0 siblings, 0 replies; 22+ messages in thread
From: Baolu Lu @ 2024-04-24 1:45 UTC (permalink / raw)
To: Tian, Kevin, Joerg Roedel, Will Deacon, Robin Murphy,
Jason Gunthorpe
Cc: baolu.lu, Zhang, Tina, Liu, Yi L, iommu@lists.linux.dev,
linux-kernel@vger.kernel.org
On 4/23/24 4:42 PM, Tian, Kevin wrote:
>> From: Baolu Lu <baolu.lu@linux.intel.com>
>> Sent: Monday, April 22, 2024 1:30 PM
>>
>> On 4/16/24 4:06 PM, Lu Baolu wrote:
>>> + case CACHE_TAG_NESTING_DEVTLB:
>>> + /*
>>> + * Address translation cache in device side caches the
>>> + * result of nested translation. There is no easy way
>>> + * to identify the exact set of nested translations
>>> + * affected by a change in S2. So just flush the entire
>>> + * device cache.
>>> + */
>>> + addr = 0;
>>> + mask = MAX_AGAW_PFN_WIDTH;
>>> + fallthrough;
>> I realized that the logic above is not right. Setting both @addr and
>> @mask to 0 doesn't mean flushing all caches on the device. I will change
>> it like below:
> I didn't get. Above code doesn't set @mask to 0.
Oh!? I have no idea why I read that as "mask = 0" now. Perhaps my brain
was on vacation earlier. :-)
>
>> diff --git a/drivers/iommu/intel/cache.c b/drivers/iommu/intel/cache.c
>> index e8418cdd8331..18debb82272a 100644
>> --- a/drivers/iommu/intel/cache.c
>> +++ b/drivers/iommu/intel/cache.c
>> @@ -302,9 +302,14 @@ void cache_tag_flush_range(struct dmar_domain *domain, unsigned long start,
>>                          * affected by a change in S2. So just flush the entire
>>                          * device cache.
>>                          */
>> -                       addr = 0;
>> -                       mask = MAX_AGAW_PFN_WIDTH;
>> -                       fallthrough;
>> +                       info = dev_iommu_priv_get(tag->dev);
>> +                       sid = PCI_DEVID(info->bus, info->devfn);
>> +
>> +                       qi_flush_dev_iotlb(iommu, sid, info->pfsid, info->ats_qdep,
>> +                                          0, MAX_AGAW_PFN_WIDTH);
>> +                       quirk_extra_dev_tlb_flush(info, 0, MAX_AGAW_PFN_WIDTH,
>> +                                                 IOMMU_NO_PASID, info->ats_qdep);
>> +                       break;
> and I didn't get this change. It goes backward by ignoring tag->pasid.
>
> what's the exact problem of the fallthrough logic in original code?
Sorry! Please ignore this.
Best regards,
baolu
* [PATCH v3 03/12] iommu/vt-d: Add trace events for cache tag interface
2024-04-16 8:06 [PATCH v3 00/12] Consolidate domain cache invalidation Lu Baolu
2024-04-16 8:06 ` [PATCH v3 01/12] iommu/vt-d: Add cache tag assignment interface Lu Baolu
2024-04-16 8:06 ` [PATCH v3 02/12] iommu/vt-d: Add cache tag invalidation helpers Lu Baolu
@ 2024-04-16 8:06 ` Lu Baolu
2024-04-16 8:06 ` [PATCH v3 04/12] iommu/vt-d: Use cache_tag_flush_all() in flush_iotlb_all Lu Baolu
` (9 subsequent siblings)
12 siblings, 0 replies; 22+ messages in thread
From: Lu Baolu @ 2024-04-16 8:06 UTC (permalink / raw)
To: Joerg Roedel, Will Deacon, Robin Murphy, Kevin Tian,
Jason Gunthorpe
Cc: Tina Zhang, Yi Liu, iommu, linux-kernel, Lu Baolu
Add trace events for cache tag assign/unassign/flush operations and trace
the events in the interfaces. These trace events will improve debugging
capabilities by providing detailed information about cache tag activity.
A sample of the traced messages is shown below [messages have been
stripped and wrapped to keep the lines short].
cache_tag_assign: dmar9/0000:00:01.0 type iotlb did 1 pasid 9 ref 1
cache_tag_assign: dmar9/0000:00:01.0 type devtlb did 1 pasid 9 ref 1
cache_tag_flush_all: dmar6/0000:8a:00.0 type iotlb did 7 pasid 0 ref 1
cache_tag_flush_range: dmar1 0000:00:1b.0[0] type iotlb did 9
[0xeab00000-0xeab1afff] addr 0xeab00000 pages 0x20 mask 0x5
cache_tag_flush_range: dmar1 0000:00:1b.0[0] type iotlb did 9
[0xeab20000-0xeab31fff] addr 0xeab20000 pages 0x20 mask 0x5
cache_tag_flush_range: dmar1 0000:00:1b.0[0] type iotlb did 9
[0xeaa40000-0xeaa51fff] addr 0xeaa40000 pages 0x20 mask 0x5
cache_tag_flush_range: dmar1 0000:00:1b.0[0] type iotlb did 9
[0x98de0000-0x98de4fff] addr 0x98de0000 pages 0x8 mask 0x3
cache_tag_flush_range: dmar1 0000:00:1b.0[0] type iotlb did 9
[0xe9828000-0xe9828fff] addr 0xe9828000 pages 0x1 mask 0x0
cache_tag_unassign: dmar9/0000:00:01.0 type iotlb did 1 pasid 9 ref 1
cache_tag_unassign: dmar9/0000:00:01.0 type devtlb did 1 pasid 9 ref 1
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
drivers/iommu/intel/trace.h | 97 +++++++++++++++++++++++++++++++++++++
drivers/iommu/intel/cache.c | 10 ++++
2 files changed, 107 insertions(+)
diff --git a/drivers/iommu/intel/trace.h b/drivers/iommu/intel/trace.h
index 93d96f93a89b..961ac1c1bc21 100644
--- a/drivers/iommu/intel/trace.h
+++ b/drivers/iommu/intel/trace.h
@@ -89,6 +89,103 @@ TRACE_EVENT(prq_report,
__entry->dw1, __entry->dw2, __entry->dw3)
)
);
+
+DECLARE_EVENT_CLASS(cache_tag_log,
+ TP_PROTO(struct cache_tag *tag),
+ TP_ARGS(tag),
+ TP_STRUCT__entry(
+ __string(iommu, tag->iommu->name)
+ __string(dev, dev_name(tag->dev))
+ __field(u16, type)
+ __field(u16, domain_id)
+ __field(u32, pasid)
+ __field(u32, users)
+ ),
+ TP_fast_assign(
+ __assign_str(iommu, tag->iommu->name);
+ __assign_str(dev, dev_name(tag->dev));
+ __entry->type = tag->type;
+ __entry->domain_id = tag->domain_id;
+ __entry->pasid = tag->pasid;
+ __entry->users = tag->users;
+ ),
+ TP_printk("%s/%s type %s did %d pasid %d ref %d",
+ __get_str(iommu), __get_str(dev),
+ __print_symbolic(__entry->type,
+ { CACHE_TAG_IOTLB, "iotlb" },
+ { CACHE_TAG_DEVTLB, "devtlb" },
+ { CACHE_TAG_NESTING_IOTLB, "nesting_iotlb" },
+ { CACHE_TAG_NESTING_DEVTLB, "nesting_devtlb" }),
+ __entry->domain_id, __entry->pasid, __entry->users
+ )
+);
+
+DEFINE_EVENT(cache_tag_log, cache_tag_assign,
+ TP_PROTO(struct cache_tag *tag),
+ TP_ARGS(tag)
+);
+
+DEFINE_EVENT(cache_tag_log, cache_tag_unassign,
+ TP_PROTO(struct cache_tag *tag),
+ TP_ARGS(tag)
+);
+
+DEFINE_EVENT(cache_tag_log, cache_tag_flush_all,
+ TP_PROTO(struct cache_tag *tag),
+ TP_ARGS(tag)
+);
+
+DECLARE_EVENT_CLASS(cache_tag_flush,
+ TP_PROTO(struct cache_tag *tag, unsigned long start, unsigned long end,
+ unsigned long addr, unsigned long pages, unsigned long mask),
+ TP_ARGS(tag, start, end, addr, pages, mask),
+ TP_STRUCT__entry(
+ __string(iommu, tag->iommu->name)
+ __string(dev, dev_name(tag->dev))
+ __field(u16, type)
+ __field(u16, domain_id)
+ __field(u32, pasid)
+ __field(unsigned long, start)
+ __field(unsigned long, end)
+ __field(unsigned long, addr)
+ __field(unsigned long, pages)
+ __field(unsigned long, mask)
+ ),
+ TP_fast_assign(
+ __assign_str(iommu, tag->iommu->name);
+ __assign_str(dev, dev_name(tag->dev));
+ __entry->type = tag->type;
+ __entry->domain_id = tag->domain_id;
+ __entry->pasid = tag->pasid;
+ __entry->start = start;
+ __entry->end = end;
+ __entry->addr = addr;
+ __entry->pages = pages;
+ __entry->mask = mask;
+ ),
+ TP_printk("%s %s[%d] type %s did %d [0x%lx-0x%lx] addr 0x%lx pages 0x%lx mask 0x%lx",
+ __get_str(iommu), __get_str(dev), __entry->pasid,
+ __print_symbolic(__entry->type,
+ { CACHE_TAG_IOTLB, "iotlb" },
+ { CACHE_TAG_DEVTLB, "devtlb" },
+ { CACHE_TAG_NESTING_IOTLB, "nesting_iotlb" },
+ { CACHE_TAG_NESTING_DEVTLB, "nesting_devtlb" }),
+ __entry->domain_id, __entry->start, __entry->end,
+ __entry->addr, __entry->pages, __entry->mask
+ )
+);
+
+DEFINE_EVENT(cache_tag_flush, cache_tag_flush_range,
+ TP_PROTO(struct cache_tag *tag, unsigned long start, unsigned long end,
+ unsigned long addr, unsigned long pages, unsigned long mask),
+ TP_ARGS(tag, start, end, addr, pages, mask)
+);
+
+DEFINE_EVENT(cache_tag_flush, cache_tag_flush_range_np,
+ TP_PROTO(struct cache_tag *tag, unsigned long start, unsigned long end,
+ unsigned long addr, unsigned long pages, unsigned long mask),
+ TP_ARGS(tag, start, end, addr, pages, mask)
+);
#endif /* _TRACE_INTEL_IOMMU_H */
/* This part must be outside protection */
diff --git a/drivers/iommu/intel/cache.c b/drivers/iommu/intel/cache.c
index 0539275a9d20..e8418cdd8331 100644
--- a/drivers/iommu/intel/cache.c
+++ b/drivers/iommu/intel/cache.c
@@ -17,6 +17,7 @@
#include "iommu.h"
#include "pasid.h"
+#include "trace.h"
/* Check if an existing cache tag can be reused for a new association. */
static bool cache_tage_match(struct cache_tag *tag, u16 domain_id,
@@ -69,11 +70,13 @@ static int cache_tag_assign(struct dmar_domain *domain, u16 did,
temp->users++;
spin_unlock_irqrestore(&domain->cache_lock, flags);
kfree(tag);
+ trace_cache_tag_assign(temp);
return 0;
}
}
list_add_tail(&tag->node, &domain->cache_tags);
spin_unlock_irqrestore(&domain->cache_lock, flags);
+ trace_cache_tag_assign(tag);
return 0;
}
@@ -91,6 +94,7 @@ static void cache_tag_unassign(struct dmar_domain *domain, u16 did,
spin_lock_irqsave(&domain->cache_lock, flags);
list_for_each_entry(tag, &domain->cache_tags, node) {
if (cache_tage_match(tag, did, iommu, dev, pasid, type)) {
+ trace_cache_tag_unassign(tag);
if (--tag->users == 0) {
list_del(&tag->node);
kfree(tag);
@@ -316,6 +320,8 @@ void cache_tag_flush_range(struct dmar_domain *domain, unsigned long start,
quirk_extra_dev_tlb_flush(info, addr, mask, tag->pasid, info->ats_qdep);
break;
}
+
+ trace_cache_tag_flush_range(tag, start, end, addr, pages, mask);
}
spin_unlock_irqrestore(&domain->cache_lock, flags);
}
@@ -356,6 +362,8 @@ void cache_tag_flush_all(struct dmar_domain *domain)
IOMMU_NO_PASID, info->ats_qdep);
break;
}
+
+ trace_cache_tag_flush_all(tag);
}
spin_unlock_irqrestore(&domain->cache_lock, flags);
}
@@ -404,6 +412,8 @@ void cache_tag_flush_range_np(struct dmar_domain *domain, unsigned long start,
addr, mask,
DMA_TLB_PSI_FLUSH);
}
+
+ trace_cache_tag_flush_range_np(tag, start, end, addr, pages, mask);
}
spin_unlock_irqrestore(&domain->cache_lock, flags);
}
--
2.34.1
* [PATCH v3 04/12] iommu/vt-d: Use cache_tag_flush_all() in flush_iotlb_all
2024-04-16 8:06 [PATCH v3 00/12] Consolidate domain cache invalidation Lu Baolu
` (2 preceding siblings ...)
2024-04-16 8:06 ` [PATCH v3 03/12] iommu/vt-d: Add trace events for cache tag interface Lu Baolu
@ 2024-04-16 8:06 ` Lu Baolu
2024-04-16 8:06 ` [PATCH v3 05/12] iommu/vt-d: Use cache_tag_flush_range() in tlb_sync Lu Baolu
` (8 subsequent siblings)
12 siblings, 0 replies; 22+ messages in thread
From: Lu Baolu @ 2024-04-16 8:06 UTC (permalink / raw)
To: Joerg Roedel, Will Deacon, Robin Murphy, Kevin Tian,
Jason Gunthorpe
Cc: Tina Zhang, Yi Liu, iommu, linux-kernel, Lu Baolu
The flush_iotlb_all callback is called by the iommu core to flush
all caches for the affected domain. Use cache_tag_flush_all() in
this callback.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
drivers/iommu/intel/iommu.c | 20 +-------------------
1 file changed, 1 insertion(+), 19 deletions(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index ac413097058c..4eca107389b4 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1556,25 +1556,7 @@ static void parent_domain_flush(struct dmar_domain *domain,
static void intel_flush_iotlb_all(struct iommu_domain *domain)
{
- struct dmar_domain *dmar_domain = to_dmar_domain(domain);
- struct iommu_domain_info *info;
- unsigned long idx;
-
- xa_for_each(&dmar_domain->iommu_array, idx, info) {
- struct intel_iommu *iommu = info->iommu;
- u16 did = domain_id_iommu(dmar_domain, iommu);
-
- if (dmar_domain->use_first_level)
- domain_flush_pasid_iotlb(iommu, dmar_domain, 0, -1, 0);
- else
- iommu->flush.flush_iotlb(iommu, did, 0, 0,
- DMA_TLB_DSI_FLUSH);
-
- iommu_flush_dev_iotlb(dmar_domain, 0, MAX_AGAW_PFN_WIDTH);
- }
-
- if (dmar_domain->nested_parent)
- parent_domain_flush(dmar_domain, 0, -1, 0);
+ cache_tag_flush_all(to_dmar_domain(domain));
}
static void iommu_disable_protect_mem_regions(struct intel_iommu *iommu)
--
2.34.1
* [PATCH v3 05/12] iommu/vt-d: Use cache_tag_flush_range() in tlb_sync
2024-04-16 8:06 [PATCH v3 00/12] Consolidate domain cache invalidation Lu Baolu
` (3 preceding siblings ...)
2024-04-16 8:06 ` [PATCH v3 04/12] iommu/vt-d: Use cache_tag_flush_all() in flush_iotlb_all Lu Baolu
@ 2024-04-16 8:06 ` Lu Baolu
2024-04-16 8:06 ` [PATCH v3 06/12] iommu/vt-d: Use cache_tag_flush_range_np() in iotlb_sync_map Lu Baolu
` (7 subsequent siblings)
12 siblings, 0 replies; 22+ messages in thread
From: Lu Baolu @ 2024-04-16 8:06 UTC (permalink / raw)
To: Joerg Roedel, Will Deacon, Robin Murphy, Kevin Tian,
Jason Gunthorpe
Cc: Tina Zhang, Yi Liu, iommu, linux-kernel, Lu Baolu
The tlb_sync callback is called by the iommu core to flush a range of
caches for the affected domain. Use cache_tag_flush_range() in this
callback.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
drivers/iommu/intel/iommu.c | 21 ++-------------------
1 file changed, 2 insertions(+), 19 deletions(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 4eca107389b4..6bf5b91b03c5 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4103,25 +4103,8 @@ static size_t intel_iommu_unmap_pages(struct iommu_domain *domain,
static void intel_iommu_tlb_sync(struct iommu_domain *domain,
struct iommu_iotlb_gather *gather)
{
- struct dmar_domain *dmar_domain = to_dmar_domain(domain);
- unsigned long iova_pfn = IOVA_PFN(gather->start);
- size_t size = gather->end - gather->start;
- struct iommu_domain_info *info;
- unsigned long start_pfn;
- unsigned long nrpages;
- unsigned long i;
-
- nrpages = aligned_nrpages(gather->start, size);
- start_pfn = mm_to_dma_pfn_start(iova_pfn);
-
- xa_for_each(&dmar_domain->iommu_array, i, info)
- iommu_flush_iotlb_psi(info->iommu, dmar_domain,
- start_pfn, nrpages,
- list_empty(&gather->freelist), 0);
-
- if (dmar_domain->nested_parent)
- parent_domain_flush(dmar_domain, start_pfn, nrpages,
- list_empty(&gather->freelist));
+ cache_tag_flush_range(to_dmar_domain(domain), gather->start,
+ gather->end, list_empty(&gather->freelist));
put_pages_list(&gather->freelist);
}
--
2.34.1
* [PATCH v3 06/12] iommu/vt-d: Use cache_tag_flush_range_np() in iotlb_sync_map
2024-04-16 8:06 [PATCH v3 00/12] Consolidate domain cache invalidation Lu Baolu
` (4 preceding siblings ...)
2024-04-16 8:06 ` [PATCH v3 05/12] iommu/vt-d: Use cache_tag_flush_range() in tlb_sync Lu Baolu
@ 2024-04-16 8:06 ` Lu Baolu
2024-04-16 8:06 ` [PATCH v3 07/12] iommu/vt-d: Cleanup use of iommu_flush_iotlb_psi() Lu Baolu
` (6 subsequent siblings)
12 siblings, 0 replies; 22+ messages in thread
From: Lu Baolu @ 2024-04-16 8:06 UTC (permalink / raw)
To: Joerg Roedel, Will Deacon, Robin Murphy, Kevin Tian,
Jason Gunthorpe
Cc: Tina Zhang, Yi Liu, iommu, linux-kernel, Lu Baolu
The iotlb_sync_map callback is called by the iommu core after non-present
to present mappings are created. The iommu driver uses this callback to
invalidate caches if the IOMMU is working in caching mode and second-
stage-only translation is used for the domain. Use
cache_tag_flush_range_np() in this callback.
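For reference, a sketch of the check the helper applies internally (taken
from the cache.c hunk added in patch 02; not new code in this patch):

	if (!cap_caching_mode(iommu->cap) || domain->use_first_level) {
		/* No invalidation needed; a write buffer flush suffices. */
		iommu_flush_write_buffer(iommu);
		continue;
	}
	/* Otherwise: page-selective or domain-selective IOTLB flush. */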
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
drivers/iommu/intel/iommu.c | 22 +---------------------
1 file changed, 1 insertion(+), 21 deletions(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 6bf5b91b03c5..c6a0d528ad19 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1500,20 +1500,6 @@ static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
iommu_flush_dev_iotlb(domain, addr, mask);
}
-/* Notification for newly created mappings */
-static void __mapping_notify_one(struct intel_iommu *iommu, struct dmar_domain *domain,
- unsigned long pfn, unsigned int pages)
-{
- /*
- * It's a non-present to present mapping. Only flush if caching mode
- * and second level.
- */
- if (cap_caching_mode(iommu->cap) && !domain->use_first_level)
- iommu_flush_iotlb_psi(iommu, domain, pfn, pages, 0, 1);
- else
- iommu_flush_write_buffer(iommu);
-}
-
/*
* Flush the relevant caches in nested translation if the domain
* also serves as a parent
@@ -4543,14 +4529,8 @@ static bool risky_device(struct pci_dev *pdev)
static int intel_iommu_iotlb_sync_map(struct iommu_domain *domain,
unsigned long iova, size_t size)
{
- struct dmar_domain *dmar_domain = to_dmar_domain(domain);
- unsigned long pages = aligned_nrpages(iova, size);
- unsigned long pfn = iova >> VTD_PAGE_SHIFT;
- struct iommu_domain_info *info;
- unsigned long i;
+ cache_tag_flush_range_np(to_dmar_domain(domain), iova, iova + size - 1);
- xa_for_each(&dmar_domain->iommu_array, i, info)
- __mapping_notify_one(info->iommu, dmar_domain, pfn, pages);
return 0;
}
--
2.34.1
* [PATCH v3 07/12] iommu/vt-d: Cleanup use of iommu_flush_iotlb_psi()
2024-04-16 8:06 [PATCH v3 00/12] Consolidate domain cache invalidation Lu Baolu
` (5 preceding siblings ...)
2024-04-16 8:06 ` [PATCH v3 06/12] iommu/vt-d: Use cache_tag_flush_range_np() in iotlb_sync_map Lu Baolu
@ 2024-04-16 8:06 ` Lu Baolu
2024-04-16 8:06 ` [PATCH v3 08/12] iommu/vt-d: Use cache_tag_flush_range() in cache_invalidate_user Lu Baolu
` (5 subsequent siblings)
12 siblings, 0 replies; 22+ messages in thread
From: Lu Baolu @ 2024-04-16 8:06 UTC (permalink / raw)
To: Joerg Roedel, Will Deacon, Robin Murphy, Kevin Tian,
Jason Gunthorpe
Cc: Tina Zhang, Yi Liu, iommu, linux-kernel, Lu Baolu
Use cache_tag_flush_range() in switch_to_super_page() to invalidate the
necessary caches when switching mappings from normal to super pages. The
iommu_flush_iotlb_psi() call in intel_iommu_memory_notifier() is
unnecessary since there should be no cache invalidation for the identity
domain.
Clean up iommu_flush_iotlb_psi() after the last call site is removed.
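A sketch of what the replacement call in switch_to_super_page() plausibly
looks like; the pfn-to-address conversion and variable names are
assumptions:

	cache_tag_flush_range(domain, start_pfn << VTD_PAGE_SHIFT,
			      end_pfn << VTD_PAGE_SHIFT, 0);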
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
drivers/iommu/intel/iommu.c | 171 +-----------------------------------
1 file changed, 2 insertions(+), 169 deletions(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index c6a0d528ad19..fc6303940322 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1389,157 +1389,6 @@ static void __iommu_flush_dev_iotlb(struct device_domain_info *info,
quirk_extra_dev_tlb_flush(info, addr, mask, IOMMU_NO_PASID, qdep);
}
-static void iommu_flush_dev_iotlb(struct dmar_domain *domain,
- u64 addr, unsigned mask)
-{
- struct dev_pasid_info *dev_pasid;
- struct device_domain_info *info;
- unsigned long flags;
-
- if (!domain->has_iotlb_device)
- return;
-
- spin_lock_irqsave(&domain->lock, flags);
- list_for_each_entry(info, &domain->devices, link)
- __iommu_flush_dev_iotlb(info, addr, mask);
-
- list_for_each_entry(dev_pasid, &domain->dev_pasids, link_domain) {
- info = dev_iommu_priv_get(dev_pasid->dev);
-
- if (!info->ats_enabled)
- continue;
-
- qi_flush_dev_iotlb_pasid(info->iommu,
- PCI_DEVID(info->bus, info->devfn),
- info->pfsid, dev_pasid->pasid,
- info->ats_qdep, addr,
- mask);
- }
- spin_unlock_irqrestore(&domain->lock, flags);
-}
-
-static void domain_flush_pasid_iotlb(struct intel_iommu *iommu,
- struct dmar_domain *domain, u64 addr,
- unsigned long npages, bool ih)
-{
- u16 did = domain_id_iommu(domain, iommu);
- struct dev_pasid_info *dev_pasid;
- unsigned long flags;
-
- spin_lock_irqsave(&domain->lock, flags);
- list_for_each_entry(dev_pasid, &domain->dev_pasids, link_domain)
- qi_flush_piotlb(iommu, did, dev_pasid->pasid, addr, npages, ih);
-
- if (!list_empty(&domain->devices))
- qi_flush_piotlb(iommu, did, IOMMU_NO_PASID, addr, npages, ih);
- spin_unlock_irqrestore(&domain->lock, flags);
-}
-
-static void __iommu_flush_iotlb_psi(struct intel_iommu *iommu, u16 did,
- unsigned long pfn, unsigned int pages,
- int ih)
-{
- unsigned int aligned_pages = __roundup_pow_of_two(pages);
- unsigned long bitmask = aligned_pages - 1;
- unsigned int mask = ilog2(aligned_pages);
- u64 addr = (u64)pfn << VTD_PAGE_SHIFT;
-
- /*
- * PSI masks the low order bits of the base address. If the
- * address isn't aligned to the mask, then compute a mask value
- * needed to ensure the target range is flushed.
- */
- if (unlikely(bitmask & pfn)) {
- unsigned long end_pfn = pfn + pages - 1, shared_bits;
-
- /*
- * Since end_pfn <= pfn + bitmask, the only way bits
- * higher than bitmask can differ in pfn and end_pfn is
- * by carrying. This means after masking out bitmask,
- * high bits starting with the first set bit in
- * shared_bits are all equal in both pfn and end_pfn.
- */
- shared_bits = ~(pfn ^ end_pfn) & ~bitmask;
- mask = shared_bits ? __ffs(shared_bits) : BITS_PER_LONG;
- }
-
- /*
- * Fallback to domain selective flush if no PSI support or
- * the size is too big.
- */
- if (!cap_pgsel_inv(iommu->cap) || mask > cap_max_amask_val(iommu->cap))
- iommu->flush.flush_iotlb(iommu, did, 0, 0,
- DMA_TLB_DSI_FLUSH);
- else
- iommu->flush.flush_iotlb(iommu, did, addr | ih, mask,
- DMA_TLB_PSI_FLUSH);
-}
-
-static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
- struct dmar_domain *domain,
- unsigned long pfn, unsigned int pages,
- int ih, int map)
-{
- unsigned int aligned_pages = __roundup_pow_of_two(pages);
- unsigned int mask = ilog2(aligned_pages);
- uint64_t addr = (uint64_t)pfn << VTD_PAGE_SHIFT;
- u16 did = domain_id_iommu(domain, iommu);
-
- if (WARN_ON(!pages))
- return;
-
- if (ih)
- ih = 1 << 6;
-
- if (domain->use_first_level)
- domain_flush_pasid_iotlb(iommu, domain, addr, pages, ih);
- else
- __iommu_flush_iotlb_psi(iommu, did, pfn, pages, ih);
-
- if (!map)
- iommu_flush_dev_iotlb(domain, addr, mask);
-}
-
-/*
- * Flush the relevant caches in nested translation if the domain
- * also serves as a parent
- */
-static void parent_domain_flush(struct dmar_domain *domain,
- unsigned long pfn,
- unsigned long pages, int ih)
-{
- struct dmar_domain *s1_domain;
-
- spin_lock(&domain->s1_lock);
- list_for_each_entry(s1_domain, &domain->s1_domains, s2_link) {
- struct device_domain_info *device_info;
- struct iommu_domain_info *info;
- unsigned long flags;
- unsigned long i;
-
- xa_for_each(&s1_domain->iommu_array, i, info)
- __iommu_flush_iotlb_psi(info->iommu, info->did,
- pfn, pages, ih);
-
- if (!s1_domain->has_iotlb_device)
- continue;
-
- spin_lock_irqsave(&s1_domain->lock, flags);
- list_for_each_entry(device_info, &s1_domain->devices, link)
- /*
- * Address translation cache in device side caches the
- * result of nested translation. There is no easy way
- * to identify the exact set of nested translations
- * affected by a change in S2. So just flush the entire
- * device cache.
- */
- __iommu_flush_dev_iotlb(device_info, 0,
- MAX_AGAW_PFN_WIDTH);
- spin_unlock_irqrestore(&s1_domain->lock, flags);
- }
- spin_unlock(&domain->s1_lock);
-}
-
static void intel_flush_iotlb_all(struct iommu_domain *domain)
{
cache_tag_flush_all(to_dmar_domain(domain));
@@ -1990,9 +1839,7 @@ static void switch_to_super_page(struct dmar_domain *domain,
unsigned long end_pfn, int level)
{
unsigned long lvl_pages = lvl_to_nr_pages(level);
- struct iommu_domain_info *info;
struct dma_pte *pte = NULL;
- unsigned long i;
while (start_pfn <= end_pfn) {
if (!pte)
@@ -2004,13 +1851,8 @@ static void switch_to_super_page(struct dmar_domain *domain,
start_pfn + lvl_pages - 1,
level + 1);
- xa_for_each(&domain->iommu_array, i, info)
- iommu_flush_iotlb_psi(info->iommu, domain,
- start_pfn, lvl_pages,
- 0, 0);
- if (domain->nested_parent)
- parent_domain_flush(domain, start_pfn,
- lvl_pages, 0);
+ cache_tag_flush_range(domain, start_pfn << VTD_PAGE_SHIFT,
+ end_pfn << VTD_PAGE_SHIFT, 0);
}
pte++;
@@ -3380,18 +3222,9 @@ static int intel_iommu_memory_notifier(struct notifier_block *nb,
case MEM_OFFLINE:
case MEM_CANCEL_ONLINE:
{
- struct dmar_drhd_unit *drhd;
- struct intel_iommu *iommu;
LIST_HEAD(freelist);
domain_unmap(si_domain, start_vpfn, last_vpfn, &freelist);
-
- rcu_read_lock();
- for_each_active_iommu(iommu, drhd)
- iommu_flush_iotlb_psi(iommu, si_domain,
- start_vpfn, mhp->nr_pages,
- list_empty(&freelist), 0);
- rcu_read_unlock();
put_pages_list(&freelist);
}
break;
--
2.34.1
* [PATCH v3 08/12] iommu/vt-d: Use cache_tag_flush_range() in cache_invalidate_user
2024-04-16 8:06 [PATCH v3 00/12] Consolidate domain cache invalidation Lu Baolu
` (6 preceding siblings ...)
2024-04-16 8:06 ` [PATCH v3 07/12] iommu/vt-d: Cleanup use of iommu_flush_iotlb_psi() Lu Baolu
@ 2024-04-16 8:06 ` Lu Baolu
2024-04-16 8:06 ` [PATCH v3 09/12] iommu/vt-d: Use cache helpers in arch_invalidate_secondary_tlbs Lu Baolu
` (4 subsequent siblings)
12 siblings, 0 replies; 22+ messages in thread
From: Lu Baolu @ 2024-04-16 8:06 UTC (permalink / raw)
To: Joerg Roedel, Will Deacon, Robin Murphy, Kevin Tian,
Jason Gunthorpe
Cc: Tina Zhang, Yi Liu, iommu, linux-kernel, Lu Baolu
The cache_invalidate_user callback is called to invalidate a range
of caches for the affected user domain. Use cache_tag_flush_range()
in this callback.
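A hedged worked example of the range conversion, using the
nrpages_to_size() helper added in the hunk below (addresses illustrative):

	/* Illustrative: a user invalidation entry covering 2 pages at 0x1000. */
	u64 addr = 0x1000;
	u64 npages = 2;
	u64 end = addr + nrpages_to_size(npages) - 1;	/* 0x1000 + 0x2000 - 1 = 0x2fff */

	cache_tag_flush_range(dmar_domain, addr, end, 0 /* leaf hint, see hunk below */);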
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
drivers/iommu/intel/iommu.h | 6 +++++
drivers/iommu/intel/nested.c | 50 +++---------------------------------
2 files changed, 9 insertions(+), 47 deletions(-)
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index e17683ecef4b..b1d04aa36d31 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -1050,6 +1050,12 @@ static inline unsigned long aligned_nrpages(unsigned long host_addr, size_t size
return PAGE_ALIGN(host_addr + size) >> VTD_PAGE_SHIFT;
}
+/* Return a size from number of VTD pages. */
+static inline unsigned long nrpages_to_size(unsigned long npages)
+{
+ return npages << VTD_PAGE_SHIFT;
+}
+
/* Convert value to context PASID directory size field coding. */
#define context_pdts(pds) (((pds) & 0x7) << 9)
diff --git a/drivers/iommu/intel/nested.c b/drivers/iommu/intel/nested.c
index 13406ee742bf..16a2bcf5cfeb 100644
--- a/drivers/iommu/intel/nested.c
+++ b/drivers/iommu/intel/nested.c
@@ -88,50 +88,6 @@ static void intel_nested_domain_free(struct iommu_domain *domain)
kfree(dmar_domain);
}
-static void nested_flush_dev_iotlb(struct dmar_domain *domain, u64 addr,
- unsigned int mask)
-{
- struct device_domain_info *info;
- unsigned long flags;
- u16 sid, qdep;
-
- spin_lock_irqsave(&domain->lock, flags);
- list_for_each_entry(info, &domain->devices, link) {
- if (!info->ats_enabled)
- continue;
- sid = info->bus << 8 | info->devfn;
- qdep = info->ats_qdep;
- qi_flush_dev_iotlb(info->iommu, sid, info->pfsid,
- qdep, addr, mask);
- quirk_extra_dev_tlb_flush(info, addr, mask,
- IOMMU_NO_PASID, qdep);
- }
- spin_unlock_irqrestore(&domain->lock, flags);
-}
-
-static void intel_nested_flush_cache(struct dmar_domain *domain, u64 addr,
- u64 npages, bool ih)
-{
- struct iommu_domain_info *info;
- unsigned int mask;
- unsigned long i;
-
- xa_for_each(&domain->iommu_array, i, info)
- qi_flush_piotlb(info->iommu,
- domain_id_iommu(domain, info->iommu),
- IOMMU_NO_PASID, addr, npages, ih);
-
- if (!domain->has_iotlb_device)
- return;
-
- if (npages == U64_MAX)
- mask = 64 - VTD_PAGE_SHIFT;
- else
- mask = ilog2(__roundup_pow_of_two(npages));
-
- nested_flush_dev_iotlb(domain, addr, mask);
-}
-
static int intel_nested_cache_invalidate_user(struct iommu_domain *domain,
struct iommu_user_data_array *array)
{
@@ -164,9 +120,9 @@ static int intel_nested_cache_invalidate_user(struct iommu_domain *domain,
break;
}
- intel_nested_flush_cache(dmar_domain, inv_entry.addr,
- inv_entry.npages,
- inv_entry.flags & IOMMU_VTD_INV_FLAGS_LEAF);
+ cache_tag_flush_range(dmar_domain, inv_entry.addr,
+ inv_entry.addr + nrpages_to_size(inv_entry.npages) - 1,
+ inv_entry.flags & IOMMU_VTD_INV_FLAGS_LEAF);
processed++;
}
--
2.34.1
* [PATCH v3 09/12] iommu/vt-d: Use cache helpers in arch_invalidate_secondary_tlbs
2024-04-16 8:06 [PATCH v3 00/12] Consolidate domain cache invalidation Lu Baolu
` (7 preceding siblings ...)
2024-04-16 8:06 ` [PATCH v3 08/12] iommu/vt-d: Use cache_tag_flush_range() in cache_invalidate_user Lu Baolu
@ 2024-04-16 8:06 ` Lu Baolu
2024-04-16 8:06 ` [PATCH v3 10/12] iommu/vt-d: Remove intel_svm_dev Lu Baolu
` (3 subsequent siblings)
12 siblings, 0 replies; 22+ messages in thread
From: Lu Baolu @ 2024-04-16 8:06 UTC (permalink / raw)
To: Joerg Roedel, Will Deacon, Robin Murphy, Kevin Tian,
Jason Gunthorpe
Cc: Tina Zhang, Yi Liu, iommu, linux-kernel, Lu Baolu
The arch_invalidate_secondary_tlbs callback is called in the SVA mm
notification path. It invalidates all or a range of caches after the
CPU page table is modified. Use the cache tag helpers in this path.
The mm_types header defines vm_end as the first byte after the end
address, which differs from the inclusive end used by the iommu gather
API; hence convert the end parameter from the mm_types scheme to the
iommu gather scheme before calling the cache_tag helper.
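A hedged worked example of the exclusive-to-inclusive end conversion
described above (addresses illustrative):

	/*
	 * mmu notifiers report [start, end) with end exclusive: one 4KiB
	 * page at 0x7f0000001000 arrives as start = 0x7f0000001000,
	 * end = 0x7f0000002000. The cache_tag helper takes the last byte
	 * of the range, hence end - 1 == 0x7f0000001fff.
	 */
	cache_tag_flush_range(domain, start, end - 1, 0);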
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
drivers/iommu/intel/iommu.h | 1 +
drivers/iommu/intel/svm.c | 81 +++++--------------------------------
2 files changed, 11 insertions(+), 71 deletions(-)
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index b1d04aa36d31..31bbce8ffe7e 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -1167,6 +1167,7 @@ struct intel_svm {
struct mm_struct *mm;
u32 pasid;
struct list_head devs;
+ struct dmar_domain *domain;
};
#else
static inline void intel_svm_check(struct intel_iommu *iommu) {}
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 641d0cef3737..e6d25a0f25f6 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -168,88 +168,25 @@ void intel_svm_check(struct intel_iommu *iommu)
iommu->flags |= VTD_FLAG_SVM_CAPABLE;
}
-static void __flush_svm_range_dev(struct intel_svm *svm,
- struct intel_svm_dev *sdev,
- unsigned long address,
- unsigned long pages, int ih)
-{
- struct device_domain_info *info = dev_iommu_priv_get(sdev->dev);
-
- if (WARN_ON(!pages))
- return;
-
- qi_flush_piotlb(sdev->iommu, sdev->did, svm->pasid, address, pages, ih);
- if (info->ats_enabled) {
- qi_flush_dev_iotlb_pasid(sdev->iommu, sdev->sid, info->pfsid,
- svm->pasid, sdev->qdep, address,
- order_base_2(pages));
- quirk_extra_dev_tlb_flush(info, address, order_base_2(pages),
- svm->pasid, sdev->qdep);
- }
-}
-
-static void intel_flush_svm_range_dev(struct intel_svm *svm,
- struct intel_svm_dev *sdev,
- unsigned long address,
- unsigned long pages, int ih)
-{
- unsigned long shift = ilog2(__roundup_pow_of_two(pages));
- unsigned long align = (1ULL << (VTD_PAGE_SHIFT + shift));
- unsigned long start = ALIGN_DOWN(address, align);
- unsigned long end = ALIGN(address + (pages << VTD_PAGE_SHIFT), align);
-
- while (start < end) {
- __flush_svm_range_dev(svm, sdev, start, align >> VTD_PAGE_SHIFT, ih);
- start += align;
- }
-}
-
-static void intel_flush_svm_range(struct intel_svm *svm, unsigned long address,
- unsigned long pages, int ih)
-{
- struct intel_svm_dev *sdev;
-
- rcu_read_lock();
- list_for_each_entry_rcu(sdev, &svm->devs, list)
- intel_flush_svm_range_dev(svm, sdev, address, pages, ih);
- rcu_read_unlock();
-}
-
-static void intel_flush_svm_all(struct intel_svm *svm)
-{
- struct device_domain_info *info;
- struct intel_svm_dev *sdev;
-
- rcu_read_lock();
- list_for_each_entry_rcu(sdev, &svm->devs, list) {
- info = dev_iommu_priv_get(sdev->dev);
-
- qi_flush_piotlb(sdev->iommu, sdev->did, svm->pasid, 0, -1UL, 0);
- if (info->ats_enabled) {
- qi_flush_dev_iotlb_pasid(sdev->iommu, sdev->sid, info->pfsid,
- svm->pasid, sdev->qdep,
- 0, 64 - VTD_PAGE_SHIFT);
- quirk_extra_dev_tlb_flush(info, 0, 64 - VTD_PAGE_SHIFT,
- svm->pasid, sdev->qdep);
- }
- }
- rcu_read_unlock();
-}
-
/* Pages have been freed at this point */
static void intel_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
struct mm_struct *mm,
unsigned long start, unsigned long end)
{
struct intel_svm *svm = container_of(mn, struct intel_svm, notifier);
+ struct dmar_domain *domain = svm->domain;
if (start == 0 && end == -1UL) {
- intel_flush_svm_all(svm);
+ cache_tag_flush_all(domain);
return;
}
- intel_flush_svm_range(svm, start,
- (end - start + PAGE_SIZE - 1) >> VTD_PAGE_SHIFT, 0);
+ /*
+ * The mm_types defines vm_end as the first byte after the end address,
+ * different from IOMMU subsystem using the last address of an address
+ * range.
+ */
+ cache_tag_flush_range(domain, start, end - 1, 0);
}
static void intel_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
@@ -336,6 +273,7 @@ static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
INIT_LIST_HEAD_RCU(&svm->devs);
svm->notifier.ops = &intel_mmuops;
+ svm->domain = to_dmar_domain(domain);
ret = mmu_notifier_register(&svm->notifier, mm);
if (ret) {
kfree(svm);
@@ -801,6 +739,7 @@ struct iommu_domain *intel_svm_domain_alloc(void)
if (!domain)
return NULL;
domain->domain.ops = &intel_svm_domain_ops;
+ domain->use_first_level = true;
INIT_LIST_HEAD(&domain->cache_tags);
spin_lock_init(&domain->cache_lock);
--
2.34.1
* [PATCH v3 10/12] iommu/vt-d: Remove intel_svm_dev
2024-04-16 8:06 [PATCH v3 00/12] Consolidate domain cache invalidation Lu Baolu
` (8 preceding siblings ...)
2024-04-16 8:06 ` [PATCH v3 09/12] iommu/vt-d: Use cache helpers in arch_invalidate_secondary_tlbs Lu Baolu
@ 2024-04-16 8:06 ` Lu Baolu
2024-04-16 8:06 ` [PATCH v3 11/12] iommu: Add ops->domain_alloc_sva() Lu Baolu
` (2 subsequent siblings)
12 siblings, 0 replies; 22+ messages in thread
From: Lu Baolu @ 2024-04-16 8:06 UTC (permalink / raw)
To: Joerg Roedel, Will Deacon, Robin Murphy, Kevin Tian,
Jason Gunthorpe
Cc: Tina Zhang, Yi Liu, iommu, linux-kernel, Lu Baolu
The intel_svm_dev data structure used in the SVA implementation of the
Intel IOMMU driver stores information about a device attached to an SVA
domain. It duplicates dev_pasid_info, which serves the same purpose.
Replace intel_svm_dev with dev_pasid_info and clean up all uses of
intel_svm_dev.
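For reference, the retired structure (from the hunk below) next to a
sketch of dev_pasid_info as assumed from the driver; the exact field
layout of dev_pasid_info may differ:

	/* Retired: per-device SVA bookkeeping, duplicated the generic record. */
	struct intel_svm_dev {
		struct list_head list;
		struct rcu_head rcu;
		struct device *dev;
		struct intel_iommu *iommu;
		u16 did;
		u16 sid, qdep;
	};

	/* Kept (sketch, assumed): the generic per-domain attachment record. */
	struct dev_pasid_info {
		struct list_head link_domain;	/* on dmar_domain->dev_pasids */
		struct device *dev;
		ioasid_t pasid;
	};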
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
drivers/iommu/intel/iommu.h | 15 +----
drivers/iommu/intel/iommu.c | 7 +-
drivers/iommu/intel/svm.c | 130 ++++++++++--------------------------
3 files changed, 42 insertions(+), 110 deletions(-)
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index 31bbce8ffe7e..4f4f4f28d55d 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -650,6 +650,7 @@ struct dmar_domain {
struct list_head s2_link;
};
};
+ struct intel_svm *svm;
struct iommu_domain domain; /* generic domain data structure for
iommu core */
@@ -1150,23 +1151,13 @@ int intel_svm_finish_prq(struct intel_iommu *iommu);
void intel_svm_page_response(struct device *dev, struct iopf_fault *evt,
struct iommu_page_response *msg);
struct iommu_domain *intel_svm_domain_alloc(void);
-void intel_svm_remove_dev_pasid(struct device *dev, ioasid_t pasid);
+void intel_svm_remove_dev_pasid(struct iommu_domain *domain);
void intel_drain_pasid_prq(struct device *dev, u32 pasid);
-struct intel_svm_dev {
- struct list_head list;
- struct rcu_head rcu;
- struct device *dev;
- struct intel_iommu *iommu;
- u16 did;
- u16 sid, qdep;
-};
-
struct intel_svm {
struct mmu_notifier notifier;
struct mm_struct *mm;
u32 pasid;
- struct list_head devs;
struct dmar_domain *domain;
};
#else
@@ -1177,7 +1168,7 @@ static inline struct iommu_domain *intel_svm_domain_alloc(void)
return NULL;
}
-static inline void intel_svm_remove_dev_pasid(struct device *dev, ioasid_t pasid)
+static inline void intel_svm_remove_dev_pasid(struct iommu_domain *domain)
{
}
#endif
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index fc6303940322..0f609ba36382 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4386,11 +4386,8 @@ static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
* notification. Before consolidating that code into iommu core, let
* the intel sva code handle it.
*/
- if (domain->type == IOMMU_DOMAIN_SVA) {
- intel_svm_remove_dev_pasid(dev, pasid);
- cache_tag_unassign_domain(dmar_domain, dev, pasid);
- goto out_tear_down;
- }
+ if (domain->type == IOMMU_DOMAIN_SVA)
+ intel_svm_remove_dev_pasid(domain);
spin_lock_irqsave(&dmar_domain->lock, flags);
list_for_each_entry(curr, &dmar_domain->dev_pasids, link_domain) {
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index e6d25a0f25f6..01fa88f8b4a8 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -43,23 +43,6 @@ static void *pasid_private_find(ioasid_t pasid)
return xa_load(&pasid_private_array, pasid);
}
-static struct intel_svm_dev *
-svm_lookup_device_by_dev(struct intel_svm *svm, struct device *dev)
-{
- struct intel_svm_dev *sdev = NULL, *t;
-
- rcu_read_lock();
- list_for_each_entry_rcu(t, &svm->devs, list) {
- if (t->dev == dev) {
- sdev = t;
- break;
- }
- }
- rcu_read_unlock();
-
- return sdev;
-}
-
int intel_svm_enable_prq(struct intel_iommu *iommu)
{
struct iopf_queue *iopfq;
@@ -192,7 +175,10 @@ static void intel_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
static void intel_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
{
struct intel_svm *svm = container_of(mn, struct intel_svm, notifier);
- struct intel_svm_dev *sdev;
+ struct dmar_domain *domain = svm->domain;
+ struct dev_pasid_info *dev_pasid;
+ struct device_domain_info *info;
+ unsigned long flags;
/* This might end up being called from exit_mmap(), *before* the page
* tables are cleared. And __mmu_notifier_release() will delete us from
@@ -206,11 +192,13 @@ static void intel_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
* page) so that we end up taking a fault that the hardware really
* *has* to handle gracefully without affecting other processes.
*/
- rcu_read_lock();
- list_for_each_entry_rcu(sdev, &svm->devs, list)
- intel_pasid_tear_down_entry(sdev->iommu, sdev->dev,
- svm->pasid, true);
- rcu_read_unlock();
+ spin_lock_irqsave(&domain->lock, flags);
+ list_for_each_entry(dev_pasid, &domain->dev_pasids, link_domain) {
+ info = dev_iommu_priv_get(dev_pasid->dev);
+ intel_pasid_tear_down_entry(info->iommu, dev_pasid->dev,
+ dev_pasid->pasid, true);
+ }
+ spin_unlock_irqrestore(&domain->lock, flags);
}
@@ -219,47 +207,17 @@ static const struct mmu_notifier_ops intel_mmuops = {
.arch_invalidate_secondary_tlbs = intel_arch_invalidate_secondary_tlbs,
};
-static int pasid_to_svm_sdev(struct device *dev, unsigned int pasid,
- struct intel_svm **rsvm,
- struct intel_svm_dev **rsdev)
-{
- struct intel_svm_dev *sdev = NULL;
- struct intel_svm *svm;
-
- if (pasid == IOMMU_PASID_INVALID || pasid >= PASID_MAX)
- return -EINVAL;
-
- svm = pasid_private_find(pasid);
- if (IS_ERR(svm))
- return PTR_ERR(svm);
-
- if (!svm)
- goto out;
-
- /*
- * If we found svm for the PASID, there must be at least one device
- * bond.
- */
- if (WARN_ON(list_empty(&svm->devs)))
- return -EINVAL;
- sdev = svm_lookup_device_by_dev(svm, dev);
-
-out:
- *rsvm = svm;
- *rsdev = sdev;
-
- return 0;
-}
-
static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
struct device *dev, ioasid_t pasid)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
+ struct dmar_domain *dmar_domain = to_dmar_domain(domain);
struct intel_iommu *iommu = info->iommu;
struct mm_struct *mm = domain->mm;
- struct intel_svm_dev *sdev;
+ struct dev_pasid_info *dev_pasid;
struct intel_svm *svm;
unsigned long sflags;
+ unsigned long flags;
int ret = 0;
svm = pasid_private_find(pasid);
@@ -270,7 +228,6 @@ static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
svm->pasid = pasid;
svm->mm = mm;
- INIT_LIST_HEAD_RCU(&svm->devs);
svm->notifier.ops = &intel_mmuops;
svm->domain = to_dmar_domain(domain);
@@ -288,25 +245,17 @@ static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
}
}
- sdev = kzalloc(sizeof(*sdev), GFP_KERNEL);
- if (!sdev) {
- ret = -ENOMEM;
+ dmar_domain->svm = svm;
+ dev_pasid = kzalloc(sizeof(*dev_pasid), GFP_KERNEL);
+ if (!dev_pasid)
goto free_svm;
- }
- sdev->dev = dev;
- sdev->iommu = iommu;
- sdev->did = FLPT_DEFAULT_DID;
- sdev->sid = PCI_DEVID(info->bus, info->devfn);
- if (info->ats_enabled) {
- sdev->qdep = info->ats_qdep;
- if (sdev->qdep >= QI_DEV_EIOTLB_MAX_INVS)
- sdev->qdep = 0;
- }
+ dev_pasid->dev = dev;
+ dev_pasid->pasid = pasid;
ret = cache_tag_assign_domain(to_dmar_domain(domain), dev, pasid);
if (ret)
- goto free_sdev;
+ goto free_dev_pasid;
/* Setup the pasid table: */
sflags = cpu_feature_enabled(X86_FEATURE_LA57) ? PASID_FLAG_FL5LP : 0;
@@ -315,16 +264,18 @@ static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
if (ret)
goto unassign_tag;
- list_add_rcu(&sdev->list, &svm->devs);
+ spin_lock_irqsave(&dmar_domain->lock, flags);
+ list_add(&dev_pasid->link_domain, &dmar_domain->dev_pasids);
+ spin_unlock_irqrestore(&dmar_domain->lock, flags);
return 0;
unassign_tag:
cache_tag_unassign_domain(to_dmar_domain(domain), dev, pasid);
-free_sdev:
- kfree(sdev);
+free_dev_pasid:
+ kfree(dev_pasid);
free_svm:
- if (list_empty(&svm->devs)) {
+ if (list_empty(&dmar_domain->dev_pasids)) {
mmu_notifier_unregister(&svm->notifier, mm);
pasid_private_remove(pasid);
kfree(svm);
@@ -333,26 +284,17 @@ static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
return ret;
}
-void intel_svm_remove_dev_pasid(struct device *dev, u32 pasid)
+void intel_svm_remove_dev_pasid(struct iommu_domain *domain)
{
- struct intel_svm_dev *sdev;
- struct intel_svm *svm;
- struct mm_struct *mm;
+ struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+ struct intel_svm *svm = dmar_domain->svm;
+ struct mm_struct *mm = domain->mm;
- if (pasid_to_svm_sdev(dev, pasid, &svm, &sdev))
- return;
- mm = svm->mm;
-
- if (sdev) {
- list_del_rcu(&sdev->list);
- kfree_rcu(sdev, rcu);
-
- if (list_empty(&svm->devs)) {
- if (svm->notifier.ops)
- mmu_notifier_unregister(&svm->notifier, mm);
- pasid_private_remove(svm->pasid);
- kfree(svm);
- }
+ if (list_empty(&dmar_domain->dev_pasids)) {
+ if (svm->notifier.ops)
+ mmu_notifier_unregister(&svm->notifier, mm);
+ pasid_private_remove(svm->pasid);
+ kfree(svm);
}
}
@@ -740,8 +682,10 @@ struct iommu_domain *intel_svm_domain_alloc(void)
return NULL;
domain->domain.ops = &intel_svm_domain_ops;
domain->use_first_level = true;
+ INIT_LIST_HEAD(&domain->dev_pasids);
INIT_LIST_HEAD(&domain->cache_tags);
spin_lock_init(&domain->cache_lock);
+ spin_lock_init(&domain->lock);
return &domain->domain;
}
--
2.34.1
* [PATCH v3 11/12] iommu: Add ops->domain_alloc_sva()
2024-04-16 8:06 [PATCH v3 00/12] Consolidate domain cache invalidation Lu Baolu
` (9 preceding siblings ...)
2024-04-16 8:06 ` [PATCH v3 10/12] iommu/vt-d: Remove intel_svm_dev Lu Baolu
@ 2024-04-16 8:06 ` Lu Baolu
2024-04-16 8:06 ` [PATCH v3 12/12] iommu/vt-d: Remove struct intel_svm Lu Baolu
2024-04-23 9:06 ` [PATCH v3 00/12] Consolidate domain cache invalidation Tian, Kevin
12 siblings, 0 replies; 22+ messages in thread
From: Lu Baolu @ 2024-04-16 8:06 UTC (permalink / raw)
To: Joerg Roedel, Will Deacon, Robin Murphy, Kevin Tian,
Jason Gunthorpe
Cc: Tina Zhang, Yi Liu, iommu, linux-kernel, Jason Gunthorpe,
Vasant Hegde, Lu Baolu
From: Jason Gunthorpe <jgg@nvidia.com>
Make a new op that receives the device and the mm_struct that the SVA
domain should be created for. Unlike domain_alloc_paging(), the dev
argument is never NULL here.
This allows drivers to fully initialize the SVA domain and allocate the
mmu_notifier during allocation. It allows the notifier lifetime to follow
the lifetime of the iommu_domain.
Since we have only one call site, upgrade the new op to return ERR_PTR
instead of NULL.
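As a usage sketch, a driver wiring up the new op might look like the
following; the my_* names are hypothetical, not part of this series:

	static struct iommu_domain *my_domain_alloc_sva(struct device *dev,
							struct mm_struct *mm)
	{
		/* dev is never NULL here, unlike domain_alloc_paging(). */
		struct my_domain *d = kzalloc(sizeof(*d), GFP_KERNEL);

		if (!d)
			return ERR_PTR(-ENOMEM);

		d->domain.ops = &my_sva_domain_ops;
		/* Fully initialize SVA state, e.g. register a notifier on mm. */
		return &d->domain;
	}

	static const struct iommu_ops my_iommu_ops = {
		/* other ops elided */
		.domain_alloc_sva = my_domain_alloc_sva,
	};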
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
[Removed smmu3 related changes - Vasant]
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Tina Zhang <tina.zhang@intel.com>
Link: https://lore.kernel.org/r/20240311090843.133455-15-vasant.hegde@amd.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
include/linux/iommu.h | 3 +++
drivers/iommu/iommu-sva.c | 16 +++++++++++-----
2 files changed, 14 insertions(+), 5 deletions(-)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 2e925b5eba53..8aabe83af8f2 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -518,6 +518,7 @@ static inline int __iommu_copy_struct_from_user_array(
* Upon failure, ERR_PTR must be returned.
* @domain_alloc_paging: Allocate an iommu_domain that can be used for
* UNMANAGED, DMA, and DMA_FQ domain types.
+ * @domain_alloc_sva: Allocate an iommu_domain for Shared Virtual Addressing.
* @probe_device: Add device to iommu driver handling
* @release_device: Remove device from iommu driver handling
* @probe_finalize: Do final setup work after the device is added to an IOMMU
@@ -558,6 +559,8 @@ struct iommu_ops {
struct device *dev, u32 flags, struct iommu_domain *parent,
const struct iommu_user_data *user_data);
struct iommu_domain *(*domain_alloc_paging)(struct device *dev);
+ struct iommu_domain *(*domain_alloc_sva)(struct device *dev,
+ struct mm_struct *mm);
struct iommu_device *(*probe_device)(struct device *dev);
void (*release_device)(struct device *dev);
diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
index 640acc804e8c..18a35e798b72 100644
--- a/drivers/iommu/iommu-sva.c
+++ b/drivers/iommu/iommu-sva.c
@@ -108,8 +108,8 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm
/* Allocate a new domain and set it on device pasid. */
domain = iommu_sva_domain_alloc(dev, mm);
- if (!domain) {
- ret = -ENOMEM;
+ if (IS_ERR(domain)) {
+ ret = PTR_ERR(domain);
goto out_free_handle;
}
@@ -283,9 +283,15 @@ struct iommu_domain *iommu_sva_domain_alloc(struct device *dev,
const struct iommu_ops *ops = dev_iommu_ops(dev);
struct iommu_domain *domain;
- domain = ops->domain_alloc(IOMMU_DOMAIN_SVA);
- if (!domain)
- return NULL;
+ if (ops->domain_alloc_sva) {
+ domain = ops->domain_alloc_sva(dev, mm);
+ if (IS_ERR(domain))
+ return domain;
+ } else {
+ domain = ops->domain_alloc(IOMMU_DOMAIN_SVA);
+ if (!domain)
+ return ERR_PTR(-ENOMEM);
+ }
domain->type = IOMMU_DOMAIN_SVA;
mmgrab(mm);
--
2.34.1
* [PATCH v3 12/12] iommu/vt-d: Remove struct intel_svm
2024-04-16 8:06 [PATCH v3 00/12] Consolidate domain cache invalidation Lu Baolu
` (10 preceding siblings ...)
2024-04-16 8:06 ` [PATCH v3 11/12] iommu: Add ops->domain_alloc_sva() Lu Baolu
@ 2024-04-16 8:06 ` Lu Baolu
2024-04-23 9:06 ` [PATCH v3 00/12] Consolidate domain cache invalidation Tian, Kevin
12 siblings, 0 replies; 22+ messages in thread
From: Lu Baolu @ 2024-04-16 8:06 UTC (permalink / raw)
To: Joerg Roedel, Will Deacon, Robin Murphy, Kevin Tian,
Jason Gunthorpe
Cc: Tina Zhang, Yi Liu, iommu, linux-kernel, Lu Baolu
The struct intel_svm was used to keep information about the devices
attached to an SVA domain. Since an SVA domain is a kind of iommu_domain,
struct dmar_domain should centralize all information about an SVA domain,
including the attached devices. Therefore, retire struct intel_svm and
clean up the code.
Besides, register the mmu notifier in the domain_alloc_sva() callback,
which allows the memory management notifier's lifetime to follow the
lifetime of the iommu_domain. Call mmu_notifier_put() in the domain free
path and defer the real free to the mmu notifier's free_notifier callback.
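For illustration, the resulting free ordering, summarized from the hunks
below (mmu_notifier_put() defers the final kfree() until an SRCU grace
period has elapsed):

	/*
	 * Sketch of the deferred free path after this patch:
	 *
	 *   iommu core calls domain->ops->free()
	 *     -> intel_svm_domain_free()
	 *          -> mmu_notifier_put(&dmar_domain->notifier)
	 *               ... SRCU grace period ...
	 *               -> ops->free_notifier() == intel_mm_free_notifier()
	 *                    -> kfree(dmar_domain)
	 */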
Co-developed-by: Tina Zhang <tina.zhang@intel.com>
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
drivers/iommu/intel/iommu.h | 26 ++++------
drivers/iommu/intel/iommu.c | 11 +----
drivers/iommu/intel/svm.c | 99 ++++++++++---------------------------
3 files changed, 37 insertions(+), 99 deletions(-)
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index 4f4f4f28d55d..e1fb94acf0be 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -649,8 +649,12 @@ struct dmar_domain {
/* link to parent domain siblings */
struct list_head s2_link;
};
+
+ /* SVA domain */
+ struct {
+ struct mmu_notifier notifier;
+ };
};
- struct intel_svm *svm;
struct iommu_domain domain; /* generic domain data structure for
iommu core */
@@ -1150,26 +1154,16 @@ int intel_svm_enable_prq(struct intel_iommu *iommu);
int intel_svm_finish_prq(struct intel_iommu *iommu);
void intel_svm_page_response(struct device *dev, struct iopf_fault *evt,
struct iommu_page_response *msg);
-struct iommu_domain *intel_svm_domain_alloc(void);
-void intel_svm_remove_dev_pasid(struct iommu_domain *domain);
+struct iommu_domain *intel_svm_domain_alloc(struct device *dev,
+ struct mm_struct *mm);
void intel_drain_pasid_prq(struct device *dev, u32 pasid);
-
-struct intel_svm {
- struct mmu_notifier notifier;
- struct mm_struct *mm;
- u32 pasid;
- struct dmar_domain *domain;
-};
#else
static inline void intel_svm_check(struct intel_iommu *iommu) {}
static inline void intel_drain_pasid_prq(struct device *dev, u32 pasid) {}
-static inline struct iommu_domain *intel_svm_domain_alloc(void)
-{
- return NULL;
-}
-
-static inline void intel_svm_remove_dev_pasid(struct iommu_domain *domain)
+static inline struct iommu_domain *intel_svm_domain_alloc(struct device *dev,
+ struct mm_struct *mm)
{
+ return ERR_PTR(-ENODEV);
}
#endif
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 0f609ba36382..bbc47b3c603c 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -3682,8 +3682,6 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
return domain;
case IOMMU_DOMAIN_IDENTITY:
return &si_domain->domain;
- case IOMMU_DOMAIN_SVA:
- return intel_svm_domain_alloc();
default:
return NULL;
}
@@ -4381,14 +4379,6 @@ static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
goto out_tear_down;
dmar_domain = to_dmar_domain(domain);
- /*
- * The SVA implementation needs to handle its own stuffs like the mm
- * notification. Before consolidating that code into iommu core, let
- * the intel sva code handle it.
- */
- if (domain->type == IOMMU_DOMAIN_SVA)
- intel_svm_remove_dev_pasid(domain);
-
spin_lock_irqsave(&dmar_domain->lock, flags);
list_for_each_entry(curr, &dmar_domain->dev_pasids, link_domain) {
if (curr->dev == dev && curr->pasid == pasid) {
@@ -4623,6 +4613,7 @@ const struct iommu_ops intel_iommu_ops = {
.hw_info = intel_iommu_hw_info,
.domain_alloc = intel_iommu_domain_alloc,
.domain_alloc_user = intel_iommu_domain_alloc_user,
+ .domain_alloc_sva = intel_svm_domain_alloc,
.probe_device = intel_iommu_probe_device,
.probe_finalize = intel_iommu_probe_finalize,
.release_device = intel_iommu_release_device,
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 01fa88f8b4a8..84d9bd543a6d 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -26,23 +26,6 @@
static irqreturn_t prq_event_thread(int irq, void *d);
-static DEFINE_XARRAY_ALLOC(pasid_private_array);
-static int pasid_private_add(ioasid_t pasid, void *priv)
-{
- return xa_alloc(&pasid_private_array, &pasid, priv,
- XA_LIMIT(pasid, pasid), GFP_ATOMIC);
-}
-
-static void pasid_private_remove(ioasid_t pasid)
-{
- xa_erase(&pasid_private_array, pasid);
-}
-
-static void *pasid_private_find(ioasid_t pasid)
-{
- return xa_load(&pasid_private_array, pasid);
-}
-
int intel_svm_enable_prq(struct intel_iommu *iommu)
{
struct iopf_queue *iopfq;
@@ -156,10 +139,9 @@ static void intel_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
struct mm_struct *mm,
unsigned long start, unsigned long end)
{
- struct intel_svm *svm = container_of(mn, struct intel_svm, notifier);
- struct dmar_domain *domain = svm->domain;
+ struct dmar_domain *domain = container_of(mn, struct dmar_domain, notifier);
- if (start == 0 && end == -1UL) {
+ if (start == 0 && end == ULONG_MAX) {
cache_tag_flush_all(domain);
return;
}
@@ -174,8 +156,7 @@ static void intel_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
static void intel_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
{
- struct intel_svm *svm = container_of(mn, struct intel_svm, notifier);
- struct dmar_domain *domain = svm->domain;
+ struct dmar_domain *domain = container_of(mn, struct dmar_domain, notifier);
struct dev_pasid_info *dev_pasid;
struct device_domain_info *info;
unsigned long flags;
@@ -202,9 +183,15 @@ static void intel_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
}
+static void intel_mm_free_notifier(struct mmu_notifier *mn)
+{
+ kfree(container_of(mn, struct dmar_domain, notifier));
+}
+
static const struct mmu_notifier_ops intel_mmuops = {
.release = intel_mm_release,
.arch_invalidate_secondary_tlbs = intel_arch_invalidate_secondary_tlbs,
+ .free_notifier = intel_mm_free_notifier,
};
static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
@@ -215,40 +202,13 @@ static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
struct intel_iommu *iommu = info->iommu;
struct mm_struct *mm = domain->mm;
struct dev_pasid_info *dev_pasid;
- struct intel_svm *svm;
unsigned long sflags;
unsigned long flags;
int ret = 0;
- svm = pasid_private_find(pasid);
- if (!svm) {
- svm = kzalloc(sizeof(*svm), GFP_KERNEL);
- if (!svm)
- return -ENOMEM;
-
- svm->pasid = pasid;
- svm->mm = mm;
-
- svm->notifier.ops = &intel_mmuops;
- svm->domain = to_dmar_domain(domain);
- ret = mmu_notifier_register(&svm->notifier, mm);
- if (ret) {
- kfree(svm);
- return ret;
- }
-
- ret = pasid_private_add(svm->pasid, svm);
- if (ret) {
- mmu_notifier_unregister(&svm->notifier, mm);
- kfree(svm);
- return ret;
- }
- }
-
- dmar_domain->svm = svm;
dev_pasid = kzalloc(sizeof(*dev_pasid), GFP_KERNEL);
if (!dev_pasid)
- goto free_svm;
+ return -ENOMEM;
dev_pasid->dev = dev;
dev_pasid->pasid = pasid;
@@ -274,30 +234,10 @@ static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
cache_tag_unassign_domain(to_dmar_domain(domain), dev, pasid);
free_dev_pasid:
kfree(dev_pasid);
-free_svm:
- if (list_empty(&dmar_domain->dev_pasids)) {
- mmu_notifier_unregister(&svm->notifier, mm);
- pasid_private_remove(pasid);
- kfree(svm);
- }
return ret;
}
-void intel_svm_remove_dev_pasid(struct iommu_domain *domain)
-{
- struct dmar_domain *dmar_domain = to_dmar_domain(domain);
- struct intel_svm *svm = dmar_domain->svm;
- struct mm_struct *mm = domain->mm;
-
- if (list_empty(&dmar_domain->dev_pasids)) {
- if (svm->notifier.ops)
- mmu_notifier_unregister(&svm->notifier, mm);
- pasid_private_remove(svm->pasid);
- kfree(svm);
- }
-}
-
/* Page request queue descriptor */
struct page_req_dsc {
union {
@@ -665,7 +605,10 @@ void intel_svm_page_response(struct device *dev, struct iopf_fault *evt,
static void intel_svm_domain_free(struct iommu_domain *domain)
{
- kfree(to_dmar_domain(domain));
+ struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+
+ /* dmar_domain free is deferred to the mmu free_notifier callback. */
+ mmu_notifier_put(&dmar_domain->notifier);
}
static const struct iommu_domain_ops intel_svm_domain_ops = {
@@ -673,13 +616,16 @@ static const struct iommu_domain_ops intel_svm_domain_ops = {
.free = intel_svm_domain_free
};
-struct iommu_domain *intel_svm_domain_alloc(void)
+struct iommu_domain *intel_svm_domain_alloc(struct device *dev,
+ struct mm_struct *mm)
{
struct dmar_domain *domain;
+ int ret;
domain = kzalloc(sizeof(*domain), GFP_KERNEL);
if (!domain)
- return NULL;
+ return ERR_PTR(-ENOMEM);
+
domain->domain.ops = &intel_svm_domain_ops;
domain->use_first_level = true;
INIT_LIST_HEAD(&domain->dev_pasids);
@@ -687,5 +633,12 @@ struct iommu_domain *intel_svm_domain_alloc(void)
spin_lock_init(&domain->cache_lock);
spin_lock_init(&domain->lock);
+ domain->notifier.ops = &intel_mmuops;
+ ret = mmu_notifier_register(&domain->notifier, mm);
+ if (ret) {
+ kfree(domain);
+ return ERR_PTR(ret);
+ }
+
return &domain->domain;
}
--
2.34.1
* RE: [PATCH v3 00/12] Consolidate domain cache invalidation
2024-04-16 8:06 [PATCH v3 00/12] Consolidate domain cache invalidation Lu Baolu
` (11 preceding siblings ...)
2024-04-16 8:06 ` [PATCH v3 12/12] iommu/vt-d: Remove struct intel_svm Lu Baolu
@ 2024-04-23 9:06 ` Tian, Kevin
2024-04-24 2:46 ` Baolu Lu
12 siblings, 1 reply; 22+ messages in thread
From: Tian, Kevin @ 2024-04-23 9:06 UTC (permalink / raw)
To: Lu Baolu, Joerg Roedel, Will Deacon, Robin Murphy,
Jason Gunthorpe
Cc: Zhang, Tina, Liu, Yi L, iommu@lists.linux.dev,
linux-kernel@vger.kernel.org
> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Tuesday, April 16, 2024 4:07 PM
>
> The IOMMU hardware cache needs to be invalidated whenever the
> mappings
> in the domain are changed. Currently, domain cache invalidation is
> scattered across different places, causing several issues:
>
> - IOMMU IOTLB Invalidation: This is done by iterating through the domain
> IDs of each domain using the following code:
>
> xa_for_each(&dmar_domain->iommu_array, i, info)
> iommu_flush_iotlb_psi(info->iommu, dmar_domain,
> start_pfn, nrpages,
> list_empty(&gather->freelist), 0);
>
> This code could theoretically cause a use-after-free problem because
> there's no lock to protect the "info" pointer within the loop.
>
> - Inconsistent Invalidation Methods: Different domain types implement
> their own cache invalidation methods, making the code difficult to
> maintain. For example, the DMA domain, SVA domain, and nested domain
> have similar cache invalidation code scattered across different files.
>
> - SVA Domain Inconsistency: The SVA domain implementation uses a
> completely different data structure to track attached devices compared
> to other domains. This creates unnecessary differences and, even
> worse, leads to duplicate IOTLB invalidation when an SVA domain is
> attached to devices belonging to a same IOMMU.
>
> - Nested Domain Dependency: The special overlap between a nested domain
> and its parent domain requires a dedicated parent_domain_flush()
> helper function to be called everywhere the parent domain's mapping
> changes.
>
> - Limited Debugging Support: There are currently no debugging aids
> available for domain cache invalidation.
>
> By consolidating domain cache invalidation into a common location, we
> can address the issues mentioned above and improve the code's
> maintainability and debuggability.
>
There are several small nits but overall this series looks good to me:
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
* Re: [PATCH v3 00/12] Consolidate domain cache invalidation
2024-04-23 9:06 ` [PATCH v3 00/12] Consolidate domain cache invalidation Tian, Kevin
@ 2024-04-24 2:46 ` Baolu Lu
0 siblings, 0 replies; 22+ messages in thread
From: Baolu Lu @ 2024-04-24 2:46 UTC (permalink / raw)
To: Tian, Kevin, Joerg Roedel, Will Deacon, Robin Murphy,
Jason Gunthorpe
Cc: baolu.lu, Zhang, Tina, Liu, Yi L, iommu@lists.linux.dev,
linux-kernel@vger.kernel.org
On 4/23/24 5:06 PM, Tian, Kevin wrote:
>> [...]
> There are several small nits but overall this series looks good to me:
>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Thanks! I will queue this series to Joerg for wider verification.
Best regards,
baolu