* [PATCH v2 01/11] iommu/amd: Rename amd_iommu_v2_supported() as amd_iommu_sva_supported()
2023-09-11 12:10 [PATCH v2 00/11] iommu/amd: SVA Support (Part 4) - SVA and IOPF Vasant Hegde
@ 2023-09-11 12:10 ` Vasant Hegde
2023-09-11 12:10 ` [PATCH v2 02/11] iommu/amd: Do not override PASID entry in GCR3 table Vasant Hegde
` (10 subsequent siblings)
11 siblings, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-09-11 12:10 UTC (permalink / raw)
To: iommu, joro
Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde
To reflect its usage. No functional changes intended.
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
drivers/iommu/amd/amd_iommu.h | 2 +-
drivers/iommu/amd/init.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 1aa061543c9f..7d18addb731e 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -38,7 +38,7 @@ extern int amd_iommu_guest_ir;
extern enum io_pgtable_fmt amd_iommu_pgtable;
extern int amd_iommu_gpt_level;
-bool amd_iommu_v2_supported(void);
+bool amd_iommu_sva_supported(void);
struct amd_iommu *get_amd_iommu(unsigned int idx);
u8 amd_iommu_pc_get_max_banks(unsigned int idx);
bool amd_iommu_pc_supported(void);
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 64bcf3df37ee..3820ae738088 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -3652,7 +3652,7 @@ __setup("ivrs_ioapic", parse_ivrs_ioapic);
__setup("ivrs_hpet", parse_ivrs_hpet);
__setup("ivrs_acpihid", parse_ivrs_acpihid);
-bool amd_iommu_v2_supported(void)
+bool amd_iommu_sva_supported(void)
{
/* CPU page table size should match IOMMU guest page table size */
if (cpu_feature_enabled(X86_FEATURE_LA57) &&
--
2.31.1
^ permalink raw reply related [flat|nested] 30+ messages in thread* [PATCH v2 02/11] iommu/amd: Do not override PASID entry in GCR3 table
2023-09-11 12:10 [PATCH v2 00/11] iommu/amd: SVA Support (Part 4) - SVA and IOPF Vasant Hegde
2023-09-11 12:10 ` [PATCH v2 01/11] iommu/amd: Rename amd_iommu_v2_supported() as amd_iommu_sva_supported() Vasant Hegde
@ 2023-09-11 12:10 ` Vasant Hegde
2023-09-11 12:10 ` [PATCH v2 03/11] iommu/amd: Add support for enabling/disabling IOMMU features Vasant Hegde
` (9 subsequent siblings)
11 siblings, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-09-11 12:10 UTC (permalink / raw)
To: iommu, joro
Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde
Return error if PASID is already exists.
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
drivers/iommu/amd/iommu.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index f977b423d216..d7edb2d63fcb 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2732,6 +2732,9 @@ static int __set_gcr3(struct iommu_dev_data *dev_data,
if (pte == NULL)
return -ENOMEM;
+ if (*pte & GCR3_VALID)
+ return -EBUSY;
+
*pte = (gcr3 & PAGE_MASK) | GCR3_VALID;
__amd_iommu_flush_tlb(dev_data->domain, pasid);
--
2.31.1
^ permalink raw reply related [flat|nested] 30+ messages in thread* [PATCH v2 03/11] iommu/amd: Add support for enabling/disabling IOMMU features
2023-09-11 12:10 [PATCH v2 00/11] iommu/amd: SVA Support (Part 4) - SVA and IOPF Vasant Hegde
2023-09-11 12:10 ` [PATCH v2 01/11] iommu/amd: Rename amd_iommu_v2_supported() as amd_iommu_sva_supported() Vasant Hegde
2023-09-11 12:10 ` [PATCH v2 02/11] iommu/amd: Do not override PASID entry in GCR3 table Vasant Hegde
@ 2023-09-11 12:10 ` Vasant Hegde
2023-09-11 12:10 ` [PATCH v2 04/11] iommu/amd: Initial SVA support for AMD IOMMU Vasant Hegde
` (8 subsequent siblings)
11 siblings, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-09-11 12:10 UTC (permalink / raw)
To: iommu, joro
Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde
From: Wei Huang <wei.huang2@amd.com>
Add support for struct iommu_ops.dev_{enable/disable}_feat. Please note
that the empty feature switches will be populated by subsequent patches.
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
drivers/iommu/amd/iommu.c | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index d7edb2d63fcb..a573e3534656 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2568,6 +2568,32 @@ static bool amd_iommu_enforce_cache_coherency(struct iommu_domain *domain)
return true;
}
+static int amd_iommu_dev_enable_feature(struct device *dev,
+ enum iommu_dev_features feat)
+{
+ int ret;
+
+ switch (feat) {
+ default:
+ ret = -EINVAL;
+ break;
+ }
+ return ret;
+}
+
+static int amd_iommu_dev_disable_feature(struct device *dev,
+ enum iommu_dev_features feat)
+{
+ int ret;
+
+ switch (feat) {
+ default:
+ ret = -EINVAL;
+ break;
+ }
+ return ret;
+}
+
const struct iommu_ops amd_iommu_ops = {
.capable = amd_iommu_capable,
.domain_alloc = amd_iommu_domain_alloc,
@@ -2579,6 +2605,8 @@ const struct iommu_ops amd_iommu_ops = {
.is_attach_deferred = amd_iommu_is_attach_deferred,
.pgsize_bitmap = AMD_IOMMU_PGSIZES,
.def_domain_type = amd_iommu_def_domain_type,
+ .dev_enable_feat = amd_iommu_dev_enable_feature,
+ .dev_disable_feat = amd_iommu_dev_disable_feature,
.default_domain_ops = &(const struct iommu_domain_ops) {
.attach_dev = amd_iommu_attach_device,
.map_pages = amd_iommu_map_pages,
--
2.31.1
^ permalink raw reply related [flat|nested] 30+ messages in thread* [PATCH v2 04/11] iommu/amd: Initial SVA support for AMD IOMMU
2023-09-11 12:10 [PATCH v2 00/11] iommu/amd: SVA Support (Part 4) - SVA and IOPF Vasant Hegde
` (2 preceding siblings ...)
2023-09-11 12:10 ` [PATCH v2 03/11] iommu/amd: Add support for enabling/disabling IOMMU features Vasant Hegde
@ 2023-09-11 12:10 ` Vasant Hegde
2023-09-12 16:47 ` Jason Gunthorpe
2023-09-11 12:10 ` [PATCH v2 05/11] iommu/amd: Add support to enable/disable PASID feature Vasant Hegde
` (7 subsequent siblings)
11 siblings, 1 reply; 30+ messages in thread
From: Vasant Hegde @ 2023-09-11 12:10 UTC (permalink / raw)
To: iommu, joro
Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde
This includes :
- Add data structure to track per protection domain dev/pasid binding details
- Add mmu notifier to protection domain, so that we can retrieve SVA
protection domain in invalidation path.
- Add iommu_ops.remove_dev_pasid support. It will unbind PASID from
device. Also remove pasid data from protection domain.
- mmu notifier for iotlb invalidation
- Add IOMMU_SVA as dependency to AMD_IOMMU driver
- Move 'to_pdomain()' to header file
For a given PASID, amd_iommu_set_dev_pasid() will bind all devices to same
SVA protection domain (1 PASID : 1 SVA protection domain : N devices).
This protection domain is different from device protection domain (one
that's mapped in attach_device() path). IOMMU uses device protection
domain for caching, etc. Hence in invalidation path we retrieve device
protection domain from iommu_dev_data structure and use that for
invalidation.
Co-developed-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
drivers/iommu/amd/Kconfig | 1 +
drivers/iommu/amd/Makefile | 2 +-
drivers/iommu/amd/amd_iommu.h | 10 ++
drivers/iommu/amd/amd_iommu_types.h | 25 +++++
drivers/iommu/amd/iommu.c | 16 ++-
drivers/iommu/amd/sva.c | 158 ++++++++++++++++++++++++++++
6 files changed, 206 insertions(+), 6 deletions(-)
create mode 100644 drivers/iommu/amd/sva.c
diff --git a/drivers/iommu/amd/Kconfig b/drivers/iommu/amd/Kconfig
index 75132ae861a2..cabf4ccde1ed 100644
--- a/drivers/iommu/amd/Kconfig
+++ b/drivers/iommu/amd/Kconfig
@@ -10,6 +10,7 @@ config AMD_IOMMU
select IOMMU_API
select IOMMU_IOVA
select IOMMU_IO_PGTABLE
+ select IOMMU_SVA
depends on X86_64 && PCI && ACPI && HAVE_CMPXCHG_DOUBLE
help
With this option you can enable support for AMD IOMMU hardware in
diff --git a/drivers/iommu/amd/Makefile b/drivers/iommu/amd/Makefile
index f454fbb1569e..848c89fa5238 100644
--- a/drivers/iommu/amd/Makefile
+++ b/drivers/iommu/amd/Makefile
@@ -1,3 +1,3 @@
# SPDX-License-Identifier: GPL-2.0-only
-obj-$(CONFIG_AMD_IOMMU) += iommu.o init.o quirks.o io_pgtable.o io_pgtable_v2.o
+obj-$(CONFIG_AMD_IOMMU) += iommu.o init.o quirks.o io_pgtable.o io_pgtable_v2.o sva.o
obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += debugfs.o
diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 7d18addb731e..7159b2610702 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -38,7 +38,12 @@ extern int amd_iommu_guest_ir;
extern enum io_pgtable_fmt amd_iommu_pgtable;
extern int amd_iommu_gpt_level;
+/* SVA/PASID */
bool amd_iommu_sva_supported(void);
+int amd_iommu_set_dev_pasid(struct iommu_domain *domain,
+ struct device *dev, ioasid_t pasid);
+void amd_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid);
+
struct amd_iommu *get_amd_iommu(unsigned int idx);
u8 amd_iommu_pc_get_max_banks(unsigned int idx);
bool amd_iommu_pc_supported(void);
@@ -153,6 +158,11 @@ static inline struct amd_iommu *get_amd_iommu_from_dev(struct device *dev)
return container_of(iommu, struct amd_iommu, iommu);
}
+static inline struct protection_domain *to_pdomain(struct iommu_domain *dom)
+{
+ return container_of(dom, struct protection_domain, domain);
+}
+
bool translation_pre_enabled(struct amd_iommu *iommu);
bool amd_iommu_is_attach_deferred(struct device *dev);
int __init add_special_device(u8 type, u8 id, u32 *devid, bool cmd_line);
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index b1a54d8c7506..c866fbfa0dd8 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -8,7 +8,9 @@
#ifndef _ASM_X86_AMD_IOMMU_TYPES_H
#define _ASM_X86_AMD_IOMMU_TYPES_H
+#include <linux/iommu.h>
#include <linux/types.h>
+#include <linux/mmu_notifier.h>
#include <linux/mutex.h>
#include <linux/msi.h>
#include <linux/list.h>
@@ -541,6 +543,16 @@ enum protection_domain_mode {
PD_MODE_V2,
};
+/* Track PASID list for the protection domain */
+struct pdom_pasid_data {
+ /* PASID attached to the protection domain */
+ ioasid_t pasid;
+ /* Points to attach device data */
+ struct iommu_dev_data *dev_data;
+ /* Link to protection domain */
+ struct list_head pdom_link;
+};
+
/*
* This structure contains generic data for IOMMU protection domains
* independent of their use.
@@ -556,6 +568,9 @@ struct protection_domain {
enum protection_domain_mode pd_mode; /* Track page table type */
unsigned dev_cnt; /* devices assigned to this domain */
unsigned dev_iommu[MAX_IOMMUS]; /* per-IOMMU reference count */
+
+ struct mmu_notifier mn; /* mmu notifier for the SVA domain */
+ struct list_head pasid_list; /* List of pdom_pasid_data */
};
/*
@@ -922,6 +937,16 @@ static inline int get_hpet_devid(int id)
return -EINVAL;
}
+static inline struct protection_domain *amd_iommu_get_pdomain(struct device *dev)
+{
+ struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
+
+ if (!dev_data || !dev_data->domain)
+ return NULL;
+
+ return dev_data->domain;
+}
+
enum amd_iommu_intr_mode_type {
AMD_IOMMU_GUEST_IR_LEGACY,
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index a573e3534656..94eec3dac8f6 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -76,6 +76,7 @@ struct iommu_cmd {
struct kmem_cache *amd_iommu_irq_cache;
static void detach_device(struct device *dev);
+static void amd_iommu_domain_free(struct iommu_domain *dom);
static void set_dte_entry(struct amd_iommu *iommu, u16 devid,
struct gcr3_tbl_info *gcr3_info,
@@ -183,11 +184,6 @@ static struct amd_iommu *rlookup_amd_iommu(struct device *dev)
return __rlookup_amd_iommu(seg, PCI_SBDF_TO_DEVID(devid));
}
-static struct protection_domain *to_pdomain(struct iommu_domain *dom)
-{
- return container_of(dom, struct protection_domain, domain);
-}
-
static struct iommu_dev_data *alloc_dev_data(struct amd_iommu *iommu, u16 devid)
{
struct iommu_dev_data *dev_data;
@@ -2192,6 +2188,11 @@ static int protection_domain_init_v2(struct protection_domain *pdom)
return 0;
}
+static const struct iommu_domain_ops amd_svm_domain_ops = {
+ .set_dev_pasid = amd_iommu_set_dev_pasid,
+ .free = amd_iommu_domain_free
+};
+
static struct protection_domain *protection_domain_alloc(unsigned int type)
{
struct io_pgtable_ops *pgtbl_ops;
@@ -2209,6 +2210,7 @@ static struct protection_domain *protection_domain_alloc(unsigned int type)
spin_lock_init(&domain->lock);
INIT_LIST_HEAD(&domain->dev_list);
+ INIT_LIST_HEAD(&domain->pasid_list);
domain->nid = NUMA_NO_NODE;
switch (type) {
@@ -2226,6 +2228,9 @@ static struct protection_domain *protection_domain_alloc(unsigned int type)
case IOMMU_DOMAIN_UNMANAGED:
pgtable = AMD_IOMMU_V1;
break;
+ case IOMMU_DOMAIN_SVA:
+ domain->domain.ops = &amd_svm_domain_ops;
+ return domain;
default:
goto out_err;
}
@@ -2607,6 +2612,7 @@ const struct iommu_ops amd_iommu_ops = {
.def_domain_type = amd_iommu_def_domain_type,
.dev_enable_feat = amd_iommu_dev_enable_feature,
.dev_disable_feat = amd_iommu_dev_disable_feature,
+ .remove_dev_pasid = amd_iommu_remove_dev_pasid,
.default_domain_ops = &(const struct iommu_domain_ops) {
.attach_dev = amd_iommu_attach_device,
.map_pages = amd_iommu_map_pages,
diff --git a/drivers/iommu/amd/sva.c b/drivers/iommu/amd/sva.c
new file mode 100644
index 000000000000..d18dc3f676b9
--- /dev/null
+++ b/drivers/iommu/amd/sva.c
@@ -0,0 +1,158 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2023 Advanced Micro Devices, Inc.
+ */
+
+#define pr_fmt(fmt) "AMD-Vi: " fmt
+#define dev_fmt(fmt) pr_fmt(fmt)
+
+#include <linux/iommu.h>
+#include <linux/mm_types.h>
+#include <linux/mmu_notifier.h>
+
+#include "amd_iommu.h"
+
+static void sva_mn_invalidate_range(struct mmu_notifier *mn,
+ struct mm_struct *mm,
+ unsigned long start, unsigned long end)
+{
+ struct protection_domain *sva_pdom;
+ struct pdom_pasid_data *pasid_data;
+ struct iommu_dev_data *dev_data;
+
+ sva_pdom = container_of(mn, struct protection_domain, mn);
+
+ list_for_each_entry(pasid_data, &sva_pdom->pasid_list, pdom_link) {
+ dev_data = pasid_data->dev_data;
+
+ if ((start ^ (end - 1)) < PAGE_SIZE) {
+ amd_iommu_flush_page(dev_data->domain,
+ pasid_data->pasid, start);
+ } else {
+ amd_iommu_flush_tlb(dev_data->domain,
+ pasid_data->pasid);
+ }
+ }
+}
+
+static void sva_mn_release(struct mmu_notifier *mn, struct mm_struct *mm)
+{
+}
+
+static const struct mmu_notifier_ops sva_mn = {
+ .invalidate_range = sva_mn_invalidate_range,
+ .release = sva_mn_release,
+};
+
+int amd_iommu_set_dev_pasid(struct iommu_domain *domain,
+ struct device *dev, ioasid_t pasid)
+{
+ struct protection_domain *sva_pdom = to_pdomain(domain);
+ struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
+ struct pdom_pasid_data *pasid_data;
+ int ret = -EINVAL;
+ unsigned long flags;
+
+ /* PASID zero is used for requests from the I/O device without PASID */
+ if (pasid == 0 || pasid >= dev->iommu->max_pasids)
+ return ret;
+
+ /* Use SVA protection domain lock */
+ spin_lock_irqsave(&sva_pdom->lock, flags);
+
+ /* Add PASID to protection domain pasid list */
+ pasid_data = kzalloc(sizeof(*pasid_data), GFP_KERNEL);
+ if (pasid_data == NULL) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ pasid_data->pasid = pasid;
+ pasid_data->dev_data = dev_data;
+
+ /* Setup GCR3 table */
+ ret = amd_iommu_set_gcr3(dev_data, pasid,
+ iommu_virt_to_phys(domain->mm->pgd));
+ if (ret)
+ goto out_free_pasid_data;
+
+ if (list_empty(&sva_pdom->pasid_list)) {
+ sva_pdom->mn.ops = &sva_mn;
+
+ ret = mmu_notifier_register(&sva_pdom->mn, domain->mm);
+ if (ret)
+ goto out_clear_gcr3;
+ }
+
+ list_add(&pasid_data->pdom_link, &sva_pdom->pasid_list);
+ spin_unlock_irqrestore(&sva_pdom->lock, flags);
+ return ret;
+
+out_clear_gcr3:
+ amd_iommu_clear_gcr3(dev_data, pasid);
+
+out_free_pasid_data:
+ kfree(pasid_data);
+
+out:
+ spin_unlock_irqrestore(&sva_pdom->lock, flags);
+ return ret;
+}
+
+static struct pdom_pasid_data *get_pdom_pasid_data(struct protection_domain *pdom,
+ struct device *dev, ioasid_t pasid)
+{
+ struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
+ struct pdom_pasid_data *pasid_data;
+
+ list_for_each_entry(pasid_data, &pdom->pasid_list, pdom_link) {
+ if (pasid_data->pasid == pasid &&
+ pasid_data->dev_data == dev_data)
+ return pasid_data;
+ }
+
+ return NULL;
+}
+
+void amd_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
+{
+ struct pdom_pasid_data *pasid_data;
+ struct protection_domain *sva_pdom;
+ struct iommu_domain *domain;
+ struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
+ unsigned long flags;
+
+ if (pasid == 0 || pasid >= dev->iommu->max_pasids)
+ return;
+
+ /* Get protection domain */
+ domain = iommu_get_domain_for_dev_pasid(dev, pasid, IOMMU_DOMAIN_SVA);
+ if (!domain)
+ return;
+ sva_pdom = to_pdomain(domain);
+
+ /* Ensure that all queued faults have been processed */
+ iopf_queue_flush_dev(dev);
+
+ spin_lock_irqsave(&sva_pdom->lock, flags);
+
+ pasid_data = get_pdom_pasid_data(sva_pdom, dev, pasid);
+ if (!pasid_data) {
+ spin_unlock_irqrestore(&sva_pdom->lock, flags);
+ return;
+ }
+
+ list_del(&pasid_data->pdom_link);
+ kfree(pasid_data);
+
+ /* make it visible */
+ smp_wmb();
+
+ /* Update GCR3 table and flush IOTLB */
+ amd_iommu_clear_gcr3(dev_data, pasid);
+
+ spin_unlock_irqrestore(&sva_pdom->lock, flags);
+
+ if (list_empty(&sva_pdom->pasid_list))
+ mmu_notifier_unregister(&sva_pdom->mn, domain->mm);
+}
--
2.31.1
^ permalink raw reply related [flat|nested] 30+ messages in thread* Re: [PATCH v2 04/11] iommu/amd: Initial SVA support for AMD IOMMU
2023-09-11 12:10 ` [PATCH v2 04/11] iommu/amd: Initial SVA support for AMD IOMMU Vasant Hegde
@ 2023-09-12 16:47 ` Jason Gunthorpe
2023-09-15 8:50 ` Vasant Hegde
0 siblings, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2023-09-12 16:47 UTC (permalink / raw)
To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel
On Mon, Sep 11, 2023 at 12:10:39PM +0000, Vasant Hegde wrote:
> +static inline struct protection_domain *amd_iommu_get_pdomain(struct device *dev)
> +{
> + struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
> +
> + if (!dev_data || !dev_data->domain)
> + return NULL;
> +
> + return dev_data->domain;
> +}
Not used in this patch
> @@ -2192,6 +2188,11 @@ static int protection_domain_init_v2(struct protection_domain *pdom)
> return 0;
> }
>
> +static const struct iommu_domain_ops amd_svm_domain_ops = {
> + .set_dev_pasid = amd_iommu_set_dev_pasid,
> + .free = amd_iommu_domain_free
> +};
"amd_sva_domain_ops" please
> new file mode 100644
> index 000000000000..d18dc3f676b9
> --- /dev/null
> +++ b/drivers/iommu/amd/sva.c
> @@ -0,0 +1,158 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (C) 2023 Advanced Micro Devices, Inc.
> + */
> +
> +#define pr_fmt(fmt) "AMD-Vi: " fmt
> +#define dev_fmt(fmt) pr_fmt(fmt)
> +
> +#include <linux/iommu.h>
> +#include <linux/mm_types.h>
> +#include <linux/mmu_notifier.h>
> +
> +#include "amd_iommu.h"
> +
> +static void sva_mn_invalidate_range(struct mmu_notifier *mn,
> + struct mm_struct *mm,
> + unsigned long start, unsigned long end)
> +{
> + struct protection_domain *sva_pdom;
> + struct pdom_pasid_data *pasid_data;
> + struct iommu_dev_data *dev_data;
> +
> + sva_pdom = container_of(mn, struct protection_domain, mn);
> +
> + list_for_each_entry(pasid_data, &sva_pdom->pasid_list, pdom_link) {
Need to hold sva_pdom->lock if iterating over pasid_list
> + dev_data = pasid_data->dev_data;
> +
> + if ((start ^ (end - 1)) < PAGE_SIZE) {
> + amd_iommu_flush_page(dev_data->domain,
> + pasid_data->pasid, start);
> + } else {
> + amd_iommu_flush_tlb(dev_data->domain,
> + pasid_data->pasid);
> + }
It is baffling why dev_data->domain would be used here, and it isn't
locked properly so this is a problem.
Looking into __flush_pasid it doesn't really make sense. The PASID
domain definately can't assume that the RID domain has the the same
set of iommus. Most likely the SVA domain has a wider set of iommus,
at least I didn't notice any logic preventing this in set_dev_pasid.
Also, didn't you say you wanted to de-duplicate the IOTLB and ATC
invalidations like ARM is doing?
> +static const struct mmu_notifier_ops sva_mn = {
> + .invalidate_range = sva_mn_invalidate_range,
> + .release = sva_mn_release,
> +};
Why an empty release function?
That can't be right, when release is called the driver has to stop
using iommu_virt_to_phys(domain->mm->pgd)) because it will be freed
memory. It should replace it with a global empty pgd and flush caches
before release returns.
> +int amd_iommu_set_dev_pasid(struct iommu_domain *domain,
> + struct device *dev, ioasid_t pasid)
> +{
> + struct protection_domain *sva_pdom = to_pdomain(domain);
Just call this pdom, there is nothing about this function that is sva
specific anymore.
> + struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
> + struct pdom_pasid_data *pasid_data;
> + int ret = -EINVAL;
> + unsigned long flags;
> +
> + /* PASID zero is used for requests from the I/O device without PASID */
> + if (pasid == 0 || pasid >= dev->iommu->max_pasids)
> + return ret;
> +
> + /* Use SVA protection domain lock */
> + spin_lock_irqsave(&sva_pdom->lock, flags);
> +
> + /* Add PASID to protection domain pasid list */
> + pasid_data = kzalloc(sizeof(*pasid_data), GFP_KERNEL);
> + if (pasid_data == NULL) {
> + ret = -ENOMEM;
> + goto out;
> + }
> +
> + pasid_data->pasid = pasid;
> + pasid_data->dev_data = dev_data;
> +
> + /* Setup GCR3 table */
> + ret = amd_iommu_set_gcr3(dev_data, pasid,
> + iommu_virt_to_phys(domain->mm->pgd));
> + if (ret)
> + goto out_free_pasid_data;
> +
> + if (list_empty(&sva_pdom->pasid_list)) {
> + sva_pdom->mn.ops = &sva_mn;
> +
> + ret = mmu_notifier_register(&sva_pdom->mn, domain->mm);
> + if (ret)
> + goto out_clear_gcr3;
> + }
Looks to me like mmu_notifier_register() should be done before
amd_iommu_set_gcr3() so we don't have a risk of stale iotlb data.
> +
> + list_add(&pasid_data->pdom_link, &sva_pdom->pasid_list);
> + spin_unlock_irqrestore(&sva_pdom->lock, flags);
> + return ret;
> +
> +out_clear_gcr3:
> + amd_iommu_clear_gcr3(dev_data, pasid);
> +
> +out_free_pasid_data:
> + kfree(pasid_data);
> +
> +out:
> + spin_unlock_irqrestore(&sva_pdom->lock, flags);
> + return ret;
> +}
But this looks fine, it is entirely generic.
> +static struct pdom_pasid_data *get_pdom_pasid_data(struct protection_domain *pdom,
> + struct device *dev, ioasid_t pasid)
> +{
> + struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
> + struct pdom_pasid_data *pasid_data;
> +
> + list_for_each_entry(pasid_data, &pdom->pasid_list, pdom_link)
> {
Add a lockdep assert for pdom->lock here
> + if (pasid_data->pasid == pasid &&
> + pasid_data->dev_data == dev_data)
> + return pasid_data;
> + }
> +
> + return NULL;
> +}
> +
> +void amd_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
> +{
> + struct pdom_pasid_data *pasid_data;
> + struct protection_domain *sva_pdom;
> + struct iommu_domain *domain;
> + struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
> + unsigned long flags;
> +
> + if (pasid == 0 || pasid >= dev->iommu->max_pasids)
> + return;
> +
> + /* Get protection domain */
> + domain = iommu_get_domain_for_dev_pasid(dev, pasid, IOMMU_DOMAIN_SVA);
> + if (!domain)
> + return;
> + sva_pdom = to_pdomain(domain);
> +
> + /* Ensure that all queued faults have been processed */
> + iopf_queue_flush_dev(dev);
> +
> + spin_lock_irqsave(&sva_pdom->lock, flags);
> +
> + pasid_data = get_pdom_pasid_data(sva_pdom, dev, pasid);
> + if (!pasid_data) {
> + spin_unlock_irqrestore(&sva_pdom->lock, flags);
> + return;
> + }
> +
> + list_del(&pasid_data->pdom_link);
> + kfree(pasid_data);
> +
> + /* make it visible */
> + smp_wmb();
> +
> + /* Update GCR3 table and flush IOTLB */
> + amd_iommu_clear_gcr3(dev_data, pasid);
> +
> + spin_unlock_irqrestore(&sva_pdom->lock, flags);
> +
> + if (list_empty(&sva_pdom->pasid_list))
> + mmu_notifier_unregister(&sva_pdom->mn, domain->mm);
Can't drop the spinlock before doing this, it is racy.
It would be best if the mmu notifier was registered at domain
allocation time and freed at domain destruction
Jason
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [PATCH v2 04/11] iommu/amd: Initial SVA support for AMD IOMMU
2023-09-12 16:47 ` Jason Gunthorpe
@ 2023-09-15 8:50 ` Vasant Hegde
2023-09-18 12:53 ` Jason Gunthorpe
0 siblings, 1 reply; 30+ messages in thread
From: Vasant Hegde @ 2023-09-15 8:50 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel
Jason,
On 9/12/2023 10:17 PM, Jason Gunthorpe wrote:
> On Mon, Sep 11, 2023 at 12:10:39PM +0000, Vasant Hegde wrote:
>
>> +static inline struct protection_domain *amd_iommu_get_pdomain(struct device *dev)
>> +{
>> + struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
>> +
>> + if (!dev_data || !dev_data->domain)
>> + return NULL;
>> +
>> + return dev_data->domain;
>> +}
>
> Not used in this patch
Yeah. Forgot to move to different patch after reworking SVA code. Will fix it.
>
>> @@ -2192,6 +2188,11 @@ static int protection_domain_init_v2(struct protection_domain *pdom)
>> return 0;
>> }
>>
>> +static const struct iommu_domain_ops amd_svm_domain_ops = {
>> + .set_dev_pasid = amd_iommu_set_dev_pasid,
>> + .free = amd_iommu_domain_free
>> +};
>
> "amd_sva_domain_ops" please
Fixed.
>
>> new file mode 100644
>> index 000000000000..d18dc3f676b9
>> --- /dev/null
>> +++ b/drivers/iommu/amd/sva.c
>> @@ -0,0 +1,158 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Copyright (C) 2023 Advanced Micro Devices, Inc.
>> + */
>> +
>> +#define pr_fmt(fmt) "AMD-Vi: " fmt
>> +#define dev_fmt(fmt) pr_fmt(fmt)
>> +
>> +#include <linux/iommu.h>
>> +#include <linux/mm_types.h>
>> +#include <linux/mmu_notifier.h>
>> +
>> +#include "amd_iommu.h"
>> +
>> +static void sva_mn_invalidate_range(struct mmu_notifier *mn,
>> + struct mm_struct *mm,
>> + unsigned long start, unsigned long end)
>> +{
>> + struct protection_domain *sva_pdom;
>> + struct pdom_pasid_data *pasid_data;
>> + struct iommu_dev_data *dev_data;
>> +
>> + sva_pdom = container_of(mn, struct protection_domain, mn);
>> +
>> + list_for_each_entry(pasid_data, &sva_pdom->pasid_list, pdom_link) {
>
> Need to hold sva_pdom->lock if iterating over pasid_list
>
>> + dev_data = pasid_data->dev_data;
>> +
>> + if ((start ^ (end - 1)) < PAGE_SIZE) {
>> + amd_iommu_flush_page(dev_data->domain,
>> + pasid_data->pasid, start);
>> + } else {
>> + amd_iommu_flush_tlb(dev_data->domain,
>> + pasid_data->pasid);
>> + }
>
> It is baffling why dev_data->domain would be used here, and it isn't
> locked properly so this is a problem.
>
> Looking into __flush_pasid it doesn't really make sense. The PASID
> domain definately can't assume that the RID domain has the the same
> set of iommus. Most likely the SVA domain has a wider set of iommus,
> at least I didn't notice any logic preventing this in set_dev_pasid.
Right. SVA protection domain will just add dev/PASID combination without
worrying device protection domain/iommu.
In invalidation path, I walk the `pasid_list` from SVA protection domain, get
dev_data/PASID info and use those for invaliation. Since invalidation needs
device protection domain I have to refer dev_data->domain.
>
> Also, didn't you say you wanted to de-duplicate the IOTLB and ATC
> invalidations like ARM is doing?
I don't have the understanding of ARM. But our invalidation interfaces needs
improvement. Like current code does unncessary complete_wait() calls. Those will
be fixed along with other invalidation cleanup.
>
>> +static const struct mmu_notifier_ops sva_mn = {
>> + .invalidate_range = sva_mn_invalidate_range,
>> + .release = sva_mn_release,
>> +};
>
> Why an empty release function?
>
> That can't be right, when release is called the driver has to stop
> using iommu_virt_to_phys(domain->mm->pgd)) because it will be freed
> memory. It should replace it with a global empty pgd and flush caches
> before release returns.
In remove_dev_pasid() its already removed PASID from GCR3 table, flushing IOMMU.
Also if its last device unbind, then it also does mmu_notifier_unregister().
I thought remove_dev_pasid() always gets called before mmu notifier releases.
Is there any chance of mmu notifier relase() getting called before
remove_dev_pasid().
>
>> +int amd_iommu_set_dev_pasid(struct iommu_domain *domain,
>> + struct device *dev, ioasid_t pasid)
>> +{
>> + struct protection_domain *sva_pdom = to_pdomain(domain);
>
> Just call this pdom, there is nothing about this function that is sva
> specific anymore.
Ok.
>
>> + struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
>> + struct pdom_pasid_data *pasid_data;
>> + int ret = -EINVAL;
>> + unsigned long flags;
>> +
>> + /* PASID zero is used for requests from the I/O device without PASID */
>> + if (pasid == 0 || pasid >= dev->iommu->max_pasids)
>> + return ret;
>> +
>> + /* Use SVA protection domain lock */
>> + spin_lock_irqsave(&sva_pdom->lock, flags);
>> +
>> + /* Add PASID to protection domain pasid list */
>> + pasid_data = kzalloc(sizeof(*pasid_data), GFP_KERNEL);
>> + if (pasid_data == NULL) {
>> + ret = -ENOMEM;
>> + goto out;
>> + }
>> +
>> + pasid_data->pasid = pasid;
>> + pasid_data->dev_data = dev_data;
>> +
>> + /* Setup GCR3 table */
>> + ret = amd_iommu_set_gcr3(dev_data, pasid,
>> + iommu_virt_to_phys(domain->mm->pgd));
>> + if (ret)
>> + goto out_free_pasid_data;
>> +
>> + if (list_empty(&sva_pdom->pasid_list)) {
>> + sva_pdom->mn.ops = &sva_mn;
>> +
>> + ret = mmu_notifier_register(&sva_pdom->mn, domain->mm);
>> + if (ret)
>> + goto out_clear_gcr3;
>> + }
>
> Looks to me like mmu_notifier_register() should be done before
> amd_iommu_set_gcr3() so we don't have a risk of stale iotlb data.
Actually we need to update GCR3 table with PASID before registering mmu
notifier. So that hardware is ready to handle the invalidations. Otherwise we
will have a redudant invalidations.
>
>> +
>> + list_add(&pasid_data->pdom_link, &sva_pdom->pasid_list);
>> + spin_unlock_irqrestore(&sva_pdom->lock, flags);
>> + return ret;
>> +
>> +out_clear_gcr3:
>> + amd_iommu_clear_gcr3(dev_data, pasid);
>> +
>> +out_free_pasid_data:
>> + kfree(pasid_data);
>> +
>> +out:
>> + spin_unlock_irqrestore(&sva_pdom->lock, flags);
>> + return ret;
>> +}
>
> But this looks fine, it is entirely generic.
>
>> +static struct pdom_pasid_data *get_pdom_pasid_data(struct protection_domain *pdom,
>> + struct device *dev, ioasid_t pasid)
>> +{
>> + struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
>> + struct pdom_pasid_data *pasid_data;
>> +
>> + list_for_each_entry(pasid_data, &pdom->pasid_list, pdom_link)
>> {
>
> Add a lockdep assert for pdom->lock here
Ok.
>
>> + if (pasid_data->pasid == pasid &&
>> + pasid_data->dev_data == dev_data)
>> + return pasid_data;
>> + }
>> +
>> + return NULL;
>> +}
>> +
>> +void amd_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
>> +{
>> + struct pdom_pasid_data *pasid_data;
>> + struct protection_domain *sva_pdom;
>> + struct iommu_domain *domain;
>> + struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
>> + unsigned long flags;
>> +
>> + if (pasid == 0 || pasid >= dev->iommu->max_pasids)
>> + return;
>> +
>> + /* Get protection domain */
>> + domain = iommu_get_domain_for_dev_pasid(dev, pasid, IOMMU_DOMAIN_SVA);
>> + if (!domain)
>> + return;
>> + sva_pdom = to_pdomain(domain);
>> +
>> + /* Ensure that all queued faults have been processed */
>> + iopf_queue_flush_dev(dev);
>> +
>> + spin_lock_irqsave(&sva_pdom->lock, flags);
>> +
>> + pasid_data = get_pdom_pasid_data(sva_pdom, dev, pasid);
>> + if (!pasid_data) {
>> + spin_unlock_irqrestore(&sva_pdom->lock, flags);
>> + return;
>> + }
>> +
>> + list_del(&pasid_data->pdom_link);
>> + kfree(pasid_data);
>> +
>> + /* make it visible */
>> + smp_wmb();
>> +
>> + /* Update GCR3 table and flush IOTLB */
>> + amd_iommu_clear_gcr3(dev_data, pasid);
>> +
>> + spin_unlock_irqrestore(&sva_pdom->lock, flags);
>> +
>> + if (list_empty(&sva_pdom->pasid_list))
>> + mmu_notifier_unregister(&sva_pdom->mn, domain->mm);
>
> Can't drop the spinlock before doing this, it is racy.
Yeah. Its racy. But we cannot call mmu_notifier_unregister() with spinlock held.
May be I will introduce some lock to handle notifier.
-Vasant
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [PATCH v2 04/11] iommu/amd: Initial SVA support for AMD IOMMU
2023-09-15 8:50 ` Vasant Hegde
@ 2023-09-18 12:53 ` Jason Gunthorpe
2023-10-13 15:52 ` Vasant Hegde
0 siblings, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2023-09-18 12:53 UTC (permalink / raw)
To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel
On Fri, Sep 15, 2023 at 02:20:32PM +0530, Vasant Hegde wrote:
> >> + dev_data = pasid_data->dev_data;
> >> +
> >> + if ((start ^ (end - 1)) < PAGE_SIZE) {
> >> + amd_iommu_flush_page(dev_data->domain,
> >> + pasid_data->pasid, start);
> >> + } else {
> >> + amd_iommu_flush_tlb(dev_data->domain,
> >> + pasid_data->pasid);
> >> + }
> >
> > It is baffling why dev_data->domain would be used here, and it isn't
> > locked properly so this is a problem.
> >
> > Looking into __flush_pasid it doesn't really make sense. The PASID
> > domain definately can't assume that the RID domain has the the same
> > set of iommus. Most likely the SVA domain has a wider set of iommus,
> > at least I didn't notice any logic preventing this in set_dev_pasid.
>
> Right. SVA protection domain will just add dev/PASID combination without
> worrying device protection domain/iommu.
>
> In invalidation path, I walk the `pasid_list` from SVA protection domain, get
> dev_data/PASID info and use those for invaliation. Since invalidation needs
> device protection domain I have to refer dev_data->domain.
See my other email, this area looks quite troubled.
> > Also, didn't you say you wanted to de-duplicate the IOTLB and ATC
> > invalidations like ARM is doing?
>
> I don't have the understanding of ARM. But our invalidation
> interfaces needs improvement. Like current code does unncessary
> complete_wait() calls. Those will be fixed along with other
> invalidation cleanup.
Future fix is fine
> >
> >> +static const struct mmu_notifier_ops sva_mn = {
> >> + .invalidate_range = sva_mn_invalidate_range,
> >> + .release = sva_mn_release,
> >> +};
> >
> > Why an empty release function?
> >
> > That can't be right, when release is called the driver has to stop
> > using iommu_virt_to_phys(domain->mm->pgd)) because it will be freed
> > memory. It should replace it with a global empty pgd and flush caches
> > before release returns.
>
> In remove_dev_pasid() its already removed PASID from GCR3 table, flushing IOMMU.
> Also if its last device unbind, then it also does mmu_notifier_unregister().
>
> I thought remove_dev_pasid() always gets called before mmu notifier releases.
> Is there any chance of mmu notifier relase() getting called before
> remove_dev_pasid().
Yes it can be called in that order. An empty release function here is buggy.
> >> + pasid_data->pasid = pasid;
> >> + pasid_data->dev_data = dev_data;
> >> +
> >> + /* Setup GCR3 table */
> >> + ret = amd_iommu_set_gcr3(dev_data, pasid,
> >> + iommu_virt_to_phys(domain->mm->pgd));
> >> + if (ret)
> >> + goto out_free_pasid_data;
> >> +
> >> + if (list_empty(&sva_pdom->pasid_list)) {
> >> + sva_pdom->mn.ops = &sva_mn;
> >> +
> >> + ret = mmu_notifier_register(&sva_pdom->mn, domain->mm);
> >> + if (ret)
> >> + goto out_clear_gcr3;
> >> + }
> >
> > Looks to me like mmu_notifier_register() should be done before
> > amd_iommu_set_gcr3() so we don't have a risk of stale iotlb data.
>
> Actually we need to update GCR3 table with PASID before registering mmu
> notifier. So that hardware is ready to handle the invalidations. Otherwise we
> will have a redudant invalidations.
If you register the mmu and the paslid_list is empty (or locked) then the
invalidation handler does nothing. So this cannot be a concern.
Putting it out of order like this creates an edge case where the HW
could concivable cache a translation and miss a notification. It is
definately wrong.
> >> +void amd_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
> >> +{
> >> + struct pdom_pasid_data *pasid_data;
> >> + struct protection_domain *sva_pdom;
> >> + struct iommu_domain *domain;
> >> + struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
> >> + unsigned long flags;
> >> +
> >> + if (pasid == 0 || pasid >= dev->iommu->max_pasids)
> >> + return;
> >> +
> >> + /* Get protection domain */
> >> + domain = iommu_get_domain_for_dev_pasid(dev, pasid, IOMMU_DOMAIN_SVA);
> >> + if (!domain)
> >> + return;
> >> + sva_pdom = to_pdomain(domain);
> >> +
> >> + /* Ensure that all queued faults have been processed */
> >> + iopf_queue_flush_dev(dev);
> >> +
> >> + spin_lock_irqsave(&sva_pdom->lock, flags);
> >> +
> >> + pasid_data = get_pdom_pasid_data(sva_pdom, dev, pasid);
> >> + if (!pasid_data) {
> >> + spin_unlock_irqrestore(&sva_pdom->lock, flags);
> >> + return;
> >> + }
> >> +
> >> + list_del(&pasid_data->pdom_link);
> >> + kfree(pasid_data);
> >> +
> >> + /* make it visible */
> >> + smp_wmb();
> >> +
> >> + /* Update GCR3 table and flush IOTLB */
> >> + amd_iommu_clear_gcr3(dev_data, pasid);
> >> +
> >> + spin_unlock_irqrestore(&sva_pdom->lock, flags);
> >> +
> >> + if (list_empty(&sva_pdom->pasid_list))
> >> + mmu_notifier_unregister(&sva_pdom->mn, domain->mm);
> >
> > Can't drop the spinlock before doing this, it is racy.
>
> Yeah. Its racy. But we cannot call mmu_notifier_unregister() with spinlock held.
> May be I will introduce some lock to handle notifier.
If you reoganize things so the notifier is always registered then you
don't need to have the spinlock protecting it anymore. this requires
putting the unregister in the domain deallocation routine.
Jason
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [PATCH v2 04/11] iommu/amd: Initial SVA support for AMD IOMMU
2023-09-18 12:53 ` Jason Gunthorpe
@ 2023-10-13 15:52 ` Vasant Hegde
2023-10-13 15:58 ` Jason Gunthorpe
0 siblings, 1 reply; 30+ messages in thread
From: Vasant Hegde @ 2023-10-13 15:52 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel
On 9/18/2023 6:23 PM, Jason Gunthorpe wrote:
> On Fri, Sep 15, 2023 at 02:20:32PM +0530, Vasant Hegde wrote:
>
>>>> + dev_data = pasid_data->dev_data;
>>>> +
>>>> + if ((start ^ (end - 1)) < PAGE_SIZE) {
>>>> + amd_iommu_flush_page(dev_data->domain,
>>>> + pasid_data->pasid, start);
>>>> + } else {
>>>> + amd_iommu_flush_tlb(dev_data->domain,
>>>> + pasid_data->pasid);
>>>> + }
>>>
>>> It is baffling why dev_data->domain would be used here, and it isn't
>>> locked properly so this is a problem.
>>>
>>> Looking into __flush_pasid it doesn't really make sense. The PASID
>>> domain definately can't assume that the RID domain has the the same
>>> set of iommus. Most likely the SVA domain has a wider set of iommus,
>>> at least I didn't notice any logic preventing this in set_dev_pasid.
>>
>> Right. SVA protection domain will just add dev/PASID combination without
>> worrying device protection domain/iommu.
>>
>> In invalidation path, I walk the `pasid_list` from SVA protection domain, get
>> dev_data/PASID info and use those for invaliation. Since invalidation needs
>> device protection domain I have to refer dev_data->domain.
>
> See my other email, this area looks quite troubled.
>
>>> Also, didn't you say you wanted to de-duplicate the IOTLB and ATC
>>> invalidations like ARM is doing?
>>
>> I don't have the understanding of ARM. But our invalidation
>> interfaces needs improvement. Like current code does unncessary
>> complete_wait() calls. Those will be fixed along with other
>> invalidation cleanup.
>
> Future fix is fine
>
>>>
>>>> +static const struct mmu_notifier_ops sva_mn = {
>>>> + .invalidate_range = sva_mn_invalidate_range,
>>>> + .release = sva_mn_release,
>>>> +};
>>>
>>> Why an empty release function?
>>>
>>> That can't be right, when release is called the driver has to stop
>>> using iommu_virt_to_phys(domain->mm->pgd)) because it will be freed
>>> memory. It should replace it with a global empty pgd and flush caches
>>> before release returns.
>>
>> In remove_dev_pasid() its already removed PASID from GCR3 table, flushing IOMMU.
>> Also if its last device unbind, then it also does mmu_notifier_unregister().
>>
>> I thought remove_dev_pasid() always gets called before mmu notifier releases.
>> Is there any chance of mmu notifier relase() getting called before
>> remove_dev_pasid().
>
> Yes it can be called in that order. An empty release function here is buggy.
Then I misunderstood this part. Will fix it in v3.
>
>>>> + pasid_data->pasid = pasid;
>>>> + pasid_data->dev_data = dev_data;
>>>> +
>>>> + /* Setup GCR3 table */
>>>> + ret = amd_iommu_set_gcr3(dev_data, pasid,
>>>> + iommu_virt_to_phys(domain->mm->pgd));
>>>> + if (ret)
>>>> + goto out_free_pasid_data;
>>>> +
>>>> + if (list_empty(&sva_pdom->pasid_list)) {
>>>> + sva_pdom->mn.ops = &sva_mn;
>>>> +
>>>> + ret = mmu_notifier_register(&sva_pdom->mn, domain->mm);
>>>> + if (ret)
>>>> + goto out_clear_gcr3;
>>>> + }
>>>
>>> Looks to me like mmu_notifier_register() should be done before
>>> amd_iommu_set_gcr3() so we don't have a risk of stale iotlb data.
>>
>> Actually we need to update GCR3 table with PASID before registering mmu
>> notifier. So that hardware is ready to handle the invalidations. Otherwise we
>> will have a redudant invalidations.
>
> If you register the mmu and the paslid_list is empty (or locked) then the
> invalidation handler does nothing. So this cannot be a concern.
Yeah. With all these changes I think changing order is fine.
>
> Putting it out of order like this creates an edge case where the HW
> could concivable cache a translation and miss a notification. It is
> definately wrong.
>
>>>> +void amd_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
>>>> +{
>>>> + struct pdom_pasid_data *pasid_data;
>>>> + struct protection_domain *sva_pdom;
>>>> + struct iommu_domain *domain;
>>>> + struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
>>>> + unsigned long flags;
>>>> +
>>>> + if (pasid == 0 || pasid >= dev->iommu->max_pasids)
>>>> + return;
>>>> +
>>>> + /* Get protection domain */
>>>> + domain = iommu_get_domain_for_dev_pasid(dev, pasid, IOMMU_DOMAIN_SVA);
>>>> + if (!domain)
>>>> + return;
>>>> + sva_pdom = to_pdomain(domain);
>>>> +
>>>> + /* Ensure that all queued faults have been processed */
>>>> + iopf_queue_flush_dev(dev);
>>>> +
>>>> + spin_lock_irqsave(&sva_pdom->lock, flags);
>>>> +
>>>> + pasid_data = get_pdom_pasid_data(sva_pdom, dev, pasid);
>>>> + if (!pasid_data) {
>>>> + spin_unlock_irqrestore(&sva_pdom->lock, flags);
>>>> + return;
>>>> + }
>>>> +
>>>> + list_del(&pasid_data->pdom_link);
>>>> + kfree(pasid_data);
>>>> +
>>>> + /* make it visible */
>>>> + smp_wmb();
>>>> +
>>>> + /* Update GCR3 table and flush IOTLB */
>>>> + amd_iommu_clear_gcr3(dev_data, pasid);
>>>> +
>>>> + spin_unlock_irqrestore(&sva_pdom->lock, flags);
>>>> +
>>>> + if (list_empty(&sva_pdom->pasid_list))
>>>> + mmu_notifier_unregister(&sva_pdom->mn, domain->mm);
>>>
>>> Can't drop the spinlock before doing this, it is racy.
>>
>> Yeah. Its racy. But we cannot call mmu_notifier_unregister() with spinlock held.
>> May be I will introduce some lock to handle notifier.
>
> If you reoganize things so the notifier is always registered then you
> don't need to have the spinlock protecting it anymore. this requires
> putting the unregister in the domain deallocation routine.
Yeah. I will put unregister in de-allocation path instead of adding another lock.
-Vasant
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [PATCH v2 04/11] iommu/amd: Initial SVA support for AMD IOMMU
2023-10-13 15:52 ` Vasant Hegde
@ 2023-10-13 15:58 ` Jason Gunthorpe
0 siblings, 0 replies; 30+ messages in thread
From: Jason Gunthorpe @ 2023-10-13 15:58 UTC (permalink / raw)
To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel
On Fri, Oct 13, 2023 at 09:22:20PM +0530, Vasant Hegde wrote:
> > If you reoganize things so the notifier is always registered then you
> > don't need to have the spinlock protecting it anymore. this requires
> > putting the unregister in the domain deallocation routine.
>
> Yeah. I will put unregister in de-allocation path instead of adding another lock.
If you inspect the series I posted here (actually just get it from
github and look at it) you can see an example of how to structure all
of this to avoid all of the concerns in this thread:
https://lore.kernel.org/linux-iommu/0-v1-afbb86647bbd+5-smmuv3_newapi_p2_jgg@nvidia.com/
ttps://github.com/jgunthorpe/linux/commits/smmuv3_newapi
Feel free to take the patch introducing the new core op to your
series, it really helps alot to make this sane and clean.
Notice how little of the code that is left is actually SVA specific.
When structured properly the majority of the work is to enable PASID
support. SVA is just a small bit on top to manage the mmu_notifier.
Jason
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH v2 05/11] iommu/amd: Add support to enable/disable PASID feature
2023-09-11 12:10 [PATCH v2 00/11] iommu/amd: SVA Support (Part 4) - SVA and IOPF Vasant Hegde
` (3 preceding siblings ...)
2023-09-11 12:10 ` [PATCH v2 04/11] iommu/amd: Initial SVA support for AMD IOMMU Vasant Hegde
@ 2023-09-11 12:10 ` Vasant Hegde
2023-09-12 16:57 ` Jason Gunthorpe
2023-09-11 12:10 ` [PATCH v2 06/11] iommu/amd: Move PPR-related functions into ppr.c Vasant Hegde
` (6 subsequent siblings)
11 siblings, 1 reply; 30+ messages in thread
From: Vasant Hegde @ 2023-09-11 12:10 UTC (permalink / raw)
To: iommu, joro
Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde
From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
It seems iommu_dev_enable_feature(SVA) will be deprecated soon. Hence in
this path we just return success.
Instead we add necessary check during pasid bind to device. IN this path
it checks whether device GCR3 table is setup or not. If not it will try
to setup the GCR3.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
drivers/iommu/amd/amd_iommu.h | 2 ++
drivers/iommu/amd/iommu.c | 50 +++++++++++++++++++++++++++
drivers/iommu/amd/sva.c | 63 +++++++++++++++++++++++++++++++++++
3 files changed, 115 insertions(+)
diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 7159b2610702..f7f576b8c8f6 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -43,6 +43,8 @@ bool amd_iommu_sva_supported(void);
int amd_iommu_set_dev_pasid(struct iommu_domain *domain,
struct device *dev, ioasid_t pasid);
void amd_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid);
+int amd_iommu_gcr3_init(struct iommu_dev_data *dev_data, int pasids);
+void amd_iommu_gcr3_uninit(struct iommu_dev_data *dev_data);
struct amd_iommu *get_amd_iommu(unsigned int idx);
u8 amd_iommu_pc_get_max_banks(unsigned int idx);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 94eec3dac8f6..5a9749cfe14e 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2162,6 +2162,50 @@ static void protection_domain_free(struct protection_domain *domain)
kfree(domain);
}
+/*******************************
+ * PASID setup related helper functions
+ */
+static inline bool pdom_is_pt_mode(struct protection_domain *pdom)
+{
+ return (pdom->domain.type == IOMMU_DOMAIN_IDENTITY);
+}
+
+static inline bool pdom_is_v2_pgtbl_mode(struct protection_domain *pdom)
+{
+ return (pdom->iop.pgd != NULL);
+}
+
+int amd_iommu_gcr3_init(struct iommu_dev_data *dev_data, int pasids)
+{
+ struct protection_domain *pdom = dev_data->domain;
+ int ret = 0;
+
+ lockdep_assert_held(&dev_data->lock);
+
+ /*
+ * We cannot support PASID w/ existing v1 page table in the same domain
+ * since it will be nested. However, existing domain w/ v2 page table
+ * can be used for PASID.
+ */
+ if (pdom->pd_mode == PD_MODE_V1)
+ return -EOPNOTSUPP;
+
+ /* Allocate GCR3 table */
+ if (pdom_is_pt_mode(dev_data->domain))
+ ret = setup_gcr3_table(dev_data, pasids);
+
+ return ret;
+}
+
+void amd_iommu_gcr3_uninit(struct iommu_dev_data *dev_data)
+{
+ lockdep_assert_held(&dev_data->lock);
+
+ /* Free GCR3 table */
+ if (pdom_is_pt_mode(dev_data->domain))
+ free_gcr3_table(dev_data);
+}
+
static int protection_domain_init_v1(struct protection_domain *domain, int mode)
{
u64 *pt_root = NULL;
@@ -2579,6 +2623,9 @@ static int amd_iommu_dev_enable_feature(struct device *dev,
int ret;
switch (feat) {
+ case IOMMU_DEV_FEAT_SVA:
+ ret = 0;
+ break;
default:
ret = -EINVAL;
break;
@@ -2592,6 +2639,9 @@ static int amd_iommu_dev_disable_feature(struct device *dev,
int ret;
switch (feat) {
+ case IOMMU_DEV_FEAT_SVA:
+ ret = 0;
+ break;
default:
ret = -EINVAL;
break;
diff --git a/drivers/iommu/amd/sva.c b/drivers/iommu/amd/sva.c
index d18dc3f676b9..4cf1fb7864c4 100644
--- a/drivers/iommu/amd/sva.c
+++ b/drivers/iommu/amd/sva.c
@@ -12,6 +12,60 @@
#include "amd_iommu.h"
+static inline bool is_pasid_enabled(struct iommu_dev_data *dev_data)
+{
+ if (dev_data->gcr3_info.gcr3_tbl != NULL &&
+ dev_data->gcr3_info.pasid_cnt != 0)
+ return true;
+
+ return false;
+}
+
+static int amd_iommu_pasid_enable(struct iommu_dev_data *dev_data)
+{
+ struct device *dev = dev_data->dev;
+ unsigned long flags;
+ int ret = 0;
+
+ spin_lock_irqsave(&dev_data->lock, flags);
+
+ if (dev_data->gcr3_info.gcr3_tbl != NULL &&
+ dev_data->gcr3_info.pasid_cnt != 0)
+ goto out;
+
+ if (!amd_iommu_sva_supported()) {
+ ret = -ENODEV;
+ goto out;
+ }
+
+ if (!dev_data->pasid_enabled) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ ret = amd_iommu_gcr3_init(dev_data, dev->iommu->max_pasids);
+
+out:
+ spin_unlock_irqrestore(&dev_data->lock, flags);
+ return ret;
+}
+
+static void amd_iommu_pasid_disable(struct iommu_dev_data *dev_data)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&dev_data->lock, flags);
+
+ if (dev_data->gcr3_info.gcr3_tbl == NULL)
+ return;
+
+ if (dev_data->gcr3_info.pasid_cnt != 0)
+ return;
+
+ amd_iommu_gcr3_uninit(dev_data);
+ spin_unlock_irqrestore(&dev_data->lock, flags);
+}
+
static void sva_mn_invalidate_range(struct mmu_notifier *mn,
struct mm_struct *mm,
unsigned long start, unsigned long end)
@@ -60,6 +114,13 @@ int amd_iommu_set_dev_pasid(struct iommu_domain *domain,
/* Use SVA protection domain lock */
spin_lock_irqsave(&sva_pdom->lock, flags);
+ /* Make sure PASID is enabled */
+ if (!is_pasid_enabled(dev_data)) {
+ ret = amd_iommu_pasid_enable(dev_data);
+ if (ret)
+ goto out;
+ }
+
/* Add PASID to protection domain pasid list */
pasid_data = kzalloc(sizeof(*pasid_data), GFP_KERNEL);
if (pasid_data == NULL) {
@@ -151,6 +212,8 @@ void amd_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
/* Update GCR3 table and flush IOTLB */
amd_iommu_clear_gcr3(dev_data, pasid);
+ amd_iommu_pasid_disable(dev_data);
+
spin_unlock_irqrestore(&sva_pdom->lock, flags);
if (list_empty(&sva_pdom->pasid_list))
--
2.31.1
^ permalink raw reply related [flat|nested] 30+ messages in thread* Re: [PATCH v2 05/11] iommu/amd: Add support to enable/disable PASID feature
2023-09-11 12:10 ` [PATCH v2 05/11] iommu/amd: Add support to enable/disable PASID feature Vasant Hegde
@ 2023-09-12 16:57 ` Jason Gunthorpe
2023-09-15 8:57 ` Vasant Hegde
0 siblings, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2023-09-12 16:57 UTC (permalink / raw)
To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel
On Mon, Sep 11, 2023 at 12:10:40PM +0000, Vasant Hegde wrote:
> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>
> It seems iommu_dev_enable_feature(SVA) will be deprecated soon. Hence in
> this path we just return success.
>
> Instead we add necessary check during pasid bind to device. IN this path
> it checks whether device GCR3 table is setup or not. If not it will try
> to setup the GCR3.
Yes, though be mindful that switching from IDENTITY v1 to IDENTITY v2
on the RID (eg switching the DTE to/from GCR3 mode) should be atomic.
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index 94eec3dac8f6..5a9749cfe14e 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -2162,6 +2162,50 @@ static void protection_domain_free(struct protection_domain *domain)
> kfree(domain);
> }
>
> +/*******************************
> + * PASID setup related helper functions
> + */
> +static inline bool pdom_is_pt_mode(struct protection_domain *pdom)
> +{
> + return (pdom->domain.type == IOMMU_DOMAIN_IDENTITY);
> +}
> +
> +static inline bool pdom_is_v2_pgtbl_mode(struct protection_domain *pdom)
> +{
> + return (pdom->iop.pgd != NULL);
> +}
> +
> +int amd_iommu_gcr3_init(struct iommu_dev_data *dev_data, int pasids)
> +{
> + struct protection_domain *pdom = dev_data->domain;
> + int ret = 0;
> +
> + lockdep_assert_held(&dev_data->lock);
> +
> + /*
> + * We cannot support PASID w/ existing v1 page table in the same domain
> + * since it will be nested. However, existing domain w/ v2 page table
> + * can be used for PASID.
> + */
> + if (pdom->pd_mode == PD_MODE_V1)
> + return -EOPNOTSUPP;
> +
> + /* Allocate GCR3 table */
> + if (pdom_is_pt_mode(dev_data->domain))
It seems obfuscating to put this test for IOMMU_DOMAIN_IDENTITY in a
function..
> + ret = setup_gcr3_table(dev_data, pasids);
This is the DTE change that should be atomic
Jason
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [PATCH v2 05/11] iommu/amd: Add support to enable/disable PASID feature
2023-09-12 16:57 ` Jason Gunthorpe
@ 2023-09-15 8:57 ` Vasant Hegde
0 siblings, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-09-15 8:57 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel
Jason,
On 9/12/2023 10:27 PM, Jason Gunthorpe wrote:
> On Mon, Sep 11, 2023 at 12:10:40PM +0000, Vasant Hegde wrote:
>> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>>
>> It seems iommu_dev_enable_feature(SVA) will be deprecated soon. Hence in
>> this path we just return success.
>>
>> Instead we add necessary check during pasid bind to device. IN this path
>> it checks whether device GCR3 table is setup or not. If not it will try
>> to setup the GCR3.
>
> Yes, though be mindful that switching from IDENTITY v1 to IDENTITY v2
> on the RID (eg switching the DTE to/from GCR3 mode) should be atomic.
Right. Its atomic. It should work fine. I have to be careful when I add support
for PASID table expansion. But its doable.
>
>> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
>> index 94eec3dac8f6..5a9749cfe14e 100644
>> --- a/drivers/iommu/amd/iommu.c
>> +++ b/drivers/iommu/amd/iommu.c
>> @@ -2162,6 +2162,50 @@ static void protection_domain_free(struct protection_domain *domain)
>> kfree(domain);
>> }
>>
>> +/*******************************
>> + * PASID setup related helper functions
>> + */
>> +static inline bool pdom_is_pt_mode(struct protection_domain *pdom)
>> +{
>> + return (pdom->domain.type == IOMMU_DOMAIN_IDENTITY);
>> +}
>> +
>> +static inline bool pdom_is_v2_pgtbl_mode(struct protection_domain *pdom)
>> +{
>> + return (pdom->iop.pgd != NULL);
>> +}
>> +
>> +int amd_iommu_gcr3_init(struct iommu_dev_data *dev_data, int pasids)
>> +{
>> + struct protection_domain *pdom = dev_data->domain;
>> + int ret = 0;
>> +
>> + lockdep_assert_held(&dev_data->lock);
>> +
>> + /*
>> + * We cannot support PASID w/ existing v1 page table in the same domain
>> + * since it will be nested. However, existing domain w/ v2 page table
>> + * can be used for PASID.
>> + */
>> + if (pdom->pd_mode == PD_MODE_V1)
>> + return -EOPNOTSUPP;
>> +
>> + /* Allocate GCR3 table */
>> + if (pdom_is_pt_mode(dev_data->domain))
>
> It seems obfuscating to put this test for IOMMU_DOMAIN_IDENTITY in a
> function..
It needs to know how domain is configured. Hence I have this check.
>
>> + ret = setup_gcr3_table(dev_data, pasids);
>
> This is the DTE change that should be atomic
It gets called with dev_data lock. And it doesn't break ongoing translations. So
it should be fine.
-Vasant
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH v2 06/11] iommu/amd: Move PPR-related functions into ppr.c
2023-09-11 12:10 [PATCH v2 00/11] iommu/amd: SVA Support (Part 4) - SVA and IOPF Vasant Hegde
` (4 preceding siblings ...)
2023-09-11 12:10 ` [PATCH v2 05/11] iommu/amd: Add support to enable/disable PASID feature Vasant Hegde
@ 2023-09-11 12:10 ` Vasant Hegde
2023-09-11 12:10 ` [PATCH v2 07/11] iommu/amd: Define per-IOMMU iopf_queue Vasant Hegde
` (5 subsequent siblings)
11 siblings, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-09-11 12:10 UTC (permalink / raw)
To: iommu, joro
Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde
From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
In preparation to subsequent PPR-related patches, and also remove static
declaration for certain helper functions so that it can be reused in other
files.
Also rename below functions:
alloc_ppr_log -> amd_iommu_alloc_ppr_log
iommu_enable_ppr_log -> amd_iommu_enable_ppr_log
free_ppr_log -> amd_iommu_free_ppr_log
iommu_poll_ppr_log -> amd_iommu_poll_ppr_log
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
drivers/iommu/amd/Makefile | 2 +-
drivers/iommu/amd/amd_iommu.h | 17 +++++-
drivers/iommu/amd/init.c | 65 +++-----------------
drivers/iommu/amd/iommu.c | 55 +----------------
drivers/iommu/amd/ppr.c | 112 ++++++++++++++++++++++++++++++++++
5 files changed, 137 insertions(+), 114 deletions(-)
create mode 100644 drivers/iommu/amd/ppr.c
diff --git a/drivers/iommu/amd/Makefile b/drivers/iommu/amd/Makefile
index 848c89fa5238..a6601d885c5b 100644
--- a/drivers/iommu/amd/Makefile
+++ b/drivers/iommu/amd/Makefile
@@ -1,3 +1,3 @@
# SPDX-License-Identifier: GPL-2.0-only
-obj-$(CONFIG_AMD_IOMMU) += iommu.o init.o quirks.o io_pgtable.o io_pgtable_v2.o sva.o
+obj-$(CONFIG_AMD_IOMMU) += iommu.o init.o quirks.o io_pgtable.o io_pgtable_v2.o sva.o ppr.o
obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += debugfs.o
diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index f7f576b8c8f6..f7ec33a81fc1 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -17,10 +17,16 @@ irqreturn_t amd_iommu_int_thread_pprlog(int irq, void *data);
irqreturn_t amd_iommu_int_thread_galog(int irq, void *data);
irqreturn_t amd_iommu_int_handler(int irq, void *data);
void amd_iommu_apply_erratum_63(struct amd_iommu *iommu, u16 devid);
+void amd_iommu_restart_log(struct amd_iommu *iommu, const char *evt_type,
+ u8 cntrl_intr, u8 cntrl_log,
+ u32 status_run_mask, u32 status_overflow_mask);
void amd_iommu_restart_event_logging(struct amd_iommu *iommu);
void amd_iommu_restart_ga_log(struct amd_iommu *iommu);
void amd_iommu_restart_ppr_log(struct amd_iommu *iommu);
void amd_iommu_set_rlookup_table(struct amd_iommu *iommu, u16 devid);
+void iommu_feature_enable(struct amd_iommu *iommu, u8 bit);
+void *__init iommu_alloc_4k_pages(struct amd_iommu *iommu,
+ gfp_t gfp, size_t size);
#ifdef CONFIG_AMD_IOMMU_DEBUGFS
void amd_iommu_debugfs_setup(struct amd_iommu *iommu);
@@ -64,6 +70,14 @@ int amd_iommu_set_gcr3(struct iommu_dev_data *dev_data,
u32 pasid, unsigned long gcr3);
int amd_iommu_clear_gcr3(struct iommu_dev_data *dev_data, u32 pasid);
+/* PPR */
+int __init amd_iommu_alloc_ppr_log(struct amd_iommu *iommu);
+void __init amd_iommu_free_ppr_log(struct amd_iommu *iommu);
+void amd_iommu_enable_ppr_log(struct amd_iommu *iommu);
+void amd_iommu_poll_ppr_log(struct amd_iommu *iommu);
+int amd_iommu_complete_ppr(struct pci_dev *pdev, u32 pasid,
+ int status, int tag);
+
int amd_iommu_flush_page(struct protection_domain *domain, u32 pasid, u64 address);
void amd_iommu_update_and_flush_device_table(struct protection_domain *domain);
void amd_iommu_domain_update(struct protection_domain *domain);
@@ -84,9 +98,6 @@ static inline int amd_iommu_create_irq_domain(struct amd_iommu *iommu)
#define PPR_INVALID 0x1
#define PPR_FAILURE 0xf
-int amd_iommu_complete_ppr(struct pci_dev *pdev, u32 pasid,
- int status, int tag);
-
static inline bool is_rd890_iommu(struct pci_dev *pdev)
{
return (pdev->vendor == PCI_VENDOR_ID_ATI) &&
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 3820ae738088..a20a7f998ecd 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -418,7 +418,7 @@ static void iommu_set_device_table(struct amd_iommu *iommu)
}
/* Generic functions to enable/disable certain features of the IOMMU. */
-static void iommu_feature_enable(struct amd_iommu *iommu, u8 bit)
+void iommu_feature_enable(struct amd_iommu *iommu, u8 bit)
{
u64 ctrl;
@@ -745,9 +745,9 @@ static int __init alloc_command_buffer(struct amd_iommu *iommu)
* Interrupt handler has processed all pending events and adjusted head
* and tail pointer. Reset overflow mask and restart logging again.
*/
-static void amd_iommu_restart_log(struct amd_iommu *iommu, const char *evt_type,
- u8 cntrl_intr, u8 cntrl_log,
- u32 status_run_mask, u32 status_overflow_mask)
+void amd_iommu_restart_log(struct amd_iommu *iommu, const char *evt_type,
+ u8 cntrl_intr, u8 cntrl_log,
+ u32 status_run_mask, u32 status_overflow_mask)
{
u32 status;
@@ -788,17 +788,6 @@ void amd_iommu_restart_ga_log(struct amd_iommu *iommu)
MMIO_STATUS_GALOG_OVERFLOW_MASK);
}
-/*
- * This function restarts ppr logging in case the IOMMU experienced
- * PPR log overflow.
- */
-void amd_iommu_restart_ppr_log(struct amd_iommu *iommu)
-{
- amd_iommu_restart_log(iommu, "PPR", CONTROL_PPRINT_EN,
- CONTROL_PPRLOG_EN, MMIO_STATUS_PPR_RUN_MASK,
- MMIO_STATUS_PPR_OVERFLOW_MASK);
-}
-
/*
* This function resets the command buffer if the IOMMU stopped fetching
* commands from it.
@@ -847,8 +836,8 @@ static void __init free_command_buffer(struct amd_iommu *iommu)
free_pages((unsigned long)iommu->cmd_buf, get_order(CMD_BUFFER_SIZE));
}
-static void *__init iommu_alloc_4k_pages(struct amd_iommu *iommu,
- gfp_t gfp, size_t size)
+void *__init iommu_alloc_4k_pages(struct amd_iommu *iommu, gfp_t gfp,
+ size_t size)
{
int order = get_order(size);
void *buf = (void *)__get_free_pages(gfp, order);
@@ -903,42 +892,6 @@ static void __init free_event_buffer(struct amd_iommu *iommu)
free_pages((unsigned long)iommu->evt_buf, get_order(EVT_BUFFER_SIZE));
}
-/* allocates the memory where the IOMMU will log its events to */
-static int __init alloc_ppr_log(struct amd_iommu *iommu)
-{
- iommu->ppr_log = iommu_alloc_4k_pages(iommu, GFP_KERNEL | __GFP_ZERO,
- PPR_LOG_SIZE);
-
- return iommu->ppr_log ? 0 : -ENOMEM;
-}
-
-static void iommu_enable_ppr_log(struct amd_iommu *iommu)
-{
- u64 entry;
-
- if (iommu->ppr_log == NULL)
- return;
-
- iommu_feature_enable(iommu, CONTROL_PPR_EN);
-
- entry = iommu_virt_to_phys(iommu->ppr_log) | PPR_LOG_SIZE_512;
-
- memcpy_toio(iommu->mmio_base + MMIO_PPR_LOG_OFFSET,
- &entry, sizeof(entry));
-
- /* set head and tail to zero manually */
- writel(0x00, iommu->mmio_base + MMIO_PPR_HEAD_OFFSET);
- writel(0x00, iommu->mmio_base + MMIO_PPR_TAIL_OFFSET);
-
- iommu_feature_enable(iommu, CONTROL_PPRLOG_EN);
- iommu_feature_enable(iommu, CONTROL_PPRINT_EN);
-}
-
-static void __init free_ppr_log(struct amd_iommu *iommu)
-{
- free_pages((unsigned long)iommu->ppr_log, get_order(PPR_LOG_SIZE));
-}
-
static void free_ga_log(struct amd_iommu *iommu)
{
#ifdef CONFIG_IRQ_REMAP
@@ -1682,7 +1635,7 @@ static void __init free_iommu_one(struct amd_iommu *iommu)
free_cwwb_sem(iommu);
free_command_buffer(iommu);
free_event_buffer(iommu);
- free_ppr_log(iommu);
+ amd_iommu_free_ppr_log(iommu);
free_ga_log(iommu);
iommu_unmap_mmio_space(iommu);
}
@@ -2095,7 +2048,7 @@ static int __init iommu_init_pci(struct amd_iommu *iommu)
amd_iommu_max_glx_val = min(amd_iommu_max_glx_val, glxval);
}
- if (check_feature(FEATURE_PPR) && alloc_ppr_log(iommu))
+ if (check_feature(FEATURE_PPR) && amd_iommu_alloc_ppr_log(iommu))
return -ENOMEM;
if (iommu->cap & (1UL << IOMMU_CAP_NPCACHE)) {
@@ -2839,7 +2792,7 @@ static void enable_iommus_v2(void)
struct amd_iommu *iommu;
for_each_iommu(iommu) {
- iommu_enable_ppr_log(iommu);
+ amd_iommu_enable_ppr_log(iommu);
iommu_enable_gt(iommu);
}
}
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 5a9749cfe14e..e11a705df665 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -820,59 +820,6 @@ static void iommu_poll_events(struct amd_iommu *iommu)
writel(head, iommu->mmio_base + MMIO_EVT_HEAD_OFFSET);
}
-static void iommu_poll_ppr_log(struct amd_iommu *iommu)
-{
- u32 head, tail;
-
- if (iommu->ppr_log == NULL)
- return;
-
- head = readl(iommu->mmio_base + MMIO_PPR_HEAD_OFFSET);
- tail = readl(iommu->mmio_base + MMIO_PPR_TAIL_OFFSET);
-
- while (head != tail) {
- volatile u64 *raw;
- u64 entry[2];
- int i;
-
- raw = (u64 *)(iommu->ppr_log + head);
-
- /*
- * Hardware bug: Interrupt may arrive before the entry is
- * written to memory. If this happens we need to wait for the
- * entry to arrive.
- */
- for (i = 0; i < LOOP_TIMEOUT; ++i) {
- if (PPR_REQ_TYPE(raw[0]) != 0)
- break;
- udelay(1);
- }
-
- /* Avoid memcpy function-call overhead */
- entry[0] = raw[0];
- entry[1] = raw[1];
-
- /*
- * To detect the hardware errata 733 we need to clear the
- * entry back to zero. This issue does not exist on SNP
- * enabled system. Also this buffer is not writeable on
- * SNP enabled system.
- */
- if (!amd_iommu_snp_en)
- raw[0] = raw[1] = 0UL;
-
- /* Update head pointer of hardware ring-buffer */
- head = (head + PPR_ENTRY_SIZE) % PPR_LOG_SIZE;
- writel(head, iommu->mmio_base + MMIO_PPR_HEAD_OFFSET);
-
- /* TODO: PPR Handler will be added when we add IOPF support */
-
- /* Refresh ring-buffer information */
- head = readl(iommu->mmio_base + MMIO_PPR_HEAD_OFFSET);
- tail = readl(iommu->mmio_base + MMIO_PPR_TAIL_OFFSET);
- }
-}
-
#ifdef CONFIG_IRQ_REMAP
static int (*iommu_ga_log_notifier)(u32);
@@ -993,7 +940,7 @@ irqreturn_t amd_iommu_int_thread_pprlog(int irq, void *data)
{
amd_iommu_handle_irq(data, "PPR", MMIO_STATUS_PPR_INT_MASK,
MMIO_STATUS_PPR_OVERFLOW_MASK,
- iommu_poll_ppr_log, amd_iommu_restart_ppr_log);
+ amd_iommu_poll_ppr_log, amd_iommu_restart_ppr_log);
return IRQ_HANDLED;
}
diff --git a/drivers/iommu/amd/ppr.c b/drivers/iommu/amd/ppr.c
new file mode 100644
index 000000000000..673fcc30f9dc
--- /dev/null
+++ b/drivers/iommu/amd/ppr.c
@@ -0,0 +1,112 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2023 Advanced Micro Devices, Inc.
+ */
+
+#define pr_fmt(fmt) "AMD-Vi: " fmt
+#define dev_fmt(fmt) pr_fmt(fmt)
+
+#include <linux/amd-iommu.h>
+#include <linux/delay.h>
+#include <linux/mmu_notifier.h>
+
+#include "amd_iommu.h"
+#include "amd_iommu_types.h"
+
+int __init amd_iommu_alloc_ppr_log(struct amd_iommu *iommu)
+{
+ iommu->ppr_log = iommu_alloc_4k_pages(iommu, GFP_KERNEL | __GFP_ZERO,
+ PPR_LOG_SIZE);
+ return iommu->ppr_log ? 0 : -ENOMEM;
+}
+
+void amd_iommu_enable_ppr_log(struct amd_iommu *iommu)
+{
+ u64 entry;
+
+ if (iommu->ppr_log == NULL)
+ return;
+
+ iommu_feature_enable(iommu, CONTROL_PPR_EN);
+
+ entry = iommu_virt_to_phys(iommu->ppr_log) | PPR_LOG_SIZE_512;
+
+ memcpy_toio(iommu->mmio_base + MMIO_PPR_LOG_OFFSET,
+ &entry, sizeof(entry));
+
+ /* set head and tail to zero manually */
+ writel(0x00, iommu->mmio_base + MMIO_PPR_HEAD_OFFSET);
+ writel(0x00, iommu->mmio_base + MMIO_PPR_TAIL_OFFSET);
+
+ iommu_feature_enable(iommu, CONTROL_PPRINT_EN);
+ iommu_feature_enable(iommu, CONTROL_PPRLOG_EN);
+}
+
+void __init amd_iommu_free_ppr_log(struct amd_iommu *iommu)
+{
+ free_pages((unsigned long)iommu->ppr_log, get_order(PPR_LOG_SIZE));
+}
+
+/*
+ * This function restarts ppr logging in case the IOMMU experienced
+ * PPR log overflow.
+ */
+void amd_iommu_restart_ppr_log(struct amd_iommu *iommu)
+{
+ amd_iommu_restart_log(iommu, "PPR", CONTROL_PPRINT_EN,
+ CONTROL_PPRLOG_EN, MMIO_STATUS_PPR_RUN_MASK,
+ MMIO_STATUS_PPR_OVERFLOW_MASK);
+}
+
+void amd_iommu_poll_ppr_log(struct amd_iommu *iommu)
+{
+ u32 head, tail;
+
+ if (iommu->ppr_log == NULL)
+ return;
+
+ head = readl(iommu->mmio_base + MMIO_PPR_HEAD_OFFSET);
+ tail = readl(iommu->mmio_base + MMIO_PPR_TAIL_OFFSET);
+
+ while (head != tail) {
+ volatile u64 *raw;
+ u64 entry[2];
+ int i;
+
+ raw = (u64 *)(iommu->ppr_log + head);
+
+ /*
+ * Hardware bug: Interrupt may arrive before the entry is
+ * written to memory. If this happens we need to wait for the
+ * entry to arrive.
+ */
+ for (i = 0; i < LOOP_TIMEOUT; ++i) {
+ if (PPR_REQ_TYPE(raw[0]) != 0)
+ break;
+ udelay(1);
+ }
+
+ /* Avoid memcpy function-call overhead */
+ entry[0] = raw[0];
+ entry[1] = raw[1];
+
+ /*
+ * To detect the hardware errata 733 we need to clear the
+ * entry back to zero. This issue does not exist on SNP
+ * enabled system. Also this buffer is not writeable on
+ * SNP enabled system.
+ */
+ if (!amd_iommu_snp_en)
+ raw[0] = raw[1] = 0UL;
+
+ /* Update head pointer of hardware ring-buffer */
+ head = (head + PPR_ENTRY_SIZE) % PPR_LOG_SIZE;
+ writel(head, iommu->mmio_base + MMIO_PPR_HEAD_OFFSET);
+
+ /* TODO: PPR Handler will be added when we add IOPF support */
+
+ /* Refresh ring-buffer information */
+ head = readl(iommu->mmio_base + MMIO_PPR_HEAD_OFFSET);
+ tail = readl(iommu->mmio_base + MMIO_PPR_TAIL_OFFSET);
+ }
+}
--
2.31.1
^ permalink raw reply related [flat|nested] 30+ messages in thread* [PATCH v2 07/11] iommu/amd: Define per-IOMMU iopf_queue
2023-09-11 12:10 [PATCH v2 00/11] iommu/amd: SVA Support (Part 4) - SVA and IOPF Vasant Hegde
` (5 preceding siblings ...)
2023-09-11 12:10 ` [PATCH v2 06/11] iommu/amd: Move PPR-related functions into ppr.c Vasant Hegde
@ 2023-09-11 12:10 ` Vasant Hegde
2023-09-12 16:59 ` Jason Gunthorpe
2023-09-11 12:10 ` [PATCH v2 08/11] iommu/amd: Add support for page response Vasant Hegde
` (4 subsequent siblings)
11 siblings, 1 reply; 30+ messages in thread
From: Vasant Hegde @ 2023-09-11 12:10 UTC (permalink / raw)
To: iommu, joro
Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde
From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
AMD IOMMU hardware supports PCI Peripheral Paging Request (PPR) using
a PPR log, which is a circular buffer containing requests from downstream
end-point devices.
There is one PPR log per IOMMU instance. Therefore, allocate an iopf_queue
per IOMMU instance during driver initialization, and free the queue during
driver deinitialization.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
drivers/iommu/amd/amd_iommu.h | 4 +++
drivers/iommu/amd/amd_iommu_types.h | 4 +++
drivers/iommu/amd/init.c | 5 ++++
drivers/iommu/amd/ppr.c | 42 +++++++++++++++++++++++++++++
4 files changed, 55 insertions(+)
diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index f7ec33a81fc1..a85503259727 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -52,6 +52,10 @@ void amd_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid);
int amd_iommu_gcr3_init(struct iommu_dev_data *dev_data, int pasids);
void amd_iommu_gcr3_uninit(struct iommu_dev_data *dev_data);
+/* IOPF */
+int amd_iommu_iopf_init(struct amd_iommu *iommu);
+void amd_iommu_iopf_uninit(struct amd_iommu *iommu);
+
struct amd_iommu *get_amd_iommu(unsigned int idx);
u8 amd_iommu_pc_get_max_banks(unsigned int idx);
bool amd_iommu_pc_supported(void);
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index c866fbfa0dd8..56b6fdc89c99 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -773,6 +773,10 @@ struct amd_iommu {
/* DebugFS Info */
struct dentry *debugfs;
#endif
+
+ /* IOPF support */
+ struct iopf_queue *iopf_queue;
+ unsigned char iopfq_name[32];
};
static inline struct amd_iommu *dev_to_amd_iommu(struct device *dev)
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index a20a7f998ecd..302e964b1eff 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -1638,6 +1638,7 @@ static void __init free_iommu_one(struct amd_iommu *iommu)
amd_iommu_free_ppr_log(iommu);
free_ga_log(iommu);
iommu_unmap_mmio_space(iommu);
+ amd_iommu_iopf_uninit(iommu);
}
static void __init free_iommu_all(void)
@@ -2791,9 +2792,13 @@ static void enable_iommus_v2(void)
{
struct amd_iommu *iommu;
+ if (!amd_iommu_gt_ppr_supported())
+ return;
+
for_each_iommu(iommu) {
amd_iommu_enable_ppr_log(iommu);
iommu_enable_gt(iommu);
+ amd_iommu_iopf_init(iommu);
}
}
diff --git a/drivers/iommu/amd/ppr.c b/drivers/iommu/amd/ppr.c
index 673fcc30f9dc..9816d7dbcbf0 100644
--- a/drivers/iommu/amd/ppr.c
+++ b/drivers/iommu/amd/ppr.c
@@ -110,3 +110,45 @@ void amd_iommu_poll_ppr_log(struct amd_iommu *iommu)
tail = readl(iommu->mmio_base + MMIO_PPR_TAIL_OFFSET);
}
}
+
+/**************************************************************
+ *
+ * IOPF handling stuff
+ */
+
+/* Setup per-IOMMU IOPF queue if not exist. */
+int amd_iommu_iopf_init(struct amd_iommu *iommu)
+{
+ unsigned long flags;
+ int ret = 0;
+
+ raw_spin_lock_irqsave(&iommu->lock, flags);
+
+ if (iommu->iopf_queue)
+ goto out;
+
+ snprintf(iommu->iopfq_name, sizeof(iommu->iopfq_name),
+ "amdiommu-%#x-iopfq",
+ PCI_SEG_DEVID_TO_SBDF(iommu->pci_seg->id, iommu->devid));
+
+ iommu->iopf_queue = iopf_queue_alloc(iommu->iopfq_name);
+ if (!iommu->iopf_queue)
+ ret = -ENOMEM;
+
+out:
+ raw_spin_unlock_irqrestore(&iommu->lock, flags);
+ return ret;
+}
+
+/* Destroy per-IOMMU IOPF queue if no longer needed. */
+void amd_iommu_iopf_uninit(struct amd_iommu *iommu)
+{
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&iommu->lock, flags);
+
+ iopf_queue_free(iommu->iopf_queue);
+ iommu->iopf_queue = NULL;
+
+ raw_spin_unlock_irqrestore(&iommu->lock, flags);
+}
--
2.31.1
^ permalink raw reply related [flat|nested] 30+ messages in thread* Re: [PATCH v2 07/11] iommu/amd: Define per-IOMMU iopf_queue
2023-09-11 12:10 ` [PATCH v2 07/11] iommu/amd: Define per-IOMMU iopf_queue Vasant Hegde
@ 2023-09-12 16:59 ` Jason Gunthorpe
2023-09-15 13:48 ` Vasant Hegde
0 siblings, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2023-09-12 16:59 UTC (permalink / raw)
To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel
On Mon, Sep 11, 2023 at 12:10:42PM +0000, Vasant Hegde wrote:
> +
> +/**************************************************************
> + *
> + * IOPF handling stuff
> + */
> +
> +/* Setup per-IOMMU IOPF queue if not exist. */
> +int amd_iommu_iopf_init(struct amd_iommu *iommu)
> +{
> + unsigned long flags;
> + int ret = 0;
> +
> + raw_spin_lock_irqsave(&iommu->lock, flags);
Dare I ask why this is a raw spinlock??
You could also do this when the first PRI domain is attached and save
a bit of memory..
Jason
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [PATCH v2 07/11] iommu/amd: Define per-IOMMU iopf_queue
2023-09-12 16:59 ` Jason Gunthorpe
@ 2023-09-15 13:48 ` Vasant Hegde
0 siblings, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-09-15 13:48 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel
Jason,
On 9/12/2023 10:29 PM, Jason Gunthorpe wrote:
> On Mon, Sep 11, 2023 at 12:10:42PM +0000, Vasant Hegde wrote:
>> +
>> +/**************************************************************
>> + *
>> + * IOPF handling stuff
>> + */
>> +
>> +/* Setup per-IOMMU IOPF queue if not exist. */
>> +int amd_iommu_iopf_init(struct amd_iommu *iommu)
>> +{
>> + unsigned long flags;
>> + int ret = 0;
>> +
>> + raw_spin_lock_irqsave(&iommu->lock, flags);
>
> Dare I ask why this is a raw spinlock??
Looking into git history,
commit 27790398c2ae (iommu/amd: Use raw locks on atomic context paths)
Several functions in this driver are called from atomic context,
and thus raw locks must be used in order to be safe on PREEMPT_RT.
This includes paths that must wait for command completion, which is
a potential PREEMPT_RT latency concern but not easily avoidable.
>
> You could also do this when the first PRI domain is attached and save
> a bit of memory..
Its true that it can save bit of a memory. But not sure its really worth to do
so many things in bind path.
For now I think will keep it in init path and revisit this later.
-Vasant
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH v2 08/11] iommu/amd: Add support for page response
2023-09-11 12:10 [PATCH v2 00/11] iommu/amd: SVA Support (Part 4) - SVA and IOPF Vasant Hegde
` (6 preceding siblings ...)
2023-09-11 12:10 ` [PATCH v2 07/11] iommu/amd: Define per-IOMMU iopf_queue Vasant Hegde
@ 2023-09-11 12:10 ` Vasant Hegde
2023-09-11 12:10 ` [PATCH v2 09/11] iommu/amd: Add support for add/remove device for IOPF Vasant Hegde
` (3 subsequent siblings)
11 siblings, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-09-11 12:10 UTC (permalink / raw)
To: iommu, joro
Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde
From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
This generates AMD IOMMU COMPLETE_PPR_REQUEST for the specified device
with the specified PRI Response Code.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
drivers/iommu/amd/amd_iommu.h | 3 +++
drivers/iommu/amd/iommu.c | 1 +
drivers/iommu/amd/ppr.c | 15 +++++++++++++++
3 files changed, 19 insertions(+)
diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index a85503259727..bfd9f705c0cc 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -55,6 +55,9 @@ void amd_iommu_gcr3_uninit(struct iommu_dev_data *dev_data);
/* IOPF */
int amd_iommu_iopf_init(struct amd_iommu *iommu);
void amd_iommu_iopf_uninit(struct amd_iommu *iommu);
+int amd_iommu_page_response(struct device *dev,
+ struct iopf_fault *evt,
+ struct iommu_page_response *resp);
struct amd_iommu *get_amd_iommu(unsigned int idx);
u8 amd_iommu_pc_get_max_banks(unsigned int idx);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index e11a705df665..646873e3790a 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2610,6 +2610,7 @@ const struct iommu_ops amd_iommu_ops = {
.dev_enable_feat = amd_iommu_dev_enable_feature,
.dev_disable_feat = amd_iommu_dev_disable_feature,
.remove_dev_pasid = amd_iommu_remove_dev_pasid,
+ .page_response = amd_iommu_page_response,
.default_domain_ops = &(const struct iommu_domain_ops) {
.attach_dev = amd_iommu_attach_device,
.map_pages = amd_iommu_map_pages,
diff --git a/drivers/iommu/amd/ppr.c b/drivers/iommu/amd/ppr.c
index 9816d7dbcbf0..2a0414e3d689 100644
--- a/drivers/iommu/amd/ppr.c
+++ b/drivers/iommu/amd/ppr.c
@@ -152,3 +152,18 @@ void amd_iommu_iopf_uninit(struct amd_iommu *iommu)
raw_spin_unlock_irqrestore(&iommu->lock, flags);
}
+
+int amd_iommu_page_response(struct device *dev,
+ struct iopf_fault *evt,
+ struct iommu_page_response *resp)
+{
+ struct pci_dev *pdev;
+
+ if (!dev || !dev_is_pci(dev))
+ return -ENODEV;
+
+ pdev = to_pci_dev(dev);
+
+ return amd_iommu_complete_ppr(pdev, resp->pasid, resp->code,
+ resp->grpid);
+}
--
2.31.1
^ permalink raw reply related [flat|nested] 30+ messages in thread* [PATCH v2 09/11] iommu/amd: Add support for add/remove device for IOPF
2023-09-11 12:10 [PATCH v2 00/11] iommu/amd: SVA Support (Part 4) - SVA and IOPF Vasant Hegde
` (7 preceding siblings ...)
2023-09-11 12:10 ` [PATCH v2 08/11] iommu/amd: Add support for page response Vasant Hegde
@ 2023-09-11 12:10 ` Vasant Hegde
2023-09-11 12:10 ` [PATCH v2 10/11] iommu/amd: Add IO page fault notifier handler Vasant Hegde
` (2 subsequent siblings)
11 siblings, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-09-11 12:10 UTC (permalink / raw)
To: iommu, joro
Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde
From: Wei Huang <wei.huang2@amd.com>
Which adds/removes the device to the corresponding per-IOMMU iopf_queue.
It also registers device fault handler. These interfaces are called
from IOPF feature enable/disable path.
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
drivers/iommu/amd/amd_iommu.h | 2 ++
drivers/iommu/amd/ppr.c | 38 +++++++++++++++++++++++++++++++++++
2 files changed, 40 insertions(+)
diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index bfd9f705c0cc..4ffa2dfedbd8 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -55,6 +55,8 @@ void amd_iommu_gcr3_uninit(struct iommu_dev_data *dev_data);
/* IOPF */
int amd_iommu_iopf_init(struct amd_iommu *iommu);
void amd_iommu_iopf_uninit(struct amd_iommu *iommu);
+int amd_iommu_iopf_add_device(struct amd_iommu *iommu, struct device *dev);
+int amd_iommu_iopf_remove_device(struct amd_iommu *iommu, struct device *dev);
int amd_iommu_page_response(struct device *dev,
struct iopf_fault *evt,
struct iommu_page_response *resp);
diff --git a/drivers/iommu/amd/ppr.c b/drivers/iommu/amd/ppr.c
index 2a0414e3d689..0371df24a2b4 100644
--- a/drivers/iommu/amd/ppr.c
+++ b/drivers/iommu/amd/ppr.c
@@ -167,3 +167,41 @@ int amd_iommu_page_response(struct device *dev,
return amd_iommu_complete_ppr(pdev, resp->pasid, resp->code,
resp->grpid);
}
+
+int amd_iommu_iopf_add_device(struct amd_iommu *iommu, struct device *dev)
+{
+ unsigned long flags;
+ int ret = -EINVAL;
+
+ raw_spin_lock_irqsave(&iommu->lock, flags);
+
+ if (!iommu->iopf_queue)
+ goto out;
+
+ ret = iopf_queue_add_device(iommu->iopf_queue, dev);
+
+out:
+ raw_spin_unlock_irqrestore(&iommu->lock, flags);
+ return ret;
+}
+
+int amd_iommu_iopf_remove_device(struct amd_iommu *iommu, struct device *dev)
+{
+ unsigned long flags;
+ int ret = -EINVAL;
+
+ raw_spin_lock_irqsave(&iommu->lock, flags);
+
+ if (!iommu->iopf_queue)
+ goto out;
+
+ ret = iopf_queue_remove_device(iommu->iopf_queue, dev);
+ if (ret) {
+ pr_warn("Failed to remove device (0x%x) from iopf queue\n",
+ dev->id);
+ }
+
+out:
+ raw_spin_unlock_irqrestore(&iommu->lock, flags);
+ return ret;
+}
--
2.31.1
^ permalink raw reply related [flat|nested] 30+ messages in thread* [PATCH v2 10/11] iommu/amd: Add IO page fault notifier handler
2023-09-11 12:10 [PATCH v2 00/11] iommu/amd: SVA Support (Part 4) - SVA and IOPF Vasant Hegde
` (8 preceding siblings ...)
2023-09-11 12:10 ` [PATCH v2 09/11] iommu/amd: Add support for add/remove device for IOPF Vasant Hegde
@ 2023-09-11 12:10 ` Vasant Hegde
2023-09-12 18:46 ` Jason Gunthorpe
2023-09-11 12:10 ` [PATCH v2 11/11] iommu/amd: Introduce logic to enable/disable IOPF Vasant Hegde
2023-09-12 18:48 ` [PATCH v2 00/11] iommu/amd: SVA Support (Part 4) - SVA and IOPF Jason Gunthorpe
11 siblings, 1 reply; 30+ messages in thread
From: Vasant Hegde @ 2023-09-11 12:10 UTC (permalink / raw)
To: iommu, joro
Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde
From: Wei Huang <wei.huang2@amd.com>
Whenever there is a page fault IOMMU logs entry to ppr log and sends
interrupt to host. We have to handle the page fault and respond to IOMMU.
Add support to validate page fault requrest and hook it to core iommu
page fault handler.
Also rename search_dev_data() -> amd_iommu_search_dev_data().
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
drivers/iommu/amd/amd_iommu.h | 2 +
drivers/iommu/amd/amd_iommu_types.h | 8 +++
drivers/iommu/amd/iommu.c | 7 +-
drivers/iommu/amd/ppr.c | 103 +++++++++++++++++++++++++++-
4 files changed, 116 insertions(+), 4 deletions(-)
diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 4ffa2dfedbd8..14198cade320 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -27,6 +27,8 @@ void amd_iommu_set_rlookup_table(struct amd_iommu *iommu, u16 devid);
void iommu_feature_enable(struct amd_iommu *iommu, u8 bit);
void *__init iommu_alloc_4k_pages(struct amd_iommu *iommu,
gfp_t gfp, size_t size);
+struct iommu_dev_data *amd_iommu_search_dev_data(struct amd_iommu *iommu,
+ u16 devid);
#ifdef CONFIG_AMD_IOMMU_DEBUGFS
void amd_iommu_debugfs_setup(struct amd_iommu *iommu);
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 56b6fdc89c99..f8dba866eb9a 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -250,6 +250,14 @@
#define PPR_ENTRY_SIZE 16
#define PPR_LOG_SIZE (PPR_ENTRY_SIZE * PPR_LOG_ENTRIES)
+/* PAGE_SERVICE_REQUEST PPR Log Buffer Entry flags */
+#define PPR_FLAG_EXEC 0x002 /* Execute permission requested */
+#define PPR_FLAG_READ 0x004 /* Read permission requested */
+#define PPR_FLAG_WRITE 0x020 /* Write permission requested */
+#define PPR_FLAG_US 0x040 /* 1: User, 0: Supervisor */
+#define PPR_FLAG_RVSD 0x080 /* Reserved bit not zero */
+#define PPR_FLAG_GN 0x100 /* GVA and PASID is valid */
+
#define PPR_REQ_TYPE(x) (((x) >> 60) & 0xfULL)
#define PPR_FLAGS(x) (((x) >> 48) & 0xfffULL)
#define PPR_DEVID(x) ((x) & 0xffffULL)
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 646873e3790a..35221deb848b 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -201,7 +201,8 @@ static struct iommu_dev_data *alloc_dev_data(struct amd_iommu *iommu, u16 devid)
return dev_data;
}
-static struct iommu_dev_data *search_dev_data(struct amd_iommu *iommu, u16 devid)
+struct iommu_dev_data *amd_iommu_search_dev_data(struct amd_iommu *iommu,
+ u16 devid)
{
struct iommu_dev_data *dev_data;
struct llist_node *node;
@@ -285,7 +286,7 @@ static struct iommu_dev_data *find_dev_data(struct amd_iommu *iommu, u16 devid)
{
struct iommu_dev_data *dev_data;
- dev_data = search_dev_data(iommu, devid);
+ dev_data = amd_iommu_search_dev_data(iommu, devid);
if (dev_data == NULL) {
dev_data = alloc_dev_data(iommu, devid);
@@ -3622,7 +3623,7 @@ static int amd_ir_set_vcpu_affinity(struct irq_data *data, void *vcpu_info)
if (ir_data->iommu == NULL)
return -EINVAL;
- dev_data = search_dev_data(ir_data->iommu, irte_info->devid);
+ dev_data = amd_iommu_search_dev_data(ir_data->iommu, irte_info->devid);
/* Note:
* This device has never been set up for guest mode.
diff --git a/drivers/iommu/amd/ppr.c b/drivers/iommu/amd/ppr.c
index 0371df24a2b4..2891f67a6ca0 100644
--- a/drivers/iommu/amd/ppr.c
+++ b/drivers/iommu/amd/ppr.c
@@ -58,6 +58,106 @@ void amd_iommu_restart_ppr_log(struct amd_iommu *iommu)
MMIO_STATUS_PPR_OVERFLOW_MASK);
}
+static inline u32 ppr_flag_to_fault_perm(u16 flag)
+{
+ int perm = 0;
+
+ if (flag & PPR_FLAG_READ)
+ perm |= IOMMU_FAULT_PERM_READ;
+ if (flag & PPR_FLAG_WRITE)
+ perm |= IOMMU_FAULT_PERM_WRITE;
+ if (flag & PPR_FLAG_EXEC)
+ perm |= IOMMU_FAULT_PERM_EXEC;
+ if (!(flag & PPR_FLAG_US))
+ perm |= IOMMU_FAULT_PERM_PRIV;
+
+ return perm;
+}
+
+static bool ppr_is_valid(struct amd_iommu *iommu, u64 *raw)
+{
+ struct device *dev = iommu->iommu.dev;
+ u16 devid = PPR_DEVID(raw[0]);
+
+ if (!(PPR_FLAGS(raw[0]) & PPR_FLAG_GN)) {
+ dev_warn(dev, "PPR logged [Request ignored due to GN=0 (device=%04x:%02x:%02x.%x "
+ "pasid=0x%05llx address=0x%llx flags=0x%04llx tag=0x%03llx]\n",
+ iommu->pci_seg->id, PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
+ PPR_PASID(raw[0]), raw[1], PPR_FLAGS(raw[0]), PPR_TAG(raw[0]));
+ return false;
+ }
+
+ if (PPR_FLAGS(raw[0]) & PPR_FLAG_RVSD) {
+ dev_warn(dev, "PPR logged [Invalid request format (device=%04x:%02x:%02x.%x "
+ "pasid=0x%05llx address=0x%llx flags=0x%04llx tag=0x%03llx]\n",
+ iommu->pci_seg->id, PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
+ PPR_PASID(raw[0]), raw[1], PPR_FLAGS(raw[0]), PPR_TAG(raw[0]));
+ return false;
+ }
+
+ return true;
+}
+
+static void iommu_call_iopf_notifier(struct amd_iommu *iommu, u64 *raw)
+{
+ struct iopf_fault event;
+ struct pci_dev *pdev;
+ int ret = -EINVAL;
+ u16 devid = PPR_DEVID(raw[0]);
+
+ if (PPR_REQ_TYPE(raw[0]) != PPR_REQ_FAULT) {
+ pr_err_ratelimited("Unknown PPR request received\n");
+ return;
+ }
+
+ if (!ppr_is_valid(iommu, raw))
+ goto out;
+
+ pdev = pci_get_domain_bus_and_slot(iommu->pci_seg->id, PCI_BUS_NUM(devid),
+ devid & 0xff);
+ if (!pdev)
+ goto out;
+
+ memset(&event, 0, sizeof(struct iopf_fault));
+
+ event.fault.type = IOMMU_FAULT_PAGE_REQ;
+ event.fault.prm.perm = ppr_flag_to_fault_perm(PPR_FLAGS(raw[0]));
+ event.fault.prm.addr = (u64)(raw[1] & PAGE_MASK);
+ event.fault.prm.pasid = PPR_PASID(raw[0]);
+ event.fault.prm.grpid = PPR_TAG(raw[0]) & 0x1FF;
+
+ /*
+ * PASID zero is used for requests from the I/O device without
+ * a PASID
+ */
+ if (event.fault.prm.pasid == 0 ||
+ event.fault.prm.pasid >= pdev->dev.iommu->max_pasids) {
+ pr_info_ratelimited("Invalid PASID : 0x%x, device : 0x%x\n",
+ event.fault.prm.pasid, pdev->dev.id);
+ goto out;
+ }
+
+
+ event.fault.prm.flags |= IOMMU_FAULT_PAGE_RESPONSE_NEEDS_PASID;
+ event.fault.prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PASID_VALID;
+ if (PPR_TAG(raw[0]) & 0x200)
+ event.fault.prm.flags |= IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE;
+
+ /* Submit event */
+ ret = iommu_report_device_fault(&pdev->dev, &event);
+
+out:
+ if (ret) {
+ /* Nobody cared, abort */
+ struct iommu_page_response resp = {
+ .pasid = PPR_PASID(raw[0]),
+ .grpid = PPR_TAG(raw[0]) & 0x1FF,
+ .code = IOMMU_PAGE_RESP_FAILURE,
+ };
+ amd_iommu_page_response(&pdev->dev, &event, &resp);
+ }
+}
+
void amd_iommu_poll_ppr_log(struct amd_iommu *iommu)
{
u32 head, tail;
@@ -103,7 +203,8 @@ void amd_iommu_poll_ppr_log(struct amd_iommu *iommu)
head = (head + PPR_ENTRY_SIZE) % PPR_LOG_SIZE;
writel(head, iommu->mmio_base + MMIO_PPR_HEAD_OFFSET);
- /* TODO: PPR Handler will be added when we add IOPF support */
+ /* Handle PPR entry */
+ iommu_call_iopf_notifier(iommu, entry);
/* Refresh ring-buffer information */
head = readl(iommu->mmio_base + MMIO_PPR_HEAD_OFFSET);
--
2.31.1
^ permalink raw reply related [flat|nested] 30+ messages in thread* Re: [PATCH v2 10/11] iommu/amd: Add IO page fault notifier handler
2023-09-11 12:10 ` [PATCH v2 10/11] iommu/amd: Add IO page fault notifier handler Vasant Hegde
@ 2023-09-12 18:46 ` Jason Gunthorpe
2023-09-13 4:19 ` Baolu Lu
2023-09-15 8:15 ` Vasant Hegde
0 siblings, 2 replies; 30+ messages in thread
From: Jason Gunthorpe @ 2023-09-12 18:46 UTC (permalink / raw)
To: Vasant Hegde, Baolu Lu
Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel
On Mon, Sep 11, 2023 at 12:10:45PM +0000, Vasant Hegde wrote:
> @@ -285,7 +286,7 @@ static struct iommu_dev_data *find_dev_data(struct amd_iommu *iommu, u16 devid)
> {
> struct iommu_dev_data *dev_data;
>
> - dev_data = search_dev_data(iommu, devid);
> + dev_data = amd_iommu_search_dev_data(iommu, devid);
> +static bool ppr_is_valid(struct amd_iommu *iommu, u64 *raw)
> +{
> + struct device *dev = iommu->iommu.dev;
> + u16 devid = PPR_DEVID(raw[0]);
> +
> + if (!(PPR_FLAGS(raw[0]) & PPR_FLAG_GN)) {
> + dev_warn(dev, "PPR logged [Request ignored due to GN=0 (device=%04x:%02x:%02x.%x "
> + "pasid=0x%05llx address=0x%llx flags=0x%04llx tag=0x%03llx]\n",
> + iommu->pci_seg->id, PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
> + PPR_PASID(raw[0]), raw[1], PPR_FLAGS(raw[0]), PPR_TAG(raw[0]));
> + return false;
> + }
> +
> + if (PPR_FLAGS(raw[0]) & PPR_FLAG_RVSD) {
> + dev_warn(dev, "PPR logged [Invalid request format (device=%04x:%02x:%02x.%x "
> + "pasid=0x%05llx address=0x%llx flags=0x%04llx tag=0x%03llx]\n",
> + iommu->pci_seg->id, PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
> + PPR_PASID(raw[0]), raw[1], PPR_FLAGS(raw[0]), PPR_TAG(raw[0]));
> + return false;
> + }
Please be careful that no guest can trigger these warnings..
> +
> + return true;
> +}
> +
> +static void iommu_call_iopf_notifier(struct amd_iommu *iommu, u64 *raw)
> +{
> + struct iopf_fault event;
> + struct pci_dev *pdev;
> + int ret = -EINVAL;
> + u16 devid = PPR_DEVID(raw[0]);
> +
> + if (PPR_REQ_TYPE(raw[0]) != PPR_REQ_FAULT) {
> + pr_err_ratelimited("Unknown PPR request received\n");
> + return;
> + }
> +
> + if (!ppr_is_valid(iommu, raw))
> + goto out;
> +
> + pdev = pci_get_domain_bus_and_slot(iommu->pci_seg->id, PCI_BUS_NUM(devid),
> + devid & 0xff);
> + if (!pdev)
> + goto out;
Lu, here is another case where the core PRI code could make use of a
core helper for a getting from the RID to the iommu world.
> +
> + memset(&event, 0, sizeof(struct iopf_fault));
> +
> + event.fault.type = IOMMU_FAULT_PAGE_REQ;
> + event.fault.prm.perm = ppr_flag_to_fault_perm(PPR_FLAGS(raw[0]));
> + event.fault.prm.addr = (u64)(raw[1] & PAGE_MASK);
> + event.fault.prm.pasid = PPR_PASID(raw[0]);
> + event.fault.prm.grpid = PPR_TAG(raw[0]) & 0x1FF;
> +
> + /*
> + * PASID zero is used for requests from the I/O device without
> + * a PASID
> + */
> + if (event.fault.prm.pasid == 0 ||
> + event.fault.prm.pasid >= pdev->dev.iommu->max_pasids) {
> + pr_info_ratelimited("Invalid PASID : 0x%x, device : 0x%x\n",
> + event.fault.prm.pasid, pdev->dev.id);
> + goto out;
> + }
> +
> +
> + event.fault.prm.flags |= IOMMU_FAULT_PAGE_RESPONSE_NEEDS_PASID;
> + event.fault.prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PASID_VALID;
> + if (PPR_TAG(raw[0]) & 0x200)
> + event.fault.prm.flags |= IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE;
> +
> + /* Submit event */
> + ret = iommu_report_device_fault(&pdev->dev, &event);
> +
> +out:
> + if (ret) {
> + /* Nobody cared, abort */
> + struct iommu_page_response resp = {
> + .pasid = PPR_PASID(raw[0]),
> + .grpid = PPR_TAG(raw[0]) & 0x1FF,
> + .code = IOMMU_PAGE_RESP_FAILURE,
> + };
> + amd_iommu_page_response(&pdev->dev, &event, &resp);
Just to call amd_iommu_complete_ppr(), this already has the pci_dev,
we don't need amd_iommu_page_response() to get it.
Jason
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [PATCH v2 10/11] iommu/amd: Add IO page fault notifier handler
2023-09-12 18:46 ` Jason Gunthorpe
@ 2023-09-13 4:19 ` Baolu Lu
2023-09-15 8:15 ` Vasant Hegde
1 sibling, 0 replies; 30+ messages in thread
From: Baolu Lu @ 2023-09-13 4:19 UTC (permalink / raw)
To: Jason Gunthorpe, Vasant Hegde
Cc: baolu.lu, iommu, joro, suravee.suthikulpanit, wei.huang2,
jsnitsel
On 9/13/23 2:46 AM, Jason Gunthorpe wrote:
>> +static void iommu_call_iopf_notifier(struct amd_iommu *iommu, u64 *raw)
>> +{
>> + struct iopf_fault event;
>> + struct pci_dev *pdev;
>> + int ret = -EINVAL;
>> + u16 devid = PPR_DEVID(raw[0]);
>> +
>> + if (PPR_REQ_TYPE(raw[0]) != PPR_REQ_FAULT) {
>> + pr_err_ratelimited("Unknown PPR request received\n");
>> + return;
>> + }
>> +
>> + if (!ppr_is_valid(iommu, raw))
>> + goto out;
>> +
>> + pdev = pci_get_domain_bus_and_slot(iommu->pci_seg->id, PCI_BUS_NUM(devid),
>> + devid & 0xff);
>> + if (!pdev)
>> + goto out;
> Lu, here is another case where the core PRI code could make use of a
> core helper for a getting from the RID to the iommu world.
>
Yeah! It's common for PCI/PRI. ARM SMMUv3 probably will also implement
PCI/PRI in the future.
AMD IOMMU driver also calls pci_get_domain_bus_and_slot() in the path of
handling a DMA fault (unrecoverable fault). I don't think this is a
performance critical path, so we only need the common code to to serve
the PRI case. What's your opinion?
Best regards,
baolu
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [PATCH v2 10/11] iommu/amd: Add IO page fault notifier handler
2023-09-12 18:46 ` Jason Gunthorpe
2023-09-13 4:19 ` Baolu Lu
@ 2023-09-15 8:15 ` Vasant Hegde
1 sibling, 0 replies; 30+ messages in thread
From: Vasant Hegde @ 2023-09-15 8:15 UTC (permalink / raw)
To: Jason Gunthorpe, Baolu Lu
Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel
Jason,
On 9/13/2023 12:16 AM, Jason Gunthorpe wrote:
> On Mon, Sep 11, 2023 at 12:10:45PM +0000, Vasant Hegde wrote:
>
>> @@ -285,7 +286,7 @@ static struct iommu_dev_data *find_dev_data(struct amd_iommu *iommu, u16 devid)
>> {
>> struct iommu_dev_data *dev_data;
>>
>> - dev_data = search_dev_data(iommu, devid);
>> + dev_data = amd_iommu_search_dev_data(iommu, devid);
>
>> +static bool ppr_is_valid(struct amd_iommu *iommu, u64 *raw)
>> +{
>> + struct device *dev = iommu->iommu.dev;
>> + u16 devid = PPR_DEVID(raw[0]);
>> +
>> + if (!(PPR_FLAGS(raw[0]) & PPR_FLAG_GN)) {
>> + dev_warn(dev, "PPR logged [Request ignored due to GN=0 (device=%04x:%02x:%02x.%x "
>> + "pasid=0x%05llx address=0x%llx flags=0x%04llx tag=0x%03llx]\n",
>> + iommu->pci_seg->id, PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
>> + PPR_PASID(raw[0]), raw[1], PPR_FLAGS(raw[0]), PPR_TAG(raw[0]));
>> + return false;
>> + }
>> +
>> + if (PPR_FLAGS(raw[0]) & PPR_FLAG_RVSD) {
>> + dev_warn(dev, "PPR logged [Invalid request format (device=%04x:%02x:%02x.%x "
>> + "pasid=0x%05llx address=0x%llx flags=0x%04llx tag=0x%03llx]\n",
>> + iommu->pci_seg->id, PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
>> + PPR_PASID(raw[0]), raw[1], PPR_FLAGS(raw[0]), PPR_TAG(raw[0]));
>> + return false;
>> + }
>
> Please be careful that no guest can trigger these warnings..
sure.
>
>> +
>> + return true;
>> +}
>> +
>> +static void iommu_call_iopf_notifier(struct amd_iommu *iommu, u64 *raw)
>> +{
>> + struct iopf_fault event;
>> + struct pci_dev *pdev;
>> + int ret = -EINVAL;
>> + u16 devid = PPR_DEVID(raw[0]);
>> +
>> + if (PPR_REQ_TYPE(raw[0]) != PPR_REQ_FAULT) {
>> + pr_err_ratelimited("Unknown PPR request received\n");
>> + return;
>> + }
>> +
>> + if (!ppr_is_valid(iommu, raw))
>> + goto out;
>> +
>> + pdev = pci_get_domain_bus_and_slot(iommu->pci_seg->id, PCI_BUS_NUM(devid),
>> + devid & 0xff);
>> + if (!pdev)
>> + goto out;
>
> Lu, here is another case where the core PRI code could make use of a
> core helper for a getting from the RID to the iommu world.
>
>> +
>> + memset(&event, 0, sizeof(struct iopf_fault));
>> +
>> + event.fault.type = IOMMU_FAULT_PAGE_REQ;
>> + event.fault.prm.perm = ppr_flag_to_fault_perm(PPR_FLAGS(raw[0]));
>> + event.fault.prm.addr = (u64)(raw[1] & PAGE_MASK);
>> + event.fault.prm.pasid = PPR_PASID(raw[0]);
>> + event.fault.prm.grpid = PPR_TAG(raw[0]) & 0x1FF;
>> +
>> + /*
>> + * PASID zero is used for requests from the I/O device without
>> + * a PASID
>> + */
>> + if (event.fault.prm.pasid == 0 ||
>> + event.fault.prm.pasid >= pdev->dev.iommu->max_pasids) {
>> + pr_info_ratelimited("Invalid PASID : 0x%x, device : 0x%x\n",
>> + event.fault.prm.pasid, pdev->dev.id);
>> + goto out;
>> + }
>> +
>> +
>> + event.fault.prm.flags |= IOMMU_FAULT_PAGE_RESPONSE_NEEDS_PASID;
>> + event.fault.prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PASID_VALID;
>> + if (PPR_TAG(raw[0]) & 0x200)
>> + event.fault.prm.flags |= IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE;
>> +
>> + /* Submit event */
>> + ret = iommu_report_device_fault(&pdev->dev, &event);
>> +
>> +out:
>> + if (ret) {
>> + /* Nobody cared, abort */
>> + struct iommu_page_response resp = {
>> + .pasid = PPR_PASID(raw[0]),
>> + .grpid = PPR_TAG(raw[0]) & 0x1FF,
>> + .code = IOMMU_PAGE_RESP_FAILURE,
>> + };
>> + amd_iommu_page_response(&pdev->dev, &event, &resp);
>
> Just to call amd_iommu_complete_ppr(), this already has the pci_dev,
> we don't need amd_iommu_page_response() to get it.
Makes sense. Will fix it.
-Vasant
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH v2 11/11] iommu/amd: Introduce logic to enable/disable IOPF
2023-09-11 12:10 [PATCH v2 00/11] iommu/amd: SVA Support (Part 4) - SVA and IOPF Vasant Hegde
` (9 preceding siblings ...)
2023-09-11 12:10 ` [PATCH v2 10/11] iommu/amd: Add IO page fault notifier handler Vasant Hegde
@ 2023-09-11 12:10 ` Vasant Hegde
2023-09-12 16:32 ` Jason Gunthorpe
2023-09-12 18:48 ` [PATCH v2 00/11] iommu/amd: SVA Support (Part 4) - SVA and IOPF Jason Gunthorpe
11 siblings, 1 reply; 30+ messages in thread
From: Vasant Hegde @ 2023-09-11 12:10 UTC (permalink / raw)
To: iommu, joro
Cc: suravee.suthikulpanit, wei.huang2, jsnitsel, jgg, Vasant Hegde
From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Add support to enable/disable IOPF feature. So that PRI capable device
can make use of page fault handler in SVA mode.
Also add IOMMU_IOPF as dependency to AMD_IOMMU driver.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
---
drivers/iommu/amd/Kconfig | 1 +
drivers/iommu/amd/amd_iommu.h | 2 ++
drivers/iommu/amd/iommu.c | 6 ++++
drivers/iommu/amd/ppr.c | 56 +++++++++++++++++++++++++++++++++++
4 files changed, 65 insertions(+)
diff --git a/drivers/iommu/amd/Kconfig b/drivers/iommu/amd/Kconfig
index cabf4ccde1ed..22e33179c593 100644
--- a/drivers/iommu/amd/Kconfig
+++ b/drivers/iommu/amd/Kconfig
@@ -11,6 +11,7 @@ config AMD_IOMMU
select IOMMU_IOVA
select IOMMU_IO_PGTABLE
select IOMMU_SVA
+ select IOMMU_IOPF
depends on X86_64 && PCI && ACPI && HAVE_CMPXCHG_DOUBLE
help
With this option you can enable support for AMD IOMMU hardware in
diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 14198cade320..7dbd9e54f48f 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -62,6 +62,8 @@ int amd_iommu_iopf_remove_device(struct amd_iommu *iommu, struct device *dev);
int amd_iommu_page_response(struct device *dev,
struct iopf_fault *evt,
struct iommu_page_response *resp);
+int amd_iommu_iopf_enable(struct device *dev);
+int amd_iommu_iopf_disable(struct device *dev);
struct amd_iommu *get_amd_iommu(unsigned int idx);
u8 amd_iommu_pc_get_max_banks(unsigned int idx);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 35221deb848b..1be5fdab70d0 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2574,6 +2574,9 @@ static int amd_iommu_dev_enable_feature(struct device *dev,
case IOMMU_DEV_FEAT_SVA:
ret = 0;
break;
+ case IOMMU_DEV_FEAT_IOPF:
+ ret = amd_iommu_iopf_enable(dev);
+ break;
default:
ret = -EINVAL;
break;
@@ -2590,6 +2593,9 @@ static int amd_iommu_dev_disable_feature(struct device *dev,
case IOMMU_DEV_FEAT_SVA:
ret = 0;
break;
+ case IOMMU_DEV_FEAT_IOPF:
+ ret = amd_iommu_iopf_disable(dev);
+ break;
default:
ret = -EINVAL;
break;
diff --git a/drivers/iommu/amd/ppr.c b/drivers/iommu/amd/ppr.c
index 2891f67a6ca0..86c0cb836a71 100644
--- a/drivers/iommu/amd/ppr.c
+++ b/drivers/iommu/amd/ppr.c
@@ -9,6 +9,7 @@
#include <linux/amd-iommu.h>
#include <linux/delay.h>
#include <linux/mmu_notifier.h>
+#include <linux/pci-ats.h>
#include "amd_iommu.h"
#include "amd_iommu_types.h"
@@ -306,3 +307,58 @@ int amd_iommu_iopf_remove_device(struct amd_iommu *iommu, struct device *dev)
raw_spin_unlock_irqrestore(&iommu->lock, flags);
return ret;
}
+
+static int amd_iommu_iopf_update(struct device *dev, bool enable)
+{
+ unsigned long flags;
+ int ret;
+ struct pci_dev *pdev = dev_is_pci(dev) ? to_pci_dev(dev) : NULL;
+ struct amd_iommu *iommu = get_amd_iommu_from_dev(dev);
+ struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
+ struct protection_domain *pdom = amd_iommu_get_pdomain(dev);
+
+ if (!pdev || !iommu || !dev_data)
+ return -EINVAL;
+
+ spin_lock_irqsave(&pdom->lock, flags);
+
+ if (enable) {
+ ret = amd_iommu_iopf_add_device(iommu, dev);
+ if (ret)
+ goto out;
+
+ dev_data->ppr = true;
+ } else {
+ ret = amd_iommu_iopf_remove_device(iommu, dev);
+ dev_data->ppr = false;
+ }
+
+ amd_iommu_domain_update(pdom);
+
+out:
+ spin_unlock_irqrestore(&pdom->lock, flags);
+ return ret;
+}
+
+int amd_iommu_iopf_enable(struct device *dev)
+{
+ struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
+
+ if (!(dev_data->flags & AMD_IOMMU_DEVICE_FLAG_PRI_SUP))
+ return -ENODEV;
+
+ if (!dev_data->ats_enabled || !dev_data->pri_enabled)
+ return -EINVAL;
+
+ return amd_iommu_iopf_update(dev, true);
+}
+
+int amd_iommu_iopf_disable(struct device *dev)
+{
+ struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
+
+ if (!dev_data->pri_enabled)
+ return -EINVAL;
+
+ return amd_iommu_iopf_update(dev, false);
+}
--
2.31.1
^ permalink raw reply related [flat|nested] 30+ messages in thread* Re: [PATCH v2 11/11] iommu/amd: Introduce logic to enable/disable IOPF
2023-09-11 12:10 ` [PATCH v2 11/11] iommu/amd: Introduce logic to enable/disable IOPF Vasant Hegde
@ 2023-09-12 16:32 ` Jason Gunthorpe
2023-09-15 8:26 ` Vasant Hegde
0 siblings, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2023-09-12 16:32 UTC (permalink / raw)
To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel
On Mon, Sep 11, 2023 at 12:10:46PM +0000, Vasant Hegde wrote:
> diff --git a/drivers/iommu/amd/ppr.c b/drivers/iommu/amd/ppr.c
> index 2891f67a6ca0..86c0cb836a71 100644
> --- a/drivers/iommu/amd/ppr.c
> +++ b/drivers/iommu/amd/ppr.c
> @@ -9,6 +9,7 @@
> #include <linux/amd-iommu.h>
> #include <linux/delay.h>
> #include <linux/mmu_notifier.h>
> +#include <linux/pci-ats.h>
>
> #include "amd_iommu.h"
> #include "amd_iommu_types.h"
> @@ -306,3 +307,58 @@ int amd_iommu_iopf_remove_device(struct amd_iommu *iommu, struct device *dev)
> raw_spin_unlock_irqrestore(&iommu->lock, flags);
> return ret;
> }
> +
> +static int amd_iommu_iopf_update(struct device *dev, bool enable)
> +{
> + unsigned long flags;
> + int ret;
> + struct pci_dev *pdev = dev_is_pci(dev) ? to_pci_dev(dev) : NULL;
> + struct amd_iommu *iommu = get_amd_iommu_from_dev(dev);
> + struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
> + struct protection_domain *pdom = amd_iommu_get_pdomain(dev);
> +
> + if (!pdev || !iommu || !dev_data)
> + return -EINVAL;
> +
> + spin_lock_irqsave(&pdom->lock, flags);
> +
> + if (enable) {
> + ret = amd_iommu_iopf_add_device(iommu, dev);
> + if (ret)
> + goto out;
> +
> + dev_data->ppr = true;
> + } else {
> + ret = amd_iommu_iopf_remove_device(iommu, dev);
> + dev_data->ppr = false;
> + }
> +
> + amd_iommu_domain_update(pdom);
> +
> +out:
> + spin_unlock_irqrestore(&pdom->lock, flags);
> + return ret;
> +}
Same remarks as for the SVA, this is in the wrong place.
iopf_queue_add_device() should be done when a PRI enabled domain is
attached to a device, not in feature things. I also want to remove
these too once ARM is fixed..
And the iopf is a device wide thing, it makes no sense that some
random pdom->lock is protecting dev_dev->ppr..
amd_iommu_get_pdomain() has improper locking, it needs to hold some
kind of dev_data lock to access dev_data->domain. (and really this
would all be clearer if it was just written dev_data->domain instead
of the redundant function)
Jason
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [PATCH v2 11/11] iommu/amd: Introduce logic to enable/disable IOPF
2023-09-12 16:32 ` Jason Gunthorpe
@ 2023-09-15 8:26 ` Vasant Hegde
2023-09-18 12:45 ` Jason Gunthorpe
0 siblings, 1 reply; 30+ messages in thread
From: Vasant Hegde @ 2023-09-15 8:26 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel
Jason,
On 9/12/2023 10:02 PM, Jason Gunthorpe wrote:
> On Mon, Sep 11, 2023 at 12:10:46PM +0000, Vasant Hegde wrote:
>> diff --git a/drivers/iommu/amd/ppr.c b/drivers/iommu/amd/ppr.c
>> index 2891f67a6ca0..86c0cb836a71 100644
>> --- a/drivers/iommu/amd/ppr.c
>> +++ b/drivers/iommu/amd/ppr.c
>> @@ -9,6 +9,7 @@
>> #include <linux/amd-iommu.h>
>> #include <linux/delay.h>
>> #include <linux/mmu_notifier.h>
>> +#include <linux/pci-ats.h>
>>
>> #include "amd_iommu.h"
>> #include "amd_iommu_types.h"
>> @@ -306,3 +307,58 @@ int amd_iommu_iopf_remove_device(struct amd_iommu *iommu, struct device *dev)
>> raw_spin_unlock_irqrestore(&iommu->lock, flags);
>> return ret;
>> }
>> +
>> +static int amd_iommu_iopf_update(struct device *dev, bool enable)
>> +{
>> + unsigned long flags;
>> + int ret;
>> + struct pci_dev *pdev = dev_is_pci(dev) ? to_pci_dev(dev) : NULL;
>> + struct amd_iommu *iommu = get_amd_iommu_from_dev(dev);
>> + struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
>> + struct protection_domain *pdom = amd_iommu_get_pdomain(dev);
>> +
>> + if (!pdev || !iommu || !dev_data)
>> + return -EINVAL;
>> +
>> + spin_lock_irqsave(&pdom->lock, flags);
>> +
>> + if (enable) {
>> + ret = amd_iommu_iopf_add_device(iommu, dev);
>> + if (ret)
>> + goto out;
>> +
>> + dev_data->ppr = true;
>> + } else {
>> + ret = amd_iommu_iopf_remove_device(iommu, dev);
>> + dev_data->ppr = false;
>> + }
>> +
>> + amd_iommu_domain_update(pdom);
>> +
>> +out:
>> + spin_unlock_irqrestore(&pdom->lock, flags);
>> + return ret;
>> +}
>
> Same remarks as for the SVA, this is in the wrong place.
>
> iopf_queue_add_device() should be done when a PRI enabled domain is
> attached to a device, not in feature things. I also want to remove
> these too once ARM is fixed..
You mean we should do this during first device/pasid bind time?
>
> And the iopf is a device wide thing, it makes no sense that some
> random pdom->lock is protecting dev_dev->ppr..
Will fix it.
>
> amd_iommu_get_pdomain() has improper locking, it needs to hold some
> kind of dev_data lock to access dev_data->domain. (and really this
> would all be clearer if it was just written dev_data->domain instead
> of the redundant function)
It helps to get iommu from device structure. Its useful and we want to use it
other places as well.
-Vasant
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [PATCH v2 11/11] iommu/amd: Introduce logic to enable/disable IOPF
2023-09-15 8:26 ` Vasant Hegde
@ 2023-09-18 12:45 ` Jason Gunthorpe
2023-10-10 14:53 ` Vasant Hegde
0 siblings, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2023-09-18 12:45 UTC (permalink / raw)
To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel
On Fri, Sep 15, 2023 at 01:56:54PM +0530, Vasant Hegde wrote:
> >> +static int amd_iommu_iopf_update(struct device *dev, bool enable)
> >> +{
> >> + unsigned long flags;
> >> + int ret;
> >> + struct pci_dev *pdev = dev_is_pci(dev) ? to_pci_dev(dev) : NULL;
> >> + struct amd_iommu *iommu = get_amd_iommu_from_dev(dev);
> >> + struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
> >> + struct protection_domain *pdom = amd_iommu_get_pdomain(dev);
> >> +
> >> + if (!pdev || !iommu || !dev_data)
> >> + return -EINVAL;
> >> +
> >> + spin_lock_irqsave(&pdom->lock, flags);
> >> +
> >> + if (enable) {
> >> + ret = amd_iommu_iopf_add_device(iommu, dev);
> >> + if (ret)
> >> + goto out;
> >> +
> >> + dev_data->ppr = true;
> >> + } else {
> >> + ret = amd_iommu_iopf_remove_device(iommu, dev);
> >> + dev_data->ppr = false;
> >> + }
> >> +
> >> + amd_iommu_domain_update(pdom);
> >> +
> >> +out:
> >> + spin_unlock_irqrestore(&pdom->lock, flags);
> >> + return ret;
> >> +}
> >
> > Same remarks as for the SVA, this is in the wrong place.
> >
> > iopf_queue_add_device() should be done when a PRI enabled domain is
> > attached to a device, not in feature things. I also want to remove
> > these too once ARM is fixed..
>
> You mean we should do this during first device/pasid bind time?
Yes, when a PRI capable domain is first attached the PRI stuff should
be setup. SVA is a PRI capable domain type, but we will have more..
> > amd_iommu_get_pdomain() has improper locking, it needs to hold some
> > kind of dev_data lock to access dev_data->domain. (and really this
> > would all be clearer if it was just written dev_data->domain instead
> > of the redundant function)
>
> It helps to get iommu from device structure. Its useful and we want to use it
> other places as well.
As I said, it seems obfuscating as to how the locking needs to work as
you must hold a lock while using dev_data->domain. In some cases that
can be the core code's group mutex, but that doesn't always work..
Jason
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [PATCH v2 11/11] iommu/amd: Introduce logic to enable/disable IOPF
2023-09-18 12:45 ` Jason Gunthorpe
@ 2023-10-10 14:53 ` Vasant Hegde
2023-10-10 15:04 ` Jason Gunthorpe
0 siblings, 1 reply; 30+ messages in thread
From: Vasant Hegde @ 2023-10-10 14:53 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel
Jason,
On 9/18/2023 6:15 PM, Jason Gunthorpe wrote:
> On Fri, Sep 15, 2023 at 01:56:54PM +0530, Vasant Hegde wrote:
>
>>>> +static int amd_iommu_iopf_update(struct device *dev, bool enable)
>>>> +{
>>>> + unsigned long flags;
>>>> + int ret;
>>>> + struct pci_dev *pdev = dev_is_pci(dev) ? to_pci_dev(dev) : NULL;
>>>> + struct amd_iommu *iommu = get_amd_iommu_from_dev(dev);
>>>> + struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
>>>> + struct protection_domain *pdom = amd_iommu_get_pdomain(dev);
>>>> +
>>>> + if (!pdev || !iommu || !dev_data)
>>>> + return -EINVAL;
>>>> +
>>>> + spin_lock_irqsave(&pdom->lock, flags);
>>>> +
>>>> + if (enable) {
>>>> + ret = amd_iommu_iopf_add_device(iommu, dev);
>>>> + if (ret)
>>>> + goto out;
>>>> +
>>>> + dev_data->ppr = true;
>>>> + } else {
>>>> + ret = amd_iommu_iopf_remove_device(iommu, dev);
>>>> + dev_data->ppr = false;
>>>> + }
>>>> +
>>>> + amd_iommu_domain_update(pdom);
>>>> +
>>>> +out:
>>>> + spin_unlock_irqrestore(&pdom->lock, flags);
>>>> + return ret;
>>>> +}
>>>
>>> Same remarks as for the SVA, this is in the wrong place.
>>>
>>> iopf_queue_add_device() should be done when a PRI enabled domain is
>>> attached to a device, not in feature things. I also want to remove
>>> these too once ARM is fixed..
>>
>> You mean we should do this during first device/pasid bind time?
>
> Yes, when a PRI capable domain is first attached the PRI stuff should
> be setup. SVA is a PRI capable domain type, but we will have more..
>
I have done this for now. But can you explain why you want to control device
feature? With current code, if something is broken then device driver knows it
and it can decide what to enable/disable. With new approach IOMMU layer will
endup applying all device specific workarounds right?
>>> amd_iommu_get_pdomain() has improper locking, it needs to hold some
>>> kind of dev_data lock to access dev_data->domain. (and really this
>>> would all be clearer if it was just written dev_data->domain instead
>>> of the redundant function)
>>
>> It helps to get iommu from device structure. Its useful and we want to use it
>> other places as well.
>
> As I said, it seems obfuscating as to how the locking needs to work as
> you must hold a lock while using dev_data->domain. In some cases that
> can be the core code's group mutex, but that doesn't always work..
I don't think driver should rely on core layer locking mechanism. I have cleaned
up this code.
-Vasant
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [PATCH v2 11/11] iommu/amd: Introduce logic to enable/disable IOPF
2023-10-10 14:53 ` Vasant Hegde
@ 2023-10-10 15:04 ` Jason Gunthorpe
0 siblings, 0 replies; 30+ messages in thread
From: Jason Gunthorpe @ 2023-10-10 15:04 UTC (permalink / raw)
To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel
On Tue, Oct 10, 2023 at 08:23:04PM +0530, Vasant Hegde wrote:
> >>> iopf_queue_add_device() should be done when a PRI enabled domain is
> >>> attached to a device, not in feature things. I also want to remove
> >>> these too once ARM is fixed..
> >>
> >> You mean we should do this during first device/pasid bind time?
> >
> > Yes, when a PRI capable domain is first attached the PRI stuff should
> > be setup. SVA is a PRI capable domain type, but we will have more..
> >
>
> I have done this for now. But can you explain why you want to control device
> feature? With current code, if something is broken then device driver knows it
> and it can decide what to enable/disable. With new approach IOMMU layer will
> endup applying all device specific workarounds right?
At this point the device feature enable/disable is not useful.
If we need a query from the device driver to check if certain things
work eg 'does device support PRI', 'does device support PASID' then
those things can be a capability check, not an enable/disable.
I'm not sure what you mean by workaround?
> >>> amd_iommu_get_pdomain() has improper locking, it needs to hold some
> >>> kind of dev_data lock to access dev_data->domain. (and really this
> >>> would all be clearer if it was just written dev_data->domain instead
> >>> of the redundant function)
> >>
> >> It helps to get iommu from device structure. Its useful and we want to use it
> >> other places as well.
> >
> > As I said, it seems obfuscating as to how the locking needs to work as
> > you must hold a lock while using dev_data->domain. In some cases that
> > can be the core code's group mutex, but that doesn't always work..
>
> I don't think driver should rely on core layer locking mechanism. I have cleaned
> up this code.
Many drivers do, it is well defined. I've been thinking about exposing
a lockdep assertion that drivers can call to document this assumption..
Jason
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 00/11] iommu/amd: SVA Support (Part 4) - SVA and IOPF
2023-09-11 12:10 [PATCH v2 00/11] iommu/amd: SVA Support (Part 4) - SVA and IOPF Vasant Hegde
` (10 preceding siblings ...)
2023-09-11 12:10 ` [PATCH v2 11/11] iommu/amd: Introduce logic to enable/disable IOPF Vasant Hegde
@ 2023-09-12 18:48 ` Jason Gunthorpe
11 siblings, 0 replies; 30+ messages in thread
From: Jason Gunthorpe @ 2023-09-12 18:48 UTC (permalink / raw)
To: Vasant Hegde; +Cc: iommu, joro, suravee.suthikulpanit, wei.huang2, jsnitsel
On Mon, Sep 11, 2023 at 12:10:35PM +0000, Vasant Hegde wrote:
> This is part 4 of the 4-part series to introduce Share Virtual Address
> (SVA) support for devices, which can support PCI ATS, PASID and PRI
> capabilities. These devices are referred to as SVA-capable devices in
> this series.
I think this is broadly close enough to how it should be structured
that I would not be averse to going ahead like this.
The functional issues still need to be addressed though..
Jason
^ permalink raw reply [flat|nested] 30+ messages in thread