* [PATCH 0/7] Enable SVM for Intel VT-d
From: David Woodhouse @ 2015-10-08 23:50 UTC (permalink / raw)
To: iommu, intel-gfx, Barnes, Jesse
This patch set enables PASID support for the Intel IOMMU, along with
page request support.
Like its AMD counterpart, it exposes an IOMMU-specific API. I believe
we'll have a session at the Kernel Summit later this month in which we
can work out a generic API which will cover the two (now) existing
implementations as well as upcoming ARM (and other?) versions.
For the time being, however, exposing an Intel-specific API is good
enough, especially as we don't have the required TLP prefix support on
our PCIe root ports and we *can't* support discrete PCIe devices with
PASID support. It's purely on-chip stuff right now, which is basically
only Intel graphics.
The AMD implementation allows a per-device PASID space, and managing
the PASID space is left entirely to the device driver. In contrast,
this implementation maintains a per-IOMMU PASID space, and drivers
calling intel_svm_bind_mm() will be *given* the PASID that they are to
use. In general we seem to be converging on using a single PASID space
across *all* IOMMUs in the system, and this will support that mode of
operation.
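For illustration only (not part of the series), the allocation model can be sketched in standalone C. The real code in patch 3 uses the kernel IDR, but the effect is the same: the IOMMU driver hands out the smallest free ID, and the caller gets told which PASID to use rather than choosing its own:

```c
/* Standalone sketch of the per-IOMMU PASID allocation model. The
 * PASID_MAX value and the bitmap are stand-ins; the kernel uses
 * idr_alloc() against a per-IOMMU idr. */
#define PASID_MAX 64

static unsigned char pasid_used[PASID_MAX];

static int pasid_alloc(void)
{
	int i;

	for (i = 0; i < PASID_MAX; i++) {
		if (!pasid_used[i]) {
			pasid_used[i] = 1;
			return i;	/* the caller is *given* this PASID */
		}
	}
	return -1;			/* PASID space exhausted */
}

static void pasid_free(int pasid)
{
	pasid_used[pasid] = 0;
}
```

Because allocation is centralised, two drivers on the same IOMMU can never hand out colliding PASIDs, and moving to a single system-wide space later only means sharing the allocator.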
--
David Woodhouse Open Source Technology Centre
David.Woodhouse@intel.com Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
* [PATCH 1/7] iommu/vt-d: Introduce intel_iommu=pasid28, and pasid_enabled() macro
From: David Woodhouse @ 2015-10-08 23:52 UTC (permalink / raw)
To: iommu, intel-gfx, jesse.barnes
As long as we use an identity mapping to work around the worst of the
hardware bugs which caused us to defeature it and change the definition
of the capability bit, we *can* use PASID support on the devices which
advertised it in bit 28 of the Extended Capability Register.
Allow people to do so with 'intel_iommu=pasid28' on the command line.
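A standalone sketch of the capability test this introduces (illustrative, not the patch itself; bit positions mirror the patch: bit 40 is the current ECAP PASID bit, bit 28 the pre-production location that "intel_iommu=pasid28" opts back into):

```c
#include <stdint.h>

/* Capability-bit logic from the patch, as standalone macros. */
#define ecap_pasid(e)		(((e) >> 40) & 0x1)
#define ecap_broken_pasid(e)	(((e) >> 28) & 0x1)

static int intel_iommu_pasid28;	/* set by intel_iommu=pasid28 */

static int pasid_enabled(uint64_t ecap)
{
	return ecap_pasid(ecap) ||
	       (intel_iommu_pasid28 && ecap_broken_pasid(ecap));
}
```

So hardware advertising only the broken bit 28 is ignored by default, and is honoured only after the user explicitly opts in on the command line.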
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
---
drivers/iommu/intel-iommu.c | 20 ++++++++++++++------
include/linux/intel-iommu.h | 2 +-
2 files changed, 15 insertions(+), 7 deletions(-)
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 041bc18..a1514a5 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -497,13 +497,21 @@ static int dmar_forcedac;
static int intel_iommu_strict;
static int intel_iommu_superpage = 1;
static int intel_iommu_ecs = 1;
+static int intel_iommu_pasid28;
+static int iommu_identity_mapping;
+
+#define IDENTMAP_ALL 1
+#define IDENTMAP_GFX 2
+#define IDENTMAP_AZALIA 4
/* We only actually use ECS when PASID support (on the new bit 40)
* is also advertised. Some early implementations — the ones with
* PASID support on bit 28 — have issues even when we *only* use
* extended root/context tables. */
+#define pasid_enabled(iommu) (ecap_pasid(iommu->ecap) || \
+ (intel_iommu_pasid28 && ecap_broken_pasid(iommu->ecap)))
#define ecs_enabled(iommu) (intel_iommu_ecs && ecap_ecs(iommu->ecap) && \
- ecap_pasid(iommu->ecap))
+ pasid_enabled(iommu))
int intel_iommu_gfx_mapped;
EXPORT_SYMBOL_GPL(intel_iommu_gfx_mapped);
@@ -566,6 +574,11 @@ static int __init intel_iommu_setup(char *str)
printk(KERN_INFO
"Intel-IOMMU: disable extended context table support\n");
intel_iommu_ecs = 0;
+ } else if (!strncmp(str, "pasid28", 7)) {
+ printk(KERN_INFO
+ "Intel-IOMMU: enable pre-production PASID support\n");
+ intel_iommu_pasid28 = 1;
+ iommu_identity_mapping |= IDENTMAP_GFX;
}
str += strcspn(str, ",");
@@ -2399,11 +2412,6 @@ found_domain:
return domain;
}
-static int iommu_identity_mapping;
-#define IDENTMAP_ALL 1
-#define IDENTMAP_GFX 2
-#define IDENTMAP_AZALIA 4
-
static int iommu_domain_identity_map(struct dmar_domain *domain,
unsigned long long start,
unsigned long long end)
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 6240063..c03316d 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -123,7 +123,7 @@ static inline void dmar_writeq(void __iomem *addr, u64 val)
#define ecap_srs(e) ((e >> 31) & 0x1)
#define ecap_ers(e) ((e >> 30) & 0x1)
#define ecap_prs(e) ((e >> 29) & 0x1)
-/* PASID support used to be on bit 28 */
+#define ecap_broken_pasid(e) ((e >> 28) & 0x1)
#define ecap_dis(e) ((e >> 27) & 0x1)
#define ecap_nest(e) ((e >> 26) & 0x1)
#define ecap_mts(e) ((e >> 25) & 0x1)
--
2.4.3
--
David Woodhouse Open Source Technology Centre
David.Woodhouse@intel.com Intel Corporation
* [PATCH 2/7] iommu/vt-d: Add initial support for PASID tables
From: David Woodhouse @ 2015-10-08 23:52 UTC (permalink / raw)
To: iommu, intel-gfx, jesse.barnes
Add CONFIG_INTEL_IOMMU_SVM, and allocate PASID tables on supported hardware.
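The allocation-order arithmetic in intel_svm_alloc_pasid_tables() can be checked with a standalone sketch (illustrative only; EX_PAGE_SHIFT stands in for the kernel's PAGE_SHIFT on x86, and the 2^(pss + 7) total is what the shift arithmetic in the patch implies):

```c
/* For a PASID size field of pss, the table takes 2^(pss + 7) bytes,
 * i.e. an allocation order of (pss + 7 - PAGE_SHIFT), clamped so that
 * small PASID spaces still get one whole page. */
#define EX_PAGE_SHIFT 12	/* 4KiB pages, as on x86 */

static int pasid_table_order(int pss)
{
	int order = pss + 7 - EX_PAGE_SHIFT;

	return order < 0 ? 0 : order;
}
```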
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
---
drivers/iommu/Kconfig | 8 ++++++
drivers/iommu/Makefile | 1 +
drivers/iommu/intel-iommu.c | 14 ++++++++++
drivers/iommu/intel-svm.c | 65 +++++++++++++++++++++++++++++++++++++++++++++
include/linux/intel-iommu.h | 15 +++++++++++
5 files changed, 103 insertions(+)
create mode 100644 drivers/iommu/intel-svm.c
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index d9da766..e3b2c2e 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -135,6 +135,14 @@ config INTEL_IOMMU
and include PCI device scope covered by these DMA
remapping devices.
+config INTEL_IOMMU_SVM
+ bool "Support for Shared Virtual Memory with Intel IOMMU"
+ depends on INTEL_IOMMU && X86
+ help
+ Shared Virtual Memory (SVM) provides a facility for devices
+ to access DMA resources through process address space by
+ means of a Process Address Space ID (PASID).
+
config INTEL_IOMMU_DEFAULT_ON
def_bool y
prompt "Enable Intel DMA Remapping Devices by default"
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index c6dcc51..dc6f511 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -12,6 +12,7 @@ obj-$(CONFIG_ARM_SMMU) += arm-smmu.o
obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o
obj-$(CONFIG_DMAR_TABLE) += dmar.o
obj-$(CONFIG_INTEL_IOMMU) += intel-iommu.o
+obj-$(CONFIG_INTEL_IOMMU_SVM) += intel-svm.o
obj-$(CONFIG_IPMMU_VMSA) += ipmmu-vmsa.o
obj-$(CONFIG_IRQ_REMAP) += intel_irq_remapping.o irq_remapping.o
obj-$(CONFIG_OMAP_IOMMU) += omap-iommu.o
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index a1514a5..1f89064 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1680,6 +1680,11 @@ static void free_dmar_iommu(struct intel_iommu *iommu)
/* free context mapping */
free_context_table(iommu);
+
+#ifdef CONFIG_INTEL_IOMMU_SVM
+ if (pasid_enabled(iommu))
+ intel_svm_free_pasid_tables(iommu);
+#endif
}
static struct dmar_domain *alloc_domain(int flags)
@@ -3103,6 +3108,10 @@ static int __init init_dmars(void)
if (!ecap_pass_through(iommu->ecap))
hw_pass_through = 0;
+#ifdef CONFIG_INTEL_IOMMU_SVM
+ if (pasid_enabled(iommu))
+ intel_svm_alloc_pasid_tables(iommu);
+#endif
}
if (iommu_pass_through)
@@ -4118,6 +4127,11 @@ static int intel_iommu_add(struct dmar_drhd_unit *dmaru)
if (ret)
goto out;
+#ifdef CONFIG_INTEL_IOMMU_SVM
+ if (pasid_enabled(iommu))
+ intel_svm_alloc_pasid_tables(iommu);
+#endif
+
if (dmaru->ignored) {
/*
* we always have to disable PMRs or DMA may fail on this device
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
new file mode 100644
index 0000000..9b40ad6
--- /dev/null
+++ b/drivers/iommu/intel-svm.c
@@ -0,0 +1,65 @@
+/*
+ * Copyright © 2015 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * Authors: David Woodhouse <dwmw2@infradead.org>
+ */
+
+#include <linux/intel-iommu.h>
+
+int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
+{
+ struct page *pages;
+ int order;
+
+ order = ecap_pss(iommu->ecap) + 7 - PAGE_SHIFT;
+ if (order < 0)
+ order = 0;
+
+ pages = alloc_pages(GFP_KERNEL | __GFP_ZERO, order);
+ if (!pages) {
+ pr_warn("IOMMU: %s: Failed to allocate PASID table\n",
+ iommu->name);
+ return -ENOMEM;
+ }
+ iommu->pasid_table = page_address(pages);
+ pages = alloc_pages(GFP_KERNEL | __GFP_ZERO, order);
+ if (!pages) {
+ pr_warn("IOMMU: %s: Failed to allocate PASID state table\n",
+ iommu->name);
+ free_pages((unsigned long)iommu->pasid_table, order);
+ iommu->pasid_table = NULL;
+ return -ENOMEM;
+ }
+ iommu->pasid_state_table = page_address(pages);
+ pr_info("%s: Allocated order %d PASID table.\n", iommu->name, order);
+
+ return 0;
+}
+
+int intel_svm_free_pasid_tables(struct intel_iommu *iommu)
+{
+ int order;
+
+ order = ecap_pss(iommu->ecap) + 7 - PAGE_SHIFT;
+ if (order < 0)
+ order = 0;
+
+ if (iommu->pasid_table) {
+ free_pages((unsigned long)iommu->pasid_table, order);
+ iommu->pasid_table = NULL;
+ }
+ if (iommu->pasid_state_table) {
+ free_pages((unsigned long)iommu->pasid_state_table, order);
+ iommu->pasid_state_table = NULL;
+ }
+ return 0;
+}
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index c03316d..47844cb 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -327,6 +327,9 @@ enum {
#define VTD_FLAG_TRANS_PRE_ENABLED (1 << 0)
#define VTD_FLAG_IRQ_REMAP_PRE_ENABLED (1 << 1)
+struct pasid_entry;
+struct pasid_state_entry;
+
struct intel_iommu {
void __iomem *reg; /* Pointer to hardware regs, virtual addr */
u64 reg_phys; /* physical address of hw register set */
@@ -350,6 +353,15 @@ struct intel_iommu {
struct iommu_flush flush;
#endif
+#ifdef CONFIG_INTEL_IOMMU_SVM
+ /* These are large and need to be contiguous, so we allocate just
+ * one for now. We'll maybe want to rethink that if we truly give
+ * devices away to userspace processes (e.g. for DPDK) and don't
+ * want to trust that userspace will use *only* the PASID it was
+ * told to. But while it's all driver-arbitrated, we're fine. */
+ struct pasid_entry *pasid_table;
+ struct pasid_state_entry *pasid_state_table;
+#endif
struct q_inval *qi; /* Queued invalidation info */
u32 *iommu_state; /* Store iommu states between suspend and resume.*/
@@ -389,6 +401,9 @@ extern int qi_submit_sync(struct qi_desc *desc, struct intel_iommu *iommu);
extern int dmar_ir_support(void);
+extern int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu);
+extern int intel_svm_free_pasid_tables(struct intel_iommu *iommu);
+
extern const struct attribute_group *intel_iommu_groups[];
#endif
--
2.4.3
--
David Woodhouse Open Source Technology Centre
David.Woodhouse@intel.com Intel Corporation
* [PATCH 3/7] iommu/vt-d: Add intel_svm_{un,}bind_mm() functions
From: David Woodhouse @ 2015-10-08 23:52 UTC (permalink / raw)
To: iommu, intel-gfx, jesse.barnes
This provides basic PASID support for endpoint devices, tested with a
version of the i915 driver.
A given process can bind to only one device per IOMMU for now. Making
that more generic isn't particularly difficult but isn't needed yet, and
can come later once the basic functionality is stable.
Eventually we'll also want the PASID space to be system-wide, not just
per-IOMMU. But when we have that requirement we'll also have a way to
achieve it.
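One detail worth spelling out: the page-selective invalidation granularity computed in intel_flush_svm_range(). The address-mask is the log2 of the smallest power-of-two page count covering the request, so e.g. a 3-page flush goes out as a 4-page (mask 2) invalidation. A standalone sketch (the two helpers stand in for the kernel's __roundup_pow_of_two() and ilog2()):

```c
/* Standalone equivalents of the kernel helpers, for illustration. */
static unsigned long ex_roundup_pow_of_two(unsigned long n)
{
	unsigned long v = 1;

	while (v < n)
		v <<= 1;
	return v;
}

static int ex_ilog2(unsigned long n)
{
	int log = 0;

	while (n >>= 1)
		log++;
	return log;
}

/* Address-mask (AM) field for a flush of 'pages' pages. */
static int svm_flush_mask(int pages)
{
	return ex_ilog2(ex_roundup_pow_of_two((unsigned long)pages));
}
```

If the resulting mask exceeds what the hardware advertises (cap_max_amask_val), the code falls back to a full non-global flush for the PASID instead.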
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
---
drivers/iommu/intel-iommu.c | 100 ++++++++++++++++++
drivers/iommu/intel-svm.c | 229 ++++++++++++++++++++++++++++++++++++++++++
include/linux/dma_remapping.h | 7 ++
include/linux/intel-iommu.h | 59 ++++++++++-
include/linux/intel-svm.h | 68 +++++++++++++
5 files changed, 458 insertions(+), 5 deletions(-)
create mode 100644 include/linux/intel-svm.h
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 1f89064..a6fd639 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4882,6 +4882,106 @@ static void intel_iommu_remove_device(struct device *dev)
iommu_device_unlink(iommu->iommu_dev, dev);
}
+#ifdef CONFIG_INTEL_IOMMU_SVM
+int intel_iommu_enable_pasid(struct intel_svm *svm)
+{
+ struct device_domain_info *info = NULL;
+ struct context_entry *context;
+ struct dmar_domain *domain;
+ unsigned long flags;
+ u8 bus, devfn;
+ u64 ctx_lo;
+
+ if (iommu_dummy(svm->dev)) {
+ dev_warn(svm->dev,
+ "No IOMMU translation for device; cannot enable SVM\n");
+ return -EINVAL;
+ }
+
+ domain = get_valid_domain_for_dev(svm->dev);
+ if (!domain) {
+ dev_warn(svm->dev, "Cannot get IOMMU domain to enable SVM\n");
+ return -EINVAL;
+ }
+
+ svm->iommu = device_to_iommu(svm->dev, &bus, &devfn);
+ if (!ecs_enabled(svm->iommu)) {
+ dev_dbg(svm->dev, "No ECS support on IOMMU; cannot enable SVM\n");
+ return -EINVAL;
+ }
+ svm->did = domain->iommu_did[svm->iommu->seq_id];
+
+ spin_lock_irqsave(&device_domain_lock, flags);
+ spin_lock(&svm->iommu->lock);
+ context = iommu_context_addr(svm->iommu, bus, devfn, 0);
+ if (WARN_ON(!context)) {
+ spin_unlock(&svm->iommu->lock);
+ spin_unlock_irqrestore(&device_domain_lock, flags);
+ return -EINVAL;
+ }
+
+ ctx_lo = context[0].lo;
+ /* Modes in which the device IOTLB is enabled are 1 and 5. Modes
+ * 3 and 7 are invalid, so we only need to test the low bit of TT */
+ svm->dev_iotlb = (ctx_lo >> 2) & 1;
+
+ if (!(ctx_lo & CONTEXT_PASIDE)) {
+ context[1].hi = (u64)virt_to_phys(svm->iommu->pasid_state_table);
+ context[1].lo = (u64)virt_to_phys(svm->iommu->pasid_table) | ecap_pss(svm->iommu->ecap);
+ wmb();
+ /* CONTEXT_TT_MULTI_LEVEL and CONTEXT_TT_DEV_IOTLB are both
+ * extended to permit requests-with-PASID if the PASIDE bit
+ * is set, which makes sense. For CONTEXT_TT_PASS_THROUGH,
+ * however, the PASIDE bit is ignored and requests-with-PASID
+ * are unconditionally blocked. Which makes less sense.
+ * So convert from CONTEXT_TT_PASS_THROUGH to one of the new
+ * "guest mode" translation types depending on whether ATS
+ * is available or not. */
+ if ((ctx_lo & CONTEXT_TT_MASK) == (CONTEXT_TT_PASS_THROUGH << 2)) {
+ ctx_lo &= ~CONTEXT_TT_MASK;
+ info = iommu_support_dev_iotlb(domain, svm->iommu, bus, devfn);
+ if (info) {
+ ctx_lo |= CONTEXT_TT_PT_PASID_DEV_IOTLB << 2;
+ svm->dev_iotlb = 1;
+ } else
+ ctx_lo |= CONTEXT_TT_PT_PASID << 2;
+ }
+ ctx_lo |= CONTEXT_PASIDE;
+ context[0].lo = ctx_lo;
+ wmb();
+ svm->iommu->flush.flush_context(svm->iommu, svm->did,
+ (((u16)bus) << 8) | devfn,
+ DMA_CCMD_MASK_NOBIT,
+ DMA_CCMD_DEVICE_INVL);
+ }
+ spin_unlock(&svm->iommu->lock);
+ spin_unlock_irqrestore(&device_domain_lock, flags);
+
+ /* This only happens if we just switched from CONTEXT_TT_PASS_THROUGH */
+ if (info)
+ iommu_enable_dev_iotlb(info);
+
+ /* This can also happen when we were already in a dev-iotlb mode */
+ if (svm->dev_iotlb) {
+ svm->qdep = pci_ats_queue_depth(to_pci_dev(svm->dev));
+ if (svm->qdep >= QI_DEV_EIOTLB_MAX_INVS)
+ svm->qdep = 0;
+ svm->sid = (((u16)bus) << 8) | devfn;
+ }
+
+ return 0;
+}
+
+/* Helper function for SVM code, so that we can look up a given PASID
+ * in its IOMMU's pasid_idr for unbinding */
+struct intel_iommu *intel_iommu_device_to_iommu(struct device *dev)
+{
+ u8 bus, devfn;
+
+ return device_to_iommu(dev, &bus, &devfn);
+}
+#endif /* CONFIG_INTEL_IOMMU_SVM */
+
static const struct iommu_ops intel_iommu_ops = {
.capable = intel_iommu_capable,
.domain_alloc = intel_iommu_domain_alloc,
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 9b40ad6..913c3a1 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -14,6 +14,14 @@
*/
#include <linux/intel-iommu.h>
+#include <linux/mmu_notifier.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/intel-svm.h>
+
+struct pasid_entry {
+ u64 val;
+};
int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
{
@@ -40,6 +48,8 @@ int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
return -ENOMEM;
}
iommu->pasid_state_table = page_address(pages);
+ idr_init(&iommu->pasid_idr);
+
pr_info("%s: Allocated order %d PASID table.\n", iommu->name, order);
return 0;
@@ -61,5 +71,224 @@ int intel_svm_free_pasid_tables(struct intel_iommu *iommu)
free_pages((unsigned long)iommu->pasid_state_table, order);
iommu->pasid_state_table = NULL;
}
+ idr_destroy(&iommu->pasid_idr);
return 0;
}
+
+static void intel_flush_svm_range(struct intel_svm *svm,
+ unsigned long address, int pages, int ih)
+{
+ struct qi_desc desc;
+ int mask = ilog2(__roundup_pow_of_two(pages));
+
+ if (pages == -1 || !cap_pgsel_inv(svm->iommu->cap) ||
+ mask > cap_max_amask_val(svm->iommu->cap)) {
+ desc.low = QI_EIOTLB_PASID(svm->pasid) | QI_EIOTLB_DID(svm->did) |
+ QI_EIOTLB_GRAN(QI_GRAN_NONG_PASID) | QI_EIOTLB_TYPE;
+ desc.high = 0;
+ } else {
+ desc.low = QI_EIOTLB_PASID(svm->pasid) | QI_EIOTLB_DID(svm->did) |
+ QI_EIOTLB_GRAN(QI_GRAN_PSI_PASID) | QI_EIOTLB_TYPE;
+ desc.high = QI_EIOTLB_ADDR(address) | QI_EIOTLB_GL(1) |
+ QI_EIOTLB_IH(ih) | QI_EIOTLB_AM(mask);
+ }
+
+ qi_submit_sync(&desc, svm->iommu);
+
+ if (svm->dev_iotlb) {
+ desc.low = QI_DEV_EIOTLB_PASID(svm->pasid) | QI_DEV_EIOTLB_SID(svm->sid) |
+ QI_DEV_EIOTLB_QDEP(svm->qdep) | QI_DEIOTLB_TYPE;
+ if (mask) {
+ unsigned long adr, delta;
+
+ /* Least significant zero bits in the address indicate the
+ * range of the request. So mask them out according to the
+ * size. */
+ adr = address & ((1<<(VTD_PAGE_SHIFT + mask)) - 1);
+
+ /* Now ensure that we round down further if the original
+ * request was not aligned w.r.t. its size */
+ delta = address - adr;
+ if (delta + (pages << VTD_PAGE_SHIFT) >= (1 << (VTD_PAGE_SHIFT + mask)))
+ adr &= ~(1 << (VTD_PAGE_SHIFT + mask));
+ desc.high = QI_DEV_EIOTLB_ADDR(adr) | QI_DEV_EIOTLB_SIZE;
+ } else {
+ desc.high = QI_DEV_EIOTLB_ADDR(address);
+ }
+ qi_submit_sync(&desc, svm->iommu);
+ }
+}
+
+
+static void intel_change_pte(struct mmu_notifier *mn, struct mm_struct *mm,
+ unsigned long address, pte_t pte)
+{
+ struct intel_svm *svm = container_of(mn, struct intel_svm, notifier);
+
+ intel_flush_svm_range(svm, address, 1, 1);
+}
+
+static void intel_invalidate_page(struct mmu_notifier *mn, struct mm_struct *mm,
+ unsigned long address)
+{
+ struct intel_svm *svm = container_of(mn, struct intel_svm, notifier);
+
+ intel_flush_svm_range(svm, address, 1, 1);
+}
+
+/* Pages have been freed at this point */
+static void intel_invalidate_range_end(struct mmu_notifier *mn,
+ struct mm_struct *mm,
+ unsigned long start, unsigned long end)
+{
+ struct intel_svm *svm = container_of(mn, struct intel_svm, notifier);
+
+ intel_flush_svm_range(svm, start,
+ (end - start + PAGE_SIZE - 1) >> VTD_PAGE_SHIFT, 0);
+}
+
+static void intel_flush_pasid(struct intel_svm *svm)
+{
+ struct qi_desc desc;
+
+ desc.high = 0;
+ desc.low = QI_PC_TYPE | QI_PC_DID(svm->did) | QI_PC_PASID_SEL | QI_PC_PASID(svm->pasid);
+
+ qi_submit_sync(&desc, svm->iommu);
+}
+
+static void intel_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
+{
+ struct intel_svm *svm = container_of(mn, struct intel_svm, notifier);
+
+ /* Called either when the process exits, or on the last unbind */
+ svm->iommu->pasid_table[svm->pasid].val = 0;
+
+ intel_flush_pasid(svm);
+ intel_flush_svm_range(svm, 0, -1, 0);
+
+ /* XXX: Callback to device driver to let it know? */
+}
+
+static const struct mmu_notifier_ops intel_mmuops = {
+ .release = intel_mm_release,
+ .change_pte = intel_change_pte,
+ .invalidate_page = intel_invalidate_page,
+ .invalidate_range_end = intel_invalidate_range_end,
+};
+
+static DEFINE_MUTEX(pasid_mutex);
+
+int intel_svm_bind_mm(struct device *dev, int *pasid)
+{
+ struct intel_svm *svm;
+ int pasid_max;
+ int ret;
+
+ BUG_ON(pasid && !current->mm);
+
+ mutex_lock(&pasid_mutex);
+ if (pasid) {
+ struct intel_iommu *iommu = intel_iommu_device_to_iommu(dev);
+ int pasid;
+
+ if (!iommu || !iommu->pasid_table) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ idr_for_each_entry(&iommu->pasid_idr, svm, pasid) {
+ if (svm->mm != current->mm)
+ continue;
+
+ if (dev != svm->dev) {
+ ret = -EBUSY;
+ goto out;
+ }
+
+ kref_get(&svm->kref);
+ goto success;
+ }
+ }
+
+ svm = kzalloc(sizeof(*svm), GFP_KERNEL);
+ if (!svm) {
+ ret = -ENOMEM;
+ goto out;
+ }
+ kref_init(&svm->kref);
+ svm->dev = dev;
+ ret = intel_iommu_enable_pasid(svm);
+ if (ret) {
+ kfree(svm);
+ goto out;
+ }
+ if (!pasid) {
+ /* If they don't actually want to assign a PASID, this is
+ * just an enabling check/preparation. */
+ kfree(svm);
+ goto out;
+ }
+
+ pasid_max = 2 << ecap_pss(svm->iommu->ecap);
+ /* FIXME: Factor in device max too. */
+ ret = idr_alloc(&svm->iommu->pasid_idr, svm, 0, pasid_max - 1,
+ GFP_KERNEL);
+ if (ret < 0) {
+ kfree(svm);
+ goto out;
+ }
+ svm->pasid = ret;
+ svm->notifier.ops = &intel_mmuops;
+ svm->mm = get_task_mm(current);
+ ret = -ENOMEM;
+ if (!svm->mm || (ret = mmu_notifier_register(&svm->notifier, svm->mm))) {
+ idr_remove(&svm->iommu->pasid_idr, svm->pasid);
+ kfree(svm);
+ goto out;
+ }
+ svm->iommu->pasid_table[svm->pasid].val = (u64)__pa(current->mm->pgd) | 1;
+
+ success:
+ *pasid = svm->pasid;
+ ret = 0;
+ out:
+ mutex_unlock(&pasid_mutex);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(intel_svm_bind_mm);
+
+static void intel_mm_free(struct kref *svm_ref)
+{
+ struct intel_svm *svm = container_of(svm_ref, struct intel_svm, kref);
+
+ mmu_notifier_unregister(&svm->notifier, svm->mm);
+
+ idr_remove(&svm->iommu->pasid_idr, svm->pasid);
+ mmput(svm->mm);
+ kfree(svm);
+}
+
+int intel_svm_unbind_mm(struct device *dev, int pasid)
+{
+ struct intel_svm *svm;
+ struct intel_iommu *iommu;
+ int ret = -EINVAL;
+
+ mutex_lock(&pasid_mutex);
+ iommu = intel_iommu_device_to_iommu(dev);
+ if (!iommu || !iommu->pasid_table)
+ goto out;
+
+ svm = idr_find(&iommu->pasid_idr, pasid);
+ if (!svm)
+ goto out;
+
+ kref_put(&svm->kref, intel_mm_free);
+ ret = 0;
+ out:
+ mutex_unlock(&pasid_mutex);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(intel_svm_unbind_mm);
diff --git a/include/linux/dma_remapping.h b/include/linux/dma_remapping.h
index 7ac17f5..0e114bf 100644
--- a/include/linux/dma_remapping.h
+++ b/include/linux/dma_remapping.h
@@ -20,6 +20,13 @@
#define CONTEXT_TT_MULTI_LEVEL 0
#define CONTEXT_TT_DEV_IOTLB 1
#define CONTEXT_TT_PASS_THROUGH 2
+/* Extended context entry types */
+#define CONTEXT_TT_PT_PASID 4
+#define CONTEXT_TT_PT_PASID_DEV_IOTLB 5
+#define CONTEXT_TT_MASK (7ULL << 2)
+
+#define CONTEXT_PRS (1ULL << 9)
+#define CONTEXT_PASIDE (1ULL << 11)
struct intel_iommu;
struct dmar_domain;
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 47844cb..b0df572 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -1,5 +1,9 @@
/*
- * Copyright (c) 2006, Intel Corporation.
+ * Copyright © 2006-2015, Intel Corporation.
+ *
+ * Authors: Ashok Raj <ashok.raj@intel.com>
+ * Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
+ * David Woodhouse <David.Woodhouse@intel.com>
*
* This program is free software; you can redistribute it and/or modify it
* under the terms and conditions of the GNU General Public License,
@@ -13,10 +17,6 @@
* You should have received a copy of the GNU General Public License along with
* this program; if not, write to the Free Software Foundation, Inc., 59 Temple
* Place - Suite 330, Boston, MA 02111-1307 USA.
- *
- * Copyright (C) 2006-2008 Intel Corporation
- * Author: Ashok Raj <ashok.raj@intel.com>
- * Author: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
*/
#ifndef _INTEL_IOMMU_H_
@@ -25,7 +25,9 @@
#include <linux/types.h>
#include <linux/iova.h>
#include <linux/io.h>
+#include <linux/idr.h>
#include <linux/dma_remapping.h>
+#include <linux/mmu_notifier.h>
#include <asm/cacheflush.h>
#include <asm/iommu.h>
@@ -253,6 +255,9 @@ enum {
#define QI_DIOTLB_TYPE 0x3
#define QI_IEC_TYPE 0x4
#define QI_IWD_TYPE 0x5
+#define QI_EIOTLB_TYPE 0x6
+#define QI_PC_TYPE 0x7
+#define QI_DEIOTLB_TYPE 0x8
#define QI_IEC_SELECTIVE (((u64)1) << 4)
#define QI_IEC_IIDEX(idx) (((u64)(idx & 0xffff) << 32))
@@ -280,6 +285,34 @@ enum {
#define QI_DEV_IOTLB_SIZE 1
#define QI_DEV_IOTLB_MAX_INVS 32
+#define QI_PC_PASID(pasid) (((u64)pasid) << 32)
+#define QI_PC_DID(did) (((u64)did) << 16)
+#define QI_PC_GRAN(gran) (((u64)gran) << 4)
+
+#define QI_PC_ALL_PASIDS (QI_PC_TYPE | QI_PC_GRAN(0))
+#define QI_PC_PASID_SEL (QI_PC_TYPE | QI_PC_GRAN(1))
+
+#define QI_EIOTLB_ADDR(addr) ((u64)(addr) & VTD_PAGE_MASK)
+#define QI_EIOTLB_GL(gl) (((u64)gl) << 7)
+#define QI_EIOTLB_IH(ih) (((u64)ih) << 6)
+#define QI_EIOTLB_AM(am) (((u64)am))
+#define QI_EIOTLB_PASID(pasid) (((u64)pasid) << 32)
+#define QI_EIOTLB_DID(did) (((u64)did) << 16)
+#define QI_EIOTLB_GRAN(gran) (((u64)gran) << 4)
+
+#define QI_DEV_EIOTLB_ADDR(a) ((u64)(a) & VTD_PAGE_MASK)
+#define QI_DEV_EIOTLB_SIZE (((u64)1) << 11)
+#define QI_DEV_EIOTLB_GLOB(g) ((u64)g)
+#define QI_DEV_EIOTLB_PASID(p) (((u64)p) << 32)
+#define QI_DEV_EIOTLB_SID(sid) ((u64)((sid) & 0xffff) << 32)
+#define QI_DEV_EIOTLB_QDEP(qd) (((qd) & 0x1f) << 16)
+#define QI_DEV_EIOTLB_MAX_INVS 32
+
+#define QI_GRAN_ALL_ALL 0
+#define QI_GRAN_NONG_ALL 1
+#define QI_GRAN_NONG_PASID 2
+#define QI_GRAN_PSI_PASID 3
+
struct qi_desc {
u64 low, high;
};
@@ -361,6 +394,7 @@ struct intel_iommu {
* told to. But while it's all driver-arbitrated, we're fine. */
struct pasid_entry *pasid_table;
struct pasid_state_entry *pasid_state_table;
+ struct idr pasid_idr;
#endif
struct q_inval *qi; /* Queued invalidation info */
u32 *iommu_state; /* Store iommu states between suspend and resume.*/
@@ -404,6 +438,21 @@ extern int dmar_ir_support(void);
extern int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu);
extern int intel_svm_free_pasid_tables(struct intel_iommu *iommu);
+struct intel_svm {
+ struct kref kref;
+ struct mmu_notifier notifier;
+ struct mm_struct *mm;
+ struct intel_iommu *iommu;
+ struct device *dev;
+ int pasid;
+ u16 did;
+ u16 dev_iotlb:1;
+ u16 sid, qdep;
+};
+
+extern int intel_iommu_enable_pasid(struct intel_svm *svm);
+extern struct intel_iommu *intel_iommu_device_to_iommu(struct device *dev);
+
extern const struct attribute_group *intel_iommu_groups[];
#endif
diff --git a/include/linux/intel-svm.h b/include/linux/intel-svm.h
new file mode 100644
index 0000000..1e84f3e
--- /dev/null
+++ b/include/linux/intel-svm.h
@@ -0,0 +1,68 @@
+/*
+ * Copyright © 2015 Intel Corporation.
+ *
+ * Authors: David Woodhouse <David.Woodhouse@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ */
+
+#ifndef __INTEL_SVM_H__
+#define __INTEL_SVM_H__
+
+struct device;
+
+/**
+ * intel_svm_bind_mm() - Bind the current process to a PASID
+ * @dev: Device to be granted access
+ * @pasid: Address for allocated PASID
+ *
+ * This function attempts to enable PASID support for the given device.
+ * If the @pasid argument is non-%NULL, a PASID is allocated for access
+ * to the MM of the current process.
+ *
+ * By using a %NULL value for the @pasid argument, this function can
+ * be used to simply validate that PASID support is available for the
+ * given device — i.e. that it is behind an IOMMU which has the
+ * requisite support, and is enabled.
+ *
+ * Page faults are handled transparently by the IOMMU code, and there
+ * should be no need for the device driver to be involved. If a page
+ * fault cannot be handled (i.e. is an invalid address rather than
+ * just needs paging in), then the page request will be completed by
+ * the core IOMMU code with appropriate status, and the device itself
+ * can then report the resulting fault to its driver via whatever
+ * mechanism is appropriate.
+ *
+ * Multiple calls from the same process may result in the same PASID
+ * being re-used. A reference count is kept.
+ */
+extern int intel_svm_bind_mm(struct device *dev, int *pasid);
+
+#define intel_svm_available(dev) intel_svm_bind_mm((dev), NULL)
+
+/**
+ * intel_svm_unbind_mm() - Unbind a specified PASID
+ * @dev: Device for which PASID was allocated
+ * @pasid: PASID value to be unbound
+ *
+ * This function allows a PASID to be retired when the device no
+ * longer requires access to the address space of a given process.
+ *
+ * If the use count for the PASID in question reaches zero, the
+ * PASID is revoked and may no longer be used by hardware.
+ *
+ * Device drivers are required to ensure that no access (including
+ * page requests) is currently outstanding for the PASID in question,
+ * before calling this function.
+ */
+extern int intel_svm_unbind_mm(struct device *dev, int pasid);
+
+
+#endif /* __INTEL_SVM_H__ */
--
2.4.3
--
David Woodhouse Open Source Technology Centre
David.Woodhouse@intel.com Intel Corporation
* [PATCH 4/7] iommu/vt-d: Generalise DMAR MSI setup to allow for page request events
2015-10-08 23:50 [PATCH 0/7] Enable SVM for Intel VT-d David Woodhouse
` (2 preceding siblings ...)
2015-10-08 23:52 ` [PATCH 3/7] iommu/vt-d: Add intel_svm_{un, }bind_mm() functions David Woodhouse
@ 2015-10-08 23:53 ` David Woodhouse
2015-10-08 23:53 ` [PATCH 6/7] iommu/vt-d: Enable page request interrupt David Woodhouse
` (2 subsequent siblings)
6 siblings, 0 replies; 12+ messages in thread
From: David Woodhouse @ 2015-10-08 23:53 UTC (permalink / raw)
To: iommu, intel-gfx, jesse.barnes
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
---
drivers/iommu/dmar.c | 42 +++++++++++++++++++++++++++++++-----------
include/linux/intel-iommu.h | 10 +++++++++-
2 files changed, 40 insertions(+), 12 deletions(-)
diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index 8757f8d..80e3c17 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -1086,6 +1086,11 @@ static void free_iommu(struct intel_iommu *iommu)
iommu_device_destroy(iommu->iommu_dev);
if (iommu->irq) {
+ if (iommu->pr_irq) {
+ free_irq(iommu->pr_irq, iommu);
+ dmar_free_hwirq(iommu->pr_irq);
+ iommu->pr_irq = 0;
+ }
free_irq(iommu->irq, iommu);
dmar_free_hwirq(iommu->irq);
iommu->irq = 0;
@@ -1493,53 +1498,68 @@ static const char *dmar_get_fault_reason(u8 fault_reason, int *fault_type)
}
}
+
+static inline int dmar_msi_reg(struct intel_iommu *iommu, int irq)
+{
+	if (iommu->irq == irq)
+		return DMAR_FECTL_REG;
+	else if (iommu->pr_irq == irq)
+		return DMAR_PECTL_REG;
+	else
+		BUG();
+}
+
void dmar_msi_unmask(struct irq_data *data)
{
struct intel_iommu *iommu = irq_data_get_irq_handler_data(data);
+ int reg = dmar_msi_reg(iommu, data->irq);
unsigned long flag;
/* unmask it */
raw_spin_lock_irqsave(&iommu->register_lock, flag);
- writel(0, iommu->reg + DMAR_FECTL_REG);
+ writel(0, iommu->reg + reg);
/* Read a reg to force flush the post write */
- readl(iommu->reg + DMAR_FECTL_REG);
+ readl(iommu->reg + reg);
raw_spin_unlock_irqrestore(&iommu->register_lock, flag);
}
void dmar_msi_mask(struct irq_data *data)
{
- unsigned long flag;
struct intel_iommu *iommu = irq_data_get_irq_handler_data(data);
+ int reg = dmar_msi_reg(iommu, data->irq);
+ unsigned long flag;
/* mask it */
raw_spin_lock_irqsave(&iommu->register_lock, flag);
- writel(DMA_FECTL_IM, iommu->reg + DMAR_FECTL_REG);
+ writel(DMA_FECTL_IM, iommu->reg + reg);
/* Read a reg to force flush the post write */
- readl(iommu->reg + DMAR_FECTL_REG);
+ readl(iommu->reg + reg);
raw_spin_unlock_irqrestore(&iommu->register_lock, flag);
}
void dmar_msi_write(int irq, struct msi_msg *msg)
{
struct intel_iommu *iommu = irq_get_handler_data(irq);
+ int reg = dmar_msi_reg(iommu, irq);
unsigned long flag;
raw_spin_lock_irqsave(&iommu->register_lock, flag);
- writel(msg->data, iommu->reg + DMAR_FEDATA_REG);
- writel(msg->address_lo, iommu->reg + DMAR_FEADDR_REG);
- writel(msg->address_hi, iommu->reg + DMAR_FEUADDR_REG);
+ writel(msg->data, iommu->reg + reg + 4);
+ writel(msg->address_lo, iommu->reg + reg + 8);
+ writel(msg->address_hi, iommu->reg + reg + 12);
raw_spin_unlock_irqrestore(&iommu->register_lock, flag);
}
void dmar_msi_read(int irq, struct msi_msg *msg)
{
struct intel_iommu *iommu = irq_get_handler_data(irq);
+ int reg = dmar_msi_reg(iommu, irq);
unsigned long flag;
raw_spin_lock_irqsave(&iommu->register_lock, flag);
- msg->data = readl(iommu->reg + DMAR_FEDATA_REG);
- msg->address_lo = readl(iommu->reg + DMAR_FEADDR_REG);
- msg->address_hi = readl(iommu->reg + DMAR_FEUADDR_REG);
+ msg->data = readl(iommu->reg + reg + 4);
+ msg->address_lo = readl(iommu->reg + reg + 8);
+ msg->address_hi = readl(iommu->reg + reg + 12);
raw_spin_unlock_irqrestore(&iommu->register_lock, flag);
}
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index b0df572..564a61b 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -59,6 +59,14 @@
#define DMAR_IQA_REG 0x90 /* Invalidation queue addr register */
#define DMAR_ICS_REG 0x9c /* Invalidation complete status register */
#define DMAR_IRTA_REG 0xb8 /* Interrupt remapping table addr register */
+#define DMAR_PQH_REG 0xc0 /* Page request queue head register */
+#define DMAR_PQT_REG 0xc8 /* Page request queue tail register */
+#define DMAR_PQA_REG 0xd0 /* Page request queue address register */
+#define DMAR_PRS_REG 0xdc /* Page request status register */
+#define DMAR_PECTL_REG 0xe0 /* Page request event control register */
+#define DMAR_PEDATA_REG 0xe4 /* Page request event interrupt data register */
+#define DMAR_PEADDR_REG 0xe8 /* Page request event interrupt addr register */
+#define DMAR_PEUADDR_REG 0xec /* Page request event Upper address register */
#define OFFSET_STRIDE (9)
/*
@@ -374,7 +382,7 @@ struct intel_iommu {
int seq_id; /* sequence id of the iommu */
int agaw; /* agaw of this iommu */
int msagaw; /* max sagaw of this iommu */
- unsigned int irq;
+ unsigned int irq, pr_irq;
u16 segment; /* PCI segment# */
unsigned char name[13]; /* Device Name */
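The rewrite of dmar_msi_write()/dmar_msi_read() above relies on the data, address and upper-address registers sitting at fixed +4/+8/+12 offsets from the control register, for both the fault event block and the new page request event block. A compile-time sketch of that assumption, using the register offsets from intel-iommu.h and the VT-d spec:

```c
#include <assert.h>

/* Fault event registers (already in intel-iommu.h). */
#define DMAR_FECTL_REG   0x38	/* Fault event control register */
#define DMAR_FEDATA_REG  0x3c	/* Fault event interrupt data register */
#define DMAR_FEADDR_REG  0x40	/* Fault event interrupt addr register */
#define DMAR_FEUADDR_REG 0x44	/* Fault event upper address register */

/* Page request event registers (added by this patch). */
#define DMAR_PECTL_REG   0xe0	/* Page request event control register */
#define DMAR_PEDATA_REG  0xe4	/* Page request event interrupt data register */
#define DMAR_PEADDR_REG  0xe8	/* Page request event interrupt addr register */
#define DMAR_PEUADDR_REG 0xec	/* Page request event upper address register */

/*
 * Both blocks use the same internal layout, so dmar_msi_reg() only
 * needs to pick the base (control) register and the MSI accessors can
 * use base + 4/8/12 instead of hard-coded FE* register names.
 */
_Static_assert(DMAR_FEDATA_REG  == DMAR_FECTL_REG + 4,  "fault data");
_Static_assert(DMAR_FEADDR_REG  == DMAR_FECTL_REG + 8,  "fault addr");
_Static_assert(DMAR_FEUADDR_REG == DMAR_FECTL_REG + 12, "fault uaddr");
_Static_assert(DMAR_PEDATA_REG  == DMAR_PECTL_REG + 4,  "prq data");
_Static_assert(DMAR_PEADDR_REG  == DMAR_PECTL_REG + 8,  "prq addr");
_Static_assert(DMAR_PEUADDR_REG == DMAR_PECTL_REG + 12, "prq uaddr");
```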
--
2.4.3
--
David Woodhouse Open Source Technology Centre
David.Woodhouse@intel.com Intel Corporation
* [PATCH 5/7] iommu/vt-d: Assume BIOS lies about ATSR for integrated gfx
[not found] ` <1444348223.92154.22.camel-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2015-10-08 23:52 ` [PATCH 1/7] iommu/vt-d: Introduce intel_iommu=pasid28, and pasid_enabled() macro David Woodhouse
@ 2015-10-08 23:53 ` David Woodhouse
1 sibling, 0 replies; 12+ messages in thread
From: David Woodhouse @ 2015-10-08 23:53 UTC (permalink / raw)
To: iommu, intel-gfx, jesse.barnes
If the device itself reports ATS in its PCIe capabilities, but the BIOS
neglects to provide an ATSR structure indicating that the root port can
also cope, then assume the BIOS is lying and trust the device.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
---
drivers/iommu/intel-iommu.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index a6fd639..b5ab009 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1447,9 +1447,14 @@ iommu_support_dev_iotlb (struct dmar_domain *domain, struct intel_iommu *iommu,
if (!pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ATS))
return NULL;
- if (!dmar_find_matched_atsr_unit(pdev))
- return NULL;
-
+ if (!dmar_find_matched_atsr_unit(pdev)) {
+ if (intel_iommu_pasid28 && IS_GFX_DEVICE(pdev) &&
+ pdev->vendor == 0x8086) {
+ pr_warn("BIOS denies ATSR capability for %s; assuming it lies\n",
+ dev_name(info->dev));
+ } else
+ return NULL;
+ }
return info;
}
--
2.4.3
--
David Woodhouse Open Source Technology Centre
David.Woodhouse@intel.com Intel Corporation
* [PATCH 6/7] iommu/vt-d: Enable page request interrupt
2015-10-08 23:50 [PATCH 0/7] Enable SVM for Intel VT-d David Woodhouse
` (3 preceding siblings ...)
2015-10-08 23:53 ` [PATCH 4/7] iommu/vt-d: Generalise DMAR MSI setup to allow for page request events David Woodhouse
@ 2015-10-08 23:53 ` David Woodhouse
2015-10-08 23:54 ` [PATCH 7/7] iommu/vt-d: Implement page request handling David Woodhouse
2015-10-10 13:17 ` [PATCH 0/7] Enable SVM for Intel VT-d David Woodhouse
6 siblings, 0 replies; 12+ messages in thread
From: David Woodhouse @ 2015-10-08 23:53 UTC (permalink / raw)
To: iommu, intel-gfx, jesse.barnes
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
---
drivers/iommu/intel-iommu.c | 22 +++++++++++++-
drivers/iommu/intel-svm.c | 71 +++++++++++++++++++++++++++++++++++++++++++++
include/linux/intel-iommu.h | 5 ++++
3 files changed, 97 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index b5ab009..e27b9c6 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1687,8 +1687,11 @@ static void free_dmar_iommu(struct intel_iommu *iommu)
free_context_table(iommu);
#ifdef CONFIG_INTEL_IOMMU_SVM
- if (pasid_enabled(iommu))
+ if (pasid_enabled(iommu)) {
+ if (ecap_prs(iommu->ecap))
+ intel_svm_finish_prq(iommu);
intel_svm_free_pasid_tables(iommu);
+ }
#endif
}
@@ -3204,6 +3207,13 @@ domains_done:
iommu_flush_write_buffer(iommu);
+#ifdef CONFIG_INTEL_IOMMU_SVM
+ if (pasid_enabled(iommu) && ecap_prs(iommu->ecap)) {
+ ret = intel_svm_enable_prq(iommu);
+ if (ret)
+ goto free_iommu;
+ }
+#endif
ret = dmar_set_interrupt(iommu);
if (ret)
goto free_iommu;
@@ -4148,6 +4158,14 @@ static int intel_iommu_add(struct dmar_drhd_unit *dmaru)
intel_iommu_init_qi(iommu);
iommu_flush_write_buffer(iommu);
+
+#ifdef CONFIG_INTEL_IOMMU_SVM
+ if (pasid_enabled(iommu) && ecap_prs(iommu->ecap)) {
+ ret = intel_svm_enable_prq(iommu);
+ if (ret)
+ goto disable_iommu;
+ }
+#endif
ret = dmar_set_interrupt(iommu);
if (ret)
goto disable_iommu;
@@ -4952,6 +4970,8 @@ int intel_iommu_enable_pasid(struct intel_svm *svm)
ctx_lo |= CONTEXT_TT_PT_PASID << 2;
}
ctx_lo |= CONTEXT_PASIDE;
+ if (svm->dev_iotlb && ecap_prs(svm->iommu->ecap))
+ ctx_lo |= CONTEXT_PRS;
context[0].lo = ctx_lo;
wmb();
svm->iommu->flush.flush_context(svm->iommu, svm->did,
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 913c3a1..1260e87 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -18,6 +18,10 @@
#include <linux/sched.h>
#include <linux/slab.h>
#include <linux/intel-svm.h>
+#include <linux/dmar.h>
+#include <linux/interrupt.h>
+
+static irqreturn_t prq_event_thread(int irq, void *d);
struct pasid_entry {
u64 val;
@@ -75,6 +79,65 @@ int intel_svm_free_pasid_tables(struct intel_iommu *iommu)
return 0;
}
+#define PRQ_ORDER 0
+int intel_svm_enable_prq(struct intel_iommu *iommu)
+{
+ struct page *pages;
+ int irq, ret;
+
+ pages = alloc_pages(GFP_KERNEL | __GFP_ZERO, PRQ_ORDER);
+ if (!pages) {
+ pr_warn("IOMMU: %s: Failed to allocate page request queue\n",
+ iommu->name);
+ return -ENOMEM;
+ }
+ iommu->prq = page_address(pages);
+
+ irq = dmar_alloc_hwirq(DMAR_UNITS_SUPPORTED + iommu->seq_id, iommu->node, iommu);
+ if (irq <= 0) {
+ pr_err("IOMMU: %s: Failed to create IRQ vector for page request queue\n",
+ iommu->name);
+ ret = -EINVAL;
+ err:
+ free_pages((unsigned long)iommu->prq, PRQ_ORDER);
+ iommu->prq = 0;
+ return ret;
+ }
+ iommu->pr_irq = irq;
+
+ snprintf(iommu->prq_name, sizeof(iommu->prq_name), "dmar%d-prq", iommu->seq_id);
+
+ ret = request_threaded_irq(irq, NULL, prq_event_thread, IRQF_ONESHOT,
+ iommu->prq_name, iommu);
+ if (ret) {
+ pr_err("IOMMU: %s: Failed to request IRQ for page request queue\n",
+ iommu->name);
+ dmar_free_hwirq(irq);
+ goto err;
+ }
+ dmar_writeq(iommu->reg + DMAR_PQH_REG, 0ULL);
+ dmar_writeq(iommu->reg + DMAR_PQT_REG, 0ULL);
+ dmar_writeq(iommu->reg + DMAR_PQA_REG, virt_to_phys(iommu->prq) | PRQ_ORDER);
+
+ return 0;
+}
+
+int intel_svm_finish_prq(struct intel_iommu *iommu)
+{
+ dmar_writeq(iommu->reg + DMAR_PQH_REG, 0ULL);
+ dmar_writeq(iommu->reg + DMAR_PQT_REG, 0ULL);
+ dmar_writeq(iommu->reg + DMAR_PQA_REG, 0ULL);
+
+ free_irq(iommu->pr_irq, iommu);
+ dmar_free_hwirq(iommu->pr_irq);
+ iommu->pr_irq = 0;
+
+ free_pages((unsigned long)iommu->prq, PRQ_ORDER);
+ iommu->prq = 0;
+
+ return 0;
+}
+
static void intel_flush_svm_range(struct intel_svm *svm,
unsigned long address, int pages, int ih)
{
@@ -292,3 +355,11 @@ int intel_svm_unbind_mm(struct device *dev, int pasid)
return ret;
}
EXPORT_SYMBOL_GPL(intel_svm_unbind_mm);
+
+
+static irqreturn_t prq_event_thread(int irq, void *d)
+{
+ struct intel_iommu *iommu = d;
+ printk("PRQ event on %s\n", iommu->name);
+ return IRQ_HANDLED;
+}
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 564a61b..bf034b0 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -370,6 +370,7 @@ enum {
struct pasid_entry;
struct pasid_state_entry;
+struct page_req_dsc;
struct intel_iommu {
void __iomem *reg; /* Pointer to hardware regs, virtual addr */
@@ -402,6 +403,8 @@ struct intel_iommu {
* told to. But while it's all driver-arbitrated, we're fine. */
struct pasid_entry *pasid_table;
struct pasid_state_entry *pasid_state_table;
+ struct page_req_dsc *prq;
+ unsigned char prq_name[16]; /* Name for PRQ interrupt */
struct idr pasid_idr;
#endif
struct q_inval *qi; /* Queued invalidation info */
@@ -445,6 +448,8 @@ extern int dmar_ir_support(void);
extern int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu);
extern int intel_svm_free_pasid_tables(struct intel_iommu *iommu);
+extern int intel_svm_enable_prq(struct intel_iommu *iommu);
+extern int intel_svm_finish_prq(struct intel_iommu *iommu);
struct intel_svm {
struct kref kref;
--
2.4.3
--
David Woodhouse Open Source Technology Centre
David.Woodhouse@intel.com Intel Corporation
* [PATCH 7/7] iommu/vt-d: Implement page request handling
2015-10-08 23:50 [PATCH 0/7] Enable SVM for Intel VT-d David Woodhouse
` (4 preceding siblings ...)
2015-10-08 23:53 ` [PATCH 6/7] iommu/vt-d: Enable page request interrupt David Woodhouse
@ 2015-10-08 23:54 ` David Woodhouse
2015-10-10 16:54 ` Chris Wilson
2015-10-10 13:17 ` [PATCH 0/7] Enable SVM for Intel VT-d David Woodhouse
6 siblings, 1 reply; 12+ messages in thread
From: David Woodhouse @ 2015-10-08 23:54 UTC (permalink / raw)
To: iommu, intel-gfx, jesse.barnes
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
---
drivers/iommu/intel-svm.c | 115 +++++++++++++++++++++++++++++++++++++++++++-
include/linux/intel-iommu.h | 21 ++++++++
2 files changed, 134 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 1260e87..ace1e32 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -356,10 +356,121 @@ int intel_svm_unbind_mm(struct device *dev, int pasid)
}
EXPORT_SYMBOL_GPL(intel_svm_unbind_mm);
+/* Page request queue descriptor */
+struct page_req_dsc {
+ u64 srr:1;
+ u64 bof:1;
+ u64 pasid_present:1;
+ u64 lpig:1;
+ u64 pasid:20;
+ u64 bus:8;
+ u64 private:23;
+ u64 prg_index:9;
+ u64 rd_req:1;
+ u64 wr_req:1;
+ u64 exe_req:1;
+ u64 priv_req:1;
+ u64 devfn:8;
+ u64 addr:52;
+};
+#define PRQ_RING_MASK ((0x1000 << PRQ_ORDER) - 0x10)
static irqreturn_t prq_event_thread(int irq, void *d)
{
struct intel_iommu *iommu = d;
- printk("PRQ event on %s\n", iommu->name);
- return IRQ_HANDLED;
+ struct intel_svm *svm = NULL;
+ int head, tail, handled = 0;
+
+ tail = dmar_readq(iommu->reg + DMAR_PQT_REG) & PRQ_RING_MASK;
+ head = dmar_readq(iommu->reg + DMAR_PQH_REG) & PRQ_RING_MASK;
+ while (head != tail) {
+ struct vm_area_struct *vma;
+ struct page_req_dsc *req;
+ struct qi_desc resp;
+ int ret, result;
+ u64 address;
+
+ handled = 1;
+
+ req = &iommu->prq[head / sizeof(*req)];
+
+ result = QI_RESP_INVALID;
+ address = req->addr << PAGE_SHIFT;
+ if (!req->pasid_present) {
+ pr_err("%s: Page request without PASID: %08llx %08llx\n",
+ iommu->name, ((unsigned long long *)req)[0],
+ ((unsigned long long *)req)[1]);
+ goto inval_req;
+ }
+
+ if (!svm || svm->pasid != req->pasid) {
+ mutex_lock(&pasid_mutex);
+ if (svm)
+ kref_put(&svm->kref, &intel_mm_free);
+
+ svm = idr_find(&iommu->pasid_idr, req->pasid);
+ if (!svm) {
+ pr_err("%s: Page request for invalid PASID %d: %08llx %08llx\n",
+ iommu->name, req->pasid, ((unsigned long long *)req)[0],
+ ((unsigned long long *)req)[1]);
+ mutex_unlock(&pasid_mutex);
+ goto inval_req;
+ }
+ /* Strictly speaking, we shouldn't need this. It is forbidden
+ to unbind the PASID while there may still be requests in
+ flight. But let's do it anyway. */
+ kref_get(&svm->kref);
+ mutex_unlock(&pasid_mutex);
+ }
+
+ result = QI_RESP_FAILURE;
+ down_read(&svm->mm->mmap_sem);
+ vma = find_extend_vma(svm->mm, address);
+ if (!vma || address < vma->vm_start)
+ goto hard_fault;
+
+ ret = handle_mm_fault(svm->mm, vma, address,
+ req->wr_req ? FAULT_FLAG_WRITE : 0);
+ if (ret & VM_FAULT_ERROR)
+ goto hard_fault;
+
+ result = QI_RESP_SUCCESS;
+ hard_fault:
+ up_read(&svm->mm->mmap_sem);
+ inval_req:
+ /* Accounting for major/minor faults? */
+
+ if (req->lpig) {
+ /* Page Group Response */
+ resp.low = QI_PGRP_PASID(req->pasid) |
+ QI_PGRP_DID((req->bus << 8) | req->devfn) |
+ QI_PGRP_PASID_P(req->pasid_present) |
+ QI_PGRP_RESP_TYPE;
+ resp.high = QI_PGRP_IDX(req->prg_index) |
+ QI_PGRP_PRIV(req->private) | QI_PGRP_RESP_CODE(result);
+
+ qi_submit_sync(&resp, svm->iommu);
+ } else if (req->srr) {
+ /* Page Stream Response */
+ resp.low = QI_PSTRM_IDX(req->prg_index) |
+ QI_PSTRM_PRIV(req->private) | QI_PSTRM_BUS(req->bus) |
+ QI_PSTRM_PASID(req->pasid) | QI_PSTRM_RESP_TYPE;
+ resp.high = QI_PSTRM_ADDR(address) | QI_PSTRM_DEVFN(req->devfn) |
+ QI_PSTRM_RESP_CODE(result);
+
+ qi_submit_sync(&resp, svm->iommu);
+ }
+
+ head = (head + sizeof(*req)) & PRQ_RING_MASK;
+ }
+
+ if (svm) {
+ mutex_lock(&pasid_mutex);
+ kref_put(&svm->kref, &intel_mm_free);
+ mutex_unlock(&pasid_mutex);
+ }
+
+ dmar_writeq(iommu->reg + DMAR_PQH_REG, tail);
+
+ return IRQ_RETVAL(handled);
}
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index bf034b0..aa7e02d 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -266,6 +266,8 @@ enum {
#define QI_EIOTLB_TYPE 0x6
#define QI_PC_TYPE 0x7
#define QI_DEIOTLB_TYPE 0x8
+#define QI_PGRP_RESP_TYPE 0x9
+#define QI_PSTRM_RESP_TYPE 0xa
#define QI_IEC_SELECTIVE (((u64)1) << 4)
#define QI_IEC_IIDEX(idx) (((u64)(idx & 0xffff) << 32))
@@ -316,6 +318,25 @@ enum {
#define QI_DEV_EIOTLB_QDEP(qd) (((qd) & 0x1f) << 16)
#define QI_DEV_EIOTLB_MAX_INVS 32
+#define QI_PGRP_IDX(idx) (((u64)(idx)) << 55)
+#define QI_PGRP_PRIV(priv) (((u64)(priv)) << 32)
+#define QI_PGRP_RESP_CODE(res) ((u64)(res))
+#define QI_PGRP_PASID(pasid) (((u64)(pasid)) << 32)
+#define QI_PGRP_DID(did) (((u64)(did)) << 16)
+#define QI_PGRP_PASID_P(p) (((u64)(p)) << 4)
+
+#define QI_PSTRM_ADDR(addr) (((u64)(addr)) & VTD_PAGE_MASK)
+#define QI_PSTRM_DEVFN(devfn) (((u64)(devfn)) << 4)
+#define QI_PSTRM_RESP_CODE(res) ((u64)(res))
+#define QI_PSTRM_IDX(idx) (((u64)(idx)) << 55)
+#define QI_PSTRM_PRIV(priv) (((u64)(priv)) << 32)
+#define QI_PSTRM_BUS(bus) (((u64)(bus)) << 24)
+#define QI_PSTRM_PASID(pasid) (((u64)(pasid)) << 4)
+
+#define QI_RESP_SUCCESS 0x0
+#define QI_RESP_INVALID 0x1
+#define QI_RESP_FAILURE 0xf
+
#define QI_GRAN_ALL_ALL 0
#define QI_GRAN_NONG_ALL 1
#define QI_GRAN_NONG_PASID 2
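The geometry assumed by prq_event_thread() above can be checked in isolation: each page request descriptor is two 64-bit bitfield words (128 bits), the queue is one page for PRQ_ORDER 0, and PRQ_RING_MASK both wraps the ring and masks off the low, descriptor-alignment bits of the head/tail register values. A self-contained sketch (the struct is copied from the patch; `prq_next()` is our name for the head-advance step):

```c
#include <assert.h>
#include <stdint.h>

#define PRQ_ORDER 0
#define PRQ_RING_MASK ((0x1000 << PRQ_ORDER) - 0x10)

/* Page request queue descriptor, as defined in the patch. */
struct page_req_dsc {
	uint64_t srr:1;
	uint64_t bof:1;
	uint64_t pasid_present:1;
	uint64_t lpig:1;
	uint64_t pasid:20;
	uint64_t bus:8;
	uint64_t private:23;
	uint64_t prg_index:9;
	uint64_t rd_req:1;
	uint64_t wr_req:1;
	uint64_t exe_req:1;
	uint64_t priv_req:1;
	uint64_t devfn:8;
	uint64_t addr:52;
};

/* Both bitfield words must pack to exactly 64 bits each. */
_Static_assert(sizeof(struct page_req_dsc) == 16, "two 64-bit words");

/* Advance the head offset by one descriptor, wrapping at end of queue. */
static unsigned int prq_next(unsigned int head)
{
	return (head + sizeof(struct page_req_dsc)) & PRQ_RING_MASK;
}
```

With PRQ_ORDER 0 the mask is 0xff0: 256 descriptors per 4KiB page, and `head / sizeof(*req)` indexes straight into `iommu->prq` as the loop does.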
--
2.4.3
--
David Woodhouse Open Source Technology Centre
David.Woodhouse@intel.com Intel Corporation
* Re: [PATCH 0/7] Enable SVM for Intel VT-d
2015-10-08 23:50 [PATCH 0/7] Enable SVM for Intel VT-d David Woodhouse
` (5 preceding siblings ...)
2015-10-08 23:54 ` [PATCH 7/7] iommu/vt-d: Implement page request handling David Woodhouse
@ 2015-10-10 13:17 ` David Woodhouse
[not found] ` <1444483075.92154.98.camel-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2015-10-13 12:06 ` Daniel Vetter
6 siblings, 2 replies; 12+ messages in thread
From: David Woodhouse @ 2015-10-10 13:17 UTC (permalink / raw)
To: iommu, intel-gfx, Barnes, Jesse; +Cc: Oded Gabbay, Alex Deucher
On Fri, 2015-10-09 at 00:50 +0100, David Woodhouse wrote:
> This patch set enables PASID support for the Intel IOMMU, along with
> page request support.
>
> Like its AMD counterpart, it exposes an IOMMU-specific API. I believe
> we'll have a session at the Kernel Summit later this month in which we
> can work out a generic API which will cover the two (now) existing
> implementations as well as upcoming ARM (and other?) versions.
>
> For the time being, however, exposing an Intel-specific API is good
> enough, especially as we don't have the required TLP prefix support on
> our PCIe root ports and we *can't* support discrete PCIe devices with
> PASID support. It's purely on-chip stuff right now, which is basically
> only Intel graphics.
>
> The AMD implementation allows a per-device PASID space, and managing
> the PASID space is left entirely to the device driver. In contrast,
> this implementation maintains a per-IOMMU PASID space, and drivers
> calling intel_svm_bind_mm() will be *given* the PASID that they are to
> use. In general we seem to be converging on using a single PASID space
> across *all* IOMMUs in the system, and this will support that mode of
> operation.
The other noticeable difference is the lifetime management of the mm.
My code takes a reference on it, and will only do the mmput() when the
driver unbinds the PASID. So the mmu_notifier's .release() method won't
get called before that.
The AMD version doesn't take that refcount, and its .release() method
therefore needs to actually call back into the device driver and ensure
that all access to the mm, including pending page faults, is flushed.
The locking issues there scare me a little, especially if page faults
are currently outstanding.
In the i915 case we have an open file descriptor associated with the
gfx context. When the process dies, the fd is closed and the driver can
go and clean up after it.
The amdkfd driver, on the other hand, keeps the device-side job running
even after the process has closed its file descriptor. So it *needs*
the .release() call to happen when the process exits, as it otherwise
doesn't know when to clean up.
I am somewhat dubious about that as a design decision. If we're moving
to a more explicit management of off-cpu tasks with mm access, as is to
be discussed at the Kernel Summit, then hopefully we can fix that. It's
a *lot* simpler if we just pin the mm while the device context has
access to it.
--
dwmw2
* Re: [PATCH 7/7] iommu/vt-d: Implement page request handling
2015-10-08 23:54 ` [PATCH 7/7] iommu/vt-d: Implement page request handling David Woodhouse
@ 2015-10-10 16:54 ` Chris Wilson
0 siblings, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2015-10-10 16:54 UTC (permalink / raw)
To: David Woodhouse; +Cc: iommu, jesse.barnes, intel-gfx
On Fri, Oct 09, 2015 at 12:54:07AM +0100, David Woodhouse wrote:
> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
> ---
> drivers/iommu/intel-svm.c | 115 +++++++++++++++++++++++++++++++++++++++++++-
> include/linux/intel-iommu.h | 21 ++++++++
> 2 files changed, 134 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> index 1260e87..ace1e32 100644
> --- a/drivers/iommu/intel-svm.c
> +++ b/drivers/iommu/intel-svm.c
> static irqreturn_t prq_event_thread(int irq, void *d)
> {
> + if (svm) {
> + mutex_lock(&pasid_mutex);
> + kref_put(&svm->kref, &intel_mm_free);
> + mutex_unlock(&pasid_mutex);
kref_put_mutex(&svm->kref, intel_mm_free, &pasid_mutex); ?
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
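The point of the kref_put_mutex() pattern Chris suggests above is that the mutex is only taken when the reference count actually drops to zero, rather than unconditionally around every put. A userspace sketch of that shape, with a toy lock that counts acquisitions so the behaviour is observable (names and the toy lock are ours, not the kernel API; the real helper uses atomic_dec_and_mutex_lock() and calls release() with the mutex held):

```c
#include <assert.h>

struct toy_lock { int acquisitions; int held; };
struct toy_kref { int count; };

static int toy_released;

static void toy_lock_acquire(struct toy_lock *l) { l->acquisitions++; l->held = 1; }
static void toy_lock_release(struct toy_lock *l) { l->held = 0; }
static void toy_release(struct toy_kref *k) { toy_released = 1; }

/* Put a reference; take the lock and call release() only on the last put. */
static int toy_kref_put_lock(struct toy_kref *k,
			     void (*release)(struct toy_kref *),
			     struct toy_lock *lock)
{
	if (--k->count == 0) {
		toy_lock_acquire(lock);	/* lock traffic only on the final put */
		release(k);
		toy_lock_release(lock);
		return 1;
	}
	return 0;
}
```

In the PRQ thread this would fold the mutex_lock/kref_put/mutex_unlock triple into one call, avoiding mutex traffic on every loop iteration that merely drops a transient reference.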
* Re: [PATCH 0/7] Enable SVM for Intel VT-d
[not found] ` <1444483075.92154.98.camel-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
@ 2015-10-11 13:48 ` Oded Gabbay
0 siblings, 0 replies; 12+ messages in thread
From: Oded Gabbay @ 2015-10-11 13:48 UTC (permalink / raw)
To: David Woodhouse
Cc: intel-gfx, iommu, Barnes, Jesse, Alex Deucher
On Sat, Oct 10, 2015 at 4:17 PM, David Woodhouse <dwmw2@infradead.org> wrote:
>
> On Fri, 2015-10-09 at 00:50 +0100, David Woodhouse wrote:
> > This patch set enables PASID support for the Intel IOMMU, along with
> > page request support.
> >
> > Like its AMD counterpart, it exposes an IOMMU-specific API. I believe
> > we'll have a session at the Kernel Summit later this month in which we
> > can work out a generic API which will cover the two (now) existing
> > implementations as well as upcoming ARM (and other?) versions.
> >
> > For the time being, however, exposing an Intel-specific API is good
> > enough, especially as we don't have the required TLP prefix support on
> > our PCIe root ports and we *can't* support discrete PCIe devices with
> > PASID support. It's purely on-chip stuff right now, which is basically
> > only Intel graphics.
> >
> > The AMD implementation allows a per-device PASID space, and managing
> > the PASID space is left entirely to the device driver. In contrast,
> > this implementation maintains a per-IOMMU PASID space, and drivers
> > calling intel_svm_bind_mm() will be *given* the PASID that they are to
> > use. In general we seem to be converging on using a single PASID space
> > across *all* IOMMUs in the system, and this will support that mode of
> > operation.
>
> The other noticeable difference is the lifetime management of the mm.
> My code takes a reference on it, and will only do the mmput() when the
> driver unbinds the PASID. So the mmu_notifier's .release() method won't
> get called before that.
>
> The AMD version doesn't take that refcount, and its .release() method
> therefore needs to actually call back into the device driver and ensure
> that all access to the mm, including pending page faults, is flushed.
> The locking issues there scare me a little, especially if page faults
> are currently outstanding.
>
> In the i915 case we have an open file descriptor associated with the
> gfx context. When the process dies, the fd is closed and the driver can
> go and clean up after it.
>
> The amdkfd driver, on the other hand, keeps the device-side job running
> even after the process has closed its file descriptor. So it *needs*
> the .release() call to happen when the process exits, as it otherwise
> doesn't know when to clean up.
>
> I am somewhat dubious about that as a design decision. If we're moving
> to a more explicit management of off-cpu tasks with mm access, as is to
> be discussed at the Kernel Summit, then hopefully we can fix that. It's
> a *lot* simpler if we just pin the mm while the device context has
> access to it.
>
> --
> dwmw2
>
Hi David,
There was a whole debate about this issue (amdkfd binding to mm struct
lifespan instead of to fd) when we upstreamed amdkfd, with good
arguments for and against. If you want to understand the reasons, I
suggest reading the following email thread:
https://lists.linuxfoundation.org/pipermail/iommu/2014-July/009005.html
TL;DR, IIRC, the bottom line was that (over-simplified):
1. HSA/amdkfd is not a "classic" device driver, as it performs
operations in the context of a process working on multiple devices and
doesn't contain an "instance per device". It's conceptually more like
a subsystem/system-call interface than a device driver.
2. It is not a one-of-a-kind in the kernel, as there are other drivers
which use this method.
Oded
* Re: [PATCH 0/7] Enable SVM for Intel VT-d
2015-10-10 13:17 ` [PATCH 0/7] Enable SVM for Intel VT-d David Woodhouse
[not found] ` <1444483075.92154.98.camel-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
@ 2015-10-13 12:06 ` Daniel Vetter
1 sibling, 0 replies; 12+ messages in thread
From: Daniel Vetter @ 2015-10-13 12:06 UTC (permalink / raw)
To: David Woodhouse
Cc: Oded Gabbay, Alex Deucher, iommu, Barnes, Jesse, intel-gfx
On Sat, Oct 10, 2015 at 02:17:55PM +0100, David Woodhouse wrote:
> On Fri, 2015-10-09 at 00:50 +0100, David Woodhouse wrote:
> > This patch set enables PASID support for the Intel IOMMU, along with
> > page request support.
> >
> > Like its AMD counterpart, it exposes an IOMMU-specific API. I believe
> > we'll have a session at the Kernel Summit later this month in which we
> > can work out a generic API which will cover the two (now) existing
> > implementations as well as upcoming ARM (and other?) versions.
> >
> > For the time being, however, exposing an Intel-specific API is good
> > enough, especially as we don't have the required TLP prefix support on
> > our PCIe root ports and we *can't* support discrete PCIe devices with
> > PASID support. It's purely on-chip stuff right now, which is basically
> > only Intel graphics.
> >
> > The AMD implementation allows a per-device PASID space, and managing
> > the PASID space is left entirely to the device driver. In contrast,
> > this implementation maintains a per-IOMMU PASID space, and drivers
> > calling intel_svm_bind_mm() will be *given* the PASID that they are to
> > use. In general we seem to be converging on using a single PASID space
> > across *all* IOMMUs in the system, and this will support that mode of
> > operation.
>
> The other noticeable difference is the lifetime management of the mm.
> My code takes a reference on it, and will only do the mmput() when the
> driver unbinds the PASID. So the mmu_notifier's .release() method won't
> get called before that.
>
> The AMD version doesn't take that refcount, and its .release() method
> therefore needs to actually call back into the device driver and ensure
> that all access to the mm, including pending page faults, is flushed.
> The locking issues there scare me a little, especially if page faults
> are currently outstanding.
>
> In the i915 case we have an open file descriptor associated with the
> gfx context. When the process dies, the fd is closed and the driver can
> go and clean up after it.
>
> The amdkfd driver, on the other hand, keeps the device-side job running
> even after the process has closed its file descriptor. So it *needs*
> the .release() call to happen when the process exits, as it otherwise
> doesn't know when to clean up.
>
> I am somewhat dubious about that as a design decision. If we're moving
> to a more explicit management of off-cpu tasks with mm access, as is to
> be discussed at the Kernel Summit, then hopefully we can fix that. It's
> a *lot* simpler if we just pin the mm while the device context has
> access to it.
I think acquiring a full reference on the mm makes sense. Conceptually an
svm context is just another compute thread, just not running on the cpu.
The other way round would mean that at mm exit we tear down these
additional threads, which seems a bit backwards.
Of course that special thread is attached to an fd, which has a completely
separate lifetime from threads. There's also the fun that a different mm
could submit commands to a foreign svm context. So no perfect fit, but on
a hunch still feels like grabbing a full reference is the cleaner design.
And it matches the refcounting we do for traditional gpu contexts on the
ppgtt address space they're using.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch