* [Qemu-devel] [RFC 01/20] hw/arm/smmu-common: Fix the name of the iommu memory regions
2018-09-01 14:22 [Qemu-devel] [RFC 00/20] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
@ 2018-09-01 14:22 ` Eric Auger
2018-09-01 14:22 ` [Qemu-devel] [RFC 02/20] update-linux-headers: Import iommu.h Eric Auger
` (18 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Eric Auger @ 2018-09-01 14:22 UTC (permalink / raw)
To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
Cc: alex.williamson, mst, cdall, jean-philippe.brucker, peterx,
yi.l.liu
When smmu_find_add_as() gets called, the bus number might not have
been computed yet. Let's change the name of the IOMMU memory region
and just use the devfn and an incrementing index.
The name is only used for debugging.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
hw/arm/smmu-common.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index 55c75d65d2..3f55cfd193 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -311,6 +311,7 @@ static AddressSpace *smmu_find_add_as(PCIBus *bus, void *opaque, int devfn)
SMMUState *s = opaque;
SMMUPciBus *sbus = g_hash_table_lookup(s->smmu_pcibus_by_busptr, bus);
SMMUDevice *sdev;
+ static uint index;
if (!sbus) {
sbus = g_malloc0(sizeof(SMMUPciBus) +
@@ -321,9 +322,8 @@ static AddressSpace *smmu_find_add_as(PCIBus *bus, void *opaque, int devfn)
sdev = sbus->pbdev[devfn];
if (!sdev) {
- char *name = g_strdup_printf("%s-%d-%d",
- s->mrtypename,
- pci_bus_num(bus), devfn);
+ char *name = g_strdup_printf("%s-%d-%d", s->mrtypename, devfn, index++);
+
sdev = sbus->pbdev[devfn] = g_new0(SMMUDevice, 1);
sdev->smmu = s;
--
2.17.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
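The naming scheme described in patch 01/20 can be sketched outside of QEMU as follows. This is a minimal illustration only: `make_iommu_mr_name()` is a hypothetical helper, and it uses `snprintf()` into a caller-supplied buffer where the patch uses `g_strdup_printf()`. The function-local static counter mirrors the `static uint index` added by the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of QEMU: formats "<type>-<devfn>-<index>"
 * into buf and bumps a function-local counter, so two regions created for
 * the same devfn still get distinct names.
 */
static int make_iommu_mr_name(char *buf, size_t len,
                              const char *mrtypename, int devfn)
{
    static unsigned int index; /* zero-initialized, shared across calls */

    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%s-%d-%u", mrtypename, devfn, index++);
}
```

Because the counter never resets, the resulting names are unique but not stable across runs, which seems acceptable for a string that only shows up in debug output.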
* [Qemu-devel] [RFC 02/20] update-linux-headers: Import iommu.h
2018-09-01 14:22 [Qemu-devel] [RFC 00/20] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
2018-09-01 14:22 ` [Qemu-devel] [RFC 01/20] hw/arm/smmu-common: Fix the name of the iommu memory regions Eric Auger
@ 2018-09-01 14:22 ` Eric Auger
2018-09-01 14:22 ` [Qemu-devel] [RFC 03/20] linux-headers: Partial header update Eric Auger
` (17 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Eric Auger @ 2018-09-01 14:22 UTC (permalink / raw)
To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
Cc: alex.williamson, mst, cdall, jean-philippe.brucker, peterx,
yi.l.liu
Update the script to import the new iommu.h uapi header.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
scripts/update-linux-headers.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/scripts/update-linux-headers.sh b/scripts/update-linux-headers.sh
index 0a964fe240..3f11e49758 100755
--- a/scripts/update-linux-headers.sh
+++ b/scripts/update-linux-headers.sh
@@ -125,7 +125,7 @@ done
rm -rf "$output/linux-headers/linux"
mkdir -p "$output/linux-headers/linux"
-for header in kvm.h vfio.h vfio_ccw.h vhost.h \
+for header in kvm.h vfio.h iommu.h vfio_ccw.h vhost.h \
psci.h psp-sev.h userfaultfd.h; do
cp "$tmpdir/include/linux/$header" "$output/linux-headers/linux"
done
--
2.17.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [Qemu-devel] [RFC 03/20] linux-headers: Partial header update
2018-09-01 14:22 [Qemu-devel] [RFC 00/20] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
2018-09-01 14:22 ` [Qemu-devel] [RFC 01/20] hw/arm/smmu-common: Fix the name of the iommu memory regions Eric Auger
2018-09-01 14:22 ` [Qemu-devel] [RFC 02/20] update-linux-headers: Import iommu.h Eric Auger
@ 2018-09-01 14:22 ` Eric Auger
2018-09-01 14:22 ` [Qemu-devel] [RFC 04/20] memory: add IOMMU_ATTR_VFIO_NESTED IOMMU memory region attribute Eric Auger
` (16 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Eric Auger @ 2018-09-01 14:22 UTC (permalink / raw)
To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
Cc: alex.williamson, mst, cdall, jean-philippe.brucker, peterx,
yi.l.liu
This imports both the iommu.h and vfio.h headers found on branch
https://github.com/eauger/linux/tree/v4.18-2stage-rfc++
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
linux-headers/linux/iommu.h | 157 ++++++++++++++++++++++++++++++++++++
linux-headers/linux/vfio.h | 28 ++++++-
2 files changed, 182 insertions(+), 3 deletions(-)
create mode 100644 linux-headers/linux/iommu.h
diff --git a/linux-headers/linux/iommu.h b/linux-headers/linux/iommu.h
new file mode 100644
index 0000000000..72536cf412
--- /dev/null
+++ b/linux-headers/linux/iommu.h
@@ -0,0 +1,157 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * IOMMU user API definitions
+ *
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef _UAPI_IOMMU_H
+#define _UAPI_IOMMU_H
+
+#include <linux/types.h>
+
+/**
+ * PASID table data used to bind guest PASID table to the host IOMMU. This will
+ * enable guest managed first level page tables.
+ * @version: for future extensions and identification of the data format
+ * @bytes: size of this structure
+ * @base_ptr: PASID table pointer
+ * @pasid_bits: number of bits supported in the guest PASID table, must be less
+ * or equal than the host supported PASID size.
+ */
+struct iommu_pasid_table_config {
+ __u32 version;
+#define PASID_TABLE_CFG_VERSION_1 1
+ __u32 bytes;
+ __u64 base_ptr;
+ __u8 pasid_bits;
+};
+
+/**
+ * Stream Table Entry stage info
+ * @flags: indicate the stage 1 state
+ * @cdptr_dma: GPA of the Context Descriptor
+ * @asid_bits: number of asid bits supported in the guest, must be less or
+ * equal than the host asid size
+ */
+struct iommu_smmu_s1_config {
+#define IOMMU_SMMU_S1_DISABLED (1 << 0)
+#define IOMMU_SMMU_S1_BYPASSED (1 << 1)
+#define IOMMU_SMMU_S1_ABORTED (1 << 2)
+ __u32 flags;
+ __u64 cdptr_dma;
+ __u8 asid_bits;
+};
+
+struct iommu_guest_stage_config {
+#define PASID_TABLE (1 << 0)
+#define SMMUV3_S1_CFG (1 << 1)
+ __u32 flags;
+ union {
+ struct iommu_pasid_table_config pasidt;
+ struct iommu_smmu_s1_config smmu_s1;
+ };
+};
+
+/**
+ * enum iommu_inv_granularity - Generic invalidation granularity
+ * @IOMMU_INV_GRANU_DOMAIN_ALL_PASID: TLB entries or PASID caches of all
+ * PASIDs associated with a domain ID
+ * @IOMMU_INV_GRANU_PASID_SEL: TLB entries or PASID cache associated
+ * with a PASID and a domain
+ * @IOMMU_INV_GRANU_PAGE_PASID: TLB entries of selected page range
+ * within a PASID
+ *
+ * When an invalidation request is passed down to IOMMU to flush translation
+ * caches, it may carry different granularity levels, which can be specific
+ * to certain types of translation caches.
+ * This enum is a collection of granularities for all types of translation
+ * caches. The idea is to make it easy for IOMMU model specific driver to
+ * convert from generic to model specific value. Each IOMMU driver
+ * can enforce check based on its own conversion table. The conversion is
+ * based on 2D look-up with inputs as follows:
+ * - translation cache types
+ * - granularity
+ *
+ * type | DTLB | TLB | PASID |
+ * granule | | | cache |
+ * -----------------+-----------+-----------+-----------+
+ * DN_ALL_PASID | Y | Y | Y |
+ * PASID_SEL | Y | Y | Y |
+ * PAGE_PASID | Y | Y | N/A |
+ *
+ */
+enum iommu_inv_granularity {
+ IOMMU_INV_GRANU_DOMAIN_ALL_PASID,
+ IOMMU_INV_GRANU_PASID_SEL,
+ IOMMU_INV_GRANU_PAGE_PASID,
+ IOMMU_INV_NR_GRANU,
+};
+
+/**
+ * enum iommu_inv_type - Generic translation cache types for invalidation
+ *
+ * @IOMMU_INV_TYPE_DTLB: device IOTLB
+ * @IOMMU_INV_TYPE_TLB: IOMMU paging structure cache
+ * @IOMMU_INV_TYPE_PASID: PASID cache
+ * Invalidation requests sent to IOMMU for a given device need to indicate
+ * which type of translation cache to be operated on. Combined with enum
+ * iommu_inv_granularity, model specific driver can do a simple lookup to
+ * convert from generic to model specific value.
+ */
+enum iommu_inv_type {
+ IOMMU_INV_TYPE_DTLB,
+ IOMMU_INV_TYPE_TLB,
+ IOMMU_INV_TYPE_PASID,
+ IOMMU_INV_NR_TYPE
+};
+
+/**
+ * Translation cache invalidation header that contains mandatory meta data.
+ * @version: info format version, expecting future extensions
+ * @type: type of translation cache to be invalidated
+ */
+struct iommu_tlb_invalidate_hdr {
+ __u32 version;
+#define TLB_INV_HDR_VERSION_1 1
+ enum iommu_inv_type type;
+};
+
+/**
+ * Translation cache invalidation information, contains generic IOMMU
+ * data which can be parsed based on model ID by model specific drivers.
+ * Since the invalidation of second level page tables are included in the
+ * unmap operation, this info is only applicable to the first level
+ * translation caches, i.e. DMA request with PASID.
+ *
+ * @granularity: requested invalidation granularity, type dependent
+ * @size: 2^size of 4K pages, 0 for 4k, 9 for 2MB, etc.
+ * @nr_pages: number of pages to invalidate
+ * @pasid: processor address space ID value per PCI spec.
+ * @addr: page address to be invalidated
+ * @flags IOMMU_INVALIDATE_ADDR_LEAF: leaf paging entries
+ * IOMMU_INVALIDATE_GLOBAL_PAGE: global pages
+ *
+ */
+struct iommu_tlb_invalidate_info {
+ struct iommu_tlb_invalidate_hdr hdr;
+ enum iommu_inv_granularity granularity;
+ __u32 flags;
+#define IOMMU_INVALIDATE_ADDR_LEAF (1 << 0)
+#define IOMMU_INVALIDATE_GLOBAL_PAGE (1 << 1)
+ __u8 size;
+ __u64 nr_pages;
+ __u32 pasid;
+ __u64 addr;
+};
+
+struct iommu_guest_msi_binding {
+ __u64 iova;
+ __u64 gpa;
+ __u32 granule;
+};
+#endif /* _UAPI_IOMMU_H */
+
diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index 3615a269d3..8b22565fba 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -9,11 +9,12 @@
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
-#ifndef VFIO_H
-#define VFIO_H
+#ifndef _UAPIVFIO_H
+#define _UAPIVFIO_H
#include <linux/types.h>
#include <linux/ioctl.h>
+#include <linux/iommu.h>
#define VFIO_API_VERSION 0
@@ -665,6 +666,27 @@ struct vfio_iommu_type1_dma_unmap {
#define VFIO_IOMMU_ENABLE _IO(VFIO_TYPE, VFIO_BASE + 15)
#define VFIO_IOMMU_DISABLE _IO(VFIO_TYPE, VFIO_BASE + 16)
+struct vfio_iommu_type1_bind_guest_stage {
+ __u32 argsz;
+ __u32 flags;
+ struct iommu_guest_stage_config config;
+};
+#define VFIO_IOMMU_BIND_GUEST_STAGE _IO(VFIO_TYPE, VFIO_BASE + 22)
+
+struct vfio_iommu_type1_tlb_invalidate {
+ __u32 argsz;
+ __u32 flags;
+ struct iommu_tlb_invalidate_info info;
+};
+#define VFIO_IOMMU_TLB_INVALIDATE _IO(VFIO_TYPE, VFIO_BASE + 23)
+
+struct vfio_iommu_type1_bind_guest_msi {
+ __u32 argsz;
+ __u32 flags;
+ struct iommu_guest_msi_binding binding;
+};
+#define VFIO_IOMMU_BIND_MSI _IO(VFIO_TYPE, VFIO_BASE + 24)
+
/* -------- Additional API for SPAPR TCE (Server POWERPC) IOMMU -------- */
/*
@@ -816,4 +838,4 @@ struct vfio_iommu_spapr_tce_remove {
/* ***************************************************************** */
-#endif /* VFIO_H */
+#endif /* _UAPIVFIO_H */
--
2.17.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
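The header comment in patch 03/20 describes a 2D look-up keyed by translation cache type and granularity. A minimal sketch of such a table follows. The two enums are copied from the imported iommu.h; `inv_supported`, `inv_request_is_valid()` and `inv_block_bytes()` are hypothetical names introduced here for illustration and do not appear in the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Local copies of the enums from the imported iommu.h */
enum iommu_inv_granularity {
    IOMMU_INV_GRANU_DOMAIN_ALL_PASID,
    IOMMU_INV_GRANU_PASID_SEL,
    IOMMU_INV_GRANU_PAGE_PASID,
    IOMMU_INV_NR_GRANU,
};

enum iommu_inv_type {
    IOMMU_INV_TYPE_DTLB,
    IOMMU_INV_TYPE_TLB,
    IOMMU_INV_TYPE_PASID,
    IOMMU_INV_NR_TYPE
};

/* Rows: granularity, columns: cache type (DTLB, TLB, PASID cache). */
static const bool inv_supported[IOMMU_INV_NR_GRANU][IOMMU_INV_NR_TYPE] = {
    [IOMMU_INV_GRANU_DOMAIN_ALL_PASID] = { true, true, true },
    [IOMMU_INV_GRANU_PASID_SEL]        = { true, true, true },
    [IOMMU_INV_GRANU_PAGE_PASID]       = { true, true, false }, /* N/A */
};

/* Hypothetical helper: true if the (type, granularity) pair is meaningful. */
static bool inv_request_is_valid(enum iommu_inv_type type,
                                 enum iommu_inv_granularity granu)
{
    if ((unsigned)type >= IOMMU_INV_NR_TYPE ||
        (unsigned)granu >= IOMMU_INV_NR_GRANU) {
        return false;
    }
    return inv_supported[granu][type];
}

/* Hypothetical helper: bytes in one block of 2^size 4K pages (@size field). */
static unsigned long long inv_block_bytes(unsigned char size)
{
    assert(size < 52); /* keep 4096 << size within 64 bits */
    return 4096ULL << size;
}
```

A model-specific driver would presumably store its own hardware encoding in each cell rather than a boolean; the boolean table only captures the Y and N/A entries from the comment.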
* [Qemu-devel] [RFC 04/20] memory: add IOMMU_ATTR_VFIO_NESTED IOMMU memory region attribute
2018-09-01 14:22 [Qemu-devel] [RFC 00/20] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
` (2 preceding siblings ...)
2018-09-01 14:22 ` [Qemu-devel] [RFC 03/20] linux-headers: Partial header update Eric Auger
@ 2018-09-01 14:22 ` Eric Auger
2018-09-01 14:22 ` [Qemu-devel] [RFC 05/20] hw/arm/smmuv3: Implement get_attr API to report IOMMU_ATTR_VFIO_NESTED Eric Auger
` (15 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Eric Auger @ 2018-09-01 14:22 UTC (permalink / raw)
To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
Cc: alex.williamson, mst, cdall, jean-philippe.brucker, peterx,
yi.l.liu
We introduce a new IOMMU Memory Region attribute, IOMMU_ATTR_VFIO_NESTED,
which tells whether the virtual IOMMU requires physical nested
stages for VFIO integration. The Intel virtual IOMMU supports Caching
Mode and does not require 2 stages at the physical level. However, the
virtual ARM SMMU does not implement such a caching mode and requires
the use of physical stage 1 for VFIO integration.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
include/exec/memory.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index eb4f2fb249..b6e59c139c 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -209,7 +209,8 @@ struct MemoryRegionOps {
};
enum IOMMUMemoryRegionAttr {
- IOMMU_ATTR_SPAPR_TCE_FD
+ IOMMU_ATTR_SPAPR_TCE_FD,
+ IOMMU_ATTR_VFIO_NESTED,
};
/**
--
2.17.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [Qemu-devel] [RFC 05/20] hw/arm/smmuv3: Implement get_attr API to report IOMMU_ATTR_VFIO_NESTED
2018-09-01 14:22 [Qemu-devel] [RFC 00/20] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
` (3 preceding siblings ...)
2018-09-01 14:22 ` [Qemu-devel] [RFC 04/20] memory: add IOMMU_ATTR_VFIO_NESTED IOMMU memory region attribute Eric Auger
@ 2018-09-01 14:22 ` Eric Auger
2018-09-01 14:22 ` [Qemu-devel] [RFC 06/20] hw/vfio/common: Refactor container initialization Eric Auger
` (14 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Eric Auger @ 2018-09-01 14:22 UTC (permalink / raw)
To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
Cc: alex.williamson, mst, cdall, jean-philippe.brucker, peterx,
yi.l.liu
Virtual SMMUv3 requires physical nested stages for VFIO integration.
Let's advertise this.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
hw/arm/smmuv3.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index bb6a24e9b8..80aa4f3793 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -1504,6 +1504,17 @@ static void smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
}
}
+static int smmuv3_get_attr(IOMMUMemoryRegion *iommu,
+ enum IOMMUMemoryRegionAttr attr,
+ void *data)
+{
+ if (attr == IOMMU_ATTR_VFIO_NESTED) {
+ *(bool *) data = true;
+ return 0;
+ }
+ return -EINVAL;
+}
+
static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
void *data)
{
@@ -1511,6 +1522,7 @@ static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
imrc->translate = smmuv3_translate;
imrc->notify_flag_changed = smmuv3_notify_flag_changed;
+ imrc->get_attr = smmuv3_get_attr;
}
static const TypeInfo smmuv3_type_info = {
--
2.17.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
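Patches 04/20 and 05/20 rely on an attribute query with an out-parameter: the callback writes a bool through `data` and returns 0, or returns -EINVAL for attributes it does not know. The sketch below reproduces that pattern with stand-in types. `MockIOMMU` and `iommu_requires_nested()` are assumptions made for this illustration and are not QEMU APIs; the callback body follows `smmuv3_get_attr()` from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum IOMMUMemoryRegionAttr {
    IOMMU_ATTR_SPAPR_TCE_FD,
    IOMMU_ATTR_VFIO_NESTED,
};

/* Stand-in for an IOMMU memory region class with an optional get_attr hook. */
typedef struct MockIOMMU MockIOMMU;
struct MockIOMMU {
    int (*get_attr)(MockIOMMU *iommu, enum IOMMUMemoryRegionAttr attr,
                    void *data);
};

/* Same logic as smmuv3_get_attr() in the patch, on the mock type. */
static int mock_smmuv3_get_attr(MockIOMMU *iommu,
                                enum IOMMUMemoryRegionAttr attr, void *data)
{
    (void)iommu;
    if (attr == IOMMU_ATTR_VFIO_NESTED) {
        *(bool *)data = true;
        return 0;
    }
    return -EINVAL;
}

/* Hypothetical caller: defaults to false when the hook is absent or fails. */
static bool iommu_requires_nested(MockIOMMU *iommu)
{
    bool nested = false;

    if (iommu && iommu->get_attr &&
        iommu->get_attr(iommu, IOMMU_ATTR_VFIO_NESTED, &nested) != 0) {
        nested = false;
    }
    return nested;
}
```

Initializing the flag to false before the query means a virtual IOMMU that does not implement the hook is treated as not requiring nesting.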
* [Qemu-devel] [RFC 06/20] hw/vfio/common: Refactor container initialization
2018-09-01 14:22 [Qemu-devel] [RFC 00/20] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
` (4 preceding siblings ...)
2018-09-01 14:22 ` [Qemu-devel] [RFC 05/20] hw/arm/smmuv3: Implement get_attr API to report IOMMU_ATTR_VFIO_NESTED Eric Auger
@ 2018-09-01 14:22 ` Eric Auger
2018-09-01 14:22 ` [Qemu-devel] [RFC 07/20] hw/vfio/common: Force nested if iommu requires it Eric Auger
` (13 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Eric Auger @ 2018-09-01 14:22 UTC (permalink / raw)
To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
Cc: alex.williamson, mst, cdall, jean-philippe.brucker, peterx,
yi.l.liu
To prepare for testing yet another extension, let's
refactor the code. We introduce the vfio_iommu_get_type()
helper, which selects the richest API (v2 first). Then
vfio_init_container() does the SET_CONTAINER and
SET_IOMMU ioctl calls. So we end up with a switch/case
on the iommu_type, which should be a little more readable
when introducing the NESTING extension check. Also, the ioctls
get called once per iommu_type.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
hw/vfio/common.c | 102 ++++++++++++++++++++++++++++++-----------------
1 file changed, 65 insertions(+), 37 deletions(-)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 3f31f80b12..39f400b077 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1036,12 +1036,58 @@ static void vfio_put_address_space(VFIOAddressSpace *space)
}
}
+/*
+ * vfio_iommu_get_type - selects the richest iommu_type (v2 first)
+ * nested only is selected if requested by @force_nested
+ */
+static int vfio_iommu_get_type(VFIOContainer *container,
+ Error **errp)
+{
+ int fd = container->fd;
+
+ if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1v2_IOMMU)) {
+ return VFIO_TYPE1v2_IOMMU;
+ } else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU)) {
+ return VFIO_TYPE1_IOMMU;
+ } else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_v2_IOMMU)) {
+ return VFIO_SPAPR_TCE_v2_IOMMU;
+ } else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_IOMMU)) {
+ return VFIO_SPAPR_TCE_IOMMU;
+ } else {
+ error_setg(errp, "No available IOMMU models");
+ return -EINVAL;
+ }
+}
+
+static int vfio_init_container(VFIOContainer *container, int group_fd,
+ int iommu_type, Error **errp)
+{
+ int ret;
+
+ ret = ioctl(group_fd, VFIO_GROUP_SET_CONTAINER, &container->fd);
+ if (ret) {
+ error_setg_errno(errp, errno, "failed to set group container");
+ return -errno;
+ }
+
+ ret = ioctl(container->fd, VFIO_SET_IOMMU, iommu_type);
+ if (ret) {
+ error_setg_errno(errp, errno, "failed to set iommu for container");
+ return -errno;
+ }
+ container->iommu_type = iommu_type;
+ return 0;
+}
+
static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
Error **errp)
{
VFIOContainer *container;
int ret, fd;
VFIOAddressSpace *space;
+ int iommu_type;
+ bool v2 = false;
+
space = vfio_get_address_space(as);
@@ -1101,23 +1147,20 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
container->fd = fd;
QLIST_INIT(&container->giommu_list);
QLIST_INIT(&container->hostwin_list);
- if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU) ||
- ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1v2_IOMMU)) {
- bool v2 = !!ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1v2_IOMMU);
+
+ iommu_type = vfio_iommu_get_type(container, errp);
+ if (iommu_type < 0) {
+ goto free_container_exit;
+ }
+
+ switch (iommu_type) {
+ case VFIO_TYPE1v2_IOMMU:
+ case VFIO_TYPE1_IOMMU:
+ {
struct vfio_iommu_type1_info info;
- ret = ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &fd);
+ ret = vfio_init_container(container, group->fd, iommu_type, errp);
if (ret) {
- error_setg_errno(errp, errno, "failed to set group container");
- ret = -errno;
- goto free_container_exit;
- }
-
- container->iommu_type = v2 ? VFIO_TYPE1v2_IOMMU : VFIO_TYPE1_IOMMU;
- ret = ioctl(fd, VFIO_SET_IOMMU, container->iommu_type);
- if (ret) {
- error_setg_errno(errp, errno, "failed to set iommu for container");
- ret = -errno;
goto free_container_exit;
}
@@ -1137,28 +1180,16 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
}
vfio_host_win_add(container, 0, (hwaddr)-1, info.iova_pgsizes);
container->pgsizes = info.iova_pgsizes;
- } else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_IOMMU) ||
- ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_v2_IOMMU)) {
+ break;
+ }
+ case VFIO_SPAPR_TCE_v2_IOMMU:
+ v2 = true;
+ case VFIO_SPAPR_TCE_IOMMU:
+ {
struct vfio_iommu_spapr_tce_info info;
- bool v2 = !!ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_v2_IOMMU);
- ret = ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &fd);
+ ret = vfio_init_container(container, group->fd, iommu_type, errp);
if (ret) {
- error_setg_errno(errp, errno, "failed to set group container");
- ret = -errno;
- goto free_container_exit;
- }
- container->iommu_type =
- v2 ? VFIO_SPAPR_TCE_v2_IOMMU : VFIO_SPAPR_TCE_IOMMU;
- ret = ioctl(fd, VFIO_SET_IOMMU, container->iommu_type);
- if (ret) {
- container->iommu_type = VFIO_SPAPR_TCE_IOMMU;
- v2 = false;
- ret = ioctl(fd, VFIO_SET_IOMMU, container->iommu_type);
- }
- if (ret) {
- error_setg_errno(errp, errno, "failed to set iommu for container");
- ret = -errno;
goto free_container_exit;
}
@@ -1222,10 +1253,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
info.dma32_window_size - 1,
0x1000);
}
- } else {
- error_setg(errp, "No available IOMMU models");
- ret = -EINVAL;
- goto free_container_exit;
+ }
}
vfio_kvm_device_add_group(group);
--
2.17.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
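The "richest API first" selection done by vfio_iommu_get_type() in patch 06/20 can be sketched as a walk over a preference list. This is an illustration under assumptions: the `MOCK_*` values are arbitrary stand-ins for the VFIO type constants, and the `probe_fn` callback replaces `ioctl(fd, VFIO_CHECK_EXTENSION, type)` so the logic can run without a VFIO container.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Arbitrary stand-in values, not the real VFIO constants. */
enum mock_iommu_type {
    MOCK_TYPE1 = 1,
    MOCK_TYPE1V2,
    MOCK_SPAPR_TCE,
    MOCK_SPAPR_TCE_V2,
};

/* Stand-in for the extension check: returns non-zero if type is supported. */
typedef int (*probe_fn)(int type);

/* Returns the first supported type in preference order, or -EINVAL. */
static int pick_iommu_type(probe_fn probe)
{
    static const int preference[] = {
        MOCK_TYPE1V2, MOCK_TYPE1, MOCK_SPAPR_TCE_V2, MOCK_SPAPR_TCE,
    };
    size_t i;

    for (i = 0; i < sizeof(preference) / sizeof(preference[0]); i++) {
        if (probe(preference[i])) {
            return preference[i];
        }
    }
    return -EINVAL;
}

/* Example probes used to exercise the selection. */
static int probe_type1_family(int type)
{
    return type == MOCK_TYPE1 || type == MOCK_TYPE1V2;
}

static int probe_type1_only(int type)
{
    return type == MOCK_TYPE1;
}

static int probe_none(int type)
{
    (void)type;
    return 0;
}
```

With the type chosen up front, the caller can switch on a single value and issue the container setup calls once, which is the readability gain the commit message mentions.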
* [Qemu-devel] [RFC 07/20] hw/vfio/common: Force nested if iommu requires it
2018-09-01 14:22 [Qemu-devel] [RFC 00/20] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
` (5 preceding siblings ...)
2018-09-01 14:22 ` [Qemu-devel] [RFC 06/20] hw/vfio/common: Refactor container initialization Eric Auger
@ 2018-09-01 14:22 ` Eric Auger
2018-09-01 14:23 ` [Qemu-devel] [RFC 08/20] memory: Introduce IOMMUIOLTBNotifier Eric Auger
` (12 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Eric Auger @ 2018-09-01 14:22 UTC (permalink / raw)
To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
Cc: alex.williamson, mst, cdall, jean-philippe.brucker, peterx,
yi.l.liu
In case we detect the address space is translated by
a virtual IOMMU which requires nested stages, let's set up
the container with the VFIO_TYPE1_NESTING_IOMMU iommu_type.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
hw/vfio/common.c | 23 ++++++++++++++++++++---
1 file changed, 20 insertions(+), 3 deletions(-)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 39f400b077..53ff7a6b39 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1041,11 +1041,15 @@ static void vfio_put_address_space(VFIOAddressSpace *space)
* nested only is selected if requested by @force_nested
*/
static int vfio_iommu_get_type(VFIOContainer *container,
- Error **errp)
+ bool force_nested, Error **errp)
{
int fd = container->fd;
- if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1v2_IOMMU)) {
+ if (force_nested &&
+ ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1_NESTING_IOMMU)) {
+ /* NESTED implies v2 */
+ return VFIO_TYPE1_NESTING_IOMMU;
+ } else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1v2_IOMMU)) {
return VFIO_TYPE1v2_IOMMU;
} else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU)) {
return VFIO_TYPE1_IOMMU;
@@ -1085,9 +1089,16 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
VFIOContainer *container;
int ret, fd;
VFIOAddressSpace *space;
+ IOMMUMemoryRegion *iommu_mr;
int iommu_type;
+ bool force_nested = false;
bool v2 = false;
+ if (as != &address_space_memory && memory_region_is_iommu(as->root)) {
+ iommu_mr = IOMMU_MEMORY_REGION(as->root);
+ memory_region_iommu_get_attr(iommu_mr, IOMMU_ATTR_VFIO_NESTED,
+ (void *)&force_nested);
+ }
space = vfio_get_address_space(as);
@@ -1148,12 +1159,18 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
QLIST_INIT(&container->giommu_list);
QLIST_INIT(&container->hostwin_list);
- iommu_type = vfio_iommu_get_type(container, errp);
+ iommu_type = vfio_iommu_get_type(container, force_nested, errp);
if (iommu_type < 0) {
goto free_container_exit;
}
+ if (force_nested && iommu_type != VFIO_TYPE1_NESTING_IOMMU) {
+ error_setg(errp, "nested mode requested by the virtual IOMMU "
+ "but not supported by the vfio iommu");
+ }
+
switch (iommu_type) {
+ case VFIO_TYPE1_NESTING_IOMMU:
case VFIO_TYPE1v2_IOMMU:
case VFIO_TYPE1_IOMMU:
{
--
2.17.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
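Patch 07/20 adds a nested type that is selected only when the virtual IOMMU asks for it. The sketch below folds the selection and the mismatch check into one function. It is a simplification under stated assumptions: the `MOCK_*` values are arbitrary, a bitmask replaces the per-type extension probes, and the function returns -EINVAL on a mismatch. In the hunk as posted, the mismatch sets the error but execution appears to continue into the switch, so treating it as fatal is an assumption about the intent.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Arbitrary stand-in values, not the real VFIO constants. */
enum mock_iommu_type {
    MOCK_TYPE1 = 1,
    MOCK_TYPE1V2,
    MOCK_TYPE1_NESTING,
};

/*
 * supported: bitmask with bit (1u << type) set for each type the host offers.
 * force_nested: true when the virtual IOMMU requires physical nested stages.
 */
static int pick_type(unsigned int supported, bool force_nested)
{
    if (force_nested) {
        /* Assumption: fail instead of silently falling back to v2. */
        return (supported & (1u << MOCK_TYPE1_NESTING))
               ? MOCK_TYPE1_NESTING : -EINVAL;
    }
    if (supported & (1u << MOCK_TYPE1V2)) {
        return MOCK_TYPE1V2;
    }
    if (supported & (1u << MOCK_TYPE1)) {
        return MOCK_TYPE1;
    }
    return -EINVAL;
}
```

Nesting is never picked unless requested, so hosts that offer it keep using v2 for guests whose virtual IOMMU does not need it.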
* [Qemu-devel] [RFC 08/20] memory: Introduce IOMMUIOLTBNotifier
2018-09-01 14:22 [Qemu-devel] [RFC 00/20] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
` (6 preceding siblings ...)
2018-09-01 14:22 ` [Qemu-devel] [RFC 07/20] hw/vfio/common: Force nested if iommu requires it Eric Auger
@ 2018-09-01 14:23 ` Eric Auger
2018-09-01 14:23 ` [Qemu-devel] [RFC 09/20] memory: rename memory_region notify_iommu, notify_one Eric Auger
` (11 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Eric Auger @ 2018-09-01 14:23 UTC (permalink / raw)
To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
Cc: alex.williamson, mst, cdall, jean-philippe.brucker, peterx,
yi.l.liu
Current IOMMUNotifiers are dedicated to IOTLB related notifications.
We want to introduce notifiers for virtual IOMMU config changes.
Let's create a new IOMMUIOLTBNotifier datatype. This paves the way
for the introduction of an IOMMUConfigNotifier. IOMMUNotifier
now has an iotlb_notifier field. We change all call sites.
We also rename IOMMU_NOTIFIER_ALL to IOMMU_NOTIFIER_IOTLB_ALL.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
exec.c | 12 ++++++------
hw/arm/smmu-common.c | 4 ++--
hw/i386/intel_iommu.c | 6 +++---
hw/vfio/common.c | 12 ++++++------
hw/virtio/vhost.c | 12 ++++++------
include/exec/memory.h | 25 +++++++++++++++----------
memory.c | 10 +++++-----
7 files changed, 43 insertions(+), 38 deletions(-)
diff --git a/exec.c b/exec.c
index 6826c8337d..1411660289 100644
--- a/exec.c
+++ b/exec.c
@@ -683,12 +683,12 @@ static void tcg_register_iommu_notifier(CPUState *cpu,
* just register interest in the whole thing, on the assumption
* that iommu reconfiguration will be rare.
*/
- iommu_notifier_init(&notifier->n,
- tcg_iommu_unmap_notify,
- IOMMU_NOTIFIER_UNMAP,
- 0,
- HWADDR_MAX,
- iommu_idx);
+ iommu_iotlb_notifier_init(&notifier->n,
+ tcg_iommu_unmap_notify,
+ IOMMU_NOTIFIER_UNMAP,
+ 0,
+ HWADDR_MAX,
+ iommu_idx);
memory_region_register_iommu_notifier(notifier->mr, &notifier->n);
}
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index 3f55cfd193..ad6ef2135b 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -391,9 +391,9 @@ static void smmu_unmap_notifier_range(IOMMUNotifier *n)
IOMMUTLBEntry entry;
entry.target_as = &address_space_memory;
- entry.iova = n->start;
+ entry.iova = n->iotlb_notifier.start;
entry.perm = IOMMU_NONE;
- entry.addr_mask = n->end - n->start;
+ entry.addr_mask = n->iotlb_notifier.end - n->iotlb_notifier.start;
memory_region_notify_one(n, &entry);
}
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 0a8cd4e9cc..7acbd6b21e 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2947,8 +2947,8 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
{
IOMMUTLBEntry entry;
hwaddr size;
- hwaddr start = n->start;
- hwaddr end = n->end;
+ hwaddr start = n->iotlb_notifier.start;
+ hwaddr end = n->iotlb_notifier.end;
IntelIOMMUState *s = as->iommu_state;
DMAMap map;
@@ -2984,7 +2984,7 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
entry.target_as = &address_space_memory;
/* Adjust iova for the size */
- entry.iova = n->start & ~(size - 1);
+ entry.iova = n->iotlb_notifier.start & ~(size - 1);
/* This field is meaningless for unmap */
entry.translated_addr = 0;
entry.perm = IOMMU_NONE;
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 53ff7a6b39..b6673fcf49 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -527,11 +527,11 @@ static void vfio_listener_region_add(MemoryListener *listener,
llend = int128_sub(llend, int128_one());
iommu_idx = memory_region_iommu_attrs_to_index(iommu_mr,
MEMTXATTRS_UNSPECIFIED);
- iommu_notifier_init(&giommu->n, vfio_iommu_map_notify,
- IOMMU_NOTIFIER_ALL,
- section->offset_within_region,
- int128_get64(llend),
- iommu_idx);
+ iommu_iotlb_notifier_init(&giommu->n, vfio_iommu_map_notify,
+ IOMMU_NOTIFIER_IOTLB_ALL,
+ section->offset_within_region,
+ int128_get64(llend),
+ iommu_idx);
QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
memory_region_register_iommu_notifier(section->mr, &giommu->n);
@@ -625,7 +625,7 @@ static void vfio_listener_region_del(MemoryListener *listener,
QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) {
if (MEMORY_REGION(giommu->iommu) == section->mr &&
- giommu->n.start == section->offset_within_region) {
+ giommu->n.iotlb_notifier.start == section->offset_within_region) {
memory_region_unregister_iommu_notifier(section->mr,
&giommu->n);
QLIST_REMOVE(giommu, giommu_next);
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index d4cb5894a8..c21b9b8be9 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -677,11 +677,11 @@ static void vhost_iommu_region_add(MemoryListener *listener,
end = int128_sub(end, int128_one());
iommu_idx = memory_region_iommu_attrs_to_index(iommu_mr,
MEMTXATTRS_UNSPECIFIED);
- iommu_notifier_init(&iommu->n, vhost_iommu_unmap_notify,
- IOMMU_NOTIFIER_UNMAP,
- section->offset_within_region,
- int128_get64(end),
- iommu_idx);
+ iommu_iotlb_notifier_init(&iommu->n, vhost_iommu_unmap_notify,
+ IOMMU_NOTIFIER_UNMAP,
+ section->offset_within_region,
+ int128_get64(end),
+ iommu_idx);
iommu->mr = section->mr;
iommu->iommu_offset = section->offset_within_address_space -
section->offset_within_region;
@@ -704,7 +704,7 @@ static void vhost_iommu_region_del(MemoryListener *listener,
QLIST_FOREACH(iommu, &dev->iommu_list, iommu_next) {
if (iommu->mr == section->mr &&
- iommu->n.start == section->offset_within_region) {
+ iommu->n.iotlb_notifier.start == section->offset_within_region) {
memory_region_unregister_iommu_notifier(iommu->mr,
&iommu->n);
QLIST_REMOVE(iommu, iommu_next);
diff --git a/include/exec/memory.h b/include/exec/memory.h
index b6e59c139c..31fc859c6b 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -86,18 +86,22 @@ typedef enum {
IOMMU_NOTIFIER_MAP = 0x2,
} IOMMUNotifierFlag;
-#define IOMMU_NOTIFIER_ALL (IOMMU_NOTIFIER_MAP | IOMMU_NOTIFIER_UNMAP)
+#define IOMMU_NOTIFIER_IOTLB_ALL (IOMMU_NOTIFIER_MAP | IOMMU_NOTIFIER_UNMAP)
struct IOMMUNotifier;
typedef void (*IOMMUNotify)(struct IOMMUNotifier *notifier,
IOMMUTLBEntry *data);
-struct IOMMUNotifier {
+typedef struct IOMMUIOLTBNotifier {
IOMMUNotify notify;
- IOMMUNotifierFlag notifier_flags;
/* Notify for address space range start <= addr <= end */
hwaddr start;
hwaddr end;
+} IOMMUIOLTBNotifier;
+
+struct IOMMUNotifier {
+ IOMMUNotifierFlag notifier_flags;
+ IOMMUIOLTBNotifier iotlb_notifier;
int iommu_idx;
QLIST_ENTRY(IOMMUNotifier) node;
};
@@ -126,15 +130,16 @@ typedef struct IOMMUNotifier IOMMUNotifier;
/* RAM is a persistent kind memory */
#define RAM_PMEM (1 << 5)
-static inline void iommu_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
- IOMMUNotifierFlag flags,
- hwaddr start, hwaddr end,
- int iommu_idx)
+static inline void iommu_iotlb_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
+ IOMMUNotifierFlag flags,
+ hwaddr start, hwaddr end,
+ int iommu_idx)
{
- n->notify = fn;
+ assert(flags & IOMMU_NOTIFIER_MAP || flags & IOMMU_NOTIFIER_UNMAP);
n->notifier_flags = flags;
- n->start = start;
- n->end = end;
+ n->iotlb_notifier.notify = fn;
+ n->iotlb_notifier.start = start;
+ n->iotlb_notifier.end = end;
n->iommu_idx = iommu_idx;
}
diff --git a/memory.c b/memory.c
index 9b73892768..b7e2e43b68 100644
--- a/memory.c
+++ b/memory.c
@@ -1800,7 +1800,7 @@ void memory_region_register_iommu_notifier(MemoryRegion *mr,
/* We need to register for at least one bitfield */
iommu_mr = IOMMU_MEMORY_REGION(mr);
assert(n->notifier_flags != IOMMU_NOTIFIER_NONE);
- assert(n->start <= n->end);
+ assert(n->iotlb_notifier.start <= n->iotlb_notifier.end);
assert(n->iommu_idx >= 0 &&
n->iommu_idx < memory_region_iommu_num_indexes(iommu_mr));
@@ -1836,7 +1836,7 @@ void memory_region_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
for (addr = 0; addr < memory_region_size(mr); addr += granularity) {
iotlb = imrc->translate(iommu_mr, addr, IOMMU_NONE, n->iommu_idx);
if (iotlb.perm != IOMMU_NONE) {
- n->notify(n, &iotlb);
+ n->iotlb_notifier.notify(n, &iotlb);
}
/* if (2^64 - MR size) < granularity, it's possible to get an
@@ -1879,8 +1879,8 @@ void memory_region_notify_one(IOMMUNotifier *notifier,
* Skip the notification if the notification does not overlap
* with registered range.
*/
- if (notifier->start > entry->iova + entry->addr_mask ||
- notifier->end < entry->iova) {
+ if (notifier->iotlb_notifier.start > entry->iova + entry->addr_mask ||
+ notifier->iotlb_notifier.end < entry->iova) {
return;
}
@@ -1891,7 +1891,7 @@ void memory_region_notify_one(IOMMUNotifier *notifier,
}
if (notifier->notifier_flags & request_flags) {
- notifier->notify(notifier, entry);
+ notifier->iotlb_notifier.notify(notifier, entry);
}
}
--
2.17.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
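After patch 08/20 the range fields live in a nested iotlb_notifier member, and the overlap test in memory_region_notify_one() reads them through it. A minimal sketch of that layout and check follows, using mock types. `MockIOTLBNotifier`, `MockNotifier` and `notifier_overlaps()` are names assumed for this illustration; the comparison itself is the one from the memory.c hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t hwaddr;

/* IOTLB-specific part: notifies for start <= addr <= end. */
typedef struct MockIOTLBNotifier {
    hwaddr start;
    hwaddr end;
} MockIOTLBNotifier;

/* Generic notifier embedding the IOTLB-specific part. */
typedef struct MockNotifier {
    int notifier_flags;
    MockIOTLBNotifier iotlb_notifier;
} MockNotifier;

/*
 * True if the entry [iova, iova + addr_mask] intersects the registered
 * range; the notification is skipped otherwise.
 */
static bool notifier_overlaps(const MockNotifier *n, hwaddr iova,
                              hwaddr addr_mask)
{
    return !(n->iotlb_notifier.start > iova + addr_mask ||
             n->iotlb_notifier.end < iova);
}
```

Grouping the range and callback in their own struct leaves room for a second, config-oriented member next to it, which is the direction the commit message describes.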
* [Qemu-devel] [RFC 09/20] memory: rename memory_region notify_iommu, notify_one
2018-09-01 14:22 [Qemu-devel] [RFC 00/20] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
` (7 preceding siblings ...)
2018-09-01 14:23 ` [Qemu-devel] [RFC 08/20] memory: Introduce IOMMUIOLTBNotifier Eric Auger
@ 2018-09-01 14:23 ` Eric Auger
2018-09-01 14:23 ` [Qemu-devel] [RFC 10/20] memory: Add IOMMUConfigNotifier Eric Auger
` (10 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Eric Auger @ 2018-09-01 14:23 UTC (permalink / raw)
To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
Cc: alex.williamson, mst, cdall, jean-philippe.brucker, peterx,
yi.l.liu
Let's rename those notification functions to clearly distinguish
IOTLB notifications from the upcoming config notifications.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
hw/arm/smmu-common.c | 2 +-
hw/arm/smmuv3.c | 2 +-
hw/i386/intel_iommu.c | 10 +++++-----
hw/misc/tz-mpc.c | 8 ++++----
hw/ppc/spapr_iommu.c | 2 +-
hw/s390x/s390-pci-inst.c | 4 ++--
include/exec/memory.h | 20 ++++++++++----------
memory.c | 12 ++++++------
8 files changed, 30 insertions(+), 30 deletions(-)
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index ad6ef2135b..70b014e618 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -395,7 +395,7 @@ static void smmu_unmap_notifier_range(IOMMUNotifier *n)
entry.perm = IOMMU_NONE;
entry.addr_mask = n->iotlb_notifier.end - n->iotlb_notifier.start;
- memory_region_notify_one(n, &entry);
+ memory_region_iotlb_notify_one(n, &entry);
}
/* Unmap all notifiers attached to @mr */
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 80aa4f3793..c4bd368355 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -822,7 +822,7 @@ static void smmuv3_notify_iova(IOMMUMemoryRegion *mr,
entry.addr_mask = (1 << tt->granule_sz) - 1;
entry.perm = IOMMU_NONE;
- memory_region_notify_one(n, &entry);
+ memory_region_iotlb_notify_one(n, &entry);
}
/* invalidate an asid/iova tuple in all mr's */
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 7acbd6b21e..318dd83e7f 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1023,7 +1023,7 @@ static int vtd_dev_to_context_entry(IntelIOMMUState *s, uint8_t bus_num,
static int vtd_sync_shadow_page_hook(IOMMUTLBEntry *entry,
void *private)
{
- memory_region_notify_iommu((IOMMUMemoryRegion *)private, 0, *entry);
+ memory_region_iotlb_notify_iommu((IOMMUMemoryRegion *)private, 0, *entry);
return 0;
}
@@ -1581,7 +1581,7 @@ static void vtd_iotlb_page_invalidate_notify(IntelIOMMUState *s,
.addr_mask = size - 1,
.perm = IOMMU_NONE,
};
- memory_region_notify_iommu(&vtd_as->iommu, 0, entry);
+ memory_region_iotlb_notify_iommu(&vtd_as->iommu, 0, entry);
}
}
}
@@ -2015,7 +2015,7 @@ static bool vtd_process_device_iotlb_desc(IntelIOMMUState *s,
entry.iova = addr;
entry.perm = IOMMU_NONE;
entry.translated_addr = 0;
- memory_region_notify_iommu(&vtd_dev_as->iommu, 0, entry);
+ memory_region_iotlb_notify_iommu(&vtd_dev_as->iommu, 0, entry);
done:
return true;
@@ -2999,7 +2999,7 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
map.size = entry.addr_mask;
iova_tree_remove(as->iova_tree, &map);
- memory_region_notify_one(n, &entry);
+ memory_region_iotlb_notify_one(n, &entry);
}
static void vtd_address_space_unmap_all(IntelIOMMUState *s)
@@ -3016,7 +3016,7 @@ static void vtd_address_space_unmap_all(IntelIOMMUState *s)
static int vtd_replay_hook(IOMMUTLBEntry *entry, void *private)
{
- memory_region_notify_one((IOMMUNotifier *)private, entry);
+ memory_region_iotlb_notify_one((IOMMUNotifier *)private, entry);
return 0;
}
diff --git a/hw/misc/tz-mpc.c b/hw/misc/tz-mpc.c
index e0c58ba37e..a585e7b475 100644
--- a/hw/misc/tz-mpc.c
+++ b/hw/misc/tz-mpc.c
@@ -100,8 +100,8 @@ static void tz_mpc_iommu_notify(TZMPC *s, uint32_t lutidx,
entry.translated_addr = addr;
entry.perm = IOMMU_NONE;
- memory_region_notify_iommu(&s->upstream, IOMMU_IDX_S, entry);
- memory_region_notify_iommu(&s->upstream, IOMMU_IDX_NS, entry);
+ memory_region_iotlb_notify_iommu(&s->upstream, IOMMU_IDX_S, entry);
+ memory_region_iotlb_notify_iommu(&s->upstream, IOMMU_IDX_NS, entry);
entry.perm = IOMMU_RW;
if (block_is_ns) {
@@ -109,13 +109,13 @@ static void tz_mpc_iommu_notify(TZMPC *s, uint32_t lutidx,
} else {
entry.target_as = &s->downstream_as;
}
- memory_region_notify_iommu(&s->upstream, IOMMU_IDX_S, entry);
+ memory_region_iotlb_notify_iommu(&s->upstream, IOMMU_IDX_S, entry);
if (block_is_ns) {
entry.target_as = &s->downstream_as;
} else {
entry.target_as = &s->blocked_io_as;
}
- memory_region_notify_iommu(&s->upstream, IOMMU_IDX_NS, entry);
+ memory_region_iotlb_notify_iommu(&s->upstream, IOMMU_IDX_NS, entry);
}
}
diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
index 1b0880ac9e..680fb6ad24 100644
--- a/hw/ppc/spapr_iommu.c
+++ b/hw/ppc/spapr_iommu.c
@@ -429,7 +429,7 @@ static target_ulong put_tce_emu(sPAPRTCETable *tcet, target_ulong ioba,
entry.translated_addr = tce & page_mask;
entry.addr_mask = ~page_mask;
entry.perm = spapr_tce_iommu_access_flags(tce);
- memory_region_notify_iommu(&tcet->iommu, 0, entry);
+ memory_region_iotlb_notify_iommu(&tcet->iommu, 0, entry);
return H_SUCCESS;
}
diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index 7b61367ee3..a5ad72d19c 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -589,7 +589,7 @@ static void s390_pci_update_iotlb(S390PCIIOMMU *iommu, S390IOTLBEntry *entry)
}
notify.perm = IOMMU_NONE;
- memory_region_notify_iommu(&iommu->iommu_mr, 0, notify);
+ memory_region_iotlb_notify_iommu(&iommu->iommu_mr, 0, notify);
notify.perm = entry->perm;
}
@@ -601,7 +601,7 @@ static void s390_pci_update_iotlb(S390PCIIOMMU *iommu, S390IOTLBEntry *entry)
g_hash_table_replace(iommu->iotlb, &cache->iova, cache);
}
- memory_region_notify_iommu(&iommu->iommu_mr, 0, notify);
+ memory_region_iotlb_notify_iommu(&iommu->iommu_mr, 0, notify);
}
int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 31fc859c6b..5ef9bf6d21 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -1024,10 +1024,10 @@ static inline IOMMUMemoryRegionClass *memory_region_get_iommu_class_nocheck(
uint64_t memory_region_iommu_get_min_page_size(IOMMUMemoryRegion *iommu_mr);
/**
- * memory_region_notify_iommu: notify a change in an IOMMU translation entry.
+ * memory_region_iotlb_notify_iommu: notify a change in an IOMMU translation
+ * entry.
*
* The notification type will be decided by entry.perm bits:
- *
* - For UNMAP (cache invalidation) notifies: set entry.perm to IOMMU_NONE.
* - For MAP (newly added entry) notifies: set entry.perm to the
* permission of the page (which is definitely !IOMMU_NONE).
@@ -1041,15 +1041,15 @@ uint64_t memory_region_iommu_get_min_page_size(IOMMUMemoryRegion *iommu_mr);
* replaces all old entries for the same virtual I/O address range.
* Deleted entries have .@perm == 0.
*/
-void memory_region_notify_iommu(IOMMUMemoryRegion *iommu_mr,
- int iommu_idx,
- IOMMUTLBEntry entry);
+void memory_region_iotlb_notify_iommu(IOMMUMemoryRegion *iommu_mr,
+ int iommu_idx,
+ IOMMUTLBEntry entry);
/**
- * memory_region_notify_one: notify a change in an IOMMU translation
- * entry to a single notifier
+ * memory_region_iotlb_notify_one: notify a change in an IOMMU translation
+ * entry to a single notifier
*
- * This works just like memory_region_notify_iommu(), but it only
+ * This works just like memory_region_iotlb_notify_iommu(), but it only
* notifies a specific notifier, not all of them.
*
* @notifier: the notifier to be notified
@@ -1057,8 +1057,8 @@ void memory_region_notify_iommu(IOMMUMemoryRegion *iommu_mr,
* replaces all old entries for the same virtual I/O address range.
* Deleted entries have .@perm == 0.
*/
-void memory_region_notify_one(IOMMUNotifier *notifier,
- IOMMUTLBEntry *entry);
+void memory_region_iotlb_notify_one(IOMMUNotifier *notifier,
+ IOMMUTLBEntry *entry);
/**
* memory_region_register_iommu_notifier: register a notifier for changes to
diff --git a/memory.c b/memory.c
index b7e2e43b68..8ee5cbdbad 100644
--- a/memory.c
+++ b/memory.c
@@ -1870,8 +1870,8 @@ void memory_region_unregister_iommu_notifier(MemoryRegion *mr,
memory_region_update_iommu_notify_flags(iommu_mr);
}
-void memory_region_notify_one(IOMMUNotifier *notifier,
- IOMMUTLBEntry *entry)
+void memory_region_iotlb_notify_one(IOMMUNotifier *notifier,
+ IOMMUTLBEntry *entry)
{
IOMMUNotifierFlag request_flags;
@@ -1895,9 +1895,9 @@ void memory_region_notify_one(IOMMUNotifier *notifier,
}
}
-void memory_region_notify_iommu(IOMMUMemoryRegion *iommu_mr,
- int iommu_idx,
- IOMMUTLBEntry entry)
+void memory_region_iotlb_notify_iommu(IOMMUMemoryRegion *iommu_mr,
+ int iommu_idx,
+ IOMMUTLBEntry entry)
{
IOMMUNotifier *iommu_notifier;
@@ -1905,7 +1905,7 @@ void memory_region_notify_iommu(IOMMUMemoryRegion *iommu_mr,
IOMMU_NOTIFIER_FOREACH(iommu_notifier, iommu_mr) {
if (iommu_notifier->iommu_idx == iommu_idx) {
- memory_region_notify_one(iommu_notifier, &entry);
+ memory_region_iotlb_notify_one(iommu_notifier, &entry);
}
}
}
--
2.17.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [Qemu-devel] [RFC 10/20] memory: Add IOMMUConfigNotifier
2018-09-01 14:22 [Qemu-devel] [RFC 00/20] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
` (8 preceding siblings ...)
2018-09-01 14:23 ` [Qemu-devel] [RFC 09/20] memory: rename memory_region notify_iommu, notify_one Eric Auger
@ 2018-09-01 14:23 ` Eric Auger
2018-09-01 14:23 ` [Qemu-devel] [RFC 11/20] hw/arm/smmuv3: Store s1ctrptr in translation config data Eric Auger
` (9 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Eric Auger @ 2018-09-01 14:23 UTC (permalink / raw)
To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
Cc: alex.williamson, mst, cdall, jean-philippe.brucker, peterx,
yi.l.liu
With this patch, an IOMMUNotifier can now be either
an IOTLB notifier or a config notifier. A config notifier
is supposed to be called on guest translation config change.
This gives the host a chance to update the physical IOMMU
configuration so that it is consistent with the guest view.
The notifier is passed an iommu_guest_stage_config struct.
We introduce the associated helpers, iommu_config_notifier_init
and memory_region_config_notify_iommu.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
hw/vfio/common.c | 14 ++++++++----
include/exec/memory.h | 52 +++++++++++++++++++++++++++++++++++++++++--
memory.c | 32 ++++++++++++++++++++++++--
3 files changed, 90 insertions(+), 8 deletions(-)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index b6673fcf49..7bd3cc250d 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -624,10 +624,16 @@ static void vfio_listener_region_del(MemoryListener *listener,
VFIOGuestIOMMU *giommu;
QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) {
- if (MEMORY_REGION(giommu->iommu) == section->mr &&
- giommu->n.iotlb_notifier.start == section->offset_within_region) {
- memory_region_unregister_iommu_notifier(section->mr,
- &giommu->n);
+ if (MEMORY_REGION(giommu->iommu) == section->mr) {
+ if (is_iommu_iotlb_notifier(&giommu->n) &&
+ giommu->n.iotlb_notifier.start ==
+ section->offset_within_region) {
+ memory_region_unregister_iommu_notifier(section->mr,
+ &giommu->n);
+ } else if (is_iommu_config_notifier(&giommu->n)) {
+ memory_region_unregister_iommu_notifier(section->mr,
+ &giommu->n);
+ }
QLIST_REMOVE(giommu, giommu_next);
g_free(giommu);
break;
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 5ef9bf6d21..e89fd95fc5 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -84,13 +84,23 @@ typedef enum {
IOMMU_NOTIFIER_UNMAP = 0x1,
/* Notify entry changes (newly created entries) */
IOMMU_NOTIFIER_MAP = 0x2,
+ /* Notify stage 1 config changes */
+ IOMMU_NOTIFIER_S1_CFG = 0x4,
} IOMMUNotifierFlag;
#define IOMMU_NOTIFIER_IOTLB_ALL (IOMMU_NOTIFIER_MAP | IOMMU_NOTIFIER_UNMAP)
+#define IOMMU_NOTIFIER_CONFIG_ALL (IOMMU_NOTIFIER_S1_CFG)
+
+typedef enum {
+ IOMMU_ARM_SMMUV3 = 0x1,
+} IOMMUStage1ConfigType;
struct IOMMUNotifier;
+struct iommu_guest_stage_config;
typedef void (*IOMMUNotify)(struct IOMMUNotifier *notifier,
IOMMUTLBEntry *data);
+typedef void (*IOMMUConfigNotify)(struct IOMMUNotifier *notifier,
+ struct iommu_guest_stage_config *cfg);
typedef struct IOMMUIOLTBNotifier {
IOMMUNotify notify;
@@ -99,9 +109,16 @@ typedef struct IOMMUIOLTBNotifier {
hwaddr end;
} IOMMUIOLTBNotifier;
+typedef struct IOMMUConfigNotifier {
+ IOMMUConfigNotify notify;
+} IOMMUConfigNotifier;
+
struct IOMMUNotifier {
IOMMUNotifierFlag notifier_flags;
- IOMMUIOLTBNotifier iotlb_notifier;
+ union {
+ IOMMUIOLTBNotifier iotlb_notifier;
+ IOMMUConfigNotifier config_notifier;
+ };
int iommu_idx;
QLIST_ENTRY(IOMMUNotifier) node;
};
@@ -143,6 +160,15 @@ static inline void iommu_iotlb_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
n->iommu_idx = iommu_idx;
}
+static inline void iommu_config_notifier_init(IOMMUNotifier *n,
+ IOMMUConfigNotify fn,
+ int iommu_idx)
+{
+ n->notifier_flags = IOMMU_NOTIFIER_S1_CFG;
+ n->iommu_idx = iommu_idx;
+ n->config_notifier.notify = fn;
+}
+
/*
* Memory region callbacks
*/
@@ -639,6 +665,17 @@ void memory_region_init_resizeable_ram(MemoryRegion *mr,
uint64_t length,
void *host),
Error **errp);
+
+static inline bool is_iommu_iotlb_notifier(IOMMUNotifier *n)
+{
+ return n->notifier_flags & IOMMU_NOTIFIER_IOTLB_ALL;
+}
+
+static inline bool is_iommu_config_notifier(IOMMUNotifier *n)
+{
+ return n->notifier_flags & IOMMU_NOTIFIER_CONFIG_ALL;
+}
+
#ifdef __linux__
/**
@@ -1045,6 +1082,17 @@ void memory_region_iotlb_notify_iommu(IOMMUMemoryRegion *iommu_mr,
int iommu_idx,
IOMMUTLBEntry entry);
+/**
+ * memory_region_config_notify_iommu: notify a change in a translation
+ * configuration structure.
+ * @iommu_mr: the memory region that was changed
+ * @iommu_idx: the IOMMU index for the translation table which has changed
+ * @config: new guest config
+ */
+void memory_region_config_notify_iommu(IOMMUMemoryRegion *iommu_mr,
+ int iommu_idx,
+ struct iommu_guest_stage_config *config);
+
/**
* memory_region_iotlb_notify_one: notify a change in an IOMMU translation
* entry to a single notifier
@@ -1062,7 +1110,7 @@ void memory_region_iotlb_notify_one(IOMMUNotifier *notifier,
/**
* memory_region_register_iommu_notifier: register a notifier for changes to
- * IOMMU translation entries.
+ * IOMMU translation entries or translation config settings.
*
* @mr: the memory region to observe
* @n: the IOMMUNotifier to be added; the notify callback receives a
diff --git a/memory.c b/memory.c
index 8ee5cbdbad..ea2a09b0dd 100644
--- a/memory.c
+++ b/memory.c
@@ -49,6 +49,8 @@ static GHashTable *flat_views;
typedef struct AddrRange AddrRange;
+struct iommu_guest_stage_config;
+
/*
* Note that signed integers are needed for negative offsetting in aliases
* (large MemoryRegion::alias_offset).
@@ -1800,7 +1802,9 @@ void memory_region_register_iommu_notifier(MemoryRegion *mr,
/* We need to register for at least one bitfield */
iommu_mr = IOMMU_MEMORY_REGION(mr);
assert(n->notifier_flags != IOMMU_NOTIFIER_NONE);
- assert(n->iotlb_notifier.start <= n->iotlb_notifier.end);
+ if (is_iommu_iotlb_notifier(n)) {
+ assert(n->iotlb_notifier.start <= n->iotlb_notifier.end);
+ }
assert(n->iommu_idx >= 0 &&
n->iommu_idx < memory_region_iommu_num_indexes(iommu_mr));
@@ -1870,6 +1874,13 @@ void memory_region_unregister_iommu_notifier(MemoryRegion *mr,
memory_region_update_iommu_notify_flags(iommu_mr);
}
+static void
+memory_region_config_notify_one(IOMMUNotifier *notifier,
+ struct iommu_guest_stage_config *cfg)
+{
+ notifier->config_notifier.notify(notifier, cfg);
+}
+
void memory_region_iotlb_notify_one(IOMMUNotifier *notifier,
IOMMUTLBEntry *entry)
{
@@ -1904,12 +1915,29 @@ void memory_region_iotlb_notify_iommu(IOMMUMemoryRegion *iommu_mr,
assert(memory_region_is_iommu(MEMORY_REGION(iommu_mr)));
IOMMU_NOTIFIER_FOREACH(iommu_notifier, iommu_mr) {
- if (iommu_notifier->iommu_idx == iommu_idx) {
+ if (iommu_notifier->iommu_idx == iommu_idx &&
+ is_iommu_iotlb_notifier(iommu_notifier)) {
memory_region_iotlb_notify_one(iommu_notifier, &entry);
}
}
}
+void memory_region_config_notify_iommu(IOMMUMemoryRegion *iommu_mr,
+ int iommu_idx,
+ struct iommu_guest_stage_config *config)
+{
+ IOMMUNotifier *iommu_notifier;
+
+ assert(memory_region_is_iommu(MEMORY_REGION(iommu_mr)));
+
+ IOMMU_NOTIFIER_FOREACH(iommu_notifier, iommu_mr) {
+ if (iommu_notifier->iommu_idx == iommu_idx &&
+ is_iommu_config_notifier(iommu_notifier)) {
+ memory_region_config_notify_one(iommu_notifier, config);
+ }
+ }
+}
+
int memory_region_iommu_get_attr(IOMMUMemoryRegion *iommu_mr,
enum IOMMUMemoryRegionAttr attr,
void *data)
--
2.17.1
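
To illustrate the dispatch scheme this patch describes, here is a minimal standalone sketch of a notifier whose flags select which union member is valid, and of a config notification loop that skips IOTLB notifiers. All names (Notifier, TlbEntry, StageConfig, config_notify_all, the counting callbacks) are hypothetical simplifications, not the QEMU types. The union is given a name (u) only so the sketch also builds as plain C99, whereas the patch uses an anonymous union.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values follow the enum in the patch; names are shortened here. */
typedef enum {
    NOTIFIER_NONE   = 0x0,
    NOTIFIER_UNMAP  = 0x1,
    NOTIFIER_MAP    = 0x2,
    NOTIFIER_S1_CFG = 0x4,
} NotifierFlag;

#define NOTIFIER_IOTLB_ALL  (NOTIFIER_MAP | NOTIFIER_UNMAP)
#define NOTIFIER_CONFIG_ALL (NOTIFIER_S1_CFG)

/* Hypothetical payloads standing in for the real entry/config structs. */
typedef struct { uint64_t iova; int deliveries; } TlbEntry;
typedef struct { int deliveries; } StageConfig;

struct Notifier;
typedef void (*IotlbNotify)(struct Notifier *n, TlbEntry *entry);
typedef void (*ConfigNotify)(struct Notifier *n, StageConfig *cfg);

typedef struct Notifier {
    unsigned flags; /* decides which union member is valid */
    union {
        struct { IotlbNotify notify; uint64_t start, end; } iotlb;
        struct { ConfigNotify notify; } config;
    } u;
    int idx;
} Notifier;

static void notifier_init_iotlb(Notifier *n, IotlbNotify fn, unsigned flags,
                                uint64_t start, uint64_t end, int idx)
{
    assert(flags & NOTIFIER_IOTLB_ALL);
    n->flags = flags;
    n->u.iotlb.notify = fn;
    n->u.iotlb.start = start;
    n->u.iotlb.end = end;
    n->idx = idx;
}

static void notifier_init_config(Notifier *n, ConfigNotify fn, int idx)
{
    n->flags = NOTIFIER_S1_CFG;
    n->u.config.notify = fn;
    n->idx = idx;
}

static bool is_iotlb_notifier(const Notifier *n)
{
    return (n->flags & NOTIFIER_IOTLB_ALL) != 0;
}

static bool is_config_notifier(const Notifier *n)
{
    return (n->flags & NOTIFIER_CONFIG_ALL) != 0;
}

/* Call only the config notifiers registered for the given index. */
static void config_notify_all(Notifier *ns, size_t count, int idx,
                              StageConfig *cfg)
{
    for (size_t i = 0; i < count; i++) {
        if (ns[i].idx == idx && is_config_notifier(&ns[i])) {
            ns[i].u.config.notify(&ns[i], cfg);
        }
    }
}

/* Sample callbacks that simply count how often they were invoked. */
static void count_iotlb(Notifier *n, TlbEntry *entry)
{
    (void)n;
    entry->deliveries++;
}

static void count_config(Notifier *n, StageConfig *cfg)
{
    (void)n;
    cfg->deliveries++;
}
```

Because the two callback pointers share storage, the flag check must come before any call through the union. That is why both notify loops in the patch filter on the notifier kind as well as on iommu_idx.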
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [Qemu-devel] [RFC 11/20] hw/arm/smmuv3: Store s1ctrptr in translation config data
2018-09-01 14:22 [Qemu-devel] [RFC 00/20] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
` (9 preceding siblings ...)
2018-09-01 14:23 ` [Qemu-devel] [RFC 10/20] memory: Add IOMMUConfigNotifier Eric Auger
@ 2018-09-01 14:23 ` Eric Auger
2018-09-01 14:23 ` [Qemu-devel] [RFC 12/20] hw/arm/smmuv3: Implement dummy replay Eric Auger
` (8 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Eric Auger @ 2018-09-01 14:23 UTC (permalink / raw)
To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
Cc: alex.williamson, mst, cdall, jean-philippe.brucker, peterx,
yi.l.liu
For VFIO integration we will need to pass the
context descriptor table GPA to the host. So let's
decode and store it.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
hw/arm/smmuv3.c | 1 +
include/hw/arm/smmu-common.h | 1 +
2 files changed, 2 insertions(+)
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index c4bd368355..5f787bf455 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -351,6 +351,7 @@ static int decode_ste(SMMUv3State *s, SMMUTransCfg *cfg,
"SMMUv3 S1 stalling fault model not allowed yet\n");
goto bad_ste;
}
+ cfg->s1ctxptr = STE_CTXPTR(ste);
return 0;
bad_ste:
diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
index b07cadd0ef..52073a23b4 100644
--- a/include/hw/arm/smmu-common.h
+++ b/include/hw/arm/smmu-common.h
@@ -68,6 +68,7 @@ typedef struct SMMUTransCfg {
uint8_t tbi; /* Top Byte Ignore */
uint16_t asid;
SMMUTransTableInfo tt[2];
+ dma_addr_t s1ctxptr;
uint32_t iotlb_hits; /* counts IOTLB hits for this asid */
uint32_t iotlb_misses; /* counts IOTLB misses for this asid */
} SMMUTransCfg;
--
2.17.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [Qemu-devel] [RFC 12/20] hw/arm/smmuv3: Implement dummy replay
2018-09-01 14:22 [Qemu-devel] [RFC 00/20] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
` (10 preceding siblings ...)
2018-09-01 14:23 ` [Qemu-devel] [RFC 11/20] hw/arm/smmuv3: Store s1ctrptr in translation config data Eric Auger
@ 2018-09-01 14:23 ` Eric Auger
2018-09-01 14:23 ` [Qemu-devel] [RFC 13/20] hw/arm/smmuv3: Notify on config changes Eric Auger
` (7 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Eric Auger @ 2018-09-01 14:23 UTC (permalink / raw)
To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
Cc: alex.williamson, mst, cdall, jean-philippe.brucker, peterx,
yi.l.liu
The default implementation of memory_region_iommu_replay() shall
not be used as it forces the translation of the whole RAM range.
The purpose of this function is to update the shadow page tables.
However, in the nested stage case, there is no shadow page table,
so we can simply return.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
hw/arm/smmuv3.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 5f787bf455..ff92f802bd 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -1516,6 +1516,11 @@ static int smmuv3_get_attr(IOMMUMemoryRegion *iommu,
return -EINVAL;
}
+static inline void
+smmuv3_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
+{
+}
+
static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
void *data)
{
@@ -1524,6 +1529,7 @@ static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
imrc->translate = smmuv3_translate;
imrc->notify_flag_changed = smmuv3_notify_flag_changed;
imrc->get_attr = smmuv3_get_attr;
+ imrc->replay = smmuv3_replay;
}
static const TypeInfo smmuv3_type_info = {
--
2.17.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [Qemu-devel] [RFC 13/20] hw/arm/smmuv3: Notify on config changes
2018-09-01 14:22 [Qemu-devel] [RFC 00/20] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
` (11 preceding siblings ...)
2018-09-01 14:23 ` [Qemu-devel] [RFC 12/20] hw/arm/smmuv3: Implement dummy replay Eric Auger
@ 2018-09-01 14:23 ` Eric Auger
2018-09-01 14:23 ` [Qemu-devel] [RFC 14/20] hw/vfio/common: Introduce vfio_alloc_guest_iommu helper Eric Auger
` (6 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Eric Auger @ 2018-09-01 14:23 UTC (permalink / raw)
To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
Cc: alex.williamson, mst, cdall, jean-philippe.brucker, peterx,
yi.l.liu
In case IOMMU config notifiers are attached to the
IOMMU memory region, we execute them, passing as argument
the iommu_guest_stage_config struct updated with the new
vIOMMU translation config. Config notifiers are called on
STE and CD changes. At the physical level, they translate into
CMD_CFGI_STE_* and CMD_CFGI_CD_* commands.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
hw/arm/smmuv3.c | 71 ++++++++++++++++++++++++++++++++-----------------
1 file changed, 46 insertions(+), 25 deletions(-)
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index ff92f802bd..a31df03d47 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -16,6 +16,8 @@
* with this program; if not, see <http://www.gnu.org/licenses/>.
*/
+#include "linux/iommu.h"
+
#include "qemu/osdep.h"
#include "hw/boards.h"
#include "sysemu/sysemu.h"
@@ -843,6 +845,47 @@ static void smmuv3_inv_notifiers_iova(SMMUState *s, int asid, dma_addr_t iova)
}
}
+static void smmuv3_notify_config_change(SMMUState *bs, uint32_t sid)
+{
+ IOMMUMemoryRegion *mr = smmu_iommu_mr(bs, sid);
+ SMMUEventInfo event = {.type = SMMU_EVT_NONE, .sid = sid};
+ SMMUTransCfg *cfg;
+ SMMUDevice *sdev;
+
+ if (!mr) {
+ return;
+ }
+
+ sdev = container_of(mr, SMMUDevice, iommu);
+
+ /* flush QEMU config cache */
+ smmuv3_flush_config(sdev);
+
+ if (mr->iommu_notify_flags & IOMMU_NOTIFIER_S1_CFG) {
+ /* force a guest RAM config structure decoding */
+ cfg = smmuv3_get_config(sdev, &event);
+
+ if (cfg) {
+ struct iommu_guest_stage_config *kcfg =
+ g_new0(struct iommu_guest_stage_config, 1);
+
+ kcfg->flags = SMMUV3_S1_CFG;
+ kcfg->smmu_s1.flags = (cfg->disabled ? IOMMU_SMMU_S1_DISABLED : 0) |
+ (cfg->bypassed ? IOMMU_SMMU_S1_BYPASSED : 0) |
+ (cfg->aborted ? IOMMU_SMMU_S1_ABORTED : 0);
+ kcfg->smmu_s1.cdptr_dma = cfg->s1ctxptr;
+ kcfg->smmu_s1.asid_bits = 16;
+
+ memory_region_config_notify_iommu(mr, 0, kcfg);
+ g_free(kcfg);
+ } else {
+ qemu_log_mask(LOG_GUEST_ERROR,
+ "%s error decoding the configuration for iommu mr=%s\n",
+ __func__, mr->parent_obj.name);
+ }
+ }
+}
+
static int smmuv3_cmdq_consume(SMMUv3State *s)
{
SMMUState *bs = ARM_SMMU(s);
@@ -893,22 +936,14 @@ static int smmuv3_cmdq_consume(SMMUv3State *s)
case SMMU_CMD_CFGI_STE:
{
uint32_t sid = CMD_SID(&cmd);
- IOMMUMemoryRegion *mr = smmu_iommu_mr(bs, sid);
- SMMUDevice *sdev;
if (CMD_SSEC(&cmd)) {
cmd_error = SMMU_CERROR_ILL;
break;
}
- if (!mr) {
- break;
- }
-
trace_smmuv3_cmdq_cfgi_ste(sid);
- sdev = container_of(mr, SMMUDevice, iommu);
- smmuv3_flush_config(sdev);
-
+ smmuv3_notify_config_change(bs, sid);
break;
}
case SMMU_CMD_CFGI_STE_RANGE: /* same as SMMU_CMD_CFGI_ALL */
@@ -925,14 +960,7 @@ static int smmuv3_cmdq_consume(SMMUv3State *s)
trace_smmuv3_cmdq_cfgi_ste_range(start, end);
for (i = start; i <= end; i++) {
- IOMMUMemoryRegion *mr = smmu_iommu_mr(bs, i);
- SMMUDevice *sdev;
-
- if (!mr) {
- continue;
- }
- sdev = container_of(mr, SMMUDevice, iommu);
- smmuv3_flush_config(sdev);
+ smmuv3_notify_config_change(bs, i);
}
break;
}
@@ -940,21 +968,14 @@ static int smmuv3_cmdq_consume(SMMUv3State *s)
case SMMU_CMD_CFGI_CD_ALL:
{
uint32_t sid = CMD_SID(&cmd);
- IOMMUMemoryRegion *mr = smmu_iommu_mr(bs, sid);
- SMMUDevice *sdev;
if (CMD_SSEC(&cmd)) {
cmd_error = SMMU_CERROR_ILL;
break;
}
- if (!mr) {
- break;
- }
-
trace_smmuv3_cmdq_cfgi_cd(sid);
- sdev = container_of(mr, SMMUDevice, iommu);
- smmuv3_flush_config(sdev);
+ smmuv3_notify_config_change(bs, sid);
break;
}
case SMMU_CMD_TLBI_NH_ASID:
--
2.17.1
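
A note on building a flags word from several booleans, as smmuv3_notify_config_change() does for smmu_s1.flags: in C the conditional operator binds less tightly than |, so each ?: term needs its own parentheses for the bits to be ORed together. The sketch below contrasts the two spellings. The S1_* values and both function names are hypothetical and chosen only for this illustration; they are not the values from linux/iommu.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. */
#define S1_DISABLED (1u << 0)
#define S1_BYPASSED (1u << 1)
#define S1_ABORTED  (1u << 2)

/*
 * Without parentheses this parses as
 *   disabled ? S1_DISABLED
 *            : ((0 | bypassed) ? S1_BYPASSED
 *                              : ((0 | aborted) ? S1_ABORTED : 0))
 * so at most one bit is ever set.
 */
static uint32_t s1_flags_unparenthesized(bool disabled, bool bypassed,
                                         bool aborted)
{
    return disabled ? S1_DISABLED : 0 |
           bypassed ? S1_BYPASSED : 0 |
           aborted ? S1_ABORTED : 0;
}

/* Each condition is evaluated on its own and the results are ORed. */
static uint32_t s1_flags(bool disabled, bool bypassed, bool aborted)
{
    return (disabled ? S1_DISABLED : 0) |
           (bypassed ? S1_BYPASSED : 0) |
           (aborted ? S1_ABORTED : 0);
}
```

The two functions agree whenever at most one input is true and diverge as soon as two are set, which is the case that matters when the three states are not mutually exclusive.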
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [Qemu-devel] [RFC 14/20] hw/vfio/common: Introduce vfio_alloc_guest_iommu helper
2018-09-01 14:22 [Qemu-devel] [RFC 00/20] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
` (12 preceding siblings ...)
2018-09-01 14:23 ` [Qemu-devel] [RFC 13/20] hw/arm/smmuv3: Notify on config changes Eric Auger
@ 2018-09-01 14:23 ` Eric Auger
2018-09-01 14:23 ` [Qemu-devel] [RFC 15/20] hw/vfio/common: Introduce vfio_dma_(un)map_ram_section helpers Eric Auger
` (5 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Eric Auger @ 2018-09-01 14:23 UTC (permalink / raw)
To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
Cc: alex.williamson, mst, cdall, jean-philippe.brucker, peterx,
yi.l.liu
Soon this code will be called several times. So let's introduce
a helper.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
hw/vfio/common.c | 25 ++++++++++++++++++++-----
1 file changed, 20 insertions(+), 5 deletions(-)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 7bd3cc250d..2342bccf38 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -24,6 +24,7 @@
#include <linux/kvm.h>
#endif
#include <linux/vfio.h>
+#include <linux/iommu.h>
#include "hw/vfio/vfio-common.h"
#include "hw/vfio/vfio.h"
@@ -401,6 +402,19 @@ out:
rcu_read_unlock();
}
+static VFIOGuestIOMMU *vfio_alloc_guest_iommu(VFIOContainer *container,
+ IOMMUMemoryRegion *iommu,
+ hwaddr offset)
+{
+ VFIOGuestIOMMU *giommu = g_new0(VFIOGuestIOMMU, 1);
+
+ giommu->container = container;
+ giommu->iommu = iommu;
+ giommu->iommu_offset = offset;
+ /* notifier will be registered separately */
+ return giommu;
+}
+
static void vfio_listener_region_add(MemoryListener *listener,
MemoryRegionSection *section)
{
@@ -508,6 +522,7 @@ static void vfio_listener_region_add(MemoryListener *listener,
if (memory_region_is_iommu(section->mr)) {
VFIOGuestIOMMU *giommu;
IOMMUMemoryRegion *iommu_mr = IOMMU_MEMORY_REGION(section->mr);
+ hwaddr offset;
int iommu_idx;
trace_vfio_listener_region_add_iommu(iova, end);
@@ -517,11 +532,11 @@ static void vfio_listener_region_add(MemoryListener *listener,
* would be the right place to wire that up (tell the KVM
* device emulation the VFIO iommu handles to use).
*/
- giommu = g_malloc0(sizeof(*giommu));
- giommu->iommu = iommu_mr;
- giommu->iommu_offset = section->offset_within_address_space -
- section->offset_within_region;
- giommu->container = container;
+
+ offset = section->offset_within_address_space -
+ section->offset_within_region;
+ giommu = vfio_alloc_guest_iommu(container, iommu_mr, offset);
+
llend = int128_add(int128_make64(section->offset_within_region),
section->size);
llend = int128_sub(llend, int128_one());
--
2.17.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [Qemu-devel] [RFC 15/20] hw/vfio/common: Introduce vfio_dma_(un)map_ram_section helpers
2018-09-01 14:22 [Qemu-devel] [RFC 00/20] vSMMUv3/pSMMUv3 2 stage VFIO integration Eric Auger
` (13 preceding siblings ...)
2018-09-01 14:23 ` [Qemu-devel] [RFC 14/20] hw/vfio/common: Introduce vfio_alloc_guest_iommu helper Eric Auger
@ 2018-09-01 14:23 ` Eric Auger
2018-09-01 14:23 ` [Qemu-devel] [RFC 16/20] hw/vfio/common: Register specific nested mode notifiers and memory_listener Eric Auger
` (4 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Eric Auger @ 2018-09-01 14:23 UTC (permalink / raw)
To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
Cc: alex.williamson, mst, cdall, jean-philippe.brucker, peterx,
yi.l.liu
This code is going to be duplicated soon, so let's introduce
helpers which DMA (un)map a RAM memory section.
No functional change.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
hw/vfio/common.c | 198 ++++++++++++++++++++++++++-----------------
hw/vfio/trace-events | 4 +-
2 files changed, 123 insertions(+), 79 deletions(-)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 2342bccf38..a47ac63e1d 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -415,13 +415,130 @@ static VFIOGuestIOMMU *vfio_alloc_guest_iommu(VFIOContainer *container,
return giommu;
}
+static int vfio_dma_map_ram_section(VFIOContainer *container,
+ MemoryRegionSection *section)
+{
+ VFIOHostDMAWindow *hostwin;
+ Int128 llend, llsize;
+ bool hostwin_found;
+ hwaddr iova, end;
+ void *vaddr;
+ int ret;
+
+ assert(memory_region_is_ram(section->mr));
+
+ iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
+ llend = int128_make64(section->offset_within_address_space);
+ llend = int128_add(llend, section->size);
+ llend = int128_and(llend, int128_exts64(TARGET_PAGE_MASK));
+ end = int128_get64(int128_sub(llend, int128_one()));
+
+ vaddr = memory_region_get_ram_ptr(section->mr) +
+ section->offset_within_region +
+ (iova - section->offset_within_address_space);
+
+ hostwin_found = false;
+ QLIST_FOREACH(hostwin, &container->hostwin_list, hostwin_next) {
+ if (hostwin->min_iova <= iova && end <= hostwin->max_iova) {
+ hostwin_found = true;
+ break;
+ }
+ }
+
+ if (!hostwin_found) {
+ error_report("vfio: IOMMU container %p can't map guest IOVA region"
+ " 0x%"HWADDR_PRIx"..0x%"HWADDR_PRIx,
+ container, iova, end);
+ return -EFAULT;
+ }
+
+ trace_vfio_dma_map_ram(iova, end, vaddr);
+
+ llsize = int128_sub(llend, int128_make64(iova));
+
+ if (memory_region_is_ram_device(section->mr)) {
+ hwaddr pgmask = (1ULL << ctz64(hostwin->iova_pgsizes)) - 1;
+
+ if ((iova & pgmask) || (int128_get64(llsize) & pgmask)) {
+ trace_vfio_listener_region_add_no_dma_map(
+ memory_region_name(section->mr),
+ section->offset_within_address_space,
+ int128_getlo(section->size),
+ pgmask + 1);
+ return 0;
+ }
+ }
+
+ ret = vfio_dma_map(container, iova, int128_get64(llsize),
+ vaddr, section->readonly);
+ if (ret) {
+ error_report("vfio_dma_map(%p, 0x%"HWADDR_PRIx", "
+ "0x%"HWADDR_PRIx", %p) = %d (%m)",
+ container, iova, int128_get64(llsize), vaddr, ret);
+ if (memory_region_is_ram_device(section->mr)) {
+ /* Allow unexpected mappings not to be fatal for RAM devices */
+ return 0;
+ }
+ return -EINVAL;
+ }
+ return 0;
+}
+
+static void vfio_dma_unmap_ram_section(VFIOContainer *container,
+ MemoryRegionSection *section)
+{
+ Int128 llend, llsize;
+ hwaddr iova, end;
+ bool try_unmap = true;
+ int ret;
+
+ iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
+ llend = int128_make64(section->offset_within_address_space);
+ llend = int128_add(llend, section->size);
+ llend = int128_and(llend, int128_exts64(TARGET_PAGE_MASK));
+
+ if (int128_ge(int128_make64(iova), llend)) {
+ return;
+ }
+ end = int128_get64(int128_sub(llend, int128_one()));
+
+ llsize = int128_sub(llend, int128_make64(iova));
+
+ trace_vfio_dma_unmap_ram(iova, end);
+
+ if (memory_region_is_ram_device(section->mr)) {
+ hwaddr pgmask;
+ VFIOHostDMAWindow *hostwin;
+ bool hostwin_found = false;
+
+ QLIST_FOREACH(hostwin, &container->hostwin_list, hostwin_next) {
+ if (hostwin->min_iova <= iova && end <= hostwin->max_iova) {
+ hostwin_found = true;
+ break;
+ }
+ }
+ assert(hostwin_found); /* or region_add() would have failed */
+
+ pgmask = (1ULL << ctz64(hostwin->iova_pgsizes)) - 1;
+ try_unmap = !((iova & pgmask) || (int128_get64(llsize) & pgmask));
+ }
+
+ if (try_unmap) {
+ ret = vfio_dma_unmap(container, iova, int128_get64(llsize));
+ if (ret) {
+ error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
+ "0x%"HWADDR_PRIx") = %d (%m)",
+ container, iova, int128_get64(llsize), ret);
+ }
+ }
+}
+
static void vfio_listener_region_add(MemoryListener *listener,
MemoryRegionSection *section)
{
VFIOContainer *container = container_of(listener, VFIOContainer, listener);
hwaddr iova, end;
- Int128 llend, llsize;
- void *vaddr;
+ Int128 llend;
int ret;
VFIOHostDMAWindow *hostwin;
bool hostwin_found;
@@ -556,41 +673,10 @@ static void vfio_listener_region_add(MemoryListener *listener,
}
/* Here we assume that memory_region_is_ram(section->mr)==true */
-
- vaddr = memory_region_get_ram_ptr(section->mr) +
- section->offset_within_region +
- (iova - section->offset_within_address_space);
-
- trace_vfio_listener_region_add_ram(iova, end, vaddr);
-
- llsize = int128_sub(llend, int128_make64(iova));
-
- if (memory_region_is_ram_device(section->mr)) {
- hwaddr pgmask = (1ULL << ctz64(hostwin->iova_pgsizes)) - 1;
-
- if ((iova & pgmask) || (int128_get64(llsize) & pgmask)) {
- trace_vfio_listener_region_add_no_dma_map(
- memory_region_name(section->mr),
- section->offset_within_address_space,
- int128_getlo(section->size),
- pgmask + 1);
- return;
- }
- }
-
- ret = vfio_dma_map(container, iova, int128_get64(llsize),
- vaddr, section->readonly);
+ ret = vfio_dma_map_ram_section(container, section);
if (ret) {
- error_report("vfio_dma_map(%p, 0x%"HWADDR_PRIx", "
- "0x%"HWADDR_PRIx", %p) = %d (%m)",
- container, iova, int128_get64(llsize), vaddr, ret);
- if (memory_region_is_ram_device(section->mr)) {
- /* Allow unexpected mappings not to be fatal for RAM devices */
- return;
- }
goto fail;
}
-
return;
fail:
@@ -616,10 +702,6 @@ static void vfio_listener_region_del(MemoryListener *listener,
MemoryRegionSection *section)
{
VFIOContainer *container = container_of(listener, VFIOContainer, listener);
- hwaddr iova, end;
- Int128 llend, llsize;
- int ret;
- bool try_unmap = true;
if (vfio_listener_skipped_section(section)) {
trace_vfio_listener_region_del_skip(
@@ -664,45 +746,7 @@ static void vfio_listener_region_del(MemoryListener *listener,
*/
}
- iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
- llend = int128_make64(section->offset_within_address_space);
- llend = int128_add(llend, section->size);
- llend = int128_and(llend, int128_exts64(TARGET_PAGE_MASK));
-
- if (int128_ge(int128_make64(iova), llend)) {
- return;
- }
- end = int128_get64(int128_sub(llend, int128_one()));
-
- llsize = int128_sub(llend, int128_make64(iova));
-
- trace_vfio_listener_region_del(iova, end);
-
- if (memory_region_is_ram_device(section->mr)) {
- hwaddr pgmask;
- VFIOHostDMAWindow *hostwin;
- bool hostwin_found = false;
-
- QLIST_FOREACH(hostwin, &container->hostwin_list, hostwin_next) {
- if (hostwin->min_iova <= iova && end <= hostwin->max_iova) {
- hostwin_found = true;
- break;
- }
- }
- assert(hostwin_found); /* or region_add() would have failed */
-
- pgmask = (1ULL << ctz64(hostwin->iova_pgsizes)) - 1;
- try_unmap = !((iova & pgmask) || (int128_get64(llsize) & pgmask));
- }
-
- if (try_unmap) {
- ret = vfio_dma_unmap(container, iova, int128_get64(llsize));
- if (ret) {
- error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
- "0x%"HWADDR_PRIx") = %d (%m)",
- container, iova, int128_get64(llsize), ret);
- }
- }
+ vfio_dma_unmap_ram_section(container, section);
memory_region_unref(section->mr);
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index a85e8662ea..cee49ef124 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -93,10 +93,10 @@ vfio_region_read(char *name, int index, uint64_t addr, unsigned size, uint64_t d
vfio_iommu_map_notify(const char *op, uint64_t iova_start, uint64_t iova_end) "iommu %s @ 0x%"PRIx64" - 0x%"PRIx64
vfio_listener_region_add_skip(uint64_t start, uint64_t end) "SKIPPING region_add 0x%"PRIx64" - 0x%"PRIx64
vfio_listener_region_add_iommu(uint64_t start, uint64_t end) "region_add [iommu] 0x%"PRIx64" - 0x%"PRIx64
-vfio_listener_region_add_ram(uint64_t iova_start, uint64_t iova_end, void *vaddr) "region_add [ram] 0x%"PRIx64" - 0x%"PRIx64" [%p]"
+vfio_dma_map_ram(uint64_t iova_start, uint64_t iova_end, void *vaddr) "region_add [ram] 0x%"PRIx64" - 0x%"PRIx64" [%p]"
vfio_listener_region_add_no_dma_map(const char *name, uint64_t iova, uint64_t size, uint64_t page_size) "Region \"%s\" 0x%"PRIx64" size=0x%"PRIx64" is not aligned to 0x%"PRIx64" and cannot be mapped for DMA"
vfio_listener_region_del_skip(uint64_t start, uint64_t end) "SKIPPING region_del 0x%"PRIx64" - 0x%"PRIx64
-vfio_listener_region_del(uint64_t start, uint64_t end) "region_del 0x%"PRIx64" - 0x%"PRIx64
+vfio_dma_unmap_ram(uint64_t start, uint64_t end) "region_del 0x%"PRIx64" - 0x%"PRIx64
vfio_disconnect_container(int fd) "close container->fd=%d"
vfio_put_group(int fd) "close group->fd=%d"
vfio_get_device(const char * name, unsigned int flags, unsigned int num_regions, unsigned int num_irqs) "Device %s flags: %u, regions: %u, irqs: %u"
--
2.17.1
* [Qemu-devel] [RFC 16/20] hw/vfio/common: Register specific nested mode notifiers and memory_listener
From: Eric Auger @ 2018-09-01 14:23 UTC (permalink / raw)
To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
Cc: alex.williamson, mst, cdall, jean-philippe.brucker, peterx,
yi.l.liu
In nested mode, the legacy vfio_iommu_map_notify MAP/UNMAP notifier
cannot be used anymore: there is no caching mode in place that
would allow MAP events to be trapped. Only configuration changes
and UNMAP events can be trapped. As such we register
- one configuration notifier, whose role is to propagate the
configuration update down to the host
- one UNMAP notifier, whose role is to propagate the TLB
invalidation at physical IOMMU level.
Those notifiers propagate the guest stage 1 mappings at physical
level.
Also, as there is no MAP event, the stage 2 mapping is no longer
handled by the vfio_iommu_map_notify notifier. Instead we register
a prereg_listener whose role is to dma_(un)map the RAM memory
regions. This programs the stage 2.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
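Note (illustration only, not part of the patch): the UNMAP notifier
encodes the invalidation size as log2 of the number of 4K pages,
derived from the IOTLB entry's addr_mask. Below is a minimal
standalone sketch of that arithmetic. sketch_ctz64() and
sketch_inv_size() are hypothetical names; they stand in for QEMU's
ctz64() and for the expression used in vfio_iommu_unmap_notify().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count trailing zero bits. Hypothetical stand-in for QEMU's ctz64(),
 * which returns 64 when the input is 0.
 */
static int sketch_ctz64(uint64_t v)
{
    int n = 0;

    if (v == 0) {
        return 64;
    }
    while (!(v & 1)) {
        v >>= 1;
        n++;
    }
    return n;
}

/*
 * log2 of the number of 4K pages covered by addr_mask, mirroring
 * "ctz64(~iotlb->addr_mask) - 12" from the patch.
 * Assumption: addr_mask has the form 2^n - 1 with n >= 12.
 */
static int sketch_inv_size(uint64_t addr_mask)
{
    assert((addr_mask & (addr_mask + 1)) == 0);
    return sketch_ctz64(~addr_mask) - 12;
}
```

With these definitions a 4K mask (0xfff) gives 0 and a 2MB mask
(0x1fffff) gives 9, matching the comment in the patch. A mask covering
the whole 64-bit range gives 52, which is the case the TODO in
vfio_iommu_unmap_notify() handles by invalidating the whole ASID.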
hw/vfio/common.c | 166 ++++++++++++++++++++++++++++++++++++++---------
1 file changed, 136 insertions(+), 30 deletions(-)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index a47ac63e1d..49fcbbbc8c 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -349,6 +349,64 @@ static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb, void **vaddr,
return true;
}
+/* Program the guest @cfg on physical IOMMU stage 1 (nested mode) */
+static void vfio_iommu_nested_notify(IOMMUNotifier *n,
+ struct iommu_guest_stage_config *cfg)
+{
+ VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n);
+ VFIOContainer *container = giommu->container;
+ struct vfio_iommu_type1_bind_guest_stage info;
+ int ret;
+
+ info.argsz = sizeof(info);
+ info.flags = 0;
+ memcpy(&info.config, cfg, sizeof(struct iommu_guest_stage_config));
+
+ ret = ioctl(container->fd, VFIO_IOMMU_BIND_GUEST_STAGE, &info);
+ if (ret) {
+ error_report("%s: failed to pass S1 config to the host (%d)",
+ __func__, ret);
+ }
+}
+
+/* Propagate a guest invalidation down to the physical IOMMU (nested mode) */
+static void vfio_iommu_unmap_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
+{
+ VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n);
+ hwaddr start = iotlb->iova + giommu->iommu_offset;
+
+ VFIOContainer *container = giommu->container;
+ struct vfio_iommu_type1_tlb_invalidate ustruct;
+ int ret;
+
+ assert(iotlb->perm == IOMMU_NONE);
+
+ ustruct.argsz = sizeof(ustruct);
+ ustruct.flags = 0;
+ ustruct.info.hdr.version = TLB_INV_HDR_VERSION_1;
+ ustruct.info.hdr.type = IOMMU_INV_TYPE_TLB;
+ ustruct.info.granularity = IOMMU_INV_NR_GRANU;
+ ustruct.info.flags = IOMMU_INVALIDATE_GLOBAL_PAGE;
+ /* 2^size of 4K pages, 0 for 4k, 9 for 2MB, etc. */
+ ustruct.info.size = ctz64(~iotlb->addr_mask) - 12;
+ /*
+ * TODO: at the moment we invalidate the whole ASID rather than
+ * the given number of pages (nr_pages = 0). A mask covering the
+ * whole GPA range is observed; in that case we shall invalidate
+ * the whole ASID (NH_ASID) and not induce a storm of NH_VA
+ * commands.
+ */
+ ustruct.info.nr_pages = 0;
+ ustruct.info.addr = start;
+
+ ret = ioctl(container->fd, VFIO_IOMMU_TLB_INVALIDATE, &ustruct);
+ if (ret) {
+ error_report("%s: failed to invalidate TLB for 0x%"PRIx64
+ " mask=0x%"PRIx64" (%d)",
+ __func__, start, iotlb->addr_mask, ret);
+ }
+}
+
static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
{
VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n);
@@ -533,6 +591,32 @@ static void vfio_dma_unmap_ram_section(VFIOContainer *container,
}
}
+static void vfio_prereg_listener_region_add(MemoryListener *listener,
+ MemoryRegionSection *section)
+{
+ VFIOContainer *container =
+ container_of(listener, VFIOContainer, prereg_listener);
+
+ if (!memory_region_is_ram(section->mr)) {
+ return;
+ }
+
+ vfio_dma_map_ram_section(container, section);
+
+}
+static void vfio_prereg_listener_region_del(MemoryListener *listener,
+ MemoryRegionSection *section)
+{
+ VFIOContainer *container =
+ container_of(listener, VFIOContainer, prereg_listener);
+
+ if (!memory_region_is_ram(section->mr)) {
+ return;
+ }
+
+ vfio_dma_unmap_ram_section(container, section);
+}
+
static void vfio_listener_region_add(MemoryListener *listener,
MemoryRegionSection *section)
{
@@ -541,7 +625,6 @@ static void vfio_listener_region_add(MemoryListener *listener,
Int128 llend;
int ret;
VFIOHostDMAWindow *hostwin;
- bool hostwin_found;
if (vfio_listener_skipped_section(section)) {
trace_vfio_listener_region_add_skip(
@@ -618,26 +701,10 @@ static void vfio_listener_region_add(MemoryListener *listener,
#endif
}
- hostwin_found = false;
- QLIST_FOREACH(hostwin, &container->hostwin_list, hostwin_next) {
- if (hostwin->min_iova <= iova && end <= hostwin->max_iova) {
- hostwin_found = true;
- break;
- }
- }
-
- if (!hostwin_found) {
- error_report("vfio: IOMMU container %p can't map guest IOVA region"
- " 0x%"HWADDR_PRIx"..0x%"HWADDR_PRIx,
- container, iova, end);
- ret = -EFAULT;
- goto fail;
- }
-
memory_region_ref(section->mr);
if (memory_region_is_iommu(section->mr)) {
- VFIOGuestIOMMU *giommu;
+ VFIOGuestIOMMU *giommu = NULL;
IOMMUMemoryRegion *iommu_mr = IOMMU_MEMORY_REGION(section->mr);
hwaddr offset;
int iommu_idx;
@@ -652,21 +719,40 @@ static void vfio_listener_region_add(MemoryListener *listener,
offset = section->offset_within_address_space -
section->offset_within_region;
- giommu = vfio_alloc_guest_iommu(container, iommu_mr, offset);
-
llend = int128_add(int128_make64(section->offset_within_region),
section->size);
llend = int128_sub(llend, int128_one());
iommu_idx = memory_region_iommu_attrs_to_index(iommu_mr,
MEMTXATTRS_UNSPECIFIED);
- iommu_iotlb_notifier_init(&giommu->n, vfio_iommu_map_notify,
- IOMMU_NOTIFIER_IOTLB_ALL,
- section->offset_within_region,
- int128_get64(llend),
- iommu_idx);
- QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
- memory_region_register_iommu_notifier(section->mr, &giommu->n);
+ if (container->iommu_type == VFIO_TYPE1_NESTING_IOMMU) {
+ /* Config notifier to propagate guest stage 1 config changes */
+ giommu = vfio_alloc_guest_iommu(container, iommu_mr, offset);
+ iommu_config_notifier_init(&giommu->n, vfio_iommu_nested_notify,
+ iommu_idx);
+ QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
+ memory_region_register_iommu_notifier(section->mr, &giommu->n);
+
+ /* IOTLB unmap notifier to propagate guest IOTLB invalidations */
+ giommu = vfio_alloc_guest_iommu(container, iommu_mr, offset);
+ iommu_iotlb_notifier_init(&giommu->n, vfio_iommu_unmap_notify,
+ IOMMU_NOTIFIER_UNMAP,
+ section->offset_within_region,
+ int128_get64(llend),
+ iommu_idx);
+ QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
+ memory_region_register_iommu_notifier(section->mr, &giommu->n);
+ } else {
+ /* MAP/UNMAP IOTLB notifier */
+ giommu = vfio_alloc_guest_iommu(container, iommu_mr, offset);
+ iommu_iotlb_notifier_init(&giommu->n, vfio_iommu_map_notify,
+ IOMMU_NOTIFIER_IOTLB_ALL,
+ section->offset_within_region,
+ int128_get64(llend),
+ iommu_idx);
+ QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
+ memory_region_register_iommu_notifier(section->mr, &giommu->n);
+ }
memory_region_iommu_replay(giommu->iommu, &giommu->n);
return;
@@ -679,7 +765,7 @@ static void vfio_listener_region_add(MemoryListener *listener,
}
return;
-fail:
+ fail:
if (memory_region_is_ram_device(section->mr)) {
error_report("failed to vfio_dma_map. pci p2p may not work");
return;
@@ -763,15 +849,21 @@ static void vfio_listener_region_del(MemoryListener *listener,
}
}
-static const MemoryListener vfio_memory_listener = {
+static MemoryListener vfio_memory_listener = {
.region_add = vfio_listener_region_add,
.region_del = vfio_listener_region_del,
};
+static MemoryListener vfio_memory_prereg_listener = {
+ .region_add = vfio_prereg_listener_region_add,
+ .region_del = vfio_prereg_listener_region_del,
+};
+
static void vfio_listener_release(VFIOContainer *container)
{
memory_listener_unregister(&container->listener);
- if (container->iommu_type == VFIO_SPAPR_TCE_v2_IOMMU) {
+ if (container->iommu_type == VFIO_SPAPR_TCE_v2_IOMMU ||
+ container->iommu_type == VFIO_TYPE1_NESTING_IOMMU) {
memory_listener_unregister(&container->prereg_listener);
}
}
@@ -1262,6 +1354,20 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
}
vfio_host_win_add(container, 0, (hwaddr)-1, info.iova_pgsizes);
container->pgsizes = info.iova_pgsizes;
+
+ if (container->iommu_type == VFIO_TYPE1_NESTING_IOMMU) {
+ container->prereg_listener = vfio_memory_prereg_listener;
+
+ memory_listener_register(&container->prereg_listener,
+ &address_space_memory);
+ if (container->error) {
+ memory_listener_unregister(&container->prereg_listener);
+ ret = container->error;
+ error_setg(errp,
+ "RAM memory listener initialization failed for container");
+ goto free_container_exit;
+ }
+ }
break;
}
case VFIO_SPAPR_TCE_v2_IOMMU:
--
2.17.1
* [Qemu-devel] [RFC 17/20] hw/vfio/common: Register MAP notifier for MSI binding
From: Eric Auger @ 2018-09-01 14:23 UTC (permalink / raw)
To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
Cc: alex.williamson, mst, cdall, jean-philippe.brucker, peterx,
yi.l.liu
Register a MAP notifier to propagate MSI stage 1 bindings to
the host. When the notifier gets called, we pass the guest
stage 1 MSI binding to the host. The host can then build
a stage 2 binding whose input address is the guest MSI doorbell GPA.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
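Note (illustration only, not part of the patch): the notifier turns
one IOTLB entry into an (iova, gpa, granule) triple. The sketch below
shows that conversion in isolation. The struct and function names are
hypothetical and do not reproduce the QEMU or kernel UAPI definitions;
__builtin_ctzll() assumes GCC or Clang.

```c
#include <stdint.h>

/* Hypothetical, reduced view of an IOTLB entry. */
struct sketch_iotlb_entry {
    uint64_t iova;            /* MSI IOVA programmed by the guest */
    uint64_t translated_addr; /* doorbell GPA after stage 1 translation */
    uint64_t addr_mask;       /* 2^granule - 1 */
};

/* Hypothetical binding passed down to the host. */
struct sketch_msi_binding {
    uint64_t iova;
    uint64_t gpa;
    uint32_t granule;         /* log2 of the mapping size in bytes */
};

static struct sketch_msi_binding
sketch_msi_binding_from_iotlb(const struct sketch_iotlb_entry *e)
{
    struct sketch_msi_binding b;
    uint64_t inv = ~e->addr_mask;

    b.iova = e->iova;
    b.gpa = e->translated_addr;
    /* Same convention as QEMU's ctz64(): 64 when no bit is set. */
    b.granule = inv ? (uint32_t)__builtin_ctzll(inv) : 64;
    return b;
}
```

Unlike the TLB invalidation size, the granule is not biased by 12: a
4K mapping is reported as granule 12.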
hw/vfio/common.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 49fcbbbc8c..412041593e 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -407,6 +407,26 @@ static void vfio_iommu_unmap_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
}
}
+static void vfio_iommu_msi_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
+{
+ VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n);
+ VFIOContainer *container = giommu->container;
+ struct vfio_iommu_type1_bind_guest_msi ustruct;
+ int ret;
+
+ ustruct.argsz = sizeof(struct vfio_iommu_type1_bind_guest_msi);
+ ustruct.flags = 0;
+ ustruct.binding.iova = iotlb->iova;
+ ustruct.binding.gpa = iotlb->translated_addr;
+ ustruct.binding.granule = ctz64(~iotlb->addr_mask);
+
+ ret = ioctl(container->fd, VFIO_IOMMU_BIND_MSI, &ustruct);
+ if (ret) {
+ error_report("%s: failed to pass MSI binding (%d)",
+ __func__, ret);
+ }
+}
+
static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
{
VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n);
@@ -742,6 +762,15 @@ static void vfio_listener_region_add(MemoryListener *listener,
iommu_idx);
QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
memory_region_register_iommu_notifier(section->mr, &giommu->n);
+
+ giommu = vfio_alloc_guest_iommu(container, iommu_mr, offset);
+ iommu_iotlb_notifier_init(&giommu->n, vfio_iommu_msi_map_notify,
+ IOMMU_NOTIFIER_MAP,
+ section->offset_within_region,
+ int128_get64(llend),
+ iommu_idx);
+ QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
+ memory_region_register_iommu_notifier(section->mr, &giommu->n);
} else {
/* MAP/UNMAP IOTLB notifier */
giommu = vfio_alloc_guest_iommu(container, iommu_mr, offset);
--
2.17.1
* [Qemu-devel] [RFC 18/20] target/arm/kvm: Notifies IOMMU on MSI stage 1 binding
From: Eric Auger @ 2018-09-01 14:23 UTC (permalink / raw)
To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
Cc: alex.williamson, mst, cdall, jean-philippe.brucker, peterx,
yi.l.liu
When the MSI route is set up, we know the MSI IOVA and
the doorbell GPA. At that point we can communicate this guest
stage 1 binding to the host. The host will then be able
to construct a stage 2 binding taking the doorbell GPA as
input address.
We also directly use the iommu memory region translate() callback,
as the addr_mask is returned in the IOTLB entry. address_space_translate
does not return this information.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
TODO: access to as->root field may be cleaned later on
---
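Note (illustration only, not part of the patch): once the doorbell GPA
is known, the route fixup amounts to rejecting untranslated entries
and splitting the 64-bit address into the two 32-bit route fields. A
minimal standalone sketch follows; all sketch_* types and names are
hypothetical simplifications of IOMMUTLBEntry and
struct kvm_irq_routing_entry.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical, reduced permission encoding. */
enum sketch_perm {
    SKETCH_PERM_NONE = 0,
    SKETCH_PERM_WO = 2,
};

/* Hypothetical, reduced result of the IOMMU translate() callback. */
struct sketch_entry {
    uint64_t translated_addr; /* doorbell GPA */
    enum sketch_perm perm;
};

/* Hypothetical, reduced MSI route. */
struct sketch_msi_route {
    uint32_t address_lo;
    uint32_t address_hi;
};

static int sketch_fixup_msi_route(struct sketch_msi_route *route,
                                  const struct sketch_entry *entry)
{
    if (entry->perm == SKETCH_PERM_NONE) {
        /* No stage 1 mapping for the MSI IOVA: leave the route untouched. */
        return -ENOENT;
    }
    route->address_lo = (uint32_t)entry->translated_addr;
    route->address_hi = (uint32_t)(entry->translated_addr >> 32);
    return 0;
}
```

The sketch leaves out the RCU locking and the notification step; in
the patch the successfully translated entry is also handed to
memory_region_iotlb_notify_iommu().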
target/arm/kvm.c | 46 ++++++++++++++++++++--------------------------
1 file changed, 20 insertions(+), 26 deletions(-)
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 65f867d569..6f905215b8 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -661,41 +661,35 @@ int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
uint64_t address, uint32_t data, PCIDevice *dev)
{
AddressSpace *as = pci_device_iommu_address_space(dev);
- hwaddr xlat, len, doorbell_gpa;
- MemoryRegionSection mrs;
- MemoryRegion *mr;
- int ret = 1;
+ IOMMUMemoryRegionClass *imrc;
+ IOMMUMemoryRegion *iommu_mr;
+ IOMMUTLBEntry entry;
if (as == &address_space_memory) {
return 0;
}
+ iommu_mr = IOMMU_MEMORY_REGION(as->root);
+ imrc = memory_region_get_iommu_class_nocheck(iommu_mr);
+
/* MSI doorbell address is translated by an IOMMU */
rcu_read_lock();
- mr = address_space_translate(as, address, &xlat, &len, true,
- MEMTXATTRS_UNSPECIFIED);
- if (!mr) {
- goto unlock;
- }
- mrs = memory_region_find(mr, xlat, 1);
- if (!mrs.mr) {
- goto unlock;
- }
-
- doorbell_gpa = mrs.offset_within_address_space;
- memory_region_unref(mrs.mr);
-
- route->u.msi.address_lo = doorbell_gpa;
- route->u.msi.address_hi = doorbell_gpa >> 32;
-
- trace_kvm_arm_fixup_msi_route(address, doorbell_gpa);
-
- ret = 0;
-
-unlock:
+ entry = imrc->translate(iommu_mr, address, IOMMU_WO, 0);
rcu_read_unlock();
- return ret;
+
+ if (entry.perm == IOMMU_NONE) {
+ return -ENOENT;
+ }
+
+ route->u.msi.address_lo = entry.translated_addr;
+ route->u.msi.address_hi = entry.translated_addr >> 32;
+
+ memory_region_iotlb_notify_iommu(iommu_mr, 0, entry);
+
+ trace_kvm_arm_fixup_msi_route(address, entry.translated_addr);
+
+ return 0;
}
int kvm_arch_add_msi_route_post(struct kvm_irq_routing_entry *route,
--
2.17.1
* [Qemu-devel] [RFC 19/20] vfio/pci: Always set up MSI route before enabling vectors
From: Eric Auger @ 2018-09-01 14:23 UTC (permalink / raw)
To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
Cc: alex.williamson, mst, cdall, jean-philippe.brucker, peterx,
yi.l.liu
When we enable the vectors, we shall have an MSI route set up.
The notification of the stage 1 binding is done on MSI
route setup.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
hw/vfio/pci.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 056f3a887a..040a2f39f8 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -522,6 +522,7 @@ static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr,
if (vdev->nr_vectors < nr + 1) {
vfio_disable_irqindex(&vdev->vbasedev, VFIO_PCI_MSIX_IRQ_INDEX);
vdev->nr_vectors = nr + 1;
+ vfio_add_kvm_msi_virq(vdev, vector, nr, true);
ret = vfio_enable_vectors(vdev, true);
if (ret) {
error_report("vfio: failed to enable vectors, %d", ret);
--
2.17.1
* [Qemu-devel] [RFC 20/20] hw/arm/smmuv3: Remove warning about unsupported MAP notifiers
From: Eric Auger @ 2018-09-01 14:23 UTC (permalink / raw)
To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell
Cc: alex.williamson, mst, cdall, jean-philippe.brucker, peterx,
yi.l.liu
SMMUv3 is now integrated with VFIO by enabling the 2 stages of
the physical SMMUv3. This relies on a MAP notifier for MSI
stage 1 binding notification. So let's remove this warning.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
hw/arm/smmuv3.c | 8 --------
1 file changed, 8 deletions(-)
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index a31df03d47..9f39887602 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -1497,14 +1497,6 @@ static void smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
SMMUNotifierNode *node = NULL;
SMMUNotifierNode *next_node = NULL;
- if (new & IOMMU_NOTIFIER_MAP) {
- int bus_num = pci_bus_num(sdev->bus);
- PCIDevice *pcidev = pci_find_device(sdev->bus, bus_num, sdev->devfn);
-
- warn_report("SMMUv3 does not support notification on MAP: "
- "device %s will not function properly", pcidev->name);
- }
-
if (old == IOMMU_NOTIFIER_NONE) {
trace_smmuv3_notify_flag_add(iommu->parent_obj.name);
node = g_malloc0(sizeof(*node));
--
2.17.1