* [Qemu-devel] [PATCH v2 0/3] intel-iommu: introduce Intel IOMMU (VT-d) emulation to q35 chipset
@ 2014-07-27 8:52 Le Tan
From: Le Tan @ 2014-07-27 8:52 UTC (permalink / raw)
To: qemu-devel
Cc: Michael S. Tsirkin, Knut Omang, Le Tan, Alex Williamson,
Jan Kiszka, Anthony Liguori, Paolo Bonzini
Hi,
These patches introduce Intel IOMMU (VT-d) emulation to the q35 chipset. The
major job in these patches is to add support for emulating the Intel IOMMU
according to the VT-d specification, including basic responses to CSR
accesses, the DMAR (DMA remapping) logic and DMA memory address translation.
Features implemented so far:
1. Respond to the important CSR accesses;
2. DMAR (DMA remapping) without PASID support;
3. Register-based invalidation for IOTLB and context-cache invalidation;
4. Add a DMAR table to the ACPI tables to expose VT-d to the BIOS;
5. Add a "-machine iommu=on|off" option to enable/disable VT-d (see the example
command line after this list);
6. Only one DMAR unit for all devices on PCI Segment 0.
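For reference, once patch 3 is applied the emulation is switched on from the
QEMU command line; a minimal example (the memory size and disk image below are
only illustrative) would look something like:
    qemu-system-x86_64 -machine q35,iommu=on -m 2G -hda guest.img
The guest kernel additionally needs intel_iommu=on on its command line to
actually drive the emulated unit (see Testing below).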
Testing:
1. An L1 Linux guest booted with intel_iommu=on interacts with VT-d and boots
smoothly, and the kernel log shows the expected VT-d information;
2. With VT-d enabled in L1, an L2 Linux guest boots smoothly without PCI device
passthrough;
3. Run L1 with VT-d and "-soundhw ac97" (QEMU_AUDIO_DRV=none), then assign the
sound card to L2; L2 boots smoothly with legacy PCI assignment;
4. The Jailhouse hypervisor seems to run smoothly so far (tested by Jan);
5. Run L1 with VT-d and an e1000 network card, then assign the e1000 to L2; L2
gets STUCK while booting. This remains unsolved. As far as I can tell, L2
crashes in e1000_probe(). The QEMU running L1 dumps "KVM: entry failed,
hardware error 0x0", and the host KVM prints "nested_vmx_exit_handled failed
vm entry 7". Unlike the sound-card case, after the e1000 is assigned to L2
there is no translation entry for it in VT-d, which I take to mean that the
e1000 does not issue any DMA access during the boot of L2. Sometimes the L2
kernel also prints "divide error" while booting. Can someone help me with
this? Any help is appreciated! :)
6. VFIO was tested and behaves the same as legacy PCI assignment.
I have some questions I would like to ask here:
1. The struct IntelIOMMUState is currently a member of MCHPCIState. VT-d is
registered as TYPE_SYS_BUS_DEVICE but registers its configuration MemoryRegion
as a subregion of mch->pci_address_space. Is this correct? Another idea is to
use sysbus_mmio_map() to map the MemoryRegion of VT-d (see the sketch after
these questions), but I am not sure. There may also be other improper uses of
QOM.
2. For a pointer-to-pointer declaration such as VTDAddressSpace
**address_spaces, checkpatch.pl warns "ERROR: need consistent spacing around
'*' (ctx:WxO)". Is checkpatch.pl wrong here?
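To make question 1 concrete, here is a minimal, untested sketch of the
sysbus_mmio_map() alternative, only to illustrate the idea:
    /* in vtd_realize(): export csrmem as sysbus MMIO region 0 */
    sysbus_init_mmio(SYS_BUS_DEVICE(s), &s->csrmem);
    /* in mch_init_dmar(): map region 0 at the fixed VT-d base address; note
     * that sysbus_mmio_map() maps into the system address space rather than
     * into mch->pci_address_space, which is part of what the question asks */
    sysbus_mmio_map(SYS_BUS_DEVICE(mch->iommu), 0, Q35_HOST_BRIDGE_IOMMU_ADDR);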
TODO:
1. Fix the bug in legacy PCI assignment;
2. Clean up the code related to migration;
3. Queued Invalidation;
4. Basic fault reporting;
5. Caching properties of the IOTLB.
Changes since v1:
* Address the review suggestions from Michael, Paolo, Stefan and Jan:
-split intel_iommu.h into include/hw/i386/intel_iommu.h and
hw/i386/intel_iommu_internal.h
-change the copyright information
-change D() to VTD_DPRINTF()
-remove dead code
-rename constant definitions with a consistent VTD_ prefix
-rename some struct definitions according to the QEMU coding standard
-rename some CSR access functions
-use endian-safe functions to access CSRs
-change the machine option to "iommu=on|off"
Thanks very much!
Git trees:
https://github.com/tamlok/qemu
Le Tan (3):
intel-iommu: introduce Intel IOMMU (VT-d) emulation
intel-iommu: add DMAR table to ACPI tables
intel-iommu: add Intel IOMMU emulation to q35 and add a machine option
"iommu" as a switch
hw/core/machine.c | 27 +-
hw/i386/Makefile.objs | 1 +
hw/i386/acpi-build.c | 41 ++
hw/i386/acpi-defs.h | 70 ++++
hw/i386/intel_iommu.c | 911 +++++++++++++++++++++++++++++++++++++++++
hw/i386/intel_iommu_internal.h | 257 ++++++++++++
hw/pci-host/q35.c | 72 +++-
include/hw/boards.h | 1 +
include/hw/i386/intel_iommu.h | 75 ++++
include/hw/pci-host/q35.h | 2 +
qemu-options.hx | 5 +-
vl.c | 4 +
12 files changed, 1457 insertions(+), 9 deletions(-)
create mode 100644 hw/i386/intel_iommu.c
create mode 100644 hw/i386/intel_iommu_internal.h
create mode 100644 include/hw/i386/intel_iommu.h
--
1.9.1
* [Qemu-devel] [PATCH v2 1/3] intel-iommu: introduce Intel IOMMU (VT-d) emulation
From: Le Tan @ 2014-07-27 8:52 UTC (permalink / raw)
To: qemu-devel
Cc: Michael S. Tsirkin, Knut Omang, Le Tan, Alex Williamson,
Jan Kiszka, Anthony Liguori, Paolo Bonzini
Add support for emulating Intel IOMMU according to the VT-d specification for
the q35 chipset machine. Implement the logic for DMAR (DMA remapping) without
PASID support. Use register-based invalidation for context-cache invalidation
and IOTLB invalidation.
Basic fault reporting and caching are not implemented yet.
Signed-off-by: Le Tan <tamlokveer@gmail.com>
---
hw/i386/Makefile.objs | 1 +
hw/i386/intel_iommu.c | 911 +++++++++++++++++++++++++++++++++++++++++
hw/i386/intel_iommu_internal.h | 257 ++++++++++++
include/hw/i386/intel_iommu.h | 75 ++++
4 files changed, 1244 insertions(+)
create mode 100644 hw/i386/intel_iommu.c
create mode 100644 hw/i386/intel_iommu_internal.h
create mode 100644 include/hw/i386/intel_iommu.h
diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs
index 48014ab..6936111 100644
--- a/hw/i386/Makefile.objs
+++ b/hw/i386/Makefile.objs
@@ -2,6 +2,7 @@ obj-$(CONFIG_KVM) += kvm/
obj-y += multiboot.o smbios.o
obj-y += pc.o pc_piix.o pc_q35.o
obj-y += pc_sysfw.o
+obj-y += intel_iommu.o
obj-$(CONFIG_XEN) += ../xenpv/ xen/
obj-y += kvmvapic.o
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
new file mode 100644
index 0000000..718993a
--- /dev/null
+++ b/hw/i386/intel_iommu.c
@@ -0,0 +1,911 @@
+/*
+ * QEMU emulation of an Intel IOMMU (VT-d)
+ * (DMA Remapping device)
+ *
+ * Copyright (C) 2013 Knut Omang, Oracle <knut.omang@oracle.com>
+ * Copyright (C) 2014 Le Tan, <tamlokveer@gmail.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "hw/sysbus.h"
+#include "exec/address-spaces.h"
+#include "intel_iommu_internal.h"
+
+
+/*#define DEBUG_INTEL_IOMMU*/
+#ifdef DEBUG_INTEL_IOMMU
+#define VTD_DPRINTF(fmt, ...) \
+ do { fprintf(stderr, "(vtd)%s: " fmt "\n", __func__, \
+ ## __VA_ARGS__); } while (0)
+#else
+#define VTD_DPRINTF(fmt, ...) \
+ do { } while (0)
+#endif
+
+static inline void define_quad(IntelIOMMUState *s, hwaddr addr, uint64_t val,
+ uint64_t wmask, uint64_t w1cmask)
+{
+ stq_le_p(&s->csr[addr], val);
+ stq_le_p(&s->wmask[addr], wmask);
+ stq_le_p(&s->w1cmask[addr], w1cmask);
+}
+
+static inline void define_quad_wo(IntelIOMMUState *s, hwaddr addr,
+ uint64_t mask)
+{
+ stq_le_p(&s->womask[addr], mask);
+}
+
+static inline void define_long(IntelIOMMUState *s, hwaddr addr, uint32_t val,
+ uint32_t wmask, uint32_t w1cmask)
+{
+ stl_le_p(&s->csr[addr], val);
+ stl_le_p(&s->wmask[addr], wmask);
+ stl_le_p(&s->w1cmask[addr], w1cmask);
+}
+
+static inline void define_long_wo(IntelIOMMUState *s, hwaddr addr,
+ uint32_t mask)
+{
+ stl_le_p(&s->womask[addr], mask);
+}
+
+/* "External" get/set operations */
+static inline void set_quad(IntelIOMMUState *s, hwaddr addr, uint64_t val)
+{
+ uint64_t oldval = ldq_le_p(&s->csr[addr]);
+ uint64_t wmask = ldq_le_p(&s->wmask[addr]);
+ uint64_t w1cmask = ldq_le_p(&s->w1cmask[addr]);
+ stq_le_p(&s->csr[addr],
+ ((oldval & ~wmask) | (val & wmask)) & ~(w1cmask & val));
+}
+
+static inline void set_long(IntelIOMMUState *s, hwaddr addr, uint32_t val)
+{
+ uint32_t oldval = ldl_le_p(&s->csr[addr]);
+ uint32_t wmask = ldl_le_p(&s->wmask[addr]);
+ uint32_t w1cmask = ldl_le_p(&s->w1cmask[addr]);
+ stl_le_p(&s->csr[addr],
+ ((oldval & ~wmask) | (val & wmask)) & ~(w1cmask & val));
+}
+
+static inline uint64_t get_quad(IntelIOMMUState *s, hwaddr addr)
+{
+ uint64_t val = ldq_le_p(&s->csr[addr]);
+ uint64_t womask = ldq_le_p(&s->womask[addr]);
+ return val & ~womask;
+}
+
+
+static inline uint32_t get_long(IntelIOMMUState *s, hwaddr addr)
+{
+ uint32_t val = ldl_le_p(&s->csr[addr]);
+ uint32_t womask = ldl_le_p(&s->womask[addr]);
+ return val & ~womask;
+}
+
+/* "Internal" get/set operations */
+static inline uint64_t get_quad_raw(IntelIOMMUState *s, hwaddr addr)
+{
+ return ldq_le_p(&s->csr[addr]);
+}
+
+static inline uint32_t get_long_raw(IntelIOMMUState *s, hwaddr addr)
+{
+ return ldl_le_p(&s->csr[addr]);
+}
+
+static inline uint32_t set_clear_mask_long(IntelIOMMUState *s, hwaddr addr,
+ uint32_t clear, uint32_t mask)
+{
+ uint32_t new_val = (ldl_le_p(&s->csr[addr]) & ~clear) | mask;
+ stl_le_p(&s->csr[addr], new_val);
+ return new_val;
+}
+
+static inline uint64_t set_clear_mask_quad(IntelIOMMUState *s, hwaddr addr,
+ uint64_t clear, uint64_t mask)
+{
+ uint64_t new_val = (ldq_le_p(&s->csr[addr]) & ~clear) | mask;
+ stq_le_p(&s->csr[addr], new_val);
+ return new_val;
+}
+
+static inline bool root_entry_present(VTDRootEntry *root)
+{
+ return root->val & VTD_ROOT_ENTRY_P;
+}
+
+static bool get_root_entry(IntelIOMMUState *s, int index, VTDRootEntry *re)
+{
+ dma_addr_t addr;
+
+ assert(index >= 0 && index < VTD_ROOT_ENTRY_NR);
+
+ addr = s->root + index * sizeof(*re);
+
+ if (dma_memory_read(&address_space_memory, addr, re, sizeof(*re))) {
+ /* FIXME: fault reporting */
+ VTD_DPRINTF("error: fail to read root table");
+ re->val = 0;
+ return false;
+ }
+
+ re->val = le64_to_cpu(re->val);
+ return true;
+}
+
+static inline bool context_entry_present(VTDContextEntry *context)
+{
+ return context->lo & VTD_CONTEXT_ENTRY_P;
+}
+
+static bool get_context_entry_from_root(VTDRootEntry *root, int index,
+ VTDContextEntry *ce)
+{
+ dma_addr_t addr;
+
+ if (!root_entry_present(root)) {
+ ce->lo = 0;
+ ce->hi = 0;
+ return false;
+ }
+
+ assert(index >= 0 && index < VTD_CONTEXT_ENTRY_NR);
+
+ addr = (root->val & VTD_ROOT_ENTRY_CTP) + index * sizeof(*ce);
+
+ if (dma_memory_read(&address_space_memory, addr, ce, sizeof(*ce))) {
+ /* FIXME: fault reporting */
+ VTD_DPRINTF("error: fail to read context_entry table");
+ ce->lo = 0;
+ ce->hi = 0;
+ return false;
+ }
+
+ ce->lo = le64_to_cpu(ce->lo);
+ ce->hi = le64_to_cpu(ce->hi);
+ return true;
+}
+
+static inline dma_addr_t get_slpt_base_from_context(VTDContextEntry *ce)
+{
+ return ce->lo & VTD_CONTEXT_ENTRY_SLPTPTR;
+}
+
+/* The shift of an addr for a certain level of paging structure */
+static inline int slpt_level_shift(int level)
+{
+ return VTD_PAGE_SHIFT_4K + (level - 1) * VTD_SL_LEVEL_BITS;
+}
+
+static inline bool slpte_present(uint64_t slpte)
+{
+ return slpte & 3;
+}
+
+/* Calculate the GPA given the base address, the index in the page table and
+ * the level of this page table.
+ */
+static inline uint64_t get_slpt_gpa(uint64_t addr, int index, int level)
+{
+ return addr + (((uint64_t)index) << slpt_level_shift(level));
+}
+
+static inline uint64_t get_slpte_addr(uint64_t slpte)
+{
+ return slpte & VTD_SL_PT_BASE_ADDR_MASK;
+}
+
+/* Whether the pte points to a large page */
+static inline bool is_large_pte(uint64_t pte)
+{
+ return pte & VTD_SL_PT_PAGE_SIZE_MASK;
+}
+
+/* Whether the pte indicates the address of the page frame */
+static inline bool is_last_slpte(uint64_t slpte, int level)
+{
+ if (level == VTD_SL_PT_LEVEL) {
+ return true;
+ }
+ if (is_large_pte(slpte)) {
+ return true;
+ }
+ return false;
+}
+
+/* Get the content of a spte located in @base_addr[@index] */
+static inline uint64_t get_slpte(dma_addr_t base_addr, int index)
+{
+ uint64_t slpte;
+
+ assert(index >= 0 && index < VTD_SL_PT_ENTRY_NR);
+
+ if (dma_memory_read(&address_space_memory,
+ base_addr + index * sizeof(slpte), &slpte,
+ sizeof(slpte))) {
+ /* FIXME: fault reporting */
+ VTD_DPRINTF("error: fail to read second level paging structures");
+ slpte = (uint64_t)-1;
+ return slpte;
+ }
+
+ slpte = le64_to_cpu(slpte);
+ return slpte;
+}
+
+/* Given a gpa and the level of paging structure, return the offset of current
+ * level.
+ */
+static inline int gpa_level_offset(uint64_t gpa, int level)
+{
+ return (gpa >> slpt_level_shift(level)) & ((1ULL << VTD_SL_LEVEL_BITS) - 1);
+}
+
+/* Get the page-table level that hardware should use for the second-level
+ * page-table walk from the Address Width field of context-entry.
+ */
+static inline int get_level_from_context_entry(VTDContextEntry *ce)
+{
+ return 2 + (ce->hi & VTD_CONTEXT_ENTRY_AW);
+}
+
+/* Given the @gpa, return the relevant slpte. @slpte_level will be the last
+ * level of the translation, and can be used to decide the size of a large page.
+ */
+static uint64_t gpa_to_slpte(VTDContextEntry *ce, uint64_t gpa,
+ int *slpte_level)
+{
+ dma_addr_t addr = get_slpt_base_from_context(ce);
+ int level = get_level_from_context_entry(ce);
+ int offset;
+ uint64_t slpte;
+
+ while (true) {
+ offset = gpa_level_offset(gpa, level);
+ slpte = get_slpte(addr, offset);
+
+ if (slpte == (uint64_t)-1) {
+ *slpte_level = level;
+ break;
+ }
+ if (!slpte_present(slpte)) {
+ VTD_DPRINTF("error: slpte 0x%"PRIx64 " is not present", slpte);
+ slpte = (uint64_t)-1;
+ *slpte_level = level;
+ break;
+ }
+ if (is_last_slpte(slpte, level)) {
+ *slpte_level = level;
+ break;
+ }
+ addr = get_slpte_addr(slpte);
+ level--;
+ }
+ return slpte;
+}
+
+/* Map a device to its corresponding domain (context_entry) */
+static inline bool dev_to_context_entry(IntelIOMMUState *s, int bus_num,
+ int devfn, VTDContextEntry *ce)
+{
+ VTDRootEntry re;
+
+ assert(0 <= bus_num && bus_num < VTD_PCI_BUS_MAX);
+ assert(0 <= devfn && devfn < VTD_PCI_SLOT_MAX * VTD_PCI_FUNC_MAX);
+
+ if (!get_root_entry(s, bus_num, &re)) {
+ /* FIXME: fault reporting */
+ return false;
+ }
+ if (!root_entry_present(&re)) {
+ /* FIXME: fault reporting */
+ VTD_DPRINTF("error: root-entry #%d is not present", bus_num);
+ return false;
+ }
+ if (!get_context_entry_from_root(&re, devfn, ce)) {
+ /* FIXME: fault reporting */
+ return false;
+ }
+ if (!context_entry_present(ce)) {
+ /* FIXME: fault reporting */
+ VTD_DPRINTF("error: context-entry #%d(bus #%d) is not present", devfn,
+ bus_num);
+ return false;
+ }
+
+ return true;
+}
+
+/* Do a paging-structures walk to do an iommu translation
+ * @bus_num: The bus number
+ * @devfn: The devfn, which is the combination of the device and function numbers
+ * @entry: IOMMUTLBEntry that contains the addr to be translated and the result
+ */
+static void iommu_translate(IntelIOMMUState *s, int bus_num, int devfn,
+ hwaddr addr, IOMMUTLBEntry *entry)
+{
+ VTDContextEntry ce;
+ uint64_t slpte;
+ int level;
+ uint64_t page_mask = VTD_PAGE_MASK_4K;
+
+ if (!dev_to_context_entry(s, bus_num, devfn, &ce)) {
+ return;
+ }
+
+ slpte = gpa_to_slpte(&ce, addr, &level);
+ if (slpte == (uint64_t)-1) {
+ /* FIXME: fault reporting */
+ VTD_DPRINTF("error: can't get slpte for gpa %"PRIx64, addr);
+ return;
+ }
+
+ if (is_large_pte(slpte)) {
+ if (level == VTD_SL_PDP_LEVEL) {
+ /* 1-GB page */
+ page_mask = VTD_PAGE_MASK_1G;
+ } else {
+ /* 2-MB page */
+ page_mask = VTD_PAGE_MASK_2M;
+ }
+ }
+
+ entry->iova = addr & page_mask;
+ entry->translated_addr = get_slpte_addr(slpte) & page_mask;
+ entry->addr_mask = ~page_mask;
+ entry->perm = IOMMU_RW;
+}
+
+static void vtd_root_table_setup(IntelIOMMUState *s)
+{
+ s->root = *((uint64_t *)&s->csr[DMAR_RTADDR_REG]);
+ s->extended = s->root & VTD_RTADDR_RTT;
+ s->root &= ~0xfff;
+ VTD_DPRINTF("root_table addr 0x%"PRIx64 " %s", s->root,
+ (s->extended ? "(extended)" : ""));
+}
+
+/* Context-cache invalidation
+ * Returns the Context Actual Invalidation Granularity.
+ * @val: the content of the CCMD_REG
+ */
+static uint64_t vtd_context_cache_invalidate(IntelIOMMUState *s, uint64_t val)
+{
+ uint64_t caig;
+ uint64_t type = val & VTD_CCMD_CIRG_MASK;
+
+ switch (type) {
+ case VTD_CCMD_GLOBAL_INVL:
+ VTD_DPRINTF("Global invalidation request");
+ caig = VTD_CCMD_GLOBAL_INVL_A;
+ break;
+
+ case VTD_CCMD_DOMAIN_INVL:
+ VTD_DPRINTF("Domain-selective invalidation request");
+ caig = VTD_CCMD_DOMAIN_INVL_A;
+ break;
+
+ case VTD_CCMD_DEVICE_INVL:
+ VTD_DPRINTF("Domain-selective invalidation request");
+ caig = VTD_CCMD_DEVICE_INVL_A;
+ break;
+
+ default:
+ VTD_DPRINTF("error: wrong context-cache invalidation granularity");
+ caig = 0;
+ }
+
+ return caig;
+}
+
+/* Flush IOTLB
+ * Returns the IOTLB Actual Invalidation Granularity.
+ * @val: the content of the IOTLB_REG
+ */
+static uint64_t vtd_iotlb_flush(IntelIOMMUState *s, uint64_t val)
+{
+ uint64_t iaig;
+ uint64_t type = val & VTD_TLB_FLUSH_GRANU_MASK;
+
+ switch (type) {
+ case VTD_TLB_GLOBAL_FLUSH:
+ VTD_DPRINTF("Global IOTLB flush");
+ iaig = VTD_TLB_GLOBAL_FLUSH_A;
+ break;
+
+ case VTD_TLB_DSI_FLUSH:
+ VTD_DPRINTF("Domain-selective IOTLB flush");
+ iaig = VTD_TLB_DSI_FLUSH_A;
+ break;
+
+ case VTD_TLB_PSI_FLUSH:
+ VTD_DPRINTF("Page-selective-within-domain IOTLB flush");
+ iaig = VTD_TLB_PSI_FLUSH_A;
+ break;
+
+ default:
+ VTD_DPRINTF("error: wrong iotlb flush granularity");
+ iaig = 0;
+ }
+
+ return iaig;
+}
+
+/* FIXME: Not implemented yet */
+static void handle_gcmd_qie(IntelIOMMUState *s, bool en)
+{
+ VTD_DPRINTF("Queued Invalidation Enable %s", (en ? "on" : "off"));
+
+ /* Ok - report back to driver */
+ set_clear_mask_long(s, DMAR_GSTS_REG, 0, VTD_GSTS_QIES);
+}
+
+/* Set Root Table Pointer */
+static void handle_gcmd_srtp(IntelIOMMUState *s)
+{
+ VTD_DPRINTF("set Root Table Pointer");
+
+ vtd_root_table_setup(s);
+ /* Ok - report back to driver */
+ set_clear_mask_long(s, DMAR_GSTS_REG, 0, VTD_GSTS_RTPS);
+}
+
+/* Handle Translation Enable/Disable */
+static void handle_gcmd_te(IntelIOMMUState *s, bool en)
+{
+ VTD_DPRINTF("Translation Enable %s", (en ? "on" : "off"));
+
+ if (en) {
+ /* Ok - report back to driver */
+ set_clear_mask_long(s, DMAR_GSTS_REG, 0, VTD_GSTS_TES);
+ } else {
+ /* Ok - report back to driver */
+ set_clear_mask_long(s, DMAR_GSTS_REG, VTD_GSTS_TES, 0);
+ }
+}
+
+/* Handle write to Global Command Register */
+static void handle_gcmd_write(IntelIOMMUState *s)
+{
+ uint32_t status = get_long_raw(s, DMAR_GSTS_REG);
+ uint32_t val = get_long_raw(s, DMAR_GCMD_REG);
+ uint32_t changed = status ^ val;
+
+ VTD_DPRINTF("value 0x%x status 0x%x", val, status);
+ if (changed & VTD_GCMD_TE) {
+ /* Translation enable/disable */
+ handle_gcmd_te(s, val & VTD_GCMD_TE);
+ } else if (val & VTD_GCMD_SRTP) {
+ /* Set/update the root-table pointer */
+ handle_gcmd_srtp(s);
+ } else if (changed & VTD_GCMD_QIE) {
+ /* Queued Invalidation Enable */
+ handle_gcmd_qie(s, val & VTD_GCMD_QIE);
+ } else {
+ VTD_DPRINTF("error: unhandled gcmd write");
+ }
+}
+
+/* Handle write to Context Command Register */
+static void handle_ccmd_write(IntelIOMMUState *s)
+{
+ uint64_t ret;
+ uint64_t val = get_quad_raw(s, DMAR_CCMD_REG);
+
+ /* Context-cache invalidation request */
+ if (val & VTD_CCMD_ICC) {
+ ret = vtd_context_cache_invalidate(s, val);
+
+ /* Invalidation completed. Clear ICC and report the actual granularity */
+ set_clear_mask_quad(s, DMAR_CCMD_REG, VTD_CCMD_ICC, 0ULL);
+ ret = set_clear_mask_quad(s, DMAR_CCMD_REG, VTD_CCMD_CAIG_MASK, ret);
+ VTD_DPRINTF("CCMD_REG write-back val: 0x%"PRIx64, ret);
+ }
+}
+
+/* Handle write to IOTLB Invalidation Register */
+static void handle_iotlb_write(IntelIOMMUState *s)
+{
+ uint64_t ret;
+ uint64_t val = get_quad_raw(s, DMAR_IOTLB_REG);
+
+ /* IOTLB invalidation request */
+ if (val & VTD_TLB_IVT) {
+ ret = vtd_iotlb_flush(s, val);
+
+ /* Invalidation completed. Clear IVT and report the actual granularity */
+ set_clear_mask_quad(s, DMAR_IOTLB_REG, VTD_TLB_IVT, 0ULL);
+ ret = set_clear_mask_quad(s, DMAR_IOTLB_REG,
+ VTD_TLB_FLUSH_GRANU_MASK_A, ret);
+ VTD_DPRINTF("IOTLB_REG write-back val: 0x%"PRIx64, ret);
+ }
+}
+
+static uint64_t vtd_mem_read(void *opaque, hwaddr addr, unsigned size)
+{
+ IntelIOMMUState *s = opaque;
+ uint64_t val;
+
+ if (addr + size > DMAR_REG_SIZE) {
+ VTD_DPRINTF("error: addr outside region: max 0x%"PRIx64
+ ", got 0x%"PRIx64 " %d",
+ (uint64_t)DMAR_REG_SIZE, addr, size);
+ return (uint64_t)-1;
+ }
+
+ assert(size == 4 || size == 8);
+
+ switch (addr) {
+ /* Root Table Address Register, 64-bit */
+ case DMAR_RTADDR_REG:
+ if (size == 4) {
+ val = (uint32_t)s->root;
+ } else {
+ val = s->root;
+ }
+ break;
+
+ case DMAR_RTADDR_REG_HI:
+ assert(size == 4);
+ val = s->root >> 32;
+ break;
+
+ default:
+ if (size == 4) {
+ val = get_long(s, addr);
+ } else {
+ val = get_quad(s, addr);
+ }
+ }
+
+ VTD_DPRINTF("addr 0x%"PRIx64 " size %d val 0x%"PRIx64, addr, size, val);
+ return val;
+}
+
+static void vtd_mem_write(void *opaque, hwaddr addr,
+ uint64_t val, unsigned size)
+{
+ IntelIOMMUState *s = opaque;
+
+ if (addr + size > DMAR_REG_SIZE) {
+ VTD_DPRINTF("error: addr outside region: max 0x%"PRIx64
+ ", got 0x%"PRIx64 " %d",
+ (uint64_t)DMAR_REG_SIZE, addr, size);
+ return;
+ }
+
+ assert(size == 4 || size == 8);
+
+ /* Val should be written into csr within the handler */
+ switch (addr) {
+ /* Global Command Register, 32-bit */
+ case DMAR_GCMD_REG:
+ VTD_DPRINTF("DMAR_GCMD_REG write addr 0x%"PRIx64
+ ", size %d, val 0x%"PRIx64, addr, size, val);
+ set_long(s, addr, val);
+ handle_gcmd_write(s);
+ break;
+
+ /* Context Command Register, 64-bit */
+ case DMAR_CCMD_REG:
+ VTD_DPRINTF("DMAR_CCMD_REG write addr 0x%"PRIx64
+ ", size %d, val 0x%"PRIx64, addr, size, val);
+ if (size == 4) {
+ set_long(s, addr, val);
+ } else {
+ set_quad(s, addr, val);
+ handle_ccmd_write(s);
+ }
+ break;
+
+ case DMAR_CCMD_REG_HI:
+ VTD_DPRINTF("DMAR_CCMD_REG_HI write addr 0x%"PRIx64
+ ", size %d, val 0x%"PRIx64, addr, size, val);
+ assert(size == 4);
+ set_long(s, addr, val);
+ handle_ccmd_write(s);
+ break;
+
+
+ /* IOTLB Invalidation Register, 64-bit */
+ case DMAR_IOTLB_REG:
+ VTD_DPRINTF("DMAR_IOTLB_REG write addr 0x%"PRIx64
+ ", size %d, val 0x%"PRIx64, addr, size, val);
+ if (size == 4) {
+ set_long(s, addr, val);
+ } else {
+ set_quad(s, addr, val);
+ handle_iotlb_write(s);
+ }
+ break;
+
+ case DMAR_IOTLB_REG_HI:
+ VTD_DPRINTF("DMAR_IOTLB_REG_HI write addr 0x%"PRIx64
+ ", size %d, val 0x%"PRIx64, addr, size, val);
+ assert(size == 4);
+ set_long(s, addr, val);
+ handle_iotlb_write(s);
+ break;
+
+ /* Fault Status Register, 32-bit */
+ case DMAR_FSTS_REG:
+ /* Fault Event Data Register, 32-bit */
+ case DMAR_FEDATA_REG:
+ /* Fault Event Address Register, 32-bit */
+ case DMAR_FEADDR_REG:
+ /* Fault Event Upper Address Register, 32-bit */
+ case DMAR_FEUADDR_REG:
+ /* Fault Event Control Register, 32-bit */
+ case DMAR_FECTL_REG:
+ /* Protected Memory Enable Register, 32-bit */
+ case DMAR_PMEN_REG:
+ VTD_DPRINTF("known reg write addr 0x%"PRIx64
+ ", size %d, val 0x%"PRIx64, addr, size, val);
+ set_long(s, addr, val);
+ break;
+
+
+ /* Root Table Address Register, 64-bit */
+ case DMAR_RTADDR_REG:
+ VTD_DPRINTF("DMAR_RTADDR_REG write addr 0x%"PRIx64
+ ", size %d, val 0x%"PRIx64, addr, size, val);
+ if (size == 4) {
+ set_long(s, addr, val);
+ } else {
+ set_quad(s, addr, val);
+ }
+ break;
+
+ case DMAR_RTADDR_REG_HI:
+ VTD_DPRINTF("DMAR_RTADDR_REG_HI write addr 0x%"PRIx64
+ ", size %d, val 0x%"PRIx64, addr, size, val);
+ assert(size == 4);
+ set_long(s, addr, val);
+ break;
+
+ default:
+ VTD_DPRINTF("error: unhandled reg write addr 0x%"PRIx64
+ ", size %d, val 0x%"PRIx64, addr, size, val);
+ if (size == 4) {
+ set_long(s, addr, val);
+ } else {
+ set_quad(s, addr, val);
+ }
+ }
+
+}
+
+static IOMMUTLBEntry vtd_iommu_translate(MemoryRegion *iommu, hwaddr addr)
+{
+ VTDAddressSpace *vtd_as = container_of(iommu, VTDAddressSpace, iommu);
+ IntelIOMMUState *s = vtd_as->iommu_state;
+ int bus_num = vtd_as->bus_num;
+ int devfn = vtd_as->devfn;
+ IOMMUTLBEntry ret = {
+ .target_as = &address_space_memory,
+ .iova = 0,
+ .translated_addr = 0,
+ .addr_mask = ~(hwaddr)0,
+ .perm = IOMMU_NONE,
+ };
+
+ if (!(get_long_raw(s, DMAR_GSTS_REG) & VTD_GSTS_TES)) {
+ /* DMAR disabled, passthrough, use 4k page */
+ ret.iova = addr & VTD_PAGE_MASK_4K;
+ ret.translated_addr = addr & VTD_PAGE_MASK_4K;
+ ret.addr_mask = ~VTD_PAGE_MASK_4K;
+ ret.perm = IOMMU_RW;
+ return ret;
+ }
+
+ iommu_translate(s, bus_num, devfn, addr, &ret);
+
+ VTD_DPRINTF("bus %d slot %d func %d devfn %d gpa %"PRIx64 " hpa %"PRIx64,
+ bus_num, VTD_PCI_SLOT(devfn), VTD_PCI_FUNC(devfn), devfn, addr,
+ ret.translated_addr);
+ return ret;
+}
+
+static const VMStateDescription vtd_vmstate = {
+ .name = "iommu_intel",
+ .version_id = 1,
+ .minimum_version_id = 1,
+ .minimum_version_id_old = 1,
+ .fields = (VMStateField[]) {
+ VMSTATE_UINT8_ARRAY(csr, IntelIOMMUState, DMAR_REG_SIZE),
+ VMSTATE_END_OF_LIST()
+ }
+};
+
+static const MemoryRegionOps vtd_mem_ops = {
+ .read = vtd_mem_read,
+ .write = vtd_mem_write,
+ .endianness = DEVICE_LITTLE_ENDIAN,
+ .impl = {
+ .min_access_size = 4,
+ .max_access_size = 8,
+ },
+ .valid = {
+ .min_access_size = 4,
+ .max_access_size = 8,
+ },
+};
+
+static Property iommu_properties[] = {
+ DEFINE_PROP_UINT32("version", IntelIOMMUState, version, 0),
+ DEFINE_PROP_END_OF_LIST(),
+};
+
+/* Do the real initialization. It is also called on reset, so be careful
+ * when adding new initialization code.
+ */
+static void do_vtd_init(IntelIOMMUState *s)
+{
+ memset(s->csr, 0, DMAR_REG_SIZE);
+ memset(s->wmask, 0, DMAR_REG_SIZE);
+ memset(s->w1cmask, 0, DMAR_REG_SIZE);
+ memset(s->womask, 0, DMAR_REG_SIZE);
+
+ s->iommu_ops.translate = vtd_iommu_translate;
+ s->root = 0;
+ s->extended = false;
+
+ /* b.0:2 = 6: Number of domains supported: 64K using 16 bit ids
+ * b.3 = 0: No advanced fault logging
+ * b.4 = 0: No required write buffer flushing
+ * b.5 = 0: Protected low memory region not supported
+ * b.6 = 0: Protected high memory region not supported
+ * b.8:12 = 2: SAGAW(Supported Adjusted Guest Address Widths), 39-bit,
+ * 3-level page-table
+ * b.16:21 = 38: MGAW(Maximum Guest Address Width) = 39
+ * b.22 = 0: ZLR(Zero Length Read) zero length DMA read requests
+ * to write-only pages not supported
+ * b.24:33 = 34: FRO(Fault-recording Register offset)
+ * b.54 = 0: DWD(Write Draining), draining of write requests not supported
+ * b.55 = 0: DRD(Read Draining), draining of read requests not supported
+ */
+ const uint64_t dmar_cap_reg_value = VTD_CAP_FRO | VTD_CAP_NFR |
+ VTD_CAP_ND | VTD_CAP_MGAW |
+ VTD_CAP_SAGAW_39bit;
+
+ /* b.1 = 0: QI(Queued Invalidation support) not supported
+ * b.2 = 0: DT(Device-TLB support)
+ * b.3 = 0: IR(Interrupt Remapping support) not supported
+ * b.4 = 0: EIM(Extended Interrupt Mode) not supported
+ * b.8:17 = 15: IRO(IOTLB Register Offset)
+ * b.20:23 = 15: MHMV(Maximum Handle Mask Value)
+ */
+ const uint64_t dmar_ecap_reg_value = 0xf00000ULL | VTD_ECAP_IRO;
+
+ /* Define registers with default values and bit semantics */
+ define_long(s, DMAR_VER_REG, 0x10UL, 0, 0); /* set MAX = 1, RO */
+ define_quad(s, DMAR_CAP_REG, dmar_cap_reg_value, 0, 0);
+ define_quad(s, DMAR_ECAP_REG, dmar_ecap_reg_value, 0, 0);
+ define_long(s, DMAR_GCMD_REG, 0, 0xff800000UL, 0);
+ define_long_wo(s, DMAR_GCMD_REG, 0xff800000UL);
+ define_long(s, DMAR_GSTS_REG, 0, 0, 0); /* All bits RO, default 0 */
+ define_quad(s, DMAR_RTADDR_REG, 0, 0xfffffffffffff000ULL, 0);
+ define_quad(s, DMAR_CCMD_REG, 0, 0xe0000003ffffffffULL, 0);
+ define_quad_wo(s, DMAR_CCMD_REG, 0x3ffff0000ULL);
+ define_long(s, DMAR_FSTS_REG, 0, 0, 0xfdUL);
+ define_long(s, DMAR_FECTL_REG, 0x80000000UL, 0x80000000UL, 0);
+ define_long(s, DMAR_FEDATA_REG, 0, 0xffffffffUL, 0); /* All bits RW */
+ define_long(s, DMAR_FEADDR_REG, 0, 0xfffffffcUL, 0); /* 31:2 RW */
+ define_long(s, DMAR_FEUADDR_REG, 0, 0xffffffffUL, 0); /* 31:0 RW */
+
+ define_quad(s, DMAR_AFLOG_REG, 0, 0xffffffffffffff00ULL, 0);
+
+ /* Treated as RO for implementations that report the PLMR and PHMR fields
+ * as Clear in the CAP_REG.
+ * define_long(s, DMAR_PMEN_REG, 0, 0x80000000UL, 0);
+ */
+ define_long(s, DMAR_PMEN_REG, 0, 0, 0);
+
+ /* TBD: The definition of these are dynamic:
+ * DMAR_PLMBASE_REG, DMAR_PLMLIMIT_REG, DMAR_PHMBASE_REG, DMAR_PHMLIMIT_REG
+ */
+
+ /* Bits 18:4 (0x7fff0) is RO, rest is RsvdZ
+ * IQH_REG is treated as RsvdZ when not supported in ECAP_REG
+ * define_quad(s, DMAR_IQH_REG, 0, 0, 0);
+ */
+ define_quad(s, DMAR_IQH_REG, 0, 0, 0);
+
+ /* IQT_REG and IQA_REG is treated as RsvdZ when not supported in ECAP_REG
+ * define_quad(s, DMAR_IQT_REG, 0, 0x7fff0ULL, 0);
+ * define_quad(s, DMAR_IQA_REG, 0, 0xfffffffffffff007ULL, 0);
+ */
+ define_quad(s, DMAR_IQT_REG, 0, 0, 0);
+ define_quad(s, DMAR_IQA_REG, 0, 0, 0);
+
+ /* Bit 0 is RW1CS - rest is RsvdZ */
+ define_long(s, DMAR_ICS_REG, 0, 0, 0x1UL);
+
+ /* b.31 is RW, b.30 RO, rest: RsvdZ */
+ define_long(s, DMAR_IECTL_REG, 0x80000000UL, 0x80000000UL, 0);
+
+ define_long(s, DMAR_IEDATA_REG, 0, 0xffffffffUL, 0);
+ define_long(s, DMAR_IEADDR_REG, 0, 0xfffffffcUL, 0);
+ define_long(s, DMAR_IEUADDR_REG, 0, 0xffffffffUL, 0);
+ define_quad(s, DMAR_IRTA_REG, 0, 0xfffffffffffff80fULL, 0);
+ define_quad(s, DMAR_PQH_REG, 0, 0x7fff0ULL, 0);
+ define_quad(s, DMAR_PQT_REG, 0, 0x7fff0ULL, 0);
+ define_quad(s, DMAR_PQA_REG, 0, 0xfffffffffffff007ULL, 0);
+ define_long(s, DMAR_PRS_REG, 0, 0, 0x1UL);
+ define_long(s, DMAR_PECTL_REG, 0x80000000UL, 0x80000000UL, 0);
+ define_long(s, DMAR_PEDATA_REG, 0, 0xffffffffUL, 0);
+ define_long(s, DMAR_PEADDR_REG, 0, 0xfffffffcUL, 0);
+ define_long(s, DMAR_PEUADDR_REG, 0, 0xffffffffUL, 0);
+
+ /* When MTS not supported in ECAP_REG, these regs are RsvdZ */
+ define_long(s, DMAR_MTRRCAP_REG, 0, 0, 0);
+ define_long(s, DMAR_MTRRDEF_REG, 0, 0, 0);
+
+ /* IOTLB registers */
+ define_quad(s, DMAR_IOTLB_REG, 0, 0Xb003ffff00000000ULL, 0);
+ define_quad(s, DMAR_IVA_REG, 0, 0xfffffffffffff07fULL, 0);
+ define_quad_wo(s, DMAR_IVA_REG, 0xfffffffffffff07fULL);
+}
+
+/* Reset function of QOM
+ * Should not reset address_spaces when reset
+ */
+static void vtd_reset(DeviceState *dev)
+{
+ IntelIOMMUState *s = INTEL_IOMMU_DEVICE(dev);
+
+ VTD_DPRINTF("");
+ do_vtd_init(s);
+}
+
+/* Initialization function of QOM */
+static void vtd_realize(DeviceState *dev, Error **errp)
+{
+ IntelIOMMUState *s = INTEL_IOMMU_DEVICE(dev);
+
+ VTD_DPRINTF("");
+ memset(s->address_spaces, 0, sizeof(s->address_spaces));
+ memory_region_init_io(&s->csrmem, OBJECT(s), &vtd_mem_ops, s,
+ "intel_iommu", DMAR_REG_SIZE);
+
+ do_vtd_init(s);
+}
+
+static void vtd_class_init(ObjectClass *klass, void *data)
+{
+ DeviceClass *dc = DEVICE_CLASS(klass);
+
+ dc->reset = vtd_reset;
+ dc->realize = vtd_realize;
+ dc->vmsd = &vtd_vmstate;
+ dc->props = iommu_properties;
+}
+
+static const TypeInfo vtd_info = {
+ .name = TYPE_INTEL_IOMMU_DEVICE,
+ .parent = TYPE_SYS_BUS_DEVICE,
+ .instance_size = sizeof(IntelIOMMUState),
+ .class_init = vtd_class_init,
+};
+
+static void vtd_register_types(void)
+{
+ VTD_DPRINTF("");
+ type_register_static(&vtd_info);
+}
+
+type_init(vtd_register_types)
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
new file mode 100644
index 0000000..075e1c5
--- /dev/null
+++ b/hw/i386/intel_iommu_internal.h
@@ -0,0 +1,257 @@
+/*
+ * QEMU emulation of an Intel IOMMU (VT-d)
+ * (DMA Remapping device)
+ *
+ * Copyright (C) 2013 Knut Omang, Oracle <knut.omang@oracle.com>
+ * Copyright (C) 2014 Le Tan, <tamlokveer@gmail.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ *
+ * Lots of defines copied from kernel/include/linux/intel-iommu.h:
+ * Copyright (C) 2006-2008 Intel Corporation
+ * Author: Ashok Raj <ashok.raj@intel.com>
+ * Author: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
+ *
+ */
+
+#ifndef HW_I386_INTEL_IOMMU_INTERNAL_H
+#define HW_I386_INTEL_IOMMU_INTERNAL_H
+#include "hw/i386/intel_iommu.h"
+
+/*
+ * Intel IOMMU register specification per version 1.0 public spec.
+ */
+
+#define DMAR_VER_REG 0x0 /* Arch version supported by this IOMMU */
+#define DMAR_CAP_REG 0x8 /* Hardware supported capabilities */
+#define DMAR_CAP_REG_HI 0xc /* High 32-bit of DMAR_CAP_REG */
+#define DMAR_ECAP_REG 0x10 /* Extended capabilities supported */
+#define DMAR_ECAP_REG_HI 0X14
+#define DMAR_GCMD_REG 0x18 /* Global command register */
+#define DMAR_GSTS_REG 0x1c /* Global status register */
+#define DMAR_RTADDR_REG 0x20 /* Root entry table */
+#define DMAR_RTADDR_REG_HI 0X24
+#define DMAR_CCMD_REG 0x28 /* Context command reg */
+#define DMAR_CCMD_REG_HI 0x2c
+#define DMAR_FSTS_REG 0x34 /* Fault Status register */
+#define DMAR_FECTL_REG 0x38 /* Fault control register */
+#define DMAR_FEDATA_REG 0x3c /* Fault event interrupt data register */
+#define DMAR_FEADDR_REG 0x40 /* Fault event interrupt addr register */
+#define DMAR_FEUADDR_REG 0x44 /* Upper address register */
+#define DMAR_AFLOG_REG 0x58 /* Advanced Fault control */
+#define DMAR_AFLOG_REG_HI 0X5c
+#define DMAR_PMEN_REG 0x64 /* Enable Protected Memory Region */
+#define DMAR_PLMBASE_REG 0x68 /* PMRR Low addr */
+#define DMAR_PLMLIMIT_REG 0x6c /* PMRR low limit */
+#define DMAR_PHMBASE_REG 0x70 /* pmrr high base addr */
+#define DMAR_PHMBASE_REG_HI 0X74
+#define DMAR_PHMLIMIT_REG 0x78 /* pmrr high limit */
+#define DMAR_PHMLIMIT_REG_HI 0x7c
+#define DMAR_IQH_REG 0x80 /* Invalidation queue head register */
+#define DMAR_IQH_REG_HI 0X84
+#define DMAR_IQT_REG 0x88 /* Invalidation queue tail register */
+#define DMAR_IQT_REG_HI 0X8c
+#define DMAR_IQ_SHIFT 4 /* Invalidation queue head/tail shift */
+#define DMAR_IQA_REG 0x90 /* Invalidation queue addr register */
+#define DMAR_IQA_REG_HI 0x94
+#define DMAR_ICS_REG 0x9c /* Invalidation complete status register */
+#define DMAR_IRTA_REG 0xb8 /* Interrupt remapping table addr register */
+#define DMAR_IRTA_REG_HI 0xbc
+
+/* From Vt-d 2.2 spec */
+#define DMAR_IECTL_REG 0xa0 /* Invalidation event control register */
+#define DMAR_IEDATA_REG 0xa4 /* Invalidation event data register */
+#define DMAR_IEADDR_REG 0xa8 /* Invalidation event address register */
+#define DMAR_IEUADDR_REG 0xac /* Invalidation event address register */
+#define DMAR_PQH_REG 0xc0 /* Page request queue head register */
+#define DMAR_PQH_REG_HI 0xc4
+#define DMAR_PQT_REG 0xc8 /* Page request queue tail register*/
+#define DMAR_PQT_REG_HI 0xcc
+#define DMAR_PQA_REG 0xd0 /* Page request queue address register */
+#define DMAR_PQA_REG_HI 0xd4
+#define DMAR_PRS_REG 0xdc /* Page request status register */
+#define DMAR_PECTL_REG 0xe0 /* Page request event control register */
+#define DMAR_PEDATA_REG 0xe4 /* Page request event data register */
+#define DMAR_PEADDR_REG 0xe8 /* Page request event address register */
+#define DMAR_PEUADDR_REG 0xec /* Page event upper address register */
+#define DMAR_MTRRCAP_REG 0x100 /* MTRR capability register */
+#define DMAR_MTRRCAP_REG_HI 0x104
+#define DMAR_MTRRDEF_REG 0x108 /* MTRR default type register */
+#define DMAR_MTRRDEF_REG_HI 0x10c
+
+/* IOTLB */
+#define DMAR_IOTLB_REG_OFFSET 0xf0 /* Offset to the IOTLB registers */
+#define DMAR_IVA_REG DMAR_IOTLB_REG_OFFSET /* Invalidate Address Register */
+#define DMAR_IVA_REG_HI (DMAR_IVA_REG + 4)
+/* IOTLB Invalidate Register */
+#define DMAR_IOTLB_REG (DMAR_IOTLB_REG_OFFSET + 0x8)
+#define DMAR_IOTLB_REG_HI (DMAR_IOTLB_REG + 4)
+
+/* FRCD */
+#define DMAR_FRCD_REG_OFFSET 0x220 /* Offset to the Fault Recording Registers */
+#define DMAR_FRCD_REG_NR 1ULL /* Num of Fault Recording Registers */
+
+/* If you change the DMAR_FRCD_REG_NR, please remember to change the
+ * DMAR_REG_SIZE in include/hw/i386/intel_iommu.h.
+ * #define DMAR_REG_SIZE (DMAR_FRCD_REG_OFFSET + 128 * DMAR_FRCD_REG_NR)
+ */
+
+/* IOTLB_REG */
+#define VTD_TLB_GLOBAL_FLUSH (1ULL << 60) /* Global invalidation */
+#define VTD_TLB_DSI_FLUSH (2ULL << 60) /* Domain-selective invalidation */
+#define VTD_TLB_PSI_FLUSH (3ULL << 60) /* Page-selective invalidation */
+#define VTD_TLB_FLUSH_GRANU_MASK (3ULL << 60)
+#define VTD_TLB_GLOBAL_FLUSH_A (1ULL << 57)
+#define VTD_TLB_DSI_FLUSH_A (2ULL << 57)
+#define VTD_TLB_PSI_FLUSH_A (3ULL << 57)
+#define VTD_TLB_FLUSH_GRANU_MASK_A (3ULL << 57)
+#define VTD_TLB_READ_DRAIN (1ULL << 49)
+#define VTD_TLB_WRITE_DRAIN (1ULL << 48)
+#define VTD_TLB_DID(id) (((uint64_t)((id) & 0xffffULL)) << 32)
+#define VTD_TLB_IVT (1ULL << 63)
+#define VTD_TLB_IH_NONLEAF (1ULL << 6)
+#define VTD_TLB_MAX_SIZE (0x3f)
+
+/* GCMD_REG */
+#define VTD_GCMD_TE (1UL << 31)
+#define VTD_GCMD_SRTP (1UL << 30)
+#define VTD_GCMD_SFL (1UL << 29)
+#define VTD_GCMD_EAFL (1UL << 28)
+#define VTD_GCMD_WBF (1UL << 27)
+#define VTD_GCMD_QIE (1UL << 26)
+#define VTD_GCMD_IRE (1UL << 25)
+#define VTD_GCMD_SIRTP (1UL << 24)
+#define VTD_GCMD_CFI (1UL << 23)
+
+/* GSTS_REG */
+#define VTD_GSTS_TES (1UL << 31)
+#define VTD_GSTS_RTPS (1UL << 30)
+#define VTD_GSTS_FLS (1UL << 29)
+#define VTD_GSTS_AFLS (1UL << 28)
+#define VTD_GSTS_WBFS (1UL << 27)
+#define VTD_GSTS_QIES (1UL << 26)
+#define VTD_GSTS_IRES (1UL << 25)
+#define VTD_GSTS_IRTPS (1UL << 24)
+#define VTD_GSTS_CFIS (1UL << 23)
+
+/* CCMD_REG */
+#define VTD_CCMD_ICC (1ULL << 63)
+#define VTD_CCMD_GLOBAL_INVL (1ULL << 61)
+#define VTD_CCMD_DOMAIN_INVL (2ULL << 61)
+#define VTD_CCMD_DEVICE_INVL (3ULL << 61)
+#define VTD_CCMD_CIRG_MASK (3ULL << 61)
+#define VTD_CCMD_GLOBAL_INVL_A (1ULL << 59)
+#define VTD_CCMD_DOMAIN_INVL_A (2ULL << 59)
+#define VTD_CCMD_DEVICE_INVL_A (3ULL << 59)
+#define VTD_CCMD_CAIG_MASK (3ULL << 59)
+#define VTD_CCMD_FM(m) (((uint64_t)((m) & 3ULL)) << 32)
+#define VTD_CCMD_MASK_NOBIT 0
+#define VTD_CCMD_MASK_1BIT 1
+#define VTD_CCMD_MASK_2BIT 2
+#define VTD_CCMD_MASK_3BIT 3
+#define VTD_CCMD_SID(s) (((uint64_t)((s) & 0xffffULL)) << 16)
+#define VTD_CCMD_DID(d) ((uint64_t)((d) & 0xffffULL))
+
+/* RTADDR_REG */
+#define VTD_RTADDR_RTT (1ULL << 11)
+
+/* ECAP_REG */
+#define VTD_ECAP_IRO (DMAR_IOTLB_REG_OFFSET << 4) /* (val >> 4) << 8 */
+
+/* CAP_REG */
+/* (val >> 4) << 24 */
+#define VTD_CAP_FRO (DMAR_FRCD_REG_OFFSET << 20)
+
+#define VTD_CAP_NFR ((DMAR_FRCD_REG_NR - 1) << 40)
+#define VTD_DOMAIN_ID_SHIFT 16 /* 16-bit domain id for 64K domains */
+#define VTD_CAP_ND (((VTD_DOMAIN_ID_SHIFT - 4) / 2) & 7ULL)
+#define VTD_MGAW 39 /* Maximum Guest Address Width */
+#define VTD_CAP_MGAW (((VTD_MGAW - 1) & 0x3fULL) << 16)
+/* Supported Adjusted Guest Address Widths */
+#define VTD_CAP_SAGAW_MASK (0x1fULL << 8)
+#define VTD_CAP_SAGAW_39bit (0x2ULL << 8) /* 39-bit AGAW, 3-level page-table */
+#define VTD_CAP_SAGAW_48bit (0x4ULL << 8) /* 48-bit AGAW, 4-level page-table */
+
+
+/* Pagesize of VTD paging structures, including root and context tables */
+#define VTD_PAGE_SHIFT (12)
+#define VTD_PAGE_SIZE (1ULL << VTD_PAGE_SHIFT)
+
+#define VTD_PAGE_SHIFT_4K (12)
+#define VTD_PAGE_MASK_4K (~((1ULL << VTD_PAGE_SHIFT_4K) - 1))
+#define VTD_PAGE_SHIFT_2M (21)
+#define VTD_PAGE_MASK_2M (~((1ULL << VTD_PAGE_SHIFT_2M) - 1))
+#define VTD_PAGE_SHIFT_1G (30)
+#define VTD_PAGE_MASK_1G (~((1ULL << VTD_PAGE_SHIFT_1G) - 1))
+
+/* Root-Entry
+ * 0: Present
+ * 1-11: Reserved
+ * 12-63: Context-table Pointer
+ * 64-127: Reserved
+ */
+struct VTDRootEntry {
+ uint64_t val;
+ uint64_t rsvd;
+};
+typedef struct VTDRootEntry VTDRootEntry;
+
+/* Masks for struct VTDRootEntry */
+#define VTD_ROOT_ENTRY_P (1ULL << 0)
+#define VTD_ROOT_ENTRY_CTP (~0xfffULL)
+
+#define VTD_ROOT_ENTRY_NR (VTD_PAGE_SIZE / sizeof(VTDRootEntry))
+
+
+/* Context-Entry */
+struct VTDContextEntry {
+ uint64_t lo;
+ uint64_t hi;
+};
+typedef struct VTDContextEntry VTDContextEntry;
+
+/* Masks for struct VTDContextEntry */
+/* lo */
+#define VTD_CONTEXT_ENTRY_P (1ULL << 0)
+#define VTD_CONTEXT_ENTRY_FPD (1ULL << 1) /* Fault Processing Disable */
+#define VTD_CONTEXT_ENTRY_TT (3ULL << 2) /* Translation Type */
+#define VTD_CONTEXT_TT_MULTI_LEVEL (0)
+#define VTD_CONTEXT_TT_DEV_IOTLB (1)
+#define VTD_CONTEXT_TT_PASS_THROUGH (2)
+/* Second Level Page Translation Pointer */
+#define VTD_CONTEXT_ENTRY_SLPTPTR (~0xfffULL)
+
+/* hi */
+#define VTD_CONTEXT_ENTRY_AW (7ULL) /* Adjusted guest-address-width */
+#define VTD_CONTEXT_ENTRY_DID (0xffffULL << 8) /* Domain Identifier */
+
+
+#define VTD_CONTEXT_ENTRY_NR (VTD_PAGE_SIZE / sizeof(VTDContextEntry))
+
+
+/* Paging Structure common */
+#define VTD_SL_PT_PAGE_SIZE_MASK (1ULL << 7)
+#define VTD_SL_LEVEL_BITS 9 /* Bits to decide the offset for each level */
+
+/* Second Level Paging Structure */
+#define VTD_SL_PML4_LEVEL 4
+#define VTD_SL_PDP_LEVEL 3
+#define VTD_SL_PD_LEVEL 2
+#define VTD_SL_PT_LEVEL 1
+
+#define VTD_SL_PT_ENTRY_NR 512
+#define VTD_SL_PT_BASE_ADDR_MASK (~(VTD_PAGE_SIZE - 1))
+
+
+#endif
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
new file mode 100644
index 0000000..b19c216
--- /dev/null
+++ b/include/hw/i386/intel_iommu.h
@@ -0,0 +1,75 @@
+/*
+ * QEMU emulation of an Intel IOMMU (VT-d)
+ * (DMA Remapping device)
+ *
+ * Copyright (C) 2013 Knut Omang, Oracle <knut.omang@oracle.com>
+ * Copyright (C) 2014 Le Tan, <tamlokveer@gmail.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef INTEL_IOMMU_H
+#define INTEL_IOMMU_H
+#include "hw/qdev.h"
+#include "sysemu/dma.h"
+
+#define TYPE_INTEL_IOMMU_DEVICE "intel-iommu"
+#define INTEL_IOMMU_DEVICE(obj) \
+ OBJECT_CHECK(IntelIOMMUState, (obj), TYPE_INTEL_IOMMU_DEVICE)
+
+/* DMAR Hardware Unit Definition address (IOMMU unit) */
+#define Q35_HOST_BRIDGE_IOMMU_ADDR 0xfed90000ULL
+
+#define VTD_PCI_BUS_MAX 256
+#define VTD_PCI_SLOT_MAX 32
+#define VTD_PCI_FUNC_MAX 8
+#define VTD_PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)
+#define VTD_PCI_FUNC(devfn) ((devfn) & 0x07)
+
+#define DMAR_REG_SIZE 0x2a0
+
+typedef struct IntelIOMMUState IntelIOMMUState;
+typedef struct VTDAddressSpace VTDAddressSpace;
+
+struct VTDAddressSpace {
+ int bus_num;
+ int devfn;
+ AddressSpace as;
+ MemoryRegion iommu;
+ IntelIOMMUState *iommu_state;
+};
+
+/* The iommu (DMAR) device state struct */
+struct IntelIOMMUState {
+ SysBusDevice busdev;
+ MemoryRegion csrmem;
+ uint8_t csr[DMAR_REG_SIZE]; /* register values */
+ uint8_t wmask[DMAR_REG_SIZE]; /* R/W bytes */
+ uint8_t w1cmask[DMAR_REG_SIZE]; /* RW1C(Write 1 to Clear) bytes */
+ uint8_t womask[DMAR_REG_SIZE]; /* WO (write only - read returns 0) */
+ uint32_t version;
+
+ dma_addr_t root; /* Current root table pointer */
+ bool extended; /* Type of root table (extended or not) */
+ uint16_t iq_head; /* Current invalidation queue head */
+ uint16_t iq_tail; /* Current invalidation queue tail */
+ dma_addr_t iq; /* Current invalidation queue (IQ) pointer */
+ size_t iq_sz; /* IQ Size in number of entries */
+ bool iq_enable; /* Set if the IQ is enabled */
+
+ MemoryRegionIOMMUOps iommu_ops;
+ VTDAddressSpace **address_spaces[VTD_PCI_BUS_MAX];
+};
+
+#endif
--
1.9.1
* [Qemu-devel] [PATCH v2 2/3] intel-iommu: add DMAR table to ACPI tables
From: Le Tan @ 2014-07-27 8:52 UTC (permalink / raw)
To: qemu-devel
Cc: Michael S. Tsirkin, Knut Omang, Le Tan, Alex Williamson,
Jan Kiszka, Anthony Liguori, Paolo Bonzini
Expose the Intel IOMMU to the BIOS. If an object of TYPE_INTEL_IOMMU_DEVICE
exists, add a DMAR table to the ACPI RSDT. For now the DMAR table indicates
that there is only one hardware unit, without INTR_REMAP capability, on the
platform.
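As a quick sanity check (not part of this patch), the table can be inspected
from a Linux guest; assuming the usual ACPI sysfs layout, something like:
    dmesg | grep -i -e DMAR -e IOMMU
    hexdump -C /sys/firmware/acpi/tables/DMAR | head
should show the DMAR table with the single DRHD entry described above.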
Signed-off-by: Le Tan <tamlokveer@gmail.com>
---
hw/i386/acpi-build.c | 41 ++++++++++++++++++++++++++++++
hw/i386/acpi-defs.h | 70 ++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 111 insertions(+)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index ebc5f03..8241621 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -45,6 +45,7 @@
#include "hw/i386/ich9.h"
#include "hw/pci/pci_bus.h"
#include "hw/pci-host/q35.h"
+#include "hw/i386/intel_iommu.h"
#include "hw/i386/q35-acpi-dsdt.hex"
#include "hw/i386/acpi-dsdt.hex"
@@ -1316,6 +1317,31 @@ build_mcfg_q35(GArray *table_data, GArray *linker, AcpiMcfgInfo *info)
}
static void
+build_dmar_q35(GArray *table_data, GArray *linker)
+{
+ int dmar_start = table_data->len;
+
+ AcpiTableDmar *dmar;
+ AcpiDmarHardwareUnit *drhd;
+
+ dmar = acpi_data_push(table_data, sizeof(*dmar));
+ dmar->host_address_width = 0x26; /* 0x26 + 1 = 39 */
+ dmar->flags = 0; /* No intr_remap for now */
+
+ /* DMAR Remapping Hardware Unit Definition structure */
+ drhd = acpi_data_push(table_data, sizeof(*drhd));
+ drhd->type = cpu_to_le16(ACPI_DMAR_TYPE_HARDWARE_UNIT);
+ drhd->length = cpu_to_le16(sizeof(*drhd)); /* No device scope now */
+ drhd->flags = ACPI_DMAR_INCLUDE_PCI_ALL;
+ drhd->pci_segment = cpu_to_le16(0);
+ drhd->address = cpu_to_le64(Q35_HOST_BRIDGE_IOMMU_ADDR);
+
+ build_header(linker, table_data, (void *)(table_data->data + dmar_start),
+ "DMAR", table_data->len - dmar_start, 1);
+}
+
+
+static void
build_dsdt(GArray *table_data, GArray *linker, AcpiMiscInfo *misc)
{
AcpiTableHeader *dsdt;
@@ -1436,6 +1462,17 @@ static bool acpi_get_mcfg(AcpiMcfgInfo *mcfg)
return true;
}
+static bool acpi_has_iommu(void)
+{
+ bool ambiguous;
+ Object *intel_iommu;
+
+ intel_iommu = object_resolve_path_type("", TYPE_INTEL_IOMMU_DEVICE,
+ &ambiguous);
+ return intel_iommu && !ambiguous;
+}
+
+
static
void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables)
{
@@ -1497,6 +1534,10 @@ void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables)
acpi_add_table(table_offsets, tables->table_data);
build_mcfg_q35(tables->table_data, tables->linker, &mcfg);
}
+ if (acpi_has_iommu()) {
+ acpi_add_table(table_offsets, tables->table_data);
+ build_dmar_q35(tables->table_data, tables->linker);
+ }
/* Add tables supplied by user (if any) */
for (u = acpi_table_first(); u; u = acpi_table_next(u)) {
diff --git a/hw/i386/acpi-defs.h b/hw/i386/acpi-defs.h
index e93babb..9674825 100644
--- a/hw/i386/acpi-defs.h
+++ b/hw/i386/acpi-defs.h
@@ -314,4 +314,74 @@ struct AcpiTableMcfg {
} QEMU_PACKED;
typedef struct AcpiTableMcfg AcpiTableMcfg;
+/* DMAR - DMA Remapping table r2.2 */
+struct AcpiTableDmar {
+ ACPI_TABLE_HEADER_DEF
+ uint8_t host_address_width; /* Maximum DMA physical addressability */
+ uint8_t flags;
+ uint8_t reserved[10];
+} QEMU_PACKED;
+typedef struct AcpiTableDmar AcpiTableDmar;
+
+/* Masks for Flags field above */
+#define ACPI_DMAR_INTR_REMAP (1)
+#define ACPI_DMAR_X2APIC_OPT_OUT (2)
+
+/*
+ * DMAR sub-structures (Follow DMA Remapping table)
+ */
+#define ACPI_DMAR_SUB_HEADER_DEF /* Common ACPI DMAR sub-structure header */\
+ uint16_t type; \
+ uint16_t length;
+
+/* Values for sub-structure type for DMAR */
+enum {
+ ACPI_DMAR_TYPE_HARDWARE_UNIT = 0, /* DRHD */
+ ACPI_DMAR_TYPE_RESERVED_MEMORY = 1, /* RMRR */
+ ACPI_DMAR_TYPE_ATSR = 2, /* ATSR */
+ ACPI_DMAR_TYPE_HARDWARE_AFFINITY = 3, /* RHSR */
+ ACPI_DMAR_TYPE_ANDD = 4, /* ANDD */
+ ACPI_DMAR_TYPE_RESERVED = 5 /* Reserved for future use */
+};
+
+/*
+ * Sub-structures for DMAR, correspond to Type in ACPI_DMAR_SUB_HEADER_DEF
+ */
+
+/* DMAR Device Scope structures */
+struct AcpiDmarDeviceScope {
+ uint8_t type;
+ uint8_t length;
+ uint16_t reserved;
+ uint8_t enumeration_id;
+ uint8_t start_bus_number;
+ uint8_t path[0];
+} QEMU_PACKED;
+typedef struct AcpiDmarDeviceScope AcpiDmarDeviceScope;
+
+/* Values for type in struct AcpiDmarDeviceScope */
+enum {
+ ACPI_DMAR_SCOPE_TYPE_NOT_USED = 0,
+ ACPI_DMAR_SCOPE_TYPE_ENDPOINT = 1,
+ ACPI_DMAR_SCOPE_TYPE_BRIDGE = 2,
+ ACPI_DMAR_SCOPE_TYPE_IOAPIC = 3,
+ ACPI_DMAR_SCOPE_TYPE_HPET = 4,
+ ACPI_DMAR_SCOPE_TYPE_ACPI = 5,
+ ACPI_DMAR_SCOPE_TYPE_RESERVED = 6 /* Reserved for future use */
+};
+
+/* 0: Hardware Unit Definition */
+struct AcpiDmarHardwareUnit {
+ ACPI_DMAR_SUB_HEADER_DEF
+ uint8_t flags;
+ uint8_t reserved;
+ uint16_t pci_segment; /* The PCI Segment associated with this unit */
+ uint64_t address; /* Base address of remapping hardware register-set */
+} QEMU_PACKED;
+typedef struct AcpiDmarHardwareUnit AcpiDmarHardwareUnit;
+
+/* Masks for Flags field above */
+#define ACPI_DMAR_INCLUDE_PCI_ALL (1)
+
+
#endif
--
1.9.1
* [Qemu-devel] [PATCH v2 3/3] intel-iommu: add Intel IOMMU emulation to q35 and add a machine option "iommu" as a switch
From: Le Tan @ 2014-07-27 8:52 UTC (permalink / raw)
To: qemu-devel
Cc: Michael S. Tsirkin, Knut Omang, Le Tan, Alex Williamson,
Jan Kiszka, Anthony Liguori, Paolo Bonzini
Add Intel IOMMU emulation to the q35 chipset and expose it to the guest.
1. Add a machine option. Users can use "-machine iommu=on|off" on the command
line to enable/disable Intel IOMMU. The default is off.
2. According to the machine option, q35 will initialize the Intel IOMMU and
use pci_setup_iommu() to set up q35_host_dma_iommu() as the IOMMU function for
the PCI bus.
3. q35_host_dma_iommu() returns a different address space according to the
bus_num and devfn of the device.
Signed-off-by: Le Tan <tamlokveer@gmail.com>
---
hw/core/machine.c | 27 ++++++++++++++++--
hw/pci-host/q35.c | 72 +++++++++++++++++++++++++++++++++++++++++++----
include/hw/boards.h | 1 +
include/hw/pci-host/q35.h | 2 ++
qemu-options.hx | 5 +++-
vl.c | 4 +++
6 files changed, 102 insertions(+), 9 deletions(-)
diff --git a/hw/core/machine.c b/hw/core/machine.c
index cbba679..9b166e5 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -235,6 +235,20 @@ static void machine_set_firmware(Object *obj, const char *value, Error **errp)
ms->firmware = g_strdup(value);
}
+static bool machine_get_iommu(Object *obj, Error **errp)
+{
+ MachineState *ms = MACHINE(obj);
+
+ return ms->iommu;
+}
+
+static void machine_set_iommu(Object *obj, bool value, Error **errp)
+{
+ MachineState *ms = MACHINE(obj);
+
+ ms->iommu = value;
+}
+
static void machine_initfn(Object *obj)
{
object_property_add_str(obj, "accel",
@@ -270,10 +284,17 @@ static void machine_initfn(Object *obj)
machine_set_dump_guest_core,
NULL);
object_property_add_bool(obj, "mem-merge",
- machine_get_mem_merge, machine_set_mem_merge, NULL);
- object_property_add_bool(obj, "usb", machine_get_usb, machine_set_usb, NULL);
+ machine_get_mem_merge,
+ machine_set_mem_merge, NULL);
+ object_property_add_bool(obj, "usb",
+ machine_get_usb,
+ machine_set_usb, NULL);
object_property_add_str(obj, "firmware",
- machine_get_firmware, machine_set_firmware, NULL);
+ machine_get_firmware,
+ machine_set_firmware, NULL);
+ object_property_add_bool(obj, "iommu",
+ machine_get_iommu,
+ machine_set_iommu, NULL);
}
static void machine_finalize(Object *obj)
diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
index a0a3068..4abd0ee 100644
--- a/hw/pci-host/q35.c
+++ b/hw/pci-host/q35.c
@@ -347,6 +347,61 @@ static void mch_reset(DeviceState *qdev)
mch_update(mch);
}
+static AddressSpace *q35_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
+{
+ IntelIOMMUState *s = opaque;
+ VTDAddressSpace **pvtd_as;
+ VTDAddressSpace *vtd_as;
+ int bus_num = pci_bus_num(bus);
+
+ assert(devfn >= 0);
+
+ pvtd_as = s->address_spaces[bus_num];
+ if (!pvtd_as) {
+ /* No corresponding free() */
+ pvtd_as = g_malloc0(sizeof(VTDAddressSpace *) *
+ VTD_PCI_SLOT_MAX * VTD_PCI_FUNC_MAX);
+ s->address_spaces[bus_num] = pvtd_as;
+ }
+
+ vtd_as = *(pvtd_as + devfn);
+ if (!vtd_as) {
+ vtd_as = g_malloc0(sizeof(*vtd_as));
+ *(pvtd_as + devfn) = vtd_as;
+
+ vtd_as->bus_num = bus_num;
+ vtd_as->devfn = devfn;
+ vtd_as->iommu_state = s;
+ memory_region_init_iommu(&vtd_as->iommu, OBJECT(s), &s->iommu_ops,
+ "intel_iommu", UINT64_MAX);
+ address_space_init(&vtd_as->as, &vtd_as->iommu, "intel_iommu");
+ }
+
+ return &vtd_as->as;
+}
+
+static void mch_init_dmar(MCHPCIState *mch)
+{
+ Error *error = NULL;
+ PCIBus *pci_bus = PCI_BUS(qdev_get_parent_bus(DEVICE(mch)));
+
+ mch->iommu = INTEL_IOMMU_DEVICE(object_new(TYPE_INTEL_IOMMU_DEVICE));
+ qdev_set_parent_bus(DEVICE(mch->iommu), sysbus_get_default());
+ object_property_set_bool(OBJECT(mch->iommu), true, "realized", &error);
+
+ if (error) {
+ fprintf(stderr, "%s\n", error_get_pretty(error));
+ error_free(error);
+ return;
+ }
+
+ memory_region_add_subregion(mch->pci_address_space,
+ Q35_HOST_BRIDGE_IOMMU_ADDR,
+ &mch->iommu->csrmem);
+ pci_setup_iommu(pci_bus, q35_host_dma_iommu, mch->iommu);
+}
+
+
static int mch_init(PCIDevice *d)
{
int i;
@@ -363,13 +418,20 @@ static int mch_init(PCIDevice *d)
memory_region_add_subregion_overlap(mch->system_memory, 0xa0000,
&mch->smram_region, 1);
memory_region_set_enabled(&mch->smram_region, false);
- init_pam(DEVICE(mch), mch->ram_memory, mch->system_memory, mch->pci_address_space,
- &mch->pam_regions[0], PAM_BIOS_BASE, PAM_BIOS_SIZE);
+ init_pam(DEVICE(mch), mch->ram_memory, mch->system_memory,
+ mch->pci_address_space, &mch->pam_regions[0], PAM_BIOS_BASE,
+ PAM_BIOS_SIZE);
for (i = 0; i < 12; ++i) {
- init_pam(DEVICE(mch), mch->ram_memory, mch->system_memory, mch->pci_address_space,
- &mch->pam_regions[i+1], PAM_EXPAN_BASE + i * PAM_EXPAN_SIZE,
- PAM_EXPAN_SIZE);
+ init_pam(DEVICE(mch), mch->ram_memory, mch->system_memory,
+ mch->pci_address_space, &mch->pam_regions[i+1],
+ PAM_EXPAN_BASE + i * PAM_EXPAN_SIZE, PAM_EXPAN_SIZE);
+ }
+
+ /* Intel IOMMU (VT-d) */
+ if (qemu_opt_get_bool(qemu_get_machine_opts(), "iommu", false)) {
+ mch_init_dmar(mch);
}
+
return 0;
}
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 605a970..dfb6718 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -123,6 +123,7 @@ struct MachineState {
bool mem_merge;
bool usb;
char *firmware;
+ bool iommu;
ram_addr_t ram_size;
ram_addr_t maxram_size;
diff --git a/include/hw/pci-host/q35.h b/include/hw/pci-host/q35.h
index d9ee978..025d6e6 100644
--- a/include/hw/pci-host/q35.h
+++ b/include/hw/pci-host/q35.h
@@ -33,6 +33,7 @@
#include "hw/acpi/acpi.h"
#include "hw/acpi/ich9.h"
#include "hw/pci-host/pam.h"
+#include "hw/i386/intel_iommu.h"
#define TYPE_Q35_HOST_DEVICE "q35-pcihost"
#define Q35_HOST_DEVICE(obj) \
@@ -60,6 +61,7 @@ typedef struct MCHPCIState {
uint64_t pci_hole64_size;
PcGuestInfo *guest_info;
uint32_t short_root_bus;
+ IntelIOMMUState *iommu;
} MCHPCIState;
typedef struct Q35PCIHost {
diff --git a/qemu-options.hx b/qemu-options.hx
index 9e54686..e60ba6a 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -35,7 +35,8 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
" kernel_irqchip=on|off controls accelerated irqchip support\n"
" kvm_shadow_mem=size of KVM shadow MMU\n"
" dump-guest-core=on|off include guest memory in a core dump (default=on)\n"
- " mem-merge=on|off controls memory merge support (default: on)\n",
+ " mem-merge=on|off controls memory merge support (default: on)\n"
+ " iommu=on|off controls emulated Intel IOMMU (VT-d) support (default=off)\n",
QEMU_ARCH_ALL)
STEXI
@item -machine [type=]@var{name}[,prop=@var{value}[,...]]
@@ -58,6 +59,8 @@ Include guest memory in a core dump. The default is on.
Enables or disables memory merge support. This feature, when supported by
the host, de-duplicates identical memory pages among VMs instances
(enabled by default).
+@item iommu=on|off
+Enables or disables emulated Intel IOMMU (VT-d) support. The default is off.
@end table
ETEXI
diff --git a/vl.c b/vl.c
index 6abedcf..e49ef5a 100644
--- a/vl.c
+++ b/vl.c
@@ -387,6 +387,10 @@ static QemuOptsList qemu_machine_opts = {
.name = PC_MACHINE_MAX_RAM_BELOW_4G,
.type = QEMU_OPT_SIZE,
.help = "maximum ram below the 4G boundary (32bit boundary)",
+ },{
+ .name = "iommu",
+ .type = QEMU_OPT_BOOL,
+ .help = "Set on/off to enable/disable Intel IOMMU (VT-d)",
},
{ /* End of list */ }
},
--
1.9.1