* [Qemu-devel] [PATCH v3 01/13] intel_iommu: allow queued invalidation for IR
2016-04-15 3:31 [Qemu-devel] [PATCH v3 00/13] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
@ 2016-04-15 3:31 ` Peter Xu
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 02/13] intel_iommu: set IR bit for ECAP register Peter Xu
` (12 subsequent siblings)
13 siblings, 0 replies; 24+ messages in thread
From: Peter Xu @ 2016-04-15 3:31 UTC (permalink / raw)
To: qemu-devel
Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
jan.kiszka, rkrcmar, alex.williamson, wexu, peterx
Queued invalidation is required for IR. This patch add basic support for
interrupt cache invalidate requests. Since we currently have no IR cache
implemented yet, we can just skip all interrupt cache invalidation
requests for now.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
hw/i386/intel_iommu.c | 9 +++++++++
hw/i386/intel_iommu_internal.h | 2 ++
2 files changed, 11 insertions(+)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 347718f..4b0558e 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1400,6 +1400,15 @@ static bool vtd_process_inv_desc(IntelIOMMUState *s)
}
break;
+ case VTD_INV_DESC_IEC:
+ VTD_DPRINTF(INV, "Interrupt Entry Cache Invalidation "
+ "not implemented yet");
+ /*
+ * Since currently we do not cache interrupt entries, we can
+ * just mark this descriptor as "good" and move on.
+ */
+ break;
+
default:
VTD_DPRINTF(GENERAL, "error: unkonw Invalidation Descriptor type "
"hi 0x%"PRIx64 " lo 0x%"PRIx64 " type %"PRIu8,
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index e5f514c..b648e69 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -286,6 +286,8 @@ typedef struct VTDInvDesc VTDInvDesc;
#define VTD_INV_DESC_TYPE 0xf
#define VTD_INV_DESC_CC 0x1 /* Context-cache Invalidate Desc */
#define VTD_INV_DESC_IOTLB 0x2
+#define VTD_INV_DESC_IEC 0x4 /* Interrupt Entry Cache
+ Invalidate Descriptor */
#define VTD_INV_DESC_WAIT 0x5 /* Invalidation Wait Descriptor */
#define VTD_INV_DESC_NONE 0 /* Not an Invalidate Descriptor */
--
2.4.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [Qemu-devel] [PATCH v3 02/13] intel_iommu: set IR bit for ECAP register
2016-04-15 3:31 [Qemu-devel] [PATCH v3 00/13] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 01/13] intel_iommu: allow queued invalidation for IR Peter Xu
@ 2016-04-15 3:31 ` Peter Xu
2016-04-17 2:30 ` Jan Kiszka
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 03/13] acpi: add DMAR scope definition for root IOAPIC Peter Xu
` (11 subsequent siblings)
13 siblings, 1 reply; 24+ messages in thread
From: Peter Xu @ 2016-04-15 3:31 UTC (permalink / raw)
To: qemu-devel
Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
jan.kiszka, rkrcmar, alex.williamson, wexu, peterx
Enable IR in IOMMU Extended Capability register.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
hw/i386/intel_iommu.c | 7 +++++++
hw/i386/intel_iommu_internal.h | 2 ++
2 files changed, 9 insertions(+)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 4b0558e..17668d6 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -24,6 +24,7 @@
#include "exec/address-spaces.h"
#include "intel_iommu_internal.h"
#include "hw/pci/pci.h"
+#include "hw/boards.h"
/*#define DEBUG_INTEL_IOMMU*/
#ifdef DEBUG_INTEL_IOMMU
@@ -1941,6 +1942,8 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn)
*/
static void vtd_init(IntelIOMMUState *s)
{
+ MachineState *ms = MACHINE(qdev_get_machine());
+
memset(s->csr, 0, DMAR_REG_SIZE);
memset(s->wmask, 0, DMAR_REG_SIZE);
memset(s->w1cmask, 0, DMAR_REG_SIZE);
@@ -1961,6 +1964,10 @@ static void vtd_init(IntelIOMMUState *s)
VTD_CAP_SAGAW | VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS;
s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO;
+ if (ms->iommu_intr) {
+ s->ecap |= VTD_ECAP_IR;
+ }
+
vtd_reset_context_cache(s);
vtd_reset_iotlb(s);
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index b648e69..5b98a11 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -176,6 +176,8 @@
/* (offset >> 4) << 8 */
#define VTD_ECAP_IRO (DMAR_IOTLB_REG_OFFSET << 4)
#define VTD_ECAP_QI (1ULL << 1)
+/* Interrupt Remapping support */
+#define VTD_ECAP_IR (1ULL << 3)
/* CAP_REG */
/* (offset >> 4) << 24 */
--
2.4.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [Qemu-devel] [PATCH v3 02/13] intel_iommu: set IR bit for ECAP register
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 02/13] intel_iommu: set IR bit for ECAP register Peter Xu
@ 2016-04-17 2:30 ` Jan Kiszka
2016-04-18 3:11 ` Peter Xu
0 siblings, 1 reply; 24+ messages in thread
From: Jan Kiszka @ 2016-04-17 2:30 UTC (permalink / raw)
To: Peter Xu, qemu-devel
Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini, rkrcmar,
alex.williamson, wexu
On 2016-04-14 20:31, Peter Xu wrote:
> Enable IR in IOMMU Extended Capability register.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
> hw/i386/intel_iommu.c | 7 +++++++
> hw/i386/intel_iommu_internal.h | 2 ++
> 2 files changed, 9 insertions(+)
>
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 4b0558e..17668d6 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -24,6 +24,7 @@
> #include "exec/address-spaces.h"
> #include "intel_iommu_internal.h"
> #include "hw/pci/pci.h"
> +#include "hw/boards.h"
>
> /*#define DEBUG_INTEL_IOMMU*/
> #ifdef DEBUG_INTEL_IOMMU
> @@ -1941,6 +1942,8 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn)
> */
> static void vtd_init(IntelIOMMUState *s)
> {
> + MachineState *ms = MACHINE(qdev_get_machine());
> +
> memset(s->csr, 0, DMAR_REG_SIZE);
> memset(s->wmask, 0, DMAR_REG_SIZE);
> memset(s->w1cmask, 0, DMAR_REG_SIZE);
> @@ -1961,6 +1964,10 @@ static void vtd_init(IntelIOMMUState *s)
> VTD_CAP_SAGAW | VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS;
> s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO;
>
> + if (ms->iommu_intr) {
This cannot work, the field doesn't exit yet.
Please test bisectability after reordering patches.
Jan
> + s->ecap |= VTD_ECAP_IR;
> + }
> +
> vtd_reset_context_cache(s);
> vtd_reset_iotlb(s);
>
> diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
> index b648e69..5b98a11 100644
> --- a/hw/i386/intel_iommu_internal.h
> +++ b/hw/i386/intel_iommu_internal.h
> @@ -176,6 +176,8 @@
> /* (offset >> 4) << 8 */
> #define VTD_ECAP_IRO (DMAR_IOTLB_REG_OFFSET << 4)
> #define VTD_ECAP_QI (1ULL << 1)
> +/* Interrupt Remapping support */
> +#define VTD_ECAP_IR (1ULL << 3)
>
> /* CAP_REG */
> /* (offset >> 4) << 24 */
>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Qemu-devel] [PATCH v3 02/13] intel_iommu: set IR bit for ECAP register
2016-04-17 2:30 ` Jan Kiszka
@ 2016-04-18 3:11 ` Peter Xu
0 siblings, 0 replies; 24+ messages in thread
From: Peter Xu @ 2016-04-18 3:11 UTC (permalink / raw)
To: Jan Kiszka
Cc: qemu-devel, imammedo, rth, ehabkost, jasowang, marcel, mst,
pbonzini, rkrcmar, alex.williamson, wexu
On Sat, Apr 16, 2016 at 07:30:25PM -0700, Jan Kiszka wrote:
> On 2016-04-14 20:31, Peter Xu wrote:
> > Enable IR in IOMMU Extended Capability register.
> >
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> > hw/i386/intel_iommu.c | 7 +++++++
> > hw/i386/intel_iommu_internal.h | 2 ++
> > 2 files changed, 9 insertions(+)
> >
> > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > index 4b0558e..17668d6 100644
> > --- a/hw/i386/intel_iommu.c
> > +++ b/hw/i386/intel_iommu.c
> > @@ -24,6 +24,7 @@
> > #include "exec/address-spaces.h"
> > #include "intel_iommu_internal.h"
> > #include "hw/pci/pci.h"
> > +#include "hw/boards.h"
> >
> > /*#define DEBUG_INTEL_IOMMU*/
> > #ifdef DEBUG_INTEL_IOMMU
> > @@ -1941,6 +1942,8 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn)
> > */
> > static void vtd_init(IntelIOMMUState *s)
> > {
> > + MachineState *ms = MACHINE(qdev_get_machine());
> > +
> > memset(s->csr, 0, DMAR_REG_SIZE);
> > memset(s->wmask, 0, DMAR_REG_SIZE);
> > memset(s->w1cmask, 0, DMAR_REG_SIZE);
> > @@ -1961,6 +1964,10 @@ static void vtd_init(IntelIOMMUState *s)
> > VTD_CAP_SAGAW | VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS;
> > s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO;
> >
> > + if (ms->iommu_intr) {
>
> This cannot work, the field doesn't exit yet.
>
> Please test bisectability after reordering patches.
Oh god, I missed one patch. There should be 14 patches, while it
seems that I generated only 13. :(
Sorry for the misunderstanding. Will repost as v4.
-- peterx
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Qemu-devel] [PATCH v3 03/13] acpi: add DMAR scope definition for root IOAPIC
2016-04-15 3:31 [Qemu-devel] [PATCH v3 00/13] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 01/13] intel_iommu: allow queued invalidation for IR Peter Xu
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 02/13] intel_iommu: set IR bit for ECAP register Peter Xu
@ 2016-04-15 3:31 ` Peter Xu
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 04/13] intel_iommu: define interrupt remap table addr register Peter Xu
` (10 subsequent siblings)
13 siblings, 0 replies; 24+ messages in thread
From: Peter Xu @ 2016-04-15 3:31 UTC (permalink / raw)
To: qemu-devel
Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
jan.kiszka, rkrcmar, alex.williamson, wexu, peterx
To enable interrupt remapping for intel IOMMU device, each IOAPIC device
in the system reported via ACPI MADT must be explicitly enumerated under
one specific remapping hardware unit. This patch adds the root-complex
IOAPIC into the default DMAR device.
Please refer to VT-d spec 8.3.1.1 for more information.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
hw/i386/acpi-build.c | 17 +++++++++++++++--
include/hw/acpi/acpi-defs.h | 15 +++++++++++++++
include/hw/pci-host/q35.h | 9 +++++++++
3 files changed, 39 insertions(+), 2 deletions(-)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 80dd1bb..5d2d87b 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -77,6 +77,9 @@
#define ACPI_BUILD_DPRINTF(fmt, ...)
#endif
+/* Default IOAPIC ID */
+#define ACPI_BUILD_IOAPIC_ID 0x0
+
typedef struct AcpiMcfgInfo {
uint64_t mcfg_base;
uint32_t mcfg_size;
@@ -375,7 +378,6 @@ build_madt(GArray *table_data, GArray *linker, PCMachineState *pcms)
io_apic = acpi_data_push(table_data, sizeof *io_apic);
io_apic->type = ACPI_APIC_IO;
io_apic->length = sizeof(*io_apic);
-#define ACPI_BUILD_IOAPIC_ID 0x0
io_apic->io_apic_id = ACPI_BUILD_IOAPIC_ID;
io_apic->address = cpu_to_le32(IO_APIC_DEFAULT_ADDRESS);
io_apic->interrupt = cpu_to_le32(0);
@@ -2582,6 +2584,9 @@ build_dmar_q35(MachineState *ms, GArray *table_data, GArray *linker)
AcpiTableDmar *dmar;
AcpiDmarHardwareUnit *drhd;
uint8_t dmar_flags = 0;
+ AcpiDmarDeviceScope *scope = NULL;
+ /* Root complex IOAPIC use one path[0] only */
+ uint16_t scope_size = sizeof(*scope) + sizeof(uint16_t);
if (ms->iommu_intr) {
/* enable INTR for the IOMMU device */
@@ -2595,11 +2600,19 @@ build_dmar_q35(MachineState *ms, GArray *table_data, GArray *linker)
/* DMAR Remapping Hardware Unit Definition structure */
drhd = acpi_data_push(table_data, sizeof(*drhd));
drhd->type = cpu_to_le16(ACPI_DMAR_TYPE_HARDWARE_UNIT);
- drhd->length = cpu_to_le16(sizeof(*drhd)); /* No device scope now */
+ drhd->length = cpu_to_le16(sizeof(*drhd) + scope_size);
drhd->flags = ACPI_DMAR_INCLUDE_PCI_ALL;
drhd->pci_segment = cpu_to_le16(0);
drhd->address = cpu_to_le64(Q35_HOST_BRIDGE_IOMMU_ADDR);
+ /* Scope definition for the root-complex IOAPIC */
+ scope = acpi_data_push(table_data, scope_size);
+ scope->entry_type = cpu_to_le16(ACPI_DMAR_DEV_SCOPE_TYPE_IOAPIC);
+ scope->length = scope_size;
+ scope->enumeration_id = cpu_to_le16(ACPI_BUILD_IOAPIC_ID);
+ scope->bus = cpu_to_le16(Q35_PSEUDO_BUS_PLATFORM);
+ scope->path[0] = cpu_to_le16(Q35_PSEUDO_DEVFN_IOAPIC);
+
build_header(linker, table_data, (void *)(table_data->data + dmar_start),
"DMAR", table_data->len - dmar_start, 1, NULL, NULL);
}
diff --git a/include/hw/acpi/acpi-defs.h b/include/hw/acpi/acpi-defs.h
index c7a03d4..2430af6 100644
--- a/include/hw/acpi/acpi-defs.h
+++ b/include/hw/acpi/acpi-defs.h
@@ -556,6 +556,20 @@ enum {
/*
* Sub-structures for DMAR
*/
+
+#define ACPI_DMAR_DEV_SCOPE_TYPE_IOAPIC (0x03)
+
+/* Device scope structure for DRHD. */
+struct AcpiDmarDeviceScope {
+ uint8_t entry_type;
+ uint8_t length;
+ uint16_t reserved;
+ uint8_t enumeration_id;
+ uint8_t bus;
+ uint16_t path[0]; /* list of dev:func pairs */
+} QEMU_PACKED;
+typedef struct AcpiDmarDeviceScope AcpiDmarDeviceScope;
+
/* Type 0: Hardware Unit Definition */
struct AcpiDmarHardwareUnit {
uint16_t type;
@@ -564,6 +578,7 @@ struct AcpiDmarHardwareUnit {
uint8_t reserved;
uint16_t pci_segment; /* The PCI Segment associated with this unit */
uint64_t address; /* Base address of remapping hardware register-set */
+ AcpiDmarDeviceScope scope[0];
} QEMU_PACKED;
typedef struct AcpiDmarHardwareUnit AcpiDmarHardwareUnit;
diff --git a/include/hw/pci-host/q35.h b/include/hw/pci-host/q35.h
index c5c073d..9afc221 100644
--- a/include/hw/pci-host/q35.h
+++ b/include/hw/pci-host/q35.h
@@ -175,4 +175,13 @@ typedef struct Q35PCIHost {
uint64_t mch_mcfg_base(void);
+/*
+ * Arbitary but unique BNF number for IOAPIC device. This is only
+ * used when interrupt remapping is enabled.
+ *
+ * TODO: make sure there would have no conflict with real PCI bus
+ */
+#define Q35_PSEUDO_BUS_PLATFORM (0xff)
+#define Q35_PSEUDO_DEVFN_IOAPIC (0x00)
+
#endif /* HW_Q35_H */
--
2.4.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [Qemu-devel] [PATCH v3 04/13] intel_iommu: define interrupt remap table addr register
2016-04-15 3:31 [Qemu-devel] [PATCH v3 00/13] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
` (2 preceding siblings ...)
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 03/13] acpi: add DMAR scope definition for root IOAPIC Peter Xu
@ 2016-04-15 3:31 ` Peter Xu
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 05/13] intel_iommu: handle interrupt remap enable Peter Xu
` (9 subsequent siblings)
13 siblings, 0 replies; 24+ messages in thread
From: Peter Xu @ 2016-04-15 3:31 UTC (permalink / raw)
To: qemu-devel
Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
jan.kiszka, rkrcmar, alex.williamson, wexu, peterx
Defined Interrupt Remap Table Address register to store IR table
pointer. Also, do proper handling on global command register writes to
store table pointer and its size.
One more debug flag "DEBUG_IR" is added for interrupt remapping.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
hw/i386/intel_iommu.c | 52 +++++++++++++++++++++++++++++++++++++++++-
hw/i386/intel_iommu_internal.h | 4 ++++
include/hw/i386/intel_iommu.h | 5 ++++
3 files changed, 60 insertions(+), 1 deletion(-)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 17668d6..00b873c 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -30,7 +30,7 @@
#ifdef DEBUG_INTEL_IOMMU
enum {
DEBUG_GENERAL, DEBUG_CSR, DEBUG_INV, DEBUG_MMU, DEBUG_FLOG,
- DEBUG_CACHE,
+ DEBUG_CACHE, DEBUG_IR,
};
#define VTD_DBGBIT(x) (1 << DEBUG_##x)
static int vtd_dbgflags = VTD_DBGBIT(GENERAL) | VTD_DBGBIT(CSR);
@@ -900,6 +900,19 @@ static void vtd_root_table_setup(IntelIOMMUState *s)
(s->root_extended ? "(extended)" : ""));
}
+static void vtd_interrupt_remap_table_setup(IntelIOMMUState *s)
+{
+ uint64_t value = 0;
+ value = vtd_get_quad_raw(s, DMAR_IRTA_REG);
+ s->intr_size = 1UL << ((value & VTD_IRTA_SIZE_MASK) + 1);
+ s->intr_root = value & VTD_IRTA_ADDR_MASK;
+
+ /* TODO: invalidate interrupt entry cache */
+
+ VTD_DPRINTF(CSR, "int remap table addr 0x%"PRIx64 " size %"PRIu32,
+ s->intr_root, s->intr_size);
+}
+
static void vtd_context_global_invalidate(IntelIOMMUState *s)
{
s->context_cache_gen++;
@@ -1138,6 +1151,16 @@ static void vtd_handle_gcmd_srtp(IntelIOMMUState *s)
vtd_set_clear_mask_long(s, DMAR_GSTS_REG, 0, VTD_GSTS_RTPS);
}
+/* Set Interrupt Remap Table Pointer */
+static void vtd_handle_gcmd_sirtp(IntelIOMMUState *s)
+{
+ VTD_DPRINTF(CSR, "set Interrupt Remap Table Pointer");
+
+ vtd_interrupt_remap_table_setup(s);
+ /* Ok - report back to driver */
+ vtd_set_clear_mask_long(s, DMAR_GSTS_REG, 0, VTD_GSTS_IRTPS);
+}
+
/* Handle Translation Enable/Disable */
static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
{
@@ -1177,6 +1200,10 @@ static void vtd_handle_gcmd_write(IntelIOMMUState *s)
/* Queued Invalidation Enable */
vtd_handle_gcmd_qie(s, val & VTD_GCMD_QIE);
}
+ if (val & VTD_GCMD_SIRTP) {
+ /* Set/update the interrupt remapping root-table pointer */
+ vtd_handle_gcmd_sirtp(s);
+ }
}
/* Handle write to Context Command Register */
@@ -1838,6 +1865,23 @@ static void vtd_mem_write(void *opaque, hwaddr addr,
vtd_update_fsts_ppf(s);
break;
+ case DMAR_IRTA_REG:
+ VTD_DPRINTF(IR, "DMAR_IRTA_REG write addr 0x%"PRIx64
+ ", size %d, val 0x%"PRIx64, addr, size, val);
+ if (size == 4) {
+ vtd_set_long(s, addr, val);
+ } else {
+ vtd_set_quad(s, addr, val);
+ }
+ break;
+
+ case DMAR_IRTA_REG_HI:
+ VTD_DPRINTF(IR, "DMAR_IRTA_REG_HI write addr 0x%"PRIx64
+ ", size %d, val 0x%"PRIx64, addr, size, val);
+ assert(size == 4);
+ vtd_set_long(s, addr, val);
+ break;
+
default:
VTD_DPRINTF(GENERAL, "error: unhandled reg write addr 0x%"PRIx64
", size %d, val 0x%"PRIx64, addr, size, val);
@@ -2017,6 +2061,12 @@ static void vtd_init(IntelIOMMUState *s)
/* Fault Recording Registers, 128-bit */
vtd_define_quad(s, DMAR_FRCD_REG_0_0, 0, 0, 0);
vtd_define_quad(s, DMAR_FRCD_REG_0_2, 0, 0, 0x8000000000000000ULL);
+
+ /*
+ * Interrupt remapping registers, not support extended interrupt
+ * mode for now.
+ */
+ vtd_define_quad(s, DMAR_IRTA_REG, 0, 0xfffffffffffff00fULL, 0);
}
/* Should not reset address_spaces when reset because devices will still use
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index 5b98a11..309833f 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -172,6 +172,10 @@
#define VTD_RTADDR_RTT (1ULL << 11)
#define VTD_RTADDR_ADDR_MASK (VTD_HAW_MASK ^ 0xfffULL)
+/* IRTA_REG */
+#define VTD_IRTA_ADDR_MASK (VTD_HAW_MASK ^ 0xfffULL)
+#define VTD_IRTA_SIZE_MASK (0xfULL)
+
/* ECAP_REG */
/* (offset >> 4) << 8 */
#define VTD_ECAP_IRO (DMAR_IOTLB_REG_OFFSET << 4)
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 0d89796..cc49839 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -125,6 +125,11 @@ struct IntelIOMMUState {
MemoryRegionIOMMUOps iommu_ops;
GHashTable *vtd_as_by_busptr; /* VTDBus objects indexed by PCIBus* reference */
VTDBus *vtd_as_by_bus_num[VTD_PCI_BUS_MAX]; /* VTDBus objects indexed by bus number */
+
+ /* interrupt remapping */
+ bool intr_enabled; /* Whether guest enabled IR */
+ dma_addr_t intr_root; /* Interrupt remapping table pointer */
+ uint32_t intr_size; /* Number of IR table entries */
};
/* Find the VTD Address space associated with the given bus pointer,
--
2.4.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [Qemu-devel] [PATCH v3 05/13] intel_iommu: handle interrupt remap enable
2016-04-15 3:31 [Qemu-devel] [PATCH v3 00/13] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
` (3 preceding siblings ...)
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 04/13] intel_iommu: define interrupt remap table addr register Peter Xu
@ 2016-04-15 3:31 ` Peter Xu
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 06/13] intel_iommu: define several structs for IOMMU IR Peter Xu
` (8 subsequent siblings)
13 siblings, 0 replies; 24+ messages in thread
From: Peter Xu @ 2016-04-15 3:31 UTC (permalink / raw)
To: qemu-devel
Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
jan.kiszka, rkrcmar, alex.williamson, wexu, peterx
Handle writting to IRE bit in global command register.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
hw/i386/intel_iommu.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 00b873c..4d14124 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1180,6 +1180,22 @@ static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
}
}
+/* Handle Interrupt Remap Enable/Disable */
+static void vtd_handle_gcmd_ire(IntelIOMMUState *s, bool en)
+{
+ VTD_DPRINTF(CSR, "Interrupt Remap Enable %s", (en ? "on" : "off"));
+
+ if (en) {
+ s->intr_enabled = true;
+ /* Ok - report back to driver */
+ vtd_set_clear_mask_long(s, DMAR_GSTS_REG, 0, VTD_GSTS_IRES);
+ } else {
+ s->intr_enabled = false;
+ /* Ok - report back to driver */
+ vtd_set_clear_mask_long(s, DMAR_GSTS_REG, VTD_GSTS_IRES, 0);
+ }
+}
+
/* Handle write to Global Command Register */
static void vtd_handle_gcmd_write(IntelIOMMUState *s)
{
@@ -1204,6 +1220,10 @@ static void vtd_handle_gcmd_write(IntelIOMMUState *s)
/* Set/update the interrupt remapping root-table pointer */
vtd_handle_gcmd_sirtp(s);
}
+ if (changed & VTD_GCMD_IRE) {
+ /* Interrupt remap enable/disable */
+ vtd_handle_gcmd_ire(s, val & VTD_GCMD_IRE);
+ }
}
/* Handle write to Context Command Register */
--
2.4.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [Qemu-devel] [PATCH v3 06/13] intel_iommu: define several structs for IOMMU IR
2016-04-15 3:31 [Qemu-devel] [PATCH v3 00/13] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
` (4 preceding siblings ...)
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 05/13] intel_iommu: handle interrupt remap enable Peter Xu
@ 2016-04-15 3:31 ` Peter Xu
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 07/13] intel_iommu: provide helper function vtd_get_iommu Peter Xu
` (7 subsequent siblings)
13 siblings, 0 replies; 24+ messages in thread
From: Peter Xu @ 2016-04-15 3:31 UTC (permalink / raw)
To: qemu-devel
Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
jan.kiszka, rkrcmar, alex.williamson, wexu, peterx
Several data structs are defined to better support the rest of the
patches: IRTE to parse remapping table entries, and IOAPIC/MSI related
structure bits to parse interrupt entries to be filled in by guest
kernel.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/hw/i386/intel_iommu.h | 60 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 60 insertions(+)
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index cc49839..4914fe6 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -52,6 +52,9 @@ typedef struct IntelIOMMUState IntelIOMMUState;
typedef struct VTDAddressSpace VTDAddressSpace;
typedef struct VTDIOTLBEntry VTDIOTLBEntry;
typedef struct VTDBus VTDBus;
+typedef union VTD_IRTE VTD_IRTE;
+typedef union VTD_IR_IOAPICEntry VTD_IR_IOAPICEntry;
+typedef union VTD_IR_MSIAddress VTD_IR_MSIAddress;
/* Context-Entry */
struct VTDContextEntry {
@@ -90,6 +93,63 @@ struct VTDIOTLBEntry {
bool write_flags;
};
+/* Interrupt Remapping Table Entry Definition */
+union VTD_IRTE {
+ struct {
+ uint8_t present:1; /* Whether entry present/available */
+ uint8_t fault_disable:1; /* Fault Processing Disable */
+ uint8_t dest_mode:1; /* Destination Mode */
+ uint8_t redir_hint:1; /* Redirection Hint */
+ uint8_t trigger_mode:1; /* Trigger Mode */
+ uint8_t delivery_mode:3; /* Delivery Mode */
+ uint8_t __avail:4; /* Available spaces for software */
+ uint8_t __reserved_0:3; /* Reserved 0 */
+ uint8_t irte_mode:1; /* IRTE Mode */
+ uint8_t vector:8; /* Interrupt Vector */
+ uint8_t __reserved_1:8; /* Reserved 1 */
+ uint32_t dest_id:32; /* Destination ID */
+ uint16_t source_id:16; /* Source-ID */
+ uint8_t sid_q:2; /* Source-ID Qualifier */
+ uint8_t sid_vtype:2; /* Source-ID Validation Type */
+ uint64_t __reserved_2:44; /* Reserved 2 */
+ } QEMU_PACKED;
+ uint64_t data[2];
+};
+
+/* Programming format for IOAPIC table entries */
+union VTD_IR_IOAPICEntry {
+ struct {
+ uint8_t vector:8; /* Vector */
+ uint8_t __zeros:3; /* Reserved (all zero) */
+ uint8_t index_h:1; /* Interrupt Index bit 15 */
+ uint8_t status:1; /* Deliver Status */
+ uint8_t polarity:1; /* Interrupt Polarity */
+ uint8_t remote_irr:1; /* Remote IRR */
+ uint8_t trigger_mode:1; /* Trigger Mode */
+ uint8_t mask:1; /* Mask */
+ uint32_t __reserved:31; /* Reserved (should all zero) */
+ uint8_t int_mode:1; /* Interrupt Format */
+ uint16_t index_l:15; /* Interrupt Index bits 14-0 */
+ } QEMU_PACKED;
+ uint64_t data;
+};
+
+/* Programming format for MSI/MSI-X addresses */
+union VTD_IR_MSIAddress {
+ struct {
+ uint8_t __not_care:2;
+ uint8_t index_h:1; /* Interrupt index bit 15 */
+ uint8_t sub_valid:1; /* SHV: Sub-Handle Valid bit */
+ uint8_t int_mode:1; /* Interrupt format */
+ uint16_t index_l:15; /* Interrupt index bit 14-0 */
+ uint16_t __head:12; /* Should always be: 0x0fee */
+ } QEMU_PACKED;
+ uint32_t data;
+};
+
+/* When IR is enabled, all MSI/MSI-X data bits should be zero */
+#define VTD_IR_MSI_DATA (0)
+
/* The iommu (DMAR) device state struct */
struct IntelIOMMUState {
SysBusDevice busdev;
--
2.4.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [Qemu-devel] [PATCH v3 07/13] intel_iommu: provide helper function vtd_get_iommu
2016-04-15 3:31 [Qemu-devel] [PATCH v3 00/13] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
` (5 preceding siblings ...)
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 06/13] intel_iommu: define several structs for IOMMU IR Peter Xu
@ 2016-04-15 3:31 ` Peter Xu
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 08/13] intel_iommu: add IR translation faults defines Peter Xu
` (6 subsequent siblings)
13 siblings, 0 replies; 24+ messages in thread
From: Peter Xu @ 2016-04-15 3:31 UTC (permalink / raw)
To: qemu-devel
Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
jan.kiszka, rkrcmar, alex.williamson, wexu, peterx
Moves acpi_get_iommu() under VT-d to make it a public function.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
hw/i386/acpi-build.c | 7 +------
hw/i386/intel_iommu.c | 13 +++++++++++++
include/hw/i386/intel_iommu.h | 2 ++
3 files changed, 16 insertions(+), 6 deletions(-)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 5d2d87b..b064bc2 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2677,12 +2677,7 @@ static bool acpi_get_mcfg(AcpiMcfgInfo *mcfg)
static bool acpi_has_iommu(void)
{
- bool ambiguous;
- Object *intel_iommu;
-
- intel_iommu = object_resolve_path_type("", TYPE_INTEL_IOMMU_DEVICE,
- &ambiguous);
- return intel_iommu && !ambiguous;
+ return !!vtd_iommu_get();
}
static
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 4d14124..a44289f 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2001,6 +2001,19 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn)
return vtd_dev_as;
}
+IntelIOMMUState *vtd_iommu_get(void)
+{
+ bool ambiguous = false;
+ Object *intel_iommu = NULL;
+
+ intel_iommu = object_resolve_path_type("", TYPE_INTEL_IOMMU_DEVICE,
+ &ambiguous);
+ if (ambiguous)
+ intel_iommu = NULL;
+
+ return (IntelIOMMUState *)intel_iommu;
+}
+
/* Do the initialization. It will also be called when reset, so pay
* attention when adding new initialization stuff.
*/
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 4914fe6..9ee84f7 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -196,5 +196,7 @@ struct IntelIOMMUState {
* create a new one if none exists
*/
VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn);
+/* Get default IOMMU object */
+IntelIOMMUState *vtd_iommu_get(void);
#endif
--
2.4.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [Qemu-devel] [PATCH v3 08/13] intel_iommu: add IR translation faults defines
2016-04-15 3:31 [Qemu-devel] [PATCH v3 00/13] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
` (6 preceding siblings ...)
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 07/13] intel_iommu: provide helper function vtd_get_iommu Peter Xu
@ 2016-04-15 3:31 ` Peter Xu
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 09/13] intel_iommu: Add support for PCI MSI remap Peter Xu
` (5 subsequent siblings)
13 siblings, 0 replies; 24+ messages in thread
From: Peter Xu @ 2016-04-15 3:31 UTC (permalink / raw)
To: qemu-devel
Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
jan.kiszka, rkrcmar, alex.williamson, wexu, peterx
Adding translation fault definitions for interrupt remapping. Please
refer to VT-d spec section 7.1.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
hw/i386/intel_iommu_internal.h | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index 309833f..2a9987f 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -271,6 +271,19 @@ typedef enum VTDFaultReason {
* context-entry.
*/
VTD_FR_CONTEXT_ENTRY_TT,
+
+ /* Interrupt remapping transition faults */
+ VTD_FR_IR_REQ_RSVD = 0x20, /* One or more IR request reserved
+ * fields set */
+ VTD_FR_IR_INDEX_OVER = 0x21, /* Index value greater than max */
+ VTD_FR_IR_ENTRY_P = 0x22, /* Present (P) not set in IRTE */
+ VTD_FR_IR_ROOT_INVAL = 0x23, /* IR Root table invalid */
+ VTD_FR_IR_IRTE_RSVD = 0x24, /* IRTE Rsvd field non-zero with
+ * Present flag set */
+ VTD_FR_IR_REQ_COMPAT = 0x25, /* Encountered compatible IR
+ * request while disabled */
+ VTD_FR_IR_SID_ERR = 0x26, /* Invalid Source-ID */
+
/* This is not a normal fault reason. We use this to indicate some faults
* that are not referenced by the VT-d specification.
* Fault event with such reason should not be recorded.
--
2.4.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [Qemu-devel] [PATCH v3 09/13] intel_iommu: Add support for PCI MSI remap
2016-04-15 3:31 [Qemu-devel] [PATCH v3 00/13] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
` (7 preceding siblings ...)
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 08/13] intel_iommu: add IR translation faults defines Peter Xu
@ 2016-04-15 3:31 ` Peter Xu
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 10/13] q35: ioapic: add support for emulated IOAPIC IR Peter Xu
` (4 subsequent siblings)
13 siblings, 0 replies; 24+ messages in thread
From: Peter Xu @ 2016-04-15 3:31 UTC (permalink / raw)
To: qemu-devel
Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
jan.kiszka, rkrcmar, alex.williamson, wexu, peterx
This patch enables interrupt remapping for PCI devices.
To play the trick, one memory region "iommu_ir" is added as child region
of the original iommu memory region, covering range 0xfeeXXXXX (which is
the address range for APIC). All the writes to this range will be taken
as MSI, and translation is carried out only when IR is enabled.
Idea suggested by Paolo Bonzini.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
hw/i386/intel_iommu.c | 237 +++++++++++++++++++++++++++++++++++++++++
hw/i386/intel_iommu_internal.h | 2 +
include/hw/i386/intel_iommu.h | 48 +++++++++
3 files changed, 287 insertions(+)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index a44289f..68ebc1e 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -44,6 +44,10 @@ static int vtd_dbgflags = VTD_DBGBIT(GENERAL) | VTD_DBGBIT(CSR);
#define VTD_DPRINTF(what, fmt, ...) do {} while (0)
#endif
+#define VTD_MSI_ADDR_HI_MASK (0xffffffff00000000ULL)
+#define VTD_MSI_ADDR_HI_SHIFT (32)
+#define VTD_MSI_ADDR_LO_MASK (0x00000000ffffffffULL)
+
static void vtd_define_quad(IntelIOMMUState *s, hwaddr addr, uint64_t val,
uint64_t wmask, uint64_t w1cmask)
{
@@ -1969,6 +1973,234 @@ static Property vtd_properties[] = {
DEFINE_PROP_END_OF_LIST(),
};
+/* Read IRTE entry with specific index */
+static int vtd_irte_get(IntelIOMMUState *iommu, uint16_t index,
+ VTD_IRTE *entry)
+{
+ dma_addr_t addr = 0x00;
+
+ addr = iommu->intr_root + index * sizeof(*entry);
+ if (dma_memory_read(&address_space_memory, addr, entry,
+ sizeof(*entry))) {
+ VTD_DPRINTF(GENERAL, "error: fail to access IR root at 0x%"PRIx64
+ " + %"PRIu16, iommu->intr_root, index);
+ return -VTD_FR_IR_ROOT_INVAL;
+ }
+
+ if (!entry->present) {
+ VTD_DPRINTF(GENERAL, "error: present flag not set in IRTE"
+ " entry index %u value 0x%"PRIx64 " 0x%"PRIx64,
+ index, le64_to_cpu(entry->data[1]),
+ le64_to_cpu(entry->data[0]));
+ return -VTD_FR_IR_ENTRY_P;
+ }
+
+ if (entry->__reserved_0 || entry->__reserved_1 || \
+ entry->__reserved_2) {
+ VTD_DPRINTF(GENERAL, "error: IRTE entry index %"PRIu16
+ " reserved fields non-zero: 0x%"PRIx64 " 0x%"PRIx64,
+ index, le64_to_cpu(entry->data[1]),
+ le64_to_cpu(entry->data[0]));
+ return -VTD_FR_IR_IRTE_RSVD;
+ }
+
+ /*
+ * TODO: Check Source-ID corresponds to SVT (Source Validation
+ * Type) bits
+ */
+
+ return 0;
+}
+
+/* Fetch IRQ information of specific IR index */
+static int vtd_remap_irq_get(IntelIOMMUState *iommu, uint16_t index, VTDIrq *irq)
+{
+ VTD_IRTE irte;
+ int ret = 0;
+
+ bzero(&irte, sizeof(irte));
+
+ ret = vtd_irte_get(iommu, index, &irte);
+ if (ret) {
+ return ret;
+ }
+
+ irq->trigger_mode = irte.trigger_mode;
+ irq->vector = irte.vector;
+ irq->delivery_mode = irte.delivery_mode;
+ /* Not support EIM yet: please refer to vt-d 9.10 DST bits */
+#define VTD_IR_APIC_DEST_MASK (0xff00ULL)
+#define VTD_IR_APIC_DEST_SHIFT (8)
+ irq->dest = (irte.dest_id & VTD_IR_APIC_DEST_MASK) >> \
+ VTD_IR_APIC_DEST_SHIFT;
+ irq->dest_mode = irte.dest_mode;
+ irq->redir_hint = irte.redir_hint;
+
+ VTD_DPRINTF(IR, "remapping interrupt index %d: trig:%u,vec:%u,"
+ "deliver:%u,dest:%u,dest_mode:%u", index,
+ irq->trigger_mode, irq->vector, irq->delivery_mode,
+ irq->dest, irq->dest_mode);
+
+ return 0;
+}
+
+/* Generate one MSI message from VTDIrq info */
+static void vtd_generate_msi_message(VTDIrq *irq, MSIMessage *msg_out)
+{
+ VTD_MSIMessage msg = {};
+
+ /* Generate address bits */
+ msg.dest_mode = irq->dest_mode;
+ msg.redir_hint = irq->redir_hint;
+ msg.dest = irq->dest;
+ msg.__addr_head = 0xfee;
+ /* Keep this from original MSI address bits */
+ msg.__not_used = irq->msi_addr_last_bits;
+
+ /* Generate data bits */
+ msg.vector = irq->vector;
+ msg.delivery_mode = irq->delivery_mode;
+ msg.level = 1;
+ msg.trigger_mode = irq->trigger_mode;
+
+ msg_out->address = msg.msi_addr;
+ msg_out->data = msg.msi_data;
+}
+
+/* Interrupt remapping for MSI/MSI-X entry */
+static int vtd_interrupt_remap_msi(IntelIOMMUState *iommu,
+ MSIMessage *origin,
+ MSIMessage *translated)
+{
+ int ret = 0;
+ VTD_IR_MSIAddress addr;
+ uint16_t index = 0;
+ VTDIrq irq = {0};
+
+ assert(iommu && origin && translated);
+
+ if (!iommu->intr_enabled) {
+ memcpy(translated, origin, sizeof(*origin));
+ return 0;
+ }
+
+ if (origin->address & VTD_MSI_ADDR_HI_MASK) {
+ VTD_DPRINTF(GENERAL, "error: MSI addr high 32 bits nonzero"
+ " during interrupt remapping: 0x%"PRIx32,
+ (uint32_t)((origin->address & VTD_MSI_ADDR_HI_MASK) >> \
+ VTD_MSI_ADDR_HI_SHIFT));
+ return -VTD_FR_IR_REQ_RSVD;
+ }
+
+ addr.data = origin->address & VTD_MSI_ADDR_LO_MASK;
+ if (addr.__head != 0xfee) {
+ VTD_DPRINTF(GENERAL, "error: MSI addr low 32 bits invalid: "
+ "0x%"PRIx32, addr.data);
+ return -VTD_FR_IR_REQ_RSVD;
+ }
+
+ if (addr.sub_valid == 1) {
+ VTD_DPRINTF(IR, "received MSI interrupt");
+ if (origin->data) {
+ VTD_DPRINTF(GENERAL, "error: MSI data bits non-zero for "
+ "interrupt remappable entry: 0x%"PRIx32,
+ origin->data);
+ return -VTD_FR_IR_REQ_RSVD;
+ }
+ } else {
+ VTD_DPRINTF(IR, "received IOAPIC interrupt");
+ /* TODO: IOAPIC interrupt check. */
+ }
+
+ /* TODO: Currently we still do not support compatible mode */
+ if (addr.int_mode != VTD_IR_INT_FORMAT_REMAP) {
+ VTD_DPRINTF(GENERAL, "error: trying to remap MSI entry"
+ " with compatible format: 0x%"PRIx32,
+ addr.data);
+ return -VTD_FR_IR_REQ_COMPAT;
+ }
+
+ index = addr.index_h << 15 | addr.index_l;
+
+ ret = vtd_remap_irq_get(iommu, index, &irq);
+ if (ret) {
+ return ret;
+ }
+
+ /*
+ * We'd better keep the last two bits, assuming that guest OS
+ * might modify it. Keep it does not hurt after all.
+ */
+ irq.msi_addr_last_bits = addr.__not_care;
+
+ /* Translate VTDIrq to MSI message */
+ vtd_generate_msi_message(&irq, translated);
+
+ VTD_DPRINTF(IR, "mapping MSI 0x%"PRIx64":0x%"PRIx32 " -> "
+ "0x%"PRIx64":0x%"PRIx32, origin->address, origin->data,
+ translated->address, translated->data);
+
+ return 0;
+}
+
+static uint64_t vtd_mem_ir_read(void *opaque, hwaddr addr, unsigned size)
+{
+ uint64_t data = 0;
+
+ addr += VTD_INTERRUPT_ADDR_FIRST;
+
+ VTD_DPRINTF(IR, "read mem_ir addr 0x%"PRIx64 " size %u",
+ addr, size);
+
+ if (dma_memory_read(&address_space_memory, addr, &data, size)) {
+ VTD_DPRINTF(GENERAL, "error: fail to access 0x%"PRIx64, addr);
+ return (uint32_t) -1;
+ }
+
+ return data;
+}
+
+static void vtd_mem_ir_write(void *opaque, hwaddr addr,
+ uint64_t value, unsigned size)
+{
+ int ret = 0;
+ MSIMessage from = {0}, to = {0};
+
+ from.address = (uint64_t) addr + VTD_INTERRUPT_ADDR_FIRST;
+ from.data = (uint32_t) value;
+
+ ret = vtd_interrupt_remap_msi(opaque, &from, &to);
+ if (ret) {
+ /* TODO: report error */
+ VTD_DPRINTF(GENERAL, "int remap fail for addr 0x%"PRIx64
+ " data 0x%"PRIx32, from.address, from.data);
+ /* Drop this interrupt */
+ return;
+ }
+
+ VTD_DPRINTF(IR, "delivering MSI 0x%"PRIx64":0x%"PRIx32,
+ to.address, to.data);
+
+ if (dma_memory_write(&address_space_memory, to.address,
+ &to.data, size)) {
+ VTD_DPRINTF(GENERAL, "error: fail to write 0x%"PRIx64
+ " value 0x%"PRIx32, to.address, to.data);
+ }
+}
+
+static const MemoryRegionOps vtd_mem_ir_ops = {
+ .read = vtd_mem_ir_read,
+ .write = vtd_mem_ir_write,
+ .endianness = DEVICE_LITTLE_ENDIAN,
+ .impl = {
+ .min_access_size = 4,
+ .max_access_size = 4,
+ },
+ .valid = {
+ .min_access_size = 4,
+ .max_access_size = 4,
+ },
+};
VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn)
{
@@ -1995,6 +2227,11 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn)
vtd_dev_as->context_cache_entry.context_cache_gen = 0;
memory_region_init_iommu(&vtd_dev_as->iommu, OBJECT(s),
&s->iommu_ops, "intel_iommu", UINT64_MAX);
+ memory_region_init_io(&vtd_dev_as->iommu_ir, OBJECT(s),
+ &vtd_mem_ir_ops, s, "intel_iommu_ir",
+ VTD_INTERRUPT_ADDR_SIZE);
+ memory_region_add_subregion(&vtd_dev_as->iommu, VTD_INTERRUPT_ADDR_FIRST,
+ &vtd_dev_as->iommu_ir);
address_space_init(&vtd_dev_as->as,
&vtd_dev_as->iommu, "intel_iommu");
}
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index 2a9987f..e1a08cb 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -110,6 +110,8 @@
/* Interrupt Address Range */
#define VTD_INTERRUPT_ADDR_FIRST 0xfee00000ULL
#define VTD_INTERRUPT_ADDR_LAST 0xfeefffffULL
+#define VTD_INTERRUPT_ADDR_SIZE (VTD_INTERRUPT_ADDR_LAST - \
+ VTD_INTERRUPT_ADDR_FIRST + 1)
/* The shift of source_id in the key of IOTLB hash table */
#define VTD_IOTLB_SID_SHIFT 36
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 9ee84f7..f824957 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -23,6 +23,8 @@
#define INTEL_IOMMU_H
#include "hw/qdev.h"
#include "sysemu/dma.h"
+#include "hw/i386/ioapic.h"
+#include "hw/pci/msi.h"
#define TYPE_INTEL_IOMMU_DEVICE "intel-iommu"
#define INTEL_IOMMU_DEVICE(obj) \
@@ -55,6 +57,8 @@ typedef struct VTDBus VTDBus;
typedef union VTD_IRTE VTD_IRTE;
typedef union VTD_IR_IOAPICEntry VTD_IR_IOAPICEntry;
typedef union VTD_IR_MSIAddress VTD_IR_MSIAddress;
+typedef struct VTDIrq VTDIrq;
+typedef struct VTD_MSIMessage VTD_MSIMessage;
/* Context-Entry */
struct VTDContextEntry {
@@ -75,6 +79,7 @@ struct VTDAddressSpace {
uint8_t devfn;
AddressSpace as;
MemoryRegion iommu;
+ MemoryRegion iommu_ir; /* Interrupt region: 0xfeeXXXXX */
IntelIOMMUState *iommu_state;
VTDContextCacheEntry context_cache_entry;
};
@@ -116,6 +121,9 @@ union VTD_IRTE {
uint64_t data[2];
};
+#define VTD_IR_INT_FORMAT_COMPAT (0) /* Compatible Interrupt */
+#define VTD_IR_INT_FORMAT_REMAP (1) /* Remappable Interrupt */
+
/* Programming format for IOAPIC table entries */
union VTD_IR_IOAPICEntry {
struct {
@@ -147,6 +155,46 @@ union VTD_IR_MSIAddress {
uint32_t data;
};
+/* Generic IRQ entry information */
+struct VTDIrq {
+ /* Used by both IOAPIC/MSI interrupt remapping */
+ uint8_t trigger_mode;
+ uint8_t vector;
+ uint8_t delivery_mode;
+ uint32_t dest;
+ uint8_t dest_mode;
+
+ /* only used by MSI interrupt remapping */
+ uint8_t redir_hint;
+ uint8_t msi_addr_last_bits;
+};
+
+struct VTD_MSIMessage {
+ union {
+ struct {
+ uint16_t __not_used:2;
+ uint16_t dest_mode:1;
+ uint16_t redir_hint:1;
+ uint16_t __reserved:8;
+ uint16_t dest:8;
+ uint16_t __addr_head:12; /* 0xfee */
+ uint32_t __addr_hi:32;
+ } QEMU_PACKED;
+ uint64_t msi_addr;
+ };
+ union {
+ struct {
+ uint16_t vector:8;
+ uint16_t delivery_mode:3;
+ uint16_t __resved:3;
+ uint16_t level:1;
+ uint16_t trigger_mode:1;
+ uint16_t __resved1:16;
+ } QEMU_PACKED;
+ uint32_t msi_data;
+ };
+};
+
/* When IR is enabled, all MSI/MSI-X data bits should be zero */
#define VTD_IR_MSI_DATA (0)
--
2.4.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [Qemu-devel] [PATCH v3 10/13] q35: ioapic: add support for emulated IOAPIC IR
2016-04-15 3:31 [Qemu-devel] [PATCH v3 00/13] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
` (8 preceding siblings ...)
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 09/13] intel_iommu: Add support for PCI MSI remap Peter Xu
@ 2016-04-15 3:31 ` Peter Xu
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 11/13] ioapic: introduce ioapic_entry_parse() helper Peter Xu
` (3 subsequent siblings)
13 siblings, 0 replies; 24+ messages in thread
From: Peter Xu @ 2016-04-15 3:31 UTC (permalink / raw)
To: qemu-devel
Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
jan.kiszka, rkrcmar, alex.williamson, wexu, peterx
This patch translates all IOAPIC interrupts into MSI ones. One pseudo
ioapic address space is added to transfer the MSI message. By default,
it will be system memory address space. When IR is enabled, it will be
IOMMU address space.
Currently, only emulated IOAPIC is supported.
Idea suggested by Jan Kiszka and Rita Sinha in the following patch:
https://lists.gnu.org/archive/html/qemu-devel/2016-03/msg01933.html
Signed-off-by: Peter Xu <peterx@redhat.com>
---
hw/i386/pc.c | 3 +++
hw/intc/ioapic.c | 28 ++++++++++++++++++++++++----
hw/pci-host/q35.c | 4 ++++
include/hw/i386/apic-msidef.h | 1 +
include/hw/i386/ioapic_internal.h | 1 +
include/hw/i386/pc.h | 4 ++++
6 files changed, 37 insertions(+), 4 deletions(-)
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 99437e0..365e82f 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1395,6 +1395,9 @@ void pc_memory_init(PCMachineState *pcms,
rom_add_option(option_rom[i].name, option_rom[i].bootindex);
}
pcms->fw_cfg = fw_cfg;
+
+ /* Init default IOAPIC address space */
+ pcms->ioapic_as = &address_space_memory;
}
qemu_irq pc_allocate_cpu_irq(void)
diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
index 378e663..92334a6 100644
--- a/hw/intc/ioapic.c
+++ b/hw/intc/ioapic.c
@@ -28,6 +28,8 @@
#include "hw/i386/ioapic_internal.h"
#include "include/hw/pci/msi.h"
#include "sysemu/kvm.h"
+#include "target-i386/cpu.h"
+#include "hw/i386/apic-msidef.h"
//#define DEBUG_IOAPIC
@@ -49,13 +51,15 @@ extern int ioapic_no;
static void ioapic_service(IOAPICCommonState *s)
{
+ AddressSpace *ioapic_as = PC_MACHINE(qdev_get_machine())->ioapic_as;
+ uint32_t addr, data;
uint8_t i;
uint8_t trig_mode;
uint8_t vector;
uint8_t delivery_mode;
uint32_t mask;
uint64_t entry;
- uint8_t dest;
+ uint16_t dest_idx;
uint8_t dest_mode;
for (i = 0; i < IOAPIC_NUM_PINS; i++) {
@@ -66,7 +70,14 @@ static void ioapic_service(IOAPICCommonState *s)
entry = s->ioredtbl[i];
if (!(entry & IOAPIC_LVT_MASKED)) {
trig_mode = ((entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1);
- dest = entry >> IOAPIC_LVT_DEST_SHIFT;
+ /*
+ * By default, this would be dest_id[8] +
+ * reserved[8]. When IR is enabled, this would be
+ * interrupt_index[15] + interrupt_format[1]. This
+ * field never means anything, but only used to
+ * generate corresponding MSI.
+ */
+ dest_idx = entry >> IOAPIC_LVT_DEST_IDX_SHIFT;
dest_mode = (entry >> IOAPIC_LVT_DEST_MODE_SHIFT) & 1;
delivery_mode =
(entry >> IOAPIC_LVT_DELIV_MODE_SHIFT) & IOAPIC_DM_MASK;
@@ -96,8 +107,17 @@ static void ioapic_service(IOAPICCommonState *s)
#else
(void)coalesce;
#endif
- apic_deliver_irq(dest, dest_mode, delivery_mode, vector,
- trig_mode);
+ /* No matter whether IR is enabled, we translate
+ * the IOAPIC message into a MSI one, and its
+ * address space will decide whether we need a
+ * translation. */
+ addr = APIC_DEFAULT_ADDRESS | \
+ (dest_idx << MSI_ADDR_DEST_IDX_SHIFT) |
+ (dest_mode << MSI_ADDR_DEST_MODE_SHIFT);
+ data = (vector << MSI_DATA_VECTOR_SHIFT) |
+ (trig_mode << MSI_DATA_TRIGGER_SHIFT) |
+ (delivery_mode << MSI_DATA_DELIVERY_MODE_SHIFT);
+ stl_le_phys(ioapic_as, addr, data);
}
}
}
diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
index 70f897e..d32c123 100644
--- a/hw/pci-host/q35.c
+++ b/hw/pci-host/q35.c
@@ -437,6 +437,7 @@ static AddressSpace *q35_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
static void mch_init_dmar(MCHPCIState *mch)
{
+ PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
PCIBus *pci_bus = PCI_BUS(qdev_get_parent_bus(DEVICE(mch)));
mch->iommu = INTEL_IOMMU_DEVICE(qdev_create(NULL, TYPE_INTEL_IOMMU_DEVICE));
@@ -446,6 +447,9 @@ static void mch_init_dmar(MCHPCIState *mch)
sysbus_mmio_map(SYS_BUS_DEVICE(mch->iommu), 0, Q35_HOST_BRIDGE_IOMMU_ADDR);
pci_setup_iommu(pci_bus, q35_host_dma_iommu, mch->iommu);
+ /* Pseudo address space under root PCI bus. */
+ pcms->ioapic_as = q35_host_dma_iommu(pci_bus, mch->iommu,
+ Q35_PSEUDO_DEVFN_IOAPIC);
}
static void mch_realize(PCIDevice *d, Error **errp)
diff --git a/include/hw/i386/apic-msidef.h b/include/hw/i386/apic-msidef.h
index 6e2eb71..8b4d4cc 100644
--- a/include/hw/i386/apic-msidef.h
+++ b/include/hw/i386/apic-msidef.h
@@ -25,6 +25,7 @@
#define MSI_ADDR_REDIRECTION_SHIFT 3
#define MSI_ADDR_DEST_ID_SHIFT 12
+#define MSI_ADDR_DEST_IDX_SHIFT 4
#define MSI_ADDR_DEST_ID_MASK 0x00ffff0
#endif /* HW_APIC_MSIDEF_H */
diff --git a/include/hw/i386/ioapic_internal.h b/include/hw/i386/ioapic_internal.h
index 797ed47..d279f2d 100644
--- a/include/hw/i386/ioapic_internal.h
+++ b/include/hw/i386/ioapic_internal.h
@@ -31,6 +31,7 @@
#define IOAPIC_VERSION 0x11
#define IOAPIC_LVT_DEST_SHIFT 56
+#define IOAPIC_LVT_DEST_IDX_SHIFT 48
#define IOAPIC_LVT_MASKED_SHIFT 16
#define IOAPIC_LVT_TRIGGER_MODE_SHIFT 15
#define IOAPIC_LVT_REMOTE_IRR_SHIFT 14
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 96f0b66..cde6934 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -72,6 +72,10 @@ struct PCMachineState {
uint64_t numa_nodes;
uint64_t *node_mem;
uint64_t *node_cpu;
+
+ /* Address space used by IOAPIC device. All IOAPIC interrupts
+ * will be translated to MSI messages in the address space. */
+ AddressSpace *ioapic_as;
};
#define PC_MACHINE_ACPI_DEVICE_PROP "acpi-device"
--
2.4.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [Qemu-devel] [PATCH v3 11/13] ioapic: introduce ioapic_entry_parse() helper
2016-04-15 3:31 [Qemu-devel] [PATCH v3 00/13] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
` (9 preceding siblings ...)
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 10/13] q35: ioapic: add support for emulated IOAPIC IR Peter Xu
@ 2016-04-15 3:31 ` Peter Xu
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 12/13] q35: ioapic: add support for split irqchip and irqfd Peter Xu
` (2 subsequent siblings)
13 siblings, 0 replies; 24+ messages in thread
From: Peter Xu @ 2016-04-15 3:31 UTC (permalink / raw)
To: qemu-devel
Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
jan.kiszka, rkrcmar, alex.williamson, wexu, peterx
Abstract IOAPIC entry parsing logic into a helper function for further
reuse.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
hw/intc/ioapic.c | 83 ++++++++++++++++++++++++++++++++++----------------------
1 file changed, 50 insertions(+), 33 deletions(-)
diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
index 92334a6..84e8948 100644
--- a/hw/intc/ioapic.c
+++ b/hw/intc/ioapic.c
@@ -49,18 +49,56 @@ static IOAPICCommonState *ioapics[MAX_IOAPICS];
/* global variable from ioapic_common.c */
extern int ioapic_no;
+struct ioapic_entry_info {
+ /* fields parsed from IOAPIC entries */
+ uint8_t masked;
+ uint8_t trig_mode;
+ uint16_t dest_idx;
+ uint8_t dest_mode;
+ uint8_t delivery_mode;
+ uint8_t vector;
+
+ /* MSI message generated from above parsed fields */
+ uint32_t addr;
+ uint32_t data;
+};
+
+static void ioapic_entry_parse(uint64_t entry, struct ioapic_entry_info *info)
+{
+ bzero(info, sizeof(*info));
+ info->masked = (entry >> IOAPIC_LVT_MASKED_SHIFT) & 1;
+ info->trig_mode = (entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1;
+ /*
+ * By default, this would be dest_id[8] + reserved[8]. When IR
+ * is enabled, this would be interrupt_index[15] +
+ * interrupt_format[1]. This field never means anything, but
+ * only used to generate corresponding MSI.
+ */
+ info->dest_idx = (entry >> IOAPIC_LVT_DEST_IDX_SHIFT) & 0xffff;
+ info->dest_mode = (entry >> IOAPIC_LVT_DEST_MODE_SHIFT) & 1;
+ info->delivery_mode = (entry >> IOAPIC_LVT_DELIV_MODE_SHIFT) \
+ & IOAPIC_DM_MASK;
+ if (info->delivery_mode == IOAPIC_DM_EXTINT) {
+ info->vector = pic_read_irq(isa_pic);
+ } else {
+ info->vector = entry & IOAPIC_VECTOR_MASK;
+ }
+
+ info->addr = APIC_DEFAULT_ADDRESS | \
+ (info->dest_idx << MSI_ADDR_DEST_IDX_SHIFT) | \
+ (info->dest_mode << MSI_ADDR_DEST_MODE_SHIFT);
+ info->data = (info->vector << MSI_DATA_VECTOR_SHIFT) | \
+ (info->trig_mode << MSI_DATA_TRIGGER_SHIFT) | \
+ (info->delivery_mode << MSI_DATA_DELIVERY_MODE_SHIFT);
+}
+
static void ioapic_service(IOAPICCommonState *s)
{
AddressSpace *ioapic_as = PC_MACHINE(qdev_get_machine())->ioapic_as;
- uint32_t addr, data;
+ struct ioapic_entry_info info;
uint8_t i;
- uint8_t trig_mode;
- uint8_t vector;
- uint8_t delivery_mode;
uint32_t mask;
uint64_t entry;
- uint16_t dest_idx;
- uint8_t dest_mode;
for (i = 0; i < IOAPIC_NUM_PINS; i++) {
mask = 1 << i;
@@ -68,33 +106,18 @@ static void ioapic_service(IOAPICCommonState *s)
int coalesce = 0;
entry = s->ioredtbl[i];
- if (!(entry & IOAPIC_LVT_MASKED)) {
- trig_mode = ((entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1);
- /*
- * By default, this would be dest_id[8] +
- * reserved[8]. When IR is enabled, this would be
- * interrupt_index[15] + interrupt_format[1]. This
- * field never means anything, but only used to
- * generate corresponding MSI.
- */
- dest_idx = entry >> IOAPIC_LVT_DEST_IDX_SHIFT;
- dest_mode = (entry >> IOAPIC_LVT_DEST_MODE_SHIFT) & 1;
- delivery_mode =
- (entry >> IOAPIC_LVT_DELIV_MODE_SHIFT) & IOAPIC_DM_MASK;
- if (trig_mode == IOAPIC_TRIGGER_EDGE) {
+ ioapic_entry_parse(entry, &info);
+ if (!info.masked) {
+ if (info.trig_mode == IOAPIC_TRIGGER_EDGE) {
s->irr &= ~mask;
} else {
coalesce = s->ioredtbl[i] & IOAPIC_LVT_REMOTE_IRR;
s->ioredtbl[i] |= IOAPIC_LVT_REMOTE_IRR;
}
- if (delivery_mode == IOAPIC_DM_EXTINT) {
- vector = pic_read_irq(isa_pic);
- } else {
- vector = entry & IOAPIC_VECTOR_MASK;
- }
+
#ifdef CONFIG_KVM
if (kvm_irqchip_is_split()) {
- if (trig_mode == IOAPIC_TRIGGER_EDGE) {
+ if (info.trig_mode == IOAPIC_TRIGGER_EDGE) {
kvm_set_irq(kvm_state, i, 1);
kvm_set_irq(kvm_state, i, 0);
} else {
@@ -111,13 +134,7 @@ static void ioapic_service(IOAPICCommonState *s)
* the IOAPIC message into a MSI one, and its
* address space will decide whether we need a
* translation. */
- addr = APIC_DEFAULT_ADDRESS | \
- (dest_idx << MSI_ADDR_DEST_IDX_SHIFT) |
- (dest_mode << MSI_ADDR_DEST_MODE_SHIFT);
- data = (vector << MSI_DATA_VECTOR_SHIFT) |
- (trig_mode << MSI_DATA_TRIGGER_SHIFT) |
- (delivery_mode << MSI_DATA_DELIVERY_MODE_SHIFT);
- stl_le_phys(ioapic_as, addr, data);
+ stl_le_phys(ioapic_as, info.addr, info.data);
}
}
}
--
2.4.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [Qemu-devel] [PATCH v3 12/13] q35: ioapic: add support for split irqchip and irqfd
2016-04-15 3:31 [Qemu-devel] [PATCH v3 00/13] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
` (10 preceding siblings ...)
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 11/13] ioapic: introduce ioapic_entry_parse() helper Peter Xu
@ 2016-04-15 3:31 ` Peter Xu
2016-04-15 15:31 ` Radim Krčmář
2016-04-17 2:44 ` Jan Kiszka
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 13/13] q35: add "int-remap" flag to enable intr Peter Xu
2016-04-17 2:26 ` [Qemu-devel] [PATCH v3 00/13] IOMMU: Enable interrupt remapping for Intel IOMMU Jan Kiszka
13 siblings, 2 replies; 24+ messages in thread
From: Peter Xu @ 2016-04-15 3:31 UTC (permalink / raw)
To: qemu-devel
Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
jan.kiszka, rkrcmar, alex.williamson, wexu, peterx
This patch allows Intel IR work with splitted irqchip. Two more fields
are added to IOAPICCommonState to support the translation process (For
future AMD IR support, we will need to provide another AMD-specific
callback for int_remap()). In split irqchip mode, IOAPIC is working in
user space, only update kernel irq routes when entry changed. When IR is
enabled, we directly update the kernel with translated messages. It
works just like a kernel cache for the remapping entries.
Since KVM irqfd is using kernel gsi routes to deliver interrupts, as
long as we can support split irqchip, we will support irqfd as
well. Also, since kernel gsi routes will cache translated interrupts,
irqfd delivery will not suffer from any performance impact due to IR.
And, since we supported irqfd, vhost devices will be able to work
seamlessly with IR now. Logically this should contain both vhost-net and
vhost-user case.
Here we avoided capturing IOMMU IR invalidation, based on the assumption
that, guest kernel will always first update IR entry, then IOAPIC
entry. As long as guest follows this order to update IOAPIC entries, we
should be safe.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
hw/i386/intel_iommu.c | 9 +++++++--
hw/intc/ioapic.c | 39 ++++++++++++++-------------------------
hw/intc/ioapic_common.c | 4 ++++
include/hw/i386/intel_iommu.h | 2 ++
include/hw/i386/ioapic_internal.h | 5 +++++
5 files changed, 32 insertions(+), 27 deletions(-)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 68ebc1e..104afeb 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2077,9 +2077,9 @@ static int vtd_interrupt_remap_msi(IntelIOMMUState *iommu,
uint16_t index = 0;
VTDIrq irq = {0};
- assert(iommu && origin && translated);
+ assert(origin && translated);
- if (!iommu->intr_enabled) {
+ if (!iommu || !iommu->intr_enabled) {
memcpy(translated, origin, sizeof(*origin));
return 0;
}
@@ -2143,6 +2143,11 @@ static int vtd_interrupt_remap_msi(IntelIOMMUState *iommu,
return 0;
}
+int vtd_int_remap(void *iommu, MSIMessage *src, MSIMessage *dst)
+{
+ return vtd_interrupt_remap_msi(iommu, src, dst);
+}
+
static uint64_t vtd_mem_ir_read(void *opaque, hwaddr addr, unsigned size)
{
uint64_t data = 0;
diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
index 84e8948..b993bd0 100644
--- a/hw/intc/ioapic.c
+++ b/hw/intc/ioapic.c
@@ -182,34 +182,23 @@ static void ioapic_update_kvm_routes(IOAPICCommonState *s)
{
#ifdef CONFIG_KVM
int i;
+ int ret;
if (kvm_irqchip_is_split()) {
for (i = 0; i < IOAPIC_NUM_PINS; i++) {
- uint64_t entry = s->ioredtbl[i];
- uint8_t trig_mode;
- uint8_t delivery_mode;
- uint8_t dest;
- uint8_t dest_mode;
- uint64_t pin_polarity;
- MSIMessage msg;
-
- trig_mode = ((entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1);
- dest = entry >> IOAPIC_LVT_DEST_SHIFT;
- dest_mode = (entry >> IOAPIC_LVT_DEST_MODE_SHIFT) & 1;
- pin_polarity = (entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1;
- delivery_mode =
- (entry >> IOAPIC_LVT_DELIV_MODE_SHIFT) & IOAPIC_DM_MASK;
-
- msg.address = APIC_DEFAULT_ADDRESS;
- msg.address |= dest_mode << 2;
- msg.address |= dest << 12;
-
- msg.data = entry & IOAPIC_VECTOR_MASK;
- msg.data |= delivery_mode << APIC_DELIVERY_MODE_SHIFT;
- msg.data |= pin_polarity << APIC_POLARITY_SHIFT;
- msg.data |= trig_mode << APIC_TRIG_MODE_SHIFT;
-
- kvm_irqchip_update_msi_route(kvm_state, i, msg, NULL);
+ MSIMessage src, dst;
+ struct ioapic_entry_info info;
+ ioapic_entry_parse(s->ioredtbl[i], &info);
+ src.address = info.addr;
+ src.data = info.data;
+ /* We update kernel irqchip routes with translated
+ * results. */
+ ret = s->int_remap(s->iommu, &src, &dst);
+ if (ret) {
+ DPRINTF("Int remap failed: %d, drop interrupt\n", ret);
+ continue;
+ }
+ kvm_irqchip_update_msi_route(kvm_state, i, dst, NULL);
}
kvm_irqchip_commit_routes(kvm_state);
}
diff --git a/hw/intc/ioapic_common.c b/hw/intc/ioapic_common.c
index 1b7ec5e..d583398 100644
--- a/hw/intc/ioapic_common.c
+++ b/hw/intc/ioapic_common.c
@@ -25,6 +25,7 @@
#include "hw/i386/ioapic.h"
#include "hw/i386/ioapic_internal.h"
#include "hw/sysbus.h"
+#include "hw/i386/intel_iommu.h"
/* ioapic_no count start from 0 to MAX_IOAPICS,
* remove as static variable from ioapic_common_init.
@@ -135,6 +136,9 @@ static void ioapic_common_realize(DeviceState *dev, Error **errp)
info = IOAPIC_COMMON_GET_CLASS(s);
info->realize(dev, errp);
+ s->iommu = (void *)vtd_iommu_get();
+ s->int_remap = vtd_int_remap;
+
sysbus_init_mmio(SYS_BUS_DEVICE(s), &s->io_memory);
ioapic_no++;
}
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index f824957..25a9306 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -246,5 +246,7 @@ struct IntelIOMMUState {
VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn);
/* Get default IOMMU object */
IntelIOMMUState *vtd_iommu_get(void);
+/* Interrupt remapping hook function */
+int vtd_int_remap(void *iommu, MSIMessage *src, MSIMessage *dst);
#endif
diff --git a/include/hw/i386/ioapic_internal.h b/include/hw/i386/ioapic_internal.h
index d279f2d..d9070cf 100644
--- a/include/hw/i386/ioapic_internal.h
+++ b/include/hw/i386/ioapic_internal.h
@@ -102,6 +102,11 @@ struct IOAPICCommonState {
uint8_t ioregsel;
uint32_t irr;
uint64_t ioredtbl[IOAPIC_NUM_PINS];
+
+ /* This should be the IOMMU that owns the IOAPIC. */
+ void *iommu;
+ /* Interrupt remapping callback */
+ int (*int_remap)(void *iommu, MSIMessage *src, MSIMessage *dst);
};
void ioapic_reset_common(DeviceState *dev);
--
2.4.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [Qemu-devel] [PATCH v3 12/13] q35: ioapic: add support for split irqchip and irqfd
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 12/13] q35: ioapic: add support for split irqchip and irqfd Peter Xu
@ 2016-04-15 15:31 ` Radim Krčmář
2016-04-18 3:30 ` Peter Xu
2016-04-17 2:44 ` Jan Kiszka
1 sibling, 1 reply; 24+ messages in thread
From: Radim Krčmář @ 2016-04-15 15:31 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, imammedo, rth, ehabkost, jasowang, marcel, mst,
pbonzini, jan.kiszka, alex.williamson, wexu
2016-04-15 11:31+0800, Peter Xu:
> This patch allows Intel IR work with splitted irqchip. Two more fields
> are added to IOAPICCommonState to support the translation process (For
> future AMD IR support, we will need to provide another AMD-specific
> callback for int_remap()). In split irqchip mode, IOAPIC is working in
> user space, only update kernel irq routes when entry changed. When IR is
> enabled, we directly update the kernel with translated messages. It
> works just like a kernel cache for the remapping entries.
(Patches are nice, thanks, I'll be looking how to slap EIM on top.)
> Since KVM irqfd is using kernel gsi routes to deliver interrupts, as
> long as we can support split irqchip, we will support irqfd as
> well. Also, since kernel gsi routes will cache translated interrupts,
> irqfd delivery will not suffer from any performance impact due to IR.
>
> And, since we supported irqfd, vhost devices will be able to work
> seamlessly with IR now. Logically this should contain both vhost-net and
> vhost-user case.
Doesn't look that all callers of kvm_irqchip_update_msi_route() are IR
aware. I think wrapping the remapping around it might be easiest,
kvm_arch_fixup_msi_route() is another candidate.
> Here we avoided capturing IOMMU IR invalidation, based on the assumption
> that, guest kernel will always first update IR entry, then IOAPIC
> entry. As long as guest follows this order to update IOAPIC entries, we
> should be safe.
The OS configures IOAPIC, MSI and IR independently. e.g. changing the
destination LAPIC only updates IRTE and can happen anytime.
You have to update kvm_irqchip routes when IRTE changes.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Qemu-devel] [PATCH v3 12/13] q35: ioapic: add support for split irqchip and irqfd
2016-04-15 15:31 ` Radim Krčmář
@ 2016-04-18 3:30 ` Peter Xu
0 siblings, 0 replies; 24+ messages in thread
From: Peter Xu @ 2016-04-18 3:30 UTC (permalink / raw)
To: Radim Krčmář
Cc: qemu-devel, imammedo, rth, ehabkost, jasowang, marcel, mst,
pbonzini, jan.kiszka, alex.williamson, wexu
On Fri, Apr 15, 2016 at 05:31:58PM +0200, Radim Krčmář wrote:
> 2016-04-15 11:31+0800, Peter Xu:
> > This patch allows Intel IR work with splitted irqchip. Two more fields
> > are added to IOAPICCommonState to support the translation process (For
> > future AMD IR support, we will need to provide another AMD-specific
> > callback for int_remap()). In split irqchip mode, IOAPIC is working in
> > user space, only update kernel irq routes when entry changed. When IR is
> > enabled, we directly update the kernel with translated messages. It
> > works just like a kernel cache for the remapping entries.
>
> (Patches are nice, thanks, I'll be looking how to slap EIM on top.)
>
> > Since KVM irqfd is using kernel gsi routes to deliver interrupts, as
> > long as we can support split irqchip, we will support irqfd as
> > well. Also, since kernel gsi routes will cache translated interrupts,
> > irqfd delivery will not suffer from any performance impact due to IR.
> >
> > And, since we supported irqfd, vhost devices will be able to work
> > seamlessly with IR now. Logically this should contain both vhost-net and
> > vhost-user case.
>
> Doesn't look that all callers of kvm_irqchip_update_msi_route() are IR
> aware. I think wrapping the remapping around it might be easiest,
> kvm_arch_fixup_msi_route() is another candidate.
You are right, failed to find this during smoke test. It seems that
kvm_arch_fixup_msi_route() is a good place. Thanks!
>
> > Here we avoided capturing IOMMU IR invalidation, based on the assumption
> > that, guest kernel will always first update IR entry, then IOAPIC
> > entry. As long as guest follows this order to update IOAPIC entries, we
> > should be safe.
>
> The OS configures IOAPIC, MSI and IR independently. e.g. changing the
> destination LAPIC only updates IRTE and can happen anytime.
> You have to update kvm_irqchip routes when IRTE changes.
Thanks to point out. Will add one more patch to do that.
-- peterx
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Qemu-devel] [PATCH v3 12/13] q35: ioapic: add support for split irqchip and irqfd
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 12/13] q35: ioapic: add support for split irqchip and irqfd Peter Xu
2016-04-15 15:31 ` Radim Krčmář
@ 2016-04-17 2:44 ` Jan Kiszka
2016-04-17 9:45 ` Michael S. Tsirkin
1 sibling, 1 reply; 24+ messages in thread
From: Jan Kiszka @ 2016-04-17 2:44 UTC (permalink / raw)
To: Peter Xu, qemu-devel
Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini, rkrcmar,
alex.williamson, wexu
On 2016-04-14 20:31, Peter Xu wrote:
> This patch allows Intel IR work with splitted irqchip. Two more fields
> are added to IOAPICCommonState to support the translation process (For
> future AMD IR support, we will need to provide another AMD-specific
> callback for int_remap()). In split irqchip mode, IOAPIC is working in
> user space, only update kernel irq routes when entry changed. When IR is
> enabled, we directly update the kernel with translated messages. It
> works just like a kernel cache for the remapping entries.
>
> Since KVM irqfd is using kernel gsi routes to deliver interrupts, as
> long as we can support split irqchip, we will support irqfd as
> well. Also, since kernel gsi routes will cache translated interrupts,
> irqfd delivery will not suffer from any performance impact due to IR.
>
> And, since we supported irqfd, vhost devices will be able to work
> seamlessly with IR now. Logically this should contain both vhost-net and
> vhost-user case.
>
> Here we avoided capturing IOMMU IR invalidation, based on the assumption
> that, guest kernel will always first update IR entry, then IOAPIC
> entry. As long as guest follows this order to update IOAPIC entries, we
> should be safe.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
> hw/i386/intel_iommu.c | 9 +++++++--
> hw/intc/ioapic.c | 39 ++++++++++++++-------------------------
> hw/intc/ioapic_common.c | 4 ++++
> include/hw/i386/intel_iommu.h | 2 ++
> include/hw/i386/ioapic_internal.h | 5 +++++
> 5 files changed, 32 insertions(+), 27 deletions(-)
>
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 68ebc1e..104afeb 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -2077,9 +2077,9 @@ static int vtd_interrupt_remap_msi(IntelIOMMUState *iommu,
> uint16_t index = 0;
> VTDIrq irq = {0};
>
> - assert(iommu && origin && translated);
> + assert(origin && translated);
>
> - if (!iommu->intr_enabled) {
> + if (!iommu || !iommu->intr_enabled) {
> memcpy(translated, origin, sizeof(*origin));
> return 0;
> }
> @@ -2143,6 +2143,11 @@ static int vtd_interrupt_remap_msi(IntelIOMMUState *iommu,
> return 0;
> }
>
> +int vtd_int_remap(void *iommu, MSIMessage *src, MSIMessage *dst)
> +{
> + return vtd_interrupt_remap_msi(iommu, src, dst);
> +}
> +
> static uint64_t vtd_mem_ir_read(void *opaque, hwaddr addr, unsigned size)
> {
> uint64_t data = 0;
> diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
> index 84e8948..b993bd0 100644
> --- a/hw/intc/ioapic.c
> +++ b/hw/intc/ioapic.c
> @@ -182,34 +182,23 @@ static void ioapic_update_kvm_routes(IOAPICCommonState *s)
> {
> #ifdef CONFIG_KVM
> int i;
> + int ret;
>
> if (kvm_irqchip_is_split()) {
> for (i = 0; i < IOAPIC_NUM_PINS; i++) {
> - uint64_t entry = s->ioredtbl[i];
> - uint8_t trig_mode;
> - uint8_t delivery_mode;
> - uint8_t dest;
> - uint8_t dest_mode;
> - uint64_t pin_polarity;
> - MSIMessage msg;
> -
> - trig_mode = ((entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1);
> - dest = entry >> IOAPIC_LVT_DEST_SHIFT;
> - dest_mode = (entry >> IOAPIC_LVT_DEST_MODE_SHIFT) & 1;
> - pin_polarity = (entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1;
> - delivery_mode =
> - (entry >> IOAPIC_LVT_DELIV_MODE_SHIFT) & IOAPIC_DM_MASK;
> -
> - msg.address = APIC_DEFAULT_ADDRESS;
> - msg.address |= dest_mode << 2;
> - msg.address |= dest << 12;
> -
> - msg.data = entry & IOAPIC_VECTOR_MASK;
> - msg.data |= delivery_mode << APIC_DELIVERY_MODE_SHIFT;
> - msg.data |= pin_polarity << APIC_POLARITY_SHIFT;
> - msg.data |= trig_mode << APIC_TRIG_MODE_SHIFT;
> -
> - kvm_irqchip_update_msi_route(kvm_state, i, msg, NULL);
> + MSIMessage src, dst;
> + struct ioapic_entry_info info;
> + ioapic_entry_parse(s->ioredtbl[i], &info);
> + src.address = info.addr;
> + src.data = info.data;
> + /* We update kernel irqchip routes with translated
> + * results. */
> + ret = s->int_remap(s->iommu, &src, &dst);
> + if (ret) {
> + DPRINTF("Int remap failed: %d, drop interrupt\n", ret);
> + continue;
> + }
> + kvm_irqchip_update_msi_route(kvm_state, i, dst, NULL);
The need to hook here makes me wonder if we can't inject IOAPIC
interrupts via KVM_SIGNAL_MSI (abstracted by kvm_irqchip_send_msi, but
that will pick the fast-path on kernels supporting split irqchip)
instead of open-coding the route changes. If we translated the IOAPIC
outputs always into MSIs, the need for special-casing split irqchip
would be gone, and the need for hooking here for IR just as well.
Jan
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Qemu-devel] [PATCH v3 12/13] q35: ioapic: add support for split irqchip and irqfd
2016-04-17 2:44 ` Jan Kiszka
@ 2016-04-17 9:45 ` Michael S. Tsirkin
2016-04-18 8:55 ` Peter Xu
0 siblings, 1 reply; 24+ messages in thread
From: Michael S. Tsirkin @ 2016-04-17 9:45 UTC (permalink / raw)
To: Jan Kiszka
Cc: Peter Xu, qemu-devel, imammedo, rth, ehabkost, jasowang, marcel,
pbonzini, rkrcmar, alex.williamson, wexu
On Sat, Apr 16, 2016 at 07:44:12PM -0700, Jan Kiszka wrote:
> On 2016-04-14 20:31, Peter Xu wrote:
> > This patch allows Intel IR work with splitted irqchip. Two more fields
> > are added to IOAPICCommonState to support the translation process (For
> > future AMD IR support, we will need to provide another AMD-specific
> > callback for int_remap()). In split irqchip mode, IOAPIC is working in
> > user space, only update kernel irq routes when entry changed. When IR is
> > enabled, we directly update the kernel with translated messages. It
> > works just like a kernel cache for the remapping entries.
> >
> > Since KVM irqfd is using kernel gsi routes to deliver interrupts, as
> > long as we can support split irqchip, we will support irqfd as
> > well. Also, since kernel gsi routes will cache translated interrupts,
> > irqfd delivery will not suffer from any performance impact due to IR.
> >
> > And, since we supported irqfd, vhost devices will be able to work
> > seamlessly with IR now. Logically this should contain both vhost-net and
> > vhost-user case.
> >
> > Here we avoided capturing IOMMU IR invalidation, based on the assumption
> > that, guest kernel will always first update IR entry, then IOAPIC
> > entry. As long as guest follows this order to update IOAPIC entries, we
> > should be safe.
> >
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> > hw/i386/intel_iommu.c | 9 +++++++--
> > hw/intc/ioapic.c | 39 ++++++++++++++-------------------------
> > hw/intc/ioapic_common.c | 4 ++++
> > include/hw/i386/intel_iommu.h | 2 ++
> > include/hw/i386/ioapic_internal.h | 5 +++++
> > 5 files changed, 32 insertions(+), 27 deletions(-)
> >
> > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > index 68ebc1e..104afeb 100644
> > --- a/hw/i386/intel_iommu.c
> > +++ b/hw/i386/intel_iommu.c
> > @@ -2077,9 +2077,9 @@ static int vtd_interrupt_remap_msi(IntelIOMMUState *iommu,
> > uint16_t index = 0;
> > VTDIrq irq = {0};
> >
> > - assert(iommu && origin && translated);
> > + assert(origin && translated);
> >
> > - if (!iommu->intr_enabled) {
> > + if (!iommu || !iommu->intr_enabled) {
> > memcpy(translated, origin, sizeof(*origin));
> > return 0;
> > }
> > @@ -2143,6 +2143,11 @@ static int vtd_interrupt_remap_msi(IntelIOMMUState *iommu,
> > return 0;
> > }
> >
> > +int vtd_int_remap(void *iommu, MSIMessage *src, MSIMessage *dst)
> > +{
> > + return vtd_interrupt_remap_msi(iommu, src, dst);
> > +}
> > +
> > static uint64_t vtd_mem_ir_read(void *opaque, hwaddr addr, unsigned size)
> > {
> > uint64_t data = 0;
> > diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
> > index 84e8948..b993bd0 100644
> > --- a/hw/intc/ioapic.c
> > +++ b/hw/intc/ioapic.c
> > @@ -182,34 +182,23 @@ static void ioapic_update_kvm_routes(IOAPICCommonState *s)
> > {
> > #ifdef CONFIG_KVM
> > int i;
> > + int ret;
> >
> > if (kvm_irqchip_is_split()) {
> > for (i = 0; i < IOAPIC_NUM_PINS; i++) {
> > - uint64_t entry = s->ioredtbl[i];
> > - uint8_t trig_mode;
> > - uint8_t delivery_mode;
> > - uint8_t dest;
> > - uint8_t dest_mode;
> > - uint64_t pin_polarity;
> > - MSIMessage msg;
> > -
> > - trig_mode = ((entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1);
> > - dest = entry >> IOAPIC_LVT_DEST_SHIFT;
> > - dest_mode = (entry >> IOAPIC_LVT_DEST_MODE_SHIFT) & 1;
> > - pin_polarity = (entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1;
> > - delivery_mode =
> > - (entry >> IOAPIC_LVT_DELIV_MODE_SHIFT) & IOAPIC_DM_MASK;
> > -
> > - msg.address = APIC_DEFAULT_ADDRESS;
> > - msg.address |= dest_mode << 2;
> > - msg.address |= dest << 12;
> > -
> > - msg.data = entry & IOAPIC_VECTOR_MASK;
> > - msg.data |= delivery_mode << APIC_DELIVERY_MODE_SHIFT;
> > - msg.data |= pin_polarity << APIC_POLARITY_SHIFT;
> > - msg.data |= trig_mode << APIC_TRIG_MODE_SHIFT;
> > -
> > - kvm_irqchip_update_msi_route(kvm_state, i, msg, NULL);
> > + MSIMessage src, dst;
> > + struct ioapic_entry_info info;
> > + ioapic_entry_parse(s->ioredtbl[i], &info);
> > + src.address = info.addr;
> > + src.data = info.data;
> > + /* We update kernel irqchip routes with translated
> > + * results. */
> > + ret = s->int_remap(s->iommu, &src, &dst);
> > + if (ret) {
> > + DPRINTF("Int remap failed: %d, drop interrupt\n", ret);
> > + continue;
> > + }
> > + kvm_irqchip_update_msi_route(kvm_state, i, dst, NULL);
>
> The need to hook here makes me wonder if we can't inject IOAPIC
> interrupts via KVM_SIGNAL_MSI (abstracted by kvm_irqchip_send_msi, but
> that will pick the fast-path on kernels supporting split irqchip)
> instead of open-coding the route changes. If we translated the IOAPIC
> outputs always into MSIs, the need for special-casing split irqchip
> would be gone, and the need for hooking here for IR just as well.
>
> Jan
Will work for irqfd as well.
--
MST
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Qemu-devel] [PATCH v3 12/13] q35: ioapic: add support for split irqchip and irqfd
2016-04-17 9:45 ` Michael S. Tsirkin
@ 2016-04-18 8:55 ` Peter Xu
2016-04-25 5:00 ` Jan Kiszka
0 siblings, 1 reply; 24+ messages in thread
From: Peter Xu @ 2016-04-18 8:55 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Jan Kiszka, qemu-devel, imammedo, rth, ehabkost, jasowang, marcel,
pbonzini, rkrcmar, alex.williamson, wexu
On Sun, Apr 17, 2016 at 12:45:03PM +0300, Michael S. Tsirkin wrote:
> On Sat, Apr 16, 2016 at 07:44:12PM -0700, Jan Kiszka wrote:
> > On 2016-04-14 20:31, Peter Xu wrote:
[...]
> > > diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
> > > index 84e8948..b993bd0 100644
> > > --- a/hw/intc/ioapic.c
> > > +++ b/hw/intc/ioapic.c
> > > @@ -182,34 +182,23 @@ static void ioapic_update_kvm_routes(IOAPICCommonState *s)
> > > {
> > > #ifdef CONFIG_KVM
> > > int i;
> > > + int ret;
> > >
> > > if (kvm_irqchip_is_split()) {
> > > for (i = 0; i < IOAPIC_NUM_PINS; i++) {
> > > - uint64_t entry = s->ioredtbl[i];
> > > - uint8_t trig_mode;
> > > - uint8_t delivery_mode;
> > > - uint8_t dest;
> > > - uint8_t dest_mode;
> > > - uint64_t pin_polarity;
> > > - MSIMessage msg;
> > > -
> > > - trig_mode = ((entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1);
> > > - dest = entry >> IOAPIC_LVT_DEST_SHIFT;
> > > - dest_mode = (entry >> IOAPIC_LVT_DEST_MODE_SHIFT) & 1;
> > > - pin_polarity = (entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1;
> > > - delivery_mode =
> > > - (entry >> IOAPIC_LVT_DELIV_MODE_SHIFT) & IOAPIC_DM_MASK;
> > > -
> > > - msg.address = APIC_DEFAULT_ADDRESS;
> > > - msg.address |= dest_mode << 2;
> > > - msg.address |= dest << 12;
> > > -
> > > - msg.data = entry & IOAPIC_VECTOR_MASK;
> > > - msg.data |= delivery_mode << APIC_DELIVERY_MODE_SHIFT;
> > > - msg.data |= pin_polarity << APIC_POLARITY_SHIFT;
> > > - msg.data |= trig_mode << APIC_TRIG_MODE_SHIFT;
> > > -
> > > - kvm_irqchip_update_msi_route(kvm_state, i, msg, NULL);
> > > + MSIMessage src, dst;
> > > + struct ioapic_entry_info info;
> > > + ioapic_entry_parse(s->ioredtbl[i], &info);
> > > + src.address = info.addr;
> > > + src.data = info.data;
> > > + /* We update kernel irqchip routes with translated
> > > + * results. */
> > > + ret = s->int_remap(s->iommu, &src, &dst);
> > > + if (ret) {
> > > + DPRINTF("Int remap failed: %d, drop interrupt\n", ret);
> > > + continue;
> > > + }
> > > + kvm_irqchip_update_msi_route(kvm_state, i, dst, NULL);
> >
> > The need to hook here makes me wonder if we can't inject IOAPIC
> > interrupts via KVM_SIGNAL_MSI (abstracted by kvm_irqchip_send_msi, but
> > that will pick the fast-path on kernels supporting split irqchip)
> > instead of open-coding the route changes. If we translated the IOAPIC
> > outputs always into MSIs, the need for special-casing split irqchip
> > would be gone, and the need for hooking here for IR just as well.
Hi, Jan,
IIUC, this can be achieved by removing lines in ioapic_service():
-#ifdef CONFIG_KVM
- if (kvm_irqchip_is_split()) {
- if (trig_mode == IOAPIC_TRIGGER_EDGE) {
- kvm_set_irq(kvm_state, i, 1);
- kvm_set_irq(kvm_state, i, 0);
- } else {
- if (!coalesce) {
- kvm_set_irq(kvm_state, i, 1);
- }
- }
- continue;
- }
-#else
- (void)coalesce;
-#endif
So that QEMU will automatically select the correct way to notify the
interrupt depending on whether "apic" or "kvm-apic" is used.
I suppose this is a good way to do if we are to support split
irqchip only. However, what if we move on to support irqfd and
vhost? If so, we may still need to update kernel entries into
translated ones, right? Or... did I miss anything?
>
> Will work for irqfd as well.
Michael,
Could you help explain how could I support irqfd as well using Jan's
method mentioned above?
Thanks in advance (to both :).
-- peterx
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Qemu-devel] [PATCH v3 12/13] q35: ioapic: add support for split irqchip and irqfd
2016-04-18 8:55 ` Peter Xu
@ 2016-04-25 5:00 ` Jan Kiszka
0 siblings, 0 replies; 24+ messages in thread
From: Jan Kiszka @ 2016-04-25 5:00 UTC (permalink / raw)
To: Peter Xu, Michael S. Tsirkin
Cc: qemu-devel, imammedo, rth, ehabkost, jasowang, marcel, pbonzini,
rkrcmar, alex.williamson, wexu
On 2016-04-18 10:55, Peter Xu wrote:
> On Sun, Apr 17, 2016 at 12:45:03PM +0300, Michael S. Tsirkin wrote:
>> On Sat, Apr 16, 2016 at 07:44:12PM -0700, Jan Kiszka wrote:
>>> On 2016-04-14 20:31, Peter Xu wrote:
> [...]
>>>> diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
>>>> index 84e8948..b993bd0 100644
>>>> --- a/hw/intc/ioapic.c
>>>> +++ b/hw/intc/ioapic.c
>>>> @@ -182,34 +182,23 @@ static void ioapic_update_kvm_routes(IOAPICCommonState *s)
>>>> {
>>>> #ifdef CONFIG_KVM
>>>> int i;
>>>> + int ret;
>>>>
>>>> if (kvm_irqchip_is_split()) {
>>>> for (i = 0; i < IOAPIC_NUM_PINS; i++) {
>>>> - uint64_t entry = s->ioredtbl[i];
>>>> - uint8_t trig_mode;
>>>> - uint8_t delivery_mode;
>>>> - uint8_t dest;
>>>> - uint8_t dest_mode;
>>>> - uint64_t pin_polarity;
>>>> - MSIMessage msg;
>>>> -
>>>> - trig_mode = ((entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1);
>>>> - dest = entry >> IOAPIC_LVT_DEST_SHIFT;
>>>> - dest_mode = (entry >> IOAPIC_LVT_DEST_MODE_SHIFT) & 1;
>>>> - pin_polarity = (entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1;
>>>> - delivery_mode =
>>>> - (entry >> IOAPIC_LVT_DELIV_MODE_SHIFT) & IOAPIC_DM_MASK;
>>>> -
>>>> - msg.address = APIC_DEFAULT_ADDRESS;
>>>> - msg.address |= dest_mode << 2;
>>>> - msg.address |= dest << 12;
>>>> -
>>>> - msg.data = entry & IOAPIC_VECTOR_MASK;
>>>> - msg.data |= delivery_mode << APIC_DELIVERY_MODE_SHIFT;
>>>> - msg.data |= pin_polarity << APIC_POLARITY_SHIFT;
>>>> - msg.data |= trig_mode << APIC_TRIG_MODE_SHIFT;
>>>> -
>>>> - kvm_irqchip_update_msi_route(kvm_state, i, msg, NULL);
>>>> + MSIMessage src, dst;
>>>> + struct ioapic_entry_info info;
>>>> + ioapic_entry_parse(s->ioredtbl[i], &info);
>>>> + src.address = info.addr;
>>>> + src.data = info.data;
>>>> + /* We update kernel irqchip routes with translated
>>>> + * results. */
>>>> + ret = s->int_remap(s->iommu, &src, &dst);
>>>> + if (ret) {
>>>> + DPRINTF("Int remap failed: %d, drop interrupt\n", ret);
>>>> + continue;
>>>> + }
>>>> + kvm_irqchip_update_msi_route(kvm_state, i, dst, NULL);
>>>
>>> The need to hook here makes me wonder if we can't inject IOAPIC
>>> interrupts via KVM_SIGNAL_MSI (abstracted by kvm_irqchip_send_msi, but
>>> that will pick the fast-path on kernels supporting split irqchip)
>>> instead of open-coding the route changes. If we translated the IOAPIC
>>> outputs always into MSIs, the need for special-casing split irqchip
>>> would be gone, and the need for hooking here for IR just as well.
>
> Hi, Jan,
>
> IIUC, this can be achieved by removing lines in ioapic_service():
>
> -#ifdef CONFIG_KVM
> - if (kvm_irqchip_is_split()) {
> - if (trig_mode == IOAPIC_TRIGGER_EDGE) {
> - kvm_set_irq(kvm_state, i, 1);
> - kvm_set_irq(kvm_state, i, 0);
> - } else {
> - if (!coalesce) {
> - kvm_set_irq(kvm_state, i, 1);
> - }
> - }
> - continue;
> - }
> -#else
> - (void)coalesce;
> -#endif
>
> So that QEMU will automatically select the correct way to notify the
> interrupt depending on whether "apic" or "kvm-apic" is used.
>
> I suppose this is a good way to do if we are to support split
> irqchip only. However, what if we move on to support irqfd and
> vhost? If so, we may still need to update kernel entries into
> translated ones, right? Or... did I miss anything?
Right, in-kernel irq sources depend on an already remapped channel
because they deliver directly to the in-kernel APICs. Thus, you will
have to establish routes for those irqfds that reflects th (cached) IRTEs.
Jan
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Qemu-devel] [PATCH v3 13/13] q35: add "int-remap" flag to enable intr
2016-04-15 3:31 [Qemu-devel] [PATCH v3 00/13] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
` (11 preceding siblings ...)
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 12/13] q35: ioapic: add support for split irqchip and irqfd Peter Xu
@ 2016-04-15 3:31 ` Peter Xu
2016-04-17 2:26 ` [Qemu-devel] [PATCH v3 00/13] IOMMU: Enable interrupt remapping for Intel IOMMU Jan Kiszka
13 siblings, 0 replies; 24+ messages in thread
From: Peter Xu @ 2016-04-15 3:31 UTC (permalink / raw)
To: qemu-devel
Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini,
jan.kiszka, rkrcmar, alex.williamson, wexu, peterx
One flag is added to specify whether to enable INTR for emulated
IOMMU. By default, interrupt remapping is not supportted. To enable it,
we should specify something like:
$ qemu-system-x86_64 -M q35,iommu=on,intr=on
To be more clear, the following command:
$ qemu-system-x86_64 -M q35,iommu=on
Will enable IOMMU only, without interrupt remapping support.
Currently, Intel IOMMU IR only support kernel-irqchip={off|split}. We
need to specify either of it in -M as well.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
hw/core/machine.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 276ad61..b00f39f 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -300,6 +300,20 @@ static void machine_set_iommu(Object *obj, bool value, Error **errp)
ms->iommu = value;
}
+static bool machine_get_intr(Object *obj, Error **errp)
+{
+ MachineState *ms = MACHINE(obj);
+
+ return ms->iommu_intr;
+}
+
+static void machine_set_intr(Object *obj, bool value, Error **errp)
+{
+ MachineState *ms = MACHINE(obj);
+
+ ms->iommu_intr = value;
+}
+
static void machine_set_suppress_vmdesc(Object *obj, bool value, Error **errp)
{
MachineState *ms = MACHINE(obj);
@@ -480,6 +494,12 @@ static void machine_initfn(Object *obj)
object_property_set_description(obj, "iommu",
"Set on/off to enable/disable Intel IOMMU (VT-d)",
NULL);
+ object_property_add_bool(obj, "intr", machine_get_intr,
+ machine_set_intr, NULL);
+ object_property_set_description(obj, "intr",
+ "Set on/off to enable/disable IOMMU"
+ " interrupt remapping",
+ NULL);
object_property_add_bool(obj, "suppress-vmdesc",
machine_get_suppress_vmdesc,
machine_set_suppress_vmdesc, NULL);
--
2.4.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [Qemu-devel] [PATCH v3 00/13] IOMMU: Enable interrupt remapping for Intel IOMMU
2016-04-15 3:31 [Qemu-devel] [PATCH v3 00/13] IOMMU: Enable interrupt remapping for Intel IOMMU Peter Xu
` (12 preceding siblings ...)
2016-04-15 3:31 ` [Qemu-devel] [PATCH v3 13/13] q35: add "int-remap" flag to enable intr Peter Xu
@ 2016-04-17 2:26 ` Jan Kiszka
2016-04-18 3:14 ` Peter Xu
13 siblings, 1 reply; 24+ messages in thread
From: Jan Kiszka @ 2016-04-17 2:26 UTC (permalink / raw)
To: Peter Xu, qemu-devel
Cc: imammedo, rth, ehabkost, jasowang, marcel, mst, pbonzini, rkrcmar,
alex.williamson, wexu
Hi Peter,
series no longer applies to git master. What was the baseline? Do you
have a public git repo as well?
Jan
On 2016-04-14 20:31, Peter Xu wrote:
> v3 changes (all patch numbers corresponds to v2):
> - patch 1 (-> v3 patch 13)
> - move to the end of series [Alex]
> - patch 10 (dropped)
> - drop this one, since re-worked on IOAPIC support, so we do not
> need this any more.
> - patch 12 (-> v3 patch 10)
> - leverage MSI path for IOAPIC IR [Jan]
> - patch 13 (v3 -> patch 9)
> - remove vtd_interrupt_remap_msi() declaration by reordering the
> functions [mst]
> - vtd_generate_msi_message(): init msg using {}, remove FIXME
> [mst]
> - new patches
> - v3 patch 11: introduce ioapic_entry_parse() helper function
> - v3 patch 12: add support for kernel-irqchip=split. This needs
> more reviews, logically this should enable lots of things:
> splitted irqchip, irqfd, vhost, and irqfd support for
> passthrough devices (not tested). Please refer to the patch for
> more information.
>
> v2 changes:
> - patch 1
> - rename "int_remap" to "intr" in several places [Marcel]
> - remove "Intel" specific words in desc or commit message, prepare
> itself with further AMD support [Marcel]
> - avoid using object_property_get_bool() [Marcel]
> - patch 5
> - use PCI bus number 0xff rather than 0xf0 for the IOAPIC scope
> definition. (please let me know if anyone knows how I can avoid
> user using PCI bus number 0xff... TIA)
> - patch 11
> - fix comments [Marcel]
> - all
> - remove intr_supported variable [Marcel]
>
> This patchset provide very basic functionalities for interrupt
> remapping (IR) support of the emulated Intel IOMMU device.
>
> By default, IR is disabled to be better compatible with current
> QEMU. To enable IR, we can using the following command to boot a
> IR-supported VM with virtio-net device with vhost (still do not
> support kvm-ioapic, so we need to specify kernel-irqchip={split|off}
> here):
>
> $ qemu-system-x86_64 -M q35,iommu=on,intr=on,kernel-irqchip=split \
> -enable-kvm -m 1024 \
> -netdev tap,id=net0,vhost=on \
> -device virtio-net-pci,netdev=user.0 \
> -monitor telnet::3333,server,nowait \
> /var/lib/libvirt/images/vm1.qcow2
>
> When guest boots, we can verify whether IR enabled by grepping the
> dmesg like:
>
> [root@localhost ~]# journalctl -k | grep "DMAR-IR"
> Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: IOAPIC id 0 under DRHD base 0xfed90000 IOMMU 0
> Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: Enabled IRQ remapping in xapic mode
>
> Currently supported devices:
>
> - Emulated/Splitted irqchip
> - Generic PCI Devices
> - vhost devices
> - pass through device support? Not tested, but suppose it should work.
>
> TODO List:
>
> - kvm-ioapic support (?)
> - EIM support
> - IR fault reporting
> - source-id validation for IRTE
> - IRTE cache and better queued invalidation
> - migration support (for IOMMU as general?)
> - more?
>
> Peter Xu (13):
> intel_iommu: allow queued invalidation for IR
> intel_iommu: set IR bit for ECAP register
> acpi: add DMAR scope definition for root IOAPIC
> intel_iommu: define interrupt remap table addr register
> intel_iommu: handle interrupt remap enable
> intel_iommu: define several structs for IOMMU IR
> intel_iommu: provide helper function vtd_get_iommu
> intel_iommu: add IR translation faults defines
> intel_iommu: Add support for PCI MSI remap
> q35: ioapic: add support for emulated IOAPIC IR
> ioapic: introduce ioapic_entry_parse() helper
> q35: ioapic: add support for split irqchip and irqfd
> q35: add "int-remap" flag to enable intr
>
> hw/core/machine.c | 20 +++
> hw/i386/acpi-build.c | 24 ++-
> hw/i386/intel_iommu.c | 343 +++++++++++++++++++++++++++++++++++++-
> hw/i386/intel_iommu_internal.h | 23 +++
> hw/i386/pc.c | 3 +
> hw/intc/ioapic.c | 116 ++++++++-----
> hw/intc/ioapic_common.c | 4 +
> hw/pci-host/q35.c | 4 +
> include/hw/acpi/acpi-defs.h | 15 ++
> include/hw/i386/apic-msidef.h | 1 +
> include/hw/i386/intel_iommu.h | 117 +++++++++++++
> include/hw/i386/ioapic_internal.h | 6 +
> include/hw/i386/pc.h | 4 +
> include/hw/pci-host/q35.h | 9 +
> 14 files changed, 635 insertions(+), 54 deletions(-)
>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Qemu-devel] [PATCH v3 00/13] IOMMU: Enable interrupt remapping for Intel IOMMU
2016-04-17 2:26 ` [Qemu-devel] [PATCH v3 00/13] IOMMU: Enable interrupt remapping for Intel IOMMU Jan Kiszka
@ 2016-04-18 3:14 ` Peter Xu
0 siblings, 0 replies; 24+ messages in thread
From: Peter Xu @ 2016-04-18 3:14 UTC (permalink / raw)
To: Jan Kiszka
Cc: qemu-devel, imammedo, rth, ehabkost, jasowang, marcel, mst,
pbonzini, rkrcmar, alex.williamson, wexu
On Sat, Apr 16, 2016 at 07:26:50PM -0700, Jan Kiszka wrote:
> Hi Peter,
>
> series no longer applies to git master. What was the baseline? Do you
> have a public git repo as well?
Yep, since I missed the first patch. Please check v4 later. Sorry!
-- peterx
^ permalink raw reply [flat|nested] 24+ messages in thread