* [PATCH v9 0/5] Support x2APIC mode with TCG accelerator
@ 2023-10-24 15:21 Bui Quang Minh
2023-10-24 15:21 ` [PATCH v9 1/5] i386/tcg: implement x2APIC registers MSR access Bui Quang Minh
` (5 more replies)
0 siblings, 6 replies; 15+ messages in thread
From: Bui Quang Minh @ 2023-10-24 15:21 UTC (permalink / raw)
To: qemu-devel
Cc: David Woodhouse, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Michael S . Tsirkin, Marcel Apfelbaum,
Igor Mammedov, Alex Bennée, Joao Martins, Peter Xu,
Jason Wang, Philippe Mathieu-Daudé, Phil Dennis-Jordan,
Bui Quang Minh
Hi everyone,
This series implements x2APIC mode in userspace local APIC and the
RDMSR/WRMSR helper to access x2APIC registers in x2APIC mode. Intel iommu
and AMD iommu are adjusted to support x2APIC interrupt remapping. With this
series, we can now boot Linux kernel into x2APIC mode with TCG accelerator
using either Intel or AMD iommu.
Testing to boot my own built Linux 6.3.0-rc2, the kernel successfully boot
with enabled x2APIC and can enumerate CPU with APIC ID 257
Using Intel IOMMU
qemu/build/qemu-system-x86_64 \
-smp 2,maxcpus=260 \
-cpu qemu64,x2apic=on \
-machine q35 \
-device intel-iommu,intremap=on,eim=on \
-device qemu64-x86_64-cpu,x2apic=on,core-id=257,socket-id=0,thread-id=0 \
-m 2G \
-kernel $KERNEL_DIR \
-append "nokaslr console=ttyS0 root=/dev/sda earlyprintk=serial net.ifnames=0" \
-drive file=$IMAGE_DIR,format=raw \
-nographic \
-s
Using AMD IOMMU
qemu/build/qemu-system-x86_64 \
-smp 2,maxcpus=260 \
-cpu qemu64,x2apic=on \
-machine q35 \
-device amd-iommu,intremap=on,xtsup=on \
-device qemu64-x86_64-cpu,x2apic=on,core-id=257,socket-id=0,thread-id=0 \
-m 2G \
-kernel $KERNEL_DIR \
-append "nokaslr console=ttyS0 root=/dev/sda earlyprintk=serial net.ifnames=0" \
-drive file=$IMAGE_DIR,format=raw \
-nographic \
-s
Testing the emulated userspace APIC with kvm-unit-tests, disable test
device with this patch
diff --git a/lib/x86/fwcfg.c b/lib/x86/fwcfg.c
index 1734afb..f56fe1c 100644
--- a/lib/x86/fwcfg.c
+++ b/lib/x86/fwcfg.c
@@ -27,6 +27,7 @@ static void read_cfg_override(void)
if ((str = getenv("TEST_DEVICE")))
no_test_device = !atol(str);
+ no_test_device = true;
if ((str = getenv("MEMLIMIT")))
fw_override[FW_CFG_MAX_RAM] = atol(str) * 1024 * 1024;
~ env QEMU=/home/minh/Desktop/oss/qemu/build/qemu-system-x86_64 ACCEL=tcg \
./run_tests.sh -v -g apic
TESTNAME=apic-split TIMEOUT=90s ACCEL=tcg ./x86/run x86/apic.flat -smp 2
-cpu qemu64,+x2apic,+tsc-deadline -machine kernel_irqchip=split FAIL
apic-split (54 tests, 8 unexpected failures, 1 skipped)
TESTNAME=ioapic-split TIMEOUT=90s ACCEL=tcg ./x86/run x86/ioapic.flat -smp
1 -cpu qemu64 -machine kernel_irqchip=split PASS ioapic-split (19 tests)
TESTNAME=x2apic TIMEOUT=30 ACCEL=tcg ./x86/run x86/apic.flat -smp 2 -cpu
qemu64,+x2apic,+tsc-deadline FAIL x2apic (54 tests, 8 unexpected failures,
1 skipped) TESTNAME=xapic TIMEOUT=60 ACCEL=tcg ./x86/run x86/apic.flat -smp
2 -cpu qemu64,-x2apic,+tsc-deadline -machine pit=off FAIL xapic (43 tests,
6 unexpected failures, 2 skipped)
FAIL: apic_disable: *0xfee00030: 50014
FAIL: apic_disable: *0xfee00080: f0
FAIL: apic_disable: *0xfee00030: 50014
FAIL: apic_disable: *0xfee00080: f0
FAIL: apicbase: relocate apic
These errors are because we don't disable MMIO region when switching to
x2APIC and don't support relocate MMIO region yet. This is a problem
because, MMIO region is the same for all CPUs, in order to support these we
need to figure out how to allocate and manage different MMIO regions for
each CPUs. This can be an improvement in the future.
FAIL: nmi-after-sti
FAIL: multiple nmi
These errors are in the way we handle CPU_INTERRUPT_NMI in core TCG.
FAIL: TMCCT should stay at zero
This error is related to APIC timer which should be addressed in separate
patch.
Version 9 changes,
- Patch 1:
+ Create apic_msr_read/write which is a small wrapper around
apic_register_read/write that have additional x2apic mode check
- Patch 2:
+ Remove raise_exception_ra which is is TCG specific. Instead, return -1
and let the accelerator raise the appropriate exception
+ Refactor apic_get_delivery_bitmask a little bit to reduce line length
+ Move cpu_has_x2apic_feature and cpu_set_apic_feature from patch 3 to
patch 2 so that patch 2 can be compiled without patch 3
- Patch 3:
+ set_base in APICCommonClass now returns an int to indicate error
+ Remove raise_exception_ra in apic_set base which is is TCG specific.
Instead, return -1 and let the accelerator raise the appropriate
exception
Version 8 changes,
- Patch 2, 4:
+ Rebase to master and resolve conflicts in these 2 patches
Version 7 changes,
- Patch 4:
+ If eim=on, keep checking if kvm x2APIC is enabled when kernel-irqchip
is split
Version 6 changes,
- Patch 5:
+ Make all places use the amdvi_extended_feature_register to get extended
feature register
Version 5 changes,
- Patch 3:
+ Rebase to master and fix conflict
- Patch 5:
+ Create a helper function to get amdvi extended feature register instead
of storing it in AMDVIState
Version 4 changes,
- Patch 5:
+ Instead of replacing IVHD type 0x10 with type 0x11, export both types
for backward compatibility with old guest operating system
+ Flip the xtsup feature check condition in amdvi_int_remap_ga for
readability
Version 3 changes,
- Patch 2:
+ Allow APIC ID > 255 only when x2APIC feature is supported on CPU
+ Make physical destination mode IPI which has destination id 0xffffffff
a broadcast to xAPIC CPUs
+ Make cluster address 0xf in cluster model of xAPIC logical destination
mode a broadcast to all clusters
+ Create new extended_log_dest to store APIC_LDR information in x2APIC
instead of extending log_dest for backward compatibility in vmstate
Version 2 changes,
- Add support for APIC ID larger than 255
- Adjust AMD iommu for x2APIC suuport
- Reorganize and split patch 1,2 into patch 1,2,3 in version 2
Thanks,
Quang Minh.
Bui Quang Minh (5):
i386/tcg: implement x2APIC registers MSR access
apic: add support for x2APIC mode
apic, i386/tcg: add x2apic transitions
intel_iommu: allow Extended Interrupt Mode when using userspace APIC
amd_iommu: report x2APIC support to the operating system
hw/i386/acpi-build.c | 129 +++++---
hw/i386/amd_iommu.c | 29 +-
hw/i386/amd_iommu.h | 16 +-
hw/i386/intel_iommu.c | 6 +-
hw/i386/kvm/apic.c | 3 +-
hw/i386/x86.c | 6 +-
hw/i386/xen/xen_apic.c | 3 +-
hw/intc/apic.c | 464 +++++++++++++++++++++------
hw/intc/apic_common.c | 22 +-
hw/intc/trace-events | 4 +-
include/hw/i386/apic.h | 8 +-
include/hw/i386/apic_internal.h | 9 +-
target/i386/cpu-sysemu.c | 18 +-
target/i386/cpu.c | 9 +-
target/i386/cpu.h | 9 +
target/i386/tcg/sysemu/misc_helper.c | 41 ++-
target/i386/whpx/whpx-apic.c | 3 +-
17 files changed, 591 insertions(+), 188 deletions(-)
--
2.25.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v9 1/5] i386/tcg: implement x2APIC registers MSR access
2023-10-24 15:21 [PATCH v9 0/5] Support x2APIC mode with TCG accelerator Bui Quang Minh
@ 2023-10-24 15:21 ` Bui Quang Minh
2023-10-24 15:21 ` [PATCH v9 2/5] apic: add support for x2APIC mode Bui Quang Minh
` (4 subsequent siblings)
5 siblings, 0 replies; 15+ messages in thread
From: Bui Quang Minh @ 2023-10-24 15:21 UTC (permalink / raw)
To: qemu-devel
Cc: David Woodhouse, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Michael S . Tsirkin, Marcel Apfelbaum,
Igor Mammedov, Alex Bennée, Joao Martins, Peter Xu,
Jason Wang, Philippe Mathieu-Daudé, Phil Dennis-Jordan,
Bui Quang Minh
This commit creates apic_register_read/write which are used by both
apic_mem_read/write for MMIO access and apic_msr_read/write for MSR access.
The apic_msr_read/write returns -1 on error, accelerator can use this to
raise the appropriate exception.
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
---
hw/intc/apic.c | 122 ++++++++++++++++++++-------
hw/intc/trace-events | 4 +-
include/hw/i386/apic.h | 3 +
target/i386/cpu.h | 3 +
target/i386/tcg/sysemu/misc_helper.c | 27 ++++++
5 files changed, 127 insertions(+), 32 deletions(-)
diff --git a/hw/intc/apic.c b/hw/intc/apic.c
index ac3d47d231..7a349c0723 100644
--- a/hw/intc/apic.c
+++ b/hw/intc/apic.c
@@ -288,6 +288,13 @@ void apic_deliver_irq(uint8_t dest, uint8_t dest_mode, uint8_t delivery_mode,
apic_bus_deliver(deliver_bitmask, delivery_mode, vector_num, trigger_mode);
}
+bool is_x2apic_mode(DeviceState *dev)
+{
+ APICCommonState *s = APIC(dev);
+
+ return s->apicbase & MSR_IA32_APICBASE_EXTD;
+}
+
static void apic_set_base(APICCommonState *s, uint64_t val)
{
s->apicbase = (val & 0xfffff000) |
@@ -636,24 +643,19 @@ static void apic_timer(void *opaque)
apic_timer_update(s, s->next_time);
}
-static uint64_t apic_mem_read(void *opaque, hwaddr addr, unsigned size)
+static int apic_register_read(int index, uint64_t *value)
{
DeviceState *dev;
APICCommonState *s;
uint32_t val;
- int index;
-
- if (size < 4) {
- return 0;
- }
+ int ret = 0;
dev = cpu_get_current_apic();
if (!dev) {
- return 0;
+ return -1;
}
s = APIC(dev);
- index = (addr >> 4) & 0xff;
switch(index) {
case 0x02: /* id */
val = s->id << 24;
@@ -718,12 +720,46 @@ static uint64_t apic_mem_read(void *opaque, hwaddr addr, unsigned size)
default:
s->esr |= APIC_ESR_ILLEGAL_ADDRESS;
val = 0;
+ ret = -1;
break;
}
- trace_apic_mem_readl(addr, val);
+
+ trace_apic_register_read(index, val);
+ *value = val;
+ return ret;
+}
+
+static uint64_t apic_mem_read(void *opaque, hwaddr addr, unsigned size)
+{
+ uint64_t val;
+ int index;
+
+ if (size < 4) {
+ return 0;
+ }
+
+ index = (addr >> 4) & 0xff;
+ apic_register_read(index, &val);
+
return val;
}
+int apic_msr_read(int index, uint64_t *val)
+{
+ DeviceState *dev;
+
+ dev = cpu_get_current_apic();
+ if (!dev) {
+ return -1;
+ }
+
+ if (!is_x2apic_mode(dev)) {
+ return -1;
+ }
+
+ return apic_register_read(index, val);
+}
+
static void apic_send_msi(MSIMessage *msi)
{
uint64_t addr = msi->address;
@@ -737,35 +773,18 @@ static void apic_send_msi(MSIMessage *msi)
apic_deliver_irq(dest, dest_mode, delivery, vector, trigger_mode);
}
-static void apic_mem_write(void *opaque, hwaddr addr, uint64_t val,
- unsigned size)
+static int apic_register_write(int index, uint64_t val)
{
DeviceState *dev;
APICCommonState *s;
- int index = (addr >> 4) & 0xff;
-
- if (size < 4) {
- return;
- }
-
- if (addr > 0xfff || !index) {
- /* MSI and MMIO APIC are at the same memory location,
- * but actually not on the global bus: MSI is on PCI bus
- * APIC is connected directly to the CPU.
- * Mapping them on the global bus happens to work because
- * MSI registers are reserved in APIC MMIO and vice versa. */
- MSIMessage msi = { .address = addr, .data = val };
- apic_send_msi(&msi);
- return;
- }
dev = cpu_get_current_apic();
if (!dev) {
- return;
+ return -1;
}
s = APIC(dev);
- trace_apic_mem_writel(addr, val);
+ trace_apic_register_write(index, val);
switch(index) {
case 0x02:
@@ -839,8 +858,51 @@ static void apic_mem_write(void *opaque, hwaddr addr, uint64_t val,
break;
default:
s->esr |= APIC_ESR_ILLEGAL_ADDRESS;
- break;
+ return -1;
}
+
+ return 0;
+}
+
+static void apic_mem_write(void *opaque, hwaddr addr, uint64_t val,
+ unsigned size)
+{
+ int index = (addr >> 4) & 0xff;
+
+ if (size < 4) {
+ return;
+ }
+
+ if (addr > 0xfff || !index) {
+ /*
+ * MSI and MMIO APIC are at the same memory location,
+ * but actually not on the global bus: MSI is on PCI bus
+ * APIC is connected directly to the CPU.
+ * Mapping them on the global bus happens to work because
+ * MSI registers are reserved in APIC MMIO and vice versa.
+ */
+ MSIMessage msi = { .address = addr, .data = val };
+ apic_send_msi(&msi);
+ return;
+ }
+
+ apic_register_write(index, val);
+}
+
+int apic_msr_write(int index, uint64_t val)
+{
+ DeviceState *dev;
+
+ dev = cpu_get_current_apic();
+ if (!dev) {
+ return -1;
+ }
+
+ if (!is_x2apic_mode(dev)) {
+ return -1;
+ }
+
+ return apic_register_write(index, val);
}
static void apic_pre_save(APICCommonState *s)
diff --git a/hw/intc/trace-events b/hw/intc/trace-events
index 36ff71f947..1ef29d0256 100644
--- a/hw/intc/trace-events
+++ b/hw/intc/trace-events
@@ -14,8 +14,8 @@ cpu_get_apic_base(uint64_t val) "0x%016"PRIx64
# apic.c
apic_local_deliver(int vector, uint32_t lvt) "vector %d delivery mode %d"
apic_deliver_irq(uint8_t dest, uint8_t dest_mode, uint8_t delivery_mode, uint8_t vector_num, uint8_t trigger_mode) "dest %d dest_mode %d delivery_mode %d vector %d trigger_mode %d"
-apic_mem_readl(uint64_t addr, uint32_t val) "0x%"PRIx64" = 0x%08x"
-apic_mem_writel(uint64_t addr, uint32_t val) "0x%"PRIx64" = 0x%08x"
+apic_register_read(uint8_t reg, uint64_t val) "register 0x%02x = 0x%"PRIx64
+apic_register_write(uint8_t reg, uint64_t val) "register 0x%02x = 0x%"PRIx64
# ioapic.c
ioapic_set_remote_irr(int n) "set remote irr for pin %d"
diff --git a/include/hw/i386/apic.h b/include/hw/i386/apic.h
index bdc15a7a73..ddea4213db 100644
--- a/include/hw/i386/apic.h
+++ b/include/hw/i386/apic.h
@@ -18,6 +18,9 @@ void apic_sipi(DeviceState *s);
void apic_poll_irq(DeviceState *d);
void apic_designate_bsp(DeviceState *d, bool bsp);
int apic_get_highest_priority_irr(DeviceState *dev);
+int apic_msr_read(int index, uint64_t *val);
+int apic_msr_write(int index, uint64_t val);
+bool is_x2apic_mode(DeviceState *d);
/* pc.c */
DeviceState *cpu_get_current_apic(void);
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 471e71dbc5..92d0cf528c 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -545,6 +545,9 @@ typedef enum X86Seg {
#define MSR_IA32_VMX_TRUE_ENTRY_CTLS 0x00000490
#define MSR_IA32_VMX_VMFUNC 0x00000491
+#define MSR_APIC_START 0x00000800
+#define MSR_APIC_END 0x000008ff
+
#define XSTATE_FP_BIT 0
#define XSTATE_SSE_BIT 1
#define XSTATE_YMM_BIT 2
diff --git a/target/i386/tcg/sysemu/misc_helper.c b/target/i386/tcg/sysemu/misc_helper.c
index e1528b7f80..6fccdb3dca 100644
--- a/target/i386/tcg/sysemu/misc_helper.c
+++ b/target/i386/tcg/sysemu/misc_helper.c
@@ -25,6 +25,7 @@
#include "exec/address-spaces.h"
#include "exec/exec-all.h"
#include "tcg/helper-tcg.h"
+#include "hw/i386/apic.h"
void helper_outb(CPUX86State *env, uint32_t port, uint32_t data)
{
@@ -289,6 +290,19 @@ void helper_wrmsr(CPUX86State *env)
env->msr_bndcfgs = val;
cpu_sync_bndcs_hflags(env);
break;
+ case MSR_APIC_START ... MSR_APIC_END: {
+ int ret;
+ int index = (uint32_t)env->regs[R_ECX] - MSR_APIC_START;
+
+ qemu_mutex_lock_iothread();
+ ret = apic_msr_write(index, val);
+ qemu_mutex_unlock_iothread();
+ if (ret < 0) {
+ goto error;
+ }
+
+ break;
+ }
default:
if ((uint32_t)env->regs[R_ECX] >= MSR_MC0_CTL
&& (uint32_t)env->regs[R_ECX] < MSR_MC0_CTL +
@@ -455,6 +469,19 @@ void helper_rdmsr(CPUX86State *env)
val = (cs->nr_threads * cs->nr_cores) | (cs->nr_cores << 16);
break;
}
+ case MSR_APIC_START ... MSR_APIC_END: {
+ int ret;
+ int index = (uint32_t)env->regs[R_ECX] - MSR_APIC_START;
+
+ qemu_mutex_lock_iothread();
+ ret = apic_msr_read(index, &val);
+ qemu_mutex_unlock_iothread();
+ if (ret < 0) {
+ raise_exception_err_ra(env, EXCP0D_GPF, 0, GETPC());
+ }
+
+ break;
+ }
default:
if ((uint32_t)env->regs[R_ECX] >= MSR_MC0_CTL
&& (uint32_t)env->regs[R_ECX] < MSR_MC0_CTL +
--
2.25.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v9 2/5] apic: add support for x2APIC mode
2023-10-24 15:21 [PATCH v9 0/5] Support x2APIC mode with TCG accelerator Bui Quang Minh
2023-10-24 15:21 ` [PATCH v9 1/5] i386/tcg: implement x2APIC registers MSR access Bui Quang Minh
@ 2023-10-24 15:21 ` Bui Quang Minh
2023-10-24 15:21 ` [PATCH v9 3/5] apic, i386/tcg: add x2apic transitions Bui Quang Minh
` (3 subsequent siblings)
5 siblings, 0 replies; 15+ messages in thread
From: Bui Quang Minh @ 2023-10-24 15:21 UTC (permalink / raw)
To: qemu-devel
Cc: David Woodhouse, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Michael S . Tsirkin, Marcel Apfelbaum,
Igor Mammedov, Alex Bennée, Joao Martins, Peter Xu,
Jason Wang, Philippe Mathieu-Daudé, Phil Dennis-Jordan,
Bui Quang Minh
This commit extends the APIC ID to 32-bit long and remove the 255 max APIC
ID limit in userspace APIC. The array that manages local APICs is now
dynamically allocated based on the max APIC ID of created x86 machine.
Also, new x2APIC IPI destination determination scheme, self IPI and x2APIC
mode register access are supported.
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
---
hw/i386/x86.c | 6 +-
hw/intc/apic.c | 280 ++++++++++++++++++++++++--------
hw/intc/apic_common.c | 9 +
include/hw/i386/apic.h | 3 +-
include/hw/i386/apic_internal.h | 7 +-
target/i386/cpu-sysemu.c | 18 +-
target/i386/cpu.h | 2 +
7 files changed, 250 insertions(+), 75 deletions(-)
diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index b3d054889b..9d3f8b9b4e 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -133,7 +133,7 @@ void x86_cpus_init(X86MachineState *x86ms, int default_cpu_version)
* both in-kernel lapic and X2APIC userspace API.
*/
if (x86ms->apic_id_limit > 255 && kvm_enabled() &&
- (!kvm_irqchip_in_kernel() || !kvm_enable_x2apic())) {
+ kvm_irqchip_in_kernel() && !kvm_enable_x2apic()) {
error_report("current -smp configuration requires kernel "
"irqchip and X2APIC API support.");
exit(EXIT_FAILURE);
@@ -143,6 +143,10 @@ void x86_cpus_init(X86MachineState *x86ms, int default_cpu_version)
kvm_set_max_apic_id(x86ms->apic_id_limit);
}
+ if (!kvm_irqchip_in_kernel()) {
+ apic_set_max_apic_id(x86ms->apic_id_limit);
+ }
+
possible_cpus = mc->possible_cpu_arch_ids(ms);
for (i = 0; i < ms->smp.cpus; i++) {
x86_cpu_new(x86ms, possible_cpus->cpus[i].arch_id, &error_fatal);
diff --git a/hw/intc/apic.c b/hw/intc/apic.c
index 7a349c0723..84d428a875 100644
--- a/hw/intc/apic.c
+++ b/hw/intc/apic.c
@@ -32,14 +32,13 @@
#include "qapi/error.h"
#include "qom/object.h"
-#define MAX_APICS 255
-#define MAX_APIC_WORDS 8
-
#define SYNC_FROM_VAPIC 0x1
#define SYNC_TO_VAPIC 0x2
#define SYNC_ISR_IRR_TO_VAPIC 0x4
-static APICCommonState *local_apics[MAX_APICS + 1];
+static APICCommonState **local_apics;
+static uint32_t max_apics;
+static uint32_t max_apic_words;
#define TYPE_APIC "apic"
/*This is reusing the APICCommonState typedef from APIC_COMMON */
@@ -49,7 +48,19 @@ DECLARE_INSTANCE_CHECKER(APICCommonState, APIC,
static void apic_set_irq(APICCommonState *s, int vector_num, int trigger_mode);
static void apic_update_irq(APICCommonState *s);
static void apic_get_delivery_bitmask(uint32_t *deliver_bitmask,
- uint8_t dest, uint8_t dest_mode);
+ uint32_t dest, uint8_t dest_mode);
+
+void apic_set_max_apic_id(uint32_t max_apic_id)
+{
+ int word_size = 32;
+
+ /* round up the max apic id to next multiple of words */
+ max_apics = (max_apic_id + word_size - 1) & ~(word_size - 1);
+
+ local_apics = g_malloc0(sizeof(*local_apics) * max_apics);
+ max_apic_words = max_apics >> 5;
+}
+
/* Find first bit starting from msb */
static int apic_fls_bit(uint32_t value)
@@ -199,10 +210,10 @@ static void apic_external_nmi(APICCommonState *s)
#define foreach_apic(apic, deliver_bitmask, code) \
{\
int __i, __j;\
- for(__i = 0; __i < MAX_APIC_WORDS; __i++) {\
+ for (__i = 0; __i < max_apic_words; __i++) {\
uint32_t __mask = deliver_bitmask[__i];\
if (__mask) {\
- for(__j = 0; __j < 32; __j++) {\
+ for (__j = 0; __j < 32; __j++) {\
if (__mask & (1U << __j)) {\
apic = local_apics[__i * 32 + __j];\
if (apic) {\
@@ -226,7 +237,7 @@ static void apic_bus_deliver(const uint32_t *deliver_bitmask,
{
int i, d;
d = -1;
- for(i = 0; i < MAX_APIC_WORDS; i++) {
+ for (i = 0; i < max_apic_words; i++) {
if (deliver_bitmask[i]) {
d = i * 32 + apic_ffs_bit(deliver_bitmask[i]);
break;
@@ -276,16 +287,18 @@ static void apic_bus_deliver(const uint32_t *deliver_bitmask,
apic_set_irq(apic_iter, vector_num, trigger_mode) );
}
-void apic_deliver_irq(uint8_t dest, uint8_t dest_mode, uint8_t delivery_mode,
- uint8_t vector_num, uint8_t trigger_mode)
+static void apic_deliver_irq(uint32_t dest, uint8_t dest_mode,
+ uint8_t delivery_mode, uint8_t vector_num,
+ uint8_t trigger_mode)
{
- uint32_t deliver_bitmask[MAX_APIC_WORDS];
+ uint32_t *deliver_bitmask = g_malloc(max_apic_words * sizeof(uint32_t));
trace_apic_deliver_irq(dest, dest_mode, delivery_mode, vector_num,
trigger_mode);
apic_get_delivery_bitmask(deliver_bitmask, dest, dest_mode);
apic_bus_deliver(deliver_bitmask, delivery_mode, vector_num, trigger_mode);
+ g_free(deliver_bitmask);
}
bool is_x2apic_mode(DeviceState *dev)
@@ -442,57 +455,123 @@ static void apic_eoi(APICCommonState *s)
apic_update_irq(s);
}
-static int apic_find_dest(uint8_t dest)
+static bool apic_match_dest(APICCommonState *apic, uint32_t dest)
{
- APICCommonState *apic = local_apics[dest];
+ if (is_x2apic_mode(&apic->parent_obj)) {
+ return apic->initial_apic_id == dest;
+ } else {
+ return apic->id == (uint8_t)dest;
+ }
+}
+
+static void apic_find_dest(uint32_t *deliver_bitmask, uint32_t dest)
+{
+ APICCommonState *apic = NULL;
int i;
- if (apic && apic->id == dest)
- return dest; /* shortcut in case apic->id == local_apics[dest]->id */
-
- for (i = 0; i < MAX_APICS; i++) {
+ for (i = 0; i < max_apics; i++) {
apic = local_apics[i];
- if (apic && apic->id == dest)
- return i;
- if (!apic)
- break;
+ if (apic && apic_match_dest(apic, dest)) {
+ apic_set_bit(deliver_bitmask, i);
+ }
}
-
- return -1;
}
-static void apic_get_delivery_bitmask(uint32_t *deliver_bitmask,
- uint8_t dest, uint8_t dest_mode)
+/*
+ * Deliver interrupt to x2APIC CPUs if it is x2APIC broadcast.
+ * Otherwise, deliver interrupt to xAPIC CPUs if it is xAPIC
+ * broadcast.
+ */
+static void apic_get_broadcast_bitmask(uint32_t *deliver_bitmask,
+ bool is_x2apic_broadcast)
{
+ int i;
APICCommonState *apic_iter;
+
+ for (i = 0; i < max_apics; i++) {
+ apic_iter = local_apics[i];
+ if (apic_iter) {
+ bool apic_in_x2apic = is_x2apic_mode(&apic_iter->parent_obj);
+
+ if (is_x2apic_broadcast && apic_in_x2apic) {
+ apic_set_bit(deliver_bitmask, i);
+ } else if (!is_x2apic_broadcast && !apic_in_x2apic) {
+ apic_set_bit(deliver_bitmask, i);
+ }
+ }
+ }
+}
+
+static void apic_get_delivery_bitmask(uint32_t *deliver_bitmask,
+ uint32_t dest, uint8_t dest_mode)
+{
+ APICCommonState *apic;
int i;
- if (dest_mode == 0) {
+ memset(deliver_bitmask, 0x00, max_apic_words * sizeof(uint32_t));
+
+ /*
+ * x2APIC broadcast is delivered to all x2APIC CPUs regardless of
+ * destination mode. In case the destination mode is physical, it is
+ * broadcasted to all xAPIC CPUs too. Otherwise, if the destination
+ * mode is logical, we need to continue checking if xAPIC CPUs accepts
+ * the interrupt.
+ */
+ if (dest == 0xffffffff) {
+ if (dest_mode == APIC_DESTMODE_PHYSICAL) {
+ memset(deliver_bitmask, 0xff, max_apic_words * sizeof(uint32_t));
+ return;
+ } else {
+ apic_get_broadcast_bitmask(deliver_bitmask, true);
+ }
+ }
+
+ if (dest_mode == APIC_DESTMODE_PHYSICAL) {
+ apic_find_dest(deliver_bitmask, dest);
+ /* Any APIC in xAPIC mode will interpret 0xFF as broadcast */
if (dest == 0xff) {
- memset(deliver_bitmask, 0xff, MAX_APIC_WORDS * sizeof(uint32_t));
- } else {
- int idx = apic_find_dest(dest);
- memset(deliver_bitmask, 0x00, MAX_APIC_WORDS * sizeof(uint32_t));
- if (idx >= 0)
- apic_set_bit(deliver_bitmask, idx);
+ apic_get_broadcast_bitmask(deliver_bitmask, false);
}
} else {
- /* XXX: cluster mode */
- memset(deliver_bitmask, 0x00, MAX_APIC_WORDS * sizeof(uint32_t));
- for(i = 0; i < MAX_APICS; i++) {
- apic_iter = local_apics[i];
- if (apic_iter) {
- if (apic_iter->dest_mode == 0xf) {
- if (dest & apic_iter->log_dest)
- apic_set_bit(deliver_bitmask, i);
- } else if (apic_iter->dest_mode == 0x0) {
- if ((dest & 0xf0) == (apic_iter->log_dest & 0xf0) &&
- (dest & apic_iter->log_dest & 0x0f)) {
+ /* XXX: logical mode */
+ for (i = 0; i < max_apics; i++) {
+ apic = local_apics[i];
+ if (apic) {
+ /* x2APIC logical mode */
+ if (apic->apicbase & MSR_IA32_APICBASE_EXTD) {
+ if ((dest >> 16) == (apic->extended_log_dest >> 16) &&
+ (dest & apic->extended_log_dest & 0xffff)) {
apic_set_bit(deliver_bitmask, i);
}
+ continue;
}
- } else {
- break;
+
+ /* xAPIC logical mode */
+ dest = (uint8_t)dest;
+ if (apic->dest_mode == APIC_DESTMODE_LOGICAL_FLAT) {
+ if (dest & apic->log_dest) {
+ apic_set_bit(deliver_bitmask, i);
+ }
+ } else if (apic->dest_mode == APIC_DESTMODE_LOGICAL_CLUSTER) {
+ /*
+ * In cluster model of xAPIC logical mode IPI, 4 higher
+ * bits are used as cluster address, 4 lower bits are
+ * the bitmask for local APICs in the cluster. The IPI
+ * is delivered to an APIC if the cluster address
+ * matches and the APIC's address bit in the cluster is
+ * set in bitmask of destination ID in IPI.
+ *
+ * The cluster address ranges from 0 - 14, the cluster
+ * address 15 (0xf) is the broadcast address to all
+ * clusters.
+ */
+ if ((dest & 0xf0) == 0xf0 ||
+ (dest & 0xf0) == (apic->log_dest & 0xf0)) {
+ if (dest & apic->log_dest & 0x0f) {
+ apic_set_bit(deliver_bitmask, i);
+ }
+ }
+ }
}
}
}
@@ -516,29 +595,36 @@ void apic_sipi(DeviceState *dev)
s->wait_for_sipi = 0;
}
-static void apic_deliver(DeviceState *dev, uint8_t dest, uint8_t dest_mode,
+static void apic_deliver(DeviceState *dev, uint32_t dest, uint8_t dest_mode,
uint8_t delivery_mode, uint8_t vector_num,
- uint8_t trigger_mode)
+ uint8_t trigger_mode, uint8_t dest_shorthand)
{
APICCommonState *s = APIC(dev);
- uint32_t deliver_bitmask[MAX_APIC_WORDS];
- int dest_shorthand = (s->icr[0] >> 18) & 3;
APICCommonState *apic_iter;
+ uint32_t deliver_bitmask_size = max_apic_words * sizeof(uint32_t);
+ uint32_t *deliver_bitmask = g_malloc(deliver_bitmask_size);
+ uint32_t current_apic_id;
+
+ if (is_x2apic_mode(dev)) {
+ current_apic_id = s->initial_apic_id;
+ } else {
+ current_apic_id = s->id;
+ }
switch (dest_shorthand) {
case 0:
apic_get_delivery_bitmask(deliver_bitmask, dest, dest_mode);
break;
case 1:
- memset(deliver_bitmask, 0x00, sizeof(deliver_bitmask));
- apic_set_bit(deliver_bitmask, s->id);
+ memset(deliver_bitmask, 0x00, deliver_bitmask_size);
+ apic_set_bit(deliver_bitmask, current_apic_id);
break;
case 2:
- memset(deliver_bitmask, 0xff, sizeof(deliver_bitmask));
+ memset(deliver_bitmask, 0xff, deliver_bitmask_size);
break;
case 3:
- memset(deliver_bitmask, 0xff, sizeof(deliver_bitmask));
- apic_reset_bit(deliver_bitmask, s->id);
+ memset(deliver_bitmask, 0xff, deliver_bitmask_size);
+ apic_reset_bit(deliver_bitmask, current_apic_id);
break;
}
@@ -562,6 +648,7 @@ static void apic_deliver(DeviceState *dev, uint8_t dest, uint8_t dest_mode,
}
apic_bus_deliver(deliver_bitmask, delivery_mode, vector_num, trigger_mode);
+ g_free(deliver_bitmask);
}
static bool apic_check_pic(APICCommonState *s)
@@ -658,7 +745,11 @@ static int apic_register_read(int index, uint64_t *value)
switch(index) {
case 0x02: /* id */
- val = s->id << 24;
+ if (is_x2apic_mode(dev)) {
+ val = s->initial_apic_id;
+ } else {
+ val = s->id << 24;
+ }
break;
case 0x03: /* version */
val = s->version | ((APIC_LVT_NB - 1) << 16);
@@ -681,10 +772,19 @@ static int apic_register_read(int index, uint64_t *value)
val = 0;
break;
case 0x0d:
- val = s->log_dest << 24;
+ if (is_x2apic_mode(dev)) {
+ val = s->extended_log_dest;
+ } else {
+ val = s->log_dest << 24;
+ }
break;
case 0x0e:
- val = (s->dest_mode << 28) | 0xfffffff;
+ if (is_x2apic_mode(dev)) {
+ val = 0;
+ ret = -1;
+ } else {
+ val = (s->dest_mode << 28) | 0xfffffff;
+ }
break;
case 0x0f:
val = s->spurious_vec;
@@ -764,7 +864,12 @@ static void apic_send_msi(MSIMessage *msi)
{
uint64_t addr = msi->address;
uint32_t data = msi->data;
- uint8_t dest = (addr & MSI_ADDR_DEST_ID_MASK) >> MSI_ADDR_DEST_ID_SHIFT;
+ uint32_t dest = (addr & MSI_ADDR_DEST_ID_MASK) >> MSI_ADDR_DEST_ID_SHIFT;
+ /*
+ * The higher 3 bytes of destination id is stored in higher word of
+ * msi address. See x86_iommu_irq_to_msi_message()
+ */
+ dest = dest | (addr >> 32);
uint8_t vector = (data & MSI_DATA_VECTOR_MASK) >> MSI_DATA_VECTOR_SHIFT;
uint8_t dest_mode = (addr >> MSI_ADDR_DEST_MODE_SHIFT) & 0x1;
uint8_t trigger_mode = (data >> MSI_DATA_TRIGGER_SHIFT) & 0x1;
@@ -788,6 +893,10 @@ static int apic_register_write(int index, uint64_t val)
switch(index) {
case 0x02:
+ if (is_x2apic_mode(dev)) {
+ return -1;
+ }
+
s->id = (val >> 24);
break;
case 0x03:
@@ -807,9 +916,17 @@ static int apic_register_write(int index, uint64_t val)
apic_eoi(s);
break;
case 0x0d:
+ if (is_x2apic_mode(dev)) {
+ return -1;
+ }
+
s->log_dest = val >> 24;
break;
case 0x0e:
+ if (is_x2apic_mode(dev)) {
+ return -1;
+ }
+
s->dest_mode = val >> 28;
break;
case 0x0f:
@@ -821,13 +938,27 @@ static int apic_register_write(int index, uint64_t val)
case 0x20 ... 0x27:
case 0x28:
break;
- case 0x30:
+ case 0x30: {
+ uint32_t dest;
+
s->icr[0] = val;
- apic_deliver(dev, (s->icr[1] >> 24) & 0xff, (s->icr[0] >> 11) & 1,
+ if (is_x2apic_mode(dev)) {
+ s->icr[1] = val >> 32;
+ dest = s->icr[1];
+ } else {
+ dest = (s->icr[1] >> 24) & 0xff;
+ }
+
+ apic_deliver(dev, dest, (s->icr[0] >> 11) & 1,
(s->icr[0] >> 8) & 7, (s->icr[0] & 0xff),
- (s->icr[0] >> 15) & 1);
+ (s->icr[0] >> 15) & 1, (s->icr[0] >> 18) & 3);
break;
+ }
case 0x31:
+ if (is_x2apic_mode(dev)) {
+ return -1;
+ }
+
s->icr[1] = val;
break;
case 0x32 ... 0x37:
@@ -856,6 +987,23 @@ static int apic_register_write(int index, uint64_t val)
s->count_shift = (v + 1) & 7;
}
break;
+ case 0x3f: {
+ int vector = val & 0xff;
+
+ if (!is_x2apic_mode(dev)) {
+ return -1;
+ }
+
+ /*
+ * Self IPI is identical to IPI with
+ * - Destination shorthand: 1 (Self)
+ * - Trigger mode: 0 (Edge)
+ * - Delivery mode: 0 (Fixed)
+ */
+ apic_deliver(dev, 0, 0, APIC_DM_FIXED, vector, 0, 1);
+
+ break;
+ }
default:
s->esr |= APIC_ESR_ILLEGAL_ADDRESS;
return -1;
@@ -933,12 +1081,6 @@ static void apic_realize(DeviceState *dev, Error **errp)
{
APICCommonState *s = APIC(dev);
- if (s->id >= MAX_APICS) {
- error_setg(errp, "%s initialization failed. APIC ID %d is invalid",
- object_get_typename(OBJECT(dev)), s->id);
- return;
- }
-
if (kvm_enabled()) {
warn_report("Userspace local APIC is deprecated for KVM.");
warn_report("Do not use kernel-irqchip except for the -M isapc machine type.");
@@ -955,7 +1097,7 @@ static void apic_realize(DeviceState *dev, Error **errp)
s->io_memory.disable_reentrancy_guard = true;
s->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, apic_timer, s);
- local_apics[s->id] = s;
+ local_apics[s->initial_apic_id] = s;
msi_nonbroken = true;
}
@@ -965,7 +1107,7 @@ static void apic_unrealize(DeviceState *dev)
APICCommonState *s = APIC(dev);
timer_free(s->timer);
- local_apics[s->id] = NULL;
+ local_apics[s->initial_apic_id] = NULL;
}
static void apic_class_init(ObjectClass *klass, void *data)
diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
index bccb4241c2..4bc3d2f149 100644
--- a/hw/intc/apic_common.c
+++ b/hw/intc/apic_common.c
@@ -287,6 +287,10 @@ static void apic_common_realize(DeviceState *dev, Error **errp)
}
vmstate_register_with_alias_id(NULL, instance_id, &vmstate_apic_common,
s, -1, 0, NULL);
+
+ /* APIC LDR in x2APIC mode */
+ s->extended_log_dest = ((s->initial_apic_id & 0xffff0) << 16) |
+ (1 << (s->initial_apic_id & 0xf));
}
static void apic_common_unrealize(DeviceState *dev)
@@ -427,6 +431,11 @@ static void apic_common_set_id(Object *obj, Visitor *v, const char *name,
return;
}
+ if (value >= 255 && !cpu_has_x2apic_feature(&s->cpu->env)) {
+ error_setg(errp, "APIC ID %d requires x2APIC feature in CPU", value);
+ return;
+ }
+
s->initial_apic_id = value;
s->id = (uint8_t)value;
}
diff --git a/include/hw/i386/apic.h b/include/hw/i386/apic.h
index ddea4213db..c8ca41ab44 100644
--- a/include/hw/i386/apic.h
+++ b/include/hw/i386/apic.h
@@ -3,8 +3,7 @@
/* apic.c */
-void apic_deliver_irq(uint8_t dest, uint8_t dest_mode, uint8_t delivery_mode,
- uint8_t vector_num, uint8_t trigger_mode);
+void apic_set_max_apic_id(uint32_t max_apic_id);
int apic_accept_pic_intr(DeviceState *s);
void apic_deliver_pic_intr(DeviceState *s, int level);
void apic_deliver_nmi(DeviceState *d);
diff --git a/include/hw/i386/apic_internal.h b/include/hw/i386/apic_internal.h
index 5f2ba24bfc..e796e6cae3 100644
--- a/include/hw/i386/apic_internal.h
+++ b/include/hw/i386/apic_internal.h
@@ -46,8 +46,10 @@
#define APIC_DM_EXTINT 7
/* APIC destination mode */
-#define APIC_DESTMODE_FLAT 0xf
-#define APIC_DESTMODE_CLUSTER 1
+#define APIC_DESTMODE_PHYSICAL 0
+#define APIC_DESTMODE_LOGICAL 1
+#define APIC_DESTMODE_LOGICAL_FLAT 0xf
+#define APIC_DESTMODE_LOGICAL_CLUSTER 0
#define APIC_TRIGGER_EDGE 0
#define APIC_TRIGGER_LEVEL 1
@@ -187,6 +189,7 @@ struct APICCommonState {
DeviceState *vapic;
hwaddr vapic_paddr; /* note: persistence via kvmvapic */
bool legacy_instance_id;
+ uint32_t extended_log_dest;
};
typedef struct VAPICState {
diff --git a/target/i386/cpu-sysemu.c b/target/i386/cpu-sysemu.c
index 2375e48178..7422096737 100644
--- a/target/i386/cpu-sysemu.c
+++ b/target/i386/cpu-sysemu.c
@@ -235,6 +235,16 @@ void cpu_clear_apic_feature(CPUX86State *env)
env->features[FEAT_1_EDX] &= ~CPUID_APIC;
}
+void cpu_set_apic_feature(CPUX86State *env)
+{
+ env->features[FEAT_1_EDX] |= CPUID_APIC;
+}
+
+bool cpu_has_x2apic_feature(CPUX86State *env)
+{
+ return env->features[FEAT_1_ECX] & CPUID_EXT_X2APIC;
+}
+
bool cpu_is_bsp(X86CPU *cpu)
{
return cpu_get_apic_base(cpu->apic_state) & MSR_IA32_APICBASE_BSP;
@@ -281,11 +291,17 @@ void x86_cpu_apic_create(X86CPU *cpu, Error **errp)
OBJECT(cpu->apic_state));
object_unref(OBJECT(cpu->apic_state));
- qdev_prop_set_uint32(cpu->apic_state, "id", cpu->apic_id);
/* TODO: convert to link<> */
apic = APIC_COMMON(cpu->apic_state);
apic->cpu = cpu;
apic->apicbase = APIC_DEFAULT_ADDRESS | MSR_IA32_APICBASE_ENABLE;
+
+ /*
+ * apic_common_set_id needs to check if the CPU has x2APIC
+ * feature in case APIC ID >= 255, so we need to set apic->cpu
+ * before setting APIC ID
+ */
+ qdev_prop_set_uint32(cpu->apic_state, "id", cpu->apic_id);
}
void x86_cpu_apic_realize(X86CPU *cpu, Error **errp)
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 92d0cf528c..32a16453d0 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -2201,8 +2201,10 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
uint32_t *eax, uint32_t *ebx,
uint32_t *ecx, uint32_t *edx);
void cpu_clear_apic_feature(CPUX86State *env);
+void cpu_set_apic_feature(CPUX86State *env);
void host_cpuid(uint32_t function, uint32_t count,
uint32_t *eax, uint32_t *ebx, uint32_t *ecx, uint32_t *edx);
+bool cpu_has_x2apic_feature(CPUX86State *env);
/* helper.c */
void x86_cpu_set_a20(X86CPU *cpu, int a20_state);
--
2.25.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v9 3/5] apic, i386/tcg: add x2apic transitions
2023-10-24 15:21 [PATCH v9 0/5] Support x2APIC mode with TCG accelerator Bui Quang Minh
2023-10-24 15:21 ` [PATCH v9 1/5] i386/tcg: implement x2APIC registers MSR access Bui Quang Minh
2023-10-24 15:21 ` [PATCH v9 2/5] apic: add support for x2APIC mode Bui Quang Minh
@ 2023-10-24 15:21 ` Bui Quang Minh
2023-10-24 15:21 ` [PATCH v9 4/5] intel_iommu: allow Extended Interrupt Mode when using userspace APIC Bui Quang Minh
` (2 subsequent siblings)
5 siblings, 0 replies; 15+ messages in thread
From: Bui Quang Minh @ 2023-10-24 15:21 UTC (permalink / raw)
To: qemu-devel
Cc: David Woodhouse, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Michael S . Tsirkin, Marcel Apfelbaum,
Igor Mammedov, Alex Bennée, Joao Martins, Peter Xu,
Jason Wang, Philippe Mathieu-Daudé, Phil Dennis-Jordan,
Bui Quang Minh
This commit adds support for x2APIC transitions when writing to
MSR_IA32_APICBASE register and finally adds CPUID_EXT_X2APIC to
TCG_EXT_FEATURES.
The set_base in APICCommonClass now returns an integer to indicate error in
execution. apic_set_base return -1 on invalid APIC state transition,
accelerator can use this to raise appropriate exception.
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
---
hw/i386/kvm/apic.c | 3 +-
hw/i386/xen/xen_apic.c | 3 +-
hw/intc/apic.c | 62 +++++++++++++++++++++++++++-
hw/intc/apic_common.c | 13 +++---
include/hw/i386/apic.h | 2 +-
include/hw/i386/apic_internal.h | 2 +-
target/i386/cpu.c | 9 ++--
target/i386/cpu.h | 4 ++
target/i386/tcg/sysemu/misc_helper.c | 14 ++++++-
target/i386/whpx/whpx-apic.c | 3 +-
10 files changed, 96 insertions(+), 19 deletions(-)
diff --git a/hw/i386/kvm/apic.c b/hw/i386/kvm/apic.c
index 1e89ca0899..a72c28e8a7 100644
--- a/hw/i386/kvm/apic.c
+++ b/hw/i386/kvm/apic.c
@@ -95,9 +95,10 @@ void kvm_get_apic_state(DeviceState *dev, struct kvm_lapic_state *kapic)
apic_next_timer(s, s->initial_count_load_time);
}
-static void kvm_apic_set_base(APICCommonState *s, uint64_t val)
+static int kvm_apic_set_base(APICCommonState *s, uint64_t val)
{
s->apicbase = val;
+ return 0;
}
static void kvm_apic_set_tpr(APICCommonState *s, uint8_t val)
diff --git a/hw/i386/xen/xen_apic.c b/hw/i386/xen/xen_apic.c
index 7c7a60b166..101e16a766 100644
--- a/hw/i386/xen/xen_apic.c
+++ b/hw/i386/xen/xen_apic.c
@@ -49,8 +49,9 @@ static void xen_apic_realize(DeviceState *dev, Error **errp)
msi_nonbroken = true;
}
-static void xen_apic_set_base(APICCommonState *s, uint64_t val)
+static int xen_apic_set_base(APICCommonState *s, uint64_t val)
{
+ return 0;
}
static void xen_apic_set_tpr(APICCommonState *s, uint8_t val)
diff --git a/hw/intc/apic.c b/hw/intc/apic.c
index 84d428a875..f9e54d92b3 100644
--- a/hw/intc/apic.c
+++ b/hw/intc/apic.c
@@ -308,8 +308,49 @@ bool is_x2apic_mode(DeviceState *dev)
return s->apicbase & MSR_IA32_APICBASE_EXTD;
}
-static void apic_set_base(APICCommonState *s, uint64_t val)
+static int apic_set_base_check(APICCommonState *s, uint64_t val)
{
+ /* Enable x2apic when x2apic is not supported by CPU */
+ if (!cpu_has_x2apic_feature(&s->cpu->env) &&
+ val & MSR_IA32_APICBASE_EXTD) {
+ return -1;
+ }
+
+ /*
+ * Transition into invalid state
+ * (s->apicbase & MSR_IA32_APICBASE_ENABLE == 0) &&
+ * (s->apicbase & MSR_IA32_APICBASE_EXTD) == 1
+ */
+ if (!(val & MSR_IA32_APICBASE_ENABLE) &&
+ (val & MSR_IA32_APICBASE_EXTD)) {
+ return -1;
+ }
+
+ /* Invalid transition from disabled mode to x2APIC */
+ if (!(s->apicbase & MSR_IA32_APICBASE_ENABLE) &&
+ !(s->apicbase & MSR_IA32_APICBASE_EXTD) &&
+ (val & MSR_IA32_APICBASE_ENABLE) &&
+ (val & MSR_IA32_APICBASE_EXTD)) {
+ return -1;
+ }
+
+ /* Invalid transition from x2APIC to xAPIC */
+ if ((s->apicbase & MSR_IA32_APICBASE_ENABLE) &&
+ (s->apicbase & MSR_IA32_APICBASE_EXTD) &&
+ (val & MSR_IA32_APICBASE_ENABLE) &&
+ !(val & MSR_IA32_APICBASE_EXTD)) {
+ return -1;
+ }
+
+ return 0;
+}
+
+static int apic_set_base(APICCommonState *s, uint64_t val)
+{
+ if (apic_set_base_check(s, val) < 0) {
+ return -1;
+ }
+
s->apicbase = (val & 0xfffff000) |
(s->apicbase & (MSR_IA32_APICBASE_BSP | MSR_IA32_APICBASE_ENABLE));
/* if disabled, cannot be enabled again */
@@ -318,6 +359,25 @@ static void apic_set_base(APICCommonState *s, uint64_t val)
cpu_clear_apic_feature(&s->cpu->env);
s->spurious_vec &= ~APIC_SV_ENABLE;
}
+
+ /* Transition from disabled mode to xAPIC */
+ if (!(s->apicbase & MSR_IA32_APICBASE_ENABLE) &&
+ (val & MSR_IA32_APICBASE_ENABLE)) {
+ s->apicbase |= MSR_IA32_APICBASE_ENABLE;
+ cpu_set_apic_feature(&s->cpu->env);
+ }
+
+ /* Transition from xAPIC to x2APIC */
+ if (cpu_has_x2apic_feature(&s->cpu->env) &&
+ !(s->apicbase & MSR_IA32_APICBASE_EXTD) &&
+ (val & MSR_IA32_APICBASE_EXTD)) {
+ s->apicbase |= MSR_IA32_APICBASE_EXTD;
+
+ s->log_dest = ((s->initial_apic_id & 0xffff0) << 16) |
+ (1 << (s->initial_apic_id & 0xf));
+ }
+
+ return 0;
}
static void apic_set_tpr(APICCommonState *s, uint8_t val)
diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
index 4bc3d2f149..b13a7b0457 100644
--- a/hw/intc/apic_common.c
+++ b/hw/intc/apic_common.c
@@ -35,20 +35,19 @@
bool apic_report_tpr_access;
-void cpu_set_apic_base(DeviceState *dev, uint64_t val)
+int cpu_set_apic_base(DeviceState *dev, uint64_t val)
{
trace_cpu_set_apic_base(val);
if (dev) {
APICCommonState *s = APIC_COMMON(dev);
APICCommonClass *info = APIC_COMMON_GET_CLASS(s);
- /* switching to x2APIC, reset possibly modified xAPIC ID */
- if (!(s->apicbase & MSR_IA32_APICBASE_EXTD) &&
- (val & MSR_IA32_APICBASE_EXTD)) {
- s->id = s->initial_apic_id;
- }
- info->set_base(s, val);
+ /* Reset possibly modified xAPIC ID */
+ s->id = s->initial_apic_id;
+ return info->set_base(s, val);
}
+
+ return 0;
}
uint64_t cpu_get_apic_base(DeviceState *dev)
diff --git a/include/hw/i386/apic.h b/include/hw/i386/apic.h
index c8ca41ab44..f6e7489f2d 100644
--- a/include/hw/i386/apic.h
+++ b/include/hw/i386/apic.h
@@ -8,7 +8,7 @@ int apic_accept_pic_intr(DeviceState *s);
void apic_deliver_pic_intr(DeviceState *s, int level);
void apic_deliver_nmi(DeviceState *d);
int apic_get_interrupt(DeviceState *s);
-void cpu_set_apic_base(DeviceState *s, uint64_t val);
+int cpu_set_apic_base(DeviceState *s, uint64_t val);
uint64_t cpu_get_apic_base(DeviceState *s);
void cpu_set_apic_tpr(DeviceState *s, uint8_t val);
uint8_t cpu_get_apic_tpr(DeviceState *s);
diff --git a/include/hw/i386/apic_internal.h b/include/hw/i386/apic_internal.h
index e796e6cae3..d6e85833da 100644
--- a/include/hw/i386/apic_internal.h
+++ b/include/hw/i386/apic_internal.h
@@ -137,7 +137,7 @@ struct APICCommonClass {
DeviceRealize realize;
DeviceUnrealize unrealize;
- void (*set_base)(APICCommonState *s, uint64_t val);
+ int (*set_base)(APICCommonState *s, uint64_t val);
void (*set_tpr)(APICCommonState *s, uint8_t val);
uint8_t (*get_tpr)(APICCommonState *s);
void (*enable_tpr_reporting)(APICCommonState *s, bool enable);
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index bdca901dfa..ee78186550 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -631,8 +631,8 @@ void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1,
* in CPL=3; remove them if they are ever implemented for system emulation.
*/
#if defined CONFIG_USER_ONLY
-#define CPUID_EXT_KERNEL_FEATURES (CPUID_EXT_PCID | CPUID_EXT_TSC_DEADLINE_TIMER | \
- CPUID_EXT_X2APIC)
+#define CPUID_EXT_KERNEL_FEATURES \
+ (CPUID_EXT_PCID | CPUID_EXT_TSC_DEADLINE_TIMER)
#else
#define CPUID_EXT_KERNEL_FEATURES 0
#endif
@@ -642,12 +642,13 @@ void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1,
CPUID_EXT_XSAVE | /* CPUID_EXT_OSXSAVE is dynamic */ \
CPUID_EXT_MOVBE | CPUID_EXT_AES | CPUID_EXT_HYPERVISOR | \
CPUID_EXT_RDRAND | CPUID_EXT_AVX | CPUID_EXT_F16C | \
- CPUID_EXT_FMA | CPUID_EXT_KERNEL_FEATURES)
+ CPUID_EXT_FMA | CPUID_EXT_X2APIC | CPUID_EXT_KERNEL_FEATURES)
/* missing:
CPUID_EXT_DTES64, CPUID_EXT_DSCPL, CPUID_EXT_VMX, CPUID_EXT_SMX,
CPUID_EXT_EST, CPUID_EXT_TM2, CPUID_EXT_CID,
CPUID_EXT_XTPR, CPUID_EXT_PDCM, CPUID_EXT_PCID, CPUID_EXT_DCA,
- CPUID_EXT_X2APIC, CPUID_EXT_TSC_DEADLINE_TIMER */
+ CPUID_EXT_TSC_DEADLINE_TIMER
+ */
#ifdef TARGET_X86_64
#define TCG_EXT2_X86_64_FEATURES CPUID_EXT2_LM
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 32a16453d0..9f0d690bfc 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -379,6 +379,10 @@ typedef enum X86Seg {
#define MSR_IA32_APICBASE_ENABLE (1<<11)
#define MSR_IA32_APICBASE_EXTD (1 << 10)
#define MSR_IA32_APICBASE_BASE (0xfffffU<<12)
+#define MSR_IA32_APICBASE_RESERVED \
+ (~(uint64_t)(MSR_IA32_APICBASE_BSP | MSR_IA32_APICBASE_ENABLE \
+ | MSR_IA32_APICBASE_EXTD | MSR_IA32_APICBASE_BASE))
+
#define MSR_IA32_FEATURE_CONTROL 0x0000003a
#define MSR_TSC_ADJUST 0x0000003b
#define MSR_IA32_SPEC_CTRL 0x48
diff --git a/target/i386/tcg/sysemu/misc_helper.c b/target/i386/tcg/sysemu/misc_helper.c
index 6fccdb3dca..29260d657d 100644
--- a/target/i386/tcg/sysemu/misc_helper.c
+++ b/target/i386/tcg/sysemu/misc_helper.c
@@ -158,9 +158,19 @@ void helper_wrmsr(CPUX86State *env)
case MSR_IA32_SYSENTER_EIP:
env->sysenter_eip = val;
break;
- case MSR_IA32_APICBASE:
- cpu_set_apic_base(env_archcpu(env)->apic_state, val);
+ case MSR_IA32_APICBASE: {
+ int ret;
+
+ if (val & MSR_IA32_APICBASE_RESERVED) {
+ goto error;
+ }
+
+ ret = cpu_set_apic_base(env_archcpu(env)->apic_state, val);
+ if (ret < 0) {
+ goto error;
+ }
break;
+ }
case MSR_EFER:
{
uint64_t update_mask;
diff --git a/target/i386/whpx/whpx-apic.c b/target/i386/whpx/whpx-apic.c
index 8710e37567..7e14ded978 100644
--- a/target/i386/whpx/whpx-apic.c
+++ b/target/i386/whpx/whpx-apic.c
@@ -90,9 +90,10 @@ static void whpx_get_apic_state(APICCommonState *s,
apic_next_timer(s, s->initial_count_load_time);
}
-static void whpx_apic_set_base(APICCommonState *s, uint64_t val)
+static int whpx_apic_set_base(APICCommonState *s, uint64_t val)
{
s->apicbase = val;
+ return 0;
}
static void whpx_put_apic_base(CPUState *cpu, uint64_t val)
--
2.25.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v9 4/5] intel_iommu: allow Extended Interrupt Mode when using userspace APIC
2023-10-24 15:21 [PATCH v9 0/5] Support x2APIC mode with TCG accelerator Bui Quang Minh
` (2 preceding siblings ...)
2023-10-24 15:21 ` [PATCH v9 3/5] apic, i386/tcg: add x2apic transitions Bui Quang Minh
@ 2023-10-24 15:21 ` Bui Quang Minh
2023-10-24 15:21 ` [PATCH v9 5/5] amd_iommu: report x2APIC support to the operating system Bui Quang Minh
2023-11-09 10:11 ` [PATCH v9 0/5] Support x2APIC mode with TCG accelerator Santosh Shukla
5 siblings, 0 replies; 15+ messages in thread
From: Bui Quang Minh @ 2023-10-24 15:21 UTC (permalink / raw)
To: qemu-devel
Cc: David Woodhouse, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Michael S . Tsirkin, Marcel Apfelbaum,
Igor Mammedov, Alex Bennée, Joao Martins, Peter Xu,
Jason Wang, Philippe Mathieu-Daudé, Phil Dennis-Jordan,
Bui Quang Minh
As userspace APIC now supports x2APIC, intel interrupt remapping
hardware can be set to EIM mode when userspace local APIC is used.
Suggested-by: Joao Martins <joao.m.martins@oracle.com>
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
---
hw/i386/intel_iommu.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index e4f6cedcb1..848511b7f8 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -4049,11 +4049,7 @@ static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
ON_OFF_AUTO_ON : ON_OFF_AUTO_OFF;
}
if (s->intr_eim == ON_OFF_AUTO_ON && !s->buggy_eim) {
- if (!kvm_irqchip_is_split()) {
- error_setg(errp, "eim=on requires accel=kvm,kernel-irqchip=split");
- return false;
- }
- if (kvm_enabled() && !kvm_enable_x2apic()) {
+ if (kvm_irqchip_is_split() && !kvm_enable_x2apic()) {
error_setg(errp, "eim=on requires support on the KVM side"
"(X2APIC_API, first shipped in v4.7)");
return false;
--
2.25.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v9 5/5] amd_iommu: report x2APIC support to the operating system
2023-10-24 15:21 [PATCH v9 0/5] Support x2APIC mode with TCG accelerator Bui Quang Minh
` (3 preceding siblings ...)
2023-10-24 15:21 ` [PATCH v9 4/5] intel_iommu: allow Extended Interrupt Mode when using userspace APIC Bui Quang Minh
@ 2023-10-24 15:21 ` Bui Quang Minh
2023-11-07 0:39 ` Michael S. Tsirkin
2023-11-09 10:11 ` [PATCH v9 0/5] Support x2APIC mode with TCG accelerator Santosh Shukla
5 siblings, 1 reply; 15+ messages in thread
From: Bui Quang Minh @ 2023-10-24 15:21 UTC (permalink / raw)
To: qemu-devel
Cc: David Woodhouse, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Michael S . Tsirkin, Marcel Apfelbaum,
Igor Mammedov, Alex Bennée, Joao Martins, Peter Xu,
Jason Wang, Philippe Mathieu-Daudé, Phil Dennis-Jordan,
Bui Quang Minh
This commit adds XTSup configuration to let user choose to whether enable
this feature or not. When XTSup is enabled, additional bytes in IRTE with
enabled guest virtual VAPIC are used to support 32-bit destination id.
Additionally, this commit exports IVHD type 0x11 besides the old IVHD type
0x10 in ACPI table. IVHD type 0x10 does not report full set of IOMMU
features only the legacy ones, so operating system (e.g. Linux) may only
detects x2APIC support if IVHD type 0x11 is available. The IVHD type 0x10
is kept so that old operating system that only parses type 0x10 can detect
the IOMMU device.
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
---
hw/i386/acpi-build.c | 129 +++++++++++++++++++++++++++----------------
hw/i386/amd_iommu.c | 29 +++++++++-
hw/i386/amd_iommu.h | 16 ++++--
3 files changed, 117 insertions(+), 57 deletions(-)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 3f2b27cf75..8069971e54 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2337,30 +2337,23 @@ static void
build_amd_iommu(GArray *table_data, BIOSLinker *linker, const char *oem_id,
const char *oem_table_id)
{
- int ivhd_table_len = 24;
AMDVIState *s = AMD_IOMMU_DEVICE(x86_iommu_get_default());
GArray *ivhd_blob = g_array_new(false, true, 1);
AcpiTable table = { .sig = "IVRS", .rev = 1, .oem_id = oem_id,
.oem_table_id = oem_table_id };
+ uint64_t feature_report;
acpi_table_begin(&table, table_data);
/* IVinfo - IO virtualization information common to all
* IOMMU units in a system
*/
- build_append_int_noprefix(table_data, 40UL << 8/* PASize */, 4);
+ build_append_int_noprefix(table_data,
+ (1UL << 0) | /* EFRSup */
+ (40UL << 8), /* PASize */
+ 4);
/* reserved */
build_append_int_noprefix(table_data, 0, 8);
- /* IVHD definition - type 10h */
- build_append_int_noprefix(table_data, 0x10, 1);
- /* virtualization flags */
- build_append_int_noprefix(table_data,
- (1UL << 0) | /* HtTunEn */
- (1UL << 4) | /* iotblSup */
- (1UL << 6) | /* PrefSup */
- (1UL << 7), /* PPRSup */
- 1);
-
/*
* A PCI bus walk, for each PCI host bridge, is necessary to create a
* complete set of IVHD entries. Do this into a separate blob so that we
@@ -2380,56 +2373,94 @@ build_amd_iommu(GArray *table_data, BIOSLinker *linker, const char *oem_id,
build_append_int_noprefix(ivhd_blob, 0x0000001, 4);
}
- ivhd_table_len += ivhd_blob->len;
-
/*
* When interrupt remapping is supported, we add a special IVHD device
- * for type IO-APIC.
- */
- if (x86_iommu_ir_supported(x86_iommu_get_default())) {
- ivhd_table_len += 8;
- }
-
- /* IVHD length */
- build_append_int_noprefix(table_data, ivhd_table_len, 2);
- /* DeviceID */
- build_append_int_noprefix(table_data,
- object_property_get_int(OBJECT(&s->pci), "addr",
- &error_abort), 2);
- /* Capability offset */
- build_append_int_noprefix(table_data, s->pci.capab_offset, 2);
- /* IOMMU base address */
- build_append_int_noprefix(table_data, s->mmio.addr, 8);
- /* PCI Segment Group */
- build_append_int_noprefix(table_data, 0, 2);
- /* IOMMU info */
- build_append_int_noprefix(table_data, 0, 2);
- /* IOMMU Feature Reporting */
- build_append_int_noprefix(table_data,
- (48UL << 30) | /* HATS */
- (48UL << 28) | /* GATS */
- (1UL << 2) | /* GTSup */
- (1UL << 6), /* GASup */
- 4);
-
- /* IVHD entries as found above */
- g_array_append_vals(table_data, ivhd_blob->data, ivhd_blob->len);
- g_array_free(ivhd_blob, TRUE);
-
- /*
- * Add a special IVHD device type.
+ * for type IO-APIC
* Refer to spec - Table 95: IVHD device entry type codes
*
* Linux IOMMU driver checks for the special IVHD device (type IO-APIC).
* See Linux kernel commit 'c2ff5cf5294bcbd7fa50f7d860e90a66db7e5059'
*/
if (x86_iommu_ir_supported(x86_iommu_get_default())) {
- build_append_int_noprefix(table_data,
+ build_append_int_noprefix(ivhd_blob,
(0x1ull << 56) | /* type IOAPIC */
(IOAPIC_SB_DEVID << 40) | /* IOAPIC devid */
0x48, /* special device */
8);
}
+
+ /* IVHD definition - type 10h */
+ build_append_int_noprefix(table_data, 0x10, 1);
+ /* virtualization flags */
+ build_append_int_noprefix(table_data,
+ (1UL << 0) | /* HtTunEn */
+ (1UL << 4) | /* iotblSup */
+ (1UL << 6) | /* PrefSup */
+ (1UL << 7), /* PPRSup */
+ 1);
+
+ /* IVHD length */
+ build_append_int_noprefix(table_data, ivhd_blob->len + 24, 2);
+ /* DeviceID */
+ build_append_int_noprefix(table_data,
+ object_property_get_int(OBJECT(&s->pci), "addr",
+ &error_abort), 2);
+ /* Capability offset */
+ build_append_int_noprefix(table_data, s->pci.capab_offset, 2);
+ /* IOMMU base address */
+ build_append_int_noprefix(table_data, s->mmio.addr, 8);
+ /* PCI Segment Group */
+ build_append_int_noprefix(table_data, 0, 2);
+ /* IOMMU info */
+ build_append_int_noprefix(table_data, 0, 2);
+ /* IOMMU Feature Reporting */
+ feature_report = (48UL << 30) | /* HATS */
+ (48UL << 28) | /* GATS */
+ (1UL << 2) | /* GTSup */
+ (1UL << 6); /* GASup */
+ if (s->xtsup) {
+ feature_report |= (1UL << 0); /* XTSup */
+ }
+ build_append_int_noprefix(table_data, feature_report, 4);
+
+ /* IVHD entries as found above */
+ g_array_append_vals(table_data, ivhd_blob->data, ivhd_blob->len);
+
+ /* IVHD definition - type 11h */
+ build_append_int_noprefix(table_data, 0x11, 1);
+ /* virtualization flags */
+ build_append_int_noprefix(table_data,
+ (1UL << 0) | /* HtTunEn */
+ (1UL << 4), /* iotblSup */
+ 1);
+
+ /* IVHD length */
+ build_append_int_noprefix(table_data, ivhd_blob->len + 40, 2);
+ /* DeviceID */
+ build_append_int_noprefix(table_data,
+ object_property_get_int(OBJECT(&s->pci), "addr",
+ &error_abort), 2);
+ /* Capability offset */
+ build_append_int_noprefix(table_data, s->pci.capab_offset, 2);
+ /* IOMMU base address */
+ build_append_int_noprefix(table_data, s->mmio.addr, 8);
+ /* PCI Segment Group */
+ build_append_int_noprefix(table_data, 0, 2);
+ /* IOMMU info */
+ build_append_int_noprefix(table_data, 0, 2);
+ /* IOMMU Attributes */
+ build_append_int_noprefix(table_data, 0, 4);
+ /* EFR Register Image */
+ build_append_int_noprefix(table_data,
+ amdvi_extended_feature_register(s),
+ 8);
+ /* EFR Register Image 2 */
+ build_append_int_noprefix(table_data, 0, 8);
+
+ /* IVHD entries as found above */
+ g_array_append_vals(table_data, ivhd_blob->data, ivhd_blob->len);
+
+ g_array_free(ivhd_blob, TRUE);
acpi_table_end(linker, &table);
}
diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index 7965415b47..e7809b641a 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -31,6 +31,7 @@
#include "hw/i386/apic_internal.h"
#include "trace.h"
#include "hw/i386/apic-msidef.h"
+#include "hw/qdev-properties.h"
/* used AMD-Vi MMIO registers */
const char *amdvi_mmio_low[] = {
@@ -74,6 +75,16 @@ typedef struct AMDVIIOTLBEntry {
uint64_t page_mask; /* physical page size */
} AMDVIIOTLBEntry;
+uint64_t amdvi_extended_feature_register(AMDVIState *s)
+{
+ uint64_t feature = AMDVI_DEFAULT_EXT_FEATURES;
+ if (s->xtsup) {
+ feature |= AMDVI_FEATURE_XT;
+ }
+
+ return feature;
+}
+
/* configure MMIO registers at startup/reset */
static void amdvi_set_quad(AMDVIState *s, hwaddr addr, uint64_t val,
uint64_t romask, uint64_t w1cmask)
@@ -1155,7 +1166,12 @@ static int amdvi_int_remap_ga(AMDVIState *iommu,
irq->vector = irte.hi.fields.vector;
irq->dest_mode = irte.lo.fields_remap.dm;
irq->redir_hint = irte.lo.fields_remap.rq_eoi;
- irq->dest = irte.lo.fields_remap.destination;
+ if (iommu->xtsup) {
+ irq->dest = irte.lo.fields_remap.destination |
+ (irte.hi.fields.destination_hi << 24);
+ } else {
+ irq->dest = irte.lo.fields_remap.destination & 0xff;
+ }
return 0;
}
@@ -1501,8 +1517,9 @@ static void amdvi_init(AMDVIState *s)
/* reset MMIO */
memset(s->mmior, 0, AMDVI_MMIO_SIZE);
- amdvi_set_quad(s, AMDVI_MMIO_EXT_FEATURES, AMDVI_EXT_FEATURES,
- 0xffffffffffffffef, 0);
+ amdvi_set_quad(s, AMDVI_MMIO_EXT_FEATURES,
+ amdvi_extended_feature_register(s),
+ 0xffffffffffffffef, 0);
amdvi_set_quad(s, AMDVI_MMIO_STATUS, 0, 0x98, 0x67);
}
@@ -1585,6 +1602,11 @@ static void amdvi_sysbus_realize(DeviceState *dev, Error **errp)
amdvi_init(s);
}
+static Property amdvi_properties[] = {
+ DEFINE_PROP_BOOL("xtsup", AMDVIState, xtsup, false),
+ DEFINE_PROP_END_OF_LIST(),
+};
+
static const VMStateDescription vmstate_amdvi_sysbus = {
.name = "amd-iommu",
.unmigratable = 1
@@ -1611,6 +1633,7 @@ static void amdvi_sysbus_class_init(ObjectClass *klass, void *data)
dc->user_creatable = true;
set_bit(DEVICE_CATEGORY_MISC, dc->categories);
dc->desc = "AMD IOMMU (AMD-Vi) DMA Remapping device";
+ device_class_set_props(dc, amdvi_properties);
}
static const TypeInfo amdvi_sysbus = {
diff --git a/hw/i386/amd_iommu.h b/hw/i386/amd_iommu.h
index c5065a3e27..73619fe9ea 100644
--- a/hw/i386/amd_iommu.h
+++ b/hw/i386/amd_iommu.h
@@ -154,6 +154,7 @@
#define AMDVI_FEATURE_PREFETCH (1ULL << 0) /* page prefetch */
#define AMDVI_FEATURE_PPR (1ULL << 1) /* PPR Support */
+#define AMDVI_FEATURE_XT (1ULL << 2) /* x2APIC Support */
#define AMDVI_FEATURE_GT (1ULL << 4) /* Guest Translation */
#define AMDVI_FEATURE_IA (1ULL << 6) /* inval all support */
#define AMDVI_FEATURE_GA (1ULL << 7) /* guest VAPIC support */
@@ -173,8 +174,9 @@
#define AMDVI_IOTLB_MAX_SIZE 1024
#define AMDVI_DEVID_SHIFT 36
-/* extended feature support */
-#define AMDVI_EXT_FEATURES (AMDVI_FEATURE_PREFETCH | AMDVI_FEATURE_PPR | \
+/* default extended feature */
+#define AMDVI_DEFAULT_EXT_FEATURES \
+ (AMDVI_FEATURE_PREFETCH | AMDVI_FEATURE_PPR | \
AMDVI_FEATURE_IA | AMDVI_FEATURE_GT | AMDVI_FEATURE_HE | \
AMDVI_GATS_MODE | AMDVI_HATS_MODE | AMDVI_FEATURE_GA)
@@ -276,8 +278,8 @@ union irte_ga_lo {
dm:1,
/* ------ */
guest_mode:1,
- destination:8,
- rsvd_1:48;
+ destination:24,
+ rsvd_1:32;
} fields_remap;
};
@@ -285,7 +287,8 @@ union irte_ga_hi {
uint64_t val;
struct {
uint64_t vector:8,
- rsvd_2:56;
+ rsvd_2:48,
+ destination_hi:8;
} fields;
};
@@ -364,6 +367,9 @@ struct AMDVIState {
/* Interrupt remapping */
bool ga_enabled;
+ bool xtsup;
};
+uint64_t amdvi_extended_feature_register(AMDVIState *s);
+
#endif
--
2.25.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v9 5/5] amd_iommu: report x2APIC support to the operating system
2023-10-24 15:21 ` [PATCH v9 5/5] amd_iommu: report x2APIC support to the operating system Bui Quang Minh
@ 2023-11-07 0:39 ` Michael S. Tsirkin
2023-11-08 14:22 ` Bui Quang Minh
0 siblings, 1 reply; 15+ messages in thread
From: Michael S. Tsirkin @ 2023-11-07 0:39 UTC (permalink / raw)
To: Bui Quang Minh
Cc: qemu-devel, David Woodhouse, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Marcel Apfelbaum, Igor Mammedov,
Alex Bennée, Joao Martins, Peter Xu, Jason Wang,
Philippe Mathieu-Daudé, Phil Dennis-Jordan
On Tue, Oct 24, 2023 at 10:21:05PM +0700, Bui Quang Minh wrote:
> This commit adds XTSup configuration to let user choose to whether enable
> this feature or not. When XTSup is enabled, additional bytes in IRTE with
> enabled guest virtual VAPIC are used to support 32-bit destination id.
>
> Additionally, this commit exports IVHD type 0x11 besides the old IVHD type
> 0x10 in ACPI table. IVHD type 0x10 does not report full set of IOMMU
> features only the legacy ones, so operating system (e.g. Linux) may only
> detects x2APIC support if IVHD type 0x11 is available. The IVHD type 0x10
> is kept so that old operating system that only parses type 0x10 can detect
> the IOMMU device.
>
> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
changes IVRS without updating expected files for tests.
result seems to be CI failures:
https://gitlab.com/mstredhat/qemu/-/jobs/5470533834
> ---
> hw/i386/acpi-build.c | 129 +++++++++++++++++++++++++++----------------
> hw/i386/amd_iommu.c | 29 +++++++++-
> hw/i386/amd_iommu.h | 16 ++++--
> 3 files changed, 117 insertions(+), 57 deletions(-)
>
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index 3f2b27cf75..8069971e54 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -2337,30 +2337,23 @@ static void
> build_amd_iommu(GArray *table_data, BIOSLinker *linker, const char *oem_id,
> const char *oem_table_id)
> {
> - int ivhd_table_len = 24;
> AMDVIState *s = AMD_IOMMU_DEVICE(x86_iommu_get_default());
> GArray *ivhd_blob = g_array_new(false, true, 1);
> AcpiTable table = { .sig = "IVRS", .rev = 1, .oem_id = oem_id,
> .oem_table_id = oem_table_id };
> + uint64_t feature_report;
>
> acpi_table_begin(&table, table_data);
> /* IVinfo - IO virtualization information common to all
> * IOMMU units in a system
> */
> - build_append_int_noprefix(table_data, 40UL << 8/* PASize */, 4);
> + build_append_int_noprefix(table_data,
> + (1UL << 0) | /* EFRSup */
> + (40UL << 8), /* PASize */
> + 4);
> /* reserved */
> build_append_int_noprefix(table_data, 0, 8);
>
> - /* IVHD definition - type 10h */
> - build_append_int_noprefix(table_data, 0x10, 1);
> - /* virtualization flags */
> - build_append_int_noprefix(table_data,
> - (1UL << 0) | /* HtTunEn */
> - (1UL << 4) | /* iotblSup */
> - (1UL << 6) | /* PrefSup */
> - (1UL << 7), /* PPRSup */
> - 1);
> -
> /*
> * A PCI bus walk, for each PCI host bridge, is necessary to create a
> * complete set of IVHD entries. Do this into a separate blob so that we
> @@ -2380,56 +2373,94 @@ build_amd_iommu(GArray *table_data, BIOSLinker *linker, const char *oem_id,
> build_append_int_noprefix(ivhd_blob, 0x0000001, 4);
> }
>
> - ivhd_table_len += ivhd_blob->len;
> -
> /*
> * When interrupt remapping is supported, we add a special IVHD device
> - * for type IO-APIC.
> - */
> - if (x86_iommu_ir_supported(x86_iommu_get_default())) {
> - ivhd_table_len += 8;
> - }
> -
> - /* IVHD length */
> - build_append_int_noprefix(table_data, ivhd_table_len, 2);
> - /* DeviceID */
> - build_append_int_noprefix(table_data,
> - object_property_get_int(OBJECT(&s->pci), "addr",
> - &error_abort), 2);
> - /* Capability offset */
> - build_append_int_noprefix(table_data, s->pci.capab_offset, 2);
> - /* IOMMU base address */
> - build_append_int_noprefix(table_data, s->mmio.addr, 8);
> - /* PCI Segment Group */
> - build_append_int_noprefix(table_data, 0, 2);
> - /* IOMMU info */
> - build_append_int_noprefix(table_data, 0, 2);
> - /* IOMMU Feature Reporting */
> - build_append_int_noprefix(table_data,
> - (48UL << 30) | /* HATS */
> - (48UL << 28) | /* GATS */
> - (1UL << 2) | /* GTSup */
> - (1UL << 6), /* GASup */
> - 4);
> -
> - /* IVHD entries as found above */
> - g_array_append_vals(table_data, ivhd_blob->data, ivhd_blob->len);
> - g_array_free(ivhd_blob, TRUE);
> -
> - /*
> - * Add a special IVHD device type.
> + * for type IO-APIC
> * Refer to spec - Table 95: IVHD device entry type codes
> *
> * Linux IOMMU driver checks for the special IVHD device (type IO-APIC).
> * See Linux kernel commit 'c2ff5cf5294bcbd7fa50f7d860e90a66db7e5059'
> */
> if (x86_iommu_ir_supported(x86_iommu_get_default())) {
> - build_append_int_noprefix(table_data,
> + build_append_int_noprefix(ivhd_blob,
> (0x1ull << 56) | /* type IOAPIC */
> (IOAPIC_SB_DEVID << 40) | /* IOAPIC devid */
> 0x48, /* special device */
> 8);
> }
> +
> + /* IVHD definition - type 10h */
> + build_append_int_noprefix(table_data, 0x10, 1);
> + /* virtualization flags */
> + build_append_int_noprefix(table_data,
> + (1UL << 0) | /* HtTunEn */
> + (1UL << 4) | /* iotblSup */
> + (1UL << 6) | /* PrefSup */
> + (1UL << 7), /* PPRSup */
> + 1);
> +
> + /* IVHD length */
> + build_append_int_noprefix(table_data, ivhd_blob->len + 24, 2);
> + /* DeviceID */
> + build_append_int_noprefix(table_data,
> + object_property_get_int(OBJECT(&s->pci), "addr",
> + &error_abort), 2);
> + /* Capability offset */
> + build_append_int_noprefix(table_data, s->pci.capab_offset, 2);
> + /* IOMMU base address */
> + build_append_int_noprefix(table_data, s->mmio.addr, 8);
> + /* PCI Segment Group */
> + build_append_int_noprefix(table_data, 0, 2);
> + /* IOMMU info */
> + build_append_int_noprefix(table_data, 0, 2);
> + /* IOMMU Feature Reporting */
> + feature_report = (48UL << 30) | /* HATS */
> + (48UL << 28) | /* GATS */
> + (1UL << 2) | /* GTSup */
> + (1UL << 6); /* GASup */
> + if (s->xtsup) {
> + feature_report |= (1UL << 0); /* XTSup */
> + }
> + build_append_int_noprefix(table_data, feature_report, 4);
> +
> + /* IVHD entries as found above */
> + g_array_append_vals(table_data, ivhd_blob->data, ivhd_blob->len);
> +
> + /* IVHD definition - type 11h */
> + build_append_int_noprefix(table_data, 0x11, 1);
> + /* virtualization flags */
> + build_append_int_noprefix(table_data,
> + (1UL << 0) | /* HtTunEn */
> + (1UL << 4), /* iotblSup */
> + 1);
> +
> + /* IVHD length */
> + build_append_int_noprefix(table_data, ivhd_blob->len + 40, 2);
> + /* DeviceID */
> + build_append_int_noprefix(table_data,
> + object_property_get_int(OBJECT(&s->pci), "addr",
> + &error_abort), 2);
> + /* Capability offset */
> + build_append_int_noprefix(table_data, s->pci.capab_offset, 2);
> + /* IOMMU base address */
> + build_append_int_noprefix(table_data, s->mmio.addr, 8);
> + /* PCI Segment Group */
> + build_append_int_noprefix(table_data, 0, 2);
> + /* IOMMU info */
> + build_append_int_noprefix(table_data, 0, 2);
> + /* IOMMU Attributes */
> + build_append_int_noprefix(table_data, 0, 4);
> + /* EFR Register Image */
> + build_append_int_noprefix(table_data,
> + amdvi_extended_feature_register(s),
> + 8);
> + /* EFR Register Image 2 */
> + build_append_int_noprefix(table_data, 0, 8);
> +
> + /* IVHD entries as found above */
> + g_array_append_vals(table_data, ivhd_blob->data, ivhd_blob->len);
> +
> + g_array_free(ivhd_blob, TRUE);
> acpi_table_end(linker, &table);
> }
>
> diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
> index 7965415b47..e7809b641a 100644
> --- a/hw/i386/amd_iommu.c
> +++ b/hw/i386/amd_iommu.c
> @@ -31,6 +31,7 @@
> #include "hw/i386/apic_internal.h"
> #include "trace.h"
> #include "hw/i386/apic-msidef.h"
> +#include "hw/qdev-properties.h"
>
> /* used AMD-Vi MMIO registers */
> const char *amdvi_mmio_low[] = {
> @@ -74,6 +75,16 @@ typedef struct AMDVIIOTLBEntry {
> uint64_t page_mask; /* physical page size */
> } AMDVIIOTLBEntry;
>
> +uint64_t amdvi_extended_feature_register(AMDVIState *s)
> +{
> + uint64_t feature = AMDVI_DEFAULT_EXT_FEATURES;
> + if (s->xtsup) {
> + feature |= AMDVI_FEATURE_XT;
> + }
> +
> + return feature;
> +}
> +
> /* configure MMIO registers at startup/reset */
> static void amdvi_set_quad(AMDVIState *s, hwaddr addr, uint64_t val,
> uint64_t romask, uint64_t w1cmask)
> @@ -1155,7 +1166,12 @@ static int amdvi_int_remap_ga(AMDVIState *iommu,
> irq->vector = irte.hi.fields.vector;
> irq->dest_mode = irte.lo.fields_remap.dm;
> irq->redir_hint = irte.lo.fields_remap.rq_eoi;
> - irq->dest = irte.lo.fields_remap.destination;
> + if (iommu->xtsup) {
> + irq->dest = irte.lo.fields_remap.destination |
> + (irte.hi.fields.destination_hi << 24);
> + } else {
> + irq->dest = irte.lo.fields_remap.destination & 0xff;
> + }
>
> return 0;
> }
> @@ -1501,8 +1517,9 @@ static void amdvi_init(AMDVIState *s)
>
> /* reset MMIO */
> memset(s->mmior, 0, AMDVI_MMIO_SIZE);
> - amdvi_set_quad(s, AMDVI_MMIO_EXT_FEATURES, AMDVI_EXT_FEATURES,
> - 0xffffffffffffffef, 0);
> + amdvi_set_quad(s, AMDVI_MMIO_EXT_FEATURES,
> + amdvi_extended_feature_register(s),
> + 0xffffffffffffffef, 0);
> amdvi_set_quad(s, AMDVI_MMIO_STATUS, 0, 0x98, 0x67);
> }
>
> @@ -1585,6 +1602,11 @@ static void amdvi_sysbus_realize(DeviceState *dev, Error **errp)
> amdvi_init(s);
> }
>
> +static Property amdvi_properties[] = {
> + DEFINE_PROP_BOOL("xtsup", AMDVIState, xtsup, false),
> + DEFINE_PROP_END_OF_LIST(),
> +};
> +
> static const VMStateDescription vmstate_amdvi_sysbus = {
> .name = "amd-iommu",
> .unmigratable = 1
> @@ -1611,6 +1633,7 @@ static void amdvi_sysbus_class_init(ObjectClass *klass, void *data)
> dc->user_creatable = true;
> set_bit(DEVICE_CATEGORY_MISC, dc->categories);
> dc->desc = "AMD IOMMU (AMD-Vi) DMA Remapping device";
> + device_class_set_props(dc, amdvi_properties);
> }
>
> static const TypeInfo amdvi_sysbus = {
> diff --git a/hw/i386/amd_iommu.h b/hw/i386/amd_iommu.h
> index c5065a3e27..73619fe9ea 100644
> --- a/hw/i386/amd_iommu.h
> +++ b/hw/i386/amd_iommu.h
> @@ -154,6 +154,7 @@
>
> #define AMDVI_FEATURE_PREFETCH (1ULL << 0) /* page prefetch */
> #define AMDVI_FEATURE_PPR (1ULL << 1) /* PPR Support */
> +#define AMDVI_FEATURE_XT (1ULL << 2) /* x2APIC Support */
> #define AMDVI_FEATURE_GT (1ULL << 4) /* Guest Translation */
> #define AMDVI_FEATURE_IA (1ULL << 6) /* inval all support */
> #define AMDVI_FEATURE_GA (1ULL << 7) /* guest VAPIC support */
> @@ -173,8 +174,9 @@
> #define AMDVI_IOTLB_MAX_SIZE 1024
> #define AMDVI_DEVID_SHIFT 36
>
> -/* extended feature support */
> -#define AMDVI_EXT_FEATURES (AMDVI_FEATURE_PREFETCH | AMDVI_FEATURE_PPR | \
> +/* default extended feature */
> +#define AMDVI_DEFAULT_EXT_FEATURES \
> + (AMDVI_FEATURE_PREFETCH | AMDVI_FEATURE_PPR | \
> AMDVI_FEATURE_IA | AMDVI_FEATURE_GT | AMDVI_FEATURE_HE | \
> AMDVI_GATS_MODE | AMDVI_HATS_MODE | AMDVI_FEATURE_GA)
>
> @@ -276,8 +278,8 @@ union irte_ga_lo {
> dm:1,
> /* ------ */
> guest_mode:1,
> - destination:8,
> - rsvd_1:48;
> + destination:24,
> + rsvd_1:32;
> } fields_remap;
> };
>
> @@ -285,7 +287,8 @@ union irte_ga_hi {
> uint64_t val;
> struct {
> uint64_t vector:8,
> - rsvd_2:56;
> + rsvd_2:48,
> + destination_hi:8;
> } fields;
> };
>
> @@ -364,6 +367,9 @@ struct AMDVIState {
>
> /* Interrupt remapping */
> bool ga_enabled;
> + bool xtsup;
> };
>
> +uint64_t amdvi_extended_feature_register(AMDVIState *s);
> +
> #endif
> --
> 2.25.1
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v9 5/5] amd_iommu: report x2APIC support to the operating system
2023-11-07 0:39 ` Michael S. Tsirkin
@ 2023-11-08 14:22 ` Bui Quang Minh
2023-11-08 19:44 ` Michael S. Tsirkin
0 siblings, 1 reply; 15+ messages in thread
From: Bui Quang Minh @ 2023-11-08 14:22 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: qemu-devel, David Woodhouse, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Marcel Apfelbaum, Igor Mammedov,
Alex Bennée, Joao Martins, Peter Xu, Jason Wang,
Philippe Mathieu-Daudé, Phil Dennis-Jordan
On 11/7/23 07:39, Michael S. Tsirkin wrote:
> On Tue, Oct 24, 2023 at 10:21:05PM +0700, Bui Quang Minh wrote:
>> This commit adds XTSup configuration to let user choose to whether enable
>> this feature or not. When XTSup is enabled, additional bytes in IRTE with
>> enabled guest virtual VAPIC are used to support 32-bit destination id.
>>
>> Additionally, this commit exports IVHD type 0x11 besides the old IVHD type
>> 0x10 in ACPI table. IVHD type 0x10 does not report full set of IOMMU
>> features only the legacy ones, so operating system (e.g. Linux) may only
>> detects x2APIC support if IVHD type 0x11 is available. The IVHD type 0x10
>> is kept so that old operating system that only parses type 0x10 can detect
>> the IOMMU device.
>>
>> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>> Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
>
>
> changes IVRS without updating expected files for tests.
> result seems to be CI failures:
> https://gitlab.com/mstredhat/qemu/-/jobs/5470533834
Thanks Michael, I am preparing the fix in the next version. I've read
the instructions to update the test data in bios-tables-test.c. It says
I need to create some separate patches to update the test data. Are
there any reasons for this? I intend to change the binary and include
the ASL diff into the commit message. Is it enough?
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v9 5/5] amd_iommu: report x2APIC support to the operating system
2023-11-08 14:22 ` Bui Quang Minh
@ 2023-11-08 19:44 ` Michael S. Tsirkin
2023-11-09 14:12 ` Bui Quang Minh
0 siblings, 1 reply; 15+ messages in thread
From: Michael S. Tsirkin @ 2023-11-08 19:44 UTC (permalink / raw)
To: Bui Quang Minh
Cc: qemu-devel, David Woodhouse, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Marcel Apfelbaum, Igor Mammedov,
Alex Bennée, Joao Martins, Peter Xu, Jason Wang,
Philippe Mathieu-Daudé, Phil Dennis-Jordan
On Wed, Nov 08, 2023 at 09:22:18PM +0700, Bui Quang Minh wrote:
> On 11/7/23 07:39, Michael S. Tsirkin wrote:
> > On Tue, Oct 24, 2023 at 10:21:05PM +0700, Bui Quang Minh wrote:
> > > This commit adds XTSup configuration to let user choose to whether enable
> > > this feature or not. When XTSup is enabled, additional bytes in IRTE with
> > > enabled guest virtual VAPIC are used to support 32-bit destination id.
> > >
> > > Additionally, this commit exports IVHD type 0x11 besides the old IVHD type
> > > 0x10 in ACPI table. IVHD type 0x10 does not report full set of IOMMU
> > > features only the legacy ones, so operating system (e.g. Linux) may only
> > > detects x2APIC support if IVHD type 0x11 is available. The IVHD type 0x10
> > > is kept so that old operating system that only parses type 0x10 can detect
> > > the IOMMU device.
> > >
> > > Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
> > > Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
> >
> >
> > changes IVRS without updating expected files for tests.
> > result seems to be CI failures:
> > https://gitlab.com/mstredhat/qemu/-/jobs/5470533834
>
>
> Thanks Michael, I am preparing the fix in the next version. I've read the
> instructions to update the test data in bios-tables-test.c. It says I need
> to create some separate patches to update the test data. Are there any
> reasons for this? I intend to change the binary and include the ASL diff
> into the commit message. Is it enough?
No, not enough. No, do not ignore the rules please. Yes, there's a
reason. The reason is that I need to be able to rebase your patches. I
then regenerate the binaries. If the patch includes binaries it won't
rebase.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v9 0/5] Support x2APIC mode with TCG accelerator
2023-10-24 15:21 [PATCH v9 0/5] Support x2APIC mode with TCG accelerator Bui Quang Minh
` (4 preceding siblings ...)
2023-10-24 15:21 ` [PATCH v9 5/5] amd_iommu: report x2APIC support to the operating system Bui Quang Minh
@ 2023-11-09 10:11 ` Santosh Shukla
2023-11-09 14:10 ` Bui Quang Minh
5 siblings, 1 reply; 15+ messages in thread
From: Santosh Shukla @ 2023-11-09 10:11 UTC (permalink / raw)
To: Bui Quang Minh, qemu-devel
Cc: David Woodhouse, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Michael S . Tsirkin, Marcel Apfelbaum,
Igor Mammedov, Alex Bennée, Joao Martins, Peter Xu,
Jason Wang, Philippe Mathieu-Daudé, Phil Dennis-Jordan,
Suravee Suthikulpanit
On 10/24/2023 8:51 PM, Bui Quang Minh wrote:
> Hi everyone,
>
> This series implements x2APIC mode in userspace local APIC and the
> RDMSR/WRMSR helper to access x2APIC registers in x2APIC mode. Intel iommu
> and AMD iommu are adjusted to support x2APIC interrupt remapping. With this
> series, we can now boot Linux kernel into x2APIC mode with TCG accelerator
> using either Intel or AMD iommu.
>
> Testing to boot my own built Linux 6.3.0-rc2, the kernel successfully boot
> with enabled x2APIC and can enumerate CPU with APIC ID 257
>
> Using Intel IOMMU
>
> qemu/build/qemu-system-x86_64 \
> -smp 2,maxcpus=260 \
> -cpu qemu64,x2apic=on \
> -machine q35 \
> -device intel-iommu,intremap=on,eim=on \
> -device qemu64-x86_64-cpu,x2apic=on,core-id=257,socket-id=0,thread-id=0 \
> -m 2G \
> -kernel $KERNEL_DIR \
> -append "nokaslr console=ttyS0 root=/dev/sda earlyprintk=serial net.ifnames=0" \
> -drive file=$IMAGE_DIR,format=raw \
> -nographic \
> -s
>
> Using AMD IOMMU
>
> qemu/build/qemu-system-x86_64 \
> -smp 2,maxcpus=260 \
> -cpu qemu64,x2apic=on \
> -machine q35 \
> -device amd-iommu,intremap=on,xtsup=on \
> -device qemu64-x86_64-cpu,x2apic=on,core-id=257,socket-id=0,thread-id=0 \
> -m 2G \
> -kernel $KERNEL_DIR \
> -append "nokaslr console=ttyS0 root=/dev/sda earlyprintk=serial net.ifnames=0" \
> -drive file=$IMAGE_DIR,format=raw \
> -nographic \
> -s
>
> Testing the emulated userspace APIC with kvm-unit-tests, disable test
> device with this patch
>
> diff --git a/lib/x86/fwcfg.c b/lib/x86/fwcfg.c
> index 1734afb..f56fe1c 100644
> --- a/lib/x86/fwcfg.c
> +++ b/lib/x86/fwcfg.c
> @@ -27,6 +27,7 @@ static void read_cfg_override(void)
>
> if ((str = getenv("TEST_DEVICE")))
> no_test_device = !atol(str);
> + no_test_device = true;
>
> if ((str = getenv("MEMLIMIT")))
> fw_override[FW_CFG_MAX_RAM] = atol(str) * 1024 * 1024;
>
> ~ env QEMU=/home/minh/Desktop/oss/qemu/build/qemu-system-x86_64 ACCEL=tcg \
> ./run_tests.sh -v -g apic
>
> TESTNAME=apic-split TIMEOUT=90s ACCEL=tcg ./x86/run x86/apic.flat -smp 2
> -cpu qemu64,+x2apic,+tsc-deadline -machine kernel_irqchip=split FAIL
> apic-split (54 tests, 8 unexpected failures, 1 skipped)
> TESTNAME=ioapic-split TIMEOUT=90s ACCEL=tcg ./x86/run x86/ioapic.flat -smp
> 1 -cpu qemu64 -machine kernel_irqchip=split PASS ioapic-split (19 tests)
> TESTNAME=x2apic TIMEOUT=30 ACCEL=tcg ./x86/run x86/apic.flat -smp 2 -cpu
> qemu64,+x2apic,+tsc-deadline FAIL x2apic (54 tests, 8 unexpected failures,
> 1 skipped) TESTNAME=xapic TIMEOUT=60 ACCEL=tcg ./x86/run x86/apic.flat -smp
> 2 -cpu qemu64,-x2apic,+tsc-deadline -machine pit=off FAIL xapic (43 tests,
> 6 unexpected failures, 2 skipped)
>
> FAIL: apic_disable: *0xfee00030: 50014
> FAIL: apic_disable: *0xfee00080: f0
> FAIL: apic_disable: *0xfee00030: 50014
> FAIL: apic_disable: *0xfee00080: f0
> FAIL: apicbase: relocate apic
>
> These errors are because we don't disable MMIO region when switching to
> x2APIC and don't support relocate MMIO region yet. This is a problem
> because, MMIO region is the same for all CPUs, in order to support these we
> need to figure out how to allocate and manage different MMIO regions for
> each CPUs. This can be an improvement in the future.
>
> FAIL: nmi-after-sti
> FAIL: multiple nmi
>
> These errors are in the way we handle CPU_INTERRUPT_NMI in core TCG.
>
> FAIL: TMCCT should stay at zero
>
> This error is related to APIC timer which should be addressed in separate
> patch.
>
> Version 9 changes,
Hi Bui,
I have tested v9 on EPYC-Genoa system with kvm acceleration mode on, I could
see > 255 vCPU for Linux and Windows Guest.
Tested-by: Santosh Shukla <Santosh.Shukla@amd.com>
Thanks,
Santosh
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v9 0/5] Support x2APIC mode with TCG accelerator
2023-11-09 10:11 ` [PATCH v9 0/5] Support x2APIC mode with TCG accelerator Santosh Shukla
@ 2023-11-09 14:10 ` Bui Quang Minh
2023-11-09 14:32 ` Joao Martins
0 siblings, 1 reply; 15+ messages in thread
From: Bui Quang Minh @ 2023-11-09 14:10 UTC (permalink / raw)
To: Santosh Shukla, qemu-devel
Cc: David Woodhouse, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Michael S . Tsirkin, Marcel Apfelbaum,
Igor Mammedov, Alex Bennée, Joao Martins, Peter Xu,
Jason Wang, Philippe Mathieu-Daudé, Phil Dennis-Jordan,
Suravee Suthikulpanit
On 11/9/23 17:11, Santosh Shukla wrote:
> On 10/24/2023 8:51 PM, Bui Quang Minh wrote:
>> Hi everyone,
>>
>> This series implements x2APIC mode in userspace local APIC and the
>> RDMSR/WRMSR helper to access x2APIC registers in x2APIC mode. Intel iommu
>> and AMD iommu are adjusted to support x2APIC interrupt remapping. With this
>> series, we can now boot Linux kernel into x2APIC mode with TCG accelerator
>> using either Intel or AMD iommu.
>>
>> Testing to boot my own built Linux 6.3.0-rc2, the kernel successfully boot
>> with enabled x2APIC and can enumerate CPU with APIC ID 257
>>
>> Using Intel IOMMU
>>
>> qemu/build/qemu-system-x86_64 \
>> -smp 2,maxcpus=260 \
>> -cpu qemu64,x2apic=on \
>> -machine q35 \
>> -device intel-iommu,intremap=on,eim=on \
>> -device qemu64-x86_64-cpu,x2apic=on,core-id=257,socket-id=0,thread-id=0 \
>> -m 2G \
>> -kernel $KERNEL_DIR \
>> -append "nokaslr console=ttyS0 root=/dev/sda earlyprintk=serial net.ifnames=0" \
>> -drive file=$IMAGE_DIR,format=raw \
>> -nographic \
>> -s
>>
>> Using AMD IOMMU
>>
>> qemu/build/qemu-system-x86_64 \
>> -smp 2,maxcpus=260 \
>> -cpu qemu64,x2apic=on \
>> -machine q35 \
>> -device amd-iommu,intremap=on,xtsup=on \
>> -device qemu64-x86_64-cpu,x2apic=on,core-id=257,socket-id=0,thread-id=0 \
>> -m 2G \
>> -kernel $KERNEL_DIR \
>> -append "nokaslr console=ttyS0 root=/dev/sda earlyprintk=serial net.ifnames=0" \
>> -drive file=$IMAGE_DIR,format=raw \
>> -nographic \
>> -s
>>
>> Testing the emulated userspace APIC with kvm-unit-tests, disable test
>> device with this patch
>>
>> diff --git a/lib/x86/fwcfg.c b/lib/x86/fwcfg.c
>> index 1734afb..f56fe1c 100644
>> --- a/lib/x86/fwcfg.c
>> +++ b/lib/x86/fwcfg.c
>> @@ -27,6 +27,7 @@ static void read_cfg_override(void)
>>
>> if ((str = getenv("TEST_DEVICE")))
>> no_test_device = !atol(str);
>> + no_test_device = true;
>>
>> if ((str = getenv("MEMLIMIT")))
>> fw_override[FW_CFG_MAX_RAM] = atol(str) * 1024 * 1024;
>>
>> ~ env QEMU=/home/minh/Desktop/oss/qemu/build/qemu-system-x86_64 ACCEL=tcg \
>> ./run_tests.sh -v -g apic
>>
>> TESTNAME=apic-split TIMEOUT=90s ACCEL=tcg ./x86/run x86/apic.flat -smp 2
>> -cpu qemu64,+x2apic,+tsc-deadline -machine kernel_irqchip=split FAIL
>> apic-split (54 tests, 8 unexpected failures, 1 skipped)
>> TESTNAME=ioapic-split TIMEOUT=90s ACCEL=tcg ./x86/run x86/ioapic.flat -smp
>> 1 -cpu qemu64 -machine kernel_irqchip=split PASS ioapic-split (19 tests)
>> TESTNAME=x2apic TIMEOUT=30 ACCEL=tcg ./x86/run x86/apic.flat -smp 2 -cpu
>> qemu64,+x2apic,+tsc-deadline FAIL x2apic (54 tests, 8 unexpected failures,
>> 1 skipped) TESTNAME=xapic TIMEOUT=60 ACCEL=tcg ./x86/run x86/apic.flat -smp
>> 2 -cpu qemu64,-x2apic,+tsc-deadline -machine pit=off FAIL xapic (43 tests,
>> 6 unexpected failures, 2 skipped)
>>
>> FAIL: apic_disable: *0xfee00030: 50014
>> FAIL: apic_disable: *0xfee00080: f0
>> FAIL: apic_disable: *0xfee00030: 50014
>> FAIL: apic_disable: *0xfee00080: f0
>> FAIL: apicbase: relocate apic
>>
>> These errors are because we don't disable MMIO region when switching to
>> x2APIC and don't support relocate MMIO region yet. This is a problem
>> because, MMIO region is the same for all CPUs, in order to support these we
>> need to figure out how to allocate and manage different MMIO regions for
>> each CPUs. This can be an improvement in the future.
>>
>> FAIL: nmi-after-sti
>> FAIL: multiple nmi
>>
>> These errors are in the way we handle CPU_INTERRUPT_NMI in core TCG.
>>
>> FAIL: TMCCT should stay at zero
>>
>> This error is related to APIC timer which should be addressed in separate
>> patch.
>>
>> Version 9 changes,
>
> Hi Bui,
>
> I have tested v9 on EPYC-Genoa system with kvm acceleration mode on, I could
> see > 255 vCPU for Linux and Windows Guest.
>
> Tested-by: Santosh Shukla <Santosh.Shukla@amd.com>
Hi Santosh,
With KVM enabled, you may be using the in kernel APIC from KVM not the
emulated APIC in userspace as in this series.
Thanks,
Quang Minh.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v9 5/5] amd_iommu: report x2APIC support to the operating system
2023-11-08 19:44 ` Michael S. Tsirkin
@ 2023-11-09 14:12 ` Bui Quang Minh
0 siblings, 0 replies; 15+ messages in thread
From: Bui Quang Minh @ 2023-11-09 14:12 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: qemu-devel, David Woodhouse, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Marcel Apfelbaum, Igor Mammedov,
Alex Bennée, Joao Martins, Peter Xu, Jason Wang,
Philippe Mathieu-Daudé, Phil Dennis-Jordan
On 11/9/23 02:44, Michael S. Tsirkin wrote:
> On Wed, Nov 08, 2023 at 09:22:18PM +0700, Bui Quang Minh wrote:
>> On 11/7/23 07:39, Michael S. Tsirkin wrote:
>>> On Tue, Oct 24, 2023 at 10:21:05PM +0700, Bui Quang Minh wrote:
>>>> This commit adds XTSup configuration to let user choose to whether enable
>>>> this feature or not. When XTSup is enabled, additional bytes in IRTE with
>>>> enabled guest virtual VAPIC are used to support 32-bit destination id.
>>>>
>>>> Additionally, this commit exports IVHD type 0x11 besides the old IVHD type
>>>> 0x10 in ACPI table. IVHD type 0x10 does not report full set of IOMMU
>>>> features only the legacy ones, so operating system (e.g. Linux) may only
>>>> detects x2APIC support if IVHD type 0x11 is available. The IVHD type 0x10
>>>> is kept so that old operating system that only parses type 0x10 can detect
>>>> the IOMMU device.
>>>>
>>>> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>>>> Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
>>>
>>>
>>> changes IVRS without updating expected files for tests.
>>> result seems to be CI failures:
>>> https://gitlab.com/mstredhat/qemu/-/jobs/5470533834
>>
>>
>> Thanks Michael, I am preparing the fix in the next version. I've read the
>> instructions to update the test data in bios-tables-test.c. It says I need
>> to create some separate patches to update the test data. Are there any
>> reasons for this? I intend to change the binary and include the ASL diff
>> into the commit message. Is it enough?
>
> No, not enough. No, do not ignore the rules please. Yes, there's a
> reason. The reason is that I need to be able to rebase your patches. I
> then regenerate the binaries. If the patch includes binaries it won't
> rebase.
Okay, I got it. I will prepare the fix in the next version.
Thanks,
Quang Minh.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v9 0/5] Support x2APIC mode with TCG accelerator
2023-11-09 14:10 ` Bui Quang Minh
@ 2023-11-09 14:32 ` Joao Martins
2023-11-09 14:42 ` Bui Quang Minh
0 siblings, 1 reply; 15+ messages in thread
From: Joao Martins @ 2023-11-09 14:32 UTC (permalink / raw)
To: Bui Quang Minh, Santosh Shukla, qemu-devel
Cc: David Woodhouse, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Michael S . Tsirkin, Marcel Apfelbaum,
Igor Mammedov, Alex Bennée, Peter Xu, Jason Wang,
Philippe Mathieu-Daudé, Phil Dennis-Jordan,
Suravee Suthikulpanit
On 09/11/2023 14:10, Bui Quang Minh wrote:
> On 11/9/23 17:11, Santosh Shukla wrote:
>> On 10/24/2023 8:51 PM, Bui Quang Minh wrote:
>>> Hi everyone,
>>>
>>> This series implements x2APIC mode in userspace local APIC and the
>>> RDMSR/WRMSR helper to access x2APIC registers in x2APIC mode. Intel iommu
>>> and AMD iommu are adjusted to support x2APIC interrupt remapping. With this
>>> series, we can now boot Linux kernel into x2APIC mode with TCG accelerator
>>> using either Intel or AMD iommu.
>>>
>>> Testing to boot my own built Linux 6.3.0-rc2, the kernel successfully boot
>>> with enabled x2APIC and can enumerate CPU with APIC ID 257
>>>
>>> Using Intel IOMMU
>>>
>>> qemu/build/qemu-system-x86_64 \
>>> -smp 2,maxcpus=260 \
>>> -cpu qemu64,x2apic=on \
>>> -machine q35 \
>>> -device intel-iommu,intremap=on,eim=on \
>>> -device qemu64-x86_64-cpu,x2apic=on,core-id=257,socket-id=0,thread-id=0 \
>>> -m 2G \
>>> -kernel $KERNEL_DIR \
>>> -append "nokaslr console=ttyS0 root=/dev/sda earlyprintk=serial
>>> net.ifnames=0" \
>>> -drive file=$IMAGE_DIR,format=raw \
>>> -nographic \
>>> -s
>>>
>>> Using AMD IOMMU
>>>
>>> qemu/build/qemu-system-x86_64 \
>>> -smp 2,maxcpus=260 \
>>> -cpu qemu64,x2apic=on \
>>> -machine q35 \
>>> -device amd-iommu,intremap=on,xtsup=on \
>>> -device qemu64-x86_64-cpu,x2apic=on,core-id=257,socket-id=0,thread-id=0 \
>>> -m 2G \
>>> -kernel $KERNEL_DIR \
>>> -append "nokaslr console=ttyS0 root=/dev/sda earlyprintk=serial
>>> net.ifnames=0" \
>>> -drive file=$IMAGE_DIR,format=raw \
>>> -nographic \
>>> -s
>>>
>>> Testing the emulated userspace APIC with kvm-unit-tests, disable test
>>> device with this patch
>>>
>>> diff --git a/lib/x86/fwcfg.c b/lib/x86/fwcfg.c
>>> index 1734afb..f56fe1c 100644
>>> --- a/lib/x86/fwcfg.c
>>> +++ b/lib/x86/fwcfg.c
>>> @@ -27,6 +27,7 @@ static void read_cfg_override(void)
>>>
>>> if ((str = getenv("TEST_DEVICE")))
>>> no_test_device = !atol(str);
>>> + no_test_device = true;
>>>
>>> if ((str = getenv("MEMLIMIT")))
>>> fw_override[FW_CFG_MAX_RAM] = atol(str) * 1024 * 1024;
>>>
>>> ~ env QEMU=/home/minh/Desktop/oss/qemu/build/qemu-system-x86_64 ACCEL=tcg \
>>> ./run_tests.sh -v -g apic
>>>
>>> TESTNAME=apic-split TIMEOUT=90s ACCEL=tcg ./x86/run x86/apic.flat -smp 2
>>> -cpu qemu64,+x2apic,+tsc-deadline -machine kernel_irqchip=split FAIL
>>> apic-split (54 tests, 8 unexpected failures, 1 skipped)
>>> TESTNAME=ioapic-split TIMEOUT=90s ACCEL=tcg ./x86/run x86/ioapic.flat -smp
>>> 1 -cpu qemu64 -machine kernel_irqchip=split PASS ioapic-split (19 tests)
>>> TESTNAME=x2apic TIMEOUT=30 ACCEL=tcg ./x86/run x86/apic.flat -smp 2 -cpu
>>> qemu64,+x2apic,+tsc-deadline FAIL x2apic (54 tests, 8 unexpected failures,
>>> 1 skipped) TESTNAME=xapic TIMEOUT=60 ACCEL=tcg ./x86/run x86/apic.flat -smp
>>> 2 -cpu qemu64,-x2apic,+tsc-deadline -machine pit=off FAIL xapic (43 tests,
>>> 6 unexpected failures, 2 skipped)
>>>
>>> FAIL: apic_disable: *0xfee00030: 50014
>>> FAIL: apic_disable: *0xfee00080: f0
>>> FAIL: apic_disable: *0xfee00030: 50014
>>> FAIL: apic_disable: *0xfee00080: f0
>>> FAIL: apicbase: relocate apic
>>>
>>> These errors are because we don't disable MMIO region when switching to
>>> x2APIC and don't support relocate MMIO region yet. This is a problem
>>> because, MMIO region is the same for all CPUs, in order to support these we
>>> need to figure out how to allocate and manage different MMIO regions for
>>> each CPUs. This can be an improvement in the future.
>>>
>>> FAIL: nmi-after-sti
>>> FAIL: multiple nmi
>>>
>>> These errors are in the way we handle CPU_INTERRUPT_NMI in core TCG.
>>>
>>> FAIL: TMCCT should stay at zero
>>>
>>> This error is related to APIC timer which should be addressed in separate
>>> patch.
>>>
>>> Version 9 changes,
>>
>> Hi Bui,
>>
>> I have tested v9 on EPYC-Genoa system with kvm acceleration mode on, I could
>> see > 255 vCPU for Linux and Windows Guest.
>>
>> Tested-by: Santosh Shukla <Santosh.Shukla@amd.com>
>
> Hi Santosh,
>
> With KVM enabled, you may be using the in kernel APIC from KVM not the emulated
> APIC in userspace as in this series.
>
Your XTSup code isn't necessarily userspace APIC specific. You can have
accel=kvm with split irqchip and things will still work. I suspect that's how
Santosh tested it.
Joao
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v9 0/5] Support x2APIC mode with TCG accelerator
2023-11-09 14:32 ` Joao Martins
@ 2023-11-09 14:42 ` Bui Quang Minh
2023-11-09 15:29 ` Santosh Shukla
0 siblings, 1 reply; 15+ messages in thread
From: Bui Quang Minh @ 2023-11-09 14:42 UTC (permalink / raw)
To: Joao Martins, Santosh Shukla, qemu-devel
Cc: David Woodhouse, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Michael S . Tsirkin, Marcel Apfelbaum,
Igor Mammedov, Alex Bennée, Peter Xu, Jason Wang,
Philippe Mathieu-Daudé, Phil Dennis-Jordan,
Suravee Suthikulpanit
On 11/9/23 21:32, Joao Martins wrote:
> On 09/11/2023 14:10, Bui Quang Minh wrote:
>> On 11/9/23 17:11, Santosh Shukla wrote:
>>> On 10/24/2023 8:51 PM, Bui Quang Minh wrote:
>>>> Hi everyone,
>>>>
>>>> This series implements x2APIC mode in userspace local APIC and the
>>>> RDMSR/WRMSR helper to access x2APIC registers in x2APIC mode. Intel iommu
>>>> and AMD iommu are adjusted to support x2APIC interrupt remapping. With this
>>>> series, we can now boot Linux kernel into x2APIC mode with TCG accelerator
>>>> using either Intel or AMD iommu.
>>>>
>>>> Testing to boot my own built Linux 6.3.0-rc2, the kernel successfully boot
>>>> with enabled x2APIC and can enumerate CPU with APIC ID 257
>>>>
>>>> Using Intel IOMMU
>>>>
>>>> qemu/build/qemu-system-x86_64 \
>>>> -smp 2,maxcpus=260 \
>>>> -cpu qemu64,x2apic=on \
>>>> -machine q35 \
>>>> -device intel-iommu,intremap=on,eim=on \
>>>> -device qemu64-x86_64-cpu,x2apic=on,core-id=257,socket-id=0,thread-id=0 \
>>>> -m 2G \
>>>> -kernel $KERNEL_DIR \
>>>> -append "nokaslr console=ttyS0 root=/dev/sda earlyprintk=serial
>>>> net.ifnames=0" \
>>>> -drive file=$IMAGE_DIR,format=raw \
>>>> -nographic \
>>>> -s
>>>>
>>>> Using AMD IOMMU
>>>>
>>>> qemu/build/qemu-system-x86_64 \
>>>> -smp 2,maxcpus=260 \
>>>> -cpu qemu64,x2apic=on \
>>>> -machine q35 \
>>>> -device amd-iommu,intremap=on,xtsup=on \
>>>> -device qemu64-x86_64-cpu,x2apic=on,core-id=257,socket-id=0,thread-id=0 \
>>>> -m 2G \
>>>> -kernel $KERNEL_DIR \
>>>> -append "nokaslr console=ttyS0 root=/dev/sda earlyprintk=serial
>>>> net.ifnames=0" \
>>>> -drive file=$IMAGE_DIR,format=raw \
>>>> -nographic \
>>>> -s
>>>>
>>>> Testing the emulated userspace APIC with kvm-unit-tests, disable test
>>>> device with this patch
>>>>
>>>> diff --git a/lib/x86/fwcfg.c b/lib/x86/fwcfg.c
>>>> index 1734afb..f56fe1c 100644
>>>> --- a/lib/x86/fwcfg.c
>>>> +++ b/lib/x86/fwcfg.c
>>>> @@ -27,6 +27,7 @@ static void read_cfg_override(void)
>>>>
>>>> if ((str = getenv("TEST_DEVICE")))
>>>> no_test_device = !atol(str);
>>>> + no_test_device = true;
>>>>
>>>> if ((str = getenv("MEMLIMIT")))
>>>> fw_override[FW_CFG_MAX_RAM] = atol(str) * 1024 * 1024;
>>>>
>>>> ~ env QEMU=/home/minh/Desktop/oss/qemu/build/qemu-system-x86_64 ACCEL=tcg \
>>>> ./run_tests.sh -v -g apic
>>>>
>>>> TESTNAME=apic-split TIMEOUT=90s ACCEL=tcg ./x86/run x86/apic.flat -smp 2
>>>> -cpu qemu64,+x2apic,+tsc-deadline -machine kernel_irqchip=split FAIL
>>>> apic-split (54 tests, 8 unexpected failures, 1 skipped)
>>>> TESTNAME=ioapic-split TIMEOUT=90s ACCEL=tcg ./x86/run x86/ioapic.flat -smp
>>>> 1 -cpu qemu64 -machine kernel_irqchip=split PASS ioapic-split (19 tests)
>>>> TESTNAME=x2apic TIMEOUT=30 ACCEL=tcg ./x86/run x86/apic.flat -smp 2 -cpu
>>>> qemu64,+x2apic,+tsc-deadline FAIL x2apic (54 tests, 8 unexpected failures,
>>>> 1 skipped) TESTNAME=xapic TIMEOUT=60 ACCEL=tcg ./x86/run x86/apic.flat -smp
>>>> 2 -cpu qemu64,-x2apic,+tsc-deadline -machine pit=off FAIL xapic (43 tests,
>>>> 6 unexpected failures, 2 skipped)
>>>>
>>>> FAIL: apic_disable: *0xfee00030: 50014
>>>> FAIL: apic_disable: *0xfee00080: f0
>>>> FAIL: apic_disable: *0xfee00030: 50014
>>>> FAIL: apic_disable: *0xfee00080: f0
>>>> FAIL: apicbase: relocate apic
>>>>
>>>> These errors are because we don't disable MMIO region when switching to
>>>> x2APIC and don't support relocate MMIO region yet. This is a problem
>>>> because, MMIO region is the same for all CPUs, in order to support these we
>>>> need to figure out how to allocate and manage different MMIO regions for
>>>> each CPUs. This can be an improvement in the future.
>>>>
>>>> FAIL: nmi-after-sti
>>>> FAIL: multiple nmi
>>>>
>>>> These errors are in the way we handle CPU_INTERRUPT_NMI in core TCG.
>>>>
>>>> FAIL: TMCCT should stay at zero
>>>>
>>>> This error is related to APIC timer which should be addressed in separate
>>>> patch.
>>>>
>>>> Version 9 changes,
>>>
>>> Hi Bui,
>>>
>>> I have tested v9 on EPYC-Genoa system with kvm acceleration mode on, I could
>>> see > 255 vCPU for Linux and Windows Guest.
>>>
>>> Tested-by: Santosh Shukla <Santosh.Shukla@amd.com>
>>
>> Hi Santosh,
>>
>> With KVM enabled, you may be using the in kernel APIC from KVM not the emulated
>> APIC in userspace as in this series.
>>
>
> Your XTSup code isn't necessarily userspace APIC specific. You can have
> accel=kvm with split irqchip and things will still work. I suspect that's how
> Santosh tested it.
Ah, I got it. Thanks Santosh, Joao.
Quang Minh.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v9 0/5] Support x2APIC mode with TCG accelerator
2023-11-09 14:42 ` Bui Quang Minh
@ 2023-11-09 15:29 ` Santosh Shukla
0 siblings, 0 replies; 15+ messages in thread
From: Santosh Shukla @ 2023-11-09 15:29 UTC (permalink / raw)
To: Bui Quang Minh, Joao Martins, Santosh Shukla, qemu-devel
Cc: David Woodhouse, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Michael S . Tsirkin, Marcel Apfelbaum,
Igor Mammedov, Alex Bennée, Peter Xu, Jason Wang,
Philippe Mathieu-Daudé, Phil Dennis-Jordan,
Suravee Suthikulpanit
On 11/9/2023 8:12 PM, Bui Quang Minh wrote:
> On 11/9/23 21:32, Joao Martins wrote:
>> On 09/11/2023 14:10, Bui Quang Minh wrote:
>>> On 11/9/23 17:11, Santosh Shukla wrote:
>>>> On 10/24/2023 8:51 PM, Bui Quang Minh wrote:
>>>>> Hi everyone,
>>>>>
>>>>> This series implements x2APIC mode in userspace local APIC and the
>>>>> RDMSR/WRMSR helper to access x2APIC registers in x2APIC mode. Intel iommu
>>>>> and AMD iommu are adjusted to support x2APIC interrupt remapping. With this
>>>>> series, we can now boot Linux kernel into x2APIC mode with TCG accelerator
>>>>> using either Intel or AMD iommu.
>>>>>
>>>>> Testing to boot my own built Linux 6.3.0-rc2, the kernel successfully boot
>>>>> with enabled x2APIC and can enumerate CPU with APIC ID 257
>>>>>
>>>>> Using Intel IOMMU
>>>>>
>>>>> qemu/build/qemu-system-x86_64 \
>>>>> -smp 2,maxcpus=260 \
>>>>> -cpu qemu64,x2apic=on \
>>>>> -machine q35 \
>>>>> -device intel-iommu,intremap=on,eim=on \
>>>>> -device qemu64-x86_64-cpu,x2apic=on,core-id=257,socket-id=0,thread-id=0 \
>>>>> -m 2G \
>>>>> -kernel $KERNEL_DIR \
>>>>> -append "nokaslr console=ttyS0 root=/dev/sda earlyprintk=serial
>>>>> net.ifnames=0" \
>>>>> -drive file=$IMAGE_DIR,format=raw \
>>>>> -nographic \
>>>>> -s
>>>>>
>>>>> Using AMD IOMMU
>>>>>
>>>>> qemu/build/qemu-system-x86_64 \
>>>>> -smp 2,maxcpus=260 \
>>>>> -cpu qemu64,x2apic=on \
>>>>> -machine q35 \
>>>>> -device amd-iommu,intremap=on,xtsup=on \
>>>>> -device qemu64-x86_64-cpu,x2apic=on,core-id=257,socket-id=0,thread-id=0 \
>>>>> -m 2G \
>>>>> -kernel $KERNEL_DIR \
>>>>> -append "nokaslr console=ttyS0 root=/dev/sda earlyprintk=serial
>>>>> net.ifnames=0" \
>>>>> -drive file=$IMAGE_DIR,format=raw \
>>>>> -nographic \
>>>>> -s
>>>>>
>>>>> Testing the emulated userspace APIC with kvm-unit-tests, disable test
>>>>> device with this patch
>>>>>
>>>>> diff --git a/lib/x86/fwcfg.c b/lib/x86/fwcfg.c
>>>>> index 1734afb..f56fe1c 100644
>>>>> --- a/lib/x86/fwcfg.c
>>>>> +++ b/lib/x86/fwcfg.c
>>>>> @@ -27,6 +27,7 @@ static void read_cfg_override(void)
>>>>>
>>>>> if ((str = getenv("TEST_DEVICE")))
>>>>> no_test_device = !atol(str);
>>>>> + no_test_device = true;
>>>>>
>>>>> if ((str = getenv("MEMLIMIT")))
>>>>> fw_override[FW_CFG_MAX_RAM] = atol(str) * 1024 * 1024;
>>>>>
>>>>> ~ env QEMU=/home/minh/Desktop/oss/qemu/build/qemu-system-x86_64 ACCEL=tcg \
>>>>> ./run_tests.sh -v -g apic
>>>>>
>>>>> TESTNAME=apic-split TIMEOUT=90s ACCEL=tcg ./x86/run x86/apic.flat -smp 2
>>>>> -cpu qemu64,+x2apic,+tsc-deadline -machine kernel_irqchip=split FAIL
>>>>> apic-split (54 tests, 8 unexpected failures, 1 skipped)
>>>>> TESTNAME=ioapic-split TIMEOUT=90s ACCEL=tcg ./x86/run x86/ioapic.flat -smp
>>>>> 1 -cpu qemu64 -machine kernel_irqchip=split PASS ioapic-split (19 tests)
>>>>> TESTNAME=x2apic TIMEOUT=30 ACCEL=tcg ./x86/run x86/apic.flat -smp 2 -cpu
>>>>> qemu64,+x2apic,+tsc-deadline FAIL x2apic (54 tests, 8 unexpected failures,
>>>>> 1 skipped) TESTNAME=xapic TIMEOUT=60 ACCEL=tcg ./x86/run x86/apic.flat -smp
>>>>> 2 -cpu qemu64,-x2apic,+tsc-deadline -machine pit=off FAIL xapic (43 tests,
>>>>> 6 unexpected failures, 2 skipped)
>>>>>
>>>>> FAIL: apic_disable: *0xfee00030: 50014
>>>>> FAIL: apic_disable: *0xfee00080: f0
>>>>> FAIL: apic_disable: *0xfee00030: 50014
>>>>> FAIL: apic_disable: *0xfee00080: f0
>>>>> FAIL: apicbase: relocate apic
>>>>>
>>>>> These errors are because we don't disable MMIO region when switching to
>>>>> x2APIC and don't support relocate MMIO region yet. This is a problem
>>>>> because, MMIO region is the same for all CPUs, in order to support these we
>>>>> need to figure out how to allocate and manage different MMIO regions for
>>>>> each CPUs. This can be an improvement in the future.
>>>>>
>>>>> FAIL: nmi-after-sti
>>>>> FAIL: multiple nmi
>>>>>
>>>>> These errors are in the way we handle CPU_INTERRUPT_NMI in core TCG.
>>>>>
>>>>> FAIL: TMCCT should stay at zero
>>>>>
>>>>> This error is related to APIC timer which should be addressed in separate
>>>>> patch.
>>>>>
>>>>> Version 9 changes,
>>>>
>>>> Hi Bui,
>>>>
>>>> I have tested v9 on EPYC-Genoa system with kvm acceleration mode on, I could
>>>> see > 255 vCPU for Linux and Windows Guest.
>>>>
>>>> Tested-by: Santosh Shukla <Santosh.Shukla@amd.com>
>>>
>>> Hi Santosh,
>>>
>>> With KVM enabled, you may be using the in kernel APIC from KVM not the emulated
>>> APIC in userspace as in this series.
>>>
>>
>> Your XTSup code isn't necessarily userspace APIC specific. You can have
>> accel=kvm with split irqchip and things will still work. I suspect that's how
>> Santosh tested it.
>
That's correct.
> Ah, I got it. Thanks Santosh, Joao.
> Quang Minh.
>
Thanks,
Santosh
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2023-11-09 15:31 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-10-24 15:21 [PATCH v9 0/5] Support x2APIC mode with TCG accelerator Bui Quang Minh
2023-10-24 15:21 ` [PATCH v9 1/5] i386/tcg: implement x2APIC registers MSR access Bui Quang Minh
2023-10-24 15:21 ` [PATCH v9 2/5] apic: add support for x2APIC mode Bui Quang Minh
2023-10-24 15:21 ` [PATCH v9 3/5] apic, i386/tcg: add x2apic transitions Bui Quang Minh
2023-10-24 15:21 ` [PATCH v9 4/5] intel_iommu: allow Extended Interrupt Mode when using userspace APIC Bui Quang Minh
2023-10-24 15:21 ` [PATCH v9 5/5] amd_iommu: report x2APIC support to the operating system Bui Quang Minh
2023-11-07 0:39 ` Michael S. Tsirkin
2023-11-08 14:22 ` Bui Quang Minh
2023-11-08 19:44 ` Michael S. Tsirkin
2023-11-09 14:12 ` Bui Quang Minh
2023-11-09 10:11 ` [PATCH v9 0/5] Support x2APIC mode with TCG accelerator Santosh Shukla
2023-11-09 14:10 ` Bui Quang Minh
2023-11-09 14:32 ` Joao Martins
2023-11-09 14:42 ` Bui Quang Minh
2023-11-09 15:29 ` Santosh Shukla
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).