* [Qemu-devel] [PATCH 01/23] hyperv: add header with protocol definitions
2017-06-06 18:19 [Qemu-devel] [PATCH 00/23] hyperv fixes and enhancements Roman Kagan
@ 2017-06-06 18:19 ` Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 02/23] update-linux-headers: prepare for hyperv.h removal Roman Kagan
` (21 subsequent siblings)
22 siblings, 0 replies; 76+ messages in thread
From: Roman Kagan @ 2017-06-06 18:19 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
The definitions for Hyper-V emulation are currently taken from a header
imported from the Linux kernel.
However, as these describe a third-party protocol rather than a kernel
API, they probably should not have been published in the kernel uapi.
This patch introduces a header that provides all the necessary
definitions, obsoleting the one coming from the kernel.
The new header supports (temporary) coexistence with the kernel one.
The constants explicitly named in the Hyper-V specification (e.g. msr
numbers) are defined in a non-conflicting way. Other constants and
types have been given new names.
While at it, the protocol data structures are defined in a more
conventional way, without bitfields, enums, and excessive unions.
The code using these definitions is adjusted, too; it can now be built
both with and without the kernel header in the tree.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
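As a side note for readers (not part of the patch): the "non-conflicting"
coexistence presumably relies on the C rule that a macro may be re-#defined
only with an identical replacement list. A minimal sketch of the two cases:

    /* Spec-named constant: same name and value as in the kernel's
     * hyperv.h, so including both headers is a benign redefinition. */
    #define HV_X64_MSR_GUEST_OS_ID          0x40000000

    /* Implementation constant: renamed (the kernel header calls this
     * HV_SYNIC_SINT_COUNT), so the two definitions never collide. */
    #define HV_SINT_COUNT                   16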
target/i386/cpu.h | 10 +-
target/i386/hyperv_proto.h | 257 +++++++++++++++++++++++++++++++++++++++++++++
target/i386/cpu.c | 4 +-
target/i386/hyperv.c | 6 +-
target/i386/kvm.c | 57 +++++-----
target/i386/machine.c | 15 ++-
6 files changed, 301 insertions(+), 48 deletions(-)
create mode 100644 target/i386/hyperv_proto.h
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index cfe825f..9335dcc 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -22,7 +22,7 @@
#include "qemu-common.h"
#include "cpu-qom.h"
-#include "standard-headers/asm-x86/hyperv.h"
+#include "hyperv_proto.h"
#ifdef TARGET_X86_64
#define TARGET_LONG_BITS 64
@@ -1093,15 +1093,15 @@ typedef struct CPUX86State {
uint64_t msr_hv_guest_os_id;
uint64_t msr_hv_vapic;
uint64_t msr_hv_tsc;
- uint64_t msr_hv_crash_params[HV_X64_MSR_CRASH_PARAMS];
+ uint64_t msr_hv_crash_params[HV_CRASH_PARAMS];
uint64_t msr_hv_runtime;
uint64_t msr_hv_synic_control;
uint64_t msr_hv_synic_version;
uint64_t msr_hv_synic_evt_page;
uint64_t msr_hv_synic_msg_page;
- uint64_t msr_hv_synic_sint[HV_SYNIC_SINT_COUNT];
- uint64_t msr_hv_stimer_config[HV_SYNIC_STIMER_COUNT];
- uint64_t msr_hv_stimer_count[HV_SYNIC_STIMER_COUNT];
+ uint64_t msr_hv_synic_sint[HV_SINT_COUNT];
+ uint64_t msr_hv_stimer_config[HV_STIMER_COUNT];
+ uint64_t msr_hv_stimer_count[HV_STIMER_COUNT];
/* exception/interrupt handling */
int error_code;
diff --git a/target/i386/hyperv_proto.h b/target/i386/hyperv_proto.h
new file mode 100644
index 0000000..1b3fa6e
--- /dev/null
+++ b/target/i386/hyperv_proto.h
@@ -0,0 +1,257 @@
+/*
+ * Definitions for Hyper-V guest/hypervisor interaction
+ *
+ * Copyright (C) 2017 Parallels International GmbH
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef TARGET_I386_HYPERV_PROTO_H
+#define TARGET_I386_HYPERV_PROTO_H
+
+#include "qemu/bitmap.h"
+
+#define HV_CPUID_VENDOR_AND_MAX_FUNCTIONS 0x40000000
+#define HV_CPUID_INTERFACE 0x40000001
+#define HV_CPUID_VERSION 0x40000002
+#define HV_CPUID_FEATURES 0x40000003
+#define HV_CPUID_ENLIGHTMENT_INFO 0x40000004
+#define HV_CPUID_IMPLEMENT_LIMITS 0x40000005
+#define HV_CPUID_MIN 0x40000005
+#define HV_CPUID_MAX 0x4000ffff
+#define HV_HYPERVISOR_PRESENT_BIT 0x80000000
+
+/*
+ * HV_CPUID_FEATURES.EAX bits
+ */
+#define HV_VP_RUNTIME_AVAILABLE (1u << 0)
+#define HV_TIME_REF_COUNT_AVAILABLE (1u << 1)
+#define HV_SYNIC_AVAILABLE (1u << 2)
+#define HV_SYNTIMERS_AVAILABLE (1u << 3)
+#define HV_APIC_ACCESS_AVAILABLE (1u << 4)
+#define HV_HYPERCALL_AVAILABLE (1u << 5)
+#define HV_VP_INDEX_AVAILABLE (1u << 6)
+#define HV_RESET_AVAILABLE (1u << 7)
+#define HV_REFERENCE_TSC_AVAILABLE (1u << 9)
+
+/*
+ * HV_CPUID_FEATURES.EDX bits
+ */
+#define HV_MWAIT_AVAILABLE (1u << 0)
+#define HV_GUEST_DEBUGGING_AVAILABLE (1u << 1)
+#define HV_PERF_MONITOR_AVAILABLE (1u << 2)
+#define HV_CPU_DYNAMIC_PARTITIONING_AVAILABLE (1u << 3)
+#define HV_HYPERCALL_PARAMS_XMM_AVAILABLE (1u << 4)
+#define HV_GUEST_IDLE_STATE_AVAILABLE (1u << 5)
+#define HV_GUEST_CRASH_MSR_AVAILABLE (1u << 10)
+
+/*
+ * HV_CPUID_ENLIGHTMENT_INFO.EAX bits
+ */
+#define HV_AS_SWITCH_RECOMMENDED (1u << 0)
+#define HV_LOCAL_TLB_FLUSH_RECOMMENDED (1u << 1)
+#define HV_REMOTE_TLB_FLUSH_RECOMMENDED (1u << 2)
+#define HV_APIC_ACCESS_RECOMMENDED (1u << 3)
+#define HV_SYSTEM_RESET_RECOMMENDED (1u << 4)
+#define HV_RELAXED_TIMING_RECOMMENDED (1u << 5)
+
+/*
+ * Basic virtualized MSRs
+ */
+#define HV_X64_MSR_GUEST_OS_ID 0x40000000
+#define HV_X64_MSR_HYPERCALL 0x40000001
+#define HV_X64_MSR_VP_INDEX 0x40000002
+#define HV_X64_MSR_RESET 0x40000003
+#define HV_X64_MSR_VP_RUNTIME 0x40000010
+#define HV_X64_MSR_TIME_REF_COUNT 0x40000020
+#define HV_X64_MSR_REFERENCE_TSC 0x40000021
+#define HV_X64_MSR_TSC_FREQUENCY 0x40000022
+#define HV_X64_MSR_APIC_FREQUENCY 0x40000023
+
+/*
+ * Virtual APIC MSRs
+ */
+#define HV_X64_MSR_EOI 0x40000070
+#define HV_X64_MSR_ICR 0x40000071
+#define HV_X64_MSR_TPR 0x40000072
+#define HV_X64_MSR_APIC_ASSIST_PAGE 0x40000073
+
+/*
+ * Synthetic interrupt controller MSRs
+ */
+#define HV_X64_MSR_SCONTROL 0x40000080
+#define HV_X64_MSR_SVERSION 0x40000081
+#define HV_X64_MSR_SIEFP 0x40000082
+#define HV_X64_MSR_SIMP 0x40000083
+#define HV_X64_MSR_EOM 0x40000084
+#define HV_X64_MSR_SINT0 0x40000090
+#define HV_X64_MSR_SINT1 0x40000091
+#define HV_X64_MSR_SINT2 0x40000092
+#define HV_X64_MSR_SINT3 0x40000093
+#define HV_X64_MSR_SINT4 0x40000094
+#define HV_X64_MSR_SINT5 0x40000095
+#define HV_X64_MSR_SINT6 0x40000096
+#define HV_X64_MSR_SINT7 0x40000097
+#define HV_X64_MSR_SINT8 0x40000098
+#define HV_X64_MSR_SINT9 0x40000099
+#define HV_X64_MSR_SINT10 0x4000009A
+#define HV_X64_MSR_SINT11 0x4000009B
+#define HV_X64_MSR_SINT12 0x4000009C
+#define HV_X64_MSR_SINT13 0x4000009D
+#define HV_X64_MSR_SINT14 0x4000009E
+#define HV_X64_MSR_SINT15 0x4000009F
+
+/*
+ * Synthetic timer MSRs
+ */
+#define HV_X64_MSR_STIMER0_CONFIG 0x400000B0
+#define HV_X64_MSR_STIMER0_COUNT 0x400000B1
+#define HV_X64_MSR_STIMER1_CONFIG 0x400000B2
+#define HV_X64_MSR_STIMER1_COUNT 0x400000B3
+#define HV_X64_MSR_STIMER2_CONFIG 0x400000B4
+#define HV_X64_MSR_STIMER2_COUNT 0x400000B5
+#define HV_X64_MSR_STIMER3_CONFIG 0x400000B6
+#define HV_X64_MSR_STIMER3_COUNT 0x400000B7
+
+/*
+ * Guest crash notification MSRs
+ */
+#define HV_X64_MSR_CRASH_P0 0x40000100
+#define HV_X64_MSR_CRASH_P1 0x40000101
+#define HV_X64_MSR_CRASH_P2 0x40000102
+#define HV_X64_MSR_CRASH_P3 0x40000103
+#define HV_X64_MSR_CRASH_P4 0x40000104
+#define HV_CRASH_PARAMS (HV_X64_MSR_CRASH_P4 - HV_X64_MSR_CRASH_P0 + 1)
+#define HV_X64_MSR_CRASH_CTL 0x40000105
+#define HV_CRASH_CTL_NOTIFY (1ull << 63)
+
+/*
+ * Hypercall status codes
+ */
+#define HV_STATUS_SUCCESS 0
+#define HV_STATUS_INVALID_HYPERCALL_CODE 2
+#define HV_STATUS_INVALID_HYPERCALL_INPUT 3
+#define HV_STATUS_INVALID_ALIGNMENT 4
+#define HV_STATUS_INVALID_PARAMETER 5
+#define HV_STATUS_INSUFFICIENT_MEMORY 11
+#define HV_STATUS_INVALID_CONNECTION_ID 18
+#define HV_STATUS_INSUFFICIENT_BUFFERS 19
+
+/*
+ * Hypercall numbers
+ */
+#define HV_POST_MESSAGE 0x005c
+#define HV_SIGNAL_EVENT 0x005d
+#define HV_HYPERCALL_FAST (1u << 16)
+
+/*
+ * Hypercall MSR bits
+ */
+#define HV_HYPERCALL_ENABLE (1u << 0)
+
+/*
+ * Synthetic interrupt controller definitions
+ */
+#define HV_SYNIC_VERSION 1
+#define HV_SINT_COUNT 16
+#define HV_SYNIC_ENABLE (1u << 0)
+#define HV_SIMP_ENABLE (1u << 0)
+#define HV_SIEFP_ENABLE (1u << 0)
+#define HV_SINT_MASKED (1u << 16)
+#define HV_SINT_AUTO_EOI (1u << 17)
+#define HV_SINT_VECTOR_MASK 0xff
+
+#define HV_STIMER_COUNT 4
+
+/*
+ * Message size
+ */
+#define HV_MESSAGE_PAYLOAD_SIZE 240
+
+/*
+ * Message types
+ */
+#define HV_MESSAGE_NONE 0x00000000
+#define HV_MESSAGE_VMBUS 0x00000001
+#define HV_MESSAGE_UNMAPPED_GPA 0x80000000
+#define HV_MESSAGE_GPA_INTERCEPT 0x80000001
+#define HV_MESSAGE_TIMER_EXPIRED 0x80000010
+#define HV_MESSAGE_INVALID_VP_REGISTER_VALUE 0x80000020
+#define HV_MESSAGE_UNRECOVERABLE_EXCEPTION 0x80000021
+#define HV_MESSAGE_UNSUPPORTED_FEATURE 0x80000022
+#define HV_MESSAGE_EVENTLOG_BUFFERCOMPLETE 0x80000040
+#define HV_MESSAGE_X64_IOPORT_INTERCEPT 0x80010000
+#define HV_MESSAGE_X64_MSR_INTERCEPT 0x80010001
+#define HV_MESSAGE_X64_CPUID_INTERCEPT 0x80010002
+#define HV_MESSAGE_X64_EXCEPTION_INTERCEPT 0x80010003
+#define HV_MESSAGE_X64_APIC_EOI 0x80010004
+#define HV_MESSAGE_X64_LEGACY_FP_ERROR 0x80010005
+
+/*
+ * Message flags
+ */
+#define HV_MESSAGE_FLAG_PENDING 0x1
+
+/*
+ * Number of event flags per SINT
+ */
+#define HV_EVENT_FLAGS_COUNT (256 * 8)
+
+/*
+ * Connection id valid bits
+ */
+#define HV_CONNECTION_ID_MASK 0x00ffffff
+
+/*
+ * Input structure for POST_MESSAGE hypercall
+ */
+struct hyperv_post_message_input {
+ uint32_t connection_id;
+ uint32_t _reserved;
+ uint32_t message_type;
+ uint32_t payload_size;
+ uint8_t payload[HV_MESSAGE_PAYLOAD_SIZE];
+};
+
+/*
+ * Input structure for SIGNAL_EVENT hypercall
+ */
+struct hyperv_signal_event_input {
+ uint32_t connection_id;
+ uint16_t flag_number;
+ uint16_t _reserved_zero;
+};
+
+/*
+ * SynIC message structures
+ */
+struct hyperv_message_header {
+ uint32_t message_type;
+ uint8_t payload_size;
+ uint8_t message_flags; /* HV_MESSAGE_FLAG_XX */
+ uint8_t _reserved[2];
+ uint64_t sender;
+};
+
+struct hyperv_message {
+ struct hyperv_message_header header;
+ uint8_t payload[HV_MESSAGE_PAYLOAD_SIZE];
+};
+
+struct hyperv_message_page {
+ struct hyperv_message slot[HV_SINT_COUNT];
+};
+
+/*
+ * SynIC event flags structures
+ */
+struct hyperv_event_flags {
+ DECLARE_BITMAP(flags, HV_EVENT_FLAGS_COUNT);
+};
+
+struct hyperv_event_flags_page {
+ struct hyperv_event_flags slot[HV_SINT_COUNT];
+};
+
+#endif
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index ffb5267..ef46850 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -3808,12 +3808,12 @@ static GuestPanicInformation *x86_cpu_get_crash_info(CPUState *cs)
CPUX86State *env = &cpu->env;
GuestPanicInformation *panic_info = NULL;
- if (env->features[FEAT_HYPERV_EDX] & HV_X64_GUEST_CRASH_MSR_AVAILABLE) {
+ if (env->features[FEAT_HYPERV_EDX] & HV_GUEST_CRASH_MSR_AVAILABLE) {
panic_info = g_malloc0(sizeof(GuestPanicInformation));
panic_info->type = GUEST_PANIC_INFORMATION_TYPE_HYPER_V;
- assert(HV_X64_MSR_CRASH_PARAMS >= 5);
+ assert(HV_CRASH_PARAMS >= 5);
panic_info->u.hyper_v.arg1 = env->msr_hv_crash_params[0];
panic_info->u.hyper_v.arg2 = env->msr_hv_crash_params[1];
panic_info->u.hyper_v.arg3 = env->msr_hv_crash_params[2];
diff --git a/target/i386/hyperv.c b/target/i386/hyperv.c
index 8545574..227185c 100644
--- a/target/i386/hyperv.c
+++ b/target/i386/hyperv.c
@@ -14,7 +14,7 @@
#include "qemu/osdep.h"
#include "qemu/main-loop.h"
#include "hyperv.h"
-#include "standard-headers/asm-x86/hyperv.h"
+#include "hyperv_proto.h"
int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit)
{
@@ -50,8 +50,8 @@ int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit)
code = exit->u.hcall.input & 0xffff;
switch (code) {
- case HVCALL_POST_MESSAGE:
- case HVCALL_SIGNAL_EVENT:
+ case HV_POST_MESSAGE:
+ case HV_SIGNAL_EVENT:
default:
exit->u.hcall.result = HV_STATUS_INVALID_HYPERCALL_CODE;
return 0;
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 49b6115..a6debbd 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -27,6 +27,7 @@
#include "sysemu/kvm_int.h"
#include "kvm_i386.h"
#include "hyperv.h"
+#include "hyperv_proto.h"
#include "exec/gdbstub.h"
#include "qemu/host-utils.h"
@@ -40,7 +41,6 @@
#include "hw/i386/x86-iommu.h"
#include "exec/ioport.h"
-#include "standard-headers/asm-x86/hyperv.h"
#include "hw/pci/pci.h"
#include "hw/pci/msi.h"
#include "migration/blocker.h"
@@ -621,29 +621,29 @@ static int hyperv_handle_properties(CPUState *cs)
}
if (cpu->hyperv_relaxed_timing) {
- env->features[FEAT_HYPERV_EAX] |= HV_X64_MSR_HYPERCALL_AVAILABLE;
+ env->features[FEAT_HYPERV_EAX] |= HV_HYPERCALL_AVAILABLE;
}
if (cpu->hyperv_vapic) {
- env->features[FEAT_HYPERV_EAX] |= HV_X64_MSR_HYPERCALL_AVAILABLE;
- env->features[FEAT_HYPERV_EAX] |= HV_X64_MSR_APIC_ACCESS_AVAILABLE;
+ env->features[FEAT_HYPERV_EAX] |= HV_HYPERCALL_AVAILABLE;
+ env->features[FEAT_HYPERV_EAX] |= HV_APIC_ACCESS_AVAILABLE;
}
if (cpu->hyperv_time) {
- env->features[FEAT_HYPERV_EAX] |= HV_X64_MSR_HYPERCALL_AVAILABLE;
- env->features[FEAT_HYPERV_EAX] |= HV_X64_MSR_TIME_REF_COUNT_AVAILABLE;
- env->features[FEAT_HYPERV_EAX] |= 0x200;
+ env->features[FEAT_HYPERV_EAX] |= HV_HYPERCALL_AVAILABLE;
+ env->features[FEAT_HYPERV_EAX] |= HV_TIME_REF_COUNT_AVAILABLE;
+ env->features[FEAT_HYPERV_EAX] |= HV_REFERENCE_TSC_AVAILABLE;
}
if (cpu->hyperv_crash && has_msr_hv_crash) {
- env->features[FEAT_HYPERV_EDX] |= HV_X64_GUEST_CRASH_MSR_AVAILABLE;
+ env->features[FEAT_HYPERV_EDX] |= HV_GUEST_CRASH_MSR_AVAILABLE;
}
- env->features[FEAT_HYPERV_EDX] |= HV_X64_CPU_DYNAMIC_PARTITIONING_AVAILABLE;
+ env->features[FEAT_HYPERV_EDX] |= HV_CPU_DYNAMIC_PARTITIONING_AVAILABLE;
if (cpu->hyperv_reset && has_msr_hv_reset) {
- env->features[FEAT_HYPERV_EAX] |= HV_X64_MSR_RESET_AVAILABLE;
+ env->features[FEAT_HYPERV_EAX] |= HV_RESET_AVAILABLE;
}
if (cpu->hyperv_vpindex && has_msr_hv_vpindex) {
- env->features[FEAT_HYPERV_EAX] |= HV_X64_MSR_VP_INDEX_AVAILABLE;
+ env->features[FEAT_HYPERV_EAX] |= HV_VP_INDEX_AVAILABLE;
}
if (cpu->hyperv_runtime && has_msr_hv_runtime) {
- env->features[FEAT_HYPERV_EAX] |= HV_X64_MSR_VP_RUNTIME_AVAILABLE;
+ env->features[FEAT_HYPERV_EAX] |= HV_VP_RUNTIME_AVAILABLE;
}
if (cpu->hyperv_synic) {
int sint;
@@ -654,10 +654,10 @@ static int hyperv_handle_properties(CPUState *cs)
return -ENOSYS;
}
- env->features[FEAT_HYPERV_EAX] |= HV_X64_MSR_SYNIC_AVAILABLE;
- env->msr_hv_synic_version = HV_SYNIC_VERSION_1;
+ env->features[FEAT_HYPERV_EAX] |= HV_SYNIC_AVAILABLE;
+ env->msr_hv_synic_version = HV_SYNIC_VERSION;
for (sint = 0; sint < ARRAY_SIZE(env->msr_hv_synic_sint); sint++) {
- env->msr_hv_synic_sint[sint] = HV_SYNIC_SINT_MASKED;
+ env->msr_hv_synic_sint[sint] = HV_SINT_MASKED;
}
}
if (cpu->hyperv_stimer) {
@@ -665,7 +665,7 @@ static int hyperv_handle_properties(CPUState *cs)
fprintf(stderr, "Hyper-V timers aren't supported by kernel\n");
return -ENOSYS;
}
- env->features[FEAT_HYPERV_EAX] |= HV_X64_MSR_SYNTIMER_AVAILABLE;
+ env->features[FEAT_HYPERV_EAX] |= HV_SYNTIMERS_AVAILABLE;
}
return 0;
}
@@ -697,7 +697,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
/* Paravirtualization CPUIDs */
if (hyperv_enabled(cpu)) {
c = &cpuid_data.entries[cpuid_i++];
- c->function = HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS;
+ c->function = HV_CPUID_VENDOR_AND_MAX_FUNCTIONS;
if (!cpu->hyperv_vendor_id) {
memcpy(signature, "Microsoft Hv", 12);
} else {
@@ -710,13 +710,13 @@ int kvm_arch_init_vcpu(CPUState *cs)
memset(signature, 0, 12);
memcpy(signature, cpu->hyperv_vendor_id, len);
}
- c->eax = HYPERV_CPUID_MIN;
+ c->eax = HV_CPUID_MIN;
c->ebx = signature[0];
c->ecx = signature[1];
c->edx = signature[2];
c = &cpuid_data.entries[cpuid_i++];
- c->function = HYPERV_CPUID_INTERFACE;
+ c->function = HV_CPUID_INTERFACE;
memcpy(signature, "Hv#1\0\0\0\0\0\0\0\0", 12);
c->eax = signature[0];
c->ebx = 0;
@@ -724,12 +724,12 @@ int kvm_arch_init_vcpu(CPUState *cs)
c->edx = 0;
c = &cpuid_data.entries[cpuid_i++];
- c->function = HYPERV_CPUID_VERSION;
+ c->function = HV_CPUID_VERSION;
c->eax = 0x00001bbc;
c->ebx = 0x00060001;
c = &cpuid_data.entries[cpuid_i++];
- c->function = HYPERV_CPUID_FEATURES;
+ c->function = HV_CPUID_FEATURES;
r = hyperv_handle_properties(cs);
if (r) {
return r;
@@ -739,17 +739,17 @@ int kvm_arch_init_vcpu(CPUState *cs)
c->edx = env->features[FEAT_HYPERV_EDX];
c = &cpuid_data.entries[cpuid_i++];
- c->function = HYPERV_CPUID_ENLIGHTMENT_INFO;
+ c->function = HV_CPUID_ENLIGHTMENT_INFO;
if (cpu->hyperv_relaxed_timing) {
- c->eax |= HV_X64_RELAXED_TIMING_RECOMMENDED;
+ c->eax |= HV_RELAXED_TIMING_RECOMMENDED;
}
if (cpu->hyperv_vapic) {
- c->eax |= HV_X64_APIC_ACCESS_RECOMMENDED;
+ c->eax |= HV_APIC_ACCESS_RECOMMENDED;
}
c->ebx = cpu->hyperv_spinlock_attempts;
c = &cpuid_data.entries[cpuid_i++];
- c->function = HYPERV_CPUID_IMPLEMENT_LIMITS;
+ c->function = HV_CPUID_IMPLEMENT_LIMITS;
c->eax = 0x40;
c->ebx = 0x40;
@@ -1735,12 +1735,11 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
if (has_msr_hv_crash) {
int j;
- for (j = 0; j < HV_X64_MSR_CRASH_PARAMS; j++)
+ for (j = 0; j < HV_CRASH_PARAMS; j++)
kvm_msr_entry_add(cpu, HV_X64_MSR_CRASH_P0 + j,
env->msr_hv_crash_params[j]);
- kvm_msr_entry_add(cpu, HV_X64_MSR_CRASH_CTL,
- HV_X64_MSR_CRASH_CTL_NOTIFY);
+ kvm_msr_entry_add(cpu, HV_X64_MSR_CRASH_CTL, HV_CRASH_CTL_NOTIFY);
}
if (has_msr_hv_runtime) {
kvm_msr_entry_add(cpu, HV_X64_MSR_VP_RUNTIME, env->msr_hv_runtime);
@@ -2145,7 +2144,7 @@ static int kvm_get_msrs(X86CPU *cpu)
if (has_msr_hv_crash) {
int j;
- for (j = 0; j < HV_X64_MSR_CRASH_PARAMS; j++) {
+ for (j = 0; j < HV_CRASH_PARAMS; j++) {
kvm_msr_entry_add(cpu, HV_X64_MSR_CRASH_P0 + j, 0);
}
}
diff --git a/target/i386/machine.c b/target/i386/machine.c
index 3cb2729..eb00b19 100644
--- a/target/i386/machine.c
+++ b/target/i386/machine.c
@@ -570,7 +570,7 @@ static bool hyperv_crash_enable_needed(void *opaque)
CPUX86State *env = &cpu->env;
int i;
- for (i = 0; i < HV_X64_MSR_CRASH_PARAMS; i++) {
+ for (i = 0; i < HV_CRASH_PARAMS; i++) {
if (env->msr_hv_crash_params[i]) {
return true;
}
@@ -584,8 +584,7 @@ static const VMStateDescription vmstate_msr_hyperv_crash = {
.minimum_version_id = 1,
.needed = hyperv_crash_enable_needed,
.fields = (VMStateField[]) {
- VMSTATE_UINT64_ARRAY(env.msr_hv_crash_params,
- X86CPU, HV_X64_MSR_CRASH_PARAMS),
+ VMSTATE_UINT64_ARRAY(env.msr_hv_crash_params, X86CPU, HV_CRASH_PARAMS),
VMSTATE_END_OF_LIST()
}
};
@@ -643,8 +642,7 @@ static const VMStateDescription vmstate_msr_hyperv_synic = {
VMSTATE_UINT64(env.msr_hv_synic_control, X86CPU),
VMSTATE_UINT64(env.msr_hv_synic_evt_page, X86CPU),
VMSTATE_UINT64(env.msr_hv_synic_msg_page, X86CPU),
- VMSTATE_UINT64_ARRAY(env.msr_hv_synic_sint, X86CPU,
- HV_SYNIC_SINT_COUNT),
+ VMSTATE_UINT64_ARRAY(env.msr_hv_synic_sint, X86CPU, HV_SINT_COUNT),
VMSTATE_END_OF_LIST()
}
};
@@ -669,10 +667,9 @@ static const VMStateDescription vmstate_msr_hyperv_stimer = {
.minimum_version_id = 1,
.needed = hyperv_stimer_enable_needed,
.fields = (VMStateField[]) {
- VMSTATE_UINT64_ARRAY(env.msr_hv_stimer_config,
- X86CPU, HV_SYNIC_STIMER_COUNT),
- VMSTATE_UINT64_ARRAY(env.msr_hv_stimer_count,
- X86CPU, HV_SYNIC_STIMER_COUNT),
+ VMSTATE_UINT64_ARRAY(env.msr_hv_stimer_config, X86CPU,
+ HV_STIMER_COUNT),
+ VMSTATE_UINT64_ARRAY(env.msr_hv_stimer_count, X86CPU, HV_STIMER_COUNT),
VMSTATE_END_OF_LIST()
}
};
--
2.9.4
* [Qemu-devel] [PATCH 02/23] update-linux-headers: prepare for hyperv.h removal
2017-06-06 18:19 [Qemu-devel] [PATCH 00/23] hyperv fixes and enhancements Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 01/23] hyperv: add header with protocol definitions Roman Kagan
@ 2017-06-06 18:19 ` Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 03/23] hyperv: set partition-wide MSRs only on first vcpu Roman Kagan
` (20 subsequent siblings)
22 siblings, 0 replies; 76+ messages in thread
From: Roman Kagan @ 2017-06-06 18:19 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
All definitions related to Hyper-V emulation are now taken from QEMU's
own header, so the one imported from the kernel is no longer needed.
Unfortunately it's included by kvm_para.h.
So, until this is fixed in the kernel, teach the header-harvesting
script to replace the kernel's hyperv.h with a dummy.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
scripts/update-linux-headers.sh | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/scripts/update-linux-headers.sh b/scripts/update-linux-headers.sh
index 2f906c4..ad80fe3 100755
--- a/scripts/update-linux-headers.sh
+++ b/scripts/update-linux-headers.sh
@@ -104,7 +104,9 @@ for arch in $ARCHLIST; do
cp "$tmpdir/include/asm/unistd-common.h" "$output/linux-headers/asm-arm/"
fi
if [ $arch = x86 ]; then
- cp_portable "$tmpdir/include/asm/hyperv.h" "$output/include/standard-headers/asm-x86/"
+ cat <<-EOF >"$output/include/standard-headers/asm-x86/hyperv.h"
+ /* this is a temporary placeholder until kvm_para.h stops including it */
+ EOF
cp "$tmpdir/include/asm/unistd_32.h" "$output/linux-headers/asm-x86/"
cp "$tmpdir/include/asm/unistd_x32.h" "$output/linux-headers/asm-x86/"
cp "$tmpdir/include/asm/unistd_64.h" "$output/linux-headers/asm-x86/"
--
2.9.4
* [Qemu-devel] [PATCH 03/23] hyperv: set partition-wide MSRs only on first vcpu
2017-06-06 18:19 [Qemu-devel] [PATCH 00/23] hyperv fixes and enhancements Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 01/23] hyperv: add header with protocol definitions Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 02/23] update-linux-headers: prepare for hyperv.h removal Roman Kagan
@ 2017-06-06 18:19 ` Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 04/23] hyperv: ensure msrs are inited properly Roman Kagan
` (19 subsequent siblings)
22 siblings, 0 replies; 76+ messages in thread
From: Roman Kagan @ 2017-06-06 18:19 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
From: Evgeny Yakovlev <eyakovlev@virtuozzo.com>
Hyper-V has a notion of partition-wide MSRs. Those MSRs are read and
written as usual on each VCPU, however the hypervisor maintains a single
global value for all VCPUs. Thus writing such an MSR from any single
VCPU affects the global value that is read by all other VCPUs.
This leads to an issue during VCPU hotplug: the zero-initialized values
of those MSRs get synced into KVM and override the global values
already set by the guest.
This change makes the partition-wide MSRs be synchronized only from the
first vcpu.
Signed-off-by: Evgeny Yakovlev <eyakovlev@virtuozzo.com>
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
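As an illustration (not part of the patch; the flow is simplified, names
are from QEMU's kvm.c), the hotplug failure and the fix look like this:

    /*
     * Before: every vCPU pushed its copy of the partition-wide MSRs.
     *
     *   1. guest writes HV_X64_MSR_GUEST_OS_ID  -> global value set
     *   2. vCPU N is hotplugged, its env is zero-initialized
     *   3. kvm_put_msrs() for vCPU N syncs env->msr_hv_guest_os_id == 0
     *   4. the global value is clobbered for all vCPUs
     *
     * After: the partition-wide MSRs are synced from the first vCPU only.
     */
    if (current_cpu == first_cpu) {
        kvm_msr_entry_add(cpu, HV_X64_MSR_GUEST_OS_ID,
                          env->msr_hv_guest_os_id);
    }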
target/i386/cpu.h | 5 ++++-
target/i386/kvm.c | 20 ++++++++++++--------
2 files changed, 16 insertions(+), 9 deletions(-)
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 9335dcc..7af2cce 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1089,10 +1089,13 @@ typedef struct CPUX86State {
uint64_t async_pf_en_msr;
uint64_t pv_eoi_en_msr;
+ /* Partition-wide HV MSRs, will be updated only on the first vcpu */
uint64_t msr_hv_hypercall;
uint64_t msr_hv_guest_os_id;
- uint64_t msr_hv_vapic;
uint64_t msr_hv_tsc;
+
+ /* Per-VCPU HV MSRs */
+ uint64_t msr_hv_vapic;
uint64_t msr_hv_crash_params[HV_CRASH_PARAMS];
uint64_t msr_hv_runtime;
uint64_t msr_hv_synic_control;
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index a6debbd..3a80913 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -1719,19 +1719,23 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_CTRL,
env->msr_global_ctrl);
}
- if (has_msr_hv_hypercall) {
- kvm_msr_entry_add(cpu, HV_X64_MSR_GUEST_OS_ID,
- env->msr_hv_guest_os_id);
- kvm_msr_entry_add(cpu, HV_X64_MSR_HYPERCALL,
- env->msr_hv_hypercall);
+ /* Sync partition-wide MSRs only on first VCPU to avoid races */
+ if (current_cpu == first_cpu) {
+ if (has_msr_hv_hypercall) {
+ kvm_msr_entry_add(cpu, HV_X64_MSR_GUEST_OS_ID,
+ env->msr_hv_guest_os_id);
+ kvm_msr_entry_add(cpu, HV_X64_MSR_HYPERCALL,
+ env->msr_hv_hypercall);
+ }
+ if (cpu->hyperv_time) {
+ kvm_msr_entry_add(cpu, HV_X64_MSR_REFERENCE_TSC,
+ env->msr_hv_tsc);
+ }
}
if (cpu->hyperv_vapic) {
kvm_msr_entry_add(cpu, HV_X64_MSR_APIC_ASSIST_PAGE,
env->msr_hv_vapic);
}
- if (cpu->hyperv_time) {
- kvm_msr_entry_add(cpu, HV_X64_MSR_REFERENCE_TSC, env->msr_hv_tsc);
- }
if (has_msr_hv_crash) {
int j;
--
2.9.4
* [Qemu-devel] [PATCH 04/23] hyperv: ensure msrs are inited properly
2017-06-06 18:19 [Qemu-devel] [PATCH 00/23] hyperv fixes and enhancements Roman Kagan
` (2 preceding siblings ...)
2017-06-06 18:19 ` [Qemu-devel] [PATCH 03/23] hyperv: set partition-wide MSRs only on first vcpu Roman Kagan
@ 2017-06-06 18:19 ` Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index Roman Kagan
` (18 subsequent siblings)
22 siblings, 0 replies; 76+ messages in thread
From: Roman Kagan @ 2017-06-06 18:19 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
Make sure HV_X64_MSR_SVERSION and HV_X64_MSR_SINTx are properly
initialized at guest start.
For that, move the field containing the SVERSION value into the region
in CPUX86State which is preserved across resets, and move the
initialization of SINTx to kvm_arch_reset_vcpu().
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
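For context (not part of the patch): this relies on the reset convention
in QEMU's CPUX86State, sketched below assuming the end_reset_fields
marker used by contemporary QEMU:

    /* x86_cpu_reset() zeroes everything up to the marker: */
    memset(env, 0, offsetof(CPUX86State, end_reset_fields));

    /* msr_hv_synic_version is moved below the marker, so the value set
     * in hyperv_handle_properties() survives guest resets; the SINTx
     * MSRs stay in the cleared region and are re-initialized in
     * kvm_arch_reset_vcpu(). */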
target/i386/cpu.h | 3 ++-
target/i386/kvm.c | 12 +++++++-----
2 files changed, 9 insertions(+), 6 deletions(-)
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 7af2cce..7c97bce 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1099,7 +1099,6 @@ typedef struct CPUX86State {
uint64_t msr_hv_crash_params[HV_CRASH_PARAMS];
uint64_t msr_hv_runtime;
uint64_t msr_hv_synic_control;
- uint64_t msr_hv_synic_version;
uint64_t msr_hv_synic_evt_page;
uint64_t msr_hv_synic_msg_page;
uint64_t msr_hv_synic_sint[HV_SINT_COUNT];
@@ -1159,6 +1158,8 @@ typedef struct CPUX86State {
uint64_t mtrr_deftype;
MTRRVar mtrr_var[MSR_MTRRcap_VCNT];
+ uint64_t msr_hv_synic_version;
+
/* For KVM */
uint32_t mp_state;
int32_t exception_injected;
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 3a80913..251aa95 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -646,8 +646,6 @@ static int hyperv_handle_properties(CPUState *cs)
env->features[FEAT_HYPERV_EAX] |= HV_VP_RUNTIME_AVAILABLE;
}
if (cpu->hyperv_synic) {
- int sint;
-
if (!has_msr_hv_synic ||
kvm_vcpu_enable_cap(cs, KVM_CAP_HYPERV_SYNIC, 0)) {
fprintf(stderr, "Hyper-V SynIC is not supported by kernel\n");
@@ -656,9 +654,6 @@ static int hyperv_handle_properties(CPUState *cs)
env->features[FEAT_HYPERV_EAX] |= HV_SYNIC_AVAILABLE;
env->msr_hv_synic_version = HV_SYNIC_VERSION;
- for (sint = 0; sint < ARRAY_SIZE(env->msr_hv_synic_sint); sint++) {
- env->msr_hv_synic_sint[sint] = HV_SINT_MASKED;
- }
}
if (cpu->hyperv_stimer) {
if (!has_msr_hv_stimer) {
@@ -1038,6 +1033,13 @@ void kvm_arch_reset_vcpu(X86CPU *cpu)
} else {
env->mp_state = KVM_MP_STATE_RUNNABLE;
}
+
+ if (cpu->hyperv_synic) {
+ int i;
+ for (i = 0; i < ARRAY_SIZE(env->msr_hv_synic_sint); i++) {
+ env->msr_hv_synic_sint[i] = HV_SINT_MASKED;
+ }
+ }
}
void kvm_arch_do_init_vcpu(X86CPU *cpu)
--
2.9.4
* [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
2017-06-06 18:19 [Qemu-devel] [PATCH 00/23] hyperv fixes and enhancements Roman Kagan
` (3 preceding siblings ...)
2017-06-06 18:19 ` [Qemu-devel] [PATCH 04/23] hyperv: ensure msrs are inited properly Roman Kagan
@ 2017-06-06 18:19 ` Roman Kagan
2017-06-13 18:57 ` Eduardo Habkost
2017-06-06 18:19 ` [Qemu-devel] [PATCH 06/23] hyperv: helper to find vcpu by VP index Roman Kagan
` (17 subsequent siblings)
22 siblings, 1 reply; 76+ messages in thread
From: Roman Kagan @ 2017-06-06 18:19 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
Hyper-V identifies vcpus by the virtual processor (VP) index which is
normally queried by the guest via HV_X64_MSR_VP_INDEX msr.
It has to be owned by QEMU in order to preserve it across migration.
However, the current implementation in KVM doesn't allow setting this
msr, and KVM uses its own notion of VP index. Fortunately, the way
vcpus are created in QEMU/KVM basically guarantees that the KVM value is
equal to QEMU cpu_index.
So, for back and forward compatibility, attempt to set the msr at vcpu
init time to cpu_index, but ignore the errors; then query the msr value
from KVM and assert that it's equal to cpu_index. On current kernels
this will work by luck; future ones will accept the value from
userspace.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
target/i386/hyperv.h | 2 ++
target/i386/hyperv.c | 5 +++++
target/i386/kvm.c | 26 ++++++++++++++++++++++++++
3 files changed, 33 insertions(+)
diff --git a/target/i386/hyperv.h b/target/i386/hyperv.h
index 0c3b562..35da0b1 100644
--- a/target/i386/hyperv.h
+++ b/target/i386/hyperv.h
@@ -39,4 +39,6 @@ void kvm_hv_sint_route_destroy(HvSintRoute *sint_route);
int kvm_hv_sint_route_set_sint(HvSintRoute *sint_route);
+uint32_t hyperv_vp_index(X86CPU *cpu);
+
#endif
diff --git a/target/i386/hyperv.c b/target/i386/hyperv.c
index 227185c..27de5bc 100644
--- a/target/i386/hyperv.c
+++ b/target/i386/hyperv.c
@@ -16,6 +16,11 @@
#include "hyperv.h"
#include "hyperv_proto.h"
+uint32_t hyperv_vp_index(X86CPU *cpu)
+{
+ return CPU(cpu)->cpu_index;
+}
+
int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit)
{
CPUX86State *env = &cpu->env;
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 251aa95..eb9cde4 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -610,11 +610,37 @@ static int kvm_arch_set_tsc_khz(CPUState *cs)
return 0;
}
+static void hyperv_set_vp_index(CPUState *cs)
+{
+ struct {
+ struct kvm_msrs info;
+ struct kvm_msr_entry entries[1];
+ } msr_data;
+ int ret;
+
+ msr_data.info.nmsrs = 1;
+ msr_data.entries[0].index = HV_X64_MSR_VP_INDEX;
+
+ /*
+ * Some kernels don't support setting HV_X64_MSR_VP_INDEX. However,
+ * they use VP_INDEX equal to cpu_index due to the way vcpus are created.
+ * So ignore the errors from SET_MSRS, but verify that GET_MSRS returns the
+ * expected value.
+ */
+ msr_data.entries[0].data = cs->cpu_index;
+ kvm_vcpu_ioctl(cs, KVM_SET_MSRS, &msr_data);
+ ret = kvm_vcpu_ioctl(cs, KVM_GET_MSRS, &msr_data);
+ assert(ret == 1);
+ assert(msr_data.entries[0].data == cs->cpu_index);
+}
+
static int hyperv_handle_properties(CPUState *cs)
{
X86CPU *cpu = X86_CPU(cs);
CPUX86State *env = &cpu->env;
+ hyperv_set_vp_index(cs);
+
if (cpu->hyperv_time &&
kvm_check_extension(cs->kvm_state, KVM_CAP_HYPERV_TIME) <= 0) {
cpu->hyperv_time = false;
--
2.9.4
* Re: [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
2017-06-06 18:19 ` [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index Roman Kagan
@ 2017-06-13 18:57 ` Eduardo Habkost
2017-06-14 11:25 ` Roman Kagan
0 siblings, 1 reply; 76+ messages in thread
From: Eduardo Habkost @ 2017-06-13 18:57 UTC (permalink / raw)
To: Roman Kagan
Cc: qemu-devel, Paolo Bonzini, Evgeny Yakovlev, Denis V . Lunev,
Igor Mammedov
On Tue, Jun 06, 2017 at 09:19:30PM +0300, Roman Kagan wrote:
> Hyper-V identifies vcpus by the virtual processor (VP) index which is
> normally queried by the guest via HV_X64_MSR_VP_INDEX msr.
>
> It has to be owned by QEMU in order to preserve it across migration.
>
> However, the current implementation in KVM doesn't allow setting this
> msr, and KVM uses its own notion of VP index. Fortunately, the way
> vcpus are created in QEMU/KVM basically guarantees that the KVM value is
> equal to QEMU cpu_index.
This might not be true in the future. cpu_index is not a good
identifier for CPUs, and we would like to remove it in the
future.
But it looks like we have no choice, see extra comments below:
>
> So, for back and forward compatibility, attempt to set the msr at vcpu
> init time to cpu_index, but ignore the errors; then query the msr value
> from KVM and assert that it's equal to cpu_index. On current kernels
> this will work by luck; future ones will accept the value from
> userspace.
>
> Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
> ---
> target/i386/hyperv.h | 2 ++
> target/i386/hyperv.c | 5 +++++
> target/i386/kvm.c | 26 ++++++++++++++++++++++++++
> 3 files changed, 33 insertions(+)
>
> diff --git a/target/i386/hyperv.h b/target/i386/hyperv.h
> index 0c3b562..35da0b1 100644
> --- a/target/i386/hyperv.h
> +++ b/target/i386/hyperv.h
> @@ -39,4 +39,6 @@ void kvm_hv_sint_route_destroy(HvSintRoute *sint_route);
>
> int kvm_hv_sint_route_set_sint(HvSintRoute *sint_route);
>
> +uint32_t hyperv_vp_index(X86CPU *cpu);
> +
> #endif
> diff --git a/target/i386/hyperv.c b/target/i386/hyperv.c
> index 227185c..27de5bc 100644
> --- a/target/i386/hyperv.c
> +++ b/target/i386/hyperv.c
> @@ -16,6 +16,11 @@
> #include "hyperv.h"
> #include "hyperv_proto.h"
>
> +uint32_t hyperv_vp_index(X86CPU *cpu)
> +{
> + return CPU(cpu)->cpu_index;
> +}
> +
You are introducing a wrapper, but you are using cs->cpu_index
directly at hyperv_set_vp_index(). The knowledge that vp_index
== cpu_index is duplicated in two places, and the functions risk
getting out of sync.
I suggest using the wrapper inside hyperv_set_vp_index() too.
> int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit)
> {
> CPUX86State *env = &cpu->env;
> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> index 251aa95..eb9cde4 100644
> --- a/target/i386/kvm.c
> +++ b/target/i386/kvm.c
> @@ -610,11 +610,37 @@ static int kvm_arch_set_tsc_khz(CPUState *cs)
> return 0;
> }
>
> +static void hyperv_set_vp_index(CPUState *cs)
> +{
> + struct {
> + struct kvm_msrs info;
> + struct kvm_msr_entry entries[1];
> + } msr_data;
> + int ret;
> +
> + msr_data.info.nmsrs = 1;
> + msr_data.entries[0].index = HV_X64_MSR_VP_INDEX;
> +
> + /*
> + * Some kernels don't support setting HV_X64_MSR_VP_INDEX. However,
> + * they use VP_INDEX equal to cpu_index due to the way vcpus are created.
> + * So ignore the errors from SET_MSRS, but verify that GET_MSRS returns the
> + * expected value.
> + */
Oh. This sounds like a problem. As reference for others, this
is the KVM code in kvm_hv_get_msr():
case HV_X64_MSR_VP_INDEX: {
int r;
struct kvm_vcpu *v;
kvm_for_each_vcpu(r, v, vcpu->kvm) {
if (v == vcpu) {
data = r;
break;
}
}
break;
}
The problem with that is that it will break as soon as we create
VCPUs in a different order. Unsolvable on hosts that don't allow
HV_X64_MSR_VP_INDEX to be set, however.
> + msr_data.entries[0].data = cs->cpu_index;
> + kvm_vcpu_ioctl(cs, KVM_SET_MSRS, &msr_data);
> + ret = kvm_vcpu_ioctl(cs, KVM_GET_MSRS, &msr_data);
> + assert(ret == 1);
> + assert(msr_data.entries[0].data == cs->cpu_index);
If KVM already initializes the MSR to cpu_index, and we abort here
if it reads back as anything other than cpu_index, why exactly
do we need the KVM_SET_MSRS call?
> +}
> +
> static int hyperv_handle_properties(CPUState *cs)
> {
> X86CPU *cpu = X86_CPU(cs);
> CPUX86State *env = &cpu->env;
>
> + hyperv_set_vp_index(cs);
> +
The purpose of hyperv_handle_properties() is just to initialize
cpu->features[] CPUID data based on the hyperv options. I suggest
creating a new hyperv_vcpu_init() or x86_cpu_hyperv_init()
function for these initialization steps.
> if (cpu->hyperv_time &&
> kvm_check_extension(cs->kvm_state, KVM_CAP_HYPERV_TIME) <= 0) {
> cpu->hyperv_time = false;
> --
> 2.9.4
>
--
Eduardo
* Re: [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
2017-06-13 18:57 ` Eduardo Habkost
@ 2017-06-14 11:25 ` Roman Kagan
2017-06-14 11:26 ` Paolo Bonzini
2017-06-14 13:00 ` Eduardo Habkost
0 siblings, 2 replies; 76+ messages in thread
From: Roman Kagan @ 2017-06-14 11:25 UTC (permalink / raw)
To: Eduardo Habkost
Cc: qemu-devel, Paolo Bonzini, Evgeny Yakovlev, Denis V . Lunev,
Igor Mammedov
On Tue, Jun 13, 2017 at 03:57:52PM -0300, Eduardo Habkost wrote:
> On Tue, Jun 06, 2017 at 09:19:30PM +0300, Roman Kagan wrote:
> > Hyper-V identifies vcpus by the virtual processor (VP) index which is
> > normally queried by the guest via HV_X64_MSR_VP_INDEX msr.
> >
> > It has to be owned by QEMU in order to preserve it across migration.
> >
> > However, the current implementation in KVM doesn't allow setting this
> > msr, and KVM uses its own notion of VP index. Fortunately, the way
> > vcpus are created in QEMU/KVM basically guarantees that the KVM value is
> > equal to QEMU cpu_index.
>
> This might not be true in the future. cpu_index is not a good
> identifier for CPUs, and we would like to remove it in the
> future.
>
> But it looks like we have no choice, see extra comments below:
> > +static void hyperv_set_vp_index(CPUState *cs)
> > +{
> > + struct {
> > + struct kvm_msrs info;
> > + struct kvm_msr_entry entries[1];
> > + } msr_data;
> > + int ret;
> > +
> > + msr_data.info.nmsrs = 1;
> > + msr_data.entries[0].index = HV_X64_MSR_VP_INDEX;
> > +
> > + /*
> > + * Some kernels don't support setting HV_X64_MSR_VP_INDEX. However,
> > + * they use VP_INDEX equal to cpu_index due to the way vcpus are created.
> > + * So ignore the errors from SET_MSRS, but verify that GET_MSRS returns the
> > + * expected value.
> > + */
>
> Oh. This sounds like a problem. As reference for others, this
> is the KVM code in kvm_hv_get_msr():
>
> case HV_X64_MSR_VP_INDEX: {
> int r;
> struct kvm_vcpu *v;
>
> kvm_for_each_vcpu(r, v, vcpu->kvm) {
> if (v == vcpu) {
> data = r;
> break;
> }
> }
> break;
> }
>
> The problem with that is that it will break as soon as we create
> VCPUs in a different order. Unsolvable on hosts that don't allow
> HV_X64_MSR_VP_INDEX to be set, however.
Right, thanks for putting together a detailed explanation.
This was a thinko back then, not to have HV_X64_MSR_VP_INDEX maintained
by QEMU. I'm going to post a patch to KVM fixing that.
Meanwhile QEMU needs a way to maintain its notion of vp_index that is
1) in sync with kernel's notion
2) also with kernels that don't support setting the msr
3) persistent across migrations
cpu_index looked like a perfect candidate.
> > + msr_data.entries[0].data = cs->cpu_index;
> > + kvm_vcpu_ioctl(cs, KVM_SET_MSRS, &msr_data);
> > + ret = kvm_vcpu_ioctl(cs, KVM_GET_MSRS, &msr_data);
> > + assert(ret == 1);
> > + assert(msr_data.entries[0].data == cs->cpu_index);
>
> If KVM already initializes the MSR to cpu_index, and we abort here
> if it reads back as anything other than cpu_index, why exactly
> do we need the KVM_SET_MSRS call?
The kernel made no obligations to keep initializing its vp_index
identically to QEMU's cpu_index. So we'd better check and abort if that
got out of sync. Once KVM gains the ability to learn vp_index from QEMU
we'll no longer depend on that as we'll do explicit synchronization.
> > +}
> > +
> > static int hyperv_handle_properties(CPUState *cs)
> > {
> > X86CPU *cpu = X86_CPU(cs);
> > CPUX86State *env = &cpu->env;
> >
> > + hyperv_set_vp_index(cs);
> > +
>
> The purpose of hyperv_handle_properties() is just to initialize
> cpu->features[] CPUID data based on the hyperv options. I suggest
> creating a new hyperv_vcpu_init() or x86_cpu_hyperv_init()
> function for these initialization steps.
Sounds like a good idea, thanks.
Roman.
* Re: [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
2017-06-14 11:25 ` Roman Kagan
@ 2017-06-14 11:26 ` Paolo Bonzini
2017-06-14 13:00 ` Igor Mammedov
2017-06-14 13:01 ` Eduardo Habkost
2017-06-14 13:00 ` Eduardo Habkost
1 sibling, 2 replies; 76+ messages in thread
From: Paolo Bonzini @ 2017-06-14 11:26 UTC (permalink / raw)
To: Roman Kagan, Eduardo Habkost, qemu-devel, Evgeny Yakovlev,
Denis V . Lunev, Igor Mammedov
On 14/06/2017 13:25, Roman Kagan wrote:
>> The problem with that is that it will break as soon as we create
>> VCPUs in a different order. Unsolvable on hosts that don't allow
>> HV_X64_MSR_VP_INDEX to be set, however.
> Right, thanks for putting together a detailed explanation.
>
> This was a thinko back then, not to have HV_X64_MSR_VP_INDEX maintained
> by QEMU. I'm going to post a patch to KVM fixing that.
>
> Meanwhile QEMU needs a way to maintain its notion of vp_index that is
> 1) in sync with kernel's notion
> 2) also with kernels that don't support setting the msr
> 3) persistent across migrations
>
> cpu_index looked like a perfect candidate.
>
What you want is the APIC id, which _is_ cpu_index but may not be in the
future. But the APIC id is also the KVM vcpu_id, so there's no need to
have VP_INDEX maintained by QEMU.
Paolo
* Re: [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
2017-06-14 11:26 ` Paolo Bonzini
@ 2017-06-14 13:00 ` Igor Mammedov
2017-06-15 12:41 ` Roman Kagan
2017-06-14 13:01 ` Eduardo Habkost
1 sibling, 1 reply; 76+ messages in thread
From: Igor Mammedov @ 2017-06-14 13:00 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Roman Kagan, Eduardo Habkost, qemu-devel, Evgeny Yakovlev,
Denis V . Lunev
On Wed, 14 Jun 2017 13:26:44 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 14/06/2017 13:25, Roman Kagan wrote:
> >> The problem with that is that it will break as soon as we create
> >> VCPUs in a different order. Unsolvable on hosts that don't allow
> >> HV_X64_MSR_VP_INDEX to be set, however.
> > Right, thanks for putting together a detailed explanation.
> >
> > This was a thinko back then, not to have HV_X64_MSR_VP_INDEX maintained
> > by QEMU. I'm going to post a patch to KVM fixing that.
> >
> > Meanwhile QEMU needs a way to maintain its notion of vp_index that is
> > 1) in sync with kernel's notion
> > 2) also with kernels that don't support setting the msr
> > 3) persistent across migrations
> >
> > cpu_index looked like a perfect candidate.
> >
>
> What you want is the APIC id,
> which _is_ cpu_index but may not be in the
depending on topology cpu_index won't be the same as APIC ID/vcpu_id
/AMDs odd core count/.
> future. But the APIC id is also the KVM vcpu_id, so there's no need to
> have VP_INDEX maintained by QEMU.
agreed it'd be better to reuse vcpu_id/apic id as interface between
qemu/kvm/guest instead of adding additional cpu_index concept in ABI
>
> Paolo
* Re: [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
2017-06-14 13:00 ` Igor Mammedov
@ 2017-06-15 12:41 ` Roman Kagan
2017-06-15 13:22 ` Paolo Bonzini
2017-06-15 13:27 ` Igor Mammedov
0 siblings, 2 replies; 76+ messages in thread
From: Roman Kagan @ 2017-06-15 12:41 UTC (permalink / raw)
To: Igor Mammedov
Cc: Paolo Bonzini, Eduardo Habkost, qemu-devel, Evgeny Yakovlev,
Denis V . Lunev
On Wed, Jun 14, 2017 at 03:00:27PM +0200, Igor Mammedov wrote:
> On Wed, 14 Jun 2017 13:26:44 +0200
> Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> > On 14/06/2017 13:25, Roman Kagan wrote:
> > >> The problem with that is that it will break as soon as we create
> > >> VCPUs in a different order. Unsolvable on hosts that don't allow
> > >> HV_X64_MSR_VP_INDEX to be set, however.
> > > Right, thanks for putting together a detailed explanation.
> > >
> > > This was a thinko back then, not to have HV_X64_MSR_VP_INDEX maintained
> > > by QEMU. I'm going to post a patch to KVM fixing that.
> > >
> > > Meanwhile QEMU needs a way to maintain its notion of vp_index that is
> > > 1) in sync with kernel's notion
> > > 2) also with kernels that don't support setting the msr
> > > 3) persistent across migrations
> > >
> > > cpu_index looked like a perfect candidate.
> > >
> >
> > What you want is the APIC id,
>
> > which _is_ cpu_index but may not be in the
> depending on topology cpu_index won't be the same as APIC ID/vcpu_id
> /AMDs odd core count/.
So vcpu_id can be sparse?
> > future. But the APIC id is also the KVM vcpu_id, so there's no need to
> > have VP_INDEX maintained by QEMU.
> agreed it'd be better to reuse vcpu_id/apic id as interface between
> qemu/kvm/guest instead of adding additional cpu_index concept in ABI
Having consulted the spec, I'm not so confident any more this is the
right move.
> 7.8.1 Virtual Processor Index
>
> Virtual processors are identified by using an index (VP index). The
> maximum number of virtual processors per partition supported by the
> current implementation of the hypervisor can be obtained through CPUID
> leaf 0x40000005. A virtual processor index must be less than the
> maximum number of virtual processors per partition.
This seems to imply that VP index should be dense. As if they use it
directly as an index into an array whose length is equal to the max
number of vcpus. (BTW the value we report for it is currently hardcoded
to 0x40 which probably needs fixing, too.)
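For reference, a guest-side sketch of where that maximum comes from, per
the spec excerpt quoted above (GCC's cpuid.h assumed):

    #include <stdint.h>
    #include <cpuid.h>

    /* EAX of CPUID leaf 0x40000005 holds the maximum number of virtual
     * processors per partition; every VP index must be below it. */
    static uint32_t hv_max_vps(void)
    {
        uint32_t eax, ebx, ecx, edx;
        __cpuid(0x40000005, eax, ebx, ecx, edx);
        return eax;
    }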
I'm starting to drift back to adding SET_MSRS support to
HV_X64_MSR_VP_INDEX in KVM to delegate the control to QEMU, and having
QEMU use cpu_index as its value... :-/
Roman.
* Re: [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
2017-06-15 12:41 ` Roman Kagan
@ 2017-06-15 13:22 ` Paolo Bonzini
2017-06-15 13:27 ` Igor Mammedov
1 sibling, 0 replies; 76+ messages in thread
From: Paolo Bonzini @ 2017-06-15 13:22 UTC (permalink / raw)
To: Roman Kagan, Igor Mammedov, Eduardo Habkost, qemu-devel,
Evgeny Yakovlev, Denis V . Lunev
On 15/06/2017 14:41, Roman Kagan wrote:
> On Wed, Jun 14, 2017 at 03:00:27PM +0200, Igor Mammedov wrote:
>> On Wed, 14 Jun 2017 13:26:44 +0200
>> Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>>> On 14/06/2017 13:25, Roman Kagan wrote:
>>>>> The problem with that is that it will break as soon as we create
>>>>> VCPUs in a different order. Unsolvable on hosts that don't allow
>>>>> HV_X64_MSR_VP_INDEX to be set, however.
>>>> Right, thanks for putting together a detailed explanation.
>>>>
>>>> This was a thinko back then, not to have HV_X64_MSR_VP_INDEX maintained
>>>> by QEMU. I'm going to post a patch to KVM fixing that.
>>>>
>>>> Meanwhile QEMU needs a way to maintain its notion of vp_index that is
>>>> 1) in sync with kernel's notion
>>>> 2) also with kernels that don't support setting the msr
>>>> 3) persistent across migrations
>>>>
>>>> cpu_index looked like a perfect candidate.
>>>>
>>>
>>> What you want is the APIC id,
>>
>>> which _is_ cpu_index but may not be in the
>> depending on topology cpu_index won't be the same as APIC ID/vcpu_id
>> /AMDs odd core count/.
>
> So vcpu_id can be sparse?
Yes.
> Having consulted the spec, I'm not so confident any more this is the
> right move.
>
>> 7.8.1 Virtual Processor Index
>>
>> Virtual processors are identified by using an index (VP index). The
>> maximum number of virtual processors per partition supported by the
>> current implementation of the hypervisor can be obtained through CPUID
>> leaf 0x40000005. A virtual processor index must be less than the
>> maximum number of virtual processors per partition.
>
> This seems to imply that VP index should be dense. As if they use it
> directly as an index into an array whose length is equal to the max
> number of vcpus. (BTW the value we report for it is currently hardcoded
> to 0x40 which probably needs fixing, too.)
>
> I'm starting to drift back to adding SET_MSRS support to
> HV_X64_MSR_VP_INDEX in KVM to delegate the control to QEMU, and having
> QEMU use cpu_index as its value... :-/
Yes, definitely.
Paolo
* Re: [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
2017-06-15 12:41 ` Roman Kagan
2017-06-15 13:22 ` Paolo Bonzini
@ 2017-06-15 13:27 ` Igor Mammedov
2017-06-15 16:05 ` Roman Kagan
1 sibling, 1 reply; 76+ messages in thread
From: Igor Mammedov @ 2017-06-15 13:27 UTC (permalink / raw)
To: Roman Kagan
Cc: Paolo Bonzini, Eduardo Habkost, qemu-devel, Evgeny Yakovlev,
Denis V . Lunev, Radim Krčmář
On Thu, 15 Jun 2017 15:41:08 +0300
Roman Kagan <rkagan@virtuozzo.com> wrote:
> On Wed, Jun 14, 2017 at 03:00:27PM +0200, Igor Mammedov wrote:
> > On Wed, 14 Jun 2017 13:26:44 +0200
> > Paolo Bonzini <pbonzini@redhat.com> wrote:
> >
> > > On 14/06/2017 13:25, Roman Kagan wrote:
> > > >> The problem with that is that it will break as soon as we create
> > > >> VCPUs in a different order. Unsolvable on hosts that don't allow
> > > >> HV_X64_MSR_VP_INDEX to be set, however.
> > > > Right, thanks for putting together a detailed explanation.
> > > >
> > > > This was a thinko back then, not to have HV_X64_MSR_VP_INDEX maintained
> > > > by QEMU. I'm going to post a patch to KVM fixing that.
> > > >
> > > > Meanwhile QEMU needs a way to maintain its notion of vp_index that is
> > > > 1) in sync with kernel's notion
> > > > 2) also with kernels that don't support setting the msr
> > > > 3) persistent across migrations
> > > >
> > > > cpu_index looked like a perfect candidate.
> > > >
> > >
> > > What you want is the APIC id,
> >
> > > which _is_ cpu_index but may not be in the
> > depending on topology cpu_index won't be the same as APIC ID/vcpu_id
> > /AMDs odd core count/.
>
> So vcpu_id can be sparse?
yes
I suppose kernel even has cpu->apic_id map somewhere that you can reuse
instead of custom map hv_context.vp_index[]
> > > future. But the APIC id is also the KVM vcpu_id, so there's no need to
> > > have VP_INDEX maintained by QEMU.
> > agreed it'd be better to reuse vcpu_id/apic id as interface between
> > qemu/kvm/guest instead of adding additional cpu_index concept in ABI
>
> Having consulted the spec, I'm not so confident any more this is the
> right move.
>
> > 7.8.1 Virtual Processor Index
> >
> > Virtual processors are identified by using an index (VP index). The
> > maximum number of virtual processors per partition supported by the
> > current implementation of the hypervisor can be obtained through CPUID
> > leaf 0x40000005. A virtual processor index must be less than the
> > maximum number of virtual processors per partition.
>
> This seems to imply that VP index should be dense. As if they use it
> directly as an index into an array whose length is equal to the max
> number of vcpus. (BTW the value we report for it is currently hardcoded
> to 0x40 which probably needs fixing, too.)
the question here is where "maximum number of virtual processors" comes from
if it's possible to tell the guest the value explicitly,
we can use pcms->apic_id_limit for it.
> I'm starting to drift back to adding SET_MSRS support to
> HV_X64_MSR_VP_INDEX in KVM to delegate the control to QEMU, and having
> QEMU use cpu_index as its value... :-/
>
> Roman.
* Re: [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
2017-06-15 13:27 ` Igor Mammedov
@ 2017-06-15 16:05 ` Roman Kagan
2017-06-18 15:29 ` Eduardo Habkost
0 siblings, 1 reply; 76+ messages in thread
From: Roman Kagan @ 2017-06-15 16:05 UTC (permalink / raw)
To: Igor Mammedov
Cc: Paolo Bonzini, Eduardo Habkost, qemu-devel, Evgeny Yakovlev,
Denis V . Lunev, Radim Krčmář
On Thu, Jun 15, 2017 at 03:27:28PM +0200, Igor Mammedov wrote:
> On Thu, 15 Jun 2017 15:41:08 +0300
> Roman Kagan <rkagan@virtuozzo.com> wrote:
> > On Wed, Jun 14, 2017 at 03:00:27PM +0200, Igor Mammedov wrote:
> > > On Wed, 14 Jun 2017 13:26:44 +0200
> > > Paolo Bonzini <pbonzini@redhat.com> wrote:
> > >
> > > > On 14/06/2017 13:25, Roman Kagan wrote:
> > > > >> The problem with that is that it will break as soon as we create
> > > > >> VCPUs in a different order. Unsolvable on hosts that don't allow
> > > > >> HV_X64_MSR_VP_INDEX to be set, however.
> > > > > Right, thanks for putting together a detailed explanation.
> > > > >
> > > > > This was a thinko back then, not to have HV_X64_MSR_VP_INDEX maintained
> > > > > by QEMU. I'm going to post a patch to KVM fixing that.
> > > > >
> > > > > Meanwhile QEMU needs a way to maintain its notion of vp_index that is
> > > > > 1) in sync with kernel's notion
> > > > > 2) also with kernels that don't support setting the msr
> > > > > 3) persistent across migrations
> > > > >
> > > > > cpu_index looked like a perfect candidate.
> > > > >
> > > >
> > > > What you want is the APIC id,
> > >
> > > > which _is_ cpu_index but may not be in the
> > > depending on topology cpu_index won't be the same as APIC ID/vcpu_id
> > > /AMDs odd core count/.
> >
> > So vcpu_id can be sparse?
> yes
>
> I suppose kernel even has cpu->apic_id map somewhere that you can reuse
> instead of custom map hv_context.vp_index[]
Right, but this is not the problem. Whether apic_id matches the
expected vp_index semantics is.
> > > > future. But the APIC id is also the KVM vcpu_id, so there's no need to
> > > > have VP_INDEX maintained by QEMU.
> > > agreed it'd be better to reuse vcpu_id/apic id as interface between
> > > qemu/kvm/guest instead of adding additional cpu_index concept in ABI
> >
> > Having consulted the spec, I'm not so confident any more this is the
> > right move.
> >
> > > 7.8.1 Virtual Processor Index
> > >
> > > Virtual processors are identified by using an index (VP index). The
> > > maximum number of virtual processors per partition supported by the
> > > current implementation of the hypervisor can be obtained through CPUID
> > > leaf 0x40000005. A virtual processor index must be less than the
> > > maximum number of virtual processors per partition.
> >
> > This seems to imply that VP index should be dense. As if they use it
> > directly as an index into an array whose length is equal to the max
> > number of vcpus. (BTW the value we report for it is currently hardcoded
> > to 0x40 which probably needs fixing, too.)
> the question here is where "maximum number of virtual processors" comes from
We provide it in a corresponding cpuid leaf.
> if it's possible to tell guest value explicitly,
> we can use pcms->apic_id_limit for it.
But, depending on the apic_id encoding scheme, this can be arbitrarily
bigger than the actual maximum number of vcpus. Which can result in
guest confusion or inefficiency.
> > I'm starting to drift back to adding SET_MSRS support to
> > HV_X64_MSR_VP_INDEX in KVM to delegate the control to QEMU, and having
> > QEMU use cpu_index as its value... :-/
Roman.
* Re: [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
2017-06-15 16:05 ` Roman Kagan
@ 2017-06-18 15:29 ` Eduardo Habkost
0 siblings, 0 replies; 76+ messages in thread
From: Eduardo Habkost @ 2017-06-18 15:29 UTC (permalink / raw)
To: Roman Kagan, Igor Mammedov, Paolo Bonzini, qemu-devel,
Evgeny Yakovlev, Denis V . Lunev, Radim Krčmář
On Thu, Jun 15, 2017 at 07:05:29PM +0300, Roman Kagan wrote:
> On Thu, Jun 15, 2017 at 03:27:28PM +0200, Igor Mammedov wrote:
[...]
> > > > > future. But the APIC id is also the KVM vcpu_id, so there's no need to
> > > > > have VP_INDEX maintained by QEMU.
> > > > agreed it'd be better to reuse vcpu_id/apic id as interface between
> > > > qemu/kvm/guest instead of adding additional cpu_index concept in ABI
> > >
> > > Having consulted the spec, I'm not so confident any more this is the
> > > right move.
> > >
> > > > 7.8.1 Virtual Processor Index
> > > >
> > > > Virtual processors are identified by using an index (VP index). The
> > > > maximum number of virtual processors per partition supported by the
> > > > current implementation of the hypervisor can be obtained through CPUID
> > > > leaf 0x40000005. A virtual processor index must be less than the
> > > > maximum number of virtual processors per partition.
> > >
> > > This seems to imply that VP index should be dense. As if they use it
> > > directly as an index into an array whose length is equal to the max
> > > number of vcpus. (BTW the value we report for it is currently hardcoded
> > > to 0x40 which probably needs fixing, too.)
> > the question here is where "maximum number of virtual processors" comes from
>
> We provide it in a corresponding cpuid leaf.
>
> > if it's possible to tell guest value explicitly,
> > we can use pcms->apic_id_limit for it.
>
> But, depending on the apic_id encoding scheme, this can be arbitrarily
> bigger than the actual maximum number of vcpus. Which can result in
> guest confusion or inefficiency.
Note that it will be bigger only if threads-per-core or
cores-per-package are not powers of 2. If your use case doesn't
require configuration of a virtual thread/core topology, the APIC
IDs can always be contiguous.
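A worked sketch of that encoding (bit layout as in QEMU's x86 topology
code, simplified to a single socket; helper names are made up):

    #include <stdint.h>

    /* Each topology field gets ceil(log2(n)) bits, so non-power-of-2
     * counts leave unused IDs. */
    static uint32_t bits_for(uint32_t n)
    {
        uint32_t b = 0;
        while ((1u << b) < n) {
            b++;
        }
        return b;
    }

    static uint32_t apic_id(uint32_t core, uint32_t thread,
                            uint32_t nr_threads)
    {
        return (core << bits_for(nr_threads)) | thread;
    }

    /* nr_threads = 3 -> 2 thread bits: core 0 gets APIC IDs 0..2,
     * core 1 starts at 4, so ID 3 is never used (sparse). With
     * power-of-2 counts the IDs are contiguous. */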
--
Eduardo
* Re: [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
2017-06-14 11:26 ` Paolo Bonzini
2017-06-14 13:00 ` Igor Mammedov
@ 2017-06-14 13:01 ` Eduardo Habkost
2017-06-14 13:11 ` Igor Mammedov
1 sibling, 1 reply; 76+ messages in thread
From: Eduardo Habkost @ 2017-06-14 13:01 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Roman Kagan, qemu-devel, Evgeny Yakovlev, Denis V . Lunev,
Igor Mammedov
On Wed, Jun 14, 2017 at 01:26:44PM +0200, Paolo Bonzini wrote:
>
>
> On 14/06/2017 13:25, Roman Kagan wrote:
> >> The problem with that is that it will break as soon as we create
> >> VCPUs in a different order. Unsolvable on hosts that don't allow
> >> HV_X64_MSR_VP_INDEX to be set, however.
> > Right, thanks for putting together a detailed explanation.
> >
> > This was a thinko back then, not to have HV_X64_MSR_VP_INDEX maintained
> > by QEMU. I'm going to post a patch to KVM fixing that.
> >
> > Meanwhile QEMU needs a way to maintain its notion of vp_index that is
> > 1) in sync with kernel's notion
> > 2) also with kernels that don't support setting the msr
> > 3) persistent across migrations
> >
> > cpu_index looked like a perfect candidate.
> >
>
> What you want is the APIC id, which _is_ cpu_index but may not be in the
> future. But the APIC id is also the KVM vcpu_id, so there's no need to
> have VP_INDEX maintained by QEMU.
No, KVM really uses the VCPU _index_ for HV_X64_MSR_VP_INDEX:
kvm_hv_get_msr():
case HV_X64_MSR_VP_INDEX: {
int r;
struct kvm_vcpu *v;
kvm_for_each_vcpu(r, v, vcpu->kvm) {
if (v == vcpu) {
data = r;
break;
}
}
break;
}
--
Eduardo
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
2017-06-14 13:01 ` Eduardo Habkost
@ 2017-06-14 13:11 ` Igor Mammedov
2017-06-14 13:17 ` Paolo Bonzini
2017-06-14 13:19 ` Eduardo Habkost
0 siblings, 2 replies; 76+ messages in thread
From: Igor Mammedov @ 2017-06-14 13:11 UTC (permalink / raw)
To: Eduardo Habkost
Cc: Paolo Bonzini, Roman Kagan, qemu-devel, Evgeny Yakovlev,
Denis V . Lunev
On Wed, 14 Jun 2017 10:01:49 -0300
Eduardo Habkost <ehabkost@redhat.com> wrote:
> On Wed, Jun 14, 2017 at 01:26:44PM +0200, Paolo Bonzini wrote:
> >
> >
> > On 14/06/2017 13:25, Roman Kagan wrote:
> > >> The problem with that is that it will break as soon as we create
> > >> VCPUs in a different order. Unsolvable on hosts that don't allow
> > >> HV_X64_MSR_VP_INDEX to be set, however.
> > > Right, thanks for putting together a detailed explanation.
> > >
> > > This was a thinko back then, not to have HV_X64_MSR_VP_INDEX maintained
> > > by QEMU. I'm going to post a patch to KVM fixing that.
> > >
> > > Meanwhile QEMU needs a way to maintain its notion of vp_index that is
> > > 1) in sync with kernel's notion
> > > 2) also with kernels that don't support setting the msr
> > > 3) persistent across migrations
> > >
> > > cpu_index looked like a perfect candidate.
> > >
> >
> > What you want is the APIC id, which _is_ cpu_index but may not be in the
> > future. But the APIC id is also the KVM vcpu_id, so there's no need to
> > have VP_INDEX maintained by QEMU.
>
> No, KVM really uses the VCPU _index_ for HV_X64_MSR_VP_INDEX:
and as you pointed out, that works just by luck:
as soon as CPUs are created out of order,
the returned value won't match cpu_index.
So instead of spreading this nonsense out to QEMU, is it possible
to fix the kernel (kvm+guest) to use apic_id instead?
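Illustratively, the kernel-side fix would be roughly a one-liner
(a sketch, not a tested patch):

	case HV_X64_MSR_VP_INDEX:
		/* report the stable, userspace-visible id instead
		 * of the creation-order index */
		data = vcpu->vcpu_id;
		break;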
> kvm_hv_get_msr():
>
> case HV_X64_MSR_VP_INDEX: {
> int r;
> struct kvm_vcpu *v;
>
> kvm_for_each_vcpu(r, v, vcpu->kvm) {
> if (v == vcpu) {
> data = r;
> break;
> }
> }
> break;
> }
>
>
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
2017-06-14 13:11 ` Igor Mammedov
@ 2017-06-14 13:17 ` Paolo Bonzini
2017-06-14 13:22 ` Eduardo Habkost
2017-06-14 13:19 ` Eduardo Habkost
1 sibling, 1 reply; 76+ messages in thread
From: Paolo Bonzini @ 2017-06-14 13:17 UTC (permalink / raw)
To: Igor Mammedov, Eduardo Habkost
Cc: Roman Kagan, qemu-devel, Evgeny Yakovlev, Denis V . Lunev
On 14/06/2017 15:11, Igor Mammedov wrote:
>> No, KVM really uses the VCPU _index_ for HV_X64_MSR_VP_INDEX:
>
> and as you pointed out that works just by luck,
> as soon as we there would be out of order created CPUs
> returned value won't match cpu_index.
>
> So instead of spreading this nonsense out to QEMU, is it possible
> to fix kernel(kvm+guest) to use apic_id instead?
Yes, definitely. At this point let's add a new "KVM_CAP_HYPERV_SYNIC2"
capability and declare the old one broken; that will make things easier.
Paolo
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
2017-06-14 13:17 ` Paolo Bonzini
@ 2017-06-14 13:22 ` Eduardo Habkost
2017-06-14 13:37 ` Paolo Bonzini
2017-06-14 13:38 ` Igor Mammedov
0 siblings, 2 replies; 76+ messages in thread
From: Eduardo Habkost @ 2017-06-14 13:22 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Igor Mammedov, Roman Kagan, qemu-devel, Evgeny Yakovlev,
Denis V . Lunev
On Wed, Jun 14, 2017 at 03:17:54PM +0200, Paolo Bonzini wrote:
>
>
> On 14/06/2017 15:11, Igor Mammedov wrote:
> >> No, KVM really uses the VCPU _index_ for HV_X64_MSR_VP_INDEX:
> >
> > and as you pointed out that works just by luck,
> > as soon as we there would be out of order created CPUs
> > returned value won't match cpu_index.
> >
> > So instead of spreading this nonsense out to QEMU, is it possible
> > to fix kernel(kvm+guest) to use apic_id instead?
>
> Yes, definitely. At this point let's add a new "KVM_CAP_HYPERV_SYNIC2"
> capability and declare the old one broken, that will make things easier.
What do we need a capability for? Can't we just call
KVM_SET_MSRS using arch_id (== apic_id) and warn the user if it
doesn't work?
--
Eduardo
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
2017-06-14 13:22 ` Eduardo Habkost
@ 2017-06-14 13:37 ` Paolo Bonzini
2017-06-14 13:38 ` Igor Mammedov
1 sibling, 0 replies; 76+ messages in thread
From: Paolo Bonzini @ 2017-06-14 13:37 UTC (permalink / raw)
To: Eduardo Habkost
Cc: Igor Mammedov, Roman Kagan, qemu-devel, Evgeny Yakovlev,
Denis V . Lunev
On 14/06/2017 15:22, Eduardo Habkost wrote:
>> Yes, definitely. At this point let's add a new "KVM_CAP_HYPERV_SYNIC2"
>> capability and declare the old one broken, that will make things easier.
> What do we need a capability for? Can't we just call
> KVM_SET_MSRS using arch_id (== apic_id) and warn the user if it
> doesn't work?
As mentioned in the review of patch 16/23, we may want to break
backwards compatibility (meaning we drop the old capability and
introduce a new one) for other reasons.
Paolo
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
2017-06-14 13:22 ` Eduardo Habkost
2017-06-14 13:37 ` Paolo Bonzini
@ 2017-06-14 13:38 ` Igor Mammedov
2017-06-14 13:45 ` Eduardo Habkost
1 sibling, 1 reply; 76+ messages in thread
From: Igor Mammedov @ 2017-06-14 13:38 UTC (permalink / raw)
To: Eduardo Habkost
Cc: Paolo Bonzini, Roman Kagan, qemu-devel, Evgeny Yakovlev,
Denis V . Lunev
On Wed, 14 Jun 2017 10:22:16 -0300
Eduardo Habkost <ehabkost@redhat.com> wrote:
> On Wed, Jun 14, 2017 at 03:17:54PM +0200, Paolo Bonzini wrote:
> >
> >
> > On 14/06/2017 15:11, Igor Mammedov wrote:
> > >> No, KVM really uses the VCPU _index_ for HV_X64_MSR_VP_INDEX:
> > >
> > > and as you pointed out that works just by luck,
> > > as soon as we there would be out of order created CPUs
> > > returned value won't match cpu_index.
> > >
> > > So instead of spreading this nonsense out to QEMU, is it possible
> > > to fix kernel(kvm+guest) to use apic_id instead?
> >
> > Yes, definitely. At this point let's add a new "KVM_CAP_HYPERV_SYNIC2"
> > capability and declare the old one broken, that will make things easier.
>
> What do we need a capability for?
> Can't we just call
> KVM_SET_MSRS using arch_id (== apic_id) and warn the user if it
> doesn't work?
a warning won't solve the issue, and it's called rather late in the vcpu
creation chain without any error propagation, so it's not possible (or at
least hard) to gracefully fail cpu hotplug.
with KVM_CAP_HYPERV_SYNIC2 we could easily turn on a compat property for the
SynIC device (needs QOMification) that would turn on the broken compat
logic if the capability is not present / an old machine type is used.
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
2017-06-14 13:38 ` Igor Mammedov
@ 2017-06-14 13:45 ` Eduardo Habkost
2017-06-14 18:40 ` Roman Kagan
0 siblings, 1 reply; 76+ messages in thread
From: Eduardo Habkost @ 2017-06-14 13:45 UTC (permalink / raw)
To: Igor Mammedov
Cc: Paolo Bonzini, Roman Kagan, qemu-devel, Evgeny Yakovlev,
Denis V . Lunev
On Wed, Jun 14, 2017 at 03:38:59PM +0200, Igor Mammedov wrote:
> On Wed, 14 Jun 2017 10:22:16 -0300
> Eduardo Habkost <ehabkost@redhat.com> wrote:
>
> > On Wed, Jun 14, 2017 at 03:17:54PM +0200, Paolo Bonzini wrote:
> > >
> > >
> > > On 14/06/2017 15:11, Igor Mammedov wrote:
> > > >> No, KVM really uses the VCPU _index_ for HV_X64_MSR_VP_INDEX:
> > > >
> > > > and as you pointed out that works just by luck,
> > > > as soon as we there would be out of order created CPUs
> > > > returned value won't match cpu_index.
> > > >
> > > > So instead of spreading this nonsense out to QEMU, is it possible
> > > > to fix kernel(kvm+guest) to use apic_id instead?
> > >
> > > Yes, definitely. At this point let's add a new "KVM_CAP_HYPERV_SYNIC2"
> > > capability and declare the old one broken, that will make things easier.
> >
> > What do we need a capability for?
> > Can't we just call
> > KVM_SET_MSRS using arch_id (== apic_id) and warn the user if it
> > doesn't work?
> warn won't solve issue, and it's called rather late in vcpu creation
> chain without any error propagation so it's not possible (hard) to
> gracefully fail cpu hotplug.
The issue is unsolvable on old kernels (and I don't think we want
to prevent the VM from starting), hence the warning. But the
ability to report errors gracefully on CPU hotplug is a very good
reason. Good point.
>
> with KVM_CAP_HYPERV_SYNIC2 we could easily turn on compat property for
> SYNC device (needs QOMification) that would turn on broken compat
> logic if capability is not present/old machine type.
I was going to propose enabling the compat logic if KVM_SET_MSRS
failed, but you have a good point about KVM_SET_MSRS being called
very late.
--
Eduardo
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
2017-06-14 13:45 ` Eduardo Habkost
@ 2017-06-14 18:40 ` Roman Kagan
2017-06-14 18:59 ` Eduardo Habkost
0 siblings, 1 reply; 76+ messages in thread
From: Roman Kagan @ 2017-06-14 18:40 UTC (permalink / raw)
To: Eduardo Habkost
Cc: Igor Mammedov, Paolo Bonzini, qemu-devel, Evgeny Yakovlev,
Denis V . Lunev
On Wed, Jun 14, 2017 at 10:45:23AM -0300, Eduardo Habkost wrote:
> On Wed, Jun 14, 2017 at 03:38:59PM +0200, Igor Mammedov wrote:
> > On Wed, 14 Jun 2017 10:22:16 -0300
> > Eduardo Habkost <ehabkost@redhat.com> wrote:
> >
> > > On Wed, Jun 14, 2017 at 03:17:54PM +0200, Paolo Bonzini wrote:
> > > >
> > > >
> > > > On 14/06/2017 15:11, Igor Mammedov wrote:
> > > > >> No, KVM really uses the VCPU _index_ for HV_X64_MSR_VP_INDEX:
> > > > >
> > > > > and as you pointed out that works just by luck,
> > > > > as soon as we there would be out of order created CPUs
> > > > > returned value won't match cpu_index.
> > > > >
> > > > > So instead of spreading this nonsense out to QEMU, is it possible
> > > > > to fix kernel(kvm+guest) to use apic_id instead?
> > > >
> > > > Yes, definitely. At this point let's add a new "KVM_CAP_HYPERV_SYNIC2"
> > > > capability and declare the old one broken, that will make things easier.
> > >
> > > What do we need a capability for?
> > > Can't we just call
> > > KVM_SET_MSRS using arch_id (== apic_id) and warn the user if it
> > > doesn't work?
> > warn won't solve issue, and it's called rather late in vcpu creation
> > chain without any error propagation so it's not possible (hard) to
> > gracefully fail cpu hotplug.
>
> The issue is unsolvable on old kernels (and I don't think we want
> to prevent the VM from starting), hence the warning. But the
> ability to report errors gracefully on CPU hotplug is a very good
> reason. Good point.
>
> >
> > with KVM_CAP_HYPERV_SYNIC2 we could easily turn on compat property for
> > SYNC device (needs QOMification) that would turn on broken compat
> > logic if capability is not present/old machine type.
>
> I was going to propose enabling the compat logic if KVM_SET_MSRS
> failed, but you have a good point about KVM_SET_MSRS being called
> very late.
Thanks for this discussion, I didn't realize all the implications
(including that hotplug issue too).
One more data point is that until now there was no use for vp_index in
QEMU, so it didn't care how KVM managed it. In KVM the only
vp_index-aware path that guests could trigger was exactly the reading of
HV_X64_MSR_VP_INDEX.
So let me try to sum up (to make sure I understand it right):
1) we add KVM_CAP_HYPERV_SYNIC2 to KVM; when QEMU enables it KVM
switches to using vcpu_id as vp_index and stops zeroing synic pages
2) new QEMU refuses to start in non-compat mode when
KVM_CAP_HYPERV_SYNIC2 is not supported
3) old QEMU or new QEMU in compat mode enables KVM_CAP_HYPERV_SYNIC
making KVM keep using internal cpu index as vp_index and zeroing
synic pages
4) new QEMU in compat mode refuses vmbus or sint route creation
Is this what is proposed? My only problem here is that KVM will have to
somehow guarantee stable numbering in the old synic mode. How can this
be ensured? And should it be at all?
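In code terms, the flow I'm imagining on the QEMU side would be roughly
this (pure sketch: KVM_CAP_HYPERV_SYNIC2 doesn't exist yet, and
compat_machine_type is a placeholder for however we gate it):

    if (kvm_check_extension(kvm_state, KVM_CAP_HYPERV_SYNIC2)) {
        /* new mode: vp_index == vcpu_id, synic pages not zeroed */
        kvm_vcpu_enable_cap(cs, KVM_CAP_HYPERV_SYNIC2, 0);
    } else if (compat_machine_type) {
        /* old mode: KVM keeps its internal index and zeroes pages */
        kvm_vcpu_enable_cap(cs, KVM_CAP_HYPERV_SYNIC, 0);
    } else {
        error_report("hv-synic requires KVM_CAP_HYPERV_SYNIC2");
        exit(1);
    }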
Thanks,
Roman.
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
2017-06-14 18:40 ` Roman Kagan
@ 2017-06-14 18:59 ` Eduardo Habkost
2017-06-15 8:26 ` Paolo Bonzini
0 siblings, 1 reply; 76+ messages in thread
From: Eduardo Habkost @ 2017-06-14 18:59 UTC (permalink / raw)
To: Roman Kagan, Igor Mammedov, Paolo Bonzini, qemu-devel,
Evgeny Yakovlev, Denis V . Lunev
On Wed, Jun 14, 2017 at 09:40:37PM +0300, Roman Kagan wrote:
> On Wed, Jun 14, 2017 at 10:45:23AM -0300, Eduardo Habkost wrote:
> > On Wed, Jun 14, 2017 at 03:38:59PM +0200, Igor Mammedov wrote:
> > > On Wed, 14 Jun 2017 10:22:16 -0300
> > > Eduardo Habkost <ehabkost@redhat.com> wrote:
> > >
> > > > On Wed, Jun 14, 2017 at 03:17:54PM +0200, Paolo Bonzini wrote:
> > > > >
> > > > >
> > > > > On 14/06/2017 15:11, Igor Mammedov wrote:
> > > > > >> No, KVM really uses the VCPU _index_ for HV_X64_MSR_VP_INDEX:
> > > > > >
> > > > > > and as you pointed out that works just by luck,
> > > > > > as soon as we there would be out of order created CPUs
> > > > > > returned value won't match cpu_index.
> > > > > >
> > > > > > So instead of spreading this nonsense out to QEMU, is it possible
> > > > > > to fix kernel(kvm+guest) to use apic_id instead?
> > > > >
> > > > > Yes, definitely. At this point let's add a new "KVM_CAP_HYPERV_SYNIC2"
> > > > > capability and declare the old one broken, that will make things easier.
> > > >
> > > > What do we need a capability for?
> > > > Can't we just call
> > > > KVM_SET_MSRS using arch_id (== apic_id) and warn the user if it
> > > > doesn't work?
> > > warn won't solve issue, and it's called rather late in vcpu creation
> > > chain without any error propagation so it's not possible (hard) to
> > > gracefully fail cpu hotplug.
> >
> > The issue is unsolvable on old kernels (and I don't think we want
> > to prevent the VM from starting), hence the warning. But the
> > ability to report errors gracefully on CPU hotplug is a very good
> > reason. Good point.
> >
> > >
> > > with KVM_CAP_HYPERV_SYNIC2 we could easily turn on compat property for
> > > SYNC device (needs QOMification) that would turn on broken compat
> > > logic if capability is not present/old machine type.
> >
> > I was going to propose enabling the compat logic if KVM_SET_MSRS
> > failed, but you have a good point about KVM_SET_MSRS being called
> > very late.
>
> Thanks for this discussion, I didn't realize all the implications
> (including that hotplug issue too).
>
> One more data point is that until now there was no use for vp_index in
> QEMU, so it didn't care how KVM managed it. In KVM the only
> vp_index-aware path that the guests could trigger was exactly reading of
> HV_X64_MSR_VP_INDEX.
>
> So let me try to sum up (to make sure I understand it right);
>
> 1) we add KVM_CAP_HYPERV_SYNIC2 to KVM; when QEMU enables it KVM
> switches to using vcpu_id as vp_index and stops zeroing synic pages
If we want to keep KVM code simpler, we could make QEMU
explicitly initialize vp_index using vcpu_id (== arch_id == apic_id)
if KVM_CAP_HYPERV_SYNIC2 is reported as supported.
>
> 2) new QEMU refuses to start in non-compat mode when
> KVM_CAP_HYPERV_SYNIC2 is not supported
It depends on which cases are considered "in non-compat mode".
Getting a VM not runnable just because the machine-type was
updated is not desirable, especially considering that in most
cases we will create the VCPUs in the same order and things would
keep working. Probably the best we can do in this case is to
automatically enable compat mode, but print a warning saying
future QEMU versions might break if the kernel is not upgraded.
>
> 3) old QEMU or new QEMU in compat mode enables KVM_CAP_HYPERV_SYNIC
> making KVM keep using internal cpu index as vp_index and zeroing
> synic pages
Maybe we need a warning in this case too, because QEMU won't
guarantee it will always create VCPUs in the same order in the
future.
>
> 4) new QEMU in compat mode refuses vmbus or sint route creation
Also: new QEMU in compat mode refuses CPU hotplug.
>
> Is this what is proposed? My only problem here is that KVM will have to
> somehow guarantee stable numbering in the old synic mode. How can this
> be ensured? And should it be at all?
As far as I can see, KVM can only guarantee stable numbering in
the old mode if QEMU creates the VCPUs in the same order. QEMU
can make a reasonable effort not to break that unnecessarily, but
users should be aware that the old mode is deprecated and can break.
--
Eduardo
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
2017-06-14 18:59 ` Eduardo Habkost
@ 2017-06-15 8:26 ` Paolo Bonzini
2017-06-15 11:40 ` Roman Kagan
0 siblings, 1 reply; 76+ messages in thread
From: Paolo Bonzini @ 2017-06-15 8:26 UTC (permalink / raw)
To: Eduardo Habkost, Roman Kagan, Igor Mammedov, qemu-devel,
Evgeny Yakovlev, Denis V . Lunev
On 14/06/2017 20:59, Eduardo Habkost wrote:
> On Wed, Jun 14, 2017 at 09:40:37PM +0300, Roman Kagan wrote:
>> One more data point is that until now there was no use for vp_index in
>> QEMU, so it didn't care how KVM managed it. In KVM the only
>> vp_index-aware path that the guests could trigger was exactly reading of
>> HV_X64_MSR_VP_INDEX.
>>
>> So let me try to sum up (to make sure I understand it right);
>>
>> 1) we add KVM_CAP_HYPERV_SYNIC2 to KVM; when QEMU enables it KVM
>> switches to using vcpu_id as vp_index and stops zeroing synic pages
>
> If we want to keep KVM code simpler, we could make QEMU
> explicitly initialize vp_index using vcpu_id (== arch_id == apic_id)
> if KVM_CAP_HYPERV_SYNIC2 is reported as supported.
>
>> 2) new QEMU refuses to start in non-compat mode when
>> KVM_CAP_HYPERV_SYNIC2 is not supported
>
> It depends on which cases are considered "in non-compat mode".
> Getting a VM not runnable just because the machine-type was
> updated is not desirable, especially considering that on most
> cases we will create the VCPUs on the same order and things would
> keep working. Probably the best we can do on this case is to
> automatically enable compat mode, but print a warning saying
> future QEMU versions might break if the kernel is not upgraded.
Anything that specifies hv_synic can be broken. There was really no
reason to specify it for anything except experimenting.
So QEMU never has to enable KVM_CAP_HYPERV_SYNIC. No compat mode is
necessary. In fact, I don't see any reason to _keep_
KVM_CAP_HYPERV_SYNIC in the kernel, either. This means that KVM can
also switch the vp_index unconditionally to vcpu_id, because old QEMU +
new kernel won't even start.
Paolo
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
2017-06-15 8:26 ` Paolo Bonzini
@ 2017-06-15 11:40 ` Roman Kagan
2017-06-15 11:42 ` Paolo Bonzini
0 siblings, 1 reply; 76+ messages in thread
From: Roman Kagan @ 2017-06-15 11:40 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Eduardo Habkost, Igor Mammedov, qemu-devel, Evgeny Yakovlev,
Denis V . Lunev
On Thu, Jun 15, 2017 at 10:26:58AM +0200, Paolo Bonzini wrote:
> On 14/06/2017 20:59, Eduardo Habkost wrote:
> > On Wed, Jun 14, 2017 at 09:40:37PM +0300, Roman Kagan wrote:
> >> One more data point is that until now there was no use for vp_index in
> >> QEMU, so it didn't care how KVM managed it. In KVM the only
> >> vp_index-aware path that the guests could trigger was exactly reading of
> >> HV_X64_MSR_VP_INDEX.
> >>
> >> So let me try to sum up (to make sure I understand it right);
> >>
> >> 1) we add KVM_CAP_HYPERV_SYNIC2 to KVM; when QEMU enables it KVM
> >> switches to using vcpu_id as vp_index and stops zeroing synic pages
> >
> > If we want to keep KVM code simpler, we could make QEMU
> > explicitly initialize vp_index using vcpu_id (== arch_id == apic_id)
> > if KVM_CAP_HYPERV_SYNIC2 is reported as supported.
> >
> >> 2) new QEMU refuses to start in non-compat mode when
> >> KVM_CAP_HYPERV_SYNIC2 is not supported
> >
> > It depends on which cases are considered "in non-compat mode".
> > Getting a VM not runnable just because the machine-type was
> > updated is not desirable, especially considering that on most
> > cases we will create the VCPUs on the same order and things would
> > keep working. Probably the best we can do on this case is to
> > automatically enable compat mode, but print a warning saying
> > future QEMU versions might break if the kernel is not upgraded.
>
> Anything that specifies hv_synic can be broken. There was really no
> reason to specify it for anything except experimenting.
Hyper-V SynIC timers depend on it; they have worked since QEMU 2.6 / KVM
4.5 and are even supported by libvirt since 1.3.3.
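(For reference, a user would typically have turned this on with
something like the following command line -- flag spelling illustrative
for QEMU 2.x:

    qemu-system-x86_64 ... -cpu host,hv_relaxed,hv_vapic,hv_time,hv_synic,hv_stimer

or the equivalent <hyperv> features in the libvirt domain XML.)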
> So QEMU never has to enable KVM_CAP_HYPERV_SYNIC. No compat mode is
> necessary. In fact, I don't see any reason to _keep_
> KVM_CAP_HYPERV_SYNIC in the kernel, either. This means that KVM can
> also switch unconditionally the vp_index to vcpu_id, because old QEMU +
> new kernel won't even start.
I'm afraid this is too harsh on users...
Roman.
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
2017-06-15 11:40 ` Roman Kagan
@ 2017-06-15 11:42 ` Paolo Bonzini
2017-06-15 12:03 ` Roman Kagan
0 siblings, 1 reply; 76+ messages in thread
From: Paolo Bonzini @ 2017-06-15 11:42 UTC (permalink / raw)
To: Roman Kagan, Eduardo Habkost, Igor Mammedov, qemu-devel,
Evgeny Yakovlev, Denis V . Lunev
On 15/06/2017 13:40, Roman Kagan wrote:
> On Thu, Jun 15, 2017 at 10:26:58AM +0200, Paolo Bonzini wrote:
>> On 14/06/2017 20:59, Eduardo Habkost wrote:
>>> On Wed, Jun 14, 2017 at 09:40:37PM +0300, Roman Kagan wrote:
>>>> One more data point is that until now there was no use for vp_index in
>>>> QEMU, so it didn't care how KVM managed it. In KVM the only
>>>> vp_index-aware path that the guests could trigger was exactly reading of
>>>> HV_X64_MSR_VP_INDEX.
>>>>
>>>> So let me try to sum up (to make sure I understand it right);
>>>>
>>>> 1) we add KVM_CAP_HYPERV_SYNIC2 to KVM; when QEMU enables it KVM
>>>> switches to using vcpu_id as vp_index and stops zeroing synic pages
>>>
>>> If we want to keep KVM code simpler, we could make QEMU
>>> explicitly initialize vp_index using vcpu_id (== arch_id == apic_id)
>>> if KVM_CAP_HYPERV_SYNIC2 is reported as supported.
>>>
>>>> 2) new QEMU refuses to start in non-compat mode when
>>>> KVM_CAP_HYPERV_SYNIC2 is not supported
>>>
>>> It depends on which cases are considered "in non-compat mode".
>>> Getting a VM not runnable just because the machine-type was
>>> updated is not desirable, especially considering that on most
>>> cases we will create the VCPUs on the same order and things would
>>> keep working. Probably the best we can do on this case is to
>>> automatically enable compat mode, but print a warning saying
>>> future QEMU versions might break if the kernel is not upgraded.
>>
>> Anything that specifies hv_synic can be broken. There was really no
>> reason to specify it for anything except experimenting.
>
> Hyper-V SynIC timers depend on it, and they work since QEMU 2.6 / KVM
> 4.5 and even supported by libvirt since 1.3.3.
But who is using them, and why would they be doing that? What is the
advantage of using SynIC timers?
Paolo
>> So QEMU never has to enable KVM_CAP_HYPERV_SYNIC. No compat mode is
>> necessary. In fact, I don't see any reason to _keep_
>> KVM_CAP_HYPERV_SYNIC in the kernel, either. This means that KVM can
>> also switch unconditionally the vp_index to vcpu_id, because old QEMU +
>> new kernel won't even start.
>
> I'm afraid this is too harsh on users...
>
> Roman.
>
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
2017-06-15 11:42 ` Paolo Bonzini
@ 2017-06-15 12:03 ` Roman Kagan
0 siblings, 0 replies; 76+ messages in thread
From: Roman Kagan @ 2017-06-15 12:03 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Eduardo Habkost, Igor Mammedov, qemu-devel, Evgeny Yakovlev,
Denis V . Lunev
On Thu, Jun 15, 2017 at 01:42:56PM +0200, Paolo Bonzini wrote:
>
>
> On 15/06/2017 13:40, Roman Kagan wrote:
> > On Thu, Jun 15, 2017 at 10:26:58AM +0200, Paolo Bonzini wrote:
> >> On 14/06/2017 20:59, Eduardo Habkost wrote:
> >>> On Wed, Jun 14, 2017 at 09:40:37PM +0300, Roman Kagan wrote:
> >>>> One more data point is that until now there was no use for vp_index in
> >>>> QEMU, so it didn't care how KVM managed it. In KVM the only
> >>>> vp_index-aware path that the guests could trigger was exactly reading of
> >>>> HV_X64_MSR_VP_INDEX.
> >>>>
> >>>> So let me try to sum up (to make sure I understand it right);
> >>>>
> >>>> 1) we add KVM_CAP_HYPERV_SYNIC2 to KVM; when QEMU enables it KVM
> >>>> switches to using vcpu_id as vp_index and stops zeroing synic pages
> >>>
> >>> If we want to keep KVM code simpler, we could make QEMU
> >>> explicitly initialize vp_index using vcpu_id (== arch_id == apic_id)
> >>> if KVM_CAP_HYPERV_SYNIC2 is reported as supported.
> >>>
> >>>> 2) new QEMU refuses to start in non-compat mode when
> >>>> KVM_CAP_HYPERV_SYNIC2 is not supported
> >>>
> >>> It depends on which cases are considered "in non-compat mode".
> >>> Getting a VM not runnable just because the machine-type was
> >>> updated is not desirable, especially considering that on most
> >>> cases we will create the VCPUs on the same order and things would
> >>> keep working. Probably the best we can do on this case is to
> >>> automatically enable compat mode, but print a warning saying
> >>> future QEMU versions might break if the kernel is not upgraded.
> >>
> >> Anything that specifies hv_synic can be broken. There was really no
> >> reason to specify it for anything except experimenting.
> >
> > Hyper-V SynIC timers depend on it, and they work since QEMU 2.6 / KVM
> > 4.5 and even supported by libvirt since 1.3.3.
>
> But who is using them, and why would they be doing that? What is the
> advantage of using SynIC timers?
I guess because they are lighter-weight than HPET, and Windows used to
ignore the APIC timer, so it would choose SynIC timers if available.
Dunno...
Roman.
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
2017-06-14 13:11 ` Igor Mammedov
2017-06-14 13:17 ` Paolo Bonzini
@ 2017-06-14 13:19 ` Eduardo Habkost
1 sibling, 0 replies; 76+ messages in thread
From: Eduardo Habkost @ 2017-06-14 13:19 UTC (permalink / raw)
To: Igor Mammedov
Cc: Paolo Bonzini, Roman Kagan, qemu-devel, Evgeny Yakovlev,
Denis V . Lunev
On Wed, Jun 14, 2017 at 03:11:17PM +0200, Igor Mammedov wrote:
> On Wed, 14 Jun 2017 10:01:49 -0300
> Eduardo Habkost <ehabkost@redhat.com> wrote:
>
> > On Wed, Jun 14, 2017 at 01:26:44PM +0200, Paolo Bonzini wrote:
> > >
> > >
> > > On 14/06/2017 13:25, Roman Kagan wrote:
> > > >> The problem with that is that it will break as soon as we create
> > > >> VCPUs in a different order. Unsolvable on hosts that don't allow
> > > >> HV_X64_MSR_VP_INDEX to be set, however.
> > > > Right, thanks for putting together a detailed explanation.
> > > >
> > > > This was a thinko back then, not to have HV_X64_MSR_VP_INDEX maintained
> > > > by QEMU. I'm going to post a patch to KVM fixing that.
> > > >
> > > > Meanwhile QEMU needs a way to maintain its notion of vp_index that is
> > > > 1) in sync with kernel's notion
> > > > 2) also with kernels that don't support setting the msr
> > > > 3) persistent across migrations
> > > >
> > > > cpu_index looked like a perfect candidate.
> > > >
> > >
> > > What you want is the APIC id, which _is_ cpu_index but may not be in the
> > > future. But the APIC id is also the KVM vcpu_id, so there's no need to
> > > have VP_INDEX maintained by QEMU.
> >
> > No, KVM really uses the VCPU _index_ for HV_X64_MSR_VP_INDEX:
> and as you pointed out that works just by luck,
> as soon as we there would be out of order created CPUs
> returned value won't match cpu_index.
This is true, which makes it even worse.
>
> So instead of spreading this nonsense out to QEMU, is it possible
> to fix kernel(kvm+guest) to use apic_id instead?
It is possible to fix the kernel, but if we want to support older
kernels, QEMU has no choice but to always create the VCPUs in the
same order.
--
Eduardo
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
2017-06-14 11:25 ` Roman Kagan
2017-06-14 11:26 ` Paolo Bonzini
@ 2017-06-14 13:00 ` Eduardo Habkost
2017-06-14 13:24 ` Igor Mammedov
1 sibling, 1 reply; 76+ messages in thread
From: Eduardo Habkost @ 2017-06-14 13:00 UTC (permalink / raw)
To: Roman Kagan, qemu-devel, Paolo Bonzini, Evgeny Yakovlev,
Denis V . Lunev, Igor Mammedov
On Wed, Jun 14, 2017 at 02:25:07PM +0300, Roman Kagan wrote:
> On Tue, Jun 13, 2017 at 03:57:52PM -0300, Eduardo Habkost wrote:
> > On Tue, Jun 06, 2017 at 09:19:30PM +0300, Roman Kagan wrote:
> > > Hyper-V identifies vcpus by the virtual processor (VP) index which is
> > > normally queried by the guest via HV_X64_MSR_VP_INDEX msr.
> > >
> > > It has to be owned by QEMU in order to preserve it across migration.
> > >
> > > However, the current implementation in KVM doesn't allow to set this
> > > msr, and KVM uses its own notion of VP index. Fortunately, the way
> > > vcpus are created in QEMU/KVM basically guarantees that the KVM value is
> > > equal to QEMU cpu_index.
> >
> > This might not be true in the future. cpu_index is not a good
> > identifier for CPUs, and we would like to remove it in the
> > future.
> >
> > But it looks like we have no choice, see extra comments below:
>
> > > +static void hyperv_set_vp_index(CPUState *cs)
> > > +{
> > > + struct {
> > > + struct kvm_msrs info;
> > > + struct kvm_msr_entry entries[1];
> > > + } msr_data;
> > > + int ret;
> > > +
> > > + msr_data.info.nmsrs = 1;
> > > + msr_data.entries[0].index = HV_X64_MSR_VP_INDEX;
> > > +
> > > + /*
> > > + * Some kernels don't support setting HV_X64_MSR_VP_INDEX. However,
> > > + * they use VP_INDEX equal to cpu_index due to the way vcpus are created.
> > > + * So ignore the errors from SET_MSRS, but verify that GET_MSRS returns the
> > > + * expected value.
> > > + */
> >
> > Oh. This sounds like a problem. As reference for others, this
> > is the KVM code in kvm_hv_get_msr():
> >
> > case HV_X64_MSR_VP_INDEX: {
> > int r;
> > struct kvm_vcpu *v;
> >
> > kvm_for_each_vcpu(r, v, vcpu->kvm) {
> > if (v == vcpu) {
> > data = r;
> > break;
> > }
> > }
> > break;
> > }
> >
> > The problem with that is that it will break as soon as we create
> > VCPUs in a different order. Unsolvable on hosts that don't allow
> > HV_X64_MSR_VP_INDEX to be set, however.
>
> Right, thanks for putting together a detailed explanation.
>
> This was a thinko back then, not to have HV_X64_MSR_VP_INDEX maintained
> by QEMU. I'm going to post a patch to KVM fixing that.
>
> Meanwhile QEMU needs a way to maintain its notion of vp_index that is
> 1) in sync with kernel's notion
> 2) also with kernels that don't support setting the msr
> 3) persistent across migrations
>
> cpu_index looked like a perfect candidate.
I would like to be able to stop using cpu_index as a persistent
VCPU identifier in the future. I would also like to be able to
create VCPUs in any order without breaking guest ABI. But it
seems to be impossible to do that without breaking vp_index on
older kernels.
So cpu_index looks like the only candidate we have.
>
> > > + msr_data.entries[0].data = cs->cpu_index;
> > > + kvm_vcpu_ioctl(cs, KVM_SET_MSRS, &msr_data);
> > > + ret = kvm_vcpu_ioctl(cs, KVM_GET_MSRS, &msr_data);
> > > + assert(ret == 1);
> > > + assert(msr_data.entries[0].data == cs->cpu_index);
> >
> > If KVM already initializes the MSR to cpu_index and we will abort
> > if it was not set to anything except cpu_index here, why exactly
> > do we need the KVM_SET_MSRS call?
>
> The kernel made no obligations to keep initializing its vp_index
> identically to QEMU's cpu_index. So we'd better check and abort if that
> got out of sync. Once KVM gains the ability to learn vp_index from QEMU
> we'll no longer depend on that as we'll do explicit synchronization.
I think the kernel now has an obligation to keep initializing
HV_X64_MSR_VP_INDEX in a compatible way, or it will break older
QEMU versions that don't set the MSR.
But if you don't think we can rely on that, then the KVM_SET_MSRS
call won't hurt.
--
Eduardo
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
2017-06-14 13:00 ` Eduardo Habkost
@ 2017-06-14 13:24 ` Igor Mammedov
2017-06-14 13:35 ` Eduardo Habkost
0 siblings, 1 reply; 76+ messages in thread
From: Igor Mammedov @ 2017-06-14 13:24 UTC (permalink / raw)
To: Eduardo Habkost
Cc: Roman Kagan, qemu-devel, Paolo Bonzini, Evgeny Yakovlev,
Denis V . Lunev
On Wed, 14 Jun 2017 10:00:15 -0300
Eduardo Habkost <ehabkost@redhat.com> wrote:
> On Wed, Jun 14, 2017 at 02:25:07PM +0300, Roman Kagan wrote:
> > On Tue, Jun 13, 2017 at 03:57:52PM -0300, Eduardo Habkost wrote:
> > > On Tue, Jun 06, 2017 at 09:19:30PM +0300, Roman Kagan wrote:
> > > > Hyper-V identifies vcpus by the virtual processor (VP) index which is
> > > > normally queried by the guest via HV_X64_MSR_VP_INDEX msr.
> > > >
> > > > It has to be owned by QEMU in order to preserve it across migration.
> > > >
> > > > However, the current implementation in KVM doesn't allow to set this
> > > > msr, and KVM uses its own notion of VP index. Fortunately, the way
> > > > vcpus are created in QEMU/KVM basically guarantees that the KVM value is
> > > > equal to QEMU cpu_index.
> > >
> > > This might not be true in the future. cpu_index is not a good
> > > identifier for CPUs, and we would like to remove it in the
> > > future.
> > >
> > > But it looks like we have no choice, see extra comments below:
> >
> > > > +static void hyperv_set_vp_index(CPUState *cs)
> > > > +{
> > > > + struct {
> > > > + struct kvm_msrs info;
> > > > + struct kvm_msr_entry entries[1];
> > > > + } msr_data;
> > > > + int ret;
> > > > +
> > > > + msr_data.info.nmsrs = 1;
> > > > + msr_data.entries[0].index = HV_X64_MSR_VP_INDEX;
> > > > +
> > > > + /*
> > > > + * Some kernels don't support setting HV_X64_MSR_VP_INDEX. However,
> > > > + * they use VP_INDEX equal to cpu_index due to the way vcpus are created.
> > > > + * So ignore the errors from SET_MSRS, but verify that GET_MSRS returns the
> > > > + * expected value.
> > > > + */
> > >
> > > Oh. This sounds like a problem. As reference for others, this
> > > is the KVM code in kvm_hv_get_msr():
> > >
> > > case HV_X64_MSR_VP_INDEX: {
> > > int r;
> > > struct kvm_vcpu *v;
> > >
> > > kvm_for_each_vcpu(r, v, vcpu->kvm) {
> > > if (v == vcpu) {
> > > data = r;
> > > break;
> > > }
> > > }
> > > break;
> > > }
> > >
> > > The problem with that is that it will break as soon as we create
> > > VCPUs in a different order. Unsolvable on hosts that don't allow
> > > HV_X64_MSR_VP_INDEX to be set, however.
> >
> > Right, thanks for putting together a detailed explanation.
> >
> > This was a thinko back then, not to have HV_X64_MSR_VP_INDEX maintained
> > by QEMU. I'm going to post a patch to KVM fixing that.
> >
> > Meanwhile QEMU needs a way to maintain its notion of vp_index that is
> > 1) in sync with kernel's notion
> > 2) also with kernels that don't support setting the msr
> > 3) persistent across migrations
> >
> > cpu_index looked like a perfect candidate.
>
> I would like to be able to stop using cpu_index as a persistent
> VCPU identifier in the future. I would also like to be able to
> create VCPUs in any order without breaking guest ABI. But it
> seems to be impossible to do that without breaking vp_index on
> older kernels.
>
> So cpu_index looks like the only candidate we have.
>
> >
> > > > + msr_data.entries[0].data = cs->cpu_index;
> > > > + kvm_vcpu_ioctl(cs, KVM_SET_MSRS, &msr_data);
> > > > + ret = kvm_vcpu_ioctl(cs, KVM_GET_MSRS, &msr_data);
> > > > + assert(ret == 1);
> > > > + assert(msr_data.entries[0].data == cs->cpu_index);
> > >
> > > If KVM already initializes the MSR to cpu_index and we will abort
> > > if it was not set to anything except cpu_index here, why exactly
> > > do we need the KVM_SET_MSRS call?
> >
> > The kernel made no obligations to keep initializing its vp_index
> > identically to QEMU's cpu_index. So we'd better check and abort if that
> > got out of sync. Once KVM gains the ability to learn vp_index from QEMU
> > we'll no longer depend on that as we'll do explicit synchronization.
>
> I think the kernel now has an obligation to keep initializing
> HV_X64_MSR_VP_INDEX in a compatible way, or it will break older
> QEMU versions that don't set the MSR.
>
> But if you don't think we can rely on that, then the KVM_SET_MSRS
> call won't hurt.
if we can tell apart an old kernel with index-based vp_index from a new one that
supports setting the MSR (add a cap check, perhaps) then we should be able to:
- leave the old vp_index broken as it is now (for old kernels / old QEMU / new QEMU with old machine types),
i.e. do not set the vp_index MSR in QEMU
- in the case of new QEMU + new machine type + new kernel, use the APIC ID, which would work
with out-of-order CPU creation
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
2017-06-14 13:24 ` Igor Mammedov
@ 2017-06-14 13:35 ` Eduardo Habkost
2017-06-14 15:31 ` Igor Mammedov
0 siblings, 1 reply; 76+ messages in thread
From: Eduardo Habkost @ 2017-06-14 13:35 UTC (permalink / raw)
To: Igor Mammedov
Cc: Roman Kagan, qemu-devel, Paolo Bonzini, Evgeny Yakovlev,
Denis V . Lunev
On Wed, Jun 14, 2017 at 03:24:50PM +0200, Igor Mammedov wrote:
> On Wed, 14 Jun 2017 10:00:15 -0300
> Eduardo Habkost <ehabkost@redhat.com> wrote:
>
> > On Wed, Jun 14, 2017 at 02:25:07PM +0300, Roman Kagan wrote:
> > > On Tue, Jun 13, 2017 at 03:57:52PM -0300, Eduardo Habkost wrote:
> > > > On Tue, Jun 06, 2017 at 09:19:30PM +0300, Roman Kagan wrote:
> > > > > Hyper-V identifies vcpus by the virtual processor (VP) index which is
> > > > > normally queried by the guest via HV_X64_MSR_VP_INDEX msr.
> > > > >
> > > > > It has to be owned by QEMU in order to preserve it across migration.
> > > > >
> > > > > However, the current implementation in KVM doesn't allow to set this
> > > > > msr, and KVM uses its own notion of VP index. Fortunately, the way
> > > > > vcpus are created in QEMU/KVM basically guarantees that the KVM value is
> > > > > equal to QEMU cpu_index.
> > > >
> > > > This might not be true in the future. cpu_index is not a good
> > > > identifier for CPUs, and we would like to remove it in the
> > > > future.
> > > >
> > > > But it looks like we have no choice, see extra comments below:
> > >
> > > > > +static void hyperv_set_vp_index(CPUState *cs)
> > > > > +{
> > > > > + struct {
> > > > > + struct kvm_msrs info;
> > > > > + struct kvm_msr_entry entries[1];
> > > > > + } msr_data;
> > > > > + int ret;
> > > > > +
> > > > > + msr_data.info.nmsrs = 1;
> > > > > + msr_data.entries[0].index = HV_X64_MSR_VP_INDEX;
> > > > > +
> > > > > + /*
> > > > > + * Some kernels don't support setting HV_X64_MSR_VP_INDEX. However,
> > > > > + * they use VP_INDEX equal to cpu_index due to the way vcpus are created.
> > > > > + * So ignore the errors from SET_MSRS, but verify that GET_MSRS returns the
> > > > > + * expected value.
> > > > > + */
> > > >
> > > > Oh. This sounds like a problem. As reference for others, this
> > > > is the KVM code in kvm_hv_get_msr():
> > > >
> > > > case HV_X64_MSR_VP_INDEX: {
> > > > int r;
> > > > struct kvm_vcpu *v;
> > > >
> > > > kvm_for_each_vcpu(r, v, vcpu->kvm) {
> > > > if (v == vcpu) {
> > > > data = r;
> > > > break;
> > > > }
> > > > }
> > > > break;
> > > > }
> > > >
> > > > The problem with that is that it will break as soon as we create
> > > > VCPUs in a different order. Unsolvable on hosts that don't allow
> > > > HV_X64_MSR_VP_INDEX to be set, however.
> > >
> > > Right, thanks for putting together a detailed explanation.
> > >
> > > This was a thinko back then, not to have HV_X64_MSR_VP_INDEX maintained
> > > by QEMU. I'm going to post a patch to KVM fixing that.
> > >
> > > Meanwhile QEMU needs a way to maintain its notion of vp_index that is
> > > 1) in sync with kernel's notion
> > > 2) also with kernels that don't support setting the msr
> > > 3) persistent across migrations
> > >
> > > cpu_index looked like a perfect candidate.
> >
> > I would like to be able to stop using cpu_index as a persistent
> > VCPU identifier in the future. I would also like to be able to
> > create VCPUs in any order without breaking guest ABI. But it
> > seems to be impossible to do that without breaking vp_index on
> > older kernels.
> >
> > So cpu_index looks like the only candidate we have.
> >
> > >
> > > > > + msr_data.entries[0].data = cs->cpu_index;
> > > > > + kvm_vcpu_ioctl(cs, KVM_SET_MSRS, &msr_data);
> > > > > + ret = kvm_vcpu_ioctl(cs, KVM_GET_MSRS, &msr_data);
> > > > > + assert(ret == 1);
> > > > > + assert(msr_data.entries[0].data == cs->cpu_index);
> > > >
> > > > If KVM already initializes the MSR to cpu_index and we will abort
> > > > if it was not set to anything except cpu_index here, why exactly
> > > > do we need the KVM_SET_MSRS call?
> > >
> > > The kernel made no obligations to keep initializing its vp_index
> > > identically to QEMU's cpu_index. So we'd better check and abort if that
> > > got out of sync. Once KVM gains the ability to learn vp_index from QEMU
> > > we'll no longer depend on that as we'll do explicit synchronization.
> >
> > I think the kernel now has an obligation to keep initializing
> > HV_X64_MSR_VP_INDEX in a compatible way, or it will break older
> > QEMU versions that don't set the MSR.
> >
> > But if you don't think we can rely on that, then the KVM_SET_MSRS
> > call won't hurt.
>
> if we can tell apart old index based vp_index kernel and new that supports
> MSR setting (add cap check perhaps) then we should be able to
> - leave out old vp_index broken as it is now (for old kernels/old QEMU and/ new QEMU old machine types)
> i.e. do not set vp_index MSR in QEMU
> - in case of (new QEMU new machine type/new kernel) use APIC ID which would work
> with out of order CPU creation
I agree with the proposal, but I don't see the need for a
capability check:
Old machine-types can simply not call KVM_SET_MSRS, as you
suggested.
New machine-types can just call KVM_SET_MSRS unconditionally, and
warn the user in case it fails and the APIC ID really doesn't
match the index in KVM (which in practice will happen only if the
user is also configuring a CPU topology explicitly).
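Roughly, reusing the names from hyperv_set_vp_index in the patch under
review (sketch only, untested):

    X86CPU *cpu = X86_CPU(cs);

    msr_data.entries[0].index = HV_X64_MSR_VP_INDEX;
    msr_data.entries[0].data = cpu->apic_id;     /* == arch_id */
    kvm_vcpu_ioctl(cs, KVM_SET_MSRS, &msr_data); /* old kernels ignore this */
    kvm_vcpu_ioctl(cs, KVM_GET_MSRS, &msr_data);
    if (msr_data.entries[0].data != cpu->apic_id) {
        error_report("warning: VP index %u differs from APIC ID %u",
                     (uint32_t)msr_data.entries[0].data, cpu->apic_id);
    }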
--
Eduardo
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index
2017-06-14 13:35 ` Eduardo Habkost
@ 2017-06-14 15:31 ` Igor Mammedov
0 siblings, 0 replies; 76+ messages in thread
From: Igor Mammedov @ 2017-06-14 15:31 UTC (permalink / raw)
To: Eduardo Habkost
Cc: Roman Kagan, qemu-devel, Paolo Bonzini, Evgeny Yakovlev,
Denis V . Lunev
On Wed, 14 Jun 2017 10:35:36 -0300
Eduardo Habkost <ehabkost@redhat.com> wrote:
> On Wed, Jun 14, 2017 at 03:24:50PM +0200, Igor Mammedov wrote:
> > On Wed, 14 Jun 2017 10:00:15 -0300
> > Eduardo Habkost <ehabkost@redhat.com> wrote:
> >
> > > On Wed, Jun 14, 2017 at 02:25:07PM +0300, Roman Kagan wrote:
> > > > On Tue, Jun 13, 2017 at 03:57:52PM -0300, Eduardo Habkost wrote:
> > > > > On Tue, Jun 06, 2017 at 09:19:30PM +0300, Roman Kagan wrote:
> > > > > > Hyper-V identifies vcpus by the virtual processor (VP) index which is
> > > > > > normally queried by the guest via HV_X64_MSR_VP_INDEX msr.
> > > > > >
> > > > > > It has to be owned by QEMU in order to preserve it across migration.
> > > > > >
> > > > > > However, the current implementation in KVM doesn't allow to set this
> > > > > > msr, and KVM uses its own notion of VP index. Fortunately, the way
> > > > > > vcpus are created in QEMU/KVM basically guarantees that the KVM value is
> > > > > > equal to QEMU cpu_index.
> > > > >
> > > > > This might not be true in the future. cpu_index is not a good
> > > > > identifier for CPUs, and we would like to remove it in the
> > > > > future.
> > > > >
> > > > > But it looks like we have no choice, see extra comments below:
> > > >
> > > > > > +static void hyperv_set_vp_index(CPUState *cs)
> > > > > > +{
> > > > > > + struct {
> > > > > > + struct kvm_msrs info;
> > > > > > + struct kvm_msr_entry entries[1];
> > > > > > + } msr_data;
> > > > > > + int ret;
> > > > > > +
> > > > > > + msr_data.info.nmsrs = 1;
> > > > > > + msr_data.entries[0].index = HV_X64_MSR_VP_INDEX;
> > > > > > +
> > > > > > + /*
> > > > > > + * Some kernels don't support setting HV_X64_MSR_VP_INDEX. However,
> > > > > > + * they use VP_INDEX equal to cpu_index due to the way vcpus are created.
> > > > > > + * So ignore the errors from SET_MSRS, but verify that GET_MSRS returns the
> > > > > > + * expected value.
> > > > > > + */
> > > > >
> > > > > Oh. This sounds like a problem. As reference for others, this
> > > > > is the KVM code in kvm_hv_get_msr():
> > > > >
> > > > > case HV_X64_MSR_VP_INDEX: {
> > > > > int r;
> > > > > struct kvm_vcpu *v;
> > > > >
> > > > > kvm_for_each_vcpu(r, v, vcpu->kvm) {
> > > > > if (v == vcpu) {
> > > > > data = r;
> > > > > break;
> > > > > }
> > > > > }
> > > > > break;
> > > > > }
> > > > >
> > > > > The problem with that is that it will break as soon as we create
> > > > > VCPUs in a different order. Unsolvable on hosts that don't allow
> > > > > HV_X64_MSR_VP_INDEX to be set, however.
> > > >
> > > > Right, thanks for putting together a detailed explanation.
> > > >
> > > > This was a thinko back then, not to have HV_X64_MSR_VP_INDEX maintained
> > > > by QEMU. I'm going to post a patch to KVM fixing that.
> > > >
> > > > Meanwhile QEMU needs a way to maintain its notion of vp_index that is
> > > > 1) in sync with kernel's notion
> > > > 2) also with kernels that don't support setting the msr
> > > > 3) persistent across migrations
> > > >
> > > > cpu_index looked like a perfect candidate.
> > >
> > > I would like to be able to stop using cpu_index as a persistent
> > > VCPU identifier in the future. I would also like to be able to
> > > create VCPUs in any order without breaking guest ABI. But it
> > > seems to be impossible to do that without breaking vp_index on
> > > older kernels.
> > >
> > > So cpu_index looks like the only candidate we have.
> > >
> > > >
> > > > > > + msr_data.entries[0].data = cs->cpu_index;
> > > > > > + kvm_vcpu_ioctl(cs, KVM_SET_MSRS, &msr_data);
> > > > > > + ret = kvm_vcpu_ioctl(cs, KVM_GET_MSRS, &msr_data);
> > > > > > + assert(ret == 1);
> > > > > > + assert(msr_data.entries[0].data == cs->cpu_index);
> > > > >
> > > > > If KVM already initializes the MSR to cpu_index and we will abort
> > > > > if it was not set to anything except cpu_index here, why exactly
> > > > > do we need the KVM_SET_MSRS call?
> > > >
> > > > The kernel made no obligations to keep initializing its vp_index
> > > > identically to QEMU's cpu_index. So we'd better check and abort if that
> > > > got out of sync. Once KVM gains the ability to learn vp_index from QEMU
> > > > we'll no longer depend on that as we'll do explicit synchronization.
> > >
> > > I think the kernel now has an obligation to keep initializing
> > > HV_X64_MSR_VP_INDEX in a compatible way, or it will break older
> > > QEMU versions that don't set the MSR.
> > >
> > > But if you don't think we can rely on that, then the KVM_SET_MSRS
> > > call won't hurt.
> >
> > if we can tell apart old index based vp_index kernel and new that supports
> > MSR setting (add cap check perhaps) then we should be able to
> > - leave out old vp_index broken as it is now (for old kernels/old QEMU and/ new QEMU old machine types)
> > i.e. do not set vp_index MSR in QEMU
> > - in case of (new QEMU new machine type/new kernel) use APIC ID which would work
> > with out of order CPU creation
>
> I agree with the proposal, but I don't see the need for a
> capability check:
>
> Old machine-types can simply not call KVM_SET_MSRS, as you
> suggested.
>
> New machine-types can just call KVM_SET_MSRS unconditionally, and
> warn the user in case the it fails and the APIC ID really doesn't
> match the index in KVM (which in practice will happen only if the
> user is also configuring a CPU topology explicitly).
I don't know anything about hyperv, so here goes a question:
a new machine booted on a new kernel (using the APIC ID) and
then migrated to a new qemu on an old kernel.
Will it work or break? If it breaks we need the cap check, and if not
we could probably do without it.
^ permalink raw reply [flat|nested] 76+ messages in thread
* [Qemu-devel] [PATCH 06/23] hyperv: helper to find vcpu by VP index
2017-06-06 18:19 [Qemu-devel] [PATCH 00/23] hyperv fixes and enhancements Roman Kagan
` (4 preceding siblings ...)
2017-06-06 18:19 ` [Qemu-devel] [PATCH 05/23] hyperv: ensure VP index equal to QEMU cpu_index Roman Kagan
@ 2017-06-06 18:19 ` Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 07/23] hyperv_testdev: refactor for readability Roman Kagan
` (16 subsequent siblings)
22 siblings, 0 replies; 76+ messages in thread
From: Roman Kagan @ 2017-06-06 18:19 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
Locate the vcpu by its VP index (equal to QEMU cpu_index).
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
target/i386/hyperv.h | 1 +
target/i386/hyperv.c | 5 +++++
2 files changed, 6 insertions(+)
diff --git a/target/i386/hyperv.h b/target/i386/hyperv.h
index 35da0b1..c5843c9 100644
--- a/target/i386/hyperv.h
+++ b/target/i386/hyperv.h
@@ -40,5 +40,6 @@ void kvm_hv_sint_route_destroy(HvSintRoute *sint_route);
int kvm_hv_sint_route_set_sint(HvSintRoute *sint_route);
uint32_t hyperv_vp_index(X86CPU *cpu);
+X86CPU *hyperv_find_vcpu(uint32_t vcpu_id);
#endif
diff --git a/target/i386/hyperv.c b/target/i386/hyperv.c
index 27de5bc..480bdfe 100644
--- a/target/i386/hyperv.c
+++ b/target/i386/hyperv.c
@@ -21,6 +21,11 @@ uint32_t hyperv_vp_index(X86CPU *cpu)
return CPU(cpu)->cpu_index;
}
+X86CPU *hyperv_find_vcpu(uint32_t vp_index)
+{
+ return X86_CPU(qemu_get_cpu(vp_index));
+}
+
int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit)
{
CPUX86State *env = &cpu->env;
--
2.9.4
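One caveat worth spelling out about this helper (illustrative fragment,
not part of the patch):

    /* qemu_get_cpu() looks CPUs up by cpu_index, so this is only
     * correct while the vp_index == cpu_index invariant from patch
     * 05 holds; it returns NULL for a vp_index we never assigned. */
    X86CPU *cpu = hyperv_find_vcpu(vp_index);
    if (!cpu) {
        /* reject the request, e.g. return an error to the caller */
    }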
^ permalink raw reply related [flat|nested] 76+ messages in thread
* [Qemu-devel] [PATCH 07/23] hyperv_testdev: refactor for readability
2017-06-06 18:19 [Qemu-devel] [PATCH 00/23] hyperv fixes and enhancements Roman Kagan
` (5 preceding siblings ...)
2017-06-06 18:19 ` [Qemu-devel] [PATCH 06/23] hyperv: helper to find vcpu by VP index Roman Kagan
@ 2017-06-06 18:19 ` Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 08/23] hyperv: cosmetic: g_malloc -> g_new Roman Kagan
` (15 subsequent siblings)
22 siblings, 0 replies; 76+ messages in thread
From: Roman Kagan @ 2017-06-06 18:19 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
Make hyperv_testdev slightly easier to follow and enhance in the future.
For that, put the hyperv sint routes (wrapped in a helper structure) on
a linked list rather than a fixed-size array.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
hw/misc/hyperv_testdev.c | 113 ++++++++++++++++++++++-------------------------
1 file changed, 52 insertions(+), 61 deletions(-)
diff --git a/hw/misc/hyperv_testdev.c b/hw/misc/hyperv_testdev.c
index dbd7cdd..da87630 100644
--- a/hw/misc/hyperv_testdev.c
+++ b/hw/misc/hyperv_testdev.c
@@ -12,6 +12,7 @@
*/
#include "qemu/osdep.h"
+#include "qemu/queue.h"
#include <linux/kvm.h>
#include "hw/hw.h"
#include "hw/qdev.h"
@@ -20,12 +21,17 @@
#include "target/i386/hyperv.h"
#include "kvm_i386.h"
-#define HV_TEST_DEV_MAX_SINT_ROUTES 64
+typedef struct TestSintRoute {
+ QLIST_ENTRY(TestSintRoute) le;
+ uint8_t cpu;
+ uint8_t sint;
+ HvSintRoute *sint_route;
+} TestSintRoute;
struct HypervTestDev {
ISADevice parent_obj;
MemoryRegion sint_control;
- HvSintRoute *sint_route[HV_TEST_DEV_MAX_SINT_ROUTES];
+ QLIST_HEAD(, TestSintRoute) sint_routes;
};
typedef struct HypervTestDev HypervTestDev;
@@ -39,88 +45,73 @@ enum {
HV_TEST_DEV_SINT_ROUTE_SET_SINT
};
-static int alloc_sint_route_index(HypervTestDev *dev)
+static void sint_route_create(HypervTestDev *dev, uint8_t cpu, uint8_t sint)
{
- int i;
+ TestSintRoute *sint_route;
- for (i = 0; i < ARRAY_SIZE(dev->sint_route); i++) {
- if (dev->sint_route[i] == NULL) {
- return i;
- }
- }
- return -1;
-}
+ sint_route = g_new0(TestSintRoute, 1);
+ assert(sint_route);
-static void free_sint_route_index(HypervTestDev *dev, int i)
-{
- assert(i >= 0 && i < ARRAY_SIZE(dev->sint_route));
- dev->sint_route[i] = NULL;
+ sint_route->cpu = cpu;
+ sint_route->sint = sint;
+
+ sint_route->sint_route = kvm_hv_sint_route_create(cpu, sint, NULL);
+ assert(sint_route->sint_route);
+
+ QLIST_INSERT_HEAD(&dev->sint_routes, sint_route, le);
}
-static int find_sint_route_index(HypervTestDev *dev, uint32_t vcpu_id,
- uint32_t sint)
+static TestSintRoute *sint_route_find(HypervTestDev *dev,
+ uint8_t cpu, uint8_t sint)
{
- HvSintRoute *sint_route;
- int i;
+ TestSintRoute *sint_route;
- for (i = 0; i < ARRAY_SIZE(dev->sint_route); i++) {
- sint_route = dev->sint_route[i];
- if (sint_route && sint_route->vcpu_id == vcpu_id &&
- sint_route->sint == sint) {
- return i;
+ QLIST_FOREACH(sint_route, &dev->sint_routes, le) {
+ if (sint_route->cpu == cpu && sint_route->sint == sint) {
+ return sint_route;
}
}
- return -1;
+ assert(false);
+ return NULL;
}
-static void hv_synic_test_dev_control(HypervTestDev *dev, uint32_t ctl,
- uint32_t vcpu_id, uint32_t sint)
+static void sint_route_destroy(HypervTestDev *dev, uint8_t cpu, uint8_t sint)
{
- int i;
- HvSintRoute *sint_route;
+ TestSintRoute *sint_route;
- switch (ctl) {
- case HV_TEST_DEV_SINT_ROUTE_CREATE:
- i = alloc_sint_route_index(dev);
- assert(i >= 0);
- sint_route = kvm_hv_sint_route_create(vcpu_id, sint, NULL);
- assert(sint_route);
- dev->sint_route[i] = sint_route;
- break;
- case HV_TEST_DEV_SINT_ROUTE_DESTROY:
- i = find_sint_route_index(dev, vcpu_id, sint);
- assert(i >= 0);
- sint_route = dev->sint_route[i];
- kvm_hv_sint_route_destroy(sint_route);
- free_sint_route_index(dev, i);
- break;
- case HV_TEST_DEV_SINT_ROUTE_SET_SINT:
- i = find_sint_route_index(dev, vcpu_id, sint);
- assert(i >= 0);
- sint_route = dev->sint_route[i];
- kvm_hv_sint_route_set_sint(sint_route);
- break;
- default:
- break;
- }
+ sint_route = sint_route_find(dev, cpu, sint);
+ QLIST_REMOVE(sint_route, le);
+ kvm_hv_sint_route_destroy(sint_route->sint_route);
+ g_free(sint_route);
+}
+
+static void sint_route_set_sint(HypervTestDev *dev, uint8_t cpu, uint8_t sint)
+{
+ TestSintRoute *sint_route;
+
+ sint_route = sint_route_find(dev, cpu, sint);
+
+ kvm_hv_sint_route_set_sint(sint_route->sint_route);
}
static void hv_test_dev_control(void *opaque, hwaddr addr, uint64_t data,
uint32_t len)
{
HypervTestDev *dev = HYPERV_TEST_DEV(opaque);
- uint8_t ctl;
+ uint8_t sint = data & 0xFF;
+ uint8_t vcpu_id = (data >> 8ULL) & 0xFF;
+ uint8_t ctl = (data >> 16ULL) & 0xFF;
- ctl = (data >> 16ULL) & 0xFF;
switch (ctl) {
case HV_TEST_DEV_SINT_ROUTE_CREATE:
+ sint_route_create(dev, vcpu_id, sint);
+ break;
case HV_TEST_DEV_SINT_ROUTE_DESTROY:
- case HV_TEST_DEV_SINT_ROUTE_SET_SINT: {
- uint8_t sint = data & 0xFF;
- uint8_t vcpu_id = (data >> 8ULL) & 0xFF;
- hv_synic_test_dev_control(dev, ctl, vcpu_id, sint);
+ sint_route_destroy(dev, vcpu_id, sint);
+ break;
+ case HV_TEST_DEV_SINT_ROUTE_SET_SINT:
+ sint_route_set_sint(dev, vcpu_id, sint);
break;
- }
default:
break;
}
@@ -139,7 +130,7 @@ static void hv_test_dev_realizefn(DeviceState *d, Error **errp)
HypervTestDev *dev = HYPERV_TEST_DEV(d);
MemoryRegion *io = isa_address_space_io(isa);
- memset(dev->sint_route, 0, sizeof(dev->sint_route));
+ QLIST_INIT(&dev->sint_routes);
memory_region_init_io(&dev->sint_control, OBJECT(dev),
&synic_test_sint_ops, dev,
"hyperv-testdev-ctl", 4);
--
2.9.4
^ permalink raw reply related [flat|nested] 76+ messages in thread
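[Editorial note: the refactoring above replaces a fixed-size array with the intrusive lists from "qemu/queue.h". For reference, here is a minimal self-contained sketch of the QLIST API the patch uses; the element type and field names are illustrative, not from the patch.]

#include "qemu/osdep.h"
#include "qemu/queue.h"

typedef struct Node {
    int value;
    QLIST_ENTRY(Node) le;               /* linkage embedded in the element */
} Node;

static QLIST_HEAD(, Node) nodes = QLIST_HEAD_INITIALIZER(nodes);

static void qlist_demo(void)
{
    Node *n = g_new0(Node, 1);

    n->value = 42;
    QLIST_INSERT_HEAD(&nodes, n, le);   /* O(1) insertion, no extra allocation */

    QLIST_FOREACH(n, &nodes, le) {
        /* read-only traversal; use QLIST_FOREACH_SAFE when removing */
    }

    n = QLIST_FIRST(&nodes);
    QLIST_REMOVE(n, le);                /* unlink first, then free the element */
    g_free(n);
}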
* [Qemu-devel] [PATCH 08/23] hyperv: cosmetic: g_malloc -> g_new
2017-06-06 18:19 [Qemu-devel] [PATCH 00/23] hyperv fixes and enhancements Roman Kagan
` (6 preceding siblings ...)
2017-06-06 18:19 ` [Qemu-devel] [PATCH 07/23] hyperv_testdev: refactor for readability Roman Kagan
@ 2017-06-06 18:19 ` Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 09/23] hyperv: synic: only setup ack notifier if there's a callback Roman Kagan
` (14 subsequent siblings)
22 siblings, 0 replies; 76+ messages in thread
From: Roman Kagan @ 2017-06-06 18:19 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
target/i386/hyperv.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/target/i386/hyperv.c b/target/i386/hyperv.c
index 480bdfe..9aa5ec5 100644
--- a/target/i386/hyperv.c
+++ b/target/i386/hyperv.c
@@ -88,7 +88,7 @@ HvSintRoute *kvm_hv_sint_route_create(uint32_t vcpu_id, uint32_t sint,
HvSintRoute *sint_route;
int r, gsi;
- sint_route = g_malloc0(sizeof(*sint_route));
+ sint_route = g_new0(HvSintRoute, 1);
r = event_notifier_init(&sint_route->sint_set_notifier, false);
if (r) {
goto err;
--
2.9.4
^ permalink raw reply related [flat|nested] 76+ messages in thread
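[Editorial note: a word on why this cosmetic change is worthwhile: g_new0(Type, n) is type-checked and overflow-checked, while g_malloc0(sizeof(...)) is neither. A minimal illustration; the structs are hypothetical.]

typedef struct Foo { int x; } Foo;
typedef struct Bar { long y; } Bar;

Foo *a = g_malloc0(sizeof(Bar));   /* wrong size, compiles silently */
Foo *b = g_new0(Foo, 1);           /* g_new0(Bar, 1) here would draw an
                                      incompatible-pointer warning */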
* [Qemu-devel] [PATCH 09/23] hyperv: synic: only setup ack notifier if there's a callback
2017-06-06 18:19 [Qemu-devel] [PATCH 00/23] hyperv fixes and enhancements Roman Kagan
` (7 preceding siblings ...)
2017-06-06 18:19 ` [Qemu-devel] [PATCH 08/23] hyperv: cosmetic: g_malloc -> g_new Roman Kagan
@ 2017-06-06 18:19 ` Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 10/23] hyperv: allow passing arbitrary data to sint ack callback Roman Kagan
` (13 subsequent siblings)
22 siblings, 0 replies; 76+ messages in thread
From: Roman Kagan @ 2017-06-06 18:19 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
There's no point in setting up a sint ack notifier if no callback is
specified.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
target/i386/hyperv.c | 33 +++++++++++++++++++--------------
1 file changed, 19 insertions(+), 14 deletions(-)
diff --git a/target/i386/hyperv.c b/target/i386/hyperv.c
index 9aa5ec5..3ed5a48 100644
--- a/target/i386/hyperv.c
+++ b/target/i386/hyperv.c
@@ -77,15 +77,14 @@ static void kvm_hv_sint_ack_handler(EventNotifier *notifier)
HvSintRoute *sint_route = container_of(notifier, HvSintRoute,
sint_ack_notifier);
event_notifier_test_and_clear(notifier);
- if (sint_route->sint_ack_clb) {
- sint_route->sint_ack_clb(sint_route);
- }
+ sint_route->sint_ack_clb(sint_route);
}
HvSintRoute *kvm_hv_sint_route_create(uint32_t vcpu_id, uint32_t sint,
HvSintAckClb sint_ack_clb)
{
HvSintRoute *sint_route;
+ EventNotifier *ack_notifier;
int r, gsi;
sint_route = g_new0(HvSintRoute, 1);
@@ -94,13 +93,15 @@ HvSintRoute *kvm_hv_sint_route_create(uint32_t vcpu_id, uint32_t sint,
goto err;
}
- r = event_notifier_init(&sint_route->sint_ack_notifier, false);
- if (r) {
- goto err_sint_set_notifier;
- }
+ ack_notifier = sint_ack_clb ? &sint_route->sint_ack_notifier : NULL;
+ if (ack_notifier) {
+ r = event_notifier_init(ack_notifier, false);
+ if (r) {
+ goto err_sint_set_notifier;
+ }
- event_notifier_set_handler(&sint_route->sint_ack_notifier,
- kvm_hv_sint_ack_handler);
+ event_notifier_set_handler(ack_notifier, kvm_hv_sint_ack_handler);
+ }
gsi = kvm_irqchip_add_hv_sint_route(kvm_state, vcpu_id, sint);
if (gsi < 0) {
@@ -109,7 +110,7 @@ HvSintRoute *kvm_hv_sint_route_create(uint32_t vcpu_id, uint32_t sint,
r = kvm_irqchip_add_irqfd_notifier_gsi(kvm_state,
&sint_route->sint_set_notifier,
- &sint_route->sint_ack_notifier, gsi);
+ ack_notifier, gsi);
if (r) {
goto err_irqfd;
}
@@ -123,8 +124,10 @@ HvSintRoute *kvm_hv_sint_route_create(uint32_t vcpu_id, uint32_t sint,
err_irqfd:
kvm_irqchip_release_virq(kvm_state, gsi);
err_gsi:
- event_notifier_set_handler(&sint_route->sint_ack_notifier, NULL);
- event_notifier_cleanup(&sint_route->sint_ack_notifier);
+ if (ack_notifier) {
+ event_notifier_set_handler(ack_notifier, NULL);
+ event_notifier_cleanup(ack_notifier);
+ }
err_sint_set_notifier:
event_notifier_cleanup(&sint_route->sint_set_notifier);
err:
@@ -139,8 +142,10 @@ void kvm_hv_sint_route_destroy(HvSintRoute *sint_route)
&sint_route->sint_set_notifier,
sint_route->gsi);
kvm_irqchip_release_virq(kvm_state, sint_route->gsi);
- event_notifier_set_handler(&sint_route->sint_ack_notifier, NULL);
- event_notifier_cleanup(&sint_route->sint_ack_notifier);
+ if (sint_route->sint_ack_clb) {
+ event_notifier_set_handler(&sint_route->sint_ack_notifier, NULL);
+ event_notifier_cleanup(&sint_route->sint_ack_notifier);
+ }
event_notifier_cleanup(&sint_route->sint_set_notifier);
g_free(sint_route);
}
--
2.9.4
^ permalink raw reply related [flat|nested] 76+ messages in thread
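[Editorial note: the enabling detail here is that kvm_irqchip_add_irqfd_notifier_gsi() accepts a NULL resample notifier, so the ack side can be skipped entirely. Condensed to its core, the pattern the patch implements is as follows; error handling is elided, sketch only.]

EventNotifier *ack = sint_ack_clb ? &sint_route->sint_ack_notifier : NULL;

if (ack) {
    event_notifier_init(ack, false);
    event_notifier_set_handler(ack, kvm_hv_sint_ack_handler);
}
/* a NULL ack notifier simply means "no resample eventfd" for the irqfd */
kvm_irqchip_add_irqfd_notifier_gsi(kvm_state, &sint_route->sint_set_notifier,
                                   ack, gsi);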
* [Qemu-devel] [PATCH 10/23] hyperv: allow passing arbitrary data to sint ack callback
2017-06-06 18:19 [Qemu-devel] [PATCH 00/23] hyperv fixes and enhancements Roman Kagan
` (8 preceding siblings ...)
2017-06-06 18:19 ` [Qemu-devel] [PATCH 09/23] hyperv: synic: only setup ack notifier if there's a callback Roman Kagan
@ 2017-06-06 18:19 ` Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 12/23] hyperv: make HvSintRoute reference-counted Roman Kagan
` (12 subsequent siblings)
22 siblings, 0 replies; 76+ messages in thread
From: Roman Kagan @ 2017-06-06 18:19 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
Make the sint ack callback accept an opaque pointer that is stored in the
sint_route at creation time.
This allows for more convenient interaction with the callback.
Besides, nothing outside hyperv.c should need to know the layout of
HvSintRoute fields any more, so its declaration can be removed from the
header.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
target/i386/hyperv.h | 14 +++-----------
hw/misc/hyperv_testdev.c | 2 +-
target/i386/hyperv.c | 16 ++++++++++++++--
3 files changed, 18 insertions(+), 14 deletions(-)
diff --git a/target/i386/hyperv.h b/target/i386/hyperv.h
index c5843c9..bf7f47b 100644
--- a/target/i386/hyperv.h
+++ b/target/i386/hyperv.h
@@ -19,21 +19,13 @@
#include "qemu/event_notifier.h"
typedef struct HvSintRoute HvSintRoute;
-typedef void (*HvSintAckClb)(HvSintRoute *sint_route);
-
-struct HvSintRoute {
- uint32_t sint;
- uint32_t vcpu_id;
- int gsi;
- EventNotifier sint_set_notifier;
- EventNotifier sint_ack_notifier;
- HvSintAckClb sint_ack_clb;
-};
+typedef void (*HvSintAckClb)(void *data);
int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit);
HvSintRoute *kvm_hv_sint_route_create(uint32_t vcpu_id, uint32_t sint,
- HvSintAckClb sint_ack_clb);
+ HvSintAckClb sint_ack_clb,
+ void *sint_ack_clb_data);
void kvm_hv_sint_route_destroy(HvSintRoute *sint_route);
diff --git a/hw/misc/hyperv_testdev.c b/hw/misc/hyperv_testdev.c
index da87630..97ea959 100644
--- a/hw/misc/hyperv_testdev.c
+++ b/hw/misc/hyperv_testdev.c
@@ -55,7 +55,7 @@ static void sint_route_create(HypervTestDev *dev, uint8_t cpu, uint8_t sint)
sint_route->cpu = cpu;
sint_route->sint = sint;
- sint_route->sint_route = kvm_hv_sint_route_create(cpu, sint, NULL);
+ sint_route->sint_route = kvm_hv_sint_route_create(cpu, sint, NULL, NULL);
assert(sint_route->sint_route);
QLIST_INSERT_HEAD(&dev->sint_routes, sint_route, le);
diff --git a/target/i386/hyperv.c b/target/i386/hyperv.c
index 3ed5a48..ac7988f 100644
--- a/target/i386/hyperv.c
+++ b/target/i386/hyperv.c
@@ -16,6 +16,16 @@
#include "hyperv.h"
#include "hyperv_proto.h"
+struct HvSintRoute {
+ uint32_t sint;
+ uint32_t vcpu_id;
+ int gsi;
+ EventNotifier sint_set_notifier;
+ EventNotifier sint_ack_notifier;
+ HvSintAckClb sint_ack_clb;
+ void *sint_ack_clb_data;
+};
+
uint32_t hyperv_vp_index(X86CPU *cpu)
{
return CPU(cpu)->cpu_index;
@@ -77,11 +87,12 @@ static void kvm_hv_sint_ack_handler(EventNotifier *notifier)
HvSintRoute *sint_route = container_of(notifier, HvSintRoute,
sint_ack_notifier);
event_notifier_test_and_clear(notifier);
- sint_route->sint_ack_clb(sint_route);
+ sint_route->sint_ack_clb(sint_route->sint_ack_clb_data);
}
HvSintRoute *kvm_hv_sint_route_create(uint32_t vcpu_id, uint32_t sint,
- HvSintAckClb sint_ack_clb)
+ HvSintAckClb sint_ack_clb,
+ void *sint_ack_clb_data)
{
HvSintRoute *sint_route;
EventNotifier *ack_notifier;
@@ -116,6 +127,7 @@ HvSintRoute *kvm_hv_sint_route_create(uint32_t vcpu_id, uint32_t sint,
}
sint_route->gsi = gsi;
sint_route->sint_ack_clb = sint_ack_clb;
+ sint_route->sint_ack_clb_data = sint_ack_clb_data;
sint_route->vcpu_id = vcpu_id;
sint_route->sint = sint;
--
2.9.4
^ permalink raw reply related [flat|nested] 76+ messages in thread
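[Editorial note: the opaque-pointer convention introduced here is the standard C callback idiom: the registrar stores caller-supplied context and hands it back verbatim on every invocation. A self-contained sketch; all names are hypothetical.]

#include <stdio.h>

typedef void (*AckClb)(void *data);

typedef struct Route {
    AckClb clb;
    void *clb_data;                 /* stored at creation time */
} Route;

static void on_ack(void *data)
{
    printf("acked: %s\n", (const char *)data);
}

int main(void)
{
    Route r = { .clb = on_ack, .clb_data = (void *)"sint 2" };

    r.clb(r.clb_data);              /* callee recovers its context verbatim */
    return 0;
}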
* [Qemu-devel] [PATCH 12/23] hyperv: make HvSintRoute reference-counted
2017-06-06 18:19 [Qemu-devel] [PATCH 00/23] hyperv fixes and enhancements Roman Kagan
` (9 preceding siblings ...)
2017-06-06 18:19 ` [Qemu-devel] [PATCH 10/23] hyperv: allow passing arbitrary data to sint ack callback Roman Kagan
@ 2017-06-06 18:19 ` Roman Kagan
2017-06-14 13:53 ` Eduardo Habkost
2017-06-06 18:19 ` [Qemu-devel] [PATCH 13/23] hyperv: qdev-ify SynIC Roman Kagan
` (11 subsequent siblings)
22 siblings, 1 reply; 76+ messages in thread
From: Roman Kagan @ 2017-06-06 18:19 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
Multiple entities (e.g. VMBus devices) can use the same SINT route. To
make their lives easier in maintaining SINT route ownership, make it
reference-counted. Adjust the respective API names accordingly.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
target/i386/hyperv.h | 10 +++++-----
hw/misc/hyperv_testdev.c | 4 ++--
target/i386/hyperv.c | 25 +++++++++++++++++++++----
3 files changed, 28 insertions(+), 11 deletions(-)
diff --git a/target/i386/hyperv.h b/target/i386/hyperv.h
index bc071da..ca5a32d 100644
--- a/target/i386/hyperv.h
+++ b/target/i386/hyperv.h
@@ -23,11 +23,11 @@ typedef void (*HvSintAckClb)(void *data);
int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit);
-HvSintRoute *kvm_hv_sint_route_create(X86CPU *cpu, uint32_t sint,
- HvSintAckClb sint_ack_clb,
- void *sint_ack_clb_data);
-
-void kvm_hv_sint_route_destroy(HvSintRoute *sint_route);
+HvSintRoute *hyperv_sint_route_new(X86CPU *cpu, uint32_t sint,
+ HvSintAckClb sint_ack_clb,
+ void *sint_ack_clb_data);
+void hyperv_sint_route_ref(HvSintRoute *sint_route);
+void hyperv_sint_route_unref(HvSintRoute *sint_route);
int kvm_hv_sint_route_set_sint(HvSintRoute *sint_route);
diff --git a/hw/misc/hyperv_testdev.c b/hw/misc/hyperv_testdev.c
index 1b12295..929bf4f 100644
--- a/hw/misc/hyperv_testdev.c
+++ b/hw/misc/hyperv_testdev.c
@@ -55,7 +55,7 @@ static void sint_route_create(HypervTestDev *dev, X86CPU *cpu, uint8_t sint)
sint_route->cpu = cpu;
sint_route->sint = sint;
- sint_route->sint_route = kvm_hv_sint_route_create(cpu, sint, NULL, NULL);
+ sint_route->sint_route = hyperv_sint_route_new(cpu, sint, NULL, NULL);
assert(sint_route->sint_route);
QLIST_INSERT_HEAD(&dev->sint_routes, sint_route, le);
@@ -81,7 +81,7 @@ static void sint_route_destroy(HypervTestDev *dev, X86CPU *cpu, uint8_t sint)
sint_route = sint_route_find(dev, cpu, sint);
QLIST_REMOVE(sint_route, le);
- kvm_hv_sint_route_destroy(sint_route->sint_route);
+ hyperv_sint_route_unref(sint_route->sint_route);
g_free(sint_route);
}
diff --git a/target/i386/hyperv.c b/target/i386/hyperv.c
index a150401..ae67f82 100644
--- a/target/i386/hyperv.c
+++ b/target/i386/hyperv.c
@@ -24,6 +24,7 @@ struct HvSintRoute {
EventNotifier sint_ack_notifier;
HvSintAckClb sint_ack_clb;
void *sint_ack_clb_data;
+ unsigned refcount;
};
uint32_t hyperv_vp_index(X86CPU *cpu)
@@ -90,9 +91,9 @@ static void kvm_hv_sint_ack_handler(EventNotifier *notifier)
sint_route->sint_ack_clb(sint_route->sint_ack_clb_data);
}
-HvSintRoute *kvm_hv_sint_route_create(X86CPU *cpu, uint32_t sint,
- HvSintAckClb sint_ack_clb,
- void *sint_ack_clb_data)
+HvSintRoute *hyperv_sint_route_new(X86CPU *cpu, uint32_t sint,
+ HvSintAckClb sint_ack_clb,
+ void *sint_ack_clb_data)
{
HvSintRoute *sint_route;
EventNotifier *ack_notifier;
@@ -130,6 +131,7 @@ HvSintRoute *kvm_hv_sint_route_create(X86CPU *cpu, uint32_t sint,
sint_route->sint_ack_clb_data = sint_ack_clb_data;
sint_route->cpu = cpu;
sint_route->sint = sint;
+ sint_route->refcount = 1;
return sint_route;
@@ -148,8 +150,23 @@ err:
return NULL;
}
-void kvm_hv_sint_route_destroy(HvSintRoute *sint_route)
+void hyperv_sint_route_ref(HvSintRoute *sint_route)
{
+ sint_route->refcount++;
+}
+
+void hyperv_sint_route_unref(HvSintRoute *sint_route)
+{
+ if (!sint_route) {
+ return;
+ }
+
+ assert(sint_route->refcount > 0);
+
+ if (--sint_route->refcount) {
+ return;
+ }
+
kvm_irqchip_remove_irqfd_notifier_gsi(kvm_state,
&sint_route->sint_set_notifier,
sint_route->gsi);
--
2.9.4
^ permalink raw reply related [flat|nested] 76+ messages in thread
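[Editorial note: the resulting ownership contract is the plain manual-refcount one: hyperv_sint_route_new() returns with a count of one, every additional holder takes a reference, and the last unref tears the route down. A usage sketch against the API above, with two owners sharing one route.]

HvSintRoute *r = hyperv_sint_route_new(cpu, sint, NULL, NULL); /* refcount 1 */

hyperv_sint_route_ref(r);       /* a second owner appears: refcount 2 */

hyperv_sint_route_unref(r);     /* first owner done: refcount 1 */
hyperv_sint_route_unref(r);     /* last owner: irqfd and gsi released, freed */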
* Re: [Qemu-devel] [PATCH 12/23] hyperv: make HvSintRoute reference-counted
2017-06-06 18:19 ` [Qemu-devel] [PATCH 12/23] hyperv: make HvSintRoute reference-counted Roman Kagan
@ 2017-06-14 13:53 ` Eduardo Habkost
2017-06-14 16:23 ` Roman Kagan
0 siblings, 1 reply; 76+ messages in thread
From: Eduardo Habkost @ 2017-06-14 13:53 UTC (permalink / raw)
To: Roman Kagan; +Cc: qemu-devel, Paolo Bonzini, Evgeny Yakovlev, Denis V . Lunev
On Tue, Jun 06, 2017 at 09:19:37PM +0300, Roman Kagan wrote:
> Multiple entities (e.g. VMBus devices) can use the same SINT route. To
> make their lives easier in maintaining SINT route ownership, make it
> reference-counted. Adjust the respective API names accordingly.
>
> Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Isn't it easier to reuse existing refcounting infrastructure
here? Is it overkill to make it a QOM object? (struct Object has
40 bytes)
--
Eduardo
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 12/23] hyperv: make HvSintRoute reference-counted
2017-06-14 13:53 ` Eduardo Habkost
@ 2017-06-14 16:23 ` Roman Kagan
2017-06-23 12:44 ` Eduardo Habkost
0 siblings, 1 reply; 76+ messages in thread
From: Roman Kagan @ 2017-06-14 16:23 UTC (permalink / raw)
To: Eduardo Habkost
Cc: qemu-devel, Paolo Bonzini, Evgeny Yakovlev, Denis V . Lunev
On Wed, Jun 14, 2017 at 10:53:25AM -0300, Eduardo Habkost wrote:
> On Tue, Jun 06, 2017 at 09:19:37PM +0300, Roman Kagan wrote:
> > Multiple entities (e.g. VMBus devices) can use the same SINT route. To
> > make their lives easier in maintaining SINT route ownership, make it
> > reference-counted. Adjust the respective API names accordingly.
> >
> > Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
>
> Isn't it easier to reuse existing refcounting infrastructure
> here? Is it overkill to make it a QOM object? (struct Object has
> 40 bytes)
Normally the guests use a sint route per cpu or less. So no, the space
overhead is not an issue.
I also wanted to reuse regular QOM refcounting so I QOM-ified it at
first. However, while hammering out the design, I found no appropriate
place in the QOM hierarchy to put these objects, so we dropped the
idea.
If I get your proposal right, you suggest leaving it unattached instead.
That should probably work; however, looking at all the boilerplate code
this would entail, including OBJECT casts, I'm not sure it would save
anything. Do you think it's worth reworking into QOM?
Thanks,
Roman.
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 12/23] hyperv: make HvSintRoute reference-counted
2017-06-14 16:23 ` Roman Kagan
@ 2017-06-23 12:44 ` Eduardo Habkost
0 siblings, 0 replies; 76+ messages in thread
From: Eduardo Habkost @ 2017-06-23 12:44 UTC (permalink / raw)
To: Roman Kagan, qemu-devel, Paolo Bonzini, Evgeny Yakovlev,
Denis V . Lunev
On Wed, Jun 14, 2017 at 07:23:56PM +0300, Roman Kagan wrote:
> On Wed, Jun 14, 2017 at 10:53:25AM -0300, Eduardo Habkost wrote:
> > On Tue, Jun 06, 2017 at 09:19:37PM +0300, Roman Kagan wrote:
> > > Multiple entities (e.g. VMBus devices) can use the same SINT route. To
> > > make their lives easier in maintaining SINT route ownership, make it
> > > reference-counted. Adjust the respective API names accordingly.
> > >
> > > Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
> >
> > Isn't it easier to reuse existing refcounting infrastructure
> > here? Is it overkill to make it a QOM object? (struct Object has
> > 40 bytes)
>
> Normally the guests use a sint route per cpu or less. So no, the space
> overhead is not an issue.
>
> I also wanted to reuse regular QOM refcounting so I QOM-ified it at
> first. However, while hammering out the design, I found no appropriate
> place in the QOM hierarchy to put these objects, so we dropped the
> idea.
>
> If I get your proposal right, you suggest leaving it unattached instead.
> That should probably work; however, looking at all the boilerplate code
> this would entail, including OBJECT casts, I'm not sure it would save
> anything. Do you think it's worth reworking into QOM?
I just noticed I didn't reply to this, sorry. I'm also not sure it is
worth reworking into QOM, so I guess it's up to you.
About placing the object in the QOM hierarchy, I don't think that would
be a problem. Objects without a parent are safer, as long as reference
counting is properly tracked.
About the OBJECT casts, I guess that's a common issue. I'm considering
proposing making object_ref()/object_unref() macros that use OBJECT()
automatically.
--
Eduardo
^ permalink raw reply [flat|nested] 76+ messages in thread
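[Editorial note: the convenience macros Eduardo is considering do not exist in QEMU; a hedged sketch of what such wrappers might look like.]

/* hypothetical wrappers -- QEMU only provides object_ref()/object_unref() */
#define OBJ_REF(obj)   object_ref(OBJECT(obj))
#define OBJ_UNREF(obj) object_unref(OBJECT(obj))

/* usage: OBJ_REF(sint_route); ... OBJ_UNREF(sint_route); -- no explicit casts */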
* [Qemu-devel] [PATCH 13/23] hyperv: qdev-ify SynIC
2017-06-06 18:19 [Qemu-devel] [PATCH 00/23] hyperv fixes and enhancements Roman Kagan
` (10 preceding siblings ...)
2017-06-06 18:19 ` [Qemu-devel] [PATCH 12/23] hyperv: make HvSintRoute reference-counted Roman Kagan
@ 2017-06-06 18:19 ` Roman Kagan
2017-06-13 18:34 ` Eduardo Habkost
2017-06-06 18:19 ` [Qemu-devel] [PATCH 14/23] kvm-all: make async_safe_run_on_cpu safe on kvm too Roman Kagan
` (10 subsequent siblings)
22 siblings, 1 reply; 76+ messages in thread
From: Roman Kagan @ 2017-06-06 18:19 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
Make Hyper-V SynIC a device which is attached as a child to X86CPU. For
now it only makes SynIC visible in the qom hierarchy and exposes a few
properties which are maintained in sync with the respective msrs of the
parent cpu.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
target/i386/hyperv.h | 4 ++
target/i386/hyperv.c | 131 +++++++++++++++++++++++++++++++++++++++++++++++++-
target/i386/kvm.c | 5 ++
target/i386/machine.c | 9 ++++
4 files changed, 147 insertions(+), 2 deletions(-)
diff --git a/target/i386/hyperv.h b/target/i386/hyperv.h
index ca5a32d..9dd5ca0 100644
--- a/target/i386/hyperv.h
+++ b/target/i386/hyperv.h
@@ -34,4 +34,8 @@ int kvm_hv_sint_route_set_sint(HvSintRoute *sint_route);
uint32_t hyperv_vp_index(X86CPU *cpu);
X86CPU *hyperv_find_vcpu(uint32_t vcpu_id);
+void hyperv_synic_add(X86CPU *cpu);
+void hyperv_synic_reset(X86CPU *cpu);
+void hyperv_synic_update(X86CPU *cpu);
+
#endif
diff --git a/target/i386/hyperv.c b/target/i386/hyperv.c
index ae67f82..2d9e9fe 100644
--- a/target/i386/hyperv.c
+++ b/target/i386/hyperv.c
@@ -13,12 +13,27 @@
#include "qemu/osdep.h"
#include "qemu/main-loop.h"
+#include "qapi/error.h"
+#include "hw/qdev-properties.h"
#include "hyperv.h"
#include "hyperv_proto.h"
+typedef struct SynICState {
+ DeviceState parent_obj;
+
+ X86CPU *cpu;
+
+ bool enabled;
+ hwaddr msg_page_addr;
+ hwaddr evt_page_addr;
+} SynICState;
+
+#define TYPE_SYNIC "hyperv-synic"
+#define SYNIC(obj) OBJECT_CHECK(SynICState, (obj), TYPE_SYNIC)
+
struct HvSintRoute {
uint32_t sint;
- X86CPU *cpu;
+ SynICState *synic;
int gsi;
EventNotifier sint_set_notifier;
EventNotifier sint_ack_notifier;
@@ -37,6 +52,37 @@ X86CPU *hyperv_find_vcpu(uint32_t vp_index)
return X86_CPU(qemu_get_cpu(vp_index));
}
+static SynICState *get_synic(X86CPU *cpu)
+{
+ SynICState *synic =
+ SYNIC(object_resolve_path_component(OBJECT(cpu), "synic"));
+ assert(synic);
+ return synic;
+}
+
+static void synic_update_msg_page_addr(SynICState *synic)
+{
+ uint64_t msr = synic->cpu->env.msr_hv_synic_msg_page;
+ hwaddr new_addr = (msr & HV_SIMP_ENABLE) ? (msr & TARGET_PAGE_MASK) : 0;
+
+ synic->msg_page_addr = new_addr;
+}
+
+static void synic_update_evt_page_addr(SynICState *synic)
+{
+ uint64_t msr = synic->cpu->env.msr_hv_synic_evt_page;
+ hwaddr new_addr = (msr & HV_SIEFP_ENABLE) ? (msr & TARGET_PAGE_MASK) : 0;
+
+ synic->evt_page_addr = new_addr;
+}
+
+static void synic_update(SynICState *synic)
+{
+ synic->enabled = synic->cpu->env.msr_hv_synic_control & HV_SYNIC_ENABLE;
+ synic_update_msg_page_addr(synic);
+ synic_update_evt_page_addr(synic);
+}
+
int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit)
{
CPUX86State *env = &cpu->env;
@@ -65,6 +111,7 @@ int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit)
default:
return -1;
}
+ synic_update(get_synic(cpu));
return 0;
case KVM_EXIT_HYPERV_HCALL: {
uint16_t code;
@@ -95,10 +142,13 @@ HvSintRoute *hyperv_sint_route_new(X86CPU *cpu, uint32_t sint,
HvSintAckClb sint_ack_clb,
void *sint_ack_clb_data)
{
+ SynICState *synic;
HvSintRoute *sint_route;
EventNotifier *ack_notifier;
int r, gsi;
+ synic = get_synic(cpu);
+
sint_route = g_new0(HvSintRoute, 1);
r = event_notifier_init(&sint_route->sint_set_notifier, false);
if (r) {
@@ -129,7 +179,7 @@ HvSintRoute *hyperv_sint_route_new(X86CPU *cpu, uint32_t sint,
sint_route->gsi = gsi;
sint_route->sint_ack_clb = sint_ack_clb;
sint_route->sint_ack_clb_data = sint_ack_clb_data;
- sint_route->cpu = cpu;
+ sint_route->synic = synic;
sint_route->sint = sint;
sint_route->refcount = 1;
@@ -183,3 +233,80 @@ int kvm_hv_sint_route_set_sint(HvSintRoute *sint_route)
{
return event_notifier_set(&sint_route->sint_set_notifier);
}
+
+static Property synic_props[] = {
+ DEFINE_PROP_BOOL("enabled", SynICState, enabled, false),
+ DEFINE_PROP_UINT64("msg-page-addr", SynICState, msg_page_addr, 0),
+ DEFINE_PROP_UINT64("evt-page-addr", SynICState, evt_page_addr, 0),
+ DEFINE_PROP_END_OF_LIST(),
+};
+
+static void synic_realize(DeviceState *dev, Error **errp)
+{
+ Object *obj = OBJECT(dev);
+ SynICState *synic = SYNIC(dev);
+
+ synic->cpu = X86_CPU(obj->parent);
+}
+
+static void synic_reset(DeviceState *dev)
+{
+ SynICState *synic = SYNIC(dev);
+ synic_update(synic);
+}
+
+static void synic_class_init(ObjectClass *klass, void *data)
+{
+ DeviceClass *dc = DEVICE_CLASS(klass);
+
+ dc->props = synic_props;
+ dc->realize = synic_realize;
+ dc->reset = synic_reset;
+ dc->user_creatable = false;
+}
+
+void hyperv_synic_add(X86CPU *cpu)
+{
+ Object *obj;
+
+ if (!cpu->hyperv_synic) {
+ return;
+ }
+
+ obj = object_new(TYPE_SYNIC);
+ object_property_add_child(OBJECT(cpu), "synic", obj, &error_abort);
+ object_unref(obj);
+ object_property_set_bool(obj, true, "realized", &error_abort);
+}
+
+void hyperv_synic_reset(X86CPU *cpu)
+{
+ if (!cpu->hyperv_synic) {
+ return;
+ }
+
+ device_reset(DEVICE(get_synic(cpu)));
+}
+
+void hyperv_synic_update(X86CPU *cpu)
+{
+ if (!cpu->hyperv_synic) {
+ return;
+ }
+
+ synic_update(get_synic(cpu));
+}
+
+static const TypeInfo synic_type_info = {
+ .name = TYPE_SYNIC,
+ .parent = TYPE_DEVICE,
+ .instance_size = sizeof(SynICState),
+ .class_init = synic_class_init,
+};
+
+static void synic_register_types(void)
+{
+ type_register_static(&synic_type_info);
+}
+
+type_init(synic_register_types)
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index eb9cde4..433c912 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -688,6 +688,9 @@ static int hyperv_handle_properties(CPUState *cs)
}
env->features[FEAT_HYPERV_EAX] |= HV_SYNTIMERS_AVAILABLE;
}
+
+ hyperv_synic_add(cpu);
+
return 0;
}
@@ -1065,6 +1068,8 @@ void kvm_arch_reset_vcpu(X86CPU *cpu)
for (i = 0; i < ARRAY_SIZE(env->msr_hv_synic_sint); i++) {
env->msr_hv_synic_sint[i] = HV_SINT_MASKED;
}
+
+ hyperv_synic_reset(cpu);
}
}
diff --git a/target/i386/machine.c b/target/i386/machine.c
index eb00b19..8022c24 100644
--- a/target/i386/machine.c
+++ b/target/i386/machine.c
@@ -7,6 +7,7 @@
#include "hw/i386/pc.h"
#include "hw/isa/isa.h"
#include "migration/cpu.h"
+#include "hyperv.h"
#include "sysemu/kvm.h"
@@ -633,11 +634,19 @@ static bool hyperv_synic_enable_needed(void *opaque)
return false;
}
+static int hyperv_synic_post_load(void *opaque, int version_id)
+{
+ X86CPU *cpu = opaque;
+ hyperv_synic_update(cpu);
+ return 0;
+}
+
static const VMStateDescription vmstate_msr_hyperv_synic = {
.name = "cpu/msr_hyperv_synic",
.version_id = 1,
.minimum_version_id = 1,
.needed = hyperv_synic_enable_needed,
+ .post_load = hyperv_synic_post_load,
.fields = (VMStateField[]) {
VMSTATE_UINT64(env.msr_hv_synic_control, X86CPU),
VMSTATE_UINT64(env.msr_hv_synic_evt_page, X86CPU),
--
2.9.4
^ permalink raw reply related [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 13/23] hyperv: qdev-ify SynIC
2017-06-06 18:19 ` [Qemu-devel] [PATCH 13/23] hyperv: qdev-ify SynIC Roman Kagan
@ 2017-06-13 18:34 ` Eduardo Habkost
2017-06-14 9:58 ` Roman Kagan
0 siblings, 1 reply; 76+ messages in thread
From: Eduardo Habkost @ 2017-06-13 18:34 UTC (permalink / raw)
To: Roman Kagan; +Cc: qemu-devel, Paolo Bonzini, Evgeny Yakovlev, Denis V . Lunev
On Tue, Jun 06, 2017 at 09:19:38PM +0300, Roman Kagan wrote:
> Make Hyper-V SynIC a device which is attached as a child to X86CPU. For
> now it only makes SynIC visible in the qom hierarchy and exposes a few
> properties which are maintained in sync with the respective msrs of the
> parent cpu.
>
> Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
> ---
> target/i386/hyperv.h | 4 ++
> target/i386/hyperv.c | 131 +++++++++++++++++++++++++++++++++++++++++++++++++-
> target/i386/kvm.c | 5 ++
> target/i386/machine.c | 9 ++++
> 4 files changed, 147 insertions(+), 2 deletions(-)
>
> diff --git a/target/i386/hyperv.h b/target/i386/hyperv.h
> index ca5a32d..9dd5ca0 100644
> --- a/target/i386/hyperv.h
> +++ b/target/i386/hyperv.h
> @@ -34,4 +34,8 @@ int kvm_hv_sint_route_set_sint(HvSintRoute *sint_route);
> uint32_t hyperv_vp_index(X86CPU *cpu);
> X86CPU *hyperv_find_vcpu(uint32_t vcpu_id);
>
> +void hyperv_synic_add(X86CPU *cpu);
> +void hyperv_synic_reset(X86CPU *cpu);
> +void hyperv_synic_update(X86CPU *cpu);
> +
> #endif
> diff --git a/target/i386/hyperv.c b/target/i386/hyperv.c
> index ae67f82..2d9e9fe 100644
> --- a/target/i386/hyperv.c
> +++ b/target/i386/hyperv.c
> @@ -13,12 +13,27 @@
>
> #include "qemu/osdep.h"
> #include "qemu/main-loop.h"
> +#include "qapi/error.h"
> +#include "hw/qdev-properties.h"
> #include "hyperv.h"
> #include "hyperv_proto.h"
>
> +typedef struct SynICState {
> + DeviceState parent_obj;
> +
> + X86CPU *cpu;
> +
> + bool enabled;
> + hwaddr msg_page_addr;
> + hwaddr evt_page_addr;
> +} SynICState;
> +
> +#define TYPE_SYNIC "hyperv-synic"
> +#define SYNIC(obj) OBJECT_CHECK(SynICState, (obj), TYPE_SYNIC)
> +
> struct HvSintRoute {
> uint32_t sint;
> - X86CPU *cpu;
> + SynICState *synic;
> int gsi;
> EventNotifier sint_set_notifier;
> EventNotifier sint_ack_notifier;
> @@ -37,6 +52,37 @@ X86CPU *hyperv_find_vcpu(uint32_t vp_index)
> return X86_CPU(qemu_get_cpu(vp_index));
> }
>
> +static SynICState *get_synic(X86CPU *cpu)
> +{
> + SynICState *synic =
> + SYNIC(object_resolve_path_component(OBJECT(cpu), "synic"));
> + assert(synic);
> + return synic;
How often this will run? I think a new X86CPU struct field would
be acceptable to avoid resolving a QOM path on every hyperv exit.
Do you even need QOM for this?
> +}
> +
> +static void synic_update_msg_page_addr(SynICState *synic)
> +{
> + uint64_t msr = synic->cpu->env.msr_hv_synic_msg_page;
> + hwaddr new_addr = (msr & HV_SIMP_ENABLE) ? (msr & TARGET_PAGE_MASK) : 0;
> +
> + synic->msg_page_addr = new_addr;
> +}
> +
> +static void synic_update_evt_page_addr(SynICState *synic)
> +{
> + uint64_t msr = synic->cpu->env.msr_hv_synic_evt_page;
> + hwaddr new_addr = (msr & HV_SIEFP_ENABLE) ? (msr & TARGET_PAGE_MASK) : 0;
> +
> + synic->evt_page_addr = new_addr;
> +}
> +
> +static void synic_update(SynICState *synic)
> +{
> + synic->enabled = synic->cpu->env.msr_hv_synic_control & HV_SYNIC_ENABLE;
> + synic_update_msg_page_addr(synic);
> + synic_update_evt_page_addr(synic);
> +}
> +
> int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit)
> {
> CPUX86State *env = &cpu->env;
> @@ -65,6 +111,7 @@ int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit)
> default:
> return -1;
> }
> + synic_update(get_synic(cpu));
> return 0;
> case KVM_EXIT_HYPERV_HCALL: {
> uint16_t code;
> @@ -95,10 +142,13 @@ HvSintRoute *hyperv_sint_route_new(X86CPU *cpu, uint32_t sint,
> HvSintAckClb sint_ack_clb,
> void *sint_ack_clb_data)
> {
> + SynICState *synic;
> HvSintRoute *sint_route;
> EventNotifier *ack_notifier;
> int r, gsi;
>
> + synic = get_synic(cpu);
> +
> sint_route = g_new0(HvSintRoute, 1);
> r = event_notifier_init(&sint_route->sint_set_notifier, false);
> if (r) {
> @@ -129,7 +179,7 @@ HvSintRoute *hyperv_sint_route_new(X86CPU *cpu, uint32_t sint,
> sint_route->gsi = gsi;
> sint_route->sint_ack_clb = sint_ack_clb;
> sint_route->sint_ack_clb_data = sint_ack_clb_data;
> - sint_route->cpu = cpu;
> + sint_route->synic = synic;
> sint_route->sint = sint;
> sint_route->refcount = 1;
>
> @@ -183,3 +233,80 @@ int kvm_hv_sint_route_set_sint(HvSintRoute *sint_route)
> {
> return event_notifier_set(&sint_route->sint_set_notifier);
> }
> +
> +static Property synic_props[] = {
> + DEFINE_PROP_BOOL("enabled", SynICState, enabled, false),
> + DEFINE_PROP_UINT64("msg-page-addr", SynICState, msg_page_addr, 0),
> + DEFINE_PROP_UINT64("evt-page-addr", SynICState, evt_page_addr, 0),
What do you need the QOM properties for? Are they supposed to be
configurable by the user? Are they supposed to be queried from
outside QEMU? On which cases?
> + DEFINE_PROP_END_OF_LIST(),
> +};
> +
> +static void synic_realize(DeviceState *dev, Error **errp)
> +{
> + Object *obj = OBJECT(dev);
> + SynICState *synic = SYNIC(dev);
> +
> + synic->cpu = X86_CPU(obj->parent);
> +}
> +
> +static void synic_reset(DeviceState *dev)
> +{
> + SynICState *synic = SYNIC(dev);
> + synic_update(synic);
> +}
> +
> +static void synic_class_init(ObjectClass *klass, void *data)
> +{
> + DeviceClass *dc = DEVICE_CLASS(klass);
> +
> + dc->props = synic_props;
> + dc->realize = synic_realize;
> + dc->reset = synic_reset;
> + dc->user_creatable = false;
> +}
> +
> +void hyperv_synic_add(X86CPU *cpu)
> +{
> + Object *obj;
> +
> + if (!cpu->hyperv_synic) {
> + return;
> + }
> +
> + obj = object_new(TYPE_SYNIC);
> + object_property_add_child(OBJECT(cpu), "synic", obj, &error_abort);
> + object_unref(obj);
> + object_property_set_bool(obj, true, "realized", &error_abort);
> +}
> +
> +void hyperv_synic_reset(X86CPU *cpu)
> +{
> + if (!cpu->hyperv_synic) {
> + return;
> + }
> +
> + device_reset(DEVICE(get_synic(cpu)));
> +}
> +
> +void hyperv_synic_update(X86CPU *cpu)
> +{
> + if (!cpu->hyperv_synic) {
> + return;
> + }
> +
> + synic_update(get_synic(cpu));
> +}
> +
> +static const TypeInfo synic_type_info = {
> + .name = TYPE_SYNIC,
> + .parent = TYPE_DEVICE,
> + .instance_size = sizeof(SynICState),
> + .class_init = synic_class_init,
> +};
> +
> +static void synic_register_types(void)
> +{
> + type_register_static(&synic_type_info);
> +}
> +
> +type_init(synic_register_types)
> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> index eb9cde4..433c912 100644
> --- a/target/i386/kvm.c
> +++ b/target/i386/kvm.c
> @@ -688,6 +688,9 @@ static int hyperv_handle_properties(CPUState *cs)
> }
> env->features[FEAT_HYPERV_EAX] |= HV_SYNTIMERS_AVAILABLE;
> }
> +
> + hyperv_synic_add(cpu);
> +
The purpose of hyperv_handle_properties() is to just initialize
env->features based on hyperv flags, and it might even go away
some day. I suggest moving this to x86_cpu_realizefn().
> return 0;
> }
>
> @@ -1065,6 +1068,8 @@ void kvm_arch_reset_vcpu(X86CPU *cpu)
> for (i = 0; i < ARRAY_SIZE(env->msr_hv_synic_sint); i++) {
> env->msr_hv_synic_sint[i] = HV_SINT_MASKED;
> }
> +
> + hyperv_synic_reset(cpu);
> }
> }
>
> diff --git a/target/i386/machine.c b/target/i386/machine.c
> index eb00b19..8022c24 100644
> --- a/target/i386/machine.c
> +++ b/target/i386/machine.c
> @@ -7,6 +7,7 @@
> #include "hw/i386/pc.h"
> #include "hw/isa/isa.h"
> #include "migration/cpu.h"
> +#include "hyperv.h"
>
> #include "sysemu/kvm.h"
>
> @@ -633,11 +634,19 @@ static bool hyperv_synic_enable_needed(void *opaque)
> return false;
> }
>
> +static int hyperv_synic_post_load(void *opaque, int version_id)
> +{
> + X86CPU *cpu = opaque;
> + hyperv_synic_update(cpu);
> + return 0;
> +}
> +
> static const VMStateDescription vmstate_msr_hyperv_synic = {
> .name = "cpu/msr_hyperv_synic",
> .version_id = 1,
> .minimum_version_id = 1,
> .needed = hyperv_synic_enable_needed,
> + .post_load = hyperv_synic_post_load,
> .fields = (VMStateField[]) {
> VMSTATE_UINT64(env.msr_hv_synic_control, X86CPU),
> VMSTATE_UINT64(env.msr_hv_synic_evt_page, X86CPU),
> --
> 2.9.4
>
--
Eduardo
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 13/23] hyperv: qdev-ify SynIC
2017-06-13 18:34 ` Eduardo Habkost
@ 2017-06-14 9:58 ` Roman Kagan
2017-06-14 12:46 ` Eduardo Habkost
0 siblings, 1 reply; 76+ messages in thread
From: Roman Kagan @ 2017-06-14 9:58 UTC (permalink / raw)
To: Eduardo Habkost
Cc: qemu-devel, Paolo Bonzini, Evgeny Yakovlev, Denis V . Lunev
On Tue, Jun 13, 2017 at 03:34:34PM -0300, Eduardo Habkost wrote:
> On Tue, Jun 06, 2017 at 09:19:38PM +0300, Roman Kagan wrote:
> > Make Hyper-V SynIC a device which is attached as a child to X86CPU. For
> > now it only makes SynIC visible in the qom hierarchy and exposes a few
> > properties which are maintained in sync with the respective msrs of the
> > parent cpu.
[...]
> > +static SynICState *get_synic(X86CPU *cpu)
> > +{
> > + SynICState *synic =
> > + SYNIC(object_resolve_path_component(OBJECT(cpu), "synic"));
> > + assert(synic);
> > + return synic;
>
> How often this will run? I think a new X86CPU struct field would
> be acceptable to avoid resolving a QOM path on every hyperv exit.
It's only used at the points where synic state is updated: at
realize/save/load/unrealize and when the guest configures synics via
msrs (i.e. a few times per cpu at guest startup). None of those is
performance-critical.
> Do you even need QOM for this?
I need to reach the synic from the cpu on slow paths, and IMHO the
pointer on X86CPU is not justified here. Besides, there are two memory
regions associated with every synic (in a followup patch) and I thought
it made sense to have a dedicated parent QOM node for them.
> > +static Property synic_props[] = {
> > + DEFINE_PROP_BOOL("enabled", SynICState, enabled, false),
> > + DEFINE_PROP_UINT64("msg-page-addr", SynICState, msg_page_addr, 0),
> > + DEFINE_PROP_UINT64("evt-page-addr", SynICState, evt_page_addr, 0),
>
> What do you need the QOM properties for? Are they supposed to be
> configurable by the user?
No. Actually I wanted to define them as read-only, but I failed to find
a way to do so.
> Are they supposed to be queried from outside QEMU? On which cases?
ATM I only see them as informational fields that may prove useful during
debugging.
Roman.
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 13/23] hyperv: qdev-ify SynIC
2017-06-14 9:58 ` Roman Kagan
@ 2017-06-14 12:46 ` Eduardo Habkost
2017-06-14 15:11 ` Roman Kagan
0 siblings, 1 reply; 76+ messages in thread
From: Eduardo Habkost @ 2017-06-14 12:46 UTC (permalink / raw)
To: Roman Kagan, qemu-devel, Paolo Bonzini, Evgeny Yakovlev,
Denis V . Lunev
On Wed, Jun 14, 2017 at 12:58:04PM +0300, Roman Kagan wrote:
> On Tue, Jun 13, 2017 at 03:34:34PM -0300, Eduardo Habkost wrote:
> > On Tue, Jun 06, 2017 at 09:19:38PM +0300, Roman Kagan wrote:
> > > Make Hyper-V SynIC a device which is attached as a child to X86CPU. For
> > > now it only makes SynIC visible in the qom hierarchy and exposes a few
> > > properties which are maintained in sync with the respective msrs of the
> > > parent cpu.
> [...]
> > > +static SynICState *get_synic(X86CPU *cpu)
> > > +{
> > > + SynICState *synic =
> > > + SYNIC(object_resolve_path_component(OBJECT(cpu), "synic"));
> > > + assert(synic);
> > > + return synic;
> >
> > How often this will run? I think a new X86CPU struct field would
> > be acceptable to avoid resolving a QOM path on every hyperv exit.
>
> It's only used at the points where synic state is updated: at
> realize/save/load/unrealize and when the guest configures synics via
> msrs (i.e. a few times per cpu at guest startup). None of those is
> performance-critical.
>
> > Do you even need QOM for this?
>
> I need to reach the synic from the cpu on slow paths, and IMHO the
> pointer on X86CPU is not justified here. Besides there are two memory
> regions associated with every synic (in a followup patch) and I thought
> it made sense to have a dedicated parent QOM node for them.
Makes sense.
>
> > > +static Property synic_props[] = {
> > > + DEFINE_PROP_BOOL("enabled", SynICState, enabled, false),
> > > + DEFINE_PROP_UINT64("msg-page-addr", SynICState, msg_page_addr, 0),
> > > + DEFINE_PROP_UINT64("evt-page-addr", SynICState, evt_page_addr, 0),
> >
> > What do you need the QOM properties for? Are they supposed to be
> > configurable by the user?
>
> No. Actually I wanted to define them as read-only, but I failed to find
> a way to do so.
>
> > Are they supposed to be queried from outside QEMU? On which cases?
>
> ATM I only see them as informational fields that may prove useful during
> debugging.
>
You can use object_property_add_uint64_ptr() on instance_init to
add read-only properties.
If the enabled/msg_page_addr/evt_page_addr struct fields exist
only because of the properties, you can also use
object_property_add() and write your own getter that will query
the MSRs only when reading the property. I normally don't like
custom getters/setters, but it sounds acceptable for a read-only
property that's only for debugging.
Either way you choose to implement it, I would add a comment
noting that the properties are there just for debugging.
BTW, would you still want to add this amount of boilerplate code
just for debugging if we had a generic msr_get HMP command?
There's a series on the list (submitted in April) adding msr_get
and msr_set HMP commands, but it was never merged.
--
Eduardo
^ permalink raw reply [flat|nested] 76+ messages in thread
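[Editorial note: for reference, a read-only property with a custom getter along the lines suggested above might look roughly like this. This is a sketch against the 2017-era object_property_add() signature; the field and property names are taken from the patch.]

static void synic_get_msg_page_addr(Object *obj, Visitor *v, const char *name,
                                    void *opaque, Error **errp)
{
    SynICState *synic = SYNIC(obj);
    uint64_t value = synic->msg_page_addr;

    visit_type_uint64(v, name, &value, errp);
}

static void synic_instance_init(Object *obj)
{
    /* no setter makes the property read-only; it exists only for debugging */
    object_property_add(obj, "msg-page-addr", "uint64",
                        synic_get_msg_page_addr, NULL, NULL, NULL, NULL);
}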
* Re: [Qemu-devel] [PATCH 13/23] hyperv: qdev-ify SynIC
2017-06-14 12:46 ` Eduardo Habkost
@ 2017-06-14 15:11 ` Roman Kagan
2017-06-14 15:21 ` Eduardo Habkost
0 siblings, 1 reply; 76+ messages in thread
From: Roman Kagan @ 2017-06-14 15:11 UTC (permalink / raw)
To: Eduardo Habkost
Cc: qemu-devel, Paolo Bonzini, Evgeny Yakovlev, Denis V . Lunev
On Wed, Jun 14, 2017 at 09:46:20AM -0300, Eduardo Habkost wrote:
> On Wed, Jun 14, 2017 at 12:58:04PM +0300, Roman Kagan wrote:
> > On Tue, Jun 13, 2017 at 03:34:34PM -0300, Eduardo Habkost wrote:
> > > On Tue, Jun 06, 2017 at 09:19:38PM +0300, Roman Kagan wrote:
> > > > +static Property synic_props[] = {
> > > > + DEFINE_PROP_BOOL("enabled", SynICState, enabled, false),
> > > > + DEFINE_PROP_UINT64("msg-page-addr", SynICState, msg_page_addr, 0),
> > > > + DEFINE_PROP_UINT64("evt-page-addr", SynICState, evt_page_addr, 0),
> > >
> > > What do you need the QOM properties for? Are they supposed to be
> > > configurable by the user?
> >
> > No. Actually I wanted to define them as read-only, but I failed to find
> > a way to do so.
> >
> > > Are they supposed to be queried from outside QEMU? On which cases?
> >
> > ATM I only see them as informational fields that may prove useful during
> > debugging.
> >
>
> You can use object_property_add_uint64_ptr() on instance_init to
> add read-only properties.
>
> If the enabled/msg_page_addr/evt_page_addr struct fields exist
> only because of the properties, you can also use
> object_property_add() and write your own getter that will query
> the MSRs only when reading the property. I normally don't like
> custom getters/setters, but it sounds acceptable for a read-only
> property that's only for debugging.
Actually that was what I did at first, but then I decided it was useful
to have the fields directly on SynICState (see followup patches).
> Either way you choose to implement it, I would add a comment
> noting that the properties are there just for debugging.
>
> BTW, would you still want to add this amount of boilerplate code
> just for debugging if we had a generic msr_get HMP command?
> There's a series in the list (submitted on April) adding msr_get
> and msr_set HMP commands, but it was never merged.
Yes, I guess it would do. So I'm now tempted to just drop the
properties, and leave the fields invisible through QOM.
Thanks,
Roman.
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 13/23] hyperv: qdev-ify SynIC
2017-06-14 15:11 ` Roman Kagan
@ 2017-06-14 15:21 ` Eduardo Habkost
0 siblings, 0 replies; 76+ messages in thread
From: Eduardo Habkost @ 2017-06-14 15:21 UTC (permalink / raw)
To: Roman Kagan, qemu-devel, Paolo Bonzini, Evgeny Yakovlev,
Denis V . Lunev
On Wed, Jun 14, 2017 at 06:11:03PM +0300, Roman Kagan wrote:
> On Wed, Jun 14, 2017 at 09:46:20AM -0300, Eduardo Habkost wrote:
> > On Wed, Jun 14, 2017 at 12:58:04PM +0300, Roman Kagan wrote:
> > > On Tue, Jun 13, 2017 at 03:34:34PM -0300, Eduardo Habkost wrote:
> > > > On Tue, Jun 06, 2017 at 09:19:38PM +0300, Roman Kagan wrote:
> > > > > +static Property synic_props[] = {
> > > > > + DEFINE_PROP_BOOL("enabled", SynICState, enabled, false),
> > > > > + DEFINE_PROP_UINT64("msg-page-addr", SynICState, msg_page_addr, 0),
> > > > > + DEFINE_PROP_UINT64("evt-page-addr", SynICState, evt_page_addr, 0),
> > > >
> > > > What do you need the QOM properties for? Are they supposed to be
> > > > configurable by the user?
> > >
> > > No. Actually I wanted to define them as read-only, but I failed to find
> > > a way to do so.
> > >
> > > > Are they supposed to be queried from outside QEMU? On which cases?
> > >
> > > ATM I only see them as informational fields that may prove useful during
> > > debugging.
> > >
> >
> > You can use object_property_add_uint64_ptr() on instance_init to
> > add read-only properties.
> >
> > If the enabled/msg_page_addr/evt_page_addr struct fields exist
> > only because of the properties, you can also use
> > object_property_add() and write your own getter that will query
> > the MSRs only when reading the property. I normally don't like
> > custom getters/setters, but it sounds acceptable for a read-only
> > property that's only for debugging.
>
> Actually that was what I did at first, but then I decided it was useful
> to have the fields directly on SynICState (see followup patches).
>
> > Either way you choose to implement it, I would add a comment
> > noting that the properties are there just for debugging.
> >
> > BTW, would you still want to add this amount of boilerplate code
> > just for debugging if we had a generic msr_get HMP command?
> > There's a series in the list (submitted on April) adding msr_get
> > and msr_set HMP commands, but it was never merged.
>
> Yes, I guess it would do. So I'm now tempted to just drop the
> properties, and leave the fields invisible through QOM.
If you are going to keep the fields on SynICState, then you just
need one object_property_add_*_ptr() line for each property, so
maybe it's still worth it. It's up to you.
--
Eduardo
^ permalink raw reply [flat|nested] 76+ messages in thread
* [Qemu-devel] [PATCH 14/23] kvm-all: make async_safe_run_on_cpu safe on kvm too
2017-06-06 18:19 [Qemu-devel] [PATCH 00/23] hyperv fixes and enhancements Roman Kagan
` (11 preceding siblings ...)
2017-06-06 18:19 ` [Qemu-devel] [PATCH 13/23] hyperv: qdev-ify SynIC Roman Kagan
@ 2017-06-06 18:19 ` Roman Kagan
2017-06-08 14:47 ` Paolo Bonzini
2017-06-06 18:19 ` [Qemu-devel] [PATCH 15/23] hyperv: make overlay pages for SynIC Roman Kagan
` (9 subsequent siblings)
22 siblings, 1 reply; 76+ messages in thread
From: Roman Kagan @ 2017-06-06 18:19 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
Wrap the bulk of kvm_cpu_exec with cpu_exec_start/end, so that kvm
version can also enjoy performing certain operations while all vCPUs are
quiescent.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
kvm-all.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/kvm-all.c b/kvm-all.c
index 494b925..85668fb 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1974,6 +1974,7 @@ int kvm_cpu_exec(CPUState *cpu)
}
qemu_mutex_unlock_iothread();
+ cpu_exec_start(cpu);
do {
MemTxAttrs attrs;
@@ -2103,6 +2104,7 @@ int kvm_cpu_exec(CPUState *cpu)
}
} while (ret == 0);
+ cpu_exec_end(cpu);
qemu_mutex_lock_iothread();
if (ret < 0) {
--
2.9.4
^ permalink raw reply related [flat|nested] 76+ messages in thread
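[Editorial note: with this change, KVM-accelerated vCPUs participate in the same quiescence protocol as TCG ones, so async_safe_run_on_cpu() can be used from KVM code as well. A minimal usage sketch; the work function and context are hypothetical.]

static void do_quiescent_work(CPUState *cs, run_on_cpu_data data)
{
    /* runs while every vCPU is outside the cpu_exec_start/end window */
    void *ctx = data.host_ptr;
    (void)ctx;
}

/* from a vCPU thread, e.g. while handling a KVM exit: */
async_safe_run_on_cpu(cs, do_quiescent_work, RUN_ON_CPU_HOST_PTR(ctx));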
* Re: [Qemu-devel] [PATCH 14/23] kvm-all: make async_safe_run_on_cpu safe on kvm too
2017-06-06 18:19 ` [Qemu-devel] [PATCH 14/23] kvm-all: make async_safe_run_on_cpu safe on kvm too Roman Kagan
@ 2017-06-08 14:47 ` Paolo Bonzini
0 siblings, 0 replies; 76+ messages in thread
From: Paolo Bonzini @ 2017-06-08 14:47 UTC (permalink / raw)
To: Roman Kagan, qemu-devel; +Cc: Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
On 06/06/2017 20:19, Roman Kagan wrote:
> Wrap the bulk of kvm_cpu_exec with cpu_exec_start/end, so that kvm
> version can also enjoy performing certain operations while all vCPUs are
> quiescent.
>
> Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
> ---
> kvm-all.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/kvm-all.c b/kvm-all.c
> index 494b925..85668fb 100644
> --- a/kvm-all.c
> +++ b/kvm-all.c
> @@ -1974,6 +1974,7 @@ int kvm_cpu_exec(CPUState *cpu)
> }
>
> qemu_mutex_unlock_iothread();
> + cpu_exec_start(cpu);
>
> do {
> MemTxAttrs attrs;
> @@ -2103,6 +2104,7 @@ int kvm_cpu_exec(CPUState *cpu)
> }
> } while (ret == 0);
>
> + cpu_exec_end(cpu);
> qemu_mutex_lock_iothread();
>
> if (ret < 0) {
>
This patch caught my eye, I'm applying it right now.
Paolo
^ permalink raw reply [flat|nested] 76+ messages in thread
* [Qemu-devel] [PATCH 15/23] hyperv: make overlay pages for SynIC
2017-06-06 18:19 [Qemu-devel] [PATCH 00/23] hyperv fixes and enhancements Roman Kagan
` (12 preceding siblings ...)
2017-06-06 18:19 ` [Qemu-devel] [PATCH 14/23] kvm-all: make async_safe_run_on_cpu safe on kvm too Roman Kagan
@ 2017-06-06 18:19 ` Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 16/23] hyperv: map overlay pages after updating msrs Roman Kagan
` (8 subsequent siblings)
22 siblings, 0 replies; 76+ messages in thread
From: Roman Kagan @ 2017-06-06 18:19 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
Per the Hyper-V spec, SynIC message and event flag pages are to be
implemented as so-called overlay pages. That is, they are owned by the
hypervisor and, when mapped into the guest physical address space,
overlay the guest physical pages such that
1) the overlaid guest page becomes invisible to the guest CPUs until the
overlay page is turned off
2) the contents of the overlay page are preserved when it's turned off
and back on, even at a different address; it's only zeroed at vcpu
reset
This particular nature of SynIC message and event flag pages is ignored
in the current code, and guest physical pages are used directly instead.
This (mostly) works because actual guests seem not to depend on the
features listed above.
This patch implements those pages as the spec mandates. Besides being
more correct, it will help to work around certain lifetime issues with
the current code (in a followup patch).
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
target/i386/hyperv.c | 67 +++++++++++++++++++++++++++++++++++++++++++++++-----
1 file changed, 61 insertions(+), 6 deletions(-)
diff --git a/target/i386/hyperv.c b/target/i386/hyperv.c
index 2d9e9fe..165133a 100644
--- a/target/i386/hyperv.c
+++ b/target/i386/hyperv.c
@@ -15,6 +15,9 @@
#include "qemu/main-loop.h"
#include "qapi/error.h"
#include "hw/qdev-properties.h"
+#include "exec/address-spaces.h"
+#include "sysemu/cpus.h"
+#include "migration/vmstate.h"
#include "hyperv.h"
#include "hyperv_proto.h"
@@ -26,6 +29,10 @@ typedef struct SynICState {
bool enabled;
hwaddr msg_page_addr;
hwaddr evt_page_addr;
+ MemoryRegion msg_page_mr;
+ MemoryRegion evt_page_mr;
+ struct hyperv_message_page *msg_page;
+ struct hyperv_event_flags_page *evt_page;
} SynICState;
#define TYPE_SYNIC "hyperv-synic"
@@ -65,6 +72,17 @@ static void synic_update_msg_page_addr(SynICState *synic)
uint64_t msr = synic->cpu->env.msr_hv_synic_msg_page;
hwaddr new_addr = (msr & HV_SIMP_ENABLE) ? (msr & TARGET_PAGE_MASK) : 0;
+ if (new_addr == synic->msg_page_addr) {
+ return;
+ }
+
+ if (synic->msg_page_addr) {
+ memory_region_del_subregion(get_system_memory(), &synic->msg_page_mr);
+ }
+ if (new_addr) {
+ memory_region_add_subregion(get_system_memory(), new_addr,
+ &synic->msg_page_mr);
+ }
synic->msg_page_addr = new_addr;
}
@@ -73,6 +91,17 @@ static void synic_update_evt_page_addr(SynICState *synic)
uint64_t msr = synic->cpu->env.msr_hv_synic_evt_page;
hwaddr new_addr = (msr & HV_SIEFP_ENABLE) ? (msr & TARGET_PAGE_MASK) : 0;
+ if (new_addr == synic->evt_page_addr) {
+ return;
+ }
+
+ if (synic->evt_page_addr) {
+ memory_region_del_subregion(get_system_memory(), &synic->evt_page_mr);
+ }
+ if (new_addr) {
+ memory_region_add_subregion(get_system_memory(), new_addr,
+ &synic->evt_page_mr);
+ }
synic->evt_page_addr = new_addr;
}
@@ -83,6 +112,15 @@ static void synic_update(SynICState *synic)
synic_update_evt_page_addr(synic);
}
+
+static void async_synic_update(CPUState *cs, run_on_cpu_data data)
+{
+ SynICState *synic = data.host_ptr;
+ qemu_mutex_lock_iothread();
+ synic_update(synic);
+ qemu_mutex_unlock_iothread();
+}
+
int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit)
{
CPUX86State *env = &cpu->env;
@@ -93,11 +131,6 @@ int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit)
return -1;
}
- /*
- * For now just track changes in SynIC control and msg/evt pages msr's.
- * When SynIC messaging/events processing will be added in future
- * here we will do messages queues flushing and pages remapping.
- */
switch (exit->u.synic.msr) {
case HV_X64_MSR_SCONTROL:
env->msr_hv_synic_control = exit->u.synic.control;
@@ -111,7 +144,8 @@ int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit)
default:
return -1;
}
- synic_update(get_synic(cpu));
+ async_safe_run_on_cpu(CPU(cpu), async_synic_update,
+ RUN_ON_CPU_HOST_PTR(get_synic(cpu)));
return 0;
case KVM_EXIT_HYPERV_HCALL: {
uint16_t code;
@@ -245,13 +279,34 @@ static void synic_realize(DeviceState *dev, Error **errp)
{
Object *obj = OBJECT(dev);
SynICState *synic = SYNIC(dev);
+ char *msgp_name, *evtp_name;
+ uint32_t vp_index;
synic->cpu = X86_CPU(obj->parent);
+
+ /* memory region names have to be globally unique */
+ vp_index = hyperv_vp_index(synic->cpu);
+ msgp_name = g_strdup_printf("synic-%u-msg-page", vp_index);
+ evtp_name = g_strdup_printf("synic-%u-evt-page", vp_index);
+
+ memory_region_init_ram(&synic->msg_page_mr, obj, msgp_name,
+ sizeof(*synic->msg_page), &error_abort);
+ memory_region_init_ram(&synic->evt_page_mr, obj, evtp_name,
+ sizeof(*synic->evt_page), &error_abort);
+ vmstate_register_ram(&synic->msg_page_mr, dev);
+ vmstate_register_ram(&synic->evt_page_mr, dev);
+ synic->msg_page = memory_region_get_ram_ptr(&synic->msg_page_mr);
+ synic->evt_page = memory_region_get_ram_ptr(&synic->evt_page_mr);
+
+ g_free(msgp_name);
+ g_free(evtp_name);
}
static void synic_reset(DeviceState *dev)
{
SynICState *synic = SYNIC(dev);
+ memset(synic->msg_page, 0, sizeof(*synic->msg_page));
+ memset(synic->evt_page, 0, sizeof(*synic->evt_page));
synic_update(synic);
}
--
2.9.4
^ permalink raw reply related [flat|nested] 76+ messages in thread
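[Editorial note: the property the patch leans on is that the RAM backing belongs to the MemoryRegion, not to whatever guest address it is currently mapped at, so contents survive unmapping and remapping. A condensed sketch of the overlay lifecycle; error handling is elided, and the patch additionally makes region names unique per vp_index.]

/* one-time setup: a RAM region owned by the SynIC */
memory_region_init_ram(&synic->msg_page_mr, obj, "synic-msg-page",
                       sizeof(*synic->msg_page), &error_abort);
synic->msg_page = memory_region_get_ram_ptr(&synic->msg_page_mr);

/* guest enables the page at addr_a: overlay it over guest memory */
memory_region_add_subregion(get_system_memory(), addr_a, &synic->msg_page_mr);

/* guest moves it to addr_b: contents are preserved across the remap */
memory_region_del_subregion(get_system_memory(), &synic->msg_page_mr);
memory_region_add_subregion(get_system_memory(), addr_b, &synic->msg_page_mr);

/* only vcpu reset actually clears it */
memset(synic->msg_page, 0, sizeof(*synic->msg_page));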
* [Qemu-devel] [PATCH 16/23] hyperv: map overlay pages after updating msrs
2017-06-06 18:19 [Qemu-devel] [PATCH 00/23] hyperv fixes and enhancements Roman Kagan
` (13 preceding siblings ...)
2017-06-06 18:19 ` [Qemu-devel] [PATCH 15/23] hyperv: make overlay pages for SynIC Roman Kagan
@ 2017-06-06 18:19 ` Roman Kagan
2017-06-14 11:12 ` Paolo Bonzini
2017-06-06 18:19 ` [Qemu-devel] [PATCH 17/23] hyperv: add synic message delivery Roman Kagan
` (7 subsequent siblings)
22 siblings, 1 reply; 76+ messages in thread
From: Roman Kagan @ 2017-06-06 18:19 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
There is a design flaw in the Hyper-V SynIC implementation in KVM: when
the message page or event flags page is enabled by setting the corresponding
msr, KVM zeroes it out. This violates the spec in general (per spec,
the pages have to be overlay ones and only zeroed at cpu reset), but
it's non-fatal in normal operation because the user exit happens after
the page is zeroed, so it's the underlying guest page which is zeroed
out, and sane guests don't depend on its contents to be preserved while
it's overlaid.
However, in the case of vmstate load the overlay pages are set up before
msrs are set so the contents of those pages get lost.
To work around it, avoid setting up overlay pages in .post_load.
Instead, postpone it until after the msrs are pushed to KVM. As a
result, KVM just zeroes out the underlying guest pages, similar to what
happens during guest-initiated msr writes, which is tolerable.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
target/i386/kvm.c | 8 ++++++++
target/i386/machine.c | 9 ---------
2 files changed, 8 insertions(+), 9 deletions(-)
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 433c912..b0b7595 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -2761,6 +2761,14 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
return ret;
}
}
+ /*
+ * to work around buggy KVM which zeroes out the message and event pages in
+ * KVM_SET_MSRS handler, only map the overlay pages after kvm_put_msrs,
+ * making vmstate load work similar to guest-initiated set_msr
+ */
+ if (level >= KVM_PUT_RESET_STATE) {
+ hyperv_synic_update(x86_cpu);
+ }
ret = kvm_put_tscdeadline_msr(x86_cpu);
if (ret < 0) {
diff --git a/target/i386/machine.c b/target/i386/machine.c
index 8022c24..eb00b19 100644
--- a/target/i386/machine.c
+++ b/target/i386/machine.c
@@ -7,7 +7,6 @@
#include "hw/i386/pc.h"
#include "hw/isa/isa.h"
#include "migration/cpu.h"
-#include "hyperv.h"
#include "sysemu/kvm.h"
@@ -634,19 +633,11 @@ static bool hyperv_synic_enable_needed(void *opaque)
return false;
}
-static int hyperv_synic_post_load(void *opaque, int version_id)
-{
- X86CPU *cpu = opaque;
- hyperv_synic_update(cpu);
- return 0;
-}
-
static const VMStateDescription vmstate_msr_hyperv_synic = {
.name = "cpu/msr_hyperv_synic",
.version_id = 1,
.minimum_version_id = 1,
.needed = hyperv_synic_enable_needed,
- .post_load = hyperv_synic_post_load,
.fields = (VMStateField[]) {
VMSTATE_UINT64(env.msr_hv_synic_control, X86CPU),
VMSTATE_UINT64(env.msr_hv_synic_evt_page, X86CPU),
--
2.9.4
^ permalink raw reply related [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 16/23] hyperv: map overlay pages after updating msrs
2017-06-06 18:19 ` [Qemu-devel] [PATCH 16/23] hyperv: map overlay pages after updating msrs Roman Kagan
@ 2017-06-14 11:12 ` Paolo Bonzini
2017-06-14 11:54 ` Roman Kagan
0 siblings, 1 reply; 76+ messages in thread
From: Paolo Bonzini @ 2017-06-14 11:12 UTC (permalink / raw)
To: Roman Kagan, qemu-devel; +Cc: Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
On 06/06/2017 20:19, Roman Kagan wrote:
> There is a design flaw in the Hyper-V SynIC implementation in KVM: when
> the message page or event flags page is enabled by setting the corresponding
> msr, KVM zeroes it out. This violates the spec in general (per spec,
> the pages have to be overlay ones and only zeroed at cpu reset), but
> it's non-fatal in normal operation because the user exit happens after
> the page is zeroed, so it's the underlying guest page which is zeroed
> out, and sane guests don't depend on its contents to be preserved while
> it's overlaid.
>
> However, in the case of vmstate load the overlay pages are set up before
> msrs are set so the contents of those pages get lost.
>
> To work it around, avoid setting up overlay pages in .post_load.
> Instead, postpone it until after the msrs are pushed to KVM. As a
> result, KVM just zeroes out the underlying guest pages similar to how it
> happens during guest-initiated msr writes, which is tolerable.
Why not disable the zeroing for host-initiated MSR writes? This is
pretty clearly a KVM bug, we can push it to stable kernels too.
Paolo
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 16/23] hyperv: map overlay pages after updating msrs
2017-06-14 11:12 ` Paolo Bonzini
@ 2017-06-14 11:54 ` Roman Kagan
2017-06-14 12:11 ` Paolo Bonzini
0 siblings, 1 reply; 76+ messages in thread
From: Roman Kagan @ 2017-06-14 11:54 UTC (permalink / raw)
To: Paolo Bonzini
Cc: qemu-devel, Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
On Wed, Jun 14, 2017 at 01:12:12PM +0200, Paolo Bonzini wrote:
>
>
> On 06/06/2017 20:19, Roman Kagan wrote:
> > There is a design flaw in the Hyper-V SynIC implementation in KVM: when
> > the message page or the event flags page is enabled by setting the
> > corresponding msr, KVM zeroes it out. This violates the spec in general
> > (per spec, the pages have to be overlay ones and only zeroed at cpu
> > reset), but it's non-fatal in normal operation because the user exit
> > happens after the page is zeroed, so it's the underlying guest page that
> > gets zeroed out, and sane guests don't depend on its contents being
> > preserved while it's overlaid.
> >
> > However, in the case of vmstate load the overlay pages are set up before
> > the msrs are set, so the contents of those pages get lost.
> >
> > To work around it, avoid setting up overlay pages in .post_load.
> > Instead, postpone it until after the msrs are pushed to KVM. As a
> > result, KVM just zeroes out the underlying guest pages, similarly to
> > what happens during guest-initiated msr writes, which is tolerable.
>
> Why not disable the zeroing for host-initiated MSR writes? This is
> pretty clearly a KVM bug, we can push it to stable kernels too.
The only problem with this is that QEMU will have no reliable way to
know if the KVM it runs with has this bug fixed or not. Machines
without vmbus work and even migrate fine with the current KVM despite
this bug (the only user of those pages currently is synic timers which
re-arm themselves and post messages regardless of zeroing). Now
updating QEMU to a vmbus-enabled version without updating the kernel
will make the migrations cause guest hangs.
If that is tolerable I can happily drop this patch as it complicates
code a little. Distros probably won't be affected as they can make sure
their kernels have this bug fixed before they roll out a vmbus-capable
QEMU.
What do you think?
Thanks,
Roman.
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 16/23] hyperv: map overlay pages after updating msrs
2017-06-14 11:54 ` Roman Kagan
@ 2017-06-14 12:11 ` Paolo Bonzini
2017-06-14 12:41 ` Roman Kagan
0 siblings, 1 reply; 76+ messages in thread
From: Paolo Bonzini @ 2017-06-14 12:11 UTC (permalink / raw)
To: Roman Kagan, qemu-devel, Eduardo Habkost, Evgeny Yakovlev,
Denis V . Lunev
On 14/06/2017 13:54, Roman Kagan wrote:
>> Why not disable the zeroing for host-initiated MSR writes? This is
>> pretty clearly a KVM bug, we can push it to stable kernels too.
>
> The only problem with this is that QEMU will have no reliable way to
> know if the KVM it runs with has this bug fixed or not. Machines
> without vmbus work and even migrate fine with the current KVM despite
> this bug (the only user of those pages currently is synic timers which
> re-arm themselves and post messages regardless of zeroing). Now
> updating QEMU to a vmbus-enabled version without updating the kernel
> will make the migrations cause guest hangs.
Return 2 from KVM_CHECK_EXTENSION(KVM_CAP_HYPERV_SYNIC)? Then you can
make new QEMU refuse to enable synic if a new kernel is not available.
Paolo
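A minimal sketch of what such a gate could look like on the QEMU side,
assuming the capability value is bumped to 2 on fixed kernels (the
helper name and the ">= 2" policy are illustrative, not part of this
series):
    #include "sysemu/kvm.h"
    /* Assumption: KVM_CHECK_EXTENSION(KVM_CAP_HYPERV_SYNIC) >= 2 would
     * mean "SynIC pages are not zeroed on host-initiated MSR writes" */
    static bool hyperv_synic_overlay_safe(KVMState *s)
    {
        return kvm_check_extension(s, KVM_CAP_HYPERV_SYNIC) >= 2;
    }
New QEMU would then refuse to enable hv-synic whenever this returns
false, instead of risking guest hangs after migration.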
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 16/23] hyperv: map overlay pages after updating msrs
2017-06-14 12:11 ` Paolo Bonzini
@ 2017-06-14 12:41 ` Roman Kagan
2017-06-14 12:46 ` Paolo Bonzini
0 siblings, 1 reply; 76+ messages in thread
From: Roman Kagan @ 2017-06-14 12:41 UTC (permalink / raw)
To: Paolo Bonzini
Cc: qemu-devel, Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
On Wed, Jun 14, 2017 at 02:11:56PM +0200, Paolo Bonzini wrote:
> On 14/06/2017 13:54, Roman Kagan wrote:
> >> Why not disable the zeroing for host-initiated MSR writes? This is
> >> pretty clearly a KVM bug, we can push it to stable kernels too.
> >
> > The only problem with this is that QEMU will have no reliable way to
> > know if the KVM it runs with has this bug fixed or not. Machines
> > without vmbus work and even migrate fine with the current KVM despite
> > this bug (the only user of those pages currently is synic timers which
> > re-arm themselves and post messages regardless of zeroing). Now
> > updating QEMU to a vmbus-enabled version without updating the kernel
> > will make the migrations cause guest hangs.
>
> Return 2 from KVM_CHECK_EXTENSION(KVM_CAP_HYPERV_SYNIC)? Then you can
> make new QEMU refuse to enable synic if a new kernel is not available.
Indeed, that's a possibility.
I'll probably make it in both directions then: on
KVM_ENABLE_CAP(KVM_CAP_HYPERV_SYNIC, 2) disable zeroing completely,
including on guest writes, to better match Hyper-V. Or does it deserve
a separate cap number?
Thanks,
Roman.
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 16/23] hyperv: map overlay pages after updating msrs
2017-06-14 12:41 ` Roman Kagan
@ 2017-06-14 12:46 ` Paolo Bonzini
0 siblings, 0 replies; 76+ messages in thread
From: Paolo Bonzini @ 2017-06-14 12:46 UTC (permalink / raw)
To: Roman Kagan, qemu-devel, Eduardo Habkost, Evgeny Yakovlev,
Denis V . Lunev
On 14/06/2017 14:41, Roman Kagan wrote:
> On Wed, Jun 14, 2017 at 02:11:56PM +0200, Paolo Bonzini wrote:
>> On 14/06/2017 13:54, Roman Kagan wrote:
>>>> Why not disable the zeroing for host-initiated MSR writes? This is
>>>> pretty clearly a KVM bug, we can push it to stable kernels too.
>>>
>>> The only problem with this is that QEMU will have no reliable way to
>>> know if the KVM it runs with has this bug fixed or not. Machines
>>> without vmbus work and even migrate fine with the current KVM despite
>>> this bug (the only user of those pages currently is synic timers which
>>> re-arm themselves and post messages regardless of zeroing). Now
>>> updating QEMU to a vmbus-enabled version without updating the kernel
>>> will make the migrations cause guest hangs.
>>
>> Return 2 from KVM_CHECK_EXTENSION(KVM_CAP_HYPERV_SYNIC)? Then you can
>> make new QEMU refuse to enable synic if a new kernel is not available.
>
> Indeed, that's a possibility.
>
> I'll probably make it in both directions then: on
> KVM_ENABLE_CAP(KVM_CAP_HYPERV_SYNIC, 2) disable zeroing completely,
> including on guest writes, to better match Hyper-V. Or does it deserve
> a separate cap number?
Unfortunately kvm_vcpu_ioctl_enable_cap is not checking arguments and
refusing nonzero values, so I'm a bit wary of doing that
unconditionally---and I'm not sure it deserves a separate capability number.
Maybe a flag KVM_ENABLE_CAP_STRICT_ARG_CHECK in cap->flags, but I don't
want you to do too much unrelated work. Keeping the zeroing on guest
writes doesn't seem too bad.
Or the nuclear option, introduce KVM_CAP_HYPERV_SYNIC2 and drop the
broken KVM_CAP_HYPERV_SYNIC altogether. Has its charm...
Paolo
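For reference, the nuclear option could look roughly like this on the
kernel side (entirely hypothetical: the KVM_CAP_HYPERV_SYNIC2 name, the
argument checking, and the extra "don't zero" flag are assumptions, not
existing KVM code):
    /* sketch of a fragment of kvm_vcpu_ioctl_enable_cap() */
    case KVM_CAP_HYPERV_SYNIC2:
        if (cap->args[0])
            return -EINVAL;    /* strict argument checking from day one */
        /* fall through */
    case KVM_CAP_HYPERV_SYNIC:
        /* second argument: don't zero the overlay pages on
         * host-initiated MSR writes */
        return kvm_hv_activate_synic(vcpu,
                                     cap->cap == KVM_CAP_HYPERV_SYNIC2);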
^ permalink raw reply [flat|nested] 76+ messages in thread
* [Qemu-devel] [PATCH 17/23] hyperv: add synic message delivery
2017-06-06 18:19 [Qemu-devel] [PATCH 00/23] hyperv fixes and enhancements Roman Kagan
` (14 preceding siblings ...)
2017-06-06 18:19 ` [Qemu-devel] [PATCH 16/23] hyperv: map overlay pages after updating msrs Roman Kagan
@ 2017-06-06 18:19 ` Roman Kagan
2017-06-14 15:08 ` Paolo Bonzini
2017-06-06 18:19 ` [Qemu-devel] [PATCH 18/23] hyperv: add synic event flag signaling Roman Kagan
` (6 subsequent siblings)
22 siblings, 1 reply; 76+ messages in thread
From: Roman Kagan @ 2017-06-06 18:19 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
Add infrastructure to deliver SynIC messages to the guest SynIC message
page.
Note that KVM may also want to deliver (SynIC timer) messages to the
same message slot.
The problem is that access to a SynIC message slot is controlled by the
value of its .msg_type field, which indicates whether the slot is owned
by the hypervisor (zero) or by the guest (non-zero).
This leaves no room for synchronizing multiple concurrent producers.
The simplest way to deal with this for both KVM and QEMU is to only
deliver messages in the vcpu thread. KVM already does this; this patch
makes QEMU do the same.
Specifically,
- add a function for posting messages, which only copies the message
into the staging buffer if it's free, and schedules work on the
corresponding vcpu to actually deliver it to the guest slot;
- instead of a sint ack callback, set up the sint route with a message
status callback. This function is called in a bh whenever there are
updates to the message slot status: either the vcpu made definitive
progress delivering the message from the staging buffer (succeeded or
failed) or the guest issued EOM; the status is passed as an argument
to the callback.
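For illustration, a user of the new interface could look roughly like
this (the MyConn type and the callback body are made up; the signatures
are the ones introduced by this patch):
    typedef struct MyConn {
        HvSintRoute *route;
        struct hyperv_message msg;
    } MyConn;
    static void my_msg_status_cb(void *data, int status)
    {
        MyConn *c = data;
        /* runs in a bh; -EAGAIN means the slot was busy and the guest
         * has since EOMed it, so the message can be re-posted now */
        if (status == -EAGAIN) {
            hyperv_post_msg(c->route, &c->msg);
        }
    }
    /* during setup: */
    c->route = hyperv_sint_route_new(cpu, sint, my_msg_status_cb, c);
    ret = hyperv_post_msg(c->route, &c->msg); /* 0, or an error code */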
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
target/i386/hyperv.h | 7 ++--
target/i386/hyperv.c | 116 ++++++++++++++++++++++++++++++++++++++++++++++-----
2 files changed, 109 insertions(+), 14 deletions(-)
diff --git a/target/i386/hyperv.h b/target/i386/hyperv.h
index 9dd5ca0..fa3e988 100644
--- a/target/i386/hyperv.h
+++ b/target/i386/hyperv.h
@@ -19,13 +19,12 @@
#include "qemu/event_notifier.h"
typedef struct HvSintRoute HvSintRoute;
-typedef void (*HvSintAckClb)(void *data);
+typedef void (*HvSintMsgCb)(void *data, int status);
int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit);
HvSintRoute *hyperv_sint_route_new(X86CPU *cpu, uint32_t sint,
- HvSintAckClb sint_ack_clb,
- void *sint_ack_clb_data);
+ HvSintMsgCb cb, void *cb_data);
void hyperv_sint_route_ref(HvSintRoute *sint_route);
void hyperv_sint_route_unref(HvSintRoute *sint_route);
@@ -38,4 +37,6 @@ void hyperv_synic_add(X86CPU *cpu);
void hyperv_synic_reset(X86CPU *cpu);
void hyperv_synic_update(X86CPU *cpu);
+int hyperv_post_msg(HvSintRoute *sint_route, struct hyperv_message *msg);
+
#endif
diff --git a/target/i386/hyperv.c b/target/i386/hyperv.c
index 165133a..0a7f9b1 100644
--- a/target/i386/hyperv.c
+++ b/target/i386/hyperv.c
@@ -44,8 +44,21 @@ struct HvSintRoute {
int gsi;
EventNotifier sint_set_notifier;
EventNotifier sint_ack_notifier;
- HvSintAckClb sint_ack_clb;
- void *sint_ack_clb_data;
+
+ HvSintMsgCb msg_cb;
+ void *msg_cb_data;
+ QEMUBH *msg_bh;
+ struct hyperv_message *msg;
+ /*
+ * the state of the message staged in .msg:
+ * 0 - the staging area is not in use (after init or message
+ * successfully delivered to guest)
+ * -EBUSY - the staging area is being used in vcpu thread
+ * -EAGAIN - delivery attempt failed due to slot being busy, retry
+ * -EXXXX - error
+ */
+ int msg_status;
+
unsigned refcount;
};
@@ -112,6 +125,69 @@ static void synic_update(SynICState *synic)
synic_update_evt_page_addr(synic);
}
+/*
+ * Worker to transfer the message from the staging area into the guest-owned
+ * message page in vcpu context, which guarantees serialization with both KVM
+ * vcpu and the guest cpu.
+ */
+static void cpu_post_msg(CPUState *cs, run_on_cpu_data data)
+{
+ int ret;
+ HvSintRoute *sint_route = data.host_ptr;
+ SynICState *synic = sint_route->synic;
+ struct hyperv_message *dst_msg;
+
+ if (!synic->enabled || !synic->msg_page_addr) {
+ ret = -ENXIO;
+ goto notify;
+ }
+
+ dst_msg = &synic->msg_page->slot[sint_route->sint];
+
+ if (dst_msg->header.message_type != HV_MESSAGE_NONE) {
+ dst_msg->header.message_flags |= HV_MESSAGE_FLAG_PENDING;
+ ret = -EAGAIN;
+ } else {
+ memcpy(dst_msg, sint_route->msg, sizeof(*dst_msg));
+ ret = kvm_hv_sint_route_set_sint(sint_route);
+ }
+
+ memory_region_set_dirty(&synic->msg_page_mr, 0, sizeof(*synic->msg_page));
+
+notify:
+ sint_route->msg_status = ret;
+ /* notify the msg originator of the progress made; if the slot was busy we
+ * set msg_pending flag in it so it will be the guest who will do EOM and
+ * trigger the notification from KVM via sint_ack_notifier */
+ if (ret != -EAGAIN) {
+ qemu_bh_schedule(sint_route->msg_bh);
+ }
+}
+
+/*
+ * Post a Hyper-V message to the staging area, for delivery to guest in the
+ * vcpu thread.
+ */
+int hyperv_post_msg(HvSintRoute *sint_route, struct hyperv_message *src_msg)
+{
+ int ret = sint_route->msg_status;
+
+ assert(sint_route->msg_cb);
+
+ if (ret == -EBUSY) {
+ return -EAGAIN;
+ }
+ if (ret) {
+ return ret;
+ }
+
+ sint_route->msg_status = -EBUSY;
+ memcpy(sint_route->msg, src_msg, sizeof(*src_msg));
+
+ async_run_on_cpu(CPU(sint_route->synic->cpu), cpu_post_msg,
+ RUN_ON_CPU_HOST_PTR(sint_route));
+ return 0;
+}
static void async_synic_update(CPUState *cs, run_on_cpu_data data)
{
@@ -164,17 +240,27 @@ int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit)
}
}
-static void kvm_hv_sint_ack_handler(EventNotifier *notifier)
+static void sint_ack_handler(EventNotifier *notifier)
{
HvSintRoute *sint_route = container_of(notifier, HvSintRoute,
sint_ack_notifier);
event_notifier_test_and_clear(notifier);
- sint_route->sint_ack_clb(sint_route->sint_ack_clb_data);
+
+ if (sint_route->msg_status == -EAGAIN) {
+ qemu_bh_schedule(sint_route->msg_bh);
+ }
+}
+
+static void sint_msg_bh(void *opaque)
+{
+ HvSintRoute *sint_route = opaque;
+ int status = sint_route->msg_status;
+ sint_route->msg_status = 0;
+ sint_route->msg_cb(sint_route->msg_cb_data, status);
}
HvSintRoute *hyperv_sint_route_new(X86CPU *cpu, uint32_t sint,
- HvSintAckClb sint_ack_clb,
- void *sint_ack_clb_data)
+ HvSintMsgCb cb, void *cb_data)
{
SynICState *synic;
HvSintRoute *sint_route;
@@ -189,14 +275,18 @@ HvSintRoute *hyperv_sint_route_new(X86CPU *cpu, uint32_t sint,
goto err;
}
- ack_notifier = sint_ack_clb ? &sint_route->sint_ack_notifier : NULL;
+ ack_notifier = cb ? &sint_route->sint_ack_notifier : NULL;
if (ack_notifier) {
+ sint_route->msg = g_new(struct hyperv_message, 1);
+
r = event_notifier_init(ack_notifier, false);
if (r) {
goto err_sint_set_notifier;
}
- event_notifier_set_handler(ack_notifier, kvm_hv_sint_ack_handler);
+ event_notifier_set_handler(ack_notifier, sint_ack_handler);
+
+ sint_route->msg_bh = qemu_bh_new(sint_msg_bh, sint_route);
}
gsi = kvm_irqchip_add_hv_sint_route(kvm_state, hyperv_vp_index(cpu), sint);
@@ -211,8 +301,8 @@ HvSintRoute *hyperv_sint_route_new(X86CPU *cpu, uint32_t sint,
goto err_irqfd;
}
sint_route->gsi = gsi;
- sint_route->sint_ack_clb = sint_ack_clb;
- sint_route->sint_ack_clb_data = sint_ack_clb_data;
+ sint_route->msg_cb = cb;
+ sint_route->msg_cb_data = cb_data;
sint_route->synic = synic;
sint_route->sint = sint;
sint_route->refcount = 1;
@@ -223,8 +313,10 @@ err_irqfd:
kvm_irqchip_release_virq(kvm_state, gsi);
err_gsi:
if (ack_notifier) {
+ qemu_bh_delete(sint_route->msg_bh);
event_notifier_set_handler(ack_notifier, NULL);
event_notifier_cleanup(ack_notifier);
+ g_free(sint_route->msg);
}
err_sint_set_notifier:
event_notifier_cleanup(&sint_route->sint_set_notifier);
@@ -255,9 +347,11 @@ void hyperv_sint_route_unref(HvSintRoute *sint_route)
&sint_route->sint_set_notifier,
sint_route->gsi);
kvm_irqchip_release_virq(kvm_state, sint_route->gsi);
- if (sint_route->sint_ack_clb) {
+ if (sint_route->msg_cb) {
+ qemu_bh_delete(sint_route->msg_bh);
event_notifier_set_handler(&sint_route->sint_ack_notifier, NULL);
event_notifier_cleanup(&sint_route->sint_ack_notifier);
+ g_free(sint_route->msg);
}
event_notifier_cleanup(&sint_route->sint_set_notifier);
g_free(sint_route);
--
2.9.4
^ permalink raw reply related [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 17/23] hyperv: add synic message delivery
2017-06-06 18:19 ` [Qemu-devel] [PATCH 17/23] hyperv: add synic message delivery Roman Kagan
@ 2017-06-14 15:08 ` Paolo Bonzini
2017-06-14 15:28 ` Roman Kagan
0 siblings, 1 reply; 76+ messages in thread
From: Paolo Bonzini @ 2017-06-14 15:08 UTC (permalink / raw)
To: Roman Kagan, qemu-devel; +Cc: Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
On 06/06/2017 20:19, Roman Kagan wrote:
> + sint_route->msg_status = ret;
> + /* notify the msg originator of the progress made; if the slot was busy we
> + * set msg_pending flag in it so it will be the guest who will do EOM and
> + * trigger the notification from KVM via sint_ack_notifier */
> + if (ret != -EAGAIN) {
> + qemu_bh_schedule(sint_route->msg_bh);
> + }
It may be faster to use aio_bh_schedule_oneshot, depending on the number
of devices.
Paolo
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 17/23] hyperv: add synic message delivery
2017-06-14 15:08 ` Paolo Bonzini
@ 2017-06-14 15:28 ` Roman Kagan
2017-06-14 15:32 ` Paolo Bonzini
0 siblings, 1 reply; 76+ messages in thread
From: Roman Kagan @ 2017-06-14 15:28 UTC (permalink / raw)
To: Paolo Bonzini
Cc: qemu-devel, Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
On Wed, Jun 14, 2017 at 05:08:02PM +0200, Paolo Bonzini wrote:
>
>
> On 06/06/2017 20:19, Roman Kagan wrote:
> > + sint_route->msg_status = ret;
> > + /* notify the msg originator of the progress made; if the slot was busy we
> > + * set msg_pending flag in it so it will be the guest who will do EOM and
> > + * trigger the notification from KVM via sint_ack_notifier */
> > + if (ret != -EAGAIN) {
> > + qemu_bh_schedule(sint_route->msg_bh);
> > + }
>
> It may be faster to use aio_bh_schedule_oneshot, depending on the number
> of devices.
Messages aren't used on fast paths, they are only exchanged at device
setup/teardown. So I cared more about readability than speed here.
Thanks,
Roman.
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 17/23] hyperv: add synic message delivery
2017-06-14 15:28 ` Roman Kagan
@ 2017-06-14 15:32 ` Paolo Bonzini
2017-06-14 15:39 ` Roman Kagan
0 siblings, 1 reply; 76+ messages in thread
From: Paolo Bonzini @ 2017-06-14 15:32 UTC (permalink / raw)
To: Roman Kagan, qemu-devel, Eduardo Habkost, Evgeny Yakovlev,
Denis V . Lunev
On 14/06/2017 17:28, Roman Kagan wrote:
> On Wed, Jun 14, 2017 at 05:08:02PM +0200, Paolo Bonzini wrote:
>>
>>
>> On 06/06/2017 20:19, Roman Kagan wrote:
>>> + sint_route->msg_status = ret;
>>> + /* notify the msg originator of the progress made; if the slot was busy we
>>> + * set msg_pending flag in it so it will be the guest who will do EOM and
>>> + * trigger the notification from KVM via sint_ack_notifier */
>>> + if (ret != -EAGAIN) {
>>> + qemu_bh_schedule(sint_route->msg_bh);
>>> + }
>>
>> It may be faster to use aio_bh_schedule_oneshot, depending on the number
>> of devices.
>
> Messages aren't used on fast paths, they are only exchanged at device
> setup/teardown. So I cared more about readability than speed here.
Then you really want to use aio_bh_schedule_oneshot, because bottom
halves incur a (small) cost even when you don't use them: each iteration
of the event loop visits the list of bottom halves.
Persistent bottom halves are thus only a good idea if they are
expected to trigger very often on a busy VM. If this bottom half is
only triggered at setup/teardown, it shouldn't be a persistent one.
Paolo
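In this patch's terms the difference would be roughly (a sketch; it
assumes the main-loop AioContext is the right one here):
    /* persistent: visited on every event loop iteration, even when idle */
    sint_route->msg_bh = qemu_bh_new(sint_msg_bh, sint_route);
    qemu_bh_schedule(sint_route->msg_bh);
    /* one-shot: created when scheduled, deleted after it has run */
    aio_bh_schedule_oneshot(qemu_get_aio_context(), sint_msg_bh, sint_route);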
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 17/23] hyperv: add synic message delivery
2017-06-14 15:32 ` Paolo Bonzini
@ 2017-06-14 15:39 ` Roman Kagan
0 siblings, 0 replies; 76+ messages in thread
From: Roman Kagan @ 2017-06-14 15:39 UTC (permalink / raw)
To: Paolo Bonzini
Cc: qemu-devel, Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
On Wed, Jun 14, 2017 at 05:32:12PM +0200, Paolo Bonzini wrote:
>
>
> On 14/06/2017 17:28, Roman Kagan wrote:
> > On Wed, Jun 14, 2017 at 05:08:02PM +0200, Paolo Bonzini wrote:
> >>
> >>
> >> On 06/06/2017 20:19, Roman Kagan wrote:
> >>> + sint_route->msg_status = ret;
> >>> + /* notify the msg originator of the progress made; if the slot was busy we
> >>> + * set msg_pending flag in it so it will be the guest who will do EOM and
> >>> + * trigger the notification from KVM via sint_ack_notifier */
> >>> + if (ret != -EAGAIN) {
> >>> + qemu_bh_schedule(sint_route->msg_bh);
> >>> + }
> >>
> >> It may be faster to use aio_bh_schedule_oneshot, depending on the number
> >> of devices.
> >
> > Messages aren't used on fast paths, they are only exchanged at device
> > setup/teardown. So I cared more about readability than speed here.
>
> Then you really want to use aio_bh_schedule_oneshot, because bottom
> halves incur a (small) cost even when you don't use them: each iteration
> of the event loop visits the list of bottom halves.
I didn't realize that (yes that's easy to see in the code but the API
didn't suggest I needed to ;).
> Persistent bottom halves are thus only a good idea if they are
> expected to trigger very often on a busy VM. If this bottom half is
> only triggered at setup/teardown, it shouldn't be a persistent one.
Thanks for pointing that out; we'll have to adjust our vmbus code as
well, since we used a persistent bottom half for message interactions
there too.
Roman.
^ permalink raw reply [flat|nested] 76+ messages in thread
* [Qemu-devel] [PATCH 18/23] hyperv: add synic event flag signaling
2017-06-06 18:19 [Qemu-devel] [PATCH 00/23] hyperv fixes and enhancements Roman Kagan
` (15 preceding siblings ...)
2017-06-06 18:19 ` [Qemu-devel] [PATCH 17/23] hyperv: add synic message delivery Roman Kagan
@ 2017-06-06 18:19 ` Roman Kagan
2017-06-14 15:07 ` Paolo Bonzini
2017-06-06 18:19 ` [Qemu-devel] [PATCH 19/23] hyperv: process SIGNAL_EVENT hypercall Roman Kagan
` (5 subsequent siblings)
22 siblings, 1 reply; 76+ messages in thread
From: Roman Kagan @ 2017-06-06 18:19 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
Add infrastructure to signal SynIC event flags by atomically setting the
corresponding bit in the event flags page and firing a SINT if
necessary.
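A minimal usage sketch of the new helper (the caller and the error
handling are illustrative):
    /* signal event flag `evtno` on an established sint route */
    int ret = hyperv_set_evt_flag(sint_route, evtno);
    /* ret == 0 also covers "flag was already pending" (no extra SINT);
     * -EINVAL: evtno out of range; -ENXIO: SynIC or event page disabled */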
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
target/i386/hyperv.h | 2 ++
target/i386/hyperv.c | 32 ++++++++++++++++++++++++++++++++
2 files changed, 34 insertions(+)
diff --git a/target/i386/hyperv.h b/target/i386/hyperv.h
index fa3e988..a2d4304 100644
--- a/target/i386/hyperv.h
+++ b/target/i386/hyperv.h
@@ -39,4 +39,6 @@ void hyperv_synic_update(X86CPU *cpu);
int hyperv_post_msg(HvSintRoute *sint_route, struct hyperv_message *msg);
+int hyperv_set_evt_flag(HvSintRoute *sint_route, unsigned evtno);
+
#endif
diff --git a/target/i386/hyperv.c b/target/i386/hyperv.c
index 0a7f9b1..dcf49e4 100644
--- a/target/i386/hyperv.c
+++ b/target/i386/hyperv.c
@@ -17,6 +17,7 @@
#include "hw/qdev-properties.h"
#include "exec/address-spaces.h"
#include "sysemu/cpus.h"
+#include "qemu/bitops.h"
#include "migration/vmstate.h"
#include "hyperv.h"
#include "hyperv_proto.h"
@@ -189,6 +190,37 @@ int hyperv_post_msg(HvSintRoute *sint_route, struct hyperv_message *src_msg)
return 0;
}
+/*
+ * Set given event flag for a given sint on a given vcpu, and signal the sint.
+ */
+int hyperv_set_evt_flag(HvSintRoute *sint_route, unsigned evtno)
+{
+ int ret;
+ SynICState *synic = sint_route->synic;
+ unsigned long *flags, set_mask;
+ unsigned set_idx;
+
+ if (evtno > HV_EVENT_FLAGS_COUNT) {
+ return -EINVAL;
+ }
+ if (!synic->enabled || !synic->evt_page_addr) {
+ return -ENXIO;
+ }
+
+ set_idx = BIT_WORD(evtno);
+ set_mask = BIT_MASK(evtno);
+ flags = synic->evt_page->slot[sint_route->sint].flags;
+
+ if ((atomic_fetch_or(&flags[set_idx], set_mask) & set_mask) != set_mask) {
+ ret = kvm_hv_sint_route_set_sint(sint_route);
+ memory_region_set_dirty(&synic->evt_page_mr, 0,
+ sizeof(*synic->evt_page));
+ } else {
+ ret = 0;
+ }
+ return ret;
+}
+
static void async_synic_update(CPUState *cs, run_on_cpu_data data)
{
SynICState *synic = data.host_ptr;
--
2.9.4
^ permalink raw reply related [flat|nested] 76+ messages in thread
* [Qemu-devel] [PATCH 19/23] hyperv: process SIGNAL_EVENT hypercall
2017-06-06 18:19 [Qemu-devel] [PATCH 00/23] hyperv fixes and enhancements Roman Kagan
` (16 preceding siblings ...)
2017-06-06 18:19 ` [Qemu-devel] [PATCH 18/23] hyperv: add synic event flag signaling Roman Kagan
@ 2017-06-06 18:19 ` Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 20/23] hyperv: process POST_MESSAGE hypercall Roman Kagan
` (4 subsequent siblings)
22 siblings, 0 replies; 76+ messages in thread
From: Roman Kagan @ 2017-06-06 18:19 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
Add handling of the SIGNAL_EVENT hypercall. For that, provide an
interface to associate an EventNotifier with an event connection number,
so that the notifier is signaled when the guest issues a SIGNAL_EVENT
hypercall with matching parameters.
TODO: we should be able to move this to KVM and avoid an expensive user
exit just to look up an eventfd by connection number and signal it.
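A usage sketch of the new interface (the handler name is illustrative;
conn_id is whatever the guest has been told to use):
    EventNotifier notifier;
    event_notifier_init(&notifier, false);
    event_notifier_set_handler(&notifier, my_evt_handler);
    /* claim the connection id; fails with -EEXIST if already taken */
    hyperv_set_evt_notifier(conn_id, &notifier);
    /* ... later, at teardown: a NULL notifier unregisters the
     * connection id; -ENOENT if it wasn't registered */
    hyperv_set_evt_notifier(conn_id, NULL);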
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
target/i386/hyperv.h | 2 +
target/i386/hyperv.c | 113 +++++++++++++++++++++++++++++++++++++++++++++++++--
2 files changed, 111 insertions(+), 4 deletions(-)
diff --git a/target/i386/hyperv.h b/target/i386/hyperv.h
index a2d4304..d2630ac 100644
--- a/target/i386/hyperv.h
+++ b/target/i386/hyperv.h
@@ -41,4 +41,6 @@ int hyperv_post_msg(HvSintRoute *sint_route, struct hyperv_message *msg);
int hyperv_set_evt_flag(HvSintRoute *sint_route, unsigned evtno);
+int hyperv_set_evt_notifier(uint32_t conn_id, EventNotifier *notifier);
+
#endif
diff --git a/target/i386/hyperv.c b/target/i386/hyperv.c
index dcf49e4..030184e 100644
--- a/target/i386/hyperv.c
+++ b/target/i386/hyperv.c
@@ -18,6 +18,9 @@
#include "exec/address-spaces.h"
#include "sysemu/cpus.h"
#include "qemu/bitops.h"
+#include "qemu/queue.h"
+#include "qemu/rcu.h"
+#include "qemu/rcu_queue.h"
#include "migration/vmstate.h"
#include "hyperv.h"
#include "hyperv_proto.h"
@@ -229,6 +232,106 @@ static void async_synic_update(CPUState *cs, run_on_cpu_data data)
qemu_mutex_unlock_iothread();
}
+typedef struct EvtHandler {
+ struct rcu_head rcu;
+ QLIST_ENTRY(EvtHandler) le;
+ uint32_t conn_id;
+ EventNotifier *notifier;
+} EvtHandler;
+
+static QLIST_HEAD(, EvtHandler) evt_handlers;
+static QemuMutex evt_handlers_mutex;
+
+static void __attribute__((constructor)) hv_init(void)
+{
+ QLIST_INIT(&evt_handlers);
+ qemu_mutex_init(&evt_handlers_mutex);
+}
+
+int hyperv_set_evt_notifier(uint32_t conn_id, EventNotifier *notifier)
+{
+ int ret;
+ EvtHandler *eh;
+
+ qemu_mutex_lock(&evt_handlers_mutex);
+ QLIST_FOREACH(eh, &evt_handlers, le) {
+ if (eh->conn_id == conn_id) {
+ if (notifier) {
+ ret = -EEXIST;
+ } else {
+ QLIST_REMOVE_RCU(eh, le);
+ g_free_rcu(eh, rcu);
+ ret = 0;
+ }
+ goto unlock;
+ }
+ }
+
+ if (notifier) {
+ eh = g_new(EvtHandler, 1);
+ eh->conn_id = conn_id;
+ eh->notifier = notifier;
+ QLIST_INSERT_HEAD_RCU(&evt_handlers, eh, le);
+ ret = 0;
+ } else {
+ ret = -ENOENT;
+ }
+unlock:
+ qemu_mutex_unlock(&evt_handlers_mutex);
+ return ret;
+}
+
+static uint64_t sigevent_params(hwaddr addr, uint32_t *conn_id)
+{
+ uint64_t ret;
+ hwaddr len;
+ struct hyperv_signal_event_input *msg;
+
+ if (addr & (__alignof__(*msg) - 1)) {
+ return HV_STATUS_INVALID_ALIGNMENT;
+ }
+
+ len = sizeof(*msg);
+ msg = cpu_physical_memory_map(addr, &len, 0);
+ if (len < sizeof(*msg)) {
+ ret = HV_STATUS_INSUFFICIENT_MEMORY;
+ } else {
+ *conn_id = (msg->connection_id & HV_CONNECTION_ID_MASK) +
+ msg->flag_number;
+ ret = 0;
+ }
+ cpu_physical_memory_unmap(msg, len, 0, 0);
+ return ret;
+}
+
+static uint64_t hvcall_signal_event(uint64_t param, bool fast)
+{
+ uint64_t ret;
+ uint32_t conn_id;
+ EvtHandler *eh;
+
+ if (likely(fast)) {
+ conn_id = (param & 0xffffffff) + ((param >> 32) & 0xffff);
+ } else {
+ ret = sigevent_params(param, &conn_id);
+ if (ret) {
+ return ret;
+ }
+ }
+
+ ret = HV_STATUS_INVALID_CONNECTION_ID;
+ rcu_read_lock();
+ QLIST_FOREACH_RCU(eh, &evt_handlers, le) {
+ if (eh->conn_id == conn_id) {
+ event_notifier_set(eh->notifier);
+ ret = 0;
+ break;
+ }
+ }
+ rcu_read_unlock();
+ return ret;
+}
+
int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit)
{
CPUX86State *env = &cpu->env;
@@ -256,16 +359,18 @@ int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit)
RUN_ON_CPU_HOST_PTR(get_synic(cpu)));
return 0;
case KVM_EXIT_HYPERV_HCALL: {
- uint16_t code;
+ uint16_t code = exit->u.hcall.input & 0xffff;
+ bool fast = exit->u.hcall.input & HV_HYPERCALL_FAST;
+ uint64_t param = exit->u.hcall.params[0];
- code = exit->u.hcall.input & 0xffff;
switch (code) {
- case HV_POST_MESSAGE:
case HV_SIGNAL_EVENT:
+ exit->u.hcall.result = hvcall_signal_event(param, fast);
+ break;
default:
exit->u.hcall.result = HV_STATUS_INVALID_HYPERCALL_CODE;
- return 0;
}
+ return 0;
}
default:
return -1;
--
2.9.4
^ permalink raw reply related [flat|nested] 76+ messages in thread
* [Qemu-devel] [PATCH 20/23] hyperv: process POST_MESSAGE hypercall
2017-06-06 18:19 [Qemu-devel] [PATCH 00/23] hyperv fixes and enhancements Roman Kagan
` (17 preceding siblings ...)
2017-06-06 18:19 ` [Qemu-devel] [PATCH 19/23] hyperv: process SIGNAL_EVENT hypercall Roman Kagan
@ 2017-06-06 18:19 ` Roman Kagan
2017-06-14 11:19 ` Paolo Bonzini
2017-06-06 18:19 ` [Qemu-devel] [PATCH 21/23] hyperv_testdev: add SynIC message and event testmodes Roman Kagan
` (3 subsequent siblings)
22 siblings, 1 reply; 76+ messages in thread
From: Roman Kagan @ 2017-06-06 18:19 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
Add handling of the POST_MESSAGE hypercall. For that, add an interface
to register a handler for messages arriving from the guest on a
particular connection id (IOW, set up a message connection in Hyper-V
speak).
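A usage sketch of the new interface (the handler body and names are
illustrative; the signatures are the ones introduced by this patch):
    static uint64_t my_msg_handler(const struct hyperv_post_message_input *msg,
                                   void *data)
    {
        /* consume msg->payload (msg->payload_size bytes) ... */
        return HV_STATUS_SUCCESS;
    }
    /* claim the connection id; fails with -EEXIST if already taken;
     * passing a NULL handler later unregisters it */
    hyperv_set_msg_handler(conn_id, my_msg_handler, my_data);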
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
target/i386/hyperv.h | 5 +++
target/i386/hyperv.c | 90 ++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 95 insertions(+)
diff --git a/target/i386/hyperv.h b/target/i386/hyperv.h
index d2630ac..f82c770 100644
--- a/target/i386/hyperv.h
+++ b/target/i386/hyperv.h
@@ -41,6 +41,11 @@ int hyperv_post_msg(HvSintRoute *sint_route, struct hyperv_message *msg);
int hyperv_set_evt_flag(HvSintRoute *sint_route, unsigned evtno);
+struct hyperv_post_message_input;
+typedef uint64_t (*HvMsgHandler)(const struct hyperv_post_message_input *msg,
+ void *data);
+int hyperv_set_msg_handler(uint32_t conn_id, HvMsgHandler handler, void *data);
+
int hyperv_set_evt_notifier(uint32_t conn_id, EventNotifier *notifier);
#endif
diff --git a/target/i386/hyperv.c b/target/i386/hyperv.c
index 030184e..50386a2 100644
--- a/target/i386/hyperv.c
+++ b/target/i386/hyperv.c
@@ -232,6 +232,17 @@ static void async_synic_update(CPUState *cs, run_on_cpu_data data)
qemu_mutex_unlock_iothread();
}
+typedef struct MsgHandler {
+ struct rcu_head rcu;
+ QLIST_ENTRY(MsgHandler) le;
+ uint32_t conn_id;
+ HvMsgHandler handler;
+ void *data;
+} MsgHandler;
+
+static QLIST_HEAD(, MsgHandler) msg_handlers;
+static QemuMutex msg_handlers_mutex;
+
typedef struct EvtHandler {
struct rcu_head rcu;
QLIST_ENTRY(EvtHandler) le;
@@ -244,10 +255,46 @@ static QemuMutex evt_handlers_mutex;
static void __attribute__((constructor)) hv_init(void)
{
+ QLIST_INIT(&msg_handlers);
+ qemu_mutex_init(&msg_handlers_mutex);
QLIST_INIT(&evt_handlers);
qemu_mutex_init(&evt_handlers_mutex);
}
+int hyperv_set_msg_handler(uint32_t conn_id, HvMsgHandler handler, void *data)
+{
+ int ret;
+ MsgHandler *mh;
+
+ qemu_mutex_lock(&msg_handlers_mutex);
+ QLIST_FOREACH(mh, &msg_handlers, le) {
+ if (mh->conn_id == conn_id) {
+ if (handler) {
+ ret = -EEXIST;
+ } else {
+ QLIST_REMOVE_RCU(mh, le);
+ g_free_rcu(mh, rcu);
+ ret = 0;
+ }
+ goto unlock;
+ }
+ }
+
+ if (handler) {
+ mh = g_new(MsgHandler, 1);
+ mh->conn_id = conn_id;
+ mh->handler = handler;
+ mh->data = data;
+ QLIST_INSERT_HEAD_RCU(&msg_handlers, mh, le);
+ ret = 0;
+ } else {
+ ret = -ENOENT;
+ }
+unlock:
+ qemu_mutex_unlock(&msg_handlers_mutex);
+ return ret;
+}
+
int hyperv_set_evt_notifier(uint32_t conn_id, EventNotifier *notifier)
{
int ret;
@@ -281,6 +328,46 @@ unlock:
return ret;
}
+static uint64_t hvcall_post_message(uint64_t param, bool fast)
+{
+ uint64_t ret;
+ hwaddr len;
+ struct hyperv_post_message_input *msg;
+ MsgHandler *mh;
+
+ if (fast) {
+ return HV_STATUS_INVALID_HYPERCALL_CODE;
+ }
+ if (param & (__alignof__(*msg) - 1)) {
+ return HV_STATUS_INVALID_ALIGNMENT;
+ }
+
+ len = sizeof(*msg);
+ msg = cpu_physical_memory_map(param, &len, 0);
+ if (len < sizeof(*msg)) {
+ ret = HV_STATUS_INSUFFICIENT_MEMORY;
+ goto unmap;
+ }
+ if (msg->payload_size > sizeof(msg->payload)) {
+ ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
+ goto unmap;
+ }
+
+ ret = HV_STATUS_INVALID_CONNECTION_ID;
+ rcu_read_lock();
+ QLIST_FOREACH_RCU(mh, &msg_handlers, le) {
+ if (mh->conn_id == (msg->connection_id & HV_CONNECTION_ID_MASK)) {
+ ret = mh->handler(msg, mh->data);
+ break;
+ }
+ }
+ rcu_read_unlock();
+
+unmap:
+ cpu_physical_memory_unmap(msg, len, 0, 0);
+ return ret;
+}
+
static uint64_t sigevent_params(hwaddr addr, uint32_t *conn_id)
{
uint64_t ret;
@@ -364,6 +451,9 @@ int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit)
uint64_t param = exit->u.hcall.params[0];
switch (code) {
+ case HV_POST_MESSAGE:
+ exit->u.hcall.result = hvcall_post_message(param, fast);
+ break;
case HV_SIGNAL_EVENT:
exit->u.hcall.result = hvcall_signal_event(param, fast);
break;
--
2.9.4
^ permalink raw reply related [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 20/23] hyperv: process POST_MESSAGE hypercall
2017-06-06 18:19 ` [Qemu-devel] [PATCH 20/23] hyperv: process POST_MESSAGE hypercall Roman Kagan
@ 2017-06-14 11:19 ` Paolo Bonzini
2017-06-14 14:20 ` Roman Kagan
0 siblings, 1 reply; 76+ messages in thread
From: Paolo Bonzini @ 2017-06-14 11:19 UTC (permalink / raw)
To: Roman Kagan, qemu-devel; +Cc: Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
On 06/06/2017 20:19, Roman Kagan wrote:
> +typedef struct MsgHandler {
> + struct rcu_head rcu;
> + QLIST_ENTRY(MsgHandler) le;
> + uint32_t conn_id;
> + HvMsgHandler handler;
> + void *data;
> +} MsgHandler;
> +
> +static QLIST_HEAD(, MsgHandler) msg_handlers;
> +static QemuMutex msg_handlers_mutex;
Maybe use the same mutex for event and message handlers?
Paolo
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 20/23] hyperv: process POST_MESSAGE hypercall
2017-06-14 11:19 ` Paolo Bonzini
@ 2017-06-14 14:20 ` Roman Kagan
2017-06-14 14:30 ` Paolo Bonzini
0 siblings, 1 reply; 76+ messages in thread
From: Roman Kagan @ 2017-06-14 14:20 UTC (permalink / raw)
To: Paolo Bonzini
Cc: qemu-devel, Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
On Wed, Jun 14, 2017 at 01:19:21PM +0200, Paolo Bonzini wrote:
> On 06/06/2017 20:19, Roman Kagan wrote:
> > +typedef struct MsgHandler {
> > + struct rcu_head rcu;
> > + QLIST_ENTRY(MsgHandler) le;
> > + uint32_t conn_id;
> > + HvMsgHandler handler;
> > + void *data;
> > +} MsgHandler;
> > +
> > +static QLIST_HEAD(, MsgHandler) msg_handlers;
> > +static QemuMutex msg_handlers_mutex;
>
> Maybe use the same mutex for event and message handlers?
Are there other benefits in it beside saving 40 bytes?
Roman.
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 20/23] hyperv: process POST_MESSAGE hypercall
2017-06-14 14:20 ` Roman Kagan
@ 2017-06-14 14:30 ` Paolo Bonzini
0 siblings, 0 replies; 76+ messages in thread
From: Paolo Bonzini @ 2017-06-14 14:30 UTC (permalink / raw)
To: Roman Kagan, qemu-devel, Eduardo Habkost, Evgeny Yakovlev,
Denis V . Lunev
On 14/06/2017 16:20, Roman Kagan wrote:
> On Wed, Jun 14, 2017 at 01:19:21PM +0200, Paolo Bonzini wrote:
>> On 06/06/2017 20:19, Roman Kagan wrote:
>>> +typedef struct MsgHandler {
>>> + struct rcu_head rcu;
>>> + QLIST_ENTRY(MsgHandler) le;
>>> + uint32_t conn_id;
>>> + HvMsgHandler handler;
>>> + void *data;
>>> +} MsgHandler;
>>> +
>>> +static QLIST_HEAD(, MsgHandler) msg_handlers;
>>> +static QemuMutex msg_handlers_mutex;
>>
>> Maybe use the same mutex for event and message handlers?
>
> Are there other benefits in it beside saving 40 bytes?
It's generally simpler if one module only uses one mutex, you don't have
to think of the interactions. Since everything is RCU-protected on the
read-side, it should not matter for performance.
Paolo
^ permalink raw reply [flat|nested] 76+ messages in thread
* [Qemu-devel] [PATCH 21/23] hyperv_testdev: add SynIC message and event testmodes
2017-06-06 18:19 [Qemu-devel] [PATCH 00/23] hyperv fixes and enhancements Roman Kagan
` (18 preceding siblings ...)
2017-06-06 18:19 ` [Qemu-devel] [PATCH 20/23] hyperv: process POST_MESSAGE hypercall Roman Kagan
@ 2017-06-06 18:19 ` Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 22/23] MAINTAINERS: add myself and eyakovlev@ for hyperv* Roman Kagan
` (2 subsequent siblings)
22 siblings, 0 replies; 76+ messages in thread
From: Roman Kagan @ 2017-06-06 18:19 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
Add testmodes for SynIC messages and events. Message and event
connection setup/teardown is initiated by the guest via new control
codes written to the test device port. The test connections then bounce
the respective operations back to the guest, i.e. incoming messages are
posted back and incoming events are signaled on the configured vCPUs.
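For reference, the guest side would compose the control word to match
the decoding in hv_test_dev_control() below (the port constant and the
outl() helper are assumptions about the kvm-unit-tests side, not part
of this patch):
    uint32_t ctl_word = sint |
                        (vcpu_id << 8) |
                        (HV_TEST_DEV_MSG_CONN_CREATE << 16) |
                        (conn_id << 24);
    outl(ctl_word, HV_TEST_DEV_PORT); /* hypothetical port constant */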
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
hw/misc/hyperv_testdev.c | 163 ++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 162 insertions(+), 1 deletion(-)
diff --git a/hw/misc/hyperv_testdev.c b/hw/misc/hyperv_testdev.c
index 929bf4f..99d6d8a 100644
--- a/hw/misc/hyperv_testdev.c
+++ b/hw/misc/hyperv_testdev.c
@@ -13,12 +13,15 @@
#include "qemu/osdep.h"
#include "qemu/queue.h"
+#include "qemu/event_notifier.h"
+#include "qemu/main-loop.h"
#include <linux/kvm.h>
#include "hw/hw.h"
#include "hw/qdev.h"
#include "hw/isa/isa.h"
#include "sysemu/kvm.h"
#include "target/i386/hyperv.h"
+#include "target/i386/hyperv_proto.h"
#include "kvm_i386.h"
typedef struct TestSintRoute {
@@ -28,10 +31,26 @@ typedef struct TestSintRoute {
HvSintRoute *sint_route;
} TestSintRoute;
+typedef struct TestMsgConn {
+ QLIST_ENTRY(TestMsgConn) le;
+ uint8_t conn_id;
+ HvSintRoute *sint_route;
+ struct hyperv_message msg;
+} TestMsgConn;
+
+typedef struct TestEvtConn {
+ QLIST_ENTRY(TestEvtConn) le;
+ uint8_t conn_id;
+ HvSintRoute *sint_route;
+ EventNotifier notifier;
+} TestEvtConn;
+
struct HypervTestDev {
ISADevice parent_obj;
MemoryRegion sint_control;
QLIST_HEAD(, TestSintRoute) sint_routes;
+ QLIST_HEAD(, TestMsgConn) msg_conns;
+ QLIST_HEAD(, TestEvtConn) evt_conns;
};
typedef struct HypervTestDev HypervTestDev;
@@ -42,7 +61,11 @@ typedef struct HypervTestDev HypervTestDev;
enum {
HV_TEST_DEV_SINT_ROUTE_CREATE = 1,
HV_TEST_DEV_SINT_ROUTE_DESTROY,
- HV_TEST_DEV_SINT_ROUTE_SET_SINT
+ HV_TEST_DEV_SINT_ROUTE_SET_SINT,
+ HV_TEST_DEV_MSG_CONN_CREATE,
+ HV_TEST_DEV_MSG_CONN_DESTROY,
+ HV_TEST_DEV_EVT_CONN_CREATE,
+ HV_TEST_DEV_EVT_CONN_DESTROY,
};
static void sint_route_create(HypervTestDev *dev, X86CPU *cpu, uint8_t sint)
@@ -94,6 +117,128 @@ static void sint_route_set_sint(HypervTestDev *dev, X86CPU *cpu, uint8_t sint)
kvm_hv_sint_route_set_sint(sint_route->sint_route);
}
+static void msg_cb(void *data, int status)
+{
+ TestMsgConn *conn = data;
+
+ assert(status == 0 || status == -EAGAIN);
+
+ if (!status) {
+ return;
+ }
+
+ /* no concurrent posting is expected so this should succeed */
+ assert(!hyperv_post_msg(conn->sint_route, &conn->msg));
+}
+
+static uint64_t msg_handler(const struct hyperv_post_message_input *msg,
+ void *data)
+{
+ int ret;
+ TestMsgConn *conn = data;
+
+ /* post the same message we've got */
+ conn->msg.header.message_type = msg->message_type;
+ assert(msg->payload_size < sizeof(conn->msg.payload));
+ conn->msg.header.payload_size = msg->payload_size;
+ memcpy(&conn->msg.payload, msg->payload, msg->payload_size);
+
+ ret = hyperv_post_msg(conn->sint_route, &conn->msg);
+
+ switch (ret) {
+ case 0:
+ return HV_STATUS_SUCCESS;
+ case -EAGAIN:
+ return HV_STATUS_INSUFFICIENT_BUFFERS;
+ default:
+ return HV_STATUS_INVALID_HYPERCALL_INPUT;
+ }
+}
+
+static void msg_conn_create(HypervTestDev *dev, X86CPU *cpu,
+ uint8_t sint, uint8_t conn_id)
+{
+ TestMsgConn *conn;
+
+ conn = g_new0(TestMsgConn, 1);
+ assert(conn);
+
+ conn->conn_id = conn_id;
+
+ conn->sint_route = hyperv_sint_route_new(cpu, sint, msg_cb, conn);
+ assert(conn->sint_route);
+
+ assert(!hyperv_set_msg_handler(conn->conn_id, msg_handler, conn));
+
+ QLIST_INSERT_HEAD(&dev->msg_conns, conn, le);
+}
+
+static void msg_conn_destroy(HypervTestDev *dev, uint8_t conn_id)
+{
+ TestMsgConn *conn;
+
+ QLIST_FOREACH(conn, &dev->msg_conns, le) {
+ if (conn->conn_id == conn_id) {
+ QLIST_REMOVE(conn, le);
+ hyperv_set_msg_handler(conn->conn_id, NULL, NULL);
+ hyperv_sint_route_unref(conn->sint_route);
+ g_free(conn);
+ return;
+ }
+ }
+ assert(false);
+}
+
+static void evt_conn_handler(EventNotifier *notifier)
+{
+ TestEvtConn *conn = container_of(notifier, TestEvtConn, notifier);
+
+ event_notifier_test_and_clear(notifier);
+
+ /* signal the same event flag we've got */
+ assert(!hyperv_set_evt_flag(conn->sint_route, conn->conn_id));
+}
+
+static void evt_conn_create(HypervTestDev *dev, X86CPU *cpu,
+ uint8_t sint, uint8_t conn_id)
+{
+ TestEvtConn *conn;
+
+ conn = g_new0(TestEvtConn, 1);
+ assert(conn);
+
+ conn->conn_id = conn_id;
+
+ conn->sint_route = hyperv_sint_route_new(cpu, sint, NULL, NULL);
+ assert(conn->sint_route);
+
+ assert(!event_notifier_init(&conn->notifier, false));
+
+ event_notifier_set_handler(&conn->notifier, evt_conn_handler);
+
+ assert(!hyperv_set_evt_notifier(conn_id, &conn->notifier));
+
+ QLIST_INSERT_HEAD(&dev->evt_conns, conn, le);
+}
+
+static void evt_conn_destroy(HypervTestDev *dev, uint8_t conn_id)
+{
+ TestEvtConn *conn;
+
+ QLIST_FOREACH(conn, &dev->evt_conns, le) {
+ if (conn->conn_id == conn_id) {
+ QLIST_REMOVE(conn, le);
+ hyperv_set_evt_notifier(conn->conn_id, NULL);
+ event_notifier_set_handler(&conn->notifier, NULL);
+ event_notifier_cleanup(&conn->notifier);
+ hyperv_sint_route_unref(conn->sint_route);
+ g_free(conn);
+ return;
+ }
+ }
+ assert(false);
+}
+
static void hv_test_dev_control(void *opaque, hwaddr addr, uint64_t data,
uint32_t len)
{
@@ -101,7 +246,9 @@ static void hv_test_dev_control(void *opaque, hwaddr addr, uint64_t data,
uint8_t sint = data & 0xFF;
uint8_t vcpu_id = (data >> 8ULL) & 0xFF;
uint8_t ctl = (data >> 16ULL) & 0xFF;
+ uint8_t conn_id = (data >> 24ULL) & 0xFF;
X86CPU *cpu = hyperv_find_vcpu(vcpu_id);
+
assert(cpu);
switch (ctl) {
@@ -114,6 +261,18 @@ static void hv_test_dev_control(void *opaque, hwaddr addr, uint64_t data,
case HV_TEST_DEV_SINT_ROUTE_SET_SINT:
sint_route_set_sint(dev, cpu, sint);
break;
+ case HV_TEST_DEV_MSG_CONN_CREATE:
+ msg_conn_create(dev, cpu, sint, conn_id);
+ break;
+ case HV_TEST_DEV_MSG_CONN_DESTROY:
+ msg_conn_destroy(dev, conn_id);
+ break;
+ case HV_TEST_DEV_EVT_CONN_CREATE:
+ evt_conn_create(dev, cpu, sint, conn_id);
+ break;
+ case HV_TEST_DEV_EVT_CONN_DESTROY:
+ evt_conn_destroy(dev, conn_id);
+ break;
default:
break;
}
@@ -133,6 +292,8 @@ static void hv_test_dev_realizefn(DeviceState *d, Error **errp)
MemoryRegion *io = isa_address_space_io(isa);
QLIST_INIT(&dev->sint_routes);
+ QLIST_INIT(&dev->msg_conns);
+ QLIST_INIT(&dev->evt_conns);
memory_region_init_io(&dev->sint_control, OBJECT(dev),
&synic_test_sint_ops, dev,
"hyperv-testdev-ctl", 4);
--
2.9.4
^ permalink raw reply related [flat|nested] 76+ messages in thread
* [Qemu-devel] [PATCH 22/23] MAINTAINERS: add myself and eyakovlev@ for hyperv*
2017-06-06 18:19 [Qemu-devel] [PATCH 00/23] hyperv fixes and enhancements Roman Kagan
` (19 preceding siblings ...)
2017-06-06 18:19 ` [Qemu-devel] [PATCH 21/23] hyperv_testdev: add SynIC message and event testmodes Roman Kagan
@ 2017-06-06 18:19 ` Roman Kagan
2017-06-06 18:19 ` [Qemu-devel] [PATCH 23/23] hyperv: update copyright notices Roman Kagan
[not found] ` <20170606181948.16238-12-rkagan@virtuozzo.com>
22 siblings, 0 replies; 76+ messages in thread
From: Roman Kagan @ 2017-06-06 18:19 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
MAINTAINERS | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 120788d..16c5422 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -852,6 +852,13 @@ F: hw/core/machine.c
F: hw/core/null-machine.c
F: include/hw/boards.h
+Hyper-V emulation core
+M: Roman Kagan <rkagan@virtuozzo.com>
+M: Evgeny Yakovlev <eyakovlev@virtuozzo.com>
+S: Supported
+F: target/i386/hyperv*
+F: hw/misc/hyperv*
+
Xtensa Machines
---------------
sim
--
2.9.4
^ permalink raw reply related [flat|nested] 76+ messages in thread
* [Qemu-devel] [PATCH 23/23] hyperv: update copyright notices
2017-06-06 18:19 [Qemu-devel] [PATCH 00/23] hyperv fixes and enhancements Roman Kagan
` (20 preceding siblings ...)
2017-06-06 18:19 ` [Qemu-devel] [PATCH 22/23] MAINTAINERS: add myself and eyakovlev@ for hyperv* Roman Kagan
@ 2017-06-06 18:19 ` Roman Kagan
[not found] ` <20170606181948.16238-12-rkagan@virtuozzo.com>
22 siblings, 0 replies; 76+ messages in thread
From: Roman Kagan @ 2017-06-06 18:19 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, Eduardo Habkost, Evgeny Yakovlev, Denis V . Lunev
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
target/i386/hyperv.h | 1 +
hw/misc/hyperv_testdev.c | 1 +
target/i386/hyperv.c | 1 +
3 files changed, 3 insertions(+)
diff --git a/target/i386/hyperv.h b/target/i386/hyperv.h
index f82c770..cca4723 100644
--- a/target/i386/hyperv.h
+++ b/target/i386/hyperv.h
@@ -2,6 +2,7 @@
* QEMU KVM Hyper-V support
*
* Copyright (C) 2015 Andrey Smetanin <asmetanin@virtuozzo.com>
+ * Copyright (C) 2017 Parallels International GmbH
*
* Authors:
* Andrey Smetanin <asmetanin@virtuozzo.com>
diff --git a/hw/misc/hyperv_testdev.c b/hw/misc/hyperv_testdev.c
index 99d6d8a..b5611f0 100644
--- a/hw/misc/hyperv_testdev.c
+++ b/hw/misc/hyperv_testdev.c
@@ -2,6 +2,7 @@
* QEMU KVM Hyper-V test device to support Hyper-V kvm-unit-tests
*
* Copyright (C) 2015 Andrey Smetanin <asmetanin@virtuozzo.com>
+ * Copyright (C) 2017 Parallels International GmbH
*
* Authors:
* Andrey Smetanin <asmetanin@virtuozzo.com>
diff --git a/target/i386/hyperv.c b/target/i386/hyperv.c
index 50386a2..ed3087b 100644
--- a/target/i386/hyperv.c
+++ b/target/i386/hyperv.c
@@ -2,6 +2,7 @@
* QEMU KVM Hyper-V support
*
* Copyright (C) 2015 Andrey Smetanin <asmetanin@virtuozzo.com>
+ * Copyright (C) 2017 Parallels International GmbH
*
* Authors:
* Andrey Smetanin <asmetanin@virtuozzo.com>
--
2.9.4
^ permalink raw reply related [flat|nested] 76+ messages in thread
[parent not found: <20170606181948.16238-12-rkagan@virtuozzo.com>]
* Re: [Qemu-devel] [PATCH 11/23] hyperv: address HvSintRoute by X86CPU pointer
[not found] ` <20170606181948.16238-12-rkagan@virtuozzo.com>
@ 2017-06-13 19:02 ` Eduardo Habkost
2017-06-14 11:08 ` Paolo Bonzini
0 siblings, 1 reply; 76+ messages in thread
From: Eduardo Habkost @ 2017-06-13 19:02 UTC (permalink / raw)
To: Roman Kagan; +Cc: qemu-devel, Paolo Bonzini, Evgeny Yakovlev, Denis V . Lunev
On Tue, Jun 06, 2017 at 09:19:36PM +0300, Roman Kagan wrote:
> Use X86CPU pointer to refer to the respective HvSintRoute instead of
> vp_index. This is more convenient and also paves the way for future
> enhancements.
>
> Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
[...]
> @@ -101,16 +101,18 @@ static void hv_test_dev_control(void *opaque, hwaddr addr, uint64_t data,
> uint8_t sint = data & 0xFF;
> uint8_t vcpu_id = (data >> 8ULL) & 0xFF;
vcpu_id risks being confused with KVM's vcpu_id (which is the CPU APIC
ID in x86). If you are already touching this code, this could be
renamed to vp_index to avoid confusion.
> uint8_t ctl = (data >> 16ULL) & 0xFF;
> + X86CPU *cpu = hyperv_find_vcpu(vcpu_id);
> + assert(cpu);
>
> switch (ctl) {
> case HV_TEST_DEV_SINT_ROUTE_CREATE:
> - sint_route_create(dev, vcpu_id, sint);
> + sint_route_create(dev, cpu, sint);
> break;
> case HV_TEST_DEV_SINT_ROUTE_DESTROY:
> - sint_route_destroy(dev, vcpu_id, sint);
> + sint_route_destroy(dev, cpu, sint);
> break;
> case HV_TEST_DEV_SINT_ROUTE_SET_SINT:
> - sint_route_set_sint(dev, vcpu_id, sint);
> + sint_route_set_sint(dev, cpu, sint);
> break;
> default:
> break;
--
Eduardo
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 11/23] hyperv: address HvSintRoute by X86CPU pointer
2017-06-13 19:02 ` [Qemu-devel] [PATCH 11/23] hyperv: address HvSintRoute by X86CPU pointer Eduardo Habkost
@ 2017-06-14 11:08 ` Paolo Bonzini
2017-06-14 12:14 ` Roman Kagan
0 siblings, 1 reply; 76+ messages in thread
From: Paolo Bonzini @ 2017-06-14 11:08 UTC (permalink / raw)
To: Eduardo Habkost, Roman Kagan; +Cc: qemu-devel, Evgeny Yakovlev, Denis V . Lunev
On 13/06/2017 21:02, Eduardo Habkost wrote:
>> @@ -101,16 +101,18 @@ static void hv_test_dev_control(void *opaque, hwaddr addr, uint64_t data,
>> uint8_t sint = data & 0xFF;
>> uint8_t vcpu_id = (data >> 8ULL) & 0xFF;
>
> vcpu_id risks being confused with KVM's vcpu_id (which is the CPU APIC
> ID in x86). If you are already touching this code, this could be
> renamed to vp_index to avoid confusion.
Actually the VP_INDEX is _also_ the vcpu_id. Should we just document
that KVM makes the VP_INDEX equal to the CPU APIC id, and adjust patch 5
accordingly?
Paolo
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 11/23] hyperv: address HvSintRoute by X86CPU pointer
2017-06-14 11:08 ` Paolo Bonzini
@ 2017-06-14 12:14 ` Roman Kagan
2017-06-14 12:17 ` Paolo Bonzini
0 siblings, 1 reply; 76+ messages in thread
From: Roman Kagan @ 2017-06-14 12:14 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Eduardo Habkost, qemu-devel, Evgeny Yakovlev, Denis V . Lunev
On Wed, Jun 14, 2017 at 01:08:43PM +0200, Paolo Bonzini wrote:
>
>
> On 13/06/2017 21:02, Eduardo Habkost wrote:
> >> @@ -101,16 +101,18 @@ static void hv_test_dev_control(void *opaque, hwaddr addr, uint64_t data,
> >> uint8_t sint = data & 0xFF;
> >> uint8_t vcpu_id = (data >> 8ULL) & 0xFF;
> >
> > vcpu_id risks being confused with KVM's vcpu_id (which is the CPU APIC
> > ID in x86). If you are already touching this code, this could be
> > renamed to vp_index to avoid confusion.
>
> Actually the VP_INDEX is _also_ the vcpu_id. Should we just document
> that KVM makes the VP_INDEX equal to the CPU APIC id, and adjust patch 5
> accordingly?
I think here it makes sense to follow Eduardo's suggestion because this
code doesn't need to know how VP_INDEX is defined.
The real matter is discussed in patch 5, yes.
Thanks,
Roman.
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Qemu-devel] [PATCH 11/23] hyperv: address HvSintRoute by X86CPU pointer
2017-06-14 12:14 ` Roman Kagan
@ 2017-06-14 12:17 ` Paolo Bonzini
0 siblings, 0 replies; 76+ messages in thread
From: Paolo Bonzini @ 2017-06-14 12:17 UTC (permalink / raw)
To: Roman Kagan, Eduardo Habkost, qemu-devel, Evgeny Yakovlev,
Denis V . Lunev
On 14/06/2017 14:14, Roman Kagan wrote:
>>> vcpu_id risks being confused with KVM's vcpu_id (which is the CPU APIC
>>> ID in x86). If you are already touching this code, this could be
>>> renamed to vp_index to avoid confusion.
>> Actually the VP_INDEX is _also_ the vcpu_id. Should we just document
>> that KVM makes the VP_INDEX equal to the CPU APIC id, and adjust patch 5
>> accordingly?
> I think here it makes sense to follow Eduardo's suggestion because this
> code doesn't need to know how VP_INDEX is defined.
Yes, I agree. Sorry if that wasn't clear---Eduardo's comment just made
me notice this possible solution to the issue of patch 5.
Paolo
> The real matter is discussed in patch 5, yes.
^ permalink raw reply [flat|nested] 76+ messages in thread