* [Qemu-devel] [PATCH v2 0/2] kvm: migrate vPMU state
From: Paolo Bonzini @ 2013-07-25 15:05 UTC
To: qemu-devel; +Cc: gnatapov, ehabkost, afaerber
Same code as v1, but with all tabs expunged from cpu.h.
Paolo Bonzini (2):
target-i386: remove tabs from target-i386/cpu.h
kvm: migrate vPMU state
target-i386/cpu.h | 215 ++++++++++++++++++++++++++++----------------------
target-i386/kvm.c | 93 ++++++++++++++++++++--
target-i386/machine.c | 44 +++++++++++
3 files changed, 251 insertions(+), 101 deletions(-)
--
1.8.3.1
* [Qemu-devel] [PATCH v2 1/2] target-i386: remove tabs from target-i386/cpu.h
From: Paolo Bonzini @ 2013-07-25 15:05 UTC
To: qemu-devel; +Cc: gnatapov, ehabkost, afaerber
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
target-i386/cpu.h | 192 +++++++++++++++++++++++++++---------------------------
1 file changed, 96 insertions(+), 96 deletions(-)
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index cedefdc..d19d111 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -37,9 +37,9 @@
#define TARGET_HAS_ICE 1
#ifdef TARGET_X86_64
-#define ELF_MACHINE EM_X86_64
+#define ELF_MACHINE EM_X86_64
#else
-#define ELF_MACHINE EM_386
+#define ELF_MACHINE EM_386
#endif
#define CPUArchState struct CPUX86State
@@ -98,10 +98,10 @@
#define DESC_TSS_BUSY_MASK (1 << 9)
/* eflags masks */
-#define CC_C 0x0001
-#define CC_P 0x0004
-#define CC_A 0x0010
-#define CC_Z 0x0040
+#define CC_C 0x0001
+#define CC_P 0x0004
+#define CC_A 0x0010
+#define CC_Z 0x0040
#define CC_S 0x0080
#define CC_O 0x0800
@@ -109,14 +109,14 @@
#define IOPL_SHIFT 12
#define VM_SHIFT 17
-#define TF_MASK 0x00000100
-#define IF_MASK 0x00000200
-#define DF_MASK 0x00000400
-#define IOPL_MASK 0x00003000
-#define NT_MASK 0x00004000
-#define RF_MASK 0x00010000
-#define VM_MASK 0x00020000
-#define AC_MASK 0x00040000
+#define TF_MASK 0x00000100
+#define IF_MASK 0x00000200
+#define DF_MASK 0x00000400
+#define IOPL_MASK 0x00003000
+#define NT_MASK 0x00004000
+#define RF_MASK 0x00010000
+#define VM_MASK 0x00020000
+#define AC_MASK 0x00040000
#define VIF_MASK 0x00080000
#define VIP_MASK 0x00100000
#define ID_MASK 0x00200000
@@ -238,28 +238,28 @@
#define DR7_TYPE_IO_RW 0x2
#define DR7_TYPE_DATA_RW 0x3
-#define PG_PRESENT_BIT 0
-#define PG_RW_BIT 1
-#define PG_USER_BIT 2
-#define PG_PWT_BIT 3
-#define PG_PCD_BIT 4
-#define PG_ACCESSED_BIT 5
-#define PG_DIRTY_BIT 6
-#define PG_PSE_BIT 7
-#define PG_GLOBAL_BIT 8
-#define PG_NX_BIT 63
+#define PG_PRESENT_BIT 0
+#define PG_RW_BIT 1
+#define PG_USER_BIT 2
+#define PG_PWT_BIT 3
+#define PG_PCD_BIT 4
+#define PG_ACCESSED_BIT 5
+#define PG_DIRTY_BIT 6
+#define PG_PSE_BIT 7
+#define PG_GLOBAL_BIT 8
+#define PG_NX_BIT 63
#define PG_PRESENT_MASK (1 << PG_PRESENT_BIT)
-#define PG_RW_MASK (1 << PG_RW_BIT)
-#define PG_USER_MASK (1 << PG_USER_BIT)
-#define PG_PWT_MASK (1 << PG_PWT_BIT)
-#define PG_PCD_MASK (1 << PG_PCD_BIT)
+#define PG_RW_MASK (1 << PG_RW_BIT)
+#define PG_USER_MASK (1 << PG_USER_BIT)
+#define PG_PWT_MASK (1 << PG_PWT_BIT)
+#define PG_PCD_MASK (1 << PG_PCD_BIT)
#define PG_ACCESSED_MASK (1 << PG_ACCESSED_BIT)
-#define PG_DIRTY_MASK (1 << PG_DIRTY_BIT)
-#define PG_PSE_MASK (1 << PG_PSE_BIT)
-#define PG_GLOBAL_MASK (1 << PG_GLOBAL_BIT)
+#define PG_DIRTY_MASK (1 << PG_DIRTY_BIT)
+#define PG_PSE_MASK (1 << PG_PSE_BIT)
+#define PG_GLOBAL_MASK (1 << PG_GLOBAL_BIT)
#define PG_HI_USER_MASK 0x7ff0000000000000LL
-#define PG_NX_MASK (1LL << PG_NX_BIT)
+#define PG_NX_MASK (1LL << PG_NX_BIT)
#define PG_ERROR_W_BIT 1
@@ -269,32 +269,32 @@
#define PG_ERROR_RSVD_MASK 0x08
#define PG_ERROR_I_D_MASK 0x10
-#define MCG_CTL_P (1ULL<<8) /* MCG_CAP register available */
-#define MCG_SER_P (1ULL<<24) /* MCA recovery/new status bits */
+#define MCG_CTL_P (1ULL<<8) /* MCG_CAP register available */
+#define MCG_SER_P (1ULL<<24) /* MCA recovery/new status bits */
-#define MCE_CAP_DEF (MCG_CTL_P|MCG_SER_P)
-#define MCE_BANKS_DEF 10
+#define MCE_CAP_DEF (MCG_CTL_P|MCG_SER_P)
+#define MCE_BANKS_DEF 10
-#define MCG_STATUS_RIPV (1ULL<<0) /* restart ip valid */
-#define MCG_STATUS_EIPV (1ULL<<1) /* ip points to correct instruction */
-#define MCG_STATUS_MCIP (1ULL<<2) /* machine check in progress */
+#define MCG_STATUS_RIPV (1ULL<<0) /* restart ip valid */
+#define MCG_STATUS_EIPV (1ULL<<1) /* ip points to correct instruction */
+#define MCG_STATUS_MCIP (1ULL<<2) /* machine check in progress */
-#define MCI_STATUS_VAL (1ULL<<63) /* valid error */
-#define MCI_STATUS_OVER (1ULL<<62) /* previous errors lost */
-#define MCI_STATUS_UC (1ULL<<61) /* uncorrected error */
-#define MCI_STATUS_EN (1ULL<<60) /* error enabled */
-#define MCI_STATUS_MISCV (1ULL<<59) /* misc error reg. valid */
-#define MCI_STATUS_ADDRV (1ULL<<58) /* addr reg. valid */
-#define MCI_STATUS_PCC (1ULL<<57) /* processor context corrupt */
-#define MCI_STATUS_S (1ULL<<56) /* Signaled machine check */
-#define MCI_STATUS_AR (1ULL<<55) /* Action required */
+#define MCI_STATUS_VAL (1ULL<<63) /* valid error */
+#define MCI_STATUS_OVER (1ULL<<62) /* previous errors lost */
+#define MCI_STATUS_UC (1ULL<<61) /* uncorrected error */
+#define MCI_STATUS_EN (1ULL<<60) /* error enabled */
+#define MCI_STATUS_MISCV (1ULL<<59) /* misc error reg. valid */
+#define MCI_STATUS_ADDRV (1ULL<<58) /* addr reg. valid */
+#define MCI_STATUS_PCC (1ULL<<57) /* processor context corrupt */
+#define MCI_STATUS_S (1ULL<<56) /* Signaled machine check */
+#define MCI_STATUS_AR (1ULL<<55) /* Action required */
/* MISC register defines */
-#define MCM_ADDR_SEGOFF 0 /* segment offset */
-#define MCM_ADDR_LINEAR 1 /* linear address */
-#define MCM_ADDR_PHYS 2 /* physical address */
-#define MCM_ADDR_MEM 3 /* memory address */
-#define MCM_ADDR_GENERIC 7 /* generic */
+#define MCM_ADDR_SEGOFF 0 /* segment offset */
+#define MCM_ADDR_LINEAR 1 /* linear address */
+#define MCM_ADDR_PHYS 2 /* physical address */
+#define MCM_ADDR_MEM 3 /* memory address */
+#define MCM_ADDR_GENERIC 7 /* generic */
#define MSR_IA32_TSC 0x10
#define MSR_IA32_APICBASE 0x1b
@@ -304,10 +304,10 @@
#define MSR_TSC_ADJUST 0x0000003b
#define MSR_IA32_TSCDEADLINE 0x6e0
-#define MSR_MTRRcap 0xfe
-#define MSR_MTRRcap_VCNT 8
-#define MSR_MTRRcap_FIXRANGE_SUPPORT (1 << 8)
-#define MSR_MTRRcap_WC_SUPPORTED (1 << 10)
+#define MSR_MTRRcap 0xfe
+#define MSR_MTRRcap_VCNT 8
+#define MSR_MTRRcap_FIXRANGE_SUPPORT (1 << 8)
+#define MSR_MTRRcap_WC_SUPPORTED (1 << 10)
#define MSR_IA32_SYSENTER_CS 0x174
#define MSR_IA32_SYSENTER_ESP 0x175
@@ -319,33 +319,33 @@
#define MSR_IA32_PERF_STATUS 0x198
-#define MSR_IA32_MISC_ENABLE 0x1a0
+#define MSR_IA32_MISC_ENABLE 0x1a0
/* Indicates good rep/movs microcode on some processors: */
#define MSR_IA32_MISC_ENABLE_DEFAULT 1
-#define MSR_MTRRphysBase(reg) (0x200 + 2 * (reg))
-#define MSR_MTRRphysMask(reg) (0x200 + 2 * (reg) + 1)
-
-#define MSR_MTRRfix64K_00000 0x250
-#define MSR_MTRRfix16K_80000 0x258
-#define MSR_MTRRfix16K_A0000 0x259
-#define MSR_MTRRfix4K_C0000 0x268
-#define MSR_MTRRfix4K_C8000 0x269
-#define MSR_MTRRfix4K_D0000 0x26a
-#define MSR_MTRRfix4K_D8000 0x26b
-#define MSR_MTRRfix4K_E0000 0x26c
-#define MSR_MTRRfix4K_E8000 0x26d
-#define MSR_MTRRfix4K_F0000 0x26e
-#define MSR_MTRRfix4K_F8000 0x26f
+#define MSR_MTRRphysBase(reg) (0x200 + 2 * (reg))
+#define MSR_MTRRphysMask(reg) (0x200 + 2 * (reg) + 1)
+
+#define MSR_MTRRfix64K_00000 0x250
+#define MSR_MTRRfix16K_80000 0x258
+#define MSR_MTRRfix16K_A0000 0x259
+#define MSR_MTRRfix4K_C0000 0x268
+#define MSR_MTRRfix4K_C8000 0x269
+#define MSR_MTRRfix4K_D0000 0x26a
+#define MSR_MTRRfix4K_D8000 0x26b
+#define MSR_MTRRfix4K_E0000 0x26c
+#define MSR_MTRRfix4K_E8000 0x26d
+#define MSR_MTRRfix4K_F0000 0x26e
+#define MSR_MTRRfix4K_F8000 0x26f
#define MSR_PAT 0x277
-#define MSR_MTRRdefType 0x2ff
+#define MSR_MTRRdefType 0x2ff
-#define MSR_MC0_CTL 0x400
-#define MSR_MC0_STATUS 0x401
-#define MSR_MC0_ADDR 0x402
-#define MSR_MC0_MISC 0x403
+#define MSR_MC0_CTL 0x400
+#define MSR_MC0_STATUS 0x401
+#define MSR_MC0_ADDR 0x402
+#define MSR_MC0_MISC 0x403
#define MSR_EFER 0xc0000080
@@ -549,24 +549,24 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS];
#define CPUID_MWAIT_IBE (1 << 1) /* Interrupts can exit capability */
#define CPUID_MWAIT_EMX (1 << 0) /* enumeration supported */
-#define EXCP00_DIVZ 0
-#define EXCP01_DB 1
-#define EXCP02_NMI 2
-#define EXCP03_INT3 3
-#define EXCP04_INTO 4
-#define EXCP05_BOUND 5
-#define EXCP06_ILLOP 6
-#define EXCP07_PREX 7
-#define EXCP08_DBLE 8
-#define EXCP09_XERR 9
-#define EXCP0A_TSS 10
-#define EXCP0B_NOSEG 11
-#define EXCP0C_STACK 12
-#define EXCP0D_GPF 13
-#define EXCP0E_PAGE 14
-#define EXCP10_COPR 16
-#define EXCP11_ALGN 17
-#define EXCP12_MCHK 18
+#define EXCP00_DIVZ 0
+#define EXCP01_DB 1
+#define EXCP02_NMI 2
+#define EXCP03_INT3 3
+#define EXCP04_INTO 4
+#define EXCP05_BOUND 5
+#define EXCP06_ILLOP 6
+#define EXCP07_PREX 7
+#define EXCP08_DBLE 8
+#define EXCP09_XERR 9
+#define EXCP0A_TSS 10
+#define EXCP0B_NOSEG 11
+#define EXCP0C_STACK 12
+#define EXCP0D_GPF 13
+#define EXCP0E_PAGE 14
+#define EXCP10_COPR 16
+#define EXCP11_ALGN 17
+#define EXCP12_MCHK 18
#define EXCP_SYSCALL 0x100 /* only happens in user only emulation
for syscall instruction */
@@ -1085,7 +1085,7 @@ static inline CPUX86State *cpu_init(const char *cpu_model)
#define cpu_gen_code cpu_x86_gen_code
#define cpu_signal_handler cpu_x86_signal_handler
#define cpu_list x86_cpu_list
-#define cpudef_setup x86_cpudef_setup
+#define cpudef_setup x86_cpudef_setup
/* MMU modes definitions */
#define MMU_MODE0_SUFFIX _kernel
--
1.8.3.1
* [Qemu-devel] [PATCH v2 2/2] kvm: migrate vPMU state
From: Paolo Bonzini @ 2013-07-25 15:05 UTC
To: qemu-devel; +Cc: gnatapov, ehabkost, afaerber
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
target-i386/cpu.h | 23 +++++++++++++
target-i386/kvm.c | 93 ++++++++++++++++++++++++++++++++++++++++++++++++---
target-i386/machine.c | 44 ++++++++++++++++++++++++
3 files changed, 155 insertions(+), 5 deletions(-)
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index d19d111..15d6d6a 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -304,6 +304,8 @@
#define MSR_TSC_ADJUST 0x0000003b
#define MSR_IA32_TSCDEADLINE 0x6e0
+#define MSR_P6_PERFCTR0 0xc1
+
#define MSR_MTRRcap 0xfe
#define MSR_MTRRcap_VCNT 8
#define MSR_MTRRcap_FIXRANGE_SUPPORT (1 << 8)
@@ -317,6 +319,8 @@
#define MSR_MCG_STATUS 0x17a
#define MSR_MCG_CTL 0x17b
+#define MSR_P6_EVNTSEL0 0x186
+
#define MSR_IA32_PERF_STATUS 0x198
#define MSR_IA32_MISC_ENABLE 0x1a0
@@ -342,6 +346,14 @@
#define MSR_MTRRdefType 0x2ff
+#define MSR_CORE_PERF_FIXED_CTR0 0x309
+#define MSR_CORE_PERF_FIXED_CTR1 0x30a
+#define MSR_CORE_PERF_FIXED_CTR2 0x30b
+#define MSR_CORE_PERF_FIXED_CTR_CTRL 0x38d
+#define MSR_CORE_PERF_GLOBAL_STATUS 0x38e
+#define MSR_CORE_PERF_GLOBAL_CTRL 0x38f
+#define MSR_CORE_PERF_GLOBAL_OVF_CTRL 0x390
+
#define MSR_MC0_CTL 0x400
#define MSR_MC0_STATUS 0x401
#define MSR_MC0_ADDR 0x402
@@ -720,6 +732,9 @@ typedef struct {
#define CPU_NB_REGS CPU_NB_REGS32
#endif
+#define MAX_FIXED_COUNTERS 3
+#define MAX_GP_COUNTERS (MSR_IA32_PERF_STATUS - MSR_P6_EVNTSEL0)
+
#define NB_MMU_MODES 3
typedef enum TPRAccess {
@@ -814,6 +829,14 @@ typedef struct CPUX86State {
uint64_t mcg_status;
uint64_t msr_ia32_misc_enable;
+ uint64_t msr_fixed_ctr_ctrl;
+ uint64_t msr_global_ctrl;
+ uint64_t msr_global_status;
+ uint64_t msr_global_ovf_ctrl;
+ uint64_t msr_fixed_counters[MAX_FIXED_COUNTERS];
+ uint64_t msr_gp_counters[MAX_GP_COUNTERS];
+ uint64_t msr_gp_evtsel[MAX_GP_COUNTERS];
+
/* exception/interrupt handling */
int error_code;
int exception_is_int;
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 3c9d10a..96ec1f4 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -71,6 +71,9 @@ static bool has_msr_misc_enable;
static bool has_msr_kvm_steal_time;
static int lm_capable_kernel;
+static bool has_msr_architectural_pmu;
+static uint32_t num_architectural_pmu_counters;
+
bool kvm_allows_irq0_override(void)
{
return !kvm_irqchip_in_kernel() || kvm_has_gsi_routing();
@@ -581,6 +584,25 @@ int kvm_arch_init_vcpu(CPUState *cs)
break;
}
}
+
+ if (limit >= 0x0a) {
+ uint32_t ver;
+
+ cpu_x86_cpuid(env, 0x0a, 0, &ver, &unused, &unused, &unused);
+ if ((ver & 0xff) > 0) {
+ has_msr_architectural_pmu = true;
+ num_architectural_pmu_counters = (ver & 0xff00) >> 8;
+
+ /* Shouldn't be more than 32, since that's the number of bits
+ * available in EBX to tell us _which_ counters are available.
+ * Play it safe.
+ */
+ if (num_architectural_pmu_counters > MAX_GP_COUNTERS) {
+ num_architectural_pmu_counters = MAX_GP_COUNTERS;
+ }
+ }
+ }
+
cpu_x86_cpuid(env, 0x80000000, 0, &limit, &unused, &unused, &unused);
for (i = 0x80000000; i <= limit; i++) {
@@ -1052,7 +1074,7 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
struct kvm_msr_entry entries[100];
} msr_data;
struct kvm_msr_entry *msrs = msr_data.entries;
- int n = 0;
+ int n = 0, i;
kvm_msr_entry_set(&msrs[n++], MSR_IA32_SYSENTER_CS, env->sysenter_cs);
kvm_msr_entry_set(&msrs[n++], MSR_IA32_SYSENTER_ESP, env->sysenter_esp);
@@ -1094,9 +1116,8 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
}
}
/*
- * The following paravirtual MSRs have side effects on the guest or are
- * too heavy for normal writeback. Limit them to reset or full state
- * updates.
+ * The following MSRs have side effects on the guest or are too heavy
+ * for normal writeback. Limit them to reset or full state updates.
*/
if (level >= KVM_PUT_RESET_STATE) {
kvm_msr_entry_set(&msrs[n++], MSR_KVM_SYSTEM_TIME,
@@ -1114,6 +1135,33 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
kvm_msr_entry_set(&msrs[n++], MSR_KVM_STEAL_TIME,
env->steal_time_msr);
}
+ if (has_msr_architectural_pmu) {
+ /* Stop the counter. */
+ kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
+ kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_GLOBAL_CTRL, 0);
+
+ /* Set the counter values. */
+ for (i = 0; i < MAX_FIXED_COUNTERS; i++) {
+ kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_FIXED_CTR0 + i,
+ env->msr_fixed_counters[i]);
+ }
+ for (i = 0; i < num_architectural_pmu_counters; i++) {
+ kvm_msr_entry_set(&msrs[n++], MSR_P6_PERFCTR0 + i,
+ env->msr_gp_counters[i]);
+ kvm_msr_entry_set(&msrs[n++], MSR_P6_EVNTSEL0 + i,
+ env->msr_gp_evtsel[i]);
+ }
+ kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_GLOBAL_STATUS,
+ env->msr_global_status);
+ kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_GLOBAL_OVF_CTRL,
+ env->msr_global_ovf_ctrl);
+
+ /* Now start the PMU. */
+ kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_FIXED_CTR_CTRL,
+ env->msr_fixed_ctr_ctrl);
+ kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_GLOBAL_CTRL,
+ env->msr_global_ctrl);
+ }
if (hyperv_hypercall_available()) {
kvm_msr_entry_set(&msrs[n++], HV_X64_MSR_GUEST_OS_ID, 0);
kvm_msr_entry_set(&msrs[n++], HV_X64_MSR_HYPERCALL, 0);
@@ -1370,6 +1418,19 @@ static int kvm_get_msrs(X86CPU *cpu)
if (has_msr_kvm_steal_time) {
msrs[n++].index = MSR_KVM_STEAL_TIME;
}
+ if (has_msr_architectural_pmu) {
+ msrs[n++].index = MSR_CORE_PERF_FIXED_CTR_CTRL;
+ msrs[n++].index = MSR_CORE_PERF_GLOBAL_CTRL;
+ msrs[n++].index = MSR_CORE_PERF_GLOBAL_STATUS;
+ msrs[n++].index = MSR_CORE_PERF_GLOBAL_OVF_CTRL;
+ for (i = 0; i < MAX_FIXED_COUNTERS; i++) {
+ msrs[n++].index = MSR_CORE_PERF_FIXED_CTR0 + i;
+ }
+ for (i = 0; i < num_architectural_pmu_counters; i++) {
+ msrs[n++].index = MSR_P6_PERFCTR0 + i;
+ msrs[n++].index = MSR_P6_EVNTSEL0 + i;
+ }
+ }
if (env->mcg_cap) {
msrs[n++].index = MSR_MCG_STATUS;
@@ -1386,7 +1447,8 @@ static int kvm_get_msrs(X86CPU *cpu)
}
for (i = 0; i < ret; i++) {
- switch (msrs[i].index) {
+ uint32_t index = msrs[i].index;
+ switch (index) {
case MSR_IA32_SYSENTER_CS:
env->sysenter_cs = msrs[i].data;
break;
@@ -1458,6 +1520,27 @@ static int kvm_get_msrs(X86CPU *cpu)
case MSR_KVM_STEAL_TIME:
env->steal_time_msr = msrs[i].data;
break;
+ case MSR_CORE_PERF_FIXED_CTR_CTRL:
+ env->msr_fixed_ctr_ctrl = msrs[i].data;
+ break;
+ case MSR_CORE_PERF_GLOBAL_CTRL:
+ env->msr_global_ctrl = msrs[i].data;
+ break;
+ case MSR_CORE_PERF_GLOBAL_STATUS:
+ env->msr_global_status = msrs[i].data;
+ break;
+ case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
+ env->msr_global_ovf_ctrl = msrs[i].data;
+ break;
+ case MSR_CORE_PERF_FIXED_CTR0 ... MSR_CORE_PERF_FIXED_CTR0 + MAX_FIXED_COUNTERS - 1:
+ env->msr_fixed_counters[index - MSR_CORE_PERF_FIXED_CTR0] = msrs[i].data;
+ break;
+ case MSR_P6_PERFCTR0 ... MSR_P6_PERFCTR0 + MAX_GP_COUNTERS - 1:
+ env->msr_gp_counters[index - MSR_P6_PERFCTR0] = msrs[i].data;
+ break;
+ case MSR_P6_EVNTSEL0 ... MSR_P6_EVNTSEL0 + MAX_GP_COUNTERS - 1:
+ env->msr_gp_evtsel[index - MSR_P6_EVNTSEL0] = msrs[i].data;
+ break;
}
}
diff --git a/target-i386/machine.c b/target-i386/machine.c
index f9ec581..076a39d 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -446,6 +446,47 @@ static const VMStateDescription vmstate_msr_ia32_misc_enable = {
}
};
+static bool pmu_enable_needed(void *opaque)
+{
+ X86CPU *cpu = opaque;
+ CPUX86State *env = &cpu->env;
+ int i;
+
+ if (env->msr_fixed_ctr_ctrl || env->msr_global_ctrl ||
+ env->msr_global_status || env->msr_global_ovf_ctrl) {
+ return true;
+ }
+ for (i = 0; i < MAX_FIXED_COUNTERS; i++) {
+ if (env->msr_fixed_counters[i]) {
+ return true;
+ }
+ }
+ for (i = 0; i < MAX_GP_COUNTERS; i++) {
+ if (env->msr_gp_counters[i] || env->msr_gp_evtsel[i]) {
+ return true;
+ }
+ }
+
+ return false;
+}
+
+static const VMStateDescription vmstate_msr_architectural_pmu = {
+ .name = "cpu/msr_architectural_pmu",
+ .version_id = 1,
+ .minimum_version_id = 1,
+ .minimum_version_id_old = 1,
+ .fields = (VMStateField []) {
+ VMSTATE_UINT64(env.msr_fixed_ctr_ctrl, X86CPU),
+ VMSTATE_UINT64(env.msr_global_ctrl, X86CPU),
+ VMSTATE_UINT64(env.msr_global_status, X86CPU),
+ VMSTATE_UINT64(env.msr_global_ovf_ctrl, X86CPU),
+ VMSTATE_UINT64_ARRAY(env.msr_fixed_counters, X86CPU, MAX_FIXED_COUNTERS),
+ VMSTATE_UINT64_ARRAY(env.msr_gp_counters, X86CPU, MAX_GP_COUNTERS),
+ VMSTATE_UINT64_ARRAY(env.msr_gp_evtsel, X86CPU, MAX_GP_COUNTERS),
+ VMSTATE_END_OF_LIST()
+ }
+};
+
const VMStateDescription vmstate_x86_cpu = {
.name = "cpu",
.version_id = 12,
@@ -571,6 +612,9 @@ const VMStateDescription vmstate_x86_cpu = {
}, {
.vmsd = &vmstate_msr_ia32_misc_enable,
.needed = misc_enable_needed,
+ }, {
+ .vmsd = &vmstate_msr_architectural_pmu,
+ .needed = pmu_enable_needed,
} , {
/* empty */
}
--
1.8.3.1
* Re: [Qemu-devel] [PATCH v2 2/2] kvm: migrate vPMU state
From: Gleb Natapov @ 2013-07-28 12:57 UTC
To: Paolo Bonzini; +Cc: qemu-devel, afaerber, ehabkost
On Thu, Jul 25, 2013 at 05:05:22PM +0200, Paolo Bonzini wrote:
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> target-i386/cpu.h | 23 +++++++++++++
> target-i386/kvm.c | 93 ++++++++++++++++++++++++++++++++++++++++++++++++---
> target-i386/machine.c | 44 ++++++++++++++++++++++++
> 3 files changed, 155 insertions(+), 5 deletions(-)
>
> diff --git a/target-i386/cpu.h b/target-i386/cpu.h
> index d19d111..15d6d6a 100644
> --- a/target-i386/cpu.h
> +++ b/target-i386/cpu.h
> @@ -304,6 +304,8 @@
> #define MSR_TSC_ADJUST 0x0000003b
> #define MSR_IA32_TSCDEADLINE 0x6e0
>
> +#define MSR_P6_PERFCTR0 0xc1
> +
> #define MSR_MTRRcap 0xfe
> #define MSR_MTRRcap_VCNT 8
> #define MSR_MTRRcap_FIXRANGE_SUPPORT (1 << 8)
> @@ -317,6 +319,8 @@
> #define MSR_MCG_STATUS 0x17a
> #define MSR_MCG_CTL 0x17b
>
> +#define MSR_P6_EVNTSEL0 0x186
> +
> #define MSR_IA32_PERF_STATUS 0x198
>
> #define MSR_IA32_MISC_ENABLE 0x1a0
> @@ -342,6 +346,14 @@
>
> #define MSR_MTRRdefType 0x2ff
>
> +#define MSR_CORE_PERF_FIXED_CTR0 0x309
> +#define MSR_CORE_PERF_FIXED_CTR1 0x30a
> +#define MSR_CORE_PERF_FIXED_CTR2 0x30b
> +#define MSR_CORE_PERF_FIXED_CTR_CTRL 0x38d
> +#define MSR_CORE_PERF_GLOBAL_STATUS 0x38e
> +#define MSR_CORE_PERF_GLOBAL_CTRL 0x38f
> +#define MSR_CORE_PERF_GLOBAL_OVF_CTRL 0x390
> +
> #define MSR_MC0_CTL 0x400
> #define MSR_MC0_STATUS 0x401
> #define MSR_MC0_ADDR 0x402
> @@ -720,6 +732,9 @@ typedef struct {
> #define CPU_NB_REGS CPU_NB_REGS32
> #endif
>
> +#define MAX_FIXED_COUNTERS 3
> +#define MAX_GP_COUNTERS (MSR_IA32_PERF_STATUS - MSR_P6_EVNTSEL0)
> +
> #define NB_MMU_MODES 3
>
> typedef enum TPRAccess {
> @@ -814,6 +829,14 @@ typedef struct CPUX86State {
> uint64_t mcg_status;
> uint64_t msr_ia32_misc_enable;
>
> + uint64_t msr_fixed_ctr_ctrl;
> + uint64_t msr_global_ctrl;
> + uint64_t msr_global_status;
> + uint64_t msr_global_ovf_ctrl;
> + uint64_t msr_fixed_counters[MAX_FIXED_COUNTERS];
> + uint64_t msr_gp_counters[MAX_GP_COUNTERS];
> + uint64_t msr_gp_evtsel[MAX_GP_COUNTERS];
> +
> /* exception/interrupt handling */
> int error_code;
> int exception_is_int;
> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> index 3c9d10a..96ec1f4 100644
> --- a/target-i386/kvm.c
> +++ b/target-i386/kvm.c
> @@ -71,6 +71,9 @@ static bool has_msr_misc_enable;
> static bool has_msr_kvm_steal_time;
> static int lm_capable_kernel;
>
> +static bool has_msr_architectural_pmu;
> +static uint32_t num_architectural_pmu_counters;
> +
> bool kvm_allows_irq0_override(void)
> {
> return !kvm_irqchip_in_kernel() || kvm_has_gsi_routing();
> @@ -581,6 +584,25 @@ int kvm_arch_init_vcpu(CPUState *cs)
> break;
> }
> }
> +
> + if (limit >= 0x0a) {
> + uint32_t ver;
> +
> + cpu_x86_cpuid(env, 0x0a, 0, &ver, &unused, &unused, &unused);
> + if ((ver & 0xff) > 0) {
> + has_msr_architectural_pmu = true;
> + num_architectural_pmu_counters = (ver & 0xff00) >> 8;
> +
> + /* Shouldn't be more than 32, since that's the number of bits
> + * available in EBX to tell us _which_ counters are available.
> + * Play it safe.
> + */
> + if (num_architectural_pmu_counters > MAX_GP_COUNTERS) {
> + num_architectural_pmu_counters = MAX_GP_COUNTERS;
> + }
> + }
> + }
> +
> cpu_x86_cpuid(env, 0x80000000, 0, &limit, &unused, &unused, &unused);
>
> for (i = 0x80000000; i <= limit; i++) {
> @@ -1052,7 +1074,7 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
> struct kvm_msr_entry entries[100];
> } msr_data;
> struct kvm_msr_entry *msrs = msr_data.entries;
> - int n = 0;
> + int n = 0, i;
>
> kvm_msr_entry_set(&msrs[n++], MSR_IA32_SYSENTER_CS, env->sysenter_cs);
> kvm_msr_entry_set(&msrs[n++], MSR_IA32_SYSENTER_ESP, env->sysenter_esp);
> @@ -1094,9 +1116,8 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
> }
> }
> /*
> - * The following paravirtual MSRs have side effects on the guest or are
> - * too heavy for normal writeback. Limit them to reset or full state
> - * updates.
> + * The following MSRs have side effects on the guest or are too heavy
> + * for normal writeback. Limit them to reset or full state updates.
> */
> if (level >= KVM_PUT_RESET_STATE) {
> kvm_msr_entry_set(&msrs[n++], MSR_KVM_SYSTEM_TIME,
> @@ -1114,6 +1135,33 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
> kvm_msr_entry_set(&msrs[n++], MSR_KVM_STEAL_TIME,
> env->steal_time_msr);
> }
> + if (has_msr_architectural_pmu) {
> + /* Stop the counter. */
> + kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
> + kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_GLOBAL_CTRL, 0);
> +
Why is this needed?
> + /* Set the counter values. */
> + for (i = 0; i < MAX_FIXED_COUNTERS; i++) {
> + kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_FIXED_CTR0 + i,
> + env->msr_fixed_counters[i]);
> + }
> + for (i = 0; i < num_architectural_pmu_counters; i++) {
> + kvm_msr_entry_set(&msrs[n++], MSR_P6_PERFCTR0 + i,
> + env->msr_gp_counters[i]);
> + kvm_msr_entry_set(&msrs[n++], MSR_P6_EVNTSEL0 + i,
> + env->msr_gp_evtsel[i]);
> + }
> + kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_GLOBAL_STATUS,
> + env->msr_global_status);
> + kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_GLOBAL_OVF_CTRL,
> + env->msr_global_ovf_ctrl);
> +
> + /* Now start the PMU. */
> + kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_FIXED_CTR_CTRL,
> + env->msr_fixed_ctr_ctrl);
> + kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_GLOBAL_CTRL,
> + env->msr_global_ctrl);
> + }
> if (hyperv_hypercall_available()) {
> kvm_msr_entry_set(&msrs[n++], HV_X64_MSR_GUEST_OS_ID, 0);
> kvm_msr_entry_set(&msrs[n++], HV_X64_MSR_HYPERCALL, 0);
> @@ -1370,6 +1418,19 @@ static int kvm_get_msrs(X86CPU *cpu)
> if (has_msr_kvm_steal_time) {
> msrs[n++].index = MSR_KVM_STEAL_TIME;
> }
> + if (has_msr_architectural_pmu) {
> + msrs[n++].index = MSR_CORE_PERF_FIXED_CTR_CTRL;
> + msrs[n++].index = MSR_CORE_PERF_GLOBAL_CTRL;
> + msrs[n++].index = MSR_CORE_PERF_GLOBAL_STATUS;
> + msrs[n++].index = MSR_CORE_PERF_GLOBAL_OVF_CTRL;
> + for (i = 0; i < MAX_FIXED_COUNTERS; i++) {
> + msrs[n++].index = MSR_CORE_PERF_FIXED_CTR0 + i;
> + }
> + for (i = 0; i < num_architectural_pmu_counters; i++) {
> + msrs[n++].index = MSR_P6_PERFCTR0 + i;
> + msrs[n++].index = MSR_P6_EVNTSEL0 + i;
> + }
> + }
>
> if (env->mcg_cap) {
> msrs[n++].index = MSR_MCG_STATUS;
> @@ -1386,7 +1447,8 @@ static int kvm_get_msrs(X86CPU *cpu)
> }
>
> for (i = 0; i < ret; i++) {
> - switch (msrs[i].index) {
> + uint32_t index = msrs[i].index;
> + switch (index) {
> case MSR_IA32_SYSENTER_CS:
> env->sysenter_cs = msrs[i].data;
> break;
> @@ -1458,6 +1520,27 @@ static int kvm_get_msrs(X86CPU *cpu)
> case MSR_KVM_STEAL_TIME:
> env->steal_time_msr = msrs[i].data;
> break;
> + case MSR_CORE_PERF_FIXED_CTR_CTRL:
> + env->msr_fixed_ctr_ctrl = msrs[i].data;
> + break;
> + case MSR_CORE_PERF_GLOBAL_CTRL:
> + env->msr_global_ctrl = msrs[i].data;
> + break;
> + case MSR_CORE_PERF_GLOBAL_STATUS:
> + env->msr_global_status = msrs[i].data;
> + break;
> + case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
> + env->msr_global_ovf_ctrl = msrs[i].data;
> + break;
> + case MSR_CORE_PERF_FIXED_CTR0 ... MSR_CORE_PERF_FIXED_CTR0 + MAX_FIXED_COUNTERS - 1:
> + env->msr_fixed_counters[index - MSR_CORE_PERF_FIXED_CTR0] = msrs[i].data;
> + break;
> + case MSR_P6_PERFCTR0 ... MSR_P6_PERFCTR0 + MAX_GP_COUNTERS - 1:
> + env->msr_gp_counters[index - MSR_P6_PERFCTR0] = msrs[i].data;
> + break;
> + case MSR_P6_EVNTSEL0 ... MSR_P6_EVNTSEL0 + MAX_GP_COUNTERS - 1:
> + env->msr_gp_evtsel[index - MSR_P6_EVNTSEL0] = msrs[i].data;
> + break;
> }
> }
>
> diff --git a/target-i386/machine.c b/target-i386/machine.c
> index f9ec581..076a39d 100644
> --- a/target-i386/machine.c
> +++ b/target-i386/machine.c
> @@ -446,6 +446,47 @@ static const VMStateDescription vmstate_msr_ia32_misc_enable = {
> }
> };
>
> +static bool pmu_enable_needed(void *opaque)
> +{
> + X86CPU *cpu = opaque;
> + CPUX86State *env = &cpu->env;
> + int i;
> +
> + if (env->msr_fixed_ctr_ctrl || env->msr_global_ctrl ||
> + env->msr_global_status || env->msr_global_ovf_ctrl) {
> + return true;
> + }
> + for (i = 0; i < MAX_FIXED_COUNTERS; i++) {
> + if (env->msr_fixed_counters[i]) {
> + return true;
> + }
> + }
> + for (i = 0; i < MAX_GP_COUNTERS; i++) {
> + if (env->msr_gp_counters[i] || env->msr_gp_evtsel[i]) {
> + return true;
> + }
> + }
> +
> + return false;
> +}
> +
> +static const VMStateDescription vmstate_msr_architectural_pmu = {
> + .name = "cpu/msr_architectural_pmu",
> + .version_id = 1,
> + .minimum_version_id = 1,
> + .minimum_version_id_old = 1,
> + .fields = (VMStateField []) {
> + VMSTATE_UINT64(env.msr_fixed_ctr_ctrl, X86CPU),
> + VMSTATE_UINT64(env.msr_global_ctrl, X86CPU),
> + VMSTATE_UINT64(env.msr_global_status, X86CPU),
> + VMSTATE_UINT64(env.msr_global_ovf_ctrl, X86CPU),
> + VMSTATE_UINT64_ARRAY(env.msr_fixed_counters, X86CPU, MAX_FIXED_COUNTERS),
> + VMSTATE_UINT64_ARRAY(env.msr_gp_counters, X86CPU, MAX_GP_COUNTERS),
> + VMSTATE_UINT64_ARRAY(env.msr_gp_evtsel, X86CPU, MAX_GP_COUNTERS),
> + VMSTATE_END_OF_LIST()
> + }
> +};
> +
> const VMStateDescription vmstate_x86_cpu = {
> .name = "cpu",
> .version_id = 12,
> @@ -571,6 +612,9 @@ const VMStateDescription vmstate_x86_cpu = {
> }, {
> .vmsd = &vmstate_msr_ia32_misc_enable,
> .needed = misc_enable_needed,
> + }, {
> + .vmsd = &vmstate_msr_architectural_pmu,
> + .needed = pmu_enable_needed,
> } , {
> /* empty */
> }
> --
> 1.8.3.1
--
Gleb.
* Re: [Qemu-devel] [PATCH v2 0/2] kvm: migrate vPMU state
From: Andreas Färber @ 2013-07-28 13:03 UTC
To: Paolo Bonzini, gnatapov; +Cc: qemu-devel, ehabkost
On 25.07.2013 17:05, Paolo Bonzini wrote:
> Same code as v1, but with all tabs expunged from cpu.h.
>
> Paolo Bonzini (2):
> target-i386: remove tabs from target-i386/cpu.h
> kvm: migrate vPMU state
>
> target-i386/cpu.h | 215 ++++++++++++++++++++++++++++----------------------
> target-i386/kvm.c | 93 ++++++++++++++++++++--
> target-i386/machine.c | 44 +++++++++++
> 3 files changed, 251 insertions(+), 101 deletions(-)
Gleb, I am content with v2 and don't have conflicting changes, so please
take it through the KVM queue once you guys are happy.
Thanks,
Andreas
--
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
* Re: [Qemu-devel] [PATCH v2 0/2] kvm: migrate vPMU state
From: Gleb Natapov @ 2013-07-28 13:05 UTC
To: Andreas Färber; +Cc: Paolo Bonzini, qemu-devel, ehabkost
On Sun, Jul 28, 2013 at 03:03:50PM +0200, Andreas Färber wrote:
> On 25.07.2013 17:05, Paolo Bonzini wrote:
> > Same code as v1, but with all tabs expunged from cpu.h.
> >
> > Paolo Bonzini (2):
> > target-i386: remove tabs from target-i386/cpu.h
> > kvm: migrate vPMU state
> >
> > target-i386/cpu.h | 215 ++++++++++++++++++++++++++++----------------------
> > target-i386/kvm.c | 93 ++++++++++++++++++++--
> > target-i386/machine.c | 44 +++++++++++
> > 3 files changed, 251 insertions(+), 101 deletions(-)
>
> Gleb, I am content with v2 and don't have conflicting changes, so please
> take it through the KVM queue once you guys are happy.
>
Thanks, I or Paolo will do.
--
Gleb.
* Re: [Qemu-devel] [PATCH v2 2/2] kvm: migrate vPMU state
From: Paolo Bonzini @ 2013-07-28 13:51 UTC
To: Gleb Natapov; +Cc: qemu-devel, ehabkost, afaerber
On 28/07/2013 14:57, Gleb Natapov wrote:
>> @@ -1114,6 +1135,33 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>> kvm_msr_entry_set(&msrs[n++], MSR_KVM_STEAL_TIME,
>> env->steal_time_msr);
>> }
>> + if (has_msr_architectural_pmu) {
>> + /* Stop the counter. */
>> + kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
>> + kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_GLOBAL_CTRL, 0);
>> +
> Why is this needed?
In v1 it was in the commit message. I'll fix it up before applying:
> Second, to avoid any possible side effects during the setting of MSRs
> I stop the PMU while setting the counters and event selector MSRs.
> Stopping the PMU snapshots the counters and ensures that no strange
> races can happen if the counters were saved close to their overflow
> value.
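In outline, the hunk above encodes the following sequence (a sketch using
the same helpers as the patch; the step numbering is only illustration):

/* 1. Stop the PMU so that loading the counters has no side effects. */
kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_GLOBAL_CTRL, 0);
/* 2. Load the fixed and GP counter values and the event selectors. */
/* 3. Restore GLOBAL_STATUS and GLOBAL_OVF_CTRL while everything is
 *    stopped, so a counter saved close to overflow cannot flip a
 *    status bit during the restore. */
/* 4. Restart the PMU with the guest's FIXED_CTR_CTRL and GLOBAL_CTRL. */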
Paolo
* Re: [Qemu-devel] [PATCH v2 2/2] kvm: migrate vPMU state
From: Gleb Natapov @ 2013-07-28 13:54 UTC
To: Paolo Bonzini; +Cc: qemu-devel, ehabkost, afaerber
On Sun, Jul 28, 2013 at 03:51:25PM +0200, Paolo Bonzini wrote:
> On 28/07/2013 14:57, Gleb Natapov wrote:
> >> @@ -1114,6 +1135,33 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
> >> kvm_msr_entry_set(&msrs[n++], MSR_KVM_STEAL_TIME,
> >> env->steal_time_msr);
> >> }
> >> + if (has_msr_architectural_pmu) {
> >> + /* Stop the counter. */
> >> + kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
> >> + kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_GLOBAL_CTRL, 0);
> >> +
> > Why is this needed?
>
> In v1 it was in the commit message. I'll fix it up before applying:
>
> > Second, to avoid any possible side effects during the setting of MSRs
> > I stop the PMU while setting the counters and event selector MSRs.
> > Stopping the PMU snapshots the counters and ensures that no strange
> > races can happen if the counters were saved close to their overflow
> > value.
>
Since the vcpu is not running, the counters should not count anyway.
--
Gleb.
* Re: [Qemu-devel] [PATCH v2 2/2] kvm: migrate vPMU state
From: Paolo Bonzini @ 2013-07-28 14:07 UTC
To: Gleb Natapov; +Cc: qemu-devel, afaerber, ehabkost
On 28/07/2013 15:54, Gleb Natapov wrote:
> On Sun, Jul 28, 2013 at 03:51:25PM +0200, Paolo Bonzini wrote:
>> On 28/07/2013 14:57, Gleb Natapov wrote:
>>>> @@ -1114,6 +1135,33 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>>>> kvm_msr_entry_set(&msrs[n++], MSR_KVM_STEAL_TIME,
>>>> env->steal_time_msr);
>>>> }
>>>> + if (has_msr_architectural_pmu) {
>>>> + /* Stop the counter. */
>>>> + kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
>>>> + kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_GLOBAL_CTRL, 0);
>>>> +
>>> Why is this needed?
>>
>> In v1 it was in the commit message. I'll fix it up before applying:
>>
>>> Second, to avoid any possible side effects during the setting of MSRs
>>> I stop the PMU while setting the counters and event selector MSRs.
>>> Stopping the PMU snapshots the counters and ensures that no strange
>>> races can happen if the counters were saved close to their overflow
>>> value.
>>
> Since the vcpu is not running, the counters should not count anyway.
Does the perf event distinguish KVM_RUN from any other activity in the
vCPU thread (in which this code runs)? It seemed unsafe to me to change
the overflow status and the performance counter value while the counter
could be running, since the counter value could affect the overflow
status. Maybe I was being paranoid?
Paolo
* Re: [Qemu-devel] [PATCH v2 2/2] kvm: migrate vPMU state
From: Gleb Natapov @ 2013-07-28 14:24 UTC
To: Paolo Bonzini; +Cc: qemu-devel, afaerber, ehabkost
On Sun, Jul 28, 2013 at 04:07:37PM +0200, Paolo Bonzini wrote:
> On 28/07/2013 15:54, Gleb Natapov wrote:
> > On Sun, Jul 28, 2013 at 03:51:25PM +0200, Paolo Bonzini wrote:
> >> On 28/07/2013 14:57, Gleb Natapov wrote:
> >>>> @@ -1114,6 +1135,33 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
> >>>> kvm_msr_entry_set(&msrs[n++], MSR_KVM_STEAL_TIME,
> >>>> env->steal_time_msr);
> >>>> }
> >>>> + if (has_msr_architectural_pmu) {
> >>>> + /* Stop the counter. */
> >>>> + kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
> >>>> + kvm_msr_entry_set(&msrs[n++], MSR_CORE_PERF_GLOBAL_CTRL, 0);
> >>>> +
> >>> Why is this needed?
> >>
> >> In v1 it was in the commit message. I'll fix it up before applying:
> >>
> >>> Second, to avoid any possible side effects during the setting of MSRs
> >>> I stop the PMU while setting the counters and event selector MSRs.
> >>> Stopping the PMU snapshots the counters and ensures that no strange
> >>> races can happen if the counters were saved close to their overflow
> >>> value.
> >>
> > Since the vcpu is not running, the counters should not count anyway.
>
> Does the perf event distinguish KVM_RUN from any other activity in the
> vCPU thread (in which this code runs)? It seemed unsafe to me to change
> the overflow status and the performance counter value while the counter
> could be running, since the counter value could affect the overflow
> status. Maybe I was being paranoid?
>
KVM disables HW counters when outside of guest mode (otherwise the results
would be useless), so I do not see how the problem you describe can happen.
On the other hand, PMU emulation assumes that counters have to be disabled
while MSR_IA32_PERFCTR0 is written, since a write to MSR_IA32_PERFCTR0 does
not reprogram the perf events. So we need to either disable/enable counters
around writes to MSR_IA32_PERFCTR0, or have this patch in the kernel:
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 5c4f631..bf14e42 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -412,6 +412,8 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
if (!msr_info->host_initiated)
data = (s64)(s32)data;
pmc->counter += data - read_pmc(pmc);
+ if (msr_info->host_initiated)
+ reprogram_gp_counter(pmc, pmc->eventsel);
return 0;
} else if ((pmc = get_gp_pmc(pmu, index, MSR_P6_EVNTSEL0))) {
if (data == pmc->eventsel)
--
Gleb.
* Re: [Qemu-devel] [PATCH v2 2/2] kvm: migrate vPMU state
From: Paolo Bonzini @ 2013-08-01 13:03 UTC
To: Gleb Natapov; +Cc: qemu-devel, afaerber, ehabkost
> KVM disables HW counters when outside of guest mode (otherwise the results
> would be useless), so I do not see how the problem you describe can happen.
Yes, you're right.
> On the other hand, PMU emulation assumes that counters have to be disabled
> while MSR_IA32_PERFCTR0 is written, since a write to MSR_IA32_PERFCTR0 does
> not reprogram the perf events. So we need to either disable/enable counters
> around writes to MSR_IA32_PERFCTR0, or have this patch in the kernel:
>
> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index 5c4f631..bf14e42 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -412,6 +412,8 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> if (!msr_info->host_initiated)
> data = (s64)(s32)data;
> pmc->counter += data - read_pmc(pmc);
> + if (msr_info->host_initiated)
> + reprogram_gp_counter(pmc, pmc->eventsel);
> return 0;
> } else if ((pmc = get_gp_pmc(pmu, index, MSR_P6_EVNTSEL0))) {
> if (data == pmc->eventsel)
Why do you need "if (msr_info->host_initiated)"? I could not find any
hint in the manual that the overflow counter will still use the value
of the counter that was programmed first.
If we need to do it always, I agree it's better to modify the QEMU
patch and not disable/enable the counters. But if we need to restrict
it to host-initiated writes, I would rather have the QEMU patch as I
posted it. So far we have always had fewer side effects from host_initiated,
not more, and I think it's a good rule of thumb.
Paolo
* Re: [Qemu-devel] [PATCH v2 2/2] kvm: migrate vPMU state
From: Gleb Natapov @ 2013-08-01 13:12 UTC
To: Paolo Bonzini; +Cc: qemu-devel, afaerber, ehabkost
On Thu, Aug 01, 2013 at 03:03:12PM +0200, Paolo Bonzini wrote:
> > KVM disables HW counters when outside of guest mode (otherwise the results
> > would be useless), so I do not see how the problem you describe can happen.
>
> Yes, you're right.
>
> > On the other hand, PMU emulation assumes that counters have to be disabled
> > while MSR_IA32_PERFCTR0 is written, since a write to MSR_IA32_PERFCTR0 does
> > not reprogram the perf events. So we need to either disable/enable counters
> > around writes to MSR_IA32_PERFCTR0, or have this patch in the kernel:
> >
> > diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> > index 5c4f631..bf14e42 100644
> > --- a/arch/x86/kvm/pmu.c
> > +++ b/arch/x86/kvm/pmu.c
> > @@ -412,6 +412,8 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> > if (!msr_info->host_initiated)
> > data = (s64)(s32)data;
> > pmc->counter += data - read_pmc(pmc);
> > + if (msr_info->host_initiated)
> > + reprogram_gp_counter(pmc, pmc->eventsel);
> > return 0;
> > } else if ((pmc = get_gp_pmc(pmu, index, MSR_P6_EVNTSEL0))) {
> > if (data == pmc->eventsel)
>
> Why do you need "if (msr_info->host_initiated)"? I could not find any
> hint in the manual that the overflow counter will still use the value
> of the counter that was programmed first.
>
Not sure I understand. What does "overflow counter will still use the value
of the counter that was programmed first" mean?

Strictly speaking we do not need the "if (msr_info->host_initiated)" here;
there is no harm in calling reprogram_gp_counter() unconditionally. But the
spec says in no vague terms that a counter should be disabled before
writing to the MSR, which means that reprogram_gp_counter() will be called
again when the guest enables the counter later. So the invocation here is
redundant for guest-initiated writes, and since this happens a lot during
profiling, avoiding the call to reprogram_gp_counter() is a win.
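For illustration, this is the sequence the spec expects from a guest
(a sketch only; rdmsrl/wrmsrl and ARCH_PERFMON_EVENTSEL_ENABLE are the
Linux kernel names, and new_count is just a placeholder):

u64 evtsel;

rdmsrl(MSR_P6_EVNTSEL0, evtsel);
/* Disable the counter before writing it, as the SDM requires. */
wrmsrl(MSR_P6_EVNTSEL0, evtsel & ~ARCH_PERFMON_EVENTSEL_ENABLE);
/* Load the new count while the counter is stopped. */
wrmsrl(MSR_P6_PERFCTR0, new_count);
/* Re-enabling rewrites the event selector, and that write is what
 * makes KVM call reprogram_gp_counter() anyway. */
wrmsrl(MSR_P6_EVNTSEL0, evtsel);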
> If we need to do it always, I agree it's better to modify the QEMU
> patch and not disable/enable the counters. But if we need to restrict
> it to host-initiated writes, I would rather have the QEMU patch as I
> posted it. So far we have always had fewer side effects from host_initiated,
> not more, and I think it's a good rule of thumb.
>
I am OK with your patch; it is a little bit unfortunate that userspace
should care about such low-level details, though.
--
Gleb.
* Re: [Qemu-devel] [PATCH v2 2/2] kvm: migrate vPMU state
From: Paolo Bonzini @ 2013-08-01 13:48 UTC
To: Gleb Natapov; +Cc: qemu-devel, afaerber, ehabkost
On Aug 01 2013, Gleb Natapov wrote:
> On Thu, Aug 01, 2013 at 03:03:12PM +0200, Paolo Bonzini wrote:
> > > KVM disables HW counters when outside of guest mode (otherwise the results
> > > would be useless), so I do not see how the problem you describe can happen.
> >
> > Yes, you're right.
> >
> > > On the other hand, PMU emulation assumes that counters have to be disabled
> > > while MSR_IA32_PERFCTR0 is written, since a write to MSR_IA32_PERFCTR0 does
> > > not reprogram the perf events. So we need to either disable/enable counters
> > > around writes to MSR_IA32_PERFCTR0, or have this patch in the kernel:
> > >
> > > diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> > > index 5c4f631..bf14e42 100644
> > > --- a/arch/x86/kvm/pmu.c
> > > +++ b/arch/x86/kvm/pmu.c
> > > @@ -412,6 +412,8 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> > > if (!msr_info->host_initiated)
> > > data = (s64)(s32)data;
> > > pmc->counter += data - read_pmc(pmc);
> > > + if (msr_info->host_initiated)
> > > + reprogram_gp_counter(pmc, pmc->eventsel);
> > > return 0;
> > > } else if ((pmc = get_gp_pmc(pmu, index, MSR_P6_EVNTSEL0))) {
> > > if (data == pmc->eventsel)
> >
> > Why do you need "if (msr_info->host_initiated)"? I could not find any
> > hint in the manual that the overflow counter will still use the value
> > of the counter that was programmed first.
> >
> Not sure I understand. What does "overflow counter will still use the value
> of the counter that was programmed first" mean?
>
> the spec says in no vague terms that a counter should be disabled before
> writing to the MSR, which means that reprogram_gp_counter() will be called
> again when the guest enables the counter later.
Yeah, I found it now.
> I am OK with your patch; it is a little bit unfortunate that userspace
> should care about such low-level details, though.
Is it a Reviewed-by?
Paolo
* Re: [Qemu-devel] [PATCH v2 2/2] kvm: migrate vPMU state
To: Paolo Bonzini; +Cc: qemu-devel, afaerber, ehabkost
On Thu, Aug 01, 2013 at 03:48:29PM +0200, Paolo Bonzini wrote:
> On Aug 01 2013, Gleb Natapov wrote:
> > On Thu, Aug 01, 2013 at 03:03:12PM +0200, Paolo Bonzini wrote:
> > > > KVM disables HW counters when outside of guest mode (otherwise the results
> > > > would be useless), so I do not see how the problem you describe can happen.
> > >
> > > Yes, you're right.
> > >
> > > > On the other hand, PMU emulation assumes that counters have to be disabled
> > > > while MSR_IA32_PERFCTR0 is written, since a write to MSR_IA32_PERFCTR0 does
> > > > not reprogram the perf events. So we need to either disable/enable counters
> > > > around writes to MSR_IA32_PERFCTR0, or have this patch in the kernel:
> > > >
> > > > diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> > > > index 5c4f631..bf14e42 100644
> > > > --- a/arch/x86/kvm/pmu.c
> > > > +++ b/arch/x86/kvm/pmu.c
> > > > @@ -412,6 +412,8 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> > > > if (!msr_info->host_initiated)
> > > > data = (s64)(s32)data;
> > > > pmc->counter += data - read_pmc(pmc);
> > > > + if (msr_info->host_initiated)
> > > > + reprogram_gp_counter(pmc, pmc->eventsel);
> > > > return 0;
> > > > } else if ((pmc = get_gp_pmc(pmu, index, MSR_P6_EVNTSEL0))) {
> > > > if (data == pmc->eventsel)
> > >
> > > Why do you need "if (msr_info->host_initiated)"? I could not find any
> > > hint in the manual that the overflow counter will still use the value
> > > of the counter that was programmed first.
> > >
> > Not sure I understand. What does "overflow counter will still use the value
> > of the counter that was programmed first" mean?
> >
> > the spec says in no vague terms that a counter should be disabled before
> > writing to the MSR, which means that reprogram_gp_counter() will be called
> > again when the guest enables the counter later.
>
> Yeah, I found it now.
>
> > I am OK with your patch; it is a little bit unfortunate that userspace
> > should care about such low-level details, though.
>
> Is it a Reviewed-by?
>
Here is one :)
Reviewed-by: Gleb Natapov <gleb@redhat.com>
--
Gleb.