* [PATCHv3 01/10] KVM: Expose kvm_lapic_local_deliver()
2011-11-10 12:57 [PATCHv3 00/10] KVM in-guest performance monitoring Gleb Natapov
@ 2011-11-10 12:57 ` Gleb Natapov
2011-11-10 12:57 ` [PATCHv3 02/10] KVM: Expose a version 2 architectural PMU to a guests Gleb Natapov
` (10 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Gleb Natapov @ 2011-11-10 12:57 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel, avi, mtosatti, mingo, a.p.zijlstra, acme
From: Avi Kivity <avi@redhat.com>
Needed to deliver performance monitoring interrupts.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
arch/x86/kvm/lapic.c | 2 +-
arch/x86/kvm/lapic.h | 1 +
2 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 54abb40..e87e43e 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1120,7 +1120,7 @@ int apic_has_pending_timer(struct kvm_vcpu *vcpu)
return 0;
}
-static int kvm_apic_local_deliver(struct kvm_lapic *apic, int lvt_type)
+int kvm_apic_local_deliver(struct kvm_lapic *apic, int lvt_type)
{
u32 reg = apic_get_reg(apic, lvt_type);
int vector, mode, trig_mode;
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 138e8cc..6f4ce25 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -34,6 +34,7 @@ void kvm_apic_set_version(struct kvm_vcpu *vcpu);
int kvm_apic_match_physical_addr(struct kvm_lapic *apic, u16 dest);
int kvm_apic_match_logical_addr(struct kvm_lapic *apic, u8 mda);
int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq);
+int kvm_apic_local_deliver(struct kvm_lapic *apic, int lvt_type);
u64 kvm_get_apic_base(struct kvm_vcpu *vcpu);
void kvm_set_apic_base(struct kvm_vcpu *vcpu, u64 data);
--
1.7.7.1
^ permalink raw reply related [flat|nested] 16+ messages in thread* [PATCHv3 02/10] KVM: Expose a version 2 architectural PMU to a guests
2011-11-10 12:57 [PATCHv3 00/10] KVM in-guest performance monitoring Gleb Natapov
2011-11-10 12:57 ` [PATCHv3 01/10] KVM: Expose kvm_lapic_local_deliver() Gleb Natapov
@ 2011-11-10 12:57 ` Gleb Natapov
2011-11-10 12:57 ` [PATCHv3 03/10] KVM: Add generic RDPMC support Gleb Natapov
` (9 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Gleb Natapov @ 2011-11-10 12:57 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel, avi, mtosatti, mingo, a.p.zijlstra, acme
Use perf_events to emulate an architectural PMU, version 2.
Based on PMU version 1 emulation by Avi Kivity.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
arch/x86/include/asm/kvm_host.h | 48 ++++
arch/x86/kvm/Kconfig | 1 +
arch/x86/kvm/Makefile | 2 +-
arch/x86/kvm/pmu.c | 531 +++++++++++++++++++++++++++++++++++++++
arch/x86/kvm/x86.c | 24 ++-
include/linux/kvm_host.h | 2 +
6 files changed, 598 insertions(+), 10 deletions(-)
create mode 100644 arch/x86/kvm/pmu.c
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6d83264..5807a49 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -16,10 +16,12 @@
#include <linux/mmu_notifier.h>
#include <linux/tracepoint.h>
#include <linux/cpumask.h>
+#include <linux/irq_work.h>
#include <linux/kvm.h>
#include <linux/kvm_para.h>
#include <linux/kvm_types.h>
+#include <linux/perf_event.h>
#include <asm/pvclock-abi.h>
#include <asm/desc.h>
@@ -289,6 +291,37 @@ struct kvm_mmu {
u64 pdptrs[4]; /* pae */
};
+enum pmc_type {
+ KVM_PMC_GP = 0,
+ KVM_PMC_FIXED,
+};
+
+struct kvm_pmc {
+ enum pmc_type type;
+ u8 idx;
+ u64 counter;
+ u64 eventsel;
+ struct perf_event *perf_event;
+ struct kvm_vcpu *vcpu;
+};
+
+struct kvm_pmu {
+ unsigned nr_arch_gp_counters;
+ unsigned nr_arch_fixed_counters;
+ unsigned available_event_types;
+ u64 fixed_ctr_ctrl;
+ u64 global_ctrl;
+ u64 global_status;
+ u64 global_ovf_ctrl;
+ u64 counter_bitmask[2];
+ u64 global_ctrl_mask;
+ u8 version;
+ struct kvm_pmc gp_counters[X86_PMC_MAX_GENERIC];
+ struct kvm_pmc fixed_counters[X86_PMC_MAX_FIXED];
+ struct irq_work irq_work;
+ u64 reprogram_pmi;
+};
+
struct kvm_vcpu_arch {
/*
* rip and regs accesses must go through
@@ -422,6 +455,8 @@ struct kvm_vcpu_arch {
unsigned access;
gfn_t mmio_gfn;
+ struct kvm_pmu pmu;
+
/* used for guest single stepping over the given code position */
unsigned long singlestep_rip;
@@ -881,4 +916,17 @@ extern bool kvm_find_async_pf_gfn(struct kvm_vcpu *vcpu, gfn_t gfn);
void kvm_complete_insn_gp(struct kvm_vcpu *vcpu, int err);
+int kvm_is_in_guest(void);
+
+void kvm_pmu_init(struct kvm_vcpu *vcpu);
+void kvm_pmu_destroy(struct kvm_vcpu *vcpu);
+void kvm_pmu_reset(struct kvm_vcpu *vcpu);
+void kvm_pmu_cpuid_update(struct kvm_vcpu *vcpu);
+bool kvm_pmu_msr(struct kvm_vcpu *vcpu, u32 msr);
+int kvm_pmu_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *data);
+int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, u32 msr, u64 data);
+int kvm_pmu_read_pmc(struct kvm_vcpu *vcpu, unsigned pmc, u64 *data);
+void kvm_handle_pmu_event(struct kvm_vcpu *vcpu);
+void kvm_deliver_pmi(struct kvm_vcpu *vcpu);
+
#endif /* _ASM_X86_KVM_HOST_H */
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index ff5790d..c27dd11 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -35,6 +35,7 @@ config KVM
select KVM_MMIO
select TASKSTATS
select TASK_DELAY_ACCT
+ select PERF_EVENTS
---help---
Support hosting fully virtualized guest machines using hardware
virtualization extensions. You will need a fairly recent
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index f15501f..cfca03f 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -12,7 +12,7 @@ kvm-$(CONFIG_IOMMU_API) += $(addprefix ../../../virt/kvm/, iommu.o)
kvm-$(CONFIG_KVM_ASYNC_PF) += $(addprefix ../../../virt/kvm/, async_pf.o)
kvm-y += x86.o mmu.o emulate.o i8259.o irq.o lapic.o \
- i8254.o timer.o
+ i8254.o timer.o pmu.o
kvm-intel-y += vmx.o
kvm-amd-y += svm.o
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
new file mode 100644
index 0000000..1888aa4
--- /dev/null
+++ b/arch/x86/kvm/pmu.c
@@ -0,0 +1,531 @@
+/*
+ * Kernel-based Virtual Machine -- Performane Monitoring Unit support
+ *
+ * Copyright 2011 Red Hat, Inc. and/or its affiliates.
+ *
+ * Authors:
+ * Avi Kivity <avi@redhat.com>
+ * Gleb Natapov <gleb@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include <linux/types.h>
+#include <linux/kvm_host.h>
+#include <linux/perf_event.h>
+#include "x86.h"
+#include "lapic.h"
+
+static struct kvm_arch_event_perf_mapping {
+ u8 eventsel;
+ u8 unit_mask;
+ unsigned event_type;
+ bool inexact;
+} arch_events[] = {
+ /* Index must match CPUID 0x0A.EBX bit vector */
+ [0] = { 0x3c, 0x00, PERF_COUNT_HW_CPU_CYCLES },
+ [1] = { 0xc0, 0x00, PERF_COUNT_HW_INSTRUCTIONS },
+ [2] = { 0x3c, 0x01, PERF_COUNT_HW_BUS_CYCLES },
+ [3] = { 0x2e, 0x4f, PERF_COUNT_HW_CACHE_REFERENCES },
+ [4] = { 0x2e, 0x41, PERF_COUNT_HW_CACHE_MISSES },
+ [5] = { 0xc4, 0x00, PERF_COUNT_HW_BRANCH_INSTRUCTIONS },
+ [6] = { 0xc5, 0x00, PERF_COUNT_HW_BRANCH_MISSES },
+};
+
+/* mapping between fixed pmc index and arch_events array */
+int fixed_pmc_events[] = {1, 0, 2};
+
+static bool pmc_is_gp(struct kvm_pmc *pmc)
+{
+ return pmc->type == KVM_PMC_GP;
+}
+
+static inline u64 pmc_bitmask(struct kvm_pmc *pmc)
+{
+ struct kvm_pmu *pmu = &pmc->vcpu->arch.pmu;
+
+ return pmu->counter_bitmask[pmc->type];
+}
+
+static inline bool pmc_enabled(struct kvm_pmc *pmc)
+{
+ struct kvm_pmu *pmu = &pmc->vcpu->arch.pmu;
+ return test_bit(pmc->idx, (unsigned long *)&pmu->global_ctrl);
+}
+
+static inline struct kvm_pmc *get_gp_pmc(struct kvm_pmu *pmu, u32 msr,
+ u32 base)
+{
+ if (msr >= base && msr < base + pmu->nr_arch_gp_counters)
+ return &pmu->gp_counters[msr - base];
+ return NULL;
+}
+
+static inline struct kvm_pmc *get_fixed_pmc(struct kvm_pmu *pmu, u32 msr)
+{
+ int base = MSR_CORE_PERF_FIXED_CTR0;
+ if (msr >= base && msr < base + pmu->nr_arch_fixed_counters)
+ return &pmu->fixed_counters[msr - base];
+ return NULL;
+}
+
+static inline struct kvm_pmc *get_fixed_pmc_idx(struct kvm_pmu *pmu, int idx)
+{
+ return get_fixed_pmc(pmu, MSR_CORE_PERF_FIXED_CTR0 + idx);
+}
+
+static struct kvm_pmc *global_idx_to_pmc(struct kvm_pmu *pmu, int idx)
+{
+ if (idx < X86_PMC_IDX_FIXED)
+ return get_gp_pmc(pmu, MSR_P6_EVNTSEL0 + idx, MSR_P6_EVNTSEL0);
+ else
+ return get_fixed_pmc_idx(pmu, idx - X86_PMC_IDX_FIXED);
+}
+
+void kvm_deliver_pmi(struct kvm_vcpu *vcpu)
+{
+ if (vcpu->arch.apic)
+ kvm_apic_local_deliver(vcpu->arch.apic, APIC_LVTPC);
+}
+
+static void trigger_pmi(struct irq_work *irq_work)
+{
+ struct kvm_pmu *pmu = container_of(irq_work, struct kvm_pmu,
+ irq_work);
+ struct kvm_vcpu *vcpu = container_of(pmu, struct kvm_vcpu,
+ arch.pmu);
+
+ kvm_deliver_pmi(vcpu);
+}
+
+static void kvm_perf_overflow(struct perf_event *perf_event,
+ struct perf_sample_data *data,
+ struct pt_regs *regs)
+{
+ struct kvm_pmc *pmc = perf_event->overflow_handler_context;
+ struct kvm_pmu *pmu = &pmc->vcpu->arch.pmu;
+ __set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
+}
+
+static void kvm_perf_overflow_intr(struct perf_event *perf_event,
+ struct perf_sample_data *data, struct pt_regs *regs)
+{
+ struct kvm_pmc *pmc = perf_event->overflow_handler_context;
+ struct kvm_pmu *pmu = &pmc->vcpu->arch.pmu;
+ if (!test_and_set_bit(pmc->idx, (unsigned long *)&pmu->reprogram_pmi)) {
+ kvm_perf_overflow(perf_event, data, regs);
+ kvm_make_request(KVM_REQ_PMU, pmc->vcpu);
+ /*
+ * Inject PMI. If vcpu was in a guest mode during NMI PMI
+ * can be ejected on a guest mode re-entry. Otherwise we can't
+ * be sure that vcpu wasn't executing hlt instruction at the
+ * time of vmexit and is not going to re-enter guest mode until,
+ * woken up. So we should wake it, but this is impossible from
+ * NMI context. Do it from irq work instead.
+ */
+ if (!kvm_is_in_guest())
+ irq_work_queue(&pmc->vcpu->arch.pmu.irq_work);
+ else
+ kvm_make_request(KVM_REQ_PMI, pmc->vcpu);
+ }
+}
+
+static u64 read_pmc(struct kvm_pmc *pmc)
+{
+ u64 counter, enabled, running;
+
+ counter = pmc->counter;
+
+ if (pmc->perf_event)
+ counter += perf_event_read_value(pmc->perf_event,
+ &enabled, &running);
+
+ /* FIXME: Scaling needed? */
+
+ return counter & pmc_bitmask(pmc);
+}
+
+static void stop_counter(struct kvm_pmc *pmc)
+{
+ if (pmc->perf_event) {
+ pmc->counter = read_pmc(pmc);
+ perf_event_release_kernel(pmc->perf_event);
+ pmc->perf_event = NULL;
+ }
+}
+
+static void reprogram_counter(struct kvm_pmc *pmc, u32 type,
+ unsigned config, bool exclude_user, bool exclude_kernel,
+ bool intr)
+{
+ struct perf_event *event;
+ struct perf_event_attr attr = {
+ .type = type,
+ .size = sizeof(attr),
+ .pinned = true,
+ .exclude_idle = true,
+ .exclude_host = 1,
+ .exclude_user = exclude_user,
+ .exclude_kernel = exclude_kernel,
+ .sample_period = (-pmc->counter) & pmc_bitmask(pmc),
+ .config = config,
+ };
+
+ event = perf_event_create_kernel_counter(&attr, -1, current,
+ intr ? kvm_perf_overflow_intr :
+ kvm_perf_overflow, pmc);
+ if (IS_ERR(event)) {
+ printk_once("kvm: pmu event creation failed %ld\n",
+ PTR_ERR(event));
+ return;
+ }
+
+ pmc->perf_event = event;
+ clear_bit(pmc->idx, (unsigned long*)&pmc->vcpu->arch.pmu.reprogram_pmi);
+}
+
+static unsigned find_arch_event(struct kvm_pmu *pmu, u8 event_select,
+ u8 unit_mask)
+{
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(arch_events); i++)
+ if (arch_events[i].eventsel == event_select
+ && arch_events[i].unit_mask == unit_mask
+ && (pmu->available_event_types & (1 << i)))
+ break;
+
+ if (i == ARRAY_SIZE(arch_events))
+ return PERF_COUNT_HW_MAX;
+
+ return arch_events[i].event_type;
+}
+
+static void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel)
+{
+ unsigned config, type = PERF_TYPE_RAW;
+ u8 event_select, unit_mask;
+
+ pmc->eventsel = eventsel;
+
+ stop_counter(pmc);
+
+ if (!(eventsel & ARCH_PERFMON_EVENTSEL_ENABLE) || !pmc_enabled(pmc))
+ return;
+
+ event_select = eventsel & ARCH_PERFMON_EVENTSEL_EVENT;
+ unit_mask = (eventsel & ARCH_PERFMON_EVENTSEL_UMASK) >> 8;
+
+ if (!(event_select & (ARCH_PERFMON_EVENTSEL_EDGE |
+ ARCH_PERFMON_EVENTSEL_INV |
+ ARCH_PERFMON_EVENTSEL_CMASK))) {
+ config = find_arch_event(&pmc->vcpu->arch.pmu, event_select,
+ unit_mask);
+ if (config != PERF_COUNT_HW_MAX)
+ type = PERF_TYPE_HARDWARE;
+ }
+
+ if (type == PERF_TYPE_RAW)
+ config = eventsel & X86_RAW_EVENT_MASK;
+
+ reprogram_counter(pmc, type, config,
+ !(eventsel & ARCH_PERFMON_EVENTSEL_USR),
+ !(eventsel & ARCH_PERFMON_EVENTSEL_OS),
+ eventsel & ARCH_PERFMON_EVENTSEL_INT);
+}
+
+static void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 en_pmi, int idx)
+{
+ unsigned en = en_pmi & 0x3;
+ bool pmi = en_pmi & 0x8;
+
+ stop_counter(pmc);
+
+ if (!en || !pmc_enabled(pmc))
+ return;
+
+ reprogram_counter(pmc, PERF_TYPE_HARDWARE,
+ arch_events[fixed_pmc_events[idx]].event_type,
+ !(en & 0x2), /* exclude user */
+ !(en & 0x1), /* exclude kernel */
+ pmi);
+}
+
+static inline u8 fixed_en_pmi(u64 ctrl, int idx)
+{
+ return (ctrl >> (idx * 4)) & 0xf;
+}
+
+static void reprogram_fixed_counters(struct kvm_pmu *pmu, u64 data)
+{
+ int i;
+
+ for (i = 0; i < pmu->nr_arch_fixed_counters; i++) {
+ u8 en_pmi = fixed_en_pmi(data, i);
+ struct kvm_pmc *pmc = get_fixed_pmc_idx(pmu, i);
+
+ if (fixed_en_pmi(pmu->fixed_ctr_ctrl, i) == en_pmi)
+ continue;
+
+ reprogram_fixed_counter(pmc, en_pmi, i);
+ }
+
+ pmu->fixed_ctr_ctrl = data;
+}
+
+static void reprogram_idx(struct kvm_pmu *pmu, int idx)
+{
+ struct kvm_pmc *pmc = global_idx_to_pmc(pmu, idx);
+
+ if (!pmc)
+ return;
+
+ if (pmc_is_gp(pmc))
+ reprogram_gp_counter(pmc, pmc->eventsel);
+ else {
+ int fidx = idx - X86_PMC_IDX_FIXED;
+ reprogram_fixed_counter(pmc,
+ fixed_en_pmi(pmu->fixed_ctr_ctrl, fidx), fidx);
+ }
+}
+
+static void global_ctrl_changed(struct kvm_pmu *pmu, u64 data)
+{
+ int bit;
+ u64 diff = pmu->global_ctrl ^ data;
+
+ pmu->global_ctrl = data;
+
+ for_each_set_bit(bit, (unsigned long *)&diff, X86_PMC_IDX_MAX)
+ reprogram_idx(pmu, bit);
+}
+
+bool kvm_pmu_msr(struct kvm_vcpu *vcpu, u32 msr)
+{
+ struct kvm_pmu *pmu = &vcpu->arch.pmu;
+ int ret;
+
+ switch (msr) {
+ case MSR_CORE_PERF_FIXED_CTR_CTRL:
+ case MSR_CORE_PERF_GLOBAL_STATUS:
+ case MSR_CORE_PERF_GLOBAL_CTRL:
+ case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
+ ret = pmu->version > 1;
+ break;
+ default:
+ ret = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)
+ || get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0)
+ || get_fixed_pmc(pmu, msr);
+ break;
+ }
+ return ret;
+}
+
+int kvm_pmu_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data)
+{
+ struct kvm_pmu *pmu = &vcpu->arch.pmu;
+ struct kvm_pmc *pmc;
+
+ switch (index) {
+ case MSR_CORE_PERF_FIXED_CTR_CTRL:
+ *data = pmu->fixed_ctr_ctrl;
+ return 0;
+ case MSR_CORE_PERF_GLOBAL_STATUS:
+ *data = pmu->global_status;
+ return 0;
+ case MSR_CORE_PERF_GLOBAL_CTRL:
+ *data = pmu->global_ctrl;
+ return 0;
+ case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
+ *data = pmu->global_ovf_ctrl;
+ return 0;
+ default:
+ if ((pmc = get_gp_pmc(pmu, index, MSR_IA32_PERFCTR0)) ||
+ (pmc = get_fixed_pmc(pmu, index))) {
+ *data = read_pmc(pmc);
+ return 0;
+ } else if ((pmc = get_gp_pmc(pmu, index, MSR_P6_EVNTSEL0))) {
+ *data = pmc->eventsel;
+ return 0;
+ }
+ }
+ return 1;
+}
+
+int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data)
+{
+ struct kvm_pmu *pmu = &vcpu->arch.pmu;
+ struct kvm_pmc *pmc;
+
+ switch (index) {
+ case MSR_CORE_PERF_FIXED_CTR_CTRL:
+ if (pmu->fixed_ctr_ctrl == data)
+ return 0;
+ if (!(data & 0xfffffffffffff444)) {
+ reprogram_fixed_counters(pmu, data);
+ return 0;
+ }
+ break;
+ case MSR_CORE_PERF_GLOBAL_STATUS:
+ break; /* RO MSR */
+ case MSR_CORE_PERF_GLOBAL_CTRL:
+ if (pmu->global_ctrl == data)
+ return 0;
+ if (!(data & pmu->global_ctrl_mask)) {
+ global_ctrl_changed(pmu, data);
+ return 0;
+ }
+ break;
+ case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
+ if (!(data & (pmu->global_ctrl_mask & ~(3ull<<62)))) {
+ pmu->global_status &= ~data;
+ pmu->global_ovf_ctrl = data;
+ return 0;
+ }
+ break;
+ default:
+ if ((pmc = get_gp_pmc(pmu, index, MSR_IA32_PERFCTR0)) ||
+ (pmc = get_fixed_pmc(pmu, index))) {
+ data = (s64)(s32)data;
+ pmc->counter += data - read_pmc(pmc);
+ return 0;
+ } else if ((pmc = get_gp_pmc(pmu, index, MSR_P6_EVNTSEL0))) {
+ if (data == pmc->eventsel)
+ return 0;
+ if (!(data & 0xffffffff00200000ull)) {
+ reprogram_gp_counter(pmc, data);
+ return 0;
+ }
+ }
+ }
+ return 1;
+}
+
+int kvm_pmu_read_pmc(struct kvm_vcpu *vcpu, unsigned pmc, u64 *data)
+{
+ struct kvm_pmu *pmu = &vcpu->arch.pmu;
+ bool fast_mode = pmc & (1u << 31);
+ bool fixed = pmc & (1u << 30);
+ struct kvm_pmc *counters;
+ u64 ctr;
+
+ pmc &= (3u << 30) - 1;
+ if (!fixed && pmc >= pmu->nr_arch_gp_counters)
+ return 1;
+ if (fixed && pmc >= pmu->nr_arch_fixed_counters)
+ return 1;
+ counters = fixed ? pmu->fixed_counters : pmu->gp_counters;
+ ctr = read_pmc(&counters[pmc]);
+ if (fast_mode)
+ ctr = (u32)ctr;
+ *data = ctr;
+
+ return 0;
+}
+
+void kvm_pmu_cpuid_update(struct kvm_vcpu *vcpu)
+{
+ struct kvm_pmu *pmu = &vcpu->arch.pmu;
+ struct kvm_cpuid_entry2 *entry;
+ unsigned bitmap_len;
+
+ pmu->nr_arch_gp_counters = 0;
+ pmu->nr_arch_fixed_counters = 0;
+ pmu->counter_bitmask[KVM_PMC_GP] = 0;
+ pmu->counter_bitmask[KVM_PMC_FIXED] = 0;
+ pmu->version = 0;
+
+ entry = kvm_find_cpuid_entry(vcpu, 0xa, 0);
+ if (!entry)
+ return;
+
+ pmu->version = entry->eax & 0xff;
+ if (!pmu->version)
+ return;
+
+ pmu->nr_arch_gp_counters = min((int)(entry->eax >> 8) & 0xff,
+ X86_PMC_MAX_GENERIC);
+ pmu->counter_bitmask[KVM_PMC_GP] =
+ ((u64)1 << ((entry->eax >> 16) & 0xff)) - 1;
+ bitmap_len = (entry->eax >> 24) & 0xff;
+ pmu->available_event_types = ~entry->ebx & ((1ull << bitmap_len) - 1);
+
+ if (pmu->version == 1) {
+ pmu->global_ctrl = (1 << pmu->nr_arch_gp_counters) - 1;
+ return;
+ }
+
+ pmu->nr_arch_fixed_counters = min((int)(entry->edx & 0x1f),
+ X86_PMC_MAX_FIXED);
+ pmu->counter_bitmask[KVM_PMC_FIXED] =
+ ((u64)1 << ((entry->edx >> 5) & 0xff)) - 1;
+ pmu->global_ctrl_mask = ~(((1 << pmu->nr_arch_gp_counters) - 1)
+ | (((1ull << pmu->nr_arch_fixed_counters) - 1)
+ << X86_PMC_IDX_FIXED));
+}
+
+void kvm_pmu_init(struct kvm_vcpu *vcpu)
+{
+ int i;
+ struct kvm_pmu *pmu = &vcpu->arch.pmu;
+
+ memset(pmu, 0, sizeof(*pmu));
+ for (i = 0; i < X86_PMC_MAX_GENERIC; i++) {
+ pmu->gp_counters[i].type = KVM_PMC_GP;
+ pmu->gp_counters[i].vcpu = vcpu;
+ pmu->gp_counters[i].idx = i;
+ }
+ for (i = 0; i < X86_PMC_MAX_FIXED; i++) {
+ pmu->fixed_counters[i].type = KVM_PMC_FIXED;
+ pmu->fixed_counters[i].vcpu = vcpu;
+ pmu->fixed_counters[i].idx = i + X86_PMC_IDX_FIXED;
+ }
+ init_irq_work(&pmu->irq_work, trigger_pmi);
+ kvm_pmu_cpuid_update(vcpu);
+}
+
+void kvm_pmu_reset(struct kvm_vcpu *vcpu)
+{
+ struct kvm_pmu *pmu = &vcpu->arch.pmu;
+ int i;
+
+ irq_work_sync(&pmu->irq_work);
+ for (i = 0; i < X86_PMC_MAX_GENERIC; i++) {
+ struct kvm_pmc *pmc = &pmu->gp_counters[i];
+ stop_counter(pmc);
+ pmc->counter = pmc->eventsel = 0;
+ }
+
+ for (i = 0; i < X86_PMC_MAX_FIXED; i++)
+ stop_counter(&pmu->fixed_counters[i]);
+
+ pmu->fixed_ctr_ctrl = pmu->global_ctrl = pmu->global_status =
+ pmu->global_ovf_ctrl = 0;
+}
+
+void kvm_pmu_destroy(struct kvm_vcpu *vcpu)
+{
+ kvm_pmu_reset(vcpu);
+}
+
+void kvm_handle_pmu_event(struct kvm_vcpu *vcpu)
+{
+ struct kvm_pmu *pmu = &vcpu->arch.pmu;
+ u64 bitmask;
+ int bit;
+
+ bitmask = pmu->reprogram_pmi;
+
+ for_each_set_bit(bit, (unsigned long *)&bitmask, X86_PMC_IDX_MAX) {
+ struct kvm_pmc *pmc = global_idx_to_pmc(pmu, bit);
+
+ if (unlikely(!pmc || !pmc->perf_event)) {
+ clear_bit(bit, (unsigned long *)&pmu->reprogram_pmi);
+ continue;
+ }
+
+ reprogram_idx(pmu, bit);
+ }
+}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9eff4af..52a8666 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -624,6 +624,8 @@ static void update_cpuid(struct kvm_vcpu *vcpu)
if (apic)
apic->lapic_timer.timer_mode_mask = timer_mode_mask;
+
+ kvm_pmu_cpuid_update(vcpu);
}
int kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
@@ -1656,8 +1658,6 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data)
* which we perfectly emulate ;-). Any other value should be at least
* reported, some guests depend on them.
*/
- case MSR_P6_EVNTSEL0:
- case MSR_P6_EVNTSEL1:
case MSR_K7_EVNTSEL0:
case MSR_K7_EVNTSEL1:
case MSR_K7_EVNTSEL2:
@@ -1669,8 +1669,6 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data)
/* at least RHEL 4 unconditionally writes to the perfctr registers,
* so we ignore writes to make it happy.
*/
- case MSR_P6_PERFCTR0:
- case MSR_P6_PERFCTR1:
case MSR_K7_PERFCTR0:
case MSR_K7_PERFCTR1:
case MSR_K7_PERFCTR2:
@@ -1707,6 +1705,8 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data)
default:
if (msr && (msr == vcpu->kvm->arch.xen_hvm_config.msr))
return xen_hvm_config(vcpu, data);
+ if (kvm_pmu_msr(vcpu, msr))
+ return kvm_pmu_set_msr(vcpu, msr, data);
if (!ignore_msrs) {
pr_unimpl(vcpu, "unhandled wrmsr: 0x%x data %llx\n",
msr, data);
@@ -1869,10 +1869,6 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
case MSR_K8_SYSCFG:
case MSR_K7_HWCR:
case MSR_VM_HSAVE_PA:
- case MSR_P6_PERFCTR0:
- case MSR_P6_PERFCTR1:
- case MSR_P6_EVNTSEL0:
- case MSR_P6_EVNTSEL1:
case MSR_K7_EVNTSEL0:
case MSR_K7_PERFCTR0:
case MSR_K8_INT_PENDING_MSG:
@@ -1983,6 +1979,8 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
data = 0xbe702111;
break;
default:
+ if (kvm_pmu_msr(vcpu, msr))
+ return kvm_pmu_get_msr(vcpu, msr, pdata);
if (!ignore_msrs) {
pr_unimpl(vcpu, "unhandled rdmsr: 0x%x\n", msr);
return 1;
@@ -5136,7 +5134,7 @@ static void kvm_timer_init(void)
static DEFINE_PER_CPU(struct kvm_vcpu *, current_vcpu);
-static int kvm_is_in_guest(void)
+int kvm_is_in_guest(void)
{
return percpu_read(current_vcpu) != NULL;
}
@@ -5719,6 +5717,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
process_nmi(vcpu);
req_immediate_exit =
kvm_check_request(KVM_REQ_IMMEDIATE_EXIT, vcpu);
+ if (kvm_check_request(KVM_REQ_PMU, vcpu))
+ kvm_handle_pmu_event(vcpu);
+ if (kvm_check_request(KVM_REQ_PMI, vcpu))
+ kvm_deliver_pmi(vcpu);
}
r = kvm_mmu_reload(vcpu);
@@ -6459,6 +6461,8 @@ int kvm_arch_vcpu_reset(struct kvm_vcpu *vcpu)
kvm_async_pf_hash_reset(vcpu);
vcpu->arch.apf.halted = false;
+ kvm_pmu_reset(vcpu);
+
return kvm_x86_ops->vcpu_reset(vcpu);
}
@@ -6547,6 +6551,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
goto fail_free_mce_banks;
kvm_async_pf_hash_reset(vcpu);
+ kvm_pmu_init(vcpu);
return 0;
fail_free_mce_banks:
@@ -6565,6 +6570,7 @@ void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu)
{
int idx;
+ kvm_pmu_destroy(vcpu);
kfree(vcpu->arch.mce_banks);
kvm_free_lapic(vcpu);
idx = srcu_read_lock(&vcpu->kvm->srcu);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index c6a2ec9..40d0878 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -52,6 +52,8 @@
#define KVM_REQ_STEAL_UPDATE 13
#define KVM_REQ_NMI 14
#define KVM_REQ_IMMEDIATE_EXIT 15
+#define KVM_REQ_PMU 16
+#define KVM_REQ_PMI 17
#define KVM_USERSPACE_IRQ_SOURCE_ID 0
--
1.7.7.1
^ permalink raw reply related [flat|nested] 16+ messages in thread* [PATCHv3 03/10] KVM: Add generic RDPMC support
2011-11-10 12:57 [PATCHv3 00/10] KVM in-guest performance monitoring Gleb Natapov
2011-11-10 12:57 ` [PATCHv3 01/10] KVM: Expose kvm_lapic_local_deliver() Gleb Natapov
2011-11-10 12:57 ` [PATCHv3 02/10] KVM: Expose a version 2 architectural PMU to a guests Gleb Natapov
@ 2011-11-10 12:57 ` Gleb Natapov
2011-11-10 12:57 ` [PATCHv3 04/10] KVM: SVM: Intercept RDPMC Gleb Natapov
` (8 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Gleb Natapov @ 2011-11-10 12:57 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel, avi, mtosatti, mingo, a.p.zijlstra, acme
From: Avi Kivity <avi@redhat.com>
Add a helper function that emulates the RDPMC instruction operation.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/x86.c | 15 +++++++++++++++
2 files changed, 16 insertions(+), 0 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 5807a49..422824c 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -756,6 +756,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data);
unsigned long kvm_get_rflags(struct kvm_vcpu *vcpu);
void kvm_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags);
+bool kvm_rdpmc(struct kvm_vcpu *vcpu);
void kvm_queue_exception(struct kvm_vcpu *vcpu, unsigned nr);
void kvm_queue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 52a8666..b88426c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -815,6 +815,21 @@ int kvm_get_dr(struct kvm_vcpu *vcpu, int dr, unsigned long *val)
}
EXPORT_SYMBOL_GPL(kvm_get_dr);
+bool kvm_rdpmc(struct kvm_vcpu *vcpu)
+{
+ u32 ecx = kvm_register_read(vcpu, VCPU_REGS_RCX);
+ u64 data;
+ int err;
+
+ err = kvm_pmu_read_pmc(vcpu, ecx, &data);
+ if (err)
+ return err;
+ kvm_register_write(vcpu, VCPU_REGS_RAX, (u32)data);
+ kvm_register_write(vcpu, VCPU_REGS_RDX, data >> 32);
+ return err;
+}
+EXPORT_SYMBOL_GPL(kvm_rdpmc);
+
/*
* List of msr numbers which we expose to userspace through KVM_GET_MSRS
* and KVM_SET_MSRS, and KVM_GET_MSR_INDEX_LIST.
--
1.7.7.1
^ permalink raw reply related [flat|nested] 16+ messages in thread* [PATCHv3 04/10] KVM: SVM: Intercept RDPMC
2011-11-10 12:57 [PATCHv3 00/10] KVM in-guest performance monitoring Gleb Natapov
` (2 preceding siblings ...)
2011-11-10 12:57 ` [PATCHv3 03/10] KVM: Add generic RDPMC support Gleb Natapov
@ 2011-11-10 12:57 ` Gleb Natapov
2011-11-10 12:57 ` [PATCHv3 05/10] KVM: VMX: " Gleb Natapov
` (7 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Gleb Natapov @ 2011-11-10 12:57 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel, avi, mtosatti, mingo, a.p.zijlstra, acme
From: Avi Kivity <avi@redhat.com>
Intercept RDPMC and forward it to the PMU emulation code.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
arch/x86/kvm/svm.c | 15 +++++++++++++++
1 files changed, 15 insertions(+), 0 deletions(-)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index e32243e..5fa553b 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1014,6 +1014,7 @@ static void init_vmcb(struct vcpu_svm *svm)
set_intercept(svm, INTERCEPT_NMI);
set_intercept(svm, INTERCEPT_SMI);
set_intercept(svm, INTERCEPT_SELECTIVE_CR0);
+ set_intercept(svm, INTERCEPT_RDPMC);
set_intercept(svm, INTERCEPT_CPUID);
set_intercept(svm, INTERCEPT_INVD);
set_intercept(svm, INTERCEPT_HLT);
@@ -2770,6 +2771,19 @@ static int emulate_on_interception(struct vcpu_svm *svm)
return emulate_instruction(&svm->vcpu, 0) == EMULATE_DONE;
}
+static int rdpmc_interception(struct vcpu_svm *svm)
+{
+ int err;
+
+ if (!static_cpu_has(X86_FEATURE_NRIPS))
+ return emulate_on_interception(svm);
+
+ err = kvm_rdpmc(&svm->vcpu);
+ kvm_complete_insn_gp(&svm->vcpu, err);
+
+ return 1;
+}
+
bool check_selective_cr0_intercepted(struct vcpu_svm *svm, unsigned long val)
{
unsigned long cr0 = svm->vcpu.arch.cr0;
@@ -3190,6 +3204,7 @@ static int (*svm_exit_handlers[])(struct vcpu_svm *svm) = {
[SVM_EXIT_SMI] = nop_on_interception,
[SVM_EXIT_INIT] = nop_on_interception,
[SVM_EXIT_VINTR] = interrupt_window_interception,
+ [SVM_EXIT_RDPMC] = rdpmc_interception,
[SVM_EXIT_CPUID] = cpuid_interception,
[SVM_EXIT_IRET] = iret_interception,
[SVM_EXIT_INVD] = emulate_on_interception,
--
1.7.7.1
^ permalink raw reply related [flat|nested] 16+ messages in thread* [PATCHv3 05/10] KVM: VMX: Intercept RDPMC
2011-11-10 12:57 [PATCHv3 00/10] KVM in-guest performance monitoring Gleb Natapov
` (3 preceding siblings ...)
2011-11-10 12:57 ` [PATCHv3 04/10] KVM: SVM: Intercept RDPMC Gleb Natapov
@ 2011-11-10 12:57 ` Gleb Natapov
2011-11-10 12:57 ` [PATCHv3 06/10] x86, perf: disable non available architectural events Gleb Natapov
` (6 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Gleb Natapov @ 2011-11-10 12:57 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel, avi, mtosatti, mingo, a.p.zijlstra, acme
From: Avi Kivity <avi@redhat.com>
Intercept RDPMC and forward it to the PMU emulation code.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
arch/x86/kvm/vmx.c | 15 ++++++++++++++-
1 files changed, 14 insertions(+), 1 deletions(-)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 6e28d58..a6535ba 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1956,6 +1956,7 @@ static __init void nested_vmx_setup_ctls_msrs(void)
#endif
CPU_BASED_MOV_DR_EXITING | CPU_BASED_UNCOND_IO_EXITING |
CPU_BASED_USE_IO_BITMAPS | CPU_BASED_MONITOR_EXITING |
+ CPU_BASED_RDPMC_EXITING |
CPU_BASED_ACTIVATE_SECONDARY_CONTROLS;
/*
* We can allow some features even when not supported by the
@@ -2414,7 +2415,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
CPU_BASED_USE_TSC_OFFSETING |
CPU_BASED_MWAIT_EXITING |
CPU_BASED_MONITOR_EXITING |
- CPU_BASED_INVLPG_EXITING;
+ CPU_BASED_INVLPG_EXITING |
+ CPU_BASED_RDPMC_EXITING;
if (yield_on_hlt)
min |= CPU_BASED_HLT_EXITING;
@@ -4615,6 +4617,16 @@ static int handle_invlpg(struct kvm_vcpu *vcpu)
return 1;
}
+static int handle_rdpmc(struct kvm_vcpu *vcpu)
+{
+ int err;
+
+ err = kvm_rdpmc(vcpu);
+ kvm_complete_insn_gp(vcpu, err);
+
+ return 1;
+}
+
static int handle_wbinvd(struct kvm_vcpu *vcpu)
{
skip_emulated_instruction(vcpu);
@@ -5565,6 +5577,7 @@ static int (*kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu) = {
[EXIT_REASON_HLT] = handle_halt,
[EXIT_REASON_INVD] = handle_invd,
[EXIT_REASON_INVLPG] = handle_invlpg,
+ [EXIT_REASON_RDPMC] = handle_rdpmc,
[EXIT_REASON_VMCALL] = handle_vmcall,
[EXIT_REASON_VMCLEAR] = handle_vmclear,
[EXIT_REASON_VMLAUNCH] = handle_vmlaunch,
--
1.7.7.1
^ permalink raw reply related [flat|nested] 16+ messages in thread* [PATCHv3 06/10] x86, perf: disable non available architectural events.
2011-11-10 12:57 [PATCHv3 00/10] KVM in-guest performance monitoring Gleb Natapov
` (4 preceding siblings ...)
2011-11-10 12:57 ` [PATCHv3 05/10] KVM: VMX: " Gleb Natapov
@ 2011-11-10 12:57 ` Gleb Natapov
2011-11-17 9:29 ` Peter Zijlstra
2011-12-06 20:17 ` [tip:perf/core] x86, perf: Disable " tip-bot for Gleb Natapov
2011-11-10 12:57 ` [PATCHv3 07/10] perf, x86: expose perf capability to other modules Gleb Natapov
` (5 subsequent siblings)
11 siblings, 2 replies; 16+ messages in thread
From: Gleb Natapov @ 2011-11-10 12:57 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel, avi, mtosatti, mingo, a.p.zijlstra, acme
Intel CPUs report non-available architectural events in cpuid leaf
0AH.EBX. Use it to disable events that are not available according
to CPU.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
arch/x86/include/asm/perf_event.h | 14 ++++++++++++++
arch/x86/kernel/cpu/perf_event.h | 5 +++++
arch/x86/kernel/cpu/perf_event_intel.c | 29 ++++++++++++++++++++++++-----
3 files changed, 43 insertions(+), 5 deletions(-)
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index f61c62f..c6998bc 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -57,6 +57,7 @@
(1 << (ARCH_PERFMON_UNHALTED_CORE_CYCLES_INDEX))
#define ARCH_PERFMON_BRANCH_MISSES_RETIRED 6
+#define ARCH_PERFMON_EVENTS_COUNT 7
/*
* Intel "Architectural Performance Monitoring" CPUID
@@ -72,6 +73,19 @@ union cpuid10_eax {
unsigned int full;
};
+union cpuid10_ebx {
+ struct {
+ unsigned int no_unhalted_core_cycles:1;
+ unsigned int no_instructions_retired:1;
+ unsigned int no_unhalted_reference_cycles:1;
+ unsigned int no_llc_reference:1;
+ unsigned int no_llc_misses:1;
+ unsigned int no_branch_instruction_retired:1;
+ unsigned int no_branch_misses_retired:1;
+ } split;
+ unsigned int full;
+};
+
union cpuid10_edx {
struct {
unsigned int num_counters_fixed:5;
diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
index b9698d4..cd0ebcd 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -259,6 +259,11 @@ struct x86_pmu {
int num_counters_fixed;
int cntval_bits;
u64 cntval_mask;
+ union {
+ unsigned long events_maskl;
+ unsigned long events_mask[BITS_TO_LONGS(ARCH_PERFMON_EVENTS_COUNT)];
+ };
+ int events_mask_len;
int apic;
u64 max_period;
struct event_constraint *
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index e09ca20..301369a 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1544,13 +1544,23 @@ static void intel_clovertown_quirks(void)
x86_pmu.pebs_constraints = NULL;
}
+static int intel_event_id_to_hw_id[] = {
+ PERF_COUNT_HW_CPU_CYCLES,
+ PERF_COUNT_HW_INSTRUCTIONS,
+ PERF_COUNT_HW_BUS_CYCLES,
+ PERF_COUNT_HW_CACHE_REFERENCES,
+ PERF_COUNT_HW_CACHE_MISSES,
+ PERF_COUNT_HW_BRANCH_INSTRUCTIONS,
+ PERF_COUNT_HW_BRANCH_MISSES,
+};
+
__init int intel_pmu_init(void)
{
union cpuid10_edx edx;
union cpuid10_eax eax;
+ union cpuid10_ebx ebx;
unsigned int unused;
- unsigned int ebx;
- int version;
+ int version, bit;
if (!cpu_has(&boot_cpu_data, X86_FEATURE_ARCH_PERFMON)) {
switch (boot_cpu_data.x86) {
@@ -1566,8 +1576,8 @@ __init int intel_pmu_init(void)
* Check whether the Architectural PerfMon supports
* Branch Misses Retired hw_event or not.
*/
- cpuid(10, &eax.full, &ebx, &unused, &edx.full);
- if (eax.split.mask_length <= ARCH_PERFMON_BRANCH_MISSES_RETIRED)
+ cpuid(10, &eax.full, &ebx.full, &unused, &edx.full);
+ if (eax.split.mask_length < ARCH_PERFMON_EVENTS_COUNT)
return -ENODEV;
version = eax.split.version_id;
@@ -1643,7 +1653,7 @@ __init int intel_pmu_init(void)
/* UOPS_EXECUTED.CORE_ACTIVE_CYCLES,c=1,i=1 */
intel_perfmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_BACKEND] = 0x1803fb1;
- if (ebx & 0x40) {
+ if (ebx.split.no_branch_misses_retired) {
/*
* Erratum AAJ80 detected, we work it around by using
* the BR_MISP_EXEC.ANY event. This will over-count
@@ -1651,6 +1661,7 @@ __init int intel_pmu_init(void)
* architectural event which is often completely bogus:
*/
intel_perfmon_event_map[PERF_COUNT_HW_BRANCH_MISSES] = 0x7f89;
+ ebx.split.no_branch_misses_retired = 0;
pr_cont("erratum AAJ80 worked around, ");
}
@@ -1729,5 +1740,13 @@ __init int intel_pmu_init(void)
break;
}
}
+ x86_pmu.events_maskl = ebx.full;
+ x86_pmu.events_mask_len = eax.split.mask_length;
+
+ /* disable event that reported as not presend by cpuid */
+ for_each_set_bit(bit, x86_pmu.events_mask,
+ min(x86_pmu.events_mask_len, x86_pmu.max_events))
+ intel_perfmon_event_map[intel_event_id_to_hw_id[bit]] = 0;
+
return 0;
}
--
1.7.7.1
^ permalink raw reply related [flat|nested] 16+ messages in thread* Re: [PATCHv3 06/10] x86, perf: disable non available architectural events.
2011-11-10 12:57 ` [PATCHv3 06/10] x86, perf: disable non available architectural events Gleb Natapov
@ 2011-11-17 9:29 ` Peter Zijlstra
2011-12-06 20:17 ` [tip:perf/core] x86, perf: Disable " tip-bot for Gleb Natapov
1 sibling, 0 replies; 16+ messages in thread
From: Peter Zijlstra @ 2011-11-17 9:29 UTC (permalink / raw)
To: Gleb Natapov; +Cc: kvm, linux-kernel, avi, mtosatti, mingo, acme
On Thu, 2011-11-10 at 14:57 +0200, Gleb Natapov wrote:
> +
> + /* disable event that reported as not presend by cpuid */
> + for_each_set_bit(bit, x86_pmu.events_mask,
> + min(x86_pmu.events_mask_len, x86_pmu.max_events))
> + intel_perfmon_event_map[intel_event_id_to_hw_id[bit]] = 0;
I think that should read:
for_each_set_bit(bit, x86_pmu.events_mask, ARRAY_SIZE(intel_event_id_to_hw_id))
intel_perfmon_event_map[intel_event_id_to_hw_id[bit]] = 0;
There is no guarantee x86_pmu.max_events has any relation to the size of
intel_event_id_to_hw_id.
Changed it for you.
^ permalink raw reply [flat|nested] 16+ messages in thread* [tip:perf/core] x86, perf: Disable non available architectural events
2011-11-10 12:57 ` [PATCHv3 06/10] x86, perf: disable non available architectural events Gleb Natapov
2011-11-17 9:29 ` Peter Zijlstra
@ 2011-12-06 20:17 ` tip-bot for Gleb Natapov
1 sibling, 0 replies; 16+ messages in thread
From: tip-bot for Gleb Natapov @ 2011-12-06 20:17 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, gleb, hpa, mingo, a.p.zijlstra, tglx, mingo
Commit-ID: ffb871bc9156ee2e5cf442f61250c5bd6aad17e3
Gitweb: http://git.kernel.org/tip/ffb871bc9156ee2e5cf442f61250c5bd6aad17e3
Author: Gleb Natapov <gleb@redhat.com>
AuthorDate: Thu, 10 Nov 2011 14:57:26 +0200
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Tue, 6 Dec 2011 20:41:05 +0100
x86, perf: Disable non available architectural events
Intel CPUs report non-available architectural events in cpuid leaf
0AH.EBX. Use it to disable events that are not available according
to CPU.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1320929850-10480-7-git-send-email-gleb@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
arch/x86/include/asm/perf_event.h | 14 ++++++++++++++
arch/x86/kernel/cpu/perf_event.h | 5 +++++
arch/x86/kernel/cpu/perf_event_intel.c | 28 +++++++++++++++++++++++-----
3 files changed, 42 insertions(+), 5 deletions(-)
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index f61c62f..c6998bc 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -57,6 +57,7 @@
(1 << (ARCH_PERFMON_UNHALTED_CORE_CYCLES_INDEX))
#define ARCH_PERFMON_BRANCH_MISSES_RETIRED 6
+#define ARCH_PERFMON_EVENTS_COUNT 7
/*
* Intel "Architectural Performance Monitoring" CPUID
@@ -72,6 +73,19 @@ union cpuid10_eax {
unsigned int full;
};
+union cpuid10_ebx {
+ struct {
+ unsigned int no_unhalted_core_cycles:1;
+ unsigned int no_instructions_retired:1;
+ unsigned int no_unhalted_reference_cycles:1;
+ unsigned int no_llc_reference:1;
+ unsigned int no_llc_misses:1;
+ unsigned int no_branch_instruction_retired:1;
+ unsigned int no_branch_misses_retired:1;
+ } split;
+ unsigned int full;
+};
+
union cpuid10_edx {
struct {
unsigned int num_counters_fixed:5;
diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
index 51a985c..f49c5c2 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -285,6 +285,11 @@ struct x86_pmu {
int num_counters_fixed;
int cntval_bits;
u64 cntval_mask;
+ union {
+ unsigned long events_maskl;
+ unsigned long events_mask[BITS_TO_LONGS(ARCH_PERFMON_EVENTS_COUNT)];
+ };
+ int events_mask_len;
int apic;
u64 max_period;
struct event_constraint *
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 8d601b1..201156b 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1552,13 +1552,23 @@ static void intel_sandybridge_quirks(void)
x86_pmu.pebs_constraints = NULL;
}
+static const int intel_event_id_to_hw_id[] __initconst = {
+ PERF_COUNT_HW_CPU_CYCLES,
+ PERF_COUNT_HW_INSTRUCTIONS,
+ PERF_COUNT_HW_BUS_CYCLES,
+ PERF_COUNT_HW_CACHE_REFERENCES,
+ PERF_COUNT_HW_CACHE_MISSES,
+ PERF_COUNT_HW_BRANCH_INSTRUCTIONS,
+ PERF_COUNT_HW_BRANCH_MISSES,
+};
+
__init int intel_pmu_init(void)
{
union cpuid10_edx edx;
union cpuid10_eax eax;
+ union cpuid10_ebx ebx;
unsigned int unused;
- unsigned int ebx;
- int version;
+ int version, bit;
if (!cpu_has(&boot_cpu_data, X86_FEATURE_ARCH_PERFMON)) {
switch (boot_cpu_data.x86) {
@@ -1574,8 +1584,8 @@ __init int intel_pmu_init(void)
* Check whether the Architectural PerfMon supports
* Branch Misses Retired hw_event or not.
*/
- cpuid(10, &eax.full, &ebx, &unused, &edx.full);
- if (eax.split.mask_length <= ARCH_PERFMON_BRANCH_MISSES_RETIRED)
+ cpuid(10, &eax.full, &ebx.full, &unused, &edx.full);
+ if (eax.split.mask_length < ARCH_PERFMON_EVENTS_COUNT)
return -ENODEV;
version = eax.split.version_id;
@@ -1651,7 +1661,7 @@ __init int intel_pmu_init(void)
/* UOPS_EXECUTED.CORE_ACTIVE_CYCLES,c=1,i=1 */
intel_perfmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_BACKEND] = 0x1803fb1;
- if (ebx & 0x40) {
+ if (ebx.split.no_branch_misses_retired) {
/*
* Erratum AAJ80 detected, we work it around by using
* the BR_MISP_EXEC.ANY event. This will over-count
@@ -1659,6 +1669,7 @@ __init int intel_pmu_init(void)
* architectural event which is often completely bogus:
*/
intel_perfmon_event_map[PERF_COUNT_HW_BRANCH_MISSES] = 0x7f89;
+ ebx.split.no_branch_misses_retired = 0;
pr_cont("erratum AAJ80 worked around, ");
}
@@ -1738,5 +1749,12 @@ __init int intel_pmu_init(void)
break;
}
}
+ x86_pmu.events_maskl = ebx.full;
+ x86_pmu.events_mask_len = eax.split.mask_length;
+
+ /* disable event that reported as not presend by cpuid */
+ for_each_set_bit(bit, x86_pmu.events_mask, ARRAY_SIZE(intel_event_id_to_hw_id))
+ intel_perfmon_event_map[intel_event_id_to_hw_id[bit]] = 0;
+
return 0;
}
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCHv3 07/10] perf, x86: expose perf capability to other modules.
2011-11-10 12:57 [PATCHv3 00/10] KVM in-guest performance monitoring Gleb Natapov
` (5 preceding siblings ...)
2011-11-10 12:57 ` [PATCHv3 06/10] x86, perf: disable non available architectural events Gleb Natapov
@ 2011-11-10 12:57 ` Gleb Natapov
2011-12-06 20:18 ` [tip:perf/core] perf, x86: Expose " tip-bot for Gleb Natapov
2011-11-10 12:57 ` [PATCHv3 08/10] KVM: Expose the architectural performance monitoring CPUID leaf Gleb Natapov
` (4 subsequent siblings)
11 siblings, 1 reply; 16+ messages in thread
From: Gleb Natapov @ 2011-11-10 12:57 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel, avi, mtosatti, mingo, a.p.zijlstra, acme
KVM needs to know perf capability to decide which PMU it can expose to a
guest.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
arch/x86/include/asm/perf_event.h | 15 +++++++++++++++
arch/x86/kernel/cpu/perf_event.c | 11 +++++++++++
2 files changed, 26 insertions(+), 0 deletions(-)
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index c6998bc..5487ad6 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -95,6 +95,15 @@ union cpuid10_edx {
unsigned int full;
};
+struct x86_pmu_capability {
+ int version;
+ int num_counters_gp;
+ int num_counters_fixed;
+ int bit_width_gp;
+ int bit_width_fixed;
+ unsigned int events_mask;
+ int events_mask_len;
+};
/*
* Fixed-purpose performance events:
@@ -216,6 +225,7 @@ struct perf_guest_switch_msr {
};
extern struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr);
+extern void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap);
#else
static inline perf_guest_switch_msr *perf_guest_get_msrs(int *nr)
{
@@ -223,6 +233,11 @@ static inline perf_guest_switch_msr *perf_guest_get_msrs(int *nr)
return NULL;
}
+static inline void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap)
+{
+ memset(cap, 0, sizeof(*cap));
+}
+
static inline void perf_events_lapic_init(void) { }
#endif
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 6408910..5af5996 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1570,3 +1570,14 @@ unsigned long perf_misc_flags(struct pt_regs *regs)
return misc;
}
+
+void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap)
+{
+ cap->version = x86_pmu.version;
+ cap->num_counters_gp = x86_pmu.num_counters;
+ cap->num_counters_fixed = x86_pmu.num_counters_fixed;
+ cap->bit_width_gp = cap->bit_width_fixed = x86_pmu.cntval_bits;
+ cap->events_mask = (unsigned int)x86_pmu.events_maskl;
+ cap->events_mask_len = x86_pmu.events_mask_len;
+}
+EXPORT_SYMBOL_GPL(perf_get_x86_pmu_capability);
--
1.7.7.1
^ permalink raw reply related [flat|nested] 16+ messages in thread* [tip:perf/core] perf, x86: Expose perf capability to other modules
2011-11-10 12:57 ` [PATCHv3 07/10] perf, x86: expose perf capability to other modules Gleb Natapov
@ 2011-12-06 20:18 ` tip-bot for Gleb Natapov
0 siblings, 0 replies; 16+ messages in thread
From: tip-bot for Gleb Natapov @ 2011-12-06 20:18 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, gleb, hpa, mingo, a.p.zijlstra, tglx, mingo
Commit-ID: b3d9468a8bd218a695e3a0ff112cd4efd27b670a
Gitweb: http://git.kernel.org/tip/b3d9468a8bd218a695e3a0ff112cd4efd27b670a
Author: Gleb Natapov <gleb@redhat.com>
AuthorDate: Thu, 10 Nov 2011 14:57:27 +0200
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Tue, 6 Dec 2011 20:41:08 +0100
perf, x86: Expose perf capability to other modules
KVM needs to know perf capability to decide which PMU it can expose to a
guest.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1320929850-10480-8-git-send-email-gleb@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
arch/x86/include/asm/perf_event.h | 15 +++++++++++++++
arch/x86/kernel/cpu/perf_event.c | 12 ++++++++++++
2 files changed, 27 insertions(+), 0 deletions(-)
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index c6998bc..b50e9d1 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -95,6 +95,15 @@ union cpuid10_edx {
unsigned int full;
};
+struct x86_pmu_capability {
+ int version;
+ int num_counters_gp;
+ int num_counters_fixed;
+ int bit_width_gp;
+ int bit_width_fixed;
+ unsigned int events_mask;
+ int events_mask_len;
+};
/*
* Fixed-purpose performance events:
@@ -216,6 +225,7 @@ struct perf_guest_switch_msr {
};
extern struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr);
+extern void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap);
#else
static inline perf_guest_switch_msr *perf_guest_get_msrs(int *nr)
{
@@ -223,6 +233,11 @@ static inline perf_guest_switch_msr *perf_guest_get_msrs(int *nr)
return NULL;
}
+static inline void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap)
+{
+ memset(cap, 0, sizeof(*cap));
+}
+
static inline void perf_events_lapic_init(void) { }
#endif
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 55889e0..930fe48 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1696,3 +1696,15 @@ unsigned long perf_misc_flags(struct pt_regs *regs)
return misc;
}
+
+void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap)
+{
+ cap->version = x86_pmu.version;
+ cap->num_counters_gp = x86_pmu.num_counters;
+ cap->num_counters_fixed = x86_pmu.num_counters_fixed;
+ cap->bit_width_gp = x86_pmu.cntval_bits;
+ cap->bit_width_fixed = x86_pmu.cntval_bits;
+ cap->events_mask = (unsigned int)x86_pmu.events_maskl;
+ cap->events_mask_len = x86_pmu.events_mask_len;
+}
+EXPORT_SYMBOL_GPL(perf_get_x86_pmu_capability);
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCHv3 08/10] KVM: Expose the architectural performance monitoring CPUID leaf
2011-11-10 12:57 [PATCHv3 00/10] KVM in-guest performance monitoring Gleb Natapov
` (6 preceding siblings ...)
2011-11-10 12:57 ` [PATCHv3 07/10] perf, x86: expose perf capability to other modules Gleb Natapov
@ 2011-11-10 12:57 ` Gleb Natapov
2011-11-10 12:57 ` [PATCHv3 09/10] KVM: x86 emulator: fix RDPMC privilege check Gleb Natapov
` (3 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Gleb Natapov @ 2011-11-10 12:57 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel, avi, mtosatti, mingo, a.p.zijlstra, acme
Provide a CPUID leaf that describes the emulated PMU.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
arch/x86/kvm/x86.c | 30 +++++++++++++++++++++++++++++-
1 files changed, 29 insertions(+), 1 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b88426c..2c44b05 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2544,6 +2544,35 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
}
case 9:
break;
+ case 0xa: { /* Architectural Performance Monitoring */
+ struct x86_pmu_capability cap;
+ union cpuid10_eax eax;
+ union cpuid10_edx edx;
+
+ perf_get_x86_pmu_capability(&cap);
+
+ /*
+ * Only support guest architectural pmu on a host
+ * with architectural pmu.
+ */
+ if (!cap.version)
+ memset(&cap, 0, sizeof(cap));
+
+ eax.split.version_id = min(cap.version, 2);
+ eax.split.num_counters = cap.num_counters_gp;
+ eax.split.bit_width = cap.bit_width_gp;
+ eax.split.mask_length = cap.events_mask_len;
+
+ edx.split.num_counters_fixed = cap.num_counters_fixed;
+ edx.split.bit_width_fixed = cap.bit_width_fixed;
+ edx.split.reserved = 0;
+
+ entry->eax = eax.full;
+ entry->ebx = cap.events_mask;
+ entry->ecx = 0;
+ entry->edx = edx.full;
+ break;
+ }
/* function 0xb has additional index. */
case 0xb: {
int i, level_type;
@@ -2638,7 +2667,6 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
case 3: /* Processor serial number */
case 5: /* MONITOR/MWAIT */
case 6: /* Thermal management */
- case 0xA: /* Architectural Performance Monitoring */
case 0x80000007: /* Advanced power management */
case 0xC0000002:
case 0xC0000003:
--
1.7.7.1
^ permalink raw reply related [flat|nested] 16+ messages in thread* [PATCHv3 09/10] KVM: x86 emulator: fix RDPMC privilege check
2011-11-10 12:57 [PATCHv3 00/10] KVM in-guest performance monitoring Gleb Natapov
` (7 preceding siblings ...)
2011-11-10 12:57 ` [PATCHv3 08/10] KVM: Expose the architectural performance monitoring CPUID leaf Gleb Natapov
@ 2011-11-10 12:57 ` Gleb Natapov
2011-11-10 12:57 ` [PATCHv3 10/10] KVM: x86 emulator: implement RDPMC (0F 33) Gleb Natapov
` (2 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Gleb Natapov @ 2011-11-10 12:57 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel, avi, mtosatti, mingo, a.p.zijlstra, acme
From: Avi Kivity <avi@redhat.com>
RDPMC is only privileged if CR4.PCE=0. check_rdpmc() already implements this,
so all we need to do is drop the Priv flag.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
arch/x86/kvm/emulate.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 8547958..c0ee85b 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3254,7 +3254,7 @@ static struct opcode twobyte_table[256] = {
DI(ImplicitOps | Priv, wrmsr),
IIP(ImplicitOps, em_rdtsc, rdtsc, check_rdtsc),
DI(ImplicitOps | Priv, rdmsr),
- DIP(ImplicitOps | Priv, rdpmc, check_rdpmc),
+ DIP(ImplicitOps, rdpmc, check_rdpmc),
I(ImplicitOps | VendorSpecific, em_sysenter),
I(ImplicitOps | Priv | VendorSpecific, em_sysexit),
N, N,
--
1.7.7.1
^ permalink raw reply related [flat|nested] 16+ messages in thread* [PATCHv3 10/10] KVM: x86 emulator: implement RDPMC (0F 33)
2011-11-10 12:57 [PATCHv3 00/10] KVM in-guest performance monitoring Gleb Natapov
` (8 preceding siblings ...)
2011-11-10 12:57 ` [PATCHv3 09/10] KVM: x86 emulator: fix RDPMC privilege check Gleb Natapov
@ 2011-11-10 12:57 ` Gleb Natapov
2011-11-17 9:30 ` [PATCHv3 00/10] KVM in-guest performance monitoring Peter Zijlstra
2011-12-11 14:05 ` Avi Kivity
11 siblings, 0 replies; 16+ messages in thread
From: Gleb Natapov @ 2011-11-10 12:57 UTC (permalink / raw)
To: kvm; +Cc: linux-kernel, avi, mtosatti, mingo, a.p.zijlstra, acme
From: Avi Kivity <avi@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
arch/x86/include/asm/kvm_emulate.h | 1 +
arch/x86/kvm/emulate.c | 13 ++++++++++++-
arch/x86/kvm/x86.c | 7 +++++++
3 files changed, 20 insertions(+), 1 deletions(-)
diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index 9a4acf4..ab4092e 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -181,6 +181,7 @@ struct x86_emulate_ops {
int (*set_dr)(struct x86_emulate_ctxt *ctxt, int dr, ulong value);
int (*set_msr)(struct x86_emulate_ctxt *ctxt, u32 msr_index, u64 data);
int (*get_msr)(struct x86_emulate_ctxt *ctxt, u32 msr_index, u64 *pdata);
+ int (*read_pmc)(struct x86_emulate_ctxt *ctxt, u32 pmc, u64 *pdata);
void (*halt)(struct x86_emulate_ctxt *ctxt);
void (*wbinvd)(struct x86_emulate_ctxt *ctxt);
int (*fix_hypercall)(struct x86_emulate_ctxt *ctxt);
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index c0ee85b..d76a852 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -2623,6 +2623,17 @@ static int em_rdtsc(struct x86_emulate_ctxt *ctxt)
return X86EMUL_CONTINUE;
}
+static int em_rdpmc(struct x86_emulate_ctxt *ctxt)
+{
+ u64 pmc;
+
+ if (ctxt->ops->read_pmc(ctxt, ctxt->regs[VCPU_REGS_RCX], &pmc))
+ return emulate_gp(ctxt, 0);
+ ctxt->regs[VCPU_REGS_RAX] = (u32)pmc;
+ ctxt->regs[VCPU_REGS_RDX] = pmc >> 32;
+ return X86EMUL_CONTINUE;
+}
+
static int em_mov(struct x86_emulate_ctxt *ctxt)
{
ctxt->dst.val = ctxt->src.val;
@@ -3254,7 +3265,7 @@ static struct opcode twobyte_table[256] = {
DI(ImplicitOps | Priv, wrmsr),
IIP(ImplicitOps, em_rdtsc, rdtsc, check_rdtsc),
DI(ImplicitOps | Priv, rdmsr),
- DIP(ImplicitOps, rdpmc, check_rdpmc),
+ IIP(ImplicitOps, em_rdpmc, rdpmc, check_rdpmc),
I(ImplicitOps | VendorSpecific, em_sysenter),
I(ImplicitOps | Priv | VendorSpecific, em_sysexit),
N, N,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2c44b05..f78b48c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4662,6 +4662,12 @@ static int emulator_set_msr(struct x86_emulate_ctxt *ctxt,
return kvm_set_msr(emul_to_vcpu(ctxt), msr_index, data);
}
+static int emulator_read_pmc(struct x86_emulate_ctxt *ctxt,
+ u32 pmc, u64 *pdata)
+{
+ return kvm_pmu_read_pmc(emul_to_vcpu(ctxt), pmc, pdata);
+}
+
static void emulator_halt(struct x86_emulate_ctxt *ctxt)
{
emul_to_vcpu(ctxt)->arch.halt_request = 1;
@@ -4714,6 +4720,7 @@ static struct x86_emulate_ops emulate_ops = {
.set_dr = emulator_set_dr,
.set_msr = emulator_set_msr,
.get_msr = emulator_get_msr,
+ .read_pmc = emulator_read_pmc,
.halt = emulator_halt,
.wbinvd = emulator_wbinvd,
.fix_hypercall = emulator_fix_hypercall,
--
1.7.7.1
^ permalink raw reply related [flat|nested] 16+ messages in thread* Re: [PATCHv3 00/10] KVM in-guest performance monitoring
2011-11-10 12:57 [PATCHv3 00/10] KVM in-guest performance monitoring Gleb Natapov
` (9 preceding siblings ...)
2011-11-10 12:57 ` [PATCHv3 10/10] KVM: x86 emulator: implement RDPMC (0F 33) Gleb Natapov
@ 2011-11-17 9:30 ` Peter Zijlstra
2011-12-11 14:05 ` Avi Kivity
11 siblings, 0 replies; 16+ messages in thread
From: Peter Zijlstra @ 2011-11-17 9:30 UTC (permalink / raw)
To: Gleb Natapov; +Cc: kvm, linux-kernel, avi, mtosatti, mingo, acme
On Thu, 2011-11-10 at 14:57 +0200, Gleb Natapov wrote:
> This patchset exposes an emulated version 2 architectural performance
> monitoring unit to KVM guests. The PMU is emulated using perf_events,
> so the host kernel can multiplex host-wide, host-user, and the
> guest on available resources.
>
> The patches are against next branch on kvm.git.
>
> If you want to try running perf in a guest you need to apply the patch
> below to qemu-kvm and use -cpu host on qemu command line. But DO NOT
> TRY those patches without applying [1][2] to the host kernel first.
> Don't tell me I didn't warn you!
>
> [1] https://lkml.org/lkml/2011/10/18/390
> [2] https://lkml.org/lkml/2011/10/23/163
>
> Changelog:
> v1->v2
> - put index into struct kvm_pmc instead of calculating it
> - use locked version of bitops
> - inject pmi from irq work if vcpu was not in a guest mode during NMI
> - providing stub for perf_get_x86_pmu_capability() for !PERF_EVENTS
> v2->v3
> - minor style change/comment clarification
> - add perf patch to disable arch event not supported by a CPU
> - create perf events as pinned
On the pinned thing, do check event->state, if it hits
PERF_EVENT_STATE_ERROR you're up shit creek and the counter didn't get
scheduled. This can happen because eg. cpu-pinned events have higher
precedence than task-pinned events.
I've taken the two perf patches from this (6 and 7), the rest I gather
should go through the KVM tree. I've only cursory looked at the other
patches but didn't spot anything very curious.
^ permalink raw reply [flat|nested] 16+ messages in thread* Re: [PATCHv3 00/10] KVM in-guest performance monitoring
2011-11-10 12:57 [PATCHv3 00/10] KVM in-guest performance monitoring Gleb Natapov
` (10 preceding siblings ...)
2011-11-17 9:30 ` [PATCHv3 00/10] KVM in-guest performance monitoring Peter Zijlstra
@ 2011-12-11 14:05 ` Avi Kivity
11 siblings, 0 replies; 16+ messages in thread
From: Avi Kivity @ 2011-12-11 14:05 UTC (permalink / raw)
To: Gleb Natapov; +Cc: kvm, linux-kernel, mtosatti, mingo, a.p.zijlstra, acme
On 11/10/2011 02:57 PM, Gleb Natapov wrote:
> This patchset exposes an emulated version 2 architectural performance
> monitoring unit to KVM guests. The PMU is emulated using perf_events,
> so the host kernel can multiplex host-wide, host-user, and the
> guest on available resources.
>
> The patches are against next branch on kvm.git.
>
>
Thanks, applied.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 16+ messages in thread