kvm.vger.kernel.org archive mirror
* [PATCH 0/9] perf support for x86 guest/host-only bits
From: Gleb Natapov @ 2011-10-03 13:49 UTC
  To: linux-kernel; +Cc: avi, kvm, joerg.roedel, mingo, a.p.zijlstra

This patch series consists of Joerg's series "perf support for amd
guest/host-only bits v2" [1], rebased onto 3.1.0-rc7, plus support for
the same functionality on Intel CPUs.


[1] https://lkml.org/lkml/2011/6/17/171

Gleb Natapov (4):
  perf, intel: Use GO/HO bits in perf-ctr
  KVM, VMX: add support for switching of PERF_GLOBAL_CTRL
  KVM, VMX: Add support for guest/host-only profiling
  KVM, VMX: Check for automatic switch msr table overflow.

Joerg Roedel (5):
  perf, core: Introduce attrs to count in either host or guest mode
  perf, amd: Use GO/HO bits in perf-ctr
  perf, tools: Add support for guest/host-only profiling
  perf, tools: Fix copy&paste error in perf-kvm option description
  perf, tools: Do guest-only counting in perf-kvm by default

 arch/x86/include/asm/perf_event.h      |   15 ++++
 arch/x86/kernel/cpu/perf_event.c       |   10 +++
 arch/x86/kernel/cpu/perf_event_amd.c   |   13 +++
 arch/x86/kernel/cpu/perf_event_intel.c |  111 +++++++++++++++++++++++++-
 arch/x86/kvm/vmx.c                     |  139 +++++++++++++++++++++++++++++---
 include/linux/perf_event.h             |    5 +-
 tools/perf/builtin-kvm.c               |    5 +-
 tools/perf/util/event.c                |    8 ++
 tools/perf/util/event.h                |    2 +
 tools/perf/util/evlist.c               |    5 +-
 tools/perf/util/parse-events.c         |   15 +++-
 11 files changed, 309 insertions(+), 19 deletions(-)

-- 
1.7.5.3



* [PATCH 1/9] perf, core: Introduce attrs to count in either host or guest mode
From: Gleb Natapov @ 2011-10-03 13:49 UTC
  To: linux-kernel; +Cc: avi, kvm, joerg.roedel, mingo, a.p.zijlstra

From: Joerg Roedel <joerg.roedel@amd.com>

The two new attributes exclude_guest and exclude_host can
be used by user-space to tell the kernel to set up a
performance counter to count only while the CPU is in
guest or in host mode.
An additional check is also introduced to make sure
user-space does not try to exclude both guest and host mode
from counting.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 include/linux/perf_event.h |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index c816075..1e9ebe5 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -220,7 +220,10 @@ struct perf_event_attr {
 				mmap_data      :  1, /* non-exec mmap data    */
 				sample_id_all  :  1, /* sample_type all events */
 
-				__reserved_1   : 45;
+				exclude_host   :  1, /* don't count in host   */
+				exclude_guest  :  1, /* don't count in guest  */
+
+				__reserved_1   : 43;
 
 	union {
 		__u32		wakeup_events;	  /* wakeup every n events */
-- 
1.7.5.3
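
For illustration, a minimal user-space sketch of how the new bits could be
used (the perf_event_open(2) wrapper and error handling below are
assumptions for the example, not part of this patch):

	/* Sketch: open a guest-only cycle counter on one cpu. */
	#include <linux/perf_event.h>
	#include <string.h>
	#include <sys/syscall.h>
	#include <unistd.h>

	static int open_guest_only_cycles(int cpu)
	{
		struct perf_event_attr attr;

		memset(&attr, 0, sizeof(attr));
		attr.size = sizeof(attr);
		attr.type = PERF_TYPE_HARDWARE;
		attr.config = PERF_COUNT_HW_CPU_CYCLES;
		attr.exclude_host = 1;	/* count only while in guest mode */

		/* pid == -1 with a valid cpu: system-wide counter on that cpu */
		return syscall(__NR_perf_event_open, &attr, -1, cpu, -1, 0);
	}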


* [PATCH 2/9] perf, amd: Use GO/HO bits in perf-ctr
From: Gleb Natapov @ 2011-10-03 13:49 UTC
  To: linux-kernel; +Cc: avi, kvm, joerg.roedel, mingo, a.p.zijlstra

From: Joerg Roedel <joerg.roedel@amd.com>

The AMD perf-counters support counting in guest or host mode
only. Make use of that feature when user-space specifies
guest-only or host-only counting.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/include/asm/perf_event.h    |    3 +++
 arch/x86/kernel/cpu/perf_event_amd.c |   13 +++++++++++++
 2 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 094fb30..ce2bfb3 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -29,6 +29,9 @@
 #define ARCH_PERFMON_EVENTSEL_INV			(1ULL << 23)
 #define ARCH_PERFMON_EVENTSEL_CMASK			0xFF000000ULL
 
+#define AMD_PERFMON_EVENTSEL_GUESTONLY			(1ULL << 40)
+#define AMD_PERFMON_EVENTSEL_HOSTONLY			(1ULL << 41)
+
 #define AMD64_EVENTSEL_EVENT	\
 	(ARCH_PERFMON_EVENTSEL_EVENT | (0x0FULL << 32))
 #define INTEL_ARCH_EVENT_MASK	\
diff --git a/arch/x86/kernel/cpu/perf_event_amd.c b/arch/x86/kernel/cpu/perf_event_amd.c
index 941caa2..5c01a73 100644
--- a/arch/x86/kernel/cpu/perf_event_amd.c
+++ b/arch/x86/kernel/cpu/perf_event_amd.c
@@ -132,6 +132,19 @@ static int amd_pmu_hw_config(struct perf_event *event)
 	if (ret)
 		return ret;
 
+	if (event->attr.exclude_host && event->attr.exclude_guest)
+		/*
+		 * When HO == GO == 1 the hardware treats that as GO == HO == 0
+		 * and will count in both modes. We don't want to count in that
+		 * case so we emulate no-counting by setting US = OS = 0.
+		 */
+		event->hw.config &= ~(ARCH_PERFMON_EVENTSEL_USR |
+				      ARCH_PERFMON_EVENTSEL_OS);
+	else if (event->attr.exclude_host)
+		event->hw.config |= AMD_PERFMON_EVENTSEL_GUESTONLY;
+	else if (event->attr.exclude_guest)
+		event->hw.config |= AMD_PERFMON_EVENTSEL_HOSTONLY;
+
 	if (event->attr.type != PERF_TYPE_RAW)
 		return 0;
 
-- 
1.7.5.3
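
Schematically, the hunk above encodes the exclude bits into the EVENTSEL
value as follows (a recap of the patch logic, not additional code):

	if (exclude_host && exclude_guest)
		/* GO == HO == 1 would count everywhere, so turn the
		 * counter off by clearing USR and OS instead */
		config &= ~(ARCH_PERFMON_EVENTSEL_USR |
			    ARCH_PERFMON_EVENTSEL_OS);
	else if (exclude_host)
		config |= AMD_PERFMON_EVENTSEL_GUESTONLY;	/* bit 40 */
	else if (exclude_guest)
		config |= AMD_PERFMON_EVENTSEL_HOSTONLY;	/* bit 41 */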


* [PATCH 3/9] perf, tools: Add support for guest/host-only profiling
From: Gleb Natapov @ 2011-10-03 13:49 UTC
  To: linux-kernel; +Cc: avi, kvm, joerg.roedel, mingo, a.p.zijlstra

From: Joerg Roedel <joerg.roedel@amd.com>

To restrict a counter to either host or guest mode, this
patch introduces two new event modifiers: G and H.
With G the counter is configured in guest-only mode, and with
H in host-only mode.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 tools/perf/util/parse-events.c |   14 ++++++++++++--
 1 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 928918b..3b00775 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -735,8 +735,8 @@ static int
 parse_event_modifier(const char **strp, struct perf_event_attr *attr)
 {
 	const char *str = *strp;
-	int exclude = 0;
-	int eu = 0, ek = 0, eh = 0, precise = 0;
+	int exclude = 0, exclude_GH = 0;
+	int eu = 0, ek = 0, eh = 0, eH = 0, eG = 0, precise = 0;
 
 	if (!*str)
 		return 0;
@@ -760,6 +760,14 @@ parse_event_modifier(const char **strp, struct perf_event_attr *attr)
 			if (!exclude)
 				exclude = eu = ek = eh = 1;
 			eh = 0;
+		} else if (*str == 'G') {
+			if (!exclude_GH)
+				exclude_GH = eG = eH = 1;
+			eG = 0;
+		} else if (*str == 'H') {
+			if (!exclude_GH)
+				exclude_GH = eG = eH = 1;
+			eH = 0;
 		} else if (*str == 'p') {
 			precise++;
 		} else
@@ -776,6 +784,8 @@ parse_event_modifier(const char **strp, struct perf_event_attr *attr)
 	attr->exclude_kernel = ek;
 	attr->exclude_hv     = eh;
 	attr->precise_ip     = precise;
+	attr->exclude_host   = eH;
+	attr->exclude_guest  = eG;
 
 	return 0;
 }
-- 
1.7.5.3
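
Expected usage once this is in place (the new modifiers compose with the
existing u/k/h ones; the command lines are illustrative):

	perf stat -a -e cycles:G sleep 10	# count cycles in guest mode only
	perf record -a -e instructions:Hk	# host-only, kernel-only samples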



* [PATCH 4/9] perf, tools: Fix copy&paste error in perf-kvm option description
From: Gleb Natapov @ 2011-10-03 13:49 UTC
  To: linux-kernel; +Cc: avi, kvm, joerg.roedel, mingo, a.p.zijlstra

From: Joerg Roedel <joerg.roedel@amd.com>

The --host option certainly enables host-data collection.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 tools/perf/builtin-kvm.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index 34d1e85..032324a 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -38,7 +38,7 @@ static const struct option kvm_options[] = {
 	OPT_BOOLEAN(0, "guest", &perf_guest,
 		    "Collect guest os data"),
 	OPT_BOOLEAN(0, "host", &perf_host,
-		    "Collect guest os data"),
+		    "Collect host os data"),
 	OPT_STRING(0, "guestmount", &symbol_conf.guestmount, "directory",
 		   "guest mount directory under which every guest os"
 		   " instance has a subdir"),
-- 
1.7.5.3


* [PATCH 5/9] perf, tools: Do guest-only counting in perf-kvm by default
From: Gleb Natapov @ 2011-10-03 13:49 UTC
  To: linux-kernel; +Cc: avi, kvm, joerg.roedel, mingo, a.p.zijlstra

From: Joerg Roedel <joerg.roedel@amd.com>

Make use of exclude_guest and exclude_host in perf-kvm to do
guest-only counting by default.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 tools/perf/builtin-kvm.c       |    3 ++-
 tools/perf/util/event.c        |    8 ++++++++
 tools/perf/util/event.h        |    2 ++
 tools/perf/util/evlist.c       |    5 ++++-
 tools/perf/util/parse-events.c |    1 +
 5 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index 032324a..9b05afa 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -107,7 +107,8 @@ static int __cmd_buildid_list(int argc, const char **argv)
 
 int cmd_kvm(int argc, const char **argv, const char *prefix __used)
 {
-	perf_host = perf_guest = 0;
+	perf_host  = 0;
+	perf_guest = 1;
 
 	argc = parse_options(argc, argv, kvm_options, kvm_usage,
 			PARSE_OPT_STOP_AT_NON_OPTION);
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 437f8ca..31a6d7f 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -44,6 +44,14 @@ static struct perf_sample synth_sample = {
 	.period	   = 1,
 };
 
+void event_attr_init(struct perf_event_attr *attr)
+{
+	if (!perf_host)
+		attr->exclude_host  = 1;
+	if (!perf_guest)
+		attr->exclude_guest = 1;
+}
+
 static pid_t perf_event__synthesize_comm(union perf_event *event, pid_t pid,
 					 int full, perf_event__handler_t process,
 					 struct perf_session *session)
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 357a85b..e5f101e 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -150,6 +150,8 @@ typedef int (*perf_event__handler_t)(union perf_event *event,
 				     struct perf_sample *sample,
 				      struct perf_session *session);
 
+void event_attr_init(struct perf_event_attr *attr);
+
 int perf_event__synthesize_thread_map(struct thread_map *threads,
 				      perf_event__handler_t process,
 				      struct perf_session *session);
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 72e9f48..4773fbe 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -82,8 +82,11 @@ int perf_evlist__add_default(struct perf_evlist *evlist)
 		.type = PERF_TYPE_HARDWARE,
 		.config = PERF_COUNT_HW_CPU_CYCLES,
 	};
-	struct perf_evsel *evsel = perf_evsel__new(&attr, 0);
+	struct perf_evsel *evsel;
+
+	event_attr_init(&attr);
 
+	evsel = perf_evsel__new(&attr, 0);
 	if (evsel == NULL)
 		goto error;
 
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 3b00775..620ba98 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -848,6 +848,7 @@ int parse_events(struct perf_evlist *evlist , const char *str, int unset __used)
 	for (;;) {
 		ostr = str;
 		memset(&attr, 0, sizeof(attr));
+		event_attr_init(&attr);
 		ret = parse_event_symbols(evlist, &str, &attr);
 		if (ret == EVT_FAILED)
 			return -1;
-- 
1.7.5.3
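
Expected behaviour with the new defaults (perf_guest = 1, perf_host = 0;
the command lines are illustrative):

	perf kvm record -a		# guest-only counting by default
	perf kvm --host record -a	# add --host to count host activity too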



* [PATCH 6/9] perf, intel: Use GO/HO bits in perf-ctr
From: Gleb Natapov @ 2011-10-03 13:49 UTC
  To: linux-kernel; +Cc: avi, kvm, joerg.roedel, mingo, a.p.zijlstra

Intel does not have guest/host-only bits in its perf counters like AMD
does.  To support GO/HO bits KVM needs to switch the EVENTSELn values
(or PERF_GLOBAL_CTRL, if available) at guest entry. If a counter is
configured to count only in guest mode it stays disabled in the host,
but VMX is configured to switch it to its enabled value during guest
entry.

This patch adds GO/HO tracking to the Intel perf code and provides an
interface for KVM to get the list of MSRs that need to be switched at
guest entry.

Only CPUs with an architectural PMU (v1 or later) are supported by this
patch.  To my knowledge there are no P6 models with VMX but without an
architectural PMU, and P4s with VMX are rare; the interface is general
enough to support them if the need arises.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/include/asm/perf_event.h      |   12 ++++
 arch/x86/kernel/cpu/perf_event.c       |   10 +++
 arch/x86/kernel/cpu/perf_event_intel.c |  111 +++++++++++++++++++++++++++++++-
 3 files changed, 130 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index ce2bfb3..f4697f1 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -162,7 +162,19 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
 	);							\
 }
 
+struct perf_guest_switch_msr {
+	unsigned msr;
+	u64 host, guest;
+};
+
+extern int perf_guest_get_msrs_count(void);
+extern int perf_guest_get_msrs(int cnt, struct perf_guest_switch_msr *arr);
 #else
+static inline int perf_guest_get_msrs_count(void)
+{
+	return 0;
+}
+
 static inline void perf_events_lapic_init(void)	{ }
 #endif
 
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index cfa62ec..33d5744 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -119,6 +119,10 @@ struct cpu_hw_events {
 	struct perf_branch_stack	lbr_stack;
 	struct perf_branch_entry	lbr_entries[MAX_LBR_ENTRIES];
 
+	/* Intel host/guest exclude bits */
+	u64			intel_ctrl_guest_mask;
+	u64			intel_ctrl_host_mask;
+
 	/*
 	 * manage shared (per-core, per-cpu) registers
 	 * used on Intel NHM/WSM/SNB
@@ -292,6 +296,12 @@ struct x86_pmu {
 	 */
 	struct extra_reg *extra_regs;
 	unsigned int er_flags;
+
+	/*
+	 * Guest event support
+	 */
+	int (*guest_get_msrs_count)(void);
+	int (*guest_get_msrs)(int cnt, struct perf_guest_switch_msr *arr);
 };
 
 #define ERF_NO_HT_SHARING	1
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index f88af2c..72e1ea3 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -746,7 +746,8 @@ static void intel_pmu_enable_all(int added)
 
 	intel_pmu_pebs_enable_all();
 	intel_pmu_lbr_enable_all();
-	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, x86_pmu.intel_ctrl);
+	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL,
+			x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_guest_mask);
 
 	if (test_bit(X86_PMC_IDX_FIXED_BTS, cpuc->active_mask)) {
 		struct perf_event *event =
@@ -869,6 +870,7 @@ static void intel_pmu_disable_fixed(struct hw_perf_event *hwc)
 static void intel_pmu_disable_event(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
+	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
 
 	if (unlikely(hwc->idx == X86_PMC_IDX_FIXED_BTS)) {
 		intel_pmu_disable_bts();
@@ -876,6 +878,9 @@ static void intel_pmu_disable_event(struct perf_event *event)
 		return;
 	}
 
+	cpuc->intel_ctrl_guest_mask &= ~(1ull << hwc->idx);
+	cpuc->intel_ctrl_host_mask &= ~(1ull << hwc->idx);
+
 	if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) {
 		intel_pmu_disable_fixed(hwc);
 		return;
@@ -921,6 +926,7 @@ static void intel_pmu_enable_fixed(struct hw_perf_event *hwc)
 static void intel_pmu_enable_event(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
+	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
 
 	if (unlikely(hwc->idx == X86_PMC_IDX_FIXED_BTS)) {
 		if (!__this_cpu_read(cpu_hw_events.enabled))
@@ -930,6 +936,11 @@ static void intel_pmu_enable_event(struct perf_event *event)
 		return;
 	}
 
+	if (event->attr.exclude_host)
+		cpuc->intel_ctrl_guest_mask |= (1ull << hwc->idx);
+	if (event->attr.exclude_guest)
+		cpuc->intel_ctrl_host_mask |= (1ull << hwc->idx);
+
 	if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) {
 		intel_pmu_enable_fixed(hwc);
 		return;
@@ -1284,12 +1295,87 @@ static int intel_pmu_hw_config(struct perf_event *event)
 	return 0;
 }
 
+static int intel_guest_get_msrs_count(void)
+{
+	return 1;
+}
+
+static int intel_guest_get_msrs(int cnt, struct perf_guest_switch_msr *arr)
+{
+	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+
+	if (cnt < 1)
+		return -ENOMEM;
+
+	arr[0].msr = MSR_CORE_PERF_GLOBAL_CTRL;
+	arr[0].host = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_guest_mask;
+	arr[0].guest = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_host_mask;
+
+	return 0;
+}
+
+static int core_guest_get_msrs_count(void)
+{
+	return x86_pmu.num_counters;
+}
+
+static int core_guest_get_msrs(int cnt, struct perf_guest_switch_msr *arr)
+{
+	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	int idx;
+
+	if (cnt < x86_pmu.num_counters)
+		return -ENOMEM;
+
+	for (idx = 0; idx < x86_pmu.num_counters; idx++)  {
+		struct perf_event *event = cpuc->events[idx];
+
+		arr[idx].msr = x86_pmu_config_addr(idx);
+		arr[idx].host = arr[idx].guest = 0;
+
+		if (!test_bit(idx, cpuc->active_mask))
+			continue;
+
+		arr[idx].host = arr[idx].guest =
+				event->hw.config | ARCH_PERFMON_EVENTSEL_ENABLE;
+
+		if (event->attr.exclude_host)
+			arr[idx].host &= ~ARCH_PERFMON_EVENTSEL_ENABLE;
+		else if (event->attr.exclude_guest)
+			arr[idx].guest &= ~ARCH_PERFMON_EVENTSEL_ENABLE;
+	}
+
+	return 0;
+}
+
+static void core_pmu_enable_event(struct perf_event *event)
+{
+	if (!event->attr.exclude_host)
+		x86_pmu_enable_event(event);
+}
+
+static void core_pmu_enable_all(int added)
+{
+	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+	int idx;
+
+	for (idx = 0; idx < x86_pmu.num_counters; idx++) {
+		struct hw_perf_event *hwc = &cpuc->events[idx]->hw;
+
+		if (!test_bit(idx, cpuc->active_mask) ||
+				cpuc->events[idx]->attr.exclude_host)
+			continue;
+
+		__x86_pmu_enable_event(hwc, ARCH_PERFMON_EVENTSEL_ENABLE);
+	}
+}
+
 static __initconst const struct x86_pmu core_pmu = {
 	.name			= "core",
 	.handle_irq		= x86_pmu_handle_irq,
 	.disable_all		= x86_pmu_disable_all,
-	.enable_all		= x86_pmu_enable_all,
-	.enable			= x86_pmu_enable_event,
+	.enable_all		= core_pmu_enable_all,
+	.enable			= core_pmu_enable_event,
 	.disable		= x86_pmu_disable_event,
 	.hw_config		= x86_pmu_hw_config,
 	.schedule_events	= x86_schedule_events,
@@ -1307,6 +1393,8 @@ static __initconst const struct x86_pmu core_pmu = {
 	.get_event_constraints	= intel_get_event_constraints,
 	.put_event_constraints	= intel_put_event_constraints,
 	.event_constraints	= intel_core_event_constraints,
+	.guest_get_msrs_count	= core_guest_get_msrs_count,
+	.guest_get_msrs		= core_guest_get_msrs,
 };
 
 static struct intel_shared_regs *allocate_shared_regs(int cpu)
@@ -1413,6 +1501,8 @@ static __initconst const struct x86_pmu intel_pmu = {
 	.cpu_prepare		= intel_pmu_cpu_prepare,
 	.cpu_starting		= intel_pmu_cpu_starting,
 	.cpu_dying		= intel_pmu_cpu_dying,
+	.guest_get_msrs_count	= intel_guest_get_msrs_count,
+	.guest_get_msrs		= intel_guest_get_msrs,
 };
 
 static void intel_clovertown_quirks(void)
@@ -1629,6 +1719,21 @@ static __init int intel_pmu_init(void)
 	return 0;
 }
 
+int perf_guest_get_msrs_count(void)
+{
+	if (x86_pmu.guest_get_msrs_count)
+		return x86_pmu.guest_get_msrs_count();
+	return 0;
+}
+EXPORT_SYMBOL_GPL(perf_guest_get_msrs_count);
+
+int perf_guest_get_msrs(int cnt, struct perf_guest_switch_msr *arr)
+{
+	if (x86_pmu.guest_get_msrs)
+		return x86_pmu.guest_get_msrs(cnt, arr);
+	return -ENOSYS;
+}
+EXPORT_SYMBOL_GPL(perf_guest_get_msrs);
 #else /* CONFIG_CPU_SUP_INTEL */
 
 static int intel_pmu_init(void)
-- 
1.7.5.3
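
A sketch of how a hypervisor consumes this interface (patch 8/9 contains
the real KVM code; switch_msr_on_entry() here is a hypothetical helper):

	int i, cnt = perf_guest_get_msrs_count();
	struct perf_guest_switch_msr *msrs;

	if (cnt > 0) {
		msrs = kmalloc(cnt * sizeof(*msrs), GFP_KERNEL);
		if (msrs && !perf_guest_get_msrs(cnt, msrs))
			for (i = 0; i < cnt; i++)
				/* load .guest on vmentry, restore .host on vmexit */
				switch_msr_on_entry(msrs[i].msr, msrs[i].guest,
						    msrs[i].host);
	}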


* [PATCH 7/9] KVM, VMX: add support for switching of PERF_GLOBAL_CTRL
From: Gleb Natapov @ 2011-10-03 13:49 UTC
  To: linux-kernel; +Cc: avi, kvm, joerg.roedel, mingo, a.p.zijlstra

Some cpus have special VMCS support for switching the PERF_GLOBAL_CTRL
MSR.  Add logic to detect whether such support exists and works properly,
and extend the MSR switching code to use it if available. Also extend the
number of generic MSR switching entries to 8.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/kvm/vmx.c |   98 +++++++++++++++++++++++++++++++++++++++++++++++-----
 1 files changed, 89 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index f3ec38f..e4cc3c2 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -118,7 +118,7 @@ module_param(ple_gap, int, S_IRUGO);
 static int ple_window = KVM_VMX_DEFAULT_PLE_WINDOW;
 module_param(ple_window, int, S_IRUGO);
 
-#define NR_AUTOLOAD_MSRS 1
+#define NR_AUTOLOAD_MSRS 8
 #define VMCS02_POOL_SIZE 1
 
 struct vmcs {
@@ -622,6 +622,7 @@ static unsigned long *vmx_msr_bitmap_legacy;
 static unsigned long *vmx_msr_bitmap_longmode;
 
 static bool cpu_has_load_ia32_efer;
+static bool cpu_has_load_perf_global_ctrl;
 
 static DECLARE_BITMAP(vmx_vpid_bitmap, VMX_NR_VPIDS);
 static DEFINE_SPINLOCK(vmx_vpid_lock);
@@ -1195,10 +1196,29 @@ static void clear_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr)
 {
 	unsigned i;
 	struct msr_autoload *m = &vmx->msr_autoload;
+	u32 entry_load, exit_load;
+	bool done = false;
 
-	if (msr == MSR_EFER && cpu_has_load_ia32_efer) {
-		vmcs_clear_bits(VM_ENTRY_CONTROLS, VM_ENTRY_LOAD_IA32_EFER);
-		vmcs_clear_bits(VM_EXIT_CONTROLS, VM_EXIT_LOAD_IA32_EFER);
+	switch (msr) {
+	case MSR_EFER:
+		if (cpu_has_load_ia32_efer) {
+			entry_load = VM_ENTRY_LOAD_IA32_EFER;
+			exit_load = VM_EXIT_LOAD_IA32_EFER;
+			done = true;
+		}
+		break;
+	case MSR_CORE_PERF_GLOBAL_CTRL:
+		if (cpu_has_load_perf_global_ctrl) {
+			entry_load = VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
+			exit_load = VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
+			done = true;
+		}
+		break;
+	}
+
+	if (done) {
+		vmcs_clear_bits(VM_ENTRY_CONTROLS, entry_load);
+		vmcs_clear_bits(VM_EXIT_CONTROLS, exit_load);
 		return;
 	}
 
@@ -1220,12 +1240,36 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
 {
 	unsigned i;
 	struct msr_autoload *m = &vmx->msr_autoload;
+	unsigned long guest_val_vmcs, host_val_vmcs;
+	u32 entry_load, exit_load;
+	bool done = false;
+
+	switch (msr) {
+	case MSR_EFER:
+		if (cpu_has_load_ia32_efer) {
+			guest_val_vmcs = GUEST_IA32_EFER;
+			host_val_vmcs = HOST_IA32_EFER;
+			entry_load = VM_ENTRY_LOAD_IA32_EFER;
+			exit_load = VM_EXIT_LOAD_IA32_EFER;
+			done = true;
+		}
+		break;
+	case MSR_CORE_PERF_GLOBAL_CTRL:
+		if (cpu_has_load_perf_global_ctrl) {
+			guest_val_vmcs = GUEST_IA32_PERF_GLOBAL_CTRL;
+			host_val_vmcs = HOST_IA32_PERF_GLOBAL_CTRL;
+			entry_load = VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
+			exit_load = VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
+			done = true;
+		}
+		break;
+	}
 
-	if (msr == MSR_EFER && cpu_has_load_ia32_efer) {
-		vmcs_write64(GUEST_IA32_EFER, guest_val);
-		vmcs_write64(HOST_IA32_EFER, host_val);
-		vmcs_set_bits(VM_ENTRY_CONTROLS, VM_ENTRY_LOAD_IA32_EFER);
-		vmcs_set_bits(VM_EXIT_CONTROLS, VM_EXIT_LOAD_IA32_EFER);
+	if (done) {
+		vmcs_write64(guest_val_vmcs, guest_val);
+		vmcs_write64(host_val_vmcs, host_val);
+		vmcs_set_bits(VM_ENTRY_CONTROLS, entry_load);
+		vmcs_set_bits(VM_EXIT_CONTROLS, exit_load);
 		return;
 	}
 
@@ -2455,6 +2499,42 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
 		&& allow_1_setting(MSR_IA32_VMX_EXIT_CTLS,
 				   VM_EXIT_LOAD_IA32_EFER);
 
+	cpu_has_load_perf_global_ctrl =
+		allow_1_setting(MSR_IA32_VMX_ENTRY_CTLS,
+				VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL)
+		&& allow_1_setting(MSR_IA32_VMX_EXIT_CTLS,
+				   VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL);
+
+	/*
+	 * Some cpus support VM_ENTRY_(LOAD|SAVE)_IA32_PERF_GLOBAL_CTRL
+	 * but due to the errata below it can't be used. The workaround is
+	 * to use the MSR load mechanism to switch IA32_PERF_GLOBAL_CTRL.
+	 *
+	 * VM Exit May Incorrectly Clear IA32_PERF_GLOBAL_CTRL [34:32]
+	 *
+	 * AAK155             (model 26)
+	 * AAP115             (model 30)
+	 * AAT100             (model 37)
+	 * BC86,AAY89,BD102   (model 44)
+	 * BA97               (model 46)
+	 *
+	 */
+	if (cpu_has_load_perf_global_ctrl && boot_cpu_data.x86 == 0x6) {
+		switch (boot_cpu_data.x86_model) {
+		case 26:
+		case 30:
+		case 37:
+		case 44:
+		case 46:
+			cpu_has_load_perf_global_ctrl = false;
+			printk_once(KERN_WARNING"kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
+					"does not work properly. Using workaround\n");
+			break;
+		default:
+			break;
+		}
+	}
+
 	return 0;
 }
 
-- 
1.7.5.3
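
For reference, the detection relies on the pre-existing allow_1_setting()
helper in vmx.c, which tests the allowed-1 (high) half of the VMX
capability MSRs; roughly:

	static __init bool allow_1_setting(u32 msr, u32 ctl)
	{
		u32 vmx_msr_low, vmx_msr_high;

		rdmsr(msr, vmx_msr_low, vmx_msr_high);
		return vmx_msr_high & ctl;	/* may the control be set to 1? */
	}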



* [PATCH 8/9] KVM, VMX: Add support for guest/host-only profiling
From: Gleb Natapov @ 2011-10-03 13:49 UTC
  To: linux-kernel; +Cc: avi, kvm, joerg.roedel, mingo, a.p.zijlstra

Support guest/host-only profiling by switching the perf MSRs on
guest entry if needed.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/kvm/vmx.c |   35 +++++++++++++++++++++++++++++++++++
 1 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index e4cc3c2..fef7bd9 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -39,6 +39,7 @@
 #include <asm/mce.h>
 #include <asm/i387.h>
 #include <asm/xcr.h>
+#include <asm/perf_event.h>
 
 #include "trace.h"
 
@@ -417,6 +418,9 @@ struct vcpu_vmx {
 
 	/* Support for a guest hypervisor (nested VMX) */
 	struct nested_vmx nested;
+
+	int perf_msrs_cnt;
+	struct perf_guest_switch_msr *perf_msrs;
 };
 
 enum segment_cache_field {
@@ -6052,6 +6056,26 @@ static void vmx_cancel_injection(struct kvm_vcpu *vcpu)
 	vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, 0);
 }
 
+static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
+{
+#ifdef CONFIG_PERF_EVENTS
+	int i;
+
+	if (!cpu_has_arch_perfmon || vmx->perf_msrs_cnt <= 0)
+		return;
+
+	perf_guest_get_msrs(vmx->perf_msrs_cnt, vmx->perf_msrs);
+	for (i = 0; i < vmx->perf_msrs_cnt; i++) {
+		struct perf_guest_switch_msr *msr = &vmx->perf_msrs[i];
+		if (msr->host == msr->guest)
+			clear_atomic_switch_msr(vmx, msr->msr);
+		else
+			add_atomic_switch_msr(vmx, msr->msr, msr->guest,
+					msr->host);
+	}
+#endif
+}
+
 #ifdef CONFIG_X86_64
 #define R "r"
 #define Q "q"
@@ -6101,6 +6125,8 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
 		vmx_set_interrupt_shadow(vcpu, 0);
 
+	atomic_switch_perf_msrs(vmx);
+
 	vmx->__launched = vmx->loaded_vmcs->launched;
 	asm(
 		/* Store host registers */
@@ -6306,6 +6332,15 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
 	vmx->nested.current_vmptr = -1ull;
 	vmx->nested.current_vmcs12 = NULL;
 
+	vmx->perf_msrs_cnt = perf_guest_get_msrs_count();
+	if (vmx->perf_msrs_cnt > 0) {
+		vmx->perf_msrs = kmalloc(vmx->perf_msrs_cnt *
+				sizeof(struct perf_guest_switch_msr),
+				GFP_KERNEL);
+		if (!vmx->perf_msrs)
+			goto free_vmcs;
+	}
+
 	return &vmx->vcpu;
 
 free_vmcs:
-- 
1.7.5.3


* [PATCH 9/9] KVM, VMX: Check for automatic switch msr table overflow.
From: Gleb Natapov @ 2011-10-03 13:49 UTC
  To: linux-kernel; +Cc: avi, kvm, joerg.roedel, mingo, a.p.zijlstra

Warn once and skip the MSR instead of silently overflowing the automatic
MSR switch table when more than NR_AUTOLOAD_MSRS entries are needed.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/kvm/vmx.c |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index fef7bd9..afa9aa3 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1281,7 +1281,11 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
 		if (m->guest[i].index == msr)
 			break;
 
-	if (i == m->nr) {
+	if (i == NR_AUTOLOAD_MSRS) {
+		printk_once(KERN_WARNING"Not enough mst switch entries. "
+				"Can't add msr %x\n", msr);
+		return;
+	} else if (i == m->nr) {
 		++m->nr;
 		vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, m->nr);
 		vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, m->nr);
-- 
1.7.5.3



* Re: [PATCH 8/9] KVM, VMX: Add support for guest/host-only profiling
From: Avi Kivity @ 2011-10-03 15:00 UTC
  To: Gleb Natapov; +Cc: linux-kernel, kvm, joerg.roedel, mingo, a.p.zijlstra

On 10/03/2011 03:49 PM, Gleb Natapov wrote:
> Support guest/host-only profiling by switch perf msrs on
> a guest entry if needed.
>
> @@ -6052,6 +6056,26 @@ static void vmx_cancel_injection(struct kvm_vcpu *vcpu)
>   	vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, 0);
>   }
>
> +static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
> +{
> +#ifdef CONFIG_PERF_EVENTS

No need for #ifdef (if you also define perf_guest_get_msrs() when 
!CONFIG_PERF_EVENTS).
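
For concreteness, one possible shape of that stub (a sketch, not from the
series):

	static inline int perf_guest_get_msrs(int cnt,
					      struct perf_guest_switch_msr *arr)
	{
		return -ENOSYS;	/* nothing to switch without perf events */
	}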

>
> +	int i;
> +
> +	if (!cpu_has_arch_perfmon || vmx->perf_msrs_cnt <= 0)
> +		return;

Why the first check?

>
> +
> +	perf_guest_get_msrs(vmx->perf_msrs_cnt, vmx->perf_msrs);
> +	for (i = 0; i < vmx->perf_msrs_cnt; i++) {
> +		struct perf_guest_switch_msr *msr = &vmx->perf_msrs[i];
> +		if (msr->host == msr->guest)
> +			clear_atomic_switch_msr(vmx, msr->msr);
> +		else
> +			add_atomic_switch_msr(vmx, msr->msr, msr->guest,
> +					msr->host);

This generates a lot of VMWRITEs even if nothing changes, just to re-set 
bits in the VMCS to their existing values.  Need to add something like this:

    if (loaded_vmcs->msr[i].host == msr->host &&
        loaded_vmcs->msr[i].guest == msr->guest)
            continue;

btw, shouldn't the msr autoload list be part of loaded_vmcs as well?

>
> +	}
> +#endif
> +}
> +
>   #ifdef CONFIG_X86_64
>   #define R "r"
>   #define Q "q"
> @@ -6101,6 +6125,8 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
>   	if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
>   		vmx_set_interrupt_shadow(vcpu, 0);
>
> +	atomic_switch_perf_msrs(vmx);
> +
>   	vmx->__launched = vmx->loaded_vmcs->launched;
>   	asm(
>   		/* Store host registers */
> @@ -6306,6 +6332,15 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
>   	vmx->nested.current_vmptr = -1ull;
>   	vmx->nested.current_vmcs12 = NULL;
>
> +	vmx->perf_msrs_cnt = perf_guest_get_msrs_count();
> +	if (vmx->perf_msrs_cnt > 0) {
> +		vmx->perf_msrs = kmalloc(vmx->perf_msrs_cnt *
> +				sizeof(struct perf_guest_switch_msr),
> +				GFP_KERNEL);
> +		if (!vmx->perf_msrs)
> +			goto free_vmcs;
> +	}
> +

Do we really need a private buffer?  Perhaps perf_guest_get_msrs() can 
return a perf-internal buffer (but then, we will need to copy it for the 
optimization above, but that's a separate issue).

-- 
error compiling committee.c: too many arguments to function


* Re: [PATCH 6/9] perf, intel: Use GO/HO bits in perf-ctr
From: Avi Kivity @ 2011-10-03 15:06 UTC
  To: Gleb Natapov; +Cc: linux-kernel, kvm, joerg.roedel, mingo, a.p.zijlstra

On 10/03/2011 03:49 PM, Gleb Natapov wrote:
> Intel does not have guest/host-only bit in perf counters like AMD
> does.  To support GO/HO bits KVM needs to switch EVENTSELn values
> (or PERF_GLOBAL_CTRL if available) at a guest entry. If a counter is
> configured to count only in a guest mode it stays disabled in a host,
> but VMX is configured to switch it to enabled value during guest entry.
>
> This patch adds GO/HO tracking to Intel perf code and provides interface
> for KVM to get a list of MSRs that need to be switched on a guest entry.
>
> Only cpus with architectural PMU (v1 or later) are supported with this
> patch.  To my knowledge there is not p6 models with VMX but without
> architectural PMU and p4 with VMX are rare and the interface is general
> enough to support them if need arise.
>
> +
> +static int core_guest_get_msrs(int cnt, struct perf_guest_switch_msr *arr)
> +{
> >+	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
> >+	int idx;
> >+
> >+	if (cnt < x86_pmu.num_counters)
> >+		return -ENOMEM;
> >+
> >+	for (idx = 0; idx < x86_pmu.num_counters; idx++)  {
> +		struct perf_event *event = cpuc->events[idx];
> +
> +		arr[idx].msr = x86_pmu_config_addr(idx);
> +		arr[idx].host = arr[idx].guest = 0;
> +
> +		if (!test_bit(idx, cpuc->active_mask))
> +			continue;
> +
> +		arr[idx].host = arr[idx].guest =
> +				event->hw.config | ARCH_PERFMON_EVENTSEL_ENABLE;
> +
> +		if (event->attr.exclude_host)
> >+			arr[idx].host &= ~ARCH_PERFMON_EVENTSEL_ENABLE;
> >+		else if (event->attr.exclude_guest)
> >+			arr[idx].guest &= ~ARCH_PERFMON_EVENTSEL_ENABLE;
> +	}
> +
> +	return 0;
> +}

Would be better to calculate these when the host msrs are calculated, 
instead of here, every vmentry.


-- 
error compiling committee.c: too many arguments to function



* Re: [PATCH 8/9] KVM, VMX: Add support for guest/host-only profiling
From: Gleb Natapov @ 2011-10-03 15:36 UTC
  To: Avi Kivity; +Cc: linux-kernel, kvm, joerg.roedel, mingo, a.p.zijlstra

On Mon, Oct 03, 2011 at 05:00:25PM +0200, Avi Kivity wrote:
> On 10/03/2011 03:49 PM, Gleb Natapov wrote:
> >Support guest/host-only profiling by switch perf msrs on
> >a guest entry if needed.
> >
> >@@ -6052,6 +6056,26 @@ static void vmx_cancel_injection(struct kvm_vcpu *vcpu)
> >  	vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, 0);
> >  }
> >
> >+static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
> >+{
> >+#ifdef CONFIG_PERF_EVENTS
> 
> No need for #ifdef (if you also define perf_guest_get_msrs() when
> !CONFIG_PERF_EVENTS).
> 
Yes, but will the compiler be smart enough to remove the code of the
function completely? It will have to figure out that vmx->perf_msrs_cnt
is always 0 somehow.

> >
> >+	int i;
> >+
> >+	if (!cpu_has_arch_perfmon || vmx->perf_msrs_cnt <= 0)
> >+		return;
> 
> Why the first check?
> 
Leftovers from previous iteration of the patch. Not needed any more.

> >
> >+
> >+	perf_guest_get_msrs(vmx->perf_msrs_cnt, vmx->perf_msrs);
> >+	for (i = 0; i < vmx->perf_msrs_cnt; i++) {
> >+		struct perf_guest_switch_msr *msr = &vmx->perf_msrs[i];
> >+		if (msr->host == msr->guest)
> >+			clear_atomic_switch_msr(vmx, msr->msr);
> >+		else
> >+			add_atomic_switch_msr(vmx, msr->msr, msr->guest,
> >+					msr->host);
> 
> This generates a lot of VMWRITEs even if nothing changes, just to
> re-set bits in the VMCS to their existing values.  Need to add
> something like this:
> 
>    if (loaded_vmcs->msr[i].host == msr->host &&
>        loaded_vmcs->msr[i].guest == msr->guest)
>            continue;
VMWRITE happens only when the number of autoloaded MSRs changes (which is
rare), not on each call to add_atomic_switch_msr(). I thought about
optimizing this write too, by doing the
vmcs_write32(VM_(ENTRY|EXIT)_MSR_LOAD_COUNT, m->nr) only once, by
checking whether m->nr changed during vmentry. Can be done later.
> 
> btw, shouldn't the msr autoload list be part of loaded_vmcs as well?
> 
Why?

> >
> >+	}
> >+#endif
> >+}
> >+
> >  #ifdef CONFIG_X86_64
> >  #define R "r"
> >  #define Q "q"
> >@@ -6101,6 +6125,8 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
> >  	if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
> >  		vmx_set_interrupt_shadow(vcpu, 0);
> >
> >+	atomic_switch_perf_msrs(vmx);
> >+
> >  	vmx->__launched = vmx->loaded_vmcs->launched;
> >  	asm(
> >  		/* Store host registers */
> >@@ -6306,6 +6332,15 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
> >  	vmx->nested.current_vmptr = -1ull;
> >  	vmx->nested.current_vmcs12 = NULL;
> >
> >+	vmx->perf_msrs_cnt = perf_guest_get_msrs_count();
> >+	if (vmx->perf_msrs_cnt > 0) {
> >+		vmx->perf_msrs = kmalloc(vmx->perf_msrs_cnt *
> >+				sizeof(struct perf_guest_switch_msr),
> >+				GFP_KERNEL);
> >+		if (!vmx->perf_msrs)
> >+			goto free_vmcs;
> >+	}
> >+
> 
> Do we really need a private buffer?  Perhaps perf_guest_get_msrs()
> can return a perf-internal buffer (but then, we will need to copy it
> for the optimization above, but that's a separate issue).
> 
The buffer will be small, so IMHO a private one is not an issue. We can
make it a perf-internal per-cpu buffer, I think.

--
			Gleb.


* Re: [PATCH 6/9] perf, intel: Use GO/HO bits in perf-ctr
From: Gleb Natapov @ 2011-10-03 15:41 UTC
  To: Avi Kivity; +Cc: linux-kernel, kvm, joerg.roedel, mingo, a.p.zijlstra

On Mon, Oct 03, 2011 at 05:06:27PM +0200, Avi Kivity wrote:
> On 10/03/2011 03:49 PM, Gleb Natapov wrote:
> >Intel does not have guest/host-only bit in perf counters like AMD
> >does.  To support GO/HO bits KVM needs to switch EVENTSELn values
> >(or PERF_GLOBAL_CTRL if available) at a guest entry. If a counter is
> >configured to count only in a guest mode it stays disabled in a host,
> >but VMX is configured to switch it to enabled value during guest entry.
> >
> >This patch adds GO/HO tracking to Intel perf code and provides interface
> >for KVM to get a list of MSRs that need to be switched on a guest entry.
> >
> >Only cpus with architectural PMU (v1 or later) are supported with this
> >patch.  To my knowledge there is not p6 models with VMX but without
> >architectural PMU and p4 with VMX are rare and the interface is general
> >enough to support them if need arise.
> >
> >+
> >+static int core_guest_get_msrs(int cnt, struct perf_guest_switch_msr *arr)
> >+{
> >+	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
> >+	int idx;
> >+
> >+	if (cnt < x86_pmu.num_counters)
> >+		return -ENOMEM;
> >+
> >+	for (idx = 0; idx < x86_pmu.num_counters; idx++)  {
> >+		struct perf_event *event = cpuc->events[idx];
> >+
> >+		arr[idx].msr = x86_pmu_config_addr(idx);
> >+		arr[idx].host = arr[idx].guest = 0;
> >+
> >+		if (!test_bit(idx, cpuc->active_mask))
> >+			continue;
> >+
> >+		arr[idx].host = arr[idx].guest =
> >+				event->hw.config | ARCH_PERFMON_EVENTSEL_ENABLE;
> >+
> >+		if (event->attr.exclude_host)
> >+			arr[idx].host &= ~ARCH_PERFMON_EVENTSEL_ENABLE;
> >+		else if (event->attr.exclude_guest)
> >+			arr[idx].guest &= ~ARCH_PERFMON_EVENTSEL_ENABLE;
> >+	}
> >+
> >+	return 0;
> >+}
> 
> Would be better to calculate these when the host msrs are
> calculated, instead of here, every vmentry.
>
For arch PMU v2 and greater it is precalculated. For v1 (which is almost
non-existent; even my oldest cpu with VMX has a v2 PMU) I am not sure it
will help, since we need to copy the information into the
perf_guest_switch_msr array here anyway.

--
			Gleb.


* Re: [PATCH 6/9] perf, intel: Use GO/HO bits in perf-ctr
From: Avi Kivity @ 2011-10-04  9:21 UTC
  To: Gleb Natapov; +Cc: linux-kernel, kvm, joerg.roedel, mingo, a.p.zijlstra

On 10/03/2011 05:41 PM, Gleb Natapov wrote:
> >  Would be better to calculate these when the host msrs are
> >  calculated, instead of here, every vmentry.
> >
> For arch PMU v2 and greater it is precalculated. For v1 (which is almost
> non existent, even my oldest cpu with VMX has v2 PMU) I am not sure it
> will help since we need to copy information to perf_guest_switch_msr
> array anyway here.
>

Okay.

-- 
error compiling committee.c: too many arguments to function


* Re: [PATCH 8/9] KVM, VMX: Add support for guest/host-only profiling
From: Avi Kivity @ 2011-10-04  9:28 UTC
  To: Gleb Natapov; +Cc: linux-kernel, kvm, joerg.roedel, mingo, a.p.zijlstra

On 10/03/2011 05:36 PM, Gleb Natapov wrote:
> On Mon, Oct 03, 2011 at 05:00:25PM +0200, Avi Kivity wrote:
> >  On 10/03/2011 03:49 PM, Gleb Natapov wrote:
> >  >Support guest/host-only profiling by switch perf msrs on
> >  >a guest entry if needed.
> >  >
> >  >@@ -6052,6 +6056,26 @@ static void vmx_cancel_injection(struct kvm_vcpu *vcpu)
> >  >   	vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, 0);
> >  >   }
> >  >
> >  >+static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
> >  >+{
> >  >+#ifdef CONFIG_PERF_EVENTS
> >
> >  No need for #ifdef (if you also define perf_guest_get_msrs() when
> >  !CONFIG_PERF_EVENTS).
> >
> Yes, but will compiler be smart enough to remove the code of the
> function completely? It will have to figure that vmx->perf_msrs_cnt is
> always 0 somehow.

It won't, but do we care?

> >  >
> >  >+
> >  >+	perf_guest_get_msrs(vmx->perf_msrs_cnt, vmx->perf_msrs);
> >  >>  >+	for (i = 0; i < vmx->perf_msrs_cnt; i++) {
> >  >>  >+		struct perf_guest_switch_msr *msr = &vmx->perf_msrs[i];
> >  >+		if (msr->host == msr->guest)
> >  >+			clear_atomic_switch_msr(vmx, msr->msr);
> >  >+		else
> >  >+			add_atomic_switch_msr(vmx, msr->msr, msr->guest,
> >  >+					msr->host);
> >
> >  This generates a lot of VMWRITEs even if nothing changes, just to
> >  re-set bits in the VMCS to their existing values.  Need to add
> >  something like this:
> >
> >     if (loaded_vmcs->msr[i].host == msr->host &&
> >         loaded_vmcs->msr[i].guest == msr->guest)
> >            continue;
> VMWRITE happens only when number of autoloaded MSRs changes (which is
> rare), not on each call to add_atomic_switch_msr(). I thought about
> optimizing this write too by doing
> vmcs_write32(VM_(ENTRY|EXIT)_MSR_LOAD_COUNT, m->nr) only once by
> checking that  m->nr changed during vmentry. Can be done later.

For EFER and PERF_CTRL, it's done unconditionally, no?

> >
> >  btw, shouldn't the msr autoload list be part of loaded_vmcs as well?
> >
> Why?

Any caching is only relative to the vmcs (unless we invalidate the cache 
on vmcs switch).

> >
> >  Do we really need a private buffer?  Perhaps perf_guest_get_msrs()
> >  can return a perf-internal buffer (but then, we will need to copy it
> >  for the optimization above, but that's a separate issue).
> >
> The buffer will be small, so IMHO private one is not an issue. We can
> make it perf internal per cpu buffer I think.
>

I think the API is nicer with perf returning a read-only internal 
buffer; this way there is no kmalloc involved since perf knows its 
internal limits.
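
A sketch of that shape (hypothetical signature; perf would hand out a
pointer into a per-cpu array together with the entry count):

	struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr);

	/* caller side: */
	int nr, i;
	struct perf_guest_switch_msr *msrs = perf_guest_get_msrs(&nr);

	for (i = 0; i < nr; i++)
		add_atomic_switch_msr(vmx, msrs[i].msr,
				      msrs[i].guest, msrs[i].host);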

-- 
error compiling committee.c: too many arguments to function


* Re: [PATCH 7/9] KVM, VMX: add support for switching of PERF_GLOBAL_CTRL
From: Avi Kivity @ 2011-10-04  9:32 UTC
  To: Gleb Natapov; +Cc: linux-kernel, kvm, joerg.roedel, mingo, a.p.zijlstra

On 10/03/2011 03:49 PM, Gleb Natapov wrote:
> Some cpus have special support for switching PERF_GLOBAL_CTRL msr.
> Add logic to detect if such support exists and works properly and extend
> msr switching code to use it if available. Also extend number of generic
> msr switching entries to 8.
>
>
>   static DECLARE_BITMAP(vmx_vpid_bitmap, VMX_NR_VPIDS);
>   static DEFINE_SPINLOCK(vmx_vpid_lock);
> @@ -1195,10 +1196,29 @@ static void clear_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr)
>   {
>   	unsigned i;
>   	struct msr_autoload *m = &vmx->msr_autoload;
> +	u32 entry_load, exit_load;
> +	bool done = false;
>
> -	if (msr == MSR_EFER && cpu_has_load_ia32_efer) {
> -		vmcs_clear_bits(VM_ENTRY_CONTROLS, VM_ENTRY_LOAD_IA32_EFER);
> -		vmcs_clear_bits(VM_EXIT_CONTROLS, VM_EXIT_LOAD_IA32_EFER);
> +	switch (msr) {
> +	case MSR_EFER:
> +		if (cpu_has_load_ia32_efer) {
> +			entry_load = VM_ENTRY_LOAD_IA32_EFER;
> +			exit_load = VM_EXIT_LOAD_IA32_EFER;
> +			done = true;
> +		}
> +		break;
> +	case MSR_CORE_PERF_GLOBAL_CTRL:
> +		if (cpu_has_load_perf_global_ctrl) {
> +			entry_load = VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
> +			exit_load = VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
> +			done = true;
> +		}
> +		break;
> +	}
> +
> +	if (done) {
> +		vmcs_clear_bits(VM_ENTRY_CONTROLS, entry_load);
> +		vmcs_clear_bits(VM_EXIT_CONTROLS, exit_load);
>   		return;
>   	}
>
>

This is ugly.  How about

    if (msr == MSR_EFER && cpu_has_load_ia32_efer) {
        clear_atomic_switch_msr_special(VM_ENTRY_LOAD_IA32_EFER,
                                        VM_EXIT_LOAD_IA32_EFER);
        return;
    }

avoids the fake 'done' variable.

-- 
error compiling committee.c: too many arguments to function


* Re: [PATCH 8/9] KVM, VMX: Add support for guest/host-only profiling
From: Gleb Natapov @ 2011-10-04  9:56 UTC
  To: Avi Kivity; +Cc: linux-kernel, kvm, joerg.roedel, mingo, a.p.zijlstra

On Tue, Oct 04, 2011 at 11:28:51AM +0200, Avi Kivity wrote:
> On 10/03/2011 05:36 PM, Gleb Natapov wrote:
> >On Mon, Oct 03, 2011 at 05:00:25PM +0200, Avi Kivity wrote:
> >>  On 10/03/2011 03:49 PM, Gleb Natapov wrote:
> >>  >Support guest/host-only profiling by switch perf msrs on
> >>  >a guest entry if needed.
> >>  >
> >>  >@@ -6052,6 +6056,26 @@ static void vmx_cancel_injection(struct kvm_vcpu *vcpu)
> >>  >   	vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, 0);
> >>  >   }
> >>  >
> >>  >+static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
> >>  >+{
> >>  >+#ifdef CONFIG_PERF_EVENTS
> >>
> >>  No need for #ifdef (if you also define perf_guest_get_msrs() when
> >>  !CONFIG_PERF_EVENTS).
> >>
> >Yes, but will compiler be smart enough to remove the code of the
> >function completely? It will have to figure that vmx->perf_msrs_cnt is
> >always 0 somehow.
> 
> It won't, but do we care?
> 
Dead code, likely to be inlined, on a hot path.

> >>  >
> >>  >+
> >>  >+	perf_guest_get_msrs(vmx->perf_msrs_cnt, vmx->perf_msrs);
> >>  >+	for (i = 0; i < vmx->perf_msrs_cnt; i++) {
> >>  >+		struct perf_guest_switch_msr *msr = &vmx->perf_msrs[i];
> >>  >+		if (msr->host == msr->guest)
> >>  >+			clear_atomic_switch_msr(vmx, msr->msr);
> >>  >+		else
> >>  >+			add_atomic_switch_msr(vmx, msr->msr, msr->guest,
> >>  >+					msr->host);
> >>
> >>  This generates a lot of VMWRITEs even if nothing changes, just to
> >>  re-set bits in the VMCS to their existing values.  Need to add
> >>  something like this:
> >>
> >>     if (loaded_vmcs->msr[i].host == msr->host &&
> >>         loaded_vmcs->msr[i].guest == msr->guest)
> >>            continue;
> >VMWRITE happens only when number of autoloaded MSRs changes (which is
> >rare), not on each call to add_atomic_switch_msr(). I thought about
> >optimizing this write too by doing
> >vmcs_write32(VM_(ENTRY|EXIT)_MSR_LOAD_COUNT, m->nr) only once by
> >checking that  m->nr changed during vmentry. Can be done later.
> 
> For EFER and PERF_CTRL, it's done unconditionally, no?
For those, yes. We do not cache their values currently. Caching can be
added, but that is an independent optimization.

> 
> >>
> >>  btw, shouldn't the msr autoload list be part of loaded_vmcs as well?
> >>
> >Why?
> 
> Any caching is only relative to the vmcs (unless we invalidate the
> cache on vmcs switch).
Ah, you are talking about a not-yet-existent cache. Then I see why it
should be in loaded_vmcs for EFER and PERF_CTRL.

> 
> >>
> >>  Do we really need a private buffer?  Perhaps perf_guest_get_msrs()
> >>  can return a perf-internal buffer (but then, we will need to copy it
> >>  for the optimization above, but that's a separate issue).
> >>
> >The buffer will be small, so IMHO private one is not an issue. We can
> >make it perf internal per cpu buffer I think.
> >
> 
> I think the API is nicer with perf returning a read-only internal
> buffer; this way there is no kmalloc involved since perf knows its
> internal limits.
> 
Yeah, I am trying it now it it looks nicer.

--
			Gleb.


* Re: [PATCH 7/9] KVM, VMX: add support for switching of PERF_GLOBAL_CTRL
From: Gleb Natapov @ 2011-10-04  9:57 UTC
  To: Avi Kivity; +Cc: linux-kernel, kvm, joerg.roedel, mingo, a.p.zijlstra

On Tue, Oct 04, 2011 at 11:32:03AM +0200, Avi Kivity wrote:
> On 10/03/2011 03:49 PM, Gleb Natapov wrote:
> >Some cpus have special support for switching PERF_GLOBAL_CTRL msr.
> >Add logic to detect if such support exists and works properly and extend
> >msr switching code to use it if available. Also extend number of generic
> >msr switching entries to 8.
> >
> >
> >  static DECLARE_BITMAP(vmx_vpid_bitmap, VMX_NR_VPIDS);
> >  static DEFINE_SPINLOCK(vmx_vpid_lock);
> >@@ -1195,10 +1196,29 @@ static void clear_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr)
> >  {
> >  	unsigned i;
> >  	struct msr_autoload *m = &vmx->msr_autoload;
> >+	u32 entry_load, exit_load;
> >+	bool done = false;
> >
> >-	if (msr == MSR_EFER && cpu_has_load_ia32_efer) {
> >-		vmcs_clear_bits(VM_ENTRY_CONTROLS, VM_ENTRY_LOAD_IA32_EFER);
> >-		vmcs_clear_bits(VM_EXIT_CONTROLS, VM_EXIT_LOAD_IA32_EFER);
> >+	switch (msr) {
> >+	case MSR_EFER:
> >+		if (cpu_has_load_ia32_efer) {
> >+			entry_load = VM_ENTRY_LOAD_IA32_EFER;
> >+			exit_load = VM_EXIT_LOAD_IA32_EFER;
> >+			done = true;
> >+		}
> >+		break;
> >+	case MSR_CORE_PERF_GLOBAL_CTRL:
> >+		if (cpu_has_load_perf_global_ctrl) {
> >+			entry_load = VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
> >+			exit_load = VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
> >+			done = true;
> >+		}
> >+		break;
> >+	}
> >+
> >+	if (done) {
> >+		vmcs_clear_bits(VM_ENTRY_CONTROLS, entry_load);
> >+		vmcs_clear_bits(VM_EXIT_CONTROLS, exit_load);
> >  		return;
> >  	}
> >
> >
> 
> This is ugly.  How about
> 
>    if (msr == MSR_EFER && cpu_has_load_ia32_efer) {
> >        clear_atomic_switch_msr_special(VM_ENTRY_LOAD_IA32_EFER,
> >                                        VM_EXIT_LOAD_IA32_EFER);
>        return;
>    }
> 
> avoids the fake 'done' variable.
> 
OK.

--
			Gleb.


* Re: [PATCH 8/9] KVM, VMX: Add support for guest/host-only profiling
From: Avi Kivity @ 2011-10-04 11:10 UTC
  To: Gleb Natapov; +Cc: linux-kernel, kvm, joerg.roedel, mingo, a.p.zijlstra

On 10/04/2011 11:56 AM, Gleb Natapov wrote:
> On Tue, Oct 04, 2011 at 11:28:51AM +0200, Avi Kivity wrote:
> >  On 10/03/2011 05:36 PM, Gleb Natapov wrote:
> >  >On Mon, Oct 03, 2011 at 05:00:25PM +0200, Avi Kivity wrote:
> >  >>   On 10/03/2011 03:49 PM, Gleb Natapov wrote:
> >  >>   >Support guest/host-only profiling by switch perf msrs on
> >  >>   >a guest entry if needed.
> >  >>   >
> >  >>   >@@ -6052,6 +6056,26 @@ static void vmx_cancel_injection(struct kvm_vcpu *vcpu)
> >  >>   >    	vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, 0);
> >  >>   >    }
> >  >>   >
> >  >>   >+static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
> >  >>   >+{
> >  >>   >+#ifdef CONFIG_PERF_EVENTS
> >  >>
> >  >>   No need for #ifdef (if you also define perf_guest_get_msrs() when
> >  >>   !CONFIG_PERF_EVENTS).
> >  >>
> >  >Yes, but will compiler be smart enough to remove the code of the
> >  >function completely? It will have to figure that vmx->perf_msrs_cnt is
> >  >always 0 somehow.
> >
> >  It won't, but do we care?
> >
> Dead code, that likely to be inlined, on a hot path.

I mean, CONFIG_KVM && !CONFIG_PERF_EVENTS is an unlikely combination.  
If you're using kvm, you usually want PERF_EVENTS.

> >  >VMWRITE happens only when the number of autoloaded MSRs changes (which
> >  >is rare), not on each call to add_atomic_switch_msr(). I thought about
> >  >optimizing this write too, by doing
> >  >vmcs_write32(VM_(ENTRY|EXIT)_MSR_LOAD_COUNT, m->nr) only once, at
> >  >vmentry, after checking that m->nr changed. Can be done later.
> >
> >  For EFER and PERF_CTRL, it's done unconditionally, no?
> For those, yes. We do not currently cache their values. Caching can be
> added, but that is an independent optimization.

Sure.  I note that on most cpus only these will ever be used.
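
For reference, a hypothetical sketch of the deferred-count-write
optimization Gleb describes; the shadow field is invented here purely
for illustration:

	/* hypothetical shadow of the autoload count in struct vcpu_vmx */
	u32 msr_load_count_shadow;

	/* on vmentry: touch the VMCS count fields only when m->nr changed */
	if (vmx->msr_autoload.nr != vmx->msr_load_count_shadow) {
		vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, vmx->msr_autoload.nr);
		vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.nr);
		vmx->msr_load_count_shadow = vmx->msr_autoload.nr;
	}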

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 8/9] KVM, VMX: Add support for guest/host-only profiling
  2011-10-04 11:10           ` Avi Kivity
@ 2011-10-04 11:17             ` Gleb Natapov
  2011-10-04 11:24               ` Avi Kivity
  0 siblings, 1 reply; 23+ messages in thread
From: Gleb Natapov @ 2011-10-04 11:17 UTC (permalink / raw)
  To: Avi Kivity; +Cc: linux-kernel, kvm, joerg.roedel, mingo, a.p.zijlstra

On Tue, Oct 04, 2011 at 01:10:24PM +0200, Avi Kivity wrote:
> On 10/04/2011 11:56 AM, Gleb Natapov wrote:
> >On Tue, Oct 04, 2011 at 11:28:51AM +0200, Avi Kivity wrote:
> >>  On 10/03/2011 05:36 PM, Gleb Natapov wrote:
> >>  >On Mon, Oct 03, 2011 at 05:00:25PM +0200, Avi Kivity wrote:
> >>  >>   On 10/03/2011 03:49 PM, Gleb Natapov wrote:
> >>  >>   >Support guest/host-only profiling by switching perf MSRs on
> >>  >>   >guest entry if needed.
> >>  >>   >
> >>  >>   >@@ -6052,6 +6056,26 @@ static void vmx_cancel_injection(struct kvm_vcpu *vcpu)
> >>  >>   >    	vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, 0);
> >>  >>   >    }
> >>  >>   >
> >>  >>   >+static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
> >>  >>   >+{
> >>  >>   >+#ifdef CONFIG_PERF_EVENTS
> >>  >>
> >>  >>   No need for #ifdef (if you also define perf_guest_get_msrs() when
> >>  >>   !CONFIG_PERF_EVENTS).
> >>  >>
> >>  >Yes, but will the compiler be smart enough to remove the code of the
> >>  >function completely? It would have to figure out that vmx->perf_msrs_cnt
> >>  >is always 0 somehow.
> >>
> >>  It won't, but do we care?
> >>
> >Dead code, likely to be inlined, on a hot path.
> 
> I mean, CONFIG_KVM && !CONFIG_PERF_EVENTS is an unlikely
> combination.  If you're using kvm, you usually want PERF_EVENTS.
> 
Who knows. Think about someone building an appliance with embedded KVM
and trying to achieve a minimal code footprint. It is much easier to add
ifdefs at the development stage than to figure out later what can be
ifdeffed. If we do:
 if (!(cnt = perf_guest_get_msrs_count()))
	return;

at the beginning of atomic_switch_perf_msrs(), then the compiler can
eliminate the dead code in the !CONFIG_PERF_EVENTS case, since
perf_guest_get_msrs_count() will become a constant 0; but this will add
two function calls on vmentry in the CONFIG_PERF_EVENTS case.

--
			Gleb.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 8/9] KVM, VMX: Add support for guest/host-only profiling
  2011-10-04 11:17             ` Gleb Natapov
@ 2011-10-04 11:24               ` Avi Kivity
  2011-10-04 12:12                 ` Gleb Natapov
  0 siblings, 1 reply; 23+ messages in thread
From: Avi Kivity @ 2011-10-04 11:24 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: linux-kernel, kvm, joerg.roedel, mingo, a.p.zijlstra

On 10/04/2011 01:17 PM, Gleb Natapov wrote:
> >
> >  I mean, CONFIG_KVM && !CONFIG_PERF_EVENTS is an unlikely
> >  combination.  If you're using kvm, you usually want PERF_EVENTS.
> >
> Who knows. Think about someone building an appliance with embedded KVM
> and trying to achieve a minimal code footprint.

Saving a few dozen bytes, then launching a 1GB guest?

> It is much easier to add ifdefs
> at the development stage than to figure out later what can be
> ifdeffed. If we do:
>   if (!(cnt = perf_guest_get_msrs_count()))
> 	return;
>
> at the beginning of atomic_switch_perf_msrs(), then the compiler can
> eliminate the dead code in the !CONFIG_PERF_EVENTS case, since
> perf_guest_get_msrs_count() will become a constant 0; but this will add
> two function calls on vmentry in the CONFIG_PERF_EVENTS case.
>

Then move it to the beginning:

    nr_msrs = perf_guest_get_msrs(&msr_buffer);
    for (i = 0; i < nr_msrs; ++i)
           add_atomic_switch_msr((*msr_buffer)[i]);

the compiler will kill the loop if nr_msrs is statically 0.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 8/9] KVM, VMX: Add support for guest/host-only profiling
  2011-10-04 11:24               ` Avi Kivity
@ 2011-10-04 12:12                 ` Gleb Natapov
  0 siblings, 0 replies; 23+ messages in thread
From: Gleb Natapov @ 2011-10-04 12:12 UTC (permalink / raw)
  To: Avi Kivity; +Cc: linux-kernel, kvm, joerg.roedel, mingo, a.p.zijlstra

On Tue, Oct 04, 2011 at 01:24:18PM +0200, Avi Kivity wrote:
> On 10/04/2011 01:17 PM, Gleb Natapov wrote:
> >>
> >>  I mean, CONFIG_KVM && !CONFIG_PERF_EVENTS is an unlikely
> >>  combination.  If you're using kvm, you usually want PERF_EVENTS.
> >>
> >Who knows. Think about someone building an appliance with embedded KVM
> >and trying to achieve a minimal code footprint.
> 
> Saving a few dozen bytes, then launching a 1GB guest?
> 
> >It is much easier to add ifdefs
> >at the development stage than to figure out later what can be
> >ifdeffed. If we do:
> >  if (!(cnt = perf_guest_get_msrs_count()))
> >	return;
> >
> >at the beginning of atomic_switch_perf_msrs(), then the compiler can
> >eliminate the dead code in the !CONFIG_PERF_EVENTS case, since
> >perf_guest_get_msrs_count() will become a constant 0; but this will add
> >two function calls on vmentry in the CONFIG_PERF_EVENTS case.
> >
> 
> Then move it to the beginning:
> 
>    nr_msrs = perf_guest_get_msrs(&msr_buffer);
>    for (i = 0; i < nr_msrs; ++i)
>           add_atomic_switch_msr((*msr_buffer)[i]);
> 
> the compiler will kill the loop if nr_msrs is statically 0.
> 
Good idea. Just get rid of perf_guest_get_msrs_count().
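
For reference, a sketch of where the thread lands, assuming
perf_guest_get_msrs() fills in a count and returns a pointer to an
array of {msr, host, guest} entries (the struct layout and the
add/clear split are assumptions based on this discussion, not
necessarily the committed code):

static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
{
	int i, nr_msrs;
	struct perf_guest_switch_msr *msrs;

	/* stubbed to return NULL with nr_msrs = 0 when !CONFIG_PERF_EVENTS,
	 * so the compiler can drop the loop in that configuration */
	msrs = perf_guest_get_msrs(&nr_msrs);
	if (!msrs)
		return;

	for (i = 0; i < nr_msrs; ++i)
		if (msrs[i].host == msrs[i].guest)
			clear_atomic_switch_msr(vmx, msrs[i].msr);
		else
			add_atomic_switch_msr(vmx, msrs[i].msr,
					      msrs[i].guest, msrs[i].host);
}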

--
			Gleb.

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2011-10-04 12:12 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-10-03 13:49 [PATCH 0/9] perf support for x86 guest/host-only bits Gleb Natapov
2011-10-03 13:49 ` [PATCH 1/9] perf, core: Introduce attrs to count in either host or guest mode Gleb Natapov
2011-10-03 13:49 ` [PATCH 2/9] perf, amd: Use GO/HO bits in perf-ctr Gleb Natapov
2011-10-03 13:49 ` [PATCH 3/9] perf, tools: Add support for guest/host-only profiling Gleb Natapov
2011-10-03 13:49 ` [PATCH 4/9] perf, tools: Fix copy&paste error in perf-kvm option description Gleb Natapov
2011-10-03 13:49 ` [PATCH 5/9] perf, tools: Do guest-only counting in perf-kvm by default Gleb Natapov
2011-10-03 13:49 ` [PATCH 6/9] perf, intel: Use GO/HO bits in perf-ctr Gleb Natapov
2011-10-03 15:06   ` Avi Kivity
2011-10-03 15:41     ` Gleb Natapov
2011-10-04  9:21       ` Avi Kivity
2011-10-03 13:49 ` [PATCH 7/9] KVM, VMX: add support for switching of PERF_GLOBAL_CTRL Gleb Natapov
2011-10-04  9:32   ` Avi Kivity
2011-10-04  9:57     ` Gleb Natapov
2011-10-03 13:49 ` [PATCH 8/9] KVM, VMX: Add support for guest/host-only profiling Gleb Natapov
2011-10-03 15:00   ` Avi Kivity
2011-10-03 15:36     ` Gleb Natapov
2011-10-04  9:28       ` Avi Kivity
2011-10-04  9:56         ` Gleb Natapov
2011-10-04 11:10           ` Avi Kivity
2011-10-04 11:17             ` Gleb Natapov
2011-10-04 11:24               ` Avi Kivity
2011-10-04 12:12                 ` Gleb Natapov
2011-10-03 13:49 ` [PATCH 9/9] KVM, VMX: Check for automatic switch msr table overflow Gleb Natapov
