public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/6] KVM: x86/pmu: Clean up emulated PMC event handling
@ 2023-11-03 23:05 Sean Christopherson
  2023-11-03 23:05 ` [PATCH v2 1/6] KVM: x86/pmu: Move PMU reset logic to common x86 code Sean Christopherson
                   ` (6 more replies)
  0 siblings, 7 replies; 9+ messages in thread
From: Sean Christopherson @ 2023-11-03 23:05 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Dapeng Mi, Mingwei Zhang, Roman Kagan,
	Jim Mattson, Like Xu

The ultimate goal of this series is to track emulated counter events using
a dedicated variable instead of trying to track the previous counter value.
Tracking the previous counter value is flawed as it takes a snapshot at
every emulated event, but only checks for overflow prior to VM-Enter, i.e.
KVM could miss an overflow if KVM ever supports emulating event types that
can occur multiple times in a single VM-Exit.

And as Mingwei root caused, emulating overflow while the perf event is
running can result in duplicate overflow events, e.g. if the perf event
overflows between taking and checking the snapshot.  This bug is largely
masked now that KVM correctly sets LVT_MASK when delivering PMIs, but it's
still a bug, e.g. could cause problems if there are other side effects.

Patches 1-5 are (some loosely, some tightly) related fixes and cleanups to
simplify the emulated counter approach implementation.  The fixes are
tagged for stable as usersepace could cause some weirdness around perf
events, but I doubt any real world VMM is actually affected.

Dapeng, I intentionally omitted your Reviewed-by from the last patch as
the change from v1 isn't trivial.

v2:
 - Collect reviews. [Dapeng]
 - Emulate overflow *after* pausing perf event. [Mingwei]

v1: https://lore.kernel.org/all/20231023234000.2499267-1-seanjc@google.com

Sean Christopherson (6):
  KVM: x86/pmu: Move PMU reset logic to common x86 code
  KVM: x86/pmu: Reset the PMU, i.e. stop counters, before refreshing
  KVM: x86/pmu: Stop calling kvm_pmu_reset() at RESET (it's redundant)
  KVM: x86/pmu: Remove manual clearing of fields in kvm_pmu_init()
  KVM: x86/pmu: Update sample period in pmc_write_counter()
  KVM: x86/pmu: Track emulated counter events instead of previous
    counter

 arch/x86/include/asm/kvm-x86-pmu-ops.h |   2 +-
 arch/x86/include/asm/kvm_host.h        |  17 ++-
 arch/x86/kvm/pmu.c                     | 140 +++++++++++++++++++++----
 arch/x86/kvm/pmu.h                     |  47 +--------
 arch/x86/kvm/svm/pmu.c                 |  17 ---
 arch/x86/kvm/vmx/pmu_intel.c           |  22 ----
 arch/x86/kvm/x86.c                     |   1 -
 7 files changed, 137 insertions(+), 109 deletions(-)


base-commit: 45b890f7689eb0aba454fc5831d2d79763781677
-- 
2.42.0.869.gea05f2083d-goog


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v2 1/6] KVM: x86/pmu: Move PMU reset logic to common x86 code
  2023-11-03 23:05 [PATCH v2 0/6] KVM: x86/pmu: Clean up emulated PMC event handling Sean Christopherson
@ 2023-11-03 23:05 ` Sean Christopherson
  2023-11-03 23:05 ` [PATCH v2 2/6] KVM: x86/pmu: Reset the PMU, i.e. stop counters, before refreshing Sean Christopherson
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Sean Christopherson @ 2023-11-03 23:05 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Dapeng Mi, Mingwei Zhang, Roman Kagan,
	Jim Mattson, Like Xu

Move the common (or at least "ignored") aspects of resetting the vPMU to
common x86 code, along with the stop/release helpers that are no used only
by the common pmu.c.

There is no need to manually handle fixed counters as all_valid_pmc_idx
tracks both fixed and general purpose counters, and resetting the vPMU is
far from a hot path, i.e. the extra bit of overhead to the PMC from the
index is a non-issue.

Zero fixed_ctr_ctrl in common code even though it's Intel specific.
Ensuring it's zero doesn't harm AMD/SVM in any way, and stopping the fixed
counters via all_valid_pmc_idx, but not clearing the associated control
bits, would be odd/confusing.

Make the .reset() hook optional as SVM no longer needs vendor specific
handling.

Cc: stable@vger.kernel.org
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm-x86-pmu-ops.h |  2 +-
 arch/x86/kvm/pmu.c                     | 40 +++++++++++++++++++++++++-
 arch/x86/kvm/pmu.h                     | 18 ------------
 arch/x86/kvm/svm/pmu.c                 | 16 -----------
 arch/x86/kvm/vmx/pmu_intel.c           | 20 -------------
 5 files changed, 40 insertions(+), 56 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-pmu-ops.h b/arch/x86/include/asm/kvm-x86-pmu-ops.h
index 6c98f4bb4228..058bc636356a 100644
--- a/arch/x86/include/asm/kvm-x86-pmu-ops.h
+++ b/arch/x86/include/asm/kvm-x86-pmu-ops.h
@@ -22,7 +22,7 @@ KVM_X86_PMU_OP(get_msr)
 KVM_X86_PMU_OP(set_msr)
 KVM_X86_PMU_OP(refresh)
 KVM_X86_PMU_OP(init)
-KVM_X86_PMU_OP(reset)
+KVM_X86_PMU_OP_OPTIONAL(reset)
 KVM_X86_PMU_OP_OPTIONAL(deliver_pmi)
 KVM_X86_PMU_OP_OPTIONAL(cleanup)
 
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 9ae07db6f0f6..027e9c3c2b93 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -250,6 +250,24 @@ static bool pmc_resume_counter(struct kvm_pmc *pmc)
 	return true;
 }
 
+static void pmc_release_perf_event(struct kvm_pmc *pmc)
+{
+	if (pmc->perf_event) {
+		perf_event_release_kernel(pmc->perf_event);
+		pmc->perf_event = NULL;
+		pmc->current_config = 0;
+		pmc_to_pmu(pmc)->event_count--;
+	}
+}
+
+static void pmc_stop_counter(struct kvm_pmc *pmc)
+{
+	if (pmc->perf_event) {
+		pmc->counter = pmc_read_counter(pmc);
+		pmc_release_perf_event(pmc);
+	}
+}
+
 static int filter_cmp(const void *pa, const void *pb, u64 mask)
 {
 	u64 a = *(u64 *)pa & mask;
@@ -654,7 +672,27 @@ void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
 
 void kvm_pmu_reset(struct kvm_vcpu *vcpu)
 {
-	static_call(kvm_x86_pmu_reset)(vcpu);
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+	struct kvm_pmc *pmc;
+	int i;
+
+	bitmap_zero(pmu->reprogram_pmi, X86_PMC_IDX_MAX);
+
+	for_each_set_bit(i, pmu->all_valid_pmc_idx, X86_PMC_IDX_MAX) {
+		pmc = static_call(kvm_x86_pmu_pmc_idx_to_pmc)(pmu, i);
+		if (!pmc)
+			continue;
+
+		pmc_stop_counter(pmc);
+		pmc->counter = 0;
+
+		if (pmc_is_gp(pmc))
+			pmc->eventsel = 0;
+	}
+
+	pmu->fixed_ctr_ctrl = pmu->global_ctrl = pmu->global_status = 0;
+
+	static_call_cond(kvm_x86_pmu_reset)(vcpu);
 }
 
 void kvm_pmu_init(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 1d64113de488..a46aa9b25150 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -80,24 +80,6 @@ static inline void pmc_write_counter(struct kvm_pmc *pmc, u64 val)
 	pmc->counter &= pmc_bitmask(pmc);
 }
 
-static inline void pmc_release_perf_event(struct kvm_pmc *pmc)
-{
-	if (pmc->perf_event) {
-		perf_event_release_kernel(pmc->perf_event);
-		pmc->perf_event = NULL;
-		pmc->current_config = 0;
-		pmc_to_pmu(pmc)->event_count--;
-	}
-}
-
-static inline void pmc_stop_counter(struct kvm_pmc *pmc)
-{
-	if (pmc->perf_event) {
-		pmc->counter = pmc_read_counter(pmc);
-		pmc_release_perf_event(pmc);
-	}
-}
-
 static inline bool pmc_is_gp(struct kvm_pmc *pmc)
 {
 	return pmc->type == KVM_PMC_GP;
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index 373ff6a6687b..3fd47de14b38 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -233,21 +233,6 @@ static void amd_pmu_init(struct kvm_vcpu *vcpu)
 	}
 }
 
-static void amd_pmu_reset(struct kvm_vcpu *vcpu)
-{
-	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
-	int i;
-
-	for (i = 0; i < KVM_AMD_PMC_MAX_GENERIC; i++) {
-		struct kvm_pmc *pmc = &pmu->gp_counters[i];
-
-		pmc_stop_counter(pmc);
-		pmc->counter = pmc->prev_counter = pmc->eventsel = 0;
-	}
-
-	pmu->global_ctrl = pmu->global_status = 0;
-}
-
 struct kvm_pmu_ops amd_pmu_ops __initdata = {
 	.hw_event_available = amd_hw_event_available,
 	.pmc_idx_to_pmc = amd_pmc_idx_to_pmc,
@@ -259,7 +244,6 @@ struct kvm_pmu_ops amd_pmu_ops __initdata = {
 	.set_msr = amd_pmu_set_msr,
 	.refresh = amd_pmu_refresh,
 	.init = amd_pmu_init,
-	.reset = amd_pmu_reset,
 	.EVENTSEL_EVENT = AMD64_EVENTSEL_EVENT,
 	.MAX_NR_GP_COUNTERS = KVM_AMD_PMC_MAX_GENERIC,
 	.MIN_NR_GP_COUNTERS = AMD64_NUM_COUNTERS,
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 820d3e1f6b4f..90c1f7f07e53 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -632,26 +632,6 @@ static void intel_pmu_init(struct kvm_vcpu *vcpu)
 
 static void intel_pmu_reset(struct kvm_vcpu *vcpu)
 {
-	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
-	struct kvm_pmc *pmc = NULL;
-	int i;
-
-	for (i = 0; i < KVM_INTEL_PMC_MAX_GENERIC; i++) {
-		pmc = &pmu->gp_counters[i];
-
-		pmc_stop_counter(pmc);
-		pmc->counter = pmc->prev_counter = pmc->eventsel = 0;
-	}
-
-	for (i = 0; i < KVM_PMC_MAX_FIXED; i++) {
-		pmc = &pmu->fixed_counters[i];
-
-		pmc_stop_counter(pmc);
-		pmc->counter = pmc->prev_counter = 0;
-	}
-
-	pmu->fixed_ctr_ctrl = pmu->global_ctrl = pmu->global_status = 0;
-
 	intel_pmu_release_guest_lbr_event(vcpu);
 }
 
-- 
2.42.0.869.gea05f2083d-goog


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 2/6] KVM: x86/pmu: Reset the PMU, i.e. stop counters, before refreshing
  2023-11-03 23:05 [PATCH v2 0/6] KVM: x86/pmu: Clean up emulated PMC event handling Sean Christopherson
  2023-11-03 23:05 ` [PATCH v2 1/6] KVM: x86/pmu: Move PMU reset logic to common x86 code Sean Christopherson
@ 2023-11-03 23:05 ` Sean Christopherson
  2023-11-03 23:05 ` [PATCH v2 3/6] KVM: x86/pmu: Stop calling kvm_pmu_reset() at RESET (it's redundant) Sean Christopherson
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Sean Christopherson @ 2023-11-03 23:05 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Dapeng Mi, Mingwei Zhang, Roman Kagan,
	Jim Mattson, Like Xu

Stop all counters and release all perf events before refreshing the vPMU,
i.e. before reconfiguring the vPMU to respond to changes in the vCPU
model.

Clear need_cleanup in kvm_pmu_reset() as well so that KVM doesn't
prematurely stop counters, e.g. if KVM enters the guest and enables
counters before the vCPU is scheduled out.

Cc: stable@vger.kernel.org
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/pmu.c | 35 ++++++++++++++++++++++-------------
 1 file changed, 22 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 027e9c3c2b93..dc8e8e907cfb 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -657,25 +657,14 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	return 0;
 }
 
-/* refresh PMU settings. This function generally is called when underlying
- * settings are changed (such as changes of PMU CPUID by guest VMs), which
- * should rarely happen.
- */
-void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
-{
-	if (KVM_BUG_ON(kvm_vcpu_has_run(vcpu), vcpu->kvm))
-		return;
-
-	bitmap_zero(vcpu_to_pmu(vcpu)->all_valid_pmc_idx, X86_PMC_IDX_MAX);
-	static_call(kvm_x86_pmu_refresh)(vcpu);
-}
-
 void kvm_pmu_reset(struct kvm_vcpu *vcpu)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
 	struct kvm_pmc *pmc;
 	int i;
 
+	pmu->need_cleanup = false;
+
 	bitmap_zero(pmu->reprogram_pmi, X86_PMC_IDX_MAX);
 
 	for_each_set_bit(i, pmu->all_valid_pmc_idx, X86_PMC_IDX_MAX) {
@@ -695,6 +684,26 @@ void kvm_pmu_reset(struct kvm_vcpu *vcpu)
 	static_call_cond(kvm_x86_pmu_reset)(vcpu);
 }
 
+
+/*
+ * Refresh the PMU configuration for the vCPU, e.g. if userspace changes CPUID
+ * and/or PERF_CAPABILITIES.
+ */
+void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
+{
+	if (KVM_BUG_ON(kvm_vcpu_has_run(vcpu), vcpu->kvm))
+		return;
+
+	/*
+	 * Stop/release all existing counters/events before realizing the new
+	 * vPMU model.
+	 */
+	kvm_pmu_reset(vcpu);
+
+	bitmap_zero(vcpu_to_pmu(vcpu)->all_valid_pmc_idx, X86_PMC_IDX_MAX);
+	static_call(kvm_x86_pmu_refresh)(vcpu);
+}
+
 void kvm_pmu_init(struct kvm_vcpu *vcpu)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
-- 
2.42.0.869.gea05f2083d-goog


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 3/6] KVM: x86/pmu: Stop calling kvm_pmu_reset() at RESET (it's redundant)
  2023-11-03 23:05 [PATCH v2 0/6] KVM: x86/pmu: Clean up emulated PMC event handling Sean Christopherson
  2023-11-03 23:05 ` [PATCH v2 1/6] KVM: x86/pmu: Move PMU reset logic to common x86 code Sean Christopherson
  2023-11-03 23:05 ` [PATCH v2 2/6] KVM: x86/pmu: Reset the PMU, i.e. stop counters, before refreshing Sean Christopherson
@ 2023-11-03 23:05 ` Sean Christopherson
  2023-11-03 23:05 ` [PATCH v2 4/6] KVM: x86/pmu: Remove manual clearing of fields in kvm_pmu_init() Sean Christopherson
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Sean Christopherson @ 2023-11-03 23:05 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Dapeng Mi, Mingwei Zhang, Roman Kagan,
	Jim Mattson, Like Xu

Drop kvm_vcpu_reset()'s call to kvm_pmu_reset(), the call is performed
only for RESET, which is really just the same thing as vCPU creation,
and kvm_arch_vcpu_create() *just* called kvm_pmu_init(), i.e. there can't
possibly be any work to do.

Unlike Intel, AMD's amd_pmu_refresh() does fill all_valid_pmc_idx even if
guest CPUID is empty, but everything that is at all dynamic is guaranteed
to be '0'/NULL, e.g. it should be impossible for KVM to have already
created a perf event.

Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/pmu.c | 2 +-
 arch/x86/kvm/pmu.h | 1 -
 arch/x86/kvm/x86.c | 1 -
 3 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index dc8e8e907cfb..458e836c6efe 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -657,7 +657,7 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	return 0;
 }
 
-void kvm_pmu_reset(struct kvm_vcpu *vcpu)
+static void kvm_pmu_reset(struct kvm_vcpu *vcpu)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
 	struct kvm_pmc *pmc;
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index a46aa9b25150..db9a12c0a2ef 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -243,7 +243,6 @@ bool kvm_pmu_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr);
 int kvm_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info);
 int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info);
 void kvm_pmu_refresh(struct kvm_vcpu *vcpu);
-void kvm_pmu_reset(struct kvm_vcpu *vcpu);
 void kvm_pmu_init(struct kvm_vcpu *vcpu);
 void kvm_pmu_cleanup(struct kvm_vcpu *vcpu);
 void kvm_pmu_destroy(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2c924075f6f1..efbf52a9dc83 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12207,7 +12207,6 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	}
 
 	if (!init_event) {
-		kvm_pmu_reset(vcpu);
 		vcpu->arch.smbase = 0x30000;
 
 		vcpu->arch.msr_misc_features_enables = 0;
-- 
2.42.0.869.gea05f2083d-goog


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 4/6] KVM: x86/pmu: Remove manual clearing of fields in kvm_pmu_init()
  2023-11-03 23:05 [PATCH v2 0/6] KVM: x86/pmu: Clean up emulated PMC event handling Sean Christopherson
                   ` (2 preceding siblings ...)
  2023-11-03 23:05 ` [PATCH v2 3/6] KVM: x86/pmu: Stop calling kvm_pmu_reset() at RESET (it's redundant) Sean Christopherson
@ 2023-11-03 23:05 ` Sean Christopherson
  2023-11-03 23:05 ` [PATCH v2 5/6] KVM: x86/pmu: Update sample period in pmc_write_counter() Sean Christopherson
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Sean Christopherson @ 2023-11-03 23:05 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Dapeng Mi, Mingwei Zhang, Roman Kagan,
	Jim Mattson, Like Xu

Remove code that unnecessarily clears event_count and need_cleanup in
kvm_pmu_init(), the entire kvm_pmu is zeroed just a few lines earlier.
Vendor code doesn't set event_count or need_cleanup during .init(), and
if either VMX or SVM did set those fields it would be a flagrant bug.

Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/pmu.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 458e836c6efe..c06090196b00 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -710,8 +710,6 @@ void kvm_pmu_init(struct kvm_vcpu *vcpu)
 
 	memset(pmu, 0, sizeof(*pmu));
 	static_call(kvm_x86_pmu_init)(vcpu);
-	pmu->event_count = 0;
-	pmu->need_cleanup = false;
 	kvm_pmu_refresh(vcpu);
 }
 
-- 
2.42.0.869.gea05f2083d-goog


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 5/6] KVM: x86/pmu: Update sample period in pmc_write_counter()
  2023-11-03 23:05 [PATCH v2 0/6] KVM: x86/pmu: Clean up emulated PMC event handling Sean Christopherson
                   ` (3 preceding siblings ...)
  2023-11-03 23:05 ` [PATCH v2 4/6] KVM: x86/pmu: Remove manual clearing of fields in kvm_pmu_init() Sean Christopherson
@ 2023-11-03 23:05 ` Sean Christopherson
  2023-11-03 23:05 ` [PATCH v2 6/6] KVM: x86/pmu: Track emulated counter events instead of previous counter Sean Christopherson
  2023-12-01  1:52 ` [PATCH v2 0/6] KVM: x86/pmu: Clean up emulated PMC event handling Sean Christopherson
  6 siblings, 0 replies; 9+ messages in thread
From: Sean Christopherson @ 2023-11-03 23:05 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Dapeng Mi, Mingwei Zhang, Roman Kagan,
	Jim Mattson, Like Xu

Update a PMC's sample period in pmc_write_counter() to deduplicate code
across all callers of pmc_write_counter().  Opportunistically move
pmc_write_counter() into pmc.c now that it's doing more work.  WRMSR isn't
such a hot path that an extra CALL+RET pair will be problematic, and the
order of function definitions needs to be changed anyways, i.e. now is a
convenient time to eat the churn.

Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/pmu.c           | 27 +++++++++++++++++++++++++++
 arch/x86/kvm/pmu.h           | 25 +------------------------
 arch/x86/kvm/svm/pmu.c       |  1 -
 arch/x86/kvm/vmx/pmu_intel.c |  2 --
 4 files changed, 28 insertions(+), 27 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index c06090196b00..3725d001239d 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -161,6 +161,15 @@ static u64 pmc_get_pebs_precise_level(struct kvm_pmc *pmc)
 	return 1;
 }
 
+static u64 get_sample_period(struct kvm_pmc *pmc, u64 counter_value)
+{
+	u64 sample_period = (-counter_value) & pmc_bitmask(pmc);
+
+	if (!sample_period)
+		sample_period = pmc_bitmask(pmc) + 1;
+	return sample_period;
+}
+
 static int pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type, u64 config,
 				 bool exclude_user, bool exclude_kernel,
 				 bool intr)
@@ -268,6 +277,24 @@ static void pmc_stop_counter(struct kvm_pmc *pmc)
 	}
 }
 
+static void pmc_update_sample_period(struct kvm_pmc *pmc)
+{
+	if (!pmc->perf_event || pmc->is_paused ||
+	    !is_sampling_event(pmc->perf_event))
+		return;
+
+	perf_event_period(pmc->perf_event,
+			  get_sample_period(pmc, pmc->counter));
+}
+
+void pmc_write_counter(struct kvm_pmc *pmc, u64 val)
+{
+	pmc->counter += val - pmc_read_counter(pmc);
+	pmc->counter &= pmc_bitmask(pmc);
+	pmc_update_sample_period(pmc);
+}
+EXPORT_SYMBOL_GPL(pmc_write_counter);
+
 static int filter_cmp(const void *pa, const void *pb, u64 mask)
 {
 	u64 a = *(u64 *)pa & mask;
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index db9a12c0a2ef..cae85e550f60 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -74,11 +74,7 @@ static inline u64 pmc_read_counter(struct kvm_pmc *pmc)
 	return counter & pmc_bitmask(pmc);
 }
 
-static inline void pmc_write_counter(struct kvm_pmc *pmc, u64 val)
-{
-	pmc->counter += val - pmc_read_counter(pmc);
-	pmc->counter &= pmc_bitmask(pmc);
-}
+void pmc_write_counter(struct kvm_pmc *pmc, u64 val);
 
 static inline bool pmc_is_gp(struct kvm_pmc *pmc)
 {
@@ -128,25 +124,6 @@ static inline struct kvm_pmc *get_fixed_pmc(struct kvm_pmu *pmu, u32 msr)
 	return NULL;
 }
 
-static inline u64 get_sample_period(struct kvm_pmc *pmc, u64 counter_value)
-{
-	u64 sample_period = (-counter_value) & pmc_bitmask(pmc);
-
-	if (!sample_period)
-		sample_period = pmc_bitmask(pmc) + 1;
-	return sample_period;
-}
-
-static inline void pmc_update_sample_period(struct kvm_pmc *pmc)
-{
-	if (!pmc->perf_event || pmc->is_paused ||
-	    !is_sampling_event(pmc->perf_event))
-		return;
-
-	perf_event_period(pmc->perf_event,
-			  get_sample_period(pmc, pmc->counter));
-}
-
 static inline bool pmc_speculative_in_use(struct kvm_pmc *pmc)
 {
 	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index 3fd47de14b38..b6a7ad4d6914 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -161,7 +161,6 @@ static int amd_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	pmc = get_gp_pmc_amd(pmu, msr, PMU_TYPE_COUNTER);
 	if (pmc) {
 		pmc_write_counter(pmc, data);
-		pmc_update_sample_period(pmc);
 		return 0;
 	}
 	/* MSR_EVNTSELn */
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 90c1f7f07e53..a6216c874729 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -437,11 +437,9 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			    !(msr & MSR_PMC_FULL_WIDTH_BIT))
 				data = (s64)(s32)data;
 			pmc_write_counter(pmc, data);
-			pmc_update_sample_period(pmc);
 			break;
 		} else if ((pmc = get_fixed_pmc(pmu, msr))) {
 			pmc_write_counter(pmc, data);
-			pmc_update_sample_period(pmc);
 			break;
 		} else if ((pmc = get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0))) {
 			reserved_bits = pmu->reserved_bits;
-- 
2.42.0.869.gea05f2083d-goog


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 6/6] KVM: x86/pmu: Track emulated counter events instead of previous counter
  2023-11-03 23:05 [PATCH v2 0/6] KVM: x86/pmu: Clean up emulated PMC event handling Sean Christopherson
                   ` (4 preceding siblings ...)
  2023-11-03 23:05 ` [PATCH v2 5/6] KVM: x86/pmu: Update sample period in pmc_write_counter() Sean Christopherson
@ 2023-11-03 23:05 ` Sean Christopherson
  2023-11-03 23:21   ` Mingwei Zhang
  2023-12-01  1:52 ` [PATCH v2 0/6] KVM: x86/pmu: Clean up emulated PMC event handling Sean Christopherson
  6 siblings, 1 reply; 9+ messages in thread
From: Sean Christopherson @ 2023-11-03 23:05 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Dapeng Mi, Mingwei Zhang, Roman Kagan,
	Jim Mattson, Like Xu

Explicitly track emulated counter events instead of using the common
counter value that's shared with the hardware counter owned by perf.
Bumping the common counter requires snapshotting the pre-increment value
in order to detect overflow from emulation, and the snapshot approach is
inherently flawed.

Snapshotting the previous counter at every increment assumes that there is
at most one emulated counter event per emulated instruction (or rather,
between checks for KVM_REQ_PMU).  That's mostly holds true today because
KVM only emulates (branch) instructions retired, but the approach will
fall apart if KVM ever supports event types that don't have a 1:1
relationship with instructions.

And KVM already has a relevant bug, as handle_invalid_guest_state()
emulates multiple instructions without checking KVM_REQ_PMU, i.e. could
miss an overflow event due to clobbering pmc->prev_counter.  Not checking
KVM_REQ_PMU is problematic in both cases, but at least with the emulated
counter approach, the resulting behavior is delayed overflow detection,
as opposed to completely lost detection.

Tracking the emulated count fixes another bug where the snapshot approach
can signal spurious overflow due to incorporating both the emulated count
and perf's count in the check, i.e. if overflow is detected by perf, then
KVM's emulation will also incorrectly signal overflow.  Add a comment in
the related code to call out the need to process emulated events *after*
pausing the perf event (big kudos to Mingwei for figuring out that
particular wrinkle).

Cc: Mingwei Zhang <mizhang@google.com>
Cc: Roman Kagan <rkagan@amazon.de>
Cc: Jim Mattson <jmattson@google.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Like Xu <like.xu.linux@gmail.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm_host.h | 17 +++++++++++-
 arch/x86/kvm/pmu.c              | 48 ++++++++++++++++++++++++---------
 arch/x86/kvm/pmu.h              |  3 ++-
 3 files changed, 53 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d7036982332e..d8bc9ba88cfc 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -500,8 +500,23 @@ struct kvm_pmc {
 	u8 idx;
 	bool is_paused;
 	bool intr;
+	/*
+	 * Base value of the PMC counter, relative to the *consumed* count in
+	 * the associated perf_event.  This value includes counter updates from
+	 * the perf_event and emulated_count since the last time the counter
+	 * was reprogrammed, but it is *not* the current value as seen by the
+	 * guest or userspace.
+	 *
+	 * The count is relative to the associated perf_event so that KVM
+	 * doesn't need to reprogram the perf_event every time the guest writes
+	 * to the counter.
+	 */
 	u64 counter;
-	u64 prev_counter;
+	/*
+	 * PMC events triggered by KVM emulation that haven't been fully
+	 * processed, i.e. haven't undergone overflow detection.
+	 */
+	u64 emulated_counter;
 	u64 eventsel;
 	struct perf_event *perf_event;
 	struct kvm_vcpu *vcpu;
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 3725d001239d..87cc6c8809ad 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -127,9 +127,9 @@ static void kvm_perf_overflow(struct perf_event *perf_event,
 	struct kvm_pmc *pmc = perf_event->overflow_handler_context;
 
 	/*
-	 * Ignore overflow events for counters that are scheduled to be
-	 * reprogrammed, e.g. if a PMI for the previous event races with KVM's
-	 * handling of a related guest WRMSR.
+	 * Ignore asynchronous overflow events for counters that are scheduled
+	 * to be reprogrammed, e.g. if a PMI for the previous event races with
+	 * KVM's handling of a related guest WRMSR.
 	 */
 	if (test_and_set_bit(pmc->idx, pmc_to_pmu(pmc)->reprogram_pmi))
 		return;
@@ -224,17 +224,30 @@ static int pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type, u64 config,
 	return 0;
 }
 
-static void pmc_pause_counter(struct kvm_pmc *pmc)
+static bool pmc_pause_counter(struct kvm_pmc *pmc)
 {
 	u64 counter = pmc->counter;
-
-	if (!pmc->perf_event || pmc->is_paused)
-		return;
+	u64 prev_counter;
 
 	/* update counter, reset event value to avoid redundant accumulation */
-	counter += perf_event_pause(pmc->perf_event, true);
+	if (pmc->perf_event && !pmc->is_paused)
+		counter += perf_event_pause(pmc->perf_event, true);
+
+	/*
+	 * Snapshot the previous counter *after* accumulating state from perf.
+	 * If overflow already happened, hardware (via perf) is responsible for
+	 * generating a PMI.  KVM just needs to detect overflow on emulated
+	 * counter events that haven't yet been processed.
+	 */
+	prev_counter = counter & pmc_bitmask(pmc);
+
+	counter += pmc->emulated_counter;
 	pmc->counter = counter & pmc_bitmask(pmc);
+
+	pmc->emulated_counter = 0;
 	pmc->is_paused = true;
+
+	return pmc->counter < prev_counter;
 }
 
 static bool pmc_resume_counter(struct kvm_pmc *pmc)
@@ -289,6 +302,15 @@ static void pmc_update_sample_period(struct kvm_pmc *pmc)
 
 void pmc_write_counter(struct kvm_pmc *pmc, u64 val)
 {
+	/*
+	 * Drop any unconsumed accumulated counts, the WRMSR is a write, not a
+	 * read-modify-write.  Adjust the counter value so that its value is
+	 * relative to the current count, as reading the current count from
+	 * perf is faster than pausing and repgrogramming the event in order to
+	 * reset it to '0'.  Note, this very sneakily offsets the accumulated
+	 * emulated count too, by using pmc_read_counter()!
+	 */
+	pmc->emulated_counter = 0;
 	pmc->counter += val - pmc_read_counter(pmc);
 	pmc->counter &= pmc_bitmask(pmc);
 	pmc_update_sample_period(pmc);
@@ -428,14 +450,15 @@ static void reprogram_counter(struct kvm_pmc *pmc)
 	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
 	u64 eventsel = pmc->eventsel;
 	u64 new_config = eventsel;
+	bool emulate_overflow;
 	u8 fixed_ctr_ctrl;
 
-	pmc_pause_counter(pmc);
+	emulate_overflow = pmc_pause_counter(pmc);
 
 	if (!pmc_event_is_allowed(pmc))
 		goto reprogram_complete;
 
-	if (pmc->counter < pmc->prev_counter)
+	if (emulate_overflow)
 		__kvm_perf_overflow(pmc, false);
 
 	if (eventsel & ARCH_PERFMON_EVENTSEL_PIN_CONTROL)
@@ -475,7 +498,6 @@ static void reprogram_counter(struct kvm_pmc *pmc)
 
 reprogram_complete:
 	clear_bit(pmc->idx, (unsigned long *)&pmc_to_pmu(pmc)->reprogram_pmi);
-	pmc->prev_counter = 0;
 }
 
 void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
@@ -701,6 +723,7 @@ static void kvm_pmu_reset(struct kvm_vcpu *vcpu)
 
 		pmc_stop_counter(pmc);
 		pmc->counter = 0;
+		pmc->emulated_counter = 0;
 
 		if (pmc_is_gp(pmc))
 			pmc->eventsel = 0;
@@ -772,8 +795,7 @@ void kvm_pmu_destroy(struct kvm_vcpu *vcpu)
 
 static void kvm_pmu_incr_counter(struct kvm_pmc *pmc)
 {
-	pmc->prev_counter = pmc->counter;
-	pmc->counter = (pmc->counter + 1) & pmc_bitmask(pmc);
+	pmc->emulated_counter++;
 	kvm_pmu_request_counter_reprogram(pmc);
 }
 
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index cae85e550f60..7caeb3d8d4fd 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -66,7 +66,8 @@ static inline u64 pmc_read_counter(struct kvm_pmc *pmc)
 {
 	u64 counter, enabled, running;
 
-	counter = pmc->counter;
+	counter = pmc->counter + pmc->emulated_counter;
+
 	if (pmc->perf_event && !pmc->is_paused)
 		counter += perf_event_read_value(pmc->perf_event,
 						 &enabled, &running);
-- 
2.42.0.869.gea05f2083d-goog


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 6/6] KVM: x86/pmu: Track emulated counter events instead of previous counter
  2023-11-03 23:05 ` [PATCH v2 6/6] KVM: x86/pmu: Track emulated counter events instead of previous counter Sean Christopherson
@ 2023-11-03 23:21   ` Mingwei Zhang
  0 siblings, 0 replies; 9+ messages in thread
From: Mingwei Zhang @ 2023-11-03 23:21 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm, linux-kernel, Dapeng Mi, Roman Kagan,
	Jim Mattson, Like Xu

On Fri, Nov 03, 2023, Sean Christopherson wrote:
> Explicitly track emulated counter events instead of using the common
> counter value that's shared with the hardware counter owned by perf.
> Bumping the common counter requires snapshotting the pre-increment value
> in order to detect overflow from emulation, and the snapshot approach is
> inherently flawed.
> 
> Snapshotting the previous counter at every increment assumes that there is
> at most one emulated counter event per emulated instruction (or rather,
> between checks for KVM_REQ_PMU).  That's mostly holds true today because
> KVM only emulates (branch) instructions retired, but the approach will
> fall apart if KVM ever supports event types that don't have a 1:1
> relationship with instructions.
> 
> And KVM already has a relevant bug, as handle_invalid_guest_state()
> emulates multiple instructions without checking KVM_REQ_PMU, i.e. could
> miss an overflow event due to clobbering pmc->prev_counter.  Not checking
> KVM_REQ_PMU is problematic in both cases, but at least with the emulated
> counter approach, the resulting behavior is delayed overflow detection,
> as opposed to completely lost detection.
> 
> Tracking the emulated count fixes another bug where the snapshot approach
> can signal spurious overflow due to incorporating both the emulated count
> and perf's count in the check, i.e. if overflow is detected by perf, then
> KVM's emulation will also incorrectly signal overflow.  Add a comment in
> the related code to call out the need to process emulated events *after*
> pausing the perf event (big kudos to Mingwei for figuring out that
> particular wrinkle).
> 
> Cc: Mingwei Zhang <mizhang@google.com>
> Cc: Roman Kagan <rkagan@amazon.de>
> Cc: Jim Mattson <jmattson@google.com>
> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
> Cc: Like Xu <like.xu.linux@gmail.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
Reviewed-by: Mingwei Zhang <mizhang@google.com>

>  arch/x86/include/asm/kvm_host.h | 17 +++++++++++-
>  arch/x86/kvm/pmu.c              | 48 ++++++++++++++++++++++++---------
>  arch/x86/kvm/pmu.h              |  3 ++-
>  3 files changed, 53 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index d7036982332e..d8bc9ba88cfc 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -500,8 +500,23 @@ struct kvm_pmc {
>  	u8 idx;
>  	bool is_paused;
>  	bool intr;
> +	/*
> +	 * Base value of the PMC counter, relative to the *consumed* count in
> +	 * the associated perf_event.  This value includes counter updates from
> +	 * the perf_event and emulated_count since the last time the counter
> +	 * was reprogrammed, but it is *not* the current value as seen by the
> +	 * guest or userspace.
> +	 *
> +	 * The count is relative to the associated perf_event so that KVM
> +	 * doesn't need to reprogram the perf_event every time the guest writes
> +	 * to the counter.
> +	 */
>  	u64 counter;
> -	u64 prev_counter;
> +	/*
> +	 * PMC events triggered by KVM emulation that haven't been fully
> +	 * processed, i.e. haven't undergone overflow detection.
> +	 */
> +	u64 emulated_counter;
>  	u64 eventsel;
>  	struct perf_event *perf_event;
>  	struct kvm_vcpu *vcpu;
> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index 3725d001239d..87cc6c8809ad 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -127,9 +127,9 @@ static void kvm_perf_overflow(struct perf_event *perf_event,
>  	struct kvm_pmc *pmc = perf_event->overflow_handler_context;
>  
>  	/*
> -	 * Ignore overflow events for counters that are scheduled to be
> -	 * reprogrammed, e.g. if a PMI for the previous event races with KVM's
> -	 * handling of a related guest WRMSR.
> +	 * Ignore asynchronous overflow events for counters that are scheduled
> +	 * to be reprogrammed, e.g. if a PMI for the previous event races with
> +	 * KVM's handling of a related guest WRMSR.
>  	 */
>  	if (test_and_set_bit(pmc->idx, pmc_to_pmu(pmc)->reprogram_pmi))
>  		return;
> @@ -224,17 +224,30 @@ static int pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type, u64 config,
>  	return 0;
>  }
>  
> -static void pmc_pause_counter(struct kvm_pmc *pmc)
> +static bool pmc_pause_counter(struct kvm_pmc *pmc)
>  {
>  	u64 counter = pmc->counter;
> -
> -	if (!pmc->perf_event || pmc->is_paused)
> -		return;
> +	u64 prev_counter;
>  
>  	/* update counter, reset event value to avoid redundant accumulation */
> -	counter += perf_event_pause(pmc->perf_event, true);
> +	if (pmc->perf_event && !pmc->is_paused)
> +		counter += perf_event_pause(pmc->perf_event, true);
> +
> +	/*
> +	 * Snapshot the previous counter *after* accumulating state from perf.
> +	 * If overflow already happened, hardware (via perf) is responsible for
> +	 * generating a PMI.  KVM just needs to detect overflow on emulated
> +	 * counter events that haven't yet been processed.
> +	 */
> +	prev_counter = counter & pmc_bitmask(pmc);
> +
> +	counter += pmc->emulated_counter;
>  	pmc->counter = counter & pmc_bitmask(pmc);
> +
> +	pmc->emulated_counter = 0;
>  	pmc->is_paused = true;
> +
> +	return pmc->counter < prev_counter;
>  }
>  
>  static bool pmc_resume_counter(struct kvm_pmc *pmc)
> @@ -289,6 +302,15 @@ static void pmc_update_sample_period(struct kvm_pmc *pmc)
>  
>  void pmc_write_counter(struct kvm_pmc *pmc, u64 val)
>  {
> +	/*
> +	 * Drop any unconsumed accumulated counts, the WRMSR is a write, not a
> +	 * read-modify-write.  Adjust the counter value so that its value is
> +	 * relative to the current count, as reading the current count from
> +	 * perf is faster than pausing and repgrogramming the event in order to
> +	 * reset it to '0'.  Note, this very sneakily offsets the accumulated
> +	 * emulated count too, by using pmc_read_counter()!
> +	 */
> +	pmc->emulated_counter = 0;
>  	pmc->counter += val - pmc_read_counter(pmc);
>  	pmc->counter &= pmc_bitmask(pmc);
>  	pmc_update_sample_period(pmc);
> @@ -428,14 +450,15 @@ static void reprogram_counter(struct kvm_pmc *pmc)
>  	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
>  	u64 eventsel = pmc->eventsel;
>  	u64 new_config = eventsel;
> +	bool emulate_overflow;
>  	u8 fixed_ctr_ctrl;
>  
> -	pmc_pause_counter(pmc);
> +	emulate_overflow = pmc_pause_counter(pmc);
>  
>  	if (!pmc_event_is_allowed(pmc))
>  		goto reprogram_complete;
>  
> -	if (pmc->counter < pmc->prev_counter)
> +	if (emulate_overflow)
>  		__kvm_perf_overflow(pmc, false);
>  
>  	if (eventsel & ARCH_PERFMON_EVENTSEL_PIN_CONTROL)
> @@ -475,7 +498,6 @@ static void reprogram_counter(struct kvm_pmc *pmc)
>  
>  reprogram_complete:
>  	clear_bit(pmc->idx, (unsigned long *)&pmc_to_pmu(pmc)->reprogram_pmi);
> -	pmc->prev_counter = 0;
>  }
>  
>  void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
> @@ -701,6 +723,7 @@ static void kvm_pmu_reset(struct kvm_vcpu *vcpu)
>  
>  		pmc_stop_counter(pmc);
>  		pmc->counter = 0;
> +		pmc->emulated_counter = 0;
>  
>  		if (pmc_is_gp(pmc))
>  			pmc->eventsel = 0;
> @@ -772,8 +795,7 @@ void kvm_pmu_destroy(struct kvm_vcpu *vcpu)
>  
>  static void kvm_pmu_incr_counter(struct kvm_pmc *pmc)
>  {
> -	pmc->prev_counter = pmc->counter;
> -	pmc->counter = (pmc->counter + 1) & pmc_bitmask(pmc);
> +	pmc->emulated_counter++;
>  	kvm_pmu_request_counter_reprogram(pmc);
>  }
>  
> diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
> index cae85e550f60..7caeb3d8d4fd 100644
> --- a/arch/x86/kvm/pmu.h
> +++ b/arch/x86/kvm/pmu.h
> @@ -66,7 +66,8 @@ static inline u64 pmc_read_counter(struct kvm_pmc *pmc)
>  {
>  	u64 counter, enabled, running;
>  
> -	counter = pmc->counter;
> +	counter = pmc->counter + pmc->emulated_counter;
> +
>  	if (pmc->perf_event && !pmc->is_paused)
>  		counter += perf_event_read_value(pmc->perf_event,
>  						 &enabled, &running);
> -- 
> 2.42.0.869.gea05f2083d-goog
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 0/6] KVM: x86/pmu: Clean up emulated PMC event handling
  2023-11-03 23:05 [PATCH v2 0/6] KVM: x86/pmu: Clean up emulated PMC event handling Sean Christopherson
                   ` (5 preceding siblings ...)
  2023-11-03 23:05 ` [PATCH v2 6/6] KVM: x86/pmu: Track emulated counter events instead of previous counter Sean Christopherson
@ 2023-12-01  1:52 ` Sean Christopherson
  6 siblings, 0 replies; 9+ messages in thread
From: Sean Christopherson @ 2023-12-01  1:52 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Dapeng Mi, Mingwei Zhang, Roman Kagan,
	Jim Mattson, Like Xu

On Fri, 03 Nov 2023 16:05:35 -0700, Sean Christopherson wrote:
> The ultimate goal of this series is to track emulated counter events using
> a dedicated variable instead of trying to track the previous counter value.
> Tracking the previous counter value is flawed as it takes a snapshot at
> every emulated event, but only checks for overflow prior to VM-Enter, i.e.
> KVM could miss an overflow if KVM ever supports emulating event types that
> can occur multiple times in a single VM-Exit.
> 
> [...]

Applied to kvm-x86 pmu, thanks!

[1/6] KVM: x86/pmu: Move PMU reset logic to common x86 code
      https://github.com/kvm-x86/linux/commit/cbb359d81a26
[2/6] KVM: x86/pmu: Reset the PMU, i.e. stop counters, before refreshing
      https://github.com/kvm-x86/linux/commit/1647b52757d5
[3/6] KVM: x86/pmu: Stop calling kvm_pmu_reset() at RESET (it's redundant)
      https://github.com/kvm-x86/linux/commit/f2f63f7ec6fd
[4/6] KVM: x86/pmu: Remove manual clearing of fields in kvm_pmu_init()
      https://github.com/kvm-x86/linux/commit/ec61b2306dfd
[5/6] KVM: x86/pmu: Update sample period in pmc_write_counter()
      https://github.com/kvm-x86/linux/commit/89acf1237b81
[6/6] KVM: x86/pmu: Track emulated counter events instead of previous counter
      https://github.com/kvm-x86/linux/commit/fd89499a5151

--
https://github.com/kvm-x86/linux/tree/next

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-12-01  1:56 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-11-03 23:05 [PATCH v2 0/6] KVM: x86/pmu: Clean up emulated PMC event handling Sean Christopherson
2023-11-03 23:05 ` [PATCH v2 1/6] KVM: x86/pmu: Move PMU reset logic to common x86 code Sean Christopherson
2023-11-03 23:05 ` [PATCH v2 2/6] KVM: x86/pmu: Reset the PMU, i.e. stop counters, before refreshing Sean Christopherson
2023-11-03 23:05 ` [PATCH v2 3/6] KVM: x86/pmu: Stop calling kvm_pmu_reset() at RESET (it's redundant) Sean Christopherson
2023-11-03 23:05 ` [PATCH v2 4/6] KVM: x86/pmu: Remove manual clearing of fields in kvm_pmu_init() Sean Christopherson
2023-11-03 23:05 ` [PATCH v2 5/6] KVM: x86/pmu: Update sample period in pmc_write_counter() Sean Christopherson
2023-11-03 23:05 ` [PATCH v2 6/6] KVM: x86/pmu: Track emulated counter events instead of previous counter Sean Christopherson
2023-11-03 23:21   ` Mingwei Zhang
2023-12-01  1:52 ` [PATCH v2 0/6] KVM: x86/pmu: Clean up emulated PMC event handling Sean Christopherson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox