[PATCH v2 00/10] target/i386/kvm/pmu: PMU Enhancement, Bugfix and Cleanup

* [PATCH v2 00/10] target/i386/kvm/pmu: PMU Enhancement, Bugfix and Cleanup
@ 2025-03-02 22:00 Dongli Zhang
  2025-03-02 22:00 ` [PATCH v2 01/10] target/i386: disable PerfMonV2 when PERFCORE unavailable Dongli Zhang
                   ` (9 more replies)
  0 siblings, 10 replies; 63+ messages in thread
From: Dongli Zhang @ 2025-03-02 22:00 UTC (permalink / raw)
  To: qemu-devel, kvm
  Cc: pbonzini, zhao1.liu, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, xiaoyao.li, dapeng1.mi, joe.jin

Would you mind suggesting how we can move forward with this patchset as:

(1) One patch for kvm_arch_pre_create_vcpu() is picked from Xiaoyao's
patchset.
(2) Dapeng is working on mediated passthrough vPMU QEMU patches. This
patchset doesn't support mediated passthrough vPMU.

This patchset addresses four bugs related to AMD PMU virtualization.

1. PerfMonV2 is still available if PERFCORE is disabled via
"-cpu host,-perfctr-core".

2. The VM 'cpuid' command still returns PERFCORE although "-pmu" is
configured.

3. The third issue is that using "-cpu host,-pmu" does not disable AMD PMU
virtualization. When using "-cpu EPYC" or "-cpu host,-pmu", AMD PMU
virtualization remains enabled. On the VM's Linux side, you might still
see:

[    0.510611] Performance Events: Fam17h+ core perfctr, AMD PMU driver.

instead of:

[    0.596381] Performance Events: PMU not available due to virtualization, using software events only.
[    0.600972] NMI watchdog: Perf NMI watchdog permanently disabled

To address this, KVM_CAP_PMU_CAPABILITY is used to set KVM_PMU_CAP_DISABLE
when "-pmu" is configured.

4. The fourth issue is that unreclaimed performance events (after a QEMU
system_reset) in KVM may cause random, unwanted, or unknown NMIs to be
injected into the VM.

The AMD PMU registers are not reset during QEMU system_reset.

(1) If the VM is reset (e.g., via QEMU system_reset or VM kdump/kexec) while
running "perf top", the PMU registers are not disabled properly.

(2) Despite x86_cpu_reset() resetting many registers to zero, kvm_put_msrs()
does not handle AMD PMU registers, causing some PMU events to remain
enabled in KVM.

(3) The KVM kvm_pmc_speculative_in_use() function consistently returns true,
preventing the reclamation of these events. Consequently, the
kvm_pmc->perf_event remains active.

(4) After a reboot, the VM kernel may report the following error:

[    0.092011] Performance Events: Fam17h+ core perfctr, Broken BIOS detected, complain to your hardware vendor.
[    0.092023] [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR c0010200 is 530076)

(5) In the worst case, the active kvm_pmc->perf_event may inject unknown
NMIs randomly into the VM kernel:

[...] Uhhuh. NMI received for unknown reason 30 on CPU 0.

To resolve these issues, we propose resetting the AMD PMU registers during
the VM reset process.
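
For reference, the value in the "[Firmware Bug]" line above can be decoded
by hand. The sketch below is only an illustration: the demo_* names are
made up for this example, and only the low 32 bits of the event-select
layout are modelled. It shows that 0x530076 has both the enable bit and the
overflow-interrupt bit set, which is consistent with a counter that was
left running across the reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Control bits in the low 32 bits of an AMD core event-select MSR. */
#define DEMO_EVTSEL_USR (UINT64_C(1) << 16) /* count in user mode */
#define DEMO_EVTSEL_OS  (UINT64_C(1) << 17) /* count in kernel mode */
#define DEMO_EVTSEL_INT (UINT64_C(1) << 20) /* interrupt on overflow */
#define DEMO_EVTSEL_EN  (UINT64_C(1) << 22) /* counter enabled */

/* Hypothetical helper type for this sketch, not a QEMU or KVM structure. */
struct demo_evtsel {
    uint32_t event;   /* event select, bits 7:0 */
    uint32_t umask;   /* unit mask, bits 15:8 */
    bool usr;
    bool os;
    bool irq;
    bool enabled;
};

/* Split a raw event-select value into its fields. */
static void demo_decode_evtsel(uint64_t val, struct demo_evtsel *out)
{
    assert(out != NULL);

    out->event = (uint32_t)(val & 0xff);
    out->umask = (uint32_t)((val >> 8) & 0xff);
    out->usr = (val & DEMO_EVTSEL_USR) != 0;
    out->os = (val & DEMO_EVTSEL_OS) != 0;
    out->irq = (val & DEMO_EVTSEL_INT) != 0;
    out->enabled = (val & DEMO_EVTSEL_EN) != 0;
}
```

Decoding 0x530076 this way gives event 0x76 (CPU clocks not halted, which
is what the Linux AMD PMU driver uses for the "cycles" event) with unit
mask 0x00, counting in both user and kernel mode, with EN and INT set.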

Changed since v1:
  - Use feature_dependencies for CPUID_EXT3_PERFCORE and
    CPUID_8000_0022_EAX_PERFMON_V2.
  - Remove CPUID_EXT3_PERFCORE when !cpu->enable_pmu.
  - Pick kvm_arch_pre_create_vcpu() patch from Xiaoyao Li.
  - Use "-pmu" rather than a global "pmu-cap-disabled" for KVM_PMU_CAP_DISABLE.
  - Also use sysfs kvm.enable_pmu=N to determine if PMU is supported.
  - Some changes to PMU register limit calculation.

Xiaoyao Li (1):
  kvm: Introduce kvm_arch_pre_create_vcpu()

Dongli Zhang (9):
  target/i386: disable PerfMonV2 when PERFCORE unavailable
  target/i386: disable PERFCORE when "-pmu" is configured
  target/i386/kvm: set KVM_PMU_CAP_DISABLE if "-pmu" is configured
  target/i386/kvm: extract unrelated code out of kvm_x86_build_cpuid()
  target/i386/kvm: rename architectural PMU variables
  target/i386/kvm: query kvm.enable_pmu parameter
  target/i386/kvm: reset AMD PMU registers during VM reset
  target/i386/kvm: support perfmon-v2 for reset
  target/i386/kvm: don't stop Intel PMU counters

 accel/kvm/kvm-all.c        |   5 +
 include/system/kvm.h       |   1 +
 target/arm/kvm.c           |   5 +
 target/i386/cpu.c          |   8 +
 target/i386/cpu.h          |  12 ++
 target/i386/kvm/kvm.c      | 348 ++++++++++++++++++++++++++++++++++------
 target/loongarch/kvm/kvm.c |   5 +
 target/mips/kvm.c          |   5 +
 target/ppc/kvm.c           |   5 +
 target/riscv/kvm/kvm-cpu.c |   5 +
 target/s390x/kvm/kvm.c     |   5 +
 11 files changed, 357 insertions(+), 47 deletions(-)

base-commit: b69801dd6b1eb4d107f7c2f643adf0a4e3ec9124

Thank you very much!

Dongli Zhang


* [PATCH v2 01/10] target/i386: disable PerfMonV2 when PERFCORE unavailable
  2025-03-02 22:00 [PATCH v2 00/10] target/i386/kvm/pmu: PMU Enhancement, Bugfix and Cleanup Dongli Zhang
@ 2025-03-02 22:00 ` Dongli Zhang
  2025-03-04 14:40   ` Xiaoyao Li
                     ` (2 more replies)
  2025-03-02 22:00 ` [PATCH v2 02/10] target/i386: disable PERFCORE when "-pmu" is configured Dongli Zhang
                   ` (8 subsequent siblings)
  9 siblings, 3 replies; 63+ messages in thread
From: Dongli Zhang @ 2025-03-02 22:00 UTC (permalink / raw)
  To: qemu-devel, kvm
  Cc: pbonzini, zhao1.liu, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, xiaoyao.li, dapeng1.mi, joe.jin

When PERFCORE is disabled with "-cpu host,-perfctr-core", this is
reflected in the guest dmesg.

[    0.285136] Performance Events: AMD PMU driver.

However, the guest CPUID indicates the PerfMonV2 is still available.

CPU:
   Extended Performance Monitoring and Debugging (0x80000022):
      AMD performance monitoring V2         = true
      AMD LBR V2                            = false
      AMD LBR stack & PMC freezing          = false
      number of core perf ctrs              = 0x6 (6)
      number of LBR stack entries           = 0x0 (0)
      number of avail Northbridge perf ctrs = 0x0 (0)
      number of available UMC PMCs          = 0x0 (0)
      active UMCs bitmask                   = 0x0

Disable PerfMonV2 in CPUID when PERFCORE is disabled.
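
As an illustration of the dependency rule, here is a simplified,
self-contained model. It is not QEMU's feature_dependencies code: the
demo_* names and the two-word feature array are assumptions made for this
sketch. The idea it shows is that when the "from" bit is absent, the
dependent "to" bit is cleared as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for two feature words; not QEMU's FeatureWord enum. */
enum demo_word {
    DEMO_8000_0001_ECX,
    DEMO_8000_0022_EAX,
    DEMO_WORD_COUNT
};

#define DEMO_PERFCORE   (UINT64_C(1) << 23) /* CPUID 0x80000001 ECX bit 23 */
#define DEMO_PERFMON_V2 (UINT64_C(1) << 0)  /* CPUID 0x80000022 EAX bit 0 */

/* One "to depends on from" rule. */
struct demo_dep {
    enum demo_word from_word;
    uint64_t from_mask;
    enum demo_word to_word;
    uint64_t to_mask;
};

static const struct demo_dep demo_deps[] = {
    { DEMO_8000_0001_ECX, DEMO_PERFCORE, DEMO_8000_0022_EAX, DEMO_PERFMON_V2 },
};

/* Clear every dependent bit whose prerequisite bit is not set. */
static void demo_apply_deps(uint64_t features[DEMO_WORD_COUNT])
{
    assert(features != NULL);

    for (size_t i = 0; i < sizeof(demo_deps) / sizeof(demo_deps[0]); i++) {
        const struct demo_dep *d = &demo_deps[i];

        if (!(features[d->from_word] & d->from_mask)) {
            features[d->to_word] &= ~d->to_mask;
        }
    }
}
```

With PERFCORE cleared and PerfMonV2 set on input, demo_apply_deps() leaves
PerfMonV2 cleared. With PERFCORE set, PerfMonV2 is left untouched.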

Suggested-by: Zhao Liu <zhao1.liu@intel.com>
Fixes: 209b0ac12074 ("target/i386: Add PerfMonV2 feature bit")
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
Changed since v1:
  - Use feature_dependencies (suggested by Zhao Liu).

 target/i386/cpu.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 72ab147e85..b6d6167910 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1805,6 +1805,10 @@ static FeatureDep feature_dependencies[] = {
         .from = { FEAT_7_1_EDX,             CPUID_7_1_EDX_AVX10 },
         .to = { FEAT_24_0_EBX,              ~0ull },
     },
+    {
+        .from = { FEAT_8000_0001_ECX,       CPUID_EXT3_PERFCORE },
+        .to = { FEAT_8000_0022_EAX,         CPUID_8000_0022_EAX_PERFMON_V2 },
+    },
 };
 
 typedef struct X86RegisterInfo32 {
-- 
2.43.5



* [PATCH v2 02/10] target/i386: disable PERFCORE when "-pmu" is configured
  2025-03-02 22:00 [PATCH v2 00/10] target/i386/kvm/pmu: PMU Enhancement, Bugfix and Cleanup Dongli Zhang
  2025-03-02 22:00 ` [PATCH v2 01/10] target/i386: disable PerfMonV2 when PERFCORE unavailable Dongli Zhang
@ 2025-03-02 22:00 ` Dongli Zhang
  2025-03-03  1:59   ` Xiaoyao Li
  2025-03-06 16:50   ` Zhao Liu
  2025-03-02 22:00 ` [PATCH v2 03/10] [DO NOT MERGE] kvm: Introduce kvm_arch_pre_create_vcpu() Dongli Zhang
                   ` (7 subsequent siblings)
  9 siblings, 2 replies; 63+ messages in thread
From: Dongli Zhang @ 2025-03-02 22:00 UTC (permalink / raw)
  To: qemu-devel, kvm
  Cc: pbonzini, zhao1.liu, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, xiaoyao.li, dapeng1.mi, joe.jin

Currently, AMD PMU support isn't determined based on CPUID, that is, the
"-pmu" option does not fully disable KVM AMD PMU virtualization.

To minimize AMD PMU features, remove PERFCORE when "-pmu" is configured.

Completely disabling AMD PMU virtualization will be implemented via
KVM_CAP_PMU_CAPABILITY in upcoming patches.

As a reminder, neither CPUID_EXT3_PERFCORE nor
CPUID_8000_0022_EAX_PERFMON_V2 is removed from env->features[] when "-pmu"
is configured. Developers should query whether they are supported via
cpu_x86_cpuid() rather than relying on env->features[] in future patches.

Suggested-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
 target/i386/cpu.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index b6d6167910..61a671028a 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -7115,6 +7115,10 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             !(env->hflags & HF_LMA_MASK)) {
             *edx &= ~CPUID_EXT2_SYSCALL;
         }
+
+        if (kvm_enabled() && IS_AMD_CPU(env) && !cpu->enable_pmu) {
+            *ecx &= ~CPUID_EXT3_PERFCORE;
+        }
         break;
     case 0x80000002:
     case 0x80000003:
-- 
2.43.5



* [PATCH v2 03/10] [DO NOT MERGE] kvm: Introduce kvm_arch_pre_create_vcpu()
  2025-03-02 22:00 [PATCH v2 00/10] target/i386/kvm/pmu: PMU Enhancement, Bugfix and Cleanup Dongli Zhang
  2025-03-02 22:00 ` [PATCH v2 01/10] target/i386: disable PerfMonV2 when PERFCORE unavailable Dongli Zhang
  2025-03-02 22:00 ` [PATCH v2 02/10] target/i386: disable PERFCORE when "-pmu" is configured Dongli Zhang
@ 2025-03-02 22:00 ` Dongli Zhang
  2025-03-05 14:46   ` Zhao Liu
  2025-03-02 22:00 ` [PATCH v2 04/10] target/i386/kvm: set KVM_PMU_CAP_DISABLE if "-pmu" is configured Dongli Zhang
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 63+ messages in thread
From: Dongli Zhang @ 2025-03-02 22:00 UTC (permalink / raw)
  To: qemu-devel, kvm
  Cc: pbonzini, zhao1.liu, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, xiaoyao.li, dapeng1.mi, joe.jin

From: Xiaoyao Li <xiaoyao.li@intel.com>

Introduce kvm_arch_pre_create_vcpu() to perform arch-dependent
work prior to creating any vcpu. This is for i386 TDX because it needs
to call TDX_INIT_VM before creating any vcpu.

The specific implementation for i386 will be added in a future patch.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
I previously sent a version:
https://lore.kernel.org/all/20221119122901.2469-2-dongli.zhang@oracle.com/
This series picks the one from Xiaoyao's patchset instead, as Dapeng may
use this version as well.
https://lore.kernel.org/all/20250124132048.3229049-8-xiaoyao.li@intel.com/

 accel/kvm/kvm-all.c        | 5 +++++
 include/system/kvm.h       | 1 +
 target/arm/kvm.c           | 5 +++++
 target/i386/kvm/kvm.c      | 5 +++++
 target/loongarch/kvm/kvm.c | 5 +++++
 target/mips/kvm.c          | 5 +++++
 target/ppc/kvm.c           | 5 +++++
 target/riscv/kvm/kvm-cpu.c | 5 +++++
 target/s390x/kvm/kvm.c     | 5 +++++
 9 files changed, 41 insertions(+)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index f89568bfa3..df9840e53a 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -540,6 +540,11 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp)
 
     trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
 
+    ret = kvm_arch_pre_create_vcpu(cpu, errp);
+    if (ret < 0) {
+        goto err;
+    }
+
     ret = kvm_create_vcpu(cpu);
     if (ret < 0) {
         error_setg_errno(errp, -ret,
diff --git a/include/system/kvm.h b/include/system/kvm.h
index ab17c09a55..d7dfa25493 100644
--- a/include/system/kvm.h
+++ b/include/system/kvm.h
@@ -374,6 +374,7 @@ int kvm_arch_get_default_type(MachineState *ms);
 
 int kvm_arch_init(MachineState *ms, KVMState *s);
 
+int kvm_arch_pre_create_vcpu(CPUState *cpu, Error **errp);
 int kvm_arch_init_vcpu(CPUState *cpu);
 int kvm_arch_destroy_vcpu(CPUState *cpu);
 
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index da30bdbb23..93f1a7245b 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -1874,6 +1874,11 @@ static int kvm_arm_sve_set_vls(ARMCPU *cpu)
 
 #define ARM_CPU_ID_MPIDR       3, 0, 0, 0, 5
 
+int kvm_arch_pre_create_vcpu(CPUState *cpu, Error **errp)
+{
+    return 0;
+}
+
 int kvm_arch_init_vcpu(CPUState *cs)
 {
     int ret;
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 6c749d4ee8..f41e190fb8 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -2051,6 +2051,11 @@ full:
     abort();
 }
 
+int kvm_arch_pre_create_vcpu(CPUState *cpu, Error **errp)
+{
+    return 0;
+}
+
 int kvm_arch_init_vcpu(CPUState *cs)
 {
     struct {
diff --git a/target/loongarch/kvm/kvm.c b/target/loongarch/kvm/kvm.c
index a3f55155b0..91c3c67cdb 100644
--- a/target/loongarch/kvm/kvm.c
+++ b/target/loongarch/kvm/kvm.c
@@ -973,6 +973,11 @@ static int kvm_cpu_check_pmu(CPUState *cs, Error **errp)
     return 0;
 }
 
+int kvm_arch_pre_create_vcpu(CPUState *cpu, Error **errp)
+{
+    return 0;
+}
+
 int kvm_arch_init_vcpu(CPUState *cs)
 {
     uint64_t val;
diff --git a/target/mips/kvm.c b/target/mips/kvm.c
index d67b7c1a8e..ec53acb51a 100644
--- a/target/mips/kvm.c
+++ b/target/mips/kvm.c
@@ -61,6 +61,11 @@ int kvm_arch_irqchip_create(KVMState *s)
     return 0;
 }
 
+int kvm_arch_pre_create_vcpu(CPUState *cpu, Error **errp)
+{
+    return 0;
+}
+
 int kvm_arch_init_vcpu(CPUState *cs)
 {
     CPUMIPSState *env = cpu_env(cs);
diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index 966c2c6572..758298d565 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -477,6 +477,11 @@ static void kvmppc_hw_debug_points_init(CPUPPCState *cenv)
     }
 }
 
+int kvm_arch_pre_create_vcpu(CPUState *cpu, Error **errp)
+{
+    return 0;
+}
+
 int kvm_arch_init_vcpu(CPUState *cs)
 {
     PowerPCCPU *cpu = POWERPC_CPU(cs);
diff --git a/target/riscv/kvm/kvm-cpu.c b/target/riscv/kvm/kvm-cpu.c
index 23ce779359..55be7542e7 100644
--- a/target/riscv/kvm/kvm-cpu.c
+++ b/target/riscv/kvm/kvm-cpu.c
@@ -1362,6 +1362,11 @@ static int kvm_vcpu_enable_sbi_dbcn(RISCVCPU *cpu, CPUState *cs)
     return kvm_set_one_reg(cs, kvm_sbi_dbcn.kvm_reg_id, &reg);
 }
 
+int kvm_arch_pre_create_vcpu(CPUState *cpu, Error **errp)
+{
+    return 0;
+}
+
 int kvm_arch_init_vcpu(CPUState *cs)
 {
     int ret = 0;
diff --git a/target/s390x/kvm/kvm.c b/target/s390x/kvm/kvm.c
index 4d56e653dd..1f592733f4 100644
--- a/target/s390x/kvm/kvm.c
+++ b/target/s390x/kvm/kvm.c
@@ -404,6 +404,11 @@ unsigned long kvm_arch_vcpu_id(CPUState *cpu)
     return cpu->cpu_index;
 }
 
+int kvm_arch_pre_create_vcpu(CPUState *cpu, Error **errp)
+{
+    return 0;
+}
+
 int kvm_arch_init_vcpu(CPUState *cs)
 {
     unsigned int max_cpus = MACHINE(qdev_get_machine())->smp.max_cpus;
-- 
2.43.5



* [PATCH v2 04/10] target/i386/kvm: set KVM_PMU_CAP_DISABLE if "-pmu" is configured
  2025-03-02 22:00 [PATCH v2 00/10] target/i386/kvm/pmu: PMU Enhancement, Bugfix and Cleanup Dongli Zhang
                   ` (2 preceding siblings ...)
  2025-03-02 22:00 ` [PATCH v2 03/10] [DO NOT MERGE] kvm: Introduce kvm_arch_pre_create_vcpu() Dongli Zhang
@ 2025-03-02 22:00 ` Dongli Zhang
  2025-03-04  7:59   ` Xiaoyao Li
  2025-03-05 14:44   ` Zhao Liu
  2025-03-02 22:00 ` [PATCH v2 05/10] target/i386/kvm: extract unrelated code out of kvm_x86_build_cpuid() Dongli Zhang
                   ` (5 subsequent siblings)
  9 siblings, 2 replies; 63+ messages in thread
From: Dongli Zhang @ 2025-03-02 22:00 UTC (permalink / raw)
  To: qemu-devel, kvm
  Cc: pbonzini, zhao1.liu, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, xiaoyao.li, dapeng1.mi, joe.jin

Although AMD PERFCORE and PerfMonV2 are removed when "-pmu" is configured,
there is no way to fully disable KVM AMD PMU virtualization. Neither
"-cpu host,-pmu" nor "-cpu EPYC" achieves this.

As a result, the following message still appears in the VM dmesg:

[    0.263615] Performance Events: AMD PMU driver.

However, the expected output should be:

[    0.596381] Performance Events: PMU not available due to virtualization, using software events only.
[    0.600972] NMI watchdog: Perf NMI watchdog permanently disabled

This occurs because AMD does not use any CPUID bit to indicate PMU
availability.

To address this, KVM_CAP_PMU_CAPABILITY is used to set KVM_PMU_CAP_DISABLE
when "-pmu" is configured.
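
For readers unfamiliar with the capability, a minimal stand-alone sketch of
the underlying ioctl follows. This is not the QEMU code in this patch (QEMU
goes through kvm_vm_enable_cap()); the demo_* helpers are hypothetical, and
the fallback #defines are only there in case the installed kernel headers
predate the capability. KVM only accepts the request before any vCPU has
been created, which is why the patch uses kvm_arch_pre_create_vcpu().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Fallbacks for older or non-x86 kernel headers; values match the UAPI. */
#ifndef KVM_CAP_PMU_CAPABILITY
#define KVM_CAP_PMU_CAPABILITY 212
#endif
#ifndef KVM_PMU_CAP_DISABLE
#define KVM_PMU_CAP_DISABLE (1 << 0)
#endif

/* Build the KVM_ENABLE_CAP argument that asks KVM to disable the vPMU. */
static void demo_fill_pmu_disable(struct kvm_enable_cap *cap)
{
    assert(cap != NULL);

    memset(cap, 0, sizeof(*cap));
    cap->cap = KVM_CAP_PMU_CAPABILITY;
    cap->args[0] = KVM_PMU_CAP_DISABLE;
}

/*
 * vm_fd is assumed to be a descriptor returned by KVM_CREATE_VM, with no
 * vCPU created yet. Returns 0 on success, -1 with errno set on failure.
 */
static int demo_disable_pmu(int vm_fd)
{
    struct kvm_enable_cap cap;

    demo_fill_pmu_disable(&cap);
    return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
}
```

A caller would first check KVM_CHECK_EXTENSION for KVM_CAP_PMU_CAPABILITY,
as this patch does in kvm_arch_init(), and only then call the helper.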

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
Changed since v1:
  - Switch back to the initial implementation with "-pmu".
https://lore.kernel.org/all/20221119122901.2469-3-dongli.zhang@oracle.com
  - Mention that "KVM_PMU_CAP_DISABLE doesn't change the PMU behavior on
    Intel platform because current "pmu" property works as expected."

 target/i386/kvm/kvm.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index f41e190fb8..5c8a852dbd 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -176,6 +176,8 @@ static int has_triple_fault_event;
 
 static bool has_msr_mcg_ext_ctl;
 
+static int has_pmu_cap;
+
 static struct kvm_cpuid2 *cpuid_cache;
 static struct kvm_cpuid2 *hv_cpuid_cache;
 static struct kvm_msr_list *kvm_feature_msrs;
@@ -2053,6 +2055,33 @@ full:
 
 int kvm_arch_pre_create_vcpu(CPUState *cpu, Error **errp)
 {
+    static bool first = true;
+    int ret;
+
+    if (first) {
+        first = false;
+
+        /*
+         * Since Linux v5.18, KVM provides a VM-level capability to easily
+         * disable PMUs; however, QEMU has been providing PMU property per
+         * CPU since v1.6. In order to accommodate both, have to configure
+         * the VM-level capability here.
+         *
+         * KVM_PMU_CAP_DISABLE doesn't change the PMU
+         * behavior on Intel platform because current "pmu" property works
+         * as expected.
+         */
+        if (has_pmu_cap && !X86_CPU(cpu)->enable_pmu) {
+            ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_PMU_CAPABILITY, 0,
+                                    KVM_PMU_CAP_DISABLE);
+            if (ret < 0) {
+                error_setg_errno(errp, -ret,
+                                 "Failed to set KVM_PMU_CAP_DISABLE");
+                return ret;
+            }
+        }
+    }
+
     return 0;
 }
 
@@ -3351,6 +3380,8 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
         }
     }
 
+    has_pmu_cap = kvm_check_extension(s, KVM_CAP_PMU_CAPABILITY);
+
     return 0;
 }
 
-- 
2.43.5



* [PATCH v2 05/10] target/i386/kvm: extract unrelated code out of kvm_x86_build_cpuid()
  2025-03-02 22:00 [PATCH v2 00/10] target/i386/kvm/pmu: PMU Enhancement, Bugfix and Cleanup Dongli Zhang
                   ` (3 preceding siblings ...)
  2025-03-02 22:00 ` [PATCH v2 04/10] target/i386/kvm: set KVM_PMU_CAP_DISABLE if "-pmu" is configured Dongli Zhang
@ 2025-03-02 22:00 ` Dongli Zhang
  2025-03-05  7:03   ` Mi, Dapeng
  2025-03-07  9:15   ` Zhao Liu
  2025-03-02 22:00 ` [PATCH v2 06/10] target/i386/kvm: rename architectural PMU variables Dongli Zhang
                   ` (4 subsequent siblings)
  9 siblings, 2 replies; 63+ messages in thread
From: Dongli Zhang @ 2025-03-02 22:00 UTC (permalink / raw)
  To: qemu-devel, kvm
  Cc: pbonzini, zhao1.liu, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, xiaoyao.li, dapeng1.mi, joe.jin

The initialization of 'has_architectural_pmu_version',
'num_architectural_pmu_gp_counters', and
'num_architectural_pmu_fixed_counters' is unrelated to the process of
building the CPUID.

Extract them out of kvm_x86_build_cpuid().

No functional change.
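
For reference, the CPUID.0xA decoding that is being moved can be expressed
as a pure function. The sketch below is illustrative only: the demo_* names
are made up, and the two clamp values are assumptions standing in for
QEMU's MAX_GP_COUNTERS and MAX_FIXED_COUNTERS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed clamps for this sketch; QEMU defines its own MAX_* limits. */
#define DEMO_MAX_GP_COUNTERS    18
#define DEMO_MAX_FIXED_COUNTERS 3

struct demo_pmu_info {
    uint32_t version;        /* CPUID.0xA:EAX[7:0] */
    uint32_t gp_counters;    /* CPUID.0xA:EAX[15:8], clamped */
    uint32_t fixed_counters; /* CPUID.0xA:EDX[4:0], clamped, version > 1 */
};

/* Decode the EAX/EDX values returned by CPUID leaf 0xA. */
static void demo_parse_leaf_0xa(uint32_t eax, uint32_t edx,
                                struct demo_pmu_info *info)
{
    assert(info != NULL);

    info->version = eax & 0xff;
    info->gp_counters = 0;
    info->fixed_counters = 0;

    if (info->version == 0) {
        return; /* no architectural PMU */
    }

    info->gp_counters = (eax >> 8) & 0xff;
    if (info->gp_counters > DEMO_MAX_GP_COUNTERS) {
        info->gp_counters = DEMO_MAX_GP_COUNTERS;
    }

    /* Fixed counters are only reported from version 2 onwards. */
    if (info->version > 1) {
        info->fixed_counters = edx & 0x1f;
        if (info->fixed_counters > DEMO_MAX_FIXED_COUNTERS) {
            info->fixed_counters = DEMO_MAX_FIXED_COUNTERS;
        }
    }
}
```

For example, EAX=0x0805 and EDX=0x4 decode to version 5 with 8
general-purpose counters, and the 4 reported fixed counters are clamped to
the assumed limit of 3.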

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
Changed since v1:
  - Still extract the code, but call it for all CPUs.

 target/i386/kvm/kvm.c | 66 +++++++++++++++++++++++++------------------
 1 file changed, 39 insertions(+), 27 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 5c8a852dbd..8f293ffd61 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1959,33 +1959,6 @@ static uint32_t kvm_x86_build_cpuid(CPUX86State *env,
         }
     }
 
-    if (limit >= 0x0a) {
-        uint32_t eax, edx;
-
-        cpu_x86_cpuid(env, 0x0a, 0, &eax, &unused, &unused, &edx);
-
-        has_architectural_pmu_version = eax & 0xff;
-        if (has_architectural_pmu_version > 0) {
-            num_architectural_pmu_gp_counters = (eax & 0xff00) >> 8;
-
-            /* Shouldn't be more than 32, since that's the number of bits
-             * available in EBX to tell us _which_ counters are available.
-             * Play it safe.
-             */
-            if (num_architectural_pmu_gp_counters > MAX_GP_COUNTERS) {
-                num_architectural_pmu_gp_counters = MAX_GP_COUNTERS;
-            }
-
-            if (has_architectural_pmu_version > 1) {
-                num_architectural_pmu_fixed_counters = edx & 0x1f;
-
-                if (num_architectural_pmu_fixed_counters > MAX_FIXED_COUNTERS) {
-                    num_architectural_pmu_fixed_counters = MAX_FIXED_COUNTERS;
-                }
-            }
-        }
-    }
-
     cpu_x86_cpuid(env, 0x80000000, 0, &limit, &unused, &unused, &unused);
 
     for (i = 0x80000000; i <= limit; i++) {
@@ -2085,6 +2058,43 @@ int kvm_arch_pre_create_vcpu(CPUState *cpu, Error **errp)
     return 0;
 }
 
+static void kvm_init_pmu_info(CPUX86State *env)
+{
+    uint32_t eax, edx;
+    uint32_t unused;
+    uint32_t limit;
+
+    cpu_x86_cpuid(env, 0, 0, &limit, &unused, &unused, &unused);
+
+    if (limit < 0x0a) {
+        return;
+    }
+
+    cpu_x86_cpuid(env, 0x0a, 0, &eax, &unused, &unused, &edx);
+
+    has_architectural_pmu_version = eax & 0xff;
+    if (has_architectural_pmu_version > 0) {
+        num_architectural_pmu_gp_counters = (eax & 0xff00) >> 8;
+
+        /*
+         * Shouldn't be more than 32, since that's the number of bits
+         * available in EBX to tell us _which_ counters are available.
+         * Play it safe.
+         */
+        if (num_architectural_pmu_gp_counters > MAX_GP_COUNTERS) {
+            num_architectural_pmu_gp_counters = MAX_GP_COUNTERS;
+        }
+
+        if (has_architectural_pmu_version > 1) {
+            num_architectural_pmu_fixed_counters = edx & 0x1f;
+
+            if (num_architectural_pmu_fixed_counters > MAX_FIXED_COUNTERS) {
+                num_architectural_pmu_fixed_counters = MAX_FIXED_COUNTERS;
+            }
+        }
+    }
+}
+
 int kvm_arch_init_vcpu(CPUState *cs)
 {
     struct {
@@ -2267,6 +2277,8 @@ int kvm_arch_init_vcpu(CPUState *cs)
     cpuid_i = kvm_x86_build_cpuid(env, cpuid_data.entries, cpuid_i);
     cpuid_data.cpuid.nent = cpuid_i;
 
+    kvm_init_pmu_info(env);
+
     if (((env->cpuid_version >> 8)&0xF) >= 6
         && (env->features[FEAT_1_EDX] & (CPUID_MCE | CPUID_MCA)) ==
            (CPUID_MCE | CPUID_MCA)) {
-- 
2.43.5



* [PATCH v2 06/10] target/i386/kvm: rename architectural PMU variables
  2025-03-02 22:00 [PATCH v2 00/10] target/i386/kvm/pmu: PMU Enhancement, Bugfix and Cleanup Dongli Zhang
                   ` (4 preceding siblings ...)
  2025-03-02 22:00 ` [PATCH v2 05/10] target/i386/kvm: extract unrelated code out of kvm_x86_build_cpuid() Dongli Zhang
@ 2025-03-02 22:00 ` Dongli Zhang
  2025-03-05  7:07   ` Mi, Dapeng
  2025-03-07  9:19   ` Zhao Liu
  2025-03-02 22:00 ` [PATCH v2 07/10] target/i386/kvm: query kvm.enable_pmu parameter Dongli Zhang
                   ` (3 subsequent siblings)
  9 siblings, 2 replies; 63+ messages in thread
From: Dongli Zhang @ 2025-03-02 22:00 UTC (permalink / raw)
  To: qemu-devel, kvm
  Cc: pbonzini, zhao1.liu, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, xiaoyao.li, dapeng1.mi, joe.jin

AMD does not have what is commonly referred to as an architectural PMU.
Therefore, we need to rename the following variables to be applicable for
both Intel and AMD:

- has_architectural_pmu_version
- num_architectural_pmu_gp_counters
- num_architectural_pmu_fixed_counters

For Intel processors, the meaning of has_pmu_version remains unchanged.

For AMD processors:

has_pmu_version == 1 corresponds to versions before AMD PerfMonV2.
has_pmu_version == 2 corresponds to AMD PerfMonV2.

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
 target/i386/kvm/kvm.c | 49 ++++++++++++++++++++++++-------------------
 1 file changed, 28 insertions(+), 21 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 8f293ffd61..e895d22f94 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -164,9 +164,16 @@ static bool has_msr_perf_capabs;
 static bool has_msr_pkrs;
 static bool has_msr_hwcr;
 
-static uint32_t has_architectural_pmu_version;
-static uint32_t num_architectural_pmu_gp_counters;
-static uint32_t num_architectural_pmu_fixed_counters;
+/*
+ * For Intel processors, the meaning is the architectural PMU version
+ * number.
+ *
+ * For AMD processors: 1 corresponds to the prior versions, and 2
+ * corresponds to AMD PerfMonV2.
+ */
+static uint32_t has_pmu_version;
+static uint32_t num_pmu_gp_counters;
+static uint32_t num_pmu_fixed_counters;
 
 static int has_xsave2;
 static int has_xcrs;
@@ -2072,24 +2079,24 @@ static void kvm_init_pmu_info(CPUX86State *env)
 
     cpu_x86_cpuid(env, 0x0a, 0, &eax, &unused, &unused, &edx);
 
-    has_architectural_pmu_version = eax & 0xff;
-    if (has_architectural_pmu_version > 0) {
-        num_architectural_pmu_gp_counters = (eax & 0xff00) >> 8;
+    has_pmu_version = eax & 0xff;
+    if (has_pmu_version > 0) {
+        num_pmu_gp_counters = (eax & 0xff00) >> 8;
 
         /*
          * Shouldn't be more than 32, since that's the number of bits
          * available in EBX to tell us _which_ counters are available.
          * Play it safe.
          */
-        if (num_architectural_pmu_gp_counters > MAX_GP_COUNTERS) {
-            num_architectural_pmu_gp_counters = MAX_GP_COUNTERS;
+        if (num_pmu_gp_counters > MAX_GP_COUNTERS) {
+            num_pmu_gp_counters = MAX_GP_COUNTERS;
         }
 
-        if (has_architectural_pmu_version > 1) {
-            num_architectural_pmu_fixed_counters = edx & 0x1f;
+        if (has_pmu_version > 1) {
+            num_pmu_fixed_counters = edx & 0x1f;
 
-            if (num_architectural_pmu_fixed_counters > MAX_FIXED_COUNTERS) {
-                num_architectural_pmu_fixed_counters = MAX_FIXED_COUNTERS;
+            if (num_pmu_fixed_counters > MAX_FIXED_COUNTERS) {
+                num_pmu_fixed_counters = MAX_FIXED_COUNTERS;
             }
         }
     }
@@ -4041,25 +4048,25 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
             kvm_msr_entry_add(cpu, MSR_KVM_POLL_CONTROL, env->poll_control_msr);
         }
 
-        if (has_architectural_pmu_version > 0) {
-            if (has_architectural_pmu_version > 1) {
+        if (has_pmu_version > 0) {
+            if (has_pmu_version > 1) {
                 /* Stop the counter.  */
                 kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
                 kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_CTRL, 0);
             }
 
             /* Set the counter values.  */
-            for (i = 0; i < num_architectural_pmu_fixed_counters; i++) {
+            for (i = 0; i < num_pmu_fixed_counters; i++) {
                 kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR0 + i,
                                   env->msr_fixed_counters[i]);
             }
-            for (i = 0; i < num_architectural_pmu_gp_counters; i++) {
+            for (i = 0; i < num_pmu_gp_counters; i++) {
                 kvm_msr_entry_add(cpu, MSR_P6_PERFCTR0 + i,
                                   env->msr_gp_counters[i]);
                 kvm_msr_entry_add(cpu, MSR_P6_EVNTSEL0 + i,
                                   env->msr_gp_evtsel[i]);
             }
-            if (has_architectural_pmu_version > 1) {
+            if (has_pmu_version > 1) {
                 kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_STATUS,
                                   env->msr_global_status);
                 kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_OVF_CTRL,
@@ -4519,17 +4526,17 @@ static int kvm_get_msrs(X86CPU *cpu)
     if (env->features[FEAT_KVM] & CPUID_KVM_POLL_CONTROL) {
         kvm_msr_entry_add(cpu, MSR_KVM_POLL_CONTROL, 1);
     }
-    if (has_architectural_pmu_version > 0) {
-        if (has_architectural_pmu_version > 1) {
+    if (has_pmu_version > 0) {
+        if (has_pmu_version > 1) {
             kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
             kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_CTRL, 0);
             kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_STATUS, 0);
             kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_OVF_CTRL, 0);
         }
-        for (i = 0; i < num_architectural_pmu_fixed_counters; i++) {
+        for (i = 0; i < num_pmu_fixed_counters; i++) {
             kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR0 + i, 0);
         }
-        for (i = 0; i < num_architectural_pmu_gp_counters; i++) {
+        for (i = 0; i < num_pmu_gp_counters; i++) {
             kvm_msr_entry_add(cpu, MSR_P6_PERFCTR0 + i, 0);
             kvm_msr_entry_add(cpu, MSR_P6_EVNTSEL0 + i, 0);
         }
-- 
2.43.5



* [PATCH v2 07/10] target/i386/kvm: query kvm.enable_pmu parameter
  2025-03-02 22:00 [PATCH v2 00/10] target/i386/kvm/pmu: PMU Enhancement, Bugfix and Cleanup Dongli Zhang
                   ` (5 preceding siblings ...)
  2025-03-02 22:00 ` [PATCH v2 06/10] target/i386/kvm: rename architectural PMU variables Dongli Zhang
@ 2025-03-02 22:00 ` Dongli Zhang
  2025-03-10  6:14   ` Zhao Liu
  2025-03-02 22:00 ` [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset Dongli Zhang
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 63+ messages in thread
From: Dongli Zhang @ 2025-03-02 22:00 UTC (permalink / raw)
  To: qemu-devel, kvm
  Cc: pbonzini, zhao1.liu, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, xiaoyao.li, dapeng1.mi, joe.jin

There is no way to distinguish between the following scenarios:

(1) KVM_CAP_PMU_CAPABILITY is not supported.
(2) KVM_CAP_PMU_CAPABILITY is supported but disabled via the module
parameter kvm.enable_pmu=N.

In scenario (1), there is no way to fully disable AMD PMU virtualization.

In scenario (2), PMU virtualization is completely disabled by the KVM
module.

To help determine the scenario, read the kvm.enable_pmu value from the
sysfs module parameter.

There is no need to initialize 'has_pmu_version',
'num_pmu_gp_counters' or 'num_pmu_fixed_counters' if kvm.enable_pmu=N.
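
The check itself is small. A plain C sketch without GLib is shown below for
illustration; the demo_* helpers are hypothetical and are not part of this
patch. It relies on the fact that a boolean module parameter reads back as
"Y\n" or "N\n", and it treats a missing or unreadable file as "not
disabled", matching the behaviour of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* A boolean module parameter reads back as "Y\n" or "N\n". */
static bool demo_param_says_disabled(const char *contents)
{
    assert(contents != NULL);

    return contents[0] == 'N';
}

/*
 * Returns true only if the parameter file exists and starts with 'N'.
 * A missing or unreadable file is reported as "not disabled".
 */
static bool demo_kvm_pmu_disabled(const char *path)
{
    char buf[8] = "";
    FILE *f;

    assert(path != NULL);

    f = fopen(path, "r");
    if (f == NULL) {
        return false;
    }
    if (fgets(buf, sizeof(buf), f) == NULL) {
        buf[0] = '\0';
    }
    fclose(f);

    return demo_param_says_disabled(buf);
}
```

A caller would pass "/sys/module/kvm/parameters/enable_pmu" as the path.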

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
 target/i386/kvm/kvm.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index e895d22f94..efba3ae7a4 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -184,6 +184,10 @@ static int has_triple_fault_event;
 static bool has_msr_mcg_ext_ctl;
 
 static int has_pmu_cap;
+/*
+ * Read from /sys/module/kvm/parameters/enable_pmu.
+ */
+static bool kvm_pmu_disabled;
 
 static struct kvm_cpuid2 *cpuid_cache;
 static struct kvm_cpuid2 *hv_cpuid_cache;
@@ -3256,6 +3260,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
     int ret;
     struct utsname utsname;
     Error *local_err = NULL;
+    g_autofree char *kvm_enable_pmu = NULL;
 
     /*
      * Initialize SEV context, if required
@@ -3401,6 +3406,17 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
 
     has_pmu_cap = kvm_check_extension(s, KVM_CAP_PMU_CAPABILITY);
 
+    /*
+     * The kvm.enable_pmu's permission is 0444. It does not change until a
+     * reload of the KVM module.
+     */
+    if (g_file_get_contents("/sys/module/kvm/parameters/enable_pmu",
+                            &kvm_enable_pmu, NULL, NULL)) {
+        if (*kvm_enable_pmu == 'N') {
+            kvm_pmu_disabled = true;
+        }
+    }
+
     return 0;
 }
 
-- 
2.43.5



* [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset
  2025-03-02 22:00 [PATCH v2 00/10] target/i386/kvm/pmu: PMU Enhancement, Bugfix and Cleanup Dongli Zhang
                   ` (6 preceding siblings ...)
  2025-03-02 22:00 ` [PATCH v2 07/10] target/i386/kvm: query kvm.enable_pmu parameter Dongli Zhang
@ 2025-03-02 22:00 ` Dongli Zhang
  2025-03-05  7:33   ` Mi, Dapeng
                     ` (4 more replies)
  2025-03-02 22:00 ` [PATCH v2 09/10] target/i386/kvm: support perfmon-v2 for reset Dongli Zhang
  2025-03-02 22:00 ` [PATCH v2 10/10] target/i386/kvm: don't stop Intel PMU counters Dongli Zhang
  9 siblings, 5 replies; 63+ messages in thread
From: Dongli Zhang @ 2025-03-02 22:00 UTC (permalink / raw)
  To: qemu-devel, kvm
  Cc: pbonzini, zhao1.liu, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, xiaoyao.li, dapeng1.mi, joe.jin

QEMU uses the kvm_get_msrs() function to save Intel PMU registers from KVM
and kvm_put_msrs() to restore them to KVM. However, there is no support for
AMD PMU registers. Currently, has_pmu_version and num_pmu_gp_counters are
initialized based on cpuid(0xa), which does not apply to AMD processors.
For AMD CPUs, prior to PerfMonV2, the number of general-purpose registers
is determined based on the CPU version.
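
As a rough illustration of that rule (a sketch, not the patch itself): the
family is decoded from the CPUID version word, and the counter count then
depends on whether PERFCORE is advertised. The helper names are
hypothetical, and returning 0 for pre-K7 families is an assumption of this
sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define AMD64_NUM_COUNTERS       4  /* legacy K7-style counters */
#define AMD64_NUM_COUNTERS_CORE  6  /* counters when PERFCORE is present */

/* Decode the family from the CPUID version word (CPUID leaf 1, EAX). */
uint32_t amd_display_family(uint32_t cpuid_version)
{
    uint32_t family = (cpuid_version >> 8) & 0xf;

    /* The extended family field only applies when the base family is 0xf. */
    if (family == 0xf) {
        family += (cpuid_version >> 20) & 0xff;
    }
    return family;
}

/* Hypothetical helper: 0 means there are no PMU registers to handle. */
uint32_t amd_num_gp_counters(uint32_t cpuid_version, bool has_perfcore)
{
    if (amd_display_family(cpuid_version) < 6) {
        return 0;
    }
    return has_perfcore ? AMD64_NUM_COUNTERS_CORE : AMD64_NUM_COUNTERS;
}
```

For example, a version word of 0x00800f12 decodes to family 0x17.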

To address this issue, we need to add support for AMD PMU registers.
Without this support, the following problems can arise:

1. If the VM is reset (e.g., via QEMU system_reset or VM kdump/kexec) while
running "perf top", the PMU registers are not disabled properly.

2. Despite x86_cpu_reset() resetting many registers to zero, kvm_put_msrs()
does not handle AMD PMU registers, causing some PMU events to remain
enabled in KVM.

3. The KVM kvm_pmc_speculative_in_use() function consistently returns true,
preventing the reclamation of these events. Consequently, the
kvm_pmc->perf_event remains active.

4. After a reboot, the VM kernel may report the following error:

[    0.092011] Performance Events: Fam17h+ core perfctr, Broken BIOS detected, complain to your hardware vendor.
[    0.092023] [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR c0010200 is 530076)

5. In the worst case, the active kvm_pmc->perf_event may inject unknown
NMIs randomly into the VM kernel:

[...] Uhhuh. NMI received for unknown reason 30 on CPU 0.

To resolve these issues, we propose resetting AMD PMU registers during the
VM reset process.

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
Changed since v1:
  - Modify "MSR_K7_EVNTSEL0 + 3" and "MSR_K7_PERFCTR0 + 3" by using
    AMD64_NUM_COUNTERS (suggested by Sandipan Das).
  - Use "AMD64_NUM_COUNTERS_CORE * 2 - 1", not "MSR_F15H_PERF_CTL0 + 0xb".
    (suggested by Sandipan Das).
  - Switch back to "-pmu" instead of using a global "pmu-cap-disabled".
  - Don't initialize PMU info if kvm.enable_pmu=N.
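
For reference, a small stand-alone sketch of the two MSR layouts involved
(illustrative only; the helper names are hypothetical). The legacy K7
selectors and counters sit in two contiguous blocks with a step of one. The
F15H range interleaves them (CTL0, CTR0, CTL1, CTR1, ...) with a step of
two. The decode helper assumes that slot i of the saved arrays corresponds
to the i-th selector/counter pair, matching the "base + i * step" loops.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSR_K7_EVNTSEL0          0xc0010000u
#define MSR_K7_PERFCTR0          0xc0010004u
#define MSR_F15H_PERF_CTL0       0xc0010200u
#define MSR_F15H_PERF_CTR0       0xc0010201u
#define AMD64_NUM_COUNTERS_CORE  6

/* Address of the event selector for a given slot. */
uint32_t amd_evtsel_msr(bool perfcore, uint32_t slot)
{
    return perfcore ? MSR_F15H_PERF_CTL0 + slot * 2
                    : MSR_K7_EVNTSEL0 + slot;
}

/* Address of the counter for a given slot. */
uint32_t amd_counter_msr(bool perfcore, uint32_t slot)
{
    return perfcore ? MSR_F15H_PERF_CTR0 + slot * 2
                    : MSR_K7_PERFCTR0 + slot;
}

/*
 * Inverse mapping for the interleaved F15H range: even offsets are
 * selectors, odd offsets are counters, and the slot is offset / 2.
 * Returns false when the MSR is outside the range.
 */
bool f15h_decode(uint32_t msr, uint32_t *slot, bool *is_counter)
{
    uint32_t off;

    if (msr < MSR_F15H_PERF_CTL0 ||
        msr >= MSR_F15H_PERF_CTL0 + AMD64_NUM_COUNTERS_CORE * 2) {
        return false;
    }

    off = msr - MSR_F15H_PERF_CTL0;
    *slot = off / 2;
    *is_counter = (off & 1) != 0;
    return true;
}
```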

 target/i386/cpu.h     |   8 ++
 target/i386/kvm/kvm.c | 173 +++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 177 insertions(+), 4 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index c67b42d34f..319600672b 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -490,6 +490,14 @@ typedef enum X86Seg {
 #define MSR_CORE_PERF_GLOBAL_CTRL       0x38f
 #define MSR_CORE_PERF_GLOBAL_OVF_CTRL   0x390
 
+#define MSR_K7_EVNTSEL0                 0xc0010000
+#define MSR_K7_PERFCTR0                 0xc0010004
+#define MSR_F15H_PERF_CTL0              0xc0010200
+#define MSR_F15H_PERF_CTR0              0xc0010201
+
+#define AMD64_NUM_COUNTERS              4
+#define AMD64_NUM_COUNTERS_CORE         6
+
 #define MSR_MC0_CTL                     0x400
 #define MSR_MC0_STATUS                  0x401
 #define MSR_MC0_ADDR                    0x402
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index efba3ae7a4..d4be8a0d2e 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -2069,7 +2069,7 @@ int kvm_arch_pre_create_vcpu(CPUState *cpu, Error **errp)
     return 0;
 }
 
-static void kvm_init_pmu_info(CPUX86State *env)
+static void kvm_init_pmu_info_intel(CPUX86State *env)
 {
     uint32_t eax, edx;
     uint32_t unused;
@@ -2106,6 +2106,94 @@ static void kvm_init_pmu_info(CPUX86State *env)
     }
 }
 
+static void kvm_init_pmu_info_amd(CPUX86State *env)
+{
+    uint32_t unused;
+    int64_t family;
+    uint32_t ecx;
+
+    has_pmu_version = 0;
+
+    /*
+     * To determine the CPU family, the following code is derived from
+     * x86_cpuid_version_get_family().
+     */
+    family = (env->cpuid_version >> 8) & 0xf;
+    if (family == 0xf) {
+        family += (env->cpuid_version >> 20) & 0xff;
+    }
+
+    /*
+     * Performance-monitoring supported from K7 and later.
+     */
+    if (family < 6) {
+        return;
+    }
+
+    has_pmu_version = 1;
+
+    cpu_x86_cpuid(env, 0x80000001, 0, &unused, &unused, &ecx, &unused);
+
+    if (!(ecx & CPUID_EXT3_PERFCORE)) {
+        num_pmu_gp_counters = AMD64_NUM_COUNTERS;
+        return;
+    }
+
+    num_pmu_gp_counters = AMD64_NUM_COUNTERS_CORE;
+}
+
+static bool is_same_vendor(CPUX86State *env)
+{
+    static uint32_t host_cpuid_vendor1;
+    static uint32_t host_cpuid_vendor2;
+    static uint32_t host_cpuid_vendor3;
+
+    host_cpuid(0x0, 0, NULL, &host_cpuid_vendor1, &host_cpuid_vendor3,
+               &host_cpuid_vendor2);
+
+    return env->cpuid_vendor1 == host_cpuid_vendor1 &&
+           env->cpuid_vendor2 == host_cpuid_vendor2 &&
+           env->cpuid_vendor3 == host_cpuid_vendor3;
+}
+
+static void kvm_init_pmu_info(CPUState *cs)
+{
+    X86CPU *cpu = X86_CPU(cs);
+    CPUX86State *env = &cpu->env;
+
+    /*
+     * The PMU virtualization is disabled by kvm.enable_pmu=N.
+     */
+    if (kvm_pmu_disabled) {
+        return;
+    }
+
+    /*
+     * It is not supported to virtualize AMD PMU registers on Intel
+     * processors, nor to virtualize Intel PMU registers on AMD processors.
+     */
+    if (!is_same_vendor(env)) {
+        return;
+    }
+
+    /*
+     * If KVM_CAP_PMU_CAPABILITY is not supported, there is no way to
+     * disable the AMD pmu virtualization.
+     *
+     * If KVM_CAP_PMU_CAPABILITY is supported !cpu->enable_pmu
+     * indicates the KVM has already disabled the PMU virtualization.
+     */
+    if (has_pmu_cap && !cpu->enable_pmu) {
+        return;
+    }
+
+    if (IS_INTEL_CPU(env)) {
+        kvm_init_pmu_info_intel(env);
+    } else if (IS_AMD_CPU(env)) {
+        kvm_init_pmu_info_amd(env);
+    }
+}
+
 int kvm_arch_init_vcpu(CPUState *cs)
 {
     struct {
@@ -2288,7 +2376,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
     cpuid_i = kvm_x86_build_cpuid(env, cpuid_data.entries, cpuid_i);
     cpuid_data.cpuid.nent = cpuid_i;
 
-    kvm_init_pmu_info(env);
+    kvm_init_pmu_info(cs);
 
     if (((env->cpuid_version >> 8)&0xF) >= 6
         && (env->features[FEAT_1_EDX] & (CPUID_MCE | CPUID_MCA)) ==
@@ -4064,7 +4152,7 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
             kvm_msr_entry_add(cpu, MSR_KVM_POLL_CONTROL, env->poll_control_msr);
         }
 
-        if (has_pmu_version > 0) {
+        if (IS_INTEL_CPU(env) && has_pmu_version > 0) {
             if (has_pmu_version > 1) {
                 /* Stop the counter.  */
                 kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
@@ -4095,6 +4183,38 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
                                   env->msr_global_ctrl);
             }
         }
+
+        if (IS_AMD_CPU(env) && has_pmu_version > 0) {
+            uint32_t sel_base = MSR_K7_EVNTSEL0;
+            uint32_t ctr_base = MSR_K7_PERFCTR0;
+            /*
+             * The address of the next selector or counter register is
+             * obtained by incrementing the address of the current selector
+             * or counter register by one.
+             */
+            uint32_t step = 1;
+
+            /*
+             * When PERFCORE is enabled, AMD PMU uses a separate set of
+             * addresses for the selector and counter registers.
+             * Additionally, the address of the next selector or counter
+             * register is determined by incrementing the address of the
+             * current register by two.
+             */
+            if (num_pmu_gp_counters == AMD64_NUM_COUNTERS_CORE) {
+                sel_base = MSR_F15H_PERF_CTL0;
+                ctr_base = MSR_F15H_PERF_CTR0;
+                step = 2;
+            }
+
+            for (i = 0; i < num_pmu_gp_counters; i++) {
+                kvm_msr_entry_add(cpu, ctr_base + i * step,
+                                  env->msr_gp_counters[i]);
+                kvm_msr_entry_add(cpu, sel_base + i * step,
+                                  env->msr_gp_evtsel[i]);
+            }
+        }
+
         /*
          * Hyper-V partition-wide MSRs: to avoid clearing them on cpu hot-add,
          * only sync them to KVM on the first cpu
@@ -4542,7 +4662,8 @@ static int kvm_get_msrs(X86CPU *cpu)
     if (env->features[FEAT_KVM] & CPUID_KVM_POLL_CONTROL) {
         kvm_msr_entry_add(cpu, MSR_KVM_POLL_CONTROL, 1);
     }
-    if (has_pmu_version > 0) {
+
+    if (IS_INTEL_CPU(env) && has_pmu_version > 0) {
         if (has_pmu_version > 1) {
             kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
             kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_CTRL, 0);
@@ -4558,6 +4679,35 @@ static int kvm_get_msrs(X86CPU *cpu)
         }
     }
 
+    if (IS_AMD_CPU(env) && has_pmu_version > 0) {
+        uint32_t sel_base = MSR_K7_EVNTSEL0;
+        uint32_t ctr_base = MSR_K7_PERFCTR0;
+        /*
+         * The address of the next selector or counter register is
+         * obtained by incrementing the address of the current selector
+         * or counter register by one.
+         */
+        uint32_t step = 1;
+
+        /*
+         * When PERFCORE is enabled, AMD PMU uses a separate set of
+         * addresses for the selector and counter registers.
+         * Additionally, the address of the next selector or counter
+         * register is determined by incrementing the address of the
+         * current register by two.
+         */
+        if (num_pmu_gp_counters == AMD64_NUM_COUNTERS_CORE) {
+            sel_base = MSR_F15H_PERF_CTL0;
+            ctr_base = MSR_F15H_PERF_CTR0;
+            step = 2;
+        }
+
+        for (i = 0; i < num_pmu_gp_counters; i++) {
+            kvm_msr_entry_add(cpu, ctr_base + i * step, 0);
+            kvm_msr_entry_add(cpu, sel_base + i * step, 0);
+        }
+    }
+
     if (env->mcg_cap) {
         kvm_msr_entry_add(cpu, MSR_MCG_STATUS, 0);
         kvm_msr_entry_add(cpu, MSR_MCG_CTL, 0);
@@ -4869,6 +5019,21 @@ static int kvm_get_msrs(X86CPU *cpu)
         case MSR_P6_EVNTSEL0 ... MSR_P6_EVNTSEL0 + MAX_GP_COUNTERS - 1:
             env->msr_gp_evtsel[index - MSR_P6_EVNTSEL0] = msrs[i].data;
             break;
+        case MSR_K7_EVNTSEL0 ... MSR_K7_EVNTSEL0 + AMD64_NUM_COUNTERS - 1:
+            env->msr_gp_evtsel[index - MSR_K7_EVNTSEL0] = msrs[i].data;
+            break;
+        case MSR_K7_PERFCTR0 ... MSR_K7_PERFCTR0 + AMD64_NUM_COUNTERS - 1:
+            env->msr_gp_counters[index - MSR_K7_PERFCTR0] = msrs[i].data;
+            break;
+        case MSR_F15H_PERF_CTL0 ...
+             MSR_F15H_PERF_CTL0 + AMD64_NUM_COUNTERS_CORE * 2 - 1:
+            index = index - MSR_F15H_PERF_CTL0;
+            if (index & 0x1) {
+                env->msr_gp_counters[index / 2] = msrs[i].data;
+            } else {
+                env->msr_gp_evtsel[index / 2] = msrs[i].data;
+            }
+            break;
         case HV_X64_MSR_HYPERCALL:
             env->msr_hv_hypercall = msrs[i].data;
             break;
-- 
2.43.5



* [PATCH v2 09/10] target/i386/kvm: support perfmon-v2 for reset
  2025-03-02 22:00 [PATCH v2 00/10] target/i386/kvm/pmu: PMU Enhancement, Bugfix and Cleanup Dongli Zhang
                   ` (7 preceding siblings ...)
  2025-03-02 22:00 ` [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset Dongli Zhang
@ 2025-03-02 22:00 ` Dongli Zhang
  2025-03-02 22:00 ` [PATCH v2 10/10] target/i386/kvm: don't stop Intel PMU counters Dongli Zhang
  9 siblings, 0 replies; 63+ messages in thread
From: Dongli Zhang @ 2025-03-02 22:00 UTC (permalink / raw)
  To: qemu-devel, kvm
  Cc: pbonzini, zhao1.liu, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, xiaoyao.li, dapeng1.mi, joe.jin

Since perfmon-v2, the AMD PMU supports additional registers. This update
includes get/put functionality for these extra registers.

Similar to the implementation in KVM:

- MSR_CORE_PERF_GLOBAL_STATUS and MSR_AMD64_PERF_CNTR_GLOBAL_STATUS both
use env->msr_global_status.
- MSR_CORE_PERF_GLOBAL_CTRL and MSR_AMD64_PERF_CNTR_GLOBAL_CTL both use
env->msr_global_ctrl.
- MSR_CORE_PERF_GLOBAL_OVF_CTRL and MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR
both use env->msr_global_ovf_ctrl.

No changes are needed for vmstate_msr_architectural_pmu or
pmu_enable_needed().

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
Changed since v1:
  - Use "has_pmu_version > 1", not "has_pmu_version == 2".
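
To illustrate the field sharing described above (a sketch only, not the
patch): each AMD global MSR maps to the same saved field as its Intel
counterpart. The struct and function names are hypothetical stand-ins for
the env->msr_global_* fields. The value 0x38e for
MSR_CORE_PERF_GLOBAL_STATUS is taken from the existing cpu.h definitions,
not from this diff.

```c
#include <stddef.h>
#include <stdint.h>

#define MSR_CORE_PERF_GLOBAL_STATUS            0x38eu
#define MSR_CORE_PERF_GLOBAL_CTRL              0x38fu
#define MSR_CORE_PERF_GLOBAL_OVF_CTRL          0x390u
#define MSR_AMD64_PERF_CNTR_GLOBAL_STATUS      0xc0000300u
#define MSR_AMD64_PERF_CNTR_GLOBAL_CTL         0xc0000301u
#define MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR  0xc0000302u

/* Hypothetical stand-in for the three env->msr_global_* fields. */
struct pmu_global_state {
    uint64_t status;
    uint64_t ctrl;
    uint64_t ovf_ctrl;
};

/* Returns the shared field for a global PMU MSR, or NULL otherwise. */
uint64_t *pmu_global_field(struct pmu_global_state *s, uint32_t msr)
{
    switch (msr) {
    case MSR_CORE_PERF_GLOBAL_STATUS:
    case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS:
        return &s->status;
    case MSR_CORE_PERF_GLOBAL_CTRL:
    case MSR_AMD64_PERF_CNTR_GLOBAL_CTL:
        return &s->ctrl;
    case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
    case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR:
        return &s->ovf_ctrl;
    default:
        return NULL;
    }
}
```

Since a vCPU exposes either the Intel or the AMD set, the two vendors never
write the same field at the same time.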

 target/i386/cpu.h     |  4 ++++
 target/i386/kvm/kvm.c | 47 ++++++++++++++++++++++++++++++++++---------
 2 files changed, 41 insertions(+), 10 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 319600672b..fdceebfc72 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -490,6 +490,10 @@ typedef enum X86Seg {
 #define MSR_CORE_PERF_GLOBAL_CTRL       0x38f
 #define MSR_CORE_PERF_GLOBAL_OVF_CTRL   0x390
 
+#define MSR_AMD64_PERF_CNTR_GLOBAL_STATUS       0xc0000300
+#define MSR_AMD64_PERF_CNTR_GLOBAL_CTL          0xc0000301
+#define MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR   0xc0000302
+
 #define MSR_K7_EVNTSEL0                 0xc0010000
 #define MSR_K7_PERFCTR0                 0xc0010004
 #define MSR_F15H_PERF_CTL0              0xc0010200
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index d4be8a0d2e..c5911baef0 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -2108,9 +2108,9 @@ static void kvm_init_pmu_info_intel(CPUX86State *env)
 
 static void kvm_init_pmu_info_amd(CPUX86State *env)
 {
+    uint32_t eax, ebx, ecx;
     uint32_t unused;
     int64_t family;
-    uint32_t ecx;
 
     has_pmu_version = 0;
 
@@ -2140,6 +2140,13 @@ static void kvm_init_pmu_info_amd(CPUX86State *env)
     }
 
     num_pmu_gp_counters = AMD64_NUM_COUNTERS_CORE;
+
+    cpu_x86_cpuid(env, 0x80000022, 0, &eax, &ebx, &unused, &unused);
+
+    if (eax & CPUID_8000_0022_EAX_PERFMON_V2) {
+        has_pmu_version = 2;
+        num_pmu_gp_counters = ebx & 0xf;
+    }
 }
 
 static bool is_same_vendor(CPUX86State *env)
@@ -4195,13 +4202,14 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
             uint32_t step = 1;
 
             /*
-             * When PERFCORE is enabled, AMD PMU uses a separate set of
-             * addresses for the selector and counter registers.
-             * Additionally, the address of the next selector or counter
-             * register is determined by incrementing the address of the
-             * current register by two.
+             * When PERFCORE or PerfMonV2 is enabled, AMD PMU uses a
+             * separate set of addresses for the selector and counter
+             * registers. Additionally, the address of the next selector or
+             * counter register is determined by incrementing the address
+             * of the current register by two.
              */
-            if (num_pmu_gp_counters == AMD64_NUM_COUNTERS_CORE) {
+            if (num_pmu_gp_counters == AMD64_NUM_COUNTERS_CORE ||
+                has_pmu_version > 1) {
                 sel_base = MSR_F15H_PERF_CTL0;
                 ctr_base = MSR_F15H_PERF_CTR0;
                 step = 2;
@@ -4213,6 +4221,15 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
                 kvm_msr_entry_add(cpu, sel_base + i * step,
                                   env->msr_gp_evtsel[i]);
             }
+
+            if (has_pmu_version > 1) {
+                kvm_msr_entry_add(cpu, MSR_AMD64_PERF_CNTR_GLOBAL_STATUS,
+                                  env->msr_global_status);
+                kvm_msr_entry_add(cpu, MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR,
+                                  env->msr_global_ovf_ctrl);
+                kvm_msr_entry_add(cpu, MSR_AMD64_PERF_CNTR_GLOBAL_CTL,
+                                  env->msr_global_ctrl);
+            }
         }
 
         /*
@@ -4690,13 +4707,14 @@ static int kvm_get_msrs(X86CPU *cpu)
         uint32_t step = 1;
 
         /*
-         * When PERFCORE is enabled, AMD PMU uses a separate set of
-         * addresses for the selector and counter registers.
+         * When PERFCORE or PerfMonV2 is enabled, AMD PMU uses a separate
+         * set of addresses for the selector and counter registers.
          * Additionally, the address of the next selector or counter
          * register is determined by incrementing the address of the
          * current register by two.
          */
-        if (num_pmu_gp_counters == AMD64_NUM_COUNTERS_CORE) {
+        if (num_pmu_gp_counters == AMD64_NUM_COUNTERS_CORE ||
+            has_pmu_version > 1) {
             sel_base = MSR_F15H_PERF_CTL0;
             ctr_base = MSR_F15H_PERF_CTR0;
             step = 2;
@@ -4706,6 +4724,12 @@ static int kvm_get_msrs(X86CPU *cpu)
             kvm_msr_entry_add(cpu, ctr_base + i * step, 0);
             kvm_msr_entry_add(cpu, sel_base + i * step, 0);
         }
+
+        if (has_pmu_version > 1) {
+            kvm_msr_entry_add(cpu, MSR_AMD64_PERF_CNTR_GLOBAL_CTL, 0);
+            kvm_msr_entry_add(cpu, MSR_AMD64_PERF_CNTR_GLOBAL_STATUS, 0);
+            kvm_msr_entry_add(cpu, MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR, 0);
+        }
     }
 
     if (env->mcg_cap) {
@@ -5002,12 +5026,15 @@ static int kvm_get_msrs(X86CPU *cpu)
             env->msr_fixed_ctr_ctrl = msrs[i].data;
             break;
         case MSR_CORE_PERF_GLOBAL_CTRL:
+        case MSR_AMD64_PERF_CNTR_GLOBAL_CTL:
             env->msr_global_ctrl = msrs[i].data;
             break;
         case MSR_CORE_PERF_GLOBAL_STATUS:
+        case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS:
             env->msr_global_status = msrs[i].data;
             break;
         case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
+        case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR:
             env->msr_global_ovf_ctrl = msrs[i].data;
             break;
         case MSR_CORE_PERF_FIXED_CTR0 ... MSR_CORE_PERF_FIXED_CTR0 + MAX_FIXED_COUNTERS - 1:
-- 
2.43.5



* [PATCH v2 10/10] target/i386/kvm: don't stop Intel PMU counters
  2025-03-02 22:00 [PATCH v2 00/10] target/i386/kvm/pmu: PMU Enhancement, Bugfix and Cleanup Dongli Zhang
                   ` (8 preceding siblings ...)
  2025-03-02 22:00 ` [PATCH v2 09/10] target/i386/kvm: support perfmon-v2 for reset Dongli Zhang
@ 2025-03-02 22:00 ` Dongli Zhang
  2025-03-05  7:35   ` Mi, Dapeng
  9 siblings, 1 reply; 63+ messages in thread
From: Dongli Zhang @ 2025-03-02 22:00 UTC (permalink / raw)
  To: qemu-devel, kvm
  Cc: pbonzini, zhao1.liu, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, xiaoyao.li, dapeng1.mi, joe.jin

kvm_put_msrs() sets the MSRs using KVM_SET_MSRS. The x86 KVM processes
these MSRs one by one in a loop, only saving the config and triggering the
KVM_REQ_PMU request. This approach does not immediately stop the event
before updating the PMC.

In addition, PMU MSRs are set only at levels >= KVM_PUT_RESET_STATE,
excluding runtime. Therefore, updating these MSRs without stopping events
should be acceptable.

Finally, KVM creates kernel perf events with host mode excluded
(exclude_host = 1). While the events remain active, they don't increment
the counter during QEMU vCPU userspace mode.

No Fixes tag is added for commit 0d89436786b0 ("kvm: migrate vPMU state"),
because this isn't a bugfix.

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
 target/i386/kvm/kvm.c | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index c5911baef0..4902694129 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -4160,13 +4160,6 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
         }
 
         if (IS_INTEL_CPU(env) && has_pmu_version > 0) {
-            if (has_pmu_version > 1) {
-                /* Stop the counter.  */
-                kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
-                kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_CTRL, 0);
-            }
-
-            /* Set the counter values.  */
             for (i = 0; i < num_pmu_fixed_counters; i++) {
                 kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR0 + i,
                                   env->msr_fixed_counters[i]);
@@ -4182,8 +4175,6 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
                                   env->msr_global_status);
                 kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_OVF_CTRL,
                                   env->msr_global_ovf_ctrl);
-
-                /* Now start the PMU.  */
                 kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR_CTRL,
                                   env->msr_fixed_ctr_ctrl);
                 kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_CTRL,
-- 
2.43.5



* Re: [PATCH v2 02/10] target/i386: disable PERFCORE when "-pmu" is configured
  2025-03-02 22:00 ` [PATCH v2 02/10] target/i386: disable PERFCORE when "-pmu" is configured Dongli Zhang
@ 2025-03-03  1:59   ` Xiaoyao Li
  2025-03-03 18:45     ` dongli.zhang
  2025-03-06 16:50   ` Zhao Liu
  1 sibling, 1 reply; 63+ messages in thread
From: Xiaoyao Li @ 2025-03-03  1:59 UTC (permalink / raw)
  To: Dongli Zhang, qemu-devel, kvm
  Cc: pbonzini, zhao1.liu, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, dapeng1.mi, joe.jin

On 3/3/2025 6:00 AM, Dongli Zhang wrote:
> Currently, AMD PMU support isn't determined based on CPUID, that is, the
> "-pmu" option does not fully disable KVM AMD PMU virtualization.
> 
> To minimize AMD PMU features, remove PERFCORE when "-pmu" is configured.
> 
> To completely disable AMD PMU virtualization will be implemented via
> KVM_CAP_PMU_CAPABILITY in upcoming patches.
> 
> As a reminder, neither CPUID_EXT3_PERFCORE nor
> CPUID_8000_0022_EAX_PERFMON_V2 is removed from env->features[] when "-pmu"
> is configured. Developers should query whether they are supported via
> cpu_x86_cpuid() rather than relying on env->features[] in future patches.

I don't think it is the correct direction to go.

env->features[] should be finalized before cpu_x86_cpuid() and 
env->features[] needs to be able to be exposed to guest directly. This 
ensures guest and QEMU have the same view of CPUIDs and it simplifies 
things.

We can adjust env->features[] by filtering all PMU related CPUIDs based 
on cpu->enable_pmu in x86_cpu_realizefn().

> Suggested-by: Zhao Liu <zhao1.liu@intel.com>
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
>   target/i386/cpu.c | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index b6d6167910..61a671028a 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -7115,6 +7115,10 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>               !(env->hflags & HF_LMA_MASK)) {
>               *edx &= ~CPUID_EXT2_SYSCALL;
>           }
> +
> +        if (kvm_enabled() && IS_AMD_CPU(env) && !cpu->enable_pmu) {
> +            *ecx &= ~CPUID_EXT3_PERFCORE;
> +        }
>           break;
>       case 0x80000002:
>       case 0x80000003:



* Re: [PATCH v2 02/10] target/i386: disable PERFCORE when "-pmu" is configured
  2025-03-03  1:59   ` Xiaoyao Li
@ 2025-03-03 18:45     ` dongli.zhang
  2025-03-04  6:11       ` Xiaoyao Li
  0 siblings, 1 reply; 63+ messages in thread
From: dongli.zhang @ 2025-03-03 18:45 UTC (permalink / raw)
  To: Xiaoyao Li, qemu-devel, kvm
  Cc: pbonzini, zhao1.liu, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, dapeng1.mi, joe.jin

Hi Xiaoyao,

On 3/2/25 5:59 PM, Xiaoyao Li wrote:
> On 3/3/2025 6:00 AM, Dongli Zhang wrote:
>> Currently, AMD PMU support isn't determined based on CPUID, that is, the
>> "-pmu" option does not fully disable KVM AMD PMU virtualization.
>>
>> To minimize AMD PMU features, remove PERFCORE when "-pmu" is configured.
>>
>> To completely disable AMD PMU virtualization will be implemented via
>> KVM_CAP_PMU_CAPABILITY in upcoming patches.
>>
>> As a reminder, neither CPUID_EXT3_PERFCORE nor
>> CPUID_8000_0022_EAX_PERFMON_V2 is removed from env->features[] when "-pmu"
>> is configured. Developers should query whether they are supported via
>> cpu_x86_cpuid() rather than relying on env->features[] in future patches.
> 
> I don't think it is the correct direction to go.
> 
> env->features[] should be finalized before cpu_x86_cpuid() and
> env->features[] needs to be able to be exposed to guest directly. This
> ensures guest and QEMU have the same view of CPUIDs and it simplifies
> things.
> 
> We can adjust env->features[] by filtering all PMU related CPUIDs based on
> cpu->enable_pmu in x86_cpu_realizefn().

Thank you very much for the suggestion.

I see code like the following in x86_cpu_realizefn() that edits env->features[]:

    /* On AMD CPUs, some CPUID[8000_0001].EDX bits must match the bits on
     * CPUID[1].EDX.
     */
    if (IS_AMD_CPU(env)) {
        env->features[FEAT_8000_0001_EDX] &= ~CPUID_EXT2_AMD_ALIASES;
        env->features[FEAT_8000_0001_EDX] |= (env->features[FEAT_1_EDX]
           & CPUID_EXT2_AMD_ALIASES);
    }

I may do something similar for CPUID_EXT3_PERFCORE and
CPUID_8000_0022_EAX_PERFMON_V2.
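
A rough stand-alone sketch of that idea, not the actual change: the struct
and function names below are hypothetical, reduced stand-ins for the two
env->features[] words involved. It assumes PERFCORE is bit 23 of
CPUID[8000_0001].ECX and PerfMonV2 is bit 0 of CPUID[8000_0022].EAX.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID_EXT3_PERFCORE             (1U << 23)
#define CPUID_8000_0022_EAX_PERFMON_V2  (1U << 0)

/* Hypothetical stand-in for the relevant env->features[] words. */
struct amd_pmu_features {
    uint32_t ext3_ecx;     /* CPUID[8000_0001].ECX */
    uint32_t perfmon_eax;  /* CPUID[8000_0022].EAX */
};

/*
 * Clear the AMD PMU feature bits once, at realize time, when the PMU is
 * disabled, so that later CPUID queries need no special casing.
 */
void filter_amd_pmu_features(struct amd_pmu_features *f, bool enable_pmu)
{
    if (enable_pmu) {
        return;
    }
    f->ext3_ecx &= ~CPUID_EXT3_PERFCORE;
    f->perfmon_eax &= ~CPUID_8000_0022_EAX_PERFMON_V2;
}
```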

Dongli Zhang



> 
>> Suggested-by: Zhao Liu <zhao1.liu@intel.com>
>> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
>> ---
>>   target/i386/cpu.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>> index b6d6167910..61a671028a 100644
>> --- a/target/i386/cpu.c
>> +++ b/target/i386/cpu.c
>> @@ -7115,6 +7115,10 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t
>> index, uint32_t count,
>>               !(env->hflags & HF_LMA_MASK)) {
>>               *edx &= ~CPUID_EXT2_SYSCALL;
>>           }
>> +
>> +        if (kvm_enabled() && IS_AMD_CPU(env) && !cpu->enable_pmu) {
>> +            *ecx &= ~CPUID_EXT3_PERFCORE;
>> +        }
>>           break;
>>       case 0x80000002:
>>       case 0x80000003:
> 



* Re: [PATCH v2 02/10] target/i386: disable PERFCORE when "-pmu" is configured
  2025-03-03 18:45     ` dongli.zhang
@ 2025-03-04  6:11       ` Xiaoyao Li
  0 siblings, 0 replies; 63+ messages in thread
From: Xiaoyao Li @ 2025-03-04  6:11 UTC (permalink / raw)
  To: dongli.zhang, qemu-devel, kvm
  Cc: pbonzini, zhao1.liu, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, dapeng1.mi, joe.jin

On 3/4/2025 2:45 AM, dongli.zhang@oracle.com wrote:
> Hi Xiaoyao,
> 
> On 3/2/25 5:59 PM, Xiaoyao Li wrote:
>> On 3/3/2025 6:00 AM, Dongli Zhang wrote:
>>> Currently, AMD PMU support isn't determined based on CPUID, that is, the
>>> "-pmu" option does not fully disable KVM AMD PMU virtualization.
>>>
>>> To minimize AMD PMU features, remove PERFCORE when "-pmu" is configured.
>>>
>>> To completely disable AMD PMU virtualization will be implemented via
>>> KVM_CAP_PMU_CAPABILITY in upcoming patches.
>>>
>>> As a reminder, neither CPUID_EXT3_PERFCORE nor
>>> CPUID_8000_0022_EAX_PERFMON_V2 is removed from env->features[] when "-pmu"
>>> is configured. Developers should query whether they are supported via
>>> cpu_x86_cpuid() rather than relying on env->features[] in future patches.
>>
>> I don't think it is the correct direction to go.
>>
>> env->features[] should be finalized before cpu_x86_cpuid() and
>> env->features[] needs to be able to be exposed to guest directly. This
>> ensures guest and QEMU have the same view of CPUIDs and it simplifies
>> things.
>>
>> We can adjust env->features[] by filtering all PMU related CPUIDs based on
>> cpu->enable_pmu in x86_cpu_realizefn().
> 
> Thank you very much for suggestion.
> 
> I see  code like below in x86_cpu_realizefn() to edit env->features[].
> 
>     /* On AMD CPUs, some CPUID[8000_0001].EDX bits must match the bits on
>      * CPUID[1].EDX.
>      */
>     if (IS_AMD_CPU(env)) {
>         env->features[FEAT_8000_0001_EDX] &= ~CPUID_EXT2_AMD_ALIASES;
>         env->features[FEAT_8000_0001_EDX] |= (env->features[FEAT_1_EDX]
>            & CPUID_EXT2_AMD_ALIASES);
>     }
> 
> I may do something similar to them for CPUID_EXT3_PERFCORE and
> CPUID_8000_0022_EAX_PERFMON_V2.

I just sent a series for CPUID_EXT_PDCM[1]. I think you can put 
CPUID_EXT3_PERFCORE and CPUID_8000_0022_EAX_PERFMON_V2 at the same place.

[1] 
https://lore.kernel.org/qemu-devel/20250304052450.465445-1-xiaoyao.li@intel.com/T/#m31c6777131b6361d7c3af22b09532bdc785dbc06

> Dongli Zhang
> 
> 
> 
>>
>>> Suggested-by: Zhao Liu <zhao1.liu@intel.com>
>>> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
>>> ---
>>>    target/i386/cpu.c | 4 ++++
>>>    1 file changed, 4 insertions(+)
>>>
>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>> index b6d6167910..61a671028a 100644
>>> --- a/target/i386/cpu.c
>>> +++ b/target/i386/cpu.c
>>> @@ -7115,6 +7115,10 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t
>>> index, uint32_t count,
>>>                !(env->hflags & HF_LMA_MASK)) {
>>>                *edx &= ~CPUID_EXT2_SYSCALL;
>>>            }
>>> +
>>> +        if (kvm_enabled() && IS_AMD_CPU(env) && !cpu->enable_pmu) {
>>> +            *ecx &= ~CPUID_EXT3_PERFCORE;
>>> +        }
>>>            break;
>>>        case 0x80000002:
>>>        case 0x80000003:
>>
> 



* Re: [PATCH v2 04/10] target/i386/kvm: set KVM_PMU_CAP_DISABLE if "-pmu" is configured
  2025-03-02 22:00 ` [PATCH v2 04/10] target/i386/kvm: set KVM_PMU_CAP_DISABLE if "-pmu" is configured Dongli Zhang
@ 2025-03-04  7:59   ` Xiaoyao Li
  2025-03-05  1:22     ` Sean Christopherson
  2025-03-05 14:41     ` Zhao Liu
  2025-03-05 14:44   ` Zhao Liu
  1 sibling, 2 replies; 63+ messages in thread
From: Xiaoyao Li @ 2025-03-04  7:59 UTC (permalink / raw)
  To: Dongli Zhang, qemu-devel, kvm
  Cc: pbonzini, zhao1.liu, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, dapeng1.mi, joe.jin

On 3/3/2025 6:00 AM, Dongli Zhang wrote:
> Although AMD PERFCORE and PerfMonV2 are removed when "-pmu" is configured,
> there is no way to fully disable KVM AMD PMU virtualization. Neither
> "-cpu host,-pmu" nor "-cpu EPYC" achieves this.

This looks like a KVM bug.

Anyway, since QEMU can achieve its goal with KVM_PMU_CAP_DISABLE with 
current KVM, I'm fine with it.

I have one nit below, otherwise

Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>

> As a result, the following message still appears in the VM dmesg:
> 
> [    0.263615] Performance Events: AMD PMU driver.
> 
> However, the expected output should be:
> 
> [    0.596381] Performance Events: PMU not available due to virtualization, using software events only.
> [    0.600972] NMI watchdog: Perf NMI watchdog permanently disabled
> 
> This occurs because AMD does not use any CPUID bit to indicate PMU
> availability.
> 
> To address this, KVM_CAP_PMU_CAPABILITY is used to set KVM_PMU_CAP_DISABLE
> when "-pmu" is configured.
> 
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
> Changed since v1:
>    - Switch back to the initial implementation with "-pmu".
> https://lore.kernel.org/all/20221119122901.2469-3-dongli.zhang@oracle.com
>    - Mention that "KVM_PMU_CAP_DISABLE doesn't change the PMU behavior on
>      Intel platform because current "pmu" property works as expected."
> 
>   target/i386/kvm/kvm.c | 31 +++++++++++++++++++++++++++++++
>   1 file changed, 31 insertions(+)
> 
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index f41e190fb8..5c8a852dbd 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -176,6 +176,8 @@ static int has_triple_fault_event;
>   
>   static bool has_msr_mcg_ext_ctl;
>   
> +static int has_pmu_cap;
> +
>   static struct kvm_cpuid2 *cpuid_cache;
>   static struct kvm_cpuid2 *hv_cpuid_cache;
>   static struct kvm_msr_list *kvm_feature_msrs;
> @@ -2053,6 +2055,33 @@ full:
>   
>   int kvm_arch_pre_create_vcpu(CPUState *cpu, Error **errp)
>   {
> +    static bool first = true;
> +    int ret;
> +
> +    if (first) {
> +        first = false;
> +
> +        /*
> +         * Since Linux v5.18, KVM provides a VM-level capability to easily
> +         * disable PMUs; however, QEMU has been providing PMU property per
> +         * CPU since v1.6. In order to accommodate both, have to configure
> +         * the VM-level capability here.
> +         *
> +         * KVM_PMU_CAP_DISABLE doesn't change the PMU
> +         * behavior on Intel platform because current "pmu" property works
> +         * as expected.
> +         */
> +        if (has_pmu_cap && !X86_CPU(cpu)->enable_pmu) {

One nit: it's safer to check the specific capability bit, i.e.

	(has_pmu_cap & KVM_PMU_CAP_DISABLE) && !X86_CPU(cpu)->enable_pmu

Maybe we can rename has_pmu_cap to pmu_cap as well.
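
To illustrate the nit: as far as I understand, the value returned for
KVM_CAP_PMU_CAPABILITY is a bitmask of supported PMU capabilities, so a
non-zero value does not by itself prove that the DISABLE bit is present.
A minimal standalone sketch follows. The helper names are hypothetical,
and the constant is redefined locally only so the snippet builds without
the kernel headers:

```c
#include <stdbool.h>

/* Mirrors the value in <linux/kvm.h>; redefined so the sketch is standalone. */
#ifndef KVM_PMU_CAP_DISABLE
#define KVM_PMU_CAP_DISABLE (1 << 0)
#endif

/*
 * Hypothetical helper: pmu_cap is the bitmask reported by KVM for
 * KVM_CAP_PMU_CAPABILITY (0 when the capability is not supported).
 */
static bool pmu_cap_supports_disable(int pmu_cap)
{
    return (pmu_cap & KVM_PMU_CAP_DISABLE) != 0;
}

/*
 * Hypothetical helper: request KVM_PMU_CAP_DISABLE only when KVM
 * advertises that bit and the user turned the PMU off ("-pmu").
 */
static bool should_disable_pmu(int pmu_cap, bool enable_pmu)
{
    return pmu_cap_supports_disable(pmu_cap) && !enable_pmu;
}
```

With this shape, a future KVM that reports other capability bits without
the DISABLE bit would not cause QEMU to issue a request that KVM rejects.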

> +            ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_PMU_CAPABILITY, 0,
> +                                    KVM_PMU_CAP_DISABLE);
> +            if (ret < 0) {
> +                error_setg_errno(errp, -ret,
> +                                 "Failed to set KVM_PMU_CAP_DISABLE");
> +                return ret;
> +            }
> +        }
> +    }
> +
>       return 0;
>   }
>   
> @@ -3351,6 +3380,8 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>           }
>       }
>   
> +    has_pmu_cap = kvm_check_extension(s, KVM_CAP_PMU_CAPABILITY);
> +
>       return 0;
>   }
>   


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v2 01/10] target/i386: disable PerfMonV2 when PERFCORE unavailable
  2025-03-02 22:00 ` [PATCH v2 01/10] target/i386: disable PerfMonV2 when PERFCORE unavailable Dongli Zhang
@ 2025-03-04 14:40   ` Xiaoyao Li
  2025-03-04 22:53     ` dongli.zhang
  2025-03-05 14:20   ` Zhao Liu
  2025-03-07  7:24   ` Sandipan Das
  2 siblings, 1 reply; 63+ messages in thread
From: Xiaoyao Li @ 2025-03-04 14:40 UTC (permalink / raw)
  To: Dongli Zhang, qemu-devel, kvm
  Cc: pbonzini, zhao1.liu, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, dapeng1.mi, joe.jin

On 3/3/2025 6:00 AM, Dongli Zhang wrote:
> When the PERFCORE is disabled with "-cpu host,-perfctr-core", it is
> reflected in the guest dmesg.
> 
> [    0.285136] Performance Events: AMD PMU driver.

I'm a little confused. When perfctr-core is disabled, can the AMD PMU
driver still be probed? (Forgive me if this is a silly question.)

> However, the guest CPUID indicates the PerfMonV2 is still available.
> 
> CPU:
>     Extended Performance Monitoring and Debugging (0x80000022):
>        AMD performance monitoring V2         = true
>        AMD LBR V2                            = false
>        AMD LBR stack & PMC freezing          = false
>        number of core perf ctrs              = 0x6 (6)
>        number of LBR stack entries           = 0x0 (0)
>        number of avail Northbridge perf ctrs = 0x0 (0)
>        number of available UMC PMCs          = 0x0 (0)
>        active UMCs bitmask                   = 0x0
> 
> Disable PerfMonV2 in CPUID when PERFCORE is disabled.
> 
> Suggested-by: Zhao Liu <zhao1.liu@intel.com>

Although the description confuses me as noted above, the change itself
looks good to me. So

Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>

> Fixes: 209b0ac12074 ("target/i386: Add PerfMonV2 feature bit")
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
> Changed since v1:
>    - Use feature_dependencies (suggested by Zhao Liu).
> 
>   target/i386/cpu.c | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 72ab147e85..b6d6167910 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -1805,6 +1805,10 @@ static FeatureDep feature_dependencies[] = {
>           .from = { FEAT_7_1_EDX,             CPUID_7_1_EDX_AVX10 },
>           .to = { FEAT_24_0_EBX,              ~0ull },
>       },
> +    {
> +        .from = { FEAT_8000_0001_ECX,       CPUID_EXT3_PERFCORE },
> +        .to = { FEAT_8000_0022_EAX,         CPUID_8000_0022_EAX_PERFMON_V2 },
> +    },
>   };
>   
>   typedef struct X86RegisterInfo32 {
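
For readers less familiar with the dependency table, here is a rough
standalone sketch of the behaviour I expect from the new entry: when the
"from" bit is absent, the "to" bits are cleared. The types and names
below are hypothetical simplifications, not QEMU's actual FeatureDep
code, and the bit positions (PERFCORE as bit 23 of CPUID 0x80000001 ECX,
PerfMonV2 as bit 0 of CPUID 0x80000022 EAX) are restated here as
assumptions:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature words standing in for FEAT_8000_0001_ECX etc. */
enum feat_word { WORD_8000_0001_ECX, WORD_8000_0022_EAX, WORD_COUNT };

#define PERFCORE_BIT   (1u << 23)  /* assumed position of PERFCORE */
#define PERFMON_V2_BIT (1u << 0)   /* assumed position of PerfMonV2 */

struct feat_mask {
    enum feat_word word;
    uint64_t mask;
};

struct feat_dep {
    struct feat_mask from;  /* prerequisite feature */
    struct feat_mask to;    /* features that require it */
};

static const struct feat_dep deps[] = {
    {
        .from = { WORD_8000_0001_ECX, PERFCORE_BIT },
        .to   = { WORD_8000_0022_EAX, PERFMON_V2_BIT },
    },
};

/* Clear every dependent feature whose prerequisite is missing. */
static void apply_feature_deps(uint64_t features[WORD_COUNT])
{
    for (size_t i = 0; i < sizeof(deps) / sizeof(deps[0]); i++) {
        const struct feat_dep *d = &deps[i];

        if (!(features[d->from.word] & d->from.mask)) {
            features[d->to.word] &= ~d->to.mask;
        }
    }
}
```

The real code also has to deal with features the user requested
explicitly, which this sketch ignores.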


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v2 01/10] target/i386: disable PerfMonV2 when PERFCORE unavailable
  2025-03-04 14:40   ` Xiaoyao Li
@ 2025-03-04 22:53     ` dongli.zhang
  2025-03-05  1:38       ` Xiaoyao Li
  0 siblings, 1 reply; 63+ messages in thread
From: dongli.zhang @ 2025-03-04 22:53 UTC (permalink / raw)
  To: Xiaoyao Li, qemu-devel, kvm
  Cc: pbonzini, zhao1.liu, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, dapeng1.mi, joe.jin

Hi Xiaoyao,

On 3/4/25 6:40 AM, Xiaoyao Li wrote:
> On 3/3/2025 6:00 AM, Dongli Zhang wrote:
>> When the PERFCORE is disabled with "-cpu host,-perfctr-core", it is
>> reflected in the guest dmesg.
>>
>> [    0.285136] Performance Events: AMD PMU driver.
> 
> I'm a little confused. When no perfctr-core, AMD PMU driver can still be
> probed? (forgive me if I ask a silly question)

Intel uses CPUID leaf 0xa ("cpuid -1 -l 0xa") to determine PMU support.

However, AMD doesn't use CPUID to determine PMU support (except for AMD
PerfMonV2).

I derived everything below from the Linux kernel function amd_pmu_init().

As line 1521 shows, the PMU isn't supported on old AMD CPUs.

1516 __init int amd_pmu_init(void)
1517 {
1518         int ret;
1519
1520         /* Performance-monitoring supported from K7 and later: */
1521         if (boot_cpu_data.x86 < 6)
1522                 return -ENODEV;
1523
1524         x86_pmu = amd_pmu;
1525
1526         ret = amd_core_pmu_init();


1. Therefore, at least 4 PMCs are available (without 'perfctr-core').

2. With 'perfctr-core', there are 6 PMCs. (line 1410)

1404 static int __init amd_core_pmu_init(void)
1405 {
1406         union cpuid_0x80000022_ebx ebx;
1407         u64 even_ctr_mask = 0ULL;
1408         int i;
1409
1410         if (!boot_cpu_has(X86_FEATURE_PERFCTR_CORE))
1411                 return 0;
1412
1413         /* Avoid calculating the value each time in the NMI handler */
1414         perf_nmi_window = msecs_to_jiffies(100);
1415
1416         /*
1417          * If core performance counter extensions exists, we must use
1418          * MSR_F15H_PERF_CTL/MSR_F15H_PERF_CTR msrs. See also
1419          * amd_pmu_addr_offset().
1420          */
1421         x86_pmu.eventsel        = MSR_F15H_PERF_CTL;
1422         x86_pmu.perfctr         = MSR_F15H_PERF_CTR;
1423         x86_pmu.cntr_mask64     = GENMASK_ULL(AMD64_NUM_COUNTERS_CORE - 1, 0);


3. With PerfMonV2, extra global registers are available, as well as PMCs.
(line 1426)

1425         /* Check for Performance Monitoring v2 support */
1426         if (boot_cpu_has(X86_FEATURE_PERFMON_V2)) {
1427                 ebx.full = cpuid_ebx(EXT_PERFMON_DEBUG_FEATURES);
1428
1429                 /* Update PMU version for later usage */
1430                 x86_pmu.version = 2;
1431
1432                 /* Find the number of available Core PMCs */
1433                 x86_pmu.cntr_mask64 = GENMASK_ULL(ebx.split.num_core_pmc - 1, 0);
1434
1435                 amd_pmu_global_cntr_mask = x86_pmu.cntr_mask64;
1436
1437                 /* Update PMC handling functions */
1438                 x86_pmu.enable_all = amd_pmu_v2_enable_all;
1439                 x86_pmu.disable_all = amd_pmu_v2_disable_all;
1440                 x86_pmu.enable = amd_pmu_v2_enable_event;
1441                 x86_pmu.handle_irq = amd_pmu_v2_handle_irq;
1442                 static_call_update(amd_pmu_test_overflow, amd_pmu_test_overflow_status);
1443         }


That's why the legacy 4-PMC PMU is still probed after we disable perfctr-core.

- (boot_cpu_data.x86 < 6): No PMU.
- Without perfctr-core: 4 PMCs
- With perfctr-core: 6 PMCs
- PerfMonV2: PMCs (currently 6) + global PMU registers
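
The four cases above can be condensed into one decision function. This is
only a sketch of my reading of amd_pmu_init() and amd_core_pmu_init(),
not kernel code: the struct and function names are hypothetical, and the
constants restate the kernel's AMD64_NUM_COUNTERS (4) and
AMD64_NUM_COUNTERS_CORE (6).

```c
#include <stdbool.h>

#define LEGACY_NUM_COUNTERS 4  /* AMD64_NUM_COUNTERS in the kernel */
#define CORE_NUM_COUNTERS   6  /* AMD64_NUM_COUNTERS_CORE in the kernel */

/* Hypothetical summary of the inputs the kernel looks at. */
struct amd_pmu_inputs {
    unsigned int family;        /* boot_cpu_data.x86 */
    bool perfctr_core;          /* X86_FEATURE_PERFCTR_CORE */
    bool perfmon_v2;            /* X86_FEATURE_PERFMON_V2 */
    unsigned int num_core_pmc;  /* CPUID 0x80000022 EBX, PerfMonV2 only */
};

/* Number of core PMCs the guest kernel would end up using. */
static unsigned int amd_guest_num_pmcs(const struct amd_pmu_inputs *in)
{
    if (in->family < 6) {
        return 0;                    /* line 1521: no PMU at all */
    }
    if (!in->perfctr_core) {
        return LEGACY_NUM_COUNTERS;  /* line 1410: early return, legacy PMU */
    }
    if (in->perfmon_v2) {
        return in->num_core_pmc;     /* line 1433: count comes from CPUID */
    }
    return CORE_NUM_COUNTERS;        /* line 1423 */
}
```

Note that PerfMonV2 is only consulted after the perfctr-core check, so a
guest that sees PerfMonV2 without perfctr-core still gets the legacy
4-PMC PMU. That inconsistency is what patch 01 removes from the CPUID.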


Does this resolve your concern in the other thread that "This looks like
a KVM bug."? This isn't a KVM bug. It is because AMD lacks a configuration
to disable the PMU.

Thank you very much!

Dongli Zhang

> 
>> However, the guest CPUID indicates the PerfMonV2 is still available.
>>
>> CPU:
>>     Extended Performance Monitoring and Debugging (0x80000022):
>>        AMD performance monitoring V2         = true
>>        AMD LBR V2                            = false
>>        AMD LBR stack & PMC freezing          = false
>>        number of core perf ctrs              = 0x6 (6)
>>        number of LBR stack entries           = 0x0 (0)
>>        number of avail Northbridge perf ctrs = 0x0 (0)
>>        number of available UMC PMCs          = 0x0 (0)
>>        active UMCs bitmask                   = 0x0
>>
>> Disable PerfMonV2 in CPUID when PERFCORE is disabled.
>>
>> Suggested-by: Zhao Liu <zhao1.liu@intel.com>
> 
> Though I have above confusion of the description, the change itself looks
> good to me. So
> 
> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
> 
>> Fixes: 209b0ac12074 ("target/i386: Add PerfMonV2 feature bit")
>> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
>> ---
>> Changed since v1:
>>    - Use feature_dependencies (suggested by Zhao Liu).
>>
>>   target/i386/cpu.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>> index 72ab147e85..b6d6167910 100644
>> --- a/target/i386/cpu.c
>> +++ b/target/i386/cpu.c
>> @@ -1805,6 +1805,10 @@ static FeatureDep feature_dependencies[] = {
>>           .from = { FEAT_7_1_EDX,             CPUID_7_1_EDX_AVX10 },
>>           .to = { FEAT_24_0_EBX,              ~0ull },
>>       },
>> +    {
>> +        .from = { FEAT_8000_0001_ECX,       CPUID_EXT3_PERFCORE },
>> +        .to = { FEAT_8000_0022_EAX,         CPUID_8000_0022_EAX_PERFMON_V2 },
>> +    },
>>   };
>>     typedef struct X86RegisterInfo32 {
> 


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v2 04/10] target/i386/kvm: set KVM_PMU_CAP_DISABLE if "-pmu" is configured
  2025-03-04  7:59   ` Xiaoyao Li
@ 2025-03-05  1:22     ` Sean Christopherson
  2025-03-05  1:35       ` Xiaoyao Li
  2025-03-05 14:41     ` Zhao Liu
  1 sibling, 1 reply; 63+ messages in thread
From: Sean Christopherson @ 2025-03-05  1:22 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Dongli Zhang, qemu-devel, kvm, pbonzini, zhao1.liu, mtosatti,
	sandipan.das, babu.moger, likexu, like.xu.linux, zhenyuw, groug,
	khorenko, alexander.ivanov, den, davydov-max, dapeng1.mi, joe.jin

On Tue, Mar 04, 2025, Xiaoyao Li wrote:
> On 3/3/2025 6:00 AM, Dongli Zhang wrote:
> > Although AMD PERFCORE and PerfMonV2 are removed when "-pmu" is configured,
> > there is no way to fully disable KVM AMD PMU virtualization. Neither
> > "-cpu host,-pmu" nor "-cpu EPYC" achieves this.
> 
> This looks like a KVM bug.

Heh, the patches you sent do fix _a_ KVM bug, but this is something else entirely.

In practice, the KVM bug only affects what KVM_GET_SUPPORTED_CPUID returns when
enable_pmu=false, and in that case, it's only a reporting issue, i.e. KVM will
still block usage of the PMU.

As Dongli pointed out, older AMD CPUs don't actually enumerate a PMU in CPUID,
and so the kernel assumes that not-too-old CPUs have a PMU:

	/* Performance-monitoring supported from K7 and later: */
	if (boot_cpu_data.x86 < 6)
		return -ENODEV;

The "expected" output:

   Performance Events: PMU not available due to virtualization, using software events only.

is a long-standing workaround in the kernel to deal with lack of enumeration.  On
top of explicit enumeration, init_hw_perf_events() => check_hw_exists() probes
hardware to see if it actually works.  If an MSR is unexpectedly unavailable, as
is the case when running as a guest, the kernel prints a message and disables PMU
usage.  E.g. the above message is specific to running as a guest:

	if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
		pr_cont("PMU not available due to virtualization, using software events only.\n");
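
To make the probing idea concrete, here is a heavily simplified,
standalone sketch of that kind of check. It is not the kernel's
check_hw_exists(): the msr_ops struct, the fake backends and probe_pmu()
are hypothetical stand-ins for the kernel's safe MSR accessors, and only
the write-then-read-back step is modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical MSR accessors; return 0 on success, non-zero on a fault. */
struct msr_ops {
    int (*read)(uint32_t msr, uint64_t *val);
    int (*write)(uint32_t msr, uint64_t val);
};

enum pmu_probe_result {
    PMU_OK,
    PMU_UNAVAILABLE_VIRT,  /* "PMU not available due to virtualization" */
    PMU_BROKEN_HW,         /* same failure seen on bare metal */
};

/* Fake backend 1: a counter MSR that behaves like real hardware. */
static uint64_t fake_counter;
static int working_read(uint32_t msr, uint64_t *val)
{ (void)msr; *val = fake_counter; return 0; }
static int working_write(uint32_t msr, uint64_t val)
{ (void)msr; fake_counter = val; return 0; }

/* Fake backend 2: every access faults, as if the PMU MSRs were blocked. */
static int faulting_read(uint32_t msr, uint64_t *val)
{ (void)msr; (void)val; return -1; }
static int faulting_write(uint32_t msr, uint64_t val)
{ (void)msr; (void)val; return -1; }

static const struct msr_ops working_msr_ops = { working_read, working_write };
static const struct msr_ops faulting_msr_ops = { faulting_read, faulting_write };

/* Write a changed value to the counter and verify that it sticks. */
static enum pmu_probe_result probe_pmu(const struct msr_ops *ops,
                                       uint32_t ctr_msr, bool is_guest)
{
    uint64_t before, after;

    if (ops->read(ctr_msr, &before) == 0) {
        uint64_t pattern = before ^ 0xffffULL;

        if (ops->write(ctr_msr, pattern) == 0 &&
            ops->read(ctr_msr, &after) == 0 && after == pattern) {
            return PMU_OK;
        }
    }
    /* The guest-specific message depends on X86_FEATURE_HYPERVISOR. */
    return is_guest ? PMU_UNAVAILABLE_VIRT : PMU_BROKEN_HW;
}
```

The point is only that the guest's verdict comes from MSR behaviour, not
from CPUID, which is why userspace has to tell KVM explicitly to block
the PMU MSRs.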

From the KVM side, because there's no CPUID enumeration, there's no way for KVM
to know that userspace wants to completely disable PMU virtualization from CPUID
alone.  Whereas with Intel CPUs, KVM infers that the PMU should be disabled by
lack of a non-zero PMU version, e.g. if CPUID.0xA is omitted.

> Anyway, since QEMU can achieve its goal with KVM_PMU_CAP_DISABLE with
> current KVM, I'm fine with it.

Yeah, this is the only way other than disabling KVM's PMU virtualization via
module param (enable_pmu).

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v2 04/10] target/i386/kvm: set KVM_PMU_CAP_DISABLE if "-pmu" is configured
  2025-03-05  1:22     ` Sean Christopherson
@ 2025-03-05  1:35       ` Xiaoyao Li
  0 siblings, 0 replies; 63+ messages in thread
From: Xiaoyao Li @ 2025-03-05  1:35 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Dongli Zhang, qemu-devel, kvm, pbonzini, zhao1.liu, mtosatti,
	sandipan.das, babu.moger, likexu, like.xu.linux, zhenyuw, groug,
	khorenko, alexander.ivanov, den, davydov-max, dapeng1.mi, joe.jin

On 3/5/2025 9:22 AM, Sean Christopherson wrote:
> On Tue, Mar 04, 2025, Xiaoyao Li wrote:
>> On 3/3/2025 6:00 AM, Dongli Zhang wrote:
>>> Although AMD PERFCORE and PerfMonV2 are removed when "-pmu" is configured,
>>> there is no way to fully disable KVM AMD PMU virtualization. Neither
>>> "-cpu host,-pmu" nor "-cpu EPYC" achieves this.
>>
>> This looks like a KVM bug.
> 
> Heh, the patches you sent do fix _a_ KVM bug, but this is something else entirely.

Aha, that fix was just found by code inspection. It was not supposed to 
be related to this.

> In practice, the KVM bug only affects what KVM_GET_SUPPORTED_CPUID returns when
> enable_pmu=false, and in that case, it's only a reporting issue, i.e. KVM will
> still block usage of the PMU.
> 
> As Dongli pointed out, older AMD CPUs don't actually enumerate a PMU in CPUID,
> and so the kernel assumes that not-too-old CPUs have a PMU:
> 
> 	/* Performance-monitoring supported from K7 and later: */
> 	if (boot_cpu_data.x86 < 6)
> 		return -ENODEV;
> 
> The "expected" output:
> 
>     Performance Events: PMU not available due to virtualization, using software events only.
> 
> is a long-standing workaround in the kernel to deal with lack of enumeration.  On
> top of explicit enumeration, init_hw_perf_events() => check_hw_exists() probes
> hardware to see if it actually works.  If an MSR is unexpectedly unavailable, as
> is the case when running as a guest, the kernel prints a message and disables PMU
> usage.  E.g. the above message is specific to running as a guest:
> 
> 	if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
> 		pr_cont("PMU not available due to virtualization, using software events only.\n");
> 
>  From the KVM side, because there's no CPUID enumeration, there's no way for KVM
> to know that userspace wants to completely disable PMU virtualization from CPUID
> alone.  Whereas with Intel CPUs, KVM infers that the PMU should be disabled by
> lack of a non-zero PMU version, e.g. if CPUID.0xA is omitted.

I see now.

Thanks to you and Dongli!

>> Anyway, since QEMU can achieve its goal with KVM_PMU_CAP_DISABLE with
>> current KVM, I'm fine with it.
> 
> Yeah, this is the only way other than disabling KVM's PMU virtualization via
> module param (enable_pmu).


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v2 01/10] target/i386: disable PerfMonV2 when PERFCORE unavailable
  2025-03-04 22:53     ` dongli.zhang
@ 2025-03-05  1:38       ` Xiaoyao Li
  0 siblings, 0 replies; 63+ messages in thread
From: Xiaoyao Li @ 2025-03-05  1:38 UTC (permalink / raw)
  To: dongli.zhang, qemu-devel, kvm
  Cc: pbonzini, zhao1.liu, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, dapeng1.mi, joe.jin

On 3/5/2025 6:53 AM, dongli.zhang@oracle.com wrote:
> Hi Xiaoyao,
> 
> On 3/4/25 6:40 AM, Xiaoyao Li wrote:
>> On 3/3/2025 6:00 AM, Dongli Zhang wrote:
>>> When the PERFCORE is disabled with "-cpu host,-perfctr-core", it is
>>> reflected in the guest dmesg.
>>>
>>> [    0.285136] Performance Events: AMD PMU driver.
>>
>> I'm a little confused. When no perfctr-core, AMD PMU driver can still be
>> probed? (forgive me if I ask a silly question)
> 
> Intel uses CPUID leaf 0xa ("cpuid -1 -l 0xa") to determine PMU support.
> 
> However, AMD doesn't use CPUID to determine PMU support (except for AMD
> PerfMonV2).
> 
> I derived everything below from the Linux kernel function amd_pmu_init().
> 
> As line 1521 shows, the PMU isn't supported on old AMD CPUs.
> 
> 1516 __init int amd_pmu_init(void)
> 1517 {
> 1518         int ret;
> 1519
> 1520         /* Performance-monitoring supported from K7 and later: */
> 1521         if (boot_cpu_data.x86 < 6)
> 1522                 return -ENODEV;
> 1523
> 1524         x86_pmu = amd_pmu;
> 1525
> 1526         ret = amd_core_pmu_init();
> 
> 
> 1. Therefore, at least 4 PMCs are available (without 'perfctr-core').
> 
> 2. With 'perfctr-core', there are 6 PMCs. (line 1410)
> 
> 1404 static int __init amd_core_pmu_init(void)
> 1405 {
> 1406         union cpuid_0x80000022_ebx ebx;
> 1407         u64 even_ctr_mask = 0ULL;
> 1408         int i;
> 1409
> 1410         if (!boot_cpu_has(X86_FEATURE_PERFCTR_CORE))
> 1411                 return 0;
> 1412
> 1413         /* Avoid calculating the value each time in the NMI handler */
> 1414         perf_nmi_window = msecs_to_jiffies(100);
> 1415
> 1416         /*
> 1417          * If core performance counter extensions exists, we must use
> 1418          * MSR_F15H_PERF_CTL/MSR_F15H_PERF_CTR msrs. See also
> 1419          * amd_pmu_addr_offset().
> 1420          */
> 1421         x86_pmu.eventsel        = MSR_F15H_PERF_CTL;
> 1422         x86_pmu.perfctr         = MSR_F15H_PERF_CTR;
> 1423         x86_pmu.cntr_mask64     = GENMASK_ULL(AMD64_NUM_COUNTERS_CORE - 1, 0);
> 
> 
> 3. With PerfMonV2, extra global registers are available, as well as PMCs.
> (line 1426)
> 
> 1425         /* Check for Performance Monitoring v2 support */
> 1426         if (boot_cpu_has(X86_FEATURE_PERFMON_V2)) {
> 1427                 ebx.full = cpuid_ebx(EXT_PERFMON_DEBUG_FEATURES);
> 1428
> 1429                 /* Update PMU version for later usage */
> 1430                 x86_pmu.version = 2;
> 1431
> 1432                 /* Find the number of available Core PMCs */
> 1433                 x86_pmu.cntr_mask64 = GENMASK_ULL(ebx.split.num_core_pmc - 1, 0);
> 1434
> 1435                 amd_pmu_global_cntr_mask = x86_pmu.cntr_mask64;
> 1436
> 1437                 /* Update PMC handling functions */
> 1438                 x86_pmu.enable_all = amd_pmu_v2_enable_all;
> 1439                 x86_pmu.disable_all = amd_pmu_v2_disable_all;
> 1440                 x86_pmu.enable = amd_pmu_v2_enable_event;
> 1441                 x86_pmu.handle_irq = amd_pmu_v2_handle_irq;
> 1442                 static_call_update(amd_pmu_test_overflow, amd_pmu_test_overflow_status);
> 1443         }
> 
> 
> That's why the legacy 4-PMC PMU is still probed after we disable perfctr-core.
> 
> - (boot_cpu_data.x86 < 6): No PMU.
> - Without perfctr-core: 4 PMCs
> - With perfctr-core: 6 PMCs
> - PerfMonV2: PMCs (currently 6) + global PMU registers
> 
> 
> Does this resolve your concern in the other thread that "This looks like
> a KVM bug."? This isn't a KVM bug. It is because AMD lacks a configuration
> to disable the PMU.

It helps a lot! Yes, it isn't a KVM bug.

Thanks for your detailed explanation!

> Thank you very much!
> 
> Dongli Zhang
> 
>>
>>> However, the guest CPUID indicates the PerfMonV2 is still available.
>>>
>>> CPU:
>>>      Extended Performance Monitoring and Debugging (0x80000022):
>>>         AMD performance monitoring V2         = true
>>>         AMD LBR V2                            = false
>>>         AMD LBR stack & PMC freezing          = false
>>>         number of core perf ctrs              = 0x6 (6)
>>>         number of LBR stack entries           = 0x0 (0)
>>>         number of avail Northbridge perf ctrs = 0x0 (0)
>>>         number of available UMC PMCs          = 0x0 (0)
>>>         active UMCs bitmask                   = 0x0
>>>
>>> Disable PerfMonV2 in CPUID when PERFCORE is disabled.
>>>
>>> Suggested-by: Zhao Liu <zhao1.liu@intel.com>
>>
>> Though I have above confusion of the description, the change itself looks
>> good to me. So
>>
>> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
>>
>>> Fixes: 209b0ac12074 ("target/i386: Add PerfMonV2 feature bit")
>>> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
>>> ---
>>> Changed since v1:
>>>     - Use feature_dependencies (suggested by Zhao Liu).
>>>
>>>    target/i386/cpu.c | 4 ++++
>>>    1 file changed, 4 insertions(+)
>>>
>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>> index 72ab147e85..b6d6167910 100644
>>> --- a/target/i386/cpu.c
>>> +++ b/target/i386/cpu.c
>>> @@ -1805,6 +1805,10 @@ static FeatureDep feature_dependencies[] = {
>>>            .from = { FEAT_7_1_EDX,             CPUID_7_1_EDX_AVX10 },
>>>            .to = { FEAT_24_0_EBX,              ~0ull },
>>>        },
>>> +    {
>>> +        .from = { FEAT_8000_0001_ECX,       CPUID_EXT3_PERFCORE },
>>> +        .to = { FEAT_8000_0022_EAX,         CPUID_8000_0022_EAX_PERFMON_V2 },
>>> +    },
>>>    };
>>>      typedef struct X86RegisterInfo32 {
>>
> 


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v2 05/10] target/i386/kvm: extract unrelated code out of kvm_x86_build_cpuid()
  2025-03-02 22:00 ` [PATCH v2 05/10] target/i386/kvm: extract unrelated code out of kvm_x86_build_cpuid() Dongli Zhang
@ 2025-03-05  7:03   ` Mi, Dapeng
  2025-03-07  9:15   ` Zhao Liu
  1 sibling, 0 replies; 63+ messages in thread
From: Mi, Dapeng @ 2025-03-05  7:03 UTC (permalink / raw)
  To: Dongli Zhang, qemu-devel, kvm
  Cc: pbonzini, zhao1.liu, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, xiaoyao.li, joe.jin


On 3/3/2025 6:00 AM, Dongli Zhang wrote:
> The initialization of 'has_architectural_pmu_version',
> 'num_architectural_pmu_gp_counters', and
> 'num_architectural_pmu_fixed_counters' is unrelated to the process of
> building the CPUID.
>
> Extract them out of kvm_x86_build_cpuid().
>
> No functional change.
>
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
> Changed since v1:
>   - Still extract the code, but call them for all CPUs.
>
>  target/i386/kvm/kvm.c | 66 +++++++++++++++++++++++++------------------
>  1 file changed, 39 insertions(+), 27 deletions(-)
>
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index 5c8a852dbd..8f293ffd61 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -1959,33 +1959,6 @@ static uint32_t kvm_x86_build_cpuid(CPUX86State *env,
>          }
>      }
>  
> -    if (limit >= 0x0a) {
> -        uint32_t eax, edx;
> -
> -        cpu_x86_cpuid(env, 0x0a, 0, &eax, &unused, &unused, &edx);
> -
> -        has_architectural_pmu_version = eax & 0xff;
> -        if (has_architectural_pmu_version > 0) {
> -            num_architectural_pmu_gp_counters = (eax & 0xff00) >> 8;
> -
> -            /* Shouldn't be more than 32, since that's the number of bits
> -             * available in EBX to tell us _which_ counters are available.
> -             * Play it safe.
> -             */
> -            if (num_architectural_pmu_gp_counters > MAX_GP_COUNTERS) {
> -                num_architectural_pmu_gp_counters = MAX_GP_COUNTERS;
> -            }
> -
> -            if (has_architectural_pmu_version > 1) {
> -                num_architectural_pmu_fixed_counters = edx & 0x1f;
> -
> -                if (num_architectural_pmu_fixed_counters > MAX_FIXED_COUNTERS) {
> -                    num_architectural_pmu_fixed_counters = MAX_FIXED_COUNTERS;
> -                }
> -            }
> -        }
> -    }
> -
>      cpu_x86_cpuid(env, 0x80000000, 0, &limit, &unused, &unused, &unused);
>  
>      for (i = 0x80000000; i <= limit; i++) {
> @@ -2085,6 +2058,43 @@ int kvm_arch_pre_create_vcpu(CPUState *cpu, Error **errp)
>      return 0;
>  }
>  
> +static void kvm_init_pmu_info(CPUX86State *env)
> +{
> +    uint32_t eax, edx;
> +    uint32_t unused;
> +    uint32_t limit;
> +
> +    cpu_x86_cpuid(env, 0, 0, &limit, &unused, &unused, &unused);
> +
> +    if (limit < 0x0a) {
> +        return;
> +    }
> +
> +    cpu_x86_cpuid(env, 0x0a, 0, &eax, &unused, &unused, &edx);
> +
> +    has_architectural_pmu_version = eax & 0xff;
> +    if (has_architectural_pmu_version > 0) {
> +        num_architectural_pmu_gp_counters = (eax & 0xff00) >> 8;
> +
> +        /*
> +         * Shouldn't be more than 32, since that's the number of bits
> +         * available in EBX to tell us _which_ counters are available.
> +         * Play it safe.
> +         */
> +        if (num_architectural_pmu_gp_counters > MAX_GP_COUNTERS) {
> +            num_architectural_pmu_gp_counters = MAX_GP_COUNTERS;
> +        }
> +
> +        if (has_architectural_pmu_version > 1) {
> +            num_architectural_pmu_fixed_counters = edx & 0x1f;
> +
> +            if (num_architectural_pmu_fixed_counters > MAX_FIXED_COUNTERS) {
> +                num_architectural_pmu_fixed_counters = MAX_FIXED_COUNTERS;
> +            }
> +        }
> +    }
> +}
> +
>  int kvm_arch_init_vcpu(CPUState *cs)
>  {
>      struct {
> @@ -2267,6 +2277,8 @@ int kvm_arch_init_vcpu(CPUState *cs)
>      cpuid_i = kvm_x86_build_cpuid(env, cpuid_data.entries, cpuid_i);
>      cpuid_data.cpuid.nent = cpuid_i;
>  
> +    kvm_init_pmu_info(env);
> +
>      if (((env->cpuid_version >> 8)&0xF) >= 6
>          && (env->features[FEAT_1_EDX] & (CPUID_MCE | CPUID_MCA)) ==
>             (CPUID_MCE | CPUID_MCA)) {

Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
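
For anyone reading along, the decoding that the extracted helper performs
on CPUID.0xA can be exercised in isolation. The sketch below is a
standalone restatement, not the QEMU code: the struct and function names
are hypothetical, and the caps are passed in as parameters instead of
using QEMU's MAX_GP_COUNTERS and MAX_FIXED_COUNTERS.

```c
#include <stdint.h>

/* Hypothetical container for the three values the helper initializes. */
struct pmu_info {
    uint32_t version;    /* CPUID.0xA EAX[7:0] */
    uint32_t num_gp;     /* CPUID.0xA EAX[15:8], clamped */
    uint32_t num_fixed;  /* CPUID.0xA EDX[4:0], clamped, version > 1 only */
};

static uint32_t clamp_u32(uint32_t value, uint32_t max)
{
    return value > max ? max : value;
}

/* Decode the PMU version and counter counts from CPUID.0xA EAX/EDX. */
static struct pmu_info decode_cpuid_0xa(uint32_t eax, uint32_t edx,
                                        uint32_t max_gp, uint32_t max_fixed)
{
    struct pmu_info info = { 0, 0, 0 };

    info.version = eax & 0xff;
    if (info.version > 0) {
        info.num_gp = clamp_u32((eax & 0xff00) >> 8, max_gp);

        /* Fixed counters are only enumerated from version 2 onwards. */
        if (info.version > 1) {
            info.num_fixed = clamp_u32(edx & 0x1f, max_fixed);
        }
    }
    return info;
}
```

A version of 0 leaves both counts at 0, which matches the later
"has_architectural_pmu_version > 0" guards around the MSR save/restore
code.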



^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v2 06/10] target/i386/kvm: rename architectural PMU variables
  2025-03-02 22:00 ` [PATCH v2 06/10] target/i386/kvm: rename architectural PMU variables Dongli Zhang
@ 2025-03-05  7:07   ` Mi, Dapeng
  2025-03-07  9:19   ` Zhao Liu
  1 sibling, 0 replies; 63+ messages in thread
From: Mi, Dapeng @ 2025-03-05  7:07 UTC (permalink / raw)
  To: Dongli Zhang, qemu-devel, kvm
  Cc: pbonzini, zhao1.liu, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, xiaoyao.li, joe.jin


On 3/3/2025 6:00 AM, Dongli Zhang wrote:
> AMD does not have what is commonly referred to as an architectural PMU.
> Therefore, we need to rename the following variables to be applicable for
> both Intel and AMD:
>
> - has_architectural_pmu_version
> - num_architectural_pmu_gp_counters
> - num_architectural_pmu_fixed_counters
>
> For Intel processors, the meaning of has_pmu_version remains unchanged.
>
> For AMD processors:
>
> has_pmu_version == 1 corresponds to versions before AMD PerfMonV2.
> has_pmu_version == 2 corresponds to AMD PerfMonV2.
>
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
>  target/i386/kvm/kvm.c | 49 ++++++++++++++++++++++++-------------------
>  1 file changed, 28 insertions(+), 21 deletions(-)
>
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index 8f293ffd61..e895d22f94 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -164,9 +164,16 @@ static bool has_msr_perf_capabs;
>  static bool has_msr_pkrs;
>  static bool has_msr_hwcr;
>  
> -static uint32_t has_architectural_pmu_version;
> -static uint32_t num_architectural_pmu_gp_counters;
> -static uint32_t num_architectural_pmu_fixed_counters;
> +/*
> + * For Intel processors, the meaning is the architectural PMU version
> + * number.
> + *
> + * For AMD processors: 1 corresponds to the prior versions, and 2
> + * corresponds to AMD PerfMonV2.
> + */
> +static uint32_t has_pmu_version;
> +static uint32_t num_pmu_gp_counters;
> +static uint32_t num_pmu_fixed_counters;
>  
>  static int has_xsave2;
>  static int has_xcrs;
> @@ -2072,24 +2079,24 @@ static void kvm_init_pmu_info(CPUX86State *env)
>  
>      cpu_x86_cpuid(env, 0x0a, 0, &eax, &unused, &unused, &edx);
>  
> -    has_architectural_pmu_version = eax & 0xff;
> -    if (has_architectural_pmu_version > 0) {
> -        num_architectural_pmu_gp_counters = (eax & 0xff00) >> 8;
> +    has_pmu_version = eax & 0xff;
> +    if (has_pmu_version > 0) {
> +        num_pmu_gp_counters = (eax & 0xff00) >> 8;
>  
>          /*
>           * Shouldn't be more than 32, since that's the number of bits
>           * available in EBX to tell us _which_ counters are available.
>           * Play it safe.
>           */
> -        if (num_architectural_pmu_gp_counters > MAX_GP_COUNTERS) {
> -            num_architectural_pmu_gp_counters = MAX_GP_COUNTERS;
> +        if (num_pmu_gp_counters > MAX_GP_COUNTERS) {
> +            num_pmu_gp_counters = MAX_GP_COUNTERS;
>          }
>  
> -        if (has_architectural_pmu_version > 1) {
> -            num_architectural_pmu_fixed_counters = edx & 0x1f;
> +        if (has_pmu_version > 1) {
> +            num_pmu_fixed_counters = edx & 0x1f;
>  
> -            if (num_architectural_pmu_fixed_counters > MAX_FIXED_COUNTERS) {
> -                num_architectural_pmu_fixed_counters = MAX_FIXED_COUNTERS;
> +            if (num_pmu_fixed_counters > MAX_FIXED_COUNTERS) {
> +                num_pmu_fixed_counters = MAX_FIXED_COUNTERS;
>              }
>          }
>      }
> @@ -4041,25 +4048,25 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>              kvm_msr_entry_add(cpu, MSR_KVM_POLL_CONTROL, env->poll_control_msr);
>          }
>  
> -        if (has_architectural_pmu_version > 0) {
> -            if (has_architectural_pmu_version > 1) {
> +        if (has_pmu_version > 0) {
> +            if (has_pmu_version > 1) {
>                  /* Stop the counter.  */
>                  kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
>                  kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_CTRL, 0);
>              }
>  
>              /* Set the counter values.  */
> -            for (i = 0; i < num_architectural_pmu_fixed_counters; i++) {
> +            for (i = 0; i < num_pmu_fixed_counters; i++) {
>                  kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR0 + i,
>                                    env->msr_fixed_counters[i]);
>              }
> -            for (i = 0; i < num_architectural_pmu_gp_counters; i++) {
> +            for (i = 0; i < num_pmu_gp_counters; i++) {
>                  kvm_msr_entry_add(cpu, MSR_P6_PERFCTR0 + i,
>                                    env->msr_gp_counters[i]);
>                  kvm_msr_entry_add(cpu, MSR_P6_EVNTSEL0 + i,
>                                    env->msr_gp_evtsel[i]);
>              }
> -            if (has_architectural_pmu_version > 1) {
> +            if (has_pmu_version > 1) {
>                  kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_STATUS,
>                                    env->msr_global_status);
>                  kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_OVF_CTRL,
> @@ -4519,17 +4526,17 @@ static int kvm_get_msrs(X86CPU *cpu)
>      if (env->features[FEAT_KVM] & CPUID_KVM_POLL_CONTROL) {
>          kvm_msr_entry_add(cpu, MSR_KVM_POLL_CONTROL, 1);
>      }
> -    if (has_architectural_pmu_version > 0) {
> -        if (has_architectural_pmu_version > 1) {
> +    if (has_pmu_version > 0) {
> +        if (has_pmu_version > 1) {
>              kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
>              kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_CTRL, 0);
>              kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_STATUS, 0);
>              kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_OVF_CTRL, 0);
>          }
> -        for (i = 0; i < num_architectural_pmu_fixed_counters; i++) {
> +        for (i = 0; i < num_pmu_fixed_counters; i++) {
>              kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR0 + i, 0);
>          }
> -        for (i = 0; i < num_architectural_pmu_gp_counters; i++) {
> +        for (i = 0; i < num_pmu_gp_counters; i++) {
>              kvm_msr_entry_add(cpu, MSR_P6_PERFCTR0 + i, 0);
>              kvm_msr_entry_add(cpu, MSR_P6_EVNTSEL0 + i, 0);
>          }

Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
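
For readers following the rename, the CPUID leaf 0xa decoding in the hunk above can be sketched standalone as below. This is only an illustration: `struct pmu_info`, `decode_cpuid_0xa()` and the two `SKETCH_MAX_*` limits are hypothetical names and values chosen for this example, not QEMU's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed limits for this sketch only; QEMU defines its own constants. */
#define SKETCH_MAX_GP_COUNTERS    18
#define SKETCH_MAX_FIXED_COUNTERS 3

/* Hypothetical container for the three values the patch keeps in globals. */
struct pmu_info {
    uint32_t version;
    uint32_t num_gp;
    uint32_t num_fixed;
};

/* Decode EAX/EDX of CPUID leaf 0xa the same way the quoted hunk does. */
static struct pmu_info decode_cpuid_0xa(uint32_t eax, uint32_t edx)
{
    struct pmu_info info = { 0, 0, 0 };

    /* EAX[7:0] is the PMU version; 0 means no architectural PMU. */
    info.version = eax & 0xff;
    if (info.version == 0) {
        return info;
    }

    /* EAX[15:8] is the number of general-purpose counters; clamp it. */
    info.num_gp = (eax & 0xff00) >> 8;
    if (info.num_gp > SKETCH_MAX_GP_COUNTERS) {
        info.num_gp = SKETCH_MAX_GP_COUNTERS;
    }

    /* Fixed counters are only reported from version 2 onwards (EDX[4:0]). */
    if (info.version > 1) {
        info.num_fixed = edx & 0x1f;
        if (info.num_fixed > SKETCH_MAX_FIXED_COUNTERS) {
            info.num_fixed = SKETCH_MAX_FIXED_COUNTERS;
        }
    }

    assert(info.num_gp <= SKETCH_MAX_GP_COUNTERS);
    return info;
}
```

With these assumed limits, an EAX of 0x0802 decodes to version 2 with 8 general-purpose counters, while a version 1 PMU reports no fixed counters regardless of EDX.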



^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset
  2025-03-02 22:00 ` [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset Dongli Zhang
@ 2025-03-05  7:33   ` Mi, Dapeng
  2025-03-05 11:41   ` Francesco Lavra
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 63+ messages in thread
From: Mi, Dapeng @ 2025-03-05  7:33 UTC (permalink / raw)
  To: Dongli Zhang, qemu-devel, kvm
  Cc: pbonzini, zhao1.liu, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, xiaoyao.li, joe.jin


On 3/3/2025 6:00 AM, Dongli Zhang wrote:
> QEMU uses the kvm_get_msrs() function to save Intel PMU registers from KVM
> and kvm_put_msrs() to restore them to KVM. However, there is no support for
> AMD PMU registers. Currently, has_pmu_version and num_pmu_gp_counters are
> initialized based on cpuid(0xa), which does not apply to AMD processors.
> For AMD CPUs, prior to PerfMonV2, the number of general-purpose registers
> is determined based on the CPU version.
>
> To address this issue, we need to add support for AMD PMU registers.
> Without this support, the following problems can arise:
>
> 1. If the VM is reset (e.g., via QEMU system_reset or VM kdump/kexec) while
> running "perf top", the PMU registers are not disabled properly.
>
> 2. Despite x86_cpu_reset() resetting many registers to zero, kvm_put_msrs()
> does not handle AMD PMU registers, causing some PMU events to remain
> enabled in KVM.
>
> 3. The KVM kvm_pmc_speculative_in_use() function consistently returns true,
> preventing the reclamation of these events. Consequently, the
> kvm_pmc->perf_event remains active.
>
> 4. After a reboot, the VM kernel may report the following error:
>
> [    0.092011] Performance Events: Fam17h+ core perfctr, Broken BIOS detected, complain to your hardware vendor.
> [    0.092023] [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR c0010200 is 530076)
>
> 5. In the worst case, the active kvm_pmc->perf_event may inject unknown
> NMIs randomly into the VM kernel:
>
> [...] Uhhuh. NMI received for unknown reason 30 on CPU 0.
>
> To resolve these issues, we propose resetting AMD PMU registers during the
> VM reset process.
>
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
> Changed since v1:
>   - Modify "MSR_K7_EVNTSEL0 + 3" and "MSR_K7_PERFCTR0 + 3" by using
>     AMD64_NUM_COUNTERS (suggested by Sandipan Das).
>   - Use "AMD64_NUM_COUNTERS_CORE * 2 - 1", not "MSR_F15H_PERF_CTL0 + 0xb".
>     (suggested by Sandipan Das).
>   - Switch back to "-pmu" instead of using a global "pmu-cap-disabled".
>   - Don't initialize PMU info if kvm.enable_pmu=N.
>
>  target/i386/cpu.h     |   8 ++
>  target/i386/kvm/kvm.c | 173 +++++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 177 insertions(+), 4 deletions(-)
>
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index c67b42d34f..319600672b 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -490,6 +490,14 @@ typedef enum X86Seg {
>  #define MSR_CORE_PERF_GLOBAL_CTRL       0x38f
>  #define MSR_CORE_PERF_GLOBAL_OVF_CTRL   0x390
>  
> +#define MSR_K7_EVNTSEL0                 0xc0010000
> +#define MSR_K7_PERFCTR0                 0xc0010004
> +#define MSR_F15H_PERF_CTL0              0xc0010200
> +#define MSR_F15H_PERF_CTR0              0xc0010201
> +
> +#define AMD64_NUM_COUNTERS              4
> +#define AMD64_NUM_COUNTERS_CORE         6
> +
>  #define MSR_MC0_CTL                     0x400
>  #define MSR_MC0_STATUS                  0x401
>  #define MSR_MC0_ADDR                    0x402
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index efba3ae7a4..d4be8a0d2e 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -2069,7 +2069,7 @@ int kvm_arch_pre_create_vcpu(CPUState *cpu, Error **errp)
>      return 0;
>  }
>  
> -static void kvm_init_pmu_info(CPUX86State *env)
> +static void kvm_init_pmu_info_intel(CPUX86State *env)
>  {
>      uint32_t eax, edx;
>      uint32_t unused;
> @@ -2106,6 +2106,94 @@ static void kvm_init_pmu_info(CPUX86State *env)
>      }
>  }
>  
> +static void kvm_init_pmu_info_amd(CPUX86State *env)
> +{
> +    uint32_t unused;
> +    int64_t family;
> +    uint32_t ecx;
> +
> +    has_pmu_version = 0;
> +
> +    /*
> +     * To determine the CPU family, the following code is derived from
> +     * x86_cpuid_version_get_family().
> +     */
> +    family = (env->cpuid_version >> 8) & 0xf;
> +    if (family == 0xf) {
> +        family += (env->cpuid_version >> 20) & 0xff;
> +    }
> +
> +    /*
> +     * Performance-monitoring supported from K7 and later.
> +     */
> +    if (family < 6) {
> +        return;
> +    }
> +
> +    has_pmu_version = 1;
> +
> +    cpu_x86_cpuid(env, 0x80000001, 0, &unused, &unused, &ecx, &unused);
> +
> +    if (!(ecx & CPUID_EXT3_PERFCORE)) {
> +        num_pmu_gp_counters = AMD64_NUM_COUNTERS;
> +        return;
> +    }
> +
> +    num_pmu_gp_counters = AMD64_NUM_COUNTERS_CORE;
> +}
> +
> +static bool is_same_vendor(CPUX86State *env)
> +{
> +    static uint32_t host_cpuid_vendor1;
> +    static uint32_t host_cpuid_vendor2;
> +    static uint32_t host_cpuid_vendor3;
> +
> +    host_cpuid(0x0, 0, NULL, &host_cpuid_vendor1, &host_cpuid_vendor3,
> +               &host_cpuid_vendor2);
> +
> +    return env->cpuid_vendor1 == host_cpuid_vendor1 &&
> +           env->cpuid_vendor2 == host_cpuid_vendor2 &&
> +           env->cpuid_vendor3 == host_cpuid_vendor3;
> +}
> +
> +static void kvm_init_pmu_info(CPUState *cs)
> +{
> +    X86CPU *cpu = X86_CPU(cs);
> +    CPUX86State *env = &cpu->env;
> +
> +    /*
> +     * The PMU virtualization is disabled by kvm.enable_pmu=N.
> +     */
> +    if (kvm_pmu_disabled) {
> +        return;
> +    }
> +
> +    /*
> +     * It is not supported to virtualize AMD PMU registers on Intel
> +     * processors, nor to virtualize Intel PMU registers on AMD processors.
> +     */
> +    if (!is_same_vendor(env)) {
> +        return;
> +    }
> +
> +    /*
> +     * If KVM_CAP_PMU_CAPABILITY is not supported, there is no way to
> +     * disable the AMD pmu virtualization.
> +     *
> +     * If KVM_CAP_PMU_CAPABILITY is supported !cpu->enable_pmu
> +     * indicates the KVM has already disabled the PMU virtualization.
> +     */
> +    if (has_pmu_cap && !cpu->enable_pmu) {
> +        return;
> +    }
> +
> +    if (IS_INTEL_CPU(env)) {
> +        kvm_init_pmu_info_intel(env);
> +    } else if (IS_AMD_CPU(env)) {
> +        kvm_init_pmu_info_amd(env);
> +    }
> +}
> +
>  int kvm_arch_init_vcpu(CPUState *cs)
>  {
>      struct {
> @@ -2288,7 +2376,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
>      cpuid_i = kvm_x86_build_cpuid(env, cpuid_data.entries, cpuid_i);
>      cpuid_data.cpuid.nent = cpuid_i;
>  
> -    kvm_init_pmu_info(env);
> +    kvm_init_pmu_info(cs);
>  
>      if (((env->cpuid_version >> 8)&0xF) >= 6
>          && (env->features[FEAT_1_EDX] & (CPUID_MCE | CPUID_MCA)) ==
> @@ -4064,7 +4152,7 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>              kvm_msr_entry_add(cpu, MSR_KVM_POLL_CONTROL, env->poll_control_msr);
>          }
>  
> -        if (has_pmu_version > 0) {
> +        if (IS_INTEL_CPU(env) && has_pmu_version > 0) {
>              if (has_pmu_version > 1) {
>                  /* Stop the counter.  */
>                  kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
> @@ -4095,6 +4183,38 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>                                    env->msr_global_ctrl);
>              }
>          }
> +
> +        if (IS_AMD_CPU(env) && has_pmu_version > 0) {
> +            uint32_t sel_base = MSR_K7_EVNTSEL0;
> +            uint32_t ctr_base = MSR_K7_PERFCTR0;
> +            /*
> +             * The address of the next selector or counter register is
> +             * obtained by incrementing the address of the current selector
> +             * or counter register by one.
> +             */
> +            uint32_t step = 1;
> +
> +            /*
> +             * When PERFCORE is enabled, AMD PMU uses a separate set of
> +             * addresses for the selector and counter registers.
> +             * Additionally, the address of the next selector or counter
> +             * register is determined by incrementing the address of the
> +             * current register by two.
> +             */
> +            if (num_pmu_gp_counters == AMD64_NUM_COUNTERS_CORE) {
> +                sel_base = MSR_F15H_PERF_CTL0;
> +                ctr_base = MSR_F15H_PERF_CTR0;
> +                step = 2;
> +            }
> +
> +            for (i = 0; i < num_pmu_gp_counters; i++) {
> +                kvm_msr_entry_add(cpu, ctr_base + i * step,
> +                                  env->msr_gp_counters[i]);
> +                kvm_msr_entry_add(cpu, sel_base + i * step,
> +                                  env->msr_gp_evtsel[i]);
> +            }
> +        }
> +
>          /*
>           * Hyper-V partition-wide MSRs: to avoid clearing them on cpu hot-add,
>           * only sync them to KVM on the first cpu
> @@ -4542,7 +4662,8 @@ static int kvm_get_msrs(X86CPU *cpu)
>      if (env->features[FEAT_KVM] & CPUID_KVM_POLL_CONTROL) {
>          kvm_msr_entry_add(cpu, MSR_KVM_POLL_CONTROL, 1);
>      }
> -    if (has_pmu_version > 0) {
> +
> +    if (IS_INTEL_CPU(env) && has_pmu_version > 0) {
>          if (has_pmu_version > 1) {
>              kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
>              kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_CTRL, 0);
> @@ -4558,6 +4679,35 @@ static int kvm_get_msrs(X86CPU *cpu)
>          }
>      }
>  
> +    if (IS_AMD_CPU(env) && has_pmu_version > 0) {
> +        uint32_t sel_base = MSR_K7_EVNTSEL0;
> +        uint32_t ctr_base = MSR_K7_PERFCTR0;
> +        /*
> +         * The address of the next selector or counter register is
> +         * obtained by incrementing the address of the current selector
> +         * or counter register by one.
> +         */
> +        uint32_t step = 1;
> +
> +        /*
> +         * When PERFCORE is enabled, AMD PMU uses a separate set of
> +         * addresses for the selector and counter registers.
> +         * Additionally, the address of the next selector or counter
> +         * register is determined by incrementing the address of the
> +         * current register by two.
> +         */
> +        if (num_pmu_gp_counters == AMD64_NUM_COUNTERS_CORE) {
> +            sel_base = MSR_F15H_PERF_CTL0;
> +            ctr_base = MSR_F15H_PERF_CTR0;
> +            step = 2;
> +        }
> +
> +        for (i = 0; i < num_pmu_gp_counters; i++) {
> +            kvm_msr_entry_add(cpu, ctr_base + i * step, 0);
> +            kvm_msr_entry_add(cpu, sel_base + i * step, 0);
> +        }
> +    }
> +
>      if (env->mcg_cap) {
>          kvm_msr_entry_add(cpu, MSR_MCG_STATUS, 0);
>          kvm_msr_entry_add(cpu, MSR_MCG_CTL, 0);
> @@ -4869,6 +5019,21 @@ static int kvm_get_msrs(X86CPU *cpu)
>          case MSR_P6_EVNTSEL0 ... MSR_P6_EVNTSEL0 + MAX_GP_COUNTERS - 1:
>              env->msr_gp_evtsel[index - MSR_P6_EVNTSEL0] = msrs[i].data;
>              break;
> +        case MSR_K7_EVNTSEL0 ... MSR_K7_EVNTSEL0 + AMD64_NUM_COUNTERS - 1:
> +            env->msr_gp_evtsel[index - MSR_K7_EVNTSEL0] = msrs[i].data;
> +            break;
> +        case MSR_K7_PERFCTR0 ... MSR_K7_PERFCTR0 + AMD64_NUM_COUNTERS - 1:
> +            env->msr_gp_counters[index - MSR_K7_PERFCTR0] = msrs[i].data;
> +            break;
> +        case MSR_F15H_PERF_CTL0 ...
> +             MSR_F15H_PERF_CTL0 + AMD64_NUM_COUNTERS_CORE * 2 - 1:
> +            index = index - MSR_F15H_PERF_CTL0;
> +            if (index & 0x1) {
> +                env->msr_gp_counters[index] = msrs[i].data;
> +            } else {
> +                env->msr_gp_evtsel[index] = msrs[i].data;
> +            }
> +            break;
>          case HV_X64_MSR_HYPERCALL:
>              env->msr_hv_hypercall = msrs[i].data;
>              break;

LGTM, but I'll leave it to an AMD PMU expert to review.
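
As an aid for whoever reviews the AMD side, the family computation and the two MSR layouts described in the patch comments can be sketched standalone as below. The MSR and counter-count macros are copied from the quoted cpu.h hunk; the helper names (`amd_family()`, `amd_evtsel_msr()`, `amd_counter_msr()`, `f15h_msr_is_counter()`, `f15h_msr_slot()`) are hypothetical and exist only in this example. The slot helper divides the offset by two so that the result matches the loop index `i` used on the put side; this is an assumption about the intended round trip, not a statement about the hunk above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in the quoted cpu.h hunk. */
#define MSR_K7_EVNTSEL0         0xc0010000u
#define MSR_K7_PERFCTR0         0xc0010004u
#define MSR_F15H_PERF_CTL0      0xc0010200u
#define MSR_F15H_PERF_CTR0      0xc0010201u
#define AMD64_NUM_COUNTERS      4
#define AMD64_NUM_COUNTERS_CORE 6

/* Base family, plus the extended family when the base family is 0xf. */
static uint32_t amd_family(uint32_t cpuid_version)
{
    uint32_t family = (cpuid_version >> 8) & 0xf;

    if (family == 0xf) {
        family += (cpuid_version >> 20) & 0xff;
    }
    return family;
}

/* Selector MSR for counter i: step 1 (legacy K7) or step 2 (PERFCORE). */
static uint32_t amd_evtsel_msr(bool perfcore, uint32_t i)
{
    return perfcore ? MSR_F15H_PERF_CTL0 + i * 2 : MSR_K7_EVNTSEL0 + i;
}

/* Counter MSR for counter i, using the same stepping as the selector. */
static uint32_t amd_counter_msr(bool perfcore, uint32_t i)
{
    return perfcore ? MSR_F15H_PERF_CTR0 + i * 2 : MSR_K7_PERFCTR0 + i;
}

/* In the interleaved F15H range, odd offsets are counters, even are selectors. */
static bool f15h_msr_is_counter(uint32_t msr)
{
    uint32_t off = msr - MSR_F15H_PERF_CTL0;

    assert(off < AMD64_NUM_COUNTERS_CORE * 2);
    return off & 0x1;
}

/* Array slot for an F15H MSR: each counter occupies two consecutive MSRs. */
static uint32_t f15h_msr_slot(uint32_t msr)
{
    uint32_t off = msr - MSR_F15H_PERF_CTL0;

    assert(off < AMD64_NUM_COUNTERS_CORE * 2);
    return off / 2;
}
```

For example, a `cpuid_version` of 0x00800f12 yields family 0x17, and with PERFCORE the sixth counter (i = 5) lives at 0xc001020b with its selector at 0xc001020a.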



^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v2 10/10] target/i386/kvm: don't stop Intel PMU counters
  2025-03-02 22:00 ` [PATCH v2 10/10] target/i386/kvm: don't stop Intel PMU counters Dongli Zhang
@ 2025-03-05  7:35   ` Mi, Dapeng
  2025-03-05 19:00     ` dongli.zhang
  0 siblings, 1 reply; 63+ messages in thread
From: Mi, Dapeng @ 2025-03-05  7:35 UTC (permalink / raw)
  To: Dongli Zhang, qemu-devel, kvm
  Cc: pbonzini, zhao1.liu, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, xiaoyao.li, joe.jin


On 3/3/2025 6:00 AM, Dongli Zhang wrote:
> The kvm_put_msrs() sets the MSRs using KVM_SET_MSRS. The x86 KVM processes
> these MSRs one by one in a loop, only saving the config and triggering the
> KVM_REQ_PMU request. This approach does not immediately stop the event
> before updating PMC.
>
> In addition, PMU MSRs are set only at levels >= KVM_PUT_RESET_STATE,
> excluding runtime. Therefore, updating these MSRs without stopping events
> should be acceptable.

I suppose this also works for the upcoming mediated vPMU? If so, please
mention it here. Thanks.


>
> Finally, KVM creates kernel perf events with host mode excluded
> (exclude_host = 1). While the events remain active, they don't increment
> the counter during QEMU vCPU userspace mode.
>
> No Fixes tag is going to be added for the commit 0d89436786b0 ("kvm:
> migrate vPMU state"), because this isn't a bugfix.
>
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
>  target/i386/kvm/kvm.c | 9 ---------
>  1 file changed, 9 deletions(-)
>
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index c5911baef0..4902694129 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -4160,13 +4160,6 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>          }
>  
>          if (IS_INTEL_CPU(env) && has_pmu_version > 0) {
> -            if (has_pmu_version > 1) {
> -                /* Stop the counter.  */
> -                kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
> -                kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_CTRL, 0);
> -            }
> -
> -            /* Set the counter values.  */
>              for (i = 0; i < num_pmu_fixed_counters; i++) {
>                  kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR0 + i,
>                                    env->msr_fixed_counters[i]);
> @@ -4182,8 +4175,6 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>                                    env->msr_global_status);
>                  kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_OVF_CTRL,
>                                    env->msr_global_ovf_ctrl);
> -
> -                /* Now start the PMU.  */
>                  kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR_CTRL,
>                                    env->msr_fixed_ctr_ctrl);
>                  kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_CTRL,

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset
  2025-03-02 22:00 ` [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset Dongli Zhang
  2025-03-05  7:33   ` Mi, Dapeng
@ 2025-03-05 11:41   ` Francesco Lavra
  2025-03-05 19:05     ` dongli.zhang
  2025-03-07  7:38   ` Sandipan Das
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 63+ messages in thread
From: Francesco Lavra @ 2025-03-05 11:41 UTC (permalink / raw)
  To: dongli.zhang
  Cc: alexander.ivanov, babu.moger, dapeng1.mi, davydov-max, den, groug,
	joe.jin, khorenko, kvm, like.xu.linux, likexu, mtosatti, pbonzini,
	qemu-devel, sandipan.das, xiaoyao.li, zhao1.liu, zhenyuw

On 2025-03-02 at 22:00, Dongli Zhang wrote:
> +static bool is_same_vendor(CPUX86State *env)
> +{
> +    static uint32_t host_cpuid_vendor1;
> +    static uint32_t host_cpuid_vendor2;
> +    static uint32_t host_cpuid_vendor3;

What's the purpose of making these variables static?

> +    host_cpuid(0x0, 0, NULL, &host_cpuid_vendor1, &host_cpuid_vendor3,
> +               &host_cpuid_vendor2);
> +
> +    return env->cpuid_vendor1 == host_cpuid_vendor1 &&
> +           env->cpuid_vendor2 == host_cpuid_vendor2 &&
> +           env->cpuid_vendor3 == host_cpuid_vendor3;
> +}
> +
> +static void kvm_init_pmu_info(CPUState *cs)
> +{
> +    X86CPU *cpu = X86_CPU(cs);
> +    CPUX86State *env = &cpu->env;
> +
> +    /*
> +     * The PMU virtualization is disabled by kvm.enable_pmu=N.
> +     */
> +    if (kvm_pmu_disabled) {
> +        return;
> +    }
> +
> +    /*
> +     * It is not supported to virtualize AMD PMU registers on Intel
> +     * processors, nor to virtualize Intel PMU registers on AMD processors.
> +     */
> +    if (!is_same_vendor(env)) {
> +        return;
> +    }
> +
> +    /*
> +     * If KVM_CAP_PMU_CAPABILITY is not supported, there is no way to
> +     * disable the AMD pmu virtualization.

s/pmu/PMU/

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v2 01/10] target/i386: disable PerfMonV2 when PERFCORE unavailable
  2025-03-02 22:00 ` [PATCH v2 01/10] target/i386: disable PerfMonV2 when PERFCORE unavailable Dongli Zhang
  2025-03-04 14:40   ` Xiaoyao Li
@ 2025-03-05 14:20   ` Zhao Liu
  2025-03-07  7:24   ` Sandipan Das
  2 siblings, 0 replies; 63+ messages in thread
From: Zhao Liu @ 2025-03-05 14:20 UTC (permalink / raw)
  To: Dongli Zhang
  Cc: qemu-devel, kvm, pbonzini, mtosatti, sandipan.das, babu.moger,
	likexu, like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov,
	den, davydov-max, xiaoyao.li, dapeng1.mi, joe.jin

On Sun, Mar 02, 2025 at 02:00:09PM -0800, Dongli Zhang wrote:
> Date: Sun,  2 Mar 2025 14:00:09 -0800
> From: Dongli Zhang <dongli.zhang@oracle.com>
> Subject: [PATCH v2 01/10] target/i386: disable PerfMonV2 when PERFCORE
>  unavailable
> X-Mailer: git-send-email 2.43.5
> 
> When the PERFCORE is disabled with "-cpu host,-perfctr-core", it is
> reflected in the guest dmesg.
> 
> [    0.285136] Performance Events: AMD PMU driver.
> 
> However, the guest CPUID indicates the PerfMonV2 is still available.
> 
> CPU:
>    Extended Performance Monitoring and Debugging (0x80000022):
>       AMD performance monitoring V2         = true
>       AMD LBR V2                            = false
>       AMD LBR stack & PMC freezing          = false
>       number of core perf ctrs              = 0x6 (6)
>       number of LBR stack entries           = 0x0 (0)
>       number of avail Northbridge perf ctrs = 0x0 (0)
>       number of available UMC PMCs          = 0x0 (0)
>       active UMCs bitmask                   = 0x0
> 
> Disable PerfMonV2 in CPUID when PERFCORE is disabled.
> 
> Suggested-by: Zhao Liu <zhao1.liu@intel.com>
> Fixes: 209b0ac12074 ("target/i386: Add PerfMonV2 feature bit")
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
> Changed since v1:
>   - Use feature_dependencies (suggested by Zhao Liu).
> 
>  target/i386/cpu.c | 4 ++++
>  1 file changed, 4 insertions(+)

Thanks!

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v2 04/10] target/i386/kvm: set KVM_PMU_CAP_DISABLE if "-pmu" is configured
  2025-03-04  7:59   ` Xiaoyao Li
  2025-03-05  1:22     ` Sean Christopherson
@ 2025-03-05 14:41     ` Zhao Liu
  2025-03-05 20:13       ` dongli.zhang
  1 sibling, 1 reply; 63+ messages in thread
From: Zhao Liu @ 2025-03-05 14:41 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Dongli Zhang, qemu-devel, kvm, pbonzini, mtosatti, sandipan.das,
	babu.moger, likexu, like.xu.linux, zhenyuw, groug, khorenko,
	alexander.ivanov, den, davydov-max, dapeng1.mi, joe.jin

> > +        if (has_pmu_cap && !X86_CPU(cpu)->enable_pmu) {
> 
> One nit, it's safer to use
> 
> 	(has_pmu_cap & KVM_PMU_CAP_DISABLE) && !X86_CPU(cpu)->enable_pmu
> 
> Maybe we can rename has_pmu_cap to pmu_cap as well.

Yes, I agree.
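
A minimal sketch of the suggested check, assuming the variable is renamed to `pmu_cap` and holds the capability bitmask returned by KVM. `KVM_PMU_CAP_DISABLE` is defined locally here as bit 0 so that the example builds on its own; `pmu_disabled_via_cap()` is a hypothetical helper used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Defined locally so the sketch is self-contained (bit 0 of the bitmask). */
#define KVM_PMU_CAP_DISABLE (1 << 0)

/*
 * True only when KVM advertises the DISABLE capability bit and the user
 * turned the PMU off. Testing the specific bit, rather than "any bit set",
 * keeps the check correct if further capability bits are added later.
 */
static bool pmu_disabled_via_cap(uint64_t pmu_cap, bool enable_pmu)
{
    return (pmu_cap & KVM_PMU_CAP_DISABLE) && !enable_pmu;
}
```

With a plain truthiness test, a mask that has some other bit set but not bit 0 would be treated as if the PMU could be disabled; the masked form returns false in that case.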

Regards,
Zhao


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v2 04/10] target/i386/kvm: set KVM_PMU_CAP_DISABLE if "-pmu" is configured
  2025-03-02 22:00 ` [PATCH v2 04/10] target/i386/kvm: set KVM_PMU_CAP_DISABLE if "-pmu" is configured Dongli Zhang
  2025-03-04  7:59   ` Xiaoyao Li
@ 2025-03-05 14:44   ` Zhao Liu
  1 sibling, 0 replies; 63+ messages in thread
From: Zhao Liu @ 2025-03-05 14:44 UTC (permalink / raw)
  To: Dongli Zhang
  Cc: qemu-devel, kvm, pbonzini, mtosatti, sandipan.das, babu.moger,
	likexu, like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov,
	den, davydov-max, xiaoyao.li, dapeng1.mi, joe.jin

On Sun, Mar 02, 2025 at 02:00:12PM -0800, Dongli Zhang wrote:
> Date: Sun,  2 Mar 2025 14:00:12 -0800
> From: Dongli Zhang <dongli.zhang@oracle.com>
> Subject: [PATCH v2 04/10] target/i386/kvm: set KVM_PMU_CAP_DISABLE if
>  "-pmu" is configured
> X-Mailer: git-send-email 2.43.5
> 
> Although AMD PERFCORE and PerfMonV2 are removed when "-pmu" is configured,
> there is no way to fully disable KVM AMD PMU virtualization. Neither
> "-cpu host,-pmu" nor "-cpu EPYC" achieves this.
> 
> As a result, the following message still appears in the VM dmesg:
> 
> [    0.263615] Performance Events: AMD PMU driver.
> 
> However, the expected output should be:
> 
> [    0.596381] Performance Events: PMU not available due to virtualization, using software events only.
> [    0.600972] NMI watchdog: Perf NMI watchdog permanently disabled
> 
> This occurs because AMD does not use any CPUID bit to indicate PMU
> availability.
> 
> To address this, KVM_CAP_PMU_CAPABILITY is used to set KVM_PMU_CAP_DISABLE
> when "-pmu" is configured.
> 
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
> Changed since v1:
>   - Switch back to the initial implementation with "-pmu".
> https://lore.kernel.org/all/20221119122901.2469-3-dongli.zhang@oracle.com
>   - Mention that "KVM_PMU_CAP_DISABLE doesn't change the PMU behavior on
>     Intel platform because current "pmu" property works as expected."
> 
>  target/i386/kvm/kvm.c | 31 +++++++++++++++++++++++++++++++
>  1 file changed, 31 insertions(+)

Overall LGTM, with Xiaoyao's comment addressed :-)

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>

Thanks,
Zhao


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v2 03/10] [DO NOT MERGE] kvm: Introduce kvm_arch_pre_create_vcpu()
  2025-03-02 22:00 ` [PATCH v2 03/10] [DO NOT MERGE] kvm: Introduce kvm_arch_pre_create_vcpu() Dongli Zhang
@ 2025-03-05 14:46   ` Zhao Liu
  2025-03-05 21:53     ` dongli.zhang
  0 siblings, 1 reply; 63+ messages in thread
From: Zhao Liu @ 2025-03-05 14:46 UTC (permalink / raw)
  To: Dongli Zhang
  Cc: qemu-devel, kvm, pbonzini, mtosatti, sandipan.das, babu.moger,
	likexu, like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov,
	den, davydov-max, xiaoyao.li, dapeng1.mi, joe.jin

On Sun, Mar 02, 2025 at 02:00:11PM -0800, Dongli Zhang wrote:
> Date: Sun,  2 Mar 2025 14:00:11 -0800
> From: Dongli Zhang <dongli.zhang@oracle.com>
> Subject: [PATCH v2 03/10] [DO NOT MERGE] kvm: Introduce
>  kvm_arch_pre_create_vcpu()
> X-Mailer: git-send-email 2.43.5
> 
> From: Xiaoyao Li <xiaoyao.li@intel.com>
> 
> Introduce kvm_arch_pre_create_vcpu() to perform arch-dependent
> work prior to creating any vCPU. This is for i386 TDX because it needs
> to call TDX_INIT_VM before creating any vCPU.
> 
> The specific i386 implementation will be added in a future patch.
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> Acked-by: Gerd Hoffmann <kraxel@redhat.com>

Your Signed-off-by is missing...

(When you send the patch, it's better to attach your own Signed-off-by :-))

> ---
> I previously sent a version:
> https://lore.kernel.org/all/20221119122901.2469-2-dongli.zhang@oracle.com/
> I just picked the one from Xiaoyao's patchset, as Dapeng may use this
> version as well.
> https://lore.kernel.org/all/20250124132048.3229049-8-xiaoyao.li@intel.com/
> 
>  accel/kvm/kvm-all.c        | 5 +++++
>  include/system/kvm.h       | 1 +
>  target/arm/kvm.c           | 5 +++++
>  target/i386/kvm/kvm.c      | 5 +++++
>  target/loongarch/kvm/kvm.c | 5 +++++
>  target/mips/kvm.c          | 5 +++++
>  target/ppc/kvm.c           | 5 +++++
>  target/riscv/kvm/kvm-cpu.c | 5 +++++
>  target/s390x/kvm/kvm.c     | 5 +++++
>  9 files changed, 41 insertions(+)

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v2 10/10] target/i386/kvm: don't stop Intel PMU counters
  2025-03-05  7:35   ` Mi, Dapeng
@ 2025-03-05 19:00     ` dongli.zhang
  2025-03-06  1:38       ` Mi, Dapeng
  0 siblings, 1 reply; 63+ messages in thread
From: dongli.zhang @ 2025-03-05 19:00 UTC (permalink / raw)
  To: Mi, Dapeng, qemu-devel, kvm
  Cc: pbonzini, zhao1.liu, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, xiaoyao.li, joe.jin

Hi Dapeng,

On 3/4/25 11:35 PM, Mi, Dapeng wrote:
> 
> On 3/3/2025 6:00 AM, Dongli Zhang wrote:
>> The kvm_put_msrs() sets the MSRs using KVM_SET_MSRS. The x86 KVM processes
>> these MSRs one by one in a loop, only saving the config and triggering the
>> KVM_REQ_PMU request. This approach does not immediately stop the event
>> before updating PMC.
>>
>> In addition, PMU MSRs are set only at levels >= KVM_PUT_RESET_STATE,
>> excluding runtime. Therefore, updating these MSRs without stopping events
>> should be acceptable.
> 
> I suppose this also works for the upcoming mediated vPMU? If so, please
> mention it here. Thanks.

TBH I am not sure whether it works for mediated vPMU. The entire patchset is
based on the current implementation in the mainline Linux kernel.

Otherwise, AMD's implementation would also need to be modified, that
is, to stop the AMD general PMCs or global registers (PerfMonV2).

How about considering only the case without mediated vPMU for now?

1. Users whose servers lack PerfMonV2 only need this patchset to reset the
general PMCs.

2. Users with PerfMonV2 servers need an extra patch to reset the global
registers.

3. For mediated vPMU, we may add an extra patch in the future.

Thank you very much!

Dongli Zhang

> 
> 
>>
>> Finally, KVM creates kernel perf events with host mode excluded
>> (exclude_host = 1). While the events remain active, they don't increment
>> the counter during QEMU vCPU userspace mode.
>>
>> No Fixes tag is going to be added for the commit 0d89436786b0 ("kvm:
>> migrate vPMU state"), because this isn't a bugfix.
>>
>> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
>> ---
>>  target/i386/kvm/kvm.c | 9 ---------
>>  1 file changed, 9 deletions(-)
>>
>> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
>> index c5911baef0..4902694129 100644
>> --- a/target/i386/kvm/kvm.c
>> +++ b/target/i386/kvm/kvm.c
>> @@ -4160,13 +4160,6 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>>          }
>>  
>>          if (IS_INTEL_CPU(env) && has_pmu_version > 0) {
>> -            if (has_pmu_version > 1) {
>> -                /* Stop the counter.  */
>> -                kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
>> -                kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_CTRL, 0);
>> -            }
>> -
>> -            /* Set the counter values.  */
>>              for (i = 0; i < num_pmu_fixed_counters; i++) {
>>                  kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR0 + i,
>>                                    env->msr_fixed_counters[i]);
>> @@ -4182,8 +4175,6 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>>                                    env->msr_global_status);
>>                  kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_OVF_CTRL,
>>                                    env->msr_global_ovf_ctrl);
>> -
>> -                /* Now start the PMU.  */
>>                  kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR_CTRL,
>>                                    env->msr_fixed_ctr_ctrl);
>>                  kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_CTRL,


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset
  2025-03-05 11:41   ` Francesco Lavra
@ 2025-03-05 19:05     ` dongli.zhang
  0 siblings, 0 replies; 63+ messages in thread
From: dongli.zhang @ 2025-03-05 19:05 UTC (permalink / raw)
  To: Francesco Lavra
  Cc: alexander.ivanov, babu.moger, dapeng1.mi, davydov-max, den, groug,
	joe.jin, khorenko, kvm, like.xu.linux, likexu, mtosatti, pbonzini,
	qemu-devel, sandipan.das, xiaoyao.li, zhao1.liu, zhenyuw

Hi Francesco,

On 3/5/25 3:41 AM, Francesco Lavra wrote:
> On 2025-03-02 at 22:00, Dongli Zhang wrote:
>> +static bool is_same_vendor(CPUX86State *env)
>> +{
>> +    static uint32_t host_cpuid_vendor1;
>> +    static uint32_t host_cpuid_vendor2;
>> +    static uint32_t host_cpuid_vendor3;
> 
> What's the purpose of making these variables static?

My fault.

I originally made them global during development in case other users
needed them in the future, but finally decided to move them into the
function as local variables.

I erroneously copied 'static' along with the variables.

Thank you very much for identifying the issue.
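
For reference, the comparison without any static storage could look roughly like the sketch below. It is standalone and does not call QEMU's host_cpuid(): `struct cpu_vendor`, `vendor_from_string()` and `same_vendor_strings()` are hypothetical names for this example, and the vendor words are built from the 12-character vendor string instead of being read from CPUID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* The three vendor words of CPUID leaf 0, in EBX, EDX, ECX order. */
struct cpu_vendor {
    uint32_t v1; /* EBX: characters 0..3  */
    uint32_t v2; /* EDX: characters 4..7  */
    uint32_t v3; /* ECX: characters 8..11 */
};

/* Pack a 12-character vendor string such as "AuthenticAMD" into words. */
static struct cpu_vendor vendor_from_string(const char *s)
{
    struct cpu_vendor v;

    assert(strlen(s) == 12);
    memcpy(&v.v1, s, 4);
    memcpy(&v.v2, s + 4, 4);
    memcpy(&v.v3, s + 8, 4);
    return v;
}

/* Plain automatic variables are enough: nothing persists across calls. */
static bool is_same_vendor(const struct cpu_vendor *guest,
                           const struct cpu_vendor *host)
{
    return guest->v1 == host->v1 &&
           guest->v2 == host->v2 &&
           guest->v3 == host->v3;
}

/* Convenience wrapper for trying the comparison with vendor strings. */
static bool same_vendor_strings(const char *guest, const char *host)
{
    struct cpu_vendor g = vendor_from_string(guest);
    struct cpu_vendor h = vendor_from_string(host);

    return is_same_vendor(&g, &h);
}
```

In the actual patch the host words would come from host_cpuid() and the guest words from env->cpuid_vendor1..3; only the storage class of the locals changes.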

Dongli Zhang

> 
>> +    host_cpuid(0x0, 0, NULL, &host_cpuid_vendor1, &host_cpuid_vendor3,
>> +               &host_cpuid_vendor2);
>> +
>> +    return env->cpuid_vendor1 == host_cpuid_vendor1 &&
>> +           env->cpuid_vendor2 == host_cpuid_vendor2 &&
>> +           env->cpuid_vendor3 == host_cpuid_vendor3;
>> +}
>> +
>> +static void kvm_init_pmu_info(CPUState *cs)
>> +{
>> +    X86CPU *cpu = X86_CPU(cs);
>> +    CPUX86State *env = &cpu->env;
>> +
>> +    /*
>> +     * The PMU virtualization is disabled by kvm.enable_pmu=N.
>> +     */
>> +    if (kvm_pmu_disabled) {
>> +        return;
>> +    }
>> +
>> +    /*
>> +     * It is not supported to virtualize AMD PMU registers on Intel
>> +     * processors, nor to virtualize Intel PMU registers on AMD
>> processors.
>> +     */
>> +    if (!is_same_vendor(env)) {
>> +        return;
>> +    }
>> +
>> +    /*
>> +     * If KVM_CAP_PMU_CAPABILITY is not supported, there is no way
>> to
>> +     * disable the AMD pmu virtualization.
> 
> s/pmu/PMU/



* Re: [PATCH v2 04/10] target/i386/kvm: set KVM_PMU_CAP_DISABLE if "-pmu" is configured
  2025-03-05 14:41     ` Zhao Liu
@ 2025-03-05 20:13       ` dongli.zhang
  0 siblings, 0 replies; 63+ messages in thread
From: dongli.zhang @ 2025-03-05 20:13 UTC (permalink / raw)
  To: Zhao Liu, Xiaoyao Li
  Cc: qemu-devel, kvm, pbonzini, mtosatti, sandipan.das, babu.moger,
	likexu, like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov,
	den, davydov-max, dapeng1.mi, joe.jin

Hi Xiaoyao and Zhao,

On 3/5/25 6:41 AM, Zhao Liu wrote:
>>> +        if (has_pmu_cap && !X86_CPU(cpu)->enable_pmu) {
>>
>> One nit, it's safer to use
>>
>> 	(has_pmu_cap & KVM_PMU_CAP_DISABLE) && !X86_CPU(cpu)->enable_pmu
>>
>> Maybe we can rename has_pmu_cap to pmu_cap as well.
> 
> Yes, I agree.
> 

Thanks both of you very much!

I also need to modify PATCH 08/10 where has_pmu_cap is used.

[PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset

Dongli Zhang
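
The nit discussed above comes down to testing the specific capability bit
rather than the truthiness of the whole mask. A minimal sketch, assuming
KVM_PMU_CAP_DISABLE is bit 0 as in the Linux UAPI header; the helper name
should_disable_pmu() and its parameters are hypothetical and used only for
illustration, not taken from the patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value: mirrors KVM_PMU_CAP_DISABLE (bit 0) from the Linux UAPI. */
#define EXAMPLE_PMU_CAP_DISABLE (1u << 0)

/*
 * Hypothetical helper: pmu_cap is the mask reported for
 * KVM_CAP_PMU_CAPABILITY, enable_pmu mirrors the "pmu" CPU property.
 * Returns true only when the disable capability itself is advertised
 * and the user asked for the PMU to be off.
 */
bool should_disable_pmu(uint32_t pmu_cap, bool enable_pmu)
{
    return (pmu_cap & EXAMPLE_PMU_CAP_DISABLE) && !enable_pmu;
}
```

With a plain truthiness test, a mask that carried only some other, future
capability bit would be treated as if the disable capability were present;
the masked test does not have that problem.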



* Re: [PATCH v2 03/10] [DO NOT MERGE] kvm: Introduce kvm_arch_pre_create_vcpu()
  2025-03-05 14:46   ` Zhao Liu
@ 2025-03-05 21:53     ` dongli.zhang
  2025-03-07  7:52       ` Zhao Liu
  0 siblings, 1 reply; 63+ messages in thread
From: dongli.zhang @ 2025-03-05 21:53 UTC (permalink / raw)
  To: Zhao Liu
  Cc: qemu-devel, kvm, pbonzini, mtosatti, sandipan.das, babu.moger,
	likexu, like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov,
	den, davydov-max, xiaoyao.li, dapeng1.mi, joe.jin

Hi Zhao,

On 3/5/25 6:46 AM, Zhao Liu wrote:
> On Sun, Mar 02, 2025 at 02:00:11PM -0800, Dongli Zhang wrote:
>> Date: Sun,  2 Mar 2025 14:00:11 -0800
>> From: Dongli Zhang <dongli.zhang@oracle.com>
>> Subject: [PATCH v2 03/10] [DO NOT MERGE] kvm: Introduce
>>  kvm_arch_pre_create_vcpu()
>> X-Mailer: git-send-email 2.43.5
>>
>> From: Xiaoyao Li <xiaoyao.li@intel.com>
>>
>> Introduce kvm_arch_pre_create_vcpu(), to perform arch-dependent
>> work prior to creating any vcpu. This is for i386 TDX because it needs
>> to call TDX_INIT_VM before creating any vcpu.
>>
>> The specific implementation for i386 will be added in a future patch.
>>
>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>> Acked-by: Gerd Hoffmann <kraxel@redhat.com>
> 
> Your Signed-off is missing...
> 
> (When you send the patch, it's better to attach your own Signed-off :-))

Thank you very much!

I didn't know if I would need to wait until this patch is merged into
mainline QEMU. That's why I didn't add my signed-off.

I will add in v3 and remove "DO NOT MERGE" if the patch isn't in QEMU when
I am sending out v3.

Dongli Zhang

> 
>> ---
>> I used to send a version:
>> https://lore.kernel.org/all/20221119122901.2469-2-dongli.zhang@oracle.com/
>> Just pick the one from Xiaoyao's patchset as Dapeng may use this version
>> as well.
>> https://lore.kernel.org/all/20250124132048.3229049-8-xiaoyao.li@intel.com/
>>
>>  accel/kvm/kvm-all.c        | 5 +++++
>>  include/system/kvm.h       | 1 +
>>  target/arm/kvm.c           | 5 +++++
>>  target/i386/kvm/kvm.c      | 5 +++++
>>  target/loongarch/kvm/kvm.c | 5 +++++
>>  target/mips/kvm.c          | 5 +++++
>>  target/ppc/kvm.c           | 5 +++++
>>  target/riscv/kvm/kvm-cpu.c | 5 +++++
>>  target/s390x/kvm/kvm.c     | 5 +++++
>>  9 files changed, 41 insertions(+)
> 
> Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
> 



* Re: [PATCH v2 10/10] target/i386/kvm: don't stop Intel PMU counters
  2025-03-05 19:00     ` dongli.zhang
@ 2025-03-06  1:38       ` Mi, Dapeng
  0 siblings, 0 replies; 63+ messages in thread
From: Mi, Dapeng @ 2025-03-06  1:38 UTC (permalink / raw)
  To: dongli.zhang, qemu-devel, kvm
  Cc: pbonzini, zhao1.liu, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, xiaoyao.li, joe.jin


On 3/6/2025 3:00 AM, dongli.zhang@oracle.com wrote:
> Hi Dapeng,
>
> On 3/4/25 11:35 PM, Mi, Dapeng wrote:
>> On 3/3/2025 6:00 AM, Dongli Zhang wrote:
>>> The kvm_put_msrs() sets the MSRs using KVM_SET_MSRS. The x86 KVM processes
>>> these MSRs one by one in a loop, only saving the config and triggering the
>>> KVM_REQ_PMU request. This approach does not immediately stop the event
>>> before updating PMC.
>>>
>>> In addition, PMU MSRs are set only at levels >= KVM_PUT_RESET_STATE,
>>> excluding runtime. Therefore, updating these MSRs without stopping events
>>> should be acceptable.
>> Suppose this works for upcoming mediated vPMU as well? If so, please
>> mention it here. Thanks.
> TBH I am not sure if it works for mediated vPMU. The entire patchset is
> based the current implementation in mainline linux kernel.
>
> Otherwise, it is also required to modify the AMD's implementation ... that
> is, to stop AMD general PMCs or global registers (PerfMonV2).
>
> How about only consider the case without mediated vPMU so far?
>
> 1. For user without PerfMonV2 servers, they only need the patchset to reset
> general PMCs.
>
> 2. For user with PerfMonV2 servers, they need extra patch to reset global
> registers.
>
> 3. For mediated vPMU, we may add extra patch in the future.

I suppose it should be fine for mediated vPMU but I have no bandwidth to
thoroughly test it.

Ok, I think it's fine not to consider mediated vPMU in this patch series,
we can look at it later. Thanks.


>
> Thank you very much!
>
> Dongli Zhang
>
>>
>>> Finally, KVM creates kernel perf events with host mode excluded
>>> (exclude_host = 1). While the events remain active, they don't increment
>>> the counter during QEMU vCPU userspace mode.
>>>
>>> No Fixed tag is going to be added for the commit 0d89436786b0 ("kvm:
>>> migrate vPMU state"), because this isn't a bugfix.
>>>
>>> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
>>> ---
>>>  target/i386/kvm/kvm.c | 9 ---------
>>>  1 file changed, 9 deletions(-)
>>>
>>> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
>>> index c5911baef0..4902694129 100644
>>> --- a/target/i386/kvm/kvm.c
>>> +++ b/target/i386/kvm/kvm.c
>>> @@ -4160,13 +4160,6 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>>>          }
>>>  
>>>          if (IS_INTEL_CPU(env) && has_pmu_version > 0) {
>>> -            if (has_pmu_version > 1) {
>>> -                /* Stop the counter.  */
>>> -                kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
>>> -                kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_CTRL, 0);
>>> -            }
>>> -
>>> -            /* Set the counter values.  */
>>>              for (i = 0; i < num_pmu_fixed_counters; i++) {
>>>                  kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR0 + i,
>>>                                    env->msr_fixed_counters[i]);
>>> @@ -4182,8 +4175,6 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>>>                                    env->msr_global_status);
>>>                  kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_OVF_CTRL,
>>>                                    env->msr_global_ovf_ctrl);
>>> -
>>> -                /* Now start the PMU.  */
>>>                  kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR_CTRL,
>>>                                    env->msr_fixed_ctr_ctrl);
>>>                  kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_CTRL,


* Re: [PATCH v2 02/10] target/i386: disable PERFCORE when "-pmu" is configured
  2025-03-02 22:00 ` [PATCH v2 02/10] target/i386: disable PERFCORE when "-pmu" is configured Dongli Zhang
  2025-03-03  1:59   ` Xiaoyao Li
@ 2025-03-06 16:50   ` Zhao Liu
  2025-03-06 17:47     ` dongli.zhang
  1 sibling, 1 reply; 63+ messages in thread
From: Zhao Liu @ 2025-03-06 16:50 UTC (permalink / raw)
  To: Dongli Zhang
  Cc: qemu-devel, kvm, pbonzini, mtosatti, sandipan.das, babu.moger,
	likexu, like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov,
	den, davydov-max, xiaoyao.li, dapeng1.mi, joe.jin

Hi Dongli,

> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index b6d6167910..61a671028a 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -7115,6 +7115,10 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>              !(env->hflags & HF_LMA_MASK)) {
>              *edx &= ~CPUID_EXT2_SYSCALL;
>          }
> +
> +        if (kvm_enabled() && IS_AMD_CPU(env) && !cpu->enable_pmu) {

No need to check "kvm_enabled() && IS_AMD_CPU(env)" because:

 * "pmu" is a general CPU property option which should cover all PMU
   related features, and not kvm-specific/vendor-specific.
 * this bit is reserved on Intel. So the following operation doesn't
   affect Intel.

I think Xiaoyao's idea about checking in x86_cpu_expand_features() is
good. And I believe it's worth having another cleanup series to revisit
pmu dependencies. I can help you later to consolidate and move this
check to x86_cpu_expand_features(), so this patch can focus on correctly
defining the current dependency relationship.

With the above nit fixed,

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
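
The simplification suggested above can be sketched as follows: with the
kvm_enabled() and vendor checks dropped, the PERFCORE bit is masked
whenever the "pmu" property is off. This is only an illustration; the bit
position (CPUID 0x80000001 ECX bit 23) is stated here as an assumption, and
filter_ext3_ecx() is a hypothetical helper, not a QEMU function:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value: CPUID 0x80000001 ECX bit 23 (PerfCtrExtCore). */
#define EXAMPLE_CPUID_EXT3_PERFCORE (1u << 23)

/*
 * Hypothetical helper: returns the ECX value the guest should see.
 * No accelerator or vendor check is needed because the bit is reserved
 * on Intel, so clearing it there has no effect.
 */
uint32_t filter_ext3_ecx(uint32_t ecx, bool enable_pmu)
{
    if (!enable_pmu) {
        ecx &= ~EXAMPLE_CPUID_EXT3_PERFCORE;
    }
    return ecx;
}
```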


* Re: [PATCH v2 02/10] target/i386: disable PERFCORE when "-pmu" is configured
  2025-03-06 16:50   ` Zhao Liu
@ 2025-03-06 17:47     ` dongli.zhang
  2025-03-07  7:41       ` Zhao Liu
  0 siblings, 1 reply; 63+ messages in thread
From: dongli.zhang @ 2025-03-06 17:47 UTC (permalink / raw)
  To: Zhao Liu
  Cc: qemu-devel, kvm, pbonzini, mtosatti, sandipan.das, babu.moger,
	likexu, like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov,
	den, davydov-max, xiaoyao.li, dapeng1.mi, joe.jin

Hi Zhao,

On 3/6/25 8:50 AM, Zhao Liu wrote:
> Hi Dongli,
> 
>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>> index b6d6167910..61a671028a 100644
>> --- a/target/i386/cpu.c
>> +++ b/target/i386/cpu.c
>> @@ -7115,6 +7115,10 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>>              !(env->hflags & HF_LMA_MASK)) {
>>              *edx &= ~CPUID_EXT2_SYSCALL;
>>          }
>> +
>> +        if (kvm_enabled() && IS_AMD_CPU(env) && !cpu->enable_pmu) {
> 
> No need to check "kvm_enabled() && IS_AMD_CPU(env)" because:
> 
>  * "pmu" is a general CPU property option which should cover all PMU
>    related features, and not kvm-specific/vendor-specific.
>  * this bit is reserved on Intel. So the following operation doesn't
>    affect Intel.
> 
> I think Xiaoyao's idea about checking in x86_cpu_expand_features() is
> good. And I believe it's worth having another cleanup series to revisit
> pmu dependencies. I can help you later to consolidate and move this
> check to x86_cpu_expand_features(), so this patch can focus on correctly
> defining the current dependency relationship.

That means I don't need to change anything except:

1. Remove "kvm_enabled() && IS_AMD_CPU(env)" since the bit is reserved by
Intel.

2. Add your Reviewed-by.

Thank you very much!

Dongli Zhang

> 
> With the above nit fixed,
> 
> Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
> 
> 
> 



* Re: [PATCH v2 01/10] target/i386: disable PerfMonV2 when PERFCORE unavailable
  2025-03-02 22:00 ` [PATCH v2 01/10] target/i386: disable PerfMonV2 when PERFCORE unavailable Dongli Zhang
  2025-03-04 14:40   ` Xiaoyao Li
  2025-03-05 14:20   ` Zhao Liu
@ 2025-03-07  7:24   ` Sandipan Das
  2 siblings, 0 replies; 63+ messages in thread
From: Sandipan Das @ 2025-03-07  7:24 UTC (permalink / raw)
  To: Dongli Zhang, qemu-devel, kvm
  Cc: pbonzini, zhao1.liu, mtosatti, babu.moger, likexu, like.xu.linux,
	zhenyuw, groug, khorenko, alexander.ivanov, den, davydov-max,
	xiaoyao.li, dapeng1.mi, joe.jin

On 3/3/2025 3:30 AM, Dongli Zhang wrote:
> When PERFCORE is disabled with "-cpu host,-perfctr-core", it is
> reflected in the guest dmesg.
> 
> [    0.285136] Performance Events: AMD PMU driver.
> 
> However, the guest CPUID indicates the PerfMonV2 is still available.
> 
> CPU:
>    Extended Performance Monitoring and Debugging (0x80000022):
>       AMD performance monitoring V2         = true
>       AMD LBR V2                            = false
>       AMD LBR stack & PMC freezing          = false
>       number of core perf ctrs              = 0x6 (6)
>       number of LBR stack entries           = 0x0 (0)
>       number of avail Northbridge perf ctrs = 0x0 (0)
>       number of available UMC PMCs          = 0x0 (0)
>       active UMCs bitmask                   = 0x0
> 
> Disable PerfMonV2 in CPUID when PERFCORE is disabled.
> 
> Suggested-by: Zhao Liu <zhao1.liu@intel.com>
> Fixes: 209b0ac12074 ("target/i386: Add PerfMonV2 feature bit")
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
> Changed since v1:
>   - Use feature_dependencies (suggested by Zhao Liu).
> 
>  target/i386/cpu.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 72ab147e85..b6d6167910 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -1805,6 +1805,10 @@ static FeatureDep feature_dependencies[] = {
>          .from = { FEAT_7_1_EDX,             CPUID_7_1_EDX_AVX10 },
>          .to = { FEAT_24_0_EBX,              ~0ull },
>      },
> +    {
> +        .from = { FEAT_8000_0001_ECX,       CPUID_EXT3_PERFCORE },
> +        .to = { FEAT_8000_0022_EAX,         CPUID_8000_0022_EAX_PERFMON_V2 },
> +    },
>  };
>  
>  typedef struct X86RegisterInfo32 {


Reviewed-by: Sandipan Das <sandipan.das@amd.com>
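
For readers unfamiliar with the dependency table, the idea behind the new
entry can be sketched as below: when the "from" feature is absent, the
"to" feature is cleared. This is a simplified model under stated
assumptions, not QEMU's implementation (which also tracks user-requested
features and reports unavailable ones); all Ex* and ex_* names and the two
bit positions are hypothetical or assumed:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature-word indices for this sketch. */
enum { EX_WORD_EXT3_ECX, EX_WORD_8000_0022_EAX, EX_WORD_COUNT };

#define EX_PERFCORE   (1ull << 23)  /* assumed bit position */
#define EX_PERFMON_V2 (1ull << 0)   /* assumed bit position */

typedef struct { int word; uint64_t mask; } ExFeatureMask;
typedef struct { ExFeatureMask from, to; } ExFeatureDep;

static const ExFeatureDep ex_deps[] = {
    { { EX_WORD_EXT3_ECX, EX_PERFCORE },
      { EX_WORD_8000_0022_EAX, EX_PERFMON_V2 } },
};

/* Clear every dependent feature whose prerequisite is not set. */
void ex_apply_deps(uint64_t features[EX_WORD_COUNT])
{
    for (size_t i = 0; i < sizeof(ex_deps) / sizeof(ex_deps[0]); i++) {
        const ExFeatureDep *d = &ex_deps[i];

        if (!(features[d->from.word] & d->from.mask)) {
            features[d->to.word] &= ~d->to.mask;
        }
    }
}
```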



* Re: [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset
  2025-03-02 22:00 ` [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset Dongli Zhang
  2025-03-05  7:33   ` Mi, Dapeng
  2025-03-05 11:41   ` Francesco Lavra
@ 2025-03-07  7:38   ` Sandipan Das
  2025-03-10  7:47   ` Zhao Liu
  2025-03-28  6:29   ` ewanhai
  4 siblings, 0 replies; 63+ messages in thread
From: Sandipan Das @ 2025-03-07  7:38 UTC (permalink / raw)
  To: Dongli Zhang, qemu-devel, kvm
  Cc: pbonzini, zhao1.liu, mtosatti, babu.moger, likexu, like.xu.linux,
	zhenyuw, groug, khorenko, alexander.ivanov, den, davydov-max,
	xiaoyao.li, dapeng1.mi, joe.jin

On 3/3/2025 3:30 AM, Dongli Zhang wrote:
> QEMU uses the kvm_get_msrs() function to save Intel PMU registers from KVM
> and kvm_put_msrs() to restore them to KVM. However, there is no support for
> AMD PMU registers. Currently, has_pmu_version and num_pmu_gp_counters are
> initialized based on cpuid(0xa), which does not apply to AMD processors.
> For AMD CPUs, prior to PerfMonV2, the number of general-purpose registers
> is determined based on the CPU version.
> 
> To address this issue, we need to add support for AMD PMU registers.
> Without this support, the following problems can arise:
> 
> 1. If the VM is reset (e.g., via QEMU system_reset or VM kdump/kexec) while
> running "perf top", the PMU registers are not disabled properly.
> 
> 2. Despite x86_cpu_reset() resetting many registers to zero, kvm_put_msrs()
> does not handle AMD PMU registers, causing some PMU events to remain
> enabled in KVM.
> 
> 3. The KVM kvm_pmc_speculative_in_use() function consistently returns true,
> preventing the reclamation of these events. Consequently, the
> kvm_pmc->perf_event remains active.
> 
> 4. After a reboot, the VM kernel may report the following error:
> 
> [    0.092011] Performance Events: Fam17h+ core perfctr, Broken BIOS detected, complain to your hardware vendor.
> [    0.092023] [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR c0010200 is 530076)
> 
> 5. In the worst case, the active kvm_pmc->perf_event may inject unknown
> NMIs randomly into the VM kernel:
> 
> [...] Uhhuh. NMI received for unknown reason 30 on CPU 0.
> 
> To resolve these issues, we propose resetting AMD PMU registers during the
> VM reset process.
> 
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
> Changed since v1:
>   - Modify "MSR_K7_EVNTSEL0 + 3" and "MSR_K7_PERFCTR0 + 3" by using
>     AMD64_NUM_COUNTERS (suggested by Sandipan Das).
>   - Use "AMD64_NUM_COUNTERS_CORE * 2 - 1", not "MSR_F15H_PERF_CTL0 + 0xb".
>     (suggested by Sandipan Das).
>   - Switch back to "-pmu" instead of using a global "pmu-cap-disabled".
>   - Don't initialize PMU info if kvm.enable_pmu=N.
> 
>  target/i386/cpu.h     |   8 ++
>  target/i386/kvm/kvm.c | 173 +++++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 177 insertions(+), 4 deletions(-)
> 
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index c67b42d34f..319600672b 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -490,6 +490,14 @@ typedef enum X86Seg {
>  #define MSR_CORE_PERF_GLOBAL_CTRL       0x38f
>  #define MSR_CORE_PERF_GLOBAL_OVF_CTRL   0x390
>  
> +#define MSR_K7_EVNTSEL0                 0xc0010000
> +#define MSR_K7_PERFCTR0                 0xc0010004
> +#define MSR_F15H_PERF_CTL0              0xc0010200
> +#define MSR_F15H_PERF_CTR0              0xc0010201
> +
> +#define AMD64_NUM_COUNTERS              4
> +#define AMD64_NUM_COUNTERS_CORE         6
> +
>  #define MSR_MC0_CTL                     0x400
>  #define MSR_MC0_STATUS                  0x401
>  #define MSR_MC0_ADDR                    0x402
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index efba3ae7a4..d4be8a0d2e 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -2069,7 +2069,7 @@ int kvm_arch_pre_create_vcpu(CPUState *cpu, Error **errp)
>      return 0;
>  }
>  
> -static void kvm_init_pmu_info(CPUX86State *env)
> +static void kvm_init_pmu_info_intel(CPUX86State *env)
>  {
>      uint32_t eax, edx;
>      uint32_t unused;
> @@ -2106,6 +2106,94 @@ static void kvm_init_pmu_info(CPUX86State *env)
>      }
>  }
>  
> +static void kvm_init_pmu_info_amd(CPUX86State *env)
> +{
> +    uint32_t unused;
> +    int64_t family;
> +    uint32_t ecx;
> +
> +    has_pmu_version = 0;
> +
> +    /*
> +     * To determine the CPU family, the following code is derived from
> +     * x86_cpuid_version_get_family().
> +     */
> +    family = (env->cpuid_version >> 8) & 0xf;
> +    if (family == 0xf) {
> +        family += (env->cpuid_version >> 20) & 0xff;
> +    }
> +
> +    /*
> +     * Performance-monitoring supported from K7 and later.
> +     */
> +    if (family < 6) {
> +        return;
> +    }
> +
> +    has_pmu_version = 1;
> +
> +    cpu_x86_cpuid(env, 0x80000001, 0, &unused, &unused, &ecx, &unused);
> +
> +    if (!(ecx & CPUID_EXT3_PERFCORE)) {
> +        num_pmu_gp_counters = AMD64_NUM_COUNTERS;
> +        return;
> +    }
> +
> +    num_pmu_gp_counters = AMD64_NUM_COUNTERS_CORE;
> +}
> +
> +static bool is_same_vendor(CPUX86State *env)
> +{
> +    static uint32_t host_cpuid_vendor1;
> +    static uint32_t host_cpuid_vendor2;
> +    static uint32_t host_cpuid_vendor3;
> +
> +    host_cpuid(0x0, 0, NULL, &host_cpuid_vendor1, &host_cpuid_vendor3,
> +               &host_cpuid_vendor2);
> +
> +    return env->cpuid_vendor1 == host_cpuid_vendor1 &&
> +           env->cpuid_vendor2 == host_cpuid_vendor2 &&
> +           env->cpuid_vendor3 == host_cpuid_vendor3;
> +}
> +
> +static void kvm_init_pmu_info(CPUState *cs)
> +{
> +    X86CPU *cpu = X86_CPU(cs);
> +    CPUX86State *env = &cpu->env;
> +
> +    /*
> +     * The PMU virtualization is disabled by kvm.enable_pmu=N.
> +     */
> +    if (kvm_pmu_disabled) {
> +        return;
> +    }
> +
> +    /*
> +     * It is not supported to virtualize AMD PMU registers on Intel
> +     * processors, nor to virtualize Intel PMU registers on AMD processors.
> +     */
> +    if (!is_same_vendor(env)) {
> +        return;
> +    }
> +
> +    /*
> +     * If KVM_CAP_PMU_CAPABILITY is not supported, there is no way to
> +     * disable the AMD pmu virtualization.
> +     *
> +     * If KVM_CAP_PMU_CAPABILITY is supported !cpu->enable_pmu
> +     * indicates the KVM has already disabled the PMU virtualization.
> +     */
> +    if (has_pmu_cap && !cpu->enable_pmu) {
> +        return;
> +    }
> +
> +    if (IS_INTEL_CPU(env)) {
> +        kvm_init_pmu_info_intel(env);
> +    } else if (IS_AMD_CPU(env)) {
> +        kvm_init_pmu_info_amd(env);
> +    }
> +}
> +
>  int kvm_arch_init_vcpu(CPUState *cs)
>  {
>      struct {
> @@ -2288,7 +2376,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
>      cpuid_i = kvm_x86_build_cpuid(env, cpuid_data.entries, cpuid_i);
>      cpuid_data.cpuid.nent = cpuid_i;
>  
> -    kvm_init_pmu_info(env);
> +    kvm_init_pmu_info(cs);
>  
>      if (((env->cpuid_version >> 8)&0xF) >= 6
>          && (env->features[FEAT_1_EDX] & (CPUID_MCE | CPUID_MCA)) ==
> @@ -4064,7 +4152,7 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>              kvm_msr_entry_add(cpu, MSR_KVM_POLL_CONTROL, env->poll_control_msr);
>          }
>  
> -        if (has_pmu_version > 0) {
> +        if (IS_INTEL_CPU(env) && has_pmu_version > 0) {
>              if (has_pmu_version > 1) {
>                  /* Stop the counter.  */
>                  kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
> @@ -4095,6 +4183,38 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>                                    env->msr_global_ctrl);
>              }
>          }
> +
> +        if (IS_AMD_CPU(env) && has_pmu_version > 0) {
> +            uint32_t sel_base = MSR_K7_EVNTSEL0;
> +            uint32_t ctr_base = MSR_K7_PERFCTR0;
> +            /*
> +             * The address of the next selector or counter register is
> +             * obtained by incrementing the address of the current selector
> +             * or counter register by one.
> +             */
> +            uint32_t step = 1;
> +
> +            /*
> +             * When PERFCORE is enabled, AMD PMU uses a separate set of
> +             * addresses for the selector and counter registers.
> +             * Additionally, the address of the next selector or counter
> +             * register is determined by incrementing the address of the
> +             * current register by two.
> +             */
> +            if (num_pmu_gp_counters == AMD64_NUM_COUNTERS_CORE) {
> +                sel_base = MSR_F15H_PERF_CTL0;
> +                ctr_base = MSR_F15H_PERF_CTR0;
> +                step = 2;
> +            }
> +
> +            for (i = 0; i < num_pmu_gp_counters; i++) {
> +                kvm_msr_entry_add(cpu, ctr_base + i * step,
> +                                  env->msr_gp_counters[i]);
> +                kvm_msr_entry_add(cpu, sel_base + i * step,
> +                                  env->msr_gp_evtsel[i]);
> +            }
> +        }
> +
>          /*
>           * Hyper-V partition-wide MSRs: to avoid clearing them on cpu hot-add,
>           * only sync them to KVM on the first cpu
> @@ -4542,7 +4662,8 @@ static int kvm_get_msrs(X86CPU *cpu)
>      if (env->features[FEAT_KVM] & CPUID_KVM_POLL_CONTROL) {
>          kvm_msr_entry_add(cpu, MSR_KVM_POLL_CONTROL, 1);
>      }
> -    if (has_pmu_version > 0) {
> +
> +    if (IS_INTEL_CPU(env) && has_pmu_version > 0) {
>          if (has_pmu_version > 1) {
>              kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
>              kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_CTRL, 0);
> @@ -4558,6 +4679,35 @@ static int kvm_get_msrs(X86CPU *cpu)
>          }
>      }
>  
> +    if (IS_AMD_CPU(env) && has_pmu_version > 0) {
> +        uint32_t sel_base = MSR_K7_EVNTSEL0;
> +        uint32_t ctr_base = MSR_K7_PERFCTR0;
> +        /*
> +         * The address of the next selector or counter register is
> +         * obtained by incrementing the address of the current selector
> +         * or counter register by one.
> +         */
> +        uint32_t step = 1;
> +
> +        /*
> +         * When PERFCORE is enabled, AMD PMU uses a separate set of
> +         * addresses for the selector and counter registers.
> +         * Additionally, the address of the next selector or counter
> +         * register is determined by incrementing the address of the
> +         * current register by two.
> +         */
> +        if (num_pmu_gp_counters == AMD64_NUM_COUNTERS_CORE) {
> +            sel_base = MSR_F15H_PERF_CTL0;
> +            ctr_base = MSR_F15H_PERF_CTR0;
> +            step = 2;
> +        }
> +
> +        for (i = 0; i < num_pmu_gp_counters; i++) {
> +            kvm_msr_entry_add(cpu, ctr_base + i * step, 0);
> +            kvm_msr_entry_add(cpu, sel_base + i * step, 0);
> +        }
> +    }
> +
>      if (env->mcg_cap) {
>          kvm_msr_entry_add(cpu, MSR_MCG_STATUS, 0);
>          kvm_msr_entry_add(cpu, MSR_MCG_CTL, 0);
> @@ -4869,6 +5019,21 @@ static int kvm_get_msrs(X86CPU *cpu)
>          case MSR_P6_EVNTSEL0 ... MSR_P6_EVNTSEL0 + MAX_GP_COUNTERS - 1:
>              env->msr_gp_evtsel[index - MSR_P6_EVNTSEL0] = msrs[i].data;
>              break;
> +        case MSR_K7_EVNTSEL0 ... MSR_K7_EVNTSEL0 + AMD64_NUM_COUNTERS - 1:
> +            env->msr_gp_evtsel[index - MSR_K7_EVNTSEL0] = msrs[i].data;
> +            break;
> +        case MSR_K7_PERFCTR0 ... MSR_K7_PERFCTR0 + AMD64_NUM_COUNTERS - 1:
> +            env->msr_gp_counters[index - MSR_K7_PERFCTR0] = msrs[i].data;
> +            break;
> +        case MSR_F15H_PERF_CTL0 ...
> +             MSR_F15H_PERF_CTL0 + AMD64_NUM_COUNTERS_CORE * 2 - 1:
> +            index = index - MSR_F15H_PERF_CTL0;
> +            if (index & 0x1) {
> +                env->msr_gp_counters[index] = msrs[i].data;
> +            } else {
> +                env->msr_gp_evtsel[index] = msrs[i].data;
> +            }
> +            break;
>          case HV_X64_MSR_HYPERCALL:
>              env->msr_hv_hypercall = msrs[i].data;
>              break;


Reviewed-by: Sandipan Das <sandipan.das@amd.com>
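
The two AMD register layouts described in the patch comments can be
sketched as address arithmetic: the legacy K7 selectors and counters are
contiguous (step 1), while the PERFCORE registers interleave selector and
counter (step 2). The MSR numbers are the ones defined in the quoted diff;
the ex_* helper names are hypothetical. The division by two in the decode
helper is this sketch's own choice so that it inverts the address
computation; it is not taken from the quoted diff:

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_MSR_K7_EVNTSEL0    0xc0010000u
#define EX_MSR_K7_PERFCTR0    0xc0010004u
#define EX_MSR_F15H_PERF_CTL0 0xc0010200u
#define EX_MSR_F15H_PERF_CTR0 0xc0010201u

/* Event-select MSR address for general-purpose counter i. */
uint32_t ex_amd_sel_msr(bool perfcore, unsigned i)
{
    return perfcore ? EX_MSR_F15H_PERF_CTL0 + 2 * i
                    : EX_MSR_K7_EVNTSEL0 + i;
}

/* Counter MSR address for general-purpose counter i. */
uint32_t ex_amd_ctr_msr(bool perfcore, unsigned i)
{
    return perfcore ? EX_MSR_F15H_PERF_CTR0 + 2 * i
                    : EX_MSR_K7_PERFCTR0 + i;
}

/*
 * Reverse mapping for the PERFCORE range: the parity of the offset
 * tells selector (even) from counter (odd), and offset / 2 is the
 * counter number.
 */
unsigned ex_f15h_counter_index(uint32_t msr, bool *is_counter)
{
    uint32_t off = msr - EX_MSR_F15H_PERF_CTL0;

    *is_counter = (off & 1u) != 0;
    return off / 2;
}
```

For example, counter 5 in PERFCORE mode maps to selector 0xc001020a and
counter 0xc001020b, and decoding 0xc001020b yields index 5 again.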



* Re: [PATCH v2 02/10] target/i386: disable PERFCORE when "-pmu" is configured
  2025-03-06 17:47     ` dongli.zhang
@ 2025-03-07  7:41       ` Zhao Liu
  0 siblings, 0 replies; 63+ messages in thread
From: Zhao Liu @ 2025-03-07  7:41 UTC (permalink / raw)
  To: dongli.zhang
  Cc: qemu-devel, kvm, pbonzini, mtosatti, sandipan.das, babu.moger,
	likexu, like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov,
	den, davydov-max, xiaoyao.li, dapeng1.mi, joe.jin

> 1. Remove "kvm_enabled() && IS_AMD_CPU(env)" since the bit is reserved by
> Intel.
> 
> 2. Add your Reviewed-by.

Yes, this is exactly what I mean!

Regards,
Zhao



* Re: [PATCH v2 03/10] [DO NOT MERGE] kvm: Introduce kvm_arch_pre_create_vcpu()
  2025-03-05 21:53     ` dongli.zhang
@ 2025-03-07  7:52       ` Zhao Liu
  2025-03-07  8:40         ` Xiaoyao Li
  0 siblings, 1 reply; 63+ messages in thread
From: Zhao Liu @ 2025-03-07  7:52 UTC (permalink / raw)
  To: dongli.zhang
  Cc: qemu-devel, kvm, pbonzini, mtosatti, sandipan.das, babu.moger,
	likexu, like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov,
	den, davydov-max, xiaoyao.li, dapeng1.mi, joe.jin

> I didn't know if I would need to wait until this patch is merged into
> mainline QEMU. That's why I didn't add my signed-off.

No problem if Xiaoyao is okay with it (the patch needs to keep the
original author's authorship and Signed-off-by). IMO, if your series is
accepted first, it also helps to reduce the size of the TDX series, and it
helps the subsequent PMU development (like mediated PMU). Conversely, if
the TDX series lands first, it's not a big deal either; you can simply
rebase and remove this patch at that time.

I'm even thinking that my KVM PMU filter should perhaps be based on your
work.

> I will add in v3 and remove "DO NOT MERGE" if the patch isn't in QEMU when
> I am sending out v3.

Okay.

Thanks,
Zhao


* Re: [PATCH v2 03/10] [DO NOT MERGE] kvm: Introduce kvm_arch_pre_create_vcpu()
  2025-03-07  7:52       ` Zhao Liu
@ 2025-03-07  8:40         ` Xiaoyao Li
  0 siblings, 0 replies; 63+ messages in thread
From: Xiaoyao Li @ 2025-03-07  8:40 UTC (permalink / raw)
  To: Zhao Liu, dongli.zhang
  Cc: qemu-devel, kvm, pbonzini, mtosatti, sandipan.das, babu.moger,
	likexu, like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov,
	den, davydov-max, dapeng1.mi, joe.jin

On 3/7/2025 3:52 PM, Zhao Liu wrote:
>> I didn't know if I would need to wait until this patch is merged into
>> mainline QEMU. That's why I didn't add my signed-off.
> 
> No problem if Xiaoyao is okay with it (copyright of patches need to
> honor the original author & signed-off). IMO, if your series is accepted
> first, it also helps to reduce the size of the TDX series, and it helps
> the subsequent PMU development (like mediated PMU). Conversely, it's
> also not a big deal; you can simply rebase and remove this patch at that
> time.

Hi Dongli,

Usually, if my TDX series is going to be merged soon, or you think this
series won't be accepted before the TDX series, you can just mention in
the cover letter that this series depends on another patch.

If your series might be accepted earlier, it's better to just grab the
patches this series needs from the other series, just like what you did
here.

As Zhao mentioned, when you grab a patch from someone else and post it
with your series, you need to keep the original patch as is (unchanged
authorship and Signed-off-by chain) and add your own Signed-off-by at the
end of the chain.

> Even I'm thinking that my KVM PMU filter should perhaps base on your work.
> 
>> I will add in v3 and remove "DO NOT MERGE" if the patch isn't in QEMU when
>> I am sending out v3.

Be sure to add your Signed-off-by, which shows that you are involved.

> Okay.
> 
> Thanks,
> Zhao
> 


* Re: [PATCH v2 05/10] target/i386/kvm: extract unrelated code out of kvm_x86_build_cpuid()
  2025-03-02 22:00 ` [PATCH v2 05/10] target/i386/kvm: extract unrelated code out of kvm_x86_build_cpuid() Dongli Zhang
  2025-03-05  7:03   ` Mi, Dapeng
@ 2025-03-07  9:15   ` Zhao Liu
  2025-03-07 22:47     ` Dongli Zhang
  1 sibling, 1 reply; 63+ messages in thread
From: Zhao Liu @ 2025-03-07  9:15 UTC (permalink / raw)
  To: Dongli Zhang
  Cc: qemu-devel, kvm, pbonzini, mtosatti, sandipan.das, babu.moger,
	likexu, like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov,
	den, davydov-max, xiaoyao.li, dapeng1.mi, joe.jin

> +static void kvm_init_pmu_info(CPUX86State *env)
> +{
> +    uint32_t eax, edx;
> +    uint32_t unused;
> +    uint32_t limit;
> +
> +    cpu_x86_cpuid(env, 0, 0, &limit, &unused, &unused, &unused);

At this stage, CPUID has already been filled and we should not use
cpu_x86_cpuid() to get the "raw" CPUID info.

Instead, after kvm_x86_build_cpuid(), the cpuid_find_entry() helper
should be preferred.

With cpuid_find_entry(), we don't even need to check the limit again.

> +
> +    if (limit < 0x0a) {
> +        return;
> +    }

...

>  int kvm_arch_init_vcpu(CPUState *cs)
>  {
>      struct {
> @@ -2267,6 +2277,8 @@ int kvm_arch_init_vcpu(CPUState *cs)
>      cpuid_i = kvm_x86_build_cpuid(env, cpuid_data.entries, cpuid_i);
>      cpuid_data.cpuid.nent = cpuid_i;
>  
> +    kvm_init_pmu_info(env);
> +

Following what has_msr_feature_control does, what about the following
change?

 int kvm_arch_init_vcpu(CPUState *cs)
 {
     struct {
@@ -2277,8 +2240,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
     cpuid_i = kvm_x86_build_cpuid(env, cpuid_data.entries, cpuid_i);
     cpuid_data.cpuid.nent = cpuid_i;

-    kvm_init_pmu_info(env);
-
     if (((env->cpuid_version >> 8)&0xF) >= 6
         && (env->features[FEAT_1_EDX] & (CPUID_MCE | CPUID_MCA)) ==
            (CPUID_MCE | CPUID_MCA)) {
@@ -2329,6 +2290,31 @@ int kvm_arch_init_vcpu(CPUState *cs)
         has_msr_feature_control = true;
     }

+    c = cpuid_find_entry(&cpuid_data.cpuid, 0xa, 0);
+    if (c) {
+        has_architectural_pmu_version = c->eax & 0xff;
+        if (has_architectural_pmu_version > 0) {
+            num_architectural_pmu_gp_counters = (c->eax & 0xff00) >> 8;
+
+            /*
+             * Shouldn't be more than 32, since that's the number of bits
+             * available in EBX to tell us _which_ counters are available.
+             * Play it safe.
+             */
+            if (num_architectural_pmu_gp_counters > MAX_GP_COUNTERS) {
+                num_architectural_pmu_gp_counters = MAX_GP_COUNTERS;
+            }
+
+            if (has_architectural_pmu_version > 1) {
+                num_architectural_pmu_fixed_counters = c->edx & 0x1f;
+
+                if (num_architectural_pmu_fixed_counters > MAX_FIXED_COUNTERS) {
+                    num_architectural_pmu_fixed_counters = MAX_FIXED_COUNTERS;
+                }
+            }
+        }
+    }
+
     if (env->mcg_cap & MCG_LMCE_P) {
         has_msr_mcg_ext_ctl = has_msr_feature_control = true;
     }
---

The code above checks 0xa after 0x1 and 0x7 and uses the local variable
`c`, so it doesn't need to be wrapped in another new function.
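
As a standalone illustration of the field decoding in the hunk above, here
is a minimal sketch that extracts the version, GP counter count and fixed
counter count from the CPUID 0xa EAX/EDX values and clamps them. The struct,
the function name and the two limit values are assumptions made for this
sketch; they are not taken from the QEMU headers.

```c
#include <stdint.h>

/* Illustrative limits, assumed for this sketch only. */
#define SKETCH_MAX_GP_COUNTERS    18
#define SKETCH_MAX_FIXED_COUNTERS 3

/* Hypothetical container for the decoded values. */
struct sketch_pmu_info {
    uint32_t version;    /* CPUID.0xA:EAX[7:0]  */
    uint32_t num_gp;     /* CPUID.0xA:EAX[15:8] */
    uint32_t num_fixed;  /* CPUID.0xA:EDX[4:0], only for version > 1 */
};

static struct sketch_pmu_info sketch_decode_cpuid_0xa(uint32_t eax,
                                                      uint32_t edx)
{
    struct sketch_pmu_info info = { 0, 0, 0 };

    info.version = eax & 0xff;
    if (info.version == 0) {
        /* No architectural PMU: leave all counts at zero. */
        return info;
    }

    info.num_gp = (eax & 0xff00) >> 8;
    if (info.num_gp > SKETCH_MAX_GP_COUNTERS) {
        info.num_gp = SKETCH_MAX_GP_COUNTERS;
    }

    /* Fixed counters are only enumerated from version 2 onwards. */
    if (info.version > 1) {
        info.num_fixed = edx & 0x1f;
        if (info.num_fixed > SKETCH_MAX_FIXED_COUNTERS) {
            info.num_fixed = SKETCH_MAX_FIXED_COUNTERS;
        }
    }

    return info;
}
```

In the real code the EAX/EDX values would come from the entry returned by
cpuid_find_entry(), so no separate check against the CPUID limit is needed.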

Regards,
Zhao





* Re: [PATCH v2 06/10] target/i386/kvm: rename architectural PMU variables
  2025-03-02 22:00 ` [PATCH v2 06/10] target/i386/kvm: rename architectural PMU variables Dongli Zhang
  2025-03-05  7:07   ` Mi, Dapeng
@ 2025-03-07  9:19   ` Zhao Liu
  2025-03-07 22:49     ` Dongli Zhang
  1 sibling, 1 reply; 63+ messages in thread
From: Zhao Liu @ 2025-03-07  9:19 UTC (permalink / raw)
  To: Dongli Zhang
  Cc: qemu-devel, kvm, pbonzini, mtosatti, sandipan.das, babu.moger,
	likexu, like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov,
	den, davydov-max, xiaoyao.li, dapeng1.mi, joe.jin

> +/*
> + * For Intel processors, the meaning is the architectural PMU version
> + * number.
> + *
> + * For AMD processors: 1 corresponds to the prior versions, and 2
> + * corresponds to AMD PerfMonV2.
> + */
> +static uint32_t has_pmu_version;

The "has_" prefix sounds like a boolean type. So what about "pmu_version"?

Others look good to me,

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>



* Re: [PATCH v2 05/10] target/i386/kvm: extract unrelated code out of kvm_x86_build_cpuid()
  2025-03-07  9:15   ` Zhao Liu
@ 2025-03-07 22:47     ` Dongli Zhang
  2025-03-10  3:55       ` Zhao Liu
  0 siblings, 1 reply; 63+ messages in thread
From: Dongli Zhang @ 2025-03-07 22:47 UTC (permalink / raw)
  To: Zhao Liu
  Cc: qemu-devel, kvm, pbonzini, mtosatti, sandipan.das, babu.moger,
	likexu, like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov,
	den, davydov-max, xiaoyao.li, dapeng1.mi, joe.jin

Hi Zhao,

On 3/7/25 1:15 AM, Zhao Liu wrote:
>> +static void kvm_init_pmu_info(CPUX86State *env)
>> +{
>> +    uint32_t eax, edx;
>> +    uint32_t unused;
>> +    uint32_t limit;
>> +
>> +    cpu_x86_cpuid(env, 0, 0, &limit, &unused, &unused, &unused);
> 
> At this stage, CPUID has already been filled and we should not use
> cpu_x86_cpuid() to get the "raw" CPUID info.
> 
> Instead, after kvm_x86_build_cpuid(), the cpuid_find_entry() helper
> should be preferred.
> 
> With cpuid_find_entry(), we don't even need to check the limit again.
> 
>> +
>> +    if (limit < 0x0a) {
>> +        return;
>> +    }
> 
> ...
> 
>>  int kvm_arch_init_vcpu(CPUState *cs)
>>  {
>>      struct {
>> @@ -2267,6 +2277,8 @@ int kvm_arch_init_vcpu(CPUState *cs)
>>      cpuid_i = kvm_x86_build_cpuid(env, cpuid_data.entries, cpuid_i);
>>      cpuid_data.cpuid.nent = cpuid_i;
>>  
>> +    kvm_init_pmu_info(env);
>> +
> 
> Referring what has_msr_feature_control did, what about the following
> change?
> 
>  int kvm_arch_init_vcpu(CPUState *cs)
>  {
>      struct {
> @@ -2277,8 +2240,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
>      cpuid_i = kvm_x86_build_cpuid(env, cpuid_data.entries, cpuid_i);
>      cpuid_data.cpuid.nent = cpuid_i;
> 
> -    kvm_init_pmu_info(env);
> -
>      if (((env->cpuid_version >> 8)&0xF) >= 6
>          && (env->features[FEAT_1_EDX] & (CPUID_MCE | CPUID_MCA)) ==
>             (CPUID_MCE | CPUID_MCA)) {
> @@ -2329,6 +2290,31 @@ int kvm_arch_init_vcpu(CPUState *cs)
>          has_msr_feature_control = true;
>      }
> 
> +    c = cpuid_find_entry(&cpuid_data.cpuid, 0xa, 0);
> +    if (c) {
> +        has_architectural_pmu_version = c->eax & 0xff;
> +        if (has_architectural_pmu_version > 0) {
> +            num_architectural_pmu_gp_counters = (c->eax & 0xff00) >> 8;
> +
> +            /*
> +             * Shouldn't be more than 32, since that's the number of bits
> +             * available in EBX to tell us _which_ counters are available.
> +             * Play it safe.
> +             */
> +            if (num_architectural_pmu_gp_counters > MAX_GP_COUNTERS) {

BTW, I may need this bounds check for the PerfMonV2 patch, where the
number of counters is determined by cpuid(0x80000022).

> +                num_architectural_pmu_gp_counters = MAX_GP_COUNTERS;
> +            }
> +
> +            if (has_architectural_pmu_version > 1) {
> +                num_architectural_pmu_fixed_counters = c->edx & 0x1f;
> +
> +                if (num_architectural_pmu_fixed_counters > MAX_FIXED_COUNTERS) {
> +                    num_architectural_pmu_fixed_counters = MAX_FIXED_COUNTERS;
> +                }
> +            }
> +        }
> +    }
> +
>      if (env->mcg_cap & MCG_LMCE_P) {
>          has_msr_mcg_ext_ctl = has_msr_feature_control = true;
>      }
> ---
> 
> The above codes check 0xa after 0x1 and 0x7, and uses the local variable
> `c`, so that it doesn't need to wrap another new function.
> 

How about we still wrap in another new function with &cpuid_data.cpuid as
an argument?

1. In current patch, we need cpuid(0xa) to query Intel PMU info.

2. In PATCH 08/10 (AMD), we need cpuid(0x80000001) to determine PERFCORE.

https://lore.kernel.org/all/20250302220112.17653-9-dongli.zhang@oracle.com/

(Otherwise, we may use (env->features[FEAT_8000_0001_ECX] &
CPUID_EXT3_PERFCORE), but I prefer something consistent.)


3. In PATCH 09/10 (AMD PerfMonV2), we need cpuid(0x80000022) to query the
PerfMonV2 support, and the number of PMU counters.

https://lore.kernel.org/all/20250302220112.17653-10-dongli.zhang@oracle.com/
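
To make the idea concrete, here is a rough sketch of a helper that takes the
already-built CPUID table as an argument and looks up leaves in it, rather
than calling cpu_x86_cpuid() again. The entry struct is a simplified
stand-in for the KVM CPUID entry, and all names, the bit constant and the
counter numbers (4 legacy, 6 with PERFCORE) are assumptions of this sketch
rather than the QEMU definitions. Falling back to the legacy count when
leaf 0x80000001 is absent is also an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical stand-in for one KVM CPUID table entry. */
struct sketch_cpuid_entry {
    uint32_t function;
    uint32_t index;
    uint32_t eax;
    uint32_t ebx;
    uint32_t ecx;
    uint32_t edx;
};

/* CPUID 0x80000001 ECX bit 23: core performance counter extensions. */
#define SKETCH_PERFCORE_BIT      (1u << 23)
#define SKETCH_AMD_COUNTERS      4   /* legacy counter count (assumed) */
#define SKETCH_AMD_COUNTERS_CORE 6   /* count with PERFCORE (assumed) */

/* Linear search of the table; returns NULL if the leaf is not present. */
static const struct sketch_cpuid_entry *
sketch_find_entry(const struct sketch_cpuid_entry *entries, size_t nent,
                  uint32_t function, uint32_t index)
{
    for (size_t i = 0; i < nent; i++) {
        if (entries[i].function == function && entries[i].index == index) {
            return &entries[i];
        }
    }
    return NULL;
}

/* Pick the GP counter count from the table instead of raw CPUID. */
static uint32_t sketch_amd_gp_counters(const struct sketch_cpuid_entry *entries,
                                       size_t nent)
{
    const struct sketch_cpuid_entry *e =
        sketch_find_entry(entries, nent, 0x80000001, 0);

    if (e && (e->ecx & SKETCH_PERFCORE_BIT)) {
        return SKETCH_AMD_COUNTERS_CORE;
    }
    return SKETCH_AMD_COUNTERS;
}
```

The same lookup could serve leaf 0xa for Intel and leaf 0x80000022 for
PerfMonV2, which is what keeps the three cases consistent.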

Thank you very much!

Dongli Zhang



* Re: [PATCH v2 06/10] target/i386/kvm: rename architectural PMU variables
  2025-03-07  9:19   ` Zhao Liu
@ 2025-03-07 22:49     ` Dongli Zhang
  0 siblings, 0 replies; 63+ messages in thread
From: Dongli Zhang @ 2025-03-07 22:49 UTC (permalink / raw)
  To: Zhao Liu
  Cc: qemu-devel, kvm, pbonzini, mtosatti, sandipan.das, babu.moger,
	likexu, like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov,
	den, davydov-max, xiaoyao.li, dapeng1.mi, joe.jin



On 3/7/25 1:19 AM, Zhao Liu wrote:
>> +/*
>> + * For Intel processors, the meaning is the architectural PMU version
>> + * number.
>> + *
>> + * For AMD processors: 1 corresponds to the prior versions, and 2
>> + * corresponds to AMD PerfMonV2.
>> + */
>> +static uint32_t has_pmu_version;
> 
> The "has_" prefix sounds like a boolean type. So what about "pmu_version"?

Sure. I will change to pmu_version.

Thank you very much!

Dongli Zhang

> 
> Others look good to me,
> 
> Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
> 
> 



* Re: [PATCH v2 05/10] target/i386/kvm: extract unrelated code out of kvm_x86_build_cpuid()
  2025-03-07 22:47     ` Dongli Zhang
@ 2025-03-10  3:55       ` Zhao Liu
  0 siblings, 0 replies; 63+ messages in thread
From: Zhao Liu @ 2025-03-10  3:55 UTC (permalink / raw)
  To: Dongli Zhang
  Cc: qemu-devel, kvm, pbonzini, mtosatti, sandipan.das, babu.moger,
	likexu, like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov,
	den, davydov-max, xiaoyao.li, dapeng1.mi, joe.jin

> How about we still wrap in another new function with &cpuid_data.cpuid as
> an argument?
> 
> 1. In current patch, we need cpuid(0xa) to query Intel PMU info.
> 
> 2. In PATCH 08/10 (AMD), we need cpuid(0x80000001) to determine PERFCORE.
> 
> https://lore.kernel.org/all/20250302220112.17653-9-dongli.zhang@oracle.com/
> 
> (Otherwise, we may use (env->features[FEAT_8000_0001_ECX] &
> CPUID_EXT3_PERFCORE), but I prefer something consistent.)
> 
> 
> 3. In PATCH 09/10 (AMD PerfMonV2), we need cpuid(0x80000022) to query the
> PerfMonV2 support, and the number of PMU counters.
> 
> https://lore.kernel.org/all/20250302220112.17653-10-dongli.zhang@oracle.com/

Thanks, I see. This new function makes sense to me.

Regards,
Zhao



* Re: [PATCH v2 07/10] target/i386/kvm: query kvm.enable_pmu parameter
  2025-03-02 22:00 ` [PATCH v2 07/10] target/i386/kvm: query kvm.enable_pmu parameter Dongli Zhang
@ 2025-03-10  6:14   ` Zhao Liu
  2025-03-10 15:41     ` Dongli Zhang
  2025-03-10 16:49     ` Dongli Zhang
  0 siblings, 2 replies; 63+ messages in thread
From: Zhao Liu @ 2025-03-10  6:14 UTC (permalink / raw)
  To: Dongli Zhang
  Cc: qemu-devel, kvm, pbonzini, mtosatti, sandipan.das, babu.moger,
	likexu, like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov,
	den, davydov-max, xiaoyao.li, dapeng1.mi, joe.jin

On Sun, Mar 02, 2025 at 02:00:15PM -0800, Dongli Zhang wrote:
> Date: Sun,  2 Mar 2025 14:00:15 -0800
> From: Dongli Zhang <dongli.zhang@oracle.com>
> Subject: [PATCH v2 07/10] target/i386/kvm: query kvm.enable_pmu parameter
> X-Mailer: git-send-email 2.43.5
> 
> There is no way to distinguish between the following scenarios:
> 
> (1) KVM_CAP_PMU_CAPABILITY is not supported.
> (2) KVM_CAP_PMU_CAPABILITY is supported but disabled via the module
> parameter kvm.enable_pmu=N.
> 
> In scenario (1), there is no way to fully disable AMD PMU virtualization.
> 
> In scenario (2), PMU virtualization is completely disabled by the KVM
> module.

KVM_CAP_PMU_CAPABILITY was introduced by ba7bb663f554 ("KVM: x86:
Provide per VM capability for disabling PMU virtualization") in v5.18,
so I understand you want to handle kernels older than v5.18.

Let's sort out all the cases:

1) v5.18 and later, with the "enable_pmu" parameter set to Y:
   KVM_CAP_PMU_CAPABILITY exists, so everything works.

2) v5.18 and later, with "enable_pmu" set to N: KVM_CAP_PMU_CAPABILITY
   doesn't exist, and QEMU needs to help the user disable vPMU.

3) v5.17 ("enable_pmu" was introduced in v5.17 by 4732f2444acd
   ("KVM: x86: Making the module parameter of vPMU more common")):
   there's no KVM_CAP_PMU_CAPABILITY, and vPMU enablement depends on
   "enable_pmu". QEMU's enable_pmu option should depend on the KVM
   parameter.

4) Before v5.17: there's no "enable_pmu", so there's no way to
   fully disable the AMD PMU.

IIUC, you want to distinguish 2) and 3). Your current code won't
break old kernels in case 4) because "kvm_pmu_disabled" defaults to false.
Therefore, overall the idea of this patch looks good to me.

But IMO, all of the cases above can be handled by:

 * First checking KVM_CAP_PMU_CAPABILITY,
 * Only if KVM_CAP_PMU_CAPABILITY doesn't exist, checking the KVM parameter

...instead of always checking the parameter as you are currently doing.

What about this change? :-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 4902694129f9..9a6044e41a82 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -2055,13 +2055,34 @@ int kvm_arch_pre_create_vcpu(CPUState *cpu, Error **errp)
          * behavior on Intel platform because current "pmu" property works
          * as expected.
          */
-        if (has_pmu_cap && !X86_CPU(cpu)->enable_pmu) {
-            ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_PMU_CAPABILITY, 0,
-                                    KVM_PMU_CAP_DISABLE);
-            if (ret < 0) {
-                error_setg_errno(errp, -ret,
-                                 "Failed to set KVM_PMU_CAP_DISABLE");
-                return ret;
+        if (has_pmu_cap) {
+            if (!X86_CPU(cpu)->enable_pmu) {
+                ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_PMU_CAPABILITY, 0,
+                                        KVM_PMU_CAP_DISABLE);
+                if (ret < 0) {
+                    error_setg_errno(errp, -ret,
+                                     "Failed to set KVM_PMU_CAP_DISABLE");
+                    return ret;
+                }
+            }
+        } else {
+            /*
+             * KVM_CAP_PMU_CAPABILITY is introduced in Linux v5.18. For old linux,
+             * we have to check enable_pmu parameter for vPMU support.
+             */
+            g_autofree char *kvm_enable_pmu = NULL;
+
+            /*
+             * The kvm.enable_pmu's permission is 0444. It does not change until a
+             * reload of the KVM module.
+             */
+            if (g_file_get_contents("/sys/module/kvm/parameters/enable_pmu",
+                &kvm_enable_pmu, NULL, NULL)) {
+                if (*kvm_enable_pmu == 'N' && !X86_CPU(cpu)->enable_pmu) {
+                    error_setg(errp, "Failed to enable PMU since "
+                               "KVM's enable_pmu parameter is disabled");
+                    return -1;
+                }
             }
         }
     }

---

This example not only eliminates the static variable "kvm_pmu_disabled",
but also explicitly informs the user that vPMU is not available and
that QEMU's "pmu" option doesn't work.

By comparison, your patch 8 "silently" disables the PMU (in
kvm_init_pmu_info()), and the user can only discover this in the guest
through PMU exceptions.
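
To spell out the check order, here is a minimal standalone sketch of the
decision: capability first, module parameter only as a fallback. The enum,
the function name and the convention of passing 0 when the sysfs file cannot
be read are assumptions of this sketch. Note that it raises the error when
the user requests a PMU that KVM has disabled, which is what the error
message describes; the `!X86_CPU(cpu)->enable_pmu` test in the diff above
may need to be inverted to match.

```c
#include <stdbool.h>

/* Hypothetical outcomes of the pre-create vCPU PMU check. */
enum sketch_pmu_action {
    SKETCH_PMU_NOTHING,          /* nothing to do */
    SKETCH_PMU_DISABLE_VIA_CAP,  /* set KVM_PMU_CAP_DISABLE */
    SKETCH_PMU_ERROR_UNAVAILABLE /* PMU requested but KVM has it off */
};

/*
 * has_pmu_cap: KVM_CAP_PMU_CAPABILITY is reported by KVM.
 * want_pmu:    the guest CPU has the "pmu" property enabled.
 * param:       first character of kvm.enable_pmu ('Y' or 'N'), or 0 when
 *              the parameter file does not exist (kernels before v5.17).
 */
static enum sketch_pmu_action
sketch_pmu_decide(bool has_pmu_cap, bool want_pmu, char param)
{
    if (has_pmu_cap) {
        /* Cases with the capability: disable per VM only if asked to. */
        return want_pmu ? SKETCH_PMU_NOTHING : SKETCH_PMU_DISABLE_VIA_CAP;
    }

    /* No capability: fall back to the module parameter, if readable. */
    if (param == 'N' && want_pmu) {
        return SKETCH_PMU_ERROR_UNAVAILABLE;
    }

    /* Parameter is 'Y', unreadable, or the PMU was not requested. */
    return SKETCH_PMU_NOTHING;
}
```

With this ordering the sysfs file is only consulted when the capability is
missing, so no static "kvm_pmu_disabled" flag is needed for this check.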

Thanks,
Zhao


* Re: [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset
  2025-03-02 22:00 ` [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset Dongli Zhang
                     ` (2 preceding siblings ...)
  2025-03-07  7:38   ` Sandipan Das
@ 2025-03-10  7:47   ` Zhao Liu
  2025-03-10 16:39     ` Dongli Zhang
  2025-03-28  6:29   ` ewanhai
  4 siblings, 1 reply; 63+ messages in thread
From: Zhao Liu @ 2025-03-10  7:47 UTC (permalink / raw)
  To: Dongli Zhang
  Cc: qemu-devel, kvm, pbonzini, mtosatti, sandipan.das, babu.moger,
	likexu, like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov,
	den, davydov-max, xiaoyao.li, dapeng1.mi, joe.jin, ewanhai-oc

(+EwanHai for zhaoxin case...)

...

> -static void kvm_init_pmu_info(CPUX86State *env)
> +static void kvm_init_pmu_info_intel(CPUX86State *env)
>  {
>      uint32_t eax, edx;
>      uint32_t unused;
> @@ -2106,6 +2106,94 @@ static void kvm_init_pmu_info(CPUX86State *env)
>      }
>  }
>  
> +static void kvm_init_pmu_info_amd(CPUX86State *env)
> +{
> +    uint32_t unused;
> +    int64_t family;
> +    uint32_t ecx;
> +
> +    has_pmu_version = 0;
> +
> +    /*
> +     * To determine the CPU family, the following code is derived from
> +     * x86_cpuid_version_get_family().
> +     */
> +    family = (env->cpuid_version >> 8) & 0xf;
> +    if (family == 0xf) {
> +        family += (env->cpuid_version >> 20) & 0xff;
> +    }
> +
> +    /*
> +     * Performance-monitoring supported from K7 and later.
> +     */
> +    if (family < 6) {
> +        return;
> +    }

I understand we can get the family via the object_property_get_int() helper:

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 4902694129f9..ff08c7bfee6c 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -2106,27 +2106,22 @@ static void kvm_init_pmu_info_intel(CPUX86State *env)
     }
 }

-static void kvm_init_pmu_info_amd(CPUX86State *env)
+static void kvm_init_pmu_info_amd(X86CPU *cpu)
 {
+    CPUX86State *env = &cpu->env;
     uint32_t eax, ebx, ecx;
     uint32_t unused;
     int64_t family;

     has_pmu_version = 0;

-    /*
-     * To determine the CPU family, the following code is derived from
-     * x86_cpuid_version_get_family().
-     */
-    family = (env->cpuid_version >> 8) & 0xf;
-    if (family == 0xf) {
-        family += (env->cpuid_version >> 20) & 0xff;
+    family = object_property_get_int(OBJECT(cpu), "family", &error_abort);
+    if (family < 0) {
+        return;
     }

-    /*
-     * Performance-monitoring supported from K7 and later.
-     */
     if (family < 6) {
+        error_report("AMD performance-monitoring is supported from K7 and later");
         return;
     }

@@ -2197,7 +2192,7 @@ static void kvm_init_pmu_info(CPUState *cs)
     if (IS_INTEL_CPU(env)) {
         kvm_init_pmu_info_intel(env);
     } else if (IS_AMD_CPU(env)) {
-        kvm_init_pmu_info_amd(env);
+        kvm_init_pmu_info_amd(cpu);
     }
 }

---
Then for consistency, kvm_init_pmu_info_intel() could also accept
"X86CPU *cpu" as the argument.
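
For reference, the open-coded family computation that the diff above removes
can be written as a small standalone function. This is only a sketch of the
bit arithmetic from the quoted patch; the function name is made up for
illustration.

```c
#include <stdint.h>

/*
 * Decode the display family from a CPUID.1:EAX style version value.
 * The base family lives in bits [11:8]; when it is 0xf, the extended
 * family in bits [27:20] is added on top.
 */
static uint32_t sketch_x86_family(uint32_t cpuid_version)
{
    uint32_t family = (cpuid_version >> 8) & 0xf;

    if (family == 0xf) {
        family += (cpuid_version >> 20) & 0xff;
    }
    return family;
}
```

For example, a version value of 0x00800f11 decodes to family 0x17, and
0x000906ea decodes to family 6.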

> +    has_pmu_version = 1;
> +
> +    cpu_x86_cpuid(env, 0x80000001, 0, &unused, &unused, &ecx, &unused);
> +
> +    if (!(ecx & CPUID_EXT3_PERFCORE)) {
> +        num_pmu_gp_counters = AMD64_NUM_COUNTERS;
> +        return;
> +    }
> +
> +    num_pmu_gp_counters = AMD64_NUM_COUNTERS_CORE;
> +}

...

> +static void kvm_init_pmu_info(CPUState *cs)
> +{
> +    X86CPU *cpu = X86_CPU(cs);
> +    CPUX86State *env = &cpu->env;
> +
> +    /*
> +     * The PMU virtualization is disabled by kvm.enable_pmu=N.
> +     */
> +    if (kvm_pmu_disabled) {
> +        return;
> +    }

As I said in patch 7, we could return an error instead.

> +    /*
> +     * It is not supported to virtualize AMD PMU registers on Intel
> +     * processors, nor to virtualize Intel PMU registers on AMD processors.
> +     */
> +    if (!is_same_vendor(env)) {

Here it deserves a warning like:

error_report("host doesn't support requested feature: vPMU");

> +        return;
> +    }
>
> +    /*
> +     * If KVM_CAP_PMU_CAPABILITY is not supported, there is no way to
> +     * disable the AMD pmu virtualization.
> +     *
> +     * If KVM_CAP_PMU_CAPABILITY is supported !cpu->enable_pmu
> +     * indicates the KVM has already disabled the PMU virtualization.
> +     */
> +    if (has_pmu_cap && !cpu->enable_pmu) {
> +        return;
> +    }

Could we check only "cpu->enable_pmu" at the beginning of this function?
Then, if the PMU is already disabled, we don't need to initialize the PMU info.

> +    if (IS_INTEL_CPU(env)) {

Zhaoxin also supports architectural PerfMon in 0xa.

I'm not sure whether this check should also cover Zhaoxin CPUs, so I've
cc'ed the Zhaoxin folks to double-check.

> +        kvm_init_pmu_info_intel(env);
> +    } else if (IS_AMD_CPU(env)) {
> +        kvm_init_pmu_info_amd(env);
> +    }
> +}
> +
>  int kvm_arch_init_vcpu(CPUState *cs)
>  {
>      struct {
> @@ -2288,7 +2376,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
>      cpuid_i = kvm_x86_build_cpuid(env, cpuid_data.entries, cpuid_i);
>      cpuid_data.cpuid.nent = cpuid_i;
>  
> -    kvm_init_pmu_info(env);
> +    kvm_init_pmu_info(cs);
>  
>      if (((env->cpuid_version >> 8)&0xF) >= 6
>          && (env->features[FEAT_1_EDX] & (CPUID_MCE | CPUID_MCA)) ==
> @@ -4064,7 +4152,7 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>              kvm_msr_entry_add(cpu, MSR_KVM_POLL_CONTROL, env->poll_control_msr);
>          }
>  
> -        if (has_pmu_version > 0) {
> +        if (IS_INTEL_CPU(env) && has_pmu_version > 0) {

ditto.

>              if (has_pmu_version > 1) {
>                  /* Stop the counter.  */
>                  kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
> @@ -4095,6 +4183,38 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>                                    env->msr_global_ctrl);
>              }
>          }
> +

...

>          /*
>           * Hyper-V partition-wide MSRs: to avoid clearing them on cpu hot-add,
>           * only sync them to KVM on the first cpu
> @@ -4542,7 +4662,8 @@ static int kvm_get_msrs(X86CPU *cpu)
>      if (env->features[FEAT_KVM] & CPUID_KVM_POLL_CONTROL) {
>          kvm_msr_entry_add(cpu, MSR_KVM_POLL_CONTROL, 1);
>      }
> -    if (has_pmu_version > 0) {
> +
> +    if (IS_INTEL_CPU(env) && has_pmu_version > 0) {

ditto.

>          if (has_pmu_version > 1) {
>              kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
>              kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_CTRL, 0);
> @@ -4558,6 +4679,35 @@ static int kvm_get_msrs(X86CPU *cpu)
>          }
>      }
>

Thanks,
Zhao




* Re: [PATCH v2 07/10] target/i386/kvm: query kvm.enable_pmu parameter
  2025-03-10  6:14   ` Zhao Liu
@ 2025-03-10 15:41     ` Dongli Zhang
  2025-03-10 16:49     ` Dongli Zhang
  1 sibling, 0 replies; 63+ messages in thread
From: Dongli Zhang @ 2025-03-10 15:41 UTC (permalink / raw)
  To: Zhao Liu
  Cc: qemu-devel, kvm, pbonzini, mtosatti, sandipan.das, babu.moger,
	likexu, like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov,
	den, davydov-max, xiaoyao.li, dapeng1.mi, joe.jin

Hi Zhao,

On 3/9/25 11:14 PM, Zhao Liu wrote:
> On Sun, Mar 02, 2025 at 02:00:15PM -0800, Dongli Zhang wrote:
>> Date: Sun,  2 Mar 2025 14:00:15 -0800
>> From: Dongli Zhang <dongli.zhang@oracle.com>
>> Subject: [PATCH v2 07/10] target/i386/kvm: query kvm.enable_pmu parameter
>> X-Mailer: git-send-email 2.43.5
>>
>> There is no way to distinguish between the following scenarios:
>>
>> (1) KVM_CAP_PMU_CAPABILITY is not supported.
>> (2) KVM_CAP_PMU_CAPABILITY is supported but disabled via the module
>> parameter kvm.enable_pmu=N.
>>
>> In scenario (1), there is no way to fully disable AMD PMU virtualization.
>>
>> In scenario (2), PMU virtualization is completely disabled by the KVM
>> module.
> 
> KVM_CAP_PMU_CAPABILITY is introduced since ba7bb663f554 ("KVM: x86:
> Provide per VM capability for disabling PMU virtualization") in v5.18,
> so I understand you want to handle the old linux before v5.18.
> 
> Let's sort out all the cases:
> 
> 1) v5.18 and after, if the parameter "enable_pmu" is Y and then
>    KVM_CAP_PMU_CAPABILITY exists, so everything could work.
> 
> 2) v5.18 and after, "enable_pmu" is N and then KVM_CAP_PMU_CAPABILITY
>    doesn't exist, QEMU needs to help the user disable vPMU.
> 
> 3) v5.17 (since "enable_pmu" is introduced in v5.17 since 4732f2444acd
>    ("KVM: x86: Making the module parameter of vPMU more common")),
>    there's no KVM_CAP_PMU_CAPABILITY and vPMU enablement depends on
>    "enable_pmu". QEMU's enable_pmu option should depend on kvm
>    parameter.
> 
> 4) before v5.17, there's no "enable_pmu" so that there's no way to
>    fully disable AMD PMU.
> 
> IIUC, you want to distinguish 2) and 3). And your current codes won't
> break old kernels on 4) because "kvm_pmu_disabled" defaults false.
> Therefore, overall the idea of this patch is good for me.
> 
> But IMO, the logics all above can be compatible by:
> 
>  * First check the KVM_CAP_PMU_CAPABILITY,
>  * Only if KVM_CAP_PMU_CAPABILITY doesn't exist, then check the kvm parameter
> 
> ...instead of always checking the parameter as you are currently doing.
> 
> What about this change? :-)
> 
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index 4902694129f9..9a6044e41a82 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -2055,13 +2055,34 @@ int kvm_arch_pre_create_vcpu(CPUState *cpu, Error **errp)
>           * behavior on Intel platform because current "pmu" property works
>           * as expected.
>           */
> -        if (has_pmu_cap && !X86_CPU(cpu)->enable_pmu) {
> -            ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_PMU_CAPABILITY, 0,
> -                                    KVM_PMU_CAP_DISABLE);
> -            if (ret < 0) {
> -                error_setg_errno(errp, -ret,
> -                                 "Failed to set KVM_PMU_CAP_DISABLE");
> -                return ret;
> +        if (has_pmu_cap) {
> +            if (!X86_CPU(cpu)->enable_pmu) {
> +                ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_PMU_CAPABILITY, 0,
> +                                        KVM_PMU_CAP_DISABLE);
> +                if (ret < 0) {
> +                    error_setg_errno(errp, -ret,
> +                                     "Failed to set KVM_PMU_CAP_DISABLE");
> +                    return ret;
> +                }
> +            }
> +        } else {
> +            /*
> +             * KVM_CAP_PMU_CAPABILITY is introduced in Linux v5.18. For old linux,
> +             * we have to check enable_pmu parameter for vPMU support.
> +             */
> +            g_autofree char *kvm_enable_pmu = NULL;
> +
> +            /*
> +             * The kvm.enable_pmu's permission is 0444. It does not change until a
> +             * reload of the KVM module.
> +             */
> +            if (g_file_get_contents("/sys/module/kvm/parameters/enable_pmu",
> +                &kvm_enable_pmu, NULL, NULL)) {
> +                if (*kvm_enable_pmu == 'N' && !X86_CPU(cpu)->enable_pmu) {
> +                    error_setg(errp, "Failed to enable PMU since "
> +                               "KVM's enable_pmu parameter is disabled");
> +                    return -1;
> +                }
>              }
>          }
>      }
> 
> ---
> 
> This example not only eliminates the static variable “kvm_pmu_disabled”,
> but also explicitly informs the user that vPMU is not available and
> QEMU's "pmu" option doesn't work.
> 
> As a comparison, your patch 8 actually "silently" disables PMU (in the
> kvm_init_pmu_info()) and user can only find it in Guest through PMU
> exceptions.
> 

Thank you very much!

I will change the code following your suggestion.

Dongli Zhang



* Re: [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset
  2025-03-10  7:47   ` Zhao Liu
@ 2025-03-10 16:39     ` Dongli Zhang
  2025-03-11 13:51       ` Zhao Liu
  0 siblings, 1 reply; 63+ messages in thread
From: Dongli Zhang @ 2025-03-10 16:39 UTC (permalink / raw)
  To: Zhao Liu
  Cc: qemu-devel, kvm, pbonzini, mtosatti, sandipan.das, babu.moger,
	likexu, like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov,
	den, davydov-max, xiaoyao.li, dapeng1.mi, joe.jin, ewanhai-oc

Hi Zhao,

On 3/10/25 12:47 AM, Zhao Liu wrote:
> (+EwanHai for zhaoxin case...)
> 
> ...
> 

[snip]

>> +
>> +    /*
>> +     * Performance-monitoring supported from K7 and later.
>> +     */
>> +    if (family < 6) {
>> +        return;
>> +    }
> 
> I understand we can get family by object_property_get_int() helper:

Thank you very much for the suggestion!

> 
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index 4902694129f9..ff08c7bfee6c 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -2106,27 +2106,22 @@ static void kvm_init_pmu_info_intel(CPUX86State *env)
>      }
>  }
> 

[snip]

> 
> @@ -2197,7 +2192,7 @@ static void kvm_init_pmu_info(CPUState *cs)
>      if (IS_INTEL_CPU(env)) {
>          kvm_init_pmu_info_intel(env);
>      } else if (IS_AMD_CPU(env)) {
> -        kvm_init_pmu_info_amd(env);
> +        kvm_init_pmu_info_amd(cpu);
>      }
>  }
> 
> ---
> Then for consistency, kvm_init_pmu_info_intel() could also accept
> "X86CPU *cpu" as the argument.

Sure. Will do.

> 
>> +    has_pmu_version = 1;
>> +
>> +    cpu_x86_cpuid(env, 0x80000001, 0, &unused, &unused, &ecx, &unused);
>> +
>> +    if (!(ecx & CPUID_EXT3_PERFCORE)) {
>> +        num_pmu_gp_counters = AMD64_NUM_COUNTERS;
>> +        return;
>> +    }
>> +
>> +    num_pmu_gp_counters = AMD64_NUM_COUNTERS_CORE;
>> +}
> 
> ...
> 
>> +static void kvm_init_pmu_info(CPUState *cs)
>> +{
>> +    X86CPU *cpu = X86_CPU(cs);
>> +    CPUX86State *env = &cpu->env;
>> +
>> +    /*
>> +     * The PMU virtualization is disabled by kvm.enable_pmu=N.
>> +     */
>> +    if (kvm_pmu_disabled) {
>> +        return;
>> +    }
> 
> As I said in patch 7, we could return an error instead.

Sure.

In addition, as we have discussed, we are going to pass cpuid_data.cpuid as
argument, so that we don't need cpu_x86_cpuid() any longer.

> 
>> +    /*
>> +     * It is not supported to virtualize AMD PMU registers on Intel
>> +     * processors, nor to virtualize Intel PMU registers on AMD processors.
>> +     */
>> +    if (!is_same_vendor(env)) {
> 
> Here it deserves a warning like:
> 
> error_report("host doesn't support requested feature: vPMU");

Sure. Will do.

> 
>> +        return;
>> +    }
>>
>> +    /*
>> +     * If KVM_CAP_PMU_CAPABILITY is not supported, there is no way to
>> +     * disable the AMD pmu virtualization.
>> +     *
>> +     * If KVM_CAP_PMU_CAPABILITY is supported !cpu->enable_pmu
>> +     * indicates the KVM has already disabled the PMU virtualization.
>> +     */
>> +    if (has_pmu_cap && !cpu->enable_pmu) {
>> +        return;
>> +    }
> 
> Could we only check "cpu->enable_pmu" at the beginning of this function?
> then if pmu is already disabled, we don't need to initialize the pmu info.

I don't think so. There is a case:

- cpu->enable_pmu = false (that is, "-cpu host,-pmu").
- But on KVM prior to v5.18, KVM_CAP_PMU_CAPABILITY doesn't exist.

In that case there is no way to disable vPMU, so deciding based only on
"!cpu->enable_pmu" doesn't work.

It works only when "!cpu->enable_pmu" is true and KVM_CAP_PMU_CAPABILITY
exists.


We may still need a static global variable here to indicate whether
"kvm.enable_pmu=N" (as discussed in PATCH 07).
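
A small sketch of the condition described above, written as a standalone
predicate. The function and parameter names are hypothetical; the point is
only that "-pmu" by itself does not imply the vPMU is off unless the
capability exists or the module parameter disabled it.

```c
#include <stdbool.h>

/*
 * Returns true only when KVM is known to have PMU virtualization off,
 * so the PMU info does not need to be initialized.
 *
 * has_pmu_cap:        KVM_CAP_PMU_CAPABILITY is available.
 * enable_pmu:         the guest CPU "pmu" property.
 * kvm_param_disabled: kvm.enable_pmu=N was observed.
 */
static bool sketch_vpmu_really_disabled(bool has_pmu_cap, bool enable_pmu,
                                        bool kvm_param_disabled)
{
    if (kvm_param_disabled) {
        /* The module parameter turns vPMU off for every VM. */
        return true;
    }

    /* "-pmu" only takes effect in KVM when the capability exists. */
    return has_pmu_cap && !enable_pmu;
}
```

On a pre-v5.18 KVM with "-cpu host,-pmu", this returns false, so the AMD PMU
registers would still be tracked and reset.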

> 
>> +    if (IS_INTEL_CPU(env)) {
> 
> Zhaoxin also supports architectural PerfMon in 0xa.
> 
> I'm not sure if this check should also involve Zhaoxin CPU, so cc
> zhaoxin guys for double check.

Sure, for both here and the 'ditto' cases below. Thank you very much!

Dongli Zhang


* Re: [PATCH v2 07/10] target/i386/kvm: query kvm.enable_pmu parameter
  2025-03-10  6:14   ` Zhao Liu
  2025-03-10 15:41     ` Dongli Zhang
@ 2025-03-10 16:49     ` Dongli Zhang
  1 sibling, 0 replies; 63+ messages in thread
From: Dongli Zhang @ 2025-03-10 16:49 UTC (permalink / raw)
  To: Zhao Liu
  Cc: qemu-devel, kvm, pbonzini, mtosatti, sandipan.das, babu.moger,
	likexu, like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov,
	den, davydov-max, xiaoyao.li, dapeng1.mi, joe.jin

Hi Zhao,

On 3/9/25 11:14 PM, Zhao Liu wrote:
> On Sun, Mar 02, 2025 at 02:00:15PM -0800, Dongli Zhang wrote:
>> Date: Sun,  2 Mar 2025 14:00:15 -0800
>> From: Dongli Zhang <dongli.zhang@oracle.com>
>> Subject: [PATCH v2 07/10] target/i386/kvm: query kvm.enable_pmu parameter
>> X-Mailer: git-send-email 2.43.5
>>
>> There is no way to distinguish between the following scenarios:
>>
>> (1) KVM_CAP_PMU_CAPABILITY is not supported.
>> (2) KVM_CAP_PMU_CAPABILITY is supported but disabled via the module
>> parameter kvm.enable_pmu=N.
>>
>> In scenario (1), there is no way to fully disable AMD PMU virtualization.
>>
>> In scenario (2), PMU virtualization is completely disabled by the KVM
>> module.
> 
> KVM_CAP_PMU_CAPABILITY is introduced since ba7bb663f554 ("KVM: x86:
> Provide per VM capability for disabling PMU virtualization") in v5.18,
> so I understand you want to handle the old linux before v5.18.
> 
> Let's sort out all the cases:
> 
> 1) v5.18 and after, if the parameter "enable_pmu" is Y and then
>    KVM_CAP_PMU_CAPABILITY exists, so everything could work.
> 
> 2) v5.18 and after, "enable_pmu" is N and then KVM_CAP_PMU_CAPABILITY
>    doesn't exist, QEMU needs to help the user disable vPMU.
> 
> 3) v5.17 (since "enable_pmu" is introduced in v5.17 since 4732f2444acd
>    ("KVM: x86: Making the module parameter of vPMU more common")),
>    there's no KVM_CAP_PMU_CAPABILITY and vPMU enablement depends on
>    "enable_pmu". QEMU's enable_pmu option should depend on kvm
>    parameter.
> 
> 4) before v5.17, there's no "enable_pmu" so that there's no way to
>    fully disable AMD PMU.
> 
> IIUC, you want to distinguish 2) and 3). And your current codes won't
> break old kernels on 4) because "kvm_pmu_disabled" defaults false.
> Therefore, overall the idea of this patch is good for me.
> 
> But IMO, the logics all above can be compatible by:
> 
>  * First check the KVM_CAP_PMU_CAPABILITY,
>  * Only if KVM_CAP_PMU_CAPABILITY doesn't exist, then check the kvm parameter
> 
> ...instead of always checking the parameter as you are currently doing.
> 
> What about this change? :-)
> 
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index 4902694129f9..9a6044e41a82 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -2055,13 +2055,34 @@ int kvm_arch_pre_create_vcpu(CPUState *cpu, Error **errp)
>           * behavior on Intel platform because current "pmu" property works
>           * as expected.
>           */
> -        if (has_pmu_cap && !X86_CPU(cpu)->enable_pmu) {
> -            ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_PMU_CAPABILITY, 0,
> -                                    KVM_PMU_CAP_DISABLE);
> -            if (ret < 0) {
> -                error_setg_errno(errp, -ret,
> -                                 "Failed to set KVM_PMU_CAP_DISABLE");
> -                return ret;
> +        if (has_pmu_cap) {
> +            if (!X86_CPU(cpu)->enable_pmu) {
> +                ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_PMU_CAPABILITY, 0,
> +                                        KVM_PMU_CAP_DISABLE);
> +                if (ret < 0) {
> +                    error_setg_errno(errp, -ret,
> +                                     "Failed to set KVM_PMU_CAP_DISABLE");
> +                    return ret;
> +                }
> +            }
> +        } else {
> +            /*
> +             * KVM_CAP_PMU_CAPABILITY is introduced in Linux v5.18. For old linux,
> +             * we have to check enable_pmu parameter for vPMU support.
> +             */
> +            g_autofree char *kvm_enable_pmu;
> +
> +            /*
> +             * The kvm.enable_pmu's permission is 0444. It does not change until a
> +             * reload of the KVM module.
> +             */
> +            if (g_file_get_contents("/sys/module/kvm/parameters/enable_pmu",
> +                &kvm_enable_pmu, NULL, NULL)) {
> +                if (*kvm_enable_pmu == 'N' && !X86_CPU(cpu)->enable_pmu) {

BTW, may I assume you meant:

if (*kvm_enable_pmu == 'N' && X86_CPU(cpu)->enable_pmu) {

not

if (*kvm_enable_pmu == 'N' && !X86_CPU(cpu)->enable_pmu) {

That is, return an error because QEMU cannot enable vPMU when the kernel
module configuration (kvm.enable_pmu=N) disables it.
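
A minimal sketch of the corrected check in isolation, for illustration only.
pmu_request_conflicts() and its parameters are hypothetical names, not QEMU
functions: "param" stands for the text read from
/sys/module/kvm/parameters/enable_pmu and "want_pmu" for
X86_CPU(cpu)->enable_pmu.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of QEMU).
 *
 * param:    contents of the kvm.enable_pmu module parameter, e.g. "N\n",
 *           or NULL if the file could not be read.
 * want_pmu: whether the user asked for a vPMU ("+pmu").
 *
 * Returns true only when the user wants a vPMU but the KVM module has
 * PMU virtualization turned off, i.e. the case that should be an error.
 */
bool pmu_request_conflicts(const char *param, bool want_pmu)
{
    if (param == NULL) {
        return false;   /* parameter not readable: nothing to conclude */
    }
    return param[0] == 'N' && want_pmu;
}
```

With "-pmu" and kvm.enable_pmu=N the request and the module agree, so no
error is raised; only "+pmu" together with kvm.enable_pmu=N is a conflict.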

> +                    error_setg(errp, "Failed to enable PMU since "
> +                               "KVM's enable_pmu parameter is disabled");
> +                    return -1;
> +                }
>              }
>          }
>      }
> 
> ---
> 
> This example not only eliminates the static variable “kvm_pmu_disabled”,
> but also explicitly informs the user that vPMU is not available and
> QEMU's "pmu" option doesn't work.
> 
> As a comparison, your patch 8 actually "silently" disables PMU (in the
> kvm_init_pmu_info()) and user can only find it in Guest through PMU
> exceptions.

As replied in PATCH 08, we may still need a static variable
"kvm_pmu_disabled", in order to tell if we need to reset PMU registers when:

- X86_CPU(cpu)->enable_pmu = false.
- KVM_CAP_PMU_CAPABILITY returns 0.

If (kvm.enable_pmu=N)
    It is safe to skip PMU registers' reset
Otherwise
    We cannot skip reset.
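
The decision above could be sketched roughly as follows. This is only an
illustration under assumptions: can_skip_pmu_reset() is a hypothetical
helper, and its three flags mirror the two early returns of
kvm_init_pmu_info() in PATCH 08 as posted (kvm_pmu_disabled, and
has_pmu_cap && !enable_pmu).

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not part of QEMU): may the PMU register reset be
 * skipped?
 *
 * enable_pmu:       the QEMU "pmu" property of the vCPU.
 * has_pmu_cap:      KVM_CAP_PMU_CAPABILITY is reported by KVM.
 * kvm_pmu_disabled: the KVM module was loaded with kvm.enable_pmu=N.
 */
bool can_skip_pmu_reset(bool enable_pmu, bool has_pmu_cap,
                        bool kvm_pmu_disabled)
{
    if (kvm_pmu_disabled) {
        return true;    /* KVM exposes no vPMU at all */
    }
    if (has_pmu_cap && !enable_pmu) {
        return true;    /* KVM_PMU_CAP_DISABLE was applied for this VM */
    }
    /*
     * Either the vPMU is enabled, or "-pmu" was requested on a KVM that
     * cannot disable it: the registers may hold live state.
     */
    return false;
}
```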


Dongli Zhang

> 
> Thanks,
> Zhao
> 
> 



* Re: [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset
  2025-03-10 16:39     ` Dongli Zhang
@ 2025-03-11 13:51       ` Zhao Liu
  2025-03-11 19:52         ` Dongli Zhang
  0 siblings, 1 reply; 63+ messages in thread
From: Zhao Liu @ 2025-03-11 13:51 UTC (permalink / raw)
  To: Dongli Zhang
  Cc: qemu-devel, kvm, pbonzini, mtosatti, sandipan.das, babu.moger,
	likexu, like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov,
	den, davydov-max, xiaoyao.li, dapeng1.mi, joe.jin, ewanhai-oc

Hi Dongli,

> >> +    /*
> >> +     * If KVM_CAP_PMU_CAPABILITY is not supported, there is no way to
> >> +     * disable the AMD pmu virtualization.
> >> +     *
> >> +     * If KVM_CAP_PMU_CAPABILITY is supported !cpu->enable_pmu
> >> +     * indicates the KVM has already disabled the PMU virtualization.
> >> +     */
> >> +    if (has_pmu_cap && !cpu->enable_pmu) {
> >> +        return;
> >> +    }
> > 
> > Could we only check "cpu->enable_pmu" at the beginning of this function?
> > then if pmu is already disabled, we don't need to initialize the pmu info.
> 
> I don't think so. There is a case:
> 
> - cpu->enable_pmu = false. (That is, "-cpu host,-pmu").
> - But for KVM prior v5.18 that KVM_CAP_PMU_CAPABILITY doesn't exist.
> 
> There is no way to disable vPMU. To determine based on only
> "!cpu->enable_pmu" doesn't work.

Ah, I didn't get your point here. When the QEMU user has already disabled the
PMU, why do we still need to initialize the PMU info and save/load the PMU
MSRs? In this case, user won't expect vPMU could work.

> It works only when "!cpu->enable_pmu" and KVM_CAP_PMU_CAPABILITY exists.
> 
> 
> We may still need a static global variable here to indicate where
> "kvm.enable_pmu=N" (as discussed in PATCH 07).
>
> > 
> >> +    if (IS_INTEL_CPU(env)) {
> > 
> > Zhaoxin also supports architectural PerfMon in 0xa.
> > 
> > I'm not sure if this check should also involve Zhaoxin CPU, so cc
> > zhaoxin guys for double check.
> 
> Sure for both here and below 'ditto'. Thank you very much!

Per the Linux commit 3a4ac121c2cac, Zhaoxin mostly follows Intel
Architectural PerfMon-v2. After all, before this patch, this PMU code
didn't check any vendor, so I suppose vPMU may work for Zhaoxin as
well. Therefore, it's better to consider Zhaoxin when you check for an Intel
CPU, which can help avoid introducing regressions.

Thanks,
Zhao



* Re: [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset
  2025-03-11 13:51       ` Zhao Liu
@ 2025-03-11 19:52         ` Dongli Zhang
  2025-03-12  8:30           ` Zhao Liu
  0 siblings, 1 reply; 63+ messages in thread
From: Dongli Zhang @ 2025-03-11 19:52 UTC (permalink / raw)
  To: Zhao Liu
  Cc: qemu-devel, kvm, pbonzini, mtosatti, sandipan.das, babu.moger,
	likexu, like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov,
	den, davydov-max, xiaoyao.li, dapeng1.mi, joe.jin, ewanhai-oc

Hi Zhao,

On 3/11/25 6:51 AM, Zhao Liu wrote:
> Hi Dongli,
> 
>>>> +    /*
>>>> +     * If KVM_CAP_PMU_CAPABILITY is not supported, there is no way to
>>>> +     * disable the AMD pmu virtualization.
>>>> +     *
>>>> +     * If KVM_CAP_PMU_CAPABILITY is supported !cpu->enable_pmu
>>>> +     * indicates the KVM has already disabled the PMU virtualization.
>>>> +     */
>>>> +    if (has_pmu_cap && !cpu->enable_pmu) {
>>>> +        return;
>>>> +    }
>>>
>>> Could we only check "cpu->enable_pmu" at the beginning of this function?
>>> then if pmu is already disabled, we don't need to initialize the pmu info.
>>
>> I don't think so. There is a case:
>>
>> - cpu->enable_pmu = false. (That is, "-cpu host,-pmu").
>> - But for KVM prior v5.18 that KVM_CAP_PMU_CAPABILITY doesn't exist.
>>
>> There is no way to disable vPMU. To determine based on only
>> "!cpu->enable_pmu" doesn't work.
> 
> Ah, I didn't get your point here. When QEMU user has already disabled
> PMU, why we still need to continue initialize PMU info and save/load PMU
> MSRs? In this case, user won't expect vPMU could work.

Yes, "In this case, user won't expect vPMU could work.".

But in reality vPMU is still active, although that doesn't match user's
expectation.

User doesn't expect PMU to work. However, "perf stat" still works in VM
(when KVM_CAP_PMU_CAPABILITY isn't available).

Would you suggest we only follow the user's expectation? That is, once the user
configures "-pmu", we always assume vPMU is disabled, even if it is still
available (on KVM prior to v5.18, without KVM_CAP_PMU_CAPABILITY)?

> 
>> It works only when "!cpu->enable_pmu" and KVM_CAP_PMU_CAPABILITY exists.
>>
>>
>> We may still need a static global variable here to indicate where
>> "kvm.enable_pmu=N" (as discussed in PATCH 07).
>>
>>>
>>>> +    if (IS_INTEL_CPU(env)) {
>>>
>>> Zhaoxin also supports architectural PerfMon in 0xa.
>>>
>>> I'm not sure if this check should also involve Zhaoxin CPU, so cc
>>> zhaoxin guys for double check.
>>
>> Sure for both here and below 'ditto'. Thank you very much!
> 
> Per the Linux commit 3a4ac121c2cac, Zhaoxin mostly follows Intel
> Architectural PerfMon-v2. Afterall, before this patch, these PMU things
> didn't check any vendor, so I suppose vPMU may could work for Zhaoxin as
> well. Therefore, its' better to consider Zhaoxin when you check Intel
> CPU, which can help avoid introducing some regressions.
> 

Thank you very much!

zhaoxin_pmu_init() looks self-explanatory.

Dongli Zhang



* Re: [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset
  2025-03-11 19:52         ` Dongli Zhang
@ 2025-03-12  8:30           ` Zhao Liu
  2025-03-12 22:17             ` Dongli Zhang
  0 siblings, 1 reply; 63+ messages in thread
From: Zhao Liu @ 2025-03-12  8:30 UTC (permalink / raw)
  To: Dongli Zhang
  Cc: qemu-devel, kvm, pbonzini, mtosatti, sandipan.das, babu.moger,
	likexu, like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov,
	den, davydov-max, xiaoyao.li, dapeng1.mi, joe.jin, ewanhai-oc

> >>>> +    /*
> >>>> +     * If KVM_CAP_PMU_CAPABILITY is not supported, there is no way to
> >>>> +     * disable the AMD pmu virtualization.
> >>>> +     *
> >>>> +     * If KVM_CAP_PMU_CAPABILITY is supported !cpu->enable_pmu
> >>>> +     * indicates the KVM has already disabled the PMU virtualization.
> >>>> +     */
> >>>> +    if (has_pmu_cap && !cpu->enable_pmu) {
> >>>> +        return;
> >>>> +    }
> >>>
> >>> Could we only check "cpu->enable_pmu" at the beginning of this function?
> >>> then if pmu is already disabled, we don't need to initialize the pmu info.
> >>
> >> I don't think so. There is a case:
> >>
> >> - cpu->enable_pmu = false. (That is, "-cpu host,-pmu").
> >> - But for KVM prior v5.18 that KVM_CAP_PMU_CAPABILITY doesn't exist.
> >>
> >> There is no way to disable vPMU. To determine based on only
> >> "!cpu->enable_pmu" doesn't work.
> > 
> > Ah, I didn't get your point here. When QEMU user has already disabled
> > PMU, why we still need to continue initialize PMU info and save/load PMU
> > MSRs? In this case, user won't expect vPMU could work.
> 
> Yes, "In this case, user won't expect vPMU could work.".
> 
> But in reality vPMU is still active, although that doesn't match user's
> expectation.
> 
> User doesn't expect PMU to work. However, "perf stat" still works in VM
> (when KVM_CAP_PMU_CAPABILITY isn't available).
> 
> Would you suggest we only follow user's expectation?

Yes, for this case, many PMU related CPUIDs have already been disabled
because of "!enable_pmu", so IMO it's not necessary to handle other PMU
MSRs.

> That is, once user
> configure "-pmu", we are going to always assume vPMU is disabled, even it
> is still available (on KVM without KVM_CAP_PMU_CAPABILITY and prior v5.18)?

Strictly speaking, only the earlier AMD PMUs are still AVAILABLE at this
point, as the other platforms have CPUIDs to indicate PMU enablement.
So for the latter (which I understand is most of the cases nowadays),
there's no reason to assume that the PMUs are still working when the CPUIDs
are disabled...

There is no perfect solution for pre-v5.18 kernels... But while not breaking
compatibility, again IMO, we need the logic to be self-consistent, i.e.
any time the user does not enable vPMU (enable_pmu = false), it should be
assumed that vPMU does not work.

Thanks,
Zhao



* Re: [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset
  2025-03-12  8:30           ` Zhao Liu
@ 2025-03-12 22:17             ` Dongli Zhang
  0 siblings, 0 replies; 63+ messages in thread
From: Dongli Zhang @ 2025-03-12 22:17 UTC (permalink / raw)
  To: Zhao Liu
  Cc: qemu-devel, kvm, pbonzini, mtosatti, sandipan.das, babu.moger,
	likexu, like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov,
	den, davydov-max, xiaoyao.li, dapeng1.mi, joe.jin, ewanhai-oc

Hi Zhao,

On 3/12/25 1:30 AM, Zhao Liu wrote:
>>>>>> +    /*
>>>>>> +     * If KVM_CAP_PMU_CAPABILITY is not supported, there is no way to
>>>>>> +     * disable the AMD pmu virtualization.
>>>>>> +     *
>>>>>> +     * If KVM_CAP_PMU_CAPABILITY is supported !cpu->enable_pmu
>>>>>> +     * indicates the KVM has already disabled the PMU virtualization.
>>>>>> +     */
>>>>>> +    if (has_pmu_cap && !cpu->enable_pmu) {
>>>>>> +        return;
>>>>>> +    }
>>>>>
>>>>> Could we only check "cpu->enable_pmu" at the beginning of this function?
>>>>> then if pmu is already disabled, we don't need to initialize the pmu info.
>>>>
>>>> I don't think so. There is a case:
>>>>
>>>> - cpu->enable_pmu = false. (That is, "-cpu host,-pmu").
>>>> - But for KVM prior v5.18 that KVM_CAP_PMU_CAPABILITY doesn't exist.
>>>>
>>>> There is no way to disable vPMU. To determine based on only
>>>> "!cpu->enable_pmu" doesn't work.
>>>
>>> Ah, I didn't get your point here. When QEMU user has already disabled
>>> PMU, why we still need to continue initialize PMU info and save/load PMU
>>> MSRs? In this case, user won't expect vPMU could work.
>>
>> Yes, "In this case, user won't expect vPMU could work.".
>>
>> But in reality vPMU is still active, although that doesn't match user's
>> expectation.
>>
>> User doesn't expect PMU to work. However, "perf stat" still works in VM
>> (when KVM_CAP_PMU_CAPABILITY isn't available).
>>
>> Would you suggest we only follow user's expectation?
> 
> Yes, for this case, many PMU related CPUIDs have already been disabled
> because of "!enable_pmu", so IMO it's not necessary to handle other PMU
> MSRs.
> 
>> That is, once user
>> configure "-pmu", we are going to always assume vPMU is disabled, even it
>> is still available (on KVM without KVM_CAP_PMU_CAPABILITY and prior v5.18)?
> 
> Strictly speaking, only the earlier AMD PMUs are still AVAILABLE at this
> point, as the other platforms, have CPUIDs to indicate PMU enablement.
> So for the latter (which I understand is most of the cases nowadays),
> there's no reason to assume that the PMUs are still working when the CPUIDs
> are corrupted...
> 
> There is no perfect solution for pre-v5.18 kernel... But while not breaking
> compatibility, again IMO, we need the logic to be self-consistent, i.e.
> any time the user does not enable vPMU (enable_pmu = false), it should be
> assumed that vPMU does not work.
> 

Sure. That makes coding easier, with fewer assumptions.

Thank you very much!

Dongli Zhang


* Re: [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset
  2025-03-02 22:00 ` [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset Dongli Zhang
                     ` (3 preceding siblings ...)
  2025-03-10  7:47   ` Zhao Liu
@ 2025-03-28  6:29   ` ewanhai
  2025-03-28 16:42     ` Dongli Zhang
  4 siblings, 1 reply; 63+ messages in thread
From: ewanhai @ 2025-03-28  6:29 UTC (permalink / raw)
  To: Dongli Zhang, qemu-devel, kvm
  Cc: pbonzini, zhao1.liu, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, xiaoyao.li, dapeng1.mi, joe.jin, ewanhai, cobechen,
	louisqi, liamni, frankzhu, silviazhao

Hi Zhao,

Thank you for pointing out the potential impact on Zhaoxin CPUs!

Hi Dongli,

The PMU of Zhaoxin CPUs (vendor strings "  Shanghai  " and "CentaurHauls") is
compatible with Intel's, so I have some suggestions for this patch.

On 2025/3/3 06:00, Dongli Zhang wrote:
> [snip]
> +
> +static bool is_same_vendor(CPUX86State *env)
> +{
> +    static uint32_t host_cpuid_vendor1;
> +    static uint32_t host_cpuid_vendor2;
> +    static uint32_t host_cpuid_vendor3;
> +
> +    host_cpuid(0x0, 0, NULL, &host_cpuid_vendor1, &host_cpuid_vendor3,
> +               &host_cpuid_vendor2);
> +
> +    return env->cpuid_vendor1 == host_cpuid_vendor1 &&
> +           env->cpuid_vendor2 == host_cpuid_vendor2 &&
> +           env->cpuid_vendor3 == host_cpuid_vendor3;
> +}
Should we consider a special case, such as emulating Intel CPUs on a
Zhaoxin platform, or vice versa? If so, maybe we can write a
'vendor_compatible()' helper. After all, before this patchset, QEMU
supported behavior-similar CPU emulation, e.g., emulating an Intel VCPU on
a Zhaoxin PCPU.
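
One possible shape for such a helper, as a rough sketch only. It assumes
that the Zhaoxin vendor strings ("CentaurHauls" and "  Shanghai  ") are
grouped with Intel for PMU purposes, as stated above; the enum, the
string-based interface and pmu_family_of() are hypothetical and not QEMU
code (QEMU compares the three cpuid_vendor words instead). Vendors not
mentioned in this thread are treated as unknown.

```c
#include <stdbool.h>
#include <string.h>

/* PMU families; Zhaoxin is grouped with Intel (assumption from the thread). */
enum pmu_family {
    PMU_FAMILY_UNKNOWN,
    PMU_FAMILY_INTEL,
    PMU_FAMILY_AMD,
};

/* Map a 12-character CPUID vendor string to a PMU family. */
enum pmu_family pmu_family_of(const char *vendor)
{
    if (strcmp(vendor, "GenuineIntel") == 0 ||
        strcmp(vendor, "CentaurHauls") == 0 ||
        strcmp(vendor, "  Shanghai  ") == 0) {
        return PMU_FAMILY_INTEL;
    }
    if (strcmp(vendor, "AuthenticAMD") == 0) {
        return PMU_FAMILY_AMD;
    }
    return PMU_FAMILY_UNKNOWN;
}

/* True when guest and host vendors share a known PMU family. */
bool vendor_compatible(const char *guest_vendor, const char *host_vendor)
{
    enum pmu_family guest = pmu_family_of(guest_vendor);

    return guest != PMU_FAMILY_UNKNOWN &&
           guest == pmu_family_of(host_vendor);
}
```

Compared with is_same_vendor(), this accepts an Intel guest on a Zhaoxin
host (and vice versa) while still rejecting Intel/AMD mixes.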
> +static void kvm_init_pmu_info(CPUState *cs)
> +{
> +    X86CPU *cpu = X86_CPU(cs);
> +    CPUX86State *env = &cpu->env;
> +
> +    /*
> +     * The PMU virtualization is disabled by kvm.enable_pmu=N.
> +     */
> +    if (kvm_pmu_disabled) {
> +        return;
> +    }
> +
> +    /*
> +     * It is not supported to virtualize AMD PMU registers on Intel
> +     * processors, nor to virtualize Intel PMU registers on AMD processors.
> +     */
> +    if (!is_same_vendor(env)) {
> +        return;
> +    }

ditto.

> [snip]
> +    /*
> +     * If KVM_CAP_PMU_CAPABILITY is not supported, there is no way to
> +     * disable the AMD pmu virtualization.
> +     *
> +     * If KVM_CAP_PMU_CAPABILITY is supported !cpu->enable_pmu
> +     * indicates the KVM has already disabled the PMU virtualization.
> +     */
> +    if (has_pmu_cap && !cpu->enable_pmu) {
> +        return;
> +    }
> +
> +    if (IS_INTEL_CPU(env)) {
> +        kvm_init_pmu_info_intel(env);
We can use "if (IS_INTEL_CPU(env) || IS_ZHAOXIN_CPU(env))" instead. This
helper was introduced to QEMU in commit 5d20aa540b.

The function name kvm_init_pmu_info_"intel"() is acceptable since the
current Zhaoxin and Intel PMU architectures are compatible. However,
if Zhaoxin develops any exclusive features in the future, we can always
implement a separate "zhaoxin" version of the PMU info initialization
function.
> +    } else if (IS_AMD_CPU(env)) {
> +        kvm_init_pmu_info_amd(env);
> +    }
> +}
> +
[snip]
>   int kvm_arch_init_vcpu(CPUState *cs)
>   {
>       struct {
> @@ -2288,7 +2376,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
>       cpuid_i = kvm_x86_build_cpuid(env, cpuid_data.entries, cpuid_i);
>       cpuid_data.cpuid.nent = cpuid_i;
>   
> -    kvm_init_pmu_info(env);
> +    kvm_init_pmu_info(cs);
>   
>       if (((env->cpuid_version >> 8)&0xF) >= 6
>           && (env->features[FEAT_1_EDX] & (CPUID_MCE | CPUID_MCA)) ==
> @@ -4064,7 +4152,7 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>               kvm_msr_entry_add(cpu, MSR_KVM_POLL_CONTROL, env->poll_control_msr);
>           }
>   
> -        if (has_pmu_version > 0) {
> +        if (IS_INTEL_CPU(env) && has_pmu_version > 0) {
Also use 'if ((IS_INTEL_CPU(env) || IS_ZHAOXIN_CPU(env)) && has_pmu_version > 0)' instead.
>               if (has_pmu_version > 1) {
>                   /* Stop the counter.  */
>                   kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
> @@ -4095,6 +4183,38 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>                                     env->msr_global_ctrl);
>               }
>           }
> +
> +        if (IS_AMD_CPU(env) && has_pmu_version > 0) {
> +            uint32_t sel_base = MSR_K7_EVNTSEL0;
> +            uint32_t ctr_base = MSR_K7_PERFCTR0;
> ...
[snip]
> @@ -4542,7 +4662,8 @@ static int kvm_get_msrs(X86CPU *cpu)
>       if (env->features[FEAT_KVM] & CPUID_KVM_POLL_CONTROL) {
>           kvm_msr_entry_add(cpu, MSR_KVM_POLL_CONTROL, 1);
>       }
> -    if (has_pmu_version > 0) {
> +
> +    if (IS_INTEL_CPU(env) && has_pmu_version > 0) {
Also use 'if ((IS_INTEL_CPU(env) || IS_ZHAOXIN_CPU(env)) && has_pmu_version > 0)' instead.
>           if (has_pmu_version > 1) {
>               kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
>               kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_CTRL, 0);
> @@ -4558,6 +4679,35 @@ static int kvm_get_msrs(X86CPU *cpu)
>           }
>       }
>   
> +    if (IS_AMD_CPU(env) && has_pmu_version > 0) {
> +        uint32_t sel_base = MSR_K7_EVNTSEL0;
> +        uint32_t ctr_base = MSR_K7_PERFCTR0;
> ...



* Re: [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset
  2025-03-28  6:29   ` ewanhai
@ 2025-03-28 16:42     ` Dongli Zhang
  2025-03-31  3:55       ` ewanhai
  0 siblings, 1 reply; 63+ messages in thread
From: Dongli Zhang @ 2025-03-28 16:42 UTC (permalink / raw)
  To: ewanhai, qemu-devel, kvm, zhao1.liu
  Cc: pbonzini, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, xiaoyao.li, dapeng1.mi, joe.jin, ewanhai, cobechen,
	louisqi, liamni, frankzhu, silviazhao

Hi ewanhai (and FYI Zhao),

Thank you very much for the suggestions! Indeed, I have recently been struggling
with QEMU and Zhaoxin. Please see inline.

On 3/27/25 11:29 PM, ewanhai wrote:
> Hi Zhao,
> 
> Thank you for pointing out the potential impact on Zhaoxin CPUs!
> 
> Hi Dongli,
> 
> Zhaoxin (including vendor "__shanghai__" and "centaurhauls")'s PMU is
> compatible with Intel, so I have some advice for this patch.
> 
> On 2025/3/3 06:00, Dongli Zhang wrote:
>> [snip]
>> +
>> +static bool is_same_vendor(CPUX86State *env)
>> +{
>> +    static uint32_t host_cpuid_vendor1;
>> +    static uint32_t host_cpuid_vendor2;
>> +    static uint32_t host_cpuid_vendor3;
>> +
>> +    host_cpuid(0x0, 0, NULL, &host_cpuid_vendor1, &host_cpuid_vendor3,
>> +               &host_cpuid_vendor2);
>> +
>> +    return env->cpuid_vendor1 == host_cpuid_vendor1 &&
>> +           env->cpuid_vendor2 == host_cpuid_vendor2 &&
>> +           env->cpuid_vendor3 == host_cpuid_vendor3;
>> +}
> Should we consider a special case, such as emulating Intel CPUs on a
> Zhaoxin platform, or vice versa? If so, maybe we can write a

The vendor and the CPU model are different things. For example, if we use a
Zhaoxin CPU model without configuring the vendor ("-cpu YongFeng,+pmu \") on
Intel KVM, the CPU model is Zhaoxin while the vendor is still Intel.

The PMU selection is based on the vendor, not the CPU model.

[    0.321163] smpboot: CPU0: Intel Zhaoxin YongFeng Processor (family: 0x7, model: 0xb, stepping: 0x3)
[    0.321996] Performance Events: generic architected perfmon, Intel PMU driver.
[    0.322867] ... version:                2
[    0.323738] ... bit width:              48
[    0.323864] ... generic registers:      4
[    0.324776] ... value mask:             0000ffffffffffff
[    0.324864] ... max period:             000000007fffffff
[    0.325864] ... fixed-purpose events:   3
[    0.326749] ... event mask:             000000070000000f

By default, IS_INTEL_CPU() still returns true even when we emulate Zhaoxin on Intel KVM.

> 'vendor_compatible()' helper. After all, before this patchset, QEMU
> supported behavior-similar CPU emulation, e.g., emulating an Intel VCPU on
> a Zhaoxin PCPU.

I made many attempts, but I could not use the Zhaoxin PMU on an Intel hypervisor.

According to arch/x86/events/zhaoxin/core.c, the Zhaoxin PMU driver only works
under limited conditions; for family 6, model 0xf, for example, only when
stepping >= 0xe.

switch (boot_cpu_data.x86) {
case 0x06:
    /*
     * Support Zhaoxin CPU from ZXC series, exclude Nano series through FMS.
     * Nano FMS: Family=6, Model=F, Stepping=[0-A][C-D]
     * ZXC FMS: Family=6, Model=F, Stepping=E-F OR Family=6, Model=0x19, Stepping=0-3
     */
    if ((boot_cpu_data.x86_model == 0x0f && boot_cpu_data.x86_stepping >= 0x0e) ||
            boot_cpu_data.x86_model == 0x19) {


In QEMU, the stepping of the YongFeng model is always 3.

        .name = "YongFeng",
        .level = 0x1F,
        .vendor = CPUID_VENDOR_ZHAOXIN1,
        .family = 7,
        .model = 11,
        .stepping = 3,

Therefore, I cannot enable Zhaoxin's PMU on Intel KVM.
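
For reference, the family 0x06 condition quoted above can be written as a
small standalone predicate. This is a sketch only:
zx_family6_rule_matches() is a hypothetical name, and it reproduces just
the quoted branch, so a "false" for any other family only means that this
particular rule does not apply, not what the kernel driver does for that
family.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: does the family 0x06 rule from
 * arch/x86/events/zhaoxin/core.c (as quoted above) match this
 * family/model/stepping triple?
 */
bool zx_family6_rule_matches(uint8_t family, uint8_t model, uint8_t stepping)
{
    if (family != 0x06) {
        return false;   /* the quoted branch is only about family 6 */
    }
    /* ZXC: model 0xf with stepping 0xe-0xf, or model 0x19. */
    return (model == 0x0f && stepping >= 0x0e) || model == 0x19;
}
```

Note that the QEMU YongFeng model is family 7, model 0xb, stepping 3, so
the quoted family 6 branch does not match it regardless of the stepping.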

-cpu YongFeng,vendor="CentaurHauls",+pmu \

[    0.253229] smpboot: CPU0: Centaur Zhaoxin YongFeng Processor (family: 0x7, model: 0xb, stepping: 0x3)
[    0.254009] Performance Events:
[    0.254009] core: Welcome to zhaoxin pmu!
[    0.254880] core: Version check pass!
[    0.255567] no PMU driver, software events only.


It doesn't work on an Intel Icelake hypervisor either, even with "host".

-cpu host,vendor="CentaurHauls",+pmu \

[    0.268434] smpboot: CPU0: Centaur Intel(R) Xeon(R) Gold 6354 CPU @ 3.00GHz (family: 0x6, model: 0x6a, stepping: 0x6)
[    0.269237] Performance Events:
[    0.269237] core: Welcome to zhaoxin pmu!
[    0.270112] core: Version check pass!
[    0.270768] no PMU driver, software events only.


The PMU never works, although cpuid returns PMU config.

[root@vm ~]# cpuid -1 -l 0xa
CPU:
   Architecture Performance Monitoring Features (0xa):
      version ID                               = 0x2 (2)
      number of counters per logical processor = 0x8 (8)
      bit width of counter                     = 0x30 (48)
      length of EBX bit vector                 = 0x8 (8)
      core cycle event                         = available
      instruction retired event                = available
      reference cycles event                   = available
      last-level cache ref event               = available
      last-level cache miss event              = available
      branch inst retired event                = available
      branch mispred retired event             = available
      top-down slots event                     = available
... ...
      number of contiguous fixed counters      = 0x3 (3)
      bit width of fixed counters              = 0x30 (48)
      anythread deprecation                    = true
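
As a reading aid for the output above, the fields of CPUID leaf 0xa can be
decoded as sketched below. The struct and function names are hypothetical.
The raw register values used in the usage note (EAX = 0x08300802,
EDX = 0x8603) are an assumption reconstructed from the decoded fields shown
above, not values captured from the VM.

```c
#include <stdint.h>

/* Decoded fields of CPUID leaf 0xa (architectural performance monitoring). */
struct pmu_leaf_a {
    unsigned version;       /* EAX[7:0]   version ID */
    unsigned num_gp;        /* EAX[15:8]  counters per logical processor */
    unsigned gp_width;      /* EAX[23:16] bit width of counters */
    unsigned ebx_len;       /* EAX[31:24] length of EBX bit vector */
    unsigned num_fixed;     /* EDX[4:0]   number of fixed counters */
    unsigned fixed_width;   /* EDX[12:5]  bit width of fixed counters */
};

/* Split the EAX and EDX values of leaf 0xa into their bit fields. */
struct pmu_leaf_a decode_leaf_a(uint32_t eax, uint32_t edx)
{
    struct pmu_leaf_a r;

    r.version     = eax & 0xff;
    r.num_gp      = (eax >> 8) & 0xff;
    r.gp_width    = (eax >> 16) & 0xff;
    r.ebx_len     = (eax >> 24) & 0xff;
    r.num_fixed   = edx & 0x1f;
    r.fixed_width = (edx >> 5) & 0xff;
    return r;
}
```

decode_leaf_a(0x08300802, 0x8603) yields version 2, 8 general-purpose
counters of 48 bits, an EBX vector length of 8, and 3 fixed counters of
48 bits, matching the listing above. The leaf therefore advertises a PMU
even though the guest driver reports "no PMU driver".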


So far I have not been able to use the Zhaoxin PMU on an Intel hypervisor.

Since I don't have a Zhaoxin environment, I am not sure about "vice versa".

Unless there are more suggestions from Zhao, I may replace is_same_vendor() with
vendor_compatible().

>> +static void kvm_init_pmu_info(CPUState *cs)
>> +{
>> +    X86CPU *cpu = X86_CPU(cs);
>> +    CPUX86State *env = &cpu->env;
>> +
>> +    /*
>> +     * The PMU virtualization is disabled by kvm.enable_pmu=N.
>> +     */
>> +    if (kvm_pmu_disabled) {
>> +        return;
>> +    }
>> +
>> +    /*
>> +     * It is not supported to virtualize AMD PMU registers on Intel
>> +     * processors, nor to virtualize Intel PMU registers on AMD processors.
>> +     */
>> +    if (!is_same_vendor(env)) {
>> +        return;
>> +    }
> 
> ditto.

Sure. I may replace is_same_vendor() with
vendor_compatible(), unless there is an objection from Zhao.

> 
>> [snip]
>> +    /*
>> +     * If KVM_CAP_PMU_CAPABILITY is not supported, there is no way to
>> +     * disable the AMD pmu virtualization.
>> +     *
>> +     * If KVM_CAP_PMU_CAPABILITY is supported !cpu->enable_pmu
>> +     * indicates the KVM has already disabled the PMU virtualization.
>> +     */
>> +    if (has_pmu_cap && !cpu->enable_pmu) {
>> +        return;
>> +    }
>> +
>> +    if (IS_INTEL_CPU(env)) {
>> +        kvm_init_pmu_info_intel(env);
> We can use "if (IS_INTEL_CPU(env) || IS_ZHAOXIN_CPU(env))" instead. This
> helper was introduced to QEMU in commit 5d20aa540b.

Sure.

> 
> The function name kvm_init_pmu_info_"intel"() is acceptable since the
> current Zhaoxin and Intel PMU architectures are compatible. However,
> if Zhaoxin develop any exclusive features in the future, we can always
> implement a separate "zhaoxin" version of the PMU info initialization
> function.
>> +    } else if (IS_AMD_CPU(env)) {
>> +        kvm_init_pmu_info_amd(env);
>> +    }
>> +}
>> +
> [snip]
>>   int kvm_arch_init_vcpu(CPUState *cs)
>>   {
>>       struct {
>> @@ -2288,7 +2376,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
>>       cpuid_i = kvm_x86_build_cpuid(env, cpuid_data.entries, cpuid_i);
>>       cpuid_data.cpuid.nent = cpuid_i;
>>
>> -    kvm_init_pmu_info(env);
>> +    kvm_init_pmu_info(cs);
>>
>>       if (((env->cpuid_version >> 8)&0xF) >= 6
>>           && (env->features[FEAT_1_EDX] & (CPUID_MCE | CPUID_MCA)) ==
>> @@ -4064,7 +4152,7 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>>               kvm_msr_entry_add(cpu, MSR_KVM_POLL_CONTROL, env->poll_control_msr);
>>           }
>>
>> -        if (has_pmu_version > 0) {
>> +        if (IS_INTEL_CPU(env) && has_pmu_version > 0) {
> Also use 'if ((IS_INTEL_CPU(env) || IS_ZHAOXIN_CPU(env)) && has_pmu_version > 0)' instead.
>>               if (has_pmu_version > 1) {
>>                   /* Stop the counter.  */
>>                   kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
>> @@ -4095,6 +4183,38 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>>                                     env->msr_global_ctrl);
>>               }
>>           }
>> +
>> +        if (IS_AMD_CPU(env) && has_pmu_version > 0) {
>> +            uint32_t sel_base = MSR_K7_EVNTSEL0;
>> +            uint32_t ctr_base = MSR_K7_PERFCTR0;
>> ...
> [snip]
>> @@ -4542,7 +4662,8 @@ static int kvm_get_msrs(X86CPU *cpu)
>>       if (env->features[FEAT_KVM] & CPUID_KVM_POLL_CONTROL) {
>>           kvm_msr_entry_add(cpu, MSR_KVM_POLL_CONTROL, 1);
>>       }
>> -    if (has_pmu_version > 0) {
>> +
>> +    if (IS_INTEL_CPU(env) && has_pmu_version > 0) {
> Also use 'if ((IS_INTEL_CPU(env) || IS_ZHAOXIN_CPU(env)) && has_pmu_version > 0)' instead.

Sure.

Thank you very much for suggestion!

Dongli Zhang


* Re: [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset
  2025-03-28 16:42     ` Dongli Zhang
@ 2025-03-31  3:55       ` ewanhai
  2025-03-31 19:16         ` Dongli Zhang
  0 siblings, 1 reply; 63+ messages in thread
From: ewanhai @ 2025-03-31  3:55 UTC (permalink / raw)
  To: Dongli Zhang, qemu-devel, kvm, zhao1.liu
  Cc: pbonzini, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, xiaoyao.li, dapeng1.mi, joe.jin, ewanhai, cobechen,
	louisqi, liamni, frankzhu, silviazhao

Hi Dongli,

I noticed you've sent the V3 patchset, but I believe it's more appropriate to
continue the discussion about the issues you encountered in this thread.

On 3/29/25 12:42 AM, Dongli Zhang wrote:
> The vendor and CPU are different. i.e., if we use Zhaoxin CPU without
> configuring vendor: "-cpu YongFeng,+pmu \" on Intel KVM.
>
> The CPU is Zhaoxin while vendor is still Intel.
[1] QEMU always sets the vCPU's vendor to match the host's vendor when
acceleration (KVM or HVF) is enabled (unless the user sets the guest vendor
explicitly with -cpu xx,vendor=xx).
> The PMU selection is based on vendor, not CPU.
>
> [    0.321163] smpboot: CPU0: Intel Zhaoxin YongFeng Processor (family: 0x7,
> model: 0xb, stepping: 0x3)
> [    0.321996] Performance Events: generic architected perfmon, Intel PMU driver.
> [    0.322867] ... version:                2
> [    0.323738] ... bit width:              48
> [    0.323864] ... generic registers:      4
> [    0.324776] ... value mask:             0000ffffffffffff
> [    0.324864] ... max period:             000000007fffffff
> [    0.325864] ... fixed-purpose events:   3
> [    0.326749] ... event mask:             000000070000000f
>
> By default, IS_INTEL_CPU() still returns true even we emulate Zhaoxin on Intel KVM.

[2] As mentioned in [1], QEMU always sets the vCPU's vendor to match the host's
vendor when acceleration (KVM or HVF) is enabled. Therefore, if users want to
emulate a Zhaoxin CPU on an Intel host, the vendor must be set manually.
Furthermore, should we display a warning to users who enable both vPMU and KVM
acceleration but do not manually set the guest vendor when the CPU model's
vendor differs from the host vendor?
> I did many efforts, and I could not use Zhaoxin's PMU on Intel hypervisor.
>
> According to arch/x86/events/zhaoxin/core.c, the Zhaoxin's PMU is working in
> limited conditions, especially only when stepping >= 0xe.
>
> switch (boot_cpu_data.x86) {
> case 0x06:
>      /*
>       * Support Zhaoxin CPU from ZXC series, exclude Nano series through FMS.
>       * Nano FMS: Family=6, Model=F, Stepping=[0-A][C-D]
>       * ZXC FMS: Family=6, Model=F, Stepping=E-F OR Family=6, Model=0x19,
> Stepping=0-3
>       */
>      if ((boot_cpu_data.x86_model == 0x0f && boot_cpu_data.x86_stepping >= 0x0e) ||
>              boot_cpu_data.x86_model == 0x19) {
>
>
>  From QEMU, the stepping of YongFeng is always 3.
>
>         .name = "YongFeng",
>         .level = 0x1F,
>         .vendor = CPUID_VENDOR_ZHAOXIN1,
>         .family = 7,
>         .model = 11,
>         .stepping = 3,
>
> Therefore, I cannot enable Zhaoxin's PMU on Intel KVM.
>
> -cpu YongFeng,vendor="CentaurHauls",+pmu \
>
> [    0.253229] smpboot: CPU0: Centaur Zhaoxin YongFeng Processor (family: 0x7,
> model: 0xb, stepping: 0x3)
> [    0.254009] Performance Events:
> [    0.254009] core: Welcome to zhaoxin pmu!
> [    0.254880] core: Version check pass!
> [    0.255567] no PMU driver, software events only.
>
>
> It doesn't work on Intel Icelake hypervisor too, even with "host".
>
> -cpu host,vendor="CentaurHauls",+pmu \
>
> [    0.268434] smpboot: CPU0: Centaur Intel(R) Xeon(R) Gold 6354 CPU @ 3.00GHz
> (family: 0x6, model: 0x6a, stepping: 0x6)
> [    0.269237] Performance Events:
> [    0.269237] core: Welcome to zhaoxin pmu!
> [    0.270112] core: Version check pass!
> [    0.270768] no PMU driver, software events only.
>
>
> The PMU never works, although cpuid returns PMU config.
>
> [root@vm ~]# cpuid -1 -l 0xa
> CPU:
>     Architecture Performance Monitoring Features (0xa):
>        version ID                               = 0x2 (2)
>        number of counters per logical processor = 0x8 (8)
>        bit width of counter                     = 0x30 (48)
>        length of EBX bit vector                 = 0x8 (8)
>        core cycle event                         = available
>        instruction retired event                = available
>        reference cycles event                   = available
>        last-level cache ref event               = available
>        last-level cache miss event              = available
>        branch inst retired event                = available
>        branch mispred retired event             = available
>        top-down slots event                     = available
> ... ...
>        number of contiguous fixed counters      = 0x3 (3)
>        bit width of fixed counters              = 0x30 (48)
>        anythread deprecation                    = true
>
>
> So far I am not able to use Zhaoxin PMU on Intel hypervisor.
>
> Since I don't have Zhaoxin environment, I am not sure about "vice versa".
>
> Unless there is more suggestion from Zhao, I may replace is_same_vendor() with
> vendor_compatible().
I'm sorry I didn't provide you with enough information about the Zhaoxin PMU.

1. I made a mistake in the Zhaoxin YongFeng vCPU model patch. The correct model
should be 0x5b, but I mistakenly set it to 0xb (11). The mistake happened because
I overlooked the extended model bits from cpuid[eax=0x1].eax and only used the
base model. I'll send a fix patch soon.

2. As you can see in zhaoxin_pmu_init() in the Linux kernel, there is no handling
for CPUs with family 0x7 and model (base + extended) 0x5b. The reason is clear:
we submitted a patch for zhaoxin_pmu_init() to support YongFeng two years ago
(https://lore.kernel.org/lkml/20230323024026.823-1-silviazhao-oc@zhaoxin.com/),
but received no response. We will keep trying to resubmit it.
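
To illustrate point 1, here is a minimal sketch of how the displayed model is
derived from cpuid[eax=0x1].eax. It follows the rule used by Linux's
x86_family()/x86_model() helpers (the extended model is added when the family
is >= 6). The names decode_cpuid1_eax() and encode_cpuid1_eax() are
hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Family/model/stepping as an OS would display them. */
struct cpu_sig {
    unsigned family;
    unsigned model;
    unsigned stepping;
};

/* Decode CPUID.01H:EAX the way Linux's x86_family()/x86_model() do. */
static struct cpu_sig decode_cpuid1_eax(uint32_t eax)
{
    struct cpu_sig sig;
    unsigned base_family = (eax >> 8) & 0xf;
    unsigned base_model  = (eax >> 4) & 0xf;
    unsigned ext_model   = (eax >> 16) & 0xf;
    unsigned ext_family  = (eax >> 20) & 0xff;

    sig.family = base_family;
    if (base_family == 0xf) {
        sig.family += ext_family;
    }

    sig.model = base_model;
    if (sig.family >= 0x6) {
        /* The extended model supplies the upper four bits. */
        sig.model += ext_model << 4;
    }

    sig.stepping = eax & 0xf;
    return sig;
}

/* Hypothetical inverse, valid only for families below 0xf. */
static uint32_t encode_cpuid1_eax(unsigned family, unsigned model,
                                  unsigned stepping)
{
    assert(family < 0xf && model <= 0xff && stepping <= 0xf);
    return ((uint32_t)(model >> 4) << 16) | (family << 8) |
           ((model & 0xf) << 4) | stepping;
}
```

With family 7, model 0x5b and stepping 3, the extended model field is 0x5 and
the base model field is 0xb. Dropping the extended field leaves only 0xb,
which is the value that ended up in the YongFeng definition.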




* Re: [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset
  2025-03-31  3:55       ` ewanhai
@ 2025-03-31 19:16         ` Dongli Zhang
  2025-04-01  3:35           ` Ewan Hai
  0 siblings, 1 reply; 63+ messages in thread
From: Dongli Zhang @ 2025-03-31 19:16 UTC (permalink / raw)
  To: ewanhai, qemu-devel, kvm
  Cc: pbonzini, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, xiaoyao.li, dapeng1.mi, joe.jin, ewanhai, cobechen,
	louisqi, liamni, frankzhu, silviazhao, zhao1.liu

Hi ewanhai,

On 3/30/25 8:55 PM, ewanhai wrote:
> Hi Dongli,
> 

[snip]

> 
> [2] As mentioned in [1], QEMU always sets the vCPU's vendor to match the host's
> vendor
> when acceleration (KVM or HVF) is enabled. Therefore, if users want to emulate a
> Zhaoxin CPU on an Intel host, the vendor must be set manually.Furthermore,
> should we display a warning to users who enable both vPMU and KVM acceleration
> but do not manually set the guest vendor when it differs from the host vendor?

Maybe not? Sometimes I emulate AMD on an Intel host while the vendor is still
the default :)

>> I did many efforts, and I could not use Zhaoxin's PMU on Intel hypervisor.
>>

[snip]

>>
>> So far I am not able to use Zhaoxin PMU on Intel hypervisor.
>>
>> Since I don't have Zhaoxin environment, I am not sure about "vice versa".
>>
>> Unless there is more suggestion from Zhao, I may replace is_same_vendor() with
>> vendor_compatible().
> I'm sorry I didn't provide you with enough information about the Zhaoxin PMU.
> 
> 1. I made a mistake in the Zhaoxin YongFeng vCPU model patch. The correct model
> should be 0x5b, but I mistakenly set it to 0xb (11). The mistake happened because
> I overlooked the extended model bits from cpuid[eax=0x1].eax and only used the
> base model. I'll send a fix patch soon.
> 
> 2. As you can see in zhaoxin_pmu_init() in the Linux kernel, there is no handling
> for CPUs with family 0x7 and model (base + extended) 0x5b. The reason is clear:
> we submitted a patch for zhaoxin_pmu_init() to support YongFeng two years ago
> (https://lore.kernel.org/lkml/20230323024026.823-1-silviazhao-oc@zhaoxin.com/),
> but received no response. We will keep trying to resubmit it.
> 

Thank you very much for the explanation.

A v5.15 guest is able to detect the PMU after the change below is applied.

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 1b64ceaaba..9077c4c44f 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5503,7 +5503,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
         .level = 0x1F,
         .vendor = CPUID_VENDOR_ZHAOXIN1,
         .family = 7,
-        .model = 11,
+        .model = 0x3b,
         .stepping = 3,
         /* missing: CPUID_HT, CPUID_TM, CPUID_PBE */
         .features[FEAT_1_EDX] =

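I have changed the model to 0x3b rather than 0x5b, because zhaoxin_pmu_init()
in the v5.15 guest kernel recognizes family 0x7, model 0x3b (see the "ZXE
events" line in the log) but does not handle 0x5b yet.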

[    0.298541] smpboot: CPU0: Centaur Zhaoxin YongFeng Processor (family: 0x7,
model: 0x3b, stepping: 0x3)
[    0.299294] Performance Events:
[    0.299295] core: Welcome to zhaoxin pmu!
[    0.300176] core: Version check pass!
[    0.301002] ZXE events, zhaoxin PMU driver.
[    0.301177] ... version:                2
[    0.302061] ... bit width:              48
[    0.302174] ... generic registers:      4
[    0.303053] ... value mask:             0000ffffffffffff
[    0.303174] ... max period:             00007fffffffffff
[    0.304174] ... fixed-purpose events:   3
[    0.305063] ... event mask:             000000070000000f


In the v3 patchset, the code always follows the Intel path when both the guest
and the host are Intel or Zhaoxin.

https://lore.kernel.org/qemu-devel/20250331013307.11937-9-dongli.zhang@oracle.com/
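
As a rough illustration of the idea only (this is not the v3 code), a
vendor_compatible() style check could group vendors by the PMU path they
share. The enum, pmu_family_of() and the string-based interface are
assumptions made for this sketch; the real code works on the CPUID vendor
words in CPUX86State.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Vendor strings as reported by CPUID leaf 0. */
#define VENDOR_INTEL    "GenuineIntel"
#define VENDOR_AMD      "AuthenticAMD"
#define VENDOR_ZHAOXIN1 "CentaurHauls"
#define VENDOR_ZHAOXIN2 "  Shanghai  "

/* Hypothetical grouping: Intel and Zhaoxin share the Intel PMU path. */
enum pmu_family {
    PMU_FAMILY_UNKNOWN,
    PMU_FAMILY_INTEL,
    PMU_FAMILY_AMD,
};

static enum pmu_family pmu_family_of(const char *vendor)
{
    assert(vendor != NULL);

    if (!strcmp(vendor, VENDOR_INTEL) ||
        !strcmp(vendor, VENDOR_ZHAOXIN1) ||
        !strcmp(vendor, VENDOR_ZHAOXIN2)) {
        return PMU_FAMILY_INTEL;
    }
    if (!strcmp(vendor, VENDOR_AMD)) {
        return PMU_FAMILY_AMD;
    }
    return PMU_FAMILY_UNKNOWN;
}

/* True when host and guest would use the same PMU path. */
static bool vendor_compatible(const char *host, const char *guest)
{
    enum pmu_family h = pmu_family_of(host);

    return h != PMU_FAMILY_UNKNOWN && h == pmu_family_of(guest);
}
```

With this grouping, an Intel host with a "CentaurHauls" guest is compatible,
while an Intel host with an "AuthenticAMD" guest is not.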


Thank you very much!

Dongli Zhang



* Re: [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset
  2025-03-31 19:16         ` Dongli Zhang
@ 2025-04-01  3:35           ` Ewan Hai
  2025-04-07  8:51             ` Zhao Liu
  0 siblings, 1 reply; 63+ messages in thread
From: Ewan Hai @ 2025-04-01  3:35 UTC (permalink / raw)
  To: Dongli Zhang, qemu-devel, kvm
  Cc: pbonzini, mtosatti, sandipan.das, babu.moger, likexu,
	like.xu.linux, zhenyuw, groug, khorenko, alexander.ivanov, den,
	davydov-max, xiaoyao.li, dapeng1.mi, joe.jin, ewanhai, cobechen,
	louisqi, liamni, frankzhu, silviazhao, zhao1.liu

>> [2] As mentioned in [1], QEMU always sets the vCPU's vendor to match the host's
>> vendor
>> when acceleration (KVM or HVF) is enabled. Therefore, if users want to emulate a
>> Zhaoxin CPU on an Intel host, the vendor must be set manually.Furthermore,
>> should we display a warning to users who enable both vPMU and KVM acceleration
>> but do not manually set the guest vendor when it differs from the host vendor?
> 
> Maybe not? Sometimes I emulate AMD on Intel host, while vendor is still the
> default :)

Okay, handling this situation can be rather complex, so let's keep it simple.
I have written the dedicated function below to capture the intended behavior
for potential future reference.

Anyway, thanks for taking Zhaoxin's situation into account.


+/*
+ * check_vendor_compatibility_and_warn() returns true if the host and
+ * guest vendors are compatible for vPMU virtualization. In addition, if
+ * the guest vendor is not explicitly set in a cross-vendor emulation
+ * scenario (e.g., a Zhaoxin host emulating an Intel guest or vice versa),
+ * it issues a warning.
+ */
+static bool check_vendor_compatibility_and_warn(CPUX86State *env)
+{
+    char host_vendor[CPUID_VENDOR_SZ + 1];
+    uint32_t host_cpuid_vendor1, host_cpuid_vendor2, host_cpuid_vendor3;
+
+    /* Retrieve host vendor info */
+    host_cpuid(0x0, 0, NULL, &host_cpuid_vendor1, &host_cpuid_vendor3,
+               &host_cpuid_vendor2);
+    x86_cpu_vendor_words2str(host_vendor, host_cpuid_vendor1,
+                             host_cpuid_vendor2, host_cpuid_vendor3);
+
+    /*
+     * Case A:
+     * If the host vendor is Intel or Zhaoxin and the guest CPU type is
+     * either Intel or Zhaoxin, consider them compatible. However, if a
+     * cross-vendor scenario is detected (e.g., host is Zhaoxin but guest is
+     * Intel, or vice versa) and the guest vendor fields have not been
+     * overridden (i.e., they still match the host), then warn the user.
+     */
+    if ((g_str_equal(host_vendor, CPUID_VENDOR_INTEL) ||
+         g_str_equal(host_vendor, CPUID_VENDOR_ZHAOXIN1) ||
+         g_str_equal(host_vendor, CPUID_VENDOR_ZHAOXIN2)) &&
+        (IS_INTEL_CPU(env) || IS_ZHAOXIN_CPU(env)))
+    {
+        if ((g_str_equal(host_vendor, CPUID_VENDOR_ZHAOXIN1) ||
+             g_str_equal(host_vendor, CPUID_VENDOR_ZHAOXIN2)) &&
+            IS_INTEL_CPU(env) &&
+            (env->cpuid_vendor1 == host_cpuid_vendor1 &&
+             env->cpuid_vendor2 == host_cpuid_vendor2 &&
+             env->cpuid_vendor3 == host_cpuid_vendor3))
+        {
+            warn_report("vPMU emulation will fail because the guest vendor "
+                        "is not explicitly set. Use "
+                        "'-cpu xx,vendor=GenuineIntel' to emulate Intel vPMU "
+                        "on a Zhaoxin host.");
+        }
+        else if (g_str_equal(host_vendor, CPUID_VENDOR_INTEL) &&
+                 IS_ZHAOXIN_CPU(env) &&
+                 (env->cpuid_vendor1 == host_cpuid_vendor1 &&
+                  env->cpuid_vendor2 == host_cpuid_vendor2 &&
+                  env->cpuid_vendor3 == host_cpuid_vendor3))
+        {
+            warn_report("vPMU emulation will fail because the guest vendor "
+                        "is not explicitly set. Use "
+                        "'-cpu xx,vendor=CentaurHauls' to emulate Zhaoxin "
+                        "vPMU on an Intel host.");
+        }
+        return true;
+    }
+
+    /*
+     * Case B:
+     * For other CPU types, if the guest vendor fields exactly match the host,
+     * consider them compatible.
+     */
+    if (env->cpuid_vendor1 == host_cpuid_vendor1 &&
+        env->cpuid_vendor2 == host_cpuid_vendor2 &&
+        env->cpuid_vendor3 == host_cpuid_vendor3)
+    {
+        return true;
+    }
+
+    return false;
+}
+
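
The argument order in the host_cpuid() call above (EBX, then ECX, then EDX
mapped to vendor1, vendor3, vendor2) follows from how CPUID leaf 0 lays out
the 12-byte vendor string: bytes 0-3 in EBX, bytes 4-7 in EDX and bytes 8-11
in ECX. The sketch below shows that packing in isolation. vendor_str2words()
and vendor_words2str() are hypothetical helpers for this example; the second
one is only modeled on what x86_cpu_vendor_words2str() is used for here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Pack four characters into a little-endian 32-bit register value. */
static uint32_t pack4(const char *s)
{
    return (uint32_t)(uint8_t)s[0] |
           ((uint32_t)(uint8_t)s[1] << 8) |
           ((uint32_t)(uint8_t)s[2] << 16) |
           ((uint32_t)(uint8_t)s[3] << 24);
}

/* Split a 12-character vendor string into the CPUID.0 register values. */
static void vendor_str2words(const char *vendor, uint32_t *ebx,
                             uint32_t *edx, uint32_t *ecx)
{
    assert(strlen(vendor) == 12);
    *ebx = pack4(vendor);       /* vendor1: bytes 0-3  */
    *edx = pack4(vendor + 4);   /* vendor2: bytes 4-7  */
    *ecx = pack4(vendor + 8);   /* vendor3: bytes 8-11 */
}

/* Rebuild the string from vendor1 (EBX), vendor2 (EDX), vendor3 (ECX). */
static void vendor_words2str(char dst[13], uint32_t vendor1,
                             uint32_t vendor2, uint32_t vendor3)
{
    for (int i = 0; i < 4; i++) {
        dst[i]     = (char)((vendor1 >> (8 * i)) & 0xff);
        dst[i + 4] = (char)((vendor2 >> (8 * i)) & 0xff);
        dst[i + 8] = (char)((vendor3 >> (8 * i)) & 0xff);
    }
    dst[12] = '\0';
}
```

For "GenuineIntel" this yields EBX = 0x756e6547, EDX = 0x49656e69 and
ECX = 0x6c65746e.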


* Re: [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset
  2025-04-01  3:35           ` Ewan Hai
@ 2025-04-07  8:51             ` Zhao Liu
  2025-04-07  9:33               ` Ewan Hai
  0 siblings, 1 reply; 63+ messages in thread
From: Zhao Liu @ 2025-04-07  8:51 UTC (permalink / raw)
  To: Ewan Hai
  Cc: Dongli Zhang, qemu-devel, kvm, pbonzini, mtosatti, sandipan.das,
	babu.moger, likexu, like.xu.linux, zhenyuw, groug, khorenko,
	alexander.ivanov, den, davydov-max, xiaoyao.li, dapeng1.mi,
	joe.jin, ewanhai, cobechen, louisqi, liamni, frankzhu, silviazhao

On Tue, Apr 01, 2025 at 11:35:49AM +0800, Ewan Hai wrote:
> Date: Tue, 1 Apr 2025 11:35:49 +0800
> From: Ewan Hai <ewanhai-oc@zhaoxin.com>
> Subject: Re: [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers
>  during VM reset
> 
> > > [2] As mentioned in [1], QEMU always sets the vCPU's vendor to match the host's
> > > vendor
> > > when acceleration (KVM or HVF) is enabled. Therefore, if users want to emulate a
> > > Zhaoxin CPU on an Intel host, the vendor must be set manually.Furthermore,
> > > should we display a warning to users who enable both vPMU and KVM acceleration
> > > but do not manually set the guest vendor when it differs from the host vendor?
> > 
> > Maybe not? Sometimes I emulate AMD on Intel host, while vendor is still the
> > default :)
> 
> Okay, handling this situation can be rather complex, so let's keep it
> simple. I have added a dedicated function to capture the intended behavior
> for potential future reference.
> 
> Anyway, Thanks for taking Zhaoxin's situation into account, regardless.
> 

Thanks for your code example!!

Zhaoxin implements perfmon v2, so I think checking the vendor might be
overly complicated. If a check is needed, it seems more reasonable to
check the perfmon version rather than the vendor, similar to how avx10
version is checked in x86_cpu_filter_features().

I understand Ewan's concern is that if an Intel guest requires a higher
perfmon version than the Zhaoxin host supports, there could be issues.
I don't think this situation currently exists in KVM-QEMU, though. One
reason is that QEMU uses the pmu_version in leaf 0xa queried directly from
KVM, which means QEMU currently doesn't support a custom pmu_version.
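
A minimal sketch of what such a version check could look like, assuming the
guest and host CPUID.0xA EAX values are both at hand. The function names are
hypothetical and this is not how x86_cpu_filter_features() is written today;
it only illustrates comparing the version field (EAX[7:0]) instead of the
vendor. A real filter would also have to adjust the other fields of the leaf.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPUID.0AH:EAX[7:0] is the architectural perfmon version ID. */
static uint8_t perfmon_version(uint32_t cpuid_0xa_eax)
{
    return cpuid_0xa_eax & 0xff;
}

/* Hypothetical check: the guest may ask for at most the host's version. */
static bool perfmon_version_supported(uint32_t guest_eax, uint32_t host_eax)
{
    return perfmon_version(guest_eax) <= perfmon_version(host_eax);
}

/* Hypothetical filter: lower the guest's version to what the host offers. */
static uint32_t filter_perfmon_version(uint32_t guest_eax, uint32_t host_eax)
{
    uint32_t out = guest_eax;

    if (!perfmon_version_supported(guest_eax, host_eax)) {
        out = (guest_eax & ~0xffu) | perfmon_version(host_eax);
    }

    assert(perfmon_version_supported(out, host_eax));
    return out;
}
```

For example, EAX = 0x08300802 encodes version 2 with 8 counters of 48 bits,
matching the cpuid output earlier in this thread.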

(I'll help go through Dongli's v3 soon.)

Thank you both,
Zhao



* Re: [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset
  2025-04-07  8:51             ` Zhao Liu
@ 2025-04-07  9:33               ` Ewan Hai
  2025-04-16  8:17                 ` Mi, Dapeng
  0 siblings, 1 reply; 63+ messages in thread
From: Ewan Hai @ 2025-04-07  9:33 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Dongli Zhang, qemu-devel, kvm, pbonzini, mtosatti, sandipan.das,
	babu.moger, likexu, like.xu.linux, zhenyuw, groug, khorenko,
	alexander.ivanov, den, davydov-max, xiaoyao.li, dapeng1.mi,
	joe.jin, ewanhai, cobechen, louisqi, liamni, frankzhu, silviazhao,
	yeeli



On 4/7/25 4:51 PM, Zhao Liu wrote:

> 
> On Tue, Apr 01, 2025 at 11:35:49AM +0800, Ewan Hai wrote:
>> Date: Tue, 1 Apr 2025 11:35:49 +0800
>> From: Ewan Hai <ewanhai-oc@zhaoxin.com>
>> Subject: Re: [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers
>>   during VM reset
>>
>>>> [2] As mentioned in [1], QEMU always sets the vCPU's vendor to match the host's
>>>> vendor
>>>> when acceleration (KVM or HVF) is enabled. Therefore, if users want to emulate a
>>>> Zhaoxin CPU on an Intel host, the vendor must be set manually.Furthermore,
>>>> should we display a warning to users who enable both vPMU and KVM acceleration
>>>> but do not manually set the guest vendor when it differs from the host vendor?
>>>
>>> Maybe not? Sometimes I emulate AMD on Intel host, while vendor is still the
>>> default :)
>>
>> Okay, handling this situation can be rather complex, so let's keep it
>> simple. I have added a dedicated function to capture the intended behavior
>> for potential future reference.
>>
>> Anyway, Thanks for taking Zhaoxin's situation into account, regardless.
>>
> 
> Thanks for your code example!!
> 
> Zhaoxin implements perfmon v2, so I think checking the vendor might be
> overly complicated. If a check is needed, it seems more reasonable to
> check the perfmon version rather than the vendor, similar to how avx10
> version is checked in x86_cpu_filter_features().
> 
> I understand Ewan's concern is that if an Intel guest requires a higher
> perfmon version that the Zhaoxin host doesn't support, there could be
> issues (although I think this situation doesn't currently exist in KVM-QEMU,
> one reason is QEMU uses the pmu_version in 0xa queried from KVM directly,
> which means QEMU currently doesn't support custom pmu_version).

Yeah, that's exactly what I was concerned about.
Thanks for clearing that up!

perfmon_version is a great idea --- I might add it as a property to the QEMU 
vCPU template in the future, so it can adjust based on user input and host support.
Can't promise a timeline yet, but it's definitely something I'll keep in mind.


* Re: [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset
  2025-04-07  9:33               ` Ewan Hai
@ 2025-04-16  8:17                 ` Mi, Dapeng
  0 siblings, 0 replies; 63+ messages in thread
From: Mi, Dapeng @ 2025-04-16  8:17 UTC (permalink / raw)
  To: Ewan Hai, Zhao Liu
  Cc: Dongli Zhang, qemu-devel, kvm, pbonzini, mtosatti, sandipan.das,
	babu.moger, likexu, like.xu.linux, zhenyuw, groug, khorenko,
	alexander.ivanov, den, davydov-max, xiaoyao.li, joe.jin, ewanhai,
	cobechen, louisqi, liamni, frankzhu, silviazhao, yeeli


On 4/7/2025 5:33 PM, Ewan Hai wrote:
>
> On 4/7/25 4:51 PM, Zhao Liu wrote:
>
>> On Tue, Apr 01, 2025 at 11:35:49AM +0800, Ewan Hai wrote:
>>> Date: Tue, 1 Apr 2025 11:35:49 +0800
>>> From: Ewan Hai <ewanhai-oc@zhaoxin.com>
>>> Subject: Re: [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers
>>>   during VM reset
>>>
>>>>> [2] As mentioned in [1], QEMU always sets the vCPU's vendor to match the host's
>>>>> vendor
>>>>> when acceleration (KVM or HVF) is enabled. Therefore, if users want to emulate a
>>>>> Zhaoxin CPU on an Intel host, the vendor must be set manually.Furthermore,
>>>>> should we display a warning to users who enable both vPMU and KVM acceleration
>>>>> but do not manually set the guest vendor when it differs from the host vendor?
>>>> Maybe not? Sometimes I emulate AMD on Intel host, while vendor is still the
>>>> default :)
>>> Okay, handling this situation can be rather complex, so let's keep it
>>> simple. I have added a dedicated function to capture the intended behavior
>>> for potential future reference.
>>>
>>> Anyway, Thanks for taking Zhaoxin's situation into account, regardless.
>>>
>> Thanks for your code example!!
>>
>> Zhaoxin implements perfmon v2, so I think checking the vendor might be
>> overly complicated. If a check is needed, it seems more reasonable to
>> check the perfmon version rather than the vendor, similar to how avx10
>> version is checked in x86_cpu_filter_features().
>>
>> I understand Ewan's concern is that if an Intel guest requires a higher
>> perfmon version that the Zhaoxin host doesn't support, there could be
>> issues (although I think this situation doesn't currently exist in KVM-QEMU,
>> one reason is QEMU uses the pmu_version in 0xa queried from KVM directly,
>> which means QEMU currently doesn't support custom pmu_version).
> Yeah, that's exactly what I was concerned about.
> Thanks for clearing that up!
>
> perfmon_version is a great idea --- I might add it as a property to the QEMU 
> vCPU template in the future, so it can adjust based on user input and host support.
> Can't promise a timeline yet, but it's definitely something I'll keep in mind.

I'm wondering whether there are real use cases that set a lower PMU version
than the host PMU version (live migration across different HW platforms?). In
theory, a higher PMU version should be fully backward compatible with lower
versions, but that is not always true in reality. One example is that the
AnyThread bit introduced in v3 is dropped in PMU v5 on Intel processors.
This makes it difficult to support PMU version 3/4 on a host with PMU
version 5+. Different PMU versions between host and guest could cause
similar issues on other platforms.
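
To make the AnyThread example concrete, here is a small sketch under stated
assumptions: AnyThread is bit 21 of IA32_PERFEVTSELx, it is defined from
perfmon v3 onward, and a host reports its deprecation through
CPUID.0AH:EDX[15]. The policy functions below are hypothetical and are not
taken from KVM or QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EVENTSEL_ANYTHREAD (1ULL << 21)   /* IA32_PERFEVTSELx.AnyThread */

static_assert(EVENTSEL_ANYTHREAD == 0x200000, "AnyThread is bit 21");

/*
 * Hypothetical policy: the guest may use AnyThread only if its PMU
 * version defines the bit (v3+) and the host has not deprecated it.
 */
static bool anythread_usable(unsigned guest_pmu_version,
                             bool host_anythread_deprecated)
{
    return guest_pmu_version >= 3 && !host_anythread_deprecated;
}

/* Reject event selector values that set AnyThread when it is unusable. */
static bool eventsel_write_ok(uint64_t value, unsigned guest_pmu_version,
                              bool host_anythread_deprecated)
{
    uint64_t reserved = 0;

    if (!anythread_usable(guest_pmu_version, host_anythread_deprecated)) {
        reserved |= EVENTSEL_ANYTHREAD;
    }
    return (value & reserved) == 0;
}
```

Under this policy a v3 guest on a host that deprecates AnyThread sees the bit
as reserved, which is the mismatch described above.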

Currently the highest PMU version that KVM supports is v2. We plan to support
higher PMU versions for Intel processors on top of mediated vPMU. If the guest
sets the PMU version to 3/4 on a host with PMU version 5+, the AnyThread bit
would be marked as reserved,


Thread overview: 63+ messages
2025-03-02 22:00 [PATCH v2 00/10] target/i386/kvm/pmu: PMU Enhancement, Bugfix and Cleanup Dongli Zhang
2025-03-02 22:00 ` [PATCH v2 01/10] target/i386: disable PerfMonV2 when PERFCORE unavailable Dongli Zhang
2025-03-04 14:40   ` Xiaoyao Li
2025-03-04 22:53     ` dongli.zhang
2025-03-05  1:38       ` Xiaoyao Li
2025-03-05 14:20   ` Zhao Liu
2025-03-07  7:24   ` Sandipan Das
2025-03-02 22:00 ` [PATCH v2 02/10] target/i386: disable PERFCORE when "-pmu" is configured Dongli Zhang
2025-03-03  1:59   ` Xiaoyao Li
2025-03-03 18:45     ` dongli.zhang
2025-03-04  6:11       ` Xiaoyao Li
2025-03-06 16:50   ` Zhao Liu
2025-03-06 17:47     ` dongli.zhang
2025-03-07  7:41       ` Zhao Liu
2025-03-02 22:00 ` [PATCH v2 03/10] [DO NOT MERGE] kvm: Introduce kvm_arch_pre_create_vcpu() Dongli Zhang
2025-03-05 14:46   ` Zhao Liu
2025-03-05 21:53     ` dongli.zhang
2025-03-07  7:52       ` Zhao Liu
2025-03-07  8:40         ` Xiaoyao Li
2025-03-02 22:00 ` [PATCH v2 04/10] target/i386/kvm: set KVM_PMU_CAP_DISABLE if "-pmu" is configured Dongli Zhang
2025-03-04  7:59   ` Xiaoyao Li
2025-03-05  1:22     ` Sean Christopherson
2025-03-05  1:35       ` Xiaoyao Li
2025-03-05 14:41     ` Zhao Liu
2025-03-05 20:13       ` dongli.zhang
2025-03-05 14:44   ` Zhao Liu
2025-03-02 22:00 ` [PATCH v2 05/10] target/i386/kvm: extract unrelated code out of kvm_x86_build_cpuid() Dongli Zhang
2025-03-05  7:03   ` Mi, Dapeng
2025-03-07  9:15   ` Zhao Liu
2025-03-07 22:47     ` Dongli Zhang
2025-03-10  3:55       ` Zhao Liu
2025-03-02 22:00 ` [PATCH v2 06/10] target/i386/kvm: rename architectural PMU variables Dongli Zhang
2025-03-05  7:07   ` Mi, Dapeng
2025-03-07  9:19   ` Zhao Liu
2025-03-07 22:49     ` Dongli Zhang
2025-03-02 22:00 ` [PATCH v2 07/10] target/i386/kvm: query kvm.enable_pmu parameter Dongli Zhang
2025-03-10  6:14   ` Zhao Liu
2025-03-10 15:41     ` Dongli Zhang
2025-03-10 16:49     ` Dongli Zhang
2025-03-02 22:00 ` [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset Dongli Zhang
2025-03-05  7:33   ` Mi, Dapeng
2025-03-05 11:41   ` Francesco Lavra
2025-03-05 19:05     ` dongli.zhang
2025-03-07  7:38   ` Sandipan Das
2025-03-10  7:47   ` Zhao Liu
2025-03-10 16:39     ` Dongli Zhang
2025-03-11 13:51       ` Zhao Liu
2025-03-11 19:52         ` Dongli Zhang
2025-03-12  8:30           ` Zhao Liu
2025-03-12 22:17             ` Dongli Zhang
2025-03-28  6:29   ` ewanhai
2025-03-28 16:42     ` Dongli Zhang
2025-03-31  3:55       ` ewanhai
2025-03-31 19:16         ` Dongli Zhang
2025-04-01  3:35           ` Ewan Hai
2025-04-07  8:51             ` Zhao Liu
2025-04-07  9:33               ` Ewan Hai
2025-04-16  8:17                 ` Mi, Dapeng
2025-03-02 22:00 ` [PATCH v2 09/10] target/i386/kvm: support perfmon-v2 for reset Dongli Zhang
2025-03-02 22:00 ` [PATCH v2 10/10] target/i386/kvm: don't stop Intel PMU counters Dongli Zhang
2025-03-05  7:35   ` Mi, Dapeng
2025-03-05 19:00     ` dongli.zhang
2025-03-06  1:38       ` Mi, Dapeng
