qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Zhao Liu <zhao1.liu@intel.com>
To: Dongli Zhang <dongli.zhang@oracle.com>
Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org, pbonzini@redhat.com,
	mtosatti@redhat.com, sandipan.das@amd.com, babu.moger@amd.com,
	likexu@tencent.com, like.xu.linux@gmail.com,
	zhenyuw@linux.intel.com, groug@kaod.org, lyan@digitalocean.com,
	khorenko@virtuozzo.com, alexander.ivanov@virtuozzo.com,
	den@virtuozzo.com, joe.jin@oracle.com,
	davydov-max@yandex-team.ru, dapeng1.mi@linux.intel.com,
	zide.chen@intel.com
Subject: Re: [PATCH 2/7] target/i386/kvm: introduce 'pmu-cap-disabled' to set KVM_PMU_CAP_DISABLE
Date: Thu, 7 Nov 2024 15:52:42 +0800	[thread overview]
Message-ID: <ZyxxygVaufOntpZJ@intel.com> (raw)
In-Reply-To: <20241104094119.4131-3-dongli.zhang@oracle.com>

(+Dapang & Zide)

Hi Dongli,

On Mon, Nov 04, 2024 at 01:40:17AM -0800, Dongli Zhang wrote:
> Date: Mon,  4 Nov 2024 01:40:17 -0800
> From: Dongli Zhang <dongli.zhang@oracle.com>
> Subject: [PATCH 2/7] target/i386/kvm: introduce 'pmu-cap-disabled' to set
>  KVM_PMU_CAP_DISABLE
> X-Mailer: git-send-email 2.43.5
> 
> The AMD PMU virtualization is not disabled when configuring
> "-cpu host,-pmu" in the QEMU command line on an AMD server. Neither
> "-cpu host,-pmu" nor "-cpu EPYC" effectively disables AMD PMU
> virtualization in such an environment.
> 
> As a result, VM logs typically show:
> 
> [    0.510611] Performance Events: Fam17h+ core perfctr, AMD PMU driver.
> 
> whereas the expected logs should be:
> 
> [    0.596381] Performance Events: PMU not available due to virtualization, using software events only.
> [    0.600972] NMI watchdog: Perf NMI watchdog permanently disabled
> 
> This discrepancy occurs because AMD PMU does not use CPUID to determine
> whether PMU virtualization is supported.

Intel platform doesn't have this issue since Linux kernel fails to check
the CPU family & model when "-cpu *,-pmu" option clears PMU version.

The difference between Intel and AMD platforms, however, is that it seems
Intel hardly ever reaches the “...due virtualization” message, but
instead reports an error because it recognizes a mismatched family/model.

This may be a drawback of the PMU driver's print message, but the result
is the same, it prevents the PMU driver from enabling.

So, please mention that KVM_PMU_CAP_DISABLE doesn't change the PMU
behavior on Intel platform because current "pmu" property works as
expected.

> To address this, we introduce a new property, 'pmu-cap-disabled', for KVM
> acceleration. This property sets KVM_PMU_CAP_DISABLE if
> KVM_CAP_PMU_CAPABILITY is supported. Note that this feature currently
> supports only x86 hosts, as KVM_CAP_PMU_CAPABILITY is used exclusively for
> x86 systems.
> 
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
> Another previous solution to re-use '-cpu host,-pmu':
> https://lore.kernel.org/all/20221119122901.2469-1-dongli.zhang@oracle.com/

IMO, I prefer the previous version. This VM-level KVM property is
difficult to integrate with the existing CPU properties. Pls refer later
comments for reasons.

>  accel/kvm/kvm-all.c        |  1 +
>  include/sysemu/kvm_int.h   |  1 +
>  qemu-options.hx            |  9 ++++++-
>  target/i386/cpu.c          |  2 +-
>  target/i386/kvm/kvm.c      | 52 ++++++++++++++++++++++++++++++++++++++
>  target/i386/kvm/kvm_i386.h |  2 ++
>  6 files changed, 65 insertions(+), 2 deletions(-)
> 
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index 801cff16a5..8b5ba45cf7 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -3933,6 +3933,7 @@ static void kvm_accel_instance_init(Object *obj)
>      s->xen_evtchn_max_pirq = 256;
>      s->device = NULL;
>      s->msr_energy.enable = false;
> +    s->pmu_cap_disabled = false;
>  }

The CPU property "pmu" also defaults to "false"...but:

 * max CPU would override this and try to enable PMU by default in
   max_x86_cpu_initfn().

 * Other named CPU models keep the default setting to avoid affecting
   the migration.

The pmu_cap_disabled and “pmu” property look unbound and unassociated,
so this can cause the conflict when they are not synchronized. For
example,

-cpu host -accel kvm,pmu-cap-disabled=on

The above options will fail to launch a VM (on Intel platform).

Ideally, the “pmu” property and pmu-cap-disabled should be bound to each
other and be consistent. But it's not easy because:
 - There is no proper way to have pmu_cap_disabled set different default
   values (e.g., "false" for max CPU and "true" for named CPU models)
   based on different CPU models.
 - And, no proper place to check the consistency of pmu_cap_disabled and
   enable_pmu.

Therefore, I prefer your previous approach, to reuse current CPU "pmu"
property.

Further, considering that this is currently the only case that needs to
to set the VM level's capability in the CPU context, there is no need to
introduce a new kvm interface (in your previous patch), which can instead
be set in kvm_cpu_realizefn(), like:


diff --git a/target/i386/kvm/kvm-cpu.c b/target/i386/kvm/kvm-cpu.c
index 99d1941cf51c..05e9c9a1a0cf 100644
--- a/target/i386/kvm/kvm-cpu.c
+++ b/target/i386/kvm/kvm-cpu.c
@@ -42,6 +42,8 @@ static bool kvm_cpu_realizefn(CPUState *cs, Error **errp)
 {
     X86CPU *cpu = X86_CPU(cs);
     CPUX86State *env = &cpu->env;
+    KVMState *s = kvm_state;
+    static bool first = true;
     bool ret;

     /*
@@ -63,6 +65,29 @@ static bool kvm_cpu_realizefn(CPUState *cs, Error **errp)
      *   check/update ucode_rev, phys_bits, guest_phys_bits, mwait
      *   cpu_common_realizefn() (via xcc->parent_realize)
      */
+
+    if (first) {
+        first = false;
+
+        /*
+         * Since Linux v5.18, KVM provides a VM-level capability to easily
+         * disable PMUs; however, QEMU has been providing PMU property per
+         * CPU since v1.6. In order to accommodate both, have to configure
+         * the VM-level capability here.
+         */
+        if (!cpu->enable_pmu &&
+            kvm_check_extension(s, KVM_CAP_PMU_CAPABILITY)) {
+            int r = kvm_vm_enable_cap(s, KVM_CAP_PMU_CAPABILITY, 0,
+                                      KVM_PMU_CAP_DISABLE);
+
+            if (r < 0) {
+                error_setg(errp, "kvm: Failed to disable pmu cap: %s",
+                           strerror(-r));
+                return false;
+            }
+        }
+    }
+
     if (cpu->max_features) {
         if (enable_cpu_pm) {
             if (kvm_has_waitpkg()) {
---

In addition, if PMU is disabled, why not mask the perf related bits in
8000_0001_ECX? :)

Regards,
Zhao



  reply	other threads:[~2024-11-07  7:35 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-04  9:40 [PATCH 0/7] target/i386/kvm/pmu: Enhancement, Bugfix and Cleanup Dongli Zhang
2024-11-04  9:40 ` [PATCH 1/7] target/i386: disable PerfMonV2 when PERFCORE unavailable Dongli Zhang
2024-11-06  3:54   ` Zhao Liu
2024-11-07  0:29     ` dongli.zhang
2024-11-07  7:57       ` Zhao Liu
2024-11-04  9:40 ` [PATCH 2/7] target/i386/kvm: introduce 'pmu-cap-disabled' to set KVM_PMU_CAP_DISABLE Dongli Zhang
2024-11-07  7:52   ` Zhao Liu [this message]
2024-11-07 23:44     ` dongli.zhang
2024-11-08  2:32       ` Zhao Liu
2024-11-08 12:52       ` Sandipan Das
2024-11-13 17:15       ` Zhao Liu
2024-11-14  0:13         ` dongli.zhang
2024-11-21 10:06       ` Mi, Dapeng
2025-02-07  9:52         ` Mi, Dapeng
2025-02-09 20:12           ` dongli.zhang
2025-02-10  8:04             ` Mi, Dapeng
2024-11-04  9:40 ` [PATCH 3/7] target/i386/kvm: init PMU information only once Dongli Zhang
2024-11-10 15:29   ` Zhao Liu
2024-11-13  1:50     ` dongli.zhang
2024-11-13 16:48       ` Zhao Liu
2024-11-04  9:40 ` [PATCH 4/7] target/i386/kvm: rename architectural PMU variables Dongli Zhang
2024-11-04  9:40 ` [PATCH 5/7] target/i386/kvm: reset AMD PMU registers during VM reset Dongli Zhang
2024-11-06  9:58   ` Sandipan Das
2024-11-07  0:33     ` dongli.zhang
2024-11-07 21:00   ` Maksim Davydov
2024-11-08  1:19     ` dongli.zhang
2024-11-08 14:07       ` Maksim Davydov
2024-11-08 18:04         ` dongli.zhang
2024-11-04  9:40 ` [PATCH 6/7] target/i386/kvm: support perfmon-v2 for reset Dongli Zhang
2024-11-08 13:09   ` Sandipan Das
2024-11-08 16:55     ` dongli.zhang
2024-11-04  9:40 ` [PATCH 7/7] target/i386/kvm: don't stop Intel PMU counters Dongli Zhang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZyxxygVaufOntpZJ@intel.com \
    --to=zhao1.liu@intel.com \
    --cc=alexander.ivanov@virtuozzo.com \
    --cc=babu.moger@amd.com \
    --cc=dapeng1.mi@linux.intel.com \
    --cc=davydov-max@yandex-team.ru \
    --cc=den@virtuozzo.com \
    --cc=dongli.zhang@oracle.com \
    --cc=groug@kaod.org \
    --cc=joe.jin@oracle.com \
    --cc=khorenko@virtuozzo.com \
    --cc=kvm@vger.kernel.org \
    --cc=like.xu.linux@gmail.com \
    --cc=likexu@tencent.com \
    --cc=lyan@digitalocean.com \
    --cc=mtosatti@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=sandipan.das@amd.com \
    --cc=zhenyuw@linux.intel.com \
    --cc=zide.chen@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).