From: "Mi, Dapeng" <dapeng1.mi@linux.intel.com>
To: "Chen, Zide" <zide.chen@intel.com>, sashiko-reviews@lists.linux.dev
Cc: kvm@vger.kernel.org
Subject: Re: [PATCH V3 3/4] KVM: x86/pmu: Support PERF_METRICS MSR in mediated vPMU
Date: Thu, 18 Jun 2026 10:21:31 +0800 [thread overview]
Message-ID: <ba9eead0-371a-4720-ae9d-e84993c7253c@linux.intel.com> (raw)
In-Reply-To: <5e03d30e-b33b-476e-b7e6-8ff4979ff8db@intel.com>
On 6/17/2026 12:29 AM, Chen, Zide wrote:
>
> On 6/15/2026 6:26 PM, sashiko-bot@kernel.org wrote:
>> Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
>> - [High] Guest-reachable WARN_ON_ONCE assertions in MSR_PERF_METRICS accessors allow unprivileged guests to trigger host-side denial of service.
>> - [High] Software updates to MSR_PERF_METRICS via KVM ioctls or VCPU reset are silently clobbered by stale hardware MSR values during vcpu_put().
>> - [High] MSR_PERF_METRICS is passed through to the guest without context switching if the guest has a PMU version of 0, bypassing host isolation.
>> --
>>
>> commit 65a5a5edf8809e596749d317875bbc842f4eb7fc
>> Author: Dapeng Mi <dapeng1.mi@linux.intel.com>
>>
>> KVM: x86/pmu: Support PERF_METRICS MSR in mediated vPMU
>>
>> This commit exposes the IA32_PERF_METRICS MSR to guests when mediated
>> vPMU is enabled. It allows host access via kvm_pmu_{get,set}_msr() and
>> handles save/restore during PMU context switches.
>>
>>> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
>>> index 05a59f4acfdd3..61102aa5e2f9e 100644
>>> --- a/arch/x86/kvm/vmx/pmu_intel.c
>>> +++ b/arch/x86/kvm/vmx/pmu_intel.c
>> [ ... ]
>>
>>> @@ -335,6 +337,10 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>> case MSR_CORE_PERF_FIXED_CTR_CTRL:
>>> msr_info->data = pmu->fixed_ctr_ctrl;
>>> break;
>>> + case MSR_PERF_METRICS:
>>> + WARN_ON_ONCE(!msr_info->host_initiated);
>> [Severity: High]
>> Could this WARN_ON_ONCE() be triggered by an unprivileged guest?
>>
>> If a guest forces KVM's instruction emulator to handle the MSR access, for
>> example by executing RDMSR over an MMIO page, the emulator might process the
>> instruction and invoke KVM's MSR handlers with host_initiated = false.
> Yes, in theory, it's possible for the RDMSR/WRMSR to be handled in the
> emulator path and WARN_ON_ONCE() could be unexpectedly triggered.
>
> Since WARN_ON() is not intended to handle untrusted input, it's better
> to use a defensive check and reject non-host-initiated accesses:
>
> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> index 19daee29b731..6073227bdd20 100644
> --- a/arch/x86/kvm/vmx/pmu_intel.c
> +++ b/arch/x86/kvm/vmx/pmu_intel.c
> @@ -348,7 +348,8 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu,
> struct msr_data *msr_info)
> msr_info->data = pmu->fixed_ctr_ctrl;
> break;
> case MSR_PERF_METRICS:
> - WARN_ON_ONCE(!msr_info->host_initiated);
> + if (!msr_info->host_initiated)
> + return 1;
> msr_info->data = pmu->perf_metrics;
> break;
> case MSR_IA32_PEBS_ENABLE:
> @@ -401,7 +402,8 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu,
> struct msr_data *msr_info)
> reprogram_fixed_counters(pmu, data);
> break;
> case MSR_PERF_METRICS:
> - WARN_ON_ONCE(!msr_info->host_initiated);
> + if (!msr_info->host_initiated)
> + return 1;
Per my understanding, WARN_ON_ONCE() only prints the warning message once
and it should not spam Host log too much, but it's fine for me to remove
them. Thanks.
>
>
>
>> This would unconditionally trigger the warning and allow the guest to spam
>> the host kernel log or cause a host panic if panic_on_warn is enabled.
>>
>>> + msr_info->data = pmu->perf_metrics;
>>> + break;
>>> case MSR_IA32_PEBS_ENABLE:
>>> msr_info->data = pmu->pebs_enable;
>>> break;
>> [ ... ]
>>
>>> @@ -384,6 +390,15 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>> if (pmu->fixed_ctr_ctrl != data)
>>> reprogram_fixed_counters(pmu, data);
>>> break;
>>> + case MSR_PERF_METRICS:
>>> + WARN_ON_ONCE(!msr_info->host_initiated);
>> [Severity: High]
>> Can a malicious guest also trigger this WARN_ON_ONCE() by executing WRMSR
>> over an MMIO page to force instruction emulation?
> Ditto.
>>> +
>>> + /*
>>> + * If TMA level 2 is not supported, bits [63:32] are reserved
>>> + * and ignored on write, so no validation is needed here.
>>> + */
>>> + pmu->perf_metrics = data;
>>> + break;
>>> case MSR_IA32_PEBS_ENABLE:
>> [ ... ]
>>
>>> @@ -622,6 +642,9 @@ static void intel_pmu_init(struct kvm_vcpu *vcpu)
>>>
>>> static void intel_pmu_reset(struct kvm_vcpu *vcpu)
>>> {
>>> + struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
>>> +
>>> + pmu->perf_metrics = 0;
>>> intel_pmu_release_guest_lbr_event(vcpu);
>>> }
>> [ ... ]
>>
>>> @@ -821,6 +851,12 @@ static void intel_mediated_pmu_put(struct kvm_vcpu *vcpu)
>>> */
>>> if (pmu->fixed_ctr_ctrl_hw)
>>> wrmsrq(MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
>>> +
>>> + if (vcpu_has_perf_metrics(vcpu)) {
>>> + pmu->perf_metrics = rdpmc(INTEL_PMC_FIXED_RDPMC_METRICS);
>> [Severity: High]
>> Does this rdpmc() call silently clobber software updates made to
>> pmu->perf_metrics?
>>
>> When the VCPU is loaded during a KVM_SET_MSRS ioctl or VCPU reset, updates
>> to pmu->perf_metrics in intel_pmu_set_msr() or intel_pmu_reset() modify the
>> software struct but do not synchronously write to the active hardware MSR.
> This is safe.
>
> After intel_pmu_set_msr() or intel_pmu_reset(),
> intel_mediated_pmu_load() is called to sync the state to hardware before
> the next intel_mediated_pmu_put().
>
>
>> Later, when the VCPU is descheduled, intel_mediated_pmu_put() reads the
>> physical MSR (which still holds the old pre-update value) and overwrites the
>> software state.
>>
>> Will this permanently discard the new software state and break live
>> migration or VCPU resets?
>>
>>> + if (pmu->perf_metrics)
>>> + wrmsrq(MSR_PERF_METRICS, 0);
>>> + }
>>> }
>> [ ... ]
>>
>>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>>> index 9324d60839419..475e72c733cee 100644
>>> --- a/arch/x86/kvm/vmx/vmx.c
>>> +++ b/arch/x86/kvm/vmx/vmx.c
>>> @@ -4247,6 +4247,9 @@ static void vmx_recalc_pmu_msr_intercepts(struct kvm_vcpu *vcpu)
>>> MSR_TYPE_RW, intercept);
>>> vmx_set_intercept_for_msr(vcpu, MSR_CORE_PERF_GLOBAL_OVF_CTRL,
>>> MSR_TYPE_RW, intercept);
>>> +
>>> + vmx_set_intercept_for_msr(vcpu, MSR_PERF_METRICS, MSR_TYPE_RW,
>>> + !vcpu_has_perf_metrics(vcpu));
>> [Severity: High]
>> Could clearing the intercept here allow the guest direct, unintercepted
>> access to the host's physical MSR_PERF_METRICS register without a context
>> switch?
>>
>> If enable_mediated_pmu is true, KVM exposes PERF_CAP_PERF_METRICS. A VMM
>> could configure a VCPU with MSR_IA32_PERF_CAPABILITIES.PERF_METRICS=1 but
>> leave the PMU version at 0.
>>
>> In this scenario, vcpu_has_perf_metrics() would return true, clearing the
>> intercept. However, because the PMU version is 0,
>> kvm_vcpu_has_mediated_pmu() returns false.
>>
>> This causes the context-switch handlers kvm_mediated_pmu_load() and
>> kvm_mediated_pmu_put() to exit early.
>>
>> Does this bypass host isolation and allow the guest to leak or clobber host
>> PMU state?
> Yes, you are right.
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 7cffe1619a1f..28027d302d90 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -4266,7 +4266,7 @@ static void vmx_recalc_pmu_msr_intercepts(struct
> kvm_vcpu *vcpu)
> MSR_TYPE_RW, intercept);
>
> vmx_set_intercept_for_msr(vcpu, MSR_PERF_METRICS, MSR_TYPE_RW,
> - !vcpu_has_perf_metrics(vcpu));
> + !has_mediated_pmu ||
> !vcpu_has_perf_metrics(vcpu));
> }
>
> static void vmx_recalc_msr_intercepts(struct kvm_vcpu *vcpu)
>
>>> }
>>>
>>> static void vmx_recalc_msr_intercepts(struct kvm_vcpu *vcpu)
>
next prev parent reply other threads:[~2026-06-18 2:21 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-15 23:01 [PATCH V3 0/4] KVM: x86/pmu: Add hardware Topdown metrics support Zide Chen
2026-06-15 23:01 ` [PATCH V3 1/4] KVM: x86/pmu: Do not map fixed counters >= 3 to generic perf events Zide Chen
2026-06-15 23:01 ` [PATCH V3 2/4] KVM: x86/pmu: Support Intel fixed counter 3 on mediated vPMU Zide Chen
2026-06-15 23:01 ` [PATCH V3 3/4] KVM: x86/pmu: Support PERF_METRICS MSR in " Zide Chen
2026-06-15 23:26 ` sashiko-bot
2026-06-16 16:29 ` Chen, Zide
2026-06-18 2:21 ` Mi, Dapeng [this message]
2026-06-15 23:01 ` [PATCH V3 4/4] KVM: selftests: Add perf_metrics and fixed counter 3 tests Zide Chen
2026-06-15 23:26 ` sashiko-bot
2026-06-16 16:32 ` Chen, Zide
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ba9eead0-371a-4720-ae9d-e84993c7253c@linux.intel.com \
--to=dapeng1.mi@linux.intel.com \
--cc=kvm@vger.kernel.org \
--cc=sashiko-reviews@lists.linux.dev \
--cc=zide.chen@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.