From: "Mi, Dapeng" <dapeng1.mi@linux.intel.com>
To: "Chen, Zide" <zide.chen@intel.com>, sashiko-reviews@lists.linux.dev
Cc: kvm@vger.kernel.org
Subject: Re: [PATCH V3 3/4] KVM: x86/pmu: Support PERF_METRICS MSR in mediated vPMU
Date: Thu, 18 Jun 2026 10:21:31 +0800 [thread overview]
Message-ID: <ba9eead0-371a-4720-ae9d-e84993c7253c@linux.intel.com> (raw)
In-Reply-To: <5e03d30e-b33b-476e-b7e6-8ff4979ff8db@intel.com>
On 6/17/2026 12:29 AM, Chen, Zide wrote:
>
> On 6/15/2026 6:26 PM, sashiko-bot@kernel.org wrote:
>> Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
>> - [High] Guest-reachable WARN_ON_ONCE assertions in MSR_PERF_METRICS accessors allow unprivileged guests to trigger host-side denial of service.
>> - [High] Software updates to MSR_PERF_METRICS via KVM ioctls or VCPU reset are silently clobbered by stale hardware MSR values during vcpu_put().
>> - [High] MSR_PERF_METRICS is passed through to the guest without context switching if the guest has a PMU version of 0, bypassing host isolation.
>> --
>>
>> commit 65a5a5edf8809e596749d317875bbc842f4eb7fc
>> Author: Dapeng Mi <dapeng1.mi@linux.intel.com>
>>
>> KVM: x86/pmu: Support PERF_METRICS MSR in mediated vPMU
>>
>> This commit exposes the IA32_PERF_METRICS MSR to guests when mediated
>> vPMU is enabled. It allows host access via kvm_pmu_{get,set}_msr() and
>> handles save/restore during PMU context switches.
>>
>>> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
>>> index 05a59f4acfdd3..61102aa5e2f9e 100644
>>> --- a/arch/x86/kvm/vmx/pmu_intel.c
>>> +++ b/arch/x86/kvm/vmx/pmu_intel.c
>> [ ... ]
>>
>>> @@ -335,6 +337,10 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>> case MSR_CORE_PERF_FIXED_CTR_CTRL:
>>> msr_info->data = pmu->fixed_ctr_ctrl;
>>> break;
>>> + case MSR_PERF_METRICS:
>>> + WARN_ON_ONCE(!msr_info->host_initiated);
>> [Severity: High]
>> Could this WARN_ON_ONCE() be triggered by an unprivileged guest?
>>
>> If a guest forces KVM's instruction emulator to handle the MSR access, for
>> example by executing RDMSR over an MMIO page, the emulator might process the
>> instruction and invoke KVM's MSR handlers with host_initiated = false.
> Yes, in theory, it's possible for the RDMSR/WRMSR to be handled in the
> emulator path and WARN_ON_ONCE() could be unexpectedly triggered.
>
> Since WARN_ON() is not intended to handle untrusted input, it's better
> to use a defensive check and reject non-host-initiated accesses:
>
> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> index 19daee29b731..6073227bdd20 100644
> --- a/arch/x86/kvm/vmx/pmu_intel.c
> +++ b/arch/x86/kvm/vmx/pmu_intel.c
> @@ -348,7 +348,8 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu,
> struct msr_data *msr_info)
> msr_info->data = pmu->fixed_ctr_ctrl;
> break;
> case MSR_PERF_METRICS:
> - WARN_ON_ONCE(!msr_info->host_initiated);
> + if (!msr_info->host_initiated)
> + return 1;
> msr_info->data = pmu->perf_metrics;
> break;
> case MSR_IA32_PEBS_ENABLE:
> @@ -401,7 +402,8 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu,
> struct msr_data *msr_info)
> reprogram_fixed_counters(pmu, data);
> break;
> case MSR_PERF_METRICS:
> - WARN_ON_ONCE(!msr_info->host_initiated);
> + if (!msr_info->host_initiated)
> + return 1;
Per my understanding, WARN_ON_ONCE() only prints the warning message once
and it should not spam Host log too much, but it's fine for me to remove
them. Thanks.
>
>
>
>> This would unconditionally trigger the warning and allow the guest to spam
>> the host kernel log or cause a host panic if panic_on_warn is enabled.
>>
>>> + msr_info->data = pmu->perf_metrics;
>>> + break;
>>> case MSR_IA32_PEBS_ENABLE:
>>> msr_info->data = pmu->pebs_enable;
>>> break;
>> [ ... ]
>>
>>> @@ -384,6 +390,15 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>> if (pmu->fixed_ctr_ctrl != data)
>>> reprogram_fixed_counters(pmu, data);
>>> break;
>>> + case MSR_PERF_METRICS:
>>> + WARN_ON_ONCE(!msr_info->host_initiated);
>> [Severity: High]
>> Can a malicious guest also trigger this WARN_ON_ONCE() by executing WRMSR
>> over an MMIO page to force instruction emulation?
> Ditto.
>>> +
>>> + /*
>>> + * If TMA level 2 is not supported, bits [63:32] are reserved
>>> + * and ignored on write, so no validation is needed here.
>>> + */
>>> + pmu->perf_metrics = data;
>>> + break;
>>> case MSR_IA32_PEBS_ENABLE:
>> [ ... ]
>>
>>> @@ -622,6 +642,9 @@ static void intel_pmu_init(struct kvm_vcpu *vcpu)
>>>
>>> static void intel_pmu_reset(struct kvm_vcpu *vcpu)
>>> {
>>> + struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
>>> +
>>> + pmu->perf_metrics = 0;
>>> intel_pmu_release_guest_lbr_event(vcpu);
>>> }
>> [ ... ]
>>
>>> @@ -821,6 +851,12 @@ static void intel_mediated_pmu_put(struct kvm_vcpu *vcpu)
>>> */
>>> if (pmu->fixed_ctr_ctrl_hw)
>>> wrmsrq(MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
>>> +
>>> + if (vcpu_has_perf_metrics(vcpu)) {
>>> + pmu->perf_metrics = rdpmc(INTEL_PMC_FIXED_RDPMC_METRICS);
>> [Severity: High]
>> Does this rdpmc() call silently clobber software updates made to
>> pmu->perf_metrics?
>>
>> When the VCPU is loaded during a KVM_SET_MSRS ioctl or VCPU reset, updates
>> to pmu->perf_metrics in intel_pmu_set_msr() or intel_pmu_reset() modify the
>> software struct but do not synchronously write to the active hardware MSR.
> This is safe.
>
> After intel_pmu_set_msr() or intel_pmu_reset(),
> intel_mediated_pmu_load() is called to sync the state to hardware before
> the next intel_mediated_pmu_put().
>
>
>> Later, when the VCPU is descheduled, intel_mediated_pmu_put() reads the
>> physical MSR (which still holds the old pre-update value) and overwrites the
>> software state.
>>
>> Will this permanently discard the new software state and break live
>> migration or VCPU resets?
>>
>>> + if (pmu->perf_metrics)
>>> + wrmsrq(MSR_PERF_METRICS, 0);
>>> + }
>>> }
>> [ ... ]
>>
>>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>>> index 9324d60839419..475e72c733cee 100644
>>> --- a/arch/x86/kvm/vmx/vmx.c
>>> +++ b/arch/x86/kvm/vmx/vmx.c
>>> @@ -4247,6 +4247,9 @@ static void vmx_recalc_pmu_msr_intercepts(struct kvm_vcpu *vcpu)
>>> MSR_TYPE_RW, intercept);
>>> vmx_set_intercept_for_msr(vcpu, MSR_CORE_PERF_GLOBAL_OVF_CTRL,
>>> MSR_TYPE_RW, intercept);
>>> +
>>> + vmx_set_intercept_for_msr(vcpu, MSR_PERF_METRICS, MSR_TYPE_RW,
>>> + !vcpu_has_perf_metrics(vcpu));
>> [Severity: High]
>> Could clearing the intercept here allow the guest direct, unintercepted
>> access to the host's physical MSR_PERF_METRICS register without a context
>> switch?
>>
>> If enable_mediated_pmu is true, KVM exposes PERF_CAP_PERF_METRICS. A VMM
>> could configure a VCPU with MSR_IA32_PERF_CAPABILITIES.PERF_METRICS=1 but
>> leave the PMU version at 0.
>>
>> In this scenario, vcpu_has_perf_metrics() would return true, clearing the
>> intercept. However, because the PMU version is 0,
>> kvm_vcpu_has_mediated_pmu() returns false.
>>
>> This causes the context-switch handlers kvm_mediated_pmu_load() and
>> kvm_mediated_pmu_put() to exit early.
>>
>> Does this bypass host isolation and allow the guest to leak or clobber host
>> PMU state?
> Yes, you are right.
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 7cffe1619a1f..28027d302d90 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -4266,7 +4266,7 @@ static void vmx_recalc_pmu_msr_intercepts(struct
> kvm_vcpu *vcpu)
> MSR_TYPE_RW, intercept);
>
> vmx_set_intercept_for_msr(vcpu, MSR_PERF_METRICS, MSR_TYPE_RW,
> - !vcpu_has_perf_metrics(vcpu));
> + !has_mediated_pmu ||
> !vcpu_has_perf_metrics(vcpu));
> }
>
> static void vmx_recalc_msr_intercepts(struct kvm_vcpu *vcpu)
>
>>> }
>>>
>>> static void vmx_recalc_msr_intercepts(struct kvm_vcpu *vcpu)
>
next prev parent reply other threads:[~2026-06-18 2:21 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-15 23:01 [PATCH V3 0/4] KVM: x86/pmu: Add hardware Topdown metrics support Zide Chen
2026-06-15 23:01 ` [PATCH V3 1/4] KVM: x86/pmu: Do not map fixed counters >= 3 to generic perf events Zide Chen
2026-06-15 23:01 ` [PATCH V3 2/4] KVM: x86/pmu: Support Intel fixed counter 3 on mediated vPMU Zide Chen
2026-06-15 23:01 ` [PATCH V3 3/4] KVM: x86/pmu: Support PERF_METRICS MSR in " Zide Chen
2026-06-15 23:26 ` sashiko-bot
2026-06-16 16:29 ` Chen, Zide
2026-06-18 2:21 ` Mi, Dapeng [this message]
2026-06-15 23:01 ` [PATCH V3 4/4] KVM: selftests: Add perf_metrics and fixed counter 3 tests Zide Chen
2026-06-15 23:26 ` sashiko-bot
2026-06-16 16:32 ` Chen, Zide
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ba9eead0-371a-4720-ae9d-e84993c7253c@linux.intel.com \
--to=dapeng1.mi@linux.intel.com \
--cc=kvm@vger.kernel.org \
--cc=sashiko-reviews@lists.linux.dev \
--cc=zide.chen@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox