Kernel KVM virtualization development
 help / color / mirror / Atom feed
From: "Chen, Zide" <zide.chen@intel.com>
To: sashiko-reviews@lists.linux.dev
Cc: kvm@vger.kernel.org
Subject: Re: [PATCH V3 3/4] KVM: x86/pmu: Support PERF_METRICS MSR in mediated vPMU
Date: Tue, 16 Jun 2026 11:29:32 -0500	[thread overview]
Message-ID: <5e03d30e-b33b-476e-b7e6-8ff4979ff8db@intel.com> (raw)
In-Reply-To: <20260615232619.BBF811F000E9@smtp.kernel.org>



On 6/15/2026 6:26 PM, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
> - [High] Guest-reachable WARN_ON_ONCE assertions in MSR_PERF_METRICS accessors allow unprivileged guests to trigger host-side denial of service.
> - [High] Software updates to MSR_PERF_METRICS via KVM ioctls or VCPU reset are silently clobbered by stale hardware MSR values during vcpu_put().
> - [High] MSR_PERF_METRICS is passed through to the guest without context switching if the guest has a PMU version of 0, bypassing host isolation.
> --
> 
> commit 65a5a5edf8809e596749d317875bbc842f4eb7fc
> Author: Dapeng Mi <dapeng1.mi@linux.intel.com>
> 
> KVM: x86/pmu: Support PERF_METRICS MSR in mediated vPMU
> 
> This commit exposes the IA32_PERF_METRICS MSR to guests when mediated
> vPMU is enabled. It allows host access via kvm_pmu_{get,set}_msr() and
> handles save/restore during PMU context switches.
> 
>> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
>> index 05a59f4acfdd3..61102aa5e2f9e 100644
>> --- a/arch/x86/kvm/vmx/pmu_intel.c
>> +++ b/arch/x86/kvm/vmx/pmu_intel.c
> 
> [ ... ]
> 
>> @@ -335,6 +337,10 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>  	case MSR_CORE_PERF_FIXED_CTR_CTRL:
>>  		msr_info->data = pmu->fixed_ctr_ctrl;
>>  		break;
>> +	case MSR_PERF_METRICS:
>> +		WARN_ON_ONCE(!msr_info->host_initiated);
> 
> [Severity: High]
> Could this WARN_ON_ONCE() be triggered by an unprivileged guest?
> 
> If a guest forces KVM's instruction emulator to handle the MSR access, for
> example by executing RDMSR over an MMIO page, the emulator might process the
> instruction and invoke KVM's MSR handlers with host_initiated = false.

Yes, in theory, it's possible for the RDMSR/WRMSR to be handled in the
emulator path and WARN_ON_ONCE() could be unexpectedly triggered.

Since WARN_ON() is not intended to handle untrusted input, it's better
to use a defensive check and reject non-host-initiated accesses:

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 19daee29b731..6073227bdd20 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -348,7 +348,8 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu,
struct msr_data *msr_info)
                msr_info->data = pmu->fixed_ctr_ctrl;
                break;
        case MSR_PERF_METRICS:
-               WARN_ON_ONCE(!msr_info->host_initiated);
+               if (!msr_info->host_initiated)
+                       return 1;
                msr_info->data = pmu->perf_metrics;
                break;
        case MSR_IA32_PEBS_ENABLE:
@@ -401,7 +402,8 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu,
struct msr_data *msr_info)
                        reprogram_fixed_counters(pmu, data);
                break;
        case MSR_PERF_METRICS:
-               WARN_ON_ONCE(!msr_info->host_initiated);
+               if (!msr_info->host_initiated)
+                       return 1;



> This would unconditionally trigger the warning and allow the guest to spam
> the host kernel log or cause a host panic if panic_on_warn is enabled.
> 
>> +		msr_info->data = pmu->perf_metrics;
>> +		break;
>>  	case MSR_IA32_PEBS_ENABLE:
>>  		msr_info->data = pmu->pebs_enable;
>>  		break;
> 
> [ ... ]
> 
>> @@ -384,6 +390,15 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>  		if (pmu->fixed_ctr_ctrl != data)
>>  			reprogram_fixed_counters(pmu, data);
>>  		break;
>> +	case MSR_PERF_METRICS:
>> +		WARN_ON_ONCE(!msr_info->host_initiated);
> 
> [Severity: High]
> Can a malicious guest also trigger this WARN_ON_ONCE() by executing WRMSR
> over an MMIO page to force instruction emulation?

Ditto.
>> +
>> +		/*
>> +		 * If TMA level 2 is not supported, bits [63:32] are reserved
>> +		 * and ignored on write, so no validation is needed here.
>> +		 */
>> +		pmu->perf_metrics = data;
>> +		break;
>>  	case MSR_IA32_PEBS_ENABLE:
> 
> [ ... ]
> 
>> @@ -622,6 +642,9 @@ static void intel_pmu_init(struct kvm_vcpu *vcpu)
>>  
>>  static void intel_pmu_reset(struct kvm_vcpu *vcpu)
>>  {
>> +	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
>> +
>> +	pmu->perf_metrics = 0;
>>  	intel_pmu_release_guest_lbr_event(vcpu);
>>  }
> 
> [ ... ]
> 
>> @@ -821,6 +851,12 @@ static void intel_mediated_pmu_put(struct kvm_vcpu *vcpu)
>>  	 */
>>  	if (pmu->fixed_ctr_ctrl_hw)
>>  		wrmsrq(MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
>> +
>> +	if (vcpu_has_perf_metrics(vcpu)) {
>> +		pmu->perf_metrics = rdpmc(INTEL_PMC_FIXED_RDPMC_METRICS);
> 
> [Severity: High]
> Does this rdpmc() call silently clobber software updates made to
> pmu->perf_metrics?
> 
> When the VCPU is loaded during a KVM_SET_MSRS ioctl or VCPU reset, updates
> to pmu->perf_metrics in intel_pmu_set_msr() or intel_pmu_reset() modify the
> software struct but do not synchronously write to the active hardware MSR.

This is safe.

After intel_pmu_set_msr() or intel_pmu_reset(),
intel_mediated_pmu_load() is called to sync the state to hardware before
the next intel_mediated_pmu_put().


> Later, when the VCPU is descheduled, intel_mediated_pmu_put() reads the
> physical MSR (which still holds the old pre-update value) and overwrites the
> software state.
> 
> Will this permanently discard the new software state and break live
> migration or VCPU resets?
> 
>> +		if (pmu->perf_metrics)
>> +			wrmsrq(MSR_PERF_METRICS, 0);
>> +	}
>>  }
> 
> [ ... ]
> 
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index 9324d60839419..475e72c733cee 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -4247,6 +4247,9 @@ static void vmx_recalc_pmu_msr_intercepts(struct kvm_vcpu *vcpu)
>>  				  MSR_TYPE_RW, intercept);
>>  	vmx_set_intercept_for_msr(vcpu, MSR_CORE_PERF_GLOBAL_OVF_CTRL,
>>  				  MSR_TYPE_RW, intercept);
>> +
>> +	vmx_set_intercept_for_msr(vcpu, MSR_PERF_METRICS, MSR_TYPE_RW,
>> +				  !vcpu_has_perf_metrics(vcpu));
> 
> [Severity: High]
> Could clearing the intercept here allow the guest direct, unintercepted
> access to the host's physical MSR_PERF_METRICS register without a context
> switch?
> 
> If enable_mediated_pmu is true, KVM exposes PERF_CAP_PERF_METRICS. A VMM
> could configure a VCPU with MSR_IA32_PERF_CAPABILITIES.PERF_METRICS=1 but
> leave the PMU version at 0.
> 
> In this scenario, vcpu_has_perf_metrics() would return true, clearing the
> intercept. However, because the PMU version is 0,
> kvm_vcpu_has_mediated_pmu() returns false.
> 
> This causes the context-switch handlers kvm_mediated_pmu_load() and
> kvm_mediated_pmu_put() to exit early.
> 
> Does this bypass host isolation and allow the guest to leak or clobber host
> PMU state?

Yes, you are right.

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 7cffe1619a1f..28027d302d90 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4266,7 +4266,7 @@ static void vmx_recalc_pmu_msr_intercepts(struct
kvm_vcpu *vcpu)
                                  MSR_TYPE_RW, intercept);

        vmx_set_intercept_for_msr(vcpu, MSR_PERF_METRICS, MSR_TYPE_RW,
-                                 !vcpu_has_perf_metrics(vcpu));
+                                 !has_mediated_pmu ||
!vcpu_has_perf_metrics(vcpu));
 }

 static void vmx_recalc_msr_intercepts(struct kvm_vcpu *vcpu)

>>  }
>>  
>>  static void vmx_recalc_msr_intercepts(struct kvm_vcpu *vcpu)
> 


  reply	other threads:[~2026-06-16 16:29 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-15 23:01 [PATCH V3 0/4] KVM: x86/pmu: Add hardware Topdown metrics support Zide Chen
2026-06-15 23:01 ` [PATCH V3 1/4] KVM: x86/pmu: Do not map fixed counters >= 3 to generic perf events Zide Chen
2026-06-15 23:01 ` [PATCH V3 2/4] KVM: x86/pmu: Support Intel fixed counter 3 on mediated vPMU Zide Chen
2026-06-15 23:01 ` [PATCH V3 3/4] KVM: x86/pmu: Support PERF_METRICS MSR in " Zide Chen
2026-06-15 23:26   ` sashiko-bot
2026-06-16 16:29     ` Chen, Zide [this message]
2026-06-18  2:21       ` Mi, Dapeng
2026-06-15 23:01 ` [PATCH V3 4/4] KVM: selftests: Add perf_metrics and fixed counter 3 tests Zide Chen
2026-06-15 23:26   ` sashiko-bot
2026-06-16 16:32     ` Chen, Zide

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5e03d30e-b33b-476e-b7e6-8ff4979ff8db@intel.com \
    --to=zide.chen@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox