public inbox for linux-kernel@vger.kernel.org
From: "Mi, Dapeng" <dapeng1.mi@linux.intel.com>
To: "Chen, Zide" <zide.chen@intel.com>,
	Sean Christopherson <seanjc@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	Jim Mattson <jmattson@google.com>,
	Mingwei Zhang <mizhang@google.com>,
	Das Sandipan <Sandipan.Das@amd.com>,
	Shukla Manali <Manali.Shukla@amd.com>,
	Falcon Thomas <thomas.falcon@intel.com>,
	Xudong Hao <xudong.hao@intel.com>
Subject: Re: [PATCH V2 2/4] KVM: x86/pmu: Support Intel fixed counter 3 on mediated vPMU
Date: Wed, 6 May 2026 09:36:10 +0800
Message-ID: <3af0b9d4-f708-4225-9480-cb7080406ca0@linux.intel.com>
In-Reply-To: <7d5ca8ac-2118-4d70-b70f-9188cf36f40a@intel.com>


On 5/1/2026 1:54 AM, Chen, Zide wrote:
>
> On 4/29/2026 7:19 PM, Mi, Dapeng wrote:
>> On 4/24/2026 1:46 AM, Zide Chen wrote:
>>> From: Dapeng Mi <dapeng1.mi@linux.intel.com>
>>>
>>> Starting with Ice Lake, Intel introduces fixed counter 3, which counts
>>> TOPDOWN.SLOTS - the number of available slots for an unhalted logical
>>> processor.  It serves as the denominator for top-level metrics in the
>>> Top-down Microarchitecture Analysis method.
>>>
>>> Emulating this counter on legacy vPMU would require introducing a new
>>> generic perf encoding for the Intel-specific TOPDOWN.SLOTS event in
>>> order to call perf_get_hw_event_config().  This is undesirable as it
>>> would pollute the generic perf event encoding.
>>>
>>> Moreover, KVM does not intend to emulate IA32_PERF_METRICS in the
>>> legacy vPMU model, and without IA32_PERF_METRICS, emulating this
>>> counter has little practical value.  Therefore, expose fixed counter
>>> 3 to guests only when mediated vPMU is enabled.
>>>
>>> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
>>> Co-developed-by: Zide Chen <zide.chen@intel.com>
>>> Signed-off-by: Zide Chen <zide.chen@intel.com>
>>> ---
>>> V2:
>>> - Don't advertise fixed counter 3 to userspace if the host doesn't
>>>   support it.
>>> ---
>>>  arch/x86/include/asm/kvm_host.h | 2 +-
>>>  arch/x86/kvm/cpuid.c            | 9 +++++++--
>>>  arch/x86/kvm/pmu.c              | 4 ++++
>>>  arch/x86/kvm/x86.c              | 4 ++--
>>>  4 files changed, 14 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>>> index c470e40a00aa..cb736a4c72ea 100644
>>> --- a/arch/x86/include/asm/kvm_host.h
>>> +++ b/arch/x86/include/asm/kvm_host.h
>>> @@ -556,7 +556,7 @@ struct kvm_pmc {
>>>  #define KVM_MAX_NR_GP_COUNTERS		KVM_MAX(KVM_MAX_NR_INTEL_GP_COUNTERS, \
>>>  						KVM_MAX_NR_AMD_GP_COUNTERS)
>>>  
>>> -#define KVM_MAX_NR_INTEL_FIXED_COUNTERS	3
>>> +#define KVM_MAX_NR_INTEL_FIXED_COUNTERS	4
>>>  #define KVM_MAX_NR_AMD_FIXED_COUNTERS	0
>>>  #define KVM_MAX_NR_FIXED_COUNTERS	KVM_MAX(KVM_MAX_NR_INTEL_FIXED_COUNTERS, \
>>>  						KVM_MAX_NR_AMD_FIXED_COUNTERS)
>>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>>> index e69156b54cff..d87a26f740e5 100644
>>> --- a/arch/x86/kvm/cpuid.c
>>> +++ b/arch/x86/kvm/cpuid.c
>>> @@ -1505,7 +1505,7 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
>>>  		break;
>>>  	case 0xa: { /* Architectural Performance Monitoring */
>>>  		union cpuid10_eax eax = { };
>>> -		union cpuid10_edx edx = { };
>>> +		union cpuid10_edx edx = { }, host_edx;
>>>  
>>>  		if (!enable_pmu || !static_cpu_has(X86_FEATURE_ARCH_PERFMON)) {
>>>  			entry->eax = entry->ebx = entry->ecx = entry->edx = 0;
>>> @@ -1516,9 +1516,14 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
>>>  		eax.split.num_counters = kvm_pmu_cap.num_counters_gp;
>>>  		eax.split.bit_width = kvm_pmu_cap.bit_width_gp;
>>>  		eax.split.mask_length = kvm_pmu_cap.events_mask_len;
>>> -		edx.split.num_counters_fixed = kvm_pmu_cap.num_counters_fixed;
>>>  		edx.split.bit_width_fixed = kvm_pmu_cap.bit_width_fixed;
>>>  
>>> +		/* Guest does not support non-contiguous fixed counters. */
>>> +		host_edx = (union cpuid10_edx)entry->edx;
>>> +		edx.split.num_counters_fixed =
>>> +			 min_t(int, kvm_pmu_cap.num_counters_fixed,
>>> +			       host_edx.split.num_counters_fixed);
>> kvm_pmu_cap is derived from kvm_host_pmu, which already reflects the host's
>> fixed counter count, so why is the host's fixed counter count checked again here?
> This stems from KVM not supporting non-contiguous fixed counters on the
> guest.
>
> On CWF, the fixed counter mask is 0x77 and the number of contiguous
> fixed counters is 3. kvm_host_pmu.num_counters_fixed is 6 from the host,
> and in kvm_pmu_cap it's capped to KVM_MAX_NR_INTEL_FIXED_COUNTERS
> without accounting for non-contiguity:
>
> memcpy(&kvm_pmu_cap, &kvm_host_pmu, sizeof(kvm_host_pmu));
> kvm_pmu_cap.num_counters_fixed = min(kvm_pmu_cap.num_counters_fixed,
>                                      KVM_MAX_NR_FIXED_COUNTERS);
>
> It would be more natural to check against the host's contiguous fixed
> counter count in kvm_init_pmu_capability(), but I placed it in cpuid.c
> to leverage do_host_cpuid().
>
> A more complete fix would be to pull in some PerfmonExt patches to add
> fixed/GP counter mask support in kvm_host_pmu, and filter out
> non-contiguous counters in kvm_init_pmu_capability(). But that approach
> could add too much "temporary" code to translate between
> nr_of_xxx_counters and xxx_counter_mask.

I see. It may not be a good choice to pull the PerfmonExt patches into
this patchset considering their large size. We'd better move this part
of the code into kvm_init_pmu_capability(), which is a better place for it,
and add some comments to explain it. Thanks.
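
To make the contiguity point concrete, here is a rough standalone sketch of the
clamping logic, not the actual KVM change. Both helper names are hypothetical,
and the sketch assumes the host's fixed counter bitmap is available as a plain
integer, which kvm_host_pmu does not provide today.

```c
#include <stdint.h>

/*
 * Hypothetical helper: count the fixed counters that are contiguous
 * starting from index 0, i.e. the number of trailing set bits in the mask.
 */
unsigned int nr_contiguous_fixed_counters(uint64_t fixed_mask)
{
	unsigned int n = 0;

	while (fixed_mask & 1) {
		n++;
		fixed_mask >>= 1;
	}
	return n;
}

/*
 * Hypothetical clamp, loosely mirroring the min() calls in
 * kvm_init_pmu_capability(): limit to the contiguous count, then to the
 * KVM maximum, then to 3 when the mediated vPMU is not enabled.
 */
unsigned int guest_nr_fixed_counters(uint64_t host_fixed_mask,
				     unsigned int kvm_max,
				     int mediated_pmu)
{
	unsigned int n = nr_contiguous_fixed_counters(host_fixed_mask);

	if (n > kvm_max)
		n = kvm_max;
	/* Fixed counter 3 is exposed only with the mediated vPMU. */
	if (!mediated_pmu && n > 3)
		n = 3;
	return n;
}
```

With the CWF mask 0x77 mentioned above, guest_nr_fixed_counters(0x77, 4, 1)
yields 3, because counter 3 is missing and counters 4-6 are not contiguous with
0-2. With a mask of 0xf, it yields 4 with the mediated vPMU and 3 without it.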


>
>
>> Besides, we can't only depend on the fixed counters number to check if
>> fixed counter 3 is supported on host, e.g., CWF supports fixed counter 4, 5
>> and 6 but doesn't support fixed counter 3. Before adding PerfmonExt (0x23)
>> CPUID leaf support in KVM, we need to check CPUID.0xa.ecx to get the
>> real fixed counter bitmap and then check whether fixed counter 3 is supported.
> This is a theoretical concern even without fixed counter 3 support.
> Before this patch, KVM supports up to 3 fixed counters and assumes they
> are contiguous, which holds true in practice.
>
> CPUID.0xa.ecx is only meaningful starting from PMU v4, so it can't be
> used unconditionally. However, CPUID.0xa.edx[4:0] always represents the
> number of contiguous fixed counters, so checking against it is
> sufficient to filter out non-contiguous ones.
>
>> Thanks.
>>
>>
>>> +
>>>  		if (kvm_pmu_cap.version)
>>>  			edx.split.anythread_deprecated = 1;
>>>  
>>> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
>>> index e218352e3423..9ff4a6a9cd0b 100644
>>> --- a/arch/x86/kvm/pmu.c
>>> +++ b/arch/x86/kvm/pmu.c
>>> @@ -148,12 +148,16 @@ void kvm_init_pmu_capability(struct kvm_pmu_ops *pmu_ops)
>>>  	}
>>>  
>>>  	memcpy(&kvm_pmu_cap, &kvm_host_pmu, sizeof(kvm_host_pmu));
>>> +
>>>  	kvm_pmu_cap.version = min(kvm_pmu_cap.version, 2);
>>>  	kvm_pmu_cap.num_counters_gp = min(kvm_pmu_cap.num_counters_gp,
>>>  					  pmu_ops->MAX_NR_GP_COUNTERS);
>>>  	kvm_pmu_cap.num_counters_fixed = min(kvm_pmu_cap.num_counters_fixed,
>>>  					     KVM_MAX_NR_FIXED_COUNTERS);
>>>  
>>> +	if (!enable_mediated_pmu && kvm_pmu_cap.num_counters_fixed > 3)
>>> +		kvm_pmu_cap.num_counters_fixed = 3;
>>> +
>>>  	kvm_pmu_eventsel.INSTRUCTIONS_RETIRED =
>>>  		perf_get_hw_event_config(PERF_COUNT_HW_INSTRUCTIONS);
>>>  	kvm_pmu_eventsel.BRANCH_INSTRUCTIONS_RETIRED =
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index 0a1b63c63d1a..604072d9354f 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -360,7 +360,7 @@ static const u32 msrs_to_save_base[] = {
>>>  
>>>  static const u32 msrs_to_save_pmu[] = {
>>>  	MSR_ARCH_PERFMON_FIXED_CTR0, MSR_ARCH_PERFMON_FIXED_CTR1,
>>> -	MSR_ARCH_PERFMON_FIXED_CTR0 + 2,
>>> +	MSR_ARCH_PERFMON_FIXED_CTR2, MSR_ARCH_PERFMON_FIXED_CTR3,
>>>  	MSR_CORE_PERF_FIXED_CTR_CTRL, MSR_CORE_PERF_GLOBAL_STATUS,
>>>  	MSR_CORE_PERF_GLOBAL_CTRL,
>>>  	MSR_IA32_PEBS_ENABLE, MSR_IA32_DS_AREA, MSR_PEBS_DATA_CFG,
>>> @@ -7756,7 +7756,7 @@ static void kvm_init_msr_lists(void)
>>>  {
>>>  	unsigned i;
>>>  
>>> -	BUILD_BUG_ON_MSG(KVM_MAX_NR_FIXED_COUNTERS != 3,
>>> +	BUILD_BUG_ON_MSG(KVM_MAX_NR_FIXED_COUNTERS != 4,
>>>  			 "Please update the fixed PMCs in msrs_to_save_pmu[]");
>>>  
>>>  	num_msrs_to_save = 0;

Thread overview: 12+ messages
2026-04-23 17:46 [PATCH V2 0/4] KVM: x86/pmu: Add hardware Topdown metrics support Zide Chen
2026-04-23 17:46 ` [PATCH V2 1/4] KVM: x86/pmu: Do not map fixed counters >= 3 to generic perf events Zide Chen
2026-04-30  1:55   ` Mi, Dapeng
2026-04-23 17:46 ` [PATCH V2 2/4] KVM: x86/pmu: Support Intel fixed counter 3 on mediated vPMU Zide Chen
2026-04-30  2:19   ` Mi, Dapeng
2026-04-30 17:54     ` Chen, Zide
2026-05-06  1:36       ` Mi, Dapeng [this message]
2026-04-23 17:46 ` [PATCH V2 3/4] KVM: x86/pmu: Support PERF_METRICS MSR in " Zide Chen
2026-04-30  2:22   ` Mi, Dapeng
2026-04-23 17:46 ` [PATCH V2 4/4] KVM: selftests: Add perf_metrics and fixed counter 3 tests Zide Chen
2026-04-30  2:26   ` Mi, Dapeng
2026-04-30 18:13     ` Chen, Zide
