Date: Fri, 17 Jan 2025 09:11:52 -0800
From: Sean Christopherson
To: Dapeng Mi
Cc: kernel test robot, oe-lkp@lists.linux.dev, lkp@intel.com,
	Maxim Levitsky, kvm@vger.kernel.org, xudong.hao@intel.com
Subject: Re: [linux-next:master] [KVM] 7803339fa9: kernel-selftests.kvm.pmu_counters_test.fail
In-Reply-To: <6c23d536-484f-4c4b-aa85-3e0b9544611a@linux.intel.com>
References: <202501141009.30c629b4-lkp@intel.com>
	<6c23d536-484f-4c4b-aa85-3e0b9544611a@linux.intel.com>

On Fri, Jan 17, 2025, Dapeng Mi wrote:
> On 1/15/2025 10:44 AM, Mi, Dapeng wrote:
> > On 1/15/2025 3:47 AM, Sean Christopherson wrote:
> >>> # Testing fixed counters, PMU version 0, perf_caps = 2000
> >>> # Testing arch events, PMU version 1, perf_caps = 0
> >>> # ==== Test Assertion Failure ====
> >>> #   x86/pmu_counters_test.c:129: count >= (10 * 4 + 5)
> >>> #   pid=6278 tid=6278 errno=4 - Interrupted system call
> >>> #    1	0x0000000000411281: assert_on_unhandled_exception at processor.c:625
> >>> #    2	0x00000000004075d4: _vcpu_run at kvm_util.c:1652
> >>> #    3	 (inlined by) vcpu_run at kvm_util.c:1663
> >>> #    4	0x0000000000402c5e: run_vcpu at pmu_counters_test.c:62
> >>> #    5	0x0000000000402e4d: test_arch_events at pmu_counters_test.c:315
> >>> #    6	0x0000000000402663: test_arch_events at pmu_counters_test.c:304
> >>> #    7	 (inlined by) test_intel_counters at pmu_counters_test.c:609
> >>> #    8	 (inlined by) main at pmu_counters_test.c:642
> >>> #    9	0x00007f3b134f9249: ?? ??:0
> >>> #   10	0x00007f3b134f9304: ?? ??:0
> >>> #   11	0x0000000000402900: _start at ??:?
> >>> # count >= NUM_INSNS_RETIRED
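
For context, the bound being asserted expands from the test's instruction-count
macros.  A minimal standalone sketch of the arithmetic; the individual values
below are inferred from the "(10 * 4 + 5)" expansion in the log, not taken from
the tree:

  #include <stdio.h>

  /* Inferred from "(10 * 4 + 5)"; the names mirror the selftest's macros. */
  #define NUM_LOOPS		10
  #define NUM_INSNS_PER_LOOP	4
  #define NUM_EXTRA_INSNS	5

  /* Total number of instructions retired within the measured section. */
  #define NUM_INSNS_RETIRED	(NUM_LOOPS * NUM_INSNS_PER_LOOP + NUM_EXTRA_INSNS)

  int main(void)
  {
  	/* The failing assert requires count >= NUM_INSNS_RETIRED, i.e. >= 45. */
  	printf("NUM_INSNS_RETIRED = %d\n", NUM_INSNS_RETIRED);
  	return 0;
  }
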
> >> The failure is on top-down slots.  I modified the assert to actually print the
> >> count (I'll make sure to post a patch regardless of where this goes), and based
> >> on the count for failing vs. passing, I'm pretty sure the issue is not the extra
> >> instruction, but instead is due to changing the target of the CLFLUSH from the
> >> address of the code to the address of kvm_pmu_version.
> >>
> >> However, I think the blame lies with the assertion itself, i.e. with commit
> >> 4a447b135e45 ("KVM: selftests: Test top-down slots event in x86's pmu_counters_test").
> >> Either that or top-down slots is broken on the Lakes.
> >>
> >> By my rudimentary measurements, tying the number of available slots to the number
> >> of instructions *retired* is fundamentally flawed.  E.g. on the Lakes (SKX is more
> >> or less identical to CLX), omitting the CLFLUSHOPT entirely results in *more*
> >> slots being available throughout the lifetime of the measured section.
> >>
> >> My best guess is that flushing the cache line used for the data load causes the
> >> backend to saturate its slots with prefetching data, and as a result the number
> >> of slots that are available goes down.
> >>
> >>       CLFLUSHOPT .    | CLFLUSHOPT [%m] | NOP
> >> CLX   350-100         | 20-60[*]        | 135-150
> >> SPR   49000-57000     | 32500-41000     | 6760-6830
> >>
> >> [*] CLX had a few outliers in the 200-400 range, but the majority of runs were
> >>     in the 20-60 range.
> >>
> >> Reading through more (and more and more) of the TMA documentation, I don't think
> >> we can assume anything about the number of available slots, beyond a very basic
> >> assertion that it's practically impossible for there to never be an available
> >> slot.  IIUC, retiring an instruction does NOT require an available slot, rather
> >> it requires the opposite: an occupied slot for the uop(s).
> > I'm not quite sure about this.  IIUC, retiring an instruction may not need a
> > cycle, but it needs at least one slot unless the instruction is macro-fused.
> > Anyway, let me double-check with our micro-architecture and perf experts.
> 
> Hi Sean,
> 
> I just double-checked with our perf experts: the understanding that "retiring
> an instruction needs at least one slot unless the instruction is macro-fused"
> is correct.  The reason for this error is that the architectural top-down slots
> event was only introduced with Ice Lake; it isn't supported on the Skylake and
> Cascade Lake platforms.  On those earlier platforms, event 0x01a4 is a
> different event that counts something other than top-down slots, and the slots
> event is instead derived from the 0x3c event:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/events/intel/core.c#n466
> 
> I don't understand why the current pmu_counters_test code validates an
> architectural event that KVM (hardware) doesn't support; it's not reasonable
> and can be misleading.
> 
> 	/*
> 	 * Force iterating over known arch events regardless of whether or not
> 	 * KVM/hardware supports a given event.
> 	 */
> 	nr_arch_events = max_t(typeof(nr_arch_events), nr_arch_events,
> 			       NR_INTEL_ARCH_EVENTS);

/facepalm

That's hilariously obvious in hindsight.

> I would provide a patch to fix this.

Testing "unsupported" arch events is intentional.  The idea is to validate that
KVM programs the requested event selector irrespective of whether or not the
architectural event is *enumerated* to the guest (old KVM incorrectly "filtered"
such events).

The flaw in the test is that it doesn't check whether the architectural event is
supported in *hardware* when validating the count.  But it's still desirable to
program the event selector in that case, i.e. only the validation of the count
should be skipped.

Diff at the bottom to address this (needs to be spread over multiple patches).

> BTW, KVM currently doesn't check whether userspace sets a valid PMU version
> in kvm_check_cpuid().  Userspace can give KVM a PMU version that is larger
> than KVM's maximum supported version, which is exactly what
> pmu_counters_test does today.  This is not correct.

It's "correct" in the sense that KVM typically doesn't restrict what userspace
enumerates to the guest through CPUID.  KVM needs to protect the host against
doing bad things based on a funky guest CPUID.
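
To make that concrete, the guest-visible surface in question is CPUID.0xA.  A
minimal host-side sketch of decoding it the way a guest (and the selftest)
consumes it; field layout per the SDM, shown only for illustration:

  #include <cpuid.h>
  #include <stdio.h>

  int main(void)
  {
  	unsigned int eax, ebx, ecx, edx;

  	/* CPUID.0xA enumerates the Intel architectural PMU. */
  	if (!__get_cpuid(0xa, &eax, &ebx, &ecx, &edx))
  		return 1;

  	printf("PMU version:           %u\n", eax & 0xff);
  	printf("GP counters:           %u\n", (eax >> 8) & 0xff);
  	printf("EBX bit vector length: %u\n", (eax >> 24) & 0xff);
  	/* A set bit in EBX means the corresponding arch event is NOT available. */
  	printf("Unavailable events:    0x%x\n", ebx & 0xff);
  	return 0;
  }
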

> I originally intended to fix this with the mediated vPMU patch series, but
> it looks like we can send those patches together with this fix, so the
> issue can be fixed earlier.

I suspect that if there's a flaw in KVM, it only affects the mediated PMU.
Because perf manages MSRs/hardware with the current PMU, advertising a bogus
PMU version to the guest is "fine": even if KVM thinks it's legal for the
*guest* to write MSRs that don't exist in hardware, KVM/perf will never try to
propagate the guest values to non-existent hardware.
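
For reference, the architectural event encodings in play, from the SDM's
pre-defined architectural performance events table (this is where Dapeng's
point bites: 0x01a4 only means top-down slots on Ice Lake and later).  A
standalone sketch; the names are ad hoc, not the selftest's identifiers:

  #include <stdio.h>

  /* Intel architectural events per the SDM; evtsel = event | (umask << 8). */
  static const struct { const char *name; unsigned int event, umask; } arch_events[] = {
  	{ "cpu cycles",            0x3c, 0x00 },
  	{ "instructions retired",  0xc0, 0x00 },
  	{ "reference cycles",      0x3c, 0x01 },
  	{ "llc references",        0x2e, 0x4f },
  	{ "llc misses",            0x2e, 0x41 },
  	{ "branches retired",      0xc4, 0x00 },
  	{ "branches mispredicted", 0xc5, 0x00 },
  	{ "topdown slots",         0xa4, 0x01 },  /* 0x01a4: architectural only on Ice Lake+ */
  };

  int main(void)
  {
  	for (unsigned int i = 0; i < sizeof(arch_events) / sizeof(arch_events[0]); i++)
  		printf("%-22s evtsel=0x%04x\n", arch_events[i].name,
  		       arch_events[i].event | (arch_events[i].umask << 8));
  	return 0;
  }
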
diff --git a/tools/testing/selftests/kvm/x86/pmu_counters_test.c b/tools/testing/selftests/kvm/x86/pmu_counters_test.c
index accd7ecd3e5f..124051ea50be 100644
--- a/tools/testing/selftests/kvm/x86/pmu_counters_test.c
+++ b/tools/testing/selftests/kvm/x86/pmu_counters_test.c
@@ -29,10 +29,60 @@
 /* Total number of instructions retired within the measured section. */
 #define NUM_INSNS_RETIRED	(NUM_LOOPS * NUM_INSNS_PER_LOOP + NUM_EXTRA_INSNS)
 
+/* Track which architectural events are supported by hardware. */
+static uint32_t hardware_pmu_arch_events;
 
 static uint8_t kvm_pmu_version;
 static bool kvm_has_perf_caps;
 
+
+#define X86_PMU_FEATURE_NULL						\
+({									\
+	struct kvm_x86_pmu_feature feature = {};			\
+									\
+	feature;							\
+})
+
+static bool pmu_is_null_feature(struct kvm_x86_pmu_feature event)
+{
+	return !(*(u64 *)&event);
+}
+
+struct kvm_intel_pmu_event {
+	struct kvm_x86_pmu_feature gp_event;
+	struct kvm_x86_pmu_feature fixed_event;
+};
+
+/*
+ * Wrap the array to appease the compiler, as the macros used to construct each
+ * kvm_x86_pmu_feature use syntax that's only valid in function scope, and the
+ * compiler often thinks the feature definitions aren't compile-time constants.
+ */
+static struct kvm_intel_pmu_event intel_event_to_feature(uint8_t idx)
+{
+	const struct kvm_intel_pmu_event __intel_event_to_feature[] = {
+		[INTEL_ARCH_CPU_CYCLES_INDEX]		 = { X86_PMU_FEATURE_CPU_CYCLES, X86_PMU_FEATURE_CPU_CYCLES_FIXED },
+		[INTEL_ARCH_INSTRUCTIONS_RETIRED_INDEX]	 = { X86_PMU_FEATURE_INSNS_RETIRED, X86_PMU_FEATURE_INSNS_RETIRED_FIXED },
+		/*
+		 * Note, the fixed counter for reference cycles is NOT the same as the
+		 * general purpose architectural event.  The fixed counter explicitly
+		 * counts at the same frequency as the TSC, whereas the GP event counts
+		 * at a fixed, but uarch specific, frequency.  Bundle them here for
+		 * simplicity.
+		 */
+		[INTEL_ARCH_REFERENCE_CYCLES_INDEX]	 = { X86_PMU_FEATURE_REFERENCE_CYCLES, X86_PMU_FEATURE_REFERENCE_TSC_CYCLES_FIXED },
+		[INTEL_ARCH_LLC_REFERENCES_INDEX]	 = { X86_PMU_FEATURE_LLC_REFERENCES, X86_PMU_FEATURE_NULL },
+		[INTEL_ARCH_LLC_MISSES_INDEX]		 = { X86_PMU_FEATURE_LLC_MISSES, X86_PMU_FEATURE_NULL },
+		[INTEL_ARCH_BRANCHES_RETIRED_INDEX]	 = { X86_PMU_FEATURE_BRANCH_INSNS_RETIRED, X86_PMU_FEATURE_NULL },
+		[INTEL_ARCH_BRANCHES_MISPREDICTED_INDEX] = { X86_PMU_FEATURE_BRANCHES_MISPREDICTED, X86_PMU_FEATURE_NULL },
+		[INTEL_ARCH_TOPDOWN_SLOTS_INDEX]	 = { X86_PMU_FEATURE_TOPDOWN_SLOTS, X86_PMU_FEATURE_TOPDOWN_SLOTS_FIXED },
+	};
+
+	kvm_static_assert(ARRAY_SIZE(__intel_event_to_feature) == NR_INTEL_ARCH_EVENTS);
+
+	return __intel_event_to_feature[idx];
+}
+
 static struct kvm_vm *pmu_vm_create_with_one_vcpu(struct kvm_vcpu **vcpu,
 						  void *guest_code,
 						  uint8_t pmu_version,
@@ -42,6 +92,7 @@ static struct kvm_vm *pmu_vm_create_with_one_vcpu(struct kvm_vcpu **vcpu,
 
 	vm = vm_create_with_one_vcpu(vcpu, guest_code);
 	sync_global_to_guest(vm, kvm_pmu_version);
+	sync_global_to_guest(vm, hardware_pmu_arch_events);
 
 	/*
 	 * Set PERF_CAPABILITIES before PMU version as KVM disallows enabling
@@ -98,14 +149,12 @@ static uint8_t guest_get_pmu_version(void)
  * Sanity check that in all cases, the event doesn't count when it's disabled,
  * and that KVM correctly emulates the write of an arbitrary value.
  */
-static void guest_assert_event_count(uint8_t idx,
-				     struct kvm_x86_pmu_feature event,
-				     uint32_t pmc, uint32_t pmc_msr)
+static void guest_assert_event_count(uint8_t idx, uint32_t pmc, uint32_t pmc_msr)
 {
 	uint64_t count;
 
 	count = _rdpmc(pmc);
-	if (!this_pmu_has(event))
+	if (!(hardware_pmu_arch_events & BIT(idx)))
 		goto sanity_checks;
 
 	switch (idx) {
@@ -126,7 +175,9 @@ static void guest_assert_event_count(uint8_t idx,
 		GUEST_ASSERT_NE(count, 0);
 		break;
 	case INTEL_ARCH_TOPDOWN_SLOTS_INDEX:
-		GUEST_ASSERT(count >= NUM_INSNS_RETIRED);
+		__GUEST_ASSERT(count >= NUM_INSNS_RETIRED,
+			       "Expected top-down slots >= %u, got count = %lu",
+			       NUM_INSNS_RETIRED, count);
 		break;
 	default:
 		break;
@@ -173,7 +224,7 @@ do {									\
 	);								\
 } while (0)
 
-#define GUEST_TEST_EVENT(_idx, _event, _pmc, _pmc_msr, _ctrl_msr, _value, FEP)	\
+#define GUEST_TEST_EVENT(_idx, _pmc, _pmc_msr, _ctrl_msr, _value, FEP)	\
 do {									\
 	wrmsr(_pmc_msr, 0);						\
 									\
@@ -184,54 +235,20 @@ do {									\
 	else								\
 		GUEST_MEASURE_EVENT(_ctrl_msr, _value, "nop", FEP);	\
 									\
-	guest_assert_event_count(_idx, _event, _pmc, _pmc_msr);	\
+	guest_assert_event_count(_idx, _pmc, _pmc_msr);			\
 } while (0)
 
-static void __guest_test_arch_event(uint8_t idx, struct kvm_x86_pmu_feature event,
-				    uint32_t pmc, uint32_t pmc_msr,
+static void __guest_test_arch_event(uint8_t idx, uint32_t pmc, uint32_t pmc_msr,
 				    uint32_t ctrl_msr, uint64_t ctrl_msr_value)
 {
-	GUEST_TEST_EVENT(idx, event, pmc, pmc_msr, ctrl_msr, ctrl_msr_value, "");
+	GUEST_TEST_EVENT(idx, pmc, pmc_msr, ctrl_msr, ctrl_msr_value, "");
 
 	if (is_forced_emulation_enabled)
-		GUEST_TEST_EVENT(idx, event, pmc, pmc_msr, ctrl_msr, ctrl_msr_value, KVM_FEP);
-}
-
-#define X86_PMU_FEATURE_NULL						\
-({									\
-	struct kvm_x86_pmu_feature feature = {};			\
-									\
-	feature;							\
-})
-
-static bool pmu_is_null_feature(struct kvm_x86_pmu_feature event)
-{
-	return !(*(u64 *)&event);
+		GUEST_TEST_EVENT(idx, pmc, pmc_msr, ctrl_msr, ctrl_msr_value, KVM_FEP);
 }
 
 static void guest_test_arch_event(uint8_t idx)
 {
-	const struct {
-		struct kvm_x86_pmu_feature gp_event;
-		struct kvm_x86_pmu_feature fixed_event;
-	} intel_event_to_feature[] = {
-		[INTEL_ARCH_CPU_CYCLES_INDEX]		 = { X86_PMU_FEATURE_CPU_CYCLES, X86_PMU_FEATURE_CPU_CYCLES_FIXED },
-		[INTEL_ARCH_INSTRUCTIONS_RETIRED_INDEX]	 = { X86_PMU_FEATURE_INSNS_RETIRED, X86_PMU_FEATURE_INSNS_RETIRED_FIXED },
-		/*
-		 * Note, the fixed counter for reference cycles is NOT the same
-		 * as the general purpose architectural event.  The fixed counter
-		 * explicitly counts at the same frequency as the TSC, whereas
-		 * the GP event counts at a fixed, but uarch specific, frequency.
-		 * Bundle them here for simplicity.
-		 */
-		[INTEL_ARCH_REFERENCE_CYCLES_INDEX]	 = { X86_PMU_FEATURE_REFERENCE_CYCLES, X86_PMU_FEATURE_REFERENCE_TSC_CYCLES_FIXED },
-		[INTEL_ARCH_LLC_REFERENCES_INDEX]	 = { X86_PMU_FEATURE_LLC_REFERENCES, X86_PMU_FEATURE_NULL },
-		[INTEL_ARCH_LLC_MISSES_INDEX]		 = { X86_PMU_FEATURE_LLC_MISSES, X86_PMU_FEATURE_NULL },
-		[INTEL_ARCH_BRANCHES_RETIRED_INDEX]	 = { X86_PMU_FEATURE_BRANCH_INSNS_RETIRED, X86_PMU_FEATURE_NULL },
-		[INTEL_ARCH_BRANCHES_MISPREDICTED_INDEX] = { X86_PMU_FEATURE_BRANCHES_MISPREDICTED, X86_PMU_FEATURE_NULL },
-		[INTEL_ARCH_TOPDOWN_SLOTS_INDEX]	 = { X86_PMU_FEATURE_TOPDOWN_SLOTS, X86_PMU_FEATURE_TOPDOWN_SLOTS_FIXED },
-	};
-
 	uint32_t nr_gp_counters = this_cpu_property(X86_PROPERTY_PMU_NR_GP_COUNTERS);
 	uint32_t pmu_version = guest_get_pmu_version();
 	/* PERF_GLOBAL_CTRL exists only for Architectural PMU Version 2+. */
@@ -249,7 +266,7 @@ static void guest_test_arch_event(uint8_t idx)
 	else
 		base_pmc_msr = MSR_IA32_PERFCTR0;
 
-	gp_event = intel_event_to_feature[idx].gp_event;
+	gp_event = intel_event_to_feature(idx).gp_event;
 	GUEST_ASSERT_EQ(idx, gp_event.f.bit);
 
 	GUEST_ASSERT(nr_gp_counters);
@@ -263,14 +280,14 @@ static void guest_test_arch_event(uint8_t idx)
 		if (guest_has_perf_global_ctrl)
 			wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, BIT_ULL(i));
 
-		__guest_test_arch_event(idx, gp_event, i, base_pmc_msr + i,
+		__guest_test_arch_event(idx, i, base_pmc_msr + i,
 					MSR_P6_EVNTSEL0 + i, eventsel);
 	}
 
 	if (!guest_has_perf_global_ctrl)
 		return;
 
-	fixed_event = intel_event_to_feature[idx].fixed_event;
+	fixed_event = intel_event_to_feature(idx).fixed_event;
 	if (pmu_is_null_feature(fixed_event) || !this_pmu_has(fixed_event))
 		return;
 
@@ -278,7 +295,7 @@ static void guest_test_arch_event(uint8_t idx)
 
 	wrmsr(MSR_CORE_PERF_FIXED_CTR_CTRL, FIXED_PMC_CTRL(i, FIXED_PMC_KERNEL));
 
-	__guest_test_arch_event(idx, fixed_event, i | INTEL_RDPMC_FIXED,
+	__guest_test_arch_event(idx, i | INTEL_RDPMC_FIXED,
 				MSR_CORE_PERF_FIXED_CTR0 + i,
 				MSR_CORE_PERF_GLOBAL_CTRL,
 				FIXED_PMC_GLOBAL_CTRL_ENABLE(i));
@@ -546,7 +563,6 @@ static void test_fixed_counters(uint8_t pmu_version, uint64_t perf_capabilities,
 
 static void test_intel_counters(void)
 {
-	uint8_t nr_arch_events = kvm_cpu_property(X86_PROPERTY_PMU_EBX_BIT_VECTOR_LENGTH);
 	uint8_t nr_fixed_counters = kvm_cpu_property(X86_PROPERTY_PMU_NR_FIXED_COUNTERS);
 	uint8_t nr_gp_counters = kvm_cpu_property(X86_PROPERTY_PMU_NR_GP_COUNTERS);
 	uint8_t pmu_version = kvm_cpu_property(X86_PROPERTY_PMU_VERSION);
@@ -568,18 +584,26 @@ static void test_intel_counters(void)
 
 	/*
 	 * Detect the existence of events that aren't supported by selftests.
-	 * This will (obviously) fail any time the kernel adds support for a
-	 * new event, but it's worth paying that price to keep the test fresh.
+	 * This will (obviously) fail any time hardware adds support for a new
+	 * event, but it's worth paying that price to keep the test fresh.
 	 */
-	TEST_ASSERT(nr_arch_events <= NR_INTEL_ARCH_EVENTS,
+	TEST_ASSERT(this_cpu_property(X86_PROPERTY_PMU_EBX_BIT_VECTOR_LENGTH) <= NR_INTEL_ARCH_EVENTS,
 		    "New architectural event(s) detected; please update this test (length = %u, mask = %x)",
-		    nr_arch_events, kvm_cpu_property(X86_PROPERTY_PMU_EVENTS_MASK));
+		    this_cpu_property(X86_PROPERTY_PMU_EBX_BIT_VECTOR_LENGTH),
+		    this_cpu_property(X86_PROPERTY_PMU_EVENTS_MASK));
 
 	/*
-	 * Force iterating over known arch events regardless of whether or not
-	 * KVM/hardware supports a given event.
+	 * Iterate over known arch events irrespective of KVM/hardware support
+	 * to verify that KVM doesn't reject programming of events just because
+	 * the *architectural* encoding is unsupported.  Track which events are
+	 * supported in hardware; the guest side will validate supported events
+	 * count correctly, even if *enumeration* of the event is unsupported
+	 * by KVM and/or isn't exposed to the guest.
 	 */
-	nr_arch_events = max_t(typeof(nr_arch_events), nr_arch_events, NR_INTEL_ARCH_EVENTS);
+	for (i = 0; i < NR_INTEL_ARCH_EVENTS; i++) {
+		if (this_pmu_has(intel_event_to_feature(i).gp_event))
+			hardware_pmu_arch_events |= BIT(i);
+	}
 
 	for (v = 0; v <= max_pmu_version; v++) {
 		for (i = 0; i < ARRAY_SIZE(perf_caps); i++) {
@@ -595,8 +619,8 @@
 			 * vector length.
 			 */
 			if (v == pmu_version) {
-				for (k = 1; k < (BIT(nr_arch_events) - 1); k++)
-					test_arch_events(v, perf_caps[i], nr_arch_events, k);
+				for (k = 1; k < (BIT(NR_INTEL_ARCH_EVENTS) - 1); k++)
+					test_arch_events(v, perf_caps[i], NR_INTEL_ARCH_EVENTS, k);
 			}
 			/*
 			 * Test single bits for all PMU version and lengths up
@@ -605,11 +629,11 @@
 			 * host length).  Explicitly test a mask of '0' and all
 			 * ones i.e. all events being available and unavailable.
 			 */
-			for (j = 0; j <= nr_arch_events + 1; j++) {
+			for (j = 0; j <= NR_INTEL_ARCH_EVENTS + 1; j++) {
 				test_arch_events(v, perf_caps[i], j, 0);
 				test_arch_events(v, perf_caps[i], j, 0xff);
 
-				for (k = 0; k < nr_arch_events; k++)
+				for (k = 0; k < NR_INTEL_ARCH_EVENTS; k++)
 					test_arch_events(v, perf_caps[i], j, BIT(k));
 			}
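
Distilled, the behavioral change above is: always program the event selector,
but only validate the resulting count when hardware supports the architectural
event.  A standalone sketch of just that gating, with the selftest plumbing
stubbed out and the setup values hypothetical:

  #include <stdbool.h>
  #include <stdint.h>
  #include <stdio.h>

  #define NR_INTEL_ARCH_EVENTS		8
  #define INTEL_ARCH_TOPDOWN_SLOTS_INDEX	7

  /* Host-side bitmap of arch events hardware supports, synced into the guest. */
  static uint32_t hardware_pmu_arch_events;

  /* Events are always programmed; only count validation is gated on hardware. */
  static bool should_validate_count(uint8_t idx)
  {
  	return hardware_pmu_arch_events & (1u << idx);
  }

  int main(void)
  {
  	uint8_t i;

  	/* Hypothetical pre-Ice Lake setup: everything but top-down slots. */
  	hardware_pmu_arch_events = ((1u << NR_INTEL_ARCH_EVENTS) - 1) &
  				   ~(1u << INTEL_ARCH_TOPDOWN_SLOTS_INDEX);

  	for (i = 0; i < NR_INTEL_ARCH_EVENTS; i++)
  		printf("event %u: program=yes, validate=%s\n", i,
  		       should_validate_count(i) ? "yes" : "no");
  	return 0;
  }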