* REGRESSION on linux-next (next-20250919) @ 2025-09-30 5:30 Borah, Chaitanya Kumar 2025-09-30 8:03 ` Mi, Dapeng 0 siblings, 1 reply; 9+ messages in thread From: Borah, Chaitanya Kumar @ 2025-09-30 5:30 UTC (permalink / raw) To: seanjc Cc: intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org, Kurmi, Suresh Kumar, Saarinen, Jani, lucas.demarchi, linux-perf-users, kvm Hello Sean, Hope you are doing well. I am Chaitanya from the linux graphics team in Intel. This mail is regarding a regression we are seeing in our CI runs[1] on linux-next repository. Since the version next-20250919 [2], we are seeing the following regression ````````````````````````````````````````````````````````````````````````````````` <4>[ 10.973827] ------------[ cut here ]------------ <4>[ 10.973841] WARNING: arch/x86/events/core.c:3089 at perf_get_x86_pmu_capability+0xd/0xc0, CPU#15: (udev-worker)/386 ... <4>[ 10.974028] Call Trace: <4>[ 10.974030] <TASK> <4>[ 10.974033] ? kvm_init_pmu_capability+0x2b/0x190 [kvm] <4>[ 10.974154] kvm_x86_vendor_init+0x1b0/0x1a40 [kvm] <4>[ 10.974248] vmx_init+0xdb/0x260 [kvm_intel] <4>[ 10.974278] ? __pfx_vt_init+0x10/0x10 [kvm_intel] <4>[ 10.974296] vt_init+0x12/0x9d0 [kvm_intel] <4>[ 10.974309] ? __pfx_vt_init+0x10/0x10 [kvm_intel] <4>[ 10.974322] do_one_initcall+0x60/0x3f0 <4>[ 10.974335] do_init_module+0x97/0x2b0 <4>[ 10.974345] load_module+0x2d08/0x2e30 <4>[ 10.974349] ? __kernel_read+0x158/0x2f0 <4>[ 10.974370] ? kernel_read_file+0x2b1/0x320 <4>[ 10.974381] init_module_from_file+0x96/0xe0 <4>[ 10.974384] ? init_module_from_file+0x96/0xe0 <4>[ 10.974399] idempotent_init_module+0x117/0x330 <4>[ 10.974415] __x64_sys_finit_module+0x73/0xe0 ... ````````````````````````````````````````````````````````````````````````````````` Details log can be found in [3]. 
After bisecting the tree, the following patch [4] seems to be the first "bad" commit ````````````````````````````````````````````````````````````````````````````````````````````````````````` From 51f34b1e650fc5843530266cea4341750bd1ae37 Mon Sep 17 00:00:00 2001 From: Sean Christopherson <seanjc@google.com> Date: Wed, 6 Aug 2025 12:56:39 -0700 Subject: KVM: x86/pmu: Snapshot host (i.e. perf's) reported PMU capabilities Take a snapshot of the unadulterated PMU capabilities provided by perf so that KVM can compare guest vPMU capabilities against hardware capabilities when determining whether or not to intercept PMU MSRs (and RDPMC). ````````````````````````````````````````````````````````````````````````````````````````````````````````` We also verified that if we revert the patch the issue is not seen. Could you please check why the patch causes this regression and provide a fix if necessary? Thank you. Regards Chaitanya [1] https://intel-gfx-ci.01.org/tree/linux-next/combined-alt.html? [2] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20250919 [3] https://intel-gfx-ci.01.org/tree/linux-next/next-20250919/bat-arlh-2/boot0.txt [4] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20250919&id=51f34b1e650fc5843530266cea4341750bd1ae37 ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: REGRESSION on linux-next (next-20250919) 2025-09-30 5:30 REGRESSION on linux-next (next-20250919) Borah, Chaitanya Kumar @ 2025-09-30 8:03 ` Mi, Dapeng 2025-09-30 15:09 ` Sean Christopherson 0 siblings, 1 reply; 9+ messages in thread From: Mi, Dapeng @ 2025-09-30 8:03 UTC (permalink / raw) To: Borah, Chaitanya Kumar, seanjc Cc: intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org, Kurmi, Suresh Kumar, Saarinen, Jani, lucas.demarchi, linux-perf-users, kvm On 9/30/2025 1:30 PM, Borah, Chaitanya Kumar wrote: > Hello Sean, > > Hope you are doing well. I am Chaitanya from the linux graphics team in > Intel. > > This mail is regarding a regression we are seeing in our CI runs[1] on > linux-next repository. > > Since the version next-20250919 [2], we are seeing the following regression > > ````````````````````````````````````````````````````````````````````````````````` > <4>[ 10.973827] ------------[ cut here ]------------ > <4>[ 10.973841] WARNING: arch/x86/events/core.c:3089 at > perf_get_x86_pmu_capability+0xd/0xc0, CPU#15: (udev-worker)/386 > ... > <4>[ 10.974028] Call Trace: > <4>[ 10.974030] <TASK> > <4>[ 10.974033] ? kvm_init_pmu_capability+0x2b/0x190 [kvm] > <4>[ 10.974154] kvm_x86_vendor_init+0x1b0/0x1a40 [kvm] > <4>[ 10.974248] vmx_init+0xdb/0x260 [kvm_intel] > <4>[ 10.974278] ? __pfx_vt_init+0x10/0x10 [kvm_intel] > <4>[ 10.974296] vt_init+0x12/0x9d0 [kvm_intel] > <4>[ 10.974309] ? __pfx_vt_init+0x10/0x10 [kvm_intel] > <4>[ 10.974322] do_one_initcall+0x60/0x3f0 > <4>[ 10.974335] do_init_module+0x97/0x2b0 > <4>[ 10.974345] load_module+0x2d08/0x2e30 > <4>[ 10.974349] ? __kernel_read+0x158/0x2f0 > <4>[ 10.974370] ? kernel_read_file+0x2b1/0x320 > <4>[ 10.974381] init_module_from_file+0x96/0xe0 > <4>[ 10.974384] ? init_module_from_file+0x96/0xe0 > <4>[ 10.974399] idempotent_init_module+0x117/0x330 > <4>[ 10.974415] __x64_sys_finit_module+0x73/0xe0 > ... 
> `````````````````````````````````````````````````````````````````````````````````
> Details log can be found in [3].
>
> After bisecting the tree, the following patch [4] seems to be the first
> "bad" commit
>
> `````````````````````````````````````````````````````````````````````````````````````````````````````````
> From 51f34b1e650fc5843530266cea4341750bd1ae37 Mon Sep 17 00:00:00 2001
>
> From: Sean Christopherson <seanjc@google.com>
>
> Date: Wed, 6 Aug 2025 12:56:39 -0700
>
> Subject: KVM: x86/pmu: Snapshot host (i.e. perf's) reported PMU capabilities
>
> Take a snapshot of the unadulterated PMU capabilities provided by perf so
> that KVM can compare guest vPMU capabilities against hardware capabilities
> when determining whether or not to intercept PMU MSRs (and RDPMC).
> `````````````````````````````````````````````````````````````````````````````````````````````````````````
>
> We also verified that if we revert the patch the issue is not seen.
>
> Could you please check why the patch causes this regression and provide
> a fix if necessary?

Hi Chaitanya,

I suppose you hit this warning on a hybrid client platform, right? It
looks like the warning is triggered by the WARN_ON_ONCE() below in
perf_get_x86_pmu_capability():

	if (WARN_ON_ONCE(cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)) ||
	    !x86_pmu_initialized()) {
		memset(cap, 0, sizeof(*cap));
		return;
	}

The change below should fix it (build-tested only, not run-tested). I will
run a full vPMU test pass after I come back from the China National Day
holiday.

Thanks.

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index cebce7094de8..6d87c25226d8 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -108,8 +108,6 @@ void kvm_init_pmu_capability(struct kvm_pmu_ops *pmu_ops)
 	bool is_intel = boot_cpu_data.x86_vendor == X86_VENDOR_INTEL;
 	int min_nr_gp_ctrs = pmu_ops->MIN_NR_GP_COUNTERS;
 
-	perf_get_x86_pmu_capability(&kvm_host_pmu);
-
 	/*
 	 * Hybrid PMUs don't play nice with virtualization without careful
 	 * configuration by userspace, and KVM's APIs for reporting supported
@@ -120,6 +118,8 @@ void kvm_init_pmu_capability(struct kvm_pmu_ops *pmu_ops)
 		enable_pmu = false;
 
 	if (enable_pmu) {
+		perf_get_x86_pmu_capability(&kvm_host_pmu);
+
 		/*
 		 * WARN if perf did NOT disable hardware PMU if the number of
 		 * architecturally required GP counters aren't present, i.e. if

>
> Thank you.
>
> Regards
>
> Chaitanya
>
> [1]
> https://intel-gfx-ci.01.org/tree/linux-next/combined-alt.html?
> [2]
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20250919
> [3]
> https://intel-gfx-ci.01.org/tree/linux-next/next-20250919/bat-arlh-2/boot0.txt
> [4]
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20250919&id=51f34b1e650fc5843530266cea4341750bd1ae37
>

^ permalink raw reply related [flat|nested] 9+ messages in thread
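The ordering problem described in the message above can be modeled in userspace C. This is only a rough sketch, not kernel code: `struct pmu_cap`, `mock_perf_get_cap()`, `warn_count`, and the two `init_*()` helpers are hypothetical stand-ins, and the counter values are made up. The mock counts every hit, whereas the kernel's WARN_ON_ONCE() reports only the first one, and the x86_pmu_initialized() check is left out for brevity.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct x86_pmu_capability. */
struct pmu_cap {
	int version;
	int num_counters_gp;
};

/* Number of simulated WARN hits (assumption: one count per bad call). */
static int warn_count;

/* Models perf_get_x86_pmu_capability(): warn and zero the output on hybrid. */
static void mock_perf_get_cap(struct pmu_cap *cap, bool hybrid)
{
	if (hybrid) {
		warn_count++;
		memset(cap, 0, sizeof(*cap));
		return;
	}
	/* Made-up values for a non-hybrid host. */
	cap->version = 2;
	cap->num_counters_gp = 4;
}

/* Ordering in the bisected commit: query perf before the hybrid check. */
static bool init_unconditional(struct pmu_cap *host, bool hybrid, bool enable_pmu)
{
	mock_perf_get_cap(host, hybrid);
	if (hybrid)
		enable_pmu = false;
	return enable_pmu;
}

/* Ordering in the proposed diff: query perf only if the vPMU stays enabled. */
static bool init_deferred(struct pmu_cap *host, bool hybrid, bool enable_pmu)
{
	if (hybrid)
		enable_pmu = false;
	if (enable_pmu)
		mock_perf_get_cap(host, hybrid);
	return enable_pmu;
}
```

In this model, calling `init_unconditional()` with `hybrid` set bumps `warn_count`, while `init_deferred()` never reaches the mock perf helper on a hybrid host, which is the effect the diff is aiming for.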
* Re: REGRESSION on linux-next (next-20250919) 2025-09-30 8:03 ` Mi, Dapeng @ 2025-09-30 15:09 ` Sean Christopherson 2025-10-06 8:03 ` Borah, Chaitanya Kumar 2025-10-06 8:27 ` Mi, Dapeng 0 siblings, 2 replies; 9+ messages in thread From: Sean Christopherson @ 2025-09-30 15:09 UTC (permalink / raw) To: Dapeng Mi Cc: Chaitanya Kumar Borah, intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org, Suresh Kumar Kurmi, Jani Saarinen, lucas.demarchi, linux-perf-users, kvm On Tue, Sep 30, 2025, Dapeng Mi wrote: > > On 9/30/2025 1:30 PM, Borah, Chaitanya Kumar wrote: > > Hello Sean, > > > > Hope you are doing well. I am Chaitanya from the linux graphics team in > > Intel. > > > > This mail is regarding a regression we are seeing in our CI runs[1] on > > linux-next repository. > > > > Since the version next-20250919 [2], we are seeing the following regression > > > > ````````````````````````````````````````````````````````````````````````````````` > > <4>[ 10.973827] ------------[ cut here ]------------ > > <4>[ 10.973841] WARNING: arch/x86/events/core.c:3089 at > > perf_get_x86_pmu_capability+0xd/0xc0, CPU#15: (udev-worker)/386 > > ... > > <4>[ 10.974028] Call Trace: > > <4>[ 10.974030] <TASK> > > <4>[ 10.974033] ? kvm_init_pmu_capability+0x2b/0x190 [kvm] > > <4>[ 10.974154] kvm_x86_vendor_init+0x1b0/0x1a40 [kvm] > > <4>[ 10.974248] vmx_init+0xdb/0x260 [kvm_intel] > > <4>[ 10.974278] ? __pfx_vt_init+0x10/0x10 [kvm_intel] > > <4>[ 10.974296] vt_init+0x12/0x9d0 [kvm_intel] > > <4>[ 10.974309] ? __pfx_vt_init+0x10/0x10 [kvm_intel] > > <4>[ 10.974322] do_one_initcall+0x60/0x3f0 > > <4>[ 10.974335] do_init_module+0x97/0x2b0 > > <4>[ 10.974345] load_module+0x2d08/0x2e30 > > <4>[ 10.974349] ? __kernel_read+0x158/0x2f0 > > <4>[ 10.974370] ? kernel_read_file+0x2b1/0x320 > > <4>[ 10.974381] init_module_from_file+0x96/0xe0 > > <4>[ 10.974384] ? 
init_module_from_file+0x96/0xe0 > > <4>[ 10.974399] idempotent_init_module+0x117/0x330 > > <4>[ 10.974415] __x64_sys_finit_module+0x73/0xe0 > > ... > > ````````````````````````````````````````````````````````````````````````````````` > > Details log can be found in [3]. > > > > After bisecting the tree, the following patch [4] seems to be the first > > "bad" commit > > > > ````````````````````````````````````````````````````````````````````````````````````````````````````````` > > From 51f34b1e650fc5843530266cea4341750bd1ae37 Mon Sep 17 00:00:00 2001 > > > > From: Sean Christopherson <seanjc@google.com> > > > > Date: Wed, 6 Aug 2025 12:56:39 -0700 > > > > Subject: KVM: x86/pmu: Snapshot host (i.e. perf's) reported PMU capabilities > > > > Take a snapshot of the unadulterated PMU capabilities provided by perf so > > that KVM can compare guest vPMU capabilities against hardware capabilities > > when determining whether or not to intercept PMU MSRs (and RDPMC). > > ````````````````````````````````````````````````````````````````````````````````````````````````````````` > > > > We also verified that if we revert the patch the issue is not seen. > > > > Could you please check why the patch causes this regression and provide > > a fix if necessary? > > Hi Chaitanya, > > I suppose you found this warning on a hybrid client platform, right? It > looks the warning is triggered by the below WARN_ON_ONCE() in > perf_get_x86_pmu_capability() function. > > if (WARN_ON_ONCE(cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)) || > !x86_pmu_initialized()) { > memset(cap, 0, sizeof(*cap)); > return; > } > > The below change should fix it (just building, not test it). I would run a > full scope vPMU test after I come back from China national day's holiday. I have access to a hybrid system, I'll also double check there (though I'm 99.9% certain you've got it right). > Thanks. 
>
> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index cebce7094de8..6d87c25226d8 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -108,8 +108,6 @@ void kvm_init_pmu_capability(struct kvm_pmu_ops *pmu_ops)
>  	bool is_intel = boot_cpu_data.x86_vendor == X86_VENDOR_INTEL;
>  	int min_nr_gp_ctrs = pmu_ops->MIN_NR_GP_COUNTERS;
>  
> -	perf_get_x86_pmu_capability(&kvm_host_pmu);
> -
>  	/*
>  	 * Hybrid PMUs don't play nice with virtualization without careful
>  	 * configuration by userspace, and KVM's APIs for reporting supported
> @@ -120,6 +118,8 @@ void kvm_init_pmu_capability(struct kvm_pmu_ops *pmu_ops)
>  		enable_pmu = false;
>  
>  	if (enable_pmu) {
> +		perf_get_x86_pmu_capability(&kvm_host_pmu);
> +
>  		/*
>  		 * WARN if perf did NOT disable hardware PMU if the number of
>  		 * architecturally required GP counters aren't present, i.e. if

If we go this route, then the !enable_pmu path should explicitly zero kvm_host_pmu
so that the behavior is the same whether userspace loads kvm.ko with enable_pmu=0
or enable_pmu is cleared because of lack of support.

	if (!enable_pmu) {
		memset(&kvm_host_pmu, 0, sizeof(kvm_host_pmu));
		memset(&kvm_pmu_cap, 0, sizeof(kvm_pmu_cap));
		return;
	}

The alternative would be to keep kvm_host_pmu valid at all times for !HYBRID, which
is what I intended with the bad patch, but that too would lead to inconsistent
behavior. So I think it makes sense to go with Dapeng's approach; we can always
revisit this if some future thing in KVM _needs_ kvm_host_pmu even with enable_pmu=0.

	if (cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)) {
		enable_pmu = false;
		memset(&kvm_host_pmu, 0, sizeof(kvm_host_pmu));
	} else {
		perf_get_x86_pmu_capability(&kvm_host_pmu);
	}

^ permalink raw reply [flat|nested] 9+ messages in thread
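The two options weighed in the message above differ mainly in what kvm_host_pmu holds on a non-hybrid host when the vPMU is disabled. The userspace sketch below tries to make that visible. It is a simplified model under stated assumptions: `struct pmu_cap`, `host_pmu`, `guest_cap`, `mock_perf_get_cap()`, and both `init_variant_*()` functions are hypothetical names, the capability values are invented, and copying the host snapshot into the guest view stands in for the rest of the real initialization, which this thread does not show.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct x86_pmu_capability. */
struct pmu_cap {
	int version;
	int num_counters_gp;
};

/* Illustrative globals loosely mirroring kvm_host_pmu and kvm_pmu_cap. */
static struct pmu_cap host_pmu;
static struct pmu_cap guest_cap;

/* Mock perf query for a non-hybrid host; the values are made up. */
static void mock_perf_get_cap(struct pmu_cap *cap)
{
	cap->version = 2;
	cap->num_counters_gp = 4;
}

/* Variant A: snapshot only when the vPMU is enabled, zero both otherwise. */
static void init_variant_a(bool hybrid, bool enable_pmu)
{
	if (hybrid)
		enable_pmu = false;

	if (!enable_pmu) {
		memset(&host_pmu, 0, sizeof(host_pmu));
		memset(&guest_cap, 0, sizeof(guest_cap));
		return;
	}

	mock_perf_get_cap(&host_pmu);
	guest_cap = host_pmu;	/* assumption: guest view starts from the host snapshot */
}

/* Variant B: snapshot on every non-hybrid host, even with enable_pmu=0. */
static void init_variant_b(bool hybrid, bool enable_pmu)
{
	if (hybrid) {
		enable_pmu = false;
		memset(&host_pmu, 0, sizeof(host_pmu));
	} else {
		mock_perf_get_cap(&host_pmu);
	}

	if (!enable_pmu) {
		/* Assumption: the guest view is still cleared on this path. */
		memset(&guest_cap, 0, sizeof(guest_cap));
		return;
	}

	guest_cap = host_pmu;
}
```

With `hybrid` false and `enable_pmu` false, variant A leaves `host_pmu` zeroed while variant B leaves it populated; on a hybrid host both variants zero it and neither calls the perf helper.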
* Re: REGRESSION on linux-next (next-20250919) 2025-09-30 15:09 ` Sean Christopherson @ 2025-10-06 8:03 ` Borah, Chaitanya Kumar 2025-10-07 6:22 ` Borah, Chaitanya Kumar 2025-10-06 8:27 ` Mi, Dapeng 1 sibling, 1 reply; 9+ messages in thread From: Borah, Chaitanya Kumar @ 2025-10-06 8:03 UTC (permalink / raw) To: Sean Christopherson, Dapeng Mi Cc: intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org, Suresh Kumar Kurmi, Jani Saarinen, lucas.demarchi, linux-perf-users, kvm On 9/30/2025 8:39 PM, Sean Christopherson wrote: > On Tue, Sep 30, 2025, Dapeng Mi wrote: >> >> On 9/30/2025 1:30 PM, Borah, Chaitanya Kumar wrote: >>> Hello Sean, >>> >>> Hope you are doing well. I am Chaitanya from the linux graphics team in >>> Intel. >>> >>> This mail is regarding a regression we are seeing in our CI runs[1] on >>> linux-next repository. >>> >>> Since the version next-20250919 [2], we are seeing the following regression >>> >>> ````````````````````````````````````````````````````````````````````````````````` >>> <4>[ 10.973827] ------------[ cut here ]------------ >>> <4>[ 10.973841] WARNING: arch/x86/events/core.c:3089 at >>> perf_get_x86_pmu_capability+0xd/0xc0, CPU#15: (udev-worker)/386 >>> ... >>> <4>[ 10.974028] Call Trace: >>> <4>[ 10.974030] <TASK> >>> <4>[ 10.974033] ? kvm_init_pmu_capability+0x2b/0x190 [kvm] >>> <4>[ 10.974154] kvm_x86_vendor_init+0x1b0/0x1a40 [kvm] >>> <4>[ 10.974248] vmx_init+0xdb/0x260 [kvm_intel] >>> <4>[ 10.974278] ? __pfx_vt_init+0x10/0x10 [kvm_intel] >>> <4>[ 10.974296] vt_init+0x12/0x9d0 [kvm_intel] >>> <4>[ 10.974309] ? __pfx_vt_init+0x10/0x10 [kvm_intel] >>> <4>[ 10.974322] do_one_initcall+0x60/0x3f0 >>> <4>[ 10.974335] do_init_module+0x97/0x2b0 >>> <4>[ 10.974345] load_module+0x2d08/0x2e30 >>> <4>[ 10.974349] ? __kernel_read+0x158/0x2f0 >>> <4>[ 10.974370] ? kernel_read_file+0x2b1/0x320 >>> <4>[ 10.974381] init_module_from_file+0x96/0xe0 >>> <4>[ 10.974384] ? 
init_module_from_file+0x96/0xe0 >>> <4>[ 10.974399] idempotent_init_module+0x117/0x330 >>> <4>[ 10.974415] __x64_sys_finit_module+0x73/0xe0 >>> ... >>> ````````````````````````````````````````````````````````````````````````````````` >>> Details log can be found in [3]. >>> >>> After bisecting the tree, the following patch [4] seems to be the first >>> "bad" commit >>> >>> ````````````````````````````````````````````````````````````````````````````````````````````````````````` >>> From 51f34b1e650fc5843530266cea4341750bd1ae37 Mon Sep 17 00:00:00 2001 >>> >>> From: Sean Christopherson <seanjc@google.com> >>> >>> Date: Wed, 6 Aug 2025 12:56:39 -0700 >>> >>> Subject: KVM: x86/pmu: Snapshot host (i.e. perf's) reported PMU capabilities >>> >>> Take a snapshot of the unadulterated PMU capabilities provided by perf so >>> that KVM can compare guest vPMU capabilities against hardware capabilities >>> when determining whether or not to intercept PMU MSRs (and RDPMC). >>> ````````````````````````````````````````````````````````````````````````````````````````````````````````` >>> >>> We also verified that if we revert the patch the issue is not seen. >>> >>> Could you please check why the patch causes this regression and provide >>> a fix if necessary? >> >> Hi Chaitanya, >> >> I suppose you found this warning on a hybrid client platform, right? It >> looks the warning is triggered by the below WARN_ON_ONCE() in >> perf_get_x86_pmu_capability() function. >> >> if (WARN_ON_ONCE(cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)) || >> !x86_pmu_initialized()) { >> memset(cap, 0, sizeof(*cap)); >> return; >> } >> >> The below change should fix it (just building, not test it). I would run a >> full scope vPMU test after I come back from China national day's holiday. > > I have access to a hybrid system, I'll also double check there (though I'm 99.9% > certain you've got it right). > >> Thanks. 
>> >> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c >> index cebce7094de8..6d87c25226d8 100644 >> --- a/arch/x86/kvm/pmu.c >> +++ b/arch/x86/kvm/pmu.c >> @@ -108,8 +108,6 @@ void kvm_init_pmu_capability(struct kvm_pmu_ops *pmu_ops) >> bool is_intel = boot_cpu_data.x86_vendor == X86_VENDOR_INTEL; >> int min_nr_gp_ctrs = pmu_ops->MIN_NR_GP_COUNTERS; >> >> - perf_get_x86_pmu_capability(&kvm_host_pmu); >> - >> /* >> * Hybrid PMUs don't play nice with virtualization without careful >> * configuration by userspace, and KVM's APIs for reporting supported >> @@ -120,6 +118,8 @@ void kvm_init_pmu_capability(struct kvm_pmu_ops *pmu_ops) >> enable_pmu = false; >> >> if (enable_pmu) { >> + perf_get_x86_pmu_capability(&kvm_host_pmu); >> + >> /* >> * WARN if perf did NOT disable hardware PMU if the number of >> * architecturally required GP counters aren't present, i.e. if > > If we go this route, then the !enable_pmu path should explicitly zero kvm_host_pmu > so that the behavior is consistent userspace loads kvm.ko with enable_pmu=0, > versus enable_pmu being cleared because of lack of support. > > if (!enable_pmu) { > memset(&kvm_host_pmu, 0, sizeof(kvm_host_pmu)); > memset(&kvm_pmu_cap, 0, sizeof(kvm_pmu_cap)); > return; > } > > The alternative would be keep kvm_host_pmu valid at all times for !HYBRID, which > is what I intended with the bad patch, but that too would lead to inconsistent > behavior. So I think it makes sense to go with Dapeng's approach; we can always > revisit this if some future thing in KVM _needs_ kvm_host_pmu even with enable_pmu=0. > > if (cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)) { > enable_pmu = false; > memset(&kvm_host_pmu, 0, sizeof(kvm_host_pmu)); > } else { > perf_get_x86_pmu_capability(&kvm_host_pmu); > } Thank you for your responses. Following change fixes the issue for us. 
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 40ac4cb44ed2..487ad19a236e 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -108,16 +108,18 @@ void kvm_init_pmu_capability(const struct kvm_pmu_ops *pmu_ops)
 	bool is_intel = boot_cpu_data.x86_vendor == X86_VENDOR_INTEL;
 	int min_nr_gp_ctrs = pmu_ops->MIN_NR_GP_COUNTERS;
 
-	perf_get_x86_pmu_capability(&kvm_host_pmu);
-
 	/*
 	 * Hybrid PMUs don't play nice with virtualization without careful
 	 * configuration by userspace, and KVM's APIs for reporting supported
 	 * vPMU features do not account for hybrid PMUs. Disable vPMU support
 	 * for hybrid PMUs until KVM gains a way to let userspace opt-in.
 	 */
-	if (cpu_feature_enabled(X86_FEATURE_HYBRID_CPU))
+	if (cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)) {
 		enable_pmu = false;
+		memset(&kvm_host_pmu, 0, sizeof(kvm_host_pmu));
+	} else {
+		perf_get_x86_pmu_capability(&kvm_host_pmu);
+	}

Regards

Chaitanya

^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: REGRESSION on linux-next (next-20250919) 2025-10-06 8:03 ` Borah, Chaitanya Kumar @ 2025-10-07 6:22 ` Borah, Chaitanya Kumar 2025-10-09 1:34 ` Mi, Dapeng 0 siblings, 1 reply; 9+ messages in thread From: Borah, Chaitanya Kumar @ 2025-10-07 6:22 UTC (permalink / raw) To: Sean Christopherson, Dapeng Mi Cc: intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org, Suresh Kumar Kurmi, Jani Saarinen, lucas.demarchi, linux-perf-users, kvm Hi, On 10/6/2025 1:33 PM, Borah, Chaitanya Kumar wrote: > Thank you for your responses. > > Following change fixes the issue for us. > > > diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c > index 40ac4cb44ed2..487ad19a236e 100644 > --- a/arch/x86/kvm/pmu.c > +++ b/arch/x86/kvm/pmu.c > @@ -108,16 +108,18 @@ void kvm_init_pmu_capability(const struct > kvm_pmu_ops *pmu_ops) > bool is_intel = boot_cpu_data.x86_vendor == X86_VENDOR_INTEL; > int min_nr_gp_ctrs = pmu_ops->MIN_NR_GP_COUNTERS; > > - perf_get_x86_pmu_capability(&kvm_host_pmu); > - > /* > * Hybrid PMUs don't play nice with virtualization without careful > * configuration by userspace, and KVM's APIs for reporting > supported > * vPMU features do not account for hybrid PMUs. Disable vPMU > support > * for hybrid PMUs until KVM gains a way to let userspace opt-in. > */ > - if (cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)) > + if (cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)) { > enable_pmu = false; > + memset(&kvm_host_pmu, 0, sizeof(kvm_host_pmu)); > + } else { > + perf_get_x86_pmu_capability(&kvm_host_pmu); > + } Can we expect a formal patch soon? Regards Chaitanya ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: REGRESSION on linux-next (next-20250919) 2025-10-07 6:22 ` Borah, Chaitanya Kumar @ 2025-10-09 1:34 ` Mi, Dapeng 2025-10-09 12:58 ` Sean Christopherson 0 siblings, 1 reply; 9+ messages in thread From: Mi, Dapeng @ 2025-10-09 1:34 UTC (permalink / raw) To: Borah, Chaitanya Kumar, Sean Christopherson Cc: intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org, Suresh Kumar Kurmi, Jani Saarinen, lucas.demarchi, linux-perf-users, kvm On 10/7/2025 2:22 PM, Borah, Chaitanya Kumar wrote: > Hi, > > On 10/6/2025 1:33 PM, Borah, Chaitanya Kumar wrote: >> Thank you for your responses. >> >> Following change fixes the issue for us. >> >> >> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c >> index 40ac4cb44ed2..487ad19a236e 100644 >> --- a/arch/x86/kvm/pmu.c >> +++ b/arch/x86/kvm/pmu.c >> @@ -108,16 +108,18 @@ void kvm_init_pmu_capability(const struct >> kvm_pmu_ops *pmu_ops) >> bool is_intel = boot_cpu_data.x86_vendor == X86_VENDOR_INTEL; >> int min_nr_gp_ctrs = pmu_ops->MIN_NR_GP_COUNTERS; >> >> - perf_get_x86_pmu_capability(&kvm_host_pmu); >> - >> /* >> * Hybrid PMUs don't play nice with virtualization without careful >> * configuration by userspace, and KVM's APIs for reporting >> supported >> * vPMU features do not account for hybrid PMUs. Disable vPMU >> support >> * for hybrid PMUs until KVM gains a way to let userspace opt-in. >> */ >> - if (cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)) >> + if (cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)) { >> enable_pmu = false; >> + memset(&kvm_host_pmu, 0, sizeof(kvm_host_pmu)); >> + } else { >> + perf_get_x86_pmu_capability(&kvm_host_pmu); >> + } > Can we expect a formal patch soon? I'd like to post a patch to fix this tomorrow if Sean has no bandwidth on this. Thanks. > > Regards > > Chaitanya ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: REGRESSION on linux-next (next-20250919) 2025-10-09 1:34 ` Mi, Dapeng @ 2025-10-09 12:58 ` Sean Christopherson 2025-10-10 0:47 ` Mi, Dapeng 0 siblings, 1 reply; 9+ messages in thread From: Sean Christopherson @ 2025-10-09 12:58 UTC (permalink / raw) To: Dapeng Mi Cc: Chaitanya Kumar Borah, intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org, Suresh Kumar Kurmi, Jani Saarinen, lucas.demarchi, linux-perf-users, kvm On Thu, Oct 09, 2025, Dapeng Mi wrote: > > On 10/7/2025 2:22 PM, Borah, Chaitanya Kumar wrote: > > Hi, > > > > On 10/6/2025 1:33 PM, Borah, Chaitanya Kumar wrote: > >> Thank you for your responses. > >> > >> Following change fixes the issue for us. > >> > >> > >> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c > >> index 40ac4cb44ed2..487ad19a236e 100644 > >> --- a/arch/x86/kvm/pmu.c > >> +++ b/arch/x86/kvm/pmu.c > >> @@ -108,16 +108,18 @@ void kvm_init_pmu_capability(const struct > >> kvm_pmu_ops *pmu_ops) > >> bool is_intel = boot_cpu_data.x86_vendor == X86_VENDOR_INTEL; > >> int min_nr_gp_ctrs = pmu_ops->MIN_NR_GP_COUNTERS; > >> > >> - perf_get_x86_pmu_capability(&kvm_host_pmu); > >> - > >> /* > >> * Hybrid PMUs don't play nice with virtualization without careful > >> * configuration by userspace, and KVM's APIs for reporting > >> supported > >> * vPMU features do not account for hybrid PMUs. Disable vPMU > >> support > >> * for hybrid PMUs until KVM gains a way to let userspace opt-in. > >> */ > >> - if (cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)) > >> + if (cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)) { > >> enable_pmu = false; > >> + memset(&kvm_host_pmu, 0, sizeof(kvm_host_pmu)); > >> + } else { > >> + perf_get_x86_pmu_capability(&kvm_host_pmu); > >> + } > > Can we expect a formal patch soon? > > I'd like to post a patch to fix this tomorrow if Sean has no bandwidth on > this. Thanks. Sorry, my bad, I was waiting for you to post a patch, but that wasn't at all clear. 
So yeah, go ahead and post one :-) ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: REGRESSION on linux-next (next-20250919) 2025-10-09 12:58 ` Sean Christopherson @ 2025-10-10 0:47 ` Mi, Dapeng 0 siblings, 0 replies; 9+ messages in thread From: Mi, Dapeng @ 2025-10-10 0:47 UTC (permalink / raw) To: Sean Christopherson Cc: Chaitanya Kumar Borah, intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org, Suresh Kumar Kurmi, Jani Saarinen, lucas.demarchi, linux-perf-users, kvm On 10/9/2025 8:58 PM, Sean Christopherson wrote: > On Thu, Oct 09, 2025, Dapeng Mi wrote: >> On 10/7/2025 2:22 PM, Borah, Chaitanya Kumar wrote: >>> Hi, >>> >>> On 10/6/2025 1:33 PM, Borah, Chaitanya Kumar wrote: >>>> Thank you for your responses. >>>> >>>> Following change fixes the issue for us. >>>> >>>> >>>> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c >>>> index 40ac4cb44ed2..487ad19a236e 100644 >>>> --- a/arch/x86/kvm/pmu.c >>>> +++ b/arch/x86/kvm/pmu.c >>>> @@ -108,16 +108,18 @@ void kvm_init_pmu_capability(const struct >>>> kvm_pmu_ops *pmu_ops) >>>> bool is_intel = boot_cpu_data.x86_vendor == X86_VENDOR_INTEL; >>>> int min_nr_gp_ctrs = pmu_ops->MIN_NR_GP_COUNTERS; >>>> >>>> - perf_get_x86_pmu_capability(&kvm_host_pmu); >>>> - >>>> /* >>>> * Hybrid PMUs don't play nice with virtualization without careful >>>> * configuration by userspace, and KVM's APIs for reporting >>>> supported >>>> * vPMU features do not account for hybrid PMUs. Disable vPMU >>>> support >>>> * for hybrid PMUs until KVM gains a way to let userspace opt-in. >>>> */ >>>> - if (cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)) >>>> + if (cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)) { >>>> enable_pmu = false; >>>> + memset(&kvm_host_pmu, 0, sizeof(kvm_host_pmu)); >>>> + } else { >>>> + perf_get_x86_pmu_capability(&kvm_host_pmu); >>>> + } >>> Can we expect a formal patch soon? >> I'd like to post a patch to fix this tomorrow if Sean has no bandwidth on >> this. Thanks. > Sorry, my bad, I was waiting for you to post a patch, but that wasn't at all > clear. 
> So yeah, go ahead and post one :-)

Sure, I will post it now.

^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: REGRESSION on linux-next (next-20250919) 2025-09-30 15:09 ` Sean Christopherson 2025-10-06 8:03 ` Borah, Chaitanya Kumar @ 2025-10-06 8:27 ` Mi, Dapeng 1 sibling, 0 replies; 9+ messages in thread From: Mi, Dapeng @ 2025-10-06 8:27 UTC (permalink / raw) To: Sean Christopherson Cc: Chaitanya Kumar Borah, intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org, Suresh Kumar Kurmi, Jani Saarinen, lucas.demarchi, linux-perf-users, kvm On 9/30/2025 11:09 PM, Sean Christopherson wrote: > On Tue, Sep 30, 2025, Dapeng Mi wrote: >> On 9/30/2025 1:30 PM, Borah, Chaitanya Kumar wrote: >>> Hello Sean, >>> >>> Hope you are doing well. I am Chaitanya from the linux graphics team in >>> Intel. >>> >>> This mail is regarding a regression we are seeing in our CI runs[1] on >>> linux-next repository. >>> >>> Since the version next-20250919 [2], we are seeing the following regression >>> >>> ````````````````````````````````````````````````````````````````````````````````` >>> <4>[ 10.973827] ------------[ cut here ]------------ >>> <4>[ 10.973841] WARNING: arch/x86/events/core.c:3089 at >>> perf_get_x86_pmu_capability+0xd/0xc0, CPU#15: (udev-worker)/386 >>> ... >>> <4>[ 10.974028] Call Trace: >>> <4>[ 10.974030] <TASK> >>> <4>[ 10.974033] ? kvm_init_pmu_capability+0x2b/0x190 [kvm] >>> <4>[ 10.974154] kvm_x86_vendor_init+0x1b0/0x1a40 [kvm] >>> <4>[ 10.974248] vmx_init+0xdb/0x260 [kvm_intel] >>> <4>[ 10.974278] ? __pfx_vt_init+0x10/0x10 [kvm_intel] >>> <4>[ 10.974296] vt_init+0x12/0x9d0 [kvm_intel] >>> <4>[ 10.974309] ? __pfx_vt_init+0x10/0x10 [kvm_intel] >>> <4>[ 10.974322] do_one_initcall+0x60/0x3f0 >>> <4>[ 10.974335] do_init_module+0x97/0x2b0 >>> <4>[ 10.974345] load_module+0x2d08/0x2e30 >>> <4>[ 10.974349] ? __kernel_read+0x158/0x2f0 >>> <4>[ 10.974370] ? kernel_read_file+0x2b1/0x320 >>> <4>[ 10.974381] init_module_from_file+0x96/0xe0 >>> <4>[ 10.974384] ? 
init_module_from_file+0x96/0xe0 >>> <4>[ 10.974399] idempotent_init_module+0x117/0x330 >>> <4>[ 10.974415] __x64_sys_finit_module+0x73/0xe0 >>> ... >>> ````````````````````````````````````````````````````````````````````````````````` >>> Details log can be found in [3]. >>> >>> After bisecting the tree, the following patch [4] seems to be the first >>> "bad" commit >>> >>> ````````````````````````````````````````````````````````````````````````````````````````````````````````` >>> From 51f34b1e650fc5843530266cea4341750bd1ae37 Mon Sep 17 00:00:00 2001 >>> >>> From: Sean Christopherson <seanjc@google.com> >>> >>> Date: Wed, 6 Aug 2025 12:56:39 -0700 >>> >>> Subject: KVM: x86/pmu: Snapshot host (i.e. perf's) reported PMU capabilities >>> >>> Take a snapshot of the unadulterated PMU capabilities provided by perf so >>> that KVM can compare guest vPMU capabilities against hardware capabilities >>> when determining whether or not to intercept PMU MSRs (and RDPMC). >>> ````````````````````````````````````````````````````````````````````````````````````````````````````````` >>> >>> We also verified that if we revert the patch the issue is not seen. >>> >>> Could you please check why the patch causes this regression and provide >>> a fix if necessary? >> Hi Chaitanya, >> >> I suppose you found this warning on a hybrid client platform, right? It >> looks the warning is triggered by the below WARN_ON_ONCE() in >> perf_get_x86_pmu_capability() function. >> >> if (WARN_ON_ONCE(cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)) || >> !x86_pmu_initialized()) { >> memset(cap, 0, sizeof(*cap)); >> return; >> } >> >> The below change should fix it (just building, not test it). I would run a >> full scope vPMU test after I come back from China national day's holiday. > I have access to a hybrid system, I'll also double check there (though I'm 99.9% > certain you've got it right). > >> Thanks. 
>> >> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c >> index cebce7094de8..6d87c25226d8 100644 >> --- a/arch/x86/kvm/pmu.c >> +++ b/arch/x86/kvm/pmu.c >> @@ -108,8 +108,6 @@ void kvm_init_pmu_capability(struct kvm_pmu_ops *pmu_ops) >> bool is_intel = boot_cpu_data.x86_vendor == X86_VENDOR_INTEL; >> int min_nr_gp_ctrs = pmu_ops->MIN_NR_GP_COUNTERS; >> >> - perf_get_x86_pmu_capability(&kvm_host_pmu); >> - >> /* >> * Hybrid PMUs don't play nice with virtualization without careful >> * configuration by userspace, and KVM's APIs for reporting supported >> @@ -120,6 +118,8 @@ void kvm_init_pmu_capability(struct kvm_pmu_ops *pmu_ops) >> enable_pmu = false; >> >> if (enable_pmu) { >> + perf_get_x86_pmu_capability(&kvm_host_pmu); >> + >> /* >> * WARN if perf did NOT disable hardware PMU if the number of >> * architecturally required GP counters aren't present, i.e. if > If we go this route, then the !enable_pmu path should explicitly zero kvm_host_pmu > so that the behavior is consistent userspace loads kvm.ko with enable_pmu=0, > versus enable_pmu being cleared because of lack of support. > > if (!enable_pmu) { > memset(&kvm_host_pmu, 0, sizeof(kvm_host_pmu)); > memset(&kvm_pmu_cap, 0, sizeof(kvm_pmu_cap)); > return; > } > > The alternative would be keep kvm_host_pmu valid at all times for !HYBRID, which > is what I intended with the bad patch, but that too would lead to inconsistent > behavior. So I think it makes sense to go with Dapeng's approach; we can always > revisit this if some future thing in KVM _needs_ kvm_host_pmu even with enable_pmu=0. > > if (cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)) { > enable_pmu = false; > memset(&kvm_host_pmu, 0, sizeof(kvm_host_pmu)); > } else { > perf_get_x86_pmu_capability(&kvm_host_pmu); > } Yeah, it looks better. We should decouple "enable_pmu" and "kvm_host_pmu" as the initial design. Thanks. > ^ permalink raw reply [flat|nested] 9+ messages in thread