* Re: [PATCH v3 1/9] perf/x86/intel: Ensure guest PEBS path doesn't set unwanted PERF_GLOBAL_CTRL bits
  [not found] ` <20260508234018.F06AEC2BCB0@smtp.kernel.org>
@ 2026-05-12 11:30   ` Mi, Dapeng
  2026-05-15  0:01     ` Sean Christopherson
  0 siblings, 1 reply; 3+ messages in thread
From: Mi, Dapeng @ 2026-05-12 11:30 UTC (permalink / raw)
To: sashiko, Sean Christopherson; +Cc: linux-perf-users

On 5/9/2026 7:40 AM, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
> - [High] The host PEBS collision check fails to mask non-counter bits, incorrectly disabling guest PEBS for events like Store Latency.
> --
>
> Note: The format of this report has been altered due to recitation
> restrictions. The original patch code is not quoted directly, and findings are
> provided as a free-form summary.
>
> commit 39adc53b360e88595a3be6fc59b18b048d9e9921
> Author: Sean Christopherson <seanjc@google.com>
>
> perf/x86/intel: Ensure guest PEBS path doesn't set unwanted PERF_GLOBAL_CTRL bits
>
> This commit masks the pebs counter values being reinstated into the global
> control msr for a guest with perf's desired value. This is done to ensure the
> hypervisor doesn't unintentionally enable counters.
>
> In intel_guest_get_msrs(), the host pebs collision check evaluates the host
> field of the pebs_enable array entry to determine if the host is actively
> using pebs. If it is, guest pebs is disabled.
>
> Is it possible for this check to falsely detect a host collision due to
> non-counter bits being set?
>
> When a guest configures a store latency pebs event, a backing perf event is
> created. In intel_pmu_pebs_enable(), this sets both the counter index bit
> and bit 63 in cpuc->pebs_enabled.
>
> Because the host field is calculated as cpuc->pebs_enabled masked against
> the inverse of cpuc->intel_ctrl_guest_mask, and the guest mask exclusively
> tracks valid counter indices (never including bit 63), bit 63 remains set in
> the host field.
>
> Consequently, the check for whether the host is actively using pebs will
> evaluate to true even when only the guest is using pebs. This introduces a
> regression where guest pebs is completely disabled, silently breaking guest
> store latency events.
>
> Should the host pebs usage evaluation also be masked by intel_ctrl to prevent
> non-counter bits from triggering a false collision?

Hmm, I doubt this issue could happen on real hardware.

intel_pmu_pebs_enable() does indeed set extra bits in cpuc->pebs_enabled
for load latency and specific store events, as the code below shows.

```
......

if ((event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT) && (x86_pmu.version < 5))
	cpuc->pebs_enabled |= 1ULL << (hwc->idx + 32);
else if (event->hw.flags & PERF_X86_EVENT_PEBS_ST)
	cpuc->pebs_enabled |= 1ULL << 63;

......
```

But these 2 cases should only be hit on quite old platforms (prior to
Icelake). On those platforms, only GP counters support PEBS sampling and
pebs_capable would be set to PEBS_COUNTER_MASK, so these extra bits would
be filtered out by pebs_capable and pebs_mask won't actually contain them.

Anyway, we could optimize the code further as below and filter these extra
bits away thoroughly.
(build-tested only, not tested on real HW)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 854881b5e696..6eee7636a822 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4997,7 +4997,7 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr,
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct perf_guest_switch_msr *arr = cpuc->guest_switch_msrs;
 	u64 intel_ctrl = hybrid(cpuc->pmu, intel_ctrl);
-	u64 pebs_mask = cpuc->pebs_enabled & x86_pmu.pebs_capable;
+	u64 pebs_mask = cpuc->pebs_enabled & x86_pmu.pebs_capable & intel_ctrl;
 	u64 guest_pebs_mask;
 	int global_ctrl;

@@ -5049,7 +5049,7 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr,
 	 * the guest wants to use for PEBS, (c) are not excluded from counting
 	 * in the guest, and (d) _are_ excluded from counting in the host.
 	 */
-	guest_pebs_mask = pebs_mask & intel_ctrl & guest_pebs->enable &
+	guest_pebs_mask = pebs_mask & guest_pebs->enable &
 			  ~cpuc->intel_ctrl_exclude_guest_mask &
 			  cpuc->intel_ctrl_exclude_host_mask;

^ permalink raw reply related	[flat|nested] 3+ messages in thread
* Re: [PATCH v3 1/9] perf/x86/intel: Ensure guest PEBS path doesn't set unwanted PERF_GLOBAL_CTRL bits
  2026-05-12 11:30   ` [PATCH v3 1/9] perf/x86/intel: Ensure guest PEBS path doesn't set unwanted PERF_GLOBAL_CTRL bits Mi, Dapeng
@ 2026-05-15  0:01     ` Sean Christopherson
  2026-05-15  1:49       ` Mi, Dapeng
  0 siblings, 1 reply; 3+ messages in thread
From: Sean Christopherson @ 2026-05-15  0:01 UTC (permalink / raw)
To: Dapeng Mi; +Cc: sashiko, linux-perf-users

On Tue, May 12, 2026, Dapeng Mi wrote:
>
> On 5/9/2026 7:40 AM, sashiko-bot@kernel.org wrote:
> > Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
> > - [High] The host PEBS collision check fails to mask non-counter bits, incorrectly disabling guest PEBS for events like Store Latency.
> > --
> >
> > Note: The format of this report has been altered due to recitation
> > restrictions. The original patch code is not quoted directly, and findings are
> > provided as a free-form summary.
> >
> > commit 39adc53b360e88595a3be6fc59b18b048d9e9921
> > Author: Sean Christopherson <seanjc@google.com>
> >
> > perf/x86/intel: Ensure guest PEBS path doesn't set unwanted PERF_GLOBAL_CTRL bits
> >
> > This commit masks the pebs counter values being reinstated into the global
> > control msr for a guest with perf's desired value. This is done to ensure the
> > hypervisor doesn't unintentionally enable counters.
> >
> > In intel_guest_get_msrs(), the host pebs collision check evaluates the host
> > field of the pebs_enable array entry to determine if the host is actively
> > using pebs. If it is, guest pebs is disabled.
> >
> > Is it possible for this check to falsely detect a host collision due to
> > non-counter bits being set?
> >
> > When a guest configures a store latency pebs event, a backing perf event is
> > created. In intel_pmu_pebs_enable(), this sets both the counter index bit
> > and bit 63 in cpuc->pebs_enabled.
> >
> > Because the host field is calculated as cpuc->pebs_enabled masked against
> > the inverse of cpuc->intel_ctrl_guest_mask, and the guest mask exclusively
> > tracks valid counter indices (never including bit 63), bit 63 remains set in
> > the host field.
> >
> > Consequently, the check for whether the host is actively using pebs will
> > evaluate to true even when only the guest is using pebs. This introduces a
> > regression where guest pebs is completely disabled, silently breaking guest
> > store latency events.
> >
> > Should the host pebs usage evaluation also be masked by intel_ctrl to prevent
> > non-counter bits from triggering a false collision?
>
> Hmm, I doubt this issue could happen on real hardware.

OMG, it took me _so_ long to understand what Sashiko was complaining about.  I'm
pretty sure I read Sashiko's response like four times, and it didn't click until
I read your response.

In case anyone is feeling as dense as me, Sashiko is saying this:

	/*
	 * FIXME: Allow guest and host usage of PEBS events to co-exist instead
	 * of disabling guest PEBS entirely if the host is using PEBS.
	 *        What exactly goes wrong if guest and host are using PEBS is
	 *        unknown.
	 */
	if (pebs_mask & ~cpuc->intel_ctrl_exclude_host_mask)
		guest_pebs_mask = 0;

should probably be masked by intel_ctrl as well.

> intel_pmu_pebs_enable() does indeed set extra bits in cpuc->pebs_enabled
> for load latency and specific store events, as the code below shows.
>
> ```
>
> ......
>
> if ((event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT) && (x86_pmu.version < 5))
> 	cpuc->pebs_enabled |= 1ULL << (hwc->idx + 32);
> else if (event->hw.flags & PERF_X86_EVENT_PEBS_ST)
> 	cpuc->pebs_enabled |= 1ULL << 63;
>
> ......
>
> ```
>
> But these 2 cases should only be hit on quite old platforms (prior to
> Icelake). On those platforms, only GP counters support PEBS sampling and
> pebs_capable would be set to PEBS_COUNTER_MASK, so these extra bits would
> be filtered out by pebs_capable and pebs_mask won't actually contain them.

Thanks for digging into this!  I was staring and just going "huh!?" over and
over in my head :-)

> Anyway, we could optimize the code further as below and filter these extra
> bits away thoroughly. (build-tested only, not tested on real HW)

Hmm, I think I'd rather figure out what it would take to drop the FIXME entirely.
And if we need to keep the check, I'm a-ok risking false positives until we have
a better understanding of why the check exists.

> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 854881b5e696..6eee7636a822 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -4997,7 +4997,7 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr,
>  	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>  	struct perf_guest_switch_msr *arr = cpuc->guest_switch_msrs;
>  	u64 intel_ctrl = hybrid(cpuc->pmu, intel_ctrl);
> -	u64 pebs_mask = cpuc->pebs_enabled & x86_pmu.pebs_capable;
> +	u64 pebs_mask = cpuc->pebs_enabled & x86_pmu.pebs_capable & intel_ctrl;
>  	u64 guest_pebs_mask;
>  	int global_ctrl;
>
> @@ -5049,7 +5049,7 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr,
>  	 * the guest wants to use for PEBS, (c) are not excluded from counting
>  	 * in the guest, and (d) _are_ excluded from counting in the host.
>  	 */
> -	guest_pebs_mask = pebs_mask & intel_ctrl & guest_pebs->enable &
> +	guest_pebs_mask = pebs_mask & guest_pebs->enable &
> 			  ~cpuc->intel_ctrl_exclude_guest_mask &
> 			  cpuc->intel_ctrl_exclude_host_mask;

^ permalink raw reply	[flat|nested] 3+ messages in thread
* Re: [PATCH v3 1/9] perf/x86/intel: Ensure guest PEBS path doesn't set unwanted PERF_GLOBAL_CTRL bits
  2026-05-15  0:01     ` Sean Christopherson
@ 2026-05-15  1:49       ` Mi, Dapeng
  0 siblings, 0 replies; 3+ messages in thread
From: Mi, Dapeng @ 2026-05-15  1:49 UTC (permalink / raw)
To: Sean Christopherson; +Cc: sashiko, linux-perf-users

On 5/15/2026 8:01 AM, Sean Christopherson wrote:
> On Tue, May 12, 2026, Dapeng Mi wrote:
>> On 5/9/2026 7:40 AM, sashiko-bot@kernel.org wrote:
>>> Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
>>> - [High] The host PEBS collision check fails to mask non-counter bits, incorrectly disabling guest PEBS for events like Store Latency.
>>> --
>>>
>>> Note: The format of this report has been altered due to recitation
>>> restrictions. The original patch code is not quoted directly, and findings are
>>> provided as a free-form summary.
>>>
>>> commit 39adc53b360e88595a3be6fc59b18b048d9e9921
>>> Author: Sean Christopherson <seanjc@google.com>
>>>
>>> perf/x86/intel: Ensure guest PEBS path doesn't set unwanted PERF_GLOBAL_CTRL bits
>>>
>>> This commit masks the pebs counter values being reinstated into the global
>>> control msr for a guest with perf's desired value. This is done to ensure the
>>> hypervisor doesn't unintentionally enable counters.
>>>
>>> In intel_guest_get_msrs(), the host pebs collision check evaluates the host
>>> field of the pebs_enable array entry to determine if the host is actively
>>> using pebs. If it is, guest pebs is disabled.
>>>
>>> Is it possible for this check to falsely detect a host collision due to
>>> non-counter bits being set?
>>>
>>> When a guest configures a store latency pebs event, a backing perf event is
>>> created. In intel_pmu_pebs_enable(), this sets both the counter index bit
>>> and bit 63 in cpuc->pebs_enabled.
>>>
>>> Because the host field is calculated as cpuc->pebs_enabled masked against
>>> the inverse of cpuc->intel_ctrl_guest_mask, and the guest mask exclusively
>>> tracks valid counter indices (never including bit 63), bit 63 remains set in
>>> the host field.
>>>
>>> Consequently, the check for whether the host is actively using pebs will
>>> evaluate to true even when only the guest is using pebs. This introduces a
>>> regression where guest pebs is completely disabled, silently breaking guest
>>> store latency events.
>>>
>>> Should the host pebs usage evaluation also be masked by intel_ctrl to prevent
>>> non-counter bits from triggering a false collision?
>> Hmm, I doubt this issue could happen on real hardware.
> OMG, it took me _so_ long to understand what Sashiko was complaining about.  I'm
> pretty sure I read Sashiko's response like four times, and it didn't click until
> I read your response.
>
> In case anyone is feeling as dense as me, Sashiko is saying this:
>
> 	/*
> 	 * FIXME: Allow guest and host usage of PEBS events to co-exist instead
> 	 * of disabling guest PEBS entirely if the host is using PEBS.
> 	 *        What exactly goes wrong if guest and host are using PEBS is
> 	 *        unknown.
> 	 */
> 	if (pebs_mask & ~cpuc->intel_ctrl_exclude_host_mask)
> 		guest_pebs_mask = 0;
>
> should probably be masked by intel_ctrl as well.

Yes, Sashiko's comment is somewhat misleading. Strictly speaking, the comment
should have been made against the next patch, 2/9, rather than this one.

>
>> intel_pmu_pebs_enable() does indeed set extra bits in cpuc->pebs_enabled
>> for load latency and specific store events, as the code below shows.
>>
>> ```
>>
>> ......
>>
>> if ((event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT) && (x86_pmu.version < 5))
>> 	cpuc->pebs_enabled |= 1ULL << (hwc->idx + 32);
>> else if (event->hw.flags & PERF_X86_EVENT_PEBS_ST)
>> 	cpuc->pebs_enabled |= 1ULL << 63;
>>
>> ......
>>
>> ```
>>
>> But these 2 cases should only be hit on quite old platforms (prior to
>> Icelake). On those platforms, only GP counters support PEBS sampling and
>> pebs_capable would be set to PEBS_COUNTER_MASK, so these extra bits would
>> be filtered out by pebs_capable and pebs_mask won't actually contain them.
> Thanks for digging into this!  I was staring and just going "huh!?" over and
> over in my head :-)

:)

>
>> Anyway, we could optimize the code further as below and filter these extra
>> bits away thoroughly. (build-tested only, not tested on real HW)
> Hmm, I think I'd rather figure out what it would take to drop the FIXME entirely.
> And if we need to keep the check, I'm a-ok risking false positives until we have
> a better understanding of why the check exists.

Kan Liang would be the best person to ask about the history, but he has left
Intel. I'll check with other folks internally and see whether they know the
reason. Thanks.

>
>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>> index 854881b5e696..6eee7636a822 100644
>> --- a/arch/x86/events/intel/core.c
>> +++ b/arch/x86/events/intel/core.c
>> @@ -4997,7 +4997,7 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr,
>>  	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>>  	struct perf_guest_switch_msr *arr = cpuc->guest_switch_msrs;
>>  	u64 intel_ctrl = hybrid(cpuc->pmu, intel_ctrl);
>> -	u64 pebs_mask = cpuc->pebs_enabled & x86_pmu.pebs_capable;
>> +	u64 pebs_mask = cpuc->pebs_enabled & x86_pmu.pebs_capable & intel_ctrl;
>>  	u64 guest_pebs_mask;
>>  	int global_ctrl;
>>
>> @@ -5049,7 +5049,7 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr,
>>  	 * the guest wants to use for PEBS, (c) are not excluded from counting
>>  	 * in the guest, and (d) _are_ excluded from counting in the host.
>>  	 */
>> -	guest_pebs_mask = pebs_mask & intel_ctrl & guest_pebs->enable &
>> +	guest_pebs_mask = pebs_mask & guest_pebs->enable &
>> 			  ~cpuc->intel_ctrl_exclude_guest_mask &
>> 			  cpuc->intel_ctrl_exclude_host_mask;

^ permalink raw reply	[flat|nested] 3+ messages in thread
end of thread, other threads:[~2026-05-15 1:49 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <20260508231353.406465-2-seanjc@google.com>
[not found] ` <20260508234018.F06AEC2BCB0@smtp.kernel.org>
2026-05-12 11:30 ` [PATCH v3 1/9] perf/x86/intel: Ensure guest PEBS path doesn't set unwanted PERF_GLOBAL_CTRL bits Mi, Dapeng
2026-05-15 0:01 ` Sean Christopherson
2026-05-15 1:49 ` Mi, Dapeng
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox