From: "Mi, Dapeng" <dapeng1.mi@linux.intel.com>
To: Jim Mattson <jmattson@google.com>,
Sean Christopherson <seanjc@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Namhyung Kim <namhyung@kernel.org>,
Thomas Gleixner <tglx@kernel.org>, Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
x86@kernel.org, Paolo Bonzini <pbonzini@redhat.com>,
linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
kvm@vger.kernel.org, Mingwei Zhang <mizhang@google.com>,
Stephane Eranian <eranian@google.com>
Subject: Re: [PATCH v2 1/4] perf/x86/intel: Don't write PEBS_ENABLED on host<=>guest xfers if CPU has isolation
Date: Mon, 27 Apr 2026 10:10:27 +0800 [thread overview]
Message-ID: <435241c3-20a7-4214-9f54-5ca82678dc3d@linux.intel.com> (raw)
In-Reply-To: <CALMp9eSivf0oDdbgfLSY0zcvtMAV__da8ENCJk=QpEkEeUC1PQ@mail.gmail.com>
On 4/24/2026 1:59 AM, Jim Mattson wrote:
> On Thu, Apr 23, 2026 at 8:03 AM Sean Christopherson <seanjc@google.com> wrote:
>> When filling the list of MSRs to be loaded by KVM on VM-Enter and VM-Exit,
>> *never* insert an entry for PEBS_ENABLED if the CPU properly isolates PEBS
>> events, in which case disabling counters via PERF_GLOBAL_CTRL is sufficient
>> to prevent unwanted PEBS events in the guest (or host). Because perf loads
>> PEBS_ENABLE with the unfiltered cpu_hw_events.pebs_enabled, i.e. with both
>> host and guest masks, there is no need to load different values for the
>> guest versus host, perf+KVM can and should simply control which counters
>> are enabled/disabled via PERF_GLOBAL_CTRL.
>>
>> Avoiding touching PEBS_ENABLED fixes a theorized bug where PEBS_ENABLED can
>> end up with "stuck" bits if a PEBS event is throttled better generating the
>> list and actually entering the guest (Intel CPUs can't arbtitrarily block
>> NMIs). And stating the obvious, leaving PEBS_ENABLED as-is avoids three MSR
>> writes on every VMX transition: one each on entry/exit, and one more
>> explicit WRMSR to zero PEBS_ENABLED before VM-Entry (KVM assumes the only
>> reason PEBS_ENABLED is in the load list is if the CPU lacks isolation and
>> thus needs a quiescent period).
>>
>> Fixes: c59a1f106f5c ("KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS")
>> Cc: Jim Mattson <jmattson@google.com>
>> Cc: Mingwei Zhang <mizhang@google.com>
>> Cc: Stephane Eranian <eranian@google.com>
>> Signed-off-by: Sean Christopherson <seanjc@google.com>
>> ---
>> arch/x86/events/intel/core.c | 42 ++++++++++++++++++++----------------
>> 1 file changed, 23 insertions(+), 19 deletions(-)
>>
>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>> index 793335c3ce78..002d809f82ef 100644
>> --- a/arch/x86/events/intel/core.c
>> +++ b/arch/x86/events/intel/core.c
>> @@ -4999,12 +4999,15 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
>> struct kvm_pmu *kvm_pmu = (struct kvm_pmu *)data;
>> u64 intel_ctrl = hybrid(cpuc->pmu, intel_ctrl);
>> u64 pebs_mask = cpuc->pebs_enabled & x86_pmu.pebs_capable;
>> - int global_ctrl, pebs_enable;
>> + u64 guest_pebs_mask = pebs_mask & ~cpuc->intel_ctrl_host_mask;
>> + int global_ctrl;
> Is it worth noting somewhere that pebs_ept is not supported on any
> CPUs with PMU version < 5, where a single event can set two
> PEBS_ENABLE bits (cf. intel_pmu_pebs_enable)?
>
>> /*
>> * In addition to obeying exclude_guest/exclude_host, remove bits being
>> * used for PEBS when running a guest, because PEBS writes to virtual
>> - * addresses (not physical addresses).
>> + * addresses (not physical addresses). If the guest wants to utilize
>> + * PEBS, and PEBS can safely enabled in the guest, bits for the guest's
>> + * PEBS-enabled counters will be OR'd back in as appropriate.
>> */
>> *nr = 0;
>> global_ctrl = (*nr)++;
>> @@ -5051,24 +5054,25 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
>> };
>> }
>>
>> - pebs_enable = (*nr)++;
>> - arr[pebs_enable] = (struct perf_guest_switch_msr){
>> - .msr = MSR_IA32_PEBS_ENABLE,
>> - .host = cpuc->pebs_enabled & ~cpuc->intel_ctrl_guest_mask,
>> - .guest = pebs_mask & ~cpuc->intel_ctrl_host_mask & kvm_pmu->pebs_enable,
>> - };
>> -
>> - if (arr[pebs_enable].host) {
>> - /* Disable guest PEBS if host PEBS is enabled. */
>> - arr[pebs_enable].guest = 0;
>> - } else {
>> - /* Disable guest PEBS thoroughly for cross-mapped PEBS counters. */
>> - arr[pebs_enable].guest &= ~kvm_pmu->host_cross_mapped_mask;
>> - arr[global_ctrl].guest &= ~kvm_pmu->host_cross_mapped_mask;
>> - /* Set hw GLOBAL_CTRL bits for PEBS counter when it runs for guest */
>> - arr[global_ctrl].guest |= arr[pebs_enable].guest;
>> - }
>> + /*
>> + * Disable counters where the guest PMC is different than the host PMC
>> + * being used on behalf of the guest, as the PEBS record includes
>> + * PERF_GLOBAL_STATUS, i.e. the guest will see overflow status for the
>> + * wrong counter(s). Similarly, disallow PEBS in the guest if the host
>> + * is using PEBS, to avoid bleeding host state into PEBS records.
>> + */
>> + guest_pebs_mask &= kvm_pmu->pebs_enable & ~kvm_pmu->host_cross_mapped_mask;
>> + if (pebs_mask & ~cpuc->intel_ctrl_guest_mask)
>> + guest_pebs_mask = 0;
> I don't understand this clause. IIUC, it says that if we don't have
> any exclude-host PEBS events, then clear PEBS_ENABLE for the guest.
I suppose it says all guest PEBS events need to be disabled if there is any
event using PEBS on host side, and it's clearing GLOBAL_CTRL instead of
PEBS_ENABLE to disable guest PEBS events.
>
> Yes, any guest-programmed PEBS event should be exclude-host, but if
> there is an inconsistency, shouldn't we apply a mask? What if there is
> only one exclude-host PEBS event, but there are two bits set in
> guest_pebs_mask?
>
>> + /*
>> + * Do NOT mess with PEBS_ENABLED. As above, disabling counters via
>> + * PERF_GLOBAL_CTRL is sufficient, and loading a stale PEBS_ENABLED,
>> + * e.g. on VM-Exit, can put the system in a bad state. Simply enable
>> + * counters in PERF_GLOBAL_CTRL, as perf load PEBS_ENABLED with the
>> + * full value, i.e. perf *also* relies on PERF_GLOBAL_CTRL.
>> + */
>> + arr[global_ctrl].guest |= guest_pebs_mask;
>> return arr;
>> }
>>
>> --
>> 2.54.0.545.g6539524ca2-goog
>>
next prev parent reply other threads:[~2026-04-27 2:10 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-23 15:03 [PATCH v2 0/4] perf/x86: Don't write PEBS_ENABLED on KVM transitions Sean Christopherson
2026-04-23 15:03 ` [PATCH v2 1/4] perf/x86/intel: Don't write PEBS_ENABLED on host<=>guest xfers if CPU has isolation Sean Christopherson
2026-04-23 16:22 ` Peter Zijlstra
2026-04-23 17:59 ` Jim Mattson
2026-04-27 2:10 ` Mi, Dapeng [this message]
2026-04-23 15:03 ` [PATCH v2 2/4] perf/x86/intel: Don't context switch DS_AREA (and PEBS config) if PEBS is unused Sean Christopherson
2026-04-27 2:24 ` Mi, Dapeng
2026-04-23 15:03 ` [PATCH v2 3/4] perf/x86/intel: Make @data a mandatory param for intel_guest_get_msrs() Sean Christopherson
2026-04-27 2:28 ` Mi, Dapeng
2026-04-23 15:03 ` [PATCH v2 4/4] perf/x86: KVM: Have perf define a dedicated struct for getting guest PEBS data Sean Christopherson
2026-04-23 18:14 ` Jim Mattson
2026-04-23 23:31 ` sashiko-bot
2026-04-23 15:33 ` [PATCH v2 0/4] perf/x86: Don't write PEBS_ENABLED on KVM transitions Jim Mattson
2026-04-23 16:16 ` Peter Zijlstra
2026-04-24 12:17 ` Mi, Dapeng
2026-04-24 12:23 ` Peter Zijlstra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=435241c3-20a7-4214-9f54-5ca82678dc3d@linux.intel.com \
--to=dapeng1.mi@linux.intel.com \
--cc=acme@kernel.org \
--cc=bp@alien8.de \
--cc=dave.hansen@linux.intel.com \
--cc=eranian@google.com \
--cc=jmattson@google.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-perf-users@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=mizhang@google.com \
--cc=namhyung@kernel.org \
--cc=pbonzini@redhat.com \
--cc=peterz@infradead.org \
--cc=seanjc@google.com \
--cc=tglx@kernel.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox