From: Sean Christopherson <seanjc@google.com>
To: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Namhyung Kim <namhyung@kernel.org>,
Thomas Gleixner <tglx@kernel.org>,
Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
x86@kernel.org, Sean Christopherson <seanjc@google.com>,
Paolo Bonzini <pbonzini@redhat.com>
Cc: linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
kvm@vger.kernel.org, Jim Mattson <jmattson@google.com>,
Mingwei Zhang <mizhang@google.com>,
Stephane Eranian <eranian@google.com>,
Dapeng Mi <dapeng1.mi@linux.intel.com>
Subject: [PATCH v2 0/4] perf/x86: Don't write PEBS_ENABLED on KVM transitions
Date: Thu, 23 Apr 2026 08:03:36 -0700
Message-ID: <20260423150340.463896-1-seanjc@google.com>
Rework the handling of PEBS_ENABLED (and related PEBS MSRs) to *never* touch
PEBS_ENABLED if the CPU provides PEBS isolation, in which case disabling
counters via PERF_GLOBAL_CTRL is sufficient to prevent generation of unwanted
PEBS records. For vCPUs without PEBS enabled, this saves upwards of 7 MSR
writes on each roundtrip between the guest and host (KVM performs an immediate
WRMSR to zero out PEBS_ENABLED if it's in the load list). For vCPUs with PEBS,
this saves 3 MSR writes per roundtrip.
However, performance isn't the underlying motivation. We (more accurately,
Jim, Mingwei, and Stephane) have been chasing issues where PEBS_ENABLED bits
can get "stuck" in a '1' state when running KVM guests while profiling the host
with PEBS events. The working theory is that perf throttles PEBS events in
NMI context, and thus clears bits in cpuc->pebs_enabled and PEBS_ENABLED, after
generating the list of PMU MSRs to context switch but before VM-Entry. And so
when the host's PEBS_ENABLED is loaded on VM-Exit, the CPU ends up with a
stale PEBS_ENABLED that doesn't get reset until something triggers an explicit
reload in perf.
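To make the suspected ordering concrete, here is a minimal user-space sketch
of the theory above. It is purely illustrative and is not code from this
series: every sim_* name is hypothetical, the "MSR" is a plain variable, and
the NMI is modelled as an ordinary function call made between taking the
snapshot and the simulated VM-Entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for perf's per-CPU state and the hardware MSR. */
struct sim_cpu {
	uint64_t pebs_enabled;    /* models cpuc->pebs_enabled */
	uint64_t msr_pebs_enable; /* models the PEBS_ENABLED MSR */
};

/* One host/guest pair in a simulated MSR switch list. */
struct sim_switch_entry {
	uint64_t host;  /* value loaded on VM-Exit */
	uint64_t guest; /* value loaded on VM-Entry */
};

/* Step 1: snapshot the host value while building the switch list. */
static struct sim_switch_entry sim_build_entry(const struct sim_cpu *cpu)
{
	struct sim_switch_entry e = { .host = cpu->pebs_enabled, .guest = 0 };

	return e;
}

/* Step 2: an "NMI" throttles one event and clears its bit in both places. */
static void sim_nmi_throttle(struct sim_cpu *cpu, unsigned int bit)
{
	assert(bit < 64);
	cpu->pebs_enabled &= ~(1ULL << bit);
	cpu->msr_pebs_enable = cpu->pebs_enabled;
}

/* Step 3: VM-Entry loads the guest value, VM-Exit reloads the snapshot. */
static void sim_vm_roundtrip(struct sim_cpu *cpu,
			     const struct sim_switch_entry *e)
{
	cpu->msr_pebs_enable = e->guest;
	cpu->msr_pebs_enable = e->host;
}

/*
 * Run the sequence and return the bits that are set in the simulated MSR
 * but clear in the software state, i.e. the "stuck" bits.  When
 * skip_pebs_enabled is true, PEBS_ENABLED is never written on the
 * transition, which models the isolation case.
 */
static uint64_t sim_stuck_bits(bool skip_pebs_enabled)
{
	struct sim_cpu cpu = { .pebs_enabled = 0x3, .msr_pebs_enable = 0x3 };
	struct sim_switch_entry e = sim_build_entry(&cpu);

	sim_nmi_throttle(&cpu, 0);
	if (!skip_pebs_enabled)
		sim_vm_roundtrip(&cpu, &e);

	return cpu.msr_pebs_enable & ~cpu.pebs_enabled;
}
```

In this model, sim_stuck_bits(false) reports bit 0 as stuck because the
snapshot predates the throttle, while sim_stuck_bits(true) reports nothing
because the stale value is never written back. It says nothing about whether
the real hardware and perf code can actually interleave this way.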
Testing this against our "PEBS_ENABLED is stuck" reproducer is (still) a work
in progress (largely because the "reproducer" is currently "throw the kernel in
a big test pool"), i.e. I don't know if this actually resolves the problems we
are seeing. But even if it doesn't fully resolve our woes, it seems like a
no-brainer improvement, and if we're missing something with respect to "stuck"
PEBS_ENABLED, it'd be nice to get feedback/input asap.
Note, if the throttling theory is correct (which is looking unlikely at the
moment), then there are likely more fixes that need to be done, e.g. for CPUs
without isolation, and/or if PERF_GLOBAL_CTRL can be modified from NMI context
too.
Patch 4 is a clean up that I posted as a standalone patch almost a year ago.
I included it here because it's very related, and because I needed to refresh
it anyways.
v2:
- "Load" the host value for the guest when an MSR should remain unchanged,
instead of omitting the MSR from the list entirely, as KVM may need to
_remove_ the MSR from the list. [Sashiko, Jim]
- Collect Jim's reviews. [Jim]
- Call out that the bug being fixed is theoretical at this point.
- Dropping PEBS_ENABLED from the lists saves three MSR writes, not two, as
KVM performs an explicit WRMSR prior to VM-Entry to guarantee PEBS is
quiesced.
v1: https://lore.kernel.org/all/20260414191425.2697918-1-seanjc@google.com
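The first v2 item (report host == guest instead of omitting the MSR) can be
sketched in user space as below. This is an assumption-laden illustration,
not the code in this series: the sim_* types and helpers are hypothetical,
and the consumer's "equal values mean remove, otherwise add" rule is an
assumed model of how a load list might be kept in sync with the reported
values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_MAX_MSRS        4
#define SIM_MSR_PEBS_ENABLE 0x3f1u /* illustrative index for the sketch */

/* A reported MSR: index plus the values wanted in host and guest. */
struct sim_msr {
	uint32_t index;
	uint64_t host;
	uint64_t guest;
};

/* A simulated load list that persists across calls. */
struct sim_load_list {
	uint32_t index[SIM_MAX_MSRS];
	size_t nr;
};

static bool sim_list_contains(const struct sim_load_list *l, uint32_t index)
{
	for (size_t i = 0; i < l->nr; i++)
		if (l->index[i] == index)
			return true;
	return false;
}

static void sim_list_add(struct sim_load_list *l, uint32_t index)
{
	if (sim_list_contains(l, index))
		return;
	assert(l->nr < SIM_MAX_MSRS);
	l->index[l->nr++] = index;
}

static void sim_list_remove(struct sim_load_list *l, uint32_t index)
{
	for (size_t i = 0; i < l->nr; i++) {
		if (l->index[i] == index) {
			/* Order does not matter here; fill the hole. */
			l->index[i] = l->index[--l->nr];
			return;
		}
	}
}

/* Consumer: only MSRs that are reported can be added or removed. */
static void sim_sync_list(struct sim_load_list *l,
			  const struct sim_msr *msrs, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		if (msrs[i].host == msrs[i].guest)
			sim_list_remove(l, msrs[i].index);
		else
			sim_list_add(l, msrs[i].index);
	}
}

/*
 * Start with PEBS_ENABLED already in the list from an earlier run, then
 * either omit it from the report or report it with host == guest.
 * Returns true if the old entry is still in the list afterwards.
 */
static bool sim_stale_entry_survives(bool omit_unchanged)
{
	struct sim_load_list list = { .index = { SIM_MSR_PEBS_ENABLE }, .nr = 1 };
	struct sim_msr msrs[1] = { { SIM_MSR_PEBS_ENABLE, 0x3, 0x3 } };

	sim_sync_list(&list, msrs, omit_unchanged ? 0 : 1);
	return sim_list_contains(&list, SIM_MSR_PEBS_ENABLE);
}
```

In this model, omitting the MSR leaves the old entry in place because the
consumer never sees it, whereas reporting it with equal host and guest values
gives the consumer the chance to drop it.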
Sean Christopherson (4):
perf/x86/intel: Don't write PEBS_ENABLED on host<=>guest xfers if CPU
has isolation
perf/x86/intel: Don't context switch DS_AREA (and PEBS config) if PEBS
is unused
perf/x86/intel: Make @data a mandatory param for
intel_guest_get_msrs()
perf/x86: KVM: Have perf define a dedicated struct for getting guest
PEBS data
arch/x86/events/core.c | 5 ++-
arch/x86/events/intel/core.c | 69 +++++++++++++++++++------------
arch/x86/events/perf_event.h | 3 +-
arch/x86/include/asm/kvm_host.h | 9 ----
arch/x86/include/asm/perf_event.h | 12 +++++-
arch/x86/kvm/vmx/pmu_intel.c | 20 +++++++--
arch/x86/kvm/vmx/vmx.c | 11 +++--
arch/x86/kvm/vmx/vmx.h | 2 +-
8 files changed, 82 insertions(+), 49 deletions(-)
base-commit: 6b802031877a995456c528095c41d1948546bf45
--
2.54.0.545.g6539524ca2-goog
Thread overview: 10+ messages
2026-04-23 15:03 Sean Christopherson [this message]
2026-04-23 15:03 ` [PATCH v2 1/4] perf/x86/intel: Don't write PEBS_ENABLED on host<=>guest xfers if CPU has isolation Sean Christopherson
2026-04-23 16:22 ` Peter Zijlstra
2026-04-23 17:59 ` Jim Mattson
2026-04-23 15:03 ` [PATCH v2 2/4] perf/x86/intel: Don't context switch DS_AREA (and PEBS config) if PEBS is unused Sean Christopherson
2026-04-23 15:03 ` [PATCH v2 3/4] perf/x86/intel: Make @data a mandatory param for intel_guest_get_msrs() Sean Christopherson
2026-04-23 15:03 ` [PATCH v2 4/4] perf/x86: KVM: Have perf define a dedicated struct for getting guest PEBS data Sean Christopherson
2026-04-23 18:14 ` Jim Mattson
2026-04-23 15:33 ` [PATCH v2 0/4] perf/x86: Don't write PEBS_ENABLED on KVM transitions Jim Mattson
2026-04-23 16:16 ` Peter Zijlstra