Kernel KVM virtualization development
 help / color / mirror / Atom feed
From: "Mi, Dapeng" <dapeng1.mi@linux.intel.com>
To: Sean Christopherson <seanjc@google.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Namhyung Kim <namhyung@kernel.org>,
	Paolo Bonzini <pbonzini@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>, Ian Rogers <irogers@google.com>,
	Adrian Hunter <adrian.hunter@intel.com>,
	James Clark <james.clark@linaro.org>,
	linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org, Jim Mattson <jmattson@google.com>,
	Mingwei Zhang <mizhang@google.com>,
	Stephane Eranian <eranian@google.com>
Subject: Re: [PATCH v3 2/9] perf/x86/intel: Don't write PEBS_ENABLED on host<=>guest xfers if CPU has isolation
Date: Tue, 12 May 2026 12:53:50 +0800	[thread overview]
Message-ID: <0a3535cb-b519-4225-9c09-733d47adfe5e@linux.intel.com> (raw)
In-Reply-To: <20260508231353.406465-3-seanjc@google.com>


On 5/9/2026 7:13 AM, Sean Christopherson wrote:
> When filling the list of MSRs to be loaded by KVM on VM-Enter and VM-Exit,
> *never* insert an entry for PEBS_ENABLED if the CPU properly isolates PEBS
> events, in which case disabling counters via PERF_GLOBAL_CTRL is sufficient
> to prevent unwanted PEBS events in the guest (or host).  Because perf loads
> PEBS_ENABLE with the unfiltered cpu_hw_events.pebs_enabled, i.e. with both
> host and guest masks, there is no need to load different values for the
> guest versus host, perf+KVM can and should simply control which counters
> are enabled/disabled via PERF_GLOBAL_CTRL.
>
> Avoiding touching PEBS_ENABLED "fixes" a bug where PEBS_ENABLED can end up
> with "stuck" bits if a PEBS event is throttled between generating the list
> and actually entering the guest (Intel CPUs can't arbtitrarily block NMIs).
> Fixes in quotes because leaving PEBS_ENABLED as-is doesn't fix the
> underlying problem of perf (via PMIs) being able to modify state after the
> perf<=>KVM handoff.
>
> But not writing PEBS_ENABLED is desirable no matter what, as stating the
> obvious, leaving PEBS_ENABLED as-is avoids three MSR writes on every VMX
> transition: one each on entry/exit, and one more explicit WRMSR to zero
> PEBS_ENABLED before VM-Entry (KVM assumes the only reason PEBS_ENABLED is
> in the load list is if the CPU lacks isolation and thus needs a quiescent
> period).
>
> Opportunistically add comments to (better) explain the rules for generating
> the set of PEBS counters that will be active while the guest is running,
> along with a FIXME for the suspected hack-a-fix where perf disables guest
> PEBS if _any_ PEBS event is configured to count in the host (commit
> 854250329c02 ("KVM: x86/pmu: Disable guest PEBS temporarily in two rare
> situations") doesn't explain the motivation, at all).
>
> Fixes: c59a1f106f5c ("KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS")
> Cc: Jim Mattson <jmattson@google.com>
> Cc: Mingwei Zhang <mizhang@google.com>
> Cc: Stephane Eranian <eranian@google.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/events/intel/core.c | 55 ++++++++++++++++++++++++------------
>  1 file changed, 37 insertions(+), 18 deletions(-)
>
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index b70dc35fcceb..13cd12d3eeee 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -4999,12 +4999,15 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
>  	struct kvm_pmu *kvm_pmu = (struct kvm_pmu *)data;
>  	u64 intel_ctrl = hybrid(cpuc->pmu, intel_ctrl);
>  	u64 pebs_mask = cpuc->pebs_enabled & x86_pmu.pebs_capable;
> -	int global_ctrl, pebs_enable;
> +	u64 guest_pebs_mask;
> +	int global_ctrl;
>  
>  	/*
>  	 * In addition to obeying exclude_guest/exclude_host, remove bits being
>  	 * used for PEBS when running a guest, because PEBS writes to virtual
> -	 * addresses (not physical addresses).
> +	 * addresses (not physical addresses).  If the guest wants to utilize
> +	 * PEBS, and PEBS can safely enabled in the guest, bits for the guest's
> +	 * PEBS-enabled counters will be OR'd back in as appropriate.
>  	 */
>  	*nr = 0;
>  	global_ctrl = (*nr)++;
> @@ -5051,24 +5054,40 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
>  		};
>  	}
>  
> -	pebs_enable = (*nr)++;
> -	arr[pebs_enable] = (struct perf_guest_switch_msr){
> -		.msr = MSR_IA32_PEBS_ENABLE,
> -		.host = cpuc->pebs_enabled & ~cpuc->intel_ctrl_guest_mask,
> -		.guest = pebs_mask & ~cpuc->intel_ctrl_host_mask & kvm_pmu->pebs_enable,
> -	};
> +	/*
> +	 * Restrict guest PEBS events to counters that (a) perf supports, (b)
> +	 * the guest wants to use for PEBS, (c) are not excluded from counting
> +	 * in the guest, and (d) _are_ excluded from counting in the host.
> +	 */
> +	guest_pebs_mask = pebs_mask & intel_ctrl & kvm_pmu->pebs_enable &
> +			  ~cpuc->intel_ctrl_host_mask &
> +			  cpuc->intel_ctrl_guest_mask;
>  
> -	if (arr[pebs_enable].host) {
> -		/* Disable guest PEBS if host PEBS is enabled. */
> -		arr[pebs_enable].guest = 0;
> -	} else {
> -		/* Disable guest PEBS thoroughly for cross-mapped PEBS counters. */
> -		arr[pebs_enable].guest &= ~kvm_pmu->host_cross_mapped_mask;
> -		arr[global_ctrl].guest &= ~kvm_pmu->host_cross_mapped_mask;
> -		/* Set hw GLOBAL_CTRL bits for PEBS counter when it runs for guest */
> -		arr[global_ctrl].guest |= intel_ctrl & arr[pebs_enable].guest;
> -	}
> +	/*
> +	 * Disable counters where the guest PMC is different than the host PMC
> +	 * being used on behalf of the guest, as the PEBS record includes
> +	 * PERF_GLOBAL_STATUS, i.e. the guest will see overflow status for the
> +	 * wrong counter(s).
> +	 */
> +	guest_pebs_mask &= ~kvm_pmu->host_cross_mapped_mask;
>  
> +	/*
> +	 * FIXME: Allow guest and host usage of PEBS events to co-exist instead
> +	 *        of disabling guest PEBS entirely if the host is using PEBS.
> +	 *        What exactly goes wrong if guest and host are using PEBS is
> +	 *        unknown.
> +	 */
> +	if (pebs_mask & ~cpuc->intel_ctrl_guest_mask)
> +		guest_pebs_mask = 0;
> +
> +	/*
> +	 * Do NOT mess with PEBS_ENABLED.  As above, disabling counters via
> +	 * PERF_GLOBAL_CTRL is sufficient, and loading a stale PEBS_ENABLED,
> +	 * e.g. on VM-Exit, can put the system in a bad state.  Simply enable
> +	 * counters in PERF_GLOBAL_CTRL, as perf load PEBS_ENABLED with the
> +	 * full value, i.e. perf *also* relies on PERF_GLOBAL_CTRL.
> +	 */
> +	arr[global_ctrl].guest |= guest_pebs_mask;
>  	return arr;
>  }

LGTM.

Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>


>  

  reply	other threads:[~2026-05-12  4:54 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-08 23:13 [PATCH v3 0/9] perf/x86: Don't write PEBS_ENABLED on KVM transitions Sean Christopherson
2026-05-08 23:13 ` [PATCH v3 1/9] perf/x86/intel: Ensure guest PEBS path doesn't set unwanted PERF_GLOBAL_CTRL bits Sean Christopherson
2026-05-12  4:53   ` Mi, Dapeng
2026-05-08 23:13 ` [PATCH v3 2/9] perf/x86/intel: Don't write PEBS_ENABLED on host<=>guest xfers if CPU has isolation Sean Christopherson
2026-05-12  4:53   ` Mi, Dapeng [this message]
2026-05-08 23:13 ` [PATCH v3 3/9] perf/x86/intel: Don't context switch DS_AREA (and PEBS config) if PEBS is unused Sean Christopherson
2026-05-08 23:13 ` [PATCH v3 4/9] perf/x86/intel: Make @data a mandatory param for intel_guest_get_msrs() Sean Christopherson
2026-05-12 12:39   ` Jim Mattson
2026-05-08 23:13 ` [PATCH v3 5/9] perf/x86/intel: Invert names of intel_ctrl_{guest,host}_mask Sean Christopherson
2026-05-12  4:58   ` Mi, Dapeng
2026-05-08 23:13 ` [PATCH v3 6/9] perf/x86: KVM: Have perf define a dedicated struct for getting guest PEBS data Sean Christopherson
2026-05-08 23:13 ` [PATCH v3 7/9] perf/x86/intel: KVM: Handle cross-mapped PEBS PMCs entirely within KVM Sean Christopherson
2026-05-12  4:59   ` Mi, Dapeng
2026-05-08 23:13 ` [PATCH v3 8/9] KVM: VMX: Drop a redundant pmu->global_ctrl check when processing pebs_enable Sean Christopherson
2026-05-12  5:00   ` Mi, Dapeng
2026-05-08 23:13 ` [PATCH v3 9/9] KVM: VMX: Only tell perf to enable PEBS counters for fully enabled PMCs Sean Christopherson
2026-05-12  5:01   ` Mi, Dapeng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0a3535cb-b519-4225-9c09-733d47adfe5e@linux.intel.com \
    --to=dapeng1.mi@linux.intel.com \
    --cc=acme@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=eranian@google.com \
    --cc=irogers@google.com \
    --cc=james.clark@linaro.org \
    --cc=jmattson@google.com \
    --cc=jolsa@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mingo@redhat.com \
    --cc=mizhang@google.com \
    --cc=namhyung@kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=seanjc@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox