All of lore.kernel.org
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Dapeng Mi" <dapeng1.mi@linux.intel.com>
Cc: linux-perf-users@vger.kernel.org
Subject: Re: [Patch v8 11/23] perf/x86: Enable XMM register sampling for REGS_USER case
Date: Fri, 29 May 2026 09:24:23 +0000	[thread overview]
Message-ID: <20260529092423.F2DD81F00893@smtp.kernel.org> (raw)
In-Reply-To: <20260529075645.580362-12-dapeng1.mi@linux.intel.com>

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [High] Calling `xsaves_nmi()` for PEBS events on CPUs lacking `X86_FEATURE_XSAVES` causes an invalid opcode exception (`#UD`), triggering warning loops and overhead.
- [High] Guest XMM registers are incorrectly sampled as host REGS_USER XMM registers when a PMU interrupt occurs during KVM guest execution.
--

commit feabaffcb00c51c951e3f0889a9ad615fe731d5e
Author: Dapeng Mi <dapeng1.mi@linux.intel.com>

perf/x86: Enable XMM register sampling for REGS_USER case

This patch adds support for XMM register sampling in the REGS_USER case.
To handle simultaneous sampling of XMM registers, a per-CPU x86_user_regs
is introduced to store REGS_USER-specific XMM registers. The
x86_pmu_update_user_xregs helper function is added to retrieve user-space
XMM registers.

> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -707,12 +707,12 @@ int x86_pmu_hw_config(struct perf_event *event)
>  			return -EINVAL;
>  	}
>  
> -	if (event->attr.sample_type & PERF_SAMPLE_REGS_INTR) {
> +	if (event->attr.sample_type & (PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER)) {
>  		/*
>  		 * Besides the general purpose registers, XMM registers may
>  		 * be collected as well.
>  		 */
> -		if (event->attr.sample_regs_intr & PERF_REG_EXTENDED_MASK) {
> +		if (event_has_extended_regs(event)) {
>  			if (!(event->pmu->capabilities & PERF_PMU_CAP_EXTENDED_REGS))
>  				return -EINVAL;
>  			if (is_sampling_event(event) && !event->attr.precise_ip &&
>  			    !this_cpu_has(X86_FEATURE_XSAVES))
>  				return -EINVAL;
>  		}
>  	}
[ ... ]
> @@ -1812,33 +1803,155 @@ static void x86_pmu_update_regs_intr(struct perf_event *event,
[ ... ]
> +static inline u64 x86_pmu_update_user_xregs(struct perf_sample_data *data,
> +					    struct pt_regs *regs,
> +					    u64 mask, u64 ignore_mask)
> +{
> +	struct x86_perf_regs *perf_regs;
> +	struct xregs_state *xsave;
> +	struct fpu *fpu;
> +	struct fpstate *fps;
> +	u64 user_mask = mask;
> +
> +	if (data->regs_user.abi == PERF_SAMPLE_REGS_ABI_NONE)
> +		return 0;
> +
> +	/*
> +	 * If PEBS hits kernel space, need to re-sample extended
> +	 * registers for user space.
> +	 */
> +	if (user_mode(regs))
> +		user_mask &= ~ignore_mask;
> +
> +	if (user_mask && test_thread_flag(TIF_NEED_FPU_LOAD)) {

[Severity: High]
When KVM loads a guest's FPU state, fpu_swap_kvm_fpstate() caches the host's
state in fpu->__task_fpstate and explicitly clears the TIF_NEED_FPU_LOAD flag
via fpregs_mark_activate().

If a perf NMI interrupts the guest, this check for TIF_NEED_FPU_LOAD will
evaluate to false, causing x86_pmu_update_user_xregs() to return the mask
instead of reading the cached host state from __task_fpstate.

Does this cause xsaves_nmi() to sample the hardware registers directly, which
currently hold the KVM guest's XMM registers, and expose them to the host's
user-space perf record, leading to cross-VM data leaks?

> +		perf_regs = container_of(data->regs_user.regs,
> +				 struct x86_perf_regs, regs);
> +		fpu = x86_task_fpu(current);
> +		/*
> +		 * If __task_fpstate is set, it holds the right pointer,
> +		 * otherwise fpstate will.
> +		 */
> +		fps = READ_ONCE(fpu->__task_fpstate);
> +		if (!fps)
> +			fps = fpu->fpstate;
> +		xsave = &fps->regs.xsave;
> +
> +		update_perf_regs(perf_regs, xsave, user_mask);
> +		return 0;
> +	}
> +
> +	return user_mask;
> +}
[ ... ]
>  static void x86_pmu_sample_xregs(struct perf_event *event,
>  				 struct perf_sample_data *data,
> +				 struct pt_regs *regs,
>  				 u64 ignore_mask)
>  {
>  	struct xregs_state *xsave = get_ext_regs_buf(smp_processor_id());
>  	u64 sample_type = event->attr.sample_type;
>  	struct x86_perf_regs *perf_regs;
> +	u64 user_mask = 0;
>  	u64 intr_mask = 0;
>  	u64 mask = 0;
>  
>  	if (WARN_ON_ONCE(!xsave) || !in_nmi())
>  		return;
>  
> -	if ((sample_type & PERF_SAMPLE_REGS_INTR) &&
> -	    (event->attr.sample_regs_intr & PERF_REG_EXTENDED_MASK))
> +	if (event_has_extended_regs(event))
>  		mask |= XFEATURE_MASK_SSE;
>  
>  	mask &= x86_pmu.ext_regs_mask;
> +	if (sample_type & PERF_SAMPLE_REGS_USER) {
> +		user_mask = x86_pmu_update_user_xregs(data, regs,
> +						      mask, ignore_mask);
> +	}
>  
>  	if (sample_type & PERF_SAMPLE_REGS_INTR)
>  		intr_mask = mask & ~ignore_mask;
>  
> +	if (user_mask | intr_mask) {
> +		xsave->header.xfeatures = 0;
> +		xsaves_nmi(xsave, user_mask | intr_mask);
> +	}

[Severity: High]
For PEBS events with PERF_SAMPLE_REGS_USER, the X86_FEATURE_XSAVES check in
x86_pmu_hw_config() is bypassed due to event->attr.precise_ip being set.

If this event triggers in kernel space on a CPU lacking XSAVES (for example,
booted with noxsaves or masked in a VM), user_mode(regs) is false, so
user_mask retains the extended bits from x86_pmu_update_user_xregs().

Since xsaves_nmi() unconditionally emits the xsaves instruction, will this
trigger an Invalid Opcode exception on CPUs without XSAVES support and add
overhead to every PEBS interrupt in kernel space?

> +
> +	if (user_mask) {
> +		perf_regs = container_of(data->regs_user.regs,
> +					 struct x86_perf_regs, regs);
> +		update_perf_regs(perf_regs, xsave, user_mask);
> +	}

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260529075645.580362-1-dapeng1.mi@linux.intel.com?part=11

  reply	other threads:[~2026-05-29  9:24 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-29  7:56 [Patch v8 00/23] Support SIMD/eGPRs/SSP registers sampling for perf Dapeng Mi
2026-05-29  7:56 ` [Patch v8 01/23] perf/x86/intel: Validate return value of intel_pmu_init_hybrid() Dapeng Mi
2026-05-29  8:53   ` sashiko-bot
2026-05-29 11:11   ` Peter Zijlstra
2026-06-01  1:02     ` Mi, Dapeng
2026-05-29  7:56 ` [Patch v8 02/23] perf/x86: Move hybrid PMU initialization before x86_pmu_starting_cpu() Dapeng Mi
2026-05-29  8:51   ` sashiko-bot
2026-06-01  1:40     ` Mi, Dapeng
2026-05-29  7:56 ` [Patch v8 03/23] perf/x86/intel: Enable large PEBS sampling for XMMs Dapeng Mi
2026-05-29  7:56 ` [Patch v8 04/23] perf/x86/intel: Convert x86_perf_regs to per-cpu variables Dapeng Mi
2026-05-29  7:56 ` [Patch v8 05/23] perf: Eliminate duplicate arch-specific functions definations Dapeng Mi
2026-05-29  7:56 ` [Patch v8 06/23] perf/x86: Use x86_perf_regs in the x86 nmi handlers Dapeng Mi
2026-05-29  7:56 ` [Patch v8 07/23] x86/fpu/xstate: Add xsaves_nmi() helper Dapeng Mi
2026-05-29  8:56   ` sashiko-bot
2026-05-29 11:32   ` Peter Zijlstra
2026-06-01  2:31     ` Mi, Dapeng
2026-06-01  8:28       ` Peter Zijlstra
2026-05-29  7:56 ` [Patch v8 08/23] x86/fpu: Ensure TIF_NEED_FPU_LOAD is set after saving FPU state Dapeng Mi
2026-05-29  7:56 ` [Patch v8 09/23] perf: Move and enhance has_extended_regs() for arch-specific use Dapeng Mi
2026-05-29  7:56 ` [Patch v8 10/23] perf/x86: Enable XMM Register Sampling for Non-PEBS Events Dapeng Mi
2026-05-29  9:02   ` sashiko-bot
2026-06-01  3:11     ` Mi, Dapeng
2026-05-29 11:38   ` Peter Zijlstra
2026-06-01  3:04     ` Mi, Dapeng
2026-05-29  7:56 ` [Patch v8 11/23] perf/x86: Enable XMM register sampling for REGS_USER case Dapeng Mi
2026-05-29  9:24   ` sashiko-bot [this message]
2026-06-01  5:57     ` Mi, Dapeng
2026-05-29 11:42   ` Peter Zijlstra
2026-06-01  5:53     ` Mi, Dapeng
2026-05-29  7:56 ` [Patch v8 12/23] perf: Add sampling support for SIMD registers Dapeng Mi
2026-05-29  8:36   ` sashiko-bot
2026-06-01  6:44     ` Mi, Dapeng
2026-05-29  7:56 ` [Patch v8 13/23] perf/x86: Support XMM sampling using sample_simd_vec_reg_* fields Dapeng Mi
2026-05-29  8:49   ` sashiko-bot
2026-06-01  6:57     ` Mi, Dapeng
2026-05-29  7:56 ` [Patch v8 14/23] perf/x86: Support YMM " Dapeng Mi
2026-05-29  8:47   ` sashiko-bot
2026-06-01  7:14     ` Mi, Dapeng
2026-05-29  7:56 ` [Patch v8 15/23] perf/x86: Support ZMM " Dapeng Mi
2026-05-29  7:56 ` [Patch v8 16/23] perf/x86: Support OPMASK sampling using sample_simd_pred_reg_* fields Dapeng Mi
2026-05-29  9:21   ` sashiko-bot
2026-06-01  7:21     ` Mi, Dapeng
2026-05-29  7:56 ` [Patch v8 17/23] perf: Enhance perf_reg_validate() with simd_enabled argument Dapeng Mi
2026-05-29  7:56 ` [Patch v8 18/23] perf/x86: Support eGPRs sampling using sample_regs_* fields Dapeng Mi
2026-05-29  9:31   ` sashiko-bot
2026-06-01  8:20     ` Mi, Dapeng
2026-05-29  7:56 ` [Patch v8 19/23] perf/x86: Support SSP " Dapeng Mi
2026-05-29 10:03   ` sashiko-bot
2026-06-01  8:54     ` Mi, Dapeng
2026-05-29  7:56 ` [Patch v8 20/23] perf/x86/intel: Support arch-PEBS based SIMD/eGPRs/SSP sampling Dapeng Mi
2026-05-29  9:45   ` sashiko-bot
2026-06-01  9:08     ` Mi, Dapeng
2026-05-29  7:56 ` [Patch v8 21/23] perf/x86/intel: Enable PERF_PMU_CAP_SIMD_REGS capability Dapeng Mi
2026-05-29 10:43   ` sashiko-bot
2026-06-01  9:19     ` Mi, Dapeng
2026-05-29  7:56 ` [Patch v8 22/23] perf/x86: Activate back-to-back NMI detection for arch-PEBS induced NMIs Dapeng Mi
2026-05-29  9:34   ` sashiko-bot
2026-06-01  9:23     ` Mi, Dapeng
2026-05-29  7:56 ` [Patch v8 23/23] perf/x86/intel: Add sanity check for PEBS fragment size Dapeng Mi
2026-05-29  9:54   ` sashiko-bot
2026-06-01  9:42     ` Mi, Dapeng
2026-05-29  8:32 ` [Patch v8 00/23] Support SIMD/eGPRs/SSP registers sampling for perf Mi, Dapeng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260529092423.F2DD81F00893@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=dapeng1.mi@linux.intel.com \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.