From: sashiko-bot@kernel.org
To: "Dapeng Mi" <dapeng1.mi@linux.intel.com>
Cc: linux-perf-users@vger.kernel.org
Subject: Re: [Patch v8 11/23] perf/x86: Enable XMM register sampling for REGS_USER case
Date: Fri, 29 May 2026 09:24:23 +0000 [thread overview]
Message-ID: <20260529092423.F2DD81F00893@smtp.kernel.org> (raw)
In-Reply-To: <20260529075645.580362-12-dapeng1.mi@linux.intel.com>
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [High] Calling `xsaves_nmi()` for PEBS events on CPUs lacking `X86_FEATURE_XSAVES` causes an invalid opcode exception (`#UD`), triggering warning loops and overhead.
- [High] Guest XMM registers are incorrectly sampled as host REGS_USER XMM registers when a PMU interrupt occurs during KVM guest execution.
--
commit feabaffcb00c51c951e3f0889a9ad615fe731d5e
Author: Dapeng Mi <dapeng1.mi@linux.intel.com>
perf/x86: Enable XMM register sampling for REGS_USER case
This patch adds support for XMM register sampling in the REGS_USER case.
To handle simultaneous sampling of XMM registers, a per-CPU x86_user_regs
is introduced to store REGS_USER-specific XMM registers. The
x86_pmu_update_user_xregs helper function is added to retrieve user-space
XMM registers.
> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -707,12 +707,12 @@ int x86_pmu_hw_config(struct perf_event *event)
> return -EINVAL;
> }
>
> - if (event->attr.sample_type & PERF_SAMPLE_REGS_INTR) {
> + if (event->attr.sample_type & (PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER)) {
> /*
> * Besides the general purpose registers, XMM registers may
> * be collected as well.
> */
> - if (event->attr.sample_regs_intr & PERF_REG_EXTENDED_MASK) {
> + if (event_has_extended_regs(event)) {
> if (!(event->pmu->capabilities & PERF_PMU_CAP_EXTENDED_REGS))
> return -EINVAL;
> if (is_sampling_event(event) && !event->attr.precise_ip &&
> !this_cpu_has(X86_FEATURE_XSAVES))
> return -EINVAL;
> }
> }
[ ... ]
> @@ -1812,33 +1803,155 @@ static void x86_pmu_update_regs_intr(struct perf_event *event,
[ ... ]
> +static inline u64 x86_pmu_update_user_xregs(struct perf_sample_data *data,
> + struct pt_regs *regs,
> + u64 mask, u64 ignore_mask)
> +{
> + struct x86_perf_regs *perf_regs;
> + struct xregs_state *xsave;
> + struct fpu *fpu;
> + struct fpstate *fps;
> + u64 user_mask = mask;
> +
> + if (data->regs_user.abi == PERF_SAMPLE_REGS_ABI_NONE)
> + return 0;
> +
> + /*
> + * If PEBS hits kernel space, need to re-sample extended
> + * registers for user space.
> + */
> + if (user_mode(regs))
> + user_mask &= ~ignore_mask;
> +
> + if (user_mask && test_thread_flag(TIF_NEED_FPU_LOAD)) {
[Severity: High]
When KVM loads a guest's FPU state, fpu_swap_kvm_fpstate() caches the host's
state in fpu->__task_fpstate and explicitly clears the TIF_NEED_FPU_LOAD flag
via fpregs_mark_activate().
If a perf NMI interrupts the guest, this check for TIF_NEED_FPU_LOAD will
evaluate to false, causing x86_pmu_update_user_xregs() to return the mask
instead of reading the cached host state from __task_fpstate.
Does this cause xsaves_nmi() to sample the hardware registers directly, which
currently hold the KVM guest's XMM registers, and expose them to the host's
user-space perf record, leading to cross-VM data leaks?
> + perf_regs = container_of(data->regs_user.regs,
> + struct x86_perf_regs, regs);
> + fpu = x86_task_fpu(current);
> + /*
> + * If __task_fpstate is set, it holds the right pointer,
> + * otherwise fpstate will.
> + */
> + fps = READ_ONCE(fpu->__task_fpstate);
> + if (!fps)
> + fps = fpu->fpstate;
> + xsave = &fps->regs.xsave;
> +
> + update_perf_regs(perf_regs, xsave, user_mask);
> + return 0;
> + }
> +
> + return user_mask;
> +}
[ ... ]
> static void x86_pmu_sample_xregs(struct perf_event *event,
> struct perf_sample_data *data,
> + struct pt_regs *regs,
> u64 ignore_mask)
> {
> struct xregs_state *xsave = get_ext_regs_buf(smp_processor_id());
> u64 sample_type = event->attr.sample_type;
> struct x86_perf_regs *perf_regs;
> + u64 user_mask = 0;
> u64 intr_mask = 0;
> u64 mask = 0;
>
> if (WARN_ON_ONCE(!xsave) || !in_nmi())
> return;
>
> - if ((sample_type & PERF_SAMPLE_REGS_INTR) &&
> - (event->attr.sample_regs_intr & PERF_REG_EXTENDED_MASK))
> + if (event_has_extended_regs(event))
> mask |= XFEATURE_MASK_SSE;
>
> mask &= x86_pmu.ext_regs_mask;
> + if (sample_type & PERF_SAMPLE_REGS_USER) {
> + user_mask = x86_pmu_update_user_xregs(data, regs,
> + mask, ignore_mask);
> + }
>
> if (sample_type & PERF_SAMPLE_REGS_INTR)
> intr_mask = mask & ~ignore_mask;
>
> + if (user_mask | intr_mask) {
> + xsave->header.xfeatures = 0;
> + xsaves_nmi(xsave, user_mask | intr_mask);
> + }
[Severity: High]
For PEBS events with PERF_SAMPLE_REGS_USER, the X86_FEATURE_XSAVES check in
x86_pmu_hw_config() is bypassed due to event->attr.precise_ip being set.
If this event triggers in kernel space on a CPU lacking XSAVES (for example,
booted with noxsaves or masked in a VM), user_mode(regs) is false, so
user_mask retains the extended bits from x86_pmu_update_user_xregs().
Since xsaves_nmi() unconditionally emits the xsaves instruction, will this
trigger an Invalid Opcode exception on CPUs without XSAVES support and add
overhead to every PEBS interrupt in kernel space?
> +
> + if (user_mask) {
> + perf_regs = container_of(data->regs_user.regs,
> + struct x86_perf_regs, regs);
> + update_perf_regs(perf_regs, xsave, user_mask);
> + }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260529075645.580362-1-dapeng1.mi@linux.intel.com?part=11
next prev parent reply other threads:[~2026-05-29 9:24 UTC|newest]
Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-29 7:56 [Patch v8 00/23] Support SIMD/eGPRs/SSP registers sampling for perf Dapeng Mi
2026-05-29 7:56 ` [Patch v8 01/23] perf/x86/intel: Validate return value of intel_pmu_init_hybrid() Dapeng Mi
2026-05-29 8:53 ` sashiko-bot
2026-05-29 11:11 ` Peter Zijlstra
2026-05-29 7:56 ` [Patch v8 02/23] perf/x86: Move hybrid PMU initialization before x86_pmu_starting_cpu() Dapeng Mi
2026-05-29 8:51 ` sashiko-bot
2026-05-29 7:56 ` [Patch v8 03/23] perf/x86/intel: Enable large PEBS sampling for XMMs Dapeng Mi
2026-05-29 7:56 ` [Patch v8 04/23] perf/x86/intel: Convert x86_perf_regs to per-cpu variables Dapeng Mi
2026-05-29 7:56 ` [Patch v8 05/23] perf: Eliminate duplicate arch-specific functions definations Dapeng Mi
2026-05-29 7:56 ` [Patch v8 06/23] perf/x86: Use x86_perf_regs in the x86 nmi handlers Dapeng Mi
2026-05-29 7:56 ` [Patch v8 07/23] x86/fpu/xstate: Add xsaves_nmi() helper Dapeng Mi
2026-05-29 8:56 ` sashiko-bot
2026-05-29 11:32 ` Peter Zijlstra
2026-05-29 7:56 ` [Patch v8 08/23] x86/fpu: Ensure TIF_NEED_FPU_LOAD is set after saving FPU state Dapeng Mi
2026-05-29 7:56 ` [Patch v8 09/23] perf: Move and enhance has_extended_regs() for arch-specific use Dapeng Mi
2026-05-29 7:56 ` [Patch v8 10/23] perf/x86: Enable XMM Register Sampling for Non-PEBS Events Dapeng Mi
2026-05-29 9:02 ` sashiko-bot
2026-05-29 11:38 ` Peter Zijlstra
2026-05-29 7:56 ` [Patch v8 11/23] perf/x86: Enable XMM register sampling for REGS_USER case Dapeng Mi
2026-05-29 9:24 ` sashiko-bot [this message]
2026-05-29 11:42 ` Peter Zijlstra
2026-05-29 7:56 ` [Patch v8 12/23] perf: Add sampling support for SIMD registers Dapeng Mi
2026-05-29 8:36 ` sashiko-bot
2026-05-29 7:56 ` [Patch v8 13/23] perf/x86: Support XMM sampling using sample_simd_vec_reg_* fields Dapeng Mi
2026-05-29 8:49 ` sashiko-bot
2026-05-29 7:56 ` [Patch v8 14/23] perf/x86: Support YMM " Dapeng Mi
2026-05-29 8:47 ` sashiko-bot
2026-05-29 7:56 ` [Patch v8 15/23] perf/x86: Support ZMM " Dapeng Mi
2026-05-29 7:56 ` [Patch v8 16/23] perf/x86: Support OPMASK sampling using sample_simd_pred_reg_* fields Dapeng Mi
2026-05-29 9:21 ` sashiko-bot
2026-05-29 7:56 ` [Patch v8 17/23] perf: Enhance perf_reg_validate() with simd_enabled argument Dapeng Mi
2026-05-29 7:56 ` [Patch v8 18/23] perf/x86: Support eGPRs sampling using sample_regs_* fields Dapeng Mi
2026-05-29 9:31 ` sashiko-bot
2026-05-29 7:56 ` [Patch v8 19/23] perf/x86: Support SSP " Dapeng Mi
2026-05-29 10:03 ` sashiko-bot
2026-05-29 7:56 ` [Patch v8 20/23] perf/x86/intel: Support arch-PEBS based SIMD/eGPRs/SSP sampling Dapeng Mi
2026-05-29 9:45 ` sashiko-bot
2026-05-29 7:56 ` [Patch v8 21/23] perf/x86/intel: Enable PERF_PMU_CAP_SIMD_REGS capability Dapeng Mi
2026-05-29 10:43 ` sashiko-bot
2026-05-29 7:56 ` [Patch v8 22/23] perf/x86: Activate back-to-back NMI detection for arch-PEBS induced NMIs Dapeng Mi
2026-05-29 9:34 ` sashiko-bot
2026-05-29 7:56 ` [Patch v8 23/23] perf/x86/intel: Add sanity check for PEBS fragment size Dapeng Mi
2026-05-29 9:54 ` sashiko-bot
2026-05-29 8:32 ` [Patch v8 00/23] Support SIMD/eGPRs/SSP registers sampling for perf Mi, Dapeng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260529092423.F2DD81F00893@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=dapeng1.mi@linux.intel.com \
--cc=linux-perf-users@vger.kernel.org \
--cc=sashiko-reviews@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox