From: "Mi, Dapeng" <dapeng1.mi@linux.intel.com>
To: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Namhyung Kim <namhyung@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
Ian Rogers <irogers@google.com>,
Adrian Hunter <adrian.hunter@intel.com>,
Jiri Olsa <jolsa@kernel.org>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Andi Kleen <ak@linux.intel.com>,
Eranian Stephane <eranian@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>,
broonie@kernel.org, Ravi Bangoria <ravi.bangoria@amd.com>,
linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
Zide Chen <zide.chen@intel.com>,
Falcon Thomas <thomas.falcon@intel.com>,
Dapeng Mi <dapeng1.mi@intel.com>,
Xudong Hao <xudong.hao@intel.com>,
Kan Liang <kan.liang@linux.intel.com>
Subject: Re: [Patch v7 12/24] perf/x86: Enable XMM register sampling for REGS_USER case
Date: Wed, 25 Mar 2026 15:58:29 +0800
Message-ID: <fa954a5a-2efd-4f21-abb9-c69150f66c56@linux.intel.com>
In-Reply-To: <20260324004118.3772171-13-dapeng1.mi@linux.intel.com>
On 3/24/2026 8:41 AM, Dapeng Mi wrote:
> This patch adds support for XMM register sampling in the REGS_USER case.
>
> To handle simultaneous sampling of XMM registers for both REGS_INTR and
> REGS_USER cases, a per-CPU `x86_user_regs` is introduced to store
> REGS_USER-specific XMM registers. This prevents REGS_USER-specific XMM
> register data from being overwritten by REGS_INTR-specific data if they
> share the same `x86_perf_regs` structure.
>
> To sample user-space XMM registers, the `x86_pmu_update_user_ext_regs()`
> helper function is added. It checks if the `TIF_NEED_FPU_LOAD` flag is
> set. If so, the user-space XMM register data can be directly retrieved
> from the cached task FPU state, as the corresponding hardware registers
> have been cleared or switched to kernel-space data. Otherwise, the data
> must be read from the hardware registers using the `xsaves` instruction.
>
> For PEBS events, `x86_pmu_update_user_ext_regs()` checks if the
> PEBS-sampled XMM register data belongs to user-space. If so, no further
> action is needed. Otherwise, the user-space XMM register data needs to be
> re-sampled using the same method as for non-PEBS events.
>
> Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> ---
> arch/x86/events/core.c | 95 ++++++++++++++++++++++++++++++++++++------
> 1 file changed, 82 insertions(+), 13 deletions(-)
>
> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index 22965a8a22b3..a5643c875190 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -696,7 +696,7 @@ int x86_pmu_hw_config(struct perf_event *event)
> return -EINVAL;
> }
>
> - if (event->attr.sample_type & PERF_SAMPLE_REGS_INTR) {
> + if (event->attr.sample_type & (PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER)) {
> /*
> * Besides the general purpose registers, XMM registers may
> * be collected as well.
> @@ -707,15 +707,6 @@ int x86_pmu_hw_config(struct perf_event *event)
> }
> }
>
> - if (event->attr.sample_type & PERF_SAMPLE_REGS_USER) {
> - /*
> - * Currently XMM registers sampling for REGS_USER is not
> - * supported yet.
> - */
> - if (event_has_extended_regs(event))
> - return -EINVAL;
> - }
> -
> return x86_setup_perfctr(event);
> }
Sashiko comments
"
With this check removed, can older platforms hit an uninitialized stack
pointer dereference?
In __intel_pmu_pebs_events(), a struct x86_perf_regs is allocated on the
stack without zero-initialization. setup_pebs_fixed_sample_data() populates
perf_regs.regs but leaves perf_regs.xmm_regs uninitialized and doesn't set
the PERF_SAMPLE_REGS_USER flag.
If perf_prepare_sample() sees the missing flag, it calls
perf_sample_regs_user(), and perf_output_sample_regs() eventually
dereferences the uninitialized stack memory at perf_regs->xmm_regs to
output XMM data.
"
The comment makes sense. I will clear xmm_regs and the other pointers in
the next version.
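As a rough user-space illustration of that direction (not the actual change): the types and the model_* names below are simplified, hypothetical stand-ins for x86_perf_regs and the PEBS setup/output paths. The point is only that a setup path which fills just the general-purpose registers should also reset the extended-register pointer, so the output side never follows leftover stack contents. Treating NULL as "nothing captured" is an assumption of this model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel types; layouts are illustrative only. */
struct model_pt_regs {
	uint64_t ip;
	uint64_t sp;
};

struct model_perf_regs {
	struct model_pt_regs regs;
	uint64_t *xmm_regs;	/* NULL means "no XMM data captured" */
};

/* Models a fixed-format PEBS setup path that only fills the GPRs. */
static void model_setup_fixed_sample(struct model_perf_regs *pr,
				     uint64_t ip, uint64_t sp)
{
	assert(pr != NULL);
	pr->regs.ip = ip;
	pr->regs.sp = sp;
	/* Explicitly record that no extended registers were captured. */
	pr->xmm_regs = NULL;
}

/* Models the output side: copy XMM words only when a buffer is attached. */
static size_t model_output_xmm_words(const struct model_perf_regs *pr,
				     uint64_t *out, size_t nr_words)
{
	assert(pr != NULL && out != NULL);
	if (!pr->xmm_regs)
		return 0;
	for (size_t i = 0; i < nr_words; i++)
		out[i] = pr->xmm_regs[i];
	return nr_words;
}
```

With the reset in place, an on-stack model_perf_regs that previously held an arbitrary pointer yields zero XMM words after model_setup_fixed_sample(), instead of whatever the stack happened to contain.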
>
> @@ -1745,6 +1736,28 @@ static void x86_pmu_del(struct perf_event *event, int flags)
> static_call_cond(x86_pmu_del)(event);
> }
>
> +/*
> + * When both PERF_SAMPLE_REGS_INTR and PERF_SAMPLE_REGS_USER are set,
> + * an additional x86_perf_regs is required to save user-space registers.
> + * Without this, user-space register data may be overwritten by kernel-space
> + * registers.
> + */
> +static DEFINE_PER_CPU(struct x86_perf_regs, x86_user_regs);
> +static void x86_pmu_perf_get_regs_user(struct perf_sample_data *data,
> + struct pt_regs *regs)
> +{
> + struct x86_perf_regs *x86_regs_user = this_cpu_ptr(&x86_user_regs);
> + struct perf_regs regs_user;
> +
> + perf_get_regs_user(&regs_user, regs);
> + data->regs_user.abi = regs_user.abi;
> + if (regs_user.regs) {
> + x86_regs_user->regs = *regs_user.regs;
> + data->regs_user.regs = &x86_regs_user->regs;
> + } else
> + data->regs_user.regs = NULL;
> +}
Sashiko comments
"
Does this leave the xmm_regs pointer uninitialized across perf events?
x86_user_regs is a per-CPU variable. When x86_pmu_perf_get_regs_user()
copies the general-purpose registers, it doesn't clear or initialize the
xmm_regs pointer in the union.
If user_mask evaluates to 0 later, x86_pmu_update_xregs() is bypassed,
leaving a stale xmm_regs pointer from a previous perf event on the same CPU.
Could this lead to a use-after-free or information leak when
perf_output_sample_regs() dereferences it?
"
That makes sense. I will clear xmm_regs and the other pointers in the next version.
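A minimal user-space sketch of the stale-pointer concern, assuming a single static slot stands in for the per-CPU x86_user_regs. All model_* names and types are hypothetical simplifications, and the real fix may be structured differently. The slot outlives any one sample, so the helper drops the old extended-register pointer on every call before anything else can read it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel types; layouts are illustrative only. */
struct model_pt_regs {
	uint64_t ip;
	uint64_t sp;
};

struct model_perf_regs {
	struct model_pt_regs regs;
	uint64_t *xmm_regs;
};

/* Stands in for the per-CPU x86_user_regs: it persists across samples. */
static struct model_perf_regs model_user_regs;

/*
 * Drop any extended-register pointer left by a previous sample, then copy
 * the user GPRs into the persistent slot. Returns NULL when there are no
 * user registers to report.
 */
static struct model_pt_regs *model_get_regs_user(const struct model_pt_regs *user)
{
	model_user_regs.xmm_regs = NULL;
	if (!user)
		return NULL;
	model_user_regs.regs = *user;
	return &model_user_regs.regs;
}

/* A later step attaches XMM data only if it was actually captured. */
static void model_attach_xmm(uint64_t *buf)
{
	assert(buf != NULL);
	model_user_regs.xmm_regs = buf;
}
```

In this model a sample that never reaches model_attach_xmm() reports no XMM data at all, rather than inheriting the buffer attached by an earlier event.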
> +
> static void x86_pmu_setup_gpregs_data(struct perf_event *event,
> struct perf_sample_data *data,
> struct pt_regs *regs)
> @@ -1757,7 +1770,14 @@ static void x86_pmu_setup_gpregs_data(struct perf_event *event,
> data->regs_user.abi = perf_reg_abi(current);
> data->regs_user.regs = regs;
> } else if (!(current->flags & PF_KTHREAD)) {
> - perf_get_regs_user(&data->regs_user, regs);
> + /*
> + * It cannot guarantee that the kernel will never
> + * touch the registers outside of the pt_regs,
> + * especially when more and more registers
> + * (e.g., SIMD, eGPR) are added. The live data
> + * cannot be used.
> + */
> + x86_pmu_perf_get_regs_user(data, regs);
> } else {
> data->regs_user.abi = PERF_SAMPLE_REGS_ABI_NONE;
> data->regs_user.regs = NULL;
> @@ -1800,6 +1820,43 @@ static inline void x86_pmu_update_xregs(struct x86_perf_regs *perf_regs,
> perf_regs->xmm_space = xsave->i387.xmm_space;
> }
>
> +/*
> + * This function retrieves cached user-space fpu registers (XMM/YMM/ZMM).
> + * If TIF_NEED_FPU_LOAD is set, it indicates that the user-space FPU state
> + * is cached. Otherwise, the data should be read directly from the hardware
> + * registers.
> + */
> +static inline u64 x86_pmu_update_user_xregs(struct perf_sample_data *data,
> + u64 mask, u64 ignore_mask)
> +{
> + struct x86_perf_regs *perf_regs;
> + struct xregs_state *xsave;
> + struct fpu *fpu;
> + struct fpstate *fps;
> +
> + if (data->regs_user.abi == PERF_SAMPLE_REGS_ABI_NONE)
> + return 0;
> +
> + if (test_thread_flag(TIF_NEED_FPU_LOAD)) {
> + perf_regs = container_of(data->regs_user.regs,
> + struct x86_perf_regs, regs);
> + fpu = x86_task_fpu(current);
> + /*
> + * If __task_fpstate is set, it holds the right pointer,
> + * otherwise fpstate will.
> + */
> + fps = READ_ONCE(fpu->__task_fpstate);
> + if (!fps)
> + fps = fpu->fpstate;
> + xsave = &fps->regs.xsave;
> +
> + x86_pmu_update_xregs(perf_regs, xsave, mask);
> + return 0;
> + }
> +
> + return mask & ~ignore_mask;
> +}
Sashiko comments
"
If a PEBS event triggers in kernel space and TIF_NEED_FPU_LOAD is false,
PEBS has captured the kernel XMM registers, and the drain code sets
ignore_mask = XFEATURE_MASK_SSE.
When requesting user XMM registers, user_mask evaluates to 0 here because
of ignore_mask. Does this incorrectly suppress the xsaves_nmi fallback,
silently dropping the live user XMM registers?
This also seems to leave the xmm_regs pointer uninitialized, leading to
the use-after-free mentioned above.
"
This makes sense as well. ignore_mask needs to be cleared for user-space
sampling when the PEBS sample hits kernel space instead of user space. Thanks.
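The mask decision can be sketched in isolation as follows. This is a hedged user-space model, not the kernel code: model_user_xsave_mask(), its boolean parameters and MODEL_MASK_SSE are hypothetical names, with fpu_state_cached standing in for TIF_NEED_FPU_LOAD and pebs_hit_user for "the PEBS record was taken in user mode".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative feature bit; stands in for XFEATURE_MASK_SSE. */
#define MODEL_MASK_SSE (1ULL << 1)

/*
 * Return the set of features that still has to be read from hardware for
 * the user-space sample.
 *
 * mask:             features requested by the event
 * ignore_mask:      features already captured by PEBS
 * pebs_hit_user:    whether the PEBS record was taken in user mode
 * fpu_state_cached: models TIF_NEED_FPU_LOAD being set
 */
static uint64_t model_user_xsave_mask(uint64_t mask, uint64_t ignore_mask,
				      bool pebs_hit_user, bool fpu_state_cached)
{
	/* The cached task FPU state already holds the user values. */
	if (fpu_state_cached)
		return 0;
	/* PEBS data describes kernel registers here, so it cannot be reused. */
	if (!pebs_hit_user)
		ignore_mask = 0;
	return mask & ~ignore_mask;
}

/* Usage: a kernel-mode PEBS hit with live FPU state still needs a read. */
static void model_example(void)
{
	assert(model_user_xsave_mask(MODEL_MASK_SSE, MODEL_MASK_SSE,
				     false, false) == MODEL_MASK_SSE);
	assert(model_user_xsave_mask(MODEL_MASK_SSE, MODEL_MASK_SSE,
				     true, false) == 0);
}
```

The first assertion in model_example() is the case raised in the review: without clearing ignore_mask, the result would be 0 and the hardware read for the user registers would be skipped.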
> +
> static void x86_pmu_sample_xregs(struct perf_event *event,
> struct perf_sample_data *data,
> u64 ignore_mask)
> @@ -1807,6 +1864,7 @@ static void x86_pmu_sample_xregs(struct perf_event *event,
> struct xregs_state *xsave = per_cpu(ext_regs_buf, smp_processor_id());
> u64 sample_type = event->attr.sample_type;
> struct x86_perf_regs *perf_regs;
> + u64 user_mask = 0;
> u64 intr_mask = 0;
> u64 mask = 0;
>
> @@ -1817,15 +1875,26 @@ static void x86_pmu_sample_xregs(struct perf_event *event,
> mask |= XFEATURE_MASK_SSE;
>
> mask &= x86_pmu.ext_regs_mask;
> + if ((sample_type & PERF_SAMPLE_REGS_USER) && data->regs_user.abi)
> + user_mask = x86_pmu_update_user_xregs(data, mask, ignore_mask);
>
> if ((sample_type & PERF_SAMPLE_REGS_INTR) && data->regs_intr.abi)
> intr_mask = mask & ~ignore_mask;
>
> + if (user_mask | intr_mask) {
> + xsave->header.xfeatures = 0;
> + xsaves_nmi(xsave, user_mask | intr_mask);
> + }
> +
> + if (user_mask) {
> + perf_regs = container_of(data->regs_user.regs,
> + struct x86_perf_regs, regs);
> + x86_pmu_update_xregs(perf_regs, xsave, user_mask);
> + }
> +
> if (intr_mask) {
> perf_regs = container_of(data->regs_intr.regs,
> struct x86_perf_regs, regs);
> - xsave->header.xfeatures = 0;
> - xsaves_nmi(xsave, mask);
> x86_pmu_update_xregs(perf_regs, xsave, intr_mask);
> }
> }