From: "Mi, Dapeng" <dapeng1.mi@linux.intel.com>
To: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Namhyung Kim <namhyung@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
Ian Rogers <irogers@google.com>,
Adrian Hunter <adrian.hunter@intel.com>,
Jiri Olsa <jolsa@kernel.org>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Andi Kleen <ak@linux.intel.com>,
Eranian Stephane <eranian@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>,
broonie@kernel.org, Ravi Bangoria <ravi.bangoria@amd.com>,
linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
Zide Chen <zide.chen@intel.com>,
Falcon Thomas <thomas.falcon@intel.com>,
Dapeng Mi <dapeng1.mi@intel.com>,
Xudong Hao <xudong.hao@intel.com>,
Kan Liang <kan.liang@linux.intel.com>
Subject: Re: [Patch v7 20/24] perf/x86: Enable SSP sampling using sample_regs_* fields
Date: Wed, 25 Mar 2026 17:25:12 +0800
Message-ID: <e46b208e-3858-4444-b892-60f51eb4e27f@linux.intel.com>
In-Reply-To: <20260324004118.3772171-21-dapeng1.mi@linux.intel.com>
On 3/24/2026 8:41 AM, Dapeng Mi wrote:
> From: Kan Liang <kan.liang@linux.intel.com>
>
> This patch enables sampling of CET SSP register via the sample_regs_*
> fields.
>
> To sample SSP, the sample_simd_regs_enabled field must be set. This
> allows the spare space (reclaimed from the original XMM space) in the
> sample_regs_* fields to be used for representing SSP.
>
> Similar to eGPRs sampling, the perf_reg_value() function needs to
> check if the PERF_SAMPLE_REGS_ABI_SIMD flag is set first, and then
> determine whether to output SSP or legacy XMM registers to userspace.
>
> Additionally, arch-PEBS supports sampling SSP, which is placed into the
> GPRs group. This patch also enables arch-PEBS-based SSP sampling.
>
> Currently, SSP sampling is only supported on the x86_64 architecture, as
> CET is only available on x86_64 platforms.
>
> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
> Co-developed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> ---
> arch/x86/events/core.c | 9 +++++++++
> arch/x86/events/intel/ds.c | 8 ++++++++
> arch/x86/events/perf_event.h | 10 ++++++++++
> arch/x86/include/asm/perf_event.h | 4 ++++
> arch/x86/include/uapi/asm/perf_regs.h | 7 ++++---
> arch/x86/kernel/perf_regs.c | 5 +++++
> 6 files changed, 40 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index d33cfbe38573..ea451b48b9d6 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -712,6 +712,10 @@ int x86_pmu_hw_config(struct perf_event *event)
> if (event_needs_egprs(event) &&
> !(x86_pmu.ext_regs_mask & XFEATURE_MASK_APX))
> return -EINVAL;
> + if (event_needs_ssp(event) &&
> + !(x86_pmu.ext_regs_mask & XFEATURE_MASK_CET_USER))
> + return -EINVAL;
> +
> /* Not require any vector registers but set width */
> if (event->attr.sample_simd_vec_reg_qwords &&
> !event->attr.sample_simd_vec_reg_intr &&
> @@ -1871,6 +1875,7 @@ inline void x86_pmu_clear_perf_regs(struct pt_regs *regs)
> perf_regs->h16zmm_regs = NULL;
> perf_regs->opmask_regs = NULL;
> perf_regs->egpr_regs = NULL;
> + perf_regs->cet_regs = NULL;
> }
>
> static inline void x86_pmu_update_xregs(struct x86_perf_regs *perf_regs,
> @@ -1896,6 +1901,8 @@ static inline void x86_pmu_update_xregs(struct x86_perf_regs *perf_regs,
> perf_regs->opmask = get_xsave_addr(xsave, XFEATURE_OPMASK);
> if (mask & XFEATURE_MASK_APX)
> perf_regs->egpr = get_xsave_addr(xsave, XFEATURE_APX);
> + if (mask & XFEATURE_MASK_CET_USER)
> + perf_regs->cet = get_xsave_addr(xsave, XFEATURE_CET_USER);
> }
>
> /*
> @@ -1961,6 +1968,8 @@ static void x86_pmu_sample_xregs(struct perf_event *event,
> mask |= XFEATURE_MASK_OPMASK;
> if (event_needs_egprs(event))
> mask |= XFEATURE_MASK_APX;
> + if (event_needs_ssp(event))
> + mask |= XFEATURE_MASK_CET_USER;
>
> mask &= x86_pmu.ext_regs_mask;
> if ((sample_type & PERF_SAMPLE_REGS_USER) && data->regs_user.abi)
> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
> index ac9a1c2f0177..3a2fb623e0ab 100644
> --- a/arch/x86/events/intel/ds.c
> +++ b/arch/x86/events/intel/ds.c
> @@ -2685,6 +2685,14 @@ static void setup_arch_pebs_sample_data(struct perf_event *event,
> __setup_pebs_gpr_group(event, data, regs,
> (struct pebs_gprs *)gprs,
> sample_type);
> +
> + /* Currently only user space mode enables SSP. */
> + if (user_mode(regs) && (sample_type &
> + (PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER))) {
> + /* Point to r15 so that cet_regs[1] = ssp. */
> + perf_regs->cet_regs = &gprs->r15;
Sashiko comments
"
Is this relying on undefined behavior?
Treating the scalar struct member r15 as an array and accessing it via
cet_regs[1] can cause compilers with strict object bounds checking
(like -fsanitize=bounds) to trap. It also creates a brittle coupling
between the memory layout of struct arch_pebs_gprs and struct
cet_user_state.
"
OK. I will remove the cet_regs pointer and instead introduce a "u64 *ssp"
pointer that records the SSP address directly.
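A minimal userspace sketch of that direction, not the actual kernel change.
All demo_* names are hypothetical stand-ins: demo_pebs_gprs models only the
tail of the arch-PEBS GPR group, demo_cet_user_state models the two-u64
layout of struct cet_user_state, and demo_perf_regs models only the new
field. The point is that each producer stores the address of its own SSP
member, so the reader never indexes past a scalar and no longer depends on
the two layouts lining up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the tail of the arch-PEBS GPR group. */
struct demo_pebs_gprs {
	uint64_t r14;
	uint64_t r15;
	uint64_t ssp;
};

/* Hypothetical stand-in for the XSAVE CET user state (two u64 fields). */
struct demo_cet_user_state {
	uint64_t user_cet;
	uint64_t user_ssp;
};

/* Hypothetical stand-in for the relevant part of struct x86_perf_regs. */
struct demo_perf_regs {
	uint64_t *ssp;	/* NULL when SSP was not captured */
};

/* arch-PEBS path: point at the ssp member itself, not at r15. */
static inline void demo_set_ssp_from_pebs(struct demo_perf_regs *pr,
					  struct demo_pebs_gprs *gprs)
{
	assert(pr && gprs);
	pr->ssp = &gprs->ssp;
}

/* XSAVE path: point at user_ssp inside the CET user state. */
static inline void demo_set_ssp_from_xsave(struct demo_perf_regs *pr,
					   struct demo_cet_user_state *cet)
{
	assert(pr && cet);
	pr->ssp = &cet->user_ssp;
}

/* Reader side: a plain dereference, returning 0 when SSP is unavailable. */
static inline uint64_t demo_read_ssp(const struct demo_perf_regs *pr)
{
	return pr->ssp ? *pr->ssp : 0;
}
```

With this shape the reader in perf_reg_value() would reduce to a NULL check
plus a dereference, and neither producer needs to know the other's layout.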
> + ignore_mask = XFEATURE_MASK_CET_USER;
Sashiko comments
"
Should this be a bitwise OR (ignore_mask |= XFEATURE_MASK_CET_USER)?
Since setup_arch_pebs_sample_data() processes PEBS fragments in a loop,
overwriting ignore_mask with '=' instead of '|=' might lose previously set
bits from earlier fragments, such as XFEATURE_MASK_SSE from the xmm block.
This could cause x86_pmu_setup_regs_data() to unnecessarily read registers
from XSAVE and provide stale sample data.
"
In theory, this should not happen, since there should be only one GPRs
group even when a record spans multiple PEBS fragments. But for
consistency, I will follow the suggestion and use '|='. Thanks.
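For illustration only, a small userspace sketch of the difference between
the two forms. The demo_* names and bit positions are assumptions made for
this example, and the group sequence is hypothetical rather than a claim
about what the hardware emits; it only shows why accumulating is the safer
pattern when the mask is updated from more than one place in a loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit positions; assumptions for this sketch only. */
#define DEMO_MASK_SSE		(1ULL << 1)
#define DEMO_MASK_CET_USER	(1ULL << 11)

/* Hypothetical record groups, each of which supplies one register set. */
enum demo_group { DEMO_GROUP_XMM, DEMO_GROUP_GPR };

/* Mask of the XSAVE component that a given group makes redundant. */
static inline uint64_t demo_group_mask(enum demo_group g)
{
	return g == DEMO_GROUP_XMM ? DEMO_MASK_SSE : DEMO_MASK_CET_USER;
}

/* Plain assignment: only the bit from the last group seen survives. */
static inline uint64_t demo_ignore_mask_assign(const enum demo_group *groups,
					       size_t n)
{
	uint64_t ignore_mask = 0;

	assert(groups || n == 0);
	for (size_t i = 0; i < n; i++)
		ignore_mask = demo_group_mask(groups[i]);
	return ignore_mask;
}

/* Bitwise OR: bits from every group seen so far are kept. */
static inline uint64_t demo_ignore_mask_or(const enum demo_group *groups,
					   size_t n)
{
	uint64_t ignore_mask = 0;

	assert(groups || n == 0);
	for (size_t i = 0; i < n; i++)
		ignore_mask |= demo_group_mask(groups[i]);
	return ignore_mask;
}
```

If an XMM group were processed before the GPR group, the assigning variant
would end with only the CET_USER bit set, while the OR variant keeps both.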
> + }
> }
>
> if (header->aux) {
> diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
> index 0974fd8b0e20..36688d28407f 100644
> --- a/arch/x86/events/perf_event.h
> +++ b/arch/x86/events/perf_event.h
> @@ -197,6 +197,16 @@ static inline bool event_needs_egprs(struct perf_event *event)
> return false;
> }
>
> +static inline bool event_needs_ssp(struct perf_event *event)
> +{
> + if (event->attr.sample_simd_regs_enabled &&
> + (event->attr.sample_regs_user & BIT_ULL(PERF_REG_X86_SSP) ||
> + event->attr.sample_regs_intr & BIT_ULL(PERF_REG_X86_SSP)))
> + return true;
> +
> + return false;
> +}
> +
> struct amd_nb {
> int nb_id; /* NorthBridge id */
> int refcnt; /* reference count */
> diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
> index a54ea8fa6a04..0c6d58e6c98f 100644
> --- a/arch/x86/include/asm/perf_event.h
> +++ b/arch/x86/include/asm/perf_event.h
> @@ -751,6 +751,10 @@ struct x86_perf_regs {
> u64 *egpr_regs;
> struct apx_state *egpr;
> };
> + union {
> + u64 *cet_regs;
> + struct cet_user_state *cet;
> + };
> };
>
> extern unsigned long perf_arch_instruction_pointer(struct pt_regs *regs);
> diff --git a/arch/x86/include/uapi/asm/perf_regs.h b/arch/x86/include/uapi/asm/perf_regs.h
> index e721a47556d4..98a5b6c8e24c 100644
> --- a/arch/x86/include/uapi/asm/perf_regs.h
> +++ b/arch/x86/include/uapi/asm/perf_regs.h
> @@ -28,10 +28,10 @@ enum perf_event_x86_regs {
> PERF_REG_X86_R14,
> PERF_REG_X86_R15,
> /*
> - * The eGPRs and XMM have overlaps. Only one can be used
> + * The eGPRs/SSP and XMM have overlaps. Only one can be used
> * at a time. The ABI PERF_SAMPLE_REGS_ABI_SIMD is used to
> * distinguish which one is used. If PERF_SAMPLE_REGS_ABI_SIMD
> - * is set, then eGPRs is used, otherwise, XMM is used.
> + * is set, then eGPRs/SSP is used, otherwise, XMM is used.
> *
> * Extended GPRs (eGPRs)
> */
> @@ -51,10 +51,11 @@ enum perf_event_x86_regs {
> PERF_REG_X86_R29,
> PERF_REG_X86_R30,
> PERF_REG_X86_R31,
> + PERF_REG_X86_SSP,
> /* These are the limits for the GPRs. */
> PERF_REG_X86_32_MAX = PERF_REG_X86_GS + 1,
> PERF_REG_X86_64_MAX = PERF_REG_X86_R15 + 1,
> - PERF_REG_MISC_MAX = PERF_REG_X86_R31 + 1,
> + PERF_REG_MISC_MAX = PERF_REG_X86_SSP + 1,
>
> /* These all need two bits set because they are 128bit */
> PERF_REG_X86_XMM0 = 32,
> diff --git a/arch/x86/kernel/perf_regs.c b/arch/x86/kernel/perf_regs.c
> index a34cc52dbbeb..9715d1f90313 100644
> --- a/arch/x86/kernel/perf_regs.c
> +++ b/arch/x86/kernel/perf_regs.c
> @@ -70,6 +70,11 @@ u64 perf_reg_value(struct pt_regs *regs, int idx)
> return 0;
> return perf_regs->egpr_regs[idx - PERF_REG_X86_R16];
> }
> + if (idx == PERF_REG_X86_SSP) {
> + if (!perf_regs->cet_regs)
> + return 0;
> + return perf_regs->cet_regs[1];
> + }
> } else {
> if (idx >= PERF_REG_X86_XMM0 && idx < PERF_REG_X86_XMM_MAX) {
> if (!perf_regs->xmm_regs)