From: "Mi, Dapeng" <dapeng1.mi@linux.intel.com>
To: Ian Rogers <irogers@google.com>, sashiko-reviews@lists.linux.dev
Cc: linux-perf-users@vger.kernel.org
Subject: Re: [Patch v8 2/5] perf regs: Support x86 eGPRs/SSP sampling
Date: Tue, 2 Jun 2026 13:12:50 +0800 [thread overview]
Message-ID: <bcc73423-2a00-4b2c-afec-83c70e28f023@linux.intel.com> (raw)
In-Reply-To: <CAP-5=fWugapi9c-HWx9acm=MzyE=hb0muYwj3w86774iOBnvWw@mail.gmail.com>
On 5/30/2026 1:29 AM, Ian Rogers wrote:
> On Fri, May 29, 2026 at 3:08 AM <sashiko-bot@kernel.org> wrote:
>> Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
>> - [Medium] Intel PT PEBS register synthesis is silently disabled on SIMD-capable machines.
>> - [Medium] Requesting legacy XMM registers (e.g., `-I XMM0`) fails with an 'Unknown register' error on modern CPUs, breaking backwards compatibility.
>> --
>>
>> commit a0d7b85bb399d047c94284c24c34805466f84209
>> Author: Dapeng Mi <dapeng1.mi@linux.intel.com>
>> Subject: perf regs: Support x86 eGPRs/SSP sampling
>>
>> This commit adds support for sampling x86 extended GP registers (R16-R31) and
>> the shadow stack pointer (SSP) register. It reclaims the original XMM registers
>> space in sample_regs_user/sample_regs_intr to represent the eGPRs and SSP when
>> SIMD sampling is supported, using a new abi argument to distinguish layouts.
> Thanks Dapeng! I think stylistically this is exactly what I was hoping for!
>
> When I try to apply the patch on perf-tools-next it is failing, on
> this patch with a conflict on:
> ```
> --- tools/perf/builtin-inject.c
> +++ tools/perf/builtin-inject.c
> @@ -457,6 +457,8 @@ static int
> perf_event__convert_sample_callchain(const struct perf_tool *tool,
> /* adjust sample size for stack and regs */
> sample_size -= sample->user_stack.size;
> sample_size -= (hweight64(evsel->core.attr.sample_regs_user) + 1) *
> sizeof(u64);
> + if (sample->user_regs && sample->user_regs->abi & PERF_SAMPLE_REGS_ABI_SIMD)
> + sample_size -= 4 * sizeof(u64); /* Reduce SIMD regs header size */
> sample_size += (sample->callchain->nr + 1) * sizeof(u64);
> event_copy->header.size = sample_size;
> ```
> This is happening because commit daac18e7c42c ("perf inject: Fix
> itrace branch stack synthesis") switched the size logic to use
> perf_event__sample_event_size, which I believe accurately accounts for
> the new layout and so it is safe to drop this part of the change.
Sure. I would drop it in next version. This pachset is based on top of
Peter's perf/core tree and I forgot to rebase it on the perf-tools-next
tree. :(
>
>>> diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
>>> index fc9eec8b54b82..2729ad8c6d263 100644
>>> --- a/tools/perf/util/intel-pt.c
>>> +++ b/tools/perf/util/intel-pt.c
>>> @@ -2470,6 +2470,7 @@ static int intel_pt_do_synth_pebs_sample(struct intel_pt_queue *ptq, struct evse
>>> }
>>>
>>> if (sample_type & PERF_SAMPLE_REGS_INTR &&
>>> + !evsel->core.attr.sample_simd_regs_enabled &&
>>> (items->mask[INTEL_PT_GP_REGS_POS] ||
>>> items->mask[INTEL_PT_XMM_POS])) {
>>> u64 regs_mask = evsel->core.attr.sample_regs_intr;
>> [Severity: Medium]
>> Does this completely disable PEBS register synthesis on SIMD-capable machines?
>>
>> Since PERF_SAMPLE_REGS_ABI_SIMD is set by default on supported platforms in
>> __perf_reg_mask_x86(), opts->sample_simd_regs_enabled will be set to 1 for
>> any register request (like -I ax).
>>
>> Could this cause basic GPR synthesis to be silently skipped here, resulting in
>> a functional regression for userspace tooling?
> I don't think functionality is lost and presumably the driver is
> ensuring PEBS sampled registers are preferred over say pt_regs from
> the interrupt.
Strictly speaking, PEBS_VIA_PT feature is only enabled for legacy PEBS now,
so sample_simd_regs_enabled won't be enabled on such kind of supporting
legacy PEBS platforms and then there is no any functionality lost.
Thanks.
>
>> [ ... ]
>>
>>> diff --git a/tools/perf/util/perf-regs-arch/perf_regs_x86.c b/tools/perf/util/perf-regs-arch/perf_regs_x86.c
>>> index b6d20522b4e8f..ae26d991cdc9f 100644
>>> --- a/tools/perf/util/perf-regs-arch/perf_regs_x86.c
>>> +++ b/tools/perf/util/perf-regs-arch/perf_regs_x86.c
>>> @@ -276,13 +276,38 @@ uint64_t __perf_reg_mask_x86(bool intr)
>> [ ... ]
>>> +uint64_t __perf_reg_mask_x86(bool intr, int *abi)
>>> +{
>> [ ... ]
>>> + if (mask != PERF_REGS_MASK) {
>>> + *abi |= PERF_SAMPLE_REGS_ABI_SIMD;
>>> + } else {
>>> + mask |= __arch__reg_mask(sample_type, PERF_REG_EXTENDED_MASK,
>>> + false);
>>> }
>>>
>>> - return PERF_REGS_MASK;
>>> + return mask;
>>> }
>> [ ... ]
>>> +const char *__perf_reg_name_x86(int id, int abi)
>>> +{
>>> + const char *name;
>>> +
>>> + name = __arch_reg_gpr_name(id);
>>> + if (name)
>>> + return name;
>>> +
>>> + if (abi & PERF_SAMPLE_REGS_ABI_SIMD)
>>> + name = __arch_reg_egpr_name(id);
>>> + else
>>> + name = __arch_reg_xmm_name(id);
>>> +
>>> + return name;
>>> +}
>> [Severity: Medium]
>> Does this break backwards compatibility for legacy XMM register requests?
>>
>> When the CPU supports eGPRs/SSP, PERF_SAMPLE_REGS_ABI_SIMD is set and the
>> legacy XMM registers (PERF_REG_EXTENDED_MASK) are excluded from the returned
>> mask because they share the same bit space.
>>
>> If __perf_reg_name_x86() maps these overlapping IDs exclusively to eGPR names
>> (for instance, mapping ID 32 to R16), will name_to_perf_reg_mask() fail to
>> resolve legacy requests like -I XMM0 and throw an unknown register error?
> So with PERF_SAMPLE_REGS_ABI_SIMD XMM0 is no longer part of the
> register mask and thus name_to_perf_reg_mask shouldn't set a bit for
> it. The next patch adjusts __parse_regs to encode vector registers in
> the new space within perf_event_attr.
>
> Thanks,
> Ian
>
>> --
>> Sashiko AI review · https://sashiko.dev/#/patchset/20260529082451.591783-1-dapeng1.mi@linux.intel.com?part=2
>>
next prev parent reply other threads:[~2026-06-02 5:12 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-29 8:24 [Patch v8 0/5] Perf tools: Support eGPRs/SSP/SIMD registers sampling Dapeng Mi
2026-05-29 8:24 ` [Patch v8 1/5] perf headers: Sync perf_event.h/perf_regs.h with the kernel headers Dapeng Mi
2026-05-29 9:23 ` sashiko-bot
2026-05-29 16:59 ` Ian Rogers
2026-06-02 2:30 ` Mi, Dapeng
2026-05-29 8:24 ` [Patch v8 2/5] perf regs: Support x86 eGPRs/SSP sampling Dapeng Mi
2026-05-29 10:06 ` sashiko-bot
2026-05-29 17:29 ` Ian Rogers
2026-06-02 5:12 ` Mi, Dapeng [this message]
2026-05-29 8:24 ` [Patch v8 3/5] perf regs: Support x86 SIMD registers sampling Dapeng Mi
2026-05-29 10:45 ` sashiko-bot
2026-05-29 17:41 ` Ian Rogers
2026-06-02 5:38 ` Mi, Dapeng
2026-06-02 6:35 ` Mi, Dapeng
2026-05-29 8:24 ` [Patch v8 4/5] perf regs: Enable dumping of SIMD registers Dapeng Mi
2026-05-29 11:23 ` sashiko-bot
2026-05-29 17:45 ` Ian Rogers
2026-06-02 5:45 ` Mi, Dapeng
2026-05-29 8:24 ` [Patch v8 5/5] perf dwarf-regs: Add SIMD/eGPRs support for x86 DWARF registers Dapeng Mi
2026-05-29 17:56 ` Ian Rogers
2026-06-02 6:04 ` Mi, Dapeng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=bcc73423-2a00-4b2c-afec-83c70e28f023@linux.intel.com \
--to=dapeng1.mi@linux.intel.com \
--cc=irogers@google.com \
--cc=linux-perf-users@vger.kernel.org \
--cc=sashiko-reviews@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox