From: "Liang, Kan" <kan.liang@linux.intel.com>
To: "Mi, Dapeng" <dapeng1.mi@linux.intel.com>,
Peter Zijlstra <peterz@infradead.org>
Cc: mingo@redhat.com, acme@kernel.org, namhyung@kernel.org,
tglx@linutronix.de, dave.hansen@linux.intel.com,
irogers@google.com, adrian.hunter@intel.com, jolsa@kernel.org,
alexander.shishkin@linux.intel.com, linux-kernel@vger.kernel.org,
ak@linux.intel.com, zide.chen@intel.com, mark.rutland@arm.com,
broonie@kernel.org, ravi.bangoria@amd.com, eranian@google.com
Subject: Re: [PATCH V3 05/17] perf/x86: Support XMM register for non-PEBS and REGS_USER
Date: Wed, 20 Aug 2025 11:03:44 -0700 [thread overview]
Message-ID: <53aebb23-2520-4016-ae71-18b66dec452f@linux.intel.com> (raw)
In-Reply-To: <47cbf0f3-9480-497a-ad88-5eb0c50dd88a@linux.intel.com>
On 2025-08-20 2:46 a.m., Mi, Dapeng wrote:
>
> On 8/19/2025 11:55 PM, Liang, Kan wrote:
>>
>> On 2025-08-19 6:39 a.m., Peter Zijlstra wrote:
>>> On Fri, Aug 15, 2025 at 02:34:23PM -0700, kan.liang@linux.intel.com wrote:
>>>> From: Kan Liang <kan.liang@linux.intel.com>
>>>>
>>>> Collecting the XMM registers in a PEBS record has been supported since
>>>> the Icelake. But non-PEBS events don't support the feature. It's
>>>> possible to retrieve the XMM registers from the XSAVE for non-PEBS.
>>>> Add it to make the feature complete.
>>>>
>>>> To utilize the XSAVE, a 64-byte aligned buffer is required. Add a
>>>> per-CPU ext_regs_buf to store the vector registers. The size of the
>>>> buffer is ~2K. kzalloc_node() is used because there's a _guarantee_
>>>> that all kmalloc()'s with powers of 2 are naturally aligned and also
>>>> 64b aligned.
>>>>
>>>> Extend the support for both REGS_USER and REGS_INTR. For REGS_USER, the
>>>> perf_get_regs_user() returns the regs from the task_pt_regs(current),
>>>> which is struct pt_regs. Need to move it to local struct x86_perf_regs
>>>> x86_user_regs.
>>>> For PEBS, the HW support is still preferred. The XMM should be retrieved
>>>> from PEBS records.
>>>>
>>>> There could be more vector registers supported later. Add ext_regs_mask
>>>> to track the supported vector register group.
>>>
>>> I'm a little confused... *again* :-)
>>>
>>> Specifically, we should consider two sets of registers:
>>>
>>> - the live set, as per the CPU (XSAVE)
>>> - the stored set, as per x86_task_fpu()
>>>
>>> regs_intr should always get a copy of the live set; however
>>> regs_user should not. It might need a copy of the x86_task_fpu() instead
>>> of the live set, depending on TIF_NEED_FPU_LOAD (more or less, we need
>>> another variable set in kernel_fpu_begin_mask() *after*
>>> save_fpregs_to_fpstate() is completed).
>>>
>>> I don't see this code make this distinction.
>>>
>>> Consider getting a sample while the kernel is doing some avx enhanced
>>> crypto and such.
>> The regs_user only needs a set when the NMI hits the user mode
>> (user_mode(regs)) or a non-kernel thread (!(current->flags &
>> PF_KTHREAD)). The live set is good enough for both cases.
>
> It's fine if NMI hits user mode, but if NMI hits the kernel mode
> (!(current->flags &PF_KTHREAD)), won't the kernel space SIMD/eGPR regs be
> exposed to user space for user-regs option? I'm not sure if kernel space
> really use these SIMD/eGPR regs right now, but it seems a risk.
>
>
I don't think it's possible for the existing kernel. But I cannot
guarantee future usage.
If the kernel mode handling is still a concern, I think we should drop
the SIMD/eGPR regs for the case for now. Because
- To profile a userspace application which requires SIMD/eGPR regs, the
NMI usually hits the usersapce. It's not common to hit the kernel mode.
- The SIMD/eGPR cannot be retrieved from the task_pt_regs(). Although
it's possible to retrieve the values when the TIF_NEED_FPU_LOAD flag is
set, I don't think it's worth introducing such complexity to handle an
uncommon case in the critical path.
- Furthermore, only checking the TIF_NEED_FPU_LOAD flag cannot cure
everything. Some corner cases cannot be handled either. For example, an
NMI can happen when the flag just switched, but still in the kernel mode.
We can always add the support later if someone thinks it's important to
retrieve the user SIMD/eGPR regs during the kernel syscall.
Thanks,
Kan
>>
>> I think the kernel crypto should be to a kernel thread (current->flags &
>> PF_KTHREAD). If so, the regs_user should return NULL.
>>
>> Thanks,
>> Kan
>>
>
next prev parent reply other threads:[~2025-08-20 18:03 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-15 21:34 [PATCH V3 00/17] Support vector and more extended registers in perf kan.liang
2025-08-15 21:34 ` [PATCH V3 01/17] perf/x86: Use x86_perf_regs in the x86 nmi handler kan.liang
2025-08-15 21:34 ` [PATCH V3 02/17] perf/x86: Setup the regs data kan.liang
2025-08-15 21:34 ` [PATCH V3 03/17] x86/fpu/xstate: Add xsaves_nmi kan.liang
2025-08-15 21:34 ` [PATCH V3 04/17] perf: Move has_extended_regs() to header file kan.liang
2025-08-15 21:34 ` [PATCH V3 05/17] perf/x86: Support XMM register for non-PEBS and REGS_USER kan.liang
2025-08-19 13:39 ` Peter Zijlstra
2025-08-19 15:55 ` Liang, Kan
2025-08-20 9:46 ` Mi, Dapeng
2025-08-20 18:03 ` Liang, Kan [this message]
2025-08-21 1:00 ` Mi, Dapeng
2025-08-15 21:34 ` [PATCH V3 06/17] perf: Support SIMD registers kan.liang
2025-08-20 9:55 ` Mi, Dapeng
2025-08-20 18:08 ` Liang, Kan
2025-08-15 21:34 ` [PATCH V3 07/17] perf/x86: Move XMM to sample_simd_vec_regs kan.liang
2025-08-15 21:34 ` [PATCH V3 08/17] perf/x86: Add YMM into sample_simd_vec_regs kan.liang
2025-08-20 9:59 ` Mi, Dapeng
2025-08-20 18:10 ` Liang, Kan
2025-08-15 21:34 ` [PATCH V3 09/17] perf/x86: Add ZMM " kan.liang
2025-08-15 21:34 ` [PATCH V3 10/17] perf/x86: Add OPMASK into sample_simd_pred_reg kan.liang
2025-08-15 21:34 ` [PATCH V3 11/17] perf/x86: Add eGPRs into sample_regs kan.liang
2025-08-20 10:01 ` Mi, Dapeng
2025-08-15 21:34 ` [PATCH V3 12/17] perf/x86: Add SSP " kan.liang
2025-08-15 21:34 ` [PATCH V3 13/17] perf/x86/intel: Enable PERF_PMU_CAP_SIMD_REGS kan.liang
2025-08-15 21:34 ` [POC PATCH 14/17] perf/x86/regs: Only support legacy regs for the PT and PERF_REGS_MASK for now kan.liang
2025-08-25 9:07 ` Adrian Hunter
2025-08-15 21:34 ` [POC PATCH 15/17] tools headers: Sync with the kernel sources kan.liang
2025-08-15 21:34 ` [POC PATCH 16/17] perf parse-regs: Support the new SIMD format kan.liang
2025-08-20 10:04 ` Mi, Dapeng
2025-08-20 18:18 ` Liang, Kan
2025-08-21 3:35 ` Mi, Dapeng
2025-08-15 21:34 ` [POC PATCH 17/17] perf regs: Support the PERF_SAMPLE_REGS_ABI_SIMD kan.liang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=53aebb23-2520-4016-ae71-18b66dec452f@linux.intel.com \
--to=kan.liang@linux.intel.com \
--cc=acme@kernel.org \
--cc=adrian.hunter@intel.com \
--cc=ak@linux.intel.com \
--cc=alexander.shishkin@linux.intel.com \
--cc=broonie@kernel.org \
--cc=dapeng1.mi@linux.intel.com \
--cc=dave.hansen@linux.intel.com \
--cc=eranian@google.com \
--cc=irogers@google.com \
--cc=jolsa@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mark.rutland@arm.com \
--cc=mingo@redhat.com \
--cc=namhyung@kernel.org \
--cc=peterz@infradead.org \
--cc=ravi.bangoria@amd.com \
--cc=tglx@linutronix.de \
--cc=zide.chen@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.