linux-kernel.vger.kernel.org archive mirror
* [RFC PATCH 00/12] Support vector and more extended registers in perf
@ 2025-06-13 13:49 kan.liang
From: kan.liang @ 2025-06-13 13:49 UTC
  To: peterz, mingo, acme, namhyung, tglx, dave.hansen, irogers,
	adrian.hunter, jolsa, alexander.shishkin, linux-kernel
  Cc: dapeng1.mi, ak, zide.chen, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

Starting with Intel Ice Lake, the XMM registers can be collected in
a PEBS record. More registers, e.g., YMM, ZMM, OPMASK, SSP and APX, will
be added in the upcoming Architecture PEBS as well, but that requires
hardware support.

The patch set provides a software solution to mitigate the hardware
requirement. It uses the XSAVES instruction to retrieve the requested
registers in the overflow handler. The feature is no longer limited to
PEBS events or specific platforms.
The hardware solution (if available) is still preferred, since it has
low overhead (especially with the large PEBS) and is more accurate.
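
As an illustration only (this is not code from the patch set), the
sketch below shows how a set of requested register groups could be
turned into the XSAVE state-component mask that an XSAVES-based
handler would ask for. The component bit numbers are the architectural
ones; the REQ_* flags and req_to_xsave_mask() are hypothetical names
introduced for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* XSAVE state-component bit numbers defined by the x86 architecture. */
enum {
	XFEATURE_SSE       = 1,   /* XMM0-15 */
	XFEATURE_YMM       = 2,   /* upper 128 bits of YMM0-15 */
	XFEATURE_OPMASK    = 5,   /* k0-k7 */
	XFEATURE_ZMM_Hi256 = 6,   /* upper 256 bits of ZMM0-15 */
	XFEATURE_Hi16_ZMM  = 7,   /* ZMM16-31 */
	XFEATURE_CET_USER  = 11,  /* user CET state, which holds SSP */
	XFEATURE_APX       = 19,  /* extended GPRs R16-R31 */
};

/* Hypothetical request flags, one per register group (assumption). */
#define REQ_XMM     (1u << 0)
#define REQ_YMMH    (1u << 1)
#define REQ_OPMASK  (1u << 2)
#define REQ_ZMMH    (1u << 3)
#define REQ_H16ZMM  (1u << 4)
#define REQ_SSP     (1u << 5)
#define REQ_APX     (1u << 6)
#define REQ_ALL     0x7fu

/* Translate the requested groups into an XSAVE component bitmap. */
static uint64_t req_to_xsave_mask(unsigned int req)
{
	static const struct { unsigned int req; int xfeature; } map[] = {
		{ REQ_XMM,    XFEATURE_SSE },
		{ REQ_YMMH,   XFEATURE_YMM },
		{ REQ_OPMASK, XFEATURE_OPMASK },
		{ REQ_ZMMH,   XFEATURE_ZMM_Hi256 },
		{ REQ_H16ZMM, XFEATURE_Hi16_ZMM },
		{ REQ_SSP,    XFEATURE_CET_USER },
		{ REQ_APX,    XFEATURE_APX },
	};
	uint64_t mask = 0;

	assert((req & ~REQ_ALL) == 0);   /* reject unknown groups */
	for (size_t i = 0; i < sizeof(map) / sizeof(map[0]); i++)
		if (req & map[i].req)
			mask |= 1ULL << map[i].xfeature;
	return mask;
}
```

For instance, requesting REQ_XMM | REQ_YMMH selects components 1 and 2,
i.e. a mask of 0x6.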

In theory, the solution should work on all x86 platforms, but I only
have newer Intel platforms to test on. The patch set therefore only
enables the feature for Intel Ice Lake and later platforms.

Open:
The new registers include YMM, ZMM, OPMASK, SSP, and APX.
The bits in sample_regs_user/intr have run out. A new field in
struct perf_event_attr is required for the new registers.
There are several options for the new field, as listed below.

- Follow a similar format to XSAVES. Introduce the below fields to store
  the bitmap of the registers.
  struct perf_event_attr {
        ...
        __u64   sample_ext_regs_intr[2];
        __u64   sample_ext_regs_user[2];
        ...
  }
  Includes YMMH (16 bits), APX (16 bits), OPMASK (8 bits),
           ZMMH0-15 (16 bits), H16ZMM (16 bits), SSP
  For example, if a user wants YMM8, the perf tool needs to set the
  corresponding bits of XMM8 and YMMH8, and reconstruct the result.
  The method is similar to the existing method for
  sample_regs_user/intr, and matches the XSAVES format.
  The kernel doesn't need to do extra configuration and reconstruction.
  It's implemented in the patch set.
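
  A minimal user-space sketch of the YMM8 example above, under stated
  assumptions: struct reg_request, request_ymm() and build_ymm() are
  hypothetical names, and EXT_REG_YMMH0 (the position of YMMH0 inside
  the new field) is assumed; the real layout is whatever the uapi
  header in the patch set defines. Only the XMM numbering (XMMn
  occupies bits 32+2n and 33+2n of sample_regs_intr) comes from the
  existing x86 perf_regs.h.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PERF_REG_X86_XMM0 32   /* existing uapi value: XMMn uses 2 bits */
#define EXT_REG_YMMH0      0   /* assumed position of YMMH0 in the new field */

/* Hypothetical container for the two bitmaps the tool has to fill in. */
struct reg_request {
	uint64_t sample_regs_intr;         /* existing field */
	uint64_t sample_ext_regs_intr[2];  /* new field from the first option */
};

/* Ask for a full YMMn by requesting XMMn plus its upper half YMMHn. */
static void request_ymm(struct reg_request *r, unsigned int n)
{
	assert(n < 16);
	r->sample_regs_intr |= 3ULL << (PERF_REG_X86_XMM0 + 2 * n);
	r->sample_ext_regs_intr[0] |= 1ULL << (EXT_REG_YMMH0 + n);
}

/* Rebuild the 256-bit value from the two sampled 128-bit halves. */
static void build_ymm(uint8_t ymm[32], const uint8_t xmm[16],
		      const uint8_t ymmh[16])
{
	memcpy(ymm, xmm, 16);        /* bits 0..127 */
	memcpy(ymm + 16, ymmh, 16);  /* bits 128..255 */
}
```

  The point of the design is that the kernel copies each XSAVES
  component out as-is, and the tool does the concatenation.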

- Similar to the above method, but the fields are the bitmap of the
  complete registers, e.g., YMM (16 bits), APX (16 bits),
  OPMASK (8 bits), ZMM (32 bits), SSP.
  The kernel needs to do extra configuration and reconstruction,
  which may bring extra overhead.
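
  To make the "extra configuration" concrete, here is a rough sketch
  of the translation the kernel would have to perform with this
  option: a bitmap of complete registers is split into the partial
  groups that XSAVES actually stores. struct partial_req and
  split_request() are hypothetical names for this example and do not
  appear in the patch set.

```c
#include <assert.h>
#include <stdint.h>

/* Per-group bitmaps matching the pieces XSAVES stores (hypothetical type). */
struct partial_req {
	uint16_t xmm;     /* XMM0-15: low 128 bits */
	uint16_t ymmh;    /* upper halves of YMM0-15 */
	uint16_t zmmh;    /* upper 256 bits of ZMM0-15 */
	uint16_t h16zmm;  /* ZMM16-31, stored whole */
};

/*
 * ymm: one bit per complete YMM0-15 register.
 * zmm: one bit per complete ZMM0-31 register.
 */
static struct partial_req split_request(uint16_t ymm, uint32_t zmm)
{
	struct partial_req p = {0};
	uint16_t zmm_lo = (uint16_t)(zmm & 0xffff);

	/* A complete YMMn needs XMMn and YMMHn; ZMMn (n < 16) also needs ZMMHn. */
	p.xmm    = (uint16_t)(ymm | zmm_lo);
	p.ymmh   = (uint16_t)(ymm | zmm_lo);
	p.zmmh   = zmm_lo;
	p.h16zmm = (uint16_t)(zmm >> 16);

	/* Invariant: every requested ZMMH also has its YMMH requested. */
	assert((p.zmmh & ~p.ymmh) == 0);
	return p;
}
```

  The reverse step, gathering the pieces back into complete registers
  before writing the sample, is the "reconstruction" cost.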

- Combine the XMM, YMM, and ZMM. So all the registers can be put into
  one u64 field.
        ...
        union {
                __u64 sample_ext_regs_intr;   // sample_ext_regs_user is similar
                struct {
                        __u32 vector_bitmap;
                        __u32 vector_type   : 3,  // 0b001 XMM, 0b010 YMM, 0b100 ZMM
                              apx_bitmap    : 16,
                              opmask_bitmap : 8,
                              ssp_bitmap    : 1,
                              reserved      : 4;
                };
        };
        ...
  For example, if YMM8-15 are required,
  vector_bitmap: 0x0000ff00
  vector_type: 0x2
  This method can save two __u64 in the struct perf_event_attr.
  But it's not straightforward since it mixes the type and bitmap.
  The kernel also needs to do extra configuration and reconstruction.
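
  The packed encoding of this option can be sketched with explicit
  shifts, which avoids relying on compiler-specific bit-field layout.
  The bit positions below assume the fields are allocated from the
  least significant bit upward, as GCC does on x86; pack_ext_regs()
  and vector_type_of() are hypothetical helpers, not part of the
  patch set.

```c
#include <assert.h>
#include <stdint.h>

enum { VEC_NONE = 0x0, VEC_XMM = 0x1, VEC_YMM = 0x2, VEC_ZMM = 0x4 };

/* Assumed bit positions inside the single __u64 field. */
#define VEC_TYPE_SHIFT 32   /* 3 bits:  vector_type */
#define APX_SHIFT      35   /* 16 bits: apx_bitmap */
#define OPMASK_SHIFT   51   /* 8 bits:  opmask_bitmap */
#define SSP_SHIFT      59   /* 1 bit:   ssp_bitmap; bits 60-63 reserved */

static uint64_t pack_ext_regs(uint32_t vector_bitmap, unsigned int type,
			      uint16_t apx, uint8_t opmask, unsigned int ssp)
{
	/* Exactly one vector width (or none) may be selected. */
	assert(type == VEC_NONE || type == VEC_XMM ||
	       type == VEC_YMM || type == VEC_ZMM);
	assert(ssp <= 1);

	return (uint64_t)vector_bitmap |
	       ((uint64_t)type   << VEC_TYPE_SHIFT) |
	       ((uint64_t)apx    << APX_SHIFT) |
	       ((uint64_t)opmask << OPMASK_SHIFT) |
	       ((uint64_t)ssp    << SSP_SHIFT);
}

/* Extract the vector width selector from a packed value. */
static unsigned int vector_type_of(uint64_t packed)
{
	return (unsigned int)((packed >> VEC_TYPE_SHIFT) & 0x7);
}
```

  With these assumptions, the YMM8-15 example above packs to
  0x000000020000ff00. Note that one value cannot request, say, XMM0
  and ZMM1 at different widths, which is part of why the encoding is
  less straightforward.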

Please let me know if there are more ideas.

Thanks,
Kan



Kan Liang (12):
  perf/x86: Use x86_perf_regs in the x86 nmi handler
  perf/x86: Setup the regs data
  x86/fpu/xstate: Add xsaves_nmi
  perf: Move has_extended_regs() to header file
  perf/x86: Support XMM register for non-PEBS and REGS_USER
  perf: Support extension of sample_regs
  perf/x86: Add YMMH in extended regs
  perf/x86: Add APX in extended regs
  perf/x86: Add OPMASK in extended regs
  perf/x86: Add ZMM in extended regs
  perf/x86: Add SSP in extended regs
  perf/x86/intel: Support extended registers

 arch/arm/kernel/perf_regs.c           |   9 +-
 arch/arm64/kernel/perf_regs.c         |   9 +-
 arch/csky/kernel/perf_regs.c          |   9 +-
 arch/loongarch/kernel/perf_regs.c     |   8 +-
 arch/mips/kernel/perf_regs.c          |   9 +-
 arch/powerpc/perf/perf_regs.c         |   9 +-
 arch/riscv/kernel/perf_regs.c         |   8 +-
 arch/s390/kernel/perf_regs.c          |   9 +-
 arch/x86/events/core.c                | 226 ++++++++++++++++++++++++--
 arch/x86/events/intel/core.c          |  49 ++++++
 arch/x86/events/intel/ds.c            |  12 +-
 arch/x86/events/perf_event.h          |  58 +++++++
 arch/x86/include/asm/fpu/xstate.h     |   1 +
 arch/x86/include/asm/perf_event.h     |   6 +
 arch/x86/include/uapi/asm/perf_regs.h | 101 ++++++++++++
 arch/x86/kernel/fpu/xstate.c          |  22 +++
 arch/x86/kernel/perf_regs.c           |  85 +++++++++-
 include/linux/perf_event.h            |  23 +++
 include/linux/perf_regs.h             |  29 +++-
 include/uapi/linux/perf_event.h       |   8 +
 kernel/events/core.c                  |  63 +++++--
 21 files changed, 699 insertions(+), 54 deletions(-)

-- 
2.38.1



