linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: "Mi, Dapeng" <dapeng1.mi@linux.intel.com>
Cc: kan.liang@linux.intel.com, mingo@redhat.com, acme@kernel.org,
	namhyung@kernel.org, tglx@linutronix.de,
	dave.hansen@linux.intel.com, irogers@google.com,
	adrian.hunter@intel.com, jolsa@kernel.org,
	alexander.shishkin@linux.intel.com, linux-kernel@vger.kernel.org,
	ak@linux.intel.com, zide.chen@intel.com
Subject: Re: [RFC PATCH 06/12] perf: Support extension of sample_regs
Date: Tue, 17 Jun 2025 12:28:13 +0200	[thread overview]
Message-ID: <20250617102813.GS1613376@noisy.programming.kicks-ass.net> (raw)
In-Reply-To: <8fbf7fc5-2e38-4882-8835-49869b6dd47f@linux.intel.com>

On Tue, Jun 17, 2025 at 05:49:13PM +0800, Mi, Dapeng wrote:
> 
> On 6/17/2025 4:14 PM, Peter Zijlstra wrote:
> > On Fri, Jun 13, 2025 at 06:49:37AM -0700, kan.liang@linux.intel.com wrote:
> >> From: Kan Liang <kan.liang@linux.intel.com>
> >>
> >> More regs may be required in a sample, e.g., the vector registers. The
> >> current sample_regs_XXX has run out of space.
> >>
> >> Add sample_ext_regs_intr/user[2] in the struct perf_event_attr. It's used
> >> as a bitmap for the extension regs. There will be more than 64 registers
> >> added.
> >> Add a new flag PERF_PMU_CAP_EXTENDED_REGS2 to indicate the PMU which
> >> supports sample_ext_regs_intr/user.
> >>
> >> Extend the perf_reg_validate() to support the validation of the
> >> extension regs.
> >>
> >> Extend the perf_reg_value() to retrieve the extension regs. The regs may
> >> be larger than u64. Add two parameters to store the pointer and size.
> >> Add a dedicated perf_output_sample_ext_regs() to dump the extension
> >> regs.
> >>
> >> This is just a generic support for the extension regs. Any attempts to
> >> manipulate the extension regs will error out, until the driver-specific
> >> supports are implemented, which will be done in the following patch.
> >>
> >> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> >> index 78a362b80027..e22ba72efcdb 100644
> >> --- a/include/uapi/linux/perf_event.h
> >> +++ b/include/uapi/linux/perf_event.h
> >> @@ -382,6 +382,10 @@ enum perf_event_read_format {
> >>  #define PERF_ATTR_SIZE_VER6			120	/* Add: aux_sample_size */
> >>  #define PERF_ATTR_SIZE_VER7			128	/* Add: sig_data */
> >>  #define PERF_ATTR_SIZE_VER8			136	/* Add: config3 */
> >> +#define PERF_ATTR_SIZE_VER9			168	/* Add: sample_ext_regs_intr */
> >> +							/* Add: sample_ext_regs_user */
> >> +
> >> +#define PERF_ATTR_EXT_REGS_SIZE			2
> >>  
> >>  /*
> >>   * 'struct perf_event_attr' contains various attributes that define
> >> @@ -543,6 +547,10 @@ struct perf_event_attr {
> >>  	__u64	sig_data;
> >>  
> >>  	__u64	config3; /* extension of config2 */
> >> +
> >> +	/* extension of sample_regs_XXX */
> >> +	__u64	sample_ext_regs_intr[PERF_ATTR_EXT_REGS_SIZE];
> >> +	__u64	sample_ext_regs_user[PERF_ATTR_EXT_REGS_SIZE];
> >>  };
> > Did anybody read this email?
> >
> >   https://lkml.kernel.org/r/20250416155327.GD17910@noisy.programming.kicks-ass.net
> >
> > The current regs interface really was designed for regular registers,
> > and trying to squish SIMD registers into it is a trainwreck.
> >
> > Not to mention that AAAARGHH64 and Risc-V have vector widths up to 2048
> > bit.
> 
> Yes, we followed this discussion. In sample_ext_regs_intr/user[], each bit
> represents an extended register regardless of how much bits does the
> register have.  At the beginning we added a item "sample_simd_reg_words" to
> represent the bit-width of the corresponding extended register, but we
> found it's unnecessary since the regs bitmap is fixed for a specific arch

So I disagree. Fundamentally we have only 32 SIMD registers on x86. We
should not have more bits than that.

> and the arch-specific code would know how many bits for the certain regs,
> e.g., on x86 platform, the bit 0 would represent YMMH0 in this patchset ,
> then the x86 specific perf code would know its bit-wdith is 128bits.

This is nonsense; YMMH0 is not a register. It should never be in this
array.

> The reason that we define an array with 2 u64 words is that we plan to
> support YMM (16 bits) + APX (16 bits) + OPMASK (8 bits) + ZMM (32 bits) +
> SSP (1 bit) regs which needs 73 bits and one u64 word is not enough.

This is insane. So now you're having 16 XMM 'regs', 16 YMMH 'regs' and
32 ZMMH 'regs' for a total of 64 bits that should have been just 32.

Suppose we're going to be doing AVX-1024, because awesome. That means we
need another 32 bits to denote whatever letter comes after 'Z'.

So no, this idiocy stops now.

We're going to do a sane SIMD register set with variable width, and
reclaim the XMM regs from the normal set.

  reply	other threads:[~2025-06-17 10:28 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-13 13:49 [RFC PATCH 00/12] Support vector and more extended registers in perf kan.liang
2025-06-13 13:49 ` [RFC PATCH 01/12] perf/x86: Use x86_perf_regs in the x86 nmi handler kan.liang
2025-06-13 13:49 ` [RFC PATCH 02/12] perf/x86: Setup the regs data kan.liang
2025-06-13 13:49 ` [RFC PATCH 03/12] x86/fpu/xstate: Add xsaves_nmi kan.liang
2025-06-13 14:39   ` Dave Hansen
2025-06-13 14:54     ` Liang, Kan
2025-06-13 15:19       ` Dave Hansen
2025-06-13 13:49 ` [RFC PATCH 04/12] perf: Move has_extended_regs() to header file kan.liang
2025-06-13 13:49 ` [RFC PATCH 05/12] perf/x86: Support XMM register for non-PEBS and REGS_USER kan.liang
2025-06-13 15:15   ` Dave Hansen
2025-06-13 17:51     ` Liang, Kan
2025-06-13 15:34   ` Dave Hansen
2025-06-13 18:14     ` Liang, Kan
2025-06-13 13:49 ` [RFC PATCH 06/12] perf: Support extension of sample_regs kan.liang
2025-06-17  8:00   ` Mi, Dapeng
2025-06-17  8:14   ` Peter Zijlstra
2025-06-17  9:49     ` Mi, Dapeng
2025-06-17 10:28       ` Peter Zijlstra [this message]
2025-06-17 12:14         ` Mi, Dapeng
2025-06-17 13:33           ` Peter Zijlstra
2025-06-17 14:06             ` Peter Zijlstra
2025-06-17 14:24               ` Mark Rutland
2025-06-17 14:44                 ` Peter Zijlstra
2025-06-17 14:55                   ` Mark Rutland
2025-06-17 19:00                     ` Mark Brown
2025-06-17 20:32                     ` Liang, Kan
2025-06-18  9:35                       ` Peter Zijlstra
2025-06-18 10:10                         ` Liang, Kan
2025-06-18 13:30                           ` Peter Zijlstra
2025-06-18 13:52                             ` Liang, Kan
2025-06-18 14:30                               ` Dave Hansen
2025-06-18 14:47                                 ` Dave Hansen
2025-06-18 15:24                                   ` Liang, Kan
2025-06-18 14:45                               ` Peter Zijlstra
2025-06-18 15:22                                 ` Liang, Kan
2025-06-13 13:49 ` [RFC PATCH 07/12] perf/x86: Add YMMH in extended regs kan.liang
2025-06-13 15:48   ` Dave Hansen
2025-06-13 13:49 ` [RFC PATCH 08/12] perf/x86: Add APX " kan.liang
2025-06-13 16:02   ` Dave Hansen
2025-06-13 17:17     ` Liang, Kan
2025-06-17  8:19   ` Peter Zijlstra
2025-06-13 13:49 ` [RFC PATCH 09/12] perf/x86: Add OPMASK " kan.liang
2025-06-13 13:49 ` [RFC PATCH 10/12] perf/x86: Add ZMM " kan.liang
2025-06-13 13:49 ` [RFC PATCH 11/12] perf/x86: Add SSP " kan.liang
2025-06-13 13:49 ` [RFC PATCH 12/12] perf/x86/intel: Support extended registers kan.liang
2025-06-17  7:50 ` [RFC PATCH 00/12] Support vector and more extended registers in perf Mi, Dapeng
2025-06-17  8:24 ` Peter Zijlstra
2025-06-17 13:52   ` Liang, Kan
2025-06-17 14:29     ` Peter Zijlstra
2025-06-17 15:23       ` Liang, Kan
2025-06-17 17:34         ` Peter Zijlstra
2025-06-18  0:57         ` Mi, Dapeng
2025-06-18 10:47           ` Liang, Kan
2025-06-18 12:28             ` Mi, Dapeng
2025-06-18 13:15               ` Liang, Kan
2025-06-19  0:41                 ` Mi, Dapeng
2025-06-19 11:11                   ` Liang, Kan
2025-06-19 12:26                     ` Mi, Dapeng
2025-06-19 13:38                     ` Peter Zijlstra
2025-06-19 14:27                       ` Liang, Kan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250617102813.GS1613376@noisy.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=acme@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=ak@linux.intel.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=dapeng1.mi@linux.intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=irogers@google.com \
    --cc=jolsa@kernel.org \
    --cc=kan.liang@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=zide.chen@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).