From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 4 Dec 2025 09:37:51 +0800
X-Mailing-List: linux-perf-users@vger.kernel.org
Subject: Re: [Patch v5 17/19] perf headers: Sync with the kernel headers
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim,
 Thomas Gleixner, Dave Hansen, Adrian Hunter, Jiri Olsa,
 Alexander Shishkin, Andi Kleen, Eranian Stephane, Mark Rutland,
 broonie@kernel.org, Ravi Bangoria, linux-kernel@vger.kernel.org,
 linux-perf-users@vger.kernel.org, Zide Chen, Falcon Thomas, Dapeng Mi,
 Xudong Hao, Kan Liang
References: <20251203065500.2597594-1-dapeng1.mi@linux.intel.com>
 <20251203065500.2597594-18-dapeng1.mi@linux.intel.com>
From: "Mi, Dapeng"
Content-Type: text/plain; charset=UTF-8

On 12/4/2025 7:43 AM, Ian Rogers wrote:
> On Tue, Dec 2, 2025 at 10:59 PM Dapeng Mi wrote:
>> From: Kan Liang
>>
>> Update include/uapi/linux/perf_event.h and
>> arch/x86/include/uapi/asm/perf_regs.h to support extended regs.
>>
>> Signed-off-by: Kan Liang
>> Co-developed-by: Dapeng Mi
>> Signed-off-by: Dapeng Mi
>> ---
>>  tools/arch/x86/include/uapi/asm/perf_regs.h | 62 +++++++++++++++++++++
>>  tools/include/uapi/linux/perf_event.h | 45 +++++++++++++--
>>  2 files changed, 103 insertions(+), 4 deletions(-)
>>
>> diff --git a/tools/arch/x86/include/uapi/asm/perf_regs.h b/tools/arch/x86/include/uapi/asm/perf_regs.h
>> index 7c9d2bb3833b..f3561ed10041 100644
>> --- a/tools/arch/x86/include/uapi/asm/perf_regs.h
>> +++ b/tools/arch/x86/include/uapi/asm/perf_regs.h
>> @@ -27,9 +27,34 @@ enum perf_event_x86_regs {
>>  	PERF_REG_X86_R13,
>>  	PERF_REG_X86_R14,
>>  	PERF_REG_X86_R15,
>> +	/*
>> +	 * The EGPRs/SSP and XMM have overlaps. Only one can be used
>> +	 * at a time. For the ABI type PERF_SAMPLE_REGS_ABI_SIMD,
>> +	 * utilize EGPRs/SSP. For the other ABI type, XMM is used.
>> +	 *
>> +	 * Extended GPRs (EGPRs)
>> +	 */
>> +	PERF_REG_X86_R16,
>> +	PERF_REG_X86_R17,
>> +	PERF_REG_X86_R18,
>> +	PERF_REG_X86_R19,
>> +	PERF_REG_X86_R20,
>> +	PERF_REG_X86_R21,
>> +	PERF_REG_X86_R22,
>> +	PERF_REG_X86_R23,
>> +	PERF_REG_X86_R24,
>> +	PERF_REG_X86_R25,
>> +	PERF_REG_X86_R26,
>> +	PERF_REG_X86_R27,
>> +	PERF_REG_X86_R28,
>> +	PERF_REG_X86_R29,
>> +	PERF_REG_X86_R30,
>> +	PERF_REG_X86_R31,
>> +	PERF_REG_X86_SSP,
>>  	/* These are the limits for the GPRs. */
>>  	PERF_REG_X86_32_MAX = PERF_REG_X86_GS + 1,
>>  	PERF_REG_X86_64_MAX = PERF_REG_X86_R15 + 1,
>> +	PERF_REG_MISC_MAX = PERF_REG_X86_SSP + 1,
> I wonder MISC isn't the most intention revealing name. What happens if
> things are extended again? Would APX be a better alternative, so
> PERF_REG_APX_MAX ?

Hmm, I don't think PERF_REG_APX_MAX is a good name either, since it also
covers SSP in addition to the APX eGPRs, and more registers could be
introduced in the future. How about PERF_REG_X86_EXTD_MAX?

>
>>  	/* These all need two bits set because they are 128bit */
>>  	PERF_REG_X86_XMM0 = 32,
>> @@ -54,5 +79,42 @@ enum perf_event_x86_regs {
>>  };
>>
>>  #define PERF_REG_EXTENDED_MASK (~((1ULL << PERF_REG_X86_XMM0) - 1))
>> +#define PERF_X86_EGPRS_MASK GENMASK_ULL(PERF_REG_X86_R31, PERF_REG_X86_R16)
>> +
>> +enum {
>> +	PERF_REG_X86_XMM,
>> +	PERF_REG_X86_YMM,
>> +	PERF_REG_X86_ZMM,
>> +	PERF_REG_X86_MAX_SIMD_REGS,
>> +
>> +	PERF_REG_X86_OPMASK = 0,
>> +	PERF_REG_X86_MAX_PRED_REGS = 1,
>> +};
>> +
>> +enum {
>> +	PERF_X86_SIMD_XMM_REGS = 16,
>> +	PERF_X86_SIMD_YMM_REGS = 16,
>> +	PERF_X86_SIMD_ZMMH_REGS = 16,
>> +	PERF_X86_SIMD_ZMM_REGS = 32,
>> +	PERF_X86_SIMD_VEC_REGS_MAX = PERF_X86_SIMD_ZMM_REGS,
>> +
>> +	PERF_X86_SIMD_OPMASK_REGS = 8,
>> +	PERF_X86_SIMD_PRED_REGS_MAX = PERF_X86_SIMD_OPMASK_REGS,
>> +};
>> +
>> +#define PERF_X86_SIMD_PRED_MASK GENMASK(PERF_X86_SIMD_PRED_REGS_MAX - 1, 0)
>> +#define PERF_X86_SIMD_VEC_MASK GENMASK_ULL(PERF_X86_SIMD_VEC_REGS_MAX - 1, 0)
>> +
>> +#define PERF_X86_H16ZMM_BASE PERF_X86_SIMD_ZMMH_REGS
>> +
>> +enum {
>> +	PERF_X86_OPMASK_QWORDS = 1,
>> +	PERF_X86_XMM_QWORDS = 2,
>> +	PERF_X86_YMMH_QWORDS = 2,
>> +	PERF_X86_YMM_QWORDS = 4,
>> +	PERF_X86_ZMMH_QWORDS = 4,
>> +	PERF_X86_ZMM_QWORDS = 8,
>> +	PERF_X86_SIMD_QWORDS_MAX = PERF_X86_ZMM_QWORDS,
>> +};
>>
>>  #endif /* _ASM_X86_PERF_REGS_H */
>> diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
>> index d292f96bc06f..f1474da32622 100644
>> --- a/tools/include/uapi/linux/perf_event.h
>> +++ b/tools/include/uapi/linux/perf_event.h
>> @@ -314,8 +314,9 @@ enum {
>>   */
>>  enum perf_sample_regs_abi {
>>  	PERF_SAMPLE_REGS_ABI_NONE = 0,
>> -	PERF_SAMPLE_REGS_ABI_32 = 1,
>> -	PERF_SAMPLE_REGS_ABI_64 = 2,
>> +	PERF_SAMPLE_REGS_ABI_32 = (1 << 0),
>> +	PERF_SAMPLE_REGS_ABI_64 = (1 << 1),
>> +	PERF_SAMPLE_REGS_ABI_SIMD = (1 << 2),
>>  };
>>
>>  /*
>> @@ -382,6 +383,7 @@ enum perf_event_read_format {
>>  #define PERF_ATTR_SIZE_VER6 120 /* Add: aux_sample_size */
>>  #define PERF_ATTR_SIZE_VER7 128 /* Add: sig_data */
>>  #define PERF_ATTR_SIZE_VER8 136 /* Add: config3 */
>> +#define PERF_ATTR_SIZE_VER9 168 /* Add: sample_simd_{pred,vec}_reg_* */
> ARM have added a config4 in:
> https://lore.kernel.org/lkml/20251111-james-perf-feat_spe_eft-v10-1-1e1b5bf2cd05@linaro.org/
> so this will need to be VER10.

Thanks. It looks like the ARM changes have been merged, so we can change it
to VER10 in the next version.

>
> Thanks,
> Ian
>
>>  /*
>>   * 'struct perf_event_attr' contains various attributes that define
>> @@ -545,6 +547,25 @@ struct perf_event_attr {
>>  	__u64 sig_data;
>>
>>  	__u64 config3; /* extension of config2 */
>> +
>> +
>> +	/*
>> +	 * Defines set of SIMD registers to dump on samples.
>> +	 * The sample_simd_regs_enabled !=0 implies the
>> +	 * set of SIMD registers is used to config all SIMD registers.
>> +	 * If !sample_simd_regs_enabled, sample_regs_XXX may be used to
>> +	 * config some SIMD registers on X86.
>> +	 */
>> +	union {
>> +		__u16 sample_simd_regs_enabled;
>> +		__u16 sample_simd_pred_reg_qwords;
>> +	};
>> +	__u32 sample_simd_pred_reg_intr;
>> +	__u32 sample_simd_pred_reg_user;
>> +	__u16 sample_simd_vec_reg_qwords;
>> +	__u64 sample_simd_vec_reg_intr;
>> +	__u64 sample_simd_vec_reg_user;
>> +	__u32 __reserved_4;
>>  };
>>
>>  /*
>> @@ -1018,7 +1039,15 @@ enum perf_event_type {
>>  	 * } && PERF_SAMPLE_BRANCH_STACK
>>  	 *
>>  	 * { u64 abi; # enum perf_sample_regs_abi
>> -	 *   u64 regs[weight(mask)]; } && PERF_SAMPLE_REGS_USER
>> +	 *   u64 regs[weight(mask)];
>> +	 *   struct {
>> +	 *     u16 nr_vectors;
>> +	 *     u16 vector_qwords;
>> +	 *     u16 nr_pred;
>> +	 *     u16 pred_qwords;
>> +	 *     u64 data[nr_vectors * vector_qwords + nr_pred * pred_qwords];
>> +	 *   } && (abi & PERF_SAMPLE_REGS_ABI_SIMD)
>> +	 * } && PERF_SAMPLE_REGS_USER
>>  	 *
>>  	 * { u64 size;
>>  	 *   char data[size];
>> @@ -1045,7 +1074,15 @@ enum perf_event_type {
>>  	 * { u64 data_src; } && PERF_SAMPLE_DATA_SRC
>>  	 * { u64 transaction; } && PERF_SAMPLE_TRANSACTION
>>  	 * { u64 abi; # enum perf_sample_regs_abi
>> -	 *   u64 regs[weight(mask)]; } && PERF_SAMPLE_REGS_INTR
>> +	 *   u64 regs[weight(mask)];
>> +	 *   struct {
>> +	 *     u16 nr_vectors;
>> +	 *     u16 vector_qwords;
>> +	 *     u16 nr_pred;
>> +	 *     u16 pred_qwords;
>> +	 *     u64 data[nr_vectors * vector_qwords + nr_pred * pred_qwords];
>> +	 *   } && (abi & PERF_SAMPLE_REGS_ABI_SIMD)
>> +	 * } && PERF_SAMPLE_REGS_INTR
>>  	 * { u64 phys_addr;} && PERF_SAMPLE_PHYS_ADDR
>>  	 * { u64 cgroup;} && PERF_SAMPLE_CGROUP
>>  	 * { u64 data_page_size;} && PERF_SAMPLE_DATA_PAGE_SIZE
>> --
>> 2.34.1
>>
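
For reference, the sample layout documented in the last two hunks could be
consumed roughly as below. This is only a sketch under assumptions, not what
the perf tool does: SKETCH_REGS_ABI_SIMD, struct simd_regs_view and
parse_simd_regs() are hypothetical names, the ABI bit value is the one
proposed in this patch, and the sketch assumes the vector qwords come before
the predicate qwords in data[], which the layout comment suggests but does
not state.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bit value as proposed in the quoted patch; hypothetical local name. */
#define SKETCH_REGS_ABI_SIMD (1u << 2)

/* Hypothetical view of the SIMD block that follows regs[] in a sample. */
struct simd_regs_view {
	uint16_t nr_vectors;
	uint16_t vector_qwords;
	uint16_t nr_pred;
	uint16_t pred_qwords;
	const uint64_t *vec;	/* nr_vectors * vector_qwords qwords */
	const uint64_t *pred;	/* nr_pred * pred_qwords qwords */
};

/*
 * Parse the SIMD block starting at 'p', i.e. just past regs[weight(mask)].
 * 'avail' is the number of u64 words left in the sample.
 * Returns the number of u64 words consumed, 0 if the ABI carries no SIMD
 * block, or -1 if the sample is too short.
 */
static long parse_simd_regs(uint64_t abi, const uint64_t *p, size_t avail,
			    struct simd_regs_view *out)
{
	uint16_t hdr[4];
	size_t nvec, npred;

	memset(out, 0, sizeof(*out));
	if (!(abi & SKETCH_REGS_ABI_SIMD))
		return 0;
	if (avail < 1)
		return -1;

	/* The four u16 fields occupy one u64 header word. */
	memcpy(hdr, p, sizeof(hdr));
	nvec = (size_t)hdr[0] * hdr[1];
	npred = (size_t)hdr[2] * hdr[3];
	if (avail - 1 < nvec + npred)
		return -1;

	out->nr_vectors = hdr[0];
	out->vector_qwords = hdr[1];
	out->nr_pred = hdr[2];
	out->pred_qwords = hdr[3];
	out->vec = p + 1;		/* assumed: vectors come first */
	out->pred = p + 1 + nvec;	/* assumed: predicates follow */
	return (long)(1 + nvec + npred);
}
```

Since the ABI values become bit flags with this change, the sketch tests
the SIMD bit with a mask rather than comparing abi for equality.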