From: "Liang, Kan" <kan.liang@linux.intel.com>
To: Peter Zijlstra <peterz@infradead.org>, Andi Kleen <ak@linux.intel.com>
Cc: acme@kernel.org, mingo@redhat.com, linux-kernel@vger.kernel.org,
	tglx@linutronix.de, jolsa@kernel.org, eranian@google.com,
	alexander.shishkin@linux.intel.com
Subject: Re: [PATCH V3 01/23] perf/x86: Support outputting XMM registers
Date: Mon, 25 Mar 2019 16:35:56 -0400
Message-ID: <697d0b76-d08f-5fa4-d49a-c291ab0f57de@linux.intel.com>
In-Reply-To: <20190323095610.GB6058@hirez.programming.kicks-ass.net>



On 3/23/2019 5:56 AM, Peter Zijlstra wrote:
> On Fri, Mar 22, 2019 at 10:22:50AM -0700, Andi Kleen wrote:
>>>> diff --git a/arch/x86/include/uapi/asm/perf_regs.h b/arch/x86/include/uapi/asm/perf_regs.h
>>>> index f3329cabce5c..b33995313d17 100644
>>>> --- a/arch/x86/include/uapi/asm/perf_regs.h
>>>> +++ b/arch/x86/include/uapi/asm/perf_regs.h
>>>> @@ -28,7 +28,29 @@ enum perf_event_x86_regs {
>>>>   	PERF_REG_X86_R14,
>>>>   	PERF_REG_X86_R15,
>>>>   
>>>> -	PERF_REG_X86_32_MAX = PERF_REG_X86_GS + 1,
>>>> -	PERF_REG_X86_64_MAX = PERF_REG_X86_R15 + 1,
>>>
>>> So this changes UAPI visible symbols... did we think about that?
>>
>> Should be fine. Old programs won't use the new bits,
>> and it just uses not yet used bits.
> 
> Old programs (that used the above symbols) will now fail to compile.
> Even if they won't use the new bits, that seems like a bad thing.
>

Yes, other programs which use the PERF_REG_X86_32/64_MAX symbols would
break.
I think the new names, PERF_REG_GPR_X86_32/64_MAX, are more accurate. So
I will keep both sets of names in V4, and add a comment for the old
names.

/*
 * These names are deprecated; please use the new names below instead:
 *     PERF_REG_GPR_X86_32_MAX
 *     PERF_REG_GPR_X86_64_MAX
 */
PERF_REG_X86_32_MAX = PERF_REG_X86_GS + 1,
PERF_REG_X86_64_MAX = PERF_REG_X86_R15 + 1,
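
As a rough, standalone sketch of what keeping both names amounts to (this
is not the V4 patch; the SKETCH_* identifiers are made up for
illustration, and only the two boundary registers are listed, with the
values they have in the existing enum), the old and new names would
simply be aliases for the same values:

```c
/* Hypothetical excerpt, not the real UAPI header. GS is index 15 and
 * R15 is index 23 in the existing x86 register enum. */
enum sketch_perf_event_x86_regs {
	SKETCH_REG_X86_GS  = 15,
	SKETCH_REG_X86_R15 = 23,

	/* Deprecated names, kept so that existing users still compile. */
	SKETCH_REG_X86_32_MAX = SKETCH_REG_X86_GS + 1,
	SKETCH_REG_X86_64_MAX = SKETCH_REG_X86_R15 + 1,

	/* New names with the same values; these exclude the XMM range. */
	SKETCH_REG_GPR_X86_32_MAX = SKETCH_REG_X86_GS + 1,
	SKETCH_REG_GPR_X86_64_MAX = SKETCH_REG_X86_R15 + 1,
};

/* Compile-time check that each old name aliases its new name. */
_Static_assert(SKETCH_REG_X86_32_MAX == SKETCH_REG_GPR_X86_32_MAX,
	       "32-bit GPR limit must not change");
_Static_assert(SKETCH_REG_X86_64_MAX == SKETCH_REG_GPR_X86_64_MAX,
	       "64-bit GPR limit must not change");
```

Because the values are unchanged, code built against the old names keeps
both compiling and behaving as before.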


>>>> +	/* These all need two bits set because they are 128bit */
>>>> +	PERF_REG_X86_XMM0  = 32,
>>>> +	PERF_REG_X86_XMM1  = 34,
>>>> +	PERF_REG_X86_XMM2  = 36,
>>>> +	PERF_REG_X86_XMM3  = 38,
>>>> +	PERF_REG_X86_XMM4  = 40,
>>>> +	PERF_REG_X86_XMM5  = 42,
>>>> +	PERF_REG_X86_XMM6  = 44,
>>>> +	PERF_REG_X86_XMM7  = 46,
>>>> +	PERF_REG_X86_XMM8  = 48,
>>>> +	PERF_REG_X86_XMM9  = 50,
>>>> +	PERF_REG_X86_XMM10 = 52,
>>>> +	PERF_REG_X86_XMM11 = 54,
>>>> +	PERF_REG_X86_XMM12 = 56,
>>>> +	PERF_REG_X86_XMM13 = 58,
>>>> +	PERF_REG_X86_XMM14 = 60,
>>>> +	PERF_REG_X86_XMM15 = 62,
>>>> +
>>>> +	/* This does not include the XMMX registers */
>>>> +	PERF_REG_GPR_X86_32_MAX = PERF_REG_X86_GS + 1,
>>>> +	PERF_REG_GPR_X86_64_MAX = PERF_REG_X86_R15 + 1,
>>>> +
>>>> +	/* All registers include the XMMX registers */
>>>> +	PERF_REG_X86_MAX = PERF_REG_X86_XMM15 + 2,
>>>>   };
>>>>   #endif /* _ASM_X86_PERF_REGS_H */
>>>
>>> Also, what happens if we run a 32bit kernel or 32bit compat task?
>>>
>>> Then the register dump will report PERF_SAMPLE_REGS_ABI_32, should we
>>> then still interpret the XMM registers as 2x64bit?
>>
>> Yes XMM registers are 128bit in 32bit mode too.
>>
>>>
>>> Are they still at the same offset?
>>
>> Yes.
> 
> I think that is broken.. perf_prepare_sample() does:
> 
>   size += hweight(mask) * sizeof(u64);

It does size += hweight64(mask) * sizeof(u64);

> 
> And since 32bits will not have r8-r15 set, the XMM registers will shift
> forward no?
>

I tried a 32-bit kernel, but I didn't observe any issue.

The indices of the XMM registers always start at 32. That is hard-coded.
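
To make the distinction between a fixed mask index and the position in
the dumped data concrete, here is a small userspace sketch. It ASSUMES
the register values are written out as u64s in ascending bit order, one
per set bit of the mask; sample_slot() is a hypothetical helper, not a
kernel function.

```c
#include <stdint.h>

/*
 * Slot of register `bit` in the dumped u64 array, ASSUMING one u64 is
 * emitted per set bit of the mask, in ascending bit order.
 * Returns -1 if `bit` is out of range or not set in the mask.
 */
static int sample_slot(uint64_t mask, unsigned int bit)
{
	int slot = 0;
	unsigned int i;

	if (bit >= 64 || !(mask & (1ULL << bit)))
		return -1;

	/* Count the requested registers that come before `bit`. */
	for (i = 0; i < bit; i++)
		if (mask & (1ULL << i))
			slot++;

	return slot;
}
```

With the 16 low bits plus XMM0 requested, sample_slot(mask, 32) gives
16; with the 24 low bits plus XMM0 it gives 24. The mask index of XMM0
is 32 in both cases, so under this assumption a reader that walks the
mask bit by bit finds the XMM values regardless of how many GPRs were
dumped before them.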

To double check, I also dumped the mask value in perf_prepare_sample().
With the command "perf record -e cycles:p -IXMM0,IXMM1 sleep 1", the
mask value is 0xf00000000 and hweight64(mask) returns 4, as expected.
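
The arithmetic can be reproduced in userspace with a short sketch.
popcount64() is a portable stand-in for the kernel's hweight64(), and
xmm_mask() and regs_dump_size() are hypothetical helpers; only the index
32 for XMM0 and the two-bits-per-register layout come from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_XMM0_IDX 32	/* PERF_REG_X86_XMM0 = 32 in the patch */

/* Each XMM register occupies two consecutive mask bits (2 x 64 bit). */
static uint64_t xmm_mask(unsigned int n)
{
	return 3ULL << (SKETCH_XMM0_IDX + 2 * n);
}

/* Portable population count, standing in for hweight64(). */
static unsigned int popcount64(uint64_t x)
{
	unsigned int n = 0;

	for (; x; x &= x - 1)	/* clear the lowest set bit each round */
		n++;
	return n;
}

/* Size contribution of the register dump: one u64 per set mask bit. */
static size_t regs_dump_size(uint64_t mask)
{
	return popcount64(mask) * sizeof(uint64_t);
}
```

For XMM0 and XMM1, xmm_mask(0) | xmm_mask(1) is 0xf00000000, the
population count is 4, and the dump contributes 4 * 8 = 32 bytes.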

Is there anything I missed?

>>> Do we need additional PERF_SAMPLE_REGS_ABI_* definitions for this?
>>
>> I don't think so.
> 
> because....?
> 

I didn't observe any breakage on 32-bit, so I don't think we need a new
ABI value to distinguish the XMM registers.

Thanks,
Kan
