Re: [PATCH 2/2] perf arm-spe: Parse more SPE fields and store source

linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: German Gomez <german.gomez@arm.com>
To: Leo Yan <leo.yan@linaro.org>, Ali Saidi <alisaidi@amazon.com>
Cc: acme@kernel.org, alexander.shishkin@linux.intel.com,
	andrew.kilroy@arm.com, benh@kernel.crashing.org,
	james.clark@arm.com, john.garry@huawei.com, jolsa@redhat.com,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	mark.rutland@arm.com, mathieu.poirier@linaro.org,
	mingo@redhat.com, namhyung@kernel.org, peterz@infradead.org,
	will@kernel.org
Subject: Re: [PATCH 2/2] perf arm-spe: Parse more SPE fields and store source
Date: Mon, 21 Feb 2022 20:41:43 +0000	[thread overview]
Message-ID: <9266bfb6-341c-1d9c-e96f-c9f856a5ffb6@arm.com> (raw)
In-Reply-To: <20220212041927.GA763461@leoy-ThinkPad-X240s>

Hi Leo, Ali,

On 12/02/2022 04:19, Leo Yan wrote:
> Hi German, Ali,
>
> On Fri, Feb 11, 2022 at 04:31:40PM +0000, German Gomez wrote:
>> Hi Ali,
>>
>> [...]
> Let's step back a bit and divide the decoding flow into two parts:
> backend and frontend.

Sorry for derailing the conversation.

(I made some additional comments on the generation of samples below)

> enum arm_spe_op_type {
>         /* First level operation type */
> 	ARM_SPE_OP_OTHER        = 1 << 0,
> 	ARM_SPE_OP_LDST		= 1 << 1,
> 	ARM_SPE_OP_BRANCH_ERET  = 1 << 2,
>
>         /* Second level operation type for OTHER */
>         ARM_SPE_OP_SVE_OTHER    = 1 << 16,
>         ARM_SPE_OP_SVE_FP       = 1 << 17,
>         ARM_SPE_OP_SVE_PRED     = 1 << 18,
>
>         /* Second level operation type for LDST */
>         ARM_SPE_OP_LD           = 1 << 16,
>         ARM_SPE_OP_ST           = 1 << 17,
>         ARM_SPE_OP_ATOMIC       = 1 << 18,
>         ARM_SPE_OP_EXCL         = 1 << 19,
>         ARM_SPE_OP_AR           = 1 << 20,
>         ARM_SPE_OP_SIMD_FP      = 1 << 21,
>         ARM_SPE_OP_GP_REG       = 1 << 22,
>         ARM_SPE_OP_UNSPEC_REG   = 1 << 23,
>         ARM_SPE_OP_NV_SYSREG    = 1 << 24,
>         ARM_SPE_OP_SVE_PRED     = 1 << 25,
>         ARM_SPE_OP_SVE_SG       = 1 << 26,
>
>         /* Second level operation type for BRANCH_ERET */
>         ARM_SPE_OP_BR_COND      = 1 << 16,
>         ARM_SPE_OP_BR_INDIRECT  = 1 << 17,
> };
>
> IIUC, Ali suggested to directly reuse packet format to express
> operation type and don't need to redefine the operation types like
> above structure arm_spe_op_type.  I personally bias to convert the raw
> packet format to more readable format, a benefit is this would give
> us more readable code.

I personally like this method as well

>
> For the frontend, we need to convert Arm SPE record to samples.
> We can explore two fields: sample::flags and sample::data_src, for
> load/store related operations, I perfer we can fill more complete
> info in field sample::data_src and extend the definitions for
> perf_mem_data_src; for branch operations, we can fill sample::flags.
>
> So I am just wandering if we can set the field
> sample::data_src::mem_lock for atomic operations, like:
>
>     data_src.mem_op   = PERF_MEM_OP_LOAD;
>     data_src.mem_lock = PERF_MEM_LOCK_ATOMIC;
>
> The field "mem_lock" is only two bits, we can consider to extend the
> structure with an extra filed "mem_lock_ext" if it cannot meet our
> requirement.

Indeed it makes more sense to use data_src as much as possible. Thanks
for pointing that out.

Some comments:

# ARM_SPE_OP_ATOMIC

  This might be a hack, but can we not represent it as both LD&SR as the
  atomic op would combine both?

  data_src.mem_op = PERF_MEM_OP_LOAD | PERF_MEM_OP_STORE;

# ARM_SPE_OP_EXCL (instructions ldxr/stxr)

  x86 doesn't seem to have similar instructions with similar semantics
  (please correct me if I'm wrong). For this arch, PERF_MEM_LOCK_LOCK
  probably suffices.

  PPC seems to have similar instructions to arm64 (lwarx/stwcx). I don't
  know if they also have instructions with same semantics as x86.

  I think it makes sense to have a PERF_MEM_LOCK_EXCL. If not, reusing
  PERF_MEM_LOCK_LOCK is the quicker alternative.

# ARM_SPE_OP_SVE_SG

  (I'm sorry if this is too far out of scope of the original patch. Let
  me know if you would prefer to discuss it on a separate channel)

  On a separate note, I'm also looking at incorporating some of the SVE
  bits in the perf samples.
 
  For this, do you think it makes sense to have two mem_* categories in
  perf_mem_data_src:

  mem_vector (2 bits)
    - simd
    - other (SVE in arm64)

  mem_src (1 bit)
    - sparse (scatter/gather loads/stores in SVE, as well as simd)

---
Thanks,
German

>>>>> +	ARM_SPE_BR		= 1 << 5,
>>>>> +	ARM_SPE_BR_COND		= 1 << 6,
>>>>> +	ARM_SPE_BR_IND		= 1 << 7,
>> Seems like we can store BR_COND in the existing "branch-miss" event
>> (--itrace=b) with:
>>
>> sample->flags = PERF_IP_FLAG_BRANCH;
>> sample->flags |= PERF_IP_FLAG_CONDITIONAL;
>> and/or
>> sample->flags |= PERF_IP_FLAG_INDIRECT;
>>
>> PERF_IP_FLAG_INDIRECT doesn't exist yet but we can probably add it.
> Yes, for branch samples, this makes sense for me.
>
> Thanks,
> Leo

next prev parent reply	other threads:[~2022-02-21 20:42 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-25 19:20 [PATCH 0/2] Allow perf scripts to process SPE raw data Ali Saidi
2022-01-25 19:20 ` [PATCH 1/2] perf arm-spe: Add arm_spe_record to synthesized sample Ali Saidi
2022-01-25 20:47   ` German Gomez
2022-01-26 15:58     ` [PATCH 1/2] perf arm-spe: Add arm_spe_record to synthesized Ali Saidi
2022-01-26 19:07       ` German Gomez
2022-01-27 19:13         ` Ali Saidi
2022-01-25 19:20 ` [PATCH 2/2] perf arm-spe: Parse more SPE fields and store source Ali Saidi
2022-01-28 17:20   ` German Gomez
2022-01-28 21:02     ` Ali Saidi
2022-02-11 16:31       ` German Gomez
2022-02-12  4:19         ` Leo Yan
2022-02-21 20:41           ` German Gomez [this message]
2022-02-22 19:29             ` Ali Saidi
2022-02-25 12:40               ` German Gomez
2022-02-27 13:54               ` Leo Yan
2022-02-27 13:20             ` Leo Yan
2022-03-01 10:54               ` German Gomez

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9266bfb6-341c-1d9c-e96f-c9f856a5ffb6@arm.com \
    --to=german.gomez@arm.com \
    --cc=acme@kernel.org \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=alisaidi@amazon.com \
    --cc=andrew.kilroy@arm.com \
    --cc=benh@kernel.crashing.org \
    --cc=james.clark@arm.com \
    --cc=john.garry@huawei.com \
    --cc=jolsa@redhat.com \
    --cc=leo.yan@linaro.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mathieu.poirier@linaro.org \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=peterz@infradead.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).