linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ali Saidi <alisaidi@amazon.com>
To: <leo.yan@linaro.org>
Cc: <Al.Grant@arm.com>, <acme@kernel.org>,
	<alexander.shishkin@linux.intel.com>, <alisaidi@amazon.com>,
	<andrew.kilroy@arm.com>, <benh@kernel.crashing.org>,
	<german.gomez@arm.com>, <james.clark@arm.com>,
	<john.garry@huawei.com>, <jolsa@redhat.com>,
	<linux-arm-kernel@lists.infradead.org>,
	<linux-kernel@vger.kernel.org>,
	<linux-perf-users@vger.kernel.org>, <mark.rutland@arm.com>,
	<mathieu.poirier@linaro.org>, <mingo@redhat.com>,
	<namhyung@kernel.org>, <peterz@infradead.org>, <will@kernel.org>
Subject: Re: [PATCH] perf arm-spe: Use SPE data source for neoverse cores
Date: Wed, 2 Feb 2022 19:51:15 +0000	[thread overview]
Message-ID: <20220202195115.12924-1-alisaidi@amazon.com> (raw)
In-Reply-To: <20220202150104.GB1939745@leoy-ThinkPad-X240s>

Hi Leo,

>Hi Ali, James,
>
> [...]
>> 
>> I'd really like someone familiar with perf c2c output to also end up getting
>> similar output when running on an Arm system with SPE. There are obviously large
>> micro-architectural differences that have been abstracted away by the data_src
>> abstraction but fundamentally my understanding of x86 HITM is that the line
>> was found in the snoop filter of the LLC as being owned by another core and
>> therefore the request needs to go to another core to get the line.  I'm not
>> 100% sure if on x86 it's really guaranteed to be dirty or not and it's not
>> always going to be dirty in a Neoverse system, but since the SPE source
>> indicates it was sourced from another core it is a core-2-core transfer of a
>> line which is currently owned by another cpu core and that is the interesting
>> data point that would be used to drive optimization and elimination of frequent
>> core-2-core transfers (true or false sharing).
>
>Though I don't know the implementation for the hardware conherency
>protocols, here I have the same understanding with Ali.
>
>For x86 arch, it uses HITM to indicate the cache line is "modified"
>state; on Arm64 Neoverse platforms, there have two data source values
>can tell us if the cache line is "modified" state:
>ARM_SPE_NV_PEER_CLSTR and ARM_SPE_NV_PEER_CORE.  The snooping can
>happen either within the cluster or cross clusters.

Yes, although it depends on the system topology if there are clusters.

>> >> +			data_src.mem_lvl = PERF_MEM_LVL_L3 | PERF_MEM_LVL_HIT;
>> >
>> >This one also adds PERF_MEM_LVL_HIT even though the check of "if (record->type & ARM_SPE_LLC_MISS)"
>> >hasn't happened yet. Maybe some comments would make it a bit clearer, but at the moment it's
>> >not obvious how the result is derived because there are some things that don't add up like
>> >ARM_SPE_LLC_MISS == PERF_MEM_LVL_HIT.
>> 
>> Assuming the above is correct, my reading of the existing code that creates the
>> c2c output is that when an access is marked as an LLC hit, that doesn't
>> necessarily mean that the data was present in the LLC. I don't see how it could
>> given that LLC_HIT + HITM means the line was dirty in another CPUs cache, and so
>> LLC_HIT + HITM seems to mean that it was a hit in the LLC snoop filter and
>> required a different core to provide the line. This and the above certainly
>> deserve a comment as to why the miss is being attributed in this way if it's
>> otherwise acceptable.
>
>As James pointed out, this might introduce confusion.  I am wanderding
>if we can extract two functions for synthesizing the data source, one is
>for Neoverse platform and another is for generic purpose (which
>without data source packets), below code is to demonstrate the basic
>idea.

The code below is cleaner, and I'm happy to rework the patches in this way, but
I think the question still remains about unifying behavior of the tool. If we
mark something with a data source of ARM_SPE_NV_PEER_CORE as at L1 hit + HITM
certainly c2c won't show the correct thing today, but i think it also hides the
intent. The line in question missed the L1, L2, and got to the LLC where we did
find a record that it was in another cores cache (L1 or L2). Looking at the way
that c2c works today, it seems like marking this as a hit in the LLC snoop
filter is the best way to unify behaviors among architectures?

I'll send you a perf.data file OOB.

Thanks,
Ali


  reply	other threads:[~2022-02-02 19:51 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-21 18:24 [PATCH] perf arm-spe: Use SPE data source for neoverse cores Ali Saidi
2022-01-24 17:24 ` James Clark
2022-01-24 22:53   ` Ali Saidi
2022-02-02 15:01     ` Leo Yan
2022-02-02 19:51       ` Ali Saidi [this message]
2022-02-03  9:19         ` Leo Yan
2022-02-05  0:07           ` Ali Saidi
2022-02-05  2:18             ` Leo Yan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220202195115.12924-1-alisaidi@amazon.com \
    --to=alisaidi@amazon.com \
    --cc=Al.Grant@arm.com \
    --cc=acme@kernel.org \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=andrew.kilroy@arm.com \
    --cc=benh@kernel.crashing.org \
    --cc=german.gomez@arm.com \
    --cc=james.clark@arm.com \
    --cc=john.garry@huawei.com \
    --cc=jolsa@redhat.com \
    --cc=leo.yan@linaro.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mathieu.poirier@linaro.org \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=peterz@infradead.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).