Re: [PATCH] perf stat: Support per-cluster aggregation


From: Yicong Yang <yangyicong@huawei.com>
To: Namhyung Kim <namhyung@gmail.com>, "Chen, Tim C" <tim.c.chen@intel.com>
Cc: <yangyicong@hisilicon.com>, "acme@kernel.org" <acme@kernel.org>,
	"mark.rutland@arm.com" <mark.rutland@arm.com>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"james.clark@arm.com" <james.clark@arm.com>,
	"alexander.shishkin@linux.intel.com" 
	<alexander.shishkin@linux.intel.com>,
	"linux-perf-users@vger.kernel.org"
	<linux-perf-users@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Jonathan.Cameron@huawei.com" <Jonathan.Cameron@huawei.com>,
	"21cnbao@gmail.com" <21cnbao@gmail.com>,
	"prime.zeng@hisilicon.com" <prime.zeng@hisilicon.com>,
	"shenyang39@huawei.com" <shenyang39@huawei.com>,
	"linuxarm@huawei.com" <linuxarm@huawei.com>
Subject: Re: [PATCH] perf stat: Support per-cluster aggregation
Date: Wed, 29 Mar 2023 20:46:55 +0800
Message-ID: <6cd44ff7-d339-d9a4-a134-2b8b9b3dbbfa@huawei.com>
In-Reply-To: <CAM9d7cgeLdBoniAz64YrzSYKw2Y4ivy5DhEzReEzhm41M-nvSQ@mail.gmail.com>

On 2023/3/29 14:47, Namhyung Kim wrote:
> Hello,
> 
> On Fri, Mar 24, 2023 at 11:09 AM Chen, Tim C <tim.c.chen@intel.com> wrote:
>>
>>>
>>> From: Yicong Yang <yangyicong@hisilicon.com>
>>>
>>> Some platforms have 'cluster' topology and CPUs in the cluster will share
>>> resources like L3 Cache Tag (for HiSilicon Kunpeng SoC) or L2 cache (for Intel
>>> Jacobsville). Currently parsing and building cluster topology have been
>>> supported since [1].
>>>
>>> perf stat has already supported aggregation for other topologies like die or
>>> socket, etc. It'll be useful to aggregate per-cluster to find problems like L3T
>>> bandwidth contention or imbalance.
>>>
>>> This patch adds support for "--per-cluster" option for per-cluster aggregation.
>>> Also update the docs and related test. The output will be like:
>>>
>>> [root@localhost tmp]# perf stat -a -e LLC-load --per-cluster -- sleep 5
>>>
>>> Performance counter stats for 'system wide':
>>>
>>> S56-D0-CLS158    4      1,321,521,570      LLC-load
>>> S56-D0-CLS594    4        794,211,453      LLC-load
>>> S56-D0-CLS1030    4             41,623      LLC-load
>>> S56-D0-CLS1466    4             41,646      LLC-load
>>> S56-D0-CLS1902    4             16,863      LLC-load
>>> S56-D0-CLS2338    4             15,721      LLC-load
>>> S56-D0-CLS2774    4             22,671      LLC-load
>>> [...]
>>
>> Overall it looks good.  You can add my reviewed-by.
>>
>> I wonder if we could enhance the help message
>> in perf stat to tell user to refer to
>> /sys/devices/system/cpu/cpuX/topology/*_id
>> to map relevant ids back to overall cpu topology.
>>
>> For example the above example, cluster S56-D0-CLS158  has
>> really heavy load. It took me  a while
>> going through the code to figure out how to find
>> the info that maps cluster id to cpu.
> 
> Maybe we could enhance the cpu filter to accept something
> like -C S56-D0-CLS158.
> 

You mean specifying the CPUs by a topology ID such as S56-D0-CLS158,
so that we actually filter down to the CPUs in cluster 158?
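
For illustration only, here is a rough userspace sketch of what such a
lookup could do today, outside of perf. It is not part of the patch and
the -C syntax above does not exist yet. The helper names are made up for
this example. It assumes the ID has the S<socket>-D<die>-CLS<cluster>
form shown in the output, reads the per-CPU topology files under
/sys/devices/system/cpu/cpuX/topology/, and treats a missing die_id as 0
(matching the D0 that perf prints on arm64) and a missing cluster_id
as -1.

```python
import re
from pathlib import Path

SYSFS_CPU = Path("/sys/devices/system/cpu")
# Matches labels such as "S56-D0-CLS158" or "S56-D0-CLS-1".
AGGR_ID = re.compile(r"^S(-?\d+)-D(-?\d+)-CLS(-?\d+)$")


def parse_aggr_id(label):
    """Split an aggregation label into (socket, die, cluster) integers."""
    m = AGGR_ID.match(label)
    if m is None:
        raise ValueError(f"not a per-cluster aggregation ID: {label!r}")
    return tuple(int(g) for g in m.groups())


def read_topology_id(cpu_dir, name, default):
    """Read one topology attribute, or return default if it is absent."""
    try:
        return int((cpu_dir / "topology" / name).read_text().strip())
    except (OSError, ValueError):
        return default


def cpus_in_cluster(label, root=SYSFS_CPU):
    """Return the sorted CPU numbers whose topology IDs match the label."""
    wanted = parse_aggr_id(label)
    cpus = []
    for cpu_dir in root.glob("cpu[0-9]*"):
        ids = (
            read_topology_id(cpu_dir, "physical_package_id", -1),
            # Assumption: no die_id file is shown as die 0.
            read_topology_id(cpu_dir, "die_id", 0),
            # Assumption: no cluster_id file is shown as cluster -1.
            read_topology_id(cpu_dir, "cluster_id", -1),
        )
        if ids == wanted:
            cpus.append(int(cpu_dir.name[3:]))
    return sorted(cpus)


if __name__ == "__main__":
    cpus = cpus_in_cluster("S56-D0-CLS158")
    # The result can be passed to the existing CPU list option, -C.
    print(",".join(str(c) for c in cpus))
```

The printed list can be fed to the existing option, e.g.
perf stat -C <list>, until perf itself can accept a topology ID.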

> I also wonder what if it runs on an old kernel which doesn't
> have the cluster_id file.

It should still work, but the result may not be meaningful at the
cluster level. There is no die topology, nor any related sysfs
attribute, on arm64, yet --per-die works like this:

[root@localhost perf]# perf stat -a -e cycles --per-die -- sleep 1

 Performance counter stats for 'system wide':

S56-D0         64         12,700,186      cycles
S7182-D0       64         20,297,320      cycles

       1.003638080 seconds time elapsed

On a legacy kernel without the cluster sysfs attributes, the output will
look like:

[root@localhost perf]# perf stat -a -e cycles --per-cluster -- sleep 1

 Performance counter stats for 'system wide':

S56-D0-CLS-1   64         12,634,251      cycles
S7182-D0-CLS-1   64         16,348,322      cycles

       1.003696680 seconds time elapsed

The patch just assigns -1 to the cluster id. I'll modify this to keep it
consistent with the output of --per-die. Thanks for catching this!
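
As a sketch of one possible reading of "consistent with --per-die" (an
assumption, the next version of the patch may do it differently): map a
missing ID (-1) to 0 before building the label, so that a legacy kernel
prints CLS0 in the same way the --per-die output above prints D0. The
function names are hypothetical and this is Python for illustration,
not the perf C code.

```python
def normalize_id(raw):
    """Map the 'attribute missing' marker (-1) to 0; keep real IDs as-is."""
    return 0 if raw == -1 else raw


def cluster_label(socket, die, cluster):
    """Build an S<socket>-D<die>-CLS<cluster> label with fallbacks applied."""
    return f"S{socket}-D{normalize_id(die)}-CLS{normalize_id(cluster)}"


if __name__ == "__main__":
    # Legacy kernel: neither die_id nor cluster_id is available.
    print(cluster_label(56, -1, -1))
    # Kernel with cluster topology exported.
    print(cluster_label(56, 0, 158))
```

With this fallback all CPUs of a socket still end up in a single
aggregate on a legacy kernel, only the label changes.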

Thanks,
Yicong
