Re: [PATCH v9] perf stat: Fix wrong skipping for per-die aggregation

From: "Jin, Yao" <yao.jin@linux.intel.com>
To: Jiri Olsa <jolsa@redhat.com>
Cc: acme@kernel.org, jolsa@kernel.org, peterz@infradead.org,
	mingo@redhat.com, alexander.shishkin@linux.intel.com,
	Linux-kernel@vger.kernel.org, ak@linux.intel.com,
	kan.liang@intel.com, yao.jin@intel.com, ying.huang@intel.com
Subject: Re: [PATCH v9] perf stat: Fix wrong skipping for per-die aggregation
Date: Thu, 18 Feb 2021 08:24:02 +0800
Message-ID: <88f2c092-2fc8-7b3f-3d41-e2ac64bc7eb9@linux.intel.com>
In-Reply-To: <YBcuvN106bsa7F+9@krava>

Hi Arnaldo,

On 2/1/2021 6:27 AM, Jiri Olsa wrote:
> On Thu, Jan 28, 2021 at 09:34:17AM +0800, Jin Yao wrote:
>> Uncore is die-scope on Xeon Cascade Lake-AP, and perf already supports
>> --per-die aggregation.
>>
>> An issue was found in check_per_pkg() for uncore events running on an
>> AP system. On Cascade Lake-AP, we have:
>>
>> S0-D0
>> S0-D1
>> S1-D0
>> S1-D1
>>
>> But in check_per_pkg(), S0-D1 and S1-D1 are skipped because the
>> mask bits for S0 and S1 have already been set by S0-D0 and S1-D0;
>> die_id is never checked. As a result, the counts for S0-D1 and S1-D1
>> are set to zero, which is not correct.
>>
>> root@lkp-csl-2ap4 ~# ./perf stat -a -I 1000 -e llc_misses.mem_read --per-die -- sleep 5
>>       1.001460963 S0-D0           1            1317376 Bytes llc_misses.mem_read
>>       1.001460963 S0-D1           1             998016 Bytes llc_misses.mem_read
>>       1.001460963 S1-D0           1             970496 Bytes llc_misses.mem_read
>>       1.001460963 S1-D1           1            1291264 Bytes llc_misses.mem_read
>>       2.003488021 S0-D0           1            1082048 Bytes llc_misses.mem_read
>>       2.003488021 S0-D1           1            1919040 Bytes llc_misses.mem_read
>>       2.003488021 S1-D0           1             890752 Bytes llc_misses.mem_read
>>       2.003488021 S1-D1           1            2380800 Bytes llc_misses.mem_read
>>       3.005613270 S0-D0           1            1126080 Bytes llc_misses.mem_read
>>       3.005613270 S0-D1           1            2898176 Bytes llc_misses.mem_read
>>       3.005613270 S1-D0           1             870912 Bytes llc_misses.mem_read
>>       3.005613270 S1-D1           1            3388608 Bytes llc_misses.mem_read
>>       4.007627598 S0-D0           1            1124608 Bytes llc_misses.mem_read
>>       4.007627598 S0-D1           1            3884416 Bytes llc_misses.mem_read
>>       4.007627598 S1-D0           1             921088 Bytes llc_misses.mem_read
>>       4.007627598 S1-D1           1            4451840 Bytes llc_misses.mem_read
>>       5.001479927 S0-D0           1             963328 Bytes llc_misses.mem_read
>>       5.001479927 S0-D1           1            4831936 Bytes llc_misses.mem_read
>>       5.001479927 S1-D0           1             895104 Bytes llc_misses.mem_read
>>       5.001479927 S1-D1           1            5496640 Bytes llc_misses.mem_read
>>
>> From the above output, we can see that S0-D1 and S1-D1 don't report
>> interval values; they keep growing. That's because check_per_pkg()
>> wrongly decides to use zero counts for S0-D1 and S1-D1.
>>
>> So in check_per_pkg(), we should use hashmap(socket,die) to decide
>> whether the cpu counts need to be skipped. Considering only the socket
>> is not enough.
>>
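
For readers of the archive, the idea in the paragraph above can be sketched
roughly as follows. This is a minimal illustration and not the code from the
patch: seen_set, make_key, should_skip and should_skip_socket_only are
hypothetical names, and a small linear array stands in for perf's hashmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_KEYS 64	/* hypothetical capacity, enough for a demo */

/* Hypothetical stand-in for a hash set of (socket, die) pairs. */
struct seen_set {
	uint64_t keys[MAX_KEYS];
	size_t nr;
};

/* Pack socket and die into one 64-bit key so both take part in the lookup. */
static uint64_t make_key(int socket, int die)
{
	return ((uint64_t)(uint32_t)die << 32) | (uint32_t)socket;
}

/*
 * Return true if the counts for this (socket, die) pair should be skipped
 * because the pair was already seen; otherwise record the pair.
 */
static bool should_skip(struct seen_set *set, int socket, int die)
{
	uint64_t key = make_key(socket, die);

	for (size_t i = 0; i < set->nr; i++) {
		if (set->keys[i] == key)
			return true;
	}

	assert(set->nr < MAX_KEYS);
	set->keys[set->nr++] = key;
	return false;
}

/*
 * The socket-only check described above, for comparison: one bit per
 * socket, die ignored. Assumes socket is in the range 0..63.
 */
static bool should_skip_socket_only(uint64_t *mask, int socket)
{
	uint64_t bit = 1ULL << socket;

	if (*mask & bit)
		return true;

	*mask |= bit;
	return false;
}
```

With S0-D0 already recorded, should_skip(&set, 0, 1) still returns false, so
S0-D1 is counted, whereas the socket-only check skips any second die on the
same socket. When die is always 0, the key reduces to the socket alone.
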
>> Now with this patch,
>>
>> root@lkp-csl-2ap4 ~# ./perf stat -a -I 1000 -e llc_misses.mem_read --per-die -- sleep 5
>>       1.001586691 S0-D0           1            1229440 Bytes llc_misses.mem_read
>>       1.001586691 S0-D1           1             976832 Bytes llc_misses.mem_read
>>       1.001586691 S1-D0           1             938304 Bytes llc_misses.mem_read
>>       1.001586691 S1-D1           1            1227328 Bytes llc_misses.mem_read
>>       2.003776312 S0-D0           1            1586752 Bytes llc_misses.mem_read
>>       2.003776312 S0-D1           1             875392 Bytes llc_misses.mem_read
>>       2.003776312 S1-D0           1             855616 Bytes llc_misses.mem_read
>>       2.003776312 S1-D1           1             949376 Bytes llc_misses.mem_read
>>       3.006512788 S0-D0           1            1338880 Bytes llc_misses.mem_read
>>       3.006512788 S0-D1           1             920064 Bytes llc_misses.mem_read
>>       3.006512788 S1-D0           1             877184 Bytes llc_misses.mem_read
>>       3.006512788 S1-D1           1            1020736 Bytes llc_misses.mem_read
>>       4.008895291 S0-D0           1             926592 Bytes llc_misses.mem_read
>>       4.008895291 S0-D1           1             906368 Bytes llc_misses.mem_read
>>       4.008895291 S1-D0           1             892224 Bytes llc_misses.mem_read
>>       4.008895291 S1-D1           1             987712 Bytes llc_misses.mem_read
>>       5.001590993 S0-D0           1             962624 Bytes llc_misses.mem_read
>>       5.001590993 S0-D1           1             912512 Bytes llc_misses.mem_read
>>       5.001590993 S1-D0           1             891200 Bytes llc_misses.mem_read
>>       5.001590993 S1-D1           1             978432 Bytes llc_misses.mem_read
>>
>> On a no-die system, die_id is 0, so the key is effectively
>> hashmap(socket,0) and the original behavior is unchanged.
>>
>> Reported-by: Huang Ying <ying.huang@intel.com>
>> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
>> ---
>> v9:
>>   Rename zero_per_pkg to evsel__zero_per_pkg and move it to evsel.c, so
>>   that evsel__zero_per_pkg can be called from different code paths.
>>
>>   Call evsel__zero_per_pkg in evsel__exit().
> 
> Acked-by: Jiri Olsa <jolsa@redhat.com>
> 
> thanks,
> jirka
> 

Can this fix be accepted, or is there anything else I need to improve?

Thanks
Jin Yao

Thread overview: 4+ messages
2021-01-28  1:34 [PATCH v9] perf stat: Fix wrong skipping for per-die aggregation Jin Yao
2021-01-31 22:27 ` Jiri Olsa
2021-02-18  0:24   ` Jin, Yao [this message]
2021-03-03 15:45   ` Arnaldo Carvalho de Melo
