From: Leo Yan <leo.yan@linaro.org>
To: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@redhat.com>,
	Kan Liang <kan.liang@linux.intel.com>,
	Andi Kleen <ak@linux.intel.com>, Ian Rogers <irogers@google.com>,
	Joe Mario <jmario@redhat.com>, David Ahern <dsahern@gmail.com>,
	Don Zickus <dzickus@redhat.com>, Al Grant <Al.Grant@arm.com>,
	James Clark <james.clark@arm.com>,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v1 0/8] perf c2c: Sort cacheline with LLC load
Date: Tue, 20 Oct 2020 16:18:43 +0800
Message-ID: <20201020081843.GF13630@leoy-ThinkPad-X240s>
In-Reply-To: <CAM9d7cjuvB_67zSmaLAJJ-zS3RL5F59k8p+oqsuzJEOkAUx=WQ@mail.gmail.com>

On Tue, Oct 20, 2020 at 05:13:01PM +0900, Namhyung Kim wrote:
> Hello,
> 
> On Thu, Oct 15, 2020 at 11:51 PM Leo Yan <leo.yan@linaro.org> wrote:
> >
> > If the memory event doesn't contain a HITM tag (as with Arm SPE), we
> > cannot rely on the HITM display to report cache false sharing.
> > Alternatively, we can use the LLC access and multi-thread info to
> > locate the data address of potential false sharing.  If we then connect
> > this with the source code and analyze the threads' execution timing, we
> > can conclude whether they load and store the same cache line at the
> > same time, which helps to resolve the cache false sharing issue.
> >
> > This patch set is to enable the display with sorting on LLC load
> > accesses; it adds dimensions for total LLC hit and LLC load accesses,
> > and these dimensions are used for shared cache line table and pareto.
> >
> > This patch set is dependent on the patch set "perf c2c: Refine the
> > organization of metrics" [1].
> >
> > [1] https://lore.kernel.org/patchwork/cover/1321499/
> >
> > With this patch set, we get the 'llc' display as follows:
> >
> >   # perf c2c report -d llc --coalesce tid,pid,iaddr,dso --stdio
> 
> I'm not sure if you ran the test on x86 or ARM.
> IIUC ARM should have 0 local hitm, right?

Yes, on Arm64 both local HITM and remote HITM are zero.  The results
below are from x86.

Thanks,
Leo

> >   [...]
> >
> >   =================================================
> >              Shared Data Cache Line Table
> >   =================================================
> >   #
> >   #        ----------- Cacheline ----------  LLC Hit   LLC Hit    Total    Total    Total  ---- Stores ----  ----- Core Load Hit -----  - LLC Load Hit --  - RMT Load Hit --  --- Load Dram ----
> >   # Index             Address  Node  PA cnt      Pct     Total  records    Loads   Stores    L1Hit   L1Miss       FB       L1       L2    LclHit  LclHitm    RmtHit  RmtHitm       Lcl       Rmt
> >   # .....  ..................  ....  ......  .......  ........  .......  .......  .......  .......  .......  .......  .......  .......  ........  .......  ........  .......  ........  ........
> >   #
> >         0      0x563b01e83100     0    1401   65.32%       648     7011     3738     3273     2582      691      515     2516       59       143      505         0        0         0         0
> >         1      0x563b01e830c0     0       1   26.51%       263      400      400        0        0        0      130        3        4       262        1         0        0         0         0
> >         2      0x563b01e83080     0       1    7.76%        77      650      650        0        0        0      180      348       45        14       63         0        0         0         0
> >         3  0xffff88c3d74e82c0     0       1    0.10%         1        1        1        0        0        0        0        0        0         1        0         0        0         0         0
> >         4  0xffffa587c11e38c0   N/A       0    0.10%         1        2        1        1        1        0        0        0        0         1        0         0        0         0         0
> >         5  0xffffffffbd5e6fc0     0       1    0.10%         1        1        1        0        0        0        0        0        0         0        1         0        0         0         0
> >         6      0x7f90a4d6c2c0     0       1    0.10%         1        1        1        0        0        0        0        0        0         1        0         0        0         0         0
> >
> >   =================================================
> >         Shared Cache Line Distribution Pareto
> >   =================================================
> >   #
> >   #        ---- LLC LD ----  -- Store Refs --  --------- Data address ---------                                                   ---------- cycles ----------    Total       cpu                                  Shared
> >   #   Num   LclHit  LclHitm   L1 Hit  L1 Miss              Offset  Node  PA cnt      Pid                 Tid        Code address  rmt hitm  lcl hitm      load  records       cnt               Symbol             Object                  Source:Line  Node
> >   # .....  .......  .......  .......  .......  ..................  ....  ......  .......  ..................  ..................  ........  ........  ........  .......  ........  ...................  .................  ...........................  ....
> >   #
> >     -------------------------------------------------------------
> >         0      143      505     2582      691      0x563b01e83100
> >     -------------------------------------------------------------
> >             96.50%    7.72%   46.79%    0.00%                 0x0     0       1    14100    14102:lock_th         0x563b01c81c16         0      1949      1331     1876         1  [.] read_write_func  false_sharing.exe  false_sharing_example.c:145   0
> >              0.00%   35.05%    0.00%    0.00%                 0x0     0       1    14100    14102:lock_th         0x563b01c81c1d         0      2651       975      748         1  [.] read_write_func  false_sharing.exe  false_sharing_example.c:146   0
> >              0.00%   30.89%    0.00%    0.00%                 0x0     0       1    14100    14103:lock_th         0x563b01c81c1d         0      1425      1003      762         1  [.] read_write_func  false_sharing.exe  false_sharing_example.c:146   0
> >              2.10%    7.52%   49.19%    0.00%                 0x0     0       1    14100    14103:lock_th         0x563b01c81c16         0      1585      1053     2037         1  [.] read_write_func  false_sharing.exe  false_sharing_example.c:145   0
> >              0.00%    0.00%    2.52%   44.86%                 0x0     0       1    14100    14102:lock_th         0x563b01c81c28         0         0         0      375         1  [.] read_write_func  false_sharing.exe  false_sharing_example.c:146   0
> >              0.00%    0.00%    1.51%   55.14%                 0x0     0       1    14100    14103:lock_th         0x563b01c81c28         0         0         0      420         1  [.] read_write_func  false_sharing.exe  false_sharing_example.c:146   0
> >              1.40%   12.87%    0.00%    0.00%                0x20     0       1    14100    14104:reader_thd      0x563b01c81c73         0       166        99      417         1  [.] read_write_func  false_sharing.exe  false_sharing_example.c:155   0
> >              0.00%    5.94%    0.00%    0.00%                0x20     0       1    14100    14105:reader_thd      0x563b01c81c73         0       144        85      376         1  [.] read_write_func  false_sharing.exe  false_sharing_example.c:155   0
> >
> >   [...]
> >
> >
> > Leo Yan (8):
> >   perf mem: Add structure field c2c_stats::tot_llchit
> >   perf c2c: Add dimensions for total LLC hit
> >   perf c2c: Add dimensions for LLC load hit
> >   perf c2c: Change to general naming for macros
> >   perf c2c: Rename for shared cache line stats
> >   perf c2c: Refactor hist entry validation
> >   perf c2c: Add option '-d llc' for sorting with LLC load
> >   perf c2c: Update documentation for display option 'llc'
> >
> >  tools/perf/Documentation/perf-c2c.txt |  18 +-
> >  tools/perf/builtin-c2c.c              | 333 +++++++++++++++++++++-----
> >  tools/perf/util/mem-events.c          |   3 +
> >  tools/perf/util/mem-events.h          |   1 +
> >  4 files changed, 286 insertions(+), 69 deletions(-)
> >
> > --
> > 2.17.1
> >
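
In the Shared Data Cache Line Table quoted above, the "LLC Hit Total" column appears to be the sum of the LclHit and LclHitm columns, and "LLC Hit Pct" appears to be that sum divided by the total over all listed cache lines. This reading is an assumption inferred from the numbers in the table, not taken from the patches. A minimal Python sketch that recomputes the two columns under that assumption (the function name and data layout are hypothetical, not perf code):

```python
# Illustrative sketch only; this is not code from perf.
# (LclHit, LclHitm) pairs copied from the "LLC Load Hit" columns of the
# Shared Data Cache Line Table, indexes 0 to 6.
llc_load_hits = [
    (143, 505),
    (262, 1),
    (14, 63),
    (1, 0),
    (1, 0),
    (0, 1),
    (1, 0),
]


def llc_hit_stats(rows):
    """Return (total, percentage) per cache line, sorted by total, descending.

    Assumes total = LclHit + LclHitm and percentage = total / sum of totals.
    """
    totals = [lcl_hit + lcl_hitm for lcl_hit, lcl_hitm in rows]
    grand_total = sum(totals)
    stats = [
        (total, 100.0 * total / grand_total if grand_total else 0.0)
        for total in totals
    ]
    return sorted(stats, key=lambda entry: entry[0], reverse=True)


for index, (total, pct) in enumerate(llc_hit_stats(llc_load_hits)):
    print(f"{index}  total={total}  pct={pct:.2f}%")
```

With the values from the table, the first three lines come out as 648 (65.32%), 263 (26.51%) and 77 (7.76%), matching the quoted output.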
