linux-perf-users.vger.kernel.org archive mirror
From: Manuel Selva <selva.manuel@gmail.com>
To: linux-perf-users@vger.kernel.org
Subject: Remote Cache Accesses (1 Hop)
Date: Fri, 17 Jan 2014 14:24:09 +0100	[thread overview]
Message-ID: <52D92EF9.8000903@gmail.com> (raw)

Hi all,

I wrote a benchmarking program in order to play with the perf_event_open
memory sampling capabilities (as discussed earlier on this list).

My benchmark allocates a large array (120 MB by default), then starts
sampling with perf_event_open on the
MEM_INST_RETIRED_LATENCY_ABOVE_THRESHOLD event (with threshold = 3, the
minimum according to Intel's documentation), along with memory access
counting (uncore event QMC_NORMAL_READ.ANY), and then accesses all of
the allocated memory either sequentially or randomly. The benchmark is
single-threaded, pinned to a given core, and the memory is allocated on
the NUMA node associated with that core using the numa_alloc functions.
perf_event_open is thus called to monitor only this core/NUMA node. The
code is compiled without any optimization, and I tried as much as
possible to increase the ratio of memory-access instructions to
branching and other code.
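
For completeness, the event setup looks roughly like this (a sketch,
not my exact code; the raw encoding 0x100b for the latency event and
the use of config1 for the threshold are what I believe is correct for
my CPU, so treat them as assumptions):

#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Sketch of the load-latency sampling setup. The raw config value
 * for MEM_INST_RETIRED_LATENCY_ABOVE_THRESHOLD is model specific
 * (0x100b here is an assumption), and the latency threshold goes
 * in config1. */
static int open_latency_event(int cpu)
{
        struct perf_event_attr attr;

        memset(&attr, 0, sizeof(attr));
        attr.size = sizeof(attr);
        attr.type = PERF_TYPE_RAW;
        attr.config = 0x100b;          /* assumed raw encoding */
        attr.config1 = 3;              /* latency threshold (cycles) */
        attr.sample_period = 1000;     /* one sample every 1000 events */
        attr.sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_ADDR |
                           PERF_SAMPLE_DATA_SRC;
        attr.precise_ip = 2;           /* PEBS, needed for data source */
        attr.disabled = 1;
        attr.exclude_kernel = 1;

        /* pid = -1, cpu = the core my thread is pinned to */
        return syscall(__NR_perf_event_open, &attr, -1, cpu, -1, 0);
}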

I can clearly see from the QMC_NORMAL_READ.ANY event count that my
random access test case generates far more memory accesses than the
sequential one (I guess the prefetcher handling the sequential accesses
is responsible for that).

Regarding the sampled event, I successfully mmap the result of
perf_event_open and I am able to read the samples. My problem is that
even for the random access test case on the 120 MB array, I don't get
any samples served by RAM (I am using the PERF_SAMPLE_DATA_SRC field);
the sampling period is 1000 events and I only get ~700 samples.
Nevertheless, in the sequential case 0.01% of my samples are remote
cache accesses (1 hop), whereas this percentage is ~20% in the random
case.
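
In case it matters, this is roughly how I bucket each sample's data
source (sketch only; the perf_mem_data_src union and the
PERF_MEM_LVL_* flags are from linux/perf_event.h):

#include <linux/perf_event.h>
#include <stdio.h>

/* Rough sketch of how I classify each sample. 'raw' is the u64 taken
 * from the sample record when PERF_SAMPLE_DATA_SRC is requested. */
static void classify_sample(unsigned long long raw)
{
        union perf_mem_data_src src = { .val = raw };

        if (src.mem_lvl & PERF_MEM_LVL_REM_CCE1)
                printf("remote cache (1 hop)\n");
        else if (src.mem_lvl & (PERF_MEM_LVL_LOC_RAM | PERF_MEM_LVL_REM_RAM1))
                printf("RAM\n");
        else if (src.mem_lvl & (PERF_MEM_LVL_L1 | PERF_MEM_LVL_L2 |
                                PERF_MEM_LVL_L3))
                printf("local cache\n");
        else
                printf("other (0x%llx)\n", raw);
}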

So my questions are:

- What exactly is a remote cache (1 hop) access? Data found in another 
core's private cache (L1 or L2 in my case) on the same processor?

- How can I interpret my results? I was expecting the local memory 
access samples to increase in the random case, not the remote cache 
(1 hop) ones. How can I get remote cache accesses for malloc'ed data 
that is not shared and is used only by a thread pinned to a given core?

- I haven't looked at this in detail yet, but ~700 samples for 
accessing 120 MB while sampling every 1000 events seems small. I am 
going to check the generated assembly code and also count the total 
number of memory requests (a core event) and compare that to the 700 
samples * 1000 events (see the back-of-the-envelope sketch below).
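
To make that last point concrete, here is the back-of-the-envelope
count I have in mind (assuming 64-byte lines and one counted load per
line touched, which may well not match my actual access pattern):

#include <stdio.h>

/* Assumptions: 64-byte lines, one counted load per line touched,
 * sampling period of 1000. */
int main(void)
{
        unsigned long long array_bytes = 120ULL << 20;  /* ~120 MB */
        unsigned long long line_size   = 64;
        unsigned long long period      = 1000;

        unsigned long long lines   = array_bytes / line_size; /* ~1.9M */
        unsigned long long samples = lines / period;          /* ~1966 */

        /* If every line touched produced one load above the 3-cycle
         * threshold, I would expect on the order of ~2000 samples, so
         * the ~700 I actually get would mean only a fraction of the
         * loads are tagged by the event. */
        printf("expected upper bound: ~%llu samples\n", samples);
        return 0;
}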

Thanks in advance for any suggestions that may help me understand 
what's happening here.

-- 
Manu

Thread overview: 4+ messages
2014-01-17 13:24 Manuel Selva [this message]
2014-01-23 16:02 ` Remote Cache Accesses (1 Hop) Manuel Selva
2014-01-29 13:47   ` Manuel Selva
2014-02-03 18:29     ` Manuel Selva
