From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Lin Ming <ming.m.lin@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>, Andi Kleen <andi@firstfloor.org>,
Stephane Eranian <eranian@google.com>,
Arnaldo Carvalho de Melo <acme@ghostprotocols.net>,
linux-kernel <linux-kernel@vger.kernel.org>,
Robert Richter <robert.richter@amd.com>
Subject: Re: [PATCH 1/4] perf: Add memory load/store events generic code
Date: Tue, 05 Jul 2011 16:17:48 +0200
Message-ID: <1309875468.3282.210.camel@twins>
In-Reply-To: <1309866860.2381.1.camel@localhost>
On Tue, 2011-07-05 at 19:54 +0800, Lin Ming wrote:
> On Mon, 2011-07-04 at 19:16 +0800, Peter Zijlstra wrote:
> > On Mon, 2011-07-04 at 08:02 +0000, Lin Ming wrote:
> > > +#define MEM_STORE_DCU_HIT (1ULL << 0)
> >
> > I'm pretty sure that's not Dublin City University, but what is it?
> > Data-Cache-Unit? what does that mean, L1/L2 or also L3?
> >
> > > +#define MEM_STORE_STLB_HIT (1ULL << 1)
> >
> > What's an sTLB? I know iTLB and dTLB's but sTLBs I've not heard of yet.
> >
> > > +#define MEM_STORE_LOCKED_ACCESS (1ULL << 2)
> >
> > Presumably that's about LOCK'ed ops?
> >
> > So now you're just tacking bits on the end without even attempting to
> > generalize/unify things, not charmed at all.
>
> Any idea on the more useful store bits encoding?
For two of them, sure:
{load, store} x {atomic} x
{hasSRC} x {l1, l2, l3, ram, unknown, io, uncached, reserved} x
{hasLRS} x {local, remote, snoop} x
{hasMESI} x {MESI}
that would make MEM_STORE_DCU_HIT: store-l1 and MEM_STORE_LOCKED_ACCESS:
store-atomic.
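As a rough sketch of what such an enumerated encoding could look like
in C: every name, field width and bit position below is made up for
illustration and is not an existing perf ABI. Each hasFOO bit says
whether the field that follows it carries valid information.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout for illustration only; not an existing perf ABI. */
enum mem_op   { MEM_OP_LOAD, MEM_OP_STORE };
enum mem_src  { MEM_SRC_L1, MEM_SRC_L2, MEM_SRC_L3, MEM_SRC_RAM,
		MEM_SRC_UNKNOWN, MEM_SRC_IO, MEM_SRC_UNCACHED,
		MEM_SRC_RESERVED };
enum mem_lrs  { MEM_LRS_LOCAL, MEM_LRS_REMOTE, MEM_LRS_SNOOP };
enum mem_mesi { MEM_MESI_M, MEM_MESI_E, MEM_MESI_S, MEM_MESI_I };

#define MEM_OP_SHIFT		0		/* 1 bit: load or store */
#define MEM_ATOMIC_BIT		(1ULL << 1)
#define MEM_HAS_SRC_BIT		(1ULL << 2)
#define MEM_SRC_SHIFT		3		/* 3 bits, valid iff hasSRC */
#define MEM_HAS_LRS_BIT		(1ULL << 6)
#define MEM_LRS_SHIFT		7		/* 2 bits, valid iff hasLRS */
#define MEM_HAS_MESI_BIT	(1ULL << 9)
#define MEM_MESI_SHIFT		10		/* 2 bits, valid iff hasMESI */

/* Start a descriptor from the {load, store} x {atomic} part. */
static inline uint64_t mem_encode(enum mem_op op, int atomic)
{
	return ((uint64_t)op << MEM_OP_SHIFT) | (atomic ? MEM_ATOMIC_BIT : 0);
}

/* Record a data source and mark the field as valid. */
static inline uint64_t mem_with_src(uint64_t v, enum mem_src src)
{
	assert((unsigned int)src <= MEM_SRC_RESERVED);
	v &= ~(7ULL << MEM_SRC_SHIFT);
	return v | MEM_HAS_SRC_BIT | ((uint64_t)src << MEM_SRC_SHIFT);
}

/* Record local/remote/snoop and mark the field as valid. */
static inline uint64_t mem_with_lrs(uint64_t v, enum mem_lrs lrs)
{
	v &= ~(3ULL << MEM_LRS_SHIFT);
	return v | MEM_HAS_LRS_BIT | ((uint64_t)lrs << MEM_LRS_SHIFT);
}

/* Record the MESI state and mark the field as valid. */
static inline uint64_t mem_with_mesi(uint64_t v, enum mem_mesi mesi)
{
	v &= ~(3ULL << MEM_MESI_SHIFT);
	return v | MEM_HAS_MESI_BIT | ((uint64_t)mesi << MEM_MESI_SHIFT);
}

static inline int mem_has_src(uint64_t v)
{
	return !!(v & MEM_HAS_SRC_BIT);
}

static inline enum mem_src mem_get_src(uint64_t v)
{
	return (enum mem_src)((v >> MEM_SRC_SHIFT) & 7);
}

/* MEM_STORE_DCU_HIT expressed as store-l1. */
static inline uint64_t mem_store_dcu_hit(void)
{
	return mem_with_src(mem_encode(MEM_OP_STORE, 0), MEM_SRC_L1);
}

/* MEM_STORE_LOCKED_ACCESS expressed as store-atomic, no source info. */
static inline uint64_t mem_store_locked_access(void)
{
	return mem_encode(MEM_OP_STORE, 1);
}
```

Note that store-atomic leaves hasSRC clear, so a consumer has to check
mem_has_src() before trusting mem_get_src().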
Now this is needed for load-latency as well, since SNB extended the src
information with the same STLB/LOCK bits.
The SDM is somewhat inconsistent on what an STLB_MISS means:
Table 30-22 says: 0 - did not miss STLB (hit the DTLB/STLB), 1 - missed
the STLB.
Table 30-23 says: "the store missed the STLB if set, otherwise the store
hit the STLB", which simply cannot be true.
So I'm sticking with 30-22.
Now the above doesn't yet deal with TLBs, nor can it map the IBS data
source bits, because afaict IBS can report a u-op as both a store and a
load. It also doesn't say whether a data-cache miss means L1 or L1/L2,
Robert?
One way to sort all that is to not use enumerated spaces like above but
simply explode the whole thing like: load x store x atomic x l1 x l2
x ... That would of course give rise to a load of impossible
combinations, but would do away with the hasFOO bits.
If the AMD data-cache means L1/L2 it can simply set both bits; likewise
the Intel STLB miss can set the TLB1/TLB2 bits (AMD does split those
nicely).
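A minimal sketch of the exploded variant, again with invented names and
bit positions: every property gets its own bit, so hardware that cannot
tell two levels apart just sets both. What exactly a set level bit means
for a miss report is left open here and is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions; one bit per property, no hasFOO bits. */
#define MEM_LOAD	(1ULL << 0)
#define MEM_STORE	(1ULL << 1)
#define MEM_ATOMIC	(1ULL << 2)
#define MEM_L1		(1ULL << 3)
#define MEM_L2		(1ULL << 4)
#define MEM_L3		(1ULL << 5)
#define MEM_RAM		(1ULL << 6)
#define MEM_TLB1	(1ULL << 7)
#define MEM_TLB2	(1ULL << 8)

#define MEM_OP_MASK	(MEM_LOAD | MEM_STORE)

/*
 * Combine an operation with level bits. At least one of load/store must
 * be set; both may be set, as for a u-op reported as load and store.
 */
static inline uint64_t mem_sample(uint64_t op, uint64_t levels)
{
	assert(op & MEM_OP_MASK);
	assert(!(op & ~(MEM_OP_MASK | MEM_ATOMIC)));
	return op | levels;
}

/* Data-cache report that may cover L1 only or L1 and L2 together. */
static inline uint64_t mem_dcache_levels(int covers_l2)
{
	return covers_l2 ? (MEM_L1 | MEM_L2) : MEM_L1;
}

/* An STLB miss report that does not split the TLB levels: set both. */
static inline uint64_t mem_stlb_miss_levels(void)
{
	return MEM_TLB1 | MEM_TLB2;
}
```

The price is that nothing stops a caller from building nonsense such as
a sample with every level bit set; mem_sample() only checks the
operation bits.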
With all those bits exploded we can also express the inverse of
MEM_STORE_DCU_HIT as: store-l2-l3-dram, we simply set ~l1 for the
appropriate submask (which should arguably include IO/uncached/unknown
as well).
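The inversion can be sketched like this, assuming a source submask that
does include IO/uncached/unknown. The bit positions and names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions for illustration only. */
#define MEM_STORE	(1ULL << 1)
#define MEM_L1		(1ULL << 3)
#define MEM_L2		(1ULL << 4)
#define MEM_L3		(1ULL << 5)
#define MEM_RAM		(1ULL << 6)
#define MEM_UNKNOWN	(1ULL << 9)
#define MEM_IO		(1ULL << 10)
#define MEM_UNCACHED	(1ULL << 11)

/* All bits that describe where the data came from. */
#define MEM_SRC_MASK	(MEM_L1 | MEM_L2 | MEM_L3 | MEM_RAM | \
			 MEM_UNKNOWN | MEM_IO | MEM_UNCACHED)

/* Invert a set of source bits, staying inside the source submask. */
static inline uint64_t mem_src_invert(uint64_t src_bits)
{
	assert(!(src_bits & ~MEM_SRC_MASK));
	return ~src_bits & MEM_SRC_MASK;
}

/* Inverse of MEM_STORE_DCU_HIT: a store served by anything but L1. */
static inline uint64_t mem_store_dcu_miss(void)
{
	return MEM_STORE | mem_src_invert(MEM_L1);
}
```

Masking with MEM_SRC_MASK keeps the inversion from spilling into the
operation bits or any other field.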
Now if anybody knows of another arch that has similar features (IA64,
ppc64?) can we get links to their PMU docs so that we can see if we can
map them as well?
Comments?