* [PATCH] wrong PERF_COUNT_HW_CACHE_REFERENCES and PERF_COUNT_HW_CACHE_MISSES for AMD
@ 2010-11-01 14:11 Robert Schöne
2010-11-02 1:55 ` Stephane Eranian
0 siblings, 1 reply; 4+ messages in thread
From: Robert Schöne @ 2010-11-01 14:11 UTC (permalink / raw)
To: Stephane Eranian, Vince Weaver, Peter Zijlstra, Robert Richter,
Ingo Molnar
Cc: x86, linux-kernel
The current arch/x86/kernel/cpu/perf_event_amd.c file lists
L1-Instruction-Cache Misses and Accesses as PERF_COUNT_HW_CACHE_MISSES
resp. PERF_COUNT_HW_CACHE_REFERENCES.
This fix uses L2C-Misses and Accesses instead. (Real LLC-events would be
better, but there are some restrictions for Northbridge Events on AMD).
The event codes are copied from the list of cache events from the same
file.
Signed-off-by: Robert Schoene <robert.schoene@tu-dresden.de>
--- a/arch/x86/kernel/cpu/perf_event_amd.c
+++ b/arch/x86/kernel/cpu/perf_event_amd.c
@@ -100,8 +100,8 @@ static const u64 amd_perfmon_event_map[] =
 {
 	[PERF_COUNT_HW_CPU_CYCLES] = 0x0076,
 	[PERF_COUNT_HW_INSTRUCTIONS] = 0x00c0,
-	[PERF_COUNT_HW_CACHE_REFERENCES] = 0x0080,
-	[PERF_COUNT_HW_CACHE_MISSES] = 0x0081,
+	[PERF_COUNT_HW_CACHE_REFERENCES] = 0x037D,
+	[PERF_COUNT_HW_CACHE_MISSES] = 0x037E,
 	[PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x00c2,
 	[PERF_COUNT_HW_BRANCH_MISSES] = 0x00c3,
 };
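
For readers unfamiliar with the raw codes in the table above, here is a minimal sketch of how a value such as 0x037D splits into its two low fields. It assumes the usual x86 raw-event layout (event select in bits 0-7, unit mask in bits 8-15). The helper names are hypothetical and are not part of the kernel.

```c
#include <stdint.h>

/*
 * Hypothetical helpers (not kernel API): split a raw event code as it
 * appears in amd_perfmon_event_map[] into event select and unit mask.
 * Assumed layout: bits 0-7 = event select, bits 8-15 = unit mask.
 */

/* Bits 0-7: which hardware event is counted. */
static inline uint8_t amd_event_select(uint64_t code)
{
	return (uint8_t)(code & 0xFF);
}

/* Bits 8-15: unit mask qualifying the event. */
static inline uint8_t amd_unit_mask(uint64_t code)
{
	return (uint8_t)((code >> 8) & 0xFF);
}
```

Under that layout, the new entry 0x037D is event select 0x7D with unit mask 0x03, while the old entry 0x0080 is event select 0x80 with unit mask 0x00.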
* Re: [PATCH] wrong PERF_COUNT_HW_CACHE_REFERENCES and PERF_COUNT_HW_CACHE_MISSES for AMD
2010-11-01 14:11 [PATCH] wrong PERF_COUNT_HW_CACHE_REFERENCES and PERF_COUNT_HW_CACHE_MISSES for AMD Robert Schöne
@ 2010-11-02 1:55 ` Stephane Eranian
2010-11-02 11:08 ` Robert Schöne
0 siblings, 1 reply; 4+ messages in thread
From: Stephane Eranian @ 2010-11-02 1:55 UTC (permalink / raw)
To: Robert Schöne
Cc: Vince Weaver, Peter Zijlstra, Robert Richter, Ingo Molnar, x86, linux-kernel

Hi,

On Mon, Nov 1, 2010 at 3:11 PM, Robert Schöne <robert.schoene@tu-dresden.de> wrote:
>
> The current arch/x86/kernel/cpu/perf_event_amd.c file lists
> L1-Instruction-Cache Misses and Accesses as PERF_COUNT_HW_CACHE_MISSES
> resp. PERF_COUNT_HW_CACHE_REFERENCES.
>
I always thought PERF_COUNT_HW_CACHE_* was about data cache misses.
But given that there are no clear definitions for those events, it
creates confusion.

If you change the meaning of HW_CACHE_MISSES, then it seems to me you need
to change the mapping in the perf tool, because now it includes both data+code.

> This fix uses L2C-Misses and Accesses instead. (Real LLC-events would be
> better, but there are some restrictions for Northbridge Events on AMD).
>
And those constraints are handled correctly by the kernel.

The constraint is that you cannot have more than 4 instances of
Northbridge events active at the same time per core. If you do, then one
of them will starve (if issued from different cores).

> --- a/arch/x86/kernel/cpu/perf_event_amd.c
> +++ b/arch/x86/kernel/cpu/perf_event_amd.c
> @@ -100,8 +100,8 @@ static const u64 amd_perfmon_event_map[] =
>  {
>  	[PERF_COUNT_HW_CPU_CYCLES] = 0x0076,
>  	[PERF_COUNT_HW_INSTRUCTIONS] = 0x00c0,
> -	[PERF_COUNT_HW_CACHE_REFERENCES] = 0x0080,
> -	[PERF_COUNT_HW_CACHE_MISSES] = 0x0081,
> +	[PERF_COUNT_HW_CACHE_REFERENCES] = 0x037D,
> +	[PERF_COUNT_HW_CACHE_MISSES] = 0x037E,
>  	[PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x00c2,
>  	[PERF_COUNT_HW_BRANCH_MISSES] = 0x00c3,
>  };

* Re: [PATCH] wrong PERF_COUNT_HW_CACHE_REFERENCES and PERF_COUNT_HW_CACHE_MISSES for AMD
2010-11-02 1:55 ` Stephane Eranian
@ 2010-11-02 11:08 ` Robert Schöne
2010-11-22 11:08 ` Stephane Eranian
0 siblings, 1 reply; 4+ messages in thread
From: Robert Schöne @ 2010-11-02 11:08 UTC (permalink / raw)
To: Stephane Eranian
Cc: Vince Weaver, Peter Zijlstra, Robert Richter, Ingo Molnar, x86, linux-kernel

Hi,

On Tuesday, 02.11.2010, at 02:55 +0100, Stephane Eranian wrote:
> Hi,
>
> On Mon, Nov 1, 2010 at 3:11 PM, Robert Schöne
> <robert.schoene@tu-dresden.de> wrote:
> >
> > The current arch/x86/kernel/cpu/perf_event_amd.c file lists
> > L1-Instruction-Cache Misses and Accesses as PERF_COUNT_HW_CACHE_MISSES
> > resp. PERF_COUNT_HW_CACHE_REFERENCES.
> >
> I always thought PERF_COUNT_HW_CACHE_* was about data cache misses.
> But given that there are no clear definitions for those events, it
> creates confusion.
>
That's what I thought too before reading the AMD BKDG for Family 10h.
It always seemed to me that the "hardware" event type was a kind of
mapping to the Intel "architectural events". And in their definition it
reads as LLC.

> If you change the meaning of HW_CACHE_MISSES, then it seems to me you need
> to change the mapping in the perf tool, because now it includes both data+code.
>
So does the Intel implementation. It's just LLC misses, with no
definition of what was accessed.

> > This fix uses L2C-Misses and Accesses instead. (Real LLC-events would be
> > better, but there are some restrictions for Northbridge Events on AMD).
> >
> And those constraints are handled correctly by the kernel.
>
> The constraint is that you cannot have more than 4 instances of
> Northbridge events active at the same time per core. If you do, then one
> of them will starve (if issued from different cores).
>
Yes, we could use event 4E1 (L3 Cache Misses), but we would need
different event IDs for the different AMD families. Not all of them have
an L3 cache, and even some implementations of Family 10h don't have one.
Since this event ID is a fixed definition, we would have to introduce a
"placeholder" definition that is replaced, whenever a cache
misses/accesses event is set up, by the "Last Level Cache" event ID of
the processor that is actually in the system.

> > --- a/arch/x86/kernel/cpu/perf_event_amd.c
> > +++ b/arch/x86/kernel/cpu/perf_event_amd.c
> > @@ -100,8 +100,8 @@ static const u64 amd_perfmon_event_map[] =
> >  {
> >  	[PERF_COUNT_HW_CPU_CYCLES] = 0x0076,
> >  	[PERF_COUNT_HW_INSTRUCTIONS] = 0x00c0,
> > -	[PERF_COUNT_HW_CACHE_REFERENCES] = 0x0080,
> > -	[PERF_COUNT_HW_CACHE_MISSES] = 0x0081,
> > +	[PERF_COUNT_HW_CACHE_REFERENCES] = 0x037D,
> > +	[PERF_COUNT_HW_CACHE_MISSES] = 0x037E,
> >  	[PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x00c2,
> >  	[PERF_COUNT_HW_BRANCH_MISSES] = 0x00c3,
> >  };

* Re: [PATCH] wrong PERF_COUNT_HW_CACHE_REFERENCES and PERF_COUNT_HW_CACHE_MISSES for AMD
2010-11-02 11:08 ` Robert Schöne
@ 2010-11-22 11:08 ` Stephane Eranian
0 siblings, 0 replies; 4+ messages in thread
From: Stephane Eranian @ 2010-11-22 11:08 UTC (permalink / raw)
To: Robert Schöne
Cc: Vince Weaver, Peter Zijlstra, Robert Richter, Ingo Molnar, x86, linux-kernel

Robert,

Has there been any progress on this issue?

On Tue, Nov 2, 2010 at 12:08 PM, Robert Schöne <robert.schoene@tu-dresden.de> wrote:
>
> Yes, we could use event 4E1 (L3 Cache Misses), but we would need
> different event IDs for the different AMD Families. Not all of them have
> an L3-Cache and even some implementations of Family 10h don't have L3
> either.

I think you could introduce several generic event mapping tables, like
what is done for the various Intel processors, i.e., have variations of
the amd_perfmon_event_map[] table. Then, the kernel would auto-detect the
host CPU and pick the correct table. The same thing would have to be done
for the LL generic cache events if some mappings use Northbridge events.

In general, however, I would recommend not using those generic cache
events to begin with. I think you understand why now. When dealing with
PMU events, you should read the documentation first. Micro-architectures
vary greatly even within the same processor family.
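
The per-CPU table idea suggested above might look roughly like the following. This is a sketch under stated assumptions, not the kernel's implementation: the enum, the macro, the table names, and amd_pick_event_map() are all hypothetical, the has_l3 flag stands in for whatever CPU detection the kernel would perform, and the L3 unit mask is a placeholder that would have to come from the BKDG. It also assumes the AMD layout in which the upper four bits of a 12-bit event select such as 4E1 are stored in bits 32-35 of the raw code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the PERF_COUNT_HW_CACHE_* indices (hypothetical). */
enum { EV_CACHE_REFERENCES, EV_CACHE_MISSES, EV_MAX };

/*
 * Build a raw code from a 12-bit event select and an 8-bit unit mask.
 * Assumed layout: event bits 0-7 -> code bits 0-7, unit mask -> bits 8-15,
 * event bits 8-11 -> code bits 32-35.
 */
#define AMD_RAW_EVENT(event, umask)                      \
	((((uint64_t)(event) & 0xF00) << 24) |           \
	 (((uint64_t)(umask) & 0xFF) << 8) |             \
	 ((uint64_t)(event) & 0xFF))

/* ASSUMPTION: placeholder only; the real unit mask must be taken from the BKDG. */
#define AMD_L3_UMASK_PLACEHOLDER 0x00

/* L2-based mapping, using the codes from the patch in this thread. */
static const uint64_t amd_map_l2[EV_MAX] = {
	[EV_CACHE_REFERENCES] = 0x037D,
	[EV_CACHE_MISSES]     = 0x037E,
};

/*
 * L3-based mapping for parts that have an L3. Only event 4E1 is named in
 * the thread, so the references entry is left unmapped (0) in this sketch.
 */
static const uint64_t amd_map_l3[EV_MAX] = {
	[EV_CACHE_REFERENCES] = 0,
	[EV_CACHE_MISSES]     = AMD_RAW_EVENT(0x4E1, AMD_L3_UMASK_PLACEHOLDER),
};

/* Pick the table once, based on the detected CPU (has_l3 is hypothetical). */
static const uint64_t *amd_pick_event_map(bool has_l3)
{
	return has_l3 ? amd_map_l3 : amd_map_l2;
}
```

A caller would select the table once at initialization and index it with the generic event, for example amd_pick_event_map(has_l3)[EV_CACHE_MISSES]. Because the L3 events are Northbridge events, the scheduling constraint discussed earlier in the thread would still apply to that table.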