From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932447AbcHOPNY (ORCPT ); Mon, 15 Aug 2016 11:13:24 -0400
Received: from mail-pa0-f45.google.com ([209.85.220.45]:35647 "EHLO mail-pa0-f45.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932206AbcHOPNU (ORCPT ); Mon, 15 Aug 2016 11:13:20 -0400
Date: Mon, 15 Aug 2016 16:13:16 +0100
From: Matt Fleming
To: Borislav Petkov
Cc: Peter Zijlstra, linux-kernel@vger.kernel.org, Ingo Molnar
Subject: Re: [PATCH] perf/x86/amd: Make HW_CACHE_REFERENCES and HW_CACHE_MISSES measure L2
Message-ID: <20160815151316.GI30909@codeblueprint.co.uk>
References: <1470928902-31196-1-git-send-email-matt@codeblueprint.co.uk> <20160811164150.GB7296@nazgul.tnic>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160811164150.GB7296@nazgul.tnic>
User-Agent: Mutt/1.5.24+41 (02bc14ed1569) (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 11 Aug, at 06:41:50PM, Borislav Petkov wrote:
> Drop stable from CC.
>
> On Thu, Aug 11, 2016 at 04:21:42PM +0100, Matt Fleming wrote:
> > While the Intel PMU monitors the LLC when perf enables the
> > HW_CACHE_REFERENCES and HW_CACHE_MISSES events, these events monitor
> > L1 instruction cache fetches (0x0080) and instruction cache misses
> > (0x0081) on the AMD PMU.
> >
> > This is extremely confusing when monitoring the same workload across
> > Intel and AMD machines, since parameters like,
> >
> >   $ perf stat -e cache-references,cache-misses
> >
> > measure completely different things.
> >
> > Instead, make the AMD PMU measure instruction/data cache fill requests
> > to the L2 and instruction/data cache misses in the L2 when
> > HW_CACHE_REFERENCES and HW_CACHE_MISSES are enabled, respectively.
> > That way the events measure unified caches on both platforms.
>
> I guess that's closer.
>
> Even though LLC is not always L2 on AMD (some have L3). Btw,
> what are the exact events for PERF_COUNT_HW_CACHE_REFERENCES and
> PERF_COUNT_HW_CACHE_MISSES called on Intel?

They're referred to as "LLC Reference" and "LLC Misses" in the Intel
SDM Table 18-1 and "Longest latency cache references/misses" in Table
19-1.

> I could try to find better/more fitting event selectors on AMD...

If you've got any other suggestions, I'm all ears.

Note that one thing I wasn't sure about was whether we want to include
TLB events hitting the L2. I left them out of this patch, but it might
make sense to add them so that HW_CACHE_{REFERENCES,MISSES} is
actually distinguishable from LLC-{loads,misses}.

> > Signed-off-by: Matt Fleming
> > Cc: Peter Zijlstra
> > Cc: Ingo Molnar
> > Cc: Borislav Petkov
> > Cc:
> > ---
> >  arch/x86/events/amd/core.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
> > index e07a22bb9308..8fd8bf79f32b 100644
> > --- a/arch/x86/events/amd/core.c
> > +++ b/arch/x86/events/amd/core.c
> > @@ -119,8 +119,8 @@ static const u64 amd_perfmon_event_map[PERF_COUNT_HW_MAX] =
> >  {
> >    [PERF_COUNT_HW_CPU_CYCLES]              = 0x0076,
> >    [PERF_COUNT_HW_INSTRUCTIONS]            = 0x00c0,
> > -  [PERF_COUNT_HW_CACHE_REFERENCES]        = 0x0080,
> > -  [PERF_COUNT_HW_CACHE_MISSES]            = 0x0081,
> > +  [PERF_COUNT_HW_CACHE_REFERENCES]        = 0x037d,
> > +  [PERF_COUNT_HW_CACHE_MISSES]            = 0x037e,
> >    [PERF_COUNT_HW_BRANCH_INSTRUCTIONS]     = 0x00c2,
> >    [PERF_COUNT_HW_BRANCH_MISSES]           = 0x00c3,
> >    [PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = 0x00d0, /* "Decoder empty" event */
>
> Btw, there's also amd_event_mapping in arch/x86/kvm/pmu_amd.c which has
> duplicated amd_perfmon_event_map. Would need adjusting too.

Urgh, right. I totally missed that. I'll update.
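
For reference, the raw codes in the quoted hunk pack two fields into one
value. The following is only a minimal sketch, not kernel code: it
assumes the usual AMD PERF_CTL layout where bits 7:0 hold the event
select and bits 15:8 hold the unit mask, and the helper names
(amd_event_select, amd_unit_mask, amd_describe) are made up for
illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Assumption: bits 7:0 of the code are the event select and bits 15:8
 * are the unit mask, as in the low 16 bits of an AMD PERF_CTL value.
 */
static inline uint8_t amd_event_select(uint64_t code)
{
	return code & 0xff;
}

static inline uint8_t amd_unit_mask(uint64_t code)
{
	return (code >> 8) & 0xff;
}

/*
 * Hypothetical helper: format one code as "event=0x.. umask=0x..".
 * Returns what snprintf() returns.
 */
static int amd_describe(uint64_t code, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "event=0x%02x umask=0x%02x",
			amd_event_select(code), amd_unit_mask(code));
}
```

Under that assumption, 0x037d splits into event select 0x7d with unit
mask 0x03 and 0x037e into event select 0x7e with unit mask 0x03, while
the old 0x0080 and 0x0081 carry a zero unit mask. The new codes can also
be tried without the patch through perf's raw event syntax, e.g.
"perf stat -e r037d,r037e".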