From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from terminus.zytor.com ([198.137.202.10]:46034 "EHLO terminus.zytor.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933244AbcIPQLY (ORCPT ); Fri, 16 Sep 2016 12:11:24 -0400 Date: Fri, 16 Sep 2016 09:09:56 -0700 From: tip-bot for Matt Fleming Message-ID: Cc: torvalds@linux-foundation.org, mingo@kernel.org, peterz@infradead.org, bp@alien8.de, stable@vger.kernel.org, hpa@zytor.com, linux-kernel@vger.kernel.org, tglx@linutronix.de, matt@codeblueprint.co.uk Reply-To: stable@vger.kernel.org, hpa@zytor.com, tglx@linutronix.de, matt@codeblueprint.co.uk, linux-kernel@vger.kernel.org, mingo@kernel.org, bp@alien8.de, peterz@infradead.org, torvalds@linux-foundation.org In-Reply-To: <1472044328-21302-1-git-send-email-matt@codeblueprint.co.uk> References: <1472044328-21302-1-git-send-email-matt@codeblueprint.co.uk> To: linux-tip-commits@vger.kernel.org Subject: [tip:perf/urgent] perf/x86/amd: Make HW_CACHE_REFERENCES and HW_CACHE_MISSES measure L2 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset=UTF-8 Content-Disposition: inline Sender: stable-owner@vger.kernel.org List-ID: Commit-ID: 080fe0b790ad438fc1b61621dac37c1964ce7f35 Gitweb: http://git.kernel.org/tip/080fe0b790ad438fc1b61621dac37c1964ce7f35 Author: Matt Fleming AuthorDate: Wed, 24 Aug 2016 14:12:08 +0100 Committer: Ingo Molnar CommitDate: Fri, 16 Sep 2016 16:19:49 +0200 perf/x86/amd: Make HW_CACHE_REFERENCES and HW_CACHE_MISSES measure L2 While the Intel PMU monitors the LLC when perf enables the HW_CACHE_REFERENCES and HW_CACHE_MISSES events, these events monitor L1 instruction cache fetches (0x0080) and instruction cache misses (0x0081) on the AMD PMU. This is extremely confusing when monitoring the same workload across Intel and AMD machines, since parameters like, $ perf stat -e cache-references,cache-misses measure completely different things. Instead, make the AMD PMU measure instruction/data cache and TLB fill requests to the L2 and instruction/data cache and TLB misses in the L2 when HW_CACHE_REFERENCES and HW_CACHE_MISSES are enabled, respectively. That way the events measure unified caches on both platforms. Signed-off-by: Matt Fleming Acked-by: Peter Zijlstra Cc: Cc: Borislav Petkov Cc: Linus Torvalds Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/1472044328-21302-1-git-send-email-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar --- arch/x86/events/amd/core.c | 4 ++-- arch/x86/kvm/pmu_amd.c | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c index e07a22b..f5f4b3f 100644 --- a/arch/x86/events/amd/core.c +++ b/arch/x86/events/amd/core.c @@ -119,8 +119,8 @@ static const u64 amd_perfmon_event_map[PERF_COUNT_HW_MAX] = { [PERF_COUNT_HW_CPU_CYCLES] = 0x0076, [PERF_COUNT_HW_INSTRUCTIONS] = 0x00c0, - [PERF_COUNT_HW_CACHE_REFERENCES] = 0x0080, - [PERF_COUNT_HW_CACHE_MISSES] = 0x0081, + [PERF_COUNT_HW_CACHE_REFERENCES] = 0x077d, + [PERF_COUNT_HW_CACHE_MISSES] = 0x077e, [PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x00c2, [PERF_COUNT_HW_BRANCH_MISSES] = 0x00c3, [PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = 0x00d0, /* "Decoder empty" event */ diff --git a/arch/x86/kvm/pmu_amd.c b/arch/x86/kvm/pmu_amd.c index 39b9112..cd94443 100644 --- a/arch/x86/kvm/pmu_amd.c +++ b/arch/x86/kvm/pmu_amd.c @@ -23,8 +23,8 @@ static struct kvm_event_hw_type_mapping amd_event_mapping[] = { [0] = { 0x76, 0x00, PERF_COUNT_HW_CPU_CYCLES }, [1] = { 0xc0, 0x00, PERF_COUNT_HW_INSTRUCTIONS }, - [2] = { 0x80, 0x00, PERF_COUNT_HW_CACHE_REFERENCES }, - [3] = { 0x81, 0x00, PERF_COUNT_HW_CACHE_MISSES }, + [2] = { 0x7d, 0x07, PERF_COUNT_HW_CACHE_REFERENCES }, + [3] = { 0x7e, 0x07, PERF_COUNT_HW_CACHE_MISSES }, [4] = { 0xc2, 0x00, PERF_COUNT_HW_BRANCH_INSTRUCTIONS }, [5] = { 0xc3, 0x00, PERF_COUNT_HW_BRANCH_MISSES }, [6] = { 0xd0, 0x00, PERF_COUNT_HW_STALLED_CYCLES_FRONTEND },