From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932954AbbHDJA2 (ORCPT ); Tue, 4 Aug 2015 05:00:28 -0400 Received: from terminus.zytor.com ([198.137.202.10]:50257 "EHLO terminus.zytor.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932448AbbHDJAZ (ORCPT ); Tue, 4 Aug 2015 05:00:25 -0400 Date: Tue, 4 Aug 2015 01:59:48 -0700 From: tip-bot for Andi Kleen Message-ID: Cc: linux-kernel@vger.kernel.org, tglx@linutronix.de, peterz@infradead.org, torvalds@linux-foundation.org, mingo@kernel.org, ak@linux.intel.com, hpa@zytor.com Reply-To: tglx@linutronix.de, linux-kernel@vger.kernel.org, mingo@kernel.org, hpa@zytor.com, ak@linux.intel.com, torvalds@linux-foundation.org, peterz@infradead.org In-Reply-To: <1432786398-23861-6-git-send-email-andi@firstfloor.org> References: <1432786398-23861-6-git-send-email-andi@firstfloor.org> To: linux-tip-commits@vger.kernel.org Subject: [tip:perf/core] perf/x86/intel/lbr: Limit LBR accesses to TOS in callstack mode Git-Commit-ID: 90405aa02247c1a6313c33e2253f9fd2299ae60b X-Mailer: tip-git-log-daemon Robot-ID: Robot-Unsubscribe: Contact to get blacklisted from these emails MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset=UTF-8 Content-Disposition: inline Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Commit-ID: 90405aa02247c1a6313c33e2253f9fd2299ae60b Gitweb: http://git.kernel.org/tip/90405aa02247c1a6313c33e2253f9fd2299ae60b Author: Andi Kleen AuthorDate: Wed, 27 May 2015 21:13:18 -0700 Committer: Ingo Molnar CommitDate: Tue, 4 Aug 2015 10:16:59 +0200 perf/x86/intel/lbr: Limit LBR accesses to TOS in callstack mode In callstack mode the LBR is not a ring buffer, but a stack that grows up and down. This means in this case we don't need to access all LBRs, only the ones up to TOS. Do this optimization for the normal LBR read, and the context switch save/restore code. For save/restore it can be done unconditionally, as it only runs when call stack mode is active. This recovers some of the cost of going to 32 LBRs on Skylake. Signed-off-by: Andi Kleen Signed-off-by: Peter Zijlstra (Intel) Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: acme@kernel.org Cc: eranian@google.com Cc: jolsa@redhat.com Link: http://lkml.kernel.org/r/1432786398-23861-6-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar --- arch/x86/kernel/cpu/perf_event_intel_lbr.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c index a5bc424..b2c9475 100644 --- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c +++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c @@ -240,7 +240,7 @@ static void __intel_pmu_lbr_restore(struct x86_perf_task_context *task_ctx) mask = x86_pmu.lbr_nr - 1; tos = intel_pmu_lbr_tos(); - for (i = 0; i < x86_pmu.lbr_nr; i++) { + for (i = 0; i < tos; i++) { lbr_idx = (tos - i) & mask; wrmsrl(x86_pmu.lbr_from + lbr_idx, task_ctx->lbr_from[i]); wrmsrl(x86_pmu.lbr_to + lbr_idx, task_ctx->lbr_to[i]); @@ -263,7 +263,7 @@ static void __intel_pmu_lbr_save(struct x86_perf_task_context *task_ctx) mask = x86_pmu.lbr_nr - 1; tos = intel_pmu_lbr_tos(); - for (i = 0; i < x86_pmu.lbr_nr; i++) { + for (i = 0; i < tos; i++) { lbr_idx = (tos - i) & mask; rdmsrl(x86_pmu.lbr_from + lbr_idx, task_ctx->lbr_from[i]); rdmsrl(x86_pmu.lbr_to + lbr_idx, task_ctx->lbr_to[i]); @@ -425,8 +425,12 @@ static void intel_pmu_lbr_read_64(struct cpu_hw_events *cpuc) u64 tos = intel_pmu_lbr_tos(); int i; int out = 0; + int num = x86_pmu.lbr_nr; - for (i = 0; i < x86_pmu.lbr_nr; i++) { + if (cpuc->lbr_sel->config & LBR_CALL_STACK) + num = tos; + + for (i = 0; i < num; i++) { unsigned long lbr_idx = (tos - i) & mask; u64 from, to, mis = 0, pred = 0, in_tx = 0, abort = 0; int skip = 0;