From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932218AbbHGCAN (ORCPT ); Thu, 6 Aug 2015 22:00:13 -0400
Received: from mail.kernel.org ([198.145.29.136]:48592 "EHLO mail.kernel.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753772AbbHGCAA (ORCPT ); Thu, 6 Aug 2015 22:00:00 -0400
Date: Thu, 6 Aug 2015 16:44:24 -0300
From: Arnaldo Carvalho de Melo
To: Andi Kleen
Cc: jolsa@kernel.org, linux-kernel@vger.kernel.org, namhyung@kernel.org
Subject: Re: Cycles annotation support for perf tools v3
Message-ID: <20150806194424.GB10826@kernel.org>
References: <1437233094-12844-1-git-send-email-andi@firstfloor.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <1437233094-12844-1-git-send-email-andi@firstfloor.org>
X-Url: http://acmel.wordpress.com
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Jul 18, 2015 at 08:24:45AM -0700, Andi Kleen wrote:
> [v2: Addressed review comments. Fixed display problems and
> correctly compute IPC now. See patches for detailed changes.]
> [v3: Merged with current Arnaldo perf/core and added acked-by.]
>
> [Note the respective kernel patches to report cycles are in
> peterz's perf/core queue, but so far not in tip. The patchkit
> can be tested however with the "fake cycles" debug patch added at
> the end]
>
> The upcoming Skylake CPU has a new timed branch stack feature,
> that reports cycle counts for individual branches in the
> last branch record.
>
> This allows getting fine-grained cost information for code, and also allows
> computing fine-grained IPC.

Thanks, applied.
- Arnaldo

> Available from
> git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git perf/skl-tools3
>
> This patchkit adds support for this in the perf tools:
> - Basic support for the cycles field like other branch fields
> - Show cycles in the standard branch sort view (no IPC here,
> as IPC needs the instruction counts from annotation)
> - Annotate cycles and IPC in the assembler annotate view
> - Add branch support to top, so we can do live annotation.
> - Misc support, like dumping it in perf report -D
>
> Example output for annotate (with made up numbers):
>
> The second column is the IPC and the third is the average cycles for the basic block.
>
>                  │    static int hex(char ch)
>                  │    {
>        0.12      │      push   %rbp
>        0.12      │      mov    %rsp,%rbp
>        0.12      │      sub    $0x20,%rsp
>        0.12      │      mov    %edi,%eax
>        0.12      │      mov    %al,-0x14(%rbp)
>        0.12      │      mov    %fs:0x28,%rax
>        0.12      │      mov    %rax,-0x8(%rbp)
>        0.12      │      xor    %eax,%eax
>                  │    if ((ch >= '0') && (ch <= '9'))
>        0.12      │      cmpb   $0x2f,-0x14(%rbp)
>  66.67 0.12  123 │    ↓ jle    31
>        0.12      │      cmpb   $0x39,-0x14(%rbp)
>        0.12  123 │    ↓ jg     31
>                  │    return ch - '0';
>  22.22 0.12      │      movsbl -0x14(%rbp),%eax
>        0.12      │      sub    $0x30,%eax
>        0.12  123 │    ↓ jmp    60
>                  │    if ((ch >= 'a') && (ch <= 'f'))
>        0.06      │31:   cmpb   $0x60,-0x14(%rbp)
>        0.06  123 │    ↓ jle    46
>        0.06      │      cmpb   $0x66,-0x14(%rbp)
>        0.06      │    ↓ jg     46
>                  │    return ch - 'a' + 10;
>        0.06      │      movsbl -0x14(%rbp),%eax
>
> Example output for branch view (again with fake data):
>
> Overhead  Command  Source Shared Object  Source Symbol          Target Symbol          Basic Block Cycles
>   30.08%  tcall    tcall                 [.] f1                 [.] f2                 123
>   27.44%  tcall    tcall                 [.] f2                 [.] f1                 123
>   15.60%  tcall    tcall                 [.] main               [.] f1                 123
>   12.96%  tcall    tcall                 [.] f1                 [.] main               123
>   12.86%  tcall    tcall                 [.] main               [.] main               123
>    0.08%  tcall    [kernel.kallsyms]     [k] hrtimer_interrupt  [k] hrtimer_interrupt  123
>
> IPC computation has a few limitations (see the comments in the respective patches),
> in particular it punts on overlapping basic blocks.
>
> The annotation only works for the interactive annotation. Currently it is not
> working in the scripted perf annotate, as that is missing a lot of the
> infrastructure needed for per instruction state.
>
> It would be nice to add column headers to annotate.
>
> So far no support in --branch-history or in perf script.
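
For readers of the archive, here is a rough sketch of how the two per-block
columns described in the quoted cover letter (IPC and average cycles) relate
to each other. It is an illustration only, not the code from the patches:
the BlockStats type, its field names, and both helper functions are
hypothetical. It assumes a basic block's average cycles is the summed cycle
count over its branch records divided by the number of records, and that IPC
is the block's instruction count divided by that average. It ignores the
overlapping-basic-block limitation mentioned above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BlockStats:
    """Hypothetical per-basic-block accumulator (not a perf data structure)."""
    n_insn: int        # instructions in the block, known from annotation
    total_cycles: int  # cycles summed over all branch records for the block
    n_samples: int     # number of branch records that covered the block


def average_cycles(block: BlockStats) -> Optional[float]:
    """Average cycles per execution of the block, or None without samples."""
    if block.n_samples == 0:
        return None
    return block.total_cycles / block.n_samples


def ipc(block: BlockStats) -> Optional[float]:
    """Instructions per cycle for the block, or None if it cannot be computed."""
    avg = average_cycles(block)
    if not avg:  # covers both "no samples" and a zero cycle average
        return None
    return block.n_insn / avg


if __name__ == "__main__":
    # Made-up numbers, in the same spirit as the fake data in the cover letter.
    block = BlockStats(n_insn=10, total_cycles=400, n_samples=4)
    print(average_cycles(block), ipc(block))
```

This also shows why the branch sort view can report cycles but not IPC: the
instruction count per block is only available once the code has been
annotated.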