From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751703AbbEYHqL (ORCPT ); Mon, 25 May 2015 03:46:11 -0400
Received: from cantor2.suse.de ([195.135.220.15]:40301 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751195AbbEYHqJ (ORCPT ); Mon, 25 May 2015 03:46:09 -0400
Message-ID: <5562D33F.70706@suse.cz>
Date: Mon, 25 May 2015 09:46:07 +0200
From: Martin Liška
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0
MIME-Version: 1.0
To: Andi Kleen
CC: linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org, Arnaldo Carvalho de Melo
Subject: Re: [RFC] Add --show-total-period for perf annotate
References: <555F3F8A.6000204@suse.cz> <87mw0wc4vt.fsf@tassilo.jf.intel.com>
In-Reply-To: <87mw0wc4vt.fsf@tassilo.jf.intel.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 05/23/2015 06:08 AM, Andi Kleen wrote:
> Martin Liška writes:
>
>> I've been working on a new feature for perf annotate, which should be able to annotate
>> instructions with total spent time (compared to percentage usage).
>>
>> Let's consider following use-case. You want to compare two different compilers
>> on the same code base and let's assume 90% of wall-time is spent in a single function.
>> Moreover, let's say that these compilers produce assembly of a totally different size.
>>
>> In such case, it's very useful to get an approximation of spent time on a bunch of instructions,
>> which can be compared among other compilers. Otherwise, one has to somehow sum percentages and compare
>> it to size of a function.
>
> perf diff does not handle this? Especially with the differential
> profiling options it should.

It does not work if, as in my case, you compare ICC and GCC, where ICC uses a different
mangling scheme for Fortran modules. Moreover, the situation gets more complicated if the
compilers make slightly different inlining decisions.

>
>>> @@ -623,6 +624,8 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
>>  	if (!target__none(&opts->target) && !opts->initial_delay)
>>  		perf_evlist__enable(rec->evlist);
>>
>> +	t0 = rdclock();
>> +
>>  	/*
>>  	 * Let the child rip
>>  	 */
>> @@ -692,6 +695,9 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
>>  		goto out_child;
>>  	}
>>
>> +	t1 = rdclock();
>> +	walltime_nsecs = t1 - t0;
>
> The walltime can be later computed by the difference of the first and
> the last time stamp after sorting the events. So you don't need the new header.
>
> -Andi
>

Good point. Can you please help me work out how to compute a function's percentage usage
in perf annotate? ;)

Thanks,
Martin
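
For illustration, Andi's suggestion (wall time as the difference between the first and the last sample timestamp) could be sketched roughly as below. This is only a sketch under assumptions: `struct sample`, `walltime_ns()` and `approx_time_ns()` are hypothetical names and not part of perf, and scaling wall time by a share of the total period is just one possible way to turn a percentage into an approximate time.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified sample record; perf's real structures differ. */
struct sample {
	uint64_t time_ns;	/* timestamp of the sample, in nanoseconds */
	uint64_t period;	/* event period attributed to the sample */
};

/*
 * Approximate wall time as (last timestamp - first timestamp).
 * A min/max scan yields the same value as sorting the events first.
 */
uint64_t walltime_ns(const struct sample *s, size_t n)
{
	uint64_t first, last;
	size_t i;

	if (n == 0)
		return 0;

	first = last = s[0].time_ns;
	for (i = 1; i < n; i++) {
		if (s[i].time_ns < first)
			first = s[i].time_ns;
		if (s[i].time_ns > last)
			last = s[i].time_ns;
	}
	return last - first;
}

/*
 * Assumption: time is attributed proportionally to the period, so an
 * instruction (or function) holding `period` out of `total_period`
 * gets that share of the wall time.
 */
double approx_time_ns(uint64_t period, uint64_t total_period,
		      uint64_t walltime)
{
	if (total_period == 0)
		return 0.0;
	return (double)walltime * (double)period / (double)total_period;
}
```

With something like this, the recorded samples alone would be enough to derive the wall time, so no extra header field would need to be written at record time.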