From: Andi Kleen
To: Martin Liška
Cc: linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org, Arnaldo Carvalho de Melo
Subject: Re: [RFC] Add --show-total-period for perf annotate
Date: Fri, 22 May 2015 21:08:38 -0700
Message-ID: <87mw0wc4vt.fsf@tassilo.jf.intel.com>
In-Reply-To: <555F3F8A.6000204@suse.cz>

Martin Liška writes:

> I've been working on a new feature for perf annotate, which should be able to annotate
> instructions with total spent time (compared to percentage usage).
>
> Let's consider following use-case. You want to compare two different compilers
> on the same code base and let's assume 90% of wall-time is spent in a single function.
> Moreover, let's say that these compilers produce assembly of a totally different size.
>
> In such case, it's very useful to get an approximation of spent time on a bunch of instructions,
> which can be compared among other compilers. Otherwise, one has to somehow sum percentages and compare
> it to size of a function.

perf diff does not handle this? Especially with the differential
profiling options it should.
>
> @@ -623,6 +624,8 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
>  	if (!target__none(&opts->target) && !opts->initial_delay)
>  		perf_evlist__enable(rec->evlist);
>  
> +	t0 = rdclock();
> +
>  	/*
>  	 * Let the child rip
>  	 */
> @@ -692,6 +695,9 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
>  		goto out_child;
>  	}
>  
> +	t1 = rdclock();
> +	walltime_nsecs = t1 - t0;

The walltime can be later computed by the difference of the first
and the last time stamp after sorting the events. So you don't need
the new header.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only
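
To illustrate the point about deriving the walltime from the event time stamps, here is a minimal sketch. It assumes the sample time stamps have already been collected into a plain array of nanosecond values; the function name walltime_from_timestamps and the array are hypothetical and not part of perf. It scans for the smallest and largest time stamp, which gives the same result as taking the first and last entry of the sorted event stream.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a perf API: derive the wall time from
 * sample time stamps (in nanoseconds) instead of storing it in a
 * new header. Returns 0 for an empty or single-sample input.
 */
static uint64_t walltime_from_timestamps(const uint64_t *ts, size_t n)
{
	uint64_t first, last;
	size_t i;

	if (n == 0)
		return 0;

	/* Track the earliest and latest time stamp seen so far. */
	first = last = ts[0];
	for (i = 1; i < n; i++) {
		if (ts[i] < first)
			first = ts[i];
		if (ts[i] > last)
			last = ts[i];
	}

	return last - first;
}
```

If the events are already sorted by time, the loop reduces to ts[n - 1] - ts[0].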