From: "Martin Liška" <mliska@suse.cz>
To: Andi Kleen <andi@firstfloor.org>
Cc: linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
Arnaldo Carvalho de Melo <acme@kernel.org>
Subject: Re: [RFC] Add --show-total-period for perf annotate
Date: Tue, 26 May 2015 14:34:37 +0200 [thread overview]
Message-ID: <5564685D.1020204@suse.cz> (raw)
In-Reply-To: <20150525151450.GK19417@two.firstfloor.org>
[-- Attachment #1: Type: text/plain, Size: 2645 bytes --]
On 05/25/2015 05:14 PM, Andi Kleen wrote:
>>> perf diff does not handle this? Especially with the differential
>>> profiling options it should.
>>
>> It does not work if you, in my case, compare ICC and GCC, where ICC uses a different mangling
>> scheme for fortran modules. Moreover, situation can be more complicated if a compiler performs
>> a bit different inlining decisions.
>
> I suppose it could be enhanced with an input file that describes
> equivalent functions. But yes wouldn't work for inlining.
>
>> Good point. Can you please help me how to compute a function percentage usage in perf annotate ;) ?
>
> I wouldn't use time at all. Just sum up periods and then compute the
> percentage. The period sum computation already happens in the main view, and
> is displayed there. So you only need to save that value somewhere and
> then use it in the annotate display for another column.
>
> # Samples: 24 of event 'cycles'
> # Event count (approx.): 8856637
>
> -Andi
>
Hello.
Are you talking about summing cycles and compute global percentage for each isntruction?
Anyway, attached patch is capable of displaying milliseconds approximation for each instruction.
Example:
time ./perf record ./a.out
[ perf record: Woken up 7 times to write data ]
[ perf record: Captured and wrote 1.743 MB perf.data (45386 samples) ]
real 0m11.465s
user 0m11.424s
sys 0m0.045s
$ perf report --stdio
32.69% a.out libc-2.19.so [.] __random_r
26.66% a.out libc-2.19.so [.] __random
12.55% a.out a.out [.] foo
10.14% a.out a.out [.] bar
9.27% a.out a.out [.] baz
7.22% a.out libc-2.19.so [.] rand
1.37% a.out a.out [.] rand@plt
$ perf annotate --stdio
Percent | Source code & Disassembly of libc-2.19.so for cycles
--------------------------------------------------------------------
:
:
:
: Disassembly of section .text:
:
: 0000000000038890 <random_r>:
: __random_r():
11.12 : 38890: test %rdi,%rdi
$ perf annotate --stdio --show-total-period
Percent | Source code & Disassembly of libc-2.19.so for cycles
--------------------------------------------------------------------
:
:
:
: Disassembly of section .text:
:
: 0000000000038890 <random_r>:
: __random_r():
413 : 38890: test %rdi,%rdi
First `test` instruction: 11465*0.3269*0.1112 ~ 413ms.
Thanks for suggestions,
Martin
[-- Attachment #2: 0001-First-semi-working-version.patch --]
[-- Type: text/x-patch, Size: 6030 bytes --]
diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index b57a027..cce19d6 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -326,6 +326,8 @@ int cmd_annotate(int argc, const char **argv, const char *prefix __maybe_unused)
"objdump binary to use for disassembly and annotations"),
OPT_BOOLEAN(0, "group", &symbol_conf.event_group,
"Show event group information together"),
+ OPT_BOOLEAN(0, "show-total-period", &symbol_conf.show_total_period,
+ "Show a column with the sum of periods"),
OPT_END()
};
int ret = hists__init();
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 5dfe913..dfd0c8c 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -484,6 +484,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
struct perf_session *session;
bool disabled = false, draining = false;
int fd;
+ unsigned long long t0, t1;
rec->progname = argv[0];
@@ -623,6 +624,8 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
if (!target__none(&opts->target) && !opts->initial_delay)
perf_evlist__enable(rec->evlist);
+ t0 = rdclock();
+
/*
* Let the child rip
*/
@@ -692,6 +695,9 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
goto out_child;
}
+ t1 = rdclock();
+ walltime_nsecs = t1 - t0;
+
if (!quiet)
fprintf(stderr, "[ perf record: Woken up %ld times to write data ]\n", waking);
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index aa79fb8..0acdf4c 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -35,6 +35,8 @@ extern const char *input_name;
extern bool perf_host, perf_guest;
extern const char perf_version_string[];
+extern unsigned long long walltime_nsecs;
+
void pthread__unblock_sigwinch(void);
#include "util/target.h"
diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index e5250eb..e7af8ec 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -113,7 +113,12 @@ static void annotate_browser__write(struct ui_browser *browser, void *entry, int
for (i = 0; i < ab->nr_events; i++) {
ui_browser__set_percent_color(browser, bdl->percent[i],
current_entry);
- slsmg_printf("%6.2f ", bdl->percent[i]);
+
+ if (symbol_conf.show_total_period)
+ slsmg_printf("%7.0f ", bdl->percent[i] *
+ walltime_nsecs / (100 * 1e6));
+ else
+ slsmg_printf("%7.2f ", bdl->percent[i]);
}
} else {
ui_browser__set_percent_color(browser, 0, current_entry);
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 7f5bdfc..263e1bf 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -753,7 +753,12 @@ static int disasm_line__print(struct disasm_line *dl, struct symbol *sym, u64 st
for (i = 0; i < nr_percent; i++) {
percent = ppercents[i];
color = get_percent_color(percent);
- color_fprintf(stdout, color, " %7.2f", percent);
+
+ if (symbol_conf.show_total_period)
+ color_fprintf(stdout, color, " %7.0f",
+ percent * walltime_nsecs / (100 * 1e6));
+ else
+ color_fprintf(stdout, color, " %7.2f", percent);
}
printf(" : ");
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 3f0d809..a1d254b 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -47,6 +47,8 @@ struct perf_file_attr {
struct perf_file_section ids;
};
+unsigned long long walltime_nsecs;
+
void perf_header__set_feat(struct perf_header *header, int feat)
{
set_bit(feat, header->adds_features);
@@ -883,6 +885,14 @@ static int write_auxtrace(int fd, struct perf_header *h,
return err;
}
+static int write_total_wall_time(int fd, struct perf_header *h __maybe_unused,
+ struct perf_evlist *evlist __maybe_unused)
+{
+ do_write (fd, &walltime_nsecs, sizeof (walltime_nsecs));
+ return 0;
+}
+
+
static void print_hostname(struct perf_header *ph, int fd __maybe_unused,
FILE *fp)
{
@@ -1171,6 +1181,13 @@ static void print_auxtrace(struct perf_header *ph __maybe_unused,
fprintf(fp, "# contains AUX area data (e.g. instruction trace)\n");
}
+static void print_total_wall_time(struct perf_header *ph, int fd __maybe_unused,
+ FILE *fp)
+{
+ fprintf(fp, "# total wall-time: %.9f second\n",
+ ph->env.total_wall_time / 1e9);
+}
+
static void print_pmu_mappings(struct perf_header *ph, int fd __maybe_unused,
FILE *fp)
{
@@ -1857,6 +1874,23 @@ static int process_auxtrace(struct perf_file_section *section,
return err;
}
+static int process_total_wall_time(struct perf_file_section *section __maybe_unused,
+ struct perf_header *ph, int fd,
+ void *data __maybe_unused)
+{
+ ssize_t ret;
+
+ ret = readn(fd, &walltime_nsecs, sizeof(walltime_nsecs));
+ if (ret != sizeof(walltime_nsecs))
+ return -1;
+
+ if (ph->needs_swap)
+ walltime_nsecs = bswap_64(walltime_nsecs);
+
+ ph->env.total_wall_time = walltime_nsecs;
+ return 0;
+}
+
struct feature_ops {
int (*write)(int fd, struct perf_header *h, struct perf_evlist *evlist);
void (*print)(struct perf_header *h, int fd, FILE *fp);
@@ -1898,6 +1932,7 @@ static const struct feature_ops feat_ops[HEADER_LAST_FEATURE] = {
FEAT_OPP(HEADER_PMU_MAPPINGS, pmu_mappings),
FEAT_OPP(HEADER_GROUP_DESC, group_desc),
FEAT_OPP(HEADER_AUXTRACE, auxtrace),
+ FEAT_OPP(HEADER_TOTAL_WALLTIME, total_wall_time),
};
struct header_print_data {
diff --git a/tools/perf/util/header.h b/tools/perf/util/header.h
index d4d5796..f47571a 100644
--- a/tools/perf/util/header.h
+++ b/tools/perf/util/header.h
@@ -31,6 +31,7 @@ enum {
HEADER_PMU_MAPPINGS,
HEADER_GROUP_DESC,
HEADER_AUXTRACE,
+ HEADER_TOTAL_WALLTIME,
HEADER_LAST_FEATURE,
HEADER_FEAT_BITS = 256,
};
@@ -88,6 +89,7 @@ struct perf_session_env {
char *sibling_threads;
char *numa_nodes;
char *pmu_mappings;
+ unsigned long long total_wall_time;
};
struct perf_header {
--
2.1.4
next prev parent reply other threads:[~2015-05-26 13:13 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-05-22 14:39 [RFC] Add --show-total-period for perf annotate Martin Liška
2015-05-23 4:08 ` Andi Kleen
2015-05-25 7:46 ` Martin Liška
2015-05-25 15:14 ` Andi Kleen
2015-05-26 12:34 ` Martin Liška [this message]
2015-05-26 17:03 ` Andi Kleen
2015-05-27 8:46 ` Martin Liška
2015-05-27 9:04 ` Martin Liška
2015-05-27 14:04 ` Andi Kleen
2015-05-29 12:58 ` Martin Liška
2015-05-29 15:55 ` Andi Kleen
2015-06-01 11:08 ` Martin Liška
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5564685D.1020204@suse.cz \
--to=mliska@suse.cz \
--cc=acme@kernel.org \
--cc=andi@firstfloor.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-perf-users@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).