From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757187AbcCCJPh (ORCPT ); Thu, 3 Mar 2016 04:15:37 -0500 Received: from mx1.redhat.com ([209.132.183.28]:41204 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754693AbcCCJPc (ORCPT ); Thu, 3 Mar 2016 04:15:32 -0500 Date: Thu, 3 Mar 2016 10:15:22 +0100 From: Jiri Olsa To: Ingo Molnar Cc: Arnaldo Carvalho de Melo , linux-kernel@vger.kernel.org, Alexei Starovoitov , Andi Kleen , Brendan Gregg , David Ahern , He Kuang , Jeff Bastian , Jeremie Galarneau , Jiri Olsa , Josh Boyer , Lai Jiangshan , Li Zefan , Masami Hiramatsu , Namhyung Kim , Peter Zijlstra , pi3orama@163.com, Stephane Eranian , Steven Rostedt , Taeung Song , Thomas Gleixner , Wang Nan , Arnaldo Carvalho de Melo Subject: Re: [GIT PULL 00/11] perf/core improvements and fixes Message-ID: <20160303091522.GA14355@krava.redhat.com> References: <1456773727-3005-1-git-send-email-acme@kernel.org> <20160303082130.GA13523@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160303082130.GA13523@gmail.com> User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Mar 03, 2016 at 09:21:30AM +0100, Ingo Molnar wrote: > > * Arnaldo Carvalho de Melo wrote: > > > Hi Ingo, > > > > Please consider pulling, > > > > - Arnaldo > > > > The following changes since commit 675965b00d734c985e4285f5bec7e524d15fc4e1: > > > > perf: Export perf_event_sysfs_show() (2016-02-29 09:35:27 +0100) > > > > are available in the git repository at: > > > > git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20160229 > > > > for you to fetch changes up to 575a02e00b11eecbbabcb1eb22eab4c68e91ae77: > > > > perf record: Ensure return non-zero rc when mmap fail (2016-02-29 12:44:15 -0300) > > > > ---------------------------------------------------------------- > > perf/core improvements and fixes: > > > > User visible: > > > > - Check existence of frontend/backed stalled cycles in 'perf stat' (Andi Kleen) > > > > - Avoid installing .o files from tools/lib/ into the python extension (Jiri Olsa) > > > > - Rename the tracepoint '/format' field that carries the syscall ID from 'nr', > > that is also the name of some syscalls arguments, to "__syscall_nr", to > > avoid having multiple fields with the same name, that was breaking the > > python script skeleton generator from perf.data files (Taeung Song) > > > > - Support converting data from bpf events in 'perf data' (Wang Nan) > > > > Infrastructure: > > > > - Split libtraceevent's pevent_print_event() (Steven Rostedt) > > > > - Librarize some 'perf record' bits to allow handling multiple perf.data > > files per session (Wang Nan) > > > > - Ensure return non-zero rc when mmap fail in 'perf record' (Wang Nan) > > > > Signed-off-by: Arnaldo Carvalho de Melo > > > > ---------------------------------------------------------------- > > Andi Kleen (1): > > perf stat: Check existence of frontend/backed stalled cycles > > > > > Jiri Olsa (1): > > perf tools: Fix python extension build > > > > Steven Rostedt (1): > > tools lib traceevent: Split pevent_print_event() into specific functionality functions > > > > Taeung Song (2): > > perf trace: Check and discard not only 'nr' but also '__syscall_nr' > > tracing/syscalls: Rename "/format" tracepoint field name "nr" to "__syscall_nr: > > > > Wang Nan (6): > > perf data: Support converting data from bpf_perf_event_output() > > perf data: Explicitly set byte order for integer types > > perf record: Use WARN_ONCE to replace 'if' condition > > perf record: Extract synthesize code to record__synthesize() > > perf record: Introduce record__finish_output() to finish a perf.data > > perf record: Ensure return non-zero rc when mmap fail > > > > kernel/trace/trace_syscalls.c | 16 ++-- > > tools/lib/traceevent/event-parse.c | 136 +++++++++++++++++++++++------- > > tools/lib/traceevent/event-parse.h | 13 +++ > > tools/perf/builtin-record.c | 168 ++++++++++++++++++++++--------------- > > tools/perf/builtin-stat.c | 22 ++++- > > tools/perf/builtin-trace.c | 8 +- > > tools/perf/util/data-convert-bt.c | 118 +++++++++++++++++++++++++- > > tools/perf/util/setup.py | 4 + > > 8 files changed, 372 insertions(+), 113 deletions(-) > > Hm, there's a 'perf stat' regression that I can see: > > Before: > > triton:~/tip> perf stat -a sleep 1 > > Performance counter stats for 'system wide': > > 11990.023100 task-clock (msec) # 11.981 CPUs utilized > 8,802 context-switches # 0.734 K/sec > 543 cpu-migrations # 0.045 K/sec > 97,375 page-faults # 0.008 M/sec > 9,854,385,894 cycles # 0.822 GHz > 15,274,841,152 stalled-cycles-frontend # 155.01% frontend cycles idle > stalled-cycles-backend > 9,634,486,137 instructions # 0.98 insn per cycle > # 1.59 stalled cycles per insn > 1,818,488,088 branches # 151.667 M/sec > 46,365,120 branch-misses # 2.55% of all branches > > 1.000741599 seconds time elapsed > > After: > > triton:~/tip> perf stat -a sleep 1 > > Performance counter stats for 'system wide': > > 11989.280397 task-clock (msec) # 11.981 CPUs utilized > 1299 context-switches # 0.108 K/sec > 6 cpu-migrations # 0.001 K/sec > 70 page-faults # 0.006 K/sec > 127008602 cycles # 0.011 GHz > 279538533 stalled-cycles-frontend # 220.09% frontend cycles idle > 119213269 instructions # 0.94 insn per cycle > # 2.34 stalled cycles per insn > 24166678 branches # 2.016 M/sec > 505681 branch-misses # 2.09% of all branches > > 1.000684278 seconds time elapsed > > > ... see how the numbers became human-unreadable, losing the big-number separator? > > I suspect it's due to the following commit: > > fa184776ac27 perf stat: Check existence of frontend/backed stalled cycles yea, it used the pmu parsing which screwes locales, following patch fixed that for me.. jirka --- diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c index ce61f79dbaae..d8cd038baed2 100644 --- a/tools/perf/util/pmu.c +++ b/tools/perf/util/pmu.c @@ -124,6 +124,17 @@ static int perf_pmu__parse_scale(struct perf_pmu_alias *alias, char *dir, char * lc = setlocale(LC_NUMERIC, NULL); /* + * The lc string may be allocated in static storage, + * so get a dynamic copy to make it survive setlocale + * call below. + */ + lc = strdup(lc); + if (!lc) { + ret = -ENOMEM; + goto error; + } + + /* * force to C locale to ensure kernel * scale string is converted correctly. * kernel uses default C locale. @@ -135,6 +146,8 @@ static int perf_pmu__parse_scale(struct perf_pmu_alias *alias, char *dir, char * /* restore locale */ setlocale(LC_NUMERIC, lc); + free((char *) lc); + ret = 0; error: close(fd);