From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752460AbdFNKpD (ORCPT ); Wed, 14 Jun 2017 06:45:03 -0400 Received: from mail.kernel.org ([198.145.29.99]:46450 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752320AbdFNKoa (ORCPT ); Wed, 14 Jun 2017 06:44:30 -0400 DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 73A4E22CB4 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=acme@kernel.org Date: Wed, 14 Jun 2017 07:44:26 -0300 From: Arnaldo Carvalho de Melo To: Ingo Molnar Cc: linux-kernel@vger.kernel.org, Arnaldo Carvalho de Melo , Adrian Hunter , Alexander Shishkin , David Ahern , Namhyung Kim , Peter Zijlstra , Wang Nan Subject: Re: [PATCH 1/2] perf evsel: Fix probing of precise_ip level for default cycles event Message-ID: <20170614104426.GA32020@kernel.org> References: <20170613232343.4365-1-acme@kernel.org> <20170613232343.4365-2-acme@kernel.org> <20170614054500.exocsybw2qkug5tn@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20170614054500.exocsybw2qkug5tn@gmail.com> X-Url: http://acmel.wordpress.com User-Agent: Mutt/1.8.0 (2017-02-23) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Em Wed, Jun 14, 2017 at 07:45:01AM +0200, Ingo Molnar escreveu: > * Arnaldo Carvalho de Melo wrote: > > +++ b/tools/perf/util/evsel.c > > @@ -273,6 +273,11 @@ struct perf_evsel *perf_evsel__new_cycles(void) > > event_attr_init(&attr); > > + /* > > + * Unnamed union member, not supported as struct member named > > + * initializer in older compilers such as gcc 4.4.7 > > + */ > > + attr.sample_period = 1; > > perf_event_attr__set_max_precise_ip(&attr); > Hm, so this really broke perf for me on my main system - 'perf top' and 'perf > report' only shows: > triton:~/tip> perf report --stdio > unwind: target platform=x86 is not supported > unwind: target platform=x86 is not supported > unwind: target platform=x86 is not supported > # To display the perf.data header info, please use --header/--header-only options. > # Total Lost Samples: 0 > # Samples: 921K of event 'cycles:ppp' > # Event count (approx.): 921045 > # Overhead Command Shared Object Symbol > # ........ ......... ................ .................... > 99.93% hackbench [kernel.vmlinux] [k] native_write_msr > 0.07% perf [kernel.vmlinux] [k] native_write_msr > the bisection result is unambiguous: > 7fd1d092b4337831d7ccbf3a74c07cb0b2089023 is the first bad commit > proper output would be: > ... > # Total Lost Samples: 0 > # > # Samples: 9K of event 'cycles' > # Event count (approx.): 4378583062 > # > # Overhead Command Shared Object Symbol > # ........ ......... ................ ....................................... > 4.32% hackbench [kernel.vmlinux] [k] copy_user_enhanced_fast_string > 4.02% hackbench [kernel.vmlinux] [k] unix_stream_read_generic > 3.75% hackbench [kernel.vmlinux] [k] filemap_map_pages > 3.06% hackbench [kernel.vmlinux] [k] __check_object_size > 2.44% hackbench [kernel.vmlinux] [k] _raw_spin_lock_irqsave > 2.32% hackbench [kernel.vmlinux] [k] native_queued_spin_lock_slowpath > 2.22% hackbench [kernel.vmlinux] [k] entry_SYSENTER_compat > 1.90% hackbench [vdso] [.] __vdso_gettimeofday > 1.80% hackbench [kernel.vmlinux] [k] _raw_spin_lock > 1.80% hackbench [kernel.vmlinux] [k] skb_set_owner_w > 1.67% hackbench [kernel.vmlinux] [k] kmem_cache_free > 1.52% hackbench [kernel.vmlinux] [k] skb_release_data > 1.48% hackbench [kernel.vmlinux] [k] common_file_perm > 1.45% hackbench [kernel.vmlinux] [k] page_fault > 1.45% hackbench [kernel.vmlinux] [k] cmpxchg_double_slab.isra.62 > 1.42% hackbench [kernel.vmlinux] [k] new_sync_read > 1.36% hackbench [kernel.vmlinux] [k] __check_heap_object > > Here's the hardware details: No need for that, its simpler, brown paper bag, I concentrated on it not returning -EINVAL and didn't test it enough, ditto with the other guys that tested this on s/390 :-\ The default case gets the precise_ip right, i.e. 3 in this broadwell machine, but remained with the sample_period=1 used to probe it, ugh. [root@jouet ~]# perf record usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.021 MB perf.data (69 samples) ] [root@jouet ~]# perf evlist -v cycles:ppp: size: 112, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME, disabled: 1, inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 If we use 'cycles:P' explicitely we can see that it uses the default sample_freq: [root@jouet ~]# perf record -e cycles:P usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.018 MB perf.data (8 samples) ] [root@jouet ~]# perf evlist -v cycles:P: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, mmap2: 1, comm_exec: 1 [root@jouet ~]# Can you try with the patch below, which is hackish but the minimal fix at this point. Later I'll come up with a way of returning a fully configured cycles evsel, which will make the tools code simpler, moving more stuff to the libraries. I'll look into the unwind case, where SRCARCH is being passed to the unwind code uses both x86_64, not x86 (probably uses i386 or x86_32 for the 32-bit case). - Arnaldo diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c index a7ce529ca36c..cda44b0e821c 100644 --- a/tools/perf/util/evsel.c +++ b/tools/perf/util/evsel.c @@ -276,10 +276,17 @@ struct perf_evsel *perf_evsel__new_cycles(void) /* * Unnamed union member, not supported as struct member named * initializer in older compilers such as gcc 4.4.7 + * + * Just for probing the precise_ip: */ attr.sample_period = 1; perf_event_attr__set_max_precise_ip(&attr); + /* + * Now let the usual logic to set up the perf_event_attr defaults + * to kick in when we return and before perf_evsel__open() is called. + */ + attr.sample_period = 0; evsel = perf_evsel__new(&attr); if (evsel == NULL)