From mboxrd@z Thu Jan  1 00:00:00 1970
From: Arnaldo Carvalho de Melo <acme@kernel.org>
Subject: Re: perf report does not resolve symbols on s390x
Date: Tue, 11 Jul 2017 16:03:04 -0300
Message-ID: <20170711190304.GH27350@kernel.org>
References: <eb00bf70-936d-e405-5867-015d4067202f@linux.vnet.ibm.com>
 <20170705155007.GH27350@kernel.org>
 <67eb6f70-fc34-95cd-b43c-349bbc24ee5d@linux.vnet.ibm.com>
 <5ead7c5a-5b91-107a-51ca-ea464fe8cfba@linux.vnet.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
Return-path: <linux-perf-users-owner@vger.kernel.org>
Received: from mail.kernel.org ([198.145.29.99]:60526 "EHLO mail.kernel.org"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1755549AbdGKTDJ (ORCPT
        <rfc822;linux-perf-users@vger.kernel.org>);
        Tue, 11 Jul 2017 15:03:09 -0400
Content-Disposition: inline
In-Reply-To: <5ead7c5a-5b91-107a-51ca-ea464fe8cfba@linux.vnet.ibm.com>
Sender: linux-perf-users-owner@vger.kernel.org
List-ID: <linux-perf-users.vger.kernel.org>
To: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Cc: "linux-perf-use." <linux-perf-users@vger.kernel.org>, Hendrik Brueckner <brueckner@linux.vnet.ibm.com>, zvonko.kosic@de.ibm.com, Adrian Hunter <adrian.hunter@intel.com>

Em Fri, Jul 07, 2017 at 02:17:25PM +0200, Thomas-Mich Richter escreveu:
> On 07/06/2017 02:35 PM, Thomas-Mich Richter wrote:
> > On 07/05/2017 05:50 PM, Arnaldo Carvalho de Melo wrote:
> >> Em Wed, Jul 05, 2017 at 04:45:49PM +0200, Thomas-Mich Richter escreveu:
> >>> I use linux 4.12 kernel and the perf report --stdio does not resolve all symbols:
> >>> Only very few symbols are resolved and none listed in the call back chain.

> >>> Here is an example:

> >>>     50.00%    50.00%  true     [kernel.vmlinux]  [k] __rb_insert_augmented
> >>>             |
> >>>             ---0x6a0624
> >>>                0x2d7c00
> >>>                0x2d79c8
> >>>                0x2b5a26
> >>>                0x2da542
> >>>                0x2da01a
> >>>                0x2d6c2e
> >>>                0x2d6b86
> >>>
> >>>     50.00%     0.00%  true     [unknown]         [k] 0x000000000011e90a
> >>>             |
> >>>             ---0x11e90a
> >>>                0x2d39a4
> >>>                0x2d3238
> >>>                0x288140
> >>>                0x2cf404
> >>>                0x2df5d8
> >>>                0x31618c
> >>>                0x19b0ae

> >>> These addresses are all in the /proc/kallsyms and valid.

> >>> When I do a perf script it works:
> >>> [root@s8360046 perf]# perf script

> >> This is the strange part, 'script' resolving things 'report' doesn't,
> >> they should be using the same machinery... Since the DSO is not being
> >> resolved on the one failing above, this would look like its some
> >> PERF_RECORD_MMAP/PERF_RECORD_MMAP2 that is being missed, so you should
> >> look at the processing of such records, which is done via:

> >>  perf_tool.mmap() and perf_tool.mmap2()

> >> the perf_tool ops table is passed to:

> >>   perf_session__new(&file, false, &report.tool);

> >> Which will call the mmap() and mmap2() functions to process those
> >> records, if you look at script it has:

> >>         struct perf_script script = {
> >>                 .tool = {
> >>                         .sample          = process_sample_event,
> >>                         .mmap            = perf_event__process_mmap,
> >>                         .mmap2           = perf_event__process_mmap2,
> >> <SNIP>

> >> and then:

> >>   session = perf_session__new(&file, false, &script.tool);

> >> Which is the same as done for report:
> >>
> >>           struct report report = {
> >>                 .tool = {
> >>                         .sample          = process_sample_event,
> >>                         .mmap            = perf_event__process_mmap,
> >>                         .mmap2           = perf_event__process_mmap2,
> >> <SNIP>
> >>
> >>   perf_session__new(&file, false, &report.tool);
> >>
> >> I.e. both use the same functions to resolve PERF_RECORD_MMAP/MMAP2 records,
> >> then you should look at how samples are processed, its static functions, tool specific
> >> functions, both named process_sample_event, in tools/perf/builtin->{report,script}.c,
> >> and both will basically do:
> >>
> >>         if (machine__resolve(machine, &al, sample) < 0) {
> >>                 pr_debug("problem processing %d event, skipping it.\n",
> >>                          event->header.type);
> >>                 return -1;
> >>         }
> >>
> >> That will do lookups on trees populated by perf_event__process_mmap,
> >> perf_event__process_mmap2, and the routines handling PERF_RECORD_FORK, etc.
> >>
> >> Is this enough to help you go from here to investigate your problem?
 
> [....]
 
> > Somehow that al->map information gets lost and later on the calls
> > thread__find_addr_location() this information is not available
> > and the lookup fails. And than the hist__add_entry fails as well.

> > I am still seeing tooooo much fog.....
 
> The fog has lifted and I found the root cause. Digging into machine__resolve was the
> right hint...
 
> What happens is this (short version)
> machine__resolve has map and sym pointers ( al->map:0x32629190 al->sym:0x32776f40 )
> --> sample__resolve_callchain al->map:0x32629190 al->sym:0x32776f40
>     --> thread__resolve_callchain al->map:0x32629190 al->sym:0x32776f40
>         --> thread__resolve_callchain_sample
>             creates a new struct addr_location to find the ip addr and its details.
>             --> add_callchain_ip (cpumode:2 --> user space address indicator)
>                 --> thread__find_addr_map 
>                     this function can not resolve the address and as a last action
>                     tries to resolve the address within kernel address space 
>                     --> machine__get_kernel_start
 
> Now this function is interesting:
 
> int machine__get_kernel_start(struct machine *machine)
> {
>         struct map *map = machine__kernel_map(machine);
>         int err = 0;
 
>         /*
>          * The only addresses above 2^63 are kernel addresses of a 64-bit
>          * kernel.  Note that addresses are unsigned so that on a 32-bit system
>          * all addresses including kernel addresses are less than 2^32.  In
>          * that case (32-bit system), if the kernel mapping is unknown, all
>          * addresses will be assumed to be in user space - see
>          * machine__kernel_ip().
>          */
>         machine->kernel_start = 1ULL << 63;
>         if (map) {
>                 err = map__load(map);
>                 if (map->start)
>                         machine->kernel_start = map->start;
>         }
>         return err;
> }
 
> It determines the kernel starts at address 1<<63 and loads the kernel address mapping.
> On s390x
> - The kernel starts at 0x0 (value of map->start) and thus all checks in function 
>   thread__find_addr_map() fail and no symbol is found for the specified addresses
>   because the kernel starts at 0x8000000000000000. Which is wrong the kernel start at 0x0.

Hi Thomas, really nice debugging session!

I'm trying the one-liner below, Adrian, can you please check this and
provide an ack? I think that that comment about the address that it will
default when map__load() fails needs rewriting in light of Thomas
comments about other arches (see further below)?

I did a quick check of machine->kernel_start usage in Intel PT and since
on x86 that assumption about partitioning the address space holds, no
problem should be introduced by the one-liner fix, right?

- Arnaldo
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 5de2b86b9880..2e9eb6aa3ce2 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2209,7 +2209,7 @@ int machine__get_kernel_start(struct machine *machine)
 	machine->kernel_start = 1ULL << 63;
 	if (map) {
 		err = map__load(map);
-		if (map->start)
+		if (!err)
 			machine->kernel_start = map->start;
 	}
 	return err;

 
> When I set this correctly (for example by setting kernel_start to 0x200) I get this:
>  50.00%    50.00%  true     [kernel.vmlinux]  [k] lock_acquire
>             |
>             ---pgm_check_handler
>                do_dat_exception
>                handle_mm_fault
>                __handle_mm_fault
>                filemap_map_pages
>                alloc_set_pte
>                page_add_file_rmap
>                lock_page_memcg
>                lock_acquire
> 
> 
> This raises 2 questions:
> 1. s390 has a 64 bit address space for user and kernel. The processor status word (PSW)
>    determines which address space to use. That requires the PSW in the sample. Not sure
>    this is the case?
> 2. How does this work on sparc and other architectures with the same addressing scheme?
> 
> Thanks.
> -- 
> Thomas Richter, Dept 3303, IBM LTC Boeblingen Germany
> --
> Vorsitzende des Aufsichtsrats: Martina Koederitz 
> Geschäftsführung: Dirk Wittkopp
> Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294