All of lore.kernel.org
 help / color / mirror / Atom feed
From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Cc: "linux-perf-use." <linux-perf-users@vger.kernel.org>,
	Hendrik Brueckner <brueckner@linux.vnet.ibm.com>,
	zvonko.kosic@de.ibm.com, Adrian Hunter <adrian.hunter@intel.com>
Subject: Re: perf report does not resolve symbols on s390x
Date: Tue, 11 Jul 2017 16:03:04 -0300	[thread overview]
Message-ID: <20170711190304.GH27350@kernel.org> (raw)
In-Reply-To: <5ead7c5a-5b91-107a-51ca-ea464fe8cfba@linux.vnet.ibm.com>

Em Fri, Jul 07, 2017 at 02:17:25PM +0200, Thomas-Mich Richter escreveu:
> On 07/06/2017 02:35 PM, Thomas-Mich Richter wrote:
> > On 07/05/2017 05:50 PM, Arnaldo Carvalho de Melo wrote:
> >> Em Wed, Jul 05, 2017 at 04:45:49PM +0200, Thomas-Mich Richter escreveu:
> >>> I use linux 4.12 kernel and the perf report --stdio does not resolve all symbols:
> >>> Only very few symbols are resolved and none listed in the call back chain.

> >>> Here is an example:

> >>>     50.00%    50.00%  true     [kernel.vmlinux]  [k] __rb_insert_augmented
> >>>             |
> >>>             ---0x6a0624
> >>>                0x2d7c00
> >>>                0x2d79c8
> >>>                0x2b5a26
> >>>                0x2da542
> >>>                0x2da01a
> >>>                0x2d6c2e
> >>>                0x2d6b86
> >>>
> >>>     50.00%     0.00%  true     [unknown]         [k] 0x000000000011e90a
> >>>             |
> >>>             ---0x11e90a
> >>>                0x2d39a4
> >>>                0x2d3238
> >>>                0x288140
> >>>                0x2cf404
> >>>                0x2df5d8
> >>>                0x31618c
> >>>                0x19b0ae

> >>> These addresses are all in the /proc/kallsyms and valid.

> >>> When I do a perf script it works:
> >>> [root@s8360046 perf]# perf script

> >> This is the strange part, 'script' resolving things 'report' doesn't,
> >> they should be using the same machinery... Since the DSO is not being
> >> resolved on the one failing above, this would look like its some
> >> PERF_RECORD_MMAP/PERF_RECORD_MMAP2 that is being missed, so you should
> >> look at the processing of such records, which is done via:

> >>  perf_tool.mmap() and perf_tool.mmap2()

> >> the perf_tool ops table is passed to:

> >>   perf_session__new(&file, false, &report.tool);

> >> Which will call the mmap() and mmap2() functions to process those
> >> records, if you look at script it has:

> >>         struct perf_script script = {
> >>                 .tool = {
> >>                         .sample          = process_sample_event,
> >>                         .mmap            = perf_event__process_mmap,
> >>                         .mmap2           = perf_event__process_mmap2,
> >> <SNIP>

> >> and then:

> >>   session = perf_session__new(&file, false, &script.tool);

> >> Which is the same as done for report:
> >>
> >>           struct report report = {
> >>                 .tool = {
> >>                         .sample          = process_sample_event,
> >>                         .mmap            = perf_event__process_mmap,
> >>                         .mmap2           = perf_event__process_mmap2,
> >> <SNIP>
> >>
> >>   perf_session__new(&file, false, &report.tool);
> >>
> >> I.e. both use the same functions to resolve PERF_RECORD_MMAP/MMAP2 records,
> >> then you should look at how samples are processed, its static functions, tool specific
> >> functions, both named process_sample_event, in tools/perf/builtin->{report,script}.c,
> >> and both will basically do:
> >>
> >>         if (machine__resolve(machine, &al, sample) < 0) {
> >>                 pr_debug("problem processing %d event, skipping it.\n",
> >>                          event->header.type);
> >>                 return -1;
> >>         }
> >>
> >> That will do lookups on trees populated by perf_event__process_mmap,
> >> perf_event__process_mmap2, and the routines handling PERF_RECORD_FORK, etc.
> >>
> >> Is this enough to help you go from here to investigate your problem?
 
> [....]
 
> > Somehow that al->map information gets lost and later on the calls
> > thread__find_addr_location() this information is not available
> > and the lookup fails. And than the hist__add_entry fails as well.

> > I am still seeing tooooo much fog.....
 
> The fog has lifted and I found the root cause. Digging into machine__resolve was the
> right hint...
 
> What happens is this (short version)
> machine__resolve has map and sym pointers ( al->map:0x32629190 al->sym:0x32776f40 )
> --> sample__resolve_callchain al->map:0x32629190 al->sym:0x32776f40
>     --> thread__resolve_callchain al->map:0x32629190 al->sym:0x32776f40
>         --> thread__resolve_callchain_sample
>             creates a new struct addr_location to find the ip addr and its details.
>             --> add_callchain_ip (cpumode:2 --> user space address indicator)
>                 --> thread__find_addr_map 
>                     this function can not resolve the address and as a last action
>                     tries to resolve the address within kernel address space 
>                     --> machine__get_kernel_start
 
> Now this function is interesting:
 
> int machine__get_kernel_start(struct machine *machine)
> {
>         struct map *map = machine__kernel_map(machine);
>         int err = 0;
 
>         /*
>          * The only addresses above 2^63 are kernel addresses of a 64-bit
>          * kernel.  Note that addresses are unsigned so that on a 32-bit system
>          * all addresses including kernel addresses are less than 2^32.  In
>          * that case (32-bit system), if the kernel mapping is unknown, all
>          * addresses will be assumed to be in user space - see
>          * machine__kernel_ip().
>          */
>         machine->kernel_start = 1ULL << 63;
>         if (map) {
>                 err = map__load(map);
>                 if (map->start)
>                         machine->kernel_start = map->start;
>         }
>         return err;
> }
 
> It determines the kernel starts at address 1<<63 and loads the kernel address mapping.
> On s390x
> - The kernel starts at 0x0 (value of map->start) and thus all checks in function 
>   thread__find_addr_map() fail and no symbol is found for the specified addresses
>   because the kernel starts at 0x8000000000000000. Which is wrong the kernel start at 0x0.

Hi Thomas, really nice debugging session!

I'm trying the one-liner below, Adrian, can you please check this and
provide an ack? I think that that comment about the address that it will
default when map__load() fails needs rewriting in light of Thomas
comments about other arches (see further below)?

I did a quick check of machine->kernel_start usage in Intel PT and since
on x86 that assumption about partitioning the address space holds, no
problem should be introduced by the one-liner fix, right?

- Arnaldo

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 5de2b86b9880..2e9eb6aa3ce2 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2209,7 +2209,7 @@ int machine__get_kernel_start(struct machine *machine)
 	machine->kernel_start = 1ULL << 63;
 	if (map) {
 		err = map__load(map);
-		if (map->start)
+		if (!err)
 			machine->kernel_start = map->start;
 	}
 	return err;

 
> When I set this correctly (for example by setting kernel_start to 0x200) I get this:
>  50.00%    50.00%  true     [kernel.vmlinux]  [k] lock_acquire
>             |
>             ---pgm_check_handler
>                do_dat_exception
>                handle_mm_fault
>                __handle_mm_fault
>                filemap_map_pages
>                alloc_set_pte
>                page_add_file_rmap
>                lock_page_memcg
>                lock_acquire
> 
> 
> This raises 2 questions:
> 1. s390 has a 64 bit address space for user and kernel. The processor status word (PSW)
>    determines which address space to use. That requires the PSW in the sample. Not sure
>    this is the case?
> 2. How does this work on sparc and other architectures with the same addressing scheme?
> 
> Thanks.
> -- 
> Thomas Richter, Dept 3303, IBM LTC Boeblingen Germany
> --
> Vorsitzende des Aufsichtsrats: Martina Koederitz 
> Geschäftsführung: Dirk Wittkopp
> Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294

  parent reply	other threads:[~2017-07-11 19:03 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-05 14:45 perf report does not resolve symbols on s390x Thomas-Mich Richter
2017-07-05 15:50 ` Arnaldo Carvalho de Melo
2017-07-06  7:23   ` Thomas-Mich Richter
2017-07-06 12:35   ` Thomas-Mich Richter
2017-07-07 12:17     ` Thomas-Mich Richter
2017-07-07 12:22       ` Arnaldo Carvalho de Melo
2017-07-11 19:03       ` Arnaldo Carvalho de Melo [this message]
2017-07-11 19:38         ` Arnaldo Carvalho de Melo
2017-07-11 19:48           ` Arnaldo Carvalho de Melo
2017-07-12  8:21             ` Thomas-Mich Richter
2017-07-12 10:40             ` Michael Ellerman
2017-07-12 14:04               ` Arnaldo Carvalho de Melo
2017-07-13 12:02                 ` Michael Ellerman
2017-07-12  9:05         ` Thomas-Mich Richter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170711190304.GH27350@kernel.org \
    --to=acme@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=brueckner@linux.vnet.ibm.com \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=tmricht@linux.vnet.ibm.com \
    --cc=zvonko.kosic@de.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.