linux-perf-users.vger.kernel.org archive mirror
From: James Clark <james.clark@arm.com>
To: "Fontius Sebastian (XC-DA/ESV9)" <Sebastian.Fontius@de.bosch.com>,
	"linux-perf-users@vger.kernel.org"
	<linux-perf-users@vger.kernel.org>
Subject: Re: Perf call graphs on ARM
Date: Fri, 16 Apr 2021 14:16:56 +0300
Message-ID: <25a2fd6c-05ca-3ccc-11a3-41249323fbd2@arm.com>
In-Reply-To: <e083221298cc4bd09c4d77a85a33c2a1@de.bosch.com>



On 14/04/2021 16:22, Fontius Sebastian (XC-DA/ESV9) wrote:
> Hi everyone,
> 
> We are having trouble getting perf to output a call graph using frame pointers on a Raspberry Pi 2 Model B Revision 1.1 (ARM Cortex-A7) running Raspbian with its official kernel versions 4.9 and 5.10.
> 
> Let me illustrate the problem we're having using a small test program we will call aaa.cpp:
> 
> #include <iostream>
> #include <cmath>
> __attribute__((noinline)) double G(double aaa) {
>   return sqrt(aaa);
> }
> __attribute__((noinline)) double doit(double aaa) {
>   for(int i=0; i<1000; i++)
>     aaa=G(aaa);
>   return aaa;
> }
> int main() {
>   double aaa = 12;
>   for(int i=0; i<10000; i++){
>     aaa = doit(aaa);
>     for(int j=0; j<1000; j++)
>       aaa++;
>   }
>   std::cout << aaa << std::endl;
> }
> 

Hi Sebastian,

Is it possible to simplify the example and hold the process in G(), for example
with some kind of while loop (as long as the compiler still produces frame
records for each function)?

I tried this on 64-bit Arm and got a sensible stack, and the kernel's frame pointer
unwind for 64-bit is pretty much the same as the 32-bit one. Could it be that
samples are being taken in the function prologues, before the frame pointer is
fully set up, which is causing some unexpected results?

Here is my stack, which looks mostly the same with --call-graph=fp and --call-graph=dwarf:

-   99.73%    99.73%  unwind-simple.o  unwind-simple.out    [.] G
     0xfffff67decfd
     _start
     __libc_start_main
     main
     doit
     G

Thanks
James

> This program gets compiled like this using GCC Raspbian 8.3.0-6+rpi1:
> 
> g++ -O2 -fno-omit-frame-pointer aaa.cpp
> 
> Then we run perf on it like this:
> 
> perf record -e cycles --call-graph fp -- ./a.out
> perf report
> 
> In the output of perf there should be a tree like the following:
> 
> main
>   + doit
>     + G
> 
> Instead what we are getting is all of those functions attached to a.out (but we _do_ get the runtimes correctly).
> 
> It seems the frame pointers are written to the binary, but do not work. We can see the frame pointers in a disassembly output created by compiling with -save-temps like this:
> 
> g++ -O2 -fno-omit-frame-pointer -save-temps aaa.cpp
> 
> This gives the following output for the doit() function:
> 
> _Z4doitd:
> 	.fnstart
> .LFB1758:
> 	@ args = 0, pretend = 0, frame = 0
> 	@ frame_needed = 1, uses_anonymous_args = 0
> 	vmov.f64 d7, d0
> 	push {r4, r5, fp, lr}
> 	mov r4, #1000
> 	add fp, sp, #12
> .L7:
> 	vmov.f64 d0, d7
> 	bl _Z1Gd
> 	subs r4, r4, #1
> 	vmov.f64 d7, d0
> 	bne .L7
> 	pop {r4, r5, fp, pc}
> 	.cantunwind
> 	.fnend
> 
> One thing we tried in addition to frame pointers is the DWARF format, but that has some disadvantages, e.g. using roughly 20x the space of the FP format and being much slower to record. The recording itself also seems unstable and can hang the whole Raspberry Pi, requiring a hard reset. With kernel 5.10, the DWARF format also exhibited the same 'disconnectedness' of the call stack (i.e. all functions directly below a.out).
> 
> We also tried running Ubuntu 20.04 using its Kernel 5.4, but there both FP and DWARF were 'disconnected' again.
> 
> We're at a loss as to what is going wrong here. Does anyone have an idea what we could try to further debug or even understand the problem?
> 
> Mit freundlichen Grüßen / Best regards
> 
> Sebastian Fontius
> 
> Chassis Systems Control, Image Processing 9 (XC-DA/ESV9)
> Robert Bosch GmbH | Postfach 16 61 | 71226 Leonberg | GERMANY | www.bosch.com
> 
> Sitz: Stuttgart, Registergericht: Amtsgericht Stuttgart, HRB 14000;
> Aufsichtsratsvorsitzender: Franz Fehrenbach; Geschäftsführung: Dr. Volkmar Denner, 
> Prof. Dr. Stefan Asenkerschbaumer, Filiz Albrecht, Dr. Michael Bolle, Dr. Christian Fischer, 
> Dr. Stefan Hartung, Dr. Markus Heyn, Harald Kröger, Rolf Najork, Uwe Raschke
> 

Thread overview: 4+ messages
2021-04-14 13:22 Perf call graphs on ARM Fontius Sebastian (XC-DA/ESV9)
2021-04-14 13:38 ` Arnaldo Carvalho de Melo
2021-04-16  8:19   ` Fontius Sebastian (XC-DA/ESV9)
2021-04-16 11:16 ` James Clark [this message]
