From: William Cohen <wcohen@redhat.com>
To: linux-perf-users@vger.kernel.org
Cc: wcohen@redhat.com
Subject: PROBLEM: The --call-graph=fp data does not agree with the --call-graph=dwarf results
Date: Tue, 30 Aug 2022 10:21:28 -0400
Message-ID: <cb3efad5-37eb-29e4-03b0-f0bcc8be9918@redhat.com>

Using perf-tools-5.18.13-200.fc36 on Fedora 36, I was examining
where perf-report was spending its time when generating its report and
found an efficiency issue in Fedora 36's binutils-2.37. The
efficiency issue has been addressed in Fedora rawhide and will be
backported to Fedora 36
(https://bugzilla.redhat.com/show_bug.cgi?id=2120752). It was
initially discovered when processing perf.data files created with
--call-graph=dwarf. The perf-report call-graph output for dwarf
data marks inlined functions in the report. The excessive
time spent in binutils bfd's lookup_func_by_offset was caused by perf-report
building up a red-black tree mapping IP addresses to functions,
including inlined functions.

I ran a similar experiment with --call-graph=fp to see if it triggered
the same excessive overhead in building the red-black tree for
inlined functions. It did not. The resulting perf-report output
for --call-graph=fp does not include information about inlined functions.

I have a small reproducer in the attached perf_inlined.tar.gz that
demonstrates the difference between the two methods of storing
call-chain information. Compile and collect data with:

    tar xvfz perf_inlined.tar.gz
    cd perf_inlined
    make all
    perf report --input=perf_fp.data > fp.log
    perf report --input=perf_dwarf.data > dwarf.log

The dwarf.log has the expected call structure for main:

    main
    |
     --85.72%--fill_array (inlined)
               |
               |--78.09%--rand
               |          |
               |           --75.10%--__random
               |                     |
               |                      --9.14%--__random_r
               |
               |--1.58%--compute_sqrt (inlined)
               |
                --1.32%--_init

The fp.log looks odd given the program:

    99.99%     0.00%  time_waste  libc.so.6  [.] __libc_start_call_main
            |
            ---__libc_start_call_main
               |
               |--66.07%--__random
               |
               |--21.28%--main
               |
               |--8.42%--__random_r
               |
               |--2.91%--rand
               |
                --1.31%--_init

Given how commonly functions are inlined in optimized code, it seems
that the perf-report output for --call-graph=fp should include
information about time spent in inlined functions.

-Will Cohen
[-- Attachment #2: perf_inlined.tar.gz --]
[-- Type: application/gzip, Size: 649 bytes --]