From: Milian Wolff <milian.wolff@kdab.com>
To: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>,
Linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
kernel-team@lge.com
Subject: Re: [PATCH 0/7] generate full callchain cursor entries for inlined frames
Date: Wed, 24 May 2017 15:42:59 +0200 [thread overview]
Message-ID: <4449663.nJV0hUTq9C@milian-kdab2> (raw)
In-Reply-To: <1535446.SrsFzc0SgI@milian-kdab2>
[-- Attachment #1: Type: text/plain, Size: 4304 bytes --]
On Wednesday, May 24, 2017 1:46:04 PM CEST Milian Wolff wrote:
> On Monday, May 22, 2017 11:06:43 AM CEST Namhyung Kim wrote:
> > Hi Milian,
> >
> > On Thu, May 18, 2017 at 10:05:36PM +0200, Milian Wolff wrote:
> > > On Donnerstag, 18. Mai 2017 21:34:04 CEST Milian Wolff wrote:
> > > > This series of patches completely reworks the way inline frames are
> > > > handled. Instead of querying for the inline nodes on-demand in the
> > > > individual tools, we now create proper callchain nodes for inlined
> > > > frames. The advantages this approach brings are numerous:
> > > >
> > > > - less duplicated code in the individual browser
> > > > - aggregated cost for inlined frames for the --children top-down list
> > > > - various bug fixes that arose from querying for a srcline/symbol
> > > > based
> > > > on
> > > >
> > > > the IP of a sample, which will always point to the last inlined
> > > > frame
> > > > instead of the corresponding non-inlined frame
> > > >
> > > > - overall much better support for visualizing cost for heavily-inlined
> > > > C++
> > > >
> > > > code, which simply was confusing and unreliably before
> > > >
> > > > - srcline honors the global setting as to whether full paths or
> > > > basenames
> > > >
> > > > should be shown
> > > >
> > > > For comparison, below lists the output before and after for `perf
> > > > script`
> > >
> > > > and `perf report`. The example file I used to generate the perf data
is:
> > > And of course shortly after sending this patch series I notice the first
> > > issues ;-) The new behavior shows confusing results for `-g function`
> > > because match_chain uses sym->start. I fixed this locally to compare the
> > > actual
> >
> > > function name if either of the two symbols is an inlined fake symbol:
> > Why not making the fake symbol has start addr of the sample IP and
> > length of 1. The histogram sort code also compares the sym->start
> > which might confuse the output of the children mode too IMHO.
>
> I can try that out, thank you for the suggestion. But I think it can easily
> break in different ways. I.e. when the same inline function gets used at
> different IPs, it should actually be considered to be the same function when
> we group/merge/aggregate. I updated the `match_chain` function accordingly,
> to do a symname / srcline comparison on inlined frames, instead of relying
> on the symbol start/end. I think using the IP for the fake symbols won't be
> more reliable here, don't you think?
>
> In the end, I think we'll always have to special-case inlined fake symbols
> when we aggregate data, since the sym start/end is always going to be some
> arbitrary value that may or may not be what we want it to be. Doing the
> explicit comparison on e.g. srcline/symname is always going to be the most
> reliable option, as it also directly results in a proper aggregation based
> on the strings that the user will see in the end.
I haven't yet tried it out, but I think I can come up with a way to break your
approach easily. Assume the following pseudo-code:
void tail()
{
instr1; // IP1
instr2; // IP2
}
void mid()
{
tail();
}
void main()
{
mid();
}
Now, assume both `tail` and `mid` get inlined into `main`. If we get one
sample each for both IP1 and IP2, we want the following merged structure if we
merge based on symbol:
sym | incl | self
main | 2 | 0
mid | 2 | 0
tail | 2 | 2
If we would give the inlined fake-symbols a start of the IP, i.e. either IP1
or IP2, then we would end up with this (unexpected) behavior instead:
sym | incl | self
main | 2 | 0
mid | 1 | 0
mid | 1 | 0
tail | 1 | 1
tail | 1 | 1
The reason is that the fake symbols for the inlined frames would be considered
to be different functions since their start/end are not equal. This is "wrong"
in my eyes - we really have to do symbol name comparisons for inlined frames,
and also include srcline if that is desired.
If you think the above is not a valid assessment, I'll try to change my patch
series to use the IP + 1 trick you suggest. But I really don't think it's
going to work.
Cheers
--
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts
[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 3826 bytes --]
next prev parent reply other threads:[~2017-05-24 13:42 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-05-18 19:34 [PATCH 0/7] generate full callchain cursor entries for inlined frames Milian Wolff
2017-05-18 19:34 ` [PATCH 1/7] perf report: remove code to handle inline frames from browsers Milian Wolff
2017-05-18 19:34 ` [PATCH 2/7] perf util: take elf_name as const string in dso__demangle_sym Milian Wolff
2017-05-18 19:34 ` [PATCH 3/7] perf report: create real callchain entries for inlined frames Milian Wolff
2017-05-22 12:19 ` Namhyung Kim
2017-05-24 11:41 ` Milian Wolff
2017-05-18 19:34 ` [PATCH 4/7] perf report: use srcline from " Milian Wolff
2017-05-18 19:34 ` [PATCH 5/7] perf report: fall-back to function name comparison for -g srcline Milian Wolff
2017-05-18 19:34 ` [PATCH 6/7] perf report: mark inlined frames in output by " (inlined)" suffix Milian Wolff
2017-05-22 12:48 ` Namhyung Kim
2017-05-24 11:46 ` Milian Wolff
2017-06-03 13:51 ` Milian Wolff
2017-06-06 1:33 ` Namhyung Kim
2017-06-06 7:26 ` Milian Wolff
2017-06-06 19:52 ` Arnaldo Carvalho de Melo
2017-05-18 19:34 ` [PATCH 7/7] perf script: mark inlined frames and do not print DSO for them Milian Wolff
2017-05-22 12:11 ` Namhyung Kim
2017-05-24 11:40 ` Milian Wolff
2017-05-18 20:05 ` [PATCH 0/7] generate full callchain cursor entries for inlined frames Milian Wolff
2017-05-22 9:06 ` Namhyung Kim
2017-05-24 11:46 ` Milian Wolff
2017-05-24 13:42 ` Milian Wolff [this message]
2017-05-24 15:02 ` Namhyung Kim
2017-05-29 18:36 ` Milian Wolff
2017-05-30 1:33 ` Namhyung Kim
2017-05-22 12:09 ` Namhyung Kim
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4449663.nJV0hUTq9C@milian-kdab2 \
--to=milian.wolff@kdab.com \
--cc=Linux-kernel@vger.kernel.org \
--cc=kernel-team@lge.com \
--cc=linux-perf-users@vger.kernel.org \
--cc=namhyung@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).