* Why the stack frame in perf.data isn't displayed in FlameGraph?
From: Nan Xiao @ 2017-05-18 4:09 UTC
To: linux-perf-users

Hi all,

I am new to the perf tools and FlameGraph. I wrote a simple test
program adapted from this NTL program
(http://www.shoup.net/ntl/doc/tour-ex3.html):

    #include <NTL/ZZX.h>

    using namespace std;
    using namespace NTL;

    void inner(int i, ZZX& t, Vec<ZZX>& phi)
    {
        for (long j = 1; j <= i-1; j++)
            if (i % j == 0)
                t *= phi(j);
    }

    void outer(int i, Vec<ZZX>& phi)
    {
        ZZX t;
        t = 1;
        inner(i, t, phi);
        phi(i) = (ZZX(INIT_MONO, i) - 1)/t;
        cout << phi(i) << "\n";
    }

    int main()
    {
        Vec<ZZX> phi(INIT_SIZE, 100);

        for (long i = 1; i <= phi.length(); i++) {
            outer(i, phi);
        }
    }

I compiled it with the following command:

    g++ -pthread test.cpp -lntl -lgmp

I used "perf record -g ./a.out" to profile the program, but "perf
report" shows only "main" and "outer", with no "inner" function
(https://github.com/NanXiao/images/blob/master/perf/perf.data):

    ......
    +    7.10%     0.00%  a.out    a.out    [.] outer
    +    7.10%     0.00%  a.out    a.out    [.] main
    ......

In perf.svg
(https://github.com/NanXiao/images/blob/master/perf/perf.svg), there
are neither "outer" nor "inner" stack frames.

Since "outer" is in perf.data, why isn't it displayed in the
FlameGraph? My other question is why perf.data doesn't contain the
"inner" stack frame.

Thanks in advance!

Best Regards
Nan Xiao
* Re: Why the stack frame in perf.data isn't displayed in FlameGraph?
From: Milian Wolff @ 2017-05-18 7:32 UTC
To: Nan Xiao; +Cc: linux-perf-users

On Thursday, 18 May 2017 06:09:55 CEST Nan Xiao wrote:
> [...]
> And compile it using following command:
>
> g++ -pthread test.cpp -lntl -lgmp

Frame pointers are missing. You probably want to use either

a) `g++ -fno-omit-frame-pointer`, which may work if NTL is header-only
b) `g++ -g` to rely on DWARF debug information, and then use
   `perf record --call-graph dwarf`

In general, I suggest always compiling with `-O2 -g` when you want to
profile. Otherwise you miss a lot of compiler optimizations and/or
cripple tools like perf that want to use debug information.

> I use "perf record -g ./a.out" to profile the program, but "perf
> report" can only show "main" and "outer", no "inner" function
> (https://github.com/NanXiao/images/blob/master/perf/perf.data):
>
> ......
> +    7.10%     0.00%  a.out    a.out    [.] outer
> +    7.10%     0.00%  a.out    a.out    [.] main
> ......

I bet that's because the "inner" function got inlined and frame
pointers are disabled (see above). Recompile with the command above,
and then use `perf record --call-graph dwarf`.

Also, to see inline frames, try to build perf from git (acme's
perf/core) and use `perf report --inline`.

> From the
> perf.svg(https://github.com/NanXiao/images/blob/master/perf/perf.svg),
> there is neither "outer" nor "inner" stack frames.
>
> Since there is "outer" in perf.data, why it can't display in
> FlameGraph? Another doubt is why perf.data doesn't contain "inner"
> stack frame?

If the above fixes it for report, try applying "[PATCH] perf script:
Add --inline option" by Namhyung Kim. Then you can generate the
flamegraph via

    perf script --inline | stackcollapse-perf.pl | flamegraph.pl > out.svg

which should also show the inline frames.

Cheers, hope that helps

-- 
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts
* Re: Why the stack frame in perf.data isn't displayed in FlameGraph?
From: Nan Xiao @ 2017-05-18 13:24 UTC
To: Milian Wolff; +Cc: linux-perf-users

Hi Milian,

Firstly, thanks very much for your kind and detailed help!

(1) I recompiled the program using "g++ -g -O2 ..." and used "perf
record --call-graph dwarf" to sample. The FlameGraph now displays full
stack frames, and the "inner" function appears in both the FlameGraph
and "perf report".

(2)
> Also, to see inline frames, try to build perf from git (acme's
> perf/core) and use `perf report --inline`.

My perf version doesn't support the "--inline" option, and I can't
figure out what "build perf from git (acme's perf/core)" means. Could
you elaborate? I just installed the perf tools using "sudo pacman -S
perf".

Thanks very much in advance!

Best Regards
Nan Xiao
* Re: Why the stack frame in perf.data isn't displayed in FlameGraph?
From: Milian Wolff @ 2017-05-18 19:01 UTC
To: Nan Xiao; +Cc: linux-perf-users

On Thursday, 18 May 2017 15:24:04 CEST Nan Xiao wrote:
> [...]
> I find my perf version doesn't support "--inline" option, and I can't
> figure out what is the meaning of "build perf from git (acme's
> perf/core)". Could you elaborate it? I just install perf tools using
> "sudo pacman -S perf".

Essentially it boils down to doing this:

    git clone git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux -b perf/core
    cd linux/tools/perf
    make
    ./perf report --inline ...

Cheers

-- 
Milian Wolff
* Re: Why the stack frame in perf.data isn't displayed in FlameGraph?
From: Nan Xiao @ 2017-05-19 1:50 UTC
To: Milian Wolff; +Cc: linux-perf-users

Hi Milian,

Thanks very much!

So what is the difference between acme's perf/core and the general
perf installed by pacman/yum? Is acme's perf/core newer? Thx!

Best Regards
Nan Xiao
* Re: Why the stack frame in perf.data isn't displayed in FlameGraph?
From: Milian Wolff @ 2017-05-23 9:32 UTC
To: Nan Xiao; +Cc: linux-perf-users

On Friday, May 19, 2017 3:50:46 AM CEST Nan Xiao wrote:
> Hi Milian,
>
> Thanks very much!
>
> So what is the difference between acme's perf/core and general perf
> installed by pacman/yum? Is acme's perf/core more newer? Thx!

Yes. acme is the maintainer of the perf subsystem, and his perf/core
branch is where all the latest work gets merged and tested before it
lands in mainline. Once it's in mainline, it takes some time until a
kernel gets released with this version, and then a couple of months
before distros start using that kernel version (or a newer one).

So in general, if you rely on the distro packages for perf, most things
are outdated by at least half a year or so, often more. For the
in-kernel recording features, that's often fine. But for the user-space
analysis tools, it is often the difference between night and day in my
eyes. YMMV.

Cheers

-- 
Milian Wolff
* Re: Why the stack frame in perf.data isn't displayed in FlameGraph?
From: Nan Xiao @ 2017-05-23 10:14 UTC
To: Milian Wolff; +Cc: linux-perf-users

Hi Milian,

Got it! Thanks very much for all your kind help! :-)

Best Regards
Nan Xiao