public inbox for linux-kernel@vger.kernel.org
* [PATCH V5 0/3] perf tool: Haswell LBR call stack support (user)
@ 2014-12-02 15:06 kan.liang
From: kan.liang @ 2014-12-02 15:06 UTC
  To: acme, jolsa, a.p.zijlstra, eranian
  Cc: linux-kernel, mingo, paulus, ak, namhyung, Kan Liang

From: Kan Liang <kan.liang@intel.com>

This is the user space patch for Haswell LBR call stack support.
For many profiling tasks we need the callgraph. For example, we often
need to see the caller of a lock, a memcpy, or another library function
to actually tune the program. Frame pointer unwinding is efficient and
works well, but frame pointers are off by default in 64-bit code (and
with modern 32-bit gccs), so many binaries around do not use them.
Being able to profile unchanged production code is very useful in
practice. On some CPUs frame pointers also have a high cost. Dwarf2
unwinding does not always work either, and is extremely slow (up to
20% overhead).

Haswell has a new feature that utilizes the existing Last Branch Record
facility to record call chains. When the feature is enabled, function
calls are collected as normal, but as return instructions are executed
the last captured branch record is popped from the on-chip LBR
registers. The LBR call stack facility thus provides an alternative way
to get the callgraph. It has some limitations too, but should work in
most cases and is significantly faster than dwarf. Frame pointer
unwinding is still the best default, but LBR call stack is a good
alternative when nothing else works.
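
To illustrate the push-on-call, pop-on-return behaviour described above,
here is a minimal software model. It is only a sketch, not the hardware
or perf implementation: the names (lbr_stack, lbr_on_call, lbr_on_return,
lbr_callchain) are hypothetical, the depth of 16 entries is an
assumption, and dropping the oldest record on overflow is a simplifying
assumption about how a full stack behaves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed number of branch records; illustrative only. */
#define LBR_DEPTH 16

/* One branch record: where the call came from and where it went. */
struct lbr_entry {
	uint64_t from;	/* address of the call instruction */
	uint64_t to;	/* entry address of the called function */
};

/* Hypothetical model of the record stack; newest is entries[nr - 1]. */
struct lbr_stack {
	struct lbr_entry entries[LBR_DEPTH];
	size_t nr;	/* number of valid records */
};

/* A call pushes a record; if the stack is full, the oldest one is lost. */
void lbr_on_call(struct lbr_stack *s, uint64_t from, uint64_t to)
{
	if (s->nr == LBR_DEPTH) {
		for (size_t i = 1; i < LBR_DEPTH; i++)
			s->entries[i - 1] = s->entries[i];
		s->nr--;
	}
	s->entries[s->nr].from = from;
	s->entries[s->nr].to = to;
	s->nr++;
}

/* A return pops the most recently captured record, if there is one. */
void lbr_on_return(struct lbr_stack *s)
{
	if (s->nr > 0)
		s->nr--;
}

/*
 * Copy the callee addresses into out[], innermost function first.
 * Returns the number of addresses written (at most max).
 */
size_t lbr_callchain(const struct lbr_stack *s, uint64_t *out, size_t max)
{
	size_t n = 0;

	while (n < s->nr && n < max) {
		out[n] = s->entries[s->nr - 1 - n].to;
		n++;
	}
	assert(n <= LBR_DEPTH);
	return n;
}
```

In this model, after three nested calls and one return, lbr_callchain()
yields the two functions that are still active, which is the call chain
a sample taken at that point would see. Once the nesting exceeds the
assumed depth, the outermost callers are no longer available, which is
one kind of limitation such a scheme can have.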

The kernel part of this work is at https://lkml.org/lkml/2014/11/6/432

Changes since v1
 - Update help document
 - Force exclude_user to 0, with a warning, in LBR call stack mode
 - Dump both LBR and fp info for report -D
 - Reconstruct thread__resolve_callchain_sample and split it into two patches
 - Use the has_branch_callstack function to check whether LBR call stack
   is available

Changes since v2
 - Rebase to 025ce5d33373

Changes since v3
 - Rebase to cc502c23aadf
 - Separate functions to resolve and print LBR call stack samples
 - Some minor changes according to review comments

Changes since v4
 - Rebase to 09a6a1b
 - Fall back to frame pointers if LBR is not available, and warn the user

Kan Liang (3):
  perf tools: enable LBR call stack support
  perf tool: Move cpumode resolve code to add_callchain_ip
  perf tools: Construct LBR call chain

 tools/perf/Documentation/perf-record.txt |   8 +-
 tools/perf/builtin-record.c              |   6 +-
 tools/perf/builtin-report.c              |   2 +
 tools/perf/util/callchain.c              |  10 +-
 tools/perf/util/callchain.h              |   1 +
 tools/perf/util/evsel.c                  |  21 +++-
 tools/perf/util/evsel.h                  |   4 +
 tools/perf/util/machine.c                | 174 ++++++++++++++++++++++---------
 tools/perf/util/session.c                |  64 ++++++++++--
 9 files changed, 229 insertions(+), 61 deletions(-)

-- 
1.8.3.2




Thread overview: 20+ messages
2014-12-02 15:06 [PATCH V5 0/3] perf tool: Haswell LBR call stack support (user) kan.liang
2014-12-02 15:06 ` [PATCH V5 1/3] perf tools: enable LBR call stack support kan.liang
2014-12-12  8:17   ` [tip:perf/urgent] perf callchain: Fixup parameter handling error message tip-bot for Kan Liang
2014-12-02 15:06 ` [PATCH V5 2/3] perf tool: Move cpumode resolve code to add_callchain_ip kan.liang
2014-12-12  8:18   ` [tip:perf/urgent] perf callchain: " tip-bot for Kan Liang
2014-12-02 15:06 ` [PATCH V5 3/3] perf tools: Construct LBR call chain kan.liang
2014-12-04 14:23 ` [PATCH V5 0/3] perf tool: Haswell LBR call stack support (user) Jiri Olsa
2014-12-04 14:49   ` Liang, Kan
2014-12-04 15:51     ` Arnaldo Carvalho de Melo
2014-12-04 16:02       ` Jiri Olsa
2014-12-04 16:18         ` Liang, Kan
2014-12-09 12:27           ` Arnaldo Carvalho de Melo
2014-12-09 12:53             ` Jiri Olsa
2014-12-09 13:11               ` Arnaldo Carvalho de Melo
2014-12-09 13:22                 ` Jiri Olsa
2014-12-09 13:27                   ` Arnaldo Carvalho de Melo
2014-12-09 13:33                     ` Jiri Olsa
2015-01-05 13:57     ` Peter Zijlstra
2015-01-05 15:58       ` Liang, Kan
2014-12-11 22:21 ` Jiri Olsa
