linux-kernel.vger.kernel.org archive mirror
* [GIT PULL] perf tools updates
@ 2010-04-24  2:05 Frederic Weisbecker
  2010-04-24  2:05 ` [PATCH 1/9] perf lock: Fix state machine to recognize lock sequence Frederic Weisbecker
                   ` (10 more replies)
  0 siblings, 11 replies; 39+ messages in thread
From: Frederic Weisbecker @ 2010-04-24  2:05 UTC
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, William Cohen, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Paul Mackerras, Hitoshi Mitake,
	Masami Hiramatsu, Tom Zanussi, Arjan van de Ven, Pekka Enberg,
	Li Zefan, Stephane Eranian, Jens Axboe, Jason Baron,
	Xiao Guangrong

Ingo,

Please pull the perf/core branch that can be found at:

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
	perf/core

Thanks,
	Frederic
---

Frederic Weisbecker (6):
      perf: Generalize perf lock's sample event reordering to the session layer
      perf: Use generic sample reordering in perf sched
      perf: Use generic sample reordering in perf kmem
      perf: Use generic sample reordering in perf trace
      perf: Use generic sample reordering in perf timechart
      perf: Add a perf trace option to check samples ordering reliability

Hitoshi Mitake (1):
      perf lock: Fix state machine to recognize lock sequence

Stephane Eranian (1):
      perf: Fix initialization bug in parse_single_tracepoint_event()

William Cohen (1):
      perf: Some perf-kvm documentation edits


 tools/perf/Documentation/perf-kvm.txt |    9 +-
 tools/perf/builtin-kmem.c             |    6 +-
 tools/perf/builtin-lock.c             |  595 ++++++++++++++++++++-------------
 tools/perf/builtin-sched.c            |    8 +-
 tools/perf/builtin-timechart.c        |  112 +------
 tools/perf/builtin-trace.c            |   13 +
 tools/perf/util/parse-events.c        |   13 +-
 tools/perf/util/session.c             |  179 ++++++++++-
 tools/perf/util/session.h             |   10 +
 9 files changed, 583 insertions(+), 362 deletions(-)

* [GIT PULL] perf tools updates
@ 2011-05-22  1:45 Frederic Weisbecker
  2011-05-22  8:49 ` Ingo Molnar
  0 siblings, 1 reply; 39+ messages in thread
From: Frederic Weisbecker @ 2011-05-22  1:45 UTC
  To: Ingo Molnar, Arnaldo Carvalho de Melo
  Cc: LKML, Frederic Weisbecker, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Stephane Eranian, Linus Torvalds

Ingo, Arnaldo,

Please pull the perf/core branch that can be found at:

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
	perf/core

Thanks,
	Frederic
---

Frederic Weisbecker (6):
      perf tools: Check we are able to read the event size on mmap
      perf tools: Remove junk code in mmap size handling
      perf tools: Move evlist sample helpers to evlist area
      perf tools: Pre-check sample size before parsing
      perf tools: Robustify dynamic sample content fetch
      perf tools: Propagate event parse error handling


 tools/perf/builtin-test.c |    9 ++++++++-
 tools/perf/builtin-top.c  |    7 ++++++-
 tools/perf/util/event.c   |   16 ++++++++++++++++
 tools/perf/util/event.h   |   12 +++++++++++-
 tools/perf/util/evlist.c  |   31 +++++++++++++++++++++++++++++++
 tools/perf/util/evlist.h  |    3 +++
 tools/perf/util/evsel.c   |   32 +++++++++++++++++++++++++++++++-
 tools/perf/util/header.c  |   31 -------------------------------
 tools/perf/util/header.h  |    2 --
 tools/perf/util/python.c  |   13 ++++++++++---
 tools/perf/util/session.c |   25 ++++++++++++++++++-------
 tools/perf/util/session.h |    2 ++
 12 files changed, 136 insertions(+), 47 deletions(-)

* [GIT PULL] perf tools updates
@ 2011-06-29 23:34 Frederic Weisbecker
  2011-07-01 10:01 ` Ingo Molnar
  0 siblings, 1 reply; 39+ messages in thread
From: Frederic Weisbecker @ 2011-06-29 23:34 UTC
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Stephane Eranian, David Ahern, Sam Liao

Ingo,

Please pull the perf/core branch that can be found at:

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
	perf/core

It adds inverted callchain support and lets one use
parent filtering with parent sorting at the same time, because
it appears to me that inverted callchains sorted by filtered
parents are pretty useful, and extendable to more cool things.

Anyway, inverted callchains combined with different sort keys
can in general provide some interesting analysis flavours.

Having played with it a bit, it seems to me the callee point
of view (traditional -g callchains) is better suited to
finding the precise zoomed-in places where most cpu time is
spent: spotting contention points, etc...

OTOH, the caller point of view (-G, inverted callchain) is
for zoomed-out observation, of course. It's more suited to
global profiling, for example to get a big overview of where
the hot bulk of a program is executing.
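
To make the two points of view concrete, here is a minimal sketch in
Python. It is not perf's implementation: the `build_tree` helper, the
nested-dict layout and the sample chains are all made up for
illustration, and it assumes each sample stores its callchain
innermost frame first.

```python
def build_tree(chains, inverted=False):
    """Merge callchains into a nested dict: symbol -> [hits, children].

    Assumption: each chain is recorded innermost frame first (callee
    order). With inverted=True the chain is walked from the outermost
    caller instead, which is the idea behind an inverted call graph.
    """
    root = {}
    for chain in chains:
        frames = list(reversed(chain)) if inverted else list(chain)
        level = root
        for sym in frames:
            node = level.setdefault(sym, [0, {}])
            node[0] += 1          # one more sample passes through this node
            level = node[1]       # descend into its children
    return root


# Hypothetical samples, innermost frame first.
samples = [
    ["dup_mm", "copy_process", "do_fork", "main"],
    ["copy_process", "do_fork", "main"],
    ["kmem_cache_free", "free_task", "main"],
]

callee_view = build_tree(samples)                 # roots: where samples hit
caller_view = build_tree(samples, inverted=True)  # single root: main
```

In the callee view every distinct hot function becomes its own root,
which suits zoomed-in analysis. In the caller view all three samples
merge under `main`, so the tree reads as a top-down overview of the
program.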

Examples:

- look at the hottest call tree of a program.

	./perf report -G -s pid --stdio
	
     5.73%               perf:11933
            |
            --- __libc_start_main
               |          
               |--99.18%-- main
               |          run_builtin
               |          cmd_bench
               |          |          
               |          |--89.68%-- bench_sched_messaging
               |          |          |          
               |          |          |--96.11%-- create_worker
               |          |          |          |          
               |          |          |          |--95.10%-- __libc_fork
               |          |          |          |          |          
               |          |          |          |          |--93.99%-- stub_clone
               |          |          |          |          |          sys_clone
               |          |          |          |          |          do_fork
               |          |          |          |          |          |          
               |          |          |          |          |          |--99.09%-- copy_process
               |          |          |          |          |          |          |          
               |          |          |          |          |          |          |--91.62%-- dup_mm

- look at where kernel threads spend their time

	perf report -G -p kernel_thread -s parent --stdio
	
# Overhead  Parent symbol
# ........  .............
#
     0.07%  kernel_thread_helper
            |
            --- kernel_thread_helper
                kthread
               |          
               |--50.00%-- kjournald2
               |          jbd2_journal_commit_transaction
               |          journal_submit_commit_record
               |          submit_bh
               |          submit_bio
               |          generic_make_request
               |          __make_request
               |          __blk_run_queue
               |          scsi_request_fn
               |          scsi_dispatch_cmd
               |          ata_scsi_queuecmd
               |          ata_scsi_translate
               |          ata_qc_issue
               |          ata_bmdma_qc_issue
               |          ata_sff_qc_issue
               |          ata_sff_tf_load
               |          ata_sff_check_status
               |          ioread8
               |          
                --50.00%-- rcu_kthread
                          rcu_process_callbacks
                          delayed_put_task_struct
                          __put_task_struct
                          free_task
                          free_thread_info
                          free_thread_xstate
                          kmem_cache_free
                          __slab_free
                          add_partial
                          _raw_spin_lock
                          lock_acquire
                          
etc...
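
The combination of parent filtering and parent sorting used in the
second example can be sketched roughly as follows. This is an
illustration only, not perf's code: `overhead_by_parent` is a
hypothetical helper, and it assumes that a sample's "parent" is the
first symbol in its callchain matching the regex, that samples without
a match are dropped, and that percentages are taken over all samples.

```python
import re
from collections import Counter


def overhead_by_parent(chains, parent_regex):
    """Group samples by the first callchain symbol matching parent_regex.

    Assumptions (not taken from perf's source): samples with no matching
    symbol are ignored, and overhead is a percentage of all samples.
    """
    pattern = re.compile(parent_regex)
    hits = Counter()
    for chain in chains:
        parent = next((sym for sym in chain if pattern.search(sym)), None)
        if parent is not None:
            hits[parent] += 1
    total = len(chains)
    return {sym: 100.0 * n / total for sym, n in hits.items()}


# Hypothetical samples; only the first one runs under a kernel thread.
samples = [
    ["kernel_thread_helper", "kthread", "kjournald2"],
    ["__libc_start_main", "main", "cmd_bench"],
    ["__libc_start_main", "main", "run_builtin"],
    ["__libc_start_main", "main", "cmd_bench"],
]

result = overhead_by_parent(samples, "kernel_thread")
```

Here `result` holds one entry, for `kernel_thread_helper`, because the
three user-space samples never match the parent pattern.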

We could extend that by cutting the callchains at some point.
For example, stop a callchain at a given dso and you can profile
which of its exported functions is called most.
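
A rough sketch of that idea, under stated assumptions: nothing like
this exists in the series, `entry_points` is a hypothetical helper,
frames are modelled as (symbol, dso) pairs in caller-first order, and
sample counts only stand in for how often a function is called.

```python
from collections import Counter


def entry_points(chains, dso):
    """Count, per sample, the first frame that enters the given dso.

    Each chain is a list of (symbol, dso) pairs, outermost caller first.
    The chain is cut at the first frame inside `dso`, so deeper internal
    helpers of that dso are never counted.
    """
    counts = Counter()
    for chain in chains:
        for sym, frame_dso in chain:
            if frame_dso == dso:
                counts[sym] += 1
                break  # cut the callchain here
    return counts


# Hypothetical samples, caller first.
samples = [
    [("main", "app"), ("malloc", "libc.so"), ("_int_malloc", "libc.so")],
    [("main", "app"), ("work", "app"), ("malloc", "libc.so")],
    [("main", "app"), ("memcpy", "libc.so")],
    [("main", "app"), ("work", "app")],
]

libc_entries = entry_points(samples, "libc.so")
```

`libc_entries.most_common(1)` then names the entry point seen in the
most samples. Samples that never reach the dso contribute nothing.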

Anyway, this has some nice potential.


Thanks,
	Frederic
---

Frederic Weisbecker (5):
      perf tools: Make sort operations static
      perf tools: Remove sort print helpers declarations
      perf tools: Don't display ignored entries on stdio ui
      perf tools: Allow sort dimensions to be registered more than once
      perf tools: Only display parent field if explictly sorted

Sam Liao (1):
      perf tools: Add inverted call graph report support.


 tools/perf/Documentation/perf-report.txt |   15 ++-
 tools/perf/builtin-report.c              |   42 +++++-
 tools/perf/util/callchain.h              |    6 +
 tools/perf/util/hist.c                   |    6 +-
 tools/perf/util/session.c                |    7 +-
 tools/perf/util/sort.c                   |  223 ++++++++++++++----------------
 tools/perf/util/sort.h                   |   14 --
 7 files changed, 169 insertions(+), 144 deletions(-)


Thread overview: 39+ messages
2010-04-24  2:05 [GIT PULL] perf tools updates Frederic Weisbecker
2010-04-24  2:05 ` [PATCH 1/9] perf lock: Fix state machine to recognize lock sequence Frederic Weisbecker
2010-04-24 10:43   ` [PATCH] perf lock: add "info" subcommand for dumping misc information Hitoshi Mitake
2010-04-24 10:46     ` Hitoshi Mitake
2010-04-24 10:46   ` [PATCH v2] " Hitoshi Mitake
2010-04-24 13:41     ` Frederic Weisbecker
2010-04-30 18:49     ` Frederic Weisbecker
2010-05-03  5:11       ` Hitoshi Mitake
2010-05-03  5:12         ` [PATCH v3] " Hitoshi Mitake
2010-05-05 21:10           ` Frederic Weisbecker
2010-05-06  9:31             ` Hitoshi Mitake
2010-05-06  9:32               ` [PATCH] perf lock: track only specified threads Hitoshi Mitake
2010-05-07  0:49                 ` Frederic Weisbecker
2010-05-08  8:02                   ` Hitoshi Mitake
2010-05-08  8:10                     ` [PATCH] perf lock: Drop "-a" option from set of default arguments to cmd_record() Hitoshi Mitake
2010-05-08 16:14                       ` Frederic Weisbecker
2010-05-09 14:53                         ` Hitoshi Mitake
2010-05-11  6:48                           ` Frederic Weisbecker
2010-05-12 10:23                             ` Hitoshi Mitake
2010-05-12 15:51                               ` Frederic Weisbecker
2010-05-15  8:54                                 ` Hitoshi Mitake
2010-05-10  7:19           ` [tip:perf/core] perf lock: Add "info" subcommand for dumping misc information tip-bot for Hitoshi Mitake
2010-04-24  2:05 ` [PATCH 2/9] perf: Fix initialization bug in parse_single_tracepoint_event() Frederic Weisbecker
2010-04-24  2:05 ` [PATCH 3/9] perf: Generalize perf lock's sample event reordering to the session layer Frederic Weisbecker
2010-04-24  2:05 ` [PATCH 4/9] perf: Use generic sample reordering in perf sched Frederic Weisbecker
2010-04-24  2:05 ` [PATCH 5/9] perf: Use generic sample reordering in perf kmem Frederic Weisbecker
2010-04-24  2:05 ` [PATCH 6/9] perf: Use generic sample reordering in perf trace Frederic Weisbecker
2010-04-24  2:05 ` [PATCH 7/9] perf: Use generic sample reordering in perf timechart Frederic Weisbecker
2010-04-24  2:05 ` [PATCH 8/9] perf: Add a perf trace option to check samples ordering reliability Frederic Weisbecker
2010-04-24 16:13   ` Masami Hiramatsu
2010-04-25 18:08     ` Frederic Weisbecker
2010-04-24  2:05 ` [PATCH 9/9] perf: Some perf-kvm documentation edits Frederic Weisbecker
2010-04-24  2:27 ` [GIT PULL] perf tools updates Frederic Weisbecker
2010-04-27  9:15 ` Ingo Molnar
  -- strict thread matches above, loose matches on Subject: below --
2011-05-22  1:45 Frederic Weisbecker
2011-05-22  8:49 ` Ingo Molnar
2011-05-22 12:07   ` Frederic Weisbecker
2011-06-29 23:34 Frederic Weisbecker
2011-07-01 10:01 ` Ingo Molnar
