* Announcing simple-pt -- a simple Processor Trace implementation for Linux
From: Andi Kleen @ 2015-08-17 4:31 UTC
To: linux-kernel, linux-perf-users
Modern Intel Core CPUs (5th and 6th generation) have an Intel Processor Trace (PT) feature
to trace branch execution with low overhead. This is useful for performance analysis and debugging.
simple-pt is a simple standalone driver and decoder tool to implement PT on Linux.
Starting with version 4.1, Linux has an integrated PT implementation in perf
(see https://lwn.net/Articles/648154/).
simple-pt is an alternative implementation. It has several disadvantages compared to the perf PT
implementation, such as:
- needs to run as root
- no long term tracing or sampling with interrupts
- no support for interactive debugging (use gdb 7.10 on perf for that)
- no support for histograms
- somewhat experimental
- not as well supported as perf
On the positive side simple-pt is:
- simple
- standalone. No kernel changes needed. Could be ported to older kernels or other operating systems
- easy to modify and experiment with
- a more ftrace-like decoding tool
- support for kprobes-based triggers
- modular “unix style” design with simple tools that do only one thing each
- BSD licensed
Example output:
% sptcmd -c tcall taskset -c 0 ./tcall
cpu 0 offset 1027688, 1003 KB, writing to ptout.0
...
Wrote sideband to ptout.sideband
% sptdecode --sideband ptout.sideband --pt ptout.0 | less
TIME DELTA INSNs OPERATION
frequency 32
0 [+0] [+ 1] _dl_aux_init+436
[+ 6] __libc_start_main+455 -> _dl_discover_osversion
...
[+ 13] __libc_start_main+446 -> main
[+ 9] main+22 -> f1
[+ 4] f1+9 -> f2
[+ 2] f1+19 -> f2
[+ 5] main+22 -> f1
[+ 4] f1+9 -> f2
[+ 2] f1+19 -> f2
[+ 5] main+22 -> f1
...
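Since simple-pt has no histogram support, the text output of sptdecode can be
post-processed with a small script. The following is only a sketch, not part of
simple-pt: it assumes the line layout shown in the example above
("[+ N] caller+offset -> callee"), and the names EDGE_RE and call_edges are
made up for illustration.

```python
import re
from collections import Counter

# Assumed layout, taken from the example output above:
#   [+   9]  main+22 -> f1
# The bracketed number is the INSNs delta, followed by "caller+offset -> callee".
EDGE_RE = re.compile(r"\[\+\s*(\d+)\]\s+(\S+?)(?:\+(\d+))?\s+->\s+(\S+)\s*$")


def call_edges(lines):
    """Yield (caller, callee, insns) for every line that shows a 'src -> dst' edge."""
    for line in lines:
        m = EDGE_RE.search(line)
        if m:
            insns, caller, _offset, callee = m.groups()
            yield caller, callee, int(insns)


# A few lines copied from the example output in this mail.
sample = """\
[+ 13] __libc_start_main+446 -> main
[+ 9] main+22 -> f1
[+ 4] f1+9 -> f2
[+ 2] f1+19 -> f2
[+ 5] main+22 -> f1
""".splitlines()

# Count how often each caller -> callee edge appears in the decoded trace.
histogram = Counter((caller, callee) for caller, callee, _ in call_edges(sample))
for (caller, callee), count in histogram.most_common():
    print(f"{count:4d} {caller} -> {callee}")
```

In practice the lines would come from piping sptdecode into the script instead
of the hard-coded sample.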
Available from https://github.com/andikleen/simple-pt
--
ak@linux.intel.com -- Speaking for myself only.
* Re: Announcing simple-pt -- a simple Processor Trace implementation for Linux
From: Frederic Weisbecker @ 2015-08-17 13:09 UTC
To: Andi Kleen; +Cc: LKML, perf group
2015-08-17 6:31 GMT+02:00 Andi Kleen <andi@firstfloor.org>:
>
> Modern Intel Core CPUs (5th and 6th generation) have an Intel Processor Trace (PT) feature
> to trace branch execution with low overhead. This is useful for performance analysis and debugging.
>
> simple-pt is a simple standalone driver and decoder tool to implement PT on Linux.
>
> Starting with version 4.1, Linux has an integrated PT implementation in perf
> (see https://lwn.net/Articles/648154/).
> simple-pt is an alternative implementation. It has several disadvantages compared to the perf PT
> implementation, such as:
> - needs to run as root
> - no long term tracing or sampling with interrupts
> - no support for interactive debugging (use gdb 7.10 on perf for that)
> - no support for histograms
> - somewhat experimental
> - not as well supported as perf
>
> On the positive side simple-pt is:
> - simple
> - standalone. No kernel changes needed. Could be ported to older kernels or other operating systems
> - easy to modify and experiment with
> - a more ftrace-like decoding tool
> - support for kprobes-based triggers
> - modular “unix style” design with simple tools that do only one thing each
> - BSD licensed
>
> Example output:
>
>
> % sptcmd -c tcall taskset -c 0 ./tcall
> cpu 0 offset 1027688, 1003 KB, writing to ptout.0
> ...
> Wrote sideband to ptout.sideband
> % sptdecode --sideband ptout.sideband --pt ptout.0 | less
> TIME DELTA INSNs OPERATION
> frequency 32
> 0 [+0] [+ 1] _dl_aux_init+436
> [+ 6] __libc_start_main+455 -> _dl_discover_osversion
> ...
> [+ 13] __libc_start_main+446 -> main
> [+ 9] main+22 -> f1
> [+ 4] f1+9 -> f2
> [+ 2] f1+19 -> f2
> [+ 5] main+22 -> f1
> [+ 4] f1+9 -> f2
> [+ 2] f1+19 -> f2
> [+ 5] main+22 -> f1
Nice. So I guess +x is the address offset. How hard would it be to
translate to file lines?
Thanks.
> ...
>
> Available from https://github.com/andikleen/simple-pt
>
> --
> ak@linux.intel.com -- Speaking for myself only.
* Re: Announcing simple-pt -- a simple Processor Trace implementation for Linux
From: Andi Kleen @ 2015-08-17 18:21 UTC
To: Frederic Weisbecker; +Cc: Andi Kleen, LKML, perf group
> > % sptdecode --sideband ptout.sideband --pt ptout.0 | less
> > TIME DELTA INSNs OPERATION
> > frequency 32
> > 0 [+0] [+ 1] _dl_aux_init+436
> > [+ 6] __libc_start_main+455 -> _dl_discover_osversion
> > ...
> > [+ 13] __libc_start_main+446 -> main
> > [+ 9] main+22 -> f1
> > [+ 4] f1+9 -> f2
> > [+ 2] f1+19 -> f2
> > [+ 5] main+22 -> f1
> > [+ 4] f1+9 -> f2
> > [+ 2] f1+19 -> f2
> > [+ 5] main+22 -> f1
>
> Nice. So I guess +x is the address offset. How hard would it be to
> translate to file lines?
Yes, it's the address offset. Translating to lines wouldn't be too hard;
it just needs to be implemented with a DWARF reader.
BTW, the PT trace has all branches, not just the calls, but it's more
difficult to display them all in a nice way.
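As a rough illustration of the first step towards line translation, a
"symbol+offset" location can be turned back into an absolute address, which a
DWARF-aware tool (addr2line, for example) could then map to a file and line.
This is only a sketch under assumptions: the offsets are taken to be decimal,
the symbol start addresses below are invented placeholders rather than values
from a real binary, and parse_location and to_address are hypothetical helper
names.

```python
import re

# "main+22" -> symbol "main", offset 22. A bare "f1" means offset 0.
# Assumption: the decoder prints offsets in decimal.
LOC_RE = re.compile(r"^(?P<symbol>[^+\s]+)(?:\+(?P<offset>\d+))?$")


def parse_location(text):
    """Split a 'symbol+offset' string into (symbol, offset)."""
    m = LOC_RE.match(text.strip())
    if m is None:
        raise ValueError(f"not a symbol+offset location: {text!r}")
    return m.group("symbol"), int(m.group("offset") or 0)


def to_address(location, symtab):
    """Return the absolute address for a location, given symbol start addresses."""
    symbol, offset = parse_location(location)
    return symtab[symbol] + offset


# Placeholder symbol table; real values would come from the binary's symbol table.
symtab = {"main": 0x400560, "f1": 0x400540, "f2": 0x400530}

for loc in ("main+22", "f1+9", "f1+19", "f2"):
    print(f"{loc:10s} -> {to_address(loc, symtab):#x}")
```

The address-to-line lookup itself is left out here, since that is the part
that needs the DWARF reader.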
-Andi