linux-perf-users.vger.kernel.org archive mirror
* Announcing simple-pt -- a simple Processor Trace implementation for Linux
@ 2015-08-17  4:31 Andi Kleen
From: Andi Kleen @ 2015-08-17  4:31 UTC
  To: linux-kernel, linux-perf-users


Modern Intel Core CPUs (5th and 6th generation) have an Intel Processor Trace (PT) feature
that traces branch execution with low overhead. This is useful for performance analysis and debugging.

simple-pt is a simple standalone driver and decoder tool to implement PT on Linux.

Starting with version 4.1, Linux has an integrated PT implementation in perf
(see https://lwn.net/Articles/648154/).
simple-pt is an alternative implementation. It has many disadvantages compared to the perf PT
implementation, such as:
- needs to run as root
- no long term tracing or sampling with interrupts
- no support for interactive debugging (use gdb 7.10 on perf for that)
- no support for histograms
- somewhat experimental
- not as well supported as perf

On the positive side, simple-pt:
- is simple
- is standalone: no kernel changes are needed, and it could be ported to older kernels or other operating systems
- is easy to modify and experiment with
- has a more ftrace-like decoding tool
- supports kprobes-based triggers
- has a modular “unix style” design with simple tools that do only one thing each
- is BSD licensed

Example output:


        % sptcmd  -c tcall taskset -c 0 ./tcall
        cpu   0 offset 1027688,  1003 KB, writing to ptout.0
        ...
        Wrote sideband to ptout.sideband
        % sptdecode --sideband ptout.sideband --pt ptout.0 | less
        TIME      DELTA  INSNs   OPERATION
        frequency 32
        0        [+0]     [+   1] _dl_aux_init+436
                          [+   6] __libc_start_main+455 -> _dl_discover_osversion
        ...
                          [+  13] __libc_start_main+446 -> main
                          [+   9]     main+22 -> f1
                          [+   4]             f1+9 -> f2
                          [+   2]             f1+19 -> f2
                          [+   5]     main+22 -> f1
                          [+   4]             f1+9 -> f2
                          [+   2]             f1+19 -> f2
                          [+   5]     main+22 -> f1
        ...

Available from https://github.com/andikleen/simple-pt

-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: Announcing simple-pt -- a simple Processor Trace implementation for Linux
@ 2015-08-17 13:09 ` Frederic Weisbecker
From: Frederic Weisbecker @ 2015-08-17 13:09 UTC
  To: Andi Kleen; +Cc: LKML, perf group

2015-08-17 6:31 GMT+02:00 Andi Kleen <andi@firstfloor.org>:
>
> Modern Intel Core CPUs (5th and 6th generation) have an Intel Processor Trace (PT) feature
> that traces branch execution with low overhead. This is useful for performance analysis and debugging.
>
> simple-pt is a simple standalone driver and decoder tool to implement PT on Linux.
>
> Starting with version 4.1, Linux has an integrated PT implementation in perf
> (see https://lwn.net/Articles/648154/).
> simple-pt is an alternative implementation. It has many disadvantages compared to the perf PT
> implementation, such as:
> - needs to run as root
> - no long term tracing or sampling with interrupts
> - no support for interactive debugging (use gdb 7.10 on perf for that)
> - no support for histograms
> - somewhat experimental
> - not as well supported as perf
>
> On the positive side simple-pt is:
> - simple
> - standalone. No kernel changes needed. Could be ported to older kernels or other operating systems
> - easy to modify and experiment with
> - more ftrace like decoding tool
> - support for kprobes based triggers
> - modular “unix style” design with simple tools that do only one thing each
> - BSD licensed
>
> Example output:
>
>
>         % sptcmd  -c tcall taskset -c 0 ./tcall
>         cpu   0 offset 1027688,  1003 KB, writing to ptout.0
>         ...
>         Wrote sideband to ptout.sideband
>         % sptdecode --sideband ptout.sideband --pt ptout.0 | less
>         TIME      DELTA  INSNs   OPERATION
>         frequency 32
>         0        [+0]     [+   1] _dl_aux_init+436
>                           [+   6] __libc_start_main+455 -> _dl_discover_osversion
>         ...
>                           [+  13] __libc_start_main+446 -> main
>                           [+   9]     main+22 -> f1
>                           [+   4]             f1+9 -> f2
>                           [+   2]             f1+19 -> f2
>                           [+   5]     main+22 -> f1
>                           [+   4]             f1+9 -> f2
>                           [+   2]             f1+19 -> f2
>                           [+   5]     main+22 -> f1

Nice. So I guess +x is the address offset. How hard would it be to
translate to file lines?

Thanks.

>         ...
>
> Available from https://github.com/andikleen/simple-pt
>
> --
> ak@linux.intel.com -- Speaking for myself only.


* Re: Announcing simple-pt -- a simple Processor Trace implementation for Linux
@ 2015-08-17 18:21   ` Andi Kleen
From: Andi Kleen @ 2015-08-17 18:21 UTC
  To: Frederic Weisbecker; +Cc: Andi Kleen, LKML, perf group

> >         % sptdecode --sideband ptout.sideband --pt ptout.0 | less
> >         TIME      DELTA  INSNs   OPERATION
> >         frequency 32
> >         0        [+0]     [+   1] _dl_aux_init+436
> >                           [+   6] __libc_start_main+455 -> _dl_discover_osversion
> >         ...
> >                           [+  13] __libc_start_main+446 -> main
> >                           [+   9]     main+22 -> f1
> >                           [+   4]             f1+9 -> f2
> >                           [+   2]             f1+19 -> f2
> >                           [+   5]     main+22 -> f1
> >                           [+   4]             f1+9 -> f2
> >                           [+   2]             f1+19 -> f2
> >                           [+   5]     main+22 -> f1
> 
> Nice. So I guess +x is the address offset. How hard would it be to
> translate to file lines?

Yes, it's the address offset. Translating to source lines wouldn't be too hard;
it just needs to be implemented with a DWARF reader.
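
A rough sketch of the first half of such a translation, in Python: it turns
the symbol+offset locations in the decoder output into absolute addresses
using a symbol table in `nm` output format. Nothing here is part of simple-pt.
The helper names parse_nm and call_sites are made up for illustration, and the
offsets are assumed to be decimal byte offsets from the start of the symbol,
as the sample output suggests.

```python
import re

# Matches a "symbol+offset" location such as "main+22".
# Assumption: the offset is a decimal byte offset from the symbol start.
LOCATION = re.compile(r"(?P<sym>[A-Za-z_.$][\w.$@]*)\+(?P<off>\d+)")


def parse_nm(nm_text):
    """Build {symbol: address} from `nm` output lines such as
    '0000000000401136 T main'. Lines without an address (undefined
    symbols) have only two fields and are skipped."""
    table = {}
    for line in nm_text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            addr, _kind, name = parts
            table[name] = int(addr, 16)
    return table


def call_sites(trace_text, symbols):
    """Yield (location, absolute address) for the first symbol+offset on
    each trace line whose symbol is present in the table."""
    for line in trace_text.splitlines():
        m = LOCATION.search(line)
        if m and m.group("sym") in symbols:
            yield m.group(0), symbols[m.group("sym")] + int(m.group("off"))
```

Each resulting address could then be passed to an existing DWARF-based tool,
for example `addr2line -e <binary> <address>`, to obtain the file and line.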

BTW the PT trace has all branches, not just the calls, but it's more
difficult to display them all in a nice way.

-Andi
