* [PATCH v5 00/20] perf: Add infrastructure and support for Intel PT
@ 2014-10-13 13:45 Alexander Shishkin
From: Alexander Shishkin @ 2014-10-13 13:45 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, Robert Richter, Frederic Weisbecker,
	Mike Galbraith, Paul Mackerras, Stephane Eranian, Andi Kleen,
	kan.liang, adrian.hunter, acme, Alexander Shishkin

Hi Peter and all,

[full description below the changelog]

This version of the patchset hopefully addresses comments from the
previous (v4) version. Changelog messages should be more descriptive,
as should the comments in the code. Functional changes:

  * events are no longer disabled on munmap(); this was replaced with
    refcounting,
  * explicit sfence is added to the intel_pt driver to make sure
    that data stores are globally visible before the aux_head is
    updated,
  * dropped the unnecessary set_output for inherited events in
    favor of using parent's ring buffer,
  * intel_pt needs to handle #GP from enabling WRMSR, so that a
    privileged user can set arbitrary RTIT_CTL bits in the range
    that is reserved for packet enables (see PT_BYPASS_MASK and
    comments around pt_config()) without potentially killing the
    machine.

Interface changes:

  * replaced 'u8 truncated' in the PERF_RECORD_AUX with a 'u64 flags',
    dropped redundant id/stream_id,
  * in overwrite mode, always provide offset and size even if the driver
    cannot tell where the snapshot begins or whether its beginning was
    overwritten by older data.

This patchset adds support for the Intel Processor Trace (PT) extension [1]
of the Intel Architecture, which allows capturing information about
software execution flow, to the perf kernel infrastructure.

The single most notable thing is that while PT outputs trace data in a
compressed binary format, it will still generate hundreds of megabytes
of trace data per second per core. Decoding this binary stream takes
2-3 orders of magnitude more CPU time than generating it. These
considerations make it impossible to carry out decoding in kernel
space. Therefore, the trace data is exported to userspace as a
zero-copy mapping that userspace can collect and store for later
decoding. To that end, this patchset extends the perf ring buffer with
an "AUX space", which is allocated for hardware blocks such as PT to
export their trace data with minimal overhead. This space can be
configured via the buffer's user page and mmapped from the same file
descriptor with a given offset. Data can then be collected from it
by reading the aux_head (write) pointer from the user page and updating
the aux_tail (read) pointer, similarly to data_{head,tail} of the
traditional perf buffer. There is an API between the perf core and pmu
drivers that wish to make use of this AUX space to export their data.
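
To illustrate the consumer side of the aux_{head,tail} protocol described
above, here is a minimal userspace sketch. It is a model only: struct
aux_page_model and aux_drain() are hypothetical names, not part of this
series, and the sketch assumes that both pointers are free-running byte
counters and that the AUX area size is a power of two. A real consumer
would read these fields from the mmapped user page instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the aux_{head,tail} fields of the user page. */
struct aux_page_model {
	_Atomic uint64_t aux_head;	/* advanced by the producer (tracer) */
	_Atomic uint64_t aux_tail;	/* advanced by the consumer (userspace) */
};

/*
 * Copy up to out_len pending bytes from the AUX area into out, then
 * publish the new read position through aux_tail.
 * Assumes aux_size is a power of two. Returns the number of bytes copied.
 */
static size_t aux_drain(struct aux_page_model *pg, const uint8_t *aux,
			uint64_t aux_size, uint8_t *out, size_t out_len)
{
	assert(aux_size && (aux_size & (aux_size - 1)) == 0);

	/* Acquire: trace bytes written before the head update are visible. */
	uint64_t head = atomic_load_explicit(&pg->aux_head, memory_order_acquire);
	uint64_t tail = atomic_load_explicit(&pg->aux_tail, memory_order_relaxed);
	uint64_t avail = head - tail;
	size_t n = avail < out_len ? (size_t)avail : out_len;
	uint64_t off = tail & (aux_size - 1);
	/* Bytes that fit before the end of the buffer; the rest wraps around. */
	size_t first = n < aux_size - off ? n : (size_t)(aux_size - off);

	memcpy(out, aux + off, first);
	memcpy(out + first, aux, n - first);

	/* Release: our reads complete before the producer may reuse the space. */
	atomic_store_explicit(&pg->aux_tail, tail + n, memory_order_release);
	return n;
}
```

The acquire/release pair plays the role of the barriers that a real
reader needs around the head read and the tail update.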

For tracing blocks that don't support hardware scatter-gather tables,
we provide high-order physically contiguous allocations to minimize
the overhead needed for software double buffering and PMI pressure.

This way we get a normal perf data stream that provides sideband
information that is required to decode the trace data, such as MMAPs,
COMMs etc, plus the actual trace in its own logical space.

If the trace buffer is mapped writable, the driver will stop tracing
when it fills up (aux_head approaches aux_tail), until data is read,
the aux_tail pointer is moved forward and an ioctl() is issued to
re-enable tracing. If the trace buffer is mapped read only, the
tracing will continue, overwriting older data, so that the buffer
always contains the most recent data. Tracing can be stopped with an
ioctl() and restarted once the data is collected.
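
The difference between the two mapping modes can be modelled as a
question of how many bytes the tracer may write before it has to stop.
The sketch below is an illustration only: aux_writable_bytes() is a
hypothetical helper, not a function from this series, and it assumes
free-running head and tail counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: bytes the tracer may still write.
 * overwrite == true models a read-only mapping, false a writable one.
 */
static uint64_t aux_writable_bytes(uint64_t head, uint64_t tail,
				   uint64_t size, bool overwrite)
{
	if (overwrite)
		return size;	/* older data may be overwritten at any time */

	assert(head - tail <= size);
	return size - (head - tail);	/* never run past aux_tail */
}
```

In this model a writable mapping reaches zero writable bytes once the
unread data fills the buffer, which corresponds to the point where the
driver stops tracing until aux_tail is advanced.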

Another use case is annotating samples of other perf events: setting
PERF_SAMPLE_AUX requests attr.aux_sample_size bytes of trace to be
included in each event's sample.

This patchset consists of necessary changes to the perf kernel
infrastructure, and PT and BTS pmu drivers. The tooling support is not
included in this series; however, it can be found in my GitHub tree [2].

[1] http://software.intel.com/en-us/intel-isa-extensions
[2] http://github.com/virtuoso/linux-perf/tree/intel_pt

Alexander Shishkin (19):
  perf: Add data_{offset,size} to user_page
  perf: Support high-order allocations for AUX space
  perf: Add a capability for AUX_NO_SG pmus to do software double
    buffering
  perf: Add a pmu capability for "exclusive" events
  perf: Add AUX record
  perf: Add api for pmus to write to AUX area
  perf: Support overwrite mode for AUX area
  perf: Add wakeup watermark control to AUX area
  x86: Add Intel Processor Trace (INTEL_PT) cpu feature detection
  x86: perf: Intel PT and LBR/BTS are mutually exclusive
  x86: perf: intel_pt: Intel PT PMU driver
  x86: perf: intel_bts: Add BTS PMU driver
  perf: add ITRACE_START record to indicate that tracing has started
  perf: Add api to (de-)allocate AUX buffers for kernel counters
  perf: Add a helper for looking up pmus by type
  perf: Add infrastructure for using AUX data in perf samples
  perf: Allocate ring buffers for inherited per-task kernel events
  perf: Allow AUX sampling for multiple events
  perf: Allow AUX sampling of inherited events

Peter Zijlstra (1):
  perf: Add AUX area to ring buffer for raw data streams

 arch/x86/include/asm/cpufeature.h          |   1 +
 arch/x86/include/uapi/asm/msr-index.h      |  18 +
 arch/x86/kernel/cpu/Makefile               |   1 +
 arch/x86/kernel/cpu/intel_pt.h             | 129 ++++
 arch/x86/kernel/cpu/perf_event.h           |  14 +
 arch/x86/kernel/cpu/perf_event_intel.c     |  14 +-
 arch/x86/kernel/cpu/perf_event_intel_bts.c | 518 +++++++++++++++
 arch/x86/kernel/cpu/perf_event_intel_ds.c  |  11 +-
 arch/x86/kernel/cpu/perf_event_intel_lbr.c |   9 +-
 arch/x86/kernel/cpu/perf_event_intel_pt.c  | 995 +++++++++++++++++++++++++++++
 arch/x86/kernel/cpu/scattered.c            |   1 +
 include/linux/perf_event.h                 |  56 +-
 include/uapi/linux/perf_event.h            |  73 ++-
 kernel/events/core.c                       | 534 +++++++++++++++-
 kernel/events/internal.h                   |  52 ++
 kernel/events/ring_buffer.c                | 386 ++++++++++-
 16 files changed, 2768 insertions(+), 44 deletions(-)
 create mode 100644 arch/x86/kernel/cpu/intel_pt.h
 create mode 100644 arch/x86/kernel/cpu/perf_event_intel_bts.c
 create mode 100644 arch/x86/kernel/cpu/perf_event_intel_pt.c

-- 
2.1.0



Thread overview: 65+ messages
2014-10-13 13:45 [PATCH v5 00/20] perf: Add infrastructure and support for Intel PT Alexander Shishkin
2014-10-13 13:45 ` [PATCH v5 01/20] perf: Add data_{offset,size} to user_page Alexander Shishkin
2014-10-13 13:45 ` [PATCH v5 02/20] perf: Add AUX area to ring buffer for raw data streams Alexander Shishkin
2014-10-22 12:35   ` Peter Zijlstra
2014-10-23  3:05     ` Frederic Weisbecker
2014-10-13 13:45 ` [PATCH v5 03/20] perf: Support high-order allocations for AUX space Alexander Shishkin
2014-10-13 13:45 ` [PATCH v5 04/20] perf: Add a capability for AUX_NO_SG pmus to do software double buffering Alexander Shishkin
2014-10-13 13:45 ` [PATCH v5 05/20] perf: Add a pmu capability for "exclusive" events Alexander Shishkin
2014-10-13 13:45 ` [PATCH v5 06/20] perf: Add AUX record Alexander Shishkin
2014-10-22 13:26   ` Peter Zijlstra
2014-10-22 14:18     ` Alexander Shishkin
2014-10-22 15:07       ` Peter Zijlstra
2014-10-13 13:45 ` [PATCH v5 07/20] perf: Add api for pmus to write to AUX area Alexander Shishkin
2014-10-22 14:02   ` Peter Zijlstra
2014-10-22 14:14     ` Alexander Shishkin
2014-10-13 13:45 ` [PATCH v5 08/20] perf: Support overwrite mode for " Alexander Shishkin
2014-10-13 13:45 ` [PATCH v5 09/20] perf: Add wakeup watermark control to " Alexander Shishkin
2014-10-13 13:45 ` [PATCH v5 10/20] x86: Add Intel Processor Trace (INTEL_PT) cpu feature detection Alexander Shishkin
2014-10-13 13:45 ` [PATCH v5 11/20] x86: perf: Intel PT and LBR/BTS are mutually exclusive Alexander Shishkin
2014-10-22 14:15   ` Peter Zijlstra
2014-10-24  7:47     ` Alexander Shishkin
2014-10-13 13:45 ` [PATCH v5 12/20] x86: perf: intel_pt: Intel PT PMU driver Alexander Shishkin
2014-10-22 14:17   ` Peter Zijlstra
2014-10-22 14:20   ` Peter Zijlstra
2014-10-24  7:49     ` Alexander Shishkin
2014-10-24 11:26       ` Peter Zijlstra
2014-10-24 12:01         ` Alexander Shishkin
2014-10-22 14:23   ` Peter Zijlstra
2014-10-22 14:27   ` Peter Zijlstra
2014-10-24  7:50     ` Alexander Shishkin
2014-10-22 14:32   ` Peter Zijlstra
2014-10-31 13:13     ` Alexander Shishkin
2014-11-04 15:57       ` Peter Zijlstra
2014-11-11 11:24         ` Alexander Shishkin
2014-11-11 13:20           ` Peter Zijlstra
2014-11-11 14:17             ` Alexander Shishkin
2014-10-22 14:34   ` Peter Zijlstra
2014-10-24  7:52     ` Alexander Shishkin
2014-10-22 14:45   ` Peter Zijlstra
2014-10-24  8:22     ` Alexander Shishkin
2014-10-24 11:51       ` Peter Zijlstra
2014-10-24 12:13         ` Alexander Shishkin
2014-10-24 13:02           ` Peter Zijlstra
2014-10-24 13:18             ` Alexander Shishkin
2014-10-24 13:48               ` Peter Zijlstra
2014-10-22 14:49   ` Peter Zijlstra
2014-10-22 15:11     ` Peter Zijlstra
2014-10-22 14:55   ` Peter Zijlstra
2014-10-24  7:59     ` Alexander Shishkin
2014-10-22 15:14   ` Peter Zijlstra
2014-10-24  7:59     ` Alexander Shishkin
2014-10-22 15:26   ` Peter Zijlstra
2014-10-13 13:45 ` [PATCH v5 13/20] x86: perf: intel_bts: Add BTS " Alexander Shishkin
2014-10-13 13:45 ` [PATCH v5 14/20] perf: add ITRACE_START record to indicate that tracing has started Alexander Shishkin
2014-10-13 13:45 ` [PATCH v5 15/20] perf: Add api to (de-)allocate AUX buffers for kernel counters Alexander Shishkin
2014-10-13 13:45 ` [PATCH v5 16/20] perf: Add a helper for looking up pmus by type Alexander Shishkin
2014-10-13 13:45 ` [PATCH v5 17/20] perf: Add infrastructure for using AUX data in perf samples Alexander Shishkin
2014-10-13 13:45 ` [PATCH v5 18/20] perf: Allocate ring buffers for inherited per-task kernel events Alexander Shishkin
2014-10-23 12:38   ` Peter Zijlstra
2014-10-24  7:44     ` Alexander Shishkin
2014-10-30  8:43       ` Peter Zijlstra
2014-10-30 10:20         ` Alexander Shishkin
2014-10-30 13:31         ` Arnaldo Carvalho de Melo
2014-10-13 13:45 ` [PATCH v5 19/20] perf: Allow AUX sampling for multiple events Alexander Shishkin
2014-10-13 13:45 ` [PATCH v5 20/20] perf: Allow AUX sampling of inherited events Alexander Shishkin
