Re: [PATCH 3/3] perf record: mmap output file

From: Ingo Molnar <mingo@kernel.org>
To: David Ahern <dsahern@gmail.com>
Cc: acme@ghostprotocols.net, linux-kernel@vger.kernel.org,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Jiri Olsa <jolsa@redhat.com>, Namhyung Kim <namhyung@kernel.org>,
	Mike Galbraith <efault@gmx.de>,
	Stephane Eranian <eranian@google.com>
Subject: Re: [PATCH 3/3] perf record: mmap output file
Date: Wed, 9 Oct 2013 07:59:58 +0200
Message-ID: <20131009055957.GA7664@gmail.com>
In-Reply-To: <1381289214-24885-4-git-send-email-dsahern@gmail.com>

* David Ahern <dsahern@gmail.com> wrote:

> When recording raw_syscalls for the entire system, e.g.,
>     perf record -e raw_syscalls:*,sched:sched_switch -a -- sleep 1
> 
> you end up with a self-reinforcing feedback loop as perf itself calls
> write() fairly often. This patch handles the problem by mmap'ing the
> file in chunks of 64M at a time and copying events from the event buffers
> to the file, avoiding write system calls.
> 
> Before (with write syscall):
> 
> perf record -o /tmp/perf.data -e raw_syscalls:*,sched:sched_switch -a -- sleep 1
> [ perf record: Woken up 0 times to write data ]
> [ perf record: Captured and wrote 81.843 MB /tmp/perf.data (~3575786 samples) ]
> 
> After (using mmap):
> 
> perf record -o /tmp/perf.data -e raw_syscalls:*,sched:sched_switch -a -- sleep 1
> [ perf record: Woken up 31 times to write data ]
> [ perf record: Captured and wrote 8.203 MB /tmp/perf.data (~358388 samples) ]
> 
> In addition to the perf-trace benefits, using mmap lowers the overhead of
> perf-record. For example,
> 
>   perf stat -i -- perf record -g -o /tmp/perf.data openssl speed aes
> 
> shows that time, CPU cycles, and instructions all drop by more than a
> factor of 3. Jiri also ran a test that showed a big improvement.

Here are some thoughts on how 'perf record' tracing performance could be 
further improved:

1)

The use of non-temporal stores (MOVNTQ) to copy the ring-buffer into the 
file buffer makes sure the CPU cache is not trashed by the copying - which 
is the largest 'collateral damage' copying does.

glibc does not appear to expose non-temporal instructions so it's going to 
be architecture dependent - but we could build the copy_user_nocache() 
function from the kernel proper (or copy it - we could even simplify it: 
knowing that only large and page aligned buffers are going to be copied 
with it).

See how tools/perf/bench/mem-mem* does that to be able to measure the 
kernel's memcpy() and memset() function performance.
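
As a rough user-space sketch of the non-temporal copy idea, assuming x86 
with SSE2 and a compiler that ships <emmintrin.h>: the helper name 
copy_nocache() below is made up for illustration and is not the kernel's 
routine, and it uses MOVNTDQ (via _mm_stream_si128()) rather than MOVNTQ. 
It relies on the destination being 16-byte aligned and the length being a 
multiple of 16, which page-aligned, page-sized buffers satisfy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/*
 * Hypothetical helper, not part of perf, glibc or the kernel: copy len
 * bytes from src to dst using non-temporal stores, so that the written
 * data does not displace useful lines from the CPU cache.
 *
 * Assumptions: dst is 16-byte aligned and len is a multiple of 16.
 */
static void copy_nocache(void *dst, const void *src, size_t len)
{
	assert(((uintptr_t)dst & 15) == 0);
	assert((len & 15) == 0);

#if defined(__SSE2__)
	__m128i *d = dst;
	const __m128i *s = src;
	size_t i, n = len / sizeof(__m128i);

	for (i = 0; i < n; i++) {
		/* The load may be unaligned; the streaming store may not. */
		__m128i v = _mm_loadu_si128(s + i);

		_mm_stream_si128(d + i, v);
	}

	/* Streaming stores are weakly ordered: fence before returning. */
	_mm_sfence();
#else
	/* Assumed fallback for builds without SSE2: ordinary cached copy. */
	memcpy(dst, src, len);
#endif
}
```

Whether this actually beats a plain memcpy() into the mmap'ed file would 
have to be measured; the sketch only shows the shape of the copy loop.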

2)

Yet another method would be to avoid the copies altogether via the splice 
system-call - see:

	git grep splice kernel/trace/

To make splice low-overhead we'd have to introduce a mode to not mmap the 
data part of the perf ring-buffer and splice the data straight from the 
perf fd into a temporary pipe and over from the pipe into the target file 
(or socket).
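
The fd -> pipe -> file flow could look roughly like the sketch below. 
This is only an illustration: splice_through_pipe() is a made-up name, and 
it assumes the source fd supports splice reads, which the perf fd would 
only do once such a mode exists. Only pipe(2), splice(2) and close(2) are 
used.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: move up to len bytes from in_fd to out_fd through
 * a temporary pipe, without copying the data through user space.
 *
 * Assumption: in_fd supports splice reads (a perf event fd would need the
 * new mode described above).
 *
 * Returns the number of bytes moved, or -1 on error.
 */
static ssize_t splice_through_pipe(int in_fd, int out_fd, size_t len)
{
	int pfd[2];
	ssize_t total = 0;

	if (pipe(pfd) < 0)
		return -1;

	while (len > 0) {
		/* Stage 1: source fd into the pipe. */
		ssize_t in = splice(in_fd, NULL, pfd[1], NULL, len,
				    SPLICE_F_MOVE);
		ssize_t left;

		if (in < 0) {
			total = -1;
			break;
		}
		if (in == 0)	/* no more data */
			break;

		/* Stage 2: drain the pipe into the target file or socket. */
		for (left = in; left > 0; ) {
			ssize_t out = splice(pfd[0], NULL, out_fd, NULL,
					     (size_t)left, SPLICE_F_MOVE);

			if (out <= 0) {
				total = -1;
				goto out;
			}
			left -= out;
		}

		total += in;
		len -= (size_t)in;
	}
out:
	close(pfd[0]);
	close(pfd[1]);
	return total;
}
```

A real implementation would keep the pipe open across calls instead of 
creating one per invocation; it is created here only to keep the sketch 
self-contained.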

OTOH non-temporal stores are incredibly simple and memory bandwidth is 
plentiful on modern systems, so I'd certainly try that route first.

Thanks,

	Ingo
