From: Alexey Budankov <alexey.budankov@linux.intel.com>
To: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>,
Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Jiri Olsa <jolsa@redhat.com>, Namhyung Kim <namhyung@kernel.org>,
Andi Kleen <ak@linux.intel.com>,
linux-kernel <linux-kernel@vger.kernel.org>
Subject: [PATCH v8 0/3]: perf: reduce data loss when profiling highly parallel CPU bound workloads
Date: Fri, 7 Sep 2018 10:07:22 +0300
Message-ID: <e1144f9d-b231-e42c-d698-4db0e62b71ff@linux.intel.com>
Currently, in record mode, the tool writes the trace serially.
The algorithm loops over the mapped per-cpu data buffers and stores
ready data chunks into a trace file using the write() system call.
Under some circumstances the kernel may lack free space in a buffer
because the other half of that buffer has not yet been written to disk
while the tool is busy writing another buffer's data.
Thus the serial trace writing implementation may cause the kernel
to lose profiling data, and that is what is observed when profiling
highly parallel CPU bound workloads on machines with a large number
of cores.
An experiment profiling matrix multiplication code executing 128
threads on an Intel Xeon Phi (KNM) with 272 cores, as shown below,
demonstrates a data loss metric value of 98%:
/usr/bin/time perf record -o /tmp/perf-ser.data -a -N -B -T -R -g \
--call-graph dwarf,1024 --user-regs=IP,SP,BP \
--switch-events -e cycles,instructions,ref-cycles,software/period=1,name=cs,config=0x3/Duk -- \
matrix.gcc
The data loss metric is the ratio lost_time/elapsed_time, where
lost_time is the sum of the time intervals containing PERF_RECORD_LOST
records and elapsed_time is the elapsed application run time
under profiling.
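To make the ratio concrete, here is a minimal sketch in C. It assumes the
intervals containing PERF_RECORD_LOST records have already been extracted
from the trace as non-overlapping (start, end) timestamp pairs. The names
struct lost_interval and data_loss_metric() are hypothetical and are not
part of perf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one time interval (ns) containing PERF_RECORD_LOST records. */
struct lost_interval {
	uint64_t start_ns;
	uint64_t end_ns;
};

/*
 * Return lost_time / elapsed_time, where lost_time is the sum of the
 * interval lengths. Intervals are assumed not to overlap.
 */
double data_loss_metric(const struct lost_interval *iv, size_t nr,
			uint64_t elapsed_ns)
{
	uint64_t lost_ns = 0;
	size_t i;

	assert(elapsed_ns > 0);
	for (i = 0; i < nr; i++)
		lost_ns += iv[i].end_ns - iv[i].start_ns;

	return (double)lost_ns / (double)elapsed_ns;
}
```

For example, two lost intervals of 30 ns and 18 ns within a 100 ns run
give a metric of 0.48, i.e. 48%.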
Applying asynchronous trace streaming through the POSIX AIO API
(http://man7.org/linux/man-pages/man7/aio.7.html)
lowers the data loss metric value, providing a 2x improvement -
lowering the 98% loss to almost 0%.
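For readers unfamiliar with POSIX AIO, the following is a minimal sketch
of the calls involved. It is not the code from this series. The helper
aio_write_and_wait() is a hypothetical name, and it waits for completion
right away only to keep the example short. On older glibc versions the
program may need to be linked with -lrt.

```c
#include <aio.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper: queue one asynchronous write of buf[0..len) at
 * file offset off, then block until it completes.
 * Returns the number of bytes written, or -1 with errno set.
 */
ssize_t aio_write_and_wait(int fd, const void *buf, size_t len, off_t off)
{
	struct aiocb cb;
	const struct aiocb *list[1] = { &cb };
	int err;

	memset(&cb, 0, sizeof(cb));
	cb.aio_fildes = fd;
	cb.aio_buf = (void *)buf;
	cb.aio_nbytes = len;
	cb.aio_offset = off;
	cb.aio_sigevent.sigev_notify = SIGEV_NONE;

	/* aio_write() may fail with EAGAIN when resources are short: retry. */
	while (aio_write(&cb) == -1) {
		if (errno != EAGAIN)
			return -1;
	}

	/* Sleep in aio_suspend() until the request is no longer in progress. */
	while ((err = aio_error(&cb)) == EINPROGRESS) {
		if (aio_suspend(list, 1, NULL) == -1 && errno != EINTR)
			return -1;
	}
	if (err != 0) {
		errno = err;
		return -1;
	}

	/* Collect the final status, as write() would have returned it. */
	return aio_return(&cb);
}
```

The benefit described above comes from not waiting immediately: after
aio_write() queues the request, the caller is free to go on to the next
buffer and check for completion later.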
---
Alexey Budankov (3):
  perf util: map data buffer for preserving collected data
  perf record: enable asynchronous trace writing
  perf record: extend trace writing to multi AIO

 tools/perf/builtin-record.c | 166 ++++++++++++++++++++++++++++++++++++++++++--
 tools/perf/perf.h           |   1 +
 tools/perf/util/evlist.c    |   7 +-
 tools/perf/util/evlist.h    |   3 +-
 tools/perf/util/mmap.c      | 114 ++++++++++++++++++++++++++----
 tools/perf/util/mmap.h      |  11 ++-
 6 files changed, 277 insertions(+), 25 deletions(-)
---
Changes in v8:
- ran the whole series through checkpatch.pl and fixed the reported issues,
  except for lines longer than 80 characters
- corrected comments alignment and formatting
- moved multi AIO implementation into 3rd patch in the series
- implemented explicit cblocks array allocation
- split AIO completion check into separate record__aio_complete()
- set nr_cblocks default to 1 and max allowed value to 4
Changes in v7:
- implemented handling record.aio setting from perfconfig file
Changes in v6:
- adjusted setting of priorities for cblocks;
- handled errno == EAGAIN case from aio_write() return;
Changes in v5:
- resolved livelock on perf record -e intel_pt// -- dd if=/dev/zero of=/dev/null count=100000
- data loss metrics decreased from 25% to 2x in trialed configuration;
- reshaped layout of data structures;
- implemented --aio option;
- avoided nanosleep() prior to calling aio_suspend();
- switched to per-cpu aio multi buffer record__aio_sync();
- record_mmap_read_sync() now does a global sync just before
  switching the trace file or stopping collection;
Changes in v4:
- converted mmap()/munmap() to malloc()/free() for mmap->data buffer management
- converted void *bf to struct perf_mmap *md in signatures
- wrote a comment in perf_mmap__push() just before perf_mmap__get();
- wrote a comment in record__mmap_read_sync() on possibly restarting
  the aio_write() operation and releasing the perf_mmap object after all;
- added perf_mmap__put() for the cases of failed aio_write();
Changes in v3:
- wrote comments about the nanosleep(0.5ms) call prior to aio_suspend()
  to cope with the intrusiveness of its implementation in glibc;
- wrote comments about the rationale behind copying profiling data
  into the mmap->data buffer;
Changes in v2:
- converted zalloc() to calloc() for allocation of mmap_aio array,
- fixed a typo and adjusted fallback branch code;