* [PATCH 0/4] event synthesization multithreading for perf record
@ 2017-10-13 14:09 kan.liang
  2017-10-13 14:09 ` [PATCH 1/4] perf tools: pass thread info to process function kan.liang
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: kan.liang @ 2017-10-13 14:09 UTC
  To: acme, peterz, mingo, linux-kernel
  Cc: jolsa, wangnan0, hekuang, namhyung, alexander.shishkin,
	adrian.hunter, ak, Kan Liang

From: Kan Liang <Kan.liang@intel.com>

Event synthesization multithreading was introduced in the
"perf top optimization" series (https://lkml.org/lkml/2017/9/29/269),
but it was not enabled for perf record because the process function,
process_synthesized_event, was not multithreading friendly.

This patch series temporarily stores the processing results in a per-thread
buffer, which allows the processing to run in parallel. The buffers are then
written one by one to perf.data at the end of event synthesization.
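
As a rough illustration of the idea only (this is not code from the series),
the sketch below gives each worker a private staging buffer and has the main
thread drain the buffers in order after all workers are joined. The names
synth_buf, synth_worker, synthesize_parallel and NR_WORKERS are hypothetical,
and the text records stand in for real perf events.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NR_WORKERS 4	/* hypothetical thread count */

/* Hypothetical per-thread staging buffer; not a perf data structure. */
struct synth_buf {
	char	*data;
	size_t	len;
	size_t	cap;
	int	id;	/* worker index */
	int	err;	/* set by the worker on allocation failure */
};

/* Append one record to a worker-private buffer; no locking is needed. */
static int synth_buf_append(struct synth_buf *b, const void *rec, size_t size)
{
	if (b->len + size > b->cap) {
		size_t cap = b->cap ? b->cap * 2 : 64;
		char *p;

		while (cap < b->len + size)
			cap *= 2;
		p = realloc(b->data, cap);
		if (!p)
			return -1;
		b->data = p;
		b->cap = cap;
	}
	memcpy(b->data + b->len, rec, size);
	b->len += size;
	assert(b->len <= b->cap);
	return 0;
}

/* Worker: stands in for synthesizing events for one slice of tasks. */
static void *synth_worker(void *arg)
{
	struct synth_buf *b = arg;
	char rec[32];
	int i;

	for (i = 0; i < 3; i++) {
		int n = snprintf(rec, sizeof(rec), "worker %d event %d\n",
				 b->id, i);

		if (synth_buf_append(b, rec, (size_t)n) < 0) {
			b->err = 1;
			break;
		}
	}
	return NULL;
}

/* Run the workers in parallel, then write their buffers one by one. */
static int synthesize_parallel(FILE *out)
{
	pthread_t tid[NR_WORKERS];
	struct synth_buf bufs[NR_WORKERS];
	int i, started = 0, ret = 0;

	memset(bufs, 0, sizeof(bufs));
	for (i = 0; i < NR_WORKERS; i++) {
		bufs[i].id = i;
		if (pthread_create(&tid[i], NULL, synth_worker, &bufs[i])) {
			ret = -1;
			break;
		}
		started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(tid[i], NULL);

	/* Single writer: the output is touched only after all joins. */
	for (i = 0; i < NR_WORKERS; i++) {
		if (bufs[i].err)
			ret = -1;
		if (!ret && bufs[i].len &&
		    fwrite(bufs[i].data, 1, bufs[i].len, out) != bufs[i].len)
			ret = -1;
		free(bufs[i].data);
	}
	return ret;
}
```

Because only the main thread writes to the output, and only after the joins,
the records appear in worker order no matter how the threads were scheduled.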

The source code is also available at
https://github.com/kliang2/perf.git perf_record_opt

Usually, event synthesization happens only once, at either start or end.
With the snapshotting code, we synthesize events multiple times, once for
each new perf.data file. Both cases have been verified.

Here are the latency test results on Knights Mill and Skylake server.

The workload is a Linux kernel build, started as follows:
"sudo nice make -j$(grep -c '^processor' /proc/cpuinfo)"
Then, "sudo perf record -e cycles -a -- sleep 1"

The latency is the time cost of __machine__synthesize_threads or
its multithreading replacement, record__multithread_synthesize.
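
The cover letter does not say how this latency was measured. One plausible
way, shown here purely as an assumption, is to sample a monotonic clock
around the call. The helpers elapsed_sec and time_call are hypothetical, and
fn stands in for whichever synthesis routine is being timed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Elapsed seconds between two CLOCK_MONOTONIC samples. */
static double elapsed_sec(const struct timespec *start,
			  const struct timespec *end)
{
	return (double)(end->tv_sec - start->tv_sec) +
	       (double)(end->tv_nsec - start->tv_nsec) / 1e9;
}

/* Time one call of fn(arg); fn is a stand-in for the synthesis routine. */
static double time_call(void (*fn)(void *), void *arg)
{
	struct timespec t0, t1;

	assert(fn != NULL);
	clock_gettime(CLOCK_MONOTONIC, &t0);
	fn(arg);
	clock_gettime(CLOCK_MONOTONIC, &t1);
	return elapsed_sec(&t0, &t1);
}
```

A monotonic clock is used so that wall-clock adjustments during the run do
not distort the measured interval.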

- Latency on Knights Mill (272 CPUs)

Original(s)     With patch(s)   Speedup
12.74           5.54            2.3X

- Latency on Skylake server (192 CPUs)

Original(s)     With patch(s)   Speedup
0.36            0.25            1.47X

Kan Liang (4):
  perf tools: pass thread info to process function
  perf tools: pass thread info in event synthesization
  perf record: event synthesization multithreading support
  perf record: add option to set the number of thread for event
    synthesize

 tools/perf/Documentation/perf-record.txt |   4 ++
 tools/perf/arch/x86/util/tsc.c           |   2 +-
 tools/perf/builtin-inject.c              |  12 +++-
 tools/perf/builtin-record.c              | 100 ++++++++++++++++++++++++++--
 tools/perf/builtin-sched.c               |  12 ++--
 tools/perf/builtin-stat.c                |   3 +-
 tools/perf/builtin-trace.c               |   3 +-
 tools/perf/tests/cpumap.c                |   6 +-
 tools/perf/tests/dwarf-unwind.c          |   6 +-
 tools/perf/tests/event_update.c          |  12 ++--
 tools/perf/tests/stat.c                  |   9 ++-
 tools/perf/tests/thread-map.c            |   3 +-
 tools/perf/util/auxtrace.c               |   2 +-
 tools/perf/util/event.c                  | 111 +++++++++++++++++++------------
 tools/perf/util/event.h                  |  19 ++++--
 tools/perf/util/header.c                 |  16 ++---
 tools/perf/util/intel-bts.c              |   3 +-
 tools/perf/util/intel-pt.c               |   3 +-
 tools/perf/util/session.c                |   4 +-
 19 files changed, 243 insertions(+), 87 deletions(-)

-- 
2.7.4

Thread overview: 9+ messages
2017-10-13 14:09 [PATCH 0/4] event synthesization multithreading for perf record kan.liang
2017-10-13 14:09 ` [PATCH 1/4] perf tools: pass thread info to process function kan.liang
2017-10-13 14:09 ` [PATCH 2/4] perf tools: pass thread info in event synthesization kan.liang
2017-10-13 14:09 ` [PATCH 3/4] perf record: event synthesization multithreading support kan.liang
2017-10-13 14:38   ` Arnaldo Carvalho de Melo
2017-10-13 14:58     ` Liang, Kan
2017-10-13 15:09       ` Arnaldo Carvalho de Melo
2017-10-13 20:06     ` Ingo Molnar
2017-10-13 14:09 ` [PATCH 4/4] perf record: add option to set the number of thread for event synthesize kan.liang
