From: Jiri Olsa <jolsa@redhat.com>
To: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
Ingo Molnar <mingo@kernel.org>,
lkml <linux-kernel@vger.kernel.org>,
Namhyung Kim <namhyung@kernel.org>,
David Ahern <dsahern@gmail.com>, Andi Kleen <ak@linux.intel.com>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Andy Lutomirski <luto@amacapital.net>,
Arnaldo Carvalho de Melo <acme@kernel.org>
Subject: Re: [RFC 00/21] perf tools: Add user data delayed processing
Date: Wed, 24 Jan 2018 13:11:14 +0100
Message-ID: <20180124121114.GA17605@krava>
In-Reply-To: <20180124115143.14322-1-jolsa@kernel.org>

fixing the wrong subject line :-\

On Wed, Jan 24, 2018 at 12:51:22PM +0100, Jiri Olsa wrote:
> hi,
> this RFC contains a change that delays the retrieval of a
> sample's user space data into task work, as originally
> described and discussed by Peter and Ingo here [1].
>
> This patchset tries to follow the original patch with
> some kernel changes (described below) and perf tool
> support included.
>
> Basically we allow the NMI event code to skip user data
> retrieval and schedule task work to do it, before the
> task resumes.
>
> Using task work limits the window where we can do
> this. We can trigger the delayed task work only if the
> task work gets executed before the process executes again
> after the NMI, because we need its stack as it was at NMI time.
>
> That leaves us with a window during the slow syscall path
> (check task_struct::perf_user_data_allowed in patches).
>
> The slow syscall processing is forced for a task when
> the user data event is enabled, which makes the task
> slower.
>
> On the other hand I noticed a roughly 100us drop in NMI
> processing times, which I plotted here [2].
>
> Not sure it's worth introducing this processing, which adds
> more processing time and does not show much improvement. On
> the other hand IIRC Peter mentioned it'd be nice to get user
> space data retrieval out of NMI.
>
> Also, you guys might think of some other better/faster way ;-)
>
> NOTE I also implemented putting the user stack data into
> delayed processing, which showed nicer numbers. But it's
> a little more tricky and brings more changes into this already
> big patchset. The logic stays the same, so I did not include
> it to keep the patchset simple.
>
> Also available in:
> https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
> perf/user_data
>
> thanks for comments,
> jirka
>
> [1] https://marc.info/?l=linux-kernel&m=150098372819938&w=2
> [2] http://people.redhat.com/~jolsa/ud-bench.png
>
> ---
> Jiri Olsa (21):
> perf tools: Add perf_evsel__is_sample_bit function
> perf tools: Add perf_sample__process function
> perf tools: Add callchain__printf for pure callchain dump
> perf tools: Add perf_sample__copy|free functions
> perf: Add TIF_PERF_USER_DATA bit
> perf: Add PERF_RECORD_USER_DATA event processing
> perf: Add PERF_SAMPLE_USER_DATA_ID sample type
> perf: Add PERF_SAMPLE_CALLCHAIN to user data event
> perf: Export running sample length values through debugfs
> perf tools: Sync perf_event.h uapi header
> perf tools: Add perf_sample__parse function
> perf tools: Add struct parse_args arg to perf_sample__parse
> perf tools: Add support to parse user data event
> perf tools: Add support to dump user data event info
> perf report: Add delayed user data event processing
> perf record: Enable delayed user data events
> perf script: Add support to display user data events
> perf script: Add support to display user data ID
> perf script: Display USER_DATA misc char for sample
> perf report: Add user data processing stats
> perf report: Add --stats=ud option to display user data debug info
>
> arch/x86/entry/common.c | 6 +++
> arch/x86/events/core.c | 18 ++++++++
> arch/x86/events/intel/ds.c | 4 +-
> arch/x86/include/asm/thread_info.h | 4 +-
> include/linux/init_task.h | 4 +-
> include/linux/perf_event.h | 3 ++
> include/linux/sched.h | 20 ++++++++
> include/uapi/linux/perf_event.h | 34 +++++++++++++-
> kernel/events/core.c | 283 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
> tools/include/uapi/linux/perf_event.h | 34 +++++++++++++-
> tools/perf/Documentation/perf-script.txt | 3 +-
> tools/perf/builtin-record.c | 2 +
> tools/perf/builtin-report.c | 301 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------
> tools/perf/builtin-script.c | 98 +++++++++++++++++++++++++++++++++++++++
> tools/perf/perf.h | 1 +
> tools/perf/util/event.c | 1 +
> tools/perf/util/event.h | 9 ++++
> tools/perf/util/evsel.c | 118 +++++++++++++++++++++++++++++++++++++----------
> tools/perf/util/evsel.h | 5 ++
> tools/perf/util/session.c | 60 +++++++++++++++++++-----
> tools/perf/util/thread.c | 1 +
> tools/perf/util/thread.h | 16 +++++++
> tools/perf/util/tool.h | 1 +
> 23 files changed, 954 insertions(+), 72 deletions(-)
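
For readers following along, the deferral described in the cover letter can be modeled in user space roughly as below. This is only a sketch under stated assumptions, not code from the patchset: struct task, struct sample, struct user_data_record, and the functions nmi_sample(), return_to_user() and merge_user_data() are all hypothetical names. It also assumes that samples taken before the task resumes share one ID and that the tool joins them with a later user data record by that ID, which the patch titles (PERF_SAMPLE_USER_DATA_ID, PERF_RECORD_USER_DATA) suggest but the cover letter does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for per-task state; not the kernel's task_struct. */
struct task {
	bool     user_data_pending; /* plays the role of a TIF_PERF_USER_DATA-like flag */
	uint64_t next_id;           /* last ID handed out (0 means none yet) */
	uint64_t pending_id;        /* ID shared by samples awaiting user data */
	uint64_t user_sp;           /* simulated user stack pointer */
};

/* What the "NMI" emits: kernel-side fields only, plus an ID to join on. */
struct sample {
	uint64_t ip;
	uint64_t user_data_id;
	bool     has_user_data;
	uint64_t user_sp;           /* filled in later by merge_user_data() */
};

/* What the deferred work emits once, before the task resumes. */
struct user_data_record {
	uint64_t id;
	uint64_t user_sp;
};

/* "NMI" path: skip the user data, just tag the sample and mark work pending. */
static void nmi_sample(struct task *t, uint64_t ip, struct sample *out)
{
	assert(t && out);
	if (!t->user_data_pending) {
		t->user_data_pending = true;
		t->pending_id = ++t->next_id;
	}
	out->ip = ip;
	out->user_data_id = t->pending_id;
	out->has_user_data = false;
	out->user_sp = 0;
}

/*
 * "Task work" path: runs before the task executes user code again, so the
 * user state is still what it was at NMI time. Returns true if a record
 * was emitted.
 */
static bool return_to_user(struct task *t, struct user_data_record *rec)
{
	assert(t && rec);
	if (!t->user_data_pending)
		return false;
	rec->id = t->pending_id;
	rec->user_sp = t->user_sp;
	t->user_data_pending = false;
	return true;
}

/* Tool side: attach the record to every sample carrying the same ID. */
static size_t merge_user_data(struct sample *s, size_t n,
			      const struct user_data_record *rec)
{
	size_t merged = 0;

	assert(s && rec);
	for (size_t i = 0; i < n; i++) {
		if (s[i].user_data_id == rec->id && !s[i].has_user_data) {
			s[i].user_sp = rec->user_sp;
			s[i].has_user_data = true;
			merged++;
		}
	}
	return merged;
}
```

In this model two samples taken back to back before the task resumes get the same ID and are completed by a single record, while a sample taken after the task has resumed gets a new ID and stays incomplete until its own record arrives.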