From: Namhyung Kim <namhyung@kernel.org>
To: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: x86@kernel.org, Peter Zijlstra <peterz@infradead.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@kernel.org>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	linux-kernel@vger.kernel.org,
	Indu Bhagat <indu.bhagat@oracle.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>, Ian Rogers <irogers@google.com>,
	Adrian Hunter <adrian.hunter@intel.com>,
	linux-perf-users@vger.kernel.org, Mark Brown <broonie@kernel.org>,
	linux-toolchains@vger.kernel.org, Jordan Rome <jordalgo@meta.com>,
	Sam James <sam@gentoo.org>
Subject: Re: [PATCH v2 09/11] perf: Introduce deferred user callchains
Date: Tue, 17 Sep 2024 15:07:43 -0700
Message-ID: <Zun9r1TAAG1slUSA@google.com>
In-Reply-To: <5bc834b68fe14daaa1782b925ab54fc414245334.1726268190.git.jpoimboe@kernel.org>

On Sat, Sep 14, 2024 at 01:02:11AM +0200, Josh Poimboeuf wrote:
> Instead of attempting to unwind user space from the NMI handler, defer
> it to run in task context by sending a self-IPI and then scheduling the
> unwind to run in the IRQ's exit task work before returning to user space.
>
> This allows the user stack page to be paged in if needed, avoids
> duplicate unwinds for kernel-bound workloads, and prepares for SFrame
> unwinding (so .sframe sections can be paged in on demand).
>
> Suggested-by: Steven Rostedt <rostedt@goodmis.org>
> Suggested-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
> ---
[SNIP]
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 19fd7bd38ecf..5fc7c5156287 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -6854,11 +6860,70 @@ static void perf_pending_irq(struct irq_work *entry)
> perf_swevent_put_recursion_context(rctx);
> }
>
> +struct perf_callchain_deferred_event {
> + struct perf_event_header header;
> + struct perf_callchain_entry callchain;
> +};
> +
> +#define PERF_CALLCHAIN_DEFERRED_EVENT_SIZE \
> + sizeof(struct perf_callchain_deferred_event) + \
> + (sizeof(__u64) * 1) + /* PERF_CONTEXT_USER */ \
> + (sizeof(__u64) * PERF_MAX_STACK_DEPTH)
> +
> +static void perf_event_callchain_deferred(struct perf_event *event)
> +{
> + struct pt_regs *regs = task_pt_regs(current);
> + struct perf_callchain_entry *callchain;
> + struct perf_output_handle handle;
> + struct perf_sample_data data;
> + unsigned char buf[PERF_CALLCHAIN_DEFERRED_EVENT_SIZE];
> + struct perf_callchain_entry_ctx ctx;
> + struct perf_callchain_deferred_event *deferred_event;
> +
> + deferred_event = (void *)&buf;
> +
> + callchain = &deferred_event->callchain;
> + callchain->nr = 0;
> +
> + ctx.entry = callchain;
> + ctx.max_stack = MIN(event->attr.sample_max_stack,
> + PERF_MAX_STACK_DEPTH);
> + ctx.nr = 0;
> + ctx.contexts = 0;
> + ctx.contexts_maxed = false;
> +
> + perf_callchain_store_context(&ctx, PERF_CONTEXT_USER);
> + perf_callchain_user_deferred(&ctx, regs);
> +
> + deferred_event->header.type = PERF_RECORD_CALLCHAIN_DEFERRED;
> + deferred_event->header.misc = 0;

I think we can use PERF_RECORD_MISC_USER here as it's about user
callchains.

> + deferred_event->header.size = sizeof(*deferred_event) +
> + (callchain->nr * sizeof(u64));
> +
> + perf_event_header__init_id(&deferred_event->header, &data, event);
> +
> + if (perf_output_begin(&handle, &data, event,
> + deferred_event->header.size))
> + return;
> +
> + perf_output_copy(&handle, deferred_event, deferred_event->header.size);

You should not copy header.size bytes here, because that size also
covers the id_sample part, which is written separately below.  Maybe
something like this instead?

	perf_output_put(&handle, *deferred_event);
	__output_copy(&handle, callchain->ip, callchain->nr * sizeof(u64));
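
To illustrate the size mismatch, here is a minimal user-space sketch, not
kernel code.  All mock_* names, the struct layouts, the fixed-size ip[]
array and the 16-byte id_sample size used in the note below are assumptions
made up for illustration.  It only models the point above: header.size
already covers the id_sample part, so copying header.size bytes and then
appending the id_sample writes more than the record claims to contain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MOCK_MAX_STACK 8

/* Simplified stand-in for struct perf_event_header (layout assumed). */
struct mock_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/* Header plus callchain; a fixed-size ip[] keeps the sketch simple. */
struct mock_deferred_event {
	struct mock_header header;
	uint64_t nr;
	uint64_t ip[MOCK_MAX_STACK];
};

/* Toy output buffer that only tracks how many bytes were written. */
struct mock_ring {
	unsigned char data[256];
	size_t used;
};

static void mock_copy(struct mock_ring *rb, const void *src, size_t len)
{
	assert(rb->used + len <= sizeof(rb->data));
	memcpy(rb->data + rb->used, src, len);
	rb->used += len;
}

/* Build an event whose header.size already accounts for the id_sample. */
static struct mock_deferred_event mock_make_event(uint64_t nr, uint16_t id_size)
{
	struct mock_deferred_event ev = { 0 };

	assert(nr <= MOCK_MAX_STACK);
	ev.nr = nr;
	ev.header.size = (uint16_t)(offsetof(struct mock_deferred_event, ip) +
				    nr * sizeof(uint64_t) + id_size);
	return ev;
}

/* As in the patch: copy header.size bytes, then append the id_sample. */
static size_t mock_emit_whole(const struct mock_deferred_event *ev,
			      uint16_t id_size)
{
	struct mock_ring rb = { .used = 0 };
	unsigned char id_sample[64] = { 0 };

	assert(id_size <= sizeof(id_sample));
	assert(ev->header.size <= sizeof(*ev));
	mock_copy(&rb, ev, ev->header.size);
	mock_copy(&rb, id_sample, id_size);
	return rb.used;
}

/* As suggested: fixed part, then nr entries, then the id_sample. */
static size_t mock_emit_split(const struct mock_deferred_event *ev,
			      uint16_t id_size)
{
	struct mock_ring rb = { .used = 0 };
	unsigned char id_sample[64] = { 0 };

	assert(id_size <= sizeof(id_sample));
	mock_copy(&rb, ev, offsetof(struct mock_deferred_event, ip));
	mock_copy(&rb, ev->ip, ev->nr * sizeof(uint64_t));
	mock_copy(&rb, id_sample, id_size);
	return rb.used;
}
```

With nr = 3 and a 16-byte id_sample, header.size is 56 in this model.  The
split variant writes exactly 56 bytes, while copying header.size up front
and then appending the id_sample writes 72.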

Thanks,
Namhyung

> + perf_event__output_id_sample(event, &handle, &data);
> + perf_output_end(&handle);
> +}
> +
> static void perf_pending_task(struct callback_head *head)
> {
> struct perf_event *event = container_of(head, struct perf_event, pending_task);
> int rctx;
>
> + if (!is_software_event(event)) {
> + if (event->pending_callchain) {
> + perf_event_callchain_deferred(event);
> + event->pending_callchain = 0;
> + }
> + return;
> + }
> +
> /*
> * All accesses to the event must belong to the same implicit RCU read-side
> * critical section as the ->pending_work reset. See comment in
> @@ -7688,6 +7753,8 @@ perf_callchain(struct perf_event *event, struct pt_regs *regs)
> bool user = !event->attr.exclude_callchain_user;
> const u32 max_stack = event->attr.sample_max_stack;
> struct perf_callchain_entry *callchain;
> + bool defer_user = IS_ENABLED(CONFIG_HAVE_PERF_CALLCHAIN_DEFERRED) &&
> + event->attr.defer_callchain;
>
> /* Disallow cross-task user callchains. */
> user &= !event->ctx->task || event->ctx->task == current;
> @@ -7695,7 +7762,14 @@ perf_callchain(struct perf_event *event, struct pt_regs *regs)
> if (!kernel && !user)
> return &__empty_callchain;
>
> - callchain = get_perf_callchain(regs, kernel, user, max_stack, true);
> + callchain = get_perf_callchain(regs, kernel, user, max_stack, true,
> + defer_user);
> +
> + if (user && defer_user && !event->pending_callchain) {
> + event->pending_callchain = 1;
> + irq_work_queue(&event->pending_irq);
> + }
> +
> return callchain ?: &__empty_callchain;
> }
>
> --
> 2.46.0
>