From: Peter Zijlstra <peterz@infradead.org>
To: Steven Rostedt <rostedt@kernel.org>
Cc: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
	bpf@vger.kernel.org, x86@kernel.org,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Josh Poimboeuf <jpoimboe@kernel.org>,
	Ingo Molnar <mingo@kernel.org>, Jiri Olsa <jolsa@kernel.org>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Namhyung Kim <namhyung@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Andrii Nakryiko <andrii@kernel.org>,
	Indu Bhagat <indu.bhagat@oracle.com>,
	"Jose E. Marchesi" <jemarch@gnu.org>,
	Beau Belgrave <beaub@linux.microsoft.com>,
	Jens Remus <jremus@linux.ibm.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Florian Weimer <fweimer@redhat.com>, Sam James <sam@gentoo.org>,
	Kees Cook <kees@kernel.org>,
	Carlos O'Donell <codonell@redhat.com>
Subject: Re: [RESEND][PATCH v15 2/4] perf: Support deferred user callchains
Date: Tue, 23 Sep 2025 12:32:13 +0200
Message-ID: <20250923103213.GD3419281@noisy.programming.kicks-ass.net>
In-Reply-To: <20250908171524.605637238@kernel.org>

On Mon, Sep 08, 2025 at 01:14:14PM -0400, Steven Rostedt wrote:
> +struct perf_callchain_deferred_event {
> +	struct perf_event_header	header;
> +	u64				cookie;
> +	u64				nr;
> +	u64				ips[];
> +};
> +
> +static void perf_event_callchain_deferred(struct callback_head *work)
> +{
> +	struct perf_event *event = container_of(work, struct perf_event, pending_unwind_work);
> +	struct perf_callchain_deferred_event deferred_event;
> +	u64 callchain_context = PERF_CONTEXT_USER;
> +	struct unwind_stacktrace trace;
> +	struct perf_output_handle handle;
> +	struct perf_sample_data data;
> +	u64 nr;
> +
> +	if (!event->pending_unwind_callback)
> +		return;
> +
> +	if (unwind_user_faultable(&trace) < 0)
> +		goto out;
> +
> +	/*
> +	 * All accesses to the event must belong to the same implicit RCU
> +	 * read-side critical section as the ->pending_unwind_callback reset.
> +	 * See comment in perf_pending_unwind_sync().
> +	 */
> +	guard(rcu)();
> +
> +	if (current->flags & (PF_KTHREAD | PF_USER_WORKER))
> +		goto out;
> +
> +	nr = trace.nr + 1 ; /* '+1' == callchain_context */
> +
> +	deferred_event.header.type = PERF_RECORD_CALLCHAIN_DEFERRED;
> +	deferred_event.header.misc = PERF_RECORD_MISC_USER;
> +	deferred_event.header.size = sizeof(deferred_event) + (nr * sizeof(u64));
> +
> +	deferred_event.nr = nr;
> +	deferred_event.cookie = unwind_user_get_cookie();
> +
> +	perf_event_header__init_id(&deferred_event.header, &data, event);
> +
> +	if (perf_output_begin(&handle, &data, event, deferred_event.header.size))
> +		goto out;
> +
> +	perf_output_put(&handle, deferred_event);
> +	perf_output_put(&handle, callchain_context);
> +	/* trace.entries[] are not guaranteed to be 64bit */
> +	for (int i = 0; i < trace.nr; i++) {
> +		u64 entry = trace.entries[i];
> +		perf_output_put(&handle, entry);
> +	}
> +	perf_event__output_id_sample(event, &handle, &data);
> +
> +	perf_output_end(&handle);
> +
> +out:
> +	event->pending_unwind_callback = 0;
> +	local_dec(&event->ctx->nr_no_switch_fast);
> +	rcuwait_wake_up(&event->pending_unwind_wait);
> +}
> +

> +/*
> + * Returns:
> + *    > 0 : if already queued.
> + *      0 : if it performed the queuing
> + *    < 0 : if it did not get queued.
> + */
> +static int deferred_request(struct perf_event *event)
> +{
> +	struct callback_head *work = &event->pending_unwind_work;
> +	int pending;
> +	int ret;
> +
> +	/* Only defer for task events */
> +	if (!event->ctx->task)
> +		return -EINVAL;
> +
> +	if ((current->flags & (PF_KTHREAD | PF_USER_WORKER)) ||
> +	    !user_mode(task_pt_regs(current)))
> +		return -EINVAL;
> +
> +	guard(irqsave)();
> +
> +	/* callback already pending? */
> +	pending = READ_ONCE(event->pending_unwind_callback);
> +	if (pending)
> +		return 1;
> +
> +	/* Claim the work unless an NMI just now swooped in to do so. */
> +	if (!try_cmpxchg(&event->pending_unwind_callback, &pending, 1))
> +		return 1;
> +
> +	/* The work has been claimed, now schedule it. */
> +	ret = task_work_add(current, work, TWA_RESUME);
> +	if (WARN_ON_ONCE(ret)) {
> +		WRITE_ONCE(event->pending_unwind_callback, 0);
> +		return ret;
> +	}
> +	return 0;
> +}

So the thing that stands out is that you're not actually using the
unwind infrastructure you've previously created. Things like: struct
unwind_work, unwind_deferred_{init,request,cancel}() all go unused, and
instead you seem to have built a parallel set, with similar bugs to the
ones I just had to fix in the unwind_deferred things :/

I'm also not much of a fan of nr_no_switch_fast, and the fact that this
patch is limited to per-task events, and you're then adding another 300+
lines of code to support per-cpu events later on.

Fundamentally we only have one stack-trace per task at any one point. We
can have many events per task and many more per-cpu. Let us stick a
struct unwind_work in task_struct and have the perf callback function
use perf_iterate_sb() to find all events that want delivery or so (or we
can add another per perf_event_context list for this purpose).
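
To make the shape of that suggestion concrete, here is a rough userspace
model in plain C, not kernel code. Every name in it (model_task,
model_event, model_request(), model_task_work()) is hypothetical and
invented for illustration; it does not use struct unwind_work,
perf_iterate_sb() or any other kernel API, and it ignores NMI and
locking concerns entirely. It only shows the idea of one pending unwind
per task, with a single stack walk fanned out to every event that asked
for a callchain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_FRAMES 8

/* Hypothetical stand-ins; none of these are the kernel's definitions. */
struct model_trace {
	unsigned int nr;
	uint64_t entries[MODEL_MAX_FRAMES];
};

struct model_event {
	const char *name;
	bool wants_callchain;    /* a sample asked for a deferred callchain */
	unsigned int delivered;  /* deferred records emitted for this event */
	uint64_t last_cookie;    /* cookie written into the last record */
	uint64_t last_nr;        /* entry count written into the last record */
	struct model_event *next;
};

struct model_task {
	bool unwind_pending;     /* one pending flag per task, not per event */
	uint64_t cookie;         /* identifies the current user context */
	unsigned int unwinds;    /* how many times the stack was walked */
	struct model_event *events;
};

/*
 * Request path: mark the event as interested and queue the per-task
 * work at most once. Returns 0 if this call queued the work and 1 if
 * it was already queued.
 */
static int model_request(struct model_task *task, struct model_event *event)
{
	event->wants_callchain = true;
	if (task->unwind_pending)
		return 1;
	task->unwind_pending = true;
	return 0;
}

/* Pretend unwinder: fills in three fake return addresses. */
static void model_unwind(struct model_task *task, struct model_trace *trace)
{
	trace->nr = 3;
	assert(trace->nr <= MODEL_MAX_FRAMES);
	for (unsigned int i = 0; i < trace->nr; i++)
		trace->entries[i] = 0x1000 + 0x10 * (uint64_t)i;
	task->unwinds++;
}

/*
 * Return-to-user callback: walk the stack once, then hand the same
 * trace to every event on the task that wants delivery.
 */
static void model_task_work(struct model_task *task)
{
	struct model_trace trace;

	if (!task->unwind_pending)
		return;

	model_unwind(task, &trace);

	for (struct model_event *e = task->events; e; e = e->next) {
		if (!e->wants_callchain)
			continue;
		e->wants_callchain = false;
		e->delivered++;
		e->last_cookie = task->cookie;
		e->last_nr = trace.nr + 1; /* '+1' == context marker entry */
	}

	task->unwind_pending = false;
	task->cookie++; /* the next entry into the kernel is a new context */
}
```

With two events requesting on the same task, the first model_request()
returns 0, the second returns 1, and a single model_task_work() call
performs one unwind and delivers to both. Whether the real thing walks
events via perf_iterate_sb() or a dedicated per-context list is left
open, as in the text above.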

But duplicating all this seems 'unfortunate'.
