From: Marco Elver <elver@google.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: alexander.shishkin@linux.intel.com, acme@kernel.org,
	mingo@redhat.com, jolsa@redhat.com, mark.rutland@arm.com,
	namhyung@kernel.org, tglx@linutronix.de, glider@google.com,
	viro@zeniv.linux.org.uk, arnd@arndb.de, christian@brauner.io,
	dvyukov@google.com, jannh@google.com, axboe@kernel.dk,
	mascasa@google.com, pcc@google.com, irogers@google.com,
	kasan-dev@googlegroups.com, linux-arch@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	x86@kernel.org, linux-kselftest@vger.kernel.org
Subject: Re: [PATCH RFC v2 3/8] perf/core: Add support for event removal on exec
Date: Mon, 22 Mar 2021 10:20:02 +0100
Message-ID: <YFhhQgUzXLSTlcu0@elver.google.com>
In-Reply-To: <YFDbP3obvxn0SL4w@hirez.programming.kicks-ass.net>

On Tue, Mar 16, 2021 at 05:22PM +0100, Peter Zijlstra wrote:
> On Wed, Mar 10, 2021 at 11:41:34AM +0100, Marco Elver wrote:
> > Adds bit perf_event_attr::remove_on_exec, to support removing an event
> > from a task on exec.
> > 
> > This option supports the case where an event is supposed to be
> > process-wide only, and should not propagate beyond exec, to limit
> > monitoring to the original process image only.
> > 
> > Signed-off-by: Marco Elver <elver@google.com>
> 
> > +/*
> > + * Removes all events from the current task that have been marked
> > + * remove-on-exec, and feeds their values back to parent events.
> > + */
> > +static void perf_event_remove_on_exec(void)
> > +{
> > +	int ctxn;
> > +
> > +	for_each_task_context_nr(ctxn) {
> > +		struct perf_event_context *ctx;
> > +		struct perf_event *event, *next;
> > +
> > +		ctx = perf_pin_task_context(current, ctxn);
> > +		if (!ctx)
> > +			continue;
> > +		mutex_lock(&ctx->mutex);
> > +
> > +		list_for_each_entry_safe(event, next, &ctx->event_list, event_entry) {
> > +			if (!event->attr.remove_on_exec)
> > +				continue;
> > +
> > +			if (!is_kernel_event(event))
> > +				perf_remove_from_owner(event);
> > +			perf_remove_from_context(event, DETACH_GROUP);
> 
> There's a comment on this in perf_event_exit_event(), if this task
> happens to have the original event, then DETACH_GROUP will destroy the
> grouping.
> 
> I think this wants to be:
> 
> 			perf_remove_from_context(event,
> 						 child_event->parent ?  DETACH_GROUP : 0);
> 
> or something.
> 
> > +			/*
> > +			 * Remove the event and feed back its values to the
> > +			 * parent event.
> > +			 */
> > +			perf_event_exit_event(event, ctx, current);
> 
> Oooh, and here we call it... but it will do list_del_event() /
> perf_group_detach() *again*.
> 
> So the problem is that perf_event_exit_task_context() doesn't use
> remove_from_context(), but instead does task_ctx_sched_out() and then
> relies on the events not being active.
> 
> Whereas above you *DO* use remove_from_context(), but then
> perf_event_exit_event() will try and remove it more.

AFAIK, we want to deallocate the events and not just remove them, so
doing what perf_event_exit_event() does is the right way forward? Or did
you have something else in mind?
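
To make the double-removal concern concrete, here is a minimal userspace
sketch, not kernel code. Every name in it (model_event, model_ctx,
model_exit_event(), ...) is hypothetical and only loosely mirrors the
perf structures; there is no locking, scheduling or RCU. It shows only
the invariant under discussion: each marked event is unlinked exactly
once, and its count is folded into the parent, while the list is walked
with the next pointer saved up front, as list_for_each_entry_safe()
does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct perf_event; not the kernel type. */
struct model_event {
	struct model_event *prev, *next;	/* links in the context's event list */
	struct model_event *parent;		/* event this one was inherited from, or NULL */
	bool remove_on_exec;
	bool attached;				/* still linked into the list? */
	unsigned long count;
};

/* Circular list with a sentinel head, loosely modelled on ctx->event_list. */
struct model_ctx {
	struct model_event head;
};

static void model_ctx_init(struct model_ctx *ctx)
{
	ctx->head.prev = ctx->head.next = &ctx->head;
}

/* Append an event at the tail of the context's list. */
static void model_ctx_add(struct model_ctx *ctx, struct model_event *ev)
{
	ev->prev = ctx->head.prev;
	ev->next = &ctx->head;
	ctx->head.prev->next = ev;
	ctx->head.prev = ev;
	ev->attached = true;
}

/* Unlink exactly once; a second call on the same event trips the assertion. */
static void model_detach(struct model_event *ev)
{
	assert(ev->attached);
	ev->prev->next = ev->next;
	ev->next->prev = ev->prev;
	ev->prev = ev->next = NULL;
	ev->attached = false;
}

/* Detach the event and fold its count into the parent, if there is one. */
static void model_exit_event(struct model_event *ev)
{
	model_detach(ev);
	if (ev->parent)
		ev->parent->count += ev->count;
}

/* Remove every event marked remove_on_exec; returns how many were removed. */
static int model_remove_on_exec(struct model_ctx *ctx)
{
	struct model_event *ev, *next;
	int removed = 0;

	for (ev = ctx->head.next; ev != &ctx->head; ev = next) {
		next = ev->next;	/* saved before ev may be unlinked */
		if (!ev->remove_on_exec)
			continue;
		model_exit_event(ev);
		removed++;
	}
	return removed;
}
```

In this model, calling model_detach() on an event and then
model_exit_event() on the same event would hit the assertion, which is
the userspace analogue of removing from the context first and then
having the exit path detach again.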

I'm still trying to make sense of the zoo of synchronisation mechanisms
at play here. No matter what I try, it seems I get stuck on the fact
that I can't cleanly "pause" the context to remove the events (warnings
in event_function()).

This is what I've been playing with to understand:

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 450ea9415ed7..c585cef284a0 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4195,6 +4195,88 @@ static void perf_event_enable_on_exec(int ctxn)
 		put_ctx(clone_ctx);
 }
 
+static void perf_remove_from_owner(struct perf_event *event);
+static void perf_event_exit_event(struct perf_event *child_event,
+				  struct perf_event_context *child_ctx,
+				  struct task_struct *child);
+
+/*
+ * Removes all events from the current task that have been marked
+ * remove-on-exec, and feeds their values back to parent events.
+ */
+static void perf_event_remove_on_exec(void)
+{
+	struct perf_event *event, *next;
+	int ctxn;
+
+	/*****************  BROKEN BROKEN BROKEN *****************/
+
+	for_each_task_context_nr(ctxn) {
+		struct perf_event_context *ctx;
+		bool removed = false;
+
+		ctx = perf_pin_task_context(current, ctxn);
+		if (!ctx)
+			continue;
+		mutex_lock(&ctx->mutex);
+
+		raw_spin_lock_irq(&ctx->lock);
+		/*
+		 * WIP: Ok, we will unschedule the context, _and_ tell everyone
+		 * still trying to use it that it's dead... even though it isn't.
+		 *
+		 * This can't be right...
+		 */
+		task_ctx_sched_out(__get_cpu_context(ctx), ctx, EVENT_ALL);
+		RCU_INIT_POINTER(current->perf_event_ctxp[ctxn], NULL);
+		WRITE_ONCE(ctx->task, TASK_TOMBSTONE);

This code here is obviously bogus, because it removes the context from
the task: we might still need it since this task is not dead yet.

What's the right way to pause the context to remove the events from it?

+		raw_spin_unlock_irq(&ctx->lock);
+
+		list_for_each_entry_safe(event, next, &ctx->event_list, event_entry) {
+			if (!event->attr.remove_on_exec)
+				continue;
+			removed = true;
+
+			if (!is_kernel_event(event))
+				perf_remove_from_owner(event);
+
+			/*
+			 * WIP: Want to free the event and feed back its values
+			 * to the parent (if any) ...
+			 */
+			perf_event_exit_event(event, ctx, current);
+		}
+

... need to schedule context back in here?

+
+		mutex_unlock(&ctx->mutex);
+		perf_unpin_context(ctx);
+		put_ctx(ctx);
+	}
+}
+
 struct perf_read_data {
 	struct perf_event *event;
 	bool group;
@@ -7553,6 +7635,8 @@ void perf_event_exec(void)
 				   true);
 	}
 	rcu_read_unlock();
+
+	perf_event_remove_on_exec();
 }
 

Thanks,
-- Marco
