linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Marco Elver <elver@google.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: alexander.shishkin@linux.intel.com, acme@kernel.org,
	mingo@redhat.com, jolsa@redhat.com, mark.rutland@arm.com,
	namhyung@kernel.org, tglx@linutronix.de, glider@google.com,
	viro@zeniv.linux.org.uk, arnd@arndb.de, christian@brauner.io,
	dvyukov@google.com, jannh@google.com, axboe@kernel.dk,
	mascasa@google.com, pcc@google.com, irogers@google.com,
	kasan-dev@googlegroups.com, linux-arch@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	x86@kernel.org, linux-kselftest@vger.kernel.org
Subject: Re: [PATCH RFC v2 3/8] perf/core: Add support for event removal on exec
Date: Mon, 22 Mar 2021 10:20:02 +0100	[thread overview]
Message-ID: <YFhhQgUzXLSTlcu0@elver.google.com> (raw)
In-Reply-To: <YFDbP3obvxn0SL4w@hirez.programming.kicks-ass.net>

On Tue, Mar 16, 2021 at 05:22PM +0100, Peter Zijlstra wrote:
> On Wed, Mar 10, 2021 at 11:41:34AM +0100, Marco Elver wrote:
> > Adds bit perf_event_attr::remove_on_exec, to support removing an event
> > from a task on exec.
> > 
> > This option supports the case where an event is supposed to be
> > process-wide only, and should not propagate beyond exec, to limit
> > monitoring to the original process image only.
> > 
> > Signed-off-by: Marco Elver <elver@google.com>
> 
> > +/*
> > + * Removes all events from the current task that have been marked
> > + * remove-on-exec, and feeds their values back to parent events.
> > + */
> > +static void perf_event_remove_on_exec(void)
> > +{
> > +	int ctxn;
> > +
> > +	for_each_task_context_nr(ctxn) {
> > +		struct perf_event_context *ctx;
> > +		struct perf_event *event, *next;
> > +
> > +		ctx = perf_pin_task_context(current, ctxn);
> > +		if (!ctx)
> > +			continue;
> > +		mutex_lock(&ctx->mutex);
> > +
> > +		list_for_each_entry_safe(event, next, &ctx->event_list, event_entry) {
> > +			if (!event->attr.remove_on_exec)
> > +				continue;
> > +
> > +			if (!is_kernel_event(event))
> > +				perf_remove_from_owner(event);
> > +			perf_remove_from_context(event, DETACH_GROUP);
> 
> There's a comment on this in perf_event_exit_event(), if this task
> happens to have the original event, then DETACH_GROUP will destroy the
> grouping.
> 
> I think this wants to be:
> 
> 			perf_remove_from_text(event,
> 					      child_event->parent ?  DETACH_GROUP : 0);
> 
> or something.
> 
> > +			/*
> > +			 * Remove the event and feed back its values to the
> > +			 * parent event.
> > +			 */
> > +			perf_event_exit_event(event, ctx, current);
> 
> Oooh, and here we call it... but it will do list_del_even() /
> perf_group_detach() *again*.
> 
> So the problem is that perf_event_exit_task_context() doesn't use
> remove_from_context(), but instead does task_ctx_sched_out() and then
> relies on the events not being active.
> 
> Whereas above you *DO* use remote_from_context(), but then
> perf_event_exit_event() will try and remove it more.

AFAIK, we want to deallocate the events and not just remove them, so
doing what perf_event_exit_event() is the right way forward? Or did you
have something else in mind?

I'm still trying to make sense of the zoo of synchronisation mechanisms
at play here. No matter what I try, it seems I get stuck on the fact
that I can't cleanly "pause" the context to remove the events (warnings
in event_function()).

This is what I've been playing with to understand:

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 450ea9415ed7..c585cef284a0 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4195,6 +4195,88 @@ static void perf_event_enable_on_exec(int ctxn)
 		put_ctx(clone_ctx);
 }
 
+static void perf_remove_from_owner(struct perf_event *event);
+static void perf_event_exit_event(struct perf_event *child_event,
+				  struct perf_event_context *child_ctx,
+				  struct task_struct *child);
+
+/*
+ * Removes all events from the current task that have been marked
+ * remove-on-exec, and feeds their values back to parent events.
+ */
+static void perf_event_remove_on_exec(void)
+{
+	struct perf_event *event, *next;
+	int ctxn;
+
+	/*****************  BROKEN BROKEN BROKEN *****************/
+
+	for_each_task_context_nr(ctxn) {
+		struct perf_event_context *ctx;
+		bool removed = false;
+
+		ctx = perf_pin_task_context(current, ctxn);
+		if (!ctx)
+			continue;
+		mutex_lock(&ctx->mutex);
+
+		raw_spin_lock_irq(&ctx->lock);
+		/*
+		 * WIP: Ok, we will unschedule the context, _and_ tell everyone
+		 * still trying to use that it's dead... even though it isn't.
+		 *
+		 * This can't be right...
+		 */
+		task_ctx_sched_out(__get_cpu_context(ctx), ctx, EVENT_ALL);
+		RCU_INIT_POINTER(current->perf_event_ctxp[ctxn], NULL);
+		WRITE_ONCE(ctx->task, TASK_TOMBSTONE);

This code here is obviously bogus, because it removes the context from
the task: we might still need it since this task is not dead yet.

What's the right way to pause the context to remove the events from it?

+		raw_spin_unlock_irq(&ctx->lock);
+
+		list_for_each_entry_safe(event, next, &ctx->event_list, event_entry) {
+			if (!event->attr.remove_on_exec)
+				continue;
+			removed = true;
+
+			if (!is_kernel_event(event))
+				perf_remove_from_owner(event);
+
+			/*
+			 * WIP: Want to free the event and feed back its values
+			 * to the parent (if any) ...
+			 */
+			perf_event_exit_event(event, ctx, current);
+		}
+

... need to schedule context back in here?

+
+		mutex_unlock(&ctx->mutex);
+		perf_unpin_context(ctx);
+		put_ctx(ctx);
+	}
+}
+
 struct perf_read_data {
 	struct perf_event *event;
 	bool group;
@@ -7553,6 +7635,8 @@ void perf_event_exec(void)
 				   true);
 	}
 	rcu_read_unlock();
+
+	perf_event_remove_on_exec();
 }
 

Thanks,
-- Marco

  reply	other threads:[~2021-03-22  9:21 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-10 10:41 [PATCH RFC v2 0/8] Add support for synchronous signals on perf events Marco Elver
2021-03-10 10:41 ` [PATCH RFC v2 1/8] perf/core: Apply PERF_EVENT_IOC_MODIFY_ATTRIBUTES to children Marco Elver
2021-03-10 10:41 ` [PATCH RFC v2 2/8] perf/core: Support only inheriting events if cloned with CLONE_THREAD Marco Elver
2021-03-10 10:41 ` [PATCH RFC v2 3/8] perf/core: Add support for event removal on exec Marco Elver
2021-03-10 10:47   ` Marco Elver
2021-03-16 16:22   ` Peter Zijlstra
2021-03-22  9:20     ` Marco Elver [this message]
2021-03-10 10:41 ` [PATCH RFC v2 4/8] signal: Introduce TRAP_PERF si_code and si_perf to siginfo Marco Elver
2021-03-10 10:41 ` [PATCH RFC v2 5/8] perf/core: Add support for SIGTRAP on perf events Marco Elver
2021-03-10 10:41 ` [PATCH RFC v2 6/8] perf/core: Add breakpoint information to siginfo on SIGTRAP Marco Elver
2021-03-10 10:41 ` [PATCH RFC v2 7/8] selftests/perf: Add kselftest for process-wide sigtrap handling Marco Elver
2021-03-10 10:41 ` [PATCH RFC v2 8/8] selftests/perf: Add kselftest for remove_on_exec Marco Elver
2021-03-22 13:24   ` Marco Elver
2021-03-22 16:42     ` Peter Zijlstra
2021-03-23  9:52       ` Marco Elver
2021-03-23 10:32         ` Peter Zijlstra
2021-03-23 10:41           ` Marco Elver
2021-03-23 12:08             ` Marco Elver
2021-03-23 14:45           ` Peter Zijlstra
2021-03-23 15:58             ` Marco Elver
2021-03-23 16:19               ` Peter Zijlstra
2021-03-23  3:10     ` Ian Rogers
2021-03-23  9:47       ` Marco Elver
2021-03-23 19:16         ` Marco Elver

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YFhhQgUzXLSTlcu0@elver.google.com \
    --to=elver@google.com \
    --cc=acme@kernel.org \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=arnd@arndb.de \
    --cc=axboe@kernel.dk \
    --cc=christian@brauner.io \
    --cc=dvyukov@google.com \
    --cc=glider@google.com \
    --cc=irogers@google.com \
    --cc=jannh@google.com \
    --cc=jolsa@redhat.com \
    --cc=kasan-dev@googlegroups.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mascasa@google.com \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=pcc@google.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=viro@zeniv.linux.org.uk \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).