From: Peter Zijlstra <peterz@infradead.org>
To: peterz@infradead.org, mingo@kernel.org,
alexander.shishkin@linux.intel.com, eranian@google.com
Cc: linux-kernel@vger.kernel.org, vince@deater.net,
dvyukov@google.com, andi@firstfloor.org, jolsa@redhat.com,
sasha.levin@oracle.com, oleg@redhat.com
Subject: [RFC][PATCH 5/7] perf: Fix cloning
Date: Fri, 19 Feb 2016 15:37:48 +0100 [thread overview]
Message-ID: <20160219144132.109477628@infradead.org> (raw)
In-Reply-To: 20160219143743.692339502@infradead.org
[-- Attachment #1: peterz-perf-more-fixes-5.patch --]
[-- Type: text/plain, Size: 4857 bytes --]
Alexander reported that when the 'original' context gets destroyed, no
new clones happen.
This can happen irrespective of the ctx switch optimization, any task
can die, even the parent, and we want to continue monitoring the task
hierarchy until we either close the event or no tasks are left in the
hierarchy.
perf_event_init_context() will attempt to pin the 'parent' context
during clone(). At that point current is the parent, and since current
cannot have exited while executing clone(), its context cannot have
passed through perf_event_exit_task_context(). Therefore
perf_pin_task_context() cannot observe ctx->task == TASK_TOMBSTONE.
However, since inherit_event() does:
if (parent_event->parent)
parent_event = parent_event->parent;
it looks at the 'original' event when it does: is_orphaned_event().
This can return true if the context that contains the this event has
passed through perf_event_exit_task_context(). And thus we'll fail to
clone the perf context.
Fix this by adding a new state: STATE_DEAD, which is set by
perf_release() to indicate that the filedesc (or kernel reference) is
dead and there are no observers for our data left.
Only for STATE_DEAD will is_orphaned_event() be true and inhibit
cloning.
STATE_EXIT is otherwise preserved such that is_event_hup() remains
functional and will report when the observed task hierarchy becomes
empty.
Fixes: c6e5b73242d2 ("perf: Synchronously clean up child events")
Reported-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
include/linux/perf_event.h | 1 +
kernel/events/core.c | 29 ++++++++++++++---------------
2 files changed, 15 insertions(+), 15 deletions(-)
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -397,6 +397,7 @@ struct pmu {
* enum perf_event_active_state - the states of a event
*/
enum perf_event_active_state {
+ PERF_EVENT_STATE_DEAD = -4,
PERF_EVENT_STATE_EXIT = -3,
PERF_EVENT_STATE_ERROR = -2,
PERF_EVENT_STATE_OFF = -1,
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1645,7 +1645,7 @@ static void perf_group_detach(struct per
static bool is_orphaned_event(struct perf_event *event)
{
- return event->state == PERF_EVENT_STATE_EXIT;
+ return event->state == PERF_EVENT_STATE_DEAD;
}
static inline int pmu_filter_match(struct perf_event *event)
@@ -1732,7 +1732,6 @@ group_sched_out(struct perf_event *group
}
#define DETACH_GROUP 0x01UL
-#define DETACH_STATE 0x02UL
/*
* Cross CPU call to remove a performance event
@@ -1752,8 +1751,6 @@ __perf_remove_from_context(struct perf_e
if (flags & DETACH_GROUP)
perf_group_detach(event);
list_del_event(event, ctx);
- if (flags & DETACH_STATE)
- event->state = PERF_EVENT_STATE_EXIT;
if (!ctx->nr_events && ctx->is_active) {
ctx->is_active = 0;
@@ -3776,22 +3773,24 @@ int perf_event_release_kernel(struct per
ctx = perf_event_ctx_lock(event);
WARN_ON_ONCE(ctx->parent_ctx);
- perf_remove_from_context(event, DETACH_GROUP | DETACH_STATE);
- perf_event_ctx_unlock(event, ctx);
+ perf_remove_from_context(event, DETACH_GROUP);
+ raw_spin_lock_irq(&ctx->lock);
/*
- * At this point we must have event->state == PERF_EVENT_STATE_EXIT,
- * either from the above perf_remove_from_context() or through
- * perf_event_exit_event().
+ * Mark this even as STATE_DEAD, there is no external reference to it
+ * anymore.
*
- * Therefore, anybody acquiring event->child_mutex after the below
- * loop _must_ also see this, most importantly inherit_event() which
- * will avoid placing more children on the list.
+ * Anybody acquiring event->child_mutex after the below loop _must_
+ * also see this, most importantly inherit_event() which will avoid
+ * placing more children on the list.
*
* Thus this guarantees that we will in fact observe and kill _ALL_
* child events.
*/
- WARN_ON_ONCE(event->state != PERF_EVENT_STATE_EXIT);
+ event->state = PERF_EVENT_STATE_DEAD;
+ raw_spin_unlock_irq(&ctx->lock);
+
+ perf_event_ctx_unlock(event, ctx);
again:
mutex_lock(&event->child_mutex);
@@ -4004,7 +4003,7 @@ static bool is_event_hup(struct perf_eve
{
bool no_children;
- if (event->state != PERF_EVENT_STATE_EXIT)
+ if (event->state > PERF_EVENT_STATE_EXIT)
return false;
mutex_lock(&event->child_mutex);
@@ -8731,7 +8730,7 @@ perf_event_exit_event(struct perf_event
if (parent_event)
perf_group_detach(child_event);
list_del_event(child_event, child_ctx);
- child_event->state = PERF_EVENT_STATE_EXIT; /* see perf_event_release_kernel() */
+ child_event->state = PERF_EVENT_STATE_EXIT; /* is_event_hup() */
raw_spin_unlock_irq(&child_ctx->lock);
/*
next prev parent reply other threads:[~2016-02-19 14:50 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-02-19 14:37 [RFC][PATCH 0/7] perf: more fuzzer inspired patches Peter Zijlstra
2016-02-19 14:37 ` [RFC][PATCH 1/7] perf: Close install vs exit race Peter Zijlstra
2016-02-19 14:37 ` [RFC][PATCH 2/7] perf: Do not double free Peter Zijlstra
2016-02-19 14:37 ` [RFC][PATCH 3/7] perf: Allow perf_release() with !event->ctx Peter Zijlstra
2016-02-19 14:37 ` [RFC][PATCH 4/7] perf: Fix scaling vs enable_on_exec Peter Zijlstra
2016-02-23 15:27 ` Peter Zijlstra
2016-02-23 15:48 ` Jiri Olsa
2016-02-23 16:35 ` Pratyush Anand
2016-02-23 17:47 ` Peter Zijlstra
2016-02-24 11:53 ` Peter Zijlstra
2016-02-24 14:02 ` Peter Zijlstra
2016-02-24 16:02 ` Peter Zijlstra
2016-02-25 4:07 ` Pratyush Anand
2016-02-23 21:44 ` Jiri Olsa
2016-02-26 2:25 ` Oleg Nesterov
2016-02-19 14:37 ` Peter Zijlstra [this message]
2016-02-19 14:37 ` [RFC][PATCH 6/7] perf: Fix race between event install and jump_labels Peter Zijlstra
2016-02-19 14:37 ` [RFC][PATCH 7/7] perf: Cure event->pending_disable race Peter Zijlstra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160219144132.109477628@infradead.org \
--to=peterz@infradead.org \
--cc=alexander.shishkin@linux.intel.com \
--cc=andi@firstfloor.org \
--cc=dvyukov@google.com \
--cc=eranian@google.com \
--cc=jolsa@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=oleg@redhat.com \
--cc=sasha.levin@oracle.com \
--cc=vince@deater.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).