public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] perf: Fix tracepoint events permissions check
@ 2014-07-11 11:56 Jiri Olsa
  2014-07-11 11:56 ` [PATCH 1/5] perf: Make perf_init_event function static Jiri Olsa
                   ` (6 more replies)
  0 siblings, 7 replies; 21+ messages in thread
From: Jiri Olsa @ 2014-07-11 11:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: Alexander Yarygin, Arnaldo Carvalho de Melo, Corey Ashford,
	Frederic Weisbecker, Ingo Molnar, Paul Mackerras, Peter Zijlstra

hi,
sending fix for bug reported by Alexander Yarygin in here:
  http://marc.info/?l=linux-kernel&m=140475133707722&w=2

The main problem was, that the event_init tracepoint callback
checked permission of the 'current' task instead of the event
owner task.

While this is ok for perf_event_open syscall check, it is wrong
once event_init is called during fork to create child events.
In this case the permission of the forked task is checked instead
of the owner task of the parent event.

Changing tracepoint permission code to check event's owner task,
plus some other changes I needed for this.

thanks,
jirka


---
Jiri Olsa (5):
      perf: Make perf_init_event function static
      perf: Destroy event's children on task exit
      perf: Initialize owner before calling event_init callback
      perf: Move event owner retrieval into perf_event_get_owner
      perf: Check event's owner permission in tracepoint init callback

 include/linux/perf_event.h      |  1 +
 kernel/events/core.c            | 72 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------
 kernel/trace/trace_event_perf.c | 19 +++++++++++++++++--
 3 files changed, 73 insertions(+), 19 deletions(-)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 1/5] perf: Make perf_init_event function static
  2014-07-11 11:56 [PATCH 0/5] perf: Fix tracepoint events permissions check Jiri Olsa
@ 2014-07-11 11:56 ` Jiri Olsa
  2014-07-11 11:56 ` [PATCH 2/5] perf: Destroy event's children on task exit Jiri Olsa
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 21+ messages in thread
From: Jiri Olsa @ 2014-07-11 11:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jiri Olsa, Alexander Yarygin, Arnaldo Carvalho de Melo,
	Corey Ashford, Frederic Weisbecker, Ingo Molnar, Paul Mackerras,
	Peter Zijlstra, Jiri Olsa

From: Jiri Olsa <jolsa@redhat.com>

No need for this one to be global.

Cc: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 kernel/events/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 67e3b9c..71a56ae 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6660,7 +6660,7 @@ void perf_pmu_unregister(struct pmu *pmu)
 }
 EXPORT_SYMBOL_GPL(perf_pmu_unregister);
 
-struct pmu *perf_init_event(struct perf_event *event)
+static struct pmu *perf_init_event(struct perf_event *event)
 {
 	struct pmu *pmu = NULL;
 	int idx;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 2/5] perf: Destroy event's children on task exit
  2014-07-11 11:56 [PATCH 0/5] perf: Fix tracepoint events permissions check Jiri Olsa
  2014-07-11 11:56 ` [PATCH 1/5] perf: Make perf_init_event function static Jiri Olsa
@ 2014-07-11 11:56 ` Jiri Olsa
  2014-07-11 13:23   ` Peter Zijlstra
  2014-07-14 11:18   ` Peter Zijlstra
  2014-07-11 11:56 ` [PATCH 3/5] perf: Initialize owner before calling event_init callback Jiri Olsa
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 21+ messages in thread
From: Jiri Olsa @ 2014-07-11 11:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jiri Olsa, Alexander Yarygin, Arnaldo Carvalho de Melo,
	Corey Ashford, Frederic Weisbecker, Ingo Molnar, Paul Mackerras,
	Peter Zijlstra, Jiri Olsa

From: Jiri Olsa <jolsa@redhat.com>

When task exits we close:
  1) all events that are installed in task
  2) all events owned by task (via file descriptor)

But we don't close children events of 2) events. Those children
events stay until the child task exits and are useless with the
parent being gone, because we have no way to get to values any
more.

Plus if the event stays installed in task even with the owner task
gone, it runs the perf callback any time the task forks, for no
real reason.

Closing all children events events when the owner task of the
parent event is closed.

Cc: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 kernel/events/core.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 71a56ae..37797dd 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7535,6 +7535,32 @@ static void perf_event_exit_task_context(struct task_struct *child, int ctxn)
 	put_ctx(child_ctx);
 }
 
+static void perf_event_exit_children(struct perf_event *parent)
+{
+	struct perf_event *child, *tmp;
+
+	mutex_lock(&parent->child_mutex);
+	list_for_each_entry_safe(child, tmp, &parent->child_list,
+				 child_list) {
+		struct perf_event_context *child_ctx = child->ctx;
+
+		/*
+		 * Child events got removed from child_list under
+		 * child_mutex and then freed. So it's safe to access
+		 * childs context in here, because the child holds
+		 * context ref.
+		 */
+		mutex_lock(&child_ctx->mutex);
+		perf_remove_from_context(child, true);
+		mutex_unlock(&child_ctx->mutex);
+
+		list_del_init(&child->child_list);
+		put_event(parent);
+		free_event(child);
+	}
+	mutex_unlock(&parent->child_mutex);
+}
+
 /*
  * When a child task exits, feed back event values to parent events.
  */
@@ -7555,6 +7581,7 @@ void perf_event_exit_task(struct task_struct *child)
 		 */
 		smp_wmb();
 		event->owner = NULL;
+		perf_event_exit_children(event);
 	}
 	mutex_unlock(&child->perf_event_mutex);
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 3/5] perf: Initialize owner before calling event_init callback
  2014-07-11 11:56 [PATCH 0/5] perf: Fix tracepoint events permissions check Jiri Olsa
  2014-07-11 11:56 ` [PATCH 1/5] perf: Make perf_init_event function static Jiri Olsa
  2014-07-11 11:56 ` [PATCH 2/5] perf: Destroy event's children on task exit Jiri Olsa
@ 2014-07-11 11:56 ` Jiri Olsa
  2014-07-11 11:56 ` [PATCH 4/5] perf: Move event owner retrieval into perf_event_get_owner Jiri Olsa
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 21+ messages in thread
From: Jiri Olsa @ 2014-07-11 11:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jiri Olsa, Alexander Yarygin, Arnaldo Carvalho de Melo,
	Corey Ashford, Frederic Weisbecker, Ingo Molnar, Paul Mackerras,
	Peter Zijlstra, Jiri Olsa

From: Jiri Olsa <jolsa@redhat.com>

To be able to check the owner task permission in event_init
PMU callback, we need to set the owner before it is called.

Cc: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 kernel/events/core.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 37797dd..a36ebfe 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6751,6 +6751,7 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 		 struct task_struct *task,
 		 struct perf_event *group_leader,
 		 struct perf_event *parent_event,
+		 struct task_struct *owner,
 		 perf_overflow_handler_t overflow_handler,
 		 void *context)
 {
@@ -6829,6 +6830,9 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 
 	perf_event__state_init(event);
 
+	if (owner)
+		event->owner = owner;
+
 	pmu = NULL;
 
 	hwc = &event->hw;
@@ -7141,7 +7145,7 @@ SYSCALL_DEFINE5(perf_event_open,
 	get_online_cpus();
 
 	event = perf_event_alloc(&attr, cpu, task, group_leader, NULL,
-				 NULL, NULL);
+				 current, NULL, NULL);
 	if (IS_ERR(event)) {
 		err = PTR_ERR(event);
 		goto err_cpus;
@@ -7293,8 +7297,6 @@ SYSCALL_DEFINE5(perf_event_open,
 
 	put_online_cpus();
 
-	event->owner = current;
-
 	mutex_lock(&current->perf_event_mutex);
 	list_add_tail(&event->owner_entry, &current->perf_event_list);
 	mutex_unlock(&current->perf_event_mutex);
@@ -7353,7 +7355,7 @@ perf_event_create_kernel_counter(struct perf_event_attr *attr, int cpu,
 	 * Get the target context (task or percpu):
 	 */
 
-	event = perf_event_alloc(attr, cpu, task, NULL, NULL,
+	event = perf_event_alloc(attr, cpu, task, NULL, NULL, NULL,
 				 overflow_handler, context);
 	if (IS_ERR(event)) {
 		err = PTR_ERR(event);
@@ -7674,11 +7676,9 @@ inherit_event(struct perf_event *parent_event,
 	if (parent_event->parent)
 		parent_event = parent_event->parent;
 
-	child_event = perf_event_alloc(&parent_event->attr,
-					   parent_event->cpu,
-					   child,
-					   group_leader, parent_event,
-				           NULL, NULL);
+	child_event = perf_event_alloc(&parent_event->attr, parent_event->cpu,
+				       child, group_leader, parent_event,
+				       NULL, NULL, NULL);
 	if (IS_ERR(child_event))
 		return child_event;
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 4/5] perf: Move event owner retrieval into perf_event_get_owner
  2014-07-11 11:56 [PATCH 0/5] perf: Fix tracepoint events permissions check Jiri Olsa
                   ` (2 preceding siblings ...)
  2014-07-11 11:56 ` [PATCH 3/5] perf: Initialize owner before calling event_init callback Jiri Olsa
@ 2014-07-11 11:56 ` Jiri Olsa
  2014-07-11 11:56 ` [PATCH 5/5] perf: Check event's owner permission in tracepoint init callback Jiri Olsa
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 21+ messages in thread
From: Jiri Olsa @ 2014-07-11 11:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jiri Olsa, Alexander Yarygin, Arnaldo Carvalho de Melo,
	Corey Ashford, Frederic Weisbecker, Ingo Molnar, Paul Mackerras,
	Peter Zijlstra

Moving the code of getting safely event's owner into
perf_event_get_owner function, so it could be used
in following change.

Cc: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 include/linux/perf_event.h |  1 +
 kernel/events/core.c       | 23 +++++++++++++++--------
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 707617a..ec6ef94 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -784,6 +784,7 @@ extern void perf_event_enable(struct perf_event *event);
 extern void perf_event_disable(struct perf_event *event);
 extern int __perf_event_disable(void *info);
 extern void perf_event_task_tick(void);
+extern struct task_struct *perf_event_get_owner(struct perf_event *event);
 #else /* !CONFIG_PERF_EVENTS: */
 static inline void
 perf_event_task_sched_in(struct task_struct *prev,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index a36ebfe..faba9e0 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3311,17 +3311,10 @@ static void free_event(struct perf_event *event)
 	_free_event(event);
 }
 
-/*
- * Called when the last reference to the file is gone.
- */
-static void put_event(struct perf_event *event)
+struct task_struct *perf_event_get_owner(struct perf_event *event)
 {
-	struct perf_event_context *ctx = event->ctx;
 	struct task_struct *owner;
 
-	if (!atomic_long_dec_and_test(&event->refcount))
-		return;
-
 	rcu_read_lock();
 	owner = ACCESS_ONCE(event->owner);
 	/*
@@ -3340,7 +3333,21 @@ static void put_event(struct perf_event *event)
 		get_task_struct(owner);
 	}
 	rcu_read_unlock();
+	return owner;
+}
+
+/*
+ * Called when the last reference to the file is gone.
+ */
+static void put_event(struct perf_event *event)
+{
+	struct perf_event_context *ctx = event->ctx;
+	struct task_struct *owner;
+
+	if (!atomic_long_dec_and_test(&event->refcount))
+		return;
 
+	owner = perf_event_get_owner(event);
 	if (owner) {
 		mutex_lock(&owner->perf_event_mutex);
 		/*
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 5/5] perf: Check event's owner permission in tracepoint init callback
  2014-07-11 11:56 [PATCH 0/5] perf: Fix tracepoint events permissions check Jiri Olsa
                   ` (3 preceding siblings ...)
  2014-07-11 11:56 ` [PATCH 4/5] perf: Move event owner retrieval into perf_event_get_owner Jiri Olsa
@ 2014-07-11 11:56 ` Jiri Olsa
  2014-07-11 12:02 ` [PATCH 0/5] perf: Fix tracepoint events permissions check Jiri Olsa
  2014-07-28  8:28 ` [tip:perf/core] perf: Check permission only for parent tracepoint event tip-bot for Jiri Olsa
  6 siblings, 0 replies; 21+ messages in thread
From: Jiri Olsa @ 2014-07-11 11:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jiri Olsa, Alexander Yarygin, Arnaldo Carvalho de Melo,
	Corey Ashford, Frederic Weisbecker, Ingo Molnar, Paul Mackerras,
	Peter Zijlstra

Checking event's owner task permissions instead of 'current'
process' ones. This way we get proper check when this is
called from fork callback, where 'current' process does not
represent the owner.

Checking parent event owner task in case of child event.

Reported-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Cc: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 kernel/events/core.c            |  4 ++++
 kernel/trace/trace_event_perf.c | 19 +++++++++++++++++--
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index faba9e0..c478828 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3315,6 +3315,10 @@ struct task_struct *perf_event_get_owner(struct perf_event *event)
 {
 	struct task_struct *owner;
 
+	/* Only the parent holds owner task. */
+	if (event->parent)
+		event = event->parent;
+
 	rcu_read_lock();
 	owner = ACCESS_ONCE(event->owner);
 	/*
diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index 5d12bb4..35b76ec 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -21,9 +21,24 @@ typedef typeof(unsigned long [PERF_MAX_TRACE_SIZE / sizeof(unsigned long)])
 /* Count the events in use (per event id, not per instance) */
 static int	total_ref_count;
 
+static bool is_owner_admin(struct perf_event *event)
+{
+	struct task_struct *owner = perf_event_get_owner(event);
+	bool admin = false;
+
+	if (owner) {
+		admin = has_capability(owner, CAP_SYS_ADMIN);
+		put_task_struct(owner);
+	}
+
+	return admin;
+}
+
 static int perf_trace_event_perm(struct ftrace_event_call *tp_event,
 				 struct perf_event *p_event)
 {
+	bool admin = is_owner_admin(p_event);
+
 	if (tp_event->perf_perm) {
 		int ret = tp_event->perf_perm(tp_event, p_event);
 		if (ret)
@@ -32,7 +47,7 @@ static int perf_trace_event_perm(struct ftrace_event_call *tp_event,
 
 	/* The ftrace function trace is allowed only for root. */
 	if (ftrace_event_is_function(tp_event)) {
-		if (perf_paranoid_tracepoint_raw() && !capable(CAP_SYS_ADMIN))
+		if (perf_paranoid_tracepoint_raw() && !admin)
 			return -EPERM;
 
 		/*
@@ -65,7 +80,7 @@ static int perf_trace_event_perm(struct ftrace_event_call *tp_event,
 	 * ...otherwise raw tracepoint data can be a severe data leak,
 	 * only allow root to have these.
 	 */
-	if (perf_paranoid_tracepoint_raw() && !capable(CAP_SYS_ADMIN))
+	if (perf_paranoid_tracepoint_raw() && !admin)
 		return -EPERM;
 
 	return 0;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/5] perf: Fix tracepoint events permissions check
  2014-07-11 11:56 [PATCH 0/5] perf: Fix tracepoint events permissions check Jiri Olsa
                   ` (4 preceding siblings ...)
  2014-07-11 11:56 ` [PATCH 5/5] perf: Check event's owner permission in tracepoint init callback Jiri Olsa
@ 2014-07-11 12:02 ` Jiri Olsa
  2014-07-28  8:28 ` [tip:perf/core] perf: Check permission only for parent tracepoint event tip-bot for Jiri Olsa
  6 siblings, 0 replies; 21+ messages in thread
From: Jiri Olsa @ 2014-07-11 12:02 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Alexander Yarygin, Arnaldo Carvalho de Melo,
	Corey Ashford, Frederic Weisbecker, Ingo Molnar, Paul Mackerras,
	Peter Zijlstra

On Fri, Jul 11, 2014 at 01:56:17PM +0200, Jiri Olsa wrote:
> hi,
> sending fix for bug reported by Alexander Yarygin in here:
>   http://marc.info/?l=linux-kernel&m=140475133707722&w=2
> 
> The main problem was, that the event_init tracepoint callback
> checked permission of the 'current' task instead of the event
> owner task.
> 
> While this is ok for perf_event_open syscall check, it is wrong
> once event_init is called during fork to create child events.
> In this case the permission of the forked task is checked instead
> of the owner task of the parent event.

also available in here:
  git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
  perf/core_perm

jirka

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/5] perf: Destroy event's children on task exit
  2014-07-11 11:56 ` [PATCH 2/5] perf: Destroy event's children on task exit Jiri Olsa
@ 2014-07-11 13:23   ` Peter Zijlstra
  2014-07-11 13:31     ` Jiri Olsa
  2014-07-16 12:20     ` Ingo Molnar
  2014-07-14 11:18   ` Peter Zijlstra
  1 sibling, 2 replies; 21+ messages in thread
From: Peter Zijlstra @ 2014-07-11 13:23 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Jiri Olsa, Alexander Yarygin,
	Arnaldo Carvalho de Melo, Corey Ashford, Frederic Weisbecker,
	Ingo Molnar, Paul Mackerras

On Fri, Jul 11, 2014 at 01:56:19PM +0200, Jiri Olsa wrote:
> From: Jiri Olsa <jolsa@redhat.com>
> 
> When task exits we close:
>   1) all events that are installed in task
>   2) all events owned by task (via file descriptor)
> 
> But we don't close children events of 2) events. Those children
> events stay until the child task exits and are useless with the
> parent being gone, because we have no way to get to values any
> more.
> 
> Plus if the event stays installed in task even with the owner task
> gone, it runs the perf callback any time the task forks, for no
> real reason.
> 
> Closing all children events events when the owner task of the
> parent event is closed.

So I _think_ the reason we didn't do this is because this is potentially
very expensive and keeping them around isn't too much bother, they'll
die eventually.

But I can't really remember. Ingo any recollections / opinions?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/5] perf: Destroy event's children on task exit
  2014-07-11 13:23   ` Peter Zijlstra
@ 2014-07-11 13:31     ` Jiri Olsa
  2014-07-16 12:20     ` Ingo Molnar
  1 sibling, 0 replies; 21+ messages in thread
From: Jiri Olsa @ 2014-07-11 13:31 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jiri Olsa, linux-kernel, Alexander Yarygin,
	Arnaldo Carvalho de Melo, Corey Ashford, Frederic Weisbecker,
	Ingo Molnar, Paul Mackerras

On Fri, Jul 11, 2014 at 03:23:09PM +0200, Peter Zijlstra wrote:
> On Fri, Jul 11, 2014 at 01:56:19PM +0200, Jiri Olsa wrote:
> > From: Jiri Olsa <jolsa@redhat.com>
> > 
> > When task exits we close:
> >   1) all events that are installed in task
> >   2) all events owned by task (via file descriptor)
> > 
> > But we don't close children events of 2) events. Those children
> > events stay until the child task exits and are useless with the
> > parent being gone, because we have no way to get to values any
> > more.
> > 
> > Plus if the event stays installed in task even with the owner task
> > gone, it runs the perf callback any time the task forks, for no
> > real reason.
> > 
> > Closing all children events events when the owner task of the
> > parent event is closed.
> 
> So I _think_ the reason we didn't do this is because this is potentially
> very expensive and keeping them around isn't too much bother, they'll
> die eventually.

yep, they will.. but until then the task they are installed
in will continue having them and clone them on each fork

also for each such fork there's the permission check for inherited
tracepoint event, which we canot make without the owner task

so I ended up with closing all the children

jirka

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/5] perf: Destroy event's children on task exit
  2014-07-11 11:56 ` [PATCH 2/5] perf: Destroy event's children on task exit Jiri Olsa
  2014-07-11 13:23   ` Peter Zijlstra
@ 2014-07-14 11:18   ` Peter Zijlstra
  2014-07-14 11:43     ` Jiri Olsa
  2014-07-14 20:18     ` Jiri Olsa
  1 sibling, 2 replies; 21+ messages in thread
From: Peter Zijlstra @ 2014-07-14 11:18 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Jiri Olsa, Alexander Yarygin,
	Arnaldo Carvalho de Melo, Corey Ashford, Frederic Weisbecker,
	Ingo Molnar, Paul Mackerras

[-- Attachment #1: Type: text/plain, Size: 2962 bytes --]

On Fri, Jul 11, 2014 at 01:56:19PM +0200, Jiri Olsa wrote:
> From: Jiri Olsa <jolsa@redhat.com>
> 
> When task exits we close:
>   1) all events that are installed in task
>   2) all events owned by task (via file descriptor)
> 
> But we don't close children events of 2) events. Those children
> events stay until the child task exits and are useless with the
> parent being gone, because we have no way to get to values any
> more.
> 
> Plus if the event stays installed in task even with the owner task
> gone, it runs the perf callback any time the task forks, for no
> real reason.
> 
> Closing all children events events when the owner task of the
> parent event is closed.

Do we need this for the other patches, or is this an unrelated change? 

> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  kernel/events/core.c | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 71a56ae..37797dd 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -7535,6 +7535,32 @@ static void perf_event_exit_task_context(struct task_struct *child, int ctxn)
>  	put_ctx(child_ctx);
>  }
>  
> +static void perf_event_exit_children(struct perf_event *parent)
> +{
> +	struct perf_event *child, *tmp;
> +
> +	mutex_lock(&parent->child_mutex);
> +	list_for_each_entry_safe(child, tmp, &parent->child_list,
> +				 child_list) {
> +		struct perf_event_context *child_ctx = child->ctx;
> +
> +		/*
> +		 * Child events got removed from child_list under
> +		 * child_mutex and then freed. So it's safe to access
> +		 * childs context in here, because the child holds
> +		 * context ref.
> +		 */
> +		mutex_lock(&child_ctx->mutex);
> +		perf_remove_from_context(child, true);
> +		mutex_unlock(&child_ctx->mutex);
> +
> +		list_del_init(&child->child_list);
> +		put_event(parent);
> +		free_event(child);
> +	}
> +	mutex_unlock(&parent->child_mutex);
> +}
> +
>  /*
>   * When a child task exits, feed back event values to parent events.
>   */
> @@ -7555,6 +7581,7 @@ void perf_event_exit_task(struct task_struct *child)
>  		 */
>  		smp_wmb();
>  		event->owner = NULL;
> +		perf_event_exit_children(event);
>  	}
>  	mutex_unlock(&child->perf_event_mutex);

I don't think this is correct, perf_event_init_context() can come in
concurrently and the first place it runs into ->child_mutex is after its
already allocated and created the (first) child event.

This also means that it not only makes exit() very slow for all tasks
that own events -- surmountable I suppose, but also makes clone()/fork()
very slow for all decedents -- not so good.

Now I suppose we can fix this by testing event->owner in
perf_event_init_context() or somesuch, thereby bypassing event creation
when the owner is tearing things down.

But it would be good if we can do all this in a separate series.


[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/5] perf: Destroy event's children on task exit
  2014-07-14 11:18   ` Peter Zijlstra
@ 2014-07-14 11:43     ` Jiri Olsa
  2014-07-14 13:02       ` Peter Zijlstra
  2014-07-14 20:18     ` Jiri Olsa
  1 sibling, 1 reply; 21+ messages in thread
From: Jiri Olsa @ 2014-07-14 11:43 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jiri Olsa, linux-kernel, Alexander Yarygin,
	Arnaldo Carvalho de Melo, Corey Ashford, Frederic Weisbecker,
	Ingo Molnar, Paul Mackerras

On Mon, Jul 14, 2014 at 01:18:33PM +0200, Peter Zijlstra wrote:
> On Fri, Jul 11, 2014 at 01:56:19PM +0200, Jiri Olsa wrote:
> > From: Jiri Olsa <jolsa@redhat.com>
> > 
> > When task exits we close:
> >   1) all events that are installed in task
> >   2) all events owned by task (via file descriptor)
> > 
> > But we don't close children events of 2) events. Those children
> > events stay until the child task exits and are useless with the
> > parent being gone, because we have no way to get to values any
> > more.
> > 
> > Plus if the event stays installed in task even with the owner task
> > gone, it runs the perf callback any time the task forks, for no
> > real reason.
> > 
> > Closing all children events events when the owner task of the
> > parent event is closed.
> 
> Do we need this for the other patches, or is this an unrelated change? 

if we dont do it, the event stays installed without owner and
perf fork callback will be called and fail on permission checking
(because of owner == NULL) ... so yes, I think it's needed

jirka

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/5] perf: Destroy event's children on task exit
  2014-07-14 11:43     ` Jiri Olsa
@ 2014-07-14 13:02       ` Peter Zijlstra
  2014-07-14 13:22         ` Jiri Olsa
  0 siblings, 1 reply; 21+ messages in thread
From: Peter Zijlstra @ 2014-07-14 13:02 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, linux-kernel, Alexander Yarygin,
	Arnaldo Carvalho de Melo, Corey Ashford, Frederic Weisbecker,
	Ingo Molnar, Paul Mackerras

[-- Attachment #1: Type: text/plain, Size: 1429 bytes --]

On Mon, Jul 14, 2014 at 01:43:59PM +0200, Jiri Olsa wrote:
> On Mon, Jul 14, 2014 at 01:18:33PM +0200, Peter Zijlstra wrote:
> > On Fri, Jul 11, 2014 at 01:56:19PM +0200, Jiri Olsa wrote:
> > > From: Jiri Olsa <jolsa@redhat.com>
> > > 
> > > When task exits we close:
> > >   1) all events that are installed in task
> > >   2) all events owned by task (via file descriptor)
> > > 
> > > But we don't close children events of 2) events. Those children
> > > events stay until the child task exits and are useless with the
> > > parent being gone, because we have no way to get to values any
> > > more.
> > > 
> > > Plus if the event stays installed in task even with the owner task
> > > gone, it runs the perf callback any time the task forks, for no
> > > real reason.
> > > 
> > > Closing all children events events when the owner task of the
> > > parent event is closed.
> > 
> > Do we need this for the other patches, or is this an unrelated change? 
> 
> if we dont do it, the event stays installed without owner and
> perf fork callback will be called and fail on permission checking
> (because of owner == NULL) ... so yes, I think it's needed

Oh, right. Alternatively, we don't need permission checking for inherits
at all, if we're allowed to create the initial event, we should be good
for inherits.

And I suppose install_exec_creds() already does the right thing when we
may not.

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/5] perf: Destroy event's children on task exit
  2014-07-14 13:02       ` Peter Zijlstra
@ 2014-07-14 13:22         ` Jiri Olsa
  2014-07-14 13:35           ` Peter Zijlstra
  0 siblings, 1 reply; 21+ messages in thread
From: Jiri Olsa @ 2014-07-14 13:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jiri Olsa, linux-kernel, Alexander Yarygin,
	Arnaldo Carvalho de Melo, Corey Ashford, Frederic Weisbecker,
	Ingo Molnar, Paul Mackerras

On Mon, Jul 14, 2014 at 03:02:37PM +0200, Peter Zijlstra wrote:
> On Mon, Jul 14, 2014 at 01:43:59PM +0200, Jiri Olsa wrote:
> > On Mon, Jul 14, 2014 at 01:18:33PM +0200, Peter Zijlstra wrote:
> > > On Fri, Jul 11, 2014 at 01:56:19PM +0200, Jiri Olsa wrote:
> > > > From: Jiri Olsa <jolsa@redhat.com>
> > > > 
> > > > When task exits we close:
> > > >   1) all events that are installed in task
> > > >   2) all events owned by task (via file descriptor)
> > > > 
> > > > But we don't close children events of 2) events. Those children
> > > > events stay until the child task exits and are useless with the
> > > > parent being gone, because we have no way to get to values any
> > > > more.
> > > > 
> > > > Plus if the event stays installed in task even with the owner task
> > > > gone, it runs the perf callback any time the task forks, for no
> > > > real reason.
> > > > 
> > > > Closing all children events events when the owner task of the
> > > > parent event is closed.
> > > 
> > > Do we need this for the other patches, or is this an unrelated change? 
> > 
> > if we dont do it, the event stays installed without owner and
> > perf fork callback will be called and fail on permission checking
> > (because of owner == NULL) ... so yes, I think it's needed
> 
> Oh, right. Alternatively, we don't need permission checking for inherits
> at all, if we're allowed to create the initial event, we should be good
> for inherits.

I could adress that in follow up patch.. or you want this instead
of this one? IMO we should close those events anyway..

jirka

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/5] perf: Destroy event's children on task exit
  2014-07-14 13:22         ` Jiri Olsa
@ 2014-07-14 13:35           ` Peter Zijlstra
  2014-07-14 14:21             ` Jiri Olsa
  2014-07-16 12:14             ` Jiri Olsa
  0 siblings, 2 replies; 21+ messages in thread
From: Peter Zijlstra @ 2014-07-14 13:35 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, linux-kernel, Alexander Yarygin,
	Arnaldo Carvalho de Melo, Corey Ashford, Frederic Weisbecker,
	Ingo Molnar, Paul Mackerras

[-- Attachment #1: Type: text/plain, Size: 1097 bytes --]

On Mon, Jul 14, 2014 at 03:22:23PM +0200, Jiri Olsa wrote:
> > > if we dont do it, the event stays installed without owner and
> > > perf fork callback will be called and fail on permission checking
> > > (because of owner == NULL) ... so yes, I think it's needed
> > 
> > Oh, right. Alternatively, we don't need permission checking for inherits
> > at all, if we're allowed to create the initial event, we should be good
> > for inherits.
> 
> I could adress that in follow up patch.. or you want this instead
> of this one? IMO we should close those events anyway..

I tend to agree that closing them all is nicer. But we need to be
careful while doing it so as not to make the clone/fork path block on
it.

I _think_ it might be best to separate these two issues for the moment,
so cure the reported problem by avoiding the permission check for
inherited events -- IFF you agree with the previous argument that
install_exec_creds() should be sufficient.

And then so a patch playing games with perf_event_init_context()
(clone/fork) vs perf_event_exit_task() (exit).



[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/5] perf: Destroy event's children on task exit
  2014-07-14 13:35           ` Peter Zijlstra
@ 2014-07-14 14:21             ` Jiri Olsa
  2014-07-16 12:14             ` Jiri Olsa
  1 sibling, 0 replies; 21+ messages in thread
From: Jiri Olsa @ 2014-07-14 14:21 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jiri Olsa, linux-kernel, Alexander Yarygin,
	Arnaldo Carvalho de Melo, Corey Ashford, Frederic Weisbecker,
	Ingo Molnar, Paul Mackerras

On Mon, Jul 14, 2014 at 03:35:42PM +0200, Peter Zijlstra wrote:
> On Mon, Jul 14, 2014 at 03:22:23PM +0200, Jiri Olsa wrote:
> > > > if we dont do it, the event stays installed without owner and
> > > > perf fork callback will be called and fail on permission checking
> > > > (because of owner == NULL) ... so yes, I think it's needed
> > > 
> > > Oh, right. Alternatively, we don't need permission checking for inherits
> > > at all, if we're allowed to create the initial event, we should be good
> > > for inherits.
> > 
> > I could adress that in follow up patch.. or you want this instead
> > of this one? IMO we should close those events anyway..
> 
> I tend to agree that closing them all is nicer. But we need to be
> careful while doing it so as not to make the clone/fork path block on
> it.
> 
> I _think_ it might be best to separate these two issues for the moment,
> so cure the reported problem by avoiding the permission check for
> inherited events -- IFF you agree with the previous argument that
> install_exec_creds() should be sufficient.
> 
> And then so a patch playing games with perf_event_init_context()
> (clone/fork) vs perf_event_exit_task() (exit).

ook, will do

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/5] perf: Destroy event's children on task exit
  2014-07-14 11:18   ` Peter Zijlstra
  2014-07-14 11:43     ` Jiri Olsa
@ 2014-07-14 20:18     ` Jiri Olsa
  2014-07-15  9:11       ` Peter Zijlstra
  1 sibling, 1 reply; 21+ messages in thread
From: Jiri Olsa @ 2014-07-14 20:18 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jiri Olsa, linux-kernel, Alexander Yarygin,
	Arnaldo Carvalho de Melo, Corey Ashford, Frederic Weisbecker,
	Ingo Molnar, Paul Mackerras

On Mon, Jul 14, 2014 at 01:18:33PM +0200, Peter Zijlstra wrote:
> On Fri, Jul 11, 2014 at 01:56:19PM +0200, Jiri Olsa wrote:
> > From: Jiri Olsa <jolsa@redhat.com>
> > 
> > When task exits we close:
> >   1) all events that are installed in task
> >   2) all events owned by task (via file descriptor)
> > 
> > But we don't close children events of 2) events. Those children
> > events stay until the child task exits and are useless with the
> > parent being gone, because we have no way to get to values any
> > more.
> > 
> > Plus if the event stays installed in task even with the owner task
> > gone, it runs the perf callback any time the task forks, for no
> > real reason.
> > 
> > Closing all children events events when the owner task of the
> > parent event is closed.
> 
> Do we need this for the other patches, or is this an unrelated change? 
> 
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  kernel/events/core.c | 27 +++++++++++++++++++++++++++
> >  1 file changed, 27 insertions(+)
> > 
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index 71a56ae..37797dd 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -7535,6 +7535,32 @@ static void perf_event_exit_task_context(struct task_struct *child, int ctxn)
> >  	put_ctx(child_ctx);
> >  }
> >  
> > +static void perf_event_exit_children(struct perf_event *parent)
> > +{
> > +	struct perf_event *child, *tmp;
> > +
> > +	mutex_lock(&parent->child_mutex);
> > +	list_for_each_entry_safe(child, tmp, &parent->child_list,
> > +				 child_list) {
> > +		struct perf_event_context *child_ctx = child->ctx;
> > +
> > +		/*
> > +		 * Child events got removed from child_list under
> > +		 * child_mutex and then freed. So it's safe to access
> > +		 * childs context in here, because the child holds
> > +		 * context ref.
> > +		 */
> > +		mutex_lock(&child_ctx->mutex);
> > +		perf_remove_from_context(child, true);
> > +		mutex_unlock(&child_ctx->mutex);
> > +
> > +		list_del_init(&child->child_list);
> > +		put_event(parent);
> > +		free_event(child);
> > +	}
> > +	mutex_unlock(&parent->child_mutex);
> > +}
> > +
> >  /*
> >   * When a child task exits, feed back event values to parent events.
> >   */
> > @@ -7555,6 +7581,7 @@ void perf_event_exit_task(struct task_struct *child)
> >  		 */
> >  		smp_wmb();
> >  		event->owner = NULL;
> > +		perf_event_exit_children(event);
> >  	}
> >  	mutex_unlock(&child->perf_event_mutex);
> 
> I don't think this is correct, perf_event_init_context() can come in
> concurrently and the first place it runs into ->child_mutex is after its
> already allocated and created the (first) child event.

just noticed this.. I'm working on the other version we decide, but FWIW
there's also mutex_lock(&child_ctx->mutex); before removing the context,
that should protect it against perf_event_init_context call

jirka

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/5] perf: Destroy event's children on task exit
  2014-07-14 20:18     ` Jiri Olsa
@ 2014-07-15  9:11       ` Peter Zijlstra
  2014-07-15  9:31         ` Jiri Olsa
  0 siblings, 1 reply; 21+ messages in thread
From: Peter Zijlstra @ 2014-07-15  9:11 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, linux-kernel, Alexander Yarygin,
	Arnaldo Carvalho de Melo, Corey Ashford, Frederic Weisbecker,
	Ingo Molnar, Paul Mackerras

[-- Attachment #1: Type: text/plain, Size: 2120 bytes --]

On Mon, Jul 14, 2014 at 10:18:54PM +0200, Jiri Olsa wrote:
> On Mon, Jul 14, 2014 at 01:18:33PM +0200, Peter Zijlstra wrote:
> > On Fri, Jul 11, 2014 at 01:56:19PM +0200, Jiri Olsa wrote:
> > > From: Jiri Olsa <jolsa@redhat.com>

> > > +static void perf_event_exit_children(struct perf_event *parent)
> > > +{
> > > +	struct perf_event *child, *tmp;
> > > +
> > > +	mutex_lock(&parent->child_mutex);
> > > +	list_for_each_entry_safe(child, tmp, &parent->child_list,
> > > +				 child_list) {
> > > +		struct perf_event_context *child_ctx = child->ctx;
> > > +
> > > +		/*
> > > +		 * Child events got removed from child_list under
> > > +		 * child_mutex and then freed. So it's safe to access
> > > +		 * childs context in here, because the child holds
> > > +		 * context ref.
> > > +		 */
> > > +		mutex_lock(&child_ctx->mutex);
> > > +		perf_remove_from_context(child, true);
> > > +		mutex_unlock(&child_ctx->mutex);
> > > +
> > > +		list_del_init(&child->child_list);
> > > +		put_event(parent);
> > > +		free_event(child);
> > > +	}
> > > +	mutex_unlock(&parent->child_mutex);
> > > +}

> > I don't think this is correct, perf_event_init_context() can come in
> > concurrently and the first place it runs into ->child_mutex is after its
> > already allocated and created the (first) child event.
> 
> just noticed this.. I'm working on the other version we decide, but FWIW
> there's also mutex_lock(&child_ctx->mutex); before removing the context,
> that should protect it against perf_event_init_context call

Oh, more fail :-)

You have:

  perf_event::child_mutex
    perf_event_context::mutex

The existing code has:

  perf_event_context::mutex
    perf_event::child_context

See for example:

  perf_event_init_context()
    mutex_lock(&parent_ctx->mutex)
    inherit_task_group()
      inherit_group()
        inherit_event()
	  mutex_lock(&parent_event->child_mutex)

and

  perf_event_for_each()
    mutex_lock(&ctx->mutex)
    perf_event_for_each_child()
      mutex_lock(&event->child_mutex)

So the patch creates an AB-BA deadlock.

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/5] perf: Destroy event's children on task exit
  2014-07-15  9:11       ` Peter Zijlstra
@ 2014-07-15  9:31         ` Jiri Olsa
  0 siblings, 0 replies; 21+ messages in thread
From: Jiri Olsa @ 2014-07-15  9:31 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jiri Olsa, linux-kernel, Alexander Yarygin,
	Arnaldo Carvalho de Melo, Corey Ashford, Frederic Weisbecker,
	Ingo Molnar, Paul Mackerras

On Tue, Jul 15, 2014 at 11:11:04AM +0200, Peter Zijlstra wrote:
> On Mon, Jul 14, 2014 at 10:18:54PM +0200, Jiri Olsa wrote:
> > On Mon, Jul 14, 2014 at 01:18:33PM +0200, Peter Zijlstra wrote:
> > > On Fri, Jul 11, 2014 at 01:56:19PM +0200, Jiri Olsa wrote:
> > > > From: Jiri Olsa <jolsa@redhat.com>

SNIP

> 
> > > I don't think this is correct, perf_event_init_context() can come in
> > > concurrently and the first place it runs into ->child_mutex is after its
> > > already allocated and created the (first) child event.
> > 
> > just noticed this.. I'm working on the other version we decide, but FWIW
> > there's also mutex_lock(&child_ctx->mutex); before removing the context,
> > that should protect it against perf_event_init_context call
> 
> Oh, more fail :-)
> 
> You have:
> 
>   perf_event::child_mutex
>     perf_event_context::mutex
> 
> The existing code has:
> 
>   perf_event_context::mutex
>     perf_event::child_context
> 
> See for example:
> 
>   perf_event_init_context()
>     mutex_lock(&parent_ctx->mutex)
>     inherit_task_group()
>       inherit_group()
>         inherit_event()
> 	  mutex_lock(&parent_event->child_mutex)
> 
> and
> 
>   perf_event_for_each()
>     mutex_lock(&ctx->mutex)
>     perf_event_for_each_child()
>       mutex_lock(&event->child_mutex)
> 
> So the patch creates an AB-BA deadlock.

ouch, right.. I'll try to come with other way

thanks,
jirka

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/5] perf: Destroy event's children on task exit
  2014-07-14 13:35           ` Peter Zijlstra
  2014-07-14 14:21             ` Jiri Olsa
@ 2014-07-16 12:14             ` Jiri Olsa
  1 sibling, 0 replies; 21+ messages in thread
From: Jiri Olsa @ 2014-07-16 12:14 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jiri Olsa, linux-kernel, Alexander Yarygin,
	Arnaldo Carvalho de Melo, Corey Ashford, Frederic Weisbecker,
	Ingo Molnar, Paul Mackerras

On Mon, Jul 14, 2014 at 03:35:42PM +0200, Peter Zijlstra wrote:
> On Mon, Jul 14, 2014 at 03:22:23PM +0200, Jiri Olsa wrote:
> > > > if we dont do it, the event stays installed without owner and
> > > > perf fork callback will be called and fail on permission checking
> > > > (because of owner == NULL) ... so yes, I think it's needed
> > > 
> > > Oh, right. Alternatively, we don't need permission checking for inherits
> > > at all, if we're allowed to create the initial event, we should be good
> > > for inherits.
> > 
> > I could adress that in follow up patch.. or you want this instead
> > of this one? IMO we should close those events anyway..
> 
> I tend to agree that closing them all is nicer. But we need to be
> careful while doing it so as not to make the clone/fork path block on
> it.
> 
> I _think_ it might be best to separate these two issues for the moment,
> so cure the reported problem by avoiding the permission check for
> inherited events -- IFF you agree with the previous argument that
> install_exec_creds() should be sufficient.

install_exec_creds remove removes current events any time
suid binary is executed.. so it seems ok

        /*
         * Disable monitoring for regular users
         * when executing setuid binaries. Must
         * wait until new credentials are committed
         * by commit_creds() above
         */
        if (get_dumpable(current->mm) != SUID_DUMP_USER)
                perf_event_exit_task(current);

jirka

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/5] perf: Destroy event's children on task exit
  2014-07-11 13:23   ` Peter Zijlstra
  2014-07-11 13:31     ` Jiri Olsa
@ 2014-07-16 12:20     ` Ingo Molnar
  1 sibling, 0 replies; 21+ messages in thread
From: Ingo Molnar @ 2014-07-16 12:20 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jiri Olsa, linux-kernel, Jiri Olsa, Alexander Yarygin,
	Arnaldo Carvalho de Melo, Corey Ashford, Frederic Weisbecker,
	Paul Mackerras


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Fri, Jul 11, 2014 at 01:56:19PM +0200, Jiri Olsa wrote:
> > From: Jiri Olsa <jolsa@redhat.com>
> > 
> > When task exits we close:
> >   1) all events that are installed in task
> >   2) all events owned by task (via file descriptor)
> > 
> > But we don't close children events of 2) events. Those children
> > events stay until the child task exits and are useless with the
> > parent being gone, because we have no way to get to values any
> > more.
> > 
> > Plus if the event stays installed in task even with the owner task
> > gone, it runs the perf callback any time the task forks, for no
> > real reason.
> > 
> > Closing all children events events when the owner task of the
> > parent event is closed.
> 
> So I _think_ the reason we didn't do this is because this is potentially
> very expensive and keeping them around isn't too much bother, they'll
> die eventually.
> 
> But I can't really remember. Ingo any recollections / opinions?

As far as I can remember it's just a historic path dependent 
implementation detail, not really a conscious choice: initially we 
didn't remove events (because we couldn't), and then it remained so 
because it broke nothing.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [tip:perf/core] perf: Check permission only for parent tracepoint event
  2014-07-11 11:56 [PATCH 0/5] perf: Fix tracepoint events permissions check Jiri Olsa
                   ` (5 preceding siblings ...)
  2014-07-11 12:02 ` [PATCH 0/5] perf: Fix tracepoint events permissions check Jiri Olsa
@ 2014-07-28  8:28 ` tip-bot for Jiri Olsa
  6 siblings, 0 replies; 21+ messages in thread
From: tip-bot for Jiri Olsa @ 2014-07-28  8:28 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, paulus, hpa, mingo, jolsa, torvalds, peterz, acme,
	fweisbec, yarygin, rostedt, tglx, cjashfor

Commit-ID:  f4be073db878d0e79f74bc36f1642847781791a0
Gitweb:     http://git.kernel.org/tip/f4be073db878d0e79f74bc36f1642847781791a0
Author:     Jiri Olsa <jolsa@kernel.org>
AuthorDate: Wed, 16 Jul 2014 14:33:29 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 28 Jul 2014 10:01:38 +0200

perf: Check permission only for parent tracepoint event

There's no need to check cloned event's permission once the
parent was already checked.

Also the code is checking 'current' process permissions, which
is not owner process for cloned events, thus could end up with
wrong permission check result.

Reported-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Tested-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1405079782-8139-1-git-send-email-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/trace/trace_event_perf.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index 5d12bb4..4b9c114 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -30,6 +30,18 @@ static int perf_trace_event_perm(struct ftrace_event_call *tp_event,
 			return ret;
 	}
 
+	/*
+	 * We checked and allowed to create parent,
+	 * allow children without checking.
+	 */
+	if (p_event->parent)
+		return 0;
+
+	/*
+	 * It's ok to check current process (owner) permissions in here,
+	 * because code below is called only via perf_event_open syscall.
+	 */
+
 	/* The ftrace function trace is allowed only for root. */
 	if (ftrace_event_is_function(tp_event)) {
 		if (perf_paranoid_tracepoint_raw() && !capable(CAP_SYS_ADMIN))

^ permalink raw reply related	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2014-07-28  8:29 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-07-11 11:56 [PATCH 0/5] perf: Fix tracepoint events permissions check Jiri Olsa
2014-07-11 11:56 ` [PATCH 1/5] perf: Make perf_init_event function static Jiri Olsa
2014-07-11 11:56 ` [PATCH 2/5] perf: Destroy event's children on task exit Jiri Olsa
2014-07-11 13:23   ` Peter Zijlstra
2014-07-11 13:31     ` Jiri Olsa
2014-07-16 12:20     ` Ingo Molnar
2014-07-14 11:18   ` Peter Zijlstra
2014-07-14 11:43     ` Jiri Olsa
2014-07-14 13:02       ` Peter Zijlstra
2014-07-14 13:22         ` Jiri Olsa
2014-07-14 13:35           ` Peter Zijlstra
2014-07-14 14:21             ` Jiri Olsa
2014-07-16 12:14             ` Jiri Olsa
2014-07-14 20:18     ` Jiri Olsa
2014-07-15  9:11       ` Peter Zijlstra
2014-07-15  9:31         ` Jiri Olsa
2014-07-11 11:56 ` [PATCH 3/5] perf: Initialize owner before calling event_init callback Jiri Olsa
2014-07-11 11:56 ` [PATCH 4/5] perf: Move event owner retrieval into perf_event_get_owner Jiri Olsa
2014-07-11 11:56 ` [PATCH 5/5] perf: Check event's owner permission in tracepoint init callback Jiri Olsa
2014-07-11 12:02 ` [PATCH 0/5] perf: Fix tracepoint events permissions check Jiri Olsa
2014-07-28  8:28 ` [tip:perf/core] perf: Check permission only for parent tracepoint event tip-bot for Jiri Olsa

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox