Re: [RFC PATCH] perf_counter: dynamically allocate tasks' perf_counter_context struct

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>,
	linux-kernel@vger.kernel.org,
	Corey Ashford <cjashfor@linux.vnet.ibm.com>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [RFC PATCH] perf_counter: dynamically allocate tasks' perf_counter_context struct
Date: Wed, 20 May 2009 17:03:08 +0200	[thread overview]
Message-ID: <1242831788.32543.1706.camel@laptop> (raw)
In-Reply-To: <18963.63352.489273.92145@cargo.ozlabs.ibm.com>

On Wed, 2009-05-20 at 22:28 +1000, Paul Mackerras wrote:
> This replaces the struct perf_counter_context in the task_struct with
> a pointer to a dynamically allocated perf_counter_context struct.  The
> main reason for doing is this is to allow us to transfer a
> perf_counter_context from one task to another when we do lazy PMU
> switching in a later patch.

Looks good!, few comments below.

> +	ctx = task->perf_counter_ctx;
> +	if (!ctx) {
> +		ctx = kmalloc(sizeof(struct perf_counter_context), GFP_KERNEL);
> +		if (!ctx) {
> +			put_task_struct(task);
> +			return ERR_PTR(-ENOMEM);
> +		}
> +		__perf_counter_init_context(ctx, task);
> +		mutex_lock(&task->perf_counter_mutex);
> +		if (!task->perf_counter_ctx) {
> +			task->perf_counter_ctx = ctx;
> +		} else {
> +			kfree(ctx);
> +			ctx = task->perf_counter_ctx;
> +		}
> +		mutex_unlock(&task->perf_counter_mutex);
> +	}

This seems to be the only site where we use ->perf_counter_mutex,
couldn't we simply write this as:

 if (cmpxchg(&task->perf_counter_ctx, NULL, ctx) != NULL)
   kfree(ctx);

and get rid of the mutex?


> +/*
> + * Remove a counter from the lists for its context.
> + * Must be called with counter->mutex and ctx->mutex held.
> + */
>  static void
>  list_del_counter(struct perf_counter *counter, struct perf_counter_context *ctx)


> @@ -3295,6 +3354,8 @@ __perf_counter_exit_task(struct task_struct *child,
>   *
>   * Note: we may be running in child context, but the PID is not hashed
>   * anymore so new counters will not be added.
> + * (XXX not sure that is true when we get called from flush_old_exec.
> + *  -- paulus)
>   */
>  void perf_counter_exit_task(struct task_struct *child)
>  {

Yes, we have a little hole here, but I think it should fixable.

we have:

 do_exit()
 {
   exit_signals(); /* sets PF_EXITING */

   ... (1)

   perf_counter_exit_task();

   ... (2)

   <unhash somewhere>

vs


 find_get_context()


If, in find_get_context() we exclude PF_EXITING tasks under ctx->mutex,
and ensure we take ctx->mutex in perf_counter_exit_task() we should be
good.

Tasks could race and still attach in (1), but that would be ok, as
they'd get cleared out in perf_counter_exit_task(), but not after that
in (2) as they'd be sure to observe PF_EXITING.

Now it appears perf_counter_exit_task() is a little light on locking, so
the below patch adds both ctx->mutex and counter->mutex, as you say it
should when invoking list_del_counter().

Secondly it adds the PF_EXITING test in find_get_context().

(utterly untested and such..)

Almost-Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
Index: linux-2.6/kernel/perf_counter.c
===================================================================
--- linux-2.6.orig/kernel/perf_counter.c
+++ linux-2.6/kernel/perf_counter.c
@@ -1228,6 +1228,13 @@ static struct perf_counter_context *find
 	ctx = &task->perf_counter_ctx;
 	ctx->task = task;
 
+	mutex_lock(&ctx->mutex);
+	if (task->state & PF_EXITING) {
+		mutex_unlock(&ctx->mutex);
+		return ERR_PTR(-ESRCH);
+	}
+	mutex_unlock(&ctx->mutex);
+
 	/* Reuse ptrace permission checks for now. */
 	if (!ptrace_may_access(task, PTRACE_MODE_READ)) {
 		put_context(ctx);
@@ -3292,42 +3299,29 @@ __perf_counter_exit_task(struct task_str
 			 struct perf_counter_context *child_ctx)
 {
 	struct perf_counter *parent_counter;
+	struct perf_cpu_context *cpuctx;
+	unsigned long flags;
 
 	/*
-	 * If we do not self-reap then we have to wait for the
-	 * child task to unschedule (it will happen for sure),
-	 * so that its counter is at its final count. (This
-	 * condition triggers rarely - child tasks usually get
-	 * off their CPU before the parent has a chance to
-	 * get this far into the reaping action)
-	 */
-	if (child != current) {
-		wait_task_inactive(child, 0);
-		update_counter_times(child_counter);
-		list_del_counter(child_counter, child_ctx);
-	} else {
-		struct perf_cpu_context *cpuctx;
-		unsigned long flags;
-
-		/*
-		 * Disable and unlink this counter.
-		 *
-		 * Be careful about zapping the list - IRQ/NMI context
-		 * could still be processing it:
-		 */
-		local_irq_save(flags);
-		perf_disable();
+	 * Disable and unlink this counter.
+	 *
+	 * Be careful about zapping the list - IRQ/NMI context
+	 * could still be processing it:
+	 */
+	local_irq_save(flags);
+	perf_disable();
 
-		cpuctx = &__get_cpu_var(perf_cpu_context);
+	cpuctx = &__get_cpu_var(perf_cpu_context);
 
-		group_sched_out(child_counter, cpuctx, child_ctx);
-		update_counter_times(child_counter);
+	group_sched_out(child_counter, cpuctx, child_ctx);
+	update_counter_times(child_counter);
 
-		list_del_counter(child_counter, child_ctx);
+	perf_enable();
+	local_irq_restore(flags);
 
-		perf_enable();
-		local_irq_restore(flags);
-	}
+	mutex_lock(&child_counter->mutex);
+	list_del_counter(child_counter, child_ctx);
+	mutex_unlock(&child_counter->mutex);
 
 	parent_counter = child_counter->parent;
 	/*
@@ -3359,6 +3353,8 @@ void perf_counter_exit_task(struct task_
 	if (likely(!child_ctx->nr_counters))
 		return;
 
+	mutex_lock(&child_ctx->mutex);
+
 again:
 	list_for_each_entry_safe(child_counter, tmp, &child_ctx->counter_list,
 				 list_entry)
@@ -3371,6 +3367,8 @@ again:
 	 */
 	if (!list_empty(&child_ctx->counter_list))
 		goto again;
+
+	mutex_unlock(&child_ctx->mutex);
 }
 
 /*

next prev parent reply	other threads:[~2009-05-20 15:03 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-05-20 12:28 [RFC PATCH] perf_counter: dynamically allocate tasks' perf_counter_context struct Paul Mackerras
2009-05-20 15:03 ` Peter Zijlstra [this message]
2009-05-20 23:50   ` Paul Mackerras
2009-05-20 17:12 ` Ingo Molnar
2009-05-20 23:15   ` Paul Mackerras
2009-05-22 19:18     ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1242831788.32543.1706.camel@laptop \
    --to=a.p.zijlstra@chello.nl \
    --cc=cjashfor@linux.vnet.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=paulus@samba.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox