From: Peter Zijlstra <peterz@infradead.org>
To: Oleg Nesterov <oleg@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>,
Ingo Molnar <mingo@elte.hu>,
Alan Stern <stern@rowland.harvard.edu>,
Arnaldo Carvalho de Melo <acme@redhat.com>,
Paul Mackerras <paulus@samba.org>,
Prasad <prasad@linux.vnet.ibm.com>,
Roland McGrath <roland@redhat.com>,
linux-kernel@vger.kernel.org
Subject: Re: Q: perf_install_in_context/perf_event_enable are racy?
Date: Fri, 28 Jan 2011 15:57:47 +0100 [thread overview]
Message-ID: <1296226667.15234.337.camel@laptop> (raw)
In-Reply-To: <1296215577.15234.333.camel@laptop>
On Fri, 2011-01-28 at 12:52 +0100, Peter Zijlstra wrote:
> Right, so in case the perf_event_task_sched_in() missed the assignment
> of ->perf_event_ctxp[n], then our above story falls flat on its face.
>
> Because then we can not rely on ->in_active being set for running tasks.
>
> So we need a task_curr() test under that lock, which would need
> perf_event_task_sched_out() to be done _before_ we set rq->curr = next,
> I _think_.
Ok, so how about something like this:
if task_oncpu_function_call() managed to execute the function proper,
we're done. Otherwise, if while holding the lock, task_curr() is true,
it means the task is now current and we should try again, if its not, it
cannot become current because us holding ctx->lock blocks
perf_event_task_sched_in().
Hmm?
---
include/linux/sched.h | 4 +-
kernel/perf_event.c | 23 ++++++++++-------
kernel/sched.c | 65 +++++++++++++++++++++++++++++++++++++++++++------
3 files changed, 73 insertions(+), 19 deletions(-)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index d747f94..b147d73 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2581,8 +2581,8 @@ static inline void inc_syscw(struct task_struct *tsk)
/*
* Call the function if the target task is executing on a CPU right now:
*/
-extern void task_oncpu_function_call(struct task_struct *p,
- void (*func) (void *info), void *info);
+extern int task_oncpu_function_call(struct task_struct *p,
+ void (*func) (void *info), void *info);
#ifdef CONFIG_MM_OWNER
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 852ae8c..0d988b8 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -1017,6 +1017,7 @@ perf_install_in_context(struct perf_event_context *ctx,
int cpu)
{
struct task_struct *task = ctx->task;
+ int ret;
event->ctx = ctx;
@@ -1031,25 +1032,29 @@ perf_install_in_context(struct perf_event_context *ctx,
}
retry:
- task_oncpu_function_call(task, __perf_install_in_context,
- event);
+ ret = task_oncpu_function_call(task,
+ __perf_install_in_context, event);
+
+ if (!ret)
+ return;
raw_spin_lock_irq(&ctx->lock);
+
/*
- * we need to retry the smp call.
+ * If the task_oncpu_function_call() failed, re-check task_curr() now
+ * that we hold ctx->lock(), if it is running retry the IPI.
*/
- if (ctx->is_active && list_empty(&event->group_entry)) {
+ if (task_curr(task)) {
raw_spin_unlock_irq(&ctx->lock);
goto retry;
}
/*
- * The lock prevents that this context is scheduled in so we
- * can add the event safely, if it the call above did not
- * succeed.
+ * Otherwise the lock prevents that this context is scheduled in so we
+ * can add the event safely, if it the call above did not succeed.
*/
- if (list_empty(&event->group_entry))
- add_event_to_ctx(event, ctx);
+ add_event_to_ctx(event, ctx);
+
raw_spin_unlock_irq(&ctx->lock);
}
diff --git a/kernel/sched.c b/kernel/sched.c
index 18d38e4..3686dce 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -490,7 +490,10 @@ struct rq {
struct task_struct *curr, *idle, *stop;
unsigned long next_balance;
struct mm_struct *prev_mm;
-
+#ifdef __ARCH_WANT_INTERRUPTS_ON_CTXSW
+ int in_ctxsw;
+#endif
+
u64 clock;
u64 clock_task;
@@ -2265,6 +2268,34 @@ void kick_process(struct task_struct *p)
EXPORT_SYMBOL_GPL(kick_process);
#endif /* CONFIG_SMP */
+struct task_function_call {
+ struct task_struct *p;
+ void (*func)(void *info);
+ void *info;
+ int ret;
+};
+
+void task_function_trampoline(void *data)
+{
+ struct task_function_call *tfc = data;
+ struct task_struct *p = tfc->p;
+ struct rq *rq = this_rq();
+
+ tfc->ret = -EAGAIN;
+
+#ifdef __ARCH_WANT_INTERRUPTS_ON_CTXSW
+ if (rq->in_ctxsw)
+ return;
+#endif
+
+ if (rq->curr != p)
+ return;
+
+ tfc->ret = 0;
+
+ tfc->func(tfc->info);
+}
+
/**
* task_oncpu_function_call - call a function on the cpu on which a task runs
* @p: the task to evaluate
@@ -2273,17 +2304,31 @@ EXPORT_SYMBOL_GPL(kick_process);
*
* Calls the function @func when the task is currently running. This might
* be on the current CPU, which just calls the function directly
+ *
+ * returns: 0 when @func got called
*/
-void task_oncpu_function_call(struct task_struct *p,
+int task_oncpu_function_call(struct task_struct *p,
void (*func) (void *info), void *info)
{
+ struct task_function_call data = {
+ .p = p,
+ .func = func,
+ .info = info,
+ };
int cpu;
preempt_disable();
+again:
+ data.ret = -ESRCH; /* No such (running) process */
cpu = task_cpu(p);
- if (task_curr(p))
- smp_call_function_single(cpu, func, info, 1);
+ if (task_curr(p)) {
+ smp_call_function_single(cpu, task_function_trampoline, &data, 1);
+ if (data.ret == -EAGAIN)
+ goto again;
+ }
preempt_enable();
+
+ return data.ret;
}
#ifdef CONFIG_SMP
@@ -2776,9 +2821,15 @@ static inline void
prepare_task_switch(struct rq *rq, struct task_struct *prev,
struct task_struct *next)
{
+#ifdef __ARCH_WANT_INTERRUPTS_ON_CTXSW
+ rq->in_ctxsw = 1;
+#endif
+ sched_info_switch(prev, next);
+ perf_event_task_sched_out(prev, next);
fire_sched_out_preempt_notifiers(prev, next);
prepare_lock_switch(rq, next);
prepare_arch_switch(next);
+ trace_sched_switch(prev, next);
}
/**
@@ -2822,6 +2873,7 @@ static void finish_task_switch(struct rq *rq, struct task_struct *prev)
#endif /* __ARCH_WANT_INTERRUPTS_ON_CTXSW */
perf_event_task_sched_in(current);
#ifdef __ARCH_WANT_INTERRUPTS_ON_CTXSW
+ rq->in_ctxsw = 0;
local_irq_enable();
#endif /* __ARCH_WANT_INTERRUPTS_ON_CTXSW */
finish_lock_switch(rq, prev);
@@ -2911,7 +2963,7 @@ context_switch(struct rq *rq, struct task_struct *prev,
struct mm_struct *mm, *oldmm;
prepare_task_switch(rq, prev, next);
- trace_sched_switch(prev, next);
+
mm = next->mm;
oldmm = prev->active_mm;
/*
@@ -3989,9 +4041,6 @@ need_resched_nonpreemptible:
rq->skip_clock_update = 0;
if (likely(prev != next)) {
- sched_info_switch(prev, next);
- perf_event_task_sched_out(prev, next);
-
rq->nr_switches++;
rq->curr = next;
++*switch_count;
next prev parent reply other threads:[~2011-01-28 14:56 UTC|newest]
Thread overview: 91+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-11-08 14:56 Q: perf_event && task->ptrace_bps[] Oleg Nesterov
2010-11-08 14:57 ` Q: sys_perf_event_open() && PF_EXITING Oleg Nesterov
2011-01-19 18:21 ` [PATCH 0/2] Was: " Oleg Nesterov
2011-01-19 18:22 ` [PATCH 1/2] perf: fix find_get_context() vs perf_event_exit_task() race Oleg Nesterov
2011-01-19 18:49 ` Peter Zijlstra
2011-01-19 19:18 ` [tip:perf/urgent] perf: Fix " tip-bot for Oleg Nesterov
2011-01-21 15:29 ` Ingo Molnar
2011-01-21 15:53 ` Oleg Nesterov
2011-01-21 17:45 ` [PATCH] perf: perf_event_exit_task_context: s/rcu_dereference/rcu_dereference_raw/ Oleg Nesterov
2011-01-21 17:53 ` Oleg Nesterov
2011-01-21 21:50 ` Paul E. McKenney
2011-01-24 11:51 ` Oleg Nesterov
2011-01-21 22:12 ` [tip:perf/urgent] " tip-bot for Oleg Nesterov
2011-01-19 18:22 ` [PATCH 2/2] perf: fix perf_event_init_task()/perf_event_free_task() interaction Oleg Nesterov
2011-01-19 18:51 ` Peter Zijlstra
2011-01-19 19:19 ` [tip:perf/urgent] perf: Fix " tip-bot for Oleg Nesterov
2011-01-20 19:30 ` Q: perf_install_in_context/perf_event_enable are racy? Oleg Nesterov
2011-01-21 12:11 ` Peter Zijlstra
2011-01-21 13:03 ` Ingo Molnar
2011-01-21 13:39 ` Peter Zijlstra
2011-01-21 14:26 ` Oleg Nesterov
2011-01-21 15:05 ` Peter Zijlstra
2011-01-21 20:40 ` Frederic Weisbecker
2011-01-24 11:42 ` Oleg Nesterov
2011-01-26 17:53 ` Oleg Nesterov
2011-01-26 18:49 ` Oleg Nesterov
2011-01-26 18:51 ` [PATCH] fix the theoretical task_cpu/task_curr problem in kick_process/task_oncpu_function_call Oleg Nesterov
2011-01-26 19:05 ` Q: perf_install_in_context/perf_event_enable are racy? Peter Zijlstra
2011-01-26 19:33 ` Peter Zijlstra
2011-01-26 19:38 ` Peter Zijlstra
2011-01-26 21:19 ` Oleg Nesterov
2011-01-26 21:33 ` Oleg Nesterov
2011-01-27 10:32 ` Peter Zijlstra
2011-01-27 12:29 ` Peter Zijlstra
2011-01-27 16:10 ` Oleg Nesterov
2011-01-27 16:27 ` Peter Zijlstra
2011-01-27 16:59 ` Oleg Nesterov
2011-01-27 15:52 ` Oleg Nesterov
2011-01-27 13:14 ` Peter Zijlstra
2011-01-27 14:28 ` Peter Zijlstra
2011-01-27 14:58 ` Peter Zijlstra
2011-01-27 16:57 ` Oleg Nesterov
2011-01-27 17:11 ` Peter Zijlstra
2011-01-27 22:18 ` Oleg Nesterov
2011-01-28 11:52 ` Peter Zijlstra
2011-01-28 14:57 ` Peter Zijlstra [this message]
2011-01-28 16:28 ` Oleg Nesterov
2011-01-28 18:11 ` Peter Zijlstra
2011-01-31 17:26 ` Oleg Nesterov
2011-01-31 18:23 ` Peter Zijlstra
2011-01-31 19:11 ` Oleg Nesterov
2011-01-31 19:29 ` Peter Zijlstra
2011-02-01 14:03 ` [PATCH] perf: Cure task_oncpu_function_call() races Peter Zijlstra
2011-02-01 17:27 ` Oleg Nesterov
2011-02-01 18:08 ` Peter Zijlstra
2011-02-01 18:18 ` Peter Zijlstra
2011-02-01 21:00 ` Peter Zijlstra
2010-11-08 14:57 ` Q: perf_event && event->owner Oleg Nesterov
2010-11-08 20:11 ` Frederic Weisbecker
2010-11-08 20:41 ` Peter Zijlstra
2010-11-09 16:18 ` Oleg Nesterov
2010-11-09 15:57 ` Oleg Nesterov
2010-11-09 16:56 ` Peter Zijlstra
2010-11-09 16:58 ` Oleg Nesterov
2010-11-09 17:07 ` Peter Zijlstra
2010-11-09 17:42 ` Oleg Nesterov
2010-11-09 18:01 ` Peter Zijlstra
2010-11-09 18:57 ` Oleg Nesterov
2010-11-09 19:16 ` Peter Zijlstra
2010-11-10 15:17 ` Peter Zijlstra
2010-11-10 15:44 ` Oleg Nesterov
2010-11-12 15:48 ` Peter Zijlstra
2010-11-12 18:49 ` Oleg Nesterov
2010-11-18 14:09 ` [tip:perf/urgent] perf: Fix owner-list vs exit tip-bot for Peter Zijlstra
2010-11-08 18:41 ` Q: perf_event && task->ptrace_bps[] Frederic Weisbecker
2010-11-08 19:18 ` Oleg Nesterov
2011-01-17 23:58 ` Frederic Weisbecker
2011-01-18 1:16 ` Roland McGrath
2011-01-17 20:34 ` Oleg Nesterov
2011-01-17 20:52 ` Peter Zijlstra
2011-01-17 21:01 ` Frederic Weisbecker
2011-01-18 16:09 ` [PATCH 0/2] perf: event->cpu checking fixes Oleg Nesterov
2011-01-18 16:10 ` [PATCH 1/2] perf: find_get_context: fix the per-cpu-counter check Oleg Nesterov
2011-01-18 19:06 ` [tip:perf/urgent] perf: Find_get_context: " tip-bot for Oleg Nesterov
2011-01-18 16:10 ` [PATCH 2/2] perf: validate cpu early in perf_event_alloc() Oleg Nesterov
2011-01-18 19:07 ` [tip:perf/urgent] perf: Validate " tip-bot for Oleg Nesterov
2011-01-18 18:42 ` Q: perf_event && task->ptrace_bps[] Frederic Weisbecker
2011-01-19 15:37 ` Oleg Nesterov
2011-01-19 20:05 ` Frederic Weisbecker
2011-01-20 17:28 ` Oleg Nesterov
2011-01-28 17:41 ` Frederic Weisbecker
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1296226667.15234.337.camel@laptop \
--to=peterz@infradead.org \
--cc=acme@redhat.com \
--cc=fweisbec@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=oleg@redhat.com \
--cc=paulus@samba.org \
--cc=prasad@linux.vnet.ibm.com \
--cc=roland@redhat.com \
--cc=stern@rowland.harvard.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox