public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* Re: [RFC][PATCH] Avoid race w/ posix-cpu-timer and exiting tasks
@ 2006-06-14  2:49 Oleg Nesterov
  2006-06-13 23:05 ` john stultz
  0 siblings, 1 reply; 4+ messages in thread
From: Oleg Nesterov @ 2006-06-14  2:49 UTC (permalink / raw)
  To: john stultz
  Cc: Ingo Molnar, Thomas Gleixner, Steven Rostedt, linux-kernel,
	Roland McGrath

john stultz wrote:
>
> Hey Ingo,
> 	We've occasionally come across OOPSes in posix-cpu-timer thread (as
> well as tripping over the BUG_ON(tsk->exit_state there) where it appears
> the task we're processing exits out on us while we're using it. 
>
> Thus this fix tries to avoid running the posix-cpu-timers on a task that
> is exiting.
>
> --- 2.6-rt/kernel/posix-cpu-timers.c	2006-06-11 15:38:58.000000000 -0500
> +++ devrt/kernel/posix-cpu-timers.c	2006-06-12 10:52:20.000000000 -0500
> @@ -1290,12 +1290,15 @@
>
>  #undef	UNEXPIRED
>
> -	BUG_ON(tsk->exit_state);
> -
>  	/*
>  	 * Double-check with locks held.
>  	 */
>  	read_lock(&tasklist_lock);
> +	/* Make sure the task doesn't exit under us. */
> +	if(unlikely(tsk->exit_state)) {
> +		read_unlock(&tasklist_lock);
> +		return;
> +	}
>  	spin_lock(&tsk->sighand->siglock);

I strongly believe this BUG_ON() is indeed wrong, and I did a similar patch
a long ago:

	[PATCH] posix-timers: remove false BUG_ON() from run_posix_cpu_timers()
	Commit 3de463c7d9d58f8cf3395268230cb20a4c15bffa

However it was reverted due to some unclear problems (I think those problems
were not related to this patch).

Instead this patch was added:

	[PATCH] Yet more posix-cpu-timer fixes
	Commit 3de463c7d9d58f8cf3395268230cb20a4c15bffa

and I still think this patch is not correct.

Quoting myself:
>
> Roland McGrath wrote:
> >
> > @@ -566,6 +566,9 @@ static void arm_timer(struct k_itimer *t
> >         struct cpu_timer_list *next;
> >         unsigned long i;
> >
> > +       if (CPUCLOCK_PERTHREAD(timer->it_clock) && (p->flags & PF_EXITING))
> > +               return;
> > +
>
> Why CPUCLOCK_PERTHREAD() ?.
>
> Also, this is racy, no? Why should arm_timer() see PF_EXITING which is
> set on another cpu without any barriers/locking? After all, arm_timer()
> can test PF_EXITING before do_exit() sets this flag, but set ->it_xxx_expires
> after do_exit() resets it.

Oleg.


^ permalink raw reply	[flat|nested] 4+ messages in thread
* [RFC][PATCH] Avoid race w/ posix-cpu-timer and exiting tasks
@ 2006-06-13  0:25 john stultz
  0 siblings, 0 replies; 4+ messages in thread
From: john stultz @ 2006-06-13  0:25 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Thomas Gleixner, Steven Rostedt, lkml

Hey Ingo,
	We've occasionally come across OOPSes in posix-cpu-timer thread (as
well as tripping over the BUG_ON(tsk->exit_state there) where it appears
the task we're processing exits out on us while we're using it. 

Thus this fix tries to avoid running the posix-cpu-timers on a task that
is exiting.

I'm not sure if it is the proper fix, so I wanted some extra eyes to
look it over. We're testing it to see if we can still trigger any of the
OOPSes (the BUG_ON is removed, so that won't catch us anymore), but if
you have any thoughts I'd be interested in them.

thanks
-john

--- 2.6-rt/kernel/posix-cpu-timers.c	2006-06-11 15:38:58.000000000 -0500
+++ devrt/kernel/posix-cpu-timers.c	2006-06-12 10:52:20.000000000 -0500
@@ -1290,12 +1290,15 @@
 
 #undef	UNEXPIRED
 
-	BUG_ON(tsk->exit_state);
-
 	/*
 	 * Double-check with locks held.
 	 */
 	read_lock(&tasklist_lock);
+	/* Make sure the task doesn't exit under us. */
+	if(unlikely(tsk->exit_state)) {
+		read_unlock(&tasklist_lock);
+		return;
+	}
 	spin_lock(&tsk->sighand->siglock);
 
 	/*




^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2006-06-14 21:24 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-06-14  2:49 [RFC][PATCH] Avoid race w/ posix-cpu-timer and exiting tasks Oleg Nesterov
2006-06-13 23:05 ` john stultz
2006-06-15  1:24   ` Oleg Nesterov
  -- strict thread matches above, loose matches on Subject: below --
2006-06-13  0:25 john stultz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox