From: Oleg Nesterov <oleg@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>,
	Ingo Molnar <mingo@elte.hu>,
	Hiroyuki KAMEZAWA <kamezawa.hiroyu@jp.fujitsu.com>,
	Motohiro Kosaki <kosaki.motohiro@jp.fujitsu.com>,
	Linux Kernel ML <linux-kernel@vger.kernel.org>
Subject: Re: [BUG] TASK_DEAD task is able to be woken up in special condition
Date: Fri, 6 Jan 2012 15:12:58 +0100
Message-ID: <20120106141258.GB19462@redhat.com>
In-Reply-To: <1325853838.2442.18.camel@twins>

On 01/06, Peter Zijlstra wrote:
>
> On Fri, 2012-01-06 at 21:01 +0900, Yasunori Goto wrote:
>
> > Do you mean the following patch?
>
> Yes, something like that. At that point ->state should be TASK_RUNNING
> (since we are after all running). The unlock_wait() will synchronize
> against any in-progress ttwu() while its fast path is a non-atomic
> compare. Any ttwu after this will bail since it will either observe
> TASK_RUNNING or TASK_DEAD, neither are a state it will act upon.
>
> Now the only question that remains is if we need the full memory barrier
> or if we can get away with less.
>
> I guess the mb separates the write to ->state (setting TASK_RUNNING)
> from the read of ->pi_lock. The remote CPU must see the TASK_RUNNING,
> and we must see ->pi_lock taken if it is.

Yes, I think we need the full mb, STORE vs LOAD.

> > --- linux-3.2-rc7.orig/kernel/exit.c
> > +++ linux-3.2-rc7/kernel/exit.c
> > @@ -1038,6 +1038,10 @@ NORET_TYPE void do_exit(long code)
> >
> >  	preempt_disable();
> >  	exit_rcu();
> > +
> > +	smp_mb();
> > +	raw_spin_unlock_wait(&tsk->pi_lock);
> > +
> >  	/* causes final put_task_struct in finish_task_switch(). */
> >  	tsk->state = TASK_DEAD;

Interesting. Initially I thought this was wrong and that we should do

	raw_spin_unlock_wait(pi_lock);

	mb();

	tsk->state = TASK_DEAD;

This "obviously" serializes LOAD(pi_lock) and STORE(state).

But when I re-read your explanation above, I think you are right:
mb() before unlock_wait() should work too, it just refers to the
state = RUNNING store made in the past.
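
To make the two placements concrete, here is a minimal userspace sketch
in C11 atomics. It is only a model, not the kernel code: the model_*
names, the state values and the use of an atomic flag for ->pi_lock are
all assumptions made for illustration, and the acquire load inside
model_unlock_wait() is a modelling choice, not a statement about what
raw_spin_unlock_wait() guarantees.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for task states; the values are illustrative. */
enum { MODEL_RUNNING = 0, MODEL_UNINTERRUPTIBLE = 2, MODEL_DEAD = 64 };

struct model_task {
	atomic_int  state;	/* models tsk->state */
	atomic_bool pi_lock;	/* models tsk->pi_lock, true while held */
};

/* Models unlock_wait(): spin until the lock is observed free.
 * The acquire load keeps the later store to ->state after this load. */
static void model_unlock_wait(struct model_task *t)
{
	while (atomic_load_explicit(&t->pi_lock, memory_order_acquire))
		;
}

/* Waker side: check ->state under the lock, act only on a matching state. */
static bool model_ttwu(struct model_task *t, int wake_mask)
{
	bool woke = false;

	while (atomic_exchange(&t->pi_lock, true))	/* take the lock */
		;
	if (atomic_load(&t->state) & wake_mask) {
		atomic_store(&t->state, MODEL_RUNNING);
		woke = true;
	}
	atomic_store(&t->pi_lock, false);		/* drop the lock */
	return woke;
}

/* Placement from the quoted patch: mb, unlock_wait, then DEAD.
 * The fence separates the earlier store of RUNNING from the lock load. */
static void model_exit_mb_first(struct model_task *t)
{
	assert(atomic_load(&t->state) == MODEL_RUNNING);
	atomic_thread_fence(memory_order_seq_cst);
	model_unlock_wait(t);
	atomic_store_explicit(&t->state, MODEL_DEAD, memory_order_relaxed);
}

/* Placement considered first: unlock_wait, mb, then DEAD.
 * The fence separates the lock load from the store of DEAD. */
static void model_exit_mb_last(struct model_task *t)
{
	assert(atomic_load(&t->state) == MODEL_RUNNING);
	model_unlock_wait(t);
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store_explicit(&t->state, MODEL_DEAD, memory_order_relaxed);
}
```

Run single-threaded, both variants end with state == MODEL_DEAD, and a
later model_ttwu() bails because neither RUNNING nor DEAD matches the
wake mask. The sketch does not reproduce the race itself; it only shows
which accesses each fence placement sits between.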

But this makes me worry. We do a lot of things after exit_mm().
In particular, we take tasklist_lock in exit_notify(), and then
do_exit() takes task_lock(). But every unlock + lock implies mb().
So how was it possible to hit this bug?

Oleg.

