public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Oleg Nesterov <oleg@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <sasha.levin@oracle.com>,
	Ingo Molnar <mingo@kernel.org>,
	John Stultz <john.stultz@linaro.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Dave Jones <davej@redhat.com>,
	Andrey Ryabinin <a.ryabinin@samsung.com>
Subject: Re: finish_task_switch && prev_state (Was: sched, timers: use after free in __lock_task_sighand when exiting a process)
Date: Tue, 29 Jul 2014 17:53:09 +0200	[thread overview]
Message-ID: <20140729155309.GA30194@redhat.com> (raw)
In-Reply-To: <20140729092237.GU12054@laptop.lan>

On 07/29, Peter Zijlstra wrote:
>
> On Tue, Jul 29, 2014 at 11:10:18AM +0200, Peter Zijlstra wrote:
> > On Tue, Jul 15, 2014 at 04:25:25PM +0200, Oleg Nesterov wrote:
> >
> > > And probably I missed something again, but it seems that this logic is broken
> > > with __ARCH_WANT_UNLOCKED_CTXSW.
> > >
> > > Of course, even if I am right this is pure theoretical, but smp_wmb() before
> > > "->on_cpu = 0" is not enough and we need a full barrier ?
> >
> > (long delay there, forgot about this thread, sorry)
> >
> > Yes, I think I see that.. but now I think the comment is further wrong.
> >
> > Its not rq->lock that is important, remember, a concurrent wakeup onto
> > another CPU does not require our rq->lock at all.
> >
> > It is the ->on_cpu = 0 store that is important (for both the
> > UNLOCKED_CTXSW cases). As soon as that store comes through the task can
> > start running on the remote cpu.

Yes, I came to the same conclusion right after I sent that email.

> > Now the below patch 'fixes' this but at the cost of adding a full
> > barrier which is somewhat unfortunate to say the least.

And yes, this is obviously the "fix" I had in mind, but:

> > wmb's are free on x86 and generally cheaper than mbs, so it would to
> > find another solution to this problem...
>
> Something like so then?

Hmm, indeed! Unfortunately I didn't find this simple solution. Yes, I think
we should check current->state == TASK_DEAD,

> @@ -2304,6 +2293,21 @@ context_switch(struct rq *rq, struct task_struct *prev,
>  	       struct task_struct *next)
>  {
>  	struct mm_struct *mm, *oldmm;
> +	/*
> +	 * A task struct has one reference for the use as "current".
> +	 * If a task dies, then it sets TASK_DEAD in tsk->state and calls
> +	 * schedule one last time. The schedule call will never return, and
> +	 * the scheduled task must drop that reference.
> +	 *
> +	 * We must observe prev->state before clearing prev->on_cpu (in
> +	 * finish_lock_switch), otherwise a concurrent wakeup can get prev
> +	 * running on another CPU and we could race with its RUNNING -> DEAD
> +	 * transition, and then the reference would be dropped twice.
> +	 *
> +	 * We avoid the race by observing prev->state while it is still
> +	 * current.
> +	 */
> +	long prev_state = prev->state;

This doesn't really matter, but probably it would be better to do this right
before switch_to(), prev == current until this point.

Oleg.


  reply	other threads:[~2014-07-29 15:55 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-13 21:51 sched, timers: use after free in __lock_task_sighand when exiting a process Sasha Levin
2014-07-13 23:45 ` Sasha Levin
2014-07-14  9:04   ` Peter Zijlstra
2014-07-14  9:34     ` Andrey Ryabinin
2014-07-14  9:58       ` Peter Zijlstra
2014-07-14 10:25         ` Andrey Ryabinin
2014-07-14 14:49     ` Oleg Nesterov
2014-07-14 15:13       ` Oleg Nesterov
2014-07-14 15:31       ` Andrey Ryabinin
2014-07-14 16:01       ` finish_task_switch && prev_state (Was: sched, timers: use after free in __lock_task_sighand when exiting a process) Oleg Nesterov
2014-07-15 13:12         ` Oleg Nesterov
2014-07-15 13:23           ` Peter Zijlstra
2014-07-15 14:25             ` Oleg Nesterov
2014-07-29  9:10               ` Peter Zijlstra
2014-07-29  9:22                 ` Peter Zijlstra
2014-07-29 15:53                   ` Oleg Nesterov [this message]
2014-07-15 13:28       ` sched, timers: use after free in __lock_task_sighand when exiting a process Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140729155309.GA30194@redhat.com \
    --to=oleg@redhat.com \
    --cc=a.ryabinin@samsung.com \
    --cc=davej@redhat.com \
    --cc=fweisbec@gmail.com \
    --cc=john.stultz@linaro.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=sasha.levin@oracle.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox