All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oleg Nesterov <oleg@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <sasha.levin@oracle.com>,
	Ingo Molnar <mingo@kernel.org>,
	John Stultz <john.stultz@linaro.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Dave Jones <davej@redhat.com>,
	Andrey Ryabinin <a.ryabinin@samsung.com>
Subject: Re: finish_task_switch && prev_state (Was: sched, timers: use after free in __lock_task_sighand when exiting a process)
Date: Tue, 29 Jul 2014 17:53:09 +0200	[thread overview]
Message-ID: <20140729155309.GA30194@redhat.com> (raw)
In-Reply-To: <20140729092237.GU12054@laptop.lan>

On 07/29, Peter Zijlstra wrote:
>
> On Tue, Jul 29, 2014 at 11:10:18AM +0200, Peter Zijlstra wrote:
> > On Tue, Jul 15, 2014 at 04:25:25PM +0200, Oleg Nesterov wrote:
> >
> > > And probably I missed something again, but it seems that this logic is broken
> > > with __ARCH_WANT_UNLOCKED_CTXSW.
> > >
> > > Of course, even if I am right this is pure theoretical, but smp_wmb() before
> > > "->on_cpu = 0" is not enough and we need a full barrier ?
> >
> > (long delay there, forgot about this thread, sorry)
> >
> > Yes, I think I see that.. but now I think the comment is further wrong.
> >
> > Its not rq->lock that is important, remember, a concurrent wakeup onto
> > another CPU does not require our rq->lock at all.
> >
> > It is the ->on_cpu = 0 store that is important (for both the
> > UNLOCKED_CTXSW cases). As soon as that store comes through the task can
> > start running on the remote cpu.

Yes, I came to the same conclusion right after I sent that email.

> > Now the below patch 'fixes' this but at the cost of adding a full
> > barrier which is somewhat unfortunate to say the least.

And yes, this is obviously the "fix" I had in mind, but:

> > wmb's are free on x86 and generally cheaper than mbs, so it would to
> > find another solution to this problem...
>
> Something like so then?

Hmm, indeed! Unfortunately I didn't find this simple solution. Yes, I think
we should check current->state == TASK_DEAD,

> @@ -2304,6 +2293,21 @@ context_switch(struct rq *rq, struct task_struct *prev,
>  	       struct task_struct *next)
>  {
>  	struct mm_struct *mm, *oldmm;
> +	/*
> +	 * A task struct has one reference for the use as "current".
> +	 * If a task dies, then it sets TASK_DEAD in tsk->state and calls
> +	 * schedule one last time. The schedule call will never return, and
> +	 * the scheduled task must drop that reference.
> +	 *
> +	 * We must observe prev->state before clearing prev->on_cpu (in
> +	 * finish_lock_switch), otherwise a concurrent wakeup can get prev
> +	 * running on another CPU and we could race with its RUNNING -> DEAD
> +	 * transition, and then the reference would be dropped twice.
> +	 *
> +	 * We avoid the race by observing prev->state while it is still
> +	 * current.
> +	 */
> +	long prev_state = prev->state;

This doesn't really matter, but probably it would be better to do this right
before switch_to(), prev == current until this point.

Oleg.


  reply	other threads:[~2014-07-29 15:55 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-13 21:51 sched, timers: use after free in __lock_task_sighand when exiting a process Sasha Levin
2014-07-13 23:45 ` Sasha Levin
2014-07-14  9:04   ` Peter Zijlstra
2014-07-14  9:34     ` Andrey Ryabinin
2014-07-14  9:58       ` Peter Zijlstra
2014-07-14 10:25         ` Andrey Ryabinin
2014-07-14 14:49     ` Oleg Nesterov
2014-07-14 15:13       ` Oleg Nesterov
2014-07-14 15:31       ` Andrey Ryabinin
2014-07-14 16:01       ` finish_task_switch && prev_state (Was: sched, timers: use after free in __lock_task_sighand when exiting a process) Oleg Nesterov
2014-07-15 13:12         ` Oleg Nesterov
2014-07-15 13:23           ` Peter Zijlstra
2014-07-15 14:25             ` Oleg Nesterov
2014-07-29  9:10               ` Peter Zijlstra
2014-07-29  9:22                 ` Peter Zijlstra
2014-07-29 15:53                   ` Oleg Nesterov [this message]
2014-07-15 13:28       ` sched, timers: use after free in __lock_task_sighand when exiting a process Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140729155309.GA30194@redhat.com \
    --to=oleg@redhat.com \
    --cc=a.ryabinin@samsung.com \
    --cc=davej@redhat.com \
    --cc=fweisbec@gmail.com \
    --cc=john.stultz@linaro.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=sasha.levin@oracle.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.