From: Oleg Nesterov <oleg@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Kautuk Consul <consul.kautuk@gmail.com>,
	Ingo Molnar <mingo@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@suse.cz>,
	David Rientjes <rientjes@google.com>,
	Ionut Alexa <ionut.m.alexa@gmail.com>,
	Guillaume Morin <guillaume@morinfr.org>,
	linux-kernel@vger.kernel.org, Kirill Tkhai <tkhai@yandex.ru>
Subject: Re: [PATCH 1/1] do_exit(): Solve possibility of BUG() due to race with try_to_wake_up()
Date: Wed, 3 Sep 2014 15:36:40 +0200
Message-ID: <20140903133640.GA25439@redhat.com>
In-Reply-To: <20140902173910.GF27892@worktop.ger.corp.intel.com>

Peter, sorry for slow responses.

On 09/02, Peter Zijlstra wrote:
>
> On Tue, Sep 02, 2014 at 06:47:14PM +0200, Oleg Nesterov wrote:
>
> > But since I already wrote v2 yesterday, let me show it anyway. Perhaps
> > you will notice something wrong immediately...
> >
> > So, once again, this patch adds the ugly "goto" into schedule(). OTOH,
> > it removes the ugly spin_unlock_wait(pi_lock).
>
> But schedule() is called _far_ more often than exit(). It would be
> really good not to have to do that.

Yes sure, performance-wise this is not a win. My point was, this way the
whole "last schedule" logic becomes very simple.

But OK, I buy your nack. I understand that we should not penalize
__schedule() if possible. Let's forget this patch.

> > TASK_DEAD can die. The only valid user is schedule_debug(), trivial to
> > change. The usage of TASK_DEAD in task_numa_fault() is wrong in any case.
> >
> > In fact, I think that the next change can change exit_schedule() to use
> > PREEMPT_ACTIVE, and then we can simply remove the TASK_DEAD check in
> > schedule_debug().
>
> So you worry about concurrent wakeups vs setting TASK_DEAD and thereby
> losing it, right?
>
> Would not something like:
>
> 	spin_lock_irq(&current->pi_lock);
> 	__set_current_state(TASK_DEAD);
> 	spin_unlock_irq(&current->pi_lock);

Sure. This should obviously fix the problem.

And, I think, another mb() after unlock_wait should fix it as well.

> Not be race free and similarly expensive to the smp_mb() we have there
> now?

Ah, I simply do not know which is cheaper, even on x86. Well, we need
to disable/enable irqs, but again I do not really know how much
this costs. I can't even say what (imo) looks better, lock/unlock above
or

	// Ensure that the previous __set_current_state(RUNNING) can't
	// leak after spin_unlock_wait()
	smp_mb();
	spin_unlock_wait(&current->pi_lock);
	// Another mb to ensure this too can't be reordered with unlock_wait
	set_current_state(TASK_DEAD);

What do you think looks better?

Oleg.
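
To make the race discussed above concrete, here is a minimal userspace sketch. It is not kernel code: the names (model_task, waker_check, waker_commit, exit_store_unlocked, exit_store_locked) are hypothetical, the lock is a plain flag, and the interleaving is replayed by hand in a single thread. It assumes the waker decides to wake and later stores the running state inside one pi_lock critical section, and shows why a store of the dead state that takes the same lock cannot be overwritten by that waker.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for task states; not the kernel's values. */
enum model_state { MODEL_RUNNING, MODEL_SLEEPING, MODEL_DEAD };

struct model_task {
	enum model_state state;
	bool pi_lock_held;	/* toy stand-in for task->pi_lock */
};

/* Waker, step 1: take the lock and decide whether a wakeup is needed. */
static bool waker_check(struct model_task *t)
{
	t->pi_lock_held = true;
	return t->state == MODEL_SLEEPING;
}

/* Waker, step 2: commit the wakeup decided in step 1, then drop the lock. */
static void waker_commit(struct model_task *t, bool wake)
{
	if (wake)
		t->state = MODEL_RUNNING;
	t->pi_lock_held = false;
}

/* Exit path without the lock: the store can land inside the waker's
 * critical section. */
static void exit_store_unlocked(struct model_task *t)
{
	t->state = MODEL_DEAD;
}

/* Exit path with the lock: refuses to store while the waker holds it.
 * Returns true once the store has been done. */
static bool exit_store_locked(struct model_task *t)
{
	if (t->pi_lock_held)
		return false;
	t->state = MODEL_DEAD;
	return true;
}

/* Replay: check, unlocked DEAD store, commit. The commit overwrites DEAD. */
static enum model_state race_unlocked(void)
{
	struct model_task t = { MODEL_SLEEPING, false };
	bool wake = waker_check(&t);

	exit_store_unlocked(&t);
	waker_commit(&t, wake);
	return t.state;
}

/* Replay: the DEAD store has to wait until the waker drops the lock,
 * so it is ordered after the waker's RUNNING store. */
static enum model_state race_locked(void)
{
	struct model_task t = { MODEL_SLEEPING, false };
	bool wake = waker_check(&t);

	while (!exit_store_locked(&t))
		waker_commit(&t, wake);
	return t.state;
}
```

In this toy model race_unlocked() ends in MODEL_RUNNING (the dead state is lost) and race_locked() ends in MODEL_DEAD. It says nothing about the relative cost of the lock/unlock pair versus the barriers, which is the open question in this mail.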

