From: Oleg Nesterov <oleg@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Kautuk Consul <consul.kautuk@gmail.com>,
Ingo Molnar <mingo@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Michal Hocko <mhocko@suse.cz>,
David Rientjes <rientjes@google.com>,
Ionut Alexa <ionut.m.alexa@gmail.com>,
Guillaume Morin <guillaume@morinfr.org>,
linux-kernel@vger.kernel.org, Kirill Tkhai <tkhai@yandex.ru>
Subject: Re: [PATCH 1/1] do_exit(): Solve possibility of BUG() due to race with try_to_wake_up()
Date: Mon, 1 Sep 2014 19:58:51 +0200
Message-ID: <20140901175851.GA15210@redhat.com>
In-Reply-To: <20140901153935.GQ27892@worktop.ger.corp.intel.com>
On 09/01, Peter Zijlstra wrote:
>
> On Mon, Aug 25, 2014 at 05:57:38PM +0200, Oleg Nesterov wrote:
> > Peter, do you remember another problem with TASK_DEAD we discussed recently?
> > (prev_state == TASK_DEAD detection in finish_task_switch() still looks racy).
>
> Uhm, right. That was somewhere on the todo list :-)
>
> > I am starting to think that perhaps we need something like below, what do
> > you all think?
>
> I'm thinking you lost the hunk that adds rq::dead :-), more comments
> below.
And "goto deactivate" should be moved down, after "switch_count"
initialization.
> > + if (unlikely(rq->dead))
> > + goto deactivate;
> > +
>
> Yeah, it would be best to not have to do that; ideally we would be able
> to maybe do both; set rq->dead and current->state == TASK_DEAD.
The point was to avoid spin_unlock_wait() in do_exit(). But on second
thought this can't work, please see below.
> > --- x/kernel/exit.c
> > +++ x/kernel/exit.c
> > @@ -815,25 +815,8 @@ void do_exit(long code)
> > __this_cpu_add(dirty_throttle_leaks, tsk->nr_dirtied);
> > exit_rcu();
> >
> > - /*
> > - * The setting of TASK_RUNNING by try_to_wake_up() may be delayed
> > - * when the following two conditions become true.
> > - * - There is race condition of mmap_sem (It is acquired by
> > - * exit_mm()), and
> > - * - SMI occurs before setting TASK_RUNINNG.
> > - * (or hypervisor of virtual machine switches to other guest)
> > - * As a result, we may become TASK_RUNNING after becoming TASK_DEAD
> > - *
> > - * To avoid it, we have to wait for releasing tsk->pi_lock which
> > - * is held by try_to_wake_up()
> > - */
> > - smp_mb();
> > - raw_spin_unlock_wait(&tsk->pi_lock);
> > -
> > - /* causes final put_task_struct in finish_task_switch(). */
> > - tsk->state = TASK_DEAD;
> > tsk->flags |= PF_NOFREEZE; /* tell freezer to ignore us */
> > - schedule();
> > + exit_schedule();
> > BUG();
> > /* Avoid "noreturn function does return". */
> > for (;;)
>
> Yes, something like this might work fine..
Not really :/ Yes, rq->dead (or just "bool prev_dead") should obviously
solve the problem with ttwu() after the last schedule(). But only in
the sense that the dying task won't be activated.
However, the very fact that another CPU can look at this task_struct
means that we still need spin_unlock_wait(). If nothing else, to ensure
that try_to_wake_up()->spin_unlock(pi_lock) won't write into the memory
we are going to free.
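To make the point above concrete, here is a rough userspace sketch of that
ordering. It is not kernel code: struct fake_task, fake_ttwu(),
fake_unlock_wait() and run_exit_sketch() are all hypothetical names, pi_lock
is modelled as a plain C11 atomic, and the waker_inside flag exists only to
force the case where the waker already holds the lock when the exit path
gets there.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
enum fake_state { FAKE_TASK_SLEEPING, FAKE_TASK_RUNNING, FAKE_TASK_DEAD };

struct fake_task {
	atomic_int pi_lock;      /* 0 = unlocked, 1 = held; models tsk->pi_lock */
	atomic_int state;        /* models tsk->state */
	atomic_int waker_inside; /* scaffolding: waker already holds pi_lock */
};

/* Models a delayed try_to_wake_up(): all accesses to *t are under pi_lock. */
static void *fake_ttwu(void *arg)
{
	struct fake_task *t = arg;
	int expected = 0;

	while (!atomic_compare_exchange_weak(&t->pi_lock, &expected, 1))
		expected = 0;                        /* spin until we own the lock */
	atomic_store(&t->waker_inside, 1);
	atomic_store(&t->state, FAKE_TASK_RUNNING);  /* the late wakeup */
	atomic_store(&t->pi_lock, 0);                /* unlock: last write into *t */
	return NULL;
}

/* Models raw_spin_unlock_wait(): wait until nobody holds the lock. */
static void fake_unlock_wait(struct fake_task *t)
{
	while (atomic_load(&t->pi_lock))
		;
}

/* Runs the exit path against one in-flight waker; returns the final state. */
int run_exit_sketch(void)
{
	struct fake_task *t = malloc(sizeof(*t));
	pthread_t waker;
	int final_state;

	assert(t != NULL);
	atomic_init(&t->pi_lock, 0);
	atomic_init(&t->state, FAKE_TASK_SLEEPING);
	atomic_init(&t->waker_inside, 0);

	if (pthread_create(&waker, NULL, fake_ttwu, t) != 0)
		abort();

	/* Force the interleaving of interest: the waker is already inside. */
	while (!atomic_load(&t->waker_inside))
		;

	fake_unlock_wait(t);                     /* waker is done with *t */
	atomic_store(&t->state, FAKE_TASK_DEAD); /* no pending RUNNING write left */
	final_state = atomic_load(&t->state);
	free(t);                                 /* safe: no writer remains */

	pthread_join(waker, NULL);
	return final_state;
}
```

Without the fake_unlock_wait() call, the unlock store in fake_ttwu() could
land after free(t), which is the write into freed memory described above.
The sketch only covers a waker that is already inside the lock; it says
nothing about a waker that starts after the wait has finished.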
So I think the comment in do_exit() should be updated too, and smp_mb()
should be moved under raw_spin_unlock_wait() but ...
But. If I am right, doesn't this mean that we have even more problems with
postmortem wakeups??? Why can't ttwu() _start_ after spin_unlock_wait()?
Oleg.