From: Peter Zijlstra <peterz@infradead.org>
To: Oleg Nesterov <oleg@redhat.com>
Cc: Chris Mason <chris.mason@oracle.com>,
Frank Rowand <frank.rowand@am.sony.com>,
Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
Mike Galbraith <efault@gmx.de>, Paul Turner <pjt@google.com>,
Jens Axboe <axboe@kernel.dk>,
linux-kernel@vger.kernel.org
Subject: Re: [RFC][PATCH 5/5] sched: Reduce ttwu rq->lock contention
Date: Fri, 17 Dec 2010 19:24:57 +0100
Message-ID: <1292610297.2266.334.camel@twins>
In-Reply-To: <20101217175013.GB8997@redhat.com>

On Fri, 2010-12-17 at 18:50 +0100, Oleg Nesterov wrote:
> On 12/17, Oleg Nesterov wrote:
> >
> > On 12/16, Peter Zijlstra wrote:
> > >
> > > +	if (p->se.on_rq && ttwu_force(p, state, wake_flags))
> > > +		return 1;
> >
> > ----- WINDOW -----
> >
> > > +	for (;;) {
> > > +		unsigned int task_state = p->state;
> > > +
> > > +		if (!(task_state & state))
> > > +			goto out;
> > > +
> > > +		load = task_contributes_to_load(p);
> > > +
> > > +		if (cmpxchg(&p->state, task_state, TASK_WAKING) == task_state)
> > > +			break;
> >
> > Suppose that we have a task T sleeping in TASK_INTERRUPTIBLE state,
> > and this cpu does try_to_wake_up(TASK_INTERRUPTIBLE). on_rq == false.
> > try_to_wake_up() starts the "for (;;)" loop.
> >
> > However, in the WINDOW above, it is possible that somebody else wakes
> > it up, and then this task changes its state to TASK_INTERRUPTIBLE again.
> >
> > Then we set ->state = TASK_WAKING, but this (still running) T restores
> > TASK_RUNNING after us.
>
> Even simpler. This can race with, say, __migrate_task() which does
> deactivate_task + activate_task and temporarily clears on_rq. Although
> this is simple to fix, I think.

Yes, another hole..

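
To make the first scenario quoted above concrete, here is a minimal userspace sketch that replays the interleaving step by step on a single thread, so the outcome is deterministic. It is not kernel code and not part of the patch: struct task, the state values and simulate_window_race() are hypothetical stand-ins, and C11 atomic_compare_exchange_strong() plays the role of cmpxchg().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel task states (values are illustrative). */
enum { TASK_RUNNING = 0, TASK_INTERRUPTIBLE = 1, TASK_WAKING = 256 };

struct task {
	_Atomic unsigned int state;
	bool on_rq;
};

/*
 * Replay the WINDOW interleaving in program order.
 * Returns the final ->state of T; *waker_claimed tells whether waker A's
 * cmpxchg to TASK_WAKING succeeded.
 */
static unsigned int simulate_window_race(bool *waker_claimed)
{
	struct task t = { .state = TASK_INTERRUPTIBLE, .on_rq = false };

	/* Waker A: T is not on a runqueue, so the on_rq fast path is skipped. */
	bool a_saw_on_rq = t.on_rq;

	/* ----- WINDOW ----- */
	/* Waker B wakes T completely: T is enqueued and starts running. */
	t.on_rq = true;
	atomic_store(&t.state, TASK_RUNNING);
	/* T, still running, prepares to sleep again. */
	atomic_store(&t.state, TASK_INTERRUPTIBLE);

	/* Waker A resumes with its stale on_rq == false and enters the loop. */
	unsigned int expected = atomic_load(&t.state);
	*waker_claimed = !a_saw_on_rq &&
		(expected & TASK_INTERRUPTIBLE) &&
		atomic_compare_exchange_strong(&t.state, &expected, TASK_WAKING);

	/* T finds its wait condition true and restores TASK_RUNNING after A. */
	atomic_store(&t.state, TASK_RUNNING);

	return atomic_load(&t.state);
}
```

In this replay the cmpxchg succeeds, yet the final state is TASK_RUNNING: the waker believes it owns the wakeup of a task that was on a runqueue and running the whole time.
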
> Also. Afaics, without rq->lock, we can't trust "while (p->oncpu)", at
> least we need rmb() after that.

I think Linus once argued that loops like that should be fine without an
rmb(); at worst they'll have to spin a few more times to observe the
1->0 switch (we don't care about the 0->1 switch in this case because
that's ruled out by the ->state test).

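
As a rough userspace illustration of both points (the loop itself terminates without a barrier, but reading anything written before the 1->0 switch needs ordering after the loop), here is a C11 sketch. Everything in it is an assumption made for illustration: oncpu and rq_data are plain globals rather than kernel fields, switch_out() and wait_for_switch() are hypothetical names, and the release/acquire fences are only analogues of smp_wmb() and rmb().

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static _Atomic int oncpu = 1;	/* stand-in for p->oncpu */
static int rq_data;		/* plain data written before the 1->0 switch */

/* Plays the CPU that switches the task out. */
static void *switch_out(void *arg)
{
	(void)arg;
	rq_data = 42;					/* writes done during the switch */
	atomic_thread_fence(memory_order_release);	/* analogue of smp_wmb() */
	atomic_store_explicit(&oncpu, 0, memory_order_relaxed);
	return NULL;
}

/* Plays the waker: spin until the task is off its CPU, then read the data. */
static int wait_for_switch(void)
{
	pthread_t t;
	int seen;

	atomic_store(&oncpu, 1);
	rq_data = 0;
	if (pthread_create(&t, NULL, switch_out, NULL) != 0)
		return -1;

	/* No barrier inside the loop: at worst it spins a few more times. */
	while (atomic_load_explicit(&oncpu, memory_order_relaxed))
		;

	/* Analogue of the rmb() after the loop; pairs with the release fence. */
	atomic_thread_fence(memory_order_acquire);
	seen = rq_data;

	pthread_join(t, NULL);
	return seen;
}
```

Build with something like `cc -std=c11 -pthread`. Without the acquire fence the loop would still terminate, but the C11 model would no longer guarantee that the read of rq_data observes the value written before the store to oncpu.
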
> Interestingly, I can't really understand the current meaning of smp_wmb()
> in finish_lock_switch(). Do you know what exactly it buys?

I _think_ it's meant to ensure the full context switch happened and we've
stored all changes to the rq structure (destroying all references to
prev); in particular, we've finished writing the new value of current.

> In any case,
> task_running() (or its callers) do not have the corresponding rmb().
> Say, currently try_to_wake_up()->task_waking() can miss all changes
> starting from prepare_lock_switch(). Hopefully this is OK, but I am
> confused ;)

So I thought I saw how we are OK there, but then I got myself confused
too :-)

My argument was something along the lines of: there must be some
serialization between the task going to sleep and another task waking it
(the task setting TASK_UNINTERRUPTIBLE and enqueueing itself on a
waitqueue, and the waker finding it on the waitqueue); this should be
sufficient to make ->state visible to the waker.

If the waker observes a !TASK_RUNNING ->state, then by definition it
must see all the changes previous to it (including the ->oncpu 0->1
transition).

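
The serialization argument can be sketched in userspace, under the assumption that the waitqueue is protected by a lock taken by both sides. This is only an illustration of lock-based ordering, not the kernel's waitqueue code: wq_lock, wq_head, sleeper() and wake_one() are hypothetical names, and a pthread mutex stands in for the waitqueue lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

enum { TASK_RUNNING = 0, TASK_UNINTERRUPTIBLE = 2 };	/* illustrative values */

struct task {
	unsigned int state;	/* plain field: ordering comes from the lock */
	int oncpu;
};

static pthread_mutex_t wq_lock = PTHREAD_MUTEX_INITIALIZER;
static struct task *wq_head;	/* one-slot "waitqueue" */

/* Sleeper: set ->state, then enqueue itself under the waitqueue lock. */
static void *sleeper(void *arg)
{
	struct task *self = arg;

	self->oncpu = 1;			/* the 0->1 transition */
	self->state = TASK_UNINTERRUPTIBLE;
	pthread_mutex_lock(&wq_lock);
	wq_head = self;
	pthread_mutex_unlock(&wq_lock);
	return NULL;
}

/* Waker: poll the waitqueue under the same lock, return the ->state it finds. */
static unsigned int wake_one(void)
{
	struct task t = { .state = TASK_RUNNING, .oncpu = 0 };
	struct task *found = NULL;
	pthread_t th;
	unsigned int seen;

	wq_head = NULL;
	if (pthread_create(&th, NULL, sleeper, &t) != 0)
		return TASK_RUNNING;

	while (!found) {
		pthread_mutex_lock(&wq_lock);
		found = wq_head;
		pthread_mutex_unlock(&wq_lock);
	}

	/* The sleeper's unlock followed by our lock orders its earlier stores. */
	seen = found->oncpu ? found->state : TASK_RUNNING;

	pthread_join(th, NULL);
	return seen;
}
```

Once the waker has found the task on the queue, it is guaranteed to see both oncpu == 1 and the TASK_UNINTERRUPTIBLE state, because both stores precede the sleeper's unlock. Whether the lockless path in the patch gets an equivalent guarantee is exactly the open question here.
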
But like I said, got my brain in a twist too.