From: Peter Zijlstra <peterz@infradead.org>
To: Dongsheng Yang <dongsheng081251@gmail.com>
Cc: "yangds.fnst" <yangds.fnst@cn.fujitsu.com>,
linux-kernel@vger.kernel.org,
Steven Rostedt <rostedt@goodmis.org>,
mingo@redhat.com, bsegall@google.com
Subject: Re: Fwd: [PATCH] sched: Distinguish sched_wakeup event when wake up a task which did schedule out or not.
Date: Sun, 11 May 2014 18:35:31 +0200
Message-ID: <20140511163531.GG30445@twins.programming.kicks-ass.net>
In-Reply-To: <CA+qeAOr_PrBuo+qfjNoaZyHF21p=8UgQy1oZOjAxahhYTyrvLQ@mail.gmail.com>
On Sun, May 11, 2014 at 11:24:22PM +0800, Dongsheng Yang wrote:
> Actually, this patch does not attempt to solve the race condition.
> It only wants to avoid emitting sched:sched_wakeup with success==true
> for a fake wakeup, as explained below.
>
> > So the fundamental wait loop is:
> >
> > 	for (;;) {
> > 		set_current_state(TASK_UNINTERRUPTIBLE);
> > 		if (cond)
> > 			break;
> > 		schedule();
> > 	}
> > 	__set_task_state(TASK_RUNNING);
> >
> > And the fundamental wakeup is:
> >
> > 	cond = true;
> > 	wake_up_process(TASK_NORMAL);
> >
> > And this is very much on purpose a lock-free but strictly ordered
> > scenario. It is a variation of:
> >
> > 	X = Y = 0
> >
> > 	(wait)			(wake)
> > 	[w] X = 1		[w] Y = 1
> > 	MB			MB
> > 	[r] Y			[r] X
> >
> > 	[ where: X := state, Y := cond ]
> >
> > And we all 'know' that the only provided guarantee is that:
> > 	X==0 && Y==0
> > is impossible -- but only that, all 3 other states are observable.
> >
> > This guarantee means that it's impossible to both miss the condition and
> > the wakeup; iow. it guarantees fwd progress.
> >
> > OTOH it's fundamentally racy, nothing guarantees we will not 'observe' both
> > the condition and the wakeup.
> >
> > The setting of .success=false when ->on_rq is actively wrong, suppose
> > the waiter has already observed cond==false but has not yet gotten to
> > schedule(), at that point the wakeup happens and sees ->on_rq==1. The
> > wakeup is still very much a real wakeup.
>
>
> Yes, if a wakeup happens before schedule(), wakeup
> sees ->on_rq==1. Then we can get an event with .success==false.
> But I think it is not a real wakeup. :(
>
> Yes, at this moment, maybe the task is already off the run queue.
> But *this* wakeup did not move it back to the run queue; it only
> changed its state to TASK_RUNNING. I believe the next
> wakeup for this task will do the real wakeup, moving it back
> to the run queue.
>
> And if the scheduler really wakes it up, we can get an event with success==true.
>
> Anyway, what I want with this patch is to make the scheduler raise accurate
> events when waking up a task.
>
> If a wakeup only changes the state of the task, raise an event with success==false.
> If a wakeup moves a task back to the runqueue, .success==true.
>
> This means we do not need to care whether the task is currently on_rq;
> the value of .success is decided by what we actually did in
> try_to_wake_up().
>
> I hope I have explained myself clearly.

So if the wait side has already observed cond==false, then without the
wakeup, which still potentially has ->on_rq == true, it would block.
Therefore the wakeup is a _real_ wakeup.

We fundamentally cannot know, on the wake side, if the wait side has or
has not observed cond, and therefore the distinction you're trying to
make is a false one.