From: Peter Zijlstra <peterz@infradead.org>
To: Dongsheng Yang <dongsheng081251@gmail.com>
Cc: "yangds.fnst" <yangds.fnst@cn.fujitsu.com>,
linux-kernel@vger.kernel.org,
Steven Rostedt <rostedt@goodmis.org>,
mingo@redhat.com, bsegall@google.com
Subject: Re: Fwd: [PATCH] sched: Distinguish sched_wakeup event when wake up a task which did schedule out or not.
Date: Sun, 11 May 2014 18:35:31 +0200
Message-ID: <20140511163531.GG30445@twins.programming.kicks-ass.net>
In-Reply-To: <CA+qeAOr_PrBuo+qfjNoaZyHF21p=8UgQy1oZOjAxahhYTyrvLQ@mail.gmail.com>
On Sun, May 11, 2014 at 11:24:22PM +0800, Dongsheng Yang wrote:
> Actually, this patch does not attempt to solve the race condition.
> It only want to avoid sched:sched_wakeup with success==true in
> a fake wakeup, as explained below.
>
> > So the fundamental wait loop is:
> >
> >     for (;;) {
> >         set_current_state(TASK_UNINTERRUPTIBLE);
> >         if (cond)
> >             break;
> >         schedule();
> >     }
> >     __set_task_state(TASK_RUNNING);
> >
> > And the fundamental wakeup is:
> >
> >     cond = true;
> >     wake_up_process(TASK_NORMAL);
> >
> > And this is very much on purpose a lock-free but strictly ordered
> > scenario. It is a variation of:
> >
> >     X = Y = 0
> >
> >     (wait)          (wake)
> >     [w] X = 1       [w] Y = 1
> >     MB              MB
> >     [r] Y           [r] X
> >
> >     [ where: X := state, Y := cond ]
> >
> > And we all 'know' that the only provided guarantee is that:
> > X==0 && Y==0
> > is impossible -- but only that, all 3 other states are observable.
> >
> > This guarantee means that its impossible to both miss the condition and
> > the wakeup; iow. it guarantees fwd progress.
> >
> > OTOH its fundamentally racy, nothing guarantees we will not 'observe' both
> > the condition and the wakeup.
> >
> > The setting of .success=false when ->on_rq is actively wrong, suppose
> > the waiter has already observed cond==false but has not yet gotten to
> > schedule(), at that point the wakeup happens and sees ->on_rq==1. The
> > wakeup is still very much a real wakeup.
>
>
> Yes, if a wakeup happens before schedule(), wakeup
> sees ->on_rq==1. Then we can get an event with .success==false.
> But I think it is not a real wakeup. :(
>
> Yes, at this moment, maybe the task is already out of run queue.
> But *this* wakeup did not move it back to run queue, it only
> change the state of it to TASK_RUNNING. I believe the next
> wakeup for this task will do the real wake up moving it back
> to run queue.
>
> And if scheduler really wake it up, we can get an event with success==true.
>
> Anyway, what I want with this patch is to make scheduler raise accurate
> events when waking up a task.
>
> If a wakeup only change the state of task, raise a event with success==false.
> If a wakeup move a task back to runqueue, .success==true.
>
> It means, we do not need to care about the task is on_rq or not currently,
> the value of .success is decided by the behavior we did in the function
> of try_to_wake_up().
>
> Wish I explain myself clearly.

So if the wait side has already observed cond==false, then without the
wakeup, which still potentially has ->on_rq == true, it would block.
Therefore the wakeup is a _real_ wakeup.

We fundamentally cannot know, on the wake side, if the wait side has or
has not observed cond, and therefore the distinction you're trying to
make is a false one.
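
For anyone who wants to poke at the ordering argument outside the kernel,
here is a rough user-space sketch of the X/Y litmus test quoted above. It
is not scheduler code: it assumes C11 atomics and pthreads, a seq_cst
fence stands in for MB, and the names (wait_side, wake_side,
count_both_missed) are made up for illustration. X plays the role of the
task state and Y the condition. The only outcome the fences rule out is
both sides reading 0, which would be the lost wakeup; the other three
outcomes can all show up, which is why the wake side cannot tell what the
wait side has seen.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* X stands in for the waiter's task state, Y for the condition. */
static atomic_int X, Y;
/* What each side read; only inspected after both threads are joined. */
static int r_wait, r_wake;

static void *wait_side(void *arg)
{
	(void)arg;
	atomic_store_explicit(&X, 1, memory_order_relaxed);      /* [w] X = 1 */
	atomic_thread_fence(memory_order_seq_cst);               /* MB        */
	r_wait = atomic_load_explicit(&Y, memory_order_relaxed); /* [r] Y     */
	return NULL;
}

static void *wake_side(void *arg)
{
	(void)arg;
	atomic_store_explicit(&Y, 1, memory_order_relaxed);      /* [w] Y = 1 */
	atomic_thread_fence(memory_order_seq_cst);               /* MB        */
	r_wake = atomic_load_explicit(&X, memory_order_relaxed); /* [r] X     */
	return NULL;
}

/*
 * Run the litmus test `trials` times and count how often both sides
 * read 0, i.e. the waiter missed the condition AND the waker missed
 * the state change. With the fences in place this should stay at 0.
 */
int count_both_missed(int trials)
{
	int missed = 0;

	for (int i = 0; i < trials; i++) {
		pthread_t a, b;
		int err;

		atomic_store(&X, 0);
		atomic_store(&Y, 0);

		err = pthread_create(&a, NULL, wait_side, NULL);
		assert(err == 0);
		err = pthread_create(&b, NULL, wake_side, NULL);
		assert(err == 0);

		pthread_join(a, NULL);
		pthread_join(b, NULL);

		if (r_wait == 0 && r_wake == 0)
			missed++;
	}
	return missed;
}
```

Calling count_both_missed() with a few thousand trials should return 0.
Counting the r_wait==1 && r_wake==1 case instead will typically give a
non-zero number on a multi-core machine, which is the 'observed both the
condition and the wakeup' case from the quoted text.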