From: Peter Zijlstra <peterz@infradead.org>
To: Dongsheng Yang <dongsheng081251@gmail.com>
Cc: "yangds.fnst" <yangds.fnst@cn.fujitsu.com>,
	linux-kernel@vger.kernel.org,
	Steven Rostedt <rostedt@goodmis.org>,
	mingo@redhat.com, bsegall@google.com
Subject: Re: Fwd: [PATCH] sched: Distinguish sched_wakeup event when wake up a task which did schedule out or not.
Date: Sun, 11 May 2014 18:35:31 +0200
Message-ID: <20140511163531.GG30445@twins.programming.kicks-ass.net>
In-Reply-To: <CA+qeAOr_PrBuo+qfjNoaZyHF21p=8UgQy1oZOjAxahhYTyrvLQ@mail.gmail.com>

On Sun, May 11, 2014 at 11:24:22PM +0800, Dongsheng Yang wrote:
> Actually, this patch does not attempt to solve the race condition.
> It only wants to avoid sched:sched_wakeup with success==true in
> a fake wakeup, as explained below.
> 
> > So the fundamental wait loop is:
> >
> >    for (;;) {
> >         set_current_state(TASK_UNINTERRUPTIBLE);
> >         if (cond)
> >                 break;
> >         schedule();
> >    }
> >    __set_current_state(TASK_RUNNING);
> >
> > And the fundamental wakeup is:
> >
> >    cond = true;
> >    wake_up_process(waiter);	/* wakes TASK_NORMAL sleepers */
> >
> > And this is very much on purpose a lock-free but strictly ordered
> > scenario. It is a variation of:
> >
> >    X = Y = 0
> >
> >    (wait)       (wake)
> >    [w] X = 1    [w] Y = 1
> >    MB           MB
> >    [r] Y        [r] X
> >
> > [ where: X := state, Y := cond ]
> >
> > And we all 'know' that the only provided guarantee is that:
> >    X==0 && Y==0
> > is impossible -- but only that, all 3 other states are observable.
> >
> > This guarantee means that it's impossible to both miss the condition and
> > the wakeup; iow. it guarantees fwd progress.
> >
> > OTOH it's fundamentally racy, nothing guarantees we will not 'observe' both
> > the condition and the wakeup.
> >
> > Setting .success=false when ->on_rq is set is actively wrong: suppose
> > the waiter has already observed cond==false but has not yet gotten to
> > schedule(), at that point the wakeup happens and sees ->on_rq==1. The
> > wakeup is still very much a real wakeup.
> 
> 
> Yes, if a wakeup happens before schedule(), the wakeup
> sees ->on_rq==1. Then we can get an event with .success==false.
> But I think it is not a real wakeup. :(
> 
> Yes, at this moment, maybe the task is already off the run queue.
> But *this* wakeup did not move it back to the run queue, it only
> changed its state to TASK_RUNNING. I believe the next
> wakeup for this task will do the real wakeup, moving it back
> to the run queue.
> 
> And if the scheduler really wakes it up, we can get an event with success==true.
> 
> Anyway, what I want with this patch is to make the scheduler raise accurate
> events when waking up a task.
> 
> If a wakeup only changes the state of the task, raise an event with success==false.
> If a wakeup moves a task back to the runqueue, .success==true.
> 
> It means we do not need to care whether the task is currently on_rq or not;
> the value of .success is decided by what we actually did in
> try_to_wake_up().
> 
> I hope I have explained myself clearly.

So if the wait side has already observed cond==false, then without the
wakeup (which can still see ->on_rq == true) it would block.
Therefore the wakeup is a _real_ wakeup.
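
To make that scenario concrete, here is a minimal user-space sketch. It is
a toy model, not kernel code: struct model_task, model_wake(),
model_schedule() and waiter_blocks() are hypothetical names invented for
illustration, and the model assumes only that a wakeup puts the task back
into the running state and that schedule() blocks only when the state is
not running. The waiter starts where the argument above puts it: state
already set, cond==false already observed, schedule() not yet called.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified task model; not the kernel's task_struct. */
enum model_state { MODEL_RUNNING, MODEL_UNINTERRUPTIBLE };

struct model_task {
	enum model_state state;
	bool on_rq;		/* true while the task sits on the run queue */
};

/* Wake side: make the task runnable, report the ->on_rq value it saw. */
static bool model_wake(struct model_task *t)
{
	bool was_on_rq = t->on_rq;

	t->state = MODEL_RUNNING;
	t->on_rq = true;
	return was_on_rq;
}

/* Wait side: block (leave the run queue) only if the state is not running. */
static bool model_schedule(struct model_task *t)
{
	if (t->state != MODEL_RUNNING) {
		t->on_rq = false;
		return true;	/* the task blocked */
	}
	return false;		/* the task keeps running */
}

/*
 * The waiter has set its state and observed cond == false, but has not
 * reached schedule() yet. Returns true if it ends up blocking.
 */
static bool waiter_blocks(bool wakeup_arrives, bool *wake_saw_on_rq)
{
	struct model_task t = { MODEL_UNINTERRUPTIBLE, true };

	assert(wake_saw_on_rq != NULL);
	*wake_saw_on_rq = false;
	if (wakeup_arrives)
		*wake_saw_on_rq = model_wake(&t);
	return model_schedule(&t);
}
```

In this model waiter_blocks(false, &saw) reports that the task blocks,
while waiter_blocks(true, &saw) reports that it does not, even though the
wakeup saw on_rq set. The wakeup changed the outcome, which is the sense in
which it is real.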

We fundamentally cannot know, on the wake side, if the wait side has or
has not observed cond, and therefore the distinction you're trying to
make is a false one.
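
The X/Y pattern quoted above can also be checked by brute force. The sketch
below is an illustration only: it assumes the barriers make the four
accesses behave as if they were simply interleaved in some order, and the
helper names sb_run() and sb_enumerate() are made up for this example. It
walks every interleaving that keeps each side's program order and records
which pairs of read results show up.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Operations, with X := state and Y := cond as in the quoted diagram:
 *   0: wait side writes X = 1      1: wait side reads Y
 *   2: wake side writes Y = 1      3: wake side reads X
 */
static void sb_run(const int order[4], int *rY, int *rX)
{
	int X = 0, Y = 0;

	for (int i = 0; i < 4; i++) {
		switch (order[i]) {
		case 0: X = 1; break;
		case 1: *rY = Y; break;
		case 2: Y = 1; break;
		case 3: *rX = X; break;
		}
	}
}

/* Fill seen[rY][rX] for every interleaving that respects program order. */
static void sb_enumerate(bool seen[2][2])
{
	for (int a = 0; a < 2; a++)
		for (int b = 0; b < 2; b++)
			seen[a][b] = false;

	/* A mask with exactly two bits set picks the wait side's two slots. */
	for (int mask = 0; mask < 16; mask++) {
		int order[4], wait_next = 0, wake_next = 2, bits = 0;
		int rY = 0, rX = 0;

		for (int i = 0; i < 4; i++)
			bits += (mask >> i) & 1;
		if (bits != 2)
			continue;

		for (int i = 0; i < 4; i++)
			order[i] = ((mask >> i) & 1) ? wait_next++ : wake_next++;
		assert(wait_next == 2 && wake_next == 4);

		sb_run(order, &rY, &rX);
		seen[rY][rX] = true;
	}
}
```

Under that assumption seen[0][0] stays false and the other three entries
become true. In particular the wake side can read X==1 both when the wait
side read Y==0 and when it read Y==1, so what the wake side sees does not
reveal whether the waiter has observed cond.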
