From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Oleg Nesterov <oleg@redhat.com>
Cc: Chris Mason <chris.mason@oracle.com>,
	Frank Rowand <frank.rowand@am.sony.com>,
	Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
	Mike Galbraith <efault@gmx.de>, Paul Turner <pjt@google.com>,
	Jens Axboe <axboe@kernel.dk>, Yong Zhang <yong.zhang0@gmail.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC][PATCH 17/18] sched: Move the second half of ttwu() to the remote cpu
Date: Tue, 18 Jan 2011 17:38:08 +0100
Message-ID: <1295368688.30950.925.camel@laptop>
In-Reply-To: <20110107152207.GA16341@redhat.com>

On Fri, 2011-01-07 at 16:22 +0100, Oleg Nesterov wrote:

> Why sched_fork() does set_task_cpu() ? Just curious, it seems
> that wake_up_new_task() does all we need.

The only reason I can come up with is to properly initialize the
data structures before making the thing visible; by the time
wake_up_new_task() comes along, it's already fully visible.

> ttwu_queue_remote() does "struct task_struct *next = NULL".
> Probably "next = rq->wake_list" makes more sense. Otherwise the
> first cmpxchg() always fails if rq->wake_list != NULL.

Indeed, I think Yong mentioned the same a while back.. done.
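
To illustrate the cmpxchg() point, here is a minimal userspace sketch
using C11 atomics in place of the kernel's cmpxchg(). The struct
layouts, the wake_entry field, wake_list_push() and the seed_from_head
flag are made up for this example and are not the code from the patch;
only wake_list and the seeding of 'next' come from the discussion
above. It counts how many compare-and-swap attempts a push needs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for task_struct and rq; not the kernel types. */
struct task {
	struct task *wake_entry;	/* next pointer in the wake list */
};

struct runqueue {
	_Atomic(struct task *) wake_list;	/* head of the pending-wakeup list */
};

/*
 * Push p onto rq->wake_list and return the number of compare-and-swap
 * attempts used. seed_from_head selects between the two initial values
 * of 'next' discussed above: NULL (0) or the current list head (1).
 */
static int wake_list_push(struct runqueue *rq, struct task *p, int seed_from_head)
{
	struct task *next = seed_from_head ? atomic_load(&rq->wake_list) : NULL;
	int attempts = 0;

	assert(rq != NULL && p != NULL);

	for (;;) {
		p->wake_entry = next;
		attempts++;
		/* On failure, 'next' is updated to the head that was observed. */
		if (atomic_compare_exchange_strong(&rq->wake_list, &next, p))
			return attempts;
	}
}
```

With no concurrent writers, a NULL seed costs one wasted attempt
whenever the list is already non-empty, while seeding from the head
succeeds on the first try. Both variants end up with the same list.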

> Doesn't __migrate_task() need pi_lock? Consider:
> 
> 1. A task T runs on CPU_0, it does set_current_state(TASK_INTERRUPTIBLE)
> 
> 2. some CPU does set_cpus_allowed_ptr(T, new_mask), new_mask doesn't
>    include CPU_0.
> 
>    T is running, cpumask_any_and() picks CPU_1, set_cpus_allowed_ptr()
>    drops pi_lock and rq->lock before stop_one_cpu().
> 
> 3. T calls schedule() and becomes deactivated.
> 
> 4. CPU_2 does try_to_wake_up(T, TASK_INTERRUPTIBLE), takes pi_lock
>    and sees on_rq == F.
>
> 5. set_cpus_allowed_ptr() resumes and calls stop_one_cpu(cpu => 1).
> 
> 6. cpu_stopper_thread() runs on CPU_1 and calls __migrate_task().
>    It locks CPU_0 and CPU_1 rq's and checks task_cpu() == src_cpu.
> 
> 7. CPU_2 calls select_task_rq(), it returns (to simplify) 2.
> 
>    Now try_to_wake_up() does set_task_cpu(T, 2), and calls
>    ttwu_queue()->ttwu_do_activate()->activate_task().
> 
> 8. __migrate_task() on CPU_1 sees p->on_rq and starts the
>    deactivate/activate dance racing with ttwu_do_activate()
>    on CPU_2.

Drat, yes, I think you're right; now you've got me worried about the
other migration paths too. However did you come up with that
scenario? :-)

A simple fix would be to keep ->pi_lock locked over the call to
stop_one_cpu() from set_cpus_allowed_ptr().
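
As a rough illustration only, the interleaving from the quoted scenario
can be replayed in a single-threaded toy model. Everything here is
hypothetical: pi_lock is reduced to a boolean, model_ttwu() and
replay() are invented names, and the steps run in one fixed order. It
shows why a wakeup that cannot get past pi_lock cannot move the task
underneath the migration; it says nothing about whether holding the
lock across stop_one_cpu() is practical in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model; field names mirror the thread, the rest is hypothetical. */
struct task {
	int  cpu;	/* models task_cpu(p) */
	bool on_rq;	/* models p->on_rq */
	bool pi_locked;	/* models p->pi_lock being held by someone else */
};

/* Wakeup side: returns false if it would have to wait for pi_lock. */
static bool model_ttwu(struct task *p, int dest_cpu)
{
	if (p->pi_locked)
		return false;
	p->cpu = dest_cpu;	/* set_task_cpu() */
	p->on_rq = true;	/* activate_task() */
	return true;
}

/*
 * Replay the scenario in one fixed order. Returns true if the migration
 * path passed its task_cpu() check and then found the task queued on a
 * different CPU by a concurrent wakeup.
 */
static bool replay(bool hold_pi_lock_over_stop)
{
	struct task t = { .cpu = 0, .on_rq = true, .pi_locked = false };
	const int src_cpu = 0;
	bool checked, woke;

	/* Step 2: set_cpus_allowed_ptr() picks CPU_1, then drops or keeps pi_lock. */
	t.pi_locked = hold_pi_lock_over_stop;
	/* Step 3: T calls schedule() and is deactivated. */
	t.on_rq = false;
	/* Step 6: __migrate_task() checks task_cpu() == src_cpu. */
	checked = (t.cpu == src_cpu);
	/* Steps 4 and 7: the wakeup on CPU_2 selects CPU 2 and activates T. */
	woke = model_ttwu(&t, 2);
	/* Step 8: migration sees on_rq and would start its dequeue/enqueue. */
	return checked && woke && t.on_rq && t.cpu != src_cpu;
}
```

In this model replay(false) reports the overlap and replay(true) does
not, because the wakeup never gets past the lock.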

I think the sched_fair.c load-balance code paths are ok because we only
find a task to migrate after we've obtained both runqueue locks, so even
if we migrate current, it cannot schedule (step 3).

I'm not at all sure about the sched_rt load-balance paths, will need to
twist my head around that..


> And a final question. This is really, really minor, but
> activate_task/deactivate_task are not symmetric, the former
> always sets p->on_rq. Looks correct, but imho a bit confusing and
> can complicate the understanding. Since p->on_rq is cleared
> explicitly by schedule(), perhaps it can be set explicitly too
> in try_to_wake_up_*. Or, perhaps, activate/deactivate can check
> ENQUEUE_WAKEUP/DEQUEUE_SLEEP and set/clear p->on_rq. Once again,
> this is purely cosmetic issue.

Right, that's only because I didn't want to add conditionals, and
there are two ENQUEUE_WAKEUP sites and I didn't want to replicate the
assignment. I'll fix it up.


