From: Peter Zijlstra <peterz@infradead.org>
To: Oleg Nesterov <oleg@redhat.com>
Cc: Raistlin <raistlin@linux.it>, Ingo Molnar <mingo@elte.hu>,
Thomas Gleixner <tglx@linutronix.de>,
Steven Rostedt <rostedt@goodmis.org>,
Chris Friesen <cfriesen@nortel.com>,
Frederic Weisbecker <fweisbec@gmail.com>,
Darren Hart <darren@dvhart.com>,
Johan Eker <johan.eker@ericsson.com>,
"p.faure" <p.faure@akatech.ch>,
linux-kernel <linux-kernel@vger.kernel.org>,
Claudio Scordino <claudio@evidence.eu.com>,
michael trimarchi <trimarchi@retis.sssup.it>,
Fabio Checconi <fabio@gandalf.sssup.it>,
Tommaso Cucinotta <cucinotta@sssup.it>,
Juri Lelli <juri.lelli@gmail.com>,
Nicola Manica <nicola.manica@disi.unitn.it>,
Luca Abeni <luca.abeni@unitn.it>,
Dhaval Giani <dhaval@retis.sssup.it>,
Harald Gustafsson <hgu1972@gmail.com>,
paulmck <paulmck@linux.vnet.ibm.com>
Subject: Re: [RFC][PATCH 06/22] sched: SCHED_DEADLINE handles spacial kthreads
Date: Sat, 13 Nov 2010 21:31:31 +0100
Message-ID: <1289680291.2109.244.camel@laptop>
In-Reply-To: <20101113195857.GA11411@redhat.com>

On Sat, 2010-11-13 at 20:58 +0100, Oleg Nesterov wrote:
> On 11/13, Peter Zijlstra wrote:
> >
> > Something like so?.. hasn't even seen a compiler yet but one's got to do
> > something to keep the worst bore of saturday night telly in check ;-)
>
> Yes, I _think_ this all can work (and imho makes a lot of sense
> if it works).
>
> quick and dirty review below ;)
>
> >  struct take_cpu_down_param {
> > -	struct task_struct *caller;
> >  	unsigned long mod;
> >  	void *hcpu;
> >  };
> > @@ -208,11 +207,6 @@ static int __ref take_cpu_down(void *_pa
> >
> >  	cpu_notify(CPU_DYING | param->mod, param->hcpu);
> >
> > -	if (task_cpu(param->caller) == cpu)
> > -		move_task_off_dead_cpu(cpu, param->caller);
> > -	/* Force idle task to run as soon as we yield: it should
> > -	   immediately notice cpu is offline and die quickly. */
> > -	sched_idle_next();
>
> Yes. but we should remove "while (!idle_cpu(cpu))" from _cpu_down().

Right, I think we should replace that with something like
BUG_ON(!idle_cpu(cpu)); since we migrated everything away during the
stop machine, the cpu should be idle after it.

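To make the proposed check concrete, here is a minimal user-space sketch in C. It is not kernel code: toy_rq, toy_idle_cpu() and toy_check_idle_after_stop_machine() are hypothetical names, and assert() stands in for BUG_ON(). It assumes, as argued above, that every task was migrated during stop_machine, so the check becomes a one-shot assertion rather than a polling loop.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a per-cpu runqueue; NOT the kernel's struct rq. */
struct toy_rq {
	const void *curr;	/* task currently running on this cpu */
	const void *idle;	/* this cpu's idle task */
};

/* Models idle_cpu(): the cpu is idle when it is running its idle task. */
static bool toy_idle_cpu(const struct toy_rq *rq)
{
	return rq->curr == rq->idle;
}

/*
 * Models the proposal: instead of spinning until the cpu goes idle,
 * assert that it already is.
 */
static void toy_check_idle_after_stop_machine(const struct toy_rq *rq)
{
	assert(toy_idle_cpu(rq));	/* stands in for BUG_ON(!idle_cpu(cpu)) */
}
```

The trade-off is that an assertion turns a violated assumption into an immediate, visible failure, whereas a wait loop would spin silently.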
> > @@ -2381,18 +2381,15 @@ static int select_fallback_rq(int cpu, s
> >  		return dest_cpu;
> >
> >  	/* No more Mr. Nice Guy. */
> > -	if (unlikely(dest_cpu >= nr_cpu_ids)) {
> > -		dest_cpu = cpuset_cpus_allowed_fallback(p);
> > -		/*
> > -		 * Don't tell them about moving exiting tasks or
> > -		 * kernel threads (both mm NULL), since they never
> > -		 * leave kernel.
> > -		 */
> > -		if (p->mm && printk_ratelimit()) {
> > -			printk(KERN_INFO "process %d (%s) no "
> > -			       "longer affine to cpu%d\n",
> > -			       task_pid_nr(p), p->comm, cpu);
> > -		}
> > +	dest_cpu = cpuset_cpus_allowed_fallback(p);
> > +	/*
> > +	 * Don't tell them about moving exiting tasks or
> > +	 * kernel threads (both mm NULL), since they never
> > +	 * leave kernel.
> > +	 */
> > +	if (p->mm && printk_ratelimit()) {
> > +		printk(KERN_INFO "process %d (%s) no longer affine to cpu%d\n",
> > +		       task_pid_nr(p), p->comm, cpu);
> >  	}
>
> Hmm. I was really puzzled until I realized this is just cleanup,
> we can't reach this point if dest_cpu < nr_cpu_ids.

Right.. I noticed that when I read that code and thought I might as
well fix it up.

> > +static void move_task_off_dead_cpu(int dead_cpu, struct task_struct *p)
> >  {
> >  	struct rq *rq = cpu_rq(dead_cpu);
> > +	int needs_cpu, uninitialized_var(dest_cpu);
> >
> > -	/* Must be exiting, otherwise would be on tasklist. */
> > -	BUG_ON(!p->exit_state);
> > -
> > -	/* Cannot have done final schedule yet: would have vanished. */
> > -	BUG_ON(p->state == TASK_DEAD);
> > -
> > -	get_task_struct(p);
> > +	needs_cpu = (task_cpu(p) == dead_cpu) && (p->state != TASK_WAKING);
> > +	if (needs_cpu)
> > +		dest_cpu = select_fallback_rq(dead_cpu, p);
> > +	raw_spin_unlock(&rq->lock);
>
> Probably we do not need any checks. This task was picked by
> ->pick_next_task(), it should have task_cpu(p) == dead_cpu ?

Right, we can drop those checks, it's unconditionally true.

> But. I think there is a problem. We should not migrate current task,
> stop thread, which does the migrating. At least, sched_stoptask.c
> doesn't implement ->enqueue_task() and we can never wake it up later
> for kthread_stop().

Hrm, right, so while the migration thread isn't actually on any rq
structure as such, pick_next_task() will return it.. so we need to come
up with a way to skip it.

As to current, take_cpu_down() is actually migrating current away before
this patch, so I simply included current in the
CPU_DYING->migrate_tasks() loop and removed the special case from
take_cpu_down().

> Perhaps migrate_tasks() should do for_each_class() by hand to
> ignore stop_sched_class. But then _cpu_down() should somehow
> ensure the stop thread on the dead CPU is already parked in
> schedule().

Well, since we're in stop_machine, all cpus but the one that is
executing are stuck in the stop_machine_cpu_stop() loop. In both cases we
could simply fudge the pick_next_task_stop() condition (e.g. set
rq->stop = NULL) while doing that loop and restore it afterwards; nothing
will hit schedule() while we're there.

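The rq->stop trick can be illustrated with a toy user-space model in C. This is only a sketch under stated assumptions: toy_task, toy_rq, toy_pick_next_task() and toy_migrate_tasks() are hypothetical names, the stop thread is kept outside the queue and outside nr_running, and no locking is modelled because nothing else is expected to reach schedule() during stop_machine.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_TASKS 8

/* Toy task and runqueue; NOT the kernel's task_struct / struct rq. */
struct toy_task {
	const char *comm;
	int cpu;
};

struct toy_rq {
	int cpu;
	struct toy_task *stop;			/* per-cpu stop thread */
	struct toy_task *queue[TOY_MAX_TASKS];	/* ordinary runnable tasks */
	int nr_running;				/* number of entries in queue[] */
};

/* Highest class first: the stop thread if visible, then queued tasks. */
static struct toy_task *toy_pick_next_task(struct toy_rq *rq)
{
	if (rq->stop)
		return rq->stop;
	if (rq->nr_running > 0)
		return rq->queue[rq->nr_running - 1];
	return NULL;	/* the real scheduler would fall back to idle here */
}

/* Move every queued task from @dead to @dest; returns how many moved. */
static int toy_migrate_tasks(struct toy_rq *dead, struct toy_rq *dest)
{
	struct toy_task *stop = dead->stop;
	int moved = 0;

	dead->stop = NULL;	/* hide the stop thread from the picker */

	while (dead->nr_running > 0) {
		struct toy_task *p = toy_pick_next_task(dead);

		assert(p != NULL && p != stop);
		dead->queue[--dead->nr_running] = NULL;	/* dequeue p */

		assert(dest->nr_running < TOY_MAX_TASKS);
		p->cpu = dest->cpu;
		dest->queue[dest->nr_running++] = p;
		moved++;
	}

	dead->stop = stop;	/* restore it once the loop is done */
	return moved;
}
```

Hiding the stop thread keeps the picker itself unchanged; the loop simply never sees a task it must not migrate.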
> > -	case CPU_DYING_FROZEN:
> >  		/* Update our root-domain */
> >  		raw_spin_lock_irqsave(&rq->lock, flags);
> >  		if (rq->rd) {
> >  			BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
> >  			set_rq_offline(rq);
> >  		}
> > +		migrate_tasks(cpu);
> > +		BUG_ON(rq->nr_running != 0);
> >  		raw_spin_unlock_irqrestore(&rq->lock, flags);
>
> Probably we don't really need rq->lock. All cpus run stop threads.

Right, but I was worried about stuff that relied on lockdep state like
the rcu lockdep stuff.. and taking the lock doesn't hurt.

> I am not sure about rq->idle, perhaps it should be deactivated.
> I don't think we should migrate it.

Ah, I think the !nr_running check will bail before we end up selecting
the idle thread.

> What I never understood is the meaning of play_dead/etc. If we
> remove sched_idle_next(), who will do that logic? And how the
> idle thread can call idle_task_exit() ?

Well, since we'll have migrated every task on that runqueue (except the
migration thread), the only runnable task left (once the migration
thread stops running) is the idle thread, so it should be implicit.

As to play_dead():

  cpu_idle()
    if (cpu_is_offline(smp_processor_id()))
      play_dead()
        native_play_dead() /* did I already say I detest paravirt? */
          play_dead_common()
            idle_task_exit();
            local_irq_disable();
          tboot/mwait/hlt

It basically puts the cpu to sleep with IRQs disabled, needs special
magic to wake it back up.
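
The call chain above can be modelled in a few lines of user-space C. This is a sketch, not the x86 implementation: toy_cpu, toy_play_dead() and toy_cpu_idle_once() are hypothetical names, the boolean fields only record that each step happened, and the final halt is represented by a state flag instead of tboot/mwait/hlt.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_cpu_state { TOY_RUNNING, TOY_HALTED };

/* Toy per-cpu state; every field is an illustrative assumption. */
struct toy_cpu {
	bool online;
	bool irqs_enabled;
	bool idle_task_exited;		/* records the idle_task_exit() step */
	enum toy_cpu_state state;
};

/* Mirrors the order in the chain: exit idle task, mask IRQs, halt. */
static void toy_play_dead(struct toy_cpu *cpu)
{
	assert(!cpu->online);		/* only an offline cpu plays dead */

	cpu->idle_task_exited = true;	/* idle_task_exit() */
	cpu->irqs_enabled = false;	/* local_irq_disable() */
	cpu->state = TOY_HALTED;	/* stands in for tboot/mwait/hlt */
}

/*
 * One pass of the idle loop. Returns true if the cpu noticed it is
 * offline and went to sleep; false if it stays in normal idle.
 */
static bool toy_cpu_idle_once(struct toy_cpu *cpu)
{
	if (!cpu->online) {
		toy_play_dead(cpu);
		return true;
	}
	return false;
}
```

Because the idle thread is the only runnable task left after migration, the offline check in the idle loop is reached without any explicit hand-off, which is what makes sched_idle_next() unnecessary in this model.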