From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755647Ab3LTROJ (ORCPT );
	Fri, 20 Dec 2013 12:14:09 -0500
Received: from merlin.infradead.org ([205.233.59.134]:35496 "EHLO
	merlin.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753342Ab3LTROG (ORCPT );
	Fri, 20 Dec 2013 12:14:06 -0500
Date: Fri, 20 Dec 2013 18:13:43 +0100
From: Peter Zijlstra
To: tglx@linutronix.de, mingo@redhat.com, rostedt@goodmis.org,
	oleg@redhat.com, fweisbec@gmail.com, darren@dvhart.com,
	johan.eker@ericsson.com, p.faure@akatech.ch,
	linux-kernel@vger.kernel.org, claudio@evidence.eu.com,
	michael@amarulasolutions.com, fchecconi@gmail.com,
	tommaso.cucinotta@sssup.it, juri.lelli@gmail.com,
	nicola.manica@disi.unitn.it, luca.abeni@unitn.it,
	dhaval.giani@gmail.com, hgu1972@gmail.com,
	paulmck@linux.vnet.ibm.com, raistlin@linux.it,
	insop.song@gmail.com, liming.wang@windriver.com, jkacur@redhat.com
Subject: Re: [PATCH 09/13] sched: Add bandwidth management for sched_dl
Message-ID: <20131220171343.GL2480@laptop.programming.kicks-ass.net>
References: <20131217122720.950475833@infradead.org>
	<20131217123353.180539582@infradead.org>
	<20131218165508.GB30183@twins.programming.kicks-ass.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20131218165508.GB30183@twins.programming.kicks-ass.net>
User-Agent: Mutt/1.5.21 (2012-12-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Dec 18, 2013 at 05:55:08PM +0100, Peter Zijlstra wrote:
> If the purpose is to fail hotplug because taking out the CPU would end
> up in over-subscription, then we need a DOWN_PREPARE handler.

Juri just said (on IRC) that that was indeed the intended purpose.

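As a rough illustration of the check such a DOWN_PREPARE handler has to
make, here is a minimal user-space sketch in C. It is not the kernel code:
struct dl_bw_sketch, dl_overflow_sketch() and can_offline_one_cpu() are
hypothetical names, bandwidth is in arbitrary units, and treating a limit
of -1 as "admission control disabled" is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for per-root-domain bandwidth state. */
struct dl_bw_sketch {
	int64_t bw;		/* per-CPU bandwidth limit; -1 is assumed to mean "no limit" */
	uint64_t total_bw;	/* bandwidth already admitted across the domain */
};

/* True if 'cpus' CPUs cannot carry the bandwidth that is already admitted. */
static bool dl_overflow_sketch(const struct dl_bw_sketch *b, int cpus)
{
	if (b->bw == -1)
		return false;	/* assumed: no admission control, never overflows */

	return (uint64_t)b->bw * (uint64_t)cpus < b->total_bw;
}

/*
 * Decision for taking one CPU out of a domain that currently spans
 * 'cpus' CPUs: allowed only if the remaining cpus - 1 still suffice.
 */
static bool can_offline_one_cpu(const struct dl_bw_sketch *b, int cpus)
{
	return !dl_overflow_sketch(b, cpus - 1);
}
```

With a limit of 100 per CPU and 250 already admitted, going from 4 to 3
CPUs would be allowed (300 >= 250) while going from 3 to 2 would be
refused (200 < 250), which is the case where the handler should fail the
hotplug operation.
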
---
Subject: sched, deadline: Fix hotplug admission control
From: Peter Zijlstra
Date: Thu Dec 19 11:54:45 CET 2013

The current hotplug admission control is broken because:

  CPU_DYING -> migration_call() -> migrate_tasks() -> __migrate_task()

cannot fail and hard assumes it _will_ move all tasks off of the dying
cpu, failing this will break hotplug.

The much simpler solution is a DOWN_PREPARE handler that fails when
removing one CPU gets us below the total allocated bandwidth.

Signed-off-by: Peter Zijlstra
---
 kernel/sched/core.c |   68 ++++++++++++++++------------------------------------
 1 file changed, 21 insertions(+), 47 deletions(-)

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1886,9 +1886,9 @@ inline struct dl_bw *dl_bw_of(int i)
 	return &cpu_rq(i)->rd->dl_bw;
 }

-static inline int __dl_span_weight(struct rq *rq)
+static inline int dl_bw_cpus(int i)
 {
-	return cpumask_weight(rq->rd->span);
+	return cpumask_weight(cpu_rq(i)->rd->span);
 }
 #else
 inline struct dl_bw *dl_bw_of(int i)
@@ -1896,7 +1896,7 @@ inline struct dl_bw *dl_bw_of(int i)
 	return &cpu_rq(i)->dl.dl_bw;
 }

-static inline int __dl_span_weight(struct rq *rq)
+static inline int dl_bw_cpus(int i)
 {
 	return 1;
 }
@@ -1937,7 +1937,7 @@ static int dl_overflow(struct task_struc
 	u64 period = attr->sched_period;
 	u64 runtime = attr->sched_runtime;
 	u64 new_bw = dl_policy(policy) ? to_ratio(period, runtime) : 0;
-	int cpus = __dl_span_weight(task_rq(p));
+	int cpus = dl_bw_cpus(task_cpu(p));
 	int err = -1;

 	if (new_bw == p->dl.dl_bw)
@@ -4523,42 +4523,6 @@ int set_cpus_allowed_ptr(struct task_str
 EXPORT_SYMBOL_GPL(set_cpus_allowed_ptr);

 /*
- * When dealing with a -deadline task, we have to check if moving it to
- * a new CPU is possible or not. In fact, this is only true iff there
- * is enough bandwidth available on such CPU, otherwise we want the
- * whole migration procedure to fail over.
- */
-static inline
-bool set_task_cpu_dl(struct task_struct *p, unsigned int cpu)
-{
-	struct dl_bw *dl_b = dl_bw_of(task_cpu(p));
-	struct dl_bw *cpu_b = dl_bw_of(cpu);
-	int ret = 1;
-	u64 bw;
-
-	if (dl_b == cpu_b)
-		return 1;
-
-	raw_spin_lock(&dl_b->lock);
-	raw_spin_lock(&cpu_b->lock);
-
-	bw = cpu_b->bw * cpumask_weight(cpu_rq(cpu)->rd->span);
-	if (dl_bandwidth_enabled() &&
-	    bw < cpu_b->total_bw + p->dl.dl_bw) {
-		ret = 0;
-		goto unlock;
-	}
-	dl_b->total_bw -= p->dl.dl_bw;
-	cpu_b->total_bw += p->dl.dl_bw;
-
-unlock:
-	raw_spin_unlock(&cpu_b->lock);
-	raw_spin_unlock(&dl_b->lock);
-
-	return ret;
-}
-
-/*
  * Move (not current) task off this cpu, onto dest cpu. We're doing
  * this because either it can't run here any more (set_cpus_allowed()
  * away from this CPU, or CPU going down), or because we're
@@ -4590,13 +4554,6 @@ static int __migrate_task(struct task_st
 		goto fail;

 	/*
-	 * If p is -deadline, proceed only if there is enough
-	 * bandwidth available on dest_cpu
-	 */
-	if (unlikely(dl_task(p)) && !set_task_cpu_dl(p, dest_cpu))
-		goto fail;
-
-	/*
 	 * If we're not on a rq, the next wake-up will ensure we're
 	 * placed properly.
 	 */
@@ -4985,6 +4942,23 @@ migration_call(struct notifier_block *nf
 	unsigned long flags;
 	struct rq *rq = cpu_rq(cpu);

+	switch (action) {
+	case CPU_DOWN_PREPARE: /* explicitly allow suspend */
+		{
+			struct dl_bw *dl_b = dl_bw_of(cpu);
+			int cpus = dl_bw_cpus(cpu);
+			bool overflow;
+
+			raw_spin_lock_irqsave(&dl_b->lock, flags);
+			overflow = __dl_overflow(dl_b, cpus-1, 0, 0);
+			raw_spin_unlock_irqrestore(&dl_b->lock, flags);
+
+			if (overflow)
+				return notifier_from_errno(-EBUSY);
+		}
+		break;
+	}
+
 	switch (action & ~CPU_TASKS_FROZEN) {
 	case CPU_UP_PREPARE: