From: Raistlin <raistlin@linux.it>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
Steven Rostedt <rostedt@goodmis.org>,
Chris Friesen <cfriesen@nortel.com>,
oleg@redhat.com, Frederic Weisbecker <fweisbec@gmail.com>,
Darren Hart <darren@dvhart.com>,
Johan Eker <johan.eker@ericsson.com>,
"p.faure" <p.faure@akatech.ch>,
linux-kernel <linux-kernel@vger.kernel.org>,
Claudio Scordino <claudio@evidence.eu.com>,
michael trimarchi <trimarchi@retis.sssup.it>,
Fabio Checconi <fabio@gandalf.sssup.it>,
Tommaso Cucinotta <cucinotta@sssup.it>,
Juri Lelli <juri.lelli@gmail.com>,
Nicola Manica <nicola.manica@disi.unitn.it>,
Luca Abeni <luca.abeni@unitn.it>,
Dhaval Giani <dhaval@retis.sssup.it>,
Harald Gustafsson <hgu1972@gmail.com>,
paulmck <paulmck@linux.vnet.ibm.com>
Subject: [RFC][PATCH 06/22] sched: SCHED_DEADLINE handles spacial kthreads
Date: Fri, 29 Oct 2010 08:31:16 +0200 [thread overview]
Message-ID: <1288333876.8661.147.camel@Palantir> (raw)
In-Reply-To: <1288333128.8661.137.camel@Palantir>
[-- Attachment #1: Type: text/plain, Size: 9656 bytes --]
There sometimes is the need of executing a task as if it would
have the maximum possible priority in the entire system, i.e.,
whenever it gets ready it must run! This is for example the case
for some maintainance kernel thread like migration and (sometimes)
watchdog or ksoftirq.
Since SCHED_DEADLINE is now the highest priority scheduling class
these tasks have to be handled therein, but it is not obvious how
to choose a runtime and a deadline that guarantee what explained
above. Therefore, we need a mean of recognizing system tasks inside
the -deadline class and always run them as soon as possible, without
any kind of runtime and bandwidth limitation.
This patch:
- adds the SF_HEAD flag, which identify a special task that need
absolute prioritization among any other task;
- ensures that special tasks always preempt everyone else (and,
obviously, are not preempted by non special tasks);
- disables runtime and bandwidth checking for such tasks, hoping
that the interference they cause is small enough.
Signed-off-by: Dario Faggioli <raistlin@linux.it>
---
include/linux/sched.h | 13 ++++++++++
kernel/sched.c | 59 ++++++++++++++++++++++++++++++++++++++----------
kernel/sched_dl.c | 27 ++++++++++++++++++++--
kernel/softirq.c | 6 +----
kernel/watchdog.c | 3 +-
5 files changed, 85 insertions(+), 23 deletions(-)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index f94da51..f25d3a6 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -154,6 +154,18 @@ struct sched_param_ex {
struct timespec curr_deadline;
};
+/*
+ * Scheduler flags.
+ *
+ * @SF_HEAD tells us that the task has to be considered one of the
+ * maximum priority tasks in the system. This means it
+ * always enqueued with maximum priority in the runqueue
+ * of the highest priority scheduling class. In case it
+ * it sched_deadline, the task also ignore runtime and
+ * bandwidth limitations.
+ */
+#define SF_HEAD 1
+
struct exec_domain;
struct futex_pi_state;
struct robust_list_head;
@@ -2072,6 +2084,7 @@ extern int sched_setscheduler(struct task_struct *, int,
const struct sched_param *);
extern int sched_setscheduler_nocheck(struct task_struct *, int,
const struct sched_param *);
+extern void setscheduler_dl_special(struct task_struct *);
extern int sched_setscheduler_ex(struct task_struct *, int,
const struct sched_param *,
const struct sched_param_ex *);
diff --git a/kernel/sched.c b/kernel/sched.c
index 208fa08..79e7c1c 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2086,19 +2086,13 @@ static void sched_irq_time_avg_update(struct rq *rq, u64 curr_irq_time) { }
void sched_set_stop_task(int cpu, struct task_struct *stop)
{
- struct sched_param param = { .sched_priority = MAX_RT_PRIO - 1 };
struct task_struct *old_stop = cpu_rq(cpu)->stop;
if (stop) {
/*
- * Make it appear like a SCHED_FIFO task, its something
- * userspace knows about and won't get confused about.
- *
- * Also, it will make PI more or less work without too
- * much confusion -- but then, stop work should not
- * rely on PI working anyway.
+ * Make it appear like a SCHED_DEADLINE task.
*/
- sched_setscheduler_nocheck(stop, SCHED_FIFO, ¶m);
+ setscheduler_dl_special(stop);
stop->sched_class = &stop_sched_class;
}
@@ -2110,7 +2104,7 @@ void sched_set_stop_task(int cpu, struct task_struct *stop)
* Reset it back to a normal scheduling class so that
* it can die in pieces.
*/
- old_stop->sched_class = &rt_sched_class;
+ old_stop->sched_class = &dl_sched_class;
}
}
@@ -4808,9 +4802,15 @@ __getparam_dl(struct task_struct *p, struct sched_param_ex *param_ex)
* than the runtime.
*/
static bool
-__checkparam_dl(const struct sched_param_ex *prm)
+__checkparam_dl(const struct sched_param_ex *prm, bool kthread)
{
- return prm && timespec_to_ns(&prm->sched_deadline) != 0 &&
+ if (!prm)
+ return false;
+
+ if (prm->sched_flags & SF_HEAD)
+ return kthread;
+
+ return timespec_to_ns(&prm->sched_deadline) != 0 &&
timespec_compare(&prm->sched_deadline,
&prm->sched_runtime) >= 0;
}
@@ -4869,7 +4869,7 @@ recheck:
(p->mm && param->sched_priority > MAX_USER_RT_PRIO-1) ||
(!p->mm && param->sched_priority > MAX_RT_PRIO-1))
return -EINVAL;
- if ((dl_policy(policy) && !__checkparam_dl(param_ex)) ||
+ if ((dl_policy(policy) && !__checkparam_dl(param_ex, !p->mm)) ||
(rt_policy(policy) != (param->sched_priority != 0)))
return -EINVAL;
@@ -5133,6 +5133,39 @@ SYSCALL_DEFINE4(sched_setscheduler_ex, pid_t, pid, int, policy,
return do_sched_setscheduler_ex(pid, policy, len, param_ex);
}
+/*
+ * These functions make the task one of the highest priority task in
+ * the system. This means it will always run as soon as it gets ready,
+ * and it won't be preempted by any other task, independently from their
+ * scheduling policy, deadline, priority, etc. (provided they're not
+ * 'special tasks' as well).
+ */
+static void __setscheduler_dl_special(struct rq *rq, struct task_struct *p)
+{
+ p->dl.dl_runtime = 0;
+ p->dl.dl_deadline = 0;
+ p->dl.flags = SF_HEAD;
+ p->dl.dl_new = 1;
+
+ __setscheduler(rq, p, SCHED_DEADLINE, MAX_RT_PRIO-1);
+}
+
+void setscheduler_dl_special(struct task_struct *p)
+{
+ struct sched_param param;
+ struct sched_param_ex param_ex;
+
+ param.sched_priority = 0;
+
+ param_ex.sched_priority = MAX_RT_PRIO-1;
+ param_ex.sched_runtime = ns_to_timespec(0);
+ param_ex.sched_deadline = ns_to_timespec(0);
+ param_ex.sched_flags = SF_HEAD;
+
+ __sched_setscheduler(p, SCHED_DEADLINE, ¶m, ¶m_ex, false);
+}
+EXPORT_SYMBOL(setscheduler_dl_special);
+
/**
* sys_sched_setparam - set/change the RT priority of a thread
* @pid: the pid in question.
@@ -6071,7 +6104,7 @@ void sched_idle_next(void)
*/
raw_spin_lock_irqsave(&rq->lock, flags);
- __setscheduler(rq, p, SCHED_FIFO, MAX_RT_PRIO-1);
+ __setscheduler_dl_special(rq, p);
activate_task(rq, p, 0);
diff --git a/kernel/sched_dl.c b/kernel/sched_dl.c
index 9d0443e..17973aa 100644
--- a/kernel/sched_dl.c
+++ b/kernel/sched_dl.c
@@ -19,6 +19,21 @@ static inline int dl_time_before(u64 a, u64 b)
return (s64)(a - b) < 0;
}
+/*
+ * Tells if entity @a should preempt entity @b.
+ */
+static inline
+int dl_entity_preempt(struct sched_dl_entity *a, struct sched_dl_entity *b)
+{
+ /*
+ * A system task marked with SF_HEAD flag will always
+ * preempt a non 'special' one.
+ */
+ return a->flags & SF_HEAD ||
+ (!(b->flags & SF_HEAD) &&
+ dl_time_before(a->deadline, b->deadline));
+}
+
static inline struct task_struct *dl_task_of(struct sched_dl_entity *dl_se)
{
return container_of(dl_se, struct task_struct, dl);
@@ -291,7 +306,13 @@ int dl_runtime_exceeded(struct rq *rq, struct sched_dl_entity *dl_se)
int dmiss = dl_time_before(dl_se->deadline, rq->clock);
int rorun = dl_se->runtime <= 0;
- if (!rorun && !dmiss)
+ /*
+ * No need for checking if it's time to enforce the
+ * bandwidth for the tasks that are:
+ * - maximum priority (SF_HEAD),
+ * - not overrunning nor missing a deadline.
+ */
+ if (dl_se->flags & SF_HEAD || (!rorun && !dmiss))
return 0;
/*
@@ -359,7 +380,7 @@ static void __enqueue_dl_entity(struct sched_dl_entity *dl_se)
while (*link) {
parent = *link;
entry = rb_entry(parent, struct sched_dl_entity, rb_node);
- if (dl_time_before(dl_se->deadline, entry->deadline))
+ if (dl_entity_preempt(dl_se, entry))
link = &parent->rb_left;
else {
link = &parent->rb_right;
@@ -471,7 +492,7 @@ static void check_preempt_curr_dl(struct rq *rq, struct task_struct *p,
int flags)
{
if (!dl_task(rq->curr) || (dl_task(p) &&
- dl_time_before(p->dl.deadline, rq->curr->dl.deadline)))
+ dl_entity_preempt(&p->dl, &rq->curr->dl)))
resched_task(rq->curr);
}
diff --git a/kernel/softirq.c b/kernel/softirq.c
index d4d918a..9c4c967 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -853,13 +853,9 @@ static int __cpuinit cpu_callback(struct notifier_block *nfb,
cpumask_any(cpu_online_mask));
case CPU_DEAD:
case CPU_DEAD_FROZEN: {
- static struct sched_param param = {
- .sched_priority = MAX_RT_PRIO-1
- };
-
p = per_cpu(ksoftirqd, hotcpu);
per_cpu(ksoftirqd, hotcpu) = NULL;
- sched_setscheduler_nocheck(p, SCHED_FIFO, ¶m);
+ setscheduler_dl_special(p);
kthread_stop(p);
takeover_tasklets(hotcpu);
break;
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 94ca779..2b7f259 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -307,10 +307,9 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
*/
static int watchdog(void *unused)
{
- static struct sched_param param = { .sched_priority = MAX_RT_PRIO-1 };
struct hrtimer *hrtimer = &__raw_get_cpu_var(watchdog_hrtimer);
- sched_setscheduler(current, SCHED_FIFO, ¶m);
+ setscheduler_dl_special(current);
/* initialize timestamp */
__touch_watchdog();
--
1.7.2.3
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
----------------------------------------------------------------------
Dario Faggioli, ReTiS Lab, Scuola Superiore Sant'Anna, Pisa (Italy)
http://blog.linux.it/raistlin / raistlin@ekiga.net /
dario.faggioli@jabber.org
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]
next prev parent reply other threads:[~2010-10-29 6:31 UTC|newest]
Thread overview: 135+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-10-29 6:18 [RFC][PATCH 00/22] sched: SCHED_DEADLINE v3 Raistlin
2010-10-29 6:25 ` [RFC][PATCH 01/22] sched: add sched_class->task_dead Raistlin
2010-10-29 6:27 ` [RFC][PATCH 02/22] sched: add extended scheduling interface Raistlin
2010-11-10 16:00 ` Dhaval Giani
2010-11-10 16:12 ` Dhaval Giani
2010-11-10 22:45 ` Raistlin
2010-11-10 16:17 ` Claudio Scordino
2010-11-10 17:28 ` Peter Zijlstra
2010-11-10 19:26 ` Peter Zijlstra
2010-11-10 23:33 ` Tommaso Cucinotta
2010-11-11 12:19 ` Peter Zijlstra
2010-11-10 22:17 ` Raistlin
2010-11-10 22:57 ` Tommaso Cucinotta
2010-11-11 13:32 ` Peter Zijlstra
2010-11-11 13:54 ` Raistlin
2010-11-11 14:08 ` Peter Zijlstra
2010-11-11 17:27 ` Raistlin
2010-11-11 14:05 ` Dhaval Giani
2010-11-10 22:24 ` Raistlin
2010-11-10 18:50 ` Peter Zijlstra
2010-11-10 22:05 ` Raistlin
2010-11-12 16:38 ` Steven Rostedt
2010-11-12 16:43 ` Peter Zijlstra
2010-11-12 16:52 ` Steven Rostedt
2010-11-12 19:19 ` Raistlin
2010-11-12 19:23 ` Steven Rostedt
2010-11-12 17:42 ` Tommaso Cucinotta
2010-11-12 19:21 ` Steven Rostedt
2010-11-12 19:24 ` Raistlin
2010-10-29 6:28 ` [RFC][PATCH 03/22] sched: SCHED_DEADLINE data structures Raistlin
2010-11-10 18:59 ` Peter Zijlstra
2010-11-10 22:06 ` Raistlin
2010-11-10 19:10 ` Peter Zijlstra
2010-11-12 17:11 ` Steven Rostedt
2010-10-29 6:29 ` [RFC][PATCH 04/22] sched: SCHED_DEADLINE SMP-related " Raistlin
2010-11-10 19:17 ` Peter Zijlstra
2010-10-29 6:30 ` [RFC][PATCH 05/22] sched: SCHED_DEADLINE policy implementation Raistlin
2010-11-10 19:21 ` Peter Zijlstra
2010-11-10 19:43 ` Peter Zijlstra
2010-11-11 1:02 ` Raistlin
2010-11-10 19:45 ` Peter Zijlstra
2010-11-10 22:26 ` Raistlin
2010-11-10 20:21 ` Peter Zijlstra
2010-11-11 1:18 ` Raistlin
2010-11-11 13:13 ` Peter Zijlstra
2010-11-11 14:13 ` Peter Zijlstra
2010-11-11 14:28 ` Raistlin
2010-11-11 14:17 ` Peter Zijlstra
2010-11-11 18:33 ` Raistlin
2010-11-11 14:25 ` Peter Zijlstra
2010-11-11 14:33 ` Raistlin
2010-11-14 8:54 ` Raistlin
2010-11-23 14:24 ` Peter Zijlstra
2010-10-29 6:31 ` Raistlin [this message]
2010-11-11 14:31 ` [RFC][PATCH 06/22] sched: SCHED_DEADLINE handles spacial kthreads Peter Zijlstra
2010-11-11 14:50 ` Dario Faggioli
2010-11-11 14:34 ` Peter Zijlstra
2010-11-11 15:27 ` Oleg Nesterov
2010-11-11 15:43 ` Peter Zijlstra
2010-11-11 16:32 ` Oleg Nesterov
2010-11-13 18:35 ` Peter Zijlstra
2010-11-13 19:58 ` Oleg Nesterov
2010-11-13 20:31 ` Peter Zijlstra
2010-11-13 20:51 ` Peter Zijlstra
2010-11-13 23:31 ` Peter Zijlstra
2010-11-15 20:06 ` [PATCH] sched: Simplify cpu-hot-unplug task migration Peter Zijlstra
2010-11-17 19:27 ` Oleg Nesterov
2010-11-17 19:42 ` Peter Zijlstra
2010-11-18 14:05 ` Oleg Nesterov
2010-11-18 14:24 ` Peter Zijlstra
2010-11-18 15:32 ` Oleg Nesterov
2010-11-18 14:09 ` [tip:sched/core] " tip-bot for Peter Zijlstra
2010-11-11 14:46 ` [RFC][PATCH 06/22] sched: SCHED_DEADLINE handles spacial kthreads Peter Zijlstra
2010-10-29 6:32 ` [RFC][PATCH 07/22] sched: SCHED_DEADLINE push and pull logic Raistlin
2010-11-12 16:17 ` Peter Zijlstra
2010-11-12 21:11 ` Raistlin
2010-11-14 9:14 ` Raistlin
2010-11-23 14:27 ` Peter Zijlstra
2010-10-29 6:33 ` [RFC][PATCH 08/22] sched: SCHED_DEADLINE avg_update accounting Raistlin
2010-11-11 19:16 ` Peter Zijlstra
2010-10-29 6:34 ` [RFC][PATCH 09/22] sched: add period support for -deadline tasks Raistlin
2010-11-11 19:17 ` Peter Zijlstra
2010-11-11 19:31 ` Raistlin
2010-11-11 19:43 ` Peter Zijlstra
2010-11-11 23:33 ` Tommaso Cucinotta
2010-11-12 13:33 ` Raistlin
2010-11-12 13:45 ` Peter Zijlstra
2010-11-12 13:46 ` Luca Abeni
2010-11-12 14:01 ` Raistlin
2010-10-29 6:35 ` [RFC][PATCH 10/22] sched: add a syscall to wait for the next instance Raistlin
2010-11-11 19:21 ` Peter Zijlstra
2010-11-11 19:33 ` Raistlin
2010-10-29 6:35 ` [RFC][PATCH 11/22] sched: add schedstats for -deadline tasks Raistlin
2010-10-29 6:36 ` [RFC][PATCH 12/22] sched: add runtime reporting " Raistlin
2010-11-11 19:37 ` Peter Zijlstra
2010-11-12 16:15 ` Raistlin
2010-11-12 16:27 ` Peter Zijlstra
2010-11-12 21:12 ` Raistlin
2010-10-29 6:37 ` [RFC][PATCH 13/22] sched: add resource limits " Raistlin
2010-11-11 19:57 ` Peter Zijlstra
2010-11-12 21:30 ` Raistlin
2010-11-12 23:32 ` Peter Zijlstra
2010-10-29 6:38 ` [RFC][PATCH 14/22] sched: add latency tracing " Raistlin
2010-10-29 6:38 ` [RFC][PATCH 15/22] sched: add traceporints " Raistlin
2010-11-11 19:54 ` Peter Zijlstra
2010-11-12 16:13 ` Raistlin
2010-10-29 6:39 ` [RFC][PATCH 16/22] sched: add SMP " Raistlin
2010-10-29 6:40 ` [RFC][PATCH 17/22] sched: add signaling overrunning " Raistlin
2010-11-11 21:58 ` Peter Zijlstra
2010-11-12 15:39 ` Raistlin
2010-11-12 16:04 ` Peter Zijlstra
2010-10-29 6:42 ` [RFC][PATCH 19/22] rtmutex: turn the plist into an rb-tree Raistlin
2010-10-29 6:42 ` [RFC][PATCH 18/22] sched: add reclaiming logic to -deadline tasks Raistlin
2010-11-11 22:12 ` Peter Zijlstra
2010-11-12 15:36 ` Raistlin
2010-11-12 16:04 ` Peter Zijlstra
2010-11-12 17:41 ` Luca Abeni
2010-11-12 17:51 ` Peter Zijlstra
2010-11-12 17:54 ` Luca Abeni
2010-11-13 21:08 ` Raistlin
2010-11-12 18:07 ` Tommaso Cucinotta
2010-11-12 19:07 ` Raistlin
2010-11-13 0:43 ` Peter Zijlstra
2010-11-13 1:49 ` Tommaso Cucinotta
2010-11-12 18:56 ` Raistlin
[not found] ` <80992760-24F2-42AE-AF2D-15727F6A1C81@email.unc.edu>
2010-11-15 18:37 ` James H. Anderson
2010-11-15 19:23 ` Luca Abeni
2010-11-15 19:49 ` James H. Anderson
2010-11-15 19:39 ` Luca Abeni
2010-11-15 21:34 ` Raistlin
2010-10-29 6:43 ` [RFC][PATCH 20/22] sched: drafted deadline inheritance logic Raistlin
2010-11-11 22:15 ` Peter Zijlstra
2010-11-14 12:00 ` Raistlin
2010-10-29 6:44 ` [RFC][PATCH 21/22] sched: add bandwidth management for sched_dl Raistlin
2010-10-29 6:45 ` [RFC][PATCH 22/22] sched: add sched_dl documentation Raistlin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1288333876.8661.147.camel@Palantir \
--to=raistlin@linux.it \
--cc=cfriesen@nortel.com \
--cc=claudio@evidence.eu.com \
--cc=cucinotta@sssup.it \
--cc=darren@dvhart.com \
--cc=dhaval@retis.sssup.it \
--cc=fabio@gandalf.sssup.it \
--cc=fweisbec@gmail.com \
--cc=hgu1972@gmail.com \
--cc=johan.eker@ericsson.com \
--cc=juri.lelli@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=luca.abeni@unitn.it \
--cc=mingo@elte.hu \
--cc=nicola.manica@disi.unitn.it \
--cc=oleg@redhat.com \
--cc=p.faure@akatech.ch \
--cc=paulmck@linux.vnet.ibm.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
--cc=trimarchi@retis.sssup.it \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.