Re: [patch] CFS scheduler, -v18

From: Ingo Molnar <mingo@elte.hu>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: "S.Çağlar Onur" <caglar@pardus.org.tr>,
	linux-kernel@vger.kernel.org,
	"Linus Torvalds" <torvalds@linux-foundation.org>,
	"Mike Galbraith" <efault@gmx.de>,
	"Arjan van de Ven" <arjan@infradead.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Dmitry Adamushko" <dmitry.adamushko@gmail.com>,
	"Srivatsa Vaddagiri" <vatsa@linux.vnet.ibm.com>
Subject: Re: [patch] CFS scheduler, -v18
Date: Tue, 26 Jun 2007 10:38:13 +0200
Message-ID: <20070626083813.GA16151@elte.hu>
In-Reply-To: <20070625200235.9d24f6cd.akpm@linux-foundation.org>


* Andrew Morton <akpm@linux-foundation.org> wrote:

> So I locally generated the diff to take -mm up to the above version of 
> CFS.

thx. I released a diff against mm2:

 http://people.redhat.com/mingo/cfs-scheduler/sched-cfs-v2.6.22-rc4-mm2-v18.patch

but indeed the -git diff serves you better if you updated -mm to Linus' 
latest.

firstly, thanks a ton for your review feedback!

> - sys_sched_yield_to() went away?  I guess I missed that.

yep. Nobody tried it and sent any feedback on it, it was causing 
patch-logistical complications both in -mm and for packagers that bundle 
CFS (the experimental-schedulers site has a CFS repo and Fedora rawhide 
started carrying CFS recently as well), and i dont really agree with 
adding yet another yield interface anyway. So we can and should do this 
independently of CFS.

> - Curious.  the simplification of task_tick_rt() seems to go only
>   halfway.  Could do
> 
> 	if (p->policy != SCHED_RR)
> 		return;
> 
> 	if (--p->time_slice)
> 		return;
> 
> 	/* stuff goes here */

yeah. I have fixed it in my v19 tree for it to look like you suggest.

> - dud macro:
> 					
> #define is_rt_policy(p)		((p) == SCHED_FIFO || (p) == SCHED_RR)
> 
>   It evaluates its arg twice and could and should be coded in C.
> 
>   There are a bunch of other don't-need-to-be-implemented-as-a-macro
>   macros around there too.  Generally, I suggest you review all the
>   patchset for macros-which-don't-need-to-be-macros.

yep, fixed. (is a historic macro)
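
The double-evaluation concern can be illustrated with a small user-space sketch. The policy values match linux/sched.h; next_policy() and the calls counter are hypothetical helpers added only to make the side effect visible, and rt_policy() here is a simplified stand-in for the function form, not the exact kernel code.

```c
#include <assert.h>

/* Policy values as defined in linux/sched.h. */
enum { SCHED_NORMAL = 0, SCHED_FIFO = 1, SCHED_RR = 2 };

/* The macro from the review: its argument appears twice in the expansion. */
#define is_rt_policy(p)		((p) == SCHED_FIFO || (p) == SCHED_RR)

/* Function form: the argument is evaluated exactly once. */
static inline int rt_policy(int policy)
{
	return policy == SCHED_FIFO || policy == SCHED_RR;
}

/* Hypothetical helper with a visible side effect, for illustration only. */
static int calls;

static int next_policy(void)
{
	calls++;
	return SCHED_RR;
}

/* Two evaluations through the macro, one through the function. */
void double_evaluation_demo(void)
{
	calls = 0;
	assert(is_rt_policy(next_policy()));
	assert(calls == 2);	/* first comparison fails, so the argument runs again */

	calls = 0;
	assert(rt_policy(next_policy()));
	assert(calls == 1);
}
```

With a plain variable as the argument the two forms behave the same; the function form additionally gets type checking on its argument.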

> - Extraneous newline:
> 
> enum cpu_idle_type
> {

fixed. (is a pre-existing enum)

> - Style thing:
> 
> struct sched_entity {

> 	u64 sleep_start, sleep_start_fair;

fixed.

> - None of these fields have comments describing what they do ;)

one of them has ;-) Will fill this in.

> - __exit_signal() does apparently-unlocked 64-bit arith.  Is there 
>   some implicit locking here or do we not care about the occasional 
>   race-induced inaccuracy?

do you mean the tsk->se.sum_exec_runtime addition, etc? That runs with 
interrupts disabled so sum_sched_runtime is protected.

>   (ditto, lots of places, I expect)

which places do you mean?

>   (Gee, there's shitloads of 64-bit stuff in there.  Does it all 
>   _really_ need to be 64-bit on 32-bit?)

yes - CFS is fundamentally designed for 64-bit, with still pretty OK 
arithmetic performance on 32-bit.

> - weight_s64() (what does this do?) looks too big to inline on 32-bit.

ok, i've uninlined it.
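
On the "what does this do?" question, a hedged user-space sketch: weight_s64() appears to scale a signed 64-bit value by weight / 2^shift. Only the negative branch is visible in the patch context further down; the positive branch, the s64 typedef and the explicit cast of weight are assumptions made so the sketch compiles on its own.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t s64;	/* user-space stand-in for the kernel type */

/*
 * Scale calc by weight / 2^shift. Negative values are negated first,
 * so the shift rounds toward zero for either sign instead of toward
 * minus infinity. The positive branch is assumed, not taken from the patch.
 */
s64 weight_s64(s64 calc, unsigned long weight, int shift)
{
	if (calc < 0) {
		calc = -calc * (s64)weight;
		return -(calc >> shift);
	}
	return (calc * (s64)weight) >> shift;
}

/* Symmetric rounding: +1 and -1 at half weight both scale to 0. */
void weight_s64_demo(void)
{
	assert(weight_s64(1000, 2048, 10) == 2000);
	assert(weight_s64(-1000, 2048, 10) == -2000);
	assert(weight_s64(1, 512, 10) == 0);
	assert(weight_s64(-1, 512, 10) == 0);
}
```

On 32-bit the 64-bit multiply and shift expand to several instructions each, which is presumably why inlining it at every callsite was flagged.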

> - update_stats_enqueue() looks too big to inline even on 64-bit.

done.

> - Overall, this change is tremendously huge for something which is
>   supposedly ready-to-merge. [...]

hey, that's not fair, your review comments just made it 10K larger ;-)

> [...] Looks like a lot of that is the sched_entity conversion, but 
> afaict there's quite a lot besides.
> 
> - Should "4" in
> 
> 	(sysctl_sched_features & 4)
> 
>   be enumerated?

yep, done.

> - Maybe even __check_preempt_curr_fair() is too porky to inline.

yep - uninlined.

> - There really is an astonishing amount of 64-bit arith in here...
> 
> - Some opportunities for useful comments have been missed ;)
> 
> #define NICE_0_LOAD	SCHED_LOAD_SCALE
> #define NICE_0_SHIFT	SCHED_LOAD_SHIFT
> 
>   <wonders what these mean>

SCHED_LOAD_SCALE is the smpnice stuff. CFS reuses that and also makes it 
clear via this define that a nice-0 task contributes a 'load' of 
NICE_0_LOAD to the CPU. Sometimes, when doing smpnice load-balancing 
calculations, we want to use 'SCHED_LOAD_SCALE'; sometimes we want to 
stress that it's NICE_0_LOAD.

> - Should any of those new 64-bit arith functions in sched.c be pulled 
>   out and made generic?

yep, the plan is to put this all into reciprocal_div.h and to convert 
existing users of reciprocal_div to the cleaner stuff from Thomas. The 
patch wont get any smaller due to that though ;-)
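
For readers unfamiliar with the idea behind reciprocal_div: a division by a fixed divisor is replaced by one multiply and one shift against a precomputed reciprocal. The sketch below shows the general technique in user space; the function names follow the reciprocal_div naming, but the bodies are an assumption for illustration and are not the 64-bit helpers from the patch or the code from Thomas.

```c
#include <assert.h>
#include <stdint.h>

/* R = ceil(2^32 / k), computed once per divisor. k == 1 would overflow 32 bits. */
uint32_t reciprocal_value(uint32_t k)
{
	assert(k > 1);
	return (uint32_t)((((uint64_t)1 << 32) + (k - 1)) / k);
}

/* a / k computed as (a * R) >> 32: one multiply and one shift, no divide. */
uint32_t reciprocal_divide(uint32_t a, uint32_t r)
{
	return (uint32_t)(((uint64_t)a * r) >> 32);
}

/* The result matches plain division for these modest operands. */
void reciprocal_demo(void)
{
	uint32_t r10 = reciprocal_value(10);

	assert(reciprocal_divide(100, r10) == 100 / 10);
	assert(reciprocal_divide(99, r10) == 99 / 10);
}
```

The result is exact while a * k stays below 2^32; for larger operands the rounded-up reciprocal can overshoot by one, so callers that need exact results must bound their inputs.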

> - update_curr_load() is huge, inlined and has several callsites?

this is a reasonable tradeoff i think - update_curr_load()'s slowpath is 
in __update_curr_load(). Anyway, it probably wont get inlined when the 
kernel is built with -Os and without forced-inlining.

> - lots more macros-which-dont-need-to-be-macros in sched.c:
>   load_weight(), PRIO_TO_load_weight(), RTPRIO_TO_load_weight(), maybe
>   others.  People are more inclined to comment functions than they are
>   macros, for some reason.

these are mostly ancient macros. I fixed up some of them in my current 
tree.

> - inc_load(), dec_load(), inc_nr_running(), dec_nr_running(): these will
>   generate plenty of code on 32-bit and they're all inlined with 
>   multiple callsites.

yep - i'll revisit the inlining picture. This is not really a primary 
worry i think because it's easy to tweak and people can already express 
their inlining preference via CONFIG_CC_OPTIMIZE_FOR_SIZE and 
CONFIG_FORCED_INLINING.

> - overall, CFS takes sched.o from 41157 of .text up to 48781 on x86_64,
>   which at 18% is rather a large bloat.  Hopefully a lot of this is 
>   the new debug stuff.

> - On i386 sched.o went from 33755 up to 43660 which is 29% growth. 
>   Possibly acceptable, but why did it increase a lot more than the x86_64
>   version?  All that 64-bit arith, I assume?

the main reason is the sched debugging stuff:

   text    data     bss     dec     hex filename
  37570    2538      20   40128    9cc0 kernel/sched.o
  30692    2426      20   33138    8172 kernel/sched-no_sched_debug.o

i can make it depend on CONFIG_SCHEDSTATS, although i'd prefer it to be 
always on.

> - style (or the lack thereof):
> 
> 	p->se.sum_wait_runtime = p->se.sum_sleep_runtime = 0;
> 	p->se.sleep_start = p->se.sleep_start_fair = p->se.block_start = 0;
> 	p->se.sleep_max = p->se.block_max = p->se.exec_max = p->se.wait_max = 0;
> 	p->se.wait_runtime_overruns = p->se.wait_runtime_underruns = 0;
> 
>   bit of an eyesore?

fixed. (this heap grew gradually and now is/was an eyesore indeed.)

> - in sched_init() this looks funny:
> 
> 		rq->ls.load_update_last = sched_clock();
> 		rq->ls.load_update_start = sched_clock();
> 
>   was it intended that these both get the same value?

it doesn't really matter, but i fixed them to be initialized to the same 
'now' value.
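
A small user-space sketch of the difference, assuming a clock that advances between reads. fake_sched_clock() and struct load_stat are hypothetical stand-ins for sched_clock() and the rq->ls fields; only the two field names come from the quoted code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical clock: every read advances time by 100 ns. */
static uint64_t fake_ns;

uint64_t fake_sched_clock(void)
{
	fake_ns += 100;
	return fake_ns;
}

/* Stand-in for the two rq->ls fields from the review. */
struct load_stat {
	uint64_t load_update_last;
	uint64_t load_update_start;
};

/* Two separate reads: the fields end up slightly different. */
void init_two_reads(struct load_stat *ls)
{
	ls->load_update_last = fake_sched_clock();
	ls->load_update_start = fake_sched_clock();
}

/* One snapshot: both fields share the same 'now' value. */
void init_snapshot(struct load_stat *ls)
{
	uint64_t now = fake_sched_clock();

	ls->load_update_last = now;
	ls->load_update_start = now;
}

void snapshot_demo(void)
{
	struct load_stat a, b;

	init_two_reads(&a);
	init_snapshot(&b);
	assert(a.load_update_start - a.load_update_last == 100);
	assert(b.load_update_start == b.load_update_last);
}
```

The snapshot form also reads the clock once instead of once per field.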

i've attached my current fixes. (Please dont apply it yet.)

	Ingo

Not-Signed-off-by: Ingo Molnar <mingo@elte.hu>

---
 Makefile              |    2 
 include/linux/sched.h |  120 +++++++++++++++++++++++++++++---------------------
 kernel/exit.c         |    2 
 kernel/sched.c        |   58 ++++++++++++++++--------
 kernel/sched_debug.c  |    2 
 kernel/sched_fair.c   |   61 ++++++++++++-------------
 kernel/sched_rt.c     |   15 +++---
 7 files changed, 149 insertions(+), 111 deletions(-)

Index: linux/Makefile
===================================================================
--- linux.orig/Makefile
+++ linux/Makefile
@@ -1,7 +1,7 @@
 VERSION = 2
 PATCHLEVEL = 6
 SUBLEVEL = 22
-EXTRAVERSION = -rc6-cfs-v18
+EXTRAVERSION = -rc6-cfs-v19
 NAME = Holy Dancing Manatees, Batman!
 
 # *DOCUMENTATION*
Index: linux/include/linux/sched.h
===================================================================
--- linux.orig/include/linux/sched.h
+++ linux/include/linux/sched.h
@@ -528,31 +528,6 @@ struct signal_struct {
 #define SIGNAL_STOP_CONTINUED	0x00000004 /* SIGCONT since WCONTINUED reap */
 #define SIGNAL_GROUP_EXIT	0x00000008 /* group exit in progress */
 
-
-/*
- * Priority of a process goes from 0..MAX_PRIO-1, valid RT
- * priority is 0..MAX_RT_PRIO-1, and SCHED_NORMAL/SCHED_BATCH
- * tasks are in the range MAX_RT_PRIO..MAX_PRIO-1. Priority
- * values are inverted: lower p->prio value means higher priority.
- *
- * The MAX_USER_RT_PRIO value allows the actual maximum
- * RT priority to be separate from the value exported to
- * user-space.  This allows kernel threads to set their
- * priority to a value higher than any user task. Note:
- * MAX_RT_PRIO must not be smaller than MAX_USER_RT_PRIO.
- */
-
-#define MAX_USER_RT_PRIO	100
-#define MAX_RT_PRIO		MAX_USER_RT_PRIO
-
-#define MAX_PRIO		(MAX_RT_PRIO + 40)
-#define DEFAULT_PRIO		(MAX_RT_PRIO + 20)
-
-#define rt_prio(prio)		unlikely((prio) < MAX_RT_PRIO)
-#define rt_task(p)		rt_prio((p)->prio)
-#define is_rt_policy(p)		((p) == SCHED_FIFO || (p) == SCHED_RR)
-#define has_rt_policy(p)	unlikely(is_rt_policy((p)->policy))
-
 /*
  * Some day this will be a full-fledged user tracking system..
  */
@@ -646,8 +621,7 @@ static inline int sched_info_on(void)
 #endif
 }
 
-enum cpu_idle_type
-{
+enum cpu_idle_type {
 	CPU_IDLE,
 	CPU_NOT_IDLE,
 	CPU_NEWLY_IDLE,
@@ -843,30 +817,45 @@ struct load_weight {
 	unsigned long weight, inv_weight;
 };
 
-/* CFS stats for a schedulable entity (task, task-group etc) */
+/*
+ * CFS stats for a schedulable entity (task, task-group etc)
+ *
+ * Current field usage histogram:
+ *
+ *     4 se->block_start
+ *     4 se->run_node
+ *     4 se->sleep_start
+ *     4 se->sleep_start_fair
+ *     6 se->load.weight
+ *     7 se->delta_fair
+ *    15 se->wait_runtime
+ */
 struct sched_entity {
-	struct load_weight load;	/* for nice- load-balancing purposes */
-	int on_rq;
-	struct rb_node run_node;
-	unsigned long delta_exec;
-	s64 delta_fair;
-
-	u64 wait_start_fair;
-	u64 wait_start;
-	u64 exec_start;
-	u64 sleep_start, sleep_start_fair;
-	u64 block_start;
-	u64 sleep_max;
-	u64 block_max;
-	u64 exec_max;
-	u64 wait_max;
-	u64 last_ran;
-
-	s64 wait_runtime;
-	u64 sum_exec_runtime;
-	s64 fair_key;
-	s64 sum_wait_runtime, sum_sleep_runtime;
-	unsigned long wait_runtime_overruns, wait_runtime_underruns;
+	s64			wait_runtime;
+	s64			delta_fair;
+	struct load_weight	load;		/* for load-balancing */
+	struct rb_node		run_node;
+	int			on_rq;
+	unsigned long		delta_exec;
+
+	u64			wait_start_fair;
+	u64			wait_start;
+	u64			exec_start;
+	u64			sleep_start;
+	u64			sleep_start_fair;
+	u64			block_start;
+	u64			sleep_max;
+	u64			block_max;
+	u64			exec_max;
+	u64			wait_max;
+	u64			last_ran;
+
+	u64			sum_exec_runtime;
+	s64			fair_key;
+	s64			sum_wait_runtime;
+	s64			sum_sleep_runtime;
+	unsigned long		wait_runtime_overruns;
+	unsigned long		wait_runtime_underruns;
 };
 
 struct task_struct {
@@ -1126,6 +1115,37 @@ struct task_struct {
 #endif
 };
 
+/*
+ * Priority of a process goes from 0..MAX_PRIO-1, valid RT
+ * priority is 0..MAX_RT_PRIO-1, and SCHED_NORMAL/SCHED_BATCH
+ * tasks are in the range MAX_RT_PRIO..MAX_PRIO-1. Priority
+ * values are inverted: lower p->prio value means higher priority.
+ *
+ * The MAX_USER_RT_PRIO value allows the actual maximum
+ * RT priority to be separate from the value exported to
+ * user-space.  This allows kernel threads to set their
+ * priority to a value higher than any user task. Note:
+ * MAX_RT_PRIO must not be smaller than MAX_USER_RT_PRIO.
+ */
+
+#define MAX_USER_RT_PRIO	100
+#define MAX_RT_PRIO		MAX_USER_RT_PRIO
+
+#define MAX_PRIO		(MAX_RT_PRIO + 40)
+#define DEFAULT_PRIO		(MAX_RT_PRIO + 20)
+
+static inline int rt_prio(int prio)
+{
+	if (unlikely(prio < MAX_RT_PRIO))
+		return 1;
+	return 0;
+}
+
+static inline int rt_task(struct task_struct *p)
+{
+	return rt_prio(p->prio);
+}
+
 static inline pid_t process_group(struct task_struct *tsk)
 {
 	return tsk->signal->pgrp;
Index: linux/kernel/exit.c
===================================================================
--- linux.orig/kernel/exit.c
+++ linux/kernel/exit.c
@@ -290,7 +290,7 @@ static void reparent_to_kthreadd(void)
 	/* Set the exit signal to SIGCHLD so we signal init on exit */
 	current->exit_signal = SIGCHLD;
 
-	if (!has_rt_policy(current) && (task_nice(current) < 0))
+	if (task_nice(current) < 0)
 		set_user_nice(current, 0);
 	/* cpus_allowed? */
 	/* rt_priority? */
Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -106,6 +106,18 @@ unsigned long long __attribute__((weak))
 #define MIN_TIMESLICE		max(5 * HZ / 1000, 1)
 #define DEF_TIMESLICE		(100 * HZ / 1000)
 
+static inline int rt_policy(int policy)
+{
+	if (unlikely(policy == SCHED_FIFO) || unlikely(policy == SCHED_RR))
+		return 1;
+	return 0;
+}
+
+static inline int task_has_rt_policy(struct task_struct *p)
+{
+	return rt_policy(p->policy);
+}
+
 /*
  * This is the priority-queue data structure of the RT scheduling class:
  */
@@ -752,7 +764,7 @@ static void set_load_weight(struct task_
 	task_rq(p)->cfs.wait_runtime -= p->se.wait_runtime;
 	p->se.wait_runtime = 0;
 
-	if (has_rt_policy(p)) {
+	if (task_has_rt_policy(p)) {
 		p->se.load.weight = prio_to_weight[0] * 2;
 		p->se.load.inv_weight = prio_to_wmult[0] >> 1;
 		return;
@@ -805,7 +817,7 @@ static inline int normal_prio(struct tas
 {
 	int prio;
 
-	if (has_rt_policy(p))
+	if (task_has_rt_policy(p))
 		prio = MAX_RT_PRIO-1 - p->rt_priority;
 	else
 		prio = __normal_prio(p);
@@ -1476,17 +1488,24 @@ int fastcall wake_up_state(struct task_s
  */
 static void __sched_fork(struct task_struct *p)
 {
-	p->se.wait_start_fair = p->se.wait_start = p->se.exec_start = 0;
-	p->se.sum_exec_runtime = 0;
-	p->se.delta_exec = 0;
-	p->se.delta_fair = 0;
-
-	p->se.wait_runtime = 0;
-
-	p->se.sum_wait_runtime = p->se.sum_sleep_runtime = 0;
-	p->se.sleep_start = p->se.sleep_start_fair = p->se.block_start = 0;
-	p->se.sleep_max = p->se.block_max = p->se.exec_max = p->se.wait_max = 0;
-	p->se.wait_runtime_overruns = p->se.wait_runtime_underruns = 0;
+	p->se.wait_start_fair		= 0;
+	p->se.wait_start		= 0;
+	p->se.exec_start		= 0;
+	p->se.sum_exec_runtime		= 0;
+	p->se.delta_exec		= 0;
+	p->se.delta_fair		= 0;
+	p->se.wait_runtime		= 0;
+	p->se.sum_wait_runtime		= 0;
+	p->se.sum_sleep_runtime		= 0;
+	p->se.sleep_start		= 0;
+	p->se.sleep_start_fair		= 0;
+	p->se.block_start		= 0;
+	p->se.sleep_max			= 0;
+	p->se.block_max			= 0;
+	p->se.exec_max			= 0;
+	p->se.wait_max			= 0;
+	p->se.wait_runtime_overruns	= 0;
+	p->se.wait_runtime_underruns	= 0;
 
 	INIT_LIST_HEAD(&p->run_list);
 	p->se.on_rq = 0;
@@ -1799,7 +1818,7 @@ static void update_cpu_load(struct rq *t
 	int i, scale;
 
 	this_rq->nr_load_updates++;
-	if (sysctl_sched_features & 64)
+	if (unlikely(!(sysctl_sched_features & SCHED_FEAT_PRECISE_CPU_LOAD)))
 		goto do_avg;
 
 	/* Update delta_fair/delta_exec fields first */
@@ -3801,7 +3820,7 @@ void set_user_nice(struct task_struct *p
 	 * it wont have any effect on scheduling until the task is
 	 * SCHED_FIFO/SCHED_RR:
 	 */
-	if (has_rt_policy(p)) {
+	if (task_has_rt_policy(p)) {
 		p->static_prio = NICE_TO_PRIO(nice);
 		goto out_unlock;
 	}
@@ -3999,14 +4018,14 @@ recheck:
 	    (p->mm && param->sched_priority > MAX_USER_RT_PRIO-1) ||
 	    (!p->mm && param->sched_priority > MAX_RT_PRIO-1))
 		return -EINVAL;
-	if (is_rt_policy(policy) != (param->sched_priority != 0))
+	if (rt_policy(policy) != (param->sched_priority != 0))
 		return -EINVAL;
 
 	/*
 	 * Allow unprivileged RT tasks to decrease priority:
 	 */
 	if (!capable(CAP_SYS_NICE)) {
-		if (is_rt_policy(policy)) {
+		if (rt_policy(policy)) {
 			unsigned long rlim_rtprio;
 			unsigned long flags;
 
@@ -6186,6 +6205,7 @@ int in_sched_functions(unsigned long add
 
 void __init sched_init(void)
 {
+	u64 now = sched_clock();
 	int highest_cpu = 0;
 	int i, j;
 
@@ -6206,8 +6226,8 @@ void __init sched_init(void)
 		rq->nr_running = 0;
 		rq->cfs.tasks_timeline = RB_ROOT;
 		rq->clock = rq->cfs.fair_clock = 1;
-		rq->ls.load_update_last = sched_clock();
-		rq->ls.load_update_start = sched_clock();
+		rq->ls.load_update_last = now;
+		rq->ls.load_update_start = now;
 
 		for (j = 0; j < CPU_LOAD_IDX_MAX; j++)
 			rq->cpu_load[j] = 0;
Index: linux/kernel/sched_debug.c
===================================================================
--- linux.orig/kernel/sched_debug.c
+++ linux/kernel/sched_debug.c
@@ -157,7 +157,7 @@ static int sched_debug_show(struct seq_f
 	u64 now = ktime_to_ns(ktime_get());
 	int cpu;
 
-	SEQ_printf(m, "Sched Debug Version: v0.03, cfs-v18, %s %.*s\n",
+	SEQ_printf(m, "Sched Debug Version: v0.03, cfs-v19, %s %.*s\n",
 		init_utsname()->release,
 		(int)strcspn(init_utsname()->version, " "),
 		init_utsname()->version);
Index: linux/kernel/sched_fair.c
===================================================================
--- linux.orig/kernel/sched_fair.c
+++ linux/kernel/sched_fair.c
@@ -61,8 +61,24 @@ unsigned int sysctl_sched_stat_granulari
  */
 unsigned int sysctl_sched_runtime_limit __read_mostly;
 
+/*
+ * Debugging: various feature bits
+ */
+enum {
+	SCHED_FEAT_IGNORE_PREEMPTED	= 1,
+	SCHED_FEAT_DISTRIBUTE		= 2,
+	SCHED_FEAT_FAIR_SLEEPERS	= 4,
+	SCHED_FEAT_SLEEPER_AVG		= 32,
+	SCHED_FEAT_PRECISE_CPU_LOAD	= 64,
+	SCHED_FEAT_START_DEBIT		= 128,
+	SCHED_FEAT_SKIP_INITIAL		= 256,
+};
+
 unsigned int sysctl_sched_features __read_mostly =
-			0 | 2 | 4 | 8 | 0 | 0 | 0 | 0;
+		SCHED_FEAT_DISTRIBUTE		|
+		SCHED_FEAT_FAIR_SLEEPERS	|
+		SCHED_FEAT_SLEEPER_AVG		|
+		SCHED_FEAT_PRECISE_CPU_LOAD;
 
 extern struct sched_class fair_sched_class;
 
@@ -145,7 +161,7 @@ static inline void
 __dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
 	if (cfs_rq->rb_leftmost == &se->run_node)
-		cfs_rq->rb_leftmost = NULL;
+		cfs_rq->rb_leftmost = rb_next(&se->run_node);
 	rb_erase(&se->run_node, &cfs_rq->tasks_timeline);
 	update_load_sub(&cfs_rq->load, se->load.weight);
 	cfs_rq->nr_running--;
@@ -258,7 +274,7 @@ __update_curr(struct cfs_rq *cfs_rq, str
 	 * Task already marked for preemption, do not burden
 	 * it with the cost of not having left the CPU yet:
 	 */
-	if (unlikely(sysctl_sched_features & 1))
+	if (unlikely(sysctl_sched_features & SCHED_FEAT_IGNORE_PREEMPTED))
 		if (unlikely(test_tsk_thread_flag(curtask, TIF_NEED_RESCHED)))
 			return;
 
@@ -305,7 +321,7 @@ update_stats_wait_start(struct cfs_rq *c
 	se->wait_start = now;
 }
 
-static inline s64 weight_s64(s64 calc, unsigned long weight, int shift)
+static s64 weight_s64(s64 calc, unsigned long weight, int shift)
 {
 	if (calc < 0) {
 		calc = - calc * weight;
@@ -317,7 +333,7 @@ static inline s64 weight_s64(s64 calc, u
 /*
  * Task is being enqueued - update stats:
  */
-static inline void
+static void
 update_stats_enqueue(struct cfs_rq *cfs_rq, struct sched_entity *se, u64 now)
 {
 	s64 key;
@@ -438,7 +454,7 @@ static void distribute_fair_add(struct c
 	struct sched_entity *curr = cfs_rq_curr(cfs_rq);
 	s64 delta_fair = 0;
 
-	if (!(sysctl_sched_features & 2))
+	if (!(sysctl_sched_features & SCHED_FEAT_DISTRIBUTE))
 		return;
 
 	if (cfs_rq->nr_running) {
@@ -469,7 +485,7 @@ __enqueue_sleeper(struct cfs_rq *cfs_rq,
 	 * Fix up delta_fair with the effect of us running
 	 * during the whole sleep period:
 	 */
-	if (!(sysctl_sched_features & 32))
+	if (sysctl_sched_features & SCHED_FEAT_SLEEPER_AVG)
 		delta_fair = div64_s(delta_fair * load, load + se->load.weight);
 
 	delta_fair = weight_s64(delta_fair, se->load.weight, NICE_0_SHIFT);
@@ -495,7 +511,7 @@ enqueue_sleeper(struct cfs_rq *cfs_rq, s
 	s64 delta_fair;
 
 	if ((entity_is_task(se) && tsk->policy == SCHED_BATCH) ||
-						 !(sysctl_sched_features & 4))
+			 !(sysctl_sched_features & SCHED_FEAT_FAIR_SLEEPERS))
 		return;
 
 	delta_fair = cfs_rq->fair_clock - se->sleep_start_fair;
@@ -574,7 +590,7 @@ static void dequeue_entity(struct cfs_rq
 /*
  * Preempt the current task with a newly woken task if needed:
  */
-static inline void
+static void
 __check_preempt_curr_fair(struct cfs_rq *cfs_rq, struct sched_entity *se,
 			  struct sched_entity *curr, unsigned long granularity)
 {
@@ -612,23 +628,6 @@ put_prev_entity(struct cfs_rq *cfs_rq, s
 	int updated = 0;
 
 	/*
-	 * If the task is still waiting for the CPU (it just got
-	 * preempted), update its position within the tree and
-	 * start the wait period:
-	 */
-	if ((sysctl_sched_features & 16) && entity_is_task(prev))  {
-		struct task_struct *prevtask = task_of(prev);
-
-		if (prev->on_rq &&
-			test_tsk_thread_flag(prevtask, TIF_NEED_RESCHED)) {
-
-			dequeue_entity(cfs_rq, prev, 0, now);
-			enqueue_entity(cfs_rq, prev, 0, now);
-			updated = 1;
-		}
-	}
-
-	/*
 	 * If still on the runqueue then deactivate_task()
 	 * was not called and update_curr() has to be done:
 	 */
@@ -741,10 +740,8 @@ static void check_preempt_curr_fair(stru
 	unsigned long gran;
 
 	if (unlikely(rt_prio(p->prio))) {
-		if (sysctl_sched_features & 8) {
-			if (rt_prio(p->prio))
-				update_curr(cfs_rq, rq_clock(rq));
-		}
+		if (rt_prio(p->prio))
+			update_curr(cfs_rq, rq_clock(rq));
 		resched_task(curr);
 		return;
 	}
@@ -850,14 +847,14 @@ static void task_new_fair(struct rq *rq,
 	 * The first wait is dominated by the child-runs-first logic,
 	 * so do not credit it with that waiting time yet:
 	 */
-	if (sysctl_sched_features & 256)
+	if (sysctl_sched_features & SCHED_FEAT_SKIP_INITIAL)
 		p->se.wait_start_fair = 0;
 
 	/*
 	 * The statistical average of wait_runtime is about
 	 * -granularity/2, so initialize the task with that:
 	 */
-	if (sysctl_sched_features & 128)
+	if (sysctl_sched_features & SCHED_FEAT_START_DEBIT)
 		p->se.wait_runtime = -(s64)(sysctl_sched_granularity / 2);
 
 	__enqueue_entity(cfs_rq, se);
Index: linux/kernel/sched_rt.c
===================================================================
--- linux.orig/kernel/sched_rt.c
+++ linux/kernel/sched_rt.c
@@ -12,7 +12,7 @@ static inline void update_curr_rt(struct
 	struct task_struct *curr = rq->curr;
 	u64 delta_exec;
 
-	if (!has_rt_policy(curr))
+	if (!task_has_rt_policy(curr))
 		return;
 
 	delta_exec = now - curr->se.exec_start;
@@ -179,13 +179,14 @@ static void task_tick_rt(struct rq *rq, 
 	if (p->policy != SCHED_RR)
 		return;
 
-	if (!(--p->time_slice)) {
-		p->time_slice = static_prio_timeslice(p->static_prio);
-		set_tsk_need_resched(p);
+	if (--p->time_slice)
+		return;
 
-		/* put it at the end of the queue: */
-		requeue_task_rt(rq, p);
-	}
+	p->time_slice = static_prio_timeslice(p->static_prio);
+	set_tsk_need_resched(p);
+
+	/* put it at the end of the queue: */
+	requeue_task_rt(rq, p);
 }
 
 /*
