public inbox for linux-kernel@vger.kernel.org
* Re: [RFC][PATCH][2.6.6] Replacing CPU scheduler active and expired with a single array
@ 2004-05-29  5:27 Peter Williams
  2004-05-29 11:17 ` Con Kolivas
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Williams @ 2004-05-29  5:27 UTC (permalink / raw)
  To: Con Kolivas; +Cc: Ingo Molnar, Linux Kernel Mailing List

Con Kolivas wrote:
 > On Fri, 28 May 2004 19:24, Peter Williams wrote:
 > > Ingo Molnar wrote:
 > > > just try it - run a task that runs 95% of the time and sleeps 5%
 > > > of the time, and run a (same prio) task that runs 100% of the
 > > > time. With the current scheduler the slightly-sleeping task gets
 > > > 45% of the CPU, the looping one gets 55% of the CPU. With your
 > > > patch the slightly-sleeping process can easily monopolize 90% of
 > > > the CPU!
 >
 > > This does, of course, not take into account the interactive bonus.
 > > If the task doing the shorter CPU bursts manages to earn a larger
 > > interactivity bonus than the other then it will get more CPU but
 > > isn't that the intention of the interactivity bonus?
 >
 > No. Ideally the interactivity bonus should decide what goes first
 > every time to decrease the latency of interactive tasks, but the cpu
 > percentage should remain close to the same for equal "nice" tasks.

There are at least two possible ways of viewing "nice": one of these is 
that it is an indicator of the task's entitlement to CPU resources (which 
is more or less the view you describe) and another is that it is an 
indicator of the task's priority with respect to access to CPU resources.

If you wish the system to take the first of these views then the 
appropriate solution to the scheduling problem is to use an entitlement 
based scheduler such as EBS (see 
<http://sourceforge.net/projects/ebs-linux/>) which is also much simpler 
than the current O(1) scheduler and has the advantage that it gives 
pretty good interactive responsiveness without treating interactive 
tasks specially (although some modification in this regard may be 
desirable if very high loads are going to be encountered).

If you want the second of these then this proposed modification is a 
simple way of getting it (with the added proviso that starvation be 
avoided).

Of course, there can be other scheduling aims, such as maximising 
throughput, where different scheduler paradigms need to be used.  As a 
matter of interest, these tend to have rather poor interactive response.

If the system is an interactive system then all of these models (or at 
least two of them) need to be modified to "break the rules" as far as 
interactive tasks are concerned and give them higher priority in order 
not to try human patience.

 > Interactive tasks need low scheduling latency and short bursts of high
 > cpu usage; not more cpu usage overall. When the cpu percentage differs
 > significantly from this the logic has failed.

The only way this will happen is if the interactive bonus mechanism 
misidentifies a CPU bound task as an interactive task and gives it a 
large bonus.  This seems to be the case as tasks with a 95% CPU demand 
rate are being given a bonus of 9 (out of 10 possible) points.
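For reference, the bonus in question scales roughly linearly with the 
task's estimated average sleep time; a hedged sketch of the arithmetic 
(names and constants are approximations from memory of 2.6-era 
kernel/sched.c, not quoted source):

```python
# Hedged sketch of the 2.6-era interactive-bonus arithmetic.  MAX_BONUS
# and the linear scaling are assumptions approximating kernel/sched.c of
# that period; treat this as illustration, not the exact kernel code.
MAX_BONUS = 10  # maximum interactivity bonus

def current_bonus(sleep_avg, max_sleep_avg):
    # The bonus grows linearly with the task's average sleep time.
    return sleep_avg * MAX_BONUS // max_sleep_avg

# A task genuinely sleeping only ~5% of the time earns almost no bonus...
assert current_bonus(50, 1000) == 0
# ...but if runqueue wait is also credited as "sleep", sleep_avg inflates
# and the same task can earn a near-maximal bonus of 9 out of 10.
assert current_bonus(900, 1000) == 9
```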

Peter
-- 
Dr Peter Williams                                pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce


* Re: [RFC][PATCH][2.6.6] Replacing CPU scheduler active and expired with a single array
@ 2004-05-29  1:39 Peter Williams
  0 siblings, 0 replies; 15+ messages in thread
From: Peter Williams @ 2004-05-29  1:39 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Linux Kernel Mailing List

 > Ingo Molnar wrote:
 >> Peter Williams <peterw@xxxxxxxxxx> wrote:
 >> >just try it - run a task that runs 95% of the time and sleeps 5% of the
 >> >time, and run a (same prio) task that runs 100% of the time. With the
 >> >current scheduler the slightly-sleeping task gets 45% of the CPU, the
 >> >looping one gets 55% of the CPU. With your patch the slightly-sleeping
 >> >process can easily monopolize 90% of the CPU!
 >>
 >> If these two tasks have the same nice value they should round robin
 >> with each other in the same priority slot and this means that the one
 >> doing the smaller bites of CPU each time will in fact get less CPU
 >> than the other i.e. the outcome will be the opposite of what you
 >> claim.
 >
 > just try what i described with and without your patch and look at the
 > 'top' output. You can do a simple loop plus short 10-20msec sleeps
 > (via nanosleep) to simulate a 95% busy task.

OK.  I've tried this experiment and an effect similar to what you 
described (but not as severe) was observed.  However, as I opined, this 
was due to the interactive bonus mechanism being too generous to the 
sleeper - sometimes giving it as many as 9 bonus points out of a 
possible maximum of 10.  The severity of the effect was variable and 
proportional to the size of the bonus awarded at any given time.  So the 
problem is due to the bonus point scheme and not the absence of the 
active and expired arrays.
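The ~95%-busy workload used in this experiment can be approximated by a 
spin loop punctuated by short sleeps, per Ingo's recipe; a minimal 
sketch (durations illustrative, not the exact program used):

```python
import time

# Hypothetical re-creation of the experiment: a task that burns the CPU
# for ~95% of each period and sleeps briefly for the remaining ~5%.
BUSY_S = 0.190    # ~190 ms of spinning per period
SLEEP_S = 0.010   # ~10 ms sleep per period (the short nanosleep)

def duty_cycle(busy, sleep):
    # Fraction of each period the task wants the CPU.
    return busy / (busy + sleep)

def run(periods=3):
    # Spin for BUSY_S, then sleep briefly; under the 2.6 scheduler the
    # short sleeps let the task accrue an interactive bonus anyway.
    for _ in range(periods):
        deadline = time.monotonic() + BUSY_S
        while time.monotonic() < deadline:
            pass                      # burn CPU
        time.sleep(SLEEP_S)           # brief voluntary sleep

print(f"target duty cycle: {duty_cycle(BUSY_S, SLEEP_S):.0%}")
```

Running two such tasks alongside a pure looper and watching 'top' 
reproduces the effect being discussed.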

Further investigation of this problem has revealed that the reason the 
interactive bonus scheme is so generous to this type of task is the code 
in schedule() that allows tasks with "activated" greater than 1 to count 
time spent on the run queue as sleep time.  If this code is removed the 
effect disappears.
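The logic being referred to can be paraphrased as follows (a hedged 
sketch from memory of 2.6-era schedule()/recalc_task_prio(); the field 
name, the meaning of its values, and the partial-credit weighting are 
approximations, not quoted source):

```python
def sleep_credit(activated, runqueue_wait, on_runqueue_weight=0.3):
    # "activated" records how the task was woken: roughly, a value
    # greater than 1 means it was woken from interrupt context.  For
    # such tasks the full time spent waiting on the run queue is
    # credited as sleep; otherwise only a fraction of it is.  Removing
    # the full-credit branch is the change described above.
    if activated > 1:
        return runqueue_wait                    # full wait counts as sleep
    return runqueue_wait * on_runqueue_weight   # partial credit only
```

A mostly-CPU-bound task that queues often thus sees its sleep average, 
and hence its bonus, inflated well beyond its true sleep behaviour.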

Interestingly enough, removing that code has no discernible effect on 
interactive response, even under heavy loads.  This had been observed 
previously, and consideration was given to removing the code as part of 
this patch, but it was decided to make no significant changes to the 
interactive bonus scheme here; modifying/improving the interactive 
bonus mechanism was considered to be a separate issue.

Peter
PS Sorry about the change in e-mail address: my ISP changed my IP 
address, which caused me to lose my SSH connection to my aurema.com 
account, and as it's now Saturday here in Sydney I'm unlikely to get it 
back until Monday.
-- 
Dr Peter Williams                                pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce


* [RFC][PATCH][2.6.6] Replacing CPU scheduler active and expired with a single array
@ 2004-05-28  4:52 Peter Williams
  2004-05-28  9:05 ` Ingo Molnar
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Williams @ 2004-05-28  4:52 UTC (permalink / raw)
  To: Linux Kernel Mailing List

This is a modification of the O(1) scheduler to replace the active and 
expired priority arrays with a single priority array:

<http://users.bigpond.net.au/Peter-Williams/patch-2.6.6-spa>

This patch simplifies the scheduler code by:

- removing the need to manage two priority arrays per CPU
- removing the need to use variable time slices to implement "nice" 
which means no more sharing of time slices between parents and children 
at fork and exit.

Key features are:

-- uses the existing "interactive task" bonus scheme so there is no 
change to interactive responsiveness
-- the number of priority slots has been increased by MAX_BONUS + 1 so 
that there is no need to truncate the allocated priority after 
allocation of the bonus and to allow the "idle" tasks to be placed on 
the queues
-- at the end of each time slice (or when waking up) each task is given 
a complete new time slice and, if it is of class SCHED_NORMAL, is put in 
a priority slot given by (static_prio + MAX_BONUS - interactive_bonus)
-- starvation is prevented by promoting all runnable tasks by one 
priority slot at intervals determined by a base promotion interval 
multiplied by the number of runnable tasks on the CPU in question
-- when there are fewer than 2 runnable processes on the CPU the 
promotion is not performed
-- in order for the anti starvation promotion mechanism to be O(1) the 
"prio" field has been removed from the task structure
-- to enable the priority of the current task to be available for 
various uses that previously used current->prio, a record of the 
priority slot for the current task is now kept in the runqueue struct
-- having the "idle" tasks on the queues allows the code for selecting 
the "next" task in schedule() to be simplified
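The slot assignment and anti-starvation promotion described above can be 
sketched as follows (a hypothetical illustration; the function names and 
the MAX_BONUS value are assumptions based on the existing 2.6 
interactive-bonus scheme, not code from the patch):

```python
MAX_BONUS = 10  # assumed, as in the 2.6 interactive-bonus scheme

def priority_slot(static_prio, interactive_bonus):
    # Slot assigned to a SCHED_NORMAL task at end-of-slice or wakeup.
    # With MAX_BONUS + 1 extra slots, no truncation is ever needed.
    assert 0 <= interactive_bonus <= MAX_BONUS
    return static_prio + MAX_BONUS - interactive_bonus

def promotion_interval(base_interval, nr_running):
    # Anti-starvation promotion of all runnable tasks fires at the base
    # interval multiplied by the number of runnable tasks; with fewer
    # than 2 runnable tasks, no promotion is performed.
    if nr_running < 2:
        return None
    return base_interval * nr_running

print(priority_slot(120, 0))          # CPU hog: lowest slot in its band
print(priority_slot(120, MAX_BONUS))  # fully interactive: highest slot
```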

Affected files are:

init_task.h -- cope with removal of "prio" and "array" from task struct
sched.h -- remove "prio" and "array" from task struct
exit.c -- remove call to sched_exit()
sched.c -- the bulk of the change

Performance:

- no discernible change in interactive responsiveness
- slight improvements in results from tests such as kernbench e.g.

Average Half Load -j 3 Run:
Elapsed Time 573.582 (vs 576.196)
User Time 1576.36 (vs 1582.19)
System Time 100.866 (vs 102.266)
Percent CPU 291.8 (vs 291.8)
Context Switches 9970.2 (vs 10653.6)
Sleeps 23996 (vs 23736.2)

Average Optimal -j 16 Load Run:
Elapsed Time 439.88 (vs 443.21)
User Time 1605.09 (vs 1613.75)
System Time 104.618 (vs 106.99)
Percent CPU 388 (vs 387.6)
Context Switches 29816.4 (vs 35444.6)
Sleeps 25589.8 (vs 27509.4)

Average Maximal -j Load Run:
Elapsed Time 460.252 (vs 460.72)
User Time 1628.57 (vs 1633.04)
System Time 147.666 (vs 147.222)
Percent CPU 385.4 (vs 386.2)
Context Switches 78475.6 (vs 78242.4)
Sleeps 126011 (vs 113026)

In summary, this patch provides code simplification without cost.
-- 
Dr Peter Williams, Chief Scientist                peterw@aurema.com
Aurema Pty Limited                                Tel:+61 2 9698 2322
PO Box 305, Strawberry Hills NSW 2012, Australia  Fax:+61 2 9699 9174
79 Myrtle Street, Chippendale NSW 2008, Australia http://www.aurema.com



Thread overview: 15+ messages
2004-05-29 12:24   ` [RFC][PATCH][2.6.6] Replacing CPU scheduler active and expired with a single array Andi Kleen
2004-05-29 12:38     ` Con Kolivas
2004-06-04  7:40     ` Peter Williams
2004-05-29  5:27 Peter Williams
2004-05-29 11:17 ` Con Kolivas
2004-05-30  0:19   ` Peter Williams
2004-05-30 12:56     ` Con Kolivas
2004-05-31  0:04       ` Peter Williams
2004-05-30 23:13         ` Con Kolivas
2004-05-29  1:39 Peter Williams
2004-05-28  4:52 Peter Williams
2004-05-28  9:05 ` Ingo Molnar
2004-05-28  9:24   ` Peter Williams
2004-05-28  9:29     ` Ingo Molnar
2004-05-28  9:57     ` Con Kolivas
