public inbox for linux-kernel@vger.kernel.org
* [PATCH][2.6.7-rc3] Single Priority Array CPU Scheduler
@ 2004-06-10  5:36 Peter Williams
  0 siblings, 0 replies; 7+ messages in thread
From: Peter Williams @ 2004-06-10  5:36 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Michal Kaczmarski

Peter Williams wrote:
 > The single priority array scheduler (SPA) is a patch that simplifies
 > the O(1) scheduler while maintaining its good scalability and
 > interactive response characteristics. The patch comes as four sub
 > patches to simplify perusal of the changes:

An updated version of this scheduler is now available for 2.6.7-rc3 at:

<http://users.bigpond.net.au/Peter-Williams/patch-2_6_7_rc3-SPA-v0.1>
<http://users.bigpond.net.au/Peter-Williams/patch-2_6_7_rc3-SPA_IAB-v0.1>
<http://users.bigpond.net.au/Peter-Williams/patch-2_6_7_rc3-SPA_TPB-v0.1>
<http://users.bigpond.net.au/Peter-Williams/patch-2_6_7_rc3-SPA_TSTATS-v0.1>

These patches should be applied in the order that they are listed.

Also, as promised, the first of these patches has been unified with the 
staircase scheduler and a patch that implements the staircase scheduler 
on top of the first of the above patches is available at:

<http://users.bigpond.net.au/Peter-Williams/patch-2_6_7_rc3-SPA_STAIRCASE-v0.1>

This scheduler is functionally equivalent to Con Kolivas's v6.4 
scheduler except that a promotion feature (one that will trigger very 
infrequently, and probably never :-)) has been added.  This was added 
because, although it is extremely unlikely, starvation is possible with 
the staircase scheduler and the promotion feature removes that possibility.

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH][2.6.7-rc3] Single Priority Array CPU Scheduler
@ 2004-06-11 13:39 Shane Shrybman
  2004-06-12  0:10 ` Peter Williams
  0 siblings, 1 reply; 7+ messages in thread
From: Shane Shrybman @ 2004-06-11 13:39 UTC (permalink / raw)
  To: pwil3058; +Cc: linux-kernel

Hi Peter,

I just started to try out your SPA scheduler patch and found that it is
noticeably sluggish when resizing a mozilla window on the desktop. I
have a profile of 2.6.7-rc3-spa and 2.6.7-rc2-mm2 and put them up at:
http://zeke.yi.org/linux/spa/ . There is also vmstat output there but it
doesn't look too helpful to me.

The test was basic and went like this:

x86, K7, UP, gnome desktop with mozilla (with a bunch of tabs) and a few
rxvts. cmdline= elevator=cfq profile=1

readprofile -r

grab a corner of my mozilla window and continually move it around for
several seconds

readprofile -v -m /boot/System.map-2.6.7-rc3|sort -rn +2|head -n30

do the same while dumping vmstat 1 to a file.

The kernel with your patch had a much harder time keeping up with the
window resizing. Moving the entire window did not seem too bad, or at
least not very noticeable. I tried a similar test while running a kernel
compile (make -j3) and it made the window resizing _really_ slow to respond.

Regards,

Shane







* Re: [PATCH][2.6.7-rc3] Single Priority Array CPU Scheduler
  2004-06-11 13:39 [PATCH][2.6.7-rc3] Single Priority Array CPU Scheduler Shane Shrybman
@ 2004-06-12  0:10 ` Peter Williams
  2004-06-12 14:50   ` Shane Shrybman
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Williams @ 2004-06-12  0:10 UTC (permalink / raw)
  To: Shane Shrybman; +Cc: linux-kernel

Shane Shrybman wrote:
> Hi Peter,
> 
> I just started to try out your SPA scheduler patch and found that it is
> noticeably sluggish when resizing a mozilla window on the desktop. I
> have a profile of 2.6.7-rc3-spa and 2.6.7-rc2-mm2 and put them up at:
> http://zeke.yi.org/linux/spa/ . There is also vmstat output there but it
> doesn't look too helpful to me.
> 
> The test was basic and went like this:
> 
> x86, K7, UP, gnome desktop with mozilla (with a bunch of tabs) and a few
> rxvts. cmdline= elevator=cfq profile=1
> 
> readprofile -r
> 
> grab a corner of my mozilla window and continually move it around for
> several seconds
> 
> readprofile -v -m /boot/System.map-2.6.7-rc3|sort -rn +2|head -n30
> 
> do the same while dumping vmstat 1 to a file.
> 
> The kernel with your patch had a much harder time keeping up with the
> window resizing. Moving the entire window did not seem too bad or not
> too noticeable. I tried a similar test while running a kernel compile
> (make -j3) and it made the window resizing _really_ slow to respond.

Thanks for the feedback, I'll add your test to those I perform myself.

Some of the control variables for this scheduler have rather arbitrary 
values at the moment and are likely to be non-optimal.  I'm in the 
process of making some of these variables modifiable at run time via 
/proc/sys to enable experimentation with different settings.  Hopefully, 
this will make it possible to establish settings that improve 
interactive performance.

Once again, thanks for the feedback
Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce



* Re: [PATCH][2.6.7-rc3] Single Priority Array CPU Scheduler
  2004-06-12  0:10 ` Peter Williams
@ 2004-06-12 14:50   ` Shane Shrybman
  2004-06-13  1:15     ` Peter Williams
  0 siblings, 1 reply; 7+ messages in thread
From: Shane Shrybman @ 2004-06-12 14:50 UTC (permalink / raw)
  To: Peter Williams; +Cc: linux-kernel

On Fri, 2004-06-11 at 20:10, Peter Williams wrote:
> Shane Shrybman wrote:
> > Hi Peter,
> > 
> > I just started to try out your SPA scheduler patch and found that it is
> > noticeably sluggish when resizing a mozilla window on the desktop. I
> > have a profile of 2.6.7-rc3-spa and 2.6.7-rc2-mm2 and put them up at:
> > http://zeke.yi.org/linux/spa/ . There is also vmstat output there but it
> > doesn't look too helpful to me.
> > 
[..snip..]
> 
> Thanks for the feedback, I'll add your test to those I perform myself.
> 

Sure, no problem. Hope it helps.

> Some of the control variables for this scheduler have rather arbitrary 
> values at the moment and are likely to be non optimal.  I'm in the 
> process of making some of these variables modifiable at run time via 
> /proc/sys to enable experimentation with different settings.  Hopefully, 
> this will enable settings that improve interactive performance to be 
> established.
> 
> Once again, thanks for the feedback
> Peter

Regards,

Shane



* Re: [PATCH][2.6.7-rc3] Single Priority Array CPU Scheduler
  2004-06-12 14:50   ` Shane Shrybman
@ 2004-06-13  1:15     ` Peter Williams
  2004-06-13 17:46       ` Shane Shrybman
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Williams @ 2004-06-13  1:15 UTC (permalink / raw)
  To: Shane Shrybman; +Cc: linux-kernel

Shane Shrybman wrote:
> On Fri, 2004-06-11 at 20:10, Peter Williams wrote:
> 
>>Shane Shrybman wrote:
>>
>>>Hi Peter,
>>>
>>>I just started to try out your SPA scheduler patch and found that it is
>>>noticeably sluggish when resizing a mozilla window on the desktop. I
>>>have a profile of 2.6.7-rc3-spa and 2.6.7-rc2-mm2 and put them up at:
>>>http://zeke.yi.org/linux/spa/ . There is also vmstat output there but it
>>>doesn't look too helpful to me.
>>>
> 
> [..snip..]
> 
>>Thanks for the feedback, I'll add your test to those I perform myself.
>>
> 
> 
> Sure, no problem. Hope it helps.
> 
> 
>>Some of the control variables for this scheduler have rather arbitrary 
>>values at the moment and are likely to be non optimal.  I'm in the 
>>process of making some of these variables modifiable at run time via 
>>/proc/sys to enable experimentation with different settings.  Hopefully, 
>>this will enable settings that improve interactive performance to be 
>>established.

A patch which adds control of the SPA scheduler via /proc/sys/kernel is 
available at:

<http://users.bigpond.net.au/Peter-Williams/patch-2_6_7_rc3-SPA_CNTL-v0.1>

This adds the directory /proc/sys/kernel/cpusched to the /proc directory 
if SYSCTL is configured into the kernel.  This directory contains six 
files that allow reading (and, for root only, writing) of six unsigned 
integer parameters that control the scheduler as follows:

time_slice -- (in msecs) controls the amount of CPU time that CPU bound 
tasks get before they are booted off the CPU, have their bonuses 
reassessed, and are put back on the run queue to be rescheduled.  This 
has a default value of 100 (which is the average time slice dealt out by 
the original O(1) scheduler), a maximum allowable value of 1000 (i.e. 1 
second) and a minimum of 1.  Making this smaller MAY help interactive 
response but COULD increase the context switching rate.  Making this 
larger MAY improve overall system throughput and reduce the context 
switching rate but MAY make interactive responsiveness worse.

base_promotion_interval -- (in msecs) controls the rate at which the 
promotion mechanism promotes runnable tasks.  This has a default value 
of 55, a maximum of UINT_MAX and a minimum of 1.  The primary effect 
of changing this parameter is to alter the "severity" of "nice" values 
(i.e. how much more CPU access a lower "nice" value will give a task). 
The default value is such that if there are two CPU bound tasks (hard 
spinners) running (and time_slice has its default value of 100) and one 
of them has a "nice" value one less than the other's, then the task with 
the lower "nice" value will get one and a half times as much CPU access 
as the other.

max_ai_bonus  -- determines the maximum interactive bonus that a task 
can receive.  This has a default value of 10 and a maximum value of 10. 
  Setting this value to zero effectively turns off the interactive bonus.

max_tpt_bonus  -- determines the maximum throughput bonus that a task 
can receive.  This has a default value of 5 and a maximum value of 9 (to 
keep the maximum number of priority slots <= 160 (i.e. 5 * 32)).  Setting 
this value to zero effectively turns off the throughput bonus.

ia_threshold  -- is the sleepiness (i.e. A_s / (A_s + A_c) where A_s is 
the average sleep duration per scheduling cycle and A_c is the average 
CPU consumption per scheduling cycle) threshold (in thousandths) above 
which the task will be considered interactive and have its interactive 
bonus increased asymptotically towards the maximum.  This is assessed at 
the end of each scheduling cycle and is only used to determine whether 
an increase in the bonus is warranted.  This has a default value of 900 
and a minimum and maximum of 0 and 1000 respectively.

cpu_hog_threshold  -- is the idleness (i.e. (A_s + A_d) / (A_s + A_d + 
A_c), where A_c and A_s are as before and A_d is the average time spent 
on the run queue per cycle waiting for CPU access) threshold (in 
thousandths) below which a task will be considered a CPU hog and start 
to lose interactive bonus points (if it has any).  This is assessed at 
the end of each scheduling cycle and at the end of each time slice (if 
the task uses an entire slice) and is only used to decrease the task's 
interactive bonus.  This has a default value of 500 and a minimum and 
maximum of 0 and 1000 respectively.

I suspect that this last parameter is the cause of the phenomenon that 
you reported, as the activity you described probably caused the X 
server's idleness to fall below 500, causing it to lose some of its 
interactive bonus.  Could you apply this patch and try decreasing the 
value of this parameter to see if your problem goes away? 
A clue as to how small to make it can be obtained by observing the CPU 
usage rate reported by top for the X server when you are stirring it up. 
If the reported usage is y% then a threshold slightly smaller than 
(1000 - y * 10) should work.

Thanks
Peter
PS I will probably change the semantics of cpu_hog_threshold in the next 
version to use the CPU usage rate (the opposite of idleness) as the 
measure and reverse the test (i.e. if the usage rate is greater than the 
threshold the task loses some of its interactive bonus).  Its value 
(A_c / (A_c + A_s + A_d); NB this is different from the complement of 
sleepiness) is slightly cheaper to calculate and I also think it's a 
more natural way of thinking about the issue.
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce



* Re: [PATCH][2.6.7-rc3] Single Priority Array CPU Scheduler
  2004-06-13  1:15     ` Peter Williams
@ 2004-06-13 17:46       ` Shane Shrybman
  2004-06-13 23:45         ` Peter Williams
  0 siblings, 1 reply; 7+ messages in thread
From: Shane Shrybman @ 2004-06-13 17:46 UTC (permalink / raw)
  To: Peter Williams; +Cc: linux-kernel

On Sat, 2004-06-12 at 21:15, Peter Williams wrote:
> Shane Shrybman wrote:
> > On Fri, 2004-06-11 at 20:10, Peter Williams wrote:
> > 
> >>Shane Shrybman wrote:
> >>
> >>>Hi Peter,
> >>>
> >>>I just started to try out your SPA scheduler patch and found that it is
> >>>noticeably sluggish when resizing a mozilla window on the desktop. I
> >>>have a profile of 2.6.7-rc3-spa and 2.6.7-rc2-mm2 and put them up at:
> >>>http://zeke.yi.org/linux/spa/ . There is also vmstat output there but it
> >>>doesn't look too helpful to me.
> >>>
> > 
> > [..snip..]
> > 
> >>Thanks for the feedback, I'll add your test to those I perform myself.
> >>
> > 
> > 
> > Sure, no problem. Hope it helps.
> > 
> > 
> >>Some of the control variables for this scheduler have rather arbitrary 
> >>values at the moment and are likely to be non optimal.  I'm in the 
> >>process of making some of these variables modifiable at run time via 
> >>/proc/sys to enable experimentation with different settings.  Hopefully, 
> >>this will enable settings that improve interactive performance to be 
> >>established.
> 
> A patch which adds control of the SPA scheduler via /proc/sys/kernel is 
> available at:
> 
> <http://users.bigpond.net.au/Peter-Williams/patch-2_6_7_rc3-SPA_CNTL-v0.1>
> 
> This adds the directory /proc/sys/kernel/cpusched to the /proc directory 
> if SYSCTL is configured into the kernel.  This directory contains 6 
> files that allow read/write (root only for write) to 6 unsigned integer 
> parameters that control the scheduler as follows:
> 
> time_slice -- (in msecs) controls the amount of CPU time that CPU bound 
> tasks get before they are booted off the CPU, have their bonuses 
> reassessed, are put back on the run queue and rescheduled.  This has a 
> default value of 100 (which is the average time slice dealt out by the 
> original O(1) scheduler) and a maximum allowable value of 1000 (i.e. 1 
> second) and a minimum of 1.  Making this smaller MAY help interactive 
> response but COULD increase the context switching rate.  Making this 
> larger MAY improve overall system throughput and reduce context 
> switching rate but MAY make interactive responsiveness worse.
> 
> base_promotion_interval -- (in msecs) controls the rate at which the 
> promotion mechanism promotes runnable tasks.  This has a default value 
> of 55 and a maximum of UINT_MAX and a minimum of 1.  The primary effect 
> of changing this parameter is to alter the "severity" (i.e. how much 
> more CPU access a lower "nice" will give a task) of "nice" values.  The 
> default value is such that if there are two CPU bound tasks (hard 
> spinners) running (and time_slice has its default value of 100) and one 
> of them has a "nice" value one less than the other then that task will 
> get one and a half times as much CPU access as the other.
> 
> max_ai_bonus  -- determines the maximum interactive bonus that a task 
> can receive.  This has a default value of 10 and a maximum value of 10. 
>   Setting this value to zero effectively turns off the interactive bonus.
> 
> max_tpt_bonus  -- determines the maximum throughput bonus that a task 
> can receive.  This has a default value of 5 and a maximum value of 9 (to 
> keep the maximum number of priority slots <= 160 (or 5 * 32)). Setting 
> this value to zero effectively turns off the interactive bonus.
> 
> ia_threshold  -- is the sleepiness (i.e. A_s / (A_s + A_c) where A_s is 
> the average sleep duration per scheduling cycle and A_c is the average 
> CPU consumption per scheduling cycle) threshold (in thousandths) above 
> which the task will be considered interactive and have its interactive 
> bonus increased asymptotically towards the maximum.  This is assessed at 
> the end of each scheduling cycle and is only used to determine whether 
> an increase in the bonus is warranted.  This has a default value of 900 
> and a minimum and maximum of 0 and 1000 respectively.
> 
> cpu_hog_threshold  -- is the idleness (i.e.. (A_s + A_d) / (A_s + A_d + 
> A_c) where A_c and A_s are as before and A_d is average time spent on 
> the run queue per cycle waiting for CPU access) threshold (in 
> thousandths) below which a task will be considered a CPU hog and start 
> to lose interactive bonus points (if it has any).  This is assessed at 
> the end of each scheduling cycle and at the end of each time slice (if 
> the task uses an entire slice) and is only used to decrease the task's 
> interactive bonus.  This has a default value of 500 and a minimum and 
> maximum of 0 and 1000 respectively.
> 
> I suspect that it is this last parameter that is the cause of the 
> phenomena that you reported as the activity you described probably 
> caused the X servers idleness to become smaller than 500 causing it to 
> lose some of its interactive bonus.  Could you apply this patch and try 
> decreasing the value of this parameter to see if your problem goes away? 
>   A clue as to how small to make it can be obtained by observing the CPU 
> usage rate reported by top for the X server when you are stirring it up. 
>   If the reported usage is y% then a threshold of slightly smaller than 
> (1000 - y * 10) should work.
> 

I am doing the same basic test as before but I should mention a few
things. In mozilla I have ten tabs open, seven of them with loaded web
pages. Depending on which tab is focused, the interactive responsiveness
varies greatly. Performance decreases rapidly with the complexity of the
web page to be rendered. I am using the Debian 4.3.0-7 20040318043201
version of XFree86 with the nv driver. This driver is incredibly hungry
for CPU resources (an absolute pig really). 

Here is what I see:

At the default setting of 500 I see mozilla and XFree86 each at about
49% cpu (fighting for cpu), each with a priority of ~30-33, with no idle
time. Metacity, the window manager, also looks to be trying to get some
CPU, because it is usually third on the list at ~3% cpu.

When cpu_hog_threshold is decreased to 400: X=50% pri=30, mozilla=40%
pri=34-37, metacity=4.5% pri=29. It feels a tad less jerky, because
metacity gets a bit more cpu?

Decreasing cpu_hog_threshold to 300 feels worse than 500.

Decreasing the time_slice to 10-20 makes things noticeably smoother. Are
time slices dynamic in the SPA scheduler? And in the vanilla O(1)
scheduler?

My humble assessment is that to provide a good interactive feel for this
workload, XFree86, mozilla and metacity all need to get CPU in a timely
manner. This is a tricky balancing act because XFree86 and mozilla can
easily fall into the cpu hog category, or, if they are not cpu hogs, they
can act to starve metacity of cpu resources.

Right now I am using cpu_hog_threshold=400 and time_slice=20 and it seems
to be an improvement. But this is fairly subjective, of course.

> Thanks
> Peter
> PS I will probably change the semantics of cpu_hog_threshold in the next 
> version to use CPU usage rate (the opposite of idleness) as the measure 
> and reverse the test (i.e. if usage rate is greater than the threshold 
> the task loses some of its interactive bonus). Its value (A_c / (A_c + 
> A_s + A_d) NB this is different to the complement of sleepiness) is 
> slightly cheaper to calculate and I also think it's a more natural way 
> of thinking about the issue.

Regards,

Shane



* Re: [PATCH][2.6.7-rc3] Single Priority Array CPU Scheduler
  2004-06-13 17:46       ` Shane Shrybman
@ 2004-06-13 23:45         ` Peter Williams
  0 siblings, 0 replies; 7+ messages in thread
From: Peter Williams @ 2004-06-13 23:45 UTC (permalink / raw)
  To: Shane Shrybman; +Cc: linux-kernel

Shane Shrybman wrote:
[bits deleted]
> 
> I am doing the same basic test as before but I should mention a few
> things. In mozilla I have ten tabs open, 7 of them with loaded web
> pages. Depending on which tab is focused the interactive responsiveness
> varies greatly. Performance decreases rapidly with the complexity of the
> web page to be rendered. I am using the Debian 4.3.0-7 20040318043201
> version of XFree86 with the nv driver. This driver is incredibly hungry
> for CPU resources (absolute pig really). 
> 
> Here is what I see:
> 
> At the default setting of 500 I see mozilla and XFree86 each at about
> 49% cpu (fighting for cpu), each with a priority ~30-33, with no idle
> time. Metacity, the window manager, also looks to be trying to get some
> CPU, cause it is usually third on the list at ~3% cpu.
> 
> As cpu_hog_threshold is decreased to 400. CPU X=50% pri=30, mozilla=40%
> pri=34-37, metacity=4.5% pri=29. Feels a tad less jerky, cause metacity
> gets a bit more cpu?
> 
> Decreasing cpu_hog_threshold to 300 feels worse than 500.

This probably means that a CPU hog managed to get some interactive bonus 
points and the lower threshold stops them being taken away.

> 
> Decreasing the time_slice to 10-20 makes things noticeably smoother. Are
> time slices dynamic in the SPA scheduler?

No.  Each task gets a fixed fresh time slice when it wakes up and at the 
end of an expired time slice.

> and the vanilla (0)1
> scheduler?

The vanilla scheduler uses time slices to control "nice" and the 
allocated size varies widely.  The default value in SPA is the average 
of what vanilla would have awarded.  This is not necessarily the best 
time slice for interactive response with SPA.

> 
> My humble assessment is that to provide a good interactive feel for this
> workload; Xfree86, mozilla and metacity all need to get CPU in a timely
> manner. This is a tricky balancing act because XFree86 and mozilla can
> easily fall into the cpu hog category, or if they are not cpu hogs they
> can act to starve metacity of cpu resources.

It is indeed a tricky balancing act and it would be easier if more were 
known about the behavioural characteristics of the protagonists.  I have 
added a logging switch which, when on, will log the scheduling statistics 
for exiting tasks to the system log, and I plan to use this to get an 
idea of the characteristics of the tasks involved in a kernel build.

I'm also writing a tool that can be pointed at a particular task (such 
as the X server) and gather time series data on the behaviour of that 
task using the statistics in /proc/<tgid>/task/<tid>/cpustats. 
Hopefully, that will help in working out the best settings or perhaps 
suggest other criteria that may be better for deciding interactiveness.

I had hoped that the throughput bonus would have helped in situations 
such as the one you describe, where several interactive tasks are 
fighting it out, by minimizing queuing.  I suspect that the reason it 
isn't helping is that a linear mapping from CPU delay to the bonus is 
inadequate, as it means quite large discrepancies (20%) are required to 
earn one bonus point.  I will look at modifying the mapping function to 
give more emphasis to small discrepancies.

> 
> Right now I am using cpu_hog_threshold=400 and timeslice=20 and it seems
> to be an improvement. But this is fairly subjective of course.

You might also try increasing the promotion interval a little as this 
will accentuate the differences between the allocated priorities.

Thanks again for the feedback
Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce



end of thread, other threads:[~2004-06-13 23:45 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-06-11 13:39 [PATCH][2.6.7-rc3] Single Priority Array CPU Scheduler Shane Shrybman
2004-06-12  0:10 ` Peter Williams
2004-06-12 14:50   ` Shane Shrybman
2004-06-13  1:15     ` Peter Williams
2004-06-13 17:46       ` Shane Shrybman
2004-06-13 23:45         ` Peter Williams
  -- strict thread matches above, loose matches on Subject: below --
2004-06-10  5:36 Peter Williams
