* Re: Fwd: Re: [patch][rfc] quell interactive feeding frenzy
[not found] <200604112100.28725.kernel@kolivas.org>
@ 2006-04-11 17:03 ` Al Boldi
2006-04-11 22:56 ` Con Kolivas
0 siblings, 1 reply; 27+ messages in thread
From: Al Boldi @ 2006-04-11 17:03 UTC (permalink / raw)
To: Con Kolivas; +Cc: linux-kernel, Mike Galbraith
Con Kolivas wrote:
> Hi Al
Hi Con!
> On Tuesday 11 April 2006 00:43, Al Boldi wrote:
> > After that the loadavg starts to wrap.
> > And even then it is possible to login.
> > And that's not with the default 2.6 scheduler, but rather w/ spa.
>
> Since you seem to use plugsched, I wonder if you could tell me how does
> current staircase perform with a load like that?
With plugsched-2.6.16 your staircase sched reaches about 40 then slows down,
maxing around 100. Setting sched_compute=1 causes console lock-ups.
With staircase14.2-test3 it reaches around 300 then slows down, halting at
around 500.
Your scheduler seems to be tuned for single-user multi-tasking, i.e. around
10 concurrent tasks, where its aggressive nature is sustained by a short
run-queue. Once you go above 50, this aggressiveness starts to express
itself as very jumpy behaviour.
This is of course very cpu/mem/ctxt dependent, and it would be great if your
scheduler could do some simple on-the-fly benchmarking as it reschedules,
adjusting this aggressiveness depending on its sustainability.
Thanks!
--
Al
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch][rfc] quell interactive feeding frenzy
2006-04-11 17:03 ` Fwd: Re: [patch][rfc] quell interactive feeding frenzy Al Boldi
@ 2006-04-11 22:56 ` Con Kolivas
2006-04-12 5:41 ` Al Boldi
0 siblings, 1 reply; 27+ messages in thread
From: Con Kolivas @ 2006-04-11 22:56 UTC (permalink / raw)
To: Al Boldi, ck list; +Cc: linux-kernel, Mike Galbraith
On Wednesday 12 April 2006 03:03, Al Boldi wrote:
> With plugsched-2.6.16 your staircase sched reaches about 40 then slows
> down, maxing around 100. Setting sched_compute=1 causes console lock-ups.
Which is fine because sched_compute isn't designed for heavily multithreaded
usage.
> With staircase14.2-test3 it reaches around 300 then slows down, halting at
> around 500.
Oh that's good because staircase14.2_test3 is basically staircase15 which is
in the current plugsched (ie newer than the staircase you tested in
plugsched-2.6.16 above). So it tolerates a load of up to 500 on single cpu?
That seems very robust to me.
> Your scheduler seems to be tuned for single-user multi-tasking, i.e.
> concurrent tasks around 10, where its aggressive nature is sustained by a
> short run-queue. Once you go above 50, this aggressiveness starts to
> express itself as very jumpy.
Oh no it's nothing like "tuned for single-user multi tasking". It seems a
common misconception because interactivity is a prime concern for staircase
but the idea is that we should be able to do interactivity without
sacrificing fairness. The same mechanism that is responsible for maintaining
fairness is also responsible for creating its interactivity. That's what I
mean by "interactive by design", and what makes it different from extracting
interactivity out of other designs that have some form of estimator to add
unfairness to create that interactivity.
> This is of course very cpu/mem/ctxt dependent and it would be great, if
> your scheduler could maybe do some simple on-the-fly benchmarking as it
> reschedules, thus adjusting this aggressiveness depending on its
> sustainability.
I know you're _very_ keen on the idea of some autotuning but I think this is
the wrong thing to autotune. The whole point of staircase is it's a simple
design without any interactivity estimator. It uses pure cpu accounting to
change priority and that is a percentage which is effectively already tuned
to the underlying cpu. Any benchmarking/aggressiveness "tuning" would undo
the (effectively) very simple design.
Feel free to look at the code. Sleep for time Y, increase priority by
Y/RR_INTERVAL. Run for time X, drop priority by X/RR_INTERVAL. If it drops to
lowest priority it then jumps back up to best priority again (to prevent it
being "batch starved").
Thanks very much for testing :)
--
-ck
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch][rfc] quell interactive feeding frenzy
2006-04-11 22:56 ` Con Kolivas
@ 2006-04-12 5:41 ` Al Boldi
2006-04-12 6:22 ` Con Kolivas
0 siblings, 1 reply; 27+ messages in thread
From: Al Boldi @ 2006-04-12 5:41 UTC (permalink / raw)
To: Con Kolivas, ck list; +Cc: linux-kernel, Mike Galbraith
Con Kolivas wrote:
> On Wednesday 12 April 2006 03:03, Al Boldi wrote:
> > With plugsched-2.6.16 your staircase sched reaches about 40 then slows
> > down, maxing around 100. Setting sched_compute=1 causes console
> > lock-ups.
>
> Which is fine because sched_compute isn't designed for heavily
> multithreaded usage.
What's it good for?
> > With staircase14.2-test3 it reaches around 300 then slows down, halting
> > at around 500.
>
> Oh that's good because staircase14.2_test3 is basically staircase15 which
> is in the current plugsched (ie newer than the staircase you tested in
> plugsched-2.6.16 above). So it tolerates a load of up to 500 on single
> cpu? That seems very robust to me.
Yes, better than the default 2.6 scheduler.
> > Your scheduler seems to be tuned for single-user multi-tasking, i.e.
> > concurrent tasks around 10, where its aggressive nature is sustained by
> > a short run-queue. Once you go above 50, this aggressiveness starts to
> > express itself as very jumpy.
>
> Oh no it's nothing like "tuned for single-user multi tasking". It seems a
> common misconception because interactivity is a prime concern for
> staircase but the idea is that we should be able to do interactivity
> without sacrificing fairness.
Agreed.
> The same mechanism that is responsible for
> maintaining fairness is also responsible for creating its interactivity.
> That's what I mean by "interactive by design", and what makes it different
> from extracting interactivity out of other designs that have some form of
> estimator to add unfairness to create that interactivity.
Yes, but staircase isn't really fair, and it's definitely not smooth. You
are trying to get ia by aggressively attacking priority which kills
smoothness, and is only fair with a short run-queue.
> > This is of course very cpu/mem/ctxt dependent and it would be great, if
> > your scheduler could maybe do some simple on-the-fly benchmarking as it
> > reschedules, thus adjusting this aggressiveness depending on its
> > sustainability.
>
> I know you're _very_ keen on the idea of some autotuning but I think this
> is the wrong thing to autotune. The whole point of staircase is it's a
> simple design without any interactivity estimator. It uses pure cpu
> accounting to change priority and that is a percentage which is
> effectively already tuned to the underlying cpu. Any
> benchmarking/aggressiveness "tuning" would undo the (effectively) very
> simple design.
I like simple designs. They tend to keep things to the point and aid
efficiency. But staircase doesn't look efficient to me under heavy load,
and I would think this may be easily improved.
> Feel free to look at the code. Sleep for time Y, increase priority by
> Y/RR_INTERVAL. Run for time X, drop priority by X/RR_INTERVAL. If it drops
> to lowest priority it then jumps back up to best priority again (to
> prevent it being "batch starved").
Looks simple enough, and should work for short run-queues, but it looks
unsustainable for long run-queues, due to the unconditional jump from lowest
to best prio. Making that jump conditional, and maybe moderating X, Y and
RR_INTERVAL, could be helpful.
Also, can you export lowest/best prio as well as timeslice and friends to
procfs/sysfs?
> Thanks very much for testing :)
Thank you!
--
Al
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch][rfc] quell interactive feeding frenzy
2006-04-12 5:41 ` Al Boldi
@ 2006-04-12 6:22 ` Con Kolivas
2006-04-12 8:17 ` Al Boldi
0 siblings, 1 reply; 27+ messages in thread
From: Con Kolivas @ 2006-04-12 6:22 UTC (permalink / raw)
To: Al Boldi; +Cc: ck list, linux-kernel, Mike Galbraith
On Wed, 12 Apr 2006 03:41 pm, Al Boldi wrote:
> Con Kolivas wrote:
> > Which is fine because sched_compute isn't designed for heavily
> > multithreaded usage.
>
> What's it good for?
Single heavily cpu bound computationally intensive tasks (think rendering
etc).
> > Oh that's good because staircase14.2_test3 is basically staircase15 which
> > is in the current plugsched (ie newer than the staircase you tested in
> > plugsched-2.6.16 above). So it tolerates a load of up to 500 on single
> > cpu? That seems very robust to me.
>
> Yes, better than the default 2.6 scheduler.
>
> > > Your scheduler seems to be tuned for single-user multi-tasking, i.e.
> > > concurrent tasks around 10, where its aggressive nature is sustained by
> > > a short run-queue. Once you go above 50, this aggressiveness starts to
> > > express itself as very jumpy.
> >
> > Oh no it's nothing like "tuned for single-user multi tasking". It seems a
> > common misconception because interactivity is a prime concern for
> > staircase but the idea is that we should be able to do interactivity
> > without sacrificing fairness.
>
> Agreed.
>
> > The same mechanism that is responsible for
> > maintaining fairness is also responsible for creating its interactivity.
> > That's what I mean by "interactive by design", and what makes it
> > different from extracting interactivity out of other designs that have
> > some form of estimator to add unfairness to create that interactivity.
>
> Yes, but staircase isn't really fair, and it's definitely not smooth. You
> are trying to get ia by aggressively attacking priority which kills
> smoothness, and is only fair with a short run-queue.
Sorry, I don't understand what you mean. Why do you say it's not fair (got a
testcase?)? What do you mean by "definitely not smooth"? What is smoothness,
and on what workloads is it not smooth? Also, by ia you mean what?
> > I know you're _very_ keen on the idea of some autotuning but I think this
> > is the wrong thing to autotune. The whole point of staircase is it's a
> > simple design without any interactivity estimator. It uses pure cpu
> > accounting to change priority and that is a percentage which is
> > effectively already tuned to the underlying cpu. Any
> > benchmarking/aggressiveness "tuning" would undo the (effectively) very
> > simple design.
>
> I like simple designs. They tend to keep things to the point and aid
> efficiency. But staircase doesn't look efficient to me under heavy load,
> and I would think this may be easily improved.
Again I don't understand. Just how heavy a load is heavy? Your testcases are
already in what I would call stratospheric range. I don't personally think a
cpu scheduler should be optimised for load infinity. And how are you defining
efficient? You say it doesn't "look" efficient? What "looks" inefficient
about it?
> > Feel free to look at the code. Sleep for time Y, increase priority by
> > Y/RR_INTERVAL. Run for time X, drop priority by X/RR_INTERVAL. If it
> > drops to lowest priority it then jumps back up to best priority again (to
> > prevent it being "batch starved").
>
> Looks simple enough, and should work for short run'q's, but this looks
> unsustainable for long run'q's, due to the unconditional jump from lowest
> to best prio.
Looks? How? You've already shown that what I consider very long runqueues
work fine.
> Making it conditional and maybe moderating X,Y,RR_INTERVAL
> could be helpful.
I think it works over all meaningful loads, and into absurdly high load
ranges. I don't think the incredible simplicity that works over all that
range should be undone to optimise for loads even greater than that.
> Also, can you export lowest/best prio as well as timeslice and friends to
> procfs/sysfs?
You want tunables? The only tunable in staircase is rr_interval which (in -ck)
has an on/off for big/small (sched_compute) since most other numbers in
between (in my experience) are pretty meaningless. I could export rr_interval
directly instead... I've not seen a good argument for doing that. Got one?
However there are no other tunables at all (just look at the code). All tasks
of any nice level have available the whole priority range from 100-139 which
appears as PRIO 0-39 on top. Limiting that (again) changes the semantics.
> > Thanks very much for testing :)
>
> Thank you!
And another round of thanks :) But many more questions.
--
-ck
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch][rfc] quell interactive feeding frenzy
2006-04-12 6:22 ` Con Kolivas
@ 2006-04-12 8:17 ` Al Boldi
2006-04-12 9:36 ` Con Kolivas
0 siblings, 1 reply; 27+ messages in thread
From: Al Boldi @ 2006-04-12 8:17 UTC (permalink / raw)
To: Con Kolivas; +Cc: ck list, linux-kernel, Mike Galbraith
Con Kolivas wrote:
> On Wed, 12 Apr 2006 03:41 pm, Al Boldi wrote:
> > Con Kolivas wrote:
> > > Which is fine because sched_compute isn't designed for heavily
> > > multithreaded usage.
> >
> > What's it good for?
>
> Single heavily cpu bound computationally intensive tasks (think rendering
> etc).
Why do you need a switch for that?
> > > The same mechanism that is responsible for
> > > maintaining fairness is also responsible for creating its
> > > interactivity. That's what I mean by "interactive by design", and what
> > > makes it different from extracting interactivity out of other designs
> > > that have some form of estimator to add unfairness to create that
> > > interactivity.
> >
> > Yes, but staircase isn't really fair, and it's definitely not smooth.
> > You are trying to get ia by aggressively attacking priority which kills
> > smoothness, and is only fair with a short run-queue.
>
> Sorry I don't understand what you mean. Why do you say it's not fair (got
> a testcase?). What do you mean by "definitely not smooth". What is
> smoothness and on what workloads is it not smooth? Also by ia you mean
> what?
ia = interactivity, i.e. responsiveness under high load.
smooth = not jumpy, i.e. '# gears & morph3d & reflect &' runs w/o stutter.
fair = non-hogging, i.e. spreading cpu load across tasks evenly (top d.1)
> > > I know you're _very_ keen on the idea of some autotuning but I think
> > > this is the wrong thing to autotune. The whole point of staircase is
> > > it's a simple design without any interactivity estimator. It uses pure
> > > cpu accounting to change priority and that is a percentage which is
> > > effectively already tuned to the underlying cpu. Any
> > > benchmarking/aggressiveness "tuning" would undo the (effectively) very
> > > simple design.
> >
> > I like simple designs. They tend to keep things to the point and aid
> > efficiency. But staircase doesn't look efficient to me under heavy
> > load, and I would think this may be easily improved.
>
> Again I don't understand. Just how heavy a load is heavy? Your testcases
> are already in what I would call stratospheric range. I don't personally
> think a cpu scheduler should be optimised for load infinity. And how are
> you defining efficient? You say it doesn't "look" efficient? What "looks"
> inefficient about it?
The idea here is to expose inefficiencies by driving the system into
saturation, and although staircase is more efficient than the default 2.6
scheduler, it is obviously less efficient than spa.
> > Also, can you export lowest/best prio as well as timeslice and friends
> > to procfs/sysfs?
>
> You want tunables? The only tunable in staircase is rr_interval which (in
> -ck) has an on/off for big/small (sched_compute) since most other numbers
> in between (in my experience) are pretty meaningless. I could export
> rr_interval directly instead... I've not seen a good argument for doing
> that. Got one?
Smoothness control, maybe?
> However there are no other tunables at all (just look at
> the code). All tasks of any nice level have available the whole priority
> range from 100-139 which appears as PRIO 0-39 on top. Limiting that
> (again) changes the semantics.
Yes, limiting this could change the semantics for the sake of fairness; it's
up to you.
> And another round of thanks :) But many more questions.
No problem.
Thanks!
--
Al
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch][rfc] quell interactive feeding frenzy
2006-04-12 8:17 ` Al Boldi
@ 2006-04-12 9:36 ` Con Kolivas
2006-04-12 10:39 ` Al Boldi
0 siblings, 1 reply; 27+ messages in thread
From: Con Kolivas @ 2006-04-12 9:36 UTC (permalink / raw)
To: Al Boldi; +Cc: ck list, linux-kernel, Mike Galbraith
On Wednesday 12 April 2006 18:17, Al Boldi wrote:
> Con Kolivas wrote:
> > Single heavily cpu bound computationally intensive tasks (think rendering
> > etc).
>
> Why do you need a switch for that?
Because deferring need_resched and reassessing priority at less regular
intervals means less overhead, and there is always something else running on
a pc. At low loads the longer timeslices and delayed preemption contribute
considerably to cache warmth and throughput. Comparing staircase's
sched_compute mode on kernbench at "optimal loads" (make -j4 x num_cpus)
showed the best throughput of all the schedulers tested.
> > Sorry I don't understand what you mean. Why do you say it's not fair (got
> > a testcase?). What do you mean by "definitely not smooth". What is
> > smoothness and on what workloads is it not smooth? Also by ia you mean
> > what?
>
> ia=interactivity i.e: responsiveness under high load.
> smooth=not jumpy i.e: run '# gears & morph3d & reflect &' w/o stutter
Installed and tested here just now. They run smoothly concurrently here. Are
you testing on staircase15?
> fair=non hogging i.e: spreading cpu-load across tasks evenly (top d.1)
Only unblocked processes/threads where one depends on the other don't get
equal share, which is as broken a testcase as relying on sched_yield. I have
not seen a testcase demonstrating unfairness on current staircase. top shows
me fair cpu usage.
> > Again I don't understand. Just how heavy a load is heavy? Your testcases
> > are already in what I would call stratospheric range. I don't personally
> > think a cpu scheduler should be optimised for load infinity. And how are
> > you defining efficient? You say it doesn't "look" efficient? What "looks"
> > inefficient about it?
>
> The idea here is to expose inefficiencies by driving the system into
> saturation, and although staircase is more efficient than the default 2.6
> scheduler, it is obviously less efficient than spa.
Where do you stop calling something saturation and start calling it absurd? By
your reckoning staircase is stable to loads of 300 on one cpu. spa being
stable to higher loads is hardly comparable given the interactivity disparity
between it and staircase. A compromise is one that does both very well; not
one perfectly and the other poorly.
> > You want tunables? The only tunable in staircase is rr_interval which (in
> > -ck) has an on/off for big/small (sched_compute) since most other numbers
> > in between (in my experience) are pretty meaningless. I could export
> > rr_interval directly instead... I've not seen a good argument for doing
> > that. Got one?
>
> Smoothness control, maybe?
Have to think about that one. I'm not seeing a smoothness issue.
> > However there are no other tunables at all (just look at
> > the code). All tasks of any nice level have available the whole priority
> > range from 100-139 which appears as PRIO 0-39 on top. Limiting that
> > (again) changes the semantics.
>
> Yes, limiting this could change the semantics for the sake of fairness,
> it's up to you.
There is no problem with fairness that I am aware of.
Thanks!
--
-ck
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch][rfc] quell interactive feeding frenzy
2006-04-12 9:36 ` Con Kolivas
@ 2006-04-12 10:39 ` Al Boldi
2006-04-12 11:27 ` Con Kolivas
0 siblings, 1 reply; 27+ messages in thread
From: Al Boldi @ 2006-04-12 10:39 UTC (permalink / raw)
To: Con Kolivas; +Cc: ck list, linux-kernel, Mike Galbraith
Con Kolivas wrote:
> On Wednesday 12 April 2006 18:17, Al Boldi wrote:
> > Con Kolivas wrote:
> > > Single heavily cpu bound computationally intensive tasks (think
> > > rendering etc).
> >
> > Why do you need a switch for that?
>
> Because avoiding doing need_resched and reassessing priority at less
> regular intervals means less overhead, and there is always something else
> running on a pc. At low loads the longer timeslices and delayed preemption
> contribute considerably to cache warmth and throughput. Comparing
> staircase's sched_compute mode on kernbench at "optimal loads" (make -j4 x
> num_cpus) showed the best throughput of all the schedulers tested.
Great!
> > > Sorry I don't understand what you mean. Why do you say it's not fair
> > > (got a testcase?). What do you mean by "definitely not smooth". What
> > > is smoothness and on what workloads is it not smooth? Also by ia you
> > > mean what?
> >
> > ia=interactivity i.e: responsiveness under high load.
> > smooth=not jumpy i.e: run '# gears & morph3d & reflect &' w/o stutter
>
> Installed and tested here just now. They run smoothly concurrently here.
> Are you testing on staircase15?
staircase14.2-test3. Are you testing w/ DRM? If not, all mesa requests
will be queued into X, which then runs as one task (check top d.1).
> > fair=non hogging i.e: spreading cpu-load across tasks evenly (top d.1)
>
> Only unblocked processes/threads where one depends on the other don't get
> equal share, which is as broken a testcase as relying on sched_yield. I
> have not seen a testcase demonstrating unfairness on current staircase.
> top shows me fair cpu usage.
Try ping -A (10x). top d.1 should show skewed times. If you have a fast
machine, you may have to increase the load.
> > > Again I don't understand. Just how heavy a load is heavy? Your
> > > testcases are already in what I would call stratospheric range. I
> > > don't personally think a cpu scheduler should be optimised for load
> > > infinity. And how are you defining efficient? You say it doesn't
> > > "look" efficient? What "looks" inefficient about it?
> >
> > The idea here is to expose inefficiencies by driving the system into
> > saturation, and although staircase is more efficient than the default
> > 2.6 scheduler, it is obviously less efficient than spa.
>
> Where do you stop calling something saturation and start calling it
> absurd? By your reckoning staircase is stable to loads of 300 on one cpu.
> spa being stable to higher loads is hardly comparable given the
> interactivity disparity between it and staircase. A compromise is one that
> does both very well; not one perfectly and the other poorly.
>
> > > You want tunables? The only tunable in staircase is rr_interval which
> > > (in -ck) has an on/off for big/small (sched_compute) since most other
> > > numbers in between (in my experience) are pretty meaningless. I could
> > > export rr_interval directly instead... I've not seen a good argument
> > > for doing that. Got one?
> >
> > Smoothness control, maybe?
>
> Have to think about that one. I'm not seeing a smoothness issue.
>
> > > However there are no other tunables at all (just look at
> > > the code). All tasks of any nice level have available the whole
> > > priority range from 100-139 which appears as PRIO 0-39 on top.
> > > Limiting that (again) changes the semantics.
> >
> > Yes, limiting this could change the semantics for the sake of fairness,
> > it's up to you.
>
> There is no problem with fairness that I am aware of.
Let's see after you retry the tests.
Thanks!
--
Al
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch][rfc] quell interactive feeding frenzy
2006-04-12 10:39 ` Al Boldi
@ 2006-04-12 11:27 ` Con Kolivas
2006-04-12 15:25 ` Al Boldi
0 siblings, 1 reply; 27+ messages in thread
From: Con Kolivas @ 2006-04-12 11:27 UTC (permalink / raw)
To: Al Boldi; +Cc: ck list, linux-kernel, Mike Galbraith
On Wednesday 12 April 2006 20:39, Al Boldi wrote:
> Con Kolivas wrote:
> > Installed and tested here just now. They run smoothly concurrently here.
> > Are you testing on staircase15?
>
> staircase14.2-test3. Are you testing w/ DRM? If not then all mesa
> requests will be queued into X, and then runs as one task (check top d.1)
Nvidia driver; all separate tasks in top.
> Try ping -A (10x). top d.1 should show skewed times. If you have a fast
> machine, you may have to increase the load.
Ran for a bit over 10 mins outside of X to avoid other tasks influencing
results. I was too lazy to go to init 1.
ps -eALo pid,spid,user,priority,ni,pcpu,vsize,time,args
15648 15648 root 39 0 9.2 1740 00:01:03 ping -A localhost
15649 15649 root 28 0 9.8 1740 00:01:06 ping -A localhost
15650 15650 root 39 0 9.9 1744 00:01:07 ping -A localhost
15651 15651 root 39 0 9.3 1740 00:01:03 ping -A localhost
15652 15652 root 39 0 10.3 1740 00:01:10 ping -A localhost
15653 15653 root 39 0 10.8 1740 00:01:13 ping -A localhost
15654 15654 root 39 0 10.0 1740 00:01:08 ping -A localhost
15655 15655 root 39 0 10.5 1740 00:01:11 ping -A localhost
15656 15656 root 39 0 9.9 1740 00:01:07 ping -A localhost
15657 15657 root 39 0 10.2 1740 00:01:09 ping -A localhost
mean 67.7 seconds
range 63-73 seconds.
For a load that wakes up so frequently for such a short period of time I
think that is pretty fair cpu distribution over 10 mins. Over shorter periods
top is hopeless at representing accurate cpu usage, especially at low HZ
settings of the kernel. You can see from the ps output that the cpu
distribution is pretty consistent across the tasks, which I consider quite
fair over the 10 mins.
--
-ck
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch][rfc] quell interactive feeding frenzy
2006-04-12 11:27 ` Con Kolivas
@ 2006-04-12 15:25 ` Al Boldi
2006-04-13 11:51 ` Con Kolivas
2006-04-16 6:02 ` Con Kolivas
0 siblings, 2 replies; 27+ messages in thread
From: Al Boldi @ 2006-04-12 15:25 UTC (permalink / raw)
To: Con Kolivas; +Cc: ck list, linux-kernel, Mike Galbraith
Con Kolivas wrote:
> On Wednesday 12 April 2006 20:39, Al Boldi wrote:
> > Con Kolivas wrote:
> > > Installed and tested here just now. They run smoothly concurrently
> > > here. Are you testing on staircase15?
> >
> > staircase14.2-test3. Are you testing w/ DRM? If not then all mesa
> > requests will be queued into X, and then runs as one task (check top
> > d.1)
>
> Nvidia driver; all separate tasks in top.
On a 400MHz P2 w/ i810 DRM and kernel HZ=1000 it stutters.
You may want to compensate for nvidia w/ a few cpu-hogs.
How many gears fps do you get?
> > Try ping -A (10x). top d.1 should show skewed times. If you have a
> > fast machine, you may have to increase the load.
>
> Ran for a bit over 10 mins outside of X to avoid other tasks influencing
> results. I was too lazy to go to init 1.
>
> ps -eALo pid,spid,user,priority,ni,pcpu,vsize,time,args
>
> 15648 15648 root 39 0 9.2 1740 00:01:03 ping -A localhost
> 15649 15649 root 28 0 9.8 1740 00:01:06 ping -A localhost
> 15650 15650 root 39 0 9.9 1744 00:01:07 ping -A localhost
> 15651 15651 root 39 0 9.3 1740 00:01:03 ping -A localhost
> 15652 15652 root 39 0 10.3 1740 00:01:10 ping -A localhost
> 15653 15653 root 39 0 10.8 1740 00:01:13 ping -A localhost
> 15654 15654 root 39 0 10.0 1740 00:01:08 ping -A localhost
> 15655 15655 root 39 0 10.5 1740 00:01:11 ping -A localhost
> 15656 15656 root 39 0 9.9 1740 00:01:07 ping -A localhost
> 15657 15657 root 39 0 10.2 1740 00:01:09 ping -A localhost
>
> mean 67.7 seconds
>
> range 63-73 seconds.
Could this 10s skew be improved to around 1s to aid smoothness?
Thanks!
--
Al
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch][rfc] quell interactive feeding frenzy
2006-04-12 15:25 ` Al Boldi
@ 2006-04-13 11:51 ` Con Kolivas
2006-04-14 3:16 ` Al Boldi
2006-04-16 6:02 ` Con Kolivas
1 sibling, 1 reply; 27+ messages in thread
From: Con Kolivas @ 2006-04-13 11:51 UTC (permalink / raw)
To: Al Boldi; +Cc: ck list, linux-kernel, Mike Galbraith
On Thursday 13 April 2006 01:25, Al Boldi wrote:
> Con Kolivas wrote:
> > Nvidia driver; all separate tasks in top.
>
> On a 400MhzP2 i810drm w/ kernel HZ=1000 it stutters.
> You may want to compensate for nvidia w/ a few cpu-hogs.
I tried adding cpu hogs and it gets extremely slow very soon but still doesn't
stutter here.
> How many gears fps do you get?
When those 3 are running concurrently (without any other cpu hogs) gears is
showing 317 fps.
> > range 63-73 seconds.
>
> Could this 10s skew be improved to around 1s to aid smoothness?
I'm happy to try... but I doubt it. A 10% difference across 10 tasks of that
wake/sleep nature over 10 mins is pretty good IMO. I'll see if there's
anywhere else I can make the cpu accounting any better.
As an aside, note that sched_clock and nanosecond timing with the TSC aren't
actually used if you use the pm timer, which undoes any high-res accounting
the cpu scheduler can do (I noticed when playing with the pm timer that
sched_clock just returns jiffies resolution instead of real nanosecond res).
This could undo any smoothness that good cpu accounting can provide.
--
-ck
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch][rfc] quell interactive feeding frenzy
2006-04-13 11:51 ` Con Kolivas
@ 2006-04-14 3:16 ` Al Boldi
2006-04-15 7:05 ` Con Kolivas
0 siblings, 1 reply; 27+ messages in thread
From: Al Boldi @ 2006-04-14 3:16 UTC (permalink / raw)
To: Con Kolivas; +Cc: ck list, linux-kernel, Mike Galbraith
[-- Attachment #1: Type: text/plain, Size: 2136 bytes --]
Con Kolivas wrote:
> On Thursday 13 April 2006 01:25, Al Boldi wrote:
> > Con Kolivas wrote:
> > > Nvidia driver; all separate tasks in top.
> >
> > On a 400MhzP2 i810drm w/ kernel HZ=1000 it stutters.
> > You may want to compensate for nvidia w/ a few cpu-hogs.
>
> I tried adding cpu hogs and it gets extremely slow very soon but still
> doesn't stutter here.
>
> > How many gears fps do you get?
>
> When those 3 are running concurrently (without any other cpu hogs) gears
> is showing 317 fps.
Your machine is probably too fast to show the problem, as there are enough
cpu-cycles per timeslice to complete the request.
Can you try the attached mem-eater, passing it the number of KB to be eaten?
i.e. '# while :; do ./eatm 9999 ; done'
This will print the amount of memory eaten and the timing in ms.
Assuming timeslice=100, adjust the number of KB to be eaten such that the
timing is less than the timeslice (something like 60ms). Switch to another
vt and start another eatm w/ a number of KB yielding more than the timeslice
(something like 140ms). This second eatm should starve completely after
exceeding the timeslice.
This problem also exists in mainline, but it is able to break out of it to
some extent. Setting eatm kb to a timing larger than timeslice does not
exhibit this problem.
> > > range 63-73 seconds.
> >
> > Could this 10s skew be improved to around 1s to aid smoothness?
>
> I'm happy to try... but I doubt it. 10% difference over 10 tasks over 10
> mins of tasks of that wake/sleep nature is pretty good IMO. I'll see if
> there's anywhere else I can make the cpu accounting any better.
Great!
> As an aside, note that sched_clock and nanosecond timing with TSC isn't
> actually used if you use the pm timer which undoes any high res accounting
> the cpu scheduler can do (I noticed this when playing with pm timer that
> sched_clock just returns jiffies resolution instead of real nanosecond
> res). This could undo any smoothness that good cpu accounting can do.
Yes, pm-timer looks rather broken, at least on my machine. Too bad it's on
by default, as I always have to turn it off.
Thanks!
--
Al
[-- Attachment #2: eatm.c --]
[-- Type: text/x-csrc, Size: 810 bytes --]
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
/* Millisecond stopwatch: call with 1 to start, 0 to read elapsed ms. */
unsigned long elapsed(int start) {
	static struct timeval s, e;
	if (start) return gettimeofday(&s, NULL);
	gettimeofday(&e, NULL);
	return ((e.tv_sec - s.tv_sec) * 1000 + (e.tv_usec - s.tv_usec) / 1000);
}
int main(int argc, char **argv) {
	unsigned long i, j, max;
	unsigned char *p;
	if (argc > 1)
		max = atol(argv[1]);	/* KB to eat, from the command line */
	else
		max = 0x60000;		/* default: 384 MB */
	elapsed(1);
	/* Eat whole megabytes first, touching every KB of each block... */
	for (i = 0; (i < max/1024) && (p = malloc(1024*1024)); i++) {
		for (j = 0; j < 1024; p[1024*j++] = 0);
		fprintf(stderr, "\r%lu MB ", i+1);
	}
	/* ...then the remainder in 1 KB chunks. */
	for (j = max-(i*=1024); (i < max) && (p = malloc(1024)); i++)
		*p = 0;
	fprintf(stderr, "%lu KB ", j-(max-i));
	fprintf(stderr, "eaten in %lu msec (%lu MB/s)\n",
		elapsed(0), i/(elapsed(0)?:1)*1000/1024);
	return 0;
}
* Re: [patch][rfc] quell interactive feeding frenzy
2006-04-14 3:16 ` Al Boldi
@ 2006-04-15 7:05 ` Con Kolivas
2006-04-15 18:23 ` [ck] " Michael Gerdau
` (2 more replies)
0 siblings, 3 replies; 27+ messages in thread
From: Con Kolivas @ 2006-04-15 7:05 UTC (permalink / raw)
To: Al Boldi; +Cc: ck list, linux-kernel, Mike Galbraith
On Friday 14 April 2006 13:16, Al Boldi wrote:
> Can you try the attached mem-eater passing it the number of kb to be eaten.
>
> i.e. '# while :; do ./eatm 9999 ; done'
>
> This will print the number of bytes eaten and the timing in ms.
>
> Assuming timeslice=100, adjust the number of kb to be eaten such that the
> timing will be less than timeslice (something like 60ms). Switch to
> another vt and start another eatm w/ the number of kb yielding more than
> timeslice (something like 140ms). This eatm should starve completely after
> exceeding timeslice.
>
> This problem also exists in mainline, but it is able to break out of it to
> some extent. Setting eatm kb to a timing larger than timeslice does not
> exhibit this problem.
Thanks for bringing this to my attention. A while back I had different
management of forked tasks and merged it with PF_NONSLEEP. Since then I've
changed the management of NONSLEEP tasks and didn't realise it had adversely
affected the accounting of forking tasks. This patch should rectify it.
Thanks!
---
include/linux/sched.h | 1 +
kernel/sched.c | 9 ++++++---
2 files changed, 7 insertions(+), 3 deletions(-)
Index: linux-2.6.16-ck5/include/linux/sched.h
===================================================================
--- linux-2.6.16-ck5.orig/include/linux/sched.h 2006-04-15 16:32:18.000000000 +1000
+++ linux-2.6.16-ck5/include/linux/sched.h 2006-04-15 16:34:36.000000000 +1000
@@ -961,6 +961,7 @@ static inline void put_task_struct(struc
#define PF_SWAPWRITE 0x01000000 /* Allowed to write to swap */
#define PF_NONSLEEP 0x02000000 /* Waiting on in kernel activity */
#define PF_ISOREF 0x04000000 /* SCHED_ISO task has used up quota */
+#define PF_FORKED 0x08000000 /* Task just forked another process */
/*
* Only the _current_ task can read/write to tsk->flags, but other
Index: linux-2.6.16-ck5/kernel/sched.c
===================================================================
--- linux-2.6.16-ck5.orig/kernel/sched.c 2006-04-15 16:32:18.000000000 +1000
+++ linux-2.6.16-ck5/kernel/sched.c 2006-04-15 16:34:35.000000000 +1000
@@ -18,7 +18,7 @@
* 2004-04-02 Scheduler domains code by Nick Piggin
* 2006-04-02 Staircase scheduling policy by Con Kolivas with help
* from William Lee Irwin III, Zwane Mwaikambo & Peter Williams.
- * Staircase v15
+ * Staircase v15_test2
*/
#include <linux/mm.h>
@@ -809,6 +809,9 @@ static inline void recalc_task_prio(task
else
sleep_time = 0;
+ if (unlikely(p->flags & PF_FORKED))
+ sleep_time = 0;
+
/*
* If we sleep longer than our running total and have not set the
* PF_NONSLEEP flag we gain a bonus.
@@ -847,7 +850,7 @@ static void activate_task(task_t *p, run
p->time_slice = p->slice % rr ? : rr;
if (!rt_task(p)) {
recalc_task_prio(p, now);
- p->flags &= ~PF_NONSLEEP;
+ p->flags &= ~(PF_NONSLEEP | PF_FORKED);
p->systime = 0;
p->prio = effective_prio(p);
}
@@ -1464,7 +1467,7 @@ void fastcall wake_up_new_task(task_t *p
/* Forked process gets no bonus to prevent fork bombs. */
p->bonus = 0;
- current->flags |= PF_NONSLEEP;
+ current->flags |= PF_FORKED;
if (likely(cpu == this_cpu)) {
activate_task(p, rq, 1);
--
-ck
* Re: [ck] Re: [patch][rfc] quell interactive feeding frenzy
2006-04-15 7:05 ` Con Kolivas
@ 2006-04-15 18:23 ` Michael Gerdau
2006-04-15 20:45 ` Al Boldi
2006-04-15 22:32 ` jos poortvliet
2 siblings, 0 replies; 27+ messages in thread
From: Michael Gerdau @ 2006-04-15 18:23 UTC (permalink / raw)
To: ck; +Cc: Con Kolivas, Al Boldi, Mike Galbraith, linux-kernel
> Thanks for bringing this to my attention. A while back I had different
> management of forked tasks and merged it with PF_NONSLEEP. Since then I've
> changed the management of NONSLEEP tasks and didn't realise it had adversely
> affected the accounting of forking tasks. This patch should rectify it.
[snip]
At least here this patch fixes the previously starving testcase, i.e. all
eatm processes now continue to work.
Best,
Michael
--
Vote against SPAM - see http://www.politik-digital.de/spam/
Michael Gerdau email: mgd@technosis.de
GPG-keys available on request or at public keyserver
* Re: [patch][rfc] quell interactive feeding frenzy
2006-04-15 7:05 ` Con Kolivas
2006-04-15 18:23 ` [ck] " Michael Gerdau
@ 2006-04-15 20:45 ` Al Boldi
2006-04-15 23:22 ` Con Kolivas
2006-04-15 22:32 ` jos poortvliet
2 siblings, 1 reply; 27+ messages in thread
From: Al Boldi @ 2006-04-15 20:45 UTC (permalink / raw)
To: Con Kolivas; +Cc: ck list, linux-kernel, Mike Galbraith
Con Kolivas wrote:
> On Friday 14 April 2006 13:16, Al Boldi wrote:
> > Can you try the attached mem-eater passing it the number of kb to be
> > eaten.
> >
> > i.e. '# while :; do ./eatm 9999 ; done'
> >
> > This will print the number of bytes eaten and the timing in ms.
> >
> > Assuming timeslice=100, adjust the number of kb to be eaten such that
> > the timing will be less than timeslice (something like 60ms). Switch to
> > another vt and start another eatm w/ the number of kb yielding more than
> > timeslice (something like 140ms). This eatm should starve completely
> > after exceeding timeslice.
> >
> > This problem also exists in mainline, but it is able to break out of it
> > to some extent. Setting eatm kb to a timing larger than timeslice does
> > not exhibit this problem.
>
> Thanks for bringing this to my attention. A while back I had different
> management of forked tasks and merged it with PF_NONSLEEP. Since then I've
> changed the management of NONSLEEP tasks and didn't realise it had
> adversely affected the accounting of forking tasks. This patch should
> rectify it.
Congrats!
Much smoother, but I still get this choke w/ 2 eatm 9999 loops running:
9 MB 783 KB eaten in 131 msec (74 MB/s)
9 MB 783 KB eaten in 129 msec (75 MB/s)
9 MB 783 KB eaten in 129 msec (75 MB/s)
9 MB 783 KB eaten in 131 msec (74 MB/s)
9 MB 783 KB eaten in 133 msec (73 MB/s)
9 MB 783 KB eaten in 132 msec (73 MB/s)
9 MB 783 KB eaten in 128 msec (76 MB/s)
9 MB 783 KB eaten in 133 msec (73 MB/s)
9 MB 783 KB eaten in 129 msec (75 MB/s)
9 MB 783 KB eaten in 130 msec (74 MB/s)
9 MB 783 KB eaten in 2416 msec (3 MB/s) <<<<<<<<<<<<<
9 MB 783 KB eaten in 197 msec (48 MB/s)
9 MB 783 KB eaten in 133 msec (73 MB/s)
9 MB 783 KB eaten in 132 msec (73 MB/s)
9 MB 783 KB eaten in 132 msec (73 MB/s)
9 MB 783 KB eaten in 126 msec (77 MB/s)
9 MB 783 KB eaten in 135 msec (72 MB/s)
9 MB 783 KB eaten in 132 msec (73 MB/s)
9 MB 783 KB eaten in 132 msec (73 MB/s)
9 MB 783 KB eaten in 134 msec (72 MB/s)
9 MB 783 KB eaten in 64 msec (152 MB/s)
9 MB 783 KB eaten in 63 msec (154 MB/s)
9 MB 783 KB eaten in 63 msec (154 MB/s)
9 MB 783 KB eaten in 63 msec (154 MB/s)
9 MB 783 KB eaten in 63 msec (154 MB/s)
9 MB 783 KB eaten in 64 msec (152 MB/s)
9 MB 783 KB eaten in 63 msec (154 MB/s)
9 MB 783 KB eaten in 64 msec (152 MB/s)
9 MB 783 KB eaten in 63 msec (154 MB/s)
9 MB 783 KB eaten in 63 msec (154 MB/s)
9 MB 783 KB eaten in 63 msec (154 MB/s)
9 MB 783 KB eaten in 63 msec (154 MB/s)
9 MB 783 KB eaten in 63 msec (154 MB/s)
You may have to adjust the kb to get the same effect.
Thanks!
--
Al
* Re: [ck] Re: [patch][rfc] quell interactive feeding frenzy
2006-04-15 7:05 ` Con Kolivas
2006-04-15 18:23 ` [ck] " Michael Gerdau
2006-04-15 20:45 ` Al Boldi
@ 2006-04-15 22:32 ` jos poortvliet
2006-04-15 23:06 ` Con Kolivas
2 siblings, 1 reply; 27+ messages in thread
From: jos poortvliet @ 2006-04-15 22:32 UTC (permalink / raw)
To: ck; +Cc: Con Kolivas, Al Boldi, Mike Galbraith, linux-kernel
Op zaterdag 15 april 2006 09:05, schreef Con Kolivas:
> Thanks for bringing this to my attention. A while back I had different
> management of forked tasks and merged it with PF_NONSLEEP. Since then I've
> changed the management of NONSLEEP tasks and didn't realise it had
> adversely affected the accounting of forking tasks. This patch should
> rectify it.
>
> Thanks!
hey con, i get this:
can't find file to patch at input line 9
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
| include/linux/sched.h | 1 +
| kernel/sched.c | 9 ++++++---
| 2 files changed, 7 insertions(+), 3 deletions(-)
|
|Index: linux-2.6.16-ck5/include/linux/sched.h
|===================================================================
|--- linux-2.6.16-ck5.orig/include/linux/sched.h 2006-04-15 16:32:18.000000000
+1000
|+++ linux-2.6.16-ck5/include/linux/sched.h 2006-04-15 16:34:36.000000000
+1000
--------------------------
File to patch: include/linux/sched.h
patching file include/linux/sched.h
patch: **** malformed patch at line 10: #define
PF_SWAPWRITE 0x01000000 /* Allowed to write to swap */
doesn't compile if i add the patch by hand... tried on 2.6.17-rc1-ck1 and on
2.6.16-ck3.
----------------
In file included from include/linux/mm.h:4,
from kernel/sched.c:24:
include/linux/sched.h:975:18: warning: missing whitespace after the macro name
kernel/sched.c: In function ‘recalc_task_prio’:
kernel/sched.c:820: error: stray ‘\194’ in program
-----------
i'm no coder at all, and have no idea what's going on.
grtz
Jos
ps 2.6.17-rc2 won't be long i guess, maybe you can just roll this up in -ck...
--
You will have good luck and overcome many hardships.
* Re: [ck] Re: [patch][rfc] quell interactive feeding frenzy
2006-04-15 22:32 ` jos poortvliet
@ 2006-04-15 23:06 ` Con Kolivas
0 siblings, 0 replies; 27+ messages in thread
From: Con Kolivas @ 2006-04-15 23:06 UTC (permalink / raw)
To: jos poortvliet; +Cc: ck, Al Boldi, Mike Galbraith, linux-kernel
On Sunday 16 April 2006 08:32, jos poortvliet wrote:
> Op zaterdag 15 april 2006 09:05, schreef Con Kolivas:
> > Thanks for bringing this to my attention. A while back I had different
> > management of forked tasks and merged it with PF_NONSLEEP. Since then
> > I've changed the management of NONSLEEP tasks and didn't realise it had
> > adversely affected the accounting of forking tasks. This patch should
> > rectify it.
> >
> > Thanks!
>
> hey con, i get this:
>
> can't find file to patch at input line 9
> Perhaps you used the wrong -p or --strip option?
> The text leading up to this was:
That's because it's not an attachment but inserted into the mail and your
mailer is mangling it on extraction. Save the whole email unmodified (eg with
a "save as" function) and use that as the patch. Don't fret as it's a
non-critical fix since it's a corner case only and it will be in the next -ck
anyway.
--
-ck
* Re: [patch][rfc] quell interactive feeding frenzy
2006-04-15 20:45 ` Al Boldi
@ 2006-04-15 23:22 ` Con Kolivas
2006-04-16 18:44 ` [ck] " Andreas Mohr
0 siblings, 1 reply; 27+ messages in thread
From: Con Kolivas @ 2006-04-15 23:22 UTC (permalink / raw)
To: Al Boldi; +Cc: ck list, linux-kernel, Mike Galbraith
On Sunday 16 April 2006 06:45, Al Boldi wrote:
> Con Kolivas wrote:
> > Thanks for bringing this to my attention. A while back I had different
> > management of forked tasks and merged it with PF_NONSLEEP. Since then
> > I've changed the management of NONSLEEP tasks and didn't realise it had
> > adversely affected the accounting of forking tasks. This patch should
> > rectify it.
>
> Congrats!
>
> Much smoother, but I still get this choke w/ 2 eatm 9999 loops running:
> 9 MB 783 KB eaten in 130 msec (74 MB/s)
> 9 MB 783 KB eaten in 2416 msec (3 MB/s) <<<<<<<<<<<<<
> 9 MB 783 KB eaten in 197 msec (48 MB/s)
> You may have to adjust the kb to get the same effect.
I've seen it. It's an artefact of timekeeping that it takes an accumulation of
data to get all the information. Not much I can do about it except to have
timeslices so small that they thrash the crap out of cpu caches and
completely destroy throughput.
The current value, 6ms at 1000HZ, is chosen because it's the largest value
that can schedule a task in less than normal human perceptible range when two
competing heavily cpu bound tasks are the same priority. At 250HZ it works
out to 7.5ms and 10ms at 100HZ. Ironically in my experimenting I found the
cpu cache improvements become much less significant above 7ms so I'm very
happy with this compromise.
Thanks!
--
-ck
* Re: [patch][rfc] quell interactive feeding frenzy
2006-04-12 15:25 ` Al Boldi
2006-04-13 11:51 ` Con Kolivas
@ 2006-04-16 6:02 ` Con Kolivas
2006-04-16 8:31 ` Al Boldi
1 sibling, 1 reply; 27+ messages in thread
From: Con Kolivas @ 2006-04-16 6:02 UTC (permalink / raw)
To: Al Boldi; +Cc: ck list, linux-kernel, Mike Galbraith
On Thursday 13 April 2006 01:25, Al Boldi wrote:
> Con Kolivas wrote:
> > mean 68.7 seconds
> >
> > range 63-73 seconds.
>
> Could this 10s skew be improved to around 1s to aid smoothness?
It turns out to be dependent on accounting of system time which only staircase
does at the moment btw. Currently it's done on a jiffy basis. To increase the
accuracy of this would incur incredible cost which I don't consider worth it.
--
-ck
* Re: [patch][rfc] quell interactive feeding frenzy
2006-04-16 6:02 ` Con Kolivas
@ 2006-04-16 8:31 ` Al Boldi
2006-04-16 8:58 ` Con Kolivas
2006-04-16 10:37 ` was " Con Kolivas
0 siblings, 2 replies; 27+ messages in thread
From: Al Boldi @ 2006-04-16 8:31 UTC (permalink / raw)
To: Con Kolivas; +Cc: ck list, linux-kernel, Mike Galbraith
Con Kolivas wrote:
> On Thursday 13 April 2006 01:25, Al Boldi wrote:
> > Con Kolivas wrote:
> > > mean 68.7 seconds
> > >
> > > range 63-73 seconds.
> >
> > Could this 10s skew be improved to around 1s to aid smoothness?
>
> It turns out to be dependent on accounting of system time which only
> staircase does at the moment btw. Currently it's done on a jiffy basis. To
> increase the accuracy of this would incur incredible cost which I don't
> consider worth it.
Is this also related to that?
> > Much smoother, but I still get this choke w/ 2 eatm 9999 loops running:
> >
> > 9 MB 783 KB eaten in 130 msec (74 MB/s)
> > 9 MB 783 KB eaten in 2416 msec (3 MB/s) <<<<<<<<<<<<<
> > 9 MB 783 KB eaten in 197 msec (48 MB/s)
> >
> > You may have to adjust the kb to get the same effect.
>
> I've seen it. It's an artefact of timekeeping that it takes an
> accumulation of data to get all the information. Not much I can do about
> it except to have timeslices so small that they thrash the crap out of cpu
> caches and completely destroy throughput.
So why is this not visible in other schedulers?
Are you sure this is not a priority boost problem?
> The current value, 6ms at 1000HZ, is chosen because it's the largest value
> that can schedule a task in less than normal human perceptible range when
> two competing heavily cpu bound tasks are the same priority. At 250HZ it
> works out to 7.5ms and 10ms at 100HZ. Ironically in my experimenting I
> found the cpu cache improvements become much less significant above 7ms so
> I'm very happy with this compromise.
Would you think this is dependent on cache-size and cpu-speed?
Also, what's this iso_cpu thing?
> Thanks!
Thank you!
--
Al
* Re: [patch][rfc] quell interactive feeding frenzy
2006-04-16 8:31 ` Al Boldi
@ 2006-04-16 8:58 ` Con Kolivas
2006-04-16 10:37 ` was " Con Kolivas
1 sibling, 0 replies; 27+ messages in thread
From: Con Kolivas @ 2006-04-16 8:58 UTC (permalink / raw)
To: Al Boldi; +Cc: ck list, linux-kernel, Mike Galbraith
On Sunday 16 April 2006 18:31, Al Boldi wrote:
> Con Kolivas wrote:
> > On Thursday 13 April 2006 01:25, Al Boldi wrote:
> > > Con Kolivas wrote:
> > > > mean 68.7 seconds
> > > >
> > > > range 63-73 seconds.
> > >
> > > Could this 10s skew be improved to around 1s to aid smoothness?
> >
> > It turns out to be dependent on accounting of system time which only
> > staircase does at the moment btw. Currently it's done on a jiffy basis.
> > To increase the accuracy of this would incur incredible cost which I
> > don't consider worth it.
>
> Is this also related to that?
No.
> > > Much smoother, but I still get this choke w/ 2 eatm 9999 loops running:
> > >
> > > 9 MB 783 KB eaten in 130 msec (74 MB/s)
> > > 9 MB 783 KB eaten in 2416 msec (3 MB/s) <<<<<<<<<<<<<
> > > 9 MB 783 KB eaten in 197 msec (48 MB/s)
> > >
> > > You may have to adjust the kb to get the same effect.
> >
> > I've seen it. It's an artefact of timekeeping that it takes an
> > accumulation of data to get all the information. Not much I can do about
> > it except to have timeslices so small that they thrash the crap out of
> > cpu caches and completely destroy throughput.
>
> So why is this not visible in other schedulers?
When I said there's not much I can do about it I mean with respect to the
design.
> Are you sure this is not a priority boost problem?
Indeed it is related to the way cpu is proportioned out in staircase as
both priority and slice. Problem? The magnitude of said problem is up to the
observer to decide. It's a phenomenon of exactly two infinitely repeating,
concurrent, rapidly forking workloads where one forks more often than every
100ms and the other less often; i.e. your test case. I'm sure there's a real
world workload somewhere somehow that exhibits this, but it's important to
remember that overall it's fair, with the occasional blip.
> > The current value, 6ms at 1000HZ, is chosen because it's the largest
> > value that can schedule a task in less than normal human perceptible
> > range when two competing heavily cpu bound tasks are the same priority.
> > At 250HZ it works out to 7.5ms and 10ms at 100HZ. Ironically in my
> > experimenting I found the cpu cache improvements become much less
> > significant above 7ms so I'm very happy with this compromise.
>
> Would you think this is dependent on cache-size and cpu-speed?
It is. Cache warmth time varies with architecture and design. Of course you're
going to tell me to add a tunable and/or autotune this. Then that undoes
limiting it to the human perception range. It really does cost us to export
these things which are otherwise compile time constants... sigh.
> Also, what's this iso_cpu thing?
SCHED_ISO cpu usage which you're not using.
--
-ck
* was Re: quell interactive feeding frenzy
2006-04-16 8:31 ` Al Boldi
2006-04-16 8:58 ` Con Kolivas
@ 2006-04-16 10:37 ` Con Kolivas
2006-04-16 19:03 ` Al Boldi
1 sibling, 1 reply; 27+ messages in thread
From: Con Kolivas @ 2006-04-16 10:37 UTC (permalink / raw)
To: Al Boldi; +Cc: ck list, linux-kernel
Al, since you have an unhealthy interest in cpu schedulers you may also want
to look at my ultimate fairness with mild interactivity builtin cpu scheduler
I hacked on briefly. I was bored for a couple of days and came up with the
design and hacked it together. I never got around to finishing it to live up
fully to its design intent, but it's working embarrassingly well at the
moment. It makes no effort to optimise for interactivity in any way. Maybe if
I ever find some spare time I'll give it more polish and port it to plugsched.
Ignore the lovely name I give it; the patch is for 2.6.16. It's a dual
priority array rr scheduler that iterates over all priorities. This is as
opposed to staircase which is a single priority array scheduler where the
tasks themselves iterate over all priorities.
http://ck.kolivas.org/patches/crap/sched-crap-1.patch
--
-ck
* Re: [ck] Re: [patch][rfc] quell interactive feeding frenzy
2006-04-15 23:22 ` Con Kolivas
@ 2006-04-16 18:44 ` Andreas Mohr
2006-04-17 0:08 ` Con Kolivas
0 siblings, 1 reply; 27+ messages in thread
From: Andreas Mohr @ 2006-04-16 18:44 UTC (permalink / raw)
To: Con Kolivas; +Cc: Al Boldi, ck list, Mike Galbraith, linux-kernel
Hi,
On Sun, Apr 16, 2006 at 09:22:59AM +1000, Con Kolivas wrote:
> The current value, 6ms at 1000HZ, is chosen because it's the largest value
> that can schedule a task in less than normal human perceptible range when two
> competing heavily cpu bound tasks are the same priority. At 250HZ it works
> out to 7.5ms and 10ms at 100HZ. Ironically in my experimenting I found the
> cpu cache improvements become much less significant above 7ms so I'm very
> happy with this compromise.
Heh, this part is *EXACTLY* a fully sufficient explanation of what I was
wondering about myself just these days ;)
(I'm experimenting with different timeslice values on my P3/450 to verify
what performance impact exactly it has)
However with a measly 256kB cache it probably doesn't matter too much,
I think.
But I think it's still important to mention that your perception might be
twisted by your P4 limitation (no testing with slower and really slow
machines).
Andreas
* Re: was Re: quell interactive feeding frenzy
2006-04-16 10:37 ` was " Con Kolivas
@ 2006-04-16 19:03 ` Al Boldi
2006-04-16 23:26 ` Con Kolivas
0 siblings, 1 reply; 27+ messages in thread
From: Al Boldi @ 2006-04-16 19:03 UTC (permalink / raw)
To: Con Kolivas; +Cc: ck list, linux-kernel
Con Kolivas wrote:
> Al Since you have an unhealthy interest in cpu schedulers you may also
> want to look at my ultimate fairness with mild interactivity builtin cpu
> scheduler I hacked on briefly. I was bored for a couple of days and came
> up with the design and hacked it together. I never got around to finishing
> it to live up fully to its design intent but it's working embarrassingly
> well at the moment. It makes no effort to optimise for interactivity in
> any way. Maybe if I ever find some spare time I'll give it more polish
> and port it to plugsched. Ignore the lovely name I give it; the patch is
> for 2.6.16. It's a dual priority array rr scheduler that iterates over all
> priorities. This is as opposed to staircase which is a single priority
> array scheduler where the tasks themselves iterate over all priorities.
It's not bad, but it seems to allow cpu-hogs to steal left-over timeslices,
which increases unfairness as the proc load increases. Conditionalizing
prio-boosting based on hogginess may be one way to compensate for this. This
would involve resisting any prio-change unless hogged, which should be
scaled by hogginess, something like SleepAVG but much simpler and less
fluctuating.
Really, the key to a successful scheduler would be to build it step by step
by way of abstraction, modularization, and extension. Starting w/ a
noop/RR-scheduler, each step would need to be analyzed for stability and
efficiency, before moving to the next step, thus exposing problems as you
move from step to step.
Thanks!
--
Al
* Re: was Re: quell interactive feeding frenzy
2006-04-16 19:03 ` Al Boldi
@ 2006-04-16 23:26 ` Con Kolivas
0 siblings, 0 replies; 27+ messages in thread
From: Con Kolivas @ 2006-04-16 23:26 UTC (permalink / raw)
To: Al Boldi; +Cc: ck list, linux-kernel
On Monday 17 April 2006 05:03, Al Boldi wrote:
> It's not bad, but it seems to allow cpu-hogs to steal left-over timeslices,
> which increases unfairness as the proc load increases.
Spot on.
> Conditionalizing
> prio-boosting based on hogginess maybe one way to compensate for this.
> This would involve resisting any prio-change unless hogged, which should be
> scaled by hogginess, something like SleepAVG but much simpler and less
> fluctuating.
Not interested in hacking on something like that onto it. It was more of an
experiment in the simplest possible starvation free design that still
supported nice levels.
> Really, the key to a successful scheduler would be to build it step by step
> by way of abstraction, modularization, and extension. Starting w/ a
> noop/RR-scheduler, each step would need to be analyzed for stability and
> efficiency, before moving to the next step, thus exposing problems as you
> move from step to step.
While this may be the key, it is not the reason we aren't getting maximum
roundness in our designs in linux. Our major enemy is cpu accounting of work
done in kernel context on behalf of everyone else. Putting architecture
dependent hooks into the assembly code to account for entry and exit would be
the accurate way of doing this.
--
-ck
* Re: [ck] Re: [patch][rfc] quell interactive feeding frenzy
2006-04-16 18:44 ` [ck] " Andreas Mohr
@ 2006-04-17 0:08 ` Con Kolivas
2006-04-19 8:37 ` Andreas Mohr
0 siblings, 1 reply; 27+ messages in thread
From: Con Kolivas @ 2006-04-17 0:08 UTC (permalink / raw)
To: Andreas Mohr; +Cc: Al Boldi, ck list, Mike Galbraith, linux-kernel
On Monday 17 April 2006 04:44, Andreas Mohr wrote:
> Hi,
>
> On Sun, Apr 16, 2006 at 09:22:59AM +1000, Con Kolivas wrote:
> > The current value, 6ms at 1000HZ, is chosen because it's the largest
> > value that can schedule a task in less than normal human perceptible
> > range when two competing heavily cpu bound tasks are the same priority.
> > At 250HZ it works out to 7.5ms and 10ms at 100HZ. Ironically in my
> > experimenting I found the cpu cache improvements become much less
> > significant above 7ms so I'm very happy with this compromise.
>
> Heh, this part is *EXACTLY* a fully sufficient explanation of what I was
> wondering about myself just these days ;)
> (I'm experimenting with different timeslice values on my P3/450 to verify
> what performance impact exactly it has)
> However with a measly 256kB cache it probably doesn't matter too much,
> I think.
>
> But I think it's still important to mention that your perception might be
> twisted by your P4 limitation (no testing with slower and really slow
> machines).
You underestimate me. Those cpu cache effects were performance effects
measured down to a PII 233, but all were i386 architecture. As for
"perception" this isn't my testing I'm talking about; these are
neuropsychiatric tests that have nothing to do with pcs or what processor you
use ;)
--
-ck
* Re: [ck] Re: [patch][rfc] quell interactive feeding frenzy
2006-04-17 0:08 ` Con Kolivas
@ 2006-04-19 8:37 ` Andreas Mohr
2006-04-19 8:59 ` jos poortvliet
0 siblings, 1 reply; 27+ messages in thread
From: Andreas Mohr @ 2006-04-19 8:37 UTC (permalink / raw)
To: Con Kolivas; +Cc: Al Boldi, ck list, Mike Galbraith, linux-kernel
Hi,
On Mon, Apr 17, 2006 at 10:08:08AM +1000, Con Kolivas wrote:
> On Monday 17 April 2006 04:44, Andreas Mohr wrote:
> > Hi,
> >
> > On Sun, Apr 16, 2006 at 09:22:59AM +1000, Con Kolivas wrote:
> > > The current value, 6ms at 1000HZ, is chosen because it's the largest
> > > value that can schedule a task in less than normal human perceptible
> > > range when two competing heavily cpu bound tasks are the same priority.
> > > At 250HZ it works out to 7.5ms and 10ms at 100HZ. Ironically in my
> > > experimenting I found the cpu cache improvements become much less
> > > significant above 7ms so I'm very happy with this compromise.
> >
> > Heh, this part is *EXACTLY* a fully sufficient explanation of what I was
> > wondering about myself just these days ;)
> > (I'm experimenting with different timeslice values on my P3/450 to verify
> > what performance impact exactly it has)
> > However with a measly 256kB cache it probably doesn't matter too much,
> > I think.
> >
> > But I think it's still important to mention that your perception might be
> > twisted by your P4 limitation (no testing with slower and really slow
> > machines).
>
> You underestimate me. Those cpu cache effects were performance effects
> measured down to a PII 233, but all were i386 architecture. As for
> "perception" this isn't my testing I'm talking about; these are
> neuropsychiatric tests that have nothing to do with pcs or what processor you
> use ;)
OK, but I was not worrying about the interactivity aspects, rather the
performance aspects (GUI updates of KDE 3.5.2 on P3/450/256MB on Ubuntu are
about as slow as medium-hot lava). While of course it's mostly KDE (and
probably also the S3 Savage driver/card) which is to blame here,
I'm trying to first do as much as possible at kernel level before eventually
going higher up the chain...
Andreas
* Re: [ck] Re: [patch][rfc] quell interactive feeding frenzy
2006-04-19 8:37 ` Andreas Mohr
@ 2006-04-19 8:59 ` jos poortvliet
0 siblings, 0 replies; 27+ messages in thread
From: jos poortvliet @ 2006-04-19 8:59 UTC (permalink / raw)
To: ck; +Cc: Andreas Mohr, Con Kolivas, Al Boldi, linux-kernel, Mike Galbraith
Op woensdag 19 april 2006 10:37, schreef Andreas Mohr:
> Hi,
>
> On Mon, Apr 17, 2006 at 10:08:08AM +1000, Con Kolivas wrote:
> > On Monday 17 April 2006 04:44, Andreas Mohr wrote:
> > > Hi,
> > >
> > > On Sun, Apr 16, 2006 at 09:22:59AM +1000, Con Kolivas wrote:
> > > > The current value, 6ms at 1000HZ, is chosen because it's the largest
> > > > value that can schedule a task in less than normal human perceptible
> > > > range when two competing heavily cpu bound tasks are the same
> > > > priority. At 250HZ it works out to 7.5ms and 10ms at 100HZ.
> > > > Ironically in my experimenting I found the cpu cache improvements
> > > > become much less significant above 7ms so I'm very happy with this
> > > > compromise.
> > >
> > > Heh, this part is *EXACTLY* a fully sufficient explanation of what I
> > > was wondering about myself just these days ;)
> > > (I'm experimenting with different timeslice values on my P3/450 to
> > > verify what performance impact exactly it has)
> > > However with a measly 256kB cache it probably doesn't matter too much,
> > > I think.
> > >
> > > But I think it's still important to mention that your perception might
> > > be twisted by your P4 limitation (no testing with slower and really
> > > slow machines).
> >
> > You underestimate me. Those cpu cache effects were performance effects
> > measured down to a PII 233, but all were i386 architecture. As for
> > "perception" this isn't my testing I'm talking about; these are
> > neuropsychiatric tests that have nothing to do with pcs or what processor
> > you use ;)
>
> OK, but I was not worrying about the interactivity aspects, rather the
> performance aspects (GUI updates of KDE 3.5.2 on P3/450/256MB on Ubuntu are
> about as slow as medium-hot lava). While of course it's mostly KDE (and
> probably also the S3 Savage driver/card) which is to blame here,
> I'm trying to first do as much as possible at kernel level before
> eventually going higher up the chain...
if it's slow redrawing, i'd blame the video driver or X, not KDE/Qt - though
you might want to try another style and window decoration. plastik is
generally speaking quite fast, or try the 'light' style. Xorg xcomposite
might help too, but i guess your video card won't like that :D
slow app startup will be better in Kubuntu Dapper+1, as they will finally
incorporate the new fontconfig (*g* i hope so) and use GCC's C++ symbol
visibility in KDE/Qt. those both can give speedups in the area of 10-20%,
so that should be noticeable. KDE 3.5.3 will also have a few other
patches to speed up the KDE startup process, so - faster login, too.
now with all this, it's hard to imagine KDE 4 will again sport some 20/30%
speedup due to reduced memory usage/binary size ;-)
> Andreas
end of thread, other threads:[~2006-04-19 8:59 UTC | newest]
Thread overview: 27+ messages
[not found] <200604112100.28725.kernel@kolivas.org>
2006-04-11 17:03 ` Fwd: Re: [patch][rfc] quell interactive feeding frenzy Al Boldi
2006-04-11 22:56 ` Con Kolivas
2006-04-12 5:41 ` Al Boldi
2006-04-12 6:22 ` Con Kolivas
2006-04-12 8:17 ` Al Boldi
2006-04-12 9:36 ` Con Kolivas
2006-04-12 10:39 ` Al Boldi
2006-04-12 11:27 ` Con Kolivas
2006-04-12 15:25 ` Al Boldi
2006-04-13 11:51 ` Con Kolivas
2006-04-14 3:16 ` Al Boldi
2006-04-15 7:05 ` Con Kolivas
2006-04-15 18:23 ` [ck] " Michael Gerdau
2006-04-15 20:45 ` Al Boldi
2006-04-15 23:22 ` Con Kolivas
2006-04-16 18:44 ` [ck] " Andreas Mohr
2006-04-17 0:08 ` Con Kolivas
2006-04-19 8:37 ` Andreas Mohr
2006-04-19 8:59 ` jos poortvliet
2006-04-15 22:32 ` jos poortvliet
2006-04-15 23:06 ` Con Kolivas
2006-04-16 6:02 ` Con Kolivas
2006-04-16 8:31 ` Al Boldi
2006-04-16 8:58 ` Con Kolivas
2006-04-16 10:37 ` was " Con Kolivas
2006-04-16 19:03 ` Al Boldi
2006-04-16 23:26 ` Con Kolivas