public inbox for linux-kernel@vger.kernel.org
* 2.5.47 scheduler problems?
@ 2002-11-18  6:20 Mike Galbraith
  2002-11-18  6:51 ` Tim Connors
  0 siblings, 1 reply; 11+ messages in thread
From: Mike Galbraith @ 2002-11-18  6:20 UTC (permalink / raw)
  To: Linux Kernel Mailing List

Greetings,

For testing swap throughput, I like to run make -j30 bzImage on my 500Mhz 
PIII w. 128Mb ram.  For testing interactivity, I fire up KDE, start a 
smaller make -j, grab a window, and wave it around.

With 2.4.20rc2+rc1aa1, running a -j10 build (not swapping) is very very 
bad.  However, if I set all tasks in the system to SCHED_FIFO or SCHED_RR 
prior to this light make -j, I have a ~pretty smooth system.

If I do the same in 2.5.47, I have no control of my box.  Setting all tasks 
to SCHED_FIFO or SCHED_RR prior to starting make -j10 bzImage, I can regain 
control, but interactivity under load is basically not present.

I used to be able to wave a window poorly at make -j25 (swapping heftily), 
fairly smoothly at make -j20, and smoothly at make -j15 or below.  This 
with no SCHED_RR/SCHED_FIFO.  (I haven't done much testing like this in 
quite a while though)
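
The "set all tasks to SCHED_FIFO or SCHED_RR" step can be reproduced roughly as below. This is a minimal sketch, not the procedure actually used in the report: it assumes a Linux /proc layout, root privileges, and Python's os.sched_setscheduler wrapper, and the helper names numeric_pids, set_policy_for_all and main are made up for illustration.

```python
import os


def numeric_pids(names):
    """Keep only the /proc entries that name a process (all digits)."""
    return sorted(int(n) for n in names if n.isdigit())


def set_policy_for_all(pids, apply):
    """Call apply(pid) for every PID; return the PIDs that could not be changed."""
    failed = []
    for pid in pids:
        try:
            apply(pid)
        except OSError:  # covers PermissionError and ProcessLookupError
            failed.append(pid)
    return failed


def main():
    # Assumption: use the lowest real-time priority the kernel allows for SCHED_RR.
    param = os.sched_param(os.sched_get_priority_min(os.SCHED_RR))
    pids = numeric_pids(os.listdir("/proc"))
    failed = set_policy_for_all(
        pids, lambda pid: os.sched_setscheduler(pid, os.SCHED_RR, param)
    )
    print(f"switched {len(pids) - len(failed)} of {len(pids)} tasks to SCHED_RR")
```

main() is deliberately not called here; running it as root changes the policy of every process on the machine, so it only makes sense on a disposable test box.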

	-Mike



* Re: 2.5.47 scheduler problems?
  2002-11-18  6:20 2.5.47 scheduler problems? Mike Galbraith
@ 2002-11-18  6:51 ` Tim Connors
  2002-11-18  7:08   ` Andrew Morton
  2002-11-18  7:29   ` Mike Galbraith
  0 siblings, 2 replies; 11+ messages in thread
From: Tim Connors @ 2002-11-18  6:51 UTC (permalink / raw)
  To: linux-kernel, Mike Galbraith

In linux.kernel, you wrote:
> Greetings,
> 
> For testing swap throughput, I like to run make -j30 bzImage on my 500Mhz 
> PIII w. 128Mb ram.  For testing interactivity, I fire up KDE, start a 
> smaller make -j, grab a window, and wave it around.
> 
> With 2.4.20rc2+rc1aa1, running a -j10 build (not swapping) is very very 
> bad.  However, if I set all tasks in the system to SCHED_FIFO or SCHED_RR 
> prior to this light make -j, I have a ~pretty smooth system.
> 
> If I do the same in 2.5.47, I have no control of my box.  Setting all tasks 
> to SCHED_FIFO or SCHED_RR prior to starting make -j10 bzImage, I can regain 
> control, but interactivity under load is basically not present.

Funny that.

> I used to be able to wave a window poorly at make -j25 (swapping heftily), 
> fairly smoothly at make -j20, and smoothly at make -j15 or below.  This 
> with no SCHED_RR/SCHED_FIFO.  (I haven't done much testing like this in 
> quite a while though)

Perhaps you should consider buying an extra 29 CPUs for your desktop?

-- 
TimC -- http://astronomy.swin.edu.au/staff/tconnors/

A Chemist who falls in acid is absorbed in work.



* Re: 2.5.47 scheduler problems?
  2002-11-18  6:51 ` Tim Connors
@ 2002-11-18  7:08   ` Andrew Morton
  2002-11-18  7:35     ` Mike Galbraith
  2002-11-18  7:29   ` Mike Galbraith
  1 sibling, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2002-11-18  7:08 UTC (permalink / raw)
  To: Tim Connors; +Cc: linux-kernel, Mike Galbraith

Tim Connors wrote:
> 
> > I used to be able to wave a window poorly at make -j25 (swapping heftily),
> > fairly smoothly at make -j20, and smoothly at make -j15 or below.  This
> > with no SCHED_RR/SCHED_FIFO.  (I haven't done much testing like this in
> > quite a while though)
> 
> Perhaps you should consider buying an extra 29 CPUs for your desktop?
> 

No.  He's saying that it used to be OK, but it has got worse.

A much simpler test is to start a big compilation and then madly
waggle an X window around.  Goes OK for a few seconds, and then
seizes up quite horridly.  Presumably because the scheduler has
suddenly decided that the X server has become a "batch" process
and is scheduling it in a similar manner to the compilation.

If you stop wiggling the window for 5-10 seconds it comes back.
Presumably because the scheduler has decided that the X server is
"interactive" again.
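
The behaviour described above (fine for a few seconds, then a stall, then recovery after a rest) is what a sleep-credit heuristic would produce. The toy model below is only a sketch of that idea under stated assumptions: the 2-second credit cap and the 50% threshold are illustrative values, not constants taken from the 2.5.47 source, and it does not try to reproduce the 5-10 second recovery time.

```python
def simulate(phases, max_credit=2.0, threshold=0.5):
    """Track a task's sleep credit across ("run" | "sleep", seconds) phases.

    Running burns credit, sleeping earns it back, and the task counts as
    "interactive" while its credit is at least threshold * max_credit.
    Returns a list of (kind, seconds, interactive_after_phase).
    """
    credit = max_credit  # start out fully interactive
    history = []
    for kind, seconds in phases:
        if kind == "run":
            credit = max(0.0, credit - seconds)
        elif kind == "sleep":
            credit = min(max_credit, credit + seconds)
        else:
            raise ValueError(f"unknown phase kind: {kind}")
        history.append((kind, seconds, credit >= threshold * max_credit))
    return history


# An X server kept busy by window dragging, then left alone for a while.
for kind, seconds, interactive in simulate(
    [("run", 0.5), ("run", 1.0), ("sleep", 0.25), ("sleep", 5.0)]
):
    print(f"{kind:5s} {seconds:4.2f}s -> {'interactive' if interactive else 'batch'}")
```

In this model a short pause is not enough to regain interactive status once the credit has been spent; only a longer idle period restores it.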

When it happens, it's *very* bad.  The mouse cursor doesn't move
for 0.5-1.0 seconds and then takes great leaps.  It is unusable.

Strangely it does not happen (much) when the background load is
a few busywaits.  It has to be a compilation - maybe short-lived
batch processes is what triggers it.

For me, the X server is sometimes the victim, and the MUA (netscape4)
is frequently victimised.  This is because the MUA alternates between
periods of interactivity and periods of compute-intensive work (parsing
large mailboxes).   When this problem strikes you have to just sit there
with your arms folded waiting for it to stop.

It needs fixing.


* Re: 2.5.47 scheduler problems?
  2002-11-18  6:51 ` Tim Connors
  2002-11-18  7:08   ` Andrew Morton
@ 2002-11-18  7:29   ` Mike Galbraith
  2002-11-18  7:53     ` Tim Connors
  1 sibling, 1 reply; 11+ messages in thread
From: Mike Galbraith @ 2002-11-18  7:29 UTC (permalink / raw)
  To: Tim Connors, linux-kernel


----- Original Message -----
From: "Tim Connors" <tconnors@astro.swin.edu.au>
To: <linux-kernel@vger.kernel.org>; "Mike Galbraith" <efault@gmx.de>
Sent: Monday, November 18, 2002 7:51 AM
Subject: Re: 2.5.47 scheduler problems?


> In linux.kernel, you wrote:
> > Greetings,
> >
> > For testing swap throughput, I like to run make -j30 bzImage on my 500Mhz
> > PIII w. 128Mb ram.  For testing interactivity, I fire up KDE, start a
> > smaller make -j, grab a window, and wave it around.
> >
> > With 2.4.20rc2+rc1aa1, running a -j10 build (not swapping) is very very
> > bad.  However, if I set all tasks in the system to SCHED_FIFO or SCHED_RR
> > prior to this light make -j, I have a ~pretty smooth system.
> >
> > If I do the same in 2.5.47, I have no control of my box.  Setting all tasks
> > to SCHED_FIFO or SCHED_RR prior to starting make -j10 bzImage, I can regain
> > control, but interactivity under load is basically not present.
>
> Funny that.
>
> > I used to be able to wave a window poorly at make -j25 (swapping heftily),
> > fairly smoothly at make -j20, and smoothly at make -j15 or below.  This
> > with no SCHED_RR/SCHED_FIFO.  (I haven't done much testing like this in
> > quite a while though)
>
> Perhaps you should consider buying an extra 29 CPUs for your desktop?

I have neither the need for 30 CPUs, nor the cash to pay for such a
beast :)

I gather you think my test is silly?

    -Mike



* Re: 2.5.47 scheduler problems?
  2002-11-18  7:08   ` Andrew Morton
@ 2002-11-18  7:35     ` Mike Galbraith
  0 siblings, 0 replies; 11+ messages in thread
From: Mike Galbraith @ 2002-11-18  7:35 UTC (permalink / raw)
  To: Andrew Morton, Tim Connors; +Cc: linux-kernel


----- Original Message -----
From: "Andrew Morton" <akpm@digeo.com>
To: "Tim Connors" <tconnors@astro.swin.edu.au>
Cc: <linux-kernel@vger.kernel.org>; "Mike Galbraith" <efault@gmx.de>
Sent: Monday, November 18, 2002 8:08 AM
Subject: Re: 2.5.47 scheduler problems?


> Tim Connors wrote:
> >
> > > I used to be able to wave a window poorly at make -j25 (swapping heftily),
> > > fairly smoothly at make -j20, and smoothly at make -j15 or below.  This
> > > with no SCHED_RR/SCHED_FIFO.  (I haven't done much testing like this in
> > > quite a while though)
> >
> > Perhaps you should consider buying an extra 29 CPUs for your desktop?
> >
>
> No.  He's saying that it used to be OK, but it has got worse.
>
> A much simpler test is to start a big compilation and then madly
> waggle an X window around.  Goes OK for a few seconds, and then
> seizes up quite horridly.  Presumably because the scheduler has
> suddenly decided that the X server has become a "batch" process
> and is scheduling it in a similar manner to the compilation.
>
> If you stop wiggling the window for 5-10 seconds it comes back.
> Presumably because the scheduler has decided that the X server is
> "interactive" again.
>
> When it happens, it's *very* bad.  The mouse cursor doesn't move
> for 0.5-1.0 seconds and then takes great leaps.  It is unusable.

I was watching it this morning, without wiggling, and it seems to update
window content (make output in one and vmstat in another) about every 5
seconds.. very odd looking.

    -Mike



* Re: 2.5.47 scheduler problems?
  2002-11-18  7:29   ` Mike Galbraith
@ 2002-11-18  7:53     ` Tim Connors
  2002-11-18 10:52       ` Mike Galbraith
  0 siblings, 1 reply; 11+ messages in thread
From: Tim Connors @ 2002-11-18  7:53 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: linux-kernel

On Mon, 18 Nov 2002, Mike Galbraith wrote:

> > > If I do the same in 2.5.47, I have no control of my box.  Setting all tasks
> > > to SCHED_FIFO or SCHED_RR prior to starting make -j10 bzImage, I can regain
> > > control, but interactivity under load is basically not present.
> >
> > Funny that.
> >
> > > I used to be able to wave a window poorly at make -j25 (swapping heftily),
> > > fairly smoothly at make -j20, and smoothly at make -j15 or below.  This
> > > with no SCHED_RR/SCHED_FIFO.  (I haven't done much testing like this in
> > > quite a while though)
> >
> > Perhaps you should consider buying an extra 29 CPUs for your desktop?
>
> I have neither the need for 30 CPUs, nor the cash to pay for such a
> beast :)
>
> I gather you think my test is silly?

Well, yes, 30 processes at a time on a single CPU does seem a bit silly -
given that (under the old system), you would not expect X to get more than
3% of the CPU time.
Also, scheduling normal processes (i.e., not real-time processes) as RR/FIFO
seemed pretty bad.
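
The 3% figure is the equal-share arithmetic for one X server competing with 30 always-runnable compiler jobs on a single CPU. A small sketch, assuming every runnable task receives an identical share (a simplification that no real scheduler guarantees exactly); the function name is made up for illustration:

```python
def equal_share_percent(competing_tasks):
    """CPU share, in percent, of one task among `competing_tasks` equal rivals."""
    return 100.0 / (competing_tasks + 1)


for jobs in (10, 20, 30):
    print(f"make -j{jobs}: X gets about {equal_share_percent(jobs):.1f}% of the CPU")
```

Under the same assumption, the lighter -j10 build discussed earlier in the thread would still leave X with only about a tenth of the CPU.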

However....

But I have to now admit that I haven't yet played with 2.5.47 seriously,
and wasn't aware of the problems which Andrew just posted.

mea culpa.


-- 
TimC -- http://astronomy.swin.edu.au/staff/tconnors/

If you ever fear that machines will surpass humans in intelligence,
just ask Microsoft to write the OS.     -- POTU in RHOD



* Re: 2.5.47 scheduler problems?
  2002-11-18  7:53     ` Tim Connors
@ 2002-11-18 10:52       ` Mike Galbraith
  0 siblings, 0 replies; 11+ messages in thread
From: Mike Galbraith @ 2002-11-18 10:52 UTC (permalink / raw)
  To: Tim Connors; +Cc: linux-kernel


----- Original Message -----
From: "Tim Connors" <tconnors@astro.swin.edu.au>
To: "Mike Galbraith" <EFAULT@gmx.de>
Cc: <linux-kernel@vger.kernel.org>
Sent: Monday, November 18, 2002 8:53 AM
Subject: Re: 2.5.47 scheduler problems?


> On Mon, 18 Nov 2002, Mike Galbraith wrote:
>
> > > > If I do the same in 2.5.47, I have no control of my box.  Setting all tasks
> > > > to SCHED_FIFO or SCHED_RR prior to starting make -j10 bzImage, I can regain
> > > > control, but interactivity under load is basically not present.
> > >
> > > Funny that.
> > >
> > > > I used to be able to wave a window poorly at make -j25 (swapping heftily),
> > > > fairly smoothly at make -j20, and smoothly at make -j15 or below.  This
> > > > with no SCHED_RR/SCHED_FIFO.  (I haven't done much testing like this in
> > > > quite a while though)
> > >
> > > Perhaps you should consider buying an extra 29 CPUs for your desktop?
> >
> > I have neither the need for 30 CPUs, nor the cash to pay for such a
> > beast :)
> >
> > I gather you think my test is silly?
>
> Well, yes, 30 processes at a time on a single CPU does seem a bit silly -
> given that (under the old system), you would not expect X to get more than
> 3% of the CPU time.

I don't try -j30 with X/KDE running.. that's much too heavy for my
little box.  The whole point of doing -j30 on my box without X/KDE is
that it juuuust fills up capacity.  It generally adds a minute to build
time despite quite hefty swapping.  With aa kernels or heavily twiddled
stock kernels, it's more like 30 seconds.  (with new gcc, -j30 is way
too much too.. oink oink;)

> Also, scheduling normal processes (i.e., not real-time processes) as RR/FIFO
> seemed pretty bad.

That was only to see if I _could_ get some CPU, and with (only:) 10
copies of gcc running.

>
> However....
>
> But I have to now admit that I haven't yet played with 2.5.47 seriously,
> and wasn't aware of the problems which Andrew just posted.
>
> mea culpa.
>
>
> --
> TimC -- http://astronomy.swin.edu.au/staff/tconnors/

    -Mike



* Re: 2.5.47 scheduler problems?
@ 2002-11-22  5:41 Jim Houston
  2002-11-22 11:07 ` Mike Galbraith
  0 siblings, 1 reply; 11+ messages in thread
From: Jim Houston @ 2002-11-22  5:41 UTC (permalink / raw)
  To: efault, linux-kernel; +Cc: riel

Hi Mike, Rik, Everyone,

The O(1) scheduler just isn't fair.  It will run a subset
of the runnable processes, excluding the rest.  See my earlier
emails for the details.
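
One way a two-array design can end up running only a subset of runnable tasks is sketched below. This is a toy model under loud assumptions, not the analysis from the earlier emails and not the kernel code: tasks flagged True are put straight back on the active array after each timeslice, tasks flagged False go to the expired array, and the arrays are switched only when the active one is empty. The starvation guard a real scheduler may have is left out on purpose.

```python
from collections import Counter, deque


def run_slices(tasks, slices):
    """Count timeslices per task in a toy active/expired two-array scheduler.

    tasks maps a task name to True if the task is reinserted into the active
    array after its timeslice, or False if it is moved to the expired array.
    Assumes at least one task.
    """
    active = deque(tasks)  # every task starts on the active array
    expired = deque()
    ran = Counter()
    for _ in range(slices):
        if not active:
            active, expired = expired, active  # array switch
        name = active.popleft()
        ran[name] += 1
        (active if tasks[name] else expired).append(name)
    return ran


counts = run_slices({"a": True, "b": True, "c": False, "d": False}, 100)
print(dict(counts))
```

As long as at least one task keeps being reinserted into the active array, the array switch never happens and the expired tasks get no further timeslices.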

I had been working on a fix for this but got distracted
by Posix timers.  I still hope to get back to it.

My patch is here:
http://marc.theaimsgroup.com/?l=linux-kernel&m=103508412423719&w=2

It fixes fairness but breaks nice(2). Rik van Riel has a
patch here which builds on my patch which fixes this:
http://marc.theaimsgroup.com/?l=linux-kernel&m=103651801424031&w=2

I just gave this a spin with.  The patches still apply cleanly
to linux-2.5.48 and it seems well behaved:-)  

I found this problem with the LTP waitpid06 test.  It actually
produced a live-lock. See this mail:
http://marc.theaimsgroup.com/?l=linux-kernel&m=103133744217082&w=2

Jim Houston - Concurrent Computer Corp.


* Re: 2.5.47 scheduler problems?
  2002-11-22  5:41 Jim Houston
@ 2002-11-22 11:07 ` Mike Galbraith
  2002-11-22 12:51   ` Mike Galbraith
  0 siblings, 1 reply; 11+ messages in thread
From: Mike Galbraith @ 2002-11-22 11:07 UTC (permalink / raw)
  To: jim.houston, linux-kernel; +Cc: riel

At 12:41 AM 11/22/2002 -0500, Jim Houston wrote:
>Hi Mike, Rik, Everyone,
>
>The O(1) scheduler just isn't fair.  It will run a subset
>of the runnable processes, excluding the rest.  See my earlier
>emails for the details.
>
>I had been working on a fix for this but got distracted
>by Posix timers.  I still hope to get back to it.
>
>My patch is here:
>http://marc.theaimsgroup.com/?l=linux-kernel&m=103508412423719&w=2

In a brief test, this seems to cure my problem.

>It fixes fairness but breaks nice(2). Rik van Riel has a
>patch here which builds on my patch which fixes this:
>http://marc.theaimsgroup.com/?l=linux-kernel&m=103651801424031&w=2

(I haven't tested this one yet)

>I just gave this a spin with.  The patches still apply cleanly
>to linux-2.5.48 and it seems well behaved:-)

It seems a little choppy still for a not swapping load, but greatly improved.

Thanks!

         -Mike 



* Re: 2.5.47 scheduler problems?
  2002-11-22 11:07 ` Mike Galbraith
@ 2002-11-22 12:51   ` Mike Galbraith
  2002-11-22 14:04     ` Mike Galbraith
  0 siblings, 1 reply; 11+ messages in thread
From: Mike Galbraith @ 2002-11-22 12:51 UTC (permalink / raw)
  To: jim.houston, linux-kernel; +Cc: riel

[-- Attachment #1: Type: text/plain, Size: 654 bytes --]

At 12:07 PM 11/22/2002 +0100, Mike Galbraith wrote:
>At 12:41 AM 11/22/2002 -0500, Jim Houston wrote:
>
>>I just gave this a spin with.  The patches still apply cleanly
>>to linux-2.5.48 and it seems well behaved:-)
>
>It seems a little choppy still for a not swapping load, but greatly improved.
>
>Thanks!

(I put it into virgin 2.5.47 fwiw)   I have some very odd behavior.  I 
wanted to see how the kernel did at make -j30 bzImage on my test box to see 
what effect it has on throughput (box is 500 Mhz PIII + 128Mb ram), and get 
vmstat output like the attached.  I should be roughly 30Mb into swap and 
paging heftily at this point.

         -Mike

[-- Attachment #2: vmstat.out --]
[-- Type: text/plain, Size: 2291 bytes --]

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 b  w   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
41  8    764  59940   1596  18484    0    0     4    27 1027   129 90 10  0  0
35 10    764  58932   1604  18616    0    0   120     0 1026   134 92  8  0  0
34  8    764  53540   1608  18804    0    0    29     0 1023   141 87 13  0  0
38  7    764  47780   1616  19300    0    0    10     0 1075   168 92  8  0  0
38 11    764  54476   1616  19684    0    0     8     0 1018   130 90 10  0  0
33 12    764  47596   1632  20048    0    0    14   474 1062   143 89 11  0  0
33  6    764  41564   1636  19412    0    0    88     0 1104   158 90 10  0  0
32  4    764  34940   1640  19728    0    0    93     0 1024   157 91  9  0  0
37  6    764  34652   1640  20012    0    0    38     0 1022   143 93  7  0  0
36  8    764  33196   1648  20108    0    0    18     0 1023   168 89 11  0  0
35  8    764  32668   1652  20528    0    0     8   110 1029   159 91  9  0  0
34  6    764  28364   1664  20852    0    0    84     1 1023   167 92  8  0  0
32  6    764  23140   1664  20608    0    0     4     0 1020   165 93  7  0  0
34  7    764  28372   1664  20424    0    0    12     0 1020   147 93  7  0  0
38  7    764  23476   1668  20480    0    0    45     0 1123   194 93  7  0  0
32  8    764  32988   1668  20952    0    0     4   132 1027   151 94  6  0  0
35  9    764  29260   1672  20332    0    0     4     1 1020   152 89 11  0  0
32  6    764  31100   1672  20688    0    0     0     0 1106   156 92  8  0  0
31  8    764  32076   1680  20264    0    0   108     0 1023   161 91  9  0  0
35  8    764  42332   1684  20836    0    0    17     0 1023   157 86 14  0  0
34 11    764  48916   1688  20752    0    0    52   234 1033   151 88 12  0  0
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 b  w   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
37  5    764  45796   1692  20956    0    0    12     1 1020   136 89 11  0  0
32  6    764  46820   1692  20616    0    0     0     0 1017   149 89 11  0  0
34  8    764  46820   1692  20652    0    0     0     0 1017   130 91  9  0  0
31  8    764  41316   1692  20496    0    0     8     0 1083   153 92  8  0  0
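
The oddity in this output is easier to see when summarised: swpd never moves from 764 and si/so stay at zero, although roughly 30Mb of swap use was expected. A minimal sketch of such a summary follows, assuming the column layout shown above and vmstat's default kilobyte units; the function name and the two-row sample are illustrative only.

```python
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 b  w   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
41  8    764  59940   1596  18484    0    0     4    27 1027   129 90 10  0  0
35 10    764  58932   1604  18616    0    0   120     0 1026   134 92  8  0  0
"""


def summarize_vmstat(text):
    """Summarise swap activity from vmstat output (header row must come first)."""
    columns = None
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0] == "procs":
            continue  # blank line or the banner row
        if not fields[0].isdigit():
            columns = fields  # column-name row, e.g. "b w swpd free ..."
            continue
        rows.append(dict(zip(columns, map(int, fields))))
    return {
        "samples": len(rows),
        "max_swpd_kb": max(r["swpd"] for r in rows),
        "swap_in_total": sum(r["si"] for r in rows),
        "swap_out_total": sum(r["so"] for r in rows),
        "min_free_kb": min(r["free"] for r in rows),
    }


print(summarize_vmstat(SAMPLE))
```

Feeding the whole attachment to the same function would show whether any sample recorded swap traffic at all.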


* Re: 2.5.47 scheduler problems?
  2002-11-22 12:51   ` Mike Galbraith
@ 2002-11-22 14:04     ` Mike Galbraith
  0 siblings, 0 replies; 11+ messages in thread
From: Mike Galbraith @ 2002-11-22 14:04 UTC (permalink / raw)
  To: jim.houston, linux-kernel; +Cc: riel

At 01:51 PM 11/22/2002 +0100, Mike Galbraith wrote:
>At 12:07 PM 11/22/2002 +0100, Mike Galbraith wrote:
>>At 12:41 AM 11/22/2002 -0500, Jim Houston wrote:
>>
>>>I just gave this a spin with.  The patches still apply cleanly
>>>to linux-2.5.48 and it seems well behaved:-)
>>
>>It seems a little choppy still for a not swapping load, but greatly improved.
>>
>>Thanks!
>
>(I put it into virgin 2.5.47 fwiw)   I have some very odd behavior.  I 
>wanted to see how the kernel did at make -j30 bzImage on my test box to 
>see what effect it has on throughput (box is 500 Mhz PIII + 128Mb ram), 
>and get vmstat output like the attached.  I should be roughly 30Mb into 
>swap and paging heftily at this point.

Never mind the vmstat output.. it seems you need both patches.  With both 
in 2.5.48, the build progressed in a much more normal looking fashion.  I'm 
not losing control of my box any more under load.

         -Mike  


