* Plain 2.4.5 VM...
@ 2001-05-29 0:32 Jeff Garzik
2001-05-29 1:13 ` Mohammad A. Haque
` (2 more replies)
0 siblings, 3 replies; 28+ messages in thread
From: Jeff Garzik @ 2001-05-29 0:32 UTC (permalink / raw)
To: Linux Kernel Mailing List
Ouch! When compiling MySQL, building sql_yacc.cc results in a ~300M
cc1plus process size. Unfortunately this pushes the machine, with its 380M
of RAM, deep into swap:
Mem: 381608K av, 248504K used, 133104K free, 0K shrd, 192K buff
Swap: 255608K av, 255608K used, 0K free 215744K cached
Vanilla 2.4.5 VM.
--
Jeff Garzik | Disbelief, that's why you fail.
Building 1024 |
MandrakeSoft |
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Plain 2.4.5 VM...
2001-05-29 0:32 Plain 2.4.5 VM Jeff Garzik
@ 2001-05-29 1:13 ` Mohammad A. Haque
2001-05-29 1:14 ` Mohammad A. Haque
2001-05-29 8:51 ` Alan Cox
2 siblings, 0 replies; 28+ messages in thread
From: Mohammad A. Haque @ 2001-05-29 1:13 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Linux Kernel Mailing List
Jeff Garzik wrote:
>
> Ouch! When compiling MySql, building sql_yacc.cc results in a ~300M
> cc1plus process size. Unfortunately this leads the machine with 380M of
> RAM deeply into swap:
>
> Mem: 381608K av, 248504K used, 133104K free, 0K shrd, 192K buff
> Swap: 255608K av, 255608K used, 0K free 215744K cached
>
> Vanilla 2.4.5 VM.
I don't think this is new or unusual.
You can pass the following option to configure when compiling MySQL:
--with-low-memory    Try to use less memory during compilation, to
                     avoid memory limitations.
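As a concrete illustration of the tip above, a build on a memory-constrained box might look like this (the source directory name is a placeholder; only the --with-low-memory flag comes from the message):

```shell
# Hypothetical build sequence; the directory name is a placeholder.
cd mysql-src/
./configure --with-low-memory   # ask the build to favor low memory use
make
```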
--
=====================================================================
Mohammad A. Haque http://www.haque.net/
mhaque@haque.net
"Alcohol and calculus don't mix. Project Lead
Don't drink and derive." --Unknown http://wm.themes.org/
batmanppc@themes.org
=====================================================================
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Plain 2.4.5 VM...
2001-05-29 0:32 Plain 2.4.5 VM Jeff Garzik
2001-05-29 1:13 ` Mohammad A. Haque
@ 2001-05-29 1:14 ` Mohammad A. Haque
2001-05-29 8:51 ` Alan Cox
2 siblings, 0 replies; 28+ messages in thread
From: Mohammad A. Haque @ 2001-05-29 1:14 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Linux Kernel Mailing List
Jeff Garzik wrote:
>
> Ouch! When compiling MySql, building sql_yacc.cc results in a ~300M
> cc1plus process size. Unfortunately this leads the machine with 380M of
> RAM deeply into swap:
>
> Mem: 381608K av, 248504K used, 133104K free, 0K shrd, 192K buff
> Swap: 255608K av, 255608K used, 0K free 215744K cached
>
> Vanilla 2.4.5 VM.
>
Sorry. I just looked at your numbers again and saw that you have 133 MB of
real RAM free. Is this during the compile?
--
=====================================================================
Mohammad A. Haque http://www.haque.net/
mhaque@haque.net
"Alcohol and calculus don't mix. Project Lead
Don't drink and derive." --Unknown http://wm.themes.org/
batmanppc@themes.org
=====================================================================
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Plain 2.4.5 VM...
2001-05-29 0:32 Plain 2.4.5 VM Jeff Garzik
2001-05-29 1:13 ` Mohammad A. Haque
2001-05-29 1:14 ` Mohammad A. Haque
@ 2001-05-29 8:51 ` Alan Cox
2 siblings, 0 replies; 28+ messages in thread
From: Alan Cox @ 2001-05-29 8:51 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Linux Kernel Mailing List
> Ouch! When compiling MySql, building sql_yacc.cc results in a ~300M
> cc1plus process size. Unfortunately this leads the machine with 380M of
> RAM deeply into swap:
>
> Mem: 381608K av, 248504K used, 133104K free, 0K shrd, 192K buff
> Swap: 255608K av, 255608K used, 0K free 215744K cached
That is supposed to happen. The pages exist in both swap and memory but
have not been reclaimed. In that state the VM hasn't even broken yet.
Where you hit a problem is that the 255MB of stuff residing in both memory
and swap won't be flushed from swap when you need more swap space. That is
a giant-size special-edition stupid design flaw that is on the VM hackers'
list. But there are only a finite number of patches you can do in a day,
and things like sucking completely came first, I believe.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Plain 2.4.5 VM...
@ 2001-05-29 2:32 G. Hugh Song
2001-05-29 4:10 ` Jakob Østergaard
0 siblings, 1 reply; 28+ messages in thread
From: G. Hugh Song @ 2001-05-29 2:32 UTC (permalink / raw)
To: linux-kernel
Jeff Garzik wrote:
>
> Ouch! When compiling MySql, building sql_yacc.cc results in a ~300M
> cc1plus process size. Unfortunately this leads the machine with 380M of
> RAM deeply into swap:
>
> Mem: 381608K av, 248504K used, 133104K free, 0K shrd, 192K buff
> Swap: 255608K av, 255608K used, 0K free 215744K cached
>
> Vanilla 2.4.5 VM.
>
This bug, known as the swap-reclaim bug, has been there for a while, since
around 2.4.4. Rik van Riel said that it is on the TODO list.
Because of this, I went back to 2.2.20pre2aa1 on UP2000 SMP.
IMHO, the current 2.4.* kernels should still be 2.3.*. When this bug
is fixed, I will come back to 2.4.*.
Regards,
Hugh
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Plain 2.4.5 VM...
2001-05-29 2:32 G. Hugh Song
@ 2001-05-29 4:10 ` Jakob Østergaard
2001-05-29 4:26 ` safemode
` (3 more replies)
0 siblings, 4 replies; 28+ messages in thread
From: Jakob Østergaard @ 2001-05-29 4:10 UTC (permalink / raw)
To: G. Hugh Song; +Cc: linux-kernel
On Tue, May 29, 2001 at 11:32:09AM +0900, G. Hugh Song wrote:
>
> Jeff Garzik wrote:
> >
> > Ouch! When compiling MySql, building sql_yacc.cc results in a ~300M
> > cc1plus process size. Unfortunately this leads the machine with 380M of
> > RAM deeply into swap:
> >
> > Mem: 381608K av, 248504K used, 133104K free, 0K shrd, 192K buff
> > Swap: 255608K av, 255608K used, 0K free 215744K cached
> >
> > Vanilla 2.4.5 VM.
> >
>
> This bug, known as the swap-reclaim bug, has been there for a while, since
> around 2.4.4. Rik van Riel said that it is on the TODO list.
> Because of this, I went back to 2.2.20pre2aa1 on UP2000 SMP.
>
> IMHO, the current 2.4.* kernels should still be 2.3.*. When this bug
> is fixed, I will come back to 2.4.*.
Just keep enough swap around. How hard can that be?
Really, it's not like a memory leak or something. It's just "late reclaim".
If Linux didn't do overcommit, you wouldn't have been able to run that job
anyway.
It's not a bug. It's a feature. It only breaks systems that are run with
"too little" swap, and the only difference from 2.2 till now is that the
definition of "too little" changed.
--
................................................................
: jakob@unthought.net : And I see the elder races, :
:.........................: putrid forms of man :
: Jakob Østergaard : See him rise and claim the earth, :
: OZ9ABN : his downfall is at hand. :
:.........................:............{Konkhra}...............:
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Plain 2.4.5 VM...
2001-05-29 4:10 ` Jakob Østergaard
@ 2001-05-29 4:26 ` safemode
2001-05-29 4:38 ` Jeff Garzik
2001-05-29 4:46 ` G. Hugh Song
` (2 subsequent siblings)
3 siblings, 1 reply; 28+ messages in thread
From: safemode @ 2001-05-29 4:26 UTC (permalink / raw)
To: Jakob Østergaard, G. Hugh Song; +Cc: linux-kernel
On Tuesday 29 May 2001 00:10, Jakob Østergaard wrote:
> > > Mem: 381608K av, 248504K used, 133104K free, 0K shrd, 192K buff
> > > Swap: 255608K av, 255608K used, 0K free 215744K cached
> > >
> > > Vanilla 2.4.5 VM.
> It's not a bug. It's a feature. It only breaks systems that are run with
> "too little" swap, and the only difference from 2.2 till now is that the
> definition of "too little" changed.
Sorry, but if ~250MB is "too little"... then it _is_ a bug. I think
everyone would agree that 250MB of swap in use is far, far too much. If
this is a feature, it is one nobody would want.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Plain 2.4.5 VM...
2001-05-29 4:26 ` safemode
@ 2001-05-29 4:38 ` Jeff Garzik
2001-05-29 6:04 ` Mike Galbraith
2001-05-29 14:06 ` Gerhard Mack
0 siblings, 2 replies; 28+ messages in thread
From: Jeff Garzik @ 2001-05-29 4:38 UTC (permalink / raw)
To: Jakob Østergaard; +Cc: safemode, G. Hugh Song, linux-kernel
> On Tuesday 29 May 2001 00:10, Jakob Østergaard wrote:
>
> > > > Mem: 381608K av, 248504K used, 133104K free, 0K shrd, 192K buff
> > > > Swap: 255608K av, 255608K used, 0K free 215744K cached
> > > >
> > > > Vanilla 2.4.5 VM.
>
> > It's not a bug. It's a feature. It only breaks systems that are run with
> > "too little" swap, and the only difference from 2.2 till now is that the
> > definition of "too little" changed.
I am surprised that so many people are missing the point:
* when you have an active process using ~300M of VM, in a ~380M machine,
2/3 of the machine's RAM should -not- be soaked up by cache
* when you have an active process using ~300M of VM, in a ~380M machine,
swap should not be full while there is 133M of RAM available.
The quote above is top output, taken during the several minutes when the
cc1plus process was ~300M in size. Similar numbers existed before and
after my cut-and-paste, so this was not transient behavior.
I can assure you, these are bugs, not features :)
--
Jeff Garzik | Disbelief, that's why you fail.
Building 1024 |
MandrakeSoft |
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Plain 2.4.5 VM...
2001-05-29 4:38 ` Jeff Garzik
@ 2001-05-29 6:04 ` Mike Galbraith
2001-05-29 14:06 ` Gerhard Mack
1 sibling, 0 replies; 28+ messages in thread
From: Mike Galbraith @ 2001-05-29 6:04 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Jakob Østergaard, safemode, G. Hugh Song, linux-kernel
On Tue, 29 May 2001, Jeff Garzik wrote:
> > On Tuesday 29 May 2001 00:10, Jakob Østergaard wrote:
> >
> > > > > Mem: 381608K av, 248504K used, 133104K free, 0K shrd, 192K buff
> > > > > Swap: 255608K av, 255608K used, 0K free 215744K cached
> > > > >
> > > > > Vanilla 2.4.5 VM.
> >
> > > It's not a bug. It's a feature. It only breaks systems that are run with
> > > "too little" swap, and the only difference from 2.2 till now is that the
> > > definition of "too little" changed.
>
> I am surprised as many people as this are missing,
>
> * when you have an active process using ~300M of VM, in a ~380M machine,
> 2/3 of the machine's RAM should -not- be soaked up by cache
Emphatic yes. We went from cache collapse to cache bloat. IMHO, the
bugfix for the collapse exposed other problems. I posted a patch which
I believe demonstrated that pretty well. (I also bet Rik a virtual
beer that folks would knock on his mailbox when 2.4.5 was released.
Please cc him, somebody.. I want my brewski ;)
-Mike
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Plain 2.4.5 VM...
2001-05-29 4:38 ` Jeff Garzik
2001-05-29 6:04 ` Mike Galbraith
@ 2001-05-29 14:06 ` Gerhard Mack
1 sibling, 0 replies; 28+ messages in thread
From: Gerhard Mack @ 2001-05-29 14:06 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Jakob Østergaard, safemode, G. Hugh Song, linux-kernel
> * when you have an active process using ~300M of VM, in a ~380M machine,
> 2/3 of the machine's RAM should -not- be soaked up by cache
>
> * when you have an active process using ~300M of VM, in a ~380M machine,
> swap should not be full while there is 133M of RAM available.
>
> The above quoted is top output, taken during the several minutes where
> cc1plus process was ~300M in size. Similar numbers existed before and
> after my cut-n-paste, so this was not transient behavior.
>
> I can assure you, these are bugs not features :)
>
I've seen that here too, but every report I've sent about it has been
dismissed with "that's what it's supposed to do".
--
Gerhard Mack
gmack@innerfire.net
<>< As a computer I find your faith in technology amusing.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Plain 2.4.5 VM...
2001-05-29 4:10 ` Jakob Østergaard
2001-05-29 4:26 ` safemode
@ 2001-05-29 4:46 ` G. Hugh Song
2001-05-29 4:57 ` Jakob Østergaard
2001-05-29 7:13 ` Marcelo Tosatti
2001-05-29 9:10 ` Alan Cox
3 siblings, 1 reply; 28+ messages in thread
From: G. Hugh Song @ 2001-05-29 4:46 UTC (permalink / raw)
To: Jakob Østergaard; +Cc: linux-kernel
Jakob,
My Alpha has 2GB of physical memory. In this case, how much swap space
should I assign these days with kernel 2.4.*? I had trouble with 1GB of
swap space before switching back to 2.2.20pre2aa1.
Thanks
--
G. Hugh Song
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Plain 2.4.5 VM...
2001-05-29 4:46 ` G. Hugh Song
@ 2001-05-29 4:57 ` Jakob Østergaard
0 siblings, 0 replies; 28+ messages in thread
From: Jakob Østergaard @ 2001-05-29 4:57 UTC (permalink / raw)
To: G. Hugh Song; +Cc: linux-kernel
On Tue, May 29, 2001 at 01:46:28PM +0900, G. Hugh Song wrote:
> Jakob,
>
> My Alpha has 2GB of physical memory. In this case how much swap space
> should
> I assign in these days of kernel 2.4.*? I had had trouble with 1GB of
> swap space
> before switching back to 2.2.20pre2aa1.
If you run a single mingetty and bash session, you need no swap.
If you run four 1GB processes concurrently, I would use ~5-6G of swap to be on
the safe side.
Swap is very cheap, even measured in gigabytes. Go with the sum of the
largest process footprints you can imagine running on your system, and
then add some. Be generous. It's not as if unused swap space will slow the
system down - it's a nice little extra safety margin to have. It's beyond
me why anyone would run a system with marginal swap.
On a compile box here with 392 MB physical, I have 900 MB of swap. This
accommodates multiple concurrent 100-300 MB compile jobs. Never had a
problem. Oh, and I didn't have to change my swap setup between 2.2 and 2.4.
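Jakob's sizing rule can be sketched as a tiny shell helper: sum the largest process footprints you expect to run concurrently, then add generous headroom. The 50% headroom figure is my own illustrative assumption; it happens to put four 1GB jobs in the ~5-6GB range he mentions.

```shell
# Sketch of the "sum the biggest footprints, then add some" rule.
# The 50% headroom is an illustrative assumption, not a fixed rule.
recommended_swap_mb() {
    total=0
    for sz in "$@"; do              # each argument: expected process size in MB
        total=$((total + sz))
    done
    echo $((total + total / 2))     # add ~50% headroom: "be generous"
}

recommended_swap_mb 1024 1024 1024 1024   # prints 6144 (about 6GB)
```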
--
................................................................
: jakob@unthought.net : And I see the elder races, :
:.........................: putrid forms of man :
: Jakob Østergaard : See him rise and claim the earth, :
: OZ9ABN : his downfall is at hand. :
:.........................:............{Konkhra}...............:
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Plain 2.4.5 VM...
2001-05-29 4:10 ` Jakob Østergaard
2001-05-29 4:26 ` safemode
2001-05-29 4:46 ` G. Hugh Song
@ 2001-05-29 7:13 ` Marcelo Tosatti
2001-05-29 9:10 ` Alan Cox
3 siblings, 0 replies; 28+ messages in thread
From: Marcelo Tosatti @ 2001-05-29 7:13 UTC (permalink / raw)
To: Jakob Østergaard; +Cc: G. Hugh Song, linux-kernel
On Tue, 29 May 2001, Jakob Østergaard wrote:
>
> It's not a bug. It's a feature. It only breaks systems that are run with "too
> little" swap, and the only difference from 2.2 till now is that the definition
> of "too little" changed.
It's just a balancing change, actually. You can tune the code to reap the
cache more aggressively.
"Just add more swap and you're OK" is not the answer, IMO.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Plain 2.4.5 VM...
2001-05-29 4:10 ` Jakob Østergaard
` (2 preceding siblings ...)
2001-05-29 7:13 ` Marcelo Tosatti
@ 2001-05-29 9:10 ` Alan Cox
2001-05-29 15:37 ` elko
3 siblings, 1 reply; 28+ messages in thread
From: Alan Cox @ 2001-05-29 9:10 UTC (permalink / raw)
To: Jakob Østergaard; +Cc: G. Hugh Song, linux-kernel
> It's not a bug. It's a feature. It only breaks systems that are run
> with "too little" swap, and the only difference from 2.2 till now is
> that the definition of "too little" changed.
It's a giant bug. Or do you want to add 128GB of unused swap to a fully
kitted-out Xeon box - or 512GB to a big Athlon?
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Plain 2.4.5 VM...
2001-05-29 9:10 ` Alan Cox
@ 2001-05-29 15:37 ` elko
0 siblings, 0 replies; 28+ messages in thread
From: elko @ 2001-05-29 15:37 UTC (permalink / raw)
To: linux-kernel
On Tuesday 29 May 2001 11:10, Alan Cox wrote:
> > It's not a bug. It's a feature. It only breaks systems that are run
> > with "too little" swap, and the only difference from 2.2 till now is
> > that the definition of "too little" changed.
>
> It's a giant bug. Or do you want to add 128GB of unused swap to a fully
> kitted-out Xeon box - or 512GB to a big Athlon?
This bug is biting me too, and I do NOT like it!
If it's a *giant* bug, then why is 2.4 called a *stable* kernel?
--
Elko Holl
^ permalink raw reply [flat|nested] 28+ messages in thread
[parent not found: <mailman.991098720.29883.linux-kernel2news@redhat.com>]
* Re: Plain 2.4.5 VM...
[not found] <mailman.991098720.29883.linux-kernel2news@redhat.com>
@ 2001-05-29 2:55 ` Pete Zaitcev
0 siblings, 0 replies; 28+ messages in thread
From: Pete Zaitcev @ 2001-05-29 2:55 UTC (permalink / raw)
To: jgarzik, Linux Kernel Mailing List
> Ouch! When compiling MySql, building sql_yacc.cc results in a ~300M
> cc1plus process size. Unfortunately this leads the machine with 380M of
> RAM deeply into swap:
>
> Mem: 381608K av, 248504K used, 133104K free, 0K shrd, 192K buff
> Swap: 255608K av, 255608K used, 0K free 215744K cached
>
> Vanilla 2.4.5 VM.
I noticed that too, and there is no way around it. If we assume a
2.5×RAM target, you must add about 704MB. In my case I had no spare
partition, so I added a swapfile, as undoubtedly many 2.4 sufferers did.
-- Pete
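Pete's arithmetic can be checked directly. Assuming a 2.5×RAM total-swap target (his rule of thumb, not a kernel requirement) and the Mem/Swap totals from the top output above, the shortfall comes out near his "about 704MB":

```shell
# Extra swap (in KB) needed to reach a 2.5x-RAM total-swap target.
additional_swap_kb() {
    ram_kb=$1
    swap_kb=$2
    echo $((ram_kb * 5 / 2 - swap_kb))   # target (2.5 x RAM) minus existing swap
}

additional_swap_kb 381608 255608   # prints 698412, i.e. roughly 680-700 MB
```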
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Plain 2.4.5 VM
@ 2001-05-30 4:24 Craig Kulesa
2001-05-30 6:50 ` Mike Galbraith
2001-05-30 13:27 ` Jonathan Morton
0 siblings, 2 replies; 28+ messages in thread
From: Craig Kulesa @ 2001-05-30 4:24 UTC (permalink / raw)
To: linux-kernel; +Cc: Rik van Riel
Mike Galbraith (mikeg@wen-online.de) wrote:
>
> Emphatic yes. We went from cache collapse to cache bloat.
Rik, I think Mike deserves his beer. ;)
Agreed. Swap reclaim doesn't seem to be the root issue here, IMHO.
Rather: his box was capable of maintaining a modest cache and the
desired user processes without massive allocation (and use) of swap
space. There was plenty of cache to reap, but the VM decided to swap
out instead. Seems we're out of balance here (IMHO).
I see this too, and it appears only in post-2.4.4 kernels.
Example: on a 128M system with 2.4.5, loading X and a simulation code
with RSS=70M causes the system to drop 40-50M into swap... with 100M of
cache sitting there, some of those cache pages fairly old. It's not
just allocation; there is noticeable disk activity associated with the
paging, which causes a lag in interactivity. Under 2.4.4, there is no
swap activity at all.
And if the application causes heavy I/O *and* memory load (think
StarOffice, or Quake 3), this situation gets even worse (because there's
typically more competition/demand for cache pages).
And on low-memory systems (e.g. 32M), even a basic web-browsing test with
Opera drops the 2.4.5 system 25M into swap, where 2.4.4 barely cracks 5MB
on the same test (and the interactivity shows). This is all independent
of swap reclaim.
So is there an ideal VM balance for everyone? I have found that low-RAM
systems seem to benefit from being on the "cache-collapse" side of the
curve (so I prefer the pre-2.4.5 balance more than Mike probably does),
and those low-RAM systems are the first hit when, as now, we're favoring
"cache bloat". Should balancing behavior be alterable by the user
(via sysctls, maybe? Yeah, I hear the cringes)? Or better, is it
possible to dynamically choose where the balancing watermarks should
lie, and alter them automatically? 2.5 stuff there, no doubt. Balancing
seems so *fragile* (to me).
Cheers,
Craig Kulesa
ckulesa@as.arizona.edu
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Plain 2.4.5 VM
2001-05-30 4:24 Craig Kulesa
@ 2001-05-30 6:50 ` Mike Galbraith
2001-05-30 13:27 ` Jonathan Morton
1 sibling, 0 replies; 28+ messages in thread
From: Mike Galbraith @ 2001-05-30 6:50 UTC (permalink / raw)
To: Craig Kulesa; +Cc: linux-kernel, Rik van Riel
On Tue, 29 May 2001, Craig Kulesa wrote:
> Mike Galbraith (mikeg@wen-online.de) wrote:
> >
> > Emphatic yes. We went from cache collapse to cache bloat.
>
> Rik, I think Mike deserves his beer. ;)
:)
...
> So is there an ideal VM balance for everyone? I have found that low-RAM
(I seriously doubt it)
> systems seem to benefit from being on the "cache-collapse" side of the
> curve (so I prefer the pre-2.4.5 balance more than Mike probably does) and
I hate both bad behaviors equally. "Cache bloat" hurts more people than
"cache collapse" does, though, because it shows up under light load.
> those low-RAM systems are the first hit when, as now, we're favoring
> "cache bloat". Should balance behaviors could be altered by the user
> (via sysctl's maybe? Yeah, I hear the cringes)? Or better, is it
> possible to dynamically choose where the watermarks in balancing should
> lie, and alter them automatically? 2.5 stuff there, no doubt. Balancing
> seems so *fragile* (to me).
The page aging logic does seem fragile as heck. You never know how
many folks are aging pages or at what rate. If aging happens too fast,
it defeats the garbage identification logic and you rape your cache. If
aging happens too slowly...... sigh.
-Mike
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Plain 2.4.5 VM
2001-05-30 4:24 Craig Kulesa
2001-05-30 6:50 ` Mike Galbraith
@ 2001-05-30 13:27 ` Jonathan Morton
2001-05-30 12:20 ` Marcelo Tosatti
2001-05-30 15:18 ` Mike Galbraith
1 sibling, 2 replies; 28+ messages in thread
From: Jonathan Morton @ 2001-05-30 13:27 UTC (permalink / raw)
To: Mike Galbraith, Craig Kulesa; +Cc: linux-kernel, Rik van Riel
>The page aging logic does seem fragile as heck. You never know how
>many folks are aging pages or at what rate. If aging happens too fast,
>it defeats the garbage identification logic and you rape your cache. If
>aging happens too slowly...... sigh.
Then it sounds like the current algorithm is totally broken and needs
replacement. If it's impossible to make a system stable by choosing the
right numbers, the system needs changing, not the numbers. I think that's
pretty much what we're being taught in Control Engineering. :)
Not having studied the code too closely, it sounds as though there are half
a dozen different "clocks" running for different types of memory, and each
one runs at a different speed and is updated at a different time.
Meanwhile, the paging-out is done on the assumption that all the clocks are
(at least roughly) in sync. Makes sense, right? (not!)
I think it's worthwhile to think of the page/buffer caches as having a
working set of their own - if they are being heavily used, they should get
more memory than if they are only lightly used. The important point to get
right is to ensure that the "clocks" used for each memory area remain in
sync - they don't have to measure real time, just be consistent and
fine-grained.
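The up/down page aging being debated can be sketched as a toy model: a referenced page's age climbs linearly up to a cap, while an unreferenced page's age halves each scan, decaying toward reclaimable. The constants here are illustrative assumptions, not the kernel's actual values.

```shell
# Toy model of 2.4-style exponential up/down page aging.
# AGE_ADV and AGE_MAX are illustrative, not the kernel's real constants.
AGE_ADV=3
AGE_MAX=20

age_on_reference() {                   # $1 = current age of a referenced page
    new=$(( $1 + AGE_ADV ))            # age climbs linearly...
    if [ "$new" -gt "$AGE_MAX" ]; then # ...but is capped
        new=$AGE_MAX
    fi
    echo "$new"
}

age_on_scan() {                        # $1 = current age of an unreferenced page
    echo $(( $1 / 2 ))                 # age halves; at 0 the page is reclaimable
}

age_on_reference 10   # prints 13
age_on_scan 10        # prints 5
```

If many tasks run the scan side at once, ages halve faster than references can raise them, so hot and cold pages never separate - which is the fragility being described.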
I'm working on some relatively small changes to vmscan.c which should help
improve the behaviour without upsetting the balance too much. Watch this
space...
--------------------------------------------------------------
from: Jonathan "Chromatix" Morton
mail: chromi@cyberspace.org (not for attachments)
big-mail: chromatix@penguinpowered.com
uni-mail: j.d.morton@lancaster.ac.uk
The key to knowledge is not to rely on people to teach you it.
Get VNC Server for Macintosh from http://www.chromatix.uklinux.net/vnc/
-----BEGIN GEEK CODE BLOCK-----
Version 3.12
GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS
PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)
-----END GEEK CODE BLOCK-----
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Plain 2.4.5 VM
2001-05-30 13:27 ` Jonathan Morton
@ 2001-05-30 12:20 ` Marcelo Tosatti
2001-05-30 15:27 ` Mike Galbraith
2001-05-30 18:39 ` Rik van Riel
2001-05-30 15:18 ` Mike Galbraith
1 sibling, 2 replies; 28+ messages in thread
From: Marcelo Tosatti @ 2001-05-30 12:20 UTC (permalink / raw)
To: Jonathan Morton; +Cc: Mike Galbraith, Craig Kulesa, linux-kernel, Rik van Riel
On Wed, 30 May 2001, Jonathan Morton wrote:
> >The page aging logic does seem fragile as heck. You never know how
> >many folks are aging pages or at what rate. If aging happens too fast,
> >it defeats the garbage identification logic and you rape your cache. If
> >aging happens too slowly...... sigh.
>
> Then it sounds like the current algorithm is totally broken and needs
> replacement. If it's impossible to make a system stable by choosing the
> right numbers, the system needs changing, not the numbers. I think that's
> pretty much what we're being taught in Control Engineering. :)
The problem is that we allow _every_ task to age pages on the system at
the same time --- this is one of the things which is fucking up.
The other problem is that we don't limit writeout in the VM.
We (Rik and I) are going to work on this later --- right now I'm busy
with the distribution release and Rik is travelling.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Plain 2.4.5 VM
2001-05-30 12:20 ` Marcelo Tosatti
@ 2001-05-30 15:27 ` Mike Galbraith
2001-05-30 18:39 ` Rik van Riel
1 sibling, 0 replies; 28+ messages in thread
From: Mike Galbraith @ 2001-05-30 15:27 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: Jonathan Morton, Craig Kulesa, linux-kernel, Rik van Riel
On Wed, 30 May 2001, Marcelo Tosatti wrote:
> On Wed, 30 May 2001, Jonathan Morton wrote:
>
> > >The page aging logic does seem fragile as heck. You never know how
> > >many folks are aging pages or at what rate. If aging happens too fast,
> > >it defeats the garbage identification logic and you rape your cache. If
> > >aging happens too slowly...... sigh.
> >
> > Then it sounds like the current algorithm is totally broken and needs
> > replacement. If it's impossible to make a system stable by choosing the
> > right numbers, the system needs changing, not the numbers. I think that's
> > pretty much what we're being taught in Control Engineering. :)
>
> The problem is that we allow _every_ task to age pages on the system at
> the same time --- this is one of the things which is fucking up.
Yes. (I've been muttering/mumbling about this for... ever. look at the
last patch I posted in this light.. make -j30 load:)
> The other problem is that we don't limit writeout in the VM.
And sometimes we don't start writing out soon enough.
> We (me and Rik) are going to work on this later --- right now I'm busy
> with the distribution release and Rik is travelling.
Cool.
-Mike
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Plain 2.4.5 VM
2001-05-30 12:20 ` Marcelo Tosatti
2001-05-30 15:27 ` Mike Galbraith
@ 2001-05-30 18:39 ` Rik van Riel
2001-05-30 17:19 ` Marcelo Tosatti
2001-05-30 20:33 ` Mike Galbraith
1 sibling, 2 replies; 28+ messages in thread
From: Rik van Riel @ 2001-05-30 18:39 UTC (permalink / raw)
To: Marcelo Tosatti
Cc: Jonathan Morton, Mike Galbraith, Craig Kulesa, linux-kernel
On Wed, 30 May 2001, Marcelo Tosatti wrote:
> The problem is that we allow _every_ task to age pages on the system
> at the same time --- this is one of the things which is fucking up.
This should not have any effect on the ratio of cache
reclaiming vs. swapout use, though...
> The other problem is that we don't limit writeout in the VM.
This is a big problem too, but also unrelated to the
impossibility of balancing cache vs. swap in the current
scheme.
regards,
Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...
http://www.surriel.com/ http://distro.conectiva.com/
Send all your spam to aardvark@nl.linux.org (spam digging piggy)
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Plain 2.4.5 VM
2001-05-30 18:39 ` Rik van Riel
@ 2001-05-30 17:19 ` Marcelo Tosatti
2001-05-30 20:33 ` Mike Galbraith
1 sibling, 0 replies; 28+ messages in thread
From: Marcelo Tosatti @ 2001-05-30 17:19 UTC (permalink / raw)
To: Rik van Riel; +Cc: Jonathan Morton, Mike Galbraith, Craig Kulesa, linux-kernel
On Wed, 30 May 2001, Rik van Riel wrote:
> On Wed, 30 May 2001, Marcelo Tosatti wrote:
>
> > The problem is that we allow _every_ task to age pages on the system
> > at the same time --- this is one of the things which is fucking up.
>
> This should not have any effect on the ratio of cache
> reclaiming vs. swapout use, though...
Sure, who said that ? :)
The current discussion between Mike/Jonathan and me is about the aging
issue.
>
> > The other problem is that we don't limit writeout in the VM.
>
> This is a big problem too, but also unrelated to the
> impossibility of balancing cache vs. swap in the current
> scheme.
...
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Plain 2.4.5 VM
2001-05-30 18:39 ` Rik van Riel
2001-05-30 17:19 ` Marcelo Tosatti
@ 2001-05-30 20:33 ` Mike Galbraith
2001-05-30 19:25 ` Marcelo Tosatti
1 sibling, 1 reply; 28+ messages in thread
From: Mike Galbraith @ 2001-05-30 20:33 UTC (permalink / raw)
To: Rik van Riel; +Cc: Marcelo Tosatti, Jonathan Morton, Craig Kulesa, linux-kernel
On Wed, 30 May 2001, Rik van Riel wrote:
> On Wed, 30 May 2001, Marcelo Tosatti wrote:
>
> > The problem is that we allow _every_ task to age pages on the system
> > at the same time --- this is one of the things which is fucking up.
>
> This should not have any effect on the ratio of cache
> reclaiming vs. swapout use, though...
It shouldn't.. but when many tasks are aging, it does. Excluding
these guys certainly seems to make a difference. (I could be seeing
something else and interpreting it wrong...)
-Mike
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Plain 2.4.5 VM
2001-05-30 20:33 ` Mike Galbraith
@ 2001-05-30 19:25 ` Marcelo Tosatti
2001-05-31 5:20 ` Mike Galbraith
0 siblings, 1 reply; 28+ messages in thread
From: Marcelo Tosatti @ 2001-05-30 19:25 UTC (permalink / raw)
To: Mike Galbraith; +Cc: Rik van Riel, Jonathan Morton, Craig Kulesa, linux-kernel
On Wed, 30 May 2001, Mike Galbraith wrote:
> On Wed, 30 May 2001, Rik van Riel wrote:
>
> > On Wed, 30 May 2001, Marcelo Tosatti wrote:
> >
> > > The problem is that we allow _every_ task to age pages on the system
> > > at the same time --- this is one of the things which is fucking up.
> >
> > This should not have any effect on the ratio of cache
> > reclaiming vs. swapout use, though...
>
> It shouldn't.. but when many tasks are aging, it does.
What Rik means is that they are independent problems.
> Excluding these guys certainly seems to make a difference.
Sure, those guys are going to "help" kswapd unmap PTEs and allocate
swap space.
Now even if only kswapd does this job (meaning a sane amount of cache
reclaims/swapouts), you still have to deal with the reclaim/swapout
tradeoff.
See?
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Plain 2.4.5 VM
2001-05-30 19:25 ` Marcelo Tosatti
@ 2001-05-31 5:20 ` Mike Galbraith
0 siblings, 0 replies; 28+ messages in thread
From: Mike Galbraith @ 2001-05-31 5:20 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: Rik van Riel, Jonathan Morton, Craig Kulesa, linux-kernel
On Wed, 30 May 2001, Marcelo Tosatti wrote:
> On Wed, 30 May 2001, Mike Galbraith wrote:
>
> > On Wed, 30 May 2001, Rik van Riel wrote:
> >
> > > On Wed, 30 May 2001, Marcelo Tosatti wrote:
> > >
> > > > The problem is that we allow _every_ task to age pages on the system
> > > > at the same time --- this is one of the things which is fucking up.
> > >
> > > This should not have any effect on the ratio of cache
> > > reclaiming vs. swapout use, though...
> >
> > It shouldn't.. but when many tasks are aging, it does.
>
> What Rik means is that they are independent problems.
Ok.
>
> > Excluding these guys certainly seems to make a difference.
>
> Sure, those guys are going to "help" kswapd to unmap pte's and allocate
> swap space.
>
> Now even if only kswapd does this job (meaning a sane amount of cache
> reclaims/swapouts), you still have to deal with the reclaim/swapout
> tradeoff.
>
> See?
Yes.
-Mike
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Plain 2.4.5 VM
2001-05-30 13:27 ` Jonathan Morton
2001-05-30 12:20 ` Marcelo Tosatti
@ 2001-05-30 15:18 ` Mike Galbraith
2001-05-30 19:16 ` Rik van Riel
1 sibling, 1 reply; 28+ messages in thread
From: Mike Galbraith @ 2001-05-30 15:18 UTC (permalink / raw)
To: Jonathan Morton; +Cc: Craig Kulesa, linux-kernel, Rik van Riel
On Wed, 30 May 2001, Jonathan Morton wrote:
> >The page aging logic does seems fragile as heck. You never know how
> >many folks are aging pages or at what rate. If aging happens too fast,
> >it defeats the garbage identification logic and you rape your cache. If
> >aging happens too slowly...... sigh.
>
> Then it sounds like the current algorithm is totally broken and needs
> replacement. If it's impossible to make a system stable by choosing the
> right numbers, the system needs changing, not the numbers. I think that's
> pretty much what we're being taught in Control Engineering. :)
I wouldn't go so far as to say totally broken (mostly because I've tried
like _hell_ to find a better way, and [despite minor successes] I've not
been able to come up with something which covers all cases that even _I_
[hw tech] can think of well). Hard numbers are just plain bad, see below.
I _will_ say that it is entirely too easy to break for comfort ;-)
> Not having studied the code too closely, it sounds as though there are half
> a dozen different "clocks" running for different types of memory, and each
> one runs at a different speed and is updated at a different time.
> Meanwhile, the paging-out is done on the assumption that all the clocks are
> (at least roughly) in sync. Makes sense, right? (not!)
No, I don't think that's the case at all. The individual zone balancing
logic (or individual page [content] type logic) hasn't been implemented
logic (or individual page [content] type logic) hasn't been inplimented
yet (content type handling I don't think we want), but doesn't look like
it'll be any trouble to do with the structures in place. That's just a
fleshing out thing. The variation in aging rate is the most difficult
problem I can see. IMHO, it needs to be either decoupled and done by an
impartial bystander (tried that, ran into info flow troubles because of
scheduling) or integrated tightly into the allocator proper (tried that..
interesting results but has problems of its own wrt the massive changes
in general strategy needed to make it work.. approaches rewrite)
> I think it's worthwhile to think of the page/buffer caches as having a
> working set of their own - if they are being heavily used, they should get
> more memory than if they are only lightly used. The important point to get
> right is to ensure that the "clocks" used for each memory area remain in
> sync - they don't have to measure real time, just be consistent and fine
> granularity.
IMHO, the only thing of interest you can do with clocks is set your state
sample rate. If state is changing rapidly, you must sample rapidly. As
far as corrections go, you can only insert a corrective vector into the
mix and then see if the sum induced the desired change in direction. The
correct magnitude of this vector is not even possible to know.. that's
what makes it so darn hard [defining numerical goals is utterly bogus].
> I'm working on some relatively small changes to vmscan.c which should help
> improve the behaviour without upsetting the balance too much. Watch this
> space...
With much interest :)
-Mike
* Re: Plain 2.4.5 VM
2001-05-30 15:18 ` Mike Galbraith
@ 2001-05-30 19:16 ` Rik van Riel
0 siblings, 0 replies; 28+ messages in thread
From: Rik van Riel @ 2001-05-30 19:16 UTC (permalink / raw)
To: Mike Galbraith; +Cc: Jonathan Morton, Craig Kulesa, linux-kernel
On Wed, 30 May 2001, Mike Galbraith wrote:
> I wouldn't go so far as to say totally broken (mostly because I've
> tried like _hell_ to find a better way, and [despite minor successes]
> I've not been able to come up with something which covers all cases
> that even _I_ [hw tech] can think of well).
The "easy way out" seems to be physical -> virtual
page reverse mappings, these make it trivial to apply
balanced pressure on all pages.
The downside of this measure is that it costs additional
overhead and can up to double the amount of memory we
take in with page tables. Of course, this amount is only
prohibitive if the amount of page table memory was also
prohibitively large in the first place, but ... :)
regards,
Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...
http://www.surriel.com/ http://distro.conectiva.com/
Send all your spam to aardvark@nl.linux.org (spam digging piggy)
Thread overview: 28+ messages
2001-05-29 0:32 Plain 2.4.5 VM Jeff Garzik
2001-05-29 1:13 ` Mohammad A. Haque
2001-05-29 1:14 ` Mohammad A. Haque
2001-05-29 8:51 ` Alan Cox
-- strict thread matches above, loose matches on Subject: below --
2001-05-29 2:32 G. Hugh Song
2001-05-29 4:10 ` Jakob Østergaard
2001-05-29 4:26 ` safemode
2001-05-29 4:38 ` Jeff Garzik
2001-05-29 6:04 ` Mike Galbraith
2001-05-29 14:06 ` Gerhard Mack
2001-05-29 4:46 ` G. Hugh Song
2001-05-29 4:57 ` Jakob Østergaard
2001-05-29 7:13 ` Marcelo Tosatti
2001-05-29 9:10 ` Alan Cox
2001-05-29 15:37 ` elko
[not found] <mailman.991098720.29883.linux-kernel2news@redhat.com>
2001-05-29 2:55 ` Pete Zaitcev
2001-05-30 4:24 Craig Kulesa
2001-05-30 6:50 ` Mike Galbraith
2001-05-30 13:27 ` Jonathan Morton
2001-05-30 12:20 ` Marcelo Tosatti
2001-05-30 15:27 ` Mike Galbraith
2001-05-30 18:39 ` Rik van Riel
2001-05-30 17:19 ` Marcelo Tosatti
2001-05-30 20:33 ` Mike Galbraith
2001-05-30 19:25 ` Marcelo Tosatti
2001-05-31 5:20 ` Mike Galbraith
2001-05-30 15:18 ` Mike Galbraith
2001-05-30 19:16 ` Rik van Riel