public inbox for linux-kernel@vger.kernel.org
* Re: la la la la ... swappiness
@ 2006-12-04 19:02 Al Boldi
  0 siblings, 0 replies; 39+ messages in thread
From: Al Boldi @ 2006-12-04 19:02 UTC (permalink / raw)
  To: linux-kernel

As a workaround try this:

echo 2 > /proc/sys/vm/overcommit_memory
echo 0 > /proc/sys/vm/overcommit_ratio

Hopefully someone can fix this intrinsic swap-before-drop behaviour.
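For anyone trying this, a read-only sketch (no root needed) to confirm the strict limit took effect after the two echoes above:

```shell
# In mode 2 the kernel enforces a hard commit limit:
#   CommitLimit = SwapTotal + RAM * overcommit_ratio / 100
# so with ratio=0 the limit is the swap size alone, and allocations
# fail with ENOMEM before the box ever starts thrashing.
cat /proc/sys/vm/overcommit_memory
grep -E 'CommitLimit|Committed_AS' /proc/meminfo
```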


Thanks!

--
Al


* RE: la la la la ... swappiness
@ 2006-12-04  1:54 Aucoin
  2006-12-04  4:59 ` Andrew Morton
  2006-12-04  7:22 ` Kyle Moffett
  0 siblings, 2 replies; 39+ messages in thread
From: Aucoin @ 2006-12-04  1:54 UTC (permalink / raw)
  To: Aucoin, 'Tim Schmielau'
  Cc: 'Andrew Morton', torvalds, linux-kernel, clameter


I should also have made it clear that under full load, OOM kills critical
data-moving processes because of what appears to be out-of-control memory
consumption by the disk I/O cache related to the tar.

As a side note, even now, *hours* after the tar has completed, and even
though I have swappiness set to 0, cache pressure set to 9999, all dirty
timeouts set to 1, and all dirty ratios set to 1, I still have a 360k+
inactive page count and my "free" memory is less than 10% of normal. I'm not
pretending to understand what's happening here, but shouldn't some kind of
expiration have kicked in by now and freed up all those inactive pages? The
*instant* I manually push a "3" into drop_caches, I have 100% of my normal
free memory and the inactive page count drops below 2k. Maybe I completely
misunderstood the purpose of all those dials, but I really did get the
feeling that twisting them all tight would make the housekeeping algorithms
more aggressive.

What, if anything, besides manually echoing a "3" to drop_caches will cause
all those inactive pages to be put back on the free list?
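For reference, the manual flush being described is the following (requires root; a sync first matters, because drop_caches only discards *clean* pagecache):

```shell
# Flush dirty data so the pagecache is clean, then drop it.
# Values for drop_caches: 1 = pagecache, 2 = dentries/inodes, 3 = both.
sync
echo 3 > /proc/sys/vm/drop_caches
grep -E 'MemFree|Inactive' /proc/meminfo   # observe the effect
```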

-----Original Message-----
From: Aucoin [mailto:Aucoin@Houston.RR.com] 
Sent: Sunday, December 03, 2006 5:57 PM
To: 'Tim Schmielau'
Cc: 'Andrew Morton'; 'torvalds@osdl.org'; 'linux-kernel@vger.kernel.org';
'clameter@sgi.com'
Subject: RE: la la la la ... swappiness

We want it to swap less for this particular operation because it is low
priority compared to the rest of what's going on inside the box.

We've considered both artificially manipulating swap on the fly, similar to
your suggestion, as well as a parallel thread that pumps a 3 into drop_caches
every few seconds while the update is running, but these seem too much like
hacks for our liking. Mind you, if we don't have a choice we'll do what we
need to get the job done, but there's a nagging voice in our conscience that
says keep looking for a more elegant solution and work *with* the kernel
rather than working against it or trying to trick it into doing what we
want.
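The periodic-flush hack could be sketched like this (the sentinel file and 5-second interval are hypothetical; requires root):

```shell
#!/bin/sh
# Hack: while the update is running, keep pushing clean pagecache out
# every few seconds so the tar's I/O cannot crowd out the applications.
while [ -e /var/run/update.pid ]; do   # hypothetical "update running" sentinel
    sync                               # make pagecache clean so it can be dropped
    echo 3 > /proc/sys/vm/drop_caches
    sleep 5
done
```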

We've already disabled OOM so we can at least keep our testing alive while
searching for a more elegant solution. Although we want to avoid swap in
this particular instance for this particular reason, in our hearts we agree
with Andrew that swap can be your friend and get you out of a jam once in a
while. Even more, we'd like to leave OOM active if we can because we want to
be told when somebody's not being a good memory citizen.

Some background: what we've done is carve up a huge chunk of memory that is
shared between three resident processes as write cache for a proprietary
block system layout that is part of a scalable storage architecture
currently capable of RAID 0, 1, 5 (soon 6), virtualized across multiple
chassis, essentially treating each machine as a "disk" and providing
multipath I/O to multiple iSCSI targets as part of a grid/array storage
solution. Whew! We also have a version that leverages a battery-backed write
cache for higher performance at an additional cost. This software is
installable on any commodity platform with 4-N disks supported by Linux;
I've even put it on an Optiplex with 4 simulated disks. Yawn ... yet another
iSCSI storage solution, but this one scales linearly in capacity as well as
performance. As such, we have no user-level apps on the boxes and precious
little disk to spare for additional swap, so our version of the swap
manipulation solution is to turn swap completely off for the duration of the
update.
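That swap-off variant might look like this (paths are hypothetical; requires root):

```shell
# Disable all swap for the duration of the update so the kernel
# cannot swap out the resident data movers, then re-enable it.
swapoff -a
tar xzf /tmp/update.tar.gz -C /opt/staging   # hypothetical payload and staging dir
swapon -a
```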

I hope I haven't muddied things up even more, but basically what we want to
do is find a way to limit the number of cached pages for disk I/O on the OS
filesystem, even if it drastically slows down the untar and verify process,
because the disk I/O we really care about is not on any of the OS
partitions.
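One way to keep an untar-and-verify from polluting pagecache, assuming a GNU coreutils new enough to support dd's nocache flag (8.11+, i.e. later than this thread), is to advise the kernel to drop each extracted file's cached pages after verifying it. A sketch with hypothetical paths and manifest:

```shell
# Extract and verify, then ask the kernel to discard the cache for
# every file we just touched: dd with count=0 and iflag=nocache issues
# POSIX_FADV_DONTNEED for the whole file without reading any data.
tar xf /tmp/update.tar -C /opt/staging
( cd /opt/staging && md5sum -c MANIFEST.md5 )   # hypothetical checksum manifest
find /opt/staging -type f \
    -exec dd if={} iflag=nocache count=0 status=none \;
```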

Louis Aucoin

-----Original Message-----
From: Tim Schmielau [mailto:tim@physik3.uni-rostock.de] 
Sent: Sunday, December 03, 2006 2:47 PM
To: Aucoin
Cc: 'Andrew Morton'; torvalds@osdl.org; linux-kernel@vger.kernel.org;
clameter@sgi.com
Subject: RE: la la la la ... swappiness

On Sun, 3 Dec 2006, Aucoin wrote:

> during tar extraction ... inactive pages reaches levels as high as ~375000

So why do you want the system to swap _less_? You need to find some free 
memory for the additional processes to run in, and you have lots of 
inactive pages, so I think you want to swap out _more_ pages.

I'd suggest temporarily adding a swapfile before you update your system.
This can even help bring your memory use back to its previous state if you
do it like this:
  - swapon additional swapfile
  - update your database software
  - swapoff swap partition
  - swapon swap partition
  - swapoff additional swapfile
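Spelled out with a hypothetical 1 GB swapfile and swap partition name, that dance would be roughly (requires root):

```shell
# 1. Add a temporary swapfile so there is room to page things out.
dd if=/dev/zero of=/swapfile.tmp bs=1M count=1024   # hypothetical path/size
chmod 600 /swapfile.tmp
mkswap /swapfile.tmp
swapon /swapfile.tmp
# 2. ... run the update here ...
# 3. Cycle the real swap partition so anything paged out to it is
#    pulled back into RAM (or migrated onto the swapfile) ...
swapoff /dev/sda2 && swapon /dev/sda2               # hypothetical device
# 4. ... then remove the temporary swapfile, pulling its pages back too.
swapoff /swapfile.tmp && rm /swapfile.tmp
```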

Tim



* RE: la la la la ... swappiness
@ 2006-12-03  6:18 Aucoin
  0 siblings, 0 replies; 39+ messages in thread
From: Aucoin @ 2006-12-03  6:18 UTC (permalink / raw)
  To: akpm, torvalds, linux-kernel, clameter


Reformatted as plain text.

________________________________________
From: Aucoin [mailto:Aucoin@Houston.RR.com] 
Sent: Sunday, December 03, 2006 12:17 AM
To: 'akpm@osdl.org'; 'torvalds@osdl.org'; 'linux-kernel@vger.kernel.org';
'clameter@sgi.com'
Subject: la la la la ... swappiness

I set swappiness to zero and it doesn't do what I want!

I have a system that runs as a Linux-based data server 24x7, and occasionally
I need to apply an update or patch. It's a BIIIG patch, to the tune of
several hundred megabytes; let's say 600MB for a good round number. The
server software itself runs within very tight memory boundaries: I've
preallocated a large chunk of memory that is shared amongst several
processes as a form of application cache, and there is barely 15% spare
memory floating around.

The update is delivered to the server as a tar file. In order to minimize
down time I untar this update and verify the contents landed correctly
before switching over to the updated software.

The problem is that when I attempt to untar the payload, disk I/O starts
filling the cache, the inactive page count reels wildly out of control, the
system starts swapping, OOM fires, and there goes my 4 9's uptime. My system
just suffered a catastrophic failure because I can't control pagecache
growth due to disk I/O. I need a pagecache throttle; what do you suggest?
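One family of knobs that acts as a crude pagecache throttle for writers (a sketch, not necessarily the list's recommendation; requires root) is the dirty-writeback set:

```shell
# Force early, aggressive writeback so the tar cannot accumulate a
# huge pool of dirty pagecache: kick background writeback almost
# immediately, and block writers once dirty pages hit 2% of RAM.
echo 1   > /proc/sys/vm/dirty_background_ratio
echo 2   > /proc/sys/vm/dirty_ratio
echo 100 > /proc/sys/vm/dirty_writeback_centisecs   # wake pdflush every 1s
echo 100 > /proc/sys/vm/dirty_expire_centisecs      # pages dirty >1s get written
```

Note this limits *dirty* pagecache; clean cached pages can still grow until memory pressure reclaims them.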






end of thread, other threads:[~2006-12-05 13:50 UTC | newest]

Thread overview: 39+ messages
     [not found] <200612030616.kB36GYBs019873@ms-smtp-03.texas.rr.com>
2006-12-03  8:08 ` la la la la ... swappiness Andrew Morton
2006-12-03 15:40   ` Aucoin
2006-12-03 20:46     ` Tim Schmielau
2006-12-03 23:56       ` Aucoin
2006-12-04  0:57         ` Horst H. von Brand
2006-12-04  4:56         ` Andrew Morton
2006-12-04  5:13           ` Linus Torvalds
2006-12-04 17:03             ` Christoph Lameter
2006-12-04 10:43         ` Nick Piggin
2006-12-04 14:45           ` Aucoin
2006-12-04 15:04             ` Nick Piggin
2006-12-05  4:02               ` Aucoin
2006-12-05  4:46                 ` Linus Torvalds
2006-12-05  6:41                   ` Aucoin
2006-12-05  7:01                     ` Nick Piggin
2006-12-05  7:26                       ` Rene Herman
2006-12-05 13:27                         ` Aucoin
2006-12-05 13:49                           ` Rene Herman
2006-12-05 13:25                       ` Aucoin
2006-12-04 19:02 Al Boldi
  -- strict thread matches above, loose matches on Subject: below --
2006-12-04  1:54 Aucoin
2006-12-04  4:59 ` Andrew Morton
2006-12-04  7:22 ` Kyle Moffett
2006-12-04 14:39   ` Aucoin
2006-12-04 16:10     ` Chris Friesen
2006-12-04 17:07     ` Horst H. von Brand
2006-12-04 17:49       ` Aucoin
2006-12-04 18:44         ` Tim Schmielau
2006-12-04 21:28           ` Aucoin
2006-12-04 18:46         ` Horst H. von Brand
2006-12-04 21:43           ` Aucoin
2006-12-04 18:06       ` Andrew Morton
2006-12-04 18:15         ` Christoph Lameter
2006-12-04 18:38           ` Jeffrey Hundstad
2006-12-04 21:25             ` Aucoin
2006-12-04 21:43               ` Andrew Morton
2006-12-04 15:55   ` David Lang
2006-12-04 17:42     ` Aucoin
2006-12-03  6:18 Aucoin
