Re: [2.4] heavy-load under swap space shortage

From: Marc-Christian Petersen <m.c.p@kernel.linux-systeme.com>
To: linux-kernel@vger.kernel.org
Cc: Marcelo Tosatti <marcelo.tosatti@cyclades.com>,
	j-nomura@ce.jp.nec.com, andrea@suse.de,
	Andrew Morton <akpm@osdl.org>,
	hugh@veritas.com
Subject: Re: [2.4] heavy-load under swap space shortage
Date: Wed, 26 May 2004 20:24:34 +0200
Message-ID: <200405262024.34905@WOLK>
In-Reply-To: <20040526124104.GF6439@logos.cnet>

[-- Attachment #1: Type: text/plain, Size: 606 bytes --]

On Wednesday 26 May 2004 14:41, Marcelo Tosatti wrote:

Marcelo,

> I think we can merge this patch.

I think this too =)


> Its very safe - default behaviour unchanged.
> Jun, are you willing to do another test for us if this gets merged
> in v2.4.27-pre4 ?
> Maybe we should document the VM tunables somewhere outside source code
> (Documentation/) ?

I think we should merge the attached patches to finally remove utterly bogus 
and non-existent documentation things and clean up stuff a bit and document 
the -aa VM bits.

Agreed?

Kinda same cleanups and more following soon for 2.6-mm.

ciao, Marc


[-- Attachment #2: 02_add-new-docu-VM.patch --]
[-- Type: text/x-diff, Size: 24952 bytes --]

--- a/Documentation/sysctl/vm.txt	2004-05-26 19:57:15.000000000 +0200
+++ b/Documentation/sysctl/vm.txt	2004-05-26 20:06:20.000000000 +0200
@@ -1,111 +1,143 @@
-Documentation for /proc/sys/vm/*	kernel version 2.4.19
-	(c) 1998, 1999,  Rik van Riel <riel@nl.linux.org>
+Documentation for /proc/sys/vm/*	Kernel version 2.4.26
+=============================================================
 
-For general info and legal blurb, please look in README.
+ (c) 1998, 1999, Rik van Riel <riel@nl.linux.org>
+    - Initial version
 
-==============================================================
+ (c) 2004, Marc-Christian Petersen <m.c.p@linux-systeme.com>
+    - Removed non-existent knobs which were removed in early
+      2.4 stages
+    - Corrected values for bdflush
+    - Documented missing tunables
+    - Documented aa-vm tunables
+
+
+
+For general info and legal blurb, please look in README.
+=============================================================
 
 This file contains the documentation for the sysctl files in
-/proc/sys/vm and is valid for Linux kernel version 2.4.
+/proc/sys/vm and is valid for Linux kernel v2.4.26.
 
 The files in this directory can be used to tune the operation
 of the virtual memory (VM) subsystem of the Linux kernel, and
-one of the files (bdflush) also has a little influence on disk
-usage.
+three of the files (bdflush, max-readahead, min-readahead)
+also have some influence on disk usage.
 
 Default values and initialization routines for most of these
-files can be found in mm/swap.c.
+files can be found in mm/vmscan.c, mm/page_alloc.c and
+mm/filemap.c.
 
 Currently, these files are in /proc/sys/vm:
 - bdflush
+- block_dump
 - kswapd
+- laptop_mode
+- max-readahead
+- min-readahead
 - max_map_count
 - overcommit_memory
 - page-cluster
 - pagetable_cache
+- vm_anon_lru
+- vm_cache_scan_ratio
+- vm_gfp_debug
+- vm_lru_balance_ratio
+- vm_mapped_ratio
+- vm_passes
+- vm_vfs_scan_ratio
+=============================================================
 
-==============================================================
 
-bdflush:
 
+bdflush:
+--------
 This file controls the operation of the bdflush kernel
 daemon. The source code to this struct can be found in
-linux/fs/buffer.c. It currently contains 9 integer values,
+fs/buffer.c. It currently contains 9 integer values,
 of which 6 are actually used by the kernel.
 
-From linux/fs/buffer.c:
---------------------------------------------------------------
-union bdflush_param {
-	struct {
-		int nfract;	/* Percentage of buffer cache dirty to
-				   activate bdflush */
-		int ndirty;	/* Maximum number of dirty blocks to write out per
-				   wake-cycle */
-		int dummy2;	/* old "nrefill" */
-		int dummy3;	/* unused */
-		int interval;	/* jiffies delay between kupdate flushes */
-		int age_buffer;	/* Time for normal buffer to age before we flush it */
-		int nfract_sync;/* Percentage of buffer cache dirty to
-				   activate bdflush synchronously */
-		int nfract_stop_bdflush; /* Percentage of buffer cache dirty to stop bdflush */
-		int dummy5;	/* unused */
-	} b_un;
-	unsigned int data[N_PARAM];
-} bdf_prm = {{30, 500, 0, 0, 5*HZ, 30*HZ, 60, 20, 0}};
---------------------------------------------------------------
-
-int nfract:
-The first parameter governs the maximum number of dirty
-buffers in the buffer cache. Dirty means that the contents
-of the buffer still have to be written to disk (as opposed
-to a clean buffer, which can just be forgotten about).
-Setting this to a high value means that Linux can delay disk
-writes for a long time, but it also means that it will have
-to do a lot of I/O at once when memory becomes short. A low
-value will spread out disk I/O more evenly, at the cost of
-more frequent I/O operations.  The default value is 30%,
-the minimum is 0%, and the maximum is 100%.
-
-int ndirty:
-The second parameter (ndirty) gives the maximum number of
-dirty buffers that bdflush can write to the disk in one time.
-A high value will mean delayed, bursty I/O, while a small
-value can lead to memory shortage when bdflush isn't woken
-up often enough.
-
-int interval:
-The fifth parameter, interval, is the minimum rate at
-which kupdate will wake and flush.  The value is expressed in
-jiffies (clockticks), the number of jiffies per second is
-normally 100 (Alpha is 1024). Thus, x*HZ is x seconds.  The
-default value is 5 seconds, the minimum is 0 seconds, and the
-maximum is 600 seconds.
-
-int age_buffer:
-The sixth parameter, age_buffer, governs the maximum time
-Linux waits before writing out a dirty buffer to disk.  The
-value is in jiffies.  The default value is 30 seconds,
-the minimum is 1 second, and the maximum 6,000 seconds.
-
-int nfract_sync:
-The seventh parameter, nfract_sync, governs the percentage
-of buffer cache that is dirty before bdflush activates
-synchronously.  This can be viewed as the hard limit before
-bdflush forces buffers to disk.  The default is 60%, the
-minimum is 0%, and the maximum is 100%.
-
-int nfract_stop_bdflush:
-The eighth parameter, nfract_stop_bdflush, governs the percentage
-of buffer cache that is dirty which will stop bdflush.
-The default is 20%, the miniumum is 0%, and the maxiumum is 100%.
-==============================================================
+nfract:		The first parameter governs the maximum
+		number of dirty buffers in the buffer
+		cache. Dirty means that the contents of the
+		buffer still have to be written to disk (as
+		opposed to a clean buffer, which can just be
+		forgotten about). Setting this to a high
+		value means that Linux can delay disk writes
+		for a long time, but it also means that it
+		will have to do a lot of I/O at once when
+		memory becomes short. A low value will
+		spread out disk I/O more evenly, at the cost
+		of more frequent I/O operations. The default
+		value is 30%, the minimum is 0%, and the
+		maximum is 100%.
+
+ndirty:		The second parameter (ndirty) gives the
+		maximum number of dirty buffers that bdflush
+		can write to the disk at one time. A high
+		value will mean delayed, bursty I/O, while a
+		small value can lead to memory shortage when
+		bdflush isn't woken up often enough. The
+		default value is 500 dirty buffers, the
+		minimum is 1, and the maximum is 50000.
+
+dummy2:		The third parameter is not used.
+
+dummy3:		The fourth parameter is not used.
+
+interval:	The fifth parameter, interval, is the minimum
+		rate at which kupdate will wake and flush.
+		The value is in jiffies (clockticks), the
+		number of jiffies per second is normally 100
+		(Alpha is 1024). Thus, x*HZ is x seconds. The
+		default value is 5 seconds, the minimum is 0
+		seconds, and the maximum is 10,000 seconds.
+
+age_buffer:	The sixth parameter, age_buffer, governs the
+		maximum time Linux waits before writing out a
+		dirty buffer to disk. The value is in jiffies.
+		The default value is 30 seconds, the minimum
+		is 1 second, and the maximum 10,000 seconds.
+
+sync:		The seventh parameter, nfract_sync, governs
+		the percentage of buffer cache that is dirty
+		before bdflush activates synchronously. This
+		can be viewed as the hard limit before
+		bdflush forces buffers to disk. The default
+		is 60%, the minimum is 0%, and the maximum
+		is 100%.
+
+stop_bdflush:	The eighth parameter, nfract_stop_bdflush,
+		governs the percentage of buffer cache that
+		is dirty which will stop bdflush. The default
+		is 20%, the minimum is 0%, and the maximum
+		is 100%.
+
+dummy5:		The ninth parameter is not used.
+
+So the default is: 30 500 0 0 500 3000 60 20 0   for 100 HZ.
+=============================================================
+
+
+
+block_dump:
+-----------
+It can happen that the disk still keeps spinning up and you
+don't quite know why or what causes it. The laptop mode patch
+has a little helper for that as well. When set to 1, it will
+dump info to the kernel message buffer about what process
+caused the I/O. Be careful when playing with this setting.
+It is advisable to shut down syslog first! The default is 0.
+=============================================================
+
 
-kswapd:
 
+kswapd:
+-------
 Kswapd is the kernel swapout daemon. That is, kswapd is that
 piece of the kernel that frees memory when it gets fragmented
-or full. Since every system is different, you'll probably want
-some control over this piece of the system.
+or full. Since every system is different, you'll probably
+want some control over this piece of the system.
 
 The numbers in this page correspond to the numbers in the
 struct pager_daemon {tries_base, tries_min, swap_cluster
@@ -117,39 +149,83 @@ tries_base	The maximum number of pages k
 		number. Usually this number will be divided
 		by 4 or 8 (see mm/vmscan.c), so it isn't as
 		big as it looks.
-		When you need to increase the bandwidth to/from
-		swap, you'll want to increase this number.
+		When you need to increase the bandwidth to/
+		from swap, you'll want to increase this
+		number.
+
 tries_min	This is the minimum number of times kswapd
 		tries to free a page each time it is called.
 		Basically it's just there to make sure that
 		kswapd frees some pages even when it's being
 		called with minimum priority.
+
 swap_cluster	This is the number of pages kswapd writes in
 		one turn. You want this large so that kswapd
 		does it's I/O in large chunks and the disk
-		doesn't have to seek often, but you don't want
-		it to be too large since that would flood the
-		request queue.
+		doesn't have to seek often, but you don't
+		want it to be too large since that would
+		flood the request queue.
+
+The default value is: 512 32 8.
+=============================================================
 
-==============================================================
 
-overcommit_memory:
 
-This value contains a flag that enables memory overcommitment.
-When this flag is 0, the kernel checks before each malloc()
-to see if there's enough memory left. If the flag is nonzero,
-the system pretends there's always enough memory.
+laptop_mode:
+------------
+Setting this to 1 switches the VM (and block layer) to laptop
+mode. Leaving it at 0 makes the kernel work like before. When
+in laptop mode, you also want to extend the intervals
+described in Documentation/laptop-mode.txt.
+See the laptop-mode.sh script for how to do that.
+
+The default value is 0.
+=============================================================
 
-This feature can be very useful because there are a lot of
-programs that malloc() huge amounts of memory "just-in-case"
-and don't use much of it.
 
-Look at: mm/mmap.c::vm_enough_memory() for more information.
 
-==============================================================
+max-readahead:
+--------------
+This tunable affects how early the Linux VFS will fetch the
+next block of a file into memory. File readahead values are
+determined on a per file basis in the VFS and are adjusted
+based on the behavior of the application accessing the file.
+Anytime the current position being read in a file plus the
+current read ahead value results in the file pointer pointing
+to the next block in the file, that block will be fetched
+from disk. By raising this value, the Linux kernel will allow
+the readahead value to grow larger, resulting in more blocks
+being prefetched from disk for applications which access files
+in a predictable, linear fashion. This can result in performance
+improvements, but can also result in excess (and often
+unnecessary) memory usage. Lowering this value has the
+opposite effect. By forcing readaheads to be less aggressive,
+memory may be conserved at a potential performance impact.
+
+The default value is 31.
+=============================================================
 
-max_map_count:
 
+
+min-readahead:
+--------------
+The counterpart of max-readahead: min-readahead places a floor
+on the readahead value. Raising this number forces a file's
+readahead value to be unconditionally higher, which can
+improve performance, provided that all file access in
+the system is predictably linear from the start to the end of
+a file. This of course results in higher memory usage from
+the pagecache. Conversely, lowering this value allows the
+kernel to conserve pagecache memory, at a potential
+performance cost.
+
+The default value is 3.
+=============================================================
+
+
+
+max_map_count:
+--------------
 This file contains the maximum number of memory map areas a
 process may have. Memory map areas are used as a side-effect
 of calling malloc, directly by mmap and mprotect, and also
@@ -159,10 +235,29 @@ While most applications need less than a
 certain programs, particularly malloc debuggers, may consume 
 lots of them, e.g. up to one or two maps per allocation.
 
-==============================================================
+The default value is 65536.
+=============================================================
+
+
+
+overcommit_memory:
+------------------
+This value contains a flag to enable memory overcommitment.
+When this flag is 0, the kernel checks before each malloc()
+to see if there's enough memory left. If the flag is nonzero,
+the system pretends there's always enough memory.
+
+This feature can be very useful because there are a lot of
+programs that malloc() huge amounts of memory "just-in-case"
+and don't use much of it. The default value is 0.
+
+Look at: mm/mmap.c::vm_enough_memory() for more information.
+=============================================================
+
 
-page-cluster:
 
+page-cluster:
+-------------
 The Linux VM subsystem avoids excessive disk seeks by reading
 multiple pages on a page fault. The number of pages it reads
 is dependent on the amount of memory in your machine.
@@ -170,11 +265,12 @@ is dependent on the amount of memory in 
 The number of pages the kernel reads in at once is equal to
 2 ^ page-cluster. Values above 2 ^ 5 don't make much sense
 for swap because we only cluster swap data in 32-page groups.
+=============================================================
 
-==============================================================
 
-pagetable_cache:
 
+pagetable_cache:
+----------------
 The kernel keeps a number of page tables in a per-processor
 cache (this helps a lot on SMP systems). The cache size for
 each processor will be between the low and the high value.
@@ -188,3 +284,98 @@ For large systems, the settings are prob
 systems they won't hurt a bit. For small systems (<16MB ram)
 it might be advantageous to set both values to 0.
 
+The default value is: 25 50.
+=============================================================
+
+
+
+vm_anon_lru:
+------------
+Selects whether to immediately insert anon pages in the LRU.
+Immediately means as soon as they're allocated during the page
+faults. If this is set to 0, they're inserted only after the
+first swapout.
+
+Having anon pages immediately inserted in the LRU allows the
+VM to know better when it's worthwhile to start swapping
+anonymous RAM; it will start to swap earlier and it should
+swap smoother and faster, but it will decrease scalability
+on >16-way machines by an order of magnitude. Big SMP/NUMA
+systems definitely can't take a hit on a global spinlock at
+every anon page allocation.
+
+Low-RAM machines that swap all the time want to turn
+this on (i.e. set to 1).
+
+The default value is 1.
+=============================================================
+
+
+
+vm_cache_scan_ratio:
+--------------------
+How much of the inactive LRU queue we will scan in one go.
+A value of 6 for vm_cache_scan_ratio implies that we'll scan
+1/6 of the inactive lists during a normal aging round.
+
+The default value is 6.
+=============================================================
+
+
+
+vm_gfp_debug:
+-------------
+If set, dump a stack trace when __alloc_pages fails. This will
+mostly happen during OOM conditions (hopefully ;)
+
+The default value is 0.
+=============================================================
+
+
+
+vm_lru_balance_ratio:
+---------------------
+Controls the balance between active and inactive cache. The
+bigger vm_lru_balance_ratio is, the easier the active cache
+will grow, because we'll rotate the active list slowly. A
+value of 2 means we'll go towards a balance of 1/3 of the
+cache being inactive.
+
+The default value is 2.
+=============================================================
+
+
+
+vm_mapped_ratio:
+----------------
+Controls the pageout rate: the smaller the value, the earlier
+we'll start to page out.
+
+The default value is 100.
+=============================================================
+
+
+
+vm_passes:
+----------
+The number of VM passes before failing the memory
+balancing. Take into account 3 passes are needed for a
+flush/wait/free cycle and that we only scan
+1/vm_cache_scan_ratio of the inactive list at each pass.
+
+The default value is 60.
+=============================================================
+
+
+
+vm_vfs_scan_ratio:
+------------------
+The proportion of the VFS queues we will scan in one go.
+A value of 6 for vm_vfs_scan_ratio implies that 1/6th of the
+unused-inode, dentry and dquot caches will be freed during a
+normal aging round.
+Big fileservers (NFS, SMB etc.) probably want to set this
+value to 3 or 2.
+
+The default value is 6.
+=============================================================
--- a/Documentation/filesystems/proc.txt	2004-05-23 00:08:31.000000000 +0200
+++ b/Documentation/filesystems/proc.txt	2004-05-23 02:33:41.000000000 +0200
@@ -936,172 +936,7 @@ program to load modules on demand.
 
 2.4 /proc/sys/vm - The virtual memory subsystem
 -----------------------------------------------
-
-The files  in  this directory can be used to tune the operation of the virtual
-memory (VM)  subsystem  of  the  Linux  kernel.  In addition, one of the files
-(bdflush) has some influence on disk usage.
-
-bdflush
--------
-
-This file  controls  the  operation of the bdflush kernel daemon. It currently
-contains nine  integer  values,  six of which are actually used by the kernel.
-They are listed in table 2-2.
-
-
-Table 2-2: Parameters in /proc/sys/vm/bdflush 
-..............................................................................
- Value      Meaning                                                            
- nfract     Percentage of buffer cache dirty to activate bdflush              
- ndirty     Maximum number of dirty blocks to  write out per wake-cycle        
- dummy      Unused                                                             
- dummy      Unused                                                             
- interval   jiffies delay between kupdate flushes
- age_buffer Time for normal buffer to age before we flush it                   
- nfract_sync Percentage of buffer cache dirty to activate bdflush synchronously
- nfract_stop_bdflush Percetange of buffer cache dirty to stop bdflush
- dummy      Unused                                                             
-..............................................................................
-
-nfract
-------
-
-This parameter  governs  the  maximum  number  of  dirty buffers in the buffer
-cache. Dirty means that the contents of the buffer still have to be written to
-disk (as  opposed  to  a  clean  buffer,  which  can just be forgotten about).
-Setting this  to  a  higher value means that Linux can delay disk writes for a
-long time, but it also means that it will have to do a lot of I/O at once when
-memory becomes short. A lower value will spread out disk I/O more evenly.
-
-interval
---------
-
-The interval between two kupdate runs. The value is expressed in
-jiffies (clockticks),  the  number of jiffies per second is 100.
-
-ndirty
-------
-
-Ndirty gives the maximum number of dirty buffers that bdflush can write to the
-disk at  one  time.  A high value will mean delayed, bursty I/O, while a small
-value can lead to memory shortage when bdflush isn't woken up often enough.
-
-age_buffer
-----------
-
-Finally, the age_buffer parameter govern the maximum time Linux
-waits before  writing  out  a  dirty buffer to disk. The value is expressed in
-jiffies (clockticks),  the  number of jiffies per second is 100.
-
-nfract_sync
------------
-
-nfract_stop_bdflush
--------------------
-
-kswapd
-------
-
-Kswapd is  the  kernel  swap  out daemon. That is, kswapd is that piece of the
-kernel that  frees  memory when it gets fragmented or full. Since every system
-is different, you'll probably want some control over this piece of the system.
-
-The file contains three numbers:
-
-tries_base
-----------
-
-The maximum  number  of  pages kswapd tries to free in one round is calculated
-from this  number.  Usually  this  number  will  be  divided  by  4  or 8 (see
-mm/vmscan.c), so it isn't as big as it looks.
-
-When you  need to increase the bandwidth to/from swap, you'll want to increase
-this number.
-
-tries_min
----------
-
-This is  the  minimum number of times kswapd tries to free a page each time it
-is called. Basically it's just there to make sure that kswapd frees some pages
-even when it's being called with minimum priority.
-
-overcommit_memory
------------------
-
-This file  contains  one  value.  The following algorithm is used to decide if
-there's enough  memory:  if  the  value of overcommit_memory is positive, then
-there's always  enough  memory. This is a useful feature, since programs often
-malloc() huge  amounts  of  memory 'just in case', while they only use a small
-part of  it.  Leaving  this value at 0 will lead to the failure of such a huge
-malloc(), when in fact the system has enough memory for the program to run.
-
-On the  other  hand,  enabling this feature can cause you to run out of memory
-and thrash the system to death, so large and/or important servers will want to
-set this value to 0.
-
-pagetable_cache
----------------
-
-The kernel  keeps a number of page tables in a per-processor cache (this helps
-a lot  on  SMP systems). The cache size for each processor will be between the
-low and the high value.
-
-On a  low-memory,  single  CPU system, you can safely set these values to 0 so
-you don't  waste  memory.  It  is  used  on SMP systems so that the system can
-perform fast  pagetable allocations without having to acquire the kernel memory
-lock.
-
-For large  systems,  the  settings  are probably fine. For normal systems they
-won't hurt  a  bit.  For  small  systems  (  less  than  16MB ram) it might be
-advantageous to set both values to 0.
-
-swapctl
--------
-
-This file  contains  no less than 8 variables. All of these values are used by
-kswapd.
-
-The first four variables
-* sc_max_page_age,
-* sc_page_advance,
-* sc_page_decline and
-* sc_page_initial_age
-are used  to  keep  track  of  Linux's page aging. Page aging is a bookkeeping
-method to  track  which pages of memory are often used, and which pages can be
-swapped out without consequences.
-
-When a  page  is  swapped in, it starts at sc_page_initial_age (default 3) and
-when the  page  is  scanned  by  kswapd,  its age is adjusted according to the
-following scheme:
-
-* If  the  page  was used since the last time we scanned, its age is increased
-  by sc_page_advance  (default  3).  Where  the  maximum  value  is  given  by
-  sc_max_page_age (default 20).
-* Otherwise  (meaning  it wasn't used) its age is decreased by sc_page_decline
-  (default 1).
-
-When a page reaches age 0, it's ready to be swapped out.
-
-The variables  sc_age_cluster_fract, sc_age_cluster_min, sc_pageout_weight and
-sc_bufferout_weight, can  be  used  to  control  kswapd's  aggressiveness  in
-swapping out pages.
-
-Sc_age_cluster_fract is used to calculate how many pages from a process are to
-be scanned by kswapd. The formula used is
-
-(sc_age_cluster_fract divided by 1024) times resident set size
-
-So if you want kswapd to scan the whole process, sc_age_cluster_fract needs to
-have a  value  of  1024.  The  minimum  number  of  pages  kswapd will scan is
-represented by sc_age_cluster_min, which is done so that kswapd will also scan
-small processes.
-
-The values  of  sc_pageout_weight  and sc_bufferout_weight are used to control
-how many  tries  kswapd  will make in order to swap out one page/buffer. These
-values can  be used to fine-tune the ratio between user pages and buffer/cache
-memory. When  you find that your Linux system is swapping out too many process
-pages in  order  to  satisfy  buffer  memory  demands,  you may want to either
-increase sc_bufferout_weight, or decrease the value of sc_pageout_weight.
+Please read Documentation/sysctl/vm.txt
 
 2.5 /proc/sys/dev - Device specific parameters
 ----------------------------------------------
@@ -1719,10 +1719,3 @@ need to  recompile  the kernel, or even 
 command to write value into these files, thereby changing the default settings
 of the kernel.
 ------------------------------------------------------------------------------
-
-
-
-
-
-
-
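
Editorial sketch, not part of the patch: the bdflush text added above lists nine integers, two of which (interval and age_buffer) are in jiffies. The Python snippet below labels a bdflush line with the field names from union bdflush_param and converts the jiffies fields to seconds. The function name parse_bdflush and the "_seconds" keys are made up for illustration, and HZ=100 is an assumption (the text notes that Alpha uses 1024).

```python
# Field order follows union bdflush_param in fs/buffer.c as quoted in the patch.
BDFLUSH_FIELDS = (
    "nfract", "ndirty", "dummy2", "dummy3", "interval",
    "age_buffer", "nfract_sync", "nfract_stop_bdflush", "dummy5",
)

# These two fields are expressed in jiffies according to the documentation.
JIFFY_FIELDS = ("interval", "age_buffer")


def parse_bdflush(text, hz=100):
    """Map the nine bdflush integers to names; add seconds for jiffies fields.

    hz is the number of jiffies per second (assumed 100 here).
    """
    values = [int(token) for token in text.split()]
    if len(values) != len(BDFLUSH_FIELDS):
        raise ValueError(
            "expected %d values, got %d" % (len(BDFLUSH_FIELDS), len(values))
        )
    params = dict(zip(BDFLUSH_FIELDS, values))
    for name in JIFFY_FIELDS:
        # Hypothetical helper keys, e.g. "interval_seconds".
        params[name + "_seconds"] = params[name] / hz
    return params


if __name__ == "__main__":
    # Default line given in the new vm.txt text for HZ=100.
    defaults = parse_bdflush("30 500 0 0 500 3000 60 20 0")
    for name in BDFLUSH_FIELDS:
        print("%-20s %d" % (name, defaults[name]))
    print("interval   = %.1f s" % defaults["interval_seconds"])
    print("age_buffer = %.1f s" % defaults["age_buffer_seconds"])
```

On a 2.4 system the input string would come from reading /proc/sys/vm/bdflush; the sketch takes it as an argument so it can be run anywhere.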

[-- Attachment #3: 01_remove-old-docu-VM.patch --]
[-- Type: text/x-diff, Size: 5132 bytes --]

--- a/Documentation/sysctl/vm.txt	2002-11-28 16:53:08.000000000 -0700
+++ b/Documentation/sysctl/vm.txt	2003-11-12 17:35:11.000000000 -0700
@@ -18,13 +18,10 @@
 
 Currently, these files are in /proc/sys/vm:
 - bdflush
-- buffermem
-- freepages
 - kswapd
 - max_map_count
 - overcommit_memory
 - page-cluster
-- pagecache
 - pagetable_cache
 
 ==============================================================
@@ -102,38 +99,6 @@
 of buffer cache that is dirty which will stop bdflush.
 The default is 20%, the miniumum is 0%, and the maxiumum is 100%.
 ==============================================================
-buffermem:
-
-The three values in this file correspond to the values in
-the struct buffer_mem. It controls how much memory should
-be used for buffer memory. The percentage is calculated
-as a percentage of total system memory.
-
-The values are:
-min_percent	-- this is the minimum percentage of memory
-		   that should be spent on buffer memory
-borrow_percent  -- UNUSED
-max_percent     -- UNUSED
-
-==============================================================
-freepages:
-
-This file contains the values in the struct freepages. That
-struct contains three members: min, low and high.
-
-The meaning of the numbers is:
-
-freepages.min	When the number of free pages in the system
-		reaches this number, only the kernel can
-		allocate more memory.
-freepages.low	If the number of free pages gets below this
-		point, the kernel starts swapping aggressively.
-freepages.high	The kernel tries to keep up to this amount of
-		memory free; if memory comes below this point,
-		the kernel gently starts swapping in the hopes
-		that it never has to do real aggressive swapping.
-
-==============================================================
 
 kswapd:
 
@@ -208,24 +173,6 @@
 
 ==============================================================
 
-pagecache:
-
-This file does exactly the same as buffermem, only this
-file controls the struct page_cache, and thus controls
-the amount of memory used for the page cache.
-
-In 2.2, the page cache is used for 3 main purposes:
-- caching read() data from files
-- caching mmap()ed data and executable files
-- swap cache
-
-When your system is both deep in swap and high on cache,
-it probably means that a lot of the swapped data is being
-cached, making for more efficient swapping than possible
-with the 2.0 kernel.
-
-==============================================================
-
 pagetable_cache:
 
 The kernel keeps a number of page tables in a per-processor
--- a/Documentation/filesystems/proc.txt	2004-05-21 22:54:13.000000000 +0200
+++ b/Documentation/filesystems/proc.txt	2004-05-23 00:08:09.000000000 +0200
@@ -999,54 +999,6 @@ nfract_sync
 nfract_stop_bdflush
 -------------------
 
-buffermem
----------
-
-The three  values  in  this  file  control  how much memory should be used for
-buffer memory.  The  percentage  is calculated as a percentage of total system
-memory.
-
-The values are:
-
-min_percent
------------
-
-This is  the  minimum  percentage  of  memory  that  should be spent on buffer
-memory.
-
-borrow_percent
---------------
-
-When Linux is short on memory, and the buffer cache uses more than it has been
-allotted, the  memory  management  (MM)  subsystem will prune the buffer cache
-more heavily than other memory to compensate.
-
-max_percent
------------
-
-This is the maximum amount of memory that can be used for buffer memory.
-
-freepages
----------
-
-This file contains three values: min, low and high:
-
-min
----
-When the  number  of  free  pages  in the system reaches this number, only the
-kernel can allocate more memory.
-
-low
----
-If the number of free pages falls below this point, the kernel starts swapping
-aggressively.
-
-high
-----
-The kernel  tries  to  keep  up to this amount of memory free; if memory falls
-below this point, the kernel starts gently swapping in the hopes that it never
-has to do really aggressive swapping.
-
 kswapd
 ------
 
@@ -1073,16 +1025,6 @@ This is  the  minimum number of times ks
 is called. Basically it's just there to make sure that kswapd frees some pages
 even when it's being called with minimum priority.
 
-swap_cluster
-------------
-
-This is probably the greatest influence on system performance.
-
-swap_cluster is  the  number  of  pages kswapd writes in one turn. You'll want
-this value  to  be  large  so that kswapd does its I/O in large chunks and the
-disk doesn't  have  to  seek  as  often, but you don't want it to be too large
-since that would flood the request queue.
-
 overcommit_memory
 -----------------
 
@@ -1097,15 +1039,6 @@ On the  other  hand,  enabling this feat
 and thrash the system to death, so large and/or important servers will want to
 set this value to 0.
 
-pagecache
----------
-
-This file  does exactly the same job as buffermem, only this file controls the
-amount of memory allowed for memory mapping and generic caching of files.
-
-You don't  want  the  minimum level to be too low, otherwise your system might
-thrash when memory is tight or fragmentation is high.
-
 pagetable_cache
 ---------------

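
Editorial sketch, not part of either patch: two of the rules in the vm.txt text are plain arithmetic, and the Python snippet below spells them out. page-cluster reads 2 ^ page-cluster pages at once, and vm_cache_scan_ratio / vm_vfs_scan_ratio scan 1/ratio of a queue per aging round. The function names are hypothetical, and rounding down to whole entries is an assumption; the kernel's exact rounding is not stated in the documentation.

```python
from fractions import Fraction


def pages_read_at_once(page_cluster):
    """Number of pages read in one go: 2 ^ page-cluster."""
    if page_cluster < 0:
        raise ValueError("page_cluster must be non-negative")
    return 1 << page_cluster


def scan_fraction(ratio):
    """Fraction of a queue scanned per aging round: 1/ratio."""
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    return Fraction(1, ratio)


def entries_scanned(queue_length, ratio):
    """Entries scanned in one round, rounded down (assumed rounding)."""
    return int(queue_length * scan_fraction(ratio))


if __name__ == "__main__":
    # page-cluster = 5 gives the 32-page swap cluster mentioned in the text.
    print("page-cluster 5:", pages_read_at_once(5), "pages")
    # Default ratio 6 versus the value 3 suggested for big fileservers.
    for ratio in (6, 3):
        print("ratio", ratio, "scans", scan_fraction(ratio), "of the queue;",
              entries_scanned(6000, ratio), "of 6000 entries")
```

The 6000-entry queue is an arbitrary example size; the comparison only shows that lowering the ratio from 6 to 3 doubles the share scanned per round.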
Thread overview: 42+ messages
2004-02-02 10:12 [2.4] heavy-load under swap space shortage j-nomura
2004-02-02 13:29 ` Hugh Dickins
2004-02-03  7:53   ` j-nomura
2004-02-03 17:19     ` Hugh Dickins
2004-02-04 11:40       ` j-nomura
2004-02-05 18:42         ` Hugh Dickins
2004-02-06  9:03           ` j-nomura
2004-03-10 10:57           ` j-nomura
2004-03-14 19:47             ` Marcelo Tosatti
2004-03-14 19:54               ` Rik van Riel
2004-03-14 20:15               ` Andrew Morton
     [not found]                 ` <20040314230138.GV30940@dualathlon.random>
2004-03-14 23:22                   ` Andrew Morton
2004-03-15  0:14                     ` Andrea Arcangeli
2004-03-15  4:38                       ` Nick Piggin
2004-03-15 11:49                         ` Andrea Arcangeli
2004-03-15 13:23                           ` Rik van Riel
2004-03-15 14:37                             ` Nick Piggin
2004-03-15 14:50                               ` Andrea Arcangeli
2004-03-15 18:35                                 ` Andrew Morton
2004-03-15 18:51                                   ` Andrea Arcangeli
2004-03-15 19:02                                     ` Andrew Morton
2004-03-15 21:55                                       ` Andrea Arcangeli
2004-03-15 22:05                                 ` Nick Piggin
2004-03-15 22:24                                   ` Andrea Arcangeli
2004-03-15 22:41                                     ` Nick Piggin
2004-03-15 22:44                                       ` Andrea Arcangeli
2004-03-15 22:41                                     ` Rik van Riel
2004-03-15 23:32                                       ` Andrea Arcangeli
2004-03-16  6:27                                         ` Nick Piggin
2004-03-16  7:25                                   ` Marcelo Tosatti
2004-03-16  6:31                     ` Marcelo Tosatti
2004-03-16 13:47                       ` Andrea Arcangeli
2004-03-16 16:59                         ` Marcelo Tosatti
2004-11-22 15:01                     ` Lazily add anonymous pages to LRU on v2.4? was " Marcelo Tosatti
2004-11-22 19:49                       ` Andrea Arcangeli
2004-11-22 15:58                         ` Marcelo Tosatti
2004-05-26 12:41             ` Marcelo Tosatti
2004-05-26 18:24               ` Marc-Christian Petersen [this message]
2004-05-27 11:16                 ` Marcelo Tosatti
2004-05-26 19:06               ` Hugh Dickins
2004-05-26 22:23               ` Andrea Arcangeli
2004-05-28  2:55               ` j-nomura
