public inbox for linux-kernel@vger.kernel.org
From: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
To: James Pearson <james-p@moving-picture.com>
Cc: linux-kernel@vger.kernel.org, Andrew Morton <akpm@osdl.org>
Subject: Re: Reducing inode cache usage on 2.4?
Date: Sat, 18 Dec 2004 13:02:26 -0200
Message-ID: <20041218150226.GC31040@logos.cnet>
In-Reply-To: <41C37AB6.10906@moving-picture.com>

On Sat, Dec 18, 2004 at 12:32:54AM +0000, James Pearson wrote:
> Marcelo Tosatti wrote:
> >
> >>Or am I looking in completely the wrong place i.e. the inode cache is 
> >>not the problem?
> >
> >
> >No, in your case the extreme inode/dcache sizes indeed seem to be a 
> >problem. 
> >The default kernel shrinking ratio can be tuned for enhanced reclaim 
> >efficiency.
> >
> >
> >>xfs_inode         931428 931428    408 103492 103492    1 :  124   62
> >>dentry_cache      499222 518850    128 17295 17295    1 :  252  126
> >
> >
> >vm_vfs_scan_ratio:
> >------------------
> >is what proportion of the VFS queues we will scan in one go.
> >A value of 6 for vm_vfs_scan_ratio implies that 1/6th of the
> >unused-inode, dentry and dquot caches will be freed during a
> >normal aging round.
> >Big fileservers (NFS, SMB etc.) probably want to set this
> >value to 3 or 2.
> >
> >The default value is 6.
> >=============================================================
> >
> >Tune /proc/sys/vm/vm_vfs_scan_ratio increasing the value to 10 and so on 
> >and examine the results.
> 
> Thanks for the info - but doesn't increasing the value of 
> vm_vfs_scan_ratio mean that less of the caches will be freed?

Right - what I said was wrong - it's the other way around:
Decreasing the value increases the percentage of VFS caches scanned at each "aging pass".
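
To make the direction of the knob concrete, here is a minimal sketch, assuming
an aging pass covers 1/vm_vfs_scan_ratio of the unused entries as the
documentation quoted above describes. vfs_entries_scanned() is a hypothetical
helper for illustration, not a kernel function.

```c
#include <assert.h>

/*
 * Hypothetical helper, not kernel code: how many unused cache entries
 * one aging pass covers, assuming the pass scans 1/ratio of them.
 */
unsigned long vfs_entries_scanned(unsigned long nr_unused, unsigned int ratio)
{
	assert(ratio > 0);	/* a zero ratio would divide by zero */
	return nr_unused / ratio;
}
```

Taking the 931428 figure from the xfs_inode slab line purely as an example
count, a ratio of 6 covers 155238 entries per pass, a ratio of 2 covers 465714,
and a ratio of 10 covers only 93142.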

Andrew has now changed the aging pass.

Quoting him: "If the machine is full of unmapped clean pagecache pages the kernel 
won't even try to reclaim inodes".

With that change, vm_vfs_scan_ratio is more meaningful.

kswapd is woken up as soon as a zone's low watermark is reached, and will
work to free pages until it reaches the zone's high watermark.
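
The low/high behaviour is a simple hysteresis. A rough model of it, assuming
kswapd starts at or below "low" and stops at or above "high" (struct zone_model
and kswapd_should_run() are illustrative names, not the kernel's, and the exact
comparisons in the real code may differ):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one zone; field names are illustrative only. */
struct zone_model {
	unsigned long free_pages;
	unsigned long low;	/* kswapd is woken at or below this */
	unsigned long high;	/* kswapd stops once this is reached */
	bool kswapd_active;
};

/* Update kswapd_active from the current free page count and return it. */
bool kswapd_should_run(struct zone_model *z)
{
	assert(z->low < z->high);
	if (z->free_pages <= z->low)
		z->kswapd_active = true;	/* low watermark hit: wake up */
	else if (z->free_pages >= z->high)
		z->kswapd_active = false;	/* high watermark reached: sleep */
	/* between the two watermarks the previous state is kept */
	return z->kswapd_active;
}
```

The gap between the two watermarks is what determines how much work kswapd
does per wakeup.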

There are three zones: DMA (1), Normal (2) and Highmem (3).

 * On machines where it is needed (eg PCs) we divide physical memory
 * into multiple physical zones. On a PC we have 3 zones:
 *
 * ZONE_DMA       < 16 MB       ISA DMA capable memory
 * ZONE_NORMAL  16-896 MB       direct mapped by the kernel
 * ZONE_HIGHMEM  > 896 MB       only page cache and user processes

Each zone's min, low and high watermarks are derived from its size
(realsize) using the following calculation (mm/page_alloc.c):

	mask = (realsize / zone_balance_ratio[j]);
	if (mask < zone_balance_min[j])
		mask = zone_balance_min[j];
	else if (mask > zone_balance_max[j])
		mask = zone_balance_max[j];
	zone->watermarks[j].min = mask;
	zone->watermarks[j].low = mask*2;
	zone->watermarks[j].high = mask*3;
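
As a worked example, the same arithmetic can be pulled out into a standalone
function. This is only a sketch: struct watermarks and compute_watermarks() are
hypothetical names, and the ratio, minimum and maximum are passed in by the
caller rather than read from the kernel's zone_balance_* arrays.

```c
#include <assert.h>

/* Hypothetical container for the three per-zone watermarks, in pages. */
struct watermarks {
	unsigned long min, low, high;
};

/*
 * Mirror of the calculation above: divide the zone size by the balance
 * ratio, clamp the result to [bal_min, bal_max], then scale by 1, 2, 3.
 */
struct watermarks compute_watermarks(unsigned long realsize,
				     unsigned long ratio,
				     unsigned long bal_min,
				     unsigned long bal_max)
{
	struct watermarks w;
	unsigned long mask;

	assert(ratio > 0);
	mask = realsize / ratio;
	if (mask < bal_min)
		mask = bal_min;
	else if (mask > bal_max)
		mask = bal_max;

	w.min = mask;
	w.low = mask * 2;
	w.high = mask * 3;
	return w;
}
```

Assuming a ratio of 128 with a minimum of 20 and a maximum of 255 (values
chosen for illustration), a zone of 225280 pages (880 MB of 4 KB pages) gives
225280/128 = 1760, which is clamped to 255, so min/low/high come out as
255/510/765 pages. A 4096-page zone gives 32/64/96. The clamp means that on a
large zone the watermarks stay small relative to the zone size.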

To trigger the normal aging round earlier, the "low" watermark has to be increased,
but it is better to increase the "high" watermark, which makes kswapd keep working
until that higher free page watermark is reached. One can try, for example:

 zone->watermarks[j].high = mask*4;

But hopefully, with Andrew's change, you won't need such a modification (it would be
nice if these were all boot configurable, BTW).
