From: Josh MacDonald <jmacd@CS.Berkeley.EDU>
To: Linus Torvalds <torvalds@transmeta.com>
Cc: linux-kernel@vger.kernel.org, reiserfs-list@namesys.com,
reiserfs-dev@namesys.com
Subject: Re: Note describing poor dcache utilization under high memory pressure
Date: Tue, 29 Jan 2002 09:27:01 -0800
Message-ID: <20020129092701.C8740@helen.CS.Berkeley.EDU>
In-Reply-To: <20020128091338.D6578@helen.CS.Berkeley.EDU> <Pine.LNX.4.33.0201280930130.1557-100000@penguin.transmeta.com>
Quoting Linus Torvalds (torvalds@transmeta.com):
>
> On Mon, 28 Jan 2002, Josh MacDonald wrote:
> >
> > So, it would seem that the dcache and kmem_slab_cache memory allocator
> > could benefit from a way to shrink the dcache in a less random way.
> > Any thoughts?
>
> The way I want to solve this problem generically is to basically get rid
> of the special-purpose memory shrinkers, and have everything done with one
> unified interface, namely the physical-page-based "writeout()" routine. We
> do that for the page cache, and there's nothing that says that we couldn't
> do the same for all other caches, including very much the slab allocator.
>
> Thus any slab user that wants to, could just register their own per-page
> memory pressure logic. The dcache "reference" bit would go away, to be
> replaced by a per-page reference bit (that part could be done already, of
> course, and might help a bit on its own).
>
> Basically, the more different "pools" of memory we have, the harder it
> gets to balance them. Clearly, the optimal number of pools from a
> balancing standpoint is just a single, direct physical pool.
>
> Right now we have several pools - we have the pure physical LRU, we have
> the virtual mapping (where scanning is directly tied to the physical LRU,
> but where the separate pool still _does_ pose some problems), and we have
> separate balancing for inodes, dentries and quota. And there's no question
> that it hurts us under memory pressure.
>
> (There's a related question, which is whether other caches might also
> benefit from being able to grow more - right now there are some caches
> that are of a limited size partly because they have no good way of
> shrinking back on demand).
Using a physical-page-based "writeout()" routine seems like a nice way
to unify the application of memory pressure to various caches, but it
does not address the issue of fragmentation within a cache slab. You
could have a situation in which a number of hot dcache entries are
occupying some number of pages, such that dcache pages are always more
recently used than other pages in the system. Would the VM ever tell
the dcache to writeout() in that case?
It seems that the current special-purpose memory "shrinkers" approach
has some advantages in this regard: when memory pressure is applied,
every cache attempts to free some resources. Do you envision the
unified interface approach applying pressure to pages of every kind of
cache under memory pressure?
Even so, the physical-page writeout() approach results in a less
effective cache under memory pressure. Suppose the VM chooses some
number of least-recently-used physical pages belonging to the dcache
and tells the slab allocator to release those pages. Assume that the
dcache entries are not currently in use and that the dcache is in fact
able to release them. Some of the dcache entries being tossed from
memory could instead replace less-recently-used objects on more
recently-used physical pages. In other words, the dcache would
benefit from relocating its more frequently used entries onto the same
physical pages under memory pressure.
Unless the cache ejects entries based on object access rather than
physical-page access, the situation will never improve: pages with
hot dcache entries will never clean out the inactive entries on the
same page. For this reason, I don't think it makes sense to eliminate
the object-based aging of cache entries altogether.
Perhaps a combination of the approaches would work best. When the VM
system begins forcing the dcache to writeout(), the dcache could both
release some of its pages by ejecting all their entries (as above) and
run something like prune_dcache(), creating free space in the hotter
set of physical pages so that, over a period of prolonged memory
pressure, the hotter dcache entries would eventually end up on the
same pages.
A solution that relocates dcache entries to reduce total page
consumption, however, would make the most effective use of cache space.
-josh
--
PRCS version control system http://sourceforge.net/projects/prcs
Xdelta storage & transport http://sourceforge.net/projects/xdelta
Need a concurrent skip list? http://sourceforge.net/projects/skiplist