From: Andrew Morton <akpm@digeo.com>
To: Ed Tomlinson <tomlins@cam.org>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: [PATCH] slabasap-mm5_A2
Date: Sun, 08 Sep 2002 14:47:59 -0700
Message-ID: <3D7BC58F.D8AC82E8@digeo.com>
In-Reply-To: 200209081714.54110.tomlins@cam.org
Ed Tomlinson wrote:
>
> On September 8, 2002 04:56 pm, Andrew Morton wrote:
> > Ed Tomlinson wrote:
> > > Hi,
> > >
> > > Here is a rewritten slablru - this time it's not using the LRU... It
> > > changes long-standing slab behavior. Now slab.c releases pages as soon
> > > as possible. This was done since we noticed that slablru was taking a
> > > long time to release the pages it freed - from other vm experiences this
> > > is not a good thing.
> >
> > Right. There remains the issue that we're ripping away constructed
> > objects from slabs which have constructors, as Stephen points out.
>
> I have a small optimization coded in slab. If there are not any free
> slab objects I do not free the page. If we have problems with high
> order slabs we can change this to be if we do not have <n> objects
> do not free it.
OK.
> > I doubt if that matters. slab constructors just initialise stuff.
> > If the memory is in cache then the initialisation is negligible.
> > If the memory is not in cache then the initialisation will pull
> > it into cache, which is something which we needed to do anyway. And
> > unless the slab's access pattern is extremely LIFO, chances are that
> > most allocations will come in from part-filled slab pages anyway.
> >
> > And other such waffly words ;) I'll do the global LIFO page hotlists
> > soon; that'll fix it up.
> >
> > > In this patch I have tried to make as few changes as possible.
> >
> > Thanks. I've shuffled the patching sequence (painful), and diddled
> > a few things. We actually do have the "number of scanned pages"
> > in there, so we can use that. I agree that the ratio should be
> > nr_scanned/total rather than nr_reclaimed/total. This way, if
> > nr_reclaimed < nr_scanned (page reclaim is in trouble) then we
> > put more pressure on slabs.
>
> OK, will change this. This also means the changes to the prune functions
> made for slablru will come back - they convert these functions so they
> age <n> objects rather than purge <n>.
That would make the slab pruning less aggressive than the code I'm
testing now. I'm not sure it needs that change. Not sure...
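
For reference, the nr_scanned/total versus nr_reclaimed/total choice quoted
above can be illustrated with a small sketch. This is not code from the
patch: the function name slab_prune_target() and its parameters are
hypothetical, and it only shows the proportional arithmetic under the
assumption that the prune count is scaled linearly by the ratio.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): scale the number of cache
 * objects to prune by the fraction of active+inactive pages that page
 * reclaim looked at on this pass.
 *
 *   nr_pages      - pages scanned (or reclaimed) by this pass
 *   lru_pages     - total active + inactive pages
 *   cache_objects - objects currently held by the pruneable cache
 */
uint64_t slab_prune_target(uint64_t nr_pages, uint64_t lru_pages,
                           uint64_t cache_objects)
{
    assert(lru_pages > 0); /* avoid dividing by zero */

    /* Multiply first so that small ratios are not truncated to zero. */
    return nr_pages * cache_objects / lru_pages;
}
```

With 1000 LRU pages and 100000 cached objects, a pass that scans 32 pages
but reclaims only 8 would ask for 3200 objects when driven by nr_scanned
and 800 when driven by nr_reclaimed, which is the extra slab pressure
under reclaim trouble described above.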
> > > With this in mind I am using
> > > the percentage of the active+inactive pages reclaimed to recover the same
> > > percentage of the pruneable caches. In slablru the effect was to age the
> > > pruneable caches by percentage of the active+inactive pages scanned -
> > > this could be done but required more code so I went with pages reclaimed.
> > > The same choice was made about accounting of pages freed by the
> > > shrink_<something>_memory calls.
> > >
> > > There is also a question as to if we should only use the ZONE_DMA and
> > > ZONE_NORMAL to drive the cache shrinking. Talk with Rik on irc convinced
> > > me to go with the choice that required less code, so we use all zones.
> >
> > OK. We could do with a `gimme_the_direct_addressed_classzone' utility
> > anyway. It is currently open-coded in fs/buffer.c:free_more_memory().
> > We can just pull that out of there and use memclass() on it for this.
>
> Ah thanks. Was wondering the best way to do this. Will read the code.
Then again, shrinking slab harder for big highmem machines is good ;)
> ...
> > From a quick test, the shrinking rate seems quite reasonable to
> > me. mem=512m, with twenty megs of ext2 inodes in core, a `dd'
> > of one gigabyte (twice the size of memory) steadily pushed the
> > ext2 inodes down to 2.5 megs (although total memory was still
> > 9 megs - internal fragmentation of the slab).
> >
> > A second 1G dd pushed it down to 1M/3M.
> >
> > A third 1G dd pushed it down to .25M/1.25M
> >
> > Seems OK.
> >
> > A few things we should do later:
> >
> > - We're calling prune_icache with a teeny number of inodes, many times.
> > Would be better to batch that up a bit.
>
> Why not move the prunes to try_to_free_pages? This should help a little to get
> bigger batches of pages as will using the number of scanned pages.
But the prunes are miles too small at present. We go into try_to_free_pages()
and reclaim 32 pages. And we also call into prune_cache() and free about 0.3
pages. It's out of whack. I'd suggest not calling out to the pruner until
we want at least several pages' worth of objects.
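
A minimal sketch of that batching idea, under stated assumptions: the
struct, the function name and the PRUNE_BATCH threshold are all
hypothetical, and the threshold is counted in objects rather than pages
to keep the example short. Small requests accumulate and the pruner is
only asked to do work once the pending total is large enough.

```c
#include <assert.h>

/* Assumed threshold: do not call the pruner for fewer objects than this. */
#define PRUNE_BATCH 128

struct prune_batcher {
    unsigned long pending; /* objects requested but not yet pruned */
};

/*
 * Record a request to prune nr objects. Returns 0 while the batch is
 * still too small, otherwise returns the accumulated count that the
 * caller should hand to the real prune function, and resets the batch.
 */
unsigned long prune_batch_add(struct prune_batcher *b, unsigned long nr)
{
    assert(b);

    b->pending += nr;
    if (b->pending < PRUNE_BATCH)
        return 0;

    nr = b->pending;
    b->pending = 0;
    return nr;
}
```

The caller keeps one batcher per pruneable cache and only invokes the
pruner when the return value is non-zero.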
> ...
> > But let's get the current code settled in before doing these
> > refinements.
>
> I can get the aging changes to you real fast if you want them. I initially
> coded it this way, then pulled the changes to reduce the code... see below
No rush.
> The other thing we want to be careful with is to make sure the lack of
> free page accounting is detected by oom - we definitely do not want to
> oom when slab has freed memory but try_to_free_pages does not
> realize it.
How much memory are we talking about here? Not much I think?
> > There are some usage patterns in which the dentry/inode aging
> > might be going wrong. Try, with mem=512m
> >
> > cp -a linux a
> > cp -a linux b
> > cp -a linux c
> >
> > etc.
> >
> > Possibly the inode/dentry cache is just being FIFO here and is doing
> > exactly the wrong thing. But the dcache referenced-bit logic should
> > cause the inodes in `linux' to be pinned with this test, so that
> > should be OK. Dunno.
> >
> > The above test will be hurt a bit by the aggressively lowered (10%)
> > background writeback threshold - more reads competing with writes.
> > Maybe I should not kick off background writeback until the dirty
> > threshold reaches 30% if there are reads queued against the device.
> > That's easy enough to do.
> >
> > drop-behind should help here too.
>
> This converts the prunes in inode and dcache to age <n> entries rather
> than purge them. Think this is the more correct behavior. Code is from
> slablru.
Makes sense (I think).
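
One way to read "age <n> entries rather than purge them", sketched with
hypothetical names (this is not the slablru code): the pass examines at
most nr live entries, gives recently referenced ones a second chance by
clearing their bit, and frees only the unreferenced ones. A purge would
instead keep going until nr entries had actually been freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a dentry or inode on the unused list. */
struct cache_entry {
    bool referenced; /* set when the entry is used */
    bool freed;      /* set once the entry has been released */
};

/*
 * Age up to nr live entries. A referenced entry has its bit cleared and
 * survives this pass; an unreferenced entry is freed. Returns the number
 * of entries freed, which may be well below nr.
 */
size_t age_entries(struct cache_entry *e, size_t count, size_t nr)
{
    size_t freed = 0;

    assert(e || count == 0);

    for (size_t i = 0; i < count && nr > 0; i++) {
        if (e[i].freed)
            continue;          /* already gone, does not use up budget */
        nr--;
        if (e[i].referenced) {
            e[i].referenced = false; /* second chance */
            continue;
        }
        e[i].freed = true;
        freed++;
    }
    return freed;
}
```

Because a pass may free fewer than nr entries, this is less aggressive
than purging, which matches the concern raised earlier in this message.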
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/