* Atrocious icache/dcache in 2.4.2
From: Pete Zaitcev @ 2001-04-27 19:01 UTC (permalink / raw)
To: linux-kernel; +Cc: zaitcev
Hello:
My box here slows down dramatically after a while, and starts
behaving as if it has very little memory, e.g. programs page
each other out. It turns out that out of 40MB total, about
35MB is used for the dcache and icache, and the system basically
runs in 5MB of RAM.
When I tried to discuss it with riel, viro, and others,
I got an immediate and very strong knee-jerk reaction: "we fixed
it in 2.4.4-pre4!", "we gotta call prune_dcache more!".
That just does not sound persuasive to me.
After a little thinking it seems apparent to me that it
may be a good thing to have the VM take pages from the dentry and
inode pools directly. This sounds almost like what slab does,
so let me speculate about it (it is a bad idea, but it is
interesting _why_).
Suppose that we do this: when an inode becomes clean (e.g. unlocked,
written to disk if it was changed), drop it into kmem_cache_free(),
but retain it on the hash (forget about poisoning for a moment).
Then, if memory is needed, the VM may ask slab, slab calls our
destructors, and the destructors take the inode off the hash. The idea
solves the problem, but has two marks against it. First, when
we look up an inode, we hit either a dirty one or a "clean" one, which
is free. Then we have to do kmem_cache_alloc(), and that will
return the wrong inode, which we have to drop from the hash, then do
a memcpy from the old "really free" one, etc. It still saves disk
I/O, but it is messy. The other thing is fragmentation: suppose we
have a bunch of slabs, each with a single dirty inode in it
(tar xf -). Memory pressure will be powerless to do anything
about them.
So, I have a better crackpot idea: create a fake filesystem,
say "inodefs". When inodes are needed, we pretend to read
pages from that filesystem, but in fact we just zero most
of them and put inodes there; each page also needs a "used"
counter, like slab has. When an inode is dirty, we mark
its pages locked or dirty; if it is only clean, we mark the pages
as dirty. The VM will automatically try to get pages, and
write out those that are "dirty". At that moment,
we have the option to check whether any used (clean or dirty) inodes
are inside the page. If they are, we either move them into
some other (fragmented) pages, or just remove them from
the hashes and pretend that the page was written.
The bad part is that the inode cache code and inodefs would have
part of the slab machinery replicated in them. Dunno if that is
bad enough to bury the thing.
If you have read to this point, let me know what you think.
-- Pete
* Re: Atrocious icache/dcache in 2.4.2
From: Christoph Hellwig @ 2001-04-27 19:13 UTC (permalink / raw)
To: Pete Zaitcev; +Cc: linux-kernel
Hi Pete,
In article <20010427150114.A23960@devserv.devel.redhat.com> you wrote:
> After a little thinking it seems apparent to me that it
> may be a good thing to have the VM take pages from the dentry and
> inode pools directly. This sounds almost like what slab does,
> so let me speculate about it (it is a bad idea, but it is
> interesting _why_).
>
> Suppose that we do this: when an inode becomes clean (e.g. unlocked,
> written to disk if it was changed), drop it into kmem_cache_free(),
> but retain it on the hash (forget about poisoning for a moment).
> Then, if memory is needed, the VM may ask slab, slab calls our
> destructors, and the destructors take the inode off the hash. The idea
> solves the problem, but has two marks against it. First, when
> we look up an inode, we hit either a dirty one or a "clean" one, which
> is free. Then we have to do kmem_cache_alloc(), and that will
> return the wrong inode, which we have to drop from the hash, then do
> a memcpy from the old "really free" one, etc. It still saves disk
> I/O, but it is messy. The other thing is fragmentation: suppose we
> have a bunch of slabs, each with a single dirty inode in it
> (tar xf -). Memory pressure will be powerless to do anything
> about them.
It looks like you want the SLAB cache ->reclaim method we seem
to have forgotten when cloning the Solaris SLAB interface nearly
1:1...
Christoph
--
Of course it doesn't work. We've performed a software upgrade.
* Re: Atrocious icache/dcache in 2.4.2
From: Alexander Viro @ 2001-04-27 19:40 UTC (permalink / raw)
To: Pete Zaitcev; +Cc: linux-kernel
On Fri, 27 Apr 2001, Pete Zaitcev wrote:
> Hello:
>
> My box here slows down dramatically after a while, and starts
> behaving as if it has very little memory, e.g. programs page
> each other out. It turns out that out of 40MB total, about
> 35MB is used for the dcache and icache, and the system basically
> runs in 5MB of RAM.
>
> When I tried to discuss it with riel, viro, and others,
> I got an immediate and very strong knee-jerk reaction: "we fixed
> it in 2.4.4-pre4!", "we gotta call prune_dcache more!".
> That just does not sound persuasive to me.
[snip]
> written to disk if was changed), drop it into kmem_cache_free(),
> but retain on hash (forget about poisoning for a momemt).
What for?
I'm with you until now. But why bother keeping them resurrectable?
They are not referred to by dentries. They have no I/O happening on
them. Why retain them in the cache for long?
Notice that the icache is behind the dcache, so you are looking at
second-order effects here. With the data you've shown on #kernel
it looks like half of your icache is just sitting there for no
good reason and slows down hash lookups.
It makes sense to retain them for a while, but an inode sitting there
unreferenced by anything for minutes is dead weight and nothing
else.
Notice that the actual percentage of needlessly held inodes is higher -
2.4.2 _really_ keeps stale stuff in the dcache, and that means stale
stuff in the icache. I.e. the only reference is from a dentry that hasn't
been touched by anything for a _long_ time.
IOW, we just need to make sure that unreferenced inodes get freed
once they are not dirty / not locked. Fast. No need to keep them
on the hash - just free them for real. Moreover, that will get
fragmentation down.
* Atrocious icache/dcache in 2.4.2
From: Ed Tomlinson @ 2001-04-28 2:36 UTC (permalink / raw)
To: linux-kernel; +Cc: linux-mm, Alexander Viro, Pete Zaitcev
Hi,
Here is a patch that prunes unused, clean inodes from the icache
faster. I have previously tried cleaning the unused dirty
icache entries from the same place in kswapd but did not find
it to be much of a win.
This code does the following. It factors prune_icache to use
prune_unused_icache to actually prune the list. This makes it
simple to add a shrink_unused_icache_memory routine which gets
plugged into the kswapd loop. The result is that the icache is kept
smaller.
fast_prune.diff
-----
--- 2.4.4-pre7/include/linux/dcache.h Thu Apr 26 12:57:47 2001
+++ linux/include/linux/dcache.h Fri Apr 27 18:17:20 2001
@@ -176,6 +176,7 @@
/* icache memory management (defined in linux/fs/inode.c) */
extern void shrink_icache_memory(int, int);
+extern void shrink_unused_icache_memory(int);
extern void prune_icache(int);
/* only used at mount-time */
--- 2.4.4-pre7/mm/vmscan.c Fri Apr 27 11:36:04 2001
+++ linux/mm/vmscan.c Fri Apr 27 18:33:07 2001
@@ -953,6 +953,11 @@
*/
refill_inactive_scan(DEF_PRIORITY, 0);
+ /*
+ * Free unused inodes.
+ */
+ shrink_unused_icache_memory(GFP_KSWAPD);
+
/* Once a second, recalculate some VM stats. */
if (time_after(jiffies, recalc + HZ)) {
recalc = jiffies;
--- 2.4.4-pre7/fs/inode.c Thu Apr 26 12:49:33 2001
+++ linux/fs/inode.c Fri Apr 27 18:54:25 2001
@@ -540,16 +540,16 @@
!inode_has_buffers(inode))
#define INODE(entry) (list_entry(entry, struct inode, i_list))
-void prune_icache(int goal)
+/*
+ * Called with inode lock held, returns with it released.
+ */
+int prune_unused_icache(int goal)
{
LIST_HEAD(list);
struct list_head *entry, *freeable = &list;
- int count = 0, synced = 0;
+ int count = 0;
struct inode * inode;
- spin_lock(&inode_lock);
-
-free_unused:
entry = inode_unused.prev;
while (entry != &inode_unused)
{
@@ -577,19 +577,27 @@
dispose_list(freeable);
+ return count;
+}
+
+/*
+ * A goal of zero frees everything
+ */
+void prune_icache(int goal)
+{
+ spin_lock(&inode_lock);
+ goal -= prune_unused_icache(goal);
+
/*
* If we freed enough clean inodes, avoid writing
- * dirty ones. Also giveup if we already tried to
- * sync dirty inodes.
+ * dirty ones.
*/
- if (!goal || synced)
+ if (!goal)
return;
- synced = 1;
-
spin_lock(&inode_lock);
try_to_sync_unused_inodes();
- goto free_unused;
+ prune_unused_icache(goal);
}
void shrink_icache_memory(int priority, int gfp_mask)
@@ -611,6 +619,20 @@
prune_icache(count);
kmem_cache_shrink(inode_cachep);
+}
+
+void shrink_unused_icache_memory(int gfp_mask)
+{
+ /*
+ * Nasty deadlock avoidance..
+ */
+ if (!(gfp_mask & __GFP_IO))
+ return;
+
+ if (spin_trylock(&inode_lock)) {
+ prune_unused_icache(0);
+ kmem_cache_shrink(inode_cachep);
+ }
}
/*
-----
Comments?
Ed Tomlinson <tomlins@cam.org>