From: Linus Torvalds <torvalds@osdl.org>
To: Theodore Tso <tytso@mit.edu>
Cc: Martin Langhoff <martin@catalyst.net.nz>,
Git Mailing List <git@vger.kernel.org>,
Junio C Hamano <junkio@cox.net>,
Johannes.Schindelin@gmx.de, spyderous@gentoo.org,
smurf@smurf.noris.de
Subject: Re: [PATCH] cvsimport: introduce -L<imit> option to workaround memory leaks
Date: Tue, 23 May 2006 09:05:36 -0700 (PDT)
Message-ID: <Pine.LNX.4.64.0605230848060.5623@g5.osdl.org>
In-Reply-To: <20060523153636.GA21506@thunk.org>
On Tue, 23 May 2006, Theodore Tso wrote:
> On Mon, May 22, 2006 at 07:28:37PM -0700, Linus Torvalds wrote:
> >
> > I actually think that I found a real ext3 performance bug from trying to
> > determine why git sometimes slows down ridiculously when the tree has been
> > allowed to go too long without a repack.
>
> Do you have dir_index (the hashed btree) feature enabled by any chance?
No, and I know I probably should, since it would hopefully help git usage.
But my problem actually happens even with moderately sized directories:
they were just 40kB or so in size, and the problem isn't high system CPU
usage, but tons of extra IO. I ran things on a machine with 2GB of RAM,
and as far as I could tell, the working set _should_ have fit into memory,
but CPU utilization was consistently in the 1% range.
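For anyone who wants to check how large the directories in their own unpacked repository are, here is a minimal sketch in Python. It is not part of git: the function name `directory_sizes` is made up for illustration, and it assumes that the `st_size` reported for a directory reflects the space the filesystem has allocated for its entries, which should hold on ext2/ext3 but may not on other filesystems.

```python
import os


def directory_sizes(root):
    """Return (path, dir_size_bytes, entry_count) for root and every subdirectory.

    dir_size_bytes is the st_size of the directory itself, not the sum of
    the sizes of the files it contains.
    """
    results = []
    for dirpath, dirnames, filenames in os.walk(root):
        st = os.stat(dirpath)
        results.append((dirpath, st.st_size, len(dirnames) + len(filenames)))
    return results


def largest_directories(root, limit=10):
    """Return the `limit` directories under root with the largest st_size."""
    by_size = sorted(directory_sizes(root), key=lambda r: r[1], reverse=True)
    return by_size[:limit]
```

Calling `largest_directories(".git/objects")` from the top of a working tree would list the object fan-out directories that have grown the most since the last repack.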
Now, it's possible that I'm just wrong, and it really didn't fit in
memory, but I _suspect_ that the issue is that ext3 directory handling
still uses the "buffer_head" thing rather than the page cache, and that we
simply don't LRU the memory appropriately so we don't let the memory
pressure expand the buffer cache.
Now, using buffer cache in this day and age is insane and horrible
(there's a reason I suspect the LRU doesn't work that well: the buffer
heads aren't supposed to be used as a cache, and people are supposed to
use the page cache for it these days), but Andrew tells me that the whole
JBD thing basically requires it. Whatever.
Now, repacking obviously hides it entirely (because then the load becomes
entirely a page-cache load, and the kernel does _that_ beautifully), but
I'm a bit bummed that I think I hit an ext3 braindamage.
So an unpacked git archive on ext3 (but not ext2, I believe: ext2 should
use the page cache for directories) ends up being very buffer-cache
intensive. And the buffer cache is basically deprecated..
Linus
PS. I'll see if I can figure out the problem, and maybe the good news is
that I'll be able to just fix a real kernel performance issue. Still,
there's a _reason_ we tried to get away from the buffer heads as a caching
entity..