git.vger.kernel.org archive mirror
From: Linus Torvalds <torvalds@osdl.org>
To: Theodore Tso <tytso@mit.edu>
Cc: Martin Langhoff <martin@catalyst.net.nz>,
	Git Mailing List <git@vger.kernel.org>,
	Junio C Hamano <junkio@cox.net>,
	Johannes.Schindelin@gmx.de, spyderous@gentoo.org,
	smurf@smurf.noris.de
Subject: Re: [PATCH] cvsimport: introduce -L<imit> option to workaround memory leaks
Date: Tue, 23 May 2006 09:05:36 -0700 (PDT)
Message-ID: <Pine.LNX.4.64.0605230848060.5623@g5.osdl.org>
In-Reply-To: <20060523153636.GA21506@thunk.org>



On Tue, 23 May 2006, Theodore Tso wrote:
> On Mon, May 22, 2006 at 07:28:37PM -0700, Linus Torvalds wrote:
> > 
> > I actually think that I found a real ext3 performance bug from trying to 
> > determine why git sometimes slows down ridiculously when the tree has been 
> > allowed to go too long without a repack.
> 
> Do you have dir_index (the hashed btree) feature enabled by any chance?

No, and I know I probably should, since it would hopefully help git usage.

But my problem actually happens even with moderately sized directories: 
they were just 40kB or so in size, and the problem isn't high system CPU 
usage, but tons of extra IO. I ran things on a machine with 2GB of RAM, 
and as far as I could tell, the working set _should_ have fit into memory, 
but CPU utilization was consistently in the 1% range.

Now, it's possible that I'm just wrong, and it really didn't fit in 
memory, but I _suspect_ that the issue is that ext3 directory handling 
still uses the "buffer_head" thing rather than the page cache, and that we 
simply don't LRU the memory appropriately so we don't let the memory 
pressure expand the buffer cache.

Now, using buffer cache in this day and age is insane and horrible 
(there's a reason I suspect the LRU doesn't work that well: the buffer 
heads aren't supposed to be used as a cache, and people are supposed to 
use the page cache for it these days), but Andrew tells me that the whole 
JBD thing basically requires it. Whatever.

Now, repacking obviously hides it entirely (because then the load becomes 
entirely a page-cache load, and the kernel does _that_ beautifully), but 
I'm a bit bummed that I think I hit an ext3 braindamage.

So an unpacked git archive on ext3 (but not ext2, I believe: ext2 should 
use the page cache for directories) ends up being very buffer-cache 
intensive. And the buffer cache is basically deprecated..

		Linus

PS. I'll see if I can figure out the problem, and maybe the good news is 
that I'll be able to just fix a real kernel performance issue. Still, 
there's a _reason_ we tried to get away from the buffer heads as a caching 
entity..
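
For anyone wanting to check whether their own unpacked repository has directories in the size range mentioned above (around 40kB), here is a minimal sketch. It is not from the original message. The helper name `loose_object_dir_sizes` is hypothetical, and it assumes the directories in question are the two-hex-digit fan-out directories under `.git/objects`. It reports the on-disk size of each directory itself (as returned by `os.stat`), not the size of the objects inside it.

```python
import os

HEX_DIGITS = set("0123456789abcdef")


def loose_object_dir_sizes(objects_dir):
    """Map each loose-object fan-out directory name to a tuple of
    (size of the directory itself in bytes, number of entries in it).

    Hypothetical helper for illustration; assumes the usual layout where
    loose objects live in subdirectories named 00..ff.
    """
    sizes = {}
    for name in sorted(os.listdir(objects_dir)):
        path = os.path.join(objects_dir, name)
        # Skip pack/, info/ and anything else that is not a fan-out directory.
        if len(name) != 2 or not set(name) <= HEX_DIGITS:
            continue
        if not os.path.isdir(path):
            continue
        # st_size of a directory is the size of the directory file, which is
        # what grows as more entries are added to it.
        sizes[name] = (os.stat(path).st_size, len(os.listdir(path)))
    return sizes
```

Calling `loose_object_dir_sizes(".git/objects")` from the top of a working tree returns a dictionary that can be sorted by size to find the largest directories. What the reported directory size means depends on the filesystem, so the numbers are only comparable on the same kind of filesystem.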
