git.vger.kernel.org archive mirror
From: mkoegler@auto.tuwien.ac.at (Martin Koegler)
To: Nicolas Pitre <nico@cam.org>
Cc: david@lang.hm, Jon Smirl <jonsmirl@gmail.com>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: RAM consumption when working with the gcc repo
Date: Sat, 8 Dec 2007 18:24:37 +0100
Message-ID: <20071208172437.GA24442@auto.tuwien.ac.at>
In-Reply-To: <alpine.LFD.0.99999.0712071529580.555@xanadu.home>

On Fri, Dec 07, 2007 at 03:46:30PM -0500, Nicolas Pitre wrote:
> On Fri, 7 Dec 2007, david@lang.hm wrote:
> > On Fri, 7 Dec 2007, Jon Smirl wrote:
> > 
> > > I noticed two things when doing a repack of the gcc repo. First is
> > > that the git process is getting to be way too big. Turning off the
> > > delta caches had minimal impact. Why does the process still grow to
> > > 4.8GB?
> > > 
> > > Putting this in perspective, this is a 4.8GB process constructing a
> > > 330MB file. Something isn't right. Memory leak or inefficient data
> > > structure?
> > 
> > keep in mind that that 330MB file is _very_ heavily compressed. the simple
> > zlib compression is probably getting you 10:1 or 20:1 compression and the
> > delta compression is a significant multiplier on top of that.
> 
> Doesn't matter.  Something is indeed fishy.
> 
> The bulk of pack-objects memory consumption can be estimated as follows:
> 
> 1M objects * sizeof(struct object_entry) ~= 100MB
> 256 window entries with data (assuming a big 1MB per entry) = 256MB

For each (uncompressed) object in the delta window, a delta index is
also created. It can be as large as the uncompressed object itself.

Each thread has its own window, so using 4 threads means having up to
1024 objects in memory => 1 GB (at the assumed 1MB per entry).
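As a rough sketch of that arithmetic (the function name is hypothetical, the 1MB object size is the assumption from the quoted estimate, and the delta index is taken at its worst case of being as large as the object):

```python
MB = 1024 * 1024

def window_memory(window_entries, threads, object_size, index_ratio=1.0):
    """Worst-case bytes held by the delta windows of all threads.

    index_ratio is the delta index size relative to the uncompressed
    object; 1.0 models the "same size as the object" worst case.
    """
    objects_in_memory = window_entries * threads
    object_bytes = objects_in_memory * object_size
    index_bytes = int(object_bytes * index_ratio)
    return objects_in_memory, object_bytes, index_bytes

# 256 window entries, 4 threads, 1MB per object (assumed)
count, obj_bytes, idx_bytes = window_memory(256, 4, 1 * MB)
print(count)                           # objects held across all windows
print(obj_bytes // MB)                 # MB of uncompressed object data
print((obj_bytes + idx_bytes) // MB)   # MB once delta indexes are added
```

With these inputs the windows alone hold 1024 objects, i.e. 1024 MB of object data, and up to 2048 MB if every delta index reaches the size of its object.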

> Delta result caching was disabled therefore 0MB
> read-side delta cache limited to 16MB
> 
> So the purely ram allocation might get to roughly 400MB.
> 
> Then add the pack and index map, which, depending on the original pack
> size, might be 2GB.
> 
> So we're pessimistically talking of about 2.5GB of virtual space.
> 
> The other 2.3GB is hard to explain.
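To see how much of the gap the per-thread windows and delta indexes could account for, here is a minimal sketch that redoes the quoted budget. It assumes 1MB per window entry, 4 threads, and a delta index as large as each object; these are worst-case assumptions, not measurements, and the variable names are illustrative only.

```python
# All figures in MB, taken from the estimate quoted above.
object_entries = 100        # 1M objects * sizeof(struct object_entry)
delta_result_cache = 0      # delta result caching disabled
read_side_cache = 16        # read-side delta cache limit
pack_and_index_map = 2048   # "might be 2GB"

def total_mb(window_mb):
    """Sum the fixed parts of the budget plus the given window memory."""
    return (object_entries + window_mb + delta_result_cache
            + read_side_cache + pack_and_index_map)

# One window of 256 entries at 1MB each, as in the quoted estimate.
single_window = total_mb(256)
# 4 threads, each with its own 256-entry window, plus a delta index
# as large as each object (hence the factor 2).
four_windows = total_mb(4 * 256 * 2)

print(single_window)   # 2420
print(four_windows)    # 4212
```

Under these assumptions the pessimistic total moves from about 2.4GB to about 4.2GB, which would leave far less of the 4.8GB unexplained.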

Kind regards, Martin Kögler
