git.vger.kernel.org archive mirror
From: Shawn Pearce <spearce@spearce.org>
To: Nicolas Pitre <nico@cam.org>
Cc: git@vger.kernel.org, Jon Smirl <jonsmirl@gmail.com>
Subject: Re: Packfile can't be mapped
Date: Mon, 28 Aug 2006 12:42:22 -0400
Message-ID: <20060828164222.GA22451@spearce.org>
In-Reply-To: <Pine.LNX.4.64.0608280014190.3683@localhost.localdomain>

Nicolas Pitre <nico@cam.org> wrote:
> On Sun, 27 Aug 2006, Shawn Pearce wrote:
> 
> > I'm going to try to get tree deltas written to the pack sometime this
> > week. That should compact this intermediate pack down to something
> > that git-pack-objects would be able to successfully mmap into a
> > 32 bit address space.  A complete repack with no delta reuse will
> > hopefully generate a pack closer to 400 MB in size.  But I know
> > Jon would like to get that pack even smaller.  :)
> 
> One thing to consider in your code (if you didn't implement that 
> already) is to _not_ attempt any delta on any object whose size is 
> smaller than 50 bytes, and then limit the maximum delta size to 
> object_size/2 - 20 (use that for the last argument to diff-delta() and 
> store the undeltified object when diff-delta returns NULL).  This way 
> you'll avoid creating delta objects that are most likely to end up being 
> _larger_ than the undeltified object.
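
For readers following along, the rule quoted above can be sketched
roughly as below.  This is only an illustration in Python, not the
fast-import code: the names max_delta_size, encode_object and the
make_delta callback are hypothetical stand-ins (make_delta plays the
role of diff-delta and is assumed to return None when the delta would
exceed the given limit), and integer division is assumed for
object_size/2.

```python
from typing import Callable, Optional, Tuple

MIN_DELTA_OBJECT_SIZE = 50  # bytes; threshold taken from the quoted suggestion


def max_delta_size(object_size: int) -> Optional[int]:
    """Return the delta size cap, or None when no delta should be attempted."""
    if object_size < MIN_DELTA_OBJECT_SIZE:
        return None
    # object_size/2 - 20, using integer division (an assumption of this sketch)
    return object_size // 2 - 20


def encode_object(
    base: bytes,
    target: bytes,
    make_delta: Callable[[bytes, bytes, int], Optional[bytes]],
) -> Tuple[str, bytes]:
    """Pick between a delta and the full object.

    make_delta is a hypothetical stand-in for diff-delta: it is expected
    to return None when the delta would be larger than the limit passed in.
    """
    limit = max_delta_size(len(target))
    if limit is None:
        return ("full", target)  # too small to be worth deltifying
    delta = make_delta(base, target, limit)
    if delta is None:
        return ("full", target)  # delta too large, store undeltified
    return ("delta", delta)
```

With these assumptions a 100-byte object gets a delta cap of 30 bytes,
and anything under 50 bytes is always stored whole.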

So I added Nico's suggestions to fast-import and ran it on a small
subset of the Mozilla repository (3424 blobs):

  naive always delta: 6652 KiB
  Nico's suggestion:  6842 KiB

So Nico's suggestion of limiting delta size to (orig_len/2)-20 and
not using deltas on blobs < 50 bytes actually added 190 KiB to the
output pack.  Since this sample is probably fairly representative
of the rest of the repository's blobs, I'm thinking we may see a 2.8%
increase in size over the current 930 MB blob pack.  That's another
26 MB in our intermediate pack.  I don't think this suggestion is
really worth including in fast-import right now...
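
The estimate can be reproduced with a few lines of Python.  This is
just a sketch of the arithmetic using the figures quoted in this
message; the variable names are made up for the example, and it assumes
the sample's growth ratio carries over to the full blob pack.

```python
# Pack sizes measured on the 3424-blob sample, in KiB
naive_kib = 6652    # always attempt a delta
limited_kib = 6842  # with the size threshold and delta cap applied

# Size of the current blob pack, in MB
current_pack_mb = 930

added_kib = limited_kib - naive_kib
growth = added_kib / naive_kib

# Assumes the sample is representative of the whole repository
projected_extra_mb = current_pack_mb * growth

print(f"added: {added_kib} KiB")
print(f"growth: {growth:.2%}")
print(f"projected extra: {projected_extra_mb:.1f} MB")
```

Without rounding, the growth comes out a little above 2.8%, so the
projection lands between 26 and 27 MB.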

-- 
Shawn.
