From: Shawn Pearce <spearce@spearce.org>
To: Nicolas Pitre <nico@cam.org>
Cc: git@vger.kernel.org, Jon Smirl <jonsmirl@gmail.com>
Subject: Re: Packfile can't be mapped
Date: Mon, 28 Aug 2006 12:42:22 -0400
Message-ID: <20060828164222.GA22451@spearce.org>
In-Reply-To: <Pine.LNX.4.64.0608280014190.3683@localhost.localdomain>
Nicolas Pitre <nico@cam.org> wrote:
> On Sun, 27 Aug 2006, Shawn Pearce wrote:
>
> > I'm going to try to get tree deltas written to the pack sometime this
> > week. That should compact this intermediate pack down to something
> > that git-pack-objects would be able to successfully mmap into a
> > 32 bit address space. A complete repack with no delta reuse will
> > hopefully generate a pack closer to 400 MB in size. But I know
> > Jon would like to get that pack even smaller. :)
>
> One thing to consider in your code (if you didn't implement that
> already) is to _not_ attempt any delta on any object whose size is
> smaller than 50 bytes, and then limit the maximum delta size to
> object_size/2 - 20 (use that for the last argument to diff-delta() and
> store the undeltified object when diff-delta returns NULL). This way
> you'll avoid creating delta objects that are most likely to end up being
> _larger_ than the undeltified object.
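
For reference, the rule quoted above can be sketched as follows. This is
only an illustration and not the fast-import code: the function names and
the "full"/"delta" labels are hypothetical, and it assumes the delta
generator reports failure (here a size of None) when the result would
exceed the limit.

```python
MIN_DELTA_OBJECT_SIZE = 50  # bytes; threshold taken from the quoted suggestion


def delta_size_limit(object_size):
    """Return the largest delta worth keeping, or None to skip deltification.

    Hypothetical helper: objects under 50 bytes are never deltified,
    otherwise the delta may be at most object_size/2 - 20 bytes.
    """
    if object_size < MIN_DELTA_OBJECT_SIZE:
        return None
    return object_size // 2 - 20


def choose_encoding(object_size, delta_size):
    """Pick "delta" or "full" for one object.

    delta_size is the size of a candidate delta, or None when no delta
    could be produced (assumed to mirror diff-delta returning NULL).
    """
    limit = delta_size_limit(object_size)
    if limit is None or delta_size is None or delta_size > limit:
        return "full"
    return "delta"
```

With these definitions a 1000 byte object keeps a delta only if that delta
is 480 bytes or smaller, and a 40 byte object is always stored whole.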
So I added Nico's suggestions to fast-import and ran it on a small
subset of the Mozilla repository (3424 blobs):

    naive always delta:  6652 KiB
    Nico's suggestion:   6842 KiB

So Nico's suggestion of limiting delta size to (orig_len/2)-20 and
not using deltas on blobs < 50 bytes actually added 190 KiB to the
output pack. Since this sample is probably fairly representative
of the rest of the repository's blobs, I'm thinking we may see a 2.8%
increase in size over the current 930 MB blob pack. That's another
26 MB in our intermediate pack. I don't think this suggestion is
really worth including in fast-import right now...
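
The projection above is simple scaling and can be checked with a few lines.
This is a sketch under the same assumption made in the text, namely that
the 3424 blob sample is representative of the whole repository; the
function name is made up for illustration.

```python
def projected_growth(naive_kib, limited_kib, full_pack_mb):
    """Scale the growth measured on a sample up to the full blob pack.

    Returns (extra KiB in the sample, growth ratio, projected extra MB).
    Assumes the full pack grows by the same ratio as the sample.
    """
    extra_kib = limited_kib - naive_kib
    ratio = extra_kib / naive_kib
    return extra_kib, ratio, full_pack_mb * ratio


# Figures from the measurement above: 6652 KiB vs 6842 KiB, 930 MB pack.
extra_kib, ratio, extra_mb = projected_growth(6652, 6842, 930)
print(f"sample grew by {extra_kib} KiB ({ratio:.1%}), "
      f"projected extra: {extra_mb:.1f} MB")
```

The unrounded ratio comes out a little above 2.8%, so the projected
increase lands between 26 and 27 MB.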
--
Shawn.