git.vger.kernel.org archive mirror
From: Shawn Pearce <spearce@spearce.org>
To: Nicolas Pitre <nico@cam.org>
Cc: git@vger.kernel.org, Jon Smirl <jonsmirl@gmail.com>
Subject: Re: Packfile can't be mapped
Date: Mon, 28 Aug 2006 01:33:01 -0400
Message-ID: <20060828053301.GA25285@spearce.org>
In-Reply-To: <Pine.LNX.4.64.0608280014190.3683@localhost.localdomain>

Nicolas Pitre <nico@cam.org> wrote:
> On Sun, 27 Aug 2006, Shawn Pearce wrote:
> 
> > I'm going to try to get tree deltas written to the pack sometime this
> > week. That should compact this intermediate pack down to something
> > that git-pack-objects would be able to successfully mmap into a
> > 32 bit address space.  A complete repack with no delta reuse will
> > hopefully generate a pack closer to 400 MB in size.  But I know
> > Jon would like to get that pack even smaller.  :)
> 
> One thing to consider in your code (if you didn't implement that 
> already) is to _not_ attempt any delta on any object whose size is 
> smaller than 50 bytes, and then limit the maximum delta size to 
> object_size/2 - 20 (use that for the last argument to diff-delta() and 
> store the undeltified object when diff-delta returns NULL).  This way 
> you'll avoid creating delta objects that are most likely to end up being 
> _larger_ than the undeltified object.

I haven't tried this.  Should be trivial to implement.  Thanks for
the suggestion.
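
For reference, the heuristic Nicolas describes could be sketched roughly as
below. This is only an illustration under stated assumptions, not the
fast-import code: toy_diff_delta() is a hypothetical stand-in for diff-delta()
that uses a made-up prefix-only encoding, and try_delta() and
MIN_DELTA_OBJECT_SIZE are names invented here. The only parts taken from the
message above are the 50 byte cutoff, the object_size/2 - 20 limit, and
storing the object undeltified when the delta routine returns NULL.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Objects smaller than this are never deltified (cutoff from the thread). */
#define MIN_DELTA_OBJECT_SIZE 50

/*
 * Hypothetical stand-in for diff-delta(). The "delta" here is just the
 * length of the common prefix followed by the remaining target bytes.
 * Returns NULL when the result would be larger than max_size.
 */
static void *toy_diff_delta(const unsigned char *base, size_t base_len,
                            const unsigned char *target, size_t target_len,
                            size_t *delta_len, size_t max_size)
{
    size_t common = 0;
    while (common < base_len && common < target_len &&
           base[common] == target[common])
        common++;

    size_t need = sizeof(size_t) + (target_len - common);
    if (need > max_size)
        return NULL;            /* delta too large: caller stores full object */

    unsigned char *out = malloc(need);
    if (!out)
        return NULL;            /* treated like "no usable delta" */
    memcpy(out, &common, sizeof(size_t));
    memcpy(out + sizeof(size_t), target + common, target_len - common);
    *delta_len = need;
    return out;
}

/*
 * Returns 1 and sets *out / *out_len when a delta should be stored
 * (the caller frees *out). Returns 0 when the object should be stored
 * undeltified.
 */
static int try_delta(const unsigned char *base, size_t base_len,
                     const unsigned char *target, size_t target_len,
                     void **out, size_t *out_len)
{
    *out = NULL;
    *out_len = 0;

    /* Tiny objects: a delta is unlikely to be smaller than the object. */
    if (target_len < MIN_DELTA_OBJECT_SIZE)
        return 0;

    /* target_len >= 50 here, so this cannot underflow (minimum is 5). */
    size_t max_delta = target_len / 2 - 20;

    *out = toy_diff_delta(base, base_len, target, target_len,
                          out_len, max_delta);
    return *out != NULL;
}
```

A caller would write the delta when try_delta() returns 1 and otherwise fall
back to writing the whole object, so no delta larger than roughly half the
object ever reaches the pack.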

> > I should point out that the input stream to fast-import was 20 GB
> > (completely decompressed revisions from RCS) plus all commit data.
> > The original CVS ,v files are around 3 GB.  An archive .tar.gz'ing
> > the ,v files is around 550 MB.  Going to only 1.7 GB without tree
> > or commit deltas is certainly pretty good.  :)
> 
> Good job indeed.  Oh and you probably should not bother trying to 
> deltify commit objects at all since that would be a waste of time.

I wasn't going to bother even trying to delta the commits.  In this
import the 200k commits aren't a very large percentage of the data.
As I'm sure you are well aware, it's pretty much a waste of time to try
with the commits, especially with an "intermediate" pack such as
this one.

-- 
Shawn.

