From: "Jon Smirl" <jonsmirl@gmail.com>
To: git <git@vger.kernel.org>, "Shawn Pearce" <spearce@spearce.org>
Subject: Re: fast-import and unique objects.
Date: Sun, 6 Aug 2006 11:53:43 -0400
Message-ID: <9e4733910608060853ua0eabc1w9b35b8414d3c9bae@mail.gmail.com>
In-Reply-To: <9e4733910608060532w51fca2c0r8038828df0d41eeb@mail.gmail.com>
On 8/6/06, Jon Smirl <jonsmirl@gmail.com> wrote:
> This model has a lot of object duplication. I generated 949,305
> revisions, but only 754,165 are unique. I'll modify my code to build a
> hash of the objects it has seen and then not send the duplicates to
> fast-import. Those 195,140 duplicated objects may be what is tripping
> index-pack up.
The new run with duplicate removal has finished.
Run time is unchanged at about 2 hours; the run is I/O bound, not CPU bound.
The pack file is 845MB instead of 934MB.
git-index-pack works now; it takes 4 CPU minutes to run.
The index file is 18MB.
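A minimal sketch of the duplicate filter described above, assuming each revision is available as a byte string and that the hash used is git's own blob SHA-1 (the SHA-1 of the header "blob <size>" plus a NUL byte, followed by the content). The names git_blob_sha1 and unique_blobs are hypothetical and are not taken from the actual import code.

```python
import hashlib


def git_blob_sha1(data: bytes) -> str:
    """Return the SHA-1 that git assigns to a blob with this content."""
    # Git hashes "blob <size>\0" followed by the raw content.
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def unique_blobs(revisions):
    """Yield (sha1, data) once per distinct content, skipping duplicates.

    `revisions` is assumed to be an iterable of byte strings, one per
    file revision (hypothetical input shape).
    """
    seen = set()
    for data in revisions:
        sha1 = git_blob_sha1(data)
        if sha1 in seen:
            continue  # already sent; do not hand it to fast-import again
        seen.add(sha1)
        yield sha1, data
```

With the counts quoted above, such a filter would skip 949,305 - 754,165 = 195,140 revisions. Holding 754,165 hex digests in a set is the main memory cost; storing the 20-byte binary digest instead would reduce it.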
So it looks like the first stage code is working. Next I need to
modify cvs2svn to keep track of the SHA-1 through its sorting process
instead of file:revision.
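One way to picture that change, as a rough sketch only: the records being sorted carry the blob SHA-1 in place of the file:revision pair. The record layout (timestamp, path, revision), the lookup table sha1_of, and the function key_by_sha1 are all assumptions made for illustration; none of them reflect cvs2svn's real data structures.

```python
def key_by_sha1(records, sha1_of):
    """Swap each record's (path, revision) pair for its blob SHA-1.

    `records` is assumed to be an iterable of (timestamp, path, revision)
    tuples, and `sha1_of` a dict mapping (path, revision) to the SHA-1
    recorded when the blob was written in the first stage.
    """
    for timestamp, path, revision in records:
        yield timestamp, sha1_of[(path, revision)]


# Hypothetical data: two revisions of different files with identical content.
sha1_of = {
    ("a.c", "1.1"): "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391",
    ("b.c", "1.1"): "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391",
}
records = [(200, "b.c", "1.1"), (100, "a.c", "1.1")]

# The sort itself is unchanged; only the payload carried through it differs.
ordered = sorted(key_by_sha1(records, sha1_of))
```

Because identical contents share one SHA-1, later stages can refer to a blob directly without looking up which file:revision first produced it.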
--
Jon Smirl
jonsmirl@gmail.com