From: Keith Packard <keithp@keithp.com>
To: Jon Smirl <jonsmirl@gmail.com>
Cc: keithp@keithp.com, Martin Langhoff <martin.langhoff@gmail.com>,
	git <git@vger.kernel.org>
Subject: Re: packs and trees
Date: Tue, 20 Jun 2006 08:18:48 -0700
Message-ID: <1150816728.5382.27.camel@neko.keithp.com>
In-Reply-To: <9e4733910606200735u5741a9adr83264ae7d51dd37@mail.gmail.com>

On Tue, 2006-06-20 at 10:35 -0400, Jon Smirl wrote:

> Keith's parsecvs run ended up in a loop and mine hit a parsecvs error
> and then had memory corruption after about eight hours. That was last
> week. I just checked the logs and I don't see any comments about
> fixing it.

Yeah, I'm rewriting the tool; the current codebase isn't supportable.

> Even after spending eight hours building the changeset info it is
> still going to take it a couple of days to retrieve the versions one
> at a time and write them to git. Reparsing 50MB delta files n^2/2
> times is a major bottleneck for all three programs.

The eight hours in question *were* writing out the deltas and packing
the resulting trees. All that remained was to construct actual commit
objects and write them out. 

The problem was that parsecvs's internals are structured so that this
process would take a large amount of memory, so I'm reworking the code
to free stuff as it goes along.

With a rewritten parsecvs, I'm hoping to be able to steal the algorithms
from cvs2svn and stick those in place. Then work on truncating the
history so it can deal with incremental updates to the repository, which
I think will be straightforward if we stick a few breadcrumbs in the git
repository to recover state from.
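
For reference, the n^2/2 figure in the quoted text can be reproduced
with a small count. This is only a sketch under a simplified model, not
what parsecvs or cvs2svn actually does: it assumes a single linear chain
of n revisions in which the newest is stored in full and each older one
is reached by applying one more delta. The function names are made up
for illustration.

```python
def deltas_applied_one_at_a_time(n):
    """Delta applications when each of n revisions is rebuilt from the head.

    Assumed model: the revision k steps away from the head needs k deltas.
    """
    total = 0
    for steps_from_head in range(n):
        total += steps_from_head
    return total  # equals n * (n - 1) // 2, roughly n^2 / 2


def deltas_applied_single_pass(n):
    """Delta applications when the chain is walked once, emitting each
    revision as it is reconstructed."""
    return max(n - 1, 0)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, deltas_applied_one_at_a_time(n), deltas_applied_single_pass(n))
```

Under that model, walking the chain once and writing out each revision
as it appears costs n - 1 delta applications instead of about n^2/2.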

-- 
keith.packard@intel.com
