git.vger.kernel.org archive mirror
From: Christoph <christoph.duelli@gmx.de>
To: Andreas Ericsson <ae@op5.se>
Cc: git@vger.kernel.org
Subject: Re: importing bk into git (succeeded)
Date: Sat, 8 Dec 2007 20:19:09 +0100
Message-ID: <200712082019.10273.christoph.duelli@gmx.de>
In-Reply-To: <474FC2C5.8060400@op5.se>

[-- Attachment #1: Type: text/plain, Size: 2057 bytes --]

On Friday 30 November 2007 08:59:01 you wrote:
> Christoph wrote:
> > I am trying to import a BitKeeper repo into a (new) git repo.
> >
> > I am trying with the script bk2git.py that I found on the web.
> > This does not quite work - I fear script is no longer working with the
> > current git release. (I am using the current git release.)
[snip]
> > The following lines fail
> >   os.system("cd %s; git-ls-files --deleted | xargs
> > git-update-cache --remove" % tmp_dir)
[snip]
> It should still do this, afaik, although it's probably better
> to just use GIT_DIR nowadays.
Using GIT_DIR works. One has to set it to point to the .git directory itself
(I had assumed GIT_DIR to be the directory *containing* .git).
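
For anyone hitting the same confusion, here is a minimal sketch of the idea
(not taken from the attached script). The helper names git_env and
remove_deleted_files are hypothetical, and the sketch uses git update-index,
the current name of the old git-update-cache command:

```python
import os
import subprocess


def git_env(worktree, base_env=None):
    """Return an environment in which GIT_DIR names the .git directory itself."""
    env = dict(os.environ if base_env is None else base_env)
    # GIT_DIR must be the repository directory (".git"), not its parent.
    env["GIT_DIR"] = os.path.join(worktree, ".git")
    return env


def remove_deleted_files(worktree):
    """Remove files that vanished from the working tree from the index."""
    env = git_env(worktree)
    # NUL-separated output keeps file names with spaces or newlines intact.
    deleted = subprocess.run(
        ["git", "ls-files", "--deleted", "-z"],
        cwd=worktree, env=env, check=True, stdout=subprocess.PIPE,
    ).stdout
    if deleted:
        # Feed the same NUL-separated list to update-index on stdin.
        subprocess.run(
            ["git", "update-index", "--remove", "-z", "--stdin"],
            cwd=worktree, env=env, check=True, input=deleted,
        )
```

Passing the list on stdin instead of through xargs is a design choice of this
sketch; it avoids quoting problems with unusual file names.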

Another point with the original script was that you had to have all committers
in the mapping (email -> name); otherwise it would not work.
(Supplying '*Unknown*' fixed this.)
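
One way to read the fix above, as a sketch under assumptions: look the
committer up in the mapping and fall back to '*Unknown*' when the email is
missing. The function committer_name and the example addresses are
hypothetical; the attached script may handle this differently.

```python
UNKNOWN_NAME = "*Unknown*"


def committer_name(email, names, default=UNKNOWN_NAME):
    """Map a committer email to a name, tolerating emails not in the mapping."""
    # dict.get returns the fallback instead of raising KeyError.
    return names.get(email, default)


names = {"alice@example.org": "Alice Example"}
known = committer_name("alice@example.org", names)
unknown = committer_name("bob@example.org", names)
```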

I have added
- better argument parsing (see --help)
- ability to do incremental conversions (--incr-db, -r)
- different levels of verbosity

I have attached a working version of the script. I have added comments that 
(try to) explain the script if someone else has trouble with it.

Moreover, it is very helpful to put the directories inside a ramdisk. 
Otherwise, you have to be extremely patient.
(You have to be patient anyway: for a bk repo of some 14k files (>110 MB
when 'clean', 8000 changesets), the script took some 11 hours.)
Another issue (when using ramdisks) is memory. On my machine memory is scarce
(only 1 GB), so the ever-growing bare repo (which can't be gc'ed before getting
its head) exhausted the ramdisk space. I worked around this by doing an
incremental conversion. After each increment a gc is possible, and the git
repo shrinks to a manageable size (and still fits inside the ramdisk).
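
The incremental approach can be sketched roughly as follows. This is an
illustration only, not the logic of the attached script: the names batches,
run_git_gc and convert_incrementally are hypothetical, and the conversion step
is passed in as a callback because its details depend on the script.

```python
import subprocess


def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_git_gc(git_dir):
    """Repack the repository at git_dir so it shrinks between increments."""
    subprocess.run(["git", "--git-dir=" + git_dir, "gc"], check=True)


def convert_incrementally(changesets, size, convert, gc):
    """Convert changesets batch by batch, calling gc() after each batch."""
    done = 0
    for batch in batches(changesets, size):
        convert(batch)  # hypothetical: import this batch into the git repo
        gc()            # e.g. lambda: run_git_gc("/path/to/repo/.git")
        done += len(batch)
    return done
```

The batch size trades ramdisk headroom against the time spent in repeated gc
runs; a suitable value depends on the repository and the available memory.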

So, to sum up: converting a big repo is no fun, but it works, given enough 
time (and ram).

Thanks, best regards
Christoph
-- 
A billion here, a couple of billion there -- first thing you know it
adds up to be real money.
		-- Senator Everett McKinley Dirksen

[-- Attachment #2: bk2git.py --]
[-- Type: application/x-python, Size: 10396 bytes --]
