* importing bk into git
@ 2007-11-29 21:32 Christoph
2007-11-30 7:59 ` Andreas Ericsson
2007-12-03 3:02 ` importing bk into git David Kettler
0 siblings, 2 replies; 6+ messages in thread
From: Christoph @ 2007-11-29 21:32 UTC (permalink / raw)
To: git
I am trying to import a BitKeeper repo into a (new) git repo.
I am using the script bk2git.py that I found on the web.
This does not quite work - I fear the script no longer works with the current
git release (which is what I am using).
If I have understood the script correctly, it does repeated bk checkouts and
imports the diff of each (next) checkout into the git repo.
It seems the script tries to do so by setting the environment variables
GIT_OBJECT_DIRECTORY and GIT_INDEX_FILE
to point at the git repo.
The bk checkouts are done in a temporary directory (tmp_dir).
The following line fails:
os.system("cd %s; git-ls-files --deleted | xargs git-update-cache --remove" % tmp_dir)
with:
fatal: Not a git repository
xargs: git-update-cache: No such file or directory
The problem seems to be that the script cd's into the temp dir (which is not a
git repo) and the git-ls-files fails to find a git repo there.
I think the issue might be that an earlier version of git was perhaps able to
find the repo by means of the env. vars mentioned above.
Any idea if/how I can fix this?
Thanks for any ideas and best regards
Christoph
(Sorry, my python and git skills are so far very limited.)
PS: I have attached the script I downloaded from the net.
--
FORTUNE'S PARTY TIPS #14
Tired of finding that other people are helping themselves to your good
liquor at BYOB parties? Take along a candle, which you insert and
light after you've opened the bottle. No one ever expects anything
drinkable to be in a bottle which has a candle stuck in its neck.
[-- Attachment #2: bk2git.py --]
[-- Type: application/x-python, Size: 4700 bytes --]
* Re: importing bk into git
2007-11-29 21:32 importing bk into git Christoph
@ 2007-11-30 7:59 ` Andreas Ericsson
2007-11-30 11:35 ` [PATCH] Replace the word 'update-cache' by 'update-index' everywhere Johannes Schindelin
2007-12-08 19:19 ` importing bk into git (succeeded) Christoph
2007-12-03 3:02 ` importing bk into git David Kettler
1 sibling, 2 replies; 6+ messages in thread
From: Andreas Ericsson @ 2007-11-30 7:59 UTC (permalink / raw)
To: christoph.duelli; +Cc: git
Christoph wrote:
> I am trying to import a BitKeeper repo into a (new) git repo.
>
> I am trying with the script bk2git.py that I found on the web.
> This does not quite work - I fear script is no longer working with the current
> git release. (I am using the current git release.)
>
> If I have understood the script correctly, it does repeated bk checkouts and
> imports the updates the git repo diff of the (next) checkout etc.
>
> It seems this script tries to do so by settings environment vars
> GIT_OBJECT_DIRECTORY and GIT_INDEX_FILE
> to point at the git repo.
>
> The bk checkout are done at a temp. dir (tmp_dir).
>
>
> The following lines fail
> os.system("cd %s; git-ls-files --deleted | xargs
> git-update-cache --remove" % tmp_dir)
>
> with: fatal: Not a git repository
> xargs: git-update-cache: No such file or directory
>
You may have better luck using "git update-index" instead of
git-update-cache. If that doesn't work, try finding out which
version of git the importer script was written against and try
using that version of git for the import.
If you run into problems while using a newer git on the
imported repository that'll be a different discussion.
> The problem seems to be that the script cd's into the temp dir (which is not a
> git repo) and the git-ls-files fails to find a git repo there.
> I think the issue might be that an earlier version of git was perhaps able to
> find the repo by means of the env. vars mentioned above.
>
It should still do this, afaik, although it's probably better
to just use GIT_DIR nowadays.
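For illustration, here is a minimal sketch of how the failing step could be run from a directory that is not itself a git repo, by passing GIT_DIR and GIT_WORK_TREE in the environment and using "git update-index" in place of git-update-cache. It is written for Python 3 and is not taken from bk2git.py; the function names git_env and stage_deletions are hypothetical, and git_dir is assumed to be the .git directory itself.

```python
import os
import subprocess


def git_env(git_dir, work_tree):
    """Copy the current environment and point git at an external repository."""
    env = dict(os.environ)
    env["GIT_DIR"] = git_dir          # the .git directory itself, not its parent
    env["GIT_WORK_TREE"] = work_tree  # the directory holding the exported files
    return env


def stage_deletions(git_dir, work_tree):
    """Drop files from the index that no longer exist in work_tree."""
    env = git_env(git_dir, work_tree)
    # NUL-separated output avoids trouble with spaces in file names.
    deleted = subprocess.run(
        ["git", "ls-files", "--deleted", "-z"],
        cwd=work_tree, env=env, check=True, stdout=subprocess.PIPE,
    ).stdout
    if deleted:
        subprocess.run(
            ["git", "update-index", "--remove", "-z", "--stdin"],
            cwd=work_tree, env=env, check=True, input=deleted,
        )
```

Passing the environment per call, rather than changing it globally with os.putenv, keeps the repository location explicit at every git invocation.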
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
* [PATCH] Replace the word 'update-cache' by 'update-index' everywhere
2007-11-30 7:59 ` Andreas Ericsson
@ 2007-11-30 11:35 ` Johannes Schindelin
2007-12-08 19:19 ` importing bk into git (succeeded) Christoph
1 sibling, 0 replies; 6+ messages in thread
From: Johannes Schindelin @ 2007-11-30 11:35 UTC (permalink / raw)
To: Andreas Ericsson; +Cc: christoph.duelli, git, gitster
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
Documentation/user-manual.txt | 2 +-
Makefile | 2 +-
configure.ac | 2 +-
t/t4000-diff-format.sh | 2 +-
t/t4001-diff-rename.sh | 2 +-
t/t4100/t-apply-1.patch | 2 +-
t/t4100/t-apply-2.patch | 2 +-
t/t4100/t-apply-5.patch | 2 +-
t/t4100/t-apply-6.patch | 2 +-
9 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/Documentation/user-manual.txt b/Documentation/user-manual.txt
index 0aaed10..93a47b4 100644
--- a/Documentation/user-manual.txt
+++ b/Documentation/user-manual.txt
@@ -3707,7 +3707,7 @@ should use the `--remove` and `--add` flags respectively.
NOTE! A `--remove` flag does 'not' mean that subsequent filenames will
necessarily be removed: if the files still exist in your directory
structure, the index will be updated with their new status, not
-removed. The only thing `--remove` means is that update-cache will be
+removed. The only thing `--remove` means is that update-index will be
considering a removed file to be a valid thing, and if the file really
does not exist any more, it will update the index accordingly.
diff --git a/Makefile b/Makefile
index 4454116..f000a5e 100644
--- a/Makefile
+++ b/Makefile
@@ -111,7 +111,7 @@ all::
# times (my ext3 doesn't).
#
# Define USE_STDEV below if you want git to care about the underlying device
-# change being considered an inode change from the update-cache perspective.
+# change being considered an inode change from the update-index perspective.
#
# Define ASCIIDOC8 if you want to format documentation with AsciiDoc 8
#
diff --git a/configure.ac b/configure.ac
index 7bcf1a4..5f8a15b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -415,7 +415,7 @@ GIT_PARSE_WITH(iconv))
# times (my ext3 doesn't).
#
# Define USE_STDEV below if you want git to care about the underlying device
-# change being considered an inode change from the update-cache perspective.
+# change being considered an inode change from the update-index perspective.
## Output files
diff --git a/t/t4000-diff-format.sh b/t/t4000-diff-format.sh
index 7d92ae3..c44b27a 100755
--- a/t/t4000-diff-format.sh
+++ b/t/t4000-diff-format.sh
@@ -16,7 +16,7 @@ cat path0 >path1
chmod +x path1
test_expect_success \
- 'update-cache --add two files with and without +x.' \
+ 'update-index --add two files with and without +x.' \
'git update-index --add path0 path1'
mv path0 path0-
diff --git a/t/t4001-diff-rename.sh b/t/t4001-diff-rename.sh
index 877c1ea..a326924 100755
--- a/t/t4001-diff-rename.sh
+++ b/t/t4001-diff-rename.sh
@@ -27,7 +27,7 @@ Line 15
'
test_expect_success \
- 'update-cache --add a file.' \
+ 'update-index --add a file.' \
'git update-index --add path0'
test_expect_success \
diff --git a/t/t4100/t-apply-1.patch b/t/t4100/t-apply-1.patch
index de58751..f98baa8 100644
--- a/t/t4100/t-apply-1.patch
+++ b/t/t4100/t-apply-1.patch
@@ -90,7 +90,7 @@ diff --git a/Documentation/git.txt b/Documentation/git.txt
diff --git a/Makefile b/Makefile
--- a/Makefile
+++ b/Makefile
-@@ -30,7 +30,7 @@ PROG= git-update-cache git-diff-files
+@@ -30,7 +30,7 @@ PROG= git-update-index git-diff-files
git-checkout-cache git-diff-tree git-rev-tree git-ls-files \
git-check-files git-ls-tree git-merge-base git-merge-cache \
git-unpack-file git-export git-diff-cache git-convert-cache \
diff --git a/t/t4100/t-apply-2.patch b/t/t4100/t-apply-2.patch
index cfdc808..f5c7d60 100644
--- a/t/t4100/t-apply-2.patch
+++ b/t/t4100/t-apply-2.patch
@@ -9,7 +9,7 @@ diff --git a/Makefile b/Makefile
- git-deltafy-script
+ git-deltafy-script git-fetch-script
- PROG= git-update-cache git-diff-files git-init-db git-write-tree \
+ PROG= git-update-index git-diff-files git-init-db git-write-tree \
git-read-tree git-commit-tree git-cat-file git-fsck-cache \
diff --git a/git-pull-script b/git-fetch-script
similarity index 87%
diff --git a/t/t4100/t-apply-5.patch b/t/t4100/t-apply-5.patch
index de11623..ad45c51 100644
--- a/t/t4100/t-apply-5.patch
+++ b/t/t4100/t-apply-5.patch
@@ -200,7 +200,7 @@ diff a/Documentation/git.txt b/Documentation/git.txt
diff a/Makefile b/Makefile
--- a/Makefile
+++ b/Makefile
-@@ -30,7 +30,7 @@ PROG= git-update-cache git-diff-files
+@@ -30,7 +30,7 @@ PROG= git-update-index git-diff-files
git-checkout-cache git-diff-tree git-rev-tree git-ls-files \
git-check-files git-ls-tree git-merge-base git-merge-cache \
git-unpack-file git-export git-diff-cache git-convert-cache \
diff --git a/t/t4100/t-apply-6.patch b/t/t4100/t-apply-6.patch
index d975363..a72729a 100644
--- a/t/t4100/t-apply-6.patch
+++ b/t/t4100/t-apply-6.patch
@@ -8,7 +8,7 @@ diff a/Makefile b/Makefile
- git-deltafy-script
+ git-deltafy-script git-fetch-script
- PROG= git-update-cache git-diff-files git-init-db git-write-tree \
+ PROG= git-update-index git-diff-files git-init-db git-write-tree \
git-read-tree git-commit-tree git-cat-file git-fsck-cache \
diff a/git-fetch-script b/git-fetch-script
--- /dev/null
--
1.5.3.6.2088.g8c260
* Re: importing bk into git
2007-11-29 21:32 importing bk into git Christoph
2007-11-30 7:59 ` Andreas Ericsson
@ 2007-12-03 3:02 ` David Kettler
2007-12-03 20:59 ` Christoph
1 sibling, 1 reply; 6+ messages in thread
From: David Kettler @ 2007-12-03 3:02 UTC (permalink / raw)
To: christoph.duelli; +Cc: git
G'day,
I modified that script to convert a number of our repositories in
February. The version below worked for me at the time, but I'm not
able to test it now as our BK license has expired. In particular I'm
not sure if the bk_info.split line is correct; I had a reduced form of
this line in the file which now looks obviously wrong.
The script is slow; most of the time is in the bk export for every
revision. There are probably dumb things in there; I don't know
python and I was just starting with git.
Changes from the version I downloaded from the web include:
- sundry changes to make it work for me
- separate committers file to translate user names to full names
- specify a git dir template
- copy tags from BK
- minimal conversion of ignore files
- increased recursion limit to handle large number of commits
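
The recursion-limit change is needed because the script resolves parents recursively (res() in the script below). As an alternative, here is a minimal sketch of the same parents-first ordering done with an explicit stack, so the depth of the history does not matter. This is not part of the script in this mail; commit_order and its arguments are hypothetical names, parents is assumed to map each revision to a list of its parent revisions, and the code is written for Python 3.

```python
def commit_order(parents, tip, done=()):
    """Return the revisions reachable from tip, every parent before its children.

    Revisions listed in `done` (already converted) are skipped.
    """
    order = []
    seen = set(done)
    stack = [(tip, False)]
    while stack:
        rev, expanded = stack.pop()
        if expanded:
            # All parents of rev have been emitted; rev can follow.
            order.append(rev)
            continue
        if rev in seen:
            continue
        seen.add(rev)
        stack.append((rev, True))
        # Push parents in reverse so the first parent is handled first.
        for parent in reversed(parents.get(rev, [])):
            stack.append((parent, False))
    return order
```

The returned list could then be walked in a plain loop, exporting and committing one revision at a time.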
I hope this is useful to someone.
regards, David.
--snip,snip--
# Convert a BK repository to GIT
# usage: bk2git BK_REPO [GIT_REPO]
# Single branch only.

import os
import time
import sys

sys.setrecursionlimit(10000);

templates_dir = "/tmp/bk2git/templates"
committers_file = "/tmp/bk2git/committers"
tmp_dir = "/tmp/bk-export%d" % os.getpid()

# Get repository locations.
if len(sys.argv) < 2 or len(sys.argv) > 3:
    print "usage: bk2git BK_REPO [GIT_REPO]"
    sys.exit(1);
bk_dir = sys.argv[1]
if len(sys.argv) == 3:
    git_dir = sys.argv[2]
else:
    git_dir = bk_dir + ".git"
print "BK " + bk_dir
print "GIT " + git_dir

# Get committer names.
f = file(committers_file, "r")
committers = {}
for line in f:
    [m,n] = line.split(" ",1)
    committers[m] = n.strip();
f.close()

# Get tree of commits.
f = os.popen("cd %s; bk prs -d':REV:\\t:PARENT:\\t:MPARENT:\\t\\n' ChangeSet" % bk_dir)
f.readline()
parents={}
for rev in f:
    [n,p] = rev.rstrip().split("\t",1)
    parents[n] = p.split("\t")
f.close()

# Get tags.
f = os.popen("cd %s; bk changes -t -n -d':I:\\t:TAG:'" % bk_dir)
tags={}
for rev in f:
    [n,t] = rev.rstrip().split("\t",1)
    tags[n] = t
f.close()

# Initialize git repository.
os.system("mkdir %s" % git_dir)
os.chdir(git_dir)
os.system("git --bare init --template=%s" % templates_dir)
os.system("git-config core.bare false")

unknown = {}
def get_name(email):
    if committers.has_key(email):
        return committers[email]
    unknown[email] = True
    return "*Unknown*"

def git_commit(rev, p):
    os.chdir(tmp_dir)
    os.symlink(git_dir, "%s/.git" % tmp_dir)
    # os.system("pwd; ls -AlR")
    os.system("git-ls-files -z --deleted | xargs -0 git-update-index --remove")
    os.system("git-ls-files -z --others | xargs -0 git-update-index --add")
    os.system("git-ls-files -z | xargs -0 git-update-index")
    treeid = os.popen("git-write-tree").read().rstrip()
    print "wrote tree as %s" % treeid
    os.chdir(git_dir)
    os.system("rm -Rf %s" % tmp_dir)
    bk_info = os.popen("cd %s; bk prs -r%s -d':KEY:\\n:UTC:\\n:USER:@:HOST:\\n$each(:C:){:C\\n}\\n' ChangeSet | sed 1d" % (bk_dir, rev)).read()
    [key, date, user, comments] = bk_info.split("\n", 3)
    # [key, date, user] = bk_info.split("\n", 2)
    [name, machine] = user.split("@", 1);
    f = file("/tmp/git-comments","w")
    f.write(comments)
    f.write("BK KEY: %s\n" % key)
    f.close()
    sdate = str(int(time.mktime(time.strptime(date+" UTC", "%Y%m%d%H%M%S %Z"))))
    os.putenv("GIT_AUTHOR_DATE", sdate)
    os.putenv("GIT_AUTHOR_EMAIL", user)
    os.putenv("GIT_AUTHOR_NAME", get_name(name))
    os.putenv("GIT_COMMITTER_DATE", sdate)
    os.putenv("GIT_COMMITTER_EMAIL", user)
    os.putenv("GIT_COMMITTER_NAME", get_name(name))
    commitid = os.popen("git-commit-tree %s %s < /tmp/git-comments" % (treeid, " ".join(["-p "+a for a in p]))).read().rstrip()
    print "committed %s as %s" % (rev, commitid)
    if tags.has_key(rev):
        os.system("git-tag %s %s" % (tags[rev], commitid))
        print "tagged %s" % tags[rev]
    return commitid

os.system("mkdir %s; touch %s/initial" % (tmp_dir, tmp_dir))
resolved = {'1.1': git_commit("1.1",[])}

def res(ver):
    if resolved.has_key(ver):
        return
    for v in parents[ver]:
        res(v)
    os.system("cd %s; bk export -r%s %s" % (bk_dir, ver, tmp_dir))
    ignore = "%s/.gitignore" % tmp_dir
    os.system("cd %s; bk co -kpq -r@%s BitKeeper/etc/ignore | sed '/^BitKeeper/d;/^PENDING/d' > %s" % (bk_dir, ver, ignore))
    os.system("test -s %s || rm %s" % (ignore, ignore))
    resolved[ver] = git_commit(ver, [resolved[v] for v in parents[ver]])
    return resolved[ver]

tot = os.popen("cd %s; bk prs -r+ -d':REV:' ChangeSet | tail -n 1" % bk_dir).read()
print "Exporting bitkeeper up to version %s" % tot
HEAD = res(tot)
print "HEAD: %s" % HEAD
file("%s/refs/heads/master" % git_dir,"w").write(HEAD + "\n")
os.system("git-config core.bare true")
os.system("git gc")
print unknown.keys()
--snip,snip--
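
One detail in git_commit that may be worth a second look: the timestamp is converted with time.mktime, which interprets its argument as local time, so the result can be shifted by the local UTC offset unless the script runs with TZ set to UTC. A minimal sketch of a conversion that treats the stamp as UTC directly is shown here. It assumes the BK :UTC: field is always a 14-digit YYYYMMDDHHMMSS string, as the script's format string implies; bk_utc_to_epoch is a hypothetical name and the code is written for Python 3.

```python
import calendar
import time


def bk_utc_to_epoch(stamp):
    """Convert a YYYYMMDDHHMMSS stamp, taken as UTC, to seconds since the epoch."""
    # calendar.timegm is the UTC counterpart of time.mktime.
    return calendar.timegm(time.strptime(stamp, "%Y%m%d%H%M%S"))
```

The integer result could be turned into a string and used for GIT_AUTHOR_DATE and GIT_COMMITTER_DATE in the same way as sdate in the script.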
* Re: importing bk into git
2007-12-03 3:02 ` importing bk into git David Kettler
@ 2007-12-03 20:59 ` Christoph
0 siblings, 0 replies; 6+ messages in thread
From: Christoph @ 2007-12-03 20:59 UTC (permalink / raw)
To: David Kettler; +Cc: git
On Monday 03 December 2007 04:02:43 you wrote:
> G'day,
>
> I modified that script to convert a number of our repositories in
> February. The version below worked for me at the time, but I'm not
> able to test it now as our BK license has expired. In particular I'm
> not sure if the bk_info.split line is correct; I had a reduced form of
> this line in the file which now looks obviously wrong.
>
> The script is slow; most of the time is in the bk export for every
> revision. There are probably dumb things in there; I don't know
> python and I was just starting with git.
>
> Changes from the version I downloaded from the web include:
> - sundry changes to make it work for me
> - separate committers file to translate user names to full names
> - specify a git dir template
> - copy tags from BK
> - minimal conversion of ignore files
> - increased recursion limit to handle large number of commits
>
> I hope this is useful to someone.
Thanks,
I have made some modifications to get the script working as well. I was able
to convert some simple (really small) test repos. I wanted to try to convert
my real big repo (>10k files, >5000 revisions) before mailing it. However, as
running this conversion will probably take more than 24 hours, I will do so
next weekend.
I will check your script and integrate my changes (if there are any relative
to yours) after that.
best regards
Christoph
* Re: importing bk into git (succeeded)
2007-11-30 7:59 ` Andreas Ericsson
2007-11-30 11:35 ` [PATCH] Replace the word 'update-cache' by 'update-index' everywhere Johannes Schindelin
@ 2007-12-08 19:19 ` Christoph
1 sibling, 0 replies; 6+ messages in thread
From: Christoph @ 2007-12-08 19:19 UTC (permalink / raw)
To: Andreas Ericsson; +Cc: git
On Friday 30 November 2007 08:59:01 you wrote:
> Christoph wrote:
> > I am trying to import a BitKeeper repo into a (new) git repo.
> >
> > I am trying with the script bk2git.py that I found on the web.
> > This does not quite work - I fear script is no longer working with the
> > current git release. (I am using the current git release.)
[snip]
> > The following lines fail
> > os.system("cd %s; git-ls-files --deleted | xargs
> > git-update-cache --remove" % tmp_dir)
[snip]
> It should still do this, afaik, although it's probably better
> to just use GIT_DIR nowadays.
Using GIT_DIR works; one has to set it to point to the .git directory (I had
assumed the git_dir to be the one *containing* .git).
Another point with the original script was that you had to have all committers
in the mapping (email -> name), otherwise it would not work.
(Supplying '*Unknown*' fixed this.)
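
As an illustration of the '*Unknown*' fallback, here is a minimal Python 3 sketch. It assumes a committers file with one "key Full Name" pair per line; load_committers and get_name are hypothetical names here and are not copied from the attached script.

```python
def load_committers(lines):
    """Parse 'key Full Name' lines into a dict, ignoring malformed lines."""
    committers = {}
    for line in lines:
        parts = line.split(None, 1)
        if len(parts) == 2:
            committers[parts[0]] = parts[1].strip()
    return committers


def get_name(committers, key, unknown):
    """Look up a full name; remember missing keys instead of failing."""
    if key not in committers:
        unknown.add(key)
    return committers.get(key, "*Unknown*")
```

Collecting the missing keys in a set makes it possible to print them at the end of a run and extend the mapping before the next conversion.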
I have added
- better argument parsing (see --help)
- the ability to do incremental conversions (--incr-db, -r)
- different levels of verbosity
I have attached a working version of the script. I have added comments that
(try to) explain the script if someone else has trouble with it.
Moreover, it is very helpful to put the directories inside a ramdisk.
Otherwise, you have to be extremely patient.
(You have to be patient anyway. For a bk repo of some 14k files (>110MB
when 'clean', 8000 changesets) the script took some 11hrs).
Another issue (when using ramdisks) is memory. On my machine memory is scarce
(only 1 GB). So the ever-growing bare repo (which can't be gc'ed before it gets
its head) exhausted the ramdisk space. I worked around this by doing an
incremental conversion. After each increment a gc is possible and the git
repo shrinks to a manageable size (and still fits inside the ramdisk).
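
The general shape of such an incremental run might look like the following Python 3 sketch. It is only an illustration and does not show how --incr-db is implemented in the attached script; batches, gc_repo and convert_incrementally are hypothetical names, convert_one stands for whatever converts a single revision, and after_batch is assumed to update the branch head and then call something like gc_repo.

```python
import subprocess


def batches(revisions, size):
    """Yield consecutive slices of at most `size` revisions."""
    for start in range(0, len(revisions), size):
        yield revisions[start:start + size]


def gc_repo(git_dir):
    """Repack the repository at git_dir (the .git directory itself)."""
    subprocess.run(["git", "--git-dir=" + git_dir, "gc"], check=True)


def convert_incrementally(revisions, size, convert_one, after_batch):
    """Convert revisions in order, calling after_batch once per finished batch."""
    for chunk in batches(revisions, size):
        for rev in chunk:
            convert_one(rev)
        # The place to write the head ref and run gc before the next batch.
        after_batch(chunk)
```

A smaller batch size keeps the peak size of the repository lower at the cost of running gc more often.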
So, to sum up: converting a big repo is no fun, but it works, given enough
time (and ram).
Thanks, best regards
Christoph
--
A billion here, a couple of billion there -- first thing you know it
adds up to be real money.
-- Senator Everett McKinley Dirksen
[-- Attachment #2: bk2git.py --]
[-- Type: application/x-python, Size: 10396 bytes --]