* importing bk into git
From: Christoph @ 2007-11-29 21:32 UTC
To: git

I am trying to import a BitKeeper repo into a (new) git repo.

I am trying with the script bk2git.py that I found on the web.
This does not quite work - I fear the script no longer works with the
current git release, which is the one I am using.

If I have understood the script correctly, it does repeated bk checkouts and
updates the git repo with the diff of each (next) checkout, etc.

It seems the script tries to do so by setting the environment vars
GIT_OBJECT_DIRECTORY and GIT_INDEX_FILE
to point at the git repo.

The bk checkouts are done in a temp. dir (tmp_dir).

The following line fails:
  os.system("cd %s; git-ls-files --deleted | xargs git-update-cache --remove" % tmp_dir)

with:
  fatal: Not a git repository
  xargs: git-update-cache: No such file or directory

The problem seems to be that the script cd's into the temp dir (which is not a
git repo) and git-ls-files fails to find a git repo there.
I think an earlier version of git was perhaps able to find the repo by means
of the env. vars mentioned above.

Any idea if/how I can fix this?

Thanks for any ideas and best regards
Christoph

(Sorry, my python and git skills are so far very limited.)

PS: I have attached the script I downloaded from the net.

--
FORTUNE'S PARTY TIPS #14
Tired of finding that other people are helping themselves to your good
liquor at BYOB parties? Take along a candle, which you insert and light
after you've opened the bottle. No one ever expects anything drinkable
to be in a bottle which has a candle stuck in its neck.

[-- Attachment: bk2git.py --]
[-- Type: application/x-python, Size: 4700 bytes --]

* Re: importing bk into git
From: Andreas Ericsson @ 2007-11-30 7:59 UTC
To: christoph.duelli; +Cc: git

Christoph wrote:
> I am trying to import a BitKeeper repo into a (new) git repo.
>
> I am trying with the script bk2git.py that I found on the web.
> This does not quite work - I fear the script no longer works with the
> current git release, which is the one I am using.
>
> If I have understood the script correctly, it does repeated bk checkouts and
> updates the git repo with the diff of each (next) checkout, etc.
>
> It seems the script tries to do so by setting the environment vars
> GIT_OBJECT_DIRECTORY and GIT_INDEX_FILE
> to point at the git repo.
>
> The bk checkouts are done in a temp. dir (tmp_dir).
>
> The following line fails:
>   os.system("cd %s; git-ls-files --deleted | xargs git-update-cache --remove" % tmp_dir)
>
> with:
>   fatal: Not a git repository
>   xargs: git-update-cache: No such file or directory
>

You may have better luck using "git update-index" instead of
git-update-cache. If that doesn't work, try finding out which version of
git the importer script was written against and use that version of git
for the import. If you run into problems while using a newer git on the
imported repository, that'll be a different discussion.

> The problem seems to be that the script cd's into the temp dir (which is not a
> git repo) and git-ls-files fails to find a git repo there.
> I think an earlier version of git was perhaps able to find the repo by means
> of the env. vars mentioned above.
>

It should still do this, afaik, although it's probably better
to just use GIT_DIR nowadays.

--
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

* [PATCH] Replace the word 'update-cache' by 'update-index' everywhere
From: Johannes Schindelin @ 2007-11-30 11:35 UTC
To: Andreas Ericsson; +Cc: christoph.duelli, git, gitster

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/user-manual.txt |    2 +-
 Makefile                      |    2 +-
 configure.ac                  |    2 +-
 t/t4000-diff-format.sh        |    2 +-
 t/t4001-diff-rename.sh        |    2 +-
 t/t4100/t-apply-1.patch       |    2 +-
 t/t4100/t-apply-2.patch       |    2 +-
 t/t4100/t-apply-5.patch       |    2 +-
 t/t4100/t-apply-6.patch       |    2 +-
 9 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/Documentation/user-manual.txt b/Documentation/user-manual.txt
index 0aaed10..93a47b4 100644
--- a/Documentation/user-manual.txt
+++ b/Documentation/user-manual.txt
@@ -3707,7 +3707,7 @@ should use the `--remove` and `--add` flags respectively.
 NOTE! A `--remove` flag does 'not' mean that subsequent filenames will
 necessarily be removed: if the files still exist in your directory
 structure, the index will be updated with their new status, not
-removed. The only thing `--remove` means is that update-cache will be
+removed. The only thing `--remove` means is that update-index will be
 considering a removed file to be a valid thing, and if the file really
 does not exist any more, it will update the index accordingly.
 
diff --git a/Makefile b/Makefile
index 4454116..f000a5e 100644
--- a/Makefile
+++ b/Makefile
@@ -111,7 +111,7 @@ all::
 # times (my ext3 doesn't).
 #
 # Define USE_STDEV below if you want git to care about the underlying device
-# change being considered an inode change from the update-cache perspective.
+# change being considered an inode change from the update-index perspective.
 #
 # Define ASCIIDOC8 if you want to format documentation with AsciiDoc 8
 #
diff --git a/configure.ac b/configure.ac
index 7bcf1a4..5f8a15b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -415,7 +415,7 @@ GIT_PARSE_WITH(iconv))
 # times (my ext3 doesn't).
 #
 # Define USE_STDEV below if you want git to care about the underlying device
-# change being considered an inode change from the update-cache perspective.
+# change being considered an inode change from the update-index perspective.
 
 
 ## Output files
diff --git a/t/t4000-diff-format.sh b/t/t4000-diff-format.sh
index 7d92ae3..c44b27a 100755
--- a/t/t4000-diff-format.sh
+++ b/t/t4000-diff-format.sh
@@ -16,7 +16,7 @@ cat path0 >path1
 chmod +x path1
 
 test_expect_success \
-    'update-cache --add two files with and without +x.' \
+    'update-index --add two files with and without +x.' \
     'git update-index --add path0 path1'
 
 mv path0 path0-
diff --git a/t/t4001-diff-rename.sh b/t/t4001-diff-rename.sh
index 877c1ea..a326924 100755
--- a/t/t4001-diff-rename.sh
+++ b/t/t4001-diff-rename.sh
@@ -27,7 +27,7 @@ Line 15
 '
 
 test_expect_success \
-    'update-cache --add a file.' \
+    'update-index --add a file.' \
     'git update-index --add path0'
 
 test_expect_success \
diff --git a/t/t4100/t-apply-1.patch b/t/t4100/t-apply-1.patch
index de58751..f98baa8 100644
--- a/t/t4100/t-apply-1.patch
+++ b/t/t4100/t-apply-1.patch
@@ -90,7 +90,7 @@ diff --git a/Documentation/git.txt b/Documentation/git.txt
 diff --git a/Makefile b/Makefile
 --- a/Makefile
 +++ b/Makefile
-@@ -30,7 +30,7 @@ PROG=   git-update-cache git-diff-files
+@@ -30,7 +30,7 @@ PROG=   git-update-index git-diff-files
  	git-checkout-cache git-diff-tree git-rev-tree git-ls-files \
  	git-check-files git-ls-tree git-merge-base git-merge-cache \
  	git-unpack-file git-export git-diff-cache git-convert-cache \
diff --git a/t/t4100/t-apply-2.patch b/t/t4100/t-apply-2.patch
index cfdc808..f5c7d60 100644
--- a/t/t4100/t-apply-2.patch
+++ b/t/t4100/t-apply-2.patch
@@ -9,7 +9,7 @@ diff --git a/Makefile b/Makefile
 -	git-deltafy-script
 +	git-deltafy-script git-fetch-script
  
- PROG=   git-update-cache git-diff-files git-init-db git-write-tree \
+ PROG=   git-update-index git-diff-files git-init-db git-write-tree \
  	git-read-tree git-commit-tree git-cat-file git-fsck-cache \
 diff --git a/git-pull-script b/git-fetch-script
 similarity index 87%
diff --git a/t/t4100/t-apply-5.patch b/t/t4100/t-apply-5.patch
index de11623..ad45c51 100644
--- a/t/t4100/t-apply-5.patch
+++ b/t/t4100/t-apply-5.patch
@@ -200,7 +200,7 @@ diff a/Documentation/git.txt b/Documentation/git.txt
 diff a/Makefile b/Makefile
 --- a/Makefile
 +++ b/Makefile
-@@ -30,7 +30,7 @@ PROG=   git-update-cache git-diff-files
+@@ -30,7 +30,7 @@ PROG=   git-update-index git-diff-files
  	git-checkout-cache git-diff-tree git-rev-tree git-ls-files \
  	git-check-files git-ls-tree git-merge-base git-merge-cache \
  	git-unpack-file git-export git-diff-cache git-convert-cache \
diff --git a/t/t4100/t-apply-6.patch b/t/t4100/t-apply-6.patch
index d975363..a72729a 100644
--- a/t/t4100/t-apply-6.patch
+++ b/t/t4100/t-apply-6.patch
@@ -8,7 +8,7 @@ diff a/Makefile b/Makefile
 -	git-deltafy-script
 +	git-deltafy-script git-fetch-script
  
- PROG=   git-update-cache git-diff-files git-init-db git-write-tree \
+ PROG=   git-update-index git-diff-files git-init-db git-write-tree \
  	git-read-tree git-commit-tree git-cat-file git-fsck-cache \
 diff a/git-fetch-script b/git-fetch-script
 --- /dev/null
--
1.5.3.6.2088.g8c260

* Re: importing bk into git (succeeded)
From: Christoph @ 2007-12-08 19:19 UTC
To: Andreas Ericsson; +Cc: git

On Friday 30 November 2007 08:59:01 you wrote:
> Christoph wrote:
> > I am trying to import a BitKeeper repo into a (new) git repo.
> >
> > I am trying with the script bk2git.py that I found on the web.
> > This does not quite work - I fear the script no longer works with the
> > current git release, which is the one I am using.
[snip]
> > The following line fails:
> >   os.system("cd %s; git-ls-files --deleted | xargs git-update-cache --remove" % tmp_dir)
[snip]
> It should still do this, afaik, although it's probably better
> to just use GIT_DIR nowadays.

Using GIT_DIR works, but it has to point to the .git directory itself
(I had assumed it should be the directory *containing* .git).

Another issue with the original script was that every committer had to be
in the mapping (email -> name), otherwise it would not work.
(Supplying '*Unknown*' as a fallback fixed this.)

I have added
 - better argument parsing (see --help)
 - the ability to do incremental conversions (--incr-db, -r)
 - different levels of verbosity

I have attached a working version of the script.
I have added comments that (try to) explain the script in case someone else
has trouble with it.

Moreover, it is very helpful to put the directories inside a ramdisk.
Otherwise, you have to be extremely patient. (You have to be patient anyway:
for a bk repo of some 14k files (>110MB when 'clean', 8000 changesets) the
script took some 11 hrs.)

Another issue (when using ramdisks) is memory. On my machine memory is scarce
(only 1 GB), so the ever-growing bare repo (which can't be gc'ed before it
gets its head) exhausted the ramdisk space.
I worked around this by doing an incremental conversion. After each increment
a gc is possible, and the git repo shrinks to a manageable size (and still
fits inside the ramdisk).

So, to sum up: converting a big repo is no fun, but it works, given enough
time (and RAM).

Thanks, best regards
Christoph

--
A billion here, a couple of billion there -- first thing you know it
adds up to be real money.
                -- Senator Everett McKinley Dirksen

[-- Attachment: bk2git.py --]
[-- Type: application/x-python, Size: 10396 bytes --]
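
The two workarounds described in this message can be sketched roughly as follows. This is a minimal Python 3 sketch, not the attached script: the helper names `git_env`, `batches` and `convert_incrementally`, the `convert_one` callback and the default ref name are all hypothetical. It assumes that GIT_DIR must name the repository directory itself, and that a ref has to point at the newest commit before `git gc` can pack the converted history.

```python
import os
import subprocess


def git_env(git_dir, base_env=None):
    """Return a copy of the environment with GIT_DIR set.

    git_dir must be the repository directory itself (the .git directory
    or a bare repo), not the directory that contains .git.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["GIT_DIR"] = os.path.abspath(git_dir)
    return env


def batches(revisions, size):
    """Yield consecutive slices of at most `size` revisions."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(revisions), size):
        yield revisions[start:start + size]


def convert_incrementally(revisions, size, convert_one, git_dir,
                          ref="refs/heads/master"):
    """Convert revisions in batches, running gc after each batch.

    convert_one is a hypothetical callback that imports one BK revision
    and returns the id of the resulting git commit.
    """
    env = git_env(git_dir)
    for chunk in batches(revisions, size):
        commit = None
        for rev in chunk:
            commit = convert_one(rev)
        # Point the branch at the newest commit so gc treats it as reachable.
        subprocess.run(["git", "update-ref", ref, commit], env=env, check=True)
        subprocess.run(["git", "gc"], env=env, check=True)
```

The batch size trades ramdisk headroom against the cost of running gc more often; the thread does not say which size was used.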

* Re: importing bk into git
From: David Kettler @ 2007-12-03 3:02 UTC
To: christoph.duelli; +Cc: git

G'day,

I modified that script to convert a number of our repositories in
February. The version below worked for me at the time, but I'm not
able to test it now as our BK license has expired. In particular I'm
not sure if the bk_info.split line is correct; I had a reduced form of
this line in the file which now looks obviously wrong.

The script is slow; most of the time is in the bk export for every
revision. There are probably dumb things in there; I don't know
python and I was just starting with git.

Changes from the version I downloaded from the web include:
 - sundry changes to make it work for me
 - separate committers file to translate user names to full names
 - specify a git dir template
 - copy tags from BK
 - minimal conversion of ignore files
 - increased recursion limit to handle large number of commits

I hope this is useful to someone.

regards,
David.

--snip,snip--
# Convert a BK repository to GIT
# usage: bk2git BK_REPO [GIT_REPO]
# Single branch only.

import calendar
import os
import time
import sys

# res() below recurses once per ancestor revision.
sys.setrecursionlimit(10000);

templates_dir = "/tmp/bk2git/templates"
committers_file = "/tmp/bk2git/committers"
tmp_dir = "/tmp/bk-export%d" % os.getpid()

# Get repository locations.
if len(sys.argv) < 2 or len(sys.argv) > 3:
    print "usage: bk2git BK_REPO [GIT_REPO]"
    sys.exit(1);
bk_dir = sys.argv[1]
if len(sys.argv) == 3:
    git_dir = sys.argv[2]
else:
    git_dir = bk_dir + ".git"
print "BK " + bk_dir
print "GIT " + git_dir

# Get committer names (one "user Full Name" pair per line).
f = file(committers_file, "r")
committers = {}
for line in f:
    [m,n] = line.split(" ",1)
    committers[m] = n.strip();
f.close()

# Get tree of commits: revision -> list of parent revisions.
f = os.popen("cd %s; bk prs -d':REV:\\t:PARENT:\\t:MPARENT:\\t\\n' ChangeSet" % bk_dir)
f.readline()
parents={}
for rev in f:
    [n,p] = rev.rstrip().split("\t",1)
    parents[n] = p.split("\t")
f.close()

# Get tags.
f = os.popen("cd %s; bk changes -t -n -d':I:\\t:TAG:'" % bk_dir)
tags={}
for rev in f:
    [n,t] = rev.rstrip().split("\t",1)
    tags[n] = t
f.close()

# Initialize git repository.
os.system("mkdir %s" % git_dir)
os.chdir(git_dir)
os.system("git --bare init --template=%s" % templates_dir)
os.system("git-config core.bare false")

unknown = {}
def get_name(email):
    if committers.has_key(email):
        return committers[email]
    unknown[email] = True
    return "*Unknown*"

def git_commit(rev, p):
    # Make the export directory look like a work tree of the git repo.
    os.chdir(tmp_dir)
    os.symlink(git_dir, "%s/.git" % tmp_dir)
    # os.system("pwd; ls -AlR")
    os.system("git-ls-files -z --deleted | xargs -0 git-update-index --remove")
    os.system("git-ls-files -z --others | xargs -0 git-update-index --add")
    os.system("git-ls-files -z | xargs -0 git-update-index")
    treeid = os.popen("git-write-tree").read().rstrip()
    print "wrote tree as %s" % treeid
    os.chdir(git_dir)
    os.system("rm -Rf %s" % tmp_dir)

    bk_info = os.popen("cd %s; bk prs -r%s -d':KEY:\\n:UTC:\\n:USER:@:HOST:\\n$each(:C:){:C\\n}\\n' ChangeSet | sed 1d" % (bk_dir, rev)).read()
    [key, date, user, comments] = bk_info.split("\n", 3)
    # [key, date, user] = bk_info.split("\n", 2)
    [name, machine] = user.split("@", 1);
    f = file("/tmp/git-comments","w")
    f.write(comments)
    f.write("BK KEY: %s\n" % key)
    f.close()

    # :UTC: is already in UTC, so convert with timegm (mktime would
    # interpret the fields as local time).
    sdate = str(calendar.timegm(time.strptime(date, "%Y%m%d%H%M%S")))
    os.putenv("GIT_AUTHOR_DATE", sdate)
    os.putenv("GIT_AUTHOR_EMAIL", user)
    os.putenv("GIT_AUTHOR_NAME", get_name(name))
    os.putenv("GIT_COMMITTER_DATE", sdate)
    os.putenv("GIT_COMMITTER_EMAIL", user)
    os.putenv("GIT_COMMITTER_NAME", get_name(name))
    commitid = os.popen("git-commit-tree %s %s < /tmp/git-comments" % (treeid, " ".join(["-p "+a for a in p]))).read().rstrip()
    print "committed %s as %s" % (rev, commitid)
    if tags.has_key(rev):
        os.system("git-tag %s %s" % (tags[rev], commitid))
        print "tagged %s" % tags[rev]
    return commitid

os.system("mkdir %s; touch %s/initial" % (tmp_dir, tmp_dir))
resolved = {'1.1': git_commit("1.1",[])}

def res(ver):
    # Convert all parents first, then export and commit this revision.
    if resolved.has_key(ver):
        return
    for v in parents[ver]:
        res(v)
    os.system("cd %s; bk export -r%s %s" % (bk_dir, ver, tmp_dir))
    ignore = "%s/.gitignore" % tmp_dir
    os.system("cd %s; bk co -kpq -r@%s BitKeeper/etc/ignore | sed '/^BitKeeper/d;/^PENDING/d' > %s" % (bk_dir, ver, ignore))
    os.system("test -s %s || rm %s" % (ignore, ignore))
    resolved[ver] = git_commit(ver, [resolved[v] for v in parents[ver]])
    return resolved[ver]

tot = os.popen("cd %s; bk prs -r+ -d':REV:' ChangeSet | tail -n 1" % bk_dir).read()
print "Exporting bitkeeper up to version %s" % tot
HEAD = res(tot)
print "HEAD: %s" % HEAD
file("%s/refs/heads/master" % git_dir,"w").write(HEAD + "\n")

os.system("git-config core.bare true")
os.system("git gc")
print unknown.keys()
--snip,snip--
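
The script raises the recursion limit because res() recurses once per ancestor revision. One possible alternative, which is not something proposed in this thread, is to compute the conversion order iteratively with an explicit stack. The sketch below is written for Python 3 and assumes a `parents` mapping of the same shape the script builds (revision -> list of parent revisions); the function name `conversion_order` and the `done` parameter are hypothetical.

```python
def conversion_order(parents, head, done=()):
    """Return the ancestors of head (and head itself) with parents first.

    parents maps a revision to the list of its parent revisions.
    done holds revisions that are already converted and must be skipped.
    """
    order = []
    seen = set(done)
    # Each stack entry is (revision, expanded); expanded means all of its
    # parents have already been pushed, so it can be emitted when popped.
    stack = [(head, False)]
    while stack:
        rev, expanded = stack.pop()
        if expanded:
            order.append(rev)
            continue
        if rev in seen:
            continue
        seen.add(rev)
        stack.append((rev, True))
        for parent in reversed(parents.get(rev, [])):
            if parent not in seen:
                stack.append((parent, False))
    return order
```

A conversion loop could then walk the returned list and commit each revision in turn, so the depth of the history no longer matters for the Python stack.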

* Re: importing bk into git
From: Christoph @ 2007-12-03 20:59 UTC
To: David Kettler; +Cc: git

On Monday 03 December 2007 04:02:43 you wrote:
> G'day,
>
> I modified that script to convert a number of our repositories in
> February. The version below worked for me at the time, but I'm not
> able to test it now as our BK license has expired. In particular I'm
> not sure if the bk_info.split line is correct; I had a reduced form of
> this line in the file which now looks obviously wrong.
>
> The script is slow; most of the time is in the bk export for every
> revision. There are probably dumb things in there; I don't know
> python and I was just starting with git.
>
> Changes from the version I downloaded from the web include:
>  - sundry changes to make it work for me
>  - separate committers file to translate user names to full names
>  - specify a git dir template
>  - copy tags from BK
>  - minimal conversion of ignore files
>  - increased recursion limit to handle large number of commits
>
> I hope this is useful to someone.

Thanks, I have made some modifications to get the script working as well.
I was able to convert some simple (really small) test repos. I wanted to
try converting my really big repo (>10k files, >5000 revisions) before
mailing it. However, as running this conversion will probably take more
than 24 hours, I will do so next weekend.

I will check your script and integrate my changes (if there are any relative
to yours) after that.

best regards
Christoph

Thread overview: 6+ messages
2007-11-29 21:32 importing bk into git Christoph
2007-11-30  7:59 ` Andreas Ericsson
2007-11-30 11:35   ` [PATCH] Replace the word 'update-cache' by 'update-index' everywhere Johannes Schindelin
2007-12-08 19:19   ` importing bk into git (succeeded) Christoph
2007-12-03  3:02 ` importing bk into git David Kettler
2007-12-03 20:59   ` Christoph