From: Thomas Jarosch <thomas.jarosch@intra2net.com>
To: git@vger.kernel.org
Subject: help needed: Splitting a git repository after subversion migration
Date: Sun, 07 Dec 2008 18:41:01 +0100
Message-ID: <493C0AAD.1040208@intra2net.com>
Hello everyone,
I've successfully imported a large subversion repository into git.
The tree contains source code and binary data ("releases"),
the resulting .git directory is about 11GB.
After the import I recreated the tags/branches by converting the remote refs
of the subversion tags into git tags, using a small shell script from the web:
for branch in `git branch -r`; do
    ...
    version=`basename $branch`
    git tag -s -f -m "$subject" "$version" "$branch^"
    git branch -d -r $branch
done
Ok, so far everything went really smoothly. I wanted to split this repository
into two repositories, one for the source code and one for the binary data.
The current tree layout is like this:
sources/c++_xyz
releases/large_binary_data
...
The original tree was imported from CVS into subversion, and the layout
of the trunk was reorganized/moved once afterwards. Here's the command
I used to split out the "source" tree:
git filter-branch --index-filter 'git rm --cached --ignore-unmatch -r -f
CVSROOT Attic source/Attic develpkg/Attic
source/packages/Attic releases update_pkg' -- --all
After that I ran these commands to reclaim the space:
- git clone --no-hardlinks filtered_tree final_output
- cd final_output
- git gc
- git prune
- git repack -a -d --depth=250 --window=250
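A variation on the steps above, as a minimal sketch under stated assumptions. According to the git-clone documentation, a local-path clone with --no-hardlinks copies the files under .git/objects, while a source given as a file:// URL goes through the regular git transport, which transfers only objects reachable from the cloned refs. The function name compact_clone and its two arguments are hypothetical, and nothing in this message establishes that the clone method explains the remaining size.

```sh
# compact_clone SRC DST (hypothetical helper, not part of git)
# Clone SRC into DST over the file:// transport, repack, prune,
# and print the object statistics of the result.
compact_clone() {
    src=$(cd "$1" && pwd) || return 1   # file:// needs an absolute path
    dst=$2
    git clone -q "file://$src" "$dst" || return 1
    (
        cd "$dst" || exit 1
        git repack -a -d -q &&          # rewrite everything into one pack
        git prune &&                    # drop unreachable loose objects
        git count-objects -v            # report counts and sizes
    )
}
```

Usage would be along the lines of "compact_clone filtered_tree final_output"; the "size-pack" line of the output is the packed size in KiB.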
Unfortunately the .git directory of the "source" tree is still 7.5GB in size.
When I just imported the "trunk" from subversion without any tags
and then ran "git filter-branch --subdirectory-filter source" + git gc,
the .git directory was about 1.5GB afterwards.
How can I find out where those other 6GB went?
I already looked at the tags with gitk,
there's no sign of the releases/* stuff left.
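One possible way to look for the space, as a rough sketch rather than a definitive answer: list the blobs that take the most room in the pack files and map them back to path names, then count the commits that still touch a given path. Both function names (largest_blobs, commits_touching) are hypothetical helpers; they rely only on "git verify-pack -v", "git rev-list --objects --all" and standard tools, and assume the repository has already been repacked so that pack files exist.

```sh
# largest_blobs [N]: print the N blobs with the largest size in the pack
# as "size-in-pack <TAB> object name <TAB> path". A blob that no ref
# reaches is shown with a placeholder instead of a path.
largest_blobs() {
    count=${1:-10}
    git_dir=$(git rev-parse --git-dir) || return 1
    names=$(mktemp) || return 1
    # Object name plus path for everything reachable from any ref
    git rev-list --objects --all > "$names"
    # verify-pack -v columns: name, type, size, size in pack, offset
    git verify-pack -v "$git_dir"/objects/pack/pack-*.idx |
        awk '$2 == "blob" { print $4, $1 }' |
        sort -rn | head -n "$count" |
        while read -r size sha; do
            path=$(awk -v sha="$sha" \
                '$1 == sha { print substr($0, length(sha) + 2); exit }' "$names")
            printf '%s\t%s\t%s\n' "$size" "$sha" "${path:-<not reachable from any ref>}"
        done
    rm -f "$names"
}

# commits_touching PATH: number of commits reachable from any ref
# (branches and tags alike) that modify something below PATH.
commits_touching() {
    git rev-list --all -- "$1" | wc -l | tr -d ' '
}
```

For example, "largest_blobs 20" followed by "commits_touching releases" would show whether large files are still referenced and whether any ref still has history under releases/.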
The "--all" switch for "git filter-branch"
doesn't seem to be documented in git 1.6.0.4?
I just learned about it from the example usage.
"git filter-branch" also had trouble converting the tags
and suggested I should add "--tag-name-filter cat", which I did.
Maybe that's something for the examples, too?
I also tried running "git filter-branch --tag-name-filter cat
--subdirectory-filter source -- --all", but that command aborts
with these messages:
WARNING: 'refs/tags/v5-0-8' was rewritten into multiple commits:
ee180f6117597b60ee237e9da92047946dfdeec5
fd7824d1926ce9e4c89b685583eb9a9c2f2537af
WARNING: Ref 'refs/tags/v5-0-8' points to the first one now.
error: Ref refs/tags/v5-0-8 is at 4ea78238cfd6ee259c4e8bde7be4a90bc86295b0
but expected 06c60261502acfb7b2bbe44c2e2ec371bea65827
fatal: Cannot lock the ref 'refs/tags/v5-0-8'.
Could not rewrite refs/tags/v5-0-8
Besides that git really rocks :-)
Thanks in advance,
Thomas