From: Thomas Jarosch <thomas.jarosch@intra2net.com>
To: git@vger.kernel.org
Subject: help needed: Splitting a git repository after subversion migration
Date: Sun, 07 Dec 2008 18:41:01 +0100
Message-ID: <493C0AAD.1040208@intra2net.com>

Hello everyone,

I've successfully imported a large subversion repository into git.
The tree contains source code and binary data ("releases"),
the resulting .git directory is about 11GB.

After the import I recreated the tags by converting the remote refs
of the subversion tags/branches into git tags, using a small shell
script from the web:

for branch in `git branch -r`; do
     ...
     version=`basename $branch`
     git tag -s -f -m "$subject" "$version" "$branch^"
     git branch -d -r $branch
done

OK, so far everything went really smoothly. I wanted to split this repository
into two repositories, one for the source code and one for the binary data.
The current tree layout is like this:

sources/c++_xyz
releases/large_binary_data
...

The original tree was imported from CVS to subversion, and the layout
of the trunk was later reorganized/moved once. Here's the command
I used to split out the "source" tree:

git filter-branch --index-filter 'git rm --cached --ignore-unmatch -r -f \
    CVSROOT Attic source/Attic develpkg/Attic \
    source/packages/Attic releases update_pkg' -- --all

After that I ran these commands to reclaim the space:
- git clone --no-hardlinks filtered_tree final_output
- cd final_output
- git gc
- git prune
- git repack -a -d --depth=250 --window=250
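
The steps above could be wrapped in a small script that also reports
the resulting pack size, which makes before/after comparisons easier.
This is only a sketch: the function names reclaim_space and
pack_size_kib are made up for illustration, and it assumes a git
version whose "git count-objects -v" prints a "size-pack:" line.

```shell
#!/bin/sh
# Hypothetical helper: read `git count-objects -v` output on stdin and
# print the value of the "size-pack:" line (size of all packs, in KiB).
pack_size_kib() {
    awk '$1 == "size-pack:" { print $2 }'
}

# Hypothetical helper: clone SRC into DST without hardlinks, run the same
# cleanup commands as listed above, then print the pack size of the result.
reclaim_space() {
    src="$1"
    dst="$2"
    git clone --no-hardlinks "$src" "$dst" &&
    (
        cd "$dst" &&
        git gc &&
        git prune &&
        git repack -a -d --depth=250 --window=250 &&
        git count-objects -v | pack_size_kib
    )
}

# Example usage (directory names taken from the list above):
#   reclaim_space filtered_tree final_output
```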

Unfortunately, the .git directory of the "source" tree is still 7.5GB in size.

When I just imported the "trunk" from subversion without any tags
and then ran "git filter-branch --subdirectory-filter source" + git gc,
the .git directory was about 1.5GB afterwards.

How can I find out where those other 6GB went?
I already looked at the tags with gitk,
there's no sign of the releases/* stuff left.
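
One way to look for the space might be to list the objects that take
up the most room in the pack files. The following is a minimal sketch,
not a definitive recipe: largest_packed is a made-up helper name, and
it assumes the usual "git verify-pack -v" output where each object
line has the object name, type, size, size in the pack and offset.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): read `git verify-pack -v` output
# on stdin and print the N entries that occupy the most space in the pack.
# Object lines look like: <sha> <type> <size> <size-in-pack> <offset> [...]
largest_packed() {
    # Keep only object lines (they start with a hex object name), then
    # sort numerically, descending, on the 4th column (size in the pack).
    grep -E '^[0-9a-f]{40,64} ' | sort -k4,4nr | head -n "${1:-10}"
}

# Example usage inside a non-bare repository:
#   git verify-pack -v .git/objects/pack/pack-*.idx | largest_packed 20
# An object name printed there can then be looked up among the objects
# reachable from any ref:
#   git rev-list --objects --all | grep <sha>
```

If an object name from the list shows up in the "git rev-list" output,
the path printed next to it hints at which ref history still contains
it; if it does not show up, the object is not reachable from any ref.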

The "--all" switch for "git filter-branch"
doesn't seem to be documented in git 1.6.0.4?
I only learned about it from the example usage.

"git filter-branch" also had trouble converting the tags
and suggested I should add "--tag-name-filter cat", which I did.
Maybe that's something for the examples, too?

I also tried running "git filter-branch --tag-name-filter cat
--subdirectory-filter source -- --all", but that command aborts
with these messages:

WARNING: 'refs/tags/v5-0-8' was rewritten into multiple commits:
ee180f6117597b60ee237e9da92047946dfdeec5
fd7824d1926ce9e4c89b685583eb9a9c2f2537af
WARNING: Ref 'refs/tags/v5-0-8' points to the first one now.
error: Ref refs/tags/v5-0-8 is at 4ea78238cfd6ee259c4e8bde7be4a90bc86295b0 but expected 06c60261502acfb7b2bbe44c2e2ec371bea65827
fatal: Cannot lock the ref 'refs/tags/v5-0-8'.
Could not rewrite refs/tags/v5-0-8


Besides that git really rocks :-)

Thanks in advance,
Thomas
