* [PATCH] Rough draft: Split a git repo into superproject and submodule
@ 2009-02-11 2:36 Eric Kidd
From: Eric Kidd @ 2009-02-11 2:36 UTC (permalink / raw)
To: git; +Cc: Eric Kidd
I have some extremely large git repositories which I want to split into
modules. This could be done using 'git filter-branch' and the following
steps:
1) Create a submodule using --subdirectory-filter.
2) Create a superproject using an index filter to delete the submodule.
3) Commit the submodule to the latest version of the superproject.
Unfortunately, this approach loses all the historical connections
between the superproject and the submodule, breaking tools like 'git
bisect', and making it difficult to recover old releases.
Ideally, each version of the newly created superproject would be linked
to the correct version of the submodule (and all the .gitmodules entries
would be set up correctly, too, throughout the project's history).
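As a rough sketch of how that linkage can be kept (roughly the idea
the script below follows, though the helper names and commit IDs here
are made up for illustration): while rewriting the submodule, record
which new commit each original commit turned into, then look that
record up when rewriting the same original commit in the superproject.

```shell
#!/bin/sh
# Illustrative only: the commit IDs below are made-up placeholders,
# and record/gitlink_for are hypothetical helpers, not part of git.
map_dir=$(mktemp -d) || exit 1

# Pass 1 (submodule rewrite): remember which new commit each original became.
record() { printf '%s\n' "$2" > "$map_dir/$1-submodule-commit"; }
record orig-aaa sub-111
record orig-bbb sub-222

# Pass 2 (superproject rewrite): emit the gitlink entry for an original commit.
# Mode 160000 marks a submodule entry; a TAB separates the ID from the path.
gitlink_for() {
	test -f "$map_dir/$1-submodule-commit" || return 1
	printf '160000 %s\t%s\n' "$(cat "$map_dir/$1-submodule-commit")" "$2"
}

entry=$(gitlink_for orig-bbb sub1)
printf '%s\n' "$entry"
rm -rf "$map_dir"
```

In a real run the printed line would be fed to 'git update-index
--index-info' with genuine commit IDs instead of being echoed.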
The attached patch contains a _very_ rough draft of a git
submodule-split command. You can run it as follows:
git submodule-split original-repo superproject directory-to-split
git clone superproject.git superproject
cd superproject
git submodule update --init
It will output two bare repositories named superproject.git and
directory-to-split.git (each name is the corresponding argument with
.git appended). Internally, it runs 'git filter-branch' on a bare
clone of the original repository.
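The filters handed to filter-branch are shell fragments that are
evaluated later, in a separate process. The script passes the directory
name to them through exported variables rather than pasting it into the
fragment text. A minimal stand-alone sketch of why that survives spaces
and quotes (the directory name is an invented example, and 'sh -c'
merely stands in for filter-branch here):

```shell
#!/bin/sh
# A directory name with a space and quotes, to show why quoting matters.
SPLIT_SUBMODULE_DIR="my dir's \"lib\""
export SPLIT_SUBMODULE_DIR

# The quoted delimiter (\EOF) keeps the body unexpanded, so the variable
# is only looked up when the fragment is evaluated later.
filter=$(cat << \EOF
printf '%s\n' "$SPLIT_SUBMODULE_DIR"
EOF
)

# Stand-in for filter-branch: evaluate the fragment in a child shell,
# which sees the value only because the variable was exported.
result=$(sh -c "$filter")
printf 'filter saw: %s\n' "$result"
```

Because the fragment never contains the name itself, no extra escaping
is needed however odd the directory name is.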
There's a test suite, too.
I still have quite a few things left to do before this script is
actually useful:
- Rename submodule-split to something more appropriate
- Update .gitmodules using git config
- Merge new entries with existing .gitmodules
- Add support for directory and repository names which differ
- Add support for multiple possible directory names
- Add support for directories which move around the tree
- Add support for directories which are missing in some revisions
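For the two .gitmodules items, one possible direction (a sketch only,
not what the patch currently does) is to let 'git config -f' edit the
file, since it adds or updates a single section and leaves any others
alone. The submodule names, paths and URLs below are invented:

```shell
#!/bin/sh
# Sketch: maintain .gitmodules with 'git config -f' rather than a here-document.
# The submodule names, paths and URLs are made-up examples.
tmp=$(mktemp -d) || exit 1
modules="$tmp/.gitmodules"

# Pretend an earlier entry already exists in the file.
git config -f "$modules" submodule.existing.path existing
git config -f "$modules" submodule.existing.url ../existing.git

# Add the entry for the newly split directory; other sections are preserved.
git config -f "$modules" submodule.sub1.path sub1
git config -f "$modules" submodule.sub1.url ../sub1.git

# Read back what is now recorded.
paths=$(git config -f "$modules" --get-regexp '^submodule\..*\.path$')
url=$(git config -f "$modules" --get submodule.sub1.url)
printf '%s\n' "$paths"
rm -rf "$tmp"
```

Inside an index filter the resulting file would still have to be hashed
and added to the index, as the script already does for its generated
.gitmodules.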
I'm releasing this version for feedback on the general design and the
coding style. I don't often write shell scripts of this magnitude, and
there are almost certainly some portability and style problems. So please
let me know what needs improvement, and I'll try to fix it.
Thank you for your feedback!
---
.gitignore | 1 +
Makefile | 1 +
git-submodule-split.sh | 113 ++++++++++++++++++++++++++++++++++++++++++++
t/t7404-submodule-split.sh | 36 ++++++++++++++
4 files changed, 151 insertions(+), 0 deletions(-)
create mode 100644 git-submodule-split.sh
create mode 100755 t/t7404-submodule-split.sh
diff --git a/.gitignore b/.gitignore
index 13311f1..fa6ed07 100644
--- a/.gitignore
+++ b/.gitignore
@@ -119,6 +119,7 @@ git-show
git-show-branch
git-show-index
git-show-ref
+git-submodule-split
git-stage
git-stash
git-status
diff --git a/Makefile b/Makefile
index 27b9569..aceac8f 100644
--- a/Makefile
+++ b/Makefile
@@ -277,6 +277,7 @@ SCRIPT_SH += git-sh-setup.sh
SCRIPT_SH += git-stash.sh
SCRIPT_SH += git-submodule.sh
SCRIPT_SH += git-web--browse.sh
+SCRIPT_SH += git-submodule-split.sh
SCRIPT_PERL += git-add--interactive.perl
SCRIPT_PERL += git-archimport.perl
diff --git a/git-submodule-split.sh b/git-submodule-split.sh
new file mode 100644
index 0000000..d7d1080
--- /dev/null
+++ b/git-submodule-split.sh
@@ -0,0 +1,113 @@
+#!/bin/sh
+#
+# Split a repository into a submodule and main module, with history
+#
+# Copyright 2009 Eric Kidd
+# License: GNU General Public License, version 2 or later
+
+USAGE="src-repo dst-repo submodule-dir"
+
+OPTIONS_SPEC=
+NONGIT_OK=Yes
+. git-sh-setup
+
+# Keep our argument parsing simple for now.
+test "$#" = 3 || usage
+src_repo="$1"
+dst_repo="$2.git"
+sub_repo="$3.git"
+submodule_dir="$3"
+revs="--all"
+
+# Make a bare clone of a git repo with identical branches.
+git_mirror() {
+ git clone --mirror "$1" "$2" || exit 1
+ # For some reason, git clone --mirror doesn't actually create our
+ # local branch references for us.
+ (cd "$2" && git fetch) || exit 1
+}
+
+# We export these variables so that they can be used from scripts passed to
+# git filter-branch. Thanks to gitte for this trick, which also allows us
+# to do the right thing when subdirectory names contain spaces and quotes.
+export SPLIT_SUB_REPO="$sub_repo"
+export SPLIT_SUBMODULE_DIR="$submodule_dir"
+export SPLIT_MAP_DIR="`pwd`/$sub_repo/split-map"
+
+
+#--------------------------------------------------------------------------
+# Create the new submodule
+
+# Create a copy of $src_repo to transform.
+git_mirror "$src_repo" "$sub_repo"
+
+# For each commit ID, we will create files in $SPLIT_MAP_DIR containing
+# information that we'll later use to rewrite the superproject.
+mkdir "$SPLIT_MAP_DIR" || exit 1
+
+index_filter=$(cat << \EOF
+map_info="$SPLIT_MAP_DIR/$GIT_COMMIT"
+if git rev-parse -q --verify $GIT_COMMIT:"$SPLIT_SUBMODULE_DIR"; then
+ # Adapted from git-filter-branch.
+ err=$(git read-tree -i -m $GIT_COMMIT:"$SPLIT_SUBMODULE_DIR" 2>&1) ||
+ die "$err"
+ printf '%s' "$SPLIT_SUBMODULE_DIR" > "$map_info-dir"
+else
+ # We will use an empty file to indicate that the directory
+ # doesn't exist in the tree.
+ # touch "$map_info-skipped"
+ die "Directory is missing"
+fi
+EOF
+)
+
+commit_filter=$(cat << \EOF
+map_info="$SPLIT_MAP_DIR/$GIT_COMMIT"
+new_commit="$(git commit-tree "$@")" || exit 1
+echo $new_commit
+echo $new_commit > "$map_info-submodule-commit" ||
+ die "Can't record the commit ID of the new commit"
+EOF
+)
+
+# Run our filters.
+(cd "$sub_repo" &&
+ git filter-branch --index-filter "$index_filter" \
+ --commit-filter "$commit_filter" -- "$revs") || exit 1
+
+
+#--------------------------------------------------------------------------
+# Create the new superproject
+
+# Next, create our new parent repository.
+git_mirror "$src_repo" "$dst_repo"
+
+index_filter=$(cat << \EOF
+map_info="$SPLIT_MAP_DIR/$GIT_COMMIT"
+
+# Splice the repo into the tree.
+test -f "$map_info-submodule-commit" || die "Can't find map for $GIT_COMMIT"
+git rm -q --cached -r "$SPLIT_SUBMODULE_DIR" || exit 1
+printf '160000 %s\t%s\n' "$(cat "$map_info-submodule-commit")" \
+ "$SPLIT_SUBMODULE_DIR" | git update-index --index-info || exit 1
+
+# Construct a new .gitmodules file.
+cat > "$SPLIT_MAP_DIR/gitmodules" <<EOC
+[submodule "$SPLIT_SUBMODULE_DIR"]
+ path = $SPLIT_SUBMODULE_DIR
+ url = ../$SPLIT_SUB_REPO
+EOC
+
+# Write the new .gitmodules file into the tree.
+new_obj=$(git hash-object -t blob -w "$SPLIT_MAP_DIR/gitmodules") ||
+ die "Error adding new .gitmodules file to tree"
+git update-index --add --cacheinfo 100644 "$new_obj" .gitmodules || exit 1
+
+EOF
+)
+
+# Run our filter.
+(cd "$dst_repo" &&
+ git filter-branch --index-filter "$index_filter" -- "$revs") || exit 1
+
+exit 0
diff --git a/t/t7404-submodule-split.sh b/t/t7404-submodule-split.sh
new file mode 100755
index 0000000..b490c60
--- /dev/null
+++ b/t/t7404-submodule-split.sh
@@ -0,0 +1,36 @@
+#!/bin/sh
+#
+# Copyright 2009 Eric Kidd
+
+test_description='git submodule-split tests'
+. ./test-lib.sh
+
+rm -rf .git
+test_create_repo original
+
+test_expect_success \
+ 'create original repository' \
+ '(cd original &&
+ echo "In main project" > main-file &&
+ mkdir sub1 &&
+ echo "In sub1" > sub1/sub1-file &&
+ git add . &&
+ git commit -m "Original project and sub1")'
+
+test_expect_success \
+ 'split out sub1' \
+ 'git submodule-split original split1 sub1 &&
+ git clone split1.git split1 &&
+ test -f split1/main-file &&
+ ! test -f split1/sub1/sub1-file &&
+ git clone sub1.git sub1 &&
+ test -f sub1/sub1-file'
+
+test_expect_success \
+ 'compare split repositories with original' \
+ 'rm -rf split1 &&
+ git clone split1.git split1 &&
+ (cd split1 && git submodule init sub1 && git submodule update) &&
+ diff -uNr -x .git -x .gitmodules original split1'
+
+test_done
--
1.6.0.4