* rewriting pathnames in history
From: Jeff King @ 2006-02-21 7:53 UTC (permalink / raw)
To: git
I recently ran into an interesting situation with git. I created a
repository that consisted of several directories (and files in them).
Later, after many commits, I realized I would prefer each directory have
its own git repository. That is, given a repo with the files:
    foo/bar
    baz/bleep
I wanted two repos, "foo" containing the file "bar" and "baz" containing
the file "bleep".
Obviously, one could simply make new repositories (one for each
directory), rename the files, and commit. However, I wanted to keep the
history for each new repo tidy, as well. So my solution was to replay
the history once for each new repo, omitting any revisions which had no
effect, and rewriting paths to move "dir/foo" to "foo".
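As a rough illustration of the path handling just described (a sketch, not the script itself): the snippet below keeps only the paths under one directory and strips that directory from the front. The helper name select_and_rewrite and the sample paths are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only paths under $prefix and strip that prefix.
sub select_and_rewrite {
    my ($prefix, @paths) = @_;
    my @kept = grep { m{^\Q$prefix\E/} } @paths;   # paths inside the directory
    s{^\Q$prefix\E/}{} for @kept;                   # "foo/bar" becomes "bar"
    return @kept;
}

# A revision that touches only baz/ yields nothing for the "foo" repo,
# so it would be omitted from the replayed history.
my @rev1 = select_and_rewrite('foo', 'foo/bar', 'baz/bleep');
my @rev2 = select_and_rewrite('foo', 'baz/bleep');
print "rev1: @rev1\n";
print "rev2: ", scalar(@rev2), " paths\n";
```

A revision whose filtered list comes back empty is the "no effect" case and gets no commit in the new history.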
The script I used is included at the end of this mail. I'm posting in
case anyone else finds it useful (comments are also welcome).
I also have a question regarding this task. I wanted to split the whole
history, so I wanted a "blank" commit to start adding my replay to (that
is, a commit with no files and no parent). What's the best way using
git to get a blank commit? I ended up creating a new repo (with cogito,
which I regularly use), and then fetching it into the original repo and
switching to it as a branch.
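One possible way to get such a commit with plumbing alone, offered as a sketch rather than a definitive answer: write an empty tree object and pass it to git commit-tree without any parent. This assumes a SHA-1 repository and a git recent enough to support "git hash-object -t tree" and "git commit-tree -m"; the sub names and the branch argument are invented for illustration.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);

# An object id is the SHA-1 of "<type> <size>\0<content>". For an empty
# tree the content is empty, so the id can be computed without git.
sub empty_tree_id {
    return sha1_hex("tree 0\0");
}

# Hypothetical helper: create a root commit (no parent) with an empty tree
# and point refs/heads/$branch at it. Must be run inside a repository.
sub make_blank_branch {
    my ($branch) = @_;
    # -w writes the empty tree object into the object database.
    chomp(my $tree = `git hash-object -w -t tree /dev/null`);
    die "unexpected tree id '$tree', halting" unless $tree eq empty_tree_id();
    chomp(my $commit = `git commit-tree $tree -m "blank root"`);
    die "git commit-tree failed, halting" if $? or !$commit;
    system('git', 'update-ref', "refs/heads/$branch", $commit) == 0
        or die "git update-ref failed, halting";
    return $commit;
}

print empty_tree_id(), "\n";
```

Calling make_blank_branch('split') inside the original repository would then give a branch to switch to before running the replay.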
-Peff
--------------
#!/usr/bin/perl
# Rewrite history by replaying and modifying pathnames.
# Public domain.
#
# 1. Switch your HEAD to the head where the rewritten history will go.
# 2. Figure out which revs you want to replay (e.g., git-rev-list master)
# 3. Figure out which paths you want to include (e.g., '/^prefix/')
# 4. Figure out how you want to modify the path (e.g., 's!^prefix/!!')
# 5. Run the script:
#    git-rev-list master | perl rewrite.pl '/^prefix/' 's!^prefix/!!'
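use strict;
use warnings;

my $USAGE = 'usage: rewrite.pl match rewrite';
my $match = shift or die "$USAGE, halting";
my $rewrite = shift or die "$USAGE, halting";

# Revisions arrive newest-first on stdin (as git-rev-list prints them);
# replay them oldest-first on top of the current HEAD.
my @revs = ('HEAD', reverse(map { chomp; $_ } <>));
foreach my $i (1 .. $#revs) {
    my @files = difftree($revs[$i-1], $revs[$i]);
    # Skip revisions that touch none of the selected paths.
    @files = grep { match($match, $_) } @files
        or next;
    @files = map { rewrite($rewrite, $_) } @files;
    update_index(@files);
    commit($revs[$i]);
}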
sub difftree {
    my ($x, $y) = @_;
    my @files;
    open(my $fh, "git-diff-tree -r $x $y|")
        or die "unable to open git-diff-tree: $!, halting";
    while(my $line = <$fh>) {
        chomp $line;
        # Raw format: ":oldmode newmode oldsha newsha status<TAB>path"
        $line =~ /^:\d+ (\d+) [0-9a-f]+ ([0-9a-f]+) .\t(.*)/
            or die "bad diff-tree output: $line, halting";
        push @files, [$1, $2, $3];
    }
    # The child's exit status only lands in $? once the pipe is closed.
    close($fh);
    $? and die "git-diff-tree exited with status $?, halting";
    return @files;
}
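sub match {
    my $m = shift;
    my $f = shift;
    local $_ = $f->[2];
    my $matched = eval $m;
    # A syntax error in the match expression would otherwise skip every file.
    $@ and die $@;
    return $matched;
}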
sub rewrite {
    my $r = shift;
    my $f = shift;
    local $_ = $f->[2];
    eval $r;
    $@ and die $@;
    $f->[2] = $_;
    return $f;
}

sub update_index {
    open(my $fh, '|git-update-index --index-info')
        or die "unable to open git-update-index, halting";
    foreach my $f (@_) {
        print $fh "$f->[0] $f->[1]\t$f->[2]\n";
    }
    close($fh);
    $? and die "git-update-index reported failure, halting";
}

sub commit {
    my $r = shift;
    system("git-commit -C $r")
        and die "git-commit reported failure, halting";
}
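
To make the data flow in the script concrete, here is a small standalone sketch of what happens to a single line of git-diff-tree output: it is parsed with the same regex, the path is rewritten, and the result is formatted the way update_index() feeds it to git-update-index --index-info. The helper name to_index_info is hypothetical, and the object IDs are placeholders rather than real objects.

```perl
use strict;
use warnings;

# Hypothetical helper: turn one raw diff-tree line into an --index-info
# line, stripping $prefix from the path. Returns undef for other paths.
sub to_index_info {
    my ($line, $prefix) = @_;
    $line =~ /^:\d+ (\d+) [0-9a-f]+ ([0-9a-f]+) .\t(.*)/
        or die "bad diff-tree output: $line, halting";
    my ($mode, $sha, $path) = ($1, $2, $3);
    $path =~ s{^\Q$prefix\E/}{} or return;
    return "$mode $sha\t$path\n";
}

# Placeholder object ids (40 hex digits each), for illustration only.
my ($old, $new) = ('a' x 40, 'b' x 40);
my $raw = ":100644 100644 $old $new M\tfoo/bar";
print to_index_info($raw, 'foo');
```

Only the new mode and new object ID survive; the old side of the diff is irrelevant because the index entry is simply overwritten.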
* Re: rewriting pathnames in history
From: Sam Vilain @ 2006-02-21 20:54 UTC (permalink / raw)
To: Jeff King; +Cc: git
Jeff King wrote:
> I recently ran into an interesting situation with git. I created a
> repository that consisted of several directories (and files in them).
> Later, after many commits, I realized I would prefer each directory have
> its own git repository. That is, given a repo with the files:
> foo/bar
> baz/bleep
> I wanted two repos, "foo" containing the file "bar" and "baz" containing
> the file "bleep".
Nice work, but I think you should be able to get it *really* fast, much
faster than that.
Instead of replaying a checked-out copy, just walk the commit history;
whenever the tree ID for that subdirectory changes, the directory has a
new revision. So make a new commit object with that tree ID as its tree.
In other words, you'll be constructing a very lightweight branch whose
tree IDs all correspond to subdirectory tree IDs on the combined branch.
The history ripple script that was posted the other day probably has
most of the pieces you need.
Once this is done, you can just clone that branch to "get it out".
Sam.
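
The tree-ID approach suggested above could be sketched roughly as follows. This is an illustration under assumptions, not a tested tool: it assumes linear history, a current git (for "git rev-parse rev:path" and "git commit-tree -m"), and that the path is a directory; the sub names and the commit message text are invented, and the original authors, dates and messages are not carried over.

```perl
use strict;
use warnings;

# Given [commit, subtree_id] pairs in oldest-first order, keep only the
# pairs whose subtree id differs from the previous one kept. Commits in
# which the directory does not exist (empty id) are skipped.
sub tree_changes {
    my $last = '';
    my @out;
    for my $pair (@_) {
        my $tree = $pair->[1];
        next if !defined $tree or $tree eq '' or $tree eq $last;
        push @out, $pair;
        $last = $tree;
    }
    return @out;
}

# Hypothetical driver: builds a chain of commits whose trees are the
# subdirectory's trees, and returns the tip of that chain.
sub split_subdir {
    my ($branch, $dir) = @_;
    chomp(my @revs = `git rev-list --reverse $branch`);
    my @pairs;
    for my $rev (@revs) {
        chomp(my $tree = `git rev-parse --verify --quiet $rev:$dir`);
        push @pairs, [$rev, $tree];
    }
    my $parent;
    for my $pair (tree_changes(@pairs)) {
        my ($rev, $tree) = @$pair;
        my @cmd = ('git', 'commit-tree', $tree);
        push @cmd, '-p', $parent if defined $parent;
        push @cmd, '-m', "split from $rev";
        open(my $fh, '-|', @cmd)
            or die "unable to run git commit-tree: $!, halting";
        chomp($parent = <$fh>);
        close($fh) or die "git commit-tree failed, halting";
    }
    return $parent;   # undef if $dir never existed on $branch
}

# Made-up ids, just to show which commits would be kept.
my @demo = tree_changes(['c1', 't1'], ['c2', 't1'], ['c3', 't2'], ['c4', '']);
print join(' ', map { $_->[0] } @demo), "\n";
```

No working tree or index is touched here, which is where the speedup would come from: each new commit reuses a tree object that already exists in the repository.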