* How to merge by subtree while preserving history?
@ 2009-03-26 22:59 David Reitter
2009-03-27 7:38 ` Miklos Vajna
0 siblings, 1 reply; 4+ messages in thread
From: David Reitter @ 2009-03-26 22:59 UTC (permalink / raw)
To: git
I have two separately developed projects (foo, bar) which I'd like to
merge; the contents of foo should, initially, go in a subdirectory of
bar.
I'm aware of two methods: moving (renaming) everything within foo
into foo-dir, and then just pulling foo into bar.
This works beautifully, except that the big rename causes havoc w.r.t.
the files' histories, i.e. git-log needs a "--follow" argument now,
and "diff-tree" can't track changes when given the new file name. No
good.
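A minimal sketch of the rename-then-pull method and its effect on path-limited logs, using two throwaway repositories. The names foo, bar, foo-dir, a.txt and b.txt and the demo identity are made up for illustration; the --allow-unrelated-histories, --no-rebase and --no-edit flags are only there because newer Git versions expect them, and were not needed in 2009.

```shell
#!/bin/sh
# Sketch: rename everything in foo into foo-dir/, then pull foo into bar.
# All repository, directory and file names here are hypothetical.
work=$(mktemp -d)
cd "$work"

# Throwaway identity so commits work in a clean environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q foo
(
    cd foo
    echo "hello from foo" >a.txt
    git add a.txt && git commit -q -m "foo: add a.txt"
    mkdir foo-dir
    git mv a.txt foo-dir/a.txt
    git commit -q -m "foo: move everything into foo-dir/"
)

git init -q bar
cd bar
echo "hello from bar" >b.txt
git add b.txt && git commit -q -m "bar: add b.txt"

# Newer Git refuses to merge unrelated histories without the extra flag.
git pull -q --no-rebase --no-edit --allow-unrelated-histories ../foo HEAD

# Count the commits a path-limited log reports, without and with --follow.
plain=$(git log --oneline -- foo-dir/a.txt | wc -l | tr -d ' ')
follow=$(git log --oneline --follow -- foo-dir/a.txt | wc -l | tr -d ' ')
echo "without --follow: $plain commit(s); with --follow: $follow commit(s)"
```

The plain path-limited log stops at the rename commit, while --follow continues past it to the commit that first added the file.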
I've also tried the method described in [1], but it seems that all
history is lost here (the text could point this out..)
I've tried to "git pull -s subtree foo master" directly as well, but
then it put foo into strange places (and lost the history).
So, I'm at a loss. Suggestions much appreciated.
[1] http://www.kernel.org/pub/software/scm/git/docs/howto/using-merge-subtree.html
* Re: How to merge by subtree while preserving history?
From: Miklos Vajna @ 2009-03-27 7:38 UTC (permalink / raw)
To: David Reitter; +Cc: git
On Thu, Mar 26, 2009 at 06:59:51PM -0400, David Reitter <david.reitter@gmail.com> wrote:
> I have two separately developed projects (foo, bar) which I'd like to
> merge; the contents of foo should, initially, go in a subdirectory of
> bar.
>
> I'm aware of two methods: moving (renaming) everything within foo
> into foo-dir, and then just pulling foo into bar.
The results of the two methods are the same.
> This works beautifully, except that the big rename causes havoc w.r.t.
> the files' histories, i.e. git-log needs a "--follow" argument now,
> and "diff-tree" can't track changes when given the new file name. No
> good.
>
> I've also tried the method described in [1], but it seems that all
> history is lost here (the text could point this out..)
Of course it is not lost. :)
Example:
    commit f8c62880ef22b74ea6df47bb349ff0743d2a93f9
    Merge: f474c52... 52b8ea9...
    Author: Junio C Hamano <gitster@pobox.com>
    Date:   Sun Mar 1 22:20:52 2009 -0800

        Merge git://git.kernel.org/pub/scm/gitk/gitk
Now do a 'git log f474c52..52b8ea9' and you'll see the merged commits.
But you are right that 'git log -- path' will find only the merge
commits (which is correct, as the tree objects are not modified when
merging; the resulting tree just has the original tree in a
subdirectory).
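To make this concrete, here is a rough sketch of the subtree-merge recipe from the howto on two throwaway repositories. The names foo, bar and foo-dir and the demo identity are invented for illustration, FETCH_HEAD stands in for the howto's remote-tracking branch, and --allow-unrelated-histories is an assumption about newer Git versions (it did not exist in 2009).

```shell
#!/bin/sh
# Sketch: merge foo into bar under foo-dir/ with "merge -s ours" plus
# "read-tree --prefix". Repository and path names are hypothetical.
work=$(mktemp -d)
cd "$work"

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q foo
(
    cd foo
    echo one >a.txt
    git add a.txt && git commit -q -m "foo: first"
    echo two >>a.txt
    git commit -q -a -m "foo: second"
)

git init -q bar
cd bar
echo bar >b.txt
git add b.txt && git commit -q -m "bar: first"

# Record a merge with foo's tip without touching our tree, then read
# foo's tree into the index and work tree under foo-dir/ and commit.
git fetch -q ../foo HEAD
git merge -q -s ours --no-commit --allow-unrelated-histories FETCH_HEAD
git read-tree --prefix=foo-dir/ -u FETCH_HEAD
git commit -q -m "Merge foo as subdirectory foo-dir/"

# Commits brought in by the merge, versus commits a path-limited walk sees.
merged=$(git rev-list --count HEAD^1..HEAD^2)
by_path=$(git rev-list --count HEAD -- foo-dir)
echo "merged commits: $merged; commits touching foo-dir/: $by_path"
```

Both of foo's commits are reachable through the merge's second parent, but only the merge itself touches foo-dir/, which is why a path-limited log shows nothing older.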
If this is a one-time operation then I would just use git filter-branch
to move the code to a subdir.
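One possible form of such a rewrite is sketched below on throwaway repositories. It uses an index filter built on git read-tree, which is a choice made for this sketch and not something stated in the thread; foo, bar, foo-dir and the demo identity are hypothetical, and FILTER_BRANCH_SQUELCH_WARNING only matters on newer Git versions that print a deprecation notice.

```shell
#!/bin/sh
# Sketch: rewrite foo's history so every commit keeps its files under
# foo-dir/, then pull it into bar. All names are hypothetical.
work=$(mktemp -d)
cd "$work"

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1

git init -q foo
(
    cd foo
    echo one >a.txt
    git add a.txt && git commit -q -m "foo: first"
    echo two >>a.txt
    git commit -q -a -m "foo: second"

    # For each commit: empty the temporary index, then read the original
    # commit's tree back in under foo-dir/.
    git filter-branch -f --index-filter \
        'git read-tree --empty &&
         git read-tree --prefix=foo-dir/ "$GIT_COMMIT"' HEAD >/dev/null
)

git init -q bar
cd bar
echo bar >b.txt
git add b.txt && git commit -q -m "bar: first"
git pull -q --no-rebase --no-edit --allow-unrelated-histories ../foo HEAD

# Path-limited history now works without --follow.
by_path=$(git rev-list --count HEAD -- foo-dir/a.txt)
echo "commits touching foo-dir/a.txt: $by_path"
```

Because the files have lived at foo-dir/ in every rewritten commit, both of foo's commits show up for the new path. The cost is that all of foo's commit IDs change.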
* Re: How to merge by subtree while preserving history?
From: David Reitter @ 2009-03-27 16:56 UTC (permalink / raw)
To: Miklos Vajna; +Cc: git
On Mar 27, 2009, at 3:38 AM, Miklos Vajna wrote:
> Now do a 'git log f474c52..52b8ea9' and you'll see the merged commits.
Sure :)
Needless to say, this is not practical and doesn't support people's
workflow.
For simple renames, "git log --follow" helps, but as soon as you want
to do a "diff" in one of the listed revisions, filtering for just this
one file, then history becomes invisible again. Concretely, this
breaks the common workflow with C-x v l, then "d" in Emacs.
I'm aware of the content-tracking vs. file-tracking discussion; it's
all fine, except that file names are meaningful meta-data for some
content, at least in some projects. Is there a command that gives me
the diff for a revision pair, restricted to what happened to content
in a given file in the current tree?
> But you are right that 'git log -- path' will find only the merge
> commits (which is correct, as the tree objects are not modified when
> merging; the resulting tree just has the original tree in a
> subdirectory).
>
> If this is a one-time operation then I would just use git
> filter-branch to move the code to a subdir.
For the record:
In the meantime, I managed to move the original files in the CVS
repository (by just moving all the ",v" files and getting rid of
CVSROOT/history, which doesn't seem to be needed). Then I re-ran
cvsimport, working around a bunch of problems with "cvsps". Notably,
cvsps / cvsimport could not handle the case where my repository named
"foo" had a subdirectory also called "foo", into which I had moved all
the stuff. I had to rename the directory to "bar". I also had to
get rid of cvsps's cache file with the -x argument (or delete it from
the surprising location ~/.cvsps).
Then, I merged with "git pull", noting the rev ID before the merge.
Next, I used "git filter-branch" to rename the directory again from
BAR to FOO as follows:
    git filter-branch --index-filter \
        'git ls-files -s | sed "s-BAR/-FOO/-" |
            GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
                git update-index --index-info &&
         mv $GIT_INDEX_FILE.new $GIT_INDEX_FILE' <last-rev-before-merge>..
Finally, I had to "git gc" to prune 200MB worth of objects (it told
me I had 500k objects overall).
--
http://aquamacs.org -- Aquamacs: Emacs on Mac OS X
* Re: How to merge by subtree while preserving history?
From: Junio C Hamano @ 2009-03-27 17:20 UTC (permalink / raw)
To: David Reitter; +Cc: Miklos Vajna, git
David Reitter <david.reitter@gmail.com> writes:
> ... Is there a command that gives me
> the diff for a revision pair, restricted to what happened to content
> in a given file in the current tree?
You can get a half of it from blame (and I presume the other half by
running the procedure in reverse).
"git blame" has an obscure switch -S that lets you lie about the ancestry
by allowing you to install a graft (this is primarily used by the annotate
operation of git-cvsserver).
Suppose you have revisions A and B, and a lot of code in a file F in the
original revision A migrated to many other places in a later revision B
over time. You want to see where each and every line in F from A ended up
in B.
To compute this, you pretend as if the history originates at B (i.e. B is
the root commit), and A is a direct descendant of it, and blame each and
every line of F in A, with a very aggressive setting. E.g.
    {
        echo $(git rev-parse A) $(git rev-parse B)
        echo $(git rev-parse B)
    } >tmp-graft
    git blame -C -C -C -w -S tmp-graft A -- F
I'll leave it as an exercise to the readers how to compute "where did each
and every line in G in B come from in A?"
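One plausible reading of that exercise is to swap the roles in the graft: pretend A is the root commit and B is its direct descendant, then blame G in B. The sketch below tries this on a throwaway repository; the file names F.txt and G.txt, the line contents and the demo identity are all invented for illustration.

```shell
#!/bin/sh
# Sketch: "where did each line of G in B come from in A?" using a
# reversed graft with "git blame -S". All names are hypothetical.
work=$(mktemp -d)
cd "$work"

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q demo
cd demo

# Revision A: five lines in F.txt.
for i in 1 2 3 4 5
do
    echo "line number $i of the original file F"
done >F.txt
git add F.txt && git commit -q -m "A: add F.txt"
a=$(git rev-parse HEAD)

# An intermediate commit that the fake history will skip over.
git mv F.txt G.txt && git commit -q -m "rename F.txt to G.txt"

# Revision B: G.txt gains one line that did not exist in A.
echo "a brand new line that was written only for G" >>G.txt
git commit -q -a -m "B: extend G.txt"
b=$(git rev-parse HEAD)

# Fake history: A is the root commit, B is its direct child.
graft="$work/tmp-graft"
{
    echo "$b $a"
    echo "$a"
} >"$graft"

blame=$(git blame -C -C -C -w --porcelain -S "$graft" "$b" -- G.txt)

# In porcelain output every blamed line has a header starting with the
# full ID of the commit it is attributed to.
from_a=$(printf '%s\n' "$blame" | grep -c "^$a ")
from_b=$(printf '%s\n' "$blame" | grep -c "^$b ")
echo "G.txt in B: $from_a line(s) from A, $from_b line(s) new in B"
```

Lines attributed to A already existed there (possibly in another file), and lines attributed to B are new relative to A.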
Note that in order for this to work, it needs a fix to "blame -S" that I
posted about 10 days ago: aa9ea77 (blame: read custom grafts given by -S
before calling setup_revisions(), 2009-03-18); the fix is sitting in 'pu',
because as far as I know nobody other than me has cared about the breakage,
at least until now.
I've attached a script that uses this trick to compute "How much of what
Linus originally wrote still survives." People who attended GitTogether'08
may have seen the result.
---
#!/bin/sh
# How much of the very original version from Linus survives?

# Pattern matching a full 40-hex-digit object name (5 digits x 8).
_x40='[0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]'
_x40="$_x40$_x40$_x40$_x40$_x40$_x40$_x40$_x40"

initial=$(git rev-parse --verify e83c5163316f89bfbde7d9ab23ca2e25604af290) &&
this=$(git rev-parse --verify "${1-HEAD}^0") || exit

tmp="/var/tmp/Linus.$$"
trap 'rm -f "$tmp".*' 0

# We blame each file in the initial revision pretending as if it is a
# direct descendant of the given version, and also pretend that the
# latter is a root commit. This way, lines in the initial revision
# that survived to the other version can be identified (they will be
# attributed to the other version).
graft="$tmp.graft" &&
{
    echo "$initial $this"
    echo "$this"
} >"$graft" || exit

opts='-C -C -C -w'

git ls-tree -r "$initial" |
while read mode type sha1 name
do
    git blame $opts --porcelain -S "$graft" "$initial" -- "$name" |
    sed -ne "s/^\($_x40\) .*/\1/p" |
    sort |
    uniq -c | {
        # There are only two commits in the fake history, so
        # there will be at most two lines of output from the above.
        read cnt1 commit1
        read cnt2 commit2
        if test -z "$commit2"
        then
            cnt2=0
        fi
        # Lines not attributed to the initial commit survived.
        if test "$initial" != "$commit1"
        then
            cnt_surviving=$cnt1
        else
            cnt_surviving=$cnt2
        fi
        cnt_total=$(( $cnt1 + $cnt2 ))
        echo "$cnt_surviving $cnt_total $name"
    }
done | {
    total=0
    surviving=0
    while read s t n
    do
        total=$(( $total + $t )) surviving=$(( $surviving + $s ))
        printf "%6d / %-6d %s\n" "$s" "$t" "$n"
    done
    printf "%6d / %-6d %s\n" "$surviving" "$total" Total
}