From: "Shawn O. Pearce" <spearce@spearce.org>
To: Junio C Hamano <junkio@cox.net>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] A new merge stragety 'subtree'.
Date: Sat, 17 Feb 2007 03:45:58 -0500
Message-ID: <20070217084558.GE27864@spearce.org>
In-Reply-To: <7vfy95y2n9.fsf@assigned-by-dhcp.cox.net>
Junio C Hamano <junkio@cox.net> wrote:
> The detection of corresponding subtree is done by comparing the
> pathnames and types in the toplevel of the tree.
>
> Heuristics galore! That's the git way ;-).
I have some concerns about the match-tree heuristic you are using here.
For example, it is very common for Java projects to have the same
tree "shape". Just look at egit/jgit for an example; the three
top-level directories are:
  org.spearce.egit.core/
    META-INF/
    build.properties
    plugin.xml
    src/
  org.spearce.egit.ui/
    META-INF/
    build.properties
    plugin.xml
    src/
  org.spearce.jgit/
    META-INF/
    src/
If I were to treat the first two as subprojects this new subtree
merge strategy might fail here as it could easily match to the
wrong directory.
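To make the concern concrete, here is a minimal sketch. It assumes the
heuristic looks only at the names and types of the top-level entries,
as the quoted description says. The `shape` helper and the scratch
directories are hypothetical; this is not the actual match-tree code.

```shell
# Hypothetical helper: print the top-level "shape" of a directory as
# sorted "type name" lines (d = directory, f = anything else).
shape() {
    for entry in "$1"/* "$1"/.[!.]*; do
        [ -e "$entry" ] || continue
        if [ -d "$entry" ]; then t=d; else t=f; fi
        printf '%s %s\n' "$t" "${entry##*/}"
    done | sort
}

# Recreate the top level of the two egit plugins in a scratch directory.
tmp=$(mktemp -d)
for p in org.spearce.egit.core org.spearce.egit.ui; do
    mkdir -p "$tmp/$p/META-INF" "$tmp/$p/src"
    : >"$tmp/$p/build.properties"
    : >"$tmp/$p/plugin.xml"
done

# Names and types alone cannot tell the two subprojects apart.
if [ "$(shape "$tmp/org.spearce.egit.core")" = "$(shape "$tmp/org.spearce.egit.ui")" ]; then
    echo "shapes are identical"
fi
rm -rf "$tmp"
```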
What about a different approach?
In a merge of commit#1 (parent project) and commit#2 (subproject)...
We have the set of merge bases readily available. We just have
to find out in each merge base where the files went from commit#2,
then modify commit#2 to conform to that same shape.
Really that isn't too different from a rename detection. In other
words do something like the following:
a) Scan the parents of the merge base B for a commit that is
in commit#2's ancestry but not commit#1's ancestry, except by
the merge commit B. Such a parent must be from the project that
commit#2 is also from. For sake of explaining this, let's call
this parent B^2.
b) Perform a partial rename-diff between B^2 and B. The magic
here is we need to discard any path in B that also appears in
B^1 and B^2, and that has the same SHA-1 as in B^1, before we do
the rename-diff.
c) Find the most common prefix within the renamed files.
d) Fit commit#2 to use that prefix, and merge.
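Step (b) above could be sketched roughly as follows, working on saved
top-level `git ls-tree` listings rather than on live objects. The
`filter_merge_tree` name and its three file arguments are hypothetical,
and the sketch only handles top-level entries.

```shell
# Hypothetical helper: $1, $2 and $3 are files holding `git ls-tree`
# output for B, B^1 and B^2. Prints the entries of B that survive
# step (b), in the same format, ready for `git mktree`.
filter_merge_tree() {
    awk -F '\t' '
        # ls-tree lines look like "<mode> <type> <sha>\t<path>".
        FILENAME == ARGV[1] { in_p1[$2] = $1; next }   # listing of B^1
        FILENAME == ARGV[2] { in_p2[$2] = 1; next }    # listing of B^2
        # Listing of B: drop a path only if B^2 also has that path and
        # the entry in B is identical to the one in B^1.
        !(($2 in in_p2) && ($2 in in_p1) && in_p1[$2] == $1)
    ' "$2" "$3" "$1"
}
```

With real history the three listings would come from `git ls-tree B`,
`git ls-tree B^1` and `git ls-tree B^2`, and the result could be piped
to `git mktree` to get the tree to rename-diff against.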
Here's a real example. In 67c75759 you merged git-gui.git.
67c75759^1 is from git.git, 67c75759^2 is from git-gui.git.
The stock rename-diff:
$ git diff-tree --abbrev -r -M --diff-filter=MRD 67c75759^2 67c75759
:100644 100644 c714d38... d99372a... M .gitignore
:100755 100755 8fac8cb... 7a10b60... M GIT-VERSION-GEN
:100644 100644 fd82d9d... 5d31e6d... M Makefile
:100644 100644 b95a137... b95a137... R100 TODO git-gui/TODO
:100755 100755 f5010dd... f5010dd... R100 git-gui.sh git-gui/git-gui.sh
The problem here is both ^1 and ^2 define the first three paths,
so we think we modified them in the merge rather than moved them.
But these three files match ^1, as we did not do an evil merge here.
That's why they are showing as modified in this diff.
Now take 67c75759 and whack those three files (step b above), and rediff:
$ C=$(git ls-tree 67c75759 | sed '
    /[[:space:]]\.gitignore$/d
    /[[:space:]]GIT-VERSION-GEN$/d
    /[[:space:]]Makefile$/d' | git mktree)
$ git diff-tree --abbrev -r -M --diff-filter=MRD 67c75759^2 $C
:100644 100644 c714d38... c714d38... R100 .gitignore git-gui/.gitignore
:100755 100755 8fac8cb... 8fac8cb... R100 GIT-VERSION-GEN git-gui/GIT-VERSION-GEN
:100644 100644 fd82d9d... fd82d9d... R100 Makefile git-gui/Makefile
:100644 100644 b95a137... b95a137... R100 TODO git-gui/TODO
:100755 100755 f5010dd... f5010dd... R100 git-gui.sh git-gui/git-gui.sh
Wow, look at that, everything starts with 'git-gui/'! ;-)
Then we just need to pick the most popular common prefix of all
renamed paths and fit commit#2 to conform to that structure.
Finally we can run the merge through.
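Picking the prefix (step c) could look something like the sketch
below. It assumes the tab-separated raw output of `git diff-tree -r -M`
with unquoted path names; the `most_common_prefix` helper is
hypothetical, and ties between equally popular prefixes are broken
arbitrarily.

```shell
# Hypothetical helper: read raw `git diff-tree -r -M` output on stdin
# and print the directory prefix most often prepended by renames.
most_common_prefix() {
    awk -F '\t' '
        # Rename records: ":<mode> <mode> <sha> <sha> R<score>\t<src>\t<dst>"
        $1 ~ / R[0-9]*$/ && NF == 3 {
            src = $2; dst = $3
            cut = length(dst) - length(src)
            # Count only renames of the form dst == "<prefix>/" + src.
            if (cut > 0 && substr(dst, cut + 1) == src && substr(dst, cut, 1) == "/")
                count[substr(dst, 1, cut)]++
        }
        END {
            best = ""; max = 0
            for (p in count)
                if (count[p] > max) { max = count[p]; best = p }
            if (max > 0) print best
        }'
}
```

Fed the five rename records shown above, every one votes for the same
prefix, so this would print `git-gui/`.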
The (now functional) pretend object stuff can be useful here,
such as to make $C above so we can pass it off to diffcore.
I think popping off the 'git-gui/' prefix would be the same deal,
only we'd be looking at the old names to determine the prefix to pop,
rather than the new names.
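A minimal sketch of the fitting in step (d) and of the reverse "pop"
direction, assuming existing plumbing is good enough for a prototype:
`git read-tree --prefix` into a scratch index to add the prefix, and the
`<rev>:<path>` syntax to strip it. Both helper names are hypothetical,
and this says nothing about how merge-subtree would do it internally.

```shell
# Hypothetical helpers; not part of the proposed patch.

# Wrap the tree of $1 (commit or tree) under directory prefix $2,
# e.g. "git-gui/", and print the ID of the resulting tree.
shift_tree_under_prefix() {
    # Use a scratch index so the real index is left untouched. The empty
    # file from mktemp is removed so git starts from a missing index.
    idx=$(mktemp) && rm -f "$idx" &&
    GIT_INDEX_FILE=$idx git read-tree --prefix="$2" "$1^{tree}" &&
    GIT_INDEX_FILE=$idx git write-tree
    status=$?
    rm -f "$idx"
    return $status
}

# Reverse direction: print the ID of the tree found under prefix $2
# inside $1 (commit or tree).
pop_prefix_from_tree() {
    git rev-parse --verify "$1:${2%/}"
}
```

For the example above, `shift_tree_under_prefix 67c75759^2 git-gui/`
would give a tree with everything from git-gui.git moved under
git-gui/, which can then be merged against the parent project.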
We already do rename detection in merge-recursive. Slapping an extra
rename pass in front of things when it is invoked as merge-subtree
can't hurt performance that much.
Thoughts?
--
Shawn.