* [PATCH] Update the documentation for git-merge-base @ 2006-05-16 5:58 Fredrik Kuivinen 2006-05-16 6:13 ` Junio C Hamano 0 siblings, 1 reply; 7+ messages in thread From: Fredrik Kuivinen @ 2006-05-16 5:58 UTC (permalink / raw) To: junkio; +Cc: git Signed-off-by: Fredrik Kuivinen <freku045@student.liu.se> --- Is the code guaranteed to return a least common ancestor? If that is the case we should probably mention it in the documentation. Documentation/git-merge-base.txt | 18 ++++++++++++++---- 1 files changed, 14 insertions(+), 4 deletions(-) diff --git a/Documentation/git-merge-base.txt b/Documentation/git-merge-base.txt index d1d56f1..6099be2 100644 --- a/Documentation/git-merge-base.txt +++ b/Documentation/git-merge-base.txt @@ -8,16 +8,26 @@ git-merge-base - Finds as good a common SYNOPSIS -------- -'git-merge-base' <commit> <commit> +'git-merge-base' [--all] <commit> <commit> DESCRIPTION ----------- -"git-merge-base" finds as good a common ancestor as possible. Given a -selection of equally good common ancestors it should not be relied on -to decide in any particular way. + +"git-merge-base" finds as good a common ancestor as possible between +the two commits. That is, given two commits A and B 'git-merge-base A +B' will output a commit which is reachable from both A and B through +the parent relationship. + +Given a selection of equally good common ancestors it should not be +relied on to decide in any particular way. The "git-merge-base" algorithm is still in flux - use the source... +OPTIONS +------- +--all:: + Output all common ancestors for the two commits instead of + just one. Author ------ ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] Update the documentation for git-merge-base 2006-05-16 5:58 [PATCH] Update the documentation for git-merge-base Fredrik Kuivinen @ 2006-05-16 6:13 ` Junio C Hamano 2006-05-16 6:54 ` Fredrik Kuivinen 0 siblings, 1 reply; 7+ messages in thread From: Junio C Hamano @ 2006-05-16 6:13 UTC (permalink / raw) To: Fredrik Kuivinen; +Cc: git Fredrik Kuivinen <freku045@student.liu.se> writes: > Is the code guaranteed to return a least common ancestor? If that is > the case we should probably mention it in the documentation. Unfortunately, no, if you mean by "least common" closest to the tips. See the big illustration at the top of the source for how you can construct pathological case to defeat an attempt to guarantee such. --all guarantees that the output contains all interesting ones, but does not guarantee the output has no suboptimal merge bases. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Update the documentation for git-merge-base 2006-05-16 6:13 ` Junio C Hamano @ 2006-05-16 6:54 ` Fredrik Kuivinen 2006-05-16 7:51 ` Junio C Hamano 2006-05-16 15:32 ` Linus Torvalds 0 siblings, 2 replies; 7+ messages in thread From: Fredrik Kuivinen @ 2006-05-16 6:54 UTC (permalink / raw) To: Junio C Hamano; +Cc: Fredrik Kuivinen, git On Mon, May 15, 2006 at 11:13:12PM -0700, Junio C Hamano wrote: > Fredrik Kuivinen <freku045@student.liu.se> writes: > > > Is the code guaranteed to return a least common ancestor? If that is > > the case we should probably mention it in the documentation. > > Unfortunately, no, if you mean by "least common" closest to the > tips. > By "least" I mean the following: C is a least common ancestor of A and B if: * C is a common ancestor of A and B, and * for every other common ancestor D (different from C) of A and B, C is not reacheable from D. > See the big illustration at the top of the source for how you > can construct pathological case to defeat an attempt to > guarantee such. --all guarantees that the output contains all > interesting ones, but does not guarantee the output has no > suboptimal merge bases. There are two examples at the top of the source. In the first one a least common ancestor is returned. As I interpret the second one, it is an example of how the old algorithm without the postprocessing step produced a common ancestor which is not least. Am I wrong? Do we have any cases where the current merge-base algorithm gives us common ancestors which are not least? - Fredrik ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Update the documentation for git-merge-base 2006-05-16 6:54 ` Fredrik Kuivinen @ 2006-05-16 7:51 ` Junio C Hamano 2006-05-16 15:32 ` Linus Torvalds 1 sibling, 0 replies; 7+ messages in thread From: Junio C Hamano @ 2006-05-16 7:51 UTC (permalink / raw) To: Fredrik Kuivinen; +Cc: git Fredrik Kuivinen <freku045@student.liu.se> writes: >> See the big illustration at the top of the source for how you >> can construct pathological case to defeat an attempt to >> guarantee such. --all guarantees that the output contains all >> interesting ones, but does not guarantee the output has no >> suboptimal merge bases. > > There are two examples at the top of the source. In the first one a > least common ancestor is returned. As I interpret the second one, it > is an example of how the old algorithm without the postprocessing step > produced a common ancestor which is not least. Ah, yes, I remember now. The drawing was done while we were working on the solution to that pathological case; mark_reachable_commits() solves that horizon effect. http://article.gmane.org/gmane.comp.version-control.git/11410 http://article.gmane.org/gmane.comp.version-control.git/11429 http://article.gmane.org/gmane.comp.version-control.git/11552 http://article.gmane.org/gmane.comp.version-control.git/11613 However, our inability to come up with one is not a nonexistence proof of cases the current algorithm can fail, so math minded people _might_ want to prove the algorithm is optimal. Not me, though. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Update the documentation for git-merge-base 2006-05-16 6:54 ` Fredrik Kuivinen 2006-05-16 7:51 ` Junio C Hamano @ 2006-05-16 15:32 ` Linus Torvalds 2006-05-16 16:32 ` Linus Torvalds 1 sibling, 1 reply; 7+ messages in thread From: Linus Torvalds @ 2006-05-16 15:32 UTC (permalink / raw) To: Fredrik Kuivinen; +Cc: Junio C Hamano, git On Tue, 16 May 2006, Fredrik Kuivinen wrote: > > By "least" I mean the following: > > C is a least common ancestor of A and B if: > > * C is a common ancestor of A and B, and > * for every other common ancestor D (different from C) of A and B, C > is not reacheable from D. Yes, git-merge-base should always return a least common ancestor, never anything less. The only question is what happens when there are multiple LCA commits. In fact, even in that case git-merge-base will have a pretty strong specification: "git-merge-base with the '--all' flag will return the complete set of least common ancestors, sorted by most recent (as defined purely by the commit date order, not any graph ordering) first. Without the '--all' flag, it will return just one LCA commit (the most recent one, by the same date-based definition). In the case two or more LCA commits have exactly the same committer date, the ordering between them is arbitrary" I don't think you can be more specific, or do a better job. The definition of "most recent" is arbitrary, of course - and going by commit date is just _one_ way to order them, but it happens to be an easy one, and one that is as good as any other choice. Of course, the defined ordering probably really matters only for the case where we return just one LCA out of many, but it's nice to be able to tell what the order will be even for the multi-commit case. > There are two examples at the top of the source. In the first one a > least common ancestor is returned. As I interpret the second one, it > is an example of how the old algorithm without the postprocessing step > produced a common ancestor which is not least. Correct. We used to occasionally get it wrong, and return a common ancestor that wasn't least. > Am I wrong? Do we have any cases where the current merge-base > algorithm gives us common ancestors which are not least? Modulo bugs, no. And I don't think there are any bugs in that respect. Linus ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Update the documentation for git-merge-base 2006-05-16 15:32 ` Linus Torvalds @ 2006-05-16 16:32 ` Linus Torvalds 2006-05-16 23:20 ` Junio C Hamano 0 siblings, 1 reply; 7+ messages in thread From: Linus Torvalds @ 2006-05-16 16:32 UTC (permalink / raw) To: Fredrik Kuivinen; +Cc: Junio C Hamano, git On Tue, 16 May 2006, Linus Torvalds wrote: > > I don't think you can be more specific, or do a better job. The definition > of "most recent" is arbitrary, of course - and going by commit date is > just _one_ way to order them, but it happens to be an easy one, and one > that is as good as any other choice. Side note: since LCA's are (by definition) never going to have a direct graph relationship, there is obviously no ordering enforced by the graph itself. So they're all unordered in the graph sense, but you could use other measures of "distance" in the graph. Example other orderings that we _could_ do, and I considered (some purely graph based, others based on content): - order by number of commits between the LCA and the two commits that we are trying to find the LCA for (the "tips"). - order by diff size between the LCA and the tips - order by lack of conflicts between the LCA and the tips. however, none of the alternate orderings really seem to make a lot of sense. The date-based one trumps them all by being straightforward and simple to both implement and explain. And iirc, I actually verified that it happened to pick the "better" one for at least one of my tests when using the old stupid straigth three-way merge, so I think it was actually objectively a good measure at least once. Anybody who cares can obviously always just do "git-merge-base --all" and do their own sorting for the (relatively unlikely) case that you get more than one parent. Anyway, just out of interest I just did some statistics using some shell scripts: - For the current kernel tree, of 1857 merges, only 17 had more than one merge base (and none had more than two): 1840 o 17 oo - In contrast, for git (current master branch), the numbers are 35 out of 540, and there are lots of merges with many LCA's: 505 o 15 oo 13 ooo 2 oooo 3 ooooo 2 ooooooo I think the difference is that Junio does a lot of these branches where he keeps on pulling from them, and never syncs back (which is a great workflow). In contrast, the kernel tends to try to avoid that because the history gets messy enough as it is ;) Anyway, the two commits that apparently have seven (!) LCA's in the git tree should probably be checked out. They are probably a good thing to see if git-merge-base really _really_ does the right thing, and whether they really are true LCA's. They are commits ad0b46bf.. and e6a933bd.. respectively. Linus ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Update the documentation for git-merge-base 2006-05-16 16:32 ` Linus Torvalds @ 2006-05-16 23:20 ` Junio C Hamano 0 siblings, 0 replies; 7+ messages in thread From: Junio C Hamano @ 2006-05-16 23:20 UTC (permalink / raw) To: Linus Torvalds; +Cc: Fredrik Kuivinen, git Linus Torvalds <torvalds@osdl.org> writes: > - In contrast, for git (current master branch), the numbers are 35 out of > 540, and there are lots of merges with many LCA's: > > 505 o > 15 oo > 13 ooo > 2 oooo > 3 ooooo > 2 ooooooo > > I think the difference is that Junio does a lot of these branches where he > keeps on pulling from them, and never syncs back (which is a great > workflow). In contrast, the kernel tends to try to avoid that because the > history gets messy enough as it is ;) > > Anyway, the two commits that apparently have seven (!) LCA's in the git > tree should probably be checked out. They are probably a good thing to see > if git-merge-base really _really_ does the right thing, and whether they > really are true LCA's. > > They are commits ad0b46bf.. and e6a933bd.. respectively. The first one is because at 1.3.0 I pulled everything from "next" to "master". Usually "next" incorporates topic branches that stem from different commits on "master", and when a new topic is merged to "next", it gets the updates to "master" up to that point along with the new topic. When topics graduate (i.e. merged back) to "master", they do so at different pace. topic2 o---o---o---o---H---. / \ \ next -----------o---o---E---o---I-------B / / / \ \ topic1 / / o---D---. \ \ / / / \ \ \ master ---G---o---C---o---o---F---o---o---A---X The above illustration shows that two topics branched from master were cooked in next. Topic 1 branched from master at C, added two commits (its tip is at D), merged to next at E and then later merged to master at F. Similarly, topic 2 branched from master at G, added five commits (its tip is at H), merged to next at I and then later merged to master at A. When merging "next" into "master" by merging A and B to produce X, tips of topics 1 and 2 (D and H, respectively) become the merge base. Merging "next" wholesale to "master" is hopefully a rare event, but the seven bases you are seeing are the topic tips. The other one is the other way around. From time to time, "next" itself gets updates from "master" to keep it in sync with fixes that occurred on "master" directly. Such a merge into "next" will have this picture but the principles are the same. topic2 o---o---o---o---H---. / \ \ next -----------o---o---E---o---I-------B---Y / / / \ / topic1 / / o---D---. \ / / / / \ \ / master ---G---o---C---o---o---F---o---o---A ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2006-05-16 23:21 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2006-05-16 5:58 [PATCH] Update the documentation for git-merge-base Fredrik Kuivinen 2006-05-16 6:13 ` Junio C Hamano 2006-05-16 6:54 ` Fredrik Kuivinen 2006-05-16 7:51 ` Junio C Hamano 2006-05-16 15:32 ` Linus Torvalds 2006-05-16 16:32 ` Linus Torvalds 2006-05-16 23:20 ` Junio C Hamano
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).