* [PATCH] Additional merge-base tests (revised)
@ 2006-07-05 0:35 A Large Angry SCM
2006-07-05 2:07 ` merge-base: update the clean-up postprocessing Junio C Hamano
0 siblings, 1 reply; 8+ messages in thread
From: A Large Angry SCM @ 2006-07-05 0:35 UTC (permalink / raw)
To: Junio C Hamano, git
Signed-off-by: A Large Angry SCM <gitzilla@gmail.com>
---
This demonstrates a problem with git-merge-base.
This is a slightly revised version of the same patch as before.
t/t6010-merge-base.sh | 45 +++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 45 insertions(+)
diff --git a/t/t6010-merge-base.sh b/t/t6010-merge-base.sh
index 1dce123..2b7b51c 100755
--- a/t/t6010-merge-base.sh
+++ b/t/t6010-merge-base.sh
@@ -44,6 +44,43 @@ A=$(doit 1 A $B)
G=$(doit 7 G $B $E)
H=$(doit 8 H $A $F)
+# Setup for second test to demonstrate that relying on timestamps in a
+# distributed SCM to provide a _consistent_ partial ordering of commits
+# leads to insanity.
+#
+# Relative
+# Structure timestamps
+#
+# PL PR +4 +4
+# / \/ \ / \/ \
+# L2 C2 R2 +3 -1 +3
+# | | | | | |
+# L1 C1 R1 +2 -2 +2
+# | | | | | |
+# L0 C0 R0 +1 -3 +1
+# \ | / \ | /
+# S 0
+#
+# The left and right chains of commits can be of any length and complexity as
+# long as all of the timestamps are greater than that of S.
+
+S=$(doit 0 S)
+
+C0=$(doit -3 C0 $S)
+C1=$(doit -2 C1 $C0)
+C2=$(doit -1 C2 $C1)
+
+L0=$(doit 1 L0 $S)
+L1=$(doit 2 L1 $L0)
+L2=$(doit 3 L2 $L1)
+
+R0=$(doit 1 R0 $S)
+R1=$(doit 2 R1 $R0)
+R2=$(doit 3 R2 $R1)
+
+PL=$(doit 4 PL $L2 $C2)
+PR=$(doit 4 PR $C2 $R2)
+
test_expect_success 'compute merge-base (single)' \
'MB=$(git-merge-base G H) &&
expr "$(git-name-rev "$MB")" : "[0-9a-f]* tags/B"'
@@ -56,4 +93,12 @@ test_expect_success 'compute merge-base
'MB=$(git-show-branch --merge-base G H) &&
expr "$(git-name-rev "$MB")" : "[0-9a-f]* tags/B"'
+test_expect_success 'compute merge-base (single)' \
+ 'MB=$(git-merge-base PL PR) &&
+ expr "$(git-name-rev "$MB")" : "[0-9a-f]* tags/C2"'
+
+test_expect_success 'compute merge-base (all)' \
+ 'MB=$(git-merge-base --all PL PR) &&
+ expr "$(git-name-rev "$MB")" : "[0-9a-f]* tags/C2"'
+
test_done
^ permalink raw reply related [flat|nested] 8+ messages in thread
* merge-base: update the clean-up postprocessing
2006-07-05 0:35 [PATCH] Additional merge-base tests (revised) A Large Angry SCM
@ 2006-07-05 2:07 ` Junio C Hamano
2006-07-05 7:50 ` Johannes Schindelin
2006-07-05 7:51 ` Junio C Hamano
0 siblings, 2 replies; 8+ messages in thread
From: Junio C Hamano @ 2006-07-05 2:07 UTC (permalink / raw)
To: A Large Angry SCM; +Cc: git
This is "for concepts" only -- it still seems to have bugs
somewhere to break other tests, although it passes your new
tests.
-- >8 --
This removes the "contaminate the well even more" approach
taken in the current merge-base postprosessing code. Instead,
when there are more than one merge-base results, we compute the
merge-base between them and see if one is a fast-forward of the
other, in which case the ancestor is removed from the result.
Signed-off-by: Junio C Hamano <junkio@cox.net>
---
* This is on top of Johannes/Rene's merge-base work (in "next").
commit.c | 279 ++++++++++++++++++++------------------------------------------
1 files changed, 89 insertions(+), 190 deletions(-)
diff --git a/commit.c b/commit.c
index c6bf10d..1bd6dca 100644
--- a/commit.c
+++ b/commit.c
@@ -854,6 +854,7 @@ void sort_in_topological_order_fn(struct
#define PARENT1 (1u<< 8)
#define PARENT2 (1u<< 9)
#define STALE (1u<<10)
+#define RESULT (1u<<11)
static struct commit *interesting(struct commit_list *list)
{
@@ -867,183 +868,42 @@ static struct commit *interesting(struct
return NULL;
}
-/*
- * A pathological example of how this thing works.
- *
- * Suppose we had this commit graph, where chronologically
- * the timestamp on the commit are A <= B <= C <= D <= E <= F
- * and we are trying to figure out the merge base for E and F
- * commits.
- *
- * F
- * / \
- * E A D
- * \ / /
- * B /
- * \ /
- * C
- *
- * First we push E and F to list to be processed. E gets bit 1
- * and F gets bit 2. The list becomes:
- *
- * list=F(2) E(1), result=empty
- *
- * Then we pop F, the newest commit, from the list. Its flag is 2.
- * We scan its parents, mark them reachable from the side that F is
- * reachable from, and push them to the list:
- *
- * list=E(1) D(2) A(2), result=empty
- *
- * Next pop E and do the same.
- *
- * list=D(2) B(1) A(2), result=empty
- *
- * Next pop D and do the same.
- *
- * list=C(2) B(1) A(2), result=empty
- *
- * Next pop C and do the same.
- *
- * list=B(1) A(2), result=empty
- *
- * Now it is B's turn. We mark its parent, C, reachable from B's side,
- * and push it to the list:
- *
- * list=C(3) A(2), result=empty
- *
- * Now pop C and notice it has flags==3. It is placed on the result list,
- * and the list now contains:
- *
- * list=A(2), result=C(3)
- *
- * We pop A and do the same.
- *
- * list=B(3), result=C(3)
- *
- * Next, we pop B and something very interesting happens. It has flags==3
- * so it is also placed on the result list, and its parents are marked
- * stale, retroactively, and placed back on the list:
- *
- * list=C(7), result=C(7) B(3)
- *
- * Now, list does not have any interesting commit. So we find the newest
- * commit from the result list that is not marked stale. Which is
- * commit B.
- *
- *
- * Another pathological example how this thing used to fail to mark an
- * ancestor of a merge base as STALE before we introduced the
- * postprocessing phase (mark_reachable_commits).
- *
- * 2
- * H
- * 1 / \
- * G A \
- * |\ / \
- * | B \
- * | \ \
- * \ C F
- * \ \ /
- * \ D /
- * \ | /
- * \| /
- * E
- *
- * list A B C D E F G H
- * G1 H2 - - - - - - 1 2
- * H2 E1 B1 - 1 - - 1 - 1 2
- * F2 E1 B1 A2 2 1 - - 1 2 1 2
- * E3 B1 A2 2 1 - - 3 2 1 2
- * B1 A2 2 1 - - 3 2 1 2
- * C1 A2 2 1 1 - 3 2 1 2
- * D1 A2 2 1 1 1 3 2 1 2
- * A2 2 1 1 1 3 2 1 2
- * B3 2 3 1 1 3 2 1 2
- * C7 2 3 7 1 3 2 1 2
- *
- * At this point, unfortunately, everybody in the list is
- * stale, so we fail to complete the following two
- * steps to fully marking stale commits.
- *
- * D7 2 3 7 7 3 2 1 2
- * E7 2 3 7 7 7 2 1 2
- *
- * and we ended up showing E as an interesting merge base.
- * The postprocessing phase re-injects C and continues traversal
- * to contaminate D and E.
- */
-
-static void mark_reachable_commits(struct commit_list *result,
- struct commit_list *list)
-{
- struct commit_list *tmp;
-
- /*
- * Postprocess to fully contaminate the well.
- */
- for (tmp = result; tmp; tmp = tmp->next) {
- struct commit *c = tmp->item;
- /* Reinject stale ones to list,
- * so we can scan their parents.
- */
- if (c->object.flags & STALE)
- commit_list_insert(c, &list);
- }
- while (list) {
- struct commit *c = list->item;
- struct commit_list *parents;
-
- tmp = list;
- list = list->next;
- free(tmp);
-
- /* Anything taken out of the list is stale, so
- * mark all its parents stale. We do not
- * parse new ones (we already parsed all the relevant
- * ones).
- */
- parents = c->parents;
- while (parents) {
- struct commit *p = parents->item;
- parents = parents->next;
- if (!(p->object.flags & STALE)) {
- p->object.flags |= STALE;
- commit_list_insert(p, &list);
- }
- }
- }
-}
-
-struct commit_list *get_merge_bases(struct commit *rev1, struct commit *rev2,
- int cleanup)
+static struct commit_list *merge_bases(struct commit *one, struct commit *two)
{
struct commit_list *list = NULL;
struct commit_list *result = NULL;
- struct commit_list *tmp = NULL;
- if (rev1 == rev2)
- return commit_list_insert(rev1, &result);
+ if (one == two)
+ /* We do not mark this even with RESULT so we do not
+ * have to clean it up.
+ */
+ return commit_list_insert(one, &result);
- parse_commit(rev1);
- parse_commit(rev2);
+ parse_commit(one);
+ parse_commit(two);
- rev1->object.flags |= PARENT1;
- rev2->object.flags |= PARENT2;
- insert_by_date(rev1, &list);
- insert_by_date(rev2, &list);
+ one->object.flags |= PARENT1;
+ two->object.flags |= PARENT2;
+ insert_by_date(one, &list);
+ insert_by_date(two, &list);
while (interesting(list)) {
- struct commit *commit = list->item;
+ struct commit *commit;
struct commit_list *parents;
- int flags = commit->object.flags
- & (PARENT1 | PARENT2 | STALE);
+ struct commit_list *n;
+ int flags;
- tmp = list;
- list = list->next;
- free(tmp);
- if (flags == (PARENT1 | PARENT2)) {
- insert_by_date(commit, &result);
+ commit = list->item;
+ n = list->next;
+ free(list);
+ list = n;
+ flags = commit->object.flags & (PARENT1 | PARENT2 | STALE);
+ if (flags == (PARENT1 | PARENT2)) {
+ if (!(commit->object.flags & RESULT)) {
+ commit->object.flags |= RESULT;
+ insert_by_date(commit, &result);
+ }
/* Mark parents of a found merge stale */
flags |= STALE;
}
@@ -1059,35 +919,74 @@ struct commit_list *get_merge_bases(stru
}
}
- if (!result)
- goto finish;
-
- if (result->next && list)
- mark_reachable_commits(result, list);
+ /* Clean up the result to remove stale ones */
+ list = result; result = NULL;
+ while (list) {
+ struct commit_list *n = list->next;
+ if (!(list->item->object.flags & STALE))
+ insert_by_date(list->item, &result);
+ free(list);
+ list = n;
+ }
+ return result;
+}
- /* cull duplicates */
- for (tmp = result, list = NULL; tmp; ) {
- struct commit *commit = tmp->item;
- struct commit_list *next = tmp->next;
- if (commit->object.flags & STALE) {
- if (list != NULL)
- list->next = next;
- free(tmp);
- } else {
- if (list == NULL)
- result = tmp;
- list = tmp;
- commit->object.flags |= STALE;
+struct commit_list *get_merge_bases(struct commit *one,
+ struct commit *two,
+ int cleanup)
+{
+ struct commit_list *result = merge_bases(one, two);
+ struct commit_list *list;
+ struct commit **rslt;
+ unsigned all_flags = (PARENT1 | PARENT2 | STALE | RESULT);
+ int cnt, i, j;
+
+ if (one == two)
+ return result;
+ if (!result || !result->next) {
+ if (cleanup) {
+ clear_commit_marks(one, all_flags);
+ clear_commit_marks(two, all_flags);
}
-
- tmp = next;
+ return result;
}
- finish:
- if (cleanup) {
- clear_commit_marks(rev1, PARENT1 | PARENT2 | STALE);
- clear_commit_marks(rev2, PARENT1 | PARENT2 | STALE);
+ /* There are more than one */
+ cnt = 0;
+ list = result;
+ while (list) {
+ list = list->next;
+ cnt++;
+ }
+ rslt = xcalloc(cnt, sizeof(*rslt));
+ for (list = result, i = 0; list; list = list->next)
+ rslt[i] = list->item;
+ free_commit_list(result);
+
+ clear_commit_marks(one, all_flags);
+ clear_commit_marks(two, all_flags);
+ for (i = 0; i < cnt - 1; i++) {
+ for (j = i+1; j < cnt; j++) {
+ if (!rslt[i] || !rslt[j])
+ continue;
+ result = merge_bases(rslt[i], rslt[j]);
+ clear_commit_marks(rslt[i], all_flags);
+ clear_commit_marks(rslt[j], all_flags);
+ for (list = result; list; list = list->next) {
+ if (rslt[i] == list->item)
+ rslt[i] = NULL;
+ if (rslt[j] == list->item)
+ rslt[j] = NULL;
+ }
+ }
}
+ /* Surviving ones in rslt[] are the independent results */
+ result = NULL;
+ for (i = 0; i < cnt; i++) {
+ if (rslt[i])
+ insert_by_date(rslt[i], &result);
+ }
+ free(rslt);
return result;
}
--
1.4.1.g7993
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: merge-base: update the clean-up postprocessing
2006-07-05 2:07 ` merge-base: update the clean-up postprocessing Junio C Hamano
@ 2006-07-05 7:50 ` Johannes Schindelin
2006-07-05 23:20 ` Junio C Hamano
2006-07-05 23:21 ` Junio C Hamano
2006-07-05 7:51 ` Junio C Hamano
1 sibling, 2 replies; 8+ messages in thread
From: Johannes Schindelin @ 2006-07-05 7:50 UTC (permalink / raw)
To: Junio C Hamano; +Cc: A Large Angry SCM, git
Hi,
On Tue, 4 Jul 2006, Junio C Hamano wrote:
> This is "for concepts" only -- it still seems to have bugs
> somewhere to break other tests, although it passes your new
> tests.
Doesn't this introduce a nasty O(n*m) performance (where m is the
number of merge bases, and n the number of traversed commits)? I think
possibly many commits are traversed multiple times.
BTW ALAS' argument about grafts not only shot down my maximumSkew, but
AFAICT also the generation number thing. Besides, the generation number
could be manipulated by a mean-spirited person also.
Ciao,
Dscho
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: merge-base: update the clean-up postprocessing
2006-07-05 2:07 ` merge-base: update the clean-up postprocessing Junio C Hamano
2006-07-05 7:50 ` Johannes Schindelin
@ 2006-07-05 7:51 ` Junio C Hamano
2006-07-11 8:13 ` Junio C Hamano
1 sibling, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2006-07-05 7:51 UTC (permalink / raw)
To: A Large Angry SCM; +Cc: git
A fixed up version of this patch, along with your updated test,
is at the tip of "pu".
It does affect the processing time for cases where there are
more than one merge bases negatively. To compute all merge-base
for the 23 merges in the kernel reporitory, the old code with
the "contaminate the well a bit more" clean-up phase takes 2.5
seconds, while the new code takes 3.9 seconds.
Processing all 2215 merges in the kernel repository (the other
2192 merges have one merge-base between the parents) takes 160
seconds either way. In other words, multi merge-base merges are
relatively rare and a bit more time spent to clean-up with the
new code is lost in the noise.
The numbers are taken from /usr/bin/time on an Athron 64X2 3800.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: merge-base: update the clean-up postprocessing
2006-07-05 7:50 ` Johannes Schindelin
@ 2006-07-05 23:20 ` Junio C Hamano
2006-07-05 23:21 ` Junio C Hamano
1 sibling, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2006-07-05 23:20 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: A Large Angry SCM, git
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> On Tue, 4 Jul 2006, Junio C Hamano wrote:
>
>> This is "for concepts" only -- it still seems to have bugs
>> somewhere to break other tests, although it passes your new
>> tests.
>
> Doesn't this introduce a nasty O(n*m) performance (where m is the
> number of merge bases, and n the number of traversed commits)? I think
> possibly many commits are traversed multiple times.
In practice m is small and the recomputation of bases between
bases does not require the minimalization so O(m^2). I've given
the numbers from a small real world example on the fixed code.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: merge-base: update the clean-up postprocessing
2006-07-05 7:50 ` Johannes Schindelin
2006-07-05 23:20 ` Junio C Hamano
@ 2006-07-05 23:21 ` Junio C Hamano
1 sibling, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2006-07-05 23:21 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: A Large Angry SCM, git
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> BTW ALAS' argument about grafts not only shot down my maximumSkew, but
> AFAICT also the generation number thing. Besides, the generation number
> could be manipulated by a mean-spirited person also.
I think with the fixed-up version of "concepts only" patch,
generation number approach is already moot.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: merge-base: update the clean-up postprocessing
2006-07-05 7:51 ` Junio C Hamano
@ 2006-07-11 8:13 ` Junio C Hamano
2006-07-11 14:26 ` Johannes Schindelin
0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2006-07-11 8:13 UTC (permalink / raw)
To: git; +Cc: A Large Angry SCM
Junio C Hamano <junkio@cox.net> writes:
> A fixed up version of this patch, along with your updated test,
> is at the tip of "pu".
>
> It does affect the processing time for cases where there are
> more than one merge bases negatively. To compute all merge-base
> for the 23 merges in the kernel reporitory, the old code with
> the "contaminate the well a bit more" clean-up phase takes 2.5
> seconds, while the new code takes 3.9 seconds.
>
> Processing all 2215 merges in the kernel repository (the other
> 2192 merges have one merge-base between the parents) takes 160
> seconds either way. In other words, multi merge-base merges are
> relatively rare and a bit more time spent to clean-up with the
> new code is lost in the noise.
>
> The numbers are taken from /usr/bin/time on an Athron 64X2 3800.
I did a similar test on git.git repository. Numbers are
interesting.
* I have 941 two-head merges. 89 of them are multi-base
merges. This is unproportionally higher compared to the
kernel repository.
* Both the version in "master" (i.e. the one with the horizon
effect) and this version with updated clean-up code produces
identical set of merge bases for all 941 two-head merges.
* The difference in processing time for 941 two-head merges
with both versions is lost within margin of error.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: merge-base: update the clean-up postprocessing
2006-07-11 8:13 ` Junio C Hamano
@ 2006-07-11 14:26 ` Johannes Schindelin
0 siblings, 0 replies; 8+ messages in thread
From: Johannes Schindelin @ 2006-07-11 14:26 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, A Large Angry SCM
Hi,
On Tue, 11 Jul 2006, Junio C Hamano wrote:
> * The difference in processing time for 941 two-head merges
> with both versions is lost within margin of error.
You convinced me.
Ciao,
Dscho
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2006-07-11 14:26 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-07-05 0:35 [PATCH] Additional merge-base tests (revised) A Large Angry SCM
2006-07-05 2:07 ` merge-base: update the clean-up postprocessing Junio C Hamano
2006-07-05 7:50 ` Johannes Schindelin
2006-07-05 23:20 ` Junio C Hamano
2006-07-05 23:21 ` Junio C Hamano
2006-07-05 7:51 ` Junio C Hamano
2006-07-11 8:13 ` Junio C Hamano
2006-07-11 14:26 ` Johannes Schindelin
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.