* [PATCH v2 0/8] History traversal refinements
@ 2013-04-30 17:26 Kevin Bracey
2013-04-30 17:26 ` [PATCH v2 1/8] decorate.c: compact table when growing Kevin Bracey
` (8 more replies)
0 siblings, 9 replies; 17+ messages in thread
From: Kevin Bracey @ 2013-04-30 17:26 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Linus Torvalds, Kevin Bracey
Okay, here's what I'll call v2 of this series. In the 3 parts from
before (4,6 & 7), I've addressed the comments from Junio and David,
corrected some errors, reconstructed the main commit message, and made
some adjustments in preparation for part 8.
New part 1 is just me making amends for writing NULL into decoration
and leaving cruft behind in part 4.
New part 2 expands the ancestry-path test - which is useful
because it's full of "-s ours" merges.
New part 3 has a little look at the TREESAME documentation bug -
maybe we should add Junio's little asterisk decorations.
Part 5 is Junio's test, in the correct place in the sequence. (Not
sure if it's valid to send that with git send-email - I'll find out).
And finally part 8 is a first attempt at the new UNINTERESTING/TREESAME
interaction logic. I'm pretty happy with the results it produces,
but it's an even more deep and scary change than the earlier
parts.
And we obviously need some more new tests - the effects of these changes
are almost non-existent on the pre-existing set. I'd like to beg for any
volunteers here - I'm not that proficient at shell scripting, and on
top of that something like this could really do with an independent
set of eyes checking that the claimed benefits actually match the
results. (And that the claims are understandable.)
Junio C Hamano (1):
t6012: update test for tweaked full-history traversal
Kevin Bracey (7):
decorate.c: compact table when growing
t6019: test file dropped in -s ours merge
rev-list-options.txt: correct TREESAME for P
revision.c: Make --full-history consider more merges
simplify-merges: never remove all TREESAME parents
simplify-merges: drop merge from irrelevant side branch
revision.c: discount UNINTERESTING parents
Documentation/rev-list-options.txt | 38 ++--
decorate.c | 2 +-
revision.c | 453 +++++++++++++++++++++++++++++++++----
revision.h | 1 +
t/t6012-rev-list-simplify.sh | 31 ++-
t/t6019-rev-list-ancestry-path.sh | 37 ++-
6 files changed, 494 insertions(+), 68 deletions(-)
--
1.8.2.1.632.gd2b1879
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH v2 1/8] decorate.c: compact table when growing
2013-04-30 17:26 [PATCH v2 0/8] History traversal refinements Kevin Bracey
@ 2013-04-30 17:26 ` Kevin Bracey
2013-04-30 17:26 ` [PATCH v2 2/8] t6019: test file dropped in -s ours merge Kevin Bracey
` (7 subsequent siblings)
8 siblings, 0 replies; 17+ messages in thread
From: Kevin Bracey @ 2013-04-30 17:26 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Linus Torvalds, Kevin Bracey
When growing the table, take the opportunity to "compact" it by removing
entries with NULL decoration.
Users may have "removed" decorations by passing NULL to
insert_decoration. An object's table entry can't actually be removed
during normal operation, as it would break the linear hash collision
search. But we can remove NULL decoration entries when rebuilding the
table.
Signed-off-by: Kevin Bracey <kevin@bracey.fi>
---
decorate.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/decorate.c b/decorate.c
index 2f8a63e..7cb5d29 100644
--- a/decorate.c
+++ b/decorate.c
@@ -49,7 +49,7 @@ static void grow_decoration(struct decoration *n)
const struct object *base = old_hash[i].base;
void *decoration = old_hash[i].decoration;
- if (!base)
+ if (!decoration)
continue;
insert_decoration(n, base, decoration);
}
--
1.8.2.1.632.gd2b1879
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 2/8] t6019: test file dropped in -s ours merge
2013-04-30 17:26 [PATCH v2 0/8] History traversal refinements Kevin Bracey
2013-04-30 17:26 ` [PATCH v2 1/8] decorate.c: compact table when growing Kevin Bracey
@ 2013-04-30 17:26 ` Kevin Bracey
2013-04-30 17:26 ` [PATCH v2 3/8] rev-list-options.txt: correct TREESAME for P Kevin Bracey
` (6 subsequent siblings)
8 siblings, 0 replies; 17+ messages in thread
From: Kevin Bracey @ 2013-04-30 17:26 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Linus Torvalds, Kevin Bracey
In preparation for upcoming TREESAME work, check the result for G.t,
which is dropped in "-s ours" merge L. The default rev-list is empty, as
expected - it follows the first parent path where it never existed.
Unfortunately, --ancestry-path is also empty. Merges H J and L are all
TREESAME to 1 parent, so are treated as TREESAME and not shown. This is
clearly undesirable in the case of merge L, which dropped our G.t by
taking the non-ancestry-path version. Document this as a known failure,
and expect "H J L", the 3 merges along the path that had to choose
between G.t versions.
Signed-off-by: Kevin Bracey <kevin@bracey.fi>
---
t/t6019-rev-list-ancestry-path.sh | 21 ++++++++++++++++++++-
1 file changed, 20 insertions(+), 1 deletion(-)
diff --git a/t/t6019-rev-list-ancestry-path.sh b/t/t6019-rev-list-ancestry-path.sh
index 39b4cb0..86ef908 100755
--- a/t/t6019-rev-list-ancestry-path.sh
+++ b/t/t6019-rev-list-ancestry-path.sh
@@ -13,6 +13,9 @@ test_description='--ancestry-path'
#
# D..M -- M.t == M
# --ancestry-path D..M -- M.t == M
+#
+# G..M -- G.t == [nothing - was dropped in "-s ours" merge L]
+# --ancestry-path G..M -- G.t == H J L
. ./test-lib.sh
@@ -63,13 +66,29 @@ test_expect_success 'rev-list D..M -- M.t' '
test_cmp expect actual
'
-test_expect_success 'rev-list --ancestry-patch D..M -- M.t' '
+test_expect_success 'rev-list --ancestry-path D..M -- M.t' '
echo M >expect &&
git rev-list --ancestry-path --format=%s D..M -- M.t |
sed -e "/^commit /d" >actual &&
test_cmp expect actual
'
+# G.t is dropped in an "-s ours" merge
+test_expect_success 'rev-list G..M -- G.t' '
+ >expect &&
+ git rev-list --format=%s G..M -- G.t |
+ sed -e "/^commit /d" >actual &&
+ test_cmp expect actual
+'
+
+test_expect_failure 'rev-list --ancestry-path G..M -- G.t' '
+ for c in H J L; do echo $c; done >expect &&
+ git rev-list --ancestry-path --format=%s G..M -- G.t |
+ sed -e "/^commit /d" |
+ sort >actual &&
+ test_cmp expect actual
+'
+
# b---bc
# / \ /
# a X
--
1.8.2.1.632.gd2b1879
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 3/8] rev-list-options.txt: correct TREESAME for P
2013-04-30 17:26 [PATCH v2 0/8] History traversal refinements Kevin Bracey
2013-04-30 17:26 ` [PATCH v2 1/8] decorate.c: compact table when growing Kevin Bracey
2013-04-30 17:26 ` [PATCH v2 2/8] t6019: test file dropped in -s ours merge Kevin Bracey
@ 2013-04-30 17:26 ` Kevin Bracey
2013-04-30 17:26 ` [PATCH v2 4/8] revision.c: Make --full-history consider more merges Kevin Bracey
` (5 subsequent siblings)
8 siblings, 0 replies; 17+ messages in thread
From: Kevin Bracey @ 2013-04-30 17:26 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Linus Torvalds, Kevin Bracey
In the example given, P is not TREESAME to E. This doesn't affect the
current result, but it will matter when we change behaviour.
Signed-off-by: Kevin Bracey <kevin@bracey.fi>
---
Documentation/rev-list-options.txt | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 3bdbf5e..50bbff7 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -367,8 +367,7 @@ each merge. The commits are:
`N` and `D` to "foobarbaz"; i.e., it is not TREESAME to any parent.
* `E` changes `quux` to "xyzzy", and its merge `P` combines the
- strings to "quux xyzzy". Despite appearing interesting, `P` is
- TREESAME to all parents.
+ strings to "quux xyzzy". `P` is TREESAME to `O`, but not to `E`.
'rev-list' walks backwards through history, including or excluding
commits based on whether '\--full-history' and/or parent rewriting
--
1.8.2.1.632.gd2b1879
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 4/8] revision.c: Make --full-history consider more merges
2013-04-30 17:26 [PATCH v2 0/8] History traversal refinements Kevin Bracey
` (2 preceding siblings ...)
2013-04-30 17:26 ` [PATCH v2 3/8] rev-list-options.txt: correct TREESAME for P Kevin Bracey
@ 2013-04-30 17:26 ` Kevin Bracey
2013-04-30 17:26 ` [PATCH v2 5/8] t6012: update test for tweaked full-history traversal Kevin Bracey
` (4 subsequent siblings)
8 siblings, 0 replies; 17+ messages in thread
From: Kevin Bracey @ 2013-04-30 17:26 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Linus Torvalds, Kevin Bracey
History simplification previously always treated merges as TREESAME
if they were TREESAME to any parent.
While this was consistent with the default behaviour, this could be
extremely unhelpful when searching detailed history, and could not be
overridden. For example, if a merge had ignored a change, as if by "-s
ours", then:
git log -m -p --full-history -Schange file
would successfully locate "change"'s addition but would not locate the
merge that resolved against it.
Futher, simplify_merges could drop the actual parent that a commit
was TREESAME to, leaving it as a normal commit marked TREESAME that
isn't actually TREESAME to its remaining parent.
Now redefine a commit's TREESAME flag to be true only if a commit is
TREESAME to _all_ of its parents. This doesn't affect either the default
simplify_history behaviour (because partially TREESAME merges are turned
into normal commits), or full-history with parent rewriting (because all
merges are output). But it does affect other modes. The clearest
difference is that --full-history will show more merges - sufficient to
ensure that -m -p --full-history log searches can really explain every
change to the file, including those changes' ultimate fate in merges.
Also modify simplify_merges to recalculate TREESAME after removing
a parent. This is achieved by storing per-parent TREESAME flags on the
initial scan, so the combined flag can be easily recomputed.
Signed-off-by: Kevin Bracey <kevin@bracey.fi>
---
Documentation/rev-list-options.txt | 6 +-
revision.c | 241 ++++++++++++++++++++++++++++++++-----
revision.h | 1 +
t/t6019-rev-list-ancestry-path.sh | 2 +-
4 files changed, 216 insertions(+), 34 deletions(-)
diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 50bbff7..1dad341 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -409,10 +409,10 @@ parent lines.
the example, we get
+
-----------------------------------------------------------------------
- I A B N D O
+ I A B N D O P
-----------------------------------------------------------------------
+
-`P` and `M` were excluded because they are TREESAME to a parent. `E`,
+`M` was excluded because it is TREESAME to both parents. `E`,
`C` and `B` were all walked, but only `B` was !TREESAME, so the others
do not appear.
+
@@ -440,7 +440,7 @@ themselves. This results in
Compare to '\--full-history' without rewriting above. Note that `E`
was pruned away because it is TREESAME, but the parent list of P was
rewritten to contain `E`'s parent `I`. The same happened for `C` and
-`N`. Note also that `P` was included despite being TREESAME.
+`N`.
In addition to the above settings, you can change whether TREESAME
affects inclusion:
diff --git a/revision.c b/revision.c
index a67b615..c88ded8 100644
--- a/revision.c
+++ b/revision.c
@@ -429,10 +429,100 @@ static int rev_same_tree_as_empty(struct rev_info *revs, struct commit *commit)
return retval >= 0 && (tree_difference == REV_TREE_SAME);
}
+struct treesame_state {
+ unsigned int nparents;
+ unsigned char treesame[FLEX_ARRAY];
+};
+
+static struct treesame_state *initialise_treesame(struct rev_info *revs, struct commit *commit)
+{
+ unsigned n = commit_list_count(commit->parents);
+ struct treesame_state *st = xcalloc(1, sizeof(*st) + n);
+ st->nparents = n;
+ add_decoration(&revs->treesame, &commit->object, st);
+ return st;
+}
+
+/*
+ * Must be called immediately after removing the nth_parent from a commit's
+ * parent list, if we are maintaining the per-parent treesame[] decoration.
+ * This does not recalculate the master TREESAME flag - update_treesame()
+ * should be called to update it after a sequence of treesame[] modifications
+ * that may have affected it.
+ */
+static int compact_treesame(struct rev_info *revs, struct commit *commit, unsigned nth_parent)
+{
+ struct treesame_state *st;
+ int old_same;
+
+ if (!commit->parents) {
+ /*
+ * Have just removed the only parent from a non-merge.
+ * Different handling, as we lack decoration.
+ */
+ if (nth_parent != 0)
+ die("compact_treesame %u", nth_parent);
+ old_same = !!(commit->object.flags & TREESAME);
+ if (rev_same_tree_as_empty(revs, commit))
+ commit->object.flags |= TREESAME;
+ else
+ commit->object.flags &= ~TREESAME;
+ return old_same;
+ }
+
+ st = lookup_decoration(&revs->treesame, &commit->object);
+ if (!st || nth_parent >= st->nparents)
+ die("compact_treesame %u", nth_parent);
+
+ old_same = st->treesame[nth_parent];
+ memmove(st->treesame + nth_parent,
+ st->treesame + nth_parent + 1,
+ st->nparents - nth_parent - 1);
+
+ /*
+ * If we've just become a non-merge commit, update TREESAME
+ * immediately, and remove the no-longer-needed decoration.
+ * If still a merge, defer update until update_treesame().
+ */
+ if (--st->nparents == 1) {
+ if (commit->parents->next)
+ die("compact_treesame parents mismatch");
+ if (st->treesame[0] && revs->dense)
+ commit->object.flags |= TREESAME;
+ else
+ commit->object.flags &= ~TREESAME;
+ free(add_decoration(&revs->treesame, &commit->object, NULL));
+ }
+
+ return old_same;
+}
+
+static unsigned update_treesame(struct rev_info *revs, struct commit *commit)
+{
+ if (commit->parents && commit->parents->next) {
+ unsigned n;
+ struct treesame_state *st;
+
+ st = lookup_decoration(&revs->treesame, &commit->object);
+ if (!st)
+ die("update_treesame %s", sha1_to_hex(commit->object.sha1));
+ commit->object.flags |= TREESAME;
+ for (n = 0; n < st->nparents; n++) {
+ if (!st->treesame[n]) {
+ commit->object.flags &= ~TREESAME;
+ break;
+ }
+ }
+ }
+
+ return commit->object.flags & TREESAME;
+}
+
static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
{
struct commit_list **pp, *parent;
- int tree_changed = 0, tree_same = 0, nth_parent = 0;
+ struct treesame_state *ts = NULL;
+ int tree_changed = 0, nth_parent;
/*
* If we don't do pruning, everything is interesting
@@ -456,25 +546,43 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
if (!revs->dense && !commit->parents->next)
return;
- pp = &commit->parents;
- while ((parent = *pp) != NULL) {
+ for (pp = &commit->parents, nth_parent = 0;
+ (parent = *pp) != NULL;
+ pp = &parent->next, nth_parent++) {
struct commit *p = parent->item;
- /*
- * Do not compare with later parents when we care only about
- * the first parent chain, in order to avoid derailing the
- * traversal to follow a side branch that brought everything
- * in the path we are limited to by the pathspec.
- */
- if (revs->first_parent_only && nth_parent++)
- break;
+ if (nth_parent == 1) {
+ /*
+ * This our second loop iteration - so we now know
+ * we're dealing with a merge.
+ *
+ * Do not compare with later parents when we care only about
+ * the first parent chain, in order to avoid derailing the
+ * traversal to follow a side branch that brought everything
+ * in the path we are limited to by the pathspec.
+ */
+ if (revs->first_parent_only)
+ break;
+ /*
+ * If this will remain a potentially-simplifiable
+ * merge, remember per-parent treesame if needed.
+ * Initialise the array with the comparison from our
+ * first iteration.
+ */
+ if (revs->treesame.name &&
+ !revs->simplify_history &&
+ !(commit->object.flags & UNINTERESTING)) {
+ ts = initialise_treesame(revs, commit);
+ if (!tree_changed)
+ ts->treesame[0] = 1;
+ }
+ }
if (parse_commit(p) < 0)
die("cannot simplify commit %s (because of %s)",
sha1_to_hex(commit->object.sha1),
sha1_to_hex(p->object.sha1));
switch (rev_compare_tree(revs, p, commit)) {
case REV_TREE_SAME:
- tree_same = 1;
if (!revs->simplify_history || (p->object.flags & UNINTERESTING)) {
/* Even if a merge with an uninteresting
* side branch brought the entire change
@@ -482,7 +590,8 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
* to lose the other branches of this
* merge, so we just keep going.
*/
- pp = &parent->next;
+ if (ts)
+ ts->treesame[nth_parent] = 1;
continue;
}
parent->next = NULL;
@@ -511,12 +620,11 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
case REV_TREE_OLD:
case REV_TREE_DIFFERENT:
tree_changed = 1;
- pp = &parent->next;
continue;
}
die("bad tree compare for commit %s", sha1_to_hex(commit->object.sha1));
}
- if (tree_changed && !tree_same)
+ if (tree_changed)
return;
commit->object.flags |= TREESAME;
}
@@ -1930,28 +2038,32 @@ static void add_child(struct rev_info *revs, struct commit *parent, struct commi
l->next = add_decoration(&revs->children, &parent->object, l);
}
-static int remove_duplicate_parents(struct commit *commit)
+static int remove_duplicate_parents(struct rev_info *revs, struct commit *commit)
{
+ struct treesame_state *ts = lookup_decoration(&revs->treesame, &commit->object);
struct commit_list **pp, *p;
int surviving_parents;
/* Examine existing parents while marking ones we have seen... */
pp = &commit->parents;
+ surviving_parents = 0;
while ((p = *pp) != NULL) {
struct commit *parent = p->item;
if (parent->object.flags & TMP_MARK) {
*pp = p->next;
+ if (ts)
+ compact_treesame(revs, commit, surviving_parents);
continue;
}
parent->object.flags |= TMP_MARK;
+ surviving_parents++;
pp = &p->next;
}
- /* count them while clearing the temporary mark */
- surviving_parents = 0;
+ /* clear the temporary mark */
for (p = commit->parents; p; p = p->next) {
p->item->object.flags &= ~TMP_MARK;
- surviving_parents++;
}
+ /* no update_treesame() - removing duplicates can't affect TREESAME */
return surviving_parents;
}
@@ -1971,6 +2083,70 @@ static struct merge_simplify_state *locate_simplify_state(struct rev_info *revs,
return st;
}
+static int mark_redundant_parents(struct rev_info *revs, struct commit *commit)
+{
+ struct commit_list *h = reduce_heads(commit->parents);
+ int i = 0, marked = 0;
+ struct commit_list *po, *pn;
+
+ /* Want these for sanity-checking only */
+ int orig_cnt = commit_list_count(commit->parents);
+ int cnt = commit_list_count(h);
+
+ /*
+ * Not ready to remove items yet, just mark them for now, based
+ * on the output of reduce_heads(). reduce_heads outputs the reduced
+ * set in its original order, so this isn't too hard.
+ */
+ po = commit->parents;
+ pn = h;
+ while (po) {
+ if (pn && po->item == pn->item) {
+ pn = pn->next;
+ i++;
+ } else {
+ po->item->object.flags |= TMP_MARK;
+ marked++;
+ }
+ po=po->next;
+ }
+
+ if (i != cnt || cnt+marked != orig_cnt)
+ die("mark_redundant_parents %d %d %d %d", orig_cnt, cnt, i, marked);
+
+ free_commit_list(h);
+
+ return marked;
+}
+
+static int remove_marked_parents(struct rev_info *revs, struct commit *commit)
+{
+ struct commit_list **pp, *p;
+ int nth_parent, removed = 0;
+
+ pp = &commit->parents;
+ nth_parent = 0;
+ while ((p = *pp) != NULL) {
+ struct commit *parent = p->item;
+ if (parent->object.flags & TMP_MARK) {
+ parent->object.flags &= ~TMP_MARK;
+ *pp = p->next;
+ free(p);
+ removed++;
+ compact_treesame(revs, commit, nth_parent);
+ continue;
+ }
+ pp = &p->next;
+ nth_parent++;
+ }
+
+ /* Removing parents can only increase TREESAMEness */
+ if (removed && !(commit->object.flags & TREESAME))
+ update_treesame(revs, commit);
+
+ return nth_parent;
+}
+
static struct commit_list **simplify_one(struct rev_info *revs, struct commit *commit, struct commit_list **tail)
{
struct commit_list *p;
@@ -2015,7 +2191,9 @@ static struct commit_list **simplify_one(struct rev_info *revs, struct commit *c
}
/*
- * Rewrite our list of parents.
+ * Rewrite our list of parents. Note that this cannot
+ * affect our TREESAME flags in any way - a commit is
+ * always TREESAME to its simplification.
*/
for (p = commit->parents; p; p = p->next) {
pst = locate_simplify_state(revs, p->item);
@@ -2027,31 +2205,30 @@ static struct commit_list **simplify_one(struct rev_info *revs, struct commit *c
if (revs->first_parent_only)
cnt = 1;
else
- cnt = remove_duplicate_parents(commit);
+ cnt = remove_duplicate_parents(revs, commit);
/*
* It is possible that we are a merge and one side branch
* does not have any commit that touches the given paths;
- * in such a case, the immediate parents will be rewritten
- * to different commits.
+ * in such a case, the immediate parent from that branch
+ * will be rewritten to be the merge base.
*
* o----X X: the commit we are looking at;
* / / o: a commit that touches the paths;
* ---o----'
*
- * Further reduce the parents by removing redundant parents.
+ * Detect and simplify this case.
*/
if (1 < cnt) {
- struct commit_list *h = reduce_heads(commit->parents);
- cnt = commit_list_count(h);
- free_commit_list(commit->parents);
- commit->parents = h;
+ int marked = mark_redundant_parents(revs, commit);
+ if (marked)
+ cnt = remove_marked_parents(revs, commit);
}
/*
* A commit simplifies to itself if it is a root, if it is
* UNINTERESTING, if it touches the given paths, or if it is a
- * merge and its parents simplifies to more than one commits
+ * merge and its parents simplify to more than one commit
* (the first two cases are already handled at the beginning of
* this function).
*
@@ -2159,6 +2336,10 @@ int prepare_revision_walk(struct rev_info *revs)
if (!revs->leak_pending)
free(list);
+ /* Signal whether we need per-parent treesame decoration */
+ if (revs->simplify_merges)
+ revs->treesame.name = "treesame";
+
if (revs->no_walk != REVISION_WALK_NO_WALK_UNSORTED)
commit_list_sort_by_date(&revs->commits);
if (revs->no_walk)
@@ -2218,7 +2399,7 @@ static int rewrite_parents(struct rev_info *revs, struct commit *commit)
}
pp = &parent->next;
}
- remove_duplicate_parents(commit);
+ remove_duplicate_parents(revs, commit);
return 0;
}
diff --git a/revision.h b/revision.h
index 01bd2b7..1abe57b 100644
--- a/revision.h
+++ b/revision.h
@@ -167,6 +167,7 @@ struct rev_info {
struct reflog_walk_info *reflog_info;
struct decoration children;
struct decoration merge_simplification;
+ struct decoration treesame;
/* notes-specific options: which refs to show */
struct display_notes_opt notes_opt;
diff --git a/t/t6019-rev-list-ancestry-path.sh b/t/t6019-rev-list-ancestry-path.sh
index 86ef908..ebe79ac 100755
--- a/t/t6019-rev-list-ancestry-path.sh
+++ b/t/t6019-rev-list-ancestry-path.sh
@@ -81,7 +81,7 @@ test_expect_success 'rev-list G..M -- G.t' '
test_cmp expect actual
'
-test_expect_failure 'rev-list --ancestry-path G..M -- G.t' '
+test_expect_success 'rev-list --ancestry-path G..M -- G.t' '
for c in H J L; do echo $c; done >expect &&
git rev-list --ancestry-path --format=%s G..M -- G.t |
sed -e "/^commit /d" |
--
1.8.2.1.632.gd2b1879
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 5/8] t6012: update test for tweaked full-history traversal
2013-04-30 17:26 [PATCH v2 0/8] History traversal refinements Kevin Bracey
` (3 preceding siblings ...)
2013-04-30 17:26 ` [PATCH v2 4/8] revision.c: Make --full-history consider more merges Kevin Bracey
@ 2013-04-30 17:26 ` Kevin Bracey
2013-04-30 17:26 ` [PATCH v2 6/8] simplify-merges: never remove all TREESAME parents Kevin Bracey
` (3 subsequent siblings)
8 siblings, 0 replies; 17+ messages in thread
From: Kevin Bracey @ 2013-04-30 17:26 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Linus Torvalds
From: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
t/t6012-rev-list-simplify.sh | 29 +++++++++++++++++++++++------
1 file changed, 23 insertions(+), 6 deletions(-)
diff --git a/t/t6012-rev-list-simplify.sh b/t/t6012-rev-list-simplify.sh
index dd6dc84..4e55872 100755
--- a/t/t6012-rev-list-simplify.sh
+++ b/t/t6012-rev-list-simplify.sh
@@ -14,21 +14,24 @@ unnote () {
test_expect_success setup '
echo "Hi there" >file &&
- git add file &&
- test_tick && git commit -m "Initial file" &&
+ echo "initial" >lost &&
+ git add file lost &&
+ test_tick && git commit -m "Initial file and lost" &&
note A &&
git branch other-branch &&
echo "Hello" >file &&
- git add file &&
- test_tick && git commit -m "Modified file" &&
+ echo "second" >lost &&
+ git add file lost &&
+ test_tick && git commit -m "Modified file and lost" &&
note B &&
git checkout other-branch &&
echo "Hello" >file &&
- git add file &&
+ >lost &&
+ git add file lost &&
test_tick && git commit -m "Modified the file identically" &&
note C &&
@@ -37,7 +40,9 @@ test_expect_success setup '
test_tick && git commit -m "Add another file" &&
note D &&
- test_tick && git merge -m "merge" master &&
+ test_tick &&
+ test_must_fail git merge -m "merge" master &&
+ >lost && git commit -a -m "merge" &&
note E &&
echo "Yet another" >elif &&
@@ -110,4 +115,16 @@ check_result 'I B A' -- file
check_result 'I B A' --topo-order -- file
check_result 'H' --first-parent -- another-file
+check_result 'E C B A' --full-history E -- lost
+test_expect_success 'full history simplification without parent' '
+ printf "%s\n" E C B A >expect &&
+ git log --pretty="$FMT" --full-history E -- lost |
+ unnote >actual &&
+ sed -e "s/^.* \([^ ]*\) .*/\1/" >check <actual &&
+ test_cmp expect check || {
+ cat actual
+ false
+ }
+'
+
test_done
--
1.8.2.1.632.gd2b1879
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 6/8] simplify-merges: never remove all TREESAME parents
2013-04-30 17:26 [PATCH v2 0/8] History traversal refinements Kevin Bracey
` (4 preceding siblings ...)
2013-04-30 17:26 ` [PATCH v2 5/8] t6012: update test for tweaked full-history traversal Kevin Bracey
@ 2013-04-30 17:26 ` Kevin Bracey
2013-04-30 17:26 ` [PATCH v2 7/8] simplify-merges: drop merge from irrelevant side branch Kevin Bracey
` (2 subsequent siblings)
8 siblings, 0 replies; 17+ messages in thread
From: Kevin Bracey @ 2013-04-30 17:26 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Linus Torvalds, Kevin Bracey
When simplifying an odd merge, such as one that used "-s ours", we may
find ourselves TREESAME to apparently redundant parents. Prevent
simplify_merges() from removing every TREESAME parent; if this would
happen reinstate the first TREESAME parent - the one that the default
log would have followed.
This avoids producing a totally disjoint history from the default log
when the default log is a better explanation of the end result, and aids
visualisation of odd merges.
Signed-off-by: Kevin Bracey <kevin@bracey.fi>
---
Documentation/rev-list-options.txt | 3 +-
revision.c | 69 ++++++++++++++++++++++++++++++++++++++
2 files changed, 71 insertions(+), 1 deletion(-)
diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 1dad341..e23bdb0 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -471,7 +471,8 @@ history according to the following rules:
+
* Replace each parent `P` of `C'` with its simplification `P'`. In
the process, drop parents that are ancestors of other parents, and
- remove duplicates.
+ remove duplicates, but take care to never drop all parents that
+ we are TREESAME to.
+
* If after this parent rewriting, `C'` is a root or merge commit (has
zero or >1 parents), a boundary commit, or !TREESAME, it remains.
diff --git a/revision.c b/revision.c
index c88ded8..7535757 100644
--- a/revision.c
+++ b/revision.c
@@ -2119,6 +2119,73 @@ static int mark_redundant_parents(struct rev_info *revs, struct commit *commit)
return marked;
}
+/*
+ * Awkward naming - this means one parent we are TREESAME to.
+ * cf mark_treesame_root_parents: root parents that are TREESAME (to an
+ * empty tree). Better name suggestions?
+ */
+static int leave_one_treesame_to_parent(struct rev_info *revs, struct commit *commit)
+{
+ struct treesame_state *ts = lookup_decoration(&revs->treesame, &commit->object);
+ struct commit *unmarked = NULL, *marked = NULL;
+ struct commit_list *p;
+ unsigned n;
+
+ for (p = commit->parents, n = 0; p; p = p->next, n++) {
+ if (ts->treesame[n]) {
+ if (p->item->object.flags & TMP_MARK) {
+ if (!marked)
+ marked = p->item;
+ } else {
+ if (!unmarked) {
+ unmarked = p->item;
+ break;
+ }
+ }
+ }
+ }
+
+ /*
+ * If we are TREESAME to a marked-for-deletion parent, but not to any
+ * unmarked parents, unmark the first TREESAME parent. This is the
+ * parent that the default simplify_history==1 scan would have followed,
+ * and it doesn't make sense to omit that path when asking for a
+ * simplified full history. Retaining it improves the chances of
+ * understanding odd missed merges that took an old version of a file.
+ *
+ * Example:
+ *
+ * I---------X A modified the file, but mainline merge X used
+ * \ / "-s ours", so took the version from I. X is
+ * `--A--' TREESAME to I and !TREESAME to A.
+ *
+ * Default log from X would produce "I". Without this check,
+ * --full-history --simplify-merges would produce "I-A-X", showing
+ * the merge commit X and that it changed A, but not making clear that
+ * it had just taken the I version. With this check, the topology above
+ * is retained.
+ *
+ * Note that it is possible that the simplification chooses a different
+ * TREESAME parent from the default, in which case this test doesn't
+ * activate, and we _do_ drop the default parent. Example:
+ *
+ * I------X A modified the file, but it was reverted in B,
+ * \ / meaning mainline merge X is TREESAME to both
+ * A--B parents.
+ *
+ * Default log would produce "I" by following the first parent;
+ * --full-history --simplify-merges will produce "I-A-B". But this is a
+ * reasonable result - it presents a logical full history leading from
+ * I to X, and X is not an important merge.
+ */
+ if (!unmarked && marked) {
+ marked->object.flags &= ~TMP_MARK;
+ return 1;
+ }
+
+ return 0;
+}
+
static int remove_marked_parents(struct rev_info *revs, struct commit *commit)
{
struct commit_list **pp, *p;
@@ -2222,6 +2289,8 @@ static struct commit_list **simplify_one(struct rev_info *revs, struct commit *c
if (1 < cnt) {
int marked = mark_redundant_parents(revs, commit);
if (marked)
+ marked -= leave_one_treesame_to_parent(revs, commit);
+ if (marked)
cnt = remove_marked_parents(revs, commit);
}
--
1.8.2.1.632.gd2b1879
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 7/8] simplify-merges: drop merge from irrelevant side branch
2013-04-30 17:26 [PATCH v2 0/8] History traversal refinements Kevin Bracey
` (5 preceding siblings ...)
2013-04-30 17:26 ` [PATCH v2 6/8] simplify-merges: never remove all TREESAME parents Kevin Bracey
@ 2013-04-30 17:26 ` Kevin Bracey
2013-04-30 20:54 ` Junio C Hamano
2013-04-30 17:26 ` [PATCH v2 8/8] revision.c: discount UNINTERESTING parents Kevin Bracey
2013-04-30 21:28 ` [PATCH v2 0/8] History traversal refinements Junio C Hamano
8 siblings, 1 reply; 17+ messages in thread
From: Kevin Bracey @ 2013-04-30 17:26 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Linus Torvalds, Kevin Bracey
Reimplement commit 4b7f53da on top of the new simplify-merges
infrastructure, tightening the condition to only consider root parents;
the original version incorrectly dropped parents that were TREESAME to
anything.
Original log message follows.
The merge simplification rule stated in 6546b59 (revision traversal:
show full history with merge simplification, 2008-07-31) still
treated merge commits too specially. Namely, in a history with this
shape:
---o---o---M
/
x---x---x
where three 'x' were on a history completely unrelated to the main
history 'o' and do not touch any of the paths we are following, we
still said that after simplifying all of the parents of M, 'x'
(which is the leftmost 'x' that rightmost 'x simplifies down to) and
'o' (which would be the last commit on the main history that touches
the paths we are following) are independent from each other, and
both need to be kept.
That is incorrect; when the side branch 'x' never touches the paths,
it should be removed to allow M to simplify down to the last commit
on the main history that touches the paths.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kevin Bracey <kevin@bracey.fi>
---
Documentation/rev-list-options.txt | 34 +++++++++++++++++++++-------------
revision.c | 26 +++++++++++++++++++++++++-
t/t6012-rev-list-simplify.sh | 2 +-
3 files changed, 47 insertions(+), 15 deletions(-)
diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index e23bdb0..b7fbc80 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -342,13 +342,13 @@ In the following, we will always refer to the same example history to
illustrate the differences between simplification settings. We assume
that you are filtering for a file `foo` in this commit graph:
-----------------------------------------------------------------------
- .-A---M---N---O---P
- / / / / /
- I B C D E
- \ / / / /
- `-------------'
+ .-A---M---N---O---P---Q
+ / / / / / /
+ I B C D E Y
+ \ / / / / /
+ `-------------' X
-----------------------------------------------------------------------
-The horizontal line of history A---P is taken to be the first parent of
+The horizontal line of history A---Q is taken to be the first parent of
each merge. The commits are:
* `I` is the initial commit, in which `foo` exists with contents
@@ -369,6 +369,10 @@ each merge. The commits are:
* `E` changes `quux` to "xyzzy", and its merge `P` combines the
strings to "quux xyzzy". `P` is TREESAME to `O`, but not to `E`.
+* `X` is an indpendent root commit that added a new file `side`, and `Y`
+ modified it. `Y` is TREESAME to `X`. Its merge `Q` added `side` to `P`, and
+ `Q` is TREESAME to `P`, but not to `Y`.
+
'rev-list' walks backwards through history, including or excluding
commits based on whether '\--full-history' and/or parent rewriting
(via '\--parents' or '\--children') are used. The following settings
@@ -409,7 +413,7 @@ parent lines.
the example, we get
+
-----------------------------------------------------------------------
- I A B N D O P
+ I A B N D O P Q
-----------------------------------------------------------------------
+
`M` was excluded because it is TREESAME to both parents. `E`,
@@ -430,7 +434,7 @@ Along each parent, prune away commits that are not included
themselves. This results in
+
-----------------------------------------------------------------------
- .-A---M---N---O---P
+ .-A---M---N---O---P---Q
/ / / / /
I B / D /
\ / / / /
@@ -440,7 +444,7 @@ themselves. This results in
Compare to '\--full-history' without rewriting above. Note that `E`
was pruned away because it is TREESAME, but the parent list of P was
rewritten to contain `E`'s parent `I`. The same happened for `C` and
-`N`.
+`N`, and `X`, `Y` and `Q`.
In addition to the above settings, you can change whether TREESAME
affects inclusion:
@@ -470,9 +474,9 @@ history according to the following rules:
* Set `C'` to `C`.
+
* Replace each parent `P` of `C'` with its simplification `P'`. In
- the process, drop parents that are ancestors of other parents, and
- remove duplicates, but take care to never drop all parents that
- we are TREESAME to.
+ the process, drop parents that are ancestors of other parents or that are
+ root commits TREESAME to an empty tree, and remove duplicates, but take care
+ to never drop all parents that we are TREESAME to.
+
* If after this parent rewriting, `C'` is a root or merge commit (has
zero or >1 parents), a boundary commit, or !TREESAME, it remains.
@@ -490,7 +494,7 @@ The effect of this is best shown by way of comparing to
`---------'
-----------------------------------------------------------------------
+
-Note the major differences in `N` and `P` over '--full-history':
+Note the major differences in `N`, `P` and `Q` over '--full-history':
+
--
* `N`'s parent list had `I` removed, because it is an ancestor of the
@@ -498,6 +502,10 @@ Note the major differences in `N` and `P` over '--full-history':
+
* `P`'s parent list similarly had `I` removed. `P` was then
removed completely, because it had one parent and is TREESAME.
++
+* `Q`'s parent list had `Y` simplified to `X`. `X` was then removed, because it
+ was a TREESAME root. `Q` was then removed completely, because it had one
+ parent and is TREESAME.
--
Finally, there is a fifth simplification mode available:
diff --git a/revision.c b/revision.c
index 7535757..20c7058 100644
--- a/revision.c
+++ b/revision.c
@@ -2119,6 +2119,22 @@ static int mark_redundant_parents(struct rev_info *revs, struct commit *commit)
return marked;
}
+static int mark_treesame_root_parents(struct rev_info *revs, struct commit *commit)
+{
+ struct commit_list *p;
+ int marked = 0;
+
+ for (p = commit->parents; p; p = p->next) {
+ struct commit *parent = p->item;
+ if (!parent->parents && (parent->object.flags & TREESAME)) {
+ parent->object.flags |= TMP_MARK;
+ marked++;
+ }
+ }
+
+ return marked;
+}
+
/*
* Awkward naming - this means one parent we are TREESAME to.
* cf mark_treesame_root_parents: root parents that are TREESAME (to an
@@ -2284,10 +2300,18 @@ static struct commit_list **simplify_one(struct rev_info *revs, struct commit *c
* / / o: a commit that touches the paths;
* ---o----'
*
- * Detect and simplify this case.
+ * Further, a merge of an independent branch that doesn't
+ * touch the path will reduce to a treesame root parent:
+ *
+ * ----o----X X: the commit we are looking at;
+ * / o: a commit that touches the paths;
+ * r r: a root commit not touching the paths
+ *
+ * Detect and simplify both cases.
*/
if (1 < cnt) {
int marked = mark_redundant_parents(revs, commit);
+ marked += mark_treesame_root_parents(revs, commit);
if (marked)
marked -= leave_one_treesame_to_parent(revs, commit);
if (marked)
diff --git a/t/t6012-rev-list-simplify.sh b/t/t6012-rev-list-simplify.sh
index 4e55872..57ce239 100755
--- a/t/t6012-rev-list-simplify.sh
+++ b/t/t6012-rev-list-simplify.sh
@@ -110,7 +110,7 @@ check_result 'L K J I H G F E D C B A' --full-history
check_result 'K I H E C B A' --full-history -- file
check_result 'K I H E C B A' --full-history --topo-order -- file
check_result 'K I H E C B A' --full-history --date-order -- file
-check_outcome failure 'I E C B A' --simplify-merges -- file
+check_result 'I E C B A' --simplify-merges -- file
check_result 'I B A' -- file
check_result 'I B A' --topo-order -- file
check_result 'H' --first-parent -- another-file
--
1.8.2.1.632.gd2b1879
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 8/8] revision.c: discount UNINTERESTING parents
2013-04-30 17:26 [PATCH v2 0/8] History traversal refinements Kevin Bracey
` (6 preceding siblings ...)
2013-04-30 17:26 ` [PATCH v2 7/8] simplify-merges: drop merge from irrelevant side branch Kevin Bracey
@ 2013-04-30 17:26 ` Kevin Bracey
2013-04-30 21:18 ` Junio C Hamano
2013-04-30 21:28 ` [PATCH v2 0/8] History traversal refinements Junio C Hamano
8 siblings, 1 reply; 17+ messages in thread
From: Kevin Bracey @ 2013-04-30 17:26 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Linus Torvalds, Kevin Bracey
The simplification and rewriting logic previously paid little heed to
whether parents were UNINTERESTING, leading to situations where limited
histories could unnecessarily include a lot of irrelevant merges along
the boundary. Tighten up the rules to properly account for limited
lists:
1) If a merge has INTERESTING parents, and it is TREESAME to them, then
do not let UNINTERESTING parents cause the merge to be treated as
!TREESAME. If we match our walked parents, we don't care if we don't
match unwalked parents.
2) Do not let UNINTERESTING parents prevent commits from being
simplified or omitted: merges with exactly 1 INTERESTING parent that
they are TREESAME to can be treated as a non-merge commit.
3) When rewriting parents, we don't need to show all merges - only merges
with 2 or more INTERESTING parents are required to hold topology together.
These changes greatly increase simplification in pruned, limited
revision lists - irrelevant merges from unlisted or partially listed
side branches can be omitted.
These rules paying more attention to UNINTERESTING do add a tricky
wrinkle to behaviour. Because limited revision lists are conventionally
expressed as A..B (ie B ^A), the bottom commit is UNINTERESTING. Thus
its connection to the INTERESTING graph is not privileged over side
branches, and this can lead to its first descendant merge being shown
for no particularly good reason.
See t6019's "--ancestry-path G..M -- G.t" for an example of this effect.
Merges H and J are semantically identical and equally irrelevant, from
the point of view of tracking the history of G.t, but H is shown and J
isn't. Bottom commit G is marked UNINTERESTING, and thus isn't
privileged over E, so H is shown because it differs from E. Whereas
higher up the graph, I is INTERESTING and thus privileged over F, so we
don't care that J differs from F.
So should we treat bottom commits as "interesting" for the rules above?
Signed-off-by: Kevin Bracey <kevin@bracey.fi>
---
revision.c | 141 ++++++++++++++++++++++++++++++++------
t/t6019-rev-list-ancestry-path.sh | 20 ++++--
2 files changed, 134 insertions(+), 27 deletions(-)
diff --git a/revision.c b/revision.c
index 20c7058..aa814bc 100644
--- a/revision.c
+++ b/revision.c
@@ -333,6 +333,45 @@ static int everybody_uninteresting(struct commit_list *orig)
}
/*
+ * Compute whether we have only one "interesting" parent. If we are TREESAME,
+ * and this returns a parent, then we can safely simplify to that parent.
+ *
+ * For 1-parent commits, or if first-parent-only, then this returns the only
+ * parent (whether UNINTERESTING or not, that is our "interesting" parent).
+ * TREESAME will have been set purely on that parent.
+ *
+ * For multi-parent commits, identify a sole INTERESTING parent, if any.
+ * If we have only 1 INTERESTING parent, then TREESAME will be set purely
+ * with regard to that parent, and we can choose simplification appropriately.
+ *
+ * If we have more than one INTERESTING parent, or no INTERESTING parents
+ * (and multiple UNINTERESTING ones), then we can't choose a parent to follow,
+ * and we should not be simplified.
+ */
+struct commit *sole_interesting(struct rev_info *revs, struct commit_list *orig)
+{
+ struct commit_list *list = orig;
+ struct commit *interesting = NULL;
+
+ if (!orig)
+ return NULL;
+
+ if (revs->first_parent_only || !orig->next)
+ return orig->item;
+
+ while (list) {
+ struct commit *commit = list->item;
+ list = list->next;
+ if (!(commit->object.flags & UNINTERESTING)) {
+ if (interesting)
+ return NULL;
+ interesting = commit;
+ }
+ }
+ return interesting;
+}
+
+/*
* The goal is to get REV_TREE_NEW as the result only if the
* diff consists of all '+' (and no other changes), REV_TREE_OLD
* if the whole diff is removal of old data, and otherwise
@@ -502,27 +541,53 @@ static unsigned update_treesame(struct rev_info *revs, struct commit *commit)
if (commit->parents && commit->parents->next) {
unsigned n;
struct treesame_state *st;
+ struct commit_list *p;
+ unsigned interesting_parents;
+ unsigned uninteresting_change, interesting_change;
st = lookup_decoration(&revs->treesame, &commit->object);
if (!st)
die("update_treesame %s", sha1_to_hex(commit->object.sha1));
- commit->object.flags |= TREESAME;
- for (n = 0; n < st->nparents; n++) {
- if (!st->treesame[n]) {
- commit->object.flags &= ~TREESAME;
- break;
+ interesting_parents = 0;
+ uninteresting_change = interesting_change = 0;
+ for (p = commit->parents, n = 0; p; n++, p = p->next) {
+ if (p->item->object.flags & UNINTERESTING)
+ uninteresting_change |= !st->treesame[n];
+ else {
+ interesting_change |= !st->treesame[n];
+ interesting_parents++;
}
}
+ if (interesting_parents ? interesting_change : uninteresting_change)
+ commit->object.flags &= ~TREESAME;
+ else
+ commit->object.flags |= TREESAME;
}
return commit->object.flags & TREESAME;
}
+static inline int limiting_can_increase_treesame(const struct rev_info *revs)
+{
+ /*
+ * TREESAME is irrelevant unless prune && dense;
+ * if simplify_history is set, we can't have a mixture of TREESAME and
+ * !TREESAME INTERESTING parents (and we don't have treesame[]
+ * decoration anyway);
+ * if first_parent_only is set, then the TREESAME flag is locked
+ * against the first parent (and again we lack treesame[] decoration).
+ */
+ return revs->prune && revs->dense &&
+ !revs->simplify_history &&
+ !revs->first_parent_only;
+}
+
static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
{
struct commit_list **pp, *parent;
struct treesame_state *ts = NULL;
- int tree_changed = 0, nth_parent;
+ int uninteresting_change = 0, interesting_change = 0;
+ int interesting_parents, nth_parent;
/*
* If we don't do pruning, everything is interesting
@@ -546,10 +611,12 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
if (!revs->dense && !commit->parents->next)
return;
- for (pp = &commit->parents, nth_parent = 0;
+ for (pp = &commit->parents, nth_parent = 0, interesting_parents = 0;
(parent = *pp) != NULL;
pp = &parent->next, nth_parent++) {
struct commit *p = parent->item;
+ if (!(p->object.flags & UNINTERESTING))
+ interesting_parents++;
if (nth_parent == 1) {
/*
@@ -573,7 +640,7 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
!revs->simplify_history &&
!(commit->object.flags & UNINTERESTING)) {
ts = initialise_treesame(revs, commit);
- if (!tree_changed)
+ if (!(uninteresting_change || interesting_change))
ts->treesame[0] = 1;
}
}
@@ -619,14 +686,26 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
/* fallthrough */
case REV_TREE_OLD:
case REV_TREE_DIFFERENT:
- tree_changed = 1;
+ if (p->object.flags & UNINTERESTING)
+ uninteresting_change = 1;
+ else
+ interesting_change = 1;
continue;
}
die("bad tree compare for commit %s", sha1_to_hex(commit->object.sha1));
}
- if (tree_changed)
- return;
- commit->object.flags |= TREESAME;
+
+ /*
+ * TREESAME is straightforward for single-parent commits. For merge
+ * commits, it is most useful to define it so that UNINTERESTING
+ * parents cannot make us !TREESAME - if we have any interesting
+ * parents, then we only consider TREESAMEness with respect to them,
+ * allowing a irrelevant merge from UNINTERESTING branches to be
+ * simplified away. Only if we have only UNINTERESTING parents do we
+ * base TREESAME on them.
+ */
+ if (interesting_parents ? !interesting_change : !uninteresting_change)
+ commit->object.flags |= TREESAME;
}
static void commit_list_insert_by_date_cached(struct commit *p, struct commit_list **head,
@@ -1002,6 +1081,18 @@ static int limit_list(struct rev_info *revs)
free_commit_list(bottom);
}
+ /*
+ * Check if any commits have become TREESAME by some of their parents
+ * becoming UNINTERESTING.
+ */
+ if (limiting_can_increase_treesame(revs))
+ for (list = newlist; list; list = list->next) {
+ struct commit *c = list->item;
+ if (c->object.flags & (UNINTERESTING | TREESAME))
+ continue;
+ update_treesame(revs, c);
+ }
+
revs->commits = newlist;
return 0;
}
@@ -2233,6 +2324,7 @@ static int remove_marked_parents(struct rev_info *revs, struct commit *commit)
static struct commit_list **simplify_one(struct rev_info *revs, struct commit *commit, struct commit_list **tail)
{
struct commit_list *p;
+ struct commit *parent;
struct merge_simplify_state *st, *pst;
int cnt;
@@ -2321,19 +2413,20 @@ static struct commit_list **simplify_one(struct rev_info *revs, struct commit *c
/*
* A commit simplifies to itself if it is a root, if it is
* UNINTERESTING, if it touches the given paths, or if it is a
- * merge and its parents simplify to more than one commit
+ * merge and its parents don't simplify to one interesting commit
* (the first two cases are already handled at the beginning of
* this function).
*
- * Otherwise, it simplifies to what its sole parent simplifies to.
+ * Otherwise, it simplifies to what its sole interesting parent
+ * simplifies to.
*/
if (!cnt ||
(commit->object.flags & UNINTERESTING) ||
!(commit->object.flags & TREESAME) ||
- (1 < cnt))
+ (parent = sole_interesting(revs, commit->parents)) == NULL)
st->simplified = commit;
else {
- pst = locate_simplify_state(revs, commit->parents->item);
+ pst = locate_simplify_state(revs, parent);
st->simplified = pst->simplified;
}
return tail;
@@ -2430,7 +2523,8 @@ int prepare_revision_walk(struct rev_info *revs)
free(list);
/* Signal whether we need per-parent treesame decoration */
- if (revs->simplify_merges)
+ if (revs->simplify_merges ||
+ (revs->limited && limiting_can_increase_treesame(revs)))
revs->treesame.name = "treesame";
if (revs->no_walk != REVISION_WALK_NO_WALK_UNSORTED)
@@ -2464,15 +2558,15 @@ static enum rewrite_result rewrite_one(struct rev_info *revs, struct commit **pp
if (!revs->limited)
if (add_parents_to_list(revs, p, &revs->commits, &cache) < 0)
return rewrite_one_error;
- if (p->parents && p->parents->next)
- return rewrite_one_ok;
if (p->object.flags & UNINTERESTING)
return rewrite_one_ok;
if (!(p->object.flags & TREESAME))
return rewrite_one_ok;
if (!p->parents)
return rewrite_one_noparents;
- *pp = p->parents->item;
+ if ((p = sole_interesting(revs, p->parents)) == NULL)
+ return rewrite_one_ok;
+ *pp = p;
}
}
@@ -2629,11 +2723,14 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi
if (revs->prune && revs->dense) {
/* Commit without changes? */
if (commit->object.flags & TREESAME) {
+ /* drop root commits */
+ if (!commit->parents)
+ return commit_ignore;
/* drop merges unless we want parenthood */
if (!want_ancestry(revs))
return commit_ignore;
- /* non-merge - always ignore it */
- if (!commit->parents || !commit->parents->next)
+ /* exactly one interesting parent - always ignore it */
+ if (sole_interesting(revs, commit->parents))
return commit_ignore;
}
}
diff --git a/t/t6019-rev-list-ancestry-path.sh b/t/t6019-rev-list-ancestry-path.sh
index ebe79ac..be1f90a 100755
--- a/t/t6019-rev-list-ancestry-path.sh
+++ b/t/t6019-rev-list-ancestry-path.sh
@@ -4,18 +4,19 @@ test_description='--ancestry-path'
# D---E-------F
# / \ \
-# B---C---G---H---I---J
+# B---C-G0-G--H---I---J
# / \
# A-------K---------------L--M
#
-# D..M == E F G H I J K L M
+# D..M == E F G G0 H I J K L M
# --ancestry-path D..M == E F H I J L M
#
# D..M -- M.t == M
# --ancestry-path D..M -- M.t == M
#
# G..M -- G.t == [nothing - was dropped in "-s ours" merge L]
-# --ancestry-path G..M -- G.t == H J L
+# --ancestry-path G..M -- G.t == H L (H shown because both G and E are uninteresting)
+# --ancestry-path G0..M-- G.t == G L
. ./test-lib.sh
@@ -33,6 +34,7 @@ test_expect_success setup '
test_commit E &&
test_commit F &&
git reset --hard C &&
+ test_commit G0 &&
test_commit G &&
test_merge E H &&
test_commit I &&
@@ -44,7 +46,7 @@ test_expect_success setup '
'
test_expect_success 'rev-list D..M' '
- for c in E F G H I J K L M; do echo $c; done >expect &&
+ for c in E F G G0 H I J K L M; do echo $c; done >expect &&
git rev-list --format=%s D..M |
sed -e "/^commit /d" |
sort >actual &&
@@ -82,13 +84,21 @@ test_expect_success 'rev-list G..M -- G.t' '
'
test_expect_success 'rev-list --ancestry-path G..M -- G.t' '
- for c in H J L; do echo $c; done >expect &&
+ for c in H L; do echo $c; done >expect &&
git rev-list --ancestry-path --format=%s G..M -- G.t |
sed -e "/^commit /d" |
sort >actual &&
test_cmp expect actual
'
+test_expect_success 'rev-list --ancestry-path G0..M -- G.t' '
+ for c in G L; do echo $c; done >expect &&
+ git rev-list --ancestry-path --format=%s G0..M -- G.t |
+ sed -e "/^commit /d" |
+ sort >actual &&
+ test_cmp expect actual
+'
+
# b---bc
# / \ /
# a X
--
1.8.2.1.632.gd2b1879
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v2 7/8] simplify-merges: drop merge from irrelevant side branch
2013-04-30 17:26 ` [PATCH v2 7/8] simplify-merges: drop merge from irrelevant side branch Kevin Bracey
@ 2013-04-30 20:54 ` Junio C Hamano
0 siblings, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2013-04-30 20:54 UTC (permalink / raw)
To: Kevin Bracey; +Cc: git, Linus Torvalds
Kevin Bracey <kevin@bracey.fi> writes:
> @@ -342,13 +342,13 @@ In the following, we will always refer to the same example history to
> illustrate the differences between simplification settings. We assume
> that you are filtering for a file `foo` in this commit graph:
> -----------------------------------------------------------------------
> + .-A---M---N---O---P---Q
> + / / / / / /
> + I B C D E Y
> + \ / / / / /
> + `-------------' X
> -----------------------------------------------------------------------
> -The horizontal line of history A---P is taken to be the first parent of
> +The horizontal line of history A---Q is taken to be the first parent of
> each merge. The commits are:
>
> * `I` is the initial commit, in which `foo` exists with contents
> @@ -369,6 +369,10 @@ each merge. The commits are:
> * `E` changes `quux` to "xyzzy", and its merge `P` combines the
> strings to "quux xyzzy". `P` is TREESAME to `O`, but not to `E`.
>
> +* `X` is an indpendent root commit that added a new file `side`, and `Y`
> + modified it. `Y` is TREESAME to `X`. Its merge `Q` added `side` to `P`, and
> + `Q` is TREESAME to `P`, but not to `Y`.
> +
OK, we say "filtering for a file `foo`" in the very beginning, so
there is an implied "with respect to `foo`" in all of these "A is
TREESAME to B", and the description in the new bullet point looks
correct.
> diff --git a/revision.c b/revision.c
> index 7535757..20c7058 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -2119,6 +2119,22 @@ static int mark_redundant_parents(struct rev_info *revs, struct commit *commit)
> return marked;
> }
>
> +static int mark_treesame_root_parents(struct rev_info *revs, struct commit *commit)
> +{
> + struct commit_list *p;
> + int marked = 0;
> +
> + for (p = commit->parents; p; p = p->next) {
> + struct commit *parent = p->item;
> + if (!parent->parents && (parent->object.flags & TREESAME)) {
> + parent->object.flags |= TMP_MARK;
> + marked++;
> + }
> + }
> +
> + return marked;
> +}
> +
> /*
> * Awkward naming - this means one parent we are TREESAME to.
> * cf mark_treesame_root_parents: root parents that are TREESAME (to an
> @@ -2284,10 +2300,18 @@ static struct commit_list **simplify_one(struct rev_info *revs, struct commit *c
> * / / o: a commit that touches the paths;
> * ---o----'
> *
> - * Detect and simplify this case.
> + * Further, a merge of an independent branch that doesn't
> + * touch the path will reduce to a treesame root parent:
> + *
> + * ----o----X X: the commit we are looking at;
> + * / o: a commit that touches the paths;
> + * r r: a root commit not touching the paths
> + *
> + * Detect and simplify both cases.
> */
> if (1 < cnt) {
> int marked = mark_redundant_parents(revs, commit);
> + marked += mark_treesame_root_parents(revs, commit);
> if (marked)
> marked -= leave_one_treesame_to_parent(revs, commit);
> if (marked)
The solution looks surprisingly simple ;-)
Thanks.
> diff --git a/t/t6012-rev-list-simplify.sh b/t/t6012-rev-list-simplify.sh
> index 4e55872..57ce239 100755
> --- a/t/t6012-rev-list-simplify.sh
> +++ b/t/t6012-rev-list-simplify.sh
> @@ -110,7 +110,7 @@ check_result 'L K J I H G F E D C B A' --full-history
> check_result 'K I H E C B A' --full-history -- file
> check_result 'K I H E C B A' --full-history --topo-order -- file
> check_result 'K I H E C B A' --full-history --date-order -- file
> -check_outcome failure 'I E C B A' --simplify-merges -- file
> +check_result 'I E C B A' --simplify-merges -- file
> check_result 'I B A' -- file
> check_result 'I B A' --topo-order -- file
> check_result 'H' --first-parent -- another-file
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2 8/8] revision.c: discount UNINTERESTING parents
2013-04-30 17:26 ` [PATCH v2 8/8] revision.c: discount UNINTERESTING parents Kevin Bracey
@ 2013-04-30 21:18 ` Junio C Hamano
2013-05-02 17:52 ` Kevin Bracey
0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2013-04-30 21:18 UTC (permalink / raw)
To: Kevin Bracey; +Cc: git, Linus Torvalds
Kevin Bracey <kevin@bracey.fi> writes:
> The simplification and rewriting logic previously paid little heed to
> whether parents were UNINTERESTING, leading to situations where limited
> histories could unnecessarily include a lot of irrelevant merges along
> the boundary. Tighten up the rules to properly account for limited
> lists:
>
> 1) If a merge has INTERESTING parents, and it is TREESAME to them, then
> do not let UNINTERESTING parents cause the merge to be treated as
> !TREESAME. If we match our walked parents, we don't care if we don't
> match unwalked parents.
OK.
> 2) Do not let UNINTERESTING parents prevent commits from being
> simplified or omitted: merges with exactly 1 INTERESTING parent that
> they are TREESAME to can be treated as a non-merge commit.
OK.
> 3) When rewriting parents, we don't need to show all merges - only merges
> with 2 or more INTERESTING parents are required to hold topology together.
OK.
> These changes greatly increase simplification in pruned, limited
> revision lists - irrelevant merges from unlisted or partially listed
> side branches can be omitted.
It is a bit unclear what "unlisted" and "partially listed" mean in
this sentence.
> These rules paying more attention to UNINTERESTING do add a tricky
> wrinkle to behaviour. Because limited revision lists are conventionally
> expressed as A..B (ie B ^A), the bottom commit is UNINTERESTING.
OK.
> Thus
> its connection to the INTERESTING graph is not privileged over side
> branches,
I take that "its connection" refers to the "===" link below, the
nodes connected with "---" form the "INTERESTING graph", and
....Z...A===X---o---o---B
\\ /
W---Y
"side branches" refer to the development that built W and Y and
merged at X. And you are saying that A===X is not "privileged" over
"Y---X", with some definition of "privileged" I am not sure about.
> and this can lead to its first descendant merge being shown
> for no particularly good reason.
Whose first descendant merge? Sorry, I am lost at this point, and
anything I would say in the remainder of this response may be
nonsense coming from this confusion.
> See t6019's "--ancestry-path G..M -- G.t" for an example of this effect.
> # D---E-------F
> # / \ \
> +# B---C-G0-G--H---I---J
> # / \
> # A-------K---------------L--M
> #
> +# D..M == E F G G0 H I J K L M
> # --ancestry-path D..M == E F H I J L M
> #
> # D..M -- M.t == M
> # --ancestry-path D..M -- M.t == M
> #
> # G..M -- G.t == [nothing - was dropped in "-s ours" merge L]
> -# --ancestry-path G..M -- G.t == H J L
> +# --ancestry-path G0..M-- G.t == G L
> Merges H and J are semantically identical and equally irrelevant, from
> the point of view of tracking the history of G.t, but H is shown and J
> isn't.
> > Bottom commit G is marked UNINTERESTING, and thus isn't
> privileged over E, so H is shown because it differs from E.
Doesn't that suggest we should do --ancestry-path a lot earlier?
Conceptually, the "ancestry-path" shouldn't get affected by any
pathspec. The range "--ancestry-path G0..M" should be equivalent to
the range "G0..M ^F ^E ^K", and with the rule to ignore non-sameness
with uninteresting side branches, I would have expected that H and J
would be equally irrelevant, because E and F would be outside the
graph we would want to look at sameness.
> Whereas
> higher up the graph, I is INTERESTING and thus privileged over F, so we
> don't care that J differs from F.
>
> So should we treat bottom commits as "interesting" for the rules above?
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2 0/8] History traversal refinements
2013-04-30 17:26 [PATCH v2 0/8] History traversal refinements Kevin Bracey
` (7 preceding siblings ...)
2013-04-30 17:26 ` [PATCH v2 8/8] revision.c: discount UNINTERESTING parents Kevin Bracey
@ 2013-04-30 21:28 ` Junio C Hamano
8 siblings, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2013-04-30 21:28 UTC (permalink / raw)
To: Kevin Bracey; +Cc: git, Linus Torvalds
Kevin Bracey <kevin@bracey.fi> writes:
> Okay, here's what I'll call v2 of this series. In the 3 parts from
> before (4,6 & 7), I've addressed the comments from Junio and David,
> corrected some errors, reconstructed the main commit message, and made
> some adjustments in preparation for part 8.
Overally, very nicely done. Thanks.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2 8/8] revision.c: discount UNINTERESTING parents
2013-04-30 21:18 ` Junio C Hamano
@ 2013-05-02 17:52 ` Kevin Bracey
2013-05-02 17:58 ` [PATCH v2.1 8/8] revision.c: discount side branches when computing TREESAME Kevin Bracey
` (2 more replies)
0 siblings, 3 replies; 17+ messages in thread
From: Kevin Bracey @ 2013-05-02 17:52 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Linus Torvalds
On 01/05/2013 00:18, Junio C Hamano wrote:
>> These rules paying more attention to UNINTERESTING do add a tricky
>> wrinkle to behaviour. Because limited revision lists are conventionally
>> expressed as A..B (ie B ^A), the bottom commit is UNINTERESTING.
> OK.
>
>> Thus
>> its connection to the INTERESTING graph is not privileged over side
>> branches,
> I take that "its connection" refers to the "===" link below, the
> nodes connected with "---" form the "INTERESTING graph", and
>
> ....Z...A===X---o---o---B
> \\ /
> W---Y
>
> "side branches" refer to the development that built W and Y and
> merged at X. And you are saying that A===X is not "privileged" over
> "Y---X", with some definition of "privileged" I am not sure about.
Okay, that's a good graph. The basic problem is that all the rules above
try to identify a merge from an irrelevant branch and eliminate it. But
how can we define what a side branch is?
I think the rules I state are conceptually sound - a side branch is an
extra parent coming in from outside our limited history graph. But the
problem is at the bottom. In the event that someone specifies "A..B"
with the above history, we get the resultant graph "W-Y-X-o-o-B". A is
not on that graph. So with the rules as they stand, "A" being
UNINTERESTING makes it get treated as a side branch of X, which isn't
good. A needs to be INTERESTING for the purpose of side-branch logic.
So when someone says "A..B" and generates "W-Y-X-o-o-B", we want to know
that X's parent path "(Z)-W-Y-X" is the (possibly irrelevant) side
branch, not "(A)-X".
Example undesirable behaviour, with A treated as a side branch:
1) If only commit A changed our file, and merge X took "old" version Y
for some reason, under these rules "--full-history A..B" would show
nothing. X doesn't consider A because it's UNINTERESTING. If there had
been an intervening (even irrelevant) commit A1 between A1 and X, X
would have been shown.
2) If only commit A changed our file, and merge X took A, with this
rules"--simplify-merges A..B" would show X, with two rewritten
UNINTERESTING parents A and Z. That's not what we want - Z is an
irrelevant side-branch in this case. If there had been an intervening
(unsimplifiable) commit A1 between A and X, X would have had INTERESTING
parent A1, and WY would have been successfully discarded.
3) Even before my new rules, there is one existing place trying to spot
side branches, and it can fail here for the same reasons. See
simplify_history's "even if a merge with an uninteresting side branch
brought the entire change we are interested in" test. It would do the
wrong thing at X, if W had made a change and Y reverted it. "git log
A..B" would show W and Y, which is not what we want. As the scan hits X,
it follows parent Y rather than just go for first parent A, because it
thinks A is "an uninteresting side branch". Again, if there had been an
intervening commit A1 before X, X would have followed A1 and W and Y
would not have been shown.
All 3 cases work if A is treated as "INTERESTING" for side-branch rules.
We shouldn't have needed to put in an extra A1 commit to make them work.
>> # D---E-------F
>> # / \ \
>> +# B---C-G0-G--H---I---J
>> # / \
>> # A-------K---------------L--M
>> #
>>
>>
>> Conceptually, the "ancestry-path" shouldn't get affected by any
>> pathspec. The range "--ancestry-path G0..M" should be equivalent to
>> the range "G0..M ^F ^E ^K", and with the rule to ignore non-sameness
>> with uninteresting side branches, I would have expected that H and J
>> would be equally irrelevant, because E and F would be outside the
>> graph we would want to look at sameness.
Those two pathspecs produce the same graph of commits, and yes, they've
always produced the same thing up until now. But we're trying to do
something new(ish) here. We're trying to define "side branch", to allow
us to make more useful pruning comparisons. And we can't reliably define
"side branch" unless we can reliably define where we're coming from.
Looking at this case, given that graph of commits G-H-I-J-L-M (produced
by any pathspec/flags), that is "easy" because it's linear. The bottom
of the INTERESTING graph has a single parent, and we can follow it
straight from bottom to top. We can deduce that non-merge "G" is bottom,
and thus call the connections to E and F "sides". (But that could have
been wrong if the graph had been made some other way, and the user
wasn't asking for history since G0).
But if presented with H-I-J-L-M we get stuck. The lowest commit in our
graph has 2 parents. How do we distinguish between E and G? Which is the
side, and which is the bottom? We can only define "side branch" here if
we know where our bottom is. The version of this patch as posted can't
distinguish whether E or G is more important, so merge H always gets
shown. And that makes me unhappy. And note that normal
merge-simplification will often prune away boring non-merge commits,
leaving the user-specified UNINTERESTING bottom attached to an
INTERESTING graph by a merge commit like this. So it would be very
common to stumble over this first connection onto the INTERESTING graph
with --simplify-merges.
So my proposal here is that for pruning purposes, we need to mark the
bottom(s), so we can reliably identify the sides.A side branch is
spotted by being UNINTERESTING && !BOTTOM.
Doing this means that "--ancestry-path E..M", "--ancestry-path G..M" and
"G..M ^F ^E ^K" would all produce a walk starting at H, but for pruning
purposes they will prioritise differences against specified bottom
commits, and discount them against non-specified boundary commits.
So "E..M" will treat "G" as a side branch, and "G..M" will treat "D-E"
as a side branch. "--ancestry-path E..M" will show H iff it differs from
E, and "--ancestry-path G..M" will show H iff it differs from G.
And as a result, manually specifying with "G..M ^F ^E ^K" will be paying
extra attention to merges H, J and L, caring if they differ from listed
commits E F and K, and not treating them as ignorable side branches. H
will be shown if it differs from either G or E.
All of which feels right and good to me, but it does mean the neat
mathematical purity of the commit set model is somewhat disrupted -
different pathspecs specifying the same set of commits will prune
differently. But I think the behaviour fits intuitive expectation well.
When a user specifies "A..B" or "B ^A", they almost always mean that
they want the change since A, and they’re not thinking of A as a “commit
set subtraction”. The specific edge to A is important to them. So let's
be helpful and meet normal expectation, and treat the edge between the
INTERESTING graph and the specified bottom commit as important.
Revised patch 8/8 doing this follows. The fact it adds a BOTTOM flag
potentially has an impact on other areas, as noted in the comments. If
we went down this route, the BOTTOM flag addition would obviously want
to be a separate preceding patch, covering impact on
collect_bottom_commits etc.
And the thing about "only need merges with 2+ INTERESTING parents for
topology" should really be separated out too - that's actually quite
distinct from the TREESAME side branch stuff, and rather more
straightforward.
Kevin
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH v2.1 8/8] revision.c: discount side branches when computing TREESAME
2013-05-02 17:52 ` Kevin Bracey
@ 2013-05-02 17:58 ` Kevin Bracey
2013-05-02 18:26 ` [PATCH v2.2 " Kevin Bracey
2013-05-02 20:05 ` [PATCH v2 8/8] revision.c: discount UNINTERESTING parents Junio C Hamano
2 siblings, 0 replies; 17+ messages in thread
From: Kevin Bracey @ 2013-05-02 17:58 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Linus Torvalds, Kevin Bracey
Add a BOTTOM flag to commit objects, and use it to define priority for
pruning. Priority commits are those that are !UNINTERESTING or BOTTOM,
and this allows us to identify irrelevant side branches (UNINTERESTING
&& !BOTTOM).
If a merge has priority parents, and it is TREESAME to them, then do
not let non-priority parents cause the merge to be treated as
!TREESAME.
When considering simplification, don't always include all merges -
merges with exactly 1 priority parent can be simplified, if TREESAME
according to the above rule.
These two changes greatly increase simplification in limited revision
lists - irrelevant merges from unlisted commits can be omitted.
Signed-off-by: Kevin Bracey <kevin@bracey.fi>
---
revision.c | 210 ++++++++++++++++++++++++++++++++------
revision.h | 3 +-
t/t6019-rev-list-ancestry-path.sh | 12 ++-
3 files changed, 188 insertions(+), 37 deletions(-)
diff --git a/revision.c b/revision.c
index ddaed33..2a7f08d 100644
--- a/revision.c
+++ b/revision.c
@@ -333,6 +333,76 @@ static int everybody_uninteresting(struct commit_list *orig)
}
/*
+ * A definition of "priority" commit that we can use to simplify limited graphs
+ * by eliminating side branches.
+ *
+ * A "priority" commit is one that is !UNINTERESTING (ie we are including it
+ * in our list), or that is a specified BOTTOM commit. Then after computing
+ * a limited list, during processing we can generally ignore boundary merges
+ * coming from outside the graph, (ie from non-priority parents), and treat
+ * those merges as if they were single-parent. TREESAME is defined to consider
+ * only priority parents, if any. If we are TREESAME to our on-graph parents,
+ * we don't care if we were !TREESAME to non-graph parents.
+ *
+ * Including bottom commits in "priority" ensures that a limited graph's
+ * connection to the actual bottom commit is not viewed as an uninteresting
+ * side branch, but treated as part of the graph. For example:
+ *
+ * ....Z...A---X---o---o---B
+ * . /
+ * W---Y
+ *
+ * When computing "A..B", the A-X connection is at least as important as
+ * Y-X, despite A being flagged UNINTERESTING.
+ *
+ * And when computing --ancestry-path "A..B", the A-X connection is more
+ * important than Y-X, despite both A and Y being flagged UNINTERESTING.
+ */
+static inline int priority_commit(struct commit *commit)
+{
+ return (commit->object.flags & (UNINTERESTING | BOTTOM)) != UNINTERESTING;
+}
+
+/*
+ * Return a single priority commit from a parent list. If we are a TREESAME
+ * commit, and this selects one of our parents, then we can safely simplify to
+ * that parent.
+ *
+ * For 1-parent commits, or if first-parent-only, then this returns the only
+ * parent. TREESAME will have been set purely on that parent.
+ *
+ * For multi-parent commits, identify a sole priority parent, if any.
+ * If we have only one priority parent, then TREESAME will be set purely
+ * with regard to that parent, and we can choose simplification appropriately.
+ *
+ * If we have more than one priority parent, or no priority parents
+ * (and multiple non-priority ones), then we can't select a parent here and
+ * return NULL.
+ */
+struct commit *priority_select_parent(struct rev_info *revs, struct commit_list *orig)
+{
+ struct commit_list *list = orig;
+ struct commit *priority = NULL;
+
+ if (!orig)
+ return NULL;
+
+ if (revs->first_parent_only || !orig->next)
+ return orig->item;
+
+ while (list) {
+ struct commit *commit = list->item;
+ list = list->next;
+ if (priority_commit(commit)) {
+ if (priority)
+ return NULL;
+ priority = commit;
+ }
+ }
+ return priority;
+}
+
+/*
* The goal is to get REV_TREE_NEW as the result only if the
* diff consists of all '+' (and no other changes), REV_TREE_OLD
* if the whole diff is removal of old data, and otherwise
@@ -502,17 +572,28 @@ static unsigned update_treesame(struct rev_info *revs, struct commit *commit)
if (commit->parents && commit->parents->next) {
unsigned n;
struct treesame_state *st;
+ struct commit_list *p;
+ unsigned priority_parents;
+ unsigned priority_change, nonpriority_change;
st = lookup_decoration(&revs->treesame, &commit->object);
if (!st)
die("update_treesame %s", sha1_to_hex(commit->object.sha1));
- commit->object.flags |= TREESAME;
- for (n = 0; n < st->nparents; n++) {
- if (!st->treesame[n]) {
- commit->object.flags &= ~TREESAME;
- break;
- }
+ priority_parents = 0;
+ priority_change = nonpriority_change = 0;
+ for (p = commit->parents, n = 0; p; n++, p = p->next) {
+ if (priority_commit(p->item)) {
+ priority_change |= !st->treesame[n];
+ priority_parents++;
+ } else
+ nonpriority_change |= !st->treesame[n];
}
+ //fprintf(stderr, "=>%s has %d priority parents; %d %d\n", sha1_to_hex(commit->object.sha1),
+ // priority_parents, priority_change, nonpriority_change);
+ if (priority_parents ? priority_change : nonpriority_change)
+ commit->object.flags &= ~TREESAME;
+ else
+ commit->object.flags |= TREESAME;
}
return commit->object.flags & TREESAME;
@@ -522,7 +603,8 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
{
struct commit_list **pp, *parent;
struct treesame_state *ts = NULL;
- int tree_changed = 0, nth_parent;
+ int priority_change = 0, nonpriority_change = 0;
+ int priority_parents, nth_parent;
/*
* If we don't do pruning, everything is interesting
@@ -546,10 +628,12 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
if (!revs->dense && !commit->parents->next)
return;
- for (pp = &commit->parents, nth_parent = 0;
+ for (pp = &commit->parents, nth_parent = 0, priority_parents = 0;
(parent = *pp) != NULL;
pp = &parent->next, nth_parent++) {
struct commit *p = parent->item;
+ if (priority_commit(p))
+ priority_parents++;
if (nth_parent == 1) {
/*
@@ -564,8 +648,8 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
if (revs->first_parent_only)
break;
/*
- * If this is going to remain a merge, and it's
- * interesting, prepare to remember per-parent treesame
+ * If this will remain a potentially-simplifiable
+ * merge, prepare to remember per-parent treesame
* if needed. Initialise the array with our comparison
* from the first iteration.
*/
@@ -573,7 +657,7 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
!revs->simplify_history &&
!(commit->object.flags & UNINTERESTING)) {
ts = initialise_treesame(revs, commit);
- if (!tree_changed)
+ if (!(nonpriority_change || priority_change))
ts->treesame[0] = 1;
}
}
@@ -583,7 +667,7 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
sha1_to_hex(p->object.sha1));
switch (rev_compare_tree(revs, p, commit)) {
case REV_TREE_SAME:
- if (!revs->simplify_history || (p->object.flags & UNINTERESTING)) {
+ if (!revs->simplify_history || !priority_commit(p)) {
/* Even if a merge with an uninteresting
* side branch brought the entire change
* we are interested in, we do not want
@@ -619,14 +703,31 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
/* fallthrough */
case REV_TREE_OLD:
case REV_TREE_DIFFERENT:
- tree_changed = 1;
+ if (priority_commit(p))
+ priority_change = 1;
+ else
+ nonpriority_change = 1;
continue;
}
die("bad tree compare for commit %s", sha1_to_hex(commit->object.sha1));
}
- if (tree_changed)
- return;
- commit->object.flags |= TREESAME;
+
+ //fprintf(stderr, "%s has %d priority parents; %d %d\n",
+ // sha1_to_hex(commit->object.sha1),
+ // priority_parents, priority_change, nonpriority_change);
+
+ /*
+ * TREESAME is straightforward for single-parent commits. For merge
+ * commits, it is most useful to define it so that "non-priority"
+ * parents cannot make us !TREESAME - if we have any priority
+ * parents, then we only consider TREESAMEness with respect to them,
+ * allowing irrelevant merges from uninteresting branches to be
+ * simplified away. Only if we have only nonpriority parents do we
+ * base TREESAME on them. Note that this logic is replicated in
+ * update_treesame, which should be kept in sync.
+ */
+ if (priority_parents ? !priority_change : !nonpriority_change)
+ commit->object.flags |= TREESAME;
}
static void commit_list_insert_by_date_cached(struct commit *p, struct commit_list **head,
@@ -1007,6 +1108,23 @@ static int limit_list(struct rev_info *revs)
free_commit_list(bottom);
}
+ /*
+ * Check if any commits have become TREESAME by some of their parents
+ * becoming UNINTERESTING. Note that TREESAME is irrelevant unless
+ * prune && dense; if simplify_history is set, we can't have a mixture
+ * of TREESAME and !TREESAME INTERESTING parents, so this can't happen
+ * (and we don't have treesame[] decoration anyway); if first_parent_only
+ * is set, then the TREESAME flag is locked against the first parent
+ * (and again we lack treesame[] decoration).
+ */
+ if (revs->prune && revs->dense && !revs->simplify_history && !revs->first_parent_only)
+ for (list = newlist; list; list = list->next) {
+ struct commit *c = list->item;
+ if (c->object.flags & (UNINTERESTING | TREESAME))
+ continue;
+ update_treesame(revs, c);
+ }
+
revs->commits = newlist;
return 0;
}
@@ -1020,6 +1138,16 @@ static void add_rev_cmdline(struct rev_info *revs,
struct rev_cmdline_info *info = &revs->cmdline;
int nr = info->nr;
+ /*
+ * Total hack, but at least doing it here means we're consistent
+ * with collect_bottom_commits(). But if we just flagged BOTTOM
+ * commits before this, could we ditch this entire function?
+ */
+ if (item->type == OBJ_COMMIT && (flags & UNINTERESTING)) {
+ item->flags |= BOTTOM;
+ flags |= BOTTOM;
+ }
+
ALLOC_GROW(info->rev, nr + 1, info->alloc);
info->rev[nr].item = item;
info->rev[nr].name = name;
@@ -1999,10 +2127,6 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
if (revs->topo_order)
revs->limited = 1;
- /* Signal whether we need per-parent treesame decoration */
- if (revs->simplify_merges)
- revs->treesame.name = "treesame";
-
if (revs->prune_data.nr) {
diff_tree_setup_paths(revs->prune_data.raw, &revs->pruning);
/* Can't prune commits with rename following: the paths change.. */
@@ -2242,6 +2366,7 @@ static int remove_marked_parents(struct rev_info *revs, struct commit *commit)
static struct commit_list **simplify_one(struct rev_info *revs, struct commit *commit, struct commit_list **tail)
{
struct commit_list *p;
+ struct commit *parent;
struct merge_simplify_state *st, *pst;
int cnt;
@@ -2320,29 +2445,32 @@ static struct commit_list **simplify_one(struct rev_info *revs, struct commit *c
*/
if (1 < cnt) {
int marked = mark_redundant_parents(revs, commit);
+ //fprintf(stderr, "%s has %d parents, removing %d redundant", sha1_to_hex(commit->object.sha1), cnt, marked);
marked += mark_treesame_root_parents(revs, commit);
if (marked)
marked -= leave_one_treesame_to_parent(revs, commit);
if (marked)
cnt = remove_marked_parents(revs, commit);
+ //fprintf(stderr, ", final=%d\n", cnt);
}
/*
* A commit simplifies to itself if it is a root, if it is
* UNINTERESTING, if it touches the given paths, or if it is a
- * merge and its parents simplify to more than one commit
+ * merge and its parents don't simplify to one priority commit
* (the first two cases are already handled at the beginning of
* this function).
*
- * Otherwise, it simplifies to what its sole parent simplifies to.
+ * Otherwise, it simplifies to what its sole priority parent
+ * simplifies to.
*/
if (!cnt ||
(commit->object.flags & UNINTERESTING) ||
!(commit->object.flags & TREESAME) ||
- (1 < cnt))
+ (parent = priority_select_parent(revs, commit->parents)) == NULL)
st->simplified = commit;
else {
- pst = locate_simplify_state(revs, commit->parents->item);
+ pst = locate_simplify_state(revs, parent);
st->simplified = pst->simplified;
}
return tail;
@@ -2438,6 +2566,10 @@ int prepare_revision_walk(struct rev_info *revs)
if (!revs->leak_pending)
free(list);
+ /* Signal whether we need per-parent treesame decoration */
+ if (revs->simplify_merges || revs->limited)
+ revs->treesame.name = "treesame";
+
if (revs->no_walk != REVISION_WALK_NO_WALK_UNSORTED)
commit_list_sort_by_date(&revs->commits);
if (revs->no_walk)
@@ -2469,15 +2601,15 @@ static enum rewrite_result rewrite_one(struct rev_info *revs, struct commit **pp
if (!revs->limited)
if (add_parents_to_list(revs, p, &revs->commits, &cache) < 0)
return rewrite_one_error;
- if (p->parents && p->parents->next)
- return rewrite_one_ok;
if (p->object.flags & UNINTERESTING)
return rewrite_one_ok;
if (!(p->object.flags & TREESAME))
return rewrite_one_ok;
if (!p->parents)
return rewrite_one_noparents;
- *pp = p->parents->item;
+ if ((p = priority_select_parent(revs, p->parents)) == NULL)
+ return rewrite_one_ok;
+ *pp = p;
}
}
@@ -2621,10 +2753,7 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi
if (revs->min_age != -1 && (commit->date > revs->min_age))
return commit_ignore;
if (revs->min_parents || (revs->max_parents >= 0)) {
- int n = 0;
- struct commit_list *p;
- for (p = commit->parents; p; p = p->next)
- n++;
+ int n = commit_list_count(commit->parents);
if ((n < revs->min_parents) ||
((revs->max_parents >= 0) && (n > revs->max_parents)))
return commit_ignore;
@@ -2634,12 +2763,25 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi
if (revs->prune && revs->dense) {
/* Commit without changes? */
if (commit->object.flags & TREESAME) {
+ int n;
+ struct commit_list *p;
/* drop merges unless we want parenthood */
if (!want_ancestry(revs))
return commit_ignore;
- /* non-merge - always ignore it */
- if (!commit->parents || !commit->parents->next)
- return commit_ignore;
+ /*
+ * If we want ancestry, then need to keep any merge
+ * between INTERESTING commits to tie together topology.
+ * This is purely a topology question, so we do not
+ * care about merges from UNINTERESTING commits at the
+ * boundary, or necessarily about merges from BOTTOM
+ * (although a parent being BOTTOM will have impacted
+ * the TREESAME determination above).
+ */
+ for (n = 0, p = commit->parents; p; p = p->next)
+ if (!(p->item->object.flags & UNINTERESTING))
+ if (++n >= 2)
+ return commit_show;
+ return commit_ignore;
}
}
return commit_show;
diff --git a/revision.h b/revision.h
index 1abe57b..765630a 100644
--- a/revision.h
+++ b/revision.h
@@ -15,7 +15,8 @@
#define ADDED (1u<<7) /* Parents already parsed and added? */
#define SYMMETRIC_LEFT (1u<<8)
#define PATCHSAME (1u<<9)
-#define ALL_REV_FLAGS ((1u<<10)-1)
+#define BOTTOM (1u<<10)
+#define ALL_REV_FLAGS ((1u<<11)-1)
#define DECORATE_SHORT_REFS 1
#define DECORATE_FULL_REFS 2
diff --git a/t/t6019-rev-list-ancestry-path.sh b/t/t6019-rev-list-ancestry-path.sh
index ebe79ac..c5a412c 100755
--- a/t/t6019-rev-list-ancestry-path.sh
+++ b/t/t6019-rev-list-ancestry-path.sh
@@ -15,7 +15,8 @@ test_description='--ancestry-path'
# --ancestry-path D..M -- M.t == M
#
# G..M -- G.t == [nothing - was dropped in "-s ours" merge L]
-# --ancestry-path G..M -- G.t == H J L
+# --ancestry-path G..M -- G.t == L
+# --ancestry-path --simplify-merges G^..M -- G.t == G L
. ./test-lib.sh
@@ -82,8 +83,15 @@ test_expect_success 'rev-list G..M -- G.t' '
'
test_expect_success 'rev-list --ancestry-path G..M -- G.t' '
- for c in H J L; do echo $c; done >expect &&
+ echo L >expect &&
git rev-list --ancestry-path --format=%s G..M -- G.t |
+ sed -e "/^commit /d" >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'rev-list --ancestry-path --simplify-merges G^..M -- G.t' '
+ for c in G L; do echo $c; done >expect &&
+ git rev-list --ancestry-path --simplify-merges --format=%s G^..M -- G.t |
sed -e "/^commit /d" |
sort >actual &&
test_cmp expect actual
--
1.8.2.1.632.gd2b1879
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2.2 8/8] revision.c: discount side branches when computing TREESAME
2013-05-02 17:52 ` Kevin Bracey
2013-05-02 17:58 ` [PATCH v2.1 8/8] revision.c: discount side branches when computing TREESAME Kevin Bracey
@ 2013-05-02 18:26 ` Kevin Bracey
2013-05-02 20:05 ` [PATCH v2 8/8] revision.c: discount UNINTERESTING parents Junio C Hamano
2 siblings, 0 replies; 17+ messages in thread
From: Kevin Bracey @ 2013-05-02 18:26 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Linus Torvalds, Kevin Bracey
Add a BOTTOM flag to commit objects, and use it to define priority for
pruning. Priority commits are those that are !UNINTERESTING or BOTTOM,
and this allows us to identify irrelevant side branches (UNINTERESTING
&& !BOTTOM).
If a merge has priority parents, and it is TREESAME to them, then do
not let non-priority parents cause the merge to be treated as
!TREESAME.
When considering simplification, don't always include all merges -
merges with exactly 1 priority parent can be simplified, if TREESAME
according to the above rule.
These two changes greatly increase simplification in limited revision
lists - irrelevant merges from unlisted commits can be omitted.
Signed-off-by: Kevin Bracey <kevin@bracey.fi>
---
Something went wrong with v2.1; it got based on an old version of the series.
This one should apply to v2 correctly.
revision.c | 216 +++++++++++++++++++++++++++++++++-----
revision.h | 3 +-
t/t6019-rev-list-ancestry-path.sh | 12 ++-
3 files changed, 199 insertions(+), 32 deletions(-)
diff --git a/revision.c b/revision.c
index 20c7058..e5d5091 100644
--- a/revision.c
+++ b/revision.c
@@ -333,6 +333,76 @@ static int everybody_uninteresting(struct commit_list *orig)
}
/*
+ * A definition of "priority" commit that we can use to simplify limited graphs
+ * by eliminating side branches.
+ *
+ * A "priority" commit is one that is !UNINTERESTING (ie we are including it
+ * in our list), or that is a specified BOTTOM commit. Then after computing
+ * a limited list, during processing we can generally ignore boundary merges
+ * coming from outside the graph, (ie from non-priority parents), and treat
+ * those merges as if they were single-parent. TREESAME is defined to consider
+ * only priority parents, if any. If we are TREESAME to our on-graph parents,
+ * we don't care if we were !TREESAME to non-graph parents.
+ *
+ * Including bottom commits in "priority" ensures that a limited graph's
+ * connection to the actual bottom commit is not viewed as an uninteresting
+ * side branch, but treated as part of the graph. For example:
+ *
+ * ....Z...A---X---o---o---B
+ * . /
+ * W---Y
+ *
+ * When computing "A..B", the A-X connection is at least as important as
+ * Y-X, despite A being flagged UNINTERESTING.
+ *
+ * And when computing --ancestry-path "A..B", the A-X connection is more
+ * important than Y-X, despite both A and Y being flagged UNINTERESTING.
+ */
+static inline int priority_commit(struct commit *commit)
+{
+ return (commit->object.flags & (UNINTERESTING | BOTTOM)) != UNINTERESTING;
+}
+
+/*
+ * Return a single priority commit from a parent list. If we are a TREESAME
+ * commit, and this selects one of our parents, then we can safely simplify to
+ * that parent.
+ *
+ * For 1-parent commits, or if first-parent-only, then this returns the only
+ * parent. TREESAME will have been set purely on that parent.
+ *
+ * For multi-parent commits, identify a sole priority parent, if any.
+ * If we have only one priority parent, then TREESAME will be set purely
+ * with regard to that parent, and we can choose simplification appropriately.
+ *
+ * If we have more than one priority parent, or no priority parents
+ * (and multiple non-priority ones), then we can't select a parent here and
+ * return NULL.
+ */
+struct commit *priority_select_parent(struct rev_info *revs, struct commit_list *orig)
+{
+ struct commit_list *list = orig;
+ struct commit *priority = NULL;
+
+ if (!orig)
+ return NULL;
+
+ if (revs->first_parent_only || !orig->next)
+ return orig->item;
+
+ while (list) {
+ struct commit *commit = list->item;
+ list = list->next;
+ if (priority_commit(commit)) {
+ if (priority)
+ return NULL;
+ priority = commit;
+ }
+ }
+ return priority;
+}
+
+/*
* The goal is to get REV_TREE_NEW as the result only if the
* diff consists of all '+' (and no other changes), REV_TREE_OLD
* if the whole diff is removal of old data, and otherwise
@@ -502,27 +572,54 @@ static unsigned update_treesame(struct rev_info *revs, struct commit *commit)
if (commit->parents && commit->parents->next) {
unsigned n;
struct treesame_state *st;
+ struct commit_list *p;
+ unsigned priority_parents;
+ unsigned priority_change, nonpriority_change;
st = lookup_decoration(&revs->treesame, &commit->object);
if (!st)
die("update_treesame %s", sha1_to_hex(commit->object.sha1));
- commit->object.flags |= TREESAME;
- for (n = 0; n < st->nparents; n++) {
- if (!st->treesame[n]) {
- commit->object.flags &= ~TREESAME;
- break;
- }
+ priority_parents = 0;
+ priority_change = nonpriority_change = 0;
+ for (p = commit->parents, n = 0; p; n++, p = p->next) {
+ if (priority_commit(p->item)) {
+ priority_change |= !st->treesame[n];
+ priority_parents++;
+ } else
+ nonpriority_change |= !st->treesame[n];
}
+ //fprintf(stderr, "=>%s has %d priority parents; %d %d\n", sha1_to_hex(commit->object.sha1),
+ // priority_parents, priority_change, nonpriority_change);
+ if (priority_parents ? priority_change : nonpriority_change)
+ commit->object.flags &= ~TREESAME;
+ else
+ commit->object.flags |= TREESAME;
}
return commit->object.flags & TREESAME;
}
+static inline int limiting_can_increase_treesame(const struct rev_info *revs)
+{
+ /*
+ * TREESAME is irrelevant unless prune && dense;
+ * if simplify_history is set, we can't have a mixture of TREESAME and
+ * !TREESAME INTERESTING parents (and we don't have treesame[]
+ * decoration anyway);
+ * if first_parent_only is set, then the TREESAME flag is locked
+ * against the first parent (and again we lack treesame[] decoration).
+ */
+ return revs->prune && revs->dense &&
+ !revs->simplify_history &&
+ !revs->first_parent_only;
+}
+
static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
{
struct commit_list **pp, *parent;
struct treesame_state *ts = NULL;
- int tree_changed = 0, nth_parent;
+ int priority_change = 0, nonpriority_change = 0;
+ int priority_parents, nth_parent;
/*
* If we don't do pruning, everything is interesting
@@ -546,10 +643,12 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
if (!revs->dense && !commit->parents->next)
return;
- for (pp = &commit->parents, nth_parent = 0;
+ for (pp = &commit->parents, nth_parent = 0, priority_parents = 0;
(parent = *pp) != NULL;
pp = &parent->next, nth_parent++) {
struct commit *p = parent->item;
+ if (priority_commit(p))
+ priority_parents++;
if (nth_parent == 1) {
/*
@@ -573,7 +672,7 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
!revs->simplify_history &&
!(commit->object.flags & UNINTERESTING)) {
ts = initialise_treesame(revs, commit);
- if (!tree_changed)
+ if (!(nonpriority_change || priority_change))
ts->treesame[0] = 1;
}
}
@@ -583,7 +682,7 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
sha1_to_hex(p->object.sha1));
switch (rev_compare_tree(revs, p, commit)) {
case REV_TREE_SAME:
- if (!revs->simplify_history || (p->object.flags & UNINTERESTING)) {
+ if (!revs->simplify_history || !priority_commit(p)) {
/* Even if a merge with an uninteresting
* side branch brought the entire change
* we are interested in, we do not want
@@ -619,14 +718,31 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
/* fallthrough */
case REV_TREE_OLD:
case REV_TREE_DIFFERENT:
- tree_changed = 1;
+ if (priority_commit(p))
+ priority_change = 1;
+ else
+ nonpriority_change = 1;
continue;
}
die("bad tree compare for commit %s", sha1_to_hex(commit->object.sha1));
}
- if (tree_changed)
- return;
- commit->object.flags |= TREESAME;
+
+ //fprintf(stderr, "%s has %d priority parents; %d %d\n",
+ // sha1_to_hex(commit->object.sha1),
+ // priority_parents, priority_change, nonpriority_change);
+
+ /*
+ * TREESAME is straightforward for single-parent commits. For merge
+ * commits, it is most useful to define it so that "non-priority"
+ * parents cannot make us !TREESAME - if we have any priority
+ * parents, then we only consider TREESAMEness with respect to them,
+ * allowing irrelevant merges from uninteresting branches to be
+ * simplified away. Only if we have only nonpriority parents do we
+ * base TREESAME on them. Note that this logic is replicated in
+ * update_treesame, which should be kept in sync.
+ */
+ if (priority_parents ? !priority_change : !nonpriority_change)
+ commit->object.flags |= TREESAME;
}
static void commit_list_insert_by_date_cached(struct commit *p, struct commit_list **head,
@@ -1002,6 +1118,23 @@ static int limit_list(struct rev_info *revs)
free_commit_list(bottom);
}
+ /*
+ * Check if any commits have become TREESAME by some of their parents
+ * becoming UNINTERESTING. Note that TREESAME is irrelevant unless
+ * prune && dense; if simplify_history is set, we can't have a mixture
+ * of TREESAME and !TREESAME INTERESTING parents, so this can't happen
+ * (and we don't have treesame[] decoration anyway); if first_parent_only
+ * is set, then the TREESAME flag is locked against the first parent
+ * (and again we lack treesame[] decoration).
+ */
+ if (revs->prune && revs->dense && !revs->simplify_history && !revs->first_parent_only)
+ for (list = newlist; list; list = list->next) {
+ struct commit *c = list->item;
+ if (c->object.flags & (UNINTERESTING | TREESAME))
+ continue;
+ update_treesame(revs, c);
+ }
+
revs->commits = newlist;
return 0;
}
@@ -1015,6 +1148,16 @@ static void add_rev_cmdline(struct rev_info *revs,
struct rev_cmdline_info *info = &revs->cmdline;
int nr = info->nr;
+ /*
+ * Total hack, but at least doing it here means we're consistent
+ * with collect_bottom_commits(). But if we just flagged BOTTOM
+ * commits before this, could we ditch this entire function?
+ */
+ if (item->type == OBJ_COMMIT && (flags & UNINTERESTING)) {
+ item->flags |= BOTTOM;
+ flags |= BOTTOM;
+ }
+
ALLOC_GROW(info->rev, nr + 1, info->alloc);
info->rev[nr].item = item;
info->rev[nr].name = name;
@@ -2233,6 +2376,7 @@ static int remove_marked_parents(struct rev_info *revs, struct commit *commit)
static struct commit_list **simplify_one(struct rev_info *revs, struct commit *commit, struct commit_list **tail)
{
struct commit_list *p;
+ struct commit *parent;
struct merge_simplify_state *st, *pst;
int cnt;
@@ -2311,29 +2455,32 @@ static struct commit_list **simplify_one(struct rev_info *revs, struct commit *c
*/
if (1 < cnt) {
int marked = mark_redundant_parents(revs, commit);
+ //fprintf(stderr, "%s has %d parents, removing %d redundant", sha1_to_hex(commit->object.sha1), cnt, marked);
marked += mark_treesame_root_parents(revs, commit);
if (marked)
marked -= leave_one_treesame_to_parent(revs, commit);
if (marked)
cnt = remove_marked_parents(revs, commit);
+ //fprintf(stderr, ", final=%d\n", cnt);
}
/*
* A commit simplifies to itself if it is a root, if it is
* UNINTERESTING, if it touches the given paths, or if it is a
- * merge and its parents simplify to more than one commit
+ * merge and its parents don't simplify to one priority commit
* (the first two cases are already handled at the beginning of
* this function).
*
- * Otherwise, it simplifies to what its sole parent simplifies to.
+ * Otherwise, it simplifies to what its sole priority parent
+ * simplifies to.
*/
if (!cnt ||
(commit->object.flags & UNINTERESTING) ||
!(commit->object.flags & TREESAME) ||
- (1 < cnt))
+ (parent = priority_select_parent(revs, commit->parents)) == NULL)
st->simplified = commit;
else {
- pst = locate_simplify_state(revs, commit->parents->item);
+ pst = locate_simplify_state(revs, parent);
st->simplified = pst->simplified;
}
return tail;
@@ -2430,7 +2577,8 @@ int prepare_revision_walk(struct rev_info *revs)
free(list);
/* Signal whether we need per-parent treesame decoration */
- if (revs->simplify_merges)
+ if (revs->simplify_merges ||
+ (revs->limited && limiting_can_increase_treesame(revs)))
revs->treesame.name = "treesame";
if (revs->no_walk != REVISION_WALK_NO_WALK_UNSORTED)
@@ -2464,15 +2612,15 @@ static enum rewrite_result rewrite_one(struct rev_info *revs, struct commit **pp
if (!revs->limited)
if (add_parents_to_list(revs, p, &revs->commits, &cache) < 0)
return rewrite_one_error;
- if (p->parents && p->parents->next)
- return rewrite_one_ok;
if (p->object.flags & UNINTERESTING)
return rewrite_one_ok;
if (!(p->object.flags & TREESAME))
return rewrite_one_ok;
if (!p->parents)
return rewrite_one_noparents;
- *pp = p->parents->item;
+ if ((p = priority_select_parent(revs, p->parents)) == NULL)
+ return rewrite_one_ok;
+ *pp = p;
}
}
@@ -2616,10 +2764,7 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi
if (revs->min_age != -1 && (commit->date > revs->min_age))
return commit_ignore;
if (revs->min_parents || (revs->max_parents >= 0)) {
- int n = 0;
- struct commit_list *p;
- for (p = commit->parents; p; p = p->next)
- n++;
+ int n = commit_list_count(commit->parents);
if ((n < revs->min_parents) ||
((revs->max_parents >= 0) && (n > revs->max_parents)))
return commit_ignore;
@@ -2629,12 +2774,25 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi
if (revs->prune && revs->dense) {
/* Commit without changes? */
if (commit->object.flags & TREESAME) {
+ int n;
+ struct commit_list *p;
/* drop merges unless we want parenthood */
if (!want_ancestry(revs))
return commit_ignore;
- /* non-merge - always ignore it */
- if (!commit->parents || !commit->parents->next)
- return commit_ignore;
+ /*
+ * If we want ancestry, then need to keep any merge
+ * between INTERESTING commits to tie together topology.
+ * This is purely a topology question, so we do not
+ * care about merges from UNINTERESTING commits at the
+ * boundary, or necessarily about merges from BOTTOM
+ * (although a parent being BOTTOM will have impacted
+ * the TREESAME determination above).
+ */
+ for (n = 0, p = commit->parents; p; p = p->next)
+ if (!(p->item->object.flags & UNINTERESTING))
+ if (++n >= 2)
+ return commit_show;
+ return commit_ignore;
}
}
return commit_show;
diff --git a/revision.h b/revision.h
index 1abe57b..765630a 100644
--- a/revision.h
+++ b/revision.h
@@ -15,7 +15,8 @@
#define ADDED (1u<<7) /* Parents already parsed and added? */
#define SYMMETRIC_LEFT (1u<<8)
#define PATCHSAME (1u<<9)
-#define ALL_REV_FLAGS ((1u<<10)-1)
+#define BOTTOM (1u<<10)
+#define ALL_REV_FLAGS ((1u<<11)-1)
#define DECORATE_SHORT_REFS 1
#define DECORATE_FULL_REFS 2
diff --git a/t/t6019-rev-list-ancestry-path.sh b/t/t6019-rev-list-ancestry-path.sh
index ebe79ac..c5a412c 100755
--- a/t/t6019-rev-list-ancestry-path.sh
+++ b/t/t6019-rev-list-ancestry-path.sh
@@ -15,7 +15,8 @@ test_description='--ancestry-path'
# --ancestry-path D..M -- M.t == M
#
# G..M -- G.t == [nothing - was dropped in "-s ours" merge L]
-# --ancestry-path G..M -- G.t == H J L
+# --ancestry-path G..M -- G.t == L
+# --ancestry-path --simplify-merges G^..M -- G.t == G L
. ./test-lib.sh
@@ -82,8 +83,15 @@ test_expect_success 'rev-list G..M -- G.t' '
'
test_expect_success 'rev-list --ancestry-path G..M -- G.t' '
- for c in H J L; do echo $c; done >expect &&
+ echo L >expect &&
git rev-list --ancestry-path --format=%s G..M -- G.t |
+ sed -e "/^commit /d" >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'rev-list --ancestry-path --simplify-merges G^..M -- G.t' '
+ for c in G L; do echo $c; done >expect &&
+ git rev-list --ancestry-path --simplify-merges --format=%s G^..M -- G.t |
sed -e "/^commit /d" |
sort >actual &&
test_cmp expect actual
--
1.8.2.1.632.gd2b1879
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v2 8/8] revision.c: discount UNINTERESTING parents
2013-05-02 17:52 ` Kevin Bracey
2013-05-02 17:58 ` [PATCH v2.1 8/8] revision.c: discount side branches when computing TREESAME Kevin Bracey
2013-05-02 18:26 ` [PATCH v2.2 " Kevin Bracey
@ 2013-05-02 20:05 ` Junio C Hamano
2013-05-04 20:18 ` Kevin Bracey
2 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2013-05-02 20:05 UTC (permalink / raw)
To: Kevin Bracey; +Cc: git, Linus Torvalds
Kevin Bracey <kevin@bracey.fi> writes:
> On 01/05/2013 00:18, Junio C Hamano wrote:
>
>>> These rules paying more attention to UNINTERESTING do add a tricky
>>> wrinkle to behaviour. Because limited revision lists are conventionally
>>> expressed as A..B (ie B ^A), the bottom commit is UNINTERESTING.
>> OK.
>>
>>> Thus
>>> its connection to the INTERESTING graph is not privileged over side
>>> branches,
>> I take that "its connection" refers to the "===" link below, the
>> nodes connected with "---" form the "INTERESTING graph", and
>>
>> ....Z...A===X---o---o---B
>> \\ /
>> W---Y
>> "side branches" refer to the development that built W and Y and
>> merged at X. And you are saying that A===X is not "privileged" over
>> "Y---X", with some definition of "privileged" I am not sure about.
>
> Okay, that's a good graph. The basic problem is that all the rules
> above try to identify a merge from an irrelevant branch and eliminate
> it. But how can we define what a side branch is?
>
> I think the rules I state are conceptually sound - a side branch is an
> extra parent coming in from outside our limited history graph. But the
> problem is at the bottom. In the event that someone specifies "A..B"
> with the above history, we get the resultant graph "W-Y-X-o-o-B". A is
> not on that graph. So with the rules as they stand, "A" being
> UNINTERESTING makes it get treated as a side branch of X, which isn't
> good. A needs to be INTERESTING for the purpose of side-branch logic.
>
> So when someone says "A..B" and generates "W-Y-X-o-o-B", we want to
> know that X's parent path "(Z)-W-Y-X" is the (possibly irrelevant)
> side branch, not "(A)-X".
OK, I think I understand it, and we are in agreement. For the
purpose of the above paragraph, a side branch is what is not on the
"--ancestry-path", so all of the below "examples" are about the
behaviour when --ancestry-path is used.
> Example undesirable behaviour, with A treated as a side branch:
> ...
> 1) If only commit A changed our file, and merge X took "old" version Y
> for some reason, under these rules "--full-history A..B" would show
> nothing. X doesn't consider A because it's UNINTERESTING. If there had
> been an intervening (even irrelevant) commit A1 between A1 and X, X
> would have been shown.
This is exactly why I hinted that ancestry-path limiting may need to
come before path limiting decides which parents are interesting and
which are irrelevant, and I think it applies the same for the other
two examples.
I haven't thought through the implications of the definition of
"INTERESTING" you propose (without changing the rest of the code to
apply ancestry-path first, I assume).
While I'd like to see a solution that avoids to actually limit with
ancestry-path before limiting with pathspec (because I would imagine
that the most naive way to do so is to scan the graph twice, first
without pathspec to find and cull side branches from "ancestry-path"
point of view and then traverse the resulting graph again with
pathspec to further drop branches that did not contribute to the end
result), I am fairly sure that the end result we produce, whatever
its more efficient implementation would end up to be, _should_ be
equilvalent to the "cull with ancestry path first and then ignore
noncontributing branches with pathspec" semantics.
It is not immediately clear to me that your proposed implementation
will give us that semantics. Oh, also, even though I said "I am
fairly sure" above, you may already have a counter-example that
shows why the semantics I think is right is not useful in some
cases.
I'll have to think about this a bit more, but it is unlikely I have
time to do so for the coming couple of days. Sorry about that.
> All 3 cases work if A is treated as "INTERESTING" for side-branch
> rules. We shouldn't have needed to put in an extra A1 commit to make
> them work.
>
>>> # D---E-------F
>>> # / \ \
>>> +# B---C-G0-G--H---I---J
>>> # / \
>>> # A-------K---------------L--M
>>> #
>>>
>>>
>>> Conceptually, the "ancestry-path" shouldn't get affected by any
>>> pathspec. The range "--ancestry-path G0..M" should be equivalent to
>>> the range "G0..M ^F ^E ^K", and with the rule to ignore non-sameness
>>> with uninteresting side branches, I would have expected that H and J
>>> would be equally irrelevant, because E and F would be outside the
>>> graph we would want to look at sameness.
>
> Those two pathspecs produce the same graph of commits, and yes,
> they've always produced the same thing up until now. But we're trying
> to do something new(ish) here. We're trying to define "side branch",
> to allow us to make more useful pruning comparisons. And we can't
> reliably define "side branch" unless we can reliably define where
> we're coming from.
>
> Looking at this case, given that graph of commits G-H-I-J-L-M
> (produced by any pathspec/flags), that is "easy" because it's
> linear. The bottom of the INTERESTING graph has a single parent, and
> we can follow it straight from bottom to top. We can deduce that
> non-merge "G" is bottom, and thus call the connections to E and F
> "sides". (But that could have been wrong if the graph had been made
> some other way, and the user wasn't asking for history since G0).
>
> But if presented with H-I-J-L-M we get stuck. The lowest commit in our
> graph has 2 parents. How do we distinguish between E and G? Which is
> the side, and which is the bottom? We can only define "side branch"
> here if we know where our bottom is. The version of this patch as
> posted can't distinguish whether E or G is more important, so merge H
> always gets shown. And that makes me unhappy. And note that normal
> merge-simplification will often prune away boring non-merge commits,
> leaving the user-specified UNINTERESTING bottom attached to an
> INTERESTING graph by a merge commit like this. So it would be very
> common to stumble over this first connection onto the INTERESTING
> graph with --simplify-merges.
>
> So my proposal here is that for pruning purposes, we need to mark the
> bottom(s), so we can reliably identify the sides.A side branch is
> spotted by being UNINTERESTING && !BOTTOM.
>
> Doing this means that "--ancestry-path E..M", "--ancestry-path G..M"
> and "G..M ^F ^E ^K" would all produce a walk starting at H, but for
> pruning purposes they will prioritise differences against specified
> bottom commits, and discount them against non-specified boundary
> commits.
>
> So "E..M" will treat "G" as a side branch, and "G..M" will treat "D-E"
> as a side branch. "--ancestry-path E..M" will show H iff it differs
> from E, and "--ancestry-path G..M" will show H iff it differs from G.
>
> And as a result, manually specifying with "G..M ^F ^E ^K" will be
> paying extra attention to merges H, J and L, caring if they differ
> from listed commits E F and K, and not treating them as ignorable side
> branches. H will be shown if it differs from either G or E.
>
> All of which feels right and good to me, but it does mean the neat
> mathematical purity of the commit set model is somewhat disrupted -
> different pathspecs specifying the same set of commits will prune
> differently. But I think the behaviour fits intuitive expectation
> well. When a user specifies "A..B" or "B ^A", they almost always mean
> that they want the change since A, and they’re not thinking of A as a
> “commit set subtraction”. The specific edge to A is important to
> them. So let's be helpful and meet normal expectation, and treat the
> edge between the INTERESTING graph and the specified bottom commit as
> important.
>
> Revised patch 8/8 doing this follows. The fact it adds a BOTTOM flag
> potentially has an impact on other areas, as noted in the comments. If
> we went down this route, the BOTTOM flag addition would obviously want
> to be a separate preceding patch, covering impact on
> collect_bottom_commits etc.
>
> And the thing about "only need merges with 2+ INTERESTING parents for
> topology" should really be separated out too - that's actually quite
> distinct from the TREESAME side branch stuff, and rather more
> straightforward.
>
> Kevin
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2 8/8] revision.c: discount UNINTERESTING parents
2013-05-02 20:05 ` [PATCH v2 8/8] revision.c: discount UNINTERESTING parents Junio C Hamano
@ 2013-05-04 20:18 ` Kevin Bracey
0 siblings, 0 replies; 17+ messages in thread
From: Kevin Bracey @ 2013-05-04 20:18 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Linus Torvalds
On 02/05/2013 23:05, Junio C Hamano wrote:
>
>>> ....Z...A===X---o---o---B
>>> \\ /
>>> W---Y
>>>
> OK, I think I understand it, and we are in agreement. For the
> purpose of the above paragraph, a side branch is what is not on the
> "--ancestry-path", so all of the below "examples" are about the
> behaviour when --ancestry-path is used.
Ah, in fact, no. In some previous mails I was concentrating on
ancestry-path, but those 3 examples were really of ordinary "A..B", with
W+Y in the INTERESTING set. I think the side-branch logic remains sound
and desirable even in the absence of --ancestry-path, so I don't think
this is an ancestry-path change. (And from a basic naive usability point
of view I'm much more interested in improving the more obvious modes
than the rather more obscure --ancestry-path.)
Ancestry path forces side branches to be ignored - it's the "simple"
case for ignoring (and understanding) side branches. But if we let other
modes know where the bottom is, they too can benefit from reliable side
branch logic - we can find out if anything happened on side branches,
but we can also ignore them completely if they turn out to be totally
irrelevant.
When not using ancestry-path, the side branches this patch works on are
thosewhich go off and don't come back - they stub off at some
UNINTERESTING commit other than the bottom(s). If no other limiting is
set, they must have hit an ancestor of our BOTTOM commit(s);
simplify-merges could have potentially pruned away if unlimited. And
this patch restores that pruning ability - simplify-merges can rewrite
them back to just 1 UNINTERESTING merge parent at the boundary (looking
like an ancestry-path boundary), then this patch can chuck the boundary
merge. Hey presto, irrelevant branch now invisible. And the patch also
provides benefits to all other modes.
I'll post v3 of the sequence tomorrow - it includes a new test which
illustrates the changes - it's a 60-or-so-item test set, with about 15
"failures" in a variety of modes that get fixed by this sequence. I
think that should make an excellent discussion topic. We'll see whether
folks agree with my view about what should and shouldn't be shown...
Kevin
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2013-05-04 20:44 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-04-30 17:26 [PATCH v2 0/8] History traversal refinements Kevin Bracey
2013-04-30 17:26 ` [PATCH v2 1/8] decorate.c: compact table when growing Kevin Bracey
2013-04-30 17:26 ` [PATCH v2 2/8] t6019: test file dropped in -s ours merge Kevin Bracey
2013-04-30 17:26 ` [PATCH v2 3/8] rev-list-options.txt: correct TREESAME for P Kevin Bracey
2013-04-30 17:26 ` [PATCH v2 4/8] revision.c: Make --full-history consider more merges Kevin Bracey
2013-04-30 17:26 ` [PATCH v2 5/8] t6012: update test for tweaked full-history traversal Kevin Bracey
2013-04-30 17:26 ` [PATCH v2 6/8] simplify-merges: never remove all TREESAME parents Kevin Bracey
2013-04-30 17:26 ` [PATCH v2 7/8] simplify-merges: drop merge from irrelevant side branch Kevin Bracey
2013-04-30 20:54 ` Junio C Hamano
2013-04-30 17:26 ` [PATCH v2 8/8] revision.c: discount UNINTERESTING parents Kevin Bracey
2013-04-30 21:18 ` Junio C Hamano
2013-05-02 17:52 ` Kevin Bracey
2013-05-02 17:58 ` [PATCH v2.1 8/8] revision.c: discount side branches when computing TREESAME Kevin Bracey
2013-05-02 18:26 ` [PATCH v2.2 " Kevin Bracey
2013-05-02 20:05 ` [PATCH v2 8/8] revision.c: discount UNINTERESTING parents Junio C Hamano
2013-05-04 20:18 ` Kevin Bracey
2013-04-30 21:28 ` [PATCH v2 0/8] History traversal refinements Junio C Hamano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).