From: Junio C Hamano <gitster@pobox.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Roman Zippel <zippel@linux-m68k.org>,
Martin Langhoff <martin.langhoff@gmail.com>,
Tim Harper <timcharper@gmail.com>,
git@vger.kernel.org
Subject: [PATCH v2] revision traversal: show full history with merge simplification
Date: Thu, 31 Jul 2008 01:17:41 -0700 [thread overview]
Message-ID: <7vhca6zcuy.fsf@gitster.siamese.dyndns.org> (raw)
In-Reply-To: 7vej5b3ozz.fsf@gitster.siamese.dyndns.org
The --full-history traversal keeps all merges in addition to non-merge
commits that touch paths in the given pathspec. This is useful to view
both sides of a merge in a topology like this:
A---M---o
/ /
---O---B
even when A and B makes identical change to the given paths. The revision
traversal without --full-history aims to come up with the simplest history
to explain the final state of the tree, and one of the side branches can
be pruned away.
The behaviour to keep all merges however is inconvenient if neither A nor
B touches the paths we are interested in. --full-history reduces the
topology to:
---O---M---o
in such a case, without removing M.
This adds a post processing phase on top of --full-history traversal to
remove needless merges from the resulting history.
The idea is to compute, for each commit in the "full history" result set,
the commit that should replace it in the simplified history. The commit
to replace it in the final history is determined as follows:
* In any case, we first figure out the replacement commits of parents of
the commit we are looking at. The commit we are looking at is
rewritten as if the replacement commits of its original parents are its
parents. While doing so, we reduce the redundant parents from the
rewritten parent list by not just removing the identical ones, but also
removing a parent that is an ancestor of another parent.
* After the above parent simplification, if the commit is a root commit,
an UNINTERESTING commit, a merge commit, or modifies the paths we are
interested in, then the replacement commit of the commit is itself. In
other words, such a commit is not dropped from the final result.
The first point above essentially means that the history is rewritten in
the bottom up direction. We can rewrite the parent list of a commit only
after we know how all of its parents are rewritten. This means that the
processing needs to happen on the full history (i.e. after limit_list()).
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
* Changes from the "quick hack" are:
- When the history is bounded at the bottom, the v1 patch did not
terminate, because it wanted to know the replacement for
UNINTERESTING parents of commits on revs->commit list, but these
parents were never processed. Oops.
- The option implies rewrite_parents. I was tempted to make it imply
"--parents" (which would make it always emit parent information as
well), but didn't.
- Toposort is still implied but it is done at the end.
- The code is more heavily commented.
- I do not think "--post-simplify" is particulary a good name, but I
couldn't come up with a good one. To mark it clearly not ready for
'master', I changed the name to a meaningless word for now ;-)
* Timings (best of 5 runs)
$ git rev-list --parents --full-history --topo-order HEAD -- kernel/printk.c
3.75user 0.47system 0:04.22elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
$ git rev-list --parents --oyoyo HEAD -- kernel/printk.c
4.31user 0.06system 0:04.37elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
$ git rev-list --parents --full-history HEAD -- kernel/printk.c | head -n 200
0.16user 0.02system 0:00.18elapsed 103%CPU (0avgtext+0avgdata 0maxresident)k
revision.c | 171 +++++++++++++++++++++++++++++++++++++++++++++++++++++-------
revision.h | 1 +
2 files changed, 152 insertions(+), 20 deletions(-)
diff --git a/revision.c b/revision.c
index 3897fec..3b59e02 100644
--- a/revision.c
+++ b/revision.c
@@ -1045,6 +1045,11 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
} else if (!strcmp(arg, "--topo-order")) {
revs->lifo = 1;
revs->topo_order = 1;
+ } else if (!strcmp(arg, "--oyoyo")) {
+ revs->post_simplify = 1;
+ revs->rewrite_parents = 1;
+ revs->simplify_history = 0;
+ revs->limited = 1;
} else if (!strcmp(arg, "--date-order")) {
revs->lifo = 0;
revs->topo_order = 1;
@@ -1378,6 +1383,150 @@ static void add_child(struct rev_info *revs, struct commit *parent, struct commi
l->next = add_decoration(&revs->children, &parent->object, l);
}
+static int remove_duplicate_parents(struct commit *commit)
+{
+ struct commit_list **pp, *p;
+ int surviving_parents;
+
+ /* Examine existing parents while marking ones we have seen... */
+ pp = &commit->parents;
+ while ((p = *pp) != NULL) {
+ struct commit *parent = p->item;
+ if (parent->object.flags & TMP_MARK) {
+ *pp = p->next;
+ continue;
+ }
+ parent->object.flags |= TMP_MARK;
+ pp = &p->next;
+ }
+ /* count them while clearing the temporary mark */
+ surviving_parents = 0;
+ for (p = commit->parents; p; p = p->next) {
+ p->item->object.flags &= ~TMP_MARK;
+ surviving_parents++;
+ }
+ return surviving_parents;
+}
+
+static struct commit_list **post_simplify_one(struct commit *commit, struct commit_list **tail)
+{
+ struct commit_list *p;
+ int cnt;
+
+ /*
+ * We store which commit each one simplifies to in its util field.
+ * Have we handled this one?
+ */
+ if (commit->util)
+ return tail;
+
+ /*
+ * An UNINTERESTING commit simplifies to itself, so does a
+ * root commit. We do not rewrite parents of such commit
+ * anyway.
+ */
+ if ((commit->object.flags & UNINTERESTING) || !commit->parents) {
+ commit->util = commit;
+ return tail;
+ }
+
+ /*
+ * Do we know what commit all of our parents should be rewritten to?
+ * Otherwise we are not ready to rewrite this one yet.
+ */
+ for (cnt = 0, p = commit->parents; p; p = p->next) {
+ if (!p->item->util) {
+ tail = &commit_list_insert(p->item, tail)->next;
+ cnt++;
+ }
+ }
+ if (cnt)
+ return tail;
+
+ /*
+ * Rewrite our list of parents.
+ */
+ for (p = commit->parents; p; p = p->next)
+ p->item = p->item->util;
+ cnt = remove_duplicate_parents(commit);
+
+ /*
+ * It is possible that we are a merge and one side branch
+ * does not have any commit that touches the given paths;
+ * in such a case, the immediate parents will be rewritten
+ * to different commits.
+ *
+ * o----X X: the commit we are looking at;
+ * / / o: a commit that touches the paths;
+ * ---o----'
+ *
+ * Further reduce the parents by removing redundant parents.
+ */
+ if (1 < cnt) {
+ struct commit_list *h = reduce_heads(commit->parents);
+ cnt = commit_list_count(h);
+ free_commit_list(commit->parents);
+ commit->parents = h;
+ }
+
+ /*
+ * A commit simplifies to itself if it is a root, if it is
+ * UNINTERESTING, if it touches the given paths, or if it is a
+ * merge and its parents simplifies to more than one commits
+ * (the first two cases are already handled at the beginning of
+ * this function).
+ *
+ * Otherwise, it simplifies to what its sole parent simplifies to.
+ */
+ if (!cnt ||
+ (commit->object.flags & UNINTERESTING) ||
+ !(commit->object.flags & TREESAME) ||
+ (1 < cnt))
+ commit->util = commit;
+ else
+ commit->util = commit->parents->item->util;
+ return tail;
+}
+
+static void post_simplify(struct rev_info *revs)
+{
+ struct commit_list *list;
+ struct commit_list *yet_to_do, **tail;
+
+ /* feed the list reversed */
+ yet_to_do = NULL;
+ for (list = revs->commits; list; list = list->next)
+ commit_list_insert(list->item, &yet_to_do);
+ while (yet_to_do) {
+ list = yet_to_do;
+ yet_to_do = NULL;
+ tail = &yet_to_do;
+ while (list) {
+ struct commit *commit = list->item;
+ struct commit_list *next = list->next;
+ free(list);
+ list = next;
+ tail = post_simplify_one(commit, tail);
+ }
+ }
+
+ /* clean up the result, removing the simplified ones */
+ list = revs->commits;
+ revs->commits = NULL;
+ tail = &revs->commits;
+ while (list) {
+ struct commit *commit = list->item;
+ struct commit_list *next = list->next;
+ free(list);
+ list = next;
+ if (commit->util == commit)
+ tail = &commit_list_insert(commit, tail)->next;
+ }
+
+ /* sort topologically at the end */
+ sort_in_topological_order(&revs->commits, revs->lifo);
+}
+
static void set_children(struct rev_info *revs)
{
struct commit_list *l;
@@ -1418,6 +1567,8 @@ int prepare_revision_walk(struct rev_info *revs)
return -1;
if (revs->topo_order)
sort_in_topological_order(&revs->commits, revs->lifo);
+ if (revs->post_simplify)
+ post_simplify(revs);
if (revs->children.name)
set_children(revs);
return 0;
@@ -1450,26 +1601,6 @@ static enum rewrite_result rewrite_one(struct rev_info *revs, struct commit **pp
}
}
-static void remove_duplicate_parents(struct commit *commit)
-{
- struct commit_list **pp, *p;
-
- /* Examine existing parents while marking ones we have seen... */
- pp = &commit->parents;
- while ((p = *pp) != NULL) {
- struct commit *parent = p->item;
- if (parent->object.flags & TMP_MARK) {
- *pp = p->next;
- continue;
- }
- parent->object.flags |= TMP_MARK;
- pp = &p->next;
- }
- /* ... and clear the temporary mark */
- for (p = commit->parents; p; p = p->next)
- p->item->object.flags &= ~TMP_MARK;
-}
-
static int rewrite_parents(struct rev_info *revs, struct commit *commit)
{
struct commit_list **pp = &commit->parents;
diff --git a/revision.h b/revision.h
index f64e8ce..953e69b 100644
--- a/revision.h
+++ b/revision.h
@@ -41,6 +41,7 @@ struct rev_info {
simplify_history:1,
lifo:1,
topo_order:1,
+ post_simplify:1,
tag_objects:1,
tree_objects:1,
blob_objects:1,
--
1.6.0.rc1.31.gf448e
next prev parent reply other threads:[~2008-07-31 8:19 UTC|newest]
Thread overview: 58+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-07-21 20:26 Bizarre missing changes (git bug?) Tim Harper
2008-07-21 20:37 ` Linus Torvalds
2008-07-21 22:53 ` Tim Harper
2008-07-21 22:55 ` Tim Harper
[not found] ` <8C23FB54-A28E-4294-ABEA-A5766200768B@gmail.com>
2008-07-21 22:57 ` Linus Torvalds
2008-07-26 3:12 ` Roman Zippel
2008-07-26 19:58 ` Linus Torvalds
2008-07-27 17:50 ` Roman Zippel
2008-07-27 18:47 ` Linus Torvalds
2008-07-27 23:14 ` Roman Zippel
2008-07-27 23:18 ` Linus Torvalds
2008-07-28 0:00 ` Roman Zippel
2008-07-28 5:00 ` Linus Torvalds
2008-07-28 5:30 ` Linus Torvalds
2008-07-29 2:59 ` Roman Zippel
2008-07-29 3:15 ` Martin Langhoff
2008-07-30 0:16 ` Roman Zippel
2008-07-30 0:25 ` Martin Langhoff
2008-07-30 0:32 ` Linus Torvalds
2008-07-30 0:48 ` Linus Torvalds
2008-07-30 23:56 ` Junio C Hamano
2008-07-31 0:15 ` Junio C Hamano
2008-07-31 0:30 ` Linus Torvalds
2008-07-31 8:17 ` Junio C Hamano [this message]
2008-07-31 8:18 ` [PATCH v2] revision traversal: show full history with merge simplification Junio C Hamano
2008-07-31 22:30 ` Linus Torvalds
2008-07-31 22:09 ` [PATCH v3-wip] " Junio C Hamano
2008-07-31 22:26 ` Linus Torvalds
2008-07-31 22:36 ` Junio C Hamano
2008-08-01 3:00 ` Junio C Hamano
2008-08-01 3:48 ` Linus Torvalds
2008-08-01 7:50 ` Junio C Hamano
2008-07-30 8:36 ` Bizarre missing changes (git bug?) Jakub Narebski
2008-07-29 3:29 ` Linus Torvalds
2008-07-29 3:33 ` Linus Torvalds
2008-07-29 11:39 ` Roman Zippel
2008-07-29 12:00 ` David Kastrup
2008-07-29 15:50 ` Linus Torvalds
2008-07-30 1:14 ` Roman Zippel
2008-07-30 1:32 ` Kevin Ballard
2008-07-30 1:49 ` Linus Torvalds
2008-07-29 5:31 ` Jeff King
2008-07-29 12:32 ` Roman Zippel
2008-07-29 12:48 ` Olivier Galibert
2008-07-29 12:52 ` Jeff King
2008-07-29 17:25 ` Linus Torvalds
2008-07-30 1:50 ` Roman Zippel
2008-07-30 2:05 ` Linus Torvalds
2008-07-30 4:26 ` Jeff King
2008-07-30 4:52 ` Linus Torvalds
2008-07-30 2:48 ` Roman Zippel
2008-07-30 3:20 ` Kevin Ballard
2008-07-30 3:21 ` Linus Torvalds
2008-07-30 3:35 ` Linus Torvalds
2008-07-30 4:23 ` Jeff King
2008-07-27 23:25 ` Martin Langhoff
2008-07-28 1:29 ` Roman Zippel
2008-07-21 20:42 ` Alex Riesen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7vhca6zcuy.fsf@gitster.siamese.dyndns.org \
--to=gitster@pobox.com \
--cc=git@vger.kernel.org \
--cc=martin.langhoff@gmail.com \
--cc=timcharper@gmail.com \
--cc=torvalds@linux-foundation.org \
--cc=zippel@linux-m68k.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).