From: Derrick Stolee <stolee@gmail.com>
To: git@vger.kernel.org
Cc: gitster@pobox.com, avarab@gmail.com, szeder.dev@gmail.com,
peff@peff.net, jnareb@gmail.com,
Derrick Stolee <dstolee@microsoft.com>
Subject: [PATCH v5 0/7] Use generation numbers for --topo-order
Date: Thu, 1 Nov 2018 13:46:16 +0000 [thread overview]
Message-ID: <20181101134623.84055-1-dstolee@microsoft.com> (raw)
In-Reply-To: <pull.25.v4.git.gitgitgadget@gmail.com>
This patch series performs a decently-sized refactoring of the
revision-walk machinery. Well, "refactoring" is probably the wrong word,
as I don't actually remove the old code. Instead, when we see certain
options in the 'rev_info' struct, we redirect the commit-walk logic to
a new set of methods that distribute the workload differently. By using
generation numbers in the commit-graph, we can significantly improve
'git log --graph' commands (and the underlying 'git rev-list --topo-order').
On the Linux repository, I got the following performance results when
comparing to the previous version with or without a commit-graph:
Test: git rev-list --topo-order -100 HEAD
HEAD~1, no commit-graph: 6.80 s
HEAD~1, w/ commit-graph: 0.77 s
HEAD, w/ commit-graph: 0.02 s
Test: git rev-list --topo-order -100 HEAD -- tools
HEAD~1, no commit-graph: 9.63 s
HEAD~1, w/ commit-graph: 6.06 s
HEAD, w/ commit-graph: 0.06 s
If you want to read this series but are unfamiliar with the commit-graph
and generation numbers, then I recommend reading
`Documentation/technical/commit-graph.txt` or a blog post [1] I wrote on
the subject. In particular, the three-part walk described in "revision.c:
refactor basic topo-order logic" is present (but underexplained) as an
animated PNG [2].
**UPDATED** Now that we have had some review and some dogfooding, I'm
removing the paragraph I had here about "RFC quality". I think this is
ready to merge!
One notable case that is not included in this series is the case of a
history comparison such as 'git rev-list --topo-order A..B'. The existing
code in limit_list() has ways to cut the walk short when all pending
commits are UNINTERESTING. Since this code depends on commit_list instead
of the prio_queue we are using here, I chose to leave it untouched for now.
We can revisit it in a separate series later. Since handle_commit() turns
on revs->limited when a commit is UNINTERESTING, we do not hit the new
code in this case. Removing this 'revs->limited = 1;' line yields correct
results, but the performance can be worse.
**UPDATED** See the discussion about Generation Number V2 [4] for more
on this topic.
Changes in V5: Thanks Jakub for feedback on the huge commit! I think
I've responded to all the code feedback. See the range-diff at the
end of this cover-page.
Thanks,
-Stolee
[1] https://blogs.msdn.microsoft.com/devops/2018/07/09/supercharging-the-git-commit-graph-iii-generations/
Supercharging the Git Commit Graph III: Generations and Graph Algorithms
[2] https://msdnshared.blob.core.windows.net/media/2018/06/commit-graph-topo-order-b-a.png
Animation showing three-part walk
[3] https://github.com/derrickstolee/git/tree/topo-order/test
A branch containing this series along with commits to compute commit-graph in entire test suite.
[4] https://public-inbox.org/git/6367e30a-1b3a-4fe9-611b-d931f51effef@gmail.com/
[RFC] Generation Number v2
Note: I'm not submitting this version via GitGitGadget because it's
currently struggling with how to handle a PR in a conflict state.
The new flags in revision.h have a conflict with recent changes in
master.
Derrick Stolee (7):
prio-queue: add 'peek' operation
test-reach: add run_three_modes method
test-reach: add rev-list tests
revision.c: begin refactoring --topo-order logic
commit/revisions: bookkeeping before refactoring
revision.c: generation-based topo-order algorithm
t6012: make rev-list tests more interesting
commit.c | 9 +-
commit.h | 7 +
object.h | 4 +-
prio-queue.c | 9 ++
prio-queue.h | 6 +
revision.c | 243 +++++++++++++++++++++++++++++++++--
revision.h | 6 +
t/helper/test-prio-queue.c | 26 ++--
t/t0009-prio-queue.sh | 14 ++
t/t6012-rev-list-simplify.sh | 45 +++++--
t/t6600-test-reach.sh | 96 +++++++++++++-
11 files changed, 426 insertions(+), 39 deletions(-)
base-commit: 2d3b1c576c85b7f5db1f418907af00ab88e0c303
--
2.19.1.542.gc4df23f792
-->8--
1: 2358cfd5ed = 1: 7c75a56505 prio-queue: add 'peek' operation
2: 3a4b68e479 = 2: 686c4370de test-reach: add run_three_modes method
3: 12a3f6d367 = 3: 7410c00982 test-reach: add rev-list tests
4: cd9eef9688 = 4: 5439e11e37 revision.c: begin refactoring --topo-order logic
5: f3e291665d ! 5: 71554deb9b commit/revisions: bookkeeping before refactoring
@@ -9,8 +9,8 @@
compare_commits_by_author_date() in revision.c. These are used
currently by sort_in_topological_order() in commit.c.
- 2. Moving these methods to commit.h requires adding the author_slab
- definition to commit.h.
+ 2. Moving these methods to commit.h requires adding an author_date_slab
+ declaration to commit.h. Consumers will need their own implementation.
3. The add_parents_to_list() method in revision.c performs logic
around the UNINTERESTING flag and other special cases depending
@@ -31,8 +31,7 @@
define_commit_slab(indegree_slab, int);
-/* record author-date for each commit object */
--define_commit_slab(author_date_slab, timestamp_t);
-+implement_shared_commit_slab(author_date_slab, timestamp_t);
+ define_commit_slab(author_date_slab, timestamp_t);
-static void record_author_date(struct author_date_slab *author_date,
- struct commit *commit)
@@ -69,8 +68,7 @@
extern int check_commit_signature(const struct commit *commit, struct signature_check *sigc);
+/* record author-date for each commit object */
-+define_shared_commit_slab(author_date_slab, timestamp_t);
-+
++struct author_date_slab;
+void record_author_date(struct author_date_slab *author_date,
+ struct commit *commit);
+
6: aa0bb2221d ! 6: 84c142e0bc revision.c: generation-based topo-order algorithm
@@ -195,6 +195,7 @@
-struct topo_walk_info {};
+define_commit_slab(indegree_slab, int);
++define_commit_slab(author_date_slab, timestamp_t);
+
+struct topo_walk_info {
+ uint32_t min_generation;
@@ -243,12 +244,12 @@
+}
+
+static void explore_to_depth(struct rev_info *revs,
-+ uint32_t gen)
++ uint32_t gen_cutoff)
+{
+ struct topo_walk_info *info = revs->topo_walk_info;
+ struct commit *c;
+ while ((c = prio_queue_peek(&info->explore_queue)) &&
-+ c->generation >= gen)
++ c->generation >= gen_cutoff)
+ explore_walk_step(revs);
+}
+
@@ -266,9 +267,6 @@
+
+ explore_to_depth(revs, c->generation);
+
-+ if (parse_commit_gently(c, 1) < 0)
-+ return;
-+
+ for (p = c->parents; p; p = p->next) {
+ struct commit *parent = p->item;
+ int *pi = indegree_slab_at(&info->indegree, parent);
@@ -285,12 +283,13 @@
+ }
+}
+
-+static void compute_indegrees_to_depth(struct rev_info *revs)
++static void compute_indegrees_to_depth(struct rev_info *revs,
++ uint32_t gen_cutoff)
+{
+ struct topo_walk_info *info = revs->topo_walk_info;
+ struct commit *c;
+ while ((c = prio_queue_peek(&info->indegree_queue)) &&
-+ c->generation >= info->min_generation)
++ c->generation >= gen_cutoff)
+ indegree_walk_step(revs);
+}
@@ -305,9 +304,9 @@
- limit_list(revs);
- sort_in_topological_order(&revs->commits, revs->sort_order);
+ init_indegree_slab(&info->indegree);
-+ memset(&info->explore_queue, '\0', sizeof(info->explore_queue));
-+ memset(&info->indegree_queue, '\0', sizeof(info->indegree_queue));
-+ memset(&info->topo_queue, '\0', sizeof(info->topo_queue));
++ memset(&info->explore_queue, 0, sizeof(info->explore_queue));
++ memset(&info->indegree_queue, 0, sizeof(info->indegree_queue));
++ memset(&info->topo_queue, 0, sizeof(info->topo_queue));
+
+ switch (revs->sort_order) {
+ default: /* REV_SORT_IN_GRAPH_ORDER */
@@ -329,23 +328,22 @@
+ info->min_generation = GENERATION_NUMBER_INFINITY;
+ for (list = revs->commits; list; list = list->next) {
+ struct commit *c = list->item;
-+ test_flag_and_insert(&info->explore_queue, c, TOPO_WALK_EXPLORED);
-+ test_flag_and_insert(&info->indegree_queue, c, TOPO_WALK_INDEGREE);
+
+ if (parse_commit_gently(c, 1))
+ continue;
++
++ test_flag_and_insert(&info->explore_queue, c, TOPO_WALK_EXPLORED);
++ test_flag_and_insert(&info->indegree_queue, c, TOPO_WALK_INDEGREE);
++
+ if (c->generation < info->min_generation)
+ info->min_generation = c->generation;
-+ }
+
-+ for (list = revs->commits; list; list = list->next) {
-+ struct commit *c = list->item;
+ *(indegree_slab_at(&info->indegree, c)) = 1;
+
+ if (revs->sort_order == REV_SORT_BY_AUTHOR_DATE)
+ record_author_date(&info->author_date, c);
+ }
-+ compute_indegrees_to_depth(revs);
++ compute_indegrees_to_depth(revs, info->min_generation);
+
+ for (list = revs->commits; list; list = list->next) {
+ struct commit *c = list->item;
@@ -385,9 +383,8 @@
+ if (process_parents(revs, commit, NULL, NULL) < 0) {
if (!revs->ignore_missing_links)
die("Failed to traverse parents of commit %s",
-- oid_to_hex(&commit->object.oid));
-+ oid_to_hex(&commit->object.oid));
-+ }
+ oid_to_hex(&commit->object.oid));
+ }
+
+ for (p = commit->parents; p; p = p->next) {
+ struct commit *parent = p->item;
@@ -398,7 +395,7 @@
+
+ if (parent->generation < info->min_generation) {
+ info->min_generation = parent->generation;
-+ compute_indegrees_to_depth(revs);
++ compute_indegrees_to_depth(revs, info->min_generation);
+ }
+
+ pi = indegree_slab_at(&info->indegree, parent);
@@ -409,9 +406,10 @@
+
+ if (revs->first_parent_only)
+ return;
- }
++ }
}
+ int prepare_revision_walk(struct rev_info *revs)
diff --git a/revision.h b/revision.h
--- a/revision.h
7: a21febe112 = 7: 5479087812 t6012: make rev-list tests more interesting
next prev parent reply other threads:[~2018-11-01 13:46 UTC|newest]
Thread overview: 87+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-08-27 20:41 [PATCH 0/6] Use generation numbers for --topo-order Derrick Stolee via GitGitGadget
2018-08-27 20:41 ` [PATCH 1/6] prio-queue: add 'peek' operation Derrick Stolee via GitGitGadget
2018-08-27 20:41 ` [PATCH 2/6] test-reach: add run_three_modes method Derrick Stolee via GitGitGadget
2018-08-27 20:41 ` [PATCH 3/6] test-reach: add rev-list tests Derrick Stolee via GitGitGadget
2018-08-27 20:41 ` [PATCH 4/6] revision.c: begin refactoring --topo-order logic Derrick Stolee via GitGitGadget
2018-08-27 20:41 ` [PATCH 5/6] commit/revisions: bookkeeping before refactoring Derrick Stolee via GitGitGadget
2018-08-27 20:41 ` [PATCH 6/6] revision.c: refactor basic topo-order logic Derrick Stolee via GitGitGadget
2018-08-27 21:23 ` [PATCH 0/6] Use generation numbers for --topo-order Junio C Hamano
2018-09-18 4:08 ` [PATCH v2 " Derrick Stolee via GitGitGadget
2018-09-18 4:08 ` [PATCH v2 1/6] prio-queue: add 'peek' operation Derrick Stolee via GitGitGadget
2018-09-18 4:08 ` [PATCH v2 2/6] test-reach: add run_three_modes method Derrick Stolee via GitGitGadget
2018-09-18 18:02 ` SZEDER Gábor
2018-09-19 19:31 ` Junio C Hamano
2018-09-19 19:38 ` Junio C Hamano
2018-09-20 21:18 ` Junio C Hamano
2018-09-18 4:08 ` [PATCH v2 3/6] test-reach: add rev-list tests Derrick Stolee via GitGitGadget
2018-09-18 4:08 ` [PATCH v2 4/6] revision.c: begin refactoring --topo-order logic Derrick Stolee via GitGitGadget
2018-09-18 4:08 ` [PATCH v2 5/6] commit/revisions: bookkeeping before refactoring Derrick Stolee via GitGitGadget
2018-09-18 4:08 ` [PATCH v2 6/6] revision.c: refactor basic topo-order logic Derrick Stolee via GitGitGadget
2018-09-18 5:51 ` Ævar Arnfjörð Bjarmason
2018-09-18 6:05 ` [PATCH v2 0/6] Use generation numbers for --topo-order Ævar Arnfjörð Bjarmason
2018-09-21 15:47 ` Derrick Stolee
2018-09-21 17:39 ` [PATCH v3 0/7] " Derrick Stolee via GitGitGadget
2018-09-21 17:39 ` [PATCH v3 1/7] prio-queue: add 'peek' operation Derrick Stolee via GitGitGadget
2018-09-26 19:15 ` Derrick Stolee
2018-10-11 13:54 ` Jeff King
2018-09-21 17:39 ` [PATCH v3 2/7] test-reach: add run_three_modes method Derrick Stolee via GitGitGadget
2018-10-11 13:57 ` Jeff King
2018-09-21 17:39 ` [PATCH v3 3/7] test-reach: add rev-list tests Derrick Stolee via GitGitGadget
2018-10-11 13:58 ` Jeff King
2018-10-12 4:34 ` Junio C Hamano
2018-09-21 17:39 ` [PATCH v3 4/7] revision.c: begin refactoring --topo-order logic Derrick Stolee via GitGitGadget
2018-10-11 14:06 ` Jeff King
2018-10-12 6:33 ` Junio C Hamano
2018-10-12 12:32 ` Derrick Stolee
2018-10-12 16:15 ` Johannes Sixt
2018-10-13 8:05 ` Junio C Hamano
2018-09-21 17:39 ` [PATCH v3 5/7] commit/revisions: bookkeeping before refactoring Derrick Stolee via GitGitGadget
2018-10-11 14:21 ` Jeff King
2018-09-21 17:39 ` [PATCH v3 6/7] revision.h: add whitespace in flag definitions Derrick Stolee via GitGitGadget
2018-09-21 17:39 ` [PATCH v3 7/7] revision.c: refactor basic topo-order logic Derrick Stolee via GitGitGadget
2018-09-27 17:57 ` Derrick Stolee
2018-10-06 16:56 ` Jakub Narebski
2018-10-11 15:35 ` Jeff King
2018-10-11 16:21 ` Derrick Stolee
2018-10-25 9:43 ` Jeff King
2018-10-25 13:00 ` Derrick Stolee
2018-10-11 22:32 ` Stefan Beller
2018-09-21 21:22 ` [PATCH v3 0/7] Use generation numbers for --topo-order Junio C Hamano
2018-10-16 22:36 ` [PATCH v4 " Derrick Stolee via GitGitGadget
2018-10-16 22:36 ` [PATCH v4 1/7] prio-queue: add 'peek' operation Derrick Stolee via GitGitGadget
2018-10-16 22:36 ` [PATCH v4 2/7] test-reach: add run_three_modes method Derrick Stolee via GitGitGadget
2018-10-16 22:36 ` [PATCH v4 3/7] test-reach: add rev-list tests Derrick Stolee via GitGitGadget
2018-10-21 10:21 ` Jakub Narebski
2018-10-21 15:28 ` Derrick Stolee
2018-10-16 22:36 ` [PATCH v4 4/7] revision.c: begin refactoring --topo-order logic Derrick Stolee via GitGitGadget
2018-10-21 15:55 ` Jakub Narebski
2018-10-22 1:12 ` Junio C Hamano
2018-10-22 1:51 ` Derrick Stolee
2018-10-22 1:55 ` [RFC PATCH] revision.c: use new algorithm in A..B case Derrick Stolee
2018-10-25 8:28 ` [PATCH v4 4/7] revision.c: begin refactoring --topo-order logic Junio C Hamano
2018-10-26 20:56 ` Jakub Narebski
2018-10-16 22:36 ` [PATCH v4 5/7] commit/revisions: bookkeeping before refactoring Derrick Stolee via GitGitGadget
2018-10-21 21:17 ` Jakub Narebski
2018-10-16 22:36 ` [PATCH v4 6/7] revision.c: generation-based topo-order algorithm Derrick Stolee via GitGitGadget
2018-10-22 13:37 ` Jakub Narebski
2018-10-23 13:54 ` Derrick Stolee
2018-10-26 16:55 ` Jakub Narebski
2018-10-16 22:36 ` [PATCH v4 7/7] t6012: make rev-list tests more interesting Derrick Stolee via GitGitGadget
2018-10-23 15:48 ` Jakub Narebski
2018-10-21 12:57 ` [PATCH v4 0/7] Use generation numbers for --topo-order Jakub Narebski
2018-11-01 5:21 ` Junio C Hamano
2018-11-01 13:49 ` Derrick Stolee
2018-11-01 23:54 ` Junio C Hamano
2018-11-01 13:46 ` Derrick Stolee [this message]
2018-11-01 13:46 ` [PATCH v5 1/7] prio-queue: add 'peek' operation Derrick Stolee
2018-11-01 13:46 ` [PATCH v5 2/7] test-reach: add run_three_modes method Derrick Stolee
2018-11-01 13:46 ` [PATCH v5 3/7] test-reach: add rev-list tests Derrick Stolee
2018-11-01 13:46 ` [PATCH v5 4/7] revision.c: begin refactoring --topo-order logic Derrick Stolee
2018-11-01 13:46 ` [PATCH v5 5/7] commit/revisions: bookkeeping before refactoring Derrick Stolee
2018-11-01 13:46 ` [PATCH v5 6/7] revision.c: generation-based topo-order algorithm Derrick Stolee
2018-11-01 15:48 ` SZEDER Gábor
2018-11-01 16:12 ` Derrick Stolee
2019-11-08 2:50 ` Mike Hommey
2019-11-11 1:07 ` Derrick Stolee
2019-11-18 23:04 ` SZEDER Gábor
2018-11-01 13:46 ` [PATCH v5 7/7] t6012: make rev-list tests more interesting Derrick Stolee
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181101134623.84055-1-dstolee@microsoft.com \
--to=stolee@gmail.com \
--cc=avarab@gmail.com \
--cc=dstolee@microsoft.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jnareb@gmail.com \
--cc=peff@peff.net \
--cc=szeder.dev@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.