All of lore.kernel.org
 help / color / mirror / Atom feed
From: Derrick Stolee <dstolee@microsoft.com>
To: "git@vger.kernel.org" <git@vger.kernel.org>
Cc: "peff@peff.net" <peff@peff.net>,
	"sbeller@google.com" <sbeller@google.com>,
	"jnareb@gmail.com" <jnareb@gmail.com>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: [RFC PATCH 13/13] commit-reach: use can_all_from_reach
Date: Fri, 29 Jun 2018 16:13:03 +0000	[thread overview]
Message-ID: <20180629161223.229661-14-dstolee@microsoft.com> (raw)
In-Reply-To: <20180629161223.229661-1-dstolee@microsoft.com>

The is_descendant_of method previously used in_merge_bases() to check if
the commit can reach any of the commits in the provided list. This had
two performance problems:

1. The performance is quadratic in worst-case.

2. A single in_merge_bases() call requires walking beyond the target
   commit in order to find the full set of boundary commits that may be
   merge-bases.

The can_all_from_reach method avoids this quadratic behavior and can
limit the search beyond the target commits using generation numbers. It
requires a small prototype adjustment to stop using commit-date as a
cutoff, as that optimization is no longer appropriate here.

Performance was meausured on a copy of the Linux repository using the
'test-tool reach is_descendant_of' command using this input:

A:v4.9
X:v4.10
X:v4.11
X:v4.12
X:v4.13
X:v4.14
X:v4.15
X:v4.16
X:v4.17
X.v3.0

Note that this input is tailored to demonstrate the quadratic nature of
the previous method, as it will compute merge-bases for v4.9 versus all
of the later versions before checking against v4.1.

Before: 0.31 s
 After: 0.27 s

Since we previously used the is_descendant_of method in the ref_newer
method, we also measured performance there using
'test-tool reach ref_newer':

Before: 0.12 s
 After: 0.11 s

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---

One thing I know is missing from this commit is a special-case to use
the old logic when there is no commit-graph present. The
can_all_from_reach() algorithm can be worse when we do not have good
generation number cutoffs. In the previous case of
can_all_from_reach_with_flags(), we already had an established pattern
of using commit date as a cutoff, so the generation number is only a
second cutoff and the algorithm cannot walk more commits than before.

 commit-reach.c        | 34 +++++++++++++++++-----------------
 commit-reach.h        |  3 ++-
 t/helper/test-reach.c |  2 +-
 3 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 249e9a4fac..a823d6965c 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -273,17 +273,15 @@ struct commit_list *get_merge_bases(struct commit *one, struct commit *two)
  */
 int is_descendant_of(struct commit *commit, struct commit_list *with_commit)
 {
+	struct commit_list *from_list = NULL;
+	int result;
 	if (!with_commit)
 		return 1;
-	while (with_commit) {
-		struct commit *other;
 
-		other = with_commit->item;
-		with_commit = with_commit->next;
-		if (in_merge_bases(other, commit))
-			return 1;
-	}
-	return 0;
+	commit_list_insert(commit, &from_list);
+	result = can_all_from_reach(from_list, with_commit, 0);
+	free_commit_list(from_list);
+	return result;
 }
 
 /*
@@ -605,10 +603,11 @@ int can_all_from_reach_with_flag(struct object_array *from,
 	return result;
 }
 
-int can_all_from_reach(struct commit_list *from, struct commit_list *to)
+int can_all_from_reach(struct commit_list *from, struct commit_list *to,
+		       int cutoff_by_min_date)
 {
 	struct object_array from_objs = OBJECT_ARRAY_INIT;
-	time_t min_commit_date = from->item->date;
+	time_t min_commit_date = cutoff_by_min_date ? from->item->date : 0;
 	struct commit_list *from_iter = from;
 	struct commit_list *to_iter = to;
 	int result;
@@ -617,20 +616,21 @@ int can_all_from_reach(struct commit_list *from, struct commit_list *to)
 	while (from_iter) {
 		add_object_array(&from_iter->item->object, NULL, &from_objs);
 
-		if (from_iter->item->date < min_commit_date)
+		if (!parse_commit(from_iter->item) &&
+		    from_iter->item->date < min_commit_date)
 			min_commit_date = from_iter->item->date;
 
 		from_iter = from_iter->next;
 	}
 
 	while (to_iter) {
-		parse_commit(to_iter->item);
-
-		if (to_iter->item->date < min_commit_date)
-			min_commit_date = to_iter->item->date;
+		if (!parse_commit(to_iter->item)) {
+			if (to_iter->item->date < min_commit_date)
+				min_commit_date = to_iter->item->date;
 
-		if (to_iter->item->generation < min_generation)
-			min_generation = to_iter->item->generation;
+			if (to_iter->item->generation < min_generation)
+				min_generation = to_iter->item->generation;
+		}
 
 		to_iter->item->object.flags |= PARENT2;
 
diff --git a/commit-reach.h b/commit-reach.h
index 3eb4c057e6..180f865d7d 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -31,6 +31,7 @@ int can_all_from_reach_with_flag(struct object_array *from, int with_flag,
 				 int assign_flag, time_t min_commit_date,
 				 uint32_t min_generation);
 
-int can_all_from_reach(struct commit_list *from, struct commit_list *to);
+int can_all_from_reach(struct commit_list *from, struct commit_list *to,
+		       int cutoff_by_min_date);
 
 #endif
diff --git a/t/helper/test-reach.c b/t/helper/test-reach.c
index 14aaef5bff..c05137c9f3 100644
--- a/t/helper/test-reach.c
+++ b/t/helper/test-reach.c
@@ -120,7 +120,7 @@ int cmd__reach(int ac, const char **av)
 			list = list->next;
 		}
 	} else if (!strcmp(av[1], "can_all_from_reach")) {
-		int result = can_all_from_reach(list_X, list_Y);
+		int result = can_all_from_reach(list_X, list_Y, 1);
 		printf("%s(X,Y):%d\n", av[1], result);
 	}
 
-- 
2.18.0.118.gd4f65b8d14


  parent reply	other threads:[~2018-06-29 16:13 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-29 16:12 [RFC PATCH 00/13] Consolidate reachability logic Derrick Stolee
2018-06-29 16:12 ` [RFC PATCH 01/13] commit-reach: move walk methods from commit.c Derrick Stolee
2018-06-29 21:35   ` Stefan Beller
2018-06-29 21:52   ` Junio C Hamano
2018-06-29 16:12 ` [RFC PATCH 02/13] commit-reach: move ref_newer from remote.c Derrick Stolee
2018-06-29 16:12 ` [RFC PATCH 03/13] commit-reach: move commit_contains from ref-filter Derrick Stolee
2018-06-29 21:38   ` Stefan Beller
2018-06-30  1:32     ` Derrick Stolee
2018-06-29 22:00   ` Junio C Hamano
2018-06-29 16:12 ` [RFC PATCH 04/13] upload-pack: make reachable() more generic Derrick Stolee
2018-06-29 22:05   ` Junio C Hamano
2018-06-29 16:12 ` [RFC PATCH 05/13] upload-pack: refactor ok_to_give_up() Derrick Stolee
2018-06-29 21:44   ` Stefan Beller
2018-06-29 16:12 ` [RFC PATCH 06/13] commit-reach: move can_all_from_reach_with_flag() Derrick Stolee
2018-06-29 21:47   ` Stefan Beller
2018-06-30  1:35     ` Derrick Stolee
2018-06-29 16:12 ` [RFC PATCH 07/13] test-reach Derrick Stolee
2018-06-29 21:54   ` Stefan Beller
2018-06-30  1:40     ` Derrick Stolee
2018-06-29 16:12 ` [RFC PATCH 08/13] test-reach: test reduce_heads() Derrick Stolee
2018-06-29 22:06   ` Stefan Beller
2018-06-29 16:12 ` [RFC PATCH 09/13] commit-reach: test can_all_from_reach Derrick Stolee
2018-06-29 16:12 ` [RFC PATCH 10/13] commit-reach: test is_descendant_of Derrick Stolee
2018-06-29 16:13 ` [RFC PATCH 11/13] commit-reach: make can_all_from_reach... linear Derrick Stolee
2018-06-29 23:18   ` Stefan Beller
2018-06-29 16:13 ` [RFC PATCH 12/13] commit-reach: use is_descendant_of for ref_newer Derrick Stolee
2018-06-29 16:13 ` Derrick Stolee [this message]
2018-06-29 23:21   ` [RFC PATCH 13/13] commit-reach: use can_all_from_reach Stefan Beller
2018-06-29 17:33 ` [RFC PATCH 00/13] Consolidate reachability logic Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180629161223.229661-14-dstolee@microsoft.com \
    --to=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=jnareb@gmail.com \
    --cc=peff@peff.net \
    --cc=sbeller@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.