git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] corner cases with "rev-list --use-bitmap-index -n"
@ 2016-06-02 23:09 Jeff King
  2016-06-02 23:10 ` [PATCH 1/2] rev-list: "adjust" results of "--count " Jeff King
  2016-06-02 23:10 ` [PATCH " Jeff King
  0 siblings, 2 replies; 6+ messages in thread
From: Jeff King @ 2016-06-02 23:09 UTC (permalink / raw)
  To: git

This series fixes two corner-cases I found when using "-n" along with
bitmaps. The results do make _some_ sense if you interpret them
correctly, but are sufficiently confusing that I think it's worth
dealing with. See the commit messages for the gory details.

The good news is that in the first case, we can produce a more sensible
answer without spending any extra work.

The bad news is that the second case cannot, and must fall back to a
regular traversal. I doubt anybody even cares about this case in
practice, though, as "--objects -n" is somewhat nonsensical already (I
didn't run across it in practice; I only noticed while I fixing the
first one, which I did see in practice).

  [1/2]: rev-list: "adjust" results of "--count --use-bitmap-index -n"
  [2/2]: rev-list: disable bitmaps when "-n" is used with listing objects

-Peff

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/2] rev-list: "adjust" results of "--count --use-bitmap-index -n"
  2016-06-02 23:09 [PATCH 0/2] corner cases with "rev-list --use-bitmap-index -n" Jeff King
@ 2016-06-02 23:10 ` Jeff King
  2016-06-03  7:06   ` [PATCH v2 0/2] corner cases with "rev-list " Jeff King
  2016-06-02 23:10 ` [PATCH " Jeff King
  1 sibling, 1 reply; 6+ messages in thread
From: Jeff King @ 2016-06-02 23:10 UTC (permalink / raw)
  To: git

If you ask rev-list for:

    git rev-list --count --use-bitmap-index HEAD

we optimize out the actual traversal and just give you the
number of bits set in the commit bitmap. This is faster,
which is good.

But if you ask to limit the size of the traversal, like:

    git rev-list --count --use-bitmap-index -n 100 HEAD

we'll still output the full bitmapped number we found. On
the surface, that might even seem OK. You explicitly asked
to use the bitmap index, and it was cheap to compute the
real answer, so we gave it to you.

But there's something much more complicated going on under
the hood. If we don't have a bitmap directly for HEAD, then
we have to actually traverse backwards, looking for a
bitmapped commit. And _that_ traversal is bounded by our
`-n` count.

This is a good thing, because it bounds the work we have to
do, which is probably what the user wanted by asking for
`-n`. But now it makes the output quite confusing. You might
get many values:

  - your `-n` value, if we walked back and never found a
    bitmap (or fewer if there weren't that many commits)

  - the actual full count, if we found a bitmap root for
    every path of our traversal with in the `-n` limit

  - any number in between! We might have walked back and
    found _some_ bitmaps, but then cut off the traversal
    early with some commits not accounted for in the result.

So you cannot even see a value higher than your `-n` and say
"OK, bitmaps kicked in, this must be the real full count".
The only sane thing is for git to just clamp the value to a
maximum of the `-n` value, which means we should output the
exact same results whether bitmaps are in use or not.

The test in t5310 demonstrates this by using `-n 1`.
Without this patch we fail in the full-bitmap case (where we
do not have to traverse at all) but _not_ in the
partial-bitmap case (where we have to walk down to find an
actual bitmap). With this patch, both cases just work.

I didn't implement the crazy in-between case, just because
it's complicated to set up, and is really a subset of the
full-count case, which we do cover.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/rev-list.c      | 2 ++
 t/t5310-pack-bitmaps.sh | 6 ++++++
 2 files changed, 8 insertions(+)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 275da0d..aaa79a3 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -360,6 +360,8 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 			uint32_t commit_count;
 			if (!prepare_bitmap_walk(&revs)) {
 				count_bitmap_commit_list(&commit_count, NULL, NULL, NULL);
+				if (revs.max_count && revs.max_count < commit_count)
+					commit_count = revs.max_count;
 				printf("%d\n", commit_count);
 				return 0;
 			}
diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
index d446706..3893afd 100755
--- a/t/t5310-pack-bitmaps.sh
+++ b/t/t5310-pack-bitmaps.sh
@@ -47,6 +47,12 @@ rev_list_tests() {
 		test_cmp expect actual
 	'
 
+	test_expect_success "counting commits with limit ($state)" '
+		git rev-list --count -n 1 HEAD >expect &&
+		git rev-list --use-bitmap-index --count -n 1 HEAD >actual &&
+		test_cmp expect actual
+	'
+
 	test_expect_success "counting non-linear history ($state)" '
 		git rev-list --count other...master >expect &&
 		git rev-list --use-bitmap-index --count other...master >actual &&
-- 
2.9.0.rc1.132.g33c7f30

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/2] rev-list: disable bitmaps when "-n" is used with listing objects
  2016-06-02 23:09 [PATCH 0/2] corner cases with "rev-list --use-bitmap-index -n" Jeff King
  2016-06-02 23:10 ` [PATCH 1/2] rev-list: "adjust" results of "--count " Jeff King
@ 2016-06-02 23:10 ` Jeff King
  1 sibling, 0 replies; 6+ messages in thread
From: Jeff King @ 2016-06-02 23:10 UTC (permalink / raw)
  To: git

You can ask rev-list to use bitmaps to speed up an --objects
traversal, which should generally give you your answers much
faster.

Likewise, you can ask rev-list to limit such a traversal
with `-n`, in which case we'll show only a limited set of
commits (and only the tree and commit objects directly
reachable from those commits).

But if you do both together, the results are nonsensical. We
end up limiting any fallback traversal we do to _find_ the
bitmaps, but the actual set of objects we output will be
picked arbitrarily from the union of any bitmaps we do find,
and will involve the objects of many more commits.

It's possible that somebody might want this as a "show me
what you can, but limit the amount of work you do" flag.
But as with the prior commit clamping "--count", the results
are basically non-deterministic; you'll get the values from
some commits between `n` and the total number, and you can't
tell which.

And unlike the `--count` case, we can't easily generate the
"real" value from the bitmap values (you can't just walk
back `-n` commits and subtract out the reachable objects
from the boundary commits; the bitmaps for `X` record its
total reachability, so you don't know which objects are
directly from `X` itself, which from `X^`, and so on).

So let's just fallback to the non-bitmap code path in this
case, so we always give a sane answer.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/rev-list.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index aaa79a3..9fb6608 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -365,7 +365,8 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 				printf("%d\n", commit_count);
 				return 0;
 			}
-		} else if (revs.tag_objects && revs.tree_objects && revs.blob_objects) {
+		} else if (!revs.max_count &&
+			   revs.tag_objects && revs.tree_objects && revs.blob_objects) {
 			if (!prepare_bitmap_walk(&revs)) {
 				traverse_bitmap_commit_list(&show_object_fast);
 				return 0;
-- 
2.9.0.rc1.132.g33c7f30

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 0/2] corner cases with "rev-list --use-bitmap-index -n"
  2016-06-02 23:10 ` [PATCH 1/2] rev-list: "adjust" results of "--count " Jeff King
@ 2016-06-03  7:06   ` Jeff King
  2016-06-03  7:07     ` [PATCH v2 1/2] rev-list: "adjust" results of "--count " Jeff King
  2016-06-03  7:08     ` [PATCH v2 2/2] rev-list: disable bitmaps when "-n" is used with listing objects Jeff King
  0 siblings, 2 replies; 6+ messages in thread
From: Jeff King @ 2016-06-03  7:06 UTC (permalink / raw)
  To: git

On Thu, Jun 02, 2016 at 07:10:31PM -0400, Jeff King wrote:

> diff --git a/builtin/rev-list.c b/builtin/rev-list.c
> index 275da0d..aaa79a3 100644
> --- a/builtin/rev-list.c
> +++ b/builtin/rev-list.c
> @@ -360,6 +360,8 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
>  			uint32_t commit_count;
>  			if (!prepare_bitmap_walk(&revs)) {
>  				count_bitmap_commit_list(&commit_count, NULL, NULL, NULL);
> +				if (revs.max_count && revs.max_count < commit_count)
> +					commit_count = revs.max_count;

...aaaaand this patch is totally wrong. For one thing, the "not set"
sentinel value for max_count is "-1", not "0", so it didn't kick in for
"-n0".

And for another, any fallback traversal will actually decrement
revs.max_count, so we have to save the value before traversing.

And as luck would have it, those two bugs cancel each other out in many
cases (including the ones in the test suite!). If you pass "-n1", gets
decremented to "0".  That would give us a bogus adjustment, but instead
we decide that "0" means no adjustement is necessary, and return the
number of commits we traversed, which is the right answer as long as we
didn't hit any bitmaps on the way down (or we hit one right away, in
which case we didn't decrement at all).

Updated patches in a moment (the second has the same sentinel problem,
too).

-Peff

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2 1/2] rev-list: "adjust" results of "--count --use-bitmap-index -n"
  2016-06-03  7:06   ` [PATCH v2 0/2] corner cases with "rev-list " Jeff King
@ 2016-06-03  7:07     ` Jeff King
  2016-06-03  7:08     ` [PATCH v2 2/2] rev-list: disable bitmaps when "-n" is used with listing objects Jeff King
  1 sibling, 0 replies; 6+ messages in thread
From: Jeff King @ 2016-06-03  7:07 UTC (permalink / raw)
  To: git

If you ask rev-list for:

    git rev-list --count --use-bitmap-index HEAD

we optimize out the actual traversal and just give you the
number of bits set in the commit bitmap. This is faster,
which is good.

But if you ask to limit the size of the traversal, like:

    git rev-list --count --use-bitmap-index -n 100 HEAD

we'll still output the full bitmapped number we found. On
the surface, that might even seem OK. You explicitly asked
to use the bitmap index, and it was cheap to compute the
real answer, so we gave it to you.

But there's something much more complicated going on under
the hood. If we don't have a bitmap directly for HEAD, then
we have to actually traverse backwards, looking for a
bitmapped commit. And _that_ traversal is bounded by our
`-n` count.

This is a good thing, because it bounds the work we have to
do, which is probably what the user wanted by asking for
`-n`. But now it makes the output quite confusing. You might
get many values:

  - your `-n` value, if we walked back and never found a
    bitmap (or fewer if there weren't that many commits)

  - the actual full count, if we found a bitmap root for
    every path of our traversal with in the `-n` limit

  - any number in between! We might have walked back and
    found _some_ bitmaps, but then cut off the traversal
    early with some commits not accounted for in the result.

So you cannot even see a value higher than your `-n` and say
"OK, bitmaps kicked in, this must be the real full count".
The only sane thing is for git to just clamp the value to a
maximum of the `-n` value, which means we should output the
exact same results whether bitmaps are in use or not.

The test in t5310 demonstrates this by using `-n 1`.
Without this patch we fail in the full-bitmap case (where we
do not have to traverse at all) but _not_ in the
partial-bitmap case (where we have to walk down to find an
actual bitmap). With this patch, both cases just work.

I didn't implement the crazy in-between case, just because
it's complicated to set up, and is really a subset of the
full-count case, which we do cover.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/rev-list.c      | 3 +++
 t/t5310-pack-bitmaps.sh | 6 ++++++
 2 files changed, 9 insertions(+)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 275da0d..b29f8fc 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -358,8 +358,11 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 	if (use_bitmap_index && !revs.prune) {
 		if (revs.count && !revs.left_right && !revs.cherry_mark) {
 			uint32_t commit_count;
+			int max_count = revs.max_count;
 			if (!prepare_bitmap_walk(&revs)) {
 				count_bitmap_commit_list(&commit_count, NULL, NULL, NULL);
+				if (max_count >= 0 && max_count < commit_count)
+					commit_count = max_count;
 				printf("%d\n", commit_count);
 				return 0;
 			}
diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
index d446706..3893afd 100755
--- a/t/t5310-pack-bitmaps.sh
+++ b/t/t5310-pack-bitmaps.sh
@@ -47,6 +47,12 @@ rev_list_tests() {
 		test_cmp expect actual
 	'
 
+	test_expect_success "counting commits with limit ($state)" '
+		git rev-list --count -n 1 HEAD >expect &&
+		git rev-list --use-bitmap-index --count -n 1 HEAD >actual &&
+		test_cmp expect actual
+	'
+
 	test_expect_success "counting non-linear history ($state)" '
 		git rev-list --count other...master >expect &&
 		git rev-list --use-bitmap-index --count other...master >actual &&
-- 
2.9.0.rc1.132.g33c7f30

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 2/2] rev-list: disable bitmaps when "-n" is used with listing objects
  2016-06-03  7:06   ` [PATCH v2 0/2] corner cases with "rev-list " Jeff King
  2016-06-03  7:07     ` [PATCH v2 1/2] rev-list: "adjust" results of "--count " Jeff King
@ 2016-06-03  7:08     ` Jeff King
  1 sibling, 0 replies; 6+ messages in thread
From: Jeff King @ 2016-06-03  7:08 UTC (permalink / raw)
  To: git

You can ask rev-list to use bitmaps to speed up an --objects
traversal, which should generally give you your answers much
faster.

Likewise, you can ask rev-list to limit such a traversal
with `-n`, in which case we'll show only a limited set of
commits (and only the tree and commit objects directly
reachable from those commits).

But if you do both together, the results are nonsensical. We
end up limiting any fallback traversal we do to _find_ the
bitmaps, but the actual set of objects we output will be
picked arbitrarily from the union of any bitmaps we do find,
and will involve the objects of many more commits.

It's possible that somebody might want this as a "show me
what you can, but limit the amount of work you do" flag.
But as with the prior commit clamping "--count", the results
are basically non-deterministic; you'll get the values from
some commits between `n` and the total number, and you can't
tell which.

And unlike the `--count` case, we can't easily generate the
"real" value from the bitmap values (you can't just walk
back `-n` commits and subtract out the reachable objects
from the boundary commits; the bitmaps for `X` record its
total reachability, so you don't know which objects are
directly from `X` itself, which from `X^`, and so on).

So let's just fallback to the non-bitmap code path in this
case, so we always give a sane answer.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/rev-list.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index b29f8fc..b82bcc3 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -366,7 +366,8 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 				printf("%d\n", commit_count);
 				return 0;
 			}
-		} else if (revs.tag_objects && revs.tree_objects && revs.blob_objects) {
+		} else if (revs.max_count < 0 &&
+			   revs.tag_objects && revs.tree_objects && revs.blob_objects) {
 			if (!prepare_bitmap_walk(&revs)) {
 				traverse_bitmap_commit_list(&show_object_fast);
 				return 0;
-- 
2.9.0.rc1.132.g33c7f30

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2016-06-03  7:08 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-06-02 23:09 [PATCH 0/2] corner cases with "rev-list --use-bitmap-index -n" Jeff King
2016-06-02 23:10 ` [PATCH 1/2] rev-list: "adjust" results of "--count " Jeff King
2016-06-03  7:06   ` [PATCH v2 0/2] corner cases with "rev-list " Jeff King
2016-06-03  7:07     ` [PATCH v2 1/2] rev-list: "adjust" results of "--count " Jeff King
2016-06-03  7:08     ` [PATCH v2 2/2] rev-list: disable bitmaps when "-n" is used with listing objects Jeff King
2016-06-02 23:10 ` [PATCH " Jeff King

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).