* Re: [PATCH 0/4] perf improvements for git-describe with few tags
2024-11-06 19:22 ` [PATCH 0/4] perf improvements for git-describe with few tags Jeff King
@ 2024-11-06 19:26 ` Jeff King
2024-11-06 21:16 ` [PATCH 1/4] t6120: demonstrate weakness in disjoint-root handling Jeff King
` (4 subsequent siblings)
5 siblings, 0 replies; 27+ messages in thread
From: Jeff King @ 2024-11-06 19:26 UTC (permalink / raw)
To: git
Cc: Benno Evers, Rasmus Villemoes, Benno Evers, Josh Poimboeuf,
Masahiro Yamada
On Wed, Nov 06, 2024 at 02:22:37PM -0500, Jeff King wrote:
> So here's the series I came up with, which starts by adjusting the tests
> to be resilient to the later changes, but also to show the existing
> failure mode.
>
> And then the rest of the patches add the performance improvements we've
> been discussing in the thread.
Oops, I just realized there's a silly error in my patch, but I need to
go offline before I can fix it. I'll send patches later today, but I
didn't want to leave anybody confused about the delay. :)
-Peff
^ permalink raw reply [flat|nested] 27+ messages in thread* [PATCH 1/4] t6120: demonstrate weakness in disjoint-root handling
2024-11-06 19:22 ` [PATCH 0/4] perf improvements for git-describe with few tags Jeff King
2024-11-06 19:26 ` Jeff King
@ 2024-11-06 21:16 ` Jeff King
2024-11-06 21:17 ` [PATCH 2/4] t/perf: add tests for git-describe Jeff King
` (3 subsequent siblings)
5 siblings, 0 replies; 27+ messages in thread
From: Jeff King @ 2024-11-06 21:16 UTC (permalink / raw)
To: git
Cc: Benno Evers, Rasmus Villemoes, Benno Evers, Josh Poimboeuf,
Masahiro Yamada
Commit 30b1c7ad9d (describe: don't abort too early when searching tags,
2020-02-26) tried to fix a problem that happens when there are disjoint
histories: to accurately compare the counts for different tags, we need
to keep walking the history longer in order to find a common base.
But its fix misses a case: we may still bail early if we hit the
max_candidates limit, producing suboptimal output. You can see this in
action by adding "--candidates=2" to the tests; we'll stop traversing as
soon as we see the second tag and will produce the wrong answer. I hit
this in practice while trying to teach git-describe not to keep looking
for candidates after we've seen all tags in the repo (effectively adding
--candidates=2, since these toy repos have only two tags each).
This is probably fixable by continuing to walk after hitting the
max-candidates limit, all the way down to a common ancestor of all
candidates. But it's not clear in practice what the preformance
implications would be (it would depend on how long the branches that
hold the candidates are).
So I'm punting on that for now, but I'd like to adjust the tests to be
more resilient, and to document the findings. So this patch:
1. Adds an extra tag at the bottom of history. This shouldn't change
the output, but does mean we are more resilient to low values of
--candidates (e.g., if we start reducing it to the total number of
tags). This is arguably closer to the real world anyway, where
you're not going to have just 2 tags, but an arbitrarily long
history going back in time, possibly with multiple irrelevant tags
in it (I called the new tag "H" here for "history").
2. Run the same tests with --candidates=2, which shows that even with
the current code they can fail if we end the traversal early. That
leaves a trail for anybody interested in trying to improve the
behavior.
Signed-off-by: Jeff King <peff@peff.net>
---
t/t6120-describe.sh | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 05ed2510d9..69689d2f36 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -19,13 +19,17 @@ TEST_PASSES_SANITIZE_LEAK=true
check_describe () {
indir= &&
+ outcome=success &&
while test $# != 0
do
case "$1" in
-C)
indir="$2"
shift
;;
+ --expect-failure)
+ outcome=failure
+ ;;
*)
break
;;
@@ -36,7 +40,7 @@ check_describe () {
expect="$1"
shift
describe_opts="$@"
- test_expect_success "describe $describe_opts" '
+ test_expect_${outcome} "describe $describe_opts" '
git ${indir:+ -C "$indir"} describe $describe_opts >raw &&
sed -e "s/-g[0-9a-f]*\$/-gHASH/" <raw >actual &&
echo "$expect" >expect &&
@@ -617,7 +621,7 @@ test_expect_success 'name-rev --annotate-stdin works with commitGraph' '
# B
# o
-# \
+# H \
# o-----o---o----x
# A
#
@@ -627,6 +631,7 @@ test_expect_success 'setup: describe commits with disjoint bases' '
cd disjoint1 &&
echo o >> file && git add file && git commit -m o &&
+ git tag H -a -m H &&
echo A >> file && git add file && git commit -m A &&
git tag A -a -m A &&
echo o >> file && git add file && git commit -m o &&
@@ -639,8 +644,9 @@ test_expect_success 'setup: describe commits with disjoint bases' '
'
check_describe -C disjoint1 "A-3-gHASH" HEAD
+check_describe -C disjoint1 --expect-failure "A-3-gHASH" --candidates=2 HEAD
-# B
+# H B
# o---o---o------------.
# \
# o---o---x
@@ -658,13 +664,15 @@ test_expect_success 'setup: describe commits with disjoint bases 2' '
git checkout --orphan branch &&
echo o >> file2 && git add file2 && GIT_COMMITTER_DATE="2020-01-01 15:00" git commit -m o &&
echo o >> file2 && git add file2 && GIT_COMMITTER_DATE="2020-01-01 15:01" git commit -m o &&
+ git tag H -a -m H &&
echo B >> file2 && git add file2 && GIT_COMMITTER_DATE="2020-01-01 15:02" git commit -m B &&
git tag B -a -m B &&
git merge --no-ff --allow-unrelated-histories main -m x
)
'
check_describe -C disjoint2 "B-3-gHASH" HEAD
+check_describe -C disjoint2 --expect-failure "B-3-gHASH" --candidates=2 HEAD
test_expect_success 'setup misleading taggerdates' '
GIT_COMMITTER_DATE="2006-12-12 12:31" git tag -a -m "another tag" newer-tag-older-commit unique-file~1
--
2.47.0.441.g1a09955689
^ permalink raw reply related [flat|nested] 27+ messages in thread* [PATCH 2/4] t/perf: add tests for git-describe
2024-11-06 19:22 ` [PATCH 0/4] perf improvements for git-describe with few tags Jeff King
2024-11-06 19:26 ` Jeff King
2024-11-06 21:16 ` [PATCH 1/4] t6120: demonstrate weakness in disjoint-root handling Jeff King
@ 2024-11-06 21:17 ` Jeff King
2024-11-06 21:17 ` [PATCH 3/4] describe: stop digging for max_candidates+1 Jeff King
` (2 subsequent siblings)
5 siblings, 0 replies; 27+ messages in thread
From: Jeff King @ 2024-11-06 21:17 UTC (permalink / raw)
To: git
Cc: Benno Evers, Rasmus Villemoes, Benno Evers, Josh Poimboeuf,
Masahiro Yamada
We don't have a perf script for git-describe, despite it often being
accused of slowness. Let's add a few simple tests to start with.
Rather than use the existing tags from our test repo, we'll make our own
so that we have a known quantity and position. We'll add a "new" tag
near the tip of HEAD, and an "old" one that is at the very bottom. And
then our tests are:
1. Describing HEAD naively requires walking all the way down to the
old tag as we collect candidates. This gives us a baseline for what
"slow" looks like.
2. Doing the same with --candidates=1 can potentially be fast, because
we can quie after finding "new". But we don't, and it's also slow.
3. Likewise we should be able to quit when there are no more tags to
find. This can happen naturally if a repo has few tags, but also if
you restrict the set of tags with --match.
Here are the results running against linux.git. Note that I have a
commit-graph built for the repo, so "slow" here is ~700ms. Without a
commit graph it's more like 9s!
Test HEAD
--------------------------------------------------------------
6100.2: describe HEAD 0.70(0.66+0.04)
6100.3: describe HEAD with one max candidate 0.70(0.66+0.04)
6100.4: describe HEAD with one tag 0.70(0.64+0.06)
Signed-off-by: Jeff King <peff@peff.net>
---
t/perf/p6100-describe.sh | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)
create mode 100755 t/perf/p6100-describe.sh
diff --git a/t/perf/p6100-describe.sh b/t/perf/p6100-describe.sh
new file mode 100755
index 0000000000..069f91ce49
--- /dev/null
+++ b/t/perf/p6100-describe.sh
@@ -0,0 +1,30 @@
+#!/bin/sh
+
+test_description='performance of git-describe'
+. ./perf-lib.sh
+
+test_perf_default_repo
+
+# clear out old tags and give us a known state
+test_expect_success 'set up tags' '
+ git for-each-ref --format="delete %(refname)" refs/tags >to-delete &&
+ git update-ref --stdin <to-delete &&
+ new=$(git rev-list -1000 HEAD | tail -n 1) &&
+ git tag -m new new $new &&
+ old=$(git rev-list HEAD | tail -n 1) &&
+ git tag -m old old $old
+'
+
+test_perf 'describe HEAD' '
+ git describe HEAD
+'
+
+test_perf 'describe HEAD with one max candidate' '
+ git describe --candidates=1 HEAD
+'
+
+test_perf 'describe HEAD with one tag' '
+ git describe --match=new HEAD
+'
+
+test_done
--
2.47.0.441.g1a09955689
^ permalink raw reply related [flat|nested] 27+ messages in thread* [PATCH 3/4] describe: stop digging for max_candidates+1
2024-11-06 19:22 ` [PATCH 0/4] perf improvements for git-describe with few tags Jeff King
` (2 preceding siblings ...)
2024-11-06 21:17 ` [PATCH 2/4] t/perf: add tests for git-describe Jeff King
@ 2024-11-06 21:17 ` Jeff King
2024-11-06 21:17 ` [PATCH 4/4] describe: stop traversing when we run out of names Jeff King
2024-11-26 5:05 ` [PATCH 0/4] perf improvements for git-describe with few tags Junio C Hamano
5 siblings, 0 replies; 27+ messages in thread
From: Jeff King @ 2024-11-06 21:17 UTC (permalink / raw)
To: git
Cc: Benno Evers, Rasmus Villemoes, Benno Evers, Josh Poimboeuf,
Masahiro Yamada
By default, describe considers only 10 candidate matches, and stops
traversing when we have enough. This makes things much faster in a large
repository, where collecting all candidates requires walking all the way
down to the root (or at least to the oldest tag). This goes all the way
back to 8713ab3079 (Improve git-describe performance by reducing
revision listing., 2007-01-13).
However, we don't stop immediately when we have enough candidates. We
keep traversing and only bail when we find one more candidate that we're
ignoring. Usually this is not too expensive, if the tags are sprinkled
evenly throughout history. But if you are unlucky, you might hit the max
candidate quickly, and then have a huge swath of history before finding
the next one.
Our p6100 test has exactly this unlucky case: with a max of "1", we find
a recent tag quickly and then have to go all the way to the root to find
the old tag that will be discarded.
A more interesting real-world case is:
git describe --candidates=1 --match=v6.12-rc4 HEAD
in the linux.git repo. There we restrict the set of tags to a single
one, so there is no older candidate to find at all! But despite
--candidates=1, we keep traversing to the root only to find nothing.
So why do we keep traversing after hitting thet max? There are two
reasons I can see:
1. In theory the extra information that there was another candidate
could be useful, and we record it in the gave_up_on variable. But
we only show this information with --debug.
2. After finding the candidate, there's more processing we do in our
loop. The most important of this is propagating the "within" flags
to our parent commits, and putting them in the commit_list we'll
use for finish_depth_computation().
That function continues the traversal until we've counted all
commits reachable from the starting point but not reachable from
our best candidate tag (so essentially counting "$tag..$start", but
avoiding re-walking over the bits we've seen). If we break
immediately without putting those commits into the list, our depth
computation will be wrong (in the worst case we'll count all the
way down to the root, not realizing those commits are included in
our tag).
But we don't need to find a new candidate for (2). As soon as we finish
the loop iteration where we hit max_candidates, we can then quit on the
next iteration. This should produce the same output as the original code
(which could, after all, find a candidate on the very next commit
anyway) but ends the traversal with less pointless digging.
We still have to set "gave_up_on"; we've popped it off the list and it
has to go back. An alternative would be to re-order the loop so that it
never gets popped, but it's perhaps still useful to show in the --debug
output, so we need to know it anyway. We do have to adjust the --debug
output since it's now just a commit where we stopped traversing, and not
the max+1th candidate.
p6100 shows the speedup using linux.git:
Test HEAD^ HEAD
---------------------------------------------------------------------------------------
6100.2: describe HEAD 0.70(0.63+0.06) 0.71(0.66+0.04) +1.4%
6100.3: describe HEAD with one max candidate 0.70(0.64+0.05) 0.01(0.00+0.00) -98.6%
6100.4: describe HEAD with one tag 0.70(0.67+0.03) 0.70(0.63+0.06) +0.0%
Reported-by: Josh Poimboeuf <jpoimboe@kernel.org>
Helped-by: Rasmus Villemoes <ravi@prevas.dk>
Signed-off-by: Jeff King <peff@peff.net>
---
builtin/describe.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/builtin/describe.c b/builtin/describe.c
index 7330a77b38..69f2d942be 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -366,6 +366,12 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst)
struct commit_name **slot;
seen_commits++;
+
+ if (match_cnt == max_candidates) {
+ gave_up_on = c;
+ break;
+ }
+
slot = commit_names_peek(&commit_names, c);
n = slot ? *slot : NULL;
if (n) {
@@ -381,10 +387,6 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst)
if (n->prio == 2)
annotated_cnt++;
}
- else {
- gave_up_on = c;
- break;
- }
}
for (cur_match = 0; cur_match < match_cnt; cur_match++) {
struct possible_tag *t = &all_matches[cur_match];
@@ -470,9 +472,8 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst)
fprintf(stderr, _("traversed %lu commits\n"), seen_commits);
if (gave_up_on) {
fprintf(stderr,
- _("more than %i tags found; listed %i most recent\n"
- "gave up search at %s\n"),
- max_candidates, max_candidates,
+ _("found %i tags; gave up search at %s\n"),
+ max_candidates,
oid_to_hex(&gave_up_on->object.oid));
}
}
--
2.47.0.441.g1a09955689
^ permalink raw reply related [flat|nested] 27+ messages in thread* [PATCH 4/4] describe: stop traversing when we run out of names
2024-11-06 19:22 ` [PATCH 0/4] perf improvements for git-describe with few tags Jeff King
` (3 preceding siblings ...)
2024-11-06 21:17 ` [PATCH 3/4] describe: stop digging for max_candidates+1 Jeff King
@ 2024-11-06 21:17 ` Jeff King
2024-12-04 23:15 ` [PATCH] fixup! " Josh Steadmon
2024-11-26 5:05 ` [PATCH 0/4] perf improvements for git-describe with few tags Junio C Hamano
5 siblings, 1 reply; 27+ messages in thread
From: Jeff King @ 2024-11-06 21:17 UTC (permalink / raw)
To: git
Cc: Benno Evers, Rasmus Villemoes, Benno Evers, Josh Poimboeuf,
Masahiro Yamada
When trying to describe a commit, we'll traverse from the commit,
collecting candidate tags that point to its ancestors. But once we've
seen all of the tags in the repo, there's no point in traversing
further. There's nothing left to find!
For a default "git describe", this isn't usually a big problem. In a
large repo you'll probably have multiple tags, so we'll eventually find
10 candidates (the default for max_candidates) and stop there. And in a
small repo, it's quick to traverse to the root.
But you can imagine a large repo with few tags. Or, as we saw in a real
world case, explicitly limiting the set of matches like this (on
linux.git):
git describe --match=v6.12-rc4 HEAD
which goes all the way to the root before realizing that no, there are
no other tags under consideration besides the one we fed via --match.
If we add in "--candidates=1" there, it's much faster (at least as of
the previous commit).
But we should be able to speed this up without the user asking for it.
After expanding all matching tags, we know the total number of names. We
could just stop the traversal there, but as hinted at above we already
have a mechanism for doing that: the max_candidate limit. So we can just
reduce that limit to match the number of possible candidates.
Our p6100 test shows this off:
Test HEAD^ HEAD
---------------------------------------------------------------------------------------
6100.2: describe HEAD 0.71(0.65+0.06) 0.72(0.68+0.04) +1.4%
6100.3: describe HEAD with one max candidate 0.01(0.00+0.00) 0.01(0.00+0.00) +0.0%
6100.4: describe HEAD with one tag 0.72(0.66+0.05) 0.01(0.00+0.00) -98.6%
Now we are fast automatically, just as if --candidates=1 were supplied
by the user.
Reported-by: Josh Poimboeuf <jpoimboe@kernel.org>
Helped-by: Rasmus Villemoes <ravi@prevas.dk>
Signed-off-by: Jeff King <peff@peff.net>
---
builtin/describe.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/builtin/describe.c b/builtin/describe.c
index 69f2d942be..8ec3be87df 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -667,6 +667,8 @@ int cmd_describe(int argc,
NULL);
if (!hashmap_get_size(&names) && !always)
die(_("No names found, cannot describe anything."));
+ if (hashmap_get_size(&names) < max_candidates)
+ max_candidates = hashmap_get_size(&names);
if (argc == 0) {
if (broken) {
--
2.47.0.441.g1a09955689
^ permalink raw reply related [flat|nested] 27+ messages in thread* [PATCH] fixup! describe: stop traversing when we run out of names
2024-11-06 21:17 ` [PATCH 4/4] describe: stop traversing when we run out of names Jeff King
@ 2024-12-04 23:15 ` Josh Steadmon
2024-12-04 23:27 ` Jeff King
0 siblings, 1 reply; 27+ messages in thread
From: Josh Steadmon @ 2024-12-04 23:15 UTC (permalink / raw)
To: git; +Cc: peff, gitster, benno.martin.evers, benno, ravi, jpoimboe,
masahiroy
Don't exit when we run out of names if we also set --always
Signed-off-by: Josh Steadmon <steadmon@google.com>
---
builtin/describe.c | 2 +-
t/t6120-describe.sh | 14 ++++++++++++++
2 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/builtin/describe.c b/builtin/describe.c
index 8ec3be87df..065c1bde6e 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -336,7 +336,7 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst)
return;
}
- if (!max_candidates)
+ if (!max_candidates && !always)
die(_("no tag exactly matches '%s'"), oid_to_hex(&cmit->object.oid));
if (debug)
fprintf(stderr, _("No exact match on refs or tags, searching to describe\n"));
diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 5633b11d01..9aebf09d3d 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -715,4 +715,18 @@ test_expect_success 'describe --broken --dirty with a file with changed stat' '
)
'
+test_expect_success '--always with no refs falls back to commit hash' '
+ git init always-no-refs &&
+ (
+ cd always-no-refs &&
+ test_commit --no-tag A &&
+ test_commit --no-tag B &&
+ test_commit --no-tag C &&
+ git describe --abbrev=12 --always HEAD^ >actual &&
+ echo 13 >expected_size &&
+ test_file_size actual >actual_size &&
+ test_cmp expected_size actual_size
+ )
+'
+
test_done
base-commit: a4f8a869558d59677e8d9798666a23391f0b4ca8
--
2.47.0.338.g60cca15819-goog
^ permalink raw reply related [flat|nested] 27+ messages in thread* Re: [PATCH] fixup! describe: stop traversing when we run out of names
2024-12-04 23:15 ` [PATCH] fixup! " Josh Steadmon
@ 2024-12-04 23:27 ` Jeff King
2024-12-04 23:54 ` Jeff King
2024-12-05 20:14 ` [PATCH] describe: drop early return for max_candidates == 0 Jeff King
0 siblings, 2 replies; 27+ messages in thread
From: Jeff King @ 2024-12-04 23:27 UTC (permalink / raw)
To: Josh Steadmon
Cc: git, gitster, benno.martin.evers, benno, ravi, jpoimboe,
masahiroy
On Wed, Dec 04, 2024 at 03:15:42PM -0800, Josh Steadmon wrote:
> diff --git a/builtin/describe.c b/builtin/describe.c
> index 8ec3be87df..065c1bde6e 100644
> --- a/builtin/describe.c
> +++ b/builtin/describe.c
> @@ -336,7 +336,7 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst)
> return;
> }
>
> - if (!max_candidates)
> + if (!max_candidates && !always)
> die(_("no tag exactly matches '%s'"), oid_to_hex(&cmit->object.oid));
> if (debug)
> fprintf(stderr, _("No exact match on refs or tags, searching to describe\n"));
Yep, this is the same spot I found. I think it's the right place to make
the fix.
> Subject: Re: [PATCH] fixup! describe: stop traversing when we run out of names
This commit is already in 'next', so it's too late to squash in a change
(though I'd have done this separately anyway, as it's already an issue
for a manual --candidates=0 setting, as unlikely as that is).
Can you re-send with a full commit message?
> diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
> index 5633b11d01..9aebf09d3d 100755
> --- a/t/t6120-describe.sh
> +++ b/t/t6120-describe.sh
> @@ -715,4 +715,18 @@ test_expect_success 'describe --broken --dirty with a file with changed stat' '
> )
> '
>
> +test_expect_success '--always with no refs falls back to commit hash' '
> + git init always-no-refs &&
> + (
> + cd always-no-refs &&
> + test_commit --no-tag A &&
> + test_commit --no-tag B &&
> + test_commit --no-tag C &&
> + git describe --abbrev=12 --always HEAD^ >actual &&
> + echo 13 >expected_size &&
> + test_file_size actual >actual_size &&
> + test_cmp expected_size actual_size
> + )
> +'
I'm not sure if I'm missing anything subtle, but this seems more
complicated than necessary to show the bug. I think just the exit code
of:
git describe --match=does-not-exist --always HEAD
is sufficient, even in a repo with tags. If you really want to check
stdout, then probably comparing against:
git rev-list -1 --abbrev-commit --abbrev=13 HEAD >expect &&
test_cmp expect actual
is a little more obvious than the size check.
-Peff
^ permalink raw reply [flat|nested] 27+ messages in thread* Re: [PATCH] fixup! describe: stop traversing when we run out of names
2024-12-04 23:27 ` Jeff King
@ 2024-12-04 23:54 ` Jeff King
2024-12-05 20:14 ` [PATCH] describe: drop early return for max_candidates == 0 Jeff King
1 sibling, 0 replies; 27+ messages in thread
From: Jeff King @ 2024-12-04 23:54 UTC (permalink / raw)
To: Josh Steadmon
Cc: git, gitster, benno.martin.evers, benno, ravi, jpoimboe,
masahiroy
On Wed, Dec 04, 2024 at 06:27:50PM -0500, Jeff King wrote:
> > - if (!max_candidates)
> > + if (!max_candidates && !always)
> > die(_("no tag exactly matches '%s'"), oid_to_hex(&cmit->object.oid));
> > if (debug)
> > fprintf(stderr, _("No exact match on refs or tags, searching to describe\n"));
>
> Yep, this is the same spot I found. I think it's the right place to make
> the fix.
>
> > Subject: Re: [PATCH] fixup! describe: stop traversing when we run out of names
>
> This commit is already in 'next', so it's too late to squash in a change
> (though I'd have done this separately anyway, as it's already an issue
> for a manual --candidates=0 setting, as unlikely as that is).
>
> Can you re-send with a full commit message?
In case it helps with writing a commit message:
I wondered why this line was there at all. It comes from 2c33f75754
(Teach git-describe --exact-match to avoid expensive tag searches,
2008-02-24). The --exact-match option is implemented by setting
max-candidates to 0. So:
git describe --exact-match --always
has always been broken, but probably nobody ever cared. My series
reduces the max_candidates setting automatically when there is nothing
to find, which means you are more likely to hit the bug.
-Peff
^ permalink raw reply [flat|nested] 27+ messages in thread* [PATCH] describe: drop early return for max_candidates == 0
2024-12-04 23:27 ` Jeff King
2024-12-04 23:54 ` Jeff King
@ 2024-12-05 20:14 ` Jeff King
2024-12-05 22:28 ` Josh Steadmon
2024-12-06 3:01 ` Junio C Hamano
1 sibling, 2 replies; 27+ messages in thread
From: Jeff King @ 2024-12-05 20:14 UTC (permalink / raw)
To: Josh Steadmon
Cc: git, gitster, benno.martin.evers, benno, ravi, jpoimboe,
masahiroy
On Wed, Dec 04, 2024 at 06:27:50PM -0500, Jeff King wrote:
> > Subject: Re: [PATCH] fixup! describe: stop traversing when we run out of names
>
> This commit is already in 'next', so it's too late to squash in a change
> (though I'd have done this separately anyway, as it's already an issue
> for a manual --candidates=0 setting, as unlikely as that is).
>
> Can you re-send with a full commit message?
Actually, after thinking on this a bit more, I think the solution below
is a bit more elegant. This can go on top of jk/describe-perf.
-- >8 --
From: Josh Steadmon <steadmon@google.com>
Subject: [PATCH] describe: drop early return for max_candidates == 0
Before we even start the describe algorithm, we check to see if
max_candidates is 0 and bail immediately if we did not find an exact
match. This comes from 2c33f75754 (Teach git-describe --exact-match to
avoid expensive tag searches, 2008-02-24), since the the --exact-match
option just sets max_candidates to 0.
But this interacts badly with the --always option (ironically added only
a week later in da2478dbb0 (describe --always: fall back to showing an
abbreviated object name, 2008-03-02)). With --always, we'd still want to
show the hash rather than calling die().
So this:
git describe --exact-match --always
and likewise:
git describe --exact-match --candidates=0
has always been broken. But nobody ever noticed, because using those
options together is rather unlikely. However, this bug became a lot
easier to trigger with a30154187a (describe: stop traversing when we run
out of names, 2024-10-31). There we reduce max_candidates automatically
based on the number of tags available. So in a repo with no tags (or one
where --match finds no tags), max_candidates becomes 0, and --always
will never show anything.
So that early check for --exact-match's zero candidates needs to be
adjusted. One way to do so is to have it check the "always" flag and
handle it specially, producing the expected hash. But that would require
duplicating the output code for "always".
Instead, we'd prefer to just fall through to the normal algorithm, which
should notice that we are not allowed to find any more candidates, stop
looking, and then hit the regular "always" output code. Back when
2c33f75754 was first done, this was a bad idea, since the normal
algorithm kept looking for the max+1 candidate. But since 082a4d90af
(describe: stop digging for max_candidates+1, 2024-10-31), we don't do
that anymore, and the algorithm is essentially a noop.
So we can drop the early return entirely, and the fact that
max_candidates is 0 will let us quit early without any special casing.
Reported-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Jeff King <peff@peff.net>
---
There is some small bit of setup work in the algorithm, like creating
the reverse index of commits->names in a slab. I don't think that's
worth worrying about. But if we did care, we could lazily initialize
that index, which would also benefit any other cases that bail before
needing it.
builtin/describe.c | 2 --
t/t6120-describe.sh | 6 ++++++
2 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/builtin/describe.c b/builtin/describe.c
index 8ec3be87df..21e1c87c65 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -336,8 +336,6 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst)
return;
}
- if (!max_candidates)
- die(_("no tag exactly matches '%s'"), oid_to_hex(&cmit->object.oid));
if (debug)
fprintf(stderr, _("No exact match on refs or tags, searching to describe\n"));
diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 5633b11d01..009d84ff17 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -715,4 +715,10 @@ test_expect_success 'describe --broken --dirty with a file with changed stat' '
)
'
+test_expect_success '--always with no refs falls back to commit hash' '
+ git rev-parse HEAD >expect &&
+ git describe --no-abbrev --always --match=no-such-tag >actual &&
+ test_cmp expect actual
+'
+
test_done
--
2.47.1.734.g721956425b
^ permalink raw reply related [flat|nested] 27+ messages in thread* Re: [PATCH] describe: drop early return for max_candidates == 0
2024-12-05 20:14 ` [PATCH] describe: drop early return for max_candidates == 0 Jeff King
@ 2024-12-05 22:28 ` Josh Steadmon
2024-12-05 23:21 ` Jeff King
2024-12-06 3:01 ` Junio C Hamano
1 sibling, 1 reply; 27+ messages in thread
From: Josh Steadmon @ 2024-12-05 22:28 UTC (permalink / raw)
To: Jeff King
Cc: git, gitster, benno.martin.evers, benno, ravi, jpoimboe,
masahiroy
On 2024.12.05 15:14, Jeff King wrote:
> On Wed, Dec 04, 2024 at 06:27:50PM -0500, Jeff King wrote:
>
> > > Subject: Re: [PATCH] fixup! describe: stop traversing when we run out of names
> >
> > This commit is already in 'next', so it's too late to squash in a change
> > (though I'd have done this separately anyway, as it's already an issue
> > for a manual --candidates=0 setting, as unlikely as that is).
> >
> > Can you re-send with a full commit message?
>
> Actually, after thinking on this a bit more, I think the solution below
> is a bit more elegant. This can go on top of jk/describe-perf.
>
Thanks, and sorry for not replying earlier, I got distracted by a
different $DAYJOB breakage:
https://lore.kernel.org/git/b41ae080654a3603af09801018df539f656cf9d8.1733430345.git.steadmon@google.com/
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] describe: drop early return for max_candidates == 0
2024-12-05 22:28 ` Josh Steadmon
@ 2024-12-05 23:21 ` Jeff King
0 siblings, 0 replies; 27+ messages in thread
From: Jeff King @ 2024-12-05 23:21 UTC (permalink / raw)
To: Josh Steadmon
Cc: git, gitster, benno.martin.evers, benno, ravi, jpoimboe,
masahiroy
On Thu, Dec 05, 2024 at 02:28:45PM -0800, Josh Steadmon wrote:
> > Actually, after thinking on this a bit more, I think the solution below
> > is a bit more elegant. This can go on top of jk/describe-perf.
> >
>
> Thanks, and sorry for not replying earlier, I got distracted by a
> different $DAYJOB breakage:
No problem. Thanks for finding it!
-Peff
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] describe: drop early return for max_candidates == 0
2024-12-05 20:14 ` [PATCH] describe: drop early return for max_candidates == 0 Jeff King
2024-12-05 22:28 ` Josh Steadmon
@ 2024-12-06 3:01 ` Junio C Hamano
2024-12-06 3:28 ` Jeff King
1 sibling, 1 reply; 27+ messages in thread
From: Junio C Hamano @ 2024-12-06 3:01 UTC (permalink / raw)
To: Jeff King
Cc: Josh Steadmon, git, benno.martin.evers, benno, ravi, jpoimboe,
masahiroy
Jeff King <peff@peff.net> writes:
> Actually, after thinking on this a bit more, I think the solution below
> is a bit more elegant. This can go on top of jk/describe-perf.
>
> -- >8 --
> From: Josh Steadmon <steadmon@google.com>
> Subject: [PATCH] describe: drop early return for max_candidates == 0
OK, so the patch authorship still blames Josh. But there is no
sign-off because ... the approach to the fix is so different that
blaming Josh for this implementation is no longer appropriate?
> Reported-by: Josh Steadmon <steadmon@google.com>
> Signed-off-by: Jeff King <peff@peff.net>
If so, please take the authorship yourself.
> Before we even start the describe algorithm, we check to see if
> max_candidates is 0 and bail immediately if we did not find an exact
> match. This comes from 2c33f75754 (Teach git-describe --exact-match to
> avoid expensive tag searches, 2008-02-24), since the the --exact-match
> option just sets max_candidates to 0.
> ...
> So this:
>
> git describe --exact-match --always
>
> and likewise:
>
> git describe --exact-match --candidates=0
Did the latter mean to say "git decribe --candidates=0 --always", as
the earlier paragraph explains that "--exact" affects the number of
candidates?
Without this patch, all three give the same result:
$ git describe --exact-match --always HEAD
fatal: no tag exactly matches '59d18088fe8ace4bf18ade27eeef3664fb6b0878'
$ git describe --exact-match --candidates=0 HEAD
fatal: no tag exactly matches '59d18088fe8ace4bf18ade27eeef3664fb6b0878'
$ git describe --candidates=0 --always HEAD
fatal: no tag exactly matches '59d18088fe8ace4bf18ade27eeef3664fb6b0878'
With this patch, we instead get this:
$ ./git describe --exact-match --always HEAD
59d18088fe
$ ./git describe --exact-match --candidates=0 HEAD
fatal: No tags can describe '59d18088fe8ace4bf18ade27eeef3664fb6b0878'.
Try --always, or create some tags.
$ ./git describe --candidates=0 --always HEAD
59d18088fe
> But this interacts badly with the --always option (ironically added only
> a week later in da2478dbb0 (describe --always: fall back to showing an
> abbreviated object name, 2008-03-02)). With --always, we'd still want to
> show the hash rather than calling die().
> ...
> has always been broken.
Hmph, I am not sure if the behaviour is _broken_ in the first place.
The user asks with "--exact-match" that a result based on some ref
that does not directly point at the object being described is *not*
acceptable, so with or without "--always", it looks to me that it is
doing the right thing, if there is no exact match (or there is no
tag and the user only allowed tag to describe the objects) and the
result is "no tag exactly matches object X" failure.
Or is our position that these mutually incompatible options, namely
"--exact-match" and "--always", follow the "last one wins" rule?
The implementation does not seem to say so.
If the earlier request is to describe only as exact tag (and fail if
there is no appropriate tag), but then we changed our mind and ask
to fall back to an abbreviation, this one is understandable:
$ ./git describe --exact-match --always HEAD
59d18088fe
But then this is not. The last thing we explicitly told the command
is that we accept only the exact match, but this one does not fail,
which seems like a bug:
$ ./git describe --always --exact-match HEAD
59d18088fe
So I am not sure if the "buggy" behaviour is buggy to begin with.
The way these two are documented can be read both ways,
--exact-match::
Only output exact matches (a tag directly references the
supplied commit). This is a synonym for --candidates=0.
--always::
Show uniquely abbreviated commit object as fallback.
but my reading is when you give both and when the object given is
not directly pointed at by any existing tag, "ONLY output exact
matches" cannot be satisified. And "show as fallback" cannot be
satisfied within the constraint that the command is allowed "only
output exact matches".
I think the complexity from the point of view of a calling script to
deal with either behaviour is probably similar. If you ask for
"--exact-match" and there is no exact match, you can ask rev-parse
to give a shortened one, and you know which one you are giving the
user. We can change what "--exact-match + --candidate=0" combination
means to let it fallback, but then you'd need to check the output to
see if you got an exact tag or a fallback, and for that you'd
probably need to ask "show-ref refs/tags/$output" or something.
So I am not sure if it is worth changing the behaviour this late in
the game?
^ permalink raw reply [flat|nested] 27+ messages in thread* Re: [PATCH] describe: drop early return for max_candidates == 0
2024-12-06 3:01 ` Junio C Hamano
@ 2024-12-06 3:28 ` Jeff King
2024-12-06 4:19 ` Junio C Hamano
0 siblings, 1 reply; 27+ messages in thread
From: Jeff King @ 2024-12-06 3:28 UTC (permalink / raw)
To: Junio C Hamano
Cc: Josh Steadmon, git, benno.martin.evers, benno, ravi, jpoimboe,
masahiroy
On Fri, Dec 06, 2024 at 12:01:41PM +0900, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > Actually, after thinking on this a bit more, I think the solution below
> > is a bit more elegant. This can go on top of jk/describe-perf.
> >
> > -- >8 --
> > From: Josh Steadmon <steadmon@google.com>
> > Subject: [PATCH] describe: drop early return for max_candidates == 0
>
> OK, so the patch authorship still blames Josh. But there is no
> sign-off because ... the approach to the fix is so different that
> blaming Josh for this implementation is no longer appropriate?
Oh, whoops. I had originally intended to just write the commit message
and leave him with credit, but then I ended up changing approach.
Leaving him as the author was an oversight.
> > Before we even start the describe algorithm, we check to see if
> > max_candidates is 0 and bail immediately if we did not find an exact
> > match. This comes from 2c33f75754 (Teach git-describe --exact-match to
> > avoid expensive tag searches, 2008-02-24), since the the --exact-match
> > option just sets max_candidates to 0.
> > ...
> > So this:
> >
> > git describe --exact-match --always
> >
> > and likewise:
> >
> > git describe --exact-match --candidates=0
>
> Did the latter mean to say "git decribe --candidates=0 --always", as
> the earlier paragraph explains that "--exact" affects the number of
> candidates?
Urgh, yes, you are correct. I can resend, but I think we should resolve
the questions below.
> Without this patch, all three give the same result:
>
> $ git describe --exact-match --always HEAD
> fatal: no tag exactly matches '59d18088fe8ace4bf18ade27eeef3664fb6b0878'
> $ git describe --exact-match --candidates=0 HEAD
> fatal: no tag exactly matches '59d18088fe8ace4bf18ade27eeef3664fb6b0878'
> $ git describe --candidates=0 --always HEAD
> fatal: no tag exactly matches '59d18088fe8ace4bf18ade27eeef3664fb6b0878'
>
> With this patch, we instead get this:
>
> $ ./git describe --exact-match --always HEAD
> 59d18088fe
> $ ./git describe --exact-match --candidates=0 HEAD
> fatal: No tags can describe '59d18088fe8ace4bf18ade27eeef3664fb6b0878'.
> Try --always, or create some tags.
> $ ./git describe --candidates=0 --always HEAD
> 59d18088fe
Right, exactly.
> > But this interacts badly with the --always option (ironically added only
> > a week later in da2478dbb0 (describe --always: fall back to showing an
> > abbreviated object name, 2008-03-02)). With --always, we'd still want to
> > show the hash rather than calling die().
> > ...
> > has always been broken.
>
> Hmph, I am not sure if the behaviour is _broken_ in the first place.
>
> The user asks with "--exact-match" that a result based on some ref
> that does not directly point at the object being described is *not*
> acceptable, so with or without "--always", it looks to me that it is
> doing the right thing, if there is no exact match (or there is no
> tag and the user only allowed tag to describe the objects) and the
> result is "no tag exactly matches object X" failure.
>
> Or is our position that these mutually incompatible options, namely
> "--exact-match" and "--always", follow the "last one wins" rule?
> The implementation does not seem to say so.
I think you could argue that they are mutually incompatible. But we have
never marked them as such, nor do we do any sort of last-one-wins.
They are two distinct options, but in --exact-match mode, --always is
simply ignored. Which I think is a bug.
> So I am not sure if the "buggy" behaviour is buggy to begin with.
> The way these two are documented can be read both ways,
>
> --exact-match::
> Only output exact matches (a tag directly references the
> supplied commit). This is a synonym for --candidates=0.
>
> --always::
> Show uniquely abbreviated commit object as fallback.
>
> but my reading is when you give both and when the object given is
> not directly pointed at by any existing tag, "ONLY output exact
> matches" cannot be satisified. And "show as fallback" cannot be
> satisfied within the constraint that the command is allowed "only
> output exact matches".
I think there can be a more expansive reading of --exact-match (or of
--candidates=0), which is: only output a tag that matches exactly. And
then --always is orthogonal to that. There is no other output to
produce, so we show the commit object itself.
Now that more expansive reading is not what --exact-match says above.
But it is the only thing that makes sense to me for --candidates=0, and
the two are synonyms.
> I think the complexity from the point of view of a calling script to
> deal with either behaviour is probably similar. If you ask for
> "--exact-match" and there is no exact match, you can ask rev-parse
> to give a shortened one, and you know which one you are giving the
> user. We can change what "--exact-match + --candidate=0" combination
> means to let it fallback, but then you'd need to check the output to
> see if you got an exact tag or a fallback, and for that you'd
> probably need to ask "show-ref refs/tags/$output" or something.
>
> So I am not sure if it is worth changing the behaviour this late in
> the game?
I think there are really two questions here:
1. Is the current behavior of "describe --exact-match --always" a bug?
I'll grant that probably nobody cares deeply, which is why the
interaction has not been noticed for all of these years. I think
the semantics this patch gives are the only ones that make sense,
but I also don't care that deeply. But...
2. What should we do about the new regression caused by limiting the
candidate list? I.e., my earlier patches in this topic make us
behave as if --candidates=<n> was given when there are fewer tags
in the repo. That runs afoul of the special-casing of
--candidates=0 when there are no tags in the repo (or you limit the
candidates to zero via --match).
If we are not going to address (1) as this patch does, then we need
another solution. We can internally hold an extra variable to
distinguish the number of user-requested candidates from the number
of actual candidates available. But I think my solution to (1) here
harmonizes the --candidates=0 case with --always, and then the
auto-adjusted max-candidates case just falls out naturally.
-Peff
^ permalink raw reply [flat|nested] 27+ messages in thread* Re: [PATCH] describe: drop early return for max_candidates == 0
2024-12-06 3:28 ` Jeff King
@ 2024-12-06 4:19 ` Junio C Hamano
2024-12-06 5:42 ` [PATCH] describe: split "found all tags" and max_candidates logic Jeff King
0 siblings, 1 reply; 27+ messages in thread
From: Junio C Hamano @ 2024-12-06 4:19 UTC (permalink / raw)
To: Jeff King
Cc: Josh Steadmon, git, benno.martin.evers, benno, ravi, jpoimboe,
masahiroy
Jeff King <peff@peff.net> writes:
>> Without this patch, all three give the same result:
>>
>> $ git describe --exact-match --always HEAD
>> fatal: no tag exactly matches '59d18088fe8ace4bf18ade27eeef3664fb6b0878'
>> $ git describe --exact-match --candidates=0 HEAD
>> fatal: no tag exactly matches '59d18088fe8ace4bf18ade27eeef3664fb6b0878'
>> $ git describe --candidates=0 --always HEAD
>> fatal: no tag exactly matches '59d18088fe8ace4bf18ade27eeef3664fb6b0878'
>>
>> With this patch, we instead get this:
>>
>> $ ./git describe --exact-match --always HEAD
>> 59d18088fe
>> $ ./git describe --exact-match --candidates=0 HEAD
>> fatal: No tags can describe '59d18088fe8ace4bf18ade27eeef3664fb6b0878'.
>> Try --always, or create some tags.
>> $ ./git describe --candidates=0 --always HEAD
>> 59d18088fe
> ...
> I think there are really two questions here:
>
> 1. Is the current behavior of "describe --exact-match --always" a bug?
> I'll grant that probably nobody cares deeply, which is why the
> interaction has not been noticed for all of these years. I think
> the semantics this patch gives are the only ones that make sense,
> but I also don't care that deeply. But...
>
> 2. What should we do about the new regression caused by limiting the
> candidate list?
Ahh, OK, these --candidate=0 / --exact-match were for illustration
purposes only. The real issue is that the user does not, with
$ git describe --always HEAD
ask for exact matches only at all, but we internally pretend as if
they did, which is not nice.
My gut reaction is that it is wrong not to give the abbreviated
object name in this case, but the price to do so shouldn't be to
change the behaviour when --exact-match was requested the the user.
Loosening the interaction between the two options, when both were
given explicitly, may be an improvement, but I think that should be
treated as a separate topic, with its merit justified independently,
since the command has been behaving this way from fairly early
version, possibly the one that had both of the options for the first
time.
$ rungit v2.20.0 describe --exact-match HEAD
fatal: No names found, cannot describe anything.
$ rungit v2.20.0 describe --exact-match --always HEAD
fatal: no tag exactly matches '13a3dd7fe014658da465e9621ec3651f5473041e'
$ rungit v2.20.0 describe --exact-match --candidate=0 HEAD
fatal: No names found, cannot describe anything.
Thanks.
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH] describe: split "found all tags" and max_candidates logic
2024-12-06 4:19 ` Junio C Hamano
@ 2024-12-06 5:42 ` Jeff King
0 siblings, 0 replies; 27+ messages in thread
From: Jeff King @ 2024-12-06 5:42 UTC (permalink / raw)
To: Junio C Hamano
Cc: Josh Steadmon, git, benno.martin.evers, benno, ravi, jpoimboe,
masahiroy
On Fri, Dec 06, 2024 at 01:19:36PM +0900, Junio C Hamano wrote:
> My gut reaction is that it is wrong not to give the abbreviated
> object name in this case, but the price to do so shouldn't be to
> change the behaviour when --exact-match was requested the the user.
I think this is where we differ. You call it a price, but to me it is a
bonus. ;)
> Loosening the interaction between the two options, when both were
> given explicitly, may be an improvement, but I think that should be
> treated as a separate topic, with its merit justified independently,
> since the command has been behaving this way from fairly early
> version, possibly the one that had both of the options for the first
> time.
>
> $ rungit v2.20.0 describe --exact-match HEAD
> fatal: No names found, cannot describe anything.
> $ rungit v2.20.0 describe --exact-match --always HEAD
> fatal: no tag exactly matches '13a3dd7fe014658da465e9621ec3651f5473041e'
> $ rungit v2.20.0 describe --exact-match --candidate=0 HEAD
> fatal: No names found, cannot describe anything.
OK. That's certainly the more conservative approach, since it reduces
the change of behavior from previous versions.
Here's a replacement patch (again, on top of jk/describe-perf).
-- >8 --
Subject: [PATCH] describe: split "found all tags" and max_candidates logic
Commit a30154187a (describe: stop traversing when we run out of names,
2024-10-31) taught git-describe to automatically reduce the
max_candidates setting to match the total number of possible names. This
lets us break out of the traversal rather than fruitlessly searching for
more candidates when there are no more to be found.
However, setting max_candidates to 0 (e.g., if the repo has no tags)
overlaps with the --exact-match option, which explicitly uses the same
value. And this causes a regression with --always, which is ignored in
exact-match mode. We used to get this in a repo with no tags:
$ git describe --always HEAD
b2f0a7f
and now we get:
$ git describe --always HEAD
fatal: no tag exactly matches 'b2f0a7f47f5f2aebe1e7fceff19a57de20a78c06'
The reason is that we bail early in describe_commit() when
max_candidates is set to 0. This logic goes all the way back to
2c33f75754 (Teach git-describe --exact-match to avoid expensive tag
searches, 2008-02-24).
We should obviously fix this regression, but there are two paths,
depending on what you think:
$ git describe --always --exact-match
and
$ git describe --always --candidates=0
should do. Since the "--always" option was added, it has always been
ignored in --exact-match (or --candidates=0) mode. I.e., we treat
--exact-match as a true exact match of a tag, and never fall back to
using --always, even if it was requested.
If we think that's a bug (or at least a misfeature), then the right
solution is to fix it by removing the early bail-out from 2c33f75754,
letting the noop algorithm run and then hitting the --always fallback
output. And then our regression naturally goes away, because it follows
the same path.
If we think that the current "--exact-match --always" behavior is the
right thing, then we have to differentiate the case where we
automatically reduced max_candidates to 0 from the case where the user
asked for it specifically. That's possible to do with a flag, but we can
also just reimplement the logic from a30154187a to explicitly break out
of the traversal when we run out of candidates (rather than relying on
the existing max_candidates check).
My gut feeling is along the lines of option 1 (it's a bug, and people
would be happy for "--exact-match --always" to give the fallback rather
than ignoring "--always"). But the documentation can be interpreted in
the other direction, and we've certainly lived with the existing
behavior for many years. So it's possible that changing it now is the
wrong thing.
So this patch fixes the regression by taking the second option,
retaining the "--exact-match" behavior as-is. There are two new tests.
The first shows that the regression is fixed (we don't even need a new
repo without tags; a restrictive --match is enough to create the
situation that there are no candidate names).
The second test confirms that the "--exact-match --always" behavior
remains unchanged and continues to die when there is no tag pointing at
the specified commit. It's possible we may reconsider this in the
future, but this shows that the approach described above is implemented
faithfully.
We can also run the perf tests in p6100 to see that we've retained the
speedup that a30154187a was going for:
Test HEAD^ HEAD
--------------------------------------------------------------------------------------
6100.2: describe HEAD 0.72(0.64+0.07) 0.72(0.66+0.06) +0.0%
6100.3: describe HEAD with one max candidate 0.01(0.00+0.00) 0.01(0.00+0.00) +0.0%
6100.4: describe HEAD with one tag 0.01(0.01+0.00) 0.01(0.01+0.00) +0.0%
Reported-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Jeff King <peff@peff.net>
---
builtin/describe.c | 5 ++---
t/t6120-describe.sh | 10 ++++++++++
2 files changed, 12 insertions(+), 3 deletions(-)
diff --git a/builtin/describe.c b/builtin/describe.c
index 8ec3be87df..a6ef8af32a 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -367,7 +367,8 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst)
seen_commits++;
- if (match_cnt == max_candidates) {
+ if (match_cnt == max_candidates ||
+ match_cnt == hashmap_get_size(&names)) {
gave_up_on = c;
break;
}
@@ -667,8 +668,6 @@ int cmd_describe(int argc,
NULL);
if (!hashmap_get_size(&names) && !always)
die(_("No names found, cannot describe anything."));
- if (hashmap_get_size(&names) < max_candidates)
- max_candidates = hashmap_get_size(&names);
if (argc == 0) {
if (broken) {
diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 5633b11d01..3f6160d702 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -715,4 +715,14 @@ test_expect_success 'describe --broken --dirty with a file with changed stat' '
)
'
+test_expect_success '--always with no refs falls back to commit hash' '
+ git rev-parse HEAD >expect &&
+ git describe --no-abbrev --always --match=no-such-tag >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success '--exact-match does not show --always fallback' '
+ test_must_fail git describe --exact-match --always
+'
+
test_done
--
2.47.1.734.g721956425b
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH 0/4] perf improvements for git-describe with few tags
2024-11-06 19:22 ` [PATCH 0/4] perf improvements for git-describe with few tags Jeff King
` (4 preceding siblings ...)
2024-11-06 21:17 ` [PATCH 4/4] describe: stop traversing when we run out of names Jeff King
@ 2024-11-26 5:05 ` Junio C Hamano
2024-12-04 23:04 ` Josh Steadmon
5 siblings, 1 reply; 27+ messages in thread
From: Junio C Hamano @ 2024-11-26 5:05 UTC (permalink / raw)
To: Jeff King
Cc: git, Benno Evers, Rasmus Villemoes, Benno Evers, Josh Poimboeuf,
Masahiro Yamada
Jeff King <peff@peff.net> writes:
> So here's the series I came up with, which starts by adjusting the tests
> to be resilient to the later changes, but also to show the existing
> failure mode.
>
> And then the rest of the patches add the performance improvements we've
> been discussing in the thread.
So, this did not get any comments, but I had a time to read it over,
and did not find anything suspicious in there.
Let me mark it for 'next', this time for real, in the What's cooking
draft I have.
Thanks.
^ permalink raw reply [flat|nested] 27+ messages in thread* Re: [PATCH 0/4] perf improvements for git-describe with few tags
2024-11-26 5:05 ` [PATCH 0/4] perf improvements for git-describe with few tags Junio C Hamano
@ 2024-12-04 23:04 ` Josh Steadmon
2024-12-04 23:20 ` Jeff King
0 siblings, 1 reply; 27+ messages in thread
From: Josh Steadmon @ 2024-12-04 23:04 UTC (permalink / raw)
To: Junio C Hamano
Cc: Jeff King, git, Benno Evers, Rasmus Villemoes, Benno Evers,
Josh Poimboeuf, Masahiro Yamada
On 2024.11.26 14:05, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > So here's the series I came up with, which starts by adjusting the tests
> > to be resilient to the later changes, but also to show the existing
> > failure mode.
> >
> > And then the rest of the patches add the performance improvements we've
> > been discussing in the thread.
>
> So, this did not get any comments, but I had a time to read it over,
> and did not find anything suspicious in there.
>
> Let me mark it for 'next', this time for real, in the What's cooking
> draft I have.
>
> Thanks.
>
This breaks the case of `git describe --always $SOME_HASH` (we hit the
die at builtin/describe.c:340) when there are no tags in the repo. I can
send a test case and a small fix shortly.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 0/4] perf improvements for git-describe with few tags
2024-12-04 23:04 ` Josh Steadmon
@ 2024-12-04 23:20 ` Jeff King
0 siblings, 0 replies; 27+ messages in thread
From: Jeff King @ 2024-12-04 23:20 UTC (permalink / raw)
To: Josh Steadmon
Cc: Junio C Hamano, git, Benno Evers, Rasmus Villemoes, Benno Evers,
Josh Poimboeuf, Masahiro Yamada
On Wed, Dec 04, 2024 at 03:04:59PM -0800, Josh Steadmon wrote:
> This breaks the case of `git describe --always $SOME_HASH` (we hit the
> die at builtin/describe.c:340) when there are no tags in the repo. I can
> send a test case and a small fix shortly.
Yeah, this is easy to reproduce. I think it was always broken with:
git describe --candidates=0 --always ...
since there is a line that skips the whole algorithm and bails early if
there are no candidates allowed. But that should surely not kick in if
"always" is set. I.e., I'd expect the fix to be something like:
diff --git a/builtin/describe.c b/builtin/describe.c
index d6c77a714f..d4c869e3d5 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -332,7 +332,7 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst)
return;
}
- if (!max_candidates)
+ if (!max_candidates && !always)
die(_("no tag exactly matches '%s'"), oid_to_hex(&cmit->object.oid));
if (debug)
fprintf(stderr, _("No exact match on refs or tags, searching to describe\n"));
But I'll wait and see what your proposed fix looks like.
-Peff
^ permalink raw reply related [flat|nested] 27+ messages in thread