git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Masahiro Yamada <masahiroy@kernel.org>
To: Rasmus Villemoes <ravi@prevas.dk>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>,
	linux-kernel@vger.kernel.org,  linux-kbuild@vger.kernel.org,
	git@vger.kernel.org
Subject: Re: [PATCH] setlocalversion: Add workaround for "git describe" performance issue
Date: Thu, 31 Oct 2024 12:43:56 +0100	[thread overview]
Message-ID: <CAK7LNAT6VbaT=_v5_qMp3-jw5CWui=ZXyiAb2dP20p1tU0JpyA@mail.gmail.com> (raw)
In-Reply-To: <87bjz0k17c.fsf@prevas.dk>

On Thu, Oct 31, 2024 at 11:37 AM Rasmus Villemoes <ravi@prevas.dk> wrote:
>
> On Wed, Oct 30 2024, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
>
> > If HEAD isn't associated with an annotated tag, a bug (or feature?) in
> > "git describe --match" causes it to search every commit in the entire
> > repository looking for additional match candidates.  Instead of it
> > taking a fraction of a second, it adds 10-15 seconds to the beginning of
> > every kernel build.
> >
> > Fix it by adding an additional dummy match which is slightly further
> > away from the most recent one, along with setting the max candidate
> > count to 1 (not 2, apparently another bug).
> >
>
> cc += git list
>
> Hm, I tried looking at the git describe source code, and while I can't
> claim I understand it very well, I think the main problem is around
> this part:
>
>                         if (!tags && !all && n->prio < 2) {
>                                 unannotated_cnt++;
>                         } else if (match_cnt < max_candidates) {
>                                 struct possible_tag *t = &all_matches[match_cnt++];
>                                 t->name = n;
>                                 t->depth = seen_commits - 1;
>                                 t->flag_within = 1u << match_cnt;
>                                 t->found_order = match_cnt;
>                                 c->object.flags |= t->flag_within;
>                                 if (n->prio == 2)
>                                         annotated_cnt++;
>                         }
>                         else {
>                                 gave_up_on = c;
>                                 break;
>                         }
>
> So in the case where one doesn't pass any --match, we get something like
>
> git describe --debug 5f78aec0d7e9
> describe 5f78aec0d7e9
> No exact match on refs or tags, searching to describe
>  annotated        243 v4.19-rc5
>  annotated        485 v4.19-rc4
>  annotated        814 v4.19-rc3
>  annotated       1124 v4.19-rc2
>  annotated       1391 v4.19-rc1
>  annotated      10546 v4.18
>  annotated      10611 v4.18-rc8
>  annotated      10819 v4.18-rc7
>  annotated      11029 v4.18-rc6
>  annotated      11299 v4.18-rc5
> traversed 11400 commits
> more than 10 tags found; listed 10 most recent
> gave up search at 1e4b044d22517cae7047c99038abb444423243ca
> v4.19-rc5-243-g5f78aec0d7e9
>
> and that "gave up" commit is v4.18-rc4, the eleventh commit
> encountered. That also explains why you have to add a "dummy" second
> --match to make --candidates=1 have the expected behaviour.
>
> Perhaps the logic should instead be that as soon as match_cnt hits
> max_candidates (i.e. all the tags we're going to consider have actually
> been visited), we break out. That is, the last "else" above should
> instead be replaced by
>
>   if (match_cnt == max_candidates) {
>     ... /* ? , gave_up_on is now a misnomer */
>     break;
>   }
>
> Then as a further DWIM aid, wherever the initialization logic is could
> be updated so that, after expanding all the --match= wildcards, if the
> number of tags is less than max_candidates, automatically lower
> max_candidates to that number (which in the setlocalversion case will
> always be 1 because we're not actually passing a wildcard).
>
> Or, we could explicitly on the kernel side pass that --candidates=1, but
> yes, I agree it looks like a bug that the loop requires encountering +1
> tag to hit that break;.
>
> Rasmus
>

I still do not understand the logic either.

git traverses all the way back to
d8470b7c13e11c18cf14a7e3180f0b00e715e4f0.



$ git describe --match=v6.12-rc5 --debug c1e939a21eb1
describe c1e939a21eb1
No exact match on refs or tags, searching to describe
finished search at d8470b7c13e11c18cf14a7e3180f0b00e715e4f0
 annotated         44 v6.12-rc5
traversed 1310005 commits
v6.12-rc5-44-gc1e939a21eb1



Or, more simply,


$ git describe --match=v6.12-rc5 --debug e42b1a9a2557
describe e42b1a9a2557
No exact match on refs or tags, searching to describe
finished search at d8470b7c13e11c18cf14a7e3180f0b00e715e4f0
 annotated          5 v6.12-rc5
traversed 1309966 commits
v6.12-rc5-5-ge42b1a9a2557




e42b1a9a2557 merges v6.12-rc5 and 25f00a13dccf.

The latter is obviously, v6.12-rc2 + 4 commits.
v6.12-rc2 is an ancestor of v6.12-rc5.

I do not understand why git traverses 1.3 million commits.



--
Best Regards

Masahiro Yamada

      parent reply	other threads:[~2024-10-31 11:44 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <309549cafdcfe50c4fceac3263220cc3d8b109b2.1730337435.git.jpoimboe@kernel.org>
2024-10-31 10:37 ` [PATCH] setlocalversion: Add workaround for "git describe" performance issue Rasmus Villemoes
2024-10-31 11:42   ` Jeff King
2024-10-31 12:24     ` Jeff King
2024-10-31 14:43       ` Jeff King
2024-11-04 12:37         ` Benno Evers
2024-11-06 19:22           ` [PATCH 0/4] perf improvements for git-describe with few tags Jeff King
2024-11-06 19:26             ` Jeff King
2024-11-06 21:16             ` [PATCH 1/4] t6120: demonstrate weakness in disjoint-root handling Jeff King
2024-11-06 21:17             ` [PATCH 2/4] t/perf: add tests for git-describe Jeff King
2024-11-06 21:17             ` [PATCH 3/4] describe: stop digging for max_candidates+1 Jeff King
2024-11-06 21:17             ` [PATCH 4/4] describe: stop traversing when we run out of names Jeff King
2024-12-04 23:15               ` [PATCH] fixup! " Josh Steadmon
2024-12-04 23:27                 ` Jeff King
2024-12-04 23:54                   ` Jeff King
2024-12-05 20:14                   ` [PATCH] describe: drop early return for max_candidates == 0 Jeff King
2024-12-05 22:28                     ` Josh Steadmon
2024-12-05 23:21                       ` Jeff King
2024-12-06  3:01                     ` Junio C Hamano
2024-12-06  3:28                       ` Jeff King
2024-12-06  4:19                         ` Junio C Hamano
2024-12-06  5:42                           ` [PATCH] describe: split "found all tags" and max_candidates logic Jeff King
2024-11-26  5:05             ` [PATCH 0/4] perf improvements for git-describe with few tags Junio C Hamano
2024-12-04 23:04               ` Josh Steadmon
2024-12-04 23:20                 ` Jeff King
2024-11-01 10:23       ` [PATCH] setlocalversion: Add workaround for "git describe" performance issue Rasmus Villemoes
2024-11-01 11:39         ` Jeff King
2024-10-31 11:43   ` Masahiro Yamada [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAK7LNAT6VbaT=_v5_qMp3-jw5CWui=ZXyiAb2dP20p1tU0JpyA@mail.gmail.com' \
    --to=masahiroy@kernel.org \
    --cc=git@vger.kernel.org \
    --cc=jpoimboe@kernel.org \
    --cc=linux-kbuild@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ravi@prevas.dk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).