* [PATCH v2] setlocalversion: work around "git describe" performance
@ 2024-11-12 21:05 Rasmus Villemoes
2024-11-13 20:38 ` Josh Poimboeuf
2024-11-15 23:17 ` Masahiro Yamada
0 siblings, 2 replies; 5+ messages in thread
From: Rasmus Villemoes @ 2024-11-12 21:05 UTC
To: Josh Poimboeuf, Masahiro Yamada; +Cc: linux-kernel, Jeff King, Rasmus Villemoes

Contrary to expectations, passing a single candidate tag to "git
describe" is slower than not passing any --match options.

  $ time git describe --debug
  ...
  traversed 10619 commits
  ...
  v6.12-rc5-63-g0fc810ae3ae1

  real 0m0.169s

  $ time git describe --match=v6.12-rc5 --debug
  ...
  traversed 1310024 commits
  v6.12-rc5-63-g0fc810ae3ae1

  real 0m1.281s

In fact, the --debug output shows that git traverses all or most of
history. For some repositories and/or git versions, those 1.3s are
actually 10-15 seconds.

This has been acknowledged as a performance bug in git [1], and a fix
is on its way [2]. However, no solution is yet in git.git, and even
when one lands, it will take quite a while before it finds its way to
a release and for $random_kernel_developer to pick that up.

So rewrite the logic to use plumbing commands. For each of the
candidate values of $tag, we ask: (1) is $tag even an annotated
tag? (2) Is it eligible to describe HEAD, i.e. an ancestor of
HEAD? (3) If so, how many commits are in $tag..HEAD?

I have tested that this produces the same output as the current script
for ~700 random commits between v6.9..v6.10. For those 700 commits,
and in my git repo, the 'make -s kernelrelease' command is on average
~4 times faster with this patch applied (geometric mean of ratios).
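
For readers unfamiliar with the metric, a geometric mean of per-commit speedup ratios could be computed roughly as follows. This is only a sketch: the helper name geomean and the sample ratios are invented for illustration and are not the measurements behind the ~4x figure.

```shell
#!/bin/sh
# Sketch: geometric mean of speedup ratios (old_time / new_time).
# geomean is a hypothetical helper; the inputs below are made up.
geomean() {
	# exp(mean(log(r_i))) is the geometric mean of the ratios r_i.
	printf '%s\n' "$@" |
		awk '{ s += log($1); n++ } END { if (n) printf "%.2f\n", exp(s / n) }'
}

geomean 2 8        # → 4.00
geomean 1 4 16     # → 4.00
```

Unlike an arithmetic mean, this treats a 2x speedup and a 2x slowdown as cancelling out, which is presumably why it was chosen for averaging ratios.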

For the commit mentioned in Josh's original report [3], the
time-consuming part of setlocalversion goes from

  $ time git describe --match=v6.12-rc5 c1e939a21eb1
  v6.12-rc5-44-gc1e939a21eb1

  real 0m1.210s

to

  $ time git rev-list --count --left-right v6.12-rc5..c1e939a21eb1
  0 44

  real 0m0.037s
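
The two numbers printed by rev-list are what try_tag in the patch consumes (there the range is written "$tag"...HEAD). As a rough sketch, assuming the output is handed in as a string, the interpretation looks like this. The helper name parse_counts is hypothetical and does not appear in the patch.

```shell
#!/bin/sh
# Sketch only: interpret "<left> <right>" as printed by
#   git rev-list --count --left-right "$tag"...HEAD
# parse_counts is a hypothetical name, not part of the patch.
parse_counts() {
	# Intentionally unquoted: split the output into $1 (left) and $2 (right).
	set -- $1

	# left = 0 means no commit is reachable from the tag but not from
	# HEAD, i.e. the tag is an ancestor of HEAD. A string comparison
	# also rejects the empty output of a failed command.
	[ "$1" = 0 ] || return 1

	# right is the number of commits in tag..HEAD, possibly 0.
	echo "$2"
}

parse_counts "0 44"                                # → 44
parse_counts "3 44" || echo "not an ancestor"      # → not an ancestor
parse_counts ""     || echo "rev-list failed"      # → rev-list failed
```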

[1] https://lore.kernel.org/git/20241101113910.GA2301440@coredump.intra.peff.net/
[2] https://lore.kernel.org/git/20241106192236.GC880133@coredump.intra.peff.net/
[3] https://lore.kernel.org/lkml/309549cafdcfe50c4fceac3263220cc3d8b109b2.1730337435.git.jpoimboe@kernel.org/

Reported-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---

v2: Drop odd here-doc, use "set -- $()" instead. Update commit log to
mention the git.git patches in flight.

 scripts/setlocalversion | 53 ++++++++++++++++++++++++++++-------------
 1 file changed, 37 insertions(+), 16 deletions(-)
diff --git a/scripts/setlocalversion b/scripts/setlocalversion
index 38b96c6797f4..1e3b01ec096c 100755
--- a/scripts/setlocalversion
+++ b/scripts/setlocalversion
@@ -30,6 +30,26 @@ if test $# -gt 0 -o ! -d "$srctree"; then
usage
fi
+try_tag() {
+ tag="$1"
+
+ # Is $tag an annotated tag?
+ [ "$(git cat-file -t "$tag" 2> /dev/null)" = "tag" ] || return 1
+
+ # Is it an ancestor of HEAD, and if so, how many commits are in $tag..HEAD?
+ set -- $(git rev-list --count --left-right "$tag"...HEAD 2> /dev/null)
+
+ # $1 is 0 if and only if $tag is an ancestor of HEAD. Use
+ # string comparison, because $1 is empty if the 'git rev-list'
+ # command somehow failed.
+ [ "$1" = 0 ] || return 1
+
+ # $2 is the number of commits in the range $tag..HEAD, possibly 0.
+ count="$2"
+
+ return 0
+}
+
scm_version()
{
local short=false
@@ -61,33 +81,33 @@ scm_version()
# stable kernel: 6.1.7 -> v6.1.7
version_tag=v$(echo "${KERNELVERSION}" | sed -E 's/^([0-9]+\.[0-9]+)\.0(.*)$/\1\2/')
+ # try_tag initializes count if the tag is usable.
+ count=
+
# If a localversion* file exists, and the corresponding
# annotated tag exists and is an ancestor of HEAD, use
# it. This is the case in linux-next.
- tag=${file_localversion#-}
- desc=
- if [ -n "${tag}" ]; then
- desc=$(git describe --match=$tag 2>/dev/null)
+ if [ -n "${file_localversion#-}" ] ; then
+ try_tag "${file_localversion#-}"
fi
# Otherwise, if a localversion* file exists, and the tag
# obtained by appending it to the tag derived from
# KERNELVERSION exists and is an ancestor of HEAD, use
# it. This is e.g. the case in linux-rt.
- if [ -z "${desc}" ] && [ -n "${file_localversion}" ]; then
- tag="${version_tag}${file_localversion}"
- desc=$(git describe --match=$tag 2>/dev/null)
+ if [ -z "${count}" ] && [ -n "${file_localversion}" ]; then
+ try_tag "${version_tag}${file_localversion}"
fi
# Otherwise, default to the annotated tag derived from KERNELVERSION.
- if [ -z "${desc}" ]; then
- tag="${version_tag}"
- desc=$(git describe --match=$tag 2>/dev/null)
+ if [ -z "${count}" ]; then
+ try_tag "${version_tag}"
fi
- # If we are at the tagged commit, we ignore it because the version is
- # well-defined.
- if [ "${tag}" != "${desc}" ]; then
+ # If we are at the tagged commit, we ignore it because the
+ # version is well-defined. If none of the attempted tags exist
+ # or were usable, $count is still empty.
+ if [ -z "${count}" ] || [ "${count}" -gt 0 ]; then
# If only the short version is requested, don't bother
# running further git commands
@@ -95,14 +115,15 @@ scm_version()
echo "+"
return
fi
+
# If we are past the tagged commit, we pretty print it.
# (like 6.1.0-14595-g292a089d78d3)
- if [ -n "${desc}" ]; then
- echo "${desc}" | awk -F- '{printf("-%05d", $(NF-1))}'
+ if [ -n "${count}" ]; then
+ printf "%s%05d" "-" "${count}"
fi
# Add -g and exactly 12 hex chars.
- printf '%s%s' -g "$(echo $head | cut -c1-12)"
+ printf '%s%.12s' -g "$head"
fi
if ${no_dirty}; then
--
2.47.0
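
The last hunk replaces an awk and a cut invocation with printf format specifiers. Below is a small sketch of what "%05d" and "%.12s" produce. The function name format_suffix and the 40-character hash are assumptions made up for the example; only its first 12 characters correspond to the commit mentioned above.

```shell
#!/bin/sh
# format_suffix COUNT HEAD: hypothetical helper mirroring the two
# printf calls at the end of the patch.
format_suffix() {
	count="$1"
	head="$2"

	# Zero-pad the commit count to at least five digits, e.g. -00044.
	if [ -n "$count" ]; then
		printf '%s%05d' - "$count"
	fi

	# "%.12s" prints at most the first 12 characters of the string.
	printf '%s%.12s' -g "$head"
}

format_suffix 44 c1e939a21eb1deadbeefcafe0123456789abcdef; echo
# → -00044-gc1e939a21eb1
format_suffix "" c1e939a21eb1deadbeefcafe0123456789abcdef; echo
# → -gc1e939a21eb1
```

Both specifiers are handled by printf itself, so no extra process is spawned for the formatting.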
* Re: [PATCH v2] setlocalversion: work around "git describe" performance
2024-11-12 21:05 [PATCH v2] setlocalversion: work around "git describe" performance Rasmus Villemoes
@ 2024-11-13 20:38 ` Josh Poimboeuf
2024-11-15 23:17 ` Masahiro Yamada
1 sibling, 0 replies; 5+ messages in thread
From: Josh Poimboeuf @ 2024-11-13 20:38 UTC
To: Rasmus Villemoes; +Cc: Masahiro Yamada, linux-kernel, Jeff King
On Tue, Nov 12, 2024 at 10:05:00PM +0100, Rasmus Villemoes wrote:
> Contrary to expectations, passing a single candidate tag to "git
> describe" is slower than not passing any --match options.
>
> $ time git describe --debug
> ...
> traversed 10619 commits
> ...
> v6.12-rc5-63-g0fc810ae3ae1
>
> real 0m0.169s
>
> $ time git describe --match=v6.12-rc5 --debug
> ...
> traversed 1310024 commits
> v6.12-rc5-63-g0fc810ae3ae1
>
> real 0m1.281s
Works for me, thanks!
Tested-by: Josh Poimboeuf <jpoimboe@kernel.org>
--
Josh
* Re: [PATCH v2] setlocalversion: work around "git describe" performance
2024-11-12 21:05 [PATCH v2] setlocalversion: work around "git describe" performance Rasmus Villemoes
2024-11-13 20:38 ` Josh Poimboeuf
@ 2024-11-15 23:17 ` Masahiro Yamada
2024-11-17 12:20 ` Rasmus Villemoes
1 sibling, 1 reply; 5+ messages in thread
From: Masahiro Yamada @ 2024-11-15 23:17 UTC
To: Rasmus Villemoes; +Cc: Josh Poimboeuf, linux-kernel, Jeff King
This patch was not sent to linux-kbuild ML
(and it can be one reason when a patch falls into a crack),
but I guess I am expected to review and pick it.
On Wed, Nov 13, 2024 at 6:04 AM Rasmus Villemoes
<linux@rasmusvillemoes.dk> wrote:
>
> Contrary to expectations, passing a single candidate tag to "git
> describe" is slower than not passing any --match options.
>
> $ time git describe --debug
> ...
> traversed 10619 commits
> ...
> v6.12-rc5-63-g0fc810ae3ae1
>
> real 0m0.169s
>
> $ time git describe --match=v6.12-rc5 --debug
> ...
> traversed 1310024 commits
> v6.12-rc5-63-g0fc810ae3ae1
>
> real 0m1.281s
>
> In fact, the --debug output shows that git traverses all or most of
> history. For some repositories and/or git versions, those 1.3s are
> actually 10-15 seconds.
>
> This has been acknowledged as a performance bug in git [1], and a fix
> is on its way [2]. However, no solution is yet in git.git, and even
> when one lands, it will take quite a while before it finds its way to
> a release and for $random_kernel_developer to pick that up.
>
> So rewrite the logic to use plumbing commands. For each of the
> candidate values of $tag, we ask: (1) is $tag even an annotated
> tag? (2) Is it eligible to describe HEAD, i.e. an ancestor of
> HEAD? (3) If so, how many commits are in $tag..HEAD?
>
> I have tested that this produces the same output as the current script
> for ~700 random commits between v6.9..v6.10. For those 700 commits,
> and in my git repo, the 'make -s kernelrelease' command is on average
> ~4 times faster with this patch applied (geometric mean of ratios).
>
> For the commit mentioned in Josh's original report [3], the
> time-consuming part of setlocalversion goes from
>
> $ time git describe --match=v6.12-rc5 c1e939a21eb1
> v6.12-rc5-44-gc1e939a21eb1
>
> real 0m1.210s
>
> to
>
> $ time git rev-list --count --left-right v6.12-rc5..c1e939a21eb1
> 0 44
>
> real 0m0.037s
>
> [1] https://lore.kernel.org/git/20241101113910.GA2301440@coredump.intra.peff.net/
> [2] https://lore.kernel.org/git/20241106192236.GC880133@coredump.intra.peff.net/
> [3] https://lore.kernel.org/lkml/309549cafdcfe50c4fceac3263220cc3d8b109b2.1730337435.git.jpoimboe@kernel.org/
>
> Reported-by: Josh Poimboeuf <jpoimboe@kernel.org>
Maybe, the comprehensive tag list looks like this?
Reported-by: Sean Christopherson <seanjc@google.com>
Closes: https://lore.kernel.org/lkml/ZPtlxmdIJXOe0sEy@google.com/
Reported-by: Josh Poimboeuf <jpoimboe@kernel.org>
Closes: https://lore.kernel.org/lkml/309549cafdcfe50c4fceac3263220cc3d8b109b2.1730337435.git.jpoimboe@kernel.org/
> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> ---
>
> v2: Drop odd here-doc, use "set -- $()" instead. Update commit log to
> mention the git.git patches in flight.
>
> scripts/setlocalversion | 53 ++++++++++++++++++++++++++++-------------
> 1 file changed, 37 insertions(+), 16 deletions(-)
>
> diff --git a/scripts/setlocalversion b/scripts/setlocalversion
> index 38b96c6797f4..1e3b01ec096c 100755
> --- a/scripts/setlocalversion
> +++ b/scripts/setlocalversion
> @@ -30,6 +30,26 @@ if test $# -gt 0 -o ! -d "$srctree"; then
> usage
> fi
>
> +try_tag() {
> + tag="$1"
> +
> + # Is $tag an annotated tag?
> + [ "$(git cat-file -t "$tag" 2> /dev/null)" = "tag" ] || return 1
The double-quotes for tag are unneeded.
"tag" --> tag
This function returns either 1 or 0, but how is it used?
> + [ "$1" = 0 ] || return 1
> +
> + # $2 is the number of commits in the range $tag..HEAD, possibly 0.
> + count="$2"
> +
> + return 0
Same here. The return value seems unnecessary.
--
Best Regards
Masahiro Yamada
* Re: [PATCH v2] setlocalversion: work around "git describe" performance
2024-11-15 23:17 ` Masahiro Yamada
@ 2024-11-17 12:20 ` Rasmus Villemoes
2024-11-17 14:29 ` Masahiro Yamada
0 siblings, 1 reply; 5+ messages in thread
From: Rasmus Villemoes @ 2024-11-17 12:20 UTC
To: Masahiro Yamada; +Cc: Josh Poimboeuf, linux-kernel, Jeff King
On Sat, Nov 16 2024, Masahiro Yamada <masahiroy@kernel.org> wrote:
> This patch was not sent to linux-kbuild ML
> (and it can be one reason when a patch falls into a crack),
> but I guess I am expected to review and pick it.
Sorry, but get_maintainer.pl doesn't tell one to cc linux-kbuild.
>>
>> Reported-by: Josh Poimboeuf <jpoimboe@kernel.org>
>
>
> Maybe, the comprehensive tag list looks like this?
>
> Reported-by: Sean Christopherson <seanjc@google.com>
> Closes: https://lore.kernel.org/lkml/ZPtlxmdIJXOe0sEy@google.com/
> Reported-by: Josh Poimboeuf <jpoimboe@kernel.org>
> Closes: https://lore.kernel.org/lkml/309549cafdcfe50c4fceac3263220cc3d8b109b2.1730337435.git.jpoimboe@kernel.org/
Fine by me.
>>
>> +try_tag() {
>> + tag="$1"
>> +
>> + # Is $tag an annotated tag?
>> + [ "$(git cat-file -t "$tag" 2> /dev/null)" = "tag" ] || return 1
>
> The double-quotes for tag are unneeded.
>
> "tag" --> tag
>
OK. The current script isn't consistent here, though (--no-local and +
are quoted where they need not be), and I find having the quotes on both
sides of = more visually appealing. Not a hill I'm gonna die on.
> This function returns either 1 or 0, but how is it used?
>
Well, you're right that it's not used currently, but I might as well let
the return value reflect whether it succeeded or not. I played around
with some variation of
if [ -n "${file_localversion#-}" ] && try_tag "${file_localversion#-}" ; then
:
elif [ -n "${file_localversion}" ] && try_tag "${version_tag}${file_localversion}" ; then
:
elif try_tag "${version_tag}" ; then
:
else
count=""
fi
but in the end decided to keep the current logic of testing some shell
variable (previously $desc, now $count). Still, I see no reason to make
the early returns do "return 0".
Rasmus
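
To make the variant above concrete, here is a self-contained sketch in which try_tag is replaced by a stub that accepts a single hard-coded tag. The stub, the wrapper name pick_count and the sample values are all hypothetical and merely stand in for the real git queries.

```shell
#!/bin/sh
# Stub standing in for the patch's try_tag: only "v6.12-rc5" is
# treated as a usable annotated tag, 44 commits behind HEAD.
try_tag() {
	[ "$1" = "v6.12-rc5" ] || return 1
	count=44
}

# pick_count FILE_LOCALVERSION VERSION_TAG: hypothetical wrapper around
# the if/elif chain, driven by try_tag's exit status.
pick_count() {
	file_localversion="$1"
	version_tag="$2"
	count=

	if [ -n "${file_localversion#-}" ] && try_tag "${file_localversion#-}"; then
		:
	elif [ -n "${file_localversion}" ] && try_tag "${version_tag}${file_localversion}"; then
		:
	elif try_tag "${version_tag}"; then
		:
	else
		count=
	fi
	echo "count=${count}"
}

pick_count "" v6.12-rc5        # → count=44 (default tag matches)
pick_count "-rt1" v6.11        # → count=   (no candidate is usable)
```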
* Re: [PATCH v2] setlocalversion: work around "git describe" performance
2024-11-17 12:20 ` Rasmus Villemoes
@ 2024-11-17 14:29 ` Masahiro Yamada
0 siblings, 0 replies; 5+ messages in thread
From: Masahiro Yamada @ 2024-11-17 14:29 UTC
To: Rasmus Villemoes; +Cc: Josh Poimboeuf, linux-kernel, Jeff King
On Sun, Nov 17, 2024 at 9:20 PM Rasmus Villemoes
<linux@rasmusvillemoes.dk> wrote:
>
> On Sat, Nov 16 2024, Masahiro Yamada <masahiroy@kernel.org> wrote:
>
> > This patch was not sent to linux-kbuild ML
> > (and it can be one reason when a patch falls into a crack),
> > but I guess I am expected to review and pick it.
>
> Sorry, but get_maintainer.pl doesn't tell one to cc linux-kbuild.
Ah, I just realized that the MAINTAINERS file does not cover this file.

  KERNEL BUILD + files below scripts/ (unless maintained elsewhere)

So, I randomly pick up patches for the scripts/ directory.
> >>
> >> Reported-by: Josh Poimboeuf <jpoimboe@kernel.org>
> >
> >
> > Maybe, the comprehensive tag list looks like this?
> >
> > Reported-by: Sean Christopherson <seanjc@google.com>
> > Closes: https://lore.kernel.org/lkml/ZPtlxmdIJXOe0sEy@google.com/
> > Reported-by: Josh Poimboeuf <jpoimboe@kernel.org>
> > Closes: https://lore.kernel.org/lkml/309549cafdcfe50c4fceac3263220cc3d8b109b2.1730337435.git.jpoimboe@kernel.org/
>
> Fine by me.
>
> >>
> >> +try_tag() {
> >> + tag="$1"
> >> +
> >> + # Is $tag an annotated tag?
> >> + [ "$(git cat-file -t "$tag" 2> /dev/null)" = "tag" ] || return 1
> >
> > The double-quotes for tag are unneeded.
> >
> > "tag" --> tag
> >
>
> OK. The current script isn't consistent here, though (--no-local and +
> are quoted where they need not be), and I find having the quotes on both
> sides of = more visually appealing. Not a hill I'm gonna die on.
My personal preference is to not add unnecessary quotes.
In contrast, necessary quotes are missing.
So, shellcheck shows warnings.

In scripts/setlocalversion line 79:
desc=$(git describe --match=$tag 2>/dev/null)
                            ^--^ SC2086 (info): Double quote to prevent globbing and word splitting.

If you would like to contribute a cleanup for consistency, that would be appreciated too.
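
As a small illustration of why shellcheck flags the unquoted expansion: an unquoted variable undergoes word splitting (and pathname expansion) before the command sees it. In this sketch the helper count_args and the contrived tag value are made up for the example.

```shell
#!/bin/sh
# count_args is a hypothetical helper that reports how many
# arguments it received.
count_args() {
	echo "$#"
}

tag="v6.12 rc5"                  # contrived value containing a space

count_args --match=$tag          # → 2 (unquoted: split into two words)
count_args "--match=$tag"        # → 1 (quoted: passed as one argument)
```

Real tag names cannot contain spaces, so this is unlikely to bite in practice, which matches the "info" severity of the warning.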
> > This function returns either 1 or 0, but how is it used?
> >
>
> Well, you're right that it's not used currently, but I might as well let
> the return value reflect whether it succeeded or not. I played around
> with some variation of
>
> if [ -n "${file_localversion#-}" ] && try_tag "${file_localversion#-}" ; then
> :
> elif [ -n "${file_localversion}" ] && try_tag "${version_tag}${file_localversion}" ; then
> :
> elif try_tag "${version_tag}" ; then
> :
> else
> count=""
> fi
>
> but in the end decided to keep the current logic of testing some shell
> variable (previously $desc, now $count).
Either style is fine with me.
> Still, I see no reason to make
> the early returns do "return 0".
--
Best Regards
Masahiro Yamada