From: Junio C Hamano <gitster@pobox.com>
To: Kai Koponen <kaikoponen@google.com>
Cc: git@vger.kernel.org
Subject: Re: Perf bug: rev-list w/ 2+ paths relatively slow with commit-graph
Date: Mon, 23 Jun 2025 12:36:21 -0700
Message-ID: <xmqq8qli5jyi.fsf@gitster.g>
In-Reply-To: <CADYQcGqaMC=4jgbmnF9Q11oC11jfrqyvH8EuiRRHytpMXd4wYA@mail.gmail.com> (Kai Koponen's message of "Mon, 23 Jun 2025 13:58:03 -0400")

Kai Koponen <kaikoponen@google.com> writes:

> Reproduce steps:
> ```
> git clone https://github.com/golang/go.git
> cd go
> git config core.commitGraph true
> git commit-graph write --split --reachable --changed-paths # Without this, all calls equally slow (~1s)
> time git rev-list -10 3730814f2f2bf24550920c39a16841583de2dac1 -- src/clean.bash > /dev/null # ~90ms
> time git rev-list -10 3730814f2f2bf24550920c39a16841583de2dac1 -- src/Make.dist > /dev/null # ~100ms
> time git rev-list -10 3730814f2f2bf24550920c39a16841583de2dac1 -- src/clean.bash src/Make.dist > /dev/null # ~650ms
> ```
>
> The rev-list call with multiple paths takes over 3x longer than the
> sum of individual calls to it for the same files.
>
> Expectation: rev-list with multiple paths should take <= the sum of
> the time it takes to call it with each path individually (ideally <,
> since with the count limit it should be able to early-exit and search
> less commits for either path).
>
> Also reproduces without the -10 arg, or with a lower count (double
> instead of triple w/ -1), but these results are perhaps most
> surprising with a count present.

I asked

    How does "git log -- path" use the changed-paths bloom filter
    stored in the commit-graph file?

to https://deepwiki.com/git/git (there is a text field at the bottom
of the page), and an early part of its answer explains why in a
fairly convincing way ;-)

    When you run git log -- path, Git first prepares to use bloom
    filters in the prepare_to_use_bloom_filter function. This function:

    1. Validates the pathspec - It calls forbid_bloom_filters to check
       if bloom filters can be used (revision.c:674-686). Bloom filters
       are disabled for wildcards, multiple paths, or complex pathspec
       magic.

    ...

In short, the changed-path filter is used only when following a
pathspec that has a single element and no wildcard. So the
observed result is (unfortunately) quite expected.
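
As a rough illustration of that rule, here is a minimal shell sketch.
The function name can_use_bloom_filter is hypothetical and is not
part of Git; it only approximates the conditions described above
(exactly one pathspec element, no wildcard, no pathspec magic) and
does not reproduce the actual check in revision.c.

```shell
# Hypothetical helper, not part of Git: approximates the rule described
# above for when the changed-path Bloom filter can be consulted.
# Returns 0 (true) only for a single, literal, magic-free pathspec.
can_use_bloom_filter() {
	# Zero pathspec elements, or more than one: the filter is skipped.
	[ "$#" -eq 1 ] || return 1
	case "$1" in
	:*) return 1 ;;               # pathspec magic, e.g. ":(glob)" or ":!"
	*\**|*\?*|*\[*) return 1 ;;   # glob wildcard characters
	esac
	return 0
}

can_use_bloom_filter src/clean.bash && echo "filter usable"
can_use_bloom_filter src/clean.bash src/Make.dist || echo "filter skipped"
```

Under these assumptions, the two single-path invocations in the
report pass the check, while the two-path invocation does not, which
matches the timings reported.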