From: Junio C Hamano <gitster@pobox.com>
To: Kai Koponen <kaikoponen@google.com>
Cc: git@vger.kernel.org
Subject: Re: Perf bug: rev-list w/ 2+ paths relatively slow with commit-graph
Date: Mon, 23 Jun 2025 12:36:21 -0700
Message-ID: <xmqq8qli5jyi.fsf@gitster.g>
In-Reply-To: <CADYQcGqaMC=4jgbmnF9Q11oC11jfrqyvH8EuiRRHytpMXd4wYA@mail.gmail.com> (Kai Koponen's message of "Mon, 23 Jun 2025 13:58:03 -0400")

Kai Koponen <kaikoponen@google.com> writes:

> Reproduce steps:
> ```
> git clone https://github.com/golang/go.git
> cd go
> git config core.commitGraph true
> git commit-graph write --split --reachable --changed-paths # Without this, all calls equally slow (~1s)
> time git rev-list -10 3730814f2f2bf24550920c39a16841583de2dac1 -- src/clean.bash > /dev/null # ~90ms
> time git rev-list -10 3730814f2f2bf24550920c39a16841583de2dac1 -- src/Make.dist > /dev/null # ~100ms
> time git rev-list -10 3730814f2f2bf24550920c39a16841583de2dac1 -- src/clean.bash src/Make.dist > /dev/null # ~650ms
> ```
>
> The rev-list call with multiple paths takes over 3x longer than the
> sum of individual calls to it for the same files.
>
> Expectation: rev-list with multiple paths should take <= the sum of
> the time it takes to call it with each path individually (ideally <,
> since with the count limit it should be able to early-exit and search
> less commits for either path).
>
> Also reproduces without the -10 arg, or with a lower count (double
> instead of triple w/ -1), but these results are perhaps most
> surprising with a count present.

I asked

    How does "git log -- path" use the changed-paths bloom filter
    stored in the commit-graph file?

to https://deepwiki.com/git/git (there is a text field at the bottom
of the page), and an early part of its answer explains why in a
fairly convincing way ;-)

    When you run git log -- path, Git first prepares to use bloom
    filters in the prepare_to_use_bloom_filter function. This function:

    1. Validates the pathspec - It calls forbid_bloom_filters to check
       if bloom filters can be used (revision.c:674-686). Bloom filters
       are disabled for wildcards, multiple paths, or complex pathspec
       magic.
    ...

In short, the changed-path filter is used only when the pathspec
has a single element that is not a wildcard. So the observed result
is (unfortunately) quite expected.
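
To make the effect concrete, here is a minimal toy model in Python. It is
not Git's code: `ToyBloom`, `filter_usable`, and `count_tree_diffs` are
hypothetical names, the filter layout is unrelated to the commit-graph
format, and pathspec magic is ignored. It only illustrates the rule quoted
above: with one literal path the per-commit filter can rule commits out
cheaply, while with two or more paths every commit falls back to the
expensive tree comparison.

```python
import hashlib


class ToyBloom:
    """Tiny Bloom filter over path strings (assumption: not Git's format)."""

    def __init__(self, nbits=64, nhashes=3):
        self.nbits = nbits
        self.nhashes = nhashes
        self.bits = 0

    def _positions(self, path):
        # Derive nhashes bit positions from seeded SHA-256 digests.
        for seed in range(self.nhashes):
            digest = hashlib.sha256(f"{seed}:{path}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.nbits

    def add(self, path):
        for pos in self._positions(path):
            self.bits |= 1 << pos

    def maybe_contains(self, path):
        # False means "definitely not changed"; True means "maybe changed".
        return all((self.bits >> pos) & 1 for pos in self._positions(path))


def filter_usable(pathspec):
    """Hypothetical gate: exactly one path, and no wildcard characters."""
    if len(pathspec) != 1:
        return False
    return not any(ch in pathspec[0] for ch in "*?[")


def count_tree_diffs(commits, pathspec):
    """Count commits that still need a full tree diff.

    `commits` is a list of (changed_paths, ToyBloom) pairs.
    """
    use_filter = filter_usable(pathspec)
    diffs = 0
    for _changed_paths, bloom in commits:
        if use_filter and not bloom.maybe_contains(pathspec[0]):
            continue  # filter says "definitely untouched": skip the diff
        diffs += 1
    return diffs
```

In this model, calling `count_tree_diffs(commits, ["src/clean.bash", "src/Make.dist"])`
always returns `len(commits)`, because the gate rejects the two-element
pathspec before any filter is consulted. The single-path call can return
fewer, depending on how many filters report a possible match.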