From: "SZEDER Gábor" <szeder.dev@gmail.com>
To: Emily Yang via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, gitster@pobox.com, stolee@gmail.com,
me@ttaylorr.com, ps@pks.im, newren@gmail.com,
Emily Yang <emilyyang.git@gmail.com>
Subject: Re: [PATCH v2] commit-graph: add new config for changed-paths & recommend it in scalar
Date: Wed, 29 Oct 2025 22:04:13 +0100 [thread overview]
Message-ID: <aQKBTQRQkGTgXkd6@szeder.dev> (raw)
In-Reply-To: <pull.1983.v2.git.1760734739642.gitgitgadget@gmail.com>
On Fri, Oct 17, 2025 at 08:58:59PM +0000, Emily Yang via GitGitGadget wrote:
> From: Emily Yang <emilyyang.git@gmail.com>
>
> The changed-path Bloom filters feature has proven stable and reliable
> over several years of use, delivering significant performance
> improvement for file history computation in large monorepos. Currently
> a user can opt-in to writing the changed-path Bloom filters using the
> "--changed-paths" option to "git commit-graph write". The filters will
> be persisted until the user drops the filters using the
> "--no-changed-paths" option. For this functionality, refer to 0087a87ba8
> (commit-graph: persist existence of changed-paths, 2020-07-01).
>
> Large monorepos using Git's background maintenance to build and update
> commit-graph files could use an easy switch to enable this feature
> without a foreground computation. In this commit, we're proposing a new
> config option "commitGraph.changedPaths":
>
> * If "true", "git commit-graph write" will write Bloom filters,
> equivalent to passing "--changed-paths";
> * If "false" or "unset", Bloom filters will be written during "git
> commit-graph write" only if the filters already exist in the current
> commit-graph file. This matches the default behaviour of "git
> commit-graph write" without any "--[no-]changed-paths" option. Note
> "false" can disable a previous "true" config value but doesn't imply
> "--no-changed-paths".
So if the commit-graph contains changed path Bloom filters, and the
user takes the effort, and explicitly sets this config variable to
false, then Git will just ignore that, and will continue to waste
resources to compute the Bloom filters?! This doesn't seems like a
sensible behavior to me.
> diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
> index 0b3404f58f..98c6910963 100755
> --- a/t/t5318-commit-graph.sh
> +++ b/t/t5318-commit-graph.sh
> @@ -946,4 +946,48 @@ test_expect_success 'stale commit cannot be parsed when traversing graph' '
> )
> '
>
> +test_expect_success 'config commitGraph.changedPaths acts like --changed-paths' '
> + git init config-changed-paths &&
> + (
> + cd config-changed-paths &&
> +
> + # commitGraph.changedPaths is not set and it should not write Bloom filters
> + test_commit first &&
> + GIT_PROGRESS_DELAY=0 git commit-graph write --reachable --progress 2>error &&
> + test_grep ! "Bloom filters" error &&
> +
> + # Set commitGraph.changedPaths to true and it should write Bloom filters
> + test_commit second &&
> + git config commitGraph.changedPaths true &&
> + GIT_PROGRESS_DELAY=0 git commit-graph write --reachable --progress 2>error &&
> + test_grep "Bloom filters" error &&
> +
> + # Add one more config commitGraph.changedPaths as false to disable the previous true config value
> + # It should still write Bloom filters due to existing filters
> + test_commit third &&
> + git config --add commitGraph.changedPaths false &&
> + GIT_PROGRESS_DELAY=0 git commit-graph write --reachable --progress 2>error &&
> + test_grep "Bloom filters" error &&
> +
> + # commitGraph.changedPaths is still false and command line options should take precedence
> + test_commit fourth &&
> + GIT_PROGRESS_DELAY=0 git commit-graph write --no-changed-paths --reachable --progress 2>error &&
> + test_grep ! "Bloom filters" error &&
> + GIT_PROGRESS_DELAY=0 git commit-graph write --reachable --progress 2>error &&
> + test_grep ! "Bloom filters" error &&
> +
> + # commitGraph.changedPaths is all cleared and then set to false again, command line options should take precedence
> + test_commit fifth &&
> + git config --unset-all commitGraph.changedPaths &&
> + git config commitGraph.changedPaths false &&
> + GIT_PROGRESS_DELAY=0 git commit-graph write --changed-paths --reachable --progress 2>error &&
> + test_grep "Bloom filters" error &&
> +
> + # commitGraph.changedPaths is still false and it should write Bloom filters due to existing filters
> + test_commit sixth &&
> + GIT_PROGRESS_DELAY=0 git commit-graph write --reachable --progress 2>error &&
> + test_grep "Bloom filters" error
> + )
> +'
The interaction of split commit-graphs and changed path Bloom filters
used to be buggy, even after attempts to fix it. Therefore, I think
this config variable should be tested with split commit-graphs as
well.
prev parent reply other threads:[~2025-10-29 21:04 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-09 21:01 [PATCH] commit-graph: add new config for changed-paths & recommend it in scalar Emily Yang via GitGitGadget
2025-10-09 22:30 ` Junio C Hamano
2025-10-10 12:48 ` Derrick Stolee
2025-10-10 16:32 ` Junio C Hamano
2025-10-10 12:32 ` Derrick Stolee
2025-10-17 20:58 ` [PATCH v2] " Emily Yang via GitGitGadget
2025-10-22 14:53 ` Derrick Stolee
2025-10-22 17:42 ` Junio C Hamano
2025-10-29 21:04 ` SZEDER Gábor [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aQKBTQRQkGTgXkd6@szeder.dev \
--to=szeder.dev@gmail.com \
--cc=emilyyang.git@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=gitster@pobox.com \
--cc=me@ttaylorr.com \
--cc=newren@gmail.com \
--cc=ps@pks.im \
--cc=stolee@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).