From: "Emily Yang via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: gitster@pobox.com, stolee@gmail.com, me@ttaylorr.com, ps@pks.im,
newren@gmail.com, Emily Yang <emilyyang.git@gmail.com>
Subject: [PATCH v2] commit-graph: add new config for changed-paths & recommend it in scalar
Date: Fri, 17 Oct 2025 20:58:59 +0000
Message-ID: <pull.1983.v2.git.1760734739642.gitgitgadget@gmail.com>
In-Reply-To: <pull.1983.git.1760043710502.gitgitgadget@gmail.com>
From: Emily Yang <emilyyang.git@gmail.com>

The changed-path Bloom filters feature has proven stable and reliable
over several years of use, delivering significant performance
improvements for file history computation in large monorepos. Currently
a user can opt in to writing the changed-path Bloom filters using the
"--changed-paths" option to "git commit-graph write". The filters will
be persisted until the user drops the filters using the
"--no-changed-paths" option. For this functionality, refer to 0087a87ba8
(commit-graph: persist existence of changed-paths, 2020-07-01).

Large monorepos using Git's background maintenance to build and update
commit-graph files could use an easy switch to enable this feature
without a foreground computation. In this commit, we're proposing a new
config option "commitGraph.changedPaths":

* If "true", "git commit-graph write" will write Bloom filters,
  equivalent to passing "--changed-paths";
* If "false" or unset, Bloom filters will be written during "git
  commit-graph write" only if the filters already exist in the current
  commit-graph file. This matches the default behaviour of "git
  commit-graph write" without any "--[no-]changed-paths" option. Note
  that "false" can disable a previous "true" config value but doesn't
  imply "--no-changed-paths".

The command-line option "--[no-]changed-paths" always takes precedence
over this config.

We also set this new config as an optional recommended config in scalar
to turn on this feature for large repos.

Helped-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Emily Yang <emilyyang.git@gmail.com>
---
commit-graph: add new config for changed-paths & recommend it in scalar

Hello,

I'm Emily and I'm interested in contributing to Git. This is my first
contribution to Git, and I'm super excited!

I'm from Microsoft and spend most of my time working in the Office
MonoRepo (OMR, one of the largest repos in the world). Recently I've
been working with Derrick Stolee on Git performance-related topics. We'd
love to propose a small enhancement to the existing changed-path Bloom
filters feature to benefit large repos like OMR. Please kindly review
the code and provide your feedback!

What's included in v2:

I received feedback that the explanation of the config was confusing,
so in v2 I added more clarification to the docs and the commit message.
Hopefully it helps!
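
For readers who prefer a decision table to prose, the semantics described
in the commit message can be sketched as a small shell function. This is
only an illustration of the intended behaviour, not the code in
builtin/commit-graph.c; the helper name "would_write_filters" and its
three arguments are hypothetical and exist purely for this sketch.

```shell
# Hypothetical helper, not part of Git: models whether
# "git commit-graph write" would write changed-path Bloom filters.
#   $1: command-line option: "--changed-paths", "--no-changed-paths",
#       or "" when neither is given
#   $2: effective commitGraph.changedPaths value: "true", "false",
#       or "" when unset
#   $3: "yes" if the current commit-graph file already has filters,
#       otherwise "no"
# Prints "write" or "skip".
would_write_filters () {
	# The command-line option always takes precedence over the config.
	case "$1" in
	--changed-paths)    echo write; return ;;
	--no-changed-paths) echo skip; return ;;
	esac

	if test "$2" = true
	then
		# Config "true" acts like passing --changed-paths.
		echo write
	elif test "$3" = yes
	then
		# "false" or unset: keep writing filters only if they exist.
		echo write
	else
		echo skip
	fi
}

would_write_filters "" true no                     # prints "write"
would_write_filters "" false yes                   # prints "write"
would_write_filters --no-changed-paths true yes    # prints "skip"
```

The second call shows the point that caused confusion in v1: a "false"
value does not drop existing filters, it only stops the config from
requesting new ones.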
Thanks, Emily
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1983%2Femilyyang-ms%2Fchanged-paths-config-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1983/emilyyang-ms/changed-paths-config-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1983
Range-diff vs v1:
1: 90b271e905 ! 1: 365db79f4d commit-graph: add new config for changed-paths & recommend it in scalar
@@ Commit message
a user can opt-in to writing the changed-path Bloom filters using the
"--changed-paths" option to "git commit-graph write". The filters will
be persisted until the user drops the filters using the
- "--no-changed-paths" option.
+ "--no-changed-paths" option. For this functionality, refer to 0087a87ba8
+ (commit-graph: persist existence of changed-paths, 2020-07-01).
Large monorepos using Git's background maintenance to build and update
commit-graph files could use an easy switch to enable this feature
without a foreground computation. In this commit, we're proposing a new
- config option "commitGraph.changedPaths" - "true" value acts like
- "--changed-paths"; "false" disables a previous "true" config value but
- doesn't imply "--no-changed-paths". This config will always respect the
- precedence of command line option "--changed-paths" and
- "--no-changed-paths".
+ config option "commitGraph.changedPaths":
+
+ * If "true", "git commit-graph write" will write Bloom filters,
+ equivalent to passing "--changed-paths";
+ * If "false" or "unset", Bloom filters will be written during "git
+ commit-graph write" only if the filters already exist in the current
+ commit-graph file. This matches the default behaviour of "git
+ commit-graph write" without any "--[no-]changed-paths" option. Note
+ "false" can disable a previous "true" config value but doesn't imply
+ "--no-changed-paths".
+
+ This config will always respect the precedence of command line option
+ "--[no-]changed-paths".
We also set this new config as optional recommended config in scalar to
turn on this feature for large repos.
@@ Documentation/config/commitgraph.adoc: commitGraph.maxNewFilters::
+commitGraph.changedPaths::
+ If true, then `git commit-graph write` will compute and write
+ changed-path Bloom filters by default, equivalent to passing
-+ `--changed-paths`. If false or unset, changed-path Bloom filters
-+ will only be written when explicitly requested via `--changed-paths`.
-+ Command-line options always take precedence over this configuration.
-+ Defaults to unset.
++ `--changed-paths`. If false or unset, changed-paths Bloom filters will
++ be written during `git commit-graph write` only if the filters already
++ exist in the current commit-graph file. This matches the default
++ behavior of `git commit-graph write` without any `--[no-]changed-paths`
++ option. To rewrite a commit-graph file without any filters, use the
++ `--no-changed-paths` option. Command-line option `--[no-]changed-paths`
++ always takes precedence over this configuration. Defaults to unset.
+
commitGraph.readChangedPaths::
Deprecated. Equivalent to commitGraph.changedPathsVersion=-1 if true, and
commitGraph.changedPathsVersion=0 if false. (If commitGraph.changedPathVersion
+ ## Documentation/git-commit-graph.adoc ##
+@@ Documentation/git-commit-graph.adoc: take a while on large repositories. It provides significant performance gains
+ for getting history of a directory or a file with `git log -- <path>`. If
+ this option is given, future commit-graph writes will automatically assume
+ that this option was intended. Use `--no-changed-paths` to stop storing this
+-data.
++data. `--changed-paths` is implied by config `commitGraph.changedPaths=true`.
+ +
+ With the `--max-new-filters=<n>` option, generate at most `n` new Bloom
+ filters (if `--changed-paths` is specified). If `n` is `-1`, no limit is
+
## builtin/commit-graph.c ##
@@ builtin/commit-graph.c: static int git_commit_graph_write_config(const char *var, const char *value,
{
 Documentation/config/commitgraph.adoc | 11 +++++++
 Documentation/git-commit-graph.adoc   |  2 +-
 builtin/commit-graph.c                |  2 ++
 scalar.c                              |  1 +
 t/t5318-commit-graph.sh               | 44 +++++++++++++++++++++++++++
 5 files changed, 59 insertions(+), 1 deletion(-)
diff --git a/Documentation/config/commitgraph.adoc b/Documentation/config/commitgraph.adoc
index 7f8c9d6638..70a56c53d2 100644
--- a/Documentation/config/commitgraph.adoc
+++ b/Documentation/config/commitgraph.adoc
@@ -8,6 +8,17 @@ commitGraph.maxNewFilters::
Specifies the default value for the `--max-new-filters` option of `git
commit-graph write` (c.f., linkgit:git-commit-graph[1]).
+commitGraph.changedPaths::
+ If true, then `git commit-graph write` will compute and write
+ changed-path Bloom filters by default, equivalent to passing
+ `--changed-paths`. If false or unset, changed-paths Bloom filters will
+ be written during `git commit-graph write` only if the filters already
+ exist in the current commit-graph file. This matches the default
+ behavior of `git commit-graph write` without any `--[no-]changed-paths`
+ option. To rewrite a commit-graph file without any filters, use the
+ `--no-changed-paths` option. Command-line option `--[no-]changed-paths`
+ always takes precedence over this configuration. Defaults to unset.
+
commitGraph.readChangedPaths::
Deprecated. Equivalent to commitGraph.changedPathsVersion=-1 if true, and
commitGraph.changedPathsVersion=0 if false. (If commitGraph.changedPathVersion
diff --git a/Documentation/git-commit-graph.adoc b/Documentation/git-commit-graph.adoc
index e9558173c0..6d19026035 100644
--- a/Documentation/git-commit-graph.adoc
+++ b/Documentation/git-commit-graph.adoc
@@ -71,7 +71,7 @@ take a while on large repositories. It provides significant performance gains
for getting history of a directory or a file with `git log -- <path>`. If
this option is given, future commit-graph writes will automatically assume
that this option was intended. Use `--no-changed-paths` to stop storing this
-data.
+data. `--changed-paths` is implied by config `commitGraph.changedPaths=true`.
+
With the `--max-new-filters=<n>` option, generate at most `n` new Bloom
filters (if `--changed-paths` is specified). If `n` is `-1`, no limit is
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index fe3ebaadad..d62005edc0 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -210,6 +210,8 @@ static int git_commit_graph_write_config(const char *var, const char *value,
{
if (!strcmp(var, "commitgraph.maxnewfilters"))
write_opts.max_new_filters = git_config_int(var, value, ctx->kvi);
+ else if (!strcmp(var, "commitgraph.changedpaths"))
+ opts.enable_changed_paths = git_config_bool(var, value) ? 1 : -1;
/*
* No need to fall-back to 'git_default_config', since this was already
* called in 'cmd_commit_graph()'.
diff --git a/scalar.c b/scalar.c
index 4a373c133d..f754311627 100644
--- a/scalar.c
+++ b/scalar.c
@@ -166,6 +166,7 @@ static int set_recommended_config(int reconfigure)
#endif
/* Optional */
{ "status.aheadBehind", "false" },
+ { "commitGraph.changedPaths", "true" },
{ "commitGraph.generationVersion", "1" },
{ "core.autoCRLF", "false" },
{ "core.safeCRLF", "false" },
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 0b3404f58f..98c6910963 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -946,4 +946,48 @@ test_expect_success 'stale commit cannot be parsed when traversing graph' '
)
'
+test_expect_success 'config commitGraph.changedPaths acts like --changed-paths' '
+ git init config-changed-paths &&
+ (
+ cd config-changed-paths &&
+
+ # commitGraph.changedPaths is not set and it should not write Bloom filters
+ test_commit first &&
+ GIT_PROGRESS_DELAY=0 git commit-graph write --reachable --progress 2>error &&
+ test_grep ! "Bloom filters" error &&
+
+ # Set commitGraph.changedPaths to true and it should write Bloom filters
+ test_commit second &&
+ git config commitGraph.changedPaths true &&
+ GIT_PROGRESS_DELAY=0 git commit-graph write --reachable --progress 2>error &&
+ test_grep "Bloom filters" error &&
+
+ # Add one more config commitGraph.changedPaths as false to disable the previous true config value
+ # It should still write Bloom filters due to existing filters
+ test_commit third &&
+ git config --add commitGraph.changedPaths false &&
+ GIT_PROGRESS_DELAY=0 git commit-graph write --reachable --progress 2>error &&
+ test_grep "Bloom filters" error &&
+
+ # commitGraph.changedPaths is still false and command line options should take precedence
+ test_commit fourth &&
+ GIT_PROGRESS_DELAY=0 git commit-graph write --no-changed-paths --reachable --progress 2>error &&
+ test_grep ! "Bloom filters" error &&
+ GIT_PROGRESS_DELAY=0 git commit-graph write --reachable --progress 2>error &&
+ test_grep ! "Bloom filters" error &&
+
+ # commitGraph.changedPaths is all cleared and then set to false again, command line options should take precedence
+ test_commit fifth &&
+ git config --unset-all commitGraph.changedPaths &&
+ git config commitGraph.changedPaths false &&
+ GIT_PROGRESS_DELAY=0 git commit-graph write --changed-paths --reachable --progress 2>error &&
+ test_grep "Bloom filters" error &&
+
+ # commitGraph.changedPaths is still false and it should write Bloom filters due to existing filters
+ test_commit sixth &&
+ GIT_PROGRESS_DELAY=0 git commit-graph write --reachable --progress 2>error &&
+ test_grep "Bloom filters" error
+ )
+'
+
test_done
base-commit: 79cf913ea9321f774da29b2330b5781d5ff420ef
--
gitgitgadget