From: "SZEDER Gábor" <szeder.dev@gmail.com>
To: Taylor Blau <me@ttaylorr.com>
Cc: git@vger.kernel.org, Elijah Newren <newren@gmail.com>,
"Eric W. Biederman" <ebiederm@gmail.com>,
Jeff King <peff@peff.net>, Junio C Hamano <gitster@pobox.com>,
Patrick Steinhardt <ps@pks.im>,
Jonathan Tan <jonathantanmy@google.com>
Subject: Re: [PATCH v5 09/17] repo-settings: introduce commitgraph.changedPathsVersion
Date: Mon, 29 Jan 2024 22:26:14 +0100 [thread overview]
Message-ID: <20240129212614.GB9612@szeder.dev> (raw)
In-Reply-To: <a77dcc99b4eb0a19dc6c09a40a84785413502126.1705442923.git.me@ttaylorr.com>
On Tue, Jan 16, 2024 at 05:09:28PM -0500, Taylor Blau wrote:
> A subsequent commit will introduce another version of the changed-path
> filter in the commit graph file. In order to control which version to
> write (and read), a config variable is needed.
>
> Therefore, introduce this config variable. For forwards compatibility,
> teach Git to not read commit graphs when the config variable
> is set to an unsupported version. Because we teach Git this,
> commitgraph.readChangedPaths is now redundant, so deprecate it and
> define its behavior in terms of the config variable we introduce.
>
> This commit does not change the behavior of writing (Git writes changed
> path filters when explicitly instructed regardless of any config
> variable), but a subsequent commit will restrict Git such that it will
> only write when commitgraph.changedPathsVersion is a recognized value.
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
> Documentation/config/commitgraph.txt | 26 ++++++++++++++++++++++---
> commit-graph.c | 5 +++--
> oss-fuzz/fuzz-commit-graph.c | 2 +-
> repo-settings.c | 6 +++++-
> repository.h | 2 +-
> t/t4216-log-bloom.sh | 29 +++++++++++++++++++++++++++-
> 6 files changed, 61 insertions(+), 9 deletions(-)
>
> diff --git a/Documentation/config/commitgraph.txt b/Documentation/config/commitgraph.txt
> index 30604e4a4c..e68cdededa 100644
> --- a/Documentation/config/commitgraph.txt
> +++ b/Documentation/config/commitgraph.txt
> @@ -9,6 +9,26 @@ commitGraph.maxNewFilters::
> commit-graph write` (c.f., linkgit:git-commit-graph[1]).
>
> commitGraph.readChangedPaths::
> - If true, then git will use the changed-path Bloom filters in the
> - commit-graph file (if it exists, and they are present). Defaults to
> - true. See linkgit:git-commit-graph[1] for more information.
> + Deprecated. Equivalent to commitGraph.changedPathsVersion=-1 if true, and
> + commitGraph.changedPathsVersion=0 if false. (If commitGraph.changedPathVersion
> + is also set, commitGraph.changedPathsVersion takes precedence.)
> +
> +commitGraph.changedPathsVersion::
> + Specifies the version of the changed-path Bloom filters that Git will read and
> + write. May be -1, 0 or 1. Note that values greater than 1 may be
> + incompatible with older versions of Git which do not yet understand
> + those versions. Use caution when operating in a mixed-version
> + environment.
> ++
> +Defaults to -1.
> ++
> +If -1, Git will use the version of the changed-path Bloom filters in the
> +repository, defaulting to 1 if there are none.
> ++
> +If 0, Git will not read any Bloom filters, and will write version 1 Bloom
> +filters when instructed to write.
> ++
> +If 1, Git will only read version 1 Bloom filters, and will write version 1
> +Bloom filters.
> ++
> +See linkgit:git-commit-graph[1] for more information.
> diff --git a/commit-graph.c b/commit-graph.c
> index 00113b0f62..91c98ebc6c 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -459,7 +459,7 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
> graph->read_generation_data = 1;
> }
>
> - if (s->commit_graph_read_changed_paths) {
> + if (s->commit_graph_changed_paths_version) {
> read_chunk(cf, GRAPH_CHUNKID_BLOOMINDEXES,
> graph_read_bloom_index, graph);
> read_chunk(cf, GRAPH_CHUNKID_BLOOMDATA,
> @@ -555,7 +555,8 @@ static void validate_mixed_bloom_settings(struct commit_graph *g)
> }
>
> if (g->bloom_filter_settings->bits_per_entry != settings->bits_per_entry ||
> - g->bloom_filter_settings->num_hashes != settings->num_hashes) {
> + g->bloom_filter_settings->num_hashes != settings->num_hashes ||
> + g->bloom_filter_settings->hash_version != settings->hash_version) {
> g->chunk_bloom_indexes = NULL;
> g->chunk_bloom_data = NULL;
> FREE_AND_NULL(g->bloom_filter_settings);
> diff --git a/oss-fuzz/fuzz-commit-graph.c b/oss-fuzz/fuzz-commit-graph.c
> index 2992079dd9..325c0b991a 100644
> --- a/oss-fuzz/fuzz-commit-graph.c
> +++ b/oss-fuzz/fuzz-commit-graph.c
> @@ -19,7 +19,7 @@ int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
> * possible.
> */
> the_repository->settings.commit_graph_generation_version = 2;
> - the_repository->settings.commit_graph_read_changed_paths = 1;
> + the_repository->settings.commit_graph_changed_paths_version = 1;
> g = parse_commit_graph(&the_repository->settings, (void *)data, size);
> repo_clear(the_repository);
> free_commit_graph(g);
> diff --git a/repo-settings.c b/repo-settings.c
> index 30cd478762..c821583fe5 100644
> --- a/repo-settings.c
> +++ b/repo-settings.c
> @@ -23,6 +23,7 @@ void prepare_repo_settings(struct repository *r)
> int value;
> const char *strval;
> int manyfiles;
> + int read_changed_paths;
>
> if (!r->gitdir)
> BUG("Cannot add settings for uninitialized repository");
> @@ -53,7 +54,10 @@ void prepare_repo_settings(struct repository *r)
> /* Commit graph config or default, does not cascade (simple) */
> repo_cfg_bool(r, "core.commitgraph", &r->settings.core_commit_graph, 1);
> repo_cfg_int(r, "commitgraph.generationversion", &r->settings.commit_graph_generation_version, 2);
> - repo_cfg_bool(r, "commitgraph.readchangedpaths", &r->settings.commit_graph_read_changed_paths, 1);
> + repo_cfg_bool(r, "commitgraph.readchangedpaths", &read_changed_paths, 1);
> + repo_cfg_int(r, "commitgraph.changedpathsversion",
> + &r->settings.commit_graph_changed_paths_version,
> + read_changed_paths ? -1 : 0);
> repo_cfg_bool(r, "gc.writecommitgraph", &r->settings.gc_write_commit_graph, 1);
> repo_cfg_bool(r, "fetch.writecommitgraph", &r->settings.fetch_write_commit_graph, 0);
>
> diff --git a/repository.h b/repository.h
> index 5f18486f64..f71154e12c 100644
> --- a/repository.h
> +++ b/repository.h
> @@ -29,7 +29,7 @@ struct repo_settings {
>
> int core_commit_graph;
> int commit_graph_generation_version;
> - int commit_graph_read_changed_paths;
> + int commit_graph_changed_paths_version;
> int gc_write_commit_graph;
> int fetch_write_commit_graph;
> int command_requires_full_index;
> diff --git a/t/t4216-log-bloom.sh b/t/t4216-log-bloom.sh
> index 484dd093cd..642b960893 100755
> --- a/t/t4216-log-bloom.sh
> +++ b/t/t4216-log-bloom.sh
> @@ -435,7 +435,7 @@ test_expect_success 'setup for mixed Bloom setting tests' '
> done
> '
>
> -test_expect_success 'ensure incompatible Bloom filters are ignored' '
> +test_expect_success 'ensure Bloom filters with incompatible settings are ignored' '
> # Compute Bloom filters with "unusual" settings.
> git -C $repo rev-parse one >in &&
> GIT_TEST_BLOOM_SETTINGS_NUM_HASHES=3 git -C $repo commit-graph write \
> @@ -485,6 +485,33 @@ test_expect_success 'merge graph layers with incompatible Bloom settings' '
> test_must_be_empty err
> '
>
> +test_expect_success 'ensure Bloom filter with incompatible versions are ignored' '
> + rm "$repo/$graph" &&
> +
> + git -C $repo log --oneline --no-decorate -- $CENT >expect &&
> +
> + # Compute v1 Bloom filters for commits at the bottom.
> + git -C $repo rev-parse HEAD^ >in &&
> + git -C $repo commit-graph write --stdin-commits --changed-paths \
> + --split <in &&
> +
> + # Compute v2 Bloomfilters for the rest of the commits at the top.
> + git -C $repo rev-parse HEAD >in &&
> + git -C $repo -c commitGraph.changedPathsVersion=2 commit-graph write \
> + --stdin-commits --changed-paths --split=no-merge <in &&
> +
> + test_line_count = 2 $repo/$chain &&
> +
> + git -C $repo log --oneline --no-decorate -- $CENT >actual 2>err &&
> + test_cmp expect actual &&
> +
> + layer="$(head -n 1 $repo/$chain)" &&
> + cat >expect.err <<-EOF &&
> + warning: disabling Bloom filters for commit-graph layer $SQ$layer$SQ due to incompatible settings
> + EOF
> + test_cmp expect.err err
> +'
At this point in the series this test fails with:
+ test_cmp expect.err err
+ test 2 -ne 2
+ eval diff -u "$@"
+ diff -u expect.err err
--- expect.err 2024-01-29 21:02:57.927462620 +0000
+++ err 2024-01-29 21:02:57.923462642 +0000
@@ -1 +0,0 @@
-warning: disabling Bloom filters for commit-graph layer 'e338a7a1b4cfa5f6bcd31aea3e027df67d06442a' due to incompatible settings
error: last command exited with $?=1
> +
> get_first_changed_path_filter () {
> test-tool read-graph bloom-filters >filters.dat &&
> head -n 1 filters.dat
> --
> 2.43.0.334.gd4dbce1db5.dirty
>
next prev parent reply other threads:[~2024-01-29 21:26 UTC|newest]
Thread overview: 89+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-06 22:01 [PATCH 0/7] merge-ort: implement support for packing objects together Taylor Blau
2023-10-06 22:01 ` [PATCH 1/7] bulk-checkin: factor out `format_object_header_hash()` Taylor Blau
2023-10-06 22:01 ` [PATCH 2/7] bulk-checkin: factor out `prepare_checkpoint()` Taylor Blau
2023-10-06 22:01 ` [PATCH 3/7] bulk-checkin: factor out `truncate_checkpoint()` Taylor Blau
2023-10-06 22:01 ` [PATCH 4/7] bulk-checkin: factor our `finalize_checkpoint()` Taylor Blau
2023-10-06 22:02 ` [PATCH 5/7] bulk-checkin: introduce `index_blob_bulk_checkin_incore()` Taylor Blau
2023-10-06 22:02 ` [PATCH 6/7] bulk-checkin: introduce `index_tree_bulk_checkin_incore()` Taylor Blau
2023-10-07 3:07 ` Eric Biederman
2023-10-09 1:31 ` Taylor Blau
2023-10-06 22:02 ` [PATCH 7/7] builtin/merge-tree.c: implement support for `--write-pack` Taylor Blau
2023-10-06 22:35 ` Junio C Hamano
2023-10-06 23:02 ` Taylor Blau
2023-10-08 7:02 ` Elijah Newren
2023-10-08 16:04 ` Taylor Blau
2023-10-08 17:33 ` Jeff King
2023-10-09 1:37 ` Taylor Blau
2023-10-09 20:21 ` Jeff King
2023-10-09 17:24 ` Junio C Hamano
2023-10-09 10:54 ` Patrick Steinhardt
2023-10-09 16:08 ` Taylor Blau
2023-10-10 6:36 ` Patrick Steinhardt
2023-10-17 16:31 ` [PATCH v2 0/7] merge-ort: implement support for packing objects together Taylor Blau
2023-10-17 16:31 ` [PATCH v2 1/7] bulk-checkin: factor out `format_object_header_hash()` Taylor Blau
2023-10-17 16:31 ` [PATCH v2 2/7] bulk-checkin: factor out `prepare_checkpoint()` Taylor Blau
2023-10-17 16:31 ` [PATCH v2 3/7] bulk-checkin: factor out `truncate_checkpoint()` Taylor Blau
2023-10-17 16:31 ` [PATCH v2 4/7] bulk-checkin: factor our `finalize_checkpoint()` Taylor Blau
2023-10-17 16:31 ` [PATCH v2 5/7] bulk-checkin: introduce `index_blob_bulk_checkin_incore()` Taylor Blau
2023-10-18 2:18 ` Junio C Hamano
2023-10-18 16:34 ` Taylor Blau
2023-10-17 16:31 ` [PATCH v2 6/7] bulk-checkin: introduce `index_tree_bulk_checkin_incore()` Taylor Blau
2023-10-17 16:31 ` [PATCH v2 7/7] builtin/merge-tree.c: implement support for `--write-pack` Taylor Blau
2023-10-18 17:07 ` [PATCH v3 00/10] merge-ort: implement support for packing objects together Taylor Blau
2023-10-18 17:07 ` [PATCH v3 01/10] bulk-checkin: factor out `format_object_header_hash()` Taylor Blau
2023-10-18 17:07 ` [PATCH v3 02/10] bulk-checkin: factor out `prepare_checkpoint()` Taylor Blau
2023-10-18 17:07 ` [PATCH v3 03/10] bulk-checkin: factor out `truncate_checkpoint()` Taylor Blau
2023-10-18 17:07 ` [PATCH v3 04/10] bulk-checkin: factor out `finalize_checkpoint()` Taylor Blau
2023-10-18 17:08 ` [PATCH v3 05/10] bulk-checkin: extract abstract `bulk_checkin_source` Taylor Blau
2023-10-18 23:10 ` Junio C Hamano
2023-10-19 15:19 ` Taylor Blau
2023-10-19 17:55 ` Junio C Hamano
2023-10-18 17:08 ` [PATCH v3 06/10] bulk-checkin: implement `SOURCE_INCORE` mode for `bulk_checkin_source` Taylor Blau
2023-10-18 17:08 ` [PATCH v3 07/10] bulk-checkin: generify `stream_blob_to_pack()` for arbitrary types Taylor Blau
2023-10-18 17:08 ` [PATCH v3 08/10] bulk-checkin: introduce `index_blob_bulk_checkin_incore()` Taylor Blau
2023-10-18 23:18 ` Junio C Hamano
2023-10-19 15:30 ` Taylor Blau
2023-10-18 17:08 ` [PATCH v3 09/10] bulk-checkin: introduce `index_tree_bulk_checkin_incore()` Taylor Blau
2023-10-18 17:08 ` [PATCH v3 10/10] builtin/merge-tree.c: implement support for `--write-pack` Taylor Blau
2023-10-18 18:32 ` [PATCH v4 00/17] bloom: changed-path Bloom filters v2 (& sundries) Taylor Blau
2023-10-18 18:32 ` [PATCH v4 01/17] t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()` Taylor Blau
2023-10-18 18:32 ` [PATCH v4 02/17] revision.c: consult Bloom filters for root commits Taylor Blau
2023-10-18 18:32 ` [PATCH v4 03/17] commit-graph: ensure Bloom filters are read with consistent settings Taylor Blau
2023-10-18 18:32 ` [PATCH v4 04/17] gitformat-commit-graph: describe version 2 of BDAT Taylor Blau
2023-10-18 18:32 ` [PATCH v4 05/17] t/helper/test-read-graph.c: extract `dump_graph_info()` Taylor Blau
2023-10-18 18:32 ` [PATCH v4 06/17] bloom.h: make `load_bloom_filter_from_graph()` public Taylor Blau
2023-10-18 18:32 ` [PATCH v4 07/17] t/helper/test-read-graph: implement `bloom-filters` mode Taylor Blau
2023-10-18 18:32 ` [PATCH v4 08/17] t4216: test changed path filters with high bit paths Taylor Blau
2023-10-18 18:32 ` [PATCH v4 09/17] repo-settings: introduce commitgraph.changedPathsVersion Taylor Blau
2023-10-18 18:32 ` [PATCH v4 10/17] commit-graph: new filter ver. that fixes murmur3 Taylor Blau
2023-10-18 18:33 ` [PATCH v4 11/17] bloom: annotate filters with hash version Taylor Blau
2023-10-18 18:33 ` [PATCH v4 12/17] bloom: prepare to discard incompatible Bloom filters Taylor Blau
2023-10-18 18:33 ` [PATCH v4 13/17] commit-graph.c: unconditionally load " Taylor Blau
2023-10-18 18:33 ` [PATCH v4 14/17] commit-graph: drop unnecessary `graph_read_bloom_data_context` Taylor Blau
2023-10-18 18:33 ` [PATCH v4 15/17] object.h: fix mis-aligned flag bits table Taylor Blau
2023-10-18 18:33 ` [PATCH v4 16/17] commit-graph: reuse existing Bloom filters where possible Taylor Blau
2023-10-18 18:33 ` [PATCH v4 17/17] bloom: introduce `deinit_bloom_filters()` Taylor Blau
2023-10-18 23:26 ` [PATCH v4 00/17] bloom: changed-path Bloom filters v2 (& sundries) Junio C Hamano
2023-10-20 17:27 ` Taylor Blau
2023-10-23 20:22 ` SZEDER Gábor
2023-10-30 20:24 ` Taylor Blau
2024-01-16 22:08 ` [PATCH v5 " Taylor Blau
2024-01-16 22:09 ` [PATCH v5 01/17] t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()` Taylor Blau
2024-01-16 22:09 ` [PATCH v5 02/17] revision.c: consult Bloom filters for root commits Taylor Blau
2024-01-16 22:09 ` [PATCH v5 03/17] commit-graph: ensure Bloom filters are read with consistent settings Taylor Blau
2024-01-16 22:09 ` [PATCH v5 04/17] gitformat-commit-graph: describe version 2 of BDAT Taylor Blau
2024-01-16 22:09 ` [PATCH v5 05/17] t/helper/test-read-graph.c: extract `dump_graph_info()` Taylor Blau
2024-01-16 22:09 ` [PATCH v5 06/17] bloom.h: make `load_bloom_filter_from_graph()` public Taylor Blau
2024-01-16 22:09 ` [PATCH v5 07/17] t/helper/test-read-graph: implement `bloom-filters` mode Taylor Blau
2024-01-16 22:09 ` [PATCH v5 08/17] t4216: test changed path filters with high bit paths Taylor Blau
2024-01-16 22:09 ` [PATCH v5 09/17] repo-settings: introduce commitgraph.changedPathsVersion Taylor Blau
2024-01-29 21:26 ` SZEDER Gábor [this message]
2024-01-29 23:58 ` Taylor Blau
2024-01-16 22:09 ` [PATCH v5 10/17] commit-graph: new Bloom filter version that fixes murmur3 Taylor Blau
2024-01-16 22:09 ` [PATCH v5 11/17] bloom: annotate filters with hash version Taylor Blau
2024-01-16 22:09 ` [PATCH v5 12/17] bloom: prepare to discard incompatible Bloom filters Taylor Blau
2024-01-16 22:09 ` [PATCH v5 13/17] commit-graph.c: unconditionally load " Taylor Blau
2024-01-16 22:09 ` [PATCH v5 14/17] commit-graph: drop unnecessary `graph_read_bloom_data_context` Taylor Blau
2024-01-16 22:09 ` [PATCH v5 15/17] object.h: fix mis-aligned flag bits table Taylor Blau
2024-01-16 22:09 ` [PATCH v5 16/17] commit-graph: reuse existing Bloom filters where possible Taylor Blau
2024-01-16 22:09 ` [PATCH v5 17/17] bloom: introduce `deinit_bloom_filters()` Taylor Blau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240129212614.GB9612@szeder.dev \
--to=szeder.dev@gmail.com \
--cc=ebiederm@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jonathantanmy@google.com \
--cc=me@ttaylorr.com \
--cc=newren@gmail.com \
--cc=peff@peff.net \
--cc=ps@pks.im \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.