From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: Jonathan Tan <jonathantanmy@google.com>,
Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.net>,
Derrick Stolee <derrickstolee@github.com>
Subject: [PATCH 09/15] bloom: prepare to discard incompatible Bloom filters
Date: Mon, 21 Aug 2023 17:44:25 -0400 [thread overview]
Message-ID: <7c5a0090b52f829cab3ab2e107029a74b7ecfa87.1692654233.git.me@ttaylorr.com> (raw)
In-Reply-To: <cover.1692654233.git.me@ttaylorr.com>
Callers use the inline `get_bloom_filter()` implementation as a thin
wrapper around `get_or_compute_bloom_filter()`. The former calls the
latter with a value of "0" for `compute_if_not_present`, making
`get_bloom_filter()` the default read-only path for fetching an existing
Bloom filter.
Callers expect the value returned from `get_bloom_filter()` is usable,
that is that it's compatible with the configured value corresponding to
`commitGraph.changedPathsVersion`.
This is OK, since the commit-graph machinery only initializes its BDAT
chunk (thereby enabling it to service Bloom filter queries) when the
Bloom filter hash_version is compatible with our settings. So any value
returned by `get_bloom_filter()` is trivially useable.
However, subsequent commits will load the BDAT chunk even when the Bloom
filters are built with incompatible hash versions. Prepare to handle
this by teaching `get_bloom_filter()` to discard filters that are
incompatible with the configured hash version.
Callers who wish to read incompatible filters (e.g., for upgrading
filters from v1 to v2) may use the lower level routine,
`get_or_compute_bloom_filter()`.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
bloom.c | 20 +++++++++++++++++++-
bloom.h | 20 ++++++++++++++++++--
2 files changed, 37 insertions(+), 3 deletions(-)
diff --git a/bloom.c b/bloom.c
index 9b6a30f6f6..739fa093ba 100644
--- a/bloom.c
+++ b/bloom.c
@@ -250,6 +250,23 @@ static void init_truncated_large_filter(struct bloom_filter *filter,
filter->version = version;
}
+struct bloom_filter *get_bloom_filter(struct repository *r, struct commit *c)
+{
+ struct bloom_filter *filter;
+ int hash_version;
+
+ filter = get_or_compute_bloom_filter(r, c, 0, NULL, NULL);
+ if (!filter)
+ return NULL;
+
+ prepare_repo_settings(r);
+ hash_version = r->settings.commit_graph_changed_paths_version;
+
+ if (!(hash_version == -1 || hash_version == filter->version))
+ return NULL; /* unusable filter */
+ return filter;
+}
+
struct bloom_filter *get_or_compute_bloom_filter(struct repository *r,
struct commit *c,
int compute_if_not_present,
@@ -275,7 +292,8 @@ struct bloom_filter *get_or_compute_bloom_filter(struct repository *r,
filter, graph_pos);
}
- if (filter->data && filter->len)
+ if ((filter->data && filter->len) &&
+ (!settings || settings->hash_version == filter->version))
return filter;
if (!compute_if_not_present)
return NULL;
diff --git a/bloom.h b/bloom.h
index 330a140520..bfe389e29c 100644
--- a/bloom.h
+++ b/bloom.h
@@ -110,8 +110,24 @@ struct bloom_filter *get_or_compute_bloom_filter(struct repository *r,
const struct bloom_filter_settings *settings,
enum bloom_filter_computed *computed);
-#define get_bloom_filter(r, c) get_or_compute_bloom_filter( \
- (r), (c), 0, NULL, NULL)
+/*
+ * Find the Bloom filter associated with the given commit "c".
+ *
+ * If any of the following are true
+ *
+ * - the repository does not have a commit-graph, or
+ * - the repository disables reading from the commit-graph, or
+ * - the given commit does not have a Bloom filter computed, or
+ * - there is a Bloom filter for commit "c", but it cannot be read
+ * because the filter uses an incompatible version of murmur3
+ *
+ * , then `get_bloom_filter()` will return NULL. Otherwise, the corresponding
+ * Bloom filter will be returned.
+ *
+ * For callers who wish to inspect Bloom filters with incompatible hash
+ * versions, use get_or_compute_bloom_filter().
+ */
+struct bloom_filter *get_bloom_filter(struct repository *r, struct commit *c);
int bloom_filter_contains(const struct bloom_filter *filter,
const struct bloom_key *key,
--
2.42.0.4.g52b49bb434
next prev parent reply other threads:[~2023-08-21 21:44 UTC|newest]
Thread overview: 76+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-08-21 21:43 [PATCH 00/15] bloom: changed-path Bloom filters v2 Taylor Blau
2023-08-21 21:43 ` [PATCH 01/15] gitformat-commit-graph: describe version 2 of BDAT Taylor Blau
2023-08-21 21:44 ` [PATCH 02/15] t/helper/test-read-graph.c: extract `dump_graph_info()` Taylor Blau
2023-08-21 21:44 ` [PATCH 03/15] bloom.h: make `load_bloom_filter_from_graph()` public Taylor Blau
2023-08-21 21:44 ` [PATCH 04/15] t/helper/test-read-graph: implement `bloom-filters` mode Taylor Blau
2023-08-21 21:44 ` [PATCH 05/15] t4216: test changed path filters with high bit paths Taylor Blau
2023-08-21 21:44 ` [PATCH 06/15] repo-settings: introduce commitgraph.changedPathsVersion Taylor Blau
2023-08-21 21:44 ` [PATCH 07/15] commit-graph: new filter ver. that fixes murmur3 Taylor Blau
2023-08-26 15:06 ` SZEDER Gábor
2023-08-29 16:31 ` Jonathan Tan
2023-08-30 20:02 ` SZEDER Gábor
2023-09-01 20:56 ` Jonathan Tan
2023-09-25 23:03 ` Taylor Blau
2023-10-08 14:35 ` SZEDER Gábor
2023-10-09 18:17 ` Taylor Blau
2023-10-09 19:31 ` Taylor Blau
2023-10-09 19:52 ` Junio C Hamano
2023-10-10 20:34 ` Taylor Blau
2023-08-21 21:44 ` [PATCH 08/15] bloom: annotate filters with hash version Taylor Blau
2023-08-21 21:44 ` Taylor Blau [this message]
2023-08-21 21:44 ` [PATCH 10/15] t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()` Taylor Blau
2023-08-21 21:44 ` [PATCH 11/15] commit-graph.c: unconditionally load Bloom filters Taylor Blau
2023-08-21 21:44 ` [PATCH 12/15] commit-graph: drop unnecessary `graph_read_bloom_data_context` Taylor Blau
2023-08-21 21:44 ` [PATCH 13/15] object.h: fix mis-aligned flag bits table Taylor Blau
2023-08-21 21:44 ` [PATCH 14/15] commit-graph: reuse existing Bloom filters where possible Taylor Blau
2023-08-21 21:44 ` [PATCH 15/15] bloom: introduce `deinit_bloom_filters()` Taylor Blau
2023-08-24 22:22 ` [PATCH 00/15] bloom: changed-path Bloom filters v2 Jonathan Tan
2023-08-25 17:06 ` Jonathan Tan
2023-08-29 22:18 ` Jonathan Tan
2023-08-29 23:16 ` Junio C Hamano
2023-08-30 16:43 ` [PATCH v2 " Jonathan Tan
2023-08-30 16:43 ` [PATCH v2 01/15] gitformat-commit-graph: describe version 2 of BDAT Jonathan Tan
2023-08-30 16:43 ` [PATCH v2 02/15] t/helper/test-read-graph.c: extract `dump_graph_info()` Jonathan Tan
2023-08-30 16:43 ` [PATCH v2 03/15] bloom.h: make `load_bloom_filter_from_graph()` public Jonathan Tan
2023-08-30 16:43 ` [PATCH v2 04/15] t/helper/test-read-graph: implement `bloom-filters` mode Jonathan Tan
2023-08-30 16:43 ` [PATCH v2 05/15] t4216: test changed path filters with high bit paths Jonathan Tan
2023-08-30 16:43 ` [PATCH v2 06/15] repo-settings: introduce commitgraph.changedPathsVersion Jonathan Tan
2023-08-30 16:43 ` [PATCH v2 07/15] commit-graph: new filter ver. that fixes murmur3 Jonathan Tan
2023-08-30 16:43 ` [PATCH v2 08/15] bloom: annotate filters with hash version Jonathan Tan
2023-08-30 16:43 ` [PATCH v2 09/15] bloom: prepare to discard incompatible Bloom filters Jonathan Tan
2023-08-30 16:43 ` [PATCH v2 10/15] t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()` Jonathan Tan
2023-08-30 16:43 ` [PATCH v2 11/15] commit-graph.c: unconditionally load Bloom filters Jonathan Tan
2023-08-30 16:43 ` [PATCH v2 12/15] commit-graph: drop unnecessary `graph_read_bloom_data_context` Jonathan Tan
2023-08-30 16:43 ` [PATCH v2 13/15] object.h: fix mis-aligned flag bits table Jonathan Tan
2023-08-30 16:43 ` [PATCH v2 14/15] commit-graph: reuse existing Bloom filters where possible Jonathan Tan
2023-08-30 16:43 ` [PATCH v2 15/15] bloom: introduce `deinit_bloom_filters()` Jonathan Tan
2023-08-30 19:38 ` [PATCH v2 00/15] bloom: changed-path Bloom filters v2 Junio C Hamano
2023-10-10 20:33 ` [PATCH v3 00/17] bloom: changed-path Bloom filters v2 (& sundries) Taylor Blau
2023-10-10 20:33 ` [PATCH v3 01/17] t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()` Taylor Blau
2023-10-10 20:33 ` [PATCH v3 02/17] revision.c: consult Bloom filters for root commits Taylor Blau
2023-10-10 20:33 ` [PATCH v3 03/17] commit-graph: ensure Bloom filters are read with consistent settings Taylor Blau
2023-10-17 8:45 ` Patrick Steinhardt
2023-10-10 20:33 ` [PATCH v3 04/17] gitformat-commit-graph: describe version 2 of BDAT Taylor Blau
2023-10-10 20:33 ` [PATCH v3 05/17] t/helper/test-read-graph.c: extract `dump_graph_info()` Taylor Blau
2023-10-17 8:45 ` Patrick Steinhardt
2023-10-18 17:37 ` Taylor Blau
2023-10-18 23:56 ` Junio C Hamano
2023-10-10 20:33 ` [PATCH v3 06/17] bloom.h: make `load_bloom_filter_from_graph()` public Taylor Blau
2023-10-10 20:33 ` [PATCH v3 07/17] t/helper/test-read-graph: implement `bloom-filters` mode Taylor Blau
2023-10-10 20:33 ` [PATCH v3 08/17] t4216: test changed path filters with high bit paths Taylor Blau
2023-10-17 8:45 ` Patrick Steinhardt
2023-10-18 17:41 ` Taylor Blau
2023-10-10 20:33 ` [PATCH v3 09/17] repo-settings: introduce commitgraph.changedPathsVersion Taylor Blau
2023-10-10 20:33 ` [PATCH v3 10/17] commit-graph: new filter ver. that fixes murmur3 Taylor Blau
2023-10-17 8:45 ` Patrick Steinhardt
2023-10-18 17:46 ` Taylor Blau
2023-10-10 20:33 ` [PATCH v3 11/17] bloom: annotate filters with hash version Taylor Blau
2023-10-10 20:33 ` [PATCH v3 12/17] bloom: prepare to discard incompatible Bloom filters Taylor Blau
2023-10-10 20:33 ` [PATCH v3 13/17] commit-graph.c: unconditionally load " Taylor Blau
2023-10-17 8:45 ` Patrick Steinhardt
2023-10-10 20:34 ` [PATCH v3 14/17] commit-graph: drop unnecessary `graph_read_bloom_data_context` Taylor Blau
2023-10-10 20:34 ` [PATCH v3 15/17] object.h: fix mis-aligned flag bits table Taylor Blau
2023-10-10 20:34 ` [PATCH v3 16/17] commit-graph: reuse existing Bloom filters where possible Taylor Blau
2023-10-10 20:34 ` [PATCH v3 17/17] bloom: introduce `deinit_bloom_filters()` Taylor Blau
2023-10-17 8:45 ` [PATCH v3 00/17] bloom: changed-path Bloom filters v2 (& sundries) Patrick Steinhardt
2023-10-18 17:47 ` Taylor Blau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7c5a0090b52f829cab3ab2e107029a74b7ecfa87.1692654233.git.me@ttaylorr.com \
--to=me@ttaylorr.com \
--cc=derrickstolee@github.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jonathantanmy@google.com \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).