From: Patrick Steinhardt <ps@pks.im>
To: Lidong Yan <yldhome2d2@gmail.com>
Cc: git@vger.kernel.org, Lidong Yan <502024330056@smail.nju.edu.cn>
Subject: Re: [PATCH v3 1/2] bloom: replace struct bloom_key * with struct bloom_keyvec
Date: Wed, 2 Jul 2025 17:08:23 +0200 [thread overview]
Message-ID: <aGVLZ9VUf2M1sWhL@pks.im> (raw)
In-Reply-To: <20250628042140.1097910-2-502024330056@smail.nju.edu.cn>
On Sat, Jun 28, 2025 at 12:21:39PM +0800, Lidong Yan wrote:
> diff --git a/bloom.h b/bloom.h
> index 6e46489a20..9e4e832c8c 100644
> --- a/bloom.h
> +++ b/bloom.h
> @@ -74,6 +74,11 @@ struct bloom_key {
> uint32_t *hashes;
> };
>
> +struct bloom_keyvec {
> + size_t count;
> + struct bloom_key key[FLEX_ARRAY];
> +};
> +
A short comment would help readers understand what the intent of this
data structure is.
> int load_bloom_filter_from_graph(struct commit_graph *g,
> struct bloom_filter *filter,
> uint32_t graph_pos);
> @@ -100,6 +105,17 @@ void add_key_to_filter(const struct bloom_key *key,
> void init_bloom_filters(void);
> void deinit_bloom_filters(void);
>
> +struct bloom_keyvec *create_bloom_keyvec(size_t count);
> +void destroy_bloom_keyvec(struct bloom_keyvec *vec);
These functions are named very unusually for us -- the first version of
this patch series was following our coding guidelines, but this version
here isn't anymore.
- The primary data structure that a subsystem 'S' deals with is called
`struct S`. Functions that operate on `struct S` are named
`S_<verb>()` and should generally receive a pointer to `struct S` as
first parameter. E.g.
Second, the functions should probably be called `*_new()` and `*_free()`
instead of `create_*()` and `destroy_*()`.
> +static inline void fill_bloom_keyvec_key(const char *data, size_t len,
> + struct bloom_keyvec *vec, size_t nr,
> + const struct bloom_filter_settings *settings)
> +{
> + assert(nr < vec->count);
> + fill_bloom_key(data, len, &vec->key[nr], settings);
> +}
> +
Similarly, this should probably be called `bloom_keyvec_fill_key()`.
> enum bloom_filter_computed {
> BLOOM_NOT_COMPUTED = (1 << 0),
> BLOOM_COMPUTED = (1 << 1),
> @@ -137,4 +153,8 @@ int bloom_filter_contains(const struct bloom_filter *filter,
> const struct bloom_key *key,
> const struct bloom_filter_settings *settings);
>
> +int bloom_filter_contains_vec(const struct bloom_filter *filter,
> + const struct bloom_keyvec *v,
> + const struct bloom_filter_settings *settings);
> +
> #endif
This one looks alright though.
> diff --git a/revision.c b/revision.c
> index afee111196..3aa544c137 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -779,11 +782,8 @@ static int check_maybe_different_in_bloom_filter(struct rev_info *revs,
> return -1;
> }
>
> - for (j = 0; result && j < revs->bloom_keys_nr; j++) {
> - result = bloom_filter_contains(filter,
> - &revs->bloom_keys[j],
> - revs->bloom_filter_settings);
> - }
> + result = bloom_filter_contains_vec(filter, revs->bloom_keyvecs[0],
> + revs->bloom_filter_settings);
>
> if (result)
> count_bloom_filter_maybe++;
This conversion feels wrong to me. Why don't we end up iterating through
`revs->bloom_keyvecs_nr` here? We do indeed change it back in the next
patch to use a for loop.
> @@ -3230,10 +3230,10 @@ void release_revisions(struct rev_info *revs)
> line_log_free(revs);
> oidset_clear(&revs->missing_commits);
>
> - for (int i = 0; i < revs->bloom_keys_nr; i++)
> - clear_bloom_key(&revs->bloom_keys[i]);
> - FREE_AND_NULL(revs->bloom_keys);
> - revs->bloom_keys_nr = 0;
> + for (int i = 0; i < revs->bloom_keyvecs_nr; i++)
It's puzzling that the number of keys is declared as `int`. It's not an
issue introduced by you, but can we maybe fix it while at it?
Patrick
next prev parent reply other threads:[~2025-07-02 15:08 UTC|newest]
Thread overview: 72+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-06-25 12:55 [PATCH 0/2] bloom: use bloom filter given multiple pathspec Lidong Yan
2025-06-25 12:55 ` [PATCH 1/2] bloom: replace struct bloom_key * with struct bloom_keyvec Lidong Yan
2025-06-25 17:43 ` Junio C Hamano
2025-06-26 3:44 ` Lidong Yan
2025-06-25 12:55 ` [PATCH 2/2] bloom: enable multiple pathspec bloom keys Lidong Yan
2025-06-27 13:50 ` Junio C Hamano
2025-06-27 14:24 ` Lidong Yan
2025-06-27 18:09 ` Junio C Hamano
2025-07-01 5:52 ` Lidong Yan
2025-07-01 15:19 ` Junio C Hamano
2025-07-02 7:14 ` Lidong Yan
2025-07-02 15:48 ` Junio C Hamano
2025-07-03 1:52 ` Lidong Yan
2025-07-04 12:09 ` Lidong Yan
2025-07-01 8:50 ` SZEDER Gábor
2025-07-01 11:40 ` Lidong Yan
2025-07-01 15:43 ` Junio C Hamano
2025-06-27 20:39 ` Junio C Hamano
2025-06-28 2:54 ` Lidong Yan
2025-06-25 17:32 ` [PATCH 0/2] bloom: use bloom filter given multiple pathspec Junio C Hamano
2025-06-26 3:34 ` Lidong Yan
2025-06-26 14:15 ` Junio C Hamano
2025-06-27 6:21 ` [PATCH v2 0/2] bloom: enable bloom filter optimization for multiple pathspec elements in revision traversal Lidong Yan
2025-06-28 4:21 ` [PATCH v3 " Lidong Yan
2025-07-04 11:14 ` [PATCH v4 0/4] " Lidong Yan
2025-07-04 11:14 ` [PATCH v4 1/4] bloom: add test helper to return murmur3 hash Lidong Yan
2025-07-04 11:14 ` [PATCH v4 2/4] bloom: rename function operates on bloom_key Lidong Yan
2025-07-04 11:14 ` [PATCH v4 3/4] bloom: replace struct bloom_key * with struct bloom_keyvec Lidong Yan
2025-07-07 11:35 ` Derrick Stolee
2025-07-07 14:14 ` Lidong Yan
2025-07-04 11:14 ` [PATCH v4 4/4] bloom: optimize multiple pathspec items in revision traversal Lidong Yan
2025-07-07 11:43 ` Derrick Stolee
2025-07-07 14:18 ` Lidong Yan
2025-07-07 15:14 ` Junio C Hamano
2025-07-10 8:48 ` [PATCH v5 0/4] bloom: enable bloom filter optimization for multiple pathspec elements " Lidong Yan
2025-07-10 8:48 ` [PATCH v5 1/4] bloom: add test helper to return murmur3 hash Lidong Yan
2025-07-10 8:48 ` [PATCH v5 2/4] bloom: rename function operates on bloom_key Lidong Yan
2025-07-10 8:48 ` [PATCH v5 3/4] bloom: replace struct bloom_key * with struct bloom_keyvec Lidong Yan
2025-07-10 16:17 ` Junio C Hamano
2025-07-11 12:46 ` Lidong Yan
2025-07-11 15:06 ` Junio C Hamano
2025-07-10 8:48 ` [PATCH v5 4/4] bloom: optimize multiple pathspec items in revision traversal Lidong Yan
2025-07-10 13:51 ` [PATCH v5.1 3.5/4] revision: make helper for pathspec to bloom key Derrick Stolee
2025-07-10 15:42 ` Lidong Yan
2025-07-10 13:55 ` [PATCH v5.1 4/4] bloom: optimize multiple pathspec items in revision Derrick Stolee
2025-07-10 15:49 ` Lidong Yan
2025-07-10 13:49 ` [PATCH v5 0/4] bloom: enable bloom filter optimization for multiple pathspec elements in revision traversal Derrick Stolee
2025-07-12 9:35 ` [PATCH v6 0/5] " Lidong Yan
2025-07-12 9:35 ` [PATCH v6 1/5] bloom: add test helper to return murmur3 hash Lidong Yan
2025-07-12 9:35 ` [PATCH v6 2/5] bloom: rename function operates on bloom_key Lidong Yan
2025-07-12 9:35 ` [PATCH v6 3/5] bloom: replace struct bloom_key * with struct bloom_keyvec Lidong Yan
2025-07-12 9:35 ` [PATCH v6 4/5] revision: make helper for pathspec to bloom keyvec Lidong Yan
2025-07-12 9:35 ` [PATCH v6 5/5] To enable optimize multiple pathspec items in revision traversal, return 0 if all pathspec item is literal in forbid_bloom_filters(). Add for loops to initialize and check each pathspec item's bloom_keyvec when optimization is possible Lidong Yan
2025-07-12 9:47 ` Lidong Yan
2025-07-12 9:51 ` [PATCH v6 5/5] bloom: optimize multiple pathspec items in revision Lidong Yan
2025-07-14 16:51 ` Derrick Stolee
2025-07-14 17:01 ` Junio C Hamano
2025-07-15 1:37 ` Lidong Yan
2025-07-15 2:56 ` [RESEND][PATCH " Lidong Yan
2025-07-14 16:53 ` [PATCH v6 0/5] bloom: enable bloom filter optimization for multiple pathspec elements in revision traversal Derrick Stolee
2025-07-14 17:02 ` Junio C Hamano
2025-07-15 1:34 ` Lidong Yan
2025-07-15 2:48 ` Derrick Stolee
2025-07-15 15:09 ` Junio C Hamano
2025-06-28 4:21 ` [PATCH v3 1/2] bloom: replace struct bloom_key * with struct bloom_keyvec Lidong Yan
2025-07-02 15:08 ` Patrick Steinhardt [this message]
2025-07-02 15:49 ` Lidong Yan
2025-07-02 18:28 ` Junio C Hamano
2025-07-03 1:41 ` Lidong Yan
2025-06-28 4:21 ` [PATCH v3 2/2] bloom: optimize multiple pathspec items in revision traversal Lidong Yan
2025-06-27 6:21 ` [PATCH v2 1/2] bloom: replace struct bloom_key * with struct bloom_keyvec Lidong Yan
2025-06-27 6:21 ` [PATCH v2 2/2] bloom: optimize multiple pathspec items in revision traversal Lidong Yan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aGVLZ9VUf2M1sWhL@pks.im \
--to=ps@pks.im \
--cc=502024330056@smail.nju.edu.cn \
--cc=git@vger.kernel.org \
--cc=yldhome2d2@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).