From: Patrick Steinhardt <ps@pks.im>
To: Lidong Yan <yldhome2d2@gmail.com>
Cc: git@vger.kernel.org, Lidong Yan <502024330056@smail.nju.edu.cn>
Subject: Re: [PATCH v3 1/2] bloom: replace struct bloom_key * with struct bloom_keyvec
Date: Wed, 2 Jul 2025 17:08:23 +0200 [thread overview]
Message-ID: <aGVLZ9VUf2M1sWhL@pks.im> (raw)
In-Reply-To: <20250628042140.1097910-2-502024330056@smail.nju.edu.cn>
On Sat, Jun 28, 2025 at 12:21:39PM +0800, Lidong Yan wrote:
> diff --git a/bloom.h b/bloom.h
> index 6e46489a20..9e4e832c8c 100644
> --- a/bloom.h
> +++ b/bloom.h
> @@ -74,6 +74,11 @@ struct bloom_key {
> uint32_t *hashes;
> };
>
> +struct bloom_keyvec {
> + size_t count;
> + struct bloom_key key[FLEX_ARRAY];
> +};
> +
A short comment would help readers understand what the intent of this
data structure is.
> int load_bloom_filter_from_graph(struct commit_graph *g,
> struct bloom_filter *filter,
> uint32_t graph_pos);
> @@ -100,6 +105,17 @@ void add_key_to_filter(const struct bloom_key *key,
> void init_bloom_filters(void);
> void deinit_bloom_filters(void);
>
> +struct bloom_keyvec *create_bloom_keyvec(size_t count);
> +void destroy_bloom_keyvec(struct bloom_keyvec *vec);
These functions are named very unusually for us -- the first version of
this patch series was following our coding guidelines, but this version
here isn't anymore.
- The primary data structure that a subsystem 'S' deals with is called
`struct S`. Functions that operate on `struct S` are named
`S_<verb>()` and should generally receive a pointer to `struct S` as
first parameter. E.g.
Second, the functions should probably be called `*_new()` and `*_free()`
instead of `create_*()` and `destroy_*()`.
> +static inline void fill_bloom_keyvec_key(const char *data, size_t len,
> + struct bloom_keyvec *vec, size_t nr,
> + const struct bloom_filter_settings *settings)
> +{
> + assert(nr < vec->count);
> + fill_bloom_key(data, len, &vec->key[nr], settings);
> +}
> +
Similarly, this should probably be called `bloom_keyvec_fill_key()`.
> enum bloom_filter_computed {
> BLOOM_NOT_COMPUTED = (1 << 0),
> BLOOM_COMPUTED = (1 << 1),
> @@ -137,4 +153,8 @@ int bloom_filter_contains(const struct bloom_filter *filter,
> const struct bloom_key *key,
> const struct bloom_filter_settings *settings);
>
> +int bloom_filter_contains_vec(const struct bloom_filter *filter,
> + const struct bloom_keyvec *v,
> + const struct bloom_filter_settings *settings);
> +
> #endif
This one looks alright though.
> diff --git a/revision.c b/revision.c
> index afee111196..3aa544c137 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -779,11 +782,8 @@ static int check_maybe_different_in_bloom_filter(struct rev_info *revs,
> return -1;
> }
>
> - for (j = 0; result && j < revs->bloom_keys_nr; j++) {
> - result = bloom_filter_contains(filter,
> - &revs->bloom_keys[j],
> - revs->bloom_filter_settings);
> - }
> + result = bloom_filter_contains_vec(filter, revs->bloom_keyvecs[0],
> + revs->bloom_filter_settings);
>
> if (result)
> count_bloom_filter_maybe++;
This conversion feels wrong to me. Why don't we end up iterating through
`revs->bloom_keyvecs_nr` here? We do indeed change it back in the next
patch to use a for loop.
> @@ -3230,10 +3230,10 @@ void release_revisions(struct rev_info *revs)
> line_log_free(revs);
> oidset_clear(&revs->missing_commits);
>
> - for (int i = 0; i < revs->bloom_keys_nr; i++)
> - clear_bloom_key(&revs->bloom_keys[i]);
> - FREE_AND_NULL(revs->bloom_keys);
> - revs->bloom_keys_nr = 0;
> + for (int i = 0; i < revs->bloom_keyvecs_nr; i++)
It's puzzling that the number of keys is declared as `int`. It's not an
issue introduced by you, but can we maybe fix it while at it?
Patrick
next prev parent reply other threads:[~2025-07-02 15:08 UTC|newest]
Thread overview: 72+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-06-25 12:55 [PATCH 0/2] bloom: use bloom filter given multiple pathspec Lidong Yan
2025-06-25 12:55 ` [PATCH 1/2] bloom: replace struct bloom_key * with struct bloom_keyvec Lidong Yan
2025-06-25 17:43 ` Junio C Hamano
2025-06-26 3:44 ` Lidong Yan
2025-06-25 12:55 ` [PATCH 2/2] bloom: enable multiple pathspec bloom keys Lidong Yan
2025-06-27 13:50 ` Junio C Hamano
2025-06-27 14:24 ` Lidong Yan
2025-06-27 18:09 ` Junio C Hamano
2025-07-01 5:52 ` Lidong Yan
2025-07-01 15:19 ` Junio C Hamano
2025-07-02 7:14 ` Lidong Yan
2025-07-02 15:48 ` Junio C Hamano
2025-07-03 1:52 ` Lidong Yan
2025-07-04 12:09 ` Lidong Yan
2025-07-01 8:50 ` SZEDER Gábor
2025-07-01 11:40 ` Lidong Yan
2025-07-01 15:43 ` Junio C Hamano
2025-06-27 20:39 ` Junio C Hamano
2025-06-28 2:54 ` Lidong Yan
2025-06-25 17:32 ` [PATCH 0/2] bloom: use bloom filter given multiple pathspec Junio C Hamano
2025-06-26 3:34 ` Lidong Yan
2025-06-26 14:15 ` Junio C Hamano
2025-06-27 6:21 ` [PATCH v2 0/2] bloom: enable bloom filter optimization for multiple pathspec elements in revision traversal Lidong Yan
2025-06-28 4:21 ` [PATCH v3 " Lidong Yan
2025-07-04 11:14 ` [PATCH v4 0/4] " Lidong Yan
2025-07-04 11:14 ` [PATCH v4 1/4] bloom: add test helper to return murmur3 hash Lidong Yan
2025-07-04 11:14 ` [PATCH v4 2/4] bloom: rename function operates on bloom_key Lidong Yan
2025-07-04 11:14 ` [PATCH v4 3/4] bloom: replace struct bloom_key * with struct bloom_keyvec Lidong Yan
2025-07-07 11:35 ` Derrick Stolee
2025-07-07 14:14 ` Lidong Yan
2025-07-04 11:14 ` [PATCH v4 4/4] bloom: optimize multiple pathspec items in revision traversal Lidong Yan
2025-07-07 11:43 ` Derrick Stolee
2025-07-07 14:18 ` Lidong Yan
2025-07-07 15:14 ` Junio C Hamano
2025-07-10 8:48 ` [PATCH v5 0/4] bloom: enable bloom filter optimization for multiple pathspec elements " Lidong Yan
2025-07-10 8:48 ` [PATCH v5 1/4] bloom: add test helper to return murmur3 hash Lidong Yan
2025-07-10 8:48 ` [PATCH v5 2/4] bloom: rename function operates on bloom_key Lidong Yan
2025-07-10 8:48 ` [PATCH v5 3/4] bloom: replace struct bloom_key * with struct bloom_keyvec Lidong Yan
2025-07-10 16:17 ` Junio C Hamano
2025-07-11 12:46 ` Lidong Yan
2025-07-11 15:06 ` Junio C Hamano
2025-07-10 8:48 ` [PATCH v5 4/4] bloom: optimize multiple pathspec items in revision traversal Lidong Yan
2025-07-10 13:51 ` [PATCH v5.1 3.5/4] revision: make helper for pathspec to bloom key Derrick Stolee
2025-07-10 15:42 ` Lidong Yan
2025-07-10 13:55 ` [PATCH v5.1 4/4] bloom: optimize multiple pathspec items in revision Derrick Stolee
2025-07-10 15:49 ` Lidong Yan
2025-07-10 13:49 ` [PATCH v5 0/4] bloom: enable bloom filter optimization for multiple pathspec elements in revision traversal Derrick Stolee
2025-07-12 9:35 ` [PATCH v6 0/5] " Lidong Yan
2025-07-12 9:35 ` [PATCH v6 1/5] bloom: add test helper to return murmur3 hash Lidong Yan
2025-07-12 9:35 ` [PATCH v6 2/5] bloom: rename function operates on bloom_key Lidong Yan
2025-07-12 9:35 ` [PATCH v6 3/5] bloom: replace struct bloom_key * with struct bloom_keyvec Lidong Yan
2025-07-12 9:35 ` [PATCH v6 4/5] revision: make helper for pathspec to bloom keyvec Lidong Yan
2025-07-12 9:35 ` [PATCH v6 5/5] To enable optimize multiple pathspec items in revision traversal, return 0 if all pathspec item is literal in forbid_bloom_filters(). Add for loops to initialize and check each pathspec item's bloom_keyvec when optimization is possible Lidong Yan
2025-07-12 9:47 ` Lidong Yan
2025-07-12 9:51 ` [PATCH v6 5/5] bloom: optimize multiple pathspec items in revision Lidong Yan
2025-07-14 16:51 ` Derrick Stolee
2025-07-14 17:01 ` Junio C Hamano
2025-07-15 1:37 ` Lidong Yan
2025-07-15 2:56 ` [RESEND][PATCH " Lidong Yan
2025-07-14 16:53 ` [PATCH v6 0/5] bloom: enable bloom filter optimization for multiple pathspec elements in revision traversal Derrick Stolee
2025-07-14 17:02 ` Junio C Hamano
2025-07-15 1:34 ` Lidong Yan
2025-07-15 2:48 ` Derrick Stolee
2025-07-15 15:09 ` Junio C Hamano
2025-06-28 4:21 ` [PATCH v3 1/2] bloom: replace struct bloom_key * with struct bloom_keyvec Lidong Yan
2025-07-02 15:08 ` Patrick Steinhardt [this message]
2025-07-02 15:49 ` Lidong Yan
2025-07-02 18:28 ` Junio C Hamano
2025-07-03 1:41 ` Lidong Yan
2025-06-28 4:21 ` [PATCH v3 2/2] bloom: optimize multiple pathspec items in revision traversal Lidong Yan
2025-06-27 6:21 ` [PATCH v2 1/2] bloom: replace struct bloom_key * with struct bloom_keyvec Lidong Yan
2025-06-27 6:21 ` [PATCH v2 2/2] bloom: optimize multiple pathspec items in revision traversal Lidong Yan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aGVLZ9VUf2M1sWhL@pks.im \
--to=ps@pks.im \
--cc=502024330056@smail.nju.edu.cn \
--cc=git@vger.kernel.org \
--cc=yldhome2d2@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.