From: Derrick Stolee <derrickstolee@github.com>
To: Taylor Blau <me@ttaylorr.com>, Jonathan Tan <jonathantanmy@google.com>
Cc: Junio C Hamano <gitster@pobox.com>, git@vger.kernel.org
Subject: Re: [PATCH v6 0/7] Changed path filter hash fix and version bump
Date: Thu, 3 Aug 2023 09:18:11 -0400 [thread overview]
Message-ID: <491c237b-de23-52b0-c034-d4d358372864@github.com> (raw)
In-Reply-To: <ZMruSSAGQWS8crvi@nand.local>
On 8/2/2023 8:01 PM, Taylor Blau wrote:
> On Tue, Aug 01, 2023 at 02:08:50PM -0400, Taylor Blau wrote:
>> That's a good point. I think in general I'd expect Git to avoid
>> recomputing Bloom filters where that work can be avoided, if the work in
>> order to detect whether or not we need to recompute a filter is cheap
>> enough to carry out.
>
> I spent some time implementing this (patches are available in the branch
> 'tb/path-filter-fix-upgrade' from my fork). Handling incompatible Bloom
> filter versions is kind of tricky, but do-able without having to
> implement too much on top of what's already there.
>
> But I don't think that it's enough to say that we can reuse a commit's
> Bloom filter iff that commit's tree has no paths with characters >=
> 0x80. Suppose we have such a tree, whose Bloom filter we believe to be
> reusable. If its first parent *does* have such a path, then that path
> would appear as a deletion relative to its first parent. So that path
> *would* be in the filter, meaning that it isn't reusable.
>
> So I think the revised condition is something like: a commit's Bloom
> filter is reusable when there are no paths with characters >= 0x80 in
> a tree-diff against its first parent. I think that ensuring that there
> are no such paths in both a commit's root tree, as well as its first
> parent's root tree is equivalent, since that removes the possibility of
> such a path showing up in its tree-diff.
This condition is exactly "we computed the diff to know which paths were
input to the filter" which is as difficult as recomputing the Bloom filter
from scratch. I don't think there is much room to gain a performance
improvement here.
Thanks,
-Stolee
next prev parent reply other threads:[~2023-08-03 13:18 UTC|newest]
Thread overview: 116+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-22 21:48 [PATCH 0/2] Changed path filter hash fix and version bump Jonathan Tan
2023-05-22 21:48 ` [PATCH 1/2] t4216: test wrong bloom filter version rejection Jonathan Tan
2023-05-22 21:48 ` [PATCH 2/2] commit-graph: fix murmur3, bump filter ver. to 2 Jonathan Tan
2023-05-23 13:00 ` Derrick Stolee
2023-05-23 23:00 ` Jonathan Tan
2023-05-23 23:51 ` Junio C Hamano
2023-05-24 21:26 ` Jonathan Tan
2023-05-26 13:19 ` Derrick Stolee
2023-05-30 17:26 ` Jonathan Tan
2023-05-23 4:42 ` [PATCH 0/2] Changed path filter hash fix and version bump Junio C Hamano
2023-05-31 23:12 ` [PATCH v2 0/3] " Jonathan Tan
2023-05-31 23:12 ` [PATCH v2 1/3] t4216: test changed path filters with high bit paths Jonathan Tan
2023-05-31 23:12 ` [PATCH v2 2/3] repo-settings: introduce commitgraph.changedPathsVersion Jonathan Tan
2023-05-31 23:12 ` [PATCH v2 3/3] commit-graph: new filter ver. that fixes murmur3 Jonathan Tan
2023-06-03 1:01 ` [PATCH v2 0/3] Changed path filter hash fix and version bump Junio C Hamano
2023-06-03 2:24 ` Junio C Hamano
2023-06-07 16:30 ` Jonathan Tan
2023-06-07 21:37 ` Jonathan Tan
2023-06-08 19:21 ` [PATCH v3 0/4] " Jonathan Tan
2023-06-08 19:21 ` [PATCH v3 1/4] gitformat-commit-graph: describe version 2 of BDAT Jonathan Tan
2023-06-08 19:52 ` Ramsay Jones
2023-06-12 21:26 ` Junio C Hamano
2023-06-08 19:21 ` [PATCH v3 2/4] t4216: test changed path filters with high bit paths Jonathan Tan
2023-06-08 19:21 ` [PATCH v3 3/4] repo-settings: introduce commitgraph.changedPathsVersion Jonathan Tan
2023-06-08 19:21 ` [PATCH v3 4/4] commit-graph: new filter ver. that fixes murmur3 Jonathan Tan
2023-06-08 19:50 ` [PATCH v3 0/4] Changed path filter hash fix and version bump Ramsay Jones
2023-06-09 0:08 ` Jonathan Tan
2023-06-12 21:31 ` Junio C Hamano
2023-06-13 17:16 ` Jonathan Tan
2023-06-13 17:29 ` [PATCH] CodingGuidelines: use octal escapes, not hex Jonathan Tan
2023-06-13 18:16 ` Eric Sunshine
2023-06-13 18:43 ` Jonathan Tan
2023-06-13 19:15 ` Eric Sunshine
2023-06-13 19:29 ` Junio C Hamano
2023-06-13 19:16 ` [PATCH v3 0/4] Changed path filter hash fix and version bump Junio C Hamano
2023-06-13 17:39 ` [PATCH v4 " Jonathan Tan
2023-06-13 17:39 ` [PATCH v4 1/4] gitformat-commit-graph: describe version 2 of BDAT Jonathan Tan
2023-06-13 21:58 ` Junio C Hamano
2023-06-20 13:22 ` Derrick Stolee
2023-06-21 12:08 ` Taylor Blau
2023-06-22 22:26 ` Jonathan Tan
2023-06-23 13:05 ` Derrick Stolee
2023-06-13 17:39 ` [PATCH v4 2/4] t4216: test changed path filters with high bit paths Jonathan Tan
2023-06-13 17:39 ` [PATCH v4 3/4] repo-settings: introduce commitgraph.changedPathsVersion Jonathan Tan
2023-06-20 13:28 ` Derrick Stolee
2023-06-21 12:14 ` Taylor Blau
2023-06-13 17:39 ` [PATCH v4 4/4] commit-graph: new filter ver. that fixes murmur3 Jonathan Tan
2023-06-20 13:39 ` Derrick Stolee
2023-06-20 18:37 ` Junio C Hamano
2023-06-13 19:21 ` [PATCH v4 0/4] Changed path filter hash fix and version bump Junio C Hamano
2023-06-20 13:43 ` Derrick Stolee
2023-06-20 21:56 ` Jonathan Tan
2023-06-21 12:19 ` Taylor Blau
2023-06-21 17:53 ` Derrick Stolee
2023-06-22 22:27 ` Jonathan Tan
2023-06-23 13:18 ` Derrick Stolee
2023-07-13 21:42 ` [PATCH v5 " Jonathan Tan
2023-07-13 21:42 ` [PATCH v5 1/4] gitformat-commit-graph: describe version 2 of BDAT Jonathan Tan
2023-07-19 17:25 ` Taylor Blau
2023-07-20 20:20 ` Jonathan Tan
2023-07-21 1:38 ` Taylor Blau
2023-07-13 21:42 ` [PATCH v5 2/4] t4216: test changed path filters with high bit paths Jonathan Tan
2023-07-13 22:50 ` Junio C Hamano
2023-07-19 17:27 ` Taylor Blau
2023-07-19 17:55 ` [PATCH 0/4] commit-graph: avoid looking at Bloom filter data directly Taylor Blau
2023-07-19 17:55 ` [PATCH 1/4] t/helper/test-read-graph.c: extract `dump_graph_info()` Taylor Blau
2023-07-19 17:55 ` [PATCH 2/4] bloom.h: make `load_bloom_filter_from_graph()` public Taylor Blau
2023-07-19 17:55 ` [PATCH 3/4] t/helper/test-read-graph: implement `bloom-filters` mode Taylor Blau
2023-07-19 17:55 ` [PATCH 4/4] fixup! t4216: test changed path filters with high bit paths Taylor Blau
2023-07-19 19:24 ` [PATCH 0/4] commit-graph: avoid looking at Bloom filter data directly Junio C Hamano
2023-07-20 20:22 ` Jonathan Tan
2023-07-13 21:42 ` [PATCH v5 3/4] repo-settings: introduce commitgraph.changedPathsVersion Jonathan Tan
2023-07-19 18:10 ` Taylor Blau
2023-07-20 20:42 ` Jonathan Tan
2023-07-20 21:02 ` Taylor Blau
2023-07-13 21:42 ` [PATCH v5 4/4] commit-graph: new filter ver. that fixes murmur3 Jonathan Tan
2023-07-19 18:24 ` Taylor Blau
2023-07-20 21:27 ` Jonathan Tan
2023-07-26 23:32 ` Taylor Blau
2023-07-13 22:16 ` [PATCH v5 0/4] Changed path filter hash fix and version bump Junio C Hamano
2023-07-13 22:59 ` Junio C Hamano
2023-07-14 18:48 ` Jonathan Tan
2023-07-20 21:46 ` [PATCH v6 0/7] " Jonathan Tan
2023-07-20 21:46 ` [PATCH v6 1/7] gitformat-commit-graph: describe version 2 of BDAT Jonathan Tan
2023-07-20 21:46 ` [PATCH v6 2/7] t/helper/test-read-graph.c: extract `dump_graph_info()` Jonathan Tan
2023-07-26 23:26 ` Taylor Blau
2023-07-20 21:46 ` [PATCH v6 3/7] bloom.h: make `load_bloom_filter_from_graph()` public Jonathan Tan
2023-07-20 21:46 ` [PATCH v6 4/7] t/helper/test-read-graph: implement `bloom-filters` mode Jonathan Tan
2023-07-20 21:46 ` [PATCH v6 5/7] t4216: test changed path filters with high bit paths Jonathan Tan
2023-07-26 23:28 ` Taylor Blau
2023-07-20 21:46 ` [PATCH v6 6/7] repo-settings: introduce commitgraph.changedPathsVersion Jonathan Tan
2023-07-20 21:46 ` [PATCH v6 7/7] commit-graph: new filter ver. that fixes murmur3 Jonathan Tan
2023-07-25 20:52 ` [PATCH v6 0/7] Changed path filter hash fix and version bump Junio C Hamano
2023-07-26 20:39 ` Junio C Hamano
2023-07-27 0:17 ` Taylor Blau
2023-07-27 0:49 ` Junio C Hamano
2023-07-27 17:39 ` Jonathan Tan
2023-07-27 17:56 ` Taylor Blau
2023-07-27 20:53 ` Jonathan Tan
2023-08-01 18:08 ` Taylor Blau
2023-08-01 18:52 ` Jonathan Tan
2023-08-03 0:01 ` Taylor Blau
2023-08-03 13:18 ` Derrick Stolee [this message]
2023-08-03 18:45 ` Taylor Blau
2023-07-27 18:44 ` Junio C Hamano
2023-08-01 18:41 ` [PATCH v7 " Jonathan Tan
2023-08-01 18:41 ` [PATCH v7 1/7] gitformat-commit-graph: describe version 2 of BDAT Jonathan Tan
2023-08-01 18:41 ` [PATCH v7 2/7] t/helper/test-read-graph.c: extract `dump_graph_info()` Jonathan Tan
2023-08-01 18:41 ` [PATCH v7 3/7] bloom.h: make `load_bloom_filter_from_graph()` public Jonathan Tan
2023-08-01 18:41 ` [PATCH v7 4/7] t/helper/test-read-graph: implement `bloom-filters` mode Jonathan Tan
2023-08-01 18:41 ` [PATCH v7 5/7] t4216: test changed path filters with high bit paths Jonathan Tan
2023-08-01 18:41 ` [PATCH v7 6/7] repo-settings: introduce commitgraph.changedPathsVersion Jonathan Tan
2023-08-01 18:41 ` [PATCH v7 7/7] commit-graph: new filter ver. that fixes murmur3 Jonathan Tan
2023-08-01 18:44 ` [PATCH v7 0/7] Changed path filter hash fix and version bump Junio C Hamano
2023-08-01 20:58 ` Taylor Blau
2023-08-01 21:07 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=491c237b-de23-52b0-c034-d4d358372864@github.com \
--to=derrickstolee@github.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jonathantanmy@google.com \
--cc=me@ttaylorr.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.