* [RFC PATCH] Not computing changed path filter for root commits @ 2023-09-11 22:31 Jonathan Tan 2023-09-15 18:25 ` Junio C Hamano ` (2 more replies) 0 siblings, 3 replies; 10+ messages in thread From: Jonathan Tan @ 2023-09-11 22:31 UTC (permalink / raw) To: git; +Cc: Jonathan Tan, szeder.dev, me, derrickstolee This is following the discussion about adding a new changed path filter version due to the current implementation of murmur3 not matching the algorithm. [1] SZEDER Gábor suggested [2] that we change the revision walk to read changed path filters also for root commits, but I don't think that's possible - we have to tie reading changed path filters to when we read trees, and right now, we don't seem to read trees when evaluating root commits (rev_compare_tree() in revision.c is in the only code path that uses changed path filters, and it itself is only called per-parent and thus not called for root commits). The alternative is to not generate changed path filters for root commits (or what I did in this patch, which is to generate an all-1 filter), which seems reasonable to me. Is this a good idea? If yes, there is the follow-up question of how to report it in traces. I don't know off-hand if it's better to reuse the "large" statistic, since what we output is the same (all 1s), or if we should make a new one. Presumably what we need to weigh is the clarity versus the migration costs, but I don't know how to weigh it. If people are generally in agreement, I can send an updated patch that does not goto into an "else" block :) and also with updated tests (t0095 needs to check a non-root commit, and t4216 needs to be updated with whatever statistics we decide to use). [1] https://lore.kernel.org/git/20230826150610.GA1928@szeder.dev/ [2] https://lore.kernel.org/git/20230830200218.GA5147@szeder.dev/ --- bloom.c | 3 ++- t/t0095-bloom.sh | 2 +- t/t4216-log-bloom.sh | 12 +++--------- 3 files changed, 6 insertions(+), 11 deletions(-) diff --git a/bloom.c b/bloom.c index aef6b5fea2..b21130b236 100644 --- a/bloom.c +++ b/bloom.c @@ -226,7 +226,7 @@ struct bloom_filter *get_or_compute_bloom_filter(struct repository *r, if (c->parents) diff_tree_oid(&c->parents->item->object.oid, &c->object.oid, "", &diffopt); else - diff_tree_oid(NULL, &c->object.oid, "", &diffopt); + goto large; diffcore_std(&diffopt); if (diff_queued_diff.nr <= settings->max_changed_paths) { @@ -292,6 +292,7 @@ struct bloom_filter *get_or_compute_bloom_filter(struct repository *r, } else { for (i = 0; i < diff_queued_diff.nr; i++) diff_free_filepair(diff_queued_diff.queue[i]); +large: init_truncated_large_filter(filter); if (computed) diff --git a/t/t0095-bloom.sh b/t/t0095-bloom.sh index b567383eb8..02a0b41026 100755 --- a/t/t0095-bloom.sh +++ b/t/t0095-bloom.sh @@ -74,7 +74,7 @@ test_expect_success !SANITIZE_LEAK 'get bloom filters for commit with no changes git commit --allow-empty -m "c0" && cat >expect <<-\EOF && Filter_Length:1 - Filter_Data:00| + Filter_Data:ff| EOF test-tool bloom get_filter_for_commit "$(git rev-parse HEAD)" >actual && test_cmp expect actual diff --git a/t/t4216-log-bloom.sh b/t/t4216-log-bloom.sh index fa9d32facf..d14fe93fb1 100755 --- a/t/t4216-log-bloom.sh +++ b/t/t4216-log-bloom.sh @@ -283,7 +283,6 @@ test_expect_success 'correctly report changes over limit' ' git commit-graph write --reachable --changed-paths && test_max_changed_paths 11 trace-update && test_filter_computed 2 trace-update && - test_filter_trunc_large 0 trace-update && for path in $(git ls-tree -r --name-only HEAD) do @@ -306,9 +305,7 @@ test_expect_success 'correctly report commits with no changed paths' ' GIT_TRACE2_EVENT="$(pwd)/trace.event" \ git commit-graph write --reachable --changed-paths && test_filter_computed 1 trace.event && - test_filter_not_computed 0 trace.event && - test_filter_trunc_empty 1 trace.event && - test_filter_trunc_large 0 trace.event + test_filter_not_computed 0 trace.event ) ' @@ -363,8 +360,7 @@ test_expect_success '--max-new-filters overrides configuration' ' --max-new-filters=1 && test_filter_computed 1 trace.event && test_filter_not_computed 1 trace.event && - test_filter_trunc_empty 0 trace.event && - test_filter_trunc_large 0 trace.event + test_filter_trunc_empty 0 trace.event ) ' @@ -386,9 +382,7 @@ test_expect_success 'Bloom generation backfills empty commits' ' git commit-graph write --reachable \ --changed-paths --max-new-filters=2 && test_filter_computed 2 trace.event && - test_filter_not_computed 4 trace.event && - test_filter_trunc_empty 2 trace.event && - test_filter_trunc_large 0 trace.event || return 1 + test_filter_not_computed 4 trace.event || return 1 done && # Finally, make sure that once all commits have filters, that -- 2.42.0.283.g2d96d420d3-goog ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [RFC PATCH] Not computing changed path filter for root commits 2023-09-11 22:31 [RFC PATCH] Not computing changed path filter for root commits Jonathan Tan @ 2023-09-15 18:25 ` Junio C Hamano 2023-09-15 20:29 ` SZEDER Gábor 2023-09-19 18:21 ` Taylor Blau 2 siblings, 0 replies; 10+ messages in thread From: Junio C Hamano @ 2023-09-15 18:25 UTC (permalink / raw) To: Jonathan Tan; +Cc: git, szeder.dev, me, derrickstolee Jonathan Tan <jonathantanmy@google.com> writes: > This is following the discussion about adding a new changed path filter > version due to the current implementation of murmur3 not matching the > algorithm. [1] > > SZEDER Gábor suggested [2] that we change the revision walk to read > changed path filters also for root commits, but I don't think that's > possible - we have to tie reading changed path filters to when we read > trees, and right now, we don't seem to read trees when evaluating root > commits (rev_compare_tree() in revision.c is in the only code path that > uses changed path filters, and it itself is only called per-parent and > thus not called for root commits). The alternative is to not generate > changed path filters for root commits (or what I did in this patch, > which is to generate an all-1 filter), which seems reasonable to me. I know this is a very silly question, but if the filter is not read for root commits at runtime, does it matter if a filter is created for them beforehand (or not)? They will not be read whether if they exist or not, no? One observation in the thread [2] appears in was: In several of the above test cases test_bloom_filters_used is invoked in a repository with only a root commit, so they don't check that the output is the same with and without Bloom filters. i.e. the check would be ineffective with the current system that we know does not use the filter for a root commit even if it existed. But would it be an improvement to add a filter to a root commit and test with the filter enabled and disabled to compare the results, if we know the filter is not used anyway? ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC PATCH] Not computing changed path filter for root commits 2023-09-11 22:31 [RFC PATCH] Not computing changed path filter for root commits Jonathan Tan 2023-09-15 18:25 ` Junio C Hamano @ 2023-09-15 20:29 ` SZEDER Gábor 2023-09-19 18:19 ` Taylor Blau 2023-09-19 18:21 ` Taylor Blau 2 siblings, 1 reply; 10+ messages in thread From: SZEDER Gábor @ 2023-09-15 20:29 UTC (permalink / raw) To: Jonathan Tan; +Cc: git, me, derrickstolee On Mon, Sep 11, 2023 at 03:31:56PM -0700, Jonathan Tan wrote: > SZEDER Gábor suggested [2] that we change the revision walk to read > changed path filters also for root commits, but I don't think that's > possible - we have to tie reading changed path filters to when we read > trees, and right now, we don't seem to read trees when evaluating root > commits (rev_compare_tree() in revision.c is in the only code path that > uses changed path filters, and it itself is only called per-parent and > thus not called for root commits). When encountering a root commit during a pathspec-limited revision walk we call rev_same_tree_as_empty() instead of rev_compare_tree(). All that's missing there is checking the Bloom filter and accounting for false positives. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC PATCH] Not computing changed path filter for root commits 2023-09-15 20:29 ` SZEDER Gábor @ 2023-09-19 18:19 ` Taylor Blau 2023-10-02 22:55 ` Jonathan Tan 0 siblings, 1 reply; 10+ messages in thread From: Taylor Blau @ 2023-09-19 18:19 UTC (permalink / raw) To: SZEDER Gábor; +Cc: Jonathan Tan, git On Fri, Sep 15, 2023 at 10:29:12PM +0200, SZEDER Gábor wrote: > On Mon, Sep 11, 2023 at 03:31:56PM -0700, Jonathan Tan wrote: > > SZEDER Gábor suggested [2] that we change the revision walk to read > > changed path filters also for root commits, but I don't think that's > > possible - we have to tie reading changed path filters to when we read > > trees, and right now, we don't seem to read trees when evaluating root > > commits (rev_compare_tree() in revision.c is in the only code path that > > uses changed path filters, and it itself is only called per-parent and > > thus not called for root commits). > > When encountering a root commit during a pathspec-limited revision > walk we call rev_same_tree_as_empty() instead of rev_compare_tree(). > All that's missing there is checking the Bloom filter and accounting > for false positives. I think that we'd want something like this, though I would definitely appreciate a second set of eyes since I am not 100% confident in my set of changes here: --- 8< --- diff --git a/revision.c b/revision.c index 2f4c53ea20..1d36df49e2 100644 --- a/revision.c +++ b/revision.c @@ -837,14 +837,24 @@ static int rev_compare_tree(struct rev_info *revs, static int rev_same_tree_as_empty(struct rev_info *revs, struct commit *commit) { struct tree *t1 = repo_get_commit_tree(the_repository, commit); + int bloom_ret = 1; if (!t1) return 0; + if (revs->bloom_keys_nr) { + bloom_ret = check_maybe_different_in_bloom_filter(revs, commit); + if (!bloom_ret) + return 1; + } + tree_difference = REV_TREE_SAME; revs->pruning.flags.has_changes = 0; diff_tree_oid(NULL, &t1->object.oid, "", &revs->pruning); + if (bloom_ret == 1 && tree_difference == REV_TREE_SAME) + count_bloom_filter_false_positive++; + return tree_difference == REV_TREE_SAME; } diff --git a/t/t4216-log-bloom.sh b/t/t4216-log-bloom.sh index fa9d32facf..3a45cb997b 100755 --- a/t/t4216-log-bloom.sh +++ b/t/t4216-log-bloom.sh @@ -162,7 +162,7 @@ test_expect_success 'setup - add commit-graph to the chain with Bloom filters' ' test_bloom_filters_used_when_some_filters_are_missing () { log_args=$1 - bloom_trace_prefix="statistics:{\"filter_not_present\":3,\"maybe\":6,\"definitely_not\":9" + bloom_trace_prefix="statistics:{\"filter_not_present\":3,\"maybe\":6,\"definitely_not\":10" setup "$log_args" && grep -q "$bloom_trace_prefix" "$TRASH_DIRECTORY/trace.perf" && test_cmp log_wo_bloom log_w_bloom --- >8 --- Thanks, Taylor ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [RFC PATCH] Not computing changed path filter for root commits 2023-09-19 18:19 ` Taylor Blau @ 2023-10-02 22:55 ` Jonathan Tan 2023-10-09 17:20 ` Taylor Blau 0 siblings, 1 reply; 10+ messages in thread From: Jonathan Tan @ 2023-10-02 22:55 UTC (permalink / raw) To: Taylor Blau; +Cc: Jonathan Tan, SZEDER Gábor, git Taylor Blau <me@ttaylorr.com> writes: > diff --git a/revision.c b/revision.c > index 2f4c53ea20..1d36df49e2 100644 > --- a/revision.c > +++ b/revision.c > @@ -837,14 +837,24 @@ static int rev_compare_tree(struct rev_info *revs, > static int rev_same_tree_as_empty(struct rev_info *revs, struct commit *commit) > { > struct tree *t1 = repo_get_commit_tree(the_repository, commit); > + int bloom_ret = 1; > > if (!t1) > return 0; > > + if (revs->bloom_keys_nr) { > + bloom_ret = check_maybe_different_in_bloom_filter(revs, commit); > + if (!bloom_ret) > + return 1; > + } > + > tree_difference = REV_TREE_SAME; > revs->pruning.flags.has_changes = 0; > diff_tree_oid(NULL, &t1->object.oid, "", &revs->pruning); > > + if (bloom_ret == 1 && tree_difference == REV_TREE_SAME) > + count_bloom_filter_false_positive++; > + > return tree_difference == REV_TREE_SAME; > } I'll concentrate on getting this patch in, and will look at (and discuss) the other Bloom filter-related emails later. This looks good, possibly except a code path in try_to_simplify_commit() that calls this rev_same_tree_as_empty() function when rev_compare_tree() between a commit and its parent returns REV_TREE_NEW. So there are 2 issues: How can rev_compare_tree() ever return REV_TREE_NEW? And it doesn't seem right to check Bloom filters in this code path, since rev_same_tree_as_empty() was invoked here while we are enumerating through a commit's parents, which necessarily implies that the commit has parents, but here we're using the Bloom filter as if the commit is known to have no parents. As for the first issue, rev_compare_tree() returns REV_TREE_NEW when the parent's tree is NULL. I'm not sure how this can happen - the tree can be NULL if the parent commit is not parsed, but at this point I think that it has been parsed. And I think every commit has a tree. This goes back all the way to 3a5e860815 (revision: make tree comparison functions take commits rather than trees, 2008-11-03) and even beyond that (I didn't dig further). As for the second issue, we can probably solve this by being defensive in rev_same_tree_as_empty() by only using the Bloom filter when the commit has no parents. Not sure if this is being overly defensive, though. There is also the issue that count_bloom_filter_false_positive is incremented even when no Bloom filters are present, but I think this is fine (it matches the behavior of rev_compare_tree()). ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC PATCH] Not computing changed path filter for root commits 2023-10-02 22:55 ` Jonathan Tan @ 2023-10-09 17:20 ` Taylor Blau 2023-10-09 17:26 ` Taylor Blau 0 siblings, 1 reply; 10+ messages in thread From: Taylor Blau @ 2023-10-09 17:20 UTC (permalink / raw) To: Jonathan Tan; +Cc: SZEDER Gábor, git On Mon, Oct 02, 2023 at 03:55:46PM -0700, Jonathan Tan wrote: > Taylor Blau <me@ttaylorr.com> writes: > > diff --git a/revision.c b/revision.c > > index 2f4c53ea20..1d36df49e2 100644 > > --- a/revision.c > > +++ b/revision.c > > @@ -837,14 +837,24 @@ static int rev_compare_tree(struct rev_info *revs, > > static int rev_same_tree_as_empty(struct rev_info *revs, struct commit *commit) > > { > > struct tree *t1 = repo_get_commit_tree(the_repository, commit); > > + int bloom_ret = 1; > > > > if (!t1) > > return 0; > > > > + if (revs->bloom_keys_nr) { > > + bloom_ret = check_maybe_different_in_bloom_filter(revs, commit); > > + if (!bloom_ret) > > + return 1; > > + } > > + > > tree_difference = REV_TREE_SAME; > > revs->pruning.flags.has_changes = 0; > > diff_tree_oid(NULL, &t1->object.oid, "", &revs->pruning); > > > > + if (bloom_ret == 1 && tree_difference == REV_TREE_SAME) > > + count_bloom_filter_false_positive++; > > + > > return tree_difference == REV_TREE_SAME; > > } > > I'll concentrate on getting this patch in, and will look at (and > discuss) the other Bloom filter-related emails later. Sounds good. I know that I have some pending mail from SZEDER as well, so I'll try and apply any feedback from both of you before sending a reroll so that we can get this all done in one shot. > This looks good, possibly except a code path in try_to_simplify_commit() > that calls this rev_same_tree_as_empty() function when > rev_compare_tree() between a commit and its parent returns REV_TREE_NEW. > So there are 2 issues: How can rev_compare_tree() ever return > REV_TREE_NEW? And it doesn't seem right to check Bloom filters in this > code path, since rev_same_tree_as_empty() was invoked here while we are > enumerating through a commit's parents, which necessarily implies that > the commit has parents, but here we're using the Bloom filter as if the > commit is known to have no parents. > As for the first issue, rev_compare_tree() returns REV_TREE_NEW when the > parent's tree is NULL. I'm not sure how this can happen - the tree can > be NULL if the parent commit is not parsed, but at this point I think > that it has been parsed. And I think every commit has a tree. This goes > back all the way to 3a5e860815 (revision: make tree comparison functions > take commits rather than trees, 2008-11-03) and even beyond that (I > didn't dig further). The more I think about this, the more confused I become ;-). I was able to track this all the way back to Linus's 461cf59f89 (rev-list: stop when the file disappears, 2006-01-18), but am convinced that both of these cases (we have an analogous one for when t2 is NULL and we return REV_TREE_OLD) are dead code. I replaced the "if (!t1) return REV_TREE_NEW;" with "if (!t1) BUG("oops")", and was able to get the whole test suite to pass. So... I am pretty sure that this is dead code, but not sure enough to remove it myself ;-). To your point about seeing the tree as NULL before parsing the parent, I don't think that is the case here, since we parse the parent immediately before calling rev_compare_tree() (and indeed Git will refuse to read commit objects which do not list a tree). > As for the second issue, we can probably solve this by being defensive > in rev_same_tree_as_empty() by only using the Bloom filter when the > commit has no parents. Not sure if this is being overly defensive, > though. I am also unsure whether we are being overly defensive here or not. But I agree that it does feel safer to apply something like: --- 8< --- diff --git a/revision.c b/revision.c index 3d78ea6a9a..21b3085465 100644 --- a/revision.c +++ b/revision.c @@ -834,7 +834,8 @@ static int rev_compare_tree(struct rev_info *revs, return tree_difference; } -static int rev_same_tree_as_empty(struct rev_info *revs, struct commit *commit) +static int rev_same_tree_as_empty(struct rev_info *revs, struct commit *commit, + int use_bloom_filter) { struct tree *t1 = repo_get_commit_tree(the_repository, commit); int bloom_ret = 1; @@ -842,7 +843,7 @@ static int rev_same_tree_as_empty(struct rev_info *revs, struct commit *commit) if (!t1) return 0; - if (revs->bloom_keys_nr) { + if (use_bloom_filter && revs->bloom_keys_nr) { bloom_ret = check_maybe_different_in_bloom_filter(revs, commit); if (!bloom_ret) return 1; @@ -892,7 +893,7 @@ static int compact_treesame(struct rev_info *revs, struct commit *commit, unsign if (nth_parent != 0) die("compact_treesame %u", nth_parent); old_same = !!(commit->object.flags & TREESAME); - if (rev_same_tree_as_empty(revs, commit)) + if (rev_same_tree_as_empty(revs, commit, 1)) commit->object.flags |= TREESAME; else commit->object.flags &= ~TREESAME; @@ -988,7 +989,7 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit) return; if (!commit->parents) { - if (rev_same_tree_as_empty(revs, commit)) + if (rev_same_tree_as_empty(revs, commit, 1)) commit->object.flags |= TREESAME; return; } @@ -1069,7 +1070,7 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit) case REV_TREE_NEW: if (revs->remove_empty_trees && - rev_same_tree_as_empty(revs, p)) { + rev_same_tree_as_empty(revs, p, 0)) { /* We are adding all the specified * paths from this parent, so the * history beyond this parent is not --- >8 --- on top. > There is also the issue that count_bloom_filter_false_positive is > incremented even when no Bloom filters are present, but I think this is > fine (it matches the behavior of rev_compare_tree()). Yep, agreed. Thanks, Taylor ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [RFC PATCH] Not computing changed path filter for root commits 2023-10-09 17:20 ` Taylor Blau @ 2023-10-09 17:26 ` Taylor Blau 2023-10-09 20:59 ` Jonathan Tan 0 siblings, 1 reply; 10+ messages in thread From: Taylor Blau @ 2023-10-09 17:26 UTC (permalink / raw) To: Jonathan Tan; +Cc: SZEDER Gábor, git On Mon, Oct 09, 2023 at 01:20:31PM -0400, Taylor Blau wrote: > On Mon, Oct 02, 2023 at 03:55:46PM -0700, Jonathan Tan wrote: > > As for the second issue, we can probably solve this by being defensive > > in rev_same_tree_as_empty() by only using the Bloom filter when the > > commit has no parents. Not sure if this is being overly defensive, > > though. > > I am also unsure whether we are being overly defensive here or not. But > I agree that it does feel safer to apply something like: Never mind, we are being overly defensive there. The goal here is to avoid using a commit's Bloom filter in cases where we are acting as if a commit is at the root of history, but in fact has parents. This only happens when we return REV_TREE_NEW from a call to `rev_compare_tree(revs, p, commit, nth_parent)`. But we'll only get REV_TREE_NEW back if repo_get_commit_tree(the_repository, p); returns NULL. But when we call rev_same_tree_as_empty(revs, p) in the REV_TREE_NEW case, we return early as follows: struct tree *t1 = repo_get_commit_tree(revs, p); if (!t1) return 0; So we won't even consult the Bloom filter in that case, since t1 is NULL for the same reason as what caused rev_compare_tree() to return REV_TREE_NEW in the first place. I am still dumbfounded by how we would ever get REV_TREE_NEW in the first place, but if we did, I think we would be OK here. Thanks, Taylor ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC PATCH] Not computing changed path filter for root commits 2023-10-09 17:26 ` Taylor Blau @ 2023-10-09 20:59 ` Jonathan Tan 2023-10-10 19:47 ` Taylor Blau 0 siblings, 1 reply; 10+ messages in thread From: Jonathan Tan @ 2023-10-09 20:59 UTC (permalink / raw) To: Taylor Blau; +Cc: Jonathan Tan, SZEDER Gábor, git Taylor Blau <me@ttaylorr.com> writes: > This only happens when we return REV_TREE_NEW from a call to > `rev_compare_tree(revs, p, commit, nth_parent)`. But we'll only get > REV_TREE_NEW back if > > repo_get_commit_tree(the_repository, p); > > returns NULL. But when we call rev_same_tree_as_empty(revs, p) in the > REV_TREE_NEW case, we return early as follows: > > struct tree *t1 = repo_get_commit_tree(revs, p); > if (!t1) > return 0; > > So we won't even consult the Bloom filter in that case, since t1 is NULL > for the same reason as what caused rev_compare_tree() to return > REV_TREE_NEW in the first place. > > I am still dumbfounded by how we would ever get REV_TREE_NEW in the > first place, but if we did, I think we would be OK here. > > Thanks, > Taylor Ah, good point. Your patch in https://lore.kernel.org/git/ZQnmTXUO94%2FQy8mq@nand.local/ looks good to me, then. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC PATCH] Not computing changed path filter for root commits 2023-10-09 20:59 ` Jonathan Tan @ 2023-10-10 19:47 ` Taylor Blau 0 siblings, 0 replies; 10+ messages in thread From: Taylor Blau @ 2023-10-10 19:47 UTC (permalink / raw) To: Jonathan Tan; +Cc: SZEDER Gábor, git On Mon, Oct 09, 2023 at 01:59:25PM -0700, Jonathan Tan wrote: > Taylor Blau <me@ttaylorr.com> writes: > > This only happens when we return REV_TREE_NEW from a call to > > `rev_compare_tree(revs, p, commit, nth_parent)`. But we'll only get > > REV_TREE_NEW back if > > > > repo_get_commit_tree(the_repository, p); > > > > returns NULL. But when we call rev_same_tree_as_empty(revs, p) in the > > REV_TREE_NEW case, we return early as follows: > > > > struct tree *t1 = repo_get_commit_tree(revs, p); > > if (!t1) > > return 0; > > > > So we won't even consult the Bloom filter in that case, since t1 is NULL > > for the same reason as what caused rev_compare_tree() to return > > REV_TREE_NEW in the first place. > > > > I am still dumbfounded by how we would ever get REV_TREE_NEW in the > > first place, but if we did, I think we would be OK here. > > > > Thanks, > > Taylor > > Ah, good point. Your patch in > https://lore.kernel.org/git/ZQnmTXUO94%2FQy8mq@nand.local/ looks good to > me, then. Oops, I made a mistake in the quoted portion, which is that we could get REV_TREE_NEW if the tree-diff itself only adds files. This is the non-trivial case that we get when t1 is non-NULL, and we end up calling `diff_tree_oid()` which sets the static `tree_difference` variable. So I think adding an nth_parent field (like you originally suggested[^1]) makes sense. Thanks, Taylor [^1]: Thanks for being patient with me ;-). ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC PATCH] Not computing changed path filter for root commits 2023-09-11 22:31 [RFC PATCH] Not computing changed path filter for root commits Jonathan Tan 2023-09-15 18:25 ` Junio C Hamano 2023-09-15 20:29 ` SZEDER Gábor @ 2023-09-19 18:21 ` Taylor Blau 2 siblings, 0 replies; 10+ messages in thread From: Taylor Blau @ 2023-09-19 18:21 UTC (permalink / raw) To: Jonathan Tan; +Cc: git, szeder.dev, derrickstolee On Mon, Sep 11, 2023 at 03:31:56PM -0700, Jonathan Tan wrote: > SZEDER Gábor suggested [2] that we change the revision walk to read > changed path filters also for root commits, but I don't think that's > possible - we have to tie reading changed path filters to when we read > trees, and right now, we don't seem to read trees when evaluating root > commits (rev_compare_tree() in revision.c is in the only code path that > uses changed path filters, and it itself is only called per-parent and > thus not called for root commits). The alternative is to not generate > changed path filters for root commits (or what I did in this patch, > which is to generate an all-1 filter), which seems reasonable to me. I think between the two, the all-1's filter is the more sensible choice, since not computing a filter is typically reserved for blowing past the `commitGraph.maxNewFilters` setting. But, I agree with Gábor down-thread that we could instead teach `rev_same_tree_as_empty()` to be aware of Bloom filters, which I think would accomplish our goal of reading Bloom filters at the root commit while not having to tweak their generation. Thanks, Taylor ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2023-10-10 19:47 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-09-11 22:31 [RFC PATCH] Not computing changed path filter for root commits Jonathan Tan 2023-09-15 18:25 ` Junio C Hamano 2023-09-15 20:29 ` SZEDER Gábor 2023-09-19 18:19 ` Taylor Blau 2023-10-02 22:55 ` Jonathan Tan 2023-10-09 17:20 ` Taylor Blau 2023-10-09 17:26 ` Taylor Blau 2023-10-09 20:59 ` Jonathan Tan 2023-10-10 19:47 ` Taylor Blau 2023-09-19 18:21 ` Taylor Blau
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).