From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: me@ttaylorr.com, jnareb@gmail.com, garimasigit@gmail.com,
Derrick Stolee <dstolee@microsoft.com>
Subject: [PATCH v3 0/3] Integrate changed-path Bloom filters with 'git blame'
Date: Thu, 16 Apr 2020 20:14:01 +0000 [thread overview]
Message-ID: <pull.609.v3.git.1587068044.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.609.v2.git.1586789126.gitgitgadget@gmail.com>
If the changed-path Bloom filters are relatively stable, then I propose
trying to build upon them as a way to discover any deficiencies. Also, it's
good to use them when we can.
The goal I set out to achieve was to use Bloom filters as much as possible
in git blame. I think I have achieved most of that, except I did not
integrate it with the -C mode. In that case, the blob-diff computation takes
a majority of the computation time, so short-circuiting the tree diff using
Bloom filters. Also, it's just really complicated. If someone else thinks
there is an easy win there, then please go ahead and give it a try with the
extra logic presented here in PATCH 3.
While I was playing around with Bloom filters and git blame, I purposefully
got it working with some scenarios but not all. Then I tried to trigger a
failing build in the blame tests using GIT_TEST_COMMIT_GRAPH=1 and
GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=1. But the tests all succeeded!
Examining the data, I see that the commit-graph didn't have the Bloom filter
chunks at all. This is because we are not setting the flag to write them in
the right spot. The GIT_TEST_COMMIT_GRAPH=1 variable triggers a commit-graph
write during git commit, so we need to update the code there instead of just
inspecting the variable in git commit-graph write. (This is PATCH 2.)
By updating this variable, I saw some test failures in other tests regarding
non-exact pathspecs. I fixed these in PATCH 1 so we keep clean builds.
I based this change on [1] but it would apply cleanly (and logically) on
gs/commit-graph-path-filter
Updates in v2:
* Added PATCH 3 to write commit-graph files during 'git merge' if
GIT_TEST_COMMIT_GRAPH is enabled.
* Updated PATCH 1 with the simplification recommended by Taylor.
* Fixed the lower-case "bloom" in the commit message.
Thanks, -Stolee
[1]
https://lore.kernel.org/git/pull.601.v2.git.1586437211842.gitgitgadget@gmail.com/
Derrick Stolee (3):
revision: complicated pathspecs disable filters
tests: write commit-graph with Bloom filters
blame: use changed-path Bloom filters
blame.c | 139 ++++++++++++++++++++++++++++++++++++++++++++---
blame.h | 6 ++
builtin/blame.c | 10 ++++
builtin/commit.c | 4 +-
builtin/merge.c | 7 ++-
commit-graph.c | 14 +++++
commit-graph.h | 9 +++
revision.c | 19 ++++++-
8 files changed, 193 insertions(+), 15 deletions(-)
base-commit: f4df00a0dd448edce0e854a97f63598fefe27d27
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-609%2Fderrickstolee%2Fbloom-blame-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-609/derrickstolee/bloom-blame-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/609
Range-diff vs v2:
1: adc03eee4ac = 1: adc03eee4ac revision: complicated pathspecs disable filters
2: 7e8f1aed113 < -: ----------- commit: write commit-graph with Bloom filters
3: 824f8ad067b ! 2: 4073c8fe42f commit-graph: write commit-graph in more tests
@@ Metadata
Author: Derrick Stolee <dstolee@microsoft.com>
## Commit message ##
- commit-graph: write commit-graph in more tests
+ tests: write commit-graph with Bloom filters
- The GIT_TEST_COMMIT_GRAPH test environment variable triggers
- commit-graph writes during each "git commit" process. To expand
- the number of tests that have commits in the commit-graph file,
- add a helper method that computes the commit-graph and place
+ The GIT_TEST_COMMIT_GRAPH environment variable updates the commit-
+ graph file whenever "git commit" is run, ensuring that we always
+ have an updated commit-graph throughout the test suite. The
+ GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS environment variable was
+ introduced to write the changed-path Bloom filters whenever "git
+ commit-graph write" is run. However, the GIT_TEST_COMMIT_GRAPH
+ trick doesn't launch a separate process and instead writes it
+ directly.
+
+ To expand the number of tests that have commits in the commit-graph
+ file, add a helper method that computes the commit-graph and place
that helper inside "git commit" and "git merge".
+ In the helper method, check GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS
+ to ensure we are writing changed-path Bloom filters whenever
+ possible.
+
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
## builtin/commit.c ##
@@ commit-graph.c
+void git_test_write_commit_graph_or_die(void)
+{
-+ if (git_env_bool(GIT_TEST_COMMIT_GRAPH, 0) &&
-+ write_commit_graph_reachable(the_repository->objects->odb, 0, NULL))
++ int flags = 0;
++ if (!git_env_bool(GIT_TEST_COMMIT_GRAPH, 0))
++ return;
++
++ if (git_env_bool(GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS, 0))
++ flags = COMMIT_GRAPH_WRITE_BLOOM_FILTERS;
++
++ if (write_commit_graph_reachable(the_repository->objects->odb,
++ flags, NULL))
+ die("failed to write commit-graph under GIT_TEST_COMMIT_GRAPH");
+}
+
@@ commit-graph.h
#define GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD "GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD"
#define GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS "GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS"
++/*
++ * This method is only used to enhance coverage of the commit-graph
++ * feature in the test suite with the GIT_TEST_COMMIT_GRAPH and
++ * GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS environment variables. Do not
++ * call this method oustide of a builtin, and only if you know what
++ * you are doing!
++ */
+void git_test_write_commit_graph_or_die(void);
+
struct commit;
4: 4ae196d6355 = 3: 463d6bf5033 blame: use changed-path Bloom filters
--
gitgitgadget
next prev parent reply other threads:[~2020-04-16 20:19 UTC|newest]
Thread overview: 48+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-04-11 1:02 [PATCH 0/3] Integrate changed-path Bloom filters with 'git blame' Derrick Stolee via GitGitGadget
2020-04-11 1:02 ` [PATCH 1/3] revision: complicated pathspecs disable filters Derrick Stolee via GitGitGadget
2020-04-11 21:40 ` Junio C Hamano
2020-04-13 11:49 ` Derrick Stolee
2020-04-14 18:25 ` Junio C Hamano
2020-04-15 13:27 ` Derrick Stolee
2020-04-15 18:37 ` Derrick Stolee
2020-04-15 19:32 ` Junio C Hamano
2020-04-15 19:39 ` Junio C Hamano
2020-04-15 21:25 ` Junio C Hamano
2020-04-16 0:56 ` Taylor Blau
2020-04-15 22:18 ` Jakub Narębski
2020-04-16 0:52 ` Taylor Blau
2020-04-16 13:26 ` Derrick Stolee
2020-04-16 16:33 ` Taylor Blau
2020-04-16 18:02 ` Junio C Hamano
2020-04-12 22:22 ` Taylor Blau
2020-04-12 22:30 ` Junio C Hamano
2020-04-13 0:07 ` Taylor Blau
2020-04-13 11:54 ` Derrick Stolee
2020-04-11 1:03 ` [PATCH 2/3] commit: write commit-graph with bloom filters Derrick Stolee via GitGitGadget
2020-04-11 21:57 ` Junio C Hamano
2020-04-12 20:51 ` Taylor Blau
2020-04-13 12:08 ` Derrick Stolee
2020-04-13 22:11 ` Junio C Hamano
2020-04-11 1:03 ` [PATCH 3/3] blame: use changed-path Bloom filters Derrick Stolee via GitGitGadget
2020-04-11 22:03 ` Junio C Hamano
2020-04-12 7:39 ` Eric Sunshine
2020-04-11 21:30 ` [PATCH 0/3] Integrate changed-path Bloom filters with 'git blame' Junio C Hamano
2020-04-13 14:45 ` [PATCH v2 0/4] " Derrick Stolee via GitGitGadget
2020-04-13 14:45 ` [PATCH v2 1/4] revision: complicated pathspecs disable filters Derrick Stolee via GitGitGadget
2020-04-13 16:09 ` Taylor Blau
2020-04-13 22:18 ` Junio C Hamano
2020-04-13 14:45 ` [PATCH v2 2/4] commit: write commit-graph with Bloom filters Derrick Stolee via GitGitGadget
2020-04-13 16:12 ` Taylor Blau
2020-04-13 22:21 ` Junio C Hamano
2020-04-14 15:04 ` Derrick Stolee
2020-04-14 17:26 ` Junio C Hamano
2020-04-14 17:40 ` Derrick Stolee
2020-04-15 0:17 ` Taylor Blau
2020-04-13 14:45 ` [PATCH v2 3/4] commit-graph: write commit-graph in more tests Derrick Stolee via GitGitGadget
2020-04-13 14:45 ` [PATCH v2 4/4] blame: use changed-path Bloom filters Derrick Stolee via GitGitGadget
2020-04-13 16:21 ` [PATCH v2 0/4] Integrate changed-path Bloom filters with 'git blame' Taylor Blau
2020-04-16 20:14 ` Derrick Stolee via GitGitGadget [this message]
2020-04-16 20:14 ` [PATCH v3 1/3] revision: complicated pathspecs disable filters Derrick Stolee via GitGitGadget
2020-06-07 20:33 ` SZEDER Gábor
2020-04-16 20:14 ` [PATCH v3 2/3] tests: write commit-graph with Bloom filters Derrick Stolee via GitGitGadget
2020-04-16 20:14 ` [PATCH v3 3/3] blame: use changed-path " Derrick Stolee via GitGitGadget
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=pull.609.v3.git.1587068044.gitgitgadget@gmail.com \
--to=gitgitgadget@gmail.com \
--cc=dstolee@microsoft.com \
--cc=garimasigit@gmail.com \
--cc=git@vger.kernel.org \
--cc=jnareb@gmail.com \
--cc=me@ttaylorr.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.