From: Jonathan Tan <jonathantanmy@google.com>
To: git@vger.kernel.org
Cc: Jonathan Tan <jonathantanmy@google.com>,
Ramsay Jones <ramsay@ramsayjones.plus.com>,
Junio C Hamano <gitster@pobox.com>
Subject: [PATCH v4 0/4] Changed path filter hash fix and version bump
Date: Tue, 13 Jun 2023 10:39:54 -0700 [thread overview]
Message-ID: <cover.1686677910.git.jonathantanmy@google.com> (raw)
In-Reply-To: <cover.1684790529.git.jonathantanmy@google.com>
Thanks Ramsay for spotting the errors and mentioning that I can use
octal escapes. Here's an update taking into account their comments.
Jonathan Tan (4):
gitformat-commit-graph: describe version 2 of BDAT
t4216: test changed path filters with high bit paths
repo-settings: introduce commitgraph.changedPathsVersion
commit-graph: new filter ver. that fixes murmur3
Documentation/config/commitgraph.txt | 16 ++++-
Documentation/gitformat-commit-graph.txt | 9 ++-
bloom.c | 65 +++++++++++++++++++-
bloom.h | 8 ++-
commit-graph.c | 29 +++++++--
oss-fuzz/fuzz-commit-graph.c | 2 +-
repo-settings.c | 6 +-
repository.h | 2 +-
t/helper/test-bloom.c | 9 ++-
t/t0095-bloom.sh | 8 +++
t/t4216-log-bloom.sh | 77 ++++++++++++++++++++++++
11 files changed, 210 insertions(+), 21 deletions(-)
Range-diff against v3:
1: d4b63945f6 ! 1: a3b52af4c9 gitformat-commit-graph: describe version 2 of BDAT
@@ Documentation/gitformat-commit-graph.txt: All multi-byte numbers are in network
described in https://doi.org/10.1007/978-3-540-30494-4_26 "Bloom Filters
- in Probabilistic Verification"
+ in Probabilistic Verification". Version 1 bloom filters have a bug that appears
-+ when int is signed and the repository has path names that have characters >=
++ when char is signed and the repository has path names that have characters >=
+ 0x80; Git supports reading and writing them, but this ability will be removed
+ in a future version of Git.
- The number of times a path is hashed and hence the number of bit positions
2: aa4535776e ! 2: f095e2b486 t4216: test changed path filters with high bit paths
@@ t/t4216-log-bloom.sh: test_expect_success 'Bloom generation backfills empty comm
+}
+
+# chosen to be the same under all Unicode normalization forms
-+CENT=$(printf "\xc2\xa2")
++CENT=$(printf "\302\242")
+
-+# Some systems (in particular, Linux on the CI running on GitHub at the time of
-+# writing) store into CENT a literal backslash, then "x", and so on (instead of
-+# the high-bit characters needed). In these systems, do not run the following
-+# tests.
-+if test "$(printf $CENT | perl -0777 -ne 'no utf8; print ord($_)')" = "194"
-+then
-+ test_set_prereq HIGH_BIT
-+fi
-+
-+test_expect_success HIGH_BIT 'set up repo with high bit path, version 1 changed-path' '
++test_expect_success 'set up repo with high bit path, version 1 changed-path' '
+ git init highbit1 &&
+ test_commit -C highbit1 c1 "$CENT" &&
+ git -C highbit1 commit-graph write --reachable --changed-paths
+'
+
-+test_expect_success HIGH_BIT 'setup check value of version 1 changed-path' '
++test_expect_success 'setup check value of version 1 changed-path' '
+ (cd highbit1 &&
+ printf "52a9" >expect &&
+ get_first_changed_path_filter >actual)
+'
+
-+# expect will not match actual if int is unsigned by default. Write the test
++# expect will not match actual if char is unsigned by default. Write the test
+# in this way, so that a user running this test script can still see if the two
+# files match. (It will appear as an ordinary success if they match, and a skip
+# if not.)
+if test_cmp highbit1/expect highbit1/actual
+then
-+ test_set_prereq SIGNED_INT_BY_DEFAULT
++ test_set_prereq SIGNED_CHAR_BY_DEFAULT
+fi
-+test_expect_success SIGNED_INT_BY_DEFAULT 'check value of version 1 changed-path' '
++test_expect_success SIGNED_CHAR_BY_DEFAULT 'check value of version 1 changed-path' '
+ # Only the prereq matters for this test.
+ true
+'
+
-+test_expect_success HIGH_BIT 'version 1 changed-path used when version 1 requested' '
++test_expect_success 'version 1 changed-path used when version 1 requested' '
+ (cd highbit1 &&
+ test_bloom_filters_used "-- $CENT")
+'
3: d6982268a4 = 3: 6adfa53daf repo-settings: introduce commitgraph.changedPathsVersion
4: e879483c42 ! 4: 5c65bf8a22 commit-graph: new filter ver. that fixes murmur3
@@ t/t0095-bloom.sh: test_expect_success 'compute unseeded murmur3 hash for test st
Hashes:0x5615800c|0x5b966560|0x61174ab4|0x66983008|0x6c19155c|0x7199fab0|0x771ae004|
## t/t4216-log-bloom.sh ##
-@@ t/t4216-log-bloom.sh: test_expect_success HIGH_BIT 'version 1 changed-path used when version 1 request
+@@ t/t4216-log-bloom.sh: test_expect_success 'version 1 changed-path used when version 1 requested' '
test_bloom_filters_used "-- $CENT")
'
-+test_expect_success HIGH_BIT 'version 1 changed-path not used when version 2 requested' '
++test_expect_success 'version 1 changed-path not used when version 2 requested' '
+ (cd highbit1 &&
+ git config --add commitgraph.changedPathsVersion 2 &&
+ test_bloom_filters_not_used "-- $CENT")
+'
+
-+test_expect_success HIGH_BIT 'set up repo with high bit path, version 2 changed-path' '
++test_expect_success 'set up repo with high bit path, version 2 changed-path' '
+ git init highbit2 &&
+ git -C highbit2 config --add commitgraph.changedPathsVersion 2 &&
+ test_commit -C highbit2 c2 "$CENT" &&
+ git -C highbit2 commit-graph write --reachable --changed-paths
+'
+
-+test_expect_success HIGH_BIT 'check value of version 2 changed-path' '
++test_expect_success 'check value of version 2 changed-path' '
+ (cd highbit2 &&
+ printf "c01f" >expect &&
+ get_first_changed_path_filter >actual &&
+ test_cmp expect actual)
+'
+
-+test_expect_success HIGH_BIT 'version 2 changed-path used when version 2 requested' '
++test_expect_success 'version 2 changed-path used when version 2 requested' '
+ (cd highbit2 &&
+ test_bloom_filters_used "-- $CENT")
+'
+
-+test_expect_success HIGH_BIT 'version 2 changed-path not used when version 1 requested' '
++test_expect_success 'version 2 changed-path not used when version 1 requested' '
+ (cd highbit2 &&
+ git config --add commitgraph.changedPathsVersion 1 &&
+ test_bloom_filters_not_used "-- $CENT")
--
2.41.0.162.gfafddb0af9-goog
next prev parent reply other threads:[~2023-06-13 17:40 UTC|newest]
Thread overview: 116+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-22 21:48 [PATCH 0/2] Changed path filter hash fix and version bump Jonathan Tan
2023-05-22 21:48 ` [PATCH 1/2] t4216: test wrong bloom filter version rejection Jonathan Tan
2023-05-22 21:48 ` [PATCH 2/2] commit-graph: fix murmur3, bump filter ver. to 2 Jonathan Tan
2023-05-23 13:00 ` Derrick Stolee
2023-05-23 23:00 ` Jonathan Tan
2023-05-23 23:51 ` Junio C Hamano
2023-05-24 21:26 ` Jonathan Tan
2023-05-26 13:19 ` Derrick Stolee
2023-05-30 17:26 ` Jonathan Tan
2023-05-23 4:42 ` [PATCH 0/2] Changed path filter hash fix and version bump Junio C Hamano
2023-05-31 23:12 ` [PATCH v2 0/3] " Jonathan Tan
2023-05-31 23:12 ` [PATCH v2 1/3] t4216: test changed path filters with high bit paths Jonathan Tan
2023-05-31 23:12 ` [PATCH v2 2/3] repo-settings: introduce commitgraph.changedPathsVersion Jonathan Tan
2023-05-31 23:12 ` [PATCH v2 3/3] commit-graph: new filter ver. that fixes murmur3 Jonathan Tan
2023-06-03 1:01 ` [PATCH v2 0/3] Changed path filter hash fix and version bump Junio C Hamano
2023-06-03 2:24 ` Junio C Hamano
2023-06-07 16:30 ` Jonathan Tan
2023-06-07 21:37 ` Jonathan Tan
2023-06-08 19:21 ` [PATCH v3 0/4] " Jonathan Tan
2023-06-08 19:21 ` [PATCH v3 1/4] gitformat-commit-graph: describe version 2 of BDAT Jonathan Tan
2023-06-08 19:52 ` Ramsay Jones
2023-06-12 21:26 ` Junio C Hamano
2023-06-08 19:21 ` [PATCH v3 2/4] t4216: test changed path filters with high bit paths Jonathan Tan
2023-06-08 19:21 ` [PATCH v3 3/4] repo-settings: introduce commitgraph.changedPathsVersion Jonathan Tan
2023-06-08 19:21 ` [PATCH v3 4/4] commit-graph: new filter ver. that fixes murmur3 Jonathan Tan
2023-06-08 19:50 ` [PATCH v3 0/4] Changed path filter hash fix and version bump Ramsay Jones
2023-06-09 0:08 ` Jonathan Tan
2023-06-12 21:31 ` Junio C Hamano
2023-06-13 17:16 ` Jonathan Tan
2023-06-13 17:29 ` [PATCH] CodingGuidelines: use octal escapes, not hex Jonathan Tan
2023-06-13 18:16 ` Eric Sunshine
2023-06-13 18:43 ` Jonathan Tan
2023-06-13 19:15 ` Eric Sunshine
2023-06-13 19:29 ` Junio C Hamano
2023-06-13 19:16 ` [PATCH v3 0/4] Changed path filter hash fix and version bump Junio C Hamano
2023-06-13 17:39 ` Jonathan Tan [this message]
2023-06-13 17:39 ` [PATCH v4 1/4] gitformat-commit-graph: describe version 2 of BDAT Jonathan Tan
2023-06-13 21:58 ` Junio C Hamano
2023-06-20 13:22 ` Derrick Stolee
2023-06-21 12:08 ` Taylor Blau
2023-06-22 22:26 ` Jonathan Tan
2023-06-23 13:05 ` Derrick Stolee
2023-06-13 17:39 ` [PATCH v4 2/4] t4216: test changed path filters with high bit paths Jonathan Tan
2023-06-13 17:39 ` [PATCH v4 3/4] repo-settings: introduce commitgraph.changedPathsVersion Jonathan Tan
2023-06-20 13:28 ` Derrick Stolee
2023-06-21 12:14 ` Taylor Blau
2023-06-13 17:39 ` [PATCH v4 4/4] commit-graph: new filter ver. that fixes murmur3 Jonathan Tan
2023-06-20 13:39 ` Derrick Stolee
2023-06-20 18:37 ` Junio C Hamano
2023-06-13 19:21 ` [PATCH v4 0/4] Changed path filter hash fix and version bump Junio C Hamano
2023-06-20 13:43 ` Derrick Stolee
2023-06-20 21:56 ` Jonathan Tan
2023-06-21 12:19 ` Taylor Blau
2023-06-21 17:53 ` Derrick Stolee
2023-06-22 22:27 ` Jonathan Tan
2023-06-23 13:18 ` Derrick Stolee
2023-07-13 21:42 ` [PATCH v5 " Jonathan Tan
2023-07-13 21:42 ` [PATCH v5 1/4] gitformat-commit-graph: describe version 2 of BDAT Jonathan Tan
2023-07-19 17:25 ` Taylor Blau
2023-07-20 20:20 ` Jonathan Tan
2023-07-21 1:38 ` Taylor Blau
2023-07-13 21:42 ` [PATCH v5 2/4] t4216: test changed path filters with high bit paths Jonathan Tan
2023-07-13 22:50 ` Junio C Hamano
2023-07-19 17:27 ` Taylor Blau
2023-07-19 17:55 ` [PATCH 0/4] commit-graph: avoid looking at Bloom filter data directly Taylor Blau
2023-07-19 17:55 ` [PATCH 1/4] t/helper/test-read-graph.c: extract `dump_graph_info()` Taylor Blau
2023-07-19 17:55 ` [PATCH 2/4] bloom.h: make `load_bloom_filter_from_graph()` public Taylor Blau
2023-07-19 17:55 ` [PATCH 3/4] t/helper/test-read-graph: implement `bloom-filters` mode Taylor Blau
2023-07-19 17:55 ` [PATCH 4/4] fixup! t4216: test changed path filters with high bit paths Taylor Blau
2023-07-19 19:24 ` [PATCH 0/4] commit-graph: avoid looking at Bloom filter data directly Junio C Hamano
2023-07-20 20:22 ` Jonathan Tan
2023-07-13 21:42 ` [PATCH v5 3/4] repo-settings: introduce commitgraph.changedPathsVersion Jonathan Tan
2023-07-19 18:10 ` Taylor Blau
2023-07-20 20:42 ` Jonathan Tan
2023-07-20 21:02 ` Taylor Blau
2023-07-13 21:42 ` [PATCH v5 4/4] commit-graph: new filter ver. that fixes murmur3 Jonathan Tan
2023-07-19 18:24 ` Taylor Blau
2023-07-20 21:27 ` Jonathan Tan
2023-07-26 23:32 ` Taylor Blau
2023-07-13 22:16 ` [PATCH v5 0/4] Changed path filter hash fix and version bump Junio C Hamano
2023-07-13 22:59 ` Junio C Hamano
2023-07-14 18:48 ` Jonathan Tan
2023-07-20 21:46 ` [PATCH v6 0/7] " Jonathan Tan
2023-07-20 21:46 ` [PATCH v6 1/7] gitformat-commit-graph: describe version 2 of BDAT Jonathan Tan
2023-07-20 21:46 ` [PATCH v6 2/7] t/helper/test-read-graph.c: extract `dump_graph_info()` Jonathan Tan
2023-07-26 23:26 ` Taylor Blau
2023-07-20 21:46 ` [PATCH v6 3/7] bloom.h: make `load_bloom_filter_from_graph()` public Jonathan Tan
2023-07-20 21:46 ` [PATCH v6 4/7] t/helper/test-read-graph: implement `bloom-filters` mode Jonathan Tan
2023-07-20 21:46 ` [PATCH v6 5/7] t4216: test changed path filters with high bit paths Jonathan Tan
2023-07-26 23:28 ` Taylor Blau
2023-07-20 21:46 ` [PATCH v6 6/7] repo-settings: introduce commitgraph.changedPathsVersion Jonathan Tan
2023-07-20 21:46 ` [PATCH v6 7/7] commit-graph: new filter ver. that fixes murmur3 Jonathan Tan
2023-07-25 20:52 ` [PATCH v6 0/7] Changed path filter hash fix and version bump Junio C Hamano
2023-07-26 20:39 ` Junio C Hamano
2023-07-27 0:17 ` Taylor Blau
2023-07-27 0:49 ` Junio C Hamano
2023-07-27 17:39 ` Jonathan Tan
2023-07-27 17:56 ` Taylor Blau
2023-07-27 20:53 ` Jonathan Tan
2023-08-01 18:08 ` Taylor Blau
2023-08-01 18:52 ` Jonathan Tan
2023-08-03 0:01 ` Taylor Blau
2023-08-03 13:18 ` Derrick Stolee
2023-08-03 18:45 ` Taylor Blau
2023-07-27 18:44 ` Junio C Hamano
2023-08-01 18:41 ` [PATCH v7 " Jonathan Tan
2023-08-01 18:41 ` [PATCH v7 1/7] gitformat-commit-graph: describe version 2 of BDAT Jonathan Tan
2023-08-01 18:41 ` [PATCH v7 2/7] t/helper/test-read-graph.c: extract `dump_graph_info()` Jonathan Tan
2023-08-01 18:41 ` [PATCH v7 3/7] bloom.h: make `load_bloom_filter_from_graph()` public Jonathan Tan
2023-08-01 18:41 ` [PATCH v7 4/7] t/helper/test-read-graph: implement `bloom-filters` mode Jonathan Tan
2023-08-01 18:41 ` [PATCH v7 5/7] t4216: test changed path filters with high bit paths Jonathan Tan
2023-08-01 18:41 ` [PATCH v7 6/7] repo-settings: introduce commitgraph.changedPathsVersion Jonathan Tan
2023-08-01 18:41 ` [PATCH v7 7/7] commit-graph: new filter ver. that fixes murmur3 Jonathan Tan
2023-08-01 18:44 ` [PATCH v7 0/7] Changed path filter hash fix and version bump Junio C Hamano
2023-08-01 20:58 ` Taylor Blau
2023-08-01 21:07 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cover.1686677910.git.jonathantanmy@google.com \
--to=jonathantanmy@google.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=ramsay@ramsayjones.plus.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.