From: Taylor Blau <me@ttaylorr.com>
To: Derrick Stolee <stolee@gmail.com>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
Jeff King <peff@peff.net>, Elijah Newren <newren@gmail.com>
Subject: Re: [PATCH 0/3] pack-objects: support bitmaps and delta-islands with `--path-walk`
Date: Fri, 29 May 2026 16:07:26 -0400 [thread overview]
Message-ID: <ahnx/oBWe9yyBZFg@nand.local> (raw)
In-Reply-To: <22a7e32f-f645-4f00-bc5b-6b4309e483c2@gmail.com>
On Fri, May 29, 2026 at 01:26:33PM -0400, Derrick Stolee wrote:
> On 5/28/26 11:28 AM, Derrick Stolee wrote:
> > On 5/27/26 7:18 PM, Taylor Blau wrote:
>
> > Do you have any end-to-end performance data to demonstrate that these
> > changes are effective at scale? Are we still producing packfiles with the
> > pack-file compression and now with .bitmap files? How does this impact
> > the performance of a clone or fetch when using a bitmap index at read
> > time?
>
> Here's my attempt to use our existing performance tests to analyze the
> impact of this series.
>
> Running p5311 against the base of this topic and this topic with
> GIT_TEST_PACK_PATH_WALK=1, I get this output:
Yikes. That's not great, but see below for what I think is going on.
(As an aside, we can focus in on either lookup=true or lookup=false,
since these are just controlling whether or not the bitmap lookup table
is written. On a repository as small as git.git, this shouldn't make a
huge difference either way. I have a separate series to make this the
default and to clean up the t/perf suite accordingly, but haven't sent
it to the list yet.)
> Test HEAD~3 HEAD
> -----------------------------------------------------------------
> [...]
I think this is either the primary reason why you're not seeing an
improvement here, or at least related to it...
> Do you have a good feeling for why the path-walk feature doesn't
> make a huge change in these test scenarios?
I think the problem is that we're relying on the TEST_ variable to tell
pack-objects to generate a pack using --path-walk, but treat it as a
fallback.
I suspect that since p5311 invokes repack *without* the '--path-walk'
option, we end up in this case within cmd_pack_objects():
if (path_walk < 0) {
if (use_bitmap_index > 0 ||
!use_internal_rev_list)
path_walk = 0;
else if (the_repository->gitdir &&
the_repository->settings.pack_use_path_walk)
path_walk = 1;
else
path_walk = git_env_bool("GIT_TEST_PACK_PATH_WALK", 0);
}
, where `path_walk` is *not* set, but `use_bitmap_index` is since p5311
set the 'pack.writeBitmaps' configuration (as a separate aside, this
should really prefer the non-deprecated 'repack.writeBitmaps' variant).
So in that case, we fall back to `path_walk = 0`, and don't even bother
reading the `GIT_TEST_PACK_PATH_WALK` variable.
If I modify the perf test like so:
--- 8< ---
diff --git a/t/perf/p5311-pack-bitmaps-fetch.sh b/t/perf/p5311-pack-bitmaps-fetch.sh
index 047efb995d6..1c9c99216e3 100755
--- a/t/perf/p5311-pack-bitmaps-fetch.sh
+++ b/t/perf/p5311-pack-bitmaps-fetch.sh
@@ -13,7 +13,7 @@ test_fetch_bitmaps () {
test_expect_success 'create bitmapped server repo' '
git config pack.writebitmaps true &&
git config pack.writeBitmapLookupTable '"$1"' &&
- git repack -ad
+ git repack -ad --path-walk
'
# simulate a fetch from a repository that last fetched N days ago, for
--- >8 ---
, then I can get significantly improved results when running without the
GIT_TEST_PACK_PATH_WALK variablle (here I'm truncating the
'lookup=false' case, which performs nearly identically):
Test HEAD~3 HEAD
------------------------------------------------------------------------------------
5311.4: server (1 days) (lookup=true) 2.57(2.52+0.04) 0.03(0.02+0.00) -98.8%
5311.5: size (1 days) 153.4K 153.4K +0.0%
5311.6: client (1 days) (lookup=true) 0.00(0.01+0.00) 0.00(0.01+0.00) =
5311.8: server (2 days) (lookup=true) 2.60(2.55+0.04) 0.02(0.02+0.00) -99.2%
5311.9: size (2 days) 153.4K 153.4K +0.0%
5311.10: client (2 days) (lookup=true) 0.00(0.01+0.00) 0.00(0.01+0.00) =
5311.12: server (4 days) (lookup=true) 2.60(2.54+0.05) 0.03(0.03+0.00) -98.8%
5311.13: size (4 days) 209.0K 209.0K +0.0%
5311.14: client (4 days) (lookup=true) 0.01(0.02+0.00) 0.01(0.01+0.00) +0.0%
5311.16: server (8 days) (lookup=true) 2.58(2.53+0.04) 0.03(0.03+0.00) -98.8%
5311.17: size (8 days) 209.0K 209.0K +0.0%
5311.18: client (8 days) (lookup=true) 0.01(0.01+0.00) 0.01(0.02+0.00) +0.0%
5311.20: server (16 days) (lookup=true) 2.58(2.52+0.05) 0.03(0.03+0.00) -98.8%
5311.21: size (16 days) 209.0K 209.0K +0.0%
5311.22: client (16 days) (lookup=true) 0.01(0.02+0.00) 0.01(0.01+0.00) +0.0%
5311.24: server (32 days) (lookup=true) 2.61(2.58+0.03) 0.03(0.02+0.01) -98.9%
5311.25: size (32 days) 212.9K 212.9K +0.0%
5311.26: client (32 days) (lookup=true) 0.01(0.02+0.00) 0.02(0.02+0.00) +100.0%
5311.28: server (64 days) (lookup=true) 2.72(2.79+0.06) 0.19(0.30+0.03) -93.0%
5311.29: size (64 days) 4.5M 4.5M -0.0%
5311.30: client (64 days) (lookup=true) 0.49(0.58+0.02) 0.48(0.56+0.04) -2.0%
5311.32: server (128 days) (lookup=true) 2.90(3.21+0.09) 0.35(0.70+0.04) -87.9%
5311.33: size (128 days) 9.4M 9.5M +0.4%
5311.34: client (128 days) (lookup=true) 0.98(1.27+0.08) 0.98(1.33+0.06) +0.0%
My reading here is that we get significantly smaller packs (i.e. all
'test_size' tests drop from HEAD~3 to HEAD) in the same amount of time
(i.e. that all 'test_perf' tests are roughly flat).
That lines up with my expectation here, which is that even though we're
using bitmaps at read time, that's effectively seeding the verbatim pack
reuse over a significantly smaller pack, producing a much smaller output
pack as a result.
As to whether we should modify the perf suite to test this, naturally I
think we should. Likely that looks like modifying p5313 to re-run
`test_all_with_args` with '--use-bitmap-index' after repacking with
--path-walk and generating bitmaps like so (untested):
--- 8< ---
diff --git a/t/perf/p5313-pack-objects.sh b/t/perf/p5313-pack-objects.sh
index 46a6cd32d24..663717982b1 100755
--- a/t/perf/p5313-pack-objects.sh
+++ b/t/perf/p5313-pack-objects.sh
@@ -22,6 +22,21 @@ test_expect_success 'create rev input' '
EOF
'
+test_repack_with_args () {
+ args="$@"
+ export args
+
+ test_perf "repack with $args" '
+ git repack -adf $args
+ '
+
+ test_size "repack size with $args" '
+ gitdir=$(git rev-parse --git-dir) &&
+ pack=$(ls $gitdir/objects/pack/pack-*.pack) &&
+ test_file_size "$pack"
+ '
+}
+
test_all_with_args () {
parameter=$1
export parameter
@@ -52,23 +67,22 @@ test_all_with_args () {
test_size "shallow pack size with $parameter" '
test_file_size out
'
-
- test_perf "repack with $parameter" '
- git repack -adf $parameter
- '
-
- test_size "repack size with $parameter" '
- gitdir=$(git rev-parse --git-dir) &&
- pack=$(ls $gitdir/objects/pack/pack-*.pack) &&
- test_file_size "$pack"
- '
}
for version in 1 2
do
- test_all_with_args --name-hash-version=$version
+ arg="--name-hash-version=$version" &&
+
+ test_all_with_args "$arg" &&
+ test_repack_with_args "$arg" || return 1
done
test_all_with_args --path-walk
+test_repack_with_args --path-walk
+
+# inverted order here: we want to test using reachability bitmaps on a
+# pack written with --path-walk
+test_repack_with_args --path-walk --write-bitmap-index
+test_all_with_args --use-bitmap-index
test_done
--- >8 ---
I don't have a strong opinion on whether or not we should include that
in this series or elsewhere.
Thanks,
Taylor
next prev parent reply other threads:[~2026-05-29 20:07 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-27 23:18 [PATCH 0/3] pack-objects: support bitmaps and delta-islands with `--path-walk` Taylor Blau
2026-05-27 23:18 ` [PATCH 1/3] pack-objects: support reachability bitmaps " Taylor Blau
2026-05-27 23:18 ` [PATCH 2/3] pack-objects: extract `record_tree_depth()` helper Taylor Blau
2026-05-27 23:18 ` [PATCH 3/3] pack-objects: support `--delta-islands` with `--path-walk` Taylor Blau
2026-05-28 15:28 ` [PATCH 0/3] pack-objects: support bitmaps and delta-islands " Derrick Stolee
2026-05-29 17:26 ` Derrick Stolee
2026-05-29 20:07 ` Taylor Blau [this message]
2026-05-29 21:28 ` Derrick Stolee
2026-05-29 22:20 ` Taylor Blau
2026-06-02 22:21 ` [PATCH v2 0/4] " Taylor Blau
2026-06-02 22:21 ` [PATCH v2 1/4] t/perf: drop p5311's lookup-table permutation Taylor Blau
2026-06-02 22:21 ` [PATCH v2 2/4] pack-objects: support reachability bitmaps with `--path-walk` Taylor Blau
2026-06-02 22:21 ` [PATCH v2 3/4] pack-objects: extract `record_tree_depth()` helper Taylor Blau
2026-06-02 22:21 ` [PATCH v2 4/4] pack-objects: support `--delta-islands` with `--path-walk` Taylor Blau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ahnx/oBWe9yyBZFg@nand.local \
--to=me@ttaylorr.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=newren@gmail.com \
--cc=peff@peff.net \
--cc=stolee@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox