From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: christian.couder@gmail.com, gitster@pobox.com,
johannes.schindelin@gmx.de, johncai86@gmail.com,
karthik.188@gmail.com, kristofferhaugsbakk@fastmail.com,
me@ttaylorr.com, newren@gmail.com, peff@peff.net, ps@pks.im,
Derrick Stolee <stolee@gmail.com>,
Derrick Stolee <stolee@gmail.com>
Subject: [PATCH 2/7] t/perf: add pack-objects filter and path-walk benchmark
Date: Sat, 02 May 2026 14:15:49 +0000 [thread overview]
Message-ID: <f3646218153ecfe5d534033cb54b1a43733ca3dd.1777731354.git.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.2101.git.1777731354.gitgitgadget@gmail.com>
From: Derrick Stolee <stolee@gmail.com>
Add p5315-pack-objects-filter.sh to measure the performance of
'git pack-objects --revs --all' under different filter and traversal
combinations:
* no filter (baseline)
* --filter=blob:none (blobless)
* --filter=sparse:oid=<oid> (cone-mode sparse)
Each filter scenario is tested both with and without --path-walk,
producing paired measurements that show the impact of the path-walk
traversal for each filter type as we integrate the --path-walk feature
with different --filter options. It currently has no integration so
falls back to the standard revision walk. Thus, there are no significant
differences in the current results other than a full repack (and even
then, the --path-walk feature is not incredibly different for the
default Git repository):
Test HEAD
-----------------------------------------------------
5315.2: repack (no filter) 27.91
5315.3: repack size (no filter) 250.7M
5315.4: repack (no filter, --path-walk) 34.92
5315.5: repack size (no filter, --path-walk) 220.0M
5315.6: repack (blob:none) 13.63
5315.7: repack size (blob:none) 137.6M
5315.8: repack (blob:none, --path-walk) 13.48
5315.9: repack size (blob:none, --path-walk) 137.7M
5315.10: repack (sparse:oid) 72.67
5315.11: repack size (sparse:oid) 187.4M
5315.12: repack (sparse:oid, --path-walk) 72.47
5315.13: repack size (sparse:oid, --path-walk) 187.4M
The sparse filter definition is built automatically by sampling
depth-2 directories from the test repository, making the test work
on any repo passed via GIT_PERF_LARGE_REPO. For repos that lack
depth-2 directories, a single top-level directory is used; for flat
repos, the sparse tests are skipped via prerequisite.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
---
t/perf/p5315-pack-objects-filter.sh | 129 ++++++++++++++++++++++++++++
1 file changed, 129 insertions(+)
create mode 100755 t/perf/p5315-pack-objects-filter.sh
diff --git a/t/perf/p5315-pack-objects-filter.sh b/t/perf/p5315-pack-objects-filter.sh
new file mode 100755
index 0000000000..b009039c89
--- /dev/null
+++ b/t/perf/p5315-pack-objects-filter.sh
@@ -0,0 +1,129 @@
+#!/bin/sh
+
+test_description='Tests pack-objects performance with filters and --path-walk'
+. ./perf-lib.sh
+
+test_perf_large_repo
+
+test_expect_success 'setup filter inputs' '
+ # Sample a few depth-2 directories from the test repo to build
+ # a cone-mode sparse-checkout definition. The sampling picks
+ # directories at evenly-spaced positions so the choice is stable
+ # and scales to repos of any shape.
+
+ git ls-tree -d --name-only HEAD >top-dirs &&
+ top_nr=$(wc -l <top-dirs) &&
+
+ >depth2-dirs &&
+ while read tdir
+ do
+ git ls-tree -d --name-only "HEAD:$tdir" 2>/dev/null |
+ sed "s|^|$tdir/|" >>depth2-dirs || return 1
+ done <top-dirs &&
+
+ d2_nr=$(wc -l <depth2-dirs) &&
+
+ if test "$d2_nr" -ge 2
+ then
+ # Pick two directories from evenly-spaced positions.
+ first=$(sed -n "1p" depth2-dirs) &&
+ mid=$(sed -n "$((d2_nr / 2 + 1))p" depth2-dirs) &&
+
+ p1=$(dirname "$first") &&
+ p2=$(dirname "$mid") &&
+
+ # Build cone-mode sparse-checkout patterns.
+ {
+ echo "/*" &&
+ echo "!/*/" &&
+ echo "/$p1/" &&
+ echo "!/$p1/*/" &&
+ if test "$p1" != "$p2"
+ then
+ echo "/$p2/" &&
+ echo "!/$p2/*/"
+ fi &&
+ echo "/$first/" &&
+ if test "$first" != "$mid"
+ then
+ echo "/$mid/"
+ fi
+ } >sparse-patterns &&
+
+ git hash-object -w sparse-patterns >sparse-oid &&
+ echo "Sparse cone: $first $mid" &&
+ cat sparse-patterns &&
+ test_set_prereq SPARSE_OID
+ elif test "$top_nr" -ge 1
+ then
+ # Fallback: use a single top-level directory.
+ first=$(sed -n "1p" top-dirs) &&
+ {
+ echo "/*" &&
+ echo "!/*/" &&
+ echo "/$first/"
+ } >sparse-patterns &&
+
+ git hash-object -w sparse-patterns >sparse-oid &&
+ echo "Sparse cone: $first" &&
+ cat sparse-patterns &&
+ test_set_prereq SPARSE_OID
+ fi
+'
+
+test_perf 'repack (no filter)' '
+ git pack-objects --stdout --no-reuse-delta --revs --all </dev/null >pk
+'
+
+test_size 'repack size (no filter)' '
+ test_file_size pk
+'
+
+test_perf 'repack (no filter, --path-walk)' '
+ git pack-objects --stdout --no-reuse-delta --revs --all --path-walk </dev/null >pk
+'
+
+test_size 'repack size (no filter, --path-walk)' '
+ test_file_size pk
+'
+
+test_perf 'repack (blob:none)' '
+ git pack-objects --stdout --no-reuse-delta --revs --all --filter=blob:none </dev/null >pk
+'
+
+test_size 'repack size (blob:none)' '
+ test_file_size pk
+'
+
+test_perf 'repack (blob:none, --path-walk)' '
+ git pack-objects --stdout --no-reuse-delta --revs --all --path-walk \
+ --filter=blob:none </dev/null >pk
+'
+
+test_size 'repack size (blob:none, --path-walk)' '
+ test_file_size pk
+'
+
+test_perf 'repack (sparse:oid)' \
+ --prereq SPARSE_OID '
+ git pack-objects --stdout --no-reuse-delta --revs --all \
+ --filter=sparse:oid=$(cat sparse-oid) </dev/null >pk
+'
+
+test_size 'repack size (sparse:oid)' \
+ --prereq SPARSE_OID '
+ test_file_size pk
+'
+
+test_perf 'repack (sparse:oid, --path-walk)' \
+ --prereq SPARSE_OID '
+ git pack-objects --stdout --no-reuse-delta --revs --all --path-walk \
+ --filter=sparse:oid=$(cat sparse-oid) </dev/null >pk
+'
+
+test_size 'repack size (sparse:oid, --path-walk)' \
+ --prereq SPARSE_OID '
+ test_file_size pk
+'
+
+test_done
--
gitgitgadget
next prev parent reply other threads:[~2026-05-02 14:16 UTC|newest]
Thread overview: 71+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-02 14:15 [PATCH 0/7] pack-objects: integrate --path-walk and some --filter options Derrick Stolee via GitGitGadget
2026-05-02 14:15 ` [PATCH 1/7] pack-objects: pass --objects with --path-walk Derrick Stolee via GitGitGadget
2026-05-04 0:49 ` Junio C Hamano
2026-05-04 12:01 ` Derrick Stolee
2026-05-02 14:15 ` Derrick Stolee via GitGitGadget [this message]
2026-05-02 14:15 ` [PATCH 3/7] path-walk: support blobless filter Derrick Stolee via GitGitGadget
2026-05-02 14:15 ` [PATCH 4/7] backfill: die on incompatible filter options Derrick Stolee via GitGitGadget
2026-05-03 22:59 ` Junio C Hamano
2026-05-04 12:09 ` Derrick Stolee
2026-05-02 14:15 ` [PATCH 5/7] path-walk: support blob size limit filter Derrick Stolee via GitGitGadget
2026-05-02 14:15 ` [PATCH 6/7] path-walk: add pl_sparse_trees to control tree pruning Derrick Stolee via GitGitGadget
2026-05-02 14:15 ` [PATCH 7/7] pack-objects: support sparse:oid filter with path-walk Derrick Stolee via GitGitGadget
2026-05-04 20:21 ` [PATCH v2 00/10] pack-objects: integrate --path-walk and some --filter options Derrick Stolee via GitGitGadget
2026-05-04 20:21 ` [PATCH v2 01/10] pack-objects: pass --objects with --path-walk Derrick Stolee via GitGitGadget
2026-05-04 20:21 ` [PATCH v2 02/10] t/perf: add pack-objects filter and path-walk benchmark Derrick Stolee via GitGitGadget
2026-05-04 20:21 ` [PATCH v2 03/10] path-walk: support blobless filter Derrick Stolee via GitGitGadget
2026-05-04 20:21 ` [PATCH v2 04/10] backfill: die on incompatible filter options Derrick Stolee via GitGitGadget
2026-05-04 20:21 ` [PATCH v2 05/10] path-walk: support blob size limit filter Derrick Stolee via GitGitGadget
2026-05-04 20:21 ` [PATCH v2 06/10] path-walk: add pl_sparse_trees to control tree pruning Derrick Stolee via GitGitGadget
2026-05-04 20:21 ` [PATCH v2 07/10] pack-objects: support sparse:oid filter with path-walk Derrick Stolee via GitGitGadget
2026-05-04 20:21 ` [PATCH v2 08/10] path-walk: support `tree:0` filter Taylor Blau via GitGitGadget
2026-05-04 20:21 ` [PATCH v2 09/10] path-walk: support `object:type` filter Taylor Blau via GitGitGadget
2026-05-04 20:21 ` [PATCH v2 10/10] path-walk: support `combine` filter Taylor Blau via GitGitGadget
2026-05-05 16:18 ` [PATCH v2 00/10] pack-objects: integrate --path-walk and some --filter options Derrick Stolee
2026-05-05 19:01 ` Taylor Blau
2026-05-05 19:44 ` Derrick Stolee
2026-05-05 20:42 ` Taylor Blau
2026-05-07 11:40 ` Derrick Stolee
2026-05-11 3:05 ` Junio C Hamano
2026-05-11 13:58 ` Derrick Stolee
2026-05-11 18:12 ` [PATCH v3 00/12] " Derrick Stolee via GitGitGadget
2026-05-11 18:12 ` [PATCH v3 01/12] t5620: make test work with path-walk var Derrick Stolee via GitGitGadget
2026-05-12 1:03 ` Taylor Blau
2026-05-11 18:12 ` [PATCH v3 02/12] pack-objects: pass --objects with --path-walk Derrick Stolee via GitGitGadget
2026-05-12 1:04 ` Taylor Blau
2026-05-11 18:13 ` [PATCH v3 03/12] t/perf: add pack-objects filter and path-walk benchmark Derrick Stolee via GitGitGadget
2026-05-12 1:11 ` Taylor Blau
2026-05-13 18:23 ` Derrick Stolee
2026-05-11 18:13 ` [PATCH v3 04/12] path-walk: always emit directly-requested objects Derrick Stolee via GitGitGadget
2026-05-12 1:23 ` Taylor Blau
2026-05-13 18:29 ` Derrick Stolee
2026-05-11 18:13 ` [PATCH v3 05/12] path-walk: support blobless filter Derrick Stolee via GitGitGadget
2026-05-11 18:38 ` Taylor Blau
2026-05-11 19:44 ` Derrick Stolee
2026-05-11 18:13 ` [PATCH v3 06/12] backfill: die on incompatible filter options Derrick Stolee via GitGitGadget
2026-05-12 1:26 ` Taylor Blau
2026-05-11 18:13 ` [PATCH v3 07/12] path-walk: support blob size limit filter Derrick Stolee via GitGitGadget
2026-05-12 1:33 ` Taylor Blau
2026-05-13 18:35 ` Derrick Stolee
2026-05-11 18:13 ` [PATCH v3 08/12] path-walk: add pl_sparse_trees to control tree pruning Derrick Stolee via GitGitGadget
2026-05-11 18:13 ` [PATCH v3 09/12] pack-objects: support sparse:oid filter with path-walk Derrick Stolee via GitGitGadget
2026-05-11 18:13 ` [PATCH v3 10/12] path-walk: support `tree:0` filter Taylor Blau via GitGitGadget
2026-05-12 1:41 ` Taylor Blau
2026-05-13 19:46 ` Derrick Stolee
2026-05-11 18:13 ` [PATCH v3 11/12] path-walk: support `object:type` filter Taylor Blau via GitGitGadget
2026-05-11 18:13 ` [PATCH v3 12/12] path-walk: support `combine` filter Taylor Blau via GitGitGadget
2026-05-12 1:43 ` [PATCH v3 00/12] pack-objects: integrate --path-walk and some --filter options Taylor Blau
2026-05-13 21:18 ` [PATCH v4 00/13] " Derrick Stolee via GitGitGadget
2026-05-13 21:18 ` [PATCH v4 01/13] t5620: make test work with path-walk var Derrick Stolee via GitGitGadget
2026-05-13 21:18 ` [PATCH v4 02/13] pack-objects: pass --objects with --path-walk Derrick Stolee via GitGitGadget
2026-05-13 21:18 ` [PATCH v4 03/13] t/perf: add pack-objects filter and path-walk benchmark Derrick Stolee via GitGitGadget
2026-05-13 21:18 ` [PATCH v4 04/13] path-walk: always emit directly-requested objects Derrick Stolee via GitGitGadget
2026-05-13 21:18 ` [PATCH v4 05/13] path-walk: support blobless filter Derrick Stolee via GitGitGadget
2026-05-13 21:18 ` [PATCH v4 06/13] backfill: die on incompatible filter options Derrick Stolee via GitGitGadget
2026-05-13 21:18 ` [PATCH v4 07/13] path-walk: support blob size limit filter Derrick Stolee via GitGitGadget
2026-05-13 21:18 ` [PATCH v4 08/13] path-walk: add pl_sparse_trees to control tree pruning Derrick Stolee via GitGitGadget
2026-05-13 21:18 ` [PATCH v4 09/13] pack-objects: support sparse:oid filter with path-walk Derrick Stolee via GitGitGadget
2026-05-13 21:18 ` [PATCH v4 10/13] t6601: tag otherwise-unreachable trees Derrick Stolee via GitGitGadget
2026-05-13 21:18 ` [PATCH v4 11/13] path-walk: support `tree:0` filter Taylor Blau via GitGitGadget
2026-05-13 21:18 ` [PATCH v4 12/13] path-walk: support `object:type` filter Taylor Blau via GitGitGadget
2026-05-13 21:18 ` [PATCH v4 13/13] path-walk: support `combine` filter Taylor Blau via GitGitGadget
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f3646218153ecfe5d534033cb54b1a43733ca3dd.1777731354.git.gitgitgadget@gmail.com \
--to=gitgitgadget@gmail.com \
--cc=christian.couder@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=johannes.schindelin@gmx.de \
--cc=johncai86@gmail.com \
--cc=karthik.188@gmail.com \
--cc=kristofferhaugsbakk@fastmail.com \
--cc=me@ttaylorr.com \
--cc=newren@gmail.com \
--cc=peff@peff.net \
--cc=ps@pks.im \
--cc=stolee@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox