From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: christian.couder@gmail.com, gitster@pobox.com,
	johannes.schindelin@gmx.de, johncai86@gmail.com,
	karthik.188@gmail.com, kristofferhaugsbakk@fastmail.com,
	me@ttaylorr.com, newren@gmail.com, peff@peff.net, ps@pks.im,
	Taylor Blau <me@ttaylorr.com>, Derrick Stolee <stolee@gmail.com>,
	Derrick Stolee <stolee@gmail.com>
Subject: [PATCH v4 03/13] t/perf: add pack-objects filter and path-walk benchmark
Date: Wed, 13 May 2026 21:18:45 +0000
Message-ID: <fb8a0f9c43d4e41712839a93c4db6a294a7b5285.1778707135.git.gitgitgadget@gmail.com>
In-Reply-To: <pull.2101.v4.git.1778707135.gitgitgadget@gmail.com>

From: Derrick Stolee <stolee@gmail.com>

Add p5315-pack-objects-filter.sh to measure the performance of
'git pack-objects --revs --all' under different filter and traversal
combinations:

 * no filter (baseline)
 * --filter=blob:none (blobless)
 * --filter=sparse:oid=<oid> (cone-mode sparse)

Each filter scenario is tested both with and without --path-walk,
producing paired measurements that show the impact of the path-walk
traversal for each filter type as we integrate the --path-walk feature
with different --filter options. At this point --path-walk has no
filter integration, so pack-objects falls back to the standard revision
walk whenever a filter is given. Thus, the only significant difference
in the current results is in the full repack (and even there, the
effect of --path-walk is modest for the default Git repository):

Test                                             HEAD
-----------------------------------------------------
5315.2: repack (no filter)                      27.91
5315.3: repack size (no filter)                250.7M
5315.4: repack (no filter, --path-walk)         34.92
5315.5: repack size (no filter, --path-walk)   220.0M
5315.6: repack (blob:none)                      13.63
5315.7: repack size (blob:none)                137.6M
5315.8: repack (blob:none, --path-walk)         13.48
5315.9: repack size (blob:none, --path-walk)   137.7M
5315.10: repack (sparse:oid)                    72.67
5315.11: repack size (sparse:oid)              187.4M
5315.12: repack (sparse:oid, --path-walk)       72.47
5315.13: repack size (sparse:oid, --path-walk) 187.4M

The sparse filter definition is built automatically by sampling
depth-2 directories from the test repository, making the test work
on any repo passed via GIT_PERF_LARGE_REPO. For repos that lack
depth-2 directories, a single top-level directory is used; for flat
repos, the sparse tests are skipped via prerequisite.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
---
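
The cone-mode pattern layout described in the commit message can be
sketched as a standalone helper. This is only a minimal sketch, assuming
the two depth-2 directories have already been chosen; the function name
cone_patterns and the example directory names are hypothetical and are
not part of the patch.

```shell
#!/bin/sh
# Hypothetical helper (not in the patch): print cone-mode
# sparse-checkout patterns for two sampled depth-2 directories.
cone_patterns () {
	first=$1
	mid=$2
	p1=$(dirname "$first")
	p2=$(dirname "$mid")

	# Files at the root, but no top-level directories by default.
	echo "/*"
	echo "!/*/"
	# Each parent directory, without its subdirectories.
	echo "/$p1/"
	echo "!/$p1/*/"
	if test "$p1" != "$p2"
	then
		echo "/$p2/"
		echo "!/$p2/*/"
	fi
	# The sampled directories themselves, included recursively.
	echo "/$first/"
	if test "$first" != "$mid"
	then
		echo "/$mid/"
	fi
}

# Example directory names are made up for illustration.
cone_patterns Documentation/howto t/perf
```

When both directories share a parent, the parent lines are emitted only
once, so the output has six lines rather than eight.
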
 t/perf/p5315-pack-objects-filter.sh | 131 ++++++++++++++++++++++++++++
 1 file changed, 131 insertions(+)
 create mode 100755 t/perf/p5315-pack-objects-filter.sh

diff --git a/t/perf/p5315-pack-objects-filter.sh b/t/perf/p5315-pack-objects-filter.sh
new file mode 100755
index 0000000000..21056abfc0
--- /dev/null
+++ b/t/perf/p5315-pack-objects-filter.sh
@@ -0,0 +1,131 @@
+#!/bin/sh
+
+test_description='Tests pack-objects performance with filters and --path-walk'
+. ./perf-lib.sh
+
+test_perf_large_repo
+
+test_expect_success 'setup filter inputs' '
+	# Sample two depth-2 directories from the test repo to build
+	# a cone-mode sparse-checkout definition.  The sampling picks
+	# directories at evenly-spaced positions so the choice is stable
+	# and scales to repos of any shape.
+
+	git ls-tree -d HEAD >top-entries &&
+	grep "^040000" top-entries |
+		awk "{print \$4;}" >top-dirs &&
+	top_nr=$(wc -l <top-dirs) &&
+
+	>depth2-dirs &&
+	while read tdir
+	do
+		git ls-tree -d --name-only "HEAD:$tdir" 2>/dev/null >depth2-dirs.raw &&
+		sed "s|^|$tdir/|" <depth2-dirs.raw || return 1
+	done <top-dirs >depth2-dirs &&
+
+	d2_nr=$(wc -l <depth2-dirs) &&
+
+	if test "$d2_nr" -ge 2
+	then
+		# Pick two directories from evenly-spaced positions.
+		first=$(sed -n "1p" depth2-dirs) &&
+		mid=$(sed -n "$((d2_nr / 2 + 1))p" depth2-dirs) &&
+
+		p1=$(dirname "$first") &&
+		p2=$(dirname "$mid") &&
+
+		# Build cone-mode sparse-checkout patterns.
+		{
+			echo "/*" &&
+			echo "!/*/" &&
+			echo "/$p1/" &&
+			echo "!/$p1/*/" &&
+			if test "$p1" != "$p2"
+			then
+				echo "/$p2/" &&
+				echo "!/$p2/*/"
+			fi &&
+			echo "/$first/" &&
+			if test "$first" != "$mid"
+			then
+				echo "/$mid/"
+			fi
+		} >sparse-patterns &&
+
+		git hash-object -w sparse-patterns >sparse-oid &&
+		echo "Sparse cone: $first $mid" &&
+		cat sparse-patterns &&
+		test_set_prereq SPARSE_OID
+	elif test "$top_nr" -ge 1
+	then
+		# Fallback: use a single top-level directory.
+		first=$(sed -n "1p" top-dirs) &&
+		{
+			echo "/*" &&
+			echo "!/*/" &&
+			echo "/$first/"
+		} >sparse-patterns &&
+
+		git hash-object -w sparse-patterns >sparse-oid &&
+		echo "Sparse cone: $first" &&
+		cat sparse-patterns &&
+		test_set_prereq SPARSE_OID
+	fi
+'
+
+test_perf 'repack (no filter)' '
+	git pack-objects --stdout --no-reuse-delta --revs --all </dev/null >pk
+'
+
+test_size 'repack size (no filter)' '
+	test_file_size pk
+'
+
+test_perf 'repack (no filter, --path-walk)' '
+	git pack-objects --stdout --no-reuse-delta --revs --all --path-walk </dev/null >pk
+'
+
+test_size 'repack size (no filter, --path-walk)' '
+	test_file_size pk
+'
+
+test_perf 'repack (blob:none)' '
+	git pack-objects --stdout --no-reuse-delta --revs --all --filter=blob:none </dev/null >pk
+'
+
+test_size 'repack size (blob:none)' '
+	test_file_size pk
+'
+
+test_perf 'repack (blob:none, --path-walk)' '
+	git pack-objects --stdout --no-reuse-delta --revs --all --path-walk \
+		--filter=blob:none </dev/null >pk
+'
+
+test_size 'repack size (blob:none, --path-walk)' '
+	test_file_size pk
+'
+
+test_perf 'repack (sparse:oid)' \
+	--prereq SPARSE_OID '
+	git pack-objects --stdout --no-reuse-delta --revs --all \
+		--filter=sparse:oid=$(cat sparse-oid) </dev/null >pk
+'
+
+test_size 'repack size (sparse:oid)' \
+	--prereq SPARSE_OID '
+	test_file_size pk
+'
+
+test_perf 'repack (sparse:oid, --path-walk)' \
+	--prereq SPARSE_OID '
+	git pack-objects --stdout --no-reuse-delta --revs --all --path-walk \
+		--filter=sparse:oid=$(cat sparse-oid) </dev/null >pk
+'
+
+test_size 'repack size (sparse:oid, --path-walk)' \
+	--prereq SPARSE_OID '
+	test_file_size pk
+'
+
+test_done
-- 
gitgitgadget

