From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: gitster@pobox.com, newren@gmail.com,
Phillip Wood <phillip.wood123@gmail.com>,
Derrick Stolee <stolee@gmail.com>
Subject: [PATCH v2 0/4] Integrate the sparse index with 'git apply' and interactive add, checkout, and reset
Date: Fri, 16 May 2025 14:55:26 +0000
Message-ID: <pull.1914.v2.git.1747407330.gitgitgadget@gmail.com>
In-Reply-To: <pull.1914.git.1746579320.gitgitgadget@gmail.com>
The sparse index helps make some Git commands faster when using
sparse-checkout in cone mode. However, not all code paths are aware that the
index can have non-blob entries, so we are careful about rolling this
feature out gradually. The cost of this rollout is that some commands are
slower with the sparse index as they need to expand a sparse index into a
full index in memory, which requires parsing tree objects to construct the
full path list.
This patch series focuses on 'git add -p' and the other patch-mode commands,
which are slow with the sparse index for a few reasons, handled in the first
three patches:
 1. 'git add -p' uses 'git apply' as a subcommand and 'git apply' needs
    integration with the sparse index. Luckily, we just need to add the repo
    setting and appropriate tests to confirm it behaves as expected.
 2. The interactive modes of 'git add' ('-p' and '-i') leave cmd_add()
    before the code that sets the repo setting to allow for a sparse index.
    Patch 2 fixes this and adds appropriate tests to confirm the behavior in
    a sparse-checkout.
 3. The interactive mode of 'git reset' leaves cmd_reset() before the code
    that sets the repo setting to allow for the sparse index. Patch 3 fixes
    this.
The fourth patch adds performance tests to p2000-sparse-operations.sh to
confirm that we are getting the performance improvement we expect:
Test                                       BASE  PATCH 1      PATCH 2      PATCH 3
--------------------------------------------------------------------------------------
2000.118: ... git add -p (full-v3)         0.79  0.79  +0.0%  0.82  +3.8%  0.82  +3.8%
2000.119: ... git add -p (full-v4)         0.74  0.76  +2.7%  0.74  +0.0%  0.76  +2.7%
2000.120: ... git add -p (sparse-v3)       1.94  1.28 -34.0%  0.07 -96.4%  0.07 -96.4%
2000.121: ... git add -p (sparse-v4)       1.93  1.28 -33.7%  0.06 -96.9%  0.06 -96.9%
2000.122: ... git checkout -p (full-v3)    1.18  1.18  +0.0%  1.18  +0.0%  1.19  +0.8%
2000.123: ... git checkout -p (full-v4)    1.10  1.12  +1.8%  1.11  +0.9%  1.11  +0.9%
2000.124: ... git checkout -p (sparse-v3)  1.31  0.11 -91.6%  0.11 -91.6%  0.11 -91.6%
2000.125: ... git checkout -p (sparse-v4)  1.29  0.11 -91.5%  0.11 -91.5%  0.11 -91.5%
2000.126: ... git reset -p (full-v3)       0.81  0.80  -1.2%  0.83  +2.5%  0.83  +2.5%
2000.127: ... git reset -p (full-v4)       0.78  0.77  -1.3%  0.77  -1.3%  0.78  +0.0%
2000.128: ... git reset -p (sparse-v3)     1.58  0.92 -41.8%  0.91 -42.4%  0.07 -95.6%
2000.129: ... git reset -p (sparse-v4)     1.58  0.92 -41.8%  0.92 -41.8%  0.07 -95.6%
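As a reading aid, the percentage next to each PATCH column appears to be the relative change in run time against the BASE column. A small sketch that reproduces a few of the numbers above (the helper name pct_change is hypothetical and not part of the perf suite):

```shell
# Relative change of a new timing against a base timing, as a signed
# percentage with one decimal place.
pct_change () {
	awk -v base="$1" -v new="$2" \
		'BEGIN { printf "%+.1f%%\n", (new - base) / base * 100 }'
}

pct_change 1.94 1.28	# git add -p (sparse-v3), PATCH 1:   -34.0%
pct_change 1.94 0.07	# git add -p (sparse-v3), PATCH 2:   -96.4%
pct_change 1.58 0.07	# git reset -p (sparse-v3), PATCH 3: -95.6%
```

The full-index rows stay within a few percent either way, which is consistent with treating them as noise.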
Updates in v2
=============
Thanks to the careful review from Elijah and the pointer from Phillip, we
have these changes:
 1. The tests no longer have different expansion behaviors for 'git add -p'
    and 'git add -i' due to partially-expanded indexes on disk.
 2. We now test 'git checkout -p' and 'git reset -p'.
 3. 'git reset -p' needed some changes to the builtin (similar to 'git add')
    to be fast.
Thanks,
-Stolee
Derrick Stolee (4):
  apply: integrate with the sparse index
  git add: make -p/-i aware of sparse index
  reset: integrate sparse index with --patch
  p2000: add performance test for patch-mode commands
builtin/add.c | 7 +-
builtin/apply.c | 7 +-
builtin/reset.c | 6 +-
t/perf/p2000-sparse-operations.sh | 3 +
t/t1092-sparse-checkout-compatibility.sh | 151 +++++++++++++++++++++++
5 files changed, 167 insertions(+), 7 deletions(-)
base-commit: 6c0bd1fc70efaf053abe4e57c976afdc72d15377
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1914%2Fderrickstolee%2Fapply-sparse-index-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1914/derrickstolee/apply-sparse-index-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1914
Range-diff vs v1:
1: 0e6e199cd19 ! 1: 1adf81ecb2c apply: integrate with the sparse index
@@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse-index is n
+
+ # Expands when using --index.
+ ensure_expanded apply --index ../patch-outside &&
++
++ # Does not when index is partially expanded.
++ git -C sparse-index reset --hard &&
++ ensure_not_expanded apply --cached ../patch-outside &&
++
++ # Try again with a reset and collapsed index.
+ git -C sparse-index reset --hard &&
++ git -C sparse-index sparse-checkout reapply &&
+
-+ # Does not expand when using --cached.
-+ ensure_not_expanded apply --cached ../patch-outside
++ # Expands when index is collapsed.
++ ensure_expanded apply --cached ../patch-outside
+'
+
test_expect_success 'advice.sparseIndexExpanded' '
2: 63caae87634 ! 2: 0a2752721d0 git add: make -p/-i aware of sparse index
@@ Commit message
It turns out that control flows out of cmd_add() in the interactive
cases before the lines that confirm that the builtin is integrated with
- the sparse index. We need to move that earlier to ensure it prevents a
- full index expansion on read.
+ the sparse index.
- Add more test cases that confirm that these interactive add options work
- with the sparse index. One interesting aspect here is that the '-i'
- option avoids expanding the sparse index when a sparse directory exists
- on disk while the '-p' option does hit the ensure_full_index() method.
- This leaves some room for improvement, but this case should be atypical
- as users should remain within their sparse-checkout.
+ Moving that integration point earlier in cmd_add() allows 'git add -p'
+ and 'git add -i' to operate without expanding a sparse index to a full
+ one.
+
+ Add test cases that confirm that these interactive add options work with
+ the sparse index.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
@@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'add, commit, chec
init_repos &&
@@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse-index is not expanded: git apply' '
- ensure_not_expanded apply --cached ../patch-outside
+ ensure_expanded apply --cached ../patch-outside
'
+test_expect_success 'sparse-index is not expanded: git add -p' '
@@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse-index is n
+ git -C sparse-index reset &&
+ ensure_not_expanded add -i <in &&
+
++ # -p does expand when edits are outside sparse checkout.
+ mkdir -p sparse-index/folder1 &&
+ echo "new content" >sparse-index/folder1/a &&
-+
-+ # -p does expand when edits are outside sparse checkout.
+ test_write_lines y n y >in &&
+ ensure_expanded add -p <in &&
+
-+ # but -i does not expand.
-+ git -C sparse-index reset &&
++ # Fully reset the index.
++ git -C sparse-index reset --hard &&
++ git -C sparse-index sparse-checkout reapply &&
++
++ # -i does expand when edits are outside sparse checkout.
++ mkdir -p sparse-index/folder1 &&
++ echo "new content" >sparse-index/folder1/a &&
+ test_write_lines u 2 3 "" q >in &&
-+ ensure_not_expanded add -i <in
++ ensure_expanded add -i <in
+'
+
test_expect_success 'advice.sparseIndexExpanded' '
-: ----------- > 3: d1482a29d8f reset: integrate sparse index with --patch
3: 7a777281626 ! 4: a50c57f7628 p2000: add performance test for 'git add -p'
@@ Metadata
Author: Derrick Stolee <dstolee@microsoft.com>
## Commit message ##
- p2000: add performance test for 'git add -p'
+ p2000: add performance test for patch-mode commands
- The previous two changes contributed performance improvements to 'git
- apply' and 'git add -p' when using a sparse index. Add a performance
- test to demonstrate this (and to help validate that performance remains
- good in the future).
+ The previous three changes contributed performance improvements to 'git
+ apply', 'git add -p', and 'git reset -p' when using a sparse index. The
+ improvement to 'git apply' also improved 'git checkout -p'. Add
+ performance tests to demonstrate this (and to help validate that
+ performance remains good in the future).
In the truncated test output below, we see that the full checkout
performance changes within noise expectations, but the sparse index
- cases improve 33% and then 96%.
-
- HEAD~3 HEAD~2 HEAD~1
- ---------------------------------------------------------
- 2000.118: (full-v3) 0.80 0.84 +5.0% 0.84 +5.0%
- 2000.119: (full-v4) 0.76 0.79 +3.9% 0.80 +5.3%
- 2000.120: (sparse-v3) 2.09 1.39 -33.5% 0.07 -96.7%
- 2000.121: (sparse-v4) 2.09 1.39 -33.5% 0.07 -96.7%
+ cases improve 33% and then 96% for 'git add -p' and 41% and then 95% for
+ 'git reset -p'. 'git checkout -p' improves immediately by 91% because it
+ does not need any change to its builtin.
+
+ Test HEAD~4 HEAD~3 HEAD~2 HEAD~1
+ -------------------------------------------------------------------------------------
+ 2000.118: ... git add -p (full-v3) 0.79 0.79 +0.0% 0.82 +3.8% 0.82 +3.8%
+ 2000.119: ... git add -p (full-v4) 0.74 0.76 +2.7% 0.74 +0.0% 0.76 +2.7%
+ 2000.120: ... git add -p (sparse-v3) 1.94 1.28 -34.0% 0.07 -96.4% 0.07 -96.4%
+ 2000.121: ... git add -p (sparse-v4) 1.93 1.28 -33.7% 0.06 -96.9% 0.06 -96.9%
+ 2000.122: ... git checkout -p (full-v3) 1.18 1.18 +0.0% 1.18 +0.0% 1.19 +0.8%
+ 2000.123: ... git checkout -p (full-v4) 1.10 1.12 +1.8% 1.11 +0.9% 1.11 +0.9%
+ 2000.124: ... git checkout -p (sparse-v3) 1.31 0.11 -91.6% 0.11 -91.6% 0.11 -91.6%
+ 2000.125: ... git checkout -p (sparse-v4) 1.29 0.11 -91.5% 0.11 -91.5% 0.11 -91.5%
+ 2000.126: ... git reset -p (full-v3) 0.81 0.80 -1.2% 0.83 +2.5% 0.83 +2.5%
+ 2000.127: ... git reset -p (full-v4) 0.78 0.77 -1.3% 0.77 -1.3% 0.78 +0.0%
+ 2000.128: ... git reset -p (sparse-v3) 1.58 0.92 -41.8% 0.91 -42.4% 0.07 -95.6%
+ 2000.129: ... git reset -p (sparse-v4) 1.58 0.92 -41.8% 0.92 -41.8% 0.07 -95.6%
It is worth noting that if our test was more involved and had multiple
hunks to evaluate, then the time spent in 'git apply' would dominate due
@@ t/perf/p2000-sparse-operations.sh: test_perf_on_all git diff-tree HEAD
test_perf_on_all "git worktree add ../temp && git worktree remove ../temp"
test_perf_on_all git check-attr -a -- $SPARSE_CONE/a
+test_perf_on_all 'echo >>a && test_write_lines y | git add -p'
++test_perf_on_all 'test_write_lines y y y | git checkout --patch -'
++test_perf_on_all 'echo >>a && git add a && test_write_lines y | git reset --patch'
test_done
--
gitgitgadget