git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Shuqi Liang <cheskaqiqi@gmail.com>
To: git@vger.kernel.org
Cc: Shuqi Liang <cheskaqiqi@gmail.com>,
	vdye@github.com, gitster@pobox.com, derrickstolee@github.com
Subject: [PATCH v2] diff-tree: integrate with sparse index
Date: Thu, 18 May 2023 11:44:54 -0400	[thread overview]
Message-ID: <20230518154454.475487-1-cheskaqiqi@gmail.com> (raw)
In-Reply-To: <20230515191836.674234-1-cheskaqiqi@gmail.com>

The index is read in 'cmd_diff_tree' at two points:

1. The first index read was added in fd66bcc31ff (diff-tree: read the
index so attribute checks work in bare repositories, 2017-12-06) to deal
with reading '.gitattributes' content. 77efbb366ab (attr: be careful
about sparse directories, 2021-09-08) established that, in a sparse
index, we do _not_ try to load a '.gitattributes' file from within a
sparse directory.

2. The second index access point is involved in rename detection,
specifically when reading from stdin.This was initially added in
f0c6b2a2fd9 ([PATCH] Optimize diff-tree -[CM]--stdin, 2005-05-27), where
'setup' was set to 'DIFF_SETUP_USE_SIZE_CACHE |DIFF_SETUP_USE_CACHE'.
That assignment was later modified to drop the'DIFF_SETUP_USE_CACHE' in
ff7fe37b053 (diff.c: move read_index() code back to the caller,
2018-08-13).However, 'DIFF_SETUP_USE_SIZE_CACHE' seems to be unused as
of 6e0b8ed6d35 (diff.c: do not use a separate "size cache"., 2007-05-07)
and nothing about 'detect_rename' otherwise indicates index usage.

Hence we can just set the requires-full-index to false for "diff-tree".

Add tests that verify that 'git diff-tree' behaves correctly when the
sparse index is enabled and test to ensure the index is not expanded.

The `p2000` tests demonstrate a ~98% execution time reduction for
'git diff-tree' using a sparse index:

Test                                                before  after
-----------------------------------------------------------------------
2000.94: git diff-tree HEAD (full-v3)                0.05   0.04 -20.0%
2000.95: git diff-tree HEAD (full-v4)                0.06   0.05 -16.7%
2000.96: git diff-tree HEAD (sparse-v3)              0.59   0.01 -98.3%
2000.97: git diff-tree HEAD (sparse-v4)              0.61   0.01 -98.4%
2000.98: git diff-tree HEAD -- f2/f4/a (full-v3)     0.05   0.05 +0.0%
2000.99: git diff-tree HEAD -- f2/f4/a (full-v4)     0.05   0.04 -20.0%
2000.100: git diff-tree HEAD -- f2/f4/a (sparse-v3)  0.58   0.01 -98.3%
2000.101: git diff-tree HEAD -- f2/f4/a (sparse-v4)  0.55   0.01 -98.2%

Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
---
Change since v1:

* Update commit message.
* Use existing test repo to simplify the test.
* Add test to ensure index won't expand regardless of 'diff-tree' a file
inside or outside the cone


Range-diff against v1:
1:  47049834d1 < -:  ---------- diff-tree: integrate with sparse index
-:  ---------- > 1:  a24605f579 diff-tree: integrate with sparse index

 builtin/diff-tree.c                      |  4 +++
 t/perf/p2000-sparse-operations.sh        |  2 ++
 t/t1092-sparse-checkout-compatibility.sh | 42 ++++++++++++++++++++++++
 3 files changed, 48 insertions(+)

diff --git a/builtin/diff-tree.c b/builtin/diff-tree.c
index 0b02c62b85..c0540317fb 100644
--- a/builtin/diff-tree.c
+++ b/builtin/diff-tree.c
@@ -122,6 +122,10 @@ int cmd_diff_tree(int argc, const char **argv, const char *prefix)
 		usage(diff_tree_usage);
 
 	git_config(git_diff_basic_config, NULL); /* no "diff" UI options */
+
+	prepare_repo_settings(the_repository);
+	the_repository->settings.command_requires_full_index = 0;
+
 	repo_init_revisions(the_repository, opt, prefix);
 	if (repo_read_index(the_repository) < 0)
 		die(_("index file corrupt"));
diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh
index 901cc493ef..5a11910189 100755
--- a/t/perf/p2000-sparse-operations.sh
+++ b/t/perf/p2000-sparse-operations.sh
@@ -131,5 +131,7 @@ test_perf_on_all git describe --dirty
 test_perf_on_all 'echo >>new && git describe --dirty'
 test_perf_on_all git diff-files
 test_perf_on_all git diff-files -- $SPARSE_CONE/a
+test_perf_on_all git diff-tree HEAD
+test_perf_on_all git diff-tree HEAD -- $SPARSE_CONE/a
 
 test_done
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index e58bfbfcb4..90f827ffe9 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -2170,4 +2170,46 @@ test_expect_success 'sparse index is not expanded: diff-files' '
 	ensure_not_expanded diff-files -- "deep/*"
 '
 
+test_expect_success 'diff-tree' '
+	init_repos &&
+
+	# Test change inside sparse cone
+	tree1=$(git -C sparse-index rev-parse HEAD^{tree}) &&
+	tree2=$(git -C sparse-index rev-parse update-deep^{tree}) &&
+	test_all_match git diff-tree $tree1 $tree2 &&
+	test_all_match git diff-tree $tree1 $tree2 -- deep/a &&
+	test_all_match git diff-tree HEAD update-deep &&
+	test_all_match git diff-tree HEAD update-deep -- deep/a &&
+
+	# Test change outside sparse cone
+	tree3=$(git -C sparse-index rev-parse update-folder1^{tree}) &&
+	test_all_match git diff-tree $tree1 $tree3 &&
+	test_all_match git diff-tree $tree1 $tree3 -- folder1/a &&
+	test_all_match git diff-tree HEAD update-folder1 &&
+	test_all_match git diff-tree HEAD update-folder1 -- folder1/a &&
+
+	# Check that SKIP_WORKTREE files are not materialized
+	test_path_is_missing sparse-checkout/folder1/a &&
+	test_path_is_missing sparse-index/folder1/a &&
+	test_path_is_missing sparse-checkout/folder2/a &&
+	test_path_is_missing sparse-index/folder2/a
+'
+
+test_expect_success 'sparse-index is not expanded: diff-tree' '
+	init_repos &&
+
+	tree1=$(git -C sparse-index rev-parse HEAD^{tree}) &&
+	tree2=$(git -C sparse-index rev-parse update-deep^{tree}) &&
+	tree3=$(git -C sparse-index rev-parse update-folder1^{tree}) &&
+
+	ensure_not_expanded diff-tree $tree1 $tree2 &&
+	ensure_not_expanded diff-tree $tree1 $tree2 -- deep/a &&
+	ensure_not_expanded diff-tree HEAD update-deep &&
+	ensure_not_expanded diff-tree HEAD update-deep -- deep/a &&
+	ensure_not_expanded diff-tree $tree1 $tree3 &&
+	ensure_not_expanded diff-tree $tree1 $tree3 -- folder1/a &&
+	ensure_not_expanded diff-tree HEAD update-folder1 &&
+	ensure_not_expanded diff-tree HEAD update-folder1 -- folder1/a
+'
+
 test_done
-- 
2.39.0


  parent reply	other threads:[~2023-05-18 15:45 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-15 19:18 [RFC][PATCH V1] diff-tree: integrate with sparse index Shuqi Liang
2023-05-16 21:19 ` Victoria Dye
2023-05-16 23:14   ` Junio C Hamano
2023-05-17 18:47     ` Victoria Dye
2023-05-18 15:44 ` Shuqi Liang [this message]
2023-05-22 19:07   ` [PATCH v2] " Victoria Dye
2023-05-23  4:38     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230518154454.475487-1-cheskaqiqi@gmail.com \
    --to=cheskaqiqi@gmail.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=vdye@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).