All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Kevin Lyles via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Derrick Stolee <stolee@gmail.com>, Kevin Lyles <klyles+github@epic.com>
Subject: [PATCH v2 0/2] Mark cat-file sparse-index compatible
Date: Fri, 30 Aug 2024 21:10:41 +0000	[thread overview]
Message-ID: <pull.1770.v2.git.git.1725052243.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.1770.git.git.1724695732305.gitgitgadget@gmail.com>

Please note that this is my first contribution to git. I've tried to follow
the instructions about how to correctly submit a patch (I'm using
GitGitGadget as getting Outlook to do plain text e-mail correctly seems
impossible), but please let me know if I've missed something.

My motivation for making this change is purely performance. We have a large
repository that we enable the sparse index for, and I am developing a
pre-commit hook that (among other things) uses git cat-file to get the
staged contents of certain files. Without this change, getting the contents
of a single small file from the index can take upwards of 10 seconds due to
the index expansion. After this change, it only takes ~0.3 seconds unless
the file is outside of the sparse index.

Kevin Lyles (2):
  Allow using stdin in run_on_* functions
  Mark 'git cat-file' sparse-index compatible

 builtin/cat-file.c                       |  3 ++
 t/t1092-sparse-checkout-compatibility.sh | 48 +++++++++++++++++++++---
 2 files changed, 46 insertions(+), 5 deletions(-)


base-commit: 80ccd8a2602820fdf896a8e8894305225f86f61d
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1770%2Fklylesatepic%2Fkl%2Fmark-cat-file-sparse-index-compatible-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1770/klylesatepic/kl/mark-cat-file-sparse-index-compatible-v2
Pull-Request: https://github.com/git/git/pull/1770

Range-diff vs v1:

 -:  ----------- > 1:  7067a4c5da2 Allow using stdin in run_on_* functions
 1:  60c15dd246a ! 2:  a92825e502f Mark `cat-file` sparse-index compatible
     @@ Metadata
      Author: Kevin Lyles <klyles+github@epic.com>
      
       ## Commit message ##
     -    Mark `cat-file` sparse-index compatible
     +    Mark 'git cat-file' sparse-index compatible
      
     -    `cat-file` will expand a sparse index to a full index when needed, but
     -    is currently marked as needing a full index (or rather, not marked as
     -    not needing a full index). This results in much slower `cat-file`
     +    This change affects how 'git cat-file' works with the index when
     +    specifying an object with the ":<path>" syntax (which will give file
     +    contents from the index).
     +
     +    'git cat-file' will expand a sparse index to a full index when needed,
     +    but is currently marked as needing a full index (or rather, not marked
     +    as not needing a full index). This results in much slower 'git cat-file'
          operations when working within the sparse index, since we expand the
          index whether it's needed or not.
      
     -    Mark `cat-file` as not needing a full index, so that you only pay the
     -    cost of expanding the sparse index to a full index when you request a
     -    file outside of the sparse index.
     +    Mark 'git cat-file' as not needing a full index, so that you only pay
     +    the cost of expanding the sparse index to a full index when you request
     +    a file outside of the sparse index.
      
          Add tests to ensure both that:
     -    - `cat-file` returns the correct file contents whether or not the file
     -      is in the sparse index
     -    - `cat-file` warns about expanding to the full index any time you
     -      request something outside of the sparse index
     +    - 'git cat-file' returns the correct file contents whether or not the
     +      file is in the sparse index
     +    - 'git cat-file' expands to the full index any time you request
     +      something outside of the sparse index
      
          Signed-off-by: Kevin Lyles <klyles+github@epic.com>
      
     @@ builtin/cat-file.c: int cmd_cat_file(int argc, const char **argv, const char *pr
       		if (opt_cw)
      
       ## t/t1092-sparse-checkout-compatibility.sh ##
     -@@ t/t1092-sparse-checkout-compatibility.sh: init_repos_as_submodules () {
     - }
     - 
     - run_on_sparse () {
     -+	tee run_on_sparse-checkout run_on_sparse-index &&
     -+
     - 	(
     - 		cd sparse-checkout &&
     - 		GIT_PROGRESS_DELAY=100000 "$@" >../sparse-checkout-out 2>../sparse-checkout-err
     --	) &&
     -+	) <run_on_sparse-checkout &&
     - 	(
     - 		cd sparse-index &&
     - 		GIT_PROGRESS_DELAY=100000 "$@" >../sparse-index-out 2>../sparse-index-err
     --	)
     -+	) <run_on_sparse-index
     - }
     - 
     - run_on_all () {
     -+	tee run_on_all-full run_on_all-sparse &&
     -+
     - 	(
     - 		cd full-checkout &&
     - 		GIT_PROGRESS_DELAY=100000 "$@" >../full-checkout-out 2>../full-checkout-err
     --	) &&
     --	run_on_sparse "$@"
     -+	) <run_on_all-full &&
     -+	run_on_sparse "$@" <run_on_all-sparse
     - }
     - 
     - test_all_match () {
     -@@ t/t1092-sparse-checkout-compatibility.sh: test_sparse_unstaged () {
     - 	done
     - }
     - 
     --# Usage: test_sprase_checkout_set "<c1> ... <cN>" "<s1> ... <sM>"
     -+# Usage: test_sparse_checkout_set "<c1> ... <cN>" "<s1> ... <sM>"
     - # Verifies that "git sparse-checkout set <c1> ... <cN>" succeeds and
     - # leaves the sparse index in a state where <s1> ... <sM> are sparse
     - # directories (and <c1> ... <cN> are not).
      @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'advice.sparseIndexExpanded' '
       	grep "The sparse index is expanding to a full index" err
       '
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'advice.sparseInde
      +	echo "new content" >>sparse-index/deep/a &&
      +	run_on_all git add deep/a &&
      +
     -+	test_all_match git cat-file -p HEAD:deep/a &&
     -+	ensure_not_expanded cat-file -p HEAD:deep/a &&
     -+	test_all_match git cat-file -p HEAD:folder1/a &&
     -+	ensure_not_expanded cat-file -p HEAD:folder1/a &&
     -+
      +	test_all_match git cat-file -p :deep/a &&
      +	ensure_not_expanded cat-file -p :deep/a &&
     -+	run_on_all git cat-file -p :folder1/a &&
     -+	test_cmp full-checkout-out sparse-checkout-out &&
     -+	test_cmp full-checkout-out sparse-index-out &&
     -+	test_cmp full-checkout-err sparse-checkout-err &&
     ++	test_all_match git cat-file -p :folder1/a &&
      +	ensure_expanded cat-file -p :folder1/a'
      +
      +test_expect_success 'cat-file --batch' '
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'advice.sparseInde
      +	echo "new content" >>sparse-index/deep/a &&
      +	run_on_all git add deep/a &&
      +
     -+	cat <<-\EOF | test_all_match git cat-file --batch &&
     -+	HEAD:deep/a
     -+	:deep/a
     -+	EOF
     -+	cat <<-\EOF | ensure_not_expanded cat-file --batch &&
     -+	HEAD:deep/a
     -+	:deep/a
     -+	EOF
     ++	echo ":deep/a">in &&
     ++	test_all_match git cat-file --batch <in &&
     ++	ensure_not_expanded cat-file --batch <in &&
      +
     -+	echo "HEAD:folder1/a" | test_all_match git cat-file --batch &&
     -+	echo "HEAD:folder1/a" | ensure_not_expanded cat-file --batch &&
     ++	echo ":folder1/a">in &&
     ++	test_all_match git cat-file --batch <in &&
     ++	ensure_expanded cat-file --batch <in &&
      +
     -+	echo ":folder1/a" | run_on_all git cat-file --batch &&
     -+	test_cmp full-checkout-out sparse-checkout-out &&
     -+	test_cmp full-checkout-out sparse-index-out &&
     -+	test_cmp full-checkout-err sparse-checkout-err &&
     -+	echo ":folder1/a" | ensure_expanded cat-file --batch &&
     -+
     -+	cat <<-\EOF | run_on_all git cat-file --batch &&
     ++	cat <<-\EOF >in &&
      +	:deep/a
      +	:folder1/a
      +	EOF
     -+	test_cmp full-checkout-out sparse-checkout-out &&
     -+	test_cmp full-checkout-out sparse-index-out &&
     -+	test_cmp full-checkout-err sparse-checkout-err &&
     -+	cat <<-\EOF | ensure_expanded cat-file --batch
     -+	:deep/a
     -+	:folder1/a
     -+	EOF'
     ++	test_all_match git cat-file --batch <in &&
     ++	ensure_expanded cat-file --batch <in'
      +
       test_done

-- 
gitgitgadget

  parent reply	other threads:[~2024-08-30 21:10 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-26 18:08 [PATCH] Mark `cat-file` sparse-index compatible Kevin Lyles via GitGitGadget
2024-08-29  1:59 ` Derrick Stolee
2024-08-30 21:10 ` Kevin Lyles via GitGitGadget [this message]
2024-08-30 21:10   ` [PATCH v2 1/2] Allow using stdin in run_on_* functions Kevin Lyles via GitGitGadget
2024-08-30 21:10   ` [PATCH v2 2/2] Mark 'git cat-file' sparse-index compatible Kevin Lyles via GitGitGadget
2024-09-03 14:17     ` Derrick Stolee
2024-09-03 17:21       ` Junio C Hamano
2024-09-03 17:54   ` [PATCH v3 0/2] " Kevin Lyles via GitGitGadget
2024-09-03 17:54     ` [PATCH v3 1/2] Allow using stdin in run_on_* functions Kevin Lyles via GitGitGadget
2024-09-03 19:11       ` Junio C Hamano
2024-09-03 17:54     ` [PATCH v3 2/2] Mark 'git cat-file' sparse-index compatible Kevin Lyles via GitGitGadget
2024-09-03 19:19       ` Junio C Hamano
2024-09-03 22:06     ` [PATCH v4 0/2] builtin/cat-file: mark " Kevin Lyles via GitGitGadget
2024-09-03 22:06       ` [PATCH v4 1/2] t1092: allow run_on_* functions to use standard input Kevin Lyles via GitGitGadget
2024-09-04 16:23         ` Junio C Hamano
2024-09-03 22:06       ` [PATCH v4 2/2] builtin/cat-file: mark 'git cat-file' sparse-index compatible Kevin Lyles via GitGitGadget
2024-09-04 16:35         ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=pull.1770.v2.git.git.1725052243.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=klyles+github@epic.com \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.