From: "Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: derrickstolee@github.com, gitster@pobox.com, newren@gmail.com,
	Victoria Dye <vdye@github.com>
Subject: [PATCH 0/3] Fix use of 'cache_bottom' in sparse index
Date: Wed, 16 Mar 2022 20:11:59 +0000
Message-ID: <pull.1179.git.1647461522.gitgitgadget@gmail.com>

An issue was found by @stolee after experimenting with the structure of the
test repo used in 't/t1092-sparse-checkout-compatibility.sh' (patch [1/3])
and finding that certain commands (namely those that internally performed a
"cache diff") failed when a sparse directory appeared lexicographically
before the in-cone directory. The root cause was traced to the
'unpack_trees_options.cache_bottom' variable, which was not being advanced
properly for sparse directories. As a result, entries were double-unpacked
or improperly unpacked by 'unpack_trees', causing failures in tests that ran
'git reset -- <pathspec>' and 'git diff --staged -- <pathspec>'.

The 'cache_bottom' was handled differently in sparse indexes in the first
place based on the reasoning laid out in 17a1bb570b (unpack-trees: preserve
cache_bottom, 2021-07-14). In short, the 'cache_bottom' advancement in
'mark_ce_used(...)' was disabled for sparse directories because
'unpack_callback(...)' would advance the 'cache_bottom' based on the number
of index entries matching or inside the tree being unpacked. If the tree was
a sparse directory, 'cache_bottom' would appropriately advance by 1, and
therefore didn't need the extra advancement in 'mark_ce_used(...)'. However,
this was insufficient to properly advance the 'cache_bottom' for two
reasons:

 1. 'unpack_callback(...)' would only advance the 'cache_bottom' for sparse
    directories if the operation in progress was a "cache diff" (true for
    'git diff --staged' and 'git reset --mixed', but not - for instance -
    'git reset --hard' or 'git read-tree -m').
 2. If sparse directories were unpacked with 'unpack_index_entry(...)'
    (e.g., in 'git reset -- <pathspec>'), the cache tree-based advancement
    of 'cache_bottom' would not happen.

Luckily, the first did not appear to have any behavioral impact. However,
the second led to incorrect values being returned by 'next_cache_entry(...)'
depending on the structure of the index, causing the test failures observed
in 't1092'.
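
To make the second failure mode concrete, here is a toy C model of the
pre-fix bookkeeping. It is only a sketch: 'toy_entry', 'toy_state',
'toy_mark_used_old' and 'toy_unpack_all_old' are hypothetical names, not the
structures or functions in 'unpack-trees.c', and the model assumes entries
are marked one at a time with no tree-walk callback to move 'cache_bottom'
afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for index entries; NOT Git's real structures. */
struct toy_entry {
	const char *name;
	bool is_sparse_dir;
	bool unpacked;
};

struct toy_state {
	struct toy_entry *cache;
	size_t cache_nr;
	size_t cache_bottom;
};

/*
 * Pre-fix rule as described in the text: marking a sparse directory as
 * used never moves cache_bottom, on the assumption that something else
 * will advance it later.
 */
static void toy_mark_used_old(struct toy_state *s, size_t pos)
{
	assert(pos < s->cache_nr);
	s->cache[pos].unpacked = true;

	if (s->cache[pos].is_sparse_dir)
		return;

	if (pos == s->cache_bottom) {
		while (s->cache_bottom < s->cache_nr &&
		       s->cache[s->cache_bottom].unpacked)
			s->cache_bottom++;
	}
}

/*
 * Model of an entry-by-entry unpack: mark every entry in index order
 * and report where cache_bottom ends up.
 */
static size_t toy_unpack_all_old(struct toy_entry *cache, size_t nr)
{
	struct toy_state s = { cache, nr, 0 };

	for (size_t i = 0; i < nr; i++)
		toy_mark_used_old(&s, i);
	return s.cache_bottom;
}
```

In this model, an index whose first entry is a sparse directory leaves
'cache_bottom' at 0 even after every entry has been marked, while an index
without sparse directories ends with 'cache_bottom' equal to the number of
entries.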

To fix this, the 'cache_bottom' advancement is reinstated in
'mark_ce_used(...)', and instead it is disabled in 'unpack_callback(...)' if
the tree in question is a sparse directory. This corrects both the
non-"cache diff" cases and the 'unpack_index_entry(...)' cases while
preventing the double-advancement 17a1bb570b originally intended to avoid
(patch [2/3]).
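
The shape of that change can be sketched with a toy C model. As before, this
is an illustration under stated assumptions rather than the code in patch
[2/3]: 'toy_entry', 'toy_state', 'toy_mark_used', 'toy_callback_advance' and
the two driver functions are hypothetical names, and the callback's
cache-tree-based advancement is reduced to "skip 'matches' entries".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for index entries; NOT Git's real structures. */
struct toy_entry {
	const char *name;
	bool is_sparse_dir;
	bool unpacked;
};

struct toy_state {
	struct toy_entry *cache;
	size_t cache_nr;
	size_t cache_bottom;
};

/* Post-fix rule: any entry, sparse directory or not, can move cache_bottom. */
static void toy_mark_used(struct toy_state *s, size_t pos)
{
	assert(pos < s->cache_nr);
	s->cache[pos].unpacked = true;

	if (pos == s->cache_bottom) {
		while (s->cache_bottom < s->cache_nr &&
		       s->cache[s->cache_bottom].unpacked)
			s->cache_bottom++;
	}
}

/*
 * Stand-in for the callback's bulk advancement: skip the 'matches'
 * entries covered by a tree, unless the tree is a sparse directory,
 * which toy_mark_used() has already accounted for.
 */
static void toy_callback_advance(struct toy_state *s, size_t matches,
				 bool tree_is_sparse_dir)
{
	if (tree_is_sparse_dir)
		return;
	s->cache_bottom += matches;
	assert(s->cache_bottom <= s->cache_nr);
}

/* Entry-by-entry path: no callback involved. */
static size_t toy_unpack_all(struct toy_entry *cache, size_t nr)
{
	struct toy_state s = { cache, nr, 0 };

	for (size_t i = 0; i < nr; i++)
		toy_mark_used(&s, i);
	return s.cache_bottom;
}

/* Callback path for the sparse directory at 'pos': mark, then callback. */
static size_t toy_unpack_via_callback(struct toy_entry *cache, size_t nr,
				      size_t pos)
{
	struct toy_state s = { cache, nr, pos };

	toy_mark_used(&s, pos);
	toy_callback_advance(&s, 1, cache[pos].is_sparse_dir);
	return s.cache_bottom;
}
```

In this model the entry-by-entry path now ends with 'cache_bottom' at the end
of the index, and the callback path moves past a sparse directory exactly
once instead of twice.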

Finally, now that the cache bottom is advanced properly, we can revert the
"performance improvement" introduced in f2a454e0a5 (unpack-trees: improve
performance of next_cache_entry, 2021-11-29) that mitigated performance
issues arising in 'next_cache_entry(...)' from the non-advancing
'cache_bottom' (patch [3/3]). The performance results in
'p2000-sparse-operations.sh' showed expected variability around 0% change in
execution time (+/- 0.04s, depending on the command), with example results
for potentially-affected commands below.

'git reset'                      master            this_series                  
------------------------------------------------------------------------
full-v3                          0.51(0.21+0.27)   0.50(0.21+0.25) -2.0%
full-v4                          0.51(0.22+0.27)   0.50(0.21+0.24) -2.0%
sparse-v3                        0.30(0.04+0.55)   0.28(0.04+0.50) -6.7%
sparse-v4                        0.31(0.04+0.51)   0.29(0.04+0.51) -6.5%

'git reset -- does-not-exist'    master            this_series                  
------------------------------------------------------------------------
full-v3                          0.54(0.23+0.27)   0.55(0.22+0.28) +1.9%
full-v4                          0.56(0.25+0.26)   0.54(0.24+0.26) -3.6%
sparse-v3                        0.31(0.04+0.54)   0.31(0.04+0.50) +0.0%
sparse-v4                        0.31(0.04+0.52)   0.31(0.04+0.50) +0.0%

'git diff --cached'              master            this_series    
-------------------------------------------------------------------------
full-v3                          0.09(0.04+0.04)   0.09(0.04+0.04) +0.0%
full-v4                          0.09(0.04+0.04)   0.09(0.04+0.04) +0.0%
sparse-v3                        0.05(0.01+0.02)   0.05(0.01+0.03) +0.0%
sparse-v4                        0.04(0.01+0.02)   0.04(0.01+0.02) +0.0%
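
For readers checking the last column: the percentages above are consistent
with the relative change in the first (elapsed) time of each pair. A minimal
sketch of that arithmetic, where 'percent_change' is a hypothetical helper
and not part of the perf suite:

```c
#include <assert.h>

/* Relative change from 'before' to 'after', in percent. */
static double percent_change(double before, double after)
{
	assert(before != 0.0);	/* a zero baseline has no relative change */
	return (after - before) / before * 100.0;
}
```

For the sparse-v3 'git reset' row, 'percent_change(0.30, 0.28)' is about
-6.67, which rounds to the -6.7% shown.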


Thanks! -Victoria

Victoria Dye (3):
  t1092: add sparse directory before cone in test repo
  unpack-trees: increment cache_bottom for sparse directories
  Revert "unpack-trees: improve performance of next_cache_entry"

 t/t1092-sparse-checkout-compatibility.sh |  6 +++-
 unpack-trees.c                           | 39 +++++++++---------------
 2 files changed, 19 insertions(+), 26 deletions(-)


base-commit: 1a4874565fa3b6668042216189551b98b4dc0b1b
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1179%2Fvdye%2Fbugfix%2Fsparse-index-cache-bottom-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1179/vdye/bugfix/sparse-index-cache-bottom-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1179
-- 
gitgitgadget

Thread overview: 12+ messages
2022-03-16 20:11 Victoria Dye via GitGitGadget [this message]
2022-03-16 20:12 ` [PATCH 1/3] t1092: add sparse directory before cone in test repo Victoria Dye via GitGitGadget
2022-03-16 20:12 ` [PATCH 2/3] unpack-trees: increment cache_bottom for sparse directories Victoria Dye via GitGitGadget
2022-03-16 20:12 ` [PATCH 3/3] Revert "unpack-trees: improve performance of next_cache_entry" Victoria Dye via GitGitGadget
2022-03-16 20:21 ` [PATCH 0/3] Fix use of 'cache_bottom' in sparse index Junio C Hamano
2022-03-17 15:55 ` [PATCH v2 " Victoria Dye via GitGitGadget
2022-03-17 15:55   ` [PATCH v2 1/3] t1092: add sparse directory before cone in test repo Victoria Dye via GitGitGadget
2022-03-17 15:55   ` [PATCH v2 2/3] unpack-trees: increment cache_bottom for sparse directories Victoria Dye via GitGitGadget
2022-03-21 19:03     ` Derrick Stolee
2022-03-21 20:52       ` Junio C Hamano
2022-03-17 15:55   ` [PATCH v2 3/3] Revert "unpack-trees: improve performance of next_cache_entry" Victoria Dye via GitGitGadget
2022-03-21 19:12   ` [PATCH v2 0/3] Fix use of 'cache_bottom' in sparse index Derrick Stolee
