git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, git@jeffhostetler.com, gitster@pobox.com,
	newren@gmail.com, stolee@gmail.com,
	Derrick Stolee <derrickstolee@github.com>,
	Jeff Hostetler <jeffhost@microsoft.com>
Subject: Re: [PATCH] sparse-index: copy dir_hash in ensure_full_index()
Date: Mon, 23 Aug 2021 16:25:15 +0200	[thread overview]
Message-ID: <87h7fgfdah.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <pull.1017.git.1629136135286.gitgitgadget@gmail.com>


On Mon, Aug 16 2021, Derrick Stolee via GitGitGadget wrote:

> From: Jeff Hostetler <jeffhost@microsoft.com>
>
> Copy the 'index_state->dir_hash' back to the real istate after expanding
> a sparse index.
>
> A crash was observed in 'git status' during some hashmap lookups with
> corrupted hashmap entries.  During an index expansion, new cache-entries
> are added to the 'index_state->name_hash' and the 'dir_hash' in a
> temporary 'index_state' variable 'full'.  However, only the 'name_hash'
> hashmap from this temp variable was copied back into the real 'istate'
> variable.  The original copy of the 'dir_hash' was incorrectly
> preserved.  If the table in the 'full->dir_hash' hashmap were realloced,
> the stale version (in 'istate') would be corrupted.
>
> The test suite does not operate on index sizes sufficiently large to
> trigger this reallocation, so they do not cover this behavior.
> Increasing the test suite to cover such scale is fragile and likely
> wasteful.

How large does the index need to be to trigger this? I don't know if a
test here is useful, but FWIW if we had such a test then the EXPENSIVE
prereq + GIT_TEST_LONG=true might be a good fit for it.

> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>     sparse-index: copy dir_hash in ensure_full_index()
>     
>     This fix is an issue we discovered in our first experimental release of
>     the sparse index in the microsoft/git fork. We fixed it in the latest
>     experimental release [1] and then I almost forgot about it until we
>     started rebasing sparse-index work on top of the 2.33.0 release
>     candidates.
>     
>     [1] https://github.com/microsoft/git/releases/tag/v2.32.0.vfs.0.102.exp
>     
>     This is a change that can be taken anywhere since 4300f8 (sparse-index:
>     implement ensure_full_index(), 2021-03-30), but this version is based on
>     v2.33.0-rc2.
>     
>     While the bug is alarming for users who hit it (seg fault) it requires
>     sufficient scale and use of the optional sparse index feature. We are
>     not recommending wide adoption of the sparse index yet because we don't
>     have a sufficient density of integrated commands. For that reason, I
>     don't think this should halt progress towards the full v2.33.0 release.
>     I did want to send this as soon as possible so that could be at the
>     discretion of the maintainer.
>     
>     Thanks, -Stolee
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1017%2Fderrickstolee%2Fsparse-index%2Ffix-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1017/derrickstolee/sparse-index/fix-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1017
>
>  sparse-index.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/sparse-index.c b/sparse-index.c
> index c6b4feec413..56eb65dc349 100644
> --- a/sparse-index.c
> +++ b/sparse-index.c
> @@ -283,6 +283,7 @@ void ensure_full_index(struct index_state *istate)
>  
>  	/* Copy back into original index. */
>  	memcpy(&istate->name_hash, &full->name_hash, sizeof(full->name_hash));
> +	memcpy(&istate->dir_hash, &full->dir_hash, sizeof(full->dir_hash));
>  	istate->sparse_index = 0;
>  	free(istate->cache);
>  	istate->cache = full->cache;
>
> base-commit: 5d213e46bb7b880238ff5ea3914e940a50ae9369


  parent reply	other threads:[~2021-08-23 14:26 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-16 17:48 [PATCH] sparse-index: copy dir_hash in ensure_full_index() Derrick Stolee via GitGitGadget
2021-08-23 13:13 ` Derrick Stolee
2021-08-23 17:54   ` Elijah Newren
2021-08-23 13:22 ` Johannes Schindelin
2021-08-23 14:25 ` Ævar Arnfjörð Bjarmason [this message]
2021-08-23 18:18   ` Jeff Hostetler
2021-08-23 19:27     ` Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87h7fgfdah.fsf@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=derrickstolee@github.com \
    --cc=git@jeffhostetler.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=jeffhost@microsoft.com \
    --cc=newren@gmail.com \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).