All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, git@jeffhostetler.com, gitster@pobox.com,
	newren@gmail.com, stolee@gmail.com,
	Derrick Stolee <derrickstolee@github.com>,
	Jeff Hostetler <jeffhost@microsoft.com>
Subject: Re: [PATCH] sparse-index: copy dir_hash in ensure_full_index()
Date: Mon, 23 Aug 2021 16:25:15 +0200	[thread overview]
Message-ID: <87h7fgfdah.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <pull.1017.git.1629136135286.gitgitgadget@gmail.com>


On Mon, Aug 16 2021, Derrick Stolee via GitGitGadget wrote:

> From: Jeff Hostetler <jeffhost@microsoft.com>
>
> Copy the 'index_state->dir_hash' back to the real istate after expanding
> a sparse index.
>
> A crash was observed in 'git status' during some hashmap lookups with
> corrupted hashmap entries.  During an index expansion, new cache-entries
> are added to the 'index_state->name_hash' and the 'dir_hash' in a
> temporary 'index_state' variable 'full'.  However, only the 'name_hash'
> hashmap from this temp variable was copied back into the real 'istate'
> variable.  The original copy of the 'dir_hash' was incorrectly
> preserved.  If the table in the 'full->dir_hash' hashmap were realloced,
> the stale version (in 'istate') would be corrupted.
>
> The test suite does not operate on index sizes sufficiently large to
> trigger this reallocation, so they do not cover this behavior.
> Increasing the test suite to cover such scale is fragile and likely
> wasteful.

How large does the index need to be to trigger this? I don't know if a
test here is useful, but FWIW if we had such a test then the EXPENSIVE
prereq + GIT_TEST_LONG=true might be a good fit for it.

> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>     sparse-index: copy dir_hash in ensure_full_index()
>     
>     This fix is an issue we discovered in our first experimental release of
>     the sparse index in the microsoft/git fork. We fixed it in the latest
>     experimental release [1] and then I almost forgot about it until we
>     started rebasing sparse-index work on top of the 2.33.0 release
>     candidates.
>     
>     [1] https://github.com/microsoft/git/releases/tag/v2.32.0.vfs.0.102.exp
>     
>     This is a change that can be taken anywhere since 4300f8 (sparse-index:
>     implement ensure_full_index(), 2021-03-30), but this version is based on
>     v2.33.0-rc2.
>     
>     While the bug is alarming for users who hit it (seg fault) it requires
>     sufficient scale and use of the optional sparse index feature. We are
>     not recommending wide adoption of the sparse index yet because we don't
>     have a sufficient density of integrated commands. For that reason, I
>     don't think this should halt progress towards the full v2.33.0 release.
>     I did want to send this as soon as possible so that could be at the
>     discretion of the maintainer.
>     
>     Thanks, -Stolee
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1017%2Fderrickstolee%2Fsparse-index%2Ffix-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1017/derrickstolee/sparse-index/fix-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1017
>
>  sparse-index.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/sparse-index.c b/sparse-index.c
> index c6b4feec413..56eb65dc349 100644
> --- a/sparse-index.c
> +++ b/sparse-index.c
> @@ -283,6 +283,7 @@ void ensure_full_index(struct index_state *istate)
>  
>  	/* Copy back into original index. */
>  	memcpy(&istate->name_hash, &full->name_hash, sizeof(full->name_hash));
> +	memcpy(&istate->dir_hash, &full->dir_hash, sizeof(full->dir_hash));
>  	istate->sparse_index = 0;
>  	free(istate->cache);
>  	istate->cache = full->cache;
>
> base-commit: 5d213e46bb7b880238ff5ea3914e940a50ae9369


  parent reply	other threads:[~2021-08-23 14:26 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-16 17:48 [PATCH] sparse-index: copy dir_hash in ensure_full_index() Derrick Stolee via GitGitGadget
2021-08-23 13:13 ` Derrick Stolee
2021-08-23 17:54   ` Elijah Newren
2021-08-23 13:22 ` Johannes Schindelin
2021-08-23 14:25 ` Ævar Arnfjörð Bjarmason [this message]
2021-08-23 18:18   ` Jeff Hostetler
2021-08-23 19:27     ` Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87h7fgfdah.fsf@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=derrickstolee@github.com \
    --cc=git@jeffhostetler.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=jeffhost@microsoft.com \
    --cc=newren@gmail.com \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.