git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: Phillip Wood via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Cc: "Derrick Stolee" <dstolee@microsoft.com>,
	"René Scharfe" <l.s.r@web.de>, "Elijah Newren" <newren@gmail.com>,
	"Johannes Schindelin" <Johannes.Schindelin@gmx.de>,
	"Phillip Wood" <phillip.wood@dunelm.org.uk>,
	vdye@github.com
Subject: Re: [PATCH] [RFC] sparse index: fix use-after-free bug in cache_tree_verify()
Date: Wed, 6 Oct 2021 07:20:19 -0400	[thread overview]
Message-ID: <2b1f0e56-5bb4-7f41-5a1e-d8a21096084a@gmail.com> (raw)
In-Reply-To: <pull.1053.git.1633512591608.gitgitgadget@gmail.com>

On 10/6/2021 5:29 AM, Phillip Wood via GitGitGadget wrote:
> From: Phillip Wood <phillip.wood@dunelm.org.uk>
> 
> In a sparse index it is possible for the tree that is being verified
> to be freed while it is being verified. This happens when
> index_name_pos() looks up a entry that is missing from the index and
> that would be a descendant of a sparse entry. That triggers a call to
> ensure_full_index() which frees the cache tree that is being verified.
> Carrying on trying to verify the tree after this results in a
> use-after-free bug. Instead restart the verification if a sparse index
> is converted to a full index. This bug is triggered by a call to
> reset_head() in "git rebase --apply". Thanks to René Scharfe for his
> help analyzing the problem.

Thank you for identifying an interesting case! I hadn't thought to
change the mode from --merge to --apply.

>     In a sparse index it is possible for the tree that is being verified to
>     be freed while it is being verified. This is an RFC as I'm not familiar
>     with the cache tree code. I'm confused as to why this bug is triggered
>     by the sequence
>     
>     unpack_trees()
>     prime_cache_tree()
>     write_locked_index()
>     
>     but not
>     
>     unpack_trees()
>     write_locked_index()
>     
>     
>     as unpack_trees() appears to update the cache tree with
>     
>     if (!cache_tree_fully_valid(o->result.cache_tree))
>                 cache_tree_update(&o->result,
>                           WRITE_TREE_SILENT |
>                           WRITE_TREE_REPAIR);
>     
>     
>     and I don't understand why the cache tree from prime_cache_tree()
>     results in different behavior. It concerns me that this fix is hiding
>     another bug.

prime_cache_tree() appears to clear the cache tree and start from scratch
from a tree object instead of using the index.

In particular, prime_cache_tree_rec() does not stop at the sparse-checkout
cone, so the cache tree is the full size at that point.

When the verify_one() method reaches these nodes that are outside of the
cone, index_name_pos() triggers the index expansion in a way that the
cache-tree that is restricted to the sparse-checkout cone does not.

Hopefully that helps clear up _why_ this happens.

There is a remaining issue that "git rebase --apply" will be a lot slower
than "git rebase --merge" because of this construction of a cache-tree
that is much larger than necessary.

I will make note of this as a potential improvement for the future.

> -static void verify_one(struct repository *r,
> -		       struct index_state *istate,
> -		       struct cache_tree *it,
> -		       struct strbuf *path)
> +static int verify_one(struct repository *r,
> +		      struct index_state *istate,
> +		      struct cache_tree *it,
> +		      struct strbuf *path)
>  {
>  	int i, pos, len = path->len;
>  	struct strbuf tree_buf = STRBUF_INIT;
> @@ -837,21 +837,30 @@ static void verify_one(struct repository *r,
>  
>  	for (i = 0; i < it->subtree_nr; i++) {
>  		strbuf_addf(path, "%s/", it->down[i]->name);
> -		verify_one(r, istate, it->down[i]->cache_tree, path);
> +		if (verify_one(r, istate, it->down[i]->cache_tree, path))
> +			return 1;
>  		strbuf_setlen(path, len);
>  	}
>  
>  	if (it->entry_count < 0 ||
>  	    /* no verification on tests (t7003) that replace trees */
>  	    lookup_replace_object(r, &it->oid) != &it->oid)
> -		return;
> +		return 0;
>  
>  	if (path->len) {
> +		/*
> +		 * If the index is sparse index_name_pos() may trigger
> +		 * ensure_full_index() which will free the tree that is being
> +		 * verified.
> +		 */
> +		int is_sparse = istate->sparse_index;
>  		pos = index_name_pos(istate, path->buf, path->len);
> +		if (is_sparse && !istate->sparse_index)
> +			return 1;

I think this guard is good to have, even if we fix prime_cache_tree() to
avoid triggering expansion here in most cases.

>  		if (pos >= 0) {
>  			verify_one_sparse(r, istate, it, path, pos);
> -			return;
> +			return 0;
>  		}
>  
>  		pos = -pos - 1;
> @@ -899,6 +908,7 @@ static void verify_one(struct repository *r,
>  		    oid_to_hex(&new_oid), oid_to_hex(&it->oid));
>  	strbuf_setlen(path, len);
>  	strbuf_release(&tree_buf);
> +	return 0;
>  }
>  
>  void cache_tree_verify(struct repository *r, struct index_state *istate)
> @@ -907,6 +917,9 @@ void cache_tree_verify(struct repository *r, struct index_state *istate)
>  
>  	if (!istate->cache_tree)
>  		return;
> -	verify_one(r, istate, istate->cache_tree, &path);
> +	if (verify_one(r, istate, istate->cache_tree, &path)) {
> +		strbuf_reset(&path);
> +		verify_one(r, istate, istate->cache_tree, &path);
> +	}

And this limits us to doing at most two passes. Good.

>  test_expect_success 'merge, cherry-pick, and rebase' '
>  	init_repos &&
>  
> -	for OPERATION in "merge -m merge" cherry-pick rebase
> +	for OPERATION in "merge -m merge" cherry-pick "rebase --apply" "rebase --merge"

Thank you for the additional test!

Thanks,
-Stolee

  reply	other threads:[~2021-10-06 11:20 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-06  9:29 [PATCH] [RFC] sparse index: fix use-after-free bug in cache_tree_verify() Phillip Wood via GitGitGadget
2021-10-06 11:20 ` Derrick Stolee [this message]
2021-10-06 14:01   ` Phillip Wood
2021-10-06 14:19     ` Derrick Stolee
2021-10-06 19:17 ` Junio C Hamano
2021-10-06 20:43   ` Derrick Stolee
2021-10-07  9:50 ` [PATCH v2] " Phillip Wood via GitGitGadget
2021-10-07 13:35   ` Derrick Stolee
2021-10-07 14:59     ` Phillip Wood
2021-10-07 13:53   ` Derrick Stolee
2021-10-07 15:05     ` Phillip Wood
2021-10-07 15:44       ` Derrick Stolee
2021-10-07 17:59         ` Phillip Wood
2021-10-07 18:07   ` [PATCH v3] " Phillip Wood via GitGitGadget
2021-10-07 21:23     ` Junio C Hamano
2021-10-08  9:09       ` Phillip Wood
2021-10-08 18:53         ` Derrick Stolee
2021-10-08 19:57         ` Junio C Hamano
2021-10-14 13:34           ` Phillip Wood
2021-10-14 16:42             ` Junio C Hamano
2021-10-08  9:38     ` Bagas Sanjaya
2021-10-14  9:40       ` Phillip Wood
2021-10-16  9:07     ` [PATCH v4] " Phillip Wood via GitGitGadget
2021-10-17  5:38       ` Junio C Hamano
2021-10-17 19:35         ` Derrick Stolee
2021-10-18  9:37         ` Phillip Wood

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2b1f0e56-5bb4-7f41-5a1e-d8a21096084a@gmail.com \
    --to=stolee@gmail.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=l.s.r@web.de \
    --cc=newren@gmail.com \
    --cc=phillip.wood@dunelm.org.uk \
    --cc=vdye@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).