From: Derrick Stolee <stolee@gmail.com>
To: Phillip Wood via GitGitGadget <gitgitgadget@gmail.com>,
git@vger.kernel.org
Cc: "Derrick Stolee" <dstolee@microsoft.com>,
"René Scharfe" <l.s.r@web.de>, "Elijah Newren" <newren@gmail.com>,
"Johannes Schindelin" <Johannes.Schindelin@gmx.de>,
"Phillip Wood" <phillip.wood@dunelm.org.uk>,
vdye@github.com
Subject: Re: [PATCH] [RFC] sparse index: fix use-after-free bug in cache_tree_verify()
Date: Wed, 6 Oct 2021 07:20:19 -0400
Message-ID: <2b1f0e56-5bb4-7f41-5a1e-d8a21096084a@gmail.com>
In-Reply-To: <pull.1053.git.1633512591608.gitgitgadget@gmail.com>

On 10/6/2021 5:29 AM, Phillip Wood via GitGitGadget wrote:
> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>
> In a sparse index it is possible for the tree that is being verified
> to be freed while it is being verified. This happens when
> index_name_pos() looks up a entry that is missing from the index and
> that would be a descendant of a sparse entry. That triggers a call to
> ensure_full_index() which frees the cache tree that is being verified.
> Carrying on trying to verify the tree after this results in a
> use-after-free bug. Instead restart the verification if a sparse index
> is converted to a full index. This bug is triggered by a call to
> reset_head() in "git rebase --apply". Thanks to René Scharfe for his
> help analyzing the problem.

Thank you for identifying an interesting case! I hadn't thought to
change the mode from --merge to --apply.

> In a sparse index it is possible for the tree that is being verified to
> be freed while it is being verified. This is an RFC as I'm not familiar
> with the cache tree code. I'm confused as to why this bug is triggered
> by the sequence
>
> unpack_trees()
> prime_cache_tree()
> write_locked_index()
>
> but not
>
> unpack_trees()
> write_locked_index()
>
>
> as unpack_trees() appears to update the cache tree with
>
>     if (!cache_tree_fully_valid(o->result.cache_tree))
>             cache_tree_update(&o->result,
>                               WRITE_TREE_SILENT |
>                               WRITE_TREE_REPAIR);
>
>
> and I don't understand why the cache tree from prime_cache_tree()
> results in different behavior. It concerns me that this fix is hiding
> another bug.

prime_cache_tree() appears to clear the cache tree and rebuild it from
scratch from a tree object instead of using the index.

In particular, prime_cache_tree_rec() does not stop at the boundary of the
sparse-checkout cone, so at that point the cache tree covers the full tree.

When verify_one() reaches the nodes that are outside of the cone,
index_name_pos() triggers the index expansion. A cache tree that is
restricted to the sparse-checkout cone does not trigger it.

Hopefully that helps clear up _why_ this happens.

There is a remaining issue: "git rebase --apply" will be a lot slower
than "git rebase --merge" because it constructs a cache tree that is
much larger than necessary.

I will make note of this as a potential improvement for the future.

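To make the difference concrete, here is a toy model of the two walks. It
is only a sketch under simplifying assumptions, not git's code: the names
toy_sparse_index, toy_name_pos() and toy_walk_expands() are made up, and
paths are plain strings. It assumes that a lookup matching a sparse
directory entry exactly is found as-is, while a lookup strictly below one
forces the expansion.

```c
#include <stddef.h>
#include <string.h>

/* Toy model only: none of these names exist in git. */
struct toy_sparse_index {
	const char *const *sparse_dirs; /* collapsed directories, e.g. "docs/" */
	size_t sparse_nr;
	int expanded;                   /* set once a lookup forces expansion */
};

/*
 * Stand-in for index_name_pos(): a path that matches a sparse directory
 * entry exactly is found as-is, while a path strictly below one is
 * missing from the sparse index and forces the expansion.
 */
static void toy_name_pos(struct toy_sparse_index *idx, const char *path)
{
	size_t i;

	for (i = 0; i < idx->sparse_nr && !idx->expanded; i++) {
		const char *dir = idx->sparse_dirs[i];
		size_t len = strlen(dir);

		if (!strncmp(path, dir, len) && path[len] != '\0')
			idx->expanded = 1;
	}
}

/* Look up every cache tree node; report whether the walk expanded the index. */
static int toy_walk_expands(struct toy_sparse_index *idx,
			    const char *const *nodes, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++)
		toy_name_pos(idx, nodes[i]);
	return idx->expanded;
}
```

With "docs/" as the only sparse directory, walking the nodes "src/" and
"docs/" leaves the index sparse. Adding a node such as "docs/api/", which
is what a cache tree built from the full tree object would contain,
expands it.
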
> -static void verify_one(struct repository *r,
> -		       struct index_state *istate,
> -		       struct cache_tree *it,
> -		       struct strbuf *path)
> +static int verify_one(struct repository *r,
> +		      struct index_state *istate,
> +		      struct cache_tree *it,
> +		      struct strbuf *path)
>  {
>  	int i, pos, len = path->len;
>  	struct strbuf tree_buf = STRBUF_INIT;
> @@ -837,21 +837,30 @@ static void verify_one(struct repository *r,
>
>  	for (i = 0; i < it->subtree_nr; i++) {
>  		strbuf_addf(path, "%s/", it->down[i]->name);
> -		verify_one(r, istate, it->down[i]->cache_tree, path);
> +		if (verify_one(r, istate, it->down[i]->cache_tree, path))
> +			return 1;
>  		strbuf_setlen(path, len);
>  	}
>
>  	if (it->entry_count < 0 ||
>  	    /* no verification on tests (t7003) that replace trees */
>  	    lookup_replace_object(r, &it->oid) != &it->oid)
> -		return;
> +		return 0;
>
>  	if (path->len) {
> +		/*
> +		 * If the index is sparse index_name_pos() may trigger
> +		 * ensure_full_index() which will free the tree that is being
> +		 * verified.
> +		 */
> +		int is_sparse = istate->sparse_index;
>  		pos = index_name_pos(istate, path->buf, path->len);
> +		if (is_sparse && !istate->sparse_index)
> +			return 1;

I think this guard is good to have, even if we fix prime_cache_tree() to
avoid triggering expansion here in most cases.

>  		if (pos >= 0) {
>  			verify_one_sparse(r, istate, it, path, pos);
> -			return;
> +			return 0;
>  		}
>
>  		pos = -pos - 1;
> @@ -899,6 +908,7 @@ static void verify_one(struct repository *r,
>  		    oid_to_hex(&new_oid), oid_to_hex(&it->oid));
>  	strbuf_setlen(path, len);
>  	strbuf_release(&tree_buf);
> +	return 0;
>  }
>
>  void cache_tree_verify(struct repository *r, struct index_state *istate)
> @@ -907,6 +917,9 @@ void cache_tree_verify(struct repository *r, struct index_state *istate)
>
>  	if (!istate->cache_tree)
>  		return;
> -	verify_one(r, istate, istate->cache_tree, &path);
> +	if (verify_one(r, istate, istate->cache_tree, &path)) {
> +		strbuf_reset(&path);
> +		verify_one(r, istate, istate->cache_tree, &path);
> +	}

And this limits us to doing at most two passes. Good.

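The reason two passes suffice can be shown with a standalone toy model of
the snapshot-and-restart pattern. This is a sketch under stated
assumptions, not the patch itself: toy_index, toy_tree, toy_expand(),
toy_lookup(), toy_verify_one() and toy_verify() are made-up names standing
in for the index, the cache tree, ensure_full_index(), index_name_pos(),
verify_one() and cache_tree_verify(), and the "outside_cone" argument is a
hypothetical shortcut for "this lookup would hit a path below a sparse
directory".

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: none of these names exist in git. */
struct toy_tree {
	int entry_count;
};

struct toy_index {
	int sparse;            /* 1 while the index is still sparse */
	int expansions;        /* how many times toy_expand() ran */
	struct toy_tree *tree; /* freed and rebuilt on expansion */
};

static struct toy_tree *toy_tree_new(void)
{
	struct toy_tree *t = calloc(1, sizeof(*t));

	if (!t)
		abort();
	return t;
}

static struct toy_index toy_index_new(int sparse)
{
	struct toy_index idx = { sparse, 0, toy_tree_new() };

	return idx;
}

/* Stand-in for ensure_full_index(): the old tree is freed here. */
static void toy_expand(struct toy_index *idx)
{
	free(idx->tree);
	idx->tree = toy_tree_new();
	idx->sparse = 0;
	idx->expansions++;
}

/* Stand-in for index_name_pos(): may expand a sparse index. */
static void toy_lookup(struct toy_index *idx, int outside_cone)
{
	if (idx->sparse && outside_cone)
		toy_expand(idx);
}

/* Returns 1 when "tree" may have been freed and the caller must restart. */
static int toy_verify_one(struct toy_index *idx, struct toy_tree *tree,
			  int outside_cone)
{
	int was_sparse = idx->sparse;

	toy_lookup(idx, outside_cone);
	if (was_sparse && !idx->sparse)
		return 1; /* "tree" is dangling; do not touch it */

	assert(tree->entry_count == 0); /* safe: no expansion happened */
	return 0;
}

/* Returns the number of passes; the flag flips at most once, so at most 2. */
static int toy_verify(struct toy_index *idx, int outside_cone)
{
	int passes = 1;

	if (toy_verify_one(idx, idx->tree, outside_cone)) {
		passes++;
		toy_verify_one(idx, idx->tree, outside_cone);
	}
	return passes;
}
```

In this model the sparse flag only ever goes from 1 to 0, so the second
pass always starts on a full index and cannot ask for another restart.
Note that the second pass reloads the tree pointer from the index rather
than reusing the one that was freed.
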
>  test_expect_success 'merge, cherry-pick, and rebase' '
>  	init_repos &&
>
> -	for OPERATION in "merge -m merge" cherry-pick rebase
> +	for OPERATION in "merge -m merge" cherry-pick "rebase --apply" "rebase --merge"

Thank you for the additional test!

Thanks,
-Stolee