From: Patrick Steinhardt <ps@pks.im>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 2/3] cache-tree: detect mismatching number of index entries
Date: Tue, 24 Sep 2024 08:48:04 +0200 [thread overview]
Message-ID: <ZvJgnqSwANCXmj0G@pks.im> (raw)
In-Reply-To: <xmqqttec8ly0.fsf@gitster.g>
On Wed, Sep 18, 2024 at 06:35:35PM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > + if (it->entry_count + pos > istate->cache_nr) {
> > + ret = error(_("corrupted cache-tree has entries not present in index"));
> > + goto out;
> > + }
>
> Is it a safe assumption that the if() condition always indicates an
> error? When sparse-index is in effect, istate->cache_nr may be a
> number that is smaller than the true number of paths in the index
> (because all paths under a subdirectory we are not interested in are
> folded into a single tree-ish entry), and I am not sure how it
> should interact with it->entry_count (i.e. the number of paths under
> the current directory we are looking at, which obviously cannot be a
> sparsified entry) and pos (i.e. the index into active_cache[] that
> represend the first path under the current directory)?
>
> I guess as long as "it" is not folded, it does not matter how other
> paths from different directories in active_cache[] are sparsified or
> expanded, as long as "pos" keeps track of the current position
> correctly.
It seems like we end up calling `ensure_full_index()` for a sparse
index, which does cause us to signal to the caller that they should
restart verification. So for all I understand, this function shouldn't
act on a sparsely-populated index.
But I cannot see how it could lead to anything sensible when the added
condition is violated because the first thing we do in the loop is this:
struct cache_entry *ce = istate->cache[pos + i];
And before we do anything else, we dereference that pointer. So if the
condition doesn't hold we _will_ get an out-of-bounds read of the cache
array and act on the garbage data. And that causes the observed segfault
on my machine and in the test.
So I think that ensuring this property is always the right thing to do.
But I wouldn't be surprised if overall this code could require more love
to make it behave sanely in all scenarios. It certainly feels somewhat
fragile to me.
Patrick
next prev parent reply other threads:[~2024-09-24 6:48 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-17 7:13 [PATCH 0/3] cache-tree: fix segfaults with invalid cache-trees Patrick Steinhardt
2024-09-17 7:13 ` [PATCH 1/3] cache-tree: refactor verification to return error codes Patrick Steinhardt
2024-09-17 17:05 ` Eric Sunshine
2024-09-18 5:11 ` Patrick Steinhardt
2024-09-17 7:13 ` [PATCH 2/3] cache-tree: detect mismatching number of index entries Patrick Steinhardt
2024-09-19 1:35 ` Junio C Hamano
2024-09-24 6:48 ` Patrick Steinhardt [this message]
2024-09-24 17:01 ` Junio C Hamano
2024-09-17 7:13 ` [PATCH 3/3] unpack-trees: detect mismatching number of cache-tree/index entries Patrick Steinhardt
2024-10-07 4:38 ` [PATCH v2 0/3] cache-tree: fix segfaults with invalid cache-trees Patrick Steinhardt
2024-10-07 4:38 ` [PATCH v2 1/3] cache-tree: refactor verification to return error codes Patrick Steinhardt
2024-10-07 4:38 ` [PATCH v2 2/3] cache-tree: detect mismatching number of index entries Patrick Steinhardt
2024-10-07 4:38 ` [PATCH v2 3/3] unpack-trees: detect mismatching number of cache-tree/index entries Patrick Steinhardt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZvJgnqSwANCXmj0G@pks.im \
--to=ps@pks.im \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).