From: Junio C Hamano <gitster@pobox.com>
To: "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Elijah Newren <newren@gmail.com>
Subject: Re: [PATCH 5/5] cache-tree: fix verify_cache() to catch non-adjacent D/F conflicts
Date: Mon, 01 Jun 2026 21:33:12 +0900 [thread overview]
Message-ID: <xmqqldcy4f07.fsf@gitster.g> (raw)
In-Reply-To: <a87bbaa84fd5dcb2a585f82c4a5dfa1572b54588.1776731171.git.gitgitgadget@gmail.com> (Elijah Newren via GitGitGadget's message of "Tue, 21 Apr 2026 00:26:11 +0000")
"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> I could not find any caller in current git that both allows the index to
> get into this state and then tries to write it out without doing other
> checks beyond the verify_cache() call in cache_tree_update(), but
> verify_cache() is documented as a safety net for preventing corrupt
> trees and should actually provide that guarantee.
Oh, absolutely. This kind of tightening is very much appreciated.
> diff --git a/cache-tree.c b/cache-tree.c
> index 7881b42aa2..f11844fe72 100644
> --- a/cache-tree.c
> +++ b/cache-tree.c
> @@ -192,22 +192,62 @@ static int verify_cache(struct index_state *istate, int flags)
> for (i = 0; i + 1 < istate->cache_nr; i++) {
> /* path/file always comes after path because of the way
> * the cache is sorted. Also path can appear only once,
> - * which means conflicting one would immediately follow.
> + * so path/file is likely the immediately following path
> + * but might be separated if there is e.g. a
> + * path-internal/... file.
> */
> const struct cache_entry *this_ce = istate->cache[i];
> const struct cache_entry *next_ce = istate->cache[i + 1];
> const char *this_name = this_ce->name;
> const char *next_name = next_ce->name;
> int this_len = ce_namelen(this_ce);
> + const char *conflict_name = NULL;
> +
> if (this_len < ce_namelen(next_ce) &&
> - next_name[this_len] == '/' &&
> + next_name[this_len] <= '/' &&
> strncmp(this_name, next_name, this_len) == 0) {
> + if (next_name[this_len] == '/') {
> + conflict_name = next_name;
> + } else if (next_name[this_len] < '/') {
> + /*
> + * The immediately next entry shares our
> + * prefix but sorts before "path/" (e.g.,
> + * "path-internal" between "path" and
> + * "path/file", since '-' (0x2D) < '/'
> + * (0x2F)). Binary search to find where
> + * "path/" would be and check for a D/F
> + * conflict there.
> + */
> + struct cache_entry *other;
> + struct strbuf probe = STRBUF_INIT;
> + int pos;
> +
> + strbuf_add(&probe, this_name, this_len);
> + strbuf_addch(&probe, '/');
> + pos = index_name_pos_sparse(istate,
> + probe.buf,
> + probe.len);
> + strbuf_release(&probe);
> +
> + if (pos < 0)
> + pos = -pos - 1;
> + if (pos >= (int)istate->cache_nr)
> + continue;
> + other = istate->cache[pos];
> + if (ce_namelen(other) > this_len &&
> + other->name[this_len] == '/' &&
> + !strncmp(this_name, other->name, this_len))
> + conflict_name = other->name;
> + }
> + }
The narrow and tall comment block is a sign that this loop is
getting too deeply nested. I wonder if it makes it easier to follow
if we extract this new logic into a small helper function on its
own?
What the code checks and how it does so both make sense to me, though.
Thanks.
next prev parent reply other threads:[~2026-06-01 12:33 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-21 0:26 [PATCH 0/5] Duplicate entry hardening Elijah Newren via GitGitGadget
2026-04-21 0:26 ` [PATCH 1/5] merge-ort: propagate callback errors from traverse_trees_wrapper() Elijah Newren via GitGitGadget
2026-06-01 12:13 ` Junio C Hamano
2026-04-21 0:26 ` [PATCH 2/5] merge-ort: drop unnecessary show_all_errors from collect_merge_info() Elijah Newren via GitGitGadget
2026-06-01 12:23 ` Junio C Hamano
2026-04-21 0:26 ` [PATCH 3/5] merge-ort: free diff pairs queue in clear_or_reinit_internal_opts() Elijah Newren via GitGitGadget
2026-04-21 0:26 ` [PATCH 4/5] merge-ort: abort merge when trees have duplicate entries Elijah Newren via GitGitGadget
2026-06-01 12:23 ` Junio C Hamano
2026-04-21 0:26 ` [PATCH 5/5] cache-tree: fix verify_cache() to catch non-adjacent D/F conflicts Elijah Newren via GitGitGadget
2026-06-01 12:33 ` Junio C Hamano [this message]
2026-06-01 12:33 ` [PATCH 0/5] Duplicate entry hardening Junio C Hamano
2026-06-01 13:54 ` Patrick Steinhardt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqldcy4f07.fsf@gitster.g \
--to=gitster@pobox.com \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=newren@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.