From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: "D. Ben Knoble" <ben.knoble@gmail.com>, Git <git@vger.kernel.org>
Subject: Re: git-diff in a worktree is an order of magnitude slower?
Date: Sun, 21 Jun 2026 13:24:32 -0400
Message-ID: <20260621172432.GA2206349@coredump.intra.peff.net>
In-Reply-To: <xmqqa4sog1e9.fsf@gitster.g>
On Sat, Jun 20, 2026 at 05:53:02PM -0700, Junio C Hamano wrote:
> > which dates to aecbf914c4 (git-diff: resurrect the traditional empty
> > "diff --git" behaviour, 2007-08-31). On my system that comparison is
> > false because the double-negation produces 1
> > (diff_auto_refresh_index=1 or the result of git_config_bool).
>
> Not quite. It was false because double-negation initializes the
> member to 1, which causes a call to diffcore_skip_stat_unmatch()
> be made, *and* the diffcore_skip_stat_unmatch() function did not
> find any ghost changes, i.e., paths that were only stat-dirty hence
> needed a call to refresh_index_quietly().
I think this is the core of the issue. These entries are "racy git
dirty" in the sense that their mtimes are the same as the index mtime,
and so we double-check the contents. This is the first bullet point
under the "Racy Git" section of Documentation/technical/racy-git.adoc.
But diffcore_skip_stat_unmatch() doesn't count them as dirty, so we
don't increment the counter, and thus top-level git-diff won't write out
the new index. And thus every subsequent diff repeats the same
expensive double-check.
But I'm not sure where the blame lies. Either:

  1. diffcore_skip_stat_unmatch() should be counting these in its
     "dirty" counter; or

  2. the index should be marking these differently. The second bullet
     point of that Racy Git section says:

       When the index file is updated that contains racily clean
       entries, cached `st_size` information is truncated to zero
       before writing a new version of the index file.

     Should the index be written out with a 0 size field here, so that
     we know they are dirty and should be updated? I guess that would be
     user-visible, though, because commands that _don't_ update the
     index (like plumbing diff-files) would generate a spurious diff
     there rather than doing the content-level comparison.
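As a rough sketch of what option 2 would mean, the following C models a literal reading of the quoted sentence: zero the cached size of every racily clean entry before writing. The struct and the toy_* functions are hypothetical, and real git may apply further conditions before truncating the size. The second function shows the user-visible downside: a stat-only comparison would then report the entry as dirty even though the file is unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical, simplified cached stat data; for illustration only. */
struct toy_cached_stat {
	time_t mtime;
	long size;
};

/*
 * Zero the cached size of each racily clean entry (assumed here to
 * mean "index mtime <= entry mtime") just before the index is written.
 * Returns how many entries were changed.
 */
static size_t toy_smudge_racy(struct toy_cached_stat *entries, size_t n,
			      time_t index_mtime)
{
	size_t smudged = 0;
	size_t i;

	assert(entries || !n);
	for (i = 0; i < n; i++) {
		if (index_mtime <= entries[i].mtime && entries[i].size != 0) {
			entries[i].size = 0;
			smudged++;
		}
	}
	return smudged;
}

/* A stat-only comparison, as a command that skips the content check might do. */
static int toy_stat_says_dirty(const struct toy_cached_stat *cached,
			       time_t file_mtime, long file_size)
{
	return cached->mtime != file_mtime || cached->size != file_size;
}
```

After toy_smudge_racy(), an unchanged non-empty file no longer matches its cached size, so toy_stat_says_dirty() reports it as dirty until something refreshes the entry.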
I dunno. You had solved most of the racy git stuff before I came along,
so I never gave it too much thought (and what little thought I did was
many years ago).
> > So… has that conditional been quietly dead all this time? I can't
> > imagine that's right, but…
>
> I initially thought it was an embarrassing thinko, but after seeing
> how .skip_stat_unmatch is used as a 1-based counter (i.e., if the
> member says 42, it means it saw 41 paths that were stat-dirty but
> without actual content change), I do not think so.
>
> Now, it is a different matter if such a "dual" purpose "more than a
> simple boolean" counter is a good idea. Apparently it confused both
> of us in this case ;-).
Make that three of us. ;)
-Peff