From: Jeff King <peff@peff.net>
To: Tian Yuchen <cat@malon.dev>
Cc: Justin Tobler <jltobler@gmail.com>,
Luca Stefani <luca.stefani.ge1@gmail.com>,
git@vger.kernel.org
Subject: Re: [BUG] git diff --no-index segfaults on large files (NULL object database)
Date: Sat, 4 Apr 2026 19:09:39 -0400
Message-ID: <20260404230939.GA1360412@coredump.intra.peff.net>
In-Reply-To: <4be492cf-347b-4fa5-9bdd-83e7ea8abd92@malon.dev>

On Sun, Apr 05, 2026 at 01:07:27AM +0800, Tian Yuchen wrote:
> On 4/5/26 00:53, Luca Stefani wrote:
> > Thanks for looking into it.
> > Locally, I simply check against null storage and it works just fine,
> > flags is always 0 in my experiments so a check against
> > INDEX_WRITE_OBJECT also worked.
> >
> > diff --git a/object-file.c b/object-file.c
> > index f0b029ff0b..68303aa99c 100644
> > --- a/object-file.c
> > +++ b/object-file.c
> > @@ -1654,7 +1654,8 @@ int index_fd(struct index_state *istate, struct object_id *oid,
> >  	} else if ((st->st_size >= 0 &&
> >  		    (size_t)st->st_size <= repo_settings_get_big_file_threshold(istate->repo)) ||
> >  		   type != OBJ_BLOB ||
> > -		   (path && would_convert_to_git(istate, path))) {
> > +		   (path && would_convert_to_git(istate, path)) ||
> > +		   !(flags & INDEX_WRITE_OBJECT)) {
> >  		ret = index_core(istate, oid, fd, xsize_t(st->st_size),
> >  				 type, path, flags);
> >  	} else {
> >
> > Luca.
>
> That looks good, almost exactly what I was about to send. I was mistaken—
> there isn’t a hash_write_object flag after all ;-)
>
> It looks like this is your first time posting on the Git mailing list. Would
> you consider contributing this (as a patch)?

Alternatively, should the odb transaction system be more forgiving here,
and act as a noop when there is no odb?

Bisecting the segfault yields ce1661f9da (odb: add transaction
interface, 2025-09-16). Before then, we passed around the
object_database itself, saw that its transaction field was NULL, and
returned immediately. After that commit, we pass the object_database to
odb_transaction_begin(), which narrows it to odb->sources (which is
NULL) while passing to object_file_transaction_begin(). And then that
function looks at source->odb to go back to the object_database! But
since the source is NULL, it segfaults.
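
To make the pointer chain concrete, here is a minimal sketch of it. The
structs are simplified stand-ins that share names with Git's but carry
only the two fields mentioned above, and
source_backpointer_is_reachable() is a hypothetical helper, not a
function in Git. It walks odb->sources and reports whether the
source->odb step would be safe:

```c
#include <stddef.h>

/* Simplified stand-ins; Git's real structs have many more fields. */
struct object_database;

struct odb_source {
	struct object_database *odb; /* back-pointer to the owning database */
};

struct object_database {
	struct odb_source *sources; /* NULL when there is no object store */
};

/*
 * Hypothetical helper: follows the same pointers as the call chain
 * described above. Returns 1 if source->odb can be read and points
 * back at odb, and 0 if reading it would dereference a NULL source.
 */
int source_backpointer_is_reachable(const struct object_database *odb)
{
	/* The "narrowing" step: only the source is handed onward. */
	const struct odb_source *source = odb->sources;

	if (!source)
		return 0; /* source->odb at this point is the NULL dereference */
	return source->odb == odb;
}
```

With sources left NULL, as in the --no-index case described here, the
helper returns 0, which marks the spot where the unguarded code
crashes.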

Immediately after that commit, the switch from taking an odb to a source
is not helpful, though I think eventually it is used to set
transaction->base.source. But should the whole thing check for a NULL
source and return early? Or otherwise establish some kind of noop
transaction?
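
As a rough illustration of the "return early" idea, here is a sketch
under stated assumptions. The sketch_* functions and the one-field
structs are hypothetical and are not Git's actual transaction API. A
NULL source yields a NULL transaction, and the commit side treats NULL
as "nothing to do", so callers need no special casing:

```c
#include <stdlib.h>

/* Stand-in types for illustration only. */
struct odb_source {
	int placeholder;
};

struct odb_transaction {
	struct odb_source *source;
};

/* Hypothetical: begin a transaction, or return NULL as a noop if there is no source. */
struct odb_transaction *sketch_transaction_begin(struct odb_source *source)
{
	struct odb_transaction *transaction;

	if (!source)
		return NULL; /* no object store: nothing to batch */

	transaction = malloc(sizeof(*transaction));
	if (!transaction)
		abort(); /* keep allocation failure distinct from the noop case */
	transaction->source = source;
	return transaction;
}

/* Hypothetical: returns 1 if a real transaction was finished, 0 for a noop. */
int sketch_transaction_commit(struct odb_transaction *transaction)
{
	if (!transaction)
		return 0;
	free(transaction);
	return 1;
}
```

Whether NULL or a dedicated noop object is the better representation
is a design question this sketch does not try to answer.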

I haven't thought about the implications (nor even really looked at odb
transaction code before). But doing it that way would fix not only this
bug, but also other potential bugs throughout the code base when callers
start a noop transaction.

+cc Justin (author of ce1661f9da) for any thoughts.

-Peff