All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Luca Stefani <luca.stefani.ge1@gmail.com>
Cc: git@vger.kernel.org, cat@malon.dev
Subject: Re: [PATCH] object-file: don't use object database without a repository
Date: Sun, 5 Apr 2026 02:46:51 -0400	[thread overview]
Message-ID: <20260405064651.GA1452907@coredump.intra.peff.net> (raw)
In-Reply-To: <20260404172817.2995133-1-luca.stefani.ge1@gmail.com>

On Sat, Apr 04, 2026 at 07:28:17PM +0200, Luca Stefani wrote:

> When running `git diff -- $file1 $file2' on large enough files,
> index_fd() attempts to use 'the_repository->objects', assuming it
> is initialized, but that's not the case for non-repository usecases.
> 
> When git diff is invoked without a backing repository,
> INDEX_WRITE_OBJECT is never set in flags, meaning only the hash is
> needed and nothing should be written to the object store.
> 
> Enforce the use of index_core() in this case.

I don't think we want to use index_core() for a large file, though. A
test like this:

diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh
index 15076dfe0d..7ef5604430 100755
--- a/t/t4053-diff-no-index.sh
+++ b/t/t4053-diff-no-index.sh
@@ -413,4 +413,10 @@ test_expect_success 'diff --no-index with pathspec glob and exclude' '
 	test_cmp expect actual
 '
 
+test_expect_success 'diff --no-index on a huge file' '
+	dd if=/dev/zero bs=1M count=4000 >big.file &&
+	echo whatever >small.file &&
+	test_expect_code 1 git diff --no-index big.file small.file
+'
+
 test_done


will now fail on a 32-bit system, because we try to mmap the whole file,
which will fail.  We really do want to follow the streaming code path
(which knows to respect the lack of a WRITE_OBJECT flag and works
without an odb in that case).

It's kind of an expensive test, though, so we probably don't want to
actually include it in the test suite.

-Peff

PS I'd expect a 4GB+ file to work, too, but it looks like the diff code
   barfs when trying to stuff the file into a diff_filespec. A simpler
   example is:

     dd if=/dev/zero bs=1G count=5 >big.file
     git hash-object big.file

   but that dies, too! It looks like the streaming helper uses a size_t
   to take the size, which is wrong. It really should be an off_t. So I
   dunno, maybe nobody cares about ever working with 4GB files on 32-bit
   systems these days. It still feels like we should avoid a large mmap,
   though.

  parent reply	other threads:[~2026-04-05  6:46 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-04 17:28 [PATCH] object-file: don't use object database without a repository Luca Stefani
2026-04-05  6:03 ` Pushkar Singh
2026-04-05  6:46 ` Jeff King [this message]
2026-04-05 16:10   ` Luca Stefani
2026-04-05 19:17     ` Jeff King
2026-04-06 18:17       ` Justin Tobler
2026-04-06 19:31         ` Luca Stefani
2026-04-06 20:31           ` Justin Tobler
2026-04-06 20:06         ` Jeff King
2026-04-06 20:38           ` Justin Tobler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260405064651.GA1452907@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=cat@malon.dev \
    --cc=git@vger.kernel.org \
    --cc=luca.stefani.ge1@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.