From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Justin Tobler <jltobler@gmail.com>,
git@vger.kernel.org, ps@pks.im, luca.stefani.ge1@gmail.com
Subject: Re: [PATCH] object-file: avoid ODB transaction when not writing objects
Date: Tue, 7 Apr 2026 17:29:30 -0400
Message-ID: <20260407212930.GA1315247@coredump.intra.peff.net>
In-Reply-To: <xmqqo6ju31wx.fsf@gitster.g>
On Tue, Apr 07, 2026 at 02:18:06PM -0700, Junio C Hamano wrote:
> From: Justin Tobler <jltobler@gmail.com>
> Date: Tue, 7 Apr 2026 15:17:30 -0500
> Subject: [PATCH] object-file: avoid ODB transaction when not writing objects
> [...]
> +static int hash_blob_stream(const struct git_hash_algo *hash_algo,
> + struct object_id *result_oid, int fd, size_t size)
> +{
> + unsigned char buf[16384];
> + struct git_hash_ctx ctx;
> + unsigned header_len;
> +
> + header_len = format_object_header((char *)buf, sizeof(buf),
> + OBJ_BLOB, size);
> + hash_algo->init_fn(&ctx);
> + git_hash_update(&ctx, buf, header_len);
> +
> + while (size) {
> + size_t rsize = size < sizeof(buf) ? size : sizeof(buf);
> + ssize_t read_result = read_in_full(fd, buf, rsize);
> +
> + if ((read_result < 0) || ((size_t)read_result != rsize))
> + return -1;
> +
> + git_hash_update(&ctx, buf, rsize);
> + size -= read_result;
> + }
> +
> + git_hash_final_oid(result_oid, &ctx);
This looks correct to me. In the back of my mind I felt like we might
already have a function to check a streaming hash, but I was just
thinking of how parse_object() streams blobs for its hash-check. And
that is always coming from the object database, whereas here we are
taking data from elsewhere. So we do need this new function.
I probably would have used fewer parentheses in the conditional, but
that may be personal preference. ;)
> diff --git a/t/t1517-outside-repo.sh b/t/t1517-outside-repo.sh
> index c824c1a25c..c1dbc6359a 100755
> --- a/t/t1517-outside-repo.sh
> +++ b/t/t1517-outside-repo.sh
> @@ -93,6 +93,14 @@ test_expect_success 'diff outside repository' '
> test_cmp expect actual
> '
>
> +test_expect_success 'diff files exceeding bigFileThreshold outside repository' '
> + cd non-repo &&
> + echo foo >foo &&
> + echo bar >bar &&
> + test_must_fail git -c core.bigFileThreshold=1 diff -- foo bar >actual &&
> + test_grep "diff --git a/foo b/bar" actual
> +'
  1. This does a "cd" outside of a sub-shell, which affects all of the
     subsequent tests.
  2. We are also already using the "nongit" wrapper in this script, so it
     could be used here.
  3. Though it was found originally with diff, the bug can also be
     demonstrated with just hash-object, which does make the test a little
     simpler.
The second and third are more style/taste questions, but I think the
first is a blocker.
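To illustrate the "cd" point, here is a rough standalone sketch. The
nongit function below is a simplified stand-in written for this example
(an assumption, not the helper from the test suite, which does more); it
only shows that a "cd" inside a sub-shell does not leak into later
commands, while a bare "cd" does.

```shell
#!/bin/sh
# Simplified stand-in for the test suite's "nongit" helper (an assumption
# for this example). It runs a command inside non-repo/ from a sub-shell,
# so the caller's working directory is left untouched.
nongit () {
	test -d non-repo || mkdir non-repo || return 1
	(
		GIT_CEILING_DIRECTORIES=$(pwd) &&
		export GIT_CEILING_DIRECTORIES &&
		cd non-repo &&
		"$@"
	)
}

# Work in a scratch directory so the example does not touch anything else.
work=$(mktemp -d) && cd "$work" || exit 1
before=$(pwd)

# The "cd" happens only inside the sub-shell.
inside=$(nongit pwd)
after_wrapper=$(pwd)

# By contrast, a bare "cd" sticks for everything that follows.
cd non-repo
after_bare_cd=$(pwd)
cd "$before"

echo "before:        $before"
echo "inside:        $inside"
echo "after wrapper: $after_wrapper"
echo "after bare cd: $after_bare_cd"
```

In the test itself the body could then shrink to something along the
lines of "nongit git -c core.bigFileThreshold=1 hash-object <file>",
with whatever expectation matches the fixed behavior. That exact
invocation is a guess on my part, not something taken from the patch.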
-Peff