From: Justin Tobler <jltobler@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, ps@pks.im, peff@peff.net,
luca.stefani.ge1@gmail.com
Subject: Re: [PATCH] object-file: avoid ODB transaction when not writing objects
Date: Tue, 7 Apr 2026 16:43:08 -0500 [thread overview]
Message-ID: <adV44IS5iPwdjKAe@denethor> (raw)
In-Reply-To: <xmqqo6ju31wx.fsf@gitster.g>
On 26/04/07 02:18PM, Junio C Hamano wrote:
> Justin Tobler <jltobler@gmail.com> writes:
>
> I'd appreciate it if you can give your eyeballs to the attached to
> see if that is how you would fixed the bug in the original context
> of v2.52 track. If everything looks OK, then there is no need to
> spend time backporting on your side. We have everything necessary.
The amended patch below looks correct to me. Peff had some comments
regarding the test that I will work on addressing.
Thanks,
-Justin
>
> --- >8 ---
> From: Justin Tobler <jltobler@gmail.com>
> Date: Tue, 7 Apr 2026 15:17:30 -0500
> Subject: [PATCH] object-file: avoid ODB transaction when not writing objects
>
> In ce1661f9da (odb: add transaction interface, 2025-09-16), existing
> ODB transaction logic is adapted to create a transaction interface
> at the ODB layer. The intent here is for the ODB transaction
> interface to eventually provide an object source agnostic means to
> manage transactions.
>
> An unintended consequence of this change though is that
> `object-file.c:index_fd()` may enter the ODB transaction path even
> when no object write is requested. In non-repository contexts, this
> can result in a NULL dereference and segfault. One such case occurs
> when running git-diff(1) outside of a repository with
> "core.bigFileThreshold" forcing the streaming path in `index_fd()`:
>
> $ echo foo >foo
> $ echo bar >bar
> $ git -c core.bigFileThreshold=1 diff -- foo bar
>
> In this scenario, the caller only needs to compute the object ID. Object
> hashing does not require an ODB, so starting a transaction is both
> unnecessary and invalid.
>
> Fix the bug by avoiding the use of ODB transactions in `index_fd()` when
> callers are only interested in computing the object hash.
>
> Reported-by: Luca Stefani <luca.stefani.ge1@gmail.com>
> Signed-off-by: Justin Tobler <jltobler@gmail.com>
> [jc: adjusted to fd13909e (Merge branch 'jt/odb-transaction', 2025-10-02)]
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
> object-file.c | 49 ++++++++++++++++++++++++++++++++++-------
> t/t1517-outside-repo.sh | 8 +++++++
> 2 files changed, 49 insertions(+), 8 deletions(-)
>
> diff --git a/object-file.c b/object-file.c
> index 17a236d2fe..0969e27f3d 100644
> --- a/object-file.c
> +++ b/object-file.c
> @@ -1599,6 +1599,34 @@ static int index_blob_packfile_transaction(struct odb_transaction *transaction,
> return 0;
> }
>
> +static int hash_blob_stream(const struct git_hash_algo *hash_algo,
> + struct object_id *result_oid, int fd, size_t size)
> +{
> + unsigned char buf[16384];
> + struct git_hash_ctx ctx;
> + unsigned header_len;
> +
> + header_len = format_object_header((char *)buf, sizeof(buf),
> + OBJ_BLOB, size);
> + hash_algo->init_fn(&ctx);
> + git_hash_update(&ctx, buf, header_len);
> +
> + while (size) {
> + size_t rsize = size < sizeof(buf) ? size : sizeof(buf);
> + ssize_t read_result = read_in_full(fd, buf, rsize);
> +
> + if ((read_result < 0) || ((size_t)read_result != rsize))
> + return -1;
> +
> + git_hash_update(&ctx, buf, rsize);
> + size -= read_result;
> + }
> +
> + git_hash_final_oid(result_oid, &ctx);
> +
> + return 0;
> +}
> +
> int index_fd(struct index_state *istate, struct object_id *oid,
> int fd, struct stat *st,
> enum object_type type, const char *path, unsigned flags)
> @@ -1620,14 +1648,19 @@ int index_fd(struct index_state *istate, struct object_id *oid,
> ret = index_core(istate, oid, fd, xsize_t(st->st_size),
> type, path, flags);
> } else {
> - struct odb_transaction *transaction;
> -
> - transaction = odb_transaction_begin(the_repository->objects);
> - ret = index_blob_packfile_transaction(the_repository->objects->transaction,
> - oid, fd,
> - xsize_t(st->st_size),
> - path, flags);
> - odb_transaction_commit(transaction);
> + if (flags & INDEX_WRITE_OBJECT) {
> + struct odb_transaction *transaction;
> +
> + transaction = odb_transaction_begin(the_repository->objects);
> + ret = index_blob_packfile_transaction(the_repository->objects->transaction,
> + oid, fd,
> + xsize_t(st->st_size),
> + path, flags);
> + odb_transaction_commit(transaction);
> + } else {
> + ret = hash_blob_stream(the_repository->hash_algo, oid,
> + fd, xsize_t(st->st_size));
> + }
> }
>
> close(fd);
> diff --git a/t/t1517-outside-repo.sh b/t/t1517-outside-repo.sh
> index c824c1a25c..c1dbc6359a 100755
> --- a/t/t1517-outside-repo.sh
> +++ b/t/t1517-outside-repo.sh
> @@ -93,6 +93,14 @@ test_expect_success 'diff outside repository' '
> test_cmp expect actual
> '
>
> +test_expect_success 'diff files exceeding bigFileThreshold outside repository' '
> + cd non-repo &&
> + echo foo >foo &&
> + echo bar >bar &&
> + test_must_fail git -c core.bigFileThreshold=1 diff -- foo bar >actual &&
> + test_grep "diff --git a/foo b/bar" actual
> +'
> +
> test_expect_success 'stripspace outside repository' '
> nongit git stripspace -s </dev/null
> '
> --
> 2.54.0-rc0-282-gaf2821fd49
>
next prev parent reply other threads:[~2026-04-07 21:43 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-07 20:17 [PATCH] object-file: avoid ODB transaction when not writing objects Justin Tobler
2026-04-07 21:18 ` Junio C Hamano
2026-04-07 21:29 ` Jeff King
2026-04-07 21:43 ` Junio C Hamano
2026-04-07 21:43 ` Justin Tobler [this message]
2026-04-07 21:53 ` Junio C Hamano
2026-04-07 22:08 ` Justin Tobler
2026-04-07 22:24 ` Junio C Hamano
2026-04-07 22:41 ` Justin Tobler
2026-04-08 0:42 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=adV44IS5iPwdjKAe@denethor \
--to=jltobler@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=luca.stefani.ge1@gmail.com \
--cc=peff@peff.net \
--cc=ps@pks.im \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox