From: Justin Tobler <jltobler@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, ps@pks.im, peff@peff.net,
luca.stefani.ge1@gmail.com
Subject: Re: [PATCH] object-file: avoid ODB transaction when not writing objects
Date: Tue, 7 Apr 2026 16:43:08 -0500 [thread overview]
Message-ID: <adV44IS5iPwdjKAe@denethor> (raw)
In-Reply-To: <xmqqo6ju31wx.fsf@gitster.g>
On 26/04/07 02:18PM, Junio C Hamano wrote:
> Justin Tobler <jltobler@gmail.com> writes:
>
> I'd appreciate it if you can give your eyeballs to the attached to
> see if that is how you would fixed the bug in the original context
> of v2.52 track. If everything looks OK, then there is no need to
> spend time backporting on your side. We have everything necessary.
The amended patch below looks correct to me. Peff had some comments
regarding the test that I will work on addressing.
Thanks,
-Justin
>
> --- >8 ---
> From: Justin Tobler <jltobler@gmail.com>
> Date: Tue, 7 Apr 2026 15:17:30 -0500
> Subject: [PATCH] object-file: avoid ODB transaction when not writing objects
>
> In ce1661f9da (odb: add transaction interface, 2025-09-16), existing
> ODB transaction logic is adapted to create a transaction interface
> at the ODB layer. The intent here is for the ODB transaction
> interface to eventually provide an object source agnostic means to
> manage transactions.
>
> An unintended consequence of this change though is that
> `object-file.c:index_fd()` may enter the ODB transaction path even
> when no object write is requested. In non-repository contexts, this
> can result in a NULL dereference and segfault. One such case occurs
> when running git-diff(1) outside of a repository with
> "core.bigFileThreshold" forcing the streaming path in `index_fd()`:
>
> $ echo foo >foo
> $ echo bar >bar
> $ git -c core.bigFileThreshold=1 diff -- foo bar
>
> In this scenario, the caller only needs to compute the object ID. Object
> hashing does not require an ODB, so starting a transaction is both
> unnecessary and invalid.
>
> Fix the bug by avoiding the use of ODB transactions in `index_fd()` when
> callers are only interested in computing the object hash.
>
> Reported-by: Luca Stefani <luca.stefani.ge1@gmail.com>
> Signed-off-by: Justin Tobler <jltobler@gmail.com>
> [jc: adjusted to fd13909e (Merge branch 'jt/odb-transaction', 2025-10-02)]
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
> object-file.c | 49 ++++++++++++++++++++++++++++++++++-------
> t/t1517-outside-repo.sh | 8 +++++++
> 2 files changed, 49 insertions(+), 8 deletions(-)
>
> diff --git a/object-file.c b/object-file.c
> index 17a236d2fe..0969e27f3d 100644
> --- a/object-file.c
> +++ b/object-file.c
> @@ -1599,6 +1599,34 @@ static int index_blob_packfile_transaction(struct odb_transaction *transaction,
> return 0;
> }
>
> +static int hash_blob_stream(const struct git_hash_algo *hash_algo,
> + struct object_id *result_oid, int fd, size_t size)
> +{
> + unsigned char buf[16384];
> + struct git_hash_ctx ctx;
> + unsigned header_len;
> +
> + header_len = format_object_header((char *)buf, sizeof(buf),
> + OBJ_BLOB, size);
> + hash_algo->init_fn(&ctx);
> + git_hash_update(&ctx, buf, header_len);
> +
> + while (size) {
> + size_t rsize = size < sizeof(buf) ? size : sizeof(buf);
> + ssize_t read_result = read_in_full(fd, buf, rsize);
> +
> + if ((read_result < 0) || ((size_t)read_result != rsize))
> + return -1;
> +
> + git_hash_update(&ctx, buf, rsize);
> + size -= read_result;
> + }
> +
> + git_hash_final_oid(result_oid, &ctx);
> +
> + return 0;
> +}
> +
> int index_fd(struct index_state *istate, struct object_id *oid,
> int fd, struct stat *st,
> enum object_type type, const char *path, unsigned flags)
> @@ -1620,14 +1648,19 @@ int index_fd(struct index_state *istate, struct object_id *oid,
> ret = index_core(istate, oid, fd, xsize_t(st->st_size),
> type, path, flags);
> } else {
> - struct odb_transaction *transaction;
> -
> - transaction = odb_transaction_begin(the_repository->objects);
> - ret = index_blob_packfile_transaction(the_repository->objects->transaction,
> - oid, fd,
> - xsize_t(st->st_size),
> - path, flags);
> - odb_transaction_commit(transaction);
> + if (flags & INDEX_WRITE_OBJECT) {
> + struct odb_transaction *transaction;
> +
> + transaction = odb_transaction_begin(the_repository->objects);
> + ret = index_blob_packfile_transaction(the_repository->objects->transaction,
> + oid, fd,
> + xsize_t(st->st_size),
> + path, flags);
> + odb_transaction_commit(transaction);
> + } else {
> + ret = hash_blob_stream(the_repository->hash_algo, oid,
> + fd, xsize_t(st->st_size));
> + }
> }
>
> close(fd);
> diff --git a/t/t1517-outside-repo.sh b/t/t1517-outside-repo.sh
> index c824c1a25c..c1dbc6359a 100755
> --- a/t/t1517-outside-repo.sh
> +++ b/t/t1517-outside-repo.sh
> @@ -93,6 +93,14 @@ test_expect_success 'diff outside repository' '
> test_cmp expect actual
> '
>
> +test_expect_success 'diff files exceeding bigFileThreshold outside repository' '
> + cd non-repo &&
> + echo foo >foo &&
> + echo bar >bar &&
> + test_must_fail git -c core.bigFileThreshold=1 diff -- foo bar >actual &&
> + test_grep "diff --git a/foo b/bar" actual
> +'
> +
> test_expect_success 'stripspace outside repository' '
> nongit git stripspace -s </dev/null
> '
> --
> 2.54.0-rc0-282-gaf2821fd49
>
next prev parent reply other threads:[~2026-04-07 21:43 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-07 20:17 [PATCH] object-file: avoid ODB transaction when not writing objects Justin Tobler
2026-04-07 21:18 ` Junio C Hamano
2026-04-07 21:29 ` Jeff King
2026-04-07 21:43 ` Junio C Hamano
2026-04-07 21:43 ` Justin Tobler [this message]
2026-04-07 21:53 ` Junio C Hamano
2026-04-07 22:08 ` Justin Tobler
2026-04-07 22:24 ` Junio C Hamano
2026-04-07 22:41 ` Justin Tobler
2026-04-08 0:42 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=adV44IS5iPwdjKAe@denethor \
--to=jltobler@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=luca.stefani.ge1@gmail.com \
--cc=peff@peff.net \
--cc=ps@pks.im \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.