From: Junio C Hamano <gitster@pobox.com>
To: Patrick Steinhardt <ps@pks.im>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 07/13] builtin/index-pack: fix deferred fsck outside repos
Date: Wed, 19 Nov 2025 13:27:51 -0800
Message-ID: <xmqq1pltbtm0.fsf@gitster.g>
In-Reply-To: <20251119-b4-pks-odb-creation-v1-7-2b2ed2612cb6@pks.im> (Patrick Steinhardt's message of "Wed, 19 Nov 2025 08:50:55 +0100")
Patrick Steinhardt <ps@pks.im> writes:
> There's another option though: instead of skipping the final object
> checks, we can die if there are any queued object checks. With this
> change we now die exactly if and only if we would have previously
> segfaulted. Like this we ensure that objects that _may_ fail the
> consistency checks won't be silently skipped, and at the same time we
> give users a much better error message.
A packfile stream may not have the blob objects that its .gitmodules
and .gitattributes tree entries refer to, in which case index-pack
cannot work outside a repository, but I think that is fine.
> @@ -2110,8 +2110,23 @@ int cmd_index_pack(int argc,
>  	else
>  		close(input_fd);
>
> -	if (do_fsck_object && fsck_finish(&fsck_options))
> -		die(_("fsck error in pack objects"));
> +	if (do_fsck_object) {
> +		/*
> +		 * We cannot perform queued consistency checks when running
> +		 * outside of a repository because those require us to read
> +		 * from the object database, which is uninitialized.
> +		 *
> +		 * TODO: we may eventually set up an in-memory object database,
> +		 * which would allow us to perform these queued checks.
> +		 */
> +		if (!startup_info->have_repository &&
> +		    fsck_has_queued_checks(&fsck_options))
> +			die(_("cannot perform queued object checks outside "
> +			      "of a repository"));
> +
> +		if (fsck_finish(&fsck_options))
> +			die(_("fsck error in pack objects"));
> +	}
OK.
> +bool fsck_has_queued_checks(struct fsck_options *options)
> +{
> +	return !oidset_equal(&options->gitmodules_found, &options->gitmodules_done) ||
> +	       !oidset_equal(&options->gitattributes_found, &options->gitattributes_done);
> +}
So, if we see a tree entry for one of these special blobs (and
remember it in the corresponding _found oidset) before we see the
blob itself, fsck_blob() would notice that it is looking at a blob
that is in the _found set, and add it to the _done set while checking
the blob in-core.
A packfile we generate has trees before blobs, so a self-contained
pack stream should still be validatable outside a repository with
this code, but other people's reimplementations of Git may produce
a packfile that has a blob before a tree that refers to the blob.
For such a pack the check stays queued, and we die with the new
error message when run outside a repository.
In other words, we can validate a self-contained pack stream outside
a repository on a best-effort basis. And that is perfectly fine.
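To make the ordering argument concrete, here is a toy model in C. It
is only a sketch: the toy_* names and the fixed-size set of string IDs
are made up for illustration, and they stand in for the real oidset
and fsck code, which this does not reproduce. It follows the
description above: a tree entry puts the blob in the "found" set, a
blob that is already in "found" gets checked and put in "done", and a
check is still queued whenever the two sets differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for an oidset: a tiny set of short string IDs. */
#define TOY_MAX 8

struct toy_set {
	const char *ids[TOY_MAX];
	size_t nr;
};

static bool toy_contains(const struct toy_set *set, const char *id)
{
	for (size_t i = 0; i < set->nr; i++)
		if (!strcmp(set->ids[i], id))
			return true;
	return false;
}

static void toy_insert(struct toy_set *set, const char *id)
{
	if (!toy_contains(set, id) && set->nr < TOY_MAX)
		set->ids[set->nr++] = id;
}

static bool toy_equal(const struct toy_set *a, const struct toy_set *b)
{
	if (a->nr != b->nr)
		return false;
	for (size_t i = 0; i < a->nr; i++)
		if (!toy_contains(b, a->ids[i]))
			return false;
	return true;
}

struct toy_fsck {
	struct toy_set found; /* blobs named by a special tree entry */
	struct toy_set done;  /* blobs already checked in-core */
};

/* A tree entry such as ".gitmodules" points at blob `id`. */
static void toy_see_tree_entry(struct toy_fsck *f, const char *id)
{
	toy_insert(&f->found, id);
}

/* A blob streams by; it is checked now only if a tree already named it. */
static void toy_see_blob(struct toy_fsck *f, const char *id)
{
	if (toy_contains(&f->found, id))
		toy_insert(&f->done, id);
}

static bool toy_has_queued_checks(const struct toy_fsck *f)
{
	return !toy_equal(&f->found, &f->done);
}

/* Feed one tree entry and its blob in the given order; report leftovers. */
bool toy_queued_after(bool tree_first)
{
	struct toy_fsck f = { 0 };

	if (tree_first) {
		toy_see_tree_entry(&f, "blob-1");
		toy_see_blob(&f, "blob-1");
	} else {
		toy_see_blob(&f, "blob-1");
		toy_see_tree_entry(&f, "blob-1");
	}
	return toy_has_queued_checks(&f);
}
```

In this model toy_queued_after(true) returns false (tree first,
nothing left queued), while toy_queued_after(false) returns true
(blob first), which corresponds to the case where index-pack would
die outside a repository.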