From: Jeff King <peff@peff.net>
To: Justin Su <injustsu@gmail.com>
Cc: Jonathan Tan <jonathantanmy@google.com>, git@vger.kernel.org
Subject: Re: Fetching upstream remote fails if repo was a blobless clone
Date: Sat, 2 Aug 2025 15:31:10 -0400
Message-ID: <20250802193110.GA1774743@coredump.intra.peff.net>
In-Reply-To: <CAB=S_8+aDwMNQkawY-Mod35EDm20mi_=xmmwfngU6As799ppqw@mail.gmail.com>

On Sat, Aug 02, 2025 at 02:28:24PM -0400, Justin Su wrote:

> Turns out this was because I had `transfer.fsckObjects = true` in my
> global config.
> 
> I think you should be able to repro if you change the last command to
> `git -c fetch.fsckObjects=true fetch upstream`.

Thanks, I can reproduce easily now. The object in question isn't
mentioned directly in the pack at all, as an incoming object or as a
delta. It's mentioned by a tree, c5b8c11446. And then when we fsck, we
hit it via fsck_walk_tree(). And then when we've finished indexing the
pack, we check for any objects that were mentioned but which we don't
have. And we don't have 0020d54b979, so we barf.

I assume what's happening is that 0020d54b979 is contained in the origin
repo, but we didn't fetch it (because of the blob:none filter). And then
when we talk to the upstream repo, it assumes we _do_ have it because of
the commits that we claimed to have. And that looks like the case. In
the partial clone we can do:

  $ git rev-list --objects --all --missing=print-info | grep 0020d54b
  ?0020d54b979cc8cf59a13406f98bfe515b190559 path=src/features/navigate.rs type=blob

There it is, mentioned by the origin repo.
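
For anyone who wants to see the same "missing but expected" state without
the repositories from this thread, here is a minimal local sketch. The
repository names (origin, partial), the file name, and the committer
identity are all made up for illustration, and it uses plain
--missing=print rather than print-info so that it should also work on
older Git versions. It only shows that a blobless clone legitimately
lacks blobs that its trees mention; it does not try to trigger the
index-pack failure itself.

```shell
#!/bin/sh
# Sketch: show that a blobless clone legitimately lacks blobs.
# All paths, file names, and identities here are hypothetical.
set -eu

tmp=$(mktemp -d)
cd "$tmp"

# A throwaway "origin" repository with a single committed file.
git init -q origin
echo 'hello' >origin/file.txt
git -C origin add file.txt
git -C origin -c user.name=Example -c user.email=example@example.com \
	commit -q -m 'add file'

# Let upload-pack honor --filter when serving this repository.
git -C origin config uploadpack.allowFilter true

# Blobless clone; --no-checkout avoids lazily fetching the blob.
git clone -q --filter=blob:none --no-checkout "file://$tmp/origin" partial

# The blob is reachable from a tree but absent locally, so rev-list
# prints its object id with a leading "?".
blob=$(git -C origin rev-parse HEAD:file.txt)
missing=$(git -C partial rev-list --objects --all --missing=print |
	grep '^?' || true)
printf '%s\n' "$missing"
```

The "?"-prefixed line names the blob that origin has and the partial
clone does not, which is the same situation as 0020d54b979 above.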

So it is perfectly normal for us to be missing this object, and
index-pack is wrong to complain. Curiously, there's this code in
fetch-pack.c:

                  if (args->from_promisor)
                          /*
                           * create_promisor_file() may be called afterwards but
                           * we still need index-pack to know that this is a
                           * promisor pack. For example, if transfer.fsckobjects
                           * is true, index-pack needs to know that .gitmodules
                           * is a promisor object (so that it won't complain if
                           * it is missing).
                           */
                          strvec_push(&cmd.args, "--promisor");

which you'd think would kick in here. And I confirmed that the
index-pack which barfs is passed that option.

So I dunno. Clearly there is a bug, but it's not clear to me how this
code is actually supposed to work.

Doing this:

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 0a5c8a1ac8..e01cf7238b 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -262,9 +262,14 @@ static unsigned check_object(struct object *obj)
 		unsigned long size;
 		int type = odb_read_object_info(the_repository->objects,
 						&obj->oid, &size);
-		if (type <= 0)
+		if (type <= 0) {
+			if (is_promisor_object(the_repository, &obj->oid)) {
+				obj->flags |= FLAG_CHECKED;
+				return 1;
+			}
 			die(_("did not receive expected object %s"),
 			      oid_to_hex(&obj->oid));
+		}
 		if (type != obj->type)
 			die(_("object %s: expected type %s, found %s"),
 			    oid_to_hex(&obj->oid),

makes the problem go away. But I feel like I'm probably missing
something (and that function is rather expensive to run, though maybe
not so bad if the alternative is crashing).

+cc Jonathan Tan as the author of the code comment above for any wisdom.

-Peff

