From: Patrick Steinhardt <ps@pks.im>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org,
"brian m. carlson" <sandals@crustytoothpaste.net>,
Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH 3/4] packfile: expose function to read object stream for an offset
Date: Mon, 23 Feb 2026 16:59:09 +0100 [thread overview]
Message-ID: <aZx5Tc-rgih0S4gS@pks.im> (raw)
In-Reply-To: <20260223131201.GC215671@coredump.intra.peff.net>
On Mon, Feb 23, 2026 at 08:12:01AM -0500, Jeff King wrote:
> On Mon, Feb 23, 2026 at 01:21:06PM +0100, Patrick Steinhardt wrote:
>
> > > So your patch here might be making the problem a tiny bit worse, but not
> > > in a material way. I think we can ignore it for now.
> >
> > I guess the "tiny bit worse" part is that we don't handle the case
> > anymore where `unpack_object_header()` returns `OBJ_BAD`. As you say, we
> > previously didn't fully parse the object anyway, so we couldn't have
> > detected all kinds of corruptions. But we definitely handled the case
> > where `unpack_object_header()` failed.
>
> Yeah, I think that would cover it. Technically packed_object_info()
> could error on more cases (e.g., errors chasing delta bases for
> type/size info). But we would bail on trying to stream those anyway, so
> presumably any errors would be found via the non-streaming code paths in
> those cases.
>
> > So maybe we should do something like the below patch?
> > [...]
> > @@ -2571,6 +2572,9 @@ int packfile_read_object_stream(struct odb_read_stream **out,
> > switch (in_pack_type) {
> > default:
> > return -1; /* we do not do deltas for now */
> > + case OBJ_BAD:
> > + mark_bad_packed_object(pack, oid);
> > + return -1;
> > case OBJ_COMMIT:
> > case OBJ_TREE:
> > case OBJ_BLOB:
>
> I think that restores the original behavior. But I'm not sure it's even
> worth it. We are still missing the much more likely case of a bit error
> in the actual zlib stream, which would not be caught until much later.
>
> So yeah, if you want to feel better about making sure your patch keeps
> the behavior as identical as possible, I don't mind adding this. But it
> feels like the tip of the iceberg, and I'd be OK leaving it for later
> (or never).
>
> My biggest objection is not the two lines above (which I actually think
> clarify what is going on) but rather this interface change:
>
> > int packfile_read_object_stream(struct odb_read_stream **out,
> > + const struct object_id *oid,
> > struct packed_git *pack,
> > off_t offset);
>
> Now we are back to taking an oid, except we don't ever use it to look up
> the object! So it's a little misleading that it's there at all. It may
> be the best we can do, though.
Yeah, I agree it's a bit ugly. But the function is not likely to gain a
lot of additional callers anyway, as it is an implementation detail of
the packfile store. So overall I think it's okayish.
> The only other way I could think of is for packfile_read_object_stream()
> to return a more detailed error: one of "success", "chose not to
> stream", or "broken object". And then the caller can call
> mark_bad_packed_object() as appropriate. In this case, I think
> packfile_store_read_object_stream() would do so, but verify_pack()
> probably would not choose to (it is not interested in fallbacks at all
> but is going through an individual pack).
We could of course have it return the `enum object_type` directly, which
gives us enough context to do this. But on the other hand it'd mean that
we might now miss adding calls to `mark_bad_packed_object()` in the
future, and it causes a bit of repetition across callsites.
So I think I lean towards my proposed patch.
Thanks!
Patrick
next prev parent reply other threads:[~2026-02-23 15:59 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-23 9:50 [PATCH 0/4] pack-check: fix verification of large objects Patrick Steinhardt
2026-02-23 9:50 ` [PATCH 1/4] t/helper: improve "genrandom" test helper Patrick Steinhardt
2026-02-23 11:13 ` Jeff King
2026-02-23 12:20 ` Patrick Steinhardt
2026-02-23 14:01 ` Eric Sunshine
2026-02-23 9:50 ` [PATCH 2/4] object-file: adapt `stream_object_signature()` to take a stream Patrick Steinhardt
2026-02-23 10:49 ` Jeff King
2026-02-23 12:21 ` Patrick Steinhardt
2026-02-23 12:59 ` Jeff King
2026-02-23 9:50 ` [PATCH 3/4] packfile: expose function to read object stream for an offset Patrick Steinhardt
2026-02-23 11:07 ` Jeff King
2026-02-23 12:21 ` Patrick Steinhardt
2026-02-23 13:12 ` Jeff King
2026-02-23 15:59 ` Patrick Steinhardt [this message]
2026-02-23 9:50 ` [PATCH 4/4] pack-check: fix verification of large objects Patrick Steinhardt
2026-02-23 11:11 ` Jeff King
2026-02-23 11:30 ` Patrick Steinhardt
2026-02-23 12:58 ` Jeff King
2026-02-23 15:48 ` Patrick Steinhardt
2026-02-23 20:35 ` Junio C Hamano
2026-02-24 6:26 ` Patrick Steinhardt
2026-02-23 16:00 ` [PATCH v2 0/4] " Patrick Steinhardt
2026-02-23 16:00 ` [PATCH v2 1/4] t/helper: improve "genrandom" test helper Patrick Steinhardt
2026-02-23 16:00 ` [PATCH v2 2/4] object-file: adapt `stream_object_signature()` to take a stream Patrick Steinhardt
2026-02-23 16:00 ` [PATCH v2 3/4] packfile: expose function to read object stream for an offset Patrick Steinhardt
2026-02-23 16:00 ` [PATCH v2 4/4] pack-check: fix verification of large objects Patrick Steinhardt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aZx5Tc-rgih0S4gS@pks.im \
--to=ps@pks.im \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=peff@peff.net \
--cc=sandals@crustytoothpaste.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox