From: Jeff King <peff@peff.net>
To: Patrick Steinhardt <ps@pks.im>
Cc: git@vger.kernel.org,
"brian m. carlson" <sandals@crustytoothpaste.net>,
Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH 3/4] packfile: expose function to read object stream for an offset
Date: Mon, 23 Feb 2026 08:12:01 -0500 [thread overview]
Message-ID: <20260223131201.GC215671@coredump.intra.peff.net> (raw)
In-Reply-To: <aZxGMrGkVNeAdC1N@pks.im>
On Mon, Feb 23, 2026 at 01:21:06PM +0100, Patrick Steinhardt wrote:
> > So your patch here might be making the problem a tiny bit worse, but not
> > in a material way. I think we can ignore it for now.
>
> I guess the "tiny bit worse" part is that we don't handle the case
> anymore where `unpack_object_header()` returns `OBJ_BAD`. As you say, we
> previously didn't fully parse the object anyway, so we couldn't have
> detected all kinds of corruptions. But we definitely handled the case
> where `unpack_object_header()` failed.
Yeah, I think that would cover it. Technically packed_object_info()
could error on more cases (e.g., errors chasing delta bases for
type/size info). But we would bail on trying to stream those anyway, so
presumably any errors would be found via the non-streaming code paths in
those cases.
> So maybe we should do something like the below patch?
> [...]
> @@ -2571,6 +2572,9 @@ int packfile_read_object_stream(struct odb_read_stream **out,
> switch (in_pack_type) {
> default:
> return -1; /* we do not do deltas for now */
> + case OBJ_BAD:
> + mark_bad_packed_object(pack, oid);
> + return -1;
> case OBJ_COMMIT:
> case OBJ_TREE:
> case OBJ_BLOB:
I think that restores the original behavior. But I'm not sure it's even
worth it. We are still missing the much more likely case of a bit error
in the actual zlib stream, which would not be caught until much later.
So yeah, if you want to feel better about making sure your patch keeps
the behavior as identical as possible, I don't mind adding this. But it
feels like the tip of the iceberg, and I'd be OK leaving it for later
(or never).
My biggest objection is not the two lines above (which I actually think
clarify what is going on) but rather this interface change:
> int packfile_read_object_stream(struct odb_read_stream **out,
> + const struct object_id *oid,
> struct packed_git *pack,
> off_t offset);
Now we are back to taking an oid, except we don't ever use it to look up
the object! So it's a little misleading that it's there at all. It may
be the best we can do, though.
The only other way I could think of is for packfile_read_object_stream()
to return a more detailed error: one of "success", "chose not to
stream", or "broken object". And then the caller can call
mark_bad_packed_object() as appropriate. In this case, I think
packfile_store_read_object_stream() would do so, but verify_pack()
probably would not choose to (it is not interested in fallbacks at all
but is going through an individual pack).
-Peff
next prev parent reply other threads:[~2026-02-23 13:12 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-23 9:50 [PATCH 0/4] pack-check: fix verification of large objects Patrick Steinhardt
2026-02-23 9:50 ` [PATCH 1/4] t/helper: improve "genrandom" test helper Patrick Steinhardt
2026-02-23 11:13 ` Jeff King
2026-02-23 12:20 ` Patrick Steinhardt
2026-02-23 14:01 ` Eric Sunshine
2026-02-23 9:50 ` [PATCH 2/4] object-file: adapt `stream_object_signature()` to take a stream Patrick Steinhardt
2026-02-23 10:49 ` Jeff King
2026-02-23 12:21 ` Patrick Steinhardt
2026-02-23 12:59 ` Jeff King
2026-02-23 9:50 ` [PATCH 3/4] packfile: expose function to read object stream for an offset Patrick Steinhardt
2026-02-23 11:07 ` Jeff King
2026-02-23 12:21 ` Patrick Steinhardt
2026-02-23 13:12 ` Jeff King [this message]
2026-02-23 15:59 ` Patrick Steinhardt
2026-02-23 9:50 ` [PATCH 4/4] pack-check: fix verification of large objects Patrick Steinhardt
2026-02-23 11:11 ` Jeff King
2026-02-23 11:30 ` Patrick Steinhardt
2026-02-23 12:58 ` Jeff King
2026-02-23 15:48 ` Patrick Steinhardt
2026-02-23 20:35 ` Junio C Hamano
2026-02-24 6:26 ` Patrick Steinhardt
2026-02-23 16:00 ` [PATCH v2 0/4] " Patrick Steinhardt
2026-02-23 16:00 ` [PATCH v2 1/4] t/helper: improve "genrandom" test helper Patrick Steinhardt
2026-02-23 16:00 ` [PATCH v2 2/4] object-file: adapt `stream_object_signature()` to take a stream Patrick Steinhardt
2026-02-23 16:00 ` [PATCH v2 3/4] packfile: expose function to read object stream for an offset Patrick Steinhardt
2026-02-23 16:00 ` [PATCH v2 4/4] pack-check: fix verification of large objects Patrick Steinhardt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260223131201.GC215671@coredump.intra.peff.net \
--to=peff@peff.net \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=ps@pks.im \
--cc=sandals@crustytoothpaste.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox