public inbox for git@vger.kernel.org
From: Patrick Steinhardt <ps@pks.im>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org,
	"brian m. carlson" <sandals@crustytoothpaste.net>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH 3/4] packfile: expose function to read object stream for an offset
Date: Mon, 23 Feb 2026 16:59:09 +0100
Message-ID: <aZx5Tc-rgih0S4gS@pks.im>
In-Reply-To: <20260223131201.GC215671@coredump.intra.peff.net>

On Mon, Feb 23, 2026 at 08:12:01AM -0500, Jeff King wrote:
> On Mon, Feb 23, 2026 at 01:21:06PM +0100, Patrick Steinhardt wrote:
> 
> > > So your patch here might be making the problem a tiny bit worse, but not
> > > in a material way. I think we can ignore it for now.
> > 
> > I guess the "tiny bit worse" part is that we don't handle the case
> > anymore where `unpack_object_header()` returns `OBJ_BAD`. As you say, we
> > previously didn't fully parse the object anyway, so we couldn't have
> > detected all kinds of corruptions. But we definitely handled the case
> > where `unpack_object_header()` failed.
> 
> Yeah, I think that would cover it. Technically packed_object_info()
> could error on more cases (e.g., errors chasing delta bases for
> type/size info). But we would bail on trying to stream those anyway, so
> presumably any errors would be found via the non-streaming code paths in
> those cases.
> 
> > So maybe we should do something like the below patch?
> > [...]
> > @@ -2571,6 +2572,9 @@ int packfile_read_object_stream(struct odb_read_stream **out,
> >  	switch (in_pack_type) {
> >  	default:
> >  		return -1; /* we do not do deltas for now */
> > +	case OBJ_BAD:
> > +		mark_bad_packed_object(pack, oid);
> > +		return -1;
> >  	case OBJ_COMMIT:
> >  	case OBJ_TREE:
> >  	case OBJ_BLOB:
> 
> I think that restores the original behavior. But I'm not sure it's even
> worth it. We are still missing the much more likely case of a bit error
> in the actual zlib stream, which would not be caught until much later.
> 
> So yeah, if you want to feel better about making sure your patch keeps
> the behavior as identical as possible, I don't mind adding this. But it
> feels like the tip of the iceberg, and I'd be OK leaving it for later
> (or never).
> 
> My biggest objection is not the two lines above (which I actually think
> clarify what is going on) but rather this interface change:
> 
> >  int packfile_read_object_stream(struct odb_read_stream **out,
> > +				const struct object_id *oid,
> >  				struct packed_git *pack,
> >  				off_t offset);
> 
> Now we are back to taking an oid, except we don't ever use it to look up
> the object! So it's a little misleading that it's there at all. It may
> be the best we can do, though.

Yeah, I agree it's a bit ugly. But the function is not likely to gain a
lot of additional callers anyway, as it is an implementation detail of
the packfile store. So overall I think it's okayish.

> The only other way I could think of is for packfile_read_object_stream()
> to return a more detailed error: one of "success", "chose not to
> stream", or "broken object". And then the caller can call
> mark_bad_packed_object() as appropriate. In this case, I think
> packfile_store_read_object_stream() would do so, but verify_pack()
> probably would not choose to (it is not interested in fallbacks at all
> but is going through an individual pack).

We could of course have it return the `enum object_type` directly, which
gives callers enough context to do this. But on the other hand it'd mean
that future callers might forget to call `mark_bad_packed_object()`, and
it causes a bit of repetition across callsites.
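
To make the trade-off concrete, here is a minimal standalone sketch of
the alternative, where the callee reports a three-way status and each
caller decides whether to mark the object as bad. Every name prefixed
with `sketch_` is hypothetical and only stands in for the real Git types
and functions; this is not the actual packfile code, and the
`bad_objects` counter merely models a call to
`mark_bad_packed_object()`.

```c
#include <assert.h>

/* Hypothetical stand-in for Git's `enum object_type`. */
enum sketch_object_type {
	SKETCH_OBJ_BAD = -1,
	SKETCH_OBJ_COMMIT = 1,
	SKETCH_OBJ_TREE,
	SKETCH_OBJ_BLOB,
	SKETCH_OBJ_TAG,
	SKETCH_OBJ_OFS_DELTA
};

/* The three outcomes: "success", "chose not to stream", "broken object". */
enum sketch_stream_status {
	SKETCH_STREAM_OK,
	SKETCH_STREAM_DECLINED,
	SKETCH_STREAM_BROKEN
};

/* Hypothetical pack; the counter models marking an object as bad. */
struct sketch_pack {
	int bad_objects;
};

static void sketch_mark_bad(struct sketch_pack *pack)
{
	pack->bad_objects++;
}

/* Callee: classifies the in-pack type, but never marks anything itself. */
static enum sketch_stream_status
sketch_read_stream(enum sketch_object_type in_pack_type)
{
	switch (in_pack_type) {
	case SKETCH_OBJ_BAD:
		return SKETCH_STREAM_BROKEN;
	case SKETCH_OBJ_COMMIT:
	case SKETCH_OBJ_TREE:
	case SKETCH_OBJ_BLOB:
	case SKETCH_OBJ_TAG:
		return SKETCH_STREAM_OK;
	default:
		return SKETCH_STREAM_DECLINED; /* no deltas in this sketch */
	}
}

/* Store-like caller: wants fallbacks, so it marks broken objects. */
static int sketch_store_caller(struct sketch_pack *pack,
			       enum sketch_object_type in_pack_type)
{
	enum sketch_stream_status status = sketch_read_stream(in_pack_type);

	if (status == SKETCH_STREAM_BROKEN)
		sketch_mark_bad(pack);
	return status == SKETCH_STREAM_OK ? 0 : -1;
}

/* Verify-like caller: inspects a single pack, so it does not mark. */
static int sketch_verify_caller(enum sketch_object_type in_pack_type)
{
	return sketch_read_stream(in_pack_type) == SKETCH_STREAM_OK ? 0 : -1;
}
```

The sketch shows both the benefit and the cost: the callee no longer
needs an oid, but every store-like caller has to repeat the
`SKETCH_STREAM_BROKEN` check, and one that forgets it silently stops
marking bad objects.
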

So I think I lean towards my proposed patch.

Thanks!

Patrick
