From: Patrick Steinhardt <ps@pks.im>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org,
"brian m. carlson" <sandals@crustytoothpaste.net>,
Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH 3/4] packfile: expose function to read object stream for an offset
Date: Mon, 23 Feb 2026 13:21:06 +0100
Message-ID: <aZxGMrGkVNeAdC1N@pks.im>
In-Reply-To: <20260223110722.GB215364@coredump.intra.peff.net>

On Mon, Feb 23, 2026 at 06:07:22AM -0500, Jeff King wrote:
> On Mon, Feb 23, 2026 at 10:50:42AM +0100, Patrick Steinhardt wrote:
> > +int packfile_store_read_object_stream(struct odb_read_stream **out,
> > +				      struct packfile_store *store,
> > +				      const struct object_id *oid)
> > +{
> > +	struct pack_entry e;
> > +
> > +	if (!find_pack_entry(store, oid, &e))
> > +		return -1;
> > +
> > +	return packfile_read_object_stream(out, e.p, e.offset);
> > +}
>
> OK. The original read via packfile_store_read_object_info(), which does
> a bit more work. It called packed_object_info() and if necessary would
> trigger mark_bad_packed_object(). But now that we are leaving it to
> packfile_read_object_stream() to look at the header, we don't need to
> load any object info, and we have no error code to check.
>
> It does make me wonder, though, if we are missing out on marking bad
> objects here. The idea is that we'd usually do something like:
>
> 1. some code wants to access $OID
>
> 2. we find $OID in pack $P
>
> 3. that turns out to be broken for some reason, so we mark it as bad
>
> 4. we try again, skipping $P and finding it in some other pack
>
> But now I wonder if code that tries to stream will skip step 3, and then
> in step 4 we'll find the same broken $P over and over.
>
> But I suspect if that is possible, it was already true. We were only
> asking for the type and size, so any content-level corruption wouldn't
> be caught here and we'd have the same issue. I think the right thing is
> probably for the streaming code to know about the pack/oid pair it's
> trying to read, and to mark it as bad if it hits an error.
>
> So your patch here might be making the problem a tiny bit worse, but not
> in a material way. I think we can ignore it for now.

I guess the "tiny bit worse" part is that we no longer handle the case
where `unpack_object_header()` returns `OBJ_BAD`. As you say, we
previously didn't fully parse the object anyway, so we couldn't have
detected all kinds of corruption. But we definitely handled the case
where `unpack_object_header()` failed.

So maybe we should do something like the below patch?

Patrick

diff --git a/packfile.c b/packfile.c
index 9d795a671f..3e61176128 100644
--- a/packfile.c
+++ b/packfile.c
@@ -2554,6 +2554,7 @@ static int close_istream_pack_non_delta(struct odb_read_stream *_st)
 }
 
 int packfile_read_object_stream(struct odb_read_stream **out,
+				const struct object_id *oid,
 				struct packed_git *pack,
 				off_t offset)
 {
@@ -2571,6 +2572,9 @@ int packfile_read_object_stream(struct odb_read_stream **out,
 	switch (in_pack_type) {
 	default:
 		return -1; /* we do not do deltas for now */
+	case OBJ_BAD:
+		mark_bad_packed_object(pack, oid);
+		return -1;
 	case OBJ_COMMIT:
 	case OBJ_TREE:
 	case OBJ_BLOB:
@@ -2601,5 +2605,5 @@ int packfile_store_read_object_stream(struct odb_read_stream **out,
 	if (!find_pack_entry(store, oid, &e))
 		return -1;
 
-	return packfile_read_object_stream(out, e.p, e.offset);
+	return packfile_read_object_stream(out, oid, e.p, e.offset);
 }
diff --git a/packfile.h b/packfile.h
index 67d5750140..b9f5f1c18c 100644
--- a/packfile.h
+++ b/packfile.h
@@ -437,6 +437,7 @@ off_t get_delta_base(struct packed_git *p, struct pack_window **w_curs,
 			off_t delta_obj_offset);
 
 int packfile_read_object_stream(struct odb_read_stream **out,
+				const struct object_id *oid,
 				struct packed_git *pack,
 				off_t offset);
 
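
For illustration only, the "mark it as bad, then retry in another pack"
flow discussed above can be modelled in a few lines of standalone C.
This is a rough sketch, not Git code: every name in it (`toy_pack`,
`toy_find_pack()`, `toy_read_stream()`, `toy_stream_object()`) is a
hypothetical stand-in, each toy pack holds a single object, and the
"header parse" is reduced to a stored type value.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-ins; none of these are Git's real types or functions. */
enum toy_type { TOY_BAD = -1, TOY_BLOB = 3 };

struct toy_pack {
	const char *name;
	const char *oid;	/* the single object this toy pack holds */
	enum toy_type type;	/* what parsing the header would yield */
	int marked_bad;		/* set once the object is known to be broken */
};

/* Return the first pack that has `oid` and is not marked bad for it. */
struct toy_pack *toy_find_pack(struct toy_pack *packs, size_t nr,
			       const char *oid)
{
	for (size_t i = 0; i < nr; i++)
		if (!strcmp(packs[i].oid, oid) && !packs[i].marked_bad)
			return &packs[i];
	return NULL;
}

/* Open a "stream"; on a bad header, record that so retries skip it. */
int toy_read_stream(struct toy_pack *pack)
{
	switch (pack->type) {
	case TOY_BAD:
		pack->marked_bad = 1;
		return -1;
	case TOY_BLOB:
		return 0;
	default:
		return -1;	/* unsupported type, not marked bad */
	}
}

/* Retry until a good copy is found or no candidate pack remains. */
const char *toy_stream_object(struct toy_pack *packs, size_t nr,
			      const char *oid)
{
	struct toy_pack *p;

	while ((p = toy_find_pack(packs, nr, oid))) {
		if (!toy_read_stream(p))
			return p->name;
		if (!p->marked_bad)
			break;	/* failure unrelated to corruption */
	}
	return NULL;
}
```

With a broken copy of an object listed before a good copy, the first
attempt marks the broken one and the retry lands on the good one.
Without the marking step, the lookup would hand back the same broken
pack every time, which is the scenario raised in the quoted review.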