From: Patrick Steinhardt <ps@pks.im>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, Taylor Blau <me@ttaylorr.com>
Subject: Re: [PATCH 01/10] loose_object_info(): BUG() on inflating content with unknown type
Date: Tue, 25 Feb 2025 12:42:22 +0100
Message-ID: <Z72sns9I-zq_KoNr@pks.im>
In-Reply-To: <20250225062824.GA1293961@coredump.intra.peff.net>

On Tue, Feb 25, 2025 at 01:28:24AM -0500, Jeff King wrote:
> After unpack_loose_header() returns, it will have inflated not only the
> object header, but possibly some bytes of the object content. When we
> call unpack_loose_rest() to extract the actual content, it finds those
> extra bytes by skipping past the header's terminating NUL in the buffer.
> Like this:
>
>   int bytes = strlen(buffer) + 1;
>   n = stream->total_out - bytes;
>   ...
>   memcpy(buf, (char *) buffer + bytes, n);
>
> This won't work with the OBJECT_INFO_ALLOW_UNKNOWN_TYPE flag, as there
> we allow a header of arbitrary size. We put into a strbuf, but feed only

s/into/it &/

> the final 32-byte chunk we read to unpack_loose_rest(). In that case
> stream->total_out may unexpectedly large, and thus our "n" will be

s/may/& be/

> large, causing an out-of-bounds read (we do check it against our
> allocated buffer size, which prevents an out-of-bounds write).
>
> Probably this could be made to work by feeding the strbuf to
> unpack_loose_rest(), along with adjusting some types (e.g., "bytes"
> would need to be a size_t, since it is no longer operating on a 32-byte
> buffer).

I was a bit confused initially: I misread "32-byte" as "32-bit" and
started thinking in terms of `size_t` versus `uint32_t`, a shortcut my
brain took because "32 bit" is something you read all the time. I don't
have a great idea for how to introduce the byte chunk in a way that
avoids this, though.
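
To make the arithmetic in the quoted snippet concrete, here is a minimal
standalone sketch. It is not git's code: naive_leftover(),
leftover_with_full_header() and available_in_chunk() are hypothetical
names, and the zlib stream is modeled by a plain byte count. The first
function mirrors the quoted computation, which only sees the last chunk;
the second shows the adjustment hinted at above, assuming the caller
tracks the full header length as a size_t.

```c
#include <stddef.h>
#include <string.h>

/*
 * Mirrors the quoted arithmetic (hypothetical helper, not git's code):
 * "chunk" is the last fixed-size buffer that was inflated into, and
 * "total_out" counts every byte inflated from the stream so far.
 */
static long naive_leftover(const char *chunk, unsigned long total_out)
{
	int bytes = strlen(chunk) + 1; /* header bytes in this chunk plus NUL */
	return (long)total_out - bytes;
}

/*
 * Sketch of the possible fix: subtract the length of the whole header
 * (as it would be accumulated in a strbuf), not just its tail.
 */
static size_t leftover_with_full_header(size_t header_len, size_t total_out)
{
	return total_out - (header_len + 1);
}

/* Content bytes that actually sit in the chunk after the header's NUL. */
static size_t available_in_chunk(const char *chunk, size_t chunk_fill)
{
	return chunk_fill - (strlen(chunk) + 1);
}
```

With the header "blob 5" followed by the content "hello", 12 bytes are
inflated into a single chunk and naive_leftover() yields 5, which is
right. With a 40-character type name the header is 42 bytes long, so the
first 32-byte chunk is all header and the second one holds 10 header
bytes, the NUL, and "hello". total_out is then 48 and naive_leftover()
yields 37, although available_in_chunk() shows that only 5 content bytes
are in the chunk. leftover_with_full_header(42, 48) yields 5 again.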
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> I found this because I was tracing the code path after
> unpack_loose_header() returns to verify some assumptions in the other
> patches.
>
> It really makes me wonder if this "unknown type" stuff has any value
> at all. You can create an object with any type using "hash-object
> --literally -t". And you can ask about its type and size. But you can
> never retrieve the object content! Nor can you pack it or transfer it,
> since packs use a numeric type field.
>
> This code was added ~2015, but I don't think anybody built more on top
> of it. I wonder if we should just consider it a failed experiment and
> rip out the support.

I certainly do not know of, and cannot think of, any use case for this. I
also expect that a repository with unknown object types is a recipe for
weird edge cases if those objects are ever read.

I guess we'll know a bit more when this patch series lands? If nobody
complains, that is another indicator that unknown object types aren't
being used out there in the wild.

> object-file.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/object-file.c b/object-file.c
> index 00c3a4b910..45b251ba04 100644
> --- a/object-file.c
> +++ b/object-file.c
> @@ -1580,6 +1580,8 @@ static int loose_object_info(struct repository *r,
>
>  		if (!oi->contentp)
>  			break;
> +		if (hdrbuf.len)
> +			BUG("unpacking content with unknown types not yet supported");
>  		*oi->contentp = unpack_loose_rest(&stream, hdr, *oi->sizep, oid);
>  		if (*oi->contentp)
>  			goto cleanup;

Okay. I was wondering whether we still need `hdrbuf`, but of course we
do, in order to keep reading the type and length. The only thing we
restrict is reading the contents of such objects.
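
As an illustration of why the overflow buffer stays useful even with the
new guard, here is a simplified model in C. It is a sketch under
assumptions, not git's implementation: read_header(), struct header_info
and can_read_content() are hypothetical, the inflated stream is modeled
as a plain byte array consumed 32 bytes at a time, and unlike git the
sketch always accumulates header bytes in one scratch buffer.

```c
#include <stdio.h>
#include <string.h>

#define CHUNK 32 /* assumed chunk size, matching the 32-byte buffer above */

struct header_info {
	char type[64];
	unsigned long size;
	int used_overflow; /* non-zero if the header spanned several chunks */
};

/*
 * Hypothetical model: walk "src" one chunk at a time and collect header
 * bytes until the terminating NUL. Returns 0 on success and -1 if the
 * header has no NUL, is too long for this sketch, or does not parse.
 */
static int read_header(const char *src, size_t src_len, struct header_info *out)
{
	char overflow[256]; /* stands in for the growable hdrbuf */
	size_t overflow_len = 0, pos = 0;

	out->used_overflow = 0;
	while (pos < src_len) {
		size_t n = src_len - pos < CHUNK ? src_len - pos : CHUNK;
		const char *nul = memchr(src + pos, '\0', n);
		size_t take = nul ? (size_t)(nul - (src + pos)) : n;

		if (overflow_len + take >= sizeof(overflow))
			return -1;
		memcpy(overflow + overflow_len, src + pos, take);
		overflow_len += take;
		if (nul) {
			overflow[overflow_len] = '\0';
			out->used_overflow = pos > 0;
			if (sscanf(overflow, "%63s %lu", out->type, &out->size) != 2)
				return -1;
			return 0;
		}
		pos += n;
	}
	return -1;
}

/* Analogue of the new guard: type and size are fine, content is refused. */
static int can_read_content(const struct header_info *h)
{
	return !h->used_overflow;
}
```

In this model a header with a 40-character type name still yields its
type and size, because the bytes from the first chunk were kept around,
but can_read_content() reports that its content must not be inflated.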
Patrick