From: Junio C Hamano <gitster@pobox.com>
To: "brian m. carlson" <sandals@crustytoothpaste.net>
Cc: <git@vger.kernel.org>, Patrick Steinhardt <ps@pks.im>,
Derrick Stolee <stolee@gmail.com>
Subject: Re: [PATCH 4/9] docs: improve ambiguous areas of pack format documentation
Date: Fri, 19 Sep 2025 16:04:24 -0700
Message-ID: <xmqqtt0yyrgn.fsf@gitster.g>
In-Reply-To: <20250919010911.649831-5-sandals@crustytoothpaste.net> (brian m. carlson's message of "Fri, 19 Sep 2025 01:09:06 +0000")
"brian m. carlson" <sandals@crustytoothpaste.net> writes:
> +=== Object encoding
> +
> +Unlike loose objects, packed objects do not have a prefix containing the type,
> +size, and a NUL byte. These are not necessary because they can be determined by
> +the n-byte type and length that prefixes the data and so they are omitted from
> +the compressed and deltified data.
> +
> +The computation of the object ID still uses this prefix, however.
Not wrong per se, but I've always viewed the in-pack object
header with n-byte type and length as an optimized representation
that stands in for the textual type+size+NUL, just like the payload
part also uses an object representation different from the one used
for loose objects, for performance.
And when you view the in-pack object header that way, "are not
necessary" and everything that follows in the above appear to
somewhat miss the point. It is not just "type size<NUL>" that is
recreated on the fly to compute the same object name as in the loose
object form; the payload is also recreated on the fly to match
what the loose object would have had, e.g., a deltified representation
would be reconstituted into non-deltified form, etc.
IOW, I would have expected the description to go more along these
lines instead:
    Packed objects use the n-byte type and length in-pack object
    header, with in-pack specific representation of the object data.
    In order to compute the same object name as if the object were
    loose, the object representation used in the loose object is
    virtually recreated by translating n-byte type and length to the
    textual type + size + NUL, concatenated with the undeltified and
    inflated object data and hashing the result.
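
To make the suggested wording concrete, here is a minimal Python sketch of
that recomputation. It is an illustration only, not how Git implements it:
the function names are hypothetical, it assumes an undeltified object whose
payload is a plain zlib stream, and it assumes SHA-1 unless told otherwise.
Deltified entries would first have to be resolved against their base, which
is not shown here.

```python
import hashlib
import zlib

# Numeric object types stored in the in-pack header (undeltified types only).
PACK_TYPE_NAMES = {1: "commit", 2: "tree", 3: "blob", 4: "tag"}


def loose_header(type_code: int, size: int) -> bytes:
    """Translate the in-pack type and size to the textual 'type size<NUL>'."""
    return f"{PACK_TYPE_NAMES[type_code]} {size}".encode("ascii") + b"\0"


def object_name(type_code: int, packed_payload: bytes, algo: str = "sha1") -> str:
    """Hash the virtually recreated loose representation of a packed object.

    Assumption: packed_payload is the zlib-compressed, undeltified object data.
    """
    data = zlib.decompress(packed_payload)  # inflate the in-pack payload
    return hashlib.new(algo, loose_header(type_code, len(data)) + data).hexdigest()


if __name__ == "__main__":
    # A blob containing "hello\n", compressed the way a pack entry would be.
    print(object_name(3, zlib.compress(b"hello\n")))
```

The result should match what "git hash-object" reports for a file with the
same contents, since the bytes being hashed are the same in both cases.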
> === Size encoding
>
> This document uses the following "size encoding" of non-negative
> @@ -92,6 +105,11 @@ values are more significant.
> This size encoding should not be confused with the "offset encoding",
> which is also used in this document.
>
> +When encoding the size of an undeltified object in a pack, the size is that of
> +the uncompressed raw object. For deltified objects, it is the size of the
> +uncompressed delta. The base object name or offset is not included in the size
> +computation.
This is an important point worth describing. Very nice.
If we wanted to help the curious, we could say that these sizes let
us know beforehand how much memory to allocate before we inflate
the payload and/or run patch-delta on it.
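
As a rough illustration of how the recorded size can be used, the Python
sketch below decodes the n-byte type and length header and then inflates no
more than the announced number of bytes. The helper names are hypothetical,
and only the undeltified case is handled; for a deltified entry the base
offset or base object name would sit between the header and the compressed
delta, and the size would describe the uncompressed delta instead.

```python
import zlib


def parse_object_header(buf: bytes, pos: int = 0):
    """Decode the n-byte type and length; return (type, size, next position).

    First byte: MSB is the continuation flag, the next 3 bits are the type,
    and the low 4 bits are the least significant bits of the size. Each
    following byte contributes 7 more bits, less significant bits first.
    """
    byte = buf[pos]
    pos += 1
    obj_type = (byte >> 4) & 0x7
    size = byte & 0x0F
    shift = 4
    while byte & 0x80:
        byte = buf[pos]
        pos += 1
        size |= (byte & 0x7F) << shift
        shift += 7
    return obj_type, size, pos


def read_undeltified(buf: bytes, pos: int = 0):
    """Inflate one undeltified entry, using the header size as the bound."""
    obj_type, size, pos = parse_object_header(buf, pos)
    inflater = zlib.decompressobj()
    # max_length caps the output, so the size acts as the allocation hint.
    data = inflater.decompress(buf[pos:], size)
    if len(data) != size:
        raise ValueError("inflated size does not match the in-pack header")
    return obj_type, data


if __name__ == "__main__":
    entry = bytes([0x36]) + zlib.compress(b"hello\n")  # type 3 (blob), size 6
    print(read_undeltified(entry))
```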
Thanks.