From: Duy Nguyen <pclouds@gmail.com>
To: Jonathan Nieder <jrnieder@gmail.com>
Cc: Farhan Khan <khanzf@gmail.com>, git@vger.kernel.org
Subject: Re: pack file object size question
Date: Mon, 17 Dec 2018 16:31:49 +0100
Message-ID: <20181217153149.GA30261@duynguyen.home>
In-Reply-To: <20181217001446.GL75890@google.com>
On Sun, Dec 16, 2018 at 04:14:46PM -0800, Jonathan Nieder wrote:
> Hi,
>
> Farhan Khan wrote:
> >> Farhan Khan wrote:
>
> >>> I am having trouble figuring out the boundary between two objects in
> >>> the pack file.
> [...]
> > I think the issue is, the compressed object has a fixed
> > size and git inflates it, then moves on to the next object. I am
> > trying to figure out where it identifies the size of the object.
>
> Do you mean the compressed size or uncompressed size?
>
> It sounds to me like pack-format.txt needs to do a better job of
> distinguishing the two.

How about something like this?

I wrote this mostly from memory (and a very quick look at
index-pack), but I don't think we have ever stored compressed
sizes. The "length" field (even in the loose object format) always
refers to the uncompressed size.

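To make the boundary question concrete, here is a minimal Python
sketch. It is not git's code: inflate_entry and its arguments are
hypothetical names used only for illustration. It assumes the entry
header has already been parsed, so expected_size is the uncompressed
length from that header and offset points at the first byte of the
zlib stream. The compressed size is never read from the pack; it falls
out of where the zlib stream ends.

```python
import zlib


def inflate_entry(pack: bytes, offset: int, expected_size: int):
    """Inflate one zlib stream starting at `offset`.

    Returns (data, next_offset). Hypothetical helper, not git's API.
    """
    d = zlib.decompressobj()
    data = d.decompress(pack[offset:])
    if not d.eof:
        # The input ran out before zlib saw the end of the stream.
        raise ValueError("truncated zlib stream")
    if len(data) != expected_size:
        # The header length is the uncompressed size; a mismatch
        # means the entry is corrupt.
        raise ValueError("inflated size does not match header length")
    # Bytes zlib did not consume belong to whatever follows the entry.
    consumed = len(pack) - offset - len(d.unused_data)
    return data, offset + consumed


# Two back-to-back zlib streams stand in for two pack entries
# (real entries would each be preceded by a type-and-length header).
a, b = b"hello world\n" * 3, b"second object"
blob = zlib.compress(a) + zlib.compress(b)
first, nxt = inflate_entry(blob, 0, len(a))
second, end = inflate_entry(blob, nxt, len(b))
```

Here nxt is where the second stream starts, found purely by letting
zlib report how much input it consumed.
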
-- 8< --
diff --git a/Documentation/technical/pack-format.txt b/Documentation/technical/pack-format.txt
index cab5bdd2ff..4fd49f61d6 100644
--- a/Documentation/technical/pack-format.txt
+++ b/Documentation/technical/pack-format.txt
@@ -31,6 +31,11 @@ Git pack format
 is an OBJ_OFS_DELTA object
 compressed delta data

+ Note: The length (in bytes) is of uncompressed objects or
+ deltified representation. We're supposed to reach the end of zlib
+ stream once we have inflated the given length, otherwise it's a
+ corrupted pack file.
+
 Observation: length of each object is encoded in a variable
 length format and is not constrained to 32-bit or anything.

@@ -199,7 +204,8 @@ Pack file entry: <+
 is the size before compression).
 If it is REF_DELTA, then
 20-byte base object name SHA-1 (the size above is the
- size of the delta data that follows).
+ size of the delta data that follows, before
+ compression).
 delta data, deflated.
 If it is OFS_DELTA, then
 n-byte offset (see below) interpreted as a negative
-- 8< --
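
For reference, the variable-length "type and length" header mentioned
in the Observation above can be decoded as in the following sketch.
This reflects my reading of pack-format.txt (first byte: continuation
bit, 3-bit type, low 4 bits of the length; each further byte adds 7
more bits, least significant group first). The function names are
hypothetical, and the encoder exists only to round-trip the example.

```python
def parse_entry_header(buf: bytes, pos: int = 0):
    """Return (obj_type, size, next_pos) for the header at `pos`.

    `size` is the uncompressed length of the data that follows.
    """
    byte = buf[pos]
    pos += 1
    obj_type = (byte >> 4) & 0x07   # bits 6-4: object type
    size = byte & 0x0F              # bits 3-0: low 4 bits of the length
    shift = 4
    while byte & 0x80:              # MSB set: another length byte follows
        byte = buf[pos]
        pos += 1
        size |= (byte & 0x7F) << shift
        shift += 7
    return obj_type, size, pos


def encode_entry_header(obj_type: int, size: int) -> bytes:
    """Inverse of parse_entry_header, for illustration only."""
    out = bytearray()
    cur = (obj_type << 4) | (size & 0x0F)
    size >>= 4
    while size:
        out.append(cur | 0x80)      # more length bytes to come
        cur = size & 0x7F
        size >>= 7
    out.append(cur)
    return bytes(out)


# Type 3 with a length of 300 bytes takes two header bytes.
hdr = encode_entry_header(3, 300)
obj_type, size, next_pos = parse_entry_header(hdr)
```

Because the loop keeps shifting in 7-bit groups, nothing limits the
length to 32 bits; next_pos is where the deflated data (or, for delta
entries, the base reference) begins.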