From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Nguyen Thai Ngoc Duy <pclouds@gmail.com>, git@vger.kernel.org
Subject: Re: [BUG] serious inflate inconsistency on master
Date: Wed, 4 Jul 2012 03:12:14 -0400 [thread overview]
Message-ID: <20120704071214.GB24807@sigill.intra.peff.net> (raw)
In-Reply-To: <7vy5n0rq9c.fsf@alter.siamese.dyndns.org>
On Tue, Jul 03, 2012 at 11:31:43PM -0700, Junio C Hamano wrote:
> Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes:
>
> > By the way I searched the commit that introduces that check with "git
> > log --follow -p builtin/index-pack.c" but I could not find it. What
> > did I do wrong?
>
> Your commit 8a2e163cc shows changes to the file at ll.535-540; these
> come from 776ea370 builtin-index-pack.c ll.383-388.
>
> $ git show 776ea370:builtin-index-pack.c
>
> The get_data_from_pack() function in that commit gives sufficient
> buffer to output side (avail_out starts out as obj->size), and feeds
> the data from the packfile in chunks. With the arrangement this
> commit makes to call git_inflate(), it should never get stuck
> because it ran out of output buffer. In each iteration of the loop,
> when the function returns, status should read Z_OK and the function
> should have consumed all input.
>
> But the version that uses consume() function does not give
> sufficient output buffer to ensure that the input will always be
> inflated fully (avoiding to use large output buffer is the whole
> point of your patch after all), so with your patch, that no longer
> holds true.
Yeah, that makes sense. I was wondering if we could get rid of the
avail_in check (and instead just _always_ loop to drain avail_in,
whether we are using consume() or not). The intent would be that the
loop would be a harmless no-op in the non-consume case, because we would
always drain the input completely on the first call.
But that's not right; if we _didn't_ drain it, it's probably because the
input is malformed (i.e., the deflated data is larger than the object
size claims), and we would loop infinitely (because we are not extending
the output buffer during each run of the loop).
So the patch I posted earlier is the right direction. I didn't properly
deal with moving last_out into the inner loop in that patch, but after
checking to see where it goes, I'm pretty sure it's superfluous. Here it
is with last_out removed and a proper commit message:
-- >8 --
Subject: index-pack: loop while inflating objects in unpack_data
When the unpack_data function is given a consume() callback,
it unpacks only 64K of the input at a time, feeding it to
git_inflate along with a 64K output buffer. However,
because we are inflating, there is a good chance that the
output buffer will fill before consuming all of the input.
In this case, we need to loop on git_inflate until we have
fed the whole input buffer, feeding each chunk of output to
the consume buffer.
The current code does not do this, and as a result, will
fail the loop condition and trigger a fatal "serious inflate
inconsistency" error in this case.
While we're rearranging the loop, let's get rid of the
extra last_out pointer. It is meant to point to the
beginning of the buffer that we feed to git_inflate, but in
practice this is always the beginning of our same 64K
buffer, because:
1. At the beginning of the loop, we are feeding the
buffer.
2. At the end of the loop, if we are using a consume()
function, we reset git_inflate's pointer to the
beginning of the buffer. If we are not using a
consume() function, then we do not care about the value
of last_out at all.
Signed-off-by: Jeff King <peff@peff.net>
---
builtin/index-pack.c | 23 +++++++++++++----------
1 file changed, 13 insertions(+), 10 deletions(-)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 8b5c1eb..50d3876 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -524,7 +524,6 @@ static void *unpack_data(struct object_entry *obj,
stream.avail_out = consume ? 64*1024 : obj->size;
do {
- unsigned char *last_out = stream.next_out;
ssize_t n = (len < 64*1024) ? len : 64*1024;
n = pread(pack_fd, inbuf, n, from);
if (n < 0)
@@ -538,15 +537,19 @@ static void *unpack_data(struct object_entry *obj,
len -= n;
stream.next_in = inbuf;
stream.avail_in = n;
- status = git_inflate(&stream, 0);
- if (consume) {
- if (consume(last_out, stream.next_out - last_out, cb_data)) {
- free(inbuf);
- free(data);
- return NULL;
- }
- stream.next_out = data;
- stream.avail_out = 64*1024;
+ if (!consume)
+ status = git_inflate(&stream, 0);
+ else {
+ do {
+ status = git_inflate(&stream, 0);
+ if (consume(data, stream.next_out - data, cb_data)) {
+ free(inbuf);
+ free(data);
+ return NULL;
+ }
+ stream.next_out = data;
+ stream.avail_out = 64*1024;
+ } while (status == Z_OK && stream.avail_in);
}
} while (len && status == Z_OK && !stream.avail_in);
--
1.7.11.rc1.21.g3c8d91e
prev parent reply other threads:[~2012-07-04 7:12 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-07-03 22:19 [BUG] serious inflate inconsistency on master Jeff King
2012-07-03 22:40 ` Junio C Hamano
2012-07-04 5:35 ` Nguyen Thai Ngoc Duy
2012-07-04 6:31 ` Junio C Hamano
2012-07-04 7:01 ` Nguyen Thai Ngoc Duy
2012-07-04 7:24 ` Jeff King
2012-07-04 7:12 ` Jeff King [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120704071214.GB24807@sigill.intra.peff.net \
--to=peff@peff.net \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=pclouds@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).