From: "René Scharfe" <rene.scharfe@lsrfire.ath.cx>
To: Junio C Hamano <gitster@pobox.com>
Cc: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>, git@vger.kernel.org
Subject: Re: [PATCH 5/5] archive-zip: stream large blobs into zip file
Date: Tue, 01 May 2012 00:54:01 +0200
Message-ID: <4F9F1809.1060803@lsrfire.ath.cx>
In-Reply-To: <7vipghf2z0.fsf@alter.siamese.dyndns.org>
On 30.04.2012 21:12, Junio C Hamano wrote:
> Nguyễn Thái Ngọc Duy<pclouds@gmail.com> writes:
>
>> A large blob will be read twice. One for calculating crc32, one for
>> actual writing.
>
> Is that because you need the checksum before the payload? That is
> unfortunate. It would be nice (read: not a rejection of this patch---it
> is a good first step to do it stupid but correct way before trying to
> optimize it) to avoid it when the output is seekable, especially because
> we are talking about a *large* payload.
The ZIP format optionally allows writing the CRC and the sizes after the
data. This adds a data descriptor with a size of 16 bytes to each
streamed entry. Seeking back and correcting these values in an output
file would avoid that.
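
As a rough illustration of the data descriptor variant (a sketch only, not the code from this series): the payload is written in one pass while the CRC is accumulated, and the 16-byte descriptor (4-byte signature, CRC-32, compressed size, uncompressed size) follows the data. The function name `stream_stored_entry` is made up for this example, and it covers only an uncompressed ("stored") entry; the local file header, which would need bit 3 of the general purpose flags set, is left out.

```python
import io
import struct
import zlib

# Data descriptor signature ("PK\x07\x08" on disk, little-endian).
DATA_DESCRIPTOR_SIGNATURE = 0x08074B50


def stream_stored_entry(chunks, out):
    """Write chunks uncompressed, then append a 16-byte data descriptor.

    Hypothetical helper: the payload is read only once, the CRC is
    updated per chunk, and the values that normally live in the local
    header are emitted after the data instead.
    """
    crc = 0
    size = 0
    for chunk in chunks:
        crc = zlib.crc32(chunk, crc)  # running CRC-32 over the payload
        size += len(chunk)
        out.write(chunk)
    # Stored entries have identical compressed and uncompressed sizes.
    out.write(struct.pack("<IIII", DATA_DESCRIPTOR_SIGNATURE, crc, size, size))
    return crc, size


if __name__ == "__main__":
    buf = io.BytesIO()
    crc, size = stream_stored_entry([b"hello ", b"world"], buf)
    print(hex(crc), size, len(buf.getvalue()))
```

With the seek approach, the same CRC and sizes would instead be written back into the local header once the payload is done, so no descriptor would be needed.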
>> diff --git a/t/t1050-large.sh b/t/t1050-large.sh
>> index fe47554..458fdde 100755
>> --- a/t/t1050-large.sh
>> +++ b/t/t1050-large.sh
>> @@ -138,4 +138,8 @@ test_expect_success 'tar achiving' '
>> git archive --format=tar HEAD>/dev/null
>> '
>>
>> +test_expect_success 'zip achiving' '
>> + git archive --format=zip HEAD>/dev/null
>> +'
>
> Can't we do better than "we only check if it finishes without barfing; we
> cannot afford to check the correctness of the output"? The same comment
> applies to all the tests you added to this file in the past 3 months.
Streaming to tar can be tested by setting core.bigFileThreshold high
enough, creating a non-streamed version of the archive and comparing it
to the streamed one. With the seek trick, this would work for ZIP as
well. For streaming with an added data descriptor we'd need to actually
unpack the ZIP file, though.
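
To illustrate the "unpack it" kind of check outside of git's test suite, here is a minimal sketch using Python's zipfile module rather than git archive. The class `WriteOnlySink` and the function `roundtrip_streamed_zip` are invented for this example. Because the sink offers no tell() or seek(), zipfile falls back to writing data descriptors, so a byte-for-byte comparison with a normally written archive would not match; the check therefore reads the archive back and compares contents.

```python
import io
import zipfile


class WriteOnlySink:
    """Hypothetical non-seekable output: write() and flush() only."""

    def __init__(self):
        self.data = bytearray()

    def write(self, chunk):
        self.data += chunk
        return len(chunk)

    def flush(self):
        pass


def roundtrip_streamed_zip(files):
    """Write files (name -> bytes) to a non-seekable sink, then unpack.

    Returns a dict mapping each name to a tuple of
    (unpacked bytes, whether the entry uses a data descriptor).
    """
    sink = WriteOnlySink()
    with zipfile.ZipFile(sink, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, payload in files.items():
            zf.writestr(name, payload)

    result = {}
    with zipfile.ZipFile(io.BytesIO(bytes(sink.data))) as zf:
        # testzip() re-reads every entry and checks its CRC;
        # it returns None when nothing is corrupt.
        if zf.testzip() is not None:
            raise ValueError("corrupt entry in streamed archive")
        for info in zf.infolist():
            uses_descriptor = bool(info.flag_bits & 0x08)
            result[info.filename] = (zf.read(info), uses_descriptor)
    return result


if __name__ == "__main__":
    unpacked = roundtrip_streamed_zip({"large.bin": b"x" * 100000})
    data, uses_descriptor = unpacked["large.bin"]
    print(len(data), uses_descriptor)
```

A test along these lines checks the content rather than the exact archive bytes, which is what the data descriptor variant would require.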
René
Thread overview: 23+ messages
2012-04-30 4:57 [PATCH 0/5] Large file support for git-archive Nguyễn Thái Ngọc Duy
2012-04-30 4:57 ` [PATCH 1/5] archive-tar: turn write_tar_entry into blob-writing only Nguyễn Thái Ngọc Duy
2012-04-30 18:15 ` Junio C Hamano
2012-04-30 22:11 ` René Scharfe
2012-04-30 4:57 ` [PATCH 2/5] archive-tar: unindent write_tar_entry by one level Nguyễn Thái Ngọc Duy
2012-04-30 4:57 ` [PATCH 3/5] archive: delegate blob reading to backend Nguyễn Thái Ngọc Duy
2012-04-30 21:07 ` René Scharfe
2012-04-30 4:57 ` [PATCH 4/5] archive-tar: stream large blobs to tar file Nguyễn Thái Ngọc Duy
2012-04-30 19:01 ` Junio C Hamano
2012-04-30 21:08 ` René Scharfe
2012-04-30 21:36 ` Junio C Hamano
2012-04-30 22:12 ` René Scharfe
2012-04-30 4:57 ` [PATCH 5/5] archive-zip: stream large blobs into zip file Nguyễn Thái Ngọc Duy
2012-04-30 19:12 ` Junio C Hamano
2012-04-30 22:54 ` René Scharfe [this message]
2012-04-30 22:11 ` [PATCH 5a/5] streaming: void pointer instead of char pointer René Scharfe
2012-04-30 22:12 ` [PATCH 6a/5] archive-zip: remove uncompressed_size René Scharfe
2012-04-30 22:12 ` [PATCH 7a/5] archive-zip: factor out helpers for writing sizes and CRC René Scharfe
2012-04-30 22:12 ` [PATCH 8a/5] archive-zip: streaming for stored files René Scharfe
2012-04-30 22:12 ` [PATCH 9a/5] archive-zip: streaming for deflated files René Scharfe
2012-04-30 19:15 ` [PATCH 0/5] Large file support for git-archive Junio C Hamano
2012-04-30 21:07 ` René Scharfe
2012-05-01 10:19 ` Nguyen Thai Ngoc Duy