From: Nicolas Pitre <nico@cam.org>
To: Carl Baldwin <cnb@fc.hp.com>
Cc: Junio C Hamano <junkio@cox.net>, git@vger.kernel.org
Subject: Re: [PATCH] diff-delta: produce optimal pack data
Date: Fri, 24 Feb 2006 12:56:04 -0500 (EST) [thread overview]
Message-ID: <Pine.LNX.4.64.0602241252300.31162@localhost.localdomain> (raw)
In-Reply-To: <20060224174422.GA13367@hpsvcnb.fc.hp.com>
On Fri, 24 Feb 2006, Carl Baldwin wrote:
> Junio,
>
> This message came to me at exactly the right time. Yesterday I was
> exploring using git as the content storage back-end for some binary
> files. Up until now I've only used it for software projects.
>
> I found the largest RCS file that we had in our current back-end. It
> contained twelve versions of a binary file. Each version averaged about
> 20 MB. The ,v file from RCS was about 250MB. I did some experiments on
> these binary files.
>
> First, gzip consistantly is able to compress these files to about 10%
> their original size. So, they are quite inflated. Second, xdelta would
> produce a delta between two neighboring revisions of about 2.5MB in size
> that would compress down to about 2MB. (about the same size as the next
> revision compressed without deltification so packing is ineffective
> here).
>
> I added these 12 revisions to several version control back-ends
> including subversion and git. Git produced a much smaller repository
> size than the others simply due to the compression that it applies to
> objects. It also was at least as fast as the others.
>
> The problem came when I tried to clone this repository.
> git-pack-objects chewed on these 12 revisions for over an hour before I
> finally interrupted it. As far as I could tell, it hadn't made much
> progress.
I must ask if you had applied my latest delta patches?
Also did you use a recent version of git that implements pack data
reuse?
Nicolas
next prev parent reply other threads:[~2006-02-24 17:56 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-02-22 1:45 [PATCH] diff-delta: produce optimal pack data Nicolas Pitre
2006-02-24 8:49 ` Junio C Hamano
2006-02-24 15:37 ` Nicolas Pitre
2006-02-24 23:55 ` Junio C Hamano
2006-02-24 17:44 ` Carl Baldwin
2006-02-24 17:56 ` Nicolas Pitre [this message]
2006-02-24 18:35 ` Carl Baldwin
2006-02-24 18:57 ` Nicolas Pitre
2006-02-24 19:23 ` Carl Baldwin
2006-02-24 20:02 ` Nicolas Pitre
2006-02-24 20:40 ` Carl Baldwin
2006-02-24 21:12 ` Nicolas Pitre
2006-02-24 22:50 ` Carl Baldwin
2006-02-25 3:53 ` Nicolas Pitre
2006-02-24 20:02 ` Linus Torvalds
2006-02-24 20:19 ` Nicolas Pitre
2006-02-24 20:53 ` Junio C Hamano
2006-02-24 21:39 ` Nicolas Pitre
2006-02-24 21:48 ` Nicolas Pitre
2006-02-25 0:45 ` Linus Torvalds
2006-02-25 3:07 ` Nicolas Pitre
2006-02-25 4:05 ` Linus Torvalds
2006-02-25 5:10 ` Nicolas Pitre
2006-02-25 5:35 ` Nicolas Pitre
2006-03-07 23:48 ` [RFH] zlib gurus out there? Junio C Hamano
2006-03-08 0:59 ` Linus Torvalds
2006-03-08 1:22 ` Junio C Hamano
2006-03-08 2:00 ` Linus Torvalds
2006-03-08 9:46 ` Johannes Schindelin
2006-03-08 10:45 ` [PATCH] write_sha1_file(): Perform Z_FULL_FLUSH between header and data Sergey Vlasov
2006-03-08 11:04 ` Junio C Hamano
2006-03-08 14:17 ` Sergey Vlasov
2006-02-25 19:18 ` [PATCH] diff-delta: produce optimal pack data Linus Torvalds
2006-02-24 18:49 ` Carl Baldwin
2006-02-24 19:03 ` Nicolas Pitre
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Pine.LNX.4.64.0602241252300.31162@localhost.localdomain \
--to=nico@cam.org \
--cc=cnb@fc.hp.com \
--cc=git@vger.kernel.org \
--cc=junkio@cox.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).