git.vger.kernel.org archive mirror
From: Nicolas Pitre <nico@cam.org>
To: Carl Baldwin <cnb@fc.hp.com>
Cc: Junio C Hamano <junkio@cox.net>, git@vger.kernel.org
Subject: Re: [PATCH] diff-delta: produce optimal pack data
Date: Fri, 24 Feb 2006 12:56:04 -0500 (EST)
Message-ID: <Pine.LNX.4.64.0602241252300.31162@localhost.localdomain>
In-Reply-To: <20060224174422.GA13367@hpsvcnb.fc.hp.com>

On Fri, 24 Feb 2006, Carl Baldwin wrote:

> Junio,
> 
> This message came to me at exactly the right time.  Yesterday I was
> exploring using git as the content storage back-end for some binary
> files.  Up until now I've only used it for software projects.
> 
> I found the largest RCS file that we had in our current back-end.  It
> contained twelve versions of a binary file.  Each version averaged about
> 20 MB.  The ,v file from RCS was about 250MB.  I did some experiments on
> these binary files.
> 
> First, gzip is consistently able to compress these files to about 10%
> of their original size.  So, they are quite inflated.  Second, xdelta would
> produce a delta between two neighboring revisions of about 2.5MB in size
> that would compress down to about 2MB (about the same size as the next
> revision compressed without deltification, so packing is ineffective
> here).
> 
> I added these 12 revisions to several version control back-ends
> including subversion and git.  Git produced a much smaller repository
> size than the others simply due to the compression that it applies to
> objects.  It also was at least as fast as the others.
> 
> The problem came when I tried to clone this repository.
> git-pack-objects chewed on these 12 revisions for over an hour before I
> finally interrupted it.  As far as I could tell, it hadn't made much
> progress.
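
For reference, the size figures quoted above work out roughly as in the
sketch below.  It is only back-of-the-envelope arithmetic: the variable
names are illustrative, and the numbers are the approximate ones given in
the quoted message, not new measurements.

```python
# Approximate figures taken from the quoted message (all sizes in MB).
revisions = 12
raw_size = 20.0            # average size of one revision
gzip_ratio = 0.10          # gzip output is about 10% of the input
delta_compressed = 2.0     # an xdelta between neighbours, after compression

full_compressed = raw_size * gzip_ratio          # one revision, compressed alone
no_delta_total = revisions * full_compressed     # every revision stored whole
# One full base revision plus a compressed delta for each later revision.
delta_total = full_compressed + (revisions - 1) * delta_compressed

print(f"raw total:          {revisions * raw_size:.0f} MB")
print(f"compressed, whole:  {no_delta_total:.0f} MB")
print(f"compressed, deltas: {delta_total:.0f} MB")
```

With these inputs the two compressed totals come out the same, which is
the sense in which deltification gains nothing for this particular data.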

I must ask: had you applied my latest delta patches?

Also, did you use a recent version of git that implements pack data
reuse?


Nicolas
