git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Jon Smirl" <jonsmirl@gmail.com>
To: "Nicolas Pitre" <nico@cam.org>
Cc: "Shawn Pearce" <spearce@spearce.org>, git <git@vger.kernel.org>
Subject: Re: Huge win, compressing a window of delta runs as a unit
Date: Fri, 18 Aug 2006 09:15:10 -0400	[thread overview]
Message-ID: <9e4733910608180615q4895334bw57c55e59a4ac5482@mail.gmail.com> (raw)
In-Reply-To: <Pine.LNX.4.64.0608172323520.11359@localhost.localdomain>

On 8/18/06, Nicolas Pitre <nico@cam.org> wrote:
> A better way to get such a size saving is to increase the window and
> depth parameters.  For example, a window of 20 and depth of 20 can
> usually provide a pack size saving greater than 11% with none of the
> disadvantages mentioned above.

Our window size is effectively infinite. I am handing him all of the
revisions from a single file in optimal order. This includes branches.
He takes these revisions, runs xdiff on them, and then puts the entire
result into a single zlib blob.

I suspect the size reduction is directly proportional to the age of
the repository. The kernel repository only has three years worth of
data in it.  Linus has the full history in another repository that is
not in general distribution. We can get it from him when he gets back
from vacation.

If the repository doesn't contain long delta chains the optimization
doesn't help that much. On the other hand it doesn't hurt either since
the chains weren't long.  My repository is four times as old as the
kernel one and I am getting 4x the benefit.

This is a good format for archival data that is infrequency accessed.
That is why I proposed a two pack system. One pack would contain the
archival data and be highly optimized for size and the second pack
would contain recent changes and be optimized for speed. The size
optimization is important for controlling bandwidth costs.

-- 
Jon Smirl
jonsmirl@gmail.com

  parent reply	other threads:[~2006-08-18 13:15 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-08-16 17:20 Huge win, compressing a window of delta runs as a unit Jon Smirl
2006-08-17  4:07 ` Shawn Pearce
2006-08-17  7:56   ` Johannes Schindelin
2006-08-17  8:07     ` Johannes Schindelin
2006-08-17 14:36       ` Jon Smirl
2006-08-17 15:45         ` Johannes Schindelin
2006-08-17 16:33           ` Nicolas Pitre
2006-08-17 17:05             ` Johannes Schindelin
2006-08-17 17:22             ` Jon Smirl
2006-08-17 18:15               ` Nicolas Pitre
2006-08-17 17:17           ` Jon Smirl
2006-08-17 17:32             ` Nicolas Pitre
2006-08-17 18:06               ` Jon Smirl
2006-08-17 17:22   ` Nicolas Pitre
2006-08-17 18:03     ` Jon Smirl
2006-08-17 18:24       ` Nicolas Pitre
2006-08-18  4:03 ` Nicolas Pitre
2006-08-18 12:53   ` Jon Smirl
2006-08-18 16:30     ` Nicolas Pitre
2006-08-18 16:56       ` Jon Smirl
2006-08-21  3:45         ` Nicolas Pitre
2006-08-21  6:46           ` Shawn Pearce
2006-08-21 10:24             ` Jakub Narebski
2006-08-21 16:23             ` Jon Smirl
2006-08-18 13:15   ` Jon Smirl [this message]
2006-08-18 13:36     ` Johannes Schindelin
2006-08-18 13:50       ` Jon Smirl
2006-08-19 19:25         ` Linus Torvalds
2006-08-18 16:25     ` Nicolas Pitre
2006-08-21  7:06       ` Shawn Pearce
2006-08-21 14:07         ` Jon Smirl
2006-08-21 15:46         ` Nicolas Pitre
2006-08-21 16:14           ` Jon Smirl
2006-08-21 17:48             ` Nicolas Pitre
2006-08-21 17:55               ` Nicolas Pitre
2006-08-21 18:01                 ` Nicolas Pitre

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9e4733910608180615q4895334bw57c55e59a4ac5482@mail.gmail.com \
    --to=jonsmirl@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=nico@cam.org \
    --cc=spearce@spearce.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).