git.vger.kernel.org archive mirror
From: A Large Angry SCM <gitzilla@gmail.com>
To: Jon Smirl <jonsmirl@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: A look at some alternative PACK file encodings
Date: Wed, 06 Sep 2006 16:39:19 -0700
Message-ID: <44FF5C27.2040300@gmail.com>
In-Reply-To: <9e4733910609061623k73086dbey4a600ecf2852c024@mail.gmail.com>

Jon Smirl wrote:
> On 9/6/06, A Large Angry SCM <gitzilla@gmail.com> wrote:
>> TREE objects do not delta or deflate well.
> 
> I can understand why they don't deflate, the path names are pretty
> much unique and the sha1s are incompressible. But why don't they delta
> well? Does sorting them by size mess up the delta process?

My guess would be the TREEs would only delta well against other TREE
versions for the same path.
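
A toy illustration of that guess, assuming a delta can only copy whole
entries that already exist in the base object. This is not git's actual
byte-level delta code; the helper names and the sample file names are made
up for the example.

```python
import hashlib


def fake_id(label):
    """Stand-in for a 20-byte SHA-1 object ID (hypothetical test data)."""
    return hashlib.sha1(label.encode()).digest()


def make_tree(files):
    """Turn {name: content label} into sorted (mode, name, id) entries."""
    return [(b"100644", name.encode(), fake_id(content))
            for name, content in sorted(files.items())]


def copy_and_add(base, target):
    """Count target entries that can be copied from base vs. added new."""
    base_entries = set(base)
    copied = sum(1 for entry in target if entry in base_entries)
    return copied, len(target) - copied


# Two versions of the TREE for one path, differing in a single file.
lib_v1 = {f"file{i}.c": f"lib file{i} rev1" for i in range(20)}
lib_v2 = dict(lib_v1, **{"file3.c": "lib file3 rev2"})
# A TREE for an unrelated path: different names, different IDs.
doc_v1 = {f"page{i}.txt": f"doc page{i} rev1" for i in range(20)}

same_path = copy_and_add(make_tree(lib_v1), make_tree(lib_v2))
other_path = copy_and_add(make_tree(doc_v1), make_tree(lib_v2))
```

Against the earlier version of the same path nearly every entry is a copy;
against the unrelated TREE nothing is, so the "delta" is as large as the
object itself.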

> Shawn is doing some prototype work on true dictionary based
> compression. I don't know how far along he is but it has potential for
> taking 30% off the Mozilla pack.

Just looking at the structures in non-BLOBs, I see a lot of potential
for the use of one set of dictionaries when deflating TREEs and another
set of dictionaries when deflating COMMITs and TAGs. The low-hanging
fruit is to create dictionaries of the most referenced IDs across all
TREE or COMMIT/TAG objects.
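
A minimal sketch of the idea, assuming zlib's preset-dictionary support
(the zdict argument in Python's zlib module) as the mechanism. It is not
Shawn's prototype and not anything in the current pack format; the function
names, the 64-ID cutoff, and the synthetic TREEs are all hypothetical.

```python
import hashlib
import zlib
from collections import Counter


def fake_id(label):
    """Stand-in for a 20-byte SHA-1 object ID (hypothetical test data)."""
    return hashlib.sha1(label.encode()).digest()


def serialize_tree(entries):
    """Lay out entries as '<mode> <name>\\0<20-byte id>', like a TREE body."""
    return b"".join(mode + b" " + name + b"\0" + oid
                    for mode, name, oid in entries)


def build_id_dictionary(trees, max_ids=64):
    """Concatenate the most referenced IDs into a preset dictionary."""
    counts = Counter(oid for entries in trees for _mode, _name, oid in entries)
    ranked = [oid for oid, _count in counts.most_common(max_ids)]
    # zlib suggests putting the most common strings at the end.
    return b"".join(reversed(ranked))


def deflate(data, zdict=None):
    """Deflate data, optionally primed with a preset dictionary."""
    if zdict:
        comp = zlib.compressobj(level=9, zdict=zdict)
    else:
        comp = zlib.compressobj(level=9)
    return comp.compress(data) + comp.flush()


def inflate(data, zdict=None):
    """Inflate data; the same dictionary must be supplied if one was used."""
    decomp = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return decomp.decompress(data) + decomp.flush()


def sample_trees(n_trees=10):
    """Synthetic TREEs: six IDs shared by every tree, two unique to each."""
    shared = [(b"100644", f"common{i}.h".encode(), fake_id(f"shared {i}"))
              for i in range(6)]
    trees = []
    for t in range(n_trees):
        unique = [(b"100644", f"t{t}_f{i}.c".encode(),
                   fake_id(f"tree {t} file {i}")) for i in range(2)]
        trees.append(shared + unique)
    return trees


trees = sample_trees()
id_dict = build_id_dictionary(trees)
raw = serialize_tree(trees[0])
plain_size = len(deflate(raw))
dict_size = len(deflate(raw, id_dict))
```

The IDs themselves stay incompressible, but any ID that appears in the
dictionary can be emitted as a short back-reference instead of 20 literal
bytes. The cost is that the reader needs the same dictionary to inflate the
object.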
