From: A Large Angry SCM <gitzilla@gmail.com>
To: Linus Torvalds <torvalds@osdl.org>
Cc: Jon Smirl <jonsmirl@gmail.com>, git@vger.kernel.org
Subject: Re: A look at some alternative PACK file encodings
Date: Wed, 06 Sep 2006 17:19:18 -0700
Message-ID: <44FF6586.8080206@gmail.com>
In-Reply-To: <Pine.LNX.4.64.0609061651500.27779@g5.osdl.org>
Linus Torvalds wrote:
>
> On Wed, 6 Sep 2006, A Large Angry SCM wrote:
>
>> Jon Smirl wrote:
>>> On 9/6/06, A Large Angry SCM <gitzilla@gmail.com> wrote:
>>>> TREE objects do not delta or deflate well.
>>> I can understand why they don't deflate, the path names are pretty
>>> much unique and the sha1s are incompressible. But why don't they delta
>>> well? Does sorting them by size mess up the delta process?
>> My guess would be the TREEs would only delta well against other TREE
>> versions for the same path.
>
> That's what you'd normally have in a real project, though. I wonder if
> your "pack mashup" lost the normal behaviour: we very much sort trees
> together normally, thanks to the "sort-by-filename, then by size"
> behaviour that git-pack-objects should have (for trees, the size normally
> shouldn't change, so the sorting should basically boil down to "sort the
> same directory together, keeping the ordering it had from git-rev-list").
The mashup is just all the projects in a single repository with a bushy
refs tree so I can view the updates in a single gitk window.
The sorting by name, then by size, may be breaking the object version
relationship for wide graphs.
> Btw, that "keeping the ordering it had" part I'm not convinced we actually
> enforce. That would depend on the sort algorithm used by "qsort()", I
> think. So there might be room for improvement there in order to keep
> things in recency order.
qsort() is not guaranteed to be stable, so entries that compare equal
may come out in any order.
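
One way to keep recency order without depending on the sort algorithm is
to record each entry's incoming position and use it as the last
comparison key. A minimal sketch follows; the struct, its field names,
and the ascending size order are assumptions for illustration, not what
git-pack-objects actually uses.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry; not git's actual pack-objects structure. */
struct entry {
    const char *name;    /* path name, primary sort key */
    unsigned long size;  /* object size, secondary sort key */
    size_t recency;      /* position in the incoming (rev-list) order */
};

/*
 * Compare by name, then size, then original position. The last key
 * makes the ordering total, so the result no longer depends on whether
 * the qsort() implementation happens to be stable.
 */
static int entry_cmp(const void *a_, const void *b_)
{
    const struct entry *a = a_, *b = b_;
    int c = strcmp(a->name, b->name);

    if (c)
        return c;
    if (a->size != b->size)
        return a->size < b->size ? -1 : 1;
    if (a->recency != b->recency)
        return a->recency < b->recency ? -1 : 1;
    return 0;
}

/* Record the incoming order, then sort. */
static void sort_entries(struct entry *e, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        e[i].recency = i;
    qsort(e, n, sizeof(*e), entry_cmp);
}
```

The cost is one extra field per entry; the comparison function stays
cheap because the position is only consulted on ties.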
>> Just looking at the structures in non-BLOBS, I see a lot of potential
>> for the use of a set of dictionaries when deflating TREEs and another set
>> of dictionaries when deflating COMMITs and TAGs. The low hanging fruit
>> is to create dictionaries of the most referenced IDs across all TREE or
>> COMMIT/TAG objects.
>
> Is there any way to get zlib to just generate a suggested dictionary from
> a given set of input?
The docs suggest "no".
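
Building the dictionary by hand is not much work, though. zlib accepts a
preset dictionary through deflateSetDictionary() and
inflateSetDictionary(), and its manual recommends putting the most
commonly used strings toward the end of the dictionary. A rough sketch
of the "most referenced IDs" idea, assuming raw 20-byte IDs as stored in
TREE entries and assuming the reference counts were gathered elsewhere
(struct id_count and build_id_dictionary() are hypothetical names):

```c
#include <stdlib.h>
#include <string.h>

#define ID_LEN 20  /* raw SHA-1 length in bytes */

/* Hypothetical per-ID tally, filled in by a prior pass over the objects. */
struct id_count {
    unsigned char id[ID_LEN];
    unsigned long refs;  /* how many objects reference this ID */
};

/* Ascending reference count; ties broken by ID bytes for determinism. */
static int by_refs_ascending(const void *a_, const void *b_)
{
    const struct id_count *a = a_, *b = b_;

    if (a->refs != b->refs)
        return a->refs < b->refs ? -1 : 1;
    return memcmp(a->id, b->id, ID_LEN);
}

/*
 * Write up to max_ids of the most referenced IDs into dict, least
 * referenced first so the most referenced one ends up last. Reorders
 * counts. dict must have room for max_ids * ID_LEN bytes. Returns the
 * number of bytes written; that buffer and length are what would be
 * handed to deflateSetDictionary().
 */
static size_t build_id_dictionary(struct id_count *counts, size_t n,
                                  unsigned char *dict, size_t max_ids)
{
    size_t keep = n < max_ids ? n : max_ids;
    size_t i, len = 0;

    if (!n)
        return 0;
    qsort(counts, n, sizeof(*counts), by_refs_ascending);
    for (i = n - keep; i < n; i++) {
        memcpy(dict + len, counts[i].id, ID_LEN);
        len += ID_LEN;
    }
    return len;
}
```

The same dictionary has to be available when inflating, so it would need
to be stored in, or derivable from, the pack itself.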