From: A Large Angry SCM <gitzilla@gmail.com>
To: Linus Torvalds <torvalds@osdl.org>
Cc: Jon Smirl <jonsmirl@gmail.com>, git@vger.kernel.org
Subject: Re: A look at some alternative PACK file encodings
Date: Wed, 06 Sep 2006 17:19:18 -0700
Message-ID: <44FF6586.8080206@gmail.com>
In-Reply-To: <Pine.LNX.4.64.0609061651500.27779@g5.osdl.org>
Linus Torvalds wrote:
>
> On Wed, 6 Sep 2006, A Large Angry SCM wrote:
>
>> Jon Smirl wrote:
>>> On 9/6/06, A Large Angry SCM <gitzilla@gmail.com> wrote:
>>>> TREE objects do not delta or deflate well.
>>> I can understand why they don't deflate, the path names are pretty
>>> much unique and the sha1s are incompressible. But why don't they delta
>>> well? Does sorting them by size mess up the delta process?
>> My guess would be the TREEs would only delta well against other TREE
>> versions for the same path.
>
> That's what you'd normally have in a real project, though. I wonder if
> your "pack mashup" lost the normal behaviour: we very much sort trees
> together normally, thanks to the "sort-by-filename, then by size"
> behaviour that git-pack-objects should have (for trees, the size normally
> shouldn't change, so the sorting should basically boil down to "sort the
> same directory together, keeping the ordering it had from git-rev-list").
The mashup is just all the projects in a single repository with a bushy
refs tree so I can view the updates in a single gitk window.
Sorting by name, then by size, may be breaking the relationship between
object versions for wide graphs.
> Btw, that "keeping the ordering it had" part I'm not convinced we actually
> enforce. That would depend on the sort algorithm used by "qsort()", I
> think. So there might be room for improvement there in order to keep
> things in recency order.
qsort() is not guaranteed to be stable.
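One common way to avoid depending on sort stability is to make the incoming position an explicit final sort key. The sketch below is only an illustration of that idea, not what git-pack-objects does: the Entry record, its field names, and sort_keep_recency() are hypothetical, and the key order (name, then size, then recency) is an assumption taken from the description quoted above.

```python
from typing import List, NamedTuple


class Entry(NamedTuple):
    """Hypothetical record for one object queued for packing."""
    name: str     # path name, the primary sort key
    size: int     # object size, the secondary sort key
    recency: int  # position in the incoming (rev-list style) order


def sort_keep_recency(entries: List[Entry]) -> List[Entry]:
    # The original position is the last component of the key, so entries
    # that tie on (name, size) come out in their incoming order whether
    # or not the underlying sort algorithm is stable.
    return sorted(entries, key=lambda e: (e.name, e.size, e.recency))
```

The same tie-breaker can be added as the last comparison in a qsort() comparator, at the cost of storing one extra index per entry.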
>> Just looking at the structures in non-BLOBS, I see a lot of potential
>> for the use of a set of dictionaries when deflating TREEs and another set
>> of dictionaries when deflating COMMITs and TAGs. The low hanging fruit
>> is to create dictionaries of the most referenced IDs across all TREE or
>> COMMIT/TAG objects.
>
> Is there any way to get zlib to just generate a suggested dictionary from
> a given set of input?
The docs suggest "no".
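Since zlib will not build the dictionary itself, the caller has to assemble one and hand it over as a preset dictionary. The following is a minimal sketch of the "most referenced IDs" idea using Python's zlib module and its zdict parameter. The helper names, the max_ids limit, and the way IDs are collected are all assumptions made for illustration; none of this is an existing pack format.

```python
import zlib
from collections import Counter
from typing import Iterable, List


def build_dictionary(id_lists: Iterable[List[bytes]], max_ids: int = 64) -> bytes:
    """Concatenate the most frequently referenced raw IDs.

    The zlib manual recommends putting the most commonly used strings
    towards the end of the dictionary, so the most common ID goes last.
    """
    counts = Counter(obj_id for ids in id_lists for obj_id in ids)
    common = [obj_id for obj_id, _ in counts.most_common(max_ids)]
    return b"".join(reversed(common))


def deflate_with_dict(data: bytes, zdict: bytes) -> bytes:
    # The compressor may emit back-references into zdict.
    comp = zlib.compressobj(level=9, zdict=zdict)
    return comp.compress(data) + comp.flush()


def inflate_with_dict(blob: bytes, zdict: bytes) -> bytes:
    # The reader must supply exactly the same dictionary bytes.
    decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()
```

The dictionary would have to be stored or derivable wherever the objects are read, because a stream deflated with a preset dictionary cannot be inflated without it.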