From: Nicolas Pitre <nico@cam.org>
To: Linus Torvalds <torvalds@osdl.org>
Cc: Junio C Hamano <junkio@cox.net>, Git Mailing List <git@vger.kernel.org>
Subject: Re: Horrible re-packing?
Date: Mon, 05 Jun 2006 17:20:23 -0400 (EDT)
Message-ID: <Pine.LNX.4.64.0606051637490.24152@localhost.localdomain>
In-Reply-To: <Pine.LNX.4.64.0606051155000.5498@g5.osdl.org>
On Mon, 5 Jun 2006, Linus Torvalds wrote:
>
>
> On this same thread..
>
> This trivial patch not only simplifies the name hashing, it actually
> improves packing for both git and the kernel.
>
> The git archive pack shrinks from 6824090->6622627 bytes (a 3%
> improvement), and the kernel pack shrinks from 108756213 to 108219021 (a
> mere 0.5% improvement, but still, it's an improvement from making the
> hashing much simpler!)
OK here's the scoop. I still have a sample repo (I forget who it was
from) that used to exhibit a big packing size regression which was fixed
a while ago. I tend to test new packing strategies on that repo as well
since it has rather interesting characteristics that make it pretty
sensitive to changes to name hashing and size filtering heuristics.
Before this hashing patch (including the rev-list fix):
$ git repack -a -f
Generating pack...
Done counting 46391 objects.
Deltifying 46391 objects.
100% (46391/46391) done
Writing 46391 objects.
100% (46391/46391) done
Total 46391, written 46391 (delta 7457), reused 38934 (delta 0)
Pack pack-7f766f5af5547554bacb28c0294bd562589dc5e7 created.
$ ll .git/objects/pack/pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack
-rw-rw-r-- 1 nico nico 39486095 Jun 5 16:28 .git/objects/pack/pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack
Now with this patch applied:
$ git repack -a -f
Generating pack...
Done counting 46391 objects.
Deltifying 46391 objects.
100% (46391/46391) done
Writing 46391 objects.
100% (46391/46391) done
Total 46391, written 46391 (delta 9920), reused 36447 (delta 0)
Pack pack-7f766f5af5547554bacb28c0294bd562589dc5e7 created.
$ ll .git/objects/pack/pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack
-rw-rw-r-- 1 nico nico 16150417 Jun 5 16:31 .git/objects/pack/pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack
In other words, the pack shrank to less than half the size of the
previous one!
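To put the three size changes side by side, here is a minimal sketch that recomputes the reductions from the byte counts quoted in this message. The helper name `reduction_percent` and the dictionary labels are made up for illustration; only the byte counts come from the text above.

```python
def reduction_percent(before: int, after: int) -> float:
    """Return how much smaller `after` is than `before`, in percent."""
    return 100.0 * (before - after) / before


# Pack sizes in bytes, copied from the figures in this message.
sizes = {
    "git (quoted figures)": (6824090, 6622627),
    "kernel (quoted figures)": (108756213, 108219021),
    "sample repo": (39486095, 16150417),
}

for name, (before, after) in sizes.items():
    pct = reduction_percent(before, after)
    ratio = after / before
    print(f"{name}: {pct:.2f}% smaller, {ratio:.3f} of the original size")
```

For the sample repo this works out to a reduction of roughly 59%, which is consistent with "less than half the size".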
And yes, fsck-objects still passes (I was doubtful at first).
Nicolas