From: Junio C Hamano <gitster@pobox.com>
To: Nicolas Pitre <nico@cam.org>
Cc: "Robin H. Johnson" <robbat2@gentoo.org>, git@vger.kernel.org
Subject: Re: Weird growth in packfile during initial push
Date: Wed, 29 Apr 2009 16:57:37 -0700
Message-ID: <7vy6tj109a.fsf@gitster.siamese.dyndns.org>
In-Reply-To: <alpine.LFD.2.00.0904151443030.6741@xanadu.home>

Nicolas Pitre <nico@cam.org> writes:
>> $ git push origin master:master
>> Initialized empty Git repository in /var/gitroot/exp/gentoo-x86.git/
>> Counting objects: 4969800, done.
>> Delta compression using up to 8 threads.
>> Compressing objects: 100% (1217809/1217809), done.
>> Writing objects: 100% (4969800/4969800), 810.56 MiB | 21608 KiB/s, done.
>> Total 4969800 (delta 3735812), reused 4969800 (delta 3735812)
>
> Here we know for sure that all objects were directly reused, so no
> attempt at recompressing them was done. The only thing that
> pack-objects might do in this case in addition to directly streaming the
> existing pack is to convert delta object headers from OFS_DELTA to
> REF_DELTA.
>
>> $ ls -la /var/gitroot/exp/gentoo-x86.git/objects/pack
>> total 966876
>> drwxr-xr-x 2 git git 4096 Apr 14 08:43 .
>> drwxr-xr-x 4 git git 4096 Apr 14 08:35 ..
>> -r--r--r-- 1 git git 139155472 Apr 14 08:43 pack-f805bb448f864becfeac9c7f8a8ac2ef90c26787.idx
>> -r--r--r-- 1 git git 849936308 Apr 14 08:43 pack-f805bb448f864becfeac9c7f8a8ac2ef90c26787.pack
>
> Let's see if my theory stands:
>
> 849936308 - 786336481 = 63599827
> 63599827 / 3735812 = 17.02
>
> Hence an average difference of 17 bytes per delta. Given that REF_DELTA
> objects have a 20-byte SHA1 base reference which is replaced with a
> variable length encoding of a pack offset in the OFS_DELTA case, we're
> talking about 2.98 bytes for that offset encoding which feels about
> right.
>
> [...]
>
> And the code matches this theory as well. Can you try this patch if you
> have a chance?

Is there any progress on this?

I think you did a very clear analysis. An 8% size reduction is not
something we can ignore, and using delta offsets should also help
runtime efficiency, right?
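
For anybody who wants to replay the arithmetic quoted above, here is a
small Python sketch. The three figures are copied from the quoted
analysis and push output; the constant and function names are made up
for illustration. The offset-length helper reflects my understanding of
how pack-objects encodes an OFS_DELTA base offset (7 bits per byte, with
an adjustment of one on each continuation byte), so treat it as an
assumption to check against the pack format documentation rather than
as the authoritative definition.

```python
# Figures copied from the quoted analysis; the names are illustrative only.
PUSHED_PACK_BYTES = 849_936_308    # pack written on the receiving side
BASELINE_PACK_BYTES = 786_336_481  # figure subtracted in the quoted analysis
DELTA_COUNT = 3_735_812            # "delta 3735812" in the push output
SHA1_BYTES = 20                    # base reference carried by each REF_DELTA


def overhead_per_delta(pushed: int, baseline: int, deltas: int) -> float:
    """Average number of extra bytes per delta object in the pushed pack."""
    return (pushed - baseline) / deltas


def ofs_delta_offset_len(offset: int) -> int:
    """Bytes needed for an OFS_DELTA base offset (assumed encoding).

    Each byte carries 7 bits; every continuation step subtracts one, so
    two bytes cover 128..16511 rather than 128..16383.
    """
    length = 1
    offset >>= 7
    while offset:
        offset -= 1
        length += 1
        offset >>= 7
    return length


if __name__ == "__main__":
    per_delta = overhead_per_delta(
        PUSHED_PACK_BYTES, BASELINE_PACK_BYTES, DELTA_COUNT
    )
    growth = PUSHED_PACK_BYTES - BASELINE_PACK_BYTES
    print(f"extra bytes in pushed pack: {growth}")
    print(f"average extra bytes per delta: {per_delta:.2f}")
    print(f"implied average offset encoding: {SHA1_BYTES - per_delta:.2f} bytes")
    print(f"growth relative to baseline: {100 * growth / BASELINE_PACK_BYTES:.2f}%")
    # Offsets between roughly 16 KiB and 2 MiB need three bytes under this
    # encoding, which is in line with an average just under three bytes.
    for sample in (100, 10_000, 1_000_000):
        print(sample, ofs_delta_offset_len(sample))
```

If the helper is right, an average of just under three bytes means the
typical delta base sits within about 2 MiB of its delta in the pack.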

Thread overview: 16+ messages
2009-04-15 18:27 Weird growth in packfile during initial push Robin H. Johnson
2009-04-15 19:51 ` Nicolas Pitre
2009-04-29 23:57 ` Junio C Hamano [this message]
2009-04-30 2:52 ` Nicolas Pitre
2009-05-01 6:17 ` Robin H. Johnson
2009-05-01 20:56 ` [PATCH] allow OFS_DELTA objects during a push Nicolas Pitre
2009-05-01 23:49 ` Junio C Hamano
2009-05-02 0:01 ` Compatibility between git.git and jgit Shawn O. Pearce
2009-05-02 1:14 ` A Large Angry SCM
2009-05-02 1:39 ` Nicolas Pitre
2009-05-02 1:59 ` Shawn O. Pearce
2009-05-02 16:56 ` Ealdwulf Wuffinga
2009-05-02 1:40 ` Michael Witten
2009-05-02 0:24 ` [PATCH] allow OFS_DELTA objects during a push Nicolas Pitre
2009-05-04 22:11 ` Shawn O. Pearce
2009-05-04 22:30 ` Shawn O. Pearce