From: Junio C Hamano <junkio@cox.net>
To: Jan-Benedict Glaw <jbglaw@lug-owl.de>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] Add git-explode-packs
Date: Sun, 26 Mar 2006 19:53:31 -0800
Message-ID: <7vd5g8pmpw.fsf@assigned-by-dhcp.cox.net>
In-Reply-To: <20060326125450.GT31387@lug-owl.de> (Jan-Benedict Glaw's message of "Sun, 26 Mar 2006 14:54:50 +0200")
Jan-Benedict Glaw <jbglaw@lug-owl.de> writes:
> On Sat, 2006-03-25 22:12:46 -0800, Junio C Hamano <junkio@cox.net> wrote:
>> The script seems to do what it claims to, but now why would one
>> need to use this? In other words what's the situation one would
>> find this useful?
>
> It's possibly useful if you often access old objects with
> git-cat-file or git-ls-tree.
Benchmarks?
I created two repositories cloned from git.git. The victim03
repository is fully packed with the default pack parameters,
depth and window both set to 10. The victim04 repository has the
same set of objects and refs, but the pack is exploded into
16232 loose objects.
In the victim03 repository, 657 blobs have depth 10 (i.e. you
need to inflate and apply a delta 10 times to get to the object).
So I made a list of these "expensive to access" objects and
ran this:
$ cd victim03
$ /usr/bin/time sh -c '
while read sha1; do git cat-file blob "$sha1";
done >/dev/null <list
'
3.43user 3.36system 0:07.17elapsed 94%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+364561minor)pagefaults 0swaps
3.51user 3.33system 0:07.10elapsed 96%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+364499minor)pagefaults 0swaps
3.76user 2.99system 0:07.28elapsed 92%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+365155minor)pagefaults 0swaps
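
For anyone who wants to reproduce the list: one way to collect the blobs at a given delta depth is to filter the output of "git verify-pack -v", which prints the delta depth in the sixth column for deltified objects. This is only a sketch. The helper name deep_blobs is made up for illustration, and this message does not say how the original list was actually generated.

```shell
# Hypothetical helper (not from the original message): read
# `git verify-pack -v` output on stdin and print the object names
# of blobs whose delta depth equals $1.
deep_blobs() {
    # Columns for deltified objects:
    #   name type size size-in-pack offset depth base-name
    # Non-delta objects have only five columns, so $6 is empty there.
    awk -v depth="$1" '$2 == "blob" && $6 == depth { print $1 }'
}
```

Assuming a single pack under .git/objects/pack, it could be used as "git verify-pack -v .git/objects/pack/pack-*.idx | deep_blobs 10 >list".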
With the same list, in the victim04 repository, which has 16232
loose objects:
$ cd victim04
$ /usr/bin/time sh -c '
while read sha1; do git cat-file blob "$sha1";
done >/dev/null <../victim03/list
'
3.29user 2.98system 0:06.33elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+348786minor)pagefaults 0swaps
3.26user 2.88system 0:06.63elapsed 92%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+347512minor)pagefaults 0swaps
3.16user 2.98system 0:06.20elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+347489minor)pagefaults 0swaps
So you are getting a slight performance gain by exploding the
pack, but on the other hand you are taxing the buffer cache
quite heavily by reading the loose objects (in both of the
experiments above, I discarded the numbers from the very first
run). The sizes of the object databases in these cases are:
$ du -sh victim0[34]/.git/objects
6.2M victim03/.git/objects
84M victim04/.git/objects
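
To put rough numbers on that trade-off, the elapsed times and sizes quoted above can be averaged with a few lines of awk. This is just arithmetic on the figures in this message; the function name bench_summary is made up for illustration.

```shell
# Hypothetical helper: summarize the figures quoted in this message.
bench_summary() {
    awk 'BEGIN {
        # Mean elapsed seconds over the three runs in each repository.
        packed = (7.17 + 7.10 + 7.28) / 3
        loose  = (6.33 + 6.63 + 6.20) / 3
        printf "packed %.2fs, loose %.2fs, gain %.1f%%\n",
               packed, loose, (packed - loose) / packed * 100
        # Size of .git/objects: 84M loose versus 6.2M packed.
        printf "disk usage ratio %.1fx\n", 84 / 6.2
    }'
}
bench_summary
```

In other words, the exploded repository is roughly a tenth faster on this worst-case access pattern while taking more than ten times the disk space.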
So I am still not convinced it would be useful in general. It
used to be that exploding everything and repacking was the only
way to clean out garbage from packs, but after Frank Sorenson
invented "repack -a -d", that became the more convenient way.
Especially with the recent "delta reusing" pack-objects, doing
"repack -a -d" has become quite cheap, so...
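
As a minimal sketch of that alternative: repack everything into a single pack, drop the now-redundant packs and loose objects, and then check the result with "git count-objects -v". The wrapper name repack_all and its path argument are assumptions made for illustration. The -C and -q options belong to present-day git and may not match what was available when this message was written.

```shell
# Hypothetical wrapper: $1 is the path to a repository's working tree.
repack_all() {
    # -a: put all reachable objects into one new pack
    # -d: delete the packs and loose objects that the new pack makes redundant
    git -C "$1" repack -a -d -q &&
    # Report loose-object and pack counts after the repack.
    git -C "$1" count-objects -v
}
```

After a successful run, the "count:" line (loose objects) should read 0 and "packs:" should read 1, provided nothing unreachable is left lying around as loose objects.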
Thread overview: 4+ messages
2006-03-25 12:02 [PATCH] Add git-explode-packs Martin Atukunda
2006-03-26 6:12 ` Junio C Hamano
2006-03-26 12:54 ` Jan-Benedict Glaw
2006-03-27 3:53 ` Junio C Hamano [this message]