* [PATCH] Add git-explode-packs
@ 2006-03-25 12:02 Martin Atukunda
2006-03-26 6:12 ` Junio C Hamano
0 siblings, 1 reply; 4+ messages in thread
From: Martin Atukunda @ 2006-03-25 12:02 UTC (permalink / raw)
To: git; +Cc: Martin Atukunda
This script does the opposite of git repack -a -d.
Signed-Off-By: Martin Atukunda <matlads@dsmagic.com>
---
.gitignore | 1 +
Documentation/git-explode-packs.txt | 45 +++++++++++++++++++++++++++++++++++
Makefile | 2 +-
git-explode-packs.sh | 26 ++++++++++++++++++++
4 files changed, 73 insertions(+), 1 deletions(-)
create mode 100644 Documentation/git-explode-packs.txt
create mode 100755 git-explode-packs.sh
277352dd9a0549cd626242b14454da37acbc72f3
diff --git a/.gitignore b/.gitignore
index b4355b9..0ac74e3 100644
--- a/.gitignore
+++ b/.gitignore
@@ -133,3 +133,4 @@ libgit.a
*.py[co]
config.mak
git-blame
+git-explode-packs
diff --git a/Documentation/git-explode-packs.txt b/Documentation/git-explode-packs.txt
new file mode 100644
index 0000000..9651a4e
--- /dev/null
+++ b/Documentation/git-explode-packs.txt
@@ -0,0 +1,45 @@
+git-explode-packs(1)
+====================
+
+NAME
+----
+git-explode-packs - Script used to explode packs in a repository into objects
+
+
+SYNOPSIS
+--------
+'git-explode-packs'
+
+DESCRIPTION
+-----------
+
+This script is used to explode all packs into the constituent objects.
+
+A pack is a collection of objects, individually compressed, with
+delta compression applied, stored in a single file, with an
+associated index file.
+
+Packs are used to reduce the load on mirror systems, backup
+engines, disk storage, etc.
+
+This script removes all the packs in the repository, replacing them with the
+objects that are stored inside them.
+
+Author
+------
+Written by Martin Atukunda <matlads@dsmagic.com>
+
+Documentation
+--------------
+Documentation by Martin Atukunda <matlads@dsmagic.com>
+
+See Also
+--------
+gitlink:git-pack-objects[1]
+gitlink:git-prune-packed[1]
+gitlink:git-repack[1]
+
+GIT
+---
+Part of the gitlink:git[7] suite
+
diff --git a/Makefile b/Makefile
index 8d45378..71e31f0 100644
--- a/Makefile
+++ b/Makefile
@@ -125,7 +125,7 @@ SCRIPT_SH = \
git-applymbox.sh git-applypatch.sh git-am.sh \
git-merge.sh git-merge-stupid.sh git-merge-octopus.sh \
git-merge-resolve.sh git-merge-ours.sh git-grep.sh \
- git-lost-found.sh
+ git-lost-found.sh git-explode-packs.sh
SCRIPT_PERL = \
git-archimport.perl git-cvsimport.perl git-relink.perl \
diff --git a/git-explode-packs.sh b/git-explode-packs.sh
new file mode 100755
index 0000000..a7e9761
--- /dev/null
+++ b/git-explode-packs.sh
@@ -0,0 +1,26 @@
+#!/bin/sh
+#
+# Copyright (c) 2006 Martin Atukunda
+#
+
+USAGE=''
+. git-sh-setup
+
+PACKDIR="$GIT_OBJECT_DIRECTORY/pack"
+PRESDIR="./++preserve"
+
+mkdir "$PRESDIR" && (
+ for p in "$PACKDIR"/pack-*.pack; do
+ if test -f "$p"; then
+ mv "$p" "$PRESDIR"
+ fi
+ done
+
+ for p in "$PRESDIR"/pack-*.pack; do
+ if test -f "$p"; then
+ git-unpack-objects <"$p"
+ rm -- "$p"
+ fi
+ done
+ rmdir "$PRESDIR"
+)
--
1.2.4.gd3e1
* Re: [PATCH] Add git-explode-packs
2006-03-25 12:02 [PATCH] Add git-explode-packs Martin Atukunda
@ 2006-03-26 6:12 ` Junio C Hamano
2006-03-26 12:54 ` Jan-Benedict Glaw
0 siblings, 1 reply; 4+ messages in thread
From: Junio C Hamano @ 2006-03-26 6:12 UTC (permalink / raw)
To: Martin Atukunda; +Cc: git
Martin Atukunda <matlads@dsmagic.com> writes:
> This script does the opposite of git repack -a -d.
The script seems to do what it claims to, but why would one
need to use this? In other words, in what situation would one
find this useful?
* Re: [PATCH] Add git-explode-packs
2006-03-26 6:12 ` Junio C Hamano
@ 2006-03-26 12:54 ` Jan-Benedict Glaw
2006-03-27 3:53 ` Junio C Hamano
0 siblings, 1 reply; 4+ messages in thread
From: Jan-Benedict Glaw @ 2006-03-26 12:54 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Martin Atukunda, git
On Sat, 2006-03-25 22:12:46 -0800, Junio C Hamano <junkio@cox.net> wrote:
> Martin Atukunda <matlads@dsmagic.com> writes:
> > This script does the opposite of git repack -a -d.
>
> The script seems to do what it claims to, but why would one
> need to use this? In other words, in what situation would one
> find this useful?
It's possibly useful if you often access old objects with
git-cat-file or git-ls-tree.
Although I am not a Perl hacker, a friend and I have, for example,
started to hack GIT support into LXR. I've just posted some very
early patches on the LXR mailing list
(http://sourceforge.net/mailarchive/forum.php?forum_id=1734). What
would be even more interesting is to unpack not _all_ objects, but
only those belonging to specifically mentioned commits or tags. I
think LXR could make _good_ use of that.
MfG, JBG
--
Jan-Benedict Glaw jbglaw@lug-owl.de . +49-172-7608481 _ O _
"Eine Freie Meinung in einem Freien Kopf | Gegen Zensur | Gegen Krieg _ _ O
für einen Freien Staat voll Freier Bürger" | im Internet! | im Irak! O O O
ret = do_actions((curr | FREE_SPEECH) & ~(NEW_COPYRIGHT_LAW | DRM | TCPA));
* Re: [PATCH] Add git-explode-packs
2006-03-26 12:54 ` Jan-Benedict Glaw
@ 2006-03-27 3:53 ` Junio C Hamano
0 siblings, 0 replies; 4+ messages in thread
From: Junio C Hamano @ 2006-03-27 3:53 UTC (permalink / raw)
To: Jan-Benedict Glaw; +Cc: git
Jan-Benedict Glaw <jbglaw@lug-owl.de> writes:
> On Sat, 2006-03-25 22:12:46 -0800, Junio C Hamano <junkio@cox.net> wrote:
>> The script seems to do what it claims to, but why would one
>> need to use this? In other words, in what situation would one
>> find this useful?
>
> It's possibly useful if you often access old objects with
> git-cat-file or git-ls-tree.
Benchmarks?
I created two cloned repositories from git.git. The victim03
repository is fully packed with the default pack parameters
(depth and window both set to 10). The victim04 repository has
the same set of objects and refs, but the pack is expanded
(16232 loose objects).
Now in the victim03 repository, 657 blobs have depth 10 (i.e. you
need to inflate and apply a delta 10 times to get to the object).
So I made a list of these "expensive to access" objects and
ran this:
$ cd victim03
$ /usr/bin/time sh -c '
while read sha1; do git cat-file blob $sha1;
done >/dev/null <list
'
3.43user 3.36system 0:07.17elapsed 94%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+364561minor)pagefaults 0swaps
3.51user 3.33system 0:07.10elapsed 96%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+364499minor)pagefaults 0swaps
3.76user 2.99system 0:07.28elapsed 92%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+365155minor)pagefaults 0swaps
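The message does not show how the list of depth-10 blobs was built. One way it might be done, as a rough sketch, is to filter the output of `git verify-pack -v`, which in current Git prints a delta depth column for deltified objects. The function name `deep_blobs` and the pack path in the usage comment are illustrative assumptions, not taken from the thread.

```shell
# deep_blobs DEPTH: read `git verify-pack -v` output on stdin and print the
# names of blobs stored at exactly that delta depth.
# Deltified entries have seven fields:
#   name type size size-in-pack offset depth base-name
# Non-deltified entries and the summary lines have fewer fields and are skipped.
deep_blobs() {
    awk -v depth="$1" 'NF == 7 && $2 == "blob" && $6 == depth { print $1 }'
}

# Possible usage (the pack path is a placeholder):
#   git verify-pack -v .git/objects/pack/pack-*.idx | deep_blobs 10 >list
```

Filtering on the field count keeps the sketch independent of the exact wording of the summary lines that `git verify-pack -v` appends.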
With the same file list, in victim04 repository that has 16232
loose objects:
$ cd victim04
$ /usr/bin/time sh -c '
while read sha1; do git cat-file blob $sha1;
done >/dev/null <../victim03/list
'
3.29user 2.98system 0:06.33elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+348786minor)pagefaults 0swaps
3.26user 2.88system 0:06.63elapsed 92%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+347512minor)pagefaults 0swaps
3.16user 2.98system 0:06.20elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+347489minor)pagefaults 0swaps
So you are getting a slight performance gain by exploding the
pack, but on the other hand you are taxing the buffer cache
quite heavily by reading the loose objects (in both of the
experiments above, I discarded numbers from the very first
run). The sizes of the object databases in these cases are:
$ du -sh victim0[34]/.git/objects
6.2M victim03/.git/objects
84M victim04/.git/objects
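To repeat this comparison on another repository, the loose object count can be checked alongside the disk usage. The following is a minimal sketch, assuming the standard loose-object layout (a directory named with two hex digits, holding files named with the rest of the object name); `count_loose` is a hypothetical helper name. In current Git, `git count-objects -v` reports similar figures.

```shell
# count_loose OBJDIR: count loose object files under an objects directory.
# Only files inside two-hex-digit subdirectories are counted, so the
# pack/ and info/ subdirectories are ignored.
count_loose() {
    ( cd "$1" && find . -type f -path './[0-9a-f][0-9a-f]/*' | wc -l | tr -d ' ' )
}

# Possible usage:
#   count_loose victim04/.git/objects
#   du -sh victim04/.git/objects
```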
So I am still not convinced it would be useful in general. It
used to be that exploding everything and repacking was the only
way to clean out garbage from packs, but after "repack -a -d"
was invented by Frank Sorenson, that became the more convenient
way. Especially with the recent "delta reusing" pack-objects,
doing "repack -a -d" has become quite cheap, so...