git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sam Vilain <sam.vilain@catalyst.net.nz>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Sam Vilain <sam.vilain@catalyst.net.nz>
Subject: [PATCH] git-repack: generational repacking (and example hook script)
Date: Sat, 30 Jun 2007 20:56:21 +1200	[thread overview]
Message-ID: <1183193782608-git-send-email-sam.vilain@catalyst.net.nz> (raw)
In-Reply-To: <11831937823588-git-send-email-sam.vilain@catalyst.net.nz>

Add an option to git-repack that makes the repack run suitable for
running very often.  The idea is that packs get given a "generation",
and that the number of packs in each generation (except the last one)
is bounded.

The useful invocation of this is git-repack -d -g

The -a option then becomes a degenerate case of generative repacking.

Signed-off-by: Sam Vilain <sam.vilain@catalyst.net.nz>
---
 Documentation/git-repack.txt |    6 +++
 git-repack.sh                |   74 +++++++++++++++++++++++++++++++++++-------
 templates/hooks--post-commit |   14 +++++++-
 3 files changed, 81 insertions(+), 13 deletions(-)

diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt
index be8e5f8..d458377 100644
--- a/Documentation/git-repack.txt
+++ b/Documentation/git-repack.txt
@@ -42,6 +42,12 @@ OPTIONS
 	existing packs redundant, remove the redundant packs.
 	Also runs gitlink:git-prune-packed[1].
 
+-g::
+	Enable "generational" repacking.  This attempts to keep the
+	number of packs under control when repacking very often.  Most
+	useful when called from the `post-commit` hook (see
+	link:hooks.html[hooks] for more information).
+
 -l::
         Pass the `--local` option to `git pack-objects`, see
         gitlink:git-pack-objects[1].
diff --git a/git-repack.sh b/git-repack.sh
index 8c32724..3d253fa 100755
--- a/git-repack.sh
+++ b/git-repack.sh
@@ -3,19 +3,21 @@
 # Copyright (c) 2005 Linus Torvalds
 #
 
-USAGE='[-a] [-d] [-f] [-l] [-n] [-q] [--max-pack-size=N] [--window=N] [--depth=N]'
+USAGE='[-a] [-d] [-f] [-l] [-n] [-q] [-g] [--max-pack-size=N] [--window=N] [--depth=N]'
 SUBDIRECTORY_OK='Yes'
 . git-sh-setup
 
-no_update_info= all_into_one= remove_redundant=
-local= quiet= no_reuse= extra=
+no_update_info= generations= remove_redundant=
+local= quiet= no_reuse= extra= generation_width=
 while case "$#" in 0) break ;; esac
 do
 	case "$1" in
 	-n)	no_update_info=t ;;
-	-a)	all_into_one=t ;;
+	-a)	generations=0 ;;
 	-d)	remove_redundant=t ;;
 	-q)	quiet=-q ;;
+	-g)	generations=3 generation_width=10 ;;
+	-G)	generations=$2; generation_width=5; shift ;;
 	-f)	no_reuse=--no-reuse-object ;;
 	-l)	local=--local ;;
 	--max-pack-size=*) extra="$extra $1" ;;
@@ -40,24 +42,69 @@ PACKTMP="$GIT_OBJECT_DIRECTORY/.tmp-$$-pack"
 rm -f "$PACKTMP"-*
 trap 'rm -f "$PACKTMP"-*' 0 1 2 3 15
 
+generation=
+redundant=
+
 # There will be more repacking strategies to come...
-case ",$all_into_one," in
+case ",$generations," in
 ,,)
 	args='--unpacked --incremental'
 	;;
-,t,)
+,*,)
 	if [ -d "$PACKDIR" ]; then
+		max_gen=0
 		for e in `cd "$PACKDIR" && find . -type f -name '*.pack' \
 			| sed -e 's/^\.\///' -e 's/\.pack$//'`
 		do
 			if [ -e "$PACKDIR/$e.keep" ]; then
 				: keep
 			else
-				args="$args --unpacked=$e.pack"
 				existing="$existing $e"
+				if [ -e "$PACKDIR/$e.gen" ]; then
+					gen=`cat $PACKDIR/$e.gen`
+				else
+					gen=1
+				fi
+				[ "$max_gen" -lt $gen ] && max_gen=$gen
+				eval "gen_${gen}=\"\$gen_${gen} $e\"";
+				eval "c_gen_${gen}=\$((\$c_gen_${gen} + 1))";
 			fi
 		done
+		i=$max_gen
+		packing=
+		while [ $i -gt 0 ]
+		do
+			eval "c_gen=\$c_gen_$i"
+			eval "packs=\$gen_$i"
+			if [ -n "$c_gen" -a $i -gt "$generations" ]
+			then
+				echo "saw $c_gen packs at generation $i"
+				echo "therefore, repacking everything"
+				packing=1
+				[ -z "$generation" ] && generation=$(($i + 1))
+			elif [ -n "$c_gen" -a "$c_gen" -ge "$generation_width" -a "$i" -lt "$generations" ]
+			then
+				echo -n "generation $i has too many packs "
+				echo "($c_gen >= $generation_width)"
+				echo "repacking at this level and below"
+				packing=1
+				[ -z "$generation" ] && generation=$(($i + 1))
+			fi
+			if [ -n "$packing" ]
+			then
+				for x in $packs; do
+					args="$args --unpacked=$x.pack"
+					redundant="$redundant $x"
+				done
+			fi
+			i=$(($i - 1))
+		done
+		if [ -n "$generation" ]; then
+			[ "$generation" -gt "$generations" ] && generation=$generations
+			[ "$generation" -eq 0 ] && generation=1
+		fi
 	fi
+
 	[ -z "$args" ] && args='--unpacked --incremental'
 	;;
 esac
@@ -95,20 +142,23 @@ for name in $names ; do
 		exit 1
 	}
 	rm -f "$PACKDIR/old-pack-$name.pack" "$PACKDIR/old-pack-$name.idx"
+	[ -n "$generation" ] && echo $generation > "$PACKDIR/pack-$name.gen"
 done
 
 if test "$remove_redundant" = t
 then
-	# We know $existing are all redundant.
-	if [ -n "$existing" ]
+	echo "removing redundant packs"
+	# We know $redundant are all redundant.
+	if [ -n "$redundant" ]
 	then
 		sync
 		( cd "$PACKDIR" &&
-		  for e in $existing
+		  for e in $redundant
 		  do
 			case " $fullbases " in
-			*" $e "*) ;;
-			*)	rm -f "$e.pack" "$e.idx" "$e.keep" ;;
+			*" $e "*) echo "ignoring $e" ;;
+			*)	echo "removing $e.pack etc";
+				rm -f "$e.pack" "$e.idx" "$e.keep" ;;
 			esac
 		  done
 		)
diff --git a/templates/hooks--post-commit b/templates/hooks--post-commit
index 8be6f34..669f1fc 100644
--- a/templates/hooks--post-commit
+++ b/templates/hooks--post-commit
@@ -5,4 +5,16 @@
 #
 # To enable this hook, make this file executable.
 
-: Nothing
+threshold=`git-config gc.threshold`
+threshold=${threshold-250}
+
+gd=`git-rev-parse --git-dir`
+found=$(find $gd/objects/?? -type f | head -$threshold | wc -l)
+
+if [ $found -ge $threshold ]
+then
+    echo "At least $threshold loose objects, running generational repack"
+    git-repack -g -d
+else
+    echo "Found only $found loose objects, less than $threshold"
+fi
-- 
1.5.2.1.1131.g3b90

  reply	other threads:[~2007-06-30  8:57 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-06-30  8:56 a bunch of outstanding updates Sam Vilain
2007-06-30  8:56 ` [PATCH] repack: improve documentation on -a option Sam Vilain
2007-06-30  8:56   ` [PATCH] git-svn: use git-log rather than rev-list | xargs cat-file Sam Vilain
2007-06-30  8:56     ` [PATCH] git-svn: cache max revision in rev_db databases Sam Vilain
2007-06-30  8:56       ` [PATCH] GIT-VERSION-GEN: don't convert - delimiter to .'s Sam Vilain
2007-06-30  8:56         ` [PATCH] git-remote: document -n Sam Vilain
2007-06-30  8:56           ` [PATCH] git-remote: allow 'git-remote fetch' as a synonym for 'git fetch' Sam Vilain
2007-06-30  8:56             ` [PATCH] git-merge-ff: fast-forward only merge Sam Vilain
2007-06-30  8:56               ` [PATCH] git-mergetool: add support for ediff Sam Vilain
2007-06-30  8:56                 ` [PATCH] contrib/hooks: add post-update hook for updating working copy Sam Vilain
2007-06-30  8:56                   ` Sam Vilain [this message]
2007-07-03  3:36                     ` [PATCH] git-repack: generational repacking (and example hook script) Nicolas Pitre
2007-07-03  4:58                       ` Sam Vilain
2007-07-03 14:45                         ` Nicolas Pitre
2007-07-03 14:55                           ` Shawn O. Pearce
2007-07-04  0:05                           ` Sam Vilain
2007-07-04  1:01                             ` Johannes Schindelin
2007-07-04  6:16                               ` Sam Vilain
2007-07-04  7:02                                 ` Alex Riesen
2007-07-04 15:42                                 ` Nicolas Pitre
2007-06-30 17:19                   ` [PATCH] contrib/hooks: add post-update hook for updating working copy Junio C Hamano
2007-07-01 22:30                     ` Sam Vilain
2007-07-02  1:10                       ` Junio C Hamano
2007-06-30 17:19                 ` [PATCH] git-mergetool: add support for ediff Junio C Hamano
2007-07-01 22:33                   ` Sam Vilain
2007-06-30 14:28               ` [PATCH] git-merge-ff: fast-forward only merge Johannes Schindelin
2007-06-30 18:32               ` Matthias Lederhofer
2007-06-30 17:19             ` [PATCH] git-remote: allow 'git-remote fetch' as a synonym for 'git fetch' Junio C Hamano
2007-06-30 11:12           ` [PATCH] git-remote: document -n Frank Lichtenheld
2007-06-30 17:19         ` [PATCH] GIT-VERSION-GEN: don't convert - delimiter to .'s Junio C Hamano
2007-07-11 10:49         ` Jakub Narebski
2007-07-01  3:50       ` [PATCH] git-svn: cache max revision in rev_db databases Junio C Hamano
2007-07-01  5:31         ` Eric Wong
2007-07-01  6:49           ` Junio C Hamano
2007-06-30 11:15   ` [PATCH] repack: improve documentation on -a option Frank Lichtenheld
2007-06-30 17:19     ` Junio C Hamano
2007-06-30 11:05 ` a bunch of outstanding updates Frank Lichtenheld

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1183193782608-git-send-email-sam.vilain@catalyst.net.nz \
    --to=sam.vilain@catalyst.net.nz \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).